Group 'B'
4.) What do you mean by knowledge discovery in database (KDD) ?
Ans :- Knowledge discovery in databases (KDD) is the process of finding useful information and patterns in data. It covers every step from cleaning noisy and unnecessary data in a database to presenting the extracted knowledge to the user. Data mining itself is one step within KDD. The steps or stages of KDD are as follows :
a.) Data Cleaning : It is the process of removing noise and unnecessary and unwanted data from a database.
b.) Data Integration : In this stage, data coming from multiple sources are combined.
c.) Data Selection : In this stage, the data that are related or relevant to the knowledge analysis field are selected for further processing.
d.) Data Transformation : In this stage, data is transformed into an appropriate form to make it ready for the data mining step, for example by applying summary or aggregation operations to the selected data.
e.) Data Mining : In this stage, intelligent methods such as data mining algorithms are applied to extract patterns from the prepared data.
f.) Pattern Evaluation : This step interprets the mined patterns and relationships based on standard measures. If the evaluated patterns are not useful, the KDD process may be restarted from the beginning.
g.) Knowledge Presentation : It is the final stage of KDD. In this stage, the discovered knowledge is presented to the user in a simple and easy-to-understand format. Visualization techniques are mostly used to help users interpret and understand the results.
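The stages above can be sketched as a chain of small functions. This is only a toy illustration, not a standard implementation: the records, the field names and every function name (clean, integrate, select, transform, mine, evaluate, present, run_kdd) are hypothetical, and the "mining" step is deliberately trivial.

```python
# Hypothetical raw records from two sources; None marks a missing value.
STORE_A = [{"customer": "c1", "item": "milk", "amount": 2.0},
           {"customer": "c2", "item": "bread", "amount": None}]
STORE_B = [{"customer": "c1", "item": "bread", "amount": 1.5},
           {"customer": "c3", "item": "milk", "amount": 2.0},
           {"customer": "c3", "item": "pen", "amount": 0.5}]

def clean(records):
    """a.) Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """b.) Data integration: combine records from several sources."""
    return [r for source in sources for r in source]

def select(records, items):
    """c.) Data selection: keep only the items relevant to the analysis."""
    return [r for r in records if r["item"] in items]

def transform(records):
    """d.) Data transformation: aggregate the amount sold per item."""
    totals = {}
    for r in records:
        totals[r["item"]] = totals.get(r["item"], 0.0) + r["amount"]
    return totals

def mine(totals):
    """e.) Data mining (toy version): find the item with the highest total."""
    return max(totals, key=totals.get)

def evaluate(totals, pattern, min_total):
    """f.) Pattern evaluation: accept the pattern only above a threshold."""
    return totals[pattern] >= min_total

def present(totals, pattern):
    """g.) Knowledge presentation: format the result for the user."""
    return f"Top item: {pattern} (total {totals[pattern]:.2f})"

def run_kdd():
    records = integrate(clean(STORE_A), clean(STORE_B))
    totals = transform(select(records, {"milk", "bread"}))
    pattern = mine(totals)
    if not evaluate(totals, pattern, min_total=3.0):
        return "No useful pattern found"
    return present(totals, pattern)

print(run_kdd())
```

In a real project each function would be replaced by a much larger step, but the order of the calls in run_kdd mirrors stages a.) to g.).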
5.) Explain the applications of Data Warehousing and Data Mining .
Ans :-
Applications of Data Mining :
A.) Data Mining for financial Data Analysis :
*Loan Payment Prediction and customer analysis
*Classification and clustering of customers for targeted marketing.
*Detection of fraud and other financial crimes.
*Design and construction of data warehouses for multi-dimensional data analysis and data mining.
B.) Data Mining for retail and telecommunication industries :
*Collecting huge amounts of data on sales, customer shopping history, goods transportation, consumption and service.
*Identifying customer behavior, preferences and shopping patterns, improving the quality of customer service, etc.
*Analysis of sales campaigns.
*Design and construction of data warehouses.
*Multi-dimensional analysis.
*Fraud analysis.
C.) Data Mining in science and engineering :
*Data collection from geo-sciences, astronomy, space, terrains, meteorology, etc.
*Data warehouses and data pre-processing
*Mining complex data types
*Graph-based and network-based mining
D.) Intrusion Detection and Prevention :
*Signature based detection
*Anomaly based detection
Applications of Data Warehouse :
A.) Information Processing
B.) Analytical Processing
C.) Data Mining, etc.
6.) Differentiate between OLAP and OLTP.
Ans :-
The differences between OLAP and OLTP are as follows :
7.) Explain the data mining techniques.
Ans :-
The following are some of the data mining techniques :
A.) Statistics :
Statistics deals with the collection, analysis and presentation of large amounts of data. Statistical models are used to model data and data classes. Statistics is used for mining patterns from data as well as for understanding the factors that affect those patterns.
B.) Machine Learning :
Machine learning studies how computers can learn from data. It also involves recognizing complex patterns and making intelligent decisions based on data.
C.) Database System and Data Warehouse :
Databases and data warehouses store large data sets, including real-time and fast-streaming data. Data mining can make use of database systems for high efficiency and scalability. The capabilities of a DBMS can be extended, and data warehouses support OLAP, which is useful in data mining.
D.) Information Retrieval :
A large amount of information is present on the web in the form of documents, multimedia, audio, text, etc. So, information retrieval integrated with text mining and multimedia data mining has become a very important technique.
E.) Neural Networks :
A neural network is a set of interconnected input/output units that learn from data. Neural networks can be used to extract patterns and detect trends that are too difficult for humans or other computer techniques to find.
Some other data mining techniques are :
*Association
*Classification
*Clustering
*Prediction
*Sequential Patterns
*Decision Trees
*Regression
8.) Explain the apriori algorithm.
Ans :-
The Apriori algorithm finds all frequent item-sets in a set of transactions, building larger item-sets from smaller ones.
Given a minimum required support s as the interestingness criterion :
1. Search for all individual elements (1-element item-sets) that have a minimum support of s.
2. Repeat :
2.1 From the result of the previous search for i-element item-sets, search for all (i+1)-element item-sets that have a minimum support of s.
2.2 These become the set of all frequent (i+1)-element item-sets that are interesting.
3. Until the item-set size reaches its maximum, that is, until no larger frequent item-set is found.
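The steps above can be sketched in Python as follows. This is a minimal illustration, not an optimized implementation. It assumes that support is an absolute count of transactions (a fraction would work the same way), and the function name apriori and the sample baskets are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {item-set: support count} for every frequent item-set."""
    transactions = [frozenset(t) for t in transactions]

    def frequent_only(candidates):
        # Count how many transactions contain each candidate item-set
        # and keep only those that reach the minimum support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Step 1: frequent 1-element item-sets.
    items = {item for t in transactions for item in t}
    current = frequent_only({frozenset([i]) for i in items})
    frequent = dict(current)

    size = 1
    while current:  # Steps 2 and 3: repeat until nothing new is found.
        size += 1
        previous = set(current)
        # Step 2.1: join frequent (size-1)-element item-sets into candidates.
        candidates = {a | b for a in previous for b in previous
                      if len(a | b) == size}
        # Prune candidates that have an infrequent (size-1)-element subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in previous
                             for s in combinations(c, size - 1))}
        # Step 2.2: the surviving candidates with enough support are frequent.
        current = frequent_only(candidates)
        frequent.update(current)
    return frequent

# Hypothetical shopping baskets.
baskets = [{"bread", "milk"}, {"bread", "butter", "milk"},
           {"bread", "butter"}, {"milk"}]
result = apriori(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these baskets and a minimum support of 2, the pair {milk, butter} appears in only one basket, so it is dropped and no 3-element item-set can be frequent.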
You can find examples and further explanations in the notes :
9.) Explain the k-medoid algorithm.
Ans :-
10.) Mention the spatial database and its features.
Ans :-
A spatial database is a database that is optimized to store and query data that represents objects defined in a geometric space. Most spatial databases allow representing simple geometric objects such as points, lines and polygons. Spatial data is often accessed, manipulated or analyzed through Geographic Information Systems (GIS). Spatial databases can perform a wide variety of spatial operations.
Some of the features of spatial databases are as follows :
- Spatial Measurements: Compute line length, polygon area, the distance between geometries, etc.
- Spatial Functions: Modify existing features to create new ones, for example by providing a buffer around them, intersecting features, etc.
- Spatial Predicates: Allow true/false queries about spatial relationships between geometries. Examples include "do two polygons overlap?" or "is there a residence located within a mile of the area where we are planning to build the landfill?"
- Geometry Constructors: Create new geometries, usually by specifying the vertices (points or nodes) which define the shape.
- Observer Functions: Queries which return specific information about a feature, such as the location of the center of a circle.
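To make the measurement and predicate features above more concrete, here is a small sketch in plain Python. It is only an illustration of the ideas on flat (x, y) coordinates and does not use the API of any real spatial database. All function names (line_length, polygon_area, contains, within_distance) and the sample coordinates are hypothetical.

```python
import math

def line_length(points):
    """Spatial measurement: total length of a line through the given points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def polygon_area(vertices):
    """Spatial measurement: area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def contains(polygon, point):
    """Spatial predicate: is the point inside the polygon? (ray casting)"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def within_distance(a, b, limit):
    """Spatial predicate: are two points at most `limit` units apart?"""
    return math.dist(a, b) <= limit

# Hypothetical landfill site (a 4 x 3 rectangle) and a residence.
landfill = [(0, 0), (4, 0), (4, 3), (0, 3)]
residence = (5, 1)

print(polygon_area(landfill))                    # area of the site
print(contains(landfill, residence))             # is the house on the site?
print(within_distance((4, 1), residence, 1.0))   # is it within 1 unit of the edge point?
```

A real spatial database would also handle coordinate systems, spatial indexes and more complex geometries, which this sketch leaves out.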
11.) What is a data cube ? Explain with example.
Ans :-
A data cube is a multi-dimensional data model that allows data to be modeled and represented in multiple dimensions. It is defined by dimensions and facts. In general terms, dimensions are the perspectives or entities with respect to which an organization wants to keep records, while facts are the numeric measures, such as units sold, that are analyzed along those dimensions.
It is especially useful for representing business measures together with their dimensions. Each dimension of a cube represents a certain characteristic of the data, for example the time dimension of sales (daily, monthly or yearly). The data inside a data cube makes it possible to analyze the figures for any combination of customers, sales agents, products, and much more. Thus, a data cube can help to establish trends and analyze performance.
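As a small example, consider a sales cube with three dimensions (quarter, region, product) and one fact (units sold). The sketch below is only an illustration in plain Python. The data, the dimension names and the helper functions roll_up and slice_cube are hypothetical.

```python
from collections import defaultdict

# Dimensions of the cube; the last value in each row is the fact (units sold).
DIMENSIONS = ("quarter", "region", "product")

# Hypothetical fact rows: (quarter, region, product, units_sold)
SALES = [
    ("Q1", "North", "laptop", 10),
    ("Q1", "South", "laptop", 4),
    ("Q1", "North", "phone", 7),
    ("Q2", "North", "laptop", 6),
    ("Q2", "South", "phone", 5),
]

def roll_up(facts, keep):
    """Aggregate units sold over every dimension not listed in `keep`."""
    indexes = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in indexes)
        totals[key] += row[-1]
    return dict(totals)

def slice_cube(facts, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]

# Units sold per quarter (region and product are rolled up).
print(roll_up(SALES, ("quarter",)))

# Units sold per region and product (quarter is rolled up).
print(roll_up(SALES, ("region", "product")))

# Slice: only the South region, then total per product.
print(roll_up(slice_cube(SALES, "region", "South"), ("product",)))
```

Each call looks at the same facts from a different perspective, which is the main idea behind a data cube.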