Data mining
![Page 1: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/1.jpg)
Data Mining
Priyabrata Satapathy
M.Tech 1st Year
SIS NO.-MCS12121
![Page 2: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/2.jpg)
Contents
What is Data mining.
Why Data mining needed.
Data, Information, Knowledge.
Data mining & KDD.
Data Warehouses.
Data Cleaning.
Applications of Data mining.
![Page 3: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/3.jpg)
What is Data Mining
Data mining (knowledge discovery in databases):
Extraction of interesting information or patterns from data in large databases.
Knowledge discovery in databases (KDD) is the process of identifying valid, useful, and ultimately understandable patterns in data from large databases.
![Page 4: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/4.jpg)
Why Data Mining is Needed
Data mining is needed to provide tools for discovering knowledge from data.
Data mining turns a large collection of data into knowledge.
![Page 5: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/5.jpg)
The Data
Data are any facts, numbers, or text that can be processed by a computer.
Operational or transactional data: such as sales, cost, inventory, payroll, and accounting.
Metadata: data about the data itself, such as logical database design or data dictionary definitions.
Nonoperational data: such as industry sales, forecast data, and macroeconomic data.
![Page 6: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/6.jpg)
The Information
The patterns, associations, or relationships among all this data can provide information.
For example, analysis of retail point of sale transaction data can yield information on which products are selling and when.
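As a minimal sketch of that retail example, the snippet below totals units sold per product and per hour of day. The records, the field layout, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical point-of-sale records: (product, hour_of_day, quantity).
transactions = [
    ("milk", 9, 2),
    ("bread", 9, 1),
    ("milk", 18, 3),
    ("eggs", 18, 1),
    ("bread", 18, 2),
]

def units_by_product(records):
    """Total units sold per product (which products are selling)."""
    totals = Counter()
    for product, _hour, qty in records:
        totals[product] += qty
    return totals

def units_by_hour(records):
    """Total units sold per hour of day (when products are selling)."""
    totals = Counter()
    for _product, hour, qty in records:
        totals[hour] += qty
    return totals

product_totals = units_by_product(transactions)
hour_totals = units_by_hour(transactions)
print(product_totals.most_common(1))  # best-selling product
print(hour_totals.most_common(1))     # busiest hour
```

Raw transaction rows are data; the per-product and per-hour totals are the kind of information the slide refers to.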
![Page 7: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/7.jpg)
The Knowledge
Information can be converted into knowledge about historical patterns and future trends.
For example, summary information on retail supermarket sales can be analyzed in light of promotional efforts to provide knowledge of consumer buying behavior.
![Page 8: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/8.jpg)
Data, Information & Knowledge
![Page 9: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/9.jpg)
Data Mining & KDD
Data cleaning: Used to remove noise and inconsistent data.
Data integration: Where multiple data sources may be combined.
Data selection: Where data relevant to the analysis task are retrieved from the database.
Data transformation: Where data are transformed or consolidated into forms appropriate for mining by performing summary operations.
Data mining: An essential process where intelligent methods are applied in order to extract data patterns.
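The KDD steps listed here can be sketched as a chain of small functions. This is only a toy illustration under stated assumptions: the two sources, the record fields, and every function name are hypothetical, and the "mining" step is a trivial stand-in (picking the region with the highest total) rather than a real mining algorithm.

```python
from collections import defaultdict

# Two hypothetical sources that share the same record layout.
store_a = [
    {"region": "east", "product": "milk", "amount": 4.0},
    {"region": "east", "product": "bread", "amount": None},  # missing value
]
store_b = [
    {"region": "west", "product": "milk", "amount": 6.0},
    {"region": "west", "product": "milk", "amount": 5.0},
]

def clean(records):
    """Data cleaning: drop records whose amount is missing."""
    return [r for r in records if r["amount"] is not None]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, product):
    """Data selection: keep only the records relevant to the task."""
    return [r for r in records if r["product"] == product]

def transform(records):
    """Data transformation: summarize amounts per region."""
    totals = defaultdict(float)
    for r in records:
        totals[r["region"]] += r["amount"]
    return dict(totals)

def mine(summary):
    """Data mining (toy stand-in): find the region with the highest total."""
    return max(summary, key=summary.get)

combined = integrate(clean(store_a), clean(store_b))
milk_sales = select(combined, "milk")
summary = transform(milk_sales)
top_region = mine(summary)
print(summary, top_region)
```

Each function maps to one step, in the order given on the slide, so a step can be replaced by a more realistic implementation without touching the others.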
![Page 10: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/10.jpg)
Data Mining & KDD
![Page 11: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/11.jpg)
Data Warehouse
A data warehouse is a repository of information collected from multiple sources, stored under a unified schema, and residing at a single site.
A data warehouse is constructed through a process of data cleaning, data integration, data transformation, data loading, and data refreshing.
![Page 12: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/12.jpg)
Data Cleaning
Data that is to be analyzed by data mining techniques can be incomplete, noisy, and inconsistent.
Data cleaning routines attempt to fill in missing values, smooth out noise while identifying outliers, and correct inconsistencies in the data.
![Page 13: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/13.jpg)
Missing Values
We can clean the missing values in data by:
Ignoring the tuple.
Filling in the missing value manually.
Using a global constant to fill in the values.
Using a measure of central tendency, such as the mean or median, to fill in the missing value.
Using the most probable value to fill in the missing value.
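A minimal sketch of these strategies, using only Python's standard `statistics` module. The `ages` attribute and the helper names are hypothetical. Note one simplification: the "most probable value" is approximated here by the most frequent value (the mode), which is an assumption of this example and not the only way to estimate it.

```python
from statistics import mean, median, mode

# Hypothetical attribute with missing entries marked as None.
ages = [25, None, 31, 25, None, 40]

def drop_missing(values):
    """Ignore the tuple: keep only entries that have a value."""
    return [v for v in values if v is not None]

def fill_constant(values, constant):
    """Use a global constant for every missing entry."""
    return [constant if v is None else v for v in values]

def fill_with(values, statistic):
    """Fill missing entries with a statistic of the known values."""
    fill = statistic(drop_missing(values))
    return [fill if v is None else v for v in values]

print(drop_missing(ages))         # tuples with missing values ignored
print(fill_constant(ages, -1))    # global constant (here -1)
print(fill_with(ages, mean))      # mean of known values
print(fill_with(ages, median))    # median of known values
print(fill_with(ages, mode))      # most frequent value as a stand-in
```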
![Page 14: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/14.jpg)
Noisy Data
Noisy data is data that contains errors. To handle noisy data:
Binning: Binning methods smooth a sorted data value by consulting the values in its neighborhood.
Regression: Data can be smoothed by regression, where the data values are fitted to a function.
Outlier analysis: Outliers may be detected by clustering. Similar values are grouped into clusters, and values that fall outside the clusters are outliers.
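As one illustration of binning, the sketch below applies smoothing by bin means: the values are sorted, split into bins of equal size, and each value is replaced by the mean of its bin. The function name, the sample prices, and the bin size of 3 are assumptions made for this example; other variants (smoothing by bin medians or bin boundaries) are not shown.

```python
from statistics import mean

def smooth_by_bin_means(values, bin_size):
    """Sort the values, partition them into bins of bin_size items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed

# Hypothetical price data for illustration.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)  # bins (4, 8, 15), (21, 21, 24), (25, 28, 34) become 9, 22, 29
```

Each value is pulled toward its neighbors, which reduces local noise at the cost of some detail.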
![Page 15: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/15.jpg)
Applications of Data Mining
Data mining for Financial data analysis.
Data mining for Retail and Telecommunication Industries.
Data mining for Science and Engineering.
Data mining and Recommender systems.
![Page 16: Data mining](https://reader034.fdocuments.in/reader034/viewer/2022042816/5587a150d8b42a2a368b4625/html5/thumbnails/16.jpg)
Thank You