Part I Data Mining Fundamentals. Data Mining: A First View Chapter 1.
Part I
Data Mining Fundamentals
Data Mining: A First View
Chapter 1
1.1 Data Mining: A Definition
Data Mining
The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data.
Induction-based Learning
The process of forming general concept definitions by observing specific examples of concepts to be learned.
Knowledge Discovery in Databases (KDD)
The application of the scientific method to data mining. Data mining is one step of the KDD process.
1.2 What Can Computers Learn?
Four Levels of Learning
• Facts
• Concepts
• Procedures
• Principles
Concepts
Computers are good at learning concepts. Concepts are the output of a data mining session.
Three Concept Views
• Classical View
• Probabilistic View
• Exemplar View
Supervised Learning
• Build a learner model using data instances of known origin.
• Use the model to determine the outcome of new instances of unknown origin.
Supervised Learning: A Decision Tree Example
Decision Tree
A tree structure where non-terminal nodes represent tests on one or more attributes and terminal nodes reflect decision outcomes.
Table 1.1 • Hypothetical Training Data for Disease Diagnosis
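| Patient ID# | Sore Throat | Fever | Swollen Glands | Congestion | Headache | Diagnosis |
|---|---|---|---|---|---|---|
| 1 | Yes | Yes | Yes | Yes | Yes | Strep throat |
| 2 | No | No | No | Yes | Yes | Allergy |
| 3 | Yes | Yes | No | Yes | No | Cold |
| 4 | Yes | No | Yes | No | No | Strep throat |
| 5 | No | Yes | No | Yes | No | Cold |
| 6 | No | No | No | Yes | No | Allergy |
| 7 | No | No | Yes | No | No | Strep throat |
| 8 | Yes | No | No | Yes | Yes | Allergy |
| 9 | No | Yes | No | Yes | Yes | Cold |
| 10 | Yes | Yes | No | Yes | Yes | Cold |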
Figure 1.1 A decision tree for the data in Table 1.1
Swollen Glands
• Yes: Diagnosis = Strep Throat
• No: Fever
  • Yes: Diagnosis = Cold
  • No: Diagnosis = Allergy
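The slides do not say which algorithm produced the tree in Figure 1.1. As one possible illustration, the sketch below builds a tree from the Table 1.1 data by repeatedly splitting on the attribute with the highest information gain (an ID3-style approach, chosen here as an assumption). The names `entropy`, `information_gain`, and `build_tree` are hypothetical helpers, not part of the original material.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Sore Throat", "Fever", "Swollen Glands", "Congestion", "Headache"]

# Table 1.1 rows: attribute values in ATTRIBUTES order, then the diagnosis.
ROWS = [
    ("Yes", "Yes", "Yes", "Yes", "Yes", "Strep throat"),
    ("No", "No", "No", "Yes", "Yes", "Allergy"),
    ("Yes", "Yes", "No", "Yes", "No", "Cold"),
    ("Yes", "No", "Yes", "No", "No", "Strep throat"),
    ("No", "Yes", "No", "Yes", "No", "Cold"),
    ("No", "No", "No", "Yes", "No", "Allergy"),
    ("No", "No", "Yes", "No", "No", "Strep throat"),
    ("Yes", "No", "No", "Yes", "Yes", "Allergy"),
    ("No", "Yes", "No", "Yes", "Yes", "Cold"),
    ("Yes", "Yes", "No", "Yes", "Yes", "Cold"),
]
TRAINING = [dict(zip(ATTRIBUTES + ["Diagnosis"], row)) for row in ROWS]


def entropy(instances):
    """Shannon entropy (in bits) of the Diagnosis values."""
    counts = Counter(inst["Diagnosis"] for inst in instances)
    total = len(instances)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(instances, attribute):
    """Reduction in entropy obtained by splitting on one attribute."""
    remainder = 0.0
    for value in {inst[attribute] for inst in instances}:
        subset = [inst for inst in instances if inst[attribute] == value]
        remainder += len(subset) / len(instances) * entropy(subset)
    return entropy(instances) - remainder


def build_tree(instances, attributes):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    classes = {inst["Diagnosis"] for inst in instances}
    if len(classes) == 1:
        return classes.pop()
    if not attributes:
        # No tests left: fall back to the most common diagnosis.
        return Counter(i["Diagnosis"] for i in instances).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(instances, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {inst[best] for inst in instances}:
        subset = [inst for inst in instances if inst[best] == value]
        branches[value] = build_tree(subset, remaining)
    return {best: branches}


tree = build_tree(TRAINING, ATTRIBUTES)
print(tree)
```

On this data the sketch tests Swollen Glands at the root and Fever below it, which matches the structure of Figure 1.1. Other split criteria could yield a different tree.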
Table 1.2 • Data Instances with an Unknown Classification
| Patient ID# | Sore Throat | Fever | Swollen Glands | Congestion | Headache | Diagnosis |
|---|---|---|---|---|---|---|
| 11 | No | No | Yes | Yes | Yes | ? |
| 12 | Yes | Yes | No | No | Yes | ? |
| 13 | No | No | No | No | Yes | ? |
Production Rules
IF Swollen Glands = Yes
THEN Diagnosis = Strep Throat
IF Swollen Glands = No & Fever = Yes
THEN Diagnosis = Cold
IF Swollen Glands = No & Fever = No
THEN Diagnosis = Allergy
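The three production rules can be written directly as a small function and applied to the unclassified instances of Table 1.2. This is a minimal sketch; the function name `diagnose` and the dictionary layout are assumptions made for illustration.

```python
def diagnose(patient):
    """Apply the three production rules in order."""
    if patient["Swollen Glands"] == "Yes":
        return "Strep Throat"
    if patient["Fever"] == "Yes":   # Swollen Glands = No at this point
        return "Cold"
    return "Allergy"                # Swollen Glands = No and Fever = No


# Instances from Table 1.2 (only the two attributes the rules test).
UNKNOWN = {
    11: {"Swollen Glands": "Yes", "Fever": "No"},
    12: {"Swollen Glands": "No", "Fever": "Yes"},
    13: {"Swollen Glands": "No", "Fever": "No"},
}

predictions = {pid: diagnose(p) for pid, p in UNKNOWN.items()}
for pid, diagnosis in predictions.items():
    print(f"Patient {pid}: {diagnosis}")
# Patient 11: Strep Throat
# Patient 12: Cold
# Patient 13: Allergy
```

Sore Throat, Congestion, and Headache are left out because none of the rules test them.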
Unsupervised Clustering
A data mining method that builds models from data without predefined classes.
Table 1.3 • Acme Investors Incorporated
| Customer ID | Account Type | Margin Account | Transaction Method | Trades/Month | Sex | Age | Favorite Recreation | Annual Income |
|---|---|---|---|---|---|---|---|---|
| 1005 | Joint | No | Online | 12.5 | F | 30–39 | Tennis | 40–59K |
| 1013 | Custodial | No | Broker | 0.5 | F | 50–59 | Skiing | 80–99K |
| 1245 | Joint | No | Online | 3.6 | M | 20–29 | Golf | 20–39K |
| 2110 | Individual | Yes | Broker | 22.3 | M | 30–39 | Fishing | 40–59K |
| 1001 | Individual | Yes | Online | 5.0 | M | 40–49 | Golf | 60–79K |
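To make the idea of grouping without predefined classes concrete, the sketch below groups the five Table 1.3 customers by how many categorical attribute values they share. The similarity measure (fraction of matching values), the merge threshold of 3/7, and the single-link merging strategy are all assumptions chosen for illustration; the slides do not prescribe a clustering algorithm. Trades/Month is omitted because it is numeric.

```python
from itertools import combinations

# Categorical attributes from Table 1.3, in this order:
# account type, margin account, transaction method, sex, age,
# favorite recreation, annual income.
CUSTOMERS = {
    1005: ("Joint", "No", "Online", "F", "30–39", "Tennis", "40–59K"),
    1013: ("Custodial", "No", "Broker", "F", "50–59", "Skiing", "80–99K"),
    1245: ("Joint", "No", "Online", "M", "20–29", "Golf", "20–39K"),
    2110: ("Individual", "Yes", "Broker", "M", "30–39", "Fishing", "40–59K"),
    1001: ("Individual", "Yes", "Online", "M", "40–49", "Golf", "60–79K"),
}


def similarity(a, b):
    """Fraction of attribute positions where the two customers agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(customers, threshold):
    """Single-link grouping: merge the groups of any pair whose
    similarity reaches the threshold."""
    groups = {cid: {cid} for cid in customers}
    for a, b in combinations(customers, 2):
        close = similarity(customers[a], customers[b]) >= threshold
        if close and groups[a] is not groups[b]:
            merged = groups[a] | groups[b]
            for cid in merged:
                groups[cid] = merged
    unique = []
    for group in groups.values():
        if group not in unique:
            unique.append(group)
    return unique


groups = cluster(CUSTOMERS, threshold=3 / 7)
print(groups)
```

With this threshold, customer 1013 ends up alone and the other four customers form one group. No class labels were supplied; the grouping comes only from the attribute values, and a different threshold or similarity measure would change it.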
1.3 Is Data Mining Appropriate for My Problem?
Data Mining or Data Query?
• Shallow Knowledge
• Multidimensional Knowledge
• Hidden Knowledge
• Deep Knowledge
Data Mining vs. Data Query: An Example
• Use data query if you already know, or almost know, what you are looking for.
• Use data mining to find regularities in data that are not obvious.
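A data query answers a question whose shape is already known. The sketch below loads three columns of the Table 1.3 customers into an in-memory SQLite database and asks for online customers who make more than four trades a month. The table name, column names, and the cutoff of four trades are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER, transaction_method TEXT, trades_per_month REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1005, "Online", 12.5),
        (1013, "Broker", 0.5),
        (1245, "Online", 3.6),
        (2110, "Broker", 22.3),
        (1001, "Online", 5.0),
    ],
)

# The analyst already knows which condition to test; the query only retrieves.
rows = conn.execute(
    "SELECT customer_id FROM customers "
    "WHERE transaction_method = 'Online' AND trades_per_month > 4 "
    "ORDER BY customer_id"
).fetchall()
online_active_ids = [row[0] for row in rows]
print(online_active_ids)  # [1001, 1005]
conn.close()
```

The query cannot suggest which attributes or cutoffs are worth testing in the first place. Finding such regularities is the part left to data mining.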
1.4 Expert Systems or Data Mining?
Expert System
A computer program that emulates the problem-solving skills of one or more human experts.
Knowledge Engineer
A person trained to interact with an expert in order to capture the expert's knowledge.
Figure 1.2 Data mining vs. expert systems
• Data → Data Mining Tool → If Swollen Glands = Yes Then Diagnosis = Strep Throat
• Human Expert → Knowledge Engineer → Expert System Building Tool → If Swollen Glands = Yes Then Diagnosis = Strep Throat
1.5 A Simple Data Mining Process Model
Figure 1.3 A simple data mining process model
Components shown: Operational Database, Data Warehouse, SQL Queries, Data Mining, Interpretation & Evaluation, Result Application
Assembling the Data
• The Data Warehouse
• Relational Databases and Flat Files
Mining the Data
Interpreting the Results
Result Application
1.7 Data Mining Applications