Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures...

Data Mining

Models Created by Data Mining

• Linear Equations

• Rules

• Clusters

• Graphs

• Tree Structures

• Recurrent Patterns

Knowledge Discovery in Databases (KDD)

• Select target data

• Preprocess data

• Transform (if necessary)

• Mine the data for information

• Interpret discovered structures
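
The five KDD steps above can be sketched as a chain of small functions. This is a minimal illustration only: the records, field names, and the deliberately trivial "mining" step are all hypothetical, and a real pipeline would use a proper model at that stage.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
raw = [
    {"customer": "a", "age": 34, "spend": 120.0},
    {"customer": "b", "age": None, "spend": 80.0},
    {"customer": "c", "age": 51, "spend": 300.0},
    {"customer": "d", "age": 29, "spend": 95.0},
]

def select(records):
    # Select target data: keep only the fields of interest.
    return [{"age": r["age"], "spend": r["spend"]} for r in records]

def preprocess(records):
    # Preprocess: drop records that contain a missing value.
    return [r for r in records if None not in r.values()]

def transform(records):
    # Transform: rescale spend to [0, 1] (assumes spend values are not all equal).
    lo = min(r["spend"] for r in records)
    hi = max(r["spend"] for r in records)
    return [{**r, "spend_scaled": (r["spend"] - lo) / (hi - lo)} for r in records]

def mine(records):
    # Mine: a toy "pattern", the mean age of customers in the upper half of spend.
    high = [r["age"] for r in records if r["spend_scaled"] >= 0.5]
    return {"high_spender_mean_age": mean(high)}

def interpret(pattern):
    # Interpret: turn the discovered structure into a readable statement.
    return f"High spenders average {pattern['high_spender_mean_age']:.0f} years old"

result = interpret(mine(transform(preprocess(select(raw)))))
print(result)
```

Keeping each step as its own function mirrors the list above and makes it easy to skip the transform step when it is not necessary.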

Dependent and Independent Variables

• Dependent Variable - Attribute to be predicted.

• Independent Variable - Attributes used for making the prediction.
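
To make the distinction concrete, the sketch below fits a straight line by ordinary least squares, one of the simplest ways to predict a dependent variable from a single independent variable. The variable names and data (hours studied, exam score) are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]        # independent variable: used to make the prediction
score = [52, 54, 56, 58, 60]   # dependent variable: the attribute to be predicted

slope, intercept = fit_line(hours, score)
predicted = slope * 6 + intercept  # predicted score for 6 hours of study
print(slope, intercept, predicted)
```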

Fields Contributing to Data Mining

• Database Technology

• Statistics

• Machine Learning

• High Performance Computing

• Pattern Recognition

• Neural Networks

• Data Visualization

• Information Retrieval

Applications of Data Mining

• Decision Making

• Process Control

• Information Management

• Query Processing

Methods of Data Reduction

• Drill-down analysis

• Clustering

• Aggregation

• Simple Tabulation
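
Three of the reduction methods above (simple tabulation, aggregation, and drill-down) can be shown on a handful of rows. The sales records below are hypothetical, and the sketch uses only the Python standard library.

```python
from collections import Counter, defaultdict

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("north", "widget", 10.0),
    ("north", "gadget", 5.0),
    ("south", "widget", 7.5),
    ("south", "widget", 2.5),
    ("north", "widget", 4.0),
]

# Simple tabulation: how many rows mention each product.
product_counts = Counter(product for _, product, _ in sales)

# Aggregation: collapse many rows into one total per region.
revenue_by_region = defaultdict(float)
for region, _, amount in sales:
    revenue_by_region[region] += amount

# Drill-down: break one aggregate (the "north" total) into its products.
north_by_product = defaultdict(float)
for region, product, amount in sales:
    if region == "north":
        north_by_product[product] += amount

print(product_counts, dict(revenue_by_region), dict(north_by_product))
```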

Exploratory Data Analysis (EDA)

• Distributions of Variables

• Correlation Matrices

• Multi-way Frequency Tables

• Cluster Analysis

• Classification Trees

• Other multivariate techniques
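
As an illustration of two of the EDA tools listed above, the sketch below builds a small correlation matrix and a two-way frequency table. The column names and values are hypothetical, and the Pearson coefficient is computed by hand to keep the example free of third-party libraries.

```python
from collections import Counter
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical numeric columns.
columns = {
    "height": [150, 160, 170, 180],
    "weight": [50, 60, 70, 80],
    "errors": [8, 6, 4, 2],
}

# Correlation matrix: one coefficient for every pair of columns.
names = list(columns)
corr = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Two-way frequency table: counts for each (day type, outcome) combination.
observations = [
    ("weekday", "buy"), ("weekday", "browse"), ("weekend", "buy"),
    ("weekend", "buy"), ("weekday", "browse"),
]
freq = Counter(observations)

print(corr["height"]["weight"], corr["height"]["errors"], freq)
```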

Statistical Methods Used in Data Mining

• Regression Analysis

• Standard Distribution

• Cluster Analysis
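
Cluster analysis can be illustrated with a one-dimensional version of k-means, which is one common clustering technique (the slides do not name a specific algorithm, so this choice is an assumption). The points and starting centers are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on numbers: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, groups)
```

With these inputs the assignments stop changing after the second pass, so the fixed iteration count is more than enough; a fuller implementation would stop as soon as the centers no longer move.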

Industries Using Data Mining

• Banking

• Insurance

• Medicine

• Retail

• Security

• Sciences

Financial Uses of Data Mining

• Fraud Detection

• Money Laundering Detection

• Risk Management

Medical Uses of Data Mining

• Chemical Compounds

• Genetic Material

• Predictive Treatment Models

Retail Uses of Data Mining

• Direct Marketing

• Store Design

• Store Operations

Security Uses of Data Mining

• Assess crime patterns

• Homeland Security

• Identification of suspicious activities

• Pre-screening

Scientific Uses of Data Mining

• Image analysis

• Classification of large data sets

Other Novel Uses for Data Mining

• NBA’s Advanced Scout Program

• Firefly

Predictive Analytics

• An advanced form of data mining that makes prediction models for the behavior of variables in large data sets.

• Highly specialized for each application

Uses of Predictive Analytics

• Cost-Benefit Analysis

• Predicting Customer Behavior

• Reducing Costs

Financial Uses of Predictive Analytics

• Credit Ratings

• Economic Prediction Models

• Federal Reserve

Text Mining

• Extracts data from unstructured data sets

• Allows for data mining of large data sets that are not databases
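
One simple form of text mining is pulling structured records out of free text with a pattern. The sketch below does this with a regular expression; the sample text, the invoice wording, and the field names are hypothetical, and real text is rarely this regular.

```python
import re

# Hypothetical unstructured text.
notes = (
    "Invoice 1042 paid on 2021-03-05 for $250.00. "
    "Invoice 1043 paid on 2021-03-09 for $1,200.50."
)

# Capture the invoice number, the ISO date, and the dollar amount.
pattern = re.compile(
    r"Invoice (\d+) paid on (\d{4}-\d{2}-\d{2}) for \$([\d,]+\.\d{2})"
)

# Each match becomes one structured record that ordinary data mining can use.
records = [
    {"invoice": int(number), "date": date, "amount": float(amount.replace(",", ""))}
    for number, date, amount in pattern.findall(notes)
]
print(records)
```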

Sentiment Analysis

• Uses semantic techniques and keywords to detect favorable and unfavorable opinions toward specific subjects.
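
The keyword side of this idea can be sketched in a few lines. The word lists below are tiny and hypothetical, and the sketch ignores the semantic techniques mentioned above (negation, sarcasm, context), so it should be read as an illustration rather than a usable classifier.

```python
import re

# Hypothetical keyword lists; a real lexicon would be far larger.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def sentiment(text):
    """Label text by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

print(sentiment("I love this phone, the screen is great"))
print(sentiment("Terrible battery and poor support"))
```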

Privacy Concerns with Data Mining

• Big Brother

• Puts too much power into the hands of Governmental Security Forces

False Positives in Data Mining for Security Reasons

• Impose costs on both the public and the Government

• Subject of controversy and civilian mistrust
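
A short calculation shows why false positives matter so much when the thing being searched for is rare. Every figure below is hypothetical and chosen only to make the arithmetic easy to follow.

```python
# All figures are hypothetical, chosen only to illustrate the arithmetic.
population = 1_000_000        # people screened
actual_threats = 100          # true cases hidden in that population
detection_rate = 0.99         # share of true cases the model flags
false_positive_rate = 0.01    # share of innocent people the model flags

true_positives = actual_threats * detection_rate
false_positives = (population - actual_threats) * false_positive_rate

# Of everyone flagged, what fraction is a real case?
precision = true_positives / (true_positives + false_positives)

print(f"Flagged correctly: {true_positives:.0f}")
print(f"Flagged wrongly:   {false_positives:.0f}")
print(f"Share of flags that are real: {precision:.2%}")
```

Under these assumptions about 99 real cases are flagged alongside about 9,999 innocent people, so fewer than 1 in 100 flags points at a real case even though the model is right 99% of the time on each individual.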

Data Mining as Another Tool for Security

• Government doesn’t wish to interfere in civilian life

• Actual intrusions of privacy incur legal costs

• Useful for correlating with other sources of data

Visual and Speech Processing

• Examining large amounts of real-time input for specific data and relationships between data

• Requires a certain amount of predictive modeling

Data Mining is an Essential Use of Computers

• It makes the previously impossible possible

• Powerful tool for progress and understanding

• Lasting Impact