Data Mining CS 157B Section 2 Keng Teng Lao. Overview Definition of Data Mining Application of Data...

Data Mining

CS 157B Section 2

Keng Teng Lao

Overview

• Definition of Data Mining

• Application of Data Mining

Data Mining

• Refers to the mining or discovery of new information in terms of patterns or rules from vast amounts of data.

• To be useful, data mining must be carried out efficiently on large files and databases.

KDD

• Knowledge Discovery in Databases

• Process flow (from the slide diagram): Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation
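The KDD stages named on this slide can be sketched as a chain of small functions. This is only a toy illustration: the records, field names, and the `min_count` threshold are hypothetical, and the "mining" step is a plain frequency count rather than a real mining algorithm.

```python
# Hypothetical toy records from two source databases (illustrative only).
store_a = [{"customer": "ann", "item": "milk"}, {"customer": "bob", "item": None}]
store_b = [{"customer": "ann", "item": "bread"}, {"customer": "cy", "item": "milk"}]


def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: combine the cleaned sources into one warehouse-like list."""
    return [r for source in sources for r in source]


def select(warehouse, field):
    """Selection: keep only the task-relevant field of each record."""
    return [r[field] for r in warehouse]


def mine(values):
    """Data mining (stand-in): count how often each value occurs."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return counts


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that reach the minimum count."""
    return {k: c for k, c in patterns.items() if c >= min_count}


warehouse = integrate(clean(store_a), clean(store_b))
task_data = select(warehouse, "item")
patterns = evaluate(mine(task_data), min_count=2)
print(patterns)
```

Each function corresponds to one box in the diagram, so swapping in a real mining step only changes `mine`.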

Data Mining Vs. Data Warehousing

• The goal of a data warehouse is to support decision making with data.

• Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.

Goals of Data Mining and Knowledge Discovery

• Prediction – Data mining can show how certain attributes within the data will behave in the future.

• Identification – Data patterns can be used to identify the existence of an item, an event, or an activity.

Goals of Data Mining and Knowledge Discovery (Cont.)

• Classification – Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.

• Optimization – One eventual goal of data mining may be to optimize the use of limited resources such as time, space… in order to maximize output variables such as sales or profits under a given set of constraints.

Types of Knowledge Discovered During Data Mining

• Association rules

• Classification hierarchies

• Sequential patterns

• Patterns within time series

• Clustering
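Association rules are usually judged by two measures, support and confidence. The sketch below computes both for a rule of the form "milk ⇒ bread"; the transactions and the function names are made up for illustration and are not taken from the slides.

```python
# Hypothetical market-basket transactions (illustrative only).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "juice"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """support(lhs and rhs together) / support(lhs).

    Assumes lhs occurs in at least one transaction; otherwise this divides by zero.
    """
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)


rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(rule_support, rule_confidence)
```

Here two of the four baskets contain both milk and bread, and two of the three baskets with milk also contain bread.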

Classification hierarchies

• Process of learning a model that describes different classes of data.

• Decision Tree
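As a rough illustration of learning a classification model, the sketch below learns a one-level decision tree (a "decision stump"). This is a deliberate simplification of full decision tree induction: it picks the single attribute whose values misclassify the fewest training records. The loan-style records and attribute names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (attributes, class label).
training = [
    ({"income": "high", "student": "no"}, "approve"),
    ({"income": "high", "student": "yes"}, "approve"),
    ({"income": "low", "student": "no"}, "reject"),
    ({"income": "low", "student": "yes"}, "reject"),
    ({"income": "low", "student": "no"}, "reject"),
]


def learn_stump(records):
    """Return (attribute, rule) for the attribute with the fewest training errors.

    rule maps each observed attribute value to its majority class label.
    """
    best_attr, best_rule, best_errors = None, None, None
    for attr in records[0][0]:
        by_value = defaultdict(Counter)
        for attrs, label in records:
            by_value[attrs[attr]][label] += 1
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        errors = sum(1 for attrs, label in records if rule[attrs[attr]] != label)
        if best_errors is None or errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    return best_attr, best_rule


def classify(record, attr, rule):
    """Apply the learned one-level tree to a new record."""
    return rule[record[attr]]


attr, rule = learn_stump(training)
print(attr, rule)
print(classify({"income": "low", "student": "yes"}, attr, rule))
```

A full decision tree would repeat this attribute choice recursively on each branch, typically using a measure such as information gain instead of a raw error count.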

Sequential Patterns

• The discovery of sequential patterns is based on the concept of a sequence of itemsets.

• The goal is to find all subsequences, from the given set of sequences, that meet a user-defined minimum support.
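To make "support of a subsequence" concrete, the sketch below checks whether a candidate pattern (a list of itemsets) occurs in order within each customer sequence and reports the fraction of sequences that contain it. The data, the function names, and the 0.5 threshold are assumptions for illustration; a real miner would also generate the candidate patterns rather than test a single one.

```python
def contains(sequence, pattern):
    """True if the pattern's itemsets appear, in order, as subsets of the sequence's itemsets."""
    pos = 0
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def sequence_support(pattern, sequences):
    """Fraction of sequences that contain the pattern as a subsequence."""
    return sum(1 for s in sequences if contains(s, pattern)) / len(sequences)


# Hypothetical purchase histories: each is an ordered list of itemsets.
sequences = [
    [{"milk"}, {"bread", "butter"}, {"juice"}],
    [{"milk", "bread"}, {"juice"}],
    [{"bread"}, {"milk"}],
]
pattern = [{"milk"}, {"juice"}]
min_support = 0.5  # user-defined threshold (assumed value)

s = sequence_support(pattern, sequences)
print(s, s >= min_support)
```

The pattern "milk, then later juice" appears in the first two histories but not the third, so it passes the assumed threshold.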

Patterns within Time Series

• Time series are sequences of events.

• Each event may be a given fixed type of a transaction.

• The closing price of a stock or a fund is an event that occurs every weekday for each stock or fund.
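One very simple pattern in a series of daily closing prices is a run of consecutive rising closes. The sketch below finds the longest such run; the prices are invented, and this is only one of many possible time-series patterns.

```python
# Hypothetical daily closing prices for one stock (illustrative only).
closes = [10.0, 10.5, 10.4, 10.6, 10.9, 11.2, 11.0]


def longest_rising_run(prices):
    """Length, in days, of the longest stretch of strictly increasing closes."""
    if not prices:
        return 0
    best = current = 1
    for prev, cur in zip(prices, prices[1:]):
        # Extend the current run on a higher close, otherwise restart it.
        current = current + 1 if cur > prev else 1
        best = max(best, current)
    return best


print(longest_rising_run(closes))
```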

Application of Data Mining

• Marketing – Applications include analysis of consumer behavior based on buying patterns.

• Finance – Applications include analysis of creditworthiness of clients, segmentation of accounts receivable…

Application of Data Mining (Cont.)

• Manufacturing – Applications involve optimization of resources like machines, manpower, and materials.

• Health Care – Applications include discovering patterns in radiological images, analyzing side effects of drugs…

Real-Life Application

• The Los Angeles Police Department's counterterrorism unit is using a new data-analysis system designed to identify and connect related pieces of intelligence to help officers deter and respond to terrorist attacks.

References

• Elmasri, Ramez, and Shamkant B. Navathe. Fundamentals of Database Systems. Pearson. Singapore. 2004.

• LAPD turns to data analysis to fight terrorism. <http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=107670>