Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research...
Transcript of Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research...
![Page 2: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/2.jpg)
2
Data Mining
• Extraction knowledge (rules, patterns, constraints) from"large" databases
• Data Deluge & Big Data introduce more complexity– We are immersed in data, but hungry for knowledge!
![Page 3: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/3.jpg)
3
Overview: Confluence of various disciplines
Data Mining
Databases Statistics
Domain Area
Artificial Intelligence
Visualization
![Page 4: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/4.jpg)
4
Goal: Answer key questions
• Consider an example of a internet sales system
• Do we have the data?– Transactions involving credit cards, loyalty cards, central customer
service, lifestyle customer
• Can we identify groups?– Identified clusters for "models" of customers who share certain
characteristics: interest, income, spending habits
• Can we classify or predict?– Identified patterns of consumption?
– Can establish a pattern of consumption for single and married?
• Can we associate patterns?– Association / correlation between product sales
– Prediction based on association information
![Page 5: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/5.jpg)
5
Data Mining Process (DMP)
1. Selection
DataSets
2. Pre-processing
3. Mining
Algorithms
4. Pattern
Evaluation
Knowledge
![Page 6: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/6.jpg)
6
DMAA - Research Project
• Goal– Develop Data Mining Algorithms and Application for practical problems
• Research Process– Basic Research
• Produce data knowledge extraction algorithms or tools without immediateapplication for real problems
– Applied Research
• Usage of know or developed algorithms or tools for solving real problems
Basic Research Applied Research
![Page 7: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/7.jpg)
7
Basic Research
• Data Management & Data Mining Process– Data Streaming
– Data Sources Location
– Uniform Data Model
– Data Experiment Representation
– Modeled through Workflows
• Algorithms and Methods– Research on specific Data Mining Activities
![Page 8: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/8.jpg)
8
DMP: Step #1 – Selection and Integration
• Types of data– Relational, Spatial, Temporal, Textual, Graph
• Sources– Centralized
– Distributed
– Streaming
• Integration– Data integration from different sources
• Dataset Production• Projection
• Selection
![Page 9: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/9.jpg)
9
DMP: Step #2 – Pre-processing
• Very important step– 60% of Data Mining effort!
• Data Reduction– Filtering or Sampling
• Attribution Reduction– Selection of important attributed
– Reduction of Dimensionality
• Transformation– Combination
– Derivation
– Aggregation
![Page 10: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/10.jpg)
10
DMP: Step #3 – Data Mining Algorithms and Methods
• Focus: Searching for patterns of interest
• Data Mining Methods– Clustering
– Classification and Prediction
– Association
– New Approaches
![Page 11: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/11.jpg)
11
DMP: Step #4 – Pattern Evaluation
• Error Measurements– Error Metrics
• Evaluation of patterns and extracted knowledge– Visualization
– Transformation
– Redundant pattern removal
• Use of discovered knowledge
• Are all discovered patterns important?
![Page 12: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/12.jpg)
12
Applied Research
• Focus on solving target application areas (datasets)
• Basic domain of target area– data
– Problem
– Identification of goals
• Examples– Astronomy
– Production engineering
– Social Networks
– Computing in Educational
![Page 13: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/13.jpg)
13
Evaluation of Enterprise Social Networks Adoption
![Page 14: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/14.jpg)
14
Time Series Prediction
![Page 15: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/15.jpg)
15
Forecast Sea Surface Temperature in Atlantic Pacific Ocean
![Page 16: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/16.jpg)
16
Identification of Stars and Gallaxies
NGC 7793: a spiral galaxy
![Page 17: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/17.jpg)
17
Production of Access Flow Dataset for
Identification of Deny of Service (DoS) Attacks
![Page 18: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/18.jpg)
18
Clustering of Access Flow Dataset
(Clustering of Time Series)
![Page 19: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/19.jpg)
19
Forecasting Delays in Airports
![Page 20: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/20.jpg)
20
Prediction of electrical power output
of a base load operated combined cycle power plan
Tufekci, P. 2014
![Page 21: Mining Frequent Patterns Without Candidate …eogasawara/wp-content/uploads/...6 DMAA -Research Project •Goal –Develop Data Mining Algorithms and Application for practical problems](https://reader033.fdocuments.in/reader033/viewer/2022041620/5e3e382fc69bdf4caf5bfe37/html5/thumbnails/21.jpg)
21
Referências