Data Mining By Dave Maung. What is Data Mining? The process of automatically searching large volumes...

Data Mining

By Dave Maung

What is Data Mining?

The process of automatically searching large volumes of data for patterns.

Also known as Knowledge Discovery in Databases (KDD).

Different types of Data Mining

Relational data mining

Text mining

Web mining

Relational Data Mining

Data mining technique for relational databases

Relational data mining algorithms look for patterns among multiple tables

Uses classification rules and association rules
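As a rough illustration of an association rule, the sketch below computes the two usual measures, support and confidence, over a handful of made-up transactions. The data, the item names, and the helper names support and confidence are assumptions for illustration only; in a relational setting the transactions might come from joining several tables.

```python
# Hypothetical market-basket rows, e.g. the result of joining an orders
# table with an order-items table. The items are made up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule "bread -> milk": how often the items occur together, and how often
# milk appears given that bread appears.
print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

A rule is usually kept only when both measures exceed thresholds chosen by the analyst.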

Classification

Predicting the class of an item

Finding rules that partition the given data into disjoint groups

A popular classification method is the decision tree

Decision Tree

A graph of decisions and their possible consequences

Decision trees are constructed to help make decisions.

A decision tree uses a tree structure.
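A minimal sketch of how such a tree can classify an item, assuming the tree is stored as nested dictionaries. The attributes (outlook, windy), the class labels, and the classify helper are hypothetical and are not taken from the slides.

```python
# A hypothetical decision tree stored as nested dicts. Internal nodes test
# one attribute; leaves are plain strings holding the predicted class.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
        "overcast": "play",
        "rainy": "stay in",
    },
}

def classify(node, record):
    """Walk from the root to a leaf, following the branch that matches the record."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node

print(classify(tree, {"outlook": "sunny", "windy": "no"}))   # play
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))  # stay in
```

Each path from the root to a leaf corresponds to one classification rule, so the leaves partition the records into disjoint groups.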

Example of Decision Tree

Text Mining

The process of extracting interesting, non-trivial information and knowledge from unstructured text

Text Mining (continued)

Also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text
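One very simple first step toward text mining is counting which terms dominate a piece of unstructured text. The sketch below is only that first step, not a full text-mining method; the stop-word list, the sample sentence, and the top_terms helper are assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to", "for"}

def top_terms(text, n=3):
    """Return the n most frequent non-stop-word terms with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

sample = (
    "Data mining is the search for patterns in data. "
    "Text mining applies mining to text."
)
print(top_terms(sample))
```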

Web Mining

The extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web

Web Mining (continued)

Three knowledge discovery domains pertain to web mining: Web Content Mining, Web Structure Mining, and Web Usage Mining

Web Content Mining

Is an automatic process that goes beyond keyword extraction.

There are two groups of web content mining strategies: those that directly mine the content of documents, and those that improve on the content search of other tools like search engines.

Web Structure Mining

The link structure of the World Wide Web can reveal more information than just the information contained in documents

Web Structure Mining (example)

Links pointing to a document indicate the popularity of the document.

Links coming out of a document indicate the richness or perhaps the variety of topics covered in the document.
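The two link measures above can be sketched by counting incoming and outgoing links in a small link graph. The page names and the graph itself are hypothetical, and raw link counts are only a crude stand-in for popularity or richness.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "index.html": ["about.html", "products.html", "blog.html"],
    "about.html": ["index.html"],
    "products.html": ["index.html", "blog.html"],
    "blog.html": ["index.html"],
}

# Outgoing links per page: a rough proxy for the variety of topics covered.
out_links = {page: len(targets) for page, targets in links.items()}

# Incoming links per page: a rough proxy for popularity.
in_links = Counter(target for targets in links.values() for target in targets)

print(in_links.most_common(1))            # [('index.html', 3)]
print(max(out_links, key=out_links.get))  # index.html
```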

Web Usage Mining

Web servers record and accumulate data about user interactions whenever requests for resources are received.

Analyzing the web access logs of different web sites
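A minimal sketch of reading such access logs, assuming the server writes entries in the Common Log Format. The sample lines, the LOG_PATTERN regular expression, and the parse helper are illustrative assumptions; real logs often carry extra fields.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format.
log_lines = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '192.0.2.1 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326',
]

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

records = [r for r in map(parse, log_lines) if r]

# Requests per resource: the simplest aggregate view of user interactions.
hits = Counter(r["path"] for r in records)
print(hits.most_common())
```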

Web Usage Mining (continued)

Two main tendencies in Web Usage Mining: General Access Pattern Tracking and Customized Usage Tracking

General Access Pattern Tracking

Analyzes the web logs to understand access patterns and trends

Gives better structure and grouping of resource providers

Can be used to restructure sites into a more efficient grouping, and to target specific users with specific ads

Customized Usage Tracking

Analyzes individual trends

Its purpose is to customize web sites to users

The success of the application depends on what and how much valid and reliable knowledge one can discover from the large raw log data.
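A rough sketch of tracking individual trends, assuming the raw log data has already been reduced to (user, page) pairs. The user names, pages, and the favorite_page helper are hypothetical; identifying users reliably from raw logs is itself a hard problem that this sketch does not address.

```python
from collections import Counter, defaultdict

# Hypothetical (user, page) pairs, assumed to be already extracted from logs.
visits = [
    ("alice", "/sports"), ("alice", "/sports"), ("alice", "/news"),
    ("bob", "/tech"), ("bob", "/news"), ("bob", "/tech"),
]

# One visit counter per user: a very small per-user profile.
profiles = defaultdict(Counter)
for user, page in visits:
    profiles[user][page] += 1

def favorite_page(user):
    """Most visited page for a user, or None when the user is unknown."""
    if user not in profiles:
        return None
    return profiles[user].most_common(1)[0][0]

print(favorite_page("alice"))  # /sports
print(favorite_page("bob"))    # /tech
```

A site could use such a profile to decide which section to show a returning user first.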

Web Mining Architecture

Reference

http://wikipedia.com