Data Mining By Dave Maung. What is Data Mining? The process of automatically searching large volumes...

Post on 03-Jan-2016


Data Mining

By

Dave Maung

What is Data Mining?

The process of automatically searching large volumes of data for patterns.

Also known as Knowledge Discovery in Databases (KDD).

Different types of Data Mining

Relational data mining

Text mining

Web mining

Relational Data Mining

Data mining technique for relational databases

Relational data mining algorithms look for patterns among multiple tables

Uses classification rules and association rules
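
As a rough illustration of what an association rule measures, the sketch below computes support and confidence for one rule over a tiny, made-up set of transactions. The item names, the `support` and `confidence` helpers, and the data are all assumptions for illustration; a real relational miner would draw its transactions from joined tables.

```python
# Hypothetical transactions: each set holds the items bought together.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent) / support(antecedent)


# Rule under test: {bread} -> {butter}
rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

A rule is usually kept only when both values exceed thresholds chosen by the analyst.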

Classification

Predicting the class of an item

Finding rules that partition the given data into disjoint groups

A popular classification method is the decision tree

Decision Tree

A graph of decisions and their possible consequences

Decision trees are constructed to help make decisions.

A decision tree uses a tree structure.

Example of Decision Tree
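
The figure from the original slide is not reproduced here. As a stand-in, the sketch below hand-builds a small decision tree and walks it to classify an item. The attributes (`outlook`, `humidity`, `windy`), the class labels, and the `classify` helper are hypothetical and chosen only to show the idea.

```python
# Hypothetical decision tree: internal nodes test one attribute,
# leaves (plain strings) hold the predicted class.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}


def classify(node, item):
    """Walk from the root to a leaf, following the branch that matches the item."""
    while isinstance(node, dict):
        value = item[node["attribute"]]
        node = node["branches"][value]
    return node


print(classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": "no"}))
```

Each path from the root to a leaf corresponds to one classification rule, so the leaves partition the items into disjoint groups.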

Text Mining

The process of extracting interesting, non-trivial information and knowledge from unstructured text

Text Mining (continued)

Also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text

Web Mining

The extraction of interesting, potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web

Web Mining (continued)

Three knowledge discovery domains pertain to web mining: Web Content Mining, Web Structure Mining, and Web Usage Mining

Web Content Mining

Is an automatic process that goes beyond keyword extraction.

There are two groups of web content mining strategies: those that mine the content of documents directly, and those that improve on the content search of other tools like search engines.

Web Structure Mining

The link structure of the World Wide Web can reveal more information than just the information contained in documents

Web Structure Mining (example)

Links pointing to a document indicate the popularity of the document.

Links coming out of a document indicate the richness or perhaps the variety of topics covered in the document.
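
The two link counts described above can be sketched as in-degree and out-degree over a small link graph. The page names and the `links` mapping are made up for illustration, and raw link counts are only a crude proxy for popularity or richness.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "home.html": ["products.html", "about.html", "blog.html"],
    "products.html": ["home.html"],
    "about.html": ["home.html"],
    "blog.html": ["home.html", "products.html"],
}

# Out-degree: number of links coming out of each page.
out_degree = {page: len(targets) for page, targets in links.items()}

# In-degree: number of links pointing to each page.
in_degree = Counter(target for targets in links.values() for target in targets)

# The page with the most incoming links is the "most popular" by this measure.
most_linked = max(links, key=lambda page: in_degree[page])
print(most_linked, in_degree[most_linked], out_degree[most_linked])
```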

Web Usage Mining

Web servers record and accumulate data about user interactions whenever requests for resources are received.

Analyzing the web access logs of different web sites

Web Usage Mining (continued)

Two main tendencies in Web Usage Mining: General Access Pattern Tracking and Customized Usage Tracking

General access pattern tracking

Analyzes the web logs to understand access patterns and trends

Gives better structure and grouping of resource providers

Can be used to restructure sites into more efficient groupings and to target specific users with specific ads
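
A minimal sketch of general access pattern tracking, assuming the server writes its log in Common Log Format. The log lines, the `LOG_PATTERN` regular expression, and the `parse` helper are illustrative assumptions, not taken from any particular site.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format.
log_lines = [
    '10.0.0.1 - - [03/Jan/2016:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [03/Jan/2016:10:00:05 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.1 - - [03/Jan/2016:10:00:09 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.3 - - [03/Jan/2016:10:01:00 +0000] "GET /missing.html HTTP/1.1" 404 128',
]

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


records = [r for r in map(parse, log_lines) if r is not None]

# Count successful requests per resource to see which pages are accessed most.
hits = Counter(r["path"] for r in records if r["status"] == "200")
print(hits.most_common())
```

Counts like these are the raw material for deciding which resources to group together or surface more prominently.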

Customized usage tracking

Analyzes individual trends

Used to customize web sites to users

The success of the application depends on what and how much valid and reliable knowledge one can discover from the large raw log data.
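
Customized usage tracking can be sketched by grouping requests per visitor instead of per resource. The example assumes that visitors are identified by IP address, which is a simplification, and the `requests` data and `favourite_page` helper are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (visitor, page) pairs already extracted from a server log.
requests = [
    ("10.0.0.1", "/index.html"),
    ("10.0.0.1", "/laptops.html"),
    ("10.0.0.2", "/index.html"),
    ("10.0.0.1", "/laptops.html"),
    ("10.0.0.2", "/phones.html"),
]

# Build one page-count profile per visitor.
pages_by_visitor = defaultdict(Counter)
for visitor, page in requests:
    pages_by_visitor[visitor][page] += 1


def favourite_page(visitor):
    """Return the page this visitor requested most often, or None if the visitor is unseen."""
    counts = pages_by_visitor.get(visitor)
    return counts.most_common(1)[0][0] if counts else None


print(favourite_page("10.0.0.1"))
```

A site could use such a profile to decide what to show a returning visitor first, provided the log data is reliable enough to trust the profile.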

Web Mining Architecture

Reference

http://wikipedia.com