CSC 466: Knowledge Discovery From Data Alex Dekhtyar Department of Computer Science Cal Poly New...

CSC 466: Knowledge Discovery From Data Alex Dekhtyar Department of Computer Science Cal Poly New Computer Science Elective
  • date post

    21-Dec-2015


Page 1: CSC 466: Knowledge Discovery From Data Alex Dekhtyar Department of Computer Science Cal Poly New Computer Science Elective.

CSC 466: Knowledge Discovery From Data

Alex Dekhtyar, Department of Computer Science, Cal Poly

New Computer Science Elective

Page 2:

Outline

Why?

What?

How?

Discussion

Page 3:

Why?

Information Retrieval

Page 4:

Why?

Text Classification? Link Analysis?

Page 5:

Why?

Recommender Systems

Page 6:

Why?

Market Basket Analysis. Purchasing trends analysis.

Page 7:

Why?

Data Warehouse… and so much more…

Page 8:

Why?

Link Analysis

Page 9:

Why?

Cluster Analysis

Page 10:

Buzzwords

Data warehousing

Data mining

Information filtering

Recommender Systems

Information retrieval

Text classification

Web mining

OLAP

Cluster Analysis

Market basket analysis

Page 11:

Why?

As professionals, hobbyists, and consumers, students constantly interact with intelligent information management technologies

This is moving into the realm of undergraduate-level knowledge

Page 12:

@Calstate.edu

CSU Fullerton: CPSC 483 Data Mining and Pattern Recognition

CSU LA: CS 461 Machine Learning; CS 560 Advanced Topics in Artificial Intelligence

CSU Northridge: 595DM Data Mining

CSU Sacramento: CSC 177. Data Warehousing and Data Mining

CSU SF: CSC 869 - Data Mining

CSU San Marcos: CS475 Machine Learning; CS574 Intelligent Information Retrieval

Page 13:

What?

Undergraduate course

Informed consumers

Professionals

OLAP/Data Warehousing

Data Mining

Collaborative Filtering

Information Retrieval

1 quarter = 10 weeks

Knowledge Discovery from Data

Page 14:

What? (goals)

Understand KDD technologies @ consumer level

Understand basic types of data mining, information filtering, and information retrieval techniques

Use KDD to analyze information

Implement KDD algorithms

Understand/appreciate societal impacts

Page 15:

What? (syllabus in a nutshell)

Intro (data collections, measurement): 2 lectures

Data Warehousing/OLAP: 2 lectures

Data Mining:

  Association Rule Mining: 3 lectures

  Classification: 3 lectures

  Clustering: 3 lectures

Collaborative Filtering/Recommendations: 2 lectures

Information Retrieval: 4 lectures

19 lectures (= spring quarter)

CSC 466, Spring 2009 quarter

Page 16:

How? (Alex’s ideas)

Learn-by-doing...

Labs: work with existing software, analyze data, interpret

Labs: small groups, implement simple KDD techniques

Project: groups, find interesting data, analyze it…

Need to incorporate “societal issues”: privacy vs. data access, etc… Students to make informed choices

Lectures: breadth over depth; do a follow-up CSC 560 (grad. DB topics class)

Page 17:

How?

TODO List:

Find data for labs and projects

Investigate open source mining/retrieval software

Figure out the textbook (Web Data Mining by Bing Liu is promising)

Page 18:

How?

This slide intentionally left blank