Data Mining Education Across Disciplines

Dr. Alan Safer and Dr. Lesley Farmer

California State University Long Beach

asafer@csulb.edu / lfarmer@csulb.edu

What is Data Mining?

Data mining is essentially the process of uncovering meaningful new correlations, patterns and trends from large quantities of complex data using statistical and mathematical techniques.

Looking for Differences Among Disciplines

Keywords

Publications

Courses offered

Textbooks used

Software used

Methodology

Identification of accredited programs through professional associations

Content analysis of graduate program websites

Literature review of discipline-specific database aggregators (search term: “data min*”)
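The truncated search term can be illustrated with a minimal Python sketch. It assumes the trailing asterisk means "zero or more further word characters", which is a common truncation convention but not something the slides specify. The function name `truncation_to_regex` and the sample titles are hypothetical, invented only to exercise the pattern.

```python
import re


def truncation_to_regex(term: str) -> re.Pattern:
    """Convert a truncated search term such as 'data min*' into a regex.

    Assumption: '*' stands for zero or more word characters at the end of
    the stem, matched case-insensitively.
    """
    escaped = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)


# Hypothetical article titles (not taken from the study).
titles = [
    "Data Mining in Higher Education",
    "A Data Miner's Toolkit",
    "Mining Engineering Data",
    "Big data minimization strategies",
]

pattern = truncation_to_regex("data min*")
matches = [title for title in titles if pattern.search(title)]

for title in matches:
    print(title)
```

Note that the pattern also matches "data minimization", so results from a truncated search usually need manual screening.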

Data Mining Education

Business: identifying credit card fraud, insider trading patterns, defect analyses

Sciences: Medicare fraud, astronomical variations, and disease risk

Statistics: fuzzy logic, theory and applications

Library/Information Sciences: research and best practices (e.g., health, business)

Course Offerings

Discipline                      Departments with        Percent of departments in
                                data mining courses     discipline offering data mining courses

Business                        83                      17.60%

Computer Science/Engineering    187                     48.80%

Statistics                      46                      28.00%

Library/Information Science     15                      30.00%
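The counts and percentages in the Course Offerings figures imply an approximate number of departments examined in each discipline. The sketch below back-calculates those totals; they are derived estimates, not numbers stated in the slides, and the helper name `implied_total` is hypothetical.

```python
# (departments with data mining courses, percent of departments in discipline)
# Values are copied from the Course Offerings slide.
offerings = {
    "Business": (83, 17.6),
    "Computer Science/Engineering": (187, 48.8),
    "Statistics": (46, 28.0),
    "Library/Information Science": (15, 30.0),
}


def implied_total(count: int, percent: float) -> int:
    """Back-calculate the total: count / (percent / 100), rounded to an integer."""
    return round(count / (percent / 100))


implied = {name: implied_total(count, pct) for name, (count, pct) in offerings.items()}

for name, total in implied.items():
    print(f"{name}: roughly {total} departments examined")
```

Under these assumptions the implied totals are roughly 472, 383, 164, and 50 departments, which shows that the percentages rest on very different sample sizes.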

Comparative Key Words

Business: decision-making, management, and competition

Computer science/engineering: technology-related and intelligence-related terms

Statistics: methodological terms

Library/info science: greatest variation; most terms associated with applications (e.g., bibliometrics, health informatics, and information management)

Greatest overlap existed between business and library/information science due to decision-making methodology and management issues.
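One way keyword overlap between disciplines could be quantified is Jaccard similarity. The slides do not say how overlap was measured, so this is only an illustrative sketch: the keyword sets below are hypothetical, loosely modeled on the themes named above, and are not the study's data.

```python
from itertools import combinations

# Hypothetical keyword sets (illustrative only, not the study's actual terms).
keywords = {
    "business": {"decision-making", "management", "competition"},
    "comp sci/engineering": {"technology", "intelligence", "algorithms"},
    "statistics": {"methodology", "regression", "classification"},
    "library/info science": {
        "decision-making",
        "management",
        "bibliometrics",
        "health informatics",
    },
}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Score every unordered pair of disciplines.
overlaps = {
    (x, y): jaccard(keywords[x], keywords[y])
    for x, y in combinations(keywords, 2)
}
top_pair = max(overlaps, key=overlaps.get)
print(top_pair, round(overlaps[top_pair], 2))
```

With real keyword lists, the same function would rank every pair of disciplines by shared vocabulary.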

The Domains of Data Mining

The Domains of Data Mining in Library/Information Sciences

Major Textbooks

Artificial Intelligence: A Modern Approach (3rd ed.), by Russell, S., and Norvig, P. (2009), Prentice Hall.

Pattern Classification (2nd ed.), by Duda, R., Hart, P., and Stork, D. (2000), Wiley-Interscience.

Machine Learning, by Mitchell, T. (1997), McGraw-Hill.

Introduction to Data Mining, by Tan, P., et al. (2006), Addison-Wesley.

Title Patterns

Little agreement on textbooks except in computer science/engineering.

In specialized subsets of the field (e.g., biometrics), few titles are available to choose from

Textbook choice depends on specific course objectives and content focus

Major Software

SAS: business and statistics

MATLAB and C++: computer science

SPSS, SQL, Excel: library/info science

Factors in software choice: student ability, database features

Data Mining Articles by Discipline (refereed journals, also proceedings and books)

[Figure: average number of articles per year (vertical axis, 0 to 2000) by period (horizontal axis "Years": 1990-1994, 1995-1999, 2000-2004, 2005-Sept 2009), with one series each for business, comp sci/engineering, statistics, and library/info sci.]

Bottom Line about Data Mining Education

Blend of theory and practice that reflects each academic discipline rather than a unified system