1 Data Mining Education Across Disciplines Dr. Alan Safer and Dr. Lesley Farmer California State...
1
Data Mining Education Across Disciplines
Dr. Alan Safer and Dr. Lesley Farmer
California State University Long Beach
2
What is Data Mining?
Data mining is essentially the process of uncovering meaningful new correlations, patterns and trends from large quantities of complex data using statistical and mathematical techniques.
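As a small illustration of what "uncovering correlations" can mean in practice, the sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson` and the toy figures are assumptions made for this example; they do not come from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: the second series is exactly twice the first.
purchases = [1, 2, 3, 4, 5]
refunds = [2, 4, 6, 8, 10]
r = pearson(purchases, refunds)
```

A coefficient near 1 or -1 flags a strong linear relationship worth a closer look; real data mining work applies measures like this across many variables and far larger data sets.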
3
Looking for Differences Among Disciplines
Keywords
Publications
Courses offered
Textbooks used
Software used
4
Methodology
Identification by professional association of accredited programs
Content analysis of graduate program websites
Literature review of discipline-specific database aggregators (“data min*”)
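The truncated search term in the last step can be approximated as a pattern match. The sketch below is an assumption about how such a wildcard behaves, not the syntax of any particular database aggregator; the name `matches_query` and the sample titles are hypothetical.

```python
import re

# "data min*" read as a truncation search: the word "data", whitespace,
# then any word beginning with "min" (mining, miner, mine, ...).
PATTERN = re.compile(r"\bdata\s+min\w*", re.IGNORECASE)


def matches_query(text):
    """Return True if the text contains a phrase matched by the truncated query."""
    return PATTERN.search(text) is not None


# Hypothetical titles, used only to show which ones the query would catch.
titles = [
    "Data Mining in Health Informatics",
    "Text and data miners in libraries",
    "Database administration basics",
    "Mining engineering data",
]
hits = [t for t in titles if matches_query(t)]
```

Truncation casts a wide net: the same pattern would also match phrases such as "data minimum", so results from a search like this usually need manual screening.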
5
Data Mining Education
Business: identifying credit card fraud, insider trading patterns, defect analyses
Sciences: Medicare fraud, astronomical variations, and disease risk
Statistics: fuzzy logic, theory and applications
Library/Information Sciences: research and best practices (e.g., health, business)
6
Course Offerings
Discipline                      Departments with data mining courses    Percent of all departments in discipline
Business                        83                                      17.60%
Computer Science/Engineering    187                                     48.80%
Statistics                      46                                      28.00%
Library/Information Science     15                                      30.00%
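The percentages relate each count to the number of departments surveyed in that discipline. The sketch below works that arithmetic backwards to estimate the totals; these totals are derived here and are not reported on the slide, so treat them as approximations subject to rounding in the published percentages. The name `implied_total` is hypothetical.

```python
# Counts and percentages as reported in the Course Offerings table.
reported = {
    "Business": (83, 17.60),
    "Computer Science/Engineering": (187, 48.80),
    "Statistics": (46, 28.00),
    "Library/Information Science": (15, 30.00),
}


def implied_total(count, percent):
    """Estimate total departments from: count = (percent / 100) * total."""
    return round(count / (percent / 100))


# Derived estimates only; the slide does not state these totals.
totals = {name: implied_total(count, pct) for name, (count, pct) in reported.items()}

for name, total in totals.items():
    count, pct = reported[name]
    print(f"{name}: {count} of about {total} departments ({pct:.2f}%)")
```

Read this way, computer science/engineering leads in both the number and the share of departments offering data mining courses, while business has the second-highest count but the lowest share.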
7
Comparative Key Words
Business: decision-making, management, and competition
Computer science/engineering: technology-related and intelligence-related terms
Statistics: methodological terms
Library/info science: greatest variation; most terms associated with applications (e.g., bibliometrics, health informatics, and information management)
Greatest overlap existed between business and library/information science due to decision-making methodology and management issues.
8
The Domains of Data Mining
9
The Domains of Data Mining In Library/Information Sciences
10
Major Textbooks
Artificial Intelligence: A Modern Approach (3rd ed.), by Russell, S., and Norvig, P. (2009), Prentice Hall.
Pattern Classification (2nd ed.), by Duda, R., Hart, P., and Stork, D. (2000), Wiley-Interscience.
Machine Learning, by Mitchell, T. (1997), McGraw-Hill.
Introduction to Data Mining, by Tan, P., et al. (2006), Addison Wesley.
11
Title Patterns
Little agreement on textbooks except in computer science/engineering.
In specialized subsets of the field (e.g., biometrics), few titles are available to choose from
Textbook choice depends on specific course objectives and content focus
12
Major Software
SAS: business and statistics
Matlab and C++: computer science
SPSS, SQL, Excel: library/info science
Factors: student ability, database features
13
Data Mining Articles by Discipline (refereed journals, also proceedings and books)
[Chart: average number of articles per year (axis 0 to 2000) by period (1990-1994, 1995-1999, 2000-2004, 2005-Sept 2009) for business, comp sci/engineering, statistics, and library/info sci]
14
Bottom Line about Data Mining Education
Blend of theory and practice that reflects each academic discipline rather than a unified system