VARDHAMAN COLLEGE OF ENGINEERING (Autonomous)
Shamshabad, Hyderabad – 501 218
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Academic year 2014-2015, VII Semester
COURSE DESCRIPTION
Course Code : ACS11T20
Course Title : DATA WAREHOUSING AND DATA MINING
Course Structure :
Lectures | Tutorials | Practicals | Credits
3 | 1 | - | 4
Course Coordinator : Prof L V Narasimha Prasad, Professor and Head
Team of Instructors : Mr H Venkateswara Reddy and Mr R Madana Mohan
I. Course Overview:
The course addresses the concepts, skills, methodologies, and models of data warehousing. The course addresses proper techniques for designing data warehouses for various business domains, and covers concepts for potential uses of the data warehouse and other data repositories in mining opportunities. Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions.
II. Prerequisite(s):
Level | Credits | Periods / Week | Prerequisites
UG | 4 | 4 | Understand basic DWDM theory and operational concepts
III. Marks Distribution:
Sessional Marks: There shall be two midterm examinations, each consisting of a subjective test. The subjective test carries 20 marks and has a duration of 2 hours. Each subjective test shall contain five compulsory one-mark questions in Part A and five questions in Part B, of which the student has to answer three, each carrying 5 marks. The first midterm examination shall be conducted on the first two and a half units of the syllabus, and the second midterm examination on the remaining portion. Five marks are earmarked for assignments. There shall be two assignments in every theory course, and marks shall be awarded on the average of the two assignments in each course.
University End Exam Marks: 75
Total Marks: 100
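The marks arithmetic described above can be checked with a short Python sketch. The function and constant names are hypothetical. The text states that the two assignments are averaged, but it does not say how the two midterm scores are combined into the 20-mark component, so averaging the midterms here is only an assumption made for illustration.

```python
from statistics import mean

# Figures taken from the marks distribution text.
MID_PART_A = 5 * 1   # five compulsory one-mark questions
MID_PART_B = 3 * 5   # three answered questions, five marks each
MID_MAX = MID_PART_A + MID_PART_B   # 20 marks per midterm
ASSIGNMENT_MAX = 5
EXTERNAL_MAX = 75


def sessional_marks(mid1, mid2, assignment1, assignment2):
    """Return sessional marks out of MID_MAX + ASSIGNMENT_MAX.

    ASSUMPTION: the two midterm scores are averaged; the source does not
    state the rule. The assignment average is stated in the source.
    """
    for score in (mid1, mid2):
        if not 0 <= score <= MID_MAX:
            raise ValueError("midterm score out of range")
    for score in (assignment1, assignment2):
        if not 0 <= score <= ASSIGNMENT_MAX:
            raise ValueError("assignment score out of range")
    return mean([mid1, mid2]) + mean([assignment1, assignment2])


def total_marks(sessional, external):
    """Add the university end exam score (out of 75) to the sessional marks."""
    if not 0 <= external <= EXTERNAL_MAX:
        raise ValueError("external score out of range")
    return sessional + external
```

Under this assumption the maximum sessional score is 20 + 5 = 25, which together with the 75-mark end exam gives the stated total of 100.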
IV. Evaluation Scheme:
V. Course Objectives:
I. To introduce students to the basic concepts, techniques and applications of data mining.
II. To learn how to preprocess data before applying data mining techniques.
III. To understand the mathematical foundations of data mining tools.
IV. To practise acquiring, parsing, filtering, mining, representing, refining, visualizing and interacting with data.
V. To develop a professional and ethical attitude, effective communication skills, leadership, teamwork skills, a multidisciplinary approach and an ability to relate data mining issues to a broader social context.
VI. To develop skills in programming data mining algorithms using recent data mining software to solve practical problems.
VII. To gain experience of independent study and research.
VI. Course Outcomes:
1. Create a target data set to be used for discovery.
2. Choose the data-mining task (classification, regression, clustering, etc.). Understand and apply a wide range of clustering, estimation, prediction, and classification algorithms, including k-means clustering, BIRCH clustering, Kohonen clustering, classification and regression trees, the C4.5 algorithm, logistic regression, k-nearest neighbor, multiple regression, and neural networks.
3. Understand the mathematical statistics foundations of the algorithms outlined above.
4. Understand and apply the most current data mining techniques and applications, such as Text mining, Time series mining, Spatial mining, Web mining and other current issues.
5. Be proficient with leading data mining software, including WEKA and Clementine.
6. Understand the beneficial uses of data mining and the potential threats these activities pose.
7. Demonstrate knowledge of professional and ethical responsibilities.
8. Communicate effectively in both verbal and written form.
9. Understand the impact of engineering solutions on society and be aware of contemporary issues.
10. Develop confidence for self-education and the ability for life-long learning.
11. Participate and succeed in competitive examinations such as GATE and GRE.
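As an illustration of one algorithm named in outcome 2, the following is a minimal Python sketch of k-means clustering (Lloyd's iteration). It is not taken from the course material. The function name `kmeans` is hypothetical, and seeding the centroids with the first k points is an assumption made so that the result is reproducible.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Cluster `points` (tuples of numbers) into k groups.

    ASSUMPTION: centroids are seeded with the first k points; practical
    implementations usually pick random or k-means++ seeds instead.
    Returns (centroids, assignment), where assignment[i] is the index of
    the centroid nearest to points[i].
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the iteration has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignment
```

For example, `kmeans([(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)], 2)` separates the three points near the origin from the three points near (8.5, 9).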
Evaluation Scheme (Section IV):
S.No | Component | Duration (hours) | Marks
1 | I Mid Examination | 2 | 20
2 | I Assignment | - | 05
3 | II Mid Examination | 2 | 20
4 | II Assignment | - | 05
5 | External Examination | 3 | 75
VII. How Course Outcomes are assessed:
N = None S = Supportive H = Highly Related
VIII. Syllabus:
UNIT - I
DATA WAREHOUSE AND OLAP TECHNOLOGY
Data Warehouses – definitions – multidimensional data model – data warehouse architecture – schemas.
INTRODUCTION TO DATA MINING
Definition of data mining – kinds of data – data mining functionalities – classification of data mining systems – primitives – major issues in data mining.
UNIT - II
DATA PREPROCESSING
Descriptive data summarization – data cleaning – data integration and transformation – data reduction – data discretization and concept hierarchy generation.
MINING FREQUENT PATTERNS AND ASSOCIATIONS
Basic concepts – efficient and scalable frequent itemset mining methods – association rule mining.
How Course Outcomes are assessed (Section VII):
Outcome | Level | Proficiency assessed by
a. An ability to apply knowledge of computing, mathematical foundations, algorithmic principles, and computer science and engineering theory in the modeling and design of computer-based systems to real-world problems. | H | --
b. An ability to design and conduct experiments, as well as to analyze and interpret data. | H | --
c. An ability to design, implement, and evaluate a computer-based system, process, component, or program to meet desired needs, within realistic constraints such as economic, environmental, social, political, health and safety, manufacturability, and sustainability. | S | Assignments, Tutorials, Exams
d. An ability to function effectively on multi-disciplinary teams. | S | --
e. An ability to analyze a problem, and identify, formulate and use the appropriate computing and engineering requirements for obtaining its solution. | H | Assignments, Exams
f. An understanding of professional, ethical, legal, security and social issues and responsibilities. | N | --
g. An ability to communicate effectively, both in writing and orally. | S | --
h. The broad education necessary to analyze the local and global impact of computing and engineering solutions on individuals, organizations, and society. | H | --
i. Recognition of the need for, and an ability to engage in continuing professional development and life-long learning. | S | Exams
j. Knowledge of contemporary issues. | S | --
k. An ability to use current techniques, skills, and tools necessary for computing and engineering practice. | H | Lab, Exams
l. An ability to apply design and development principles in the construction of software and hardware systems of varying complexity. | S | --
m. An ability to recognize the importance of professional development by pursuing postgraduate studies or face competitive examinations that offer challenging and rewarding careers in computing. | N | --
UNIT - III
CLASSIFICATION
Decision tree induction – Bayesian classification – rule based classification – prediction – accuracy and error measures.
UNIT - IV
CLUSTER ANALYSIS
Cluster analysis – categories of clustering methods – partitioning methods – hierarchical methods – density based methods – grid based methods – model based clustering methods – clustering high dimensional data – outlier analysis.
UNIT - V
MINING STREAM, TIME SERIES AND SEQUENCE DATA
Mining data streams – mining time series data – mining sequence patterns in biological data.
MINING OBJECT, SPATIAL, MULTIMEDIA, TEXT AND WEB
Multidimensional analysis on complex object data types – descriptive mining on complex objects – spatial data mining – multimedia data mining – text mining – web mining.
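To make the Unit II topics of frequent itemset mining and association rule mining concrete, here is a minimal Python sketch of a level-wise (Apriori-style) search. It is an illustration only, not the textbook's implementation. The function names are hypothetical, and `min_support` is assumed to be an absolute transaction count rather than a fraction.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    ASSUMPTION: min_support is an absolute number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct item as a one-element itemset.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from support counts.

    Both the antecedent and the combined itemset must be frequent.
    """
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]
```

The prune step relies on the Apriori property: every subset of a frequent itemset must itself be frequent, so a candidate with an infrequent subset can be discarded without counting it.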
IX. List of Text Books / References / Websites / Journals / Others
Text Books:
1. Jiawei Han and Micheline Kamber (2008), Data Mining: Concepts and Techniques, 2nd edition, Elsevier.
Reference Books:
1. Margaret H Dunham (2006), Data Mining Introductory and Advanced Topics, 2nd edition, Pearson Education.
2. Amitesh Sinha (2007), Data Warehousing, Thomson Learning.
3. Xindong Wu, Vipin Kumar (2009), The Top Ten Algorithms in Data Mining, Taylor and Francis Group.
4. Max Bramer (2007), Principles of Data Mining, Springer.
X. Course Plan:
The course plan is meant as a guideline and is subject to change.
Lecture No. | Learning Objective | Topics to be covered | Reference
1-2 | To understand the database technology | Introduction to Data Mining: Definition of data mining, Evolution of database technology | T1: 1.1-1.2
3-4 | To know the steps in KDD and kinds of data mining | Steps in KDD and Kinds of Data in mining | T1: 1.2-1.3
5-6 | Able to understand the data mining functionalities | Data Mining Functionalities | T1: 1.4
7 | Able to differentiate data mining systems | Classification of Data Mining Systems | T1: 1.6
8 | To know how to enforce task primitives in data mining algorithms | Data Mining Task Primitives | T1: 1.7
9 | Able to think about issues in data mining | Major Issues in Data Mining | T1: 1.9
10 | To compare data warehouse with other data repositories | Data Warehouse and OLAP Technology: What is a Data Warehouse | T1: 3.1
11-12 | To construct multidimensional data model | A Multidimensional Data Model and schemas | T1: 3.2
13-14 | To draw data warehouse architecture | Data Warehouse Architecture | T1: 3.3
15 | To formulate descriptive data summarization | Data Preprocessing: Descriptive Data Summarization | T1: 2.1-2.2
16-17 | To understand data cleaning methods | Data Cleaning | T1: 2.3
18 | To know problems in data integration | Data Integration | T1: 2.4
19-20 | Able to apply various transformations in data transformation | Data Transformation | T1: 2.4
21-22 | To understand data reduction techniques | Data Reduction | T1: 2.5
23-24 | To know Data Discretization and Concept Hierarchy Generation | Data Discretization and Concept Hierarchy Generation | T1: 2.6
25 | Able to know frequent patterns | Mining Frequent Patterns and Associations: Basic Concepts | T1: 5.1
26-30 | To understand and design algorithms to find frequent item sets | Efficient and Scalable Frequent Itemset Mining Methods | T1: 5.2
31-34 | Able to mine interesting association rules | Mining Various Kinds of Association Rules | T1: 5.3
35-36 | Able to classify data items based on their similarity | Classification and Prediction: Issues regarding classification and prediction, Bayesian classification | T1: 6.2 & 6.4
37 | To draw decision tree for classification | Classification by decision tree induction | T1: 6.3
38-39 | Apply rules for classification | Rule based classification | T1: 6.5
40 | Able to predict object behavior | Prediction | T1: 6.11
41-42 | To derive formulas for accuracy and error measures | Accuracy and Error Measures | T1: 6.12
43 | Able to group similar objects | Cluster Analysis: cluster analysis and types of data in cluster analysis | T1: 7.1-7.2
44 | To know all the clustering algorithms | A Categorization of Major Clustering Methods | T1: 7.3
45-46 | To compare partitioning methods with hierarchical methods | Partitioning Methods, Hierarchical Methods | T1: 7.4-7.5
47-48 | To compare density based methods with grid based methods | Density based Methods and Grid based methods | T1: 7.6-7.7
49 | To know model based clustering methods | Model based clustering methods | T1: 7.8
50-51 | To know high dimensional data clustering and outlier analysis | Clustering high dimensional data and Outlier analysis | T1: 7.9 & 7.11
52-53 | To understand time series and sequence data | Mining Stream, Time-Series, and Sequence Data: Mining Data Streams | T1: 8.1
54-56 | To know in detail about the time series data | Mining Time-Series Data | T1: 8.2
57-59 | To apply mining sequence patterns in biological data | Mining Sequence Patterns in Biological Data | T1: 8.4
59-61 | To analyze spatial, text and web data as multidimensional data | Mining Object, Spatial, Multimedia, Text and Web data: Multidimensional analysis on complex object data types | T1: 10.1
62 | To know descriptive mining of complex data objects | Descriptive mining of complex data objects | T1: 10.1
63-65 | To understand spatial data mining | Spatial Data Mining | T1: 10.2
66 | To understand multimedia data mining | Multimedia Data Mining | T1: 10.3
67 | To understand text data mining | Text Mining | T1: 10.4
68 | To apply mining techniques for world wide web | Mining the World Wide Web | T1: 10.5
XI. Mapping course objectives leading to the achievement of the programme outcomes:
Course Objectives
Programme Outcomes
a b c d e f g h i j k l m
I H
II H
III H
IV H
V S
VI H
VII H
N = None S = Supportive H = Highly Related
XII. Mapping course outcomes leading to the achievement of the programme outcomes:
Course Outcomes
Programme Outcomes
a b c d e f g h i j k l m
1 S
2 H
3 H
4 H
5 S
6 H
7 H
8 S
9 H
10 H
11 H
N = None S = Supportive H = Highly Related
Prepared By : Prof L V Narasimha Prasad
Date : 27 January, 2012