Professional Issues in Computing...NUCES, Islamabad Campus Data Warehousing - Fall 2012 12 NUCES,...
Data Warehousing (The Need, Importance & the Big Picture)
Naveed Iqbal, Assistant Professor
NUCES, Islamabad Campus (Lecture Slides Week # 1)
NUCES, Islamabad Campus Data Warehousing - Fall 2012 2
Why this Course?
The world is changing (in fact, it has already changed).
Either change or be left behind.
Missing opportunities, or going in the wrong direction, has prevented us from growing.
What is the right direction?
Harnessing data in the knowledge-driven economy.
Doing what cannot be automated, or is difficult to automate.
The Need of the Time
Drowning in data AND/BUT starving for information.
Knowledge is power BUT Intelligence is absolute/super power.
The Need of the Time
Data → Information → Knowledge → Intelligence → POWER ($/£)
Evolution of Information Systems
Business Intelligence
Visualization
Data Warehousing – the big picture
[Figure: multi-tier data warehouse architecture]
Data sources (Tier 0): operational databases, semistructured sources, archived data, www data
Extract, Transform, Load (ETL)
Data Warehouse Server (Tier 1): data warehouse, data marts, metadata
OLAP Servers (Tier 2): MOLAP, ROLAP
Clients (Tier 3): query/reporting, analysis and data mining tools
Users: IT users, business users
Approach of the Course
Develop an understanding of the underlying RDBMS concepts.
Apply these concepts to VLDB / DSS environments and understand where and why they break down.
Expose the differences between an RDBMS and a Data Warehouse in the context of VLDB.
Provide the basics of DSS tools such as OLAP and Data Mining, and demonstrate their applications.
Demonstrate the application of DSS concepts and the limitations of OLTP concepts through lab exercises.
Summary of the Course
Introduction & Background
Extract-Transform-Load (ETL)
Normalization & De-Normalization
Dimensional Modeling
Online Analytical Processing (OLAP)
Data Quality Management (DQM)
Need for Speed (Parallelism, Join and Indexing Techniques)
DWH Implementation Steps
Complete Implementation Case Study
Lab and Tool Usage
…
Books
Reference Books
Golfarelli & Rizzi, Data Warehouse Design – Modern Principles and Methodologies, McGraw-Hill
W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY
R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY
A. Abdullah, “Data Warehousing for Beginners: Concepts & Issues”
Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY
. . .
Course Execution Plan
Lecturing / Discussions
Lab Work + Tutorials
Assignments / Case Studies
Projects
Marks Breakup:
Mid-I: 12% Quizzes: 6%
Mid-II: 13% Assignments/Case Study: 9%
Final*: 40% Projects*: 20%
* Mandatory (Missing means F)
Code of Conduct
Regularity: attendance criteria as per university policy
Punctuality: no entry later than 5 minutes after class start time (N/A for habitual latecomers)
Discipline: ABSOLUTELY NO COMPROMISE
Positive Attitude
High Level of Class Participation
No Plagiarism, Cheating …
No Change in Deadlines
No Usage of Mobile / Other Devices
…
Scenario 1
ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.
Scenario 1: ABC Pvt Ltd.
Branches: Karachi, Quetta, Peshawar, Lahore
Sales Manager: sales per item type per branch for the first quarter.
Solution 1: ABC Pvt Ltd.
Extract sales information from each database.
Store the information in a common repository at a single site.
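The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each branch system is stood in for by an in-memory SQLite database, and the table layout (`sales` with `sale_date`, `item_type`, `amount`), the function names and the sample rows are all hypothetical, not taken from the slides.

```python
import sqlite3

def make_branch_db(rows):
    """Stand-in for one branch's operational system (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (sale_date TEXT, item_type TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return db

def consolidate(branch_dbs):
    """Copy every branch's sales rows into one common repository, tagged by branch."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales (branch TEXT, sale_date TEXT, item_type TEXT, amount REAL)"
    )
    for branch, db in branch_dbs.items():
        rows = db.execute("SELECT sale_date, item_type, amount FROM sales").fetchall()
        warehouse.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            [(branch, *row) for row in rows],
        )
    warehouse.commit()
    return warehouse

def quarterly_report(warehouse, year, quarter):
    """Sales per item type per branch for one quarter."""
    first_month = 3 * (quarter - 1) + 1
    start = f"{year}-{first_month:02d}-01"
    # ISO dates compare correctly as strings; the end bound is exclusive.
    end = f"{year + 1}-01-01" if quarter == 4 else f"{year}-{first_month + 3:02d}-01"
    return warehouse.execute(
        "SELECT branch, item_type, SUM(amount) FROM sales "
        "WHERE sale_date >= ? AND sale_date < ? "
        "GROUP BY branch, item_type ORDER BY branch, item_type",
        (start, end),
    ).fetchall()

# Made-up sample rows, for illustration only.
branch_dbs = {
    "Karachi": make_branch_db([("2012-01-15", "Bread", 100.0), ("2012-02-10", "Bread", 50.0)]),
    "Quetta": make_branch_db([("2012-03-01", "Cheese", 80.0)]),
    "Peshawar": make_branch_db([("2012-04-02", "Bread", 70.0)]),  # falls in the second quarter
    "Lahore": make_branch_db([("2012-01-20", "Cheese", 40.0)]),
}
warehouse = consolidate(branch_dbs)
report = quarterly_report(warehouse, 2012, 1)
for row in report:
    print(row)
```

Once the rows sit in one place with a branch column, the Sales Manager's report is a single grouped query instead of four separate extracts.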
Solution 1: ABC Pvt Ltd.
[Figure: Karachi, Quetta, Peshawar, Lahore → Data Warehouse → Query & Analysis tools → Report → Sales Manager]
Scenario 2
One Stop Shopping Super Market has a huge operational database. Whenever executives want a report, the OLTP system becomes slow and the data entry operators have to wait for some time.
Scenario 2: One Stop Shopping
[Figure: two data entry operators wait on the operational database while it produces a report for management.]
Solution 2
Extract the data needed for analysis from the operational database.
Store it in the warehouse.
Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.
The warehouse will contain data with a historical perspective.
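One common way to implement the periodic refresh described above is an incremental load driven by a high-water mark. The following is a minimal sketch, assuming hypothetical tables `transactions` (operational) and `sales_history` (warehouse) and a transaction id that only ever increases; the slides do not prescribe a particular refresh technique.

```python
import sqlite3

def refresh(operational, warehouse):
    """Copy rows added since the last refresh; return how many were loaded."""
    # High-water mark: the largest transaction id already in the warehouse.
    (last_id,) = warehouse.execute(
        "SELECT COALESCE(MAX(txn_id), 0) FROM sales_history"
    ).fetchone()
    new_rows = operational.execute(
        "SELECT txn_id, sale_date, amount FROM transactions "
        "WHERE txn_id > ? ORDER BY txn_id",
        (last_id,),
    ).fetchall()
    warehouse.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", new_rows)
    warehouse.commit()
    return len(new_rows)

# Hypothetical operational (OLTP) database and warehouse, both in memory.
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)"
)
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_history (txn_id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)"
)

# Day 1: two transactions are entered, then the scheduled refresh runs.
operational.executemany(
    "INSERT INTO transactions (sale_date, amount) VALUES (?, ?)",
    [("2012-09-01", 120.0), ("2012-09-01", 35.5)],
)
first_load = refresh(operational, warehouse)

# Day 2: one more transaction arrives and the OLTP system purges an old row.
operational.execute(
    "INSERT INTO transactions (sale_date, amount) VALUES (?, ?)", ("2012-09-02", 60.0)
)
operational.execute("DELETE FROM transactions WHERE txn_id = 1")
second_load = refresh(operational, warehouse)

# Reports now run against the warehouse, which still holds the purged row.
(history_rows,) = warehouse.execute("SELECT COUNT(*) FROM sales_history").fetchone()
(report_total,) = warehouse.execute("SELECT SUM(amount) FROM sales_history").fetchone()
print(first_load, second_load, history_rows, report_total)
```

Each refresh reads only the new rows, so the load on the operational database stays small, and management queries never compete with data entry.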
Solution 2
[Figure: data entry operators send transactions to the operational database; data is extracted into the data warehouse, which produces the report for the manager.]
Scenario 3
Cakes & Cookies is a small, new company. The President of the company wants the company to grow. He needs information so that he can make correct decisions.
Solution 3
Improve the quality of the data before loading it into the warehouse.
Perform data cleaning and transformation before loading the data.
Use query and analysis tools to support ad hoc queries.
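As an illustration of what cleaning and transformation before loading can involve, here is a small Python sketch. The record layout, the product-name mapping, the two accepted date formats and the rejection rules are all assumptions made up for this example; real cleaning rules depend on the source data.

```python
from datetime import datetime

# Hypothetical mapping of spelling variants to one canonical product name.
CANONICAL_PRODUCTS = {"cookie": "Cookies", "cookies": "Cookies",
                      "cake": "Cakes", "cakes": "Cakes"}
# Date formats assumed to occur in the source data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(text):
    """Return the date as ISO text, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            pass
    return None

def parse_units(text):
    """Return the unit count as an int, or None if it is not a whole number."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None

def clean(raw_rows):
    """Split raw rows into (cleaned, rejected); repeated order ids are dropped."""
    cleaned, rejected, seen_ids = [], [], set()
    for row in raw_rows:
        if row["order_id"] in seen_ids:
            continue  # duplicate of an order that was already accepted
        product = CANONICAL_PRODUCTS.get(row["product"].strip().lower())
        sale_date = parse_date(row["date"])
        units = parse_units(row["units"])
        if product is None or sale_date is None or units is None or units < 0:
            rejected.append(row)  # kept aside for manual inspection
            continue
        seen_ids.add(row["order_id"])
        cleaned.append({"order_id": row["order_id"], "product": product,
                        "date": sale_date, "units": units})
    return cleaned, rejected

# Made-up raw rows showing typical problems.
raw = [
    {"order_id": 1, "product": " cookies ", "date": "2012-09-01", "units": "12"},
    {"order_id": 1, "product": "Cookie", "date": "01/09/2012", "units": "12"},   # duplicate
    {"order_id": 2, "product": "CAKE", "date": "02/09/2012", "units": "3"},      # other date format
    {"order_id": 3, "product": "Cakes", "date": "sometime", "units": "5"},       # unreadable date
    {"order_id": 4, "product": "Cookies", "date": "2012-09-03", "units": "-2"},  # negative units
]
cleaned, rejected = clean(raw)
print(cleaned)
print([row["order_id"] for row in rejected])
```

Only the cleaned rows would be loaded into the warehouse, so that ad hoc queries see one spelling per product and one date format.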
Solution 3
[Figure: the President uses a query and analysis tool on the data warehouse to plan expansion and improvement; chart of sales over time.]
Case Study
AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.
Their products are sold in all regions of Pakistan.
They have sales units at the provincial headquarters.
The President of the company wants sales information.
Sales Information
Report: The number of units sold.
113
Report: The number of units sold over time
January  February  March  April
14       41        33     25
Sales Information
Report: The number of items sold for each product over time
Product      Jan  Feb  Mar  Apr
Wheat Bread  -    -    6    17
Cheese       6    16   6    8
Swiss Rolls  8    25   21   -
Sales Information
Report: The number of items sold in each city for each product over time
City     Product      Jan  Feb  Mar  Apr
Karachi  Wheat Bread  -    -    3    10
         Cheese       3    16   6    -
         Swiss Rolls  4    16   6    -
Lahore   Wheat Bread  -    -    3    7
         Cheese       3    -    -    8
         Swiss Rolls  4    9    15   -
Sales Information
Report: The number of items sold (U) and income (Rs) in each region for each product over time.
                      Jan        Feb        Mar        Apr
City     Product      Rs     U   Rs     U   Rs     U   Rs     U
Karachi  Wheat Bread  -      -   -      -   7.44   3   24.80  10
         Cheese       7.95   3   42.40  16  15.90  6   -      -
         Swiss Rolls  7.32   4   29.98  16  10.98  6   -      -
Lahore   Wheat Bread  -      -   -      -   7.44   3   17.36  7
         Cheese       7.95   3   -      -   -      -   21.20  8
         Swiss Rolls  7.32   4   16.47  9   27.45  15  -      -
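The four sales reports are the same facts summarised at increasing levels of detail: a grand total, then by month, then by product and month, then by city, product and month. A minimal Python sketch of that idea follows, using the unit counts from the reports. The month assigned to each figure is inferred here from the monthly totals (14, 41, 33, 25), since the slide layout did not survive extraction; `units_by` is an illustrative helper, not a function of any OLAP tool.

```python
from collections import defaultdict

# (city, product, month, units) facts, using the unit counts from the reports.
# Month placement is inferred from the monthly totals, not stated in the slides.
FACTS = [
    ("Karachi", "Wheat Bread", "Mar", 3), ("Karachi", "Wheat Bread", "Apr", 10),
    ("Karachi", "Cheese", "Jan", 3), ("Karachi", "Cheese", "Feb", 16),
    ("Karachi", "Cheese", "Mar", 6),
    ("Karachi", "Swiss Rolls", "Jan", 4), ("Karachi", "Swiss Rolls", "Feb", 16),
    ("Karachi", "Swiss Rolls", "Mar", 6),
    ("Lahore", "Wheat Bread", "Mar", 3), ("Lahore", "Wheat Bread", "Apr", 7),
    ("Lahore", "Cheese", "Jan", 3), ("Lahore", "Cheese", "Apr", 8),
    ("Lahore", "Swiss Rolls", "Jan", 4), ("Lahore", "Swiss Rolls", "Feb", 9),
    ("Lahore", "Swiss Rolls", "Mar", 15),
]
DIMENSIONS = ("city", "product", "month")

def units_by(*dims):
    """Total units grouped by the named dimensions (none means grand total)."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in FACTS:
        key = tuple(fact[p] for p in positions)
        totals[key] += fact[3]
    return dict(totals)

grand_total = units_by()[()]                    # report 1: one number
by_month = units_by("month")                    # report 2: units over time
by_product_month = units_by("product", "month") # report 3: per product over time
by_city_product_month = units_by("city", "product", "month")  # report 4

print(grand_total)
print(by_month)
print(by_product_month[("Swiss Rolls", "Feb")])
```

Adding a dimension to the grouping drills down into more detail; removing one rolls the figures back up. OLAP tools offer the same operations over much larger data.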
Data Warehousing includes
Building Data Warehouse
Online Analysis/Analytical Processing (OLAP)
Presentation
[Figure: RDBMS and flat-file sources → cleaning, selection & integration → warehouse & OLAP server → client → presentation]