DWH Fall2010 Lecture Slides Week1


  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    1/30

    Data Warehousing

    Naveed Iqbal, Assistant Professor

    FAST-NU, Islamabad (Lecture Slides Week # 1)


  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    2/30

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    3/30

    FAST-NU, Islamabad Data Warehousing - Fall 2010 3

    Instructor Profile: Naveed Iqbal

    9+ years of hands-on, versatile and multidisciplinary experience in the areas of:

    Geographic Information Systems (GIS)

    Data Warehousing and Decision Support Systems

    IT Management / IT Service Management and Project Management

    Software Development, Implementation and Infrastructure Management

    University Level Teaching

    Pioneering Career, Techno-Managerial skill-set and expertise.

    MS (CS) FAST-NU, M.Sc (CS), M.Sc (Mathematics)

    Professional Courses from LUMS, IMS, NUST, COMSATS, ORACLE UNIVERSITY and UK Certified ITIL / IT Service Management Professional

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    4/30

    Code of Conduct

    Regularity: Attendance criteria as per university policy

    Punctuality: No entry after 5 minutes from class start time (N/A for habitual latecomers)

    Discipline: ABSOLUTELY NO COMPROMISE

    Positive Attitude

    High Level of Class Participation

    No Plagiarism, Cheating

    No Change in Deadlines

    No Usage of Mobile / Other Devices

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    5/30

    Approach of the Course

    Develop an understanding of the underlying RDBMS concepts.

    Apply these concepts to VLDB / DSS environments and understand where and why they break down.

    Expose the differences between RDBMS and Data Warehouse in the context of VLDB.

    Provide the basics of DSS tools such as OLAP and Data Mining, and demonstrate their applications.

    Demonstrate the application of DSS concepts and the limitations of OLTP concepts through lab exercises.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    6/30

    Summary of the Course

    Introduction & Background

    De-Normalization

    Online Analytical Processing (OLAP)

    Dimensional Modeling

    Extract-Transform-Load (ETL)

    Data Quality Management (DQM)

    Need for Speed (Parallelism, Join and Indexing Techniques)

    DWH Implementation Steps

    Complete Implementation Case Study

    Lab and Tool Usage

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    7/30

    Books

    Reference Books

    W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY

    R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY

    A. Abdullah, Data Warehousing for Beginners: Concepts & Issues.

    Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY

    . . .

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    8/30

    Course Execution Plan

    Lecturing / Discussions

    Assignments

    Case Studies

    Projects

    Marks Breakup:

    Mid-I: 15%

    Mid-II: 15%

    Final: 35%

    Quizzes: 5%

    Assignments: 7%

    Case Study*: 8%

    Project*: 15%

    * Mandatory (Missing means F)


  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    9/30

    Data Warehousing (Introduction and Background)

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    10/30

    Why this Course?

    The world is changing (in fact, it has already changed).

    Either change or be left behind.

    Missing opportunities or going in the wrong direction has prevented us from growing.

    What is the right direction?

    Harnessing the data in the knowledge-driven economy.

    Doing what cannot be automated, or is difficult to automate.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    11/30

    Historical Overview

    1960: Master Files and Reports

    1965: Lots of Master Files

    1970: Direct Memory Access and DBMS

    1975: Online High Performance Transaction Processing

    1980: PCs and 4GL Technology (MIS/DSS)

    1985: Extract Programs, Extract Processing

    1990: The Legacy Systems Web

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    12/30

    The Need of the Time

    Drowning in data AND/BUT starving for information.

    Knowledge is power BUT Intelligence is absolute/super power.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    13/30

    The Need of the Time

    Data

    Information

    Knowledge

    Intelligence

    POWER($/)

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    14/30

    Scenario 1

    ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    15/30

    Scenario 1: ABC Pvt Ltd.

    [Diagram: the Karachi, Quetta, Peshawar and Lahore branches and the Sales Manager, who asks for sales per item type per branch for the first quarter.]

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    16/30

    Solution 1: ABC Pvt Ltd.

    Extract sales information from each database.

    Store the information in a common repository at a single site.
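    The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory SQLite databases stand in for the four branch systems, and the table layout, column names, helper functions and sample rows are all hypothetical, not taken from the slides.

```python
import sqlite3

BRANCHES = ["Karachi", "Quetta", "Peshawar", "Lahore"]

def make_branch_db(rows):
    """Create a stand-in operational database for one branch (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, item_type TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn

def consolidate(branch_dbs):
    """Extract sales from every branch and store them in one common repository."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE sales (branch TEXT, sale_date TEXT, item_type TEXT, amount REAL)"
    )
    for branch, conn in branch_dbs.items():
        rows = conn.execute("SELECT sale_date, item_type, amount FROM sales").fetchall()
        # Tag every extracted row with the branch it came from.
        repo.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)", [(branch, *row) for row in rows]
        )
    repo.commit()
    return repo

def first_quarter_report(repo, year):
    """Sales per item type per branch for the first quarter of the given year."""
    return repo.execute(
        "SELECT branch, item_type, SUM(amount) FROM sales "
        "WHERE sale_date >= ? AND sale_date < ? "
        "GROUP BY branch, item_type ORDER BY branch, item_type",
        (f"{year}-01-01", f"{year}-04-01"),
    ).fetchall()

# Made-up sample rows; the May sale falls outside the first quarter.
sample = [("2010-01-15", "dairy", 100.0), ("2010-02-10", "bread", 50.0),
          ("2010-05-01", "dairy", 70.0)]
branch_dbs = {branch: make_branch_db(sample) for branch in BRANCHES}
repo = consolidate(branch_dbs)
report = first_quarter_report(repo, 2010)
for line in report:
    print(line)
```

    Once the data sits in one place, the Sales Manager's question becomes a single query instead of four separate extracts.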

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    17/30

    Solution 1: ABC Pvt Ltd.

    [Diagram: data from the Karachi, Quetta, Peshawar and Lahore systems flows into the Data Warehouse; the Sales Manager uses query and analysis tools on it to produce the report.]

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    18/30

    Scenario 2

    One Stop Shopping Super Market has a huge operational database. Whenever executives want a report, the OLTP system becomes slow and data entry operators have to wait for some time.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    19/30

    Scenario 2: One Stop Shopping

    [Diagram: the data entry operators and management share the operational database; the operators wait while a management report runs.]

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    20/30

    Solution 2

    Extract the data needed for analysis from the operational database.

    Store it in the warehouse.

    Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.

    The warehouse will contain data with a historical perspective.
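    One common way to implement the periodic refresh is an incremental load that copies only the rows added since the previous run. The sketch below assumes a monotonically increasing transaction id as the high-water mark; the table names, columns, `refresh` function and sample rows are hypothetical, and the slides do not prescribe this particular technique.

```python
import sqlite3

def refresh(source, warehouse):
    """Copy rows added to the operational database since the last refresh."""
    # The largest id already loaded acts as the high-water mark.
    last_id = warehouse.execute(
        "SELECT COALESCE(MAX(txn_id), 0) FROM sales_history"
    ).fetchone()[0]
    new_rows = source.execute(
        "SELECT txn_id, sold_at, amount FROM transactions "
        "WHERE txn_id > ? ORDER BY txn_id",
        (last_id,),
    ).fetchall()
    warehouse.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", new_rows)
    warehouse.commit()
    return len(new_rows)

# Stand-ins for the operational database and the warehouse.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY, sold_at TEXT, amount REAL)"
)
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_history (txn_id INTEGER PRIMARY KEY, sold_at TEXT, amount REAL)"
)

source.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "2010-09-01", 120.0), (2, "2010-09-01", 75.5)],
)
first = refresh(source, warehouse)    # initial load

source.execute("INSERT INTO transactions VALUES (3, '2010-09-02', 40.0)")
second = refresh(source, warehouse)   # picks up only the new row

# The operational system may purge old rows; the warehouse keeps the history.
source.execute("DELETE FROM transactions WHERE txn_id < 3")
third = refresh(source, warehouse)
history_size = warehouse.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
print(first, second, third, history_size)
```

    Reports then run against `sales_history`, so the data entry operators are no longer blocked by analytical queries on the operational database.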

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    21/30

    Solution 2

    [Diagram: the data entry operators run transactions against the operational database; data is extracted from it into the Data Warehouse, from which the Manager gets the report.]

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    22/30

    Scenario 3

    Cakes & Cookies is a small, new company. The President of the company wants the company to grow. He needs information so that he can make correct decisions.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    23/30

    Solution 3

    Improve the quality of data before loading it into the warehouse.

    Perform data cleaning and transformation before loading the data.

    Use query and analysis tools to support ad hoc queries.
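    As a rough illustration of cleaning and transformation before loading, the sketch below normalizes text fields, converts unit counts to integers and rejects records that cannot be repaired. The field names, the cleaning rules and the sample records are assumptions made for the example; real rules depend on the source systems.

```python
def clean_record(raw):
    """Return a cleaned copy of one raw sales record, or None if it is unusable."""
    # Collapse repeated whitespace and normalize capitalization.
    product = " ".join(str(raw.get("product") or "").split()).title()
    city = str(raw.get("city") or "").strip().title()
    try:
        units = int(raw.get("units"))
    except (TypeError, ValueError):
        return None  # unit count missing or not a whole number
    if not product or not city or units < 0:
        return None
    return {"product": product, "city": city, "units": units}

def clean_batch(raw_records):
    """Split a batch into records ready for loading and rejected records."""
    cleaned, rejected = [], []
    for raw in raw_records:
        record = clean_record(raw)
        if record is None:
            rejected.append(raw)
        else:
            cleaned.append(record)
    return cleaned, rejected

# Made-up raw input with typical quality problems.
raw_records = [
    {"product": "  swiss   rolls", "city": "lahore ", "units": "4"},
    {"product": "Cheese", "city": "KARACHI", "units": "three"},
    {"product": "", "city": "Lahore", "units": "2"},
    {"product": "wheat bread", "city": "Karachi", "units": "10"},
]
cleaned, rejected = clean_batch(raw_records)
print(cleaned)
print(len(rejected), "records rejected")
```

    Keeping the rejected records, rather than silently dropping them, lets someone review and correct them at the source.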

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    24/30

    Solution 3

    [Diagram: the President uses a query and analysis tool on the Data Warehouse to view sales over time and decide on expansion and improvement.]

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    25/30

    Case Study

    AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.

    Their products are sold in all regions of Pakistan.

    They have sales units at the provincial headquarters.

    The President of the company wants sales information.

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    26/30

    Sales Information

    Report: The number of units sold.

    113

    Report: The number of units sold over time.

    January   February   March   April
    14        41         33      25

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    27/30

    Sales Information

    Report: The number of items sold for each product over time.

                   Time
    Product        Jan   Feb   Mar   Apr
    Wheat Bread    -     -     6     17
    Cheese         6     16    6     8
    Swiss Rolls    8     25    21    -

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    28/30

    Sales Information

    Report: The number of items sold in each city for each product over time.

                             Time
    City      Product        Jan   Feb   Mar   Apr
    Karachi   Wheat Bread    -     -     3     10
              Cheese         3     16    6     -
              Swiss Rolls    4     16    6     -
    Lahore    Wheat Bread    -     -     3     7
              Cheese         3     -     -     8
              Swiss Rolls    4     9     15    -

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    29/30

    Sales Information

    Report: The number of items sold and income in each region for each product over time.

    (U = units sold, Rs = income)

                             Jan          Feb          Mar          Apr
    City      Product        U    Rs      U    Rs      U    Rs      U    Rs
    Karachi   Wheat Bread    -    -       -    -       3    7.44    10   24.80
              Cheese         3    7.95    16   42.40   6    15.90   -    -
              Swiss Rolls    4    7.32    16   29.98   6    10.98   -    -
    Lahore    Wheat Bread    -    -       -    -       3    7.44    7    17.36
              Cheese         3    7.95    -    -       -    -       8    21.20
              Swiss Rolls    4    7.32    9    16.47   15   27.45   -    -
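    The four reports are the same facts viewed at different levels of detail: a grand total, then totals by month, by product and month, and by city, product and month. The sketch below shows that roll-up in Python. The unit counts are our reading of the flattened report tables in this transcript, and the `roll_up` helper and its dimension names are hypothetical, introduced only for illustration.

```python
from collections import defaultdict

# (city, product, month, units), as read from the city-level report.
SALES = [
    ("Karachi", "Wheat Bread", "Mar", 3), ("Karachi", "Wheat Bread", "Apr", 10),
    ("Karachi", "Cheese", "Jan", 3), ("Karachi", "Cheese", "Feb", 16),
    ("Karachi", "Cheese", "Mar", 6),
    ("Karachi", "Swiss Rolls", "Jan", 4), ("Karachi", "Swiss Rolls", "Feb", 16),
    ("Karachi", "Swiss Rolls", "Mar", 6),
    ("Lahore", "Wheat Bread", "Mar", 3), ("Lahore", "Wheat Bread", "Apr", 7),
    ("Lahore", "Cheese", "Jan", 3), ("Lahore", "Cheese", "Apr", 8),
    ("Lahore", "Swiss Rolls", "Jan", 4), ("Lahore", "Swiss Rolls", "Feb", 9),
    ("Lahore", "Swiss Rolls", "Mar", 15),
]

DIMENSIONS = ("city", "product", "month")

def roll_up(rows, *dims):
    """Sum units grouped by the requested dimensions (none means grand total)."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[3]
    return dict(totals)

total_units = roll_up(SALES)[()]                        # first report
by_month = roll_up(SALES, "month")                      # second report
by_product_month = roll_up(SALES, "product", "month")   # third report
by_city_product_month = roll_up(SALES, "city", "product", "month")  # fourth report

print(total_units)
print(by_month)
```

    Each added dimension refines the previous report without changing its totals, which is the idea OLAP tools build on.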

  • 8/8/2019 DWH Fall2010 Lecture Slides Week1

    30/30

    Data Warehousing includes:

    Build Data Warehouse

    Online Analysis/Analytical Processing (OLAP)

    Presentation

    [Diagram: RDBMS and flat-file sources pass through Cleaning, Selection & Integration into the Warehouse & OLAP server, which serves the Presentation client.]
