Recap of Day 1 1 Dr. Chaitali Basu Mukherji. 2 Which are our lowest/highest margin customers ? Who...
1
Recap of Day 1
Dr. Chaitali Basu Mukherji
2
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
Which customers are most likely to go to the competition?
What impact will new products/services have on revenue and margins?
What product promotions have the biggest impact on revenue?
What is the most effective distribution channel?
A producer wants to know…
Data, Data everywhere yet ...
3
• I can’t find the data I need
– data is scattered over the network
– many versions, subtle differences
• I can’t get the data I need
– need an expert to get the data
• I can’t understand the data I found
– available data poorly documented
• I can’t use the data I found
– results are unexpected
– data needs to be transformed from one form to another
What is a Data Warehouse?
4
A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use in a business context.
[Barry Devlin]
What are the users saying...
5
• Data should be integrated across the enterprise
• Summary data has a real value to the organization
• Historical data holds the key to understanding data over time
• What-if capabilities are required
What is Data Warehousing?
6
A process of transforming data into information and making it available to users in a timely enough manner to make a difference
[Forrester Research, April 1996]
Data
Information
Evolution
7
• 60’s: Batch reports
– hard to find and analyze information
– inflexible and expensive, reprogram every new request
• 70’s: Terminal-based DSS and EIS (executive information systems)
– still inflexible, not integrated with desktop tools
• 80’s: Desktop data access and analysis tools
– query tools, spreadsheets, GUIs
– easier to use, but only access operational databases
Evolution
8
• 90’s: Data warehousing with integrated OLAP engines and tools
• 91: Prism Solutions, founded by Bill Inmon, introduces Prism Warehouse Manager, software for developing a data warehouse.
• 95: The Data Warehousing Institute, a for-profit organization that promotes data warehousing, is founded.
• 2000: Daniel Linstedt releases the Data Vault, enabling real-time, auditable data warehouses.
9
Advantages of using data warehousing
1. Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
2. Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
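The first advantage, resolving inconsistencies before loading, can be illustrated with a minimal sketch. The record layout, the two source systems (`crm_rows`, `billing_rows`), and the `COUNTRY_CODES` mapping are all hypothetical and chosen only for illustration; real cleansing rules depend on the sources involved.

```python
# Hypothetical records from two operational systems describing the same customers.
crm_rows = [
    {"customer_id": "C001", "name": "Acme Corp ", "country": "USA"},
    {"customer_id": "C002", "name": "globex", "country": "United States"},
]
billing_rows = [
    {"customer_id": "c001", "name": "ACME CORP", "country": "US"},
    {"customer_id": "C003", "name": "Initech", "country": "IN"},
]

# Assumed mapping of country spellings to one canonical code.
COUNTRY_CODES = {"usa": "US", "united states": "US", "us": "US", "in": "IN"}


def cleanse(row):
    """Return a copy of the row with a consistent key, name, and country code."""
    return {
        "customer_id": row["customer_id"].strip().upper(),
        "name": row["name"].strip().title(),
        # Raises KeyError for an unmapped spelling, so unknown values surface early.
        "country": COUNTRY_CODES[row["country"].strip().lower()],
    }


def integrate(*sources):
    """Cleanse every row and keep one record per customer_id (first source wins)."""
    warehouse = {}
    for source in sources:
        for row in source:
            clean = cleanse(row)
            warehouse.setdefault(clean["customer_id"], clean)
    return warehouse


customers = integrate(crm_rows, billing_rows)
```

After this step the two spellings of customer C001 collapse into one record, so a report that counts customers or groups by country no longer has to handle the variants itself.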
10
Disadvantages of using data warehousing
1. Data warehouses are not the optimal environment for unstructured data.
2. Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
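The latency mentioned in the second point can be made concrete with a small sketch. It assumes a single nightly batch load at midnight and a hypothetical list of order events; neither the schedule nor the field names come from the slides.

```python
from datetime import datetime, timedelta

# Hypothetical operational events with the time they occurred.
events = [
    {"order_id": 1, "occurred_at": datetime(2024, 1, 1, 9, 0)},
    {"order_id": 2, "occurred_at": datetime(2024, 1, 1, 17, 30)},
    {"order_id": 3, "occurred_at": datetime(2024, 1, 2, 8, 15)},
]


def nightly_load(events, load_time):
    """Load every event that occurred before load_time, stamping the load time."""
    return [
        {**e, "loaded_at": load_time}
        for e in events
        if e["occurred_at"] < load_time
    ]


# Assumed schedule: one batch load at midnight.
load_time = datetime(2024, 1, 2, 0, 0)
warehouse_rows = nightly_load(events, load_time)

# Latency per row = time between the event and its arrival in the warehouse.
latencies = {r["order_id"]: r["loaded_at"] - r["occurred_at"] for r in warehouse_rows}

# Orders whose data was more than 12 hours old when it became queryable.
stale = [oid for oid, lag in latencies.items() if lag > timedelta(hours=12)]
```

Order 3 happened after the load ran, so it is not visible in the warehouse at all until the next batch. That gap is the latency the slide refers to.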
Data Warehouse Architecture
11
Data Warehouse Engine
Optimized Loader
Extraction, Cleansing
Analyze, Query
Metadata Repository
Relational Databases
Legacy Data
Purchased Data
ERP Systems
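One way to picture how the components listed above fit together is the following sketch. It uses Python's built-in `sqlite3` module as a stand-in for the warehouse engine. The source names, the `sales` and `load_metadata` tables, and the cleansing rules are assumptions made for illustration, not part of the original diagram.

```python
import sqlite3

# Hypothetical source extracts; in practice these would come from relational
# databases, legacy files, purchased data sets, or ERP systems.
SOURCES = {
    "erp": [("C001", "widget", 120.0), ("C002", "gadget", 75.5)],
    "legacy": [("c001", "Widget ", 30.0), ("C003", "gizmo", None)],
}


def extract_and_cleanse(rows):
    """Normalize keys and product names; drop rows with a missing amount."""
    for customer, product, amount in rows:
        if amount is None:
            continue
        yield customer.strip().upper(), product.strip().lower(), float(amount)


def build_warehouse(sources):
    """Load cleansed rows into an in-memory engine and log metadata per source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, product TEXT, amount REAL)")
    conn.execute("CREATE TABLE load_metadata (source TEXT, rows_loaded INTEGER)")
    for name, rows in sources.items():
        clean = list(extract_and_cleanse(rows))
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)
        conn.execute("INSERT INTO load_metadata VALUES (?, ?)", (name, len(clean)))
    conn.commit()
    return conn


conn = build_warehouse(SOURCES)

# Query/analyze step: total sales per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()

# The metadata table records what was loaded from each source.
metadata = conn.execute(
    "SELECT source, rows_loaded FROM load_metadata ORDER BY source"
).fetchall()
```

Here the loader is a plain `executemany` call; an optimized loader in a real warehouse would typically use bulk-load facilities instead.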
12
Data warehousing methodologies
• Bottom-up design
– Ralph Kimball, a well-known author on data warehousing, is a proponent of the bottom-up approach to data warehouse design.
– In this approach, data marts are first created to provide reporting and analytical capabilities for specific business processes.
• Top-down design
– Bill Inmon is one of the leading proponents of the top-down approach to data warehouse design.
– In this approach, the data warehouse is designed using a normalized enterprise data model. "Atomic" data, that is, data at the lowest level of detail, are stored in the data warehouse.
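The difference between atomic data in a normalized model and a data mart built for one business process can be sketched with `sqlite3`. The tables (`customer`, `product`, `order_line`, `sales_mart`) and their contents are hypothetical. The sketch follows the top-down direction, deriving the mart from the atomic data; in a bottom-up design a table like `sales_mart` would be built first, directly from the sources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Top-down: normalized enterprise model holding atomic (line-level) data.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_line (
        order_id INTEGER,
        customer_id INTEGER REFERENCES customer(customer_id),
        product_id INTEGER REFERENCES product(product_id),
        quantity INTEGER,
        unit_price REAL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO product VALUES (10, 'widget'), (20, 'gadget');
    INSERT INTO order_line VALUES
        (100, 1, 10, 2, 5.0),
        (100, 1, 20, 1, 20.0),
        (101, 2, 10, 4, 5.0);

    -- Data mart for one business process (sales reporting), derived from the
    -- atomic data as a denormalized summary table.
    CREATE TABLE sales_mart AS
    SELECT c.name AS customer, p.name AS product,
           SUM(o.quantity * o.unit_price) AS revenue
    FROM order_line o
    JOIN customer c ON c.customer_id = o.customer_id
    JOIN product p ON p.product_id = o.product_id
    GROUP BY c.name, p.name;
    """
)

mart_rows = conn.execute(
    "SELECT customer, product, revenue FROM sales_mart ORDER BY customer, product"
).fetchall()
```

The mart answers reporting questions without joins, while the normalized tables keep the line-level detail from which other marts could be derived.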
13
14
Application Areas
Industry: Application
Finance: Credit card analysis
Insurance: Claims, fraud analysis
Telecommunication: Call record analysis
Transport: Logistics management
Consumer goods: Promotion analysis
Data service providers: Value-added data
Utilities: Power usage analysis