
Recap of Day 1

Dr. Chaitali Basu Mukherji


A producer wants to know...

• Which are our lowest/highest margin customers?
• Who are my customers and what products are they buying?
• Which customers are most likely to go to the competition?
• What impact will new products/services have on revenue and margins?
• What product promotions have the biggest impact on revenue?
• What is the most effective distribution channel?
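As an illustration of the first question (lowest/highest margin customers), here is a minimal sketch in Python. The sales records, customer names, and the margin definition (revenue minus cost, divided by revenue) are assumptions made for this example and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical sales records: (customer, product, revenue, cost).
sales = [
    ("Acme", "widget", 1000.0, 700.0),
    ("Acme", "gadget", 500.0, 450.0),
    ("Bolt", "widget", 800.0, 400.0),
    ("Core", "gadget", 1200.0, 1140.0),
]


def margin_by_customer(rows):
    """Return {customer: margin}, with margin = (revenue - cost) / revenue."""
    revenue = defaultdict(float)
    cost = defaultdict(float)
    for customer, _product, rev, cst in rows:
        revenue[customer] += rev
        cost[customer] += cst
    return {c: (revenue[c] - cost[c]) / revenue[c] for c in revenue}


margins = margin_by_customer(sales)
lowest = min(margins, key=margins.get)    # customer with the smallest margin
highest = max(margins, key=margins.get)   # customer with the largest margin
print(f"lowest margin:  {lowest} ({margins[lowest]:.1%})")
print(f"highest margin: {highest} ({margins[highest]:.1%})")
```

The calculation itself is simple; the hard part in practice is having revenue and cost for every customer in one consistent place, which is the motivation for the slides that follow.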

Data, Data everywhere yet ...

• I can’t find the data I need
  – data is scattered over the network
  – many versions, subtle differences
• I can’t get the data I need
  – need an expert to get the data
• I can’t understand the data I found
  – available data poorly documented
• I can’t use the data I found
  – results are unexpected
  – data needs to be transformed from one form to other

What is a Data Warehouse?

A single, complete and consistent store of data obtained from a variety of different sources made available to end users in a way they can understand and use in a business context.

[Barry Devlin]

What are the users saying...

• Data should be integrated across the enterprise

• Summary data has a real value to the organization

• Historical data holds the key to understanding data over time

• What-if capabilities are required

What is Data Warehousing?

A process of transforming data into information and making it available to users in a timely enough manner to make a difference

[Forrester Research, April 1996]

[Figure: Data → Information]

Evolution

• 60’s: Batch reports
  – hard to find and analyze information
  – inflexible and expensive, reprogram every new request
• 70’s: Terminal-based DSS and EIS (executive information systems)
  – still inflexible, not integrated with desktop tools
• 80’s: Desktop data access and analysis tools
  – query tools, spreadsheets, GUIs
  – easier to use, but only access operational databases

Evolution

• 90’s: Data warehousing with integrated OLAP engines and tools
• 91: Prism Solutions, founded by Bill Inmon, introduces Prism Warehouse Manager, software for developing a data warehouse.
• 95: The Data Warehousing Institute, a for-profit organization that promotes data warehousing, is founded.
• 2000: Daniel Linstedt releases the Data Vault, enabling real-time, auditable data warehouses.


Advantages of using data warehousing

1. Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.

2. Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
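The first advantage, resolving inconsistencies before loading, can be sketched as a small cleansing step. This is only an illustration: the field names, the country mapping, and the rule that unknown values are rejected are assumptions chosen for the example.

```python
# Hypothetical mapping from source-specific spellings to one canonical code.
CANONICAL_COUNTRY = {
    "india": "IN", "in": "IN", "ind": "IN",
    "usa": "US", "u.s.": "US", "us": "US",
}


def clean_record(record):
    """Return a copy with a normalized name and a canonical country code.

    Raises ValueError for an unknown country so the row is caught
    before it reaches the warehouse.
    """
    key = record["country"].strip().lower()
    if key not in CANONICAL_COUNTRY:
        raise ValueError(f"unrecognized country: {record['country']!r}")
    name = " ".join(record["name"].split()).title()  # collapse spaces, fix case
    return {"name": name, "country": CANONICAL_COUNTRY[key]}


def clean_batch(records):
    """Split records into (clean, rejected) lists."""
    clean, rejected = [], []
    for record in records:
        try:
            clean.append(clean_record(record))
        except ValueError:
            rejected.append(record)
    return clean, rejected


source_rows = [
    {"name": "  asha  RAO", "country": "India "},
    {"name": "John Smith", "country": "U.S."},
    {"name": "X", "country": "Atlantis"},
]
clean, rejected = clean_batch(source_rows)
print(clean)
print(rejected)
```

Because every loaded row already uses one spelling per country, a report that groups by country needs no special cases, which is the simplification the slide refers to.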


Disadvantages of using data warehousing

1. Data warehouses are not the optimal environment for unstructured data.

2. Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.

Data Warehouse Architecture

[Architecture diagram. Sources: Relational Databases, Legacy Data, Purchased Data, ERP Systems → Extraction, Cleansing → Optimized Loader → Data Warehouse Engine → Analyze, Query. A Metadata Repository accompanies the Data Warehouse Engine.]
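The flow in the architecture diagram (sources, extraction and cleansing, loader, warehouse engine, query) can be approximated with a minimal sketch. SQLite from the Python standard library stands in for the warehouse engine; the source layouts, table names, and the tiny metadata table are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical extracts from two source systems with different layouts.
erp_rows = [("C001", "2024-01-05", "120.50"), ("C002", "2024-01-06", "80.00")]
legacy_rows = [{"cust": "c003", "day": "07/01/2024", "amt": "45"}]


def extract_and_cleanse():
    """Yield (customer_id, iso_date, amount) tuples in one consistent format."""
    for cust, day, amt in erp_rows:
        yield cust.upper(), day, float(amt)
    for row in legacy_rows:
        dd, mm, yyyy = row["day"].split("/")  # legacy dates assumed DD/MM/YYYY
        yield row["cust"].upper(), f"{yyyy}-{mm}-{dd}", float(row["amt"])


def load(conn, rows):
    """Load cleansed rows and record where the table's data came from."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id TEXT, sale_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # A very small stand-in for the metadata repository.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata (table_name TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?)",
        [("sales", "erp"), ("sales", "legacy")],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database as the "engine"
load(conn, extract_and_cleanse())

# Query step: analysis runs against the warehouse, not the source systems.
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```

Keeping extraction and cleansing separate from loading mirrors the boxes in the diagram; a real loader would work in bulk and on a schedule rather than row by row in memory.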


Data warehousing methodologies

• Bottom-up design
  – Ralph Kimball, a well-known author on data warehousing, is a proponent of the bottom-up approach to data warehouse design.
  – In this approach
    • data marts are first created to provide reporting and analytical capabilities for specific business processes.
• Top-down design
  – Bill Inmon is one of the leading proponents of the top-down approach to data warehouse design.
  – In this approach
    • the data warehouse is designed using a normalized enterprise data model. "Atomic" data, that is, data at the lowest level of detail, are stored in the data warehouse.
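To make the difference concrete, the sketch below follows the top-down direction: atomic rows are stored in normalized tables first, and a small data mart is then derived from them for one business process. The schema, table names, and data are hypothetical, and SQLite is used only because it ships with Python. In a bottom-up design a mart such as `sales_mart` would be built first, directly for the sales process, and integrated with other marts later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalized, atomic data: one row per customer, one row per sale.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        region      TEXT
    );
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'North'), (2, 'South');
    INSERT INTO sale VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);

    -- Data mart for sales reporting, derived from the atomic tables.
    CREATE TABLE sales_mart AS
        SELECT c.region      AS region,
               COUNT(*)      AS sale_count,
               SUM(s.amount) AS total_amount
        FROM sale s
        JOIN customer c ON c.customer_id = s.customer_id
        GROUP BY c.region;
    """
)

mart = conn.execute(
    "SELECT region, sale_count, total_amount FROM sales_mart ORDER BY region"
).fetchall()
print(mart)
```

The atomic tables keep the lowest level of detail, so other marts with different groupings can be derived later without going back to the source systems.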


Application Areas

Industry                  Application
Finance                   Credit card analysis
Insurance                 Claims, fraud analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            Promotion analysis
Data service providers    Value added data
Utilities                 Power usage analysis