Cs753 2a

Page 1: Cs753 2a

Data Warehouse/Data Mart

Components

Concepts

Characteristics

Page 2: Cs753 2a

Overview

• Operational vs Informational Systems

• Data Warehouse components

• Data Marts

Page 3: Cs753 2a

Basic Data Warehouse Architecture

Copyright © 1997, Enterprise Group, Ltd.

Source OLTP Systems

Enterprise Data Warehouse

Subset Data Marts

One Version of the Truth

Page 4: Cs753 2a

Operational vs. Informational Systems

Information Access Today

Operational Systems

Order Entry

Manf.

Page 5: Cs753 2a

Operational vs. Informational Systems

Information Access Today

Operational Systems

Informational Systems

Page 6: Cs753 2a

Operational vs. Informational Systems

• Most of the advances in end-user programming have run into difficulty in actually accessing the data held in backbone operational databases.

• Operational databases have a very, very long life. Large operational systems are converted from one technology to a more advanced one very infrequently (typically every eight to twenty years).

• Therefore, why not create separate databases whose role is to make large-scale end-user access easy while isolating the operational databases, i.e. a Data Warehouse?
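
The idea of a separate database for end-user access can be sketched as a small load job. This is only an illustration under stated assumptions: the `orders` and `daily_sales` tables, the sample rows, and the function names are hypothetical, SQLite stands in for both systems, and a full refresh is used for simplicity.

```python
import sqlite3

def make_operational_db():
    """Build a tiny stand-in for an operational order-entry DB (hypothetical data)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders "
        "(order_date TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    db.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
        ("1997-03-01", "widget", 2, 10.0),
        ("1997-03-01", "widget", 1, 5.0),
        ("1997-03-01", "gadget", 3, 4.0),
    ])
    db.commit()
    return db

def load_warehouse(operational, warehouse):
    """Copy a daily revenue summary from the operational DB into the warehouse DB."""
    # One read against the operational system; users never query it directly.
    rows = operational.execute(
        "SELECT order_date, product, SUM(quantity * unit_price) "
        "FROM orders GROUP BY order_date, product"
    ).fetchall()
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales "
        "(order_date TEXT, product TEXT, revenue REAL)"
    )
    warehouse.execute("DELETE FROM daily_sales")  # full refresh, for simplicity
    warehouse.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)
```

End-user queries would then run against `daily_sales` in the warehouse connection, so their cost does not fall on the operational system.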

Page 7: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Informational Systems

Information Delivery System

Page 8: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Informational Systems

Information Delivery System

Data Warehouse

Page 9: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Informational Systems

Information Delivery System

Data Warehouse

Page 10: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Information Delivery System

Data Warehouse

Informational Systems

Page 11: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Information Delivery System

Data Warehouse

Informational Systems

Notice that one of the big impacts of Data Warehousing is to eliminate large numbers of existing DSS systems! Y2000 will make this essential!!!

Page 12: Cs753 2a

Operational vs. Informational Systems

Operational Systems

Information Delivery System

Data Warehouse

Informational Systems

Data Marts

Page 13: Cs753 2a

Data Marts vs Data Warehouses

Figure: layered data warehouse architecture diagram with numbered callouts 1 to 11 (including 2a, 2b, 2c) and a sample "United States by Sales" map report.

Layers shown: Data Mart Layer; Presentation/Desktop Access Layer; Meta-data Repository Layer; Warehouse Management Layer; Core DW Layer; Data Staging and Quality Layer; Data Access Layer; Operational Data Layer; External Data Layer; Non-operational Data Layer; Data Feed/Data Mining/Indexing Layer; Application Messaging (Transport) Layer; Internet/Intranet Layer.

Warehouse types shown: Virtual DW, Coarse DW, Central DW, Distributed DW.

Query paths shown: direct queries, virtual queries, ad hoc queries.

Page 14: Cs753 2a

Central Data Warehouse

Figure: the layered data warehouse architecture diagram from Page 13, here showing only the Central DW among the warehouse types, with the additional labels Tracking DB and Lawson DB.

Page 15: Cs753 2a
Page 16: Cs753 2a

Virtual Data Warehouse

• A Virtual Data Warehouse approach is often chosen when there are infrequent demands for data and management wants to determine if/how users will use operational data.

• One of the weaknesses of a Virtual Data Warehouse approach is that user queries are made directly against the operational DBs.

• One way to minimize this problem is to build a “Query Monitor” to check the performance characteristics of a query before executing it.
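
As a rough illustration of the "Query Monitor" idea, the sketch below inspects a query's plan before running it and rejects the query if the plan contains a full scan. It relies on SQLite's `EXPLAIN QUERY PLAN`; the rejection rule, the function names, and the demo `orders` table are assumptions for illustration only. A real monitor would more likely use the cost estimates of the operational DBMS itself.

```python
import sqlite3

def make_demo_db():
    """Build a small stand-in operational DB (hypothetical table and rows)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    db.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("acme", 10.0), ("globex", 20.0)],
    )
    db.commit()
    return db

def query_is_cheap(conn, sql, params=()):
    """Return True if SQLite's plan for `sql` contains no full scan."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is a text description of the step;
    # it starts with "SCAN" for a full scan and "SEARCH" for an indexed lookup.
    return not any(row[-1].startswith("SCAN") for row in plan)

def monitored_query(conn, sql, params=()):
    """Run `sql` only if the monitor considers it cheap; otherwise refuse."""
    if not query_is_cheap(conn, sql, params):
        raise PermissionError("query rejected: plan includes a full scan")
    return conn.execute(sql, params).fetchall()
```

With the demo table, a lookup by `id` uses the primary key and is allowed, while a filter on the unindexed `customer` column is refused before it touches the data.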

Page 17: Cs753 2a

• A Coarse Data Warehouse is often chosen when the organization has a relatively clean/new operational system and management wants to make the operational data more easily available for just that system.

• A Central Data Warehouse is often chosen when the organization has a clear understanding of its information access needs and wants to provide “quality”, “integrated” information to its knowledge workers.

• A Distributed Data Warehouse is similar in most respects to a Central Data Warehouse, except that the data is distributed to separate mini-Data Warehouses (Data Marts) on local or specialized servers.
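
The relationship between a central warehouse and its data marts can be sketched as a subset copy. This is a minimal sketch, not a prescribed design: the `sales` table, the `region` column used to split the data, the sample rows, and the function names are all hypothetical, and in-memory SQLite databases stand in for the local or specialized servers.

```python
import sqlite3

def make_central_warehouse():
    """Build a tiny stand-in central warehouse (hypothetical data)."""
    wh = sqlite3.connect(":memory:")
    wh.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
    wh.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
        ("east", "widget", 25.0),
        ("east", "gadget", 12.0),
        ("west", "widget", 40.0),
    ])
    wh.commit()
    return wh

def build_data_mart(warehouse, region):
    """Return a new DB holding only one region's rows from the warehouse."""
    mart = sqlite3.connect(":memory:")  # stands in for a local server
    mart.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
    rows = warehouse.execute(
        "SELECT region, product, revenue FROM sales WHERE region = ?", (region,)
    ).fetchall()
    mart.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    mart.commit()
    return mart
```

Each mart keeps the warehouse's table layout, so a query written against the central `sales` table also runs unchanged against a mart.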

Page 18: Cs753 2a

Central Data Warehouse

Figure: the layered data warehouse architecture diagram from Page 13, repeated with all four warehouse types (Virtual DW, Coarse DW, Central DW, Distributed DW).

Page 19: Cs753 2a

Data Marts Only

Figure: the layered data warehouse architecture diagram from Page 13, repeated with all four warehouse types (Virtual DW, Coarse DW, Central DW, Distributed DW).

Page 20: Cs753 2a

Heterogeneity - The Reality

Oracle Financials

Custom Marketing Data Warehouse

Packaged Oracle Financial Data Warehouse

Packaged i2 Supply Chain Non-Architected Data Mart

Subset Data Marts

i2 Supply Chain

Siebel CRM

3rd Party Data

Page 21: Cs753 2a

Federated BI Architecture

Real Time ODS

Federated Financial Data Warehouse

Subset Data Marts

Common Staging Area

Oracle Financials

i2 Supply Chain

Siebel CRM

3rd Party

Federated Packaged i2 Supply Chain Data Marts

Analytical Applications

e-commerce

Real Time Data Mining and Analytics

Real Time Segmentation, Classification, Qualification, Offerings, etc.

Federated Marketing Data Warehouse

Page 22: Cs753 2a

Benefits of Data Warehouse Architecture

• Provides an organizing framework

• Gives flexibility for changes and allows simplified maintenance

• Speeds up future development by aiding understanding of the DW

• Serves as a communication tool for roles and requirements

• Coordinates data marts

Page 23: Cs753 2a

Primary Technical Challenge Axis

Figure: chart along an axis running from Easy to Hard. Labels shown:

Fast

Slow

Parallel ERP DW

Custom ERP DW

Turnkey ERP DW

Finance

Marketing

Mid-Size Co.

Large Co.

Single Source

Multi-Source

Monthly Freq

Near Real Time

Small DB

VLDB

Clean Data

Dirty Data

Page 24: Cs753 2a

Prerequisites for Success

• Pain driven

• Sponsorship at the highest levels

• Sustainable political will

• Iterative methodology

• Manageable scope

• User driven design

• Service business mindset

• Sustainability