MELJUN CORTES BAFEDM2 - Week 09, 11, Presentation Deck
7/27/2019 MELJUN CORTES BAFEDM2 - Week 09, 11, Presentation Deck
2013 IBM Corporation
IBM Global Center for Smarter Analytics
BAFEDM2: Fundamentals of Enterprise Data Management
Week 09, 11
-
Copyright IBM Corporation 2013. All rights reserved.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS FOR INFORMATIONAL PURPOSES ONLY. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
IBM, the IBM logo, ibm.com, Cognos, SPSS and iLog are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.html.
The IBM logo must not be moved, added to or altered in any way.
Other company, product, or service names may be trademarks or service marks of others.
Agenda
Module 4: Extract, Transform and Loading Process (continued)
The 34 Subsystems of ETL
Group Project
Project Development (ETL Process)
-
Readings
Mohanty, S. (2012). Data Warehousing: Design, Development and Best Practices. Tata McGraw-Hill Publishing Company, India.
Kimball, R. (2008). The Data Warehouse Lifecycle Toolkit, Second Edition. John Wiley & Sons.
-
Module 4: Extract, Transform and Loading Process (continued)
-
Overview[1]
The extract, transformation, and load (ETL) system is often estimated to consume 70 percent of the time and effort of building a DW/BI environment.
[Diagram labels: Different Sources, ETL, Data Quality, Common Metadata, Warehouse, Cubing, Cognos, Users accessing cubes]
-
Ten Major Requirements for ETL[1]
-
The 34 Subsystems of ETL[1]
Extracting (1-3).
Cleaning and Conforming (4-8).
Delivering (9-21).
Managing (22-34).
-
The 34 Subsystems of ETL: Extracting
-
The 34 Subsystems of ETL: Extracting[1]
Subsystem 1: Data Profiling
Unsorted files are profiled and then sorted.
[Diagram labels: Data, Unsorted Files (three sets), Sorted Files]
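As a rough illustration of what profiling a column can involve, the sketch below counts rows, nulls, and distinct values. The function name `profile_column` and the sample data are hypothetical and are not taken from the deck or from any particular profiling tool.

```python
from collections import Counter

def profile_column(values):
    """Return simple profiling statistics for one column of raw values."""
    # Treat None and empty strings as missing values (an assumption).
    non_null = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1),
    }

# Hypothetical sample column used only for illustration.
cities = ["Manila", "Antipolo", "", "Manila", None]
stats = profile_column(cities)
print(stats)
```

Statistics like these help decide which columns need cleansing before the extract is designed.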
-
The 34 Subsystems of ETL: Extracting[1]
Subsystem 2: Change Data Capture
The key goals for the change data capture subsystem are: [list shown as a graphic on the slide]
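One common way to isolate only the rows that changed since the last load is to compare two snapshots of the source by key and by a hash of the remaining attributes. The sketch below assumes this snapshot-comparison approach; the names `row_hash` and `capture_changes`, the `emp_id` key, and the sample rows are hypothetical.

```python
import hashlib

def row_hash(row):
    """Hash the non-key attributes so changed rows can be detected cheaply."""
    payload = "|".join(str(row[k]) for k in sorted(row) if k != "emp_id")
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def capture_changes(previous, current):
    """Compare two snapshots keyed by emp_id and classify each difference."""
    prev = {r["emp_id"]: row_hash(r) for r in previous}
    curr = {r["emp_id"]: row_hash(r) for r in current}
    return {
        "inserted": sorted(curr.keys() - prev.keys()),
        "deleted": sorted(prev.keys() - curr.keys()),
        "updated": sorted(k for k in curr.keys() & prev.keys() if curr[k] != prev[k]),
    }

# Hypothetical snapshots of the same source table on two consecutive days.
yesterday = [{"emp_id": 1111, "salary": 30000}, {"emp_id": 2222, "salary": 15000}]
today = [{"emp_id": 1111, "salary": 40000}, {"emp_id": 3333, "salary": 20000}]
changes = capture_changes(yesterday, today)
print(changes)
```

Other techniques, such as reading database logs or audit columns, serve the same purpose; this sketch does not cover them.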
-
The 34 Subsystems of ETL: Extracting[1]
Subsystem 3: Extract System
Getting data from a source.
[Diagram labels: Data, Sorted Files (three sets)]
Name (Last Name, First Name, Middle Name)
Address (Street No., Phase, Village, City)
Contact Numbers (Tel No., Cell Phone No.)
-
The 34 Subsystems of ETL: Cleaning and Conforming
-
The 34 Subsystems of ETL: Cleaning and Conforming Data[1]
Subsystem 4: Data Cleansing System
Determine the dirty data to be fixed.

Emp ID  Name               Age  Salary
1111    Juan Dela Cruz     28   30000
2222    Pedro Gil $anchez  25   15000

Emp ID  Name               Manager  Report To
1111    Juan Dela Cruz     Y        N/A
2222    Pedro Gil $anchez  N        1111

Emp ID  Name               City      Contact No
1111    Juan Dela Cruz     Antipolo  123456
2222    Pedro Gil Sanche$  Manila    654321

[Diagram labels: Data, Reject Data, Data to be fixed]
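The dirty names in the tables above (for example "$anchez") could be caught by a simple quality screen. The sketch below assumes a rule that names contain only letters, spaces, periods, hyphens, and apostrophes; that rule and the names `NAME_PATTERN` and `screen_names` are assumptions for illustration.

```python
import re

# Assumed rule: a name may contain only letters, spaces, periods,
# hyphens, and apostrophes. Anything else marks the row as dirty.
NAME_PATTERN = re.compile(r"^[A-Za-z .'\-]+$")

def screen_names(rows):
    """Split rows into clean rows and rows whose name needs fixing."""
    clean, to_fix = [], []
    for row in rows:
        target = clean if NAME_PATTERN.match(row["name"]) else to_fix
        target.append(row)
    return clean, to_fix

employees = [
    {"emp_id": 1111, "name": "Juan Dela Cruz"},
    {"emp_id": 2222, "name": "Pedro Gil $anchez"},
]
clean, to_fix = screen_names(employees)
print([r["emp_id"] for r in to_fix])
```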
-
The 34 Subsystems of ETL: Cleaning and Conforming Data[1]
Subsystem 5: Error-Event Schema
A centralized dimensional schema whose purpose is to record every error event thrown by a quality screen anywhere in the ETL pipeline.

Sample financial data / journal record (transactional data seen by the support/developer team):
REC ID:          RC01PP234
REC DATE:        31-Aug-11
BUS UNIT ID:     2
BUS UNIT:        PPXRL
SOURCE DETAILS:  FIN TRX
SRC CNT:         3
TARGET DEBIT:    -62,027.13
TARGET CREDITS:  102,078,553.52
TGT CNT:         3.00
BUS UNIT SRC:    PPXL
SRC DEBIT:       72,027.13
SRC CREDITS:     102,078,553.52

Error seen by users (error event schema used):
[FIN] - ERROR: 1130 - FIN TARGET DEBIT is not equal to SRC DEBIT
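A minimal sketch of how a quality screen might write to an error-event table is shown below. The `ErrorEvent` fields, the `debit_screen` function, and the in-memory list standing in for the table are all hypothetical; only the error code, message, and sample amounts come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal

@dataclass
class ErrorEvent:
    """One row of a (hypothetical) error-event fact table."""
    event_time: datetime
    screen: str
    error_code: int
    record_id: str
    message: str

error_events = []  # in-memory stand-in for the error-event table

def debit_screen(record):
    """Quality screen: the target debit must equal the source debit."""
    if record["target_debit"] != record["src_debit"]:
        error_events.append(ErrorEvent(
            event_time=datetime.now(timezone.utc),
            screen="FIN",
            error_code=1130,
            record_id=record["rec_id"],
            message="FIN TARGET DEBIT is not equal to SRC DEBIT",
        ))
        return False
    return True

journal = {
    "rec_id": "RC01PP234",
    "target_debit": Decimal("-62027.13"),
    "src_debit": Decimal("72027.13"),
}
passed = debit_screen(journal)
print(passed, len(error_events))
```

Keeping every screen's failures in one structure lets support teams query errors across the whole pipeline.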
-
The 34 Subsystems of ETL: Cleaning and Conforming Data[1]
Subsystem 7: De-duplication System
Responsible for determining duplicate data.

Emp ID  Name            Age  Salary
1111    Juan Dela Cruz  28   30000
2222    Pedro Gil       25   15000

Emp ID  Name            Manager  Report To
1111    Juan Dela Cruz  Y        N/A
2222    Pedro Gil       N        1111

Emp ID  Name            City      Contact No
1111    Juan Dela Cruz  Antipolo  123456
2222    Pedro Gil       Manila    654321

Merged:
Emp ID  Name            Age  Salary  Manager  Report To  City      Contact No
1111    Juan Dela Cruz  28   30000
2222    Pedro Gil       25   15000
1111    Juan Dela Cruz               Y        N/A
2222    Pedro Gil                    N        1111
1111    Juan Dela Cruz                                   Antipolo  123456
2222    Pedro Gil                                        Manila    654321

De-duplication happens when the data is merged.
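The merge above can be sketched as collapsing all rows that share an Emp ID into one record. The function name `merge_by_key` is hypothetical, and the survivorship rule used here (later tables overwrite earlier values for the same column) is an assumption rather than something the slide specifies.

```python
def merge_by_key(*tables, key="emp_id"):
    """Collapse rows that share the same key into a single merged record."""
    merged = {}
    for table in tables:
        for row in table:
            # Assumed survivorship rule: later values overwrite earlier ones.
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]

personal = [
    {"emp_id": 1111, "name": "Juan Dela Cruz", "age": 28, "salary": 30000},
    {"emp_id": 2222, "name": "Pedro Gil", "age": 25, "salary": 15000},
]
reporting = [
    {"emp_id": 1111, "name": "Juan Dela Cruz", "manager": "Y", "report_to": "N/A"},
    {"emp_id": 2222, "name": "Pedro Gil", "manager": "N", "report_to": "1111"},
]
contact = [
    {"emp_id": 1111, "name": "Juan Dela Cruz", "city": "Antipolo", "contact_no": "123456"},
    {"emp_id": 2222, "name": "Pedro Gil", "city": "Manila", "contact_no": "654321"},
]
employees = merge_by_key(personal, reporting, contact)
print(len(employees), employees[0])
```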
-
The 34 Subsystems of ETL: Cleaning and Conforming Data[1]
Subsystem 8: Conforming System
Defines keys that can be used for conforming data.

Emp ID  Name            Age  Salary  Manager  Report To  City      Contact No
1111    Juan Dela Cruz  28   30000
2222    Pedro Gil       25   15000
1111    Juan Dela Cruz               Y        N/A
2222    Pedro Gil                    N        1111
1111    Juan Dela Cruz                                   Antipolo  123456
2222    Pedro Gil                                        Manila    654321

Emp ID
1111
2222

We can use this key to conform and describe customer data.
-
The 34 Subsystems of ETL: Delivering
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 9: Slowly Changing Dimension Manager
A history-keeping dimension.

Emp ID  Name  Age  Salary  Update Flag  Expiry Date  Begin Date  End Date
1111  Juan Dela Cruz  28  30000  N  1/1/2012  1/1/2011  1/1/2012
2222  Pedro Gil  25  15000  Y  1/1/2013
1111  Juan Dela Cruz  29  40000  Y  1/1/2012  1/1/2013

Used for the slowly changing dimension.
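One history-keeping approach is the Type 2 technique: expire the current row and add a new current row when an attribute changes. The sketch below assumes that technique; the column names `current`, `begin_date`, and `end_date` and the function `apply_scd2` are hypothetical and do not map one-to-one onto the slide's flag and date columns.

```python
from datetime import date

def apply_scd2(dimension, emp_id, new_attrs, change_date):
    """Expire the current row for emp_id and append a new current row."""
    for row in dimension:
        if row["emp_id"] == emp_id and row["current"]:
            row["current"] = False
            row["end_date"] = change_date
    dimension.append({
        "emp_id": emp_id,
        **new_attrs,
        "current": True,
        "begin_date": change_date,
        "end_date": None,
    })

dim_employee = [{
    "emp_id": 1111, "name": "Juan Dela Cruz", "age": 28, "salary": 30000,
    "current": True, "begin_date": date(2011, 1, 1), "end_date": None,
}]
apply_scd2(
    dim_employee, 1111,
    {"name": "Juan Dela Cruz", "age": 29, "salary": 40000},
    date(2012, 1, 1),
)
for row in dim_employee:
    print(row)
```

Both versions of employee 1111 remain in the dimension, so facts loaded before the change still join to the older attributes.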
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 10: Surrogate Key Generator
The use of surrogate keys for all dimension tables is strongly recommended.
This implies that you need a robust mechanism for producing surrogate keys in your ETL system.

SEQ_ID  EMP ID  EMP NAME           CONTACT NO
1       1123    Rhia Trogo         1123899
2       3321    Aurea Muncal       1123456
3       1234    Apple Bulao        9800112
4       1111    Joseph Lim         4561188
5       2344    Vincent Inocentes  4561178
6       1122    Christian Cequena  9701102
7       2211    Juan Dela Cruz     9701243
8       6657    Pedro Gil Puyat    8610092
9       1125    Steve John         1870092
10      1124    Billy Joe          14501828

SEQ_ID holds the surrogate keys.
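A surrogate key generator can be as simple as a counter that remembers which natural key received which number. The class below is a minimal in-memory sketch under that assumption; `SurrogateKeyGenerator` and `key_for` are hypothetical names, and a production system would typically rely on a database sequence or identity column instead.

```python
import itertools

class SurrogateKeyGenerator:
    """Hand out sequential integer keys, one per distinct natural key."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._assigned = {}

    def key_for(self, natural_key):
        # Reuse the existing surrogate key if this natural key was seen before.
        if natural_key not in self._assigned:
            self._assigned[natural_key] = next(self._counter)
        return self._assigned[natural_key]

gen = SurrogateKeyGenerator()
# EMP ID values from the table above; 1123 is repeated on purpose.
seq_ids = [gen.key_for(emp_id) for emp_id in (1123, 3321, 1234, 1123)]
print(seq_ids)
```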
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 11: Hierarchy Manager
[Diagram: Student Data leads to Student Information, which branches into Student Name (First Name, Last Name, Middle Name), Student Contact No (Area, Number), and Student Address (Street, City, Town). Labels: Fixed Hierarchy, Ragged Hierarchy, No Hierarchy]
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 12: Special Dimensions Manager
Date/Time Dimensions
Mini Dimensions
Junk Dimensions
User Maintained Dimensions
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 13: Fact Table Builders
Transaction grain fact tables: The pure addition of the most current records is the easiest case, simply bulk loading new rows into the fact table.
Periodic snapshots: These have similar loading characteristics to those of the transaction grain fact tables. The same processing applies for inserts and updates.
Accumulating snapshots: The design and administration of the accumulating snapshot is quite different from the first two fact table types. All accumulating snapshot fact tables have a set of dates, usually four to eight, which describe the typical process workflow.
[Diagram labels: Transactional (Bulk Loading); Periodic (Insert then Update); Accumulating (Keep in History, Update then Insert)]
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 14: Surrogate Key Pipeline
Replace natural keys with the appropriate dimension surrogate keys.

Emp ID  Name  Age  Salary  Update Flag  Expiry Date  Begin Date  End Date
1111  Juan Dela Cruz  28  30000  N  1/1/2012  1/1/2011  1/1/2012
2222  Pedro Gil  25  15000  Y  1/1/2013
1111  Juan Dela Cruz  29  40000  Y  1/1/2012  1/1/2013

Emp ID  Manager ID  Name            Manager
1111    1           Juan Dela Cruz  Y
2222    2           Pedro Gil       N
02      3           Steve Gates     Y

SEQ ID  Emp ID  Reports to  Manager ID
1       1111    02          1
2       2222    1111        2

[Diagram labels: Surrogate Key, Foreign Key, Primary Key]
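The replacement step can be sketched as a lookup from natural key to surrogate key applied to each incoming fact row. The mapping below assumes that the Manager ID column in the second table acts as the surrogate key for each Emp ID; that reading, the function name `surrogate_key_pipeline`, the `-1` placeholder for unknown keys, and the third fact row are all assumptions for illustration.

```python
def surrogate_key_pipeline(fact_rows, lookup, natural_cols, unknown_key=-1):
    """Replace natural keys in the given columns with surrogate keys."""
    out = []
    for row in fact_rows:
        new_row = dict(row)
        for col in natural_cols:
            # Natural keys missing from the dimension map to a placeholder key.
            new_row[col] = lookup.get(row[col], unknown_key)
        out.append(new_row)
    return out

# Assumed mapping: natural Emp ID -> surrogate key.
emp_lookup = {"1111": 1, "2222": 2, "02": 3}

facts = [
    {"emp_id": "1111", "reports_to": "02"},
    {"emp_id": "2222", "reports_to": "1111"},
    {"emp_id": "9999", "reports_to": "1111"},  # hypothetical unknown employee
]
keyed = surrogate_key_pipeline(facts, emp_lookup, ["emp_id", "reports_to"])
print(keyed)
```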
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 15: Multi-Valued Dimension Bridge Table Builder
Metadata of different dimensions.
[Diagram labels: Fact Table, Dimension Tables, Cardinalities (M:M Relationship)]
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 16: Late Arriving Data Handler
Handles the processing of late-arriving fact or dimension data.
[Diagram: Source Data to Data Warehouse, with delayed processing/loading of 30 minutes]
Example: a log lists all the journal entries for a given sale and indicates that the journal amount is not equal to the account amount. The process is still running and the data is not yet updated.
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 17: Dimension Manager System
Prepares and publishes conformed dimensions.
[Diagram labels: Data Warehouse, Data Mart, Dimension Manager, Published Cube]

Table 3.4: Demonstration Sales Report for Used Car Dealers After the Slice in the Dimension Date
Sales quantity for Date = 1997
         Region
Product  Warsaw  Cracow  Poznan
BMW      1000    150     300
Audi     500     250     300
Ford     500     100     200
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 18: Fact Provider System
Administration of one or more fact tables.

Emp ID  Salary ID  Manager ID  Customer ID  Location ID
11111   1212       2211        1            2
22222   1213       2212        2            2
33333   1214       2211        3            3
44444   1215       2211        4            4
55555   1216       2212        5            1

[Diagram: the fact table is linked to three tables]
Holds Salary ID, Customer ID, Salary
Holds Emp ID, Manager ID, Time In, Time Out
Holds Location ID, Building, Branch, Street, City, Town, Zip, Region
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 19: Aggregate Builder
Data structures created to improve performance.
[Diagram: several Data stores are aggregated into one; aggregation is used to filter the desired output]
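An aggregate is a pre-summarized copy of fact data that answers common queries without scanning every detail row. The sketch below builds one by summing a measure per group; the function name `build_aggregate` is hypothetical, and the sample rows reuse the sales quantities from Table 3.4 in this deck.

```python
from collections import defaultdict

def build_aggregate(fact_rows, group_by, measure):
    """Sum a measure for each combination of the group-by columns."""
    totals = defaultdict(int)
    for row in fact_rows:
        totals[tuple(row[col] for col in group_by)] += row[measure]
    return dict(totals)

# Detail rows derived from the Table 3.4 sales quantities.
quantities = {
    "BMW": {"Warsaw": 1000, "Cracow": 150, "Poznan": 300},
    "Audi": {"Warsaw": 500, "Cracow": 250, "Poznan": 300},
    "Ford": {"Warsaw": 500, "Cracow": 100, "Poznan": 200},
}
sales = [
    {"product": product, "region": region, "quantity": qty}
    for product, by_region in quantities.items()
    for region, qty in by_region.items()
]

by_product = build_aggregate(sales, ["product"], "quantity")
print(by_product)
```

Queries that only need totals per product can read the three aggregate rows instead of the nine detail rows.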
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 20: OLAP Cube Builder
Enables analytic users to slice and dice data.
[Diagram labels: Cube Builder, Users]

After slicing and dicing:
Table 3.4: Demonstration Sales Report for Used Car Dealers After the Slice in the Dimension Date
Sales quantity for Date = 1997
         Region
Product  Warsaw  Cracow  Poznan
BMW      1000    150     300
Audi     500     250     300
Ford     500     100     200
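Slicing fixes one dimension to a single value, and dicing keeps only chosen members of the remaining dimensions. The sketch below models a cube as a dictionary keyed by (date, product, region); that representation, the function names, and the single 1998 cell are assumptions for illustration, while the 1997 figures come from Table 3.4.

```python
cube = {
    (1997, "BMW", "Warsaw"): 1000, (1997, "BMW", "Cracow"): 150, (1997, "BMW", "Poznan"): 300,
    (1997, "Audi", "Warsaw"): 500, (1997, "Audi", "Cracow"): 250, (1997, "Audi", "Poznan"): 300,
    (1997, "Ford", "Warsaw"): 500, (1997, "Ford", "Cracow"): 100, (1997, "Ford", "Poznan"): 200,
    (1998, "BMW", "Warsaw"): 1100,  # hypothetical cell for another year
}

def slice_cube(cube, date):
    """Fix the Date dimension to one value, leaving Product x Region."""
    return {(p, r): q for (d, p, r), q in cube.items() if d == date}

def dice_cube(cells, products, regions):
    """Keep only the selected products and regions."""
    return {(p, r): q for (p, r), q in cells.items()
            if p in products and r in regions}

sales_1997 = slice_cube(cube, 1997)
warsaw_subset = dice_cube(sales_1997, {"BMW", "Audi"}, {"Warsaw"})
print(warsaw_subset)
```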
-
The 34 Subsystems of ETL: Delivering Data for Presentation[1]
Subsystem 21: Data Propagation Manager
Responsible for integrating enterprise data from the data warehouse to be used by multiple users.
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 24: Recovery and Restart System
Provides recovery and restart of the ETL system.
[Diagram labels: Sources, ETL, Back Up, Cubes, User, Failed, Recovery Process. Note on diagram: a backup is created during the process/schedule]
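One way to make a job restartable is to record each completed step in a checkpoint file and skip those steps on the next run. The sketch below assumes this checkpoint approach; the function `run_with_checkpoint`, the step names, and the simulated failure are hypothetical.

```python
import json
import os
import tempfile

def run_with_checkpoint(steps, checkpoint_path):
    """Run steps in order, skipping any already recorded as done."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    for name, func in steps:
        if name in done:
            continue
        func()
        done.append(name)
        # Persist progress after every successful step.
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)
    return done

log = []
state = {"fail_once": True}

def extract():
    log.append("extract")

def transform():
    if state["fail_once"]:
        state["fail_once"] = False
        raise RuntimeError("simulated failure")
    log.append("transform")

def load():
    log.append("load")

steps = [("extract", extract), ("transform", transform), ("load", load)]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_with_checkpoint(steps, path)   # first run fails in transform
except RuntimeError:
    pass
completed = run_with_checkpoint(steps, path)  # restart resumes at transform
print(log)
```

On restart the extract step is not repeated, which is the behaviour a recovery and restart system aims for.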
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 27: Workflow Monitor
Provides detailed steps of how the workflow runs.
[Screenshot: sample workflow logs]
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 28: Sorting System
Used for sorting data.
[Screenshot: query used to sort date in ascending order]
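The query on the slide is not reproduced in the text. As a stand-in, the Python sketch below sorts records by date in ascending order, which is roughly what an SQL `ORDER BY ... ASC` clause would do; the record IDs and dates are hypothetical.

```python
from datetime import date
from operator import itemgetter

# Hypothetical records; only the idea of sorting by date comes from the slide.
records = [
    {"rec_id": "RC03", "rec_date": date(2011, 8, 31)},
    {"rec_id": "RC01", "rec_date": date(2010, 1, 1)},
    {"rec_id": "RC02", "rec_date": date(2011, 1, 15)},
]

# sorted() is ascending by default and leaves the input list unchanged.
sorted_records = sorted(records, key=itemgetter("rec_date"))
print([r["rec_id"] for r in sorted_records])
```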
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 30: Problem Escalation System
Responsible for reporting errors and audit logs.
[Diagram: Sources to ETL (Failed), then to Testers / Monitors / Support Team. If production fails, the failure is automatically reported to designated groups]
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 31: Parallelizing/Pipelining System
Able to run one ETL process against different sources.
[Diagram: three Sources feed one ETL. Different sources use one ETL pipeline]
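Running the same ETL routine over several sources at once can be sketched with a thread pool. The routine `run_etl`, the source names, and the sample rows below are hypothetical; real ETL tools manage parallelism in their own ways.

```python
from concurrent.futures import ThreadPoolExecutor

def run_etl(source):
    """One shared ETL routine: take raw rows, clean them, return the result."""
    name, rows = source
    cleaned = [r.strip().upper() for r in rows if r.strip()]
    return name, cleaned

# Hypothetical sources that all pass through the same routine.
sources = [
    ("sales", ["bmw ", "audi", ""]),
    ("marketing", [" ford"]),
    ("production", ["audi", "bmw"]),
]

# pool.map runs run_etl concurrently and returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(pool.map(run_etl, sources))
print(results)
```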
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 32: Security System
Ensures that business data and information sent are secured and encrypted.
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 33: Compliance Manager
Responsible for assuring that data is factual and precise when users access the cubes.
[Diagram labels: Sources, ETL, Data Bank, Error Data, Junk Data, Audits (three). Note on diagram: make sure data is correct and accurate]
-
The 34 Subsystems of ETL: Managing the ETL Environment[1]
Subsystem 34: Metadata Repository Manager

SEQ_ID  Error ID         Error Desc
1       mp_TRXS_ERR_01   Load Error: Process didn't complete
2       mp_TRXS_ERR_02   Data Error: Unknown member
3       mp_TRXS_ERR_03   Configuration Error: Limit Reach for non characters subsets

SEQ_ID  Version ID  Mapping Name   Mapping Desc                       Date
1       1           mp_TRNX_SALES  Transaction Process in Sales       1/1/2010
2       1           mp_TRNX_MKTG   Transaction Process in Marketing   1/1/2010
3       1           mp_TRNX_PROD   Transaction Process in Production  1/1/2010

SEQ_ID  Version ID  Mapping Name   Error ID
1       1           mp_TRNX_SALES  1
2       1           mp_TRNX_SALES  2
3       1           mp_TRNX_SALES  3

[Diagram: Sources to ETL, Failed: ERR_02. Error 2 is generated and reported]
-
Real Time Implication
-
Real Time Implication[2]
Business users expect the data warehouse to be continuously updated throughout the day.
[Diagram: Sources to ETL. While the ETL process is running, users can refresh cubes for recent, up-to-date data]
-
Designing and Developing ETL System[1]
IMPORTANCE OF GOOD SYSTEM DEVELOPMENT PRACTICES
ETL development may follow an iterative, interactive process, but the fundamental systems development practices still apply.
Set up a header format and comment fields for your code.
Hold structured design reviews early enough to allow changes.
Write clean, well-commented code.
Stick to the naming standards.
Use the code library and management system.
Test everything, both unit testing and system testing.
Document everything!
-
For the Next Session
-
For the Next Sessions
Agenda
Module 5: Advanced Topics
Master Data Management
Measuring the Effectiveness of a Data Warehouse
10 Signs of a Data Warehousing Project in Trouble
Ethical Dilemmas in Data Mining and Warehousing
Big Data
-
References
[1] Kimball, R. (2008). The Data Warehouse Lifecycle Toolkit, Second Edition. John Wiley & Sons.
[2] Mohanty, S. (2006). Data Warehousing: Design, Development and Best Practices. Tata McGraw-Hill Publishing Company, India.