Oracle Real-Time Data Warehousing. Team 18: Mian Li, Siyuan Song. Class topics related: Week 12: Data Warehouses; Week 13: Data Mining Concepts; Week 14: Big-Data.

Page 1:

Oracle Real-Time Data Warehousing
Team 18
Mian Li
Siyuan Song

Class topics related:

Week 12: Data Warehouses
Week 13: Data Mining Concepts
Week 14: Big-Data

Page 2:

Challenges & Approaches

Challenges:
1. Little to no data integration latency
2. Low or no impact on the data source and the data warehouse
3. Integration projects (reporting, BI tools) completed in a shorter time

Approaches:
1. Access changes immediately
2. Subscribe to changes

Page 3:

CDC Mechanism

Change Data Capture (CDC)
Capture data changes immediately

Filter Query
Performed through incremental queries filtered on a timestamp or flag

CDC through Log
Process changes by reading the database log files
Requires no changes to the source
Better for scalability

CDC through Triggers
Define procedures that execute inside the source when a change occurs
Limited scalability
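The filter-query and trigger mechanisms can be tried out side by side. The block below is a minimal sketch using Python's built-in sqlite3 module, not Oracle's implementation; the table names (orders, orders_changes), the updated_at column, and the changed_since helper are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory database standing in for the source system (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    amount REAL,
    updated_at INTEGER          -- logical timestamp used by the filter query
);
CREATE TABLE orders_changes (   -- hypothetical change table filled by triggers
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id INTEGER,
    op TEXT                     -- 'I' for insert, 'U' for update
);
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changes (order_id, op) VALUES (NEW.id, 'I');
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changes (order_id, op) VALUES (NEW.id, 'U');
END;
""")

conn.execute("INSERT INTO orders VALUES (1, 10.0, 100)")
conn.execute("INSERT INTO orders VALUES (2, 20.0, 105)")
conn.execute("UPDATE orders SET amount = 25.0, updated_at = 110 WHERE id = 2")


def changed_since(connection, last_seen):
    """Filter query: return rows whose timestamp is newer than the last extraction."""
    cursor = connection.execute(
        "SELECT id, amount FROM orders WHERE updated_at > ? ORDER BY id",
        (last_seen,),
    )
    return cursor.fetchall()


# Filter query: only the current state of rows changed after timestamp 100.
filter_result = changed_since(conn, 100)

# Trigger CDC: every individual change recorded by the source itself.
trigger_result = conn.execute(
    "SELECT order_id, op FROM orders_changes ORDER BY change_id"
).fetchall()
```

The filter query sees only the latest state of each changed row, while the triggers record every insert and update at the cost of extra work inside the source on each write, which is one way to read the slide's "limited scalability" remark.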

Page 4:

Publish-and-Subscribe Model

Three steps in the publish-and-subscribe model:
1. Subscriber (integration process) subscribes to changes in a datastore
2. CDC captures changes and publishes them
3. Subscriber can process tracked changes at any time
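The three steps above can be sketched in a few lines of Python. This is an illustrative model only, assuming an in-memory journal; the ChangeJournal class, its method names, and the subscriber name "warehouse_load" are hypothetical and do not correspond to any Oracle API.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Hypothetical in-memory journal of captured changes."""
    changes: list = field(default_factory=list)
    offsets: dict = field(default_factory=dict)

    def subscribe(self, subscriber):
        # Step 1: a subscriber registers; it will see changes published from now on.
        self.offsets[subscriber] = len(self.changes)

    def publish(self, change):
        # Step 2: the capture process publishes each change to the journal.
        self.changes.append(change)

    def consume(self, subscriber):
        # Step 3: the subscriber processes its unread changes whenever it chooses.
        start = self.offsets[subscriber]
        pending = self.changes[start:]
        self.offsets[subscriber] = len(self.changes)
        return pending


journal = ChangeJournal()
journal.subscribe("warehouse_load")
journal.publish(("orders", 1, "I"))
journal.publish(("orders", 1, "U"))

first_batch = journal.consume("warehouse_load")   # both tracked changes
second_batch = journal.consume("warehouse_load")  # nothing new since last read
```

Keeping one offset per subscriber lets several integration processes read the same journal independently, each at its own pace.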

Changes Processing
1. Pull mode: process in batches (for example, every five minutes)
2. Push mode: process immediately as each change occurs
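The difference between the two modes can be shown with a small sketch. It assumes a hypothetical ChangeFeed class written for this example: push-mode consumers register a callback that runs on every change, while pull-mode consumers call poll() on their own schedule to drain a batch.

```python
class ChangeFeed:
    """Hypothetical feed that supports both push and pull consumers."""

    def __init__(self):
        self._pending = []     # changes waiting for the next pull
        self._listeners = []   # callbacks invoked on every change (push)

    def on_change(self, callback):
        # Push mode: register a callback that runs as soon as a change arrives.
        self._listeners.append(callback)

    def publish(self, change):
        self._pending.append(change)
        for callback in self._listeners:
            callback(change)

    def poll(self):
        # Pull mode: hand over everything accumulated since the last poll.
        batch, self._pending = self._pending, []
        return batch


feed = ChangeFeed()
pushed = []
feed.on_change(pushed.append)   # push consumer reacts to each change

feed.publish("row 1 updated")
feed.publish("row 2 inserted")

pulled = feed.poll()            # pull consumer takes the whole batch at once
```

In a real pull-mode job, poll() would be called by a scheduler at a fixed interval such as the five minutes mentioned above; here it is called once by hand.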

Page 5:

Compare different approaches

Page 6:

References:

1. Oracle. Best Practices for Real-Time Data Warehousing. August 2012.

2. PCWorld. http://www.pcworld.com/article/2055920/oracle-data-integrator-12c-ready-for-realtime-analysis.html. October 2013.

3. Wikipedia. http://en.wikipedia.org/wiki/Data_warehouse.

4. Plotting Success. http://plotting-success.softwareadvice.com/beginners-guide-to-bi-software-1113011/. December 2011.