A Hybrid Row-column OLTP Database Architecture for Operational Reporting
Jan Schaffner, Anja Bog, Jens Krüger, Alexander Zeier
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Agenda
■ Operational Reporting
■ Related Work
■ Architecture of Hybrid System
■ Virtual Cube
■ Outlook and Discussion
Operational Reporting
Distinction according to Inmon:
■ Informational Reporting
□ Supports long-term, strategic decisions
□ Summarized data
□ Long-term horizons
Typically done using a data warehouse (DW)
■ Operational Reporting
□ Supports day-to-day decisions
□ Data on a more detailed level
□ Takes up-to-the-minute data into account
Done using a DW or an OLTP system?
Operational Reporting (contd.)
■ Using a DW for Operational Reporting
□ DW must be designed to the same level of granularity as the OLTP systems → huge data volumes
□ Updates must be replicated into the DW frequently → endless optimization
■ Using an OLTP Store for Operational Reporting
□ Operational reporting queries are relatively long-running in comparison to pure OLTP workloads
□ Resource contention: Locks of long-running queries block the short-running ones
□ Different data model: Not optimized for reporting (i.e., no star schema)
Common Data Warehouse Architecture
■ DW contains ETL processor which
□ ...extracts data from various OLTP sources into a staging area
□ ...applies transformations for cleansing and integration
□ ...stores data in a dimensional layout
■ OLAP engine runs queries against dimensional data store
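The extract, transform, and load steps listed above can be sketched in a few lines. This is a minimal illustration only: the source rows, the field names, and the helper functions (`extract`, `transform`, `load`, `total_by_customer`) are assumptions made for the example and are not taken from any specific DW product.

```python
# Hypothetical rows from two OLTP sources (illustrative data only).
orders_system_a = [
    {"customer": " alice ", "day": "2008-08-01", "amount": "10.50"},
    {"customer": "Bob", "day": "2008-08-01", "amount": "20.00"},
]
orders_system_b = [
    {"customer": "ALICE", "day": "2008-08-02", "amount": "5.25"},
]


def extract(sources):
    """Copy raw rows from every OLTP source into a staging area."""
    return [dict(row) for source in sources for row in source]


def transform(staging):
    """Cleanse and integrate: trim and normalize names, parse amounts."""
    return [
        {
            **row,
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        }
        for row in staging
    ]


def load(rows):
    """Store rows in a dimensional layout: a customer dimension and a fact table."""
    customer_dim = {}  # customer name -> surrogate key
    facts = []
    for row in rows:
        key = customer_dim.setdefault(row["customer"], len(customer_dim) + 1)
        facts.append({"customer_key": key, "day": row["day"], "amount": row["amount"]})
    return customer_dim, facts


def total_by_customer(facts):
    """Stand-in for an OLAP query against the dimensional store."""
    totals = {}
    for fact in facts:
        key = fact["customer_key"]
        totals[key] = totals.get(key, 0.0) + fact["amount"]
    return totals


staging = extract([orders_system_a, orders_system_b])
customer_dim, facts = load(transform(staging))
```

After the cleansing step, the two spellings of the same customer collapse into one dimension entry, which is what allows the aggregate query to group them together.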
“Real-Time” DW Architectures
■ Microbatch
□ Configure ETL process to run in very short intervals
□ Up-to-date data but very resource intensive
■ Push Architectures
□ Handling of deltas on a business or database transaction level
□ Up-to-date data but still resource intensive
■ Operational Data Store (ODS)
□ Store copy of the OLTP data using an integrated schema
□ High data granularity but no up-to-date data
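The microbatch idea can be illustrated with a small sketch: each run copies only the changes that arrived since the previous run. The class name `MicrobatchLoader`, the `seq` change counter, and the list-based change log are hypothetical and exist only for this example. Scheduling the runs at short intervals is left out.

```python
class MicrobatchLoader:
    """Toy microbatch ETL: repeatedly copy new OLTP changes into the warehouse."""

    def __init__(self):
        self.watermark = 0   # highest change sequence number already loaded
        self.warehouse = []  # stand-in for the DW tables

    def run_once(self, oltp_log):
        """Load every change newer than the watermark, then advance it."""
        delta = [change for change in oltp_log if change["seq"] > self.watermark]
        self.warehouse.extend(delta)
        if delta:
            self.watermark = max(change["seq"] for change in delta)
        return len(delta)


# Hypothetical change log of the OLTP system.
oltp_log = [{"seq": 1, "amount": 10}, {"seq": 2, "amount": 20}]
loader = MicrobatchLoader()
first_run = loader.run_once(oltp_log)   # picks up both existing changes
oltp_log.append({"seq": 3, "amount": 30})
second_run = loader.run_once(oltp_log)  # picks up only the new change
third_run = loader.run_once(oltp_log)   # nothing new to load
```

Every run scans for new changes even when none exist, which hints at why very short intervals become resource intensive.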
“Real-Time” DW Architectures (contd.)
■ ELT
□ Data is extracted from the OLTP sources and loaded into the ODS
□ Transformations are done in the warehouse at query-runtime
□ High granularity (transactional data) but no up-to-date data
■ Virtual ODS
□ Virtual in the sense that queries are redirected against OLTP system
□ High granularity (transactional data) and up-to-date data
□ Performs ETL on-the-fly
□ Affects performance of OLTP system
Column-Stores: New “Trend” for OLAP
□ Column-store databases:
◊ Vertical fragmentation
◊ Fast aggregations (sum, min, max, avg, …) → more flexibility for ad-hoc reporting
◊ Each column can be compressed individually
□ Both disk-based …
◊ Vertica
◊ Greenplum
□ … and in-memory:
◊ SAP BIA
◊ MonetDB
◊ Exasol
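Figure: row-oriented vs. column-oriented storage
Row-oriented (a single table):
c1   c2   c3
v11  v12  v13
v21  v22  v23
v31  v32  v33
Column-oriented (one table per column, each keyed by sID):
sID  c1     sID  c2     sID  c3
1    v11    1    v12    1    v13
2    v21    2    v22    2    v23
3    v31    3    v32    3    v33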
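The two layouts in the figure can be sketched as follows. The sample values and the helper names (`to_columns`, `sum_row_oriented`, `sum_column_oriented`) are assumptions for illustration. A real column store adds compression and positional indexes on top of this.

```python
# Row-oriented: each record keeps all of its attributes together.
rows = [
    {"sID": 1, "c1": 10, "c2": 7, "c3": 100},
    {"sID": 2, "c1": 20, "c2": 8, "c3": 200},
    {"sID": 3, "c1": 30, "c2": 9, "c3": 300},
]


def to_columns(rows):
    """Vertical fragmentation: one value list per attribute, aligned by position."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def sum_row_oriented(rows, name):
    """Visits every record, although only one attribute is needed."""
    return sum(row[name] for row in rows)


def sum_column_oriented(columns, name):
    """Reads only the one value list that the aggregate refers to."""
    return sum(columns[name])
```

Both functions return the same result. The difference is in how much data has to be visited, which is where the aggregation advantage of the column layout comes from.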
Encoding Schemes
■ Ordered
□ Few distinct values: sequence of triples (value, offset position, # occurrences)
□ Many distinct values: delta representation
■ Unordered
□ Few distinct values: sequence of tuples (value, bitmap for positional occurrence)
□ Many distinct values: ??
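The three representations named on this slide can be sketched as small encoders. The function names are hypothetical. The pairing assumed here is triples for ordered columns with few distinct values, bitmaps for unordered columns with few distinct values, and deltas for ordered columns with many distinct values.

```python
def rle_encode(column):
    """Triples (value, offset position, # occurrences) for runs of equal values."""
    triples = []
    for position, value in enumerate(column):
        if triples and triples[-1][0] == value:
            run_value, start, count = triples[-1]
            triples[-1] = (run_value, start, count + 1)
        else:
            triples.append((value, position, 1))
    return triples


def bitmap_encode(column):
    """Tuples (value, bitmap) where bit i is set if the value occurs at position i."""
    bitmaps = {}
    for position, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[position] = 1
    return list(bitmaps.items())


def delta_encode(column):
    """First value followed by the difference to each preceding value."""
    if not column:
        return []
    return [column[0]] + [b - a for a, b in zip(column, column[1:])]
```

For example, `rle_encode(["a", "a", "b", "b", "b"])` yields `[("a", 0, 2), ("b", 2, 3)]`, and `delta_encode([100, 101, 105, 110])` yields `[100, 1, 4, 5]`.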
Architecture of Hybrid System
■ Essentially an integration of a row-store DB and a column-store DB
■ MaxDB is used as the row store
□ Database underlying SAP Business ByDesign
□ Supports ACID transactions
■ TREX is used as the column store
□ Main memory
□ Engine underlying SAP BIA
□ Has a copy of (some of) the OLTP data
□ Primary OLTP system and main-memory database (MMDB) are governed using a single resource manager
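The coupling described above can be pictured with a toy model in which one coordinating object applies every write to both a row-oriented primary and a column-oriented copy. This is an assumption-laden sketch, not the MaxDB or TREX interface: the class `HybridStore` and its methods are hypothetical, the schema is fixed, and transaction handling (locking, rollback, recovery) is omitted.

```python
class HybridStore:
    """Toy coordinator keeping a row store and a column-store copy in step."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = {}                                        # key -> row (row-store stand-in)
        self.column_data = {name: [] for name in self.columns}  # column-store stand-in
        self.position = {}                                    # key -> index in the column lists

    def write(self, key, row):
        """Apply one OLTP write to the primary and to the columnar copy."""
        self.rows[key] = dict(row)
        if key in self.position:
            index = self.position[key]
            for name in self.columns:
                self.column_data[name][index] = row[name]
        else:
            self.position[key] = len(self.position)
            for name in self.columns:
                self.column_data[name].append(row[name])

    def lookup(self, key):
        """Point query, served from the row store."""
        return self.rows[key]

    def total(self, name):
        """Reporting-style aggregate, served from the columnar copy."""
        return sum(self.column_data[name])


store = HybridStore(["customer", "amount"])
store.write(1, {"customer": "Alice", "amount": 10.0})
store.write(2, {"customer": "Bob", "amount": 20.0})
store.write(1, {"customer": "Alice", "amount": 15.0})  # update of an existing row
```

Because both representations are updated in the same call, an aggregate over the columnar copy reflects the update to row 1 immediately.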
Architecture of Hybrid System (contd.)
Virtual Cube
■ Architecture similar to that of the virtual ODS
□ Virtual Cube provides the same interface as a typical cube (slice, dice, drill-down, …)
□ Virtual Cube rewrites queries and issues them against the MMDB (TREX in our case)
□ TREX has a copy of the OLTP data
□ Primary OLTP system and MMDB are tied together as described above
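The query rewriting can be illustrated with a small sketch: the cube object stores no aggregates itself and turns each cube operation into a filtered, grouped scan over column data. The class `VirtualCube`, its methods, and the sample columns are hypothetical. They do not reflect the TREX query interface.

```python
from collections import defaultdict


class VirtualCube:
    """Cube-style interface that answers every request from column data."""

    def __init__(self, columns, filters=None):
        self.columns = columns               # column name -> list of values
        self.filters = dict(filters or {})   # dimension -> required value

    def slice(self, dimension, value):
        """Fix one dimension to a value and return a new virtual cube."""
        return VirtualCube(self.columns, {**self.filters, dimension: value})

    def aggregate(self, group_by, measure):
        """Rewrite the request into a filtered, grouped scan over the columns."""
        totals = defaultdict(float)
        for i in range(len(self.columns[measure])):
            if all(self.columns[dim][i] == val for dim, val in self.filters.items()):
                totals[self.columns[group_by][i]] += self.columns[measure][i]
        return dict(totals)


# Hypothetical copy of OLTP data held in the column store.
sales = {
    "region": ["EU", "EU", "US", "US"],
    "product": ["A", "B", "A", "B"],
    "revenue": [10.0, 20.0, 30.0, 40.0],
}
cube = VirtualCube(sales)
by_region = cube.aggregate("region", "revenue")
product_a_by_region = cube.slice("product", "A").aggregate("region", "revenue")
```

Since nothing is precomputed, each result reflects whatever the column data contains at query time.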
Outlook
■ Build a “real” in-memory hybrid database as part of ChunkyStore
■ Data can be stored as either:
□ Rows
□ Columns
□ Chunks (adjacent fragments of rows and columns)
■ DB decides which physical storage alternative is most suitable
■ Main-memory implementation will cater for fast updates as well as fast operational reporting capabilities