
Efficient Transaction Processing in SAP HANA Database – The End of a Column Store Myth Ravi Raj Kadam(2709227)


Page 1: Efficient Transaction Processing in SAP HANA Database The ...cis.csuohio.edu/~sschung/CIS601/Efficient...Efficient Transaction Processing in SAP HANA Database –The End of a Column

Efficient Transaction Processing in SAP HANA Database – The End of a Column Store Myth

Ravi Raj Kadam(2709227)


ABSTRACT

• SAP HANA is the new data management platform from SAP (Systems, Applications and Products).

• It provides a generic but powerful system for different query scenarios, both transactional and analytical, with scalable execution.

• General architecture and design criteria, and the initial myth about the usage of column-store and row-store databases.

• The concept of record life cycle management, which uses different storage formats for different stages of a record.

• The ability of the SAP HANA database to work efficiently in both analytical and transactional workload environments.


Row VS Column Store:
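
The figure for this slide is not part of the transcript. As a rough stand-in, the sketch below lays out the same records both ways. The table, its column names (`id`, `region`, `amount`), and the two helper functions are hypothetical and chosen only for illustration; this is not how SAP HANA stores data internally.

```python
# Hypothetical sample table; names and values are illustrative only.
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 200},
]

# Row store: each record is kept together, which suits point lookups
# and whole-record inserts (typical OLTP access).
row_store = [(r["id"], r["region"], r["amount"]) for r in rows]

# Column store: each attribute is kept together, which suits scans and
# aggregations over a few columns (typical OLAP access).
column_store = {
    "id": [r["id"] for r in rows],
    "region": [r["region"] for r in rows],
    "amount": [r["amount"] for r in rows],
}


def lookup_row(store, record_id):
    """Point query: return the full record with the given id, or None."""
    for record in store:
        if record[0] == record_id:
            return record
    return None


def total_amount(store):
    """Aggregation: touches only the 'amount' column."""
    return sum(store["amount"])
```

The point lookup reads one contiguous record in the row layout, while the aggregation reads one contiguous list in the column layout and never touches `id` or `region`.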


INTRODUCTION

• Challenges of data mining in the modern business software industry.

• Classic Enterprise Resource Planning (ERP) and Online Transaction Processing (OLTP).

• SAP aims for an efficient, flexible, robust, and cost-efficient Data Management (DM) layer across different application scenarios.

• The OLTP workload of ERP systems typically requires handling thousands of concurrent users and transactions with very selective point queries.


Observation Of Current Situation

Usage Perspective:

• Users like to interact with the database directly. The main mechanisms are the application layer and scripting languages with built-in database features for specific application domains, such as R for statistical programming, Pig for working on Hadoop, and SAP FOX for financial planning scenarios.

Cost Awareness:

• There is a clear demand for a lower Total Cost of Ownership (TCO) for the complete DM stack, from hardware setup costs to operational and maintenance costs.


Performance:

• Performance is the main reason to use specialized systems.

• The challenge is to provide a flexible solution with the ability to use specialized operators or data structures whenever they are needed in the database.


Features Of SAP HANA DB

• It is a combination of hardware and software made to process massive amounts of real-time data using in-memory computing.

• It combines row-based and column-based database technology.

• Data now resides primarily in main memory (RAM) rather than on a hard disk; disk is still used for persistence.

• It’s best suited for performing real-time analytics, and developing and deploying real-time applications.

• Complex calculations on data are not carried out in the application layer, but are moved to the database.


Contribution and Outline

• HANA DB comprises a multi-engine query processing environment.

• It offers different degrees of structure, from well-structured relational data to irregularly structured data graphs to unstructured text.

• It supports both transaction-level snapshot isolation and statement-level snapshot isolation.

• It represents application-specific business objects (such as OLAP cubes) and logic (domain-specific function libraries).


Contribution and Outline

• HANA DB is optimized for efficient communication between the Data Management (DM) layer and the Application Layer (AL).

• HANA DB supports the SAP application server, including the data types used by its scripting languages. It relies on a highly optimized column-oriented data representation, which is achieved through a multi-step record life cycle management approach.

• Transaction processing uses Multi-Version Concurrency Control (MVCC) to implement transaction-level and statement-level isolation.
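
The difference between the two isolation levels can be sketched with a toy multi-version store. Everything here is an assumption made for illustration: the classes `VersionedStore` and `Transaction`, the integer commit timestamps, and the `level` parameter are hypothetical and do not reflect SAP HANA's actual MVCC implementation.

```python
class VersionedStore:
    """Toy multi-version store: every write appends a (commit_ts, value) version."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), ascending by ts
        self.clock = 0       # last commit timestamp handed out

    def commit_write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read_as_of(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


class Transaction:
    def __init__(self, store, level):
        assert level in ("transaction", "statement")
        self.store = store
        self.level = level
        self.start_ts = store.clock   # snapshot taken at transaction start

    def read(self, key):
        # Transaction level: always use the snapshot from transaction start.
        # Statement level: take a fresh snapshot for every statement.
        ts = self.start_ts if self.level == "transaction" else self.store.clock
        return self.store.read_as_of(key, ts)
```

If a concurrent writer commits a new version after both readers have started, the transaction-level reader keeps seeing the old value, while the statement-level reader sees the new one on its next read. Old versions are never overwritten, so readers do not block writers.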


Layered Architecture- SAP HANA DB

• SAP HANA supports SAP Business Warehouse (BW) to speed up queries and transactions.

• To provide this capability, data loading and transformation tools are used to create and maintain complex data flows into and out of SAP HANA.

• Business Intelligence Consumer Services (BICS), MDX, and SQL can be used on the SAP HANA appliance.

• SAP Business Suite, SAP NetWeaver Business Warehouse (NW-BW), and other third-party products provide services for this database.


Overview of HANA DB Layered Architecture


Calculation Graph Model

• The “Calculation Graph” model (calc graph for short) forms the heart of the logical query processing framework.

• The calc model defines a set of intrinsic operators, e.g. aggregation, projection, join, and union. In addition, the calc model provides operators that implement core business algorithms such as currency conversion.

Example of a SAP HANA Calc Model Graph
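
The example figure is not part of the transcript. As a loose illustration of the idea, the sketch below chains two intrinsic operators and one business operator into a small data-flow graph. The operator functions, the `CALC_GRAPH` structure, the `run_graph` helper, and the exchange rates are all hypothetical; the real calc model is far richer than a linear list of nodes.

```python
from collections import defaultdict

# Hypothetical exchange rates; a real system would read these from a table.
RATES_TO_EUR = {"EUR": 1.0, "USD": 0.5}


def projection(rows, columns):
    """Intrinsic operator: keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]


def currency_conversion(rows):
    """Business operator: convert 'amount' into EUR using RATES_TO_EUR."""
    return [
        {**r, "amount": r["amount"] * RATES_TO_EUR[r["currency"]], "currency": "EUR"}
        for r in rows
    ]


def aggregation(rows, group_by, measure):
    """Intrinsic operator: sum a measure per group."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[group_by]] += r[measure]
    return dict(totals)


# Each entry is (node name, operator, input node name, extra arguments).
CALC_GRAPH = [
    ("proj", projection, "source", {"columns": ["region", "amount", "currency"]}),
    ("conv", currency_conversion, "proj", {}),
    ("agg", aggregation, "conv", {"group_by": "region", "measure": "amount"}),
]


def run_graph(graph, source_rows):
    """Evaluate nodes in order, feeding each node the output of its input node."""
    results = {"source": source_rows}
    for name, operator, input_name, kwargs in graph:
        results[name] = operator(results[input_name], **kwargs)
    return results[graph[-1][0]]
```

Because the graph is data rather than code, an optimizer could in principle reorder or fuse nodes before execution, which is the main reason to describe a query as a graph of operators in the first place.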


Calculation Graph Model

Deploy and Compile Time:

• Dynamic SQL Nodes: A calc operator that executes an SQL statement on the incoming data flow. This can result in a form of “nested calc” models.

• Custom Nodes: These are used to implement domain-specific operations in C++ for performance reasons.

• R Nodes: These are used to forward incoming data sets to an R execution environment.

• L Nodes

Run Time:

• Relational Operators: This is the collection of relational operators that handle the relational query graph. Examples: equi-join, unified table.

• OLAP Operators: These are optimized for star-join scenarios with fact and dimension tables.

• L Runtime: The runtime for the internal language L is the building block that executes L code. Using the split and combine operator pair, the L runtime can be invoked in parallel.

• Text Operators & Graph Operators
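
The split and combine pair mentioned for the L runtime follows a familiar pattern: partition the input, run the same code on each partition concurrently, and merge the partial results. The sketch below shows that pattern only; the functions `split`, `combine`, and `run_in_parallel` are hypothetical names, a plain Python callable stands in for L code, and summation is assumed as the combine step.

```python
from concurrent.futures import ThreadPoolExecutor


def split(rows, parts):
    """Split operator: partition the input into `parts` interleaved chunks."""
    return [rows[i::parts] for i in range(parts)]


def combine(partials):
    """Combine operator: merge the partial results (here: add them up)."""
    return sum(partials)


def run_in_parallel(rows, worker, parts=4):
    """Apply `worker` to each chunk concurrently, then combine the results."""
    chunks = split(rows, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(worker, chunks))
    return combine(partials)
```

For example, `run_in_parallel(list(range(10)), lambda chunk: sum(x * x for x in chunk))` computes a sum of squares across four chunks. The pattern only gives correct results when the worker's output can be merged independently of how the input was partitioned.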


Life-Cycle Management of DB Records

• The goal is for the physical operators to deliver excellent performance for both aggregation queries and highly selective queries.

• In update-in-place-style database systems, a record stays at the same location throughout its lifetime. SAP HANA instead moves records through different stages of physical representation.

• The table generally combines row-based and column-based modules.

• L1-delta: A row-oriented module that accepts all incoming data requests and stores them in a write-optimized manner, i.e. it preserves the logical row format of the record. It is optimized for fast insert, delete, and update, and for record projection.


• L2-delta: A column-oriented module. In contrast to the L1-delta, it employs dictionary encoding to achieve better memory usage.

• Main Store: The final stage, which represents the core data format with the highest compression rate, exploiting a variety of different compression schemes.
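
The three stages can be sketched for a single column as follows. This is a minimal illustration under stated assumptions: the class `UnifiedTable` and its method names are hypothetical, the L2-delta dictionary is assumed to be unsorted and the main dictionary sorted, and the additional compression applied in the main store is left out.

```python
class UnifiedTable:
    """Toy single-column table with the three stages described above."""

    def __init__(self):
        self.l1_delta = []          # row format: raw values, append-only
        self.l2_dictionary = []     # assumed unsorted, in arrival order
        self.l2_codes = []          # value ids into l2_dictionary
        self.main_dictionary = []   # assumed sorted
        self.main_codes = []        # value ids into main_dictionary

    def insert(self, value):
        # New data always lands in the write-optimized L1-delta.
        self.l1_delta.append(value)

    def merge_l1_to_l2(self):
        # Pivot rows into dictionary-encoded form; unseen values are
        # appended to the end of the L2 dictionary.
        for value in self.l1_delta:
            if value not in self.l2_dictionary:
                self.l2_dictionary.append(value)
            self.l2_codes.append(self.l2_dictionary.index(value))
        self.l1_delta = []

    def merge_l2_to_main(self):
        # Rebuild main with a sorted dictionary covering old main plus L2.
        values = [self.main_dictionary[c] for c in self.main_codes]
        values += [self.l2_dictionary[c] for c in self.l2_codes]
        self.main_dictionary = sorted(set(values))
        position = {v: i for i, v in enumerate(self.main_dictionary)}
        self.main_codes = [position[v] for v in values]
        self.l2_dictionary, self.l2_codes = [], []

    def scan(self):
        # Readers see one table: main first, then L2-delta, then L1-delta.
        main = [self.main_dictionary[c] for c in self.main_codes]
        l2 = [self.l2_dictionary[c] for c in self.l2_codes]
        return main + l2 + self.l1_delta
```

Inserts stay cheap because they only append to `l1_delta`, while `scan` hides the staging from the reader by presenting all three stages as one table.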


Overview of the Unified table concept:

Column Store Structure:


Memory Compression


1) Prefix Encoding

2) Run Length Encoding

3) Cluster Encoding

4) Sparse Encoding

5) Indirect Encoding
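
Two of the schemes listed above are simple enough to sketch on a list of dictionary value ids. The functions below follow the textbook definitions of run-length and sparse encoding; the function names and the list-based representation are assumptions for illustration, and SAP HANA's actual encodings operate on bit-packed vectors rather than Python lists.

```python
from collections import Counter


def rle_encode(values):
    """Run-length encoding: store each run as [value, run length]."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run length] pairs back into the original list."""
    return [v for v, count in runs for _ in range(count)]


def sparse_encode(values):
    """Sparse encoding: drop the most frequent value and keep a mask of
    the positions where it occurred. Expects a non-empty list."""
    most_common = Counter(values).most_common(1)[0][0]
    mask = [v == most_common for v in values]
    rest = [v for v in values if v != most_common]
    return most_common, mask, rest


def sparse_decode(most_common, mask, rest):
    """Rebuild the original list from the mask and the remaining values."""
    remaining = iter(rest)
    return [most_common if hit else next(remaining) for hit in mask]
```

Run-length encoding pays off when equal values sit next to each other, for instance after sorting, whereas sparse encoding pays off when one value dominates the column regardless of its position.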


Persistency Mapping

• The persistency mapping has two parts: the REDO log and the SAVEPOINT data area.

REDO Log:

• This generally stores the log entries that cover the L1-delta-to-L2-delta stage. Whenever a system crash occurs, this log is consulted during recovery and the transformation continues from the next entry. The L2-delta keeps the data up to the point where it was last updated.

SAVEPOINT Data Area:

• This mainly holds the persisted data created during the L2-delta-to-main stage. It is used for recovery whenever a system failure occurs at the transaction level. It does not cover the L1-delta-to-L2-delta changes.
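
The general interplay of a log and a savepoint can be sketched as follows. This shows the generic recovery pattern (load the last savepoint, then replay the log), not SAP HANA's actual persistence layer; the class `Persistence`, the function `apply_change`, and the key/value table are hypothetical.

```python
class Persistence:
    """Toy durable state: a savepoint image plus a REDO log of later changes."""

    def __init__(self):
        self.savepoint = {}   # last persisted image of the table
        self.redo_log = []    # (key, value) changes since that savepoint

    def log_change(self, key, value):
        self.redo_log.append((key, value))

    def write_savepoint(self, table):
        # Persist the full image, then truncate the log it supersedes.
        self.savepoint = dict(table)
        self.redo_log = []

    def recover(self):
        # Restart: load the savepoint, then replay the log in order.
        table = dict(self.savepoint)
        for key, value in self.redo_log:
            table[key] = value
        return table


def apply_change(table, persistence, key, value):
    """Write-ahead: log the change first, then update the in-memory table."""
    persistence.log_change(key, value)
    table[key] = value
```

Writing a savepoint keeps the log short, so recovery time depends on the changes made since the last savepoint rather than on the full history of the table.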


Overview of the Persistency mechanisms of the unified table


Details of the L1-to-L2-Delta Merge


Characteristics of the SAP HANA database record life cycle


CONCLUSION

• Column store systems are well known to provide superb performance for OLAP-style workloads. Typically, aggregation queries that touch only a few columns of hundreds of millions of rows benefit from a column-oriented data layout. On the one hand, operational systems embed more and more statistical operations for on-the-fly business decisions into the individual business process. On the other hand, classical data-warehouse infrastructures are required to capture transaction feeds for real-time analytics. Additionally, we explained in more detail the common unified table data structure, which consists of different states but provides a common interface to the consuming query engines.


General Questions

• What if a system failure occurs?

Although the SAP HANA database is a main-memory-centric database system, its full ACID support guarantees durability as well as atomicity and recovery in case of a system restart after a regular shutdown or a system failure.

• Does this support all languages?

Yes, it supports all languages that have a common connection layer. This database is recommended for both OLAP and OLTP workloads.


Encouraging SAP Community:


Ravi Raj Kadam(2709227)