IN-MEMORY DATABASE SYSTEMS.SAP HANA DATABASE.
IN-MEMORY DATABASE SYSTEMS
FOR BIG DATA MANAGEMENT. SAP HANA DATABASE
PRESENTED BY : GEORGE JOSEPH
S7 CS ALPHA ROLL NO-39
RSET, KERALA.
AGENDA
•Revisiting Traditional RDBMS
•Defining IMDB
•A look at a few IMDB products in the market
•SAP HANA database in detail
What is a database ?
•An organised collection of information
•Allows reading and writing.
•Provides authorisation and authentication.
•Provides some level of data safety.
Traditional RDBMS
•The relational model was introduced by E. F. Codd in 1970
•The model is based on tables, rows and columns, and on the manipulation of the data stored within them.
•A relational DB is the collection of all these tables
•Examples: Oracle, MySQL and Microsoft Access
PROBLEM
•Existing disk-based systems can no longer offer timely responses because of the high access latency of hard disks
•This unacceptable performance is an obstacle to meaningful real-time services.
•E.g. real-time bidding, advertising, social gaming, stock markets.
© 2013 SAP AG. All rights reserved. 9Public
Hardware Advances: Moore’s Law - DRAM Pricing
1980: Memory $10,000/MB
2000: Memory $1/MB
2013: Memory $0.004/MB
(Chart axes: Time; Memory Cost / Speed)
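To make the price drop concrete, the three per-MB prices above can be scaled up to a full terabyte. This is only a rough sketch: the 1 TB target and the binary conversion (1 TB = 1024 * 1024 MB) are assumptions chosen for illustration and do not come from the slide.

```python
# Prices per MB are the three data points quoted on the slide.
PRICE_PER_MB = {1980: 10_000.0, 2000: 1.0, 2013: 0.004}

# Assumption for illustration: 1 TB = 1024 * 1024 MB (binary units).
MB_PER_TB = 1024 * 1024


def cost_of_one_tb(year: int) -> float:
    """Return the approximate cost in dollars of 1 TB of DRAM in the given year."""
    return PRICE_PER_MB[year] * MB_PER_TB


if __name__ == "__main__":
    for year in sorted(PRICE_PER_MB):
        print(f"{year}: ${cost_of_one_tb(year):,.2f} per TB")
```

At the 2013 price a terabyte of memory costs a few thousand dollars, which is what makes keeping a whole database in RAM practical.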
Hardware Advances: Moore’s Law - CPUs
2002: 1 core; 32 bits; 4 MB
2007: 2 cores; 2 CPUs per server; external controllers
2010: 8 cores, 16 threads per CPU; 4 CPUs per server; on-chip memory control; quick interconnect; VM and vector support; 64 bits; 256 GB - 1 TB
2013: more cores, bigger caches; 16 ... 64 CPUs per server; greater on-chip integration (PCIe, network, ...); data-direct I/O; tens of TBs
Images: Intel, Danilo Rizzuti / FreeDigitalPhotos.net
IN-MEMORY DATABASE SYSTEMS
•In an in-memory database, data resides permanently in main memory.
•Source data is loaded into system memory in a compressed, non-relational format
•Only a backup copy is kept on disk.
•Memory-optimised data structures are used
Disk VS Memory
•Access time for main memory is orders of magnitude lower than for disk.
•Main memory is normally volatile, while disk storage is not.
•The layout of data on disk is much more critical than its layout in main memory.
•SAP HANA is the market leader in IMDB systems. It is also a platform for big data processing, analysis and prediction.
•SAP HANA can help businesses build real-time applications and analytics that accelerate their processes.
In-Memory Column Database
Massively Parallel Processing
Optimized Calculation Engine
Columnar storage increases the amount of data that can be stored in limited memory (compared to disk)
Column databases enable easier parallelization of queries
Row buffer: fast transactional processing
In-memory processing gives more time for relatively slow updates to column data
In-memory allows sophisticated calculations in real-time
MPP optimized software enables linear performance scaling, making sophisticated calculations like allocations possible
Each technology works well on its own, but combining them all is the real opportunity: it provides all of the upside benefits while mitigating the downsides
SAP in-memory innovations make the “New Way” a reality
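The claim that column data is easy to parallelize can be illustrated with a small sketch: a column is just a contiguous list of values, so it can be cut into chunks, each chunk aggregated independently, and the partial results combined. The function names `chunked` and `parallel_sum` are hypothetical, and this is not how SAP HANA itself is implemented. Python threads are used only to show the structure; they do not give real CPU parallelism for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, n_chunks):
    """Split a list into at most n_chunks contiguous slices."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(column, n_workers=4):
    """Sum a column by aggregating each chunk independently, then combining."""
    chunks = chunked(column, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    sales = [1000, 900, 600, 800]  # a single column of values
    print(parallel_sum(sales, n_workers=2))
```

Because each chunk touches only its own slice of the column, the workers need no coordination until the final combine step.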
SAP HANA: Column Store
Order  Country  Product  Sales
456    France   corn     1000
457    Italy    wheat    900
458    Italy    corn     600
459    Spain    rice     800
Typical Database (row order):
456 France corn 1000 | 457 Italy wheat 900 | 458 Italy corn 600 | 459 Spain rice 800
SAP HANA (column order):
456 457 458 459 | France Italy Italy Spain | corn wheat corn rice | 1000 900 600 800
SELECT Country, SUM(Sales) FROM SalesOrders WHERE Product = 'corn' GROUP BY Country
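The query above can be traced over a column layout with a small sketch. The table contents are the four rows from the slide; the `columns` dictionary and the function `sum_sales_by_country` are hypothetical names used only for illustration, not part of any SAP HANA API. The point is that the filter reads only the Product column, and the aggregation reads only the matching positions of Country and Sales; the Order column is never touched.

```python
from collections import defaultdict

# The SalesOrders table from the slide, stored column by column.
columns = {
    "Order":   [456, 457, 458, 459],
    "Country": ["France", "Italy", "Italy", "Spain"],
    "Product": ["corn", "wheat", "corn", "rice"],
    "Sales":   [1000, 900, 600, 800],
}


def sum_sales_by_country(cols, product):
    """Evaluate SELECT Country, SUM(Sales) ... WHERE Product = product GROUP BY Country."""
    # Step 1: scan only the Product column to find the matching row positions.
    matches = [i for i, p in enumerate(cols["Product"]) if p == product]
    # Step 2: read only those positions from the Country and Sales columns.
    totals = defaultdict(int)
    for i in matches:
        totals[cols["Country"][i]] += cols["Sales"][i]
    return dict(totals)


if __name__ == "__main__":
    print(sum_sales_by_country(columns, "corn"))  # {'France': 1000, 'Italy': 600}
```

In a row layout the same query would have to read every field of every row, including the Order numbers it never uses.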
SAP HANA: Data Compression
•Efficient compression methods (dictionary, run length, cluster, prefix, etc.)
•Compression works well with columns and can speed up operations on columns (~ factor 10)
•Because of compression, changes are written into less compressed delta storage
•Delta storage needs to be merged into columns from time to time or when a certain size is exceeded
•Delta merge can be done in background
•Trade-off between compression ratio and delta merge runtime
•Updates go into delta data storage and are periodically merged into main data storage
•High write performance not affected by compression
•Data is written to delta storage with less compression, which is optimized for write access. This is merged into the main area of the column store later on.
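The main/delta split described above can be sketched as a toy column with two parts: a dictionary-encoded main store and an uncompressed delta that absorbs writes. The class `ColumnWithDelta` and its method names are hypothetical and much simpler than a real column store; in particular, the merge here is triggered by hand rather than in the background or by a size threshold.

```python
import bisect


class ColumnWithDelta:
    """Toy column: dictionary-encoded main store plus an uncompressed delta."""

    def __init__(self):
        self.dictionary = []  # sorted distinct values of the main store
        self.main_ids = []    # one dictionary position per row in main
        self.delta = []       # newly written values, stored as-is

    def insert(self, value):
        # Writes only append to the delta, so they never re-encode main.
        self.delta.append(value)

    def values(self):
        # Reads combine the decoded main store with the delta.
        return [self.dictionary[i] for i in self.main_ids] + self.delta

    def merge_delta(self):
        # Rebuild the sorted dictionary, re-encode every row, then empty the delta.
        rows = self.values()
        self.dictionary = sorted(set(rows))
        self.main_ids = [bisect.bisect_left(self.dictionary, v) for v in rows]
        self.delta = []


if __name__ == "__main__":
    col = ColumnWithDelta()
    for product in ["corn", "wheat", "corn"]:
        col.insert(product)
    col.merge_delta()   # main now holds three encoded rows
    col.insert("rice")  # a cheap append; main is untouched
    print(col.values())
```

The trade-off from the slide is visible here: `insert` is cheap, while `merge_delta` re-encodes the whole column, so merging too often costs runtime and merging too rarely leaves more data uncompressed.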
SAP HANA: Dictionary Compression
Column „Name“ (uncompressed): Jones, Miller, Millman, Zsuwalski, Baker, Miller, John, Miller, Johnson, Jones
Dictionary (sorted; value ID implicitly given by sequence in which values are stored):
Value ID  Value
0         Baker
1         John
2         Johnson
3         Jones
4         Miller
5         Millman
N         Zsuwalski
Column „Name“ (dictionary compressed): value-ID sequence, one element for each row in column; the IDs point into the dictionary
4 1 5 N 0 4 2 4 3 1
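Dictionary compression as shown on this slide can be sketched in a few lines: collect the distinct values, sort them so that a value's ID is simply its position, and replace each row by that ID. The function names are hypothetical. The sample list uses the names as they appear in the transcript; since it holds only seven distinct names, the last ID comes out as 6 where the slide writes N.

```python
def dictionary_encode(column):
    """Return (sorted dictionary, value-ID sequence) for a column of values."""
    dictionary = sorted(set(column))
    # The value ID is implicit: it is the position in the sorted dictionary.
    position = {value: i for i, value in enumerate(dictionary)}
    value_ids = [position[v] for v in column]
    return dictionary, value_ids


def dictionary_decode(dictionary, value_ids):
    """Rebuild the original column by looking each ID up in the dictionary."""
    return [dictionary[i] for i in value_ids]


if __name__ == "__main__":
    names = ["Jones", "Miller", "Millman", "Zsuwalski", "Baker",
             "Miller", "John", "Miller", "Johnson", "Jones"]
    dictionary, value_ids = dictionary_encode(names)
    print(dictionary)
    print(value_ids)
    assert dictionary_decode(dictionary, value_ids) == names
```

Each repeated name is stored once in the dictionary, and the column itself shrinks to a sequence of small integers.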
SAP HANA: Scalability
Scales from very small servers to very large clusters
Single Server
• 2 CPU 128GB to 8 CPU 1TB
Scale Out Cluster
• 2 to n servers per cluster
• Largest certified configuration: 16 servers
• Largest tested configuration: 100+ servers
• Support for high availability and disaster tolerance
Cloud Deployment
What is inside HANA?
ACID Compliant Database
- In-Memory
- Column Store
Components: Parallel Execution, Scripting Engine, Business Function Library, Unstructured (Text), Predictive Analysis Library, OLAP, XS App Server, “R” HS Integration, “R”, ESP, Spatial / Geospatial, HANA Studio
Interfaces (In / Out):
Data Services: 1. Batch Transfer 2. SAP & Non-SAP 3. Extensive Transformations 4. Structured & Unstructured 5. Hadoop Integration
SQL: 1. ODBC / JDBC 2. 3rd Party Apps 3. 3rd Party Tools
BICS: 1. BICS 2. NetWeaver BW 3. SAP BOBJ
MDX: 1. ODBO 2. MS Excel 3. 3rd Party OLAP Tools
JSON / XML: 1. HTTP 2. RESTful services 3. OData Compliant
Query Federation: 1. IQ / ASE 2. Teradata / Oracle 3. Hadoop
Replication Services: 1. Near Real Time 2. Non-SAP