NewSQL vs NoSQL for New OLTP

download NewSQL vs NoSQL for New OLTP

If you can't read please download the document

description

Historically, enterprises used traditional RDBMs for on-line transaction processing (OLTP) applications. We affectionately call these systems OldSQL, because they are largely legacy code bases, originally written decades ago. New OLTP applications have more extreme performance requirements than the Old OLTP applications of yesteryears. These are caused by web users directly submitting transactions rather than using a professional terminal operator and by mobile devices (and sensors) enabling transaction submission from many more locations. In a considerable number of modern applications (multiplayer games, risk analysis in electronic trading, gambling, social networks, etc.) OldSQL is cracking under the volume of interactions. This talk contrasts two alternatives to OldSQL in this area: NoSQL, where both SQL and ACID transactions are jettisoned for better performance NewSQL, where SQL and ACID are retained, and better performance is delivered through innovative architectures

Transcript of NewSQL vs NoSQL for New OLTP

  • 1.the NewSQL database youll never outgrowOldSQL vs. NoSQL vs. NewSQL on New OLTPMichael Stonebraker, CTOVoltDB, Inc.

2. Old OLTP Remember how we used to buy airplane tickets in the 1980s + By telephone + Through an intermediary (professional terminal operator) Commerce at the speed of the intermediary In 1985, 1,000 transactions per second was considered an incredible stretch goal!!!! + HPTS (1985)VoltDB2 3. How has OLTP Changed in 25 Years? The internet + Client is no longer a professional terminal operator + Instead Aunt Martha is using the web herself + Sends volume through the roofVoltDB3 4. How has OLTP Changed in 25 Years? PDAs and sensors + Your cell phone is a transaction originator + Everything is being geo-positioned by sensors (marathon runners, your car, .) + Sends volume through the roofVoltDB 4 5. How has OLTP Changed in 25 Years? The definitions + Online no longer exclusively means a human operator The oncoming data tsunami is often device and system-generated + Transaction now transcends the traditional business transaction High-throughput ACID write operations are a new requirement + HA and durability are now core database requirementsVoltDB5 6. Example NewOLTP Use CasesDataHigh-frequency Lower-frequency Source operations operationsWrite/index all trades, store Show consolidated riskCapital marketstick data across tradersCall initiation request Real-time authorization Fraud detection/analysisInbound HTTPVisitor logging, analysis,Traffic pattern analyticsrequestsalertingRank scores:Online game Defined intervalsLeaderboard lookupsPlayer bestsReal-time ad tradingMatch form factor,Report ad performancesystems placement criteria, bid/ask from exhaust streamMobile device location Location updates, QoS,Analytics on transactionssensor transactionsVoltDB VoltDB 6 66 7. New OLTP Challenges New OLTP and You You need to ingest the firehose in real time You need to process, validate, enrich and respond in real-time You often need real-time analyticsVoltDB VoltDB777 8. Solution Choices OldSQL + Legacy RDBMS vendors NoSQL + Give up SQL and ACID for performance NewSQL + Preserve SQL and ACID + Get performance from a new architectureVoltDB 8 9. OldSQL Traditional SQL vendors (the elephants) + Code lines dating from the 1980s + bloatware + Not very good at anything Can be beaten by at least an order of magnitude in every verticalmarket I know of + Mediocre performance on New OLTP At low velocity it doesnt matter Otherwise you get to tear your hair outVoltDB9 10. Reality Check TPC-C CPU cycles On the Shore DBMS prototype Elephants should be similarVoltDB 10 11. The Elephants Are slow because they spend all of their time on overhead!!! + Not on useful work Would have to re-architect their legacy code to do betterVoltDB11 12. To Go a Lot Faster You Have to Focus on overhead + Better B-trees affects only 4% of the path length Get rid of ALL major sources of overhead + Main memory deployment gets rid of buffer pool Leaving other 75% of overhead intact i.e. win is 25%VoltDB 12 13. NoSQL Give up SQL Give up ACIDVoltDB13 14. Give Up SQL? Compiler translates SQL at compile time into asequence of low level operations Similar to what the NoSQL products make youprogram in your application 30 years of RDBMS experience + Hard to beat the compiler + High level languages are good (data independence, less code, ) + Stored procedures are good! One round trip from app to DBMS rather than one one round tripper record Move the code to the data, not the other way aroundVoltDB 14 15. Give Up ACID? If you have multi-record transactions + Need ACID or have to program it in user-level code + Rolling your own is messy, brittle, error-prone Compensating transactions = pain + Theyre people intensive and expensive + They cause customer/user dissatisfaction + Non-commutative single-record transactions (e.g. buy an item) simply failVoltDB15 16. NoSQL Summary Appropriate for non-transactional systems Appropriate for single record transactions that are commutative Not a good fit for New OLTP Use the right tool for the job Interesting Two recently-proposed NoSQL language standards CQL and UnQL are amazingly similar to (you guessed it!) SQLVoltDB 16 17. NewSQL SQL ACID Performance and scalability through modern innovative software architectureVoltDB17 18. NewSQL Needs something other than traditional record level locking (1st big source of overhead) + timestamp order + MVCC + Your good idea goes hereVoltDB 18 19. NewSQL Needs a solution to buffer pool overhead (2nd big source of overhead) + Main memory (at least for data that is not cold) + Some other way to reduce buffer pool costVoltDB19 20. NewSQL Needs a solution to latching for shared data structures (3rd big source of overhead) + Some innovative use of B-trees + Single-threading + Your good idea goes hereVoltDB20 21. NewSQL Needs a solution to write-ahead logging (4th big source of overhead) + Obvious answer is built-in replication and failover + New OLTP views this as a requirement anyway Some details + On-line failover? + On-line failback? + LAN network partitioning? + WAN network partitioning?VoltDB 21 22. Node Speed Really MattersApplication Requirement: 1MM operations/sec at scale Operations/sec/node 10,000 Operations/sec/node 100,000 Nodes (without HA)100Nodes (without HA)10 HA Replication factor 2HA Replication factor 2 Nodes (including HA)300Nodes (including HA)30 Greater scale through reduced network traffic Reduced probability of hardware failures Lower monetary and environmental costsA high-scaling RDBMS must combine transparentdistribution, built-in durability and blazing node speedVoltDB22 23. A NewSQL Example VoltDB Main-memory storage Single threaded, run Xacts to completion + No locking + No latching Built-in HA and durability + No logVoltDB23 24. Yabut: What About Multicore? For A K-core CPU, divide memory into K (nonoverlapping) buckets i.e. convert multi-core to K single coresVoltDB 24 25. Where all the time goes revisited Before VoltDBVoltDB25 26. How Scalable is VoltDB? VoltDB is very scalable; it shouldscale to 120 partitions, 39 servers,and 1.6 million complex transactionsper second at over 300 CPU cores.SGI Announces Support and RecordBenchmarks for VoltDB Database High Performance SGI RackableBaron Schwartz Cluster Delivers Over Three MillionChief Performance Architect Transactions per SecondPercona Vanilla HardwarePumped-up HardwareVoltDB 26 27. Current VoltDB Status Runs a subset of SQL (which is getting larger) On VoltDB clusters (in memory on commodity gear) No WAN support yet + Discussions abound on exactly what to do here 45X a popular OldSQL DBMS on TPC-C 5-7X Cassandra on VoltDB K-V layer Clearly note this is an open source system!VoltDB 27 28. Summary Old OLTPNew OLTP Too slow OldSQL for New OLTP Does not scale Lacks consistency guarantees NoSQL for New OLTP Low-level interface Fast, scalable and consistent NewSQL for New OLTP Supports SQLVoltDB 28 29. the NewSQL database youll never outgrowThank You