In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc."...
Transcript of In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc."...
![Page 1: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/1.jpg)
© 2014 GridGain Systems, Inc.
YAKOV ZHDANOV Director R&D
In-‐Memory Compu7ng: From Disk-‐First Architecture to Memory-‐First
www.gridgain.com @gridgain
![Page 2: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/2.jpg)
© 2014 GridGain Systems, Inc.
GridGain: In-‐Memory Compu7ng Leader
• More than 5 years in producGon • 100s of customers & users • Starts every 10 seconds worldwide
![Page 3: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/3.jpg)
© 2014 GridGain Systems, Inc.
Agenda
• [R]EvoluGon of Data: Extreme Volume Growth • In-‐Memory CompuGng: Faster Apps on Larger Data Sets • Example System: Moving ApplicaGon “In-‐Memory” • GridGain: Overview And “Live Coding” Sessions • QA
![Page 4: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/4.jpg)
© 2014 GridGain Systems, Inc.
[R]Evolu7on of Data
2.5 exabytes* of data was generated every day in 2012 (IBM) This is 62.5 km high stack of 1Tb 3.5” HDD
* 1 exabyte = 1018 bytes
62.5 km
![Page 5: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/5.jpg)
© 2014 GridGain Systems, Inc.
Why Speeding Up Data Processing?
• 2008: Amazon found every 100ms of latency cost them 1% in sales
• Analysts demand sub-‐second, near real-‐Gme query results
• On-‐line traders want ultra-‐low latencies
![Page 6: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/6.jpg)
© 2014 GridGain Systems, Inc.
Problem
Data sets are large and complex – new tools needed.
![Page 7: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/7.jpg)
© 2014 GridGain Systems, Inc.
Solu7on: In-‐Memory Compu7ng • Data is in-‐memory – no disk IO • Mostly distributed – mulG-‐node topologies • Parallel processing • Deal with operaGonal data set • Map-‐Reduce support • Middleware sogware
RAM storage and parallel distributed processing are two fundamental pillars of in-‐memory compuGng.
![Page 8: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/8.jpg)
© 2014 GridGain Systems, Inc.
In-‐Memory Compu7ng: The Best Use Cases*
• Investment banking • Insurance claim processing & modeling • Real-‐Gme ad plahorms • Merchant plahorm for online games • GeospaGal/GIS processing • Medical imaging processing • Complex event processing of streaming sensor data
*I can only speak for GridGain which has producGon customers in a wide variety of industries to be staGsGcally significant
![Page 9: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/9.jpg)
© 2014 GridGain Systems, Inc.
Example System
• Imagined system in the beginning of its lifecycle • Growing: handling more users and data • Moving to memory for beker characterisGcs • Comparison: before and ager
![Page 10: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/10.jpg)
© 2014 GridGain Systems, Inc.
Example System: Just Launched
![Page 11: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/11.jpg)
© 2014 GridGain Systems, Inc.
Example System: Growing
![Page 12: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/12.jpg)
© 2014 GridGain Systems, Inc.
Example System: Moving to Memory (Step 1)
![Page 13: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/13.jpg)
© 2014 GridGain Systems, Inc.
Example System: Moving to Memory (Step 2)
![Page 14: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/14.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Before And A\er
• Data distribuGon in in-‐memory system • Simple data update scenarios • Long running queries • Scalability • Failure tolerance
![Page 15: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/15.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Data Distribu7on
Employee Larger data sets stored in PARTITIONED manner
Employee PARTITIONED 1/4 + [1/4] 1/4 + [1/4] 1/4 + [1/4] 1/4 + [1/4]
Each server is PRIMARY for 1/4 of Employee objects and BACKUP for another 1/4.
![Page 16: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/16.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Data Distribu7on
Company Employee • Smaller data sets may be
REPLICATED for colocaGon
• Larger data sets stored in PARTITIONED manner
Company REPLCATED
Employee PARTITIONED
FULL
1/4 + [1/4]
FULL
1/4 + [1/4]
FULL
1/4 + [1/4]
FULL
1/4 + [1/4]
Companies and employees are colocated
![Page 17: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/17.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Data Access And Update Before: Update User Profile
100% of the requests end up to the same
DB server
Ager: Update User Profile
Servers evenly share the load, since profiles are distributed along the cluster. Persistent storage is
updated in async manner
![Page 18: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/18.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Long Running Queries Before: select avg(sum) from Orders
100% of the requests end up to the same DB server which is responsible for scanning the enGre data set.
Ager: select avg(sum) from Orders
Servers evenly share the load running query against parGGoned data. Each server has smaller
data set to process.
![Page 19: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/19.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Scalability
![Page 20: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/20.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Scalability
![Page 21: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/21.jpg)
© 2014 GridGain Systems, Inc.
Comparison: Failure Tolerance
Failure of master DB server or certain servers in NoSQL deployment threatens the app.
Failure of 1-‐2-‐3-‐N nodes or planned shutdown does not threaten the cluster.
![Page 22: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/22.jpg)
© 2014 GridGain Systems, Inc.
Economics of In-‐Memory Compu7ng
• High Performance and low Latencies • RAM + Network Faster than Disk or Flash • VolaGle and Persistent • Cost EffecGve • OLAP and OLTP Use Cases • Distributed or Not • Caching, Streaming, ComputaGons • Data Querying – SQL or Unstructured
![Page 23: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/23.jpg)
© 2014 GridGain Systems, Inc.
Where Is Your Applica7on?
RDBMS
L2 Caching NoSQL Distributed Disk Systems
In-‐Memory Data Grids IMDBs
![Page 24: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/24.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Try In-‐Memory Compu7ng
• Full in-‐memory stack: compute, caching, streaming • IntuiGve APIs – easy to start with • Open-‐sourced under Apache 2.0 license • Hosted and developed on GitHub – fork and enjoy!
![Page 25: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/25.jpg)
© 2014 GridGain Systems, Inc.
GridGain: In-‐Memory Data Fabric Strategic Approach to IMC
• Supports all Apps
• Simple Java APIs • 1 JAR Dependency • High Performance & Scale • Automatic Fault Tolerance • Management/Monitoring • Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace
Clustering & Compute Grid
Data Grid Streaming
Hadoop Acceleration
© 2014 GridGain Systems, Inc.
![Page 26: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/26.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Clustering And Compute • Direct API for MapReduce • Direct API for Fork/Join • Zero Deployment • Cron-‐like Task Scheduling • State Checkpoints • Early and Late Load Balancing • AutomaGc Failover • Full Cluster Management • Pluggable SPI Design
![Page 27: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/27.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Automa7c Cluster Discovery
![Page 28: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/28.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Closure Execu7on
![Page 29: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/29.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Closure Execu7on
![Page 30: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/30.jpg)
© 2014 GridGain Systems, Inc.
GridGain: In-‐Memory Caching and Data Grid
• Distributed In-‐Memory Key-‐Value Store • Replicated and ParGGoned • TBs of data, of any type • On-‐Heap and Off-‐Heap Storage • Backup Replicas / AutomaGc Failover • Distributed ACID TransacGons • SQL queries and JDBC driver • ColocaGon of Compute and Data
![Page 31: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/31.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Cache Opera7ons
![Page 32: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/32.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Cache Transac7on
![Page 33: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/33.jpg)
© 2014 GridGain Systems, Inc.
GridGain: Distributed Java Data Structures
• Distributed Map (cache) • Distributed Set • Distributed Queue • CountDownLatch • AtomicLong • AtomicSequence • AtomicReference • Distributed ExecutorService
![Page 34: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/34.jpg)
© 2014 GridGain Systems, Inc.
Client-‐Server vs Affinity Coloca7on
Client-‐Server Affinity ColocaGon
![Page 35: In-Memory&Compu7ng:& FromDisk ...©2014"GridGain"Systems,"Inc." GridGain:&In-Memory&Compu7ng&Leader&& • More"than"5"years"in"producGon" • 100s"of"customers"&"users"" • Starts"every"10"seconds](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f8796ae516ae9253345581e/html5/thumbnails/35.jpg)
© 2014 GridGain Systems, Inc.
THANK YOU
www.gridgain.com @gridgain