The Magic and Mystery of In-Memory Apps
The Magic and Mystery of In-Memory Apps Taufik Ma – Industry Insight Shaun Walsh - Marketeer
© 2015 G2M COMMUNICATIONS. All rights reserved.
Contents
Why Use In-Memory Applications?
Evolution Towards & Role of In-Memory Computing
Role of Storage in In-Memory Solutions
Customer Trends
Emerging Technologies & Some Predictions
Summary
Magic and In-Memory Applications Shaun Walsh - Marketeer
The Evolution of Storage Tiers
NVM will Accelerate Both Meta-Data & Application Data
NVDIMM Acceleration Segments
Latency
Meta Data Acceleration
| NVDIMM Type | Presentation | Access Method | Latency |
|---|---|---|---|
| -N | DRAM | Byte | Consistent |
| -F | Storage | Block | Variable |
| -P | DRAM and/or Storage | Byte & Block | Variable |
Meta Data Acceleration
NVDIMM-N
NVDIMM-F NVDIMM-P 3D-XPoint
• Database Log Files • Clustering • Cache Synchronization
• In-Memory DBs • MemCacheD • RAID • De-Dupe
NVM-DIMM – fills growing DRAM-NAND gap
In-Memory Applications are driving a new class of Storage Class Memory (SCM)
Latency and persistence are as important as absolute bandwidth
Byte and Block address flexibility is vital to scaling In-Memory Applications (IMA)
Persistent Memory
The Future of Business Intelligence
Latency and Persistence are the new value currency for real-time applications & storage
Bandwidth & Capacity
• Old performance was data rates (GB/s) & capacity (TB)
• Store Everything, Sort Later
• Higher Cost, Slow Decisions
Latency & Persistence
• Real-Time is Business Critical
• Major Players Driving NVM
• Store the Vital & Analyze Now
Procter & Gamble - Real-Time Reporting & Business Decisions
https://hana.sap.com/abouthana/customer-stories/pg.html
35,000 retail, supply chain and business users supported
400% increase in decision support systems performance
55% reduction in database size, from 36TB to 16TB, all in memory
P&G achieved faster, more reliable reporting and analytics
McLaren Group – Faster Formula 1
• Faster and more consistent lap times
• Improved downforce for better grip
• Real-time telemetric analysis
• More World Championships
The Art and Science of In Memory Applications Taufik Ma Industry Insight
Evolution of Databases & Analytics
Timeline: 1980s, 1990s, 2000s-2015
RDBMS: Operational (OLTP, ERM). Oracle, MS SQL, Sybase; MySQL, Postgres
EDW/OLAP: Data Warehousing (data mining, DSS, analytics). Teradata, Oracle, SAS, etc; IBM Netezza, EMC Greenplum
NoSQL: MongoDB, Cassandra
Hadoop: MapReduce, HBase
Ongoing Evolution & Specialization…

| | Structured Data, Relational | Unstructured, Schema-less |
|---|---|---|
| Real-time, Online Operations | RDBMS | NoSQL |
| Batch, Offline Analytics | EDW/OLAP | Hadoop |

Real-time, online use cases: OLTP, ERM; purchases, clicks; user profiles, reviews; content management
Batch, offline use cases: user segmentation; daily offer recommendation; ad serving engine; fraud detection
In-Memory Database: Hana, Exalytics, MemSQL, etc
In-Mem Data Processing: Spark, Hadoop in-mem
Real-time analytics: financial risk/value analysis; fraud prevention; real-time recommendations; profitability analysis
Multiple Tools Within A Customer
Customer Profiles (G2M Survey), columns in order: $500M+ Retail | $500M+ Pharma | $1B+ Manufacturing | $1B+ Pharma | $1B+ SaaS | $250M+ Healthcare
Hadoop: Yes | Yes | Yes | Yes | Yes | Yes
MongoDB: Yes | No plans | Yes | Yes | No plans (five values in the source)
Spark: Yes | No plans | Considering | Yes, in 6 months | Yes | Yes, in 6 months
SAP HANA: No plans | Yes | Considering | Yes | No plans | Considering
Microsoft Hekaton: No plans | No plans | Considering | Yes, in 6 months | No plans | Yes, in 12 months
memSQL: No plans | No plans | Considering | Yes, in 6 months | No plans | Yes, in 12+ months
Oracle Exalytics: No plans | No plans | Yes | Yes | No plans | Yes, in 12+ months
“Specialized Tools for Specific Needs” (Or “Too Many Data Islands”?)
Multiple In-Memory Applications within a Customer
How many in-memory applications do you (or will you) run?
Response options: 1-5, 6-10, More than 10
Key Enabler of In-Memory Computing: Today’s Technologies
On a human scale…
If I complete 50 operations in 50 seconds, then have to wait for data…
Time to get data:
CPU L1 cache: 0.001 usec
DRAM: 0.01 usec = getting food from the fridge (tens of seconds)
NAND: 100 usec = taking the day off
HDD: 10,000 usec = hiking the Pacific Crest Trail (months)
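The human-scale analogy can be checked with a little arithmetic. The sketch below assumes the scale is "one L1 cache access (0.001 usec, i.e. 1 ns) takes one second"; that scale factor is an interpretation of the slide, and the function names are illustrative only.

```python
# Latencies from the slide, converted to whole nanoseconds to keep the math exact.
LATENCY_NS = {
    "CPU L1 cache": 1,       # 0.001 usec
    "DRAM": 10,              # 0.01 usec
    "NAND": 100_000,         # 100 usec
    "HDD": 10_000_000,       # 10,000 usec
}

def human_scale_seconds(latency_ns, base_ns=1):
    """Stretch a latency so that `base_ns` maps to one second (assumed scale)."""
    return latency_ns / base_ns

def describe(seconds):
    """Render a duration in the largest convenient unit."""
    for unit, size in (("days", 86_400), ("hours", 3_600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.0f} seconds"

scaled = {name: human_scale_seconds(ns) for name, ns in LATENCY_NS.items()}
for name, seconds in scaled.items():
    print(f"{name:>12}: {describe(seconds)}")
```

Under that assumption DRAM lands at 10 seconds, NAND at a little over a day, and HDD at roughly 116 days, which lines up with the fridge, day-off and months-long-hike comparisons.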
Performance Comes at a Price
| Storage | Time to get data | Price / GB | Cost for 100TB | # 2U Servers Req’d to Hold 100TB* |
|---|---|---|---|---|
| DRAM | 0.01 usec | $5.60 (32G DIMM for $179 ea, Samsung Registered DDR4, M393A4K40BB0-CPB0) | $560,000 (3125 x 32G DIMMs) | 130 |
| NAND | 100 usec | $0.35 (2.5” 1TB SSD, $350 ea, Intel 540S) | $35,000 (100 x 2.5” 1TB SSD) | 5 |
| HDD | 10,000 usec | $0.03 (3.5” 4TB SATA HDD for $120 ea, Seagate ST4000DM000) | $3,000 (25 x 3.5” 4TB SATA HDD) | 2-3 |

* Assuming 24 DIMM slots, 24x 2.5” drives or 12x 3.5” drives
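The figures in this comparison follow from the unit prices and slot counts quoted on the slide. A minimal sketch of that arithmetic is below; it assumes decimal units (100TB = 100,000GB), that every slot in a server is populated, and strict rounding up. The names `size_tier` and `DEVICES` are illustrative, not from the deck.

```python
import math

CAPACITY_GB = 100_000  # 100TB in decimal units (assumption)

# (device, capacity in GB, unit price in USD, devices per 2U server) from the slide
DEVICES = [
    ("DRAM 32G DIMM", 32, 179, 24),
    ("NAND 1TB SSD", 1_000, 350, 24),
    ("HDD 4TB SATA", 4_000, 120, 12),
]

def size_tier(capacity_gb, unit_gb, unit_price, slots_per_server):
    """Return price per GB, device count, total cost and server count for one tier."""
    units = math.ceil(capacity_gb / unit_gb)
    return {
        "price_per_gb": unit_price / unit_gb,
        "units": units,
        "cost": units * unit_price,
        "servers": math.ceil(units / slots_per_server),
    }

results = {
    name: size_tier(CAPACITY_GB, gb, price, slots)
    for name, gb, price, slots in DEVICES
}
for name, r in results.items():
    print(f"{name}: ${r['price_per_gb']:.2f}/GB, {r['units']} units, "
          f"${r['cost']:,}, {r['servers']} servers")
```

Strict rounding up gives 131 servers for DRAM and 3 for HDD, where the slide shows 130 and 2-3, and the DRAM total comes to $559,375 because the slide rounds the price to $5.60/GB before multiplying.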
Location of Data & Tasks
[Figure: three architectures compared]
Hadoop: MapReduce / HDFS: the input file is split into chunks (1, 2, 3) held on DISK; the JobTracker / Name Node sends parallel tasks to the data nodes.
Spark / Tachyon: the input file is split into partitions (RDDs) (1, 2, 3) held in MEM; the Spark Driver sends parallel tasks to the worker nodes.
SAP Hana: the input file is split by user partitioning (1, 2) and held in MEM; local tasks run on each node, with Master, Slave(s) and Standby roles.
Surviving Failures
[Figure: how each architecture survives failures]
Hadoop: MapReduce / HDFS: chunks (1, 2, 3) on DISK with 3-fold replication across nodes.
Spark / Tachyon: partitions (RDDs) (1, 2, 3) in MEM; lineage & checkpoints go to persistent storage. Lineage: record of transformations that created an RDD from its “parent”.
SAP Hana: user partitions (1, 2) in MEM; logs & savepoints go to external storage; Master, Slave(s) and Standby roles.
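To make the lineage idea concrete, here is a toy model in plain Python. It is not Spark's API: `ToyRDD` and its methods are hypothetical names, and the sketch only shows the principle that a lost in-memory partition can be rebuilt by replaying the recorded transformation on its parent instead of keeping replicas.

```python
class ToyRDD:
    """A toy stand-in for an RDD: partitions plus the recipe that built them."""

    def __init__(self, partitions, parent=None, transform=None):
        self.partitions = list(partitions)  # in-memory data; None marks a lost partition
        self.parent = parent                # lineage: the dataset this one came from
        self.transform = transform          # lineage: how each element was derived

    def map(self, fn):
        """Derive a new dataset and remember both the parent and the function."""
        derived = [[fn(x) for x in part] for part in self.partitions]
        return ToyRDD(derived, parent=self, transform=fn)

    def lose(self, index):
        """Simulate a node failure that drops one partition from memory."""
        self.partitions[index] = None

    def recover(self, index):
        """Return a partition, recomputing it from the parent if it was lost."""
        if self.partitions[index] is None:
            if self.parent is None:
                raise RuntimeError("source partition lost; reload from persistent storage")
            source = self.parent.recover(index)
            self.partitions[index] = [self.transform(x) for x in source]
        return self.partitions[index]

source = ToyRDD([[1, 2], [3, 4], [5, 6]])
squared = source.map(lambda x: x * x)
squared.lose(1)
print(squared.recover(1))  # [9, 16]
```

Only the lost partition is recomputed; the other two stay untouched in memory, which is the trade the slide contrasts with 3-fold replication on disk.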
No such thing as 100% In-Memory
[Figure: each architecture spreads data (a, b, c) across storage tiers]
Hadoop: MapReduce / HDFS: HDFS 2.0 Heterogeneous Storage, Storage Types & Policies. Tiers shown: DISK*, SSD, RAM_DISK. Files/directories are assigned policies (e.g. Lazy_persist, All_SSD).
Spark / Tachyon: Tachyon Tiered Storage (for Off_heap Spark RDDs). Tiers shown: MEM, SSD, HDD. Tiering is auto or manual.
SAP Hana: SAP HANA Dynamic Tiering. Data is spec’d as either Hot or Warm. HOT: primary image in Mem. WARM: primary image on Disk. External storage holds logs & savepoints and caching.
* ARCHIVE tier not shown
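The common thread across these three products is that data moves between faster and slower tiers. The sketch below shows one possible automatic policy, least-recently-used spill-down with promotion on access. It is a hypothetical illustration: `TieredStore`, the tier capacities and the LRU rule are assumptions, not the behavior or API of HDFS, Tachyon or HANA.

```python
from collections import OrderedDict

class TieredStore:
    """Toy tiered store: blocks enter the top tier and spill downward when it fills."""

    def __init__(self, capacities):
        # capacities: tier name -> max number of blocks, listed fastest tier first
        self.capacities = dict(capacities)
        self.order = list(self.capacities)
        self.tiers = {name: OrderedDict() for name in self.order}

    def put(self, key, value, level=0):
        """Place a new block in a tier, demoting the least recently used block if full."""
        name = self.order[level]
        tier = self.tiers[name]
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > self.capacities[name]:
            old_key, old_value = tier.popitem(last=False)  # least recently used block
            if level + 1 < len(self.order):
                self.put(old_key, old_value, level + 1)
            # Otherwise the block is dropped; a real system would keep a durable copy.

    def get(self, key):
        """Read a block and promote it to the fastest tier."""
        for name in self.order:
            if key in self.tiers[name]:
                value = self.tiers[name].pop(key)
                self.put(key, value)
                return value
        raise KeyError(key)

    def where(self, key):
        """Report which tier currently holds a block, or None."""
        for name in self.order:
            if key in self.tiers[name]:
                return name
        return None

store = TieredStore({"MEM": 2, "SSD": 2, "HDD": 4})
for block in "abcde":
    store.put(block, f"data-{block}")
print({block: store.where(block) for block in "abcde"})
```

After the five writes the oldest block sits on HDD and the newest two in MEM; reading the oldest block pulls it back into MEM and pushes others down. A policy-driven design, as in the HDFS example, would instead pin a file or directory to a named tier.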
Customer In-Memory Computing Trends (based on G2M survey)
SIZE
• Cluster sizes similar to big data solutions
o ½ respondents > 500 servers, 1/3 at >50
o And not just for Spark
• With datasets that fit available DRAM capacity
o 1/3 at >100TB, 1/3 at >10TB
GROWTH
• ~Half with 10-20%+/yr dataset growth
• Majority use/want tiering when dataset > DRAM
o Only minority would rely on scale-out only
• Mixed on whether tiering should be transparent or not
o Some want it transparent to developer; rest want developer to have control via policy
EFFICIENCY
• ~Half believe “my storage capacity forces me to have more compute capacity than I need”
• Majority have or have plans for consolidated data silos
o OLTP+IMDB, Spark+Hadoop, NoSQL+Hadoop
Emerging Technologies: High-speed Fabrics & Disaggregated Storage
• Data Center Ethernet speeds ramping faster than drive speeds: 10/25/40/50/100G
• RDMA-over-Ethernet technologies
• Multi-host PCIe fabrics emerging (e.g. OCP Lightning) albeit w/ less scalability
Disaggregated storage:
• Ethernet or PCIe based fabric
• DAS-like performance
• Local or SAN
• Map any drive to any host
• Scale each storage tier separately from compute
• Early proof points: EMC DSSD, SanDisk InfiniFlash, DriveScale
[Figure: CPUs connected over a low latency fabric to separate DRAM, NAND and HDD tiers]
[Chart: Ethernet speeds (10G, 25G, 40G, 50G, 100G) versus drive interfaces (SATA/SAS, NVMe PCIe x4 Gen3) over time]
Emerging Technologies: Storage Class Memory
| Storage | Persistence | Time to access data | Price / GB | Cost for 100TB | # 2U Servers Req’d to Hold 100TB* |
|---|---|---|---|---|---|
| DRAM | N | 10ns+ | $5.60 | $560,000 (3125 x 32G DIMMs) | 130 |
| NVDIMM-N | Y | 10ns+ | $10+ (if 2X+ DRAM) | $1,000,000+ | 260 (16G NVDIMM, supercap) |
| 3DXP DIMM | | 100ns Rd, 500ns Wr | $2+ (if 1/3+ DRAM) | $190,000+ | ~50 (assuming 96 or 128GB DIMMs) |
| NAND | Y | 100 usec | $0.35 (2.5” 1TB SSD, $350 ea, Intel 540S) | $35,000 (100 x 2.5” 1TB SSD) | 5 |
| HDD | Y | 10,000 usec | $0.03 (3.5” 4TB SATA HDD for $120 ea, Seagate ST4000DM000) | $3,000 (25 x 3.5” 4TB SATA HDD) | 2-3 |

* Assuming 24 DIMM slots, 24x 2.5” drives or 12x 3.5” drives
In-Memory Computing Predictions / Trends
1. 3DXP DIMMs used for “Jumbo Memory”
o Value in lower $/GB vs DRAM, not persistence
o Mix of 3DXP & DRAM DIMMs in server nodes
o Tiering will be tuned to accommodate slower writes & reads
o Spark, In-mem Hadoop, MemSQL, Hana, etc
o NVDIMM-P might have similar adoption but predictable latency is a concern
2. Increasing use of NVMe SSDs as “Far Memory”
o As next tier (below DRAM/3DXP)
o Priority on $/TB, not persistence. Resiliency still via lineage, logs, etc
o Remove “last-inch” of latency via BLKB (block-layer/kernel bypass) stacks (e.g. EMC libflood, SPDK)
o Implemented as a fabric-disaggregated cluster to enable efficiency & independent scalability
o Longer-term, HW-based paging of near-memory to far-memory
3. Use of “Persistent Memory” for In-Mem computing will evolve
o For 3DXP & NVDIMM-N
o Industry progress on pmem file systems (Linux, Windows)
o Does persistence replace or complement lineage/logs?
o Need low latency replication across nodes (PMoF)
Summary
In-memory solutions growing in adoption – driven by real-time analytics
Co-existence of structured (e.g. Hana) and unstructured frameworks (e.g. Spark)
Confluence of big-data & real-time analytics drives increasing adoption of tiering
Newer technologies on the horizon will continue to disrupt in-memory computing architectures