Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM
© 2016 IBM Corporation 1
IBM Streams
22 April 2016
Matt Grover, Walmart ISD Enterprise Architecture
Roger Rea, IBM Streams Offering Manager
Mike Spicer, IBM Lead Architect IBM Streams
Linear Road Benchmark
Performance Comparison of Streaming Analytic Offerings
Walmart and IBM
Today, nearly 260 million customers visit our more than 11,500 stores under 72 banners in 28 countries and e-commerce sites in 11 countries each week.
We employ 2.2 million associates around the world, 1.4 million in the U.S. alone.
International Business Machines Corporation ('IBM') is a globally integrated enterprise operating in over 170 countries, with 380,000 employees. It brings innovative solutions to a diverse client base to help solve some of their toughest business challenges.
Why Linear Road?
Requirement for Streaming Analytics at Walmart
• Worldwide monitoring of logistics
• Real time inventory control
• Real time analytics
Linear Road Benchmark (2004)
• Original White paper (link)
• Open Source benchmark
• Enables comparison between offerings
• Sophisticated application with state management
Why rewrite Linear Road
• No comprehensive streaming benchmark available
• The storage design did not represent the current state of streaming data systems
Linear Road Benchmark
Linear City is a fictional metropolis of 100x100 miles
10 expressways, one every 10 miles
Each expressway has an exit and an on-ramp every mile
Each expressway has 4 lanes in each direction
3 travel lanes and one lane for entrance and exit
Every vehicle emits a position report every 30 seconds
One accident occurs randomly on each expressway every 20 minutes, taking 10 to 20 minutes to clear
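The road model above can be sketched as a small data structure. This is only an illustration under stated assumptions: the field names, the lane numbering, and the `reports_per_second` helper are hypothetical and are not taken from the benchmark code. Only the constants come from the slide.

```python
from dataclasses import dataclass

# Constants taken from the slide; all names are illustrative.
CITY_SIZE_MILES = 100
LANES_PER_DIRECTION = 4      # 3 travel lanes + 1 entrance/exit lane
REPORT_INTERVAL_S = 30       # each vehicle reports every 30 seconds


@dataclass(frozen=True)
class PositionReport:
    """Hypothetical shape of one position report (field names are assumptions)."""
    time_s: int       # seconds since the start of the run
    vehicle_id: int
    xway: int         # which expressway the vehicle is on
    direction: int    # 0 or 1
    lane: int         # 0 .. LANES_PER_DIRECTION - 1
    mile: int         # 0 .. CITY_SIZE_MILES - 1

    def is_valid(self) -> bool:
        """Check the report against the road geometry described on the slide."""
        return (
            self.direction in (0, 1)
            and 0 <= self.lane < LANES_PER_DIRECTION
            and 0 <= self.mile < CITY_SIZE_MILES
        )


def reports_per_second(active_vehicles: int) -> float:
    """Average report rate if every active vehicle reports once per interval."""
    return active_vehicles / REPORT_INTERVAL_S
```

For example, `reports_per_second(300)` gives 10.0, which shows how the 30-second reporting interval turns a vehicle count into an event rate.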
Linear Road on github
Linear Road Benchmark
Four types of events
Type 0: 99% of events are real-time position reports
Type 2: Historical requests for account balances
Type 3: Daily expenditure for a specific day in the past 10 weeks
Type 4: Travel time predictions
GOAL: Maximum L-Rating (max # expressways)
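The four input event types listed above suggest a simple dispatch step. The sketch below is a minimal illustration, assuming events arrive as dictionaries with a `type` key; the handler names and the dictionary layout are hypothetical and not part of the benchmark.

```python
from collections import Counter
from enum import IntEnum


class EventType(IntEnum):
    """Input event types as numbered on the slide."""
    POSITION_REPORT = 0
    ACCOUNT_BALANCE = 2
    DAILY_EXPENDITURE = 3
    TRAVEL_TIME = 4


# Hypothetical handler names, one per event type.
HANDLERS = {
    EventType.POSITION_REPORT: "position_reports",
    EventType.ACCOUNT_BALANCE: "account_balance",
    EventType.DAILY_EXPENDITURE: "daily_expenditure",
    EventType.TRAVEL_TIME: "travel_time",
}


def route(event: dict) -> str:
    """Return the handler name for an event; raises ValueError for unknown types."""
    return HANDLERS[EventType(event["type"])]


# Tiny sample stream; real traffic is dominated by type 0 events.
events = [{"type": 0}, {"type": 0}, {"type": 2}, {"type": 3}, {"type": 4}]
counts = Counter(route(e) for e in events)
```

Using an enum makes an unexpected type code fail loudly instead of being silently dropped.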
Linear Road on github
High level Linear Road architecture
Linear Road data generator
Linear Road (solution implementation using vendor-specific streaming analytics middleware)
Results validator (rewritten in Python by Wal-Mart Stores Inc.)
Determine L-Rating
Courtesy of: Wal-Mart Stores Inc.
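The final step, determining the L-Rating, can be sketched as picking the largest expressway count for which a validated run passed. This is an assumption about how the result is summarised, not the validator's actual logic; the function name and the sample input are hypothetical.

```python
from typing import Mapping


def l_rating(run_passed: Mapping[int, bool]) -> int:
    """Largest expressway count whose validated run passed, or 0 if none did.

    run_passed maps "number of expressways in the run" to "did the
    results validator accept the run" (a hypothetical summary format).
    """
    passed = [xways for xways, ok in run_passed.items() if ok]
    return max(passed, default=0)


# Made-up example: runs with 1 and 2 expressways pass, the run with 5 fails.
example = {1: True, 2: True, 5: False}
rating = l_rating(example)
```

With this made-up input, `rating` is 2.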
Why did IBM select Redis?
Great maturity level
Top performance
API is tremendously easy and very flexible
Clustered in-memory key-value store with fault tolerance
Option for in-memory only, or in-memory backed by persistence
Easy installation and monitoring
High level Linear Road architecture with Redis and IBM Streams
Linear Road Data Feeder streaming the events via TCP or Kafka
Data Feeder
TCP receiver
Kafka consumer
Event router
IBM Streams Linear Road logic
• Position report analytics (for each xway and direction), 1 .. N: Type 0 toll notifications, Type 1 accident alerts
• Account balance analytics: Type 2 results
• Daily expenditure analytics: Type 3 results
Historical reference data loader (a separate Streams application)
Distributed state keeper
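A distributed state keeper of this kind can be pictured as a shared hash of per-vehicle balances. The sketch below is a stand-in, not the talk's implementation: the key name `account_balance`, the helper functions, and the in-memory class are all hypothetical. The class only borrows the method names `hincrby` and `hget` from the redis-py client so that a real client object could be passed in its place.

```python
from collections import defaultdict


class InMemoryStateKeeper:
    """Dict-backed stand-in for a key-value store (illustration only)."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def hincrby(self, name, key, amount=1):
        """Add amount to an integer field and return the new value."""
        new_value = self._hashes[name].get(key, 0) + amount
        self._hashes[name][key] = new_value
        return new_value

    def hget(self, name, key):
        """Return the field value, or None if it is not set."""
        return self._hashes[name].get(key)


def charge_toll(store, vehicle_id: int, toll: int) -> int:
    """Record a toll against a vehicle and return its running balance."""
    return store.hincrby("account_balance", str(vehicle_id), toll)


def account_balance(store, vehicle_id: int) -> int:
    """Answer an account-balance request (Type 2) from the shared state."""
    value = store.hget("account_balance", str(vehicle_id))
    return int(value) if value is not None else 0


store = InMemoryStateKeeper()
charge_toll(store, 7, 5)
charge_toll(store, 7, 3)
```

Keeping balances outside the analytics operators is what lets the position-report branch write tolls while a separate branch answers balance queries.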
IBM Streams Linear Road test environment
Cloud Service (all nodes on a vnet named Subnet-1)
Streaming analytics test bed
Linux or Windows jump box
IBM Network
Subnet-1
CPU: Intel Xeon E5-2670 @ 2.60 GHz (16 cores on all the machines)
Memory: 110 GB (on Nodes 1 to 6)
Redis: Total of 10 instances running on 5 machines
Streams Management Server [Node 1]
Streams Application Server [Node 2]
Streams Application Server [Node 3]
Streams Application Server [Node 4]
Streams Application Server (Ingest) [Node 5]
Standby and scratch work Server [Node 6]
Streams results
L-Rating 50 on one Azure node, 200 on 4 Azure nodes
• 1 node, 16 cores, nearly 1B events
• 4 nodes, 64 cores, nearly 4B events
• Linear scalability
• Handles bursty traffic
• 99% of responses sub-second
# of x-ways   # of cars      Entries          Memory     CPU
1             278973         19.2 Million     2.2 GB     2%
2             558726         38.5 Million     4.5 GB     4%
5             1.3 Million    96.3 Million     10.9 GB    7%
10            2.7 Million    192.5 Million    22.0 GB    11%
15            4.1 Million    289.7 Million    33.0 GB    16%
20            5.6 Million    385.2 Million    43.5 GB    20%
25            6.9 Million    482.0 Million    54.5 GB    26%
50            14.0 Million   963.1 Million    109.0 GB   31%
100           27.6 Million   1.9 Billion      220 GB     22%
150           41.5 Million   2.8 Billion      330 GB     33%
200           55.0 Million   3.8 Billion      440 GB     45%
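The linear-scalability claim can be checked against the memory column of the table above. The sketch below simply divides memory by expressway count for each row; the variable and function names are illustrative, and the figures are copied from the slide.

```python
# (number of expressways, memory in GB), copied from the results table.
ROWS = [
    (1, 2.2), (2, 4.5), (5, 10.9), (10, 22.0), (15, 33.0), (20, 43.5),
    (25, 54.5), (50, 109.0), (100, 220.0), (150, 330.0), (200, 440.0),
]


def gb_per_expressway(rows):
    """Memory footprint per expressway for each row of the table."""
    return [memory_gb / xways for xways, memory_gb in rows]


ratios = gb_per_expressway(ROWS)
# A small spread means memory grows roughly linearly with expressway count.
spread = max(ratios) - min(ratios)
```

Every ratio falls between 2.1 and 2.3 GB per expressway, which is consistent with memory growing in proportion to the number of expressways.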
[Figure: two charts of average throughput (K events/second) against number of expressways]
Streams results
Development effort: one person, 14.5 days
• 1.5 days install Linux & Streams on 5 Azure nodes
• 2 days design application
• 8 days iterative development
• 3 days unit testing & tuning
Scale automated with User Defined Parallelization
One Way
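The idea behind scaling by parallel channels can be illustrated by partitioning position reports on their (expressway, direction) pair, matching the "for each xway and direction" grouping in the architecture slide. This is plain Python, not SPL, and it does not reproduce how User Defined Parallelization works inside Streams; the function names and the report layout are hypothetical.

```python
from collections import defaultdict


def channel_for(xway: int, direction: int, num_channels: int) -> int:
    """Map an (expressway, direction) pair to one of num_channels workers.

    direction is assumed to be 0 or 1, so xway * 2 + direction is a
    distinct index for every pair.
    """
    return (xway * 2 + direction) % num_channels


def partition(reports, num_channels: int):
    """Group reports by channel so each worker sees a consistent subset."""
    channels = defaultdict(list)
    for report in reports:
        index = channel_for(report["xway"], report["direction"], num_channels)
        channels[index].append(report)
    return dict(channels)


reports = [
    {"xway": 0, "direction": 0, "vehicle_id": 1},
    {"xway": 0, "direction": 1, "vehicle_id": 2},
    {"xway": 0, "direction": 0, "vehicle_id": 3},
    {"xway": 1, "direction": 0, "vehicle_id": 4},
]
by_channel = partition(reports, num_channels=4)
```

Because all reports for one expressway direction land on the same channel, per-segment state such as vehicle counts never has to be shared between workers.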
Comparison to other technologies
Technology     Hardware on Azure   L-Rating
IBM Streams    Option 1            200
Apache Apex    Option 1            102
Apache Storm   Option 2            10
Four nodes of Option 1 or 2 for application processing:
Option 1: Azure A11 (16 cores, 112 GB RAM, 382 GB Disk, 10 Gbit/s networking), or
• CPU model: 45, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Option 2: Azure D14 (16 cores, 112 GB RAM, 800 GB Disk (SSD), 1 Gbit/s)
• CPU model: 45, Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
Two nodes for ingesting data:
• If A11 selected, then A10 (8 cores, 56 GB RAM, 382 GB Disk, 10 Gbit/s networking)
• If D14 selected, then D13 (8 cores, 56 GB RAM, 400 GB Disk, 1 Gbit/s)
Plus: an A10 or D13 Windows Server, or Linux, jump box (Windows if a GUI is needed)
Six total nodes
IBM Streams:
• About 2x better than Apex (L-Rating 200 vs. 102)
• 20x better than Storm (L-Rating 200 vs. 10)
* Twitter has replaced Storm
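The headline ratios follow directly from the L-Ratings in the comparison table. A minimal check, with illustrative names:

```python
# L-Ratings copied from the comparison table on this slide.
L_RATINGS = {"IBM Streams": 200, "Apache Apex": 102, "Apache Storm": 10}


def rating_ratio(baseline: str, other: str) -> float:
    """How many times larger the baseline's L-Rating is than the other's."""
    return L_RATINGS[baseline] / L_RATINGS[other]


vs_apex = rating_ratio("IBM Streams", "Apache Apex")
vs_storm = rating_ratio("IBM Streams", "Apache Storm")
```

The Apex ratio comes out just under 2 and the Storm ratio is exactly 20. Note that the Storm run used the Option 2 hardware, so the two ratios are not measured on identical machines.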
IBM recognized as a leader
The Forrester Wave™: Big Data Streaming Analytics Platforms, Q1 ’16
The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
“IBM’s architecture can flex to handle any streaming challenge.”
IBM had the highest possible scores in Architecture, Operational Management, Streaming Operators, Application Development and Business Applications, Roadmap, Ability to Execute, Implementation Support and Partners.
“The development environment provides one of the richest set of operators in the market.”
“Streams can ingest and understand the always-on stream of data to make the decisions that underlie cognitive solutions.”
Stream Computing
Open Source
Extensible platform
Managed Service
Batch & Streaming
Command line interface
Web & JMX management
At least once
Exactly once
State
Windows
Back pressure
Machine Learning
Model scoring
Video/Image
Geospatial
Text Analytics
Visual development
Automated HA
Enterprise adapters
Open source adapters
Esper, IBM Streams, Storm, Flink, Spark Streaming, Dataflow
Affordable Realtime Analytics IBM Streams
100 Azure nodes: $110,261/Mo
5 Azure nodes: $5,513/Mo
Streams is the industry-leading stream computing runtime for real-time analytics and large-scale, in-memory distributed data processing.
Why do customers choose Streams?
• Superior performance and low latency
• Superior reliability and management
• Widest range of adapters
• Rapid development/debug capabilities
• User Community – StreamsDev, github
• Advanced Analytics – Machine Learning, Audio/Video, Geospatial, Natural Language Processing
• Enterprise integration & reliability
• IBM worldwide services and support
IBM Streams Success
Additional resources
Visit:
ibm.com/streams
github.com/Walmart
github.com/IBMStreams/benchmarks