Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM


Transcript of Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Page 1: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Linear Road Benchmark

Performance Comparison of Streaming Analytic Offerings

IBM Streams, 22 April 2016

Matt Grover, Walmart ISD Enterprise Architecture

Roger Rea, IBM Streams Offering Manager

Mike Spicer, IBM Lead Architect, IBM Streams

© 2016 IBM Corporation

Page 2: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Walmart and IBM

Today, nearly 260 million customers visit our more than 11,500 stores under 72 banners in 28 countries and e-commerce sites in 11 countries each week.

We employ 2.2 million associates around the world, 1.4 million in the U.S. alone.

International Business Machines Corporation ('IBM') is a globally integrated enterprise that operates in over 170 countries and has 380,000 employees. It brings innovative solutions to a diverse client base to help solve some of their toughest business challenges.

Page 3: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Why Linear Road?

Requirement for Streaming Analytics at Walmart
• Worldwide monitoring of logistics
• Real time inventory control
• Real time analytics

Linear Road Benchmark (2004)
• Original white paper (link)
• Open source benchmark
• Enables comparison between offerings
• Sophisticated application with state management

Why rewrite Linear Road
• No comprehensive streaming benchmark available
• The storage design did not represent the current state of streaming data systems

Page 4: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Linear Road Benchmark

Linear City is a fictional metropolis measuring 100 x 100 miles

10 expressways, one every 10 miles

Each expressway has an exit and an on-ramp every mile

Each expressway has 4 lanes in each direction: 3 travel lanes and one lane for entrance and exit

Every vehicle emits a position report every 30 seconds

One accident occurs randomly on each expressway every 20 minutes, taking 10 to 20 minutes to clear

Linear Road on github
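
The road geometry on this slide can be captured in a few lines. The following is a minimal sketch, not the benchmark's own code: the record field names, the 0/1 direction encoding, and the choice to measure position in feet from the start of the expressway are assumptions made for illustration. Only the 100-mile length, the one-mile spacing, and the 30-second reporting interval come from the slide.

```python
from dataclasses import dataclass

FEET_PER_MILE = 5280
EXPRESSWAY_LENGTH_MILES = 100   # from the slide: the city is 100 miles across
REPORT_INTERVAL_SECONDS = 30    # from the slide: one report per vehicle every 30 s


@dataclass(frozen=True)
class PositionReport:
    # Field names are hypothetical; the slide does not list the record layout.
    time_s: int        # seconds since the start of the simulation
    vehicle_id: int
    speed_mph: int
    xway: int          # expressway number
    direction: int     # 0 or 1, an assumed encoding of the two directions
    position_ft: int   # assumed: feet from the start of the expressway


def segment_of(report: PositionReport) -> int:
    """Return the one-mile segment (0..99) that the vehicle is in."""
    if not 0 <= report.position_ft < EXPRESSWAY_LENGTH_MILES * FEET_PER_MILE:
        raise ValueError("position is off the expressway")
    return report.position_ft // FEET_PER_MILE


def reports_per_second(active_vehicles: int) -> float:
    """Steady-state report rate if every active vehicle reports each interval."""
    return active_vehicles / REPORT_INTERVAL_SECONDS
```

Grouping reports by one-mile segment is a natural unit for this kind of analysis because exits and on-ramps are spaced one mile apart.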

Page 5: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Linear Road Benchmark

Four types of events

Type 0: 99% of events are real-time position reports

Type 2: Historical requests for account balances

Type 3: Daily expenditure for a specific day in the past 10 weeks

Type 4: Travel time predictions

GOAL: Maximum L-Rating (max # expressways)

Linear Road on github
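
The four event types can be told apart with a small dispatcher. This is a sketch under one stated assumption: that each input record is a comma-separated line whose first field is the numeric event type. The enum member names are hypothetical labels for the types listed on the slide.

```python
from collections import Counter
from enum import IntEnum


class EventType(IntEnum):
    # Numeric values are from the slide; the names are illustrative.
    POSITION_REPORT = 0
    ACCOUNT_BALANCE = 2
    DAILY_EXPENDITURE = 3
    TRAVEL_TIME = 4


def classify(line: str) -> EventType:
    """Read the event type from the first comma-separated field (assumed layout).

    Raises ValueError for a type that is not one of the four input types.
    """
    return EventType(int(line.split(",", 1)[0]))


def count_by_type(lines):
    """Tally how many events of each type appear in an iterable of lines."""
    return Counter(classify(line) for line in lines)
```

Because about 99% of events are position reports, a tally like this is a quick sanity check on a generated input file before a benchmark run.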

Page 6: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

High level Linear Road architecture

Linear Road data generator

Linear Road (solution implementation using vendor-specific streaming analytics middleware)

Results validator (rewritten in Python by Wal-Mart Stores Inc.)

Determine L-Rating

Courtesy of: Wal-Mart Stores Inc.

Page 7: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Why did IBM select Redis?

Great maturity level

Top performance

API is tremendously easy and very flexible

Clustered in-memory key-value store with fault tolerance

Option for in-memory only, or in-memory backed by persistence

Easy installation and monitoring

Page 8: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

High level Linear Road architecture with Redis and IBM Streams

Data Feeder: Linear Road Data Feeder streaming the events via TCP or Kafka

IBM Streams Linear Road logic
• TCP receiver
• Kafka consumer
• Event router
• Position report analytics (for each xway and direction), 1 .. N
• Account balance analytics
• Daily expenditure analytics

Results
• Type 0 toll notifications
• Type 1 accident alerts
• Type 2 results
• Type 3 results

Distributed state keeper

Historical reference data loader (a separate Streams application)
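
The slide shows position reports fanned out to 1 .. N analytics channels by expressway and direction, with shared state held in a distributed state keeper. The sketch below illustrates that idea only. It is not the Streams application: `channel_for` is a hypothetical partitioning function, and `InMemoryStateKeeper` is a plain dictionary standing in for the Redis-backed store so the example runs without a server.

```python
from collections import defaultdict


def channel_for(xway: int, direction: int, num_channels: int) -> int:
    """Map an (expressway, direction) pair to one of N parallel channels."""
    if direction not in (0, 1):
        raise ValueError("direction must be 0 or 1")
    # Each expressway contributes two keys, one per direction.
    return (2 * xway + direction) % num_channels


class InMemoryStateKeeper:
    """Dict-backed stand-in for the distributed state keeper (Redis in the deck)."""

    def __init__(self):
        self._balances = defaultdict(int)

    def add_toll(self, vehicle_id: int, toll: int) -> int:
        """Charge a toll to a vehicle and return its new account balance."""
        self._balances[vehicle_id] += toll
        return self._balances[vehicle_id]

    def balance(self, vehicle_id: int) -> int:
        """Return the current balance; vehicles never charged have balance 0."""
        return self._balances.get(vehicle_id, 0)
```

Keeping all reports for one expressway and direction on the same channel means congestion and accident state for a segment never has to be shared between channels, while account balances, which span expressways, go through the shared store.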

Page 9: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

IBM Streams Linear Road test environment

Cloud Service (all nodes on a vnet named Subnet-1)

Streaming analytics test bed

Linux or Windows jump box

IBM Network

Subnet-1

CPU: Intel Xeon E5-2670 @ 2.60 GHz (16 cores on all the machines)

Memory: 110 GB on Nodes 1 to 6

Redis: Total of 10 instances running on 5 machines

Streams Management Server [Node 1]

Streams Application Server [Node 2]

Streams Application Server [Node 3]

Streams Application Server [Node 4]

Streams Application Server (Ingest) [Node 5]

Standby and scratch work Server [Node 6]

Page 10: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Streams results

L-Rating 50 on one Azure node, 200 on 4 Azure nodes
• 1 node, 16 cores, nearly 1B events
• 4 nodes, 64 cores, nearly 4B events
• Linear scalability
• Handles bursty traffic
• 99% of responses sub-second

# of x-ways   # of cars       Entries          Memory      CPU
1             278973          19.2 Million     2.2 GB      2%
2             558726          38.5 Million     4.5 GB      4%
5             1.3 Million     96.3 Million     10.9 GB     7%
10            2.7 Million     192.5 Million    22.0 GB     11%
15            4.1 Million     289.7 Million    33.0 GB     16%
20            5.6 Million     385.2 Million    43.5 GB     20%
25            6.9 Million     482.0 Million    54.5 GB     26%
50            14.0 Million    963.1 Million    109.0 GB    31%
100           27.6 Million    1.9 Billion      220 GB      22%
150           41.5 Million    2.8 Billion      330 GB      33%
200           55.0 Million    3.8 Billion      440 GB      45%

[Two charts: Avg. throughput (K events/second) vs. number of expressways]
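
The "linear scalability" claim on this slide can be checked against the memory column of the results table. The sketch below copies the expressway counts and memory figures from that table and divides one by the other. The helper names are illustrative.

```python
# (expressways, memory in GB) pairs copied from the results table.
MEMORY_BY_XWAYS = [
    (1, 2.2), (2, 4.5), (5, 10.9), (10, 22.0), (15, 33.0), (20, 43.5),
    (25, 54.5), (50, 109.0), (100, 220.0), (150, 330.0), (200, 440.0),
]


def gb_per_expressway(rows=MEMORY_BY_XWAYS):
    """Memory divided by expressway count for every row of the table."""
    return [memory_gb / xways for xways, memory_gb in rows]


def spread(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


if __name__ == "__main__":
    per_xway = gb_per_expressway()
    print(f"GB per expressway: min {min(per_xway):.3f}, max {max(per_xway):.3f}")
```

Every row works out to between roughly 2.17 and 2.25 GB per expressway, so memory use grows in proportion to the number of expressways across the whole range from 1 to 200.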

Page 11: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Streams results

Development effort: one person, 14.5 days
• 1.5 days install Linux & Streams on 5 Azure nodes
• 2 days design application
• 8 days iterative development
• 3 days unit testing & tuning

Scale automated with User Defined Parallelization

One Way

Page 12: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Comparison to other technologies

Technology      Hardware on Azure    L-Rating
IBM Streams     Option 1             200
Apache Apex     Option 1             102
Apache Storm    Option 2             10

Four nodes of Option 1 or 2 for application processing:
• Option 1: Azure A11 (16 cores, 112 GB RAM, 382 GB Disk, 10 Gbit/s networking); CPU model: 45, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
• Option 2: Azure D14 (16 cores, 112 GB RAM, 800 GB Disk (SSD), 1 Gbit/s); CPU model: 45, Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz

Two nodes for ingesting data:
• If A11 selected, then A10 (8 cores, 56 GB RAM, 382 GB Disk, 10 Gbit/s networking)
• If D14 selected, then D13 (8 cores, 56 GB RAM, 400 GB Disk, 1 Gbit/s)

Plus: an A10 or D13 Windows Server, or Linux, jump box (Windows if a GUI is needed)

Six total nodes

IBM Streams:
• 2x better than Apex
• 20x better than Storm*

* Twitter has replaced Storm
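
The "2x" and "20x" figures follow directly from the L-Ratings in the comparison table. A short worked check, using only the three ratings from the slide (the function name is illustrative):

```python
# L-Ratings copied from the comparison table.
L_RATINGS = {"IBM Streams": 200, "Apache Apex": 102, "Apache Storm": 10}


def ratio_to(baseline: str, ratings=L_RATINGS) -> dict:
    """How many times larger the baseline's L-Rating is than each other entry."""
    base = ratings[baseline]
    return {name: base / value for name, value in ratings.items() if name != baseline}


if __name__ == "__main__":
    for name, ratio in ratio_to("IBM Streams").items():
        print(f"IBM Streams vs {name}: {ratio:.2f}x")
```

The Apex ratio is about 1.96, which the slide rounds to 2x. The Storm result was measured on Option 2 hardware (D14, 1 Gbit/s networking) rather than Option 1, so that 20x ratio is not a like-for-like hardware comparison.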

Page 13: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

IBM recognized as a leader

The Forrester Wave™: Big Data Streaming Analytics Platforms, Q1 ‘16

The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.

“IBM’s architecture can flex to handle any streaming challenge.”

IBM had the highest possible scores in Architecture, Operational Management, Streaming Operators, Application Development and Business Applications, Roadmap, Ability to Execute, Implementation Support and Partners.

“The development environment provides one of the richest set of operators in the market.”

“Streams can ingest and understand the always-on stream of data to make the decisions that underlie cognitive solutions.”

Page 14: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Stream Computing

Open Source

Extensible platform

Managed Service

Batch & Streaming

Command line interface

Web & JMX mgmt

At Least Once

Exactly once

State

Windows

Back pressure

Machine Learning

Model scoring

Video/Image

Geospatial

Text Analytics

Visual development

Automated HA

Enterprise adapters

Open source adapters

Esper, IBM Streams, Storm, Flink, Spark Streaming, Dataflow

Page 15: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Affordable Realtime Analytics

IBM Streams

100 Azure nodes: $110,261/Mo

5 Azure nodes: $5,513/Mo

Page 16: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

IBM Streams Success

Streams is the industry leading stream computing runtime for real time analytic processing for large-scale, in-memory distributed data processing.

Why do customers choose Streams?
• Superior performance and low latency
• Superior reliability and management
• Widest range of adapters
• Rapid development/debug capabilities
• User Community – StreamsDev, github
• Advanced Analytics – Machine Learning, Audio/Video, Geospatial, Natural Language Processing
• Enterprise integration & reliability
• IBM worldwide services and support

Page 17: Walmart & IBM Revisit the Linear Road Benchmark- Roger Rea, IBM

Additional resources

Visit:

ibm.com/streams

github.com/Walmart

github.com/IBMStreams/benchmarks