1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong [email protected],...

56
1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong [email protected], Intel Software

Transcript of 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong [email protected],...

Page 1: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

1

GearpumpReal time DAG-Processing at Scale

Strata Singapore 2015

Sean [email protected], Intel Software

Page 2: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

2

What is Gearpump• Akka based lightweight Real time data processing platform.• Apache License http://gearpump.io version 0.7

• Akka: • Communication, concurrency, Isolation, and fault-tolerant

Simple and PowerfulMessage level streamingLong running daemons

What is Akka?

Page 3: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

3

What is Akka?• Micro-service(Actor) oriented.

• Message Driven• Lock-free• Location-transparent

It is like our human society, driven by messageWhich can scale to 7 billion population!

Page 4: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

4

Micro-service orientedhigher abstraction

• Break your application into Micro services instead of object.

• Throw away locks• Use Immutable Async message to exchange

information between micro-service instead of shared object.

Page 5: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

5

Gearpump in Big Data Stack

store

visualization

batch stream

SQLCatalystStreamSQLImpala

Cluster manager monitor/alert/notify

Machine learning

Graphx

ClouderaManager

Storage

Engine

Analytics

Visualization& management

storm

Here !

Data explore

Gearpump

Page 6: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

6

Why another streaming platform?

• The requirements are not fully met.• A higher abstraction like micro-service oriented can largely simplify

the problem.

Page 7: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

7

What we want• Meet The 8 Requirements of Real-Time Stream

Processing (2006)

7

Flexible Volume Speed Accuracy VisualAny whereAny sizeAny sourceAny use casedynamic DAG②StreamSQL

High throughput⑦Scale linearly

①In-StreamZero latency⑥HA⑧Responsive

Exactly-once③Message loss/delay/out of order④Predictable

Easy to debugWYSWYG

Page 8: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

8

Overview

Page 9: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

9

DAG representation and API

Graph(A~>B~>C~>D, B~>E~>D)

Low level Graph API Syntax:

A B C D

E

Processor

Processor Processor Processor

ProcessorDAG

ShuffleField grouping

Field grouping

Page 10: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

10

Architecture - Actor Hierarchy

Client

Hook in and query state

As general service

YARNEach App has one isolated AppMaster, and use Actor Supervision tree for error handling.

Master ClusterHA Design

Page 11: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

11

WorkerWorkerWorker

Akka Cluster

Master

standbyMaster

StandbyMaster

State

Goss

ip Gossip

Gossip

Architecture - Master HA (no SPOF)• Akka Cluster for a centerless HA system• Conflict free data types(CRDT) for consistency

CRDT Data type example:

Decentralized: Not rely on single central meta server

leader

Page 12: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

12

Feature HighlightsAkka/Akka-

stream/Stormcompatible

Throughput14 million/s

(*)

2ms Latency(*)

Exactly-once Dynamic DAG Out of OrderMessage

Flexible DSL DAGVisualization

Internet of Thing

function

usability

[*] Test environment: Intel® Xeon™ Processor E5-2690, 4 nodes, each node has 32 CPU cores, and 64GB memory, 10GbE network. We use default configuration of Gearpump 0.7 release. See backup page for more details.We use the SOL workload, message size 100 bytes, (https://github.com/intel-hadoop/storm-benchmark) (tested carried by Intel team)

Page 13: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

13

Using Gearpump

Page 14: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

14

Three steps to use it1. Download binary from http://gearpump.io2. Submit jar by UI3. Monitor Status

Page 15: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

15

Application Submission Flow

Master

Workers1. Submit a Jar

Master

Workers

AppMaster 2. Create AppMaster

Master

Workers

AppMaster

3. Request Resource YARNMaster

Workers

AppMaster Executor Executor Executor

Ask YARN

for Resource1 2

43

154. Report Executor to AppMaster

YARN or without YARN

Page 16: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

16

Low level Graph API - WordCountval context = new ClientContext() val split = Processor[Split](splitParallism) val sum = Processor[Sum](sumParallism) val app = StreamApplication("wordCount", Graph(split ~> sum), UserConfig.empty) val appId = context.submit(app) context.close() class Split(taskContext : TaskContext, conf: UserConfig) extends Task(taskContext, conf) { override def onNext(msg : Message) : Unit = { /* split the line */ } } class Sum (taskContext : TaskContext, conf: UserConfig) extends Task(taskContext, conf) { val count = /**count of words **/ override def onNext(msg : Message) : Unit = {/* do aggregation on word*/} }

Scala Java

Page 17: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

17

High Level DSL API - WordCount

val context = ClientContext() val app = new StreamApp("dsl", context)val data = "This is a good start, bingo!! bingo!!" app.fromCollection(data.lines) // word => (word, count = 1) .flatMap(line => line.split("[\\s]+")).map((_, 1)) // (word, count1), (word, count2) => (word, count1 + count2) .groupByKey().sum.log

val appId = context.submit(app) context.close()

Page 18: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

18

Akka-stream API – WordCountimplicit val system = ActorSystem("akka-test")implicit val materializer = new GearpumpMaterializer(system)val echo = system.actorOf(Props(new Echo()))val sink = Sink.actorRef(echo, "COMPLETE")val source = GearSource.from[String](new CollectionDataSource(lines))source.mapConcat{line => line.split(" ").toList} .groupBy2(x=>x) .map(word => (word, 1)) .reduce {(a, b) => (a._1, a._2 + b._2)} .log("word-count").runWith(sink)

Available at branch https://github.com/gearpump/gearpump/tree/akkastream

Page 19: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

19

DAG PageUI Portal - DAG Visualization

Track global min-Clock of all message DAG:

• Node size reflect throughput• Edge width represents flow rate• Red node means something goes wrong

19

Page 20: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

20

UI Portal – Processor DetailProcessor Page

Data skew distribution Task throughput and latency

Page 21: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

21

Performance optimization

Page 22: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

22

Throughput and LatencyThroughput: 11 million message/second

Latency: 17ms on full load

SOL Shuffle test32 tasks->32 tasks

[*] Test environment: Intel® Xeon™ Processor E5-2680, 4 nodes, each node has 32 CPU cores, and 64GB memory, 10GbE network. (tested carried by Intel team)We use default configuration of Gearpump 0.2 release. We use the SOL workload, message size 100 bytes, (https://github.com/intel-hadoop/storm-benchmark)

Page 23: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

23

Scalability• Test run on 100 nodes(*) and 3000 tasks• Gearpump performance scales:

100 nodes

[*] We use 8 machines to simulate 100 worker nodesTest environment: Intel® Xeon™ Processor E5-2680, each node has 32 CPU cores, and 64GB memory, 10GbE network. (tested carried by Intel team)We use default configuration of Gearpump 0.3.5 release. We use the SOL workload, message size 100 bytes, (https://github.com/intel-hadoop/storm-benchmark)

Page 24: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

24

Fault-Tolerance: Recovery time

[*]: Recovery time is the time interval between: a) failure happen b) all tasks in topology resume processing data.

91 worker nodes, 1000 tasks

Test environment: Intel® Xeon™ Processor E5-2680 91 worker nodes, 1000 tasks (We use 7 machines to simulate 91 worker nodes). Each node has 32 CPU cores, and 64GB memory, 10GbE network. We use default configuration of Gearpump 0.3.5 release. We use the SOL workload (https://github.com/intel-hadoop/storm-benchmark) (by Intel team)

Page 25: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

25

High performance Messaging Layer• Akka remote message has a big overhead, (sender + receiver address)• Reduce 95% overhead (400 bytes to ~20 bytes)

Effective batching

convert from short address

convert to short address

Sync with other executors

Page 26: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

26

Effective batchingNetwork Idle: Flush as fast as we canNetwork Busy: Smart batching until the network is open again.

This feature is ported from Storm-297

Network BandwidthDoubled

For 100 byte per message

Test environment: Same as storm-297

Page 27: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

27

High performance flow Control

Pass back-pressure level-by-levelAbout ~1% throughput impact

TaskTaskTask TaskTaskTask TaskTaskTask TaskTaskTask

Back-pressure Sliding window

Another option(not used): big-loop-feedback flow control

1. NO central ack nodes2. Each level knows

network status, thus can optimize the network at best

Page 28: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

28

Use cases

Page 29: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

29

What is a good use case for Gearpump?

When you want exactly-once message processing, with millisecond latency. When you want to integrate with Akka and Akka-stream transparently. When you want dynamic modification of online DAGWhen you want to connect with IoT edge device, location tranparency.When you want a simple scheduler to distribute customized application, like collecting logs, distributed cron…When you want to use Monoid state like Count-Min Sketch…Besides, it can integrate with: YARN, Kafka, Storm and etc..

Page 30: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

30

IoT Transparent Cloud

Location transparent. Unified programming model across the boundary.

log

Data Center

dag on device side

Target Problem Large gap between edge devices and data center

case: Intelligent Traffic System, 3000 traffic lights, travel time, overspeed detection…

Page 31: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

31

Exactly-once: Financial use casesTarget Problem both real-time and accuracy are important

realtime Stock index

CrawlersProcess Alerts Actions Reports

Rules

TransactionAccountOther

data source

Programing trading

Page 32: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

32

Delete

Transformers: Dynamic DAG

Add

change parallelism online to scale out without message lossadd/remove source/sink processor dynamically

B

Target Problem No existing way to manipulate the DAG on the fly

Each Processor can has its own independent jar

Replace

It can

Page 33: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

33

Eve: Online Machine Learning

• Decide immediately based online learning result

sensor

Learn

Predict

Decide

Input

Output

Target Problem ML train and predict online for real-time decision support.

TAP analytics platform integration: http://trustedanalytics.org/

Page 34: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

34

Gearpump Internals

Page 35: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

35

General ideas

State

Minclock service

Replayable Source DAG

MinClock service track the min timestamp of all pending messages in the system now and future

Message(timestamp)

Every message in the systemHas a application timestamp(message birth time)

Normal Flow

Page 36: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

36

General ideas

State

Minclock service

Replayable Source

Checkpoint Store

Message(timestamp)

④Exactly-once State can be reconstructed by message replay:

①Detect Message loss at Tp

② clock pause at Tp

③ replay from Tp

Recovery Flow

DAG

Page 37: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

37

1. Detect Message loss2. Pause the clock at Tp when message is lost3. Replay from clock Tp4. Exactly-once

Page 38: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

38

Detect Failure in timeAckRequest and Ack to detect Message loss:

Easy to trouble-shootWhen An error happen, we know When Where Why

Master

AppMaster

Executor

Task

Failure

Failure

Failure

Super

vision ch

ain

Page 39: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

39

Executor Executor

AppMaster

TaskTaskTask

Store

Source

Global clock service

① error detected

②Fence zombie

Recover the runtime when machine crashed

Use dynamic session ID to fence zombies

Send message

1. Quarantine

Page 40: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

40

Executor

AppMaster

Task

Store

SourceReplay

Global clock service

① error detected

②isolate zombie

Executor

Task

③ Recover

DAG Recovery: Quarantine and Recover2. Recover the executor JVM, and replay message

Send message

Page 41: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

41

1. Detect Message loss2. Pause the clock at Tp when message is lost3. Replay from clock Tp4. Exactly-once

Page 42: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

42

Application’s Clock Service

Definition: Task min-clock isMinimum of ( min timestamp of pending-messages in current task Task min-Clock of all upstream tasks)

Report task min-clock

Level ClockA

B

C

D

E

Later

Earlier

1000

800

600

400

Ever incremental

Clock of D

Page 43: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

43

1. Detect Message loss2. Pause the clock at Tp when message is lost3. Replay from clock Tp4. Exactly-once

Page 44: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

44

Source-based Message Replay

Normal Flow

Replay from the very-beginning source

Source

Like offset of kafka queue -> message timestamp

Page 45: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

45

Source-based Message ReplayReplay from the very-beginning source

Global Clock Service

①Resolve offset with timestamp Tp

②Replay from offset

Source

Recovery Flow

Page 46: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

46

1. Detect Message loss2. Clock pause at Tp when message is lost3. Replay from clock Tp4. Exactly-once

Page 47: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

47

Checkpoint Store

DAG runtime

Key: Ensure State(t) only contains message(timestamp <= t)

Exactly-once message processing

How?

append only checkpoint

Page 48: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

48

Exactly-once message processingTwo states

StateAccept (t <

Tc)

StateAccept all

Checkpoint Store

Streaming SystemMessages

Normal Flow

Page 49: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

49

Exactly-once message processingTwo states

StateAccept (t <

Tc)

StateAccept all

Checkpoint Store

Streaming SystemMessages

Recover checkpoint in failures

Recovery Flow

Page 50: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

50

How to do Dynamic DAG?Multiple-Version DAG

A B C

B’Message.time >= Tc

Target Problem: Replace processor B with B’ at time Tc

DAG(Version = 0) DAG(Version = 1)transit

Message.time < Tc

NO message loss during the transition

A B C

Page 51: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

51

Demo

Page 52: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

52

Live demo

Page 53: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

53

References• 钟翔 大数据时代的软件架构范式: Reactive 架构及 Akka 实践 , 程序员期刊 2015 年2A期• Gearpump whitepaper http://typesafe.com/blog/gearpump-real-time-streaming-engine-using-akka• 吴甘沙 低延迟流处理系统的逆袭 , 程序员期刊 2013 年10期• Stonebraker http://cs.brown.edu/~ugur/8rulesSigRec.pdf• https://github.com/intel-hadoop/gearpump• Gearpump: https://github.com/intel-hadoop/gearpump• http://highlyscalable.wordpress.com/2013/08/20/in-stream-big-data-processing/• https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap

-machines• Sqlstream http://www.sqlstream.com/customers/• http://www.statsblogs.com/2014/05/19/a-general-introduction-to-stream-processing/• http://www.statalgo.com/2014/05/28/stream-processing-with-messaging-systems/• Gartner report on IOT

http://www.zdnet.com/article/internet-of-things-devices-will-dwarf-number-of-pcs-tablets-and-smartphones/

Page 54: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

54

Legal Disclaimers1 This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.2 Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at [intel.com].3 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.§ For more information go to http://www.intel.com/performance. Intel, the Intel logo, Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.© 2015 Intel Corporation

Page 55: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.

55

Latest Performance evaluation on Gearpump 0.7

Microsoft Word Document

This is the latest performance test on version 0.7. Please see the embedded document on the right to see the configuration details.

Page 56: 1 Gearpump Real time DAG-Processing at Scale Strata Singapore 2015 Sean Zhong Xiang.zhong@intel.com, Intel Software.