Reactive Streams / Akka Streams - GeeCON Prague 2014

Post on 14-Jun-2015

2.011 views 1 download

Tags:

description

Talk explaining the core concepts behind reactive streams.

Transcript of Reactive Streams / Akka Streams - GeeCON Prague 2014

Konrad 'ktoso' MalawskiGeeCON 2014 @ Kraków, PL

Konrad `@ktosopl` Malawski

Akka Streams

OST: Tomas Dvorak ./‘ ./‘

Konrad `@ktosopl` Malawski Akka Team,

Reactive Streams TCK

hAkker @

Konrad `@ktosopl` Malawski

typesafe.comgeecon.org

Java.pl / KrakowScala.plsckrk.com / meetup.com/Paper-Cup @ London

GDGKrakow.pl meetup.com/Lambda-Lounge-Krakow

hAkker @

You?

?

Czech User Group?

You?

You?

?

z ?

You?

?

z ?

?

You?

?

z ?

?

?

Streams

Streams

Streams“You cannot enter the same river twice”

~ Heraclitus

http://en.wikiquote.org/wiki/Heraclitus

StreamsReal Time Stream Processing

When you attach “late” to a Publisher,you may miss initial elements – it’s a river of data.

http://en.wikiquote.org/wiki/Heraclitus

Reactive Streams

Reactive Streams

Stream processing

Reactive Streams

Back-pressured

Stream processing

Reactive Streams

Back-pressured Asynchronous

Stream processing

Reactive Streams

Back-pressured Asynchronous

Stream processing Standardised (!)

Reactive Streams: Goals

1. Back-pressured Asynchronous Stream processing

2. Standard implemented by many libraries

Reactive Streams - Specification & TCK

http://reactive-streams.org

Reactive Streams - Who?

http://reactive-streams.org

Kaazing Corp.rxJava @ Netflix,

reactor @ Pivotal (SpringSource),vert.x @ Red Hat,

Twitter,akka-streams @ Typesafe,

spray @ Spray.io,Oracle,

java (?) – Doug Lea - SUNY Oswego …

Reactive Streams - Inter-op

http://reactive-streams.org

We want to make different implementations co-operate with each other.

Reactive Streams - Inter-op

http://reactive-streams.org

The different implementations “talk to each other”using the Reactive Streams protocol.

Reactive Streams - Inter-op

http://reactive-streams.org

The Reactive Streams SPI is NOT meant to be user-api.You should use one of the implementing libraries.

What is back-pressure?

Back-pressure? Example Without

Publisher[T] Subscriber[T]

Back-pressure? Example Without

Fast Publisher Slow Subscriber

Back-pressure?“Why would I need that!?”

Back-pressure? Push + NACK model

Back-pressure? Push + NACK model

Subscriber usually has some kind of buffer.

Back-pressure? Push + NACK model

Back-pressure? Push + NACK model

Back-pressure? Push + NACK model

What if the buffer overflows?

Back-pressure? Push + NACK model (a)

Use bounded buffer, drop messages + require re-sending

Back-pressure? Push + NACK model (a)

Kernel does this!Routers do this!

(TCP)

Use bounded buffer, drop messages + require re-sending

Back-pressure? Push + NACK model (b)Increase buffer size… Well, while you have memory available!

Back-pressure? Push + NACK model (b)

Back-pressure?NACKing is NOT enough.

Negative ACKnowledgement

Back-pressure? Example NACKingBuffer overflow is imminent!

Back-pressure? Example NACKingTelling the Publisher to slow down / stop sending…

Back-pressure? Example NACKingNACK did not make it in time,

because M was in-flight!

Back-pressure?

speed(publisher) < speed(subscriber)

Back-pressure? Fast Subscriber, No Problem

No problem!

Back-pressure?Reactive-Streams

= “Dynamic Push/Pull”

Just push – not safe when Slow Subscriber

Just pull – too slow when Fast Subscriber

Back-pressure? RS: Dynamic Push/Pull

Solution:Dynamic adjustment

Back-pressure? RS: Dynamic Push/Pull

Just push – not safe when Slow Subscriber

Just pull – too slow when Fast Subscriber

Back-pressure? RS: Dynamic Push/PullSlow Subscriber sees it’s buffer can take 3 elements. Publisher will never blow up it’s buffer.

Back-pressure? RS: Dynamic Push/PullFast Publisher will send at-most 3 elements. This is pull-based-backpressure.

Back-pressure? RS: Dynamic Push/Pull

Fast Subscriber can issue more Request(n), before more data arrives!

Back-pressure? RS: Dynamic Push/PullFast Subscriber can issue more Request(n), before more data arrives.

Publisher can accumulate demand.

Back-pressure? RS: Accumulate demandPublisher accumulates total demand per subscriber.

Back-pressure? RS: Accumulate demandTotal demand of elements is safe to publish. Subscriber’s buffer will not overflow.

Back-pressure? RS: Requesting “a lot”Fast Subscriber can issue arbitrary large requests, including “gimme all you got” (Long.MaxValue)

Back-pressure? RS: Dynamic Push/Pull

MAX

speed

Back-pressure? RS: Dynamic Push/Pull

Easy

MAX

speed

How does fit all this?

Akka

Akka has multiple modules:

akka-actor: actors (concurrency abstraction)akka-camel: integrationakka-remote: remote actorsakka-cluster: clusteringakka-persistence: CQRS / Event Sourcingakka-streams: stream processing…

AkkaAkka is a high-performance concurrency library for Scala and Java.

At it’s core it focuses on the Actor Model:

An Actor can only: • Send and receive messages• Create Actors• Change it’s behaviour

AkkaAkka is a high-performance concurrency library for Scala and Java.

At it’s core it focuses on the Actor Model:

class Player extends Actor {

def receive = { case NextTurn => sender() ! decideOnMove() }

def decideOnMove(): Move = ???}

Akka

Akka

Akka has multiple modules:

akka-actor: actors (concurrency abstraction)akka-camel: integrationakka-remote: remote actorsakka-cluster: clusteringakka-persistence: CQRS / Event Sourcingakka-streams: stream processing…

Akka Streams0.9 early preview

Akka Streams – Linear Flow

Akka Streams – Linear Flow

Akka Streams – Linear Flow

Akka Streams – Linear Flow

Akka Streams – Linear Flow

Flow[Double].map(_.toInt). [...]

No Source attached yet.“Pipe ready to work with Doubles”.

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")

An ActorSystem is the world in which Actors live in.AkkaStreams uses Actors, so it needs ActorSystem.

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

Contains logic on HOW to materialise the stream.

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

A materialiser chooses HOW to materialise a Stream.

The Flow’s AST is fully “lifted”.The Materialiser can choose to materialise the Flow in any way it sees fit.

Our implementation uses Actors.But you could easily plug in an SparkMaterializer!

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

You can configure it’s buffer sizes etc.

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

val foreachSink = Sink.foreach[Int](println)val mf = Source(1 to 3).runWith(foreachSink)

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

val foreachSink = Sink.foreach[Int](println)val mf = FlowFrom(1 to 3).runWith(foreachSink)(mat)

Uses the implicit FlowMaterializer

Akka Streams – Linear Flow

implicit val sys = ActorSystem("tokyo-sys")implicit val mat = FlowMaterializer()

// sugar for runWithSource(1 to 3).foreach(println)

Akka Streams – Linear Flow

val mf = Flow[Int]. map(_ * 2). runWith(Sink.foreach(println)) // is missing a Source, // can NOT run == won’t compile!

Akka Streams – Linear Flow

val f = Flow[Int]. map(_ * 2).

runWith(Sink.foreach(i => println(s"i = $i”))). // needs Source to run!

Akka Streams – Linear Flow

val f = Flow[Int]. map(_ * 2).

runWith(Sink.foreach(i => println(s"i = $i”))). // needs Source to run!

Akka Streams – Linear Flow

val f = Flow[Int]. map(_ * 2).

runWith(Sink.foreach(i => println(s"i = $i”))). // needs Source to run!

Akka Streams – Linear Flow

val f = Flow[Int]. map(_ * 2).

runWith(Sink.foreach(i => println(s"i = $i”))). // needs Source to run!

f.connect(Source(1 to 10)).run()

Akka Streams – Linear Flow

val f = Flow[Int]. map(_ * 2).

runWith(Sink.foreach(i => println(s"i = $i”))). // needs Source to run!

f.connect(Source(1 to 10)).run()

With a Source attached… it can run()

Akka Streams – Linear Flow

Flow[Int]. map(_.toString). runWith(Source(1 to 10), Sink.ignore)

Connects Source and Sink, then runs

Akka Streams – Flows are reusable

f.withSource(IterableSource(1 to 10)).run() f.withSource(IterableSource(1 to 100)).run() f.withSource(IterableSource(1 to 1000)).run()

Akka Streams <-> Actors – Advanced

val subscriber = ActorSubscriber( system.actorOf(Props[SubStreamParent], ”parent”))

Source(1 to 100). map(_.toString). filter(_.length == 2). drop(2). groupBy(_.last). runWith(subscriber)

Akka Streams <-> Actors – Advanced

Each “group” is a stream too! It’s a “Stream of Streams”.

val subscriber = ActorSubscriber( system.actorOf(Props[SubStreamParent], ”parent”))

Source(1 to 100). map(_.toString). filter(_.length == 2). drop(2). groupBy(_.last). runWith(subscriber)

Akka Streams <-> Actors – Advanced

groupBy(_.last).

GroupBy groups “11” to group “1”, “12” to group “2” etc.

Akka Streams <-> Actors – Advanced

groupBy(_.last).

It offers (groupKey, subStreamSource) to Subscriber

Source

Akka Streams <-> Actors – Advanced

groupBy(_.last).

It can then start children, to handle the sub-flows!

Source

Akka Streams <-> Actors – Advanced

groupBy(_.last).

For example, one child for each group.

Source

Akka Streams <-> Actors – Advanced

val subscriber = ActorSubscriber( system.actorOf(Props[SubStreamParent], ”parent”))

Source(1 to 100). map(_.toString). filter(_.length == 2). drop(2). groupBy(_.last). runWith(subscriber)

The Actor, will consume SubStream offers.

Consuming streams with Actors(advanced)

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber with ImplicitFlowMaterializer with ActorLogging {

override def requestStrategy = OneByOneRequestStrategy

override def receive = { case OnNext((groupId: String, subStream: Source[String])) =>

val subSub = context.actorOf(Props[SubStreamSubscriber], s"sub-$groupId") subStream.runWith(Sink.subscriber(ActorSubscriber(subSub))) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber with ImplicitFlowMaterializer with ActorLogging {

override def requestStrategy = OneByOneRequestStrategy

override def receive = { case OnNext((groupId: String, subStream: Source[String])) =>

val subSub = context.actorOf(Props[SubStreamSubscriber], s"sub-$groupId") subStream.runWith(Sink.subscriber(ActorSubscriber(subSub))) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber with ImplicitFlowMaterializer with ActorLogging {

override def requestStrategy = OneByOneRequestStrategy

override def receive = { case OnNext((groupId: String, subStream: Source[String])) =>

val subSub = context.actorOf(Props[SubStreamSubscriber], s"sub-$groupId") subStream.runWith(Sink.subscriber(ActorSubscriber(subSub))) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber with ImplicitFlowMaterializer with ActorLogging {

override def requestStrategy = OneByOneRequestStrategy

override def receive = { case OnNext((groupId: String, subStream: Source[String])) =>

val subSub = context.actorOf(Props[SubStreamSubscriber], s"sub-$groupId") subStream.runWith(Sink.subscriber(ActorSubscriber(subSub))) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber with ImplicitFlowMaterializer with ActorLogging {

override def requestStrategy = OneByOneRequestStrategy

override def receive = { case OnNext((groupId: String, subStream: Source[String])) =>

val subSub = context.actorOf(Props[SubStreamSubscriber], s"sub-$groupId") subStream.runWith(Sink.subscriber(ActorSubscriber(subSub))) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber {

override def requestStrategy = WatermarkRequestStrategy(highWatermark = 10)

override def receive = { case OnNext(n: String) => println(s”n = $n”) }}

Akka Streams <-> Actors – Advanced

class SubStreamParent extends ActorSubscriber {

override def requestStrategy = WatermarkRequestStrategy(highWatermark = 10)

override def receive = { case OnNext(n: String) => println(s”n = $n”) }}

Akka Streams – FlowGraph

FlowGraph

Akka Streams – FlowGraph

Linear Flows or

non-akka pipelines

Could be another RS implementation!

Akka Streams – GraphFlow

Fan-out elements and

Fan-in elements

Akka Streams – GraphFlow

// first define some pipeline piecesval f1 = Flow[Input].map(_.toIntermediate)val f2 = Flow[Intermediate].map(_.enrich)val f3 = Flow[Enriched].filter(_.isImportant)val f4 = Flow[Intermediate].mapFuture(_.enrichAsync)

// then add input and output placeholdersval in = SubscriberSource[Input]val out = PublisherSink[Enriched]

Akka Streams – GraphFlow

Akka Streams – GraphFlowval b3 = Broadcast[Int]("b3")val b7 = Broadcast[Int]("b7")val b11 = Broadcast[Int]("b11")val m8 = Merge[Int]("m8")val m9 = Merge[Int]("m9")val m10 = Merge[Int]("m10")val m11 = Merge[Int]("m11")val in3 = Source(List(3))val in5 = Source(List(5))val in7 = Source(List(7))

Akka Streams – GraphFlow

Akka Streams – GraphFlow

// First layerin7 ~> b7b7 ~> m11b7 ~> m8

in5 ~> m11

in3 ~> b3b3 ~> m8b3 ~> m10

Akka Streams – GraphFlow

// Second layerm11 ~> b11b11 ~> Flow[Int].grouped(1000) ~> resultFuture2 b11 ~> m9b11 ~> m10

m8 ~> m9

Akka Streams – GraphFlow

// Third layerm9 ~> Flow[Int].grouped(1000) ~> resultFuture9m10 ~> Flow[Int].grouped(1000) ~> resultFuture10

Akka Streams – GraphFlow

// Third layerm9 ~> Flow[Int].grouped(1000) ~> resultFuture9m10 ~> Flow[Int].grouped(1000) ~> resultFuture10

Akka Streams – GraphFlow

// Third layerm9 ~> Flow[Int].grouped(1000) ~> resultFuture9m10 ~> Flow[Int].grouped(1000) ~> resultFuture10

Akka Streams – GraphFlow

val resultFuture2 = Sink.future[Seq[Int]]val resultFuture9 = Sink.future[Seq[Int]]val resultFuture10 = Sink.future[Seq[Int]]

val g = FlowGraph { implicit b => // ... m10 ~> Flow[Int].grouped(1000) ~> resultFuture10 // ...}.run()

Await.result(g.get(resultFuture2), 3.seconds).sorted should be(List(5, 7))

Sinks and Sources are “keys”which can be addressed within the graph

Akka Streams – GraphFlow

val resultFuture2 = Sink.future[Seq[Int]]val resultFuture9 = Sink.future[Seq[Int]]val resultFuture10 = Sink.future[Seq[Int]]

val g = FlowGraph { implicit b => // ... m10 ~> Flow[Int].grouped(1000) ~> resultFuture10 // ...}.run()

Await.result(g.get(resultFuture2), 3.seconds).sorted should be(List(5, 7))

Sinks and Sources are “keys”which can be addressed within the graph

Akka Streams – GraphFlow

val g = FlowGraph {}

FlowGraph is immutable and safe to share and re-use!

Available Elements0.9 early preview

Available Sources• FutureSource• IterableSource• IteratorSource• PublisherSource• SubscriberSource• ThunkSource• TickSource (timer based)• … easy to add your own!

0.9 early preview

Available operations• drop / dropWithin• take / takeWithin• filter• groupBy• grouped• map• prefixAndTail• … easy to add your own!

0.9 early preview

“Rate – detaching” operations:• buffer• collect• concat• conflate

Available Sinks• BlackHoleSink• FoldSink• ForeachSink• FutureSink• OnCompleteSink• PublisherSink / FanoutPublisherSink• SubscriberSink• … easy to add your own!

0.9 early preview

Available Junctions• Broadcast• Merge

• FlexiMerge• Zip• Unzip• Concat• … easy to add your own!

0.9 early preview

Akka-StreamsComing soon!

Rough plans

• 0.9 released

• 0.10 in 2~3 weeks

• 1.0 “soon” after…• Means “stabilised APIs”• Will not yet be performance tuned, though it’s already

pretty good… we know where and how we can tune it for 1.x.

0.7 early preview

Java DSL

• Partial Java DSL in 0.9 (released)

• Full Java DSL in 0.10 (in 2~3 weeks)• as 1st class citizen (!)

0.7 early preview

Spray => Akka-Http && ReactiveStreams

Spray is now merged into Akka, as Akka-HttpWorks on Reactive Streams

Streaming end-to-end!

Links• http://akka.io • http://reactive-streams.org

• https://groups.google.com/group/akka-user

• 0.7 release http://akka.io/news/2014/09/12/akka-streams-0.7-released.html

• javadsl https://github.com/akka/akka/pulls?q=is%3Apr+javadsl

Děkuji vám!Dzięki!ありがとう!Ask questions,get Stickers! (for real!)

http://akka.io

ktoso @ typesafe.com twitter: ktosoplgithub: ktosoteam blog: letitcrash.com

©Typesafe 2014 – All Rights Reserved