Building large scale, job processing systems with Scala Akka Actor framework
The things we don't see – stories of Software, Scala and Akka
Transcript of The things we don't see – stories of Software, Scala and Akka
Konrad `@ktosopl` Malawski @ Scalapeno 2016
The things we don’t see
Exploring the unseen worlds of Scala, Akka and Software in general
Konrad `@ktosopl` Malawski
akka.io
typesafe.com
geecon.org
Java.pl / KrakowScala.pl
sckrk.com / meetup.com/Paper-Cup @ London
GDGKrakow.pl
lambdakrk.pl
“The things we don’t see”
What it means (at least for me, and in this talk)
… in Software in general
… in Scala (programming languages)
… in Akka (concurrency, distributed systems)
The things we don’t see… in Software
A world of tradeoffs
Reality
At the core of it… This talk is about trade-offs.
Measure
What we can’t measure, we can’t improve. And often: simply don’t improve at all.
“ArrayDeque surely will be faster here!” Wha..! Turns out it isn’t, let’s see why…
Measure, to gain understanding of your system
OMG! I changed all the settings to “more”! And it’s not improving…!
Hmm… weird. Where do you see the bottleneck?
Measure, and don’t be that guy
OMG! That lib is 1000x better in serving a file!!! I’ll use kernel bypass networking for my blog!
Have & understand actual performance requirements, don’t invent them based on high frequency trading* talks :-)
* I’m assuming here that’s not your business, if it is – carry on.
Systems, what do they do?!
The things we don’t see… in Scala
A thing about nulls
"I call it my billion-dollar mistake." Sir C. A. R. Hoare, on his invention of the null reference
A thing about nulls
something.calculateSum(2, 2)
What does this code do?
a) return 4
b) NullPointerException!
c) System.exit(0) // though I have a love-hate relationship with this answer…
The curious case of Options
Scala Option – 2007 (Scala 2.5, sic! http://www.scala-lang.org/old/node/165)
Guava Optional – 2011 (since v10.0)
Java Optional – 2014 (since v1.8)
sealed abstract class Option[+A] extends Product with Serializable { self =>
  def isEmpty: Boolean
  def isDefined: Boolean = !isEmpty
  def get: A
  @inline final def getOrElse[B >: A](default: => B): B =
    if (isEmpty) default else this.get
  // ...
}
public final class Optional<T> {
  public boolean isPresent() {
    return value != null;
  }
  public T get() {
    if (value == null) throw new NoSuchElementException("No value present");
    return value;
  }
  public T orElseGet(Supplier<? extends T> other) {
    return value != null ? value : other.get();
  }
  // ...
}
The curious case of Options
val o: Option[String] = ???
o.foreach(_.toUpperCase(Locale.ROOT)) // ok, sure
o match {
  case Some(value) => value.toUpperCase(Locale.ROOT)
  case None => "_"
}
We all have the same “mistake”: get seems innocent, but it’s not…
final Optional<String> optional = Optional.of("");
optional.map(it -> it.toUpperCase(Locale.ROOT));
if (optional.isPresent()) {
  optional.get().toUpperCase();
}
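One way to stay away from get altogether is to always say what happens in the empty case. The sketch below assumes a missing value should render as "_", as in the match above; the object and method names are illustrative and not from the talk.

```scala
import java.util.Locale

object OptionWithoutGet {
  // map transforms the value only if it is present; getOrElse supplies the fallback.
  def shout(o: Option[String]): String =
    o.map(_.toUpperCase(Locale.ROOT)).getOrElse("_")

  // fold states both branches at once: first the empty case, then the defined case.
  def shoutFold(o: Option[String]): String =
    o.fold("_")(_.toUpperCase(Locale.ROOT))
}
```

Neither variant can throw NoSuchElementException, because the empty case is part of the expression rather than a separate check the caller may forget.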
Can we do better than that though?
“What the eyes don’t see, the programmer does not invoke.”
Blocking is the new “you broke the build!”
// BAD! (due to the blocking in Future):
implicit val defaultDispatcher = system.dispatcher

val routes: Route = post {
  complete {
    Future { // uses defaultDispatcher
      Thread.sleep(5000)                  // will block on the default dispatcher,
      System.currentTimeMillis().toString // starving the routing infra
    }
  }
}
http://stackoverflow.com/questions/34641861/akka-http-blocking-in-a-future-blocks-the-server/34645097#34645097
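The usual remedy for the problem above is to run blocking work on its own, separately sized thread pool, so the default dispatcher stays free for routing. The following is a minimal sketch using only the Scala and Java standard libraries; the pool size of 4 and all names are assumptions for illustration. In an Akka application the same role would typically be played by a dedicated, separately configured dispatcher.

```scala
import java.util.concurrent.{ExecutorService, Executors}
import scala.concurrent.{ExecutionContext, Future}

object BlockingIsolation {
  // Hypothetical pool dedicated to blocking calls; the size is an assumption.
  private val pool: ExecutorService = Executors.newFixedThreadPool(4)
  val blockingEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // The sleep still blocks a thread, but only one from the dedicated pool.
  def slowTimestamp(): Future[String] =
    Future {
      Thread.sleep(50)
      System.currentTimeMillis().toString
    }(blockingEc)

  // Pool threads are non-daemon, so shut the pool down when finished.
  def shutdown(): Unit = pool.shutdown()
}
```

The execution context is passed explicitly rather than implicitly, so it is visible at the call site which pool pays for the blocking.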
The curious case of Futures
public class CompletableFuture<T> implements Future<T>, CompletionStage<T> {
  public T get() throws InterruptedException, ExecutionException {
    // ...
  }

  public T get(long timeout, TimeUnit unit)
      throws InterruptedException, ExecutionException, TimeoutException {
    // ...
  }
  // ...
}
The curious case of Futures
Anyone remember the days before scala.concurrent.Future?
Back to the Future, in which we discuss Akka and Twitter Futures in 2012 :-)
https://groups.google.com/forum/?fromgroups=#!topic/akka-user/eXiBV5V7ZzE%5B1-25%5D
trait Future[+T] extends Awaitable[T] {
  // THERE IS NO GET!
  // Closest thing to it is...
  def value: Option[Try[T]]
  // However it’s not that widely known actually
  // Notice that it is non-blocking!
  // ...
}
object Await {
  @throws(classOf[TimeoutException])
  @throws(classOf[InterruptedException])
  def ready[T](awaitable: Awaitable[T], atMost: Duration): awaitable.type =
    blocking(awaitable.ready(atMost)(AwaitPermission))

  @throws(classOf[Exception])
  def result[T](awaitable: Awaitable[T], atMost: Duration): T =
    blocking(awaitable.result(atMost)(AwaitPermission))
}
The curious case of Futures
Java strongly valued fitting in with existing types. This forced exposing the get() method.
Scala had a clean slate, and could keep out of sight the methods we don’t want devs to call, as they’re dangerous.
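To make the non-blocking nature of value concrete, here is a small sketch using only the standard library. The Promise is completed by hand so that no thread pool is involved; the object and field names are illustrative.

```scala
import scala.concurrent.{Future, Promise}
import scala.util.Try

object FutureValueDemo {
  private val promise: Promise[Int] = Promise[Int]()
  val future: Future[Int] = promise.future

  // Not completed yet: value returns immediately instead of waiting.
  val before: Option[Try[Int]] = future.value

  promise.success(42)

  // Completed: the same call now exposes the result, again without blocking.
  val after: Option[Try[Int]] = future.value
}
```

Here before is None and after is Some(Success(42)). Waiting for a result is a separate, explicit step via Await, which is the design point made above.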
“If you don’t see a way at first sight…maybe it was hidden on purpose?”
Hidden scaladoc features
Akka HTTP 2.4.2
We didn’t know about the hidden feature either, actually :-)
Akka HTTP 2.4.4
Started using the scaladoc @group feature.
/**
 * @groupname basic Basic directives
 * @groupprio basic 10
 */
trait BasicDirectives {

  /**
   * @group basic
   */
  def mapInnerRoute(f: Route ⇒ Route): Directive0 =
    Directive { inner ⇒ f(inner(())) }

  /**
   * @group basic
   */
  def mapRequestContext(f: RequestContext ⇒ RequestContext): Directive0 =
    mapInnerRoute { inner ⇒ ctx ⇒ inner(f(ctx)) }
}
// unlocked by: scaladoc -groups
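If the project is built with sbt (an assumption; the talk does not show the build), the flag can be passed to scaladoc from the build definition. A minimal fragment in sbt 1.x slash syntax:

```scala
// build.sbt: pass -groups to scaladoc only (not to the regular compile step)
Compile / doc / scalacOptions += "-groups"
```

With this in place, running the doc task should render members under the @groupname headings declared in the sources.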
Trait representation, in Scala 2.11
trait T { def foo = "bar!" }
class A extends T
class B extends T with SomethingElse
./target/scala-2.11/classes/scalapeno/Example$class.class
./target/scala-2.11/classes/scalapeno/Example.class
$ javap -v ./target/scala-2.11/classes/scalapeno/Example.class
Classfile target/scala-2.11/classes/scalapeno/Example.class
  Last modified 25-Apr-2016; size 475 bytes
  MD5 checksum fe12967b00411cde03bdc57f0ca6869b
  Compiled from "Example.scala"
public interface scalapeno.Example
  minor version: 0
  major version: 50
  flags: ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT
{
  public abstract java.lang.String newMethod();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_ABSTRACT
}
$ javap -v ./target/scala-2.11/classes/scalapeno/Example\$class.class
Classfile target/scala-2.11/classes/scalapeno/Example$class.class
  Last modified 01-May-2016; size 450 bytes
  MD5 checksum ea679b7db9d6275dbe9dff53475000c2
  Compiled from "Example.scala"
public abstract class scalapeno.Example$class
  minor version: 0
  major version: 50
  flags: ACC_PUBLIC, ACC_SUPER, ACC_ABSTRACT
  public static java.lang.String newMethod(scalapeno.Example);
    descriptor: (Lscalapeno/Example;)Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=1, args_size=1
         0: ldc           #9                  // String
         2: areturn
New trait encoding using default methods https://github.com/scala/scala/pull/5003
Trait representation, in Scala 2.12
./target/scala-2.12.0-M4/classes/scalapeno/Example.class
$ javap -v ./target/scala-2.12.0-M4/classes/scalapeno/Example.class
Classfile scala-2.12.0-M4/classes/scalapeno/Example.class
  Last modified 25-Apr-2016; size 684 bytes
  MD5 checksum 07bad6e3b9c5d73b1276e5e7af937e29
  Compiled from "Example.scala"
public interface scalapeno.Example
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT
{
  public java.lang.String newMethod();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: ldc           #12                 // String
         2: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       3     0  this   Lscalapeno/Example;
      LineNumberTable:
        line 4: 0
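The difference between the two javap listings can be approximated in plain source code. The sketch below is a hand-written imitation, not compiler output: the names Example211, Example211Impl, A211, Example212 and A212 are invented for illustration, and the returned string "bar!" is assumed from the earlier trait T example.

```scala
// Scala 2.11 style: the interface carries only the abstract signature...
trait Example211 {
  def newMethod(): String
}

// ...and the body lives in a separate helper taking the instance as a parameter
// (this stands in for the generated Example$class with its static method).
object Example211Impl {
  def newMethod(self: Example211): String = "bar!"
}

// A class mixing in the trait forwards to the helper.
class A211 extends Example211 {
  def newMethod(): String = Example211Impl.newMethod(this)
}

// Scala 2.12 style: the body sits in the interface itself (a JVM default method),
// so the separate helper class goes away.
trait Example212 {
  def newMethod(): String = "bar!"
}

class A212 extends Example212
```

Both variants behave identically from the caller's side; what changes is how many class files are produced and where the method body is stored.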
The things we don’t see… in…
The things we don’t see… in עכו (Akko)?
The things we don’t see in… The Queen of Laponia?!
The things we don’t see… in Akka, the open source project.
Messages! Not method calls.
Akka Actor in one sentence:
Messaging as a core abstraction, not slap-on afterthought.
Messages! Not method calls.
Akka in one sentence:
A toolkit for building highly distributed and concurrent apps.
Messages! Not method calls.
http://c2.com/cgi/wiki?AlanKayOnMessaging
Messages! Not method calls.
Waldo J, Wyant G, Wollrath A, Kendall S. “A Note on Distributed Computing”. Sun Microsystems Laboratories, 1994.
Messages! Not method calls.
Methods:
// locally:
val value: Long = local.calculateSum(2, 2)
// if it’s parallel then we need some middle man to handle concurrency issues hm…

// but remote will have big latency so...
val value: Future[Long] = remote.calculateSum(2, 2)
// Q1: what is actually put on the wire?
// Q2: what about retrying to different host,
//     - now need magic to handle it...
// Q3: can the downstream directly respond to upstream?
//     - ok, so we could build a special method that does this
// Q4: what if the networking breaks...? Do I need to try/catch?

// ... but why, if it could be a simple message send :-)
Messages:
// locally:
local ! CalculateSum(2, 2)
// I'll get a reply in a bit, or I can retry etc etc.
// Actor can be running in parallel, on multi-core etc, same API.
// what can happen?
// JVM can crash => effectively message loss

// remotely:
remote ! CalculateSum(2, 2)
// what can happen?
// message loss
// receiver of the msg can crash... => message loss
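To show the shape of a fire-and-forget send with the reply arriving as another message, here is a toy stand-in built only on the standard library. It is not Akka and has none of its guarantees or features; ToyActor, SumResult and ToyActorDemo are invented names, and the single-threaded executor plays the role of a mailbox.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

final case class CalculateSum(a: Int, b: Int)
final case class SumResult(value: Long)

// Toy stand-in for an actor (NOT Akka): one thread drains the "mailbox",
// so the handler never runs concurrently with itself.
final class ToyActor(handler: Any => Unit) {
  private val mailbox = Executors.newSingleThreadExecutor()

  // Fire-and-forget: enqueues the message and returns immediately.
  def !(msg: Any): Unit =
    mailbox.execute(new Runnable { def run(): Unit = handler(msg) })

  def stop(): Unit = mailbox.shutdown()
}

object ToyActorDemo {
  def run(): Long = {
    val received = new AtomicLong(-1L)
    val done = new CountDownLatch(1)

    // The caller is itself a receiver: the answer comes back as a message.
    val caller = new ToyActor({
      case SumResult(v) => received.set(v); done.countDown()
    })
    val calculator = new ToyActor({
      case CalculateSum(a, b) => caller ! SumResult(a.toLong + b)
    })

    calculator ! CalculateSum(2, 2)
    done.await(5, TimeUnit.SECONDS) // demo only: wait so the result can be returned
    calculator.stop()
    caller.stop()
    received.get()
  }
}
```

The sender never holds a return value or a stack frame of the receiver, which is why the same call shape can work locally and remotely, and also why a lost message looks the same in both cases.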
Messages! Not method calls.
class OneWayProxyActor extends Actor {
  val downstream: ActorRef = ???
  def receive = {
    case StateQuery(q) =>
      sender() ! run(q)

    case req: IncomingRequest =>
      downstream forward transformed(req)
  }
  def run(any: Any) = ???
}
Messages! Not method calls.
What do we not see in these scenarios though?
Local synchronous methods are trivial to trace: that’s what a StackTrace is. But the tradeoff is very high: scalability / resilience – all lost.
Tracing async / distributed systems is not quite solved yet…
Dapper / Zipkin are not new things, they’re the beginning of a journey – not the complete solution…
Messages! Not method calls.
I can totally do the same with REST HTTP calls!
Sure you can (201, 204), but in Akka that’s both the default, and exactly 40 bytes cheap.
Messages! Not exposing state.
Encapsulation, so nice…
Messages! Not exposing state.
class ComplexLogic {
  def add(i: Item): ComplexLogic
  def apply(): Effect
}
But it’s “only for testing”!
Messages! Not exposing state.
class ComplexLogic {
  def add(i: Item): ComplexLogic
  def apply(): Effect
  def state: State // getter
}
Messages! Not exposing state.
val logicActor: ActorRef = ???
// no way to access state - for your own good.

logicActor ! Add(item)
logicActor ! Add(item)
logicActor ! Apply
expectMsgType[StateAppliedSuccessfully]
Test: yes: behaviour and interactions, not: raw state.
“I want my words back.”
Roland Kuhn (former Akka Team lead)
We want our words back…!
Two examples of name-collisions causing confusion:
“Scheduler” & “Stream”
Naming is hard…
“Scheduler” - but is it the right one?
class CountdownActor extends Actor {
  val untilEndOfWorld = 127.days // according to Mayan prophecy

  // oh, great, a scheduler!
  val scheduler = context.system.scheduler
  scheduler.scheduleOnce(untilEndOfWorld, self, EndOfTheWorld)

  def receive = {
    case EndOfTheWorld => println("Aaaaa!!!")
  }
}
http://doc.akka.io/docs/akka/2.4.4/scala/scheduler.html#scheduler-scala
Into the Dungeon
Into the akka.actor.dungeon
Technically LARS is not in the dungeon, but couldn’t stop myself from making this reference…
“Scheduler” - but is it the right one?
It’s a plain “Hashed Wheel Timer” though!
Optimised for:
• high performance
• lockless insertion
• O(1) insertion
• huge amounts of tasks
// timeouts - on each request, ask timeouts
Not for:
• preciseness
• persistence
Original white paper: Hashed and Hierarchical Timing Wheels:
https://pdfs.semanticscholar.org/0a14/2c84aeccc16b22c758cb57063fe227e83277.pdf
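The properties listed above follow from how a hashed wheel works: a task is dropped into a bucket chosen by its deadline modulo the wheel size, and only the current bucket is inspected on each tick. The sketch below is a deliberately simplified, single-threaded illustration of that idea and not Akka's or Netty's implementation; ToyWheelTimer and its members are invented names, time is advanced by calling tick() by hand, and there is no lock-free insertion.

```scala
import scala.collection.mutable

final class ToyWheelTimer(wheelSize: Int) {
  require(wheelSize > 0, "wheelSize must be positive")

  // remainingRounds = how many more full wheel revolutions to wait.
  private final case class Task(var remainingRounds: Int, run: () => Unit)

  private val buckets = Array.fill(wheelSize)(mutable.ListBuffer.empty[Task])
  private var currentTick = 0L

  // O(1) insertion: compute the bucket and the number of rounds, then append.
  def schedule(delayTicks: Int)(run: () => Unit): Unit = {
    require(delayTicks >= 1, "delay must be at least one tick")
    val bucketIndex = ((currentTick + delayTicks) % wheelSize).toInt
    val rounds = (delayTicks - 1) / wheelSize
    buckets(bucketIndex) += Task(rounds, run)
  }

  // Advances time by one tick; precision is therefore limited to the tick length.
  def tick(): Unit = {
    currentTick += 1
    val bucket = buckets((currentTick % wheelSize).toInt)
    val (due, later) = bucket.partition(_.remainingRounds == 0)
    bucket.clear()
    later.foreach { task =>
      task.remainingRounds -= 1
      bucket += task
    }
    due.foreach(_.run())
  }
}
```

Tasks only fire on tick boundaries, which is the “not for preciseness” trade-off, and everything lives in memory, which is the “not for persistence” one.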
Akka’s Scheduler fun facts!
• Akka impl. after Netty implementation
• Akka impl. improved performance quite a bit
• Netty pulled-in the optimisations
(just one of multiple (both way) healthy interactions) => Yay, healthy Open Source ecosystem!
Error message you’ll likely never see:
LightArrayRevolverScheduler, for short: “LARS”
“LARS cannot start new thread, ship’s going down!”
Persistent, reliable job scheduler: Chronos
Ok, so what do we need?
• Persistence
• Replication
• Consensus on who executes
Persistent, reliable job scheduler: Chronos
Optimised for:
• “Like CRON, but distributed”
• Visibility, including management UI
• Retries of failed tasks
• Far-out tasks (usually measured in days etc)
• Consensus on who executes
• Also considered lightweight
  • 2k lines of Scala
  • in its context… well, it is!
https://www.youtube.com/watch?v=FLqURrtS8IA
Persistent, reliable job scheduler: Chronos
Definitely NOT for:
• millions of tasks inserted per second
• to be run in the next second
… that’s what the Akka scheduler is for!
Notable mention: Quartz (well known long-term scheduler, with Akka integration and persistence)
“Stream”
What does it mean?!
* when put in “” the word does not appear in project name, but is present in examples / style of APIs / wording.
Suddenly everyone jumped on the word “Stream”.
Akka Streams / Reactive Streams started end-of-2013.
“Streams”
Akka Streams
Reactive Streams
RxJava “streams”*
Spark Streaming
Apache Storm “streams”*
Java Streams (JDK8)
Reactor “streams”*
Kafka Streams
ztellman / Manifold (Clojure)
Apache GearPump “streams”
Apache [I] Streams (!)
Apache [I] Beam “streams”
Apache [I] Quarks “streams”
Apache [I] Airflow “streams” (dead?)
Apache [I] Samza
Scala Stream
Scalaz Streams, now known as FS2
Swave.io
Java InputStream / OutputStream / … :-)
A core feature not obvious to the untrained eye…!
Akka Streams / HTTP
Quiz time! TCP is a ______ protocol?
Quiz time! TCP is a STREAMING protocol!
Streaming from Akka HTTP
No demand from TCP
= No demand upstream
= Source won’t generate tweets
=> Bounded memory stream processing!
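The chain above (“no demand means nothing is generated”) can be illustrated by analogy with a plain lazy Iterator from the standard library. This is not Akka Streams and involves no asynchronous back-pressure signalling; it only shows the pull-driven principle. DemandDrivenDemo and its members are invented names.

```scala
object DemandDrivenDemo {
  // Counts how many elements the source was actually asked to produce.
  var generated = 0

  // A lazy, potentially infinite source: nothing is produced until someone pulls.
  def tweets(): Iterator[String] =
    Iterator.continually {
      generated += 1
      s"tweet-$generated"
    }

  // The consumer demands exactly three elements, so exactly three are generated.
  def run(): List[String] = tweets().take(3).toList
}
```

Because production is driven by the consumer, the source never runs ahead of it, so memory use stays bounded even though the source is conceptually infinite.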
Binary Compatibility A.K.A. “The thing you think you want, but you actually want much more”
Brian Goetz, Java Language Architect, “Stewardship: The sobering parts” https://www.youtube.com/watch?v=2y5Pv4yN0b0
Compatibilities intertwined
Scala versioning scheme is: Epoch.Major.Minor
Technically Java too is 1.6, 1.7, 1.8 by the way.
Akka releases closely track Scala versions.
Serialization Compatibility
=>
Akka Persistence, (experimental in 2.3.x)
Experimental version accidentally used Java Serialization for Scala’s Option.
Akka is now released for both Scala 2.10 and 2.11.
Scala’s Option changed its serialised form slightly.
Akka Persistence is geared towards long-term storage of events – even if experimental, we did not want to block anyone from upgrading to 2.11…!
Serialization Compatibility =>
Akka Persistence, in this specific place, provides Serialization compatibility of raw Scala where Scala was allowed to break it.
Because we care about you :-)
// If we are allowed to break serialization compatibility of stored snapshots in 2.4
// we can remove this attempt to deserialize SnapshotHeader with JavaSerializer.
// Then the class SnapshotHeader can be removed. See issue #16009
val oldHeader =
  if (readShort(in) == 0xedac) { // Java Serialization magic value with swapped bytes
    val b = if (SnapshotSerializer.doPatch) patch(headerBytes) else headerBytes
    serialization.deserialize(b, classOf[SnapshotHeader]).toOption
  } else None
Serialization Compatibility
Plenty of work goes completely un-seen, “if it works as it should, you don’t see it”.
The Scala ecosystem has matured to the point that many projects really care about and tend to those aspects.
“Why don’t we just magically…”
“In terms of magic in Akka…we prefer to have None of it.”
quoting myself in silly attempt to coin a phrase (-:
nums.foldingLeft(0)(_ + _) // Summing up:
A hidden force behind all our successes…
Before we sum up:
You!
The Scala community as a whole,
thank you!
Everything is a tradeoff. If you can’t see the tradeoff, look again.
Some things… you don’t need to see, some things you do.
Pick wisely.
Don’t believe in “oh yeah it’s magic”. Seek deeper understanding of things.
Keep learning: “Life is Study!”
Summing up
תודה רבה! (Thank you very much!)
Thanks for listening!
ktoso @ typesafe.com
twitter: ktosopl
github: ktoso
team blog: letitcrash.com
home: akka.io
Thus spake the Master Programmer:
“After three days without programming, life becomes meaningless.”
Q/A (Now’s the time to ask things!)
ktoso @ lightbend.com
twitter: ktosopl
github: ktoso
blog: kto.so
home: akka.io