Akka in Production: Our Story

Akka in Production Our Story Evan Chan PNWScala 2013 Saturday, October 19, 13

description

Everyone in the Scala world is using or looking into using Akka for low-latency, scalable, distributed or concurrent systems. We want to share our story of developing and productionizing multiple Akka apps, including low-latency ingestion and real-time processing systems, and Spark-based applications. When does one use actors vs futures? Why did we go with Logback instead of Akka's built-in logging? Can we use Akka with, or in place of, Storm? How did we set up instrumentation and monitoring in production? How does one use VisualVM to debug Akka apps in production? What happens if the mailbox gets full? What is our Akka stack like? We will share best practices that we've discovered when building Akka and Scala apps, pitfalls and things we'd like to avoid, and a vision of where we would like to go for ideal Akka monitoring, instrumentation, and debugging facilities.

Transcript of Akka in Production: Our Story

Page 1: Akka in Production: Our Story

Akka in Production: Our Story

Evan Chan, PNWScala 2013

Saturday, October 19, 13

Page 2: Akka in Production: Our Story

• Staff Engineer, Compute and Data Services, Ooyala

• Building multiple web-scale real-time systems on top of C*, Kafka, Storm, etc.

• github.com/velvia

• Author of ScalaStorm, Scala DSL for Storm

• @evanfchan

Who is this guy?

Page 3: Akka in Production: Our Story

WANT REACTIVE?

event-driven, scalable, resilient and responsive

Page 4: Akka in Production: Our Story

SCALA AND AKKA AT OOYALA

Page 5: Akka in Production: Our Story

Founded in 2007

Commercially launched in 2009

230+ employees in Silicon Valley, LA, NYC, London, Paris, Tokyo, Sydney & Guadalajara

Global footprint, 200M unique users, 110+ countries, and more than 6,000 websites

Over 1 billion videos played per month and 2 billion analytic events per day

25% of U.S. online viewers watch video powered by Ooyala

COMPANY OVERVIEW

Page 6: Akka in Production: Our Story

How we started using Scala

• Ooyala was a mostly Ruby company - even MR jobs

• Lesson - don’t use Ruby for big data

• Started exploring Scala for real-time analytics and MR

• Realized a performance boost of 1-2 orders of magnitude from Scala

• Today we use Scala and Akka with Storm, Spark, MR, and Cassandra in all new big data pipelines

Page 7: Akka in Production: Our Story

Ingesting 2 Billion Events / Day

Nginx Raw Log Feeder Kafka

Storm

New Stuff

Consumer watches video

Page 8: Akka in Production: Our Story

Livelogsd - Akka/Kafka file tailer

Current File

Rotated File

Rotated File 2

File Reader Actor

File Reader Actor

Kafka Feeder

Coordinator

Kafka

Page 9: Akka in Production: Our Story

Storm - with or without Akka?

Kafka Spout

Bolt

Actor

Actor

• Actors talking to each other within a bolt for locality

• Don’t really need Actors in Storm

• In production, found Storm too complex to troubleshoot

• It’s 2am - what should I restart? Supervisor? Nimbus? ZK?

Page 10: Akka in Production: Our Story

Akka Cluster-based Pipeline

[Diagram: several cluster nodes, each with a Kafka Consumer, a Spray endpoint, a Cluster Router, and Processing Actors]

Page 11: Akka in Production: Our Story

Lessons Learned

• Still too complex -- would we want to get paged for this system?

• Akka cluster in 2.1 was not ready for production (newer 2.2.x version is stable)

• Mixture of actors and futures for HTTP requests became hard to grok

• Actors were much easier for most developers to understand

Page 12: Akka in Production: Our Story

Simplified Ingestion Pipeline

Kafka Partition 1 -> Kafka SimpleConsumer -> Converter Actor -> Cassandra Writer Actor

Kafka Partition 2 -> Kafka SimpleConsumer -> Converter Actor -> Cassandra Writer Actor

• Kafka used to partition messages

• Single process - super simple!

• No distribution of data

• Linear actor pipeline - very easy to understand

Page 13: Akka in Production: Our Story

STACKABLE ACTOR TRAITS

Page 14: Akka in Production: Our Story

Why Stackable Traits?

• Adding monitoring, logging, metrics, and tracing code over and over gets pretty ugly and repetitive

• We want some standard behavior around actors -- but we need to wrap the actor Receive block:

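    import akka.actor.Actor

    class SomeActor extends Actor {
      // The actor's real message handling
      def wrappedReceive: Receive = {
        case x => () // handle the message here
      }

      // Wraps the real handler with common before/after behavior
      def receive = {
        case x =>
          println("Do something before...")
          wrappedReceive(x)
          println("Do something after...")
      }
    }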

Page 15: Akka in Production: Our Story

Start with a base trait...

    import akka.actor.Actor

    trait ActorStack extends Actor {
      /** Actor classes should implement this partialFunction for standard
       *  actor message handling */
      def wrappedReceive: Receive

      /** Stackable traits should override and call super.receive(x) for
       *  stacking functionality */
      def receive: Receive = {
        case x => if (wrappedReceive.isDefinedAt(x)) wrappedReceive(x) else unhandled(x)
      }
    }

Page 16: Akka in Production: Our Story

Instrumenting Traits...

    trait Instrument1 extends ActorStack {
      override def receive: Receive = {
        case x =>
          println("Do something before...")
          super.receive(x)
          println("Do something after...")
      }
    }

    trait Instrument2 extends ActorStack {
      override def receive: Receive = {
        case x =>
          println("Antes...")
          super.receive(x)
          println("Despues...")
      }
    }

Page 17: Akka in Production: Our Story

Now just mix the Traits in....

    class DummyActor extends Actor with Instrument1 with Instrument2 {
      def wrappedReceive = {
        case "something" => println("Got something")
        case x           => println("Got something else: " + x)
      }
    }

• Traits add instrumentation; Actors stay clean!

• Order of mixing in traits matters

Output when the actor receives "something":

    Antes...
    Do something before...
    Got something
    Do something after...
    Despues...
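
To see why the mix-in order matters, here is a minimal sketch that models the stacking pattern with plain Scala traits and no Akka dependency. The names `Stack`, `Handler`, `Order12`, `Order21`, and `trace` are hypothetical stand-ins introduced for this illustration; the recorded strings replace the `println` calls so the order can be inspected.

```scala
import scala.collection.mutable.ArrayBuffer

object StackOrderDemo {
  type Receive = PartialFunction[Any, Unit]

  // Stand-in for ActorStack: delegates to the handler supplied by the class.
  trait Stack {
    val trace: ArrayBuffer[String] = ArrayBuffer.empty[String]
    def wrappedReceive: Receive
    def receive: Receive = {
      case x => if (wrappedReceive.isDefinedAt(x)) wrappedReceive(x)
    }
  }

  trait Instrument1 extends Stack {
    override def receive: Receive = {
      case x =>
        trace += "before"
        super.receive(x)
        trace += "after"
    }
  }

  trait Instrument2 extends Stack {
    override def receive: Receive = {
      case x =>
        trace += "antes"
        super.receive(x)
        trace += "despues"
    }
  }

  // Shared "business logic" handler.
  trait Handler extends Stack {
    def wrappedReceive: Receive = {
      case msg => trace += ("got " + msg)
    }
  }

  // Same traits, opposite mix-in order.
  class Order12 extends Handler with Instrument1 with Instrument2
  class Order21 extends Handler with Instrument2 with Instrument1

  def run(actor: Stack): List[String] = {
    actor.receive("something")
    actor.trace.toList
  }

  val order12: List[String] = run(new Order12)
  val order21: List[String] = run(new Order21)
}
```

The trait mixed in last is the outermost wrapper: it runs first on the way in and last on the way out, so `order12` starts with "antes" while `order21` starts with "before".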

Page 18: Akka in Production: Our Story

PRODUCTIONIZING AKKA

Page 19: Akka in Production: Our Story

Our Akka Stack

• Spray - high performance HTTP

• SLF4J / Logback

• Yammer Metrics

• spray-json

• Akka 2.x

• Scala 2.9 / 2.10

Page 20: Akka in Production: Our Story

On distributed systems: “The only thing that matters is Visibility”

Page 21: Akka in Production: Our Story

Using Logback with Akka

• Pretty easy setup

• Include the Logback jar

• In your application.conf: event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]

• Use a custom logging trait, not ActorLogging

• ActorLogging does not allow adjustable logging levels

• Want the Actor path in your messages?

• org.slf4j.MDC.put("actorPath", self.path.toString)

Page 22: Akka in Production: Our Story

Using Logback with Akka

    import akka.actor.Actor
    import org.slf4j.LoggerFactory

    trait Slf4jLogging extends Actor with ActorStack {
      val logger = LoggerFactory.getLogger(getClass)
      private[this] val myPath = self.path.toString

      logger.info("Starting actor " + getClass.getName)

      override def receive: Receive = {
        case x =>
          // Tag log lines emitted while handling x with this actor's path
          org.slf4j.MDC.put("akkaSource", myPath)
          super.receive(x)
      }
    }

Page 23: Akka in Production: Our Story

Akka Performance Metrics

• We define a trait that adds two metrics for every actor:

• frequency of messages handled (1min, 5min, 15min moving averages)

• time spent in receive block

• All metrics exposed via a Spray route /metricz

• Daemon polls /metricz and sends to metrics service

• Would like: mailbox size, but this is hard
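
As an illustration of what a 1, 5, or 15 minute moving average of message rate can look like, here is a minimal stdlib-only sketch of an exponentially weighted moving average. This is an assumption about the general technique, not the metrics library's code: the class `EwmaRate`, its parameters, and the fixed tick interval are all hypothetical, and the sketch is not thread-safe.

```scala
// Hypothetical EWMA rate tracker; not the Yammer Metrics API.
final class EwmaRate(windowSeconds: Double, tickSeconds: Double) {
  // Smoothing factor derived from the window length and the tick interval.
  private val alpha = 1.0 - math.exp(-tickSeconds / windowSeconds)
  private var uncounted = 0L // messages seen since the last tick
  private var initialized = false
  private var ratePerSec = 0.0

  /** Record one handled message. */
  def mark(): Unit = uncounted += 1

  /** Call once every tickSeconds to fold the latest count into the average. */
  def tick(): Unit = {
    val instantRate = uncounted / tickSeconds
    uncounted = 0L
    if (initialized) ratePerSec += alpha * (instantRate - ratePerSec)
    else {
      // The first tick seeds the average with the observed rate.
      ratePerSec = instantRate
      initialized = true
    }
  }

  /** Smoothed messages per second. */
  def rate: Double = ratePerSec
}
```

A usage sketch: create `new EwmaRate(60.0, 5.0)` for a one-minute window, call `mark()` from the receive wrapper, and call `tick()` from a timer every five seconds.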

Page 24: Akka in Production: Our Story

Akka Performance Metrics

    import java.util.concurrent.TimeUnit
    import com.yammer.metrics.Metrics

    trait ActorMetrics extends ActorStack {
      // Timer includes a histogram of wrappedReceive() duration
      // as well as moving avg of rate of invocation
      val metricReceiveTimer = Metrics.newTimer(getClass, "message-handler",
                                                TimeUnit.MILLISECONDS, TimeUnit.SECONDS)

      override def receive: Receive = {
        case x =>
          val context = metricReceiveTimer.time()
          try {
            super.receive(x)
          } finally {
            context.stop() // always record the duration, even on failure
          }
      }
    }

Page 25: Akka in Production: Our Story

Performance Metrics (cont’d)

Page 26: Akka in Production: Our Story

Performance Metrics (cont’d)

Page 27: Akka in Production: Our Story

Flow control

• By default, actor mailboxes are unbounded

• Using bounded mailboxes

• When mailbox is full, messages go to DeadLetters

• mailbox-push-timeout-time: how long to wait when mailbox is full

• Doesn’t work for distributed Akka systems!

• Real flow control: pull, push with acks, etc.

• Works anywhere, but more work
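
The bounded-mailbox behavior described above can be modeled in a few lines of plain Scala. This is a rough sketch, not Akka's implementation: `BoundedMailboxModel`, `enqueue`, `dequeue`, and `deadLetters` are hypothetical names, and the dead-letter list is only safe for a single-threaded demo. The sender waits up to the push timeout for space, and a message that still does not fit is diverted to dead letters.

```scala
import java.util.concurrent.{ArrayBlockingQueue, TimeUnit}
import scala.collection.mutable.ListBuffer

// Stdlib-only model of a bounded mailbox with a push timeout.
final class BoundedMailboxModel[A <: AnyRef](capacity: Int, pushTimeoutMillis: Long) {
  private val queue = new ArrayBlockingQueue[A](capacity)

  // Messages that could not be enqueued within the timeout.
  val deadLetters: ListBuffer[A] = ListBuffer.empty[A]

  /** Sender side: block up to the push timeout, then give up. */
  def enqueue(msg: A): Boolean = {
    val accepted = queue.offer(msg, pushTimeoutMillis, TimeUnit.MILLISECONDS)
    if (!accepted) deadLetters += msg
    accepted
  }

  /** Receiver side: take the next message if one is waiting. */
  def dequeue(): Option[A] = Option(queue.poll())

  def size: Int = queue.size
}
```

In this model the time a sender spends blocked inside `enqueue` is the cost of a full mailbox, and nothing tells a sender on another machine to slow down, which is why pull-based or acknowledged-push protocols are needed for real flow control.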

Page 28: Akka in Production: Our Story

Flow control (Cont’d)

• A working flow control system causes the rate of all actor components to be in sync.

• Witness this message flow rate graph of the start of event processing:

Page 29: Akka in Production: Our Story

VisualVM and Akka

• Bounded mailboxes = time spent enqueueing msgs

Page 30: Akka in Production: Our Story

VisualVM and Akka

• My dream: a VisualVM plugin to visualize Actor utilization across threads

Page 31: Akka in Production: Our Story

Tracing Akka Message Flows

• Stack trace is very useful for traditional apps, but for Akka apps, you get this:

at akka.dispatch.Future$$anon$3.liftedTree1$1(Future.scala:195) ~[akka-actor-2.0.5.jar:2.0.5]

at akka.dispatch.Future$$anon$3.run(Future.scala:194) ~[akka-actor-2.0.5.jar:2.0.5]

at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:94) [akka-actor-2.0.5.jar:2.0.5]

at akka.jsr166y.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1381) [akka-actor-2.0.5.jar:2.0.5]

at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259) [akka-actor-2.0.5.jar:2.0.5]

at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) [akka-actor-2.0.5.jar:2.0.5]

at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479) [akka-actor-2.0.5.jar:2.0.5]

at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) [akka-actor-2.0.5.jar:2.0.5]

• What if you could get an Akka message trace?

    --> trAKKAr message trace <--
    akka://Ingest/user/Super --> akka://Ingest/user/K1: Initialize
    akka://Ingest/user/K1 --> akka://Ingest/user/Converter: Data

Page 32: Akka in Production: Our Story

Tracing Akka Message Flows

Page 33: Akka in Production: Our Story

Tracing Akka Message Flows

• Trait sends an Edge(source, dest, messageInfo) to a local Collector actor

• Aggregate edges across nodes, graph and profit!

    trait TrakkarExtractor extends TrakkarBase with ActorStack {
      import TrakkarUtils._

      val messageIdExtractor: MessageIdExtractor = randomExtractor

      override def receive: Receive = {
        case x =>
          lastMsgId = (messageIdExtractor orElse randomExtractor)(x)
          Collector.sendEdge(sender, self, lastMsgId, x)
          super.receive(x)
      }
    }
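
To make the "aggregate edges and graph" step concrete, here is a minimal stdlib-only sketch. It assumes edges arrive as simple `Edge(source, dest, messageInfo)` values with string fields; the object `EdgeGraphDemo`, the functions `aggregate` and `toDot`, and the choice of Graphviz DOT as output are all hypothetical and not taken from the tracing code shown above.

```scala
object EdgeGraphDemo {
  // One observed message hop between two actor paths.
  final case class Edge(source: String, dest: String, messageInfo: String)

  /** Count how many times each distinct edge was seen. */
  def aggregate(edges: Seq[Edge]): Map[Edge, Int] =
    edges.groupBy(identity).map { case (edge, seen) => edge -> seen.size }

  /** Render the counts as Graphviz DOT text, sorted for stable output. */
  def toDot(counts: Map[Edge, Int]): String = {
    val lines = counts.toSeq
      .sortBy { case (e, _) => (e.source, e.dest, e.messageInfo) }
      .map { case (e, n) =>
        "  \"" + e.source + "\" -> \"" + e.dest +
          "\" [label=\"" + e.messageInfo + " x" + n + "\"];"
      }
    (Seq("digraph trace {") ++ lines ++ Seq("}")).mkString("\n")
  }
}
```

Edges collected on several nodes could be concatenated into one sequence before calling `aggregate`, and the resulting DOT text fed to any Graphviz renderer.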

Page 34: Akka in Production: Our Story

Good Akka development practices

• Don't put things that can fail into the Actor constructor

• Default supervision strategy stops an Actor which cannot initialize itself

• Instead use an Initialize message

• Put your messages in the Actor’s companion object

• Namespacing is nice
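
The two practices above can be sketched without Akka. In this plain-Scala stand-in, `Worker`, `handle`, and the `connect` function are hypothetical names: the constructor does nothing that can fail, the fallible setup runs only when an `Initialize` message arrives, and the message types live in the companion object.

```scala
import scala.util.{Failure, Success, Try}

object InitializeDemo {
  // Messages live in the companion object, so callers write Worker.Initialize.
  object Worker {
    case object Initialize
    final case class DoSomeWork(desc: String)
  }

  // `connect` stands in for any setup step that may throw.
  class Worker(connect: () => String) {
    import Worker._

    // Constructor only sets up empty state; nothing here can fail.
    private var connection: Option[String] = None

    def handle(msg: Any): String = msg match {
      case Initialize =>
        Try(connect()) match {
          case Success(c) =>
            connection = Some(c)
            "initialized"
          case Failure(e) =>
            "init failed: " + e.getMessage
        }
      case DoSomeWork(desc) =>
        connection.fold("not initialized")(c => "did " + desc + " via " + c)
      case _ =>
        "unhandled"
    }
  }
}
```

Because the failure happens while handling a message rather than during construction, the caller (or, in Akka, the supervisor) can decide whether to retry by sending `Initialize` again.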

Page 35: Akka in Production: Our Story

PUTTING IT ALL TOGETHER

Page 36: Akka in Production: Our Story

Akka Visibility, Minimal Footprint

    trait InstrumentedActor extends Slf4jLogging with ActorMetrics with TrakkarExtractor

    object MyWorkerActor {
      case object Initialize
      case class DoSomeWork(desc: String)
    }

    class MyWorkerActor extends InstrumentedActor {
      import MyWorkerActor._ // bring the message types into scope

      def wrappedReceive = {
        case Initialize =>
        case DoSomeWork(desc) =>
      }
    }

Page 37: Akka in Production: Our Story

Next Steps

• Name?

• Open source?

• Talk to me if you’re interested in contributing

Page 38: Akka in Production: Our Story

THANK YOU

And YES, We’re HIRING!!

ooyala.com/careers
