Scaling with Scala: refactoring a back-end service into the mobile age

Post on 01-Dec-2014



Services built with 20th century programming languages are reaching their scalability limits. The global interpreter lock and the lack of an asynchronous programming model are becoming barriers to accommodating the numbers of users typical of today's mobile as well as web worlds. In this talk I cover the transition of a back-end service to Scala and the changes associated with it. The improved performance and cost savings of the Scala implementation free up resources that could be better leveraged elsewhere.


Scaling with Scala: Refactoring a Back-end Service into the Mobile Age

Dragos Manolescu (@hysteresis), Whitepages

dmanolescu at whitepages com

About Whitepages

• Top web and mobile site for finding phones, people and locations

• 50M unique users per month

• 35M search queries per day

• 70 engineers, mostly in Seattle

Background

• Ruby backend services

• Shifting to Scala to accommodate growth

• Technologies:

• spray.io (client, server)

• Thrift and Scrooge

• Coda Hale metrics

• Typesafe Webinar: http://j.mp/Y6gH05

Worker Backend Service

[Architecture diagram. Labels: N1, N2, N3; Worker; AMQ (four queues); Riak Cluster. Edge labels: JSON; compressed binary Thrift.]

Sidebar: Scala Async Programming Model

• Future-based

• Future[T] monad

• Composition w/ the collection-like API (map, etc.)

• Actor-based

• Eliminates locks and thread management

• Resiliency through supervision (Erlang/OTP)

Future Composition

    def retrieveDataAndMaterialize(req: MaterializationRequest,
                                   resolutions: ResolutionList,
                                   jobStatus: JobStatus): Future[CreateMaterializationResult] = {
      /* snip */

      val dasFactsF = getContactListFor(dasBucketPrefix)
      val deviceFactsF = getContactListFor(deviceBucketPrefix)
      val fbFactsF = getContactListFor(facebookBucketPrefix)
      val twFactsF = getContactListFor(twitterBucketPrefix)
      val lnFactsF = getContactListFor(linkedinBucketPrefix)

      val materializationResultF = for {
        mlHolderOpt <- mlHolderOptF
        rflHolderOpt <- rflHolderOptF
        dasFacts <- dasFactsF
        deviceFacts <- deviceFactsF
        fbFacts <- fbFactsF
        twFacts <- twFactsF
        lnFacts <- lnFactsF
      } yield materialize(req, resolutions, jobStatus, mlHolderOpt, rflHolderOpt,
                          dasFacts, deviceFacts, fbFacts, twFacts, lnFacts)
      materializationResultF.flatMap(f => f)
    }
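The slide's code creates every future before the for comprehension, which lets the lookups run concurrently; the comprehension only combines their results. The following is a minimal sketch of that difference using only the Scala standard library. The names `fetch`, `concurrent` and `sequential` are hypothetical and stand in for the service's own calls such as `getContactListFor`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureCompositionSketch {
  // Hypothetical stand-in for a remote lookup such as getContactListFor.
  def fetch(source: String): Future[List[String]] =
    Future {
      Thread.sleep(100) // simulate I/O latency
      List(s"$source-contact")
    }

  // The futures are created first, so all three lookups are already
  // running when the for comprehension starts combining them.
  def concurrent(): Future[List[String]] = {
    val deviceF = fetch("device")
    val facebookF = fetch("facebook")
    val twitterF = fetch("twitter")
    for {
      device <- deviceF
      facebook <- facebookF
      twitter <- twitterF
    } yield device ++ facebook ++ twitter
  }

  // The futures are created inside the comprehension, so each lookup
  // starts only after the previous one has completed.
  def sequential(): Future[List[String]] =
    for {
      device <- fetch("device")
      facebook <- fetch("facebook")
      twitter <- fetch("twitter")
    } yield device ++ facebook ++ twitter
}
```

Both variants produce the same value, for example via `Await.result(FutureCompositionSketch.concurrent(), 5.seconds)`; they differ only in when the underlying work starts.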

Sidebar: Akka Actor Model

[Diagram. Labels: Actor (Behavior, Mailbox); Messages (M M M); Parent; Child, Child.]

Messaging to Actors

    private def connectedBehavior(consumer: MessageConsumer): Receive = {
      case PullNextMessage =>
        Option(consumer.receiveNoWait()) match {
          case Some(m) =>
            monitor ! SignalMaterializationRequestReceived(m.getJMSRedelivered)
            self ! ProcessMessage(m)
          case None if incomingRequests.isEmpty =>
            context.system.scheduler.scheduleOnce(wakeupInterval, self, PullNextMessage)
          case None if incomingRequests.size <= prefetch =>
            acknowledgeSessionMessages()
          case _ => /* nop */
        }
    }

Sidebar: Monitoring w/ CodaHale metrics and Graphite

Sidebar: ActiveMQ, Camel and Akka

• Apache ActiveMQ: message broker

• JMS, AMQP, MQTT, …

• Durable messaging, transactions

• “The main use case for ActiveMQ is migrating off ActiveMQ”

• Apache Camel: messaging w/ glue and routing

• Wide range of endpoints (file, JMS, JDBC, XMPP)

• Enterprise Integration patterns

• Akka-Camel: actors w/ Camel endpoints

Sidebar: ActiveMQ, Camel and Akka (cont.)

• Conflicting assumptions?

• Guaranteed delivery

• Delivery semantics

• JMS prefetch and CLIENT_ACKNOWLEDGE

• Pragmatic architectural decisions (Lucy Berlin, When Objects Collide, OOPSLA 1990)

Futures and Actors

    /* inside actor code */
    def acknowledgeSessionMessages(): Unit = {
      Future.sequence(resultsF)
        .map { results => AcknowledgeSession }
        .recover { case t: Throwable => RecoverSession(t) }
        .pipeTo(self)
    }

    override def receive: Receive = {
      case AcknowledgeSession => //
      case RecoverSession(t) => //
      //
    }
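The slide collapses many in-flight futures into a single message that is piped back to the actor, so session state is only touched while handling a message. The sketch below shows just the future half of that pattern with the standard library, leaving out Akka and `pipeTo`. The message types reuse the names from the slide, but their definitions here, along with `AckSketch` and `outcome`, are assumptions.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AckSketch {
  // Hypothetical message types; only the names come from the slide.
  sealed trait SessionMessage
  case object AcknowledgeSession extends SessionMessage
  final case class RecoverSession(cause: Throwable) extends SessionMessage

  // Future.sequence succeeds only if every element succeeds; the first
  // failure fails the combined future. map and recover then turn either
  // outcome into exactly one message.
  def outcome(resultsF: Seq[Future[Int]]): Future[SessionMessage] =
    Future.sequence(resultsF)
      .map[SessionMessage](_ => AcknowledgeSession)
      .recover { case t: Throwable => RecoverSession(t) }
}
```

With all results successful the future completes with `AcknowledgeSession`; if any result fails it completes with `RecoverSession` carrying the cause. In an actor, piping this future to `self` would deliver that message to the mailbox.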

Supervision: Error Kernel

[Actor hierarchy diagram with three <<parent-of>> relations. Labels recovered from the slide: AmqClient; start, stop, registerQueueListener, unregisterQueueListener, createQueueSender, deleteQueueSender; AmqActor; connection [become]; ConsumerActor; session, processMessage; ProducerActor; session; JMS.MessageProducer; AmqSender; sendTextMessage(); AmqSupervisorActor; disconnectConsumers, disconnectSenders; consumers, closingConsumers, senders, closingSenders, child [become].]

Results

290 Ruby instances

2 Scala instances

Not so fast (no pun intended)

Performance Tuning

Riak Client

    object ConverterBase extends ClassSupport {

      def bytesToRiakObject(key: String, bucket: String, vclock: VClock, bytes: Array[Byte]): IRiakObject = {
        val blob = Snappy.compress(bytes)
        monitor ! BashoValueSize(blob.length, BashoPutOperation)
        RiakObjectBuilder.newBuilder(bucket, key)
          .withValue(blob).withVClock(vclock)
          .withContentType(BashoMobileClient.contentType)
          .withUsermeta(BashoMobileClient.userMeta)
          .build()
      }

      def riakObjectToBytes(riakObject: IRiakObject) = {
        val thriftBytes = Snappy.uncompress(riakObject.getValue)
        Thrift.deserializeThrift(thriftBytes, ThriftCompactProtocol)
      }
    }

    abstract class ConverterBase[T <: ThriftStruct](key: String, bucket: String)
        extends Converter[StoredObjectHolder[T]] {

      override def fromDomain(valueHolder: StoredObjectHolder[T], vclock: VClock): IRiakObject =
        ConverterBase.bytesToRiakObject(valueHolder.key, bucket, vclock,
          Thrift.serializeThrift(valueHolder.value.get, ThriftCompactProtocol))

      override def toDomain(riakObject: IRiakObject): StoredObjectHolder[T] = {
        if (riakObject == null)
          new StoredObjectHolder[T](key)
        else
          StoredObjectHolder[T](riakObject.getKey, Some(riakObject.getVClock),
            Some(makeNew(ConverterBase.riakObjectToBytes(riakObject))))
      }

      def makeNew(protocol: TProtocol): T
    }
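The converter above compresses serialized bytes before storing them and reverses the step on read, reporting the stored size along the way. The sketch below illustrates that round trip in isolation. It swaps in the JDK's DEFLATE (`java.util.zip`) for Snappy so that it runs without extra dependencies, which means the sizes it reports say nothing about Snappy's behavior. `BlobCodecSketch` and its methods are hypothetical names.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.{Deflater, Inflater}

object BlobCodecSketch {
  // Stand-in for Snappy.compress, using DEFLATE from the JDK.
  def compress(bytes: Array[Byte]): Array[Byte] = {
    val deflater = new Deflater()
    deflater.setInput(bytes)
    deflater.finish()
    val out = new ByteArrayOutputStream()
    val buffer = new Array[Byte](4096)
    while (!deflater.finished()) {
      val n = deflater.deflate(buffer)
      out.write(buffer, 0, n)
    }
    deflater.end()
    out.toByteArray
  }

  // Stand-in for Snappy.uncompress; rejects truncated input instead of looping.
  def uncompress(blob: Array[Byte]): Array[Byte] = {
    val inflater = new Inflater()
    inflater.setInput(blob)
    val out = new ByteArrayOutputStream()
    val buffer = new Array[Byte](4096)
    while (!inflater.finished()) {
      val n = inflater.inflate(buffer)
      if (n == 0 && inflater.needsInput())
        throw new IllegalArgumentException("truncated input")
      out.write(buffer, 0, n)
    }
    inflater.end()
    out.toByteArray
  }
}
```

Comparing `compress(payload).length` with `payload.length` gives the kind of number the slide's code sends to its monitor as `BashoValueSize`.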

JVM Serialization

Snappy or LZ4? Micro-benchmarking with JMH

Throughput Measurements

Compression Measurements

Summary

• Shifting from Ruby to Scala

• Scala async programming model

• JVM optimizations

• Results:

• Increased throughput

• Better hardware utilization

• Lower operating cost

• Seamless integration with Java ecosystem

Thank you! (we are hiring)

Resources

• Typesafe Webinar w/ Whitepages: http://j.mp/Y6gH05

• JVM Serializer Benchmarks: http://j.mp/1BBYdky

• YourKit Java profiler: http://j.mp/10mFu17

• JMH: http://j.mp/1BC5Bwv

• When Objects Collide: http://j.mp/1vBiPsz

• Coda Hale metrics: http://j.mp/1vyl9j1