Transcript of COMPUTING AT SCALE WITH FINAGLE · monkey.org/marius/wt294.pdf
Setting

The modern computing environment is distributed.
• Datacenters are the only way we know how to build big, cheap computers.

Internet services must be “carrier grade.”
• Their usefulness is directly related to their availability.

Utility computing means environment variability.
• You get what you pay for: resources may be revoked at any time; you may have noisy neighbors.
The dismal computer

Datacenters are really crappy computers; they:
• have deep memory hierarchies,
• exhibit partial failures,
• have dynamic topologies,
• are heterogeneous,
• are connected via asynchronous networks,
• make lots of room for operator error,
• and are very complex.
But, they’re what we’ve got. We need to gain reliability, safety, and efficiency through software.
It gets worse

Most of our tools, languages, tribal knowledge, and capacity for sophistication are geared towards local computing, for example:
• debuggers,
• profilers,
• linkers/loaders,
• memory models,
• runtimes, and
• type systems.
The very model of computing changes.
Problems
How do we harness datacenter computing?
How do we provide a sane programming model?
Can we recoup what is lost?
What I’m about to talk about, IPC/RPC systems, is just a small, though important, piece of this puzzle.
At the end of the day, the implications of datacenter computing are pervasive, and must be accounted for in every layer.
What we can do, however, is provide good tools to solve these problems. That’s what I’m talking about today.
The service model
A service is an autonomous, asynchronous, isolated, and failure-explicit module.
Services compose other services. A system is a graph of services.
Services operate concurrently.
[Diagram: Twitter as a graph of services, arranged in layers:
• Storage & retrieval: Redis, Memcache, Flock, T-Bird, MySQL
• Logic: Tweet, User, Timeline, Social Graph, DMs
• Presentation: API, Web, Monorail
• Routing: TFE
Connected via HTTP, Thrift, and “stuff.”]
Services in Finagle

// A service is a function that takes a
// Req-typed value, returning a
// Future of a Rep-typed value.
//
// It’s an asynchronous function.
trait Service[Req, Rep] extends (Req => Future[Rep])
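To make the type concrete, here is the same idea sketched with the standard library’s scala.concurrent.Future standing in for com.twitter.util.Future; the reversing service below is invented purely for illustration:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// A service is just an asynchronous function from Req to Future[Rep].
trait Service[Req, Rep] extends (Req => Future[Rep])

// A toy service: reverses its input.
val reverse = new Service[String, String] {
  def apply(req: String): Future[String] = Future.successful(req.reverse)
}

// Because a service is a plain function, its results compose with
// ordinary combinators such as map.
val shouted: Future[String] = reverse("hello").map(_.toUpperCase)
val result: String = Await.result(shouted, 1.second)
```

Treating services as first-class functions is what lets filters, clients, and servers all share one uniform interface.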
Defining a service

calc.thrift:

// Define an interface using an IDL.
// (In this case, Thrift.)
service Calculator {
  i32 multiply(1: i32 a, 2: i32 b);
}
calc.scala:
trait Calculator {
  def multiply(a: Int, b: Int): Future[Int]
}
Implementing a calculator
val calculator = new Calculator {
  def multiply(a: Int, b: Int): Future[Int] =
    Future.value(a * b)
}

Rpc.serve("calculator", calculator)
Using the calculator

val calculator = Rpc.bind[Calculator]("calculator")
val f: Future[Int] = calculator.multiply(100, 200)
f.respond {
  case Return(res) => println(s"100*200=$res")
  case Throw(err)  => println(s"error $err")
}
Concurrent composition

def querySegment(id: Int, query: String): Future[Set[Result]]

def search(query: String): Future[Set[Result]] = {
  val queries: Seq[Future[Set[Result]]] =
    for (id <- 0 until NumSegments) yield
      querySegment(id, query)

  Future.collect(queries) flatMap { results: Seq[Set[Result]] =>
    Future.value(results.flatten.toSet)
  }
}
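The same scatter-gather pattern can be sketched with the standard library, using Future.sequence in place of Finagle’s Future.collect; NumSegments and the per-segment data here are made up for illustration:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

val NumSegments = 4

// Hypothetical per-segment indexes; a real implementation would issue
// RPCs to remote index shards.
val segments: Map[Int, Set[String]] = Map(
  0 -> Set("a", "b"), 1 -> Set("b", "c"), 2 -> Set.empty, 3 -> Set("d")
)

def querySegment(id: Int, query: String): Future[Set[String]] =
  Future.successful(segments(id)) // stand-in for an RPC call

def search(query: String): Future[Set[String]] = {
  // Fan out to all segments concurrently...
  val queries: Seq[Future[Set[String]]] =
    for (id <- 0 until NumSegments) yield querySegment(id, query)
  // ...then merge the per-segment results into one set.
  Future.sequence(queries).map(_.flatten.toSet)
}

val merged: Set[String] = Await.result(search("q"), 1.second)
```

The fan-out is free: futures are created eagerly, so all segment queries are in flight before the merge runs.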
An RPC system

[Diagram: several server processes, each calling Rpc.serve("calculator"); a client calls Rpc.bind("calculator"); the RPC system ties them together.]
Finagle from 10k feet

[Diagram: on the client, calc.multiply(100, 200) is serialized, named, and distributed over a session; it crosses the datacenter network; on the server, it is deserialized and admitted before reaching the service, and the response flows back the same way.]
Serialization

calculator.multiply(100, 200)
  ↕
Call("Calculator.multiply", 100, 200)
  ↕
0000000 d2791881 4ca6401e 32003b8f d4fe0a24
0000010 087b66b0 ddbc8058 cff4bb11 7b9cce85
0000020 5546dd41 858dfb25 2614aa5b f872082a
0000030 48cc7d91 5f7f2884 f0b74ae8 1a1e2c68
0000040 16f8d867 971112cb b84827de ef52f281
0000050 06eb6c5b 0098603b 5a0e49b6 c607fda0
Naming

Logical/abstract:
  calculator

Replica set:
  zone/owner/env/calculator

Physical:
  host1.smf1:122
  host2.smf1:123
  host3.smf1:124
  …
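The three layers can be sketched as successive lookups. The tables below are hypothetical stand-ins; in real Finagle, logical names resolve through Dtabs, and replica sets are typically ZooKeeper-backed serversets:

```scala
// Logical name -> replica-set path (playing the role of a Dtab entry).
val logical: Map[String, String] =
  Map("calculator" -> "zone/owner/env/calculator")

// Replica-set path -> physical endpoints (playing the role of a
// ZooKeeper-backed serverset).
val replicaSets: Map[String, Seq[String]] = Map(
  "zone/owner/env/calculator" ->
    Seq("host1.smf1:122", "host2.smf1:123", "host3.smf1:124")
)

// Resolution composes the two lookups; an unknown name resolves to
// an empty endpoint set.
def resolve(name: String): Seq[String] =
  logical.get(name).flatMap(replicaSets.get).getOrElse(Seq.empty)

val endpoints: Seq[String] = resolve("calculator")
```

Keeping the layers separate is the point: applications speak only logical names, so operators can rebind them without touching code.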
Distribution

[Diagram: a distributor fans requests out across server1, server2, server3, …, servern.]
Distribution (2)

[Diagram: for each server, the distributor keeps statistics — failures, latencies, load, session health; a controller uses them to steer traffic, including a per-server circuit breaker.]
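A minimal circuit breaker of the kind the controller uses might trip after a run of failures and stop sending traffic to that server. The threshold, API, and synchronous style below are invented for illustration; a production breaker would also probe and re-close after a cool-down:

```scala
import scala.util.{Failure, Success, Try}

// Trips open after `threshold` consecutive failures; while open,
// requests are rejected without reaching the server. (Half-open
// probing to re-close the breaker is omitted for brevity.)
final class CircuitBreaker(threshold: Int) {
  private var consecutiveFailures = 0

  def isOpen: Boolean = consecutiveFailures >= threshold

  def call[A](op: () => Try[A]): Try[A] =
    if (isOpen) Failure(new Exception("circuit open: request rejected"))
    else op() match {
      case ok @ Success(_)  => consecutiveFailures = 0; ok
      case err @ Failure(_) => consecutiveFailures += 1; err
    }
}

val breaker = new CircuitBreaker(threshold = 3)
val failing = () => (Failure(new Exception("boom")): Try[Int])
(1 to 3).foreach(_ => breaker.call(failing)) // three failures trip it
val open: Boolean = breaker.isOpen
```

The payoff is failure isolation: a flapping server stops consuming client capacity instead of dragging down every request routed to it.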
Session (Mux)

[Diagram: two peers, each with a session split into a data plane and a control plane. The data plane carries request/response; the control plane carries cancel, ping, credit, error, and nack.]
Admission control

[Diagram: an incoming request first hits an admission decision (“admit?”); if rejected, the server nacks; if admitted, the request proceeds to the service, which returns a response.]
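Admission control can be sketched as a wrapper that nacks once outstanding requests exceed a limit. The Admitter class, Nack type, and limit below are all invented for illustration, and the service is a plain synchronous function for clarity:

```scala
import java.util.concurrent.atomic.AtomicInteger

case object Nack extends Exception("nacked: server over capacity")

// Wraps a service function and admits at most `limit` concurrent
// requests; anything beyond that is nacked so the client can back
// off or retry elsewhere.
final class Admitter[Req, Rep](limit: Int, service: Req => Rep) {
  private val outstanding = new AtomicInteger(0)

  def apply(req: Req): Either[Nack.type, Rep] =
    if (outstanding.incrementAndGet() > limit) {
      outstanding.decrementAndGet()
      Left(Nack)
    } else
      try Right(service(req))
      finally outstanding.decrementAndGet()
}

val admitter  = new Admitter[Int, Int](limit = 2, x => x * 2)
val admitted  = admitter(21)  // under capacity: served
val rejecting = new Admitter[Int, Int](limit = 0, x => x)
val nacked    = rejecting(1)  // zero capacity: always nacked
```

An explicit nack is cheaper than a timeout: the client learns immediately that it should shed or redirect load.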
Distributing traffic
Distribution happens at different scales: • Time.• Geography.
This is a recursive problem! • Use the same mechanism, in different places.
Diagnostics
val sr: StatsReceiver
val counter = sr.counter("requests")
val stat = sr.stat("latency")

…

counter.incr()
stat.add(latencyMs)
Diagnostics

% curl http://.../admin/metrics.json
...
"Gizmoduck/request_latency_ms": {
  "average": 1,
  "count": 124909591,
  "maximum": 950,
  "minimum": 0,
  "p50": 1,
  "p90": 3,
  "p95": 5,
  "p99": 19,
  "p999": 105,
  "p9999": 212,
  "sum": 222202958
},
...
"err/CancelledRequest": 11,
"err/Unknown": 285,
"err/Timeout": 106,
...
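The pXX fields are latency percentiles. Their meaning can be sketched with the nearest-rank definition over recorded samples (Finagle’s real exporter uses a streaming histogram; the sample data here is made up):

```scala
// Nearest-rank percentile: the smallest sample such that at least
// p percent of samples are <= it.
def percentile(samples: Seq[Long], p: Double): Long = {
  val sorted = samples.sorted
  val rank = math.max(1, math.ceil(p / 100.0 * sorted.size).toInt)
  sorted(rank - 1)
}

val latencyMs: Seq[Long] = (1L to 100L) // made-up samples: 1..100 ms
val p50 = percentile(latencyMs, 50)
val p99 = percentile(latencyMs, 99)
```

Percentiles, not averages, are what matter at scale: with millions of requests, even p999 describes a large absolute number of users.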
Aggregates
avg(ts(AVG, Gizmoduck, Gizmoduck/request_latency_ms.p{50,90,99,999}))
A word on resiliency

resilience, n. The act of resiling, springing back, or rebounding; as, the resilience of a ball or of sound.
Systems must be designed, end-to-end, for resiliency. RPC systems are a toolkit for resilient applications, not a panacea.
Resilient systems should balance MTTF vs. MTTR.
We can’t wish these problems away.
Message broker architectures

Explicit, decoupled message queues.
• Publishers, subscribers.
• Topics.
• Patterns on top — “request/reply,” “fire and forget,” etc.

Brokered by middleware — the message queues.
Actor architectures

Actors and Services are both structuring idioms.
• Whereas Services are asynchronous functions, actors are asynchronous sinks.

Actors
• Independent, isolated.
• Asynchronous message passing.
• Arranged into systems.
• Request-reply is a pattern.
Error handling through supervisor hierarchies.
Thanks.
This is all open source!
github.com/twitter/finagle
github.com/twitter/util
github.com/twitter/scrooge
Backup slides
Systems thinking
We no longer care about a single {process, machine, service, …}. What matters is how the system works.
For example, we want to optimize end-to-end performance, not that of individual servers.
Lessons
Define high-level objectives, not low-level parameters (e.g., SLOs).
• Give the system more freedom.
• Make use of dynamism.
• Balance with simplicity.
Load balancing
A rich topic, with many tradeoffs.
Power of Two Choices (Mitzenmacher).
Apertures.
Latency-based metrics.
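Power of two choices: sample two servers at random and send the request to the less loaded one, which keeps load nearly balanced with almost no coordination. A sketch, with load modeled as an outstanding-request count and all data invented:

```scala
import scala.util.Random

// Each server tracks its outstanding-request count (its "load").
final case class Server(name: String, var load: Int)

// P2C: sample two servers uniformly at random and pick the less
// loaded of the pair.
def pickP2C(servers: IndexedSeq[Server], rng: Random): Server = {
  val a = servers(rng.nextInt(servers.size))
  val b = servers(rng.nextInt(servers.size))
  if (a.load <= b.load) a else b
}

val servers = IndexedSeq(Server("s1", 0), Server("s2", 0), Server("s3", 0))
val rng = new Random(42)
// Simulate 300 requests that never complete, so load accumulates;
// P2C keeps the per-server counts close together.
(1 to 300).foreach(_ => pickP2C(servers, rng).load += 1)
val total  = servers.map(_.load).sum
val spread = servers.map(_.load).max - servers.map(_.load).min
```

Compared with picking the global minimum, P2C needs only two load reads per request yet achieves exponentially better balance than uniform random choice (Mitzenmacher).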
Session liveness
How do we determine whether a session is live? A surprisingly tricky question.
φ accrual.
Threshold detector.
Requeueing
After receiving a server NACK, what do we do?
Credit/debit scheme. Cost ratio.
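One way to realize the credit/debit scheme: every issued request deposits a fractional credit, and each requeue after a nack withdraws a whole credit, so requeues are bounded by a fixed cost ratio of real traffic. The class and numbers below are illustrative sketches, not Finagle’s actual implementation:

```scala
// Each issued request deposits `costRatio` credits; each requeue
// withdraws one whole credit. Requeues are therefore bounded by
// costRatio * (requests issued), so retries can't amplify load
// without bound during an outage.
final class RequeueBudget(costRatio: Double) {
  private var credits: Double = 0.0

  def deposit(): Unit = credits += costRatio

  def tryWithdraw(): Boolean =
    if (credits >= 1.0) { credits -= 1.0; true } else false
}

val budget = new RequeueBudget(costRatio = 0.25) // ~1 requeue per 4 requests
(1 to 10).foreach(_ => budget.deposit())         // 10 requests issued
val allowed = (1 to 5).count(_ => budget.tryWithdraw())
```

The cost ratio is the knob: it caps how much extra traffic requeueing can generate when a cluster is unhealthy.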