Erlang and Scalability

Erlang and Scalability, Jan Henry Nystrom, Percona Performance 2009

A few thoughts on Erlang and scalability, presented at the Percona Performance Conference, Santa Clara, 2009.

Transcript of Erlang and Scalability

Page 1: Erlang and Scalability

Erlang and Scalability

Jan Henry Nystrom

Percona Performance 2009

Page 2: Erlang and Scalability

Percona Performance Conference © 2009-2009, Erlang Training and Consulting

Introduction
• Scalability Killers
• Design Decisions – Language and Yours
• Thinking Scalable/Parallel
• Code for the correct case
• Rules of Thumb
• Scalability in the small: SMP

Page 3: Erlang and Scalability

Scalability Killers
• Synchronization
• Resource contention
• Synchronization

Page 4: Erlang and Scalability

Design Decisions

No sharing

• Processes
• Encapsulation
• No implicit synchronization

Page 5: Erlang and Scalability

Design Decisions

No implicit synchronization

• Spawn always succeeds
• Sending always succeeds
• Random access message buffer
• Fire and forget unless you need the synchronization
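A minimal sketch of what "sending always succeeds" and "fire and forget" can look like in practice. The module name `send_sketch` and the `{call, From, Ref, Request}` / `{reply, Ref, Answer}` message shapes are assumptions made for this illustration, not part of the talk.

```erlang
-module(send_sketch).
-export([send_to_dead/0, sync_call/2]).

%% Fire and forget: sending to a process that has already terminated
%% does not fail or block. The send expression just returns the message.
send_to_dead() ->
    {Pid, MRef} = spawn_monitor(fun() -> ok end),
    %% Wait until the process is really gone.
    receive {'DOWN', MRef, process, Pid, _Reason} -> ok end,
    Pid ! hello.

%% Explicit synchronization, used only where the caller needs the answer.
%% The message shapes are hypothetical, chosen for this sketch.
sync_call(Server, Request) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, Request},
    receive {reply, Ref, Answer} -> Answer end.
```

The synchronization in `sync_call/2` is visible in the code as an explicit `receive`; nothing waits unless the programmer asks for it.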

Page 6: Erlang and Scalability

Design Decisions

Concurrency oriented programming

• Concurrency support an integral part of the language
• Distribution support
• Sets the focus firmly on the concurrent tasks
• Code for the correct case
• Clear Code

Clarity is King!

I would rather try to get clear code correct than correct code clear

Page 7: Erlang and Scalability

Thinking Scalable/Parallel

List length: obviously linear

[Figure: a list with the elements 1, 2, 3, 4]

But not when you have n processors?

Page 8: Erlang and Scalability

Thinking Scalable/Parallel

List length: O(log N) with sufficient processors

[Figure: pairwise reduction tree, 1 1 1 1 → 2 2 → 4]
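The reduction tree can be sketched in Erlang as below. This is an illustration under stated assumptions, not code from the talk: the module name `plen` is made up, and building the initial list of ones still walks the list sequentially, so the sketch shows only the shape of the O(log N) combining phase, not a real speedup.

```erlang
-module(plen).
-export([len/1]).

%% Length of a list by pairwise reduction: every element contributes 1,
%% and adjacent counts are summed in rounds until one total remains.
len([]) -> 0;
len(List) -> reduce([1 || _ <- List]).

%% One call per tree level; about log2(N) levels for N elements.
reduce([Total]) -> Total;
reduce(Counts) -> reduce(collect(spawn_pairs(Counts))).

%% Spawn one process per adjacent pair. An odd trailing count is
%% passed through to the next round unchanged.
spawn_pairs([A, B | Rest]) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() -> Parent ! {Ref, A + B} end),
    [Ref | spawn_pairs(Rest)];
spawn_pairs([A]) -> [{done, A}];
spawn_pairs([]) -> [].

%% Gather the partial sums in their original order.
collect(Items) ->
    [case Item of
         {done, Count} -> Count;
         Ref -> receive {Ref, Sum} -> Sum end
     end || Item <- Items].
```

All additions within one round are independent of each other, which is why the number of rounds, not the number of elements, bounds the running time when enough processors are available.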

Page 9: Erlang and Scalability

Thinking Scalable/Parallel

In the Erlang setting

• Do not introduce unneeded synchronization
• Remember processes are cheap
• Do not introduce unneeded synchronization
• A terminated process is all garbage
• Do not introduce unneeded synchronization

Page 10: Erlang and Scalability

Code for the Correct Case

[Figure: three request/answer exchanges, each with its own set timer, release timer and check]

Page 11: Erlang and Scalability

Code for the Correct Case

[Figure: one set timer, three requests, one answer, one release timer and check]
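One possible reading of the two diagrams is that a single timer around the whole exchange replaces a timer and check per request. The sketch below follows that reading. The module name `correct_case` and the `{request, From, Ref, Request}` / `{answer, Ref, Answer}` protocol are assumptions invented for the example.

```erlang
-module(correct_case).
-export([call_all/3]).

%% Send Request to every server, then wait for all answers under ONE
%% overall timeout instead of guarding each request separately.
call_all(Servers, Request, TimeoutMs) ->
    Ref = make_ref(),
    Self = self(),
    %% Set the single timer.
    Timer = erlang:send_after(TimeoutMs, Self, {timeout, Ref}),
    %% Fire off all requests without waiting in between.
    lists:foreach(fun(Server) -> Server ! {request, Self, Ref, Request} end,
                  Servers),
    Result = collect(Ref, length(Servers), []),
    %% Release the timer and drop a timeout message that raced with it.
    _ = erlang:cancel_timer(Timer),
    receive {timeout, Ref} -> ok after 0 -> ok end,
    Result.

%% The correct case is the straight path; the timeout is the only check.
%% Answers are returned in arrival order.
collect(_Ref, 0, Acc) -> {ok, lists:reverse(Acc)};
collect(Ref, N, Acc) ->
    receive
        {answer, Ref, Answer} -> collect(Ref, N - 1, [Answer | Acc]);
        {timeout, Ref} -> {error, timeout}
    end.
```

Answers that arrive after a timeout stay in the caller's mailbox in this sketch; a long-lived caller would need to discard them.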

Page 12: Erlang and Scalability

Rules of Thumb
• Rule 1 - All independent tasks should be processes
• Rule 2 - Do not invent concurrency that is not there!

[Figure: f(), g() and h() shown separately, and the composed call h(g(f()))]
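A small sketch of the two rules side by side. The module name `rules_sketch` and the bodies of `f/1`, `g/1` and `h/1` are placeholders chosen for illustration; the slide gives only their names.

```erlang
-module(rules_sketch).
-export([sequential/1, independent/1]).

%% Placeholder steps; each one needs the result of the previous one.
f(X) -> X + 1.
g(X) -> X * 2.
h(X) -> X - 3.

%% Rule 2: the steps form a dependency chain, so there is no concurrency
%% to exploit. A plain nested call is the clearest form.
sequential(X) -> h(g(f(X))).

%% Rule 1: different inputs are independent tasks, so each gets a process.
independent(Inputs) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, sequential(Input)} end),
                Ref
            end || Input <- Inputs],
    %% Collect the results in input order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

Splitting `f`, `g` and `h` into three processes for a single input would add message passing without adding parallelism, since each stage would only wait for the one before it.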

Page 13: Erlang and Scalability

Scalability in the small: SMP

Erlang SMP ”Credo”

SMP should be transparent to the programmer in much the same way as Erlang Distribution

• You shouldn’t have to think about it ...but sometimes you must
• Use SMP mainly for stuff that you’d make concurrent anyway
• Erlang uses concurrency as a structuring principle
• Model for the natural concurrency in your problem

Page 14: Erlang and Scalability

Scalability in the small: SMP

• Erlang on multicore

• SMP prototype ‘97, First OTP release May ‘06.

• Mid-2006 benchmark mimicking call handling (axdmark) on the (experimental) SMP emulator. Observed speedup/core: 0.95

• First Ericsson product (TGC) released on SMP Erlang in Q2 2007.

[Figure: ”Big bang” benchmark on Sunfire T2000, by number of simultaneous processes, 16 schedulers versus 1 scheduler]

Page 15: Erlang and Scalability

Scalability in the small: SMP

Case Study: Telephony Gateway Controller

• Mediates between legacy telephony and multimedia networks.

• Hugely complex state machines
• + massive concurrency.
• Developed in Erlang.
• Multicore version shipped to customer Q2 2007.
• Porting from 1-core PPC to 2-core Intel took < 1 man-year (including testing).

[Figure: AXE, TGC and three gateways (GW)]

Page 16: Erlang and Scalability

Scalability in the small: SMP

Case Study: Telephony Gateway Controller

Throughput in call/sec, as multiples of a baseline X.

| Traffic scenario | IS/GCP, 1 slot/board | IS/GEP dual core, one core running, 2 slots/board | IS/GEP dual core, two cores running, 2 slots/board | AXD CPB5 | AXD CPB6 |
|---|---|---|---|---|---|
| POTS-POTS / AGW | X | 2.3X (one core used) | 4.3X (OTP R11_3 beta+patches) | 0.4X | 2.1X |
| ISUP-ISUP / Inter MGW | 3.6X | 7.7X (one core used) | 13X (OTP R11_3 beta+patches) | 1.55X | 7.6X |
| ISUP-ISUP / Intra MGW | 5.5X | - | 26X | 3.17X | 14X |

Page 17: Erlang and Scalability

Scalability in the small: SMP

Speedup on 4 Hyper Threaded Pentium4

| # Schedulers | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Speedup | 1 | 1.92 | 2.05 | 2.73 | 3.11 | 3.63 | 3.79 | 3.96 |

• Chatty
• 1000 processes created
• Each process randomly sends req/receives ack from all other processes

Page 18: Erlang and Scalability

Scalability in the small: SMP

[Figure: non-SMP Erlang VM, one scheduler with one run queue]

Page 19: Erlang and Scalability

Scalability in the small: SMP

[Figure: current SMP Erlang VM (OTP R11/R12), schedulers #1 to #N with one run queue]

Page 20: Erlang and Scalability

Scalability in the small: SMP

[Figure: new SMP Erlang VM (OTP R13), schedulers #1 to #N, multiple run queues, with migration logic between them]

Released 21st April

Page 21: Erlang and Scalability

Scalability in the small: SMP

• Speedup of ”Big Bang” on a Tilera Tile64 chip (R13A)
• 1000 processes, all talking to each other

[Figure: speedup curves for multiple run queues and a single run queue]

Memory allocation locks dominate...

Speedup: ca. 0.43 * N @ 32 cores

Page 22: Erlang and Scalability

Scalability in the small: SMP

Shift in Bottlenecks

• All scalable Erlang systems were stress tested for CPU usage and for network usage
• With SMP hardware we must stress test for memory usage
• In the typical SMP system, the bottleneck has shifted from the CPU to the memory

Page 23: Erlang and Scalability

Scalability in the small: SMP

Death by a thousand cuts

• Many requests that generate short spikes in memory usage
• Limit or serialize those requests
• More on this in a coming paper from CTO Ulf Wiger

loop(State) ->
    receive
        {request, typeA, Data} ->
            Data1 = allocate_lots_of_memory(Data),
            a_server ! {request, typeA, self()},
            receive
                {answer, …
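A minimal sketch of the "serialize those requests" advice, assuming a single gatekeeper process is acceptable as the point of serialization. The module name `serializer` and its `start/0` and `run/2` functions are hypothetical and are not taken from the talk or from the forthcoming paper it mentions.

```erlang
-module(serializer).
-export([start/0, run/2]).

%% Start a gatekeeper process that runs submitted jobs one at a time,
%% so at most one memory-hungry job is in flight.
start() -> spawn(fun loop/0).

%% Submit Fun and wait for its outcome: {ok, Result} or {error, Reason}.
%% Waiting callers simply queue up in the gatekeeper's mailbox.
run(Gate, Fun) ->
    Ref = make_ref(),
    Gate ! {run, self(), Ref, Fun},
    receive {Ref, Outcome} -> Outcome end.

loop() ->
    receive
        {run, From, Ref, Fun} ->
            %% Each job gets a fresh process, so its heap is reclaimed
            %% as a whole when the process terminates.
            {Pid, MRef} = spawn_monitor(
                            fun() -> From ! {Ref, {ok, Fun()}} end),
            receive
                {'DOWN', MRef, process, Pid, normal} -> ok;
                {'DOWN', MRef, process, Pid, Reason} ->
                    From ! {Ref, {error, Reason}}
            end,
            loop()
    end.
```

A caller would wrap only the spiky part, for example `serializer:run(Gate, fun() -> allocate_lots_of_memory(Data) end)`, and keep the rest of its work concurrent. The sketch does not cover a job that calls `exit(normal)` without returning, in which case the caller would wait forever.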

Page 24: Erlang and Scalability

Questions

???