stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

93
State You’re Doing it Wrong: Alternative Concurrency Paradigms For the JVM Jonas Bonér Crisp AB blog: http://jonasboner.com work: http://crisp.se code: http://github.com/jboner twitter: jboner

description

 

Transcript of stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Page 1: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

State You’re Doing it Wrong: Alternative Concurrency Paradigms For the JVMJonas BonérCrisp ABblog: http://jonasboner.comwork: http://crisp.secode: http://github.com/jbonertwitter: jboner

Page 2: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

2

Agenda>An Emergent Crisis>State: Identity vs Value>Shared-State Concurrency>Software Transactional Memory (STM)>Message-Passing Concurrency (Actors) >Dataflow Concurrency>Wrap up

Page 3: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

3

Moore’s Law>Coined in the 1965 paper by Gordon E. Moore >The number of transistors is doubling every 18 months

>Processor manufacturers have solved our problems for years

Page 4: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

4

Not anymore

Page 5: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

5

The free lunch is over>The end of Moore’s Law >We can’t squeeze more out of one CPU

Page 6: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

6

Conclusion>This is an emergent crisis >Multi-processors are here to stay >We need to learn to take advantage of that >The world is going concurrent

Page 7: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

7

State

Page 8: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

8

The devil is in the state

Page 9: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

9

Wrong, let me rephrase

Page 10: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

10

The devil is in the mutable state

Page 11: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

11

Definitions&

Philosophy

Page 12: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

What is a Value?

A Value is something that does not change

Discussion based onhttp://clojure.org/state

by Rich Hickey

12

Page 13: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

What is an Identity?

A stable logical entity associated with a

series of different Values over time

13

Page 14: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

What is State?

The Value an entity with a specific Identity

has at a particular point in time

14

Page 15: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

How do we know if something has State?

If a function is invoked with the same arguments at

two different points in time and returns different values...

...then it has state

15

Page 16: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

The Problem

Unification of Identity & Value

They are

not the same

16

Page 17: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

We need to separate Identity & Value...add a level of indirectionSoftware Transactional Memory

Managed References

Message-Passing ConcurrencyActors/Active Objects

Dataflow ConcurrencyDataflow (Single-Assignment) Variables

17

Page 18: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

18

Shared-State Concurrency

Page 19: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

19

Shared-State Concurrency>Concurrent access to shared, mutable state. >Protect mutable state with locks >The Java C# C/C++ Ruby Python etc. ...way

Page 20: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

20

Shared-State Concurrency is incredibly hard

>Inherently very hard to use reliably>Even the experts get it wrong

Page 21: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledgeExample: Banking

2. Coordination of independent tasks/processesExample: Scheduling, Gaming

3. Workflow related dependent processesExample: Business processes, MapReduce

21

Page 22: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

...and for each of these...

1. Look at an implementation using Shared-State Concurrency

2. Compare with implementation using an alternative paradigm

22

Page 23: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledgeExample: Banking

2. Coordination of independent tasks/processesExample: Scheduling, Gaming

3. Workflow related dependent processesExample: Business processes, MapReduce

23

Page 24: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

24

Problem 1:

Transfer funds between bank accounts

Page 25: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

25

Shared-State Concurrency

Transfer funds between bank accounts

Page 26: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

26

Account

publicclassAccount{privatedoublebalance;publicvoidwithdraw(doubleamount){balance‐=amount;}publicvoiddeposit(doubleamount){balance+=amount;}}> Not thread-safe

Page 27: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

27

Let’s make it thread-safepublicclassAccount{privatedoublebalance;publicsynchronizedvoidwithdraw(doubleamount){balance‐=amount;}publicsynchronizedvoiddeposit(doubleamount){balance+=amount;}}

>Thread-safe, right?

Page 28: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

It’s still brokenNot atomic

28

Page 29: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

29

Let’s write an atomic transfer method

publicclassAccount{...

publicsynchronizedvoidtransferTo(Accountto,doubleamount){this.withdraw(amount);to.deposit(amount);}...}

> This will work right?

Page 30: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

30

Let’s transfer funds

Accountalice=...Accountbob=...//inonethreadalice.transferTo(bob,10.0D);//inanotherthreadbob.transferTo(alice,3.0D);

Page 31: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Might lead to DEADLOCK

Darn, this is really hard!!!

31

Page 32: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

32

We need to enforce lock ordering>How? >Java won’t help us >Need to use code convention (names etc.) >Requires knowledge about the internal state and implementation of Account

>…runs counter to the principles of encapsulation in OOP

>Opens up a Can of Worms

Page 33: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

The problem with locksLocks do not composeTaking too few locksTaking too many locksTaking the wrong locksTaking locks in the wrong orderError recovery is hard

33

Page 34: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Java bet on the wrong horse

But we’re not completely screwed There are alternatives

34

Page 35: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

We need better and more high-level

abstractions

35

Page 36: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

36

Alternative Paradigms>Software Transactional Memory (STM) >Message-Passing Concurrency (Actors) >Dataflow Concurrency

Page 37: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

37

Software Transactional Memory (STM)

Page 38: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

38

Software Transactional Memory>See the memory (heap and stack) as a transactional dataset

>Similar to a database begin commit abort/rollback

>Transactions are retried automatically upon collision

>Rolls back the memory on abort

Page 39: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

39

Software Transactional Memory> Transactions can nest> Transactions compose (yipee!!)atomic{..atomic{..}}

Page 40: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

40

Restrictions >All operations in scope of a transaction: Need to be idempotent Can’t have side-effects

Page 41: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

41

Case study: Clojure

Page 42: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

42

What is Clojure? >Functional language>Runs on the JVM>Only immutable data and datastructures>Pragmatic Lisp>Great Java interoperability>Dynamic, but very fast

Page 43: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

43

Clojure’s concurrency story >STM (Refs) Synchronous Coordinated

>Atoms Synchronous Uncoordinated

>Agents Asynchronous Uncoordinated

>Vars Synchronous Thread Isolated

Page 44: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

44

STM (Refs)>A Ref holds a reference to an immutable value>A Ref can only be changed in a transaction>Updates are atomic and isolated (ACI)>A transaction sees its own snapshot of the world>Transactions are retried upon collision

Page 45: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

45

Let’s get back to our banking problem

The STM way Transfer funds

between bank accounts

Page 46: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

46

;;alice’saccountwithbalance1000USD(defalice(ref1000))

;;bob’saccountwithbalance1000USD(defbob(ref1000))

Create two accounts

Page 47: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

47

;;amounttotransfer(defamount100)

;;notvalid;;throwsexceptionsince;;notransactionisrunning(ref‐setalice(‐@aliceamount))(ref‐setbob(+@bobamount))

Transfer 100 bucks

Page 48: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

48

;;updatebothaccountsinsideatransaction(dosync(ref‐setalice(‐@aliceamount))(ref‐setbob(+@bobamount)))

Wrap in a transaction

Page 49: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Potential problems with STMHigh contention (many transaction collisions) can lead to:

Potential bad performance and too high latencyProgress can not be guaranteed (e.g. live locking)Fairness is not maintained

Implementation details hidden in black box

49

Page 50: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

50

My (humble) opinion on STM >Can never work fine in a language that don’t have compiler enforced immutability>E.g. never in Java (as of today)

>Should not be used to “patch” Shared-State Concurrency

>Still a research topic how to do it in imperative languages

Page 51: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Discussion: Problem 1Need for consensus and truly shared knowledge

Shared-State ConcurrencyBad fit

Software Transactional Memory Great fitMessage-Passing Concurrency

Terrible fitDataflow Concurrency

Terrible fit

51

Page 52: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

52

Message-Passing Concurrency

Page 53: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

53

Actor Model of Concurrency >Implements Message-Passing Concurrency>Originates in a 1973 paper by Carl Hewitt>Implemented in Erlang, Occam, Oz>Encapsulates state and behavior>Closer to the definition of OO than classes

Page 54: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

54

Actor Model of Concurrency >Share NOTHING>Isolated lightweight processes

> Can easily create millions on a single workstation>Communicates through messages>Asynchronous and non-blocking

Page 55: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

55

Actor Model of Concurrency >No shared state … hence, nothing to synchronize.

>Each actor has a mailbox (message queue)

Page 56: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

56

Actor Model of Concurrency>Non-blocking send>Blocking receive>Messages are immutable>Highly performant and scalable Similar to Staged Event Driven Achitecture style (SEDA)

Page 57: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

57

Actor Model of Concurrency >Easier to reason about>Raised abstraction level>Easier to avoid Race conditions Deadlocks Starvation Live locks

Page 58: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

58

Fault-tolerant systems >Link actors>Supervisor hierarchies One-for-one All-for-one

>Ericsson’s Erlang success story 9 nines availability (31 ms/year downtime)

Page 59: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledgeExample: Banking

2. Coordination of independent tasks/processesExample: Scheduling, Gaming

3. Workflow related dependent processesExample: Business processes, MapReduce

59

Page 60: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

60

Problem 2:

A game of ping pong

Page 61: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

61

Shared-State Concurrency

A game of ping pong

Page 62: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Ping Pong Table

publicclassPingPongTable{publicvoidhit(Stringhitter){System.out.println(hitter);}}

62

Page 63: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

PlayerpublicclassPlayerimplementsRunnable{privatePingPongTablemyTable;privateStringmyName;publicPlayer(Stringname,PingPongTabletable){myName=name;myTable=table;}

...}

63

Page 64: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Player cont......publicvoidrun(){while(true){synchronized(myTable){try{myTable.hit(myName);myTable.notifyAll();myTable.wait();}catch(InterruptedExceptione){}}}}}

64

Page 65: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Run it

PingPongTabletable=newPingPongTable();Threadping=newThread(newPlayer("Ping",table));Threadpong=newThread(newPlayer("Pong",table));ping.start();pong.start();

65

Page 66: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

66

Help: java.util.concurrent>Great library >Raises the abstraction level

>No more wait/notify & synchronized blocks>Concurrent collections>Executors, ParallelArray

>Simplifies concurrent code >Use it, don’t roll your own

Page 67: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

67

Actors

A game of ping pong

Page 68: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Define message

caseobjectBall

68

Page 69: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Player 1: Pong

valpong=actor{loop{receive{//waitonmessagecaseBall=>//matchonmessageBallprintln("Pong")reply(Ball)}}}

69

Page 70: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Player 2: Ping

valping=actor{pong!Ball//startthegameloop{receive{caseBall=>println("Ping")reply(Ball)}}}

70

Page 71: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Run it...well, they are already up and running

71

Page 72: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

72

Actor implementations for the JVM >Killim (Java)>Jetlang (Java)>Actor’s Guild (Java)>ActorFoundry (Java)>Actorom (Java)>FunctionalJava (Java)>Akka Actor Kernel (Java/Scala)>GParallelizer (Groovy)>Fan Actors (Fan)

Page 73: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Discussion: Problem 2Coordination of interrelated tasks/processes

Shared-State ConcurrencyBad fit (ok if java.util.concurrent is used)

STM Won’t helpMessage-Passing Concurrency

Great fitDataflow Concurrency

Ok

73

Page 74: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Dataflow ConcurrencyThe forgotten paradigm

74

Page 75: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

75

Dataflow Concurrency>Declarative >No observable non-determinism >Data-driven – threads block until data is available>On-demand, lazy >No difference between:

>Concurrent and >Sequential code

Page 76: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

76

Dataflow Concurrency>No race-conditions >Deterministic >Simple and beautiful

Page 77: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

77

Dataflow Concurrency>Dataflow (Single-Assignment) Variables >Dataflow Streams (the tail is a dataflow variable) >Implemented in Oz and Alice

Page 78: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

78

Just three operations>Create a dataflow variable >Wait for the variable to be bound >Bind the variable

Page 79: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

79

Limitations>Can’t have side-effects Exceptions IO (println, File, Socket etc.) Time etc.

Not general-purpose Generally good for well-defined isolated modules

Page 80: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

80

Oz-style dataflow concurrency for the JVM

>Created my own implementation (DSL) > On top of Scala

Page 81: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

81

API: Dataflow Variable//Createdataflowvariablevalx,y,z=newDataFlowVariable[Int]//Accessdataflowvariable(Waittobebound)z()//Binddataflowvariablex<<40//Lightweightthreadthread{y<<2}

Page 82: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

82

API: Dataflow StreamDeterministic streams (not IO streams)

//Createdataflowstreamvalproducer=newDataFlowStream[Int]//Appendtostreamproducer<<<s//Readfromstreamproducer()

Page 83: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledgeExample: Banking

2. Coordination of independent tasks/processesExample: Scheduling, Gaming

3. Workflow related dependent processesExample: Business processes, MapReduce

83

Page 84: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

84

Problem 3:

Producer/Consumer

Page 85: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

85

Shared-State Concurrency

Producer/Consumer

Page 86: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Use java.util.concurrent

Fork/Join framework (ParallelArray etc.) ExecutorServiceFutureBlockingQueue

86

Page 87: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

87

Dataflow Concurrency

Producer/Consumer

Page 88: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

88

Example: Dataflow Variables

//sequentialversionvalx,y,z=newDataFlowVariable[Int]x<<40y<<2z<<x()+y()println("z="+z())

Page 89: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

89

Example: Dataflow Variables

//concurrentversion:nodifferencevalx,y,z=newDataFlowVariable[Int]thread{x<<40}thread{y<<2}thread{z<<x()+y()println("z="+z())}

Page 90: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Dataflow Concurrency in Java

DataRush (commercial)Flow-based Programming in Java (dead?) FlowJava (academic and dead)

90

Page 91: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

Discussion: Problem 3Workflow related dependent processes

Shared-State ConcurrencyOk (if java.util.concurrent is used)

STM Won’t helpMessage-Passing Concurrency

OkDataflow Concurrency

Great fit

91

Page 92: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

92

Wrap up>Parallel programs is becoming increasingly important>We need a simpler way of writing concurrent programs

>“Java-style” concurrency is too hard>There are alternatives worth exploring Message-Passing Concurrency Software Transactional Memory Dataflow Concurrency

Each with their strengths and weaknesses

Page 93: stateyouredoingitwrongjavaone2009-090617031310-phpapp02.pdf

93

Jonas BonérCrisp AB

blog: http://jonasboner.comwork: http://crisp.secode: http://github.com/jbonertwitter: jboner