Ruslan.shevchenko: most functional-day-kiev 2014

SCALA - organizing control flow (between imperative and declarative approaches). Ruslan Shevchenko, GoSave, @rssh1, https://github.com/rssh

description

A lecture on the available methods of organizing control flow in Scala, given at http://frameworksdays.com/event/most-functional-day

Transcript of Ruslan.shevchenko: most functional-day-kiev 2014

Page 1: Ruslan.shevchenko: most functional-day-kiev 2014

SCALA - organizing control flow (between imperative and declarative approaches). Ruslan Shevchenko, GoSave, @rssh1, https://github.com/rssh

Page 2: Ruslan.shevchenko: most functional-day-kiev 2014

Styles of control-flow organization:

• Futures …
• async/await
• Actors
• Channels
• Reactive Collections
• DSLs for Computation Plans

Page 3: Ruslan.shevchenko: most functional-day-kiev 2014

Control Flow

Scala:

  val x = readX
  val y = readY
  val z = x*y

[Diagram: X, then Y, then Z]

Imperative = we set the control flow explicitly.

Page 4: Ruslan.shevchenko: most functional-day-kiev 2014

Control flow directed by the evaluation strategy

Haskell:

  z = x*y
    where x = readX
          y = readY

  let x = readX
      y = readY
  in x*y

[Diagram: X and Y feed into Z]

Page 5: Ruslan.shevchenko: most functional-day-kiev 2014

Control Flow:

• Imperative/Declarative (?)
• Declarative + Understanding = Imperative

Page 6: Ruslan.shevchenko: most functional-day-kiev 2014

Control flow: what do we need?

• Manage multi{core,machine} control flows.
• Optimize resource utilization (see the Reactive Manifesto, http://www.reactivemanifesto.org/).

Page 7: Ruslan.shevchenko: most functional-day-kiev 2014

Reactivity

• The situation in the industry is ugly.
• In an ideal world, the {operating system / language VM} would take care of resource utilization, and the human would take care of the logic.
• Our world is far from ideal.

Page 8: Ruslan.shevchenko: most functional-day-kiev 2014

Control Flow

  val x = readX
  val y = readY
  val z = x*y

Can we read X and Y in parallel?

[Diagram: X and Y run in parallel; Z follows both]

Page 9: Ruslan.shevchenko: most functional-day-kiev 2014

Low level

  // X and Y stand for the result types of readX and readY
  @volatile var x: X = _
  val xThread = new Thread(new Runnable { def run(): Unit = { x = readX } })
  xThread.start()

  @volatile var y: Y = _
  val yThread = new Thread(new Runnable { def run(): Unit = { y = readY } })
  yThread.start()

  yThread.join()
  xThread.join()
  val z = x+y

Like GOTO from the 1960s.

Page 10: Ruslan.shevchenko: most functional-day-kiev 2014

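Low level with thread pool

  // assuming pool is a java.util.concurrent.ExecutorService
  val xTask = pool.submit(new Callable[X] { def call(): X = readX })
  val yTask = pool.submit(new Callable[Y] { def call(): Y = readY })

  val z = xTask.get()+yTask.get()

[Diagram: X and Y run in parallel; Z follows both]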

Page 11: Ruslan.shevchenko: most functional-day-kiev 2014

Scala Future

[Diagram: X and Y run in parallel; Z follows both]

  val x = Future{ readX }
  val y = Future{ readY }
  val z = Await.result(x, 1 minute) + Await.result(y, 1 minute)
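The snippet on this slide can be made self-contained roughly as follows. This is only a sketch: `readX` and `readY` are hypothetical stand-ins that sleep briefly and return fixed numbers, and the global execution context is assumed.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureDemo {
  // Hypothetical stand-ins for the slide's readX / readY.
  def readX: Int = { Thread.sleep(100); 6 }
  def readY: Int = { Thread.sleep(100); 7 }

  def compute(): Int = {
    // Both futures are scheduled as soon as they are created,
    // so the two reads can run in parallel.
    val x = Future { readX }
    val y = Future { readY }
    // Await.result blocks the calling thread until the value is ready.
    Await.result(x, 1.minute) + Await.result(y, 1.minute)
  }

  def main(args: Array[String]): Unit = println(compute())
}
```

Blocking with `Await.result` is the simplest way to get the value out, at the cost of tying up the calling thread.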

Page 12: Ruslan.shevchenko: most functional-day-kiev 2014

Future

• Future[T] = pointer to a value that will be available in the future
• isCompleted: Boolean
• Await.result(future, duration): T
• onComplete(Try[T] => U): Unit
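As a small illustration of the callback style, `onComplete` receives a `Try[T]` once the future finishes. In the sketch below, `describe` is a hypothetical helper and the `Promise` is used only to hand the callback's result back to the caller.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object FutureCallbackDemo {
  // Registers a callback on `f` and reports what the callback saw.
  def describe(f: Future[Int]): String = {
    val seen = Promise[String]()
    f.onComplete {
      case Success(v) => seen.success(s"ok: $v")
      case Failure(e) => seen.success(s"failed: ${e.getMessage}")
    }
    // Block only so that the demo can return a plain String.
    Await.result(seen.future, 10.seconds)
  }

  def main(args: Array[String]): Unit =
    println(describe(Future { 21 * 2 }))
}
```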

Page 13: Ruslan.shevchenko: most functional-day-kiev 2014

Future { readX }

  object Future {
    def apply[T](body: => T): Future[T] = …………
  }

Call-by-name syntax: the argument is passed as a function and is evaluated where it is used, not at the call site. "By name" is a term from Algol 60.

This lets our own functions look like control-flow constructions.
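A by-name parameter is what lets a user-defined function read like a built-in construct. The `retry` function below is a hypothetical example, not something from the talk:

```scala
object ByNameDemo {
  // `body: => T` is re-evaluated every time it is referenced,
  // so a failed attempt can simply be run again.
  def retry[T](times: Int)(body: => T): T =
    try body
    catch { case _: Exception if times > 1 => retry(times - 1)(body) }

  def main(args: Array[String]): Unit = {
    var attempts = 0
    // Reads like a language construct: retry(3) { ... }
    val result = retry(3) {
      attempts += 1
      if (attempts < 3) throw new RuntimeException("not yet")
      attempts
    }
    println(result)
  }
}
```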

Page 14: Ruslan.shevchenko: most functional-day-kiev 2014

Futures are composable.

• map: turns X => Y into Future[X] => Future[Y]
  • Future[X].map[Y](f: X => Y): Future[Y]
• flatMap: turns X => Future[Y] into Future[X] => Future[Y]
  • Future[X].flatMap[Y](f: X => Future[Y]): Future[Y]

Page 15: Ruslan.shevchenko: most functional-day-kiev 2014

Futures are composable: map

• Future[X].map[Y](f: X => Y): Future[Y]

  Future{ calculatePi() } map ( _ + 1) =
  Future{ calculatePi() } map (x => x+1) =
  Future{ => 4.1415926 }

Page 16: Ruslan.shevchenko: most functional-day-kiev 2014

Futures are composable: flatMap

• Future[X].flatMap[Y](f: X => Future[Y]): Future[Y]

  Future{ calculatePi() } flatMap ( calculateExp(_) ) =
  Future{ => e^pi … }
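The two combinators can be tried out with stand-in functions. In this sketch `calculatePi` and `calculateExp` are hypothetical implementations (the slides only name them), and `await` is a helper added for the demo.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComposeDemo {
  // Hypothetical stand-ins for the functions named on the slides.
  def calculatePi(): Double = math.Pi
  def calculateExp(x: Double): Future[Double] = Future { math.exp(x) }

  // map: apply a plain function to the eventual value.
  def plusOne: Future[Double] = Future { calculatePi() }.map(_ + 1)

  // flatMap: chain a function that itself returns a Future.
  def expPi: Future[Double] = Future { calculatePi() }.flatMap(calculateExp(_))

  def await[T](f: Future[T]): T = Await.result(f, 10.seconds)

  def main(args: Array[String]): Unit = {
    println(await(plusOne))
    println(await(expPi))
  }
}
```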

Page 17: Ruslan.shevchenko: most functional-day-kiev 2014

Scala Future

  val x = Future{ readX }
  val y = Future{ readY }
  val z = Future{ … after X and Y }

  val z = Future{ readX } flatMap { x => Future{ readY } map ( y => x+y ) }

Page 18: Ruslan.shevchenko: most functional-day-kiev 2014

Scala Future

  val x = Future{ readX }
  val y = Future{ readY }
  val z = Future{ … after X and Y }

  val xFuture = Future{ readX }
  val yFuture = Future{ readY }
  val z = xFuture flatMap { x => yFuture map ( y => x+y ) }

Using monadic syntax:

  for { x <- Future{ readX }; y <- Future{ readY } } yield x+y

Page 19: Ruslan.shevchenko: most functional-day-kiev 2014

Scala: monadic syntax

  for { x <- Future{ readX }; y <- Future{ readY } } yield x+y

is rewritten step by step into:

  Future{ readX } flatMap { x => for (y <- Future{ readY }) yield x+y }

  Future{ readX } flatMap { x => Future{ readY } map ( y => x+y ) }
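One consequence of this desugaring is worth spelling out: when the futures are created inside the for-comprehension, the second one only starts after the first has finished. A sketch with hypothetical `readX` and `readY` stand-ins shows both variants:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ForDemo {
  // Hypothetical stand-ins for the slide's readX / readY.
  def readX: Int = { Thread.sleep(200); 6 }
  def readY: Int = { Thread.sleep(200); 7 }

  // Sequential: Future { readY } is created inside flatMap,
  // that is, only after readX has completed.
  def sequential: Future[Int] =
    for { x <- Future { readX }; y <- Future { readY } } yield x + y

  // Parallel: both futures are already running before they are combined.
  def parallel: Future[Int] = {
    val xFuture = Future { readX }
    val yFuture = Future { readY }
    for { x <- xFuture; y <- yFuture } yield x + y
  }

  def await[T](f: Future[T]): T = Await.result(f, 10.seconds)

  def main(args: Array[String]): Unit = {
    println(await(sequential))
    println(await(parallel))
  }
}
```

Both variants yield the same value; only the scheduling differs.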

Page 20: Ruslan.shevchenko: most functional-day-kiev 2014

Future

• Good for simple cases
• Hard when we want to implement complex logic

Page 21: Ruslan.shevchenko: most functional-day-kiev 2014

SIP22 (Async/Await)

• For when we want to implement complex logic
• Oz style -> (limited) F# -> (limited) C#
• In Scala, available as a library: https://github.com/scala/async

Page 22: Ruslan.shevchenko: most functional-day-kiev 2014

SIP22 (Async/Await)

• async(body: => T): Future[T]
• await(future: Future[T]): T
  • can be used only inside async
• The async macro rewrites the body as a state machine.

Page 23: Ruslan.shevchenko: most functional-day-kiev 2014

Async/Await

  val x = Future{ readX }
  val y = Future{ readY }
  val z = Future{ … after X and Y }

  val z = async {
    val x = Future{ readX }
    val y = Future{ readY }
    await(x) + await(y)
  }

Page 24: Ruslan.shevchenko: most functional-day-kiev 2014

Async/Await

  val z = async {
    val x = Future{ readX }
    val y = Future{ readY }
    await(x) + await(y)
  }

is rewritten into something like:

  // Only illustrates the idea; this is not the actual generated code.
  val z = {
    var state = 0; var awaitX = false; var awaitY = false
    def f(): Unit = state match {
      case 0 =>
        state = 1
        Future{ x = readX(); awaitX = true } onComplete (_ => f())
        Future{ y = readY(); awaitY = true } onComplete (_ => f())
      case 1

Page 25: Ruslan.shevchenko: most functional-day-kiev 2014

Async/Await

  // Only illustrates the idea; this is not the actual generated code.
  val z = {
    var state = 0; var awaitX = false; var awaitY = false
    val res = Promise[X]()
    def f(): Unit = state match {
      case 0 =>
        state = 1
        Future{ x = readX(); awaitX = true } onComplete (_ => f())
        Future{ y = readY(); awaitY = true } onComplete (_ => f())
      case 1 =>
        if (awaitX && awaitY) { res.trySuccess(x + y) }
    }
    f()
    res.future
  }
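The same idea (callbacks that advance shared state and finally complete a `Promise`) can be written as a small runnable program. This is a hand-written sketch, not what the async macro emits; `readX` and `readY` are hypothetical stand-ins and the counter replaces the slide's boolean flags.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object JoinDemo {
  // Hypothetical stand-ins for the slide's readX / readY.
  def readX: Int = 6
  def readY: Int = 7

  def sum(): Future[Int] = {
    val res = Promise[Int]()
    val xF = Future { readX }
    val yF = Future { readY }
    // Number of results we are still waiting for.
    val pending = new AtomicInteger(2)
    def step(): Unit =
      if (pending.decrementAndGet() == 0)
        // Both futures are complete here, so `.value.get` is safe.
        res.complete(for { x <- xF.value.get; y <- yF.value.get } yield x + y)
    xF.onComplete(_ => step())
    yF.onComplete(_ => step())
    res.future
  }

  def result(): Int = Await.result(sum(), 10.seconds)

  def main(args: Array[String]): Unit = println(result())
}
```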

Page 26: Ruslan.shevchenko: most functional-day-kiev 2014

Async/Await

• Good for relatively complex logic
• No support for await inside closures within an async block
• Still low-level: cannot be used for organizing program structure

Page 27: Ruslan.shevchenko: most functional-day-kiev 2014

Akka http://www.akka.io

• Erlang-style concurrency
• Actor = active object
  • available by name in an Akka cluster
  • sends messages (optionally receives an answer)
  • has local state

Page 28: Ruslan.shevchenko: most functional-day-kiev 2014

Actor

[Diagram: an actor consists of a mailbox and a processor (with state)]
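The mailbox-plus-processor picture can be imitated with the standard library alone. The sketch below is not Akka and not its API: `Counter`, `!` and `stopAndCount` are hypothetical names, and a single-threaded executor plays the role of the mailbox.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object MailboxDemo {
  // A toy "actor": messages are queued on a single-threaded executor
  // (the mailbox) and handled one at a time, so the state needs no locks.
  final class Counter {
    private val mailbox = Executors.newSingleThreadExecutor()
    // Written only by the mailbox thread; volatile so other threads can read it.
    @volatile private var nMessages = 0

    // Send and forget: enqueue the message and return immediately.
    def !(msg: String): Unit =
      mailbox.execute(new Runnable { def run(): Unit = nMessages += 1 })

    // Process everything still queued, then report the final state.
    def stopAndCount(): Int = {
      mailbox.shutdown()
      mailbox.awaitTermination(10, TimeUnit.SECONDS)
      nMessages
    }
  }

  def main(args: Array[String]): Unit = {
    val counter = new Counter
    (1 to 100).foreach(i => counter ! s"event $i")
    println(counter.stopAndCount())
  }
}
```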

Page 29: Ruslan.shevchenko: most functional-day-kiev 2014

Akka

• tell(x: Any) => Unit: send and forget
    actor ! x
• ask(x: Any) => Future[Any]: send and receive a future for the answer
    actor ? x

Page 30: Ruslan.shevchenko: most functional-day-kiev 2014

Actor

  class EventsProcessor extends Actor {
    var nMessages = 0

    def receive = {
      case Event(msg) => println(msg)
                         nMessages += 1
      case Echo(msg)  => sender ! Echo(msg)
      case Ping       => sender ! Pong
      case Stop       => context.stop(self)
    }
  }

Page 31: Ruslan.shevchenko: most functional-day-kiev 2014

Akka

• Actor supervision (restart on failure)
• Utilities (Scheduler, EventBus, …)
• Common patterns
  • Load balancing
  • Throttling messages
  • …

Page 32: Ruslan.shevchenko: most functional-day-kiev 2014

Akka

• Scales on a cluster
• Persistent queue / actor state
• Optional monitoring console (commercial)

Page 33: Ruslan.shevchenko: most functional-day-kiev 2014

Akka : Differences from Erlang Model

• No blocking inside actors. [Use an additional thread pool.]
• The scheduler switches to another actor after processing <N> messages (<N> instructions in Erlang).
• Supervision is more robust.

Page 34: Ruslan.shevchenko: most functional-day-kiev 2014

Go-like channels

• Come from the Go language
  • http://golang.org/

Page 35: Ruslan.shevchenko: most functional-day-kiev 2014

Go-like channels

• Bounded queue
• Coroutines (different threads) can
  • write to the queue (wait if full)
  • read from the queue (wait if empty)
  • select: wait for one of several possible operations
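The blocking behaviour of such a channel can be imitated with a plain bounded queue from `java.util.concurrent`. This sketch uses threads rather than coroutines, has no select, and is unrelated to any channel library's actual API; `transfer` is a hypothetical name.

```scala
import java.util.concurrent.ArrayBlockingQueue

object ChannelDemo {
  def transfer(): List[Int] = {
    // The "channel": a bounded queue with room for two elements.
    val channel = new ArrayBlockingQueue[Int](2)

    val producer = new Thread(new Runnable {
      // put blocks while the queue is full.
      def run(): Unit = (1 to 5).foreach(i => channel.put(i))
    })
    producer.start()

    // take blocks while the queue is empty.
    val received = List.fill(5)(channel.take())
    producer.join()
    received
  }

  def main(args: Array[String]): Unit = println(transfer())
}
```

With a single producer and a single consumer, elements arrive in the order they were written.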

Page 36: Ruslan.shevchenko: most functional-day-kiev 2014

Go-like channels

• Implemented as a library on top of Akka and SIP22
• Fully async (blocking reads must be inside an async block)
• (not ready yet) ;))) https://github.com/rssh/scala-gopher, branch "async/unsugared"

Page 37: Ruslan.shevchenko: most functional-day-kiev 2014

Rx streams

• Reactive Extensions
  • http://rxscala.github.io/
• An Rx collection calls you.

Page 38: Ruslan.shevchenko: most functional-day-kiev 2014

Rx scala

  trait Observer[E] {
    def onNext(e: E)
    def onError(e: Throwable)
    def onCompleted()
  }

  trait Observable[E] {
    def subscribe(o: Observer[E])
    ……….
    map, flatMap, zip, filter …
  }

Page 39: Ruslan.shevchenko: most functional-day-kiev 2014

Rx scala: Observer/Iterator duality

  trait Observer[-E] {
    def onNext(e: E)
    def onError(e: Throwable)
    def onCompleted()
  }

  trait Iterator[+E] {
    def next: E   // throw ….
    def hasNext()
  }
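The duality can be made concrete by driving an Observer from an Iterator: every pull (`hasNext`, `next`, an exception) turns into a push (`onNext`, `onCompleted`, `onError`). The trait is redeclared locally so the sketch is self-contained; `push` and `collect` are hypothetical helpers, not RxScala API.

```scala
import scala.collection.mutable.ListBuffer

object ObserverDemo {
  trait Observer[-E] {
    def onNext(e: E): Unit
    def onError(e: Throwable): Unit
    def onCompleted(): Unit
  }

  // Pull from the iterator and push each result into the observer.
  def push[E](source: Iterator[E], observer: Observer[E]): Unit =
    try {
      while (source.hasNext) observer.onNext(source.next())
      observer.onCompleted()
    } catch { case e: Exception => observer.onError(e) }

  // Records, in order, every call the observer receives.
  def collect[E](source: Iterator[E]): List[String] = {
    val log = ListBuffer.empty[String]
    push(source, new Observer[E] {
      def onNext(e: E): Unit = { log += s"next $e" }
      def onError(e: Throwable): Unit = { log += s"error ${e.getMessage}" }
      def onCompleted(): Unit = { log += "completed" }
    })
    log.toList
  }

  def main(args: Array[String]): Unit = println(collect(Iterator(1, 2)))
}
```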

Page 40: Ruslan.shevchenko: most functional-day-kiev 2014

Rx streams

• Use them when the event source provides this format.
• Typical pattern: collect the event sources into your collection via high-level operations, then process it.

Page 41: Ruslan.shevchenko: most functional-day-kiev 2014

Computation plan DSL

• Collection operations can be distributed
• Simple form: .par

    for (x <- collection.par) yield x+1

• Same idea for Hadoop map/reduce

Page 42: Ruslan.shevchenko: most functional-day-kiev 2014

Computation plan DSL for Hadoop

• Scalding
  • https://github.com/twitter/scalding
• Scoobi
  • http://nicta.github.io/scoobi/
• Scrunch
  • http://crunch.apache.org/scrunch.html

Page 43: Ruslan.shevchenko: most functional-day-kiev 2014

Scoobi

  val lines = fromTextFile("hdfs://in/...")

  val counts = lines.mapFlatten(_.split(" "))
                    .map(word => (word, 1))
                    .groupByKey
                    .combine(Sum.int)

  counts.toTextFile("hdfs://out/…", overwrite=true).persist(ScoobiConfiguration())

map, groupByKey, combine => Map/Reduce tasks
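For comparison, the same word-count pipeline can be written against ordinary in-memory Scala collections. This sketch involves no Hadoop and no Scoobi; `count` is a hypothetical helper, and the comments only indicate which Scoobi step each line mirrors.

```scala
object WordCountDemo {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))      // mirrors mapFlatten
      .map(word => (word, 1))     // one (word, 1) pair per occurrence
      .groupBy(_._1)              // mirrors groupByKey
      .map { case (word, pairs) => word -> pairs.map(_._2).sum } // mirrors combine(Sum.int)

  def main(args: Array[String]): Unit =
    println(count(Seq("a b a", "b a")))
}
```

The shape of the code is the same; what changes in the distributed case is that the library turns these calls into a plan of Map/Reduce tasks instead of running them immediately.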

Page 44: Ruslan.shevchenko: most functional-day-kiev 2014

Mahout: computation plan DSL for Spark

• https://mahout.apache.org/
• Scalable machine learning library
• R-like matrix operations
• Optimizer for algebraic expressions

Page 45: Ruslan.shevchenko: most functional-day-kiev 2014

Mahout

  // R-like operations (linear algebra)
  val g = bt.t %*% bt - c - c.t + (s_q cross s_q) * (xi dot xi)

  drmA.mapBlock(ncol = r) {
    case (keys, blockA) =>
      val blockY = blockA %*% Matrices.symmetricUniformView(n, r, omegaSeed)
      keys -> blockY
  }

Math operations => computation plans on a Spark cluster

Page 46: Ruslan.shevchenko: most functional-day-kiev 2014

Mahout

  val inCoreA = dense(
    (1, 2, 3, 4),
    (2, 3, 4, 5),
    (3, -4, 5, 6),
    (4, 5, 6, 7),
    (8, 6, 7, 8)
  )
  val A = drmParallelize(inCoreA, numPartitions = 2)

  val inCoreB = drmB.collect

In-core => out-of-core transformation

Page 47: Ruslan.shevchenko: most functional-day-kiev 2014

Scala: organization of control flow

• Many styles. No single one is the best.
• Low-level: futures & callbacks
• Middle-level: actors, channels, streams
• High-level: declarative DSLs

Page 48: Ruslan.shevchenko: most functional-day-kiev 2014

Scala: organization of control flow

• Possibilities:
  • Flexible syntax
  • Call-by-name
  • Macros
• Limitations:
  • JVM
  • Complex language constructions (hard to change structure)

Page 49: Ruslan.shevchenko: most functional-day-kiev 2014

Scala: organization of control flow

• Thanks for your attention.
• Questions (?)
• // Ruslan Shevchenko, @rssh1
• GoSave, inc.