TaskFlow Y! + HP brownbag

Joshua Harlow (Yahoo!) and Min Pae (HP): TaskFlow and OpenStack

Transcript of TaskFlow Y! + HP brownbag

Page 1: TaskFlow Y! + HP brownbag

TaskFlow and OpenStack

Joshua Harlow (Yahoo!)

Min Pae (HP)

Page 2: TaskFlow Y! + HP brownbag

Who are we

● Joshua Harlow
  ○ Yahoo! dev. for ~7 years
  ○ OpenStack dev. for ~3.5 years
  ○ Master Trouble-maker
  ○ Oslo, kazoo, anvil, taskflow, cloudinit… more …

● Min Pae
  ○ HP dev. for ~7 months
  ○ OpenStack dev. for ~7 months
  ○ Lead spell checker
  ○ Cue, taskflow, automaton… period ...

Page 3: TaskFlow Y! + HP brownbag

The problem

- Distributed systems are complex
  - Scale out, resumption, resiliency, HA, visibility into active work … are not easily solvable problems (some learn this the hard way)

- Understanding your states and workflows (and managing, transitioning and running them) is key to solving many of these complex problems

Page 4: TaskFlow Y! + HP brownbag

Taskflow does ...

- Declarative workflows

- Persisted execution state (checkpoints)

- Automatic migration of workflows/jobs

- Horizontal scalability

- Magic!

Page 5: TaskFlow Y! + HP brownbag

Taskflow is ...

- Atom (task and retry execution units)

- Flow (composition unit)

- Engine (work execution <-> persistence)

- Job / Jobboard (work discovery/ownership unit)

- Conductor (‘conducts’ automated discovery/ownership, flow construction and execution)

Page 6: TaskFlow Y! + HP brownbag

Taskflow - Atom: Task

- Execution unit

- Has
  - dependencies (“requires”)
  - data (“provides”)

- Defines
  - execute(...) - business logic
  - revert(...) - exception handler

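import sys
import time

from taskflow import task

class TakeABottleDown(task.Task):

    # Business logic: the return value is what this task provides.
    def execute(self, bottles_left):
        sys.stdout.write('Take one down, ')
        sys.stdout.flush()
        time.sleep(TAKE_DOWN_DELAY)
        return bottles_left - 1

    # Called to undo the work when the flow is reverted.
    def revert(self, **kwargs):
        …

class PassItAround(task.Task):
    …

class Conclusion(task.Task):
    ...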
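The execute/revert contract on this slide can be illustrated without the library. What follows is a minimal sketch, not TaskFlow's implementation: `ToyTask` and `run_all` are hypothetical names, and the sketch assumes that only tasks which completed are reverted, in reverse order, once a later task fails.

```python
class ToyTask:
    """Hypothetical stand-in for a task: execute() does work, revert() undoes it."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def execute(self, log):
        if self.fail:
            raise RuntimeError(f"{self.name} failed")
        log.append(f"execute {self.name}")

    def revert(self, log):
        log.append(f"revert {self.name}")


def run_all(tasks):
    """Run tasks in order; on failure, revert the completed ones in reverse."""
    log, done = [], []
    for t in tasks:
        try:
            t.execute(log)
            done.append(t)
        except RuntimeError:
            # Assumption of this sketch: undo only what already succeeded.
            for finished in reversed(done):
                finished.revert(log)
            break
    return log


log = run_all([ToyTask("take"), ToyTask("pass"), ToyTask("conclude", fail=True)])
```

Here `log` records two executions followed by two reverts in the opposite order, which is the "exception handler" role the slide assigns to revert(...).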

Page 7: TaskFlow Y! + HP brownbag

Taskflow - Atom: Retry

- Controls retry semantics of associated flow (and subflows and …)

- Has
  - dependencies (“requires”)
  - data (“provides”)

- Defines
  - execute(...) - business logic
  - revert(...) - exception handler
  - on_failure(...) - decision maker that affects retry semantics

from taskflow import retry

class Retry1(retry.Retry):

    def execute(self, param1):
        print(param1)
        return param1 + ' printed'

    def revert(self, **kwargs):
        print("reverting...")

    # Decide what happens after a failure in the associated flow.
    def on_failure(self, **kwargs):
        if self.attempts < 5:
            return retry.RETRY
        else:
            return retry.REVERT_ALL
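To make the decision loop around on_failure(...) concrete, here is a small library-free sketch. It assumes a simple model in which the controller is consulted after every failed attempt; `RETRY`, `REVERT_ALL`, `on_failure` and `run_with_retry` below are local stand-ins defined for this example, not the TaskFlow objects of the same name.

```python
RETRY, REVERT_ALL = "RETRY", "REVERT_ALL"  # stand-in constants for this sketch


def on_failure(attempts, max_attempts=5):
    """Same shape as the slide's decision: retry until the budget is used up."""
    return RETRY if attempts < max_attempts else REVERT_ALL


def run_with_retry(work, max_attempts=5):
    """Call work(attempt) until it succeeds or on_failure says to give up."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return ("SUCCESS", work(attempts), attempts)
        except Exception:
            if on_failure(attempts, max_attempts) == REVERT_ALL:
                return ("REVERTED", None, attempts)


def flaky(attempt):
    # Fails on the first two attempts, then succeeds.
    if attempt < 3:
        raise IOError("transient")
    return "ok"


def always_fails(attempt):
    raise IOError("permanent")


recovered = run_with_retry(flaky)
gave_up = run_with_retry(always_fails)
```

`recovered` succeeds on the third attempt, while `gave_up` stops after five attempts with a revert decision.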

Page 8: TaskFlow Y! + HP brownbag

Taskflow - Flow

- Composition of Tasks

- Defines transitions between Tasks

- Allows implicit and explicit dependencies

- Required methods(?)
  - add(...) - add (and link) task(s), flow(s)
  - iter_links(...) - iterator over the created links (links are created during add)

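from taskflow.patterns import linear_flow

# A linear flow runs its tasks in the order they were added.
s = linear_flow.Flow('bottle-song')

take_bottle = TakeABottleDown(...)
pass_it = PassItAround(...)
next_bottles = Conclusion(...)

s.add(take_bottle, pass_it, next_bottles)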
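The "implicit dependencies" idea can be sketched as matching what one atom provides against what a later atom requires. This is only an illustration under that assumption: `infer_links` and the atom names are hypothetical, and the real linking rules depend on the flow pattern in use.

```python
def infer_links(atoms):
    """atoms: list of (name, requires, provides) tuples, in the order added.

    Returns (producer, consumer, symbol) links, created as each atom is added.
    """
    producers = {}  # symbol -> name of the atom that most recently provided it
    links = []
    for name, requires, provides in atoms:
        for symbol in requires:
            if symbol in producers:
                links.append((producers[symbol], name, symbol))
        for symbol in provides:
            producers[symbol] = name
    return links


atoms = [
    ("fetch", (), ("raw",)),
    ("parse", ("raw",), ("rows",)),
    ("report", ("rows", "raw"), ()),
]
links = infer_links(atoms)
```

Each link names the producing atom, the consuming atom and the shared symbol, which is roughly the information an iterator over links would expose.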

Page 9: TaskFlow Y! + HP brownbag

Taskflow - Engine(s)

- Run flows (and associated tasks) to completion
  - Decompose flows into a DAG
    - Edge dependencies mandated by flow(s) patterns are always retained
  - Prepare persistence layer
  - Run tasks/retries as they are ready
    - Optionally in parallel (and/or remotely)...
  - Save and fetch results from persistence layer and run next tasks/retries (and repeat)

- State machine based:
  - http://docs.openstack.org/developer/taskflow/states.html#engine
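The engine loop on this slide (decompose into a DAG, run nodes as they become ready, save results, repeat) can be sketched with the standard library. This is a toy model, not TaskFlow's engine: `run_engine`, the `storage` dict standing in for the persistence layer, and the `crash_after` switch are all assumptions made for the example. It needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_engine(graph, funcs, storage, crash_after=None):
    """graph maps node -> set of predecessor nodes; storage holds saved results."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    executed = []
    while sorter.is_active():
        for node in sorted(sorter.get_ready()):
            if node not in storage:  # no checkpoint yet, so the node must run
                if crash_after is not None and len(executed) >= crash_after:
                    raise RuntimeError("simulated crash")
                inputs = {dep: storage[dep] for dep in graph[node]}
                storage[node] = funcs[node](inputs)  # save the result
                executed.append(node)
            sorter.done(node)  # unblocks the node's successors
    return executed


graph = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
funcs = {
    "a": lambda i: 1,
    "b": lambda i: i["a"] + 1,
    "c": lambda i: i["a"] * 10,
    "d": lambda i: i["b"] + i["c"],
}

storage = {}
try:
    run_engine(graph, funcs, storage, crash_after=2)
except RuntimeError:
    pass  # storage keeps the checkpoints written before the crash
resumed = run_engine(graph, funcs, storage)
```

The second call only executes the nodes that had no saved result, which is the resumption behaviour that persisted execution state is meant to enable.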

Page 10: TaskFlow Y! + HP brownbag

Taskflow - Job(s) & Jobboard(s)

- Place where work can be posted by producer entities and consumed/owned (and worked on) by other consumer entities

- Similar to a job queue but builds in liveness semantics/capabilities (and semantics expect single ownership via a claim concept)
  - If an owner of a unit of work dies, the claim on the work they are performing is automatically lost and the work is freed up for others

- Typically tied to a unit of work (being a flow) and its optional persistence location (so that prior work can be resumed)
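The claim and liveness semantics can be modelled in a few lines. The sketch below is an assumption-heavy toy: `ToyJobboard` is a hypothetical class, and it models liveness with a heartbeat timeout and an explicit `now` argument, which is a choice made for this example rather than a description of how TaskFlow's jobboard backends detect a dead owner.

```python
class ToyJobboard:
    """Single-ownership job table whose claims lapse when the owner goes silent."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.jobs = {}    # job name -> details
        self.claims = {}  # job name -> (owner, time of last heartbeat)

    def post(self, job, details):
        self.jobs[job] = details

    def _drop_stale_claims(self, now):
        for job, (_owner, seen) in list(self.claims.items()):
            if now - seen > self.ttl:
                del self.claims[job]  # owner presumed dead, job is free again

    def claim(self, job, owner, now):
        self._drop_stale_claims(now)
        if job not in self.jobs or job in self.claims:
            return False
        self.claims[job] = (owner, now)
        return True

    def heartbeat(self, job, owner, now):
        if job in self.claims and self.claims[job][0] == owner:
            self.claims[job] = (owner, now)


board = ToyJobboard(ttl=5)
board.post("resize-vm", {"flow": "resize"})
first = board.claim("resize-vm", "conductor-a", now=0)
second = board.claim("resize-vm", "conductor-b", now=1)
board.heartbeat("resize-vm", "conductor-a", now=4)
while_alive = board.claim("resize-vm", "conductor-b", now=8)
after_silence = board.claim("resize-vm", "conductor-b", now=10)
```

Only one owner holds the job at a time; once conductor-a stops heartbeating for longer than the TTL, conductor-b's claim succeeds.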

Page 11: TaskFlow Y! + HP brownbag

Taskflow - Conductor

● Essentially an advanced/specialized job processor
  - Connects to a jobboard
  - Periodically fetches contents of jobboard
  - Attempts to claim a job
  - Constructs the job's work (flow, other...)
  - Performs the job's work (using engines of various types and persistence backends to enable reliability)
  - Removes job (on completion)
  - (rinse and repeat)

● Expected to be scaled out (run as many conductors as needed/desired)
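The conductor steps listed on this slide map onto a short loop. This is a simplified, single-process sketch: `conduct`, the shared `jobs` and `claims` dicts, and the `factories` table are hypothetical names for this example and stand in for a real jobboard and flow factories.

```python
def conduct(name, jobs, claims, factories, results):
    """One pass of a toy conductor over a shared job table."""
    for job_id in sorted(jobs):  # fetch the current contents of the board
        if claims.setdefault(job_id, name) != name:
            continue  # already claimed by another conductor
        job = jobs[job_id]
        work = factories[job["kind"]]           # construct the job's work
        results[job_id] = work(**job["args"])   # perform it
        del jobs[job_id]                        # remove the job on completion
        del claims[job_id]


factories = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}
jobs = {
    "job-1": {"kind": "add", "args": {"a": 1, "b": 2}},
    "job-2": {"kind": "upper", "args": {"s": "hi"}},
}
claims = {"job-2": "conductor-b"}  # job-2 is already owned elsewhere
results = {}

conduct("conductor-a", jobs, claims, factories, results)
after_a = dict(results)
conduct("conductor-b", jobs, claims, factories, results)
```

Running more conductors means running more copies of this loop against the same board; the claim check is what keeps two of them from doing the same job.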

Page 12: TaskFlow Y! + HP brownbag

Why would you want this?

Page 13: TaskFlow Y! + HP brownbag

Wherefore Taskflow?

- Jobs and Jobboards provide work ownership and work discovery
  - Horizontal scaling via conductors
  - Automatic migration of work between conductors

- Persistence of execution state enables resumption and automated ownership transfer
  - When a conductor fails, its in-progress job(s) are picked up (and resumed from the last checkpoint) by the next worker that frees up; there is no need to wait for the failed worker to come back.

- Turn your software off safely and handle failures gracefully!

Page 14: TaskFlow Y! + HP brownbag

Wherefore Taskflow? (cont.)

- Declarative definition of work
  - Decouples what (Task, Flow) from how (Engine)
  - Coroutines are not separable from the surrounding code, and cannot be automatically parallelized

- Separation of declaration and execution allows flexibility in execution strategy
  - Engine tracks execution state and transitions
  - Parallel (green)threaded execution…
  - Remote worker execution…
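A small sketch of what "decoupling what from how" buys: the same declared list of steps is handed to two different runners. Everything here (`steps`, `run_serial`, `run_parallel`) is hypothetical and assumes the steps are independent of each other; it is not how TaskFlow engines are selected.

```python
from concurrent.futures import ThreadPoolExecutor

# The "what": a declaration of independent steps, with no execution policy.
steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]


def run_serial(steps, value):
    """The "how", variant 1: one step after another."""
    return [step(value) for step in steps]


def run_parallel(steps, value):
    """The "how", variant 2: all steps submitted to a thread pool."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = [pool.submit(step, value) for step in steps]
        return [f.result() for f in futures]  # results kept in declared order


serial = run_serial(steps, 3)
parallel = run_parallel(steps, 3)
```

Because the declaration never mentions threads, swapping the execution strategy does not require touching the steps themselves.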

Page 15: TaskFlow Y! + HP brownbag

Wherefore Taskflow? (cont.)

- Not strongly tied to Python as a language (for better or worse); concepts are easily transferable to Java/Go/….

- À la carte: use what you want
  - Use the basics until you are ready to use jobboards, or select a local engine until you are ready to run remote workers…

Page 16: TaskFlow Y! + HP brownbag

Wherefore Taskflow? (cont.)

Page 17: TaskFlow Y! + HP brownbag

Wherefore Taskflow? (cont.)

Notifications

Remote task workers

Dynamic flow modification

Real time dashboard of atom/flow/job transitions (WIP)

Applications that can be paused

DDOS your favorite site (joke)

The potential is nearly limitless!!

Page 18: TaskFlow Y! + HP brownbag

DEMO

Page 19: TaskFlow Y! + HP brownbag

?? Questions ??

Page 20: TaskFlow Y! + HP brownbag

More information!

- High level (overview)
  - https://wiki.openstack.org/wiki/TaskFlow
  - https://wiki.openstack.org/wiki/TaskFlow#Big_picture

- Developer oriented (more detail)
  - http://docs.openstack.org/developer/taskflow/

- Extreme!! developer oriented (ultra detail)
  - Freenode
    - #openstack-state-management
    - #openstack-oslo
  - ML: [email protected]

- Moar examples!