How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices...

35
How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making state-of-the-art algorithms discoverable and accessible to everyone Full-Spectrum Developer & Advocate [email protected] @peckjon Algorithmia.com

Transcript of How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices...

Page 1: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

How Microservices and Serverless ComputingEnable the Next Gen of Machine Intelligence

Jon Peck

Making state-of-the-art algorithmsdiscoverable and accessible to everyone

Full-Spectrum Developer & Advocate

[email protected]@peckjon

Algorithmia.com

Page 2: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

2

The Problem: ML is in a huge growth phase,

difficult/expensive for DevOps to keep up

Initially:

● A few models, a couple frameworks, 1-2 languages● Dedicated hardware or VM Hosting● IT Team for DevOps

● High time-to-deploy, manual discoverability● Few end-users, heterogenous APIs (if any)

Pretty soon...● > 5,000 algorithms (50k versions) on many runtimes / frameworks

● > 60k algorithm developers: heterogenous, largely unpredictable● Each algorithm: 1 to 1,000 calls/second, a lot of variance

● Need auto-deploy, discoverability, low (15ms) latency● Common API, composability, fine-grained security

Page 3: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

3

The Need: an “Operating System for AI”AI/ML scalable infrastructure on demand + marketplace

● Function-as-a-service for Machine & Deep Learning

● Discoverable, live inventory of AI via APIs

● Anyone can contribute & use

● Composable, Monetizable

● Every developer on earth can make their app intelligent

Page 4: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

An Operating System for AIWhat did the evolution of OS look like?

iOS/AndroidBuilt-in App Store(Discoverability)

Punch Cards1970s

UnixMulti-tenancy, Composability

DOSHardware Abstraction

GUI (Win/Mac)Accessibility

4

General-purpose computing had a long evolution, as we learned what the common problems were / what abstractions to build. AI is in the earlier stages of that evolution.

An Operating System:

• Provides common functionality needed by many programs• Standardizes conventions to make systems easier to work with• Presents a higher level abstraction of the underlying hardware

Page 5: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Use CaseJian Yang made an app to recognize food “SeeFood”

© H BO A ll R ights R eserved 5

Page 6: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Use CaseHe deployed his trained model to a GPU-enabled server

GPU-enabled Server

?

6

Page 7: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Use CaseThe app is a hit!

SeeFoodP ro d u c t iv i ty

7

Page 8: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

??

Use Case… and now his server is overloaded.

GPU-enabled Server

?

xN

8

Page 9: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

• Two distinct phases: training and inference

• Lots of processing power

• Heterogenous hardware (CPU, GPU, FPGA, TPU, etc.)

• Limited by compute rather than bandwidth

• “Tensorflow is open source, scaling it is not.”

Characteristics of AI

9

Page 10: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

10

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

Single user

Page 11: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

TRAINING

11

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

Single user

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 12: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

12

INFERENCE

Short com pute bursts

OWNER: DevOps

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

M ultip le usersSingle user

State less

Elastic

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 13: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

13

INFERENCE

Short com pute bursts

OWNER: DevOps

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

M ultip le usersSingle user

State less

Elastic

Analogous to an OS. Running concurrent models

requires task scheduling.

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 14: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

14

INFERENCE

Short com pute bursts

OWNER: DevOps

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

M ultip le usersSingle user

State less

Elastic

Containers

Analogous to an OS. Running concurrent models

requires task scheduling.

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 15: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

15

INFERENCE

Short com pute bursts

OWNER: DevOps

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

OWNER: Data Scientists

M ultip le usersSingle user

State less

Elastic

Containers Kubernetes

Analogous to an OS. Running concurrent models

requires task scheduling.

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 16: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

16

INFERENCE

Short com pute bursts

State less

Elastic

M ultip le users

Containers Kubernetes

OWNER: DevOps

TRAINING

Long com pute cycle

Fixed load (Inelastic)

Statefu l

S ingle user

OWNER: Data Scientists

Analogous to an OS. Running concurrent models

requires task scheduling.

Analogous to dev tool chain.Building and iterating over a

model is similar to building an app.

Metal or VM

Page 17: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

MICROSERVICES: the design of a system as independently deployable, loosely coupled services.

Microservices & Serverless Computing => ML Hosting

ADVANTAGES

• Maintainable, Scalable• Software & Hardware Agnostic• Rolling deployments

SERVERLESS: the encapsulation, starting, and stopping of singular functions per request, with a just-in-time-compute model.

ADVANTAGES

• Elasticity, Cost Efficiency• Concurrency• Improved Latency

+ +17

Page 18: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Why Serverless - Cost EfficiencyC

alls

per

Sec

ond

Max calls/s

Avg calls/s

40

35

30

25

20

15

10

5

GP

U S

erve

r In

stan

ces

12AM

02AM

04AM

06AM

08AM

10AM

12PM

02PM

04PM

06PM

08PM

10PM

160

140

120

100

80

60

40

20

Jian Yang’s “SeeFood” is most active during lunchtime.

18

Page 19: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Traditional Architecture - Design for MaximumC

alls

per

Sec

ond

Max calls/s

Avg calls/s

40

35

30

25

20

15

10

5

12AM

02AM

04AM

06AM

08AM

10AM

12PM

02PM

04PM

06PM

08PM

10PM

40 machines 24 hours. $648 * 40 = $25,920 per month

GP

U S

erve

r In

stan

ces

160

140

120

100

80

60

40

20

19

Page 20: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Autoscale Architecture - Design for Local MaximumC

alls

per

Sec

ond

Max calls/s

Avg calls/s

40

35

30

25

20

15

10

5

12AM

02AM

04AM

06AM

08AM

10AM

12PM

02PM

04PM

06PM

08PM

10PM

19 machines 24 hours. $648 * 40 = $12,312 per month

GP

U S

erve

r In

stan

ces

160

140

120

100

80

60

40

20

20

Page 21: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Serverless Architecture - Design for MinimumC

alls

per

Sec

ond

Max calls/s

Avg calls/s

40

35

30

25

20

15

10

5

12AM

02AM

04AM

06AM

08AM

10AM

12PM

02PM

04PM

06PM

08PM

10PM

Avg. of 21 calls / sec, or equivalent of 6 machines. $648 * 6 = $3,888 per month

160

140

120

100

80

60

40

20

GP

U S

erve

r In

stan

ces

21

Page 22: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

??

Why Serverless - Concurrency

GPU-enabled Servers

?

Lo

ad

Ba

lan

ce

r

22

Page 23: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Why Serverless - Improved LatencyPortability = Low Latency

23

Page 24: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

24

+ +

Almost there! We also need:

GPU Memory Management, Job Scheduling, Cloud Abstraction,

Discoverability, Authentication, Logging, etc.

Page 25: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

25

Elastic Scale

User

Web Load Balancer

API Load Balancer

Web Servers

API Servers

Cloud Region #1

Worker xN

D o c k e r ( a lg o r i th m # 1 )

. .

D o c k e r ( a lg o r i th m # n )

Cloud Region #2

Worker xN

D o c k e r ( a lg o r i th m # 1 )

. .

D o c k e r ( a lg o r i th m # n )

Page 26: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

26

Elastic Scaling with

Intelligent Orchestration

Knowing that:

● Algorithm A always calls Algorithm B● Algorithm A consumes X CPU, X Memory, etc● Algorithm B consumes X CPU, X Memory, etc

Therefore we can slot them in a way that:

● Reduce network latency● Increase cluster utilization● Build dependency graphs

FoodClassifier

FruitClassifier VeggieClassifier

Runtime Abstraction

Page 27: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

27

Composability

Composability is critical for AI workflows because of data

processing pipelines and ensembles.

Fruit or VeggieClassifier

FruitClassifier

VeggieClassifiercat file.csv | grep foo | wc -l

Page 28: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

28

Cloud Abstraction - Storage

# No storage abstraction

s3 = boto3.client("s3")

obj = s3.get_object(Bucket="bucket-name", Key="records.csv")

data = obj["Body"].read()

# With storage abstraction

data = client.file("blob://records.csv").get()

s3://foo/bar

blob://foo/bar

hdfs://foo/bardropbox://foo/bar

etc.

Page 29: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

29

Compute EC2 CE VM Nova

Autoscaling Autoscaling Group Autoscaler Scale Set Heat Scaling Policy

Load BalancingElastic Load

BalancerLoad Balancer Load Balancer LBaaS

Remote Storage Elastic Block Store Persistent Disk File Storage Block Storage

Partial Source: Sam Ghods, KubeConf 2016

Cloud Abstraction

Page 30: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

30

Runtime Abstraction

Support any program m ing language or fram ew ork, includ ing interoperability betw een m ixed stacks.

Elastic Scale

Prioritize and autom atically optim ize execution of concurrent short-lived jobs.

Cloud Abstraction

Provide portab ility to algorithm s, includ ing public clouds or private clouds.

Discoverability, Authentication, Instrumentation, etc.

Shell & Services

Kernel

An Operating System for AI: the “AI Layer”

Page 31: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

31

Discoverability: an App Store for AI

Page 32: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

32

Algorithmia’s OS for AI: discover a model

1. Discover a model

● AppStore-like interface

● Categorized, tagged, rated

● Well-described

(purpose, source, API)

Page 33: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

33

Algorithmia’s OS for AI: execute a model

2. Execute from any language

● Raw JSON, or lang stubs

● Common syntax

● Autoscaled elastic cloud-exec

● Secure, isolated

● Concurrent, orchestrated

● 15ms overhead

● Hardware agnostic

Page 34: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

34

Algorithmia’s OS for AI: add a model

3. Add new models

● Many languages, frameworks

● Instant JSON API

● Call other models seamlessly

(regardless of lang)

● Granular permissions

● GPU environments

● Namespaces & versioning

Page 35: How Microservices and Serverless Computing Enable the Next … · 2020-01-01 · How Microservices and Serverless Computing Enable the Next Gen of Machine Intelligence Jon Peck Making

Jon Peck Developer Advocate

Thank you!FREE STUFF

$50 free at Algorithmia.comsignup code: WOSC18

[email protected]@peckjon

Algorithmia.com WE ARE HIRING

algorithmia.com/jobs● Seattle or Remote● Bright, collaborative env● Unlimited PTO● Dog-friendly