SUMMIT - Amazon Web Services...“We use Amazon Rekognition to enrich our mapping content....

Post on 22-May-2020

SUMMIT BERLIN

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.SUMMIT

The Machine Learning Process: From Business Model to ML in Production

Constantin Gonzalez
Principal Solutions Architect
Amazon Web Services
glez@amazon.de | @zalez

What you’ll get out of this session

• An overview of the Machine Learning (ML) process

  • From business model

  • to data collection and processing

  • to ML training and deployment

  • … and all the way back

• A better understanding of how to apply ML to your business

• Examples from AWS customers and Amazon.com

Our mission at AWS: Put machine learning in the hands of every developer

Customers running machine learning on AWS

The Amazon Machine Learning stack

AI SERVICES

• Vision: Amazon Rekognition Image, Amazon Rekognition Video, Amazon Textract
• Speech: Amazon Polly, Amazon Transcribe
• Language: Amazon Translate, Amazon Comprehend
• Chatbots: Amazon Lex
• Forecasting: Amazon Forecast
• Recommendations: Amazon Personalize

ML SERVICES

• Amazon SageMaker: Ground Truth, Notebooks, Algorithms, Marketplace, Unsupervised Learning, Supervised Learning, Reinforcement Learning, Optimization (Neo), Training, Hosting, Deployment

ML FRAMEWORKS & INFRASTRUCTURE

• Frameworks, Interfaces, Infrastructure
• Amazon EC2 P3 & P3DN, Amazon EC2 C5, FPGAs, AWS Greengrass, Amazon Elastic Inference, Amazon Inferentia

Several items on the slide are marked NEW.

But how do I start?

Where should I apply Machine Learning?

What approach should I be taking?

The Machine Learning process

1. Business Problem – ML problem framing
2. Data Collection
3. Data Integration
4. Data Preparation & Cleaning
5. Data Visualization & Analysis
6. Feature Engineering
7. Model Training & Parameter Tuning
8. Model Evaluation
9. Are Business Goals met?
   • No: Data Augmentation, Feature Augmentation
   • Yes: Model Deployment
10. Monitoring & Debugging – Predictions
11. Re-training

The Amazon Flywheel

There’s a service for that …

The Amazon Flywheel: Product recommendations

There’s a service for that: Amazon Translate
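A minimal sketch of calling Amazon Translate from Python, assuming a boto3 Translate client is passed in (for example `boto3.client("translate")`). The function name and the default language codes are illustrative assumptions, not taken from the talk.

```python
def translate_text(translate_client, text: str, source: str = "en", target: str = "de") -> str:
    """Translate a piece of text with Amazon Translate.

    translate_client is assumed to be a boto3 Translate client,
    e.g. boto3.client("translate"); it is passed in so the function
    can also be exercised with a stub.
    """
    response = translate_client.translate_text(
        Text=text,
        SourceLanguageCode=source,
        TargetLanguageCode=target,
    )
    # The translated string is returned under the "TranslatedText" key.
    return response["TranslatedText"]
```

Passing the client in keeps credentials and region configuration outside the function.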

The Amazon Flywheel: Automatic translation

Amazon Forecast

The Amazon Flywheel: Supply chain optimization

The Amazon Flywheel: ML opportunities

Build your own Flywheel

Checklist

1. Create a flywheel model of your business

2. Find opportunities to add value through ML

   • Save cost through better planning

   • Save cost through automating human work

   • Increase revenue through delivering a better product/service

   • Increase revenue through better customer experience

Common ML use cases for some verticals

• Retail: Supply chain and demand forecasting

• Financial services: Credit default prediction for customer behavior

• Manufacturing: Real-time predictions for industrial IoT

• Advertising: Predict click-through rate for targeted ads

• Media and branding: Prediction of language content quality

• Automotive innovation: Self-driving vehicles and simulation

• Health and wellness: Track disease progression

There is more data than people think

• Data grows >10x every 5 years
• Data platforms need to live for 15 years
• Data platforms need to scale 1,000x
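The 1,000x figure follows from compounding: 15 years is three 5-year periods, each with at least 10x growth. A small sketch of that arithmetic, using 10x as the lower bound of ">10x" (the function name is hypothetical):

```python
def required_scale(growth_per_period: float, period_years: float, lifetime_years: float) -> float:
    """Compound growth a data platform must absorb over its lifetime."""
    periods = lifetime_years / period_years  # 15 / 5 = 3 periods
    return growth_per_period ** periods      # 10 ** 3


# 10x every 5 years, compounded over a 15-year platform lifetime
scale = required_scale(10, 5, 15)
print(f"Required scale: {scale:,.0f}x")
```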

Classic

• Traditional databases, data warehousing
• Historical data, logs, archives
• Big data

Industry

• Production facilities
• Control devices
• IoT sensors

Web/App

• Websites, web apps, mobile apps
• Enterprise applications (BI, CRM, etc.)
• External data (partners, weather, traffic, etc.)

Data Lake

• Data sources: Web/App, Classic, Industry
• Ingestion: Streaming, Import/Batch
• Lake services: Encryption, Data catalog, classification, Access control
• Processing: ETL, Pre-processing, Machine learning
• Consumers: Monitoring, Control, Applications, API

Building a data lake

Data Lake (Streaming, Import/Batch)

On-premises Data Movement

• AWS Direct Connect
• AWS Snowball
• AWS Snowmobile
• AWS Database Migration Service

Real-time Data Movement

• AWS IoT Core
• Amazon Kinesis Data Firehose
• Amazon Kinesis Data Streams
• Amazon Kinesis Video Streams

Analytics

• Amazon Athena
• Amazon EMR
• Amazon Redshift
• Amazon Elasticsearch Service
• Amazon Kinesis
• Amazon QuickSight

Machine Learning

• Amazon SageMaker
• Amazon Rekognition
• Amazon Comprehend
• Amazon Translate
• Amazon Transcribe
• Etc.
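As one illustration of real-time data movement into a data lake, the sketch below sends a single JSON event to a Kinesis Data Firehose delivery stream. It assumes a boto3 Firehose client is passed in (for example `boto3.client("firehose")`) and that a delivery stream targeting your S3 bucket already exists; the function name, stream name, and event fields are hypothetical.

```python
import json


def send_event(firehose_client, stream_name: str, event: dict) -> str:
    """Send one JSON event to a Kinesis Data Firehose delivery stream.

    firehose_client is assumed to be a boto3 Firehose client,
    e.g. boto3.client("firehose").
    """
    # Newline-delimited JSON keeps records separable once they land in S3.
    payload = (json.dumps(event) + "\n").encode("utf-8")
    response = firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )
    return response["RecordId"]
```

A call might look like `send_event(client, "sensor-stream", {"sensor": "s1", "temp": 21.5})`, where the stream name is a placeholder.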

There’s a service for that: AWS Lake Formation

• Lake Formation: Data Catalog, Access Control, Data import, Crawlers, ML-based data prep
• Data Lake Storage: Amazon S3

Use Amazon S3 as the storage layer for Lake Formation:

• Register existing S3 buckets that contain your data
• Ask Lake Formation to create required S3 buckets and import data into them
• Data is stored in your account: You have direct access. No lock-in.

Checklist

1. Create a flywheel model of your business

2. Find opportunities to add value through ML

3. Collect training data from your data lake

   • Production data

   • Web/app data

   • Legacy/archive data

Here.com uses Amazon Rekognition to enrich their mapping content

“We use Amazon Rekognition to enrich our mapping content. Rekognition’s Text in Image allows us to continually update signage information so our customers have the latest information at their fingertips. We look forward to continuing our partnership with AWS and implementing their computer vision solutions in more of our products.”

– Rajkumar Jain, Director of Engineering at HERE Technologies
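To give a feel for the Text in Image feature mentioned in the quote, here is a minimal sketch that reads text lines from an image stored in S3. It assumes a boto3 Rekognition client is passed in (for example `boto3.client("rekognition")`); the function name, the confidence threshold, and the bucket and key are hypothetical, and this is not a description of how HERE uses the service.

```python
from typing import List


def detect_sign_text(rekognition_client, bucket: str, key: str,
                     min_confidence: float = 90.0) -> List[str]:
    """Return the text lines Rekognition finds in an image stored in S3.

    rekognition_client is assumed to be a boto3 Rekognition client,
    e.g. boto3.client("rekognition").
    """
    response = rekognition_client.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Each detection is either a whole LINE or a single WORD;
    # keep only confident LINE results to avoid duplicates.
    return [
        d["DetectedText"]
        for d in response["TextDetections"]
        if d["Type"] == "LINE" and d["Confidence"] >= min_confidence
    ]
```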

Supervised Learning

Data: Cheap!

Labels: Expensive!

• Historical data
• Log files
• Transactions
• Data Warehouse

• Operational data
• Streaming
• IoT sensors

• Real humans
• Experts
• User feedback
• Hired “eyes and ears”

Images: Unsplash.com

New: Amazon SageMaker Ground Truth

Label machine learning training data easily and accurately

Launched 11/28

Ground Truth: Data labeling tasks

• Bounding boxes
• Image classification
• Semantic segmentation
• Text classification
• Custom tasks

Active learning and auto data labeling

• Input datasets
• Active Learning
• Auto Labeling and Human Labeling
• Label Consolidation
• Labeled datasets
• Machine Learning Models
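The idea behind the diagram can be sketched in a few lines: items the model labels with high confidence are auto-labeled, the rest go to human labelers, and several human answers for the same item are consolidated into one label. This is a simplified illustration under stated assumptions (a fixed confidence threshold and a plain majority vote); it does not describe Ground Truth's actual algorithms, and all names are hypothetical.

```python
from collections import Counter


def route_items(predictions, threshold=0.9):
    """Split model predictions into auto-labeled items and items for humans.

    predictions maps item id -> (predicted label, confidence in [0, 1]).
    """
    auto_labeled, needs_human = {}, []
    for item_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto_labeled[item_id] = label   # confident: accept the model's label
        else:
            needs_human.append(item_id)     # uncertain: send to human labelers
    return auto_labeled, needs_human


def consolidate(votes):
    """Consolidate several human labels for one item by majority vote.

    On a tie, the label that was seen first wins.
    """
    return Counter(votes).most_common(1)[0][0]


predictions = {"img1": ("cat", 0.97), "img2": ("dog", 0.55)}
auto, human = route_items(predictions)
print(auto, human, consolidate(["dog", "dog", "cat"]))
```

Human-labeled items can then be added to the training set, so that the model becomes confident on more items in the next round.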

Human workforce options

Public

• On-demand, 24x7 workforce
• 500,000+ independent contractors worldwide, powered by Amazon Mechanical Turk

Private

• You source your own workers, e.g. employees or contractors
• For data that needs to stay within your organization

Vendors

• Curated list of third-party vendors, specialized in labeling services, available via the AWS Marketplace

Supervised Learning

• Training: Data + Labels (X0,000 or more) → ML Algorithm → ML Model
• Prediction: New Data → ML Model → Prediction
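The flow on this slide (data plus labels go into an ML algorithm, which produces a model that predicts on new data) can be shown with a deliberately tiny example. The sketch below uses a nearest-centroid classifier as a stand-in algorithm; the choice of algorithm, the function names, and the toy data are all assumptions made for illustration.

```python
def train(data, labels):
    """ML algorithm: learn one centroid (mean point) per label."""
    sums, counts = {}, {}
    for x, y in zip(data, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    # The "ML model" is simply the table of centroids.
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Prediction: return the label whose centroid is closest to x."""
    def squared_distance(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, x))
    return min(model, key=lambda y: squared_distance(model[y]))


# Data and labels (a real project would need far more examples)
data = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (4.8, 5.2)]
labels = ["small", "small", "large", "large"]

model = train(data, labels)          # Data + Labels -> ML Algorithm -> ML Model
print(predict(model, (4.5, 5.0)))    # New Data -> ML Model -> Prediction
```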

Checklist

1. Create a flywheel model of your business

2. Find opportunities to add value through ML

3. Collect training data from your data lake

4. Collect labels for your training data

   • From historical data

   • From operations

   • From real humans

Amazon SageMaker: Build, train, and deploy ML models at scale

1. Collect and prepare training data; choose and optimize your ML algorithm
2. Set up and manage environments for training; train and tune ML models
3. Deploy models in production; scale and manage the production environment

Checklist

1. Create a flywheel model of your business

2. Find opportunities to add value through ML

3. Collect training data from your data lake

4. Collect labels for your training data

5. Build, train and deploy your ML model with Amazon SageMaker … or use a ready-to-run Amazon ML application service

6. Repeat from step 2

When should you consider using ML for a problem?

• Software too complex

• Manual process not cost effective

• Lots of training data available

• Easy to express in ML terms

Image: Unsplash

When is ML probably not a good idea?

• No data

• No labels

• Not a lot of time

• No tolerance for mistakes

Image: Unsplash

Summary

• Understand your business model

• Find bottlenecks, then ask:

  • Where can you add value with ML?

  • Where can you reduce waste with ML?

• Build your data lake

• Be competent with data

• Build your first ML solution

  • Leverage AWS managed ML services

  • Use Amazon SageMaker for building your own

• Iterate!

Related breakouts

From an Engineering Company to a Core ML Competency Company
Raul-Andrei Firu, Haufe Lexware Group, Wed., Feb. 27th, Hall 6

How to Innovate Your Software with ML?
4FriendsOnly.com, Can Do, fme and Haufe Group, Wed., Feb. 27th, Hall 4

Apache MXNet (Incubating) and Gluon: What's in it for You?
Constantin Gonzalez, AWS, Wed., Feb. 27th, Keynote Hall

What will you build?

Constantin Gonzalez
Principal Solutions Architect
Amazon Web Services
glez@amazon.de | @zalez

The Machine Learning process (technical view)

1. Collect and prepare training data
2. Choose and optimize your ML algorithm
3. Set up and manage environments for training
4. Train and tune model (trial and error)
5. Deploy model in production
6. Scale and manage the production environment

Amazon SageMaker

Easily build, train, and deploy machine learning models

1. Collect and prepare training data
2. Choose and optimize your ML algorithm
3. Set up and manage environments for training
4. Train and tune model (trial and error)
5. Deploy model in production
6. Scale and manage the production environment

Amazon SageMaker

Build

• Pre-built notebooks for common problems
• Built-in, high-performance algorithms

ALGORITHMS

• K-Means Clustering
• Principal Component Analysis
• Neural Topic Modelling
• Factorization Machines
• Linear Learner – Regression
• DeepAR
• Random Cut Forest
• XGBoost
• Latent Dirichlet Allocation
• Image Classification
• Seq2Seq
• Linear Learner – Classification
• BlazingText

FRAMEWORKS

• Apache MXNet
• TensorFlow
• Caffe2, CNTK, PyTorch, Torch

Remaining steps:

• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment

Amazon SageMaker

Build

• Pre-built notebooks for common problems
• Built-in, high-performance algorithms

Train

• One-click training
• Hyperparameter optimization

Deploy

• One-click deployment
• Fully managed hosting with auto-scaling
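Hyperparameter optimization automates the "trial and error" part of training: try several settings, score each resulting model, keep the best. The sketch below shows the idea with a plain grid search in local Python. It is not SageMaker's tuning method or API; `toy_score` is a hypothetical stand-in for a real training job that returns a validation score, and the parameter names are made up.

```python
from itertools import product


def grid_search(train_and_score, grid):
    """Try every combination in grid and return (best_params, best_score).

    train_and_score is any function that trains a model with the given
    hyperparameters and returns a score where higher is better.
    """
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)   # one "trial"
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, depth):
    """Hypothetical stand-in for a training job; peaks at (0.1, 4)."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2


best, score = grid_search(
    toy_score,
    {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]},
)
print(best)
```

Grid search grows quickly with the number of hyperparameters, which is one reason managed tuning services search the space more selectively.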