Post on 22-May-2020
SUMMIT BERLIN
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.SUMMIT
The Machine Learning Process: From Business Model to ML in Production
Constantin Gonzalez, Principal Solutions Architect, Amazon Web Services
glez@amazon.de | @zalez
What you’ll get out of this session
• An overview of the Machine Learning (ML) process
  • From business model
  • to data collection and processing
  • to ML training and deployment
  • … and all the way back
• A better understanding of how to apply ML to your business
• Examples from AWS customers and Amazon.com
Our mission at AWS
Put machine learning in the hands of every developer
Customers running machine learning on AWS
The Amazon Machine Learning stack
AI SERVICES
• Vision: Amazon Rekognition Image, Amazon Rekognition Video, Amazon Textract
• Speech: Amazon Polly, Amazon Transcribe
• Language: Amazon Translate, Amazon Comprehend
• Chatbots: Amazon Lex
• Forecasting: Amazon Forecast
• Recommendations: Amazon Personalize
ML SERVICES
• Amazon SageMaker: Ground Truth, Notebooks, Algorithms, Marketplace, Unsupervised Learning, Supervised Learning, Reinforcement Learning, Optimization (Neo), Training, Hosting, Deployment
ML FRAMEWORKS & INFRASTRUCTURE
• Frameworks, Interfaces, Infrastructure
• Amazon EC2 P3 & P3DN, Amazon EC2 C5, FPGAs, AWS Greengrass, Amazon Elastic Inference, Amazon Inferentia
(Several items are flagged NEW on the slide.)
But how do I start?
Where should I apply Machine Learning?
What approach should I be taking?
The Machine Learning process
• Business Problem – ML problem framing
• Data Collection
• Data Integration
• Data Preparation & Cleaning
• Data Visualization & Analysis
• Feature Engineering
• Model Training & Parameter Tuning
• Model Evaluation
• Are Business Goals met?
  • No: Data Augmentation, Feature Augmentation
  • Yes: Model Deployment – Predictions
• Monitoring & Debugging
• Re-training
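The "Are Business Goals met?" decision in the process above can be sketched as a simple gate on evaluation metrics. This is only an illustration and not part of the talk: the choice of precision and recall, the target values, and the function names are all assumptions.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def business_goals_met(y_true, y_pred, min_precision, min_recall):
    """Return True when both metrics reach the (hypothetical) business targets."""
    precision, recall = precision_recall(y_true, y_pred)
    return precision >= min_precision and recall >= min_recall


# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

if business_goals_met(y_true, y_pred, min_precision=0.7, min_recall=0.7):
    print("Yes: deploy the model")
else:
    print("No: augment data or features, then retrain")
```

In practice the targets would come from the business problem framing step, not from the model team.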
The Amazon Flywheel
There’s a service for that …
The Amazon Flywheel
Product recommendations
There’s a service for that: Amazon Translate
The Amazon Flywheel
Automatic translation
Amazon Forecast
The Amazon Flywheel
Supply chain optimization
The Amazon Flywheel
ML opportunity (marked at four points on the flywheel)
Build your own Flywheel
Checklist
1. Create a flywheel model of your business
2. Find opportunities to add value through ML
  • Save cost through better planning
  • Save cost through automating human work
  • Increase revenue through delivering a better product/service
  • Increase revenue through better customer experience
Common ML use cases for some verticals
• Retail: Supply chain and demand forecasting
• Financial services: Credit default prediction for customer behavior
• Manufacturing: Real-time predictions for industrial IoT
• Advertising: Predict click-through rate for targeted ads
• Media and branding: Prediction of language content quality
• Automotive innovation: Self-driving vehicles and simulation
• Health and wellness: Track disease progression
There is more data than people think
• Data grows >10x every 5 years
• Data platforms need to live for 15 years
• Data platforms need to scale 1,000x
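The 1,000x figure follows from compounding the growth rate over the platform lifetime: three 5-year periods at 10x each. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def growth_factor(factor_per_period, period_years, lifetime_years):
    """Compound a per-period growth factor over a platform lifetime."""
    periods = lifetime_years / period_years
    return factor_per_period ** periods


# 10x every 5 years, over a 15-year platform lifetime: 10 ** 3
required_scale = growth_factor(10, 5, 15)
print(f"{required_scale:,.0f}x")  # prints "1,000x"
```

Since the slide states growth of more than 10x per period, 1,000x is a lower bound on the required scale.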
Classic
• Traditional databases, data warehousing
• Historical data, logs, archives
• Big data
Industry
• Production facilities
• Control devices
• IoT sensors
Web/App
• Websites, web apps, mobile apps
• Enterprise applications (BI, CRM, etc.)
• External data (partners, weather, traffic, etc.)
Data lake architecture (diagram built up over several slides)
• Data sources: Web/App, Classic, Industry
• Ingestion: Streaming, Import/Batch
• Data Lake: encryption, data catalog, classification, access control
• Processing: ETL, pre-processing, machine learning
• Consumers: monitoring, control, applications, API
Building a data lake
• Data Lake, fed by Streaming and Import/Batch
• On-premises Data Movement: AWS Direct Connect, AWS Snowball, AWS Snowmobile, AWS Database Migration Service
• Real-time Data Movement: AWS IoT Core, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams
• Analytics: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight
• Machine Learning: Amazon SageMaker, Amazon Rekognition, Amazon Comprehend, Amazon Translate, Amazon Transcribe, etc.
There’s a service for that: AWS Lake Formation
• Lake Formation: data import, crawlers, ML-based data prep, Data Catalog, access control
• Data lake storage: Amazon S3
• Use Amazon S3 as the storage layer for Lake Formation
• Register existing S3 buckets that contain your data
• Ask Lake Formation to create required S3 buckets and import data into them
• Data is stored in your account: you have direct access. No lock-in.
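With Amazon S3 as the storage layer, one common way to keep data discoverable is to encode source and date in the object key. The sketch below is an assumption rather than anything prescribed by the talk or by Lake Formation: the `raw/` prefix, the partition names, and the helper function are hypothetical, and the `key=value` path segments follow the widely used Hive-style partitioning convention.

```python
from datetime import date


def data_lake_key(source, dataset, day, filename):
    """Build a Hive-style partitioned object key for a raw data file.

    All prefix and partition names here are illustrative assumptions.
    """
    return (
        f"raw/source={source}/dataset={dataset}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"{filename}"
    )


key = data_lake_key("webapp", "clickstream", date(2019, 2, 27), "events-0001.json")
print(key)
# raw/source=webapp/dataset=clickstream/year=2019/month=02/day=27/events-0001.json
```

A layout like this lets batch imports and streaming deliveries land in predictable locations, which keeps cataloging and access control rules simple.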
Checklist
1. Create a flywheel model of your business
2. Find opportunities to add value through ML
3. Collect training data from your data lake
  • Production data
  • Web/app data
  • Legacy/archive data
Here.com uses Amazon Rekognition to enrich their mapping content
“We use Amazon Rekognition to enrich our mapping content.
Rekognition’s Text in Image allows us to continually update signage
information so our customers have the latest information at their
fingertips. We look forward to continuing our partnership with AWS and
implementing their computer vision solutions in more of our products.”
–Rajkumar Jain
Director of Engineering at HERE Technologies
Supervised Learning
• Data: cheap!
• Labels: expensive!
Images: Unsplash.com
Supervised Learning: sources of data and labels
• Historical data: log files, transactions, data warehouse
• Operational data: streaming, IoT sensors
• Real humans: experts, user feedback, hired “eyes and ears”
New: Amazon SageMaker Ground Truth
Label machine learning training data easily and accurately
Launched 11/28
Ground Truth: Data labeling tasks
• Bounding boxes
• Image classification
• Semantic segmentation
• Text classification
• Custom tasks
Active learning and auto data labeling
• Diagram components: Input datasets, Active Learning, Auto Labeling, Human Labeling, Label Consolidation, Labeled datasets, Machine Learning Models
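The interplay of auto labeling, human labeling, and label consolidation can be sketched as follows. This is a simplified illustration, not Ground Truth's actual implementation: the confidence threshold, the majority-vote consolidation, and all names are assumptions.

```python
from collections import Counter


def consolidate(labels):
    """Consolidate several human annotations for one item by majority vote."""
    return Counter(labels).most_common(1)[0][0]


def route_items(predictions, threshold=0.9):
    """Split items into auto-labeled and needs-human sets by model confidence.

    predictions maps item id -> (predicted_label, confidence).
    """
    auto, human = {}, []
    for item_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto[item_id] = label      # confident enough: accept the model's label
        else:
            human.append(item_id)      # uncertain: send to human labelers
    return auto, human


# Hypothetical model output and human annotations.
predictions = {"img1": ("cat", 0.97), "img2": ("dog", 0.55), "img3": ("cat", 0.92)}
human_votes = {"img2": ["dog", "cat", "dog"]}

auto_labels, needs_human = route_items(predictions)
labeled = dict(auto_labels)
for item_id in needs_human:
    labeled[item_id] = consolidate(human_votes[item_id])
print(labeled)
```

The newly labeled items can then be used to retrain the labeling model, so that fewer items need human attention in the next round.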
Human workforce options
• Public: on-demand, 24x7 workforce of 500,000+ independent contractors worldwide, powered by Amazon Mechanical Turk
• Private: you source your own workers, e.g. employees or contractors, for data that needs to stay within your organization
• Vendors: curated list of third-party vendors specialized in labeling services, available via the AWS Marketplace
Supervised Learning
• Data and labels: X0,000 or more
• Data + Labels → ML Algorithm → ML Model
• New Data → ML Model → Prediction
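The flow from data and labels through an algorithm to a model, and from new data to a prediction, can be shown with a deliberately tiny example. The 1-nearest-neighbor algorithm and the toy data below are assumptions chosen for brevity; the talk does not prescribe a particular algorithm.

```python
import math


def train(data, labels):
    """'Training' for 1-nearest-neighbor simply stores the labeled examples."""
    return list(zip(data, labels))


def predict(model, point):
    """Predict the label of the stored example closest to the new point."""
    _, nearest_label = min(model, key=lambda example: math.dist(example[0], point))
    return nearest_label


# Data + Labels -> ML Algorithm -> ML Model
data = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (4.8, 5.3)]
labels = ["small", "small", "large", "large"]
model = train(data, labels)

# New Data -> ML Model -> Prediction
prediction = predict(model, (4.9, 5.0))
print(prediction)  # prints "large"
```

Real projects need far more than four examples, which is why collecting labels is usually the expensive part.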
Checklist
1. Create a flywheel model of your business
2. Find opportunities to add value through ML
3. Collect training data from your data lake
4. Collect labels for your training data
  • From historical data
  • From operations
  • From real humans
Amazon SageMaker: Build, train, and deploy ML models at scale
1. Collect and prepare training data; choose and optimize your ML algorithm
2. Set up and manage environments for training; train and tune ML models
3. Deploy models in production; scale and manage the production environment
Checklist
1. Create a flywheel model of your business
2. Find opportunities to add value through ML
3. Collect training data from your data lake
4. Collect labels for your training data
5. Build, train and deploy your ML model with Amazon SageMaker
   … or use a ready-to-run Amazon ML application service
6. Repeat from step 2
When should you consider using ML for a problem?
• Software too complex
• Manual process not cost effective
• Lots of training data available
• Easy to express in ML terms
Image: Unsplash
When is ML probably not a good idea?
• No data
• No labels
• Not a lot of time
• No tolerance for mistakes
Image: Unsplash
Summary
• Understand your business model
• Find bottlenecks, then ask:
  • Where can you add value with ML?
  • Where can you reduce waste with ML?
• Build your data lake
• Be competent with data
• Build your first ML solution
  • Leverage AWS managed ML services
  • Use Amazon SageMaker for building your own
• Iterate!
Related breakouts
• From an Engineering Company to a Core ML Competency Company
  Raul-Andrei Firu, Haufe Lexware Group, Wed., Feb. 27th, Hall 6
• How to Innovate Your Software with ML?
  4FriendsOnly.com, Can Do, fme and Haufe Group, Wed., Feb. 27th, Hall 4
• Apache MXNet (Incubating) and Gluon: What's in it for You?
  Constantin Gonzalez, AWS, Wed., Feb. 27th, Keynote Hall
What will you build?
The Machine Learning process (technical view)
• Collect and prepare training data
• Choose and optimize your ML algorithm
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
Easily build, train, and deploy machine learning models, covering every step from collecting and preparing training data to scaling and managing the production environment
Amazon SageMaker: Build
• Pre-built notebooks for common problems
• Built-in, high-performance algorithms
• ALGORITHMS: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner – Regression, Linear Learner – Classification, DeepAR, Random Cut Forest, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, BlazingText
• FRAMEWORKS: Apache MXNet, TensorFlow, Caffe2, CNTK, PyTorch, Torch
• Remaining steps: set up and manage environments for training; train and tune model (trial and error); deploy model in production; scale and manage the production environment
Amazon SageMaker: Build, Train, Deploy
• Build: pre-built notebooks for common problems; built-in, high-performance algorithms
• Train: one-click training; hyperparameter optimization
• Deploy: one-click deployment; fully managed hosting with auto-scaling
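Hyperparameter optimization automates the "trial and error" part of training. The sketch below uses plain random search to show the idea; it is not how Amazon SageMaker implements tuning, and the objective function, parameter names, and ranges are hypothetical stand-ins for a real training job.

```python
import random


def validation_error(learning_rate, depth):
    """Stand-in for a real training job that returns a validation error.

    Hypothetical objective: lowest error near learning_rate=0.1, depth=6.
    """
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(trials, seed=0):
    """Try random hyperparameter combinations and keep the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "depth": rng.randint(2, 10),
        }
        error = validation_error(**params)
        if best is None or error < best[0]:
            best = (error, params)
    return best


best_error, best_params = random_search(trials=50)
print(best_params, best_error)
```

A managed tuning service runs the same kind of loop, but with real training jobs in parallel and typically a smarter search strategy than random sampling.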