Scaling machine learning as a service at Uber — Li Erran Li at #papis2016
![Page 1: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/1.jpg)
![Page 2: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/2.jpg)
![Page 3: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/3.jpg)
![Page 4: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/4.jpg)
![Page 5: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/5.jpg)
Use Case: UberEATS ETD Prediction
![Page 6: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/6.jpg)
![Page 7: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/7.jpg)
HADOOP / YARN (Batch)
![Page 8: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/8.jpg)
HADOOP / YARN (Batch)
Hive Feature Store
![Page 9: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/9.jpg)
NETWORK (Realtime)
HADOOP / YARN (Batch)
Cassandra Feature Store
Hive Feature Store
![Page 10: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/10.jpg)
![Page 11: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/11.jpg)
Use Case: UberEATS ETD ML Pipeline
[Diagram: two paths, Training and Real-time prediction. Labels: Hive, Feature store, Model Training, Model, UberEATS App, Model Performance, ETD]
![Page 12: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/12.jpg)
Problems
• Hard to figure out good features
• Hard to build the pipelines to generate features
• Can’t compute some features in real time
Solution: DSL and Feature Store
● Database of curated and crowd-sourced features
● Make it easy to use and transform these features in ML projects
● Make it easy to discover new useful features
● Batch and realtime serving
![Page 13: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/13.jpg)
Data Pipeline For Predictions
[Diagram labels: Data Lake, Spark or SQL, Basis Features, Feature DSL, Transformed Features, ML Model, Predictions]
![Page 14: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/14.jpg)
Data Pipeline For Predictions w/ Feature Palette
[Diagram labels: Data Lake, Spark or SQL, Feature Store, Basis Features, Feature DSL, Transformed Features, ML Model, Predictions]
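The slides do not show the DSL's syntax. As a rough illustration of the step from basis features to transformed features, the sketch below treats a "DSL" as a mapping from output feature names to small expressions over basis features. Everything named here (`TRANSFORMS`, `apply_transforms`, the feature names and the lunch-hour window) is hypothetical and not taken from the talk.

```python
import math
from typing import Callable, Dict

# Hypothetical basis features, as they might come out of a Spark or SQL job.
Basis = Dict[str, float]

# Hypothetical "DSL": each transformed feature is an expression over basis features.
TRANSFORMS: Dict[str, Callable[[Basis], float]] = {
    "log_total_cost": lambda b: math.log1p(b["total_cost"]),
    "cost_per_item": lambda b: b["total_cost"] / max(b["num_items"], 1),
    "is_lunch": lambda b: 1.0 if 11 <= b["hour_of_day"] < 14 else 0.0,
}


def apply_transforms(basis: Basis) -> Dict[str, float]:
    """Evaluate every transform against one row of basis features."""
    return {name: fn(basis) for name, fn in TRANSFORMS.items()}


row = {"total_cost": 30.0, "num_items": 3, "hour_of_day": 12}
features = apply_transforms(row)
```

Keeping the transforms declarative means the same definitions could be applied to a batch of training rows and to a single request at prediction time.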
![Page 15: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/15.jpg)
Use Case: UberEATS ETD Model Details
[Diagram labels: UberEATS App, Feature store, Model: GBT Regression, ETD]
● restaurant features
○ location, avg prep-time, avg delivery time, avg demand during lunch ...
● contextual features
○ time of day, day of week, ...
● order features
○ #items, total cost, ...
● near real-time features
○ info about the past N orders, ...
● ...
● Feature store provides aggregate features for real-time prediction
○ These features are time-consuming to compute in real-time
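A minimal sketch of why precomputed aggregates help: a batch step computes a per-restaurant average once, and the request path only does a key lookup. The in-memory dictionary stands in for the serving-side feature store, and all names and numbers (`HISTORY`, `build_aggregates`, `assemble_features`) are illustrative assumptions rather than Uber's actual API or data.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical historical orders: (restaurant_id, prep_minutes).
HISTORY: List[Tuple[str, float]] = [
    ("r1", 10.0), ("r1", 14.0), ("r2", 20.0),
]


def build_aggregates(history: List[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Batch step: compute the average prep time per restaurant ahead of time."""
    by_restaurant = defaultdict(list)
    for restaurant_id, prep_minutes in history:
        by_restaurant[restaurant_id].append(prep_minutes)
    return {rid: {"avg_prep_time": mean(v)} for rid, v in by_restaurant.items()}


# Stand-in for the serving-side feature store (a simple key-value lookup).
FEATURE_STORE = build_aggregates(HISTORY)


def assemble_features(restaurant_id: str, order: Dict[str, float]) -> Dict[str, float]:
    """Request step: join cheap order features with the precomputed aggregates."""
    features = dict(order)
    features.update(FEATURE_STORE.get(restaurant_id, {}))
    return features


request_features = assemble_features("r1", {"num_items": 2, "total_cost": 18.5})
```

The request path never scans order history, so its cost does not grow with the amount of historical data.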
![Page 16: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/16.jpg)
Problem
● Often you want to train a model per city
● But hard to train and manage 400+ models for a project
Solution
● Let users define partitioning scheme
● Automatically train model per partition
● Manage and deploy as single logical model
![Page 17: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/17.jpg)
1. Define partition scheme
![Page 18: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/18.jpg)
2. Make train / test split
![Page 19: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/19.jpg)
3. Keep same split for each level
![Page 20: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/20.jpg)
4. Train model for every node
[Diagram: partition tree with a model (M) at every node]
![Page 21: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/21.jpg)
5. Prune bad models
[Diagram: partition tree in which fewer nodes carry a model (M) after pruning]
![Page 22: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/22.jpg)
6. At prediction time, use best model for each node
[Diagram: partition tree with the remaining models (M)]
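The six steps above can be sketched as follows. The slides do not state the pruning criterion, so this sketch assumes a node's model is kept only if it beats its nearest surviving ancestor on the node's held-out rows. The "model" is a constant (the mean training label), used as a stand-in for GBT regression, and the partition paths and labels are made up.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical rows: (partition path, label). A row at ("US", "SF") also
# belongs to its ancestors ("US",) and () (the global node).
Path = Tuple[str, ...]
Row = Tuple[Path, float]


def rows_under(rows: List[Row], node: Path) -> List[Row]:
    """Rows whose partition path starts with the node's path."""
    return [r for r in rows if r[0][:len(node)] == node]


def mae(prediction: float, rows: List[Row]) -> float:
    """Mean absolute error of a constant prediction on the given rows."""
    return mean(abs(label - prediction) for _, label in rows)


def best_node(models: Dict[Path, float], path: Path) -> Path:
    """Deepest node on the way from path up to the root that still has a model."""
    for i in range(len(path), -1, -1):
        if path[:i] in models:
            return path[:i]
    raise KeyError("no global model")


def train_partitioned(train: List[Row], test: List[Row]) -> Dict[Path, float]:
    """Train a model per node, pruning any that do not beat their fallback."""
    nodes = sorted({p[:i] for p, _ in train for i in range(len(p) + 1)}, key=len)
    models: Dict[Path, float] = {}
    for node in nodes:  # parents are visited before children (shorter paths first)
        candidate = mean(label for _, label in rows_under(train, node))
        if node == ():
            models[node] = candidate  # the global model is always kept
            continue
        held_out = rows_under(test, node)  # the same split is reused at every level
        if not held_out:
            continue  # cannot evaluate this node, so rely on the fallback
        fallback = models[best_node(models, node)]
        if mae(candidate, held_out) < mae(fallback, held_out):
            models[node] = candidate
    return models


def predict(models: Dict[Path, float], path: Path) -> float:
    return models[best_node(models, path)]


train = [(("US", "SF"), 30.0), (("US", "SF"), 32.0),
         (("US", "NYC"), 40.0), (("US", "NYC"), 42.0),
         (("FR", "Paris"), 35.0), (("FR", "Paris"), 37.0)]
test = [(("US", "SF"), 31.0), (("US", "NYC"), 41.0), (("FR", "Paris"), 36.0)]
models = train_partitioned(train, test)
```

In this toy data the country-level models and the Paris model add nothing over their fallback and are pruned, so a Paris request is served by the global model while San Francisco and New York keep their own.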
![Page 23: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/23.jpg)
![Page 24: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/24.jpg)
Use Case: UberEATS ETD Prediction Performance
● Partitioned GBDT Regression Model
● Latency (measured from client)
○ p50: 7ms
○ p95: 15ms
○ p99: 20ms
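For readers unfamiliar with the notation, p50, p95, and p99 are latency percentiles. The sketch below shows one common way to compute them from client-side samples, using the nearest-rank definition. The talk does not say which definition was used, and the sample latencies are invented for illustration, not measurements from the talk.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical client-side latencies in milliseconds.
latencies_ms = [5, 6, 6, 7, 7, 7, 8, 9, 12, 21]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

With only ten samples, p95 and p99 both land on the slowest request, which is why tail percentiles are usually reported over much larger windows.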
![Page 25: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/25.jpg)
Conclusion
● We present a scalable ML as a service system
● We focus on the scalability challenges and solutions
○ Feature store is key to enabling aggregate features for real-time prediction
■ Same API to access feature store for both batch training and real-time prediction
○ Partitioned models greatly simplify model management and selection
■ Per-city model performance is often worse than the global model
○ Scalable, low-latency real-time prediction service enables interactive user experiences
■ Load balancing across containers without global state
■ Fast one-button deployment
■ Hot-swap model upgrade
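The talk does not describe how hot swap is implemented. One plausible approach, sketched here under that assumption, is for each serving container to hold the current model behind a single reference and replace it in one assignment, so requests already in flight finish with the model they read. `ModelServer` and the toy models are hypothetical.

```python
import threading
from typing import Callable, Dict, Tuple

Model = Callable[[Dict[str, float]], float]


class ModelServer:
    """Serves predictions from whichever model is currently installed."""

    def __init__(self, model: Model, version: str) -> None:
        self._lock = threading.Lock()  # serializes concurrent swaps
        self._current: Tuple[str, Model] = (version, model)

    def swap(self, model: Model, version: str) -> None:
        """Install a new model; requests that already read the old one finish with it."""
        with self._lock:
            self._current = (version, model)

    def predict(self, features: Dict[str, float]) -> Tuple[str, float]:
        version, model = self._current  # single read of the shared reference
        return version, model(features)


server = ModelServer(lambda f: 30.0, "v1")
before = server.predict({"num_items": 2})
server.swap(lambda f: 25.0 + f["num_items"], "v2")
after = server.predict({"num_items": 2})
```

Because each container keeps only its own model reference, this pattern needs no state shared across containers, which fits the load-balancing point above.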
![Page 26: Scaling machine learning as a service at Uber — Li Erran Li at #papis2016](https://reader033.fdocuments.in/reader033/viewer/2022051709/587326501a28ab596c8b4b9d/html5/thumbnails/26.jpg)