Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs Connect
DISTRIBUTED DEEP LEARNING
SPARK ON AWS
Presented by Vincent Van Steenbergen - @nsteenv
WHOAMI
VINCENT VAN STEENBERGEN
Data Engineer @ Abstract Minds
Playing with Scala, Akka & Spark for about 3 years
Deeply interested in Artificial Intelligence and Data Analysis
DISCLAIMER
DEEP LEARNING
Convolutional neural networks
APPLICATIONS
IMAGE ANALYSIS
IMAGE GENERATION
GAMES
TRAINING A MODEL REQUIRES:
a lot of time
even more computing power
Ex: AlphaGo - 1,202 CPUs and 176 GPUs
SO HOW CAN I DO THAT...
from my laptop?
for a decent cost?
within a short timespan?
Possible on a laptop, but very slow
Solution: distribute training over a cluster
APACHE SPARK
Scala/Python framework for big data analysis
Like Hadoop but faster
ADVANTAGES
Able to handle potentially terabytes of streaming data
Parallelise operations on a big cluster of machines
Improves accuracy of results
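The "parallelise operations" point can be illustrated with a small map-style sketch: independent tasks (here, scoring hyperparameter settings) are handed to a pool of workers and the results are collected. This is only an illustration under assumptions: a local thread pool stands in for the Spark cluster, and `evaluate`, its toy scoring formula, and the parameter grid are hypothetical, not taken from the talk. In PySpark the same pattern would roughly be `sc.parallelize(configs).map(evaluate).collect()`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(config):
    """Hypothetical stand-in for training one model; returns (config, score)."""
    learning_rate, hidden_units = config
    # Toy score (made up) that peaks at learning_rate=0.1, hidden_units=128.
    score = 1.0 - abs(learning_rate - 0.1) - abs(hidden_units - 128) / 1000.0
    return config, score


def parallel_search(configs, workers=4):
    """Evaluate every config on a worker pool and return the best (config, score)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, configs))
    return max(results, key=lambda result: result[1])


# Hypothetical grid: 3 learning rates x 3 layer sizes = 9 independent tasks.
configs = list(product([0.01, 0.1, 0.5], [64, 128, 256]))
best_config, best_score = parallel_search(configs)
print(best_config, best_score)
```

Because each task is independent, adding workers shortens wall-clock time without changing the result.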
AMAZON WEB SERVICES (EC2)
GPU instances (g2.2xlarge, g2.8xlarge)
Spot instances (on demand, generally 2-3 times cheaper than regular instances)
G2.8XLARGE CONFIGURATION
Four NVIDIA GRID GPUs, each with 1,536 CUDA cores and 4 GB of video memory
32 vCPUs
60 GiB of memory
240 GB (2 x 120) of SSD storage
Average price: $1.00 per hour
NOT BAD...
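To put the g2.8xlarge figures in perspective, the arithmetic can be sketched as below. The per-instance numbers and the $1.00 per hour average come from the slide; the cluster size of 5 instances and the 3-hour run are hypothetical values chosen only for illustration, and actual spot prices vary.

```python
# Per-instance figures quoted on the g2.8xlarge slide.
GPUS_PER_INSTANCE = 4
CUDA_CORES_PER_GPU = 1536
VIDEO_MEMORY_GB_PER_GPU = 4
PRICE_PER_HOUR_USD = 1.00  # average price quoted in the talk


def cluster_summary(instances, hours):
    """Total GPU resources and estimated cost for a cluster of g2.8xlarge instances."""
    gpus = instances * GPUS_PER_INSTANCE
    return {
        "gpus": gpus,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "video_memory_gb": gpus * VIDEO_MEMORY_GB_PER_GPU,
        "cost_usd": instances * hours * PRICE_PER_HOUR_USD,
    }


# Hypothetical cluster: 5 instances running for 3 hours.
summary = cluster_summary(instances=5, hours=3)
print(summary)
```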
DEEP LEARNING FRAMEWORKS
TensorFlow (Google)
Caffe (Berkeley)
MNIST DATASET
Handwritten digits dataset
CROSS VALIDATION
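The slide does not say which cross-validation scheme was used. As a minimal sketch, assuming plain k-fold splitting, the data indices are cut into k folds and each fold serves once as the validation set while the rest is used for training. The function name `k_fold_splits` and the sample sizes are hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical example: 10 samples, 3 folds.
folds = list(k_fold_splits(n_samples=10, k=3))
for train, validation in folds:
    print(len(train), validation)
```

Each fold can be trained and scored independently, which is what makes cross-validation a natural fit for a cluster.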
COMPUTATION TIME
RESULTS
7x speedup compared to training the models one at a time on one machine
Best result with hyperparameter tuning has a 99.47% accuracy on the test set, which is a 34% reduction of the test error.
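The slide does not state the baseline that the 34% reduction is measured against. Assuming the reduction is relative (tuned error = baseline error x (1 - 0.34)), the implied baseline can be back-computed from the two quoted numbers. This is a derivation under that assumption, not a figure from the talk.

```python
tuned_accuracy = 0.9947          # quoted on the slide
relative_error_reduction = 0.34  # quoted on the slide

# Test error of the tuned model.
tuned_error = 1.0 - tuned_accuracy

# Assumption: tuned_error = baseline_error * (1 - relative_error_reduction).
implied_baseline_error = tuned_error / (1.0 - relative_error_reduction)
implied_baseline_accuracy = 1.0 - implied_baseline_error

print(f"tuned error: {tuned_error:.4%}")
print(f"implied baseline error: {implied_baseline_error:.4%}")
print(f"implied baseline accuracy: {implied_baseline_accuracy:.4%}")
```

Under this reading, the untuned model would sit at roughly 99.2% test accuracy.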
IMAGE CLASSIFICATION
RESULTS
[('coral reef', 0.88503921),
 ('scuba diver', 0.025853464),
 ('brain coral', 0.0090828091),
 ('snorkel', 0.0036010914),
 ('promontory, headland, head, foreland', 0.0022605944)]
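A ranked list like the one above is typically produced by sorting the classifier's per-label probabilities and keeping the top few. The sketch below shows that step only, reusing the scores from the slide; the helper name `top_k` is hypothetical and the code is not the talk's actual classification pipeline.

```python
import heapq


def top_k(probabilities, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    return heapq.nlargest(k, probabilities.items(), key=lambda item: item[1])


# Scores copied from the slide; a real model outputs one score per known label.
scores = {
    "snorkel": 0.0036010914,
    "coral reef": 0.88503921,
    "promontory, headland, head, foreland": 0.0022605944,
    "brain coral": 0.0090828091,
    "scuba diver": 0.025853464,
}

top3 = top_k(scores, k=3)
for label, probability in top3:
    print(f"{label}: {probability:.4f}")
```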
THANK YOU!
Any questions?
My email: [email protected]
RESOURCES
Sample launch scripts
Neural Networks
Deep Learning
cuDNN