Posted on 12-Mar-2018
GPUs Enable Deep Neuroevolution for Vision-Based Autonomous Driving
Faustino Gomez, CEO NNAISENSE SA
Lugano, Switzerland
SUPERVISED LEARNING
[Diagram labels: training set, input, learned model, prediction, target, error]
• Labeled training set
• Data is fixed and known a priori
• Learn from prediction error
• Regression / Classification
• Particularly amenable to GPUs
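A minimal sketch of "learn from prediction error", assuming a one-dimensional linear model trained by gradient descent on squared error. The training set, learning rate, and function name are hypothetical and chosen only for illustration; the slides do not specify a model.

```python
def train_linear(data, lr=0.05, epochs=200):
    """Fit prediction = w * x + b by gradient descent on squared prediction error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in data:
            prediction = w * x + b
            error = prediction - target  # the signal that drives learning
            w -= lr * error * x          # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error              # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Hypothetical fixed, labeled training set: targets follow y = 2x + 1.
training_set = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]
w, b = train_linear(training_set)
```

Every example contributes an independent error term, which is one reason this setting maps well onto batched GPU computation.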
REINFORCEMENT LEARNING
[Diagram labels: states S1 S2 S3 Sn…, actions a1 a2 a3, one sense/act pair per step]
• Sequential decision task
• Data determined by learning agent
• No targets (i.e. teacher)
• Learn good control policy from sparse reward signal
• Partial observability
[Diagram labels: environment, learning agent, reward]
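The sense/act loop with a sparse reward can be sketched as follows. `CorridorEnv`, `run_episode`, and the two fixed policies are hypothetical names for a toy task, not anything from the talk: the agent only receives a reward when it reaches the last cell, and the data it sees depends entirely on the actions it takes.

```python
class CorridorEnv:
    """Hypothetical toy task: walk from cell 0 to the goal cell; reward only at the goal."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); position is clipped to the corridor
        self.pos = max(0, min(self.length - 1, self.pos + action))
        self.steps += 1
        at_goal = self.pos == self.length - 1
        reward = 1.0 if at_goal else 0.0  # sparse: zero everywhere except the goal
        done = at_goal or self.steps >= self.max_steps
        return self.pos, reward, done


def run_episode(env, policy):
    """Sense, act, collect reward, repeat until the episode ends."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                  # act on what was sensed
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
    return total_reward


def always_right(state):
    return 1


def always_left(state):
    return -1
```

No target action is ever supplied; a policy is judged only by the reward it accumulates.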
NEUROEVOLUTION
[Diagram labels: evolutionary algorithm, evaluate, environment, sensors, action, fitness]
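A minimal sketch of the evaluate/fitness loop, assuming a generic evolutionary algorithm (truncation selection with Gaussian mutation and elitism). This is a stand-in for illustration, not the algorithm used in the talk; the four-weight controller, the one-dimensional task, and all parameter values are hypothetical.

```python
import math
import random


def controller(weights, sensor):
    """Tiny network: one tanh hidden unit feeding one tanh output."""
    w_in, b_in, w_out, b_out = weights
    hidden = math.tanh(w_in * sensor + b_in)
    return math.tanh(w_out * hidden + b_out)


def evaluate(weights, steps=30):
    """Fitness = negative accumulated distance from the origin (higher is better)."""
    total = 0.0
    for start in (-1.0, 0.5, 1.0):
        x = start
        for _ in range(steps):
            x += 0.2 * controller(weights, x)  # sensors in, action out
            total -= abs(x)
    return total


def evolve(pop_size=20, n_elite=5, generations=30, sigma=0.3, seed=0):
    rng = random.Random(seed)
    population = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        history.append(evaluate(ranked[0]))
        elite = ranked[:n_elite]  # survivors are copied unchanged
        children = [[w + rng.gauss(0, sigma) for w in rng.choice(elite)]
                    for _ in range(pop_size - n_elite)]
        population = elite + children
    best = max(population, key=evaluate)
    return best, history + [evaluate(best)]


best_weights, best_fitness_per_generation = evolve()
```

The algorithm never sees a gradient or a target action; it only needs a fitness number per candidate network.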
Neuroevolution: Advantages
• No linearity assumptions
• Can cope with high-dimensional input/output
• Can use history of sensor readings of unknown depth (short-term memory)
• Can incorporate arbitrary constraints
• Behavior is learned not programmed
• Does not require knowledge of what constitutes optimal performance, i.e. reference signal
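The short-term memory point can be illustrated with a single recurrent unit. This is a hypothetical sketch with hand-picked weights, not a network from the talk: the two input sequences end with identical readings, yet the final hidden state still reflects the first reading.

```python
import math


def rnn_step(hidden, sensor, w_in=1.0, w_rec=0.9):
    """One recurrent unit: the new state mixes the current reading with the old state."""
    return math.tanh(w_in * sensor + w_rec * hidden)


def run(sequence):
    hidden = 0.0
    for sensor in sequence:
        hidden = rnn_step(hidden, sensor)
    return hidden


# Same last two readings, different first reading.
state_after_positive = run([1.0, 0.0, 0.0])
state_after_negative = run([-1.0, 0.0, 0.0])
```

How long such a trace persists depends on the recurrent weights, which neuroevolution is free to tune.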
SUCCESSFUL TEST CASE: AUTOMATED DRIVING USING VISION
Jan Koutnik, Giuseppe Cuccu, Juergen Schmidhuber, and Faustino Gomez (2013). Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO, Amsterdam).
Million-weight RNN learns to drive a car using vision WITHOUT A TEACHER
AUDI: Autonomous Parking
Objective: Learn to park car elegantly in more general conditions using only local sensors
Challenges:
• Reinforcement-learn continuous non-linear control
• No global information
• High-dimensional, noisy input
• Closing reality gap (forward model)
• Computationally intensive (physical model on GPU)
• Timeline: build system from scratch in 6 months for NIPS
AUDI PLAYCAR DEMO
Couldn’t do without GPU
• GPUs used to train all components
- CNN localizer
- Forward car model
- RNN controller
• Enable running 20K simulations in parallel
• 100x speedup over multicore CPU cluster
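A CPU-only sketch of why many simulations parallelize well, assuming each simulation applies the same update to its own state. The point-mass dynamics, the proportional-derivative controller, and the function names are hypothetical; on a GPU, each list index would plausibly map to its own thread, but nothing here reproduces the actual system or its speedup.

```python
def step_all(positions, velocities, actions, dt=0.1):
    """Advance every simulation by one time step (same update for each index)."""
    new_velocities = [v + a * dt for v, a in zip(velocities, actions)]
    new_positions = [p + v * dt for p, v in zip(positions, new_velocities)]
    return new_positions, new_velocities


def rollout_batch(gains, steps=50):
    """Run one point-mass simulation per controller gain, all in lockstep."""
    n = len(gains)
    positions = [1.0] * n
    velocities = [0.0] * n
    costs = [0.0] * n
    for _ in range(steps):
        # Hypothetical controller: pull toward the origin, damp the velocity.
        actions = [-g * p - 1.0 * v for g, p, v in zip(gains, positions, velocities)]
        positions, velocities = step_all(positions, velocities, actions)
        costs = [c + abs(p) for c, p in zip(costs, positions)]
    return costs


# Three candidate controllers evaluated side by side; lower cost is better.
costs = rollout_batch([0.0, 1.0, 4.0])
```

Because no simulation reads another's state, the batch size can grow without changing the per-step logic.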
THANK YOU