
The Deep Learning Revolution: rethinking machine learning pipelines

Soumith Chintala
Facebook AI Research

7th April 2015
EmergingTech 2015

Getting your attention

● Changed the landscape of:
  ○ Natural language processing
  ○ Speech recognition
  ○ Computer vision
  ○ Robotics
  ○ Modern statistical physics
  ○ Computational biology
  ○ Digital assistants (Siri, Cortana, etc.)

Traditional Machine Learning

Hand-Crafted Features

SIFT/HOG

SURF

MFCC

Spectrogram

Traditional ML

● Feature engineering + [your favorite classifier] (see the sketch after this list)
● Steps:
  ○ Find a poor sod to think of good features
  ○ Find another unfortunate chap to extract those features from your data
  ○ Collect all the features and grind them through:
    ■ Logistic regression
    ■ Decision trees, random forests, boosted trees
    ■ Support vector machines
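A minimal sketch of that pipeline, assuming scikit-image and scikit-learn are available; `train_images`, `train_labels`, and `test_images` are placeholder names for grayscale data already loaded elsewhere. The hand-crafted HOG descriptor plays the "good features" role, an SVM the classifier:

```python
from skimage.feature import hog   # hand-crafted HOG descriptor
from sklearn.svm import SVC

# A human designed the feature (HOG); the code merely extracts it.
features = [hog(img) for img in train_images]

# Grind the features through your favorite classifier.
clf = SVC()
clf.fit(features, train_labels)
predictions = clf.predict([hog(img) for img in test_images])
```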

Deep learning

● End-to-end learning
  ○ No feature engineering
● Chained cascade of non-linear transforms (see the sketch below)
● General framework
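A minimal sketch of such a cascade, assuming nothing beyond NumPy; `weights` is a hypothetical list of per-layer matrices:

```python
import numpy as np

def forward(x, weights):
    """A chained cascade of non-linear transforms: each layer applies
    learned weights, then a fixed nonlinearity. No hand-crafted
    features; the raw input x goes straight in."""
    h = x
    for W in weights:
        h = np.tanh(W @ h)   # one linear map + nonlinearity per layer
    return h
```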

How to win at life (5-step process)

● Pick a problem
● Get as much data as you can
● Expand your dataset more
● Train several deep nets
● Ensemble them (sketched below)
● Win!
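A minimal sketch of the ensembling step, assuming each trained net exposes a hypothetical `predict_proba(x)` method returning class probabilities:

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the class probabilities of several independently
    trained nets, then take the most likely class.
    `predict_proba` is an assumed per-model interface."""
    probs = np.mean([m.predict_proba(x) for m in models], axis=0)
    return np.argmax(probs, axis=-1)
```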

Deep Neural Networks

What is a neural network?

[Diagram: Input → x² → sin(x) → Output, a chain of function nodes]

What is a deep neural network?

[Diagram: Input (x) → a stack of (i·w → Sigmoid) layers, each with its own weights (w) → Output]

Neural networks vs computation graphs?

● Neural networks are trained via back-propagation
● Every node has f(x) and df(x)/dx (see the sketch below)
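A minimal NumPy sketch of such a node: the sigmoid caches its forward output and reuses it in the backward pass, since dσ/dx = σ(x)(1 − σ(x)):

```python
import numpy as np

class Sigmoid:
    """One node in the graph: forward computes f(x), backward
    multiplies the incoming gradient by df(x)/dx (the chain rule)."""
    def forward(self, x):
        self.y = 1.0 / (1.0 + np.exp(-x))
        return self.y

    def backward(self, grad_out):
        # d(sigmoid)/dx = y * (1 - y), using the cached forward output
        return grad_out * self.y * (1.0 - self.y)
```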

[Diagram: forward pass, the input (x) flows through the stack of (i·w → Sigmoid) layers, each with its own weights (w), to the output; backward pass, the gradient flows back through the same layers to every weight]

One weird trick: Convolutions

Fully-connected layers: issues

Locally connected layers

Convolutional layers

Convolution layer

Convolutional neural networks

● On a very small spatial scale, every patch of an image can be processed the same way, so one set of filter weights is reused across the whole image (see the sketch below)
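A minimal NumPy sketch of that weight sharing: a "valid" convolution (strictly speaking a cross-correlation, as is conventional in deep learning):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide one small square kernel over the image: the same few
    weights are applied at every location (weight sharing), unlike a
    fully-connected layer with one weight per input-output pair."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))   # 'valid' output size
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+k, j:j+k] * kernel)
    return out
```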

ImageNet classification results 2012 (1M training images, 1K categories, top-5 error)

● Human performance: ~3-5%
● Deep-learning models: ~15%
● Non-deep-learning models (ISI, Japan; Oxford, England; INRIA, France; University of Amsterdam; etc.): ~26%

ImageNet classification results 2015 (1M training images, 1K categories, top-5 error)

● Human performance: ~3-5%
● Deep-learning models: ~4.5%
● Non-deep-learning models (ISI, Japan; Oxford, England; INRIA, France; University of Amsterdam; etc.): ~26%

Segmentation and Stereo

LeCun (2014)

Couprie et al. (2013)

Farabet et al. (2012)

ConvNets + Graphical Model (Tompson et al., 2014)

Generative Adversarial Nets
Goodfellow et al. (2014)

ConvNets for Video

DeepVideo (Karpathy et al.)

C3D (Tran et al.)

ConvNets for NLP

Collobert et al.

Zhang et al.

Weird trick #2: Recurrent Nets

Recurrent Networks

[Diagram: the same (i·w → Sigmoid) layer with weights (w), but its output is fed back in as "past experience". The forward pass is straightforward; how to run the backward pass through the cycle? (???)]

Acyclic computational graphs

Unfolding in time

● Replicate the network once per time step, sharing the weights; the cycle becomes an acyclic graph and ordinary backpropagation applies (see the sketch below)
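A minimal NumPy sketch of the unfolded forward pass; `Wxh`, `Whh`, and `bh` are hypothetical parameter names:

```python
import numpy as np

def rnn_forward(xs, h0, Wxh, Whh, bh):
    """Unfold a recurrent net over time: the same weights are applied
    at every step, and the hidden state h carries the 'past
    experience'. Written out step by step, the graph is acyclic and
    backpropagation applies as usual."""
    h, hs = h0, []
    for x in xs:                 # one copy of the cell per time step
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        hs.append(h)
    return hs
```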

RNN-LSTMs

● Machine translation
● Language modeling
● Learning to execute (Python programs)

Basic LSTM unit (figure from deeplearning.net)
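A minimal NumPy sketch of one step of such a unit, assuming a single weight matrix `W` that produces all four gates from the concatenated input and previous hidden state:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a basic LSTM unit; W has shape (4n, len(x) + n)."""
    n = h_prev.shape[0]
    gates = W @ np.concatenate([x, h_prev]) + b   # shape (4n,)
    i = sigmoid(gates[0:n])        # input gate
    f = sigmoid(gates[n:2*n])      # forget gate
    o = sigmoid(gates[2*n:3*n])    # output gate
    g = np.tanh(gates[3*n:4*n])    # candidate cell values
    c = f * c_prev + i * g         # cell state: forget old, write new
    h = o * np.tanh(c)             # hidden state passed onward
    return h, c
```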

Sutskever et al. (2014)

Examples

A sequence of characters on the input and on the output.

Recurrent Nets for Q&A

Weston et al. (2014)
Facebook AI Research

Implementation and Engineering

Implementation: FLOP Eaters

● Convolutions are expensive (rough FLOP count below)
● Sequential processing
● Matrix multiplies are expensive
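As a rough worked example of why convolutions eat FLOPs (the layer sizes here are assumed for illustration, not from the talk):

```python
def conv_flops(h_out, w_out, k, c_in, c_out):
    """One multiply + one add per kernel weight per output element."""
    return 2 * h_out * w_out * k * k * c_in * c_out

# e.g. a single 3x3 convolution, 256 -> 256 channels, on a 56x56 map:
print(conv_flops(56, 56, 3, 256, 256))   # 3699376128, ~3.7 GFLOPs
```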

GPUs

Training

● Days to weeks
● An engineering feat
● Multiple machines, multiple GPUs

Frameworks

● Torch
● Caffe
● Theano, Keras, Lasagne
● Common features (see the sketch below):
  ○ syntax to define a graph
  ○ every node in the graph has a derivative
  ○ train via backpropagation
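A minimal sketch of that common pattern in Keras (assuming a current Keras install; `X_train` and `y_train` are placeholders for data loaded elsewhere):

```python
# Define the graph layer by layer, then let the framework handle
# derivatives and backpropagation.
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()                   # syntax to define the graph
model.add(Dense(64, input_dim=784))
model.add(Activation('relu'))          # every node knows its derivative
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='sgd')
model.fit(X_train, y_train, epochs=5)  # train via backpropagation
```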

Trends

● Deeper nets
● Smaller convolutions
● RNNs + LSTMs
● Multiple GPUs + multiple machines
● Neural Machines
● Other kinds of memory units
● Better weight initialization
● Meta-problems

Challenges

● Scaling up for big data (videos, social networks, etc.)
● Discrete optimization
● Memory that works

Questions