Deep Learning - A Literature survey


Description

“Automatically learning multiple levels of representations of the underlying distribution of the data to be modelled.” Deep learning algorithms have shown superior learning and classification performance in areas such as transfer learning, speech and handwritten character recognition, and face recognition, among others. (I have referred to many articles and experimental results provided by Stanford University.)

Transcript of Deep Learning - A Literature survey

Page 1: Deep Learning - A Literature survey

A Technical seminar on

“DEEP LEARNING”

Student: Akshay N. Hegde 1RV12SIT02 M.Tech (IT), 1st sem

Department of ISE, RVCE

Page 2: Deep Learning - A Literature survey

Presentation Outline
• INTRODUCTION
• LITERATURE SURVEY
• EXAMPLES
• METHODOLOGY
• EXPERIMENTS
• RESULTS
• CONCLUSION AND FUTURE WORK
• REFERENCES

Page 3: Deep Learning - A Literature survey

INTRODUCTION
• What is Deep Learning?
• Some success stories
• Examples of deep learning
• Learning and training of objects
• Conclusion & future scope


Page 4: Deep Learning - A Literature survey

What is Deep Learning?

• “Automatically learning multiple levels of representations of the underlying distribution of the data to be modelled”

• Deep learning algorithms have shown superior learning and classification performance in areas such as transfer learning, speech and handwritten character recognition, and face recognition, among others.

Page 5: Deep Learning - A Literature survey

• A deep learning algorithm automatically extracts the low- and high-level features necessary for classification.

• By high-level features, one means features that hierarchically depend on other features.

• “Automatic representation learning” is the key point of interest in this kind of approach, as it eliminates the need for potentially time-consuming handcrafted feature design.

Page 6: Deep Learning - A Literature survey

Semi-supervised learning

[Figure: a model is shown unlabeled images (all cars or motorcycles), then asked “What is this?” for a new image and must answer Car or Motorcycle.]
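A minimal sketch of this semi-supervised setup, assuming a simple k-means feature-learning recipe (in the spirit of the Stanford feature-learning work cited in this survey, not necessarily the exact method on the slide): a dictionary of centroids is learned from unlabeled data, and an ordinary classifier is then trained on a small labeled set encoded against those centroids.

```python
# Sketch: unsupervised feature learning on unlabeled data, then a
# supervised classifier on a small labeled set. All data here is synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(1000, 64))   # stand-in for unlabeled image patches
labeled   = rng.normal(size=(100, 64))    # small labeled set (cars vs. motorcycles)
labels    = rng.integers(0, 2, size=100)

# Unsupervised step: learn 16 centroids from the unlabeled data.
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(unlabeled)

# Encode each labeled image by its distance to every learned centroid.
features = kmeans.transform(labeled)

# Supervised step: train a standard classifier on the learned features.
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.predict(kmeans.transform(labeled[:5])))
```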

Page 7: Deep Learning - A Literature survey

Hierarchies in Vision (Lampert et al., CVPR’09)

• Learn attributes, then classes as a combination of attributes

[Figure: a three-level hierarchy in which Image Features feed Attributes, which feed Class Labels.]
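A rough sketch of this two-stage idea, assuming made-up attribute signatures and synthetic data: per-attribute classifiers are trained first, and a class label is then scored by how well the predicted attributes match that class’s signature.

```python
# Sketch of attribute-then-class prediction in the style of Lampert et al.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 32))         # image features (synthetic)
A = rng.integers(0, 2, size=(200, 4))  # attribute labels, e.g. "has wheels"

# Stage 1: one binary classifier per attribute.
attr_models = [LogisticRegression(max_iter=1000).fit(X, A[:, j]) for j in range(4)]

# Hypothetical class signatures: which attributes each class should have.
signatures = {"car": np.array([1, 1, 0, 0]), "motorcycle": np.array([1, 0, 1, 0])}

def classify(x):
    # Stage 2: predict each attribute's probability, then score each class
    # by the likelihood of its attribute signature.
    p = np.array([m.predict_proba(x.reshape(1, -1))[0, 1] for m in attr_models])
    scores = {c: np.prod(np.where(s == 1, p, 1 - p)) for c, s in signatures.items()}
    return max(scores, key=scores.get)

print(classify(X[0]))
```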

Page 8: Deep Learning - A Literature survey

What can we do? (with the right dataset)

• Recognize faces

• Categorize scenes

• Detect, segment and track objects

• Reconstruct 3D structure from multiple images or stereo

• Classify actions

Page 9: Deep Learning - A Literature survey

What we can do…

[Figure: examples of detecting and localizing objects, categorizing scenes (e.g., BEACH), and face detection and recognition.]

Page 10: Deep Learning - A Literature survey

Why Deep Learning ?

• Data mining: using historical data to improve decisions
  - medical records ⇒ medical knowledge
  - log data to model the user

• Software applications we can’t program by hand
  - autonomous driving
  - speech recognition

• Self-customizing programs
  - a newsreader that learns user interests

Page 11: Deep Learning - A Literature survey

Some success stories

• Data mining
• Analysis of astronomical data
• Human speech recognition
• Handwriting recognition
• Face recognition
• Fraudulent use of credit cards
• Driving autonomous vehicles
• Predicting stock rates
• Intelligent elevator control
• DNA classification

Page 12: Deep Learning - A Literature survey

Deep learning examples: Convolutional DBN for audio

[Figure: a spectrogram feeds a layer of detection units, topped by a max-pooling unit.]

Page 13: Deep Learning - A Literature survey

Convolutional DBN for audio

[Figure: the spectrogram input presented to the convolutional DBN.]
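A minimal sketch of preparing such a spectrogram input, assuming a synthetic waveform and window settings chosen only for illustration; the CDBN itself is not shown here.

```python
# Sketch: short-time Fourier magnitude as the 2-D input to a CDBN for audio.
import numpy as np
from scipy.signal import spectrogram

fs = 16000                                  # assumed sampling rate
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t)         # stand-in for a speech waveform

# Frequencies x time-frames; window/hop sizes are illustrative.
freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=400, noverlap=240)
log_spec = np.log(Sxx + 1e-10)              # log compression, common for audio
print(log_spec.shape)                       # this 2-D array feeds the CDBN
```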

Page 14: Deep Learning - A Literature survey

Probabilistic max pooling

Convolutional neural net: max{x1, x2, x3, x4}, where the xi are real numbers.

Convolutional DBN: max{x1, x2, x3, x4}, where the xi are in {0, 1} and mutually exclusive. Thus there are 5 possible cases: exactly one of x1, x2, x3, x4 is 1, or all four are 0.

This collapses 2^n configurations into n+1 configurations, and permits both bottom-up and top-down inference.
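A short sketch of the pooling step just described, following the softmax form used for probabilistic max pooling in Lee et al. (2009) [5]: each of the n detection units in a block can be the single active one, or all can be off, giving n+1 configurations rather than 2^n.

```python
# Sketch: probabilities of the n+1 configurations in one pooling block.
import numpy as np

def prob_max_pool(inputs):
    """inputs: bottom-up activations I(h_i) for one pooling block (length n).
    Returns (p_on, p_all_off): the probability that each unit is the single
    active one, plus the probability that every unit (and the pooling unit)
    is off."""
    e = np.exp(inputs - np.max(inputs))   # shifted for numerical stability
    e0 = np.exp(-np.max(inputs))          # the extra "all off" configuration
    z = e0 + e.sum()                      # normalizer over n+1 configurations
    return e / z, e0 / z

p_on, p_off = prob_max_pool(np.array([0.5, 2.0, -1.0, 0.3]))
print(p_on, p_off)                        # pooling unit fires with prob 1 - p_off
```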

Page 15: Deep Learning - A Literature survey

Convolutional DBN for audio

[Figure: stacking. One CDBN layer (detection units followed by max pooling) feeds a second CDBN layer (detection units followed by max pooling).]

Page 16: Deep Learning - A Literature survey

Convolutional DBN for Images

[Figure: input data V (visible nodes, binary or real) is convolved with shared “filter” weights W^k to produce a detection layer H (binary hidden nodes), which is reduced to a max-pooling layer P (binary pooling nodes).]
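A hedged sketch of one bottom-up pass through such a layer, assuming a single toy filter and 2x2 pooling blocks. It illustrates the shared convolution plus probabilistic max pooling shown in the figure, not a full CDBN (no learning, no sampling, no top-down pass).

```python
# Sketch: V --(shared filter W^k)--> detection layer H --> pooling layer P.
import numpy as np
from scipy.signal import convolve2d

def cdbn_layer(V, W, bias, block=2):
    I = convolve2d(V, W, mode="valid") + bias   # bottom-up input to H
    H = np.zeros_like(I)
    P = np.zeros((I.shape[0] // block, I.shape[1] // block))
    for r in range(0, I.shape[0] - I.shape[0] % block, block):
        for c in range(0, I.shape[1] - I.shape[1] % block, block):
            blk = I[r:r+block, c:c+block].ravel()
            e = np.exp(blk - blk.max()); e0 = np.exp(-blk.max())
            # Probability each detection unit is the block's single active one.
            H[r:r+block, c:c+block] = (e / (e0 + e.sum())).reshape(block, block)
            P[r // block, c // block] = 1 - e0 / (e0 + e.sum())  # pooling unit on
    return H, P

V = np.random.default_rng(2).normal(size=(16, 16))      # toy visible layer
W = np.random.default_rng(3).normal(size=(5, 5)) * 0.1  # one shared filter W^k
H, P = cdbn_layer(V, W, bias=0.0)
print(H.shape, P.shape)                                 # (12, 12) -> (6, 6)
```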

Page 17: Deep Learning - A Literature survey

Convolutional DBN on face images

[Figure: the learned feature hierarchy, from pixels to edges to object parts (combinations of edges) to object models.]

Page 18: Deep Learning - A Literature survey

Learning of object parts

[Figure: examples of learned object parts from four object categories: faces, cars, elephants, and chairs.]

Page 19: Deep Learning - A Literature survey

Training on multiple objects

[Figure: plot of the conditional entropy H(class | neuron active) for each neuron.]

Trained on 4 classes (cars, faces, motorbikes, airplanes).

Second layer: shared features and object-specific features.

Third layer: more class-specific features.
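A small sketch of the quantity plotted above, with made-up activation counts: the conditional entropy H(class | neuron active) is low for an object-specific neuron and high for a shared one.

```python
# Sketch: conditional entropy of the class label given a neuron fired.
import numpy as np

def class_entropy_given_active(counts):
    """counts[c] = how often the neuron fired on class c (4 classes here)."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

shared   = np.array([25, 25, 25, 25])  # fires equally for all 4 classes
specific = np.array([97, 1, 1, 1])     # fires almost only for faces, say
print(class_entropy_given_active(shared))    # 2.0 bits: fully shared feature
print(class_entropy_given_active(specific))  # ~0.24 bits: object-specific
```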

Page 20: Deep Learning - A Literature survey

Unsupervised & Supervised Training

• Unsupervised feature learning: does it work?

Page 21: Deep Learning - A Literature survey

EXPERIMENTS & RESULTS

Page 22: Deep Learning - A Literature survey

State-of-the-art task performance

Audio
• TIMIT phone classification: prior art (Clarkson et al., 1999) 79.6%; Stanford feature learning 80.3%
• TIMIT speaker identification: prior art (Reynolds, 1995) 99.7%; Stanford feature learning 100.0%

Images
• CIFAR object classification: prior art (Yu and Zhang, 2010) 74.5%; Stanford feature learning 75.5%
• NORB object classification: prior art (Ranzato et al., 2009) 94.4%; Stanford feature learning 96.2%

Multimodal (audio/video)
• AVLetters lip reading: prior art (Zhao et al., 2009) 58.9%; Stanford feature learning 63.1%

Video
• UCF activity classification: prior art (Kalser et al., 2008) 86%; Stanford feature learning 87%
• Hollywood2 classification: prior art (Laptev, 2004) 47%; Stanford feature learning 50%

Page 23: Deep Learning - A Literature survey

• Fig. 1. DeSTIN hierarchy for the MNIST dataset studies. Four layers are used, with 64, 16, 4, and 1 nodes per layer, arranged in a hierarchical manner.

• At each node, the output belief b(s) at each temporal step is fed to a parent node.

• At each temporal step, the parent receives input beliefs from four child nodes to generate its own belief (fed to its parent) and an advice value a, which is fed back to the child nodes.
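A loose sketch of the per-node update this describes. The recursion b'(s') ∝ Pr(o | s') Σ_s Pr(s' | s, a) b(s) is an assumption based on the standard DeSTIN formulation, and the observation and transition models below are toy stand-ins.

```python
# Sketch: one DeSTIN-style belief update at a single node.
import numpy as np

rng = np.random.default_rng(4)
n_states = 4
b = np.full(n_states, 1 / n_states)                  # current belief b(s)
T = rng.dirichlet(np.ones(n_states), size=n_states)  # toy Pr(s'|s, a), rows sum to 1

def update_belief(b, likelihood, T):
    """likelihood[s'] = Pr(observation | s'); returns the new belief, which
    the node feeds upward to its parent at each temporal step."""
    b_new = likelihood * (T.T @ b)   # observation model * predicted state
    return b_new / b_new.sum()       # renormalize to a distribution

likelihood = rng.random(n_states)    # toy Pr(o | s') from the node's input
b = update_belief(b, likelihood, T)
print(b)
```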

Page 24: Deep Learning - A Literature survey
Page 25: Deep Learning - A Literature survey

Named-entity recognition (NER)

• Also known as entity identification and entity extraction, NER is a subtask of information extraction that seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, and locations, expressions of times, quantities, monetary values, percentages, etc.

• Most research on NER systems has been structured as taking an unannotated block of text, such as this one:

• “Jim bought 300 shares of Acme Corp. in 2006.”

Page 26: Deep Learning - A Literature survey

• And producing an annotated block of text, such as this one:

<ENAMEX TYPE="PERSON"> Jim </ENAMEX> bought <NUMEX TYPE="QUANTITY"> 300 </NUMEX> shares of <ENAMEX TYPE="ORGANIZATION"> Acme Corp. </ENAMEX> in <TIMEX TYPE="DATE">2006</TIMEX>
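A toy sketch of producing that annotated block. Real NER systems learn such tags from data; this rule-based tagger only covers the example sentence, reusing the ENAMEX/NUMEX/TIMEX tag names shown above.

```python
# Sketch: rule-based tagging of the slide's example sentence.
import re

def tag(text):
    # Organizations: capitalized word followed by "Corp." or "Inc.".
    text = re.sub(r"\b([A-Z][a-z]+ (?:Corp|Inc)\.)",
                  r'<ENAMEX TYPE="ORGANIZATION">\1</ENAMEX>', text)
    # Person: naively, the sentence-initial word.
    text = re.sub(r"^(\w+)\b", r'<ENAMEX TYPE="PERSON">\1</ENAMEX>', text)
    # Dates: a bare 4-digit year.
    text = re.sub(r"\b(\d{4})\b", r'<TIMEX TYPE="DATE">\1</TIMEX>', text)
    # Quantities: a small number directly before "shares".
    text = re.sub(r"\b(\d{1,3}) shares\b",
                  r'<NUMEX TYPE="QUANTITY">\1</NUMEX> shares', text)
    return text

print(tag("Jim bought 300 shares of Acme Corp. in 2006."))
```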

• State-of-the-art NER systems for English produce near-human performance. For example, the best system entering MUC-7 achieved an F measure of 93.39%, while human annotators scored 97.60% and 96.95%.
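For reference, the balanced F measure quoted above is the harmonic mean of precision P and recall R: F = 2PR / (P + R).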

Page 27: Deep Learning - A Literature survey

CONCLUSION & FUTURE WORK

• Test results show that a deep learning approach allows better classification than popular classifiers trained on the handcrafted features chosen in this work.

• This is a significant advantage over the typical classification approach that requires careful (and possibly time consuming) selection of features.

• Instead of hand-tuning features, use unsupervised feature learning.

• Advanced topics:
  o Self-taught learning
  o Scaling up

Page 28: Deep Learning - A Literature survey

• More practical implementations need to be built.
• Research is ongoing at Stanford University.

Page 29: Deep Learning - A Literature survey

REFERENCES

• [1] D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio, “Why does unsupervised pre-training help deep learning?,” Journal of Machine Learning Research, vol. 11, pp. 625–660, Feb. 2010.

• [2] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion," Journal of Machine Learning Research, vol. 11, 2010.

• [3] G. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

• [4] D. Keysers, “Comparison and combination of state-of-the-art techniques for handwritten character recognition: Topping the MNIST benchmark,” arXiv preprint arXiv:0710.2231, 2007.

• [5] H. Lee, Y. Largman, P. Pham, and A. Ng, “Unsupervised feature learning for audio classification using convolutional deep belief networks,” Advances in Neural Information Processing Systems, vol. 22, pp. 1096–1104, 2009.

• [6] F. Q. Lauzon, “An introduction to deep learning,” in Proc. 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA), pp. 1438–1439, 2012.