Deep Learning - Theory and Practice
20-02-2020
Linear Regression, Least Squares Classification and Logistic Regression
http://leap.ee.iisc.ac.in/sriram/teaching/DL20/
Least Squares for Classification
Bishop - PRML book (Chap 4)
❖ K-class classification problem
❖ With 1-of-K (one-hot) target encoding and least squares regression
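A minimal sketch of the idea above, assuming NumPy and a closed-form least squares fit to 1-of-K targets. The function names, the toy clusters, and the bias handling (a constant 1 appended to each input) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def fit_least_squares_classifier(X, labels, num_classes):
    """Fit one linear function per class to 1-of-K targets by least squares."""
    # Prepend a constant 1 to each input so the bias is learned as a weight.
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    # 1-of-K target matrix: row n has a 1 in column labels[n] and 0 elsewhere.
    T = np.eye(num_classes)[labels]
    # Least squares solution of X_aug @ W ~= T.
    W, *_ = np.linalg.lstsq(X_aug, T, rcond=None)
    return W

def predict(W, X):
    """Assign each input to the class with the largest linear output."""
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.argmax(X_aug @ W, axis=1)

# Hypothetical toy data: three well-separated 2-D clusters.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
labels = np.repeat(np.arange(3), 20)
X = centers[labels] + 0.3 * rng.standard_normal((60, 2))

W = fit_least_squares_classifier(X, labels, num_classes=3)
accuracy = np.mean(predict(W, X) == labels)
```

With a bias term, the K outputs for any input sum to 1, but they are not constrained to lie in [0, 1], so they cannot be read directly as probabilities.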
Logistic Regression
Bishop - PRML book (Chap 4)
❖ 2-class logistic regression
❖ K-class logistic regression
❖ Maximum likelihood solution
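A minimal sketch of the 2-class case, assuming NumPy, targets in {0, 1}, and plain gradient descent on the mean negative log-likelihood (cross-entropy). The learning rate, step count, and toy data are illustrative assumptions; the slides do not specify an optimizer.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def negative_log_likelihood(w, X_aug, t):
    """Mean cross-entropy between targets t and predictions sigmoid(X_aug @ w)."""
    y = sigmoid(X_aug @ w)
    eps = 1e-12  # guards against log(0)
    return -np.mean(t * np.log(y + eps) + (1 - t) * np.log(1 - y + eps))

def fit_logistic_regression(X, t, learning_rate=0.1, num_steps=2000):
    """Maximum likelihood fit by gradient descent; there is no closed form."""
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(X_aug.shape[1])
    for _ in range(num_steps):
        y = sigmoid(X_aug @ w)
        # Gradient of the mean negative log-likelihood with respect to w.
        gradient = X_aug.T @ (y - t) / len(t)
        w -= learning_rate * gradient
    return w

# Hypothetical toy data: two overlapping Gaussian classes in 2-D.
rng = np.random.default_rng(0)
t = np.repeat([0.0, 1.0], 100)
means = np.where(t[:, None] == 1.0, 1.0, -1.0)
X = means + rng.standard_normal((200, 2))

w = fit_logistic_regression(X, t)
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
final_nll = negative_log_likelihood(w, X_aug, t)
accuracy = np.mean((sigmoid(X_aug @ w) > 0.5) == (t == 1.0))
```

The K-class case follows the same pattern with a softmax in place of the sigmoid and 1-of-K targets.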
Typical Error Surfaces
Typical error surface as a function of the parameters (weights and biases)
Learning with Gradient Descent
Error surface close to a local minimum
Learning Using Gradient Descent
Parameter Learning
• Solving a non-convex optimization problem.
• Iterative solution.
• Depends on the initialization.
• Convergence to a local optimum.
• Judicious choice of learning rate.
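The points above can be illustrated with a small sketch, assuming a hypothetical one-dimensional non-convex error function E(w) = w^4 - 3w^2 + w chosen only for illustration. Gradient descent started from two different initial values settles in two different minima.

```python
def error(w):
    """Hypothetical non-convex error surface with two minima."""
    return w**4 - 3 * w**2 + w

def error_gradient(w):
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w_init, learning_rate=0.01, num_steps=500):
    """Iterate w <- w - learning_rate * dE/dw from the given starting point."""
    w = w_init
    for _ in range(num_steps):
        w -= learning_rate * error_gradient(w)
    return w

# Same algorithm and learning rate, different initializations.
w_from_left = gradient_descent(w_init=-2.0)   # settles near w = -1.30
w_from_right = gradient_descent(w_init=2.0)   # settles near w = 1.13
```

The run started at 2.0 stops in the shallower minimum, so its final error is higher than that of the run started at -2.0. A much larger learning rate can overshoot and diverge, while a much smaller one converges slowly.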
Least Squares versus Logistic Regression
Bishop - PRML book (Chap 4)
Perceptron Algorithm
What if the data is not linearly separable?
Perceptron Model [McCulloch, 1943; Rosenblatt, 1957]
Targets are binary class labels in {-1, +1}
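A minimal sketch of the perceptron learning rule with targets in {-1, +1}, assuming NumPy. The function name, the epoch cap, and the AND/XOR toy problems are illustrative assumptions. On linearly separable data the loop stops once an epoch makes no mistakes; on non-separable data it never does, so the cap ends training.

```python
import numpy as np

def train_perceptron(X, t, learning_rate=1.0, max_epochs=100):
    """Return (weights, converged) after cycling through the data."""
    # Prepend a constant 1 so the bias is the first weight.
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(X_aug.shape[1])
    for _ in range(max_epochs):
        num_errors = 0
        for x_n, t_n in zip(X_aug, t):
            # A point is misclassified when target and activation disagree in sign.
            if t_n * (w @ x_n) <= 0:
                w += learning_rate * t_n * x_n
                num_errors += 1
        if num_errors == 0:
            return w, True
    return w, False

def predict(w, X):
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.where(X_aug @ w > 0, 1, -1)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t_and = np.array([-1, -1, -1, 1])   # linearly separable
t_xor = np.array([-1, 1, 1, -1])    # not linearly separable

w_and, converged_and = train_perceptron(X, t_and)
w_xor, converged_xor = train_perceptron(X, t_xor)
```

Here `converged_and` is True and `converged_xor` is False: no single linear boundary classifies XOR correctly, which motivates the multi-layer model.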
Multi-layer Perceptron [Hopfield, 1982]
thresholding function
non-linear function (tanh, sigmoid)
Neural Networks
• Useful for classifying data with non-linear class boundaries: non-linear class separation can be realized given enough data.
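As a small illustration of non-linear class separation, the sketch below builds a two-layer network with thresholding units that computes XOR, which no single linear boundary can separate. The weights are hand-chosen assumptions for illustration (one hidden unit acts as OR, the other as AND); they are not learned and are not from the slides.

```python
import numpy as np

def step(a):
    """Thresholding function: 1 where the activation is positive, else 0."""
    return (a > 0).astype(float)

# Hand-chosen weights (illustrative): hidden unit 0 computes OR, unit 1 computes AND.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])
# Output fires when OR is on and AND is off, which is XOR.
w_out = np.array([1.0, -1.0])
b_out = -0.5

def mlp_forward(X):
    hidden = step(X @ W_hidden.T + b_hidden)
    return step(hidden @ w_out + b_out)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = mlp_forward(X)
```

In practice the hard threshold is replaced by a smooth non-linearity such as tanh or sigmoid so that the weights can be learned with gradient descent.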
Neural Networks
Types of Non-linearities: tanh, sigmoid, ReLU
Cost Function (measured against the desired outputs): Mean Square Error, Cross Entropy
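A minimal sketch of the three non-linearities and the two cost functions named on this slide, assuming NumPy and batches stored as arrays of shape (num_examples, num_outputs). The function names and the small `eps` constant are illustrative assumptions.

```python
import numpy as np

def tanh(a):
    return np.tanh(a)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def relu(a):
    return np.maximum(0.0, a)

def mean_square_error(outputs, targets):
    """Squared error summed over output units, averaged over examples."""
    return np.mean(np.sum((outputs - targets) ** 2, axis=1))

def cross_entropy(outputs, targets, eps=1e-12):
    """Cross entropy between 1-of-K targets and outputs that sum to 1."""
    # eps guards against log(0) when an output is exactly zero.
    return -np.mean(np.sum(targets * np.log(outputs + eps), axis=1))

# Example: one input with two output units and the desired output [1, 0].
outputs = np.array([[0.5, 0.5]])
targets = np.array([[1.0, 0.0]])
mse_value = mean_square_error(outputs, targets)
ce_value = cross_entropy(outputs, targets)
```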
Learning Posterior Probabilities with NNs
Choice of target function
• Softmax function for classification
• Softmax produces positive values that sum to 1
• Allows the interpretation of outputs as posterior probabilities
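A minimal sketch of the softmax function, assuming NumPy. Subtracting the row maximum before exponentiating is a common numerical-stability choice and does not change the result; the example activations are arbitrary.

```python
import numpy as np

def softmax(activations):
    """Map each row of activations to positive values that sum to 1."""
    # Shifting by the row maximum avoids overflow in exp without changing the output.
    shifted = activations - np.max(activations, axis=-1, keepdims=True)
    exponentials = np.exp(shifted)
    return exponentials / np.sum(exponentials, axis=-1, keepdims=True)

# Arbitrary output-layer activations for two inputs and three classes.
activations = np.array([[2.0, 1.0, 0.1],
                        [0.0, 0.0, 0.0]])
posteriors = softmax(activations)
```

Each row of `posteriors` is positive and sums to 1, and the largest activation keeps the largest value, so the predicted class is unchanged.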