
CSE 150, Spring 2007 Gary Cottrell’s modifications of slides originally produced by David Kriegman

Neural Networks

Introduction to Artificial Intelligence, CSE 150

May 29, 2007


Administration
• Last programming assignment has been posted!

• Final Exam: Tuesday, June 12, 11:30-2:30


Last Lecture
• Naïve Bayes classifier (needed for the programming assignment)
• Reading: Chapter 13 (for probability)


This Lecture
• Neural Networks
  – Representation
  – Perceptron
  – Learning in linear networks
• Reading: Chapter 20, section 5 only


This Lecture

Neural Networks


What & Why
• Artificial Neural Networks: a bottom-up attempt to model the functionality of the brain
• Two main areas of activity:
  – Biological: try to model biological neural systems
  – Computational
• Artificial neural networks are biologically inspired but not necessarily biologically plausible
• So you may see other terms: Connectionism, Parallel Distributed Processing, Adaptive Systems Theory
• Interest in neural networks differs according to profession.


Some Applications: NETTALK vs. DECTALK
DECTALK is a system developed by DEC which reads English characters and produces, with 95% accuracy, the correct pronunciation for an input word or text pattern.
• DECTALK is an expert system which took 20 years to finalise.
• It uses a list of pronunciation rules and a large dictionary of exceptions.
NETTALK, a neural network version of DECTALK, was constructed over one summer vacation.
• After 16 hours of training, the system could read a 100-word sample text with 98% accuracy!
• When trained with 15,000 words, it achieved 86% accuracy on a test set.
NETTALK illustrates one of the advantages of the neural network approach over the symbolic approach to Artificial Intelligence:
• It is difficult to simulate the learning process in a symbolic system; rules and exceptions must be known.
• Neural systems, on the other hand, exhibit learning very clearly; the network learns by example.


Human Brain & Connectionist Model
Consider humans:
• Neuron switching time: ~0.001 second
• Number of neurons: ~10^10
• Connections per neuron: ~10^4
• Scene recognition time: ~0.1 second
• 100 inference steps doesn't look like enough, so the brain must be doing massively parallel computation

Properties of ANNs:
• Many neuron-like switching units
• Many weighted interconnections among units
• Highly parallel, distributed processing
• Emphasis on tuning weights automatically


Perceptrons

Xi : inputs
Wi : weights
θ : threshold

Perceptrons are linear threshold units: the unit outputs 1 if W1X1 + … + WnXn ≥ θ, and 0 otherwise.
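A linear threshold unit can be sketched in a few lines of Python (illustrative, not from the slides; the function name `ltu` is made up):

```python
def ltu(x, w, theta):
    """Linear threshold unit: fire (output 1) iff the weighted sum
    of the inputs meets or exceeds the threshold theta."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= theta else 0

# Example: with weights [1, 1] and threshold 1.5 the unit computes
# logical AND of two binary inputs.
print(ltu([1, 1], [1, 1], 1.5))  # fires only when both inputs are 1
```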


Perceptrons

The threshold can be easily forced to 0 by introducing an additional weight W0 = θ from an input unit that is always -1.


How powerful is a perceptron? (Threshold = 0)


Concept Space & Linear Separability


Learning rule:

• If I'm ON and I'm supposed to be OFF, LOWER the weights from active inputs and RAISE the threshold.

• If I'm OFF and I'm supposed to be ON, RAISE the weights from active inputs and LOWER the threshold.
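This rule can be sketched directly in Python (illustrative; the function name and the learning-rate parameter `lr` are not from the slides):

```python
def perceptron_update(x, w, theta, output, target, lr=1.0):
    """One step of the perceptron learning rule.
    If the unit is ON but should be OFF, lower the weights from active
    inputs and raise the threshold; if OFF but should be ON, do the
    opposite. Correct outputs leave the unit unchanged."""
    if output == target:
        return w, theta
    sign = 1 if target == 1 else -1       # direction of the correction
    w = [wi + sign * lr * xi for wi, xi in zip(w, x)]
    theta = theta - sign * lr
    return w, theta
```

Note that only active inputs (xi ≠ 0) actually change their weights, exactly as the rule states.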


Training Perceptrons


• STOP HERE!!!
• Now do: derivation of the learning rule for linear networks
• Derivation of the learning rule for non-linear networks.


Begin: Some slides borrowed and modified from Sanjoy Dasgupta


Learning classifiers

Input space: X = {28×28 images}

Classes (or labels): Y = {0, 1, 2, …, 9}

Given some labeled examples, learn to classify images, i.e., learn a prediction rule

f: X → Y

Most work assumes a binary classification task, such as Y = {-1, +1}, e.g.

-1: {digits 0, 2, 3, 4, 5, 6, 7, 8, 9}

+1: {digit 1}


"Batch" learning

1. Start with a set of labeled examples.

2. Learn an underlying rule f: X → Y.

3. Use this to classify future examples.


Linear classifiers

The images X lie in R^784. ASSUME they have had the mean subtracted: if they are linearly separable, then we don't have to worry about the threshold.

A linear classifier is parametrized by a weight vector w:

fw(x) = sign(w · x)

[Figure: + and - points in the plane, separated by a hyperplane through the origin with normal vector w]


Learning linear classifiers

Given training data (x(1), y(1)), …, (x(m), y(m)),

find a vector w such that

y(i) (w · x(i)) ≥ 0 for all i = 1, …, m.

This is linear programming.


The trouble with LP

[Figure: two linearly separable datasets of + and - points. One, where the two classes nearly touch, is labeled DIFFICULT; the other, with the classes separated by a wide margin r, is labeled EASY.]


When there's a margin:

Perceptron algorithm (Rosenblatt, 1958)

  w = 0
  while some (x, y) is misclassified:
      w = w + yx

Claim: If all points have length at most one, and there is a margin r > 0, then this algorithm makes at most 1/r^2 updates.
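A minimal sketch of this algorithm in Python (the toy dataset is made up; as on the earlier slide, the data are assumed separable through the origin, so no threshold term appears):

```python
def perceptron(data, max_updates=1000):
    """Rosenblatt's perceptron algorithm: repeatedly sweep the data,
    and on each misclassified example (x, y) update w = w + y*x.
    Returns the final weight vector and the number of updates made."""
    dim = len(data[0][0])
    w = [0.0] * dim
    updates = 0
    while updates < max_updates:
        mistake = False
        for x, y in data:
            # y * (w . x) <= 0 means (x, y) is misclassified (or on the boundary)
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]  # w = w + yx
                updates += 1
                mistake = True
        if not mistake:           # a full pass with no mistakes: done
            return w, updates
    return w, updates

# Toy linearly separable data: (input vector, label in {-1, +1})
data = [([1, 1], 1), ([2, 1], 1), ([-1, -1], -1), ([-2, -1], -1)]
w, updates = perceptron(data)
```

On separable data like this, the loop terminates with every example strictly on the correct side of the hyperplane, consistent with the mistake bound claimed above.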


Perceptron in action

[Figure: labeled + and - points, numbered 1-10 in the order the algorithm visits them; after the updates shown, w = x(1) - x(7).]

Sparse representation: [(1, +1), (7, -1)]

  w = 0
  while some (x, y) is misclassified:
      w = w + yx


End: Some slides borrowed and modified from Sanjoy Dasgupta

• I didn't really use the following slides, but you may find them useful!!

• Most of what I did was on the board…


Increasing Expressiveness: Multi-Layer Neural Networks

2-layer Neural Net
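As an illustration of the added expressiveness (not an example from the slides): a 2-layer network of linear threshold units with hand-chosen weights can compute XOR, which no single perceptron can, since XOR is not linearly separable.

```python
def step(s, theta):
    """Linear threshold unit on a precomputed weighted sum s."""
    return 1 if s >= theta else 0

def xor_net(x1, x2):
    """2-layer threshold network computing XOR; weights are
    hand-chosen for illustration, not learned."""
    h1 = step(x1 + x2, 0.5)    # hidden unit 1: OR(x1, x2)
    h2 = step(x1 + x2, 1.5)    # hidden unit 2: AND(x1, x2)
    return step(h1 - h2, 0.5)  # output: OR-but-not-AND = XOR
```

The hidden layer carves the input space with two parallel hyperplanes, and the output unit combines the two half-space tests, something a single hyperplane cannot express.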


Comments on Training
• No guarantee of convergence; may oscillate or reach a local minimum.
• In practice, many large networks can be trained on large amounts of data for realistic problems.
• Many epochs (tens of thousands) may be needed for adequate training. Large datasets may require many hours or days of CPU time.
• Termination criteria: number of epochs; threshold on training set error; no further decrease in error; increased error on a validation set.
• To avoid local minima: several trials with different random initial weights.


Preference Bias: Early Stopping

• Overfitting in ANNs
• Stop training when error goes up on the validation set
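A sketch of the early-stopping rule in Python. To keep the sketch self-contained, the per-epoch validation errors are passed in as a precomputed list; in real training each entry would come from evaluating the network on the validation set after an epoch:

```python
def train_with_early_stopping(val_errors, patience=1):
    """Scan validation error epoch by epoch; stop once it has risen
    for `patience` consecutive epochs, and report the epoch (and error)
    at which the weights should be kept."""
    best_err, best_epoch, bad_epochs = float("inf"), -1, 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch, bad_epochs = err, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:   # validation error went up: stop
                break
    return best_epoch, best_err

# Validation error falls, then rises as the network starts to overfit;
# training stops at the minimum rather than running all epochs.
best_epoch, best_err = train_with_early_stopping([0.5, 0.3, 0.2, 0.25, 0.4])
```

In practice one saves the weights at the best epoch and restores them when stopping, which is why the best epoch is returned rather than the last.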


Conclusions
• Perceptrons: learning is convergent, but expressiveness is limited.
• Multi-layer neural networks are more expressive and can be trained using gradient descent (backprop).
• Much slower to train than other learning methods, but exploring a rich space works well in many domains.
• The representation is not very accessible, and it is difficult to take advantage of domain (prior) knowledge.
• The analogy to the brain and successful applications have generated significant interest.