Wed June 12

Transcript of Wed June 12
• Goals of today's lecture:
– Learning mechanisms
– Where is AI and where is it going? What to look for in the future? Status of the Turing test?
– Material and guidance for the exam.
– Discuss any outstanding problems on the last assignment.
Automated Learning Techniques
• ID3: a technique for automatically developing a good decision tree based on a given classification of examples and counter-examples.
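As a sketch of the idea behind ID3 (not the lecture's exact formulation): at each node, the attribute chosen to split on is the one with the highest information gain, computed from the entropy of the class labels. The function names below are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (bits) of a list of class labels.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, labels, attr):
    # Expected entropy reduction from splitting on attribute index `attr`.
    total = len(examples)
    split = {}
    for x, y in zip(examples, labels):
        split.setdefault(x[attr], []).append(y)
    remainder = sum(len(ys) / total * entropy(ys) for ys in split.values())
    return entropy(labels) - remainder
```

ID3 recurses: split on the best attribute, then repeat on each branch until the labels are pure.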
Automated Learning Techniques
• Algorithm W (Winston): an algorithm that develops a “concept” based on examples and counter-examples.
Automated Learning Techniques
• Perceptron: an algorithm that develops a classification based on examples and counter-examples.
• Non-linearly separable techniques (neural networks, support vector machines).
Perceptrons
Learning in Neural Networks
Natural versus Artificial Neuron
• Natural neuron versus McCulloch-Pitts neuron
One Neuron: McCulloch-Pitts

• This is very complicated. But abstracting the details, we have: inputs x1, x2, …, xn, weighted by w1, w2, …, wn; the neuron integrates (sums) the weighted inputs and fires when the sum exceeds a threshold.

Integrate-and-fire neuron.
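The abstraction above can be sketched in a few lines (a minimal model of the unit, not the biological details):

```python
def mp_neuron(inputs, weights, threshold):
    # Integrate: weighted sum of the inputs.
    total = sum(w * x for w, x in zip(weights, inputs))
    # Fire (output 1) iff the sum reaches the threshold.
    return 1 if total >= threshold else 0
```

With weights (1, 1), threshold 2 gives AND and threshold 1 gives OR, foreshadowing the representability discussion below.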
Perceptron

• Pattern identification (note: the neuron is trained).
• The weights are learned: if Σ_i w_i x_i > A, the letter is in the receptive field.
Three Main Issues
• Representability
• Learnability
• Generalizability
One Neuron (Perceptron)
• What can be represented by one neuron?
• Is there an automatic way to learn a function by examples?
Feed Forward Network

• Weighted connections; each unit fires when Σ_i w_i x_i over its receptive field exceeds its threshold A.
Representability
• What functions can be represented by a network of McCulloch-Pitts neurons?
• Theorem: every logic function of an arbitrary number of variables can be represented by a three-level network of neurons.
Proof
• Show the simple functions: AND, OR, NOT, IMPLIES.
• Recall the representability of logic functions in DNF (disjunctive normal form).
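A sketch of the construction (the weights and thresholds below are illustrative choices, but each is a correct threshold unit): realize AND, OR, NOT as single units, then wire any DNF formula as a three-level network. XOR = (a ∧ ¬b) ∨ (¬a ∧ b) makes a good example:

```python
def unit(xs, ws, theta):
    # One McCulloch-Pitts threshold unit.
    return 1 if sum(w * x for w, x in zip(ws, xs)) >= theta else 0

def AND(a, b): return unit([a, b], [1, 1], 2)
def OR(a, b):  return unit([a, b], [1, 1], 1)
def NOT(a):    return unit([a], [-1], 0)

def XOR(a, b):
    # Three levels: NOTs, then one AND per DNF term, then the final OR.
    return OR(AND(a, NOT(b)), AND(NOT(a), b))
```

Any truth table can be built this way: one AND unit per satisfying row, fed into a single OR.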
Perceptron
• What is representable? Linearly Separable Sets.
• Example: AND, OR function
• Not representable: XOR
• High Dimensions: How to tell?
• Question: Convex? Connected?
AND
OR
XOR
Convexity: Representable by a simple extension of the perceptron

• Clue: a body is convex if, whenever two points are inside it, any point between them is also inside.
• So take a perceptron with an input for each triple of points.
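One way to make the clue concrete (a simplified sketch on a finite grid of points, using midpoint sensors; the real construction uses all betweenness triples): each order-3 sensor looks at two points and the point between them, and fires on a convexity violation. The body is convex exactly when no sensor fires, so a single threshold over the sensors decides convexity.

```python
from itertools import combinations

def is_convex(inside, points):
    # `inside`: the set of grid points classified as inside the body.
    # One sensor per pair (p, q) with a grid midpoint: it fires when
    # p and q are inside but the point between them is not.
    for p, q in combinations(sorted(inside), 2):
        mid = tuple((a + b) / 2 for a, b in zip(p, q))
        if mid in points and mid not in inside:
            return False   # a sensor fired: not convex
    return True            # no sensor fired: convex on this grid
```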
Connectedness: Not Representable
Representability

• Perceptron: only linearly separable sets
– AND versus XOR
– Convex versus connected
• Many linked neurons: universal
– Proof: show AND, OR, NOT representable
– Then apply the DNF representation theorem
Learnability

• Perceptron Convergence Theorem:
– If the set is representable, then the perceptron algorithm converges
– Proof (from slides)
• Multi-neuron networks: good heuristic learning techniques
Generalizability

• Typically train a perceptron on a sample set of examples and counter-examples
• Then use it on the general class
• Training can be slow, but execution is fast
• Main question: How does training on the training set carry over to the general class? (Not simple)
Programming: Just find the weights!
• AUTOMATIC PROGRAMMING (or learning)
• One Neuron: Perceptron or Adaline
• Multi-level: gradient descent on a continuous neuron (a sigmoid instead of a step function).
Perceptron Convergence Theorem
• If a separating perceptron exists, then the perceptron learning algorithm will find one in finite time.
• That is, IF there is a set of weights and a threshold which correctly classifies a class of examples and counter-examples, THEN one such set of weights can be found by the algorithm.
Perceptron Training Rule
• LOOP: Take a positive or negative example and apply it to the network.
– If the answer is correct, go to LOOP.
– If incorrect, go to FIX.
• FIX: Adjust the network weights using the input example X:
– If a positive example was misclassified: Wnew = Wold + X; decrease the threshold.
– If a negative example was misclassified: Wnew = Wold − X; increase the threshold.
• Go to LOOP.
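The loop above can be sketched in Python (a minimal version with labels +1/−1; names and the epoch cap are illustrative; a missed positive example lowers the threshold and a missed negative raises it, the direction that makes the update help):

```python
def train_perceptron(examples, epochs=100):
    # examples: list of (x, label) pairs, label +1 (positive) or -1 (negative).
    n = len(examples[0][0])
    w = [0.0] * n       # weights, initially zero
    theta = 0.0         # threshold, initially zero
    for _ in range(epochs):
        errors = 0
        for x, label in examples:
            fired = sum(wi * xi for wi, xi in zip(w, x)) > theta
            if fired == (label > 0):
                continue            # correct: back to LOOP
            errors += 1             # incorrect: go to FIX
            if label > 0:           # missed positive: add X, lower threshold
                w = [wi + xi for wi, xi in zip(w, x)]
                theta -= 1.0
            else:                   # missed negative: subtract X, raise threshold
                w = [wi - xi for wi, xi in zip(w, x)]
                theta += 1.0
        if errors == 0:
            break                   # no mistakes in a full pass: done
    return w, theta
```

On a linearly separable set such as AND, the convergence theorem below guarantees this loop terminates.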
Perceptron Conv Theorem (again)
• Preliminary: note we can simplify the proof without loss of generality:
– use only positive examples (replace each negative example X by −X)
– assume the threshold is 0 (go up one dimension by encoding X as (X, 1))
Perceptron Training Rule (simplified)
• LOOP: Take a positive example and apply it to the network.
– If the answer is correct, go to LOOP.
– If incorrect, go to FIX.
• FIX: Adjust the network weights by the input example: Wnew = Wold + X.
• Go to LOOP.
Proof of Conv Theorem

• Note:
1. By hypothesis, there is a δ > 0 such that V*·X > δ for all X in F.
2. We can eliminate the threshold (add an additional dimension to the input): W·(x,y,z) > threshold if and only if W*·(x,y,z,1) > 0.
3. We can assume all examples are positive ones (replace negative examples by their negated vectors): W·(x,y,z) < 0 if and only if W·(−x,−y,−z) > 0.
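These two simplifications amount to a small preprocessing step, sketched here (labels +1/−1; the function name is illustrative):

```python
def normalize(examples):
    # (1) Absorb the threshold: append a constant 1 coordinate, so a
    #     threshold test W.x > theta becomes (W, -theta).(x, 1) > 0.
    # (2) Make every example positive: negate the negative ones,
    #     since W.X < 0 if and only if W.(-X) > 0.
    out = []
    for x, label in examples:
        v = list(x) + [1.0]
        if label < 0:
            v = [-c for c in v]
        out.append(v)
    return out
```

After this step, the learner only needs to find a W with W·X > 0 for every (transformed) example.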
Perceptron Conv. Thm. (ready for proof)

• Let F be a set of unit-length vectors. If there is a (unit) vector V* and a value δ > 0 such that V*·X > δ for all X in F, then the perceptron program goes to FIX only a finite number of times (regardless of the order of choice of vectors X).
• Note: if F is a finite set, then automatically there is such a δ.
Proof (cont).
• Consider the quotient V*·W / (|V*||W|).
(Note: this is the cosine of the angle between V* and W.)
Recall V* is a unit vector, so the quotient
= V*·W / |W|.
The quotient is ≤ 1.
Proof (cont.)

• Consider the numerator.
Each time FIX is visited, W changes via the ADD step:
V*·W(n+1) = V*·(W(n) + X)
= V*·W(n) + V*·X
> V*·W(n) + δ
Hence after n iterations:
V*·W(n) > nδ    (*)
Proof (cont.)

• Now consider the denominator:
|W(n+1)|² = W(n+1)·W(n+1)
= (W(n) + X)·(W(n) + X)
= |W(n)|² + 2 W(n)·X + 1   (recall |X| = 1)
< |W(n)|² + 1   (we are in FIX, so W(n)·X < 0)
So after n FIX steps:
|W(n+1)|² < n    (**)
Proof (cont.)

• Putting (*) and (**) together:
Quotient = V*·W(n) / |W(n)| > nδ / sqrt(n) = δ sqrt(n)
Since the quotient is ≤ 1, this means δ sqrt(n) < 1, i.e. n < 1/δ².
So we enter FIX only a bounded number of times. Q.E.D.
Geometric Proof
• See hand slides.
Additional Facts
• Note: if the X's are presented in a systematic way, then a solution W is always found.
• Note: it is not necessarily the same as V*.
• Note: if F is not finite, the algorithm may not obtain a solution in finite time.
• The algorithm can be modified in minor ways and remains valid (e.g. bounded rather than unit-length examples); the bounds on W(n) change accordingly.
Percentage of Boolean Functions Representable by a Perceptron

| Inputs | Perceptrons | Boolean functions |
|-------:|------------:|------------------:|
| 1 | 4 | 4 |
| 2 | 14 | 16 |
| 3 | 104 | 256 |
| 4 | 1,882 | 65,536 |
| 5 | 94,572 | ~10^9 |
| 6 | 15,028,134 | ~10^19 |
| 7 | 8,378,070,864 | ~10^38 |
| 8 | 17,561,539,552,946 | ~10^77 |
What won't work?

• Example: connectedness, with a bounded-diameter perceptron.
• Compare with convexity, which is representable (using sensors of order three).
What won't work?
• Try XOR.
What about non-linearly separable problems?
• Find “near separable solutions”
• Use transformation of data to space where they are separable (SVM approach)
• Use multi-level neurons
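A tiny illustration of the transformation idea (a hypothetical quadratic feature map, not the full SVM machinery): XOR is not linearly separable in the plane, but it becomes separable once the product x1·x2 is added as a third coordinate.

```python
def phi(x):
    # Lift (x1, x2) into a 3-dimensional feature space.
    x1, x2 = x
    return (x1, x2, x1 * x2)

def xor_via_lift(x):
    # A single linear threshold in the lifted space computes XOR:
    # the plane x1 + x2 - 2*(x1*x2) = 0.5 separates the two classes.
    x1, x2, x12 = phi(x)
    return 1 if x1 + x2 - 2 * x12 > 0.5 else 0
```

SVMs do this implicitly via a kernel, without constructing the lifted coordinates explicitly.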
Multi-Level Neurons

• It is difficult to find a global learning algorithm like the perceptron's.
• But…
– It turns out that methods related to gradient descent on multi-parameter weights often give good results. This is what you see commercially now.
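A minimal sketch of the gradient-descent idea on a single continuous (sigmoid) neuron, minimizing squared error; the learning rate and step count are illustrative choices, not tuned values:

```python
import math

def sigmoid(z):
    # Smooth replacement for the step function, so the output is differentiable.
    return 1.0 / (1.0 + math.exp(-z))

def train_sigmoid_neuron(data, lr=0.5, steps=5000):
    # data: list of (inputs, target) pairs with targets in {0, 1}.
    n = len(data[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(steps):
        for x, t in data:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Derivative of 0.5*(y - t)^2 with respect to the pre-activation.
            delta = (y - t) * y * (1.0 - y)
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    return w, b
```

Backpropagation applies the same chain-rule gradient through multiple layers of such neurons.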
Applications
• Detectors (e.g. medical monitors)
• Noise filters (e.g. hearing aids)
• Future predictors (e.g. stock markets; also adaptive PDE solvers)
• Learn to steer a car!
• Many, many others …