Second-Order Perceptron
Nicolò Cesa-Bianchi, Alex Conconi, Claudio Gentile: A Second-Order Perceptron Algorithm. SIAM J. Comput. 34(3): 640-668 (2005)
Outline
• Introduction: Perceptron
• Algorithm: Second-order perceptron
• Analysis: Mistake bounds
• Simulations
• Conclusions
Perceptron Algorithm (F. Rosenblatt, 1958)
• One of the oldest machine learning algorithms
• Online algorithm for learning a linear threshold function with small error
Perceptron Algorithm (F. Rosenblatt, 1958)
• Goal: find a linear classifier with small error
If the prediction is correct, keep the current hypothesis; otherwise, update it.
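The predict-then-update loop can be sketched in a few lines of Python. This is a minimal illustration, assuming labels in {-1, +1}, a zero initial weight vector, the standard additive update `w + y_t * x_t`, and the convention that a score of exactly zero predicts +1. The function name `perceptron` and the toy data are hypothetical.

```python
import numpy as np

def perceptron(X, y, epochs=1):
    """Run the classical Perceptron online over the rows of X.

    X: (n, d) array of instances; y: length-n array of labels in {-1, +1}.
    Returns the final weight vector and the total number of mistakes.
    """
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(epochs):
        for x_t, y_t in zip(X, y):
            y_hat = 1 if w @ x_t >= 0 else -1  # predict with current weights
            if y_hat != y_t:                   # mistake: additive update
                w = w + y_t * x_t
                mistakes += 1
            # no mistake: the weights stay the same
    return w, mistakes

# Tiny linearly separable toy set (hypothetical data for illustration).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, mistakes = perceptron(X, y, epochs=5)
```

On this toy set the weights change only on rounds where the prediction is wrong, which is what makes the number of mistakes the natural quantity to bound.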
Perceptron Mistake Bound
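• Suppose $w^*$ separates the data: $y_i \, w^{*T} x_i > 0$ for all $i$
• Define the margin
$$\gamma = \frac{\min_i |w^{*T} x_i|}{\|w^*\|_2 \, \sup_i \|x_i\|_2}$$
– Margin: the larger it is, the more confident the separation
– Norm of $x$: the larger it is, the larger the mistake bound
• The number of mistakes the perceptron makes is at most $\gamma^{-2}$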
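As a worked example, the margin and the resulting bound can be computed directly for a finite sample, where the supremum over instances becomes a maximum. This is a sketch only: the helper name `normalized_margin`, the separator `w_star`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def normalized_margin(w_star, X):
    """gamma = min_i |w*^T x_i| / (||w*||_2 * max_i ||x_i||_2)."""
    scores = np.abs(X @ w_star)
    max_norm = np.linalg.norm(X, axis=1).max()
    return scores.min() / (np.linalg.norm(w_star) * max_norm)

# Hypothetical toy data, separated by w_star = (1, 1).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w_star = np.array([1.0, 1.0])

assert np.all(y * (X @ w_star) > 0)   # w_star really separates the data
gamma = normalized_margin(w_star, X)  # 3 / sqrt(10)
bound = gamma ** -2                   # 10 / 9: at most one mistake on this data
```

Rescaling all instances by a constant leaves `gamma` unchanged, so the bound depends on the geometry of the data rather than on its scale.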
Proof of Perceptron Mistake Bound [Novikoff, 1963]
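Proof: Let $v_k$ be the hypothesis before the $k$-th mistake. Assume that the $k$-th mistake occurs on the input example $(x_i, y_i)$. Recall that
$$\gamma = \frac{\min_i |w^{*T} x_i|}{\|w^*\|_2 \, \sup_i \|x_i\|_2}$$
First, lower-bound the inner product $w^{*T} v_{k+1}$. Second, upper-bound the squared norm $\|v_{k+1}\|_2^2$. Hence, combining the two bounds, $k \le \gamma^{-2}$.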
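The three steps can be written out as the standard Novikoff argument. This is a sketch under stated assumptions that are not given on the slide: the initial hypothesis is $v_1 = 0$, the update is $v_{k+1} = v_k + y_i x_i$, and the shorthands $m = \min_i |w^{*T} x_i|$ and $R = \sup_i \|x_i\|_2$ are introduced here for brevity, so that $\gamma = m / (\|w^*\|_2 R)$.

```latex
\begin{aligned}
\text{First:}\quad
  & w^{*T} v_{k+1} = w^{*T} v_k + y_i\, w^{*T} x_i \;\ge\; w^{*T} v_k + m
  && \Longrightarrow\quad w^{*T} v_{k+1} \ge k\,m \\
\text{Second:}\quad
  & \|v_{k+1}\|_2^2 = \|v_k\|_2^2 + 2\, y_i\, v_k^T x_i + \|x_i\|_2^2
    \;\le\; \|v_k\|_2^2 + R^2
  && \Longrightarrow\quad \|v_{k+1}\|_2^2 \le k\,R^2 \\
\text{Hence:}\quad
  & k\,m \;\le\; w^{*T} v_{k+1} \;\le\; \|w^*\|_2\, \|v_{k+1}\|_2
    \;\le\; \|w^*\|_2\, \sqrt{k}\, R
  && \Longrightarrow\quad k \le \left(\frac{\|w^*\|_2\, R}{m}\right)^{2} = \gamma^{-2}
\end{aligned}
```

The second step uses the fact that a mistake on $(x_i, y_i)$ means $y_i\, v_k^T x_i \le 0$, and the last step uses the Cauchy-Schwarz inequality.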
Problem of Perceptron
• Mistake bound
• Convergence issue
How to Incorporate Second-order Information?
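• Intuitive idea: Whitened perceptron
– Construct the correlation matrix $M = \sum_t x_t x_t^T$
– Run standard Perceptron on the whitened instances $M^{-1/2} x_t$
• Properties
– Not incremental: all instances must be available in advance (labels stay hidden)
– Transforms the correlation matrix into the identity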
– Mistake bound approaches 2
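The whitening step can be sketched as follows. The code assumes the correlation matrix is the sum of outer products of all instances, `M = X.T @ X`, and that the Perceptron is then run on `M^{-1/2} x_t`; both are assumptions about the construction, and the function names, the `eps` safeguard, and the synthetic data are hypothetical.

```python
import numpy as np

def whiten(X, eps=1e-12):
    """Map each row x_t to M^{-1/2} x_t, where M = sum_t x_t x_t^T."""
    M = X.T @ X                          # needs every instance up front
    evals, evecs = np.linalg.eigh(M)     # M is symmetric positive semidefinite
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, eps))) @ evecs.T
    return X @ inv_sqrt                  # inv_sqrt is symmetric

def perceptron_mistakes(X, y):
    """One online pass of the standard Perceptron; returns the mistake count."""
    w = np.zeros(X.shape[1])
    mistakes = 0
    for x_t, y_t in zip(X, y):
        if (1 if w @ x_t >= 0 else -1) != y_t:
            w += y_t * x_t
            mistakes += 1
    return mistakes

# Synthetic data with one stretched direction (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 1.0])
y = np.where(X @ np.array([0.0, 1.0, 0.0]) >= 0, 1, -1)

Z = whiten(X)
m_raw, m_white = perceptron_mistakes(X, y), perceptron_mistakes(Z, y)
```

After the transformation, `Z.T @ Z` is the identity up to rounding. Computing `M` requires the whole instance sequence beforehand, which is why the method is not incremental.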
Second-order Perceptron: Basic Form
• Algorithm
$X_k$: stores the misclassified instances
Not a linear-threshold predictor
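A minimal sketch of the basic form follows. It assumes the rule described in the cited paper as I understand it: predict with $w_t = (aI + S_t S_t^T)^{-1} v_{k-1}$, where $S_t$ is the matrix of stored instances augmented with the current $x_t$, and on a mistake add $y_t x_t$ to $v$ and keep $x_t$ in the store. The names `a`, `v`, `A`, the tie-breaking convention at zero, and the toy data are illustrative choices.

```python
import numpy as np

def second_order_perceptron(X, y, a=1.0):
    """Online second-order Perceptron (basic form, sketch).

    A accumulates a*I + sum of x x^T over mistaken rounds, which is
    equivalent to storing the misclassified instances X_k.
    """
    d = X.shape[1]
    v = np.zeros(d)       # sum of y_t * x_t over mistaken rounds
    A = a * np.eye(d)     # a*I + X_k X_k^T
    mistakes = 0
    for x_t, y_t in zip(X, y):
        # The current instance enters the matrix before predicting, so the
        # weight vector depends on x_t itself.
        w_t = np.linalg.solve(A + np.outer(x_t, x_t), v)
        y_hat = 1 if w_t @ x_t >= 0 else -1
        if y_hat != y_t:  # mistake: update v and store x_t
            v = v + y_t * x_t
            A = A + np.outer(x_t, x_t)
            mistakes += 1
    return v, A, mistakes

# Hypothetical toy data for illustration.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
v, A, mistakes = second_order_perceptron(X, y, a=1.0)
```

Because `w_t` is recomputed from a matrix that includes the instance being classified, the prediction is not a fixed linear-threshold function of `x_t`. A practical implementation would likely maintain the inverse incrementally rather than solving a linear system every round.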
Analysis
• Theorem
Sketched Proof
Extension-Kernel
Extension-Adaptive Parameter
Extension-Pseudoinverse
Simulations
• Linearly separable Gaussian data with 100 attributes
– Correlation matrix: a dominant eigenvalue, eight times bigger than the others
– Data 1: hyperplane is orthogonal to the eigenvector with the dominant eigenvalue
– Data 2: hyperplane is orthogonal to the eigenvector with the first non-dominant eigenvalue
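One way to generate data with this spectral structure is sketched below. The slides do not give the exact generation procedure, so everything here is an assumption: a zero-mean Gaussian with diagonal covariance (so the eigenvectors are the coordinate axes), the first variance set to eight times the others, and labels given by the sign of the coordinate along the chosen eigenvector. The function name `make_data` and its parameters are hypothetical.

```python
import numpy as np

def make_data(n, d=100, ratio=8.0, orth_to_dominant=True, seed=0):
    """Gaussian instances with one dominant eigenvalue and separable labels."""
    rng = np.random.default_rng(seed)
    variances = np.ones(d)
    variances[0] = ratio                       # dominant eigenvalue along axis 0
    X = rng.normal(size=(n, d)) * np.sqrt(variances)
    # A hyperplane orthogonal to eigenvector e_j has normal vector e_j.
    j = 0 if orth_to_dominant else 1           # Data 1 vs. Data 2
    y = np.where(X[:, j] >= 0, 1, -1)
    return X, y

X1, y1 = make_data(2000, orth_to_dominant=True)   # "Data 1"
X2, y2 = make_data(2000, orth_to_dominant=False)  # "Data 2"
```

The two datasets share the same instances and differ only in which eigenvector defines the labels, which isolates the effect of the spectrum on each algorithm.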
Simulation
• Procedure (randomly repeated 5 times)
– Train two epochs on 9,000 examples
– Test on 3,000 examples
• Results
Conclusions
• Second-order Perceptron algorithm
– Online binary classification exploiting spectral properties
– Proves the best known mistake bound for kernel-based linear threshold classifiers
– Two variants for replacing the inverse of the correlation matrix
Q & A
Lemma