Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear...

28
Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas

Transcript of Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear...

Page 1: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Nonlinear ridge regressionRisk, regularization, and cross-validation

Nando de Freitas

Page 2: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Outline of the lectureThis lecture will teach you how to fit nonlinear functions by usingbases functions and how to control model complexity. The goal is foryou to:

Learn how to derive ridge regression. Understand the trade-off of fitting the data and regularizing it. Learn polynomial regression. Understand that, if basis functions are given, the problem oflearning the parameters is still linear. Learn cross-validation. Understand model complexity and generalization.

Page 3: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Regularization

Page 4: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Derivation

Page 5: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Ridge regression as constrained optimization

Page 6: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Regularization paths

[Hastie, Tibshirani & Friedman book]

Asd increases, t(d) decreases and each qi goes to zero.

q1

q8

q2

q

t(d)

Page 7: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Ridge regression and Maximum a Posteriori(MAP) learning

Page 8: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Ridge regression and Maximum a Posteriori(MAP) learning

Page 9: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Going nonlinear via basis functions

We introduce basis functions Á(¢) to deal with nonlinearity:

y(x) = Á(x)µ + ²

For example, Á(x) = [1; x; x2],

Page 10: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Going nonlinear via basis functions

y(x) = Á(x)µ + ²

Á(x) = [1; x1; x2; x21; x

22]Á(x) = [1; x1; x2]

Page 11: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Effect of data when we have the right model

yi = q0 + xi q1 + xi2 q2 + N ( 0 , s 2 )

Page 12: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Effect of data when the model is too simple

yi = q0 + xi q1 + xi2 q2 + N ( 0 , s 2 )

Page 13: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Effect of data when the model is very complex

yi = q0 + xi q1 + xi2 q2 + N ( 0 , s 2 )

Page 14: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline
Page 15: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Example: Ridge regression with a polynomial of degree 14

y(xi ) = 1 q0 + xi q1 + xi2 q2 + . . . + xi

13 q13 + xi14 q14

F = [ 1 xi xi2 . . . xi

13 xi14 ]

J(q) = ( y - F q ) T ( y - F q ) + d2 q T q

small d large dmedium d

y

x

y y

xx

Page 16: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Kernel regression and RBFsWe can use kernels or radial basis functions (RBFs) as features:

Á(x) = [·(x;¹1; ¸); : : : ; ·(x;¹d; ¸)]; e:g: ·(x;¹i; ¸) = e(¡ 1¸kx¡¹ik

2)

y(xi ) = f (xi ) q = 1q0 + k(xi , m1 , l) q1 + . . . + k(xi , md , l) qd

Page 17: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline
Page 18: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline
Page 19: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Kernel regression in Torch

Page 20: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Kernel regression in Torch

Page 21: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Kernel regression in Torch

Page 22: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

We can choose the locations m of the basis functions to be the inputs.That is, mi = xi . These basis functions are the known as kernels.The choice of width l is tricky, as illustrated below.

Too small l

Right l

Too large l

kernels

Page 23: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

The big question is how do wechoose the regularization coefficient,

the width of the kernels or thepolynomial order?

Page 24: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Simple solution: cross-validation

Page 25: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

K-fold crossvalidation

The idea is simple: we split the training data into K folds; then, for eachfold k 2 f1; : : : ; Kg, we train on all the folds but the k'th, and test on thek'th, in a round-robin fashion.

It is common to use K = 5; this is called 5-fold CV.

If we set K = N , then we get a method called leave-one out crossvalidation, or LOOCV, since in fold i, we train on all the data casesexcept for i, and then test on i.

Page 26: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Example: Ridge regression with polynomial of degree 14

Page 27: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Where cross-validation fails) (K-means)

Page 28: Nonlinear ridge regression Risk, regularization, and cross-validation · 2015-02-05 · Nonlinear ridge regression Risk, regularization, and cross-validation Nando de Freitas. Outline

Next lecture

In the next lecture, we delve into the world of optimization.

Please revise your multivariable calculus and in particular thedefinition of gradient