
Linear Regression (Part II)

nanopoulos@ismll.de

11/3/2008

Outline

• Generalize the linear regression

• Simple multiple regression

• Regularization

• Bias vs. Variance tradeoff

• Model selection


Linear regression in D dimensions

From D = 1

To D > 1 and linear in x (multiple regression)

To D > 1 and any set of nonlinear functions of x

All are linear in w

Linear regression: linear basis function models


Example of basis functions transformation

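To make the transformation concrete, here is a minimal sketch in Python (the polynomial basis and the helper name polynomial_design_matrix are illustrative choices, not taken from the slides): each input x_n is mapped to a feature vector φ(x_n) = (1, x_n, x_n², …), and stacking these rows gives the design matrix Φ.

```python
import numpy as np

def polynomial_design_matrix(x, degree):
    """Map a 1-D input x to the design matrix with
    Phi[n, j] = phi_j(x_n) = x_n**j for j = 0, ..., degree."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=degree + 1, increasing=True)

x = np.array([0.0, 0.5, 1.0])
Phi = polynomial_design_matrix(x, degree=3)  # shape (3, 4): columns 1, x, x^2, x^3
print(Phi)
```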

Maximum likelihood and least squares

Assuming normally distributed noise

The likelihood of observing N target values t from a set X of D-dimensional predictors

Maximizing the log of the likelihood… is the same as minimizing the sum-of-squares error E
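Written out (a reconstruction in the standard Gaussian-noise form; the symbol β for the noise precision and the name E_D for the sum-of-squares error are assumed notation):

```latex
p(\mathbf{t}\mid \mathbf{X},\mathbf{w},\beta)
  = \prod_{n=1}^{N}\mathcal{N}\!\bigl(t_n \mid \mathbf{w}^{\top}\boldsymbol{\phi}(\mathbf{x}_n),\ \beta^{-1}\bigr)

\ln p(\mathbf{t}\mid \mathbf{X},\mathbf{w},\beta)
  = \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \beta\, E_D(\mathbf{w}),
\qquad
E_D(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\bigl(t_n - \mathbf{w}^{\top}\boldsymbol{\phi}(\mathbf{x}_n)\bigr)^2
```

The first two terms do not depend on w, so maximizing the log-likelihood with respect to w is the same as minimizing E_D(w).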


Solving the MLE

Find the optimal w… based on the Moore-Penrose pseudo-inverse
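In the basis-function notation this is the closed form w_ML = (Φ^⊤Φ)^{-1} Φ^⊤ t = Φ^† t, where Φ^† is the Moore-Penrose pseudo-inverse. A minimal numpy sketch (the data are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.shape)

# Design matrix for a 3rd-order polynomial basis: columns 1, x, x^2, x^3
Phi = np.vander(x, N=4, increasing=True)

# w_ML via the Moore-Penrose pseudo-inverse ...
w_ml = np.linalg.pinv(Phi) @ t
# ... or, equivalently and more stably, via a least-squares solver
w_lstsq, *_ = np.linalg.lstsq(Phi, t, rcond=None)
```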

Outline

• Generalize the linear regression

• Simple multiple regression

• Regularization

• Bias vs. Variance tradeoff

• Model selection


The case of simple multiple regression

$Y = w_0 + w_1 X_1 + \dots + w_D X_D = w_0 + \sum_{i=1}^{D} w_i X_i$


The case of simple multiple regression (cont.)


Least squares estimate

Minimizing the sum of squared errors with respect to $\mathbf{w}$ gives the normal equations, with solution

$\hat{\mathbf{w}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}$

where $\mathbf{X}$ contains a leading column of ones so that $w_0$ is the intercept.


Example


Solution


Solution (cont.)


What does it look like?
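As a stand-in for a worked example, here is a minimal sketch (the numbers are made up purely for illustration): fit Y = w0 + w1·X1 + w2·X2 by prepending a column of ones to X and applying the least-squares estimate above.

```python
import numpy as np

# Made-up data: 5 observations of D = 2 predictors and a target
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([3.1, 3.9, 8.2, 9.1, 11.8])

# Column of ones so that w_hat[0] plays the role of the intercept w0
X1 = np.column_stack([np.ones(len(X)), X])

# Normal equations: w_hat = (X^T X)^{-1} X^T y
w_hat = np.linalg.solve(X1.T @ X1, X1.T @ y)

y_fit = X1 @ w_hat  # fitted values; comparing y_fit with y shows the quality of the fit
```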


Outline

• Generalize the linear regression

• Simple multiple regression

• Regularization

• Bias vs. Variance tradeoff

• Model selection


Regularization

Recall the case of

• a single predictor

• and polynomial basis function φ_j(x) = x^j


1st Order Polynomial


3rd Order Polynomial


9th Order Polynomial


Over‐fitting

Root‐Mean‐Square (RMS) Error:

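The RMS error is commonly defined as E_RMS = √(2E(w*)/N), i.e. the root of the mean squared residual, which makes errors comparable across data sets of different size. A minimal sketch of how overfitting shows up (the data, the held-out set, and the polynomial orders are illustrative assumptions): as the order M grows, the RMS error on the fitted points keeps falling while the error on fresh points eventually rises.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    # Illustrative data: noisy samples of a smooth curve
    x = np.linspace(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)

x_train, t_train = make_data(10)
x_test, t_test = make_data(100)

def rms_error(coeffs, x, t):
    return np.sqrt(np.mean((np.polyval(coeffs, x) - t) ** 2))

for M in (1, 3, 9):
    coeffs = np.polyfit(x_train, t_train, deg=M)  # unregularized least-squares fit
    print(M, rms_error(coeffs, x_train, t_train), rms_error(coeffs, x_test, t_test))
```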


Polynomial Coefficients   


Data Set Size: 

9th Order Polynomial



Data Set Size: 

9th Order Polynomial


Regularization

Penalize large coefficient values

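One standard way to penalize large coefficient values is a quadratic penalty (ridge regression / weight decay): minimize E(w) + (λ/2)‖w‖², which still has a closed-form solution, w = (λI + Φ^⊤Φ)^{-1} Φ^⊤ t. A minimal sketch under that assumption (data and names are illustrative):

```python
import numpy as np

def ridge_fit(Phi, t, lam):
    """Minimize 0.5*||t - Phi @ w||^2 + 0.5*lam*||w||^2 in closed form:
    w = (lam*I + Phi^T Phi)^{-1} Phi^T t."""
    m = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(m) + Phi.T @ Phi, Phi.T @ t)

# Illustrative 9th-order polynomial fit to 10 noisy points
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.shape)
Phi = np.vander(x, N=10, increasing=True)

w_small_lam = ridge_fit(Phi, t, lam=1e-8)     # almost unregularized: coefficients grow large
w_reg = ridge_fit(Phi, t, lam=np.exp(-5.0))   # the penalty shrinks the coefficients
```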


Regularization: 


Regularization: 



Regularization: … vs. …


Polynomial Coefficients   


Outline

• Generalize the linear regression

• Simple multiple regression

• Regularization

• Bias vs. Variance tradeoff

• Model selection


Bias vs. Variance

expected loss = (bias)² + variance + noise
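Written out in its standard form (the notation is assumed: ŷ(x; D) is the prediction of a model trained on data set D, and h(x) the optimal regression function):

```latex
\mathbb{E}\bigl[(t - \hat{y}(x;\mathcal{D}))^2\bigr]
  = \underbrace{\bigl(\mathbb{E}_{\mathcal{D}}[\hat{y}(x;\mathcal{D})] - h(x)\bigr)^2}_{\text{(bias)}^2}
  + \underbrace{\mathbb{E}_{\mathcal{D}}\bigl[\bigl(\hat{y}(x;\mathcal{D}) - \mathbb{E}_{\mathcal{D}}[\hat{y}(x;\mathcal{D})]\bigr)^2\bigr]}_{\text{variance}}
  + \underbrace{\mathbb{E}\bigl[(t - h(x))^2\bigr]}_{\text{noise}}
```

More flexible models tend to have lower bias but higher variance; simpler or more heavily regularized models the opposite.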


How does this apply to regression?


How does this apply to regression? (cont.)

Outline

• Generalize the linear regression

• Simple multiple regression

• Regularization

• Bias vs. Variance tradeoff

• Model selection


Model Selection

M (the degree of the polynomial) corresponds to the complexity of the model

The larger M, the greater the risk of overfitting

Regularization (λ) helps by controlling the complexity

But how do we find good combinations of M and λ?

Information criteria

p(D|w_ML) is the likelihood; M is the number of parameters

• AIC (Akaike Information Criterion)

maximize ln p(D|w_ML) - M

Tends to favor overly simple models
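A minimal sketch of applying AIC to choose the polynomial order, under the assumption of Gaussian noise so that the maximized log-likelihood can be computed from the mean squared residual (the data and the helper name aic_score are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 30)
t = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.shape)

def aic_score(M):
    """ln p(D | w_ML) - (number of parameters) for a degree-M polynomial,
    assuming Gaussian noise with its variance estimated by maximum likelihood."""
    w = np.polyfit(x, t, deg=M)
    resid = t - np.polyval(w, x)
    n = len(t)
    sigma2 = np.mean(resid ** 2)                       # ML estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return log_lik - (M + 1)                           # M+1 free weights w_0, ..., w_M

best_M = max(range(1, 10), key=aic_score)
print(best_M)
```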