Linear Methods for Classification 20.04.2015: Presentation for MA seminar in statistics Eli Dahan.


Page 1

Linear Methods for Classification

20.04.2015

Presentation for MA seminar in statistics

Eli Dahan

Page 2

Outline

• Introduction – problem and solution
• LDA – Linear Discriminant Analysis
• LR – Logistic Regression (and Linear Regression)
• LDA vs. LR
• In a word – separating hyperplanes

Page 3

Introduction - the problem

[Figure: an observation X to be assigned either to group k or to group l]

*We can think of G as “group label”

Posterior probability:

p_j = P(G = j | X = x)

Page 4

Introduction - the solution

Linear decision boundary: the set of x where p_k = p_l

• p_k > p_l: choose group k
• p_l > p_k: choose group l

Page 5

Linear Discriminant Analysis

Let P(G = k) = π_k and P(X = x | G = k) = f_k(x). Then by Bayes' rule:

P(G = k | X = x) = f_k(x) π_k / Σ_{l=1}^{K} f_l(x) π_l

Decision boundary between classes k and l: the set of x where P(G = k | X = x) = P(G = l | X = x).

Page 6

Linear Discriminant Analysis

Assuming f_k(x) ~ N(μ_k, Σ_k) with a common covariance matrix Σ_1 = Σ_2 = … = Σ_K = Σ, we get a decision boundary that is linear in x.

For non-common covariances Σ_k we get QDA (quadratic boundaries); RDA regularizes between the two.
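To see where the linearity comes from, here is a sketch of the standard log-ratio computation under the equal-covariance Gaussian assumption (notation as above: priors π_k, class means μ_k, common covariance Σ):

```latex
\log\frac{P(G = k \mid X = x)}{P(G = \ell \mid X = x)}
  = \log\frac{f_k(x)}{f_\ell(x)} + \log\frac{\pi_k}{\pi_\ell}
  = \log\frac{\pi_k}{\pi_\ell}
    - \tfrac{1}{2}(\mu_k + \mu_\ell)^T \Sigma^{-1} (\mu_k - \mu_\ell)
    + x^T \Sigma^{-1} (\mu_k - \mu_\ell)
```

The quadratic terms x^T Σ^{-1} x cancel because Σ is shared across classes, leaving an expression linear in x. With unequal Σ_k they do not cancel, which is exactly the QDA case.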

Page 7

Linear Discriminant Analysis

Using empirical (plug-in) estimates: π̂_k = N_k / N, μ̂_k the sample mean of class k, and Σ̂ the pooled within-class covariance.

A top classifier in the STATLOG study (Michie et al., 1994) – the data often supports linear boundaries, and the estimates are stable.
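As an illustration, here is a minimal NumPy sketch of these plug-in estimates and the resulting linear discriminant scores δ_k(x) = x'Σ̂⁻¹μ̂_k − ½ μ̂_k'Σ̂⁻¹μ̂_k + log π̂_k (function names are mine, not from the presentation):

```python
import numpy as np

def lda_fit(X, y):
    """Plug-in LDA estimates: class priors, class means, pooled covariance."""
    classes = np.unique(y)
    N, p = X.shape
    priors = np.array([np.mean(y == k) for k in classes])
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    # Pooled within-class covariance, normalized by N - K
    Sigma = sum(
        (X[y == k] - means[i]).T @ (X[y == k] - means[i])
        for i, k in enumerate(classes)
    ) / (N - len(classes))
    return classes, priors, means, Sigma

def lda_predict(X, classes, priors, means, Sigma):
    """Classify each row of X to the class with the largest discriminant score."""
    Sinv = np.linalg.inv(Sigma)
    linear = X @ Sinv @ means.T                                  # (N, K)
    quad = 0.5 * np.einsum('kp,pq,kq->k', means, Sinv, means)    # (K,)
    scores = linear - quad + np.log(priors)
    return classes[np.argmax(scores, axis=1)]
```

Because all classes share Σ̂, the score differences are linear in x, matching the boundary described above.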

Page 8

Logistic Regression

Models the posterior probabilities of the K classes; they sum to one and remain in [0, 1]:

log( P(G = k | X = x) / P(G = K | X = x) ) = β_k0 + β_k' x,  k = 1, …, K−1

• Linear decision boundary: the set of x where the log-odds between two classes equal zero.
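A small sketch of how these posteriors are computed in the multinomial logit parameterization (class K as the reference; the coefficient values in the usage below are made up for illustration):

```python
import numpy as np

def logit_posteriors(x, betas):
    """Posterior probabilities P(G = k | X = x) under the logit model.

    betas: (K-1, p+1) array of coefficients (intercept first);
    the K-th class is the reference, with its linear term fixed at 0.
    """
    x1 = np.append(1.0, x)             # prepend the intercept term
    eta = np.append(betas @ x1, 0.0)   # K linear predictors; reference gets 0
    expd = np.exp(eta - eta.max())     # subtract max for numerical stability
    return expd / expd.sum()           # positive, and they sum to one
```

By construction the K probabilities are positive and sum to one, which is exactly the property the slide highlights.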

Page 9

Logistic Regression

Model fit: the parameters are estimated by maximizing the conditional likelihood of G given X.

• To maximize the log-likelihood, the Newton–Raphson algorithm is used (equivalently, iteratively reweighted least squares, IRLS).
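A minimal sketch of that Newton–Raphson fit for the two-class case (this is the IRLS update; names and data are illustrative):

```python
import numpy as np

def logistic_irls(X, y, n_iter=25):
    """Two-class logistic regression via Newton-Raphson (IRLS).

    X: (N, p) design matrix, first column of ones for the intercept;
    y: (N,) labels in {0, 1}.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # current fitted probabilities
        W = p * (1.0 - p)                    # diagonal of the weight matrix
        # Newton step: beta <- beta + (X' W X)^{-1} X' (y - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta
```

Note that with perfectly separable classes the likelihood has no finite maximizer and the iterations diverge; the sketch assumes overlapping classes.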

Page 10

Linear Regression

Recall the usual assumptions of multivariate linear regression:

• Lack of multicollinearity, etc.
• Here: given N instances (an N×p observation matrix X), Y is an N×K indicator response matrix (K classes).
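A sketch of this indicator-response construction: regress Y on X by least squares, then classify each x to the class with the largest fitted value (helper names are mine):

```python
import numpy as np

def fit_indicator_regression(X, y, K):
    """Least-squares fit of an N x K indicator response matrix Y on X."""
    N = X.shape[0]
    Y = np.zeros((N, K))
    Y[np.arange(N), y] = 1.0                  # one indicator column per class
    X1 = np.column_stack([np.ones(N), X])     # add an intercept column
    Bhat, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return Bhat

def predict_indicator(X, Bhat):
    """Classify each row of X to the class with the largest fitted value."""
    X1 = np.column_stack([np.ones(X.shape[0]), X])
    return np.argmax(X1 @ Bhat, axis=1)
```

Because an intercept is included and the rows of Y sum to one, the fitted values also sum to one across classes for every x (though, unlike the logistic posteriors, they need not lie in [0, 1]).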

Page 11

Linear Regression

Page 12

Linear Regression

Page 13

LDA Vs. LR

Similar results; LDA slightly better (56% vs. 67% error rate for LR).

One might presume the two methods are identical, since both produce decision boundaries that are linear in x (look back at the boundary formulas) – but they are not.

Page 14

LDA Vs. LR

LDA: parameters are fit by maximizing the full log-likelihood based on the joint density P(X, G), which assumes Gaussian class densities (Efron 1975: in the worst case, ignoring the Gaussianity assumption costs about a 30% reduction in efficiency).

Linearity is derived.

LR: leaves the marginal P(X) arbitrary (an advantage in model selection and in the ability to absorb extreme X values), and fits the parameters of P(G | X) by maximizing the conditional likelihood.

Linearity is assumed.

Page 15

In a word – separating hyperplanes