Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012


Page 1: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Linear Probability and Logit Models (Qualitative Response Regression Models)

SA Quimbo, September 2012

Page 2: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Dummy dependent variables

Dependent variable is qualitative in nature

For example, dependent variable takes only two possible values, 0 or 1

Examples: labor force participation, insurance decision, voter’s choice, school enrollment decision, union membership, home ownership

Predicted dependent variable ~ estimated probability

Page 3: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Dummy dependent (2)

Discrete choice models

Agent chooses among discrete choices: {commute, walk}

Utility maximizing choice is that which solves: Max [U(commute), U(walk)]

Utility levels are not observed, but choices are

Use a dummy variable for actual choice

Estimate a demand function for public transportation where Y = 1 if individual chose to commute, = 0 otherwise

Page 4: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 28-4

Binary Dependent Variables (cont.)

• Suppose we were to predict whether NFL football teams win individual games, using the reported point spread from sports gambling authorities.

• For example, if the Packers have a spread of 6 against the Dolphins, the gambling authorities expect the Packers to lose by no more than 6 points.

Page 5: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables (cont.)

• Using the techniques we have developed so far, we might regress

$$D_i^{Win} = \beta_0 + \beta_1 Spread_i + u_i$$

where $i$ indexes games

• How would we interpret the coefficients and predicted values from such a model?

Page 6: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables (cont.)

• $D_i^{Win}$ is either 0 or 1. It does not make sense to say that a 1 point increase in the spread increases $D_i^{Win}$ by $\beta_1$. $D_i^{Win}$ can change only from 0 to 1 or from 1 to 0.

• Instead of predicting $D_i^{Win}$ itself, we predict the probability that $D_i^{Win} = 1$.

$$D_i^{Win} = \beta_0 + \beta_1 Spread_i + u_i$$

Page 7: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables (cont.)

• It can make sense to say that a 1 point increase in the spread increases the probability of winning by $\beta_1$.

• Our predicted values of $D_i^{Win}$ are the probability of winning.

$$D_i^{Win} = \beta_0 + \beta_1 Spread_i + u_i$$

Page 8: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Linear Probability Model (LPM)

Consider the following model:

Yi = β1 + β2Xi + ui

where i indexes families
Yi = 1 if family owns a house, = 0 otherwise
Xi = family income

The dichotomous variable Yi is a linear function of Xi

Page 9: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

The predicted values of Yi can be interpreted as the estimated probability of owning a house, conditional on income

i.e.,

E(Yi|Xi) = Pr(Yi=1|Xi)

LPM (2)

Page 10: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Let Pi = probability that Yi=1 Probability that Yi=0 is 1-Pi

E(Yi)? E(Yi) = (1)(Pi) + (0) (1-Pi) = Pi

Yi = β1 + β2Xi + ui Linear Probability Model

LPM (3)

Page 11: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Assuming E(ui) = 0

Then E(Yi|Xi) = β1 + β2Xi

Or Pi = β1 + β2Xi

Where 0 ≤ Pi ≤ 1

LPM (4)
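A minimal sketch of the LPM fitted by OLS, using only the Python standard library. The data are simulated under a hypothetical ownership rule (the numbers -0.9 and 0.1 are illustrative assumptions, not estimates from the slides); the point is that the fitted values are read as probabilities but are not constrained to lie between 0 and 1.

```python
import random

def ols_simple(x, y):
    """Least-squares intercept and slope for y = b1 + b2*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2

# Hypothetical data: ownership probability rises with income, capped to [0, 1].
random.seed(0)
income = [random.uniform(5, 25) for _ in range(200)]
owns = [1 if random.random() < min(1.0, max(0.0, -0.9 + 0.1 * xi)) else 0
        for xi in income]

b1, b2 = ols_simple(income, owns)
fitted = [b1 + b2 * xi for xi in income]   # estimated Pr(Y = 1 | X)
outside = sum(1 for p in fitted if p < 0 or p > 1)
print(f"b1={b1:.3f}, b2={b2:.3f}, fitted values outside [0, 1]: {outside}")
```

The count of fitted values outside the unit interval is what motivates the bounded alternatives discussed later in the deck.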

Page 12: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Problems in Estimating the LPM

Non-normality of disturbances:

Yi = β1 + β2Xi + ui

ui = Yi – β1 – β2Xi

If Yi = 1: ui = 1 – β1 – β2Xi

If Yi = 0: ui = – β1 – β2Xi

* ui takes only two values, so the ui’s follow a two-point (Bernoulli-type) distribution, not a normal one

-> OLS point estimates remain unbiased
-> as the sample size increases, the OLS estimators tend to be normally distributed

Page 13: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Heteroskedastic Disturbances

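Problems (2)

$$\operatorname{var}(u_i) = E[u_i - E(u_i)]^2 = E(u_i^2)$$

$$= (1 - \beta_1 - \beta_2 X_i)^2 P_i + (-\beta_1 - \beta_2 X_i)^2 (1 - P_i)$$

$$= (1 - \beta_1 - \beta_2 X_i)^2 (\beta_1 + \beta_2 X_i) + (-\beta_1 - \beta_2 X_i)^2 (1 - \beta_1 - \beta_2 X_i)$$

$$= (\beta_1 + \beta_2 X_i)(1 - \beta_1 - \beta_2 X_i)$$

$$= P_i (1 - P_i)$$

* var(ui) will vary with Xi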
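The final simplification in the variance derivation is stated without its intermediate step. A short worked version, writing Pi = β1 + β2Xi and using only the definitions already given on this slide:

```latex
\begin{aligned}
\operatorname{var}(u_i) &= (1 - P_i)^2 P_i + (-P_i)^2 (1 - P_i) \\
                        &= P_i (1 - P_i)\,\bigl[(1 - P_i) + P_i\bigr] \\
                        &= P_i (1 - P_i)
\end{aligned}
```

For instance, the variance is 0.25 at Pi = 0.5 but only 0.09 at Pi = 0.9, so it changes as Xi moves Pi.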

Page 14: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

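Problems (3)

Transform the model in such a way that the transformed disturbances are not heteroskedastic:

Let wi = Pi (1-Pi)

$$\frac{Y_i}{\sqrt{w_i}} = \frac{\beta_1}{\sqrt{w_i}} + \beta_2 \frac{X_i}{\sqrt{w_i}} + \frac{u_i}{\sqrt{w_i}}$$

$$\operatorname{var}\left(\frac{u_i}{\sqrt{w_i}}\right) = \frac{1}{w_i} \operatorname{var}(u_i) = \frac{1}{w_i}\, w_i = 1$$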

Page 15: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

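Problems (4)

R2 may not be a good measure of model fit

[Figure: scatter of home ownership (vertical axis) against income (horizontal axis) with the fitted SRF; the observations lie in two horizontal bands]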

Page 16: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Assumed bounds (0≤ E(Yi|Xi) ≤1) could be violated

Example, Gujarati (see next slide): six estimated values are negative and six values are in excess of one

Problems (4)

Page 17: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Example

Hypothetical data on home ownership

Gujarati, p.588

Source   |       SS   df        MS        Number of obs =      40
Model    | 8.027495    1  8.027495        F(1, 38)      =  156.63
Residual | 1.947505   38  0.051251        Prob > F      =       0
Total    |    9.975   39  0.255769        R-squared     =  0.8048
                                          Adj R-squared =  0.7996
                                          Root MSE      = 0.22638

y     |     Coef.  Std. Err.      t   P>t   [95% Conf. Interval]
x     |  0.102131   0.008161  12.52     0    0.085611   0.118651
_cons | -0.945686   0.122842   -7.7     0   -1.194366  -0.697007

Page 18: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

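Linear vs. Non-linear Probability Models

[Figure: P (vertical axis, from 0 to 1) against X. The SRF from the LPM example (constant = -0.94, slope = 0.10) is a straight line; the alternative curve is the CDF of a logistically or normally distributed random variable.]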

Page 19: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables (cont.)

• We need a procedure to translate our linear regression results into true probabilities.

• We need a function that takes a value from -∞ to +∞ and returns a value from 0 to 1.

Page 20: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables (cont.)

• We want a translator such that:

- The closer to -∞ is the value from our linear regression model, the closer to 0 is our predicted probability.

- The closer to +∞ is the value from our linear regression model, the closer to 1 is our predicted probability.

- No predicted probabilities are less than 0 or greater than 1.

Page 21: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Figure 19.2 A Graph of Probability of Success and X

Page 22: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Binary Dependent Variables

• How can we construct such a translator?

• How can we estimate it?

Page 23: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Models (Chapter 19.2)

• In common practice, econometricians use TWO such “translators”: probit and logit

• The differences between the two models are subtle.

• For present purposes there is no practical difference between the two models.

Page 24: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Models

• Both the Probit and Logit models have the same basic structure.

1. Estimate a latent variable Z using a linear model. Z ranges from negative infinity to positive infinity.

2. Use a non-linear function to transform Z into a predicted Y value between 0 and 1.
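The two-step structure can be sketched with the two usual choices for the non-linear function: the logistic CDF (logit) and the standard normal CDF (probit). The function names below are illustrative, not part of any package.

```python
import math

def logistic_cdf(z):
    """Logit translator: maps any real z into a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)            # avoids overflow for very negative z
    return ez / (1.0 + ez)

def normal_cdf(z):
    """Probit translator: standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-10, -2, 0, 2, 10):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions are increasing in Z, approach 0 as Z goes to negative infinity, and approach 1 as Z goes to positive infinity.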

Page 25: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Model (cont.)

• Suppose there is some unobserved continuous variable Z that can take on values from negative infinity to infinity.

• The higher E(Z) is, the more probable it is that a team will win, or a student will graduate, or a consumer will purchase a particular brand.

Page 26: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Model (cont.)

• We call an unobserved variable, Z, that we use for intermediate calculations, a latent variable.

Page 27: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Deriving Probit/Logit (cont.)

We assume $Y_i$ acts "as if" determined by latent variable $Z_i$.

$$Z_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_K X_{Ki} + u_i$$

$$Y_i = 1 \text{ if } Z_i \ge 0$$

$$Y_i = 0 \text{ if } Z_i < 0$$

Page 28: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Deriving Probit/Logit (cont.)

• Note: the assumption that the breakpoint falls at 0 is arbitrary.

• $\beta_0$ can adjust for whichever breakpoint you might choose to set.

Page 29: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Deriving Probit/Logit (cont.)

• We assume we know the distribution of ui.

• In the probit model, we assume ui is distributed by the standard normal.

• In the logit model, we assume ui is distributed by the logistic.
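A small simulation sketch of the latent-variable story under the logit assumption. The coefficients and the value of X are hypothetical, and the breakpoint is taken to be 0 with Y = 1 when Z is non-negative. The share of simulated Y = 1 outcomes should be close to the logistic CDF evaluated at the linear index.

```python
import math
import random

rng = random.Random(1)
beta0, beta1 = -1.0, 0.5     # hypothetical coefficients
x = 3.0                      # one fixed value of the explanator

def draw_logistic(rng):
    """Inverse-CDF draw from the standard logistic distribution."""
    p = rng.random()
    while p == 0.0:          # guard against log(0)
        p = rng.random()
    return math.log(p / (1.0 - p))

n = 100_000
# Y = 1 whenever the latent Z = beta0 + beta1*x + u is at or above 0.
ones = sum(1 for _ in range(n) if beta0 + beta1 * x + draw_logistic(rng) >= 0)
share = ones / n
implied = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
print(f"simulated share of Y=1: {share:.4f}, logistic CDF at index: {implied:.4f}")
```

Replacing `draw_logistic` with a standard normal draw and the logistic CDF with the normal CDF would give the probit counterpart.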

Page 30: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

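Probit model (one explanatory variable: $Z_i = \beta_1 + \beta_2 X_i + u_i$)

$$\Pr(Y_i = 1) = \Pr(Z_i \ge 0) = P\bigl(u_i \ge -(\beta_1 + \beta_2 X_i)\bigr) = 1 - F\bigl(-(\beta_1 + \beta_2 X_i)\bigr) = F(\beta_1 + \beta_2 X_i)$$

where

$$F(\beta_1 + \beta_2 X_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\beta_1 + \beta_2 X_i} e^{-t^2/2}\, dt \qquad (13)$$

Here t is a standardised normal variable, $t \sim N(0, 1)$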

Page 31: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

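Probit model

Hence

$$\Pr(Y_i = 0) = \Pr(Z_i < 0) = P\bigl(u_i < -(\beta_1 + \beta_2 X_i)\bigr) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-(\beta_1 + \beta_2 X_i)} e^{-t^2/2}\, dt = 1 - F(\beta_1 + \beta_2 X_i) \qquad (13)$$

Here t is a standardised normal variable, $t \sim N(0, 1)$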

Page 32: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Steps for a probit model

1. Probability depends upon an unobserved utility index $Z_i$, which depends upon observable variables such as income. There is a threshold of this index, $Z_i^*$, after which a family starts owning a house: $Z_i \ge Z_i^*$.

$$P_i = F(Z_i)$$

[Figure: $P_i$ (vertical axis, from 0 to 1) against $I_i$ (horizontal axis)]

Page 33: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Estimating a Probit/Logit Model (Chapter 19.2)

• In practice, how do we implement a probit or logit model?

• Either model is estimated using a statistical method called the method of maximum likelihood.


Page 35: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Alternative Estimation Method

Ungrouped/ individual data

Maximum Likelihood Estimation

Choose the values of the unknown parameters (β1,β2) such that the probability of observing the given Ys is the highest possible

Page 36: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

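MLE

Recall:

P’s are not observed but Y’s are; Pr(Y=1)=Pi

Pr(Y=0)=1-Pi

$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}} = \frac{1}{1 + e^{-Z_i}}$$

Joint probability of observing n Y values:

$$f(Y_1, \dots, Y_n) = \prod_{i=1}^{n} P_i^{Y_i} (1 - P_i)^{1 - Y_i}$$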

Page 37: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

MLE (Gujarati, page 634)

Danao (2013), page 485: “Under standard regularity conditions, maximum likelihood estimators are consistent, asymptotically normal, and asymptotically efficient. In other words, in large samples, maximum likelihood estimators are consistent, normal, and best.”

Page 38: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

MLE (2)

Taking its natural logarithm, the log likelihood function is obtained:

ln f(Y1,…,Yn) = Σi=1,…,n { Yi(β1 + β2Xi) − ln[1 + exp(β1 + β2Xi)] }

Max log likelihood function by choosing (β1,β2)
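A minimal sketch of how the log likelihood can be maximized numerically for the one-regressor logit. It uses Newton-Raphson steps, which is one common choice and not necessarily what a given package does; the simulated data and the "true" values -0.5 and 1.0 are assumptions made only for the illustration.

```python
import math
import random

def log_likelihood(b1, b2, x, y):
    """Sum over i of Y_i*(b1 + b2*X_i) - ln(1 + exp(b1 + b2*X_i))."""
    return sum(yi * (b1 + b2 * xi) - math.log1p(math.exp(b1 + b2 * xi))
               for xi, yi in zip(x, y))

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for the two-parameter logit; returns (b1, b2, history)."""
    b1 = b2 = 0.0
    history = []
    for _ in range(iterations):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b1 + b2 * xi)))
            g1 += yi - p                 # score with respect to b1
            g2 += (yi - p) * xi          # score with respect to b2
            w = p * (1.0 - p)
            h11 += w; h12 += w * xi; h22 += w * xi * xi   # negative Hessian
        det = h11 * h22 - h12 * h12
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
        history.append(log_likelihood(b1, b2, x, y))
    return b1, b2, history

# Simulated data (hypothetical true values: b1 = -0.5, b2 = 1.0).
rng = random.Random(3)
x = [rng.uniform(-3, 3) for _ in range(2000)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * xi))) else 0
     for xi in x]

b1_hat, b2_hat, history = fit_logit(x, y)
print(f"b1={b1_hat:.3f}, b2={b2_hat:.3f}, log likelihood={history[-1]:.3f}")
```

The `history` list plays the role of an iteration log: the log likelihood rises and then settles once the maximum is reached.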

Page 39: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Example

Individual data

Iteration 0: log likelihood = -27.27418
Iteration 1: log likelihood = -16.41239
Iteration 2: log likelihood = -15.49205
Iteration 3: log likelihood = -15.43501
Iteration 4: log likelihood = -15.43463

Logit estimates                   Number of obs = 40
                                  LR chi2(1)    = 23.68
                                  Prob > chi2   = 0
Log likelihood = -15.43463        Pseudo R2     = 0.4341

y     |     Coef.  Std. Err.      z    P>z   [95% Conf. Interval]
x     |  0.494246   0.139327   3.55      0   0.2211705   0.767322
_cons | -6.582815   1.951325  -3.37  0.001   -10.40734  -2.758289

Page 40: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Interpreting the results

Iterative procedure to get at the maximum of the log likelihood function

Use Z (standard normal variable) instead of t

Pseudo R2 – more meaningful alternative to R2; or, use the count R2

LR statistic plays the same role as the F ratio computed in testing the overall significance of the model

Estimated slope coefficient measures the estimated change in the logit for a unit change in X

Predicted probability (at the mean income) of owning a home is 0.63

Or, at the mean income, every unit increase in income raises the probability of owning a home by about 11 percentage points (0.494 × 0.63 × (1 − 0.63) ≈ 0.115); the odds of owning a home are multiplied by exp(0.494) ≈ 1.64
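The summary statistics in the logit output can be cross-checked from the reported log likelihoods and slope. This sketch only re-uses numbers printed in the example (the two log likelihoods, the slope, and the predicted probability of 0.63); the variable names are illustrative.

```python
import math

ll_null = -27.27418    # iteration 0 log likelihood
ll_full = -15.43463    # final log likelihood
slope = 0.494246       # reported coefficient on x
p_mean = 0.63          # reported predicted probability at the mean income

lr_stat = 2 * (ll_full - ll_null)             # likelihood-ratio statistic
pseudo_r2 = 1 - ll_full / ll_null             # pseudo R2 as 1 - lnL/lnL0
odds_factor = math.exp(slope)                 # multiplicative change in the odds
marginal = slope * p_mean * (1 - p_mean)      # dP/dX = b2 * P * (1 - P)

print(f"LR = {lr_stat:.2f}, pseudo R2 = {pseudo_r2:.4f}")
print(f"odds factor = {odds_factor:.3f}, marginal effect at mean = {marginal:.3f}")
```

The first two values reproduce the reported LR chi2 and Pseudo R2; the last two separate the effect on the odds from the effect on the probability.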

Page 41: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Pseudo R2

Danao, page 487, citing Gujarati (2008): “In binary regressand models, goodness of fit is of secondary importance. What matters are the expected signs of the regression coefficients and their statistical and practical significance”

Page 42: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Logit Model

Consider the home ownership model:

Yi = β1 + β2Xi + ui

where i indexes families
Yi = 1 if family owns a house, = 0 otherwise
Xi = family income

Page 43: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Logistic Probability Distribution

PDF: f(x) = exp(x)/[1+exp(x)]^2

CDF: F(a) = exp(a)/[1+exp(a)]

- Symmetric, unimodal distribution
- Looks a lot like the normal
- Incredibly easy to evaluate the CDF and PDF
- Mean of zero, variance > 1 (more variance than the standard normal)

Page 44: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

The Logistic Distribution Function

Assume that owning a home is a random event, and the probability that this random event occurs is given by:

$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}} = \frac{1}{1 + e^{-Z_i}}$$

where Zi = β1 + β2Xi

(i) 0≤Pi≤1 and (ii) Pi is a nonlinear function of Zi

OLS is not appropriate

Page 45: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

The Odds Ratio (1)

$$P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$

$$1 - P_i = 1 - \frac{e^{Z_i}}{1 + e^{Z_i}} = \frac{1}{1 + e^{Z_i}}$$

$$\frac{P_i}{1 - P_i} = \frac{e^{Z_i} / (1 + e^{Z_i})}{1 / (1 + e^{Z_i})} = e^{Z_i}$$

Page 46: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

If the probability of owning a home is 10 percent, then the odds ratio is .10/(1-0.10),

or the odds are 1 to 9 in favor of owning a home

The Odds Ratio (2)

Page 47: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Logit

ln (Pi/(1-Pi)) = ln (e^Zi) = Zi

ln (Pi/(1-Pi)) = β1 + β2Xi

- The log of the odds ratio is a linear function of X and the parameters β1 and β2

- Pi ∈ [0,1] but ln (Pi/(1-Pi)) ∈ (-∞, ∞)

- Li = ln (Pi/(1-Pi)), Li ~ “logit”

Page 48: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

The Logit Model

Li = ln (Pi/(1-Pi)) = β1 + β2Xi + ui

- Although P ∈ [0,1], logits are unbounded.
- Logits are linear in X but P is not linear in X
- L<0 if the odds ratio<1 and L>0 if the odds ratio>1
- β2 measures the change in L (“log-odds”) as X changes by one unit
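The relationships among probability, odds, and logit can be checked with three small helper functions. The names are illustrative; the 10 percent example is the one used on the odds-ratio slide.

```python
import math

def odds(p):
    """Odds in favor of the event: P / (1 - P)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds L = ln(P / (1 - P)); unbounded even though 0 < P < 1."""
    return math.log(odds(p))

def inv_logit(L):
    """Back from the logit to the probability: P = 1 / (1 + exp(-L))."""
    return 1.0 / (1.0 + math.exp(-L))

print(odds(0.10))                 # about 0.111, i.e. odds of 1 to 9
print(logit(0.10), logit(0.90))   # negative when odds < 1, positive when odds > 1
print(inv_logit(logit(0.30)))     # recovers 0.30
```

Note that the logit is undefined at exactly P = 0 or P = 1, which is the problem with individual-level data raised on the next slide.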

Page 49: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Estimating the Logit Model

Problem with individual households/units: ln(1/0) and ln(0/1) are undefined

Solution: Estimate Pi/(1-Pi) from the data, where Pi=relative frequency = ni/Ni

Ni = number of families for a specific level of Xi (say, income)
ni = number of families owning a home

Page 50: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Example using Grouped Data

x     tothh   numhomeown
6       40        8
8       50       12
10      60       18
13      80       28
15     100       45
20      70       36
25      65       39
30      50       33
35      40       30
40      25       20

Estimate the home ownership model using grouped data and OLS:

Yi = β1 + β2Xi + ui

Page 51: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Estimating (2)

Problem: heteroskedastic disturbances

If the proportion of families owning a home follows a binomial distribution, then

$$u_i \sim N\left(0, \frac{1}{N_i P_i (1 - P_i)}\right)$$

Page 52: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

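Estimating (3)

Solution: Transform the model such that the new disturbance term is homoscedastic

Consider: wi = NiPi(1-Pi)

$$\sqrt{w_i}\, L_i = \beta_1 \sqrt{w_i} + \beta_2 \sqrt{w_i}\, X_i + \sqrt{w_i}\, u_i$$

$$\operatorname{var}(\sqrt{w_i}\, u_i) = w_i \operatorname{var}(u_i) = w_i \cdot \frac{1}{w_i} = 1$$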

Page 53: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

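Estimating (4)

Estimate the following by OLS:

$$\sqrt{w_i}\, L_i = \beta_1 \sqrt{w_i} + \beta_2 \sqrt{w_i}\, X_i + \sqrt{w_i}\, u_i$$

where

$$\hat{w}_i = N_i \hat{P}_i (1 - \hat{P}_i), \qquad \hat{P}_i = \frac{n_i}{N_i}, \qquad \hat{L}_i = \log \frac{\hat{P}_i}{1 - \hat{P}_i}$$

Note: regression model has two regressors and no constant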

Page 54: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Example (2)

STATA results:

Source   |       SS   df        MS        Number of obs =      10
Model    | 63.39469    2   31.6973        F(2, 8)       =  108.52
Residual | 2.336665    8  0.292083        Prob > F      =       0
Total    | 65.73135   10    6.5731        R-squared     =  0.9645
                                          Adj R-squared =  0.9556
                                          Root MSE      = 0.54045

lstar |     Coef.  Std. Err.       t   P>t   [95% Conf. Interval]
xstar |  0.078669   0.005448   14.44     0   0.0661066   0.091231
sqrtw | -1.593238   0.111494  -14.29     0   -1.850344  -1.336131
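The weighted regression can be retraced by hand from the grouped data table. The sketch below builds sqrtw, xstar, and lstar as defined on the preceding slides and solves the two normal equations of the no-constant regression directly; the variable names mirror the Stata output, and the closed-form solve is simply one way to do it without a regression package.

```python
import math

# Grouped data: income level, total households, number of home owners.
x = [6, 8, 10, 13, 15, 20, 25, 30, 35, 40]
N = [40, 50, 60, 80, 100, 70, 65, 50, 40, 25]
n = [8, 12, 18, 28, 45, 36, 39, 33, 30, 20]

p_hat = [ni / Ni for ni, Ni in zip(n, N)]            # relative frequencies
L_hat = [math.log(p / (1 - p)) for p in p_hat]       # estimated logits
w = [Ni * p * (1 - p) for Ni, p in zip(N, p_hat)]    # weights N*P*(1-P)

sqrtw = [math.sqrt(wi) for wi in w]
lstar = [s * L for s, L in zip(sqrtw, L_hat)]        # sqrt(w) * L
xstar = [s * xi for s, xi in zip(sqrtw, x)]          # sqrt(w) * X

# Normal equations for lstar = b1*sqrtw + b2*xstar (no constant).
saa = sum(a * a for a in sqrtw)
sab = sum(a * b for a, b in zip(sqrtw, xstar))
sbb = sum(b * b for b in xstar)
say = sum(a * yv for a, yv in zip(sqrtw, lstar))
sby = sum(b * yv for b, yv in zip(xstar, lstar))
det = saa * sbb - sab * sab
b1 = (sbb * say - sab * sby) / det    # coefficient on sqrtw
b2 = (saa * sby - sab * say) / det    # coefficient on xstar

print(f"sqrtw coefficient = {b1:.4f}, xstar coefficient = {b2:.4f}")
```

The two coefficients should agree with the sqrtw and xstar rows of the output above up to rounding.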

Page 55: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

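Interpreting the results

$$\sqrt{w_i}\, \hat{L}_i = \hat{\beta}_1 \sqrt{w_i} + \hat{\beta}_2 \sqrt{w_i}\, X_i$$

$$\sqrt{w_i}\, \hat{L}_i = -1.593238 \sqrt{w_i} + 0.0786686 \sqrt{w_i}\, X_i$$

A unit increase in weighted income (= sqrt(w)*X) increases the weighted log-odds (= sqrt(w)*L) by 0.0787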

Page 56: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Interpreting (2)

(antilog of the estimated coefficient of weighted X – 1) * 100 = percent change in the odds in favor of owning a house for every unit increase in weighted X

Predicted probabilities:

$$\hat{p} = \frac{e^{V}}{1 + e^{V}}$$

where V is the predicted logit (= predicted lstar divided by sqrt(w))

How does a unit increase in X impact on predicted probabilities?

-> varies with X

Page 57: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• The computer then calculates the $\beta$’s that assign the highest probability to the outcomes that were observed.

• The computer can calculate the $\beta$’s for you. You must know how to interpret them.

Page 58: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


TABLE 19.3 What Point Spreads Say About the Probability of Winning in the NFL: III

Page 59: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• In a linear regression, we look to coefficients for three elements:

1. Statistical significance: You can still read statistical significance from the slope dZ/dX. The z-statistic reported for probit or logit is analogous to OLS’s t-statistic.

2. Sign: If dZ/dX is positive, then dProb(Y)/dX is also positive.

Page 60: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

The z-statistic on the point spread is -7.22, well exceeding the 5% critical value of 1.96. The point spread is a statistically significant explanator of winning NFL games.

The sign of the coefficient is negative. A higher point spread predicts a lower chance of winning.

Page 61: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

3. Magnitude: the magnitude of dZ/dX has no particular interpretation. We care about the magnitude of dProb(Y)/dX.

From the computer output for a probit or logit estimation, you can interpret the statistical significance and sign of each coefficient directly. Assessing magnitude is trickier.

Page 62: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit (cont.)

• To predict the Prob(Y ) for a given X value, begin by calculating the fitted Z value from the predicted linear coefficients.

• For example, if there is only one explanator X:

$$E(Z_i) = \hat{Z}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$

Page 63: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Model (cont.)

Page 64: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Model (cont.)

• Then use the nonlinear function to translate the fitted Z value into a Prob(Y ):

$$Prob(Y) = F(\hat{Z})$$

Page 65: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Probit/Logit Model (cont.)

Page 66: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• Problems in Interpreting Magnitude:

1. The estimated coefficient relates X to Z. We care about the relationship between X and Prob(Y = 1).

2. The effect of X on Prob(Y = 1) varies depending on Z.

Page 67: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• There are two basic approaches to assessing the magnitude of the estimated coefficient.

• One approach is to predict Prob(Y ) for different values of X, to see how the probability changes as X changes.

Page 68: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

Page 69: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• Note Well: the effect of a 1-unit change in X varies greatly, depending on the initial value of E(Z ).

• E(Z ) depends on the values of all explanators.

Page 70: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

Page 71: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• For example, let’s consider the effect of 1 point change in the point spread, when we start 1 standard deviation above the mean, at SPREAD = 5.88 points.

• Note: In this example, there is only one explanator, SPREAD. If we had other explanators, we would have to specify their values for this calculation, as well.

Page 72: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• Step One: Calculate the E(Z ) values for X = 5.88 and X = 6.88, using the fitted values.

• Step Two: Plug the E(Z ) values into the formula for the logistic distribution function (the CDF, F).

Page 73: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

$$\hat{Z}(5.88) = 0 - 0.1098 \times 5.88 = -0.6456$$

$$\hat{Z}(6.88) = 0 - 0.1098 \times 6.88 = -0.7554$$

For the logit,

$$F(\hat{Z}) = \frac{\exp(\hat{Z})}{1 + \exp(\hat{Z})}$$

$$F(-0.7554) - F(-0.6456) = 0.320 - 0.344 = -0.024$$

Page 74: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• Changing the point spread from 5.88 to 6.88 predicts a 2.4 percentage point decrease in the team’s chance of victory.

• Note that changing the point spread from 8.88 to 9.88 predicts only a 2.1 percentage point decrease.
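The two-step calculation can be retraced in a few lines. The intercept of 0 and the slope of 0.1098 are the values that appear in the worked example; the negative sign on the slope is taken from the statement that the coefficient on the point spread is negative. Function names are illustrative.

```python
import math

def logistic_cdf(z):
    """Logistic CDF F(Z) = exp(Z) / (1 + exp(Z))."""
    return 1.0 / (1.0 + math.exp(-z))

b0, b1 = 0.0, -0.1098    # fitted logit coefficients from the example

def prob_win(spread):
    """Step one: fitted Z. Step two: translate Z into a probability."""
    return logistic_cdf(b0 + b1 * spread)

change_a = prob_win(6.88) - prob_win(5.88)
change_b = prob_win(9.88) - prob_win(8.88)
print(f"5.88 -> 6.88: {change_a:+.3f}")
print(f"8.88 -> 9.88: {change_b:+.3f}")
```

The same one-point change in the spread moves the predicted probability by different amounts depending on where it starts.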

Page 75: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• The other approach is to use calculus.

$$\frac{dProb(Y)}{dX_1} = \frac{dProb(Y)}{d\hat{Z}} \cdot \frac{d\hat{Z}}{dX_1} = \frac{dF}{d\hat{Z}}\, \hat{\beta}_1$$

Page 76: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

$$\frac{dProb(Y)}{dX_1} = \frac{dProb(Y)}{d\hat{Z}} \cdot \frac{d\hat{Z}}{dX_1} = \frac{dF}{d\hat{Z}}\, \hat{\beta}_1$$

Unfortunately, $dF/d\hat{Z}$ varies, depending on $\hat{Z}$. However, a sample value can be calculated for a representative $\hat{Z}$ value. Typically, we use the $\hat{Z}$ calculated at the mean values for each $X$.
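For the logit, the derivative of the logistic CDF has the closed form F(1 − F), so the pseudo-slope can be sketched as below. The coefficients are those of the NFL example (with the negative sign on the spread); the evaluation point of 5.88 is an assumption chosen for illustration, since the sample mean of the spread is not given here.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

b0, b1 = 0.0, -0.1098     # coefficients from the NFL example
spread = 5.88             # illustrative evaluation point, not the sample mean

z_hat = b0 + b1 * spread
F = logistic_cdf(z_hat)
dF_dZ = F * (1.0 - F)               # derivative of the logistic CDF
pseudo_slope = dF_dZ * b1           # dProb(Y)/dX = (dF/dZ) * b1

# Finite-difference check of the analytic derivative.
h = 1e-6
numeric = (logistic_cdf(b0 + b1 * (spread + h))
           - logistic_cdf(b0 + b1 * (spread - h))) / (2 * h)

print(f"pseudo-slope at spread {spread}: {pseudo_slope:.4f} (numeric {numeric:.4f})")
```

The result is close to, but not identical with, the discrete one-point change, because the derivative is an instantaneous rate at a single value of Z.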

Page 77: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Estimating a Probit/Logit Model (cont.)

• Some econometrics software packages can calculate such “pseudo-slopes” for you.

• In STATA, the command is “dprobit.”

• EViews does NOT have this function.

Page 78: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Tobit Model

It is an extension of the probit model, named after Tobin. We observe variables if the event occurs, e.g. the amount spent if someone buys a house. We do not observe the dependent variable for people who have not bought a house. The observed sample is censored: it contains observations only for those who buy the house.

$$Y_t = \begin{cases} \beta_0 + \beta_1 X_t + u_t & \text{if the event occurs} \\ 0 & \text{otherwise} \end{cases}$$

$Y_t$ is equal to $\beta_0 + \beta_1 X_t + u_t$ if the event is observed and equal to zero if the event is not observed. It is unscientific to estimate the equation only with the observed sample without worrying about the remaining observations in the truncated distribution. The Tobit model tries to correct this bias.
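The bias that motivates the Tobit model can be illustrated with a small simulation. This sketch does not estimate a Tobit model; it only compares OLS slopes on the latent variable, on the censored variable, and on the positive-only subsample, under hypothetical values β0 = 0 and β1 = 1 with standard normal disturbances.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

rng = random.Random(2)
beta0, beta1 = 0.0, 1.0                          # hypothetical true values
xs = [rng.gauss(0, 1) for _ in range(20000)]
latent = [beta0 + beta1 * x + rng.gauss(0, 1) for x in xs]
censored = [max(0.0, z) for z in latent]         # zero when the event does not occur

slope_latent = ols_slope(xs, latent)             # infeasible benchmark
slope_censored = ols_slope(xs, censored)         # OLS with the zeros included
positive = [(x, y) for x, y in zip(xs, censored) if y > 0]
slope_positive = ols_slope([p[0] for p in positive], [p[1] for p in positive])

print(f"latent: {slope_latent:.3f}, censored: {slope_censored:.3f}, "
      f"positive only: {slope_positive:.3f}")
```

Both feasible OLS slopes fall well short of the true slope, which is the bias the Tobit likelihood is designed to address.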

Page 79: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.


Page 80: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Censored Regression Model


Page 81: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.

Truncated Regression Model


Page 82: Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.
