Session 10 (Logistic Regression)

19
URBN LOFTS 1837 LOFT STR EET, ANYTOWN , NY 50080 URBN LOFTS URBN LOFTS Logistic R GR SSION

Transcript of Session 10 (Logistic Regression)

Page 1: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 1/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

URBNLOFTS

Logistic R GR SSION

Page 2: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 2/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

Early uses of logistic regression were in

biomedical studies, for instance, to model whether

subjects have a particular condition such as lung

cancer. The past 25 years have seen much use in

social science research, for modeling opinions and

behavior decisions, and in business applications. In

credit-scoring, logistic regression is used to model

the probability that a subject is credit worthy.

Page 3: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 3/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

For instance, the probability that a subject

pays a bill on time may use predictors such as the

size of the bill, annual income, occupation,

mortgage and debt obligations, percentage of bills

paid on time in the past, and other aspects of an

applicant's credit history.

Page 4: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 4/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

Another area of increasing application is

genetics, such as to estimate quantitative trait loci

effects by modeling the probability that an

offspring inherits an allele of one type instead of 

another type as a function of phenotypic values on

various traits for that offspring.

Page 5: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 5/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

For binary response variable Y and an explanatory X, let

= = 1 = = 1 − = 0 = . The logistic

regression model is

Equivalently, the logit (log odds) has the linear relationship

= exp + 1 + exp +

=

1 − = +

Page 6: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 6/19

URBN LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• The sign determines whether π(x) is increasing ordecreasing as x  increases.

• The rate of climb or descent increases as |β|

increases; as β → 0 the curve flattens to a horizontal

straight line.

• When β = 0, Y is independent of X.

• For quantitative  x  with   β > 0, the curve for π( x ) has

the shape of the CDF of the logistic distribution. Sincethe logistic density is symmetric, π( x ) approaches 1

at the same rate that it approaches 0.

β

:

Page 7: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 7/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• Exponentiating both sides of the equation shows that the oddsare an exponential function of x.

• This provides a basic interpretation for the magnitude of  β : The

odds multiply by for every 1 -unit increase in x.

• In other words, is an odds ratio, the odds at X = x + 1 dividedby the odds at X = x.

• The intercept parameter α is not usually of particular interest.

• By centering the predictor about 0 [i.e., replacing x by (x —   )],

α becomes the logit at x =   , and thus α/(1 + α) = π(   ). As in

ordinary regression, centering is also helpful in complex models

containing quadratic or interaction terms to reduce correlations

among model parameter estimates.

β:

Page 8: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 8/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• To illustrate logistic regression, we re-analyze thehorseshoe crab data introduced last time.

• The binary response is whether a female crab has any

male crabs residing nearby (satellites). For crab   i, let

yi = 1 if she has at least one satellite and yi = 0 if shehas none.

• Here, we use as a predictor the female crab's carapace

width.

• In the succeeding graph, it appears that yi = 1 tends to

occur relatively more often at higher x values, in fact,

all crabs with width >29 cm have satellites.

Page 9: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 9/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

Page 10: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 10/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• In each of the eight width categories, we computed thesample proportion of crabs having satellites and the

mean width for the crabs in that category.

• A curve based on smoothing the data using the

generalized additive modeling method, assuming abinomial response and logit link is also in the graph

• This curve shows a roughly increasing trend and is more

informative than viewing the binary data alone.

• It suggests that an S-shaped regression function may

describe this relationship relatively well.

Page 11: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 11/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

Page 12: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 12/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• The ML fit is

• Substituting x = 26.3 cm, the mean width level in this

sample,π(x) = 0.674.

• The estimated probability equals 0.5 when

• The estimated odds of a satellite multiply by exp( ) =

exp(0.497) = 1.64 for each 1-cm increase in width; that

is, there is a 64% increase.

x = −

=12.351

0.497= 24.8

Page 13: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 13/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• The statistic z = /s.e. = 0.497/0.102 = 4.89 providesstrong evidence of a positive width effect (P < 0.0001).

• The equivalent Wald chi-squared statistic, z2 = 23.89,

has df = 1.

• The maximized log likelihoods equal —112.88 under H0:

= 0 and —97.23 for the full model.

• The likelihood-ratio statistic equals

—2[

—112.88

—(-97.23)] = 31.31, with df = 1.

• This provides even stronger evidence than the Wald

test.

Page 14: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 14/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• The Wald 95% confidence interval for is 0.497 ±1.96(0.102), or (0.298, 0.697).

• The table reports a likelihood-ratio confidence interval

of (0.308, 0.709), based on the profile likelihood

function.

• The confidence interval for the effect on the odds per

1-cm increase in width equals (e0.308, e0.709) =

(1.36,2.03).

• We infer that a 1-cm increase in width has at least a

36% increase and at most a doubling in the odds of a

satellite.

Page 15: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 15/19

URBNLOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• In practice, there is no guarantee that a certain logistic

regression model fits the data well.

• For any type of binary data, one way to detect lack of fit

uses a likelihood-ratio test to compare the model to more

complex ones.

• A more complex model might contain a nonlinear effect.

• Models with multiple predictors would consider interaction.

If more complex models do not fit better, this provides

some assurance that the model chosen is reasonable.

Page 16: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 16/19

URBN

LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• For models with a continuous explanatory variable, X2 and

G2 do not approximate chi-squared distributions due to very

few counts for each value of  x .

• One solution for this is to “bin” the continuous variable

into several categories and then perform a goodness-of-fit

test.

• If the test resulted to a good fit, then it can be inferred

that the model with the continuous explanatory variable

has also a good fit.

Page 17: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 17/19

URBN

LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

Page 18: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 18/19

URBN

LOFTS1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• In each width category, the fitted value for a "yes" response

is the sum of the estimated probabilities π(x) for all crabs

having width in that category

• The fitted value for a "no" response is the sum of 1 —   π(x)

for those crabs.

• In this table we have an 8 by 2 table, to compute for thechi-squared statistic, we square the difference of the count

for observed and the fitted values.

Page 19: Session 10 (Logistic Regression)

8/17/2019 Session 10 (Logistic Regression)

http://slidepdf.com/reader/full/session-10-logistic-regression 19/19

URBN LOFTS

1837 LOFT STREET, ANYTOWN, NY 50080

URBN LOFTS

• Their values are X2 = 5.3 and G2 = 6.2.

• The constructed table has eight binomial samples, one for

each width setting.

• The model has two parameters, so df = 8 — 2 = 6.

• Neither X2 nor G2 shows evidence of lack of fit (P-value >

0.4).

• Thus, we can feel more comfortable about using the model

for the original ungrouped data.