The maximum likelihood method Likelihood = probability that an observation is predicted by the specified model Plausible observations and plausible models

Page 1:

The maximum likelihood method

Likelihood = probability that an observation is predicted by the specified model

Plausible observations and plausible models

Page 2:

MLE

• Observations are ‘outcomes of random experiments’: the outcome is represented by a random variable (e.g. Y). A realization of Y is y_i (i = 1, 2, …, m).

• The distribution of possible outcomes is given by a probability distribution.

• The same data (observations) can be generated by different models, and different observations may be generated by the same model. What is the range of plausible observations, given the model, and what are the different models that could plausibly have generated the data?

– Plausible observations and plausible models

• A probability model predicts an outcome and associates a probability with each outcome.

Page 3:

What is a plausible model?

A model that predicts observations with a probability that exceeds a given minimum.

What is the most plausible model?

The model that predicts the observations with the largest probability, i.e. the most likely model, given the data.

Page 4:

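Observation from a normal distribution N(μ, σ²)

Probability that an observation is predicted by N(μ, σ²) (for a continuous variable, this is the value of the density at the observation). Probability that 120 is predicted by N(100, 100):

$$f(120;\,100,\,100) = \frac{1}{10\sqrt{2\pi}} \exp\left[-\frac{(120-100)^2}{2 \cdot 100}\right] = 0.0054$$

Probability that 120 is predicted by N(120, 100):

$$f(120;\,120,\,100) = \frac{1}{10\sqrt{2\pi}} \exp\left[-\frac{(120-120)^2}{2 \cdot 100}\right] = 0.0399$$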
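The two values above can be checked numerically. The following is a minimal sketch; the function name `normal_density` is an illustrative choice, not something taken from the slides.

```python
import math

def normal_density(y, mu, var):
    """Density of N(mu, var) evaluated at the observation y."""
    return math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Observation 120 under two candidate models with variance 100
print(round(normal_density(120, 100, 100), 4))  # 0.0054
print(round(normal_density(120, 120, 100), 4))  # 0.0399
```

The model centred on the observation gives the larger value, which is why N(120, 100) is the more plausible of the two for this single observation.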

Page 5:

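Log-likelihood

$$L(y;\,\mu,\,\sigma^2) = f(y;\,\mu,\,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(y-\mu)^2}{2\sigma^2}\right]$$

$$l = \ln f(y;\,\mu,\,\sigma^2) = \ln\left(\frac{1}{\sqrt{2\pi}}\right) - \ln(\sigma) - \frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2$$

where 1/√(2π) = 0.3989.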
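As a small illustration of the three terms of the log-likelihood, the sketch below evaluates them separately for one observation. The function name `normal_loglik` and the variable names are assumptions made for this example.

```python
import math

def normal_loglik(y, mu, sigma):
    """Log-likelihood of one observation y under N(mu, sigma**2), term by term."""
    constant = math.log(1 / math.sqrt(2 * math.pi))  # ln(0.3989)
    scale = -math.log(sigma)                         # -ln(sigma)
    fit = -0.5 * ((y - mu) / sigma) ** 2             # -(1/2) * squared standardized distance
    return constant + scale + fit

# Observation 120 under N(100, 100), i.e. sigma = 10
print(round(normal_loglik(120, 100, 10), 2))  # -5.22
```

Only the last term depends on mu, so for a fixed variance the log-likelihood is largest where the observation is closest to the mean.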

Page 6:

Probability model: normal random variable with variance = 100 and different values of the mean (mu)

A. Likelihood function: normal distribution with given mean (mu) and variance 100. The columns 100 to 145 are values of mu.

obs      Value     100     105     110     115     120     125     130     135     140     145
1          128  0.0008  0.0028  0.0079  0.0171  0.0290  0.0381  0.0391  0.0312  0.0194  0.0094
2          118  0.0079  0.0171  0.0290  0.0381  0.0391  0.0312  0.0194  0.0094  0.0035  0.0010
3          126  0.0014  0.0044  0.0111  0.0218  0.0333  0.0397  0.0368  0.0266  0.0150  0.0066
4          127  0.0010  0.0035  0.0094  0.0194  0.0312  0.0391  0.0381  0.0290  0.0171  0.0079
5          120  0.0054  0.0130  0.0242  0.0352  0.0399  0.0352  0.0242  0.0130  0.0054  0.0018
6          129  0.0006  0.0022  0.0066  0.0150  0.0266  0.0368  0.0397  0.0333  0.0218  0.0111
7          112  0.0194  0.0312  0.0391  0.0381  0.0290  0.0171  0.0079  0.0028  0.0008  0.0002
8          119  0.0066  0.0150  0.0266  0.0368  0.0397  0.0333  0.0218  0.0111  0.0044  0.0014
9          124  0.0022  0.0066  0.0150  0.0266  0.0368  0.0397  0.0333  0.0218  0.0111  0.0044
10         139  0.0000  0.0001  0.0006  0.0022  0.0066  0.0150  0.0266  0.0368  0.0397  0.0333
Mean     124.2
Product         1.0464  1.1007  1.1846  1.2846  1.3650  1.3845  1.3324  1.2399  1.1482  1.0800

B. Log-likelihood function. The columns 100 to 145 are values of mu.

obs      Value     100     105     110     115     120     125     130     135     140     145
1          128   -7.14   -5.87   -4.84   -4.07   -3.54   -3.27   -3.24   -3.47   -3.94   -4.67
2          118   -4.84   -4.07   -3.54   -3.27   -3.24   -3.47   -3.94   -4.67   -5.64   -6.87
3          126   -6.60   -5.43   -4.50   -3.83   -3.40   -3.23   -3.30   -3.63   -4.20   -5.03
4          127   -6.87   -5.64   -4.67   -3.94   -3.47   -3.24   -3.27   -3.54   -4.07   -4.84
5          120   -5.22   -4.35   -3.72   -3.35   -3.22   -3.35   -3.72   -4.35   -5.22   -6.35
6          129   -7.43   -6.10   -5.03   -4.20   -3.63   -3.30   -3.23   -3.40   -3.83   -4.50
7          112   -3.94   -3.47   -3.24   -3.27   -3.54   -4.07   -4.84   -5.87   -7.14   -8.67
8          119   -5.03   -4.20   -3.63   -3.30   -3.23   -3.40   -3.83   -4.50   -5.43   -6.60
9          124   -6.10   -5.03   -4.20   -3.63   -3.30   -3.23   -3.40   -3.83   -4.50   -5.43
10         139  -10.83   -9.00   -7.43   -6.10   -5.03   -4.20   -3.63   -3.30   -3.23   -3.40
Mean     124.2
Sum             -64.00  -53.15  -44.80  -38.95  -35.60  -34.75  -36.40  -40.55  -47.20  -56.35
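The "Sum" row of the log-likelihood table can be reproduced with a short grid search. This is a sketch under the slide's assumptions (ten observations, variance fixed at 100); the names `total_loglik`, `grid` and `best_on_grid` are illustrative.

```python
import math

observations = [128, 118, 126, 127, 120, 129, 112, 119, 124, 139]
SIGMA = 10.0  # standard deviation, since the variance is fixed at 100

def total_loglik(mu, data=observations, sigma=SIGMA):
    """Sum of the normal log-likelihoods of all observations for a given mean mu."""
    constant = -0.5 * math.log(2 * math.pi) - math.log(sigma)
    return sum(constant - 0.5 * ((y - mu) / sigma) ** 2 for y in data)

grid = range(100, 150, 5)  # mu = 100, 105, ..., 145
sums = {mu: total_loglik(mu) for mu in grid}
best_on_grid = max(sums, key=sums.get)
sample_mean = sum(observations) / len(observations)

for mu in grid:
    print(mu, round(sums[mu], 2))
print("best mu on grid:", best_on_grid)
print("sample mean:", sample_mean)
```

On this grid the largest sum occurs at mu = 125, the grid point closest to the sample mean of 124.2. With the variance fixed, the sample mean itself is the value of mu that maximizes the log-likelihood.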

Page 7:

[Figure: Likelihood function, Gaussian probability model N(125, 100). Likelihood (axis from 0.00 to 0.04) and log-likelihood (axis from -6 to -3) plotted against mu from 100 to 150.]

Page 8:

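Range of plausible models: likelihood ratio

$$\lambda = \frac{f(y;\,\mu_0,\,\sigma^2)}{f(y;\,\mu_b,\,\sigma^2)}$$

with μ_0 the specified model and μ_b the ‘best’ model.

Ratio of the likelihood of any model to the likelihood of the ‘best’ model:

$$\lambda = \frac{\exp\left[-\frac{1}{2}\left(\frac{y-\mu_0}{\sigma}\right)^2\right]}{\exp\left[-\frac{1}{2}\left(\frac{y-\mu_b}{\sigma}\right)^2\right]} = \exp\left[-\frac{1}{2}\left(\frac{y-\mu_0}{\sigma}\right)^2\right] \quad \text{when } \mu_b = y$$

Log-likelihood ratio: ln λ = -½ z², so z² = -2 ln λ, where z = (y - μ_0)/σ.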
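The relation between the likelihood ratio and z can be verified numerically. The sketch below assumes a single observation y whose 'best' model has mean equal to y, so that z = (y - mu_0)/sigma; the function names are illustrative.

```python
import math

def normal_density(y, mu, sigma):
    """Density of N(mu, sigma**2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood_ratio(y, mu0, sigma):
    """ln(lambda) for model mu0 relative to the best model mu_b = y."""
    z = (y - mu0) / sigma
    return -0.5 * z ** 2

y, sigma = 125, 10
for mu0 in (105.4, 115, 125, 135, 144.6):
    direct = math.log(normal_density(y, mu0, sigma) / normal_density(y, y, sigma))
    print(mu0, round(log_likelihood_ratio(y, mu0, sigma), 4), round(direct, 4))
```

Both columns agree, and the ratio is 0 (lambda = 1) only for the best model itself.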

Page 9:

A plausible value of μ is one for which the log-likelihood ratio ln λ exceeds a critical value (is less negative), e.g. -1.9208, which corresponds to a 95% confidence interval, or -1.353, which corresponds to a 90% confidence interval.

The values of μ for which ln λ > -1.9208 form the support range for μ.

When μ is outside the support range, we reject the claim that μ does not differ significantly from μ_b. We accept a risk of 5% of wrongly rejecting the claim (Type I error).

Page 10:

To get the support range, find μ* for which ln λ = -1.9208 (given that the ‘best’ value of μ is 125 and σ² is fixed at 100):

$$-\frac{1}{2}\left(\frac{125-\mu^*}{10}\right)^2 = -1.9208$$

Solution:

$$\left(\frac{125-\mu^*}{10}\right)^2 = 3.8416$$

$$\frac{125-\mu^*}{10} = \pm 1.96$$

$$\mu^* = 125 \pm 1.96 \cdot 10 = (105.4,\ 144.6)$$

The observation could come from ANY model in the support range. All models in the ‘support range’ are supported by the data.
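The same solution can be written as a small function that inverts ln λ = -½ z² for a chosen critical value. This is a sketch; `support_range` and its parameter names are assumptions made for illustration.

```python
import math

def support_range(mu_best, sigma, critical=-1.9208):
    """Values of mu whose log-likelihood ratio equals the critical value."""
    half_width = sigma * math.sqrt(-2 * critical)  # sigma * z
    return mu_best - half_width, mu_best + half_width

low, high = support_range(125, 10)
print(round(low, 1), round(high, 1))  # 105.4 144.6

# The less strict critical value -1.353 (90%) gives a narrower range
low90, high90 = support_range(125, 10, critical=-1.353)
print(round(low90, 1), round(high90, 1))
```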

Page 11:

[Figure: Log-likelihood ratio (axis from -3.5 to 0.0) plotted against mu from 100 to 150.]

Page 12:

Observation from a binomial distribution with parameter p and index m

$$p(k) = \Pr\{N = k\} = \frac{m!}{k!\,(m-k)!}\; p^k (1-p)^{m-k}$$

Likelihood function:

$$L = p^k (1-p)^{m-k}$$

Log-likelihood function:

$$l = \ln(L) = k \ln(p) + (m-k) \ln(1-p)$$
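The slides locate the maximum by evaluating the likelihood on a grid of p values. As a complementary sketch (a standard calculus step, not shown in the slides), setting the derivative of the log-likelihood to zero gives the maximizing value in closed form:

```latex
\begin{align*}
\frac{dl}{dp} &= \frac{k}{p} - \frac{m-k}{1-p} = 0 \\
k\,(1-p) &= (m-k)\,p \\
\hat{p} &= \frac{k}{m}
\end{align*}
```

For example, k = 4 and m = 10 give a maximum at p = 0.4.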

Page 13:

Probability model: binomial model with index m = 10, for different values of p

A. Binomial distribution. The columns 0.1 to 0.9 are values of p.

k         k!  m!/(k!(m-k)!)      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1          1             10  0.38742  0.26844  0.12106  0.04031  0.00977  0.00157  0.00014  0.00000  0.00000
2          2             45  0.19371  0.30199  0.23347  0.12093  0.04395  0.01062  0.00145  0.00007  0.00000
3          6            120  0.05740  0.20133  0.26683  0.21499  0.11719  0.04247  0.00900  0.00079  0.00001
4         24            210  0.01116  0.08808  0.20012  0.25082  0.20508  0.11148  0.03676  0.00551  0.00014
5        120            252  0.00149  0.02642  0.10292  0.20066  0.24609  0.20066  0.10292  0.02642  0.00149
6        720            210  0.00014  0.00551  0.03676  0.11148  0.20508  0.25082  0.20012  0.08808  0.01116
7       5040            120  0.00001  0.00079  0.00900  0.04247  0.11719  0.21499  0.26683  0.20133  0.05740
8      40320             45  0.00000  0.00007  0.00145  0.01062  0.04395  0.12093  0.23347  0.30199  0.19371
9     362880             10  0.00000  0.00000  0.00014  0.00157  0.00977  0.04031  0.12106  0.26844  0.38742
10   3628800

B. Likelihood function p^k (1-p)^(m-k). The columns 0.1 to 0.9 are values of p.

k       0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1   0.03874  0.02684  0.01211  0.00403  0.00098  0.00016  0.00001  0.00000  0.00000
2   0.00430  0.00671  0.00519  0.00269  0.00098  0.00024  0.00003  0.00000  0.00000
3   0.00048  0.00168  0.00222  0.00179  0.00098  0.00035  0.00008  0.00001  0.00000
4   0.00005  0.00042  0.00095  0.00119  0.00098  0.00053  0.00018  0.00003  0.00000
5   0.00001  0.00010  0.00041  0.00080  0.00098  0.00080  0.00041  0.00010  0.00001
6   0.00000  0.00003  0.00018  0.00053  0.00098  0.00119  0.00095  0.00042  0.00005
7   0.00000  0.00001  0.00008  0.00035  0.00098  0.00179  0.00222  0.00168  0.00048
8   0.00000  0.00000  0.00003  0.00024  0.00098  0.00269  0.00519  0.00671  0.00430
9   0.00000  0.00000  0.00001  0.00016  0.00098  0.00403  0.01211  0.02684  0.03874
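Both tables can be regenerated from the two formulas. The sketch below assumes the index m = 10 implied by the binomial coefficients in the table; the function names are illustrative.

```python
import math

M = 10  # index of the binomial distribution (assumed from the coefficients 10, 45, 120, ...)
p_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def likelihood(k, p, m=M):
    """Likelihood p^k (1-p)^(m-k), i.e. the binomial probability without the coefficient."""
    return p ** k * (1 - p) ** (m - k)

def binomial_probability(k, p, m=M):
    """Pr{N = k} for a binomial distribution with parameter p and index m."""
    return math.comb(m, k) * likelihood(k, p, m)

for k in range(1, M):
    best_p = max(p_values, key=lambda p: likelihood(k, p))
    row = " ".join(f"{binomial_probability(k, p):.5f}" for p in p_values)
    print(k, row, "| most likely p on grid:", best_p)
```

Because the binomial coefficient does not depend on p, tables A and B peak at the same p in every row; on this grid that is p = k/10.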

Page 14:

[Figure: Likelihood function, binomial (k = 4, m = 10). Likelihood (axis from 0.0000 to 0.0014) and log-likelihood (axis from -18.00 to -6.00) plotted against p from 0.1 to 0.9.]

Page 15:

Data: leaving parental home

Leaving parental home, 1961 cohort, micro-data. Survey Sept. 87 - Febr. 88.

First 23 respondents. Three censored observations.

ID   Sex  Father  Month  Reason
1      2       2    268       2
2      1       3    268       2
3      1       2    202       1
4      2       2    320       4
5      1       1    237       1
6      1       1    295       2
7      1       1    272       2
8      2       1    231       1
9      2       1    312       3
10     1       2    289       2
11     1       1    316       2
12     2       1    321       4
13     2       1    260       1
14     2       2    281       2
15     2       1    273       2
16     1       2    251       3
17     2       2    212       1
18     2       2    320       2
19     1       2    221       3
20     2       2    322       4
21     2       1    221       2
22     2       3    308       2
23     1       2    233       1

Codes:
Sex: 1 = Female, 2 = Male
Father status: 1 = Low, 2 = Middle, 3 = High
Reason: 1 = Educ/work, 2 = Marriage/cohabit, 3 = Freedom, 4 = Censored at interview

TWO CATEGORIES                            Total       %
Leave home early (before age 20)            209   35.85
Leave home late (at or after age 20)        321   55.06
Censored                                     53    9.09
Total                                       583  100.00
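The split into early leavers, late leavers and censored cases can be sketched for the 23 listed respondents. The sketch assumes that the Month column is the age in months at leaving home, so that age 20 corresponds to month 240; the slides do not state this explicitly. The variable names are illustrative.

```python
# (Month, Reason) for respondents 1 to 23, copied from the listing above
respondents = [
    (268, 2), (268, 2), (202, 1), (320, 4), (237, 1), (295, 2),
    (272, 2), (231, 1), (312, 3), (289, 2), (316, 2), (321, 4),
    (260, 1), (281, 2), (273, 2), (251, 3), (212, 1), (320, 2),
    (221, 3), (322, 4), (221, 2), (308, 2), (233, 1),
]

AGE_20_IN_MONTHS = 240  # assumption: Month is age in months at leaving home
CENSORED = 4            # Reason code 4 = censored at interview

censored = sum(1 for _, reason in respondents if reason == CENSORED)
uncensored_months = [month for month, reason in respondents if reason != CENSORED]
early = sum(1 for month in uncensored_months if month < AGE_20_IN_MONTHS)
late = len(uncensored_months) - early

print("censored:", censored)  # 3
print("early:", early)        # 7
print("late:", late)          # 13
```

Under this assumption the counts agree with the three censored observations mentioned for the first 23 respondents.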

Page 16:

Analysis of young adults who left home: leave out censored cases (conditional analysis)

20 observations

A. Probability of observing k young adults leaving home early: binomial distribution

7 out of 20 left home before age 20 (early): m = 20, k = 7

FIND the probability of leaving home early (p) by maximum likelihood. Determine, for various values of p, the probability that the model (binomial) predicts the observations.

CODING: reference category = 'leave home late'. Hence: 'Early' = 1, 'Late' = 0.

p     Bin.coef  Pr{N=k}
0.10     77520   0.0020
0.15     77520   0.0160
0.20     77520   0.0545
0.25     77520   0.1124
0.30     77520   0.1643
0.35     77520   0.1844
0.40     77520   0.1659
0.45     77520   0.1221
0.50     77520   0.0739
0.55     77520   0.0366
0.60     77520   0.0146
0.65     77520   0.0045
0.70     77520   0.0010
0.75     77520   0.0002
0.80     77520   0.0000

[Figure: Likelihood function. Probability of observing 7 in 20 (axis from 0.00 to 0.20) plotted against p from 0.10 to 0.80.]
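The search over values of p for m = 20 and k = 7 can be sketched as follows. The names `binomial_probability`, `grid` and `best_p` are illustrative.

```python
import math

m, k = 20, 7  # 7 of the 20 uncensored respondents left home early

def binomial_probability(p, m=m, k=k):
    """Pr{N = k}: probability that the binomial model predicts k early leavers."""
    return math.comb(m, k) * p ** k * (1 - p) ** (m - k)

grid = [round(0.10 + 0.05 * i, 2) for i in range(15)]  # 0.10, 0.15, ..., 0.80
probabilities = {p: binomial_probability(p) for p in grid}
best_p = max(probabilities, key=probabilities.get)

print("binomial coefficient:", math.comb(m, k))  # 77520
for p in grid:
    print(f"{p:.2f}  {probabilities[p]:.4f}")
print("most likely p on grid:", best_p)  # 0.35
```

The maximum on the grid is at p = 0.35, which equals the observed proportion k/m = 7/20.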

Page 17:

Expected value of the timing of leaving home: E[Y] = p = 0.35
Variance of the timing of leaving home: Var[Y] = p(1-p) = 0.35*(1-0.35) = 0.23

Expected number of young adults leaving home early in 20 observations: E[N] = mp = 7
Variance of the number of young adults leaving home early: Var[N] = mp(1-p) = 4.55
Standard error: 2.13

Variance of p: Var[p] = p(1-p)/m = 0.011
s.e.(p) = 0.107

95% confidence interval around p: binomial approximated by a normal distribution with mean mp and variance mp(1-p): N(mp, mp(1-p))

p = E[p] +/- 1.96*s.e.(p) = 0.35 +/- 1.96 * 0.107 = (0.141, 0.559)

The interval depends on the number of observations:

Number m   Interval
10         (0.054, 0.646)
100        (0.257, 0.443)
530        (0.353, 0.436)

The intervals for m = 10 and m = 100 follow from p = 0.35. The interval for m = 530 is consistent with the full-sample estimate p = 209/530 = 0.394 (all uncensored respondents) rather than p = 0.35.
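The normal-approximation interval and its dependence on m can be sketched as follows. The function name `normal_interval` is illustrative. Using p = 209/530 for the m = 530 case is an assumption, made because it matches the reported interval, whereas p = 0.35 does not.

```python
import math

def normal_interval(p, m, z=1.96):
    """Normal-approximation interval p +/- z * sqrt(p(1-p)/m)."""
    se = math.sqrt(p * (1 - p) / m)
    return p - z * se, p + z * se

# p = 0.35 (7 early leavers out of 20) with different numbers of observations
for m in (10, 20, 100):
    low, high = normal_interval(0.35, m)
    print(m, round(low, 3), round(high, 3))

# Full sample: 209 early leavers among 530 uncensored respondents (assumed p)
low, high = normal_interval(209 / 530, 530)
print(530, round(low, 3), round(high, 3))
```

The interval narrows as m grows, because the standard error shrinks in proportion to 1/sqrt(m).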