The Method of Likelihood Hal Whitehead BIOL4062/5062.


• What is likelihood?
• Maximum likelihood
• Maximum likelihood estimation
• Likelihood ratio tests
• Likelihood profile confidence intervals
• Model selection
  – Likelihood ratio tests
  – Akaike Information Criterion (AIC)
• Likelihood and least-squares
• Calculating likelihood

The Method of Likelihood

Observations: Y = y1, y2, y3, …

e.g. Weights of 30 crabs of known age and sex

Model specified by: μ1, μ2, μ3, …

e.g. y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e

where e ~ N(0, 1)

The LIKELIHOOD of Y is:

L = Probability(Y | Model & μ1, μ2, μ3, …)

Likelihood

The LIKELIHOOD of Y is:

L = Probability(Y | Model & μ1, μ2, μ3, …)

The LIKELIHOOD that Z became a criminal:

Probability that Z became a criminal, given what we know of Z's characteristics and how those characteristics translate into the probability of being a criminal

The LIKELIHOOD of Y is:

L = Probability(Y | Model & μ1, μ2, μ3, …)

We can work this out if we know μ1, μ2, μ3, …

Weights of 30 crabs of known age and sex:

y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e

e.g. Prob. of these 30 weights is 0.04 if:
  female wt. at age 0: μ1 = 30.0
  growth parameter: μ2 = 0.7
  excess male weight: μ3 = 5.0
  residual s.d.: μ4 = 6.3

L(μ1=30.0, μ2=0.7, μ3=5.0, μ4=6.3) = 0.04
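A minimal sketch of how such a likelihood might be evaluated for the crab model when all four parameters are known. The five records are hypothetical (the slides do not list the 30 weights), and the function name `log_likelihood` is an assumption for illustration. Because weight is continuous, the quantity computed is a normal density rather than a true probability, and it is kept on the log scale to avoid underflow.

```python
import math

# Hypothetical records (weight, age, sex), with sex coded 0 = female, 1 = male.
# These values are invented for illustration; they are not the lecture's data.
crabs = [(33.1, 4.0, 0), (38.9, 9.0, 1), (31.5, 1.0, 0), (40.2, 16.0, 1), (29.7, 0.0, 0)]

def log_likelihood(data, mu1, mu2, mu3, mu4):
    """Log-likelihood of the data under y = mu1 + mu2*sqrt(Age) + mu3*Sex + mu4*e, e ~ N(0, 1)."""
    total = 0.0
    for weight, age, sex in data:
        expected = mu1 + mu2 * math.sqrt(age) + mu3 * sex
        z = (weight - expected) / mu4              # standardized residual
        total += -0.5 * z * z - math.log(mu4 * math.sqrt(2 * math.pi))
    return total

# Parameter values quoted on the slide, then a deliberately poor alternative.
ll_slide = log_likelihood(crabs, 30.0, 0.7, 5.0, 6.3)
ll_other = log_likelihood(crabs, 10.0, 0.7, 5.0, 6.3)
print(f"log L at slide values: {ll_slide:.2f}")
print(f"log L with mu1 = 10:   {ll_other:.2f}")
```

Parameter values that place the expected weights far from the observed ones give a much lower log-likelihood, which is the comparison that maximum likelihood exploits.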

If we do not know μ1, μ2, μ3, …

MAXIMUM LIKELIHOOD of Y is:

L(μ1, μ2, μ3, …) = Max Prob(Y | μ1, μ2, μ3, …), maximized over μ1, μ2, …

e.g. Max. prob. of 30 weights is 0.12 when:
  female wt. at age 0: μ1 = 28.4
  growth parameter: μ2 = 0.31
  excess male weight: μ3 = 1.7
  residual s.d.: μ4 = 3.9

These values are the Maximum Likelihood Estimators.
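As a rough illustration of maximizing over unknown parameters, the sketch below fits the simplest model, y = μ1 + μ4·e, by a crude grid search and compares the result with the closed-form answers (sample mean, and standard deviation with divisor n). The eight weights are hypothetical, and a real analysis would use a proper optimizer rather than a grid.

```python
import math

# Hypothetical weights, invented for illustration.
weights = [31.2, 27.5, 35.1, 29.8, 33.4, 26.9, 30.6, 28.3]

def log_lik(mu1, mu4, ys=weights):
    """Normal log-likelihood of ys under y = mu1 + mu4*e, e ~ N(0, 1)."""
    return sum(-0.5 * math.log(2 * math.pi * mu4 ** 2) - (y - mu1) ** 2 / (2 * mu4 ** 2)
               for y in ys)

# Crude grid search: mu1 from 25.0 to 35.0 and mu4 from 0.5 to 10.0, in steps of 0.05.
grid_mu1 = [25 + 0.05 * i for i in range(201)]
grid_mu4 = [0.5 + 0.05 * i for i in range(191)]
_, mu1_hat, mu4_hat = max((log_lik(m, s), m, s) for m in grid_mu1 for s in grid_mu4)

# Closed-form maximum likelihood estimators for this model.
mean = sum(weights) / len(weights)
sd_ml = math.sqrt(sum((y - mean) ** 2 for y in weights) / len(weights))

print(f"grid search: mu1 = {mu1_hat:.2f}, mu4 = {mu4_hat:.2f}")
print(f"closed form: mu1 = {mean:.2f}, mu4 = {sd_ml:.2f}")
```

The two approaches should agree to within the grid spacing.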

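Maximum Likelihood

[Figure: Likelihood plotted against μ1. The peak of the curve is the maximum likelihood; the value of μ1 at the peak is the maximum likelihood estimator of μ1.]

Maximum Likelihood

[Figure: two curves of Likelihood against μ1: a narrow peak (precise estimate) and a broad peak (imprecise estimate).]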

Likelihood Ratio Tests

If μ1, μ2, μ3, …, μt is the true model, and

μ1, μ2, μ3, …, μt, …, μg is a more general model,

then

G = 2·Log[L(μ1, μ2, μ3, …, μg) / L(μ1, μ2, μ3, …, μt)]

(twice the log of the ratio of the maximum likelihoods)

is distributed as χ² with g-t degrees of freedom for large sample sizes (asymptotically).

If G is unexpectedly large, then the data are unlikely to be from model μ1, μ2, μ3, …, μt.

Likelihood Ratio Tests

G = 2·Log[L(μ1, μ2, μ3, …, μg) / L(μ1, μ2, μ3, …, μt)]

This is the G-test for goodness-of-fit:

null hypothesis: μ1, μ2, μ3, …, μt

alternative hypothesis: μ1, μ2, μ3, …, μt, …, μg
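In practice G is usually computed from the two maximized log-likelihoods, since the log of a ratio is a difference of logs. The notation below is introduced here for convenience and is not on the slides: a hat marks a maximized value, and the script letter is the natural-log likelihood.

```latex
G \;=\; 2\,\log\frac{\hat{L}_g}{\hat{L}_t}
  \;=\; 2\left(\hat{\ell}_g - \hat{\ell}_t\right),
\qquad
G \;\sim\; \chi^2_{\,g-t}\ \text{(asymptotically, if the null model is true)}
```

Since the more general model contains the null model, its maximized likelihood cannot be smaller, so G is never negative.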

Likelihood: an example

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

Null hypothesis: Binomial Distribution with q = 0.75

Likelihood(q = 0.75) = 90C10 × 0.75^80 × 0.25^10 = 0.000551

Alternative hypothesis: Binomial Distribution with q = ?

Likelihood(q) = 90C10 × q^80 × (1-q)^10

This has a maximum value when q = 80/90 = 0.89 (the Maximum Likelihood Estimator)

Max Likelihood(q) = 90C10 × (80/90)^80 × (10/90)^10 = 0.1327

Likelihood Ratio Test

G = 2·Log[Max Likelihood(q) / Likelihood(q = 0.75)] = 2·Log(0.1327 / 0.000551) = 10.97

G is distributed as χ² with 1 df if q = 0.75

G is significantly large (P < 0.01) in χ²(1), so reject the null hypothesis
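The arithmetic of this example can be checked with a few lines of Python. This is a sketch using only the standard library; the function name `binomial_likelihood` is an assumption, and the P-value uses the identity that a χ²(1) upper tail equals erfc(√(G/2)).

```python
import math

WILD, MUTANT = 80, 10            # observed counts from the example
N = WILD + MUTANT

def binomial_likelihood(q):
    """Probability of exactly 80 wild types and 10 mutants when P(wild type) = q."""
    return math.comb(N, MUTANT) * q ** WILD * (1 - q) ** MUTANT

l_null = binomial_likelihood(0.75)        # likelihood under the null hypothesis
q_hat = WILD / N                          # maximum likelihood estimator of q
l_max = binomial_likelihood(q_hat)        # maximized likelihood
g = 2 * math.log(l_max / l_null)          # likelihood ratio statistic
p_value = math.erfc(math.sqrt(g / 2))     # upper tail of chi-square with 1 df

print(f"Likelihood(q = 0.75) = {l_null:.6f}")
print(f"Max likelihood at q = {q_hat:.3f}: {l_max:.4f}")
print(f"G = {g:.2f}, P = {p_value:.4f}")
```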

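Profile Likelihood Confidence Intervals

[Figure: Likelihood plotted against μ1.]

Profile Likelihood Confidence Intervals

[Figure: Log-likelihood plotted against μ1. The maximum likelihood is at the peak, above the maximum likelihood estimator of μ1. The 95% c.i. is the range of μ1 over which the log-likelihood lies within 2 of the maximum.]

Profile Likelihood Confidence Intervals

[Figure: Log-likelihood contours (relative to the maximum likelihood) in the plane of μ1 and μ2. The MLE lies at 0; the -2 contour bounds the 95% confidence region.]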
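A sketch of a likelihood-based 95% interval, applied to the wild-type/mutant counts (80 and 10) because that model has a single parameter q, so no profiling over other parameters is needed. The cutoff used is 1.92, half of the χ²(1) 95% critical value 3.84, which is the drop of roughly 2 log-likelihood units marked on the figure. The grid scan is an assumption made for simplicity; a root finder would be more precise.

```python
import math

WILD, MUTANT = 80, 10

def log_lik(q):
    """Binomial log-likelihood, omitting the constant log(90C10), which cancels in differences."""
    return WILD * math.log(q) + MUTANT * math.log(1 - q)

q_hat = WILD / (WILD + MUTANT)
cutoff = log_lik(q_hat) - 1.92                 # 3.84 / 2 below the maximum

grid = [i / 2000 for i in range(1, 2000)]      # q = 0.0005, 0.0010, ..., 0.9995
inside = [q for q in grid if log_lik(q) >= cutoff]
lower, upper = inside[0], inside[-1]

print(f"MLE q = {q_hat:.3f}, approx. 95% c.i. = ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric about the estimate, and it excludes q = 0.75, in agreement with the likelihood ratio test on the same counts.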

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4·e                              Log(L) = -23.04
M(1): y = μ1 + μ2·√Age + μ4·e                    Log(L) = -20.34
M(2): y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e      Log(L) = -19.84

G(M(0) vs M(1)) = 2×(-20.34 - (-23.04)) = 5.40    P(χ²(1)) < 0.05
G(M(1) vs M(2)) = 2×(-19.84 - (-20.34)) = 1.00    P(χ²(1)) > 0.10
G(M(0) vs M(2)) = 2×(-19.84 - (-23.04)) = 6.40    P(χ²(2)) < 0.05

But: What is the critical p-value?
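The three comparisons can be reproduced from the log-likelihoods alone. The sketch below is standard-library Python; `chi2_sf` is a helper written for this example (not a library call) that evaluates the χ² upper tail for whole-number degrees of freedom using the standard finite-sum formulas. Parameter counts include the residual s.d. μ4.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * phi * total

log_lik = {"M(0)": -23.04, "M(1)": -20.34, "M(2)": -19.84}   # values from the slide
n_params = {"M(0)": 2, "M(1)": 3, "M(2)": 4}                 # counting mu4 as a parameter

def lrt(null, general):
    """Return (G, degrees of freedom, P) for a nested pair of models."""
    g = 2 * (log_lik[general] - log_lik[null])
    df = n_params[general] - n_params[null]
    return g, df, chi2_sf(g, df)

for null, general in [("M(0)", "M(1)"), ("M(1)", "M(2)"), ("M(0)", "M(2)")]:
    g, df, p = lrt(null, general)
    print(f"G({null} vs {general}) = {g:.2f}, df = {df}, P = {p:.3f}")
```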

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1): y = μ1 + μ2·√Age + μ4·e
M(3): y = μ1 + μ3·Sex(0,1) + μ4·e

But: Cannot compare M(1) and M(3) using likelihood-ratio tests

Model Selection Using Likelihood-Ratio Tests

• What is the critical p-value?

• Cannot compare models which are not subsets of one another using likelihood-ratio tests

So: Akaike Information Criterion (AIC)

Akaike Information Criterion (AIC)

• Kullback-Leibler Information (KLI)
  – "information lost when model M(0) is used to approximate model M(1)"
  – "distance from M(0) to M(1)"

• AIC(M) = -2×Log(Likelihood(M)) + 2×K(M)
  – K(M) is the number of estimable parameters of model M

• AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criterion (AIC)

• AIC(M) = -2×Log(Likelihood(M)) + 2×K(M)
  – K(M) is the number of estimable parameters

• In model selection, choose the model with the smallest AIC
  – least expected relative distance between M and the unknown true mechanism that generated the data

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4·e                              AIC = 50.08
M(1): y = μ1 + μ2·√Age + μ4·e                    AIC = 46.68
M(2): y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e      AIC = 47.68
M(3): y = μ1 + μ3·Sex(0,1) + μ4·e                AIC = 49.95

Model Selection Using AIC

• Differences in AIC between models: ΔAIC

• Support for less favoured model:
  – ΔAIC 0-2: Substantial
  – ΔAIC 4-7: Considerably less
  – ΔAIC >10: Essentially none

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4·e                              AIC = 50.08   Unlikely
M(1): y = μ1 + μ2·√Age + μ4·e                    AIC = 46.68   BEST
M(2): y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e      AIC = 47.68   Good
M(3): y = μ1 + μ3·Sex(0,1) + μ4·e                AIC = 49.95   Unlikely
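A short sketch of the AIC bookkeeping for the crab models. The log-likelihoods for M(0) to M(2) are those quoted in the likelihood-ratio slides, with K counting the residual s.d. as a parameter; for M(3) only the AIC is given in the lecture, so it is entered directly rather than recomputed.

```python
def aic(log_lik, k):
    """AIC = -2 * log-likelihood + 2 * number of estimated parameters."""
    return -2 * log_lik + 2 * k

# (log-likelihood, K) for the models whose log-likelihood is quoted in the lecture.
models = {"M(0)": (-23.04, 2), "M(1)": (-20.34, 3), "M(2)": (-19.84, 4)}
aics = {name: aic(ll, k) for name, (ll, k) in models.items()}
aics["M(3)"] = 49.95          # the lecture gives this AIC but not the log-likelihood

best = min(aics, key=aics.get)
delta = {name: value - aics[best] for name, value in aics.items()}

for name in sorted(aics, key=aics.get):
    print(f"{name}: AIC = {aics[name]:.2f}, delta AIC = {delta[name]:.2f}")
```

Unlike the likelihood-ratio test, this ranking places the non-nested models M(1) and M(3) on the same scale.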

Modifications to AIC

AIC for small sample sizes:

AICc = -2×(Log-Likelihood) + 2×K×n/(n-K-1)

n is sample size

AIC for overdispersed count data:

QAIC = -2×(Log-Likelihood)/c + 2×K

c is the "variance inflation factor" (c = χ²/df)

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer-Verlag.
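Both modifications are one-line functions. The sketch below applies the small-sample form to the crab models (n = 30, log-likelihoods as quoted in the likelihood-ratio slides); the value c = 1.5 in the QAIC call is a hypothetical variance inflation factor chosen only to show the call, since the crab weights are not count data.

```python
def aicc(log_lik, k, n):
    """Small-sample AIC: -2*logL + 2*K*n/(n - K - 1)."""
    return -2 * log_lik + 2 * k * n / (n - k - 1)

def qaic(log_lik, k, c):
    """AIC for overdispersed counts: -2*logL/c + 2*K, with c = chi-square/df."""
    return -2 * log_lik / c + 2 * k

n = 30
for name, ll, k in [("M(0)", -23.04, 2), ("M(1)", -20.34, 3), ("M(2)", -19.84, 4)]:
    print(f"{name}: AICc = {aicc(ll, k, n):.2f}")

# Hypothetical inflation factor, for illustration only.
print(f"QAIC with c = 1.5: {qaic(-20.34, 3, 1.5):.2f}")
```

With only 30 observations the correction penalizes the larger models more heavily than plain AIC does.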

Likelihood and Least-Squares

• If errors are normally distributed:
  – least squares and maximum-likelihood estimates of parameters are the same
  – but not σ² estimators

• Likelihood is a more powerful and theoretically-based technique
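The equivalence can be seen on a small regression of weight on √Age. In this sketch the six data points are hypothetical; the coefficients come from the usual least-squares formulas, and the normal log-likelihood is then checked to be highest at those same coefficients. The two variance estimators differ only in the divisor: n for maximum likelihood, n minus the number of regression coefficients (here 2) for the usual unbiased least-squares estimate.

```python
import math

ages = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]         # hypothetical ages
weights = [30.9, 31.2, 32.4, 32.6, 33.9, 34.1]   # hypothetical weights
x = [math.sqrt(a) for a in ages]
n = len(x)

# Least-squares estimates for y = mu1 + mu2 * sqrt(Age).
x_bar, y_bar = sum(x) / n, sum(weights) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, weights))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, weights))

sigma2_ml = ssr / n          # maximum likelihood estimator of the variance
sigma2_ls = ssr / (n - 2)    # unbiased least-squares estimator

def log_lik(mu1, mu2, sigma2):
    """Normal log-likelihood of the data for given coefficients and error variance."""
    rss = sum((yi - mu1 - mu2 * xi) ** 2 for xi, yi in zip(x, weights))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"sigma2 (ML) = {sigma2_ml:.4f}, sigma2 (LS) = {sigma2_ls:.4f}")
print(f"log L at LS fit = {log_lik(intercept, slope, sigma2_ml):.3f}")
```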

AIC and Least-Squares

• If all models assume normal errors with constant variance:

• AIC = n·Log(σ²) + 2K
  – σ² = Σei²/n (the MLE of σ²)
  – K is the total no. of estimated regression parameters, including the intercept and σ²
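The least-squares form drops a term, n·(Log(2π) + 1), that is the same for every model fitted to the same n observations, so AIC differences between models are unaffected. A small sketch with hypothetical residuals from two fits (the function names are assumptions):

```python
import math

def aic_least_squares(residuals, k):
    """AIC = n*log(sigma2) + 2K, with sigma2 = sum(e_i^2)/n (the MLE of the variance)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return n * math.log(sigma2) + 2 * k

def aic_full(residuals, k):
    """AIC from the full normal log-likelihood evaluated at the MLE of the variance."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return -2 * log_lik + 2 * k

# Hypothetical residuals from a simpler fit (K = 2) and a richer fit (K = 3).
res_simple = [1.8, -2.1, 0.9, -1.5, 2.4, -0.7, -1.2, 0.4]
res_rich = [0.6, -0.9, 0.3, -0.4, 1.1, -0.2, -0.8, 0.3]

diff_ls = aic_least_squares(res_simple, 2) - aic_least_squares(res_rich, 3)
diff_full = aic_full(res_simple, 2) - aic_full(res_rich, 3)
print(f"delta AIC (least-squares form) = {diff_ls:.3f}")
print(f"delta AIC (full likelihood)    = {diff_full:.3f}")
```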

Calculating Likelihoods

• Analytical formulae

• Compute by multiplying probabilities

• Estimate by simulation
  – number of times data are obtained in 1000 simulations, given model and parameters
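A sketch of the simulation approach for the wild-type/mutant counts, where the exact answer is available for comparison. Offspring are simulated as independent draws with P(wild type) = q, and the likelihood is estimated as the fraction of simulations that reproduce the observed 80 wild types out of 90. The function name, the fixed seed, and the choice q = 80/90 are assumptions made for the illustration.

```python
import math
import random

def simulated_likelihood(q, wild=80, mutant=10, n_sims=1000, seed=1):
    """Fraction of simulated samples with exactly `wild` wild types, given q."""
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    n = wild + mutant
    hits = 0
    for _ in range(n_sims):
        wild_count = sum(rng.random() < q for _ in range(n))
        if wild_count == wild:
            hits += 1
    return hits / n_sims

q = 80 / 90
exact = math.comb(90, 10) * q ** 80 * (1 - q) ** 10   # analytical formula
estimate = simulated_likelihood(q)

print(f"exact likelihood     = {exact:.4f}")
print(f"simulated likelihood = {estimate:.4f}")
```

With 1000 simulations the estimate is noisy, and for very small likelihoods (such as the one at q = 0.75) far more simulations would be needed before any simulated sample matched the data.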

The Method of Likelihood

• Probability of data given model

• Estimate parameters using maximum likelihood

• Estimate confidence intervals using likelihood profiles

• Compare models using:
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)

Page 2: The Method of Likelihood Hal Whitehead BIOL4062/5062.

bull What is likelihoodbull Maximum likelihoodbull Maximum likelihood estimationbull Likelihood ratio testsbull Likelihood profile confidence intervalsbull Model selection

ndash Likelihood ratio testsndash Akaike Information Criterion (AIC)

bull Likelihood and least-squaresbull Calculating likelihood

The Method of Likelihood

Observations Y = y1y2y3

eg Weights of 30 crabs of known age and sex

Model specified by μ1 μ2 μ3hellip

eg y = μ1 + μ2radicAge + μ3Sex(01) + μ4e

where e ~ N(0 1)

The LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3 )

LikelihoodThe LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3 )

The LIKELIHOOD that Z became a criminal

Probability Z became a criminal given what we what we know of Zrsquos characteristics and how those characteristics translate into the probability of being a criminal

The LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3hellip)

We can work this out if we know μ1 μ2 μ3hellip

Weights of 30 crabs of known age and sex

y = μ1 + μ2radicAge + μ3Sex(01) + μ4e

eg Prob of these 30 weights is 004 iffemale wt at age 0 μ1 = 300

growth parameter μ2 = 07

excess male weight μ3 = 50

residual sd μ4 = 63

L(μ1=30μ2=07μ3=50 μ4=63)=004

If we do not know μ1 μ2 μ3

MAXIMUM LIKELIHOOD of Y is

L(μ1μ2μ3) = MaxProb( Y | μ1 μ2 μ3 )

μ1μ2hellip

eg Max prob of 30 weights is 012 whenfemale wt at age 0 μ1 = 284

growth parameter μ2 = 031

excess male weight μ3 = 17

residual sd μ4 = 39

MaximumLikelihoodEstimators

Maximum Likelihood

μ1

Likelihood

Maximumlikelihood

Maximumlikelihood

estimator of μ1

Maximum Likelihood

μ1

Likelihood

Precise estimate

Imprecise estimate

Likelihood Ratio TestsIf μ1μ2μ3hellipμt is true model

μ1μ2μ3hellipμtμg is more general model

then

G = 2∙Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

(twice the log of the ratios of the maximum likelihoods)

is distributed as χsup2 with g-t degrees of freedomfor large sample sizes (asymptotically)

If G is unexpectedly large then data are unlikely to be from model μ1μ2μ3hellipμt

Likelihood Ratio Tests

G = 2Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

This is the G-test for goodness-of-fit

null hypothesis μ1μ2μ3hellipμt

alternative hypothesis μ1μ2μ3hellipμtμg

Likelihood an example

Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Null hypothesis Binomial Distribution with q =

075Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Likelihood(q=075) = 90C10 07580 02510

= 000551

Alternative hypothesis Binomial Distribution with q =

Expect FindWild Type 75 80Mutants 25 10Total 100 90

Likelihood(q) = 90C10 q80 (1-q)10 This has a maximum value when q = 8090 = 089Max Likelihood(q) = 90C10 (089)80 (1-089)10 = 01236

MaximumLikelihoodEstimator

Likelihood Ratio TestExpect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

G = 2 Log Max Likelihood (q)

Likelihood (q = 075) = 2 Log(01236 0000551) = 1096

is distributed as χsup2 with 1 df if q=075significantly large (Plt001) in χsup2(1)

so reject null hypothesis

Profile LikelihoodConfidence Intervals

μ1

Likelihood

Profile LikelihoodConfidence Intervals

μ1

Log-Likelihood

2

Maximumlikelihood

Maximumlikelihood

estimator of μ1

95 ci

Profile LikelihoodConfidence Intervals

Log-LikelihoodContours (relative to maximum

likelihood)

μ1

μ2

MLE(0)

-2 95 Confidence region

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radic Age + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005 G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010 G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

But What is critical p-value

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 radicAge + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

But Cannot compare M(1) and M(3)

using likelihood-ratio tests

Model SelectionUsing Likelihood-Ratio Tests

bull What is critical p-value

bull Cannot compare models which are not subsets of one another using likelihood-ratio tests

So Akaike Information Criteria (AIC)

Akaike Information Criteria (AIC)bull Kullback-Leibler Information (KLI)

ndash ldquoinformation lost when model M(0) is used to approximate model M(1)rdquo

ndash ldquodistance from M(0) to M(1)rdquo

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters of model M

bull AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

bull Differences in AIC between models ΔAIC

bull Support for less favoured modelndash ΔAIC 0-2 Substantialndash ΔAIC 4-7 Considerably lessndash ΔAIC gt10 Essentially none

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008 Unlikely

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668 BEST

M(2) y = μ1 + μ2radicAge + μ3Sex(01) + μ4e AIC=4768 Good

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995 Unlikely

Modifications to AIC

AIC for small sample sizes

AICC = - 2x(Log-Likelihood) + 2xKxn(n-K-1)

n is sample size

AIC for overdispersed count data

QAIC = - 2xLog-Likelihoodc + 2xK

c is ldquovariance inflation factorrdquo (c=χsup2df)

Burnham K P and D R Anderson2002

Model selection and multimodel inference

a practical information-theoretic approach 2nd ed

New York Springer-Verlag

Likelihood and Least-Squares

bull If errors are normally distributedndash least squares and maximum-likelihood

estimates of parameters are the samendash but not σ2 estimators

bull Likelihood is a more powerful and theoretically-based technique

AIC and Least-Squares

bull If all models assume normal errors with constant variance

bull AIC = nLog(σ2) + 2Kndash σ2 = Σei

2n (the MLE of σ2)

ndash K is total no of estimated regression parameters including the intercept and σ2

Calculating Likelihoods

bull Analytical formulae

bull Compute by multiplying probabilities

bull Estimate by simulationndash number of times data are obtained in 1000

simulations given model and parameters

The Method of Likelihood

bull Probability of data given model

bull Estimate parameters using maximum likelihood

bull Estimate confidence intervals using likelihood profiles

bull Compare models usingndash likelihood ratio testsndash Akaike Information Criterion (AIC)

  • The Method of Likelihood
  • Slide 2
  • Slide 3
  • Likelihood
  • Slide 5
  • Slide 6
  • Maximum Likelihood
  • Slide 8
  • Likelihood Ratio Tests
  • Slide 10
  • Likelihood an example
  • Null hypothesis Binomial Distribution with q = 075
  • Alternative hypothesis Binomial Distribution with q =
  • Likelihood Ratio Test
  • Profile Likelihood Confidence Intervals
  • Slide 16
  • Slide 17
  • Model Selection Using Likelihood-Ratio Tests
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Akaike Information Criteria (AIC)
  • Slide 27
  • Model Selection Using AIC
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Modifications to AIC
  • Burnham K P and D R Anderson 2002 Model selection and multimodel inference a practical information-theoretic approach 2nd ed New York Springer-Verlag
  • Likelihood and Least-Squares
  • AIC and Least-Squares
  • Calculating Likelihoods
  • Slide 38
Page 3: The Method of Likelihood Hal Whitehead BIOL4062/5062.

The Method of Likelihood

Observations Y = y1y2y3

eg Weights of 30 crabs of known age and sex

Model specified by μ1 μ2 μ3hellip

eg y = μ1 + μ2radicAge + μ3Sex(01) + μ4e

where e ~ N(0 1)

The LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3 )

LikelihoodThe LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3 )

The LIKELIHOOD that Z became a criminal

Probability Z became a criminal given what we what we know of Zrsquos characteristics and how those characteristics translate into the probability of being a criminal

The LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3hellip)

We can work this out if we know μ1 μ2 μ3hellip

Weights of 30 crabs of known age and sex

y = μ1 + μ2radicAge + μ3Sex(01) + μ4e

eg Prob of these 30 weights is 004 iffemale wt at age 0 μ1 = 300

growth parameter μ2 = 07

excess male weight μ3 = 50

residual sd μ4 = 63

L(μ1=30μ2=07μ3=50 μ4=63)=004

If we do not know μ1 μ2 μ3

MAXIMUM LIKELIHOOD of Y is

L(μ1μ2μ3) = MaxProb( Y | μ1 μ2 μ3 )

μ1μ2hellip

eg Max prob of 30 weights is 012 whenfemale wt at age 0 μ1 = 284

growth parameter μ2 = 031

excess male weight μ3 = 17

residual sd μ4 = 39

MaximumLikelihoodEstimators

Maximum Likelihood

μ1

Likelihood

Maximumlikelihood

Maximumlikelihood

estimator of μ1

Maximum Likelihood

μ1

Likelihood

Precise estimate

Imprecise estimate

Likelihood Ratio TestsIf μ1μ2μ3hellipμt is true model

μ1μ2μ3hellipμtμg is more general model

then

G = 2∙Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

(twice the log of the ratios of the maximum likelihoods)

is distributed as χsup2 with g-t degrees of freedomfor large sample sizes (asymptotically)

If G is unexpectedly large then data are unlikely to be from model μ1μ2μ3hellipμt

Likelihood Ratio Tests

G = 2Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

This is the G-test for goodness-of-fit

null hypothesis μ1μ2μ3hellipμt

alternative hypothesis μ1μ2μ3hellipμtμg

Likelihood an example

Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Null hypothesis Binomial Distribution with q =

075Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Likelihood(q=075) = 90C10 07580 02510

= 000551

Alternative hypothesis Binomial Distribution with q =

Expect FindWild Type 75 80Mutants 25 10Total 100 90

Likelihood(q) = 90C10 q80 (1-q)10 This has a maximum value when q = 8090 = 089Max Likelihood(q) = 90C10 (089)80 (1-089)10 = 01236

MaximumLikelihoodEstimator

Likelihood Ratio TestExpect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

G = 2 Log Max Likelihood (q)

Likelihood (q = 075) = 2 Log(01236 0000551) = 1096

is distributed as χsup2 with 1 df if q=075significantly large (Plt001) in χsup2(1)

so reject null hypothesis

Profile LikelihoodConfidence Intervals

μ1

Likelihood

Profile LikelihoodConfidence Intervals

μ1

Log-Likelihood

2

Maximumlikelihood

Maximumlikelihood

estimator of μ1

95 ci

Profile LikelihoodConfidence Intervals

Log-LikelihoodContours (relative to maximum

likelihood)

μ1

μ2

MLE(0)

-2 95 Confidence region

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radic Age + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005 G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010 G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

But What is critical p-value

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 radicAge + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

But Cannot compare M(1) and M(3)

using likelihood-ratio tests

Model SelectionUsing Likelihood-Ratio Tests

bull What is critical p-value

bull Cannot compare models which are not subsets of one another using likelihood-ratio tests

So Akaike Information Criteria (AIC)

Akaike Information Criteria (AIC)bull Kullback-Leibler Information (KLI)

ndash ldquoinformation lost when model M(0) is used to approximate model M(1)rdquo

ndash ldquodistance from M(0) to M(1)rdquo

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters of model M

bull AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

bull Differences in AIC between models ΔAIC

bull Support for less favoured modelndash ΔAIC 0-2 Substantialndash ΔAIC 4-7 Considerably lessndash ΔAIC gt10 Essentially none

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008 Unlikely

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668 BEST

M(2) y = μ1 + μ2radicAge + μ3Sex(01) + μ4e AIC=4768 Good

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995 Unlikely

Modifications to AIC

AIC for small sample sizes

AICC = - 2x(Log-Likelihood) + 2xKxn(n-K-1)

n is sample size

AIC for overdispersed count data

QAIC = - 2xLog-Likelihoodc + 2xK

c is ldquovariance inflation factorrdquo (c=χsup2df)

Burnham K P and D R Anderson2002

Model selection and multimodel inference

a practical information-theoretic approach 2nd ed

New York Springer-Verlag

Likelihood and Least-Squares

bull If errors are normally distributedndash least squares and maximum-likelihood

estimates of parameters are the samendash but not σ2 estimators

bull Likelihood is a more powerful and theoretically-based technique

AIC and Least-Squares

bull If all models assume normal errors with constant variance

bull AIC = nLog(σ2) + 2Kndash σ2 = Σei

2n (the MLE of σ2)

ndash K is total no of estimated regression parameters including the intercept and σ2

Calculating Likelihoods

bull Analytical formulae

bull Compute by multiplying probabilities

bull Estimate by simulationndash number of times data are obtained in 1000

simulations given model and parameters

The Method of Likelihood

bull Probability of data given model

bull Estimate parameters using maximum likelihood

bull Estimate confidence intervals using likelihood profiles

bull Compare models usingndash likelihood ratio testsndash Akaike Information Criterion (AIC)

  • The Method of Likelihood
  • Slide 2
  • Slide 3
  • Likelihood
  • Slide 5
  • Slide 6
  • Maximum Likelihood
  • Slide 8
  • Likelihood Ratio Tests
  • Slide 10
  • Likelihood an example
  • Null hypothesis Binomial Distribution with q = 075
  • Alternative hypothesis Binomial Distribution with q =
  • Likelihood Ratio Test
  • Profile Likelihood Confidence Intervals
  • Slide 16
  • Slide 17
  • Model Selection Using Likelihood-Ratio Tests
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Akaike Information Criteria (AIC)
  • Slide 27
  • Model Selection Using AIC
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Modifications to AIC
  • Burnham K P and D R Anderson 2002 Model selection and multimodel inference a practical information-theoretic approach 2nd ed New York Springer-Verlag
  • Likelihood and Least-Squares
  • AIC and Least-Squares
  • Calculating Likelihoods
  • Slide 38
Page 4: The Method of Likelihood Hal Whitehead BIOL4062/5062.

LikelihoodThe LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3 )

The LIKELIHOOD that Z became a criminal

Probability Z became a criminal given what we what we know of Zrsquos characteristics and how those characteristics translate into the probability of being a criminal

The LIKELIHOOD of Y is

L = Probability ( Y | Model amp μ1 μ2 μ3hellip)

We can work this out if we know μ1 μ2 μ3hellip

Weights of 30 crabs of known age and sex

y = μ1 + μ2radicAge + μ3Sex(01) + μ4e

eg Prob of these 30 weights is 004 iffemale wt at age 0 μ1 = 300

growth parameter μ2 = 07

excess male weight μ3 = 50

residual sd μ4 = 63

L(μ1=30μ2=07μ3=50 μ4=63)=004

If we do not know μ1 μ2 μ3

MAXIMUM LIKELIHOOD of Y is

L(μ1μ2μ3) = MaxProb( Y | μ1 μ2 μ3 )

μ1μ2hellip

eg Max prob of 30 weights is 012 whenfemale wt at age 0 μ1 = 284

growth parameter μ2 = 031

excess male weight μ3 = 17

residual sd μ4 = 39

MaximumLikelihoodEstimators

Maximum Likelihood

μ1

Likelihood

Maximumlikelihood

Maximumlikelihood

estimator of μ1

Maximum Likelihood

μ1

Likelihood

Precise estimate

Imprecise estimate

Likelihood Ratio TestsIf μ1μ2μ3hellipμt is true model

μ1μ2μ3hellipμtμg is more general model

then

G = 2∙Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

(twice the log of the ratios of the maximum likelihoods)

is distributed as χsup2 with g-t degrees of freedomfor large sample sizes (asymptotically)

If G is unexpectedly large then data are unlikely to be from model μ1μ2μ3hellipμt

Likelihood Ratio Tests

G = 2Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

This is the G-test for goodness-of-fit

null hypothesis μ1μ2μ3hellipμt

alternative hypothesis μ1μ2μ3hellipμtμg

Likelihood: an example

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

Null hypothesis: Binomial Distribution with q = 0.75

Likelihood(q = 0.75) = 90C10 × 0.75^80 × 0.25^10 = 0.000551

Alternative hypothesis: Binomial Distribution with unknown q

Likelihood(q) = 90C10 × q^80 × (1 - q)^10
This has a maximum value when q = 80/90 = 0.89 (the Maximum Likelihood Estimator)
Max Likelihood(q) = 90C10 × (0.89)^80 × (1 - 0.89)^10 = 0.1326

Likelihood Ratio Test

G = 2 Log[Max Likelihood(q) / Likelihood(q = 0.75)] = 2 Log(0.1326 / 0.000551) = 10.96

G is distributed as χ² with 1 df if q = 0.75
G is significantly large (P < 0.01) in χ²(1)
so reject the null hypothesis
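The arithmetic of this example can be checked with a short script. This is only a sketch: the variable names are illustrative, `math.comb` needs Python 3.8 or later, and the P-value uses the fact that the upper tail of χ² with 1 df equals erfc(√(G/2)).

```python
import math

n_total, n_wild = 90, 80           # found: 80 wild type, 10 mutants
n_mutant = n_total - n_wild

def binomial_likelihood(q):
    """Probability of 80 wild type and 10 mutants if P(wild type) = q."""
    return math.comb(n_total, n_mutant) * q ** n_wild * (1 - q) ** n_mutant

like_null = binomial_likelihood(0.75)   # null hypothesis, q = 0.75
q_hat = n_wild / n_total                # maximum likelihood estimator of q
like_max = binomial_likelihood(q_hat)   # maximum likelihood

g = 2 * math.log(like_max / like_null)  # likelihood ratio statistic
p_value = math.erfc(math.sqrt(g / 2))   # upper tail of chi-square with 1 df

print(f"L(q=0.75) = {like_null:.6f}, max L = {like_max:.4f}")
print(f"G = {g:.2f}, P = {p_value:.5f}")
```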

Profile Likelihood Confidence Intervals

[Figure: Likelihood plotted against μ1.]

[Figure: Log-Likelihood plotted against μ1. Labels: "Maximum likelihood", "Maximum likelihood estimator of μ1", "2" and "95% c.i.".]

[Figure: Log-Likelihood contours (relative to maximum likelihood) plotted against μ1 and μ2. Labels: "MLE" (contour 0) and "95% confidence region" (contour -2).]
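One way to turn a log-likelihood curve into an interval is to find where it falls a fixed amount below its maximum. The sketch below does this for the wild type and mutant counts used earlier. It assumes a cutoff of 3.84 / 2 = 1.92 log-likelihood units (half the 95% point of χ² with 1 df), which is close to the drop of about 2 marked on the figure. With a single parameter there are no other parameters to maximise over, so the profile is simply the likelihood curve itself. The names `log_likelihood` and `crossing` are hypothetical.

```python
import math

n_wild, n_mutant = 80, 10  # counts from the wild type / mutant example

def log_likelihood(q):
    """Binomial log-likelihood, dropping the constant 90C10 term."""
    return n_wild * math.log(q) + n_mutant * math.log(1 - q)

q_hat = n_wild / (n_wild + n_mutant)
# Cutoff: maximum log-likelihood minus half the 95% point of chi-square(1).
cutoff = log_likelihood(q_hat) - 3.84 / 2

def crossing(inside, outside, tol=1e-8):
    """Bisect between a point above the cutoff and a point below it."""
    while abs(inside - outside) > tol:
        mid = (inside + outside) / 2
        if log_likelihood(mid) >= cutoff:
            inside = mid
        else:
            outside = mid
    return inside

lower = crossing(q_hat, 1e-9)       # search to the left of the MLE
upper = crossing(q_hat, 1 - 1e-9)   # search to the right of the MLE
```

Bisection works here because the binomial log-likelihood has a single peak, so it crosses the cutoff exactly once on each side of the estimate.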

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4·e                            Log(L) = -23.04
M(1): y = μ1 + μ2·√Age + μ4·e                  Log(L) = -20.34
M(2): y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e    Log(L) = -19.84

G(M(0) vs M(1)) = 2 × (-20.34 - (-23.04)) = 5.40    P(χ²(1)) < 0.05
G(M(1) vs M(2)) = 2 × (-19.84 - (-20.34)) = 1.00    P(χ²(1)) > 0.10
G(M(0) vs M(2)) = 2 × (-19.84 - (-23.04)) = 6.40    P(χ²(2)) < 0.05
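The three comparisons can be reproduced from the log-likelihoods alone. In this sketch the parameter counts (2, 3 and 4, counting the residual s.d. μ4) are read off the model equations, and the helper `chi2_upper_tail` is a hypothetical function that only covers the 1 and 2 df cases needed here.

```python
import math

log_l = {"M(0)": -23.04, "M(1)": -20.34, "M(2)": -19.84}  # from the lecture
n_params = {"M(0)": 2, "M(1)": 3, "M(2)": 4}               # counting mu4

def chi2_upper_tail(g, df):
    """Upper-tail probability of chi-square; only df = 1 or 2 handled here."""
    if df == 1:
        return math.erfc(math.sqrt(g / 2))
    if df == 2:
        return math.exp(-g / 2)
    raise ValueError("this sketch only covers df = 1 or 2")

results = {}
for null, general in [("M(0)", "M(1)"), ("M(1)", "M(2)"), ("M(0)", "M(2)")]:
    g = 2 * (log_l[general] - log_l[null])
    df = n_params[general] - n_params[null]
    results[(null, general)] = (g, df, chi2_upper_tail(g, df))

for (null, general), (g, df, p) in results.items():
    print(f"{null} vs {general}: G = {g:.2f}, df = {df}, P = {p:.3f}")
```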

But: What is the critical p-value?

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1): y = μ1 + μ2·√Age + μ4·e
M(3): y = μ1 + μ3·Sex(0,1) + μ4·e

But: Cannot compare M(1) and M(3) using likelihood-ratio tests

• What is the critical p-value?
• Cannot compare models which are not subsets of one another using likelihood-ratio tests

So: Akaike Information Criterion (AIC)

Akaike Information Criterion (AIC)

• Kullback-Leibler Information (KLI)
  – "information lost when model M(0) is used to approximate model M(1)"
  – "distance from M(0) to M(1)"
• AIC(M) = -2 × Log(Likelihood(M)) + 2 × K(M)
  – K(M) is the number of estimable parameters of model M
• AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data
• In model selection, choose the model with the smallest AIC
  – least expected relative distance between M and the unknown true mechanism that generated the data
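The AIC formula is simple enough to apply directly to the crab models' log-likelihoods. In this sketch the function name `aic` is hypothetical, and K is assumed to count every estimated parameter including the residual s.d. μ4, which reproduces the AIC values reported in the lecture.

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: -2*Log(Likelihood) + 2*K."""
    return -2.0 * log_likelihood + 2.0 * k

# Crab models: name -> (maximised log-likelihood, number of parameters K).
fits = {
    "M(0)": (-23.04, 2),  # mu1, mu4
    "M(1)": (-20.34, 3),  # mu1, mu2, mu4
    "M(2)": (-19.84, 4),  # mu1, mu2, mu3, mu4
}

scores = {name: aic(log_l, k) for name, (log_l, k) in fits.items()}
best = min(scores, key=scores.get)  # model with the smallest AIC
```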

Model Selection Using AIC

• Differences in AIC between models: ΔAIC
• Support for less favoured model:
  – ΔAIC 0-2: Substantial
  – ΔAIC 4-7: Considerably less
  – ΔAIC > 10: Essentially none

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4·e                            AIC = 50.08   Unlikely
M(1): y = μ1 + μ2·√Age + μ4·e                  AIC = 46.68   BEST
M(2): y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e    AIC = 47.68   Good
M(3): y = μ1 + μ3·Sex(0,1) + μ4·e              AIC = 49.95   Unlikely
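The ΔAIC values behind these verdicts follow directly from the four AIC values. A minimal sketch, with illustrative variable names:

```python
aic_values = {"M(0)": 50.08, "M(1)": 46.68, "M(2)": 47.68, "M(3)": 49.95}

best_aic = min(aic_values.values())
# Delta AIC: how far each model is from the best (smallest) AIC.
delta_aic = {model: value - best_aic for model, value in aic_values.items()}

# Models ordered from most to least supported.
ranking = sorted(delta_aic, key=delta_aic.get)
for model in ranking:
    print(f"{model}: delta AIC = {delta_aic[model]:.2f}")
```

Only the differences matter here; the absolute size of an AIC value has no interpretation on its own.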

Modifications to AIC

AIC for small sample sizes:
AICc = -2 × (Log-Likelihood) + 2 × K × n / (n - K - 1)
n is sample size

AIC for overdispersed count data:
QAIC = -2 × (Log-Likelihood) / c + 2 × K
c is the "variance inflation factor" (c = χ² / df)
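Both modifications are one-line formulas, sketched below. The function names `aicc` and `qaic` are hypothetical. The AICc example uses crab model M(1) with n = 30, while the QAIC inputs are invented because the lecture gives no overdispersed example.

```python
def aicc(log_likelihood, k, n):
    """Small-sample AIC: -2*LogL + 2*K*n / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > K + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)

def qaic(log_likelihood, k, c_hat):
    """Quasi-likelihood AIC: -2*LogL / c + 2*K, c = variance inflation factor."""
    return -2.0 * log_likelihood / c_hat + 2.0 * k

# Crab model M(1) from the lecture: Log(L) = -20.34, K = 3, n = 30 crabs.
aicc_m1 = aicc(-20.34, k=3, n=30)

# Hypothetical overdispersed count model (values invented for illustration).
qaic_example = qaic(-150.0, k=4, c_hat=1.8)
```

As n grows, n / (n - K - 1) approaches 1, so AICc converges to the ordinary AIC.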

Burnham, K. P. and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer-Verlag.

Likelihood and Least-Squares

• If errors are normally distributed:
  – least squares and maximum-likelihood estimates of parameters are the same
  – but not σ² estimators
• Likelihood is a more powerful and theoretically-based technique
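The difference between the two σ² estimators can be seen in a small straight-line regression. The data below are invented, and the sketch assumes the usual convention that the least-squares estimator divides the residual sum of squares by n minus the number of regression parameters, while the maximum likelihood estimator divides by n.

```python
# Invented data for illustration: predictor x and response y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(xs)

# Least-squares slope and intercept; under normal errors these are also the MLEs.
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
sigma2_mle = rss / n                  # maximum likelihood estimator of sigma^2
sigma2_least_squares = rss / (n - 2)  # usual estimator: 2 regression parameters
```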

AIC and Least-Squares

• If all models assume normal errors with constant variance:
• AIC = n × Log(σ²) + 2 × K
  – σ² = Σ(ei²) / n (the MLE of σ²)
  – K is the total number of estimated regression parameters, including the intercept and σ²
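The shortcut formula drops a constant, n × (Log(2π) + 1), from the full normal log-likelihood. The sketch below, using invented residuals and hypothetical function names, checks that this constant cancels when two models fitted to the same data are compared.

```python
import math

def aic_least_squares(residuals, k):
    """AIC = n*Log(sigma^2) + 2*K, with sigma^2 = sum(e_i^2) / n."""
    n = len(residuals)
    sigma2 = sum(e ** 2 for e in residuals) / n
    return n * math.log(sigma2) + 2 * k

def aic_full_normal(residuals, k):
    """AIC from the complete normal log-likelihood at the MLE of sigma^2."""
    n = len(residuals)
    sigma2 = sum(e ** 2 for e in residuals) / n
    log_l = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return -2 * log_l + 2 * k

# Invented residuals from two hypothetical fits to the same 6 observations.
res_a = [0.5, -1.2, 0.8, -0.3, 1.1, -0.9]  # model with K = 2
res_b = [0.4, -1.0, 0.7, -0.2, 0.9, -0.8]  # model with K = 3

diff_short = aic_least_squares(res_a, 2) - aic_least_squares(res_b, 3)
diff_full = aic_full_normal(res_a, 2) - aic_full_normal(res_b, 3)
```

Because only AIC differences are used in model selection, the two versions lead to the same ranking as long as every model is fitted to the same n observations.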

Calculating Likelihoods

• Analytical formulae
• Compute by multiplying probabilities
• Estimate by simulation
  – number of times data are obtained in 1000 simulations given model and parameters
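The simulation approach can be sketched with the wild type and mutant example, where an analytical formula is available for comparison. The function name `simulated_likelihood`, the fixed seed and the default of 1000 simulations are choices made for this illustration. With so few simulations the estimate is noisy, and it is only practical when the observed data are not too improbable.

```python
import math
import random

def simulated_likelihood(q, n_total=90, n_wild=80, n_sims=1000, seed=1):
    """Fraction of simulated data sets that exactly reproduce the observed counts."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Simulate n_total offspring, each wild type with probability q.
        wild = sum(1 for _ in range(n_total) if rng.random() < q)
        if wild == n_wild:
            hits += 1
    return hits / n_sims

q = 80 / 90
estimate = simulated_likelihood(q)                    # estimate by simulation
exact = math.comb(90, 10) * q ** 80 * (1 - q) ** 10   # analytical formula
```

Increasing `n_sims` narrows the gap between the simulated estimate and the analytical value, at the cost of more computing time.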

The Method of Likelihood

• Probability of data given model
• Estimate parameters using maximum likelihood
• Estimate confidence intervals using likelihood profiles
• Compare models using:
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)

Page 7: The Method of Likelihood Hal Whitehead BIOL4062/5062.

Maximum Likelihood

μ1

Likelihood

Maximumlikelihood

Maximumlikelihood

estimator of μ1

Maximum Likelihood

μ1

Likelihood

Precise estimate

Imprecise estimate

Likelihood Ratio TestsIf μ1μ2μ3hellipμt is true model

μ1μ2μ3hellipμtμg is more general model

then

G = 2∙Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

(twice the log of the ratios of the maximum likelihoods)

is distributed as χsup2 with g-t degrees of freedomfor large sample sizes (asymptotically)

If G is unexpectedly large then data are unlikely to be from model μ1μ2μ3hellipμt

Likelihood Ratio Tests

G = 2Log[L(μ1μ2μ3hellipμg)L(μ1μ2μ3hellipμt)]

This is the G-test for goodness-of-fit

null hypothesis μ1μ2μ3hellipμt

alternative hypothesis μ1μ2μ3hellipμtμg

Likelihood an example

Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Null hypothesis Binomial Distribution with q =

075Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Likelihood(q=075) = 90C10 07580 02510

= 000551

Alternative hypothesis Binomial Distribution with q =

Expect FindWild Type 75 80Mutants 25 10Total 100 90

Likelihood(q) = 90C10 q80 (1-q)10 This has a maximum value when q = 8090 = 089Max Likelihood(q) = 90C10 (089)80 (1-089)10 = 01236

MaximumLikelihoodEstimator

Likelihood Ratio TestExpect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

G = 2 Log Max Likelihood (q)

Likelihood (q = 075) = 2 Log(01236 0000551) = 1096

is distributed as χsup2 with 1 df if q=075significantly large (Plt001) in χsup2(1)

so reject null hypothesis

Profile LikelihoodConfidence Intervals

μ1

Likelihood

Profile LikelihoodConfidence Intervals

μ1

Log-Likelihood

2

Maximumlikelihood

Maximumlikelihood

estimator of μ1

95 ci

Profile LikelihoodConfidence Intervals

Log-LikelihoodContours (relative to maximum

likelihood)

μ1

μ2

MLE(0)

-2 95 Confidence region

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radic Age + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005 G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010 G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

But What is critical p-value

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 radicAge + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

But Cannot compare M(1) and M(3)

using likelihood-ratio tests

Model SelectionUsing Likelihood-Ratio Tests

bull What is critical p-value

bull Cannot compare models which are not subsets of one another using likelihood-ratio tests

So Akaike Information Criteria (AIC)

Akaike Information Criteria (AIC)bull Kullback-Leibler Information (KLI)

ndash ldquoinformation lost when model M(0) is used to approximate model M(1)rdquo

ndash ldquodistance from M(0) to M(1)rdquo

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters of model M

bull AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

bull Differences in AIC between models ΔAIC

bull Support for less favoured modelndash ΔAIC 0-2 Substantialndash ΔAIC 4-7 Considerably lessndash ΔAIC gt10 Essentially none

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008 Unlikely

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668 BEST

M(2) y = μ1 + μ2radicAge + μ3Sex(01) + μ4e AIC=4768 Good

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995 Unlikely

Modifications to AIC

AIC for small sample sizes

AICC = - 2x(Log-Likelihood) + 2xKxn(n-K-1)

n is sample size

AIC for overdispersed count data

QAIC = - 2xLog-Likelihoodc + 2xK

c is ldquovariance inflation factorrdquo (c=χsup2df)

Burnham K P and D R Anderson2002

Model selection and multimodel inference

a practical information-theoretic approach 2nd ed

New York Springer-Verlag

Likelihood and Least-Squares

• If errors are normally distributed
  – least-squares and maximum-likelihood estimates of parameters are the same
  – but not the σ² estimators

• Likelihood is a more powerful and theoretically-based technique

AIC and Least-Squares

• If all models assume normal errors with constant variance

• AIC = n × Log(σ²) + 2 × K
  – σ² = Σ ei² / n (the MLE of σ²)
  – K is the total number of estimated regression parameters, including the intercept and σ²
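A short Python sketch of the least-squares form follows. The residuals are made-up numbers for illustration only. The second function computes the full normal-likelihood AIC, to show that the two versions differ only by the constant n × (Log(2π) + 1), which cancels when models fitted to the same data are compared.

```python
import math


def aic_least_squares(residuals, k):
    """AIC = n * Log(sigma2) + 2 * K, with sigma2 the MLE (sum of squares / n)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return n * math.log(sigma2) + 2 * k


def aic_full_normal(residuals, k):
    """AIC from the maximised normal log-likelihood, constants included."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return -2 * log_lik + 2 * k


# Hypothetical residuals from some fitted regression (not from the crab data).
residuals = [0.8, -1.1, 0.3, 1.9, -0.6, -1.4, 0.2, 0.7, -0.9, 0.1]
k = 3  # e.g. intercept, one slope, and sigma2

short_form = aic_least_squares(residuals, k)
full_form = aic_full_normal(residuals, k)
constant = len(residuals) * (math.log(2 * math.pi) + 1)

print(f"least-squares AIC = {short_form:.3f}, full AIC = {full_form:.3f}")
```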

Calculating Likelihoods

• Analytical formulae

• Compute by multiplying probabilities

• Estimate by simulation
  – number of times the data are obtained in 1000 simulations, given the model and parameters
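The simulation route can be sketched in Python using the wild-type/mutant counts from the binomial example (80 wild type out of 90). The function name, the seed, and the choice of 20000 simulations (rather than the 1000 on the slide, to reduce noise) are assumptions made for this illustration.

```python
import math
import random


def simulate_likelihood(q, successes, n, n_sims=1000, seed=1):
    """Fraction of simulated data sets that reproduce the observed count exactly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # One simulated data set: n individuals, each wild type with probability q.
        count = sum(rng.random() < q for _ in range(n))
        if count == successes:
            hits += 1
    return hits / n_sims


q = 80 / 90
# Analytical formula for comparison: binomial probability of exactly 80 out of 90.
exact = math.comb(90, 80) * q**80 * (1 - q)**10
estimate = simulate_likelihood(q, successes=80, n=90, n_sims=20000)

print(f"analytical = {exact:.4f}, simulated = {estimate:.4f}")
```

The estimate is noisy and becomes impractical when the observed data are very improbable, which is why analytical formulae are preferred when they exist.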

The Method of Likelihood

• Probability of data given model

• Estimate parameters using maximum likelihood

• Estimate confidence intervals using likelihood profiles

• Compare models using
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)


Maximum Likelihood

[Figure: likelihood plotted against μ1, with curves labelled "Precise estimate" and "Imprecise estimate"]

Likelihood Ratio Tests

If μ1, μ2, μ3, …, μt is the true model

and μ1, μ2, μ3, …, μt, …, μg is a more general model

then

G = 2 × Log[ L(μ1, μ2, μ3, …, μg) / L(μ1, μ2, μ3, …, μt) ]

(twice the log of the ratio of the maximum likelihoods)

is distributed as χ² with g − t degrees of freedom for large sample sizes (asymptotically)

If G is unexpectedly large, then the data are unlikely to be from model μ1, μ2, μ3, …, μt

Likelihood Ratio Tests

G = 2 × Log[ L(μ1, μ2, μ3, …, μg) / L(μ1, μ2, μ3, …, μt) ]

This is the G-test for goodness-of-fit

null hypothesis: μ1, μ2, μ3, …, μt

alternative hypothesis: μ1, μ2, μ3, …, μt, …, μg

Likelihood: an example

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

Null hypothesis: Binomial Distribution with q = 0.75

Likelihood(q = 0.75) = 90C10 × 0.75^80 × 0.25^10 = 0.000551

Alternative hypothesis: Binomial Distribution with q free (estimated from the data)

Likelihood(q) = 90C10 × q^80 × (1 − q)^10

This has a maximum value when q = 80/90 = 0.89 (the Maximum Likelihood Estimator)

Max Likelihood(q) = 90C10 × (0.89)^80 × (1 − 0.89)^10 = 0.1326

Likelihood Ratio Test

G = 2 × Log[ Max Likelihood(q) / Likelihood(q = 0.75) ] = 2 × Log(0.1326 / 0.000551) = 10.96

G is distributed as χ² with 1 df if q = 0.75

10.96 is significantly large (P < 0.01) in χ²(1)

so reject the null hypothesis
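The arithmetic of this example can be checked with the Python sketch below. It uses only the standard library; the helper names are illustrative, and the 1-df chi-square tail probability is obtained from the complementary error function rather than from a statistics package.

```python
import math


def binomial_likelihood(q, successes, n):
    """Probability of exactly `successes` out of `n` when each has probability q."""
    return math.comb(n, successes) * q**successes * (1 - q) ** (n - successes)


def chi2_sf_1df(g):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(g / 2))


wild, mutants = 80, 10
n = wild + mutants

lik_null = binomial_likelihood(0.75, wild, n)   # null hypothesis: q = 0.75
q_hat = wild / n                                # maximum likelihood estimator of q
lik_max = binomial_likelihood(q_hat, wild, n)   # likelihood at the MLE

G = 2 * math.log(lik_max / lik_null)
p_value = chi2_sf_1df(G)

print(f"L(q=0.75) = {lik_null:.6f}")
print(f"q_hat = {q_hat:.3f}, max L = {lik_max:.4f}")
print(f"G = {G:.2f}, P = {p_value:.4f}")
```

Because the binomial coefficient 90C10 appears in both likelihoods, it cancels in the ratio, so G depends only on the counts and the two values of q.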

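Profile Likelihood Confidence Intervals

[Figure: likelihood plotted against μ1]

Profile Likelihood Confidence Intervals

[Figure: log-likelihood plotted against μ1, marking the maximum likelihood and the maximum likelihood estimator of μ1; the 95% c.i. is the range of μ1 over which the log-likelihood lies within 2 of its maximum]

Profile Likelihood Confidence Intervals

[Figure: log-likelihood contours (relative to the maximum likelihood) over μ1 and μ2; the MLE lies at 0 and the −2 contour bounds the 95% confidence region]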
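A minimal Python sketch of the idea, applied to the binomial example (80 wild type out of 90) rather than to the crab model: scan a grid of q values and keep those whose log-likelihood lies within 2 of the maximum. With a single parameter the profile is just the log-likelihood itself. The grid search, the function names, and the default drop of 2 (taken from the slide; 1.92, half the 95% point of χ² with 1 df, is the value often quoted) are choices made for this illustration.

```python
import math


def log_lik(q, successes, n):
    """Binomial log-likelihood of `successes` out of `n` at probability q."""
    return (math.log(math.comb(n, successes))
            + successes * math.log(q)
            + (n - successes) * math.log(1 - q))


def likelihood_interval(successes, n, drop=2.0, steps=10000):
    """Range of q whose log-likelihood is within `drop` of the maximum.

    Assumes 0 < successes < n so that the MLE is strictly inside (0, 1).
    """
    q_hat = successes / n
    max_ll = log_lik(q_hat, successes, n)
    grid = (i / steps for i in range(1, steps))          # open interval (0, 1)
    inside = [q for q in grid if log_lik(q, successes, n) >= max_ll - drop]
    return q_hat, min(inside), max(inside)


q_hat, lower, upper = likelihood_interval(80, 90)
print(f"MLE = {q_hat:.3f}, approx. 95% interval = ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric about the estimate, which is one practical difference from intervals built as estimate ± 2 standard errors.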

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0)  y = μ1 + μ4·e                              Log(L) = −23.04

M(1)  y = μ1 + μ2·√Age + μ4·e                    Log(L) = −20.34

M(2)  y = μ1 + μ2·√Age + μ3·Sex(0,1) + μ4·e      Log(L) = −19.84

G(M(0) vs M(1)) = 2 × (−20.34 − (−23.04)) = 5.40    P(χ²(1)) < 0.05

G(M(1) vs M(2)) = 2 × (−19.84 − (−20.34)) = 1.00    P(χ²(1)) > 0.10

G(M(0) vs M(2)) = 2 × (−19.84 − (−23.04)) = 6.40    P(χ²(2)) < 0.05

But: what is the critical p-value?
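The three comparisons can be reproduced with the Python sketch below. The log-likelihoods are those on the slide; the parameter counts are an assumption (the number of μ terms in each model), used only to obtain the degrees of freedom. The tail-probability helper covers just the 1-df and 2-df cases needed here, using closed forms from the standard library.

```python
import math


def chi2_sf(g, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(g / 2))
    if df == 2:
        return math.exp(-g / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


# (log-likelihood from the slide, assumed number of estimated parameters)
models = {
    "M(0)": (-23.04, 2),
    "M(1)": (-20.34, 3),
    "M(2)": (-19.84, 4),
}


def lr_test(simple, general):
    """G statistic, degrees of freedom, and P for a nested pair of models."""
    logl_s, k_s = models[simple]
    logl_g, k_g = models[general]
    g = 2 * (logl_g - logl_s)
    df = k_g - k_s
    return g, df, chi2_sf(g, df)


pairs = [("M(0)", "M(1)"), ("M(1)", "M(2)"), ("M(0)", "M(2)")]
results = {pair: lr_test(*pair) for pair in pairs}

for (simple, general), (g, df, p) in results.items():
    print(f"{simple} vs {general}: G = {g:.2f}, df = {df}, P = {p:.3f}")
```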

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1)  y = μ1 + μ2·√Age + μ4·e

M(3)  y = μ1 + μ3·Sex(0,1) + μ4·e

But: cannot compare M(1) and M(3) using likelihood-ratio tests

Model Selection Using Likelihood-Ratio Tests

• What is the critical p-value?

• Cannot compare models which are not subsets of one another using likelihood-ratio tests

So: Akaike Information Criterion (AIC)

Akaike Information Criterion (AIC)

• Kullback-Leibler Information (KLI)
  – "information lost when model M(0) is used to approximate model M(1)"
  – "distance from M(0) to M(1)"

• AIC(M) = −2 × Log(Likelihood(M)) + 2 × K(M)
  – K(M) is the number of estimable parameters of model M

• AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Page 11: The Method of Likelihood Hal Whitehead BIOL4062/5062.

Likelihood an example

Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Null hypothesis Binomial Distribution with q =

075Expect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

Likelihood(q=075) = 90C10 07580 02510

= 000551

Alternative hypothesis Binomial Distribution with q =

Expect FindWild Type 75 80Mutants 25 10Total 100 90

Likelihood(q) = 90C10 q80 (1-q)10 This has a maximum value when q = 8090 = 089Max Likelihood(q) = 90C10 (089)80 (1-089)10 = 01236

MaximumLikelihoodEstimator

Likelihood Ratio TestExpect Find

Wild Type 75 80

Mutants 25 10

Total 100 90

G = 2 Log Max Likelihood (q)

Likelihood (q = 075) = 2 Log(01236 0000551) = 1096

is distributed as χsup2 with 1 df if q=075significantly large (Plt001) in χsup2(1)

so reject null hypothesis

Profile LikelihoodConfidence Intervals

μ1

Likelihood

Profile LikelihoodConfidence Intervals

μ1

Log-Likelihood

2

Maximumlikelihood

Maximumlikelihood

estimator of μ1

95 ci

Profile LikelihoodConfidence Intervals

Log-LikelihoodContours (relative to maximum

likelihood)

μ1

μ2

MLE(0)

-2 95 Confidence region

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radic Age + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005 G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010 G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

But What is critical p-value

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 radicAge + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

But Cannot compare M(1) and M(3)

using likelihood-ratio tests

Model SelectionUsing Likelihood-Ratio Tests

bull What is critical p-value

bull Cannot compare models which are not subsets of one another using likelihood-ratio tests

So Akaike Information Criteria (AIC)

Akaike Information Criteria (AIC)bull Kullback-Leibler Information (KLI)

ndash ldquoinformation lost when model M(0) is used to approximate model M(1)rdquo

ndash ldquodistance from M(0) to M(1)rdquo

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters of model M

bull AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

bull Differences in AIC between models ΔAIC

bull Support for less favoured modelndash ΔAIC 0-2 Substantialndash ΔAIC 4-7 Considerably lessndash ΔAIC gt10 Essentially none

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e                           AIC = 50.08   Unlikely
M(1) y = μ1 + μ2 √Age + μ4 e                 AIC = 46.68   BEST
M(2) y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e   AIC = 47.68   Good
M(3) y = μ1 + μ3 Sex(0,1) + μ4 e             AIC = 49.95   Unlikely
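The ranking on this slide can be reproduced with a short sketch that computes ΔAIC for each model relative to the best one. The dictionary and function names are assumptions made for illustration, and the AIC values are the deck's figures read as 50.08, 46.68, 47.68 and 49.95.

```python
# AIC values for the four crab models, as read from the slide.
aic_values = {"M(0)": 50.08, "M(1)": 46.68, "M(2)": 47.68, "M(3)": 49.95}


def delta_aic(aics):
    """Return each model's AIC difference from the smallest AIC."""
    best = min(aics.values())
    return {name: round(value - best, 2) for name, value in aics.items()}


deltas = delta_aic(aic_values)
best_model = min(aic_values, key=aic_values.get)

# List models from most to least supported.
for name in sorted(deltas, key=deltas.get):
    print(name, deltas[name])
```

The best model always has ΔAIC = 0; the other models are then judged by how far their ΔAIC falls from zero.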

Modifications to AIC

AIC for small sample sizes

AICc = -2 × (Log-Likelihood) + 2 × K × n / (n - K - 1)

n is sample size

AIC for overdispersed count data

QAIC = -2 × Log-Likelihood / c + 2 × K

c is “variance inflation factor” (c = χ²/df)
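As a rough sketch, the two modified criteria can be written as small Python functions. The function and argument names are assumptions, and the example call uses the crab model M(1) read as Log(L) = -20.34, K = 3, n = 30 purely for illustration.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample AIC: -2 x Log-Likelihood + 2 x K x n / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)


def qaic(log_likelihood: float, k: int, c: float) -> float:
    """AIC for overdispersed counts: -2 x Log-Likelihood / c + 2 x K."""
    if c <= 0:
        raise ValueError("variance inflation factor c must be positive")
    return -2.0 * log_likelihood / c + 2.0 * k


# Assumed illustration: M(1) from the crab example with n = 30 crabs.
print(f"AICc: {aicc(-20.34, 3, 30):.2f}")
# With c = 1 (no overdispersion) QAIC reduces to ordinary AIC.
print(f"QAIC with c = 1: {qaic(-20.34, 3, 1.0):.2f}")
```

As n grows relative to K, the factor n / (n - K - 1) approaches 1 and AICc converges to AIC.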

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer-Verlag.

Likelihood and Least-Squares

• If errors are normally distributed
– least squares and maximum-likelihood estimates of parameters are the same
– but not σ² estimators (the MLE divides the residual sum of squares by n; the usual least-squares estimator divides by the residual degrees of freedom)

• Likelihood is a more powerful and theoretically-based technique

AIC and Least-Squares

• If all models assume normal errors with constant variance

• AIC = n Log(σ²) + 2K
– σ² = Σei² / n (the MLE of σ²)
– K is total no. of estimated regression parameters, including the intercept and σ²
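A minimal sketch of the least-squares form of AIC, assuming the residuals from a fitted model are already available as a list. The function name and the residual values below are hypothetical and only illustrate the arithmetic; the form drops additive constants that are the same for every model fitted to the same data, so it is useful for comparing models rather than as an absolute value.

```python
import math


def aic_least_squares(residuals, n_regression_params):
    """AIC = n Log(sigma^2) + 2K for normal errors with constant variance.

    n_regression_params counts the regression coefficients including the
    intercept; sigma^2 is added to K inside the function.
    """
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n  # MLE of sigma^2 (divides by n)
    k = n_regression_params + 1                 # +1 for sigma^2
    return n * math.log(sigma2) + 2 * k


# Hypothetical residuals from a model with an intercept and one slope.
example_residuals = [1.0, -1.0, 2.0, -2.0]
print(aic_least_squares(example_residuals, 2))
```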

Calculating Likelihoods

• Analytical formulae

• Compute by multiplying probabilities

• Estimate by simulation
– number of times data are obtained in 1000 simulations given model and parameters
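The simulation approach can be sketched for the binomial example used elsewhere in this deck (80 wild type among 90 offspring). The function names, the seed, and the choice of 20000 simulations in the example call are assumptions; the slide's 1000 simulations is kept as the default, but more draws give a steadier estimate.

```python
import math
import random


def simulate_likelihood(n_trials, observed, q, n_sims=1000, seed=0):
    """Fraction of simulated data sets that exactly match the observed count."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        successes = sum(1 for _ in range(n_trials) if rng.random() < q)
        if successes == observed:
            hits += 1
    return hits / n_sims


def exact_binomial_likelihood(n_trials, observed, q):
    """Analytical binomial probability, for comparison with the simulation."""
    return math.comb(n_trials, observed) * q**observed * (1 - q) ** (n_trials - observed)


estimate = simulate_likelihood(90, 80, 80 / 90, n_sims=20000, seed=1)
exact = exact_binomial_likelihood(90, 80, 80 / 90)
print(f"simulated: {estimate:.4f}  exact: {exact:.4f}")
```

Simulation is mainly useful when no analytical formula exists; when the likelihood is very small, few or no simulated data sets match the observations and many more simulations are needed.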

The Method of Likelihood

• Probability of data given model

• Estimate parameters using maximum likelihood

• Estimate confidence intervals using likelihood profiles

• Compare models using
– likelihood ratio tests
– Akaike Information Criterion (AIC)


Null hypothesis: Binomial Distribution with q = 0.75

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

Likelihood(q = 0.75) = 90C10 × 0.75^80 × 0.25^10 = 0.000551
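The likelihood under the null hypothesis can be checked with a few lines of Python. This is an illustrative sketch: the function name is an assumption, and `math.comb` supplies the binomial coefficient 90C10.

```python
import math


def binomial_likelihood(n, successes, q):
    """Probability of `successes` out of `n` when each has probability q."""
    return math.comb(n, successes) * q**successes * (1 - q) ** (n - successes)


# 80 wild type among 90 offspring, with q = 0.75 under the null hypothesis.
null_likelihood = binomial_likelihood(90, 80, 0.75)
print(f"Likelihood(q = 0.75) = {null_likelihood:.6f}")
```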

Alternative hypothesis: Binomial Distribution with q unspecified

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

Likelihood(q) = 90C10 × q^80 × (1 - q)^10

This has a maximum value when q = 80/90 = 0.89

Max Likelihood(q) = 90C10 × (0.89)^80 × (1 - 0.89)^10 = 0.1326

Maximum likelihood estimator: q = 80/90 = 0.89
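To illustrate how the maximum likelihood estimator is found, the sketch below evaluates the binomial log-likelihood over a grid of q values and picks the largest. The grid resolution and the names used are assumptions; for the binomial the analytical answer is simply the observed proportion, so the grid search is only there to show the idea.

```python
import math


def binomial_log_likelihood(n, successes, q):
    """Log of the binomial probability of the observed count."""
    return (
        math.log(math.comb(n, successes))
        + successes * math.log(q)
        + (n - successes) * math.log(1 - q)
    )


# Candidate values of q strictly between 0 and 1, in steps of 0.001.
grid = [i / 1000 for i in range(1, 1000)]
q_hat = max(grid, key=lambda q: binomial_log_likelihood(90, 80, q))
max_likelihood = math.exp(binomial_log_likelihood(90, 80, q_hat))

print(f"q_hat = {q_hat:.3f}, maximum likelihood = {max_likelihood:.4f}")
```

Working on the log scale avoids numerical underflow when many probabilities are multiplied together.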

Likelihood Ratio Test

             Expect   Find
Wild Type      75      80
Mutants        25      10
Total         100      90

G = 2 Log( Max Likelihood(q) / Likelihood(q = 0.75) ) = 2 Log(0.1326 / 0.000551) = 10.96

G is distributed as χ² with 1 df if q = 0.75

G is significantly large (P < 0.01) in χ²(1), so reject the null hypothesis
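The test statistic and its P-value can be sketched with the standard library alone. The function names are assumptions; the upper tail of χ² with 1 df is computed from the complementary error function, which is an exact identity for 1 df and avoids needing a statistics package.

```python
import math


def binomial_log_likelihood(n, successes, q):
    """Log of the binomial probability of the observed count."""
    return (
        math.log(math.comb(n, successes))
        + successes * math.log(q)
        + (n - successes) * math.log(1 - q)
    )


def chi2_upper_tail_1df(x):
    """P(chi-square with 1 df > x), via the normal distribution."""
    return math.erfc(math.sqrt(x / 2.0))


log_l_null = binomial_log_likelihood(90, 80, 0.75)     # q fixed by the null
log_l_alt = binomial_log_likelihood(90, 80, 80 / 90)   # q at its MLE
g = 2.0 * (log_l_alt - log_l_null)
p_value = chi2_upper_tail_1df(g)

print(f"G = {g:.2f}, P = {p_value:.5f}")
```

The alternative has one more free parameter (q) than the null, which is why G is referred to χ² with 1 df.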

Profile Likelihood Confidence Intervals

[Figure: likelihood plotted against μ1]

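Profile Likelihood Confidence Intervals

[Figure: log-likelihood plotted against μ1. The peak is the maximum likelihood, at the maximum likelihood estimator of μ1; the values of μ1 whose log-likelihood lies within 2 of the maximum form the 95% c.i.]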
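The idea in this figure can be sketched for a one-parameter case, using the binomial example from this deck (80 wild type among 90). The code finds the two values of q at which the log-likelihood has fallen 2 units below its maximum, following the slide. The function names and the bisection tolerance are assumptions; with several parameters, a true profile would re-maximise over the other parameters at each value of the one of interest, which this sketch does not need to do. A drop of 1.92 (half the 95% point of χ² with 1 df) is the usual asymptotic value, and the slide's 2 is close to it.

```python
import math


def log_likelihood(q, n, successes):
    """Binomial log-likelihood; the constant binomial coefficient is omitted."""
    return successes * math.log(q) + (n - successes) * math.log(1.0 - q)


def likelihood_interval(n, successes, drop=2.0, tol=1e-9):
    """Values of q where the log-likelihood is `drop` below its maximum."""
    q_hat = successes / n
    target = log_likelihood(q_hat, n, successes) - drop

    def crossing(inside, outside):
        # Bisection: `inside` stays within the interval, `outside` beyond it.
        while abs(outside - inside) > tol:
            mid = 0.5 * (inside + outside)
            if log_likelihood(mid, n, successes) >= target:
                inside = mid
            else:
                outside = mid
        return inside

    lower = crossing(q_hat, 1e-12)
    upper = crossing(q_hat, 1.0 - 1e-12)
    return q_hat, lower, upper


q_hat, lower, upper = likelihood_interval(90, 80)
print(f"q_hat = {q_hat:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric about the estimate, because the log-likelihood falls away more steeply on one side than the other.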

Profile Likelihood Confidence Intervals

[Figure: log-likelihood contours (relative to maximum likelihood) over μ1 and μ2. The MLE is at 0; the -2 contour bounds the 95% confidence region]

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e
M(1) y = μ1 + μ2 √Age + μ4 e
M(2) y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e                           Log(L) = -23.04
M(1) y = μ1 + μ2 √Age + μ4 e                 Log(L) = -20.34
M(2) y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e   Log(L) = -19.84

G(M(0) vs M(1)) = 2 × (-20.34 - (-23.04)) = 5.40   P(χ²(1)) < 0.05
G(M(1) vs M(2)) = 2 × (-19.84 - (-20.34)) = 1.00   P(χ²(1)) > 0.10
G(M(0) vs M(2)) = 2 × (-19.84 - (-23.04)) = 6.40   P(χ²(2)) < 0.05

But: what is the critical p-value?
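The three comparisons above can be reproduced with a short sketch. The names are assumptions, the log-likelihoods are the deck's figures read as -23.04, -20.34 and -19.84, and the parameter counts (2, 3, 4) are taken from the model equations. The χ² upper tails use closed forms that hold only for 1 and 2 df, which is all this example needs.

```python
import math

# Maximised log-likelihoods and parameter counts for the nested crab models.
log_l = {"M(0)": -23.04, "M(1)": -20.34, "M(2)": -19.84}
n_params = {"M(0)": 2, "M(1)": 3, "M(2)": 4}


def chi2_upper_tail(x, df):
    """P(chi-square > x) for df = 1 or 2 (closed forms only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


def likelihood_ratio_test(simple, full):
    """G statistic, degrees of freedom and P-value for nested models."""
    g = 2.0 * (log_l[full] - log_l[simple])
    df = n_params[full] - n_params[simple]
    return g, df, chi2_upper_tail(g, df)


for simple, full in [("M(0)", "M(1)"), ("M(1)", "M(2)"), ("M(0)", "M(2)")]:
    g, df, p = likelihood_ratio_test(simple, full)
    print(f"G({simple} vs {full}) = {g:.2f}, df = {df}, P = {p:.3f}")
```

The degrees of freedom equal the number of extra parameters in the fuller model, so the test only makes sense when the simpler model is a special case of the fuller one.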

Page 16: The Method of Likelihood Hal Whitehead BIOL4062/5062.

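Profile Likelihood Confidence Intervals

[Figure: log-likelihood plotted against μ1. The curve peaks at the maximum likelihood estimator of μ1; the 95% c.i. spans the values of μ1 whose log-likelihood lies within 2 units of the maximum likelihood.]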
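A minimal sketch of the interval drawn in this figure, under stated assumptions: a normal model in which the mean plays the role of μ1 and the variance is a nuisance parameter that is re-estimated (profiled out) at each trial value of the mean. The data, the function names and the grid search are all hypothetical. The drop of 2 log-likelihood units follows the slide; the χ²(1) cutoff gives about 1.92.

```python
import math

def profile_loglik(mu, y):
    """Normal log-likelihood at mean mu, maximised over the nuisance variance."""
    n = len(y)
    sigma2_hat = sum((v - mu) ** 2 for v in y) / n  # MLE of the variance for this mu
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)

def profile_interval(y, drop=2.0, steps=20001):
    """Scan a grid of mu values and keep those within `drop` units of the maximum."""
    mle = sum(y) / len(y)                 # the sample mean maximises the likelihood
    max_ll = profile_loglik(mle, y)
    spread = max(y) - min(y)
    lo, hi = mle - spread, mle + spread   # search window (an assumption of this sketch)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    inside = [m for m in grid if profile_loglik(m, y) >= max_ll - drop]
    return mle, min(inside), max(inside)

weights = [31.2, 28.4, 35.1, 29.9, 33.0, 30.5, 27.8, 34.2]  # hypothetical data
mle, lower, upper = profile_interval(weights)
print(f"MLE = {mle:.2f}, approx. 95% c.i. = ({lower:.2f}, {upper:.2f})")
```

A grid scan is used only because it mirrors reading the interval off the plotted curve; a root finder would do the same job with less computation.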

Profile Likelihood Confidence Intervals

[Figure: log-likelihood contours (relative to maximum likelihood) plotted against μ1 and μ2. The MLE lies at the 0 contour; the -2 contour is marked as the 95% confidence region.]

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4 e    Log(L) = -23.04
M(1): y = μ1 + μ2 √Age + μ4 e    Log(L) = -20.34
M(2): y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e    Log(L) = -19.84

G(M(0) vs M(1)) = 2 × (-20.34 - (-23.04)) = 5.40    P(χ²(1)) < 0.05
G(M(1) vs M(2)) = 2 × (-19.84 - (-20.34)) = 1.00    P(χ²(1)) > 0.10
G(M(0) vs M(2)) = 2 × (-19.84 - (-23.04)) = 6.40    P(χ²(2)) < 0.05

But: what is the critical p-value?
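The three tests above can be sketched as follows. This assumes the log-likelihoods read as -23.04, -20.34 and -19.84 (the decimal points are missing from the transcript; these placements reproduce the stated P thresholds). The function names are hypothetical, and the χ² tail probability uses closed forms that hold only for 1 and 2 degrees of freedom.

```python
import math

def chi2_sf(g, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(g / 2.0))
    if df == 2:
        return math.exp(-g / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")

def lrt(loglik_simple, loglik_complex, df):
    """Return (G, P) for a simpler model nested within a more complex one."""
    g = 2.0 * (loglik_complex - loglik_simple)
    return g, chi2_sf(g, df)

# Log-likelihoods of the crab models, decimal placement assumed as described above
loglik = {"M0": -23.04, "M1": -20.34, "M2": -19.84}

# df = number of extra parameters in the more complex model
comparisons = [("M0", "M1", 1), ("M1", "M2", 1), ("M0", "M2", 2)]
results = {}
for simple, complex_, df in comparisons:
    g, p = lrt(loglik[simple], loglik[complex_], df)
    results[(simple, complex_)] = (g, p)
    print(f"G({simple} vs {complex_}) = {g:.2f}, df = {df}, P = {p:.3f}")
```

For other degrees of freedom a statistics library would be the usual choice; the closed forms are used here only to keep the sketch free of dependencies.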

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1): y = μ1 + μ2 √Age + μ4 e
M(3): y = μ1 + μ3 Sex(0,1) + μ4 e

But: M(1) and M(3) cannot be compared using likelihood-ratio tests, because neither model is a subset of the other.

Model Selection Using Likelihood-Ratio Tests

• What is the critical p-value?
• Models which are not subsets of one another cannot be compared using likelihood-ratio tests

So: Akaike Information Criterion (AIC)

Akaike Information Criterion (AIC)

• Kullback-Leibler Information (KLI)
  – "information lost when model M(0) is used to approximate model M(1)"
  – "distance from M(0) to M(1)"
• AIC(M) = -2 × Log(Likelihood(M)) + 2 × K(M)
  – K(M) is the number of estimable parameters of model M
• AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data
• In model selection, choose the model with the smallest AIC
  – least expected relative distance between M and the unknown true mechanism that generated the data
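A small sketch of the AIC formula applied to the nested crab models. It assumes the log-likelihoods read as -23.04, -20.34 and -19.84 (decimal points are missing from the transcript) and that K counts every μ in a model, including the residual s.d. μ4. The function and variable names are hypothetical.

```python
def aic(loglik, k):
    """AIC(M) = -2 x Log(Likelihood(M)) + 2 x K(M)."""
    return -2.0 * loglik + 2.0 * k

# (log-likelihood, number of estimable parameters K), values assumed as described above
models = {
    "M0": (-23.04, 2),  # mu1, mu4
    "M1": (-20.34, 3),  # mu1, mu2, mu4
    "M2": (-19.84, 4),  # mu1, mu2, mu3, mu4
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)  # smallest AIC is preferred

for name, value in scores.items():
    print(f"{name}: AIC = {value:.2f}")
print("Smallest AIC:", best)
```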

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4 e    AIC = 50.08
M(1): y = μ1 + μ2 √Age + μ4 e    AIC = 46.68
M(2): y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e    AIC = 47.68
M(3): y = μ1 + μ3 Sex(0,1) + μ4 e    AIC = 49.95

Model Selection Using AIC

• Differences in AIC between models: ΔAIC
• Support for the less favoured model:
  – ΔAIC 0-2: Substantial
  – ΔAIC 4-7: Considerably less
  – ΔAIC > 10: Essentially none

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4 e    AIC = 50.08    Unlikely
M(1): y = μ1 + μ2 √Age + μ4 e    AIC = 46.68    BEST
M(2): y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e    AIC = 47.68    Good
M(3): y = μ1 + μ3 Sex(0,1) + μ4 e    AIC = 49.95    Unlikely
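The ΔAIC values behind this ranking can be worked out as below. This is a sketch that reads the AIC values as 50.08, 46.68, 47.68 and 49.95 (decimal points are missing from the transcript); the variable names are hypothetical.

```python
# AIC values of the four crab models, decimal placement assumed as described above
aic_values = {"M0": 50.08, "M1": 46.68, "M2": 47.68, "M3": 49.95}

smallest = min(aic_values.values())

# Delta-AIC: distance of each model from the model with the smallest AIC
delta = {name: round(value - smallest, 2) for name, value in aic_values.items()}

for name, d in sorted(delta.items(), key=lambda item: item[1]):
    print(f"{name}: delta AIC = {d:.2f}")
```

Unlike the likelihood-ratio tests, this comparison includes M(3), which is not a subset of M(1).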

Modifications to AIC

AIC for small sample sizes:

AICc = -2 × (Log-Likelihood) + 2 × K × n / (n - K - 1)

n is sample size

AIC for overdispersed count data:

QAIC = -2 × (Log-Likelihood) / c + 2 × K

c is the "variance inflation factor" (c = χ²/df)
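A sketch of the two modified criteria as written on this slide. The function names are hypothetical, the example reuses crab model M(1) with its log-likelihood read as -20.34, K = 3 and n = 30, and the value c = 1.5 is made up purely for illustration.

```python
def aic(loglik, k):
    """Plain AIC, for comparison."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample AIC: -2 x Log-Likelihood + 2 x K x n / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > K + 1")
    return -2.0 * loglik + 2.0 * k * n / (n - k - 1)

def qaic(loglik, k, c):
    """AIC for overdispersed counts: -2 x Log-Likelihood / c + 2 x K."""
    return -2.0 * loglik / c + 2.0 * k

loglik_m1, k_m1, n_crabs = -20.34, 3, 30   # assumed values, see lead-in
print(f"AIC  = {aic(loglik_m1, k_m1):.2f}")
print(f"AICc = {aicc(loglik_m1, k_m1, n_crabs):.2f}")
print(f"QAIC = {qaic(loglik_m1, k_m1, c=1.5):.2f}  (hypothetical c)")
```

The small-sample penalty 2Kn/(n - K - 1) equals 2K + 2K(K + 1)/(n - K - 1), so AICc approaches AIC as n grows relative to K.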

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer-Verlag.

Likelihood and Least-Squares

• If errors are normally distributed:
  – least-squares and maximum-likelihood estimates of parameters are the same
  – but the σ² estimators are not
• Likelihood is a more powerful and theoretically-based technique
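A sketch of the first bullet for a straight-line model with normal errors. The data and function names are hypothetical. The least-squares fit also maximises the normal likelihood, while the variance estimators differ: maximum likelihood divides the residual sum of squares by n, the usual least-squares estimator by n minus the number of regression coefficients.

```python
import math

def least_squares(x, y):
    """Closed-form least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def normal_loglik(intercept, slope, sigma2, x, y):
    """Log-likelihood of the line under independent N(0, sigma2) errors."""
    n = len(x)
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]          # hypothetical predictor
y = [30.1, 30.9, 31.2, 32.4, 32.6, 33.9]    # hypothetical response

b0, b1 = least_squares(x, y)
n = len(x)
rss = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
sigma2_mle = rss / n        # maximum-likelihood estimator of the variance
sigma2_ls = rss / (n - 2)   # least-squares estimator (2 regression coefficients)

ll_at_fit = normal_loglik(b0, b1, sigma2_mle, x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"sigma2 (ML) = {sigma2_mle:.4f}, sigma2 (LS) = {sigma2_ls:.4f}")
```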

AIC and Least-Squares

• If all models assume normal errors with constant variance:
• AIC = n × Log(σ²) + 2K
  – σ² = Σei² / n (the MLE of σ²)
  – K is the total number of estimated regression parameters, including the intercept and σ²
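A sketch comparing the least-squares form AIC = n × Log(σ²) + 2K with the full form -2 × Log(Likelihood) + 2K under normal errors. The residuals and function names are hypothetical. The two forms differ by n × (Log(2π) + 1), which is the same for every model fitted to the same n observations, so differences in AIC between models are unchanged.

```python
import math

def aic_least_squares(residuals, k):
    """AIC = n x Log(sigma2) + 2K, with sigma2 = sum(e_i^2) / n (the MLE)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return n * math.log(sigma2) + 2 * k

def aic_full(residuals, k):
    """AIC from the maximised normal log-likelihood, constants included."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return -2 * loglik + 2 * k

# Hypothetical residuals from two models fitted to the same 6 observations
res_a = [0.9, -1.2, 0.4, 1.1, -0.8, -0.4]   # model with K = 2
res_b = [0.5, -0.7, 0.2, 0.6, -0.4, -0.2]   # model with K = 3

diff_ls = aic_least_squares(res_a, 2) - aic_least_squares(res_b, 3)
diff_full = aic_full(res_a, 2) - aic_full(res_b, 3)
print(f"AIC difference (least-squares form) = {diff_ls:.4f}")
print(f"AIC difference (full form)          = {diff_full:.4f}")
```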

Calculating Likelihoods

• Analytical formulae
• Compute by multiplying probabilities
• Estimate by simulation
  – number of times data are obtained in 1000 simulations, given model and parameters
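The last two approaches can be sketched for a simple case: a sequence of independent yes/no outcomes with success probability q. The data, the value q = 0.75, the seed and the function names are all hypothetical choices for illustration.

```python
import random

def likelihood_exact(outcomes, q):
    """Multiply the probabilities of the individual independent outcomes."""
    prob = 1.0
    for outcome in outcomes:
        prob *= q if outcome == 1 else (1.0 - q)
    return prob

def likelihood_simulated(outcomes, q, n_sims=1000, seed=1):
    """Proportion of simulated data sets that reproduce the observed data exactly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        simulated = [1 if rng.random() < q else 0 for _ in outcomes]
        if simulated == list(outcomes):
            hits += 1
    return hits / n_sims

data = [1, 1, 0, 1]                           # hypothetical observations
exact = likelihood_exact(data, 0.75)          # 0.75 x 0.75 x 0.25 x 0.75
approx = likelihood_simulated(data, 0.75)     # estimate from 1000 simulations
print(f"exact = {exact:.4f}, simulated = {approx:.4f}")
```

Simulation is only practical when the observed data have a reasonable chance of recurring exactly; with many observations or continuous measurements, almost no simulated data set will match.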

The Method of Likelihood

• Probability of data given model
• Estimate parameters using maximum likelihood
• Estimate confidence intervals using likelihood profiles
• Compare models using:
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -

2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radic Age + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304

M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005

G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010

G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sexM(0) y = μ1 + μ4 e Log(L)= -2304M(1) y = μ1 + μ2 radicAge + μ4 e Log(L)= -2034M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e Log(L)= -1984

G(M(0)vsM(1)) = 2x(-2034 - (-2304)) = 540 P(χsup2(1))lt005 G(M(1)vsM(2)) = 2x(-1984 - (-2034)) = 100 P(χsup2(1))gt010 G(M(0)vsM(2)) = 2x(-1984 - (-2304)) = 640 P(χsup2(2))lt005

But What is critical p-value

Model SelectionUsing Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 radicAge + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

But Cannot compare M(1) and M(3)

using likelihood-ratio tests

Model SelectionUsing Likelihood-Ratio Tests

bull What is critical p-value

bull Cannot compare models which are not subsets of one another using likelihood-ratio tests

So Akaike Information Criteria (AIC)

Akaike Information Criteria (AIC)bull Kullback-Leibler Information (KLI)

ndash ldquoinformation lost when model M(0) is used to approximate model M(1)rdquo

ndash ldquodistance from M(0) to M(1)rdquo

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters of model M

bull AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995

Model SelectionUsing AIC

bull Differences in AIC between models ΔAIC

bull Support for less favoured modelndash ΔAIC 0-2 Substantialndash ΔAIC 4-7 Considerably lessndash ΔAIC gt10 Essentially none

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008 Unlikely

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668 BEST

M(2) y = μ1 + μ2radicAge + μ3Sex(01) + μ4e AIC=4768 Good

M(3) y = μ1 + μ3 Sex(01) + μ4 e AIC=4995 Unlikely

Modifications to AIC

AIC for small sample sizes

AICC = - 2x(Log-Likelihood) + 2xKxn(n-K-1)

n is sample size

AIC for overdispersed count data

QAIC = - 2xLog-Likelihoodc + 2xK

c is ldquovariance inflation factorrdquo (c=χsup2df)

Burnham K P and D R Anderson2002

Model selection and multimodel inference

a practical information-theoretic approach 2nd ed

New York Springer-Verlag

Likelihood and Least-Squares

bull If errors are normally distributedndash least squares and maximum-likelihood

estimates of parameters are the samendash but not σ2 estimators

bull Likelihood is a more powerful and theoretically-based technique

AIC and Least-Squares

bull If all models assume normal errors with constant variance

bull AIC = nLog(σ2) + 2Kndash σ2 = Σei

2n (the MLE of σ2)

ndash K is total no of estimated regression parameters including the intercept and σ2

Calculating Likelihoods

• Analytical formulae

• Compute by multiplying probabilities

• Estimate by simulation
  – number of times data are obtained in 1000 simulations given model and parameters
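
The analytical and simulation routes can be compared on a small case. The Python sketch below is an illustration only: the data (8 successes in 10 trials), the parameter q = 0.75, the seed and the function names are all assumptions chosen for the example, not values from the slides.

```python
import math
import random


def binomial_likelihood(k, n, q):
    """Analytical formula: probability of k successes in n trials."""
    return math.comb(n, k) * q**k * (1 - q) ** (n - k)


def simulated_likelihood(k, n, q, n_sims=1000, seed=1):
    """Fraction of simulated data sets that reproduce the observed count k."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Simulate n Bernoulli trials with success probability q.
        successes = sum(rng.random() < q for _ in range(n))
        if successes == k:
            hits += 1
    return hits / n_sims


exact = binomial_likelihood(8, 10, 0.75)
approx = simulated_likelihood(8, 10, 0.75)
print(round(exact, 4), approx)
```

With 1000 simulations the estimate carries a standard error of roughly 0.014 here, so it should be close to, but not identical with, the analytical value. For continuous data an exact match has probability zero, so in practice the simulated data would have to be compared with the observations within some tolerance.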

The Method of Likelihood

• Probability of data given model

• Estimate parameters using maximum likelihood

• Estimate confidence intervals using likelihood profiles

• Compare models using
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e    Log(L) = -23.04

M(1) y = μ1 + μ2 √Age + μ4 e    Log(L) = -20.34

M(2) y = μ1 + μ2 √Age + μ3 Sex(0/1) + μ4 e    Log(L) = -19.84

G(M(0) vs M(1)) = 2 × (-20.34 - (-23.04)) = 5.40    P(χ²(1)) < 0.05

G(M(1) vs M(2)) = 2 × (-19.84 - (-20.34)) = 1.00    P(χ²(1)) > 0.10

G(M(0) vs M(2)) = 2 × (-19.84 - (-23.04)) = 6.40    P(χ²(2)) < 0.05
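
The three tests can be reproduced numerically. The sketch below is a minimal Python illustration, not the author's code: it reads the log-likelihoods with their decimal points restored (-23.04, -20.34, -19.84), the function names are assumptions, and the chi-square tail probability uses closed forms that hold only for 1 and 2 degrees of freedom, which is all this example needs.

```python
import math


def chi2_sf(g, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(g / 2.0))
    if df == 2:
        return math.exp(-g / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def lrt(log_lik_simple, log_lik_complex, df):
    """Return G = 2 * (logL_complex - logL_simple) and its chi-square p-value."""
    g = 2.0 * (log_lik_complex - log_lik_simple)
    return g, chi2_sf(g, df)


log_lik = {"M(0)": -23.04, "M(1)": -20.34, "M(2)": -19.84}

# df is the number of extra parameters in the more complex model.
comparisons = [("M(0)", "M(1)", 1), ("M(1)", "M(2)", 1), ("M(0)", "M(2)", 2)]
for simple, larger, df in comparisons:
    g, p = lrt(log_lik[simple], log_lik[larger], df)
    print(f"{simple} vs {larger}: G = {g:.2f}, df = {df}, p = {p:.3f}")
```

Each comparison pairs a model with one that contains it as a special case, which is the condition under which G can be referred to a chi-square distribution.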

But: what is the critical p-value?

Model Selection Using Likelihood-Ratio Tests

Weights of 30 crabs of known age and sex

M(1) y = μ1 + μ2 √Age + μ4 e

M(3) y = μ1 + μ3 Sex(0/1) + μ4 e

But: cannot compare M(1) and M(3) using likelihood-ratio tests

Model Selection Using Likelihood-Ratio Tests

• What is the critical p-value?

• Cannot compare models which are not subsets of one another using likelihood-ratio tests

So: Akaike Information Criterion (AIC)

Akaike Information Criterion (AIC)

• Kullback-Leibler Information (KLI)
  – "information lost when model M(0) is used to approximate model M(1)"
  – "distance from M(0) to M(1)"

• AIC(M) = -2 × Log(Likelihood(M)) + 2 × K(M)
  – K(M) is the number of estimable parameters of model M

• AIC is an estimate of the expected relative distance (KLI) between a fitted model M and the unknown true mechanism that generated the data

Akaike Information Criterion (AIC)

• AIC(M) = -2 × Log(Likelihood(M)) + 2 × K(M)
  – K(M) is the number of estimable parameters

• In model selection, choose the model with the smallest AIC
  – least expected relative distance between M and the unknown true mechanism that generated the data
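
As a worked illustration of the AIC formula, the Python sketch below applies it to the three crab models whose log-likelihoods appear in the likelihood-ratio example. It rests on assumptions: the log-likelihoods are read with decimal points restored, K is taken as the number of μ terms in each model equation, and the names `aic` and `models` are hypothetical.

```python
def aic(log_lik, k):
    """AIC(M) = -2 * Log(Likelihood(M)) + 2 * K(M)."""
    return -2.0 * log_lik + 2.0 * k


# (log-likelihood, K) per model; K counts the μ terms in each model equation.
models = {
    "M(0)": (-23.04, 2),  # μ1, μ4
    "M(1)": (-20.34, 3),  # μ1, μ2, μ4
    "M(2)": (-19.84, 4),  # μ1, μ2, μ3, μ4
}

aic_values = {name: aic(ll, k) for name, (ll, k) in models.items()}

# Model selection: choose the model with the smallest AIC.
best = min(aic_values, key=aic_values.get)
print(best, {name: round(value, 2) for name, value in aic_values.items()})
```

M(2) has the highest log-likelihood, but its extra parameter costs more in the 2K penalty than it gains in fit, so M(1) has the smallest AIC.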

Page 27: The Method of Likelihood Hal Whitehead BIOL4062/5062.

Akaike Information Criteria (AIC)

bull AIC(M) = - 2xLog(Likelihood(M)) + 2xK(M)ndash K(M) is number of estimable parameters

bull In model selection choose model with smallest AIC

ndash least expected relative distance between M and the unknown true mechanism that generated the data

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e

M(1) y = μ1 + μ2 radicAge + μ4 e

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e

M(3) y = μ1 + μ3 Sex(01) + μ4 e

Model SelectionUsing AIC

Weights of 30 crabs of known age and sex

M(0) y = μ1 + μ4 e AIC=5008

M(1) y = μ1 + μ2 radicAge + μ4 e AIC=4668

M(2) y = μ1 + μ2 radicAge + μ3 Sex(01) + μ4 e AIC=4768

M(3) y = μ1 + μ2 Sex(01) + μ4 e AIC=4995


Model Selection Using AIC

• Differences in AIC between models: ΔAIC

• Support for the less favoured model
  – ΔAIC 0-2: Substantial
  – ΔAIC 4-7: Considerably less
  – ΔAIC > 10: Essentially none

Model Selection Using AIC

Weights of 30 crabs of known age and sex

M(0): y = μ1 + μ4 e                              AIC = 50.08   Unlikely

M(1): y = μ1 + μ2 √Age + μ4 e                    AIC = 46.68   BEST

M(2): y = μ1 + μ2 √Age + μ3 Sex(0,1) + μ4 e      AIC = 47.68   Good

M(3): y = μ1 + μ3 Sex(0,1) + μ4 e                AIC = 49.95   Unlikely
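
The ranking of the four crab models can be reproduced with a short sketch. It assumes the AIC values on the slide are read with two decimal places (50.08, 46.68, 47.68, 49.95), which is the only placement consistent with the "BEST" and "Good" labels. The function names are illustrative. Note that the ΔAIC bands listed earlier leave gaps (2 to 4 and 7 to 10), so the sketch gives those values a neutral label rather than guessing.

```python
def delta_aic(aic_by_model):
    """Return ΔAIC for each model relative to the smallest AIC."""
    best = min(aic_by_model.values())
    return {model: value - best for model, value in aic_by_model.items()}

def support(delta):
    """Map ΔAIC onto the support bands listed on the slide."""
    if delta <= 2:
        return "substantial"
    if 4 <= delta <= 7:
        return "considerably less"
    if delta > 10:
        return "essentially none"
    return "between the listed bands"

# AIC values assumed from the slide (decimal placement is an assumption).
crab_aic = {"M(0)": 50.08, "M(1)": 46.68, "M(2)": 47.68, "M(3)": 49.95}
deltas = delta_aic(crab_aic)
for model, d in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{model}: dAIC = {d:.2f} ({support(d)})")
```

Under that reading, M(2) sits about 1 AIC unit above M(1), which matches its "Good" label, while M(0) and M(3) are more than 3 units above the best model.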

Modifications to AIC

AIC for small sample sizes

AICc = -2 × (Log-Likelihood) + 2 × K × n / (n - K - 1)

n is the sample size

AIC for overdispersed count data

QAIC = -2 × (Log-Likelihood) / c + 2 × K

c is the "variance inflation factor" (c = χ² / df)
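
Both modified criteria can be written as small functions that follow the slide's formulae directly. This is a sketch: the function names, the guard on n, and the example numbers are assumptions made for illustration.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample AIC: -2*logL + 2*K*n / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > K + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)

def qaic(log_likelihood: float, k: int, c: float) -> float:
    """Quasi-AIC for overdispersed counts: -2*logL / c + 2*K."""
    return -2.0 * log_likelihood / c + 2.0 * k

# Hypothetical values: logL = -20, K = 3 parameters, n = 30 observations.
print(aicc(-20.0, 3, 30))   # penalty is larger than the plain 2*K = 6
print(qaic(-20.0, 3, 2.0))  # c = 2 halves the weight given to the fit term
```

As n grows, n / (n - K - 1) approaches 1 and AICc converges to AIC; with c = 1, QAIC reduces to AIC.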

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer-Verlag.

Likelihood and Least-Squares

• If errors are normally distributed
  – least-squares and maximum-likelihood estimates of parameters are the same
  – but not the σ² estimators

• Likelihood is a more powerful and theoretically-based technique
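
The point about σ² can be checked numerically for the simplest model, y = μ + e with normal errors. In this sketch the data are hypothetical, and the "least-squares" σ² estimator is taken to be the usual unbiased one, RSS / (n - 1), which is an assumption about what the slide means.

```python
import math

def normal_log_likelihood(y, mu, sigma2):
    """Log-likelihood of independent N(mu, sigma2) observations."""
    n = len(y)
    rss = sum((v - mu) ** 2 for v in y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

y = [4.1, 5.3, 3.8, 6.0, 4.8]  # hypothetical data
n = len(y)

mu_hat = sum(y) / n                      # least-squares and ML estimate of mu
rss = sum((v - mu_hat) ** 2 for v in y)  # residual sum of squares
sigma2_mle = rss / n                     # maximum-likelihood estimator
sigma2_ls = rss / (n - 1)                # unbiased least-squares estimator

print(mu_hat, sigma2_mle, sigma2_ls)
print(normal_log_likelihood(y, mu_hat, sigma2_mle))
print(normal_log_likelihood(y, mu_hat, sigma2_ls))
```

The sample mean minimises the sum of squares and maximises the likelihood, so both methods agree on μ. The likelihood is highest at RSS / n, which is always smaller than the unbiased estimate.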

AIC and Least-Squares

• If all models assume normal errors with constant variance

• AIC = n Log(σ²) + 2K
  – σ² = Σ e_i² / n (the MLE of σ²)
  – K is the total number of estimated regression parameters, including the intercept and σ²
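
Given the residuals from a least-squares fit, the formula above needs only a few lines. This is a sketch: the function name and the residuals are hypothetical, and the count K is formed by adding 1 for σ² to the number of regression coefficients (intercept included), as the slide describes.

```python
import math

def aic_least_squares(residuals, n_coefficients):
    """AIC = n * Log(sigma2) + 2K, with sigma2 = sum(e_i^2) / n (the MLE)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    k = n_coefficients + 1  # regression coefficients plus sigma2
    return n * math.log(sigma2) + 2 * k

# Hypothetical residuals from a model with an intercept and one slope.
residuals = [1.0, -1.0, 1.0, -1.0]
print(aic_least_squares(residuals, n_coefficients=2))
```

This form drops a constant that is the same for every model fitted to the same data, so it is suitable for comparing models but its values will not match an AIC computed from the full log-likelihood.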

Calculating Likelihoods

• Analytical formulae

• Compute by multiplying probabilities

• Estimate by simulation
  – number of times the data are obtained in 1000 simulations, given the model and parameters
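
The three routes listed above can be compared on a small binomial example. The data here (7 successes in 10 trials with q = 0.75) and all function names are hypothetical, chosen only to illustrate the idea.

```python
import math
import random

def binomial_likelihood(successes, trials, q):
    """Analytical formula: probability of the observed count given q."""
    return math.comb(trials, successes) * q**successes * (1 - q) ** (trials - successes)

def product_likelihood(counts, trials, q):
    """Independent observations: multiply their individual probabilities."""
    return math.prod(binomial_likelihood(c, trials, q) for c in counts)

def simulated_likelihood(successes, trials, q, n_sims=1000, seed=0):
    """Fraction of simulated data sets that reproduce the observed count."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        simulated = sum(rng.random() < q for _ in range(trials))
        hits += simulated == successes
    return hits / n_sims

exact = binomial_likelihood(7, 10, 0.75)
approx = simulated_likelihood(7, 10, 0.75)
print(exact, approx, product_likelihood([7, 8], 10, 0.75))
```

With 1000 simulations the estimate is only accurate to within a few hundredths, so simulation is mainly useful when no analytical formula is available.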

The Method of Likelihood

• Probability of data given model

• Estimate parameters using maximum likelihood

• Estimate confidence intervals using likelihood profiles

• Compare models using
  – likelihood ratio tests
  – Akaike Information Criterion (AIC)
