Transcript of the slideshow "INTRODUCTION TO MAXIMUM LIKELIHOOD ESTIMATION" (61 slides).

1

INTRODUCTION TO MAXIMUM LIKELIHOOD ESTIMATION

This sequence introduces the principle of maximum likelihood estimation and illustrates it with some simple examples.

[Figure: upper panel, the probability density p (vertical scale 0.0 to 0.4) over the range 0 to 8, with the value of m marked; lower panel, the joint density L (vertical scale 0.00 to 0.06) plotted against m over the range 0 to 8.]

2

Suppose that you have a normally-distributed random variable X with unknown population mean m and standard deviation s, and that you have a sample of two observations, 4 and 6. For the time being, we will assume that s is equal to 1.


3

Suppose initially you consider the hypothesis m = 3.5. Under this hypothesis the probability density at 4 would be 0.3521 and that at 6 would be 0.0175.

m      p(4)      p(6)
3.5    0.3521    0.0175

4

The joint probability density, shown in the bottom chart, is the product of these, 0.0062.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062

5

Next consider the hypothesis m = 4.0. Under this hypothesis the probability densities associated with the two observations are 0.3989 and 0.0540, and the joint probability density is 0.0215.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062
4.0    0.3989    0.0540    0.0215

6

Under the hypothesis m = 4.5, the probability densities are 0.3521 and 0.1295, and the joint probability density is 0.0456.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062
4.0    0.3989    0.0540    0.0215
4.5    0.3521    0.1295    0.0456

7

Under the hypothesis m = 5.0, the probability densities are both 0.2420 and the joint probability density is 0.0585.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062
4.0    0.3989    0.0540    0.0215
4.5    0.3521    0.1295    0.0456
5.0    0.2420    0.2420    0.0585

8

Under the hypothesis m = 5.5, the probability densities are 0.1295 and 0.3521 and the joint probability density is 0.0456.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062
4.0    0.3989    0.0540    0.0215
4.5    0.3521    0.1295    0.0456
5.0    0.2420    0.2420    0.0585
5.5    0.1295    0.3521    0.0456

9

The complete joint density function for all values of m has now been plotted in the lower diagram. We see that it peaks at m = 5.

m      p(4)      p(6)      L
3.5    0.3521    0.0175    0.0062
4.0    0.3989    0.0540    0.0215
4.5    0.3521    0.1295    0.0456
5.0    0.2420    0.2420    0.0585
5.5    0.1295    0.3521    0.0456

[Figure: the lower panel now shows the complete joint density L plotted against m over the range 0 to 8, peaking at m = 5.]
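The table can be reproduced numerically. The sketch below is not part of the original slideshow; it assumes Python with NumPy and SciPy is available and uses scipy.stats.norm.pdf to evaluate the two densities for each hypothesized m and then to trace the joint density over a grid (the printed values agree with the table up to rounding).

    # Numerical check of the table above (illustrative sketch; assumes NumPy and SciPy).
    import numpy as np
    from scipy.stats import norm

    for m in [3.5, 4.0, 4.5, 5.0, 5.5]:
        p4 = norm.pdf(4.0, loc=m, scale=1.0)
        p6 = norm.pdf(6.0, loc=m, scale=1.0)
        print(f"m = {m}: p(4) = {p4:.4f}, p(6) = {p6:.4f}, L = {p4 * p6:.4f}")

    # The joint density over a grid of m values, as in the lower diagram.
    grid = np.linspace(0.0, 8.0, 801)
    L = norm.pdf(4.0, grid, 1.0) * norm.pdf(6.0, grid, 1.0)
    print("Joint density is greatest at m =", grid[np.argmax(L)])   # 5.0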

10

Now we will look at the mathematics of the example. If X is normally distributed with mean m and standard deviation s, its density function is as shown.

f(X) = \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X - m}{s}\right)^2}

11

For the time being, we are assuming s is equal to 1, so the density function simplifies to the second expression.

f(X) = \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X - m}{s}\right)^2}

f(X) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X - m)^2}

12

Hence we obtain the probability densities for the observations where X = 4 and X = 6.

f(4) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(4 - m)^2}

f(6) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(6 - m)^2}

13

The joint probability density for the two observations in the sample is just the product of their individual densities.

joint density = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(4 - m)^2} \times \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(6 - m)^2}

14

In maximum likelihood estimation we choose as our estimate of m the value that gives us the greatest joint density for the observations in our sample. This value is associated with the greatest probability, or maximum likelihood, of obtaining the observations in the sample.


15

In the graphical treatment we saw that this occurs when m is equal to 5. We will prove this must be the case mathematically.


16

To do this, we treat the sample values X = 4 and X = 6 as given and we use the calculus to determine the value of m that maximizes the expression.

L(m \mid 4, 6) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(4 - m)^2} \times \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(6 - m)^2}

17

When it is regarded in this way, the expression is called the likelihood function for m, given the sample observations 4 and 6. This is the meaning of L(m | 4, 6).


18

To maximize the expression, we could differentiate with respect to m and set the result equal to 0. This would be a little laborious. Fortunately, we can simplify the problem with a trick.


19

log L is a monotonically increasing function of L (meaning that log L increases if L increases and decreases if L decreases).



20

It follows that the value of m which maximizes log L is the same as the one that maximizes L. As it so happens, it is easier to maximize log L with respect to m than it is to maximize L.

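As a quick numerical illustration of the trick (a sketch that is not part of the slides, assuming NumPy and SciPy are available), the grid of joint densities used earlier peaks at the same value of m whether we scan L or log L:

    # The peak of log L is at the same m as the peak of L, because log is monotonically increasing.
    import numpy as np
    from scipy.stats import norm

    grid = np.linspace(0.0, 8.0, 801)
    L = norm.pdf(4.0, grid, 1.0) * norm.pdf(6.0, grid, 1.0)

    print(grid[np.argmax(L)], grid[np.argmax(np.log(L))])   # both print 5.0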


21

The logarithm of the product of the density functions can be decomposed as the sum of their logarithms.

\log L = \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(4 - m)^2} \times \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(6 - m)^2} \right)
       = \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(4 - m)^2} \right) + \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(6 - m)^2} \right)

22

Using the product rule a second time, we can decompose each term as shown.

\log L = \log\frac{1}{\sqrt{2\pi}} + \log e^{-\frac{1}{2}(4 - m)^2} + \log\frac{1}{\sqrt{2\pi}} + \log e^{-\frac{1}{2}(6 - m)^2}

23

Now one of the basic rules for manipulating logarithms allows us to rewrite the second term as shown.

\log L = 2\log\frac{1}{\sqrt{2\pi}} + \log e^{-\frac{1}{2}(4 - m)^2} + \log e^{-\frac{1}{2}(6 - m)^2}

\log e^{-\frac{1}{2}(X - m)^2} = -\frac{1}{2}(X - m)^2 \log e \qquad \left( \log a^b = b \log a \right)

24

log e is equal to 1, another basic logarithm result. (Remember, as always, we are using natural logarithms, that is, logarithms to base e.)

-\frac{1}{2}(X - m)^2 \log e = -\frac{1}{2}(X - m)^2

25

Hence the second term reduces to a simple quadratic in X. And so does the fourth.

\log L = 2\log\frac{1}{\sqrt{2\pi}} - \frac{1}{2}(4 - m)^2 - \frac{1}{2}(6 - m)^2

26

We will now choose m so as to maximize this expression.


27

Quadratic terms of the type in the expression can be expanded as shown.

-\frac{1}{2}(a - m)^2 = -\frac{1}{2}\left( a^2 - 2am + m^2 \right) = -\frac{1}{2}a^2 + am - \frac{1}{2}m^2

28

Thus we obtain the differential of the quadratic term.

\frac{d}{dm}\left( -\frac{1}{2}(a - m)^2 \right) = a - m

29

Applying this result, we obtain the differential of log L with respect to m. (The first term in the expression for log L disappears completely since it is not a function of m.)

\frac{d \log L}{dm} = (4 - m) + (6 - m)

30

Thus from the first order condition we confirm that 5 is the value of m that maximizes the log-likelihood function, and hence the likelihood function.

\frac{d \log L}{dm} = 0 \quad \Rightarrow \quad \hat{m} = 5
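The same value can be recovered by maximizing the log-likelihood numerically. The sketch below is not part of the slides; it assumes SciPy is available and minimizes -log L with scipy.optimize.minimize_scalar.

    # Numerical check of the first-order condition (illustrative sketch; assumes NumPy and SciPy).
    import numpy as np
    from scipy.optimize import minimize_scalar

    def neg_log_L(m, xs=(4.0, 6.0)):
        # log L = 2 log(1/sqrt(2 pi)) - (1/2)(4 - m)^2 - (1/2)(6 - m)^2
        return -sum(-0.5 * np.log(2 * np.pi) - 0.5 * (x - m) ** 2 for x in xs)

    print(minimize_scalar(neg_log_L).x)   # approximately 5.0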

31

Note that a caret mark has been placed over m, because we are now talking about the specific value of m that maximizes the log-likelihood.


32

Note also that the second differential of log L with respect to m is –2. Since this is negative, we have found a maximum, not a minimum.

\frac{d^2 \log L}{dm^2} = -2

33

We will generalize this result to a sample of n observations X1,...,Xn. The probability density for Xi is given by the first line.

f(X_i) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_i - m)^2}


34

The joint density function for a sample of n observations is the product of their individual densities.

L(m \mid X_1, \ldots, X_n) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_1 - m)^2} \times \cdots \times \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_n - m)^2}

35

Now treating the sample values as fixed, we can re-interpret the joint density function as the likelihood function for m, given this sample. We will find the value of m that maximizes it.


36

We will do this indirectly, as before, by maximizing log L with respect to m. The logarithm decomposes as shown.

\log L = \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_1 - m)^2} \times \cdots \times \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_n - m)^2} \right)
       = \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_1 - m)^2} \right) + \cdots + \log\left( \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}(X_n - m)^2} \right)
       = n\log\frac{1}{\sqrt{2\pi}} - \frac{1}{2}(X_1 - m)^2 - \cdots - \frac{1}{2}(X_n - m)^2

37

We differentiate log L with respect to m.

\frac{d \log L}{dm} = (X_1 - m) + \cdots + (X_n - m)

38

The first order condition for a maximum is that the differential be equal to zero.

\frac{d \log L}{dm} = 0 \quad \Rightarrow \quad \sum X_i - n\hat{m} = 0

39

Thus we have demonstrated that the maximum likelihood estimator of m is the sample mean. The second differential, –n, is negative, confirming that we have maximized log L.

n\hat{m} = \sum X_i \quad \Rightarrow \quad \hat{m} = \bar{X}

\frac{d^2 \log L}{dm^2} = -n
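The general result can be checked symbolically. The sketch below is not part of the slides; it assumes SymPy is available and uses an illustrative sample size of n = 5: the first-order condition yields the sample mean and the second derivative is -n.

    # Symbolic verification for an illustrative sample size n = 5 (assumes SymPy).
    import sympy as sp

    n = 5
    X = sp.symbols(f"X1:{n + 1}")        # X1, ..., X5
    m = sp.symbols("m")

    log_L = sum(sp.log(1 / sp.sqrt(2 * sp.pi)) - (x - m) ** 2 / 2 for x in X)

    print(sp.solve(sp.Eq(sp.diff(log_L, m), 0), m))   # the sample mean (X1 + ... + X5)/5
    print(sp.diff(log_L, m, 2))                       # -5, i.e. -n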

40

So far we have assumed that s, the standard deviation of the distribution of X, is equal to 1. We will now relax this assumption and find the maximum likelihood estimator of it.


41

We will illustrate the process graphically with the two-observation example, keeping m fixed at 5. We will start with s equal to 2.

[Figure: upper panel, the density p (vertical scale 0.0 to 0.8) over the range 0 to 9, with m fixed at 5 and s varying; lower panel, the joint density L (vertical scale 0 to 0.06) plotted against s over the range 0 to 4.]

42

With s equal to 2, the probability density is 0.1760 for both X = 4 and X = 6, and the joint density is 0.0310.

s      p(4)      p(6)      L
2.0    0.1760    0.1760    0.0310

43

Now try s equal to 1. The individual densities are 0.2420 and so the joint density, 0.0586, has increased.

s      p(4)      p(6)      L
2.0    0.1760    0.1760    0.0310
1.0    0.2420    0.2420    0.0586


44

Now try putting s equal to 0.5. The individual densities have fallen and the joint density is only 0.0117.

s      p(4)      p(6)      L
2.0    0.1760    0.1760    0.0310
1.0    0.2420    0.2420    0.0586
0.5    0.1080    0.1080    0.0117

45

The joint density has now been plotted as a function of s in the lower diagram. You can see that in this example it is greatest for s equal to 1.

[Figure: the lower panel now shows the joint density L plotted against s over the range 0 to 4, peaking at s = 1.]
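The lower diagram can be reproduced numerically. The sketch below is not part of the slides; it assumes NumPy and SciPy are available, holds m at 5, and evaluates the joint density over a grid of s (the printed values agree with the table up to rounding).

    # Joint density as a function of s, with m fixed at 5 (illustrative sketch; assumes NumPy and SciPy).
    import numpy as np
    from scipy.stats import norm

    for s in [2.0, 1.0, 0.5]:
        print(f"s = {s}: L = {norm.pdf(4.0, 5.0, s) * norm.pdf(6.0, 5.0, s):.4f}")

    s_grid = np.linspace(0.1, 4.0, 391)
    L = norm.pdf(4.0, 5.0, s_grid) * norm.pdf(6.0, 5.0, s_grid)
    print("Joint density is greatest at s =", s_grid[np.argmax(L)])   # 1.0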

46

We will now look at this mathematically, starting with the probability density function for X given m and s.

f(X_i) = \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_i - m}{s}\right)^2}

47

The joint density function for the sample of n observations is given by the second line.

f(X_i) = \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_i - m}{s}\right)^2}

L(m, s \mid X_1, \ldots, X_n) = \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_1 - m}{s}\right)^2} \times \cdots \times \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_n - m}{s}\right)^2}

48

As before, we can re-interpret this function as the likelihood function for m and s, given the sample of observations.


49

We will find the values of m and s that maximize this function. We will do this indirectly by maximizing log L.

\log L = \log\left( \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_1 - m}{s}\right)^2} \times \cdots \times \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_n - m}{s}\right)^2} \right)

50

We can decompose the logarithm as shown. To maximize it, we will set the partial derivatives with respect to m and s equal to zero.

\log L = \log\left( \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_1 - m}{s}\right)^2} \right) + \cdots + \log\left( \frac{1}{s\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X_n - m}{s}\right)^2} \right)
       = n\log\frac{1}{s\sqrt{2\pi}} - \frac{1}{2}\left(\frac{X_1 - m}{s}\right)^2 - \cdots - \frac{1}{2}\left(\frac{X_n - m}{s}\right)^2
       = n\log\frac{1}{s} + n\log\frac{1}{\sqrt{2\pi}} - \frac{1}{2s^2}\left( (X_1 - m)^2 + \cdots + (X_n - m)^2 \right)
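Before working through the two first-order conditions analytically, the same maximization can be done numerically. The sketch below is not part of the slides; it assumes NumPy and SciPy are available and minimizes -log L over m and s for the two-observation sample, recovering the values found graphically (m = 5, s = 1).

    # Numerical maximization of log L over both m and s (illustrative sketch; assumes NumPy and SciPy).
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    X = np.array([4.0, 6.0])

    def neg_log_L(params):
        m, s = params
        if s <= 0:
            return np.inf              # keep the search inside the admissible region s > 0
        return -np.sum(norm.logpdf(X, loc=m, scale=s))

    result = minimize(neg_log_L, x0=[4.0, 2.0], method="Nelder-Mead")
    print(result.x)                    # approximately [5.0, 1.0]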

51

When differentiating with respect to m, the first two terms disappear. We have already seen how to differentiate the other terms.

\frac{\partial \log L}{\partial m} = \frac{1}{s^2}\left( (X_1 - m) + \cdots + (X_n - m) \right) = \frac{1}{s^2}\left( \sum X_i - nm \right)

52

Setting the first differential equal to 0, the maximum likelihood estimate of m is the sample mean, as before.

\frac{\partial \log L}{\partial m} = 0 \quad \Rightarrow \quad \hat{m} = \bar{X}

53

Next, we take the partial differential of the log-likelihood function with respect to s.


54

Before doing so, it is convenient to rewrite the equation.

\log a^b = b\log a \qquad\qquad \log\frac{1}{s} = \log s^{-1} = -\log s

\log L = -n\log s + n\log\frac{1}{\sqrt{2\pi}} - \frac{1}{2s^2}\left( (X_1 - m)^2 + \cdots + (X_n - m)^2 \right)

55

The derivative of log s with respect to s is 1/s. The derivative of s^{-2} is -2s^{-3}.

\frac{\partial \log L}{\partial s} = -\frac{n}{s} + s^{-3}\left( (X_1 - m)^2 + \cdots + (X_n - m)^2 \right)
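This derivative can be checked symbolically. The sketch below is not part of the slides; it assumes SymPy is available and uses an illustrative sample size of n = 3.

    # Symbolic check of the derivative of log L with respect to s (assumes SymPy).
    import sympy as sp

    n = 3
    X = sp.symbols(f"X1:{n + 1}")                    # X1, X2, X3
    m, s = sp.symbols("m s", positive=True)

    log_L = sum(-sp.log(s) + sp.log(1 / sp.sqrt(2 * sp.pi)) - (x - m) ** 2 / (2 * s ** 2)
                for x in X)

    print(sp.diff(log_L, s))   # -3/s plus each squared deviation (Xi - m)**2 divided by s**3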

56

Setting the first derivative of log L to zero gives us a condition that must be satisfied by the maximum likelihood estimator.

\frac{\partial \log L}{\partial s} = 0 \quad \Rightarrow \quad -\frac{n}{\hat{s}} + \hat{s}^{-3} \sum \left( X_i - \hat{m} \right)^2 = 0

57

We have already demonstrated that the maximum likelihood estimator of m is the sample mean.

\sum \left( X_i - \bar{X} \right)^2 - n\hat{s}^2 = 0

58

Hence the maximum likelihood estimator of the population variance is the mean square deviation of X.

\hat{s}^2 = \frac{1}{n} \sum \left( X_i - \bar{X} \right)^2
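For the two-observation sample this gives \hat{s}^2 = ((4 - 5)^2 + (6 - 5)^2)/2 = 1, which matches the graphical finding that the joint density is greatest at s = 1. A minimal numerical sketch (not part of the slides; assumes NumPy is available) contrasting the divide-by-n and divide-by-(n - 1) estimators:

    # Maximum likelihood (divide by n) versus unbiased (divide by n - 1) variance estimates.
    import numpy as np

    X = np.array([4.0, 6.0])

    ml_variance = np.mean((X - X.mean()) ** 2)   # divide by n     -> 1.0
    unbiased_variance = X.var(ddof=1)            # divide by n - 1 -> 2.0

    print(ml_variance, unbiased_variance)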

59

Note that it is biased. The unbiased estimator is obtained by dividing by n – 1, not n.

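A small simulation illustrates the bias. The sketch below is not part of the slides; it assumes NumPy is available. For samples of size n = 2 drawn from a normal distribution with variance 1, the maximum likelihood estimator of the variance averages about (n - 1)/n = 0.5, while the unbiased estimator averages about 1.

    # Simulated expectations of the two variance estimators for n = 2 (assumes NumPy).
    import numpy as np

    rng = np.random.default_rng(0)
    samples = rng.normal(loc=5.0, scale=1.0, size=(100_000, 2))

    ml = np.mean((samples - samples.mean(axis=1, keepdims=True)) ** 2, axis=1)
    unbiased = samples.var(axis=1, ddof=1)

    print(ml.mean(), unbiased.mean())   # roughly 0.5 and 1.0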

60

However it can be shown that the maximum likelihood estimator is asymptotically efficient, in the sense of having a smaller mean square error than the unbiased estimator in large samples.

61

Copyright Christopher Dougherty 2012.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 10.6 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre, http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics (http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx) or the University of London International Programmes distance learning course 20 Elements of Econometrics (www.londoninternational.ac.uk/lse).

2012.12.16