Method of moments Maximum likelihood Asymptotic normality Optimality Delta method Parametric bootstrap Quiz
Method of moments and Maximum likelihood estimation
Botond Szabo
Leiden University
Leiden, 11. 03. 2019
Outline
1 Method of moments
2 Maximum likelihood
3 Asymptotic normality
4 Optimality
5 Delta method
6 Parametric bootstrap
7 Quiz
Method of moments
Recall: the jth moment of the random variable X ∼ f_θ is

α_j(θ) := E_θ X^j = ∫ x^j f(x; θ) dx.

The jth sample moment is defined as

α̂_j := n^{-1} ∑_{i=1}^n X_i^j.

For θ = (θ_1, . . . , θ_k) the method of moments estimator θ̂_n is defined as the solution of the system

α_1(θ) = α̂_1,
. . .
α_k(θ) = α̂_k.
Examples I
Example
Let X_1, . . . , X_n ∼ Bernoulli(p). Then α_1(p) = E_p X_1 = p and α̂_1 = n^{-1} ∑_{i=1}^n X_i. By equating these we get

p̂_n = n^{-1} ∑_{i=1}^n X_i.
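This estimator is just the sample proportion; as a sanity check, here is a minimal Python sketch on simulated data (the true value p = 0.3 and the sample size are illustrative assumptions, not from the text):

```python
import random

random.seed(0)

# Simulated Bernoulli data; p_true = 0.3 is an illustrative value.
n, p_true = 10_000, 0.3
xs = [1 if random.random() < p_true else 0 for _ in range(n)]

# Method of moments: set the first moment alpha_1(p) = p equal to the sample mean.
p_hat = sum(xs) / n
print(p_hat)  # close to 0.3 for large n
```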
Examples II
Example
Let X_1, . . . , X_n ∼ N(µ, σ²). Then α_1 = E X_1 = µ and α_2 = E X_1² = µ² + σ². Furthermore α̂_1 = n^{-1} ∑_{i=1}^n X_i and α̂_2 = n^{-1} ∑_{i=1}^n X_i². Solving the equations

µ = n^{-1} ∑_{i=1}^n X_i, and µ² + σ² = n^{-1} ∑_{i=1}^n X_i²,

we get that

µ̂_n = n^{-1} ∑_{i=1}^n X_i = X̄_n, and σ̂²_n = n^{-1} ∑_{i=1}^n (X_i − X̄_n)².
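The two moment equations can also be solved numerically on simulated data; in this minimal Python sketch the true values µ = 2 and σ = 1.5 are illustrative assumptions:

```python
import math
import random

random.seed(1)

# Simulated N(mu, sigma^2) data; mu = 2, sigma = 1.5 are illustrative values.
n, mu_true, sigma_true = 20_000, 2.0, 1.5
xs = [random.gauss(mu_true, sigma_true) for _ in range(n)]

# First two sample moments.
a1 = sum(xs) / n
a2 = sum(x * x for x in xs) / n

# Solve mu = a1 and mu^2 + sigma^2 = a2 for (mu, sigma^2).
mu_hat = a1
sigma2_hat = a2 - a1 * a1  # identical to n^{-1} sum (X_i - Xbar)^2
print(mu_hat, math.sqrt(sigma2_hat))  # close to (2, 1.5)
```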
Properties
Theorem
Let θ̂_n denote the method of moments estimator. Under appropriate conditions on the model, the following statements hold:
The estimate θ̂_n exists with probability tending to one.
The estimate is consistent, i.e. θ̂_n → θ in probability.
The estimate is asymptotically normal (the precise form is given in the book).
Likelihood
Definition
Let X_1, . . . , X_n be IID with PDF (or PMF) f(x; θ). The likelihood function is defined by

L_n(θ) = ∏_{i=1}^n f(X_i; θ).

The log-likelihood function is defined by ℓ_n(θ) = log L_n(θ).
Maximum likelihood estimator
Definition
The maximum likelihood estimator (MLE), denoted by θ̂_n, is the value of θ that maximises L_n(θ) (or, equivalently, log L_n(θ)).
Example
Example
Suppose that X_1, . . . , X_n ∼ Bernoulli(p). The PMF is f(x; p) = p^x (1 − p)^{1−x} for x = 0, 1. The unknown parameter is p. Then

L_n(p) = ∏_{i=1}^n f(X_i; p) = ∏_{i=1}^n p^{X_i} (1 − p)^{1−X_i} = p^S (1 − p)^{n−S},

where S = ∑_{i=1}^n X_i. Therefore,

ℓ_n(p) = S log p + (n − S) log(1 − p).

To find the maximum, take the derivative of ℓ_n(p), set it to zero and find that p̂_n = S/n.
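The calculus can be double-checked with a grid search over p; the data summary n = 20, S = 12 below is an illustrative assumption (the same values as in the figure that follows):

```python
import math

# Illustrative data summary: n = 20 Bernoulli trials with S = 12 successes.
n, S = 20, 12

def loglik(p):
    # Log-likelihood from the slide: S log p + (n - S) log(1 - p).
    return S * math.log(p) + (n - S) * math.log(1 - p)

# Maximise over a fine grid; the peak should sit at S/n.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=loglik)
print(p_mle)  # 0.6 = S/n
```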
Example
Likelihood for Bernoulli with n = 20 and ∑_{i=1}^n X_i = 12.
The MLE is p̂_n = 12/20 = 0.6.
Example
Example
Let X_1, . . . , X_n ∼ N(µ, σ²). The unknown parameter is θ = (µ, σ²) and the likelihood (up to some constants not depending on θ) is

L_n(θ) = ∏_{i=1}^n (1/σ) exp(−(X_i − µ)²/(2σ²))
= σ^{−n} exp(−(1/(2σ²)) ∑_{i=1}^n (X_i − µ)²)
= σ^{−n} exp(−nS²/(2σ²)) exp(−n(X̄_n − µ)²/(2σ²)),

where X̄_n = n^{-1} ∑_{i=1}^n X_i and S² = n^{-1} ∑_{i=1}^n (X_i − X̄_n)².
Example (continued)
Example
The log-likelihood is

ℓ(µ, σ) = −n log σ − nS²/(2σ²) − n(X̄_n − µ)²/(2σ²).

Solving the equations

∂ℓ(µ, σ)/∂µ = 0, ∂ℓ(µ, σ)/∂σ = 0,

we conclude that µ̂_n = X̄_n and σ̂_n = S (it can be verified that these are the global maxima of the likelihood).
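These closed-form solutions can be verified numerically; the sketch below (illustrative truth µ = 1, σ = 0.8) computes them and checks that the log-likelihood is locally maximal there:

```python
import math
import random

random.seed(2)

# Illustrative data from N(1, 0.8^2).
n = 5_000
xs = [random.gauss(1.0, 0.8) for _ in range(n)]

# Closed-form MLEs from the slide: sample mean and root mean squared deviation.
mu_hat = sum(xs) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / n)

def loglik(mu, sigma):
    # Log-likelihood up to an additive constant.
    return -n * math.log(sigma) - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2)

# The likelihood at (mu_hat, sigma_hat) beats nearby parameter values.
best = loglik(mu_hat, sigma_hat)
for dmu in (-0.05, 0.05):
    for dsig in (-0.05, 0.05):
        assert loglik(mu_hat + dmu, sigma_hat + dsig) < best
print(mu_hat, sigma_hat)  # close to (1, 0.8)
```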
Properties of maximum likelihood estimators
Theorem
Under suitable conditions on f(x; θ), the MLE θ̂_n is consistent:

θ̂_n → θ in probability.

Theorem
Let τ = g(θ) be a function of θ. Let θ̂_n be an MLE of θ. Then τ̂_n = g(θ̂_n) is an MLE of τ.
Example
Let X_1, . . . , X_n ∼ N(θ, 1). The MLE for θ is θ̂_n = X̄_n. Let τ = e^θ. Then the MLE for τ is τ̂_n = e^{X̄_n}.
Score function and Fisher information
Definition
The score function is defined by

s(x; θ) = ∂ log f(x; θ)/∂θ.

The Fisher information in n IID observations X_1, . . . , X_n ∼ f(x; θ) is

I_n(θ) = V_θ(∑_{i=1}^n s(X_i; θ)) = ∑_{i=1}^n V_θ(s(X_i; θ)) = n V_θ(s(X_1; θ)).
Properties
Theorem
For n = 1 write I(θ) instead of I_1(θ). One has E_θ[s(X_1; θ)] = 0 and hence V_θ(s(X_1; θ)) = E_θ[s²(X_1; θ)].
Theorem
One has I_n(θ) = n I(θ). Also,

I(θ) = −E_θ[∂² log f(X; θ)/∂θ²] = −∫ (∂² log f(x; θ)/∂θ²) f(x; θ) dx.
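For the Bernoulli model both identities can be verified exactly by summing over the two outcomes; p = 0.3 in this sketch is an illustrative value:

```python
# Check the score/Fisher-information identities for Bernoulli(p), computed
# exactly over the two outcomes x in {0, 1} rather than by simulation.
p = 0.3  # illustrative value

def score(x, p):
    # d/dp log f(x; p) with f(x; p) = p^x (1 - p)^(1 - x)
    return x / p - (1 - x) / (1 - p)

def d2_logf(x, p):
    # Second derivative of log f(x; p) in p.
    return -x / p ** 2 - (1 - x) / (1 - p) ** 2

pmf = {0: 1 - p, 1: p}
mean_score = sum(pmf[x] * score(x, p) for x in (0, 1))
var_score = sum(pmf[x] * score(x, p) ** 2 for x in (0, 1))
neg_e_d2 = -sum(pmf[x] * d2_logf(x, p) for x in (0, 1))

print(mean_score, var_score, neg_e_d2)  # 0, 1/(p(1-p)), 1/(p(1-p))
```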
Asymptotic normality
Theorem
Let θ̂_n be an MLE and se = √(V_θ[θ̂_n]). Under appropriate conditions:
1 se ≈ √(1/I_n(θ)) and (θ̂_n − θ)/se ⇝ N(0, 1).
2 Let ŝe = √(1/I_n(θ̂_n)). Then (θ̂_n − θ)/ŝe ⇝ N(0, 1).
(Here ⇝ denotes convergence in distribution.)
Confidence interval
Theorem
Let
C_n = (θ̂_n − z_{α/2} ŝe, θ̂_n + z_{α/2} ŝe).
Then P_θ(θ ∈ C_n) → 1 − α as n → ∞.
In particular, for α = 0.05, z_{α/2} = 1.96 ≈ 2, and
θ̂_n ± 2 ŝe
is an approximate 95% confidence interval.
Example
Example
Let X_1, . . . , X_n be IID Poisson(λ). Then it can be shown that λ̂_n = X̄_n and I(λ) = 1/λ, so that

ŝe = 1/√(n I(λ̂_n)) = √(λ̂_n/n).

Therefore, an approximate 1 − α confidence interval for λ is

λ̂_n ± z_{α/2} √(λ̂_n/n).
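A minimal simulation illustrates this interval; the true rate λ = 4, the sample size, and the Poisson sampler (Knuth's multiplication method) are assumptions of the example:

```python
import math
import random

random.seed(3)

def poisson_draw(lam):
    # Knuth's multiplication method: count uniforms until their product
    # drops below exp(-lam).
    threshold = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k

# Illustrative setup: true rate lambda = 4, n = 2000 observations.
n, lam_true = 2_000, 4.0
xs = [poisson_draw(lam_true) for _ in range(n)]

# MLE and Wald interval from the slide: lambda_hat +- z_{alpha/2} sqrt(lambda_hat / n).
lam_hat = sum(xs) / n
se_hat = math.sqrt(lam_hat / n)
lo, hi = lam_hat - 1.96 * se_hat, lam_hat + 1.96 * se_hat
print(f"lambda_hat = {lam_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```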
MLE vs. median
Example
Let X_1, . . . , X_n be IID N(θ, 1). The MLE for θ is θ̂_n = X̄_n. Another reasonable estimator for θ is the sample median θ̃_n.
The MLE satisfies

√n(θ̂_n − θ) ⇝ N(0, 1).

The sample median can be shown to satisfy

√n(θ̃_n − θ) ⇝ N(0, π/2).

The sample median converges to the right value, but has a larger variance than the MLE.
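This variance gap is easy to see by simulation; the sample size and replication count below are illustrative choices:

```python
import math
import random
import statistics

random.seed(4)

# Compare the variance of sqrt(n)*(estimator - theta) for the mean (MLE)
# and the median across many replications of N(0, 1) samples.
n, reps = 400, 3_000
scaled_means, scaled_medians = [], []
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    scaled_means.append(math.sqrt(n) * statistics.fmean(xs))
    scaled_medians.append(math.sqrt(n) * statistics.median(xs))

v_mean = statistics.pvariance(scaled_means)
v_median = statistics.pvariance(scaled_medians)
print(v_mean, v_median)  # roughly 1 and pi/2 = 1.57
```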
Efficiency of MLE
Theorem
Let θ̂_n be an MLE and θ̃_n (almost) any other estimator. Suppose

√n(θ̂_n − θ) ⇝ N(0, σ²_MLE),
√n(θ̃_n − θ) ⇝ N(0, σ²_tilde).

Define the asymptotic relative efficiency as

ARE(θ̃_n, θ̂_n) = σ²_MLE / σ²_tilde.

Then ARE(θ̃_n, θ̂_n) ≤ 1. Thus the MLE has the smallest (asymptotic) variance and we say that the MLE is optimal or asymptotically efficient.
Delta method
Theorem
If τ = g(θ), where g is differentiable with g′(θ) ≠ 0, then for τ̂_n = g(θ̂_n),

(τ̂_n − τ)/ŝe(τ̂_n) ⇝ N(0, 1).

Here
ŝe(τ̂_n) = |g′(θ̂_n)| ŝe(θ̂_n).
Hence, if
C_n = (τ̂_n − z_{α/2} ŝe(τ̂_n), τ̂_n + z_{α/2} ŝe(τ̂_n)),
then P_θ(τ ∈ C_n) → 1 − α as n → ∞.
Example
Example
Let X_1, . . . , X_n ∼ N(0, σ²). Suppose we want to estimate τ = log σ. The log-likelihood is

ℓ(σ) = −n log σ − (1/(2σ²)) ∑_{i=1}^n X_i².

Differentiate ℓ(σ) and set the derivative to zero to conclude that

σ̂_n = √(n^{-1} ∑_{i=1}^n X_i²).

With some effort we can show that I(σ) = 2/σ². Therefore, ŝe(σ̂_n) = σ̂_n/√(2n).
Example (continued)
Example
Now τ̂_n = log σ̂_n. Since (log σ)′ = 1/σ, we get that

ŝe(τ̂_n) = (1/σ̂_n) · σ̂_n/√(2n) = 1/√(2n),

and an approximate 95% confidence interval for τ is

τ̂_n ± √(2/n).
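The whole pipeline (MLE of σ, plug-in for τ, delta-method standard error) fits in a few lines; the true value σ = 2 and the sample size are illustrative assumptions:

```python
import math
import random

random.seed(5)

# Delta-method CI for tau = log(sigma) from N(0, sigma^2) data.
n, sigma_true = 1_000, 2.0
xs = [random.gauss(0.0, sigma_true) for _ in range(n)]

sigma_hat = math.sqrt(sum(x * x for x in xs) / n)  # MLE of sigma
tau_hat = math.log(sigma_hat)                      # plug-in MLE of tau

# se(tau_hat) = |1/sigma_hat| * sigma_hat / sqrt(2n) = 1/sqrt(2n), free of sigma.
se_tau = 1.0 / math.sqrt(2 * n)
lo, hi = tau_hat - 1.96 * se_tau, tau_hat + 1.96 * se_tau
print(f"tau_hat = {tau_hat:.4f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```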
Parametric bootstrap
The bootstrap is a method for estimating standard errors, computing confidence intervals and other quantities that might be difficult to compute otherwise.
In parametric models we know that X_1, . . . , X_n ∼ f(x; θ).
Suppose we want to approximate the standard error of T(X_1, . . . , X_n), denoted by se = √(V_θ[T]).
The idea of the parametric bootstrap is to approximate se with √(V_{θ̂_n}[T]), where θ̂_n is e.g. the MLE. However, in many cases it is difficult to compute V_{θ̂_n}[T]. Idea: approximate it via simulation.
Parametric bootstrap: sampling
Sample X*_1, . . . , X*_n from f(x; θ̂_n).
Compute the estimator T using the “new sample” X*_1, . . . , X*_n:
T* = T(X*_1, . . . , X*_n).
Repeat this procedure B times, resulting in T*_1, . . . , T*_B.
Denote the average of the outcomes by T̄*.
Approximate V_{θ̂_n}[T] by

se²_boot = B^{-1} ∑_{b=1}^B (T*_b − T̄*)².
Example
Example
Let X_1, . . . , X_n ∼ N(0, σ²). The MLE of σ is

σ̂_n = √(n^{-1} ∑_{i=1}^n X_i²).

We use the parametric bootstrap to approximate se.
Simulate X*_1, . . . , X*_n ∼ N(0, σ̂²_n) and compute σ̂* = √(n^{-1} ∑_{i=1}^n X*_i²). Next repeat this process B times to get σ̂*_1, . . . , σ̂*_B. Finally, set

se_boot = √( B^{-1} ∑_{b=1}^B (σ̂*_b − B^{-1} ∑_{j=1}^B σ̂*_j)² ).
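The procedure translates directly into code; σ = 1.5, n, and B below are illustrative choices, and the asymptotic formula ŝe = σ̂_n/√(2n) from the delta-method example serves as a benchmark:

```python
import math
import random
import statistics

random.seed(6)

# Parametric bootstrap for se(sigma_hat) with N(0, sigma^2) data.
n, sigma_true, B = 500, 1.5, 2_000
xs = [random.gauss(0.0, sigma_true) for _ in range(n)]

def sigma_mle(sample):
    # MLE of sigma for the N(0, sigma^2) model.
    return math.sqrt(sum(x * x for x in sample) / len(sample))

sigma_hat = sigma_mle(xs)

# Draw B bootstrap samples from N(0, sigma_hat^2) and recompute the estimator.
boot = [sigma_mle([random.gauss(0.0, sigma_hat) for _ in range(n)]) for _ in range(B)]
se_boot = statistics.pstdev(boot)

# Asymptotic benchmark: se = sigma / sqrt(2n).
se_asym = sigma_hat / math.sqrt(2 * n)
print(se_boot, se_asym)  # the two should be close
```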
Question 1
In what sense is the MLE optimal?
Answers:
1 It has the smallest asymptotic variance.
2 It is unbiased.
3 It is a normally distributed random variable and therefore easyto work with.
4 It is the value minimizing the likelihood function.
Question 2
What is the Delta method good for?
Answers:
1 Compute the MLE for a function of the parameter τ = g(θ).
2 Construct a confidence set for a function of the parameter τ = g(θ).
3 Compute the Fisher information.
4 Compute the relative efficiency.
Question 3
If τ = g(θ) and τ̂_n = g(θ̂_n), then what is the estimated standard error of τ̂_n?
Answers:
1 ŝe(τ̂_n) = ŝe(θ̂_n).
2 ŝe(τ̂_n) = ŝe(θ̂_n)/|g′(θ̂_n)|.
3 ŝe(τ̂_n) = |g(θ̂_n)| ŝe(θ̂_n).
4 ŝe(τ̂_n) = |g′(θ̂_n)| ŝe(θ̂_n).
Question 4
What is the bootstrap method good for?
Answers:
1 To estimate standard error and compute confidence intervals.
2 To help put on boots.
3 To compute the MLE.
4 To compute the Fisher information.
Question 5
What does the star denote in X*_1, . . . , X*_n in the bootstrap method?
Answers:
1 They are a permutation of X_1, . . . , X_n.
2 They are a “new” sample (with replacement) from the numbers X_1, . . . , X_n.
3 They are a “new” sample (without replacement) from the numbers X_1, . . . , X_n.
4 They are new draws from the distribution F .
Question 6
How do we estimate the variance using the bootstrap sample?
Answers:
1 B^{-1} ∑_{b=1}^B (T*_b − T̄*)².
2 B^{-1} ∑_{b=1}^B (X_b − T)².
3 B^{-1} ∑_{b=1}^B (T*_b − T)².
4 B^{-1} ∑_{b=1}^B (X_b − T̄*)².