21 Inference
Transcript of 21 Inference
![Page 1: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/1.jpg)
Hadley Wickham
Stat310 Inference
Tuesday, 31 March 2009
![Page 2: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/2.jpg)
1. Homework / Take home exam
2. Recap
3. Data vs. distributions
4. Estimation
1. Maximum likelihood
2. Method of moments
5. Feedback
![Page 3: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/3.jpg)
Short homework this week (but you have to do some reading).
Take-home test will be available online next Thursday.
Both the take-home and the homework are due in class on Thursday, April 9.
A study guide will be posted ASAP.
Assessment
![Page 4: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/4.jpg)
What are the 5 parameters of the bivariate normal?
If X and Y are bivariate normal, and their correlation is zero, what does that imply about X and Y? Is that usually true?
Recap
![Page 5: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/5.jpg)
Data vs. Distributions
Random experiments produce data.
A repeatable random experiment has some underlying distribution.
We want to go from the data to say something about the underlying distribution.
![Page 6: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/6.jpg)
Coin tossing
Half the class generates 100 heads and tails by flipping coins.
The other half generates 100 heads and tails just by writing down what they think the sequence would be.
Write up on the board.
I’ll come in and guess which group was which.
![Page 7: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/7.jpg)
Problem
Have some data and a probability model with unknown parameters.
Want to estimate the values of those parameters.
![Page 8: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/8.jpg)
Some definitions
Parameter space: set of all possible parameter values
Estimator: process/function which takes data and gives best guess for parameter (usually many possible estimators for a problem)
Point estimate: estimator for a single value
![Page 9: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/9.jpg)
Example
Data: 5.7 3.0 5.7 4.5 6.0 6.3 4.9 5.8 4.4 5.8
Model: Normal(?, 1)
What is the mean of the underlying distribution? (5.2?)
![Page 10: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/10.jpg)
Uncertainty
Also want to be able to quantify how certain/confident we are in our answer.
How close is our estimate to the true mean?
![Page 11: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/11.jpg)
Simulation
One approach to find the answer is to use simulation, i.e., set up a case where we know what the true answer is and see what happens.
X ~ Normal(5, 1)
Draw 10 numbers from this distribution and calculate their average.
![Page 12: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/12.jpg)
3.1 3.4 5.1 4.9 2.2 4.4 4.2 3.9 5.6 4.9 4.2
5.9 2.8 6.0 5.1 2.7 6.5 4.2 4.9 4.6 4.4 4.7
5.0 5.3 5.3 5.1 5.4 4.7 4.7 4.4 5.9 4.2 5.0
4.3 5.4 5.5 4.9 3.1 4.1 4.8 3.6 6.8 5.5 4.8
3.8 6.1 3.8 5.2 5.7 5.2 3.2 5.2 5.3 2.3 4.6
5.6 6.0 5.5 5.5 5.1 7.3 5.4 6.1 4.4 4.9 5.6
![Page 13: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/13.jpg)
[Histogram of the 1000 simulated sample means: samp on the x-axis (4.0 to 6.0), count on the y-axis (0 to 120)]
Repeat 1000 times.
95% of values lie between 4.5 and 5.6.
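The simulation described above can be sketched in a few lines (Python used for illustration; the random seed and the array shape are my assumptions, not part of the slides):

```python
import numpy as np

# Draw 10 values from Normal(5, 1), average them, and repeat 1000 times,
# as described on the slides.
rng = np.random.default_rng(0)
means = rng.normal(loc=5, scale=1, size=(1000, 10)).mean(axis=1)

# The middle 95% of the simulated averages.
lo, hi = np.percentile(means, [2.5, 97.5])
print(lo, hi)  # roughly 4.4 and 5.6
```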
![Page 14: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/14.jpg)
Theory
From Tuesday, we know what the distribution of the average is. Write it down.
Create a 95% confidence interval.
How does it compare to the simulation?
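A sketch of the theory calculation (using the standard result that the average of n iid Normal(5, 1) draws is Normal(5, 1/n)):

```python
import math

# The average of n = 10 draws from Normal(5, 1) is Normal(5, 1/10),
# so its standard deviation is 1/sqrt(10).
n = 10
se = 1 / math.sqrt(n)      # ~0.316
half_width = 1.96 * se     # ~0.62
ci = (5 - half_width, 5 + half_width)
print(ci)  # about (4.38, 5.62), close to the simulated interval
```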
![Page 15: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/15.jpg)
Why is the mean of the data a good estimate of μ? Are there other estimators that might be as good or better?
In general, how can we figure out an estimator for a parameter of a distribution?
Why the mean?
![Page 16: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/16.jpg)
Maximum likelihood
Method of moments
![Page 17: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/17.jpg)
Maximum likelihood
Write down the log-likelihood (i.e., given this data, how likely is it that it was generated by this parameter?)
Find the maximum (i.e., differentiate and set to zero)
![Page 18: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/18.jpg)
Example
X ~ Binomial(10, p?)
Here is some data drawn from that random experiment: 4 5 1 5 3 2 4 2 2 4
We know the joint pmf because the observations are independent. We can try out various values of p and see which is most likely.
![Page 19: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/19.jpg)
Your turn
Write down the joint pmf for X1, X2, …, Xn ~ Binomial(n, p)
Try evaluating it for x = (4 5 1 5 3 2 4 2 2 4), n = 10, p = 0.1
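The exercise above can be checked numerically (a sketch in Python; the grid resolution is an arbitrary choice of mine):

```python
from math import comb

# Data from the slide: 10 independent observations from Binomial(10, p).
data = [4, 5, 1, 5, 3, 2, 4, 2, 2, 4]
n = 10

def likelihood(p):
    # Joint pmf: product of the individual Binomial(10, p) pmfs.
    out = 1.0
    for x in data:
        out *= comb(n, x) * p**x * (1 - p)**(n - x)
    return out

print(likelihood(0.1))  # tiny: p = 0.1 explains the data poorly

# Grid search: try many candidate values of p, keep the most likely.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=likelihood)
print(best)  # 0.32
```

The grid maximizer agrees with the analytic maximum likelihood estimate (mean / n = 3.2 / 10 = 0.32) derived later.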
![Page 20: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/20.jpg)
[Plot of the joint pmf (prob, 0 to 3.0e−08) against p (0.0 to 1.0), peaked between 0.2 and 0.4]
Try 10 different values of p.
![Page 21: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/21.jpg)
[Plot of the joint pmf (prob, 0 to 3.5e−08) against p (0.0 to 1.0) over a finer grid, peaked near 0.3]
Try 100 different values of p.
True p is 0.3
![Page 22: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/22.jpg)
Calculus
Can do the same analytically with calculus. We want to find the maximum of the pmf with respect to p. (How do we do this?)
Normally we call this the likelihood when we think of the x’s as fixed and the parameters as varying.
Usually easier to work with the log pmf (why?)
![Page 23: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/23.jpg)
Steps
Write out log-likelihood
(Discard constants)
Differentiate and set to 0
(Check the second derivative is negative)
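For the Binomial example, the steps work out as follows (a sketch, writing N for the number of observations and n = 10 for the number of trials):

```latex
\ell(p) = \sum_{i=1}^{N}\left[\log\binom{n}{x_i} + x_i\log p + (n - x_i)\log(1-p)\right]

\frac{d\ell}{dp} = \frac{\sum x_i}{p} - \frac{Nn - \sum x_i}{1-p} = 0
\quad\Rightarrow\quad
\hat{p} = \frac{\sum x_i}{Nn} = \frac{\bar{x}}{n}

\frac{d^2\ell}{dp^2} = -\frac{\sum x_i}{p^2} - \frac{Nn - \sum x_i}{(1-p)^2} < 0
```

The binomial coefficients are the constants that can be discarded, and the negative second derivative confirms a maximum.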
![Page 24: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/24.jpg)
Analytically
Mean of x’s is 3.2
n = 10
Maximum likelihood estimate of p for this example is 0.32
![Page 25: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/25.jpg)
Method of moments
We know how to calculate sample moments (e.g. mean and variance of data)
We know what the moments of the distribution are in terms of the parameters.
Why not just match them up?
![Page 26: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/26.jpg)
Binomial
E(X) = np, Var(X) = np(1-p)
![Page 27: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/27.jpg)
p = mean / n = 3.2 / 10 = 0.32
![Page 28: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/28.jpg)
p(1-p) = var / n = 2 / 10 = 0.2
![Page 29: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/29.jpg)
-p² + p - 0.2 = 0
![Page 30: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/30.jpg)
p = 0.276 or 0.724
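The quadratic can be checked numerically (a sketch of the slide's arithmetic, using the quadratic formula):

```python
# Solve the variance-matching quadratic -p^2 + p - 0.2 = 0.
a, b, c = -1.0, 1.0, -0.2
disc = b**2 - 4 * a * c  # 1 - 0.8 = 0.2
roots = sorted([(-b + disc**0.5) / (2 * a),
                (-b - disc**0.5) / (2 * a)])
print([round(r, 3) for r in roots])  # [0.276, 0.724]
```

Note that neither root equals the mean-based estimate 0.32: matching different moments can give different estimators.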
![Page 31: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/31.jpg)
Your turn
What are the method of moments estimators for the mean and variance of the normal distribution?
What about the gamma distribution?
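One possible answer to the exercise above (a sketch; the gamma result assumes the shape-rate parameterisation, with mean α/λ and variance α/λ²):

```latex
\text{Normal: } \hat\mu = \bar{x}, \qquad
\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2

\text{Gamma: } \frac{\hat\alpha}{\hat\lambda} = \bar{x}, \quad
\frac{\hat\alpha}{\hat\lambda^2} = s^2
\quad\Rightarrow\quad
\hat\lambda = \frac{\bar{x}}{s^2}, \quad
\hat\alpha = \frac{\bar{x}^2}{s^2}
```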
![Page 32: 21 Inference](https://reader033.fdocuments.in/reader033/viewer/2022052907/5592a2441a28ab30238b479b/html5/thumbnails/32.jpg)
Feedback