Chapter 6: Statistics and Sampling Distributions
MATH450
September 14th, 2017
MATH450 Chapter 6: Statistics and Sampling Distributions
Where are we?
Week 1 • Chapter 1: Descriptive statistics
Week 2 • Chapter 6: Statistics and Sampling Distributions
Week 4 • Chapter 7: Point Estimation
Week 7 • Chapter 8: Confidence Intervals
Week 10 • Chapter 9: Test of Hypothesis
Week 13 • Two-sample inference, ANOVA, regression
Where are we?
6.1 Statistics and their distributions
6.2 The distribution of the sample mean
6.3 The distribution of a linear combination
Order: 6.1 → 6.3 → 6.2
Questions for this chapter
Given a random sample $X_1, X_2, \ldots, X_n$, and
$T = a_1 X_1 + a_2 X_2 + \ldots + a_n X_n$
Last week: If we know the distribution of the $X_i$'s, can we obtain the distribution of $T$?
Tuesday: What if the $X_i$'s follow normal distributions? Computations with normal random variables.
Today: If we don't know the distribution of the $X_i$'s, can we still obtain/approximate the distribution of $T$?
How to do computations with normal random variables?
$N(\mu, \sigma^2)$
$\Phi(z)$
$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\, dy$
Shifting and scaling normal random variables
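The standard fact behind shifting and scaling: if $X \sim N(\mu, \sigma^2)$, then $Z = (X - \mu)/\sigma$ is standard normal, so $P(X \le x) = \Phi((x - \mu)/\sigma)$. A minimal Python sketch of this computation follows. The function names `phi` and `normal_cdf` and the numbers plugged in are illustrative assumptions, not part of the slides.

```python
import math

def phi(z):
    """Standard normal cdf, Phi(z) = P(Z <= z), written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), obtained by standardizing X."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return phi((x - mu) / sigma)

# Illustrative values only: X ~ N(520, 10^2), probability that X <= 530.
p = normal_cdf(530.0, 520.0, 10.0)
print(round(p, 4))
```

Here 530 lies one standard deviation above the mean, so the result equals $\Phi(1)$.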
$\Phi(z)$
Example 3
Problem
Two airplanes are flying in the same direction in adjacent parallel
corridors. At time t = 0, the first airplane is 10 km ahead of the
second one.
Suppose the speed of the first plane (km/h) is normally distributed
with mean 520 and standard deviation 10, and the second plane's
speed, independent of the first, is also normally distributed, with
mean 500 and standard deviation 10.
What is the probability that after 2h of flying, the second plane
has not caught up to the first plane?
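One possible way to set this up (a sketch, not necessarily the solution given in class): let $X_1$ and $X_2$ be the two speeds. After 2 h the gap between the planes is $D = 10 + 2X_1 - 2X_2$, a linear combination of independent normal variables, so $D$ is normal and the answer is $P(D > 0)$. The variable names below are illustrative.

```python
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

hours = 2.0
head_start_km = 10.0
mu1, sd1 = 520.0, 10.0   # first plane's speed (km/h)
mu2, sd2 = 500.0, 10.0   # second plane's speed (km/h)

# Gap after 2 h: D = head_start + hours * X1 - hours * X2
mean_gap = head_start_km + hours * (mu1 - mu2)        # 10 + 2 * 20 = 50
var_gap = (hours * sd1) ** 2 + (hours * sd2) ** 2     # 400 + 400 = 800
sd_gap = math.sqrt(var_gap)

# The second plane has not caught up when D > 0.
p_not_caught = 1.0 - phi((0.0 - mean_gap) / sd_gap)
print(round(p_not_caught, 3))
```

Under these assumptions the probability comes out at roughly 0.96.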
$\Phi(z)$
What if the $X_i$'s are not normally distributed?
Linear combination of random variables
Theorem
Let $X_1, X_2, \ldots, X_n$ be independent random variables (with possibly
different means and/or variances). Define
$T = a_1 X_1 + a_2 X_2 + \ldots + a_n X_n$,
then the mean and the variance of $T$ (and hence its standard deviation $\sigma_T$) can be computed
by
$E(T) = a_1 E(X_1) + a_2 E(X_2) + \ldots + a_n E(X_n)$
$\sigma_T^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + \ldots + a_n^2 \sigma_{X_n}^2$
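The two formulas of this theorem translate directly into code. The sketch below is only an illustration: the helper name `linear_combination_moments` and the numbers are made up for the example.

```python
import math

def linear_combination_moments(coeffs, means, sds):
    """Return (mean, variance) of T = sum(a_i * X_i) for independent X_i."""
    if not (len(coeffs) == len(means) == len(sds)):
        raise ValueError("coeffs, means and sds must have the same length")
    mean_t = sum(a * m for a, m in zip(coeffs, means))
    # Variances add with squared coefficients; this step needs independence.
    var_t = sum((a * s) ** 2 for a, s in zip(coeffs, sds))
    return mean_t, var_t

# Hypothetical example: T = 2*X1 - 3*X2, X1 with mean 1 and sd 2, X2 with mean 4 and sd 1.
mean_t, var_t = linear_combination_moments([2, -3], [1, 4], [2, 1])
sd_t = math.sqrt(var_t)
print(mean_t, var_t, sd_t)
```

Note that the negative coefficient lowers the mean but still adds to the variance, because it enters squared.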
The distribution of the sample mean
Random sample
Definition
The random variables $X_1, X_2, \ldots, X_n$ are said to form a (simple) random sample of size $n$ if
1. the $X_i$'s are independent random variables
2. every $X_i$ has the same probability distribution
Mean and variance of the sample mean
Problem
Given a random sample $X_1, X_2, \ldots, X_n$ from a distribution with
mean $\mu$ and standard deviation $\sigma$, the mean is modeled by a
random variable $\bar{X}$,
$\bar{X} = \frac{X_1 + X_2 + \ldots + X_n}{n}$
Compute $E(\bar{X})$
Compute $\mathrm{Var}(\bar{X})$
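One way to work this out is to treat $\bar{X}$ as a linear combination with every coefficient equal to $1/n$ and apply the mean and variance formulas for linear combinations of independent random variables. The derivation below is a sketch along those lines (it assumes the `amsmath` package for `align*`):

```latex
\begin{align*}
E(\bar{X}) &= \frac{1}{n}\bigl(E(X_1) + \ldots + E(X_n)\bigr)
            = \frac{n\mu}{n} = \mu, \\
\mathrm{Var}(\bar{X}) &= \frac{1}{n^2}\bigl(\sigma^2 + \ldots + \sigma^2\bigr)
            = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n},
\qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The variance step uses the independence of the $X_i$'s; the mean step does not.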
Mean and variance of the sample mean
Note: If the $X_i$'s are normal random variables, then the distribution of $\bar{X}$ can be computed by
$P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z\right) = P[Z \le z] = \Phi(z)$
The Central Limit Theorem
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with
mean $\mu$ and variance $\sigma^2$. Then, in the limit when $n \to \infty$, the
standardized version of $\bar{X}$ has the standard normal distribution:
$\lim_{n \to \infty} P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z\right) = P[Z \le z] = \Phi(z)$
Rule of Thumb:
If $n > 30$, the Central Limit Theorem can be used for computation.
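A quick simulation can make the theorem concrete. The sketch below is an illustration only: the Uniform(0, 1) population, the sample size, the seed and the function names are all arbitrary choices, not taken from the slides. It draws many samples, standardizes each sample mean, and compares the empirical value of $P(T \le z)$ with $\Phi(z)$.

```python
import math
import random

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def standardized_mean(rng, n, mu, sigma):
    """Draw n Uniform(0, 1) values and return (xbar - mu) / (sigma / sqrt(n))."""
    xbar = sum(rng.random() for _ in range(n)) / n
    return (xbar - mu) / (sigma / math.sqrt(n))

rng = random.Random(450)                    # fixed seed so the run is repeatable
n, reps = 40, 20000                         # n > 30, as in the rule of thumb
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)      # mean and sd of Uniform(0, 1)
z = 1.0

hits = sum(standardized_mean(rng, n, mu, sigma) <= z for _ in range(reps))
empirical = hits / reps
print(empirical, phi(z))
```

The two printed numbers should be close. A strongly skewed population would typically need a larger $n$ for the same agreement.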
Example: population distribution
Matt Nedrick (2015). http://github.com/mattnedrich/CentralLimitTheoremDemo
Sample distribution: n = 3
Sample distribution: n = 10
Sample distribution: n = 30
Example 4
Problem
When a batch of a certain chemical product is prepared, the
amount of a particular impurity in the batch is a random variable
with mean value 4.0 g and standard deviation 1.5 g.
If 50 batches are independently prepared, what is the
(approximate) probability that the sample average amount of
impurity $\bar{X}$ is between 3.5 and 3.8 g?
Hint:
First, compute $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$.
Note that $\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$ is (approximately) standard normal.
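Following the hint, the computation can be sketched in a few lines of Python. This is one possible worked solution under the normal approximation, with illustrative variable names; it uses $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

```python
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 4.0, 1.5, 50

mu_xbar = mu                        # mean of the sample average
sd_xbar = sigma / math.sqrt(n)      # standard deviation of the sample average

# Standardize both endpoints, then take the difference of the cdf values.
z_low = (3.5 - mu_xbar) / sd_xbar
z_high = (3.8 - mu_xbar) / sd_xbar
p = phi(z_high) - phi(z_low)
print(round(z_low, 2), round(z_high, 2), round(p, 3))
```

With these numbers the approximate probability is about 0.16. Rounding the z-values to two decimals for a table lookup gives a slightly different third decimal.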
Law of large numbers
Proof of the CLT
Moment generating function
Definition
The moment generating function (mgf) of a continuous random variable $X$ is
$M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx$
Why is it called the moment generating function?
$M_X(0) = \int_{-\infty}^{\infty} f_X(x)\, dx = 1$.
$M'_X(0) = \int_{-\infty}^{\infty} x f_X(x)\, dx = E(X)$.
$M''_X(0) = \int_{-\infty}^{\infty} x^2 f_X(x)\, dx = E(X^2)$.
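As a numerical illustration of why the derivatives at 0 give the moments, the sketch below uses the exponential distribution with rate $\lambda$, whose mgf is $\lambda/(\lambda - t)$ for $t < \lambda$, and approximates $M'_X(0)$ and $M''_X(0)$ by finite differences. The function names, the step size `h` and the choice $\lambda = 2$ are assumptions made for the example.

```python
def mgf_exponential(t, lam):
    """mgf of the exponential distribution with rate lam; finite only for t < lam."""
    if t >= lam:
        raise ValueError("the mgf is finite only for t < lam")
    return lam / (lam - t)

def first_two_moments(mgf, h=1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m_minus, m_zero, m_plus = mgf(-h), mgf(0.0), mgf(h)
    first = (m_plus - m_minus) / (2.0 * h)
    second = (m_plus - 2.0 * m_zero + m_minus) / (h * h)
    return first, second

lam = 2.0
m1, m2 = first_two_moments(lambda t: mgf_exponential(t, lam))
print(m1, m2)
```

For this distribution $E(X) = 1/\lambda$ and $E(X^2) = 2/\lambda^2$, so both printed values should be close to 0.5 when $\lambda = 2$.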
Moment generating function
Property
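Let $X_1, X_2, \ldots, X_n$ be independent random variables with moment
generating functions $M_{X_1}(t), M_{X_2}(t), \ldots, M_{X_n}(t)$, respectively. Define
$T = a_1 X_1 + a_2 X_2 + \ldots + a_n X_n$
where $a_1, a_2, \ldots, a_n$ are constants.
Then
$M_T(t) = M_{X_1}(a_1 t)\, M_{X_2}(a_2 t) \cdots M_{X_n}(a_n t)$
Idea:
$M_T(t) = E(e^{tT}) = E(e^{t(a_1 X_1 + a_2 X_2 + \ldots + a_n X_n)})$
$= E(e^{t(a_1 X_1)} \cdot e^{t(a_2 X_2)} \cdots e^{t(a_n X_n)})$
$= E(e^{(a_1 t) X_1}) \cdot E(e^{(a_2 t) X_2}) \cdots E(e^{(a_n t) X_n})$ (by independence)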
Proof of the CLT
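Assume: $\mu_X = 0$, $\sigma_X = 1$, and
$\bar{X} = \frac{X_1 + \ldots + X_n}{n}$
$T = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{X_1 + \ldots + X_n}{\sqrt{n}}$
We want to prove that
$M_T(t) \approx e^{0 \cdot t + \frac{1}{2} \cdot 1^2 \cdot t^2}$ or $\log M_T(t) \approx t^2/2$
Using the previous slide:
$M_T(t) = \left[ M_X\left( \frac{t}{\sqrt{n}} \right) \right]^n$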
Proof of the CLT
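Thus
$\log M_T(t) = n \log M_X\left( \frac{t}{\sqrt{n}} \right)$
Note:
$M_X(0) = 1$, $M'_X(0) = E(X) = 0$, $M''_X(0) = E(X^2) = 1$
Change variable: $x = 1/\sqrt{n}$, then $x \to 0$ when $n \to \infty$. Applying L'Hôpital's rule twice:
$\lim_{n \to \infty} \log M_T(t) = \lim_{x \to 0} \frac{\log M_X(tx)}{x^2}$
$= \lim_{x \to 0} \frac{t M'_X(tx)}{2x M_X(tx)}$
$= \lim_{x \to 0} \frac{t^2 M''_X(tx)}{2xt M'_X(tx) + 2 M_X(tx)} = \frac{t^2}{2}.$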
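The limit can be checked numerically for a concrete distribution. The sketch below is an illustration with an assumed example, not part of the proof: $X$ takes the values $+1$ and $-1$ with probability 1/2 each, so $\mu_X = 0$, $\sigma_X = 1$ and $E(e^{sX}) = \cosh(s)$. It evaluates $n \log M_X(t/\sqrt{n})$ for growing $n$; the function name is hypothetical.

```python
import math

def log_mgf_standardized_sum(t, n):
    """log M_T(t) = n * log M_X(t / sqrt(n)) for the +/-1 example, where M_X(s) = cosh(s)."""
    return n * math.log(math.cosh(t / math.sqrt(n)))

t = 1.0
limit = t * t / 2.0
values = {n: log_mgf_standardized_sum(t, n) for n in (10, 1000, 100000)}
for n, value in values.items():
    print(n, value, abs(value - limit))
```

The gap to $t^2/2$ shrinks as $n$ grows, which is what the limit computation predicts.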
Chapter 6: Summary
Chapter 6
6.1 Statistics and their distributions
6.2 The distribution of the sample mean
6.3 The distribution of a linear combination
Random sample
Definition
The random variables $X_1, X_2, \ldots, X_n$ are said to form a (simple) random sample of size $n$ if
1. the $X_i$'s are independent random variables
2. every $X_i$ has the same probability distribution
Section 6.1: Sampling distributions
1. If the distribution and the statistic $T$ are simple, try to construct the pmf of the statistic
2. If the probability density function $f_X(x)$ of the $X_i$'s is known, then
   • try to represent/compute the cumulative distribution function (cdf) of $T$: $P[T \le t]$
   • take the derivative of this function (with respect to $t$)
Example 1
Problem
Consider the distribution P
x 40 45 50
p(x) 0.2 0.3 0.5
Let $\{X_1, X_2\}$ be a random sample of size 2 from $P$, and
$T = X_1 + X_2$.
1. Derive the probability mass function of $T$
2. Compute the expected value and the standard deviation of $T$
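Both parts can be worked out by enumerating the nine possible pairs $(x_1, x_2)$ and multiplying their probabilities, which is valid because the two draws are independent. The Python sketch below does exactly that; the variable names are illustrative.

```python
import math
from itertools import product

# pmf of a single observation, taken from the table in the problem
pmf_x = {40: 0.2, 45: 0.3, 50: 0.5}

# Enumerate all (x1, x2) pairs and accumulate P(T = x1 + x2).
pmf_t = {}
for (x1, p1), (x2, p2) in product(pmf_x.items(), repeat=2):
    pmf_t[x1 + x2] = pmf_t.get(x1 + x2, 0.0) + p1 * p2

mean_t = sum(t * p for t, p in pmf_t.items())
var_t = sum((t - mean_t) ** 2 * p for t, p in pmf_t.items())
sd_t = math.sqrt(var_t)

for t in sorted(pmf_t):
    print(t, round(pmf_t[t], 4))
print(round(mean_t, 4), round(sd_t, 4))
```

As a cross-check, the formulas for linear combinations give the same mean and variance: $E(T) = 2E(X)$ and $\sigma_T^2 = 2\sigma_X^2$.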
Example 2
Problem
Let $\{X_1, X_2\}$ be a random sample of size 2 from the exponential
distribution with parameter $\lambda$
$f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$
and $T = X_1 + X_2$.
1. Compute the cumulative distribution function (cdf) of $T$
2. Compute the probability density function (pdf) of $T$
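A sketch of one way to carry this out, using the cdf-then-differentiate approach: integrate the joint density of $(X_1, X_2)$ over the region $x_1 + x_2 \le t$, then differentiate with respect to $t$. The derivation assumes the `amsmath` package; for $t < 0$ both functions are 0.

```latex
\begin{align*}
F_T(t) &= P(X_1 + X_2 \le t)
        = \int_0^t \int_0^{t - x_1} \lambda e^{-\lambda x_1}\, \lambda e^{-\lambda x_2}\, dx_2\, dx_1 \\
       &= \int_0^t \lambda e^{-\lambda x_1} \bigl(1 - e^{-\lambda (t - x_1)}\bigr)\, dx_1
        = \int_0^t \bigl(\lambda e^{-\lambda x_1} - \lambda e^{-\lambda t}\bigr)\, dx_1 \\
       &= 1 - e^{-\lambda t} - \lambda t\, e^{-\lambda t}, \qquad t \ge 0, \\
f_T(t) &= \frac{d}{dt} F_T(t)
        = \lambda e^{-\lambda t} - \lambda e^{-\lambda t} + \lambda^2 t\, e^{-\lambda t}
        = \lambda^2 t\, e^{-\lambda t}, \qquad t \ge 0.
\end{align*}
```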
Section 6.3: Linear combination of normal random variables
Theorem
Let $X_1, X_2, \ldots, X_n$ be independent normal random variables (with
possibly different means and/or variances). Then
$T = a_1 X_1 + a_2 X_2 + \ldots + a_n X_n$
also follows the normal distribution.
Section 6.3: Computations with normal random variables
Section 6.3: Linear combination of random variables
Theorem
Let $X_1, X_2, \ldots, X_n$ be independent random variables (with possibly
different means and/or variances). Define
$T = a_1 X_1 + a_2 X_2 + \ldots + a_n X_n$,
then the mean and the variance of $T$ (and hence its standard deviation $\sigma_T$) can be computed
by
$E(T) = a_1 E(X_1) + a_2 E(X_2) + \ldots + a_n E(X_n)$
$\sigma_T^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + \ldots + a_n^2 \sigma_{X_n}^2$
Section 6.2: Distribution of the sample mean
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with
mean $\mu$ and variance $\sigma^2$. Then, in the limit when $n \to \infty$, the
standardized version of $\bar{X}$ has the standard normal distribution:
$\lim_{n \to \infty} P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z\right) = P[Z \le z] = \Phi(z)$
Rule of Thumb:
If $n > 30$, the Central Limit Theorem can be used for computation.