Sampling Models for the Population Mean (Ed Stanek, UMASS Amherst)
Date posted: 21-Dec-2015
1
Sampling Models for the Population Mean
Ed Stanek
UMASS
Amherst
2
Basic Problem (Population Mean)
Population Data

Listing   Latent Value
Rose      $y_{\text{Rose}}$
Lily      $y_{\text{Lily}}$
Daisy     $y_{\text{Daisy}}$

\[ \mu = \frac{y_{\text{Rose}} + y_{\text{Lily}} + y_{\text{Daisy}}}{3} \]

What is $\mu$?
3
Basic Problem (Population Mean)
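Some Notation

Population

Listing   Latent Value
Rose      $y_{\text{Rose}}$
Lily      $y_{\text{Lily}}$
Daisy     $y_{\text{Daisy}}$

Subjects are indexed by $j = 1, \ldots, N$, with label $L_j$ and latent value $y_j$.

Label Set of Subjects in the Population (Listing):
\[ \boldsymbol{\lambda}_0 = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \begin{pmatrix} \text{Rose} \\ \text{Lily} \\ \text{Daisy} \end{pmatrix} \]

Latent Values:
\[ \mathbf{y}_0 = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Lily}} \\ y_{\text{Daisy}} \end{pmatrix} \]

Assumption: Response is equal to the latent value for the subject. There is no measurement error.

Using vector notation:
\[ \mu = \frac{1}{N} \sum_{j=1}^{N} y_j \]

Using set notation:
\[ \mu = \frac{1}{N} \sum_{L \in \boldsymbol{\lambda}} y_L = \frac{y_{\text{Lily}} + y_{\text{Rose}} + y_{\text{Daisy}}}{N} \]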
4
Sampling Model
• Select a simple random sample without replacement of size n
  – Define an estimator that is a linear function of the sample data
  – Require the estimator to be unbiased
  – Determine coefficients that minimize the variance (over all possible samples)
• Best Linear Unbiased Estimator (BLUE)
5
Sampling Model
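Select a simple random sample without replacement

All possible permutations of subjects (R = Rose, L = Lily, D = Daisy)

Order   Potential Response   p*=1      p*=2      p*=3      p*=4      p*=5      p*=6
i = 1   $Y_1$                R         R         L         L         D         D
i = 2   $Y_2$                L         D         R         D         R         L
i = 3   $Y_3$                D         L         D         R         L         R
Probability of permutation   $p_{01}$  $p_{02}$  $p_{03}$  $p_{04}$  $p_{05}$  $p_{06}$

\[ E(I_{p^*}) = p_{0p^*} = \frac{1}{N!} \quad \text{for all } p^* = 1, 2, \ldots, N! \]

where $I_{p^*} = 1$ if $\mathbf{Y} = \mathbf{u}_{p^*}\mathbf{y}_0$ (permutation $p^*$ of the listing is selected), and $I_{p^*} = 0$ otherwise.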
6
Sampling Model
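Select a simple random sample without replacement

\[ \mathbf{y}_0 = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Lily}} \\ y_{\text{Daisy}} \end{pmatrix}, \qquad \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \sum_{p^*=1}^{N!} I_{p^*} \mathbf{u}_{p^*} \mathbf{y}_0 \]

All possible permutations of latent values (potential response $\mathbf{u}_{p^*}\mathbf{y}_0$):

p*=1:  $\mathbf{u}_1\mathbf{y}_0 = (y_{\text{Rose}}, y_{\text{Lily}}, y_{\text{Daisy}})'$
p*=2:  $\mathbf{u}_2\mathbf{y}_0 = (y_{\text{Rose}}, y_{\text{Daisy}}, y_{\text{Lily}})'$
p*=3:  $\mathbf{u}_3\mathbf{y}_0 = (y_{\text{Lily}}, y_{\text{Rose}}, y_{\text{Daisy}})'$
p*=4:  $\mathbf{u}_4\mathbf{y}_0 = (y_{\text{Lily}}, y_{\text{Daisy}}, y_{\text{Rose}})'$
p*=5:  $\mathbf{u}_5\mathbf{y}_0 = (y_{\text{Daisy}}, y_{\text{Rose}}, y_{\text{Lily}})'$
p*=6:  $\mathbf{u}_6\mathbf{y}_0 = (y_{\text{Daisy}}, y_{\text{Lily}}, y_{\text{Rose}})'$

where, for example,
\[ \mathbf{u}_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \mathbf{u}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \mathbf{u}_6 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]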
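The permutation construction can be checked by brute force. The sketch below is only an illustration: it borrows the latent values 6, 10 and 8 that a later slide uses as an example, and the helper names (`permutation_matrix`, `apply`) are made up for this note. It builds every $\mathbf{u}_{p^*}$, forms $\mathbf{u}_{p^*}\mathbf{y}_0$, and averages over the $N!$ equally likely permutations.

```python
from fractions import Fraction
from itertools import permutations

# Illustrative latent values (borrowed from the later numeric example).
labels = ["Rose", "Lily", "Daisy"]
y0 = [Fraction(6), Fraction(10), Fraction(8)]
N = len(y0)

def permutation_matrix(order):
    """Row i has a single 1 in column order[i], so (u y0)[i] = y0[order[i]]."""
    return [[1 if j == order[i] else 0 for j in range(N)] for i in range(N)]

def apply(u, y):
    """Matrix-vector product u y."""
    return [sum(u[i][j] * y[j] for j in range(N)) for i in range(N)]

orders = list(permutations(range(N)))            # the N! = 6 orderings
matrices = [permutation_matrix(o) for o in orders]
responses = [apply(u, y0) for u in matrices]     # potential responses u_p y0

p = Fraction(1, len(orders))                     # each permutation has probability 1/N!
expected_Y = [sum(p * r[i] for r in responses) for i in range(N)]
mu = sum(y0) / N

for o, r in zip(orders, responses):
    print([labels[j] for j in o], [int(v) for v in r])
print("E(Y_i) for i = 1..N:", expected_Y, " mu:", mu)
```

With this enumeration order the first, second and sixth matrices coincide with $\mathbf{u}_1$, $\mathbf{u}_2$ and $\mathbf{u}_6$ on the slide, and every position has expected value $\mu$.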
7
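Sampling Model
Select a simple random sample without replacement

All possible permutations:
\[ \mathbf{Y} = \sum_{p^*=1}^{N!} I_{p^*} \mathbf{u}_{p^*} \mathbf{y}_0 \]

Order   Potential Response
i = 1   $Y_1$
i = 2   $Y_2$
i = 3   $Y_3$

Permutation: $\mathbf{Y} = (Y_1, Y_2, Y_3)'$, partitioned into Data and Remainder.

\[ Y_i = E(Y_i) + E_i \]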
8
Sampling Model
Select a simple random sample without replacement

• Represent the population as a vector of random variables
• The random variables are indexed by their position, not by the label of the subject in that position
• The subject corresponding to a random variable cannot be identified

Permutation: $\mathbf{Y} = (Y_1, Y_2, Y_3)'$

Sample Size: n=1 (position i=1)
  Data: $Y_1$        Remainder: $Y_2, Y_3$

Sample Size: n=2
  Data: $Y_1, Y_2$   Remainder: $Y_3$
9
Sampling Model
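Define the Target

Linear combination of Population Random Variables:
\[ P = \mathbf{g}'\mathbf{Y} = \sum_{i=1}^{N} g_i Y_i, \qquad \mathbf{g}' = (g_1 \;\; g_2 \;\; \cdots \;\; g_N) \]

• May be a Parameter
• May be a Random variable

Special case: Mean (Parameter), $g_i = \frac{1}{N}$ for all $i = 1, \ldots, N$:
\[ P = \frac{1}{N}\sum_{i=1}^{N} Y_i = \frac{1}{N}\mathbf{1}_N'\mathbf{Y} = \mu \]

Special case: Latent value for Randomly Selected Subject, $g_i = 1$ for $i = i^*$ and $g_i = 0$ for all $i \ne i^*$:
\[ P = Y_{i^*} \]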
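To see why the target may be a parameter or a random variable, the following sketch evaluates $P = \mathbf{g}'\mathbf{Y}$ over every permutation for the two special cases. The latent values (6, 10, 8) and the function name `target` are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction
from itertools import permutations

# Illustrative latent values for Rose, Lily, Daisy.
y0 = [Fraction(6), Fraction(10), Fraction(8)]
N = len(y0)

def target(g, Y):
    """Linear combination P = g'Y."""
    return sum(gi * Yi for gi, Yi in zip(g, Y))

g_mean = [Fraction(1, N)] * N     # g_i = 1/N for all i
g_first = [1, 0, 0]               # g_i = 1 for i* = 1, 0 otherwise

realizations = list(permutations(y0))   # every possible realization of Y

# Distinct values the target takes across all permutations.
mean_values = {target(g_mean, Y) for Y in realizations}
first_values = {target(g_first, Y) for Y in realizations}

print("g_i = 1/N:  ", sorted(mean_values))
print("g = e_1:    ", sorted(first_values))
```

With $g_i = 1/N$ the target takes a single value over all permutations, so it is a parameter; with $\mathbf{g}$ picking out position 1 it varies with the permutation, so it is a random variable.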
10
Sampling Model
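Expected Value

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} \]
where $\mathbf{Y}_I$ ($n \times 1$) is the Data and $\mathbf{Y}_{II}$ ($(N-n) \times 1$) is the remainder.

\[ Y_i = E(Y_i) + E_i \]

Under SRS w/o Rep: $E(Y_i) = \mu$, so that
\[ E(\mathbf{Y}_I) = \mathbf{X}_I\beta = \mathbf{1}_n\mu, \qquad E(\mathbf{Y}_{II}) = \mathbf{X}_{II}\beta = \mathbf{1}_{N-n}\mu, \qquad E(\mathbf{Y}) = \mathbf{1}_N\mu \]

Linear Link Function: $E(\mathbf{Y}) = \mathbf{X}\beta = \mathbf{1}_N\mu$
\[ \mathbf{Y} = E(\mathbf{Y}) + \mathbf{E} = \mathbf{X}\beta + \mathbf{E}, \qquad \mathbf{E} = \mathbf{Y} - \mathbf{X}\beta \]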
11
Sampling Model
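Variance

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad Y_i = E(Y_i) + E_i \]

\[ \operatorname{var}(\mathbf{Y}) = \sigma^2\left(\mathbf{I}_N - \frac{1}{N}\mathbf{J}_N\right) = \sigma^2\mathbf{P}_N \]
where
\[ \mathbf{P}_N = \mathbf{I}_N - \frac{1}{N}\mathbf{J}_N, \qquad \sigma^2 = \frac{1}{N-1}\sum_{s=1}^{N}(y_s - \mu)^2 \]
The $\frac{1}{N}\mathbf{J}_N$ term is due to the finite population correction factor.

Partitioned by Data and remainder:
\[ \operatorname{var}\begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \sigma^2 \begin{pmatrix} \mathbf{I}_n - \frac{1}{N}\mathbf{J}_n & -\frac{1}{N}\mathbf{1}_n\mathbf{1}_{N-n}' \\ -\frac{1}{N}\mathbf{1}_{N-n}\mathbf{1}_n' & \mathbf{I}_{N-n} - \frac{1}{N}\mathbf{J}_{N-n} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_I & \mathbf{V}_{I,II} \\ \mathbf{V}_{II,I} & \mathbf{V}_{II} \end{pmatrix} \]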
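The variance structure $\sigma^2(\mathbf{I}_N - \frac{1}{N}\mathbf{J}_N)$ can be confirmed by direct enumeration. This is a minimal sketch under assumed latent values (6, 10, 8); the function name `cov` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Illustrative latent values; any finite population would do.
y0 = [Fraction(6), Fraction(10), Fraction(8)]
N = len(y0)
mu = sum(y0) / N
sigma2 = sum((y - mu) ** 2 for y in y0) / (N - 1)   # divisor N - 1

realizations = list(permutations(y0))
p = Fraction(1, len(realizations))                  # probability 1/N! each

def cov(i, k):
    """cov(Y_i, Y_k) over all equally likely permutations."""
    return sum(p * (r[i] - mu) * (r[k] - mu) for r in realizations)

var_Y = [[cov(i, k) for k in range(N)] for i in range(N)]

# sigma^2 (I_N - J_N / N), written entry by entry.
formula = [[sigma2 * ((1 if i == k else 0) - Fraction(1, N)) for k in range(N)]
           for i in range(N)]

print("sigma^2 =", sigma2)
print("enumerated var(Y):", var_Y)
print("matches formula:", var_Y == formula)
```

The diagonal entries equal $\sigma^2(1 - 1/N)$ and the off-diagonal entries equal $-\sigma^2/N$, which is the negative covariance induced by sampling without replacement.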
12
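Sampling Model
Expected Value and Variance
Reference Sets

Reference Set: The set of possible values that sample random variables can have with positive probability.

Expectation is evaluated over a reference set.

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y}_I \text{ is the Data} \]

Example: If $n = 1$, then $\mathbf{Y}_I = Y_1$ and the

Reference set for $\mathbf{Y}_I = Y_1$ is $\{y_{\text{Lily}}, y_{\text{Rose}}, y_{\text{Daisy}}\}$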
13
Sampling Model
Expected Value and Variance:
Reference Sets
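\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y}_I \text{ is the Data} \]

With $n = 1$, $\mathbf{Y}_I = Y_1$ and the

Reference set for $\mathbf{Y}_I = Y_1$ is $\{y_{\text{Lily}}, y_{\text{Rose}}, y_{\text{Daisy}}\}$

\[ E(\mathbf{Y}_I) = E(Y_1) = \sum_{\text{Reference Elements}} P(\text{Reference Element}) \, y_{\text{Reference Element}} \]

With $P(\text{Reference Element}) = \frac{1}{3}$:
\[ E(Y_1) = \frac{1}{3}y_{\text{Lily}} + \frac{1}{3}y_{\text{Rose}} + \frac{1}{3}y_{\text{Daisy}} = \mu \]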
14
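Sampling Model
Expected Value and Variance
Reference Sets

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y}_I = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \text{ is the Data} \]

Example when $n = 2$:

Reference set for $\mathbf{Y}_I$ (sets of possible latent values):
\[ \{ \{y_{\text{Lily}}, y_{\text{Rose}}\}, \{y_{\text{Lily}}, y_{\text{Daisy}}\}, \{y_{\text{Rose}}, y_{\text{Daisy}}\} \} \]

If $y_{\text{Lily}} = 10$, $y_{\text{Daisy}} = 8$, $y_{\text{Rose}} = 6$:

Reference set for $\mathbf{Y}_I$: $\{ \{10, 6\}, \{10, 8\}, \{6, 8\} \}$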
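The numeric reference set above can be enumerated directly. The sketch below uses the slide's values and assumes, as under simple random sampling without replacement, that each element of the reference set is equally likely; `sample_mean` is an illustrative helper.

```python
from fractions import Fraction
from itertools import combinations

latent = {"Lily": 10, "Daisy": 8, "Rose": 6}    # values from the slide
N, n = len(latent), 2

# Each element of the reference set is an unordered set of n subjects.
reference_set = [frozenset(c) for c in combinations(latent, n)]
prob = Fraction(1, len(reference_set))          # assumption: equal probability

def sample_mean(element):
    """Mean latent value of the subjects in one reference element."""
    return Fraction(sum(latent[name] for name in element), n)

# Expectation evaluated over the reference set.
expected_mean = sum(prob * sample_mean(e) for e in reference_set)
mu = Fraction(sum(latent.values()), N)

for e in reference_set:
    print(sorted(latent[name] for name in e), "sample mean:", sample_mean(e))
print("E(sample mean) =", expected_mean, " mu =", mu)
```

The three elements have sample means 9, 8 and 7, and their average over the reference set equals the population mean.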
15
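Sampling Model
Expected Value and Variance
Reference Sets vs Sequence

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y}_I = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \text{ is the Data} \]

Example when $n = 2$:

Reference Set for $\mathbf{Y}_I$:
\[ \{ \{y_{\text{Lily}}, y_{\text{Rose}}\}, \{y_{\text{Lily}}, y_{\text{Daisy}}\}, \{y_{\text{Rose}}, y_{\text{Daisy}}\} \} \]

Permutation (sequences) (L = Lily, R = Rose, D = Daisy):

        p*=1  p*=2  p*=3  p*=4  p*=5  p*=6
$Y_1$   L     L     R     R     D     D
$Y_2$   R     D     L     D     L     R
$Y_3$   D     R     D     L     R     L

Reference Sequence for $\mathbf{Y}_I$:
\[ \left\{ \begin{pmatrix} y_{\text{Lily}} \\ y_{\text{Rose}} \end{pmatrix}, \begin{pmatrix} y_{\text{Lily}} \\ y_{\text{Daisy}} \end{pmatrix}, \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Lily}} \end{pmatrix}, \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Daisy}} \end{pmatrix}, \begin{pmatrix} y_{\text{Daisy}} \\ y_{\text{Lily}} \end{pmatrix}, \begin{pmatrix} y_{\text{Daisy}} \\ y_{\text{Rose}} \end{pmatrix} \right\} \]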
16
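Sampling Model
Expected Value and Variance
Reference Sets vs Sequence

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y}_I = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \text{ is the Data} \]

Example when $n = 2$:

Reference Set (sufficient, assuming order doesn't matter):
\[ \{ \{y_{\text{Lily}}, y_{\text{Rose}}\}, \{y_{\text{Lily}}, y_{\text{Daisy}}\}, \{y_{\text{Rose}}, y_{\text{Daisy}}\} \} \]

Reference Sequence (used in the Random Permutation Model):
\[ \left\{ \begin{pmatrix} y_{\text{Lily}} \\ y_{\text{Rose}} \end{pmatrix}, \begin{pmatrix} y_{\text{Lily}} \\ y_{\text{Daisy}} \end{pmatrix}, \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Lily}} \end{pmatrix}, \begin{pmatrix} y_{\text{Rose}} \\ y_{\text{Daisy}} \end{pmatrix}, \begin{pmatrix} y_{\text{Daisy}} \\ y_{\text{Lily}} \end{pmatrix}, \begin{pmatrix} y_{\text{Daisy}} \\ y_{\text{Rose}} \end{pmatrix} \right\} \]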
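A short sketch of the set-versus-sequence distinction, using the latent values 10, 6 and 8 from the earlier numeric example and assuming equal probabilities for sequences and for sets. It shows that each set corresponds to $n!$ sequences and that the expected sample mean is the same under either reference.

```python
from fractions import Fraction
from itertools import permutations

latent = {"Lily": 10, "Rose": 6, "Daisy": 8}   # values from the earlier example
n = 2

# Reference sequence: ordered pairs (Y_1, Y_2), as in the random permutation model.
sequences = list(permutations(latent, n))
# Reference set: the same pairs with order dropped.
reference_set = {frozenset(s) for s in sequences}

p_seq = Fraction(1, len(sequences))            # assumption: sequences equally likely
p_set = Fraction(1, len(reference_set))        # assumption: sets equally likely

# Position-specific expectation needs the sequence.
E_Y1 = sum(p_seq * latent[s[0]] for s in sequences)

# An order-free quantity (the sample mean) gives the same answer either way.
mean_over_sequences = sum(p_seq * Fraction(latent[s[0]] + latent[s[1]], n)
                          for s in sequences)
mean_over_sets = sum(p_set * Fraction(sum(latent[k] for k in e), n)
                     for e in reference_set)

print(len(sequences), "sequences collapse to", len(reference_set), "sets")
print("E(Y_1) =", E_Y1)
print("E(sample mean):", mean_over_sequences, "(sequences)", mean_over_sets, "(sets)")
```

The set is enough whenever the statistic ignores order; the sequence is needed as soon as a quantity refers to a specific position such as $Y_1$.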
17
Sampling Model
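Determining the BLUE for $\mu$

Target:
\[ P = \mathbf{g}'\mathbf{Y} = (\mathbf{g}_I' \;\; \mathbf{g}_{II}') \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\mathbf{Y}_{II} \]
where $(\mathbf{g}_I' \;\; \mathbf{g}_{II}') = \frac{1}{N}(\mathbf{1}_n' \;\; \mathbf{1}_{N-n}')$ and $\mathbf{Y}_I$ is the data.

Linear Estimator:
\[ \hat{P} = (\mathbf{g}_I + \mathbf{a})'\mathbf{Y}_I = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{a}'\mathbf{Y}_I, \qquad \mathbf{a}' = (a_1 \;\; a_2 \;\; \cdots \;\; a_n) \]

Question: What should $\mathbf{a}$ be so that the estimator is unbiased and has minimum variance?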
18
Sampling Model
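Determining the BLUE for $\mu$: Unbiased Constraint

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{a}'\mathbf{Y}_I, \qquad P = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\mathbf{Y}_{II} \]
so that
\[ \hat{P} - P = \mathbf{a}'\mathbf{Y}_I - \mathbf{g}_{II}'\mathbf{Y}_{II} \]

Unbiased requirement:
\[ E(\hat{P} - P) = 0 \]

Since
\[ E(\mathbf{Y}) = \mathbf{1}_N\mu, \qquad E\begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \begin{pmatrix} \mathbf{1}_n \\ \mathbf{1}_{N-n} \end{pmatrix}\mu = \begin{pmatrix} \mathbf{X}_I \\ \mathbf{X}_{II} \end{pmatrix}\beta, \]
we have
\[ E(\hat{P} - P) = (\mathbf{a}' \;\; -\mathbf{g}_{II}') \begin{pmatrix} \mathbf{X}_I \\ \mathbf{X}_{II} \end{pmatrix}\beta = 0 \]

Implies that
\[ \mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II} = \mathbf{0} \]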
19
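Sampling Model
Determining the BLUE
Minimizing the Variance

\[ \hat{P} - P = \mathbf{a}'\mathbf{Y}_I - \mathbf{g}_{II}'\mathbf{Y}_{II} \]

Variance:
\[ \operatorname{var}(\hat{P} - P) = (\mathbf{a}' \;\; -\mathbf{g}_{II}') \begin{pmatrix} \mathbf{V}_I & \mathbf{V}_{I,II} \\ \mathbf{V}_{II,I} & \mathbf{V}_{II} \end{pmatrix} \begin{pmatrix} \mathbf{a} \\ -\mathbf{g}_{II} \end{pmatrix} \]

Unbiased Constraint:
\[ \mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II} = \mathbf{0} \]

Lagrangian Function to Minimize with Respect to $\mathbf{a}$ (with multiplier $\eta$):
\[ f(\mathbf{a}, \eta) = \mathbf{a}'\mathbf{V}_I\mathbf{a} - 2\mathbf{g}_{II}'\mathbf{V}_{II,I}\mathbf{a} + \mathbf{g}_{II}'\mathbf{V}_{II}\mathbf{g}_{II} + 2(\mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II})\eta \]

\[ \frac{\partial f}{\partial \mathbf{a}} = 2\mathbf{V}_I\mathbf{a} - 2\mathbf{V}_{I,II}\mathbf{g}_{II} + 2\mathbf{X}_I\eta, \qquad \frac{\partial f}{\partial \eta} = 2(\mathbf{X}_I'\mathbf{a} - \mathbf{X}_{II}'\mathbf{g}_{II}) \]

Setting both derivatives to zero at $(\hat{\mathbf{a}}, \hat{\eta})$:
\[ \frac{1}{2}\begin{pmatrix} \partial f / \partial \mathbf{a} \\ \partial f / \partial \eta \end{pmatrix}_{\hat{\mathbf{a}}, \hat{\eta}} = \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & \mathbf{0} \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\eta} \end{pmatrix} - \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix} = \begin{pmatrix} \mathbf{0}_n \\ 0 \end{pmatrix} \]
so the estimating equations are
\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & \mathbf{0} \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\eta} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix} \]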
20
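Sampling Model
Determining the BLUE
Minimizing the Variance
Solving the Estimating Equations

\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & \mathbf{0} \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\eta} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix} \]

For a partitioned matrix
\[ \mathbf{M} = \begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{pmatrix}, \qquad \mathbf{M}^{-1} = \begin{pmatrix} \mathbf{A}^{-1} + \mathbf{A}^{-1}\mathbf{B}\mathbf{Q}^{-1}\mathbf{C}\mathbf{A}^{-1} & -\mathbf{A}^{-1}\mathbf{B}\mathbf{Q}^{-1} \\ -\mathbf{Q}^{-1}\mathbf{C}\mathbf{A}^{-1} & \mathbf{Q}^{-1} \end{pmatrix} \]
where $\mathbf{Q} = \mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B}$.

With $\mathbf{A} = \mathbf{V}_I$, $\mathbf{B} = \mathbf{X}_I$, $\mathbf{C} = \mathbf{X}_I'$, $\mathbf{D} = \mathbf{0}$, so that $\mathbf{Q} = -\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I$:
\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & \mathbf{0} \end{pmatrix}^{-1} = \begin{pmatrix} \mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} & \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1} \\ (\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} & -(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1} \end{pmatrix} \]

\[ \hat{\mathbf{a}} = \left[\mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\right]\mathbf{V}_{I,II}\mathbf{g}_{II} + \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_{II}'\mathbf{g}_{II} \]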
21
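Sampling Model
Determining the BLUE
Minimizing the Variance
Solving the Estimating Equations

\[ \hat{\mathbf{a}} = \left[\mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\right]\mathbf{V}_{I,II}\mathbf{g}_{II} + \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_{II}'\mathbf{g}_{II} \]

Let
\[ \hat{\beta} = (\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{Y}_I \]

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \hat{\mathbf{a}}'\mathbf{Y}_I \]
with
\[ \hat{\mathbf{a}}' = \mathbf{g}_{II}'\mathbf{V}_{II,I}\left[\mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\right] + \mathbf{g}_{II}'\mathbf{X}_{II}(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} \]
so that
\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\left[\mathbf{X}_{II}\hat{\beta} + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}(\mathbf{Y}_I - \mathbf{X}_I\hat{\beta})\right] \]

\[ \operatorname{var}(\hat{P}) = \operatorname{var}\left((\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{Y}_I\right) = (\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{V}_I(\mathbf{g}_I + \hat{\mathbf{a}}) \]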
22
Sampling Model
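Determining the BLUE of $\mu$

Using
\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\left[\mathbf{X}_{II}\hat{\beta} + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}(\mathbf{Y}_I - \mathbf{X}_I\hat{\beta})\right] \]
with
\[ \mathbf{g}_I = \frac{1}{N}\mathbf{1}_n, \qquad \mathbf{g}_{II} = \frac{1}{N}\mathbf{1}_{N-n}, \]
\[ \mathbf{V}_I^{-1} = \frac{1}{\sigma^2}\left(\mathbf{I}_n - \frac{1}{N}\mathbf{J}_n\right)^{-1} = \frac{1}{\sigma^2}\left(\mathbf{I}_n + \frac{1}{N-n}\mathbf{J}_n\right), \]
and
\[ \mathbf{X}_I = \frac{1}{N}\mathbf{1}_n, \qquad \mathbf{X}_{II} = \frac{1}{N}\mathbf{1}_{N-n}, \]
we have
\[ \mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I = \frac{n}{\sigma^2 N(N-n)} \]
so that
\[ \hat{\beta} = (\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{Y}_I = \frac{N}{n}\mathbf{1}_n'\mathbf{Y}_I \]

Then
\[ \begin{aligned} \hat{P} &= \frac{f}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{1}{N}\mathbf{1}_{N-n}'\left[\frac{1}{N}\mathbf{1}_{N-n}\frac{N}{n}\mathbf{1}_n'\mathbf{Y}_I + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{N}\mathbf{1}_n\frac{N}{n}\mathbf{1}_n'\right)\mathbf{Y}_I\right] \\ &= \frac{f}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{N-n}{N}\,\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \\ &= f\bar{Y} + (1-f)\bar{Y} + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \end{aligned} \]
where
\[ f = \frac{n}{N}, \qquad \bar{Y} = \frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I \]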
23
Sampling Model
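Determining the BLUE of $\mu$

Now
\[ \hat{P} = f\bar{Y} + (1-f)\bar{Y} + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \]
where
\[ \bar{Y} = \frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I, \]
\[ \begin{aligned} \mathbf{V}_{II,I}\mathbf{V}_I^{-1} &= -\frac{\sigma^2}{N}\mathbf{1}_{N-n}\mathbf{1}_n' \cdot \frac{1}{\sigma^2}\left(\mathbf{I}_n + \frac{1}{N-n}\mathbf{J}_n\right) \\ &= -\frac{1}{N}\left(1 + \frac{n}{N-n}\right)\mathbf{1}_{N-n}\mathbf{1}_n' \\ &= -\frac{1}{N-n}\mathbf{1}_{N-n}\mathbf{1}_n' \end{aligned} \]
and
\[ \mathbf{1}_n'\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right) = \mathbf{0}_n' \]

As a result
\[ \hat{P} = f\bar{Y} + (1-f)\bar{Y} = \bar{Y} \]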
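The closed form can be cross-checked by solving the estimating equations numerically. The sketch below is an illustration under stated assumptions: the sizes $N = 5$, $n = 2$ and the value $\sigma^2 = 4$ are arbitrary, $\mathbf{X}_I$ and $\mathbf{X}_{II}$ are taken as vectors of ones scaled by $1/N$ (their common scale cancels in the unbiasedness constraint), and `solve` is a small hand-written exact Gauss-Jordan routine rather than a library call.

```python
from fractions import Fraction

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination over Fractions."""
    size = len(M)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(size):
        # Pick any row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

N, n = 5, 2                  # hypothetical population and sample sizes
sigma2 = Fraction(4)         # hypothetical; it does not affect a_hat

# V_I = sigma^2 (I_n - J_n / N)
V_I = [[sigma2 * ((1 if i == k else 0) - Fraction(1, N)) for k in range(n)]
       for i in range(n)]
X_I = [Fraction(1, N)] * n
g_I = [Fraction(1, N)] * n
g_II = [Fraction(1, N)] * (N - n)

# Right-hand side: V_{I,II} g_II (every entry of V_{I,II} is -sigma^2/N)
# stacked on X_II' g_II (with X_II = (1/N) 1_{N-n}).
rhs_top = [-(sigma2 / N) * sum(g_II) for _ in range(n)]
rhs_bottom = sum(Fraction(1, N) * g for g in g_II)

# Coefficient matrix [[V_I, X_I], [X_I', 0]].
M = [V_I[i] + [X_I[i]] for i in range(n)] + [X_I + [Fraction(0)]]
solution = solve(M, rhs_top + [rhs_bottom])
a_hat, eta_hat = solution[:n], solution[n]

weights = [g + a for g, a in zip(g_I, a_hat)]   # g_I + a_hat, applied to Y_I
print("a_hat =", a_hat)
print("g_I + a_hat =", weights)
```

Under these assumptions each sample value receives weight $1/n$, which is the sample mean, in agreement with $\hat{P} = \bar{Y}$.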
24
Sampling Model
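Determining the BLUE of $\mu$

Now
\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \hat{\mathbf{a}}'\mathbf{Y}_I = \bar{Y} \]
where $\mathbf{g}_I'\mathbf{Y}_I = f\bar{Y}$ and $\hat{\mathbf{a}}'\mathbf{Y}_I = (1-f)\bar{Y}$.

\[ \operatorname{var}(\hat{P}) = \operatorname{var}\left((\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{Y}_I\right) = \operatorname{var}(\bar{Y}) = \operatorname{var}\left(\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I\right) = \frac{1}{n^2}\mathbf{1}_n'\mathbf{V}_I\mathbf{1}_n \]

Since $\mathbf{V}_I = \sigma^2\left(\mathbf{I}_n - \frac{1}{N}\mathbf{J}_n\right)$,
\[ \operatorname{var}(\hat{P}) = (1-f)\frac{\sigma^2}{n} \]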
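As a numeric check of the final variance, the sketch below enumerates every simple random sample of size 2 from a population with the latent values 10, 8 and 6 used in the earlier example and compares the variance of the sample mean with $(1-f)\sigma^2/n$. Equal sample probabilities are assumed, as under simple random sampling without replacement.

```python
from fractions import Fraction
from itertools import combinations

y0 = [Fraction(10), Fraction(8), Fraction(6)]   # Lily, Daisy, Rose
N, n = len(y0), 2
f = Fraction(n, N)                              # sampling fraction

mu = sum(y0) / N
sigma2 = sum((y - mu) ** 2 for y in y0) / (N - 1)

samples = list(combinations(y0, n))             # all SRS samples of size n
p = Fraction(1, len(samples))                   # assumption: equally likely

means = [sum(s) / n for s in samples]
expected = sum(p * m for m in means)
variance = sum(p * (m - expected) ** 2 for m in means)

formula = (1 - f) * sigma2 / n
print("E(sample mean) =", expected, " mu =", mu)
print("var(sample mean) =", variance, " (1 - f) sigma^2 / n =", formula)
```

Both routes give the same variance, and the sample mean is unbiased for $\mu$ in this example.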