Post on 13-Dec-2015
Ch11: Comparing 2 Samples
11.1: INTRO:
This chapter deals with analyzing continuous measurements.
Later, some experimental design ideas will be introduced.
Chapter #13 will be devoted to qualitative data analysis.
11.2: Comparing Two Independent Samples
In a medical study, one sample X of subjects may be assigned to a control (placebo) treatment and another sample Y to a particular treatment (the treatment group).
This section deals with independent samples and later sections with dependent & paired samples.
$X_1, \ldots, X_n$ iid: observations from control group, with distribution $F$
$Y_1, \ldots, Y_m$ iid: observations from treatment group, with distribution $G$
X and Y are Two Independent random samples.
INFERENCE about the difference of means or other location parameters of F and G.
11.2.1: Methods based on Normal Distributions
Assumptions:
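$X_1, \ldots, X_n \sim N(\mu_X, \sigma^2)$
$Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma^2)$
X and Y are Two Independent random samples.
The difference of means $\mu_X - \mu_Y$ has natural estimate $\bar{X} - \bar{Y}$, with
$$\bar{X} - \bar{Y} \sim N\left(\mu_X - \mu_Y,\ \sigma^2\left(\frac{1}{n} + \frac{1}{m}\right)\right)$$
$\sigma^2$ known: $(\bar{X} - \bar{Y}) \pm z(\alpha/2)\,\sigma\sqrt{\frac{1}{n} + \frac{1}{m}}$ is a $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$.
$\sigma^2$ unknown: estimated by the pooled sample variance
$$s_p^2 = \frac{(n-1)S_X^2 + (m-1)S_Y^2}{n + m - 2}$$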
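The pooled variance $s_p^2 = \frac{(n-1)S_X^2 + (m-1)S_Y^2}{n+m-2}$ and the known-$\sigma$ interval $(\bar{X} - \bar{Y}) \pm z(\alpha/2)\,\sigma\sqrt{1/n + 1/m}$ can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names and the sample values are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def pooled_variance(x, y):
    """Pooled sample variance s_p^2 of two samples assumed to share one variance."""
    n, m = len(x), len(y)
    # statistics.variance uses the (size - 1) divisor, i.e. it returns S^2.
    return ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)


def z_interval(x, y, sigma, alpha=0.05):
    """100(1 - alpha)% CI for mu_X - mu_Y when the common sigma is known."""
    n, m = len(x), len(y)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, z(alpha/2)
    half_width = z * sigma * sqrt(1 / n + 1 / m)
    diff = mean(x) - mean(y)
    return diff - half_width, diff + half_width


# Hypothetical measurements, made up for illustration only.
control = [10.1, 9.8, 10.4, 10.0]
treatment = [9.2, 9.9, 9.5]
print(pooled_variance(control, treatment))
print(z_interval(control, treatment, sigma=0.4))
```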
11.2.1 : (cont’d)
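Theorem A:
Let $X_1, \ldots, X_n \sim N(\mu_X, \sigma^2)$ and $Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma^2)$ be Two Independent random samples.
The statistic
$$t = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{s_p\sqrt{\frac{1}{n} + \frac{1}{m}}}$$
follows a t distribution with $df = n + m - 2$ degrees of freedom.
Corollary A:
Under the assumptions of Theorem A, a $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$ is
$$(\bar{X} - \bar{Y}) \pm t_{df}(\alpha/2)\, s_p\sqrt{\frac{1}{n} + \frac{1}{m}}$$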
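As a rough sketch of the interval $(\bar{X} - \bar{Y}) \pm t_{df}(\alpha/2)\, s_p\sqrt{1/n + 1/m}$ with $df = n + m - 2$, the function below takes the critical value as an argument, since the Python standard library has no t quantile function. The function name and the data are hypothetical; the critical value is assumed to be read from a t table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_interval(x, y, t_crit):
    """CI for mu_X - mu_Y: (Xbar - Ybar) +/- t_crit * s_p * sqrt(1/n + 1/m).

    t_crit is t_df(alpha/2) with df = n + m - 2, supplied by the caller.
    """
    n, m = len(x), len(y)
    s_p = sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
    half_width = t_crit * s_p * sqrt(1 / n + 1 / m)
    diff = mean(x) - mean(y)
    return diff - half_width, diff + half_width


# Hypothetical data: df = 4 + 3 - 2 = 5, and t_5(0.025) = 2.571 from a t table.
x = [10.1, 9.8, 10.4, 10.0]
y = [9.2, 9.9, 9.5]
print(pooled_t_interval(x, y, t_crit=2.571))
```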
11.2.1 : (cont’d) Test Procedures for Normal Populations:
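Null Hypothesis: $H_0: \mu_X = \mu_Y$ or $H_0: \mu_X - \mu_Y = 0$
Test Statistic: $t = \dfrac{(\bar{X} - \bar{Y}) - 0}{s_p\sqrt{\frac{1}{n} + \frac{1}{m}}}$
There are 3 common alternative hypotheses. 2 of them are one-sided ($H_A: \mu_X > \mu_Y$ or $H_A: \mu_X < \mu_Y$) and one is two-sided ($H_A: \mu_X \neq \mu_Y$).
Revisit my handouts about CI and HT for reference.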
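The test procedure can be sketched as follows: compute the pooled statistic $t = (\bar{X} - \bar{Y}) / (s_p\sqrt{1/n + 1/m})$ and compare it with a table value, $t_{df}(\alpha/2)$ for the two-sided alternative or $t_{df}(\alpha)$ for a one-sided one. The function names and the labels "two-sided", "greater" and "less" are assumptions made for this illustration.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(x, y):
    """Pooled two-sample t statistic for H0: mu_X = mu_Y, with its df."""
    n, m = len(x), len(y)
    s_p = sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
    t = (mean(x) - mean(y)) / (s_p * sqrt(1 / n + 1 / m))
    return t, n + m - 2


def reject_null(t, t_crit, alternative="two-sided"):
    """Decide with a table value t_crit (a positive number) supplied by the caller."""
    if alternative == "two-sided":   # HA: mu_X != mu_Y
        return abs(t) > t_crit
    if alternative == "greater":     # HA: mu_X > mu_Y
        return t > t_crit
    if alternative == "less":        # HA: mu_X < mu_Y
        return t < -t_crit
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Hypothetical data; t_5(0.025) = 2.571 is read from a t table.
t, df = two_sample_t([10.1, 9.8, 10.4, 10.0], [9.2, 9.9, 9.5])
print(t, df, reject_null(t, t_crit=2.571))
```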
11.2.2 : Power calculation
The power of the 2-sample t-test depends on:
1. $\Delta = |\mu_X - \mu_Y|$ (real difference): The larger $\Delta$, the greater the power
2. $\alpha$ (level of significance): The larger $\alpha$, the more powerful the test
3. $\sigma$ (population standard deviation): The smaller $\sigma$, the larger the power
4. n and m (sample sizes): The larger n and m, the greater the power
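11.2.2 : Power calculation (cont’d)
Assume that n=m (same sample size) are large enough to test $H_0: \mu_X = \mu_Y$ vs $H_A: \mu_X \neq \mu_Y$ at level $\alpha$, with test statistic based on
$$Z = \frac{\bar{X} - \bar{Y}}{\sigma\sqrt{2/n}},$$
where $\sigma$, $\alpha$, $n$ are given.
The rejection region (RR) of such a test is:
$$|Z| > z(\alpha/2) \iff |\bar{X} - \bar{Y}| > z(\alpha/2)\,\sigma\sqrt{2/n}$$
The power of a test is the probability of rejecting the null hypothesis when it is false. That is,
Power against $\mu_X - \mu_Y = \Delta'$:
$$\begin{aligned}
P(RR \mid \Delta') &= P\left(|\bar{X} - \bar{Y}| > z(\alpha/2)\,\sigma\sqrt{2/n}\right) \\
&= P\left(\bar{X} - \bar{Y} > z(\alpha/2)\,\sigma\sqrt{2/n}\right) + P\left(\bar{X} - \bar{Y} < -z(\alpha/2)\,\sigma\sqrt{2/n}\right) \\
&= P\left(\frac{\bar{X} - \bar{Y} - \Delta'}{\sigma\sqrt{2/n}} > \frac{z(\alpha/2)\,\sigma\sqrt{2/n} - \Delta'}{\sigma\sqrt{2/n}}\right) + P\left(\frac{\bar{X} - \bar{Y} - \Delta'}{\sigma\sqrt{2/n}} < \frac{-z(\alpha/2)\,\sigma\sqrt{2/n} - \Delta'}{\sigma\sqrt{2/n}}\right) \\
&= 1 - \Phi\left(z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right) + \Phi\left(-z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right)
\end{aligned}$$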
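A minimal sketch of the power formula $1 - \Phi\left(z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{n/2}\right) + \Phi\left(-z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{n/2}\right)$ for equal sample sizes and known $\sigma$, using only the standard library. The function name and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(delta, sigma, n, alpha=0.05):
    """Power of the level-alpha two-sided z test when mu_X - mu_Y = delta and n = m."""
    phi = NormalDist().cdf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z(alpha/2), upper-tail point
    shift = (delta / sigma) * sqrt(n / 2)
    return 1 - phi(z - shift) + phi(-z - shift)


# With delta = 0 the "power" is just the level alpha of the test.
print(power_two_sided_z(delta=0.0, sigma=5.0, n=100))
# Power grows with the true difference and with n.
print(power_two_sided_z(delta=1.0, sigma=5.0, n=100))
print(power_two_sided_z(delta=1.0, sigma=5.0, n=400))
```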
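Application: what n is needed?
As the difference $\Delta'$ moves away from zero, one of the terms
$$1 - \Phi\left(z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right) \quad\text{or}\quad \Phi\left(-z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right)$$
will be negligible with respect to the other.
Problem: what n is needed to detect a difference of $\Delta' = 1$ with probability 0.9 when $\sigma = 5$ (testing at level $\alpha = 0.05$, so $z(\alpha/2) = 1.96$)?
Solution:
$$0.9 \approx 1 - \Phi\left(1.96 - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right) + 0$$
$$\Phi\left(1.96 - \frac{1}{5}\sqrt{\frac{n}{2}}\right) = 0.1$$
$$1.96 - \frac{1}{5}\sqrt{\frac{n}{2}} = -z(0.1) = -1.28$$
$$n \approx 525$$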
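Solving $1.96 - \frac{\Delta'}{\sigma}\sqrt{n/2} = -z(\beta)$ for n gives $n = 2\left(\sigma\,[z(\alpha/2) + z(\beta)]/\Delta'\right)^2$, where $\beta$ is one minus the target power. The sketch below evaluates this with exact normal quantiles rather than the rounded 1.96 and 1.28, so its result can differ slightly from a hand calculation. The function name is hypothetical, and the smaller tail term is ignored as in the approximation above.

```python
from statistics import NormalDist


def n_per_group(delta, sigma, power=0.9, alpha=0.05):
    """Approximate n = m solving 1 - Phi(z(alpha/2) - (delta/sigma) sqrt(n/2)) = power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2)
    z_beta = NormalDist().inv_cdf(power)           # z(1 - power)
    return 2 * (sigma * (z_alpha + z_beta) / delta) ** 2


# Detect a difference of 1 with probability 0.9 when sigma = 5: roughly 525.
print(n_per_group(delta=1.0, sigma=5.0))
```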
11.2.3: The Mann-Whitney Test (a nonparametric method)
Also known as the Wilcoxon RST (Rank Sum Test).
Assume that m + n experimental units are to be assigned (at random) to a treatment group and a control group. In this specific context, n units are randomly chosen and assigned to the ctrl, and the remaining m units are assigned to the trt.
We are interested in testing the null hypothesis that the treatment has NO EFFECT.
If the null is true, then any difference in the outcomes under the 2 conditions is due to the randomization (i.e. solely to chance).
The Mann-Whitney Test: (cont’d)
The MW-test statistic is calculated as follows:
1. Group all m + n observations together and rank them in order of increasing size (assuming no ties).
2. Calculate the sum of the ranks of those observations that came from the ctrl group.
3. Reject the null if the sum is too small or too large.
Example: ranks are shown in parentheses
Treatment   Control
1 (1)       6 (4)
3 (2)       4 (3)
R = 3 + 4 = 7 (ctrl) and R = 1 + 2 = 3 (trt)
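The ranking steps can be sketched in Python for the small example with treatment values 1, 3 and control values 6, 4. The function name is hypothetical, and the sketch assumes there are no ties among the pooled observations.

```python
def rank_sums(control, treatment):
    """Rank the pooled observations (assumes no ties) and sum the ranks per group."""
    pooled = sorted(control + treatment)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # smallest gets rank 1
    return sum(rank[v] for v in control), sum(rank[v] for v in treatment)


# Control ranks are 4 and 3, treatment ranks are 1 and 2.
print(rank_sums(control=[6, 4], treatment=[1, 3]))  # (7, 3)
```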
The Mann-Whitney Test: (cont’d)
Question: Does this discrepancy provide convincing evidence of a systematic difference between trt & ctrl, or could it be just due to chance?
Answer: consider the null hypothesis that the trt had no effect.
Under the null, every assignment (total: 4! = 24) of ranks to observations is equally likely.
In particular, each of the
$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6$$
assignments of ranks to the ctrl group (shown below) is equally likely:
Rank   {1,2}  {1,3}  {1,4}  {2,3}  {2,4}  {3,4}
R      3      4      5      5      6      7
The Mann-Whitney Test: (cont’d)
Under the null, the rank sum R of the ctrl group is a discrete r.v. with distribution:
r        3    4    5    6    7    Sum
P(R=r)   1/6  1/6  2/6  1/6  1/6  1
From this table, $P(R = 7) = 1/6$; that is to say, this discrepancy would occur one time out of 6 by chance.
Similar computations can be carried out for any sample sizes m and n, and can even be extended to testing:
$H_0: F = G$, where CTRL $X_1, \ldots, X_n \sim F$ and TRT $Y_1, \ldots, Y_m \sim G$
Read page 404 (textbook).
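For small samples, the null distribution of the control rank sum R can be obtained by brute force: list every equally likely set of ranks that the control group could receive and tabulate the sums. The sketch below does this with the standard library; the function name is hypothetical and ties are assumed absent.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_null_distribution(n_ctrl, n_trt):
    """Exact null distribution of the control rank sum R (no ties)."""
    ranks = range(1, n_ctrl + n_trt + 1)
    subsets = list(combinations(ranks, n_ctrl))  # equally likely under H0
    counts = Counter(sum(s) for s in subsets)
    return {r: Fraction(c, len(subsets)) for r, c in sorted(counts.items())}


# Two control and two treatment units: 6 possible rank sets for the control group.
dist = rank_sum_null_distribution(2, 2)
for r, p in dist.items():
    print(r, p)
```

Because `Fraction` reduces automatically, the probability 2/6 for r = 5 is printed as 1/3.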
The Mann-Whitney Test: Another approach
Suppose that the X’s are sampled from F and the Y’s are sampled from G. The Mann-Whitney test can be derived from a different point of view than what was seen earlier.
We would like to estimate the probability $\pi = P(X < Y)$ that an observation from F is smaller than an independent observation from G, which serves as a measure of the treatment effect, where X and Y are independently distributed with distribution functions F and G.
An estimate of $\pi$ can be obtained by comparing all n values of X to all m values of Y and by calculating the proportion of the comparisons for which X is less than Y.
The Mann-Whitney Test: Another approach (Cont’d)
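That is,
$$\hat{\pi} = \frac{1}{mn}\sum_{i=1}^{n}\sum_{j=1}^{m} Z_{ij} = \frac{U_Y}{mn}, \quad\text{where}\quad U_Y = \sum_{i=1}^{n}\sum_{j=1}^{m} Z_{ij} \quad\text{and}\quad Z_{ij} = \begin{cases} 1, & \text{if } X_i < Y_j \\ 0, & \text{otherwise} \end{cases}$$
Theorem A: Under the null $H_0: F = G$,
$$E(U_Y) = \frac{mn}{2} \quad\text{and}\quad Var(U_Y) = \frac{mn(m+n+1)}{12}$$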
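A short sketch of this second approach: count the pairs $(X_i, Y_j)$ with $X_i < Y_j$ to get $U_Y$, divide by mn to estimate $P(X < Y)$, and evaluate the null moments $mn/2$ and $mn(m+n+1)/12$. The function names are hypothetical; the data reuse the small control/treatment example (control 6, 4 and treatment 1, 3).

```python
from fractions import Fraction


def mann_whitney_u(x, y):
    """U_Y = number of pairs (i, j) with x[i] < y[j]; also returns the estimate U_Y/(mn)."""
    u = sum(1 for xi in x for yj in y if xi < yj)
    return u, Fraction(u, len(x) * len(y))


def u_null_moments(n, m):
    """E(U_Y) and Var(U_Y) under H0: F = G."""
    return Fraction(m * n, 2), Fraction(m * n * (m + n + 1), 12)


# No control value is below any treatment value, so U_Y = 0 here.
print(mann_whitney_u(x=[6, 4], y=[1, 3]))
print(u_null_moments(n=2, m=2))
```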
11.3: Comparing Paired Samples
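Paired Design vs Unpaired design:
CASE 1: Paired Design
$(X_i, Y_i)$ are pairs, with $i = 1, \ldots, n$.
The X's have mean $\mu_X$ and variance $\sigma_X^2$.
The Y's have mean $\mu_Y$ and variance $\sigma_Y^2$.
Assume different pairs are independently distributed.
Assume pair members have correlation $\rho$: $cov(X_i, Y_i) = \sigma_{XY}$, $\rho = \sigma_{XY} / (\sigma_X \sigma_Y)$.
The differences $D_i = X_i - Y_i$ are independent, and $\bar{D} = \bar{X} - \bar{Y}$.
$$E(D_i) = \mu_X - \mu_Y \quad\text{and}\quad Var(D_i) = \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y$$
$$E(\bar{D}) = \mu_X - \mu_Y \quad\text{and}\quad Var(\bar{D}) = \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y\right)$$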
11.3: (cont’d)
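Unpaired Design:
CASE 2: Unpaired Design
If the samples X's and Y's are independent, then $\mu_X - \mu_Y$ will be estimated by $\bar{X} - \bar{Y}$.
$$E(\bar{X} - \bar{Y}) = \mu_X - \mu_Y \quad\text{and}\quad Var(\bar{X} - \bar{Y}) = \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2\right)$$
Thus,
$$Var(\bar{D}) = \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y\right) < \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2\right) = Var(\bar{X} - \bar{Y}) \quad\text{for } \rho > 0.$$
In this circumstance, PAIRING is the more effective DESIGN.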
11.3: (cont’d)
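What if $\sigma_X = \sigma_Y = \sigma$?
Then
$$Var(\bar{D}) = \frac{2\sigma^2(1-\rho)}{n} \quad\text{and}\quad Var(\bar{X} - \bar{Y}) = \frac{2\sigma^2}{n}$$
Thus, the relative efficiency is
$$\frac{Var(\bar{D})}{Var(\bar{X} - \bar{Y})} = 1 - \rho$$
That is, a Paired Design with $n_{paired}$ pairs will be as precise as an Unpaired Design with $n_{unpaired} = n_{paired}/(1-\rho)$ subjects per treatment.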
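The comparison between the two designs can be checked numerically from $Var(\bar{D}) = (\sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y)/n$ for the paired design and $Var(\bar{X} - \bar{Y}) = (\sigma_X^2 + \sigma_Y^2)/n$ for the unpaired one. The function names and the numbers below are hypothetical, chosen only to illustrate the ratio $1 - \rho$ when both standard deviations are equal.

```python
def var_paired(sigma_x, sigma_y, rho, n):
    """Var(D-bar) for n pairs with within-pair correlation rho."""
    return (sigma_x**2 + sigma_y**2 - 2 * rho * sigma_x * sigma_y) / n


def var_unpaired(sigma_x, sigma_y, n):
    """Var(X-bar - Y-bar) for two independent samples of size n each."""
    return (sigma_x**2 + sigma_y**2) / n


# Equal standard deviations (sigma = 2), rho = 0.5, n = 10 pairs.
paired = var_paired(2.0, 2.0, rho=0.5, n=10)
unpaired = var_unpaired(2.0, 2.0, n=10)
print(paired / unpaired)            # relative efficiency, 1 - rho
# An unpaired design needs n / (1 - rho) = 20 subjects per treatment to match.
print(var_unpaired(2.0, 2.0, n=20), paired)
```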
Pros & Cons Paired vs Independent Samples:
Here are 2 competing sampling schemes:
Paired Samples: n pairs (2n measurements)
Independent Samples: 2n observations (m=n)
They both give a CI of the common form:
$\bar{D} \pm t_{df}(\alpha/2)\,\widehat{SE}$, where $\bar{D} = \bar{X} - \bar{Y}$
But the SE estimates and the df for t are different:
                 Independent Samples                     Paired Samples
$\widehat{SE}$   $s_p\sqrt{\frac{1}{n} + \frac{1}{n}}$   $s_D / \sqrt{n}$
df               $2n - 2 = 2(n-1)$                       $n - 1$
Pros & Cons Paired vs Independent Samples:
For the same SE estimate, a loss of DF (degrees of freedom) gives a larger critical value for the t-test.
(example: for n = 10, $t_9(0.05) = 1.833 > t_{18}(0.05) = 1.734$)
A loss of DF for the t-test produces:
• C.I.: Larger Confidence Intervals
• H.T.: Loss of Power to detect real differences in the population means.
Such loss of DF for Paired Samples is compensated by a smaller variance $Var(\bar{X} - \bar{Y})$ of Paired Samples with respect to Independent Samples.
11.3.1: Parametric Methods on the Normal Distribution for Paired Data
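Assume that $D_i = X_i - Y_i \sim N(\mu_D, \sigma_D^2)$, where $E(D_i) = \mu_D$ and $var(D_i) = \sigma_D^2$.
Inferences will be based on
$$t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}, \quad\text{with } s_{\bar{D}} = \frac{s_D}{\sqrt{n}},$$
because $\sigma_D$ is unknown in general.
t follows a t dist'n with $n - 1$ degrees of freedom.
A $100(1-\alpha)\%$ CI for $\mu_D$ is: $\bar{D} \pm t_{n-1}(\alpha/2)\, s_{\bar{D}}$
Testing $H_0: \mu_D = 0$ (no treatment effect) vs $H_A: \mu_D \neq 0$ at level $\alpha$ has the rejection region
$$|\bar{D}| > t_{n-1}(\alpha/2)\, s_{\bar{D}} \iff |t| > t_{n-1}(\alpha/2)$$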
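For paired data the computation reduces to a one-sample t statistic on the differences $D_i = X_i - Y_i$: $t = (\bar{D} - \mu_D)/(s_D/\sqrt{n})$ with $n - 1$ degrees of freedom. The sketch below returns the pieces needed for a test or a CI; the function name and the data are hypothetical, and the critical value would still come from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y, mu_d0=0.0):
    """Return (t, df, d_bar, se) for paired samples x and y of equal length."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = mean(d)
    se = stdev(d) / sqrt(n)          # s_D / sqrt(n), the estimated SE of D-bar
    return (d_bar - mu_d0) / se, n - 1, d_bar, se


# Hypothetical before/after measurements on 4 subjects.
t, df, d_bar, se = paired_t([5, 7, 9, 6], [4, 5, 8, 3])
print(t, df)
# CI for mu_D with a table value, e.g. t_3(0.025) = 3.182:
print(d_bar - 3.182 * se, d_bar + 3.182 * se)
```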
11.3.2: Nonparametric Method for Paired Data: Signed Rank Test (SRT)
The Wilcoxon SRT is computed as follows:
1. Rank the absolute values of the differences (no ties), with $R_i$ = rank of $|D_i|$ for $i = 1, \ldots, n$.
2. To get the signed ranks, just restore the signs of the $D_i$ to the ranks.
3. Calculate $W_+$, the sum of those ranks that have positive (+) signs.
Example: Let the $D_i$ be -2, 4, 3, 2, -1, 5
Ordered by absolute value: -1 (r1), -2 (r2), +2 (r3), +3 (r4), +4 (r5), +5 (r6), so there are 4 positive obs.
There is a tie between -2 and +2 (ranks 2 and 3), so each gets the average rank 2.5:
$W_+ = 2.5 + 4 + 5 + 6 = 17.5$
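The signed rank computation can be sketched as below, using the average rank (midrank) for tied absolute values, as in the example with differences -2, 4, 3, 2, -1, 5. The function name is hypothetical, and the sketch assumes that no difference is exactly zero.

```python
def signed_rank_w_plus(diffs):
    """W+ = sum of the ranks of |D_i| over positive D_i, with midranks for ties."""
    abs_sorted = sorted(abs(d) for d in diffs)

    def midrank(value):
        first = abs_sorted.index(value) + 1          # lowest rank held by value
        last = first + abs_sorted.count(value) - 1   # highest rank held by value
        return (first + last) / 2

    return sum(midrank(abs(d)) for d in diffs if d > 0)


# |-2| and |2| share ranks 2 and 3, so each gets 2.5.
print(signed_rank_w_plus([-2, 4, 3, 2, -1, 5]))  # 17.5
```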
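Wilcoxon SRT (cont’d):
Theorem A: Under the null hypothesis that the $D_i$ are independent and symmetrically distributed about zero,
$$E(W_+) = \frac{n(n+1)}{4} \quad\text{and}\quad Var(W_+) = \frac{n(n+1)(2n+1)}{24}$$
Proof:
$$W_+ = \sum_{k=1}^{n} k\, I_k, \quad\text{where}\quad I_k = \begin{cases} 1, & \text{if the } k\text{th largest } |D_i| \text{ has } D_i > 0 \\ 0, & \text{otherwise} \end{cases}$$
Under $H_0$, $I_k \sim \text{Bernoulli}\left(\frac{1}{2}\right)$ independently, so
$$E(I_k) = \frac{1}{2} \quad\text{and}\quad Var(I_k) = \frac{1}{4}$$
The result follows, since $E(W_+) = \frac{1}{2}\sum_{k=1}^{n} k$ and $Var(W_+) = \frac{1}{4}\sum_{k=1}^{n} k^2$.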
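The moment formulas $E(W_+) = n(n+1)/4$ and $Var(W_+) = n(n+1)(2n+1)/24$ can be sanity-checked for small n by enumerating all $2^n$ equally likely sign patterns, which is what the null hypothesis implies when there are no ties. This is only a numerical check under that assumption, not a replacement for the proof; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def w_plus_moments_by_enumeration(n):
    """Mean and variance of W+ = sum_k k * I_k over all 2^n equally likely sign patterns."""
    values = [sum(k * i for k, i in zip(range(1, n + 1), signs))
              for signs in product((0, 1), repeat=n)]
    mean = Fraction(sum(values), len(values))
    var = Fraction(sum(v * v for v in values), len(values)) - mean**2
    return mean, var


# Compare the enumeration with the closed-form expressions for a few n.
for n in (3, 6):
    exact = (Fraction(n * (n + 1), 4), Fraction(n * (n + 1) * (2 * n + 1), 24))
    print(n, w_plus_moments_by_enumeration(n), exact)
```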
11.4: Experimental Design
Some basic principles of DOE (Design of Experiment) are introduced here.
Experimental Design can be viewed as a sequence of linked studies under some conditions.
Read case studies 11.4.1 thru 11.4.8