Two Populations
8/18/2019 Two Populations
Copyright ©2009 Pearson Education. Inc.
7. Comparison of Two Groups
Goal: Use CI and/or significance test to compare
• means (quantitative variable)
• proportions (categorical variable)

                        Group 1    Group 2    Estimate
Population mean         $\mu_1$    $\mu_2$    $\bar{y}_2 - \bar{y}_1$
Population proportion   $\pi_1$    $\pi_2$    $\hat{\pi}_2 - \hat{\pi}_1$

We conduct inference about the difference between the means
or difference between the proportions (order irrelevant).
-
Outcome measure: mean response time for a subject over a large number of trials
• Purpose of study: Analyze whether (conceptual) population mean response time differs significantly for the two groups, and if so, by how much.
• Data
  Cell-phone group: $\bar{y}_1 = 585.2$ milliseconds, $s_1 = 89.6$
  Control group: $\bar{y}_2 = 533.7$, $s_2 = 65.3$
  Shape? Outliers?
-
Types of variables and samples
• The outcome variable on which comparisons are made is the response variable.
• The variable that defines the groups to be compared is the explanatory variable.
Example: Reaction time is the response variable.
Experimental group is the explanatory variable, a categorical var. with categories (cell-phone, control).
Or, could express experimental group as "cell-phone use" with categories (yes, no).
-
• Different methods apply for
  independent samples: different samples, no matching, as in this example and in "cross-sectional studies"
  dependent samples: natural matching between each subject in one sample and a subject in the other sample, such as in "longitudinal studies," which observe subjects repeatedly over time
Example: We later consider a separate experiment in which the same subjects formed the control group at one time and the cell-phone group at another time.
-
se for difference between two estimates (independent samples)
• The sampling distribution of the difference between two estimates is approximately normal (large $n_1$ and $n_2$) and has estimated
  $se = \sqrt{(se_1)^2 + (se_2)^2}$
Example: Data on "Response times" has
  32 using cell phone with sample mean 585.2, s = 89.6
  32 in control group with sample mean 533.7, s = 65.3
What is se for difference between sample means of 585.2 − 533.7 = 51.4?
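The question above can be checked numerically. The following is a minimal sketch, assuming each group's standard error is $s/\sqrt{n}$ (the usual standard error of a sample mean); the function name `se_difference` is ours and does not come from the slides.

```python
import math

def se_difference(s1, n1, s2, n2):
    """Standard error of the difference between two independent sample means."""
    se1 = s1 / math.sqrt(n1)  # se of the first sample mean
    se2 = s2 / math.sqrt(n2)  # se of the second sample mean
    return math.sqrt(se1**2 + se2**2)

# Response-time data: 32 subjects in each group
se = se_difference(89.6, 32, 65.3, 32)
print(round(se, 1))  # 19.6
```

Under these assumptions the standard error of the difference is about 19.6 milliseconds.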
-
CI comparing two proportions
• Recall se for a sample proportion used in a CI is
  $se = \sqrt{\hat{\pi}(1 - \hat{\pi})/n}$
• So, the se for the difference between two sample proportions for independent samples is
  $se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$
• A CI for the difference between population proportions is
  $(\hat{\pi}_2 - \hat{\pi}_1) \pm z\sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$
As usual, z depends on confidence level: 1.96 for 95% confidence.
-
Example: College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)
Trends over time in percentage of binge drinking (consumption of 5 or more drinks in a row for men and 4 or more for women, at least once in past two weeks) and of activities perhaps influenced by it?
"Have you engaged in unplanned sexual activities because of drinking alcohol?"
-
• Estimated change in proportion saying "yes" is 0.213 − 0.192 = 0.021, with
  $se = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}} = \sqrt{\frac{(0.192)(0.808)}{12{,}708} + \frac{(0.213)(0.787)}{8783}} = 0.0056$
95% CI for change in population proportion is
0.021 ± 1.96(0.0056) = 0.021 ± 0.011, or roughly (0.01, 0.03).
We can be 95% confident that the population proportion saying "yes" was between about 0.01 larger and 0.03 larger in 2001 than in 1993.
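The interval above can be reproduced in a few lines. This is a sketch only: the function name `diff_proportion_ci` is ours, and it assumes (from the order of the terms in the se formula) that 0.192 with n = 12,708 is the 1993 sample and 0.213 with n = 8783 is the 2001 sample.

```python
import math

def diff_proportion_ci(p1, n1, p2, n2, z=1.96):
    """CI for pi2 - pi1 from two independent sample proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff, se, (diff - z * se, diff + z * se)

# Group 1: 1993 sample, group 2: 2001 sample
diff, se, (lower, upper) = diff_proportion_ci(0.192, 12708, 0.213, 8783)
print(round(diff, 3), round(se, 4), round(lower, 2), round(upper, 2))
```

Swapping the two groups flips the sign of both endpoints, which is the point made about group order in the comments that follow.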
-
Comments about CIs for difference between two population proportions
• If 95% CI for $\pi_2 - \pi_1$ is (0.01, 0.03), then 95% CI for $\pi_1 - \pi_2$ is (−0.03, −0.01).
  It is arbitrary what we call Group 1 and Group 2 and what the order is for comparing the proportions.
• When 0 is not in the CI, we can conclude that one population proportion is higher than the other.
  (e.g., if all positive values for Group 2 − Group 1, then conclude population proportion higher for Group 2 than Group 1)
-
• When 0 is in the CI, it is plausible that the population proportions are identical.
  Example: Suppose 95% CI for change in population proportion (2001 − 1993) is (−0.01, 0.03).
  "95% confident that population proportion saying yes was between 0.01 smaller and 0.03 larger in 2001 than in 1993."
• There is a significance test of $H_0$: $\pi_1 = \pi_2$ that the population proportions are identical (i.e., difference $\pi_1 - \pi_2 = 0$), using test statistic
  z = (difference between sample proportions)/se
For unplanned sex in 1993 and 2001, z = diff./se = 0.021/0.0056 = 3.75
Two-sided P-value = 0.0002
This seems to be statistical significance without practical significance!
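The P-value quoted above comes from the standard normal distribution. A minimal sketch using only Python's standard library (the helper name `two_sided_p_from_z` is ours):

```python
import math

def two_sided_p_from_z(z):
    """Two-sided P-value for a standard normal test statistic."""
    # P(|Z| >= |z|) equals erfc(|z| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))

z = 0.021 / 0.0056
p_value = two_sided_p_from_z(z)
print(round(z, 2), round(p_value, 4))
```

The result agrees with the slide: a very small P-value for a difference of only about two percentage points, which large samples make possible.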
-
Details about test on pp. 189-190 of text; use $se_0$, which pools data to get better estimate of se under $H_0$.
(We study this test as a special case of "chi-squared test" in next chapter, which deals with possibly many groups, many outcome categories.)
• The theory behind the CI uses the fact that sample proportions (and their differences) have approximate normal sampling distributions for large n's, by the Central Limit Theorem, assuming randomization.
• In practice, formula works ok if at least 10 outcomes of each type for each sample. (Note: We don't use t dist. for inference about proportions; however, there are specialized small-sample methods, e.g., using binomial distribution.)
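The slides do not spell out the pooled standard error, so the sketch below assumes the standard textbook form: pool the two samples into one overall proportion $\hat{\pi}$ and use $se_0 = \sqrt{\hat{\pi}(1-\hat{\pi})(1/n_1 + 1/n_2)}$. The function name `pooled_z` is ours, and the overall proportion is approximated from the rounded sample proportions rather than from the raw counts, which are not given.

```python
import math

def pooled_z(p1, n1, p2, n2):
    """z statistic for H0: pi1 = pi2 using a pooled standard error (assumed form)."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)  # overall sample proportion
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se0, se0

z0, se0 = pooled_z(0.192, 12708, 0.213, 8783)
print(round(se0, 4), round(z0, 2))
```

With samples this large the pooled and unpooled standard errors are nearly identical, so the test statistic barely changes.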
-
Quantitative Responses: Comparing Means
• Parameter: $\mu_2 - \mu_1$
• Estimator: $\bar{y}_2 - \bar{y}_1$
• Estimated standard error: $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  – Sampling dist.: Approximately normal (large n's, by CLT)
  – CI for independent random samples from two normal population distributions has form
    $(\bar{y}_2 - \bar{y}_1) \pm t(se)$, which is $(\bar{y}_2 - \bar{y}_1) \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  – Formula for df for t-score is complex (later). If both sample sizes are at least 30, can just use z-score.
-
Example: GSS data on "number of close friends"
Use gender as the explanatory variable:
  486 females with mean 8.3, s = 15.6
  354 males with mean 8.9, s = 15.5
  $se_1 = s_1/\sqrt{n_1} = 15.6/\sqrt{486} = 0.708$
  $se_2 = s_2/\sqrt{n_2} = 15.5/\sqrt{354} = 0.824$
  $se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{(0.708)^2 + (0.824)^2} = 1.09$
Estimated difference of 8.9 − 8.3 = 0.6 has a margin of error of 1.96(1.09) = 2.1, and 95% CI is 0.6 ± 2.1, or (−1.5, 2.7).
-
• We can be 95% confident that the population mean number of close friends for males is between 1.5 less and 2.7 more than population mean number of close friends for females.
• Order is arbitrary. 95% CI comparing means for females − males is (−2.7, 1.5).
• When CI contains 0, it is plausible that the difference is 0 in the population (i.e., population means equal).
• Here, normal population assumption clearly violated. For large n's, no problem because of CLT, and for small n's the method is robust. (But, means may not be relevant for very highly skewed data.)
• Alternatively could do significance test to find strength of evidence about whether population means differ.
-
Significance Tests for $\mu_2 - \mu_1$
• Typically we wish to test whether the two population means differ (null hypothesis being no difference, "no effect").
• $H_0$: $\mu_2 - \mu_1 = 0$ ($\mu_1 = \mu_2$)
• $H_a$: $\mu_2 - \mu_1 \neq 0$ ($\mu_1 \neq \mu_2$)
• Test statistic:
  $t = \frac{(\bar{y}_2 - \bar{y}_1) - 0}{se} = \frac{\bar{y}_2 - \bar{y}_1}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
-
Test statistic has usual form of (estimate of parameter − $H_0$ value)/standard error.
• P-value: 2-tail probability from t distribution
• For 1-sided test (such as $H_a$: $\mu_2 - \mu_1 > 0$), P-value = one-tail probability from t distribution (but, not robust)
• Interpretation of P-value and conclusion using $\alpha$-level same as in one-sample methods
  ex. Suppose P-value = 0.58. Then, under supposition that null hypothesis true, probability = 0.58 of getting data like observed or even "more extreme," where "more extreme" determined by $H_a$
-
Example: Comparing female and male mean number of close friends, $H_0$: $\mu_1 = \mu_2$, $H_a$: $\mu_1 \neq \mu_2$
Difference between sample means = 8.9 − 8.3 = 0.6
se = 1.09 (same as in CI calculation)
Test statistic t = 0.6/1.09 = 0.55
P-value = 2(0.29) = 0.58 (using standard normal table)
If $H_0$ true of equal population means, would not be unusual to get samples such as observed.
For $\alpha$ = 0.05 = P(Type I error), not enough evidence to reject $H_0$. (Plausible that population means are identical.)
For $H_a$: $\mu_1 < \mu_2$ (i.e., $\mu_2 - \mu_1 > 0$), P-value = 0.29
For $H_a$: $\mu_1 > \mu_2$ (i.e., $\mu_2 - \mu_1 < 0$), P-value = 1 − 0.29 = 0.71
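The three P-values above can be reproduced with the standard normal approximation the slide uses (both samples are large). This is a sketch with females as group 1 and males as group 2, as in the example; the helper name `normal_upper_tail` is ours.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Group 1: 486 females (mean 8.3, s = 15.6); group 2: 354 males (mean 8.9, s = 15.5)
se = math.sqrt((15.6 / math.sqrt(486))**2 + (15.5 / math.sqrt(354))**2)
t = (8.9 - 8.3) / se  # (ybar2 - ybar1) / se

p_two_sided = 2 * normal_upper_tail(abs(t))  # Ha: mu1 != mu2
p_greater = normal_upper_tail(t)             # Ha: mu2 - mu1 > 0
p_less = 1 - p_greater                       # Ha: mu2 - mu1 < 0
print(round(se, 2), round(t, 2),
      round(p_two_sided, 2), round(p_greater, 2), round(p_less, 2))
```

The one-sided P-values always sum to 1, and the two-sided value is twice the smaller of the two.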
-
Equivalence of CI and Significance Test
"$H_0$: $\mu_1 = \mu_2$ rejected (not rejected) at $\alpha$ = 0.05 level in favor of $H_a$: $\mu_1 \neq \mu_2$"
is equivalent to
"95% CI for $\mu_1 - \mu_2$ does not contain 0 (contains 0)"
Example: P-value = 0.58, so "We do not reject $H_0$ of equal population means at 0.05 level."
95% CI of (−1.5, 2.7) contains 0.
(For $\alpha$ other than 0.05, corresponds to $100(1 - \alpha)$% confidence)
-
Alternative inference comparing means assumes equal population standard deviations
• We will not consider formulas for this approach here (in Sec. 7.5 of text), as it's a special case of "analysis of variance" methods studied later in Chapter 12.
  This CI and test uses t distribution with $df = n_1 + n_2 - 2$.
• We will see how software displays this approach and the one we've used that does not assume equal population standard deviations.
-
Example: Exercise 7.[?]0, p. 21[?]. Improvement scores for
  therapy A: 10, 20, 30
  therapy B: 30, 45, 45
A: mean = 20, $s_1$ = 10
B: mean = 40, $s_2$ = 8.66
Data file, which we input into SPSS and analyze:

Subject   Therapy   Improvement
1         A         10
2         A         20
3         A         30
4         B         30
5         B         45
6         B         45
-
Test of $H_0$: $\mu_1 = \mu_2$, $H_a$: $\mu_1 \neq \mu_2$
Test statistic t = (40 − 20)/7.64 = 2.62
When df = 4, P-value = 2(0.0294) = 0.059.
For one-sided $H_a$: $\mu_1 < \mu_2$ (i.e., predict before study that therapy B is better), P-value = 0.029
With $\alpha$ = 0.05, insufficient evidence to reject null for two-sided $H_a$, but can reject null for one-sided $H_a$ and conclude therapy B better.
(But remember, must choose $H_a$ ahead of time!)
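The numbers above can be recomputed from the six improvement scores. The sketch below uses the standard error formula from the earlier slides (with equal group sizes it coincides with the equal-variance version) and df = 4 as stated on the slide. The helper `t_upper_tail` is ours: it integrates the t density numerically with Simpson's rule, as a stand-in for a t table or statistical software.

```python
import math
from statistics import mean, stdev

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on the t density over [0, t]."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    density = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = density(0) + density(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - total * h / 3  # half the area lies above 0

a = [10, 20, 30]  # therapy A
b = [30, 45, 45]  # therapy B

se = math.sqrt(stdev(a)**2 / len(a) + stdev(b)**2 / len(b))
t = (mean(b) - mean(a)) / se
df = len(a) + len(b) - 2  # df used on the slide

p_one_sided = t_upper_tail(t, df)  # Ha: mu1 < mu2
p_two_sided = 2 * p_one_sided      # Ha: mu1 != mu2
print(round(se, 2), round(t, 2), round(p_one_sided, 3), round(p_two_sided, 3))
```

The one-sided P-value falls below 0.05 while the two-sided one does not, which is why the alternative must be fixed before looking at the data.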
-
How does software get df for "unequal variance" method?
• When allow $\sigma_1^2 \neq \sigma_2^2$, recall that
  $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• The "adjusted" degrees of freedom for the t distribution approximation is (Welch-Satterthwaite approximation):
  $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
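As an illustration of the formula, the following sketch evaluates the adjusted df for the therapy example (the function name `welch_df` is ours):

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate df for the unequal-variance t method."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Therapy example: s1 = 10, s2 = 8.66, three subjects per group
df = welch_df(10, 3, 8.66, 3)
print(round(df, 2))  # 3.92
```

The adjusted df is slightly below the equal-variance value $n_1 + n_2 - 2 = 4$; the two agree exactly only when the sample variances and sample sizes are equal.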
-
Some comments about comparing means
• If data show potentially large differences in variability (say, the larger s being at least double the smaller s), safer to use the "unequal variances" method.
• One-sided t tests not robust against severe violations of normal population assumption, when n is small. Better then to use "nonparametric" methods (which do not assume a particular form of population distribution) for one-sided inference; see text Sec. 7.7.
• CI more informative than test, showing whether plausible values are near or far from 0.
-
Effect Size
• When groups have similar variability, a summary measure of effect size is
  $\text{effect size} = \frac{\text{mean}_2 - \text{mean}_1}{\text{standard deviation in each group}}$
• Example: The therapies had sample means of 20 for A and 40 for B and standard deviations of 10 and 8.66. If common standard deviation in each group is estimated to be s = 9.35 (say), then
  effect size = (40 − 20)/9.35 = 2.1.
  Mean for therapy B estimated to be about two standard deviations larger than the mean for therapy A.
  This is a large effect.
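The slide takes s = 9.35 as given ("say"). One plausible source, which is our assumption rather than something the slide states, is the pooled standard deviation of the two groups. A minimal sketch (the function name `pooled_sd` is ours):

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of a common standard deviation (assumed estimator)."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

s = pooled_sd(10, 3, 8.66, 3)
effect_size = (40 - 20) / s
print(round(s, 2), round(effect_size, 1))  # 9.35 2.1
```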
-
This effect size measure is sometimes called "Cohen's d." He considered
d = 0.2 = weak, d = 0.5 = medium, d > 0.8 large.
Example: Which study showed the largest effect?
  1. $\bar{y}_1 = 20$, $\bar{y}_2 = 30$, $s = 10$
  2. $\bar{y}_1 = 200$, $\bar{y}_2 = 300$, $s = 100$
  3. $\bar{y}_1 = 20$, $\bar{y}_2 = 25$, $s = 2$
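One way to answer the question is to compute the effect size for each study. A short sketch (the dictionary layout and the function name `cohens_d` are ours):

```python
# Each study: (mean of group 1, mean of group 2, common standard deviation)
studies = {
    1: (20, 30, 10),
    2: (200, 300, 100),
    3: (20, 25, 2),
}

def cohens_d(mean1, mean2, s):
    """Difference between means in standard deviation units."""
    return (mean2 - mean1) / s

effects = {study: cohens_d(*values) for study, values in studies.items()}
largest = max(effects, key=effects.get)
print(effects, largest)
```

Studies 1 and 2 have the same effect size even though their raw differences are 10 and 100, while study 3 has the smallest raw difference but the largest effect.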
-
Comparing Means with Dependent Samples
• Setting: Each sample has the same subjects (as in longitudinal studies or crossover studies) or matched pairs of subjects.
• Then, it is not true that for comparing two statistics, $se = \sqrt{(se_1)^2 + (se_2)^2}$.
• Must allow for "correlation" between estimates. (Why?)
-
Example: Cell-phone study also had experiment with same subjects in each group (data on p. 194 of text).
For this "matched-pairs" design, data file has the form

Subject   Cell_no   Cell_yes
1         604       636
2         556       623
3         540       615
...
(for 32 subjects)

Sample means are 534.6 milliseconds without cell phone, 585.2 milliseconds using cell phone.
-
We reduce the 32 pairs of observations to 32 difference scores,
  636 − 604 = 32
  623 − 556 = 67
  615 − 540 = 75
  ...
and analyze them with standard methods for a single sample:
  $\bar{y}_d = 50.6 = 585.2 - 534.6$, $s_d = 52.5$ = std dev of 32, 67, 75, ...
  $se = s_d/\sqrt{n} = 52.5/\sqrt{32} = 9.28$
For a 95% CI, df = n − 1 = 31, t-score = 2.04.
We get 50.6 ± 2.04(9.28), or (31.7, 69.5).
• We can be 95% confident that the population mean using a cell phone is between 31.7 and 69.5 milliseconds higher than without cell phone.
• For testing $H_0$: $\mu_d = 0$ against $H_a$: $\mu_d \neq 0$, the test statistic is
  $t = (\bar{y}_d - 0)/se = 50.6/9.28 = 5.46$, df = 31.
  Two-sided P-value < 0.00001, so there is extremely strong evidence against the null hypothesis of no difference between the population means.
-
In class, we will use SPSS to
• Run the dependent-samples t analyses.
• Plot cell_yes against cell_no and observe a strong positive correlation (0.814), which illustrates how an analysis that ignores the dependence between the observations would be inappropriate.
• Note that one subject (number 28) is an outlier (unusually high) on both variables.
• With outlier deleted, SPSS tells us that t = 5.26, df = 30 for comparing means (P = 0.00001), 95% CI of (29.1, 66.0). The previous results were not influenced greatly by the outlier.
-
SPSS output for original dependent-samples t analysis (including the outlier)
-
Some comments
• Dependent samples have advantages of (1) controlling sources of potential bias (e.g., balancing samples on variables that could affect the response), (2) having a smaller se for the difference of means, when the pairwise responses are highly positively correlated (in which case, the difference scores show less variability than the separate samples).
• With dependent samples, why can't we use the se formula for independent samples?
  $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
-
Ex. (artificial, but makes the point) Weights before and after anorexia therapy

Subject   Before   After   Difference
1         115      122     7
2         91       98      7
3         100      107     7
4         1[?]2    1[?]9   7

Lots of variability within each group of observations, but no variability for the difference scores (so, actual se is much smaller than independent samples formula suggests).
If you plot x = before against y = after, what do you see?
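The contrast between the two standard errors can be made concrete. The sketch below uses only the first three subjects, whose weights are fully legible in this transcript; dropping the fourth subject is our simplification, not part of the original example.

```python
import math
from statistics import stdev

before = [115, 91, 100]
after = [122, 98, 107]
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

# Formula for independent samples (not appropriate for paired data)
se_independent = math.sqrt(stdev(before)**2 / n + stdev(after)**2 / n)
# Paired analysis: se of the mean of the difference scores
se_paired = stdev(diffs) / math.sqrt(n)
print(diffs, round(se_independent, 2), se_paired)
```

Every difference equals 7, so the paired standard error is 0, while the independent-samples formula gives a value near 10 because it reflects the spread between subjects rather than the spread of the changes.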
-
• The McNemar test (pp. 201-203) compares proportions with dependent samples.
• Fisher's exact test (pp. 203-204) compares proportions for small independent samples.
• Sometimes it's more useful to compare groups using ratios rather than differences of parameters.
-
Example: U.S. Dept. of Justice reports that proportion of adults in prison is about
900/100,000 for males, 60/100,000 for females.
Difference: 900/100,000 − 60/100,000 = 840/100,000 = 0.0084
Ratio: [900/100,000]/[60/100,000] = 900/60 = 15.0
In applications in which the proportion refers to an undesirable outcome (e.g., most medical studies), the ratio is called the relative risk. Inference methods (CI, test) are available for it also.
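The two summaries can be computed side by side (variable names are ours):

```python
p_male = 900 / 100_000
p_female = 60 / 100_000

difference = p_male - p_female  # difference of proportions
ratio = p_male / p_female       # relative risk
print(round(difference, 4), round(ratio, 1))  # 0.0084 15.0
```

When both proportions are close to 0, the difference is necessarily tiny, while the ratio shows that the rate for males is 15 times the rate for females.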
-
A few summary questions
1. Give an example of (a) independent samples, (b) dependent samples.
2. Give an example of (a) response var., (b) categorical explanatory var., and identify whether response is quantitative or categorical and state the appropriate analyses.
3. Suppose that a 95% CI for difference between Massachusetts and Texas in the population proportion supporting legal same-sex marriage is (0.1[?], 0.22).
   a. Population proportion of support is higher in Texas.
   b. Since 0.1[?] and 0.22 < 0.50, less than half the population supports legal same-sex marriage.
   c. The 99% CI could be (0.17, 0.20).
   d. It is plausible that population proportions are equal.
   e. P-value for testing equal population proportions against two-sided alternative could be 0.40.
   f. We can be 95% confident that the population proportion of support in MA is between 0.1[?] higher and 0.22 higher than in TX.
-
Example: Anorexia study, studying weight change for 3 groups (behavioral therapy, family therapy, control). Patients randomly assigned to one of the three therapies. Is this an example of independent samples or dependent samples?