Two Populations

  • 8/18/2019 Two Populations

    1/41

    Copyright ©2009 Pearson Education. Inc. 

    7. Comparison of Two Groups

    Goal: Use CI and/or significance test to compare

    means (quantitative variable)

    proportions (categorical variable)

                             Group 1    Group 2    Estimate

    Population mean          $\mu_1$    $\mu_2$    $\bar{y}_2 - \bar{y}_1$

    Population proportion    $\pi_1$    $\pi_2$    $\hat{\pi}_2 - \hat{\pi}_1$

    We conduct inference about the difference between the means
    or difference between the proportions (order irrelevant).

    Outcome measure: mean response time for a subject over a large number of trials

    • Purpose of study: Analyze whether (conceptual) population mean response time differs significantly for the two groups, and if so, by how much.

    • Data

    Cell-phone group: $\bar{y}_1$ = 585.2 milliseconds, $s_1$ = 89.6

    Control group: $\bar{y}_2$ = 533.7, $s_2$ = 65.3

    Shape? Outliers?

    Types of variables and samples

    • The outcome variable on which comparisons are made is the response variable.

    • The variable that defines the groups to be compared is the explanatory variable.

    Example: Reaction time is the response variable.

    Experimental group is the explanatory variable, a categorical variable with categories (cell-phone, control).

    Or, could express experimental group as "cell-phone use" with categories (yes, no).

    • Different methods apply for

    independent samples: different samples, no matching, as in this example and in "cross-sectional studies"

    dependent samples: natural matching between each subject in one sample and a subject in the other sample, such as in "longitudinal studies," which observe subjects repeatedly over time

    Example: We later consider a separate experiment in which the same subjects formed the control group at one time and the cell-phone group at another time.

    se for difference between two estimates (independent samples)

    • The sampling distribution of the difference between two estimates is approximately normal (large $n_1$ and $n_2$) and has estimated

      $se = \sqrt{(se_1)^2 + (se_2)^2}$

    Example: Data on "Response times" has

      32 using cell phone with sample mean 585.2, s = 89.6
      32 in control group with sample mean 533.7, s = 65.3

    What is se for difference between sample means of 585.2 - 533.7 = 51.5?
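    The question on this slide can be worked through numerically. The sketch below is only an illustration, not part of the original slides: it assumes the summary statistics of the response-time example are s = 89.6 and s = 65.3 with n = 32 in each group, and the function names are made up for this example.

    ```python
    import math

    def se_of_mean(s, n):
        """Standard error of one sample mean: s / sqrt(n)."""
        return s / math.sqrt(n)

    def se_of_difference(se1, se2):
        """se of a difference between two independent estimates."""
        return math.sqrt(se1 ** 2 + se2 ** 2)

    # Assumed summary statistics from the response-time example
    se_cell = se_of_mean(89.6, 32)      # cell-phone group
    se_control = se_of_mean(65.3, 32)   # control group
    se_diff = se_of_difference(se_cell, se_control)

    print(round(se_cell, 1), round(se_control, 1), round(se_diff, 1))
    ```

    Under these assumptions the se of the difference comes out near 19.6 milliseconds, larger than either group's own se.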

    CI comparing two proportions

    • Recall se for a sample proportion used in a CI is

      $se = \sqrt{\hat{\pi}(1 - \hat{\pi})/n}$

    • So, the se for the difference between two sample proportions for independent samples is

      $se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$

    • A CI for the difference between population proportions is

      $(\hat{\pi}_2 - \hat{\pi}_1) \pm z \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$

    As usual, z depends on confidence level, 1.96 for 95% confidence

    Example: College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)

    Trends over time in percentage of binge drinking (consumption of 5 or more drinks in a row for men and 4 or more for women, at least once in past two weeks) and of activities perhaps influenced by it?

    "Have you engaged in unplanned sexual activities because of drinking alcohol?"

    • Estimated change in proportion saying "yes" is 0.213 - 0.192 = 0.021.

      $se = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}} = \sqrt{\frac{(.192)(.808)}{12{,}708} + \frac{(.213)(.787)}{8783}} = 0.0056$

    95% CI for change in population proportion is

      0.021 ± 1.96(0.0056) = 0.021 ± 0.011, or roughly (0.01, 0.03)

    We can be 95% confident that the population proportion saying "yes" was between about 0.01 larger and 0.03 larger in 2001 than in 1993.
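    The interval above can be reproduced with a few lines of code. This is a minimal sketch, not software output from the course; the function name is hypothetical and the inputs are the sample proportions and sample sizes that appear in the se formula (0.192 of 12,708 and 0.213 of 8783).

    ```python
    import math

    def two_proportion_ci(p1, n1, p2, n2, z=1.96):
        """CI for pi2 - pi1 from two independent samples (large-sample formula)."""
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
        diff = p2 - p1
        return diff, se, (diff - z * se, diff + z * se)

    diff, se, (lower, upper) = two_proportion_ci(0.192, 12708, 0.213, 8783)

    print(round(diff, 3), round(se, 4), round(lower, 3), round(upper, 3))
    ```

    Swapping the two groups in the call flips the sign of both endpoints, which is the sense in which the order of comparison is arbitrary.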

    Comments about CIs for difference between two population proportions

    • If 95% CI for $\pi_2 - \pi_1$ is (0.01, 0.03), then 95% CI for $\pi_1 - \pi_2$ is (-0.03, -0.01).

    It is arbitrary what we call Group 1 and Group 2 and what the order is for comparing the proportions.

    • When 0 is not in the CI, we can conclude that one population proportion is higher than the other.

    (e.g., if all positive values for Group 2 - Group 1, then conclude population proportion higher for Group 2 than Group 1)

    • When 0 is in the CI, it is plausible that the population proportions are identical.

    Example: Suppose 95% CI for change in population proportion (2001 - 1993) is (-0.01, 0.03).
    "95% confident that population proportion saying yes was between 0.01 smaller and 0.03 larger in 2001 than in 1993."

    • There is a significance test of $H_0$: $\pi_1 = \pi_2$ that the population proportions are identical (i.e., difference $\pi_1 - \pi_2 = 0$), using test statistic

      z = (difference between sample proportions)/se

    For unplanned sex in 1993 and 2001, z = diff./se = 0.021/0.0056 = 3.75

      Two-sided P-value = 0.0002

    This seems to be statistical significance without practical significance!
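    As a rough check of the z statistic and P-value, the sketch below uses only Python's standard library. It plugs in the rounded values 0.021 and 0.0056 and the unpooled se, so it is an approximation of the calculation described here rather than the pooled-se test described in the text; the function name is an assumption of this example.

    ```python
    from statistics import NormalDist

    def two_sided_p_from_z(z):
        """Two-tail probability beyond |z| under the standard normal."""
        return 2 * (1 - NormalDist().cdf(abs(z)))

    z = 0.021 / 0.0056          # difference between sample proportions / se
    p_value = two_sided_p_from_z(z)

    print(round(z, 2), round(p_value, 4))
    ```

    A tiny P-value alongside an estimated difference of only about 0.02 is what lets a result be statistically significant without being practically important.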

    Details about test on pp. 189-190 of text; use $se_0$, which pools data to get better estimate of se under $H_0$

    (We study this test as a special case of "chi-squared test" in next chapter, which deals with possibly many groups, many outcome categories)

    • The theory behind the CI uses the fact that sample proportions (and their differences) have approximate normal sampling distributions for large n's, by the Central Limit Theorem, assuming randomization

    • In practice, formula works ok if at least 10 outcomes of each type for each sample (Note: We don't use t dist. for inference about proportions; however, there are specialized small-sample methods, e.g., using binomial distribution)

    Quantitative Responses: Comparing Means

    • Parameter: $\mu_2 - \mu_1$

    • Estimator: $\bar{y}_2 - \bar{y}_1$

    • Estimated standard error: $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

      - Sampling dist.: Approximately normal (large n's, by CLT)

      - CI for independent random samples from two normal population distributions has form

        $(\bar{y}_2 - \bar{y}_1) \pm t(se)$, which is $(\bar{y}_2 - \bar{y}_1) \pm t \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

      - Formula for df for t-score is complex (later). If both sample sizes are at least 30, can just use z-score

    Example: GSS data on "number of close friends"

    Use gender as the explanatory variable:

      486 females with mean 8.3, s = 15.6

      354 males with mean 8.9, s = 15.5

      $se_1 = s_1/\sqrt{n_1} = 15.6/\sqrt{486} = 0.708$

      $se_2 = s_2/\sqrt{n_2} = 15.5/\sqrt{354} = 0.824$

      $se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{(0.708)^2 + (0.824)^2} = 1.09$

    Estimated difference of 8.9 - 8.3 = 0.6 has a margin of error of 1.96(1.09) = 2.1, and 95% CI is 0.6 ± 2.1, or (-1.5, 2.7).
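    The same arithmetic can be sketched in code. This is an illustration only: the function name is made up, and the inputs assume the summary statistics of the close-friends example are 486 females (mean 8.3, s = 15.6) and 354 males (mean 8.9, s = 15.5), with the z-score 1.96 used because both samples are large.

    ```python
    import math

    def mean_difference_ci(mean1, s1, n1, mean2, s2, n2, crit=1.96):
        """CI for mu2 - mu1 from independent samples, unequal-variance se."""
        se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
        diff = mean2 - mean1
        return diff, se, (diff - crit * se, diff + crit * se)

    # Group 1 = females, group 2 = males (assumed summary statistics)
    diff, se, (lower, upper) = mean_difference_ci(8.3, 15.6, 486, 8.9, 15.5, 354)

    print(round(diff, 1), round(se, 2), round(lower, 1), round(upper, 1))
    ```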

    • We can be 95% confident that the population mean number of close friends for males is between 1.5 less and 2.7 more than the population mean number of close friends for females.

    • Order is arbitrary. 95% CI comparing means for females - males is (-2.7, 1.5)

    • When CI contains 0, it is plausible that the difference is 0 in the population (i.e., population means equal)

    • Here, normal population assumption clearly violated. For large n's, no problem because of CLT, and for small n's the method is robust. (But, means may not be relevant for very highly skewed data.)

    • Alternatively could do significance test to find strength of evidence about whether population means differ.

    Significance Tests for $\mu_2 - \mu_1$

    • Typically we wish to test whether the two population means differ (null hypothesis being no difference, "no effect").

    • $H_0$: $\mu_2 - \mu_1 = 0$ ($\mu_1 = \mu_2$)

    • $H_a$: $\mu_2 - \mu_1 \neq 0$ ($\mu_1 \neq \mu_2$)

    • Test Statistic:

      $t = \frac{(\bar{y}_2 - \bar{y}_1) - 0}{se} = \frac{\bar{y}_2 - \bar{y}_1}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

    Test statistic has usual form of

    (estimate of parameter - $H_0$ value)/standard error.

    • P-value: 2-tail probability from t distribution

    • For 1-sided test (such as $H_a$: $\mu_2 - \mu_1 > 0$), P-value = one-tail probability from t distribution (but, not robust)

    • Interpretation of P-value and conclusion using $\alpha$-level same as in one-sample methods

    ex. Suppose P-value = 0.58. Then, under supposition that null hypothesis true, probability = 0.58 of getting data like observed or even "more extreme," where "more extreme" determined by $H_a$

    Example: Comparing female and male mean number of close friends, $H_0$: $\mu_1 = \mu_2$, $H_a$: $\mu_1 \neq \mu_2$

    Difference between sample means = 8.9 - 8.3 = 0.6

      se = 1.09 (same as in CI calculation)

      Test statistic t = 0.6/1.09 = 0.55

      P-value = 2(0.29) = 0.58 (using standard normal table)

    If $H_0$ true of equal population means, would not be unusual to get samples such as observed.

    For $\alpha$ = 0.05 = P(Type I error), not enough evidence to reject $H_0$. (Plausible that population means are identical.)

    For $H_a$: $\mu_1 < \mu_2$ (i.e., $\mu_2 - \mu_1 > 0$), P-value = 0.29

    For $H_a$: $\mu_1 > \mu_2$ (i.e., $\mu_2 - \mu_1 < 0$), P-value = 1 - 0.29 = 0.71
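    The three P-values in this example can be checked with the standard normal distribution, which is the table the example itself uses. The sketch below is illustrative; it assumes a difference of 0.6 and se = 1.09, and the variable names are not from the slides.

    ```python
    from statistics import NormalDist

    diff, se = 0.6, 1.09            # assumed values from the close-friends example
    t = diff / se                   # test statistic for H0: mu2 - mu1 = 0

    # With large samples the t distribution is close to the standard normal
    p_upper = 1 - NormalDist().cdf(t)          # Ha: mu2 - mu1 > 0
    p_lower = NormalDist().cdf(t)              # Ha: mu2 - mu1 < 0
    p_two_sided = 2 * min(p_upper, p_lower)    # Ha: mu2 - mu1 != 0

    print(round(t, 2), round(p_two_sided, 2), round(p_upper, 2), round(p_lower, 2))
    ```

    The two one-sided P-values always sum to 1, which is why the alternative must be chosen before looking at the data.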

    Equivalence of CI and Significance Test

    "$H_0$: $\mu_1 = \mu_2$ rejected (not rejected) at $\alpha$ = 0.05 level in favor of $H_a$: $\mu_1 \neq \mu_2$"

    is equivalent to

    "95% CI for $\mu_1 - \mu_2$ does not contain 0 (contains 0)"

    Example: P-value = 0.58, so "We do not reject $H_0$ of equal population means at 0.05 level"

      95% CI of (-1.5, 2.7) contains 0.

    (For $\alpha$ other than 0.05, corresponds to 100(1 - $\alpha$)% confidence)

    Alternative inference comparing means assumes equal population standard deviations

    • We will not consider formulas for this approach here (in Sec. 7.5 of text), as it's a special case of "analysis of variance" methods studied later in Chapter 12.

    This CI and test uses t distribution with

      $df = n_1 + n_2 - 2$

    • We will see how software displays this approach and the one we've used that does not assume equal population standard deviations.

    Example: Exercise 7.?0, p. 21?. Improvement scores for

      therapy A: 10, 20, 30

      therapy B: 30, 45, 45

    A: mean = 20, $s_1$ = 10

    B: mean = 40, $s_2$ = 8.66

    Data file, which we input into SPSS and analyze

      Subject   Therapy   Improvement
      1         A         10
      2         A         20
      3         A         30
      4         B         30
      5         B         45
      6         B         45

    Test of $H_0$: $\mu_1 = \mu_2$, $H_a$: $\mu_1 \neq \mu_2$

    Test statistic t = (40 - 20)/7.64 = 2.62

    When df = 4, P-value = 2(0.0294) = 0.059.

    For one-sided $H_a$: $\mu_1 < \mu_2$ (i.e., predict before study that therapy B is better), P-value = 0.029

    With $\alpha$ = 0.05, insufficient evidence to reject null for two-sided $H_a$, but can reject null for one-sided $H_a$ and conclude therapy B better.

    (but remember, must choose $H_a$ ahead of time!)
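    The se of 7.64 and t of 2.62 can be recovered from the raw scores. The sketch below is an illustration with a hypothetical function name, taking the improvement scores as 10, 20, 30 for therapy A and 30, 45, 45 for therapy B. It stops at the test statistic, because the Python standard library has no t distribution for the P-value.

    ```python
    import math
    from statistics import mean, stdev

    therapy_a = [10, 20, 30]   # assumed improvement scores, therapy A
    therapy_b = [30, 45, 45]   # assumed improvement scores, therapy B

    def two_sample_t(sample1, sample2):
        """t statistic for H0: mu1 = mu2 with se = sqrt(s1^2/n1 + s2^2/n2)."""
        se = math.sqrt(stdev(sample1) ** 2 / len(sample1)
                       + stdev(sample2) ** 2 / len(sample2))
        return (mean(sample2) - mean(sample1)) / se, se

    t, se = two_sample_t(therapy_a, therapy_b)

    print(round(stdev(therapy_b), 2), round(se, 2), round(t, 2))
    ```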

    How does software get df for "unequal variance" method?

    • When allow $\sigma_1^2 \neq \sigma_2^2$, recall that

      $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

    • The "adjusted" degrees of freedom for the t distribution approximation is (Welch-Satterthwaite approximation):

      $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
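    To make the Welch-Satterthwaite formula concrete, here is a minimal sketch of it as a function. The function name is an assumption of this example, and the inputs are the therapy summary statistics (s = 10 and s = 8.66 with 3 subjects per group).

    ```python
    def welch_df(s1, n1, s2, n2):
        """Welch-Satterthwaite adjusted degrees of freedom."""
        v1 = s1 ** 2 / n1          # estimated variance of the first sample mean
        v2 = s2 ** 2 / n2          # estimated variance of the second sample mean
        return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    df_unequal = welch_df(10, 3, 8.66, 3)
    df_equal = 3 + 3 - 2           # df when equal standard deviations are assumed

    print(round(df_unequal, 2), df_equal)
    ```

    With equal sample sizes and equal standard deviations the adjusted df reduces to $n_1 + n_2 - 2$; here the standard deviations are close, so the adjusted value is only slightly below 4.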

    Some comments about comparing means

    • If data show potentially large differences in variability (say, the larger s being at least double the smaller s), safer to use the "unequal variances" method

    • One-sided t tests not robust against severe violations of normal population assumption, when n is small. Better then to use "nonparametric" methods (which do not assume a particular form of population distribution) for one-sided inference; see text Sec. 7.7.

    • CI more informative than test, showing whether plausible values are near or far from 0.

    Effect Size

    • When groups have similar variability, a summary measure of effect size is

      $\text{effect size} = \frac{\text{mean}_2 - \text{mean}_1}{\text{standard deviation in each group}}$

    • Example: The therapies had sample means of 20 for A and 40 for B and standard deviations of 10 and 8.66. If common standard deviation in each group is estimated to be s = 9.35 (say), then

      effect size = (40 - 20)/9.35 = 2.1.

    Mean for therapy B estimated to be about two standard deviations larger than the mean for therapy A.

    This is a large effect.
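    One way to arrive at a common standard deviation of about 9.35 is to average the two sample variances, which is the usual pooled estimate when the groups have equal size. That choice is an assumption of this sketch, since the slide only says "(say)"; the function names are hypothetical.

    ```python
    import math

    def pooled_sd_equal_n(s1, s2):
        """Pooled standard deviation for two groups of equal size."""
        return math.sqrt((s1 ** 2 + s2 ** 2) / 2)

    def effect_size(mean1, mean2, common_sd):
        """Difference between means in standard deviation units."""
        return (mean2 - mean1) / common_sd

    s_common = pooled_sd_equal_n(10, 8.66)   # therapy A and B standard deviations
    d = effect_size(20, 40, s_common)        # therapy A mean 20, therapy B mean 40

    print(round(s_common, 2), round(d, 1))
    ```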

    This effect size measure is sometimes called "Cohen's d." He considered

      d = 0.2 = weak, d = 0.5 = medium, d > 0.8 large.

    Example: Which study showed the largest effect?

      1. $\bar{y}_1 = 20$, $\bar{y}_2 = 30$, $s = 10$

      2. $\bar{y}_1 = 200$, $\bar{y}_2 = 300$, $s = 100$

      3. $\bar{y}_1 = 20$, $\bar{y}_2 = 25$, $s = 2$

    Comparing Means with Dependent Samples

    • Setting: Each sample has the same subjects (as in longitudinal studies or crossover studies) or matched pairs of subjects

    • Then, it is not true that for comparing two statistics, $se = \sqrt{(se_1)^2 + (se_2)^2}$

    • Must allow for "correlation" between estimates (Why?)

    Example: Cell-phone study also had experiment with same subjects in each group

    (data on p. 194 of text)

    For this "matched-pairs" design, data file has the form

      Subject   Cell_no   Cell_yes
      1         604       636
      2         556       623
      3         540       615
      ...

    (for 32 subjects)

    Sample means are 534.6 milliseconds without cell phone

      585.2 milliseconds, using cell phone

    We reduce the 32 pairs of observations to 32 difference scores,

      636 - 604 = 32

      623 - 556 = 67

      615 - 540 = 75

      ...

    and analyze them with standard methods for a single sample

      $\bar{y}_d$ = 50.6 = 585.2 - 534.6, $s_d$ = 52.5 = std dev of 32, 67, 75, ...

      $se = s_d/\sqrt{n} = 52.5/\sqrt{32} = 9.28$

    For a 95% CI, df = n - 1 = 31, t-score = 2.04

    We get 50.6 ± 2.04(9.28), or (31.7, 69.5)
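    The matched-pairs interval is a one-sample calculation on the difference scores, which the following sketch reproduces from summary statistics. It is illustrative only: the function name is made up, the mean difference is taken to be 50.6 with standard deviation 52.5 over 32 subjects, and the t-score 2.04 is supplied by hand rather than computed.

    ```python
    import math

    def paired_ci(mean_diff, sd_diff, n, t_crit):
        """CI for the mean of difference scores (dependent samples)."""
        se = sd_diff / math.sqrt(n)
        margin = t_crit * se
        return se, (mean_diff - margin, mean_diff + margin)

    # Assumed summary statistics for the 32 difference scores; t-score for df = 31
    se, (lower, upper) = paired_ci(50.6, 52.5, 32, 2.04)

    print(round(se, 2), round(lower, 1), round(upper, 1))
    ```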

    • We can be 95% confident that the population mean using a cell phone is between 31.7 and 69.5 milliseconds higher than without cell phone.

    • For testing $H_0$: $\mu_d = 0$ against $H_a$: $\mu_d \neq 0$, the test statistic is

      $t = (\bar{y}_d - 0)/se$ = 50.6/9.28 = 5.46, df = 31,

    Two-sided P-value < 0.00001, so there is extremely strong evidence against the null hypothesis of no difference between the population means.

    In class, we will use SPSS to

    • Run the dependent-samples t analyses

    • Plot cell_yes against cell_no and observe a strong positive correlation (0.814), which illustrates how an analysis that ignores the dependence between the observations would be inappropriate.

    • Note that one subject (number 28) is an outlier (unusually high) on both variables

    • With outlier deleted, SPSS tells us that t = 5.26, df = 30 for comparing means (P = 0.00001), 95% CI of (29.1, 66.0). The previous results were not influenced greatly by the outlier.

    SPSS output for original dependent-samples t analysis (including the outlier)

    Some comments

    • Dependent samples have advantages of (1) controlling sources of potential bias (e.g., balancing samples on variables that could affect the response), (2) having a smaller se for the difference of means, when the pairwise responses are highly positively correlated (in which case, the difference scores show less variability than the separate samples)

    • With dependent samples, why can't we use the se formula for independent samples?

      $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

    Ex. (artificial, but makes the point) Weights before and after anorexia therapy

      Subject   Before   After   Difference
      1         115      122     7
      2         91       98      7
      3         100      107     7
      4         1?2      1?9     7

    Lots of variability within each group of observations, but no variability for the difference scores (so, actual se is much smaller than independent samples formula suggests)

    If you plot x = before against y = after, what do you see?

    • The McNemar test (pp. 201-203) compares proportions with dependent samples

    • Fisher's exact test (pp. 203-204) compares proportions for small independent samples

    • Sometimes it's more useful to compare groups using ratios rather than differences of parameters

    Example: U.S. Dept. of Justice reports that proportion of adults in prison is about

    900/100,000 for males, 60/100,000 for females

    Difference: 900/100,000 - 60/100,000 = 840/100,000 = 0.0084

    Ratio: [900/100,000]/[60/100,000] = 900/60 = 15.0

    In applications in which the proportion refers to an undesirable outcome (e.g., most medical studies), the ratio is called the relative risk. Inference methods (CI, test) are available for it also.
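    The contrast between the two summaries is easy to see in code. This small sketch assumes the rates quoted in the example (900 and 60 per 100,000); the variable names are illustrative.

    ```python
    # Assumed proportions of adults in prison, from the example
    p_males = 900 / 100_000
    p_females = 60 / 100_000

    difference = p_males - p_females     # difference of proportions
    relative_risk = p_males / p_females  # ratio of proportions

    print(round(difference, 4), round(relative_risk, 1))
    ```

    The difference looks negligible because both proportions are close to 0, while the ratio shows that the male rate is about 15 times the female rate.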

    A few summary questions

    1. Give an example of (a) independent samples, (b) dependent samples

    2. Give an example of (a) response var., (b) categorical explanatory var., and identify whether response is quantitative or categorical and state the appropriate analyses.

    3. Suppose that a 95% CI for difference between Massachusetts and Texas in the population proportion supporting legal same-sex marriage is (0.15, 0.22).

    a. Population proportion of support is higher in Texas

    b. Since 0.15 and 0.22 < 0.50, less than half the population supports legal same-sex marriage.

    c. The 99% CI could be (0.17, 0.20)

    d. It is plausible that population proportions are equal.

    e. P-value for testing equal population proportions against two-sided alternative could be 0.40.

    f. We can be 95% confident that the population proportion of support in MA is between 0.15 higher and 0.22 higher than in TX.

    Example: Anorexia study, studying weight change for 3 groups (behavioral therapy, family therapy, control).

    Patients randomly assigned to one of the three therapies. Is this an example of independent samples or dependent samples?