
STAT3010: Lecture 3

REVIEW OF HYPOTHESIS TESTS (CONT'D)

Two Independent Populations (unequal variances)

Example: An experiment was conducted to compare the mean heart rates following 30 minutes of aerobic exercise in females aged 20-24 with those of females aged 25-30. For each participant, a 10-second heart rate was recorded following 30 minutes of intense aerobic exercise and converted to beats per minute. The sample data are:

Statistic                 Age: 20-24   Age: 25-30
Sample size               15           10
Mean heart rate           146.22       141.10
Variance in heart rates   40.0         10.0

Use the data to test whether there is a significant difference in the mean heart rates following 30 minutes of aerobic exercise between the two groups.

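$$
t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

$$
df = \frac{\left[(se_1)^2 + (se_2)^2\right]^2}{\dfrac{(se_1)^4}{n_1 - 1} + \dfrac{(se_2)^4}{n_2 - 1}}
$$

where $se_i = s_i / \sqrt{n_i}$ is the standard error of the mean for sample $i$.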
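
As an illustrative check, the test statistic and degrees-of-freedom formulas can be applied to the heart-rate summary data. This is only a sketch of the arithmetic; the two-sided alternative and the significance level $\alpha = 0.05$ are assumptions, since the notes do not state them.

$$
(se_1)^2 = \frac{s_1^2}{n_1} = \frac{40.0}{15} \approx 2.667, \qquad (se_2)^2 = \frac{s_2^2}{n_2} = \frac{10.0}{10} = 1.000
$$

$$
t_0 = \frac{146.22 - 141.10}{\sqrt{2.667 + 1.000}} = \frac{5.12}{1.915} \approx 2.67
$$

$$
df = \frac{(2.667 + 1.000)^2}{\dfrac{(2.667)^2}{14} + \dfrac{(1.000)^2}{9}} = \frac{13.444}{0.508 + 0.111} \approx 21.7
$$

Rounding the degrees of freedom down to 21, the two-sided critical value at the assumed $\alpha = 0.05$ is $t_{0.025,\,21} = 2.080$. Since $|t_0| = 2.67$ exceeds it, $H_0: \mu_1 = \mu_2$ would be rejected under these assumptions.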


Example: An educator believes that new reading activities in the classroom will help elementary school pupils improve some aspects of their reading ability. She arranges for a third-grade class of 21 students to take part in these activities for an eight-week period. A control classroom of 23 third-graders follows the same curriculum without the activities. At the end of the eight weeks, all students are given a Degree of Reading Power (DRP) test, which measures the aspects of reading ability that the treatment is designed to improve. The data appear in the following table.

DRP scores for third-graders

Treatment Group (n = 21):
24 61 59 46 43 44 52 43 58 67 62
57 71 49 54 43 53 57 49 56 33

Control Group (n = 23):
42 33 46 37 43 41 10 42 55 19 17 55
26 54 60 28 62 20 53 48 37 85 42

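SAS CODE:

options LS=80 PS=60 nodate;
data DRP;
  /* Group is a character variable (T = treatment, C = control); x is the DRP score */
  input Group $ x;
  cards;
T 24
T 61
T 59
. . .
C 42
C 33
C 46
. . .
;
/* Two-sample t-test of x by group; reports pooled and Satterthwaite results */
Proc ttest;
  class group;
  var x;
run;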


SAS OUTPUT:

The TTEST Procedure

Statistics

                              Lower CL             Upper CL   Lower CL
Variable   Group         N    Mean       Mean      Mean       Std Dev    Std Dev
x          C            23    34.106     41.522    48.937     13.263     17.149
x          T            21    46.466     51.476    56.487     8.4213     11.007
x          Diff (1-2)         -18.82     -9.954    -1.091     11.998     14.551

Statistics

                         Upper CL
Variable   Group         Std Dev    Std Err    Minimum    Maximum
x          C             24.271     3.5758     10         85
x          T             15.895     2.402      24         71
x          Diff (1-2)    18.495     4.3919

T-Tests

Variable   Method          Variances    DF      t Value    Pr > |t|
x          Pooled          Equal        42      -2.27      0.0286
x          Satterthwaite   Unequal      37.9    -2.31      0.0264
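
As a rough hand check, the T-Tests rows can be reproduced from the Statistics table, assuming the printed means and standard errors are accurate to the digits shown. Group 1 is C and group 2 is T, so Diff (1-2) is control minus treatment.

$$
t_{\text{pooled}} = \frac{-9.954}{4.3919} \approx -2.27
$$

$$
t_{\text{unequal}} = \frac{-9.954}{\sqrt{(3.5758)^2 + (2.402)^2}} = \frac{-9.954}{4.3077} \approx -2.31
$$

$$
df = \frac{\left[(3.5758)^2 + (2.402)^2\right]^2}{\dfrac{(3.5758)^4}{22} + \dfrac{(2.402)^4}{20}} \approx \frac{344.32}{7.431 + 1.664} \approx 37.9
$$

The printed p-values are two-sided. Because the educator's claim is directional (treatment scores higher than control), a one-sided p-value would be half the printed value, about 0.013 for the Satterthwaite test. Treating the test as one-sided is an interpretation of the example, not something the output states.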

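Two Dependent Populations (paired data)

relevant formulas:

$$
\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}, \qquad
s_d^2 = \frac{\sum_{i=1}^{n} d_i^2 - \dfrac{\left(\sum_{i=1}^{n} d_i\right)^2}{n}}{n - 1}, \qquad
t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \quad \text{with } df = n - 1
$$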


Example: A nutrition expert is examining a weight-loss program to evaluate its effectiveness. Ten subjects are randomly selected for the investigation. The subjects' initial weights are recorded, they follow the program for 6 weeks, and they are weighed again. The data are as follows:

Subject   Initial Weight   Final Weight
1         180              165
2         142              138
3         126              128
4         138              136
5         175              170
6         205              197
7         116              115
8         142              128
9         157              144
10        136              130

Do the data suggest that the weight-loss program worked?


SAS CODE:

options LS=80 PS=60 nodate;
data paired;
  input preweight postweight;
  /* Difference per subject: positive d means weight was lost */
  d=preweight-postweight;
  lines;
180 165
142 138
126 128
138 136
175 170
205 197
116 115
142 128
157 144
136 130
;
/* Mean, std dev, t statistic and two-sided p-value for each numeric variable; the row for d is the paired t-test */
proc means mean std t prt;
  title 'Paired t-test';
proc print;
run;

SAS OUTPUT:

Paired t-test

The MEANS Procedure

Variable             Mean         Std Dev    t Value    Pr > |t|
-----------------------------------------------------------------
preweight     151.7000000      27.4268725      17.49      <.0001
postweight    145.1000000      24.8616170      18.46      <.0001
d               6.6000000       5.8156876       3.59      0.0059
-----------------------------------------------------------------

Paired t-test

Obs    preweight    postweight     d
  1       180           165       15
  2       142           138        4
  3       126           128       -2
  4       138           136        2
  5       175           170        5
  6       205           197        8
  7       116           115        1
  8       142           128       14
  9       157           144       13
 10       136           130        6
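
The row for d in the MEANS output can be reproduced by hand from the ten listed differences, whose sum is 66 and whose sum of squares is 740. This is a sketch of the arithmetic, assuming $H_0: \mu_d = 0$ (no mean weight change).

$$
\bar{d} = \frac{66}{10} = 6.6, \qquad
s_d^2 = \frac{740 - \dfrac{66^2}{10}}{9} = \frac{304.4}{9} \approx 33.82, \qquad
s_d \approx 5.816
$$

$$
t = \frac{6.6 - 0}{5.816 / \sqrt{10}} = \frac{6.6}{1.839} \approx 3.59, \qquad df = 10 - 1 = 9
$$

The printed p-value, 0.0059, is two-sided. If the question "did the program work?" is read as the one-sided alternative $\mu_d > 0$ (an interpretation, with d defined as initial minus final weight), the p-value would be half of that, roughly 0.003.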


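Categorical Data - Chi-Square Tests

relevant formulas:

$$
\chi^2_0 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}; \qquad
E_{ij} = \frac{(r_i)(c_j)}{n}; \qquad
df = (r - 1)(c - 1)
$$

where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies in row $i$ and column $j$, $r_i$ is the total of row $i$, $c_j$ is the total of column $j$, $n$ is the total sample size, and $r$ and $c$ are the numbers of rows and columns.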

NOTE: The chi-square distribution is right-skewed and has a single parameter, its degrees of freedom. Table B.5 gives the critical values that define the critical region.

Example: The following data were collected in a multisite observational study of medical effectiveness in Type II diabetes. Three sites were involved: a health maintenance organization (HMO), a university teaching hospital (UTH), and an independent practice association (IPA). Type II diabetic patients were enrolled in the study from each site and monitored over a 3-year observation period. The data below show the treatment regimens of patients, measured at baseline, by site.

Treatment Regimen

Site     Diet & Exercise   Oral Hypoglycemics   Insulin   Total
HMO      294               827                  579       1700
UTH      132               288                  352       772
IPA      189               516                  404       1109
Total    615               1631                 1335      3581

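We wish to use the data to test the hypothesis that the two variables (site and treatment regimen) are independent (i.e., there is no difference in treatment regimens across sites). The hypotheses are written as follows:

H0: Site and treatment regimen are independent.
H1: Site and treatment regimen are not independent.

The expected frequencies are computed as

$$
E_{ij} = \frac{(r_i)(c_j)}{n}
$$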


$$
\chi^2_0 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
$$

SAS CODE:

options LS=80 PS=60 nodate;
data independent;
  /* One record per cell of the table: site, treatment regimen, cell count */
  input site $ trt $ count;
  cards;
hmo diet 294
hmo oral 827
hmo insulin 579
uth diet 132
uth oral 288
uth insulin 352
ipa diet 189
ipa oral 516
ipa insulin 404
;
run;
/* weight count tells PROC FREQ that each record stands for count observations */
proc freq;
  tables site*trt/expected cellchi2 chisq;
  weight count;
  title 'Chi-Square Test for Independence';
proc print;
run;


SAS OUTPUT:

Chi-Square Test for Independence

The FREQ Procedure

Table of site by trt

site            trt

Frequency      |
Expected       |
Cell Chi-Square|
Percent        |
Row Pct        |
Col Pct        |diet    |insulin |oral    |  Total
---------------+--------+--------+--------+
hmo            |    294 |    579 |    827 |   1700
               | 291.96 | 633.76 | 774.28 |
               | 0.0143 | 4.7318 | 3.5895 |
               |   8.21 |  16.17 |  23.09 |  47.47
               |  17.29 |  34.06 |  48.65 |
               |  47.80 |  43.37 |  50.71 |
---------------+--------+--------+--------+
ipa            |    189 |    404 |    516 |   1109
               | 190.46 | 413.44 |  505.1 |
               | 0.0112 | 0.2154 |  0.235 |
               |   5.28 |  11.28 |  14.41 |  30.97
               |  17.04 |  36.43 |  46.53 |
               |  30.73 |  30.26 |  31.64 |
---------------+--------+--------+--------+
uth            |    132 |    352 |    288 |    772
               | 132.58 |  287.8 | 351.61 |
               | 0.0026 |  14.32 | 11.509 |
               |   3.69 |   9.83 |   8.04 |  21.56
               |  17.10 |  45.60 |  37.31 |
               |  21.46 |  26.37 |  17.66 |
---------------+--------+--------+--------+
Total               615     1335     1631     3581
                  17.17    37.28    45.55   100.00

Statistics for Table of site by trt

Statistic                      DF       Value      Prob
------------------------------------------------------
Chi-Square                      4     34.6291    <.0001
Likelihood Ratio Chi-Square     4     34.4975    <.0001
Mantel-Haenszel Chi-Square      1     10.5953    0.0011
Phi Coefficient                        0.0983
Contingency Coefficient                0.0979
Cramer's V                             0.0695

Sample Size = 3581

Chi-Square Test for Independence

Obs    site    trt        count
1      hmo     diet       294
2      hmo     oral       827
3      hmo     insulin    579
4      uth     diet       132
5      uth     oral       288
6      uth     insulin    352
7      ipa     diet       189
8      ipa     oral       516
9      ipa     insulin    404
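
The Expected and Cell Chi-Square entries can be checked by hand. The sketch below works through the HMO / diet cell and then sums the nine printed cell contributions in the order hmo, ipa, uth. The significance level $\alpha = 0.05$ used for the critical value is an assumption, since the notes do not state one.

$$
E_{11} = \frac{(1700)(615)}{3581} \approx 291.957, \qquad
\frac{(O_{11} - E_{11})^2}{E_{11}} = \frac{(294 - 291.957)^2}{291.957} \approx 0.0143
$$

$$
\chi^2_0 \approx 0.0143 + 4.7318 + 3.5895 + 0.0112 + 0.2154 + 0.2350 + 0.0026 + 14.3200 + 11.5090 \approx 34.63
$$

$$
df = (3 - 1)(3 - 1) = 4
$$

At the assumed $\alpha = 0.05$ the critical value is $\chi^2_{0.05,\,4} = 9.488$, so $\chi^2_0 \approx 34.63$ falls well inside the critical region, in agreement with the printed p-value of less than 0.0001. The largest contributions come from the UTH insulin and oral cells, which is where the observed counts depart most from independence.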