Tests of Significance · Tests of Significance Data from a Hypothetical Cohort Study Dead Alive...


Page 1: Tests of Significance · Tests of Significance Data from a Hypothetical Cohort Study Dead Alive Total % Dead Diabetic 200 200 400 50.0% Nondiabetic 100 300 400 25.0% Total 300 500


1

Tests of Significance

Analysis of Two-by-Two Table Data: Tests of Significance

UI-MEPI-J: Research Design and Methodology Workshop

2

Tests of Significance

ILL

VANILLA | + - | Total

-----------+-------------+------

+ | 43 11 | 54

- | 3 18 | 21

-----------+-------------+------

Total | 46 29 | 75

Single Table Analysis

Odds ratio 23.45

Cornfield 95% confidence limits for OR 5.07 < OR < 125.19*

RISK RATIO(RR)(Outcome:ILL=+; Exposure:VANILLA=+) 5.57

95% confidence limits for RR 1.94 < RR < 16.03

Ignore risk ratio if case control study

Chi-Squares P-values

----------- --------

Uncorrected: 27.22 0.00000018 <---

Mantel-Haenszel: 26.86 0.00000022 <---

Yates corrected: 24.54 0.00000073 <---
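The odds ratio and risk ratio in the output above can be reproduced from the four cell counts. The following is a minimal Python sketch, not the code of the program that produced the output. The function name `ratio_measures` is hypothetical, and the confidence interval uses the common log-based (Katz) approximation, which is an assumption about the method, although it gives limits close to those printed for the risk ratio. The Cornfield limits for the odds ratio are not attempted here.

```python
import math

def ratio_measures(a, b, c, d, z=1.96):
    """Odds ratio, risk ratio, and a log-based 95% CI for the risk ratio.

    Cells: a = exposed ill, b = exposed well,
           c = unexposed ill, d = unexposed well.
    """
    odds_ratio = (a * d) / (b * c)
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    # Standard error of ln(RR) under the log (Katz) approximation (assumed method)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(risk_ratio) - z * se_log_rr)
    upper = math.exp(math.log(risk_ratio) + z * se_log_rr)
    return odds_ratio, risk_ratio, (lower, upper)

# Vanilla ice cream table: 43 ill / 11 well (ate), 3 ill / 18 well (did not eat)
odds_ratio, risk_ratio, (rr_lower, rr_upper) = ratio_measures(43, 11, 3, 18)
print(f"OR = {odds_ratio:.2f}, RR = {risk_ratio:.2f}, "
      f"95% CI for RR = {rr_lower:.2f} to {rr_upper:.2f}")
```

As the output itself notes, the risk ratio should be ignored if the data come from a case-control study.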


3

Tests of Significance

Objectives

Describe the reason for using statistical tests

Describe what a P-value is

Describe the two main influences on a P-value

for a two-by-two table

Properly interpret the results of chi-square

statistical tests

4

Tests of Significance

P < 0.05

P = NS


5

Tests of Significance

What is Statistical Testing?

Also called “hypothesis testing”

… the process of inferring from your data

whether an observed difference is likely to

represent chance variation or a real difference

(Does NOT address bias, confounding, or

investigator error!)

For two-by-two table data, influenced by:

Number of subjects or observations in study

Size of difference in results between groups

6

Tests of Significance

Data from a Hypothetical Cohort Study

             Dead  Alive  Total  % Dead
Diabetic        2      2      4   50.0%
Nondiabetic     1      3      4   25.0%
Total           3      5      8   37.5%


7

Tests of Significance

Data from a Hypothetical Cohort Study

             Dead  Alive  Total  % Dead
Diabetic       10     10     20   50.0%
Nondiabetic     5     15     20   25.0%
Total          15     25     40   37.5%

8

Tests of Significance

Data from a Hypothetical Cohort Study

             Dead  Alive  Total  % Dead
Diabetic       20     20     40   50.0%
Nondiabetic    10     30     40   25.0%
Total          30     50     80   37.5%


9

Tests of Significance

Data from a Hypothetical Cohort Study

             Dead  Alive  Total  % Dead
Diabetic      200    200    400   50.0%
Nondiabetic   100    300    400   25.0%
Total         300    500    800   37.5%

10

Tests of Significance

Steps in Statistical Testing

1. State the null and alternative hypotheses

2. Choose a statistical test for testing the null hypothesis

3. Specify a significance level

4. Perform the statistical test, i.e., calculate probability of obtaining data you got if null hypothesis were true

5. Make a decision about the hypotheses


11

Tests of Significance

Mindset for Statistical Testing

2 groups – diabetics vs. non-diabetics, cases

vs. controls, etc.

Each is a sample from some larger population

Are they likely to be samples from the same

population, or different populations?

12

Tests of Significance

RR = 1


13

Tests of Significance

[Figure: chart with axis label "Relative Risk" and RR = 1 marked]

14

Tests of Significance

1. State the Null and Alternative Hypotheses

Null hypothesis

H0: The observed difference is not real, i.e., the

observed difference is the result of chance

Alternative hypothesis

HA: H0 is not true, i.e., the observed difference is

not due to chance


15

Tests of Significance

Investigation: Gastroenteritis after a Wedding

16

Tests of Significance

1. State H0 and HA – Wedding Cake Study

Study 1: Wedding attendees

Attack rate, cake+ = 254 / 411 = 61.8%

Attack rate, cake− = 33 / 223 = 14.8%

H0: the attack rates in the two groups are the

same (RR=1)

HA: the attack rates in the two groups are not the

same (RR ≠ 1), or

HA: those who ate cake had higher attack rate (RR > 1)
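The attack rates and the risk ratio that these hypotheses refer to follow directly from the counts on this slide. A minimal Python sketch; the helper name `attack_rate` is hypothetical and only the four counts come from the slide.

```python
def attack_rate(cases, total):
    """Proportion of people in a group who became ill."""
    return cases / total

ar_cake = attack_rate(254, 411)      # ate wedding cake
ar_no_cake = attack_rate(33, 223)    # did not eat wedding cake
rr = ar_cake / ar_no_cake            # H0 says this ratio is 1 in the population

print(f"AR cake+ = {ar_cake:.1%}, AR cake- = {ar_no_cake:.1%}, RR = {rr:.2f}")
```

The statistical test then asks how likely a risk ratio this far from 1 would be if H0 (RR = 1) were true.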


17

Tests of Significance

2. Choosing a Statistical Test

Choice depends on:

study design

measurement scale of the variables

study size

Test for comparison of 2 means: Student t-test

Test for 2-by-2 table data: Chi-square test

18

Tests of Significance

Statistical Tests for a 2-by-2 Table

Fisher Exact Test

– use when any expected value < 5

Chi-square Test

– use when all expected values ≥ 5

– 4 variations

– Uncorrected

– Mantel-Haenszel uncorrected

– Yates corrected

– Mantel-Haenszel corrected


19

Tests of Significance

ILL

VANILLA | + - | Total

-----------+-------------+------

+ | 43 11 | 54

- | 3 18 | 21

-----------+-------------+------

Total | 46 29 | 75

Single Table Analysis

Odds ratio 23.45

Cornfield 95% confidence limits for OR 5.07 < OR < 125.19*

RISK RATIO(RR)(Outcome:ILL=+; Exposure:VANILLA=+) 5.57

95% confidence limits for RR 1.94 < RR < 16.03

Ignore risk ratio if case control study

Chi-Squares P-values

----------- --------

Uncorrected: XX.XX 0.XXXXXXXX

Mantel-Haenszel: XX.XX 0.XXXXXXXX

Yates corrected: XX.XX 0.XXXXXXXX

20

Tests of Significance

Chi-Square Test for Independence

Test Statistic (“Uncorrected”)

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

degrees of freedom = (rows − 1)(columns − 1)

Chi-square test determines whether the deviations between observed and expected are too large to be attributed to chance.


21

Tests of Significance

Oswego Data

Ate vanilla ice cream?   Ill  Well  Total     AR
Y                         43    11     54  79.6%
N                          3    18     21  14.3%
Total                     46    29     75

How many degrees of freedom? 1 d.f.

What is expected in a 2-by-2 table?

22

Tests of Significance

What’s Expected in a 2-by-2 Table?

Observed number in cell A = 43

Overall attack rate (AR) = 46 / 75 = 0.613

Expected AR (each group) under H0 = 0.613

N who ate vanilla ice cream = 54

Expected # cases among those who ate vanilla ice cream, under H0 = 54 × 0.613 = 33.1

So, Expected (a) = column total × (row total / table total)


23

Tests of Significance

What’s Expected in a 2-by-2 Table?

Ate vanilla ice cream?   Ill            Well           Total
Y                        H1 × V1 / T    H1 × V0 / T    H1
N                        H0 × V1 / T    H0 × V0 / T    H0
Total                    V1             V0             T

In general, expected value = row total × column total / table total
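The general rule (row total times column total, divided by table total) is easy to apply to every cell at once. A minimal Python sketch, using the Oswego vanilla ice cream counts; the function name `expected_counts` is hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under H0: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Rows: ate vanilla ice cream (Y, N); columns: Ill, Well
oswego = [[43, 11],
          [3, 18]]
expected = expected_counts(oswego)

for label, row in zip(("Y", "N"), expected):
    print(label, [round(value, 2) for value in row])
```

The expected counts keep the same row and column totals as the observed table; only the distribution of cases across the rows changes.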

24

Tests of Significance

Oswego: Observed vs. Expected

Observed Expected

Cell a 43

Cell b 11

Cell c 3

Cell d 18

Total 75


25

Tests of Significance

Oswego: Observed vs. Expected

         Observed  Expected
Cell a         43     33.12
Cell b         11     20.88
Cell c          3     12.88
Cell d         18      8.12
Total          75     75.00

26

Tests of Significance

Oswego: Observed vs. Expected

Observed       Ill   Well  Total  % Ill
Vanilla IC+     43     11     54  79.6%
Vanilla IC−      3     18     21  14.3%
Total           46     29     75  61.3%

Expected       Ill   Well  Total  % Ill
Vanilla IC+  33.12  20.88     54  61.3%
Vanilla IC−  12.88   8.12     21  61.3%
Total           46     29     75  61.3%


27

Tests of Significance

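Chi-Square Tests for 2-by-2 Tables

Uncorrected (Pearson) Chi-square Test

$$\chi^2 = \frac{t\,(ad - bc)^2}{H_1 H_0 V_1 V_0}$$

Mantel-Haenszel Chi-square Test

$$\chi^2_{MH} = \frac{(t - 1)(ad - bc)^2}{H_1 H_0 V_1 V_0}$$

Yates corrected Chi-square Test

$$\chi^2_{Yates} = \frac{t\left(|ad - bc| - \frac{t}{2}\right)^2}{H_1 H_0 V_1 V_0}$$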
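The three chi-square variants on this slide differ only in the numerator. Here a, b, c, d are the cell counts, H1 and H0 the row totals, V1 and V0 the column totals, and t the table total. A minimal Python sketch of the shortcut formulas, applied to the Oswego vanilla ice cream table; the function name `chi_square_2x2` and the dictionary keys are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected, Mantel-Haenszel, and Yates corrected chi-square for a 2-by-2 table."""
    t = a + b + c + d
    # Product of the two row totals and the two column totals
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    cross = a * d - b * c
    return {
        "uncorrected": t * cross ** 2 / denominator,
        "mantel_haenszel": (t - 1) * cross ** 2 / denominator,
        "yates": t * (abs(cross) - t / 2) ** 2 / denominator,
    }

oswego_chi = chi_square_2x2(43, 11, 3, 18)
for name, value in oswego_chi.items():
    print(f"{name}: {value:.2f}")
```

For this table the three values agree with the 27.22, 26.86, and 24.54 shown in the single-table analysis output.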

28

Tests of Significance

Example: Randomized Clinical Trial

          Cured  Not  Total  Cure (%)
New Tx        7    1      8     87.5%
Old Tx        2    5      7     28.6%
Total         9    6     15

Can we use chi-square? Calculate expected value for cell d.

7 x 6 / 15 = 42 / 15 = 2.8, which is less than 5, so use the Fisher exact test (FET).
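The slides do not show how the Fisher exact test is computed, so the following is a sketch under stated assumptions rather than the workshop's method. With all margins fixed, the count in cell a follows a hypergeometric distribution; the two-sided P-value here sums the probabilities of all tables that are no more probable than the observed one, which is one common convention. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for a 2-by-2 table with fixed margins."""
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def prob(x):
        # Hypergeometric probability that cell a equals x, given the margins
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Randomized clinical trial: New Tx 7 cured / 1 not, Old Tx 2 cured / 5 not
p_fisher = fisher_exact_two_sided(7, 1, 2, 5)
print(f"Two-sided Fisher exact P = {p_fisher:.3f}")
```

Other software may define the two-sided P-value differently (for example, by doubling the one-sided value), so small discrepancies between programs are expected.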


29

Tests of Significance

3. Specify a Level of Significance

Level of significance = an arbitrary cut-off, a small

probability, for deciding whether to declare the

null hypothesis untenable

Also called alpha level

Commonly, alpha set at 0.05 (5%) or 0.01 (1%)

30

Tests of Significance

4. Perform the Statistical Test, Compute P -value

Chi-square tests provide chi-square test statistic,

which must be converted to P-value (use

computer or look-up table)

P-value = probability of observing a difference as

great or greater than the observed difference, if

the null hypothesis were true

P-value influenced by:

– size of difference / strength of association

– size of the sample


31

Tests of Significance

Oswego: Observed vs. Expected

         Observed  Expected  (O − E)² / E
Cell a         43     33.12         2.947
Cell b         11     20.88         4.675
Cell c          3     12.88         7.579
Cell d         18      8.12        12.021
Total          75     75.00        27.222
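Each cell contributes (O − E)² / E to the chi-square statistic, and the statistic is the sum of the four contributions. A minimal Python sketch using the observed and expected values from this slide; the variable names are illustrative only.

```python
observed = [43, 11, 3, 18]                 # cells a, b, c, d
expected_values = [33.12, 20.88, 12.88, 8.12]

# Contribution of each cell: squared deviation scaled by the expected count
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected_values)]
chi_square_total = sum(contributions)

for cell, value in zip("abcd", contributions):
    print(f"Cell {cell}: {value:.3f}")
print(f"Chi-square = {chi_square_total:.3f}")
```

Cell d contributes the most because the same absolute deviation (9.88) is divided by the smallest expected count.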

32

Tests of Significance

Chi-Square Tests for 2-by-2 Tables

Uncorrected (Pearson) Chi-square Test

$$\chi^2 = \frac{t\,(ad - bc)^2}{H_1 H_0 V_1 V_0} = \frac{75\,(43 \times 18 - 11 \times 3)^2}{54 \times 21 \times 46 \times 29} = 27.222$$


33

Tests of Significance

Converting a χ² to a P-Value

To convert the χ² into a P-value by hand, use a special χ² table

The bigger the χ², the smaller the P-value

For data with 1 degree of freedom, i.e., data from a 2x2 table, the χ² value must be ≥ 3.84 to yield a P-value ≤ 0.05

Alternatively, let the computer do the conversion
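For 1 degree of freedom the conversion needs no table, because a chi-square variable with 1 d.f. is the square of a standard normal variable, so the upper-tail probability can be written with the complementary error function. A minimal Python sketch; the function name `p_value_1df` is hypothetical, and the shortcut applies only to 1 degree of freedom.

```python
import math

def p_value_1df(chi_square):
    """Upper-tail P-value for a chi-square statistic with 1 degree of freedom."""
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for a standard normal Z
    return math.erfc(math.sqrt(chi_square / 2))

p_at_threshold = p_value_1df(3.84)    # the 0.05 cut-off quoted above
p_oswego = p_value_1df(27.22)         # uncorrected chi-square for the Oswego table

print(f"chi-square 3.84  -> P = {p_at_threshold:.3f}")
print(f"chi-square 27.22 -> P = {p_oswego:.8f}")
```

Tables with more than one degree of freedom need the general chi-square distribution, which a statistics package provides.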

34

Tests of Significance

ILL

VANILLA | + - | Total

-----------+-------------+------

+ | 43 11 | 54

- | 3 18 | 21

-----------+-------------+------

Total | 46 29 | 75

Single Table Analysis

Odds ratio 23.45

Cornfield 95% confidence limits for OR 5.07 < OR < 125.19*

RISK RATIO(RR)(Outcome:ILL=+; Exposure:VANILLA=+) 5.57

95% confidence limits for RR 1.94 < RR < 16.03

Ignore risk ratio if case control study

Chi-Squares P-values

----------- --------

Uncorrected: 27.22 0.00000018 <---

Mantel-Haenszel: 26.86 0.00000022 <---

Yates corrected: 24.54 0.00000073 <---


35

Tests of Significance

36

Tests of Significance

5. Make Decision about Hypothesis

If computed P-value < alpha, reject H0, i.e.,

conclude that difference is unlikely to be due to

chance*

If computed P-value > alpha, do not reject H0,

i.e., conclude that difference could be due to

chance*

* You could be right or you could be wrong!


37

Tests of Significance

Two Types of Possible Errors

                                      In reality, H0 is…
Decision re: H0, based on our data    True                False
Accept                                OK                  Type II (β) error
Reject                                Type I (α) error    OK

Level of significance (α) = probability of making Type I error

1 – α = Confidence     1 – β = Power

38

Tests of Significance

What Influences a P-value?

Strength of association / size of

difference

Number of subjects (size of sample)


39

Tests of Significance

P- value and Strength of Association

      D+   D−  Total   AR    RR
E+    10   10     20   50%   2.0    χ² = 2.67
E−     5   15     20   25%          p = 0.10

      D+   D−  Total   AR    RR
E+    12    8     20   60%   2.4    χ² = 5.01
E−     5   15     20   25%          p = 0.03
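The effect of a stronger association can be checked by running the same test on both tables, which have the same number of subjects. A minimal Python sketch; the function name `pearson_chi_square_p` is hypothetical, and the P-value shortcut assumes 1 degree of freedom.

```python
import math

def pearson_chi_square_p(a, b, c, d):
    """Uncorrected chi-square and its 1-d.f. P-value for a 2-by-2 table."""
    t = a + b + c + d
    chi_square = t * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi_square / 2))  # valid for 1 degree of freedom only
    return chi_square, p_value

weaker = pearson_chi_square_p(10, 10, 5, 15)    # RR = 2.0
stronger = pearson_chi_square_p(12, 8, 5, 15)   # RR = 2.4

print(f"RR 2.0: chi-square = {weaker[0]:.2f}, P = {weaker[1]:.3f}")
print(f"RR 2.4: chi-square = {stronger[0]:.2f}, P = {stronger[1]:.3f}")
```

Moving just two exposed subjects from well to ill is enough to take the result across the conventional 0.05 threshold.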

40

Tests of Significance

P- value and Size of Study

      D+   D−  Total   AR    RR
E+    10   10     20   50%   2.0    χ² = 2.67
E−     5   15     20   25%          p = 0.10

      D+   D−  Total   AR    RR
E+    20   20     40   50%   2.0    χ² = 5.33
E−    10   30     40   25%          p = 0.02


41

Tests of Significance

Hypothetical Cohort Study

             Dead  Alive  Total  % Dead
Diabetic        2      2      4   50.0%    χ² = 0.53
Nondiabetic     1      3      4   25.0%    P = 0.47

Diabetic       10     10     20   50.0%    χ² = 2.67
Nondiabetic     5     15     20   25.0%    P = 0.10

Diabetic       20     20     40   50.0%    χ² = 5.33
Nondiabetic    10     30     40   25.0%    P = 0.02
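The same pattern can be reproduced by scaling one table while keeping the proportions fixed. A minimal Python sketch, assuming the uncorrected chi-square and the 1-d.f. shortcut for the P-value; the function name `chi_square_and_p` and the choice of multipliers are illustrative.

```python
import math

def chi_square_and_p(a, b, c, d):
    """Uncorrected chi-square and 1-d.f. P-value for a 2-by-2 table."""
    t = a + b + c + d
    chi_square = t * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi_square, math.erfc(math.sqrt(chi_square / 2))

base = (2, 2, 1, 3)   # diabetic dead/alive, nondiabetic dead/alive
results = {}
for multiplier in (1, 5, 10, 100):
    cells = tuple(multiplier * count for count in base)
    results[multiplier] = chi_square_and_p(*cells)
    chi_square, p_value = results[multiplier]
    print(f"n = {sum(cells):3d}: chi-square = {chi_square:6.2f}, P = {p_value:.2g}")
```

With the proportions held constant, the uncorrected chi-square grows in direct proportion to the study size, so the same 50% versus 25% difference moves from clearly non-significant to highly significant.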

42

Tests of Significance

Notes on Interpretation of Statistical Tests

Statistical testing does not address bias!

Statistical significance ≠ importance

“A difference, to be a difference, has to make a difference.” – Carl Tyler

Not significant ≠ no association

“Absence of evidence should not be taken as evidence of absence.”

– Sherlock Holmes

Statistical significance ≠ causation


43

Tests of Significance

Interpret the Findings (Studies 1–6)

Study P value Interpretation

1 0.007

2 0.03

3 0.08

4 0.65

5 0.0001

6 8 × 10⁻¹¹

44

Tests of Significance

Interpret the Findings (Studies 7–11)

Study P value Interpretation

7 0.060

8 0.052

9 0.048

10 0.00009

11 0.9


45

Tests of Significance

Objectives

Describe the reason for using statistical tests – To evaluate the role of chance as an explanation

of observed differences / associations

Describe what a P-value is – Probability of observing a difference as great as or greater than the observed difference, if H0 were true

Describe the two influences on a P-value – size of difference / strength of association

– size of the sample

Properly interpret the results of chi-square test – reject H0 if P < α, but use judgment!

46

Tests of Significance

Author, Acknowledgements, References

Author

Richard Dicker

Acknowledgement

Virgil Peavy


47

Tests of Significance

Baking the Cake Layers

48

Tests of Significance

Filling the Cakes


49

Tests of Significance

Filling the Cakes

Raspberry jam

Syrup

White

Chocolate

Mousse Filling

50

Tests of Significance

The Strawberry Filling


51

Tests of Significance

Strawberry Filling

Strawberries

– Fresh

– Washed in sink

– Sliced in the

bake room

– Hand-spread

onto cake filling

52

Tests of Significance

Cake Composition

Cake Layer 3

Cake Layer 1

Cake Layer 2

Baked Cake

Filling


53

Tests of Significance

Icing, Assembly, and Decorating

54

Tests of Significance

Attack Rates by Type of WCM Filling

Type of Filling       # Ate  Attack rate (eaters)  Attack rate (non-eaters)   RR  P-value
Strawberry              408                   62%                       15%  4.2   0.0001
Chocolate and Mocha      36                   53%                       45%  1.2   0.3
WCM only                  9                   44%                       45%  1.0   1.0