Biostatistics Unit 10 Categorical Data Analysis 1.

Post on 31-Mar-2015

Biostatistics

Unit 10

Categorical Data Analysis

1

Categorical Data Analysis

• Categorical data analysis deals with discrete data that can be organized into categories.

• The data are organized into a contingency table. The basic structure consists of two columns and two rows.

• The χ² distribution is used in categorical data analysis.

2

Basic Contingency Table Structure

• The basic structure of a 2×2 contingency table has two columns and two rows.

3

Structure of Contingency Tables

• Cells are labeled A through D.

• Columns and rows are added for labels.

4

Using the contingency table as a comparison table

• Comparison of outcomes in laboratory tests is studied using contingency tables.

5

Absolute and Relative Risk

• Each row of the table gives an absolute risk: the proportion of that group who get the disease.

• The ratio of these two proportions is the relative risk.
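In terms of the cell labels, the two absolute risks and their ratio can be written as below. The layout is an assumption, since the table image is not part of the transcript: row 1 holds the exposed group (A sick, B not sick) and row 2 the unexposed group (C sick, D not sick).

```latex
\[
AR_{\text{exposed}} = \frac{A}{A+B}, \qquad
AR_{\text{unexposed}} = \frac{C}{C+D}, \qquad
RR = \frac{A/(A+B)}{C/(C+D)}
\]
```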

6

Absolute Risk

7

Relative Risk

8

Example

A total of 452 children in elementary schools in Georgia and Florida were served burritos for lunch. Among these, 304 children reported eating the burritos. Among those who ate burritos, 145 reported getting sick from bacterial contamination.

There were also 148 children who did not eat burritos. Among these, 10 cases of illness were reported.

A case of disease was defined as gastrointestinal upset, fever and other symptoms. The CDC studied this event using categorical data analysis. They reported relative risk, significance and a confidence interval.

9

Contingency Table

• Data from the reports of the incident were entered into a contingency table.

10

Absolute risk—ate burritos

11

Absolute risk—did not eat burritos

12

Relative Risk

• Relative risk is the ratio of the two absolute risks.

• Conclusion: A child who ate burritos had 7.06 times the probability of getting sick as one who did not.
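The arithmetic behind the 7.06 can be sketched in a few lines of Python. The function name `relative_risk` and the cell layout are illustrative assumptions, and the sketch assumes that 145 of the 304 children who ate burritos became ill, which is the count that reproduces the reported value.

```python
def relative_risk(a, b, c, d):
    """Return (risk_exposed, risk_unexposed, relative_risk) for a 2x2 table.

    Assumed layout: a = exposed and sick, b = exposed and well,
    c = unexposed and sick, d = unexposed and well.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# Burrito example: 304 ate (145 assumed ill), 148 did not eat (10 ill).
ate, not_ate, rr = relative_risk(145, 304 - 145, 10, 148 - 10)
print(f"risk if ate: {ate:.3f}, risk if not: {not_ate:.3f}, RR: {rr:.2f}")
```

The two absolute risks come out near 0.477 and 0.068, and their ratio rounds to 7.06.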

13

Significance in relative risk

Significance in relative risk is found using the χ² distribution. The general formula is χ² = Σ (O − E)² / E, where O is an observed count and E is the corresponding expected count.

14

Significance in relative risk

In contingency table calculations, the values from the table are used to give a χ² value according to the formula below.
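As a cross-check on a calculator result, the χ² statistic can be computed directly from observed and expected counts. This is a sketch without a continuity correction. The function name is illustrative, the cell layout (exposed row first, sick column first) is an assumption, and the counts assume 145 ill among the 304 who ate burritos. For 1 degree of freedom the p-value equals erfc(√(χ²/2)), which the Python standard library provides.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value, 1 df."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = chi_square_2x2(145, 159, 10, 138)
print(f"chi-square = {chi2:.1f}, p = {p:.2g}")
```

Summing over cells follows the general (O − E)² / E form, so the same loop extends to larger tables once the degrees of freedom are adjusted.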

15

Find significance using the TI-83

A. Matrix setup

16

Find significance using the TI-83

B. Calculation results

Conclusion: With p this small, the result is highly significant.

17

CI for a Relative Risk Calculation

The confidence interval consists of the usual components of estimator, reliability coefficient and standard error. Standard error is found using the formula

18

CI for a Relative Risk Calculation

A logarithmic transformation is used because the sampling distribution of a relative risk is skewed to the right, while its natural logarithm is approximately normal. The antilog gives the boundaries of the confidence interval.

19

CI for a Relative Risk Calculation

20

CI for a Relative Risk Calculation

21

CI for a Relative Risk Calculation

• Take antilog to complete the calculation.

• Conclusion: The relative risk is 7.06. We are 95% confident that the true value lies between 3.575 and 13.93.
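The log-scale interval can be sketched as follows. The slide's standard-error formula is not reproduced in the transcript, so two candidates are shown, both of them assumptions. The sum of reciprocal cell counts comes close to the reported 3.575 to 13.93. The form more commonly given for ln(RR), 1/A − 1/(A+B) + 1/C − 1/(C+D), gives a somewhat narrower interval. The counts again assume 145 ill among the 304 who ate burritos.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96, cell_reciprocals=True):
    """CI for a relative risk, computed on the log scale (z=1.96 for 95%).

    Assumed layout: a, b = exposed sick/well; c, d = unexposed sick/well.
    cell_reciprocals=True uses var = 1/a + 1/b + 1/c + 1/d (an assumption
    that approximately reproduces the slide's interval); False uses the
    more common var = 1/a - 1/(a+b) + 1/c - 1/(c+d).
    """
    rr = (a / (a + b)) / (c / (c + d))
    if cell_reciprocals:
        variance = 1 / a + 1 / b + 1 / c + 1 / d
    else:
        variance = 1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)
    margin = z * math.sqrt(variance)
    log_rr = math.log(rr)
    # Antilog of the log-scale limits gives the interval for RR itself.
    return rr, math.exp(log_rr - margin), math.exp(log_rr + margin)


rr1, lo1, hi1 = rr_confidence_interval(145, 159, 10, 138)
rr2, lo2, hi2 = rr_confidence_interval(145, 159, 10, 138, cell_reciprocals=False)
print(f"cell reciprocals: {rr1:.2f} ({lo1:.2f}, {hi1:.2f})")
print(f"common ln(RR) SE: {rr2:.2f} ({lo2:.2f}, {hi2:.2f})")
```

The first call gives roughly 3.58 to 13.94 and the second roughly 3.84 to 12.99, so the choice of standard error matters at the second significant figure.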

22

Odds Ratio

• Odds are the ratio of two proportions: the probability of an outcome divided by the probability of its absence.

• The odds ratio is the ratio of two such odds, one for each group being compared.

• The odds ratio is generally calculated from data in a case-control study.

• The following gives the theoretical basis for the calculation of the odds ratio. The result reduces to a cross-product of the table cells.

23

Contingency Table

24

Odds ratio and the contingency table

• Among the exposed, the probability of getting sick (success) is P(E), and the probability of not getting sick (failure) is 1 – P(E).

• Among the unexposed, the probability of getting sick is P(E’), and the probability of not getting sick is 1 – P(E’).

25

Determining Odds Ratio

Odds of getting sick when exposed

Odds of getting sick when not exposed

26

Determining Odds Ratio

Odds ratio is the ratio of these two odds

The probability values are related to the cells in the contingency table.

27

Determining Odds Ratio

The final ratio of cells to find odds ratio

The odds ratio is therefore the cross-product ratio: AD divided by BC.
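Written out with the cell labels, the steps on the preceding slides reduce to the cross-product. The layout is an assumption: A and B are the sick and well counts among the exposed, and C and D are the corresponding counts among the unexposed.

```latex
\[
\text{odds}_{E} = \frac{P(E)}{1-P(E)} = \frac{A/(A+B)}{B/(A+B)} = \frac{A}{B},
\qquad
\text{odds}_{E'} = \frac{P(E')}{1-P(E')} = \frac{C/(C+D)}{D/(C+D)} = \frac{C}{D}
\]
\[
OR = \frac{A/B}{C/D} = \frac{AD}{BC}
\]
```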

28

Case study for odds ratio

The case-control study involved 52 children. Of the 13 children who ate the burritos, 8 got sick. Of the 39 children who did not eat the burritos, 6 reported symptoms of the illness. The odds ratio was calculated.

29

Odds Ratio Calculation

Conclusion: The odds ratio is 8.8
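The cross-product for the case study can be checked with a short Python sketch. The function name and the cell layout (A, B = ate and sick/well; C, D = did not eat and sick/well) are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio AD / BC for a 2x2 table."""
    return (a * d) / (b * c)


# Case study: 13 ate (8 sick, 5 well); 39 did not eat (6 sick, 33 well).
case_study_or = odds_ratio(8, 13 - 8, 6, 39 - 6)
print(case_study_or)
```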

30

Find significance using the TI-83

A. Matrix setup

31

Find significance using the TI-83

B. Calculation results

Conclusion: p ≈ .001, so the association is statistically significant.

32

CI for an Odds Ratio Calculation

Calculation for SE after logarithmic transformation

33

CI for an Odds Ratio Calculation

34

CI for an Odds Ratio Calculation

Take antilog to complete the calculation.

Conclusion: The odds ratio is 8.8. We are 95% confident that the true value lies between 2.14 and 36.3.
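The interval can be reproduced with a short sketch. The standard-error formula on the slide is not part of the transcript, so the code assumes the usual form for ln(OR), the square root of 1/A + 1/B + 1/C + 1/D, which matches the reported limits. The function name is illustrative.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """CI for an odds ratio on the log scale (z=1.96 for 95%).

    Assumes SE of ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    # Build the interval on the log scale, then take the antilog.
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Case study: 8 sick and 5 well among eaters, 6 sick and 33 well among non-eaters.
or_value, lower, upper = or_confidence_interval(8, 5, 6, 33)
print(f"OR = {or_value:.1f}, 95% CI = ({lower:.2f}, {upper:.1f})")
```

The wide interval reflects the small cell counts, since each reciprocal term grows as its cell shrinks.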

35

fin

36