Epid 600 Class 5 Cohort Studies
EPID 600; Class 5 Cohort studies
University of Michigan School of Public Health
Drug Abuse: A workshop on behavioral and economic research
October 18-20, 2004
1
Three key dimensions to epidemiologic studies
Measures of association
  Relative measures (relative risks, rates, and odds)
  Absolute measures (risk and rate differences)
Study design
  Observational
    Cohort
    Case-control
    Cross-sectional
  Experimental
    Randomized trial
    Field trials
    Group randomized trials
Units of analysis
  Individual
  Group
2
The world
persons “exposed” persons “unexposed”
4
The cohort study
persons “exposed” persons “unexposed”
5
The cohort study
persons “exposed” with disease persons “unexposed” with disease
6
What is a cohort?
1. A place to play tennis
2. The tenth part of a Roman legion
3. A population that is surveyed at a given moment in time
4. People born a hundred years apart
5. Equivalent to a trohoc
7
Why epidemiologic studies?
To determine whether there is an association between “exposure” and “outcome”
Examples
Does aspirin prevent myocardial infarctions?
Is eating carrots associated with increased risk of skin cancer?
8
Types of cohort studies
[Figure: timeline marking the investigator's position (i) for concurrent (prospective), mixed, and non-concurrent (retrospective) cohorts]
It is where the INVESTIGATOR sits that determines the type of cohort
9
Prospective vs. retrospective
In prospective cohort studies, exposure and non-exposure are ascertained in the present, and the study groups are then followed over time to measure disease
In retrospective cohort studies, exposure and non-exposure are ascertained in the past
In “mixed” cohort studies, we have a bit of each approach
11
Retrospective vs. prospective
Advantages of retrospective studies
  Less expensive
  Less time consuming
  Efficient for study of diseases with long latency periods
Disadvantages of retrospective studies
  Introduces possible error through recall of past information and the challenges of collecting data retrospectively; this is primarily information bias (to be covered in a future lecture)
12
Fixed vs. open cohorts
Fixed cohort
  Has fixed membership
  Once the group is defined and follow-up begins, no one is added
Open cohort
  Also called a “dynamic cohort”
  Can take on new members during the study period
13
Fixed cohort
cohort is set; no new participants
size of cohort gets smaller over time
there may be withdrawals from cohort
15
Closed fixed cohort
Fixed cohort
cohort is set; no new participants
there are no withdrawals; all persons are followed until the end of the follow-up period or until they get disease
16
Open (dynamic) cohort
new cohort members
cohort may be replenished; size of cohort does not necessarily shrink
18
Rothman KJ. Epidemiology: An Introduction. Oxford, 2002.
19
Cohort studies
[Figure: exposed and unexposed groups, each followed to disease (D+) or no disease (D-)]
20
Cohort studies
[Figure: exposed and unexposed groups, each followed to D+ or D-, mapped to a 2x2 table]
              Disease   No disease
Exposed       a         b
Not Exposed   c         d
21
Advantages of cohort studies
Maintains temporal sequence, i.e., assesses exposure before outcome
Good for assessing rare exposures and rapidly fatal diseases
Can study multiple diseases/outcomes from a given exposure
Can calculate incidence among exposed and unexposed
Minimizes error in ascertainment of exposure (at least if prospective)
Provides complete description of experience subsequent to exposure, including rate of progression and natural history of disease
26
Disadvantages of cohort studies
Expensive
Inefficient for rare diseases
Potentially long duration for follow-up
Secular trends (in technology, behaviors, and other changes) may influence behavior and study characteristics over time
27
An example
Population: A cadmium factory in South Dakota
Exposure: Exposure to cadmium production (which involves the gaseous decomposition of cadmium compounds); exposure assessed by information on jobs at high risk of exposure between 1950 and 1970
Outcome: respiratory cancers, mostly lung and nasal cancer
28
What kind of study is this?
Exposure interval, 1950-1970
You (the investigator) are here
End of follow-up, 2000
29
Therefore...
This study aims to identify association between cadmium exposure and subsequent development of respiratory cancer
Who is at risk of respiratory cancer?
  Persons with lungs and noses
  Persons who do not already have respiratory cancer at baseline
30
What did we find?
[Figure: cohort flow diagram]
Exposed (n = 250): D+ 100, D- 150
Unexposed (n = 450): D+ 90, D- 360
32
2x2 table
              Cancer   No cancer   Total
Exposed       100      150         250
Not Exposed   90       360         450
Total         190      510         700
33
How many of you eat breakfast?
A cross-sectional association between skipping breakfast and obesity has been shown in adults. Is it that people with low SES skip breakfast so tend to be obese, or is it that skipping breakfast in itself makes people obese?
A research team in Britain decided to investigate the association between proportion of calories consumed at breakfast and weight gain.
Purslow et al. Energy intake at breakfast and weight change: prospective study of 6,764 middle-aged men and women. Am J Epid. 2007; 167(2):188-192.
34
Breakfast study: cohort mechanics
STEP 1: Identify a group of people of interest.
Participants recruited for the European Prospective Investigation into Cancer and Nutrition (Norfolk cohort); first (baseline) measurement on nutrition and weight
STEP 2: Follow them through time and monitor outcome of interest
During 1998-2000, a second measurement on nutrition and weight; include only those who did not report stroke, cancer, or heart attack at baseline
STEP 3: Classify participants according to outcome across exposure categories.
Classify people as breakfast eaters (exposed cohort) or non-eaters (unexposed cohort); assess weight change in both cohorts
35
Breakfast study: Find people and follow them through time
1) 1993-1997: 1st (baseline) measurement on nutrition and weight
2) 1998-2000: 2nd measurement on nutrition and weight; include only those who did not report stroke, cancer, or heart attack at baseline
3) 2007: Analysis; assess weight change in both cohorts
36
Breakfast study: set up
What is the “exposure”?
> Breakfast consumption
How do the investigators measure the exposure?
> They determine the percentage of total energy intake consumed at breakfast
What is the outcome of interest?
> Weight gain
How do the investigators measure the outcome?
> Change in weight in kg from time 1 to time 2
37
Breakfast study: findings
Calorie distribution                      Avg weight gain in the follow-up
Big breakfast (22-50% of total intake)    0.79 kg
Small breakfast (0-11% of total intake)   1.23 kg
Everyone gained weight over time...
But larger breakfasts are associated with lesser weight gain among the middle-aged participants of this study
This association is independent of the quantity of calories consumed in the day, social class, physical activity level, fruit and vegetable intake, fat/carb/protein intake, smoking, and BMI
10 yrs x 1 lb/year = 10 lbs of potentially avoidable weight gain!
38
So, what measures of disease occurrence and what measures of association can we calculate?
Risk (incidence proportion)
Odds
Prevalence at any point during the cohort (but making some assumptions about duration of disease)
Incidence rate?
And...
Risk ratio
Risk difference
Odds ratio (relative odds)
Incidence rate difference?
39
Relative risk (risk ratio)
The ratio of risks for two populations
$$RR = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}}$$
$$RR = \frac{100/250}{90/450} = \frac{100 \times 450}{250 \times 90} = 2.0$$
So, the risk of developing respiratory cancer among South Dakota miners exposed to cadmium was 2.0 times as high as among miners not exposed to cadmium in a cohort study with a follow-up period of 30-50 years
40
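As a minimal sketch of the risk ratio arithmetic for the cadmium example, the Python below computes the risk in each group and then their ratio. The function names `risk` and `risk_ratio` are illustrative choices, not something taken from the lecture.

```python
def risk(cases: int, total: int) -> float:
    """Incidence proportion: new cases divided by persons at risk at baseline."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def risk_ratio(cases_exp: int, total_exp: int,
               cases_unexp: int, total_unexp: int) -> float:
    """Risk among the exposed divided by risk among the unexposed."""
    return risk(cases_exp, total_exp) / risk(cases_unexp, total_unexp)


# Counts from the cadmium 2x2 table: 100 of 250 exposed, 90 of 450 unexposed.
rr = risk_ratio(100, 250, 90, 450)
print(f"RR = {rr:.1f}")
```

Keeping `risk` as its own function makes it easy to report the two group risks (0.4 and 0.2) alongside the ratio.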
Risk difference
The additional risk among those exposed when compared to those unexposed
$$RD = R_{\text{exposed}} - R_{\text{unexposed}}$$
$$RD = \frac{100}{250} - \frac{90}{450} = \frac{(100 \times 450) - (90 \times 250)}{250 \times 450} = 0.2$$
So, the difference in the risk of developing respiratory cancer among South Dakota miners exposed to cadmium compared to miners not exposed to cadmium was 0.2 (that is, 20 additional cases per 100 persons) in a cohort study with a follow-up period of 30-50 years
41
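The common-denominator step in the risk difference can be checked with exact fractions. This is only a sketch using Python's standard `fractions` module; the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Risks from the cadmium 2x2 table, kept as exact fractions.
risk_exposed = Fraction(100, 250)     # reduces to 2/5
risk_unexposed = Fraction(90, 450)    # reduces to 1/5

# Risk difference: the absolute excess risk among the exposed.
risk_difference = risk_exposed - risk_unexposed

print(risk_difference)          # exact fraction
print(float(risk_difference))   # decimal form, as on the slide
```

Exact fractions avoid any rounding while working by hand through the subtraction; converting to a float at the end gives the 0.2 quoted above.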
Can we calculate incidence rate ratio from this information?
We could, if we made some assumptions
  Length of follow-up same for everyone
  No competing risks
  No loss to follow-up
Assume then that average follow-up was 40 years; i.e., we are assuming that everyone was followed for 40 years
$$PYO_{\text{exposed}} = 250 \times 40 = 10{,}000$$
$$PYO_{\text{unexposed}} = 450 \times 40 = 18{,}000$$
42
Relative rate (rate ratio)
The ratio of rates for two populations
$$IRR = \frac{IR_{\text{exposed}}}{IR_{\text{unexposed}}}$$
$$IRR = \frac{100/10{,}000}{90/18{,}000} = \frac{100 \times 18{,}000}{10{,}000 \times 90} = 2.0$$
So, the rate of developing respiratory cancer among South Dakota miners exposed to cadmium was 2.0 times as high as among miners not exposed to cadmium in a cohort study with a follow-up period of 30-50 years
And, the relative rate and relative risk are the same assuming that time of follow-up is identical; of course this assumption is only valid if nothing else is competing as a risk, follow-up is complete, and this is a closed cohort
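To see why the rate ratio matches the risk ratio under the equal follow-up assumption, here is a small sketch. The constant `ASSUMED_YEARS` and the helper `incidence_rate` are names introduced for illustration; the 40-year figure is the lecture's simplifying assumption, not observed follow-up.

```python
import math

ASSUMED_YEARS = 40  # lecture assumption: everyone is followed for 40 years


def incidence_rate(cases: int, person_years: float) -> float:
    """Cases per person-year of observation."""
    return cases / person_years


# Person-years when every cohort member contributes the same follow-up time.
pyo_exposed = 250 * ASSUMED_YEARS      # 10,000
pyo_unexposed = 450 * ASSUMED_YEARS    # 18,000

rate_ratio = incidence_rate(100, pyo_exposed) / incidence_rate(90, pyo_unexposed)
risk_ratio = (100 / 250) / (90 / 450)

# With identical follow-up the time term cancels, so the two ratios agree.
print(f"IRR = {rate_ratio:.1f}, RR = {risk_ratio:.1f}")
print(math.isclose(rate_ratio, risk_ratio))
```

Changing `ASSUMED_YEARS` to any other positive value leaves the ratio unchanged, which is the cancellation the slide relies on.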
43
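Relative odds (odds ratio)
$$\text{Odds}_{\text{exposed}} = \frac{p_{\text{exposed}}}{1 - p_{\text{exposed}}} = \frac{100/250}{1 - 100/250} = \frac{100/250}{150/250} = \frac{100}{150}$$
$$\text{Odds}_{\text{unexposed}} = \frac{p_{\text{unexposed}}}{1 - p_{\text{unexposed}}} = \frac{90/450}{1 - 90/450} = \frac{90/450}{360/450} = \frac{90}{360}$$
$$OR = \frac{\text{Odds}_{\text{exposed}}}{\text{Odds}_{\text{unexposed}}} = \frac{100/150}{90/360} = \frac{100 \times 360}{150 \times 90} = 2.67$$
              Cancer   No cancer   Total
Exposed       100      150         250
Not Exposed   90       360         450
Total         190      510         700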
44
Relative odds (odds ratio)
Therefore
$$\text{Odds Ratio} = \frac{a \times d}{b \times c} = \frac{100 \times 360}{150 \times 90} = 2.67$$
              Cancer    No cancer   Total
Exposed       100 (a)   150 (b)     250
Not Exposed   90 (c)    360 (d)     450
Total         190       510         700
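The two routes to the odds ratio (odds in each group, then their ratio, versus the cross-product of the table cells) can be compared directly. This is a sketch only; `odds` and the variable names are illustrative.

```python
import math


def odds(p: float) -> float:
    """Convert a probability (risk) into odds."""
    return p / (1 - p)


# Cells of the cadmium 2x2 table.
a, b = 100, 150   # exposed: cancer, no cancer
c, d = 90, 360    # not exposed: cancer, no cancer

# Long route: odds of disease in each group, then their ratio.
or_from_odds = odds(a / (a + b)) / odds(c / (c + d))

# Shortcut: cross-product ratio of the table cells.
or_cross_product = (a * d) / (b * c)

print(f"{or_from_odds:.2f} {or_cross_product:.2f}")
print(math.isclose(or_from_odds, or_cross_product))
```

The row totals cancel in the long route, which is why only the four cells appear in the shortcut.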
47
Interpretation of odds ratio
The odds of respiratory cancer are 2.67 times as high among those exposed to cadmium as among those unexposed
Note that RR < OR...remember why?
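One way to explore this question: when the risk ratio is above 1, the odds ratio lies farther from 1 than the risk ratio, and the gap shrinks as the disease becomes rare. The sketch below compares the lecture's table with a second table whose counts are hypothetical, invented here only to show a rare disease with the same risk ratio.

```python
def rr_and_or(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (risk ratio, odds ratio) for a 2x2 table with cells a, b, c, d."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Cadmium example from the lecture: disease is common (risks 0.4 and 0.2).
common = rr_and_or(100, 150, 90, 360)

# Hypothetical table: same risk ratio, rare disease (risks 0.004 and 0.002).
rare = rr_and_or(10, 2490, 9, 4491)

print(f"common disease: RR = {common[0]:.2f}, OR = {common[1]:.2f}")
print(f"rare disease:   RR = {rare[0]:.2f}, OR = {rare[1]:.2f}")
```

In the rare-disease table the non-cases (b and d) are almost the same as the row totals, so the odds and the risks nearly coincide.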
48
Reality...
[Figure: timeline from the exposure interval (1950-1970) to the end of follow-up (2000), showing individual follow-up for persons p1, p2, p3, p4, etc.]
49
So we can take into account PYO
              Cancer   PYO
Exposed       100      20,000
Not Exposed   90       30,000
Total         190      50,000
That is, by taking into consideration the actual time of follow-up
$$\text{Relative rate} = \frac{100/20{,}000}{90/30{,}000} = \frac{100 \times 30{,}000}{20{,}000 \times 90} = 1.67$$
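A short sketch of how person-years accumulate when follow-up differs by person, followed by the relative rate from the table totals. The four follow-up times are hypothetical values made up for illustration; only the totals (20,000 and 30,000 PYO) come from the lecture.

```python
def person_years(follow_up_years: list[float]) -> float:
    """Total person-years: each person contributes their own follow-up time."""
    return sum(follow_up_years)


def rate_ratio(cases_exp: int, pyo_exp: float,
               cases_unexp: int, pyo_unexp: float) -> float:
    """Incidence rate among the exposed divided by the rate among the unexposed."""
    return (cases_exp / pyo_exp) / (cases_unexp / pyo_unexp)


# Hypothetical follow-up times, in years, for four people (p1 to p4).
print(person_years([50, 32.5, 12, 41]))

# Totals from the lecture table.
relative_rate = rate_ratio(100, 20_000, 90, 30_000)
print(f"Relative rate = {relative_rate:.2f}")
```

Summing individual follow-up times is what replaces the "everyone followed for 40 years" shortcut once actual follow-up is known.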
50
But of course we can do better
Without knowing the actual PYO, we cannot know how well the RR estimates the IRR
51
But of course we can do better
If PYOs were different, would it be possible that cadmium is actually protective?
52
Rate difference
The additional incidence rate comparing those exposed vs. those unexposed
$$IRD = IR_{\text{exposed}} - IR_{\text{unexposed}}$$
$$IRD = \frac{100}{20{,}000} - \frac{90}{30{,}000} = 0.002, \text{ or } \frac{2}{1000 \text{ person-years}}$$
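Rates are usually reported per some convenient number of person-years. The sketch below computes the rate difference and rescales it; the `per` parameter is an illustrative convenience, not notation from the lecture.

```python
def rate_difference(cases_exp: int, pyo_exp: float,
                    cases_unexp: int, pyo_unexp: float,
                    per: float = 1.0) -> float:
    """Absolute difference in incidence rates, scaled to `per` person-years."""
    return (cases_exp / pyo_exp - cases_unexp / pyo_unexp) * per


# Totals from the person-years table: 100 / 20,000 vs. 90 / 30,000.
ird = rate_difference(100, 20_000, 90, 30_000)
ird_per_1000 = rate_difference(100, 20_000, 90, 30_000, per=1000)

print(f"IRD = {ird:.3f} per person-year")
print(f"IRD = {ird_per_1000:.1f} per 1,000 person-years")
```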
53
Attributable fraction among exposed
Proportion of the disease burden among exposed people that is due to the exposure
$$AF_{\text{exposed}} = \frac{R_{\text{exposed}} - R_{\text{unexposed}}}{R_{\text{exposed}}}$$
$$AF_{\text{exposed}} = \frac{100/250 - 90/450}{100/250} = \frac{0.2}{0.4} = 0.5$$
So, we say that 50% of disease among exposed is attributable to exposure; or, if we removed all the exposure, we might expect to reduce disease by 50% among exposed
54
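The attributable fraction among the exposed can also be written in terms of the risk ratio, as (RR - 1) / RR. The sketch below checks that both forms give the same answer for the cadmium data; the variable names are illustrative.

```python
import math

risk_exposed = 100 / 250
risk_unexposed = 90 / 450
risk_ratio = risk_exposed / risk_unexposed

# Definition: excess risk among the exposed as a share of their total risk.
af_exposed = (risk_exposed - risk_unexposed) / risk_exposed

# Algebraically equivalent form in terms of the risk ratio.
af_from_rr = (risk_ratio - 1) / risk_ratio

print(f"AF among exposed = {af_exposed:.2f}")
print(math.isclose(af_exposed, af_from_rr))
```

The second form is handy when only a published risk ratio is available rather than the underlying counts.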
Attributable fraction in population
Proportion of the disease burden among the whole population that is due to the exposure
$$AF_{\text{population}} = \frac{R_{\text{population}} - R_{\text{unexposed}}}{R_{\text{population}}}$$
$$AF_{\text{population}} = \frac{190/700 - 90/450}{190/700} = \frac{0.071}{0.271} = 0.26$$
55
Or...
$$AF_{\text{population}} = \frac{p \times (RR - 1)}{p \times (RR - 1) + 1}$$
where p is the prevalence of exposure
$$AF_{\text{population}} \cong \frac{\frac{250}{700} \times (2.0 - 1)}{\frac{250}{700} \times (2.0 - 1) + 1} = \frac{0.36}{1.36} = 0.26$$
So, if exposure is removed, we would expect that disease would be reduced by ~26% in the whole population
Attributable fraction in population is always lower than attributable fraction among exposed, as long as exposure is associated with more disease. Why?
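The two expressions for the population attributable fraction can be checked against each other with the cadmium counts. This is a sketch under the assumption that the 700 cohort members stand in for the whole population; the variable names are illustrative.

```python
cases_exp, n_exp = 100, 250
cases_unexp, n_unexp = 90, 450

risk_population = (cases_exp + cases_unexp) / (n_exp + n_unexp)   # 190 / 700
risk_unexposed = cases_unexp / n_unexp                            # 90 / 450
risk_ratio = (cases_exp / n_exp) / risk_unexposed
p_exposed = n_exp / (n_exp + n_unexp)   # prevalence of exposure, 250 / 700

# Form 1: excess risk in the population as a share of population risk.
af_from_risks = (risk_population - risk_unexposed) / risk_population

# Form 2: prevalence of exposure combined with the risk ratio.
af_from_prevalence = (p_exposed * (risk_ratio - 1)) / (p_exposed * (risk_ratio - 1) + 1)

print(f"{af_from_risks:.2f} {af_from_prevalence:.2f}")
```

Lowering `p_exposed` while holding the risk ratio fixed shrinks the second form toward zero, which shows how the population measure depends on how common the exposure is.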
56
Summary
Cohort studies follow participants forward in time
Cohort studies allow us to calculate all cardinal measures of association
Key limitations to cohort studies are unsuitability for rare diseases and the fact that they take a long time (and are expensive!) to implement
57
Aside...cross-sectional studies
Prevalence studies
“Snapshot” look at exposure and outcome at a point in time
Only provide relative odds; do not allow us to calculate either risk or rates, so no relative risks or relative rates
Potential over-representation of diseases with long duration (P = I × D, i.e., prevalence = incidence × duration)
58