Post on 28-Mar-2015
SADC Course in Statistics
Basic summaries for epidemiological studies
(Session 04)
2To put your footer here go to View > Header and Footer
Learning Objectives
At the end of this session, you will be able to
• correctly distinguish and use ideas of prevalence and incidence
• explain the concepts of risk in relation to health outcomes, and of what may be “causal” factors
• use the concepts of relative risk and odds ratio in relation to simple epidemiological studies
Attribute Data
An attribute is an ascertainable characteristic either present or absent in an individual, so that the “measurement” on an individual can be represented as either 1 or 0.
Many measures in epidemiology are of this type e.g. a test for HIV seropositivity yields such a 0/1 response. This may still involve expert interpretation & judgment, with possibility of false positives and false negatives.
Point Prevalence
Prevalence concerns the number of instances of an attribute in the population, usually at a point in time, relative to the number at risk, i.e. expressed as a proportion, a percentage, per 1000 or even per million where positives are rare. So point prevalence (as a percentage) is
(No. of individuals with + attribute at time point / No. of individuals in population at risk at time point) x 100
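As a minimal sketch of the point prevalence calculation, the function below divides the number of positives by the number at risk and scales the result. The function name `point_prevalence`, its `per` parameter and the counts used are illustrative assumptions, not figures from the course.

```python
def point_prevalence(cases: int, at_risk: int, per: int = 100) -> float:
    """Cases with the attribute divided by the population at risk, scaled by `per`."""
    if at_risk <= 0:
        raise ValueError("population at risk must be positive")
    if not 0 <= cases <= at_risk:
        raise ValueError("cases must lie between 0 and the population at risk")
    return per * cases / at_risk


# Hypothetical counts, for illustration only: 150 positives among 12,000 at risk.
print(point_prevalence(150, 12_000))            # as a percentage: 1.25
print(point_prevalence(150, 12_000, per=1000))  # per 1000: 12.5
```

Changing `per` only changes the scale of reporting (percent, per 1000, per million), not the underlying proportion.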
Period Prevalence
This refers to the number of cases known to have been prevalent during a period, e.g. a year.
The numerator above would be replaced by the sum of (1) the no. of prevalent cases at the start of the period, and (2) the no. of new cases arising during the period.
The denominator is then usually a mid-period (e.g. mid-year) figure for the population at risk.
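The same idea for period prevalence can be sketched as follows. The only change from the point version is the numerator, which adds the new cases to those already prevalent at the start. The function name and the counts are assumptions made for illustration.

```python
def period_prevalence(cases_at_start: int, new_cases: int,
                      mid_period_at_risk: int, per: int = 100) -> float:
    """(Prevalent cases at start + new cases in period) / mid-period population at risk."""
    if mid_period_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return per * (cases_at_start + new_cases) / mid_period_at_risk


# Hypothetical year: 40 cases at the start, 260 new cases, 10,000 at risk mid-year.
print(period_prevalence(40, 260, 10_000))  # 3.0 (percent)
```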
Prevalence: notes
• Occasionally “prevalence” is used for absolute number of cases/instances – best not to call this “prevalence”!
• Both point and period prevalence are snapshot figures. They are NOT rates.
• Period prevalence sensible for short-duration condition where numbers can rise/fall fast.
• No. “at risk” needs thought e.g. males only for prostate conditions.
• Prevalences can be age-specific.
Incidence
Incidence (always a rate ~ a flow statistic) as a population measure is normally on a yearly rate basis. As a proportion:-
(No. of new cases arising in a period of 1 yr.) / (Mid-yr. population at risk)
As with prevalence, often put as %, ‰ etc
• Watch out for non-experts confusing or misusing the terms prevalence & incidence!
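A short sketch of the yearly incidence calculation, reported per 1000 (‰). Unlike prevalence, only new cases enter the numerator. The function name and the counts are hypothetical.

```python
def incidence_rate(new_cases_in_year: int, mid_year_at_risk: int, per: int = 1000) -> float:
    """New cases arising in one year divided by the mid-year population at risk."""
    if mid_year_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return per * new_cases_in_year / mid_year_at_risk


# Hypothetical year: 260 new cases, 10,000 at risk at mid-year.
print(incidence_rate(260, 10_000))  # 26.0 per 1000 per year
```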
Relationship of prevalence & incidence
When prevalence P is relatively small and condition is of limited duration (say averaging time T) and population is in a “steady state”, then approximately:-
P = I x T
where I = incidence.
Exercise ~ try to express in words a rough justification for the above expression.
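To get a feel for the sizes involved, the approximation P = I x T can be evaluated numerically. This is only a sketch: the function name and the input values are assumptions, and it does not replace the verbal justification asked for in the exercise.

```python
def steady_state_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Approximate prevalence (as a proportion) from P = I x T."""
    return incidence_per_year * mean_duration_years


# Hypothetical inputs: 26 new cases per 1000 per year, average duration 3 months.
p = steady_state_prevalence(0.026, 0.25)
print(f"approximate prevalence: {1000 * p:.1f} per 1000")
```

Note that I and T must use the same time unit (here years) for the product to be a proportion.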
Probability, risk or cumulative incidence
Sometimes a study population is relatively small, or a sample can be followed up. Then we can calculate “risk” or cumulative incidence as:-
(No. of new cases arising in one year) / (No. of healthy individuals in population at start of year)
This is then an estimated probability; note the mortality rate of session 14 is an example of this.
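The risk calculation differs from the incidence rate in its denominator: it uses the individuals who were healthy at the start of the year, not a mid-year population figure. A minimal sketch, with a hypothetical function name and hypothetical follow-up counts:

```python
def cumulative_incidence(new_cases: int, healthy_at_start: int) -> float:
    """Estimated probability of becoming a case during the follow-up period."""
    if healthy_at_start <= 0:
        raise ValueError("need at least one healthy individual at the start")
    if not 0 <= new_cases <= healthy_at_start:
        raise ValueError("new cases cannot exceed those healthy at the start")
    return new_cases / healthy_at_start


# Hypothetical follow-up: 12 new cases among 400 individuals healthy at the start.
risk = cumulative_incidence(12, 400)
print(f"estimated one-year risk: {risk:.3f}")
```

Because the result is a proportion of a fixed starting group, it always lies between 0 and 1 and can be read as an estimated probability.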
Sources of risk: 1
Much of epidemiology concerns “risk factors” that may be “causes” of the disease.
There are logical difficulties in proving causation, & often a complex set of pre-disposing and influencing factors.
In simplest case, consider just one risk factor e.g. cigarette smoking, and reduce the risk factor ~ as well as disease attribute ~ to present/absent.
Discuss what might be a more realistic model!
Sources of risk: 2
With one Yes/No attribute and one “present/absent” risk factor a 2x2 table of frequencies could be:-
                      Diseased   Not diseased   Total
Risk factor present   a          b              a + b
Risk factor absent    c          d              c + d
Total                 a + c      b + d          a + b + c + d = n
Cohort study
This involves selecting, & following through a period of time, individuals some with risk factor present, some absent. Outcome observation = no. with disease at endpoint.
In a general population cohort study only n is fixed. If general exposure to the risk factor is low, (a + b) will be small relative to n, which makes the study costly. So where possible (a + b) and (c + d) are often fixed by design, e.g. selected to be equal sample sizes.
Cohort study relative risk: 1
With observed frequencies a, b, c, d as above the disease risk (over the study duration) among:-
the risk-factor + group is: a/(a + b)
the risk-factor – group is: c/(c + d)
The relative risk is the ratio of these two risks:-
RR = [a/(a + b)] / [c/(c + d)] = a.(c + d) / [(a + b).c]
Cohort study relative risk: 2
The relative risk is the ratio of these two risks:-
RR = a.(c + d) / [(a + b).c]
Often disease rates are relatively low, so a/(a + b) ≈ a/b ; c/(c + d) ≈ c/d and then
RR ≈ (a.d)/(b.c), described as the “odds ratio” or “approximate relative risk”, with a/b being the odds of getting the disease having the exposure, and c/d the odds not having the exposure.
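The quality of the approximation RR ≈ (a.d)/(b.c) depends on how rare the disease is. The sketch below uses three hypothetical 2x2 tables (invented for illustration, not taken from any study), each with 1000 exposed and 1000 unexposed individuals and a true relative risk of 2, and shows the odds ratio moving towards the relative risk as the disease becomes rarer.

```python
def relative_risk(a, b, c, d):
    """RR = [a/(a + b)] / [c/(c + d)], with a, b, c, d laid out as in the 2x2 table."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (a.d)/(b.c)."""
    return (a * d) / (b * c)


# Hypothetical tables (a, b, c, d); the risk in the exposed group is always
# twice the risk in the unexposed group, but the disease gets rarer each time.
tables = [(400, 600, 200, 800), (40, 960, 20, 980), (4, 996, 2, 998)]
for a, b, c, d in tables:
    print(f"risk in unexposed = {c / (c + d):.3f}  "
          f"RR = {relative_risk(a, b, c, d):.3f}  "
          f"OR = {odds_ratio(a, b, c, d):.3f}")
```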
Cohort study relative risk: 3
Example ~ population of miners
RR = (58/430)/(27/370) = 1.85
Odds ratio = (58/372)/(27/343) = 1.98
Both give similar representations of the extra risk due to occupational asbestos exposure.
Asbestos       Lung cancer +   No LC i.e. –   Total
Exposed        58              372            430
Not exposed    27              343            370
Total          85              715            800
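The two figures for the miners can be checked with a few lines of code. This is a sketch only: the class name `TwoByTwo` and its method names are illustrative choices, while the four counts are those in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # risk factor present, diseased
    b: int  # risk factor present, not diseased
    c: int  # risk factor absent, diseased
    d: int  # risk factor absent, not diseased

    def relative_risk(self) -> float:
        """Risk in the exposed group divided by risk in the unexposed group."""
        return (self.a / (self.a + self.b)) / (self.c / (self.c + self.d))

    def odds_ratio(self) -> float:
        """Odds of disease when exposed divided by odds when not exposed."""
        return (self.a / self.b) / (self.c / self.d)


miners = TwoByTwo(a=58, b=372, c=27, d=343)
print(round(miners.relative_risk(), 2))  # 1.85
print(round(miners.odds_ratio(), 2))     # 1.98
```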
Case-control relative risk
In a case-control study (module I1, sess. 05) numbers of lung cancer positive “cases” and lung cancer negative “controls” would be fixed by design. RR cannot be calculated, but the same odds ratio can, & is used as approximation to relative risk.
Odds ratios are statistically modelled by professional epidemiologists to account for numerous complicating factors.
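One way to see why the odds ratio survives a case-control design is to take the miners' counts, keep every lung cancer case, and imagine that only a fraction of the non-cases is sampled as controls. The sketch below works with expected counts. The sampling fractions and the function names are assumptions made for illustration, and the same fraction is assumed to apply to exposed and unexposed non-cases.

```python
def odds_ratio(a, b, c, d):
    """OR = (a.d)/(b.c)."""
    return (a * d) / (b * c)


def naive_risk_ratio(a, b, c, d):
    """The cohort RR formula, which is not valid once the controls are sampled."""
    return (a / (a + b)) / (c / (c + d))


a, b, c, d = 58, 372, 27, 343  # miners: exposed +, exposed -, unexposed +, unexposed -

for control_fraction in (1.0, 0.5, 0.25):  # hypothetical sampling fractions for controls
    b_s = b * control_fraction  # expected exposed controls
    d_s = d * control_fraction  # expected unexposed controls
    print(f"fraction = {control_fraction:.2f}  "
          f"OR = {odds_ratio(a, b_s, c, d_s):.2f}  "
          f"naive 'RR' = {naive_risk_ratio(a, b_s, c, d_s):.2f}")
```

The sampling fraction cancels in (a.d)/(b.c) but not in the risk formula, so only the odds ratio stays the same across the three rows.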
Confounding: 1
Confounders are “nuisance” variables that make over-simple conclusions misleading!
Example ~ suppose in a study population the TRUE average figures are as below, so tea/coffee drinking adds 4 mm Hg to diastolic blood pressure:-
Average diastolic BP (mm Hg)   Overweight   Not overweight
Tea/coffee drinker             94           74
Non-drinker of tea/coffee      90           70
Confounding: 2
Now say numbers of individuals in study are:-
Numbers of individuals       Overweight   Not overweight
Tea/coffee drinker           300          100
Non-drinker of tea/coffee    50           150
If the study ignores overweight status and calculates simple averages, it could expect diastolic BPs as follows:-
Drinkers: [(94 x 300) + (74 x 100)]/(300 + 100) = 89;
non-drinkers: [(90 x 50) + (70 x 150)]/(50 + 150) = 75.
Misleading 14 mm Hg difference. Confounders are only corrected for if someone thinks of them!
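The arithmetic of this example can be reproduced in a short sketch that compares the crude (pooled) difference with the difference inside each weight group. The dictionary layout and function name are illustrative choices. The averages and counts are the ones given on the two confounding slides.

```python
# True average diastolic BP (mm Hg) and numbers of individuals, from the slides.
mean_bp = {
    ("drinker", "overweight"): 94, ("drinker", "not overweight"): 74,
    ("non-drinker", "overweight"): 90, ("non-drinker", "not overweight"): 70,
}
counts = {
    ("drinker", "overweight"): 300, ("drinker", "not overweight"): 100,
    ("non-drinker", "overweight"): 50, ("non-drinker", "not overweight"): 150,
}
weight_groups = ("overweight", "not overweight")


def crude_mean(drinking_group: str) -> float:
    """Average BP for a drinking group, ignoring weight (a count-weighted mean)."""
    total = sum(mean_bp[drinking_group, w] * counts[drinking_group, w] for w in weight_groups)
    n = sum(counts[drinking_group, w] for w in weight_groups)
    return total / n


crude_difference = crude_mean("drinker") - crude_mean("non-drinker")
print(crude_mean("drinker"), crude_mean("non-drinker"), crude_difference)  # 89.0 75.0 14.0

# Within each weight group the difference is the true 4 mm Hg.
for w in weight_groups:
    print(w, mean_bp["drinker", w] - mean_bp["non-drinker", w])
```

The crude difference is inflated because drinkers in this population are mostly overweight and non-drinkers mostly not, so the comparison mixes the effect of weight with the effect of drinking.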
Practical work follows to ensure learning objectives are achieved…