Statistical methods for epidemiology Professor Louise Ryan Department of Biostatistics Harvard School of Public Health Assisted by Dr Beth Ann Griffin
Date posted: 19-Dec-2015

Page 1: Statistical methods for epidemiology Professor Louise Ryan Department of Biostatistics Harvard School of Public Health Assisted by Dr Beth Ann Griffin.

Statistical methods for epidemiology

Professor Louise Ryan

Department of Biostatistics

Harvard School of Public Health

Assisted by Dr Beth Ann Griffin

Page 2:

Today’s lecture

• Course overview

• What is epidemiology?

• Concepts of causation

• Epidemiology vs clinical trials (Observational versus randomized study design)

• Some examples that we’ll come back to throughout the course

Page 3:

What is epidemiology?

The study of the factors (genetic, environmental, behavioral) that cause human disease

Many epidemiologists have MD degrees and specialize in one of many subfields such as cancer, nutrition, environment, genetics, oral, cardiovascular, etc.

Epidemiology is heavily quantitative and relies on a variety of advanced statistical methods. Some statisticians (e.g. Professor Norman Breslow) specialize in the development of statistical methods for epidemiology

Non-quantitative aspects of epidemiology (subject matter knowledge) are also extremely important

Page 4:

An example - Antiepileptic Drugs and birth outcomes

• Hypothesis: Taking drugs like Dilantin during pregnancy can cause mental retardation. Affected children can be identified through subtle facial alterations (similar to fetal alcohol syndrome)

• Subjects chosen from 128,000 births screened at MGH.

                     Drug exposed   Seizure history, no drugs   Controls
Number of children   316            98                          508
Major malformation   5.7%           0%                          1.8%
Growth retardation   4.8%           2.1%                        1.2%
Altered face         11.3%          2.9%                        3.8%
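One way to read the table is as crude risk ratios comparing drug-exposed children with controls. The sketch below uses only the percentages shown on the slide. The variable names are illustrative, and the ratios are unadjusted, so they say nothing yet about causation.

```python
# Percentages copied from the slide: (drug exposed, controls).
outcomes = {
    "Major malformation": (5.7, 1.8),
    "Growth retardation": (4.8, 1.2),
    "Altered face": (11.3, 3.8),
}

# Crude (unadjusted) risk ratio: risk in exposed divided by risk in controls.
risk_ratios = {
    name: exposed / control for name, (exposed, control) in outcomes.items()
}

for name, rr in risk_ratios.items():
    print(f"{name}: crude risk ratio = {rr:.2f}")
```

The seizure-history group is left out here because its 0% malformation rate would make a ratio against it undefined.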

Page 5:

Causation

• We use statistical methods to assess association. For example, blood pressure is correlated with body mass index

• Epidemiology adds subject matter knowledge to go beyond association in order to determine causation. For example:

– Does increasing one’s body mass index lead to an increase in blood pressure?

– Do the antiepileptic drugs cause adverse birth outcomes? Or is it the epilepsy itself?

Page 6:

Association vs causation

• A study shows a positive correlation between hospital size (number of beds) and length of time patients stay in hospital. Does this mean that you can shorten a hospital stay by choosing a small hospital?

• An Australian study of aging found that people who exercised had better cognitive functioning than those who did not. Does exercise improve cognitive functioning? Or are those with better cognitive functioning more likely to exercise?

• There is some evidence that eating fish with high levels of methylmercury can lead to problems with child neurodevelopment. But is it the methylmercury that causes the problems, or some other exposure (e.g. PCBs)?
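The hospital example can be sketched with made-up numbers. Everything in the block below is hypothetical: it assumes that large hospitals simply admit more severe cases, and that length of stay depends only on severity. Under those assumptions the crude comparison shows longer stays in large hospitals even though hospital size has no effect within a severity group.

```python
# Hypothetical data: (hospital size, severity, number of patients, days in hospital).
strata = [
    ("small", "mild", 80, 3),
    ("small", "severe", 20, 10),
    ("large", "mild", 20, 3),
    ("large", "severe", 80, 10),
]


def mean_stay(size, severity=None):
    """Patient-weighted mean stay for one hospital size, optionally within a severity group."""
    rows = [
        (n, days)
        for s, sev, n, days in strata
        if s == size and (severity is None or sev == severity)
    ]
    total_patients = sum(n for n, _ in rows)
    return sum(n * days for n, days in rows) / total_patients


crude_small = mean_stay("small")
crude_large = mean_stay("large")
print("Crude mean stay:", crude_small, "(small) vs", crude_large, "(large)")

# Within each severity group the two hospital sizes have identical stays.
for severity in ("mild", "severe"):
    print(severity, mean_stay("small", severity), mean_stay("large", severity))
```

Here severity plays the role of the lurking variable: it is associated with hospital size and drives the outcome.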

Page 7:

What does it mean for A to “cause” B?

• Some naïve interpretations might be:

– An antecedent event? Unless A happens, B can’t happen?

– If A happens, B must happen?

• In practice, causality is more subtle:

– Many factors generally contribute

– People’s risks vary due to genetics etc.

– Knowledge is usually incomplete

• Probabilistic interpretation is that if event A occurs, the risk of event B is increased.
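In symbols, one common way to write this probabilistic reading is shown below. The notation is an assumption rather than something taken from the slide, with \bar{A} denoting "A does not occur".

```latex
P(B \mid A) > P(B \mid \bar{A})
```

This inequality on its own describes association only; the causal reading needs the further arguments discussed on the following slides.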

Page 8:

Bradford Hill’s Causal Criteria

• Strength of association
• Consistency
• Specificity
• Temporality
• Biologic gradient
• Plausibility
• Coherence
• Experimental evidence
• Analogy

Page 9:

Causal graphs (Jewell Section 8.2)

Dotted lines denote association. Solid lines denote causation. Solid lines are unidirectional (directed causal graph). Graphs are acyclic if no directed path forms a loop (no variable can cause itself).

Z is a lurking variable in these schematics: a possibly unmeasured variable that may influence interpretation of the data

Reverse causation refers to the erroneous conclusion that X causes Y when in fact Y causes X
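The acyclicity condition can be checked mechanically. The sketch below represents a directed causal graph as a dictionary of cause -> list of effects and uses a depth-first search to look for a directed loop. The function name and the two example graphs are illustrative only, not taken from Jewell.

```python
from typing import Dict, List


def is_acyclic(graph: Dict[str, List[str]]) -> bool:
    """Return True if no directed path leads from a variable back to itself."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(graph) | {c for children in graph.values() for c in children}
    colour = {n: WHITE for n in nodes}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for child in graph.get(node, []):
            if colour[child] == GREY:  # back edge: a directed loop exists
                return False
            if colour[child] == WHITE and not visit(child):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in nodes if colour[n] == WHITE)


# Z causes both X and Y, and X causes Y: a valid directed acyclic graph.
confounding = {"Z": ["X", "Y"], "X": ["Y"]}
# X causes Y and Y causes X: not allowed in a causal graph.
feedback = {"X": ["Y"], "Y": ["X"]}

print(is_acyclic(confounding), is_acyclic(feedback))
```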

Page 10:

Causal inference (Jewell Ch 8)

Has gained in popularity among statisticians in recent years (e.g. work by Jamie Robins, Don Rubin).

Causal inference often uses counterfactuals to represent individual responses under hypothetical repetitions of reality.

Let Yi1 be the birth outcome of the ith baby when the mother takes drugs during pregnancy, and Yi0 the outcome of that same baby when the mother doesn’t take drugs. The true causal effect is E(Yi1 - Yi0). Because we only get to observe one of the two, assumptions are needed to tease out the true effect.

Group   Coffee       No coffee    Number
1       Disease      Disease      Np1
2       Disease      No disease   Np2
3       No disease   Disease      Np3
4       No disease   No disease   Np4

Jewell Table 8.1: Pancreatic cancer in response to hypothetical coffee exposure scenarios
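Reading p1, ..., p4 as the population proportions in groups 1 to 4 (an assumption suggested by the Np counts, with N the population size), the counterfactual risks and causal effect measures implied by the table can be written as:

```latex
\begin{align*}
P(D \mid \text{everyone drinks coffee}) &= p_1 + p_2, \\
P(D \mid \text{no one drinks coffee}) &= p_1 + p_3, \\
\text{causal risk difference} &= (p_1 + p_2) - (p_1 + p_3) = p_2 - p_3, \\
\text{causal relative risk} &= \frac{p_1 + p_2}{p_1 + p_3}.
\end{align*}
```

Group 1 gets the disease either way and group 4 never does, so only groups 2 and 3 contribute to the risk difference.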

Page 11:

Causal inference (cont’d)

In reality, we can’t observe all 8 cells, but only 4.

Suppose that group 1 & 3 individuals all drink coffee and group 2 & 4 individuals don’t. This would make it appear that coffee has a very strong effect. But it is really the mechanism by which people are exposed that is inducing the effect.

Is there any hope of sorting out causal effects?

– Randomization

– Assume we have measured all the factors that influence both exposure and outcome of interest (no unmeasured confounders).

The type of study design has a strong influence on how confident we are about inferring causation from an epidemiological study
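The coffee scenario can be worked through numerically. The group proportions below are hypothetical, chosen so that coffee has no causal effect at all. The sketch then computes what would be observed if groups 1 and 3 drink coffee and groups 2 and 4 do not.

```python
from fractions import Fraction

# Hypothetical proportions of the population in groups 1-4 (they sum to 1).
p1, p2, p3, p4 = Fraction(1, 10), Fraction(2, 10), Fraction(2, 10), Fraction(5, 10)

# Counterfactual risks: everyone exposed versus no one exposed.
risk_all_coffee = p1 + p2  # groups 1 and 2 develop disease if they drink coffee
risk_no_coffee = p1 + p3   # groups 1 and 3 develop disease if they do not
causal_risk_difference = risk_all_coffee - risk_no_coffee

# Observed data when groups 1 and 3 drink coffee and groups 2 and 4 do not.
observed_risk_drinkers = p1 / (p1 + p3)             # only group 1 is diseased under coffee
observed_risk_abstainers = Fraction(0) / (p2 + p4)  # neither group is diseased without coffee

print("Causal risk difference:", causal_risk_difference)
print("Observed risk, drinkers:", observed_risk_drinkers)
print("Observed risk, non-drinkers:", observed_risk_abstainers)
```

With these numbers the causal risk difference is zero, yet a third of coffee drinkers and none of the non-drinkers have the disease. The apparent effect comes entirely from who ends up exposed.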

Page 12:

Broad types of study design

• Experimental study – investigator controls who gets the factor of interest

– Women’s Health Initiative randomizes women to hormone replacement therapy or placebo

• Observational study – investigator observes the natural patterns of disease in study sample

– Cohort study of indoor allergens and asthma

– Case-control study of petrochemical exposure and childhood leukemia

• Quasi-experimental study – event provides opportunity for “natural” experiment

– Accident rates before and after seatbelt laws enacted

– Birth defects after methylmercury spill in Minamata, Japan

• Ecological study – observe characteristics of a group of people. Often use administrative data.

Page 13:

Clinical trial vs observational study

Clinical trial:

• Often focus is how to treat those already diseased. Or giving a treatment to prevent disease.

• Randomization assures treatment arms are balanced on age, sex, etc.

• Protocol for careful patient follow-up

Observational study:

• Understand why people get disease in first place

• Unethical to experimentally manipulate factors that cause disease. Need statistical adjustment to prevent bias

• Need to observe people “as they are” – potential for confounding to lead to bias

Page 14:

Some more examples

• Methylmercury and IQ

• Arsenic in drinking water

• Collaborative Perinatal Project

• National Health and Nutrition Examination Survey (NHANES)

Page 15:

Methylmercury

• Mercury is a naturally occurring metal

• Environmental levels increased through coal-burning

• Converted to methylmercury (MeHg) by aquatic organisms, then bioaccumulates

• Most common human exposure via fish consumption, especially tuna, swordfish, shark & whale

• Accidental poisonings in Minamata revealed serious effects in children

• Effects of low level exposure controversial – several good studies have conflicting results

Page 16:

Methylmercury (cont’d)

• Three studies

– Faroe Islands (n~1000 mother/infant pairs) – published studies report that higher levels of MeHg in cord blood lead to reduced performance on neurocognitive tests

– Seychelles (n~800) – no association

– New Zealand (n~300) – marginal association

• Why the differences?

– Sample sizes?

– Population differences?

– Other exposures? PCBs perhaps?

Page 17:

A combined analysis of three studies

Estimated regression coefficients and 95% confidence intervals for effect of 1 unit increase in hair MeHg on IQ

Study         Estimate
Seychelles    -0.17 (0.13)
New Zealand   -0.50 (0.28)
Faroes        -0.13 (0.061)

Individual studies non-significant. But hierarchical model suggests an effect of -0.15, with 95% confidence interval (-0.259, -0.047)
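To see roughly how combining the studies sharpens the estimate, the sketch below pools the three coefficients with fixed-effect inverse-variance weights. This is a simpler stand-in for the hierarchical model mentioned on the slide, not a reproduction of it. It also assumes the bracketed numbers are standard errors, which the slide does not state.

```python
import math

# (estimate, assumed standard error) per study.
# Treating the bracketed slide values as standard errors is an assumption.
studies = {
    "Seychelles": (-0.17, 0.13),
    "New Zealand": (-0.50, 0.28),
    "Faroes": (-0.13, 0.061),
}

# Inverse-variance weights: more precise studies count for more.
weights = {name: 1 / se**2 for name, (_, se) in studies.items()}
total_weight = sum(weights.values())

pooled = sum(weights[name] * est for name, (est, _) in studies.items()) / total_weight
pooled_se = math.sqrt(1 / total_weight)
ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

print(f"Pooled estimate: {pooled:.3f}")
print(f"Approximate 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions the pooled value lands close to the -0.15 reported on the slide, with an interval that excludes zero.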

Page 18:

Arsenic in drinking water

• Naturally occurring element, found in wells in mountainous areas.

• Health effects include hyperpigmentation, Blackfoot disease, cancer

• Natural experiment in the southwest region of Taiwan, where new wells were dug in the postwar era.

• Data include age-specific cancer counts and person-years at risk between 1976 and 1986 for 42 villages, along with well arsenic levels
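Data of this shape (counts and person-years by age band) are typically turned into age-specific rates and then into a cumulative lifetime risk. The sketch below uses entirely hypothetical counts for a single village, not the Taiwanese data. It applies the standard approximation risk = 1 - exp(-sum of rate x band width), ignoring competing causes of death.

```python
import math

# Hypothetical age bands for one village: (band width in years, cancer cases, person-years).
age_bands = [
    (20, 0, 40_000),   # ages 0-19
    (20, 2, 40_000),   # ages 20-39
    (20, 10, 20_000),  # ages 40-59
    (20, 30, 10_000),  # ages 60-79
]

# Age-specific incidence rate = cases / person-years at risk.
rates = [cases / person_years for _, cases, person_years in age_bands]

# Cumulative hazard: each rate multiplied by the number of years spent in that band.
cumulative_hazard = sum(width * rate for (width, _, _), rate in zip(age_bands, rates))

# Lifetime risk, ignoring competing risks.
lifetime_risk = 1 - math.exp(-cumulative_hazard)

print("Rates per 100,000 person-years:", [round(r * 1e5, 1) for r in rates])
print(f"Approximate lifetime risk: {lifetime_risk:.4f}")
```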

Page 19:

Issues for Arsenic analysis

[Figure: lifetime risk versus US equivalent concentration (µg/L)]

• Ecological study design – exposure at village level only.

• Measurement error a problem

• No information about smoking and other factors

• Noisy data leads to model sensitivity

Page 20:

Collaborative Perinatal Project (CPP)

• Conducted between 1959 and 1974 in twelve centers around the US

• Prospectively followed pregnant women enrolled prenatally

• Children examined at birth, with follow-up at four, eight, and twelve months, and three, four, seven, and eight years

• Women could enroll multiple times

• CPP studied 58,000 pregnancies, with up to 6 pregnancies on the same woman

• Information was also collected on pregnancies prior to enrollment, with up to 14 total pregnancies per woman

• Outcomes include various anthropometric measurements (weight, length, head circumference etc.)

• Interest in exposures such as smoking, education etc.

• How do past pregnancy outcomes affect current outcome?

Page 21:
Page 22:

National Health and Nutrition Examination Survey (NHANES)

• Conducted by the National Center for Health Statistics in the United States

• Probability-based sample from the US population

• Extensive data on demographics, weight, height, blood pressure, chronic diseases

• Biological samples used to assess methylmercury and lots of other exposures

Page 23:

NHANES (cont’d)

• Provides critical data on distributions of chronic disease in the US

• Good potential for epidemiological explorations

Low femur density in older US women

Page 24:

Hormone Replacement Therapy and Breast Cancer

A Japanese case/control study

Over to Professor Masahiro Takeuchi