Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology...

8
PUBLIC HEALTH MAnERS Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause and causal inference are largely self-taught from early learn- ing experiences. A model of causation that describes causes in terms of suffi- cient causes and their component causes illuminates important principles such as multicausality, the dependence of the strength of component causes on the prevalence of complementary component causes, and interaction between com- ponent causes. Philosophers agree that causal propositions cannot be proved, and find flaws or practical limitations in all philosophies of causal inference. Hence, the role of logic, belief, and observation in evaluating causal propositions is not settled. Causal inference in epidemiology is better viewed as an exercise in measurement of an effect rather than as a criterion-guided process for deciding whether an effect is pres- ent or not. (4m JPub//cHea/f/i. 2005;95:S144-S150. doi:10.2105/AJPH.2004.059204) What do we mean by causation? Even among those who study causation as the object of their work, the concept is largely self-taught, cob- bled together fioni early experiences. As a youngster, each person develops and tests an inventory of causal explanations that brings meaning to perceived events and Ihat ulti- mately leads to more control of those events. Because our first appreciation ofthe con- cept of causation is based on our own direct observations, the resulting concept is limited by the .scope of those obsei-vations. We typi- cally observe causes with effects that are im- mediately apparent. For example, when one turns a light switch to the "on" position, one normally sees the instant efTect of the light going on. Nevertheless, the causal mechanism for getting a light to shine involves more than turning a light switch to "on." Suppose a storm has downed the electric lines to the building, or the witnng is faulty, or the bulb is burned out—in any of these cases, turning the switch on wili have no efFect One cause ofthe light going on is having the switch in the proper position, but along with it we must have a supply of power to the circuit, good wiring, and a working bulb. When all other factors are in place, turning the switch will cause the light to go on, but if one or more of the other factors is lacking, the light will not go on. Despite the tendency to consider a switch as the unique cause of turning on a light, the complete causal mechanism is more intricate, and the switch is only one component of sev- eral. The tendency to identify the switch as the unique cause stems from its usual i-ote as the final factor that acts in the causal mecha- nism. The wiring can be considered part of the causal mechanism, but once it is put in place, it seldom warrants further attention. The switch, however, is often the only part of the mechanism that needs to be activated to obtam the effect of turning on the light. The effect usually occurs immediately after turn- ing on the switch, and as a result we slip into thefi^ameof thinking in which we identify the switch as a unique cause. The inadequacy of this assumption is emphasized when the bulb goes bad and needs to he replaced, 'lliese concepts of causation that are established empirically early in life are too rudimentary to serve well as the basis for scientific theo- ries. To enlarge upon them, we need a more general conceptual model that can serve as a common stalling point in discussions of causal theories. SUFFICIENT AND COMPONENT CAUSES The concept and definition of causation engender continuing debate among philoso- phers. Nevertheless, researchers interested in causal phenomena must adopt a working defi- nition. We can define a cause of a specific dis- ease event as an antecedent event, condition, or characteristic that was necessary for the occun'ence of the disease at the moment it occun-ed, given that other conditions are fixed. In other words, a cause of a disease event is an event, condition, or characteristic that preceded the disease event and without which the disease event either would not have occun-ed at all or would not have oc- curred until some later time. Under this defi- nition it may be that no specific event, condi- tion, or characteristic is sufiicient by itself to produce disease. This is not a definition, then, of a complete causal mechanism, but only a component of it. A "sufficient cause," which means a complete causal mechanism, can be defined as a set of minimal conditions and events that inevitably pT'oduce ciisease; "mini- mal" implies that all ofthe conditions or events are necessar}' to tliat occurrence. In disease etiology, the completion of a sufficient cause may be considered equivalent to the onset of disease. (Onset here refers to the onset of the earliest stage of the disease pro- cess, rather than the onset of signs or symp- toms,) For biological effects, most and some- times all of the components of a sufficient cause are unknown.' For example, tobacco smoking Ls a cause of lung cancer, but by itself it is not a sufficient cause. First the term smoking is too imprecise to be used in a causal description. One must specify the type of smoke (e.g., cigarette. cigar, pipe), whether it is filtered or unfiltered, the manner and frequency of inhalation, and the onset and duration of smoking. More im- portantly, smoking, even defined explicitly, will not cause cancer in everyone. Appar- ently, there are some people who, by virtue of their genetic makeup or previous experi- ence, are susceptible to the effects of smok- ing, and others who are not. These suscepti- hiWty factors are other components in the various causa] mechanisms through which smoking causes lung cancer. Figure 1 provides a schematic diagram of sufficient causes in a hypothetical individual, Kach constellation of component causes rep- resented in Figure 1 is minimally sufficient to produce the disease: that is, there is no redun- dant or extraneous component cause. Kach one is a necessary part of that specific causal S144 I Public Health Matters ( Peer Reviewed I Rothman and Greenland American Journai of Public Heaith | Supplement 1, 2005. Vol 95, No, S I

Transcript of Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology...

Page 1: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAnERS

Causation and Causal Inference in EpidemiologyKenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat

Concepts of cause and causal inference are largely self-taught from early learn-ing experiences. A model of causation that describes causes in terms of suffi-cient causes and their component causes illuminates important principles suchas multicausality, the dependence of the strength of component causes on theprevalence of complementary component causes, and interaction between com-ponent causes.

Philosophers agree that causal propositions cannot be proved, and find flaws orpractical limitations in all philosophies of causal inference. Hence, the role of logic,belief, and observation in evaluating causal propositions is not settled. Causalinference in epidemiology is better viewed as an exercise in measurement of aneffect rather than as a criterion-guided process for deciding whether an effect is pres-ent or not. (4m JPub//cHea/f/i. 2005;95:S144-S150. doi:10.2105/AJPH.2004.059204)

What do we mean by causation? Even amongthose who study causation as the object of theirwork, the concept is largely self-taught, cob-bled together fioni early experiences. As ayoungster, each person develops and tests aninventory of causal explanations that bringsmeaning to perceived events and Ihat ulti-mately leads to more control of those events.

Because our first appreciation ofthe con-cept of causation is based on our own directobservations, the resulting concept is limitedby the .scope of those obsei-vations. We typi-cally observe causes with effects that are im-mediately apparent. For example, when oneturns a light switch to the "on" position, onenormally sees the instant efTect of the lightgoing on. Nevertheless, the causal mechanismfor getting a light to shine involves morethan turning a light switch to "on." Supposea storm has downed the electric lines to thebuilding, or the witnng is faulty, or the bulbis burned out—in any of these cases, turningthe switch on wili have no efFect One causeofthe light going on is having the switch inthe proper position, but along with it wemust have a supply of power to the circuit,good wiring, and a working bulb. When allother factors are in place, turning the switchwill cause the light to go on, but if one ormore of the other factors is lacking, the lightwill not go on.

Despite the tendency to consider a switchas the unique cause of turning on a light, thecomplete causal mechanism is more intricate,and the switch is only one component of sev-

eral. The tendency to identify the switch asthe unique cause stems from its usual i-ote asthe final factor that acts in the causal mecha-nism. The wiring can be considered part ofthe causal mechanism, but once it is put inplace, it seldom warrants further attention.The switch, however, is often the only part ofthe mechanism that needs to be activated toobtam the effect of turning on the light. Theeffect usually occurs immediately after turn-ing on the switch, and as a result we slip intothe fi^ame of thinking in which we identify theswitch as a unique cause. The inadequacy ofthis assumption is emphasized when the bulbgoes bad and needs to he replaced, 'llieseconcepts of causation that are establishedempirically early in life are too rudimentaryto serve well as the basis for scientific theo-ries. To enlarge upon them, we need a moregeneral conceptual model that can serve as acommon stalling point in discussions ofcausal theories.

SUFFICIENT AND COMPONENTCAUSES

The concept and definition of causationengender continuing debate among philoso-phers. Nevertheless, researchers interested incausal phenomena must adopt a working defi-nition. We can define a cause of a specific dis-ease event as an antecedent event, condition,or characteristic that was necessary for theoccun'ence of the disease at the moment itoccun-ed, given that other conditions are

fixed. In other words, a cause of a diseaseevent is an event, condition, or characteristicthat preceded the disease event and withoutwhich the disease event either would nothave occun-ed at all or would not have oc-curred until some later time. Under this defi-nition it may be that no specific event, condi-tion, or characteristic is sufiicient by itself toproduce disease. This is not a definition, then,of a complete causal mechanism, but only acomponent of it. A "sufficient cause," whichmeans a complete causal mechanism, can bedefined as a set of minimal conditions andevents that inevitably pT'oduce ciisease; "mini-mal" implies that all ofthe conditions orevents are necessar}' to tliat occurrence. Indisease etiology, the completion of a sufficientcause may be considered equivalent to theonset of disease. (Onset here refers to theonset of the earliest stage of the disease pro-cess, rather than the onset of signs or symp-toms,) For biological effects, most and some-times all of the components of a sufficientcause are unknown.'

For example, tobacco smoking Ls a cause oflung cancer, but by itself it is not a sufficientcause. First the term smoking is too impreciseto be used in a causal description. One mustspecify the type of smoke (e.g., cigarette.cigar, pipe), whether it is filtered or unfiltered,the manner and frequency of inhalation, andthe onset and duration of smoking. More im-portantly, smoking, even defined explicitly,will not cause cancer in everyone. Appar-ently, there are some people who, by virtueof their genetic makeup or previous experi-ence, are susceptible to the effects of smok-ing, and others who are not. These suscepti-hiWty factors are other components in thevarious causa] mechanisms through whichsmoking causes lung cancer.

Figure 1 provides a schematic diagram ofsufficient causes in a hypothetical individual,Kach constellation of component causes rep-resented in Figure 1 is minimally sufficient toproduce the disease: that is, there is no redun-dant or extraneous component cause. Kachone is a necessary part of that specific causal

S144 I Public Health Matters ( Peer Reviewed I Rothman and Greenland American Journai of Public Heaith | Supplement 1, 2005. Vol 95, No, SI

Page 2: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAHERS

One Causal MechanismSingle Component Cause

FIGURE l-Three sufficient causes of disease.

mechanism. A specific component cause mayplay a role in one, two, or all three of thecausal mechanisms pictured.

MULTiCAUSAUTY

The model of causation implied byFigure 1 illuminates several important princi-ples regaRJiiig causes. Perhaps the most im-portant of these principles is self-evident fromthe model: A given disease can be caused bymore than one causal mechanism, and ever}'causal mechanism involves tbe joint action ofa multitude of component causes. Consideras an example the cause of a broken bip. Sup-pose tbat someone experiences a naumatieinjury to the bead tbal leads to a permanentdisturbance in equilibrium. Many years later,the faulty equilibrium plays a causal role in afall tbat occui"s wbile tbe person is walkingon an icy patb. Tbe fall results in a brokenhip. Otber factors playing a causal role for thebroken bip could include tbe type of sboe tbeperson was wearing, tbe lack of a baiidrailalong tbe patb, a strong wind, or the bodyweigbt of tbe person, among others. Tbe com-plete causal mechanism involves a multitudeof factors. Some factors, such as tbe person'sweigbt and tbe earlier injury tbat resulted intbe equilibrium disturbance, reflect earlierevents tbat bave bad a lingering efTect. Somecausal components are genetic and would af-fect the person's weight, gait, behavior, recov-ery from the earlier trauma, and so fortb.Otber factors, sucb as the force of tbe wind,are environmental. It is a reasonably safe as-

sertion tbat tbere are nearly always somegenetic and some environmental componentcauses in every causa! mecbanism, Tbus,even an event sucb as a fall on an icy patbleading to a broken bip is part of a compli-cated causal mecbanism tbat involves manycomponent causes.

Tbe importance of multicausality is tbatmost identified causes are neitber necessar}'nor sufficient to produce disease, Nevertbe-less, a cause need not be eitber necessary orsufficient for its removal to result in diseaseprevention. If a component cause tbat is nei-tber necessary nor sufficient is blocked, a sub-stantial amount of disease may be prevented,Tbat tbe cause is not necessary implies tbatsome disease may still occur after tbe causeis blocked, but a component cause wili never-theless be a necessary cause for some of thecases tbat occur, Tbat tbe component cause isnot sufficient implies tbat otber componentcauses must interact witb it to produce tbedisease, and tbat blocking any of tbem wouldresult in prevention of some cases of disease.Tbus, one need not identify every componentcause to prevent some cases of disease. In tbelaw. a distinction is sometimes made amongcomponent causes to identify tbose tbat maybe considered a "proximate" cause, implyinga more direct connection or responsibility fortbe outcome.^

STRENGTH OF A CAUSE

In epidemiology, tbe strengtb of a factor'seffect is usually measured by tbe cbange in

disease frequency produced by introducingtbe factor into a population. Tbis cbange maybe measured in absolute or relative terms. Ineitber case, tlie strength of an effect maybave tremendous public health significance,bul it may have little biological signili :ance.Tbe reason is that given a spedfic causalmechanism, any ol' the component causes canhave strong or weak efTects. Tbe aetua! iden-tity of tbe constituent components of tbecausal mecbanism amounts to the biology ofcausation. In contrast, the strengtb of a fac-tor's effect depends on the time-specific distri-bution of its causal complements in the popu-lation. Over a span of time, tbe strengtb oftbe effect of a given factor on the occurrenceof a given disease may change, because tbeprevalence of its causal complements in vari-ous causal mechanisms may also cbange.The causal mechanisms in which the factorand its complements act could remain un-changed, however.

INTERACTION AMONG CAUSES

Tbe causal pie model posits tbat severalcausal components act in concert to producean effect. "Acting in concert" does not neces-sarily imply that factors must act at tbe samedme. Consider tbe example above of tbe per-son wbo sustained trauma to the head thatresulted in an equilibrium disturbance,wbieh led, years later, to a fall on an icypath. Tbe earlier head trauma played acausal role in tbe later hip fracture: so didthe weatber conditions on the day of tbefracture. If both of tbese factors played acausal role in tbe bip fracture, tben tbey in-teracted with one anotber to cause the frac-ture, despite the fact tbat tbeir time of aetionis many years apart. We would say tbat anyand all of the factors in tbe same causalmechanism for disease interact witb one an-otber to cause disease. Tbus, tbe beadirauma interacted with the weather condi-tions, as well as with otber component causessucb as tbe type of footwear, tbe absence ofa bandbold, and any otber conditions thatwere necessary to tbe causal mecbanism oftbe fall and the broken hip tbat resulted.One can view each causal pie as a set of in-teracting causal components. I'his modelprovides a biological basis for a concept of

Supplement 1, 2005, Vol 95. No, SI I American Journal ot Public Health Rothman and Greenland ] Peer Reviewed j Public Heaith Matters i S145

Page 3: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAnERS

Table 1-Hypothetical Rates of Head

and Neck Cancer (Cases per 100000

Person-Years) According to Smoking

Status and Aicohoi Drinking

Smoking Status

Nonsmoker

Smoker

Alcohol Drinking

No

1

4

Yes

3

12

interaction distinct from the usual statisticalview of interaction.'*

SUM OF ATTRIBUTABLE FRACTIONS

Consider the data on rates of head andneck cancer according to whether peoplehave been cigarette smokers, alcohol drinii-ers, or both (Table 1). Suppose that the differ-ences in the rates all rettect causal efTects./\mong those people who are smokers andalso alcohol drinkers, what proportion of thecases is attributable to the effect of smoking?We know that the rate for these people is 12cases per 100000 person-years. If thesesame people were not smokers, we can inferthat their rate of head and neck cancer wouldbe 3 cases per 100000 person-years, if thisdifference reflects the causal role of smoking,then we might infer that 9 of every 12 cases,or 75%, are attributable to smoking amongthose who both smoke and drink alcohol. Ifwe tum the question around and ask whatproportion of disease among these samepeople is attributable to alcohol drinking,we would be able to attribute 8 of eveiy 12cases, or 67%, to alcohol drinking.

How can we attribute 75"Ai of the cases tosmoking and 67"/o to alcohol drinking amongthose who are exposed to both? We can be-cause some cases are counted more thanonce. Smoking and alcohol interact in somecases of head and neck cancer, and thesecases are attributable both to smoking ami toalcohol drinking. One consequence of interac-tion is that we should not expect that the pro-portions of disease attributable to variouscomponent causes will sum to 100%.

A widely discussed (though unpublished)paper from the 1970s, written by scientists atthe National Institutes of Health, proposed

tbat as much as 4O''/() of cancer is attributableto occupational exposures. Many scientiststhought that this fr aclion was an overestimate,and ai^ed against this claim."*' One of thearguments used in rebuttal was as follows;X percent of cancer is caused by smoking,y percent by diet, z percent by aicohoi, andso on; when all these percentages are addedup. oniy a small percentage, much less than40%, is left for occupational causes. But thisrebuttal is fallacious, because it is based onthe naive view that every case of disease hasa single cause, and that two causes cannotboth contribute to the same case of cancer.In fact, smce diet, smokiiig, asbestos, and van-ous occupational exposures, along with otberfactors, interact with nne another and withgenetic factoi's to cause cancer, each case ofcaneer could be attributed repeatedly tomany separate component causes. Tbe sumof disease attributable to vaiious componentcauses thus has no upper limit.

A single cause or category of causes that ispresent in every sulTicient cause of diseasewill have an attributable fraction of 100%,Much publicity attended the pronouncementin 1960 that as much as 90*'/i) of cancer iscaused by environmental factors.' Since "envi-ronment" can be thought of as an all-embracingcategory that represents nongenetic causes,wbich must be present to some extent inevery sufficient cause, it is clear on a priorigrounds that 100% of any disease is environ-mentally caused. Thus, lligginson's estimateof 9O"/o was an underestimate.

Similarly, one can show that lOO'Vu of anydisease is inherited. MacMalion ' dted the ex-ample of yellow shanks, ^ a trait occurring incertain strains of fowl fed yellow com. Boththe right set of genes and the yellow-com dietare necessary to produce yellow shanks. Afanner with several strains of fowl, feedingthem all only yellow com, would consideryellow shanks to be a genetic condition, sinceonly one strain would get yellow shanks, de-spite all strains getting the same diet. A differ-ent fanner, who owned only the strain liableto get yellow shanks, but who fed some ofthe bii'ds yellow com and others white com.would consider yellow shanks to be an envi-ronmentally detemiined condition because itdepends on diet. In reality, yellow shanks isdetermined by both genes and environment;

there is no reasonable way to allocate a por-tion of tlie causation to either genes or envi-ronment. Similarly, every case of every dis-ease has some enviranmenta! and somegenetic component causes, and thereforeevery case can be attributed both to genesand to environmeTit. No paradox exists aslong as it is understood that the fractions ofdisease attributable to genes and to environ-ment overlap.

Many researchers bave spent considerableefTort in developing heritabilit}' indices, wbicbare supposed to measure the fraction of dis-ease that is inherited. Unfortunately, theseindices only assess the relative role of envi-ronmental and genetic causes of disease in aparticular setting. For example, some geneticcauses may be necessary components ofevery causal mechanism. If everyone in apopulation has an identical set of the genesthat cause disease, however, tbeir efTect isnot included in heritabiiity indices, despitethe fact that having these genes is a cause ofthe disease. I he two fanners in the exampleabove would offer very difTerent values forthe heritabiiity of yellow shanks, despite thefact tbat the condition is always 100% depen-dent on having certain genes.

If all genetic factors that detennine diseaseare taken into account, whether or not theyvary witliin populations, then 100% of dis-ease can be said to be inherited. Analogously,100% of any disease is environmentallycaused, even those diseases that we oftenconsider purely genetic. Phenylketonuria, forexample, is considered by many to be purelygenetic. Nonetheless, the mental retardationthat it may cause can be prevented by appro-priate dietary intervention.

The treatment for phenylketonuria illus-trates the interaction of genes and environ-ment to cause a disease commonly thought tobe purely genetic. What about an apparentlypurely environmental cause of death such asdeath from an automobile accident? It is easyto conceive of genetic traits tbat lead to psy-chiatric problems such as alcoholism, whichin tum lead to dnink driving and consequentfatality. Consider another more extreme envi-ronmental example, being killed by lightning.Paitially heritable psychiatric conditions caninfluence whether someone will take shelterduring a lightning storm; genetic traits sucb as

S146 I Public Health Matters I Peer Reviewed I Rothman and Greenland American Journal of Public Heaitti | Supplement 1, 2005, Vol 95, No. SI

Page 4: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAnERS

athletic ability may influence the likelihood ofbeing outside when a lightning storm strikes;and having an outdoor occupation or pastimethat is more frequent among men (or women),and in that sense genetic, would also inilu-cnce the probability of getting killed by light-ning. The argument may seem stretched onthis e.iiiamjjle, but the point that every case ofdisease has both genetic and environmentalcauses is defensible and has important impli-cations for research.

MAKING CAUSAL INFERENCES

Causal inference may be viewed as a spe-cial case of the more general process of scien-lilic reasoning, about which there is substan-tial scholarly debate among scientists andphilosophers.

Impossibility of Proof

Vigorous debate is a characteristic of mod-em scientific philosophy, no less in epidemiol-ogy than in other areas. Perhaps the most im-f)ortant common thread that emerges Fromthe debated philosophies stems from lStli-centLiiy empiricist David I lume's observationthat proof is impossible in empirical science.This simple fact is espedally important to epi-demiologists, who often face the criticism tbatproof is impossible in epidemiology', witli theimplication that it is possible in other scien-tific disciplines. Such criticism may stem froma view that experiments are the definitivesource of scientific knowledge. Such a view ismistaken on at least two counts. First, thenonexperimental nature of a science does notpreclude impressive scientific discovei-ies; themyriad examples include plate tectonics, theevolution of species, planets orbiting otherstars, and the efTects of cigarette smoking onhuman health. Hven when they are possible.experiments (including randomized trials) donot provide anything approaching proof, andin fact may be controveraal. contradictory,or irreprodiicible. The cold-fusion dehadedemonstrates well that neither physical norexperimental science is immune to suchprobiems.

Some experimental scientists hold thatepidemiologic relations are only suggestive,and believe that detailed laboratory study ofmechanisms within single individuals ean

reveal cause-effect relations with certainty.This view overlooks the fact that all relationsarc suggestive m exactly the manner dis-cussed by Hume: even the most careful anddetailed mechanistic dissection of individualevents cannot provide more than associations,albeit at a finer level. Laboratory studiesoften involve a degree of observer controlthat cannot be approached in epidemiology;it is only this control, not the level of observa-tion, that can strengthen the inferences Iromlaboratoiy studies. Furthermore, such controlis no guarantee against error All of the fruitsof scientific work, in epidemiology or otherdisciplines, are at best only tentative formula-tions of a description of nature, even wbenthe work itself is carried out without mistakes.

Testing Competing EpidemiologieTheories

Biological knowledge about epidemiologiehypotheses Ls often scant, making the hy-potheses themselves at times little more thanvague statements of causal association be-tween exposLu e and disease, sucb as "smok-ing causes cardiovascular disease." Thesevague hypotheses have only vague conse-quences Ihat can be dilTicult to test To copewith this vagueness, epidemiologists usuallyfocus on testing tiie negation of the causalhypothesis, that is, tbe null hypothesis thatthe exposure does not have a causal relationto disease. Then, any observed associationcan potentially refute the hypothesis, subjectto tlie assumption (auxiliary hypothesis) thatbiases are absent.

lithe causal mecbanism is stated specifi-cally enough, epidemiologie observations(.mder some circumstances might providecrucial tests of competing non-nutl causalhypotheses. On the other hand, many epide-miologie .studies are not designed to test acausal hypothesis. For example, epidemio-logie data related to the finding that womenwho took replacement estrogen therapy wereat a considerably higher risk for endometrialcancer was examined by Horwitz and Fein-stein, who conjectured a competing theory toexplain the association: they proposed thatwomen taking estrogen experienced symp-toms such as bleeding that induced them toconsult a physician.'* The resulting diagnosticworkup led to the detection of endometriat

cancer at an earlier stage in these women, ascompared with women not taking estrogens,Many epidemiologie observations could havebeen and were used to evaluate tbese com-peting hypotheses. The causal theory pre-dicted that the risk of endometi-ial cancerwould tend to increase with increasing use(dose, frequency, and duration) of estrogens,as for other carcinogenic exposures. Thedetection bias theory, on the other hand,predicted that women who had used estro-gens only for a short white would have thegi-eatest risk, since the symptoms related toestrogen use that led to the medical consulta-tion tend to appear soon after use begins.Because the association of recent estrogenuse and endometrial cancer was the samein both long-term and short-term estrogenusers, the detection bias theoiy was refutedas an explanation for all but a small fractionof endometrial cancer cases occurring afterestrogen use.

The endometrial cancer example illus-trates a critical point in understanding theprocess of causal inference in epidemiologiestudies: many of the hypotheses being evalu*ated in the interpretation of epidemiotogicstudies are noncausal hypotheses, in thesense of involving no causal connection be-tween tbe study exposure and the disease.For example, hypotheses that amount toexplanations of how specific types of biascould bave led to an association between ex-posure and disease are the ustial alternativesto the primary study hypothesis that the epi-demiologist needs to consider in drawing in-ferences. Much of the interpretation of epi-demiologie studies amounts to the testing ofsuch noncausal explanations,

THE DUBIOUS VALUE OF CAUSALCRITERIA

In practice, how do epidemiologists sepa-rate out the causal from the noncausal expla-nations? Despite philosophic criticisms of in-ductive inference, inductively oriented causalcriteria have commonly been used to makesueh inferences. If a set of necessary and suf-fident causal criteria could be used to distin-guish causal from noncausal relations in epi-demiologie studies, the job of the scientistwould be eased considerably. With such

Supplement 1, 2005. Vol 95. No, SI I American Journal of Public Health Rothnian and Greenland | Peer Reviewed I Public Health Matters I S147

Page 5: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBUC HEALTH MAHERS

criteria, ail the concerns about the logic orlack thereof in causal inference could beforgotten: it would only be necessaiy to con-sult the checklist of criteria to see if a relationwere causal. We know from piiilosophy that aset of sufficient criteria does not exist. Never-theless, lists of causal criteria have becomepopular, possibly because they seem toprovide a road map through complicatedterritory.

Hill's CriteriaA commonly used set of criteria was pro-

posed by Hill," it was an expansion of a setof criteria offered previously in the landmarksurgeon general's repoit on smoking andhealth," which in tum were anticipated bytbe inductive canons of John Stuart Mill'and the rules given by Hume. ''

Hill suggested that the following aspects ofan association be considered in attempting todistinguish causal from noncausal associa-tions: (1) strength, (2} consistency, (3) speci-ficity, (4) temporality, (5) biological gradient,(6) plausibility, (7) coherence, (8) experimen-tal evidence, and (9) analogy. These criteriasuffer from their induetivist origin, but theirpopularity demands a more specific discus-sion of their utility.

/. Strength. Hill's argument is essentiallythat strong associations are more likely to becausal than weak associations because, if theycould be explained by some otber factor, theeffect of that factor would have to be evenstronger than the observed assodation andtherefore would have become evident. Weakassociations, on the other hand, are morelikely to be explained by undetected biases. Tosome extent tbis is a reasonable argument but,as Hill himself acknowledged, the tact that anassociation is weak does not rule out a causalconnection, A commonly cited counterexam-ple is the relation between dgarette smokingand cardiovascular disease: one explanationfor this relation being weak is tbat cardiovas-cular disease is common, making any ratiomeasure of effect comparatively small com-pared with ratio measures for diseases that areless common.'"* Nevertheless, dgarette smok-ing is not serioasly doubted as a cause of car-diovascular disease. Another example wouldbe passive smoking and lung cancer, a weakassociation that few consider to be noncausal.

Counterexamples of strong but noncausalassociations ai e also not bard to find; anystudy with strong confounding illustrates thephenomenon. For example, consider thestrong but noncausal relation between Downsyndrome and birth rank, which is con-founded by the relation between Down syn-drome and maternal age. Of course, once theconfounding factor is identified, the associa-tion is diminished by adjustment for the fac-tor These examples remind us that a strongassociation is neither necessary nor sufficientfor causality, nor is weakness necessary orsufficient for absence of causality. Further-more, neither relative risk nor any othei mea-sure of association is a biologically consi.stentfeature of an association; as described above,sudi measures of association are cbaracteris-tics of a given population that depend on tlierelative prevalence of other causes in thatpopulation. A strong association serves onlyto mle out hypotheses that the assodation isentirely due to one weak unmeasured con-founder or other source of modest bias,

2. Consistency. Consistency refers to the re-peated observation of an assodation in differ-ent populations under different circumstances.Lack of consistency, however, does not ruleout a causal assodation, beeause some effectsare produced by their causes only under un-usual circumstances. Moi'e precisely, the effectof a causal agent cannot occur unless the eom-plementary component causes act, or have al-ready acted, to complete a sufficient cause.These conditions will not always be met. Tlius,transtiisions can cause HIV infection but theydo not always do so: the virus must also bepresent. Tampon use can cause toxic shocksyndrome, but only rarely when certain other,perhaps imknown, conditions are met. Consis-tency is apparent only after all the relevant de-tails of a causal mechanism are understood,which is to say very seldom. Furthermore,even studies of exactly the same phenomenacan be expected to yield different results sim-ply because they differ in their methods andrandom errors. Consistency serves only to ruleout hypotheses that the association is attributa-ble to some factor that varies across stiadies.

One mistake in implementing the consis-tency criterion is so common that it desen,'esspecial mention. It is sometimes claimed thata literature or set of results is inconsistent

simply beeause some results are "statisticallysignificant" and some are not This sort ofevaluation is completely fallacious even if oneaccepts the use of significance testing meth-ods: The results (effect estimates) from thestudies could all be identical even if manywere significant and many were not, tbe dif-ference in significance arising solely becauseof difTerences in the standard eixors or sizesof the studies. Furthermore, this fallacy is noteliminated by "standai dizing" estimates.

3. Specificity. Tbe criterion of specificityrequires that a cause leads to a single effect,not multiple efTects. This argument has oftenbeen advanced to refute causal interpreta-tions of exposures that appear to relate tomyriad effects—for example, by those seekingto exonerate smoking as a cause of lung can-cer. Unfortunately, the criterion is invalid as ageneral mle. Causes of a given effect cannotbe expected to lack all other effects. In fact,everyday experience teaches us repeatedlythat single events or conditions may havemany effects. Smoking is an excellent exam-ple; it leads to many effects in the smoker,in part because smoking involves exposureto a wide range of agents.'^"' The existenceof one effect of an exposure does not detractfrom the possibility that another effect exisls.

On the other hand, Weiss"' convincingly ar-gued that specificity can be used to distinguishsome causal hypotheses from noncausal hy-potheses, when the causal hypothesis predictsa relation with one outcome but no relationwith another outcome. 'ITius, specificity cancome into play when il can be logically de-duced Irom the causal hypothesis in question.

4. Temporality. Temporality refers to thenecessity for a cause to precede an effect intime. 'ITiis criterion is inarguable, insofar asany claimed observation of causation must in-volve the putative cause C preceding the pu-tative effect D. It does not however, followthat a reverse time order is evidence againsttlie hypothesis that C can cause D. Rather,observations in which C followed D merelyshow that C could not have caused D in theseinstances; they provide no evidence for oragainst the hypothesis that C can cause D inthose instances in which it precedes D.

5. Biological gradient. Biological gradientrefers to the presence of a unidirectionaldose-response curve. We often expect such a

S148 Public Health Matters t Peer Reviewed I Rothman and Greenland American Journai of Public Health | Supplement 1, 2005, Voi 95, No. SI

Page 6: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAnERS

monotonic relation to exist For example,more smoking means more carcinogen expo-sure and more tissue damage, hence more op-portunity for carcinogenesis. Some eausal as-sociations, however, show a single jump(threshold) rather than a monotonic trend; anexample is the association between DES andadenocarcinoma of the vagina. A possihie ex-planation is that the doses of DKS that wereadministered were alt sufficiently great to pro-duce tlie maximum elTect from DP S. Underthis hypothesis, for all those exposed to DES,the development of disease would dependentirely on (ither eomponent causes.

Alcohol consumption ajid mortality is an-other example. Death rates are higher amongnondrinkers than among moderate drinkers,hut ascend to the highest levels for heavydrinkers. There is considerable debate aboutwhich paits of tlie J-shaped dose-responsecurve are causally related to alcohol con-sumption and which parts are noncausal ar-tifacts stemming from confounding or otherbiases. Some studies appear to find only anincreasing relation between alcohol consump-tion eind mortality, possibly because the cate-gories of alcohol consumption are too broadto distinguish dilTerent rates aanong moderatedrinkers and nondrinkers.

Associations that do show a monotonietrend in disease frequency with increasing lev-els of exposure are not necessarily eausal; eon-founding can result in a monotonic relationbetween a noncausal ilsk factoi' and disease ifthe confounding factor itself demonstrates abiological gradient in its relation witli disease.The noneaiisal relation between hirth rankeind Down syndrome mentioned in pail 1above shows a biological gradient that merelyreflects the progi'essive relation between ma-ternal age and Down syndrome occurrence.

These examples imply that the existence ofa monotonic association is neither necessarynor sufficient for a eausal relation. A nonmo-notonic relation only refutes those causal hy-potlieses specific enough to predict a monoto-nic dose-response curve.

6, Plausibility. Plausibility relers to the bio-logical plausibility ol the hj'pothesis, an impor-tant concern but one that is far fh)m objectiveor absolute. Sartwell, emphasizing this pointcited tlie 1861 comments of Cheever on theetiology of typhus before its mode of transmis-

sion (via body lice) was known: "It could be nomoi-e ridiculoiLs for tlie stranger who passedtlie night in the steerage of an emigi-ant ship toascrihe the typhus, which he there contracted,to tlie vermin witli whicb hfxlies of the sickmight be mfested. An adequate cause, one rea-sonable ui itself, must con-ect the coincidencesof simple experience."'' What was to Qieeveran implaiLsible explanation turned out to bethe correct explanation, since it was indeed thevermin that caused tlie ty(ihus infection. Suchis tiie problem with plausibility: it is too oftennot based on logic or data, but only on priorbeliefs. This Ls not to say that biological knowl-edge should be discounted when evaluating anew hy[)othesLS, but only to point out the difll-culty in applying that knowledge.

'ITie Bayesian approach to inference at-tempts to deal with this problem by requiringthat one quantify, on a probability (0 to 1)scale, the certainty thai one has in prior be-liefs, as well as in new hypotheses. This quan-tification displays the dogmatism or open-mindedne.ss ofthe analyst in a public fashion,with ceitaitity values near 1 or 0 betraying astrong commitment of the analyst for oragainst a hypothesis, It can also provide ameans of testing those quantified beliefsagainst new evidenee,'" Nevertheless, theBayesian approach cannot transfonn plausi-bility into an objective causal criterion.

7 Coherence. Taken fiiDm the sui^eon gen-eral's report on smoking and healtli," the temicoherence implies that a cause-and-efTect inter-pretation for an assoeiation does not conflictwith whal is known of tlie natural history andbiology of the disease. The examples Hill gavefor coherence, such as the histopathologic ef-fect of smoking on bronchial epithelium (in ref-erence to tlie association between smoking andlung cancer) or the difference in lung eaneerincidence hy gender, could reasonably beconsidered exajnples of plausibility as wellas coherence; the distinction appears to be afine one. Hill emphasized that the absence ofcoherent infojinadon, as distinguished, appai -ently, from the presence of conflicting infonna-tion, .should not be taken as evidence againstan association hemg considered causal. On tlieother hand, presence of conllicting informationmay indeed refute a hypothesis, but one tnustalways remember that the conllicting infbnna-tion may be mistaken or misinterpreted,'^

8, Experimental evidence. It is not elear whatI lill meant by expeiimcntal evidence. It mig^thave refen'ed to evidence from laboratory' ex-periments on animals, or to evidence fromhuman experiments. Evidence from hiunaii ex-periments, however, is seldom available formost epidemioiogic reseairh questions, and an-imal evidence relates to different species andusually to levels of exposure very difTerentfrom those humans experience. From Hill's ex-amples, il seems that what he had in mind forexperimental evidenee was the result of re-moval of .some harmful exposure in an inter-venti(jn or prevention program, leather than theresults of laboratoiy experiments. 'ITie lack ofavaiiabilit}' of such evidenee would at least be

a pragmatic difficulty in making this a criterionfor inference. Logically, however, expeiimentalevidence is not a criterion but a test of thecausal hypothesis, a test that is simply unavail-able in most circumstances. Altliough exjieri-mental tests ean be much stronger than othertests, they are often not as decisive as thought,because of difficulties in inteipretation. For ex-ample, one can attempt to test the hypothesisthat malaria is caused by swamp gas by drain-ing swamps in some areas and not in others tosee if the malaria rates among residents are af-fected by the draining. As predicted by the hy-))othesis, the rates will drop in the areas wherethe swamps are drained. As Popper empha-sized, however, there are always many alterna-tive explanations for the outcome of every ex-periment In this example, one alternative,which happens to be correct, is that mosqui-toes are responsible for malaria transmission.

9. Antilogy. Whatever insight might be de-rived from analogy is handicapped by the in-ventive imagination of scientists who can findanalogies everywhere. At best, ajialogy pro-vides a souree of more elaborate hy{iothesesabout the associations under study; absence ofsuch analogies only reflects lack of imaginationor experience, not falsify of the hypothesis.

Is There Any Use for Causal Criteria?

As is evident the standards of epidemio-iogic evidence offered by Hill are saddled withreservations and exceptions. I lill himself wasambivalent about the utility of these "view-points" (he did not use the word criteria in thepaper). On tlie one hand, he asked, "In whatcircumstances can we pass from this observed

Supplement 1, 2005, Vol 95, No. S I I American Journal of Public Health Rothman and Greenland I Peer Reviewed I Public Health Matters S149

Page 7: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause

PUBLIC HEALTH MAnERS

association to a verdict of causation?" Yet de-spite speaking of verdicts on causation, he dis-agreed that any "hard-and-fast rules of evi-dence" existed by which to judge causadon:This conclusion accords with the views ofHume, Popper, and others that causal infer-ences cannot attain the certainty of logica] de-ductions. Although some sdentists continue topromulgate causal criteria as aids to inference,others argue that it is actually detrimental tocloud the inferential process by consideringchecklist criteria.' An intermediate, reftitation-ist reproach seeks to transform the criteriainto deductive tests of causal hypotheses, "" 'Such an approach avoids the temptation touse causal criteria simply to buttress pet theo-lies at hand, and instead allows epidemiolo-gists to focus on evaluating competing causaltheories using crucial observations.

CRITERIA TO JUDGE WHETHERSCIENTIFIC EVIDENCE IS VALID

Just as causal criteria cannot be used toestablish the validity of an inference, thereare no criteria that can be used to establishthe validity of data or evidence. There aremethods by which validity can be assessed,but this assessment would not resemble any-thing like the application of rigid criteria.

Some of the difficulty can be understood bytaking the view that sdentific evidence canusually be viewed as a form of measurement.If an epidemiologic study sets out to assess therelation between exposure to tobacco smokeand lung cancer risk, the results can andshould be framed as a measure ol" causal el-fect. such as the ratio of the risk of lung canceramong smokers to the risk among nonsmok-ers. Like any measurement, the measurementof a causal efFect is subject to measurementerroi". For a scientific study, measurement errorencompasses more than the eiTor that wemight have in mind when we attempt to mea-sure the length of a piece of carpet. In additionto statistical error, the measurement error sub-sumes problems that relate to study design, in-cluding subject selection and retention, infor-mation acquisition, and uncontrolledconfounding and other sources of bias. Thereare many individual sources of possible error.It is not suffident to chai acterize a study ashaving or not having any of these sources of

error, since nearly every study will have nearlyevery type of error The real issue is to quan-tify the errors. As there is no precise cutoffwith respect to how much error can be toler-ated before a study must be considered in-valid, there is no altemative to the quantifica-tion of study errors to the extent possible.

Although there are no absolute criteria forassessing the validity of scientific evidence, itis stOl possible to assess the validity of astudy. What is required is much more thanthe application of a list of criteria. Instead,one must apply thorough criticism, with thegoal of obtaining a quantified evaluation ofthe total error that afflicts the study. This typeof assessment is not one that can be doneeasily by someone who lacks the skiils andtraining of a scientist familial' with the subjectmatter and the sdentific methods that wereemployed. Neither can it be applied readilyby judges in court, nor by sdentists who ei-ther lack the requisite knowledge or who donot take the time to penetrate the work. •

About the AuthorsKenneth J. Rothman is with the Boston University MedicalCenter. Boston. Mass. Sunder Greenland is with the Uni-versity of California. Los Angeles.

Requests for repnnLs should be sent to Kenneth]. Rothman.DrPH, Boston University School of Public Heallh. Depart-ment of Hpideminlogy, 7/5 Albany St., Boston. MA02118 (e-maii: [email protected]).

This article was accepted November 18. 2004.

ContributorsKt'iineth J. Rulhman and Sander Greenland participatedequally in the planning and writing of this articie.

AcknowledgmentsThis woi'k LS laj-gely abridged from diajjter 2 of ModemEpidemiology, 2nd ed., hy K.j. Rothman and S. Gceun-land. Lippmcott, Williams & Wilkins, 1998. and chap-ter 2 of Epidemiology-An Introduction by K.J. Rothman,Oxford Univei-sity VKSS, 2002.

References1. Rothman KJ. Causes. Am J Epidemiol. 1976:104:587-592.

2. Honorpe A. Causation in the Law. In: Zalta EN,ed. Stanford Encyclopedia of Philosophy. Winter 2001ed. Stanford. Caiif: Stanford University; 2001. Avail-able at: h(tp://plato,stanford,edu/archive.s/win2l)01/entries/causation- law.

3. Rothman Kj, Greenland S. Modem EpidemiologyPhiladelphia, Pa: Lippincott: 1998: chap 18.

4. Higginson J. Proportion of cancer due to occupa-tion, PJCT Med 1980:9:180-188.

5. Ivphron li. Apacalyptics: Cancer and the Big Lie-How Environmental Politics Controls What We Knowahout Cancer. New York, NY: Simon and Schuster:1984.

G. Higginson J. Population studies in cancer. ActaUnio Intemat Contra Cancrum 1960:16:1667-1670.

7. MacMahon B. Gene-environment interaction inhuman disease. J Psychiatr Res ]968;6:393-402.

8 i logbcn L. !\'ature and Nurture London, Kngland:Williams and Norgate: 1933.

9, Horwitz Rl, Feinstein AR, Altemative analyticmethods for case-control studies of estrogens andeiidometriai cancer, N EnglJ Med. 1978;299:1089-1094.

10, Hill AB. "ITie environment and disease: associationor causation? PriK R Soc Med. 1965;58;295-300,

11, Smoking and Health: Report of the AdvisoryCommittee to the Surgeon General of the PuhlicHealth Service. Washington, DC: US Department ofHealth, Education, and Welfare: 1964, Public HealthServiee I^iblication No. 1103.

12, Mill JS. .'1 System of Logic, Ratiocinative and Induc-tive. 5th ed, London, England: Parker, Son and Bowin,1862, Cited in Clark DW. MacMahon B, eds. Preventiveand Community Medicine. 2nd ed, Boston, Mass: Little.Brown; 1981:chap 2.

13, Hume D. A Treatise of Human Nature. (Originallypublished in 1739.) Oxford University Press edition,with an Analytical Index by L, A. Selby-Bigge. pub-lished 1888, Second edition with text revised andnotes by P,H. Nidditch, 1978.

14, Rothman KJ. Poole C, A strengthening programmefor weak associations. IntJ Epidemiol 1988;17{Suppl):955-959,

15, Smidi GD, Specificity as a criterion for eausation:a premature hurial? hit j Epidemiol. 2002 ;31:710—713,

16, Weiss NS:, Can the spedfieity ol' an association berehahilitated as a basis for supporting a causal hypoth-esis? Epidemiology. 2002:13:6-8.

17, Sartwell P, On the methodology of investigationsof etiologic factors in chronic diseases—furiher com-ments, / CArwn Dis, 1960;n;61-63.

18, Popper. KR, The Logic of Scientific Discovery. NewYork. MY: Harper & Row; 1959 (first published in Ger-man in 1934),

19, Lanes SF, Poolc C. 'Tnith in packaging?" Theunwrapping of epidemiologic research, J Occup Med.19B4;2t);571-574,

20, Maclure M, Popperian refutation in epidemiology,AmJ Epidemiol. 19B5;121:343-350,

21, Weed D. On the logic of causal inference, AmJEpidemiol. ]986;123:965-979.

S150 I Public Health Matters , Peer Reviewed I Rothman and Greenland American Journal of Public Heaith i Suppiement 1. 2005. Voi 95. No. SI

Page 8: Causation and Causal Inference in Epidemiology · Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat Concepts of cause