Causal Inference: prediction, explanation, and intervention · 2016-11-26

Lecture 11: Bias, evidence, evaluation
Samantha Kleinberg


Administrivia

•  Next week: optional final exam
•  Two weeks: last class, papers due, presentations
   – 5 minute talk + 3 min Q&A

Cholesterol

•  Previously:
   – Significant association between high HDL and reduced risk of MI
   – Increased HDL = lower risk
•  Result: companies developing drugs to raise HDL
•  Now: Study used natural genetic variations to test whether inherited predisposition to high/low HDL affects MI risk
   – No link between increased HDL and lower risk, or between reduced HDL and higher risk

Voight, B. F. et al. (2012). Plasma HDL cholesterol and risk of myocardial infarction: A mendelian randomisation study. The Lancet.

SOME BACKGROUND…

Why worry about statistical significance

•  Calculate ADCS, conditional probabilities, etc.
•  Which values are causal or noncausal?
   – Is ADCS 0.1 automatically noncausal and 0.9 definitely causal?
•  Is effect in test group really bigger than in control?
   – How to compare 50% chance of death to 47%?

Choosing a threshold

•  Get some numerical result
•  Need to pick a cutoff point, but how?

•  Flip coin 10 times
•  Observe 9H 1T
•  Initial hypothesis: coin is fair. How unlikely is it that we would observe 9H or something at least as extreme (10H, 10T, 9T)?

A p-value is the probability of getting a test statistic at least as extreme as what is observed, given that the null hypothesis is true.
Note: comparing null to alternative
Reject null for P < α

What alpha?

•  Fisher (1925) suggested using 0.05 as a threshold
   – “We shall not often be astray if we draw a conventional line at .05”
•  BUT a significant result can have P > 0.05, and an insignificant one < 0.05

Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120

•  Back to the coin: 10 flips, 9H 1T observed, initial hypothesis that the coin is fair. The p-value adds up the outcomes at least as extreme as 9H:

$$P(9H1T) + P(10H) + P(10T) + P(9T1H)$$

$$P(10H) = P(10T) = (1/2)^{10} \approx 0.001$$

$$P(9H1T) = P(9T1H) = (1/2)^{10} \times 10 \approx 0.01$$

$$0.01 + 0.001 + 0.001 + 0.01 = 0.022$$
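The coin calculation can be checked with a short script. This is a minimal sketch using only the standard library; the function name `two_sided_binomial_p` is made up for illustration. It sums the probability of every outcome that is no more likely than the observed one, which for a fair coin gives the same four terms as on the slide. Without rounding each term first, the total is 22/1024, about 0.0215 rather than 0.022.

```python
from math import comb

def two_sided_binomial_p(n: int, k: int, p: float = 0.5) -> float:
    """P-value for k heads in n flips under the null that P(heads) = p.

    Sums the probabilities of all outcomes that are no more likely
    than the observed outcome (hypothetical helper, for illustration).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to floating point error
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))

p_value = two_sided_binomial_p(10, 9)
print(round(p_value, 4))  # 0.0215
```

With α = 0.05 the null would be rejected for this single coin, since the p-value is below the cutoff.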

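Multiple testing

•  What happens if we flip 100 fair coins 10 times each?
•  Should we expect to see at least one instance of 9H1T?

x = 9H1T, y = events at least as extreme as x
We calculated P(y) = 0.022
Now, we want P of y being true at least once in N tries

$$= 1 - P(\text{not } y)^N, \quad P(\text{not } y) = 1 - 0.022$$

With N = 5: $1 - P(\text{not } y)^{5} = 0.11$

N = 50? $1 - P(\text{not } y)^{50} = 0.67$

N = 100? $1 - P(\text{not } y)^{100} = 0.89$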
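The three values above follow directly from the formula. A small sketch (the function name is an assumption for illustration) reproduces them from P(y) = 0.022:

```python
def prob_at_least_once(p_single: float, n_tries: int) -> float:
    """Probability that an event with probability p_single occurs
    at least once in n_tries independent tries."""
    return 1 - (1 - p_single) ** n_tries

for n in (5, 50, 100):
    print(n, round(prob_at_least_once(0.022, n), 2))
# 5 0.11
# 50 0.67
# 100 0.89
```

So with 100 fair coins, seeing at least one result as extreme as 9H1T is the expected outcome, not a surprise.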

Experiment vs. comparison significance level

•  Before, rejected null for P < α, with α = 0.05
•  αc = significance level for one comparison
•  αe = significance level for whole experiment (i.e. set of tests)
•  Does αc = 0.05 mean αe = 0.05?

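General case

•  If we use α = 0.05, probability of at least one false positive in 100 tests is:

$$1 - 0.95^{100} = 0.994$$

•  Why? N tests with αc = 0.05:

$$\alpha_e = 1 - (1 - \alpha_c)^N \quad \text{if tests independent}$$

$$\alpha_e \le N \times \alpha_c \quad \text{if dependent}$$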
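The independent-tests formula can also be checked by simulation. The sketch below assumes every null hypothesis is true and that each test's p-value is uniform on [0, 1], which holds for continuous test statistics. The function name, number of simulated experiments, and seed are arbitrary choices for illustration.

```python
import random

def simulate_experiment_alpha(alpha_c, n_tests, n_experiments=5000, seed=0):
    """Fraction of simulated experiments with at least one false positive.

    Each experiment runs n_tests independent tests of true nulls;
    a false positive is a p-value below alpha_c.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Under a true null, the p-value is drawn uniformly from [0, 1]
        if any(rng.random() < alpha_c for _ in range(n_tests)):
            hits += 1
    return hits / n_experiments

analytic = 1 - (1 - 0.05) ** 100
simulated = simulate_experiment_alpha(0.05, 100)
print(f"analytic {analytic:.3f}, simulated {simulated:.3f}")
```

The simulated fraction should land close to the analytic 0.994, up to sampling noise.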

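Types of errors

Some terminology

(V = number of false positives, R = number of rejected nulls, T = number of false negatives, m = number of tests)

•  FDR (false discovery rate): V/R
•  FNR (false negative rate): T/(m − R)
•  FWER (familywise error rate): P(V ≥ 1)
•  PCER (per-comparison error rate): V/m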

Bonferroni correction

•  Controls FWER, P(F+ ≥ 1), but may lead to many FNs
•  Recall that we can estimate $\alpha_e = \alpha_c \times N$
•  Thus can set $\alpha_c = \alpha_e / N$

FDR control

•  Compare
   – 10 tests, 2 false discoveries
   – 100 tests, 2 false discoveries
•  20% FDR vs 2% FDR
•  Basic idea: focus on proportion of false discoveries

One approach: Benjamini-Hochberg

•  Order the m p-values so $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$
•  Find k, the largest i such that $p_{(i)} \le \frac{i}{m}\alpha$
•  Reject all $H_{(i)}$, i = 1, 2, …, k

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289-300

Example

Say we have 15 comparisons with the following p-values:
0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344, 0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000

Bonferroni: 0.05/15 = 0.0033

BH: $p_{(4)} = 0.0095 \le \frac{4}{15} \times 0.05 = 0.013$
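To see how the two corrections differ on these 15 p-values, here is a minimal sketch of both procedures in plain Python. The function names are chosen for illustration, and α = 0.05 is assumed as in the example.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the p-values rejected at the per-comparison cutoff alpha / m."""
    cutoff = alpha / len(p_values)
    return [p for p in p_values if p <= cutoff]

def benjamini_hochberg(p_values, alpha=0.05):
    """Return the p-values rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    ordered = sorted(p_values)
    k = 0
    for i, p in enumerate(ordered, start=1):
        if p <= i / m * alpha:
            k = i  # remember the largest i that satisfies the condition
    return ordered[:k]

p_values = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298,
            0.0344, 0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]

print(len(bonferroni(p_values)))          # 3
print(len(benjamini_hochberg(p_values)))  # 4
```

Bonferroni rejects the three smallest p-values, while BH also rejects the fourth (0.0095), trading a little more risk of false discoveries for fewer false negatives.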

local fdr

•  Result on the edge of threshold will seem less likely than it should to be a FP (the more significant results reduce the FDR)
•  For a test such that FDR = 0.05, that particular one has higher FDR
•  Main idea: instead of using tail of distribution, assess individual results
   – What's the probability of null hypothesis conditioned on test statistic
   – Assume large number of (possibly non-independent) tests

Efron, B., Tibshirani, R., Storey, J. D., & Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96(456), 1151-1160.

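local fdr (continued)

•  Testing how significantly individual results differ from expectations
•  local false discovery rate: fdr(z) = P(null | z)

$$p_0 = P(\text{null}), \quad p_1 = P(\text{non-null}), \quad p_0 \gg p_1$$

$f_0(z)$ and $f_1(z)$ densities

$$f(z) = p_0 f_0(z) + p_1 f_1(z)$$

$$\text{fdr}(z) = P(i = \text{null} \mid z_i = z) = p_0 f_0(z) / f(z) \approx f_0(z) / f(z)$$

Reject null for $f_0(z) / f(z) \le \alpha$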
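A toy calculation may help make fdr(z) concrete. The sketch below assumes a standard normal null density f0 and a non-null density f1 that is normal with mean 3, with p0 = 0.9. These parameters are invented for illustration; in practice f(z) is estimated from the data rather than assumed.

```python
from math import exp, pi, sqrt

def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    return exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def local_fdr(z, p0=0.9, alt_mean=3.0):
    """fdr(z) = p0 * f0(z) / f(z) for a two-component normal mixture.

    p0 and alt_mean are hypothetical values chosen for this sketch.
    """
    f0 = normal_pdf(z)                 # null density
    f1 = normal_pdf(z, mean=alt_mean)  # non-null density
    f = p0 * f0 + (1 - p0) * f1        # mixture density
    return p0 * f0 / f

for z in (0.0, 2.0, 3.0, 4.0):
    print(z, round(local_fdr(z), 3))
```

Under these assumptions, results near z = 0 are almost certainly null, and the local fdr falls as z moves toward the non-null component.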

What's the null distribution?

•  Normal
•  Uniform
•  Can permute data to generate
•  Empirical
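As one illustration of generating a null distribution by permutation, the sketch below shuffles group labels and recomputes a difference in means. The test statistic, the function name, and the data are assumptions made for this example, not taken from the lecture.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values breaks any link between label and value
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero
    return (extreme + 1) / (n_permutations + 1)

# Made-up measurements for two groups
treated = [11, 12, 13, 14, 15]
control = [1, 2, 3, 4, 5]
print(permutation_p_value(treated, control))
```

The shuffled differences form the null distribution, and the p-value is the share of them at least as large as the observed difference.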

Why not adjust?

•  Tradeoff between false discoveries and false negatives
•  Arguments against adjustment (e.g. Rothman, 1990) basically center on:
   – Increased FNs are undesirable
   – Results require interpretation/follow-up

“First that was the Bonferroni correction, which makes me update my belief about the results of an experiment based on how many other experiments I happen to conduct with it (and which of course implicitly assigns a low prior probability). One researcher even told me once that he has students first conduct fewer experiments so a finding has a better chance of being significant. I just walked away scratching my head.” – UAI email list

Bennett, C. M., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: An argument for multiple comparisons correction. NeuroImage, 47(1), 125.

Multiple comparisons

http://xkcd.com/882/

This analysis revealed that only in ~20-25% of the projects were the relevant published data completely in line with our in-house findings

Fifty-three papers were deemed ‘landmark’ studies… scientific findings were confirmed in only 6 (11%) cases.

“There is now enough evidence to say what many have long thought: that any claim coming from an observational study is most likely to be wrong – wrong in the sense that it will not replicate if tested rigorously.”
Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120

•  Separate data cleaning and analysis
•  Split sample into training/test sets
•  Analysis plan based only on training set
   – Fixed prior to analysis
•  Make data public
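The training/test split in the list above might look like the following sketch. The function name, the 50/50 fraction, and the fixed seed are assumptions for illustration. The point is that the held-out set is set aside before any analysis plan is written.

```python
import random

def split_sample(records, test_fraction=0.5, seed=0):
    """Randomly split records into (training, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical record identifiers
training, test = split_sample(range(10))
print(len(training), len(test))  # 5 5
```

The analysis plan would be developed on `training` only and then run once, unchanged, on `test`.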

•  Discuss

•  Took 12 papers that analyzed observational data, where claims were also tested in RCTs
•  No results were replicated in the RCT
   – Observation = 52 positive claims
   – RCT = 5 negative claims, 47 no significance
•  Why?

Bias

•  Recall selection criteria for RCT, methods for randomization
•  Reason for medication vs its effects

Multiple modeling

Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120.

Hill's viewpoints

Nine features to consider when evaluating causality in cases where experimentation is not possible
Mix of: showing cause makes a difference, and that there's a potential mechanism

Hill, A. B. (1965). The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine, 58, 295-300

(1) Strength

•  How strong is the association?
•  Intuitively, if presence of adverse event in people taking a drug is nearly the same as that in the general population, this is less convincing than if there is a significant increase
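One common way to put a number on strength is the relative risk: the event rate in the exposed group divided by the rate in the comparison group. The sketch below uses invented counts to contrast the two cases in the bullet above. Neither the counts nor the function name come from the lecture.

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Ratio of the event rate among the exposed to the rate among the unexposed."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Hypothetical counts, not from any study
nearly_same = relative_risk(52, 1000, 50, 1000)     # drug takers vs general population
clear_increase = relative_risk(150, 1000, 50, 1000)
print(round(nearly_same, 2), round(clear_increase, 2))  # 1.04 3.0
```

A ratio near 1 corresponds to the less convincing case. Whether a given difference is statistically significant still depends on sample size.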

(1) Strength continued

•  Weak causes
   – Secondhand smoke and lung cancer
   – Unidentified subgroups
   – Rare events
•  Strong non-causes
   – Down syndrome and birth order

(2) Consistency

•  Are results replicable?
•  Is relationship observed by different groups, using different methods?
•  Ex: running and side stitch

(2) Consistency continued

•  Inconsistent causes
   – Unidentified features necessary for cause to be effective
   – Mosquito bites and malaria
   – Inconsistency in cause ≠ inconsistent study results
•  Consistent non-causes
   – Common study flaw

(3) Specificity

•  Does cause lead to one effect or a group of effects?
•  Specificity also in magnitude
•  Examples
   – smoking causing illness vs. smoking causing lung cancer
   – One SNP causing many phenotypes

(3) Specificity continued

•  Not necessarily 1 to 1 relationship
   – Factors can increase risk of multiple diseases
   – Drugs have many side effects

(4) Temporality

•  Does the cause precede the effect?
•  Is the delay consistent with how cause might work?
•  Ex: Occupation and illness

(5) Biological gradient

•  Does level of effect or risk of it increase with increase in level of cause?
•  Dose response curve
•  As more people in an area are exposed to risk factor, does disease incidence increase?

•  Study involving 121,342 people
•  Found each daily increase of 3 oz of meat associated with 12% increased risk of CV death, a 10% increase in cancer death

J-shaped curve

Rehm, J., Gmel, G., Sempos, C. T., & Trevisan, M. (2003). Alcohol-related morbidity and mortality. Mouth, 140(208), C00-C97

(6) Plausibility

•  Is there a potential mechanism that could connect the cause and effect given what we currently know?
•  Depends on current knowledge

(7) Coherence

•  Does the relationship conflict with known facts?
•  Note that current knowledge may be wrong

Plausibility vs coherence

•  Plausibility: we can conceive of a way the relationship could work given what we know
•  Coherence: the relationship does not conflict with what we know

(8) Experiment

•  Are there experimental results supporting the relationship?
•  If we intervene by increasing/introducing the cause, does the effect result?
•  Ex: does lung cancer rate change as smoking rate does?

(8) Experiment continued

•  Human vs. animal (+ representativeness)
•  Randomization, blinding
•  Sample size, population

(9) Analogy

•  Example: Know that HPV causes cervical cancer, so may be more likely to accept virus as cause of another type of cancer
•  Example 2: If we see effect of listing calorie content on orders at restaurants, more plausible that listing salt content can change behavior too

Hill notes

•  NOT a checklist
•  NOT features of relationship itself
•  Methods for evaluating associations, recognizing causes.
•  Still need to think about quality of evidence of each type
   – Recall last week: variations in RCTs

Hill + causal inference

1.  Strength: Probability, e.g. ADCS, Granger
2.  Consistency: Probability, regularity
3.  Specificity: Probability
4.  Temporality: Temporal priority
5.  Biological gradient: Mill's methods
6.  Plausibility: Mechanisms
7.  Coherence: Mechanisms
8.  Experiment: Interventions, RCTs
9.  Analogy: Mechanisms

Hill's Viewpoints applied to smoking

1.  Strength: Rate of lung cancer (and death from LC) in smokers much higher than in non-smokers, 9-10x higher.
2.  Consistency: Same link found by many studies (retrospective and prospective), in many populations (men & women, geographic)
3.  Specificity: Death rate generally increased among smokers, but death from LC increased substantially more
4.  Temporality: People usually smoke before developing LC/dying
5.  Biological gradient: Death rate/LC incidence increases linearly with increased smoking
6.  Plausibility: Theory of tissue damage eventually leading to cancer
7.  Coherence: Fit with what was known at time
8.  Experiment: Tar on the ears of lab rats produced cancer, showing carcinogens are present
9.  Analogy: Animal → human

For next week

Final exam!!