Causal Inference: prediction, explanation, and intervention
Lecture 11: Bias, evidence, evaluation
Samantha Kleinberg
Administrativia
• Next week: optional final exam
• Two weeks: last class, papers due, presentations
– 5 minute talk + 3 min Q&A
Cholesterol
• Previously:
– Significant association between high HDL and reduced risk of MI
– Increased HDL = lower risk
• Result: companies developing drugs to raise HDL
• Now: study used natural genetic variations to test whether inherited predisposition to high/low HDL affects MI risk
– No link between increased HDL and lower risk, or between reduced HDL and higher risk
Voight, B. F. et al. (2012). Plasma HDL cholesterol and risk of myocardial infarction: A mendelian randomisation study. The Lancet.
SOME BACKGROUND…
Why worry about statistical significance
• Calculate ADCS, conditional probabilities, etc.
• Which values are causal or noncausal?
– Is ADCS 0.1 automatically noncausal and 0.9 definitely causal?
• Is effect in test group really bigger than in control?
– How to compare 50% chance of death to 47%?
Choosing a threshold
• Get some numerical result
• Need to pick a cutoff point, but how?
• Flip coin 10 times
• Observe 9H 1T
• Initial hypothesis: coin is fair. How unlikely is it that we would observe 9H or something at least as extreme (10H, 10T, 9T)?
A p-value is the probability of getting a test statistic at least as extreme as what is observed, given that the null hypothesis is true.
Note: comparing null to alternative
Reject null for P < α
What alpha?
• Fisher (1925) suggested using 0.05 as a threshold
– “We shall not often be astray if we draw a conventional line at .05”
• BUT a practically significant result can have P > 0.05, and a practically insignificant one P < 0.05
Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120.
• Flip coin 10 times
• Observe 9H 1T
• Initial hypothesis: coin is fair. How unlikely is it that we would observe 9H or something at least as extreme (10H, 10T, 9T)?
$P(9H1T) + P(10H) + P(10T) + P(9T1H)$
$P(10H) = P(10T) = (1/2)^{10} \approx 0.001$
$P(9H1T) = P(9T1H) = (1/2)^{10} \times 10 \approx 0.01$
$0.01 + 0.001 + 0.001 + 0.01 = 0.022$
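The coin-flip calculation can be checked numerically. The sketch below assumes a fair coin and counts every outcome at least as far from the expected 5 heads as the observed one; the function name `two_sided_binomial_p` is illustrative only. The exact sum is 22/1024, slightly below the slide's 0.022, which adds terms that were rounded first.

```python
from math import comb


def two_sided_binomial_p(n: int, k: int) -> float:
    """P-value for k heads in n flips of a fair coin.

    Sums the probability of every head count at least as far
    from n/2 as the observed count k.
    """
    distance = abs(k - n / 2)
    extreme = [h for h in range(n + 1) if abs(h - n / 2) >= distance]
    # Each sequence of n flips has probability (1/2)^n under the null.
    return sum(comb(n, h) for h in extreme) / 2 ** n


p = two_sided_binomial_p(10, 9)
print(round(p, 4))  # 0.0215
```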
Multiple testing
• What happens if we flip 100 fair coins 10 times each?
• Should we expect to see at least one instance of 9H 1T?
x = 9H1T, y = events at least as extreme as x
We calculated P(y) = 0.022
Now, we want P of y being true at least once in N tries
$= 1 - P(\neg y)^N$, with $P(\neg y) = 1 - 0.022$
With N = 5: $1 - P(\neg y)^{5} = 0.11$
N = 50? $1 - P(\neg y)^{50} = 0.67$
N = 100? $1 - P(\neg y)^{100} = 0.89$
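A minimal sketch of the same arithmetic, assuming the N tries are independent and each has probability 0.022 of producing an extreme result. The helper name `prob_at_least_once` is an illustration, not part of the lecture.

```python
def prob_at_least_once(p_single: float, n_tries: int) -> float:
    """Probability that an event with probability p_single occurs
    at least once in n_tries independent tries."""
    return 1.0 - (1.0 - p_single) ** n_tries


for n in (5, 50, 100):
    # Prints 0.11, 0.67 and 0.89, matching the values above.
    print(n, round(prob_at_least_once(0.022, n), 2))
```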
Experiment vs. comparison significance level
• Before, rejected null for P < α = 0.05
• αc = significance level for one comparison
• αe = significance level for whole experiment (i.e. set of tests)
• Does αc = 0.05 mean αe = 0.05?
General case
• If we use α = 0.05, probability of at least one false positive in 100 tests is:
$1 - 0.95^{100} = 0.994$
• Why? N tests with αc = 0.05
$\alpha_e = 1 - (1 - \alpha_c)^N$ if tests independent
$\alpha_e \le N \times \alpha_c$ if dependent
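Both expressions follow from basic probability rules. A short derivation, assuming each of the N tests has false positive probability αc when its null is true: the independent case uses the complement rule, and the bound for dependent tests is the union bound.

```latex
\begin{align*}
\alpha_e &= P(\text{at least one false positive in } N \text{ tests}) \\
         &= 1 - P(\text{no false positives}) \\
         &= 1 - (1 - \alpha_c)^N && \text{(independent tests)} \\[4pt]
\alpha_e &= P\Big(\bigcup_{i=1}^{N} \{\text{test } i \text{ gives a false positive}\}\Big)
          \le \sum_{i=1}^{N} \alpha_c = N \alpha_c && \text{(any dependence)}
\end{align*}
```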
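Types of errors
Some terminology
(m = number of tests, R = number of rejected nulls, V = number of false positives, T = number of false negatives)
• FDR (false discovery rate): V/R
• FNR (false negative rate): T/(m − R)
• FWER (familywise error rate): P(V ≥ 1)
• PCER (per-comparison error rate): V/m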
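To make the ratios concrete, here is a small sketch that computes them for one experiment. It assumes m is the number of tests, R the number of rejected nulls, V the number of false positives and T the number of false negatives. The counts in the example are made up for illustration. Strictly, FDR and PCER are expectations over repeated experiments, so these are the realized proportions for a single run.

```python
def error_proportions(m: int, R: int, V: int, T: int) -> dict:
    """Realized error proportions for a single set of m tests.

    R = rejected nulls, V = false positives, T = false negatives.
    Ratios with an empty denominator are reported as 0.0.
    """
    return {
        "FDR": V / R if R > 0 else 0.0,        # false discoveries among rejections
        "FNR": T / (m - R) if m > R else 0.0,  # misses among non-rejections
        "PCER": V / m,                         # false positives per comparison
    }


# Hypothetical counts: 100 tests, 10 rejections, 2 of them false, 5 misses.
rates = error_proportions(m=100, R=10, V=2, T=5)
print(rates)
```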
Bonferroni correction
• Controls FWER, P(F+ ≥ 1), but may lead to many FNs
• Recall that we can estimate $\alpha_e \le N \times \alpha_c$
• Thus can set $\alpha_e = \alpha_c \times N$, i.e.
$\alpha_c = \alpha_e / N$
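A quick numerical check of the correction, as a sketch. It assumes independent tests when computing the resulting experiment-wide rate, and both function names are illustrative. With a target αe = 0.05 and N = 100, the per-comparison level is 0.0005 and the experiment-wide rate stays just under 0.05.

```python
def bonferroni_alpha(alpha_e: float, n_tests: int) -> float:
    """Per-comparison level that keeps the experiment-wide level at alpha_e."""
    return alpha_e / n_tests


def fwer_independent(alpha_c: float, n_tests: int) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha_c) ** n_tests


alpha_c = bonferroni_alpha(0.05, 100)
print(alpha_c)                         # per-comparison level
print(fwer_independent(alpha_c, 100))  # experiment-wide rate, below 0.05
print(fwer_independent(0.05, 100))     # uncorrected, about 0.994
```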
FDR control
• Compare
– 10 tests, 2 false discoveries
– 100 tests, 2 false discoveries
• 20% FDR vs 2% FDR
• Basic idea: focus on proportion of false discoveries
One approach: Benjamini-Hochberg
• Order m p-values so $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$
• Find k, the largest i such that
$p_{(i)} \le \frac{i}{m}\alpha$
• Reject all $H_{(i)}$, i = 1, 2, …, k
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289-300.
Example
Say we have 15 comparisons with the following p-values:
0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344, 0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000
Bonferroni: 0.05/15 = 0.0033, so only the three smallest p-values are rejected
BH: $p_{(4)} = 0.0095 \le \frac{4}{15} \times 0.05 = 0.013$, and no larger i satisfies the bound, so the four smallest p-values are rejected
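The two procedures can be compared directly on these 15 p-values. The sketch below is a plain-Python illustration with function names chosen for this example, using α = 0.05 throughout.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return the p-values rejected at per-comparison level alpha / m."""
    cutoff = alpha / len(p_values)
    return [p for p in p_values if p <= cutoff]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Return the p-values rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    ordered = sorted(p_values)
    k = 0
    for i, p in enumerate(ordered, start=1):
        # Keep the largest rank i whose p-value is under (i / m) * alpha.
        if p <= i / m * alpha:
            k = i
    return ordered[:k]


p_values = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
            0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]

print(len(bonferroni_reject(p_values)))          # 3
print(len(benjamini_hochberg_reject(p_values)))  # 4
```

Note that the loop keeps the largest qualifying rank rather than stopping at the first failure, which is what the step-up rule requires.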
local fdr
• A result on the edge of the threshold will seem less likely to be a FP than it should (the more significant results reduce the FDR)
• For a set of tests with FDR = 0.05, the particular result at the threshold has a higher FDR
• Main idea: instead of using tail of distribution, assess individual results
– What’s the probability of the null hypothesis conditioned on the test statistic?
– Assume large number of (possibly non-independent) tests
Efron, B., Tibshirani, R., Storey, J. D., & Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96(456), 1151-1160.
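local fdr (continued)
• Testing how significantly individual results differ from expectations
• local false discovery rate: fdr(z) = P(null | z)
$p_0 = P(\text{null})$, $p_1 = P(\text{non-null})$, $p_0 \gg p_1$
$f_0(z)$ and $f_1(z)$ densities
$f(z) = p_0 f_0(z) + p_1 f_1(z)$
$\mathrm{fdr}(z) = P(i = \text{null} \mid z_i = z) = p_0 f_0(z) / f(z) \approx f_0(z) / f(z)$
Reject null for $f_0(z) / f(z) \le \alpha$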
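To see how fdr(z) behaves, the sketch below evaluates p0 f0(z)/f(z) for a toy mixture. Everything specific here is an assumption made for illustration: a standard normal null density, a normal alternative centered at 3, and p0 = 0.95. In practice f(z) is estimated from the observed statistics rather than known in advance.

```python
from math import exp, pi, sqrt


def normal_pdf(z: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Density of a normal distribution at z."""
    return exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def local_fdr(z: float, p0: float = 0.95, alt_mean: float = 3.0) -> float:
    """fdr(z) = p0 * f0(z) / f(z) for an assumed two-component mixture."""
    f0 = normal_pdf(z)                 # assumed null density, N(0, 1)
    f1 = normal_pdf(z, mean=alt_mean)  # assumed non-null density, N(alt_mean, 1)
    f = p0 * f0 + (1.0 - p0) * f1      # mixture density of all statistics
    return p0 * f0 / f


for z in (0.0, 2.0, 3.0, 5.0):
    print(z, round(local_fdr(z), 4))
```

Under these assumptions, statistics near 0 are almost certainly null, while fdr(z) drops toward 0 as z moves into the region where the alternative dominates.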
What’s the null distribution?
• Normal
• Uniform
• Can permute data to generate
• Empirical
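Generating a null distribution by permutation can be sketched as follows. This is one common variant, assumed here for illustration: shuffle the group labels, recompute the difference in means, and count how often the shuffled difference is at least as large as the observed one. The function name, the number of permutations, and the fixed seed are all choices made for this example.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling under the null
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (count + 1) / (n_permutations + 1)


treated = [10, 11, 12, 13, 14, 15, 16, 17]
control = [0, 1, 2, 3, 4, 5, 6, 7]
print(permutation_p_value(treated, control))
```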
Why not adjust?
• Tradeoff between false discoveries and false negatives
• Arguments against adjustment (e.g. Rothman, 1990) basically center on:
– Increased FNs are undesirable
– Results require interpretation/follow-up
First that was the Bonferroni correction, which makes me update my belief about the results of an experiment based on how many other experiments I happen to conduct with it (and which of course implicitly assigns a low prior probability). One researcher even told me once that he has students first conduct fewer experiments so a finding has a better chance of being significant. I just walked away scratching my head. – UAI email list
Bennett, C. M., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem atlantic salmon: An argument for multiple comparisons correction. NeuroImage, 47(1), 125.
Multiple comparisons
http://xkcd.com/882/
This analysis revealed that only in ~20-25% of the projects were the relevant published data completely in line with our in-house findings
Fifty-three papers were deemed ‘landmark’ studies… scientific findings were confirmed in only 6 (11%) cases.
“There is now enough evidence to say what many have long thought: that any claim coming from an observational study is most likely to be wrong – wrong in the sense that it will not replicate if tested rigorously.”
Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120.
• Separate data cleaning and analysis
• Split sample into training/test sets
• Analysis plan based only on training set
– Fixed prior to analysis
• Make data public
• Discuss
• Took 12 papers that analyzed observational data, where claims were also tested in RCTs
• No results were replicated in the RCT
– Observation = 52 positive claims
– RCT = 5 negative claims, 47 no significance
• Why?
Bias
• Recall selection criteria for RCT, methods for randomization
• Reason for medication vs its effects
Multiple modeling
Young, S. S., & Karr, A. (2011). Deming, data and observational studies. Significance, 8(3), 116-120.
Hill’s viewpoints
Nine features to consider when evaluating causality in cases where experimentation is not possible
Mix of: showing cause makes a difference, and that there’s a potential mechanism
Hill, A. B. (1965). The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine, 58, 295-300.
(1) Strength
• How strong is the association?
• Intuitively, if presence of adverse event in people taking a drug is nearly the same as that in the general population, this is less convincing than if there is a significant increase
(1) Strength continued
• Weak causes
– Secondhand smoke and lung cancer
– Unidentified subgroups
– Rare events
• Strong non-causes
– Down syndrome and birth order
(2) Consistency
• Are results replicable?
• Is relationship observed by different groups, using different methods?
• Ex: running and side stitch
(2) Consistency continued
• Inconsistent causes
– Unidentified features necessary for cause to be effective
– Mosquito bites and malaria
– Inconsistency in cause ≠ inconsistent study results
• Consistent non-causes
– Common study flaw
(3) Specificity
• Does cause lead to one effect or a group of effects?
• Specificity also in magnitude
• Examples
– Smoking causing illness vs. smoking causing lung cancer
– One SNP causing many phenotypes
(3) Specificity continued
• Not necessarily 1 to 1 relationship
– Factors can increase risk of multiple diseases
– Drugs have many side effects
(4) Temporality
• Does the cause precede the effect?
• Is the delay consistent with how cause might work?
• Ex: Occupation and illness
(5) Biological gradient
• Does level of effect or risk of it increase with increase in level of cause?
• Dose response curve
• As more people in an area are exposed to risk factor, does disease incidence increase?
• Study involving 121,342 people
• Found each daily increase of 3 oz of meat associated with a 12% increased risk of CV death and a 10% increase in cancer death
J-shaped curve
Rehm, J., Gmel, G., Sempos, C. T., & Trevisan, M. (2003). Alcohol-related morbidity and mortality. Mouth, 140(208), C00-C97.
(6) Plausibility
• Is there a potential mechanism that could connect the cause and effect given what we currently know?
• Depends on current knowledge
(7) Coherence
• Does the relationship conflict with known facts?
• Note that current knowledge may be wrong
Plausibility vs coherence
• Plausibility: we can conceive of a way the relationship could work given what we know
• Coherence: the relationship does not conflict with what we know
(8) Experiment
• Are there experimental results supporting the relationship?
• If we intervene by increasing/introducing the cause, does the effect result?
• Ex: does lung cancer rate change as smoking rate does?
(8) Experiment continued
• Human vs. animal (+ representativeness)
• Randomization, blinding
• Sample size, population
(9) Analogy
• Example: Know that HPV causes cervical cancer, so may be more likely to accept virus as cause of another type of cancer
• Example 2: If we see effect of listing calorie content on orders at restaurants, more plausible that listing salt content can change behavior too
Hill notes
• NOT a checklist
• NOT features of relationship itself
• Methods for evaluating associations, recognizing causes.
• Still need to think about quality of evidence of each type
– Recall last week: variations in RCTs
Hill + causal inference
1. Strength → Probability (e.g. ADCS, Granger)
2. Consistency → Probability, regularity
3. Specificity → Probability
4. Temporality → Temporal priority
5. Biological gradient → Mill’s methods
6. Plausibility → Mechanisms
7. Coherence → Mechanisms
8. Experiment → Interventions, RCTs
9. Analogy → Mechanisms
Hill’s Viewpoints applied to smoking
1. Strength – Rate of lung cancer (and death from LC) in smokers much higher than in non-smokers, 9-10x higher.
2. Consistency – Same link found by many studies (retrospective and prospective), in many populations (men & women, geographic)
3. Specificity – Death rate generally inc. among smokers, but death from LC inc. substantially more
4. Temporality – People usually smoke before developing LC/dying
5. Biological gradient – Death rate/LC incidence increases linearly with increased smoking
6. Plausibility – Theory of tissue damage eventually leading to cancer
7. Coherence – Fit with what was known at time
8. Experiment – Tar on the ears of lab rats produced cancer, showing carcinogens are present
9. Analogy – Animal → human
For next week
Final exam!!