Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus...

43
How black-box use of imputation can cause bias Nicole Erler Erasmus Medical Center, Rotterdam Nicole Erler, FGME 2019, Kiel 1

Transcript of Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus...

Page 1: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

How black-box use of imputation can cause bias

Nicole Erler

Erasmus Medical Center, Rotterdam

Nicole Erler, FGME 2019, Kiel 1

Page 2: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values is Easy!

Functions automatically exclude missing values:

## [...]## Residual standard error: 2.305 on 69 degrees of freedom## (25 observations deleted due to missingness)## Multiple R-squared: 0.09255, Adjusted R-squared: 0.02679## F-statistic: 1.407 on 5 and 69 DF, p-value: 0.2325

Imputation is super easy:

library("mice")imp <- mice(mydata)

However ...

Nicole Erler, FGME 2019, Kiel 2

Page 3: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values is Easy!

Functions automatically exclude missing values:

## [...]## Residual standard error: 2.305 on 69 degrees of freedom## (25 observations deleted due to missingness)## Multiple R-squared: 0.09255, Adjusted R-squared: 0.02679## F-statistic: 1.407 on 5 and 69 DF, p-value: 0.2325

Imputation is super easy:

library("mice")imp <- mice(mydata)

However ...

Nicole Erler, FGME 2019, Kiel 2

Page 4: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values Correctly is Not So Easy!

(Imputation) methods makes certain assumptions, e.g.:

missingness is M(C)AR

the incomplete variable has a certain conditional distribution(e.g. normal)all associations are linear

no interactionsno non-linear effectsno transformations

compatibility of the imputation modelscongeniality (compatibility between analysis and imputation models)

violation á bias

Nicole Erler, FGME 2019, Kiel 3

Page 5: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values Correctly is Not So Easy!

(Imputation) methods makes certain assumptions, e.g.:

missingness is M(C)ARthe incomplete variable has a certain conditional distribution(e.g. normal)

all associations are linearno interactionsno non-linear effectsno transformations

compatibility of the imputation modelscongeniality (compatibility between analysis and imputation models)

violation á bias

Nicole Erler, FGME 2019, Kiel 3

Page 6: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values Correctly is Not So Easy!

(Imputation) methods makes certain assumptions, e.g.:

missingness is M(C)ARthe incomplete variable has a certain conditional distribution(e.g. normal)all associations are linear

no interactionsno non-linear effectsno transformations

compatibility of the imputation modelscongeniality (compatibility between analysis and imputation models)

violation á bias

Nicole Erler, FGME 2019, Kiel 3

Page 7: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values Correctly is Not So Easy!

(Imputation) methods makes certain assumptions, e.g.:

missingness is M(C)ARthe incomplete variable has a certain conditional distribution(e.g. normal)all associations are linear

no interactionsno non-linear effectsno transformations

compatibility of the imputation modelscongeniality (compatibility between analysis and imputation models)

violation á bias

Nicole Erler, FGME 2019, Kiel 3

Page 8: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Handling Missing Values Correctly is Not So Easy!

(Imputation) methods makes certain assumptions, e.g.:

missingness is M(C)ARthe incomplete variable has a certain conditional distribution(e.g. normal)all associations are linear

no interactionsno non-linear effectsno transformations

compatibility of the imputation modelscongeniality (compatibility between analysis and imputation models)

violation á bias

Nicole Erler, FGME 2019, Kiel 3

Page 9: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Literature: mis-specification in Multiple Imputation

Several authors have

investigated robustness to mis-specification (of distribution)in MI using FCS / MICEin joint model MI

and/or proposed to useTukey’s gh distributionFleishman polynomialsGAMs (in FCS)Doubly-robust weighted estimating equations (instead of MI)

Nicole Erler, FGME 2019, Kiel 4

Page 10: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Fully Bayesian Analysis & Imputation

Joint distribution

p(y | X , b, θ)︸ ︷︷ ︸analysismodel

p(X | θ)︸ ︷︷ ︸imputation

part

p(b | θ)︸ ︷︷ ︸randomeffects

p(θ)︸︷︷︸priors

Imputation part

p(X︷ ︸︸ ︷

x1, . . . , xp,Xcompl . | θ) = p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

SoftwareImplemented in the R package JointAI

Nicole Erler, FGME 2019, Kiel 5

Page 11: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Fully Bayesian Analysis & Imputation

Joint distribution

p(y | X , b, θ)︸ ︷︷ ︸analysismodel

p(X | θ)︸ ︷︷ ︸imputation

part

p(b | θ)︸ ︷︷ ︸randomeffects

p(θ)︸︷︷︸priors

Imputation part

p(X︷ ︸︸ ︷

x1, . . . , xp,Xcompl . | θ) = p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

SoftwareImplemented in the R package JointAI

Nicole Erler, FGME 2019, Kiel 5

Page 12: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Fully Bayesian Analysis & Imputation

Joint distribution

p(y | X , b, θ)︸ ︷︷ ︸analysismodel

p(X | θ)︸ ︷︷ ︸imputation

part

p(b | θ)︸ ︷︷ ︸randomeffects

p(θ)︸︷︷︸priors

Imputation part

p(X︷ ︸︸ ︷

x1, . . . , xp,Xcompl . | θ) = p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

SoftwareImplemented in the R package JointAI

Nicole Erler, FGME 2019, Kiel 5

Page 13: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

MICE vs JointAI

Imputation in MICE

p(x1 | y ,Xcompl ., x2, x3, x4, . . . , θ)p(x2 | y ,Xcompl ., x1, x3, x4, . . . , θ)p(x3 | y ,Xcompl ., x1, x2, x4, . . . , θ). . .

Imputation in JointAI

p(y | Xcompl ., x1, x2, x3, . . . , θ)p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

No issues withcomplex outcomes, e.g.:

multi-levelsurvival

congenialitycompatibility

Nicole Erler, FGME 2019, Kiel 6

Page 14: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

MICE vs JointAI

Imputation in MICE

p(x1 | y ,Xcompl ., x2, x3, x4, . . . , θ)p(x2 | y ,Xcompl ., x1, x3, x4, . . . , θ)p(x3 | y ,Xcompl ., x1, x2, x4, . . . , θ). . .

Imputation in JointAI

p(y | Xcompl ., x1, x2, x3, . . . , θ)

p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

No issues withcomplex outcomes, e.g.:

multi-levelsurvival

congenialitycompatibility

Nicole Erler, FGME 2019, Kiel 6

Page 15: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

MICE vs JointAI

Imputation in MICE

p(x1 | y ,Xcompl ., x2, x3, x4, . . . , θ)p(x2 | y ,Xcompl ., x1, x3, x4, . . . , θ)p(x3 | y ,Xcompl ., x1, x2, x4, . . . , θ). . .

á

Imputation in JointAI

p(y | Xcompl ., x1, x2, x3, . . . , θ)p(x1 | Xcompl ., θ)p(x2 | Xcompl ., x1, θ)p(x3 | Xcompl ., x1, x2, θ). . .

áPotential mis-specification of

association structureconditional distributionM(C)AR

Nicole Erler, FGME 2019, Kiel 7

Page 16: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Quadratic Effect

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Quadratic association between covariates: x1 ∼ α0 + α1x2 +HHHα2x2

2 + . . .

−2 0 2 −2 0 2 −2 0 2

−10

0

10

20

30

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

mice JointAI mice JointAI mice JointAI0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 8

Page 17: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Quadratic Effect

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Quadratic association between covariates: x1 ∼ α0 + α1x2 +HHHα2x2

2 + . . .

−2 0 2 −2 0 2 −2 0 2

−10

0

10

20

30

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

mice JointAI mice JointAI mice JointAI0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 8

Page 18: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Logarithmic Effect

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Log-association between covariates: x1 ∼ α0 + α1ZZlog(x2) + . . .

0.4 0.8 1.2 0.4 0.8 1.2 0.4 0.8 1.2−12

−8

−4

0

4

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

mice JointAI mice JointAI mice JointAI

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 9

Page 19: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Gamma distribution

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Gamma-distributed covariate: x1 | x2, x3, . . . ∼ Ga()

0 2 4 6 0 2 4 6 0 2 4 60.0

0.5

1.0

1.5

2.0

x1(incomplete covariate)

cond

ition

al d

istr

ibut

ion

mice JointAI mice JointAI mice JointAI

0.8

1.0

1.2

1.4

1.6

1.8

2.0

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 10

Page 20: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Flexible Bayesian Models

We need more flexible imputation models!

Ideally: models that fit (almost) any distribution / association structure.

Ideas:

flexible association structure: penalized splinesflexible residual distribution: mixture of Polya-Trees

Nicole Erler, FGME 2019, Kiel 11

Page 21: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Flexible Bayesian Models

We need more flexible imputation models!

Ideally: models that fit (almost) any distribution / association structure.

Ideas:

flexible association structure: penalized splinesflexible residual distribution: mixture of Polya-Trees

Nicole Erler, FGME 2019, Kiel 11

Page 22: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Bayesian P-Splines

Instead of β1x2 we used∑

`=1β`B`(x2):

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●●

●●

●●●

● ●

●●

x2

x 1

Nicole Erler, FGME 2019, Kiel 12

Page 23: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Bayesian P-Splines

Instead of β1x2 we used∑

`=1β`B`(x2):

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●●

●●

●●●

● ●

●●

x2

x 1

Nicole Erler, FGME 2019, Kiel 12

Page 24: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Bayesian P-Splines

Instead of β1x2 we used∑

`=1β`B`(x2):

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●●

●●

●●●

● ●

●●

x2

x 1

Nicole Erler, FGME 2019, Kiel 12

Page 25: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation: Bayesian P-SplinesAnalysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Quadratic association between covariates: x1 ∼ α0 + α1x2 +HHHα2x22 + . . .

−2 0 2 −2 0 2 −2 0 2

−10

0

10

20

30

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

JointAIdefault

JointAIp−spline

JointAIdefault

JointAIp−spline

JointAIdefault

JointAIp−spline

1.00

1.25

1.50

rela

tive

bias

Nicole Erler, FGME 2019, Kiel 13

Page 26: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation: Bayesian P-SplinesAnalysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Quadratic association between covariates: x1 ∼ α0 + α1x2 +HHHα2x22 + . . .

−2 0 2 −2 0 2 −2 0 2

−10

0

10

20

30

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

JointAIdefault

JointAIp−spline

JointAIdefault

JointAIp−spline

JointAIdefault

JointAIp−spline

1.00

1.25

1.50

rela

tive

bias

Nicole Erler, FGME 2019, Kiel 13

Page 27: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation: Bayesian P-Splines

Potential Issue:

covariate

inco

mpl

ete

cova

riate

missingobserved

Nicole Erler, FGME 2019, Kiel 14

Page 28: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation: Bayesian P-Splines

Potential Issue:

covariate

inco

mpl

ete

cova

riate

missingobserved

Nicole Erler, FGME 2019, Kiel 14

Page 29: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 30: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 31: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 32: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 33: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 34: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 35: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Mixture of Polya Trees

0.0

0.5

1.0

1.5

dens

ity

Beta(a0, a0)

Beta(a1, a1) Beta(a1, a1)

Beta(a2, a2) Beta(a2, a2) Beta(a2, a2) Beta(a2, a2)

Beta(a3, a3) Beta(a3, a3) Beta(a3, a3) Beta(a3, a3)

Nicole Erler, FGME 2019, Kiel 15

Page 36: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Practical Issues & Ideas

Potential / probable issues for practice:

flexible fit needs observed data everywherecomputational time

Ideas:

check first if simple model fits, e.g.posterior predictive checks

χ2 type of testsKolmogorov-Smirnoff test?discordance tests?

feasibility checks before running the complex model

Nicole Erler, FGME 2019, Kiel 16

Page 37: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Practical Issues & Ideas

Potential / probable issues for practice:

flexible fit needs observed data everywherecomputational time

Ideas:

check first if simple model fits, e.g.posterior predictive checks

χ2 type of testsKolmogorov-Smirnoff test?discordance tests?

feasibility checks before running the complex model

Nicole Erler, FGME 2019, Kiel 16

Page 38: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Practical Issues & Ideas

Potential / probable issues for practice:

flexible fit needs observed data everywherecomputational time

Ideas:

check first if simple model fits, e.g.posterior predictive checks

χ2 type of testsKolmogorov-Smirnoff test?discordance tests?

feasibility checks before running the complex model

Nicole Erler, FGME 2019, Kiel 16

Page 39: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Take home message

assumptions of imputation models can easily be violated á bias

more flexible imputation models are needed

semi- / non-parametric (Bayesian) methods can offer a solution

more flexibility á more complexity á need for guidance tools

Nicole Erler, FGME 2019, Kiel 17

Page 41: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Beta Distribution

Analysis model: y ∼ β0 + β1x1 + β2 + . . .

Beta-distributed covariate: x1 | x2, x3, . . . ∼ Be()

0.00 0.25 0.50 0.75 1.000.00 0.25 0.50 0.75 1.000.00 0.25 0.50 0.75 1.000

1

2

3

4

x1(incomplete covariate)

cond

ition

al d

istr

ibut

ion

mice JointAI mice JointAI mice JointAI

0.6

0.8

1.0

1.2

1.4

rela

tive

bias

Nicole Erler, FGME 2019, Kiel 19

Page 42: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Reverting the sequence

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Exclusion of important predictor: x1 ∼ α0 + α1x2 +XXXα2x3 + . . .

−5 0 5 −5 0 5 −5 0 50.0

0.2

0.4

0.6

x1(incomplete covariate)

cond

ition

al d

istr

ibut

ion

mice JointAI mice JointAI mice JointAI0.6

0.8

1.0

1.2

1.4

1.6

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 20

Page 43: Erasmus Medical Center, Rotterdam · Howblack-boxuseofimputationcancausebias NicoleErler Erasmus Medical Center, Rotterdam NicoleErler,FGME2019,Kiel 1

Simulation Study: Ignoring an Interaction

Analysis model: y ∼ β0 + β1x1 + β2x2 + . . .

Interaction between covariates: x1 ∼ α0 + α1x2 + α2x3 +XXXXα3x2x3 + . . .

−2 −1 0 1 2 −2 −1 0 1 2 −2 −1 0 1 2

−5.0

−2.5

0.0

2.5

5.0

x2(complete covariate)

x 1(in

com

plet

e co

varia

te)

mice JointAI mice JointAI mice JointAI

0.6

0.8

1.0

1.2

1.4

1.6

1.8

rela

tive

bias

10% NA

30% NA

50% NA

Nicole Erler, FGME 2019, Kiel 21