A Constrained Instrumental Variable Approach,

A Constrained Instrumental Variable Approach, and its Application to Mendelian Randomization with Pleiotropy. Lai Jiang (Lady Davis Institute & McGill), Karim Oualkacha (UQAM), Celia Greenwood (Lady Davis Institute & McGill). Montreal Causal Inference Workshop, July 2016


Page 1: A Constrained Instrumental Variable Approach,

A Constrained Instrumental Variable Approach, and its Application to Mendelian Randomization with Pleiotropy

Lai Jiang (Lady Davis Institute & McGill), Karim Oualkacha (UQAM),

Celia Greenwood (Lady Davis Institute & McGill)

Montreal Causal Inference Workshop, July 2016

Page 2: A Constrained Instrumental Variable Approach,

Outline

1. The problem

2. Solutions and estimators

3. Simulations – simple example

4. Simulations – an example borrowed from reality

5. Finally …


Page 3: A Constrained Instrumental Variable Approach,


The Problem

Page 4: A Constrained Instrumental Variable Approach,

Instrumental Variables

We are interested in the causal effect of X (phenotype) on Y (disease)

U: unobserved variable that may confound the association between X and Y

G: a variable (genotype) we want to use as an instrumental variable

DAG: G → X → Y, with U → X and U → Y

Page 5: A Constrained Instrumental Variable Approach,

Core Conditions for Instrumental Variable analysis

DAG: G → X → Y, with U → X and U → Y

1) G is independent of U

2) G is not marginally independent of X

3) G and Y are independent conditional on X and U

Page 6: A Constrained Instrumental Variable Approach,

Potential violations of assumptions in MR

Linkage Disequilibrium (Condition 3 or Condition 1)

Genetic Heterogeneity (Weak IV)

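Population Stratification (Condition 3)

[Figure: three DAGs, one per violation: linkage disequilibrium (G1, G2, U, X, Y), genetic heterogeneity (G1, G2, G3, X, U, Y), and population stratification (G1, P, X, Y, U)]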

Page 7: A Constrained Instrumental Variable Approach,

Pleiotropy

A SNP (G) is associated with another intermediate phenotype (Z, other than X) which also has an effect on the disease Y.

[Figure: two DAGs over G, X, Y, Z and U, labelled (a) and (b)]

• Condition (3), that Y and G are independent given X and U, is violated in (a).

• Condition (1), that G and U are independent, is violated in (b).

Page 8: A Constrained Instrumental Variable Approach,

(Original) Problem

Page 9: A Constrained Instrumental Variable Approach,


Estimators

Page 10: A Constrained Instrumental Variable Approach,

Causal effect estimation: IV assumptions satisfied

One SNP

[Figure: DAG over G, X, Y and unobserved U, with edge coefficients δ, β, ρ, η; β is the effect X → Y]
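With a single valid SNP, the causal effect can be estimated by the ratio (Wald) estimator, cov(G, Y) / cov(G, X), which coincides with TSLS when there is one instrument. The Python sketch below illustrates this on simulated data. The function name `wald_ratio`, the numerical values, and the assignment of δ to G → X and of ρ, η to the edges out of U are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def wald_ratio(g, x, y):
    """Ratio (Wald) IV estimate of the effect of x on y with one instrument g."""
    g = g - g.mean()                       # centring makes g @ v proportional to cov(g, v)
    return float(g @ y) / float(g @ x)

# Simulated data; all coefficient values are hypothetical.
rng = np.random.default_rng(0)
n = 20000
delta, beta, rho, eta = 0.5, 1.0, 1.0, 1.0
g = rng.binomial(2, 0.3, size=n).astype(float)   # SNP coded 0/1/2
u = rng.normal(size=n)                           # unobserved confounder
x = delta * g + rho * u + rng.normal(size=n)
y = beta * x + eta * u + rng.normal(size=n)

beta_iv = wald_ratio(g, x, y)
# Ordinary regression slope of y on x, which is confounded by u.
beta_ols = float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))
```

In this simulation the regression slope `beta_ols` is pulled away from β by U, while `beta_iv` should land close to β.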

Page 11: A Constrained Instrumental Variable Approach,
Page 12: A Constrained Instrumental Variable Approach,

In the presence of pleiotropy

[Figure: DAG over G, X, Z, U and Y, with edge coefficients δ, ω, β, ξ, η, ρ]

Page 13: A Constrained Instrumental Variable Approach,

In the presence of pleiotropy

[Figure: DAG over G, X, Z, U and Y]

Page 14: A Constrained Instrumental Variable Approach,

Potential solutions when there is pleiotropy

1. Naïve method (TSLS using all SNPs) {Biased} [Baum 2003]

2. Adjust G for Z, use G_res as IV {Unbiased}

3. Limited Information Maximum Likelihood (LIML) {Biased} [Burgess 2013]

4. Stepwise selection methods: select a subset of G {Inconsistent} [Bai 2008]

5. Inverse Probability Weights [Cole 2008]

6. Constrained Instrumental Variables

Page 15: A Constrained Instrumental Variable Approach,

1. Naïve method in practice (#1, #2)

1. Select SNPs strongly associated with phenotype of interest X

2. Remove SNPs strongly associated with pleiotropic phenotypes Z

3. Remove weak SNPs to avoid overfitting & bias

4. If the resulting instrumental variable G still has a strong pleiotropic effect, consider regressing G on Z and using G_res instead of G (i.e. Method #2)

5. TSLS model fitting using G (or G_res) -> X -> Y
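Steps 4 and 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: all variables are centred so that no intercept is needed, Z is independent of the unobserved confounder, and the helper names `residualize` and `tsls` as well as the simulated effect sizes are hypothetical.

```python
import numpy as np

def residualize(G, Z):
    """Return G_res = (I - P_Z) G, the part of G orthogonal to the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, G, rcond=None)
    return G - Z @ coef

def tsls(G, x, y):
    """Two-stage least squares: instruments G, one exposure x, outcome y."""
    stage1, *_ = np.linalg.lstsq(G, x, rcond=None)
    x_hat = G @ stage1                      # first-stage fitted exposure
    return float(x_hat @ y) / float(x_hat @ x)

# Simulated data; every numerical value here is hypothetical.
rng = np.random.default_rng(1)
n, p = 5000, 5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
u = rng.normal(size=n)
z = G[:, :2] @ np.array([0.5, 0.5]) + rng.normal(size=n)   # pleiotropic phenotype
x = G @ np.full(p, 0.4) + u + rng.normal(size=n)
y = 1.0 * x + 2.0 * z + u + rng.normal(size=n)             # true effect of x is 1.0

# Centre everything so the regressions need no intercept.
G, z, x, y = (a - a.mean(axis=0) for a in (G, z, x, y))
Z = z[:, None]

beta_naive = tsls(G, x, y)                   # Method #1: all SNPs as instruments
beta_res = tsls(residualize(G, Z), x, y)     # Method #2: G_res as instruments
```

Because every column of G_res is orthogonal to Z, the pleiotropic path through Z contributes nothing to the second stage, which is why `beta_res` should sit near the true effect while `beta_naive` does not.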

Page 16: A Constrained Instrumental Variable Approach,

3. LIML

Limited Information Maximum Likelihood uses instruments to rectify bias in a regression Y ~ X, arising when X is correlated with residuals

• i.e. Pleiotropy leads to correlation between X and Z

LIML takes into account the covariances of the errors

More stable to weak instrumental variable bias

• Z is not used
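For reference, LIML with a single exposure can be written as a k-class estimator. The Python sketch below uses that textbook formulation, assuming centred variables and no additional covariates; it is not the implementation used for the slides, and the function name `liml` and the simulated values are hypothetical.

```python
import numpy as np

def liml(G, x, y):
    """LIML estimate of beta in y = beta * x + error, instruments G (all centred)."""
    W = np.column_stack([y, x])
    # Residuals of [y, x] after projecting onto the instruments: M_G W.
    coef, *_ = np.linalg.lstsq(G, W, rcond=None)
    W_res = W - G @ coef
    # kappa is the smallest eigenvalue of (W' M_G W)^{-1} (W' W).
    eigvals = np.linalg.eigvals(np.linalg.solve(W_res.T @ W_res, W.T @ W))
    kappa = float(np.min(eigvals.real))
    # k-class estimator: beta = [x'(I - kappa M_G) x]^{-1} x'(I - kappa M_G) y.
    x_k = x - kappa * W_res[:, 1]
    return float(x_k @ y) / float(x_k @ x)

# Simulated data with valid instruments; all values are hypothetical.
rng = np.random.default_rng(2)
n, p = 5000, 5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
u = rng.normal(size=n)
x = G @ np.full(p, 0.4) + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)
G, x, y = (a - a.mean(axis=0) for a in (G, x, y))

beta_liml = liml(G, x, y)
```

Setting kappa to 1 in the last step would give TSLS, so the two estimators differ only through kappa.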


Page 17: A Constrained Instrumental Variable Approach,

4. Stepwise Selection Methods.

Step-by-step methods (such as (Bai, 2008)):

G: Construct a new IV based on a subset S

Select new features into S with respect to X | G

Delete features in S with respect to Z | G

Implementation details are crucial here

1. Specific criteria (significance thresholds) for selection and deletion?

2. How to summarize the information in the final set S? PCA? PLS? Etc…

3. Stop strategy?

4. Starting S (previous published sources)?

5. Backward? Forward? Bidirectional?

Page 18: A Constrained Instrumental Variable Approach,

5. Inverse Probability Weight Adjusted Regression (IPWAR)


Page 19: A Constrained Instrumental Variable Approach,


Constrained Instrumental Variable Method

Page 20: A Constrained Instrumental Variable Approach,

Constrained Instrumental Variable

[Figure: DAG over G, X, Z, U and Y]

Page 21: A Constrained Instrumental Variable Approach,

Constrained Instrumental Variable


Page 22: A Constrained Instrumental Variable Approach,

Constrained Instrumental Variable

• When $p < n$ there is an exact solution

• In fact there are $p - k$ orthogonal solutions

• $G_{res} = (I_n - P_Z) G$, the projection of $G$ onto the subspace orthogonal to the columns of $Z$

• We search for a direction in $\mathrm{Span}(G)$, $G^*$, most informative for $X$, orthogonal to $Z$

• $G^*$ belongs to $\mathrm{Span}(G_{res})$

• $\mathrm{Span}(G_{res}) = \mathrm{Span}(G^*) + \mathrm{Span}(\text{complement of } G^* \text{ in } G_{res} \text{ subspace})$

• Complement is not informative for $X$

• Hence, CIV has identical performance to straightforward $G_{res}$
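The linear algebra behind these bullets can be sketched in Python. The exact objective is not recoverable from this transcript, so the sketch assumes one plausible form: choose weights c that maximize the squared covariance between Gc and X, subject to Z'Gc = 0 and Gc having unit norm. The function name `civ_direction`, the weight symbol `c`, and the simulated data are all hypothetical.

```python
import numpy as np

def civ_direction(G, Z, x):
    """Weights c maximizing (x' G c)^2 subject to Z' G c = 0 and ||G c|| = 1."""
    A = Z.T @ G                                    # k x p constraint matrix
    _, sv, Vt = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(sv > 1e-10 * sv.max()))
    N = Vt[rank:].T                                # p x (p - rank) null-space basis
    GN = G @ N                                     # columns are orthogonal to Z
    # Within the constrained space the optimum is the least-squares fit of x on GN.
    a, *_ = np.linalg.lstsq(GN, x, rcond=None)
    c = N @ a
    return c / np.linalg.norm(G @ c)

# Simulated data with p < n; all values are hypothetical.
rng = np.random.default_rng(3)
n, p, k = 500, 10, 2
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
Z = 0.5 * G[:, :k] + rng.normal(size=(n, k))       # k pleiotropic phenotypes
x = G[:, :5] @ np.full(5, 0.4) + rng.normal(size=n)
G, Z, x = (a - a.mean(axis=0) for a in (G, Z, x))

c = civ_direction(G, Z, x)
g_star = G @ c                                     # constructed instrument
```

The null space of Z'G has dimension p − k when Z'G has full row rank, which matches the count of orthogonal solutions given above.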

Page 23: A Constrained Instrumental Variable Approach,

Constrained Instrumental Variable


What about when 𝑝 > 𝑛?

Page 24: A Constrained Instrumental Variable Approach,

Penalized Constrained Instrumental Variable

Penalization:

• L0 penalty

• LASSO penalty

• Non-convex penalties

• …

Page 25: A Constrained Instrumental Variable Approach,

Which penalty?

• LASSO (L1) penalty does not always yield a sparse solution automatically for this problem.

• Neither do non-convex penalties (e.g. SCAD)


Page 26: A Constrained Instrumental Variable Approach,

L0 penalty

• L0 penalty: $\|s\|_0 = \sum_j s_j^0 < \lambda$ directly enforces sparsity

• However, greedy search for optimal $s$, $\lambda$ is NP-hard in computational complexity

• Alternatively, consider smoothed approximate L0 penalties as $\sigma \to 0$:

$$f_\sigma(x) = \exp\left(-\frac{x^2}{2\sigma^2}\right) \to \begin{cases} 1, & x = 0 \\ 0, & x \neq 0 \end{cases}$$

So $\|s\|_0 \approx p - \sum_j f_\sigma(s_j)$
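A small numerical check of this approximation in Python is given below. The function name `smoothed_l0` and the example vector are illustrative assumptions.

```python
import numpy as np

def smoothed_l0(s, sigma):
    """Approximate ||s||_0 by p - sum_j exp(-s_j^2 / (2 sigma^2))."""
    s = np.asarray(s, dtype=float)
    return s.size - float(np.sum(np.exp(-s**2 / (2.0 * sigma**2))))

# Hypothetical weight vector with two non-zero entries out of p = 5.
s = np.array([0.0, 0.0, 1.5, -0.7, 0.0])
approx = [smoothed_l0(s, sigma) for sigma in (1.0, 0.1, 0.01)]
```

As σ shrinks, the values in `approx` move toward the exact count of non-zero entries, which is 2 for this vector.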

Page 27: A Constrained Instrumental Variable Approach,

Implementation

Two methods:

1. If $p < n$, linear algebra solution

2. If $p > n$, we propose a numerical solution for penalized CIV

   • Start from an initial guess $s$, and an initial penalty $\sigma = \max |s|$

   • Iteratively decrease $\sigma \to \sigma_{min}$ along a chosen sequence

   • Cross validation to choose $\lambda$ for each $\sigma$ based on MSE of $(Y \mid G, s)$

3. Bootstrap for standard errors
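Step 3 can be illustrated with a generic nonparametric bootstrap in Python. This sketch resamples individuals with replacement and recomputes the estimate each time. For brevity it uses a single-SNP ratio estimator as a stand-in for CIV, which is an assumption made only to keep the example self-contained; the names `bootstrap_se` and `ratio_estimate` and all numerical values are hypothetical.

```python
import numpy as np

def bootstrap_se(estimator, data, n_boot=200, seed=0):
    """Bootstrap standard error: resample rows with replacement, re-estimate."""
    rng = np.random.default_rng(seed)
    n = data[0].shape[0]
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # one bootstrap sample of rows
        estimates.append(estimator(*(a[idx] for a in data)))
    return float(np.std(estimates, ddof=1))

def ratio_estimate(g, x, y):
    """Stand-in estimator (single-SNP ratio); swap in any IV estimator here."""
    g = g - g.mean()
    return float(g @ y) / float(g @ x)

# Simulated data; all values are hypothetical.
rng = np.random.default_rng(4)
n = 5000
g = rng.binomial(2, 0.3, size=n).astype(float)
u = rng.normal(size=n)
x = 0.5 * g + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)

se = bootstrap_se(ratio_estimate, (g, x, y))
```

Any estimator with the same call signature can be passed in place of `ratio_estimate`.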


Page 28: A Constrained Instrumental Variable Approach,

Simulations


Page 29: A Constrained Instrumental Variable Approach,

Methods compared

• Naive TSLS

• Adjusting each G for Z, then TSLS

• LIML

• Stepwise Selection Methods (forward, backward)

• Inverse Probability Weight Adjusted Regression (IPWAR)

• CCA/sparseCCA

• CIV (Constrained Instrumental Variable)

• Smoothed CIV

Page 30: A Constrained Instrumental Variable Approach,

Bias as a function of ξ

• N = 200

• 20 SNPs

• 5 SNPs associated with X

• Among these 5, 2 SNPs are also associated with Z

• β: effect X → Y

[Figure: bias as a function of ξ; legend: Conditional TSLS / G residuals; CCA, CCA_LASSO; LIML; Backward / Forward; IPWAR; CIV / Smoothed CIV]

Page 31: A Constrained Instrumental Variable Approach,

Bias and standard error

Method        Estimate (ξ=1)   Mean square error (ξ=1)   Estimate (ξ=5)   Mean square error (ξ=5)

G residual    1.09             0.01                      1.08             0.01

IPWAR         2.00             1.21                      15.30            2.20

CIV           1.09             0.01                      1.09             0.02

CIV smooth    1.01             0.01                      1.05             0.01

Page 32: A Constrained Instrumental Variable Approach,

SNPs selected over simulations

Method         SNP1   SNP2   SNP3   SNP4   SNP5   Others

CIV            0.76   0.76   1.00   1.0    1.0    ~0.2

Smoothed CIV   0.00   0.00   0.29   0.62   0.29   0.0

Note: Applied hard threshold: 0.1*max(estimate)
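The thresholding rule in the note can be written in a few lines of Python. The sketch assumes that the rule is applied to absolute values of the estimated weights and that a SNP counts as selected when it meets or exceeds the threshold; neither detail is stated on the slide. The function name `select_snps` and the example weights are hypothetical.

```python
import numpy as np

def select_snps(weights, fraction=0.1):
    """Flag SNPs whose absolute weight is at least fraction * max absolute weight."""
    w = np.abs(np.asarray(weights, dtype=float))
    return w >= fraction * w.max()

# Hypothetical estimated weights for six SNPs.
weights = np.array([0.01, -0.02, 0.40, 0.90, -0.35, 0.005])
selected = select_snps(weights)
```

Averaging such indicator vectors over simulation replicates would give selection frequencies of the kind tabulated above.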

Page 33: A Constrained Instrumental Variable Approach,

[Figure: SNPs selected; panels: CIV, Smoothed CIV]

Page 34: A Constrained Instrumental Variable Approach,


Contrived Example

LDL-HDL-CHD

Page 35: A Constrained Instrumental Variable Approach,

HDL/LDL Simulation

Page 36: A Constrained Instrumental Variable Approach,

HDL/LDL Simulation

Method        Bias      Bootstrap variance   Lower_CI   Upper_CI

Naive         -0.090    0.025                -0.145     -0.034

G residuals   0.004     0.027                -0.050     0.059

LIML          -0.112    0.027                -0.169     -0.066

IPWAR         0.066     0.025                0.011      0.121

Backward      -0.008    0.042                -0.094     0.079

Forward       0.006     0.055                -0.097     0.109

CCA_LASSO     -0.090    0.030                -0.147     -0.033

CCA           -0.090    0.028                -0.147     -0.032

CIV           0.002     0.031                -0.052     0.057

TRUE          -0.0002   0.031                -0.067     0.066

Page 37: A Constrained Instrumental Variable Approach,


Finally…

Page 38: A Constrained Instrumental Variable Approach,

Comments

• Smoothed CIV selects SNPs associated with X and not Z in a non-greedy search for the optimal subset

• G_res, CCA etc. select SNPs differently and will keep overlapping SNPs

• For 𝑝 > 𝑛 or 𝑝 large:

  • CIV will work

  • G_residuals will not work or will become unstable

• 𝑝 > 𝑛 is a very unlikely Mendelian randomization scenario

Page 39: A Constrained Instrumental Variable Approach,

Questions

• Other applications of the smoothed L0 penalty?

• We create one instrument

• What about creating several instruments through the same kinds of projections (p-k)?

• Links to

  • Targeted learning

  • Multiple invalid instruments (averages of weak instruments)

Page 40: A Constrained Instrumental Variable Approach,

(New) problem

• Computationally efficient variable selection subject to linear constraints

Page 41: A Constrained Instrumental Variable Approach,

Thank you!

Acknowledgements:

• CIHR

• Ludmer Centre for Neuroinformatics and Mental Health

• Brent Richards & his group