Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of...

48
1 Analysis of RNAseq data 1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam Mark van de Wiel [email protected] 1 and other high-dimensional data

Transcript of Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of...

Page 1: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

1

Analysis of RNAseq data1 using ShrinkBayes

Department of Epidemiology & Biostatistics VUmc, Amsterdam

Mark van de Wiel [email protected]

1and other high-dimensional data

Page 2: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

2

7. ShrinkBayes: plusses and minuses 6.

Overview

ShrinkBayes: inference ShrinkBayes: shrinkage 4.

5. ShrinkBayes: software

ShrinkBayes: model 3.

Contents

Why go Bayesian? 1. INLA 2.

Page 3: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

3

Some Bayesian statistics

Page 4: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

Source: http://xkcd.com/1132/

Page 5: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

5

Bayes in a nutshell…

• Data: Y • Parameter of interest: β

• Known: likelihood π(Y| β) and prior π(β)

• ?Posterior: π(β | Y)

• Bayes’ rule:

• 95% credibility interval: CI= [q0.025(π(β | Y), q0.975(π(β | Y)]

• Simple inference: β ∈ CI?

∫==

β

βd)β(π)β|(π)β(π)β|(π

)(π)β(π)β|(π)|β(π

YY

YYY

Presentator
Presentatienotities
q_alpha(f) denotes the alpha quantile of density function f (or its corresponding distr function F)
Page 6: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

6

Bayesian inference, single test Parameter of interest: gene expression difference between two conditions: primary and metastasis. Posteriors:

Presentator
Presentatienotities
Credibility interval: dotted lines. First: beta = 0 not in CI (null-hypothesis ‘rejected’), second: beta = 0 in CI.
Page 7: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

7

Bayesian inference

Conjugate priors

Prior π(β) is conjugate to likelihood π(Y β) when the posterior has an analytical form. 1. Normal-Normal 2. Normal-Normal-Inverse Gamma (for variance σ2) 3. Binomial-Beta (=Beta-binomial distribution) 4. Poission-Gamma (= Negative Binomial) 5. .....

Mixtures of conjugates are also conjugate when the mixture components are known

Page 8: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

8

Bayesian inference, conjugate priors, example

Page 9: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

9

Bayesian inference, conjugate priors, example

1

1Ignoring constant -1/2

Page 10: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

10

Bayesian inference, approximations

In many practical cases, exact formulas for the posteriors are not available, hence approximations are necessary: 1. Markov Chain Monte Carlo

Very popular. Computationally intensive. Needs to be developed for each new problem. Very general though

2. Variational Bayes approximations. Fast. Approximate posteriors by analytical formulas

assuming knowledge of conditional posteriors. Require specific development for each problem

3. Integrated Nested Laplace Integration (INLA)

Page 11: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

11

INLA

Page 12: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

12

Bayesian inference, GLM setting

Page 13: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

13

• Approximations for marginal posteriors based on Laplace approximations • Fast, accurate alternative for MCMC

• Applicable to a wide variety of models, including GLM.

• Can also be used for count data

• Implementation in R: www.r-inla.org

INLA

Page 14: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

14

INLA

Page 15: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

15

Why go Bayesian?

Tossing of 10.000 coins, a simple example

• Setting: 10.000 coins, each flipped only three times

• Suppose 8000 fair coins: p(head) = 0.5; 1000 for which p(head) = 0.35 and 1000 for which p(head) = 0.65

• Some computations:

nk

:estimatetFrequentis

++n+k

=]k|p[E:estimateBayesian

),(Beta~p)p,n(Bin~X

βαα

βα

Page 16: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

16

Why go Bayesian?

Beta-distribution

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

x

dens

ity

Black: beta(3,3) Red: beta(2,2) Green: beta(1,1) = U[0,1]

Page 17: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

17

Why go Bayesian?

Estimates

nk

=)n,k(F:estimatetFrequentis

++n+k

=),,n,k(B=]k|p[E:estimateBayesianβα

αβα

Suppose k=1. F(1,3) = 1/3. B(1,3,1,1) = 2/5; B(1,3,3,3) = 4/9

Suppose k=0. F(1,3) = 0. B(1,3,1,1) = 1/5; B(1,3,3,3) = 3/9

In particular, the informative prior renders better estimates for the majority of coins (not for all)

Page 18: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

18

Why go Bayesian?

Estimation also improves in terms of precision

Possibly more important: Bayesian estimates using an informative prior tend to be more precise (less variance) Show this for the beta-binomial coin tossing experiment. [exer]

Page 19: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

19

Why go Bayesian?

Informative prior renders better estimates

Nice, but we don’t know it. True, but we can estimate it! Empirical Bayes: estimate the prior from data. Many different forms. Simplest one: Equate the theoretical moments of the data to the empirical moments.

Page 20: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

20

Why go Bayesian?

Empirical Bayes

Equate theoretical moments of the data to empirical ones.

2m

1ii

222

22

22

m

1ii

)X-X()1-m/(1V̂

)βα/(αn))βα/(α(n-)]1-βα()βα/[(αβ)n-n(

]p[nE])p[E(n-]p[V)n-n(

])p[E-]p[E(n-]p[Vn]]p|X[V[E]]p|X[E[VVX

m/XX)βα/(αn]]p|X[E[EEX

)β,α(Beta~p),p,n(Bin~X

=

=

==

+++++=

+=

=+=

==+==

Page 21: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

21

ShrinkBayes: motivating example

Page 22: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

22

Motivating example • 25 libraries, 70,000 tags (tag clusters) • 5 brain regions • 7 donors • Main interest: which tags differ between brain regions (Groups)

Page 23: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

23

Motivating example

Page 24: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

24

Motivating example

Page 25: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

25

Random effects

Source: http://anythingbutrbitrary.blogspot.fr/2012/06/random-regression-coefficients-using.html

Page 26: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

26

ShrinkBayes: model

Page 27: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

27

Model MODEL

Yij ~ ZI-NB(p0, μij, φi) = p0iδ(0) + (1-p0i)NB(μij, φi)

μij = g-1(Xjβi) βik ~ N(0,(τk)2

) [or βik ~ N(0,(τik)2 ); (τik)-2 ~ Г(α,β) ]

θil ~ Fl g : link function, e.g. g(x)=log(x) θil : hyper-parameter (not linked to mean): φi, p0i, etc.

Presentator
Presentatienotities
Same link function as for Poisson; this is the natural link function
Page 28: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

28

Model Poisson,Negative Binomial, Zero-Inflated Negative Binomial

Page 29: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

29

ShrinkBayes: shrinkage

Page 30: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

30

ShrinkBayes How does Bayesian shrinkage work?

Page 31: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

31

ShrinkBayes Why is shrinkage useful?

• Dispersion enables borrowing information across features leads to more stable estimates

• Parameters of interest (e.g. βi,group) accommodates multiplicity correction corrects for ‘selection bias’ caused by regression to the mean

• Nuisance parameters: (e.g. βi,batch) automatic ‘model selection’ property: prior shrinks to null when unimportant

Page 32: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

32

ShrinkBayes: shrinkage Back to the example....

Yij ~ p0iδ(0) + (1-p0i)NB(μij, φi), φi = log(νi)

μij = exp(βi0 + βi,groupXg + βi,batch Xb + βi,indjXindj)

Priors1

1Vague priors for p0i and βi0

Page 33: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

33

ShrinkBayes: shrinkage Priors for the example

How do we estimate the parameters of the priors?

Page 34: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

34

ShrinkBayes Bayesian Empirical Bayes

Page 35: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

35

ShrinkBayes Iterative estimation of the prior parameters

Page 36: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

36

ShrinkBayes Nonparametric prior

• Shape of prior may be important, in particular for central parameter of interest (e.g. θi = βi,group) • How to get

from

θi~N(0,τ2)

to

θi~Fnp

?

Page 37: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

37

ShrinkBayes Nonparametric prior

• Empirical Bayes fitting is easy, remember:

• So, we simply sample from this mixture (given an initial prior) and smoothen it (e.g. using ‘density’ in R):

∑≈p

1=ii )|()( Yθπθπ

Page 38: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

38

ShrinkBayes Nonparametric prior

• Nice, but INLA does not accept non–parametric priors!

• Solution: 1. Use iterative procedure to find a good parametric

shrinkage prior: fp(θ) 2. Obtain posteriors under fp(θ) 3. Apply the same iterative procedure to obtain a non-

parametric estimate fnp, computing posteriors by:

==

=

∫∫ θd)θ(F)θ(f

)|θ(π1C1θd)|θ(π

]exer[)θ(f)θ(f

)|θ(πC)|θ(π

f

npipinp

p

npipinp

YY

YY

Page 39: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

39

ShrinkBayes: inference

Page 40: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

40

ShrinkBayes: inference Hypothesis testing in a Bayesian setting

• In a Bayesian setting it is very natural to test interval H0: |θi|≤ δ

• Using δ ≠ 0 avoids detecting small, biologically non-relevant effects

• Testing H0: |θi| ≤ δ trivial from posteriors: ...and BFDR(t) =E[lfdr|lfdr ≤ t] computed from lfdr as before1

i

δ

δ- iiiii0 θd)|θ(P)|δ|θ(|P)|H(Plfdr ∫≤ YYY ===

1More on False Discovery Rate (FDR) in later lectures

Page 41: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

41

ShrinkBayes: inference Bayesian hypothesis testing: multiple comparisons

• Multiple comparisons: θikl = θik - θil for all (k,l). E.g. (βi,group1 - βi,group2, βi,group1 - βi,group3, βi,group2 - βi,group3, etc.)

• H0: |θikl|≤ δ for all (k,l).

kl)l,k(iikl)l,k(

iikl)l,k(iikl)l,k(

iikl)l,k(iikl)l,k(i0

lfdrmin)|δ|θ(|Pmin

)]|δ|θ|P-1[min)|δ|θ(|Pmax-1

)|δ|θ|(maxP-1)|δ|θ|(maxP)|H(Plfdr

==

>=>

>===

Y

YY

YYY

≤≤

• So: if minimum lfdr < α then certainly global lfdr < α

• Also works for BFDR

Presentator
Presentatienotities
Inequallity comes from P[union (Ai)] >= P(Ai) for any i, hence also for the index corrsponding to the maximum probability
Page 42: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

42

ShrinkBayes: inference

E.g: ( ) ),0(N)p1(+0p=)( 200 τδθπ -

∫+=

+=

θd)θ|(π)θ('π)p-1()0|(πp)0|(πp

)(Π)p-1()0|(πp)0|(πp

)(π00

0

00

00 YY

YYY

YY

+

Then1:

Point null-hypothesis in a Bayesian setting

• Recall that for testing H0: θi = 0 (e.g. θi = βi,group) prior needs to contain mass on 0!

1dropping index i

Page 43: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

43

ShrinkBayes: inference Point null-hypothesis in a Bayesian setting

π(Y|0) and П(Y) are marginal likelihoods which are conveniently provided by INLA (where π(Y|0) is obtained after fitting the null-model: i.e. the model not containing βik

[ ]{ }

{ }∑

=

=== p

1it)(π

p

1it)(π0

00

0

0

I

I)(πˆt)(π)|(πE)t(BFDR

≤Y

YYYY

∫+=

+=

θd)θ|(π)θ('π)p-1()0|(πp)0|(πp

)(Π)p-1()0|(πp)0|(πp

)(π00

0

00

00 YY

YYY

YY

Page 44: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

44

ShrinkBayes: inference

0for,)p1(+)(/)0|(p

)|('f)p1(=)|(f

)()p1(+)0|(p)0|(p

=)|0=(P

00

0

00

0

≠θππ

θθ

πππ

θ

-YYY-

Y

Y-YY

Y

Posterior

Posterior is [exer]:

Where f’(θ|Y) is the posterior obtained by INLA under the Guassian prior N(0,τ2)

Page 45: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

45

ShrinkBayes: Software

Page 46: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

46

ShrinkBayes Demo

See the R-Vignette ShrinkBayesVignette.pdf for many examples

Page 47: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

47

ShrinkBayes: Plusses and minuses

Page 48: Module 1: Definitie Basiscursus Steekproevenmavdwiel/SHDD/ShrinkBayes_slides.pdf · 1 Analysis of RNAseq data1 using ShrinkBayes Department of Epidemiology & Biostatistics VUmc, Amsterdam

48

ShrinkBayes Plusses and minuses, comparison with edgeR

+ Better (in terms of power) for small sample sizes + Can deal with pairs, random effects, excess of zeros + Automatic model selection + More reproducible results - More difficult to use

- More time-consuming

- Bayesian concept, less familiar to most biologists