Higher criticism for estimating proportion of non-null...

41
Higher criticism for estimating proportion of non-null effect in high-dimensional multiple comparison Annika Tillander joint work with Tatjana Pavlenko Karolinska Institutet, Department of Medical Epidemiology and Biostatistics LinStat 26 August 2014 Annika Tillander bHC for estimating proportion of non-null 1 / 39

Transcript of Higher criticism for estimating proportion of non-null...

Page 1: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Higher criticism for estimating proportion ofnon-null effect in high-dimensional multiple

comparison

Annika Tillanderjoint work with Tatjana Pavlenko

Karolinska Institutet, Department of Medical Epidemiology and Biostatistics

LinStat 26 August 2014

Annika Tillander bHC for estimating proportion of non-null 1 / 39

Page 2: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Outline

◮ Introduction◮ Covariance structure approximation◮ Separation strength◮ Detection boundary◮ Estimating proportion of non-null effect◮ Higher criticism◮ Application

Annika Tillander bHC for estimating proportion of non-null 2 / 39

Page 3: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Classification

We have n observations where each observationx = (x1, . . . , xp) corresponds to p number of features.Supervised classification problem with C classes, class labelyj = c where j = 1, . . . , n, c ∈ {1, . . . , C}

T = {(x1, y1), (x2, y2), . . . , (xn, yn)} .Using the training data a prediction model is built which enablesprediction of new observations where the outcome is unknown.

Annika Tillander bHC for estimating proportion of non-null 3 / 39

Page 4: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Linear Discriminant Analysis (LDA)

Assume that the outcome in each class is modeled by theGaussian distribution, i.e. xc ∼ N(µc ,Σc), where µc is theclass mean and Σc is the class-wise covariance matrix.

Dc(x) = x′Σ−1c µc − 1

2µ′cΣ

−1c µc + logπc

πc is the prior probability of class c andC∑

c=1πc = 1.

c∗ = argmaxc=1,...,CDc(x)

Annika Tillander bHC for estimating proportion of non-null 4 / 39

Page 5: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Two classes

Assign x to class 1 iflog π1

π2+(

x − 12(µ1 + µ2)

)′Σ−1(µ1 − µ2) ≥ 0

Annika Tillander bHC for estimating proportion of non-null 5 / 39

Page 6: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

High dimensionality

It is the relation between the number of observation (n) and thenumber of features (p) that decides the dimensionality of thedata.In the case with more features than available observations, theproblem is said to be ”high-dimensional” (p > n).

Standard asymptoticp is fixed and n → ∞

Asymptotic when p is not fixedp = pn grows with n, pn ≫ n for n → ∞

Annika Tillander bHC for estimating proportion of non-null 6 / 39

Page 7: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

gLasso

Using the correlation matrix K−1 instead of covariance allowsfor faster convergence in high-dimensional setting Rothman et al.

(2008).

Let Γ denote the diagonal matrix of true standarddeviationsΣ−1 = Γ−1KΓ−1

Sparse inverse covariance estimation with the graphicallasso Friedman et al. (2009)

K̂λ = arg minK≻0

{

trace(

KK̂−1)

− log detK + λ∥

∥K−1∥

1

}

Annika Tillander bHC for estimating proportion of non-null 7 / 39

Page 8: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Cuthill-McKee ordering

Reducing the bandwidth of sparse symmetric matrices Cuthill &

McKee (1969)

Let S be a p × p symmetric matrix where i denote rowsand j denote columns.The bandwidth of S is the maximum value of |i − j | for thenon-zero elements

Determine a permutation matrix P such that non-zeroelements will cluster about the main diagonalSC = PSPT

Annika Tillander bHC for estimating proportion of non-null 8 / 39

Page 9: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

SC= PSPT

S=

SC=

Annika Tillander bHC for estimating proportion of non-null 9 / 39

Page 10: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

True block-structure

1 0 0 0 1 0 1 0 0 1

0 1 0 1 0 0 0 1 0 0

0 0 1 0 0 1 0 0 1 0

0 1 0 1 0 0 0 1 0 0

1 0 0 0 1 0 1 0 0 1

0 0 1 0 0 1 0 0 1 0

1 0 0 0 1 0 1 0 0 1

0 1 0 1 0 0 0 1 0 0

0 0 1 0 0 1 0 0 1 0

1 0 0 0 1 0 1 0 0 1

S=

1 1 1 1 0 0 0 0 0 01 1 1 1 0 0 0 0 0 01 1 1 1 0 0 0 0 0 01 1 1 1 0 0 0 0 0 00 0 0 0 1 1 1 0 0 00 0 0 0 1 1 1 0 0 00 0 0 0 1 1 1 0 0 00 0 0 0 0 0 0 1 1 10 0 0 0 0 0 0 1 1 10 0 0 0 0 0 0 1 1 1

SC=

Annika Tillander bHC for estimating proportion of non-null 10 / 39

Page 11: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Algorithm

Combining gLasso and Cuthill-McKee orderingBootstrap sample, calculate K̂−1

j

Estimate K̂j [λi ] with gLassoSij = 1K̂j [λi ]>0

S̃ik = 1 r∑

j=1Sij

r >qk

Find permutation matrix Pik for skeleton S̃ik with Cuthill-McKeeordering algorithm

Annika Tillander bHC for estimating proportion of non-null 11 / 39

Page 12: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Identifying block-structure

(a)

(b)

λλ == 0.63, limit == 0.99

M1

(c)

λλ == 0.66,limit == 0.99

M2

λλ == 0.66,limit == 0.99

M3

λλ == 0.65,limit == 0.99

M4

Annika Tillander bHC for estimating proportion of non-null 12 / 39

Page 13: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Additive classifier

Block diagonal segmentationΣ−1 = diag

[

Σ−11 , . . . ,Σ−1

b

]

where b is the number of blocks. Both the class means µc andthe observed vector x can be partioned into b disjoint subsetsµc,i = (µc,i1 , . . . , µc,ipi

) and xi = (xi1 , . . . , xipi), pi is the block

size, i = 1, . . . , b, such that for any i 6= j , xi and xj areconditionally independent given the class variable y .

Two-class linear function with additive structure

D(x) =b∑

i=1

(

xi − 12(µ1,i + µ2,i)

)′Σ−1

i (µ1,i − µ2,i)

Annika Tillander bHC for estimating proportion of non-null 13 / 39

Page 14: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Mahalanobis distance

Let π1 = π2 = 1/2 then the optimal misclassificationprobability can be expressed asε = Φ

(

−12

√δ2)

where Φ(·) is the Gaussian cumulative distribution function and

δ2 =∥

∥Σ−1/2µ

2is the Mahalanobis shift vector norm, where

µ = µ1 − µ2 is a shift vector and ‖·‖ denotes the ℓ2 norm.

Annika Tillander bHC for estimating proportion of non-null 14 / 39

Page 15: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Separation strength

The i th block separation strength

δ2i =

∥Σ−1/2i µi

2

Rescaled estimate of the i th separation strengthS2

i = ηµ̂′i Σ̂

−1i µ̂i

where η = n1n2n , µ̂i = µ̂1i − µ̂2i is the shift vector of the sample

class means and Σ̂i is the maximum likelihood estimate of thecovariance matrix of the i th block. S2 ∼ χ2(p0, ω

2) where p0

degrees of freedom and ω2 = ηδ2 the non-centrality parameter.

Annika Tillander bHC for estimating proportion of non-null 15 / 39

Page 16: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Misclassifiation

0 10 20 30 40 50

0.0

0.1

0.2

0.3

0.4

0.5

Data from West et.al

Number of variables

Mis

cla

ssific

atio

n L

DA

0 50 100 150

0.2

00

.25

0.3

00

.35

0.4

00

.45

0.5

0

Data from Pawitan et.al

Number of variables

Mis

cla

ssific

atio

n L

DA

1

1Data from West et al. (2001) and Pawitan et al. (2005)Annika Tillander bHC for estimating proportion of non-null 16 / 39

Page 17: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Sparse and weak setting

Traditional mixture

Fre

quen

cy

−2 0 2 4 6

050

100

150

200

Non informativeInformative

Sparse and weak mixture

Fre

quen

cy−4 −2 0 2 4

050

010

0015

00

Non informativeInformative

Annika Tillander bHC for estimating proportion of non-null 17 / 39

Page 18: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Detection boundary

β: sparsity parameter, proportion of non-null effectω2: weakness parameter, effect strength

Asymptotic sparse and weak modelβ = b−γ where γ ∈ (0, 1)ω2 = 2r log b where r ∈ (0, 1)

Annika Tillander bHC for estimating proportion of non-null 18 / 39

Page 19: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

For p0 = 1 and p = b

H0 : Si ∼ N(0,1) i.i.d 1 ≤ i ≤ pH1 : Si ∼ (1 − β)N(0,1) + βN(

√ω2,1) i.i.d 1 ≤ i ≤ p

Ingster (1999), Donoho & Jin (2008), Klaus & Strimmer (2013)

0.5 0.6 0.7 0.8 0.9 1.0

0.0

0.2

0.4

0.6

0.8

1.0

γγ

r

undetectable

detectable

estimable

Annika Tillander bHC for estimating proportion of non-null 19 / 39

Page 20: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Detection boundary for blocks

H0 : S2i ∼ χ2

p0(·; 0) i.i.d, 1 ≤ i ≤ b

H1 : S2i ∼ (1 − β)χ2

p0(·; 0) + βχ2

p0

(

·;ω2)

i.i.d, 1 ≤ i ≤ b

LRi =fH1

(s2i )

fH0(s2

i )=

(1−β)χ2p0(s2

i ;0)+βχ2p0(s2

i ;ω2)

χ2p0(s2

i ;0)

LRb = LRb(s21, s

22, . . . , s

2b;β, ω

2)

Reject H0 ifflog(LRb) > 0

Annika Tillander bHC for estimating proportion of non-null 20 / 39

Page 21: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Detection boundary for blocks

Annika Tillander bHC for estimating proportion of non-null 21 / 39

Page 22: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Let πi denote the significance level (”p-value”)

Under the global null hypothesisπi ∼ U(0, 1)

where U denotes the uniform distribution.

β denote the proportion false null hypotheses

β =

b∑

i=11{ωi 6=0}

b

Annika Tillander bHC for estimating proportion of non-null 22 / 39

Page 23: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Rank the p-values in increasing order π(1) ≤ π(2) ≤ . . . ≤ π(b)

Storey & Tibshirani (2003)

β̂ = 1 −b∑

1=i1{π(i)>t}(1−t)b

Meinshausen & Rice (2006)

β̂ = maxt∈(0,1)

Fb(t)−t−Bb,α∆(t)1−t

where Fb(t) =

b∑

i=11{πi≤t}

b is the empirical distribution of p-values,∆(t) =

t(1 − t) the standard deviation-proportional bounding function and

Bb,α the bounding sequence for ∆(t) at level α given by Bb,α = G−1(1−α)+mbnb

where G is the Gumbel distribution, mb = 2 log2 b + 12 log3 b −

12 log 4π

Annika Tillander bHC for estimating proportion of non-null 23 / 39

Page 24: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Annika Tillander bHC for estimating proportion of non-null 24 / 39

Page 25: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Annika Tillander bHC for estimating proportion of non-null 25 / 39

Page 26: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Annika Tillander bHC for estimating proportion of non-null 26 / 39

Page 27: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Estimating proportion of non-null effect

Annika Tillander bHC for estimating proportion of non-null 27 / 39

Page 28: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Higher criticismDonoho & Jin (2004, 2008, 2009)

HCi,π(i)=

√p

i/p−π(i)√i/p(1−i/p)

i = 1, . . . , p and for fixed α0 ∈ (0, 1) the HC test statistic is

HC∗ = max1≤i≤(α0×p)

HCi,π(i)

Block higher criticism

bHCi,π(i)=

√b

i/b−π(i)√i/b(1−i/b)

i = 1, . . . , b and for fixed α0 ∈ (0, 1) the bHC test statistic is

bHC∗ = max1≤i≤(α0×b)

bHCi,π(i)

Annika Tillander bHC for estimating proportion of non-null 28 / 39

Page 29: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

False Discovery Rates

Benjamini & Hochberg (1995)Fdr(πi) = P (Non informative|Π ≤ πi) =

(1−β)πiF (πi )

≤ pi π(i)

Efron et al. (2001), Efron (2004)Lfdr(πi) = P {Non-informative|πi} =

(1−β)fH0(πi )

f (πi )

Sun & Cai (2007)

Ofdr(s2i ) =

(1−β)χ2p0(s2

i ;0)(1−β)χ2

p0(s2i ;0)+βχ2

p0(s2i ;ω

2)

k = max1≤i≤k

{

i ; 1i

i∑

j=1Ofdr(j) ≤ α

}

Annika Tillander bHC for estimating proportion of non-null 29 / 39

Page 30: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 30 / 39

Page 31: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 31 / 39

Page 32: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 32 / 39

Page 33: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 33 / 39

Page 34: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 34 / 39

Page 35: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

Annika Tillander bHC for estimating proportion of non-null 35 / 39

Page 36: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Simulated data

bHC Fdrb̃ fpr mc b̃ fpr mc

β ǫ m sd m sd m sd m sd m sd m sd0.010 0.005 36 41.1 0.03 0.04 0.02 0.02 11 5.9 0.00 0.00 0.06 0.080.020 0.005 42 64.5 0.03 0.06 0.04 0.03 12 7.6 0.00 0.00 0.09 0.090.030 0.005 94 106.6 0.08 0.10 0.04 0.05 7 7.7 0.00 0.00 0.17 0.110.010 0.010 39 42.1 0.03 0.04 0.03 0.03 9 6.1 0.00 0.00 0.10 0.100.020 0.010 62 68.8 0.05 0.07 0.04 0.05 7 7.2 0.00 0.00 0.15 0.110.030 0.010 99 103.7 0.09 0.10 0.05 0.08 3 4.0 0.00 0.00 0.26 0.100.010 0.050 53 55.0 0.05 0.05 0.06 0.07 2 2.1 0.00 0.00 0.26 0.080.020 0.050 30 58.2 0.03 0.06 0.14 0.11 1 0.4 0.00 0.00 0.33 0.050.030 0.050 21 24.5 0.02 0.02 0.16 0.13 1 0.4 0.00 0.00 0.34 0.040.010 0.100 21 21.2 0.02 0.02 0.14 0.12 1 0.6 0.00 0.00 0.32 0.050.020 0.100 14 20.5 0.01 0.02 0.20 0.13 1 0.2 0.00 0.00 0.34 0.030.030 0.100 19 34.4 0.02 0.03 0.17 0.12 1 0.3 0.00 0.00 0.33 0.050.010 0.150 23 32.1 0.02 0.03 0.18 0.13 1 0.2 0.00 0.00 0.34 0.040.020 0.150 12 16.1 0.01 0.02 0.22 0.12 1 0.0 0.00 0.00 0.34 0.040.030 0.150 17 31.1 0.02 0.03 0.18 0.12 1 0.0 0.00 0.00 0.34 0.03

Table: Number of blocks selected as informative(

b̃)

, proportion of

falsely selected blocks (fpr ) and the misclassification rate (mc)averaged over 100 runs, presented as mean (m) and standarddeviation (sd) for block size p0 = 20.

Annika Tillander bHC for estimating proportion of non-null 36 / 39

Page 37: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Real dataBlock No.selected blocks Misclassification ratesize bHC Fdr Lfdr bHC Fdr Lfdr All

Breast cancer data I1 657 999 583 0.24 0.23 0.24 -2 328 1461 804 0.23 0.23 0.22 0.285 131 1219 929 0.22 0.27 0.25 0.26

10 65 657 601 0.20 0.26 0.25 0.2815 43 438 393 0.21 0.28 0.26 0.2820 32 328 308 0.19 0.28 0.28 0.2630 21 219 209 0.14 0.26 0.25 0.3040 16 164 156 0.17 0.26 0.25 0.2650 13 131 121 0.15 0.25 0.26 0.25

Prostate cancer data1 126 808 442 0.10 0.20 0.13 -2 630 5547 4082 0.14 0.38 0.35 0.365 252 2520 2427 0.09 0.26 0.28 0.25

10 126 1260 1254 0.08 0.19 0.19 0.1715 84 840 838 0.07 0.14 0.13 0.1520 63 630 620 0.06 0.13 0.12 0.1330 42 420 411 0.05 0.11 0.13 0.1040 31 315 309 0.05 0.11 0.09 0.1250 25 252 247 0.04 0.12 0.09 0.13

Breast cancer data II1 712 165 61 0.02 0.08 0.04 -2 712 1254 696 0.06 0.06 0.04 0.105 285 1313 759 0.04 0.08 0.06 0.14

10 142 712 705 0.04 0.12 0.10 0.0815 95 475 443 0.06 0.10 0.14 0.1620 71 356 351 0.02 0.16 0.10 0.1030 1 237 235 0.10 0.08 0.08 0.1040 1 178 17650

Annika Tillander bHC for estimating proportion of non-null 37 / 39

Page 38: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Real dataBlock No.inf.blocks Misclassification ratesize bHC Fdr Lfdr bHC Fdr Lfdr All

Breast cancer data I1 328 550 315 0.22 0.21 0.23 -2 164 614 342 0.22 0.22 0.21 0.285 65 565 384 0.21 0.26 0.23 0.26

10 32 328 286 0.21 0.30 0.28 0.2715 21 219 188 0.20 0.27 0.28 0.2620 16 164 152 0.16 0.28 0.27 0.2830 10 109 105 0.18 0.24 0.28 0.2640 8 82 77 0.18 0.25 0.27 0.2650 6 65 60 0.20 0.27 0.24 0.26

Prostate cancer data1 315 89 43 0.36 0.34 0.28 -2 157 150 54 0.27 0.30 0.23 0.385 63 260 99 0.16 0.25 0.20 0.31

10 31 218 101 0.09 0.19 0.14 0.2515 21 188 127 0.06 0.21 0.16 0.2320 15 155 134 0.11 0.18 0.16 0.1930 10 105 97 0.11 0.14 0.10 0.1540 7 78 72 0.09 0.10 0.12 0.1350 6 63 60 0.15 0.14 0.14 0.16

Breast cancer data II1 356 230 133 0.22 0.20 0.22 -2 307 217 97 0.02 0.02 0.02 0.275 142 461 289 0.02 0.06 0.02 0.14

10 71 356 271 0.02 0.12 0.06 0.1415 47 237 225 0.04 0.12 0.16 0.1220 35 178 174 0.02 0.12 0.12 0.1830 1 118 115 0.12 0.10 0.08 0.1040 1 89 87 0.18 0.16 0.12 0.1650

Annika Tillander bHC for estimating proportion of non-null 38 / 39

Page 39: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Thank You

Annika Tillander bHC for estimating proportion of non-null 39 / 39

Page 40: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Benjamini, Y. & Hochberg, Y. (1995), ‘Controlling the false discovery rate: A practicaland powerful approach to multiple testing’, Journal of the Royal Statistical Society.Series B (Methodological) 57(1), pp. 289–300.URL: http://www.jstor.org/stable/2346101

Cuthill, E. & McKee, J. (1969), Reducing the bandwidth of sparse symmetric matrices,in ‘Proceedings of the 1969 24th national conference’, ACM ’69, ACM, New York,NY, USA, pp. 157–172.URL: http://doi.acm.org/10.1145/800195.805928

Donoho, D. & Jin, J. (2004), ‘Higher criticism for detecting sparse heterogeneousmixtures’, Ann. Statist pp. 962–994.

Donoho, D. & Jin, J. (2008), ‘Higher criticism thresholding: Optimal feature selectionwhen useful features are rare and weak’, Proceedings of the National Academy ofSciences 105(39), 14790–14795.URL: http://www.pnas.org/content/105/39/14790.abstract

Donoho, D. & Jin, J. (2009), ‘Feature selection by higher criticism thresholding achievesthe optimal phase diagram’, Philosophical Transactions of the Royal Society A:Mathematical, Physical and Engineering Sciences 367(1906), 4449–4470.URL: http://rsta.royalsocietypublishing.org/content/367/1906/4449.abstract

Efron, B. (2004), Local false discovery rates, Technical report, Department of Statistics,Stanford University.

Efron, B., Storey, J. D. & Tibshirani, R. (2001), ‘Microarrays, empirical bayes methods,and false discovery rates’, Genet. Epidemiol 23, 70–86.

Friedman, J., Hastie, T. & Tibshirani, R. (2009), Graphical lasso- estimation ofGaussian graphical models. Manual to the R-package glasso.

Ingster, Y. I. (1999), ‘Minimax detection of a signal for lpn balls’, Mathematical Methodsof Statistics 7, 401–428.

Annika Tillander bHC for estimating proportion of non-null 39 / 39

Page 41: Higher criticism for estimating proportion of non-null …conferences.mai.liu.se/LinStat2014/presentations/Til... · 2014-08-26 · Higher criticism for estimating proportion of non-null

Klaus, B. & Strimmer, K. (2013), ‘Signal identification for rare and weak features: highercriticism or false discovery rates?’, Biostatistics 14, 129.

Meinshausen, N. & Rice, J. (2006), ‘Estimating the proportion of false null hypothesesamong a large number of independently tested hypotheses’, Annals of Statistics34(1), 373–393.

Pawitan, Y., Bjohle, J., Amler, L., Borg, A., Egyhazi, S., Hall, P., Han, X., Holmberg, L.,Huang, F., Klaar, S., Liu, E. T., Miller, L., Nordgren, H., Ploner, A., Sandelin, K.,Shaw, P. M., Smeds, J., Skoog, L., Wedren, S. & Bergh, J. (2005), ‘Gene expressionprofiling spares early breast cancer patients from adjuvant therapy: derived andvalidated in two population-based cohorts’, Breast Cancer Research 7(6), 953–964.

Rothman, A., Bickel, P., Levina, E. & Zhu, J. (2008), ‘Sparse permutation invariantcovariance estimation’, Electronic Journal of Statistics 2, 494–515.

Storey, J. D. & Tibshirani, R. (2003), ‘Statistical significance for genomewide studies’,Proceedings of the National Academy of Sciences 100(16), 9440–9445.URL: http://www.pnas.org/content/100/16/9440.abstract

Sun, W. & Cai, T. T. (2007), ‘Oracle and adaptive compound decision rules for falsediscovery rate control’, Journal of the American Statistical Association102(479), 901–912.URL: http://amstat.tandfonline.com/doi/abs/10.1198/016214507000000545

West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H.,Olson, J. A., Marks, J. R. & Nevins, J. R. (2001), ‘Predicting the clinical status ofhuman breast cancer by using gene expression profiles’, PNAS98(20), 11462–11467.

Annika Tillander bHC for estimating proportion of non-null 39 / 39