Reducing the Sample Size of Diagnostic-Biomarker ... the Sample Size of...
Transcript of Reducing the Sample Size of Diagnostic-Biomarker ... the Sample Size of...
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Interuniversity Institute for Biostatistics and statistical Bioinformatics
Reducing the Sample Size ofDiagnostic-Biomarker-Validation Designs by a
Bayesian Framework
L. Garcia Barrado1 E. Coart2 T. Burzykowski1,2
1Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat)2International Drug Development Institute (IDDI)
1 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Interuniversity Institute for Biostatistics and statistical Bioinformatics
Outline
Problem settingAccuracy definitionIndex definitionIncorporating pre-validation information
Bayesian frameworkDevelopment stageValidation stageInformation transfer
Simulation studySettingsResults
Conclusions
2 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem settingInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Outline
Problem settingAccuracy definitionIndex definitionIncorporating pre-validation information
Bayesian frameworkDevelopment stageValidation stageInformation transfer
Simulation studySettingsResults
Conclusions
3 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem settingInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Problem setting
Develop a diagnostic biomarker-index and efficiently validate itsaccuracy
I Area under the Receiver Operating Characteristics (ROC) curve(AUC) as measure of accuracy
I Define index as linear combination of biomarkers maximizingAUC
I Incorporate information from development to validation stage
=> To this end a Bayesian framework will be proposed
4 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Accuracy definitionInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Area under the Receiver Operating Characteristics curve
−2 0 2 4
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
ControlDisease
−2 0 2 4
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
ControlDisease
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
ROC curve
1−Sp
Se1
AUC=0.98AUC=0.76
5 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Index definitionInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Data assumptions and notation
Underlying true biomarker distributionI Mixture of two K -variate normal distributions by true disease
status (D)I Y|D=0 ∼ NK (µ0,Σ0)I Y|D=1 ∼ NK (µ1,Σ1)
I Reference test (T) is imperfectI Se: Unknown sensitivity of the reference testI Sp: Unknown specificity of the reference testI Conditionally on true disease status, misclassification
independent of biomarker value
I θ: Unknown true prevalence of disease in the data set
I Assume for the rest of the presentation K=3
6 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Index definitionInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Definintion of biomarker-index
Linear combination maximizing AUC of the form*:
a’Y|D=0 ∼ N(a’µ0, a’Σ0a)
a’Y|D=1 ∼ N(a’µ1, a’Σ1a)
For which:a’ ∝ (Σ0 + Σ1)−1(µ1 − µ0)
Area Under the ROC Curve:
AUCIndex = Φ{[
(µ1 − µ0)′(Σ0 + Σ1)−1(µ1 − µ0)] 1
2
}*Siu, J.Q., and Liu, J.S. (1993)
7 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Incorporating pre-validation informationInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Data
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
Results
1a
2a
3a
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
ROC curve
1-Sp
Se2
Development
8 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Incorporating pre-validation informationInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Data Results
1a
2a
3a
Data Results
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
ROC curve
1-Sp
Se2
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
Development Validation
9 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Problem setting
Incorporating pre-validation informationInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Data Results
1a
2a
3a
Data Results
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
ROC curve
1-Sp
Se2 0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
ROC curve
1-Sp
Se2
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
-2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
Classification example
Biomarker value
Control
Disease
Development Validation
10 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian frameworkInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Outline
Problem settingAccuracy definitionIndex definitionIncorporating pre-validation information
Bayesian frameworkDevelopment stageValidation stageInformation transfer
Simulation studySettingsResults
Conclusions
11 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Development stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Bayesian latent-class mixture model
Full data likelihood
L(µ0,µ1,Σ0,Σ1, θ,Se,Sp|Y, T,D)
=N∏
i=1
(θSeti (1− Se)(1−ti ) 1√
2π|Σ1|× EXP
{−1
2(Yi − µ1)
′Σ−11 (Yi − µ1)
})di
×
((1− θ)(1− Sp)ti Sp(1−ti ) 1√
2π|Σ0|× EXP
{−1
2(Yi − µ0)
′Σ−10 (Yi − µ0)
})(1−di )
12 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Development stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Prior distributions
Hyperpriorθ ∼ Uniform(0.1,0.9)
PriorsDi ∼ Bernoulli(θ) (Observation i: 1,. . . ,N)Se = Sp ∼ Beta(1,1)T(0.51,∞)
13 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Development stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Prior distributions
Set Σj =VjRjVj*For: Vj = σk ,j I3 and Rj is a correlation matrix. [j:0,1; k:1,. . .,3]
Then: Cj =
1 cj,12 cj,13
0 cj,22 cj,23
0 0 cj,33
= Cholesky factor of Rj .
σk ,j ∼ Uniform(0,1000)
cj,12 = ρj,12 ∼ Uniform(-1,1) ρj,23 = ρj,12ρj,13 + cj,22cj,23
cj,13 = ρj,13 ∼ Uniform(-1,1)
cj,23 ∼ Uniform(−√
1− ρ2j,13,
√1− ρ2
j,13
)* Wei, Y and Higgins, J.P.T (2013)
14 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Development stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Prior distributions
AUCIndex = Φ{
[(µ1 − µ0)′(Σ0 + Σ1)−1(µ1 − µ0)]12
}Reparameterize:
AUCIndex = Φ{√
∆′∆}
Where ∆ = L(µ1 − µ0)For L = the Cholesky factor of (Σ0 + Σ1)−1
Priors∆ ∼ N3(κ,Ψ)µ0k ∼ N(0, 106) (k: 1,. . . ,3)µ1 = ∆L−1 + µ0
15 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Validation stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Bayesian latent-class mixture model
YIndex = a’YVal (Biomarker index observations)
Full data likelihood
L(µ0, µ1, σ0, σ1, θ,Se,Sp|YIndex , TVal ,DVal)
=N∏
i=1
(θSetVali (1− Se)(1−tVali ) 1√
2πσ21
× EXP
{− (YIndexi − µ1)
2
σ21
})dVali
×
((1− θ)(1− Sp)tVali Sp(1−tVali ) 1√
2πσ20
× EXP
{− (YIndexi − µ0)
2
σ20
})(1−dVali )
16 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Validation stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Prior distributions
Hyperpriorθ ∼ Uniform(0.1,0.9)
PriorsDi ∼ Bernoulli(θ) (Observation i: 1,. . . ,N)Se = Sp ∼ Beta(1,1)T(0.51,∞)σj ∼ Uniform(0,1000) [j:0,1]
17 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Validation stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Prior distributions
AUCIndex = Φ
{(µ1−µ0)√σ2
0+σ21
}
Reparameterize:
AUCIndex = Φ {γ}Where γ = (µ1−µ0)√
σ20+σ2
1
Priorsγ ∼ N(λ, τ 2)µ0 ∼ N(0, 106)
µ1 = γ ×√σ2
0 + σ21 + µ0
18 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Validation stageInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Validation criterionBased on Bayesian hypothesis testing paradigm:
H0 : AUC ≤ δH1 : AUC > δ
0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95
02
46
810
Example of validated diagnostic biomarker index
AUC
Den
sity
AUC posteriorqα
δ = 0.75α = 0.2
Consider result significant
when posterior probability of
AUC exceeding δ is largerthan 1− α.
19 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Information transferInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Incorporate pre-validation information
Development Validationφ−1 {AUCIndex} =
√∆′∆ ≈ γ = φ−1 {AUCIndex}
Take approximation to posterior distribution of√
∆′∆ as priordistribution for γ
20 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Information transferInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Incorporate pre-validation information
Development Validationφ−1 {AUCIndex} =
√∆′∆ ≈ γ = φ−1 {AUCIndex}
Take approximation to posterior distribution of√
∆′∆ as priordistribution for γ
21 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Bayesian framework
Information transferInteruniversity Institute for Biostatistics
and statistical Bioinformatics
√∆′∆ posterior approximation
Prior: γ ∼ N(λ, τ 2)
Where λ = x√∆′∆1:M
, and τ 2 = s2√∆′∆1:M
(For M MCMC samples)
Posterior ∆’∆ distribution + approximation
∆’∆
Den
sity
0.0 0.2 0.4 0.6
01
23
45 Normal approx.
Posterior ∆’∆ distribution + approximation
∆’∆
Den
sity
0.5 0.6 0.7 0.8 0.9 1.0
02
46
8
Normal approx.
Posterior ∆’∆ distribution + approximation
∆’∆
Den
sity
1.4 1.6 1.8 2.0 2.2
0.0
0.5
1.0
1.5
2.0
2.5
3.0
Normal approx.
22 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation studyInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Outline
Problem settingAccuracy definitionIndex definitionIncorporating pre-validation information
Bayesian frameworkDevelopment stageValidation stageInformation transfer
Simulation studySettingsResults
Conclusions
23 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
SettingsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
GOALEstablish difference in power to validate AUC of biomarker index when
ignoring vs incorporating pre-validation information
For 3 correlated biomarkersθ = 0.5Se = Sp = 0.85
Mixture component parameters set such that:
AUC of biomarker 1 = 0.75AUC of biomarker 2 = 0.75AUC of biomarker 3 = 0.75
AUCIndex= 0.78
24 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
SettingsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Development StageI NDev = 400I a′ = posterior median of a’
Validation StageI 200 data sets for NVal = 100, 400, 600, and 800I Power = proportion of simulations for which P(AUC > 0.75|data) > 0.80I Prior AUC information:
I Ignoring AUC information (Red)I Prior γ : N(0, 1)
I Incorporating AUC information (Blue)I Prior γ : Discounted normal
approx. to posterior predictivedistribution of
√∆′∆
25 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
SettingsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Development StageI NDev = 400I a′ = posterior median of a’
Validation StageI 200 data sets for NVal = 100, 400, 600, and 800I Power = proportion of simulations for which P(AUC > 0.75|data) > 0.80I Prior AUC information:
I Ignoring AUC information (Red)I Prior γ : N(0, 1)
I Incorporating AUC information (Blue)I Prior γ : Discounted normal
approx. to posterior predictivedistribution of
√∆′∆
26 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
SettingsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Development StageI NDev = 400I a′ = posterior median of a’
Validation StageI 200 data sets for NVal = 100, 400, 600, and 800I Power = proportion of simulations for which P(AUC > 0.75|data) > 0.80I Prior AUC information:
Pre−validation AUC information
AUC
Den
sity
0.5 0.6 0.7 0.8 0.9 1.0
02
46
8
Including InfoIgnoring Info I Ignoring AUC information (Red)
I Prior γ : N(0, 1)
I Incorporating AUC information (Blue)I Prior γ : Discounted normal
approx. to posterior predictivedistribution of
√∆′∆
27 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
ResultsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Simulated Power
100 200 300 400 500 600 700 800
0.0
0.2
0.4
0.6
0.8
1.0
Power in terms of validation sample size
Validation sample size
Pow
er +
+
++
+
+
++
+
+
+ +
+
+
+ +
Including InfoIgnoring Info
I Increasing validation sample sizeincreases power
I Incorporating pre-validationinformation significantly increasespower
I Reduction of about 12 of sample
size to maintain power
28 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
Simulation study
ResultsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Simulated Power
100 200 300 400 500 600 700 800
0.0
0.2
0.4
0.6
0.8
1.0
Power in terms of validation sample size
Validation sample size
Pow
er +
+
++
+
+
++
+
+
+ +
+
+
+ +
XX
Including InfoIgnoring Info
I Increasing validation sample sizeincreases power
I Incorporating pre-validationinformation significantly increasespower
I Reduction of about 12 of sample
size to maintain power
29 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
ConclusionsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Outline
Problem settingAccuracy definitionIndex definitionIncorporating pre-validation information
Bayesian frameworkDevelopment stageValidation stageInformation transfer
Simulation studySettingsResults
Conclusions
30 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
ConclusionsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Conclusions
I Bayesian framework:I Allows including pre-validation information into validation stage
I By approximating posterior AUC information as priorI Also other pre-validation information possible
I Simulation studyI Power to reach validation is significantly increasedI Sample size reduction for equal power
31 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
ConclusionsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Further considerations
I Other validation criteria
I Robustness to miss-estimated linear combination coefficients
I Extend to incorporate non-normally distributed biomarkers
I Evaluate impact of conditional independence assumption
32 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
ConclusionsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
References
I Su, J.Q., Liu, J.S.: Linear combinations of multiple diagnostic markers.Journal of the American Statistical Association. 88, 1350–1355 (1993)
I Wei, Y, Higgins, P.T.: Bayesian multivariate meta-analysis with multipleoutcomes. Statistics in Medicine (2013) doi: 10.1002/sim.5745
33 / 34
Reducing the Sample Size of Diagnostic-Biomarker-Validation Designs by a Bayesian Framework
ConclusionsInteruniversity Institute for Biostatistics
and statistical Bioinformatics
Thank you for your attention !
34 / 34