The multiple faces of shrinkage

Georg Heinze

Center for Medical Statistics, Informatics and Intelligent Systems, Section for Clinical Biometrics

[email protected]

Partly supported by Austrian Science Fund FWF, Project I2276-N33

The multiple faces of shrinkage

Dunkler, Sauerbrei and Heinze, JStatSoft 2016

Puhr, Heinze, Nold, Lusa and Geroldinger, StatMed 2017

Historical outline

Purposes of shrinkage estimators

Post-estimation shrinkage methods

Joint work with Michael Kammer, Daniela Dunkler, Willi Sauerbrei

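Post-estimation shrinkage methods

• $\beta$: regression coefficients of the fitted model

• $b$: shrinkage factor

• $\beta^{(-i)}$: coefficients estimated with observation $i$ left out

• Cross-validated linear predictor: $\eta_i = \sum_j x_{ij} \beta_j^{(-i)}$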
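
The leave-one-out construction above can be illustrated with a minimal sketch for a linear model with a single covariate. Two points are assumptions rather than content of the slides: the shrinkage factor $b$ is taken as the slope from regressing $y$ on the cross-validated linear predictors $\eta_i$, and the data and function names are hypothetical.

```python
def ols(xs, ys):
    """Least-squares intercept and slope for a single covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def global_shrinkage_factor(xs, ys):
    """Return (b, eta), where eta_i = x_i * beta^(-i) uses the slope
    estimated without observation i."""
    eta = []
    for i in range(len(xs)):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        _, slope_minus_i = ols(x_rest, y_rest)   # beta^(-i)
        eta.append(xs[i] * slope_minus_i)        # eta_i
    # Assumption: b is the slope from regressing y on eta.
    _, b = ols(eta, ys)
    return b, eta


# Hypothetical data: an exact line and a noisy version of it.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_exact = [2.0 * xi + 1.0 for xi in x]
y_noisy = [2.9, 5.4, 6.6, 9.5, 10.7, 13.6, 14.5, 17.3]

b_exact, _ = global_shrinkage_factor(x, y_exact)
b_noisy, eta_noisy = global_shrinkage_factor(x, y_noisy)
print(f"b for noise-free data: {b_exact:.6f}")
print(f"b for noisy data:      {b_noisy:.6f}")
```

For noise-free data every leave-one-out fit reproduces the same line, so no shrinkage is needed and $b$ equals 1 up to rounding error.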

Use of the shrinkage factors

$y_{new} = \beta_0 + b\, x_{i,new}\, \beta$

Sauerbrei's (1999) 'parameterwise shrinkage factors'

• $\beta$, $\beta^{(-i)}$

• Partial linear predictors: $\eta_{ij} = x_{ij} \beta_j^{(-i)}$

• $b_j$

Dunkler's (2016) extension of parameterwise shrinkage

• $b_j$

• $G$ partial linear predictors: $\eta_{ig} = \sum_{j \in J_g} x_{ij} \beta_j^{(-i)}$, $g = 1, \dots, G$

• $\eta_{ig}$, $b_g$, $g = 1, \dots, G$

• $\beta^{(-i)} \approx \beta - \mathrm{DFBETA}_i$
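
The DFBETA relation above avoids refitting the model $n$ times. The following sketch checks it for ordinary least squares with one covariate, where the relation holds exactly; the approximation sign on the slide presumably refers to models such as logistic regression, which is an assumption here. Data and function names are hypothetical.

```python
def fit(xs, ys):
    """Least-squares fit; returns intercept, slope, mean of x and Sxx."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope, mx, sxx


def loo_slopes_by_refitting(xs, ys):
    """beta^(-i) obtained by refitting without observation i."""
    return [fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])[1]
            for i in range(len(xs))]


def loo_slopes_by_dfbeta(xs, ys):
    """beta^(-i) = beta - DFBETA_i, computed from the full fit only."""
    n = len(xs)
    intercept, slope, mx, sxx = fit(xs, ys)
    slopes = []
    for x, y in zip(xs, ys):
        residual = y - (intercept + slope * x)
        leverage = 1.0 / n + (x - mx) ** 2 / sxx
        dfbeta = (x - mx) * residual / (sxx * (1.0 - leverage))
        slopes.append(slope - dfbeta)
    return slopes


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.9, 5.4, 6.6, 9.5, 10.7, 13.6, 14.5, 17.3]

refit = loo_slopes_by_refitting(x, y)
shortcut = loo_slopes_by_dfbeta(x, y)
max_gap = max(abs(a - b) for a, b in zip(refit, shortcut))
print(f"largest difference between refit and DFBETA shortcut: {max_gap:.2e}")
```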

Example: deep vein thrombosis study

How do shrinkage effects of different methods compare?

• $\lambda$

too pessimistic

too optimistic

From bias reduction to shrinkage and beyond

Joint work with Rainer Puhr, Angelika Geroldinger, Sander Greenland

Setting the scene

Notation: $\beta$ (regression coefficients), $\pi$ (event probabilities)

Firth's penalization for logistic regression

$L^*(\beta) = L(\beta) \det(I(\beta))^{1/2}$,

where $I(\beta)$ is the Fisher information matrix and $L(\beta)$ is the likelihood.

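Firth's penalization for logistic regression

$L^*(\beta) = L(\beta) \det(X^t W X)^{1/2}$

$W = \mathrm{diag}\left(\mathrm{expit}(X_i\beta)\,(1 - \mathrm{expit}(X_i\beta))\right) = \mathrm{diag}\left(\pi_i (1 - \pi_i)\right)$

• $W$ is maximized at $\pi_i = \frac{1}{2}$, i.e. at $\beta = 0$

• The penalty therefore pulls predicted probabilities towards $\frac{1}{2}$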
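
A minimal sketch of the penalized likelihood for the simplest case, an intercept-only model, where $X$ is a column of ones and $X^t W X = n\pi(1-\pi)$. The counts are hypothetical, and the grid search only stands in for a proper optimizer. For this special case the maximizer has the closed form $(y + 1/2)/(n + 1)$, which the sketch checks numerically.

```python
import math


def penalized_loglik(pi, events, n):
    """log L*(pi) = log L(pi) + 0.5 * log det(X'WX), intercept-only model."""
    loglik = events * math.log(pi) + (n - events) * math.log(1.0 - pi)
    information = n * pi * (1.0 - pi)   # X'WX when X is a column of ones
    return loglik + 0.5 * math.log(information)


events, n = 3, 20                       # hypothetical counts
grid = [k / 10000 for k in range(1, 10000)]

pi_ml = events / n
pi_firth = max(grid, key=lambda p: penalized_loglik(p, events, n))
closed_form = (events + 0.5) / (n + 1)

print(f"ML estimate:          {pi_ml:.4f}")
print(f"penalized (grid):     {pi_firth:.4f}")
print(f"(events+0.5)/(n+1):   {closed_form:.4f}")
```

The penalized estimate lies between the ML estimate and $1/2$, in line with the penalty being largest at $\pi = 1/2$.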

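Firth's Logistic regression

Adding $1/2$ to each cell of a $2 \times 2$ table:

Event rate: $2/50 = 0.04$ (observed); $3/52 \approx 0.058$ (augmented data)

Odds ratio: $11$ (ML); $9.89$ (Firth)

Average predicted probability: $0.054$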
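
The figures on this slide can be reproduced with a short calculation. The cell counts below do not appear in the transcript. They are an assumption inferred from the reported numbers (2 events among 50 observations and an odds ratio of 11), as is the reading that Firth's correction in a saturated $2 \times 2$ table amounts to adding $1/2$ to every cell.

```python
def summarize(events, nonevents):
    """Odds ratio and group-wise event probabilities for two groups
    (unexposed, exposed)."""
    (e0, e1), (m0, m1) = events, nonevents
    odds_ratio = (e1 * m0) / (e0 * m1)
    return odds_ratio, e0 / (e0 + m0), e1 / (e1 + m1)


# Assumed cell counts (inferred, not taken from the slide).
events, nonevents = (1, 1), (44, 4)
group_sizes = (45, 5)
n_total = sum(group_sizes)

or_ml, _, _ = summarize(events, nonevents)

# Add 1/2 to every cell.
events_aug = tuple(e + 0.5 for e in events)
nonevents_aug = tuple(m + 0.5 for m in nonevents)
or_firth, p0_firth, p1_firth = summarize(events_aug, nonevents_aug)

rate_observed = sum(events) / n_total
rate_augmented = sum(events_aug) / (sum(events_aug) + sum(nonevents_aug))
# Average predicted probability over the 50 original observations.
mean_predicted = (group_sizes[0] * p0_firth + group_sizes[1] * p1_firth) / n_total

print(f"odds ratio: ML {or_ml:.2f}, with 1/2 added {or_firth:.2f}")
print(f"event rate: observed {rate_observed:.3f}, augmented {rate_augmented:.3f}")
print(f"average predicted probability: {mean_predicted:.3f}")
```

The average predicted probability exceeds the observed event rate of 0.04, which illustrates the bias in predictions towards $1/2$.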

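Example of Greenland 2010

Observed data: row totals 320 and 32; column totals 346 and 6; $N = 352$

Augmented data: row totals 321 and 33; column totals 346.5 and 6.5; $N = 354$

Event rate: $32/352 = 0.091$ (observed); $33/354 = 0.093$ (augmented)

Odds ratio: $2.03$ (observed); $2.73$ (augmented)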

Greenland example: likelihood, prior, posterior

Bayesian non-collapsibility: anti-shrinkage from penalization

An even more extreme example from Greenland 2010

• $\beta_1 = 0$

Table margins: row totals 30 and 6; column totals 30 and 6; $N = 36$
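
Both Greenland tables can be checked by hand. The inner cell counts are not in the transcript. The values used below are assumptions inferred from the margins together with the reported odds ratio of 2.03 for the first table and $\beta_1 = 0$ (odds ratio 1) for the second. Adding $1/2$ to each cell is used as a stand-in for Firth's correction in a saturated $2 \times 2$ table.

```python
def odds_ratio(a, b, c, d, add=0.0):
    """Odds ratio a*d / (b*c) for the table [[a, b], [c, d]] after adding
    `add` to every cell."""
    a, b, c, d = (cell + add for cell in (a, b, c, d))
    return (a * d) / (b * c)


# Assumed inner cells (inferred from margins and reported odds ratios).
tables = {
    "Greenland 2010": (315, 5, 31, 1),   # margins 320/32 and 346/6
    "more extreme": (25, 5, 5, 1),       # margins 30/6 and 30/6
}

results = {}
for name, cells in tables.items():
    results[name] = (odds_ratio(*cells), odds_ratio(*cells, add=0.5))
    print(f"{name}: ML OR = {results[name][0]:.2f}, "
          f"OR with 1/2 added = {results[name][1]:.2f}")
```

In both tables the corrected odds ratio is further from 1 than the ML odds ratio, which is the anti-shrinkage discussed on the surrounding slides.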

Simulating the example of Greenland

Table margins: row totals 320 and 32; column totals 346 and 6; $N = 352$

[Figure: simulated estimates of $\beta_1$, with $-\infty$ marked on the axis]

logF(1,1) prior (Greenland and Mansournia, 2015)

$L^*(\beta) = L(\beta) \cdot \prod_j \frac{e^{\beta_j/2}}{1 + e^{\beta_j}}$.
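
A small sketch of the penalty factor in the formula above, for a single coefficient. It checks three properties: the log penalty is symmetric around zero, it peaks at $\beta_j = 0$, and it equals the log-likelihood of a pseudo-observation with half a success and half a failure at probability $\mathrm{expit}(\beta_j)$. The function names are hypothetical, and the pseudo-observation reading is offered as an interpretation, not as the authors' implementation.

```python
import math


def logf11_log_penalty(beta):
    """log of e^(beta/2) / (1 + e^beta)."""
    return beta / 2.0 - math.log(1.0 + math.exp(beta))


def half_half_pseudo_loglik(beta):
    """Log-likelihood of 1/2 success and 1/2 failure at p = expit(beta)."""
    p = 1.0 / (1.0 + math.exp(-beta))
    return 0.5 * math.log(p) + 0.5 * math.log(1.0 - p)


betas = [-3.0, -1.0, 0.0, 1.0, 3.0]
penalties = [logf11_log_penalty(b) for b in betas]
gap = max(abs(logf11_log_penalty(b) - half_half_pseudo_loglik(b)) for b in betas)

for b, pen in zip(betas, penalties):
    print(f"beta = {b:+.1f}: log penalty = {pen:.4f}")
```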

Other, more subtle occurrences of Bayesian non-collapsibility

Simulation of bivariable log reg models

• $X_1, X_2 \sim \mathrm{Bin}(0.5)$, $r = 0.8$, $n = 50$

• $\beta_1 = 1.5$, $\beta_2 = 0.1$, $\lambda$

[Figure: panels for $\beta_1$ and $\beta_2$; axis $\lambda$]

Anti-shrinkage from penalization?

Reason for anti-shrinkage

Example of Greenland 2010 revisited

Observed data: row totals 320 and 32; column totals 346 and 6; $N = 352$

Second table: row totals 321 and 33; bottom row 347, 7, 352

FLAC: Firth's Logistic regression with Added Covariate

Firth-type modified score equations:

$\sum_{i=1}^{N} \left[ (y_i - \pi_i) x_{ir} + h_i \left(\tfrac{1}{2} - \pi_i\right) x_{ir} \right] = 0; \quad r = 0, \dots, p$

where the $h_i$ are the diagonal elements of the hat matrix $H = W^{1/2} X (X'WX)^{-1} X' W^{1/2}$.

The equations can be rewritten as

$\sum_{i=1}^{N} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{N} h_i \left(\tfrac{1}{2} - \pi_i\right) x_{ir} =$

$= \sum_{i=1}^{N} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (1 - y_i - \pi_i) x_{ir} = 0$

FLAC: Firth's Logistic regression with Added Covariate

$\sum_{i=1}^{N} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (1 - y_i - \pi_i) x_{ir} = 0$

The second and third sums each carry the weight $h_i/2$.

FLAC: Firth's Logistic regression with Added Covariate

FLIC
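
The split of the Firth-type score function into three sums can be read as data augmentation: each observation is kept with weight 1 and joined by two pseudo-records of weight $h_i/2$, one with the observed outcome and one with the flipped outcome. The sketch below only verifies this algebraic identity for a few hypothetical values of $(y_i, \pi_i, h_i, x_{ir})$. It does not fit a model, and it does not include the added covariate that gives FLAC its name.

```python
def firth_score_term(y, pi, h, x):
    """Contribution of one observation to the Firth-type modified score."""
    return (y - pi) * x + h * (0.5 - pi) * x


def augmented_records(y, h):
    """Original record (weight 1) plus two pseudo-records of weight h/2:
    one with the observed outcome y, one with the flipped outcome 1 - y."""
    return [(y, 1.0), (y, h / 2.0), (1 - y, h / 2.0)]


def weighted_ml_score_term(y, pi, h, x):
    """Ordinary ML score contribution summed over the weighted records."""
    return sum(w * (y_rec - pi) * x for y_rec, w in augmented_records(y, h))


# Hypothetical values, chosen only to exercise the identity.
cases = [
    (0, 0.10, 0.02, 1.0),
    (1, 0.10, 0.02, 1.0),
    (0, 0.75, 0.30, -2.5),
    (1, 0.75, 0.30, -2.5),
]
max_gap = max(abs(firth_score_term(*c) - weighted_ml_score_term(*c)) for c in cases)
print(f"largest difference between the two forms: {max_gap:.2e}")
```

Each observation contributes a total weight of $1 + h_i$ to the augmented data, so the augmented sample is larger than the original one by the sum of the $h_i$.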

Simulation study: the set-up

Other methods for accurate prediction

$L^*(\beta) = L(\beta) \det(X^t W X)^{\tau}$, $\tau = 0.1$
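
The effect of the exponent $\tau$ can be seen in closed form for an intercept-only model, which is an illustrative special case and not part of the slides. There $X^t W X = n\pi(1-\pi)$, and setting the derivative of $y \log\pi + (n-y)\log(1-\pi) + \tau\log(\pi(1-\pi))$ to zero gives $\pi = (y + \tau)/(n + 2\tau)$. The counts below are hypothetical.

```python
def intercept_only_estimate(events, n, tau):
    """Maximizer of L(pi) * (n * pi * (1 - pi)) ** tau, intercept-only model."""
    return (events + tau) / (n + 2.0 * tau)


events, n = 3, 20   # hypothetical counts
estimates = {tau: intercept_only_estimate(events, n, tau) for tau in (0.0, 0.1, 0.5)}

for tau, estimate in estimates.items():
    print(f"tau = {tau:.1f}: estimated probability = {estimate:.4f}")
```

With $\tau = 0$ the estimate is the ML estimate, with $\tau = 1/2$ it is the Firth-type estimate, and $\tau = 0.1$ lies in between, so the pull towards $1/2$ is weaker.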

Cauchy priors (CP)

• bayesglm (R package arm)

Simulation results

• $\beta$

• $\pi$

Predictions: bias RMSE

Comparison

FLAC

Bayesian methods (CP, logF)

Ridge

Confidence intervals

• a-priori

• ($\beta \pm 1.96\, SE$)

Conclusion

Part 1: Prediction under model uncertainty

Part 2: Prediction under sparsity (fixed model)
