What You See May Not Be What You Get: A Primer on Regression Artifacts

What You See May Not Be What You Get: A Primer on Regression Artifacts Michael A. Babyak, PhD Duke University Medical Center


Transcript of What You See May Not Be What You Get: A Primer on Regression Artifacts

Page 1: What You See May Not Be What You Get:  A Primer on Regression Artifacts

What You See May Not Be What You Get:

A Primer on Regression Artifacts

Michael A. Babyak, PhD

Duke University Medical Center

Page 2: What You See May Not Be What You Get:  A Primer on Regression Artifacts
Page 3: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Topics to Cover

1. Models: what and why?
2. Preliminaries: requirements for a good model
3. Dichotomizing a graded or continuous variable is dumb
4. Using degrees of freedom wisely
5. Covariate selection
6. Transformations and smoothing techniques for non-linear effects
7. Resampling as a superior method of model validation

Page 4: What You See May Not Be What You Get:  A Primer on Regression Artifacts

What is a model ?

$Y = f(x_1, x_2, x_3, \dots, x_n)$

$Y = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$

$Y = e^{a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n}$

Page 5: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Why Model? (instead of test)

1. Can capture theoretical/predictive system

2. Estimates of population parameters

3. Allows prediction as well as hypothesis testing

4. More information for replication

Page 6: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Preliminaries

1. Correct model
2. Measure well and don’t throw information away
3. Adequate sample size

Page 7: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Correct Model

• Gaussian: General Linear Model
  – Multiple linear regression
• Binary (or ordinal): Generalized Linear Model
  – Logistic regression
  – Proportional odds/ordinal logistic
• Time to event:
  – Cox regression
• Distribution of predictors generally not important

Page 8: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Measure well and don’t throw information away

• Reliable, interpretable
• Use all the information about the variables of interest
• Don’t create “clinical cutpoints” before modeling
• Model with ALL the data first, then use prediction to make decisions about cutpoints

Page 9: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomizing for Convenience Can Destroy a Model

Page 10: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Implausible measurement assumption

[Figure: depression score scale from 0 to 44, cut into “not depressed” and “depressed”, with three scores marked A, B, and C.]

Page 11: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomization, by definition, reduces power by a minimum of about 30%

http://psych.colorado.edu/~mcclella/MedianSplit/
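The size of this loss can be checked with a short simulation. For a normally distributed predictor split at its median, a standard result is that its correlation with the outcome shrinks by a factor of sqrt(2/pi), roughly 0.80, so only about 64% of r² is retained. The sketch below compares that factor with a simulated median split. The sample size, seed, and true slope are illustrative assumptions, not values from the talk.

```python
import math
import random

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Theoretical attenuation of r for a median split of a normal predictor.
attenuation = math.sqrt(2 / math.pi)

rng = random.Random(1)   # fixed seed (assumption) so the run is repeatable
n = 50000                # illustrative sample size
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.4 * xi + rng.gauss(0, 1) for xi in x]   # illustrative true slope

# Median split: 1 above the sample median, 0 otherwise.
median = sorted(x)[n // 2]
x_split = [1.0 if xi > median else 0.0 for xi in x]

ratio = pearson_r(x_split, y) / pearson_r(x, y)
print(f"theoretical ratio {attenuation:.3f}, simulated ratio {ratio:.3f}")
print(f"share of r^2 retained after the split: {ratio ** 2:.2f}")
```

The simulated ratio should land close to the theoretical one; the exact figure depends on the seed and on how far the predictor departs from normality.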

Page 12: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomization, by definition, reduces power by a minimum of about 30%

Dear Project Officer,

In order to facilitate analysis and interpretation, we have decided to throw away about 30% of our data. Even though this will waste about 3 or 4 hundred thousand dollars worth of subject recruitment and testing money, we are confident that you will understand.

Sincerely,

Dick O. Tomi, PhD
Prof. Richard Obediah Tomi, PhD

Page 13: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Examples from the WCGS Study: Correlations with CHD Mortality (n = 750)

                          Continuous      Dichotomized at median   Reduction
Variable                  r      r2       r      r2                in r2
Systolic Blood Pressure   .15    .023     .12    .014              -39%
Hostility                 .15    .023     .08    .006              -74%
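The "Reduction in r2" column is the relative change from the continuous r² to the dichotomized r². A minimal sketch of that arithmetic, using the rounded r² values printed in the table (the function name is hypothetical):

```python
def percent_change_in_r2(r2_continuous, r2_dichotomized):
    """Relative change in r^2 after dichotomizing, in percent."""
    return 100.0 * (r2_dichotomized - r2_continuous) / r2_continuous

# Rounded r^2 values as printed on the slide.
rows = [("Systolic Blood Pressure", 0.023, 0.014),
        ("Hostility", 0.023, 0.006)]

for name, r2_cont, r2_dich in rows:
    print(f"{name}: {percent_change_in_r2(r2_cont, r2_dich):.0f}%")
```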

Page 14: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomizing does not reduce measurement error

Gustafson, P. and Le, N.D. (2001). A comparison of continuous and discrete measurement error: is it wise to dichotomize imprecise covariates? Submitted. Available at http://www.stat.ubc.ca/people/gustaf.

Page 15: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation: Dichotomizing makes matters worse when measure is unreliable

[Path diagram: X1 → Y, path = .4. True Model: X1 continuous]

Page 16: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation: Dichotomizing makes matters worse when measure is unreliable

[Path diagram: X1 → Y, path = .4. Same Model with X1 dichotomized]

Page 17: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation: Dichotomizing makes matters worse when measure is unreliable

[Path diagram: X1 → Y, path = .4, for continuous and dichotomized X1. Models with reliability of X1 manipulated: reliability = .65, .75, .85, 1.00]

Page 18: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomization of a variable measured with error (y = .4x + e)

[Figure: % correct rejections of null hypothesis (y-axis, 50 to 100) by reliability of x (1.00, 0.85, 0.75, 0.65). Series: continuous x.]

Page 19: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomization of a variable measured with error (y = .4x + e)

[Figure: % correct rejections of null hypothesis (y-axis, 50 to 100) by reliability of x (1.00, 0.85, 0.75, 0.65). Series: continuous x, dichotomized x.]
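A simulation of this kind can be sketched in a few lines. The version below generates y = .4x + e, degrades x to a chosen reliability, and counts how often the null hypothesis of zero correlation is rejected with x kept continuous versus split at its median. The sample size, number of replications, error variance, seed, and the normal-approximation critical value are all assumptions made for illustration; the talk does not state the settings behind its figure.

```python
import math
import random

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rejects_null(x, y, crit=1.96):
    """Two-sided test of zero correlation (normal approximation to the t cutoff)."""
    n = len(x)
    r = pearson_r(x, y)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) > crit

def power_by_reliability(reliability, n=50, reps=500, seed=7):
    """Share of replications rejecting the null: (continuous x, dichotomized x)."""
    rng = random.Random(seed)
    # Noise SD chosen so that var(true x) / var(observed x) equals the reliability.
    noise_sd = math.sqrt((1 - reliability) / reliability)
    hits_cont = hits_dich = 0
    for _ in range(reps):
        x_true = [rng.gauss(0, 1) for _ in range(n)]
        y = [0.4 * xt + rng.gauss(0, 1) for xt in x_true]
        x_obs = [xt + rng.gauss(0, noise_sd) for xt in x_true]
        median = sorted(x_obs)[n // 2]
        x_dich = [1.0 if xo > median else 0.0 for xo in x_obs]
        hits_cont += int(rejects_null(x_obs, y))
        hits_dich += int(rejects_null(x_dich, y))
    return hits_cont / reps, hits_dich / reps

results = {rel: power_by_reliability(rel) for rel in (1.00, 0.85, 0.75, 0.65)}
for rel, (p_cont, p_dich) in results.items():
    print(f"reliability {rel:.2f}: continuous {p_cont:.2f}, dichotomized {p_dich:.2f}")
```

Under these assumed settings the dichotomized predictor rejects the null less often at every reliability level, and both versions lose power as reliability falls.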

Page 20: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomizing will obscure non-linearity

Dichotomized at Median (CES-D = 7)

[Figure: percent with wall motion abnormality (y-axis, 0 to 30) for Not Depressed versus Depressed.]

Page 21: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Dichotomizing will obscure non-linearity

WMA on at Least 1 Task, Using Cubic Spline

[Figure: probability of WMA (y-axis, 0.0 to 1.0) versus CES-D score (x-axis, 0 to 40).]

Page 22: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation 2: Dichotomizing a continuous predictor that is correlated with another predictor

[Path diagram: X1 → Y, path = .4; X2 → Y, path = .0. X1 and X2 continuous]

Page 23: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation 2: Dichotomizing a continuous predictor that is correlated with another predictor

[Path diagram: X1 → Y, path = .4; X2 → Y, path = .0. X1 dichotomized]

Page 24: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation 2: Dichotomizing a continuous predictor that is correlated with another predictor

[Path diagram: X1 → Y, path = .4; X2 → Y, path = .0; correlation between X1 and X2 = .0, .4, .7. X1 dichotomized; rho12 manipulated]

Page 25: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation 2: Dichotomizing a continuous predictor that is correlated with another predictor

[Figure: incorrect rejections of X2 = 0 (%, y-axis 0 to 4.5) by correlation between x1 and x2 (0, 0.4, 0.7). Series: X1 and X2 continuous.]

Page 26: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation 2: Dichotomizing a continuous predictor that is correlated with another predictor

[Figure: incorrect rejections of X2 = 0 (%, y-axis 0 to 30) by correlation between x1 and x2 (0, 0.4, 0.7). Series: both continuous; x1 dichotomous, x2 continuous.]
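The mechanism behind Simulation 2 can be reproduced with a small sketch: Y depends only on X1, X2 is correlated with X1 but has a true coefficient of zero, and the model Y ~ X1 + X2 is fit with X1 either continuous or split at its median. The sample size, replication count, error variance, seed, and the 1.96 cutoff are assumptions for illustration, so the rates will not match the slide's figure exactly.

```python
import math
import random

def t_for_b2(x1, x2, y):
    """t statistic for the x2 slope in the OLS fit y ~ x1 + x2 (closed form)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    syy = sum(v * v for v in cy)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = syy - b1 * s1y - b2 * s2y          # residual sum of squares
    se_b2 = math.sqrt(rss / (n - 3) * s11 / det)
    return b2 / se_b2

def false_positive_rates(rho, n=100, reps=400, seed=11, crit=1.96):
    """Share of samples wrongly rejecting b2 = 0: (x1 continuous, x1 dichotomized)."""
    rng = random.Random(seed)
    fp_cont = fp_dich = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 correlates rho with x1 but has no effect of its own on y.
        x2 = [rho * a + math.sqrt(1 - rho * rho) * rng.gauss(0, 1) for a in x1]
        y = [0.4 * a + rng.gauss(0, 1) for a in x1]
        median = sorted(x1)[n // 2]
        x1_dich = [1.0 if a > median else 0.0 for a in x1]
        fp_cont += int(abs(t_for_b2(x1, x2, y)) > crit)
        fp_dich += int(abs(t_for_b2(x1_dich, x2, y)) > crit)
    return fp_cont / reps, fp_dich / reps

rates = {rho: false_positive_rates(rho) for rho in (0.0, 0.4, 0.7)}
for rho, (cont, dich) in rates.items():
    print(f"rho = {rho:.1f}: continuous {cont:.3f}, x1 dichotomized {dich:.3f}")
```

When X1 is dichotomized, the part of X1 that the split throws away is still correlated with X2, so X2 picks up some of X1's effect and is declared "significant" far more often than the nominal rate as the correlation grows.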

Page 27: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Is it ever a good idea to categorize quantitatively measured variables?

• Yes:
  – when the variable is truly categorical
  – for descriptive/presentational purposes
  – for hypothesis testing, if enough categories are made.
• However, using many categories can lead to problems of multiple significance tests and still run the risk of misclassification

Page 28: What You See May Not Be What You Get:  A Primer on Regression Artifacts

CONCLUSIONS

• Cutting:
  – Doesn’t always make measurement sense
  – Almost always reduces power
  – Can fool you with too much power in some instances
  – Can completely miss important features of the underlying function
• Modern computing/statistical packages can “handle” continuous variables
• Want to make good clinical cutpoints? Model first, cut later.

Page 29: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Clinical Events and LVEF Change during Mental Stress: 5 Year follow-up

Model first, cut later

[Figure: Prob{event} (y-axis) versus Maximum Change in LVEF (%) (x-axis).]

Page 30: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Requirements: Sample Size

• Linear regression
  – Minimum of N = 50 + 8 per predictor (Green, 1990)
• Logistic regression
  – Minimum of N = 10-15 per predictor among smallest group (Peduzzi et al., 1990a)
• Survival analysis
  – Minimum of N = 10-15 per predictor (Peduzzi et al., 1990b)
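These rules of thumb are easy to turn into a quick planning check. The helper names below are hypothetical, and the example counts are made up for illustration.

```python
def min_n_linear(n_predictors):
    """Linear regression rule from the slide: N of at least 50 + 8 per predictor."""
    return 50 + 8 * n_predictors

def max_predictors_by_smallest_group(n_events, n_nonevents, per_predictor=10):
    """Logistic rule from the slide: 10 to 15 subjects per predictor,
    counted in the smaller of the two outcome groups."""
    return min(n_events, n_nonevents) // per_predictor

# Hypothetical study: 6 candidate predictors; 45 events among 500 subjects.
print("linear model, 6 predictors, minimum N:", min_n_linear(6))
print("logistic, 10 per predictor:", max_predictors_by_smallest_group(45, 455))
print("logistic, 15 per predictor:", max_predictors_by_smallest_group(45, 455, 15))
```

The point of the second helper is that the total N is not what limits a logistic model; the smaller outcome group is.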

Page 31: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Concept of Simulation

[Diagram: from the model Y = b X + error, draw k samples and estimate b in each: bs1, bs2, bs3, bs4, …, bsk-1, bsk.]

Page 32: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Concept of Simulation

[Diagram: from the model Y = b X + error, draw k samples and estimate b in each (bs1, bs2, bs3, bs4, …, bsk-1, bsk), then Evaluate the collection of estimates.]

Page 33: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation Example

[Diagram: from the model Y = .4 X + error, draw k samples and estimate b in each: bs1, bs2, bs3, bs4, …, bsk-1, bsk.]

Page 34: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation Example

[Diagram: from the model Y = .4 X + error, draw k samples and estimate b in each (bs1, bs2, bs3, bs4, …, bsk-1, bsk), then Evaluate the collection of estimates.]
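The simulation idea in these diagrams can be sketched directly: fix a true model, draw many samples from it, estimate the slope in each sample, and then look at how the estimates behave. The sample size, number of samples, error variance, and seed below are illustrative assumptions.

```python
import random
import statistics

def ols_slope(x, y):
    """Least-squares slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_slopes(true_b=0.4, n=100, k=2000, seed=3):
    """Draw k samples of size n from Y = true_b * X + error; return the k slope estimates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(k):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_b * xi + rng.gauss(0, 1) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

slopes = simulate_slopes()
mean_b = statistics.mean(slopes)
sd_b = statistics.stdev(slopes)
print(f"mean of estimates {mean_b:.3f}, SD of estimates {sd_b:.3f}")
```

The "Evaluate" step is whatever summary answers the question at hand: here the mean shows whether the estimator is centered on the true value, and the SD shows how much a single study's estimate can wander.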

Page 35: What You See May Not Be What You Get:  A Primer on Regression Artifacts

True Model: Y = .4*x1 + e

[Figure: histogram of the value of beta for x1 (x-axis, 0.2 to 0.6) against frequency of beta value (y-axis, 0 to 2500).]

Page 36: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Sample Size

• Linear regression
  – Minimum of N = 50 + 8 per predictor (Green, 1990)
• Logistic regression
  – Minimum of N = 10-15 per predictor among smallest group (Peduzzi et al., 1990a)
• Survival analysis
  – Minimum of N = 10-15 per predictor (Peduzzi et al., 1990b)

Page 37: What You See May Not Be What You Get:  A Primer on Regression Artifacts

All-noise, but good fit

[Figure: density (y-axis, 0 to 16) of R-Square from Full Model (x-axis, 0.0 to 1.0) for n/p ≈ 3, n/p ≈ 6.6, n/p = 10, n/p ≈ 13.3.]
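Why pure noise can look like a good fit follows from a standard result: in ordinary least squares with an intercept, p predictors, and normal errors, the expected R² when no predictor is related to Y is p / (n - 1). The sketch below tabulates that value for several n/p ratios. The number of predictors (p = 15) and the sample sizes are assumptions chosen only to give ratios near those on the slide; the talk does not state them.

```python
def expected_r2_pure_noise(n, p):
    """Expected R^2 when y is unrelated to all p predictors (intercept included)."""
    return p / (n - 1)

p = 15  # hypothetical number of predictors
for n in (50, 100, 150, 200):
    ratio = n / p
    print(f"n/p = {ratio:4.1f}   expected R^2 from noise alone = "
          f"{expected_r2_pure_noise(n, p):.2f}")
```

With few subjects per predictor, a sizeable R² is the expected outcome of fitting noise, not evidence of a real relation.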

Page 38: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation: number of events/predictor ratio

Y = .5*x1 + 0*x2 + .2*x3 + 0*x4

-- Where the correlation between x1 and x4 = .4

-- N/p = 3, 5, 10, 20, 50

Page 39: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Parameter stability and n/p ratio

[Figure: four panels (x1, x2, x3, x4) showing the density (y-axis, 0 to 8) of the parameter estimate (x-axis, -2.0 to 2.0) for n/p = 3, 5, 10, 20, 50.]

Page 40: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Peduzzi’s Simulation: number of events/predictor ratio

P(survival) =a + b1*NYHA + b2*CHF + b3*VES+b4*DM + b5*STD + b6*HTN + b7*LVC

--Events/p = 2, 5, 10, 15, 20, 25

--% relative bias = ((estimated b - true b)/true b)*100
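Percent relative bias expresses the gap between the estimate and the true coefficient as a percentage of the true coefficient. A minimal sketch, with a hypothetical function name and made-up example values:

```python
def percent_relative_bias(estimated_b, true_b):
    """Difference between estimate and truth, as a percentage of the truth."""
    return (estimated_b - true_b) / true_b * 100.0

# Made-up example: the true coefficient is 0.5 and the sample estimate is 0.6.
print(round(percent_relative_bias(0.6, 0.5), 1))
```

A positive value means the coefficient is overestimated; a value above 100 means the estimate is more than double the true coefficient.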

Page 41: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation results: number of events/predictor ratio

[Figure: % relative bias (y-axis, -20 to 50) by events per variable (2, 5, 10, 15, 20, 25). Series: NYHA, CHF, VES, DM, STD, HTN, LVC.]

Page 42: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation results: number of events/predictor ratio

[Figure: proportion with bias > 100% (y-axis, 0 to 0.7) by events per variable (2, 5, 10, 15, 20, 25). Series: NYHA, CHF, VES, DM, STD, HTN, LVC.]

Page 43: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Predictor (covariate) selection

1. Theory, substantive knowledge, prior models

2. Testing for confounding

3. Univariate testing

4. Last (and least), automated methods, aka stepwise and best subset regression

Page 44: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Searching for Confounders

• Fundamental tension between underfitting and overfitting
• Underfitting = not adjusting for important confounders
• Overfitting = capitalizing on chance relations (sample fluctuation)

Page 45: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Covariate selection

• Overfitting has been studied extensively

• “Scariest” study is by Faraway (1992)—showed that any pre-modeling strategy cost a df over and above df used later in modeling.

• Premodeling strategies included: variable selection, outlier detection, linearity tests, residual analysis.

Page 46: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Covariate selection

• Therefore, if you transform, select, etc., you must include the DF in (i.e., penalize for) the “Final Model”

Page 47: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Covariate selection: Univariate Testing

• Non-significant tests also cost a DF
• Variables may not behave the same way in a multivariable model: a variable “not significant” at univariate test may be very important in the presence of other variables

Page 48: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Covariate selection

• Despite the convention, testing for confounding has not been systematically studied—likely leads to overadjustment and underestimate of true effect of variable of interest.

• At the very least, pulling variables in and out of models inflates the Type I error rate, sometimes dramatically
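The inflation is easy to see in the simplest data-driven strategy, screening candidates one at a time. If 20 candidate predictors are all pure noise and each is tested at alpha = .05, the chance that at least one looks "significant" is 1 - .95^20, about 64%. The sketch below computes that figure and checks it by simulation. It illustrates univariate screening only, not the specific procedures studied in the cited papers; the sample size, replication count, seed, and the 1.96 cutoff are assumptions.

```python
import math
import random

def chance_of_false_hit(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent null tests is 'significant'."""
    return 1 - (1 - alpha) ** n_tests

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def count_noise_hits(rng, n=100, n_candidates=20, crit=1.96):
    """Screen n_candidates noise predictors against a noise outcome; count 'significant' ones."""
    y = [rng.gauss(0, 1) for _ in range(n)]
    hits = 0
    for _ in range(n_candidates):
        x = [rng.gauss(0, 1) for _ in range(n)]
        r = pearson_r(x, y)
        t = r * math.sqrt((n - 2) / (1 - r * r))
        if abs(t) > crit:
            hits += 1
    return hits

rng = random.Random(5)
reps = 300
any_hit = sum(1 for _ in range(reps) if count_noise_hits(rng) > 0) / reps
print(f"analytic chance of at least one false hit: {chance_of_false_hit(20):.2f}")
print(f"simulated share of samples with a false hit: {any_hit:.2f}")
```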

Page 58: What You See May Not Be What You Get:  A Primer on Regression Artifacts

SOME of the problems with stepwise variable selection.

1. It yields R-squared values that are badly biased high
2. The F and chi-squared tests quoted next to each variable on the printout do not have the claimed distribution
3. The method yields confidence intervals for effects and predicted values that are falsely narrow (See Altman and Anderson Stat in Med)
4. It yields P-values that do not have the proper meaning and the proper correction for them is a very difficult problem
5. It gives biased regression coefficients that need shrinkage (the coefficients for remaining variables are too large; see Tibshirani, 1996).
6. It has severe problems in the presence of collinearity
7. It is based on methods (e.g. F tests for nested models) that were intended to be used to test pre-specified hypotheses.
8. Increasing the sample size doesn't help very much (see Derksen and Keselman)
9. It allows us to not think about the problem
10. It uses a lot of paper

Page 59: What You See May Not Be What You Get:  A Primer on Regression Artifacts

“I now wish I had never written the stepwise selection code for SAS.”
--Frank Harrell, author of forward and backwards selection algorithm for SAS PROC REG

Page 60: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Automated Selection: Derksen and Keselman (1992) Simulation Study

• Studied backward and forward selection

• Some authentic variables and some noise variables among candidate variables

• Manipulated correlation among candidate predictors

• Manipulated sample size

Page 61: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Automated Selection: Derksen and Keselman (1992) Simulation Study

• “The degree of correlation between candidate predictors affected the frequency with which the authentic predictors found their way into the model.”

• “The greater the number of candidate predictors, the greater the number of noise variables were included in the model.”

• “Sample size was of little practical importance in determining the number of authentic variables contained in the final model.”

Page 62: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation results: Number of Noise Variables Included

20 candidate predictors; 100 samples

[Figure: % of samples (y-axis, 0 to 35) by variables in final model (x-axis, 0 to 7). Series by sample size: 100, 200, 500, 1000, 10000.]

Page 63: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation results: R-Square From Noise Variables

20 candidate predictors; 100 samples

[Figure: % of samples (y-axis, 0 to 100) by % variance explained (0, 0-5, 5-10, 10-15, 15-20, 20-25, > 25). Series by sample size: 100, 200, 500, 1000, 10000.]

Page 64: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Simulation results: R-Square From Noise Variables

20 candidate predictors; 100 samples

[Figure: R-Square (y-axis, 0 to 0.3) across samples (deciles). Series by sample size: 10,000, 1,000, 500, 200, 100.]

Page 65: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Variable Selection

• Pick variables a priori
• Stick with them
• Penalize appropriately for any data-driven decision about how to model a variable

Page 66: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Spending DF wisely

• Select variables of most importance
• Use DF to assess non-linearity using flexible curve approach (more about this later)
• If not enough N/predictor, combine covariates using techniques that do not look at Y in the sample: PCA, FA, conceptual clustering, collapsing, scoring, established indexes, propensity scores.

Page 67: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Can use data to determine where to spend DF

• Use Spearman’s Rho to test “importance”

• Not peeking because we have chosen to include the term in the model regardless of relation to Y

• Use more DF for non-linearity

Page 68: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Example-Predict Survival from age, gender, and fare on Titanic

Page 69: What You See May Not Be What You Get:  A Primer on Regression Artifacts

If you have already decided to include them (and promise to keep them in the model) you can peek at predictors in order to see where to add complexity

Page 70: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Spearman Test

[Figure: dot chart of adjusted rho^2 (x-axis, 0.0 to 0.25) for each predictor.]

Predictor   N      df
age         1046   1
fare        1308   1
sex         1309   1

Page 71: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Non-linearity using splines

Page 72: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Linear Spline (piecewise regression)

[Figure: Y (y-axis, 0 to 2.5) versus X (x-axis, 0 to 25), fitted with straight-line segments.]

Y = a + b1(x<10) + b2(10<x<20) + b3 (x >20)
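One common way to fit a piecewise-linear curve like this is with "hinge" terms, which keep the segments joined at the knots. The sketch below uses knots at 10 and 20 to match the slide's segments; the parameterization, the function names, and the coefficient values are assumptions for illustration and are not taken from the talk.

```python
def linear_spline_basis(x, knots=(10.0, 20.0)):
    """Return [x, (x - k1)+, (x - k2)+]: a continuous piecewise-linear basis."""
    return [float(x)] + [max(0.0, x - k) for k in knots]

def predict(x, intercept, coefs, knots=(10.0, 20.0)):
    """Evaluate intercept + sum of coefficient * basis term."""
    basis = linear_spline_basis(x, knots)
    return intercept + sum(c * b for c, b in zip(coefs, basis))

# Hypothetical coefficients: the slope is 0.10 below 10,
# 0.10 - 0.05 between 10 and 20, and 0.10 - 0.05 + 0.08 above 20.
coefs = (0.10, -0.05, 0.08)
for x in (5, 10, 15, 20, 25):
    print(x, round(predict(x, 0.5, coefs), 3))
```

In this form each added coefficient is a change in slope at its knot, so testing whether those coefficients are zero is a test of linearity.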

Page 73: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Cubic Spline (non-linear piecewise regression)

[Figure: Y (y-axis, 0 to 2.5) versus X, fitted with a smooth curve; the knots are marked.]
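A restricted (natural) cubic spline is a cubic spline constrained to be linear beyond the outer knots, which keeps the tails from behaving wildly. The sketch below builds such a basis from the truncated-power formula. The knot locations are hypothetical, and dividing by the squared knot range is a scaling convention assumed here so the nonlinear terms stay on roughly the scale of x.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis: x itself plus len(knots) - 2 nonlinear terms."""
    k = len(knots)
    t_last, t_prev = knots[-1], knots[-2]
    scale = (knots[-1] - knots[0]) ** 2   # assumed scaling convention

    def cube_plus(u):
        """Truncated cubic: u^3 when u is positive, otherwise 0."""
        return u ** 3 if u > 0 else 0.0

    terms = [float(x)]
    for j in range(k - 2):
        t_j = knots[j]
        term = (cube_plus(x - t_j)
                - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
                + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev))
        terms.append(term / scale)
    return terms

knots = (5.0, 15.0, 30.0)   # hypothetical knot locations
for x in (2.0, 10.0, 20.0, 35.0, 40.0, 45.0):
    print(x, [round(v, 3) for v in rcs_basis(x, knots)])
```

With 3 knots the basis has two columns, so the predictor costs 2 degrees of freedom: one linear and one nonlinear. Below the first knot the nonlinear term is zero, and above the last knot it grows linearly.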

Page 74: What You See May Not Be What You Get:  A Primer on Regression Artifacts

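# Logistic regression model (lrm) with a 3-knot restricted cubic spline (rcs) for fare
# and all two-way interactions among fare, age, and sex
fitfare <- lrm(survived ~ (rcs(fare, 3) + age + sex)^2, x = TRUE, y = TRUE)

# Wald tests for each factor, including nonlinear and interaction terms
anova(fitfare)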

Page 75: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Wald Statistics          Response: survived

Factor                                        Chi-Square  d.f.  P
fare (Factor+Higher Order Factors)              55.1       6    <.0001
  All Interactions                              13.8       4    0.0079
  Nonlinear (Factor+Higher Order Factors)       21.9       3    0.0001
age (Factor+Higher Order Factors)               22.2       4    0.0002
  All Interactions                              16.7       3    0.0008
sex (Factor+Higher Order Factors)              208.7       4    <.0001
  All Interactions                              20.2       3    0.0002
fare * age (Factor+Higher Order Factors)         8.5       2    0.0142
  Nonlinear                                      8.5       1    0.0036
  Nonlinear Interaction : f(A,B) vs. AB          8.5       1    0.0036
fare * sex (Factor+Higher Order Factors)         6.4       2    0.0401
  Nonlinear                                      1.5       1    0.2153
  Nonlinear Interaction : f(A,B) vs. AB          1.5       1    0.2153
age * sex (Factor+Higher Order Factors)          9.9       1    0.0016
TOTAL NONLINEAR                                 21.9       3    0.0001
TOTAL INTERACTION                               24.9       5    0.0001
TOTAL NONLINEAR + INTERACTION                   38.3       6    <.0001
TOTAL                                          245.3       9    <.0001

Page 80: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Predictors of Survival on Titanic

[Figure: effect estimates with 0.95 confidence bars (x-axis, 0.50 to 12.00) for fare - 31:7.9, age - 39:21, and sex - female:male. Adjusted to: fare=14 age=28 sex=male]

Page 81: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Fare and Age Interaction

[Figure: surface of Prob. of Survival (0 to 1) over Fare (0 to 250) and age (10 to 60). Adjusted to: sex=male]

Page 82: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Fare and Gender Interaction

[Figure: Prob. of Survival (y-axis, 0.2 to 1.0) versus Fare (x-axis, 0 to 300), with separate curves for female and male. Adjusted to: age=28]

Page 83: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Validation

• Apparent
  – too optimistic
• Internal
  – cross-validation, bootstrap
  – honest estimate for model performance
  – provides an upper limit to what would be found on external validation
• External validation
  – replication with new sample, different circumstances

Page 84: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Validation

• Steyerberg et al. (1999) compared validation methods

• Found that split-half was far too conservative

• Bootstrap was equal or superior to all other techniques

Page 85: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Bootstrap

[Diagram: from My Sample, draw k resamples WITH REPLACEMENT, compute the estimate in each (1, 2, 3, 4, …, k-1, k), then Evaluate the collection of estimates.]

Page 86: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Sample: 1, 3, 4, 5, 7, 10

Resamples drawn with replacement:
7, 1, 1, 4, 5, 10
10, 3, 2, 2, 2, 1
3, 5, 1, 4, 2, 7
2, 1, 1, 7, 2, 7
4, 4, 1, 4, 2, 10
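Drawing a bootstrap resample means sampling from the observed data with replacement until the resample is as large as the original sample, so some values repeat and others drop out. A minimal sketch using the first line of the slide as the sample; the seed and the number of resamples are arbitrary assumptions.

```python
import random

sample = [1, 3, 4, 5, 7, 10]   # the sample shown on the slide
rng = random.Random(2024)      # fixed seed (assumption) so the run is repeatable

def bootstrap_resample(data, rng):
    """Draw len(data) values from data with replacement."""
    return rng.choices(data, k=len(data))

resamples = [bootstrap_resample(sample, rng) for _ in range(5)]
for rs in resamples:
    print(rs, "mean =", round(sum(rs) / len(rs), 2))
```

Each resample gives its own value of the statistic (here the mean); the spread of those values across resamples is what the bootstrap uses to judge stability.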

Page 87: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Bootstrap Validation

Index       Training   Corrected
Dxy         0.6565     0.646
R2          0.4273     0.407
Intercept   0.0000     -0.011
Slope       1.0000     0.952
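The "Corrected" column comes from subtracting an estimate of optimism from the apparent (training) index. One widely used recipe is: refit the model in each bootstrap resample, measure its performance in that resample and again in the original data, and average the difference. The sketch below applies that recipe to R² for a one-predictor linear model on simulated data. It is an illustration under those assumptions only; the slide's indexes came from the author's own model and software and are not reproduced here.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def r_squared(x, y, intercept, slope):
    """R^2 of a given line evaluated on the data (x, y)."""
    my = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst

def optimism_corrected_r2(x, y, n_boot=500, seed=0):
    """Return (apparent R^2, apparent R^2 minus average bootstrap optimism)."""
    rng = random.Random(seed)
    n = len(x)
    apparent = r_squared(x, y, *fit_line(x, y))
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample rows with replacement
        bx = [x[i] for i in idx]
        by = [y[i] for i in idx]
        a, b = fit_line(bx, by)
        # Performance where the model was fit, minus performance on the original data.
        optimism += r_squared(bx, by, a, b) - r_squared(x, y, a, b)
    return apparent, apparent - optimism / n_boot

# Simulated data (assumption): 40 cases from y = .4x + e.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(40)]
y = [0.4 * xi + data_rng.gauss(0, 1) for xi in x]

apparent, corrected = optimism_corrected_r2(x, y)
print(f"apparent R^2 {apparent:.3f}, optimism-corrected R^2 {corrected:.3f}")
```

The corrected value is the more honest guess at how the model would perform on new data from the same population, and the gap between the two numbers grows as the sample shrinks or the model gets more complex.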

Page 88: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• Think about your model
• Collect enough data

Page 89: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• Measure well
• Don’t destroy what you’ve measured

Page 90: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• Pick your variables ahead of time and collect enough data to test the model you want
• Keep all your variables in the model unless extremely unimportant

Page 91: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• Use more df on important variables, fewer df on “nuisance” variables
• Don’t peek at Y to combine, discard, or transform variables

Page 92: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• Estimate validity and shrinkage with bootstrap

Page 93: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Summary

• By all means, tinker with the model later, but be aware of the costs of tinkering
• Don’t forget to say you tinkered
• Go collect more data

Page 94: What You See May Not Be What You Get:  A Primer on Regression Artifacts

Web links for references, software, and more

• Harrell’s regression modeling text
  – http://hesweb1.med.virginia.edu/biostat/rms/
• SAS Macros for spline estimation
  – http://hesweb1.med.virginia.edu/biostat/SAS/survrisk.txt
• Some results comparing validation methods
  – http://hesweb1.med.virginia.edu/biostat/reports/logistic.val.pdf
• SAS code for bootstrap
  – ftp://ftp.sas.com/pub/neural/jackboot.sas
• S-Plus home page
  – insightful.com
• Mike Babyak’s e-mail
  – [email protected]
• This presentation
  – http://www.duke.edu/~mbabyak