Model-based vs. non-parametric estimators of net survival · 2016. 6. 3. · Paul Dickman...


Model-based vs. non-parametric estimators of net survival

Paul W Dickman¹, Paul C Lambert¹,²

¹ Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

² Department of Health Sciences, University of Leicester, UK

EPAAC WP9 Satellite Meeting: State of Art of Methods for the Analysis of Population-Based Cancer Data

23 January 2014


Which approach should I use to estimate net survival?

A common question in our teaching.

Unless there is reason to prefer a cause-specific approach, we recommend one of the following:

– Ederer II
– Pohar Perme
– Model-based

The choice of method depends on the research question and practical considerations.

We were recently critical [1] of a paper [2] that advocated the Pohar Perme approach; our criticism was of that particular paper and not the Pohar Perme approach per se, of which we are great admirers.


We are not suggesting Ederer II is superior

We are not suggesting that the Ederer II approach is superior to the Pohar Perme approach.

However, we argue that it is not as inferior as some others (e.g., Roche et al [2]) would have us believe.

We do not agree with Roche et al [2] that “In estimating net survival, cancer registries should abandon all classical methods and adopt the new Pohar-Perme estimator” because “great errors may occur ...”.

Internally standardised, or age-specific, Ederer II estimates of 5-year and 10-year net survival are biased, although the bias is generally so small that it makes no practical difference.

In short, don’t panic if you have used, or are using, Ederer II.


All methods require assumptions

1. Conditional independence between cancer and non-cancer mortality. That is, there are no factors associated with both cancer and non-cancer mortality other than those factors that have been controlled for in the estimation (e.g., via stratification, regression modelling or appropriate weighting).

2. The estimates of expected mortality represent the mortality that would have been experienced by the cancer patients if they were not diagnosed with cancer.

3. Administrative censoring is non-informative, or an appropriate adjustment is applied.


If interest is in a summary of net survival for all patients with a particular cancer

The Pohar Perme estimator was designed specifically for this type of application.

A model-based approach can also give an estimate with minimal bias, but why bother?


Simulation Study

Expected survival from the UK general population.

Assume a linear association (HR=1.03) between age and net mortality, with the effect constant throughout follow-up. In another scenario (not shown here), the effect of age was restricted to the early follow-up years.

Only considered Danieli’s mechanism 1 and not informative administrative censoring (Danieli’s mechanism 2). This is because the second mechanism will affect all estimators and we are primarily interested in differences between the estimators due to mechanism 1.

500 data sets simulated with 15,000 patients in each.
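
As a rough illustration of this kind of set-up, the sketch below generates one data set in which each patient has a latent time to cancer death and a latent time to death from other causes. Everything numeric here is an assumption made for illustration and is not taken from the slides: the uniform age distribution, the constant baseline excess hazard, the Gompertz population mortality (the study used UK population mortality), and the reading of HR=1.03 as a log hazard ratio of 0.03 per year of age.

```python
import math
import random

LOG_HR_PER_YEAR = 0.03               # assumed reading of HR = 1.03 per year of age
BASE_EXCESS_RATE = 0.04              # hypothetical excess hazard at age 35 (per year)
GOMPERTZ_A, GOMPERTZ_B = 5e-5, 0.09  # hypothetical population mortality parameters


def excess_rate(age):
    """Excess (cancer) hazard, log-linear in age and constant over follow-up."""
    return BASE_EXCESS_RATE * math.exp(LOG_HR_PER_YEAR * (age - 35.0))


def simulate_patient(rng):
    """Return (age at diagnosis, follow-up time in years, died_of_cancer)."""
    age = rng.uniform(35.0, 95.0)  # hypothetical age distribution
    t_cancer = rng.expovariate(excess_rate(age))
    # Other-cause death: invert the Gompertz cumulative hazard
    # (a / b) * exp(b * age) * (exp(b * t) - 1) at a unit-exponential draw.
    draw = rng.expovariate(1.0)
    scale = GOMPERTZ_A * math.exp(GOMPERTZ_B * age)
    t_other = math.log(1.0 + GOMPERTZ_B * draw / scale) / GOMPERTZ_B
    return age, min(t_cancer, t_other), t_cancer <= t_other


def simulate_dataset(n_patients, seed):
    """Simulate one data set; a different seed gives a different replicate."""
    rng = random.Random(seed)
    return [simulate_patient(rng) for _ in range(n_patients)]


def true_net_survival(age, t):
    """Net survival implied by the assumed excess hazard."""
    return math.exp(-excess_rate(age) * t)


dataset = simulate_dataset(15_000, seed=1)
print(len(dataset), sum(died for _, _, died in dataset))
```

Repeating `simulate_dataset` with 500 different seeds would give replicates of the size described above, each of which could then be passed to the estimators being compared.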


Simulation Study - True values of relative survival

Age        1 year   5 years   10 years   15 years
35         92.4     83.8      77.9       73.7
45         89.9     78.8      71.4       66.2
55         86.6     72.5      63.5       57.3
65         82.4     64.8      54.1       47.2
75         77.0     55.7      43.7       36.3
85         70.2     45.3      32.7       25.4
95         62.0     34.4      22.1       15.7

Internal   81.2     63.3      52.8       46.1

The bias in Ederer II will be proportional to the size of the association between relative survival and age. This scenario represents a relatively large association between relative survival and age (compared to what one typically observes for cancer).
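
The age pattern in the table can be checked with a short calculation. Under proportional excess hazards, relative survival at age a equals relative survival at a reference age raised to the power of the hazard ratio between the two ages. The sketch below assumes that the age-35 row is the reference and that HR=1.03 corresponds to a log hazard ratio of 0.03 per year of age; both are assumptions, not statements from the slides.

```python
import math

# True relative survival (%) at age 35, copied from the table above.
REFERENCE_AGE = 35
REFERENCE_SURVIVAL = {1: 92.4, 5: 83.8, 10: 77.9, 15: 73.7}
LOG_HR_PER_YEAR = 0.03  # assumption: HR = 1.03 read as exp(0.03) per year of age


def relative_survival(age, years):
    """Relative survival (%) at `age`, scaled from the age-35 row."""
    hazard_ratio = math.exp(LOG_HR_PER_YEAR * (age - REFERENCE_AGE))
    return 100.0 * (REFERENCE_SURVIVAL[years] / 100.0) ** hazard_ratio


for age in (45, 55, 65, 75, 85, 95):
    row = [round(relative_survival(age, years), 1) for years in (1, 5, 10, 15)]
    print(age, row)
```

With these assumptions the computed values should agree with the tabulated rows to within a few tenths of a percentage point, the remaining differences being attributable to rounding in the age-35 row.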


Results - 5-year survival for all ages

[Figure: age-standardized relative survival (%) at 5 years for each estimator; axis ticks from 61 to 65.]

Estimator                   Bias     MSE     Coverage
Pohar Perme                 -0.06    0.271   95.6
Ederer II (all ages)         0.76    0.793   65.6
Ederer II (standardized)     0.11    0.248   94.6
Model-based (grouped)        0.57    0.536   79.2
Model-based (continuous)     0.03    0.219   94.4
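
The slides do not define the reported performance measures. The sketch below uses the standard definitions and should be read under that assumption: bias is the mean estimate minus the true value, MSE is the mean squared deviation from the true value, and coverage is the percentage of data sets whose nominal 95% confidence interval contains the true value. The Wald-type interval on the estimate scale and the replicate values are hypothetical; only the true value of 63.3 comes from the table of true values.

```python
def performance(estimates, std_errors, true_value, z=1.96):
    """Bias, mean squared error and CI coverage (%) across simulated data sets."""
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((est - true_value) ** 2 for est in estimates) / n
    # A data set counts as covered when the true value lies within est +/- z * se.
    covered = sum(
        abs(est - true_value) <= z * se
        for est, se in zip(estimates, std_errors)
    )
    return bias, mse, 100.0 * covered / n


# Hypothetical estimates (%) from four replicates; the study used 500.
estimates = [63.0, 63.8, 62.9, 63.5]
std_errors = [0.5, 0.5, 0.5, 0.5]
bias, mse, coverage = performance(estimates, std_errors, true_value=63.3)
print(f"bias={bias:.2f} mse={mse:.3f} coverage={coverage:.1f}")
```

Because MSE combines squared bias and variance, an estimator with a small bias can still have the lowest MSE if its estimates vary less from one data set to the next.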


Results - 10-year survival for all ages

[Figure: age-standardized relative survival (%) at 10 years for each estimator; axis ticks from 50 to 56.]

Estimator                   Bias     MSE     Coverage
Pohar Perme                 -0.07    0.869   95.0
Ederer II (all ages)         1.57    2.772   20.2
Ederer II (standardized)     0.22    0.452   94.2
Model-based (grouped)        0.88    1.122   71.0
Model-based (continuous)     0.09    0.440   96.4


Results - 15-year survival for all ages

[Figure: age-standardized relative survival (%) at 15 years for each estimator; axis ticks from 40 to 60.]

Estimator                   Bias     MSE     Coverage
Pohar Perme                 -0.27    4.392   93.6
Ederer II (all ages)         2.28    5.599    5.8
Ederer II (standardized)     0.21    0.896   93.6
Model-based (grouped)        1.04    1.685   73.6
Model-based (continuous)     0.07    0.807   95.2


Comments

The Pohar Perme approach provides an internally age-standardised estimate of marginal net survival, and it does it optimally.

Other than Ederer II applied to all ages, the other approaches also provide internally age-standardised estimates of net survival.


Comparing survival between populations

Using the Pohar Perme estimator we can obtain unbiased estimates of net survival for patients diagnosed with breast cancer in Norway and in the UK.

What if we want to compare them?

Age-standardisation is a common approach and can be implemented non-parametrically or using a model.

What if we want to understand reasons for the differences?
– Are the differences consistent for all ages?
– Are the differences consistent across follow-up?

The following graphs were based on a model; could do similar using a non-parametric approach, but modelling has advantages.
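
Age-standardisation, as mentioned above, amounts to a weighted average of age-specific estimates using the same weights for every population being compared. The sketch below shows direct standardisation; the age groups, survival values and weights are hypothetical and chosen only for illustration.

```python
def age_standardise(survival_by_group, weights):
    """Weighted average of age-specific net survival using common weights."""
    if set(survival_by_group) != set(weights):
        raise ValueError("age groups and weights must match")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[g] * survival_by_group[g] for g in weights)
    return weighted_sum / total_weight


# Hypothetical 5-year net survival by age group in two populations.
population_a = {"15-44": 0.90, "45-54": 0.88, "55-64": 0.85, "65-74": 0.80, "75+": 0.70}
population_b = {"15-44": 0.89, "45-54": 0.86, "55-64": 0.82, "65-74": 0.76, "75+": 0.62}

# Hypothetical standard weights, shared by both populations.
standard = {"15-44": 0.07, "45-54": 0.12, "55-64": 0.23, "65-74": 0.29, "75+": 0.29}

print(round(age_standardise(population_a, standard), 3))
print(round(age_standardise(population_b, standard), 3))
```

Internal standardisation would instead use each population's own age distribution as the weights, which is why internally standardised estimates are not directly comparable between populations with different age structures.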


Relative Survival for England and Norway [3]

[Figure: relative survival (y-axis, 0.4 to 1.0) by years from diagnosis (x-axis, 0 to 8), in six panels for ages 35, 45, 55, 65, 75 and 85.]


Excess Mortality Rate Ratios (England/Norway)

[Figure: excess mortality rate ratio (y-axis, 1 to 3) by years from diagnosis (x-axis, 0 to 8), in six panels for ages 35, 45, 55, 65, 75 and 85.]


The model

H(t) = H*(t) + Λ(t)

Model on the ln[Λ(t)] scale, which includes terms for:

– Baseline hazard (time): splines (6 parameters)
– Country: 1 dummy covariate
– Age: splines (4 parameters)
– Age×Country: 4 parameters
– Country×Time: splines (3 parameters)
– Age×Time: 4×3 = 12 parameters

Results extremely robust to number and locations of the knots.
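
For readers less familiar with the notation, the relations behind this model can be written out as follows. This is a sketch in standard relative survival notation, not a transcription of the fitted model: H is the all-cause cumulative hazard, H* the expected (population) cumulative hazard, Λ the cumulative excess hazard, and x the covariates. The symbols s_0, s_1, s_2, s_3, β, c (country indicator) and a (age) are introduced here for illustration, and the exact spline bases are not stated on the slide.

```latex
\begin{align*}
H(t \mid x) &= H^{*}(t \mid x) + \Lambda(t \mid x) \\
R(t \mid x) &= \frac{S(t \mid x)}{S^{*}(t \mid x)}
             = \frac{\exp\{-H(t \mid x)\}}{\exp\{-H^{*}(t \mid x)\}}
             = \exp\{-\Lambda(t \mid x)\} \\
\lambda(t \mid x) &= \frac{d}{dt}\,\Lambda(t \mid x) \\
\ln \Lambda(t \mid x) &= s_0(t) + \beta\,c + s_1(a) + c\,s_2(a) + c\,s_3(t)
             + \{\text{age} \times \text{time terms}\}
\end{align*}
```

The terms listed on the slide add up to 6 + 1 + 4 + 4 + 3 + 12 = 30 parameters. The excess mortality rate ratios shown for England and Norway are ratios of λ between the two countries at a given age and time since diagnosis.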


Localised colon carcinoma in Finland 1985–1994

In a clinical setting, interest is often in prediction for patients with specific characteristics (e.g., of a particular age).

Using the Pohar Perme approach, the estimated 5-year net survival for 60-year-old males is 0.94.

Some researchers would argue that this estimate is preferred over all other approaches for estimating net survival since the other approaches are known to be biased.

Let’s look (on the next slide) at the Pohar Perme and model-based estimates for a range of ages.

As an aside, the Ederer II estimates are identical to the Pohar Perme estimates, but comparing non-parametric estimators is not the focus of this talk.


Non-parametric and model-based estimates of 5-year net survival

Age    PP      Model
60     0.94    0.85
61     0.61    0.84
62     0.77    0.84
63     0.70    0.83
64     0.73    0.82
65     0.78    0.82
66     0.87    0.81
67     0.83    0.80
68     0.86    0.80
69     0.77    0.79


Modelling assumptions give lower SEs

Age    PP (SE)        Model (SE)
60     0.94 (0.07)    0.85 (0.02)
61     0.61 (0.09)    0.84 (0.02)
62     0.77 (0.09)    0.84 (0.02)
63     0.70 (0.09)    0.83 (0.02)
64     0.73 (0.08)    0.82 (0.02)
65     0.78 (0.09)    0.82 (0.02)
66     0.87 (0.07)    0.81 (0.02)
67     0.83 (0.07)    0.80 (0.02)
68     0.86 (0.09)    0.80 (0.02)
69     0.77 (0.09)    0.79 (0.02)


Non-parametric and model-based estimates

I prefer the model-based estimate. The survival of 60-year-olds should not be markedly different from that of 61-year-olds.

Proponents of a non-parametric approach argue that it is free of assumptions and therefore preferable.

I believe the assumptions made in the model-based approach are appropriate, and incorporating them into the analysis leads to more appropriate estimates of survival for each age.

By introducing an assumption, we are reducing variance by ‘borrowing strength’ from the surrounding ages.

Could apply the non-parametric approach to a broader age group (e.g., patients aged 60–69), but doesn’t this imply adding an assumption on the interpretation (even if we don’t make an assumption in the estimation)?
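
The variance reduction from ‘borrowing strength’ can be illustrated numerically with the age-specific Pohar Perme estimates and standard errors tabulated earlier. The sketch below pools them with inverse-variance weights, which assumes the ten estimates are independent and share a common true value across ages 60 to 69. This is an illustration of the principle only; it is neither the model used in the talk nor a substitute for applying the estimator to the pooled patient data.

```python
import math

# (age, Pohar Perme estimate, standard error), copied from the table of
# 5-year net survival estimates with standard errors.
PP_ESTIMATES = [
    (60, 0.94, 0.07), (61, 0.61, 0.09), (62, 0.77, 0.09), (63, 0.70, 0.09),
    (64, 0.73, 0.08), (65, 0.78, 0.09), (66, 0.87, 0.07), (67, 0.83, 0.07),
    (68, 0.86, 0.09), (69, 0.77, 0.09),
]


def pooled_estimate(rows):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / se ** 2 for _, _, se in rows]
    total = sum(weights)
    mean = sum(w * est for w, (_, est, _) in zip(weights, rows)) / total
    return mean, math.sqrt(1.0 / total)


mean, se = pooled_estimate(PP_ESTIMATES)
print(f"pooled estimate {mean:.2f}, standard error {se:.3f}")
```

The pooled standard error is far smaller than any of the single-age standard errors (0.07 to 0.09), at the price of assuming that survival does not vary with age within the group. A model with a smooth age effect sits between these two extremes.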


There is no single correct model

Age    PP      Model 1   Model 2   Model 3
60     0.94    0.85      0.94      0.82
61     0.61    0.84      0.61      0.82
62     0.77    0.84      0.77      0.82
63     0.70    0.83      0.70      0.82
64     0.73    0.82      0.73      0.82
65     0.78    0.82      0.78      0.82
66     0.87    0.81      0.87      0.82
67     0.83    0.80      0.83      0.82
68     0.86    0.80      0.86      0.82
69     0.77    0.79      0.77      0.82

Parallels between models 2 and 3 and analogous non-parametric approaches; same assumptions and identical estimates.

How we specify and interpret covariate effects is often more important, and is relevant independent of approach.


Modelling is a powerful tool and requires skill

The utility of a model-based approach requires fitting an appropriate model:

– Choice of covariates
– Functional form for metric covariates
– Interactions and how to parameterise them

Decisions on which interactions to include and how to parameterise them (spline-spline interactions require some thought) should often be based on subject matter considerations rather than statistical significance.


A comment from Riccardo Capocaccia

A well-constructed model does not assure to provide fitted survival estimates close to the empirical ones for all combinations of covariate values. Differences between fitted and empirical survival can be due to random variability as well as to ‘true’ effects. For instance, a model applied to European data might not show a sudden increase of survival for colon cancer in a single country given by the introduction of mass screening.

I agree. But I see this as fundamental to the role of statistical modelling and the skills required to perform it.


Goal in modelling

Examining the empirical estimates for a large number of covariate patterns does not provide a good basis for scientific inference.

Our goal in modelling is to fit a model that is sufficiently simple that it provides a basis for scientific inference, while at the same time being sufficiently complex that it does not obscure important effects or otherwise produce misleading results.


References

[1] Dickman PW, Lambert PC, Coviello E, Rutherford MJ. Estimating net survival in population-based cancer studies. Int J Cancer 2013;133:519–21.

[2] Roche L, Danieli C, Belot A, Grosclaude P, Bouvier AM, Velten M, et al. Cancer net survival on registry data: Use of the new unbiased Pohar-Perme estimator and magnitude of the bias with the classical methods. Int J Cancer 2012;132:2359–69.

[3] Lambert PC, Holmberg L, Sandin F, Bray F, Linklater KM, Purushotham A, et al. Quantifying differences in breast cancer survival between England and Norway. Cancer Epidemiology 2011;35:526–533.
