Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women Michael Elliott Xiaobi Huang...

Post on 26-Dec-2015


Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women

Michael Elliott, Xiaobi Huang, Sioban Harlow

University of Michigan School of Public Health
September 30, 2008

Outline

• Has the onset of menopause changed since the early 20th century?
  • Tremin I: U. Minnesota undergraduates, 1930s.
  • Tremin II: U. Minnesota undergraduates, 1960s.

• Develop statistical model for menstrual cycle length.
  • Bayesian methods
  • Apply to complete data from Tremin I.

• Future work
  • Accounting for missing data (hormone, dropout).
  • Including Tremin II data.
  • Relating to existing suggestions for markers of the final menstrual period (FMP): 60 days, 90 days, etc.

Modeling Menstrual Cycle Length: Observed Data

Characterized by:
• Stable trend during a woman’s 20s and 30s
• “Breakdown” several months to several years before FMP
  • Increase in variability
  • Increase in mean length

Modeling Menstrual Cycle Length: Observed Data

Easier to see on a log scale:

Modeling Menstrual Cycle Length: Linear Changepoint Model

There appears to be a common pattern to how menstrual cycle length changes over age.

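• A linear changepoint model:

Can be implemented as a linear spline with one changepoint θ:

\[ E(Y \mid a) = \alpha + \beta a + \gamma (a - \theta)_+ , \]

where

\[ (x)_+ = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases} \]

Modeling Menstrual Cycle Length: Linear Changepoint Model

Different slopes and intercepts either side of θ:

\[ E(Y \mid a) = \alpha + \beta a ; \quad a \le \theta , \]
\[ E(Y \mid a) = (\alpha - \gamma\theta) + (\beta + \gamma) a ; \quad a > \theta , \]

But means converge at the “knot” of θ:

\[ E(Y \mid a = \theta^-) = \alpha + \beta\theta ; \]
\[ E(Y \mid a = \theta^+) = (\alpha - \gamma\theta) + (\beta + \gamma)\theta = \alpha + \beta\theta \]

As $a \to \theta$, $E(Y \mid a^-) \to E(Y \mid a^+)$.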
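The spline form and the continuity at the knot can be checked numerically. The following is a minimal sketch, not the authors' code: the names `alpha`, `beta`, `gamma` are assumed labels for the intercept, the slope before the knot, and the coefficient on the positive-part term (the change in slope at the knot), and all numeric values are hypothetical.

```python
import numpy as np

def positive_part(x):
    """Return (x)_+, which is x where x > 0 and 0 otherwise."""
    return np.maximum(x, 0.0)

def changepoint_mean(age, alpha, beta, gamma, theta):
    """Linear spline in age with a single knot (changepoint) at theta."""
    return alpha + beta * age + gamma * positive_part(age - theta)

# Hypothetical illustrative values, not estimates from the talk.
alpha, beta, gamma, theta = 3.3, -0.01, 0.25, 47.0

# The mean just below and just above the knot should agree (continuity).
eps = 1e-9
left = changepoint_mean(theta - eps, alpha, beta, gamma, theta)
right = changepoint_mean(theta + eps, alpha, beta, gamma, theta)

# The slope is beta before the knot and beta + gamma after it.
slope_before = changepoint_mean(41.0, alpha, beta, gamma, theta) - changepoint_mean(40.0, alpha, beta, gamma, theta)
slope_after = changepoint_mean(51.0, alpha, beta, gamma, theta) - changepoint_mean(50.0, alpha, beta, gamma, theta)
```

Writing the model with a positive-part term, rather than as two separate regression lines, is what guarantees that the two segments meet at θ without any extra constraint.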

Modeling Menstrual Cycle Length: Linear Changepoint Model

Variance can be modeled via linear changepoint model, just like the mean.

Note that the changepoint(s) θ are estimated from the data, not fixed in advance.

Modeling Menstrual Cycle Length: Hierarchical Model

Despite the general overall pattern being the same, women have unique ages when their cycle lengths begin to change, as well as different means and variances at “baseline”.

This suggests constructing a hierarchical model in which women will have unique parameters governing mean and variance changepoint models.

Modeling Menstrual Cycle Length: Hierarchical Model

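Start at age 35 and take log of cycle length to improve normal approximation.

Linear changepoint for both mean and variance:

\[
\begin{aligned}
\log(y_{it}) \mid \mu_{it}, \sigma^2_{it} &\sim N(\mu_{it}, \sigma^2_{it}) \\
\mu_{it} &= \alpha^{\mu}_i + \beta^{\mu}_i (a_{it} - 35) + \gamma^{\mu}_i (a_{it} - \theta^{\mu}_i)_+ \\
\sigma^2_{it} &= \exp\left( \alpha^{\sigma}_i + \beta^{\sigma}_i (a_{it} - 35) + \gamma^{\sigma}_i (a_{it} - \theta^{\sigma}_i)_+ \right)
\end{aligned}
\]

where $y_{it}$ is the length in days of the tth menstrual cycle for the ith woman, $i = 1, \ldots, N$, $a_{it}$ is her age in years at the start of her tth cycle, and $(x)_+ = x$ if $x > 0$ and $0$ if $x \le 0$.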
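To make the structure of this model concrete, here is a minimal simulation sketch for a single woman. It assumes the model has the form described on this slide (normal log cycle length, with a changepoint spline for the mean and for the log of the variance); the function name, the parameter tuples, and every numeric value are hypothetical and chosen only for illustration.

```python
import numpy as np

def positive_part(x):
    """Return (x)_+, which is x where x > 0 and 0 otherwise."""
    return np.maximum(x, 0.0)

def simulate_log_cycle_lengths(ages, mean_par, var_par, rng):
    """Simulate log cycle lengths for one woman.

    mean_par and var_par are (intercept, slope, positive-part coefficient,
    changepoint) for the mean and for the log-variance, respectively.
    """
    a_m, b_m, g_m, t_m = mean_par
    a_v, b_v, g_v, t_v = var_par
    mu = a_m + b_m * (ages - 35.0) + g_m * positive_part(ages - t_m)
    sigma2 = np.exp(a_v + b_v * (ages - 35.0) + g_v * positive_part(ages - t_v))
    log_y = rng.normal(mu, np.sqrt(sigma2))
    return log_y, mu, sigma2

rng = np.random.default_rng(0)
ages = np.linspace(35.0, 55.0, 241)     # hypothetical ages at the start of each cycle
mean_par = (3.3, 0.0, 0.15, 47.0)       # hypothetical values
var_par = (-4.7, 0.05, 0.3, 45.0)       # hypothetical values
log_y, mu, sigma2 = simulate_log_cycle_lengths(ages, mean_par, var_par, rng)
cycle_days = np.exp(log_y)              # back-transform to days
```

Modeling the log of the variance, rather than the variance itself, keeps the variance positive for any parameter values, which is why the exponential appears in the variance equation.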

Modeling Menstrual Cycle Length: Hierarchical Model

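Each of the individual parameters is then assumed to follow a common distribution:

\[ \boldsymbol{\phi}_i \stackrel{ind}{\sim} N_8(B x_i, \Sigma), \]

where $\boldsymbol{\phi}_i = (\alpha^{\mu}_i, \beta^{\mu}_i, \gamma^{\mu}_i, \theta^{\mu}_i, \alpha^{\sigma}_i, \beta^{\sigma}_i, \gamma^{\sigma}_i, \theta^{\sigma}_i)^T$.
• Allows information about cycle parameters to be shared across women.
• Accommodates “within-woman” correlation in cycle lengths.
• Relates cycle parameters to baseline covariates $x_i$ via regression coefficients $B$.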
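The population level of the hierarchy can be sketched as a draw of each woman's eight cycle parameters from a multivariate normal whose mean depends on her baseline covariates. This is only an illustration under assumptions: the coefficient matrix `B`, the covariance `Sigma`, the single made-up covariate, and all numeric values are hypothetical, and a diagonal covariance is used purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

n_par, n_cov = 8, 2   # 8 cycle parameters; covariates = intercept + one baseline covariate

# Hypothetical regression coefficients: column 0 is the population mean of each
# parameter, column 1 is the effect of the (made-up) baseline covariate.
B = np.zeros((n_par, n_cov))
B[:, 0] = [3.3, 0.0, 0.15, 47.0, -4.7, 0.05, 0.3, 45.0]
B[3, 1] = 0.1         # covariate shifts the mean changepoint
B[7, 1] = 0.2         # covariate shifts the variance changepoint

# Hypothetical between-woman covariance (diagonal only for simplicity).
Sigma = np.diag([0.01, 1e-4, 0.01, 4.0, 0.25, 1e-3, 0.01, 4.0])

# Covariate vectors for 5 hypothetical women: a leading 1 plus one covariate.
x = np.column_stack([np.ones(5), rng.normal(size=5)])

# One 8-vector of subject-level parameters per woman.
phi = np.array([rng.multivariate_normal(B @ x_i, Sigma) for x_i in x])
```

Because all women share `B` and `Sigma`, a woman with few observed cycles near her changepoint borrows information from the rest of the cohort.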

Bayesian Models

Considering a model of this form is easier from a Bayesian perspective.

• Classical or “frequentist” approach to statistics considers observed data y to be random, governed by fixed (unknown) parameters θ.
  • Determine the joint distribution of $y_1, \ldots, y_n \sim f(y; \theta)$ and consider it as a function of θ: $L(\theta; y) = f(y; \theta)$.
  • Point estimate of θ made by maximizing $L(\theta; y)$ or $l(\theta; y) = \ln L(\theta; y)$.
  • Inference about θ made by considering repeated sampling properties of y for different θ.
  • Ex: 95% confidence interval: over hypothetical repeated samples, 95% of such intervals will contain θ.

Bayesian Models

• Bayesian statistics also models $y \sim f(y; \theta)$, but considers θ to have a probability distribution of its own, $p(\theta)$.

• Prior information contained in $p(\theta)$ is updated from data y to obtain a posterior distribution of θ:

\[ p(\theta \mid y) = \frac{f(y, \theta)}{p(y)} = \frac{f(y \mid \theta)\, p(\theta)}{p(y)} \propto L(\theta; y)\, p(\theta) \]

• Hierarchical models model prior parameters with “hyperprior” distributions $p(\eta)$:

\[ \theta \mid \eta \sim p(\theta \mid \eta) \]

\[ p(\theta \mid y) \propto L(\theta; y) \int p(\theta \mid \eta)\, p(\eta)\, d\eta \]
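The update "posterior is proportional to likelihood times prior" can be illustrated on a one-parameter toy problem. The sketch below assumes normal data with a known standard deviation and a normal prior on the mean; the data values and prior settings are hypothetical. It evaluates the unnormalized posterior on a grid and compares the resulting posterior mean with the closed-form conjugate answer.

```python
import numpy as np

# Hypothetical log cycle lengths and an assumed known standard deviation.
y = np.array([3.25, 3.31, 3.36, 3.28, 3.33])
sigma = 0.1
prior_mean, prior_sd = 3.0, 1.0        # weakly informative prior (assumed)

# Unnormalized log posterior on a grid: log L(theta; y) + log p(theta).
theta = np.linspace(2.0, 4.5, 25001)
log_lik = -0.5 * ((y[:, None] - theta[None, :]) ** 2).sum(axis=0) / sigma**2
log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
log_post = log_lik + log_prior

# Normalize numerically so the density integrates to 1 over the grid.
dtheta = theta[1] - theta[0]
post = np.exp(log_post - log_post.max())
post /= post.sum() * dtheta
grid_mean = (theta * post).sum() * dtheta

# Closed-form posterior mean for the normal-normal conjugate pair.
precision = 1.0 / prior_sd**2 + len(y) / sigma**2
exact_mean = (prior_mean / prior_sd**2 + y.sum() / sigma**2) / precision
```

With a prior this diffuse relative to the data, the posterior mean sits very close to the sample mean, which is the sense in which a weakly informative prior lets the data dominate.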

Bayesian Models

• Prior distributions $p(\theta)$ or hyperprior distributions $p(\eta)$ encode prior knowledge about parameters, but can be chosen to be very weakly informative if little prior information is available, or if it is to be ignored.

• Here $p(B, \Sigma) \sim \text{Inv-Wishart}(\Sigma; 1, I)$ for the population-level regression coefficients $B$ and covariance matrix $\Sigma$.

• Modern computational techniques such as Markov Chain Monte Carlo allow complex models such as the one used here to be fit (relatively) painlessly.

• Results are from 3,000 draws of the Gibbs sampler; 1,000 draws were discarded as “burn-in”.
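The mechanics of a Gibbs sampler with burn-in can be shown on a much simpler model than the one in this talk. The sketch below is a toy example, not the authors' sampler: it assumes normal data with unknown mean and variance, a flat prior on the mean, and a prior on the variance proportional to 1/variance, so that both full conditionals are standard distributions. The data are simulated stand-ins, and it is assumed here that the 1,000 burn-in draws are the first 1,000 of the 3,000.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(3.3, 0.1, size=200)     # stand-in data: hypothetical log cycle lengths
n, ybar = len(y), y.mean()

n_iter, n_burn = 3000, 1000            # assumed: burn-in is the first 1,000 of 3,000 draws
mu, sigma2 = 0.0, 1.0                  # deliberately poor starting values
draws = np.empty((n_iter, 2))

for k in range(n_iter):
    # mu | sigma2, y ~ N(ybar, sigma2 / n) under a flat prior on mu
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    # sigma2 | mu, y ~ Inv-Gamma(n / 2, SS / 2) under p(sigma2) proportional to 1 / sigma2
    ss = np.sum((y - mu) ** 2)
    sigma2 = 1.0 / rng.gamma(shape=n / 2.0, scale=2.0 / ss)
    draws[k] = mu, sigma2

kept = draws[n_burn:]                  # discard burn-in draws
post_mean_mu, post_mean_sigma2 = kept.mean(axis=0)
```

Each step draws one block of parameters conditional on the current values of the others; the hierarchical changepoint model follows the same pattern with many more blocks.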

Results

• Fit the above model to the 106 women with complete data in Tremin I.
  • Pregnancies, abortions.
  • No hormone use or gaps in reporting.

• Subject level parameters (random effects).
• Fit of predicted means and variances.
• Regression against parity, menarche, means and standard errors at age 25-29.
• Correlation of subject-level parameters.

Results: Subject-level estimates of trends

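Estimates (interval in parentheses) for two subjects, i = 1 and i = 25:

Parameter                                        i = 1                   i = 25
Mean intercept                                   3.36 (3.33,3.39)        3.15 (3.06,3.23)
Mean slope before changepoint                    -.010 (-.014,-.006)     .037 (-.010,.084)
Mean slope after changepoint                     .089 (.007,.303)        .279 (-.136,.727)
Changepoint for mean                             49.9 (47.2,51.4)        42.0 (41.4,42.1)
Variance intercept (log scale)                   -4.8 (-5.3,-4.3)        -3.6 (-4.1,-3.0)
Variance slope before changepoint (log scale)    .01 (-.06,.07)          .52 (.36,.66)
Variance slope after changepoint (log scale)     .76 (.56,.99)           1.27 (.19,2.61)
Variance changepoint                             47.3 (46.5,48.0)        42.0 (41.0,42.1)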

Results: Subject-level estimates of trends

Mean cycle length: Posterior mean (95% posterior predictive interval)

2.5 and 97.5 Percentiles: Posterior mean (95% PPI)

[Figure: per-subject panels of log(cycle length), 2.0 to 5.0, against age in years, 35 to 55.]

Results: Subject-level estimates of trends

•One-changepoint model provides reasonably good fit to the data.

•Subject-level differences in mean and variance trends appear to be captured.

•Uncertainty in the location of the changepoints is reflected in the smoothness of the “elbow” for the mean and variances.

Results: Subject-level estimates of trends

[Figure: mean changepoint (age in years) plotted against variance changepoint (age in years); both axes run from 40 to 52.]

•Posterior means and 80% posterior predictive intervals for mean changepoint and variance changepoint.

•Variability in cycle length increases well in advance of increases in mean cycle length itself.

•Changepoints for some subjects are well-estimated, while there is a great deal of uncertainty for others.

Results: Population Mean for trend parameters (unadjusted)

Mean intercept                            3.30 (3.27,3.32)
Mean slope before changepoint             -.004 (-.023,.015)
Mean slope after changepoint              .264 (.151,.425)
Changepoint for mean                      47.0 (46.4,47.6)
Exp(Variance intercept)                   .0088 (.0067,.0112)
Exp(Variance slope before changepoint)    1.09 (1.02,1.15)
Exp(Variance slope after changepoint)     3.05 (2.60,3.68)
Variance changepoint                      45.2 (44.6,45.8)
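Because the model is fit to log cycle length, the intercepts are easier to read after back-transforming to days. The sketch below plugs the unadjusted population means for the mean intercept (3.30) and Exp(Variance intercept) (.0088) into the model at age 35. Treating these population means as the parameters of a single "typical" woman is a simplifying assumption made here for illustration, and the variable names are hypothetical.

```python
import math

mean_intercept = 3.30         # population mean of log cycle length at age 35
var_intercept = 0.0088        # Exp(Variance intercept): variance of log cycle length at age 35

# The median cycle length in days is exp of the mean on the log scale (about 27.1).
median_days_35 = math.exp(mean_intercept)

# Standard deviation on the log scale; for small values it is roughly
# the coefficient of variation of cycle length in days.
sd_log_35 = math.sqrt(var_intercept)

# Central 95% range of cycle length at age 35 under the normal model for log length.
low_days = math.exp(mean_intercept - 1.96 * sd_log_35)
high_days = math.exp(mean_intercept + 1.96 * sd_log_35)
```

Under the same reading, an Exp(Variance slope before changepoint) of 1.09 corresponds to the variance of log cycle length being multiplied by 1.09 for each additional year of age before the variance changepoint.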

Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29)

                                          Intercept               Nulliparous             Menarche                Mean 25-29              SD 25-29
Mean intercept                            3.28 (3.22,3.31)        -.00 (-.02,.02)         .02 (-.03,.07)          .023 (.015,.029)        -.005 (-.007,-.003)
Mean slope before changepoint             -.002 (-.044,.028)      .000 (-.016,.012)       -.001 (-.050,.035)      -.001 (-.008,.004)      .000 (-.002,.002)
Mean slope after changepoint              .307 (.135,.446)        -.013 (-.061,.018)      .006 (-.126,.098)       -.005 (-.024,.009)      .001 (-.004,.005)
Changepoint for mean                      47.22 (46.11,48.03)     .09 (-.31,.38)          -.02 (-1.16,.81)        .15 (-.01,.27)          -.000 (-.001,-.000)
Exp(Variance intercept)                   .0091 (.0054,.0133)     1.09 (.87,1.23)         .920 (.507,1.395)       1.060 (.973,1.123)      .972 (.928,1.006)
Exp(Variance slope before changepoint)    1.00 (.89,1.09)         1.00 (.95,1.03)         1.115 (.973,1.228)      .984 (.969,.997)        1.003 (.998,1.007)
Exp(Variance slope after changepoint)     3.20 (2.41,3.95)        .96 (.86,1.04)          .946 (.691,1.197)       1.023 (.981,1.054)      .995 (.982,1.004)
Variance changepoint                      44.73 (43.65,45.53)     .172 (-.235,.484)       .58 (-.57,1.47)         .181 (.018,.302)        -.040 (-.087,.005)

Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29)

• Parity not associated with cycle structure.
• Weak evidence that earlier menarche associated with increasing variability before changepoint.
• Higher historical mean:
  • Higher mean at 35.
  • Decline in variability before changepoint.
  • Later changepoints for both mean and variance.
• Higher historical variance:
  • Lower mean at 35.
  • Earlier changepoints for both mean and variance.

Results: Correlations among random effects

Columns are numbered in the same order as the rows.

                                                (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)
(1) Mean intercept                                1  -.049  -.120   .268   .126  -.171  -.008   .211
(2) Mean slope before changepoint                        1   .011  -.036   .014   .027   .001  -.021
(3) Mean slope after changepoint                                1  -.143  -.017   .070   .096  -.129
(4) Changepoint for mean                                               1    .08  -.355  -.269   .594
(5) Exp(Variance intercept)                                                   1  -.609   .260  -.046
(6) Exp(Variance slope before changepoint)                                           1  -.172   .146
(7) Exp(Variance slope after changepoint)                                                   1  -.045
(8) Variance changepoint                                                                           1

Results: Correlations among random effects

• Longer cycles at age 35 are associated with somewhat later changepoints in both mean and variance.

• Highly variable cycles at age 35 are associated with slower increases/declines in variability before their variability changepoint, but more rapid increases thereafter.

• Later mean changepoints are associated with slower increases or even declines in variability before their variability changepoint, and slower increases thereafter.

• Later mean changepoints are strongly associated with later variance changepoints.

Next Steps: Modeling

• Account for missing data (hormone, dropout).
  • Impute missing cycles under model, and then obtain draws from posterior distribution of parameters conditional on observed and imputed data.
  • Use results from alternative non-model-based approaches

Next Steps: Modeling

• Model checking
  • “Eyeball” approach shows good fit
  • Formalize with posterior predictive checks.
    • Generate predictive data under model and compute posterior distributions of statistics of interest (chi-square measures, etc.)
    • Compare with posterior distribution of statistics using observed (fixed) data.
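The posterior predictive check described above can be sketched on a toy model. This is an illustration under assumptions, not the planned analysis: it uses a simple normal model with a noninformative prior (so posterior draws can be generated directly), simulated stand-in data, and the sample range as the statistic of interest in place of the chi-square measures mentioned on the slide.

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(3.3, 0.1, size=150)     # stand-in "observed" log cycle lengths
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)

# Posterior draws for a normal model with a noninformative prior:
# sigma2 | y is scaled inverse chi-square, mu | sigma2, y is normal.
n_draws = 2000
sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=n_draws)
mu = rng.normal(ybar, np.sqrt(sigma2 / n))

# One replicated data set per posterior draw.
y_rep = rng.normal(mu[:, None], np.sqrt(sigma2)[:, None], size=(n_draws, n))

def sample_range(values, axis=-1):
    """Statistic of interest: maximum minus minimum."""
    return values.max(axis=axis) - values.min(axis=axis)

t_obs = sample_range(y)                # statistic on the observed (fixed) data
t_rep = sample_range(y_rep, axis=1)    # statistic on each replicated data set

# Posterior predictive p-value: share of replicates at least as extreme as observed.
ppp = np.mean(t_rep >= t_obs)
```

A posterior predictive p-value near 0 or 1 would indicate that the model fails to reproduce that feature of the data; values in between give no evidence of misfit for that statistic.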

Next Steps: Analysis

• Include Tremin II data.
  • Add as covariate to population model
  • Assess secular trends pre- and post- birth control use.

Next Steps: Analysis

• Relate to existing suggestions for FMP markers (60 days, 90 days, etc.).

• Consider predictive value of model for pre-FMP subjects.

  • “Cross-validation” with Tremin data.
  • Validation with other data sources.

Next Steps: Analysis

• Look for cycle behavior that might be reflective of disease

[Figures: histogram of mean intercept (x-axis 3.2 to 3.8) and histogram of variance intercept (x-axis -7 to -1); y-axis: frequency.]