Design and Analytic Methods for Time‐Varying Exposures in Perinatal Epidemiology
By Anthony Philip Nunes
MS, University of Massachusetts at Amherst, 2006
A Dissertation Submitted in Partial Fulfillment of the
Requirements for Degree of Doctor of Philosophy
in the Division of Biology and Medicine
at Brown University
Providence, Rhode Island
May 2011
This dissertation by Anthony Philip Nunes is accepted in its present form
by the Division of Biology and Medicine as satisfying the
dissertation requirement for the Degree of Doctor of Philosophy.
Date________________ _________________________________
Elizabeth W. Triche, PhD, Advisor
Recommended to the Graduate Council
Date________________ _________________________________
E. Andres Houseman, ScD (Reader)
Date________________ _________________________________
Maureen G. Phipps, MD, MPH (Reader)
Date________________ _________________________________
Gregory A. Wellenius, ScD (Reader)
Approved by the Graduate Council
Date________________ _________________________________
Peter Weber, Dean of the Graduate School
Curriculum Vitae
Anthony Nunes was born in Fall River, MA in September of 1980. He attended the
University of Massachusetts at Amherst, where he received a Bachelor of Science
degree in Environmental Science with a concentration in Toxicology and Chemistry.
Anthony then received his Master of Science degree in Epidemiology from the School of
Public Health and Health Sciences at the University of Massachusetts at Amherst.
Anthony’s Master’s thesis assessed the association between stress and motor vehicle
injuries among active duty US Army personnel. Anthony then worked as a researcher
for the US Army Research Institute for Environmental Medicine, assisting in data
analysis within the substantive area of injury prevention epidemiology, and as an
associate epidemiologist for Environ International, assisting with grant writing, data
collection, data analysis, and drafting of manuscripts and expert reports.
He entered the Doctoral Program in Epidemiology in the Department of Community
Health at Brown University in September of 2007 to pursue research within the
substantive area of perinatal and reproductive epidemiology. He received funding
through a National Institute on Aging Training Grant and through a research
assistantship in the Division of Research at Women & Infants Hospital in Providence, RI.
In addition, Anthony worked as a statistical and methodological consultant for the
Community Health Clerkship rotation in the Warren Alpert Medical School. During his
graduate training, Anthony was honored by receiving an invitation to attend the NICHD
Summer Institute in Reproductive and Perinatal Epidemiology. He has contributed to
publications within the substantive areas of adolescent pregnancy, pharmaco‐
epidemiology, and injury/environmental epidemiology. Anthony’s research has been
presented at conferences for the International Society for Pharmacoepidemiology, the
American College of Obstetricians and Gynecologists, and the Society for Epidemiologic
Research.
Acknowledgments
Throughout my graduate training at Brown, I have been blessed to have mentors who
have been as caring and kind as they have been knowledgeable. I would like to thank
Dr. Beth Triche, my advisor and mentor, for her commitment, guidance, and support
throughout this investigation. To Dr. Maureen Phipps, Dr. E. Andres Houseman, and Dr.
Gregory Wellenius, I would like to express my appreciation for the unique
perspectives and insights each provided to help shape the methods and clinical
relevance of my research. I am grateful to my academic advisors and research mentors
Dr. Stephen Buka, Dr. Melissa Clark, Dr. Kate Lapane, Dr. Martin Weinstock, Dr. Joseph
Hogan, and Dr. Vincent Mor, each of whom has helped to develop the questions and
analytic approaches addressed in this dissertation. I would like to acknowledge Dr.
Enrique Schisterman, who served as an external reader; and Dr. Michael Bracken, Dr.
Theodore Holford, Dr. Kathleen Belanger, and Dr. Brian Leaderer from the Yale Center
for Perinatal, Pediatric, and Environmental Epidemiology for granting access to the data
utilized in this research. Lastly, none of this would have been possible without the
support of my family, friends, and colleagues. I would like to specifically thank my
parents, Anthony and Gail Nunes; my wife, Heather Nunes; and my children, Anthony
and Daniel Nunes; for the support and encouragement they have provided and the
sacrifices they have made to allow me to pursue higher education.
Table of Contents
Signature Page
Curriculum Vitae
Acknowledgments
List of Tables
List of Figures
Introduction
    Overview
    Analytic Solutions
    Design Solution
    Specific Aims
Chapter 1: TIME‐DEPENDENT BIAS OF AVERAGE AND JOINTLY MODELED EXPOSURES: EXAMPLES IN PERINATAL EPIDEMIOLOGY
    Abstract
    Introduction
    Simulation Methods
    Simulation Results
    Bias Correction Methods
    Bias Correction Illustration
    Discussion
Chapter 2: EVALUATING MISSING DATA DESIGNS IN THE PRESENCE OF NON‐DESIGNED MISSING DATA: APPLICATIONS IN PERINATAL EPIDEMIOLOGY
    Abstract
    Introduction
    Methods
    Results
    Discussion
Chapter 3: TIME DEPENDENT ASSOCIATIONS BETWEEN MATERNAL CAFFEINE CONSUMPTION AND FETAL GROWTH
    Abstract
    Introduction
    Methods
    Results
    Discussion
General Discussion
References
List of Tables
Table 1.1, Average Effect Estimates for 1000 Simulations of 10,000 Pregnancies Using Time‐Invariant and Time‐Varying Methods
Table 1.2, Average Effect Estimates for 1000 Simulations of 10,000 Pregnancies Using Time‐Invariant and Time‐Varying Methods
Table 1.3, Simulation of Average Exposure and Preterm Birth: Average Effect Estimates for 1000 Simulations of 10,000 Pregnancies Using Time‐Invariant and Time‐Varying Methods
Table 1.4, Association Between Prenatal Care Initiation and Preterm Birth and Low Birth Weight, 2006 US Natality Data
Table 2.1, Sample Size and Cost Parameters Used for Data Simulations
Table 2.2, Characteristics of Study Participants Within Protocols of the Nutrition in Pregnancy Study Prior to and After Weighting
Table 2.3, Bias and Relative Efficiency of Missing Data Designs (MDD) in the Presence of Non‐Designed Missing Data Relative to Complete Ascertainment Designs (CAD), 1000 Data Simulations
Table 2.4, Cost‐Fixed Sample Size and Compliance Within Study Protocols Among Participants in the Nutrition in Pregnancy Study
Table 2.5, Association Between Measures of Smoking and Small for Gestational Age by Week of Pregnancy and Study Design Among Participants in the Nutrition in Pregnancy Study
Table 3.1, Distribution of Baseline Characteristics by Levels of First Trimester Caffeine Consumption, Health and Nutrition in Pregnancy Study, 1996‐2001
Table 3.2, Association Between Caffeine Consumption and Intrauterine Growth Retardation Among Full Term Live Births, Health and Nutrition in Pregnancy Study, 1996‐2001
Table 3.3, Associations Between Joint Effects of Self Reported Caffeine Intake and Potential Effect Modifiers and Intrauterine Growth Retardation Among Full Term Live Births, Health and Nutrition in Pregnancy Study, 1996‐2001
Table 3.4, Association Between Caffeine Consumption and Birth Weight Among Full Term Live Births, Health and Nutrition in Pregnancy Study, 1996‐2001
List of Figures
Figure 1.1, Calculation of the Incidence Rate Ratio from a 2X2 Table
Figure 1.2, Calculation of the Incidence Rate Ratio of Jointly Modeled Exposures
Figure 1.3, Dose‐Response Patterns for (a) Constant Probability of Exposure Initiation, (b) Declining Probability of Exposure Initiation, and (c) Increasing Probability of Exposure Initiation
Figure 2.1, Candidate Missing Data Designs
Figure 2.2, Distribution of the Predicted Probability of Being Assigned to the Intensive Protocol Prior to Weighting (a) and After Weighting (b)
Figure 3.1, Directed Acyclic Graph for Confounders of the Association Between Caffeine and Fetal Growth
INTRODUCTION
In epidemiological investigations of time varying exposures, repeat assessments of
exposures are necessary to accurately characterize exposed person‐time and to quantify
time‐dependent effects. Though time‐dependent effects are not unique to perinatal
epidemiology, sensitivity to exposure is greater during pregnancy than at any other
time during human development because of the physiologic changes experienced by the
mother and fetus. The profound changes in maternal metabolic, hematologic,
cardiovascular, and respiratory physiology are rivaled only by those experienced in the
embryonic and fetal periods of development.[1] As a consequence of the maternal and
fetal physiological changes, associations between perinatal exposures and adverse
pregnancy outcomes are timing sensitive.[2, 3] That is, the same exposure may result in
different outcomes depending on the gestational age at which the exposure occurred.
This has been well documented for fetal/neonatal outcomes such as spontaneous
abortion, birth defects, low birth weight, growth restriction, and fetal programming [2,
3]. In mothers, outcomes such as preeclampsia, eclampsia, maternal hemorrhage, and
maternal mortality are sensitive to exposure timing [4]. Without assessment and
evaluation of timing‐specific exposures, epidemiological investigations cannot
sufficiently describe the exposure/disease association, nor will they validly estimate the
underlying causal effect.
When exposure is time‐varying while individuals are at risk of experiencing the outcome
of interest, time‐fixed analytic methods will produce biased measures of association.
Despite sufficient literature documenting bias resulting from ignoring exposure timing,
investigators may rely on time‐fixed analytic methods due to analytic simplicity or data
limitations. Recent examples of time‐fixed analyses of time‐varying exposures in
perinatal epidemiology include maternal weight change [5‐8], maternal infections [9,
10], illicit drug use [11, 12], medication use [13‐15], environmental exposures[16],
smoking cessation[17], and prenatal care utilization [18].
The feasibility of designs incorporating repeated measures in pregnancy is limited due to
cost and excessive subject burden within a short duration of time. Minimizing subject
burden in perinatal investigations is particularly important due to mothers’ increased
resistance to participate in invasive and non‐invasive study methodologies [19, 20]. As a
consequence, there is a need for design approaches to increase the feasibility of
collecting repeat exposure measures and analytic approaches to validly estimate
measures of association where exposure timing cannot be feasibly collected.
Analytic Solutions
Time‐fixed analyses of time‐varying exposures can lead to biased estimates of the
association between the exposure and outcome, a bias variously termed time‐dependent
bias, immortal time bias, and survivor treatment selection bias [21‐23]. Several studies
have examined the impact of this bias
in the context of an exposure treated as a binary indicator of “ever exposed” during
follow‐up [22‐25]. Where exposure timing is available, time‐dependent bias is
eliminated through the use of time‐varying analytic methods. Time‐dependent bias is
fairly common in published cohort studies; however, it is often preventable and rarely
discussed as a cause for concern [21, 22].
The presence and direction of time‐dependent bias for “ever exposed” metrics have
been addressed in the existing literature. We add to this base of methodological
literature by expanding the mathematical proofs, further describing the pattern and
magnitude of time‐dependent bias, and presenting a method for preventing the bias.
We address the problem where exposure timing cannot be feasibly ascertained through
a missing data perspective. We consider scenarios in which it is known that some
exposure has occurred but where we are limited to some final measure of cumulative
exposure. From a missing data perspective, we then assume a functional form of the
exposure over time using observed data within our sample and prior information from
the existing literature. Once the functional form of exposure/timing is specified, we
multiply impute exposure timing and obtain corrected measures of association using
time‐varying analyses.
Design Solution
Missing data designs, including partial questionnaire designs [26‐29], multi‐cohort
longitudinal designs [30, 31], and multi‐measurement methods of construct assessment
[28, 30], deliberately omit collection of some data elements within the study sample. In
doing so, they require less intensive follow‐up protocols than a comparable complete
ascertainment study design.
In idealized simulation scenarios, missing data designs have been shown to improve
statistical efficiency without sacrificing validity. Though theoretically appropriate,
missing data designs are rarely used in perinatal investigations, in part due to a general
mistrust in missing data methodologies [32]. Prior methodological publications have not
assessed performance of missing data designs implemented in the context of time
varying exposures nor have missing data designs been evaluated in scenarios with non‐
designed missing data due to non‐compliance and loss to follow‐up. We expand upon
the existing methodological literature by introducing concepts of missing data designs
for time varying exposures and by assessing the performance of missing data designs in
non‐idealized scenarios encountered in observational epidemiology by exploring the
impact of non‐designed missingness on the statistical efficiency and validity.
Specific Aims
This series of papers aims to demonstrate the need to obtain and analyze repeat
measures of time‐varying exposures, introduce a bias correction method
where exposure timing cannot be ascertained, introduce design solutions to increase
the feasibility of repeat exposure measures, and implement the identified methods to quantify
the association between timing‐specific caffeine consumption and fetal growth. To
demonstrate the need to obtain and analyze time‐varying exposures, we quantify the
magnitude, direction, and pattern of bias introduced in scenarios where time‐varying
exposures are treated as time‐fixed. We propose and evaluate missing data designs as a
valid method for assessing time‐varying exposures while minimizing cost and subject
burden. Lastly, we implement designed missing data methods to quantify the
association between maternal caffeine consumption and fetal growth.
Chapter 1: TIME‐DEPENDENT BIAS OF AVERAGE AND JOINTLY MODELED EXPOSURES:
EXAMPLES IN PERINATAL EPIDEMIOLOGY
Abstract:
Epidemiologic studies frequently treat time‐varying exposures as if they were time‐
fixed, either for analytic simplicity or because detailed data on exposure timing were not
collected. For binary exposures this approach can lead to health effect estimates that
are on average biased in the negative direction. We performed simulation studies to
evaluate the magnitude of this potential bias in the setting of non‐binary exposures
using examples from perinatal epidemiology. Specifically, we simulated effects of
trimester‐specific and average exposures on time to event, and compared the results
from time‐fixed logistic and survival analyses with those from a time‐varying survival
analysis. Time‐fixed analyses were biased downward for all exposure metrics
considered. Moreover, when using average exposure metrics, we observed an artificial
non‐linear dose response function. We propose and illustrate a method based on
multiple‐imputation of timing‐specific exposure that can be used to avoid this bias when
data on exposure timing are unavailable. In conclusion, treating time‐varying exposures
as time‐invariant can bias health effect estimates and yield incorrect dose‐response
functions. Where timing‐specific data are not available, multiple imputation of
exposure timing may be a useful tool in obtaining unbiased effect estimates or
performing sensitivity analyses. This method may be applied to other epidemiologic
substantive areas.
Though recent publications have highlighted the appropriateness of time varying
methods [33‐35], observational studies in perinatal and reproductive epidemiology
often treat time‐varying exposures as if they were time‐invariant or time‐fixed.
Examples of exposures that are time‐varying but have been treated as time‐fixed
include maternal weight change [5‐8], maternal infections [9, 10], illicit drug use [11,
12], medication use [13‐15], environmental exposures[16], smoking cessation[17], and
prenatal care utilization [18]. Investigators may choose to treat time‐varying exposures
as time‐fixed due to lack of adequate data on exposure timing or as a means to
minimize analytic complexity. However, this approach can lead to biased estimates of the
association between the exposure and outcome, a bias variously termed time‐dependent
bias, immortal time bias, and survivor treatment selection bias [21‐23]. Distinct from
misclassification or collider
stratification, time‐dependent bias occurs when individuals are eligible to become
exposed while at risk of experiencing the outcome. Several studies have examined the
impact of this bias in the context of an exposure treated as a binary indicator of “ever
exposed” during follow‐up [22‐25]. In perinatal epidemiology, this would be akin to
creating a metric of ever exposed within pregnancy or within a more specific relevant
etiologic period of interest (e.g. trimester specific exposures). O’Neal et al. describe a
scenario in which time‐fixed analyses quantifying the association between any urinary
tract infection and preterm birth have produced misleading results biased in the
negative direction[36]. Others have confirmed that time‐dependent bias of a binary
exposure is expected to bias measures of association in a negative direction [23, 24]
such that protective associations appear more protective, null associations appear
protective, and causal associations appear weaker or possibly protective relative to the
unbiased association. Where exposure timing is available, time‐dependent bias is
eliminated through the use of time‐varying analytic methods. Time‐dependent bias is
fairly common in published cohort studies; however, it is often preventable and rarely
discussed as a cause for concern [21, 22].
The presence and direction of time‐dependent bias for “ever exposed” metrics have
been addressed in the existing literature; however, the magnitude of such bias in
scenarios relevant to perinatal research has not been addressed. Additionally, prior
publications have not fully evaluated the impact of the bias when using alternative time‐
fixed metrics such as average exposure or joint modeling of multiple binary indicators
(e.g. trimester specific binary indicators). Though prior studies have demonstrated that
time varying methods prevent time‐dependent bias, analytic solutions where exposure
timing was not collected have not been proposed. In this study, we quantify the
magnitude, direction, and pattern of time‐dependent bias associated with several time‐
fixed exposure metrics commonly implemented in perinatal epidemiology. In addition,
we demonstrate the validity of using time‐varying analytic approaches and propose a
method for bias correction based on time varying analysis of multiply imputed exposure
timing where exposure timing is unknown.
Time‐Dependent Bias:
Suissa (2008) presents a simple mathematical proof of the biased incidence rate ratio
for binary, single transition exposures (i.e. individuals can only transition from
unexposed to exposed) and binary non‐recurring outcomes. In our handling of time‐
dependent bias, we extend this mathematical proof to include transient exposures (i.e.
individuals may transition from unexposed to exposed and from exposed to unexposed).
We address time dependent bias of odds ratios in the appendix.
In a 2x2 time‐fixed analysis of epidemiological data (Figure 1.1), “a” represents the
number of exposed who experience the outcome, “c” represents the number of
unexposed who experience the outcome, “T+” is the total person‐time among those
with any exposure, and “T‐” is the total person‐time among those with no exposure.
Assuming a constant hazard over time, the incidence rate ratio is estimated by:
$$\mathrm{IRR}_{\text{time-fixed}} = \frac{a / T_{+}}{c / T_{-}} = \frac{a}{c}\,k \qquad (1)$$
Where: k is the ratio of follow‐up time between the unexposed and exposed (T‐/T+)
For a time‐varying exposure, this time‐fixed analysis incorrectly assumes that exposed
persons are exposed for their entire time at risk for experiencing the event. When
discussing time‐dependent bias, Suissa (2008) emphasized the mischaracterization of
exposure prior to exposure initiation. Here we define “p” as the average proportion of
time unexposed among those classified as “ever exposed”. Specifically, this is the
average ratio of time preceding exposure initiation to time in follow‐up among the
exposed. Knowing the value of “p”, Suissa demonstrated the corrected rate ratio to be
estimated as:
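$$\mathrm{IRR}_{\text{corrected}} = \frac{a / \left[(1-p)\,T_{+}\right]}{c / \left(T_{-} + p\,T_{+}\right)} = \frac{a}{c}\cdot\frac{k + p}{1 - p} \qquad (2)$$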
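The reallocation of person‐time described above can be illustrated numerically. The following is a minimal sketch, not the dissertation's own code: the function name and arguments are hypothetical, and it assumes a single‐transition exposure in which every event among the "ever exposed" occurs after exposure initiation, so that only person‐time (and no events) is moved from the exposed to the unexposed group.

```python
def rate_ratios(a, c, t_exposed, t_unexposed, p):
    """Return (time_fixed_rr, reallocated_rr).

    a, c         : events among the ever exposed / never exposed
    t_exposed    : total person-time among the ever exposed (T+)
    t_unexposed  : total person-time among the never exposed (T-)
    p            : average proportion of follow-up that precedes
                   exposure initiation among the ever exposed

    Assumption (not from the source): all `a` events occur after
    exposure initiation, so only person-time is reallocated.
    """
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")

    # Time-fixed analysis: all of T+ is treated as exposed time.
    time_fixed = (a / t_exposed) / (c / t_unexposed)

    # Move the pre-initiation share of T+ to the unexposed group.
    exposed_time = (1 - p) * t_exposed
    unexposed_time = t_unexposed + p * t_exposed
    reallocated = (a / exposed_time) / (c / unexposed_time)
    return time_fixed, reallocated


# Illustrative numbers only: a quarter of "exposed" follow-up
# actually precedes exposure initiation.
biased, corrected = rate_ratios(a=50, c=100, t_exposed=1000,
                                t_unexposed=2000, p=0.25)
print(biased, corrected)
```

With these illustrative inputs the time‐fixed ratio is smaller than the reallocated ratio, which matches the negative direction of bias described in the text for any non‐zero p.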
The rate ratio may be further biased if the exposure is transient (i.e. individuals may
transition from an exposed to unexposed state). Exposures such as smoking cessation
and initiation of prenatal care are examples of single transition exposures and would not
be susceptible to this aspect of the bias. Other exposures, such as caffeine,
acetaminophen, and maternal illness, may be expected to have a limited duration of
effect, and may occur multiple times during the time period of interest. Where subjects
may transition from exposed to unexposed during follow‐up, time‐fixed analytic
approaches would inappropriately attribute some events and person‐time to an
exposed state. We define “q” as the ratio of unexposed follow‐up time following
exposure initiation to follow‐up time after exposure initiation among the exposed. The
unbiased rate ratio can be estimated by:
(3)
Given the equations for the biased and unbiased rate ratios, we can quantify the nature
of the bias to determine factors affecting the magnitude and direction of the bias.
(4)
Therefore, the magnitude and direction of the bias are dependent on the proportion of
time unexposed among the exposed (p and q), the ratio of follow‐up times between the
exposed and unexposed (k), and the incidence of the outcome in the exposed and
unexposed (c and a). From the bias equation, it can be shown that no bias is present
when p and q equal zero. If either p or q is non‐zero, a bias will be present. Any non‐
zero value of p will lead to a bias in the negative direction.
Time‐fixed metrics are not limited to single binary indicators. One common metric in
perinatal epidemiology is to use trimester specific exposures by creating binary
indicators of exposure within each trimester. If indicators for each trimester are
simultaneously included in a regression model, then time‐dependent bias may impact
associations observed in each trimester. In a time‐fixed analysis, we may produce a 4x2
table from which to calculate the trimester specific measures of association (Figure 1.2).
For example, the formula for a time‐fixed rate ratio for first trimester exposure is given
as follows:
(5)
For preterm birth, individuals are at risk of experiencing the exposure and outcome
during the 2nd and 3rd trimesters. Consequently, some proportion of follow‐up time
among those with 2nd or 3rd trimester exposures may be unexposed. Thus, the follow‐
up time among the unexposed is underestimated in the time‐fixed analysis. The
unbiased rate ratio for first trimester exposures can be expressed as:
(6)
where p2 and p3 represent the proportion of time unexposed among those exposed in
the 2nd and 3rd trimesters respectively. Comparing the unbiased to the biased effect
estimate, we see that this scenario would result in negative bias even though first
trimester exposures were not time‐varying during the period of time at risk of preterm
birth.
Average exposure and cumulative exposure metrics are also susceptible to time‐
dependent bias. Suissa (2008) addresses the problem of immortal time prior to
initiation of exposure resulting in a bias in the negative direction [21]. Additionally,
time‐fixed average exposures may produce artificially non‐linear associations due to
variable amounts of information used to calculate the averages. An average exposure
for an individual is calculated by summing the observed exposure at all time points and
dividing by the time in follow‐up. The number of exposure
assessments obtained for an individual therefore depends on the number of days in follow‐
up. Shorter follow‐up times will be more susceptible to extreme values and will produce
an exposure distribution with fatter tails than the distribution of exposure for those with
longer follow‐up times. Consequently, those with shorter durations of follow‐up will
have a greater probability of being classified in the lower or upper tails of the exposure
distribution. This bias will result in an artificial “U” or “J” shaped dose‐response
function.
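The tail behavior described above can be checked with a small calculation. This is a sketch under a simplifying assumption that is not made in the text: weekly exposure is treated as independent Bernoulli draws with a common probability, so the number of exposed weeks is binomial. The function name and the cut‐points of 0.2 and 0.8 are hypothetical choices for illustration.

```python
from math import comb


def tail_probability(n_weeks, prob_exposed, low=0.2, high=0.8):
    """Probability that the average exposure falls in either tail.

    Assumes (for illustration only) independent weekly exposure with
    probability `prob_exposed`, so weeks exposed ~ Binomial(n_weeks, p).
    """
    total = 0.0
    for weeks_exposed in range(n_weeks + 1):
        average = weeks_exposed / n_weeks
        if average <= low or average >= high:
            total += (comb(n_weeks, weeks_exposed)
                      * prob_exposed ** weeks_exposed
                      * (1 - prob_exposed) ** (n_weeks - weeks_exposed))
    return total


# Compare a short follow-up (10 weeks) with a long one (40 weeks).
short_follow_up = tail_probability(10, 0.5)
long_follow_up = tail_probability(40, 0.5)
print(short_follow_up, long_follow_up)
```

Under this assumption the shorter follow‐up places far more probability in the tails of the average exposure distribution, so early events are over‐represented at the extremes of a time‐fixed average.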
We have demonstrated how the incidence rate ratio will be biased in the presence of
time‐dependent bias. In observational studies, investigators often estimate hazard
ratios from Cox Proportional Hazards models rather than incidence rate ratios[37].
Though there are some limitations in relying on hazard ratios, they can generally be
interpreted as incidence rate ratios[38] and are susceptible to time‐dependent bias[24].
Simulation Methods
We assessed the bias under several scenarios relevant to perinatal epidemiology using
simulated data representing 10,000 pregnancies. As identified in the previous section,
the magnitude of the bias depends on the proportion of time unexposed among the
exposed (p and q), the probability of the outcome conditional on the exposure and the
average follow‐up times for those with and without the outcome (k), and the
incidence of the outcome in the exposed and unexposed (a and c). To capture each of
these factors, we provided the following parameters: weekly probability of exposure
initiation, continuity of exposure, hazard function for birth, and the magnitude of
association between exposure and outcome.
The proportion of time unexposed among the exposed is a function of timing of
initiation, continuity of exposure, and duration of follow‐up. We considered weekly
probabilities of initiation, $P(E_{i+1}=1 \mid E_i=0)$, ranging from 0.01 to 0.1 under three scenarios:
constant, increasing, and decreasing probability of exposure initiation over pregnancy. Continuity
of exposure was specified by providing the probability of being exposed at time $t_{i+1}$
given exposure was present at $t_i$. We considered exposures with a high degree of
continuity [$P(E_{i+1}=1 \mid E_i=1) = 1$], moderate continuity [$P(E_{i+1}=1 \mid E_i=1) = 0.9$], and low
continuity [$P(E_{i+1}=1 \mid E_i=1) = 0.5$]. The exposure scenarios utilized in our data simulations
are summarized in Table 1.1.
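One way to generate weekly exposure histories from an initiation probability and a continuity probability is a two‐state Markov chain. The sketch below is illustrative only and is not the simulation code used in this study. It assumes every pregnancy starts unexposed, covers only the constant‐initiation scenario, and assumes that a woman who discontinues exposure can re‐initiate with the same weekly initiation probability. The function and parameter names are hypothetical.

```python
import random


def simulate_exposure(n_weeks, p_initiate, p_continue, rng):
    """Simulate a weekly binary exposure history.

    p_initiate : P(exposed next week | unexposed this week)
    p_continue : P(exposed next week | exposed this week)
    rng        : a random.Random instance (for reproducibility)

    Assumptions (not from the source): everyone starts unexposed,
    and re-initiation after stopping uses the same p_initiate.
    """
    exposed = 0
    history = []
    for _ in range(n_weeks):
        p_exposed_next = p_continue if exposed else p_initiate
        exposed = 1 if rng.random() < p_exposed_next else 0
        history.append(exposed)
    return history


# High continuity: once initiated, exposure persists for the rest of follow-up.
print(simulate_exposure(37, p_initiate=0.1, p_continue=1.0,
                        rng=random.Random(0)))
```

Setting `p_continue` to 1 reproduces a single‐transition exposure, while values such as 0.9 or 0.5 produce the transient exposures that contribute unexposed time after initiation.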
For the purpose of this simulation, we defined the outcome as preterm birth. To
simulate preterm birth, an estimation of the hazard function[39] of birth at each week
of gestation was identified using the 2006 US Natality data[40]. The 2006 US Natality
data includes all registered births in the 50 states, District of Columbia, and New York
City. The hazard of birth was estimated at each week of gestation up to the 37th week
(i=1 to 37). The hazard at the midpoint of each week was calculated as
$h_i = d_i / (n_i - d_i/2)$, where $d_i$ is the number of births during week i and $n_i$ is the number at
risk at the beginning of week i.
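As a worked illustration of the weekly hazard calculation, the sketch below assumes the standard actuarial (life‐table) form, in which births during week i are divided by the number at risk at the start of the week minus half of those births. That functional form, the function name, and the toy counts are assumptions for illustration and are not taken from the natality file.

```python
import numpy as np

def weekly_hazard(births_per_week, n_at_start):
    """Actuarial hazard at the midpoint of each week (assumed form).

    births_per_week: births during weeks 1..k
    n_at_start: pregnancies at risk at the beginning of week 1
    """
    d = np.asarray(births_per_week, dtype=float)
    # Number at risk at the start of week i = starting count minus earlier births
    births_before = np.concatenate(([0.0], np.cumsum(d)[:-1]))
    n = n_at_start - births_before
    return d / (n - d / 2.0)

# Hypothetical counts for three weeks of follow-up
hazard = weekly_hazard([10, 20, 30], n_at_start=1000)
```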
We specified the probability of being born at a specific GA to be dependent on the
identified hazard function, the timing specific exposure state (E|T), the timing specific
prevalence of exposure, and the specified magnitude of association (RR) such that the
baseline hazard function of our simulated data was representative of the hazard
function from observational data. The magnitude of the association was not dependent
on the timing of exposure. Time‐fixed and time‐varying exposure metrics were
created. Time‐fixed exposure metrics include “ever exposed” during pregnancy, “ever
exposed” within trimesters and average exposure during pregnancy. Average exposure
was calculated as the number of weeks exposed divided by the weeks in follow‐up.
Time‐varying exposure metrics included timing specific binary indicators of exposure
(during pregnancy and within trimesters) and average exposure. Average exposure was
calculated as the number of weeks exposed prior to time T divided by the duration of
follow‐up at time T.
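The distinction between the time‐fixed and time‐varying metrics can be made concrete with a short Python sketch. The function names and the example history are hypothetical, and the sketch assumes that exposure in week T itself counts toward the time‐varying average at time T.

```python
import numpy as np

def time_fixed_metrics(weekly_exposure, weeks_followed):
    """One value per pregnancy, computed over the whole follow-up."""
    e = np.asarray(weekly_exposure[:weeks_followed], dtype=float)
    return {"ever": int(e.any()), "average": e.mean()}

def time_varying_metrics(weekly_exposure, weeks_followed):
    """One value per week at risk, using only exposure observed so far."""
    e = np.asarray(weekly_exposure[:weeks_followed], dtype=float)
    weeks = np.arange(1, weeks_followed + 1)
    cumulative = np.cumsum(e)
    ever_by_week = (cumulative > 0).astype(int)
    average_by_week = cumulative / weeks
    return ever_by_week, average_by_week

history = [0, 0, 1, 1, 0, 1]  # hypothetical 6-week exposure history
fixed = time_fixed_metrics(history, 6)
ever_tv, average_tv = time_varying_metrics(history, 6)
```

In the time‐fixed version, the two unexposed weeks at the start of follow‐up are classified as exposed person‐time under the "ever exposed" metric, whereas the time‐varying indicator remains 0 until the first exposed week.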
Simulated data were analyzed using logistic regression, time‐fixed Cox Proportional
Hazards Models and time‐varying Cox Proportional Hazards Models[41]. For each
scenario and analytic method, we report the average effect estimate from 1000 data
simulations. For analyses of average exposures, we assessed whether there was a
departure from linearity by including higher order terms (up to the 10th power). We
obtained the average AIC among the 1000 simulations for each of the higher order
models (2nd order to 10th order). For each scenario, the model with the lowest average
AIC was considered as our final model. Where the final model included higher order
terms, we concluded that the dose response relationship was artificially non‐linear. We
plotted the resulting dose response functions to qualitatively describe the pattern of the
observed bias.
Simulation Results
Simulated analyses of a single transition binary exposure [P(Ei=1|Ei‐1=1)=1] were
consistent with the expected direction of time‐dependent bias (Table 1.2). That is, the
bias tended to be in the negative direction when using logistic regression or the time‐
fixed hazards model. We did not observe a notable bias for the single transition
exposure where the probability of being exposed declined over the course of pregnancy.
The lack of bias can be explained by the relatively low probability of initiating the
exposure while at risk for experiencing the outcome. For transient exposures [P(Ei=1|Ei‐
1=1)=0.5 or 0.9], the direction and magnitude of the observed bias differed between
scenarios. When including binary indicators of exposure for each trimester, the
resulting bias was largest for the third trimester; however, bias was observed in the first
and second trimesters as well. For each of the assessed scenarios, the bias was in the
negative direction.
When modeling the association between average exposure and preterm birth assuming
a linear dose response relationship, time‐fixed analyses generally produced biased
effect estimates (Table 1.3). For exposures that are more common in later pregnancy,
the bias was in the negative direction for each of the assessed scenarios. For constant
and decreasing exposures, the direction of the bias was dependent on the magnitude of
the association and the consistency of the exposure. The dose‐response relationship
identified with logistic regression and the time‐fixed hazards models was artificially
non‐linear in all of the assessed scenarios. The pattern of the bias was dependent on
the probability of exposure initiation, the distribution of exposure timing, and the association
between the exposure and the outcome (Figure 1.3). In general, the effect estimates
were underestimated for exposures approaching 0 and overestimated for exposures
approaching 1.
Bias Correction
The above simulations demonstrate that time‐dependent bias is of concern in perinatal
epidemiology. The easiest solution is to utilize time‐varying methods when analyzing
time‐varying exposures. However, timing‐specific data is often unavailable in existing
data sources or may not be feasible to collect in ongoing studies. If data on exposure
timing is not available, we propose a bias correction method based on multiple
imputation of exposure timing. If there is sufficient prior knowledge, we can specify the
functional form representing the timing specific probability of exposure and duration of
exposure (i.e. initiation, continuity, pattern). Where there is limited prior knowledge,
we may perform sensitivity analyses by specifying plausible functional forms of timing‐
specific exposure probabilities and duration of exposure. Once specified, we propose
multiply imputing exposure timing then analyzing the data using standard time‐varying
approaches to produce corrected effect estimates or a range of plausible effect
estimates.
Illustration
As an illustration of the bias and the bias correction method, we quantify the association
between initiation of prenatal care and preterm birth and low birth weight using 2006
US Natality data[42]. Prenatal care has been presumed to help prevent preterm birth
and LBW; however, attempts to quantify this association have produced equivocal
results [43]. Previous studies looking at the association between timing of prenatal care
initiation and LBW have not confirmed the hypothesis that early prenatal care is more
beneficial than delayed prenatal care [18, 44]. Contrary to expectation, results from
prior studies indicate that delayed prenatal care is more protective than early prenatal
care [18]. The explanation for this unexpected finding has been residual confounding;
however, time‐dependent bias may have contributed to the reported effect estimates.
An article published in 1962 attributes the findings to mothers who delay initiation of
care until the third trimester having lower risks because they are closer to reaching full
term [45]. Though this essentially describes time‐dependent bias, it was interpreted
and addressed as confounding.
In this illustration, we quantify the association between prenatal care initiation and
preterm birth. In the time‐fixed models, prenatal care was defined as early (initiation
within the first trimester), delayed (initiation after the first trimester but before the 37th
week of gestation), or no prenatal care. The time‐fixed metrics were analyzed using
logistic regression and Cox Proportional Hazards Models. In addition, we also report
effect estimates obtained from a logistic regression adjusted for gestational age at birth
for low birth weight. This is consistent with the conclusion that gestational age at birth
is a confounder. It was not possible to adjust for gestational age in the model for
preterm delivery since preterm delivery is defined by gestational age. We created time‐
varying indicators of prenatal care initiation and analyzed using the extended Cox
Proportional Hazards Model. We report effect estimates using multiple imputation of
timing of prenatal care initiation and observed timing of prenatal care initiation. For the
imputed timing, we assumed we only knew whether prenatal care was early or delayed.
Where it was early, we sampled from a uniform distribution of initiation times ranging
from week 4 to week 12. Where it was delayed, we sampled from a uniform
distribution of initiation times ranging from week 13 to week 37 or gestational age at
delivery. The imputation step was repeated five times. Point estimates and confidence
intervals were quantified using SAS Proc MIAnalyze to reflect the uncertainty associated
with the imputation process [46].
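The imputation and pooling steps described above can be sketched in Python as follows. This is an illustrative sketch only: the analysis itself used SAS, the function names are hypothetical, initiation weeks are drawn as whole weeks, and the example estimates passed to the pooling function are made‐up numbers. The pooling function applies Rubin's rules, which is the combination approach implemented by Proc MIAnalyze.

```python
import numpy as np

def impute_initiation_week(care_group, ga_at_delivery, rng):
    """Draw a week of prenatal care initiation for each pregnancy.

    early:   uniform on weeks 4..12
    delayed: uniform on weeks 13..min(37, gestational age at delivery)
    none:    left missing (never initiated)
    Assumes delayed initiators delivered at week 13 or later.
    """
    care_group = np.asarray(care_group)
    ga = np.asarray(ga_at_delivery, dtype=int)
    week = np.full(care_group.shape, np.nan)
    early = care_group == "early"
    delayed = care_group == "delayed"
    week[early] = rng.integers(4, 12, size=int(early.sum()), endpoint=True)
    upper = np.minimum(ga[delayed], 37)
    week[delayed] = rng.integers(13, upper, endpoint=True)
    return week

def pool_rubin(estimates, variances):
    """Combine per-imputation estimates (e.g. log hazard ratios) by Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    within = u.mean()            # average within-imputation variance
    between = q.var(ddof=1)      # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q.mean(), total

rng = np.random.default_rng(46)
groups = np.array(["early", "delayed", "none", "delayed"])  # hypothetical records
ga_weeks = np.array([39, 30, 36, 41])
imputations = [impute_initiation_week(groups, ga_weeks, rng) for _ in range(5)]
# Hypothetical estimates and variances from three imputed-data analyses
pooled_estimate, pooled_variance = pool_rubin([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
```

Each imputed dataset would then be analyzed with the time‐varying model before pooling; that modeling step is omitted from the sketch.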
The findings from our time‐fixed analysis are consistent with reported effect estimates
from previous studies[47]. That is, those who received prenatal care had approximately
one‐third the risk of preterm birth and low birth weight (Table 1.4). After adjusting for
gestational age at delivery, the magnitude of negative association observed for low birth
weight was reduced by approximately 60%. The results from the time‐varying models
were dramatically different. Early prenatal care remained significantly protective while
delayed prenatal care approached null. Effect estimates from the imputed time and
observed time models were comparable suggesting that the imputed time‐varying
method effectively addressed the time‐dependent bias in this example. It is important
to note that adjusting for gestational age at delivery did not sufficiently address the
time‐dependent bias and is fundamentally different from treating gestational age as a
time axis in a time‐varying analysis.
Discussion:
In perinatal epidemiology, time‐dependent bias has the potential to substantially impact
the validity of analyses when exposure timing is ignored. We have demonstrated that
this bias extends beyond single binary exposure metrics. Of particular importance,
analyses of trimester‐specific exposures and average exposure are susceptible to time‐
dependent bias. Recognizing that timing specific data may not be available in some
situations, we have demonstrated the utility of simulated exposure event times to
obtain unbiased effect estimates. If we are confident in our knowledge of the functional
form of exposure event times, these corrected effect estimates can be viewed as
unbiased estimates. Where little is known about the functional form of exposure event
times, multiple assumptions can be tested to perform a sensitivity analysis.
In our simulations of average exposure, we demonstrated the bias where the
distribution of point‐in‐time exposure was binary (e.g. yes/no indicators of medication
use, maternal illness). The bias as described and simulated is also relevant to point‐in‐
time exposures that are continuous (e.g. concentrations of pollutants, blood pressure)
where exposure is repeatedly assessed throughout follow‐up. For continuous
exposures, individuals with shorter follow‐up times will have fewer measurements
contributing to their estimate of average exposure. As a result of the limited
information used to quantify average exposure among those with shorter follow‐up
times, the distribution of average exposure will be flattened with wider tails.
Consequently, those with the outcome will be more likely to have exposures in the
extremes. Similar to our findings, this would lead to an underestimation of effect
estimates at low exposures and an overestimation at higher exposures. Where the
time‐fixed average exposure is known but data for exposures at specific time points is
not available (e.g. passive monitoring of pollutants, maternal weight gain), our
imputation method would appropriately address time‐dependent bias.
For many perinatal outcomes, event times are known or can be assessed (e.g. preterm
birth, spontaneous abortion, clinical preeclampsia); however, other outcomes have
unknown event times and are only diagnosed at birth (e.g. malformations, growth
restriction). Though time‐varying methods are more appropriate than time‐fixed
methods, the validity of these models is dependent on our knowledge of when the
event actually occurred. Where the gestational age of outcome occurrence is unknown,
the bias is impacted by the ratio between the average gestational age at which the
outcome occurred and gestational age at birth. Because the timing of the event can
occur no later than the timing of birth, the ratio is always between 0 and 1. As the ratio
approaches 0, the resulting bias is in the negative direction. Though potentially still
biased, use of time‐varying methods with gestational age at birth as the event time will
more closely approximate the true association as compared to a time‐fixed approach.
Though this paper focused on immediately detectable adverse outcomes such as
preterm birth and low birth weight, the bias discussed in this paper is relevant to studies
of perinatal etiology of later life outcomes. Studies in the area of fetal programming are
also susceptible if exposure is classified as time‐fixed during pregnancy. Outcomes
should be viewed as occurring during pregnancy but not diagnosed until later in life due
to long latency. For example, when assessing the association between prenatal
exposures and asthma in the offspring, it is useful to identify the actual direct effect of
the exposure (e.g. impaired lung development). When viewed from this perspective,
individuals are at risk for impaired lung development due to a prenatal exposure from
week 18 through delivery. The period of time from birth to diagnosis of asthma can be
viewed as a latency period. Thus the time axis from week 18 to birth is critical for the
exposure effect on the outcome while the time from birth to asthma diagnosis
contributes to the sensitivity and specificity of the outcome assessment. For this
reason, the appropriate time axis in an epidemiological analysis should be from week
18 to delivery while age at diagnosis should be considered as a potential confounder.
We have demonstrated that time‐dependent bias is a concern in perinatal
epidemiology. The easiest solution is to utilize time‐varying methods where timing
specific data are available. Where timing specific data are not available, we
demonstrated that imputation of exposure timing may be a useful tool in obtaining
unbiased effect estimates or performing sensitivity analyses. Ignoring the time‐varying
nature of an exposure is not a viable option unless the exposure is effectively invariant
while subjects are at risk for experiencing the event. Adjusting for gestational age at
delivery is not a suitable alternative to a time‐varying analysis. For time‐varying
exposures, an increased emphasis should be given to obtaining exposure assessments at
multiple time points. In addition to preventing time‐dependent bias, assessing exposure
at multiple time points enables detection of timing‐specific effects. Whether in
a prospective or retrospective setting, the prospect of obtaining multiple exposure
assessments in a pregnant population may be difficult from the perspective of study
cost and subject burden; however, the potential impact of time‐dependent bias
warrants careful consideration. Perinatal epidemiologists should consider novel
methods such as imputation of exposure timing or more efficient study designs to
feasibly collect time‐varying exposure data. Future research should aim to further test
and develop methods for exposure timing imputation and methods for increasing the
efficiency of exposure assessment.
Appendix Time Dependent Bias of Odds Ratio:
In the context of a case control analysis, the odds ratio can be shown to be biased in the
negative direction by a simple mathematical proof. Suppose a time varying exposure (E)
and case status (C) are independent random variables (i.e. E does not affect the risk of
C). The distribution of time varying exposure events occurring over the course of
pregnancy can be thought of as a Poisson process. That is, the number of exposure
events experienced during pregnancy follows a Poisson distribution with a daily rate of
λ.
Daily Exposure Events $\sim \mathrm{Pois}(\lambda \mid C) = \mathrm{Pois}(\lambda \mid \bar{C}) = \mathrm{Pois}(\lambda)$
When calculating an OR from a 2X2 table or logistic regression, our operational
definition of exposure is “ever exposed” (E ≥ 1) within some time period of a specified
duration (T). For example, we may be interested in assessing exposures from
conception to 28 weeks when looking at stillbirth or from 20 weeks to 37 weeks when
looking at preterm birth. Exposure events over a time interval are distributed as a
Poisson distribution with rate λT. If the outcome of interest is associated with earlier
gestational age at birth, then the exposure event distributions over the specified period
will not be equivalent ($\mathrm{Pois}(\lambda T_C) \neq \mathrm{Pois}(\lambda T_{\bar{C}})$). Therefore, whereas daily exposure and the
outcome are independent by definition, the probability of observing an exposure event
within a specified period is dependent on outcome status such that:
$P(E \geq 1 \mid C = 1) < P(E \geq 1 \mid \bar{C} = 1)$
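A brief numeric illustration of this argument is given below in Python. Under a Poisson process with daily rate λ, the probability of at least one exposure event in a window of T days is 1 − exp(−λT), so a shorter observable window among cases lowers their probability of being classified as ever exposed. The daily rate and the gestational age at delivery for cases are hypothetical values chosen for illustration.

```python
import math

def prob_ever_exposed(daily_rate, days_observed):
    """P(at least one exposure event) for a Poisson process observed for T days."""
    return 1.0 - math.exp(-daily_rate * days_observed)

def odds(p):
    return p / (1.0 - p)

lam = 0.01                   # hypothetical daily rate, identical for cases and non-cases
days_cases = (32 - 20) * 7   # window from week 20 to a hypothetical preterm delivery at week 32
days_noncases = (37 - 20) * 7  # full window from week 20 to week 37

p_cases = prob_ever_exposed(lam, days_cases)
p_noncases = prob_ever_exposed(lam, days_noncases)
# Apparent odds ratio despite exposure having no effect on the outcome
apparent_or = odds(p_cases) / odds(p_noncases)
```

Because the window is shorter for cases, the apparent odds ratio falls below 1 even though the daily exposure rate is the same in both groups.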
CHAPTER 1 TABLES
Table 1.1: Exposure scenarios utilized in data simulations
Exposure Scenario                         Proportion   Duration of Exposure   Proportion of Time Unexposed*
P(Ei=1|Ei‐1=0)         P(Ei=1|Ei‐1=1)     Exposed      Event (weeks)*         p        q
Constant at 0.01       0.50               0.33         1.96                   0.46     0.82
Constant at 0.01       0.90               0.33         8.03                   0.45     0.50
Constant at 0.01       1.00               0.33         21.82                  0.45     0.00
↓ from 0.1 to 0.001    0.50               0.61         2.03                   0.15     0.91
↓ from 0.1 to 0.001    0.90               0.61         9.95                   0.15     0.66
↓ from 0.1 to 0.001    1.00               0.61         34.07                  0.15     0.00
↑ from 0.001 to 0.1    0.50               0.56         1.83                   0.74     0.61
↑ from 0.001 to 0.1    0.90               0.56         5.48                   0.74     0.27
↑ from 0.001 to 0.1    1.00               0.56         10.41                  0.74     0.00
p: Time to first exposure divided by total time in follow‐up
q: Time unexposed after first exposure divided by time after first exposure
↓: Decreasing over time
↑: Increasing over time
* Calculated assuming 40 week follow‐up. Will vary with differing definitions of follow‐up and magnitude of association between exposure and outcome of interest
Table 1.2: Average effect estimates for 1000 simulations of 10,000 pregnancies using time invariant and time varying methods
                                                         True RR=0.5          True RR=1            True RR=2.0
Exposure            P(Ei=1|Ei-1=0)       P(Ei=1|Ei-1=1)  OR     HRINV  HRTV   OR     HRINV  HRTV   OR     HRINV  HRTV
Any Pregnancy       Constant at 0.01     0.50            0.78   0.79   0.50   1.00   1.00   1.00   1.35   1.31   2.00
                    Constant at 0.01     0.90            0.44   0.46   0.50   0.88   0.89   1.00   1.81   1.72   2.00
                    Constant at 0.01     1.00            0.42   0.44   0.50   0.85   0.86   1.00   1.81   1.67   2.00
                    ↓ from 0.1 to 0.001  0.50            1.55   1.51   0.50   1.89   1.81   1.00   2.80   2.56   2.00
                    ↓ from 0.1 to 0.001  0.90            0.59   0.58   0.50   1.08   1.07   1.00   2.33   2.15   2.00
                    ↓ from 0.1 to 0.001  1.00            0.50   0.50   0.50   0.99   0.99   1.00   2.00   2.00   2.00
                    ↑ from 0.001 to 0.1  0.50            0.48   0.49   0.50   0.56   0.59   1.00   0.81   0.82   2.00
                    ↑ from 0.001 to 0.1  0.90            0.27   0.27   0.50   0.48   0.50   1.00   0.99   0.92   2.00
                    ↑ from 0.001 to 0.1  1.00            0.26   0.26   0.50   0.47   0.49   1.00   0.98   0.97   2.00
Trimester Specific*
Any 1st Trimester   Constant at 0.01     1.00            0.45   0.48   0.50   0.95   0.96   1.00   1.90   1.89   2.00
Any 2nd Trimester                                        0.45   0.48   0.50   0.94   0.95   1.00   1.93   1.89   2.00
Any 3rd Trimester                                        0.31   0.33   0.50   0.62   0.64   1.00   1.24   1.20   2.00
Any 1st Trimester   ↓ from 0.1 to 0.001  1.00            0.48   0.50   0.50   0.99   0.99   1.00   2.00   2.00   2.00
Any 2nd Trimester                                        0.48   0.50   0.50   0.99   0.99   1.00   2.00   2.00   2.00
Any 3rd Trimester                                        0.37   0.41   0.50   0.78   0.79   1.00   1.68   1.58   2.00
Any 1st Trimester   ↑ from 0.001 to 0.1  1.00            0.33   0.37   0.50   0.70   0.72   1.00   1.54   1.46   2.00
Any 2nd Trimester                                        0.33   0.37   0.50   0.69   0.71   1.00   1.53   1.45   2.00
Any 3rd Trimester                                        0.19   0.22   0.50   0.39   0.41   1.00   0.81   0.81   2.00
*Joint modeling of binary exposure indicators within each trimester
Table 1.3: Simulation of Average Exposure and Preterm Birth: Average effect estimates* for 1000 simulations of 10,000 pregnancies using time invariant and time varying methods
Exposure                              True RR=0.5          True RR=1            True RR=2.0
P(Ei=1|Ei-1=0)       P(Ei=1|Ei-1=1)   OR     HRINV  HRTV   OR     HRINV  HRTV   OR     HRINV  HRTV
Constant at 0.01     0.5              0.44   0.47   0.50   1.00   1.00   1.00   2.28   2.15   2.00
Constant at 0.01     0.9              0.47   0.48   0.50   0.99   1.00   1.00   2.26   2.20   2.00
Constant at 0.01     1                0.55   0.55   0.50   0.98   0.99   1.00   2.20   1.95   2.00
↓ from 0.1 to 0.001  0.5              0.53   0.54   0.50   1.18   1.17   1.00   2.76   2.44   2.00
↓ from 0.1 to 0.001  0.9              0.49   0.51   0.50   1.03   1.03   1.00   2.59   2.15   2.00
↓ from 0.1 to 0.001  1                0.69   0.69   0.50   0.99   1.00   1.00   2.55   2.30   2.00
↑ from 0.001 to 0.1  0.5              0.23   0.24   0.50   0.59   0.60   1.00   1.47   1.92   2.00
↑ from 0.001 to 0.1  0.9              0.33   0.34   0.50   0.84   0.85   1.00   1.87   1.79   2.00
↑ from 0.001 to 0.1  1                0.39   0.39   0.50   0.90   0.91   1.00   1.90   1.80   2.00
* Effect estimates for simple regression assuming linear dose response relationship
Table 1.4: Association between prenatal care initiation and preterm birth and low birth weight, 2006 US Natality Data
                     Time Invariant                                             Time Varying
                     OR                  HRTI                ORADJ               HRTV                HRImpTV
Preterm Birth
  No Prenatal Care   1.00 (--)           1.00 (--)           NA                  1.00 (--)           1.00 (--)
  Early              0.36 (0.35-0.37)    0.38 (0.38-0.39)    NA                  0.88 (0.88-0.89)    0.88 (0.88-0.89)
  Delayed            0.40 (0.39-0.41)    0.43 (0.42-0.44)    NA                  0.99 (0.98-1.00)    0.99 (0.97-1.01)
Low Birth Weight
  No Prenatal Care   1.00 (--)           1.00 (--)           1.00 (--)           1.00 (--)           1.00 (--)
  Early              0.32 (0.32-0.33)    0.33 (0.32-0.34)    0.54 (0.52-0.57)    0.90 (0.89-0.90)    0.90 (0.89-0.90)
  Delayed            0.34 (0.33-0.25)    0.35 (0.34-0.36)    0.55 (0.53-0.57)    0.96 (0.95-0.97)    0.97 (0.95-0.98)
HRTI= Time invariant hazard ratio
ORADJ= Time invariant odds ratio adjusted for gestational age at birth
HRTV=Time varying hazard ratio using observed exposure timing
HRImpTV=Time varying hazard ratio using imputed exposure timing
CHAPTER 1 FIGURES
            D
            +     ‐     P‐Time
E     +     a     b     T+
      ‐     c     d     T‐
Figure 1.1: Calculation of the incidence rate ratio from a 2X2 table
                           Preterm Birth
                           Yes   No    Time
Exposure   None            A     b     T0
           1st Trimester   C     d     T1
           2nd Trimester   E     f     T2
           3rd Trimester   G     h     T3
Figure 1.2: Calculation of incidence rate ratio of jointly modeled exposures
[Figure 1.3, panels (a) to (c): Odds Ratio (y‐axis) plotted against Dose (x‐axis, 0 to 1) for p(E|E)=0.9 with RR=0.5, RR=1, and RR=2.0]
Figure 1.3: Dose response patterns for (a) constant probability of exposure initiation, (b) declining probability of exposure initiation, (c) increasing probability of exposure initiation.
Chapter 2: EVALUATING MISSING DATA DESIGNS IN THE PRESENCE OF NON‐DESIGNED
MISSING DATA: APPLICATIONS IN PERINATAL EPIDEMIOLOGY
Abstract
The feasibility of designs incorporating repeated exposure measures in pregnancy is
limited due to cost and excessive subject burden within a short duration of time. Prior
methodological work has identified designed missingness as an efficient and valid tool
for longitudinal assessment of outcomes. The goal of this paper is to introduce concepts
of missing data designs for prospective assessment of exposures and to assess
performance of missing data designs in scenarios encountered in observational
epidemiology. Study designs with designed missing data were compared to the
traditional cohort design with intended complete exposure ascertainment. We use
simulated data to quantify bias and relative efficiency under several scenarios
representing a range of non‐designed missing data due to non‐compliance or loss to
follow‐up. We further evaluate the performance of missing data designs using an
observational dataset implementing multiple unique patterns of designed missing data.
We observed that study designs with designed missing data were unbiased relative to
the comparative traditional cohort study. Efficiency of the missing data designs was
dependent on the between time correlation of the true exposure, the within time
correlation between the proxy exposure and the true exposure, and the proportion of
observations with non‐designed missing data. Missing data designs were more
susceptible to a loss of precision in the presence of non‐designed missing data.
Within the observational dataset, we observed that participant compliance was
strongest among the missing data designs. In conclusion, missing data designs are a
viable option for prospective assessment of exposures. Intensive studies should
consider missing data designs as a means to improve efficiency, reduce subject burden,
and reduce selection bias.
Introduction:
In perinatal epidemiology, exposures of interest are often time‐varying and may have
narrow yet unknown relevant etiologic periods, thus requiring repeated assessments to
be validly measured [2, 3, 48‐50]. The feasibility of designs incorporating repeated
measures in pregnancy is limited due to cost and excessive subject burden within a
short duration of time. Minimizing subject burden in perinatal investigations is
particularly important due to mothers’ increased resistance to invasive and non‐invasive
study methodologies [19, 20]. Recognizing constraints on financial cost and subject
burden, designing studies of repeatedly measured exposures can be presented as a
balancing act between small and thick or large and thin [51]. That is, designs may
sacrifice sample size to maximize exposure data or sacrifice exposure data to maximize
sample size. Without constraints on cost, time, or subject burden, ideal studies would
be large and thick. As a consequence, design methodologies have been developed to
maximize the amount of information collected at a fixed cost (i.e. attempt to approach
the validity of a thick study at the cost of a thin study).
Missing data designs, including partial questionnaire designs [26‐29], multi‐cohort
longitudinal designs [30, 31], and multi‐measurement methods of construct assessment
[28, 30], deliberately introduce data that is either missing at random (MAR) or missing
completely at random (MCAR) so as to validly utilize missing data methods when making
inference. In doing so, they require less intensive follow‐up protocols than a
comparable complete ascertainment study design. Unbiased estimates can be
obtained by ignoring (MCAR), multiply imputing (MCAR and MAR), or by maximum
likelihood based approaches (MCAR and MAR)[28]. In idealized simulation scenarios,
missing data designs have been shown to improve statistical efficiency without
sacrificing validity.
Recognizing that epidemiologists may be reluctant to rely on missing data methods for
the primary exposure of interest, it is important to note that traditional epidemiological
designs can be conceived as missing data designs in which all designed missing data is
clustered among the unsampled population [52]. Typically, missing data among the
unsampled population is considered MCAR, thus justifying analyses ignoring the missing
data. While the designed missing data in traditional studies is primarily concerned with
cost (e.g. random sampling of source population in a cohort study) and statistical
efficiency (e.g. outcome dependent sampling in a case‐control study), designed missing
data in novel missing data designs has the additional benefit of reducing individual‐level
subject burden while also controlling cost and maximizing statistical efficiency. Through
reducing subject burden, missing data designs may have an added benefit of reducing
selection bias due to self‐selection and loss to follow‐up.
Assessment of associations between time‐varying exposures and adverse pregnancy
outcomes may be well suited for missing data designs because pregnant women are
more reluctant to participate in burdensome studies [20], adequate assessment of many
perinatal exposures requires measurements at multiple time‐points [49, 50], and non‐
invasive measures may not adequately characterize the conceptual exposures
experienced by the fetus [53]. Though theoretically appropriate, missing data designs
are rarely used in perinatal investigations, in part due to a general mistrust in missing
data methodologies [32]. Additionally, prior methodological publications have not
assessed performance of missing data designs implemented in the context of time
varying exposures nor have missing data designs been evaluated in scenarios with
non‐designed missing data due to non‐compliance and loss to follow‐up. The goals of
this paper are to introduce concepts of missing data designs and to assess the
performance of missing data designs in non‐idealized scenarios encountered in
observational epidemiology by exploring the impact of non‐designed missingness on the
statistical efficiency and validity.
Simulation Methods
To evaluate missing data designs in the presence of non‐designed missingness, we
analyzed simulated data and data from the Health and Nutrition in Pregnancy Study
(NIP). Simulated data of generalized exposure scenarios were created within a source
population representing 100,000 pregnant women. Variable inputs for our simulations
included the total number of assessment time points, distributions of exposures
measured by gold standard or proxy, correlation between exposures between time
points [COR(Gi, Gi+1), where Gi is the gold standard measured at time i], correlation
between exposure methods within time points [COR(Gi, Pi), where P is the proxy
exposure assessment], and the magnitude of association between a binary outcome and
exposure at each time‐point. For analyses presented in this paper, we assumed a study
with 5 designed assessment time points with between‐time and within‐time exposure
correlations ranging from 0.2 to 0.9. Outcomes were simulated such that there was a 2‐
fold increased risk per unit change in exposure at a specific time point.
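The simulation inputs described above can be illustrated with a short Python sketch. Several details are assumptions made for illustration only: exposures are standard normal, the between‐time correlation follows a first‐order autoregressive structure, the outcome follows a log‐linear risk model capped at 1, and the baseline risk of 0.05 is hypothetical. The function names are likewise hypothetical.

```python
import numpy as np

def simulate_gold_and_proxy(n, n_times, rho_between, rho_within, rng):
    """Simulate gold-standard (G) and proxy (P) exposures with unit variance.

    COR(G_i, G_{i+1}) = rho_between (AR(1) structure assumed)
    COR(G_i, P_i)     = rho_within
    """
    gold = np.empty((n, n_times))
    gold[:, 0] = rng.standard_normal(n)
    for i in range(1, n_times):
        noise = rng.standard_normal(n)
        gold[:, i] = rho_between * gold[:, i - 1] + np.sqrt(1 - rho_between**2) * noise
    proxy_noise = rng.standard_normal((n, n_times))
    proxy = rho_within * gold + np.sqrt(1 - rho_within**2) * proxy_noise
    return gold, proxy

def simulate_outcome(exposure, baseline_risk, risk_ratio, rng):
    """Binary outcome with a risk_ratio-fold change in risk per unit of exposure."""
    risk = np.minimum(baseline_risk * risk_ratio**exposure, 1.0)
    return (rng.random(exposure.shape[0]) < risk).astype(int)

rng = np.random.default_rng(0)
gold, proxy = simulate_gold_and_proxy(100_000, 5, 0.5, 0.9, rng)
# Outcome driven by the gold-standard exposure at the third time point
outcome = simulate_outcome(gold[:, 2], baseline_risk=0.05, risk_ratio=2.0, rng=rng)
```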
For a study with exposure assessed at 5 time points and where there is an accepted gold
standard and a reasonable proxy, there are essentially an unlimited number of potential
missing data designs. We have selected four candidate designs for comparison (Figure
2.1): a complete ascertainment study in which the gold standard is measured at all time
points for all participants, a two‐stage design in which the proxy is measured at all time
points but the gold standard is only measured at time 1 (t1) and one additional time
point (t2‐t5), a multi‐cohort design in which the gold standard is measured at three time
points, and a complex design utilizing elements of two‐stage and multi‐cohort designs.
Each of the candidate designs includes a common component in which the gold
standard is measured for all subjects at t1. The variable components at t2‐t5 collect
exposure data with less intensity.
We compared the validity and precision of these study designs at a fixed cost. Though
cost may conceptually include factors such as time, subject burden, and effort, we chose
to compare designs at a fixed financial cost. As described by Helms (1992), the cost of
a study could be thought of consisting of a core administrative cost (Admin) that would
be relatively constant between designs, a subject recruitment cost (SRC), and the
subject assessment cost (SAC)[31]. Cost is also impacted by the sample size (N) and the
number of assessments per subject (K). Thus the total cost of a study may be
summarized in a simplified model.
Cost = Admin + N × (SRC + K × SAC)
The cost model assumes that subject recruitment costs and subject assessment costs do
not vary as the sample size or number of assessments per subject changes. If we
further assume that the administrative costs, including initiation of the study,
maintenance of the data, analysis of the data, and production of a manuscript, are
relatively constant between designs, then this cost contributor may be dropped from
the model for comparative purposes.
Using recruitment and assessment costs (gold standard and proxy), we obtained sample
sizes at a fixed total cost for each of the candidate designs by solving for N in our cost
model. For subject recruitment, we assumed a scenario requiring 1 hour of effort at
$20/hr per subject. The cost of a proxy assessment was intended to be representative
of a phone interview with 1 hour of effort at $20/hr. The gold standard was intended to
be representative of routine biospecimen collection and evaluation costing $100 plus an
additional 2 hours of effort at $20/hr. The cost and sample size parameters are
summarized in Table 2.1.
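Solving the cost model for N can be illustrated with the unit costs given above. In the Python sketch below, the total budget is a hypothetical figure, the administrative cost is dropped as described, and the complete ascertainment design is assumed to collect no proxy measures. The complex design is omitted because its per‐subject cost varies across subgroups.

```python
RECRUITMENT = 20.0        # 1 hour of effort at $20/hr
PROXY = 20.0              # phone interview: 1 hour at $20/hr
GOLD = 100.0 + 2 * 20.0   # biospecimen ($100) plus 2 hours at $20/hr

def sample_size(budget, n_gold, n_proxy):
    """Largest N with N * (recruitment + assessment costs) within the budget."""
    per_subject = RECRUITMENT + n_gold * GOLD + n_proxy * PROXY
    return int(budget // per_subject)

BUDGET = 720_000.0  # hypothetical fixed total cost
# (gold-standard assessments, proxy assessments) per subject
designs = {
    "complete ascertainment": (5, 0),
    "two-stage": (2, 5),
    "multi-cohort": (3, 0),
}
sizes = {name: sample_size(BUDGET, g, p) for name, (g, p) in designs.items()}
```

With these inputs the per‐subject costs are $720, $400, and $440 respectively, so the less intensive designs enroll more subjects for the same total cost.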
To simulate data collected from each design, we first sampled the specified number of
subjects from the simulated source population. Next we introduced missing data
consistent with the design by deleting exposure values. No values were deleted from
the complete ascertainment design during this stage of the simulation. For the two‐
stage design, sampled participants were randomly assigned 1 time point after t1 at which the
gold standard would be measured. For the multi‐cohort design, all proxy measures
were deleted and 2 gold standard assessments were randomly deleted within each
subject. For the complex design, 10% of the population retained complete data on both
the gold standard and proxy exposure; 10% retained all proxy exposures, the baseline
gold standard, and one additional randomly selected gold standard; and 80% retained
the baseline gold standard, the baseline and final proxy, and 1 additional randomly
identified gold standard. We then introduced non‐designed missing data by randomly
deleting 5%, 10%, 20%, 30%, and 40% of designed measures at each time point.
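The two deletion stages described above can be sketched for a single subject under the two-stage design. This is an illustrative Python sketch, not the simulation code used in the study (which was written in SAS). The function names and example values are hypothetical, and non-designed missingness is assumed here to be an independent draw for each value, so the deleted share matches the target percentage only in expectation.

```python
import random

def apply_two_stage_design(gold, rng):
    """Designed missingness: keep the gold standard at one random time point only."""
    keep = rng.randrange(len(gold))
    return [v if t == keep else None for t, v in enumerate(gold)]

def apply_nondesigned_missingness(values, p_miss, rng):
    """Non-designed missingness: delete each still-observed value with probability p_miss."""
    return [None if (v is not None and rng.random() < p_miss) else v for v in values]

rng = random.Random(2011)
gold = [1.2, 0.8, 1.5, 1.1, 0.9]  # hypothetical gold standard values at t1..t5
designed = apply_two_stage_design(gold, rng)
observed = apply_nondesigned_missingness(designed, p_miss=0.2, rng=rng)
```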
We used multiple imputation methods to impute designed and non-designed missing
data simultaneously [46, 54-57]. Multiple imputation methods for missing data designs
have been presented as a preferred method due to ease of use in readily available
software and the ability to perform sensitivity analyses where non-designed missing data are
expected to be MNAR [28]. Markov chain Monte Carlo (MCMC) multiple imputation
(MI) was chosen to simultaneously impute missing values for the gold standard at all
time points [54, 57]. The first step of MCMC MI requires the calculation of prior
parameters for multivariate normal means and covariances of variables included in the
imputation model. To accomplish this, missing values were initially filled in using a
vector of mean values and the covariance matrix estimated from the EM algorithm for
the observed data. Using the parameters from the prior model, missing data were
updated (imputed) by sampling a value from the predictive distribution. Posterior
parameters for multivariate normal predictive means and covariance matrix were then
estimated using the observed and imputed values for all variables in the imputation
model. These imputation and estimation steps were iterated until the vector of means and
the covariance matrix were unchanged between consecutive iterations. The entire
procedure was repeated to produce
10 complete datasets. All analyses and imputations were performed using SAS version
9.2 (SAS Institute, Cary, NC).
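The text does not describe how estimates from the 10 completed datasets were combined. The conventional approach is Rubin's combining rules, and the Python sketch below shows them as an assumption about this step rather than as the author's documented procedure. The function name is hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine m per-dataset estimates and their squared standard errors."""
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(variances)              # within-imputation variance
    b = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = w + (1 + 1 / m) * b      # total variance
    return q_bar, total ** 0.5       # pooled estimate and pooled standard error
```

In the setting above, `estimates` would hold the 10 log odds ratios for a given time point and `variances` their squared standard errors.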
Timing-specific measures of association were quantified using logistic regression. We
report bias and relative efficiency separately for the common component and the
variable components. Bias was quantified as the percent difference between the
specified and observed effect estimates. We evaluated the efficiency of the study
designs by calculating the ratio of the standard errors from the complete ascertainment
design to the standard errors from the missing data designs subjected to the same
degree of non-designed missingness. Relative efficiencies greater than 1 represent
scenarios in which the missing data design was more efficient than the complete
ascertainment design.
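The two performance measures can be written out explicitly. In this Python sketch the function names are hypothetical, and the sign convention for bias (observed minus specified, relative to specified) is an assumption. Relative efficiency follows the definition given in the table notes: the standard error from the complete ascertainment design (CAD) divided by that from the missing data design (MDD).

```python
def percent_bias(specified, observed):
    """Percent difference between the observed and the specified effect estimate."""
    return 100.0 * (observed - specified) / specified

def relative_efficiency(se_cad, se_mdd):
    """Values above 1 mean the missing data design has the smaller standard error."""
    return se_cad / se_mdd

print(percent_bias(2.0, 2.5))           # 25.0
print(relative_efficiency(0.5, 0.25))   # 2.0
```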
Observational Data Methods
The Health and Nutrition in Pregnancy Study enrolled 2478 women for the purpose of
assessing the association between dietary factors and adverse pregnancy outcomes
[58]. Women were recruited from 1996 through 2000 from 56 obstetrical practices and
15 clinics associated with six hospitals in Connecticut and Massachusetts. Eligible
women had a gestational age less than 24 weeks at enrollment and had no prior history
of diabetes. Women were assigned to one of three screening protocols sharing a
common component including a baseline survey and urine sample and a postpartum
survey: 1) telephone group consisting of one telephone follow‐up interview at 20, 28, or
36 weeks; 2) biomonitoring group consisting of three telephone follow‐up interviews at
20, 28, and 36 weeks in conjunction with an additional urinary sample at one of the
follow‐up times; or 3) intensive monitoring group consisting of three in‐person
interviews and three urinary samples at 20, 28, and 36 weeks. For our purposes, each
protocol (intensive, telephone, and biomonitor) and the overall Nutrition in Pregnancy
(NIP) design were considered representative of a unique missing data design, with the
intensive group
serving as the traditional complete ascertainment cohort design. We evaluated the
performance of these designs by quantifying the magnitude and precision of the
association between maternal smoking and birth weight using data from the Health and
Nutrition in Pregnancy Study. We had insufficient numbers to evaluate other clinically
relevant outcomes such as small for gestational age.
As with our simulated data, we compared the efficiency of the missing data designs
(telephone, biomonitor, NIP) relative to the complete ascertainment design (intensive
group) at a fixed cost. Based on observable covariates, assignment to study protocol
was not completely at random. This non‐random assignment would produce a selection
bias such that intra‐study comparisons would not be valid. To remove the selection bias
we employed a cluster sampling approach within each design. Sampling weights were
obtained using predicted probability (Ppredict) of being assigned to the intensive protocol
as quantified using a logistic regression with baseline covariates. Sampling weights were
assigned as 1 − Ppredict for those assigned to the intensive group and Ppredict for those in the
telephone and biomonitoring groups. We then sampled from each protocol according to
the sampling weights such that the total cost and the probability of being assigned to
the intensive group within each missing data design was the same. The distributions for
the predicted probabilities for being assigned to the intensive protocol are presented in
Figure 2.2. We evaluated the distribution of baseline characteristics to confirm
comparability between designs after weighting (Table 2.2). Prior to propensity
weighting, those assigned to the intensive protocol were more likely to be married,
white, higher educated, and recruited in the first trimester. After weighting, no notable
differences were observed.
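The weighting rule described above can be sketched as follows. This Python sketch assumes the predicted probabilities have already been obtained from the logistic regression. The record layout and function names are hypothetical, and sampling with replacement is an assumption, since the text does not state how the weighted draws were made.

```python
import random

def sampling_weight(p_predict, protocol):
    """1 - Ppredict for the intensive group, Ppredict for the other protocols."""
    return 1.0 - p_predict if protocol == "intensive" else p_predict

def weighted_sample(subjects, n, rng):
    """Draw n subjects with probability proportional to their sampling weight."""
    weights = [sampling_weight(s["p_predict"], s["protocol"]) for s in subjects]
    return rng.choices(subjects, weights=weights, k=n)

# Hypothetical records: predicted probability of intensive assignment and actual protocol.
subjects = [
    {"id": 1, "p_predict": 0.25, "protocol": "intensive"},
    {"id": 2, "p_predict": 0.75, "protocol": "intensive"},
    {"id": 3, "p_predict": 0.25, "protocol": "telephone"},
]
sample = weighted_sample(subjects, n=2, rng=random.Random(0))
```

Under this rule, intensive-group members who looked unlikely to be assigned to the intensive protocol are drawn more often, as are telephone and biomonitoring members who looked likely to be assigned to it.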
To ensure that individuals had an opportunity to complete assessments at all designed
time points, this analysis was restricted to women recruited prior to week 20 with
deliveries occurring after 36 weeks gestation. Small for gestational age was defined as a
birth weight below the 10th percentile for gestational age, gender, and ethnicity using an
external standard developed from 1999 US Natality Data. The primary exposure of
interest was urinary cotinine. Participants were provided with urine containers and
requested to collect urine between dinner and bedtime on the night prior to the
interview. Samples were collected the following day and were analyzed for urinary
cotinine by Labstat, Inc. (Kitchener, Ontario, Canada). As a proxy measure, women were
asked to report the number of cigarettes smoked per day during the previous week at weeks
20, 28, and 36. Respondents were asked to recall their first trimester exposures during
their baseline interview.
Within each study design, we assessed compliance with both proxy and gold standard
measures by calculating the total percentage of designed measures that were missing.
The association between urinary cotinine and birth weight was quantified using
multivariable linear regression adjusting for maternal age, race, and gestational age. Cotinine
measures were simultaneously assessed in the regression models and variance inflation
factors were generated to assess presence of multicollinearity (VIF > 2). Standard errors
for observed associations were compared across studies.
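For reference, the variance inflation factor for a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the others. The Python sketch below covers only the two-predictor special case, where R² equals the squared Pearson correlation. It is an illustration with hypothetical function names, not the SAS procedure used in the analysis, which involved several cotinine measures and covariates.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)
```

With the VIF > 2 threshold used above, two predictors would be flagged once their squared correlation exceeds 0.5.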
Simulation Results
We did not observe any bias due to designed missing data in any of the assessed
scenarios. In both the common component and the variable component, the relative
efficiency of the missing data designs declined as the prevalence of non‐designed
missing data increased (Table 2.3). For the common component, missing data designs
were more efficient than complete ascertainment designs subjected to the same
probability of non‐designed missing data. For the variable components, each of the
missing data designs was more efficient where correlation was high and there was no
non‐designed missing data. The multi‐measurement design remained more efficient
even when the within time correlation was low and the probability of non‐designed
missing data was high. For the multi‐cohort design, there was no efficiency advantage
where the prevalence of non‐designed missing data was 10% or above or where the
between time correlation was below 0.9. The complex missing data design only realized
increased efficiency where the between time and within time correlations were high
(Cor=0.9) and non‐designed missingness was low (Pmiss≤0.1).
Observational Data Results
Using the observational data obtained from the Health and Nutrition in Pregnancy
Study, we observed that compliance with the study assessments was inversely related
to the demands of the protocol (Table 2.4). While non-designed missingness was high
for both proxy (12%‐21%) and gold standard (16%‐38%) measures, the intensive group
had the highest percentage of non‐designed missing data of the gold standard urinary
measures. Similarly, the biomonitoring and the intensive groups had the highest
percentage of non‐designed missing data of the self reported smoking measures.
When quantifying the association between urinary cotinine and birth weight, missing
data designs tended to be more efficient. However, the associations observed in the
missing data designs were somewhat attenuated relative to the complete ascertainment
design. Due to the small sample size within each study protocol used in this analysis,
associations between cotinine and birth weight did not achieve statistical significance
with the exception of first trimester exposures assessed using the complex NIP design.
Similar findings were observed with respect to self-reported smoking.
Discussion
This paper is unique in that we evaluated missing data designs in the presence of non‐
designed missing data using both simulated and observational data. We have
demonstrated that missing data designs have the potential to increase study efficiency
and reduce subject burden while still obtaining valid results; however, investigators
must carefully consider the expected prevalence of non‐designed missing data resulting
from non‐compliance and loss to follow‐up. The missing data designs evaluated in this
paper were more susceptible to loss of precision in the presence of non‐designed
missing data. The relative efficiency was dependent on the pattern of designed
missingness, joint exposure distributions (gold standard and proxy), and prevalence of
non‐designed missingness. In addition to increased efficiency, missing data designs may
increase validity by increasing sample size and reducing selection bias at enrollment and
due to subject fatigue (non‐compliance and loss to follow‐up) by minimizing the burden
on study participants. In our example using data from the Nutrition in Pregnancy Study,
the probability of non‐compliance with the study protocol for the gold standard
assessment was much greater in the complete ascertainment cohort (PMiss=0.38) as
compared to the missing data designs (PMiss=0.16 to 0.24).
Missing data designs are often viewed as methods of introducing missingness into a
study design so as to decrease cost; however, this incorrectly ignores designed
missingness present in traditional study designs (i.e., missingness clustered in the
unsampled population). A more accurate description would be that missing data
designs attempt to optimize the distribution of missing data in a design so as to
maximize statistical and/or cost efficiency. Morara et al. address missing data
optimization in longitudinal designs by defining a probability function for the probability
of being sampled at a particular stage at a particular time given cost, between time
correlations, and exposure ascertainment correlations. Using this equation, it can be
shown that complete ascertainment cohort and case‐control studies are specific types
of missing data designs. Thus the primary difference between complete ascertainment
and missing data designs is the distribution, rather than presence, of missing data.
In this paper, we have focused on missing data designs with direct applications to
repeated measures in perinatal epidemiology; however, the partial questionnaire design
warrants some discussion as it was the predecessor to the multi‐cohort methods.
Several partial questionnaire designs have been developed to address survey fatigue
[28, 29, 59-61]. The three-form design, a specific type of partial questionnaire, divides
the parent survey into four components: a common component (X) and three variant
components (A, B, C) [28, 29]. Three different surveys are then developed by combining
the common component with two of the variant components (XAB, XAC, XBC). Study
respondents are then randomly assigned one of the survey instruments, but no single
respondent answers all potential survey questions (XABC). Thus, missingness of survey
responses is expected to be MCAR through randomization. Using this method,
correlations can be obtained between all variables included in the parent survey even
though each individual only answers a subset of the questions. If the missing data is
correlated with observed data, imputation or maximum likelihood methods can be used
to obtain valid estimates of associations while increasing precision of the estimates as
compared to complete case or available case analyses.
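The three-form design described above can be illustrated with a short Python sketch. The names are hypothetical. The sketch assigns forms at random and checks the property that makes the design work: every pair of components appears together on at least one form, even though no form contains all four.

```python
import random
from itertools import combinations

# Each form combines the common component X with two of the variant components.
FORMS = {"XAB": {"X", "A", "B"}, "XAC": {"X", "A", "C"}, "XBC": {"X", "B", "C"}}

def assign_forms(subject_ids, rng):
    """Randomly assign one of the three forms to each respondent."""
    names = sorted(FORMS)
    return {sid: rng.choice(names) for sid in subject_ids}

def pairs_covered():
    """Component pairs that are answered together on at least one form."""
    return {pair for comps in FORMS.values() for pair in combinations(sorted(comps), 2)}

assignment = assign_forms(range(6), random.Random(0))
all_pairs = set(combinations(["A", "B", "C", "X"], 2))
print(pairs_covered() == all_pairs)  # True
```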
The multi‐cohort design is similar to the partial questionnaire design, the difference
being that the multi‐cohort randomizes subjects in a longitudinal study to have missing
values at some of the follow‐up times. The multi‐cohort design was initially described
by Helms (1992) in the context of evaluating the efficacy of a weight loss treatment [31].
In the example provided by Helms, subjects could be assigned to one of three follow‐up
protocols. Assuming high correlations between consecutive assessment intervals
(COR>0.7), little information was lost by decreasing the proportion of subjects in protocol 1
and increasing the proportion in protocol 2 or protocol 3. Similar methods can be applied in
perinatal investigations of time‐varying exposures provided some degree of temporal
correlation is expected. Exposures such as diet, physical activity, air pollution, and
drug/alcohol use are potential exposures for which this design may work well.
Multi‐measurement methods of construct assessment quantify exposure using one or
more measures at each time‐point. The most well known multi‐measurement method
is the two‐stage design[62, 63], though this design is typically not implemented in a
longitudinal setting. The two‐stage method of construct measurement is appropriate
where the gold standard of exposure ascertainment is relatively expensive and there is
an affordable and reasonably valid proxy measure [28]. In a longitudinal two‐stage
design, proxy exposure measures could be ascertained at all time points for all
participants. Exposure ascertainment by gold standard could then be randomized to
individuals within the study sample and time points within an individual. The intent of
the two‐stage design is to obtain more power than is attainable with the gold standard
alone and more validity than is attainable with the proxy alone [28]. While the
multi-measurement missing data design is comparable in form to two-stage validation
studies, the two differ largely in how the data are analyzed. Typically, designs
incorporating validation studies either proceed with analysis using an uncorrected proxy
(if the proxy is deemed to be sufficiently valid) or analyses are corrected using a
regression calibration. Analyzing as a missing data design offers some advantages over
regression calibration. Most notably, the missing data design imputes values only for
individuals and time-points with missing data, while regression calibration adjusts all
observations with respect to their measured proxy even when the gold standard is
observed. Examples of perinatal exposures suitable for this design include dietary
factors, medication use, drug use, and environmental or occupational exposures.
Methods for power and sample size calculations for missing data designs have been
developed; however current methods do not allow for time varying exposures or non‐
designed missing data[30, 64]. It was not within the scope of this study to develop
methods for power and sample size calculations; however, through data simulation,
investigators can identify relative efficiency and susceptibility to bias of candidate study
designs. Though potentially burdensome, this would be a valuable exercise given the
potential for increased efficiency, reduced bias, reduced non‐compliance and loss to
follow-up, and reduced subject burden. Future work should be targeted towards
developing software to identify the optimum distribution of designed missing data in
studies with time varying exposures and non-designed missing data.
Complete ascertainment was obtained on a subset of the data for each of the missing
data designs assessed in this study. Each study design attempted to assess baseline
exposure on all subjects. For this reason, baseline associations between urinary cotinine
and SGA were consistently more precisely estimated when using missing data designs.
By incorporating a common component between study protocols within a design,
missing data designs allow for the estimation of some measures of association in the
absence of designed missing data. While designed missingness will not introduce bias
on average and may increase efficiency, it is preferable to minimize missing data in
variables required to test the primary hypothesis by distributing the missing data among
ancillary variables and time‐points. Where prior evidence indicates a specific relevant
etiologic period of interest, the common component of the missing data designs should
include detailed and complete assessment within the period of interest. For example,
when assessing the association between smoking and growth restriction, we may design
a study with a common component assessing detailed exposure in the third trimester
and variable components assessing exposure in the first and early second trimesters
based on prior evidence. The intensive protocol of the NIP study was included to have
complete data on a 10% sample of the study population. Missing data designs
incorporating a complete ascertainment protocol are better equipped to estimate
partial correlations or interactions involving three or more covariates/time points. In
addition to directly assessing these interactions, this feature enables more precise
imputations and increases the statistical efficiency of the study design.
With the advent of readily available software for multiple imputations and maximum
likelihood analyses, missing data designs offer a viable option for increasing precision
and decreasing subject burden without sacrificing validity. The relative efficiency of
missing data designs is highly sensitive to the distribution of exposure variables, costs
associated with recruitment and assessments, and the presence of non‐designed
missing data. Methods for power and sample size calculations and missing data
optimization need to be developed; however, data simulations can adequately compare
candidate study designs under variable conditions of non‐designed missingness.
Missing data designs may be particularly useful to minimize subject burden in studies of
vulnerable populations such as pregnant women and newborns.
CHAPTER 2 TABLES
Table 2.1: Sample size and cost parameters used for data simulations

                                                        Design Cost
Design                  Sample Size   Recruitment     Gold Standard    Proxy Measure   Total Cost
                                      ($20/Subject)   ($140/Measure)   ($20/Measure)
Complete Ascertainment       543         10,870          391,304                         500,000
Multi-Measurement           1042         20,833          375,000          104,167        500,000
Multi-Cohort                 893         17,857          482,143                         500,000
Complex                     1397         27,933          377,095           94,972        500,000
Table 2.2: Characteristics of study participants within protocols of the Nutrition in Pregnancy Study prior to and after weighting

                                         Pre-Weighting                          Weighted*
                               Intensive  Telephone  Biomonitor     Intensive  Telephone  Biomonitor
                               n=100      n=315      n=178          n=100      n=315      n=178
Maternal Age
  <25                          18%        21%        25%            23%        22%        26%
  25-29                        27%        25%        25%            29%        27%        22%
  30-34                        35%        35%        32%            28%        32%        33%
  >35                          20%        19%        19%            20%        19%        19%
Marital Status
  Married                      75%        71%        63%            73%        69%        66%
  Never Married                21%        24%        32%            21%        25%        30%
  Divorced/Separated/Widowed   4%         5%         4%             6%         6%         4%
Ethnicity
  White                        72%        70%        60%            68%        68%        64%
  Black                        5%         9%         8%             10%        9%         7%
  Hispanic                     22%        19%        28%            19%        21%        26%
  Other                        2%         2%         4%             3%         3%         3%
Education
  <HS                          14%        13%        19%            17%        17%        17%
  HS                           18%        18%        16%            17%        17%        19%
  Some College                 19%        22%        27%            18%        18%        18%
  College                      21%        25%        24%            32%        31%        28%
  College+                     28%        21%        13%            16%        16%        18%
GA at 1st Interview
  1st trimester                86%        26%        23%            58%        57%        60%
  2nd trimester                14%        74%        77%            42%        43%        40%

*Inverse probability of being assigned to the intensive protocol used as sampling weight to remove selection bias between protocols
Table 2.3: Bias and relative efficiency of missing data designs (MDD) in presence of non-designed missing data relative to complete ascertainment designs (CAD), 1000 data simulations

                                             Common Component†                 Variable Components*
                                             Bias (Relative Efficiency)        Bias (Relative Efficiency)
                                             Probability of non-designed       Probability of non-designed
                                             missingness                       missingness
Design                                       0         0.1       0.3           0         0.1       0.3
Multi-Measurement
  Cor(Ei, Pi)=0.9                            0 (1.75)  0 (1.75)  0 (1.75)      0 (1.69)  0 (1.64)  0 (1.56)
  Cor(Ei, Pi)=0.6                            0 (1.75)  0 (1.75)  0 (1.75)      0 (1.37)  0 (1.35)  0 (1.28)
  Cor(Ei, Pi)=0.2                            0 (1.75)  0 (1.74)  0 (1.74)      0 (1.25)  0 (1.21)  0 (1.10)
Multi-Cohort
  Cor(Ei, Ei+1)=0.9                          0 (1.28)  0 (1.25)  0 (1.20)      0 (1.01)  0 (0.93)  0 (0.87)
  Cor(Ei, Ei+1)=0.6                          0 (1.28)  0 (1.25)  0 (1.20)      0 (0.99)  0 (0.90)  0 (0.87)
  Cor(Ei, Ei+1)=0.2                          0 (1.28)  0 (1.28)  0 (1.24)      0 (0.98)  0 (0.89)  0 (0.92)
Complex
  Cor(Ei, Ei+1)=0.9 & Cor(Ei, Pi)=0.9        0 (1.60)  0 (1.58)  0 (1.56)      0 (1.15)  0 (1.13)  0 (1.06)
  Cor(Ei, Ei+1)=0.6 & Cor(Ei, Pi)=0.6        0 (1.60)  0 (1.56)  0 (1.52)      0 (1.06)  0 (0.84)  0 (0.81)
  Cor(Ei, Ei+1)=0.2 & Cor(Ei, Pi)=0.2        0 (1.60)  0 (1.59)  0 (1.53)      0 (0.83)  0 (0.76)  0 (0.80)

Bias: Percent difference between CAD and MDD odds ratios
Relative Efficiency: standard error from CAD divided by standard error from MDD
† Bias and relative efficiency for t1
* Average bias and relative efficiency for t2-t5
Table 2.4: Cost-fixed sample size and compliance within study protocols among participants in the Nutrition in Pregnancy Study

                                    Complete Ascertainment    Missing Data Designs
                                    Intensive                 Biomonitoring   Telephone   NIP
Sample Size                         100                       178             315         244
Proxy (Interview Data)
  Prenatal Interviews/Subject       4                         4               2           2.4
  # Designed Prenatal Interviews    400                       712             630         585.6
  # Observed Prenatal Interviews    318                       549             557         498
  % Non-Designed Missing            21%                       23%             12%         15%
Gold Standard (Urinary Cotinine)
  Prenatal Samples/Subject          4                         2               1           1.4
  # Designed Prenatal Samples       400                       356             315         342
  # Observed Prenatal Samples       250                       271             266         266
  % Non-Designed Missing            38%                       24%             16%         22%
Table 2.5: Association between measures of smoking and birth weight by week of pregnancy and study design among participants in the Nutrition in Pregnancy Study.

                   Complete Ascertainment       Missing Data Designs
                   Intensive                    Telephone                            Biomonitor                           NIP
Birth Weight       Δ (95% CI)                   Δ (95% CI)                 Rel Eff   Δ (95% CI)                 Rel Eff   Δ (95% CI)                 Rel Eff
Urinary Cotinine
  1st Trimester    -12.52 (-43.27, 18.23)       -16.71 (-50.63, 17.21)     1.13      -4.15 (-23.45, 15.16)      1.59      -17.20 (-31.93, -2.47)     2.09
  Week 20          -21.31 (-55.16, 12.55)       -10.01 (-45.37, 25.34)     0.96      -7.50 (-74.59, 59.59)      0.50      -3.35 (-10.90, 4.20)       4.49
  Week 28          -29.49 (-118.38, 59.39)      -7.42 (-38.79, 23.96)      2.33      -5.19 (-38.36, 27.99)      2.68      -3.14 (-23.30, 17.02)      4.41
  Week 36          -57.11 (-152.63, 38.40)      -11.32 (-51.28, 28.65)     2.54      0.70 (-24.02, 25.41)       3.86      -11.06 (-29.78, 7.65)      5.10
Reported Smoking
  1st Trimester    -18.98 (-57.36, 19.39)       -14.57 (-29.79, 0.65)      2.52      -18.98 (-57.36, 19.39)     1.00      -25.61 (-39.71, -11.52)    2.72
  Week 20          -9.22 (-43.06, 24.62)        -7.52 (-27.92, 12.88)      1.66      -18.14 (-57.10, 20.82)     0.87      -6.76 (-36.59, 23.06)      1.13
  Week 28          -28.69 (-101.94, 44.56)      -11.90 (-32.89, 9.09)      3.49      -30.93 (-110.06, 48.20)    0.93      -6.74 (-28.59, 15.10)      3.35
  Week 36          -25.29 (-126.98, 76.39)      -7.43 (-22.76, 7.90)       6.63      -1.39 (-99.00, 96.23)      1.04      -14.18 (-37.50, 9.15)      4.36

Δ: Difference in birth weight in grams per unit increase in exposure (per 50 ng of urinary cotinine or per cigarette of reported smoking). Adjusted for maternal age, race, and gestation at delivery
Rel Eff: Relative Efficiency defined as standard error from CAD divided by standard error from MDD
CHAPTER 2 FIGURES
Figure 2.1: Candidate missing data designs
[Figure: four panels (Complete Ascertainment, Multi-Measurement, Multi-Cohort, Complex), each laid out across time points t1-t5. The legend distinguishes gold standard measures, proxy measures, and the unsampled population.]
Figure 2.2: Distribution of the predicted probability of being assigned to the intensive
protocol prior to weighting (a) and after weighting (b).
Chapter 3: TIME DEPENDENT ASSOCIATIONS BETWEEN MATERNAL CAFFEINE
CONSUMPTION AND INTRAUTERINE GROWTH RESTRICTION
Abstract
Understanding the association between maternal caffeine consumption and intrauterine
growth restriction (IUGR) is of high public health importance given the prevalence of
caffeinated beverage consumption during pregnancy and the serious risks associated with
IUGR. We quantify timing-dependent associations between maternal
caffeine consumption and measures of fetal growth and evaluate for effect measure
modification by acetaminophen within a cohort of 2277 full term singleton pregnancies
recruited from 1996‐2000 in Connecticut and Massachusetts. Caffeine measures were
assessed in the first trimester, second trimester (week 20), early third trimester (week
28), and mid third trimester (week 36). Assessed fetal growth measures included IUGR
and birth weight. Associations between caffeine and measures of fetal growth were
quantified using inverse probability of treatment weighted logistic regression. Effect
measure modification on the additive scale was quantified using the relative excess risk
due to interaction. We observed significantly increased odds of IUGR for caffeine
exposures occurring in the first trimester (OR per 100 mg=1.15, 95% CI: 1.04-1.28) and
mid-third trimester (OR per 100 mg=1.16, 95% CI: 1.00-1.35). We did not observe a
statistically significant departure from additivity due to acetaminophen use. In
conclusion, we observed increased risks associated with even moderate levels of
caffeine consumption (60 mg-170 mg) for exposures occurring in the first trimester and
mid-third trimester. Subsequent analyses should assess whether decreasing caffeine
intake is associated with improved reproductive outcomes.
Introduction:
Intrauterine growth restriction (IUGR) is an etiologically diverse adverse outcome
attributed to fetal, placental, and maternal characteristics [65]. Neonates identified as
growth restricted have an increased risk of perinatal mortality and morbidity. More
recently, IUGR has been identified as a risk factor for later life conditions such as
hypertension, diabetes, and impaired cognitive functioning. Caffeine, a fairly ubiquitous
exposure during pregnancy, may contribute to IUGR through inhibition of trophoblast mRNA
during placental development [66] or through decreased uteroplacental and
fetoplacental blood flow during rapid fetal growth [67]. Understanding the association
between maternal caffeine consumption and IUGR is of high public health importance
given the prevalence of caffeinated beverage consumption during pregnancy and the
serious risks associated with IUGR.
Despite several prior investigations, the relationship between caffeine and IUGR
remains undetermined [40, 58, 68]. Assessing the association between caffeine and
pregnancy outcomes is complicated by changes in caffeine metabolism associated with
pregnancy progression, co‐dependency between pregnancy symptoms and caffeinated
beverage intake, and potential interactions with other chemical exposures. Prior
investigations have assessed for potential effect measure modification by smoking
status [58] and gender [69]. Acetaminophen, a common exposure in pregnancy [70],
has not been assessed as a potential effect modifier in the association between caffeine
and IUGR. Acetaminophen and caffeine are both metabolized by CYP1A2, and the
combination of caffeine and acetaminophen has been demonstrated to retard fetal
growth in experimental rodent studies [71, 72].
In the present paper, we quantify time dependent associations between caffeine
exposure and measures of fetal growth. Based on the proposed biological mechanisms,
we hypothesize associations between caffeine and IUGR will be most pronounced for
exposures during the first trimester (placentation) and mid‐third trimester (rapid fetal
growth). In addition, we evaluate time‐dependent interactions between caffeine and
acetaminophen, smoking, and gender.
Methods
The Health and Nutrition in Pregnancy Study enrolled 2478 women for the purpose of
assessing the association between dietary factors and adverse pregnancy outcomes
[58]. Women were recruited from 1996 through 2000 from 56 obstetrical practices and
15 clinics associated with six hospitals in Connecticut and Massachusetts. Eligibility
criteria included English speaking, gestational length less than 24 weeks at enrollment,
no prior history of diabetes, and no intent to terminate the pregnancy. Among
identified eligible women, all women consuming greater than 150mg of caffeine per day
(n=718) and a random sample of women consuming less than 150 mg of caffeine per
day (n=2915) were invited to participate. Of those invited to participate, 17.6% refused,
0.6% were no longer eligible, 2% miscarried prior to the first interview, and 11.7% could
not be contacted. The final cohort included 2478 pregnancies. This analysis was limited
to 2277 pregnancies after we further excluded those for whom fetal growth measures
could not be ascertained (70 miscarried, 6 stillbirths, 5 withdrew from study, 44 lost to
follow‐up), non‐singleton pregnancies (n=53), and births prior to the 36th week of
gestation (n=14). Non-singleton pregnancies were excluded due to known associations
between multiples and IUGR.
Women were assigned to one of three follow‐up protocols based on gestational age at
first interview, level of caffeine consumption, and randomization. Each of the follow‐up
protocols included a baseline survey and urine sample prior to 24 weeks gestation
(average of 14.4 weeks) and a postpartum survey. The majority of participants (80%)
were assigned to a follow‐up protocol involving one telephone follow‐up interview at
20, 28, or 36 weeks in addition to their baseline and postpartum assessments. Of the
remainder, 10% were assigned to a follow‐up protocol involving three telephone
interviews at 20, 28, and 36 weeks and an additional urinary sample at either 20, 28, or
36 weeks of gestation, and 10% were assigned to a follow‐up protocol involving three
in‐person interviews and three urinary samples at 20, 28, and 36 weeks.
Caffeine exposure was estimated based on self reported measures and urinary
metabolites. At their baseline interview, women were asked to recall their caffeine
intake prior to pregnancy and during their first three months of pregnancy. Women
were asked to report the frequency and quantity of consumed coffee, tea, and soda.
Model cup sizes were presented to participants to aid in their recall of serving size.
Detailed questions elicited coffee type (regular or instant), coffee preparation method,
coffee brand, tea type (hot or iced), tea preparation (steeping duration), and tea brands.
Among a 10% subsample of women, beverage samples were obtained and analyzed for
their caffeine content at two randomly selected visits. Average daily caffeine intake
(mg/day) was computed based on self reported caffeinated beverage consumption and
beverage caffeine content obtained from the beverage analysis. Additional interviews
were conducted in the second trimester (week 20), early third trimester (week 28), mid
third trimester (week 36), and postpartum to assess type, frequency, and serving size of
caffeinated beverage consumption over the previous week.
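The computation of average daily caffeine intake can be sketched in Python as a sum over reported beverages. The function name, field names, and units (ounces and mg per ounce) are hypothetical, and the numeric values are placeholders rather than the caffeine contents measured in the beverage analysis.

```python
def daily_caffeine_mg(beverages):
    """Sum of servings per day x serving size (oz) x caffeine concentration (mg/oz)."""
    return sum(
        b["servings_per_day"] * b["serving_oz"] * b["mg_per_oz"] for b in beverages
    )

# Placeholder values for illustration only.
report = [
    {"name": "coffee", "servings_per_day": 2, "serving_oz": 8, "mg_per_oz": 10},
    {"name": "soda", "servings_per_day": 1, "serving_oz": 12, "mg_per_oz": 3},
]
print(daily_caffeine_mg(report))  # 196
```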
Urinary concentrations of caffeine metabolites were assessed at baseline for all
participants and at weeks 20, 28, and 36 for subsamples of participant. Participants
were provided with urine containers and requested to collect urine between dinner and
bedtime on the night prior to the interview. Samples were collected the following day
and were analyzed for urinary caffeine and paraxanthine by Labstat, Inc. (Kitchener,
Ontario, Canada). Caffeine metabolites have a relatively short half life (less than 6
hours) and less than 2% of caffeine is excreted through urine; consequently, caffeine
metabolites do not accurately reflect average weekly caffeine exposure [73]. Caffeine
metabolites were not directly evaluated as a risk factor in this study; however, they
were utilized as informative covariates for multiple imputation of missing self‐reported
caffeine.
Our operational measure for intrauterine growth restriction was defined as a birth
weight below the 10th percentile for gestational age, gender, and ethnicity. Birth
weights were recorded within 24 hours of birth. Gestational age was calculated from the
reported number of days since the first day of the last menstrual period. Where timing
of the last menstrual period could not be recalled, gestational age was calculated from
the physician's estimated date of delivery. Sonography estimates confirmed the
gestational ages of 61.2% of the women. We compared observed birth weights to the
distributions of birth weights within the entire US population using publicly available
US Natality data [74]. Gestational age, gender, and ethnicity specific birth weight
distributions were assessed using 1999 US Natality data representing all singleton US
births [74]. In addition to IUGR we assessed birth weight in grams as a continuous
outcome.
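As an illustration of this operational definition, the classification can be sketched in Python as a lookup of a stratum specific 10th percentile cutoff followed by a comparison. This is a minimal sketch and not the code used in the study (analyses were run in SAS); the function name, the dictionary layout, and the cutoff value shown are hypothetical placeholders and are not taken from the US Natality data.

```python
# Hypothetical reference table: 10th percentile birth weight (grams) keyed by
# (completed gestational weeks, infant sex, maternal ethnicity).
# The value below is a placeholder for illustration, NOT a US Natality value.
REFERENCE_P10_GRAMS = {
    (39, "female", "white"): 2900.0,
}


def classify_iugr(birth_weight_g, gest_weeks, sex, ethnicity,
                  reference=REFERENCE_P10_GRAMS):
    """Return True when birth weight falls strictly below the stratum
    specific 10th percentile, the operational definition of IUGR."""
    cutoff = reference[(gest_weeks, sex, ethnicity)]
    return birth_weight_g < cutoff


# Example calls against the placeholder cutoff.
print(classify_iugr(2500.0, 39, "female", "white"))
print(classify_iugr(3200.0, 39, "female", "white"))
```

In practice the reference dictionary would hold one cutoff per gestational age, gender, and ethnicity stratum derived from the natality distributions.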
Demographic, health history, prenatal health, and health behavior covariates were
assessed at the prenatal and postpartum interviews. Time‐fixed covariates included
maternal age at first interview, ethnicity, parity, education, height, pre‐pregnancy
weight, pre‐pregnancy BMI, and prior chronic disease. Time‐varying covariates included
smoking status, alcohol use, pregnancy complications (gestational hypertension),
and medication use (acetaminophen and NSAIDs). Smoking, medication use, and
alcohol use were assessed at each of the interviews. Subjects were asked to provide the
number of cigarettes smoked per day over the previous week. For medication use,
individuals provided brand names for medications taken over the previous week. We
created indicators for medications within the class of non‐steroidal anti‐inflammatory
drugs and for acetaminophen. Alcohol use was assessed by asking respondents to
report the number of servings of wine, beer, or mixed drinks consumed over the
previous week. Candidate confounders were identified via review of existing literature
and directed acyclic graphs. The directed acyclic graph representing the association
between caffeine and IUGR highlights confounders that may be appropriately adjusted
for in a multivariable analysis and those where adjustment would lead to bias.
Analyses were performed using SAS version 9.2 (The SAS Institute, Cary, NC). As
described above, this study design deliberately introduces missing data through the
assigned follow‐up protocols. Previous work has demonstrated that missing data
designs, both in general and in the specific design implemented in the Health and Nutrition in
Pregnancy Study, do not contribute bias to the observed associations (Chapter 2 Ref).
We use multiple imputation methods to impute designed and non‐designed missing
data simultaneously [46, 54‐57]. Multiple imputation methods for missing data designs
have been presented as a preferred method due to ease of use in readily available
software and ability to perform sensitivity analyses where non‐designed missing data are
expected to be missing not at random (MNAR) [28]. Caffeine variables were transformed into
two components: a logit transformation of the binary indicator of caffeine consumption and a log
transformation of the continuous measure of caffeine (mg/day). This transformation was
identified via sensitivity analyses of our imputation method. The final imputed value of
caffeine was obtained by multiplying the imputed binary indicator by the exponential of
the imputed value for log caffeine. A Markov Chain Monte Carlo (MCMC) multiple
imputation was chosen to simultaneously impute missing values [54, 57]. The first step
of MCMC MI requires the calculation of prior parameters for multivariate normal means
and covariances of variables included in the imputation model. To accomplish this,
missing values were initially filled in using a vector of mean values and the covariance
matrix estimated from the EM algorithm for the observed data. Using the parameters
from the prior model, missing data were updated (imputed) by sampling a value from
the predictive distribution. Posterior parameters for multivariate normal predictive
means and covariance matrix were then estimated using the observed and imputed
values for all variables in the imputation model. This process was repeated until the
vector of means and the covariance matrix were unchanged between consecutive
iterations. The full procedure was repeated to produce 10 complete datasets.
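The two component handling of caffeine described above can be sketched as a pair of helper functions: one that splits an observed intake into an indicator and a log amount before imputation, and one that recombines imputed components afterwards. This is a minimal sketch under stated assumptions, not the SAS code used in the study. The function names are hypothetical, and the sketch assumes the imputed indicator has already been mapped back to 0 or 1, a step the text does not describe.

```python
import math


def to_two_part(caffeine_mg_per_day):
    """Split an observed intake into (any-consumption indicator, log mg/day).
    The log component is undefined (None) for non-consumers."""
    if caffeine_mg_per_day > 0:
        return 1, math.log(caffeine_mg_per_day)
    return 0, None


def from_two_part(imputed_indicator, imputed_log_mg):
    """Recombine the components: indicator times exp(log caffeine).
    Assumes the imputed indicator is already 0 or 1 (an assumption)."""
    return imputed_indicator * math.exp(imputed_log_mg)


# Round trip for a consumer, and a non-consumer whose imputed log value
# is ignored because the indicator is zero.
indicator, log_mg = to_two_part(150.0)
print(from_two_part(indicator, log_mg))
print(from_two_part(0, 4.2))
```

Multiplying by the indicator forces imputed non-consumers to exactly 0 mg/day, which a single log transformed variable could not represent.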
Caffeine was analyzed as a continuous and as a categorical variable. The categorical
indicators of caffeine exposure were created by identifying relevant cutpoints in
proximity to the 25th, 50th, 75th and 90th percentiles of reported caffeine consumption
(mg/day) at the baseline interview. The distributions of demographic, behavioral, and
reproductive characteristics were examined by quantiles of caffeine consumption.
Associations between caffeine and IUGR were quantified using inverse probability
weighted logistic regression [35, 75, 76]. For our analysis of categorical caffeine
exposure, we obtained predicted probabilities for an individual’s observed quantile of
caffeine exposure using multinomial logistic regression. Stabilized weights were
calculated by dividing the predicted probabilities obtained from an uninformed model
by the predicted probabilities obtained from a model with time‐fixed and time‐
dependent covariates. Weight models were evaluated using the Hosmer‐Lemeshow
Goodness‐of‐Fit Test [77]. For continuous exposures, weights were based on the
inverse of the normal density function identified using linear regression with time‐fixed
and time‐dependent covariates. Confidence intervals were calculated using
bootstrapped standard errors. We present the fraction of missing information (λ) as a
measure of relative precision of our missing data design versus a complete
ascertainment design of the same sample size. As the fraction of missing information
approaches 0, the precision obtained from our design with planned missing data
approaches the precision of a complete ascertainment design of the same sample size.
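The stabilized weight calculation for categorical exposure can be sketched as follows. The sketch takes as input predicted probabilities that have already been obtained from the uninformed model and from the covariate model; it does not fit either model. The function name is hypothetical, and multiplying the ratios across exposure assessments is an assumption that follows the usual marginal structural model convention, since the text does not state how time point specific ratios were combined.

```python
from math import prod


def stabilized_weight(p_uninformed, p_covariate):
    """Stabilized inverse probability weight for one participant.

    p_uninformed[t]: predicted probability of the observed caffeine
        category at assessment t from a model without covariates.
    p_covariate[t]: predicted probability of the same category from a
        model with time-fixed and time-dependent covariates.
    The per-assessment ratios are multiplied together (an assumption).
    """
    if len(p_uninformed) != len(p_covariate):
        raise ValueError("one probability per assessment is required")
    return prod(u / c for u, c in zip(p_uninformed, p_covariate))


# Two assessments: ratios of 0.5 and 2.0 multiply to a weight of 1.0.
print(stabilized_weight([0.25, 0.25], [0.5, 0.125]))
```

A participant whose observed exposure is less likely given her covariates than in the population overall receives a weight above 1, and vice versa.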
We evaluate time‐dependent effect measure modification by acetaminophen, smoking,
and gender of the association between caffeine and IUGR using a joint effects
approach. The joint effects approach enables assessment of additive effect measure
modification from odds ratios. Joint effect analysis weights were calculated using
multinomial logistic regression as described for the caffeine main effect models. The
magnitude and direction of departures from additivity were quantified using the
Relative Excess Risk due to Interaction (RERI) [78, 79]. Confidence intervals were calculated for the
RERI using a Taylor Series expansion to estimate the standard error[79]. For effect
modification on the association between caffeine and birth weight, we perform a linear
regression analysis with an interaction term. We report the average change in birth
weight per 100mg of caffeine by stratum of timing specific acetaminophen use, smoking
status, and gender.
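The RERI point estimate can be illustrated with a short calculation from the three joint effect odds ratios, each expressed relative to the doubly unexposed group. The sketch below uses the standard formula RERI = OR11 - OR10 - OR01 + 1 and checks it against two rows of the acetaminophen panel of Table 3.3; the function and argument names are hypothetical, and the Taylor series standard error is not reproduced here.

```python
def reri(or_caffeine_only, or_modifier_only, or_joint):
    """Relative excess risk due to interaction from joint-effect odds ratios.

    RERI = OR11 - OR10 - OR01 + 1, with every odds ratio relative to the
    group exposed to neither caffeine nor the modifier. A value of 0
    indicates exact additivity; a positive value indicates a
    greater-than-additive joint effect.
    """
    return or_joint - or_caffeine_only - or_modifier_only + 1.0


# Third trimester, > 60 mg caffeine, acetaminophen panel of Table 3.3.
print(round(reri(1.56, 1.19, 3.07), 2))  # 1.32

# First trimester, > 60 mg caffeine, acetaminophen panel of Table 3.3.
print(round(reri(1.10, 0.76, 1.19), 2))  # 0.33
```

Both values agree with the RERIs reported in Table 3.3.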
RESULTS
Within our study population, caffeine consumption was associated with demographic
characteristics, health behaviors, and health history (Table 3.1). As compared to women
not drinking caffeinated beverages, high caffeine consumers tended to be
younger, not married, Hispanic, and of lower education. Caffeine consumers were more
likely to exhibit risk behaviors such as first trimester smoking and drinking.
To assess the time dependent effects of caffeine, timing specific caffeine measures
were simultaneously included in our analytic models. We assessed for and ruled out
multicollinearity based on the variance inflation factor, using a threshold of 2 for
detection of collinearity. In our age adjusted models, we observed significantly
increased odds of IUGR in the first and third trimester among those consuming between
60 and 170 mg of caffeine per day and those consuming greater than 170 mg of caffeine
per day (Table 3.2). Measures of association from our fully adjusted models were
attenuated. In our analysis of quantiles of caffeine consumption, the association
between first trimester caffeine and IUGR remained significant after adjusting for age,
parity, education, BMI, prior chronic disease, smoking (trimester 1, weeks 20, 28, and
36), hypertension (weeks 20, 28, and 36), and alcohol (trimester 1, weeks 20, 28, and 36).
It is important to note that the fraction of missing information was substantially smaller
for first trimester exposures (λ=0.00‐0.02) as compared to second and third trimester
exposures (λ=0.05‐0.56). Consumption of greater than 170mg of caffeine per day was
associated with a 2‐fold increased odds of IUGR (OR=2.12, 95%CI: 1.13‐4.01). Assessing first
trimester exposures as a continuous measure indicated that for each 100mg increase in
caffeine consumption, the odds of IUGR increased by 15% (OR=1.15, 95%CI: 1.04‐1.28).
We did not observe any significant associations or trends for caffeine exposure in the
second trimester. The magnitude and pattern of associations observed for exposures in
weeks 28 and 36 were suggestive of a potentially clinically relevant though not
statistically significant association. The quantile analysis for week 36 exposures was
suggestive of increased odds associated with even low levels of caffeine consumption
relative to those not consuming any caffeine. For each 100mg increase in third
trimester caffeine consumption, odds of IUGR increased by 16% (OR=1.16, 95%CI: 1.00‐1.35).
Effect measure modification on the additive scale was assessed using joint effects
marginal structural models (Table 3.3). We failed to identify any statistically significant
departures from additivity; however, measures of association between caffeine and
IUGR tended to be larger among acetaminophen users, smokers, and males. For the
assessment of additive effect modification by acetaminophen for third trimester
caffeine exposures, the observed RERIs were indicative of a potential synergistic
relationship (RERI=1.32, 95%CI: ‐1.02‐3.65).
Associations between caffeine and birth weight were quantified using multivariable
adjusted linear regression. In both the first and third trimesters, we observed a non‐
significant birth weight reduction of approximately 25g per 100mg increase in caffeine.
The association between caffeine and birth weight was not significantly modified by
acetaminophen, smoking, or gender. Though not statistically significant, the magnitude
of the difference in the association between caffeine and birth weight among
acetaminophen users (Δ=‐102.71, 95%CI: ‐205.54‐0.12) and non‐users (Δ=‐21.29,
95%CI: ‐80.73‐38.16) is notable.
Discussion
Consistent with the hypothesized biological mechanism, we observed the strongest
associations between caffeine exposure and fetal growth for exposures occurring in the
first and third trimesters. Our analyses identified an association between first trimester
caffeine consumption and fetal growth independent of caffeine consumption later in pregnancy,
suggesting the potential for placental etiology. We also observed an association
between third trimester caffeine consumption and fetal growth independent of earlier
caffeine exposures suggesting that caffeine may also act through reduced blood flow
during rapid fetal growth (week 36). We did not observe significant associations for
caffeine exposures occurring after placentation but prior to peak fetal growth (weeks 20
and 28). Our analysis of effect measure modification did not identify significant
modification by acetaminophen use and failed to confirm previous reports of effect
measure modification by smoking and gender.
A recent committee opinion from the American College of Obstetricians and
Gynecologists concluded that there was insufficient evidence that caffeine increases the
risk of IUGR[80]. Studies assessing caffeine intake greater than 300mg per day have
generally observed increased risks of IUGR and reduced birth weight [81‐84]. A large
prospective cohort study in the Netherlands observed significantly increased risks of small
for gestational age among those consuming greater than 2 servings of caffeinated
beverages per day. Deviations in fetal weight and crown‐rump length as measured by
ultrasound were observed as early as the first trimester for those consuming greater
than 6 servings of coffee per day [84]. Studies of more moderate levels of caffeine
consumption vary in their conclusions [58, 85, 86]. Our finding of a significantly increased
risk of IUGR is consistent with recent findings from a comparably sized prospective
study [85]. The CARE study group assessed caffeine exposures in each trimester in a
cohort of 2635 pregnant women. While they observed significantly increased risks of
IUGR for caffeine exposures in each trimester, the strongest associations were observed
for third trimester exposures. In the only randomized controlled trial of caffeine intake
during pregnancy, no association was observed between caffeine and birth weight[87].
Women with pre‐pregnancy caffeine consumption of greater than 3 cups per day were
randomized to consume caffeinated or decaffeinated coffee during their second and
third trimesters. Infants born to women randomized to the decaffeinated group were on average
16 grams (95%CI: ‐40‐70) heavier than those born to women consuming caffeinated coffee. This trial
did not assess the impact of caffeine during the first trimester nor did it assess
compliance with the study protocol after randomization. Observational studies
assessing caffeinated and decaffeinated coffee consumption have concluded that only
caffeinated coffee is associated with measures of fetal growth.
In a previous publication from the Nutrition in Pregnancy study, caffeine was
significantly associated with reduced birth weight; however, neither first trimester nor
third trimester caffeine exposure was significantly associated with IUGR [58]. The
disparity in the magnitude and significance of the odds ratios observed in our analysis
relative to the prior publication may be explained by differences in the source of
exposure data and in the analytic approach. The earlier publication relied on
postpartum recalled third trimester caffeine exposure while we relied on the
prospective measures of caffeine exposure. In relying on the prospective measures, our
analysis may be less susceptible to recall bias but requires stronger missing data
assumptions due to the designed missing data for interview data at weeks 20, 28, and 36.
Analytically, we estimate the marginal effect rather than the conditional effect. Recent
methodological work has demonstrated biases associated with estimation of the
conditional associations when relying on multivariable adjusted logistic regression due
to non‐collapsibility of the odds ratio [68, 88]. Furthermore, multivariable adjustment
for time‐varying variables (such as gestational hypertension and smoking) can lead to a
biased estimation of the measure of effect where the variable acts as both a confounder
and mediator [35, 75]. For example, caffeine consumption may be associated with IUGR
through its association with hypertension [89] or gestational diabetes [90], in which case
these conditions serve as mediators; however, being diagnosed with hypertension or gestational
diabetes may alter subsequent caffeine consumption, in which case they serve as confounders. While still prone to
bias due to unmeasured confounding, inverse probability weighting is not susceptible to
bias due to non‐collapsibility of the odds ratio.
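The non‐collapsibility referred to above can be seen in a small numerical example. The sketch below uses hypothetical risks, chosen only for illustration and not drawn from this study, in two equally sized strata of a covariate that is unrelated to exposure (so there is no confounding). The odds ratio is 4 within each stratum, yet the odds ratio computed from the pooled risks is about 3.45.

```python
def odds_ratio(risk_exposed, risk_unexposed):
    """Odds ratio comparing two risks (probabilities of the outcome)."""
    odds_exposed = risk_exposed / (1.0 - risk_exposed)
    odds_unexposed = risk_unexposed / (1.0 - risk_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical (unexposed risk, exposed risk) in two equally sized strata.
# Exposure prevalence is assumed equal across strata, so the stratifying
# covariate is not a confounder.
strata = [(0.2, 0.5), (0.5, 0.8)]

# Conditional (stratum-specific) odds ratios.
conditional_ors = [odds_ratio(p1, p0) for p0, p1 in strata]

# Marginal risks are simple averages because the strata are equal in size.
marginal_p0 = sum(p0 for p0, _ in strata) / len(strata)
marginal_p1 = sum(p1 for _, p1 in strata) / len(strata)
marginal_or = odds_ratio(marginal_p1, marginal_p0)

print(conditional_ors)
print(marginal_or)
```

The marginal and conditional odds ratios differ even though nothing is confounded, which is why the choice between a marginal and a conditional estimand matters when comparing results across analyses.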
We assess for effect measure modification using the RERI in models relying on logistic
regression as opposed to including a cross‐product term because prior work has
concluded that additive effect measure modification is a better assessment of biological
interaction. The cross‐product terms in our linear regression models directly estimate
additive effect measure modification. The RERI can be interpreted as the excess risk
due to interaction relative to the risk without exposure. We did not observe any
significant interactions between acetaminophen, smoking, or gender and caffeine
exposures; however, our findings were suggestive of a potential synergistic relationship
between acetaminophen and caffeine. Given the magnitude of the difference between
regression parameters for birth weight by acetaminophen status, this potential
relationship warrants further exploration in future research. Reliance on odds
ratios for calculation of the RERI may produce biased estimates of departures from
additivity [91]; however, this is unlikely to be a problem in this investigation given the
prevalence of the outcome (8%) and the magnitude of the observed RERIs.
The NIP study was specifically designed to assess reproductive outcomes associated
with maternal caffeine consumption. For this reason, our caffeine assessments were
designed to reduce recall bias and to prospectively measure caffeine at four potentially
relevant time points. Though urinary concentrations of caffeine metabolites were
available, we elected to rely primarily on the self reported measures because urinary
caffeine metabolites only reflect immediate caffeine exposures and would not provide
an accurate assessment of typical daily or weekly caffeine consumption. With a half life
of under 6 hours and only 0.5%‐2% being excreted through urine, urinary caffeine
metabolites may be a good indicator of recent caffeine consumption but would not
reflect typical weekly consumption. Though not assessed as a primary exposure,
paraxanthine, the primary metabolite of caffeine, was used to aid in the imputation of
missing self‐reported caffeine values.
The design of this study intentionally omitted collection of caffeine exposure at some
time points for some individuals. The missing data present in this study was a tool to
reduce subject burden and improve statistical efficiency. Prior publications have
demonstrated the validity and efficiency of studies with designed missing data. To be
validly implemented, the multiple imputation methods employed in this study assume
that the missing data is missing at random and that the imputation model is correctly
specified. Though we can ensure that the designed missing data is missing at random,
we cannot rule out the possibility that non‐designed missingness due to noncompliance
or loss to follow‐up introduced missing data that was not at random. To validate our
imputation model and to assess the susceptibility of the imputation model to
missingness not at random, we artificially introduced missing data into the subset of our
population with complete exposure data. We then applied our imputation model to this
subset and compared parameters (mean caffeine and OR between caffeine and IUGR)
between the complete data subset without artificial missing data and the complete data
subset with imputed values for artificial missing data. Artificial missing data was
introduced in two scenarios: 1) where missing data was completely at random and 2)
where missing data was dependent on the caffeine consumption and IUGR. These
analyses confirmed that our imputation model was correctly specified and robust to the
assessed scenario of missingness not at random. The missing data design performed
well in our assessment of main effects; however, it may have impeded our ability to
identify significant departures from additivity in our joint effects models.
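The two artificial missingness scenarios used in this validation can be sketched as masking functions applied to the complete data subset. This is a minimal sketch under stated assumptions, not the procedure actually coded for the study: the function names are hypothetical, and the logistic form and coefficient values used for the dependent scenario are illustrative placeholders, since the text does not report the missingness model.

```python
import math
import random


def mask_completely_at_random(values, p_missing, rng):
    """Scenario 1: each value is set to missing (None) with the same
    probability, regardless of any participant characteristic."""
    return [None if rng.random() < p_missing else v for v in values]


def mask_dependent(caffeine, iugr, rng,
                   intercept=-2.0, b_caffeine=0.005, b_iugr=1.0):
    """Scenario 2: the probability of missingness depends on the caffeine
    value itself and on IUGR status through a logistic function.
    The coefficients are hypothetical placeholders."""
    masked = []
    for mg, outcome in zip(caffeine, iugr):
        linear = intercept + b_caffeine * mg + b_iugr * outcome
        p_missing = 1.0 / (1.0 + math.exp(-linear))
        masked.append(None if rng.random() < p_missing else mg)
    return masked


rng = random.Random(2011)  # fixed seed so the masking is reproducible
caffeine = [0.0, 40.0, 120.0, 300.0]
iugr = [0, 0, 1, 1]
print(mask_completely_at_random(caffeine, 0.25, rng))
print(mask_dependent(caffeine, iugr, rng))
```

Imputing the masked values and comparing the resulting mean caffeine and odds ratio with those from the unmasked subset gives the comparison described in the text.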
Our findings, in conjunction with biological plausibility and consistency with prior
epidemiological investigations, provide additional evidence that even moderate caffeine
exposure during pregnancy can result in a clinically significant reduction in fetal growth.
Though it may seem prudent to recommend a reduction in caffeine consumption during
pregnancy, it is important to note that there may be adverse consequences associated
with discontinuation of caffeine. To further resolve this question, future observational
studies should assess temporal patterns of caffeine consumption in addition to timing
specific quantities of caffeine consumption.
Table 3.1: Distribution of baseline characteristics by levels of first trimester caffeine consumption, Health and Nutrition in Pregnancy Study, 1996-2001

                        First Trimester Caffeine
                        0 mg       0.1-8 mg   8.1-60 mg  60-170 mg  > 170 mg
                        n=545      n=592      n=573      n=340      n=227
Age
  <24                   14.7       22.1       22.9       23.7       39.0
  25-29                 26.5       29.7       25.1       20.5       21.9
  30-34                 39.4       32.0       33.8       33.7       22.1
  >35                   19.4       16.2       18.3       22.1       17.0
Marital status
  Married               81.9       73.4       70.5       62.6       40.2
Ethnicity
  White                 75.1       74.2       70.9       66.5       47.8
  Black                 9.2        7.4        8.4        6.8        8.1
  Hispanic              12.5       15.3       18.7       24.2       43.2
  Other                 3.1        3.1        2.0        2.6        0.9
Education
  <11                   6.7        10.7       13.5       16.8       35.8
  12                    14.5       15.9       17.1       24.4       25.0
  13-15                 19.7       22.0       26.7       23.6       24.0
  16                    31.6       29.1       23.7       20.5       9.8
  >17                   27.5       22.3       19.0       14.8       5.4
BMI
  < 20                  17.0       16.3       14.5       12.6       14.6
  20-25                 53.0       52.8       48.9       49.4       46.7
  25-30                 20.5       21.2       23.6       20.9       21.7
  >30                   9.5        9.8        13.0       17.1       17.0
Table 3.1 Continued: Distribution of baseline characteristics by levels of first trimester caffeine consumption, Health and Nutrition in Pregnancy Study, 1996-2001

                                  First Trimester Caffeine
                                  0 mg       0.1-8 mg   8.1-60 mg  60-170 mg  > 170 mg
                                  n=545      n=592      n=573      n=340      n=227
1st trimester smoking (Cigs/day)
  0                               95.0       89.3       86.0       76.4       50.6
  1-5                             3.8        7.4        8.1        10.8       18.8
  6-10                            0.7        2.3        4.1        7.3        12.0
  >10                             0.6        1.1        1.8        5.6        18.6
1st trimester alcohol
  Yes                             23.9       33.8       37.0       39.2       35.0
  No                              76.2       66.2       63.0       60.8       65.1
Pre-pregnancy health
  Chronic disease                 8.3        8.4        9.5        10.8       17.2
  Emotional problems              5.6        6.8        4.7        9.6        13.6
Parity
  0                               49.3       47.9       44.7       35.0       27.0
  1                               35.0       38.1       34.9       36.0       35.9
  >1                              15.7       14.0       20.5       29.0       37.1
Prior pregnancy morbidity
  Pregnancy hypertension          3.5        3.5        4.6        7.0        4.6
  Preterm labor                   1.6        4.6        3.8        4.3        6.6
  Gestational diabetes            2.5        0.4        2.8        2.3        2.6
Table 3.2: Association between caffeine consumption and intrauterine growth retardation among full term live births, Health and Nutrition in Pregnancy Study, 1996-2001

                                                IPTW-Age (a)         IPTW-Full (b)
                        Births  IUGR    λ       OR (95%CI)           OR (95%CI)
Reported Caffeine
Trimester 1
  0 mg                  545     30              1.00 (--)            1.00 (--)
  0-8 mg                592     48      0.01    0.61 (0.36-1.04)     0.54 (0.30-1.00)
  8-60 mg               573     57      0.01    1.37 (0.90-2.09)     1.68 (0.87-3.21)
  60-170 mg             340     29      0.00    2.78 (1.40-5.55)     1.44 (0.50-4.17)
  >170 mg               227     27      0.02    1.54 (1.04-2.30)     2.12 (1.13-4.01)
Week 20
  0 mg                  592     44              1.00 (--)            1.00 (--)
  0-8 mg                669     72      0.27    0.81 (0.46-1.42)     0.66 (0.35-1.23)
  8-60 mg               573     33      0.56    1.82 (0.78-4.23)     0.96 (0.24-3.78)
  60-170 mg             274     13      0.31    0.81 (0.47-1.39)     0.66 (0.21-2.07)
  >170 mg               169     29      0.27    1.13 (0.58-2.17)     0.81 (0.21-3.15)
Week 28
  0 mg                  537     42              1.00 (--)            1.00 (--)
  0-8 mg                632     55      0.06    1.36 (0.42-4.36)     0.34 (0.05-2.58)
  8-60 mg               645     43      0.11    1.55 (0.93-2.60)     1.28 (0.70-2.34)
  60-170 mg             299     31      0.12    1.80 (1.08-2.99)     1.34 (0.79-2.26)
  >170 mg               164     20      0.17    3.54 (1.31-9.57)     1.86 (0.66-5.29)
Week 36
  0 mg                  548     22              1.00 (--)            1.00 (--)
  0-8 mg                636     53      0.18    1.54 (0.78-3.03)     1.45 (0.46-4.59)
  8-60 mg               680     62      0.19    1.82 (0.83-4.00)     1.65 (0.52-5.21)
  60-170 mg             272     27      0.14    1.96 (0.90-4.25)     1.67 (0.52-5.29)
  >170 mg               141     27      0.05    3.47 (1.41-8.59)     2.24 (0.60-8.41)
Continuous (per 100mg)  2277    191
  Trimester 1                           0.19    1.17 (1.06-1.30)     1.15 (1.04-1.28)
  Week 20                               0.51    1.09 (0.95-1.24)     1.10 (0.95-1.27)
  Week 28                               0.79    1.08 (0.91-1.28)     1.11 (0.91-1.34)
  Week 36                               0.06    1.17 (1.01-1.34)     1.16 (1.00-1.35)

λ: Fraction of missing information: Ratio of between imputation variance to total variance
a: Adjusted for age and caffeine measures
b: Adjusted for age, caffeine measures, parity, education, bmi, prior chronic disease, first trimester smoking, smoking (week 20, 28 and 36), hypertension (week 20, 28 and 36), alcohol (trimester 1, weeks 20, 28, and 36)
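The fraction of missing information defined in the footnote above can be illustrated by pooling estimates across imputed datasets with Rubin's rules. This is a minimal sketch, not the SAS procedure used in the study; the function name and the input values are hypothetical, and the inclusion of the (1 + 1/m) finite imputation correction in the between imputation term is an assumption about how the ratio was computed.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool one parameter (e.g., a log odds ratio) across m imputed datasets.

    Returns (pooled estimate, total variance, fraction of missing
    information), where the fraction is the between-imputation variance,
    inflated by (1 + 1/m), divided by the total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(within_variances)      # average within-imputation variance
    between = variance(estimates)        # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    lam = (1.0 + 1.0 / m) * between / total
    return pooled, total, lam


# Hypothetical log odds ratios and variances from three imputed datasets.
pooled, total, lam = pool_rubin([0.1, 0.2, 0.3], [0.04, 0.04, 0.04])
print(pooled, total, lam)
```

When the estimates barely vary across imputations, as for the first trimester exposures, the between imputation term is small and λ approaches 0.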
Table 3.3: Associations between joint effects of self reported caffeine intake and potential effect measure modifiers and intrauterine growth retardation among full term live births, Health and Nutrition in Pregnancy Study, 1996-2001

                        MSM Joint Effects
                        No Acetaminophen      Acetaminophen
                        OR (95%CI)            OR (95%CI)            RERI (95%CI)
First Trimester (a)
  0                     1.00 (--)             0.76 (0.26‐2.17)
  0-8                   1.44 (0.43‐4.86)      0.96 (0.31‐2.97)      ‐0.25 (‐1.64‐1.15)
  8-60                  1.73 (0.75‐4.03)      1.42 (0.55‐3.72)      ‐0.07 (‐1.05‐0.92)
  > 60                  1.10 (0.46‐2.62)      1.19 (0.44‐3.19)      0.33 (‐0.40‐1.06)
Third Trimester (a)
  0                     1.00 (--)             1.19 (0.43‐3.33)
  0-8                   1.34 (0.45‐3.96)      1.93 (0.73‐5.06)      0.39 (‐1.06‐1.85)
  8-60                  1.62 (0.59‐4.44)      2.43 (1.04‐5.69)      0.62 (‐0.79‐2.03)
  > 60                  1.56 (0.62‐3.98)      3.07 (1.09‐8.68)      1.32 (‐1.02‐3.65)

                        Nonsmoker             Smoker
First Trimester (b)
  0                     1.00 (--)             1.79 (0.57‐5.57)
  0-8                   1.29 (0.67‐2.47)      ‐‐                    ‐‐
  8-60                  1.53 (1.01‐2.32)      3.66 (1.80‐7.47)      1.35 (‐1.81‐4.50)
  > 60                  1.03 (0.62‐1.69)      3.22 (1.99‐5.21)      1.41 (‐0.98‐3.80)
Third Trimester (b)
  0                     1.00 (--)             1.87 (0.82‐4.25)
  0-8                   0.89 (0.31‐2.56)      2.27 (0.11‐47.07)     0.51 (‐5.75‐6.77)
  8-60                  1.54 (0.76‐3.13)      1.97 (0.44‐8.80)      ‐0.44 (‐3.50‐2.61)
  > 60                  1.58 (0.85‐2.97)      3.48 (1.99‐6.08)      1.03 (‐1.33‐3.39)

                        Male                  Female
First Trimester (c)
  0                     1.00 (--)             1.01 (0.59‐1.71)
  0-8                   1.52 (0.66‐3.47)      0.83 (0.29‐2.40)      ‐0.69 (‐2.26‐0.88)
  8-60                  1.51 (0.86‐2.67)      1.84 (1.06‐3.20)      0.32 (‐0.75‐1.39)
  > 60                  2.07 (1.22‐3.51)      1.11 (0.61‐2.04)      ‐0.97 (‐2.21‐0.28)
Third Trimester (c)
  0                     1.00 (--)             0.85 (0.41-1.76)
  0-8                   0.92 (0.14-5.83)      0.85 (0.23-3.20)      0.08 (‐1.84‐2.01)
  8-60                  1.43 (0.62-3.30)      1.29 (0.60-2.75)      0.01 (‐1.40‐1.42)
  > 60                  2.02 (1.08-3.78)      1.50 (0.78-2.87)      ‐0.37 (‐1.88‐1.13)

Weights from multinomial logistic regression with predictors age, parity, education, and
a: smoking (trimester 1, week 20, 28 and 36)
b: alcohol (trimester 1, weeks 20, 28, and 36)
c: smoking (trimester 1, week 20, 28 and 36)
Table 3.4: Association between caffeine consumption and birth weight among full term live births, Health and Nutrition in Pregnancy Study, 1996-2001

                        Trimester 1                            Trimester 3
                        Δ (95%CI)                P Interaction  Δ (95%CI)                  P Interaction
Self Reported Caffeine
  All                   -24.41 (-51.75-2.92)                    -25.27 (-79.29-28.76)
Acetaminophen
  No                    -28.12 (-76.09-19.85)    0.87           -21.29 (-80.73-38.16)      0.15
  Yes                   -23.88 (-54.82-7.06)                    -102.71 (-205.54-0.12)
Smoking
  No                    -20.15 (-61.88-21.57)    0.75           -27.45 (-107.3-52.39)      0.94
  Yes                   -27.47 (-57.92-2.98)                    -24.97 (-73.73-23.79)
Gender
  Male                  -23.31 (-52.64-6.03)     0.90           -28.52 (-107.00-49.96)     0.74
  Female                -25.77 (-63.34-11.8)                    -18.90 (-67.47-29.67)

Δ: Change in birth weight (grams) per 100mg increase in caffeine modeled using multivariable linear regression adjusted for age, parity, education, bmi, prior chronic disease, first trimester smoking, smoking (week 20, 28 and 36), and hypertension (week 20, 28 and 36)
Figure 1: Directed Acyclic Graph for confounders of the association between caffeine and fetal growth
[Figure: directed acyclic graph with nodes CaffT1, CaffT2, and IUGR, one Time Fixed confounder node, and three Time Dependent confounder nodes]
Time‐Fixed Confounders: Maternal age, ethnicity, parity, education, height, weight, pre‐pregnancy BMI, prior chronic disease.
Time‐Dependent Confounders: Smoking status, alcohol use, gestational hypertension, and medication use (Acetaminophen and NSAIDs).
General Discussion
Through this series of papers, we have demonstrated the need to obtain and
analyze repeat exposure measures of time‐varying exposures, introduced a bias
correction method where exposure timing cannot be ascertained, introduced
design solutions to enable repeat exposure measures, and quantified the
association between timing specific caffeine consumption and fetal growth.
We extend the concepts of time‐dependent bias substantially from what has
been previously addressed in the methodological literature through our
assessment of transient exposures, average exposures, and jointly modeled
binary exposures. We have demonstrated that time‐dependent bias has the
potential to substantially impact the validity of analyses when exposure timing is
ignored in perinatal epidemiology. We demonstrate the utility of imputed
exposure event times to obtain unbiased effect estimates where true exposure
timing cannot be feasibly ascertained. This novel solution needs further
development to assess performance in preventing bias for continuous
exposures.
The use of time varying methods in perinatal epidemiology has been limited
because event times are often unknown and outcomes only become recognized
at birth (e.g. malformations, growth restriction). Though time‐varying methods
are more appropriate than time‐fixed methods, the validity of these models is
dependent on our knowledge of when the event actually occurred. Additional
methodological work should address the biases associated with various analytic
approaches to assessing outcomes with unknown event times in perinatal
epidemiology.
In our application of the bias correction method, we reveal how previous
analyses of adverse reproductive outcomes associated with delayed prenatal
care would be susceptible to time‐dependent bias. In fact, previous studies have
reported that delayed prenatal care is associated with a lower risk of adverse
outcomes such as low birth weight and preterm birth[18] or that receiving any
prenatal care is unrealistically protective against a range of outcomes[47]. Given
this example and the potential consequences drawn from these findings, we feel
that careful review of existing literature relying on time‐fixed methods for time‐
varying exposures in pregnancy may be warranted.
Our design solution to rely on intentional missing data for prospective
assessment of exposure offers a novel alternative to improve efficiency and
reduce subject burden. In recognizing that traditional complete ascertainment
designs utilize designed missing data through the sampling process, we have
attempted to frame alternative missing data designs as a redistribution of
missing data to improve efficiency and reduce subject burden. Through our
assessment of the performance of missing data designs, we observed that the
efficiency of the designs is subject to the distribution of designed and non‐
designed missing data and the joint distributions of the exposure measures. Cost
parameters also have a large impact on the performance of missing data designs
relative to complete ascertainment designs. In addition to potential efficiency
advantages, missing data designs may increase validity by increasing sample size
and reducing selection bias at enrollment and due to loss to follow‐up.
Areas of future research within the methodological area of designed missing
data for prospective exposure assessment include development of methods for
power and sample size calculations and development of methods for optimizing
the distribution of designed missing data. Previous work addressing power and
missing data optimization for missing data designs does not allow for time varying
exposures or non‐designed missing data [30, 64]. It was not within the scope of
this study to develop methods for power and sample size calculations; however,
through data simulation, investigators can identify the relative efficiency for
candidate study designs by specifying cost parameters and exposure/outcome
distributions. Though potentially burdensome, this would be a valuable exercise
given the potential for increased efficiency, reduced bias, and reduced
non‐compliance and loss to follow‐up.
In our third chapter, quantifying time‐dependent associations between caffeine
and fetal growth, we utilized multiple imputation for designed and non‐designed
missing data. To be validly implemented, the multiple imputation methods
employed in this study assume that the missing data are missing at random and
that the imputation model is correctly specified. Given the extent of the
designed missingness within the NIP study design, we took care to validate our
imputation model and to assess its susceptibility to data missing not at random.
To do so, we artificially introduced missing data into the subset of our population with
complete exposure data. We then applied our imputation model to this subset
and compared parameters (mean caffeine and the OR between caffeine and IUGR)
between the complete data subset without artificial missing data and the
complete data subset with imputed values for the artificial missing data. Artificial
missing data were introduced in two scenarios: 1) where data were missing
completely at random and 2) where missingness was dependent on caffeine
consumption and IUGR. The process of validating the imputation model is
important in any context, but is critical when imputing the primary exposure of
interest.
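
The validation procedure described above can be sketched on synthetic data. The sketch below is an illustration under stated assumptions, not the imputation model used in the NIP analysis: the data are simulated, the imputation is a simple stochastic regression of later caffeine on earlier caffeine fitted within IUGR strata (it does not draw the regression parameters, so it is not a fully proper multiple imputation), and only mean caffeine is compared. All numeric values and function names are hypothetical.

```python
import math
import random
import statistics


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_complete_data(n, rng):
    """Synthetic 'complete exposure' subset: (early caffeine, later caffeine, IUGR)."""
    rows = []
    for _ in range(n):
        early = rng.gauss(150.0, 80.0)              # hypothetical mg/day
        later = 0.8 * early + rng.gauss(0.0, 40.0)  # correlated later measure
        iugr = 1 if rng.random() < logistic(-2.5 + 0.004 * later) else 0
        rows.append((early, later, iugr))
    return rows


def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual SD for y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, math.sqrt(sse / (len(xs) - 2))


def imputed_mean(rows, missing, rng, n_imputations=20):
    """Mean later caffeine after stochastic regression imputation within IUGR strata."""
    fits = {}
    for group in (0, 1):
        obs = [(early, later) for (early, later, iugr), miss in zip(rows, missing)
               if iugr == group and not miss]
        fits[group] = fit_line([e for e, _ in obs], [v for _, v in obs])
    means = []
    for _ in range(n_imputations):
        filled = []
        for (early, later, iugr), miss in zip(rows, missing):
            if miss:
                a, b, sd = fits[iugr]
                filled.append(a + b * early + rng.gauss(0.0, sd))
            else:
                filled.append(later)
        means.append(statistics.fmean(filled))
    return statistics.fmean(means)


def validate(n=2000, seed=7):
    """Bias in mean later caffeine after imputing artificially deleted values."""
    rng = random.Random(seed)
    rows = make_complete_data(n, rng)
    true_mean = statistics.fmean(later for _, later, _ in rows)
    # Scenario 1: missing completely at random.
    mcar = [rng.random() < 0.4 for _ in rows]
    # Scenario 2: missingness depends on the caffeine value itself and on IUGR.
    dependent = [rng.random() < logistic(-0.5 + 0.02 * (later - 120.0) + 1.0 * iugr)
                 for _, later, iugr in rows]
    return {
        "mcar_bias": imputed_mean(rows, mcar, rng) - true_mean,
        "dependent_bias": imputed_mean(rows, dependent, rng) - true_mean,
    }
```

Calling `validate()` returns the difference between the imputed and complete‐data means under each scenario. A difference near zero under the first scenario and a larger difference under the second would suggest, in this synthetic setting, how sensitive the imputation model is to departures from the missing at random assumption. The same comparison could be extended to the odds ratio between caffeine and IUGR.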
In perinatal epidemiology, accurate characterization of exposed person‐time and
ability to detect time‐dependent effects are both dependent on the validity of
estimated length of gestation. Outside the context of in vitro fertilization, the
precise date of conception is unknown; therefore, the estimated date of the ovulation
that resulted in fertilization is used as a proxy for the start of pregnancy.
Previous studies have documented factors associated with misclassification of
gestational age; however, the potential for differential misclassification of
gestational age‐specific exposures has not been addressed. Misclassification of
gestational age based on LMP is well documented [92]. LMP‐based gestational
age estimates rely on accurate recall of the first day of the last menstrual period.
Studies have found that approximately 20% of pregnant women indicate that
their reported LMP date is uncertain or unknown [93, 94]. Even among women
who report the date of their last menstrual period with certainty, there is
evidence of error due to digit preference [95] and mistaken reporting due to
skipped menstrual bleeding, mid‐cycle bleeding, or third trimester bleeding.
Evidence of LMP‐based GA misclassification, including implausible gestational
age‐specific birth weights and non‐normally distributed birth weights for preterm
and post‐term births, has been documented in several studies [96‐98]. In
contrast to LMP, ultrasound‐based estimates of gestational age rely on fetal
growth trajectories rather than a direct estimate of time in pregnancy. Factors
affecting accuracy of ultrasound measurement include timing of ultrasound,
facility characteristics, and maternal characteristics. Hadlock et al. report the
precision of gestational age prediction models at various time points in
pregnancy [99]. They demonstrated that early ultrasound (12‐18 weeks based on
known LMP) may be accurate to within 1 to 2 weeks, while later ultrasound (24‐36
weeks based on known LMP) may be accurate to within 2 to 3 weeks.
Maternal characteristics such as central adiposity, preference in mode of
ultrasound, and factors associated with delayed prenatal care (e.g. access to
care, pregnancy intention) may impact the validity of ultrasound dating. Even if
measured accurately, estimated GA may still be misclassified due to variability in
growth trajectories [100]. Future research should explore missing data methods
as a potential tool for correcting biases associated with misclassification of
gestational age.
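
The consequence of LMP recall error for outcome classification can be shown with a small worked example. It assumes the conventional definition of preterm birth as delivery before 37 completed weeks (259 days) from the first day of the last menstrual period. The dates, the 14‐day reporting error, and the function names are hypothetical.

```python
from datetime import date, timedelta

PRETERM_CUTOFF_DAYS = 37 * 7  # delivery before 37 completed weeks


def gestational_age_days(lmp, delivery):
    """LMP-based gestational age at delivery, in days."""
    return (delivery - lmp).days


def is_preterm(lmp, delivery):
    return gestational_age_days(lmp, delivery) < PRETERM_CUTOFF_DAYS


# Hypothetical pregnancy delivered at 37 weeks 6 days by the true LMP.
true_lmp = date(2010, 1, 1)
delivery = true_lmp + timedelta(days=265)

# Hypothetical reporting error: later bleeding mistaken for the last period,
# so the reported LMP is 14 days after the true one.
reported_lmp = true_lmp + timedelta(days=14)

print(gestational_age_days(true_lmp, delivery), is_preterm(true_lmp, delivery))
print(gestational_age_days(reported_lmp, delivery), is_preterm(reported_lmp, delivery))
```

In this example a two‐week error in the reported LMP moves a term delivery into the preterm category, and it would shift every gestational age‐specific exposure window by the same amount.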
Traditionally, missing data in observational epidemiology has been viewed as a
nuisance. In this dissertation, we have attempted to utilize missing data as a
tool. In our first chapter on time‐dependent bias, we attempted to remove the
analytic bias by multiply imputing exposure time. In our second chapter, we
assessed the validity and efficiency of study designs in which data are deliberately
left missing within the study sample. In our third chapter, missing data
methods were utilized in the design of the study and in our adjustment for
confounding. While the missing data methods implemented in this dissertation
are not novel, they are applied in a novel context to address common problems
in observational epidemiology.
1. Cunningham, F.G., et al., Chapter 5. Maternal Physiology: Williams
Obstetrics, 23e: http://www.accessmedicine.com/content.aspx?aID=6043606.
2. Kochenour, N.K., Adverse pregnancy outcome: sensitive periods, types of adverse outcomes, and relationships with critical exposure periods. Prog Clin Biol Res, 1984. 160: p. 229-35.
3. Czeizel, A.E., Specified critical period of different congenital abnormalities: a new approach for human teratological studies. Congenit Anom (Kyoto), 2008. 48(3): p. 103-9.
4. Karumanchi, S.A. and R.J. Levine, How does smoking reduce the risk of preeclampsia? Hypertension, 2010. 55(5): p. 1100-1.
5. Rooney, B.L., M.A. Mathiason, and C.W. Schauberger, Predictors of Obesity in Childhood, Adolescence, and Adulthood in a Birth Cohort. Matern Child Health J.
6. Oken, E., et al., Maternal gestational weight gain and offspring weight in adolescence. Obstet Gynecol, 2008. 112(5): p. 999-1006.
7. Beyerlein, A., et al., Associations of gestational weight loss with birth-related outcome: a retrospective cohort study. BJOG, 2010.
8. Bodnar, L.M., et al., Severe obesity, gestational weight gain, and adverse birth outcomes. Am J Clin Nutr, 2010. 91(6): p. 1642-8.
9. Yates, L., et al., Influenza A/H1N1v in pregnancy: an investigation of the characteristics and management of affected women and the relationship to pregnancy outcomes for mother and infant. Health Technol Assess. 14(34): p. 109-82.
10. Chen, Y.K., et al., No increased risk of adverse pregnancy outcomes in women with urinary tract infections: a nationwide population-based study. Acta Obstet Gynecol Scand. 89(7): p. 882-8.
11. van Gelder, M.M., et al., Characteristics of pregnant illicit drug users and associations between cannabis use and perinatal outcome in a population-based study. Drug Alcohol Depend. 109(1-3): p. 243-7.
12. Schempf, A.H. and D.M. Strobino, Illicit drug use and adverse birth outcomes: Is it drugs or context. Journal of Urban Health, 2008. 85(6): p. 858-873.
13. Lund, N., L.H. Pedersen, and T.B. Henriksen, Selective serotonin reuptake inhibitor exposure in utero and pregnancy outcomes. Arch Pediatr Adolesc Med, 2009. 163(10): p. 949-54.
14. Calderon-Margalit, R., et al., Risk of preterm delivery and other adverse perinatal outcomes in relation to maternal use of psychotropic medications during pregnancy. Am J Obstet Gynecol, 2009. 201(6): p. 579 e1-8.
15. Ververs, T.F., et al., Association between antidepressant drug use during pregnancy and child healthcare utilisation. BJOG, 2009. 116(12): p. 1568-77.
16. Ritz, B., et al., Ambient air pollution and preterm birth in the environment and pregnancy outcomes study at the University of California, Los Angeles. Am J Epidemiol, 2007. 166(9): p. 1045-52.
17. Vardavas, C.I., et al., Smoking and smoking cessation during early pregnancy and its effect on adverse pregnancy outcomes and fetal growth. Eur J Pediatr. 169(6): p. 741-8.
18. Hueston, W.J., et al., Delayed prenatal care and the risk of low birth weight delivery. J Community Health, 2003. 28(3): p. 199-208.
19. Daniels, J.L., et al., Attitudes toward participation in a pregnancy and child cohort study. Paediatr Perinat Epidemiol, 2006. 20(3): p. 260-6.
20. Nechuta, S., et al., Attitudes of pregnant women towards participation in perinatal epidemiological research. Paediatr Perinat Epidemiol, 2009. 23(5): p. 424-30.
21. Suissa, S., Immortal time bias in pharmacoepidemiology. Am J Epidemiol, 2008. 167(4): p. 492-499.
22. van Walraven, C., et al., Time-dependent bias was common in survival analyses published in leading clinical journals. J Clin Epidemiol, 2004. 57(7): p. 672-82.
23. Beyersmann, J., et al., An easy mathematical proof showed that time-dependent bias inevitably leads to biased effect estimation. J Clin Epidemiol, 2008. 61(12): p. 1216-21.
24. Beyersmann, J., M. Wolkewitz, and M. Schumacher, The impact of time-dependent bias in proportional hazards modelling. Stat Med, 2008. 27(30): p. 6439-54.
25. Tleyjeh, I.M., et al., Propensity score analysis with a time-dependent intervention is an acceptable although not an optimal analytical approach when treatment selection bias and survivor bias coexist. J Clin Epidemiol. 63(2): p. 139-40.
26. Andres Houseman, E. and D.K. Milton, Partial questionnaire designs, questionnaire non-response, and attributable fraction: applications to adult onset asthma. Stat Med, 2006. 25(9): p. 1499-519.
27. Wacholder, S., et al., The partial questionnaire design for case-control studies. Stat Med, 1994. 13(5-7): p. 623-34.
28. Graham, J.W., et al., Planned missing data designs in psychological research. Psychol Methods, 2006. 11(4): p. 323-43.
29. Graham, J.W., S.M. Hofer, and D.P. MacKinnon, Maximizing the usefulness of data obtained with planned missing value patterns: An application of maximum likelihood procedures. Multivariate Behavioral Research, 1996. 31(2): p. 197-218.
30. Morara, M., et al., Optimal design for epidemiological studies subject to designed missingness. Lifetime Data Anal, 2007. 13(4): p. 583-605.
31. Helms, R.W., Intentionally incomplete longitudinal designs: I. Methodology and comparison of some full span designs. Stat Med, 1992. 11(14-15): p. 1889-913.
32. Graham, J.W., Missing data analysis: making it work in the real world. Annu Rev Psychol, 2009. 60: p. 549-76.
33. Howards, P.P., E.F. Schisterman, and P.J. Heagerty, Potential confounding by exposure history and prior outcomes: an example from perinatal epidemiology. Epidemiology, 2007. 18(5): p. 544-51.
34. Howards, P.P., et al., Misclassification of gestational age in the study of spontaneous abortion. Am J Epidemiol, 2006. 164(11): p. 1126-36.
35. Bodnar, L.M., et al., Marginal structural models for analyzing causal effects of time-dependent treatments: an application in perinatal epidemiology. Am J Epidemiol, 2004. 159(10): p. 926-34.
36. O'Neill, M.S., et al., Have studies of urinary tract infection and preterm delivery used the most appropriate methods? Paediatr Perinat Epidemiol, 2003. 17(3): p. 226-33.
37. Symons, M.J. and D.T. Moore, Hazard rate ratio and prospective epidemiological studies. J Clin Epidemiol, 2002. 55(9): p. 893-9.
38. Hernan, M.A., The hazards of hazard ratios. Epidemiology. 21(1): p. 13-5.
39. Lee, E.T., Statistical methods for survival data analysis, 2nd edition. Probability and Mathematical Statistics, ed. V. Barnett, et al. 1992, New York, New York: Wiley Interscience.
40. Clausson, B., et al., Effect of caffeine exposure during pregnancy on birth weight and gestational age. Am J Epidemiol, 2002. 155(5): p. 429-36.
41. Therneau, T.M. and P.M. Grambsch, Modeling survival data: extending the Cox model, ed. K. Dietz, et al. 2000, New York: Springer.
42. Roth, J. NCHS's Vital Statistics Natality Birth Data -- 1968-2006. 2009 [cited 2010]; Available from: http://www.nber.org/data/vital-statistics-natality-data.html.
43. Alexander, G.R. and C.C. Korenbrot, The role of prenatal care in preventing low birth weight. Future Child, 1995. 5(1): p. 103-20.
44. Guillory, V.J., et al., Prenatal care and infant birth outcomes among Medicaid recipients. J Health Care Poor Underserved, 2003. 14(2): p. 272-89.
45. Shwartz, S., Prenatal care, prematurity, and neonatal mortality. A critical analysis of prenatal care statistics and associations. Am J Obstet Gynecol, 1962. 83: p. 591-8.
46. Molenberghs, G. and M. Kenward, Missing data in clinical studies, ed. S. Senn and V. Barnett. 2007: John Wiley & Sons.
47. Vintzileos, A.M., et al., The impact of prenatal care in the United States on preterm births in the presence and absence of antenatal high-risk conditions. Am J Obstet Gynecol, 2002. 187(5): p. 1254-7.
48. Hertz-Picciotto, I., L.M. Pastore, and J.J. Beaumont, Timing and patterns of exposures during pregnancy and their implications for study methods. Am J Epidemiol, 1996. 143(6): p. 597-607.
49. Nunes, A.P., et al., Time dependent bias of non-binary exposures: examples in perinatal epidemiology. 2010.
50. Savitz, D.A., et al., Epidemiologic measures of the course and outcome of pregnancy. Epidemiol Rev, 2002. 24(2): p. 91-101.
51. Golding, J. and C. Steer, How many subjects are needed in a longitudinal birth cohort study? Paediatr Perinat Epidemiol, 2009. 23 Suppl 1: p. 31-8.
52. Wacholder, S., The case-control study as data missing by design: estimating risk differences. Epidemiology, 1996. 7(2): p. 144-50.
53. Hogue, C.J. and M.A. Brewster, The potential of exposure biomarkers in epidemiologic studies of reproductive health. Environ Health Perspect, 1991. 90: p. 261-9.
54. Yuan, Y. Multiple imputation for missing values: Concepts and new development. in SUGI. 2000. Rockville, MD.
55. Rubin, D.B., Multiple Imputation for Nonresponse in Surveys. 1987, New York: John Wiley & Sons.
56. Schafer, J.L., Multiple imputation in multivariate problems when the imputation and analysis models differ. Statistica Neerlandica, 2003. 57(1): p. 19-35.
57. Schafer, J.L. and M.K. Olsen, Multiple imputation for multivariate missing-data problems: A data analyst's perspective. Multivariate Behavioral Research, 1998. 33(4): p. 545-571.
58. Bracken, M.B., et al., Association of maternal caffeine consumption with decrements in fetal growth. Am J Epidemiol, 2003. 157(5): p. 456-66.
59. Adiguzel, F. and M. Wedel, Split Questionnaire Design for Massive Surveys. Journal of Marketing Research, 2008. 45(5): p. 608-617.
60. Chipperfield, J.O. and D.G. Steel, Design and Estimation for Split Questionnaire Surveys. Journal of Official Statistics, 2009. 25(2): p. 227-244.
61. Raghunathan, T.E. and J.E. Grizzle, A Split Questionnaire Survey Design. Journal of the American Statistical Association, 1995. 90(429): p. 54-63.
62. Newman, S.C., P.E. Shrout, and R.C. Bland, The efficiency of two-phase designs in prevalence surveys of mental disorders. Psychol Med, 1990. 20(1): p. 183-93.
63. Shrout, P.E. and S.C. Newman, Design of two-phase prevalence surveys of rare disorders. Biometrics, 1989. 45(2): p. 549-55.
64. Brown, C.H., A. Indurkhya, and G.K. Sheppard, Power calculations for data missing by design: Applications to a follow-up study of lead exposure and attention. Journal of the American Statistical Association, 2000. 95(450): p. 383-395.
65. Resnik, R., Intrauterine growth restriction. Obstet Gynecol, 2002. 99(3): p. 490-6.
66. Nomura, K., et al., Caffeine suppresses the expression of the Bcl-2 mRNA in BeWo cell culture and rat placenta. J Nutr Biochem, 2004. 15(6): p. 342-9.
67. Kirkinen, P., et al., The effect of caffeine on placental and fetal blood flow in human pregnancy. Am J Obstet Gynecol, 1983. 147(8): p. 939-42.
68. Austin, P.C., The performance of different propensity score methods for estimating marginal odds ratios. Stat Med, 2007. 26(16): p. 3078-94.
69. Vik, T., et al., High caffeine consumption in the third trimester of pregnancy: gender-specific effects on fetal growth. Paediatr Perinat Epidemiol, 2003. 17(4): p. 324-31.
70. Scialli, A.R., et al., A review of the literature on the effects of acetaminophen on pregnancy outcome. Reprod Toxicol, 2010. 30(4): p. 495-507.
71. Burdan, F., Effects of prenatal exposure to combination of acetaminophen, isopropylantipyrine and caffeine on intrauterine development in rats. Hum Exp Toxicol, 2002. 21(1): p. 25-31.
72. Burdan, F., Intrauterine growth retardation and lack of teratogenic effects of prenatal exposure to the combination of paracetamol and caffeine in Wistar rats. Reprod Toxicol, 2003. 17(1): p. 51-8.
73. Grosso, L.M., et al., Prenatal caffeine assessment: fetal and maternal biomarkers or self-reported intake? Ann Epidemiol, 2008. 18(3): p. 172-8.
74. Centers for Disease Control and Prevention, National Center for Health Statistics. 1999 natality detail file, issued June 2001. (NCHS CD-ROM series 21, no. 12H, ASCII version).
75. Robins, J.M., M.A. Hernan, and B. Brumback, Marginal structural models and causal inference in epidemiology. Epidemiology, 2000. 11(5): p. 550-60.
76. Cole, S.R. and M.A. Hernan, Constructing inverse probability weights for marginal structural models. Am J Epidemiol, 2008. 168(6): p. 656-64.
77. Hosmer, D.W., et al., A comparison of goodness-of-fit tests for the logistic regression model. Stat Med, 1997. 16(9): p. 965-80.
78. Rothman, K.J., S. Greenland, and T.L. Lash, Modern Epidemiology. 3rd ed, ed. S. Seigafuse and L. Bierig. 2008, Philadelphia, PA: Lippincott Williams & Wilkins.
79. Hosmer, D.W. and S. Lemeshow, Confidence interval estimation of interaction. Epidemiology, 1992. 3(5): p. 452-6.
80. ACOG Committee Opinion No. 462: Moderate caffeine consumption during pregnancy. Obstet Gynecol, 2010. 116(2 Pt 1): p. 467-8.
81. Martin, T.R. and M.B. Bracken, The association between low birth weight and caffeine consumption during pregnancy. Am J Epidemiol, 1987. 126(5): p. 813-21.
82. Fenster, L., et al., Caffeine consumption during pregnancy and fetal growth. Am J Public Health, 1991. 81(4): p. 458-61.
83. Peacock, J.L., J.M. Bland, and H.R. Anderson, Effects on birthweight of alcohol and caffeine consumption in smoking women. J Epidemiol Community Health, 1991. 45(2): p. 159-63.
84. Bakker, R., et al., Maternal caffeine intake from coffee and tea, fetal growth, and the risks of adverse birth outcomes: the Generation R Study. Am J Clin Nutr, 2010. 91(6): p. 1691-8.
85. Maternal caffeine intake during pregnancy and risk of fetal growth restriction: a large prospective observational study. BMJ, 2008. 337: p. a2332.
86. Bracken, M.B., et al., Heterogeneity in assessing self-reports of caffeine exposure: implications for studies of health effects. Epidemiology, 2002. 13(2): p. 165-71.
87. Bech, B.H., et al., Effect of reducing caffeine intake on birth weight and length of gestation: randomised controlled trial. BMJ, 2007. 334(7590): p. 409.
88. Williamson, E., et al., Propensity scores: From naive enthusiasm to intuitive understanding. Stat Methods Med Res, 2011.
89. Bakker, R., et al., Maternal Caffeine Intake, Blood Pressure, and the Risk of Hypertensive Complications During Pregnancy. The Generation R Study. Am J Hypertens, 2010.
90. Adeney, K.L., et al., Coffee consumption and the risk of gestational diabetes mellitus. Acta Obstet Gynecol Scand, 2007. 86(2): p. 161-6.
91. Kalilani, L. and J. Atashili, Measuring additive interaction using odds ratios. Epidemiol Perspect Innov, 2006. 3: p. 5.
92. Lynch, C.D. and J. Zhang, The research implications of the selection of a gestational age estimation method. Paediatr Perinat Epidemiol, 2007. 21 Suppl 2: p. 86-96.
93. Buekens, P., et al., Epidemiology of pregnancies with unknown last menstrual period. J Epidemiol Community Health, 1984. 38(1): p. 79-80.
94. Hall, M.H., et al., The extent and antecedents of uncertain gestation. Br J Obstet Gynaecol, 1985. 92(5): p. 445-51.
95. Waller, D.K., et al., Assessing number-specific error in the recall of onset of last menstrual period. Paediatr Perinat Epidemiol, 2000. 14(3): p. 263-7.
96. Dietz, P.M., et al., A comparison of LMP-based and ultrasound-based estimates of gestational age using linked California livebirth and prenatal screening records. Paediatr Perinat Epidemiol, 2007. 21 Suppl 2: p. 62-71.
97. Haglund, B., Birthweight distributions by gestational age: comparison of LMP-based and ultrasound-based estimates of gestational age using data from the Swedish Birth Registry. Paediatr Perinat Epidemiol, 2007. 21 Suppl 2: p. 72-8.
98. Ananth, C.V., Menstrual versus clinical estimate of gestational age dating in the United States: temporal trends and variability in indices of perinatal outcomes. Paediatr Perinat Epidemiol, 2007. 21 Suppl 2: p. 22-30.
99. Hadlock, F.P., et al., Estimating fetal age: computer-assisted analysis of multiple fetal growth parameters. Radiology, 1984. 152(2): p. 497-501.
100. Henriksen, T.B., et al., Bias in studies of preterm and postterm delivery due to ultrasound assessment of gestational age. Epidemiology, 1995. 6(5): p. 533-7.