NSF-Census Research Network Meeting September 11, 2014 Jennifer D. Parker
Geographic linkages between National Center for Health Statistics’ population health surveys and air quality measures
NSF-Census Research Network Meeting, September 11, 2014
Jennifer D. Parker National Center for Health Statistics
Background
• Air pollution has been found to be related to numerous acute and chronic health outcomes, including mortality, asthma exacerbation, emergency department visits, and other cardiovascular and respiratory outcomes
• The Environmental Protection Agency regulates air quality, largely based on health impacts
• Few national studies are available
National Center for Health Statistics
• 1960 - National Office of Vital Statistics and the National Health Survey merged to form National Center for Health Statistics
• 1987 - NCHS became part of the Centers for Disease Control and Prevention
• Official Federal Health Statistics Agency and part of the Federal Statistical System
• Data products
– A principal source of health statistics for the Department of Health and Human Services
– Public use micro-data: individual records to enable data users to calculate estimates
– Integrated Data Tools: reports and online tools with pre-tabulated indicators
– Restricted use micro-data: individual records with information that could potentially be used to identify an individual
  • Geographic identifiers and dates
  • Genetic information, detailed race and ethnicity
  • Can be accessed in the NCHS Research Data Center (RDC)
Population Health Surveys: National Health Interview Survey
• Household interview survey: detailed demographic, health status, health insurance, access to care, economic, etc.
– Generally chronic health outcomes rather than acute
• Approximately 35,000 – 40,000 households, 80,000 – 100,000 individuals per year
– Data years can be combined to increase sample or study trends
• Complex sample design (stratified, clustered, oversampling)
• Collected since 1957
• Interview dates and geocoded residential locations available on restricted use files
– Locations can be used to examine characteristics of locations, not actual places
Population Health Surveys: National Health and Nutrition Examination Survey
• National examinations since 1960, since 1999 an ongoing survey
– Physical measures, blood, urine, DNA
– Household interview and 3-4 hr exam
• Complex survey design
• About 5,000 total survey respondents each year
• About 15 locations each year, 30 locations in a two-year data release; data can be combined across releases
• Exam dates, interview dates and geocoded residential locations available on restricted use files
– Locations can be used to examine characteristics of locations, not actual places
– Data collected at each location over a few weeks
National Vital Statistics System (NVSS): Death and Birth Data
• Vital statistics data can be used to assess events during specific time windows, at specific locations, aggregated locations, and/or for specific endpoints
– Death data have been combined with daily pollution data at county-level to study acute effects of air pollution
– Birth data have been combined with county level pollution data to study birth weight, infant mortality, preterm birth
National Health Care Surveys
• Provider surveys designed to provide a comprehensive look at medical care provided in the U.S.
• Information generally obtained from administrative records (e.g. claims), not interview
• Selected covariates: age, race, expected source of payment, procedures and diagnoses
• Data can be combined across years to increase power
• Restricted use files can contain dates, locations of hospitals, and zip codes of patients
– Locations can be used to examine characteristics of locations, not actual places
EPA Air Quality System
• EPA is a regulatory agency
• Selected pollution data
– Criteria pollutants: ozone, sulfur dioxide, carbon monoxide, particulate matter, nitrogen dioxide, lead
– Collected at monitors throughout the U.S. under local authority
– Both raw numbers and annual summaries available
• Air Quality System (AQS) is EPA’s repository for ambient air quality data
– Monitors, however, are not located everywhere
– Placement is not random, rather for regulatory and convenience purposes
– Not all locations monitor for all pollutants, not all pollutants are monitored on the same periodicity, and the ability to extrapolate from the monitor to a distance away from the monitor varies by pollutant
• As the AQS has become more user-friendly, the number of studies using these data has greatly expanded.
• More recently, EPA Community Multiscale Air Quality (CMAQ) model provides estimates for locations with no monitors
Not all monitors are near NCHS survey participants
Exposure estimates are highly clustered
Hypothetical sampled segments and locations of monitors within a hypothetical NCHS sampled county
Combining NCHS survey and EPA data
• Complex sample survey designs
– Not all counties in sample
– While the survey is designed to be nationally representative, the merged survey may not be representative for pollution
  • Only ~30 counties sampled in a two-year NHANES release
– Health measures and pollution measures may not align in time
– Oversampling of subpopulations increases sample size for understudied groups
Combining NCHS survey and EPA data, continued
• Surveys currently geocoded to latitude and longitude of residence and/or administrative units
• Air quality data
– Latitude and longitude of monitor locations
– Administrative units (county)
– Modeled air pollution data available at lat/long and/or smaller administrative units
  • Basic kriging
  • EPA’s CMAQ (Community Multiscale Air Quality)
  • Others, including Land Use Regression (LUR)
• Geographically linked data cannot be used for maps or to identify sensitive locations, rather as a contextual variable for analysis (e.g., areas with higher compared to lower levels of air pollution)
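A distance-based linkage between geocoded residences and monitor locations can be sketched as follows. This is only an illustration under stated assumptions: the function names, the unweighted averaging of all monitors inside the radius, and the coordinates and readings are hypothetical and are not taken from the NCHS linkage procedures.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def link_by_distance(residence, monitors, radius_miles):
    """Average the readings of all monitors within radius_miles of a residence.

    residence: (lat, lon); monitors: list of (lat, lon, reading).
    Returns None when no monitor lies inside the radius, i.e. the
    respondent cannot be linked at this radius.
    """
    lat, lon = residence
    nearby = [reading for (mlat, mlon, reading) in monitors
              if haversine_miles(lat, lon, mlat, mlon) <= radius_miles]
    return sum(nearby) / len(nearby) if nearby else None

# Made-up coordinates and readings, for illustration only.
monitors = [(39.00, -77.00, 12.0), (39.20, -77.00, 16.0)]
residence = (39.01, -77.00)
print(link_by_distance(residence, monitors, 5))   # uses only the closer monitor
print(link_by_distance(residence, monitors, 20))  # uses both monitors
```

A smaller radius ties the exposure estimate more closely to the residence but leaves more respondents unlinked, which is the trade-off the later evaluation slides examine.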
Evolution of geographic linkages between NCHS survey data and EPA data
MSA-level → County-level → Distance measures → Kriging → Modeled data
NHIS linked to county level air pollution
[Figure: Childhood asthma and county-level PM2.5 12-month average air pollution, 2001-2004 NHIS. Adjusted odds ratio (y-axis) by pollutant level quartile (x-axis, 1st to 4th), in three panels: current asthma; asthma attack past 12 months; ER visit past 12 months.]
Evaluations of methods to link NCHS and EPA data
Percent NHIS respondents linked to pollutants by linkage method: 2000-2003 NHIS
Percent distribution race/ethnicity. NHIS linked to PM2.5, by linkage method: 2000-2003 NHIS
[Figure: Percent with PM2.5 data by linkage method and time from examination, 1999-2004 NHANES. Percent (y-axis) by time from examination (x-axis: prior day, 7 days, 6 weeks, annual), for three linkage methods: 5 miles, county, 20 miles. SOURCE: NHANES 99-04, unpublished data]
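To make the "time from examination" windows concrete, here is a minimal sketch of how a window average before an exam date could be computed from daily monitor readings. The window lengths in days, the completeness threshold `min_fraction`, and the every-third-day sampling schedule in the example are all assumptions for illustration; the source does not state how the windows were defined.

```python
from datetime import date, timedelta

# Assumed window lengths in days: "6 weeks" = 42, "annual" = 365.
WINDOWS = {"prior_day": 1, "7_days": 7, "6_weeks": 42, "annual": 365}

def window_mean(daily, exam_date, n_days, min_fraction=0.25):
    """Mean of daily readings over the n_days ending the day before exam_date.

    daily: dict mapping date -> reading.
    Returns None when fewer than min_fraction of the days in the
    window have a reading (the respondent has no usable exposure value).
    """
    days = [exam_date - timedelta(days=k) for k in range(1, n_days + 1)]
    values = [daily[d] for d in days if d in daily]
    if not values or len(values) < min_fraction * n_days:
        return None
    return sum(values) / len(values)

# Hypothetical monitor that reports every third day, with made-up readings.
exam = date(2003, 6, 15)
daily = {exam - timedelta(days=k): 10.0 for k in range(3, 366, 3)}
for name, n_days in WINDOWS.items():
    print(name, window_mean(daily, exam, n_days))
```

With this schedule the prior-day window has no reading at all while the longer windows do, which illustrates why tighter time windows leave fewer respondents with data.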
[Figure: Adjusted* odds of fair/poor health status by PM2.5 (per 10 µg/m3) by geographic linkage method, 2000-2003 NHIS. Odds ratio with 95% confidence interval (y-axis) for respondents linked to PM2.5 (5 miles, N=155K; 20 miles, N=313K) and respondents linked to multiple pollutants (5 miles, N=30K; 20 miles, N=176K).]
*Adjusted for poverty, age, region, urban/rural, and race/ethnicity
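The "per 10 µg/m3" scaling can be illustrated with a short calculation. The sketch assumes a logistic regression in which PM2.5 enters linearly, so a coefficient `beta` (log odds per 1 µg/m3) with standard error `se` converts to an odds ratio for a 10-unit increase by exponentiating 10 times the coefficient. The function name and the numbers are hypothetical, and with complex survey data the standard error would need to be design-based.

```python
import math

def odds_ratio_per_increment(beta, se, increment=10.0, z=1.96):
    """Odds ratio and approximate 95% CI for an `increment`-unit rise in exposure.

    beta, se: log-odds coefficient and its standard error per 1 unit of exposure.
    Returns (point estimate, lower bound, upper bound).
    """
    point = math.exp(increment * beta)
    lower = math.exp(increment * (beta - z * se))
    upper = math.exp(increment * (beta + z * se))
    return point, lower, upper

# Made-up coefficient and standard error, for illustration only.
print(odds_ratio_per_increment(beta=0.01, se=0.004))
```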
Application of kriging used to obtain air measures for the NHIS
Relationship between asthma and PM2.5 using kriged PM2.5 estimates. NHIS 2002-2005
What was learned comparing distance and county measures
• The more tightly controlled the exposure window in terms of space and time, the fewer survey respondents have monitor data
– Large literature on ‘exposure measurement error’
– Growing interest in time-use patterns
– The more pollutants examined simultaneously, the fewer survey respondents would be included
– More urban and disadvantaged populations would be more likely to be included
• Impact on policy inferences unknown for all outcomes, but may be small
• Adjustment of sample weights may partially address this issue
• Random effects and Bayesian statistical models may address these issues
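One way the sample-weight adjustment could work is a weighting-class adjustment that treats unlinked respondents like nonrespondents. The sketch below is an assumption for illustration, not the method used in the talk: within each adjustment cell (hypothetically, urban versus rural), the weights of linked respondents are inflated so they sum to the cell's original weighted total.

```python
from collections import defaultdict

def adjust_weights_for_linkage(respondents):
    """Weighting-class adjustment for respondents lost to linkage.

    respondents: list of dicts with keys 'cell', 'weight', 'linked' (bool).
    Returns adjusted weights in input order; unlinked respondents get 0.0.
    A cell with no linked respondents keeps no weight, so in practice
    such cells would have to be collapsed with neighboring cells first.
    """
    total = defaultdict(float)   # weighted total of all respondents per cell
    linked = defaultdict(float)  # weighted total of linked respondents per cell
    for r in respondents:
        total[r["cell"]] += r["weight"]
        if r["linked"]:
            linked[r["cell"]] += r["weight"]
    adjusted = []
    for r in respondents:
        if r["linked"]:
            adjusted.append(r["weight"] * total[r["cell"]] / linked[r["cell"]])
        else:
            adjusted.append(0.0)
    return adjusted

# Made-up respondents: rural residents are less likely to be near a monitor.
respondents = [
    {"cell": "urban", "weight": 100.0, "linked": True},
    {"cell": "urban", "weight": 100.0, "linked": True},
    {"cell": "rural", "weight": 200.0, "linked": True},
    {"cell": "rural", "weight": 200.0, "linked": False},
]
print(adjust_weights_for_linkage(respondents))
```

The adjusted weights preserve each cell's weighted total, but they correct for selective linkage only to the extent that linked and unlinked respondents within a cell are similar.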
Modeled Data from CDC/National Center for Environmental Health
• Downscaler (DS) model based on Bayesian space-time methods
– Combines monitored data from AQS with EPA CMAQ data
– Daily, county, tract level
– Available for ozone and PM2.5 for 2001-2008
Hazard ratios for association between heart disease mortality and PM2.5 (2001), black and white adults. 1986-2004 NHIS with mortality follow-up through 2006
What we’ve learned (so far) from DS model
• Inferences for mortality and PM2.5 are similar for monitor data using AQS and the DS data
– Confidence intervals narrower for DS models
– Inferences for black adults similar to those for white adults
• Availability of DS data in more locations may allow for inference by geographic indicators (e.g. region of the country, levels of urbanization)
• Propagating model error into epidemiological models with complex survey data should be done, but the best method for doing so is unknown
Conclusion
• Linkages between national health data and monitored and modeled air pollution data are resources for studying environmental health
– Nationally representative data include population subgroups less often included in other studies
– Wide range of health outcomes
• Numerous methodological challenges
– Temporal correspondence between exposure measures and health outcomes
– Geographic correspondence between exposure measures and health outcomes
  • Monitor locations
  • Residential and daily mobility
– Incorporation of estimation and model errors into inference analyses
Thanks to my collaborators!
• Nataliya Kravets
• Ambarish Vaidyanathan
• Lara Akinbami
• Keeve Nachman
• Tracey Woodruff
Contact:
Jennifer D. Parker
Chief, Special Projects Branch
Office of Analysis and Epidemiology
National Center for Health Statistics
[email protected]
(301) 458-4419