Journal of Epidemiology and Community Health 1996;50(Suppl 1):S12-S18

Short term effects of air pollution on health: a European approach using epidemiologic time series data: the APHEA protocol

K Katsouyanni, J Schwartz, C Spix, G Touloumi, D Zmirou, A Zanobetti, B Wojtyniak, J M Vonk, A Tobias, A Ponka, S Medina, L Bacharova, H R Anderson

Department of Hygiene and Epidemiology, University of Athens Medical School, Greece: K Katsouyanni, G Touloumi
Harvard School of Public Health, Boston, USA: J Schwartz
GSF-Forschungszentrum Umwelt und Gesundheit, Germany: C Spix
Faculte de Medicine, Universite de Grenoble, France: D Zmirou
Institute of Clinical Physiology, National Research Council, Pisa, Italy: A Zanobetti
National Institute of Hygiene, Warsaw, Poland: B Wojtyniak
Department of Epidemiology and Statistics, University of Groningen, The Netherlands: J M Vonk
Institut Municipal d'Investigacio Medica, Barcelona, Spain: A Tobias
Helsinki City Center of the Environment, Finland: A Ponka
Observatoire Regional de la Sante, Paris, France: S Medina
National Center for Health Promotion, Bratislava, Slovakia: L Bacharova
Department of Public Health Sciences, St George's Hospital Medical School, London, UK: H R Anderson

Correspondence to: Dr K Katsouyanni, Department of Hygiene and Epidemiology, University of Athens Medical School, 75, Mikras Asias Street, 115 27 Athens (Goudi), Greece.

Abstract
Background and objectives - Results from several studies over the past five years have shown that the current levels of pollutants in Europe and North America have adverse short term effects on health. The APHEA project aims to quantify these in Europe, using standardised methodology. The project protocol and analytical methodology are presented here.
Design - Daily time series data were gathered for several air pollutants (sulphur dioxide; particulate matter, measured as total particles or as the particle fraction with an aerodynamic diameter smaller than a certain cut off, or as black smoke; nitrogen dioxide; and ozone) and health outcomes (the total and cause specific number of deaths and emergency hospital admissions). The data included fulfilled the quality criteria set by the APHEA protocol.
Setting - Fifteen European cities from 10 different countries with a total population over 25 million.
Methodology - The APHEA collaborative group decided on a specific methodological procedure to control for confounding effects and evaluate the hypothesis. At the same time there was sufficient flexibility to allow local characteristics to be taken into account. The procedure included modelling of all potential confounding factors (that is, seasonal and long term patterns, meteorological factors, day of the week, holidays, and other unusual events), choosing the "best" air pollution models, and applying diagnostic tools to check the adequacy of the models. The final analysis used autoregressive Poisson models allowing for overdispersion. Effects were reported as relative risks contrasting defined increases in the corresponding pollutant levels. Each participating group applied the analyses to their own data.
Conclusions - This methodology enabled results from many different European settings to be considered collectively. It represented the best available compromise between feasibility, comparability, and local adaptability when using aggregated time series data not originally collected for the purpose of epidemiological studies.

(J Epidemiol Comm Health 1996;50(Suppl 1):S12-S18)

The APHEA project is supported by the European Union Environment 1991-94 Programme. The project must be placed in the context of recent studies investigating the short term adverse health effects of moderate and relatively low air pollution levels, which have consistently indicated the existence of effects at levels below the current national and international air quality guidelines.1-7 The background and rationale of the study as well as the study areas and air pollution levels have been described in detail elsewhere.8

The APHEA project is a multicentre temporal study that uses aggregated data.9 Eleven European groups participate, analysing data from 15 European cities, with a total population over 25 000 000. The objectives of the programme are:
* To provide quantitative estimates of the short term health effects (using the total and cause specific daily number of deaths and emergency hospital admissions) of air pollution, taking into consideration interactions between different pollutants and between pollutants and other environmental factors.
* To further develop and standardise the methodology for the detection of short term health effects in the analysis of epidemiological time series data.
* To select and develop a meta-analytic approach for epidemiological time series studies.
* To assess the feasibility of creating a European data base of air pollution measurements and of health indicators recorded on a daily basis. This will allow continuous surveillance of short term effects of air pollution in the future.

The cities involved in the project are (in alphabetical order): Amsterdam, Athens, Barcelona, Bratislava, Cracow, Helsinki, Koeln, Lodz, London, Lyon, Milan, Paris, Poznan, Rotterdam, Wroclaw. There is substantial variability in air pollutant levels, mixtures, and patterns, as well as in the climatic conditions of the 15 APHEA cities. All the cities except one have analysed data over at least five years, typically covering part of the 1980s and the beginning of the 1990s.8



The APHEA collaborators decided on a standardised procedure for testing and quantifying the short term health effects of air pollution and this was applied by each participating group in the analysis of their own data. Thus, exchange of knowledge and expertise and harmonisation of the methodology between the groups were achieved. The data used and their selection criteria are briefly described here, since they have been presented in detail elsewhere,8 while emphasis is placed on the statistical evaluation procedures. The analytical methodology is presented from an "applied" point of view; for an in depth statistical review see Schwartz et al.10

The data base

HEALTH OUTCOME DATA
The outcome time series were daily counts of all deaths, or deaths from specific causes (respiratory, cardiovascular, and digestive), or the daily number of emergency hospital admissions from certain causes (total respiratory, asthma, chronic obstructive pulmonary disease, cardiac).

The sources of the mortality data were the national death registration systems for all cities except Athens, where the researchers collected and coded the data. Death registration was complete in all the participating countries. Data on causes of death have problems of incompleteness or erroneous entries and, although the extent of these was not studied in each participating city, it was probably relatively small for the diagnostic categories considered in this project.11-13

Hospital admissions data may have more problems of comparability and uniformity. In most cities14-16 the data were provided by government operated national registers or municipality operated local registers.17 In Barcelona, the hospital admissions register was set up and operated by the researchers and has been tested for validity.3 Discharge diagnosis was used in all cases. The coverage was above 95%, and in the cities where this changed over the study period, care was taken to model this trend.16 Completeness of diagnosis was above 70% for all diagnostic categories and tended to be higher for the categories studied here.14 15 Incomplete recording of diagnosis, however, is probably not correlated with air pollution levels on a day to day basis and is therefore unlikely to introduce biases.10

Errors in the outcome data are probably non-differential with regard to exposure - that is, they do not depend on the air pollution levels. Therefore, if any bias were introduced it would probably tend to decrease the magnitude of the estimated effect parameters (bias toward the null).18

AIR POLLUTION DATA
The exposure time series data which were analysed concerned sulphur dioxide (SO2), particulate matter (either total suspended particles (TSP) or particles with aerodynamic diameter smaller than a certain cut off, for example, PM10, or black smoke (BS)), nitrogen dioxide (NO2), and ozone (O3). Daily (24 hour) measurements (from midnight to midnight in all cities except Polish ones) were used for particles, SO2, and NO2. For SO2 and NO2 the maximum hourly value for each day was also used. For O3 the maximum one hour and the maximum eight hour daily values were considered.8

There was no quality assurance or quality control programme within APHEA to ensure comparability of air pollution measurements. However, all EU countries have quality control programmes in order to conform with EU requirements19-23 and Finland also has a calibration programme. Furthermore, it has recently been shown that the methods employed in the WHO European Region for measuring SO2, NO, and NO2 differ only slightly.24

Admissibility
It was decided to exclude from the analysis measurements made at stations located on limited access highways. Only urban air pollution was studied, so air pollution monitoring sites situated outside urban areas were not used, except for O3, for which in some instances a "suburban" site was used. All groups used urban background monitoring sites and one site near an urban road, with the exception of the French groups who used only background sites. The arithmetic average of measurements available from all urban stations which fulfilled the completeness criteria (see below) was used for every day. The average from the same stations was used for the whole study period.

All time series studies probably suffer from inaccurate estimation of population exposure, since the same level of exposure is assigned to all members of the population. The number of studies estimating the relationship between individual exposure and the levels measured at fixed monitors is very limited so far.25-28 In the APHEA project, however, in order to ensure adequate representation of the population exposure, the number of monitoring sites used for every pollutant was at least three in most cases. The correlation coefficients between the measurements from various monitoring sites in the same city generally ranged from 0.20 to 0.91 for SO2, 0.16 to 0.93 for black smoke, 0.44 to 0.95 for TSP, 0.20 to 0.84 for NO2, and 0.40 to 0.97 for O3.

Completeness
For the calculation of 24 hour NO2 and SO2 and maximum one hour NO2 values, at least 75% of the one hour values on that particular day had to be available. For the maximum one hour O3 values, 75% of the hourly values from 6 am to 7 pm had to be available, since the maximum O3 levels always occur during daylight. For the eight hour value of O3, it was decided to take the 9 am to 5 pm average (since O3 peaks at or immediately after mid-day and this eight hour average is probably identical or very close to the maximum), and to calculate this, at least six hourly values had to be available. If a station had more than 25% of the values for the whole period of analysis missing it was excluded. In some centres a station was closed for a long period and some participants used the measurements from another nearby station. In this situation, care was taken not to introduce systematic error, because in some cases a station that is nearby in geographic terms could give systematically different values (for example, if the levels of the substitute station were systematically higher by 25%, they were multiplied by 0.8).
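The completeness rules above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, hourly values are assumed to be held in a list of 24 entries indexed by clock hour with `None` marking a missing reading, and the 9 am to 5 pm window is taken as hours 9 to 16.

```python
from statistics import mean

def daily_mean_24h(hourly, min_fraction=0.75):
    """24 hour mean, or None when fewer than 75% of the hourly values exist."""
    present = [v for v in hourly if v is not None]
    if len(present) < min_fraction * len(hourly):
        return None
    return mean(present)

def ozone_8h(hourly, start=9, end=17, min_values=6):
    """Mean of the 9 am to 5 pm values (hours 9..16); needs at least six of them."""
    window = [v for v in hourly[start:end] if v is not None]
    return mean(window) if len(window) >= min_values else None

def substitute_scale(ratio_substitute_to_original):
    """Factor applied to a substitute station that reads systematically differently.

    A substitute reading 25% higher (ratio 1.25) is multiplied by 1 / 1.25 = 0.8.
    """
    return 1.0 / ratio_substitute_to_original
```

Returning `None` for an inadmissible day keeps the gap visible, so that it can be handled by the missing data procedure rather than silently averaged over.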

Missing data
In spite of the completeness criteria, there were still a few missing values in the air pollutant time series for some days. It was decided to fill these in since it is desirable to have complete data series for time series analysis. The method was based on estimating the missing value(s) using the available measurements in the other monitoring sites on the same day. If, for example, a value was missing for a specific pollutant from station i, the mean of all stations (on days when all measurements were available) was regressed on the measurements of the other sites (used as independent variables) except i, adjusting also for season. This was then used to predict the mean daily level when the measurement from i was missing, and from this mean to calculate a level for station i. This approach is very difficult when there are more than three monitoring sites. In this situation a simpler approach was adopted: the missing value was replaced by the mean level of the remaining stations multiplied by a factor equal to the ratio of the seasonal (three month) mean for the missing station over the corresponding mean from the stations available on that particular day.
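The simpler replacement rule can be written out as a small sketch. The function and argument names are hypothetical, and the seasonal (three month) means are assumed to have been computed beforehand for each station.

```python
def fill_missing(day_values, seasonal_means, missing):
    """Estimate the missing station's value from the stations available that day.

    day_values:     station -> measurement on the day (None if missing)
    seasonal_means: station -> three month mean for the relevant season
    missing:        name of the station whose value is to be filled in
    """
    available = {s: v for s, v in day_values.items()
                 if s != missing and v is not None}
    day_mean = sum(available.values()) / len(available)
    # seasonal mean of the same stations that contributed to day_mean
    reference = sum(seasonal_means[s] for s in available) / len(available)
    return day_mean * seasonal_means[missing] / reference
```

For example, with stations B and C reading 40 and 60 on the day, seasonal means of 20 and 40 for B and C, and a seasonal mean of 45 for the missing station A, the filled value is 50 x 45 / 30 = 75.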

DATA ON CONFOUNDING FACTORS

Time series data on daily temperature (°C) and relative humidity (%) were used to control for the potential confounding effects of weather. Information on influenza epidemics and other unusual events (heat wave, strike, etc) was also used when possible, but for the control of other confounding factors (seasonality, calendar effects, long term trends) no collection of data was necessary.

Statistical modelling for control of confounding

STATISTICAL CONSIDERATIONS AND DIAGNOSTIC TOOLS
Time series of the type analysed here are approximately Poisson distributed, overdispersed (that is, the variance is greater than the mean), and usually positively autocorrelated. Overdispersion and serial correlation are usually a result of extraneous factors, such as seasonality and weather, rather than an intrinsic feature of the daily case counts, and are best taken care of by an appropriate mean model. For details of distributional issues, the relative risk model (Poisson regression), and the principle of including potential confounders in the model see Schwartz et al.10 According to the APHEA protocol, each centre controlled individually for each confounder in the a priori list, following the guidelines and procedures set by the protocol. The elements are described in the next paragraph.

For technical reasons, the model building was done using linear regression with log transformed dependent variables. Instead of the Poisson model (where Y is the daily number of cases, X the independent explanatory variables, β the parameter vector, and E the statistical expectation)

$\log(E(Y)) = X\beta$

we fit

$E(\log(Y + c)) = X\beta$

(where c is a constant, taking a value >0 when there are 0 counts in the outcome data series, for example, c=0.5 or c=1) which has a different error structure.10

Many tools were available for diagnosis and decision making during the process of model building. Some groups used these at each step, and they were used by all groups after finalising the "core" model.
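The log transformed working model used for model building, $E(\log(Y + c)) = X\beta$, can be sketched for the simplest case of one explanatory variable. This is an illustration only, not the APHEA software: the function names are hypothetical, a single covariate with a closed form least squares solution is assumed, and reading exp(slope x increase) as a relative risk is an approximation that holds only as far as the log-linear model approximates the Poisson one.

```python
import math

def fit_log_linear(counts, x, c=0.5):
    """Least squares fit of log(count + c) on one covariate x.

    Returns (intercept, slope). The constant c keeps log() defined on zero counts.
    """
    z = [math.log(y + c) for y in counts]
    n = len(z)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope

def relative_risk(slope, increase):
    """Approximate relative risk for a defined increase in the covariate."""
    return math.exp(slope * increase)
```

In practice X holds many columns (trend, seasonal, weather, and calendar terms), so a general regression routine would replace the closed form used here.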

The time series plot of the observed dependent variable
This helped determine whether there was a considerable trend, seasonality, changes in structure (more likely to occur in hospital admissions data than in mortality counts), and shorter unusual periods such as effects of hot or cold spells and epidemics.10 29 30

The predicted time series plot
This depicted the effects of the trend and seasonal model and all other longer term structures. It helped to decide whether a shorter cycle (see below) in the seasonal model represented a genuine change in the seasonal shape. It also helped to determine (together with the observed data plot) whether a certain structure was mostly driven by a short influential phase and, although perhaps statistically significant, was probably not appropriate for modelling the rest of the series.10 29 30

The residual time series plot
This was most useful in connection with the raw data plot and the predicted series. The goal was to come as close as possible to white noise. Structures such as trend, seasonality, and epidemics visible in the raw series should be adequately described in the predicted series and invisible in the residual series. Structures that showed up in the residual series but were not visible in the original series pointed to inappropriate fitting or overfitting of one or several confounders. Problems with changes in patterns of longer phases also became visible.10 29 30


Statistical tests; goodness of fit
Tests for improvement of fit (F test, χ² test) after the inclusion of one or a group of potential confounder variables were helpful in deciding which terms actually to include in the model. They also helped in deciding between several ways of describing the same phenomenon. The main goal in this type of analysis was not statistical hypothesis testing; it was confounder control. One should not therefore rely exclusively on statistical tests for decision making; the other more descriptive tools presented above are just as important. It should also be taken into account that several correlated variables can jointly lead to good confounder control, while the statistical tests of their parameters are quite misleading. As described in Schwartz et al,10 R² is not an appropriate measure of model fit in the Poisson case. It is, however, possible to check whether the extra-Poisson variation usually observed in the raw data had been completely removed (or removed to a large extent) in the residual series.
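One common way to check for remaining extra-Poisson variation is the Pearson dispersion ratio, sketched below. The protocol does not state which statistic was used, so this choice is an assumption, as is the function name.

```python
def dispersion_ratio(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    Values near 1 are consistent with Poisson variation around the fitted
    means; values well above 1 indicate remaining overdispersion.
    """
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)
```

Comparing the ratio for a mean-only model with the ratio for the full confounder model shows how much of the overdispersion the mean model has absorbed.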

Parsimony
Given the usually large number of data points available here, parsimony (that is, the use of the smallest possible number of parameters) was not in itself important. Unnecessarily large numbers of variables, however, should be avoided. Different descriptions of the same phenomenon sometimes require very different numbers of parameters. For example, a long term trend may be concisely described by a second order polynomial or by a large number of harmonic waves with periods of more than one year (see below). Epidemics that distort the seasonal pattern in some years may require many by-year seasonal terms to describe them, or, alternatively, a small number of (not necessarily rectangular) indicators per epidemic.

Periodogram
The periodogram is part of the spectral decomposition of a time series. Its practical value is that it can be read like an empirical frequency distribution of cyclical patterns of different period lengths in the data. A trend shows up as large values for periods well above one year, seasonality as large values around period 365, and day of week patterns around periods of seven days. After model fitting, the residual periodogram should come as close as possible to white noise, though a formal test of white noise was an insufficient criterion here. Large values in the residual series periodogram after confounder control at periods above 60 days (the limit set for long term pattern removal in this protocol) point to insufficient trend, season, and epidemic control. Large values in the residual periodogram at periods where there were none in the raw data series point to inappropriate or over correction of at least some potential confounders."9
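A periodogram of a residual series can be computed directly from the discrete Fourier transform, as in the sketch below. The direct O(n²) sum and the scaling by 1/n are simplifying assumptions made for readability; a fast Fourier transform routine would normally be used for series several years long.

```python
import cmath
import math

def periodogram(series):
    """Return (period_in_days, power) pairs for the Fourier frequencies.

    The series is centred first, so the zero frequency (the mean) is omitted.
    """
    n = len(series)
    centre = sum(series) / n
    centred = [v - centre for v in series]
    result = []
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centred))
        result.append((n / k, abs(coef) ** 2 / n))
    return result
```

Read against the text above, large power at periods longer than about 60 days in the residual series would suggest incomplete control of trend, season, or epidemics, and a peak near 7 days an uncontrolled day of week pattern.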

Partial autocorrelation function (PACF)
The PACF describes the serial correlation of a time series at lags 1, 2, etc, with the value at each lag corrected for the previous lags. For details on the interpretation and treatment of the autocorrelation structure in the regression see Schwartz et al.10 The usually relatively strong positive autocorrelation in the observed health outcome series should be reduced to white noise as far as possible by confounder control. Large positive values at the first lags point to insufficient confounder control. Statistical significance is a helpful but insufficient criterion for what "large" is; this also depends on how large the raw series autocorrelation was. If the first several lags of the residual series are consistently below 0, this points to overcorrection. The PACF of the residuals after completing the confounder model is also helpful in determining the order of the autocorrelation correction in the regression (see below).
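The PACF of a residual series can be obtained from its sample autocorrelations with the Durbin-Levinson recursion. The sketch below is one standard way of doing this and is not taken from the protocol; the function names are hypothetical.

```python
def acf(x, max_lag):
    """Sample autocorrelations r[0..max_lag] (r[0] is 1)."""
    n = len(x)
    centre = sum(x) / n
    d = [v - centre for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations at lags 1..max_lag (Durbin-Levinson recursion)."""
    r = acf(x, max_lag)
    previous = []          # AR coefficients of the order k-1 fit
    values = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            current = [phi_kk]
        else:
            num = r[k] - sum(previous[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(previous[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            current = [previous[j] - phi_kk * previous[k - 2 - j]
                       for j in range(k - 1)] + [phi_kk]
        values.append(phi_kk)
        previous = current
    return values
```

For a series with first order autocorrelation only, the lag 1 value is large and the later lags are close to zero, which is the pattern used to choose the order of the autocorrelation correction.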

Residual-residual plots
When meteorological terms or pollution variables are included in the model, the dose-response curve sometimes requires transformation of the independent variables or a higher order polynomial of them. Smoothed residual-residual plots - that is, using the health outcome variable corrected for all influences except the new one and the new independent variable corrected (pre-filtered) for the same influences - are helpful for determining the shape. Smoothing can be done in a number of ways. Here we chose to plot the means of 20 consecutive days ordered by the pre-filtered new variable. If a pattern does not allow a final decision, but at least restricts the choice to a few alternatives, the decision may then be based on model goodness of fit.
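The smoothing used for the residual-residual plots, means of 20 consecutive days after ordering by the pre-filtered new variable, can be sketched as follows. The function name is hypothetical, and the handling of a final group with fewer than 20 days (it is averaged as it stands) is an assumption.

```python
def binned_means(x_resid, y_resid, bin_size=20):
    """Sort days by the pre-filtered exposure residual and average in groups.

    Returns one (mean_x, mean_y) point per group of `bin_size` consecutive days.
    """
    pairs = sorted(zip(x_resid, y_resid))
    points = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        points.append((mean_x, mean_y))
    return points
```

Plotting the returned points shows whether the relationship looks linear, U shaped, or otherwise curved, which guides the choice of transformation.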

Cross correlations
The delay of the short term effect of meteorology or air pollution can be determined by goodness of fit or with the help of a cross correlation plot. The pre-filtered residuals of the health outcome series and the (perhaps transformed) influence variable residuals are correlated at a number of delays (for restrictions see below) and the largest correlation is chosen. In series of this type, only one direction of influence is interpretable: if there is any influence, it may only be of the environmental factor on health, not vice versa.
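Choosing the delay from cross correlations can be sketched as below. Only non-negative lags are examined, since only an influence of the environmental factor on health is interpretable. Taking the correlation that is largest in absolute value is an assumption made here so that protective (negative) associations are also picked up; the function names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def best_lag(exposure_resid, outcome_resid, max_lag=3):
    """Return (lag, correlation) with exposure on day t-lag and outcome on day t."""
    best = None
    for lag in range(max_lag + 1):
        x = exposure_resid[:len(exposure_resid) - lag]
        y = outcome_resid[lag:]
        r = pearson(x, y)
        if best is None or abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```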

In time series, where the unit of data aggregation is one day, confounding factors are factors likely to vary on a chronological basis over the study period. Such factors are long term trends, cyclical patterns, day of the week and holidays, unusual events (influenza epidemics, strikes), and meteorological variables. In the APHEA data analysis, decisions were taken on the procedures used for the control of all potential confounders.


CONTROL OF CONFOUNDING

Seasonality and other cyclical patterns
Seasonality is an important (probably the most important) confounder when short term effects of air pollution on health are considered. Many health outcomes have pronounced seasonal patterns - for example, mortality counts peak in the winter, a phenomenon which has been observed in places with different latitudes and climates worldwide.31

At the same time, the levels of several pollutants exhibit seasonal patterns which, if not controlled for in the analysis, may result in spurious positive (if the pollutant also peaks in the winter) or negative (if the pollutant peaks in the summer) associations. The strength and shape of the seasonal pattern depend to a large extent on local conditions.

In the APHEA project it was decided to use a parametric way of controlling for seasonal and other cyclical patterns - that is, through the introduction of sinusoidal terms in the regression model. An annual harmonic (sinusoidal) wave can be described by a function:

$\alpha \sin(2\pi t/365) + \beta \cos(2\pi t/365)$, $t = 1$ to $T$, day of study.

While the term $\omega = 2\pi t/365$ determines the frequency or period of the harmonic wave (here: 1 year), $\alpha$ and $\beta$ determine the amplitude (that is, how strongly the pattern is expressed) and the relationship between $\alpha$ and $\beta$ determines the phase (that is, where in the course of 1 year the peak and dip of the wave are placed). $\alpha$ and $\beta$ are determined as regression coefficients. As this pattern is, by definition, symmetric, which may not be a realistic description of the data, further terms with $\omega = 2\pi t k/365.25$, $k = 2, 3$ etc are included. This can now describe other annual shapes - for example, steep increases and flat declines, or long summers and short winters, or additional peaks and dips beyond the simple summer-winter pattern. For this protocol it was decided to include, at most, terms of the 6th order ($k \le 6$), as this would pick up two month periods or events of one month's length, which seemed reasonable given that the study aimed at short term effects. If there is reason to believe that the seasonal pattern is different between study years or other sections of the data (with respect to amplitude and/or phase), interaction terms between cyclical terms and indicators for years or sections can be included. Overlaying a two year cycle ($k = 0.5$) can sometimes help in explaining differences in seasonality between years too.

Two year cycles or varying patterns between years can be caused by epidemics, which are better corrected for directly either as (not necessarily rectangular) indicator variables or, where available, as data on case counts. Details of the reasoning behind the inclusion of these terms in the model at all, the advantages and disadvantages of using this method, as well as other methods available to control for seasonal patterns are discussed elsewhere.10 The maximum order of the seasonal model for an individual city is based on goodness of fit tests as well as on inspection of time series plots and residual periodograms.

The sine and cosine terms of a certain period must always be included or excluded simultaneously.
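The sinusoidal terms described above can be generated as regression columns with a few lines of code. The sketch assumes a period of 365.25 days for every order and a hypothetical function name; the sine and cosine of each order are always produced as a pair, in line with the rule that they enter or leave the model together.

```python
import math

def harmonic_terms(t, max_order=6, period=365.25):
    """Sine and cosine regressors for day of study t, orders k = 1..max_order.

    Order k has a period of period / k days, so k = 6 corresponds to a cycle
    of roughly two months.
    """
    row = []
    for k in range(1, max_order + 1):
        omega = 2 * math.pi * k * t / period
        row.extend([math.sin(omega), math.cos(omega)])
    return row

# One design matrix row per day of study, t = 1 to T (T = 5 here for brevity).
design = [harmonic_terms(t) for t in range(1, 6)]
```

The fitted coefficients of each pair play the role of $\alpha$ and $\beta$ in the text: together they set the amplitude and the phase of that harmonic.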

Short term effects of meteorological factors
After the deseasonalisation of the temperature and humidity series, the residuals from the models of temperature and humidity described in the previous section were plotted separately against the deseasonalised mortality residuals. To smooth the curve and reveal any patterns in the association, the days were sorted according to the temperature (or humidity) and the mean of each 20 consecutive days was taken (see above "Residual-residual plots"). The decisions on which temperature and humidity terms to include were based on the shape of this plot and, where it was ambiguous, on model fit differences. Examples of temperature terms which have been used in different locations are as follows:
* Linear term only.14
* Second order polynomial of temperature or humidity (linear and quadratic term, parabolic dose response curve).29 32
* Double quadratic curve: the inclusion of two variables, one for "cold", taking the value (A - temperature) when temperature was < A (A in °C is the turning point of the double quadratic curve) and 0 otherwise; and one for "hot", taking the value (temperature - A) when temperature was > A and 0 otherwise. The level of the turning point in the double quadratic curve (a U shaped curve) depends on the local conditions.30
* Third order polynomial and other variations.17
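The "cold" and "hot" variables of the double quadratic curve can be constructed as in the sketch below. The function name is hypothetical, and the turning point of 20 °C in the example is purely illustrative, since the protocol leaves it to local conditions.

```python
def double_quadratic_terms(temperature, turning_point):
    """Return the (cold, hot) variables for one day.

    cold = A - temperature when temperature < A, else 0
    hot  = temperature - A when temperature > A, else 0
    where A is the turning point in degrees Celsius.
    """
    cold = max(turning_point - temperature, 0.0)
    hot = max(temperature - turning_point, 0.0)
    return cold, hot

# Example with an illustrative turning point of 20 degrees Celsius.
cold, hot = double_quadratic_terms(15.0, 20.0)
```

At most one of the two variables is non-zero on any day, so each arm of the U shaped curve gets its own coefficient.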

In most instances one of the above choices was tested statistically by alternative inclusion in the model. When transformations of temperature and humidity were used, measurements of the same day (lag 0), the previous day (lag 1), and two days before (lag 2) were tested. The same lag was used for temperature and humidity in any single data set. Sometimes inclusion of the temperature-humidity interaction improves the fit.14 29 Temperature and humidity variables were kept in the core model irrespective of their statistical significance.
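Two of the steps in this section lend themselves to a short sketch: the smoothing of the residual-residual plot by averaging each 20 consecutive sorted days, and the "cold" and "hot" variables of the double quadratic curve. The function names are hypothetical. Whether the cold and hot variables entered the model squared, linearly, or both is not spelled out here, so the sketch stops at constructing the two variables.

```python
def binned_means(x, y, bin_size=20):
    """Sort days by x and average x and y over consecutive groups of days.

    The final group may hold fewer than bin_size days.
    """
    pairs = sorted(zip(x, y))
    points = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        points.append((mean_x, mean_y))
    return points

def cold_hot_terms(temperature, turning_point):
    """Distances below ('cold') and above ('hot') the turning point A.

    Each value is 0 on the side of A where it does not apply.
    """
    cold = max(turning_point - temperature, 0.0)
    hot = max(temperature - turning_point, 0.0)
    return cold, hot
```

With an assumed turning point of 18 °C, for example, a day at 10 °C yields `cold_hot_terms(10.0, 18.0) == (8.0, 0.0)`.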

Long term trends

If a systematic increase or decrease in the health outcome was observed over the study period, a variable taking the values 1 to T, where T is the number of days in the study period, was included in the model. The square of this variable was also introduced if it improved the model fit. Where there were year to year fluctuations (non-monotonic), dummy variables for the years were included in the model (the number of dummy variables was N-1, where N is the number of years studied).
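A minimal sketch of these trend variables follows. The function names are hypothetical, and taking the earliest year as the reference category for the N-1 dummies is an assumption made for illustration.

```python
def trend_terms(n_days, include_square=False):
    """Linear trend 1..T, optionally with its square, one row per day."""
    rows = []
    for t in range(1, n_days + 1):
        rows.append([t, t * t] if include_square else [t])
    return rows

def year_dummies(years):
    """N-1 indicator columns for N distinct years.

    The earliest year is the reference category (an assumption) and
    therefore has no column of its own.
    """
    levels = sorted(set(years))[1:]
    return [[1 if y == level else 0 for level in levels] for y in years]
```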

Calendar effects

Six dummy variables were included in the model to account for systematic differences in the health outcome in relation to the day of the week. They were all included in the model or all excluded, based on model fit. Sometimes the day of week pattern changes in relation to year or season and this was accounted for in the model. Holidays were included as one or several dummy variables, depending on the length of the series and the diversity of their effect.
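The calendar variables can be sketched as follows. The function names are hypothetical, Monday is chosen as the reference day by assumption, and a single holiday indicator is shown although several could be used.

```python
from datetime import date

def day_of_week_dummies(day):
    """Six indicators for Tuesday to Sunday; Monday is the reference day."""
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    return [1 if weekday == k else 0 for k in range(1, 7)]

def holiday_dummy(day, holidays):
    """1 if the day belongs to the supplied set of holiday dates, else 0."""
    return 1 if day in holidays else 0

# Illustrative use: 2 January 1996 was a Tuesday.
example = day_of_week_dummies(date(1996, 1, 2))
```

Because the six columns describe one categorical variable, they enter or leave the model as a block, as the text requires.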

Influenza epidemics and other events

Appropriate variables were introduced if locally necessary when influenza epidemic data were available or any other unusual event (strike, heat wave, etc)14 had taken place. For influenza epidemics delayed effects of up to 15 days were tested.

When all core model variables were decided, the time plots of residuals were inspected for any remaining patterns.

Air pollution variables for single pollutant models

Residual-residual plots were examined to decide on the shape of the dose-response curve. For this purpose, the core model was applied with each air pollutant as the dependent variable. Then the residuals of mortality and each pollutant were plotted in order to decide which pollutant transformation to use. Where no decision could be reached on this basis, decisions were made on the basis of model fit. At the next stage, the pollutant variables were included in the model as independent variables. Lags 0 to 3 (0 to 5 for O3) were checked for one day measurements of each pollutant, and cumulative effects over two (lags 0 and 1), three (lags 0, 1, and 2), and four days (lags 0 to 3) (for O3, also five and six days) were checked alternatively. The observed delay of an effect can be influenced, in addition to biological mechanisms, by a number of local factors such as wind direction, placement of the measuring station, size of the city, "day" definition of pollution measurements, air chemistry, etc. One single day variable and one variable indicating cumulative exposure were chosen for each city.

In cleaner settings linear terms for the pollutants generally fitted the data better. When log transformations were the best, these were used to assess the pollutant effects in a city, but in the interest of meta-analysis additional models were fitted with linear pollutant terms, retaining in the analysis only those days when the corresponding pollutant did not exceed 200 µg/m³.33 A quadratic transformation was also used, but specifically for O3.
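The single day and cumulative exposure variables described in this section can be sketched as below. The function names are hypothetical, and representing a cumulative effect by the mean of the same day and the preceding days is an assumption, since the exact construction is not given here.

```python
def lagged(series, lag):
    """Value of the series `lag` days earlier; None where no earlier day exists."""
    return [series[i - lag] if i >= lag else None for i in range(len(series))]

def cumulative_mean(series, n_days):
    """Mean of the same day and the preceding n_days - 1 days.

    n_days = 2 covers lags 0 and 1, n_days = 4 covers lags 0 to 3.
    """
    out = []
    for i in range(len(series)):
        if i < n_days - 1:
            out.append(None)  # not enough history at the start of the series
        else:
            window = series[i - n_days + 1:i + 1]
            out.append(sum(window) / n_days)
    return out
```

For a toy series `[10, 20, 30, 40]`, `lagged(series, 1)` gives `[None, 10, 20, 30]` and `cumulative_mean(series, 2)` gives `[None, 15.0, 25.0, 35.0]`.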

Autocorrelation

Remaining autocorrelation was controlled for by using autoregressive error models. Autoregressive terms up to order 4 were tried. Only those that were at least moderately significant (that is, p<0.10) were retained.
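As a diagnostic companion to this step, the sample autocorrelation of the model residuals at lags 1 to 4 can be inspected. The sketch below only computes those autocorrelations; it does not fit the autoregressive error model or apply the p<0.10 retention rule, and the function names are hypothetical.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    denominator = sum(c * c for c in centred)
    numerator = sum(centred[t] * centred[t - lag] for t in range(lag, n))
    return numerator / denominator

def residual_acf(residuals, max_order=4):
    """Autocorrelations at lags 1..max_order, keyed by lag."""
    return {k: autocorrelation(residuals, k) for k in range(1, max_order + 1)}
```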

Final models

The final models on which the results were based were autoregressive Poisson models that allowed for overdispersion. These models provided relative risk estimates. The SAS software package was used for the above analyses.
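The step from a fitted coefficient to a relative risk can be sketched as follows, assuming a log-linear Poisson model with a linear pollutant term. The overdispersion adjustment shown (Pearson chi-square divided by the residual degrees of freedom, with standard errors inflated by its square root) is one common quasi-likelihood device and may differ from what the SAS procedures did. All names and the numerical values are illustrative.

```python
import math

def relative_risk(beta, se, increase, z=1.96):
    """Relative risk and approximate 95% interval for a pollutant increase."""
    rr = math.exp(beta * increase)
    lower = math.exp((beta - z * se) * increase)
    upper = math.exp((beta + z * se) * increase)
    return rr, lower, upper

def dispersion(observed, fitted, n_parameters):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_parameters)

def scaled_se(se, phi):
    """Standard error inflated by the square root of the dispersion estimate."""
    return se * math.sqrt(phi)

# Illustrative values only: a coefficient per unit of pollutant
# and a contrast of 50 units.
rr, lower, upper = relative_risk(beta=0.0005, se=0.0002, increase=50)
```

A dispersion estimate above 1 widens the interval around the relative risk without changing the point estimate.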

Effect modification by season

The possible differential effects of each air pollutant on every health outcome in summer and winter were investigated by introducing an additional dummy variable for season in the model, plus an interaction term of the air pollutant and the season variable. Summer and winter could be differently defined at each centre, depending on the local climatic conditions.
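How the season dummy and its interaction term translate into season-specific effects can be shown in a short sketch. The definition of summer as April to September is an assumption for illustration only, since each centre could define the seasons differently, and the function names are hypothetical.

```python
def season_interaction_row(pollutant, month, summer_months=(4, 5, 6, 7, 8, 9)):
    """Return [pollutant, summer dummy, pollutant x summer] for one day."""
    summer = 1 if month in summer_months else 0
    return [pollutant, summer, pollutant * summer]

def season_specific_slopes(beta_pollutant, beta_interaction):
    """Pollutant slope in winter (dummy = 0) and in summer (dummy = 1)."""
    return {
        "winter": beta_pollutant,
        "summer": beta_pollutant + beta_interaction,
    }
```

With the dummy coded 0 for winter, the pollutant coefficient alone is the winter effect, and the interaction coefficient is the difference between the summer and winter effects.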

Interaction between pollutants

The same principle used for seasonal interaction was used to investigate possible interactions between pollutants. It was decided to analyse SO2 effects at different levels of particles (the cut off point for "low" and "high" particle levels was 100 µg/m³ for BS and TSP and 60 µg/m³ for PM10), particle effects at different levels of SO2 (cut off point for "low" v "high" SO2: 100 µg/m³), and particle effects at different levels of NO2 (the cut off points for "low" and "high" NO2 levels were 80 µg/m³ for 24 hour values and 120 µg/m³ for one hour levels). Thus, in the SO2 model, for example, a dummy variable was added indicating "low" and "high" particle pollution for each day (usually taking the value 0 for "low" particle days and 1 for "high") plus an interaction term of this dummy variable and the SO2 variable. In this way, collinearity, which presents problems in multi-pollutant models, was avoided.10
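The SO2 example can be sketched using the cut off points quoted above. The dictionary keys and the function name are hypothetical, and treating a day as "high" only when the level strictly exceeds the cut off is an assumption, since the handling of days exactly at the cut off is not stated.

```python
# Cut off points (micrograms per cubic metre) quoted in the text
# for "low" versus "high" days.
CUT_OFFS = {
    "BS": 100.0,
    "TSP": 100.0,
    "PM10": 60.0,
    "SO2": 100.0,
    "NO2_24h": 80.0,
    "NO2_1h": 120.0,
}

def so2_model_row(so2, particles, particle_measure="BS"):
    """Return [SO2, high-particle dummy, SO2 x dummy] for one day.

    The dummy is 0 on "low" particle days and 1 on "high" particle days.
    """
    high = 1 if particles > CUT_OFFS[particle_measure] else 0
    return [so2, high, so2 * high]
```

The model then carries one continuous pollutant and one binary indicator of the other, rather than two correlated continuous pollutant series.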

Conclusions

The above procedure was followed by all participants to ensure maximum comparability of the resulting effect estimates. An important advantage of this standardisation is that it enabled us to consider results from many different European settings collectively, without the problem of different methodologies that has beset older studies.34 35 Furthermore, the procedure described above allows the flexibility to account for local climate, pollution levels and mixtures, social, and other conditions. It represents the best available compromise between feasibility, comparability, and local adaptability when using aggregated time series data not originally collected for the purpose of epidemiological studies.

The APHEA collaborative group consists of: K Katsouyanni, G Touloumi (Greece; Coordinating centre); D Zmirou, P Ritter, T Barumandzadeh, F Balducci, G Laham (Lyon, France); H E Wichmann, C Spix (Germany); J Sunyer, J Castellsague, M Saez, A Tobias (Spain); J P Schouten, J M Vonk, A C M de Graaf (Netherlands); A Ponka (Finland); H R Anderson, A Ponce de Leon, J Bower, D Strachan, M Bland (UK); W Dab, P Quenel, S Medina, A Le Tertre, B Thelot, B Festy, Y Le Moullec, C Monteil (Paris, France); B Wojtyniak, T Piekarski (Poland); M A Vigotti, G Rossi, L Bisanti, F Repetto, A Zanobetti (Italy); L Bacharova, K Fandakova (Slovakia).


1 Air quality guidelines for Europe. World Health Organization, Regional Office for Europe. WHO Regional Publications, European series No 23. Copenhagen: WHO, 1987.

2 Kinney PL, Ozkaynak H. Associations of daily mortality and air pollution in Los Angeles county. Environ Res 1991;54:99-120.

3 Sunyer J, Anto J, Murillo C, Saez M. Effects of urban air pollution on emergency room admissions for chronic obstructive pulmonary disease. Am J Epidemiol 1991;134:277-86.

4 Pope CA, Schwartz J, Ransom MR. Daily mortality and PM10 pollution in Utah Valley. Arch Env Health 1992;47:211-17.

5 Schwartz J, Dockery DW. Particulate air pollution and daily mortality in Steubenville, Ohio. Am J Epidemiol 1992;135:12-19.

6 Schwartz J. Air pollution and daily mortality in Birmingham, Alabama. Am J Epidemiol 1993;137:1136-47.

7 Schwartz J. Air pollution and daily mortality: a review and meta-analysis. Environ Res 1994;64:36-52.

8 Katsouyanni K, Zmirou D, Spix C, et al. Short-term effects of air pollution on health: A European approach using epidemiologic time series data. Eur Respir J 1995;8:1030-38.

9 Katsouyanni K, ed. Study designs. Commission of the European Communities, Air Pollution Epidemiology Report Series, no 4. Luxembourg: Office for Official Publications of the European Communities, 1993.

10 Schwartz J, Spix C, Touloumi G, et al. Methodological issues in studies of air pollution and daily counts of deaths and hospital admissions. J Epidemiol Community Health 1996;50(suppl 1):S3-S11.

11 Camilli AE, Robbins DR, Lebowitz MD. Death certificate reporting of confirmed airways obstructive disease. Am J Epidemiol 1991;133:795-800.

12 Nielsen GP, Bjornson J, Jonasson JG. The accuracy of death certificates. Virchows Arch 1991;419:143-6.

13 Ashley J, Devis T. Death certification from the point of view of the epidemiologist. Population Trends 1992;67:22-28.

14 Dab W, Medina S, Quenel P, et al. Short term respiratory health effects of ambient air pollution: results of the APHEA project in Paris. J Epidemiol Community Health 1996;50(suppl 1):S42-S46.

15 Ponce de Leon A, Anderson HR, Bland JM, Strachan DP. The effects of air pollution on daily hospital admissions for respiratory disease in London between 1987-88 and 1991-92. J Epidemiol Community Health 1996;50(suppl 1):S63-S70.

16 Schouten JP, Vonk JM, de Graaf A. Short term effects of air pollution on emergency hospital admissions for respiratory disease: results of the APHEA project in two major cities in the Netherlands during 1977-1989. J Epidemiol Community Health 1996;50(suppl 1):S22-S29.

17 Pönkä A, Virtanen M. Asthma and ambient air pollution in Helsinki. J Epidemiol Community Health 1996;50(suppl 1):S59-S62.

18 Flegal KM, Brownie C, Haas JD. The effects of exposure misclassification on estimates of relative risk. Am J Epidemiol 1986;123:736-51.

19 Department of the Environment. Urban air quality in the United Kingdom. Bradford: Quality of Urban Air Review Group, 1993.

20 Conseil des Communautés Européennes. Directive du 15/7/1980 concernant des valeurs limitées et des valeurs guides de qualité atmosphérique pour l'anhydride sulfureux et les particules en suspension (80/779/CEE). Journal Officiel des Communautés Européennes 1980;L229:30-48.

21 Conseil des Communautés Européennes. Directive du 7/3/1985 concernant les normes de qualité de l'air pour le dioxyde d'azote (85/203/CEE). Journal Officiel des Communautés Européennes 1985;L87:1-7.

22 Conseil des Communautés Européennes. Directive 21/6/1989 modifiant la directive 80/779/CEE concernant des valeurs limitées et des valeurs guides de qualité atmosphérique pour l'anhydride sulfureux et des particules en suspension (89/427/CEE). Journal Officiel des Communautés Européennes 1989;L201:53-55.

23 Conseil des Communautés Européennes. Directive 92/72/CEE du conseil du 21/9/92 concernant la pollution de l'air par ozone. Journal Officiel des Communautés Européennes 1993;L297:1-7.

24 Mucke HG, Manns H, Turowski E, Nitz G. European intercomparison workshop on air quality monitoring - measuring of SO2, NO and NO2. Vol 1. Berlin: WHO Collaborating Centre for Air Quality Management and Air Pollution Control at the Institute for Water, Soil and Air Hygiene, Federal Environmental Agency. Report 7, 1995.

25 Environmental Protection Agency. The total exposure assessment methodology (TEAM) study. Summary and analysis. Vol 1. Washington DC: Office of Research and Development, EPA, 1987.

26 Spengler JD, Treitman RD, Tosteson TD, Mage DT, Soczek ML. Personal exposures to respirable particulates and implications for air pollution epidemiology. Environmental Science and Technology 1985;19(8):700-7.

27 Wallace LA, Pellizzari ED, Hartwell TD, Sparacino CM, Sheldon LS, Zelon H. Personal exposures, indoor-outdoor relationships and breath levels of toxic air pollutants measured for 355 persons in New Jersey. Atmospheric Environment 1985;19(10):1651-61.

28 Moschandreas DJ. Exposure to pollutants and daily time budgets of people. Bull NY Acad Med 1981;57(10):845-59.

29 Spix C, Wichmann HE. Daily mortality and air pollutants: findings from Köln, Germany. J Epidemiol Community Health 1996;50(suppl 1):S52-S58.

30 Touloumi G, Samoli E, Katsouyanni K. Daily mortality and air pollution from particulate matter, sulphur dioxide, and carbon monoxide in Athens, Greece, 1987-91. A time-series analysis within the APHEA project. J Epidemiol Community Health 1996;50(suppl 1):S47-S51.

31 Lipfert FW. Air pollution and community health. A critical review and data source book. New York: Van Nostrand Reinhold, 1994.

32 Sunyer J, Castellsague J, Saez M, Tobias A, Anto JM. Air pollution and mortality in Barcelona. J Epidemiol Community Health 1996;50(suppl 1):00-00.

33 Wojtyniak B, Piekarski T. Short term effects of air pollution on mortality in Polish urban populations - what is different? J Epidemiol Community Health 1996;50(suppl 1):S36-S41.

34 Hatzakis A, Katsouyanni K, Kalandidi A, Day N, Trichopoulos D. Short-term effects of air pollution on mortality in Athens. Int J Epidemiol 1986;15:73-81.

35 Derrienic F, Richardson S, Mollie A, Lellouch J. Short-term effects of sulphur dioxide pollution on mortality in two French cities. Int J Epidemiol 1989;18:186-97.
