Searching for needles in haystacks: A Bayesian approach to chronic disease surveillance


Searching for needles in haystacks: A Bayesian approach to chronic disease surveillance. Nicky Best, Department of Epidemiology and Biostatistics, Imperial College, London. Joint work with: Guangquan (Philip) Li, Lea Fortunato, Sylvia Richardson.


Searching for needles in haystacks: A Bayesian approach to chronic disease surveillance

Nicky Best
Department of Epidemiology and Biostatistics
Imperial College, London

Joint work with: Guangquan (Philip) Li, Lea Fortunato, Sylvia Richardson, Anna Hansell, Mireille Toledano

Frontiers in Spatial Epidemiology Symposium

Outline
- Introduction
- Example 1: Detecting unusual trends in COPD mortality
- BaySTDetect model
- Simulation study to evaluate model performance
- Example 2: Data mining of cancer registries
- Conclusions and further developments

Introduction
- Growing interest in space-time modelling of small-area health data
- Many different inferential goals:
  - description
  - prediction/forecasting
  - estimation of change / policy impact
  - ...
  - surveillance
- Key feature is that small-area data are typically sparse
- Bayesian hierarchical models
  - allow smoothing over space and time
  - help separate signal from noise
  - improved estimation & inference

Surveillance of small area health data
- For most chronic diseases, smooth changes in rates over time are expected in most areas
- However, policy makers, health service providers and researchers are often interested in identifying areas that depart from the national trend and exhibit unusual temporal patterns
- These unusual changes may be due to:
  - emergence of localised risk factors
  - impact of a new policy, intervention or screening programme
  - local health services provision
  - data quality issues
- Detection of areas with unusual temporal patterns is therefore important as a screening tool for further investigations

Retrospective and Prospective Surveillance
- WHO defines surveillance as the systematic collection, analysis and interpretation of health data and the timely dissemination of this data to policymakers and others
- Retrospective surveillance
  - data analyzed once at end of study period
  - determine if a space-time cluster occurred at some point in the past
- Prospective surveillance
  - data analyzed periodically over time as new observations are obtained
  - identify if a space-time cluster is currently forming
- Our focus is on retrospective surveillance
  - discuss extensions to prospective surveillance at end

Example 1: COPD mortality
- Chronic Obstructive Pulmonary Disease (COPD) is responsible for ~5% of deaths in the UK
- Time trends may reflect variation in risk factors (e.g. smoking, air pollution) and also variation in diagnostic practice/definitions
- Objective 1: Retrospective surveillance
  - to highlight areas with a potential need for further investigation and/or intervention (e.g. additional resource allocation)
- Objective 2: Informal policy assessment
  - Industrial Injuries Disablement Benefit was made available for coal miners developing COPD from 1992 onwards in the UK
  - There was debate on whether this policy may have differentially increased the likelihood of a COPD diagnosis in mining areas, as miners with other respiratory problems with similar symptoms (e.g. asthma) could potentially have benefited from this scheme

Data
- Observed and age-standardized expected annual counts of COPD deaths in males aged 45+ years
- 374 local authority districts in England & Wales
- 8 years (1990-1997)
- Median expected count per area per year = 42 (range 9-331)
- Difficult to assess departures of the local temporal patterns by eye
- Need methods to
  - quantify the difference between the common trend pattern and the local trend patterns
  - express uncertainty about the detection outcomes
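To make the sparsity point concrete, here is a rough sketch (not from the talk) of how noisy an observed/expected ratio is at the expected counts quoted above. It assumes the observed count is Poisson with mean equal to the expected count E, so the ratio has relative standard deviation 1/sqrt(E). The function name is hypothetical.

```python
import math


def poisson_relative_sd(expected):
    """Relative SD of an observed/expected ratio when the observed count
    is Poisson with mean `expected`: sqrt(E) / E = 1 / sqrt(E)."""
    return 1.0 / math.sqrt(expected)


# Minimum, median and maximum expected counts quoted on the Data slide.
for e in (9, 42, 331):
    print(f"E = {e:>3}: relative SD ~ {poisson_relative_sd(e):.2f}")
```

Under this assumption, an area near the lower end of the range shows year-to-year fluctuations of roughly a third of its rate from chance alone, which is why local trends are hard to judge by eye.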

Bayesian Space-Time Detection: BaySTDetect
- BaySTDetect (Li et al 2012): detection method for short time series of small area data using Bayesian model choice between 2 space-time models

BaySTDetect: full model specification
- Model 1 (common trend): the temporal trend pattern is the same for all areas
- Model 2 (local trend): temporal trends are independently estimated for each area
- Model selection
  - Prior on model indicator: z_i ~ Bernoulli(p)
  - expect only a small number of unusual areas a priori, e.g. p = 0.95
  - ensures common trend can be meaningfully defined and estimated

Implementation in WinBUGS
[Figure: graphical model]
- Model 1 (common trend): nodes y_it, μ_it[C], η_i, γ_t, E_it
- Model 2 (local trend): nodes y_it, μ_it[L], u_i, f_it, E_it
- Selection model: nodes y_it, μ_it, E_it, z_i
- A cut link is used to prevent double counting of y_it
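A minimal sketch of how the model indicator switches an area between the two trend models. It assumes a Poisson likelihood, y_it ~ Poisson(μ_it × E_it), which the transcript does not spell out. The function names and all numbers are illustrative only, and the WinBUGS cut link is not represented.

```python
import math


def poisson_loglik(y, mean):
    """Log-probability of count y under a Poisson distribution with the given mean."""
    return y * math.log(mean) - mean - math.lgamma(y + 1)


def area_loglik(y, E, mu_common, mu_local, z):
    """Log-likelihood of one area's time series (assumed Poisson).

    Uses the common-trend relative risks when z == 1 and the
    local-trend relative risks when z == 0.
    """
    mu = mu_common if z == 1 else mu_local
    return sum(poisson_loglik(y_t, mu_t * E_t) for y_t, mu_t, E_t in zip(y, mu, E))


# Made-up area whose counts rise while the common trend stays flat.
y = [40, 45, 60, 70]
E = [42.0] * 4
mu_common = [1.0] * 4
mu_local = [0.95, 1.07, 1.43, 1.67]

print("z = 1 (common):", round(area_loglik(y, E, mu_common, mu_local, 1), 2))
print("z = 0 (local): ", round(area_loglik(y, E, mu_common, mu_local, 0), 2))
```

In the full model, the posterior for z_i weighs this kind of likelihood comparison against the Bernoulli(p) prior, which favours the common trend.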

Classifying areas as unusual
- Areas are classified as unusual if they have a low posterior probability of belonging to the common trend model (model 1): p_i = Pr(z_i = 1 | data)
- Need to set a suitable cut-off value C, such that areas with p_i < C are declared to be unusual
- Put another way, if we declare area i to be unusual, then p_i can be thought of as the probability of false detection for that area
- We choose C so that the expected average probability of false detection (FDR) amongst areas declared as unusual is less than some pre-set level α
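The cut-off rule can be sketched as follows. This is one straightforward way to implement it, not necessarily the authors' exact algorithm: sort areas by p_i and keep adding the next smallest while the running average of the declared p_i stays below α. The function name and the example probabilities are hypothetical.

```python
def declare_unusual(post_prob_common, alpha=0.05):
    """Return indices of areas declared unusual.

    post_prob_common[i] is p_i = Pr(z_i = 1 | data). Areas are added in
    increasing order of p_i for as long as the average p_i among the
    declared areas (the estimated FDR) stays below alpha.
    """
    order = sorted(range(len(post_prob_common)), key=lambda i: post_prob_common[i])
    declared = []
    running_sum = 0.0
    for k, i in enumerate(order, start=1):
        running_sum += post_prob_common[i]
        # The running mean of sorted values never decreases, so stop at
        # the first area that would push the estimated FDR to alpha or above.
        if running_sum / k >= alpha:
            break
        declared.append(i)
    return declared


p = [0.01, 0.60, 0.02, 0.90, 0.08, 0.30]
unusual = declare_unusual(p, alpha=0.05)
print(unusual)  # [0, 2, 4]
print(sum(p[i] for i in unusual) / len(unusual))  # estimated FDR among declared areas
```

The implied cut-off C lies between the largest declared p_i and the smallest undeclared one. Note that area 4 is declared even though its own p_i (0.08) exceeds α, because the rule controls the average over declared areas.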

Simulation study to evaluate operating characteristics of BaySTDetect
- 50 replicate data sets were simulated based on the observed COPD mortality data
- 3 patterns, with small, medium and large departures from the common trend
- Either the original set of expected counts (median E = 42), a reduced set (E × 0.2; median E = 8) or an inflated set (E × 2.5; median E = 105) was used
- 15 areas (4%) were chosen to have the unusual trend patterns
- Results were compared to those from the popular SaTScan space-time scan statistic
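A rough sketch of one replicate of such a simulation, under assumptions the transcript does not confirm: a flat common trend (relative risk 1), a single departure pattern in which the risk of unusual areas is multiplied by a constant factor in the second half of the period, and Poisson counts. All names are hypothetical, and the sampler is a simple standard-library stand-in.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson count by Knuth's multiplication method, in chunks of
    at most 500 so that exp(-chunk) does not underflow for large means."""
    total = 0
    remaining = mean
    while remaining > 0:
        chunk = min(remaining, 500.0)
        limit = math.exp(-chunk)
        prod = rng.random()
        while prod > limit:
            total += 1
            prod *= rng.random()
        remaining -= chunk
    return total


def simulate_dataset(expected, scale=1.0, n_unusual=15, departure=2.0, seed=0):
    """Simulate counts for every area and year; return (counts, unusual_areas)."""
    rng = random.Random(seed)
    n_areas, n_years = len(expected), len(expected[0])
    unusual = set(rng.sample(range(n_areas), n_unusual))
    counts = []
    for i, row in enumerate(expected):
        series = []
        for t, e in enumerate(row):
            # Assumed departure pattern: risk multiplied by `departure` in
            # the second half of the period, for unusual areas only.
            risk = departure if (i in unusual and t >= n_years // 2) else 1.0
            series.append(sample_poisson(risk * scale * e, rng))
        counts.append(series)
    return counts, unusual


# 374 areas x 8 years at the median expected count, reduced set (E x 0.2).
expected = [[42.0] * 8 for _ in range(374)]
counts, unusual = simulate_dataset(expected, scale=0.2, departure=1.5)
print(len(counts), len(counts[0]), len(unusual))
```

Repeating this with different seeds would give replicate data sets; the study itself used the observed expected counts rather than a constant value.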

Sensitivity of detecting the 15 truly unusual areas
[Figure: sensitivity for low (×1.2), moderate (×1.5) and high (×2) departures; panels: Low E, Moderate E, High E]
- FDR = 0.05; prior prob. of common trend p = 0.95
- Sensitivity increases as FDR increases and p decreases (not shown)

Sensitivity: Comparison with SaTScan
[Figure: sensitivity (0.0 to 1.0) against expected count quantiles (E = 24, 33, 42, 52, 80) at moderate E, for moderate (×1.5) and high (×2) departures; BaySTDetect vs SaTScan (p = 0.05)]

Simulation Study: FDR control
[Figure: empirical FDR vs corresponding pre-defined level; panels: Low E (4-16), high departures (×2); Moderate E (20-80), high departures (×2); High E (60-200), moderate departures (×1.5)]

FDR control: Comparison with SaTScan

[Figure: empirical FDR for BaySTDetect and SaTScan (p = 0.05); panels: Low E (4-16), high departures (×2); Moderate E (20-80), high departures (×2); High E (60-200), moderate departures (×1.5)]

Simulation Study: Summary
- Sensitivity to detect unusual trends
  - High sensitivity to detect moderate departure patterns with E > 80
  - High sensitivity to detect large departure patterns with E > 20
  - Difficult to detect realistic departure patterns for E [...] 0.4)
  - Sensitivity of BaySTDetect superior to SaTScan
- Control of false discovery rate
  - Pre-defined FDR corresponds reasonably well with empirical rate of false discoveries
  - But empirical FDR increases as prior probability of declaring area to be unusual increases (p decreases)
  - BaySTDetect has lower empirical FDR than SaTScan when controlled at 5% level
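For reference, the two operating characteristics summarised here can be computed for a single simulated data set as below. This is a generic sketch using the standard definitions, with a hypothetical function name and made-up area indices.

```python
def sensitivity_and_fdr(declared, truly_unusual):
    """Empirical sensitivity and false discovery rate for one data set.

    sensitivity = truly unusual areas that were declared / all truly unusual areas
    FDR         = declared areas that are not truly unusual / all declared areas
    """
    declared, truly_unusual = set(declared), set(truly_unusual)
    true_pos = len(declared & truly_unusual)
    sensitivity = true_pos / len(truly_unusual) if truly_unusual else float("nan")
    fdr = (len(declared) - true_pos) / len(declared) if declared else 0.0
    return sensitivity, fdr


# 15 truly unusual areas; 12 of them are detected, plus 2 false detections.
truth = range(15)
declared = list(range(12)) + [100, 200]
sens, fdr = sensitivity_and_fdr(declared, truth)
print(f"sensitivity = {sens:.2f}, empirical FDR = {fdr:.2f}")
```

Averaging these two quantities over the replicate data sets gives figures comparable to those plotted against the pre-defined FDR level.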

COPD application: Detected areas (FDR = 0.05; p = 0.95)

COPD application: SaTScan
- Primary cluster: North (46 districts), excess risk of 1.05 during 1990-92
- Secondary cluster: Wales (19 districts), excess risk of 1.12 during 1995-96

Example 2: Data mining of cancer registries
- The Thames Cancer Registry (TCR) collects data on newly diagnosed cases of cancer in the population of London and South East England
- We performed retrospective surveillance of time trends by local authority district (94 areas) for several cancer types using BaySTDetect for the period 1981-2008 (split into 7 x 4-year intervals)
  - aim to provide a screening tool to detect areas with unusual temporal patterns
  - automatically flag up areas warranting further investigation
  - aid local health resource allocation and commissioning

Results
- Unpublished results presented at conference, but suppressed for web publication

Summary
- We have proposed a Bayesian space-time model for retrospective surveillance of unusual time trends in small area disease rates
- Simulation study shows good performance in detecting realistic departures (1.5 to 2-fold change in risk) with relatively modest sample sizes (expected counts > 20 per area and time period)
- Improved performance and richer output than popular alternative (SaTScan)

Extensions
- Possible extensions include:
  - Spatial prior on z_i to detect clusters of areas with unusual trends
  - Time-specific model choice indicator z_it, to allow longer time series to be analysed
  - Alternative approaches to calibrating posterior model probabilities, e.g. decision theoretic approach balancing false detection and sensitivity
- Adapt method for prospective surveillance
  - Moving window to down-weight past data
  - Adapt control chart methodology (e.g. average time until correct detection)

Future Applications
- Quarterly hospital admissions for various diseases by district (cf Atlas of Variation in Healthcare)
- Monthly GP data (symptoms) by PCT or CCG
- Surveillance: the systematic collection, analysis and interpretation of health data and the timely dissemination of this data to policymakers and others
  - Need timely data collection
  - Need tools to visualize and interrogate output
  - Resource implications of conducting such surveillance and follow-up of detected areas

Thank you for your attention!

www.bias-project.org.uk
Funded by ESRC National Centre for Research Methods

References
- G. Li, N. Best, A. Hansell, I. Ahmed and S. Richardson. BaySTDetect: detecting unusual temporal patterns in small area data via Bayesian model choice. Biostatistics (2012).
- G. Li, S. Richardson, L. Fortunato, I. Ahmed, A. Hansell and N. Best. Data mining cancer registries: retrospective surveillance of small area time trends in cancer incidence using BaySTDetect. Proceedings of the International Workshop on Spatial and Spatiotemporal Data Mining, 2011.