How Many Cases Are Too Many? Detection of Disease...
![Page 1: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/1.jpg)
How Many Cases Are Too Many?
Detection of Disease Outbreaks and Clusters
Lance A. Waller, Department of Biostatistics, Rollins School of Public Health, Emory University
![Page 2: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/2.jpg)
How many are too many?
What sets off the public health “alarm”?
For anthrax and smallpox…
ONE (no statistics needed)
(rare enough and dangerous enough)
![Page 3: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/3.jpg)
What about…
…a more subtle pattern?
5 flu cases in a single day.
20 acute asthma attacks in one neighborhood.
We want to detect anomalies, patterns of cases differing from the “usual” pattern.
![Page 4: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/4.jpg)
What are we looking for?
Among “Epidemiologic clues that may signal a covert bioterrorism attack” in CDC’s The Public Health Response to Biological and Chemical Terrorism: Interim Planning Guidance for State Public Health Officials (July 2001):
“Disease with unusual geographic or seasonal distribution”
http://www.bt.cdc.gov/Documents/Planning/PlanningGuidance.PDF
![Page 5: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/5.jpg)
John Snow, M.D. 1854 map
Snow, J. (1949) Snow on Cholera. Oxford University Press: London.
![Page 6: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/6.jpg)
What we want...
Statistical assessments of the “unusualness” of observed patterns in space and time.
Suggests statistical tests of: H0: No clusters in the data.
Yes/no answer?
Easy to ask, harder to answer.
![Page 7: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/7.jpg)
Distributed “by chance”…
Need to “operationalize” H0
What sort of data arise under H0?
What counts as evidence against H0?
Simple random (uniform) pattern?
![Page 8: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/8.jpg)
Scan statistics
Count events in moving window.
In time:
Consideration: Cluster “anywhen”, or outbreak now?
[Figure: event counts in successive time windows: 4, 3, 2, 20]
Wallenstein, S. (1980) A test for detection of clustering over time. American Journal of Epidemiology 111, 367-372.
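A minimal sketch of a scan statistic in time, assuming event times are given as day numbers and the window has a fixed width. The function name `temporal_scan` and the example data are hypothetical, not taken from the slides or from Wallenstein (1980).

```python
from bisect import bisect_left

def temporal_scan(event_times, window):
    """Return (max_count, window_start) over windows [t, t + window).

    Only windows that start at an observed event need checking: sliding a
    window forward to its first event never drops an event from it.
    """
    times = sorted(event_times)
    best_count, best_start = 0, None
    for i, start in enumerate(times):
        # Index of the first event at or after the end of the window.
        end = bisect_left(times, start + window)
        count = end - i
        if count > best_count:
            best_count, best_start = count, start
    return best_count, best_start

# Hypothetical event days, with a burst around days 15 to 18.
days = [1, 3, 4, 9, 15, 16, 16, 17, 18, 25]
max_count, start = temporal_scan(days, window=4)
print(max_count, start)
```

This looks for a cluster “anywhen” in the record. For the “outbreak now” question one would instead count only the most recent window. Either way the count still has to be compared with what H0 would produce.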
![Page 9: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/9.jpg)
Scan statistic in space
[Figure: event counts in spatial windows: 2, 0, 3, 1]
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics-Theory and Methods 26, 1481-1496.
![Page 10: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/10.jpg)
Complication
Heterogeneous population density
![Page 11: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/11.jpg)
Refine the question…
“Are there clusters in the data?” to
“Are there clusters in the data after adjusting for heterogeneities in the population at risk?”
![Page 12: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/12.jpg)
Complication:
Where is “where”?
Which location for each case?
Example: Maxcy (1926) study of endemic typhus fever in Montgomery, AL, 1922-1925.
Lilienfeld, D.E. and Stolley, P.D. (1994) Foundations of Epidemiology, Third Edition. Oxford University Press: New York, pp. 136-140.
Maxcy, K.F. (1926) “An epidemiological study of endemic typhus (Brill’s disease) in the Southeastern United States with special reference to its mode of transmission.” Public Health Reports 41, 2967-2995.
![Page 13: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/13.jpg)
[Figure, two maps: Residence location; Place of employment]
![Page 14: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/14.jpg)
Refine the question…
“Are there clusters in the data after adjusting for heterogeneities in the population at risk?” to…
“Are there clusters of case residences in the data after adjusting for heterogeneities in the population at risk?”
We’re building a conceptual model…
![Page 15: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/15.jpg)
What we have…
Disease surveillance (ongoing collection, monitoring, and analysis of disease data).
Vital statistics (birth/death certificates)
Notifiable diseases (required reporting)
Registries (link multiple sources of information on each case, e.g. SEER)
Health surveys (NHANES, NHIS, BRFSS)
Teutsch, S.M. and Churchill, R.E. (1994) Principles and Practice of Public Health Surveillance. Oxford University Press: New York.
![Page 16: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/16.jpg)
Data components
Types of location (time or space) data:
Point data (case locations)
• Latitude/longitude
• Street address
• Confidentiality?
Regional count data
• Counts for enumeration districts
![Page 17: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/17.jpg)
Background data
Types of background data:
Point locations for non-cases (“controls”)
• Is the spatial distribution of cases close to that of controls?
Regional census counts
• Is the observed number of cases close to the number expected under H0?
![Page 18: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/18.jpg)
Point data
Case locations geocoded from registry or billing records.
Controls:
• All non-cases (e.g., birth records)
• Sample (perhaps matched) of non-cases.
• Different outcome (e.g., nonrespiratory ED visits, compared to respiratory ED visits)
![Page 19: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/19.jpg)
Regional Count Data
Aggregate to regional counts, often to preserve confidentiality.
[Figure: map of case counts by region]
![Page 20: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/20.jpg)
Complication:
Counts lose some resolution...
[Figure: map of case counts by region]
![Page 21: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/21.jpg)
Modifiable Areal Unit Problem
Different aggregations can lead to different results.
[Figure: case counts by region under alternative aggregations of the same area]
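A small illustration of the modifiable areal unit problem, assuming hypothetical case coordinates on a 4 x 4 square. The same seven points are counted on two grids of identical cell size that differ only in where the cell boundaries fall. The function name `aggregate` and the data are invented for this sketch.

```python
from collections import Counter

# Hypothetical case locations; four of them sit close to the point (2, 2).
cases = [(0.5, 0.5), (1.5, 0.5), (1.8, 1.9), (2.1, 1.9),
         (2.2, 2.1), (1.9, 2.2), (3.5, 3.5)]

def aggregate(points, cell_size, offset=0.0):
    """Count points per square cell; `offset` shifts the grid origin."""
    return Counter(
        (int((x - offset) // cell_size), int((y - offset) // cell_size))
        for x, y in points
    )

aligned = aggregate(cases, cell_size=2)             # boundaries at 0, 2, 4
shifted = aggregate(cases, cell_size=2, offset=1)   # boundaries at -1, 1, 3, 5

print(sorted(aligned.values(), reverse=True))
print(sorted(shifted.values(), reverse=True))
```

On the aligned grid the boundaries at x = 2 and y = 2 split the tight group across all four cells. On the shifted grid the whole group falls in one cell. Nothing about the cases changed, only the regions.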
![Page 22: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/22.jpg)
MAUP example: John Snow
Monmonier, M. (1991) How to Lie with Maps. University of Chicago Press: Chicago. p. 142.
![Page 23: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/23.jpg)
Operationalizing H0 :
Case/control point data:
Random labeling hypothesis
Say n0 control, n1 case locations.
H0: Case/control label randomly assigned to the n = n0 + n1 total locations.
![Page 24: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/24.jpg)
Operationalizing H0 :
Regional count data:
Constant risk hypothesis
Each individual subject to same risk.
Expected count = (risk)*(population size).
Variable total: Poisson counts.
Fixed total: Multinomial counts.
[Figure: three maps of regional counts]
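A sketch of the constant risk hypothesis for regional counts, assuming the common risk is estimated from the data as total cases divided by total population (an assumption, since the slides do not say where the risk value comes from). The counts, populations, and function names are hypothetical. The simulation covers the fixed-total (multinomial) case only.

```python
import random

def expected_counts(observed, population):
    """Expected count = (risk) * (population size), one value per region."""
    risk = sum(observed) / sum(population)  # assumed: risk estimated from the data
    return [risk * n for n in population]

def simulate_fixed_total(total_cases, population, seed=0):
    """One data set under H0 with the total fixed (multinomial counts).

    Each case is assigned to a region with probability proportional
    to that region's population at risk.
    """
    rng = random.Random(seed)
    regions = rng.choices(range(len(population)), weights=population, k=total_cases)
    counts = [0] * len(population)
    for r in regions:
        counts[r] += 1
    return counts

observed = [4, 1, 2, 11, 2]                    # hypothetical regional counts
population = [1000, 500, 1500, 2000, 1000]     # hypothetical populations at risk
expected = expected_counts(observed, population)
simulated = simulate_fixed_total(sum(observed), population)
print(expected)
print(simulated)
```

For the variable-total case each regional count would instead be drawn as an independent Poisson variable with mean equal to its expected count.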
![Page 25: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/25.jpg)
H0 drives type of test
Random labeling: often compare observed spatial intensities (expected number of events per unit area) of cases and controls.
Constant risk: compare observed counts to those expected (goodness of fit).
![Page 26: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/26.jpg)
What deviation from H0 ?
Tests of clustering: check tendency for cases to occur in clusters.
Tests to detect clusters: find most likely cluster(s).
General tests: detect clusters or clustering anywhere.
Focused tests: detect clusters or clustering around suspected foci.
Besag, J. and Newell, J. (1991) “The detection of clusters in rare diseases”. Journal of the Royal Statistical Society, Series A 154, 143-155.
![Page 27: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/27.jpg)
How weird? (Monte Carlo test)
Random labeling/constant risk: simulate data sets under H0.
For any test statistic, calculate its value in the observed data, Tobs.
Simulate many data sets under H0, and calculate the test statistic for each (T1, T2, …, Tnumsim).
p-value = proportion of test statistics from simulated data sets exceeding Tobs (fraction of T’s > Tobs).
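The Monte Carlo recipe can be sketched for case/control point data under random labeling. The test statistic here (number of cases whose nearest neighbour is also a case) is only an illustrative choice, and the coordinates are hypothetical. The code returns the p-value as defined on the slide and, as a labeled alternative, the rank-based version (1 + number of T's >= Tobs) / (numsim + 1), which is often used when ties are possible.

```python
import random

def nearest_neighbor_case_count(points, labels):
    """Illustrative statistic: cases whose nearest neighbour is also a case."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        if not labels[i]:
            continue
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2,
        )
        total += labels[nearest]
    return total

def monte_carlo_test(points, labels, statistic, num_sim=999, seed=0):
    rng = random.Random(seed)
    t_obs = statistic(points, labels)
    shuffled = list(labels)
    exceed = at_least = 0
    for _ in range(num_sim):
        rng.shuffle(shuffled)          # random labeling: n0 and n1 stay fixed
        t_sim = statistic(points, shuffled)
        exceed += t_sim > t_obs
        at_least += t_sim >= t_obs
    p_slide = exceed / num_sim                  # fraction of T's > Tobs
    p_rank = (1 + at_least) / (num_sim + 1)     # assumed alternative, handles ties
    return t_obs, p_slide, p_rank

# Hypothetical locations: four cases packed together, eight scattered controls.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (6, 5), (5, 6), (6, 6), (9, 1), (1, 9), (8, 8), (3, 7)]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
t_obs, p_slide, p_rank = monte_carlo_test(points, labels, nearest_neighbor_case_count)
print(t_obs, p_slide, p_rank)
```

With a discrete statistic like this one, Tobs can sit at its maximum, so no simulated value can exceed it and the strict definition gives 0. The rank-based version cannot fall below 1 / (numsim + 1).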
![Page 28: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/28.jpg)
Example: Regional Counts
Comparing observed to expected.
Pearson’s chi-square statistic:
$X^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
But $X^2$ ignores location of lack of fit.
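A short check of the point that Pearson’s chi-square ignores where the lack of fit occurs. The six regions and their counts are hypothetical. In one pattern the two high counts are in neighbouring regions, in the other they are far apart.

```python
def pearson_chi_square(observed, expected):
    """Sum over regions of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [5.0] * 6                       # hypothetical: equal expected counts
adjacent_excess = [9, 9, 3, 3, 3, 3]       # excess in regions 0 and 1 (neighbours)
scattered_excess = [9, 3, 3, 9, 3, 3]      # same excess, in regions 0 and 3

x2_adjacent = pearson_chi_square(adjacent_excess, expected)
x2_scattered = pearson_chi_square(scattered_excess, expected)
print(x2_adjacent, x2_scattered)
```

Both patterns give the same value because the statistic depends only on the set of deviations, not on which regions they belong to.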
![Page 29: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/29.jpg)
Spatial goodness-of-fit
Instead of squaring $(O_i - E_i)$, what if we link $(O_i - E_i)$ and $(O_k - E_k)$ by proximity of regions i and k?
Say, $\sum_i \sum_k w_{ik} (O_i - E_i)(O_k - E_k)$, where $w_{ik}$ gives link between i and k?
This (essentially) gives Tango’s index of clustering.
Tango, T. (1990) An index for cancer clustering. Environmental Health Perspectives 87, 157-162.
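A sketch of the weighted cross-product sum on this slide, assuming weights that decay exponentially with distance between region centroids, w_ik = exp(-d_ik / tau). The value of tau, the centroid positions, and the counts are all hypothetical, and the function follows the slide's simplified form rather than the exact statistic in Tango (1990).

```python
import math

def tango_like_index(observed, expected, centroids, tau=1.0):
    """Sum over i, k of w_ik (O_i - E_i)(O_k - E_k), with w_ik = exp(-d_ik / tau)."""
    dev = [o - e for o, e in zip(observed, expected)]
    total = 0.0
    for i, (xi, yi) in enumerate(centroids):
        for k, (xk, yk) in enumerate(centroids):
            weight = math.exp(-math.hypot(xi - xk, yi - yk) / tau)
            total += weight * dev[i] * dev[k]
    return total

# Six hypothetical regions in a row, one distance unit apart.
centroids = [(float(i), 0.0) for i in range(6)]
expected = [5.0] * 6
adjacent_excess = [9, 9, 3, 3, 3, 3]
scattered_excess = [9, 3, 3, 9, 3, 3]

index_adjacent = tango_like_index(adjacent_excess, expected, centroids)
index_scattered = tango_like_index(scattered_excess, expected, centroids)
print(index_adjacent, index_scattered)
```

The two patterns have the same set of deviations, yet the index is larger when the excess sits in neighbouring regions, because nearby positive deviations reinforce each other through the weights.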
![Page 30: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/30.jpg)
Finding spatial clusters?
Spatial scan statistic (SaTScan)
• Scan on windows with distance radii.
Turnbull et al.’s Cluster Evaluation Permutation Procedure (CEPP)
• Scan on window of constant population size (e.g., 10,000 people at risk).
Besag and Newell’s approach
• Scan on window of constant number of cases (e.g., 10 cases).
All seek collection least consistent with H0 .
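A minimal sketch of the first approach, scanning circular windows of increasing radius around each region centroid and scoring each window with the Poisson likelihood ratio from Kulldorff (1997). This is not SaTScan's implementation. The centroids, counts, the 50% population cap, and the function names are assumptions made for illustration.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for a window with more cases than expected."""
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

def most_likely_cluster(centroids, cases, population, max_pop_fraction=0.5):
    """Return (llr, zone) for the window least consistent with constant risk."""
    total_cases, total_pop = sum(cases), sum(population)
    best_llr, best_zone = 0.0, None
    for center in centroids:
        dists = [math.dist(center, c) for c in centroids]
        for radius in sorted(set(dists)):
            zone = [j for j, d in enumerate(dists) if d <= radius]
            pop_in = sum(population[j] for j in zone)
            if pop_in > max_pop_fraction * total_pop:
                break  # assumed cap on window size
            cases_in = sum(cases[j] for j in zone)
            expected_in = total_cases * pop_in / total_pop
            llr = poisson_llr(cases_in, expected_in, total_cases)
            if llr > best_llr:
                best_llr, best_zone = llr, zone
    return best_llr, best_zone

# Hypothetical regions: 7 and 8 are close together and have high counts.
centroids = [(0, 0), (2, 0), (4, 0), (0, 2), (2, 2), (4, 2),
             (0, 4), (2.5, 4), (3.5, 4)]
cases = [2, 2, 2, 2, 2, 2, 2, 9, 10]
population = [1000] * 9
llr, zone = most_likely_cluster(centroids, cases, population)
print(round(llr, 2), zone)
```

The other two approaches change how the window grows: until it holds a fixed population at risk, or until it holds a fixed number of cases. In every case the significance of the most extreme window still needs a Monte Carlo comparison against data simulated under H0.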
![Page 31: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/31.jpg)
New York Leukemia
592 cases 1978-1982, 8 counties, 790 census regions, ~ 1 million people.
![Page 32: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/32.jpg)
Example: case/control point data
Kelsall and Diggle (1995)
Compare ratio of case intensity to control intensity.
Random labeling simulations.
Identify locations where case intensity significantly exceeds control intensity (pointwise test of significance).
Approach to detect clusters.
Kelsall, J.E. and Diggle, P.J. (1995) Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine 14, 2335-2342.
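A rough sketch of the intensity-ratio idea, assuming a Gaussian kernel with a single shared bandwidth and reporting the log ratio of the normalized case and control surfaces. The coordinates, bandwidth, and function names are hypothetical. The pointwise significance step (repeating the calculation under random labeling) is not shown.

```python
import math

def kernel_intensity(x, y, points, bandwidth):
    """Gaussian kernel estimate of events per unit area at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    return norm * sum(
        math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
        for px, py in points
    )

def log_relative_risk(x, y, cases, controls, bandwidth):
    """Log ratio of case to control intensity, each scaled by its sample size."""
    f = kernel_intensity(x, y, cases, bandwidth) / len(cases)
    g = kernel_intensity(x, y, controls, bandwidth) / len(controls)
    return math.log(f / g)

# Hypothetical data: cases concentrate near (1, 1), controls near (4, 4).
cases = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (4.0, 4.0)]
controls = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0),
            (4.2, 3.8), (3.8, 4.1), (2.5, 1.5), (1.5, 3.0)]

near_cases = log_relative_risk(1.0, 1.0, cases, controls, bandwidth=1.0)
near_controls = log_relative_risk(4.0, 4.0, cases, controls, bandwidth=1.0)
print(near_cases, near_controls)
```

Positive values mark locations where cases are relatively more concentrated than controls. The result depends strongly on the bandwidth, so any map of this surface should state the bandwidth used.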
![Page 33: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/33.jpg)
Archeology data
Alt and Vach (1991)
143 grave sites, 30 with affected teeth (“cases”)
Question: families buried together?
Tested question: Do gravesites with affected teeth cluster?
Alt, K.W., and Vach, W. (1991) “The reconstruction of ‘genetic kinship’ in prehistoric burial complexes – problems and statistics.” In Classification, Data Analysis, and Knowledge Organization: Models and Methods with Applications. H.-H. Bock and P. Ihm (eds.) Springer: Berlin.
![Page 34: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/34.jpg)
Map
![Page 35: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/35.jpg)
Case and control intensities
[Figure, two panels (labels Y, Z) with coordinates from about 4000 to 10000 on each axis. Affected (f), bw = 500: sites plotted as *. Non-affected (g), bw = 500: sites plotted as o.]
![Page 36: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/36.jpg)
Relative risk surface
[Figure: relative risk surface (r), bw = 500, with coordinates from about 4000 to 8000 on each axis; affected sites plotted as *, non-affected sites as o.]
![Page 37: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/37.jpg)
Spatial scan statistic
Most likely cluster (p-value = 0.067)
![Page 38: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/38.jpg)
Important ideas
What question do I want to answer?
What data can I get?
What statistical method will I use?
What question can I answer with the data I have and the method?
Does this match my first question?
![Page 39: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/39.jpg)
Additional important ideas
Results depend on data structure (MAUP).
Every test involves a specific definition of “cluster”… ask yourself:
What data result from H0 (the model of “no clustering”)?
• Can you simulate data from H0?
What constitutes evidence against H0 (the model of “clustering”)?
• Do your data appear consistent with H0?
![Page 40: How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely](https://reader035.fdocuments.in/reader035/viewer/2022071006/5fc392181ddf594dbd1723e4/html5/thumbnails/40.jpg)
Reading list
Besag, J. and Newell, J. (1991). The detection of clusters in rare diseases. Journal of the Royal Statistical Society, Series A 154, 143-155.
Kelsall, J.E. and Diggle, P.J. (1995). Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine 14, 2335-2342.
Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics-Theory and Methods 26, 1481-1496.
Neutra, R.R. (1990). Counterpoint from a cluster buster. American Journal of Epidemiology 132, 1-8.
Rothman, K. (1990). A sobering start to the cluster busters’ conference. American Journal of Epidemiology 132 (Supplement), S6-S13.
Snow, J. (1946). Snow on Cholera. Oxford University Press.
Tango, T. (1990). An index for cancer clustering. Environmental Health Perspectives 87, 157-162.
Turnbull, B.W., Iwano, E.J., Burnett, W.S., Howe, H.L., and Clark, L.C. (1990). Monitoring for clusters of disease: application to leukemia incidence in upstate New York. American Journal of Epidemiology 132 (Supplement), S136-S143.
Wallenstein, S. (1980). A test for detection of clustering over time. American Journal of Epidemiology 111, 367-372.
Waller, L.A. and Jacquez, G.M. (1995). Disease models implicit in statistical tests of disease clustering. Epidemiology 6, 584-590.
Waller, L.A. (2002). Methods for detecting disease clustering in time or space. In Statistical Methods and Principles in Public Health Surveillance. R. Brookmeyer and D. Stroup (eds). Oxford University Press.