Methodological considerations in fine-scale spatial analysis: point pattern investigation of discarded syringes used in public injection of illicit drugs

Mapping and Analysis for Public Safety, September 2005, Savannah, Georgia

Luc de Montigny ([email protected]), University of Washington, Urban Design and Planning

Outline

Overview of the case study – the big picture

Issues associated with fine-scale analysis of point data

• Constrained opportunity surface

• Non-observations as data

Investigation of two novel approaches

• Kernel Density Estimate ratios

• Random Labeling (SPPA)

Discussion and conclusion

Luc de Montigny – MAPS – 2005

Context: the Case Study

Analyze the distribution of syringes found in the most active hard-drug use neighborhood of Montréal, Canada.


Research Questions

• Where do “needle-drops” cluster?

• Why are some areas more affected than others?

• How effective are current interventions (drop-boxes, NEP)?

Ultimate Goals

• Understand public injection behavior

• Inform CPTED (Crime Prevention Through Environmental Design) initiatives

Syringes: n=4,172

Macro-Scale Analysis

Crime – like disease – is often analyzed for large areas (states, counties, cities).

Large extents usually mean low resolution (big units of analysis) and aggregation of data.

Discrete events are pooled; point values become area counts (points -> surface).

Traditional geo/spatial statistical analyses can be used. Underlying assumptions effectively hold.
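The pooling of discrete events into area counts can be sketched as follows. This is a minimal illustration only: the coordinates and cell sizes are hypothetical, and square cells stand in for whatever administrative units a macro-scale study would actually use.

```python
from collections import Counter

def aggregate_to_cells(points, cell_size):
    """Pool (x, y) point events into square cells and return counts per cell."""
    counts = Counter()
    for x, y in points:
        # Floor division gives the index of the cell containing the point.
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts

# Hypothetical event coordinates in metres (not data from the case study).
events = [(12.0, 7.5), (14.2, 3.1), (88.0, 91.0), (15.9, 9.9)]

coarse = aggregate_to_cells(events, cell_size=100.0)  # one large unit of analysis
fine = aggregate_to_cells(events, cell_size=20.0)     # smaller units of analysis
```

With the coarse cell size all four events collapse into a single count, so the local grouping of three events is no longer visible; the finer cells keep it.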


Micro-Scale Analysis

There are compelling reasons to push analysis to finer spatial resolutions.

• Substantive (analysis scale = intervention scale)

• Methodological (the modifiable areal unit problem, MAUP)

Fine-scale analysis introduces new challenges to old tools.

• Why?

• What to do about it?


Methodological Implications of Micro-Scale Analysis

Crime events are not points sampled from a continuous surface; they represent observations of discrete events.

This is a different type of pattern resulting from different types of processes.

This distinction has implications, two of which are discussed:

1. Non-observations constitute useful data

2. The area of opportunity may not be continuous


1) Non-Observations as Data

Assuming an exhaustive sampling strategy (e.g., documentation of all police reports), units of analysis that do not host an event represent a non-event.

There is a difference between “zero” and “no data.”

• Problem: a useful source of information is ignored

• Proposed Solution: borrow case/control approaches developed in epidemiology


Using non-observations: Random Labeling

Comparison of the spatial distribution of events (cases) to the spatial distribution of non-events (controls).

• Cases: points where syringes were found

• Controls: random points where syringes were not found

Used to assess whether clustering in the events is greater than what is expected due to environmental heterogeneity.

• Here we use Ripley’s K function to summarize the spatial point patterns.

• $D(d) = K_{\text{cases}}(d) - K_{\text{controls}}(d)$
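The difference of K functions can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the study: it assumes a naive estimator with no edge correction, and the point coordinates and the names `ripley_k` and `k_difference` are hypothetical.

```python
import math

def ripley_k(points, d, area):
    """Naive Ripley's K at distance d, with no edge correction (all weights = 1)."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct events lying within distance d.
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    return area * pairs / (n * n)

def k_difference(cases, controls, d, area):
    """D(d) = K_cases(d) - K_controls(d)."""
    return ripley_k(cases, d, area) - ripley_k(controls, d, area)

# Hypothetical points in a 100 m x 100 m area (not data from the case study).
cases = [(10, 10), (11, 10), (10, 11), (11, 11)]     # tightly clustered
controls = [(10, 10), (90, 10), (10, 90), (90, 90)]  # spread out

D = k_difference(cases, controls, d=5.0, area=100.0 * 100.0)
```

A positive value of `D` at a given distance means the cases are more clustered than the controls at that distance; whether the difference is significant is a separate question.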


Random Labeling – Significance


To assess the significance of the difference between $K_{\text{cases}}(d)$ and $K_{\text{controls}}(d)$, generate simulation envelopes:

• pool the points (cases + controls);

• randomly assign “case” status to ncase points;

• and calculate the summary function;

• repeat X number of times.

• The maximum and minimum values for each distance bin, taken across all iterations of the simulation, form the envelope.

Under the null hypothesis:

• $K_{\text{cases}}(d) = K_{\text{controls}}(d) = K_{\text{random assignment}}(d)$
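The envelope procedure listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the K estimator is the naive one without edge correction, the points are randomly generated placeholders rather than study data, and the function names, the default of 99 simulations, and the seeds are all hypothetical choices.

```python
import math
import random

def ripley_k(points, d, area):
    """Naive Ripley's K at distance d (no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= d
    )
    return area * pairs / (n * n)

def d_stat(cases, controls, d, area):
    """D(d) = K_cases(d) - K_controls(d)."""
    return ripley_k(cases, d, area) - ripley_k(controls, d, area)

def random_labeling_envelope(cases, controls, distances, area, n_sim=99, seed=0):
    """Min/max of D(d) per distance bin under random relabeling of pooled points."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)          # pool the points
    n_case = len(cases)
    lows = [math.inf] * len(distances)
    highs = [-math.inf] * len(distances)
    for _ in range(n_sim):                         # repeat n_sim times
        rng.shuffle(pooled)                        # randomly assign "case" status
        sim_cases, sim_controls = pooled[:n_case], pooled[n_case:]
        for k, d in enumerate(distances):          # calculate the summary function
            value = d_stat(sim_cases, sim_controls, d, area)
            lows[k] = min(lows[k], value)
            highs[k] = max(highs[k], value)
    return lows, highs

# Hypothetical data: clustered cases, scattered controls, in a 100 m x 100 m area.
data_rng = random.Random(1)
cases = [(50 + data_rng.uniform(-5, 5), 50 + data_rng.uniform(-5, 5)) for _ in range(15)]
controls = [(data_rng.uniform(0, 100), data_rng.uniform(0, 100)) for _ in range(15)]

distances = [5.0, 10.0, 20.0]
lows, highs = random_labeling_envelope(cases, controls, distances, area=10000.0)
observed = [d_stat(cases, controls, d, 10000.0) for d in distances]
outside = [not (lo <= obs <= hi) for obs, lo, hi in zip(observed, lows, highs)]
```

Each entry of `outside` flags a distance bin where the observed difference falls beyond the simulated minimum or maximum.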

Random Labeling – Results

K1: observations (cases); K2: non-observations (controls)

A non-flat curve* indicates a difference between the spatial distribution of cases and that of controls: clustering over and above that due to environmental heterogeneity. Peaks outside the simulation envelope should be considered significant.


*$D(d) = K_{\text{cases}}(d) - K_{\text{controls}}(d)$

2) Constrained Opportunity Surface

In many situations, events can occur in some spaces, but not others.

• Problem: increased likelihood of type II error

• Proposed Solution: constrain the opportunity surface to the area where events can be observed, i.e. explicitly define a spatial sample frame


Delimiting the Sample Frame

Where can syringes be found?

• Alleys and sidewalks

• Parks

• Parking lots and vacant lots

Sample frame ≈ 0.3 × study area
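One way to respect a spatial sample frame when drawing random control points is rejection sampling: propose points anywhere in the study area and keep only those that land inside the frame. The sketch below assumes a hypothetical frame made of two non-overlapping rectangles; a real frame of alleys, parks, and lots would need polygon geometry. For simplicity it also does not exclude locations where syringes were actually found.

```python
import random

def in_frame(x, y, frame):
    """True if (x, y) falls inside any rectangle (x0, y0, x1, y1) of the frame."""
    return any(x0 <= x <= x1 and y0 <= y <= y1 for x0, y0, x1, y1 in frame)

def sample_controls(n, extent, frame, seed=0):
    """Draw n random points inside the sample frame by rejection sampling."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    controls = []
    while len(controls) < n:
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        if in_frame(x, y, frame):  # discard proposals outside the frame
            controls.append((x, y))
    return controls

# Hypothetical study area and frame, in metres (not the Montréal geometry).
extent = (0.0, 0.0, 100.0, 100.0)
frame = [
    (0.0, 45.0, 100.0, 55.0),   # an alley crossing the study area
    (20.0, 60.0, 60.0, 100.0),  # a park
]

controls = sample_controls(50, extent, frame)

# Share of the study area covered by the frame (the rectangles do not overlap).
frame_share = (100.0 * 10.0 + 40.0 * 40.0) / (100.0 * 100.0)
```

In this toy layout the frame covers about a quarter of the study area, so roughly three out of four proposals are rejected.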


Cluster Analysis using Kernel Density Estimates (KDE)

KDE is a form of surface modeling – values are estimated for locations between data points.

KDE can be extended to estimate the intensity of one type of point data relative to another.

• In epidemiology, KDE are calculated for both events (cases) and for populations at risk (controls), to control for uneven distributions of population.

• This approach can be adapted for use here: density of events (distribution of syringes) can be normalized by density of opportunity (distribution of sample frame).
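Normalizing event density by opportunity density can be sketched as below. Several things here are assumptions rather than details from the study: a quartic kernel is used as one common choice, the result is a plain ratio of the two estimates, cells with no opportunity return `None`, and all coordinates and the 30 m bandwidth are hypothetical.

```python
import math

def quartic_kde(points, x, y, bandwidth):
    """Quartic-kernel intensity estimate at (x, y); zero beyond the bandwidth."""
    total = 0.0
    for px, py in points:
        u2 = ((x - px) ** 2 + (y - py) ** 2) / bandwidth ** 2
        if u2 < 1.0:
            total += 3.0 / (math.pi * bandwidth ** 2) * (1.0 - u2) ** 2
    return total

def kde_ratio(events, opportunity, x, y, bandwidth):
    """Event intensity divided by opportunity intensity; None where no opportunity."""
    denom = quartic_kde(opportunity, x, y, bandwidth)
    if denom == 0.0:
        return None
    return quartic_kde(events, x, y, bandwidth) / denom

# Hypothetical opportunity points: centroids every 10 m along an alley at y = 50.
opportunity = [(float(x), 50.0) for x in range(5, 200, 10)]
# Hypothetical events: three close together, one isolated.
events = [(20.0, 50.0), (22.0, 50.0), (25.0, 50.0), (150.0, 50.0)]

ratio_hot = kde_ratio(events, opportunity, 20.0, 50.0, bandwidth=30.0)
ratio_cold = kde_ratio(events, opportunity, 150.0, 50.0, bandwidth=30.0)
ratio_off_frame = kde_ratio(events, opportunity, 20.0, 150.0, bandwidth=30.0)
```

Because the opportunity points are evenly spaced along the alley, the higher ratio near x = 20 reflects the grouping of events there rather than more available space.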


KDE – Syringe Points

The “smoothed” surface represents the intensity of discarded syringes within the search radius, or bandwidth (100 m), of any given location in the study area (i.e., for every grid cell).


Percent volume contour (PVC) lines represent the boundary of the area that contains 90% of the volume of a probability density distribution; on average, 90% of the points that were used to generate the KDE are contained within the lines.
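The density level at which such a contour would be drawn can be found by ranking grid cells from densest to sparsest and accumulating their values until the target share of the total volume is reached. The sketch below assumes equal-area cells and a hypothetical flattened grid; it returns only the threshold value and does not trace the contour line itself.

```python
def pvc_threshold(density_cells, fraction=0.9):
    """Density level such that cells at or above it hold `fraction` of the total volume."""
    total = sum(density_cells)
    running = 0.0
    # Accumulate from the densest cell downward until the target share is reached.
    for value in sorted(density_cells, reverse=True):
        running += value
        if running >= fraction * total:
            return value
    return 0.0

# Hypothetical flattened density grid with equal-area cells.
grid = [0.0, 0.5, 0.5, 4.0, 5.0, 10.0]

level = pvc_threshold(grid, fraction=0.9)
inside = [v for v in grid if v >= level]
share = sum(inside) / sum(grid)
```

Here the three densest cells are enough to pass the 90% mark, so the contour would enclose them and leave out the remaining low-density cells.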

KDE – Sample Frame

Here the sample frame is converted to a grid (10 m), and the centroid of each cell is used for the purposes of the kernel density estimation.


KDE – Syringe/Sample Ratio

The ratio surface represents, for each grid cell, the syringe point KDE value divided by the square of the sample frame KDE value.


KDE – Comparison

A comparison of how syringe points cluster in the study area (simple density estimate), to how those same points cluster within the sample space (the ratio between the two density estimates).

These results suggest that the distribution (clustering) of syringes is due to factors other than the distribution of opportunity.


Caveats and Limitations

Random Labeling

• The large departure from the envelope is due to a mis-specified null hypothesis (it only “proves” the obvious); a different null hypothesis should be used.

• The K function assumes stationarity, which is probably violated in this case; an inhomogeneous K function should be used.

Kernel Density Estimate ratios

• The intensity of opportunity (sample space density estimates) is measured in an arbitrary way. The choices of grid resolution and bandwidth strongly influence the density estimate, yet are not grounded in theory.

• Density surfaces should be “clipped” to areas within the sample frame for the purposes of visualization and analysis.


Summary

• Most events studied in criminology are the result of point processes (point patterns).

• Tools designed for the analysis of surfaces may not be appropriate for criminology.

• Popular analytic techniques have underlying assumptions that are violated at the micro-scale.

• Ignoring these issues can produce erroneous results (type II error, model mis-specification).

Contact information: [email protected]


Acknowledgements

This research would not be possible without the hard work and collaboration of Spectre de rue, Montréal.



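Appendix – Ripley’s K Summary Function

The K-function describes the degree to which there is spatial dependence in the arrangement of events:

$K(d) = \lambda^{-1} E[\text{number of events within distance } d \text{ of a randomly selected event}]$

where $\lambda$ is the intensity and $E[\cdot]$ the expectation.

Formally, the standard edge-corrected estimator is:

$\hat{K}(d) = \frac{|R|}{n^2} \sum_{i=1}^{n} \sum_{j \neq i} \frac{I(d_{ij} \le d)}{w_{ij}}$

where

• $R$ is the region (extent), with area $|R|$

• $n$ is the number of events and $d_{ij}$ the distance between events $i$ and $j$

• $I$ is a binary indicator function (1 if $d_{ij} \le d$, 0 otherwise)

• $w_{ij}$ is the edge-correction weight: the proportion of the search circle centered on event $i$ and passing through event $j$ that falls within $R$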