Crime Early Warning: Automated Data Mining of CAD and RMS
340 N 12th St, Suite 402
Philadelphia, PA 19107
www.azavea.com/hunchlab
Crime Early Warning Systems
Automated Data Mining of CAD and RMS Databases
About Us
Robert Cheetham
President & [email protected]
Jeremy Heffner
HunchLab Product [email protected]
Agenda
• Company Background
• The Backstory
• HunchLab
– Concept of Early Warning / Data Mining
– Demonstration of Hunches
– Underlying Statistics
• Q&A
About Azavea
• Founded in 2000
• 27 people
• Based in Philadelphia
– Also Boston & Minneapolis
• Geospatial + web + mobile
– Software development
– Spatial analysis services
Clients & Industries
• Public Safety
• Municipal Services
• Public Health
• Human Services
• Culture
• Elections & Politics
• Land Conservation
• Economic Development
Azavea & Governments
The Backstory
How Phila PD uses GIS
Customized Map Products
Weekly CompStat Meetings
Web Crime Analysis
[Diagram: incident data flow. Labels: Complainant; Verizon 911; 911 Operator; CAD; Radio Dispatcher; Police Officer; District 48 Desk; Incident Report Completed by Officer (District X, District Y, District Z); INCT; PARS; Daily Download & Geocoding Routines; Maps Distributed Through Intranet, Printing, CompStat]
INCT & PARS – main database sources
over 5,000 incidents daily, over 2 million annually
The Context
1,500,000 people
7,000 police officers
1,000 civilian employees
2,000,000 new incidents / year
3 crime analysts
What we did
• Weekly CompStat
• Lots of maps
• Automation of map creation
• Web-based systems
… but what if we could…
Accelerate the cycle
Proactively notify
Automate the process
Prototype
ArcView
VB & MapObjects
MS SQL Server
Crime Incidents Database
Shapefiles and GRIDs
Process Documentation
.ini file
… but there was a problem …
It was crap … sort of.
We needed ….
1. Better Statistics
2. Notification
3. Very Straightforward
web-based crime analysis, early warning, and risk forecasting
Crime Analysis
– Mapping (spatial / temporal densities)
– Trending
– Intelligence Dashboard
Early Warning
– Statistical & Threshold-based Hunches (data mining)
– Alerting
Risk Forecasting
– Near Repeat Pattern
– Load Forecasting
Crime Analysis – What has happened?
– Mapping (spatial / temporal densities)
– Trending
– Intelligence Dashboard
Early Warning – What is out of the ordinary?
– Statistical & Threshold-based Hunches (data mining)
– Alerting
Risk Forecasting – What is likely to happen?
– Near Repeat Pattern
– Load Forecasting
Early Warning
• Geographic Early Warning System
– A system to alert staff of an unusual situation in a particular location
– Ingests data sets to automatically “cook on” and only involves staff when a statistically unusual situation is found
[Diagram: Operational Databases; HunchLab Database; Geostatistical Engine; Alerting System]
Data Mining
• What do we mean by data mining?
– The process of “cooking on” the data to reveal something new (unusual)
• Benefits
– Automated discovery process
– Can examine large data sets without additional staff time
  • Major crime incidents
  • Minor crime incidents
– Near real-time alerts
• Limitations
– Can’t determine why something unusual is happening, only that it is happening
Early Warning
bit.ly/crimespikedetector
Demo
What is a Hunch?
• A proposed hypothesis, saved into the system, and continually tested for validity
• Incident Attribute Requirements
– Location (x, y)
– Time (timestamp)
– Classification
• Hunch Attributes
– Location (area)
– Time (recent / historic periods)
– Classification
• Analyses
– Statistical Hunch
– Threshold Hunch
Hunch Parameters: Location
• Address & Radius
• Precinct/County/Country
• Custom Drawn Area
• Mass Hunch
Hunch Parameters: Time
• Statistical Hunch
– Recent Past
– Historic Past
Hunch Parameters: Classification
• Category
• Time of Day
• Narrative
Hunch Helper
Email Alert
Hunch Details
The Statistics
What do we know?
• Hunch
– Geographic region (that we care about)
– Recent time frame (to alert on)
– Historic time frame (to compare against)
– Classification (that we are interested in)
                Within Hunch   Outside of Hunch
Recent past          ?                ?
Historic past        ?                ?
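One way the four unknown cells could be filled in is by counting incidents against a saved hunch. The sketch below is illustrative only: the `Incident` and `Hunch` classes, their field names, and the choice of an "address & radius" circle as the hunch area are assumptions made for this example, not HunchLab's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime
from math import hypot


@dataclass
class Incident:
    # The three attributes every incident needs: location, time, classification.
    x: float
    y: float
    timestamp: datetime
    classification: str


@dataclass
class Hunch:
    # Hypothetical "address & radius" hunch: a circle plus two time windows.
    center_x: float
    center_y: float
    radius: float
    recent_start: datetime
    recent_end: datetime
    historic_start: datetime
    historic_end: datetime
    classification: str


def count_table(hunch, incidents):
    """Count matching incidents into the 2x2 (period, area) table."""
    table = {
        ("recent", "within"): 0,
        ("recent", "outside"): 0,
        ("historic", "within"): 0,
        ("historic", "outside"): 0,
    }
    for inc in incidents:
        if inc.classification != hunch.classification:
            continue  # not the kind of incident this hunch tracks
        if hunch.recent_start <= inc.timestamp < hunch.recent_end:
            period = "recent"
        elif hunch.historic_start <= inc.timestamp < hunch.historic_end:
            period = "historic"
        else:
            continue  # outside both time windows
        distance = hypot(inc.x - hunch.center_x, inc.y - hunch.center_y)
        area = "within" if distance <= hunch.radius else "outside"
        table[(period, area)] += 1
    return table
```

The time windows are treated as half-open intervals (start included, end excluded) so that an incident on the boundary between the historic and recent periods is counted only once.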
Hypergeometric Distribution
• Arises when selecting items at random from a heterogeneous pool without replacement
– Example
  • A bag contains 45 black marbles and 5 white marbles
  • What is the chance of picking 4 white marbles when we draw 10 marbles?
Tony Smith
University of Pennsylvania
                Drawn   Not Drawn
White Marbles     4         1
Black Marbles     6        39
Hypergeometric Distribution
                Drawn        Not Drawn             Total
White Marbles   4 = k        1 = m - k             5 = m
Black Marbles   6 = n - k    39 = N + k - n - m    45 = N - m
Total           10 = n       40 = N - n            50 = N
en.wikipedia.org/wiki/Hypergeometric_distribution
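The marble example can be checked numerically with the standard hypergeometric probability mass function, using the slide's symbols (N marbles in the bag, m white, n drawn, k white among those drawn). This is a minimal sketch; the function name `hypergeom_pmf` is chosen for illustration.

```python
from math import comb


def hypergeom_pmf(k, N, m, n):
    """P(exactly k white) when drawing n marbles, without replacement,
    from a bag of N marbles of which m are white."""
    # Ways to pick k of the m white marbles, times ways to pick the
    # remaining n - k from the N - m black marbles, over all possible draws.
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


# 50 marbles, 5 white, draw 10: chance of exactly 4 white.
p = hypergeom_pmf(k=4, N=50, m=5, n=10)
print(f"P(4 white in 10 draws) = {p:.4f}")
```

For the bag described above this comes out to about 0.004, so drawing 4 of the 5 white marbles in 10 draws is a rare event under pure chance.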
What do we know?
• Valid Hunch
– The current condition (and all worse conditions) is unlikely to simply be due to chance
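"The current condition and all worse conditions" corresponds to an upper-tail probability of the hypergeometric distribution, which is the same calculation as a one-sided Fisher's exact test on a 2x2 table. The sketch below is one possible reading, not the product's confirmed implementation: the mapping of table cells to N, m, n, and k, the function names, and the 0.05 cutoff are all assumptions made for illustration.

```python
from math import comb


def hypergeom_pmf(k, N, m, n):
    """P(exactly k) for a hypergeometric draw (same symbols as the marble table)."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def hunch_p_value(recent_in, recent_out, historic_in, historic_out):
    """Chance of seeing at least `recent_in` recent incidents inside the hunch
    area if incidents were spread between periods purely at random."""
    # Assumed mapping: "white" = recent incidents, "drawn" = inside the hunch.
    N = recent_in + recent_out + historic_in + historic_out  # all incidents
    m = recent_in + recent_out    # incidents in the recent period
    n = recent_in + historic_in   # incidents inside the hunch area
    k = recent_in                 # recent incidents inside the hunch area
    # Sum the current condition and every more extreme ("worse") one.
    return sum(hypergeom_pmf(i, N, m, n) for i in range(k, min(m, n) + 1))


def is_valid_hunch(recent_in, recent_out, historic_in, historic_out, alpha=0.05):
    """Flag the hunch when the tail probability falls below a chosen cutoff.
    The 0.05 default is a hypothetical choice, not taken from the slides."""
    return hunch_p_value(recent_in, recent_out, historic_in, historic_out) < alpha
```

Plugging in the marble table (4, 1, 6, 39) gives the chance of drawing 4 or 5 white marbles, which is small enough that the sketch would flag it.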
Demo
Research Topics
• Mobile Interfaces
• Analysis
– Real-time Functionality
  • Consume real-time data streams & conduct ongoing analysis
Research Topics
• Risk Forecasting
– Load forecasting enhancements
  • Machine learning-based model selection
  • Weather and special events
– Combining short and long term risk forecasts
– Risk Terrain Modeling
Q&A
Contact Us
Robert Cheetham
President & [email protected]
Jeremy Heffner
HunchLab Product [email protected]