San Francisco Crime Classification

SAN FRANCISCO CRIME CLASSIFICATION

Sai Praneeth

Project Outline
1. Problem Identification
2. Data Understanding & Cleansing
3. Data Visualization
4. Prediction Methodologies
5. Validation & Scoring

Problem Identification

Current State
• The current crime index of San Francisco is 3 (safer than 3% of the cities in the US).
• 67.67 annual crimes per 1,000 residents.
• There is no model to predict crimes based on location and time.

Future State
• A proper model predicting crime based on date, time, and location.
• Help the corrections department act with appropriate corrective measures based on our model.

• What are the different metrics that influence response?

• Is the data enough to give us a clear picture of crime committed?

• What kind of model best fits the data?

Problem Statement

• Given time and location, you must predict the category of crime that occurred.

• This competition's dataset provides nearly 12 years of crime reports from across all of San Francisco's neighborhoods.

• It also encourages us to explore the dataset visually.

Data Overview
• Timestamp
• Category (different crimes)
• Description
• Resolution
• Day of Week
• PdDistrict
• Address
• Longitude & Latitude

Data Cleansing and Manipulation

Cleaning the Data
• Check for missing values
• Check for entry errors
• Check for duplicates
• Check for outliers
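The cleaning checks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`Dates`, `PdDistrict`, `X`, `Y`), the sample rows, and the rough bounding box used to flag coordinate outliers are all hypothetical and do not come from the slides.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; column names are assumed, not taken from the slides.
SAMPLE = """Dates,Category,DayOfWeek,PdDistrict,Address,X,Y
2015-05-13 23:53:00,WARRANTS,Wednesday,NORTHERN,OAK ST / LAGUNA ST,-122.4258,37.7745
2015-05-13 23:53:00,WARRANTS,Wednesday,NORTHERN,OAK ST / LAGUNA ST,-122.4258,37.7745
2015-05-13 23:30:00,LARCENY/THEFT,Wednesday,,1500 Block of LOMBARD ST,-122.4263,37.8009
2015-05-13 22:58:00,ASSAULT,Wednesday,BAYVIEW,100 Block of EXAMPLE ST,-120.5,90.0
"""

# Assumed rough bounding box around San Francisco (longitude X, latitude Y).
BOUNDS = {"X": (-123.0, -122.0), "Y": (37.0, 38.5)}


def audit(rows):
    """Return (missing values per column, duplicate row count, out-of-bounds row count)."""
    missing = Counter()
    seen = set()
    duplicates = 0
    out_of_bounds = 0
    for row in rows:
        # Missing values: empty or absent fields.
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
        # Duplicates: rows identical in every field to an earlier row.
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
        # Outliers / entry errors: coordinates outside the assumed bounding box.
        try:
            x, y = float(row["X"]), float(row["Y"])
        except (TypeError, ValueError):
            continue  # already counted as missing or malformed
        in_x = BOUNDS["X"][0] <= x <= BOUNDS["X"][1]
        in_y = BOUNDS["Y"][0] <= y <= BOUNDS["Y"][1]
        if not (in_x and in_y):
            out_of_bounds += 1
    return dict(missing), duplicates, out_of_bounds


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = audit(rows)
print(report)
```

On the sample above, the second row is an exact duplicate, the third row has no district, and the fourth row has coordinates far outside the city.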

Manipulating the Data
• Timestamp
• Address
• Longitude and Latitude
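The slides do not say exactly how the timestamp and address were manipulated. One plausible approach, shown here purely as an assumption, is to split the timestamp into calendar parts and to flag whether an address is a street intersection. The timestamp format and the " / " intersection convention are assumed for illustration.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_features(timestamp):
    """Split a 'YYYY-MM-DD HH:MM:SS' timestamp (assumed format) into model inputs."""
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "day_of_week": DAY_NAMES[dt.weekday()],  # weekday(): Monday is 0
    }


def is_intersection(address):
    """Assumed convention: intersections are written as 'STREET A / STREET B'."""
    return " / " in address


features = time_features("2015-05-13 23:53:00")
print(features)
print(is_intersection("OAK ST / LAGUNA ST"))
```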

Data Visualization

Variables Selection & Data Partition

• Data Partition: 60:40
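A 60:40 partition can be sketched as a seeded random shuffle followed by a cut. This is a simple random split written for illustration; the slides do not state whether the partition was random or stratified, and the `partition` helper and its seed are hypothetical.

```python
import random


def partition(rows, train_fraction=0.6, seed=42):
    """Shuffle the rows reproducibly and split them into (train, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


train, validation = partition(range(100))
print(len(train), len(validation))
```

Fixing the seed keeps the split identical between runs, so the validation scores of different models are compared on the same rows.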

Project Diagram

1. Decision Tree (Two-way split)
• This is a decision tree with the typical two-way split.
• In the properties panel, the method was changed to Assessment and the assessment measure was changed to Decision, as we are trying to classify a categorical target.

1. Decision Tree (Two-way split)
• Most important variable for split -> Zip code
• Number of leaves in the pruned tree -> 6
• Validation misclassification -> 0.273474
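The validation misclassification figure reported for each model is the share of validation rows whose predicted category differs from the actual one. A minimal sketch, using hypothetical labels:

```python
def misclassification_rate(actual, predicted):
    """Fraction of rows where the predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one row is required")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical labels, for illustration only.
actual = ["THEFT", "ASSAULT", "THEFT", "FRAUD"]
predicted = ["THEFT", "THEFT", "THEFT", "FRAUD"]
rate = misclassification_rate(actual, predicted)
print(rate)
```

A rate of 0.273474 would therefore mean that roughly 27% of the validation rows were assigned the wrong category.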


2. Decision Tree (Three-way splits)
• This decision tree uses three-way splits.
• In the properties panel we changed the maximum branch to three, keeping the same assessment criteria.
• This greatly increased model accuracy.

2. Decision Tree (Three-way splits)
• Most important variable for split -> Zip codes
• Number of leaves in the pruned tree -> 7
• Validation misclassification -> 0.134316
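One way to see why allowing three branches can help is to compare how pure the resulting groups are. The sketch below uses weighted Gini impurity as an illustrative criterion only; the slides do not state which splitting criterion was used, and the labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one branch: 0.0 means every row has the same class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(branches):
    """Impurity of a split: branch impurities weighted by branch size."""
    total = sum(len(branch) for branch in branches)
    return sum(len(branch) / total * gini(branch) for branch in branches)


# Hypothetical rows with three crime categories.
two_way = [["THEFT"] * 4 + ["ASSAULT"] * 2, ["ASSAULT"] * 2 + ["FRAUD"] * 4]
three_way = [["THEFT"] * 4, ["ASSAULT"] * 4, ["FRAUD"] * 4]

print(split_impurity(two_way))
print(split_impurity(three_way))
```

In this toy case the two-way split has to mix categories in each branch, while the three-way split separates them completely.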


3. Gradient Boosting
• “Gradient boosting is a boosting approach that resamples the data set several times to generate results that form a weighted average of the resampled data set. Tree boosting creates a series of decision trees which together form a single predictive model.”
• Here the assessment measure is taken as misclassification.
• The train proportion is taken as 60%.
• Most important variable for split -> PdDistrict
• Validation misclassification -> 0.34221
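The idea of a series of trees forming one model can be sketched with one-split trees (stumps), each fitted to the errors left by the trees before it. This is a deliberately simplified illustration, not the tool's implementation: it uses squared error on a single numeric input with a 0/1 target, omits the resampling step, and all names, data, and parameter values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Hypothetical data: the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
print(predict(model, 2), predict(model, 5))
```

Each round removes part of the remaining error, so the combined prediction moves toward the target as more stumps are added.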

4. Ensemble Model
• Combination of all the four models.
• Validation misclassification of 0.141683
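The slides do not say how the models were combined. A common approach, assumed here for illustration, is to average the class probabilities produced by each model and pick the class with the highest average. The probabilities below are hypothetical.

```python
def ensemble_predict(model_probabilities):
    """Average per-class probabilities across models and return (label, averages)."""
    classes = list(model_probabilities[0])
    n_models = len(model_probabilities)
    averaged = {
        c: sum(probs[c] for probs in model_probabilities) / n_models
        for c in classes
    }
    label = max(averaged, key=averaged.get)
    return label, averaged


# Hypothetical probabilities for one incident from three models.
outputs = [
    {"THEFT": 0.6, "ASSAULT": 0.3, "FRAUD": 0.1},
    {"THEFT": 0.2, "ASSAULT": 0.5, "FRAUD": 0.3},
    {"THEFT": 0.5, "ASSAULT": 0.3, "FRAUD": 0.2},
]
label, averaged = ensemble_predict(outputs)
print(label, averaged)
```

Averaging lets a confident model outvote a hesitant one, but a weak member can also pull the combined result down, which may be why an ensemble does not always beat its best member.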

Model Comparison

• The best model is the three-way decision tree, with a misclassification of 0.135668.
• The model improved drastically after converting latitude and longitude to zip codes.
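The slides do not describe how coordinates were converted to zip codes. One rough way to do it, sketched here as an assumption, is to assign each incident to the nearest zip-code centroid. The zip labels and centroid coordinates below are placeholders, not real data; a production approach would more likely use zip boundary polygons or a reverse-geocoding service.

```python
# Placeholder centroids (longitude, latitude); hypothetical values for illustration.
ZIP_CENTROIDS = {
    "ZIP_A": (-122.42, 37.78),
    "ZIP_B": (-122.41, 37.75),
    "ZIP_C": (-122.48, 37.76),
}


def nearest_zip(longitude, latitude, centroids=ZIP_CENTROIDS):
    """Return the label of the centroid closest to the point.

    Uses squared distance in degrees, which is crude but adequate
    for ranking nearby centroids within a single city.
    """
    def squared_distance(item):
        _, (lon, lat) = item
        return (lon - longitude) ** 2 + (lat - latitude) ** 2

    return min(centroids.items(), key=squared_distance)[0]


print(nearest_zip(-122.4258, 37.7745))
print(nearest_zip(-122.47, 37.76))
```

Replacing two continuous coordinates with one categorical area code gives a tree a single input to split on, which is consistent with zip code being the most important split variable in both decision trees.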

Betterment of Model

• Demographics Data Inclusion

• Time Series Analysis

Questions

THANK YOU
