A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

29
LOGO A Comparative Study A Comparative Study of Data Mining of Data Mining Methods to Analyzing Methods to Analyzing Libyan National Libyan National Crime Data Crime Data Presented by: Presented by: Dr. Dr. Zakaria Suliman Zubi Zakaria Suliman Zubi Associate Professor Associate Professor Computer Science Department Computer Science Department Faculty of Science Faculty of Science Sirte University Sirte University Sirte, Libya Sirte, Libya

description

Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.

Transcript of A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Page 1: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

LOGO

A Comparative Study of A Comparative Study of Data Mining Methods to Data Mining Methods to

Analyzing Libyan Analyzing Libyan National Crime Data National Crime Data

Presented by:Presented by:

Dr.Dr.Zakaria Suliman ZubiZakaria Suliman ZubiAssociate ProfessorAssociate ProfessorComputer Science DepartmentComputer Science DepartmentFaculty of ScienceFaculty of ScienceSirte UniversitySirte UniversitySirte, LibyaSirte, Libya

Page 2: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContents

AbstractAbstract..

Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 3: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

AbstractAbstract Law enforcement agencies represented in the police today faced a large

volume of data every day. These data can be processed and transformed into useful information. In this since, Data mining can be applied to greatly improve crime analysis. Which can help to reduce and preventing crime as much as possible.

Crime reports and data are used as an input for the formulation of the crime prevention policies and strategic plans.

This work will apply some data mining methods to analyses Libyan national criminal record data to help the Libyan government to make a strategically decision regarding prevention the increasing of the high crime rate these days.

The data was collected manually from Benghazi, Tripoli, and Al-Jafara Supremes Security Committee (SSC).

Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.

A comparison between both algorithms was discussed in this work as well.

Page 4: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContents

AbstractAbstract.. Introduction.Introduction.

Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 5: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

IntroductionIntroduction Data Mining or Knowledge Discovery in Databases (KDD) in simple

words is nontrivial extraction of implicit, previously unknown, and potentially useful information from data.

KDD is the process of identifying a valid, potentially, useful and ultimately understandable structure in data..

Crime analyzes is an emerging field in law enforcement without standard definitions. This makes it difficult to determine the crime analyzes focus for agencies that are new to the field.

Crime analysis is act of analyzing crime. More specifically, crime analysis is the breaking up of acts committed in violation of laws into their parts to find out their nature and reporting, some analysis.

The role of the crime analysts varies from agency to agency. Statement of these findings, The objective of most crime analysis is to find meaningful information in vast amounts of data and disseminate this information to officers and investigators in the field to assist in their efforts to apprehend criminals and suppress criminal activity.

Page 6: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContents

AbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.

Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 7: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Data Mining CategoriesData Mining Categories

The Data Mining models are categorized into different leaves. Further,

each leaf signifies the relationship, if any, that is highlighted from the

database. The Data Mining Models can be put into one of the six main

categories: 1) Association, 2) Classification, 3) Clustering, 4)

Prediction, 5) Sequence Discovery, and 6) Generalization

Page 8: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

data Mining Categories Cont…data Mining Categories Cont…

Table 2 classifies the various Data Mining algorithms according to problem type, namely, Association, Classification, Clustering, Prediction, Discovery,

and Summarization.

Page 9: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Cont….

data Mining Categories Cont…data Mining Categories Cont…

Page 10: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?

Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 11: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Why Analyze Crime?Why Analyze Crime?Crime Analysts usually tend to justify their existence as crime analysts in

what is known as law enforcement agency. It makes sense to analyze crime. Some good reasons are listed as follow:

1. Analyze crime to inform law enforcers about general and specific crime trends, patterns, and series in an ongoing, timely manner.

2. Analyze crime to take advantage of the abundance of information existing in law enforcement agencies, the criminal justice system, and public domain.

3. Analyze crime to maximize the use of limited law enforcement resources.

4. Analyze crime to have an objective means to access crime problems locally, regionally, nationally within and between law enforcement agencies.

5. Analyze crime to be proactive in detecting and preventing crime.

6. Analyze crime to meet the law enforcement needs of a changing society.

Page 12: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Why Analyze Crime?Why Analyze Crime? Cont…Cont…

In general there are four different techniques for analyzing crimes, as follow:

1. Linkage Analysis

2. Statistical Analysis

3. Profiling

4. Spatial AnalysisEach of the above techniques has its own

advantages and drawbacks and can be used in specific cases.

Page 13: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.

Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 14: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Data Mining TaskData Mining TaskA. Data collection Phase.

In this phase, the dataset that we used as training and testing

data were extracted from the police departments. These data

contain data about both Crimes and Criminals with the following

main attributes:

1. Crime ID: Individual crimes are designated by unique crime id.

2. Crime type: indicates crime type.

3. Date: Indicate when a crime happened.

4. Gender: Male or Female.

5. Age: age of individual Criminal.

6. Crime Address: location of the crime.

7. Marital status: status of the Criminal.

Figure2: Raw Data

Page 15: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Data Mining Task Data Mining Task Cont…Cont…B. Data Preprocessing

Real world usually have the following drawbacks: Incompleteness, Noisy, and Inconsistence. So these data need to be preprocessed to get the data suitable for analysis purpose. The preprocessing includes

the following tasks :

1. Data cleaning.

2. Data integration

3. Data transformation.

4. Data reduction.

5. Data discretization.

Figure (3): attributes for crime and criminal. It also shows the distribution of offenses versus different crime and criminal attributes.

Page 16: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.

The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 17: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

Data SetData Set

We will consider crime database as a training dataset

used in our model. The mentioned database contains a

real data values from crime and criminal attributes. We

will also consider 70 percent as training value of the

proposed model and 30 percent for testing. The

following table shows the data we used in our model.

Page 18: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.

Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 19: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

The MLCR proposed modelThe MLCR proposed model The Mining Libyan Criminal Record (MLCR) proposed model will be implemented to conduct and interact with two types of mining algorithms to overcome with

two different types of results effectively. Those two approaches are considered as a sub-prototypes of the proposed MLCR model. Those prototypes will be illustrated as follows:

A.Mining Libyan Criminal Record-using Association rules (MLCR-AR).

B.Mining Libyan Criminal Record-using Clustering (MLCR-C).

Page 20: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

The MLCR proposed model The MLCR proposed model Cont… Cont…

A. Mining Libyan Criminal Record-using Association rules (MLCR-AR).

Association rule mining is a method used to generate rules from crime dataset based on frequents occurrence of patterns to help the decision makers of our security society to make a prevention action.

One of the most popular algorithm are called Apriori and FP-growth Association rule mining classically intends at discovering association between items in a transactional database.

The Apriori algorithm called also as “Sequential Algorithm” developed by [Agrawal1994]. Is a great accomplishment in the history of mining association rules[Cheung1996c]. It is also the most well known association rules algorithm. This technique uses to perform association analyze on the attributes of crimes.

Page 21: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

The MLCR proposed model The MLCR proposed model Cont… Cont…

After applying the Apriori Algorithm we’ve got the following results:Apriori

======

Minimum support: 0.3 (4 instances)

Minimum metric <confidence>: 0.9

Number of cycles performed: 14

Generated sets of large itemsets:

Size of set of large itemsets L(1): 5

Size of set of large itemsets L(2): 6

Size of set of large itemsets L(3): 2

Best rules found:

 

www.themegallery.com

1. MARRIED=NO 12 ==> GENDER=M 12<conf:(1)> lift:(1) lev:(0) [0] conv:(0) 2.CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0) 3. CRIMEADDRESS=TRIPOLI 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38) 4.CRIMEADDRESS=TRIPOLI MARRIED=NO 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0) 5. CRIMEADDRESS=TRIPOLI GENDER=M 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38) 6. CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38) 7. CRIMEADDRESS=BENGHAZI 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0) 8. CRIMEADDRESS=JAFARA 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0) 9. CRIMEADDRESS=JAFARA 4 ==> MARRIED=NO 4 <conf:(1)> lift:(1.08) lev:(0.02) [0] conv:(0.31)10. CRIMEADDRESS=JAFARA MARRIED=NO 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)

Page 22: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

The MLCR proposed model The MLCR proposed model Cont… Cont…

Fig4: criminal age vs. Number of crimes After applying K-means algorithm

B. Mining Libyan Criminal Record-using Clustering (MLCR-C).

This prototype will use the same dataset indicated in MLCR_AR prototype. But with Clustering Analysis.

Clustering is the technique that is used to group objects (crime and criminals) without having predefined specification for their attributes.

Clustering is unsupervised classification: no predefined classes. Simple K-means clustering algorithm is used in this work.

K-mean algorithm clusters the data members groups were m is predefined. Input-Crime type. Number of clusters, Number of Iteration Initial seeds might produce an important role in the final results.  Step1: Randomly choose cluster centers. Step2: Assign instance to cluster based on their Distance to the cluster centers. Step3: Centers of clusters are adjusted. Step4: go to Step1 until convergence. Step5: output X0 ,X1,X2 ,X3.

Page 23: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

The MLCR proposed model The MLCR proposed model Cont… Cont… After Applying K-means Algorithm we’ve got this result === Clustering model (full training set) === K-Means======  Number of iterations: 3 Within cluster sum of squared errors: 43.0 Missing values globally replaced with mean/mode

  Cluster centroids: Cluster# Attribute Full Data 0 1 (13) (9) (4) =================================================== CRIMEID 13 13 65 CRIMETYPE MOLESTATION MOLESTATION DACOITY CRIMEADDRESS TRIPOLI JAFARA TRIPOLI CRIMEDATE 12OCT12 05NOV12 12OCT12 GENDER M M M MARRIED NO NO NO AGE 19 19 30

  Time taken to build model (full training data) : 0 seconds

  === Model and evaluation on training set ===

  Clustered Instances

  0 9 ( 69%) 1 4 ( 31%)

www.themegallery.com

Page 24: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.

ConclusionConclusion..

Page 25: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

K-MEAN Algorithm Aprioir Algorithm

Clustering techniques by using K-

mean algorithm group data items

into classes based on

characteristics like age, gender.

For example, we can identify

suspect and analyze those conduct

crimes in similar fashion or the

same manner.

Association rules by using Aprioir

algorithm mine frequent data

patterns by treating them as rules.

The recognized benefits include an

improvement in the accuracy of

results over current semi-manual

Processes and a reduction in the

time taken to achieve those results

.

www.themegallery.com

A comparison between both algorithms A comparison between both algorithms

Page 26: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

ContentsContentsAbstractAbstract.. Introduction.Introduction.Data Mining Categories.Data Mining Categories.Why Analyze Crime?Why Analyze Crime?Data Mining Task.Data Mining Task.Data Set.Data Set.The MLCR Proposed Model.The MLCR Proposed Model.Comparison Between Algorithms Used.Comparison Between Algorithms Used.ConclusionConclusion..

Page 27: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Company LogoCompany Logoww

w.t

hem

eg

allery

.com

www.themegallery.com

ConclusionConclusion Clustering and association rules were defined as a data mining techniques to

automatically retrieve, extract and evaluate information for knowledge discovery

from crime data. This information was collected from many police department.

Association rules Mining is one of the data mining techniques for data to be used to

identify the relationship and to generate rules from crime dataset based on frequents

occurrence of patterns to help the decision makers of our security society to make a

prevention action.

Clustering is one of the data mining techniques also used to group objects (crime

and criminals) without having predefined specification for their attributes.

The algorithms such as K-means algorithm and Aproir algorithm are used in this

paper.

Those algorithms were expressed in details and a comparative study were denoted

in this paper.

Page 28: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data

Thank you !!!

Page 29: A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data