Secure Network Communication Based on Text-to-Image Encryption
Ahmad Abusukhon¹, Mohamad Talib², Issa Ottoum³
¹ IT Faculty, Computer Network Department, Al-Zaytoonah University of Jordan, Amman, JORDAN ([email protected])
² Department of Computer Science, University of Botswana, Gaborone, BOTSWANA ([email protected])
³ IT Faculty, Computer Network Department, Al-Zaytoonah University of Jordan, Amman, JORDAN
ABSTRACT

Security becomes an important issue when sensitive information is sent over a network in which all computers are connected together. In such a network a computer is recognized by its IP address. Unfortunately, an IP address can be attacked: in IP spoofing, one host claims to have the IP address of another host and can thus send packets to a certain machine, causing it to take some action. Cryptography is used to overcome this problem. In a cryptographic application, the data are first encrypted at the source machine using an encryption key, and the encrypted data are then sent to the destination machine. An attacker who does not have the encryption key cannot recover the original data and is therefore unable to do anything with the session. In this paper, we propose a novel method for data encryption based on private-key encryption. We call our method Text-to-Image Encryption (TTIE).

KEYWORDS: Network; Secured Communication; Text-to-Image Encryption; Algorithm; Decryption; Private Key; Encoding.
1 INTRODUCTION
Information security is one of the most important issues to consider in computer networks. Many applications on the Internet, for example e-commerce (selling and buying through the Internet), depend on network security. In addition, the success of sending and receiving sensitive data over wireless networks depends on the existence of a secure communication channel such as a Virtual Private Network (VPN) [11]. One of the methods used to provide secure communication is cryptography. Cryptography (sometimes referred to as encipherment) converts plain text into an encoded, unreadable form [9]. An encryption method uses what is known as an encryption key to hide the contents of a plain text (to make it unintelligible); without knowing the decryption key it is difficult to determine what the plain text is. In computer networks, sensitive data are encrypted on the sender side in order to keep them hidden and protected from
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 263-271
The Society of Digital Information and Wireless Communications (SDIWC) 2012 (ISSN: 2305-0012)
unauthorized access, and then sent via the network. When the data are received they are decrypted using an algorithm and zero or more encryption keys, as described in "Fig. 1". Decryption is the process of converting data from encrypted format back to their original format [3]. Data encryption becomes an important issue when sensitive data are to be sent through a network where unauthorized users may attack it. Such attacks include IP spoofing, in which intruders create packets with false IP addresses and exploit applications that use authentication based on IP, and packet sniffing, in which hackers read transmitted information. E-mail is one of the applications attacked by hackers. Many companies provide e-mail services, such as Gmail, Hotmail and Yahoo Mail. These companies need to provide the user with a certain data capacity and access speed, as well as a certain level of security. Security is an important consideration when choosing web mail [14].
Some of the techniques used to verify user identity (i.e. to verify that the user sending a message is who he claims to be) are the digital signature and the digital certificate [5]. Digital signatures and digital certificates are not the focus of this research.
There are several standard approaches used in cryptography, such as private-key (also known as symmetric, conventional, or secret-key), public-key (also known as asymmetric), digital signatures, and hash functions [17]. In private-key cryptography, a single key is used for both encryption and decryption. This requires that each individual possess a copy of the key, and the key must be passed to the other individual over a secure channel [15]. Private-key algorithms are very fast and easily implemented in hardware, and are therefore commonly used for bulk data encryption. There are two main types of private-key encryption: stream ciphers and block ciphers [1].
Figure 1. Encryption and decryption with a secure channel for key exchange: the sender turns the plaintext ("Here is a text message") into cipher text (e.g. "#%XYZ#$") using the encryption key; the receiver applies the decryption key to recover the plaintext.
In stream ciphers a given text is encrypted one byte or one bit at a time, whereas in block ciphers a given text is divided into chunks which are then encrypted using an encryption algorithm. Examples of stream ciphers are RC4 and the one-time pad; examples of block ciphers are DES and AES [15].
Data encryption can be performed serially or in parallel; parallel encryption is used to speed up cryptographic transformations. In block cipher algorithms such as DES, some modes of operation, such as CBC and CFB, execute serially, while others, such as ECB and OFB, can execute in parallel [10]. Parallel encryption is not the focus of this research. In this research we focus on stream ciphers rather than block ciphers.
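As a concrete illustration of the stream-cipher idea, a one-time-pad style XOR cipher encrypts one byte at a time (a generic sketch for exposition only; it is not part of the TTIE algorithm, and the function names are ours):

```python
import os

def xor_stream(data: bytes, keystream: bytes) -> bytes:
    # XOR each data byte with the matching keystream byte.
    # The same operation both encrypts and decrypts.
    return bytes(d ^ k for d, k in zip(data, keystream))

plaintext = b"attack at dawn"
key = os.urandom(len(plaintext))      # one-time pad: key as long as the message
cipher = xor_stream(plaintext, key)   # encrypt
assert xor_stream(cipher, key) == plaintext  # applying it again decrypts
```

Because each byte is processed independently as it arrives, no buffering into fixed-size blocks is needed, which is the defining property of a stream cipher.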
The main components of symmetric encryption are the plaintext, the encryption algorithm, the secret key, the cipher text and the decryption algorithm. The plaintext is the text before the encryption algorithm is applied; it is one of the inputs to the encryption algorithm. The encryption algorithm transforms the data from plaintext to cipher text. The secret key is a value independent of both the encryption algorithm and the plaintext, and is the other input to the encryption algorithm. The cipher text is the scrambled text produced as output. The decryption algorithm is the encryption algorithm run in reverse [16, 3, 14].
Public-key encryption uses two distinct but mathematically related keys: a public key and a private key. The public key is the non-secret key that is available to anyone you choose (it is often made available through a digital certificate). The private key is kept in a secure location and used only by its owner. When data are sent, they are protected with secret-key encryption, and the secret key itself is encrypted with the public key. The encrypted secret key is then transmitted to the recipient along with the encrypted data. The recipient uses the private key to decrypt the secret key, and the secret key is then used to decrypt the message itself. This way the data can be sent over insecure communication channels [16]. Examples of public-key encryption are Pretty Good Privacy (PGP) and RSA. PGP is one of the most widely used public-key encryption methods. RSA [12] is based on the product of two very large prime numbers (greater than 10^100); the idea behind the RSA algorithm is that it is difficult to determine the prime factors of such large numbers. There are other algorithms used to create public keys, such as ElGamal and Rabin, but these algorithms are not as common as RSA [9].
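The RSA idea can be illustrated with a toy example using tiny primes (for exposition only; real RSA uses primes greater than 10^100, and this classic small-number example offers no security):

```python
# Toy RSA: key generation, encryption, and decryption with tiny primes.
p, q = 61, 53
n = p * q                  # 3233, the public modulus
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent (modular inverse): 2753

m = 65                     # message encoded as a number smaller than n
c = pow(m, e, n)           # encrypt with the public key (e, n)
assert pow(c, d, n) == m   # decrypt with the private key (d, n) recovers m
```

The security rests on the gap between multiplying p and q (easy) and recovering them from n (hard when p and q are very large).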
In this paper, we propose a new data
encryption algorithm based on
symmetric encryption technique. We
propose to encrypt a given text into an
image.
2 RELATED WORK
Bh et al. [2] proposed encoding and decoding text messages in an implementation of Elliptic Curve Cryptography, a public-key technique, using Koblitz's method [7, 8]. In their work, each point on the curve represents one character of the text message: as the message is parsed, each character is encoded by its ASCII code, and the ASCII value is then encoded to one point on the curve, and so on. Our work differs from theirs: they used a public-key technique, whereas we use a private-key technique, and they encoded each character by its ASCII value, whereas we encode each character by one pixel (three integer values: R for red, G for green and B for blue).
Singh and Gilhorta [15] proposed encrypting a word of text to a floating-point number in the range 0 to 1. The floating-point number is converted into a binary number, and a one-time key is then used to encrypt this binary number. In this paper, we instead encode each character by one pixel (three integer values R, G and B).
Kiran et al. [6] proposed a new method for data encryption. In their method the original text (plain text) is arranged into a two-directional circular queue in a matrix, say A, of a given size, say m × n. Data encryption then relies on disordering the matrix: transformation operations are performed on the rows or columns of A a number of times. They proposed three types of transformation operations on A, encoded as follows: 0 for circular left shift, 1 for circular right shift, and 2 for reverse. The matrix disordering is carried out by generating a positive random number, say R, which is converted to a binary number. The decision whether to perform a row or a column transformation is based on the values of the individual bits in this binary number: if a bit is 0 a row transformation is performed, otherwise (if the bit is 1) a column transformation is performed. To determine which transformation operation should be carried out, another random number is generated and divided by 3; the remainder of the division (0, 1, or 2) selects the operation. In the case of a row transformation, two distinct rows are selected randomly by generating two distinct random numbers, say R1 and R2. Another two distinct random numbers, c1 and c2, are generated to represent two distinct columns; these determine the range of rows over which the transformation is performed. After the completion of each transformation a sub-key is generated and stored in a key file. The key file is then sent to the receiver to be used as the decryption key. The sub-key format is (T, Op, R1, R2, Min, Max), where:
T: whether the transformation applies to rows or columns.
Op: the operation type coded as 0, 1, or 2, i.e. circular left shift, circular right shift, or reverse.
R1 and R2: the two random rows or columns.
Min, Max: the minimum and maximum values of the range for the selected R1 and R2.
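The three coded transformation operations of Kiran et al. [6] can be sketched as follows (our own simplified reconstruction for illustration; function and variable names are ours, and the random sub-key machinery is omitted):

```python
def circular_left(seq):    # operation code 0
    return seq[1:] + seq[:1]

def circular_right(seq):   # operation code 1
    return seq[-1:] + seq[:-1]

def reverse(seq):          # operation code 2
    return seq[::-1]

OPS = {0: circular_left, 1: circular_right, 2: reverse}

def apply_op(matrix, op_code, index, on_rows=True):
    # Apply one coded transformation to a single row (or column) of the matrix.
    if on_rows:
        matrix[index] = OPS[op_code](matrix[index])
    else:
        col = OPS[op_code]([row[index] for row in matrix])
        for row, value in zip(matrix, col):
            row[index] = value
    return matrix

m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
apply_op(m, 0, 0)          # circular left shift of row 0 -> [2, 3, 1]
```

Repeating such operations with randomly chosen codes and indices, and recording each step as a sub-key, yields the matrix disordering described above.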
3 OUR ALGORITHM

Here we describe the main features of our proposed algorithm, TTIE. The algorithm includes two main phases, namely the TTIE phase (where our contribution lies) and the ISE (Image Shuffle Encryption) phase. In the TTIE phase the plain text is transformed (encrypted) into an image: the plain text is concatenated into one string, and this string is stored in an array of characters, say C. For each
[Figure 2 here. Server side: key 1 is generated randomly (three random numbers per character, e.g. "c" -> R1,R2,R3); the plaintext (e.g. "cryptography") is encrypted into a pixel matrix; the matrix is shuffled with key 2 by swapping random columns/rows; the pixels are stored in an image "img" of type PNG, which is sent to the client. Client side: the pixels are read from "img", the matrix is re-shuffled to the original order using key 2, and the plaintext is recovered using key 1.]
character in C, one pixel of the resulting image is generated. Each pixel consists of three integers created randomly in advance, before the transformation (encryption) begins (see Fig. 3-A, key 1). Each of the three integer values represents one color, with values in the range 0 to 255. The result of this phase is a matrix, say M, in which each three contiguous columns of a given row represent one character of the original (plain) text. This is done in order to make it difficult for hackers to guess the plain text. To the best of our knowledge, no previous work has attempted to transform a text file into an image.
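The character-to-pixel transformation of the TTIE phase can be sketched as follows (a minimal illustrative reconstruction under our reading of the paper; the key here is drawn uniformly at random and omits the per-letter range refinement discussed in Section 4):

```python
import random
import string

# Key 1: map each lowercase letter to one random pixel (an R, G, B triple).
key1 = {ch: tuple(random.randint(0, 255) for _ in range(3))
        for ch in string.ascii_lowercase}

def text_to_pixels(plaintext, key):
    # Each character becomes one pixel, i.e. three contiguous color values.
    return [key[ch] for ch in plaintext if ch in key]

pixels = text_to_pixels("cryptography", key1)
assert len(pixels) == 12                 # one pixel per character
assert all(len(p) == 3 for p in pixels)  # each pixel is an R, G, B triple
```

Writing these pixels row by row into an image file yields the matrix M described above, where every three contiguous column values of a row encode one letter.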
The second phase is the ISE phase. The
work in this phase is based on a previous
work carried out by Kiran et al. [6]. In
the ISE phase the matrix M is shuffled a
number of times. The shuffle process
includes row swapping and column
swapping. In row swapping, two rows
are selected randomly and then swapped.
In column swapping two columns are
selected randomly and then swapped.
This matrix disordering makes it
difficult for hackers to guess the original
order of the matrix M. The shuffle key
(key 2) is shown in Fig. 3-B. These two
phases (the TTIE and the ISE) are
carried out on the sender machine (in
this paper it is the server machine) as
described in Fig. 2.
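The ISE shuffle can be sketched as follows (an illustrative reconstruction; only column swapping is shown, row swapping is analogous, and the recorded swap pairs play the role of key 2 so the receiver can reverse the shuffle):

```python
import random

def shuffle_columns(matrix, swaps=4):
    # Swap random pairs of columns, recording each pair as part of key 2.
    key2 = []
    n_cols = len(matrix[0])
    for _ in range(swaps):
        a, b = random.sample(range(n_cols), 2)
        for row in matrix:
            row[a], row[b] = row[b], row[a]
        key2.append((a, b))
    return key2

def unshuffle_columns(matrix, key2):
    # Undo the swaps in reverse order to recover the original matrix.
    for a, b in reversed(key2):
        for row in matrix:
            row[a], row[b] = row[b], row[a]

m = [[0, 5, 5, 12], [13, 17, 20, 25]]
original = [row[:] for row in m]
key2 = shuffle_columns(m)
unshuffle_columns(m, key2)
assert m == original   # the shuffle is exactly reversible given key 2
</```

Because each swap separates the three values that jointly encode a letter, the shuffled matrix no longer exposes letter boundaries.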
The encrypted message is then sent to the client machine, where it is decrypted using key 2 and key 1, respectively.

Figure 2. The main steps of the Text-to-Image Encryption (TTIE) algorithm.

(A) Part of Key 1:
0#5#5#12#13#17#20#25#25#30#32#32#37#41#37#47#52#53#55#56#60#68#69#68#78#74#79#88#82#86#9

(B) Part of Key 2:
5736834348:644:34:3641834:868:4348:644,34:364,438:1643,34::6413:316:33::6:4:38:364:138136::8313463:
4 OUR EXPERIMENT

Java NetBeans was used as the vehicle for our experiments. We built the client and server programs on different machines and then tested sending and receiving data on both sides. We used the following text message in our experiments:
"encryption is the conversion of data
into a form called a cipher text that
cannot be easily understood by
unauthorized people. decryption is the
process of converting encrypted data
back into its original form so it can be
understood. The use of encryption
decryption is as old as the art of
communication in wartime. a cipher
often incorrectly called a code can be
employed to keep the enemy from
obtaining the contents of transmissions.
technically a code is a means of
representing a signal without the intent
of keeping it secret.
examples are morse code and ascii
simple ciphers include the substitution of
letters for numbers the rotation of letters
in the alphabet and the scrambling of
voice signals by inverting the sideband
frequencies". [13].
"Fig. 3" shows part of the generated keys
namely "Key 1" and "Key 2" whereas
"Fig. 3" (A) shows the format of "Key
1". Each value is delimited by the #
symbol. The first three values (0, 5, 5)
represent one pixel in the result image.
In this pixel, R (the Red color value) = 0,
G (the Green color value) = 5, and B
(the Blue color value) = 5. In order to
guarantee that distinct letters have
unique colors i.e. unique RGB values,
we create 26 different ranges because of
26 alphabets. For example, these ranges
are unique subsets of the main set which
ranges from 0 to 255. The letter A may
be represented by RGB values in the
range from 0 to 9, the letter B may be
represented in the range from 10 to 19
and so on. This pixel (0, 5, 5) represents
the letter A. The next three values (12,
13, 17) are another pixel which
represents the letter B and so on.
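The unique-range idea can be sketched as follows (our own illustration; we use a range width of 256 // 26 = 9 values per letter so that all 26 disjoint ranges fit inside 0 to 255, a detail the width-10 example above would slightly overrun for the last letters):

```python
import random
import string

WIDTH = 256 // 26   # 9 values per letter keeps all 26 ranges inside 0..255

def generate_key1():
    # Give each letter a disjoint value range so distinct letters can never
    # collide: letter i draws its R, G, B from [i*WIDTH, i*WIDTH + WIDTH - 1].
    key = {}
    for i, ch in enumerate(string.ascii_lowercase):
        lo = i * WIDTH
        key[ch] = tuple(random.randint(lo, lo + WIDTH - 1) for _ in range(3))
    return key

key1 = generate_key1()
# A letter is recovered from a pixel by checking which range its values fall in.
r, g, b = key1["b"]
assert r // WIDTH == g // WIDTH == b // WIDTH == 1   # range index 1 -> "b"
```

The receiver, holding key 1, decodes each pixel by mapping its range index back to the corresponding letter.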
Figure 3. The format of Key 1 and Key 2.
Figure 4. Cipher text: the output of Text-to-Image Encryption.
"Fig. 3" (B) shows the format of "Key2".
Each two contiguous values represent
two columns in the matrix M. The first
pair in Key 2 is 375:364 which means
that column number 375 is swapped with
column number 364 and so on.
"Fig. 4" shows the cipher text (is the text
after it is encrypted as an image). The
image in Fig. 4 is zoomed out many
times to make it clear. In this image
pixels are created randomly and thus
they do not form a known shape like
tree, fish, mobile, etc. The image shown
in "Fig. 4" is sent to the client and on the
client side we decrypt the cipher text
shown in "Fig. 4" then we finally get the
original text message (i.e. the plain text).
5 ANALYSIS

In our algorithm each letter is represented by a random pixel, i.e., three random values, namely R, G and B. To attack the data, hackers need to guess the following:
1. That each three contiguous values represent one letter. Since we send the data as integer values, it is hard to guess that each three contiguous values represent one letter.

2. If a hacker is able to guess point 1, then he needs to guess which random numbers represent the letters A, B, C, and so on. In other words, a hacker needs to guess the value of key 1 ("Fig. 3"). Guessing the value of key 1 is difficult because we also shuffle (scramble) the matrix using key 2 (key 2 is based on the algorithm described in [6]). For example, suppose that the message we want to send is "abcd". Using key 1 ("Fig. 3" (A)), the random numbers generated for "a", "b", "c" and "d" are (0,5,5), (12,13,17), (20,25,25), and (30,32,32) respectively. The matrix before shuffling is described in Table 1. Table 2 describes the matrix after shuffling (a simple swap operation in which column 1 is swapped with column 2).
Table 1 Pixels before shuffling- each three
contiguous integers in a row represent one pixel
or one letter.
Letter R-value G-value B-value
A 0 5 5
B 12 13 17
C 20 25 25
D 30 32 32
Table 2 Pixels after column 1 is swapped with
column 2
Letter R-value G-value B-value
? 5 0 5
? 13 12 17
? 25 20 25
? 32 30 32
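The swap shown in Tables 1 and 2 can be reproduced directly (using the pixels of Table 1 and exchanging the first two columns):

```python
table1 = [[0, 5, 5], [12, 13, 17], [20, 25, 25], [30, 32, 32]]  # a, b, c, d

def swap_columns(matrix, a, b):
    # Return a copy of the matrix with columns a and b exchanged.
    out = [row[:] for row in matrix]
    for row in out:
        row[a], row[b] = row[b], row[a]
    return out

table2 = swap_columns(table1, 0, 1)
assert table2 == [[5, 0, 5], [13, 12, 17], [25, 20, 25], [32, 30, 32]]
```

The assertion matches Table 2 exactly: after the swap, the R and G values of every pixel have traded places, so the triples no longer line up with the letter ranges of key 1.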
Using statistical analysis, hackers may
guess the letters from Table-1. However,
it is very difficult for hackers to guess
the letters from Table-2 because the
order of the values RGB is changed. In
other words, each three contiguous
values RGB in Table-1 which represent
one letter are now distributed randomly
in Table-2 and thus make it difficult to
guess that letter even if hackers use
statistical analysis (a method involving
a statistical breakdown of byte patterns
such as the number of times any
particular value appears in the
encrypted output would quickly reveal
whether any potential patterns might
exist). Similarly, it is hard for "letter A
follows letter B" analysis to decrypt the
cipher text.
With a simple calculation, the number of possible permutations for encrypting 26 letters is

((256)^3)^26    (1)

Since each pixel consists of three values, and each of these values is in the range 0 to 255, choosing three values allows 256^3 permutations. We have 26 letters, and thus the number of permutations for 26 letters is ((256)^3)^26 = 256^78, which is approximately 6.96 × 10^187. The individual keys, key 1 and key 2, are generated afresh each time a new message is sent, in order to avoid regularity in the resulting cipher text.
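The key-space size in Equation (1) can be checked directly with exact integer arithmetic:

```python
# Each letter is one pixel of three values, each in 0..255: 256**3 choices.
per_letter = 256 ** 3
keyspace = per_letter ** 26      # independent pixel choices for 26 letters
assert keyspace == 256 ** 78     # ((256)^3)^26 = 256^78
print(len(str(keyspace)))        # the key space has 188 decimal digits
```

A brute-force search over a space of 188 decimal digits is far beyond exhaustive enumeration, which is the point of the analysis above.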
6 CONCLUSION AND FUTURE WORK

In this paper, we add another level of data security on top of the data security system proposed by Kiran et al. [6]. In our method we first encrypt the text into an image (a matrix of pixels) and then, based on the work of Kiran et al. [6], scramble the matrix into a new one, making it more difficult for hackers to guess the original text message. Our algorithm is suitable for text encryption in a network system as well as on individual offline machines. It is also useful for e-mail security, since all messages stored in the mailbox are displayed as images; even if someone leaves the e-mail page open, it is difficult for others to guess the meaning (the original text) of these images. In future work, we propose to investigate dividing the text into blocks, transforming each block into an image, and thus creating an individual key for each block. This will make it difficult for hackers to use a statistical approach to guess the color of each letter, since different colors will be assigned to the same letter when it appears in different blocks. In addition, we will investigate the efficiency of our proposed algorithm (TTIE) when a large-scale data collection (multiple gigabytes) is used.
ACKNOWLEDGMENT

I would like to acknowledge and extend my heartfelt gratitude to Al-Zaytoonah University for their financial support in carrying out this work successfully.
REFERENCES
[1] Bellare, M., Kilian J., and Rogaway, P.: The
Security of cipher block chaining. In
Proceedings of the Conference on Advances
in Cryptology (CRYPTO’94). Lecture Notes
in Computer Science, vol. 839 (1994).
[2] Bh, P., Chandravathi, D., Roja, P.: Encoding
and decoding of a message in the
implementation of Elliptic Curve
cryptography using Koblitz’s method.
International Journal of Computer Science
and Engineering, 2(5) (2010).
[3] Chan, A.: A Security framework for privacy-
preserving data aggregation in wireless
sensor networks. ACM transactions on
sensor networks 7(4) (2011).
[4] Chomsiri, T.: A Comparative Study of
Security Level of Hotmail, Gmail and Yahoo
Mail by Using Session Hijacking Hacking
Test. International Journal of Computer
Science and Network Security IJCSNS, 8(5)
(2008).
[5] Goldwasser, S., Micali, S., Rivest, R.L.: A Digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing 17(2), pp. 281-308 (1988).
[6] Kiran Kumar, M., Mukthyar Azam, S., and
Rasool, S.: Efficient digital encryption
algorithm based on matrix scrambling
technique. International Journal of Network
Security and its Applications (IJNSA), 2(4)
(2010).
[7] Koblitz, N.: Elliptic Curve cryptosystems. Mathematics of Computation, 48, pp. 203-209 (1987).
[8] Koblitz, N.: A Course in Number Theory and Cryptography. 2nd edition. Springer-Verlag (1994).
[9] Lakhtaria K. Protecting computer network
with encryption technique: A Study.
International Journal of u- and e-service,
Science and Technology 4(2) (2011).
[10] Pieprzyk, J. and Pointcheval, D.: Parallel Authentication and Public-Key Encryption. The Eighth Australasian Conference on Information Security and Privacy (ACISP '03), Wollongong, Australia. R. Safavi-Naini (Ed.), Springer-Verlag, LNCS (2003).
[11] Ramaraj, E., and Karthikeyan, S.: A New
Type of Network Security Protocol Using
Hybrid Encryption in Virtual Private
Networking. Journal of Computer Science
2(9) (2006).
[12] Rivest, R.L., Shamir, A. and Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2) (1978).
[13] SearchSecurity: Definition of Encryption. [online] Available at: http://searchsecurity.techtarget.com/definition/encryption. Accessed on 13-06-2012.
[14] Shannon, C. E.: Communication Theory of Secrecy Systems. Bell System Technical Journal (1949).
[15] Singh, A., Gilhorta, R.: Data security using
private key encryption system based on
arithmetic coding. International Journal of
Network Security and its Applications
(IJNSA), 3(3) (2011).
[16] Stallings, W.: Cryptography and Network Security: Principles and Practices, 4th edition. Prentice Hall. [online] Available at: http://www.filecrop.com/cryptography-and-network-security-4th-edition.html. Accessed on 1-Oct-2011.
[17] Zaidan, B., Zaidan A., Al-Frajat, A., Jalab,
H.: On the differences between hiding
Information and cryptography techniques:
An Overview. Journal of Applied Sciences
10(15) (2010).
An Analysis of Base Station Location Accuracy within Mobile-Cellular Networks

Liam Smit, Adrie Stander and Jacques Ophoff
Department of Information Systems, University of Cape Town, South Africa
[email protected], {adrie.stander, jacques.ophoff}@uct.ac.za

International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 272-279
The Society of Digital Information and Wireless Communications (SDIWC) 2012 (ISSN: 2305-0012)

Abstract—An important feature within a mobile-cellular network is that the location of a cellphone can be determined. As long as the cellphone is powered on, its location can always be traced to at least the cell from which it is receiving, or last received, signal from the cellular network. Such network-based methods of estimating the location of a cellphone are useful in cases where the cellphone user is unable or unwilling to reveal his or her location, and have practical value in digital forensic investigations. This study investigates the accuracy of using mobile-cellular network base station information for estimating the location of cellphones. Through quantitative analysis of mobile-cellular network base station data, large variations between the best and worst accuracy of recorded location information are exposed. Thus, depending on the requirements, base station locations may or may not be accurate enough for a particular application.

Index Terms—Mobile-cellular network, Base station, Cell-phone, Location, Information accuracy

I. INTRODUCTION

It is well known that the location of a cellphone, and thus the location of its user, can be determined with a certain degree of accuracy. This information can be used to offer various location-based services and creates the opportunity to build new information services that can be useful to both cellphone users and companies. In addition, location information can be used in other scenarios, such as providing law enforcement agencies with tracking data [1]. One example is that of a murder suspect being found by police after inserting his SIM card into the cellphone of a murder victim [2].

Location information can be used to aid police in tracking movements during investigations and locating suspects. However, it can also be valuable in tracing people for humanitarian reasons, such as search-and-rescue teams defining search areas for locating missing persons. By increasing the accuracy of location information the process of finding the cellphone and its user can be made faster, simpler, and cheaper. In borderline cases it can be the difference between finding someone in need of medical attention in time, or catching a suspect who would have otherwise escaped.

Many of the most feasible methods for estimating the location of a cellphone within a mobile-cellular network depend on using the locations of network base stations as known reference points from which to calculate the estimated position of the cellphone. The benefit of such network-based approaches is that no modifications to the handset or network are required. However, by using network, handset, or hybrid approaches the accuracy of location information can be improved [1].

This study investigates the accuracy with which the locations of network base stations are known, as inaccuracy can impair the ability of many of the most feasible methods to provide accurate cellphone location estimates. It starts by providing background information on current techniques for determining the location of a cellphone within a mobile-cellular network. Thereafter the research methodology followed in the investigation is discussed, followed by a report of the data collected. Finally, the findings are presented and the implications are highlighted.

II. BACKGROUND

Many handset and network techniques for determining location exist. The most widely known, using the internal hardware of the cellphone, is satellite positioning using GPS, but WiFi, Bluetooth, and augmented sensor networks can also be employed [3], [4], [5]. The accuracy of these techniques can vary depending on the technology, line-of-sight, and sensor network coverage [6]. An improvement is to use such hardware in combination with mobile-cellular network information, as in the case of Assisted-GPS (A-GPS), which uses network resources in the case of poor signal reception.

In addition, new algorithms have greatly improved the accuracy and efficiency with which a cellphone can calculate its position [7], [8]. However, major obstacles, including high energy usage and the non-availability of features in older cellphones, remain. Thus using location methods based primarily on mobile-cellular network information is widespread.

Global System for Mobile Communications (GSM) networks were not originally designed to calculate locations for the cellphones which access and make use of the network. Many methods have been proposed and developed to be retrofitted to existing networks [9]. There are a range of accuracies and costs associated with the various methods. The following are the most feasible methods, in order of increasing potential accuracy.

• Cell identification (Cell ID) is the simplest location estimation method available, but also the least accurate. The estimated area is at best a wedge-shaped area, comprising roughly a third of the cell (for three-sectored sites), but can include the entire circular area for sites using omni-directional antennas in low-density single-sector cells [10].
• Round Trip Time (RTT) is merely a measure of distance from the base station, calculated from the time taken by a radio signal to travel from the base station to the cellphone and back. It provides a drastic reduction in the estimated location area compared to the Cell ID method for the same site.

• Cell ID and RTT combines the aforementioned methods to provide an estimated location for the cellphone where these areas overlap [11].

• Observed Time Difference of Arrival (OTDOA) uses hyperbolic arcs from three (or more) base stations to estimate the location of a cellphone. These arcs are determined by the distance that the radio signals travel in the measured time (i.e. the difference) [12].

• Angle of Arrival (AOA) is a seemingly practical solution due to its straightforward method of calculating an estimated location from the intersection of the bearings to the cellphone provided by each base station. In practice this method requires expensive antenna arrays, which limit its feasibility despite its potential for high accuracy [10].
It is important to bear in mind that all of the above methods estimate the location of the cellphone, and thus its user, relative to the location of the base station. Next follows a discussion of factors impacting on accuracy and ways of negating these factors.
A. Factors that negatively impact accuracy
There are a number of well recognized challenges to accurately determining the location of cellphones. In addition to degrading accuracy, these challenges can also increase the cost of estimating location. The challenges include non-line-of-sight and multi-path propagation of radio waves, the near-far effect in Code Division Multiple Access (CDMA) based third generation networks [12], base station density (or lack thereof) and the accuracy of base station locations [13], optimisations for network capacity, and the unsynchronised nature of Universal Mobile Telecommunications System (UMTS) type networks [14].
There are varying levels of accuracy inherent to the methodsand combinations thereof, as well as the enhancements whichhave been implemented for a particular method. In order ofincreasing accuracy: Cell ID (the whole area of a circularcell), Cell ID and sector (the area of the wedge), Cell ID andRTT (circular band), Cell ID and the intersection of multipleRTT determined hyperbolic arcs and A-GPS (outdoor onlyand which requires GPS functionality to be available in thecellphone) [15]. Pilot correlation method (PCM) has been leftout of the list as it can be made as accurate as the fidelity ofthe spacing of the measurement sites [16].
Certain base stations with low utilisation, in small towns for example, will not be sectored and there will only be one site. It will be possible to obtain a circular band from RTT calculations, but to achieve a more precise location will require adding another measurement technique such as PCM or probabilistic fingerprinting [17].
B. Methods of improving accuracy
To address these challenges there are various solutions and enhancements to the methods for estimating location. Less accurate measurements can be identified and then discarded, re-weighted or adjusted. It is feasible to use more than the minimum number of required data points, to use other methods which are not impacted by inaccurate measurements, and to improve the precision of data by employing high fidelity measurements and oversampling [15]. It is also possible to employ techniques such as forced soft handover, and to minimise problems by using methods which are not negatively affected by challenges such as non-line-of-sight or multi-path radio wave propagation.
The methods of estimating location can be organised into two groups. The first group consists of those methods which do not depend on base station location and are thus unaffected by the accuracy with which these locations are known. These methods include A-GPS, PCM [16], probabilistic fingerprinting [17], bulk map-matching, and the centroid algorithm [18].
The second group consists of methods which estimate the location of the cellphone and its user relative to the location of the base station and are therefore dependent on the accuracy with which these network base station locations are known. These include the Cell ID based methods of Cell ID, Cell ID and RTT, enhanced Cell ID and RTT, as well as cell polygons and RTT [15]. The Time of Arrival (TOA) and OTDOA methods, as well as enhancements such as cumulative virtual blanking, are affected in a similar fashion, although this may have more of an impact as these methods are meant to deliver greater accuracy than the Cell ID based methods [14]. While not very widespread in implementation, the methods of AOA and the TOA to Time Difference of Arrival algorithm are also negatively impacted [12].
There is a range of direct and indirect costs that can be attributed to most methods. The greater the work involved in network configuration, the larger the amount of additional hardware, and the more involved the deployment, the higher the cost. Some methods require more human intervention to set up, such as PCM and probabilistic fingerprint matching, whilst others might require additional hardware, such as OTDOA requiring location measurement units. There is also the possibility that certain methods will reduce the network capacity. Thus it is vitally important to the network operator that existing infrastructure information (i.e. network base station locations) is as accurate as possible, to minimise and manage further costs to improve accuracy.
In summary, it can be seen that there are many methods of determining the location of a cellphone within a mobile-cellular network. While some of these are not dependent on base station location, the majority of network-based methods are. The accuracy of such data is thus the main focus of this study.
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 272-279. The Society of Digital Information and Wireless Communications (SDIWC) 2012 (ISSN: 2305-0012)
III. RESEARCH METHODOLOGY
A quantitative analysis of base station information in a Southern African mobile-cellular network was performed. The population consisted of all active base stations that form part of the network. Any base station that was operational on the network (including those that had recently gone live or were scheduled to be replaced) was included due to the possibility that such a base station could participate in estimating the location of cellphones.
To evaluate the accuracy of base station locations, their recorded locations had to be compared to observations of their actual locations. For each base station a GPS location in a valid number format was stored in the network database. The method used to measure the base station's actual observed location, in order to compare it to the stored value, also served to validate the stored value.
As this is a time consuming process, it was not performed for all base station sites. Instead, the entire population consisting of all available recorded base station locations was sampled. All sub-populations needed to be represented in the sample in order to compare their results for commonalities or differences. Each of the ten regions which comprise the Southern African network was individually queried to find a list of sites that contain operational base stations. The sampling interval was determined by taking the number of sites and dividing it by the desired minimum sample size of thirty base stations for each region. The sampling interval was then rounded down in order to provide some spare sample base station locations in the event of being unable to locate one or more of the selected base stations and having to select another. A sampling method of a random starting number followed by periodic sampling was employed.
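The systematic sampling procedure just described (interval = number of sites divided by the minimum sample size, rounded down, with a random starting point) could be sketched as follows. The site identifiers are hypothetical, for illustration only.

```python
import random

def periodic_sample(site_ids, min_sample_size=30):
    """Systematic sampling with a random start, as described above.

    The interval is rounded down so slightly more than the minimum
    number of sites is selected, leaving spares in case a base
    station cannot be located on the aerial photograph.
    """
    interval = max(1, len(site_ids) // min_sample_size)  # round down
    start = random.randrange(interval)
    return site_ids[start::interval]

# A hypothetical region with 400 operational sites yields an
# interval of 13 and a sample of roughly 31 sites.
sample = periodic_sample([f"site-{n}" for n in range(400)])
```

The rounding-down of the interval is what provides the spare samples mentioned in the text: a smaller interval selects more sites than the bare minimum of thirty.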
For each sample the latitude and longitude were entered into Google Maps [19] with maximum zoom enabled, together with the ‘Satellite’ and ‘Show labels’ options selected. The resulting aerial photograph was examined to identify the presence of a base station. If the base station could be identified then its position was measured using a set procedure:
• The map was centred on the base of the sampled base station using the ‘Right-Click’ and ‘Center map here’ function.
• The latitude and longitude of the map centred on the base station were copied via the ‘Link’ function.
For each base station that was found by the above process, the following additional information was captured in a spreadsheet to add to the original recorded base station location:
• The base station's location was categorised as serving either: 1) a population centre (city, town, suburb, village, township, commercial or industrial area), or 2) an area outside of a population centre (mountains, road, farms or mines).
• Categorising information was captured for each base station location: 1) technology generation (second and/or third), and 2) equipment vendor.
Fig. 1. Aerial view of palm tree
Fig. 2. ‘Street View’ of palm tree
• The GPS coordinates of the recorded and measured locations were then used to calculate the difference in metres between the two using the ‘Great Circle’ method: 1) employ the law of cosines, 2) convert to radians, and 3) multiply by the radius of Earth.
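The ‘Great Circle’ calculation above can be expressed as a short sketch using the spherical law of cosines. The mean Earth radius of 6,371 km is an assumption, as the paper does not state which radius value was used.

```python
from math import radians, sin, cos, acos

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; assumed, not stated in the paper

def great_circle_m(lat1, lon1, lat2, lon2):
    """Distance in metres between two GPS coordinates via the
    spherical law of cosines: convert to radians, apply the law of
    cosines, and multiply by the radius of Earth."""
    p1, p2 = radians(lat1), radians(lat2)
    dlon = radians(lon2 - lon1)
    # Clamp to [-1, 1] to guard against floating-point drift for
    # near-identical points before taking the arc cosine.
    c = sin(p1) * sin(p2) + cos(p1) * cos(p2) * cos(dlon)
    return EARTH_RADIUS_M * acos(max(-1.0, min(1.0, c)))
```

For the sub-kilometre deviations reported in this study, the spherical approximation differs from an ellipsoidal calculation by far less than the deviations being measured.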
If a base station could not be identified from the aerial photograph then the Google Maps ‘Street View’ function was used to assist with identifying the base station location. If the base station still could not be detected then it was discarded, and the next base station was selected and the identification and measuring process repeated. Reasons for not being able to identify a base station included unclear satellite photographs, the use of camouflage, and multiple base stations in close proximity to each other. An example of the difficulty in identifying structures is illustrated in Figures 1 and 2, which show an aerial and a ‘Street View’ image of a base station camouflaged as a palm tree.
The first stage of analysis consisted of categorising the collected data into various categories, such as geographic region, technology type, vendor, site owner, and whether or not the base station serves a population centre. This was followed by finding the minimum (best accuracy), maximum (worst accuracy), median, average and standard deviation values for the location accuracy data in each category. Accuracy results for base stations were placed into categories of various intervals of accuracy to better allow for evaluation in terms of desired levels of accuracy of the base station locations for
TABLE I
SUMMARY OF ENTIRE SAMPLE

Interval Spacing | STDV   | Worst | Best | AVG   | Median | Sample Size
5                | 152.38 | 1634  | 0.52 | 77.04 | 25.38  | 369
varying applications.

The preceding steps allowed for comparisons between different categories to see if there were differences or similarities in terms of accuracy. By identifying the base station sites for which the recorded location accuracy was far worse and categorising them as outliers, these sites could be revisited in an attempt to find out why they differed so markedly from the rest of the base station locations in the category.
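The per-category summary statistics described above (best, worst, average, median, standard deviation and sample size) can be sketched as follows. The records shown are illustrative values, not the study's data.

```python
import statistics
from collections import defaultdict

def summarise(records):
    """Group (category, accuracy-in-metres) pairs and compute the
    summary statistics used in the analysis for each category."""
    by_category = defaultdict(list)
    for category, accuracy in records:
        by_category[category].append(accuracy)
    return {
        cat: {
            "best": min(vals),    # smallest deviation = best accuracy
            "worst": max(vals),   # largest deviation = worst accuracy
            "avg": statistics.mean(vals),
            "median": statistics.median(vals),
            "stdv": statistics.stdev(vals) if len(vals) > 1 else 0.0,
            "sample_size": len(vals),
        }
        for cat, vals in by_category.items()
    }

# Illustrative records only, not the study's data:
stats = summarise([("KZN", 5.2), ("KZN", 12.0),
                   ("LES", 850.0), ("LES", 40.0), ("LES", 22.5)])
```

A single extreme value, like the 850 m deviation above, dominates both the average and the standard deviation of its category, which is why the median is also reported.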
IV. DATA ANALYSIS
Due to the nature of how the network database was constructed, the location data was both complete and in a valid number format. Accuracy was examined for the entire sample as well as for the various categories of base stations. The best, worst, average (AVG) and median accuracies, together with the standard deviation (STDV), were calculated and are shown in Table I.
By starting with a high level overview of all sampled base station locations it is possible to gain an understanding of the range of accuracies for the overall sample population. The data is represented in Figure 3 as a cumulative percentage of the base stations for a given level of accuracy. For example, 66.67 percent of base stations have a recorded location that is accurate to within 50 metres of the measured location, while 80 percent of recorded base station locations are accurate to within 100 metres of their measured locations.
In a near ideal situation 100 percent of the base station locations would be accurate to less than two and a half metres rounded down, with zero deviation remaining the ultimate prize. This would result in a vertical line at zero metres from zero to 100 percent (of base stations), after which it would then make a ninety degree turn to the right, indicating that all base station locations are accurate to within the distances given on the X axis.
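The cumulative percentage curve just described can be computed directly from per-station accuracy values. A minimal sketch with illustrative numbers (not the study's data):

```python
def cumulative_percentage(accuracies, threshold_m):
    """Percentage of base stations whose recorded location lies
    within `threshold_m` metres of the measured location."""
    within = sum(1 for a in accuracies if a <= threshold_m)
    return 100.0 * within / len(accuracies)

# Illustrative deviations in metres, not the study's data:
accs = [3, 12, 25, 48, 60, 95, 140, 300, 900, 1600]
pct = cumulative_percentage(accs, 50)  # four of ten are within 50 m
```

Evaluating this function over a range of thresholds and plotting the results produces a curve of the kind shown in Figure 3.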
Fig. 3. Entire Sample
Fig. 4. Map of South Africa [20]
Fig. 5. Distribution per region
A. Regions
The base stations that comprise the sample are situated in ten regions. These regions are Central (CEN), Eastern (EAS), KwaZulu Natal (KZN), Lesotho (LES), Limpopo (LIM), Mpumalanga (MPU), as well as Northern (NGA), Central (SGC) and Southern Gauteng (SGS), and lastly Western (WES). These regions correspond in area to the provinces of South Africa, which are illustrated in Figure 4 for reference. Figure 5 shows the distribution graph for these regions.
The KwaZulu Natal region stands out markedly as having the best average and median accuracy values. It also has the lowest worst accuracy figure, which, all told, results in it having the lowest standard deviation.
The Lesotho region has an extremely large worst accuracy figure, which results in it having the worst average and the highest standard deviation of all the regions.
The Central Gauteng region stands out for having the highest median value, despite not having a large worst value. The accuracy of the Central Gauteng is lower than that of the
Fig. 6. Vendors
Lesotho and Southern Gauteng regions for the cumulative most accurate 80 percent of base stations portrayed in Figure 5. It lags the other regions until the 160 metres of accuracy level is reached, where it then begins to rapidly surpass the cumulative percentage of the other regions. In addition to the Central Gauteng and Lesotho regions, the Southern Gauteng and Northern Gauteng regions also lag behind the accuracy of the more accurate regions.
B. Vendors
The sampled base stations can also be categorised by the network equipment vendors that supply them. These base station vendors are Alcatel, Huawei, Motorola and Siemens. As before, the highest (worst) numbers have been marked in bold and the lowest (best) numbers have been italicised in addition to being marked in bold.
Looking at Table II it is clear that Siemens offers the best overall accuracy of the vendors and Huawei the worst, with Alcatel and Motorola falling in between these two extremes.
However, when analysing Figure 6, it is apparent that Alcatel offers the best accuracy for the most accurate cumulative 85 percent of its base stations that were measured (up to 110 metres difference between recorded and measured locations). Only when the last 15 percent of the base stations, with accuracies worse than 110 metres, are included is it overtaken by Siemens. The accuracy of the base station location information for Huawei is confirmed as the lowest of the four vendors, with Motorola assuming a position between it and the two more
TABLE II
BASE STATION DATA CATEGORISED BY VENDORS

Vendor   | STDV   | Worst  | Best | AVG   | Median | Sample Size
Alcatel  | 141.77 | 879.32 | 0.52 | 68.14 | 19.98  | 121
Huawei   | 133.76 | 849.44 | 1.73 | 86.8  | 36.59  | 94
Motorola | 170.9  | 1634   | 1    | 77.12 | 25.27  | 150
Siemens  | 62.05  | 296.55 | 1.99 | 47.52 | 19.35  | 94
Fig. 7. Technology generation
accurate vendors.
C. Technology generation
When categorising base station locations by technology generation (for example second or third) there are three categories. This is due to the co-location of base stations of different generations on the same sites. It is, however, not a simple ‘one for one’ correlation, but rather a case where a site which has a second generation base station on it may also have a third generation base station on it, but the converse is not necessarily true. This results in the three categories of sites:
1) Those with only second generation base stations (2nd Only).
2) Those with both third and second generation base stations (3rd & 2nd).
3) Those with second generation base stations which will possibly, but not necessarily, also include third generation base stations (2nd (incl. 3rd)).
In comparing the sites in Figure 7 it becomes clear that the locations of those sites that contain third (and second) generation base stations are known with better accuracy than those containing only second generation base stations.
Sites that contain second generation base stations, and possibly include third generation base stations, tend to fall in the middle. Unfortunately there is no set of sites that contain only third generation base stations and which would enable the comparison of sites that contain only second generation base stations to those that contain only third generation base stations.
D. Site owner
Base station sites are not necessarily used exclusively by the owner of the sites. This leads to a situation where some base stations are installed on sites that belong to another network operator. The “Own” network sites constitute the vast majority of the sampled base station locations. As such, it was necessary to combine the sites from the other operators into a single category, “Other”, in order to achieve a meaningful sample size.
According to Table III, despite the “Own” category containing a very large worst accuracy figure and being only slightly worse for best accuracy, it offers better overall accuracy as shown by all other metrics.
When reviewing Figure 8, for any cumulative percentage, the “Own” category has a lower (better) accuracy measure for base station locations than the “Other” category for at least the first cumulative 95 percent of most accurate recorded locations.
E. Population centres
Base station locations contain base stations that either serve centres of population or the areas in between them. Base stations serving population centres have a higher median value than those serving the areas between population centres. However, Figure 9 shows that base stations in population centres only have better accuracy once the last (most inaccurate) 15 percent of the base station locations are included.
F. Outliers
Outliers were defined as the ten percent of the total sample with the worst accuracy. Notably this category covers all regions except for the KwaZulu Natal region, and with only one base station location for the Western region. In Table IV the results for the ten percent least accurate base station locations are presented. Even looking past the ‘Worst’ accuracy figure and instead at the average, median or even the ‘Best’ figures, the outlier locations are clearly very inaccurate.
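The outlier definition above (the least accurate ten percent of the sample) amounts to selecting the top decile by deviation. A minimal sketch:

```python
def worst_decile(accuracies):
    """Return the ten percent of sampled locations with the largest
    deviation between recorded and measured positions, i.e. the
    outliers as defined in the text."""
    n = max(1, len(accuracies) // 10)  # at least one sample
    return sorted(accuracies, reverse=True)[:n]
```

Applied to the 369-station sample this yields the 38 outlier locations whose statistics are given in Table IV.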
To gain an understanding of why outliers occur and how their accuracies can be so poor, examples of outliers were selected to illustrate the difference in recorded versus measured accuracy.
TABLE III
BASE STATION DATA CATEGORISED BY SITE OWNER

Site owner | STDV   | Worst  | Best | AVG    | Median | Sample Size
Own        | 151.05 | 1634   | 1    | 73.07  | 25.07  | 318
Other      | 161.93 | 879.32 | 0.52 | 105.61 | 49.14  | 49
Fig. 8. Site owner
TABLE IV
BASE STATION OUTLIERS

Interval Spacing | STDV   | Worst | Best   | AVG    | Median | Sample Size
25               | 297.67 | 1634  | 178.65 | 410.15 | 303.92 | 38
In Figure 10 the location of the access road (marked with a red ‘A’) which is used to reach the base station has been recorded instead of the location of the base station itself (marked with six red dots). This Northern Gauteng region base station serves a population centre, but its location is off by 324 metres.
The Pretoria University building (tagged with a green arrow) in Figure 11 has been recorded instead of the actual location of the base station (indicated by six red dots) on the grounds. This base station serves a population centre in the Northern Gauteng region. It has a difference of 178.5 metres between its recorded and measured locations.
Figure 12 shows that while the recorded location (marked by the red ‘A’) is atop the same mountain in the Central region, it does not follow the track all the way to the base station (circled with red dots). This results in a deviation of 879 metres from the measured location of the base station, which serves a
Fig. 9. Population centres
Fig. 10. Watloo Despatch
Fig. 11. Pretoria University
Fig. 12. Carnarvon
population centre at the foot of the mountain.

From the above data several points need to be considered. Firstly, there are the large outliers and standard deviations for all vendors, technology generations, site owners, and almost all regions. The KwaZulu Natal region was a notable exception to this pattern, proving by example that good accuracy is entirely possible. Secondly, one category could be cumulatively more accurate for the majority of its (more accurate) base station locations, but its least accurate base stations were so inaccurate that its overall accuracy would drop below that of another category. Lastly, the extent of the inaccuracy for the outliers was so great that it warranted further assessment. This revealed the ease with which highly inaccurate locations could be recorded.
V. CONCLUSIONS
This paper builds on previous research, emphasising the importance of accurately knowing base station locations for cellphone localisation [12], [21]. The nature of this study allows it to be replicated in any country and for any technology type or other category of base station site. The resulting data shows that, depending on the requirements, base station locations may or may not be accurate enough for a particular application. This could have serious implications when the data is used for security-related incidents.
Base station accuracies ranged from less than one metre to more than 1600 metres. Fifty percent of base stations were accurate to 25 metres (rounded) and 80 percent were accurate to 100 metres (rounded). However, to include 90 percent of base stations it would be necessary to accept base station locations that were off by 180 metres (rounded). The deviation of the least accurate ten percent of base station locations ranged from 179 to 1634 metres. The significance of these inaccuracies and their impact would depend on the particular application and its requirement for accuracy. When investigating outliers a discernible pattern emerged, revealing that the recorded locations were actually the access point or the access road to the base station rather than the base station itself.
Network operators can improve the accuracy of the estimated locations that they are able to provide by increasing the accuracy of recorded base station locations. This can be done by analysing and measuring aerial photographs, or through taking more accurate measurements when performing routine maintenance, upgrades or equipment swap-outs of base stations.
REFERENCES
[1] I.A. Junglas and R.T. Watson, “Location-based services,” Commun. ACM, vol. 51, no. 3, pp. 65–69, 2008.
[2] J. Warner, “Murder Suspect Caught,” Weekend Argus (Sept. 11), p. 4, 2010.
[3] V. Zeimpekis, G.M. Giaglis, and G. Lekakos, “A Taxonomy of Indoor and Outdoor Positioning Techniques for Mobile Location Services,” SIGecom Exch., vol. 3, no. 4, pp. 19–27, 2003.
[4] M. Hazas, J. Scott, and J. Krumm, “Location-Aware Computing Comes of Age,” Comput., vol. 37, no. 2, pp. 95–97, 2004.
[5] A. Kupper, Location-Based Services: Fundamentals and Operation. Chichester: Wiley, 2005.
[6] S. von Watzdorf and F. Michahelles, “Accuracy of Positioning Data on Smartphones,” in Proc. 3rd Int. Workshop on Location and the Web, Tokyo, Japan, 2010, pp. 1–4.
[7] M. Ibrahim and M. Youssef, “A Hidden Markov Model for Localization Using Low-End GSM Cell Phones,” in Proc. 2011 IEEE Int. Conf. on Communications (ICC), Cairo, Egypt, 2011, pp. 1–5.
[8] J. Paek, K. Kim, J.P. Singh, and R. Govindan, “Energy-Efficient Positioning for Smartphones using Cell-ID Sequence Matching,” in Proc. 9th Int. Conf. on Mobile Systems, Applications, and Services, Maryland, USA, 2011, pp. 293–306.
[9] W. Buchanan, J. Munoz, R. Manson, and K. Raja, “Analysis and Migration of Location-Finding Methods for GSM and 3G Networks,” in Proc. 5th IEE Int. Conf. on 3G Mobile Communication Technologies, Edinburgh, United Kingdom, 2004, pp. 352–358.
[10] J. Borkowski, “Performance of Cell ID+RTT Hybrid Positioning Method for UMTS,” M.Sc. thesis, Tampere University of Technology, Finland, 2004.
[11] J. Niemela and J. Borkowski. (2004) Topology planning considerations for capacity and location techniques in WCDMA radio networks. [Online]. Available: http://www.cs.tut.fi/tlt/RNG/publications/abstracts/topoplanning.shtml
[12] J.J. Caffery and G.L. Stuber, “Overview of Radiolocation in CDMA Cellular Systems,” IEEE Commun. Mag., vol. 36, no. 4, pp. 38–45, 1998.
[13] M. Mohr, C. Edwards, and B. McCarthy, “A study of LBS accuracy in the UK and a novel approach to inferring the positioning technology employed,” Comput. Commun., vol. 31, no. 6, pp. 1148–1159, 2008.
[14] P.J. Duffett-Smith and M.D. Macnaughtan, “Precise UE Positioning in UMTS using Cumulative Virtual Blanking,” in Proc. 3rd Int. Conf. on 3G Mobile Communication Technologies, London, United Kingdom, 2002, pp. 355–359.
[15] J. Borkowski, J. Niemela, and J. Lempiainen. (2004) Location Techniques for UMTS Radio Networks. [Online]. Available: http://www.cs.tut.fi/tlt/RNG/publications/abstracts/UMTSlocation.shtml
[16] J. Borkowski and J. Lempiainen, “Pilot correlation positioning method for urban UMTS networks,” in Proc. 11th European Next Generation Wireless and Mobile Communications and Services Conf., Tampere, Finland, 2005, pp. 1–5.
[17] M. Ibrahim and M. Youssef, “CellSense: A Probabilistic RSSI-Based GSM Positioning System,” in Proc. 2010 IEEE Global Telecommunications Conf., Cairo, Egypt, 2010, pp. 1–5.
[18] A. Varshavsky, M.Y. Chen, E. de Lara, J. Froehlich, D. Haehnel, J. Hightower, A. LaMarca, F. Potter, T. Sohn, K. Tang, and I. Smith, “Are GSM phones THE solution for localization?” in Proc. 7th IEEE Workshop on Mobile Computing Systems and Applications, Washington, USA, 2006, pp. 20–28.
[19] Google. (2012) Google Maps. [Online]. Available: https://maps.google.com/
[20] Htonl. (2010) Map of South Africa (via Wikimedia Commons). [Online]. Available: http://commons.wikimedia.org/wiki/File:Map of South Africa with English labels.svg
[21] J. Yang, A. Varshavsky, H. Liu, Y. Chen, and M. Gruteser, “Accuracy Characterization of Cell Tower Localization,” in Proc. 12th ACM Int. Conf. on Ubiquitous Computing, Copenhagen, Denmark, 2010, pp. 223–226.
Technical Security Metrics Model in Compliance with ISO/IEC 27001 Standard
M.P. Azuwa, Rabiah Ahmad, Shahrin Sahib and Solahuddin Shamsuddin
[email protected], {rabiah,shahrin}@utem.edu.my
ABSTRACT
Technical security metrics provide measurements for ensuring the effectiveness of technical security controls or technology devices/objects that are used in protecting information systems. However, a lack of understanding and of a method to develop technical security metrics may lead to unachievable security control objectives and inefficient implementation. This paper proposes a model of technical security metrics to measure the effectiveness of network security management. The measurement is based on the security performance of (1) network security controls such as firewalls, Intrusion Detection and Prevention Systems (IDPS), switches, wireless access points and network architecture; and (2) network services such as Hypertext Transfer Protocol Secure (HTTPS) and virtual private networks (VPN). The methodology used is the Plan-Do-Check-Act process model. The proposed technical security metrics provide guidance for organizations in complying with the requirements of the ISO/IEC 27001 Information Security Management System (ISMS) standard. The proposed model should also provide a comprehensive measurement and a guide to using the ISO/IEC 27004 ISMS Measurement standard.
KEYWORDS
Information security metrics, technical security metrics model, measurement, vulnerability assessment, ISO/IEC 27001:2005, ISO/IEC 27004:2009, Critical National Information Infrastructure.
1 INTRODUCTION
The phenomenon of rapid growth and the increasing number of cyber attacks has urged organizations to adopt security standards and guidelines. The International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC) have developed the ISO/IEC 27000 series of standards, which have been specifically reserved for information security matters. Through ISO/IEC 27001 Information Security Management System (ISMS) – Requirements [1], an organization may comply and obtain certification, increasing the level of protection for its information and information systems.

Information security metrics can be ineffective tools if organizations do not have data to measure, procedures or processes to follow, indicators to make good protection decisions, and people to develop them and report to the management. To be useful, measurements of information security effectiveness should be comparable. Comparisons are usually made on the basis of quantifiable measurement of a common characteristic. The main problems in information security metrics development are identified as: (i) a lack of clarity in defining quantitative, effective security metrics against the security standards and guidelines; (ii) a lack of a method to guide organizations in choosing security objectives, metrics and
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 280-288. The Society of Digital Information and Wireless Communications (SDIWC), 2012 (ISSN: 2305-0012)
measurements for mitigating current cyber attacks [2][3].
Hulitt and Vaughn [4] report a lack of clarity in a standard quantitative metric to describe an information system's level of compliance with the FISMA standard, even when a thorough and repeatable compliance assessment is conducted using the Risk Management Framework (RMF). Bellovin [5] remarks that defining metrics is hard, though not infeasible, because an attacker's effort is often linear even when exponential security work is needed. Those pursuing the development of a security metrics program should think of themselves as pioneers and be prepared to adjust strategies as experience dictates [6]. It is also known that ISO/IEC 27001 provides only generic guidance in developing security objectives and metrics, and there is still a lack of a method to guide organizations [2][3].
1.1 Information Security Metrics
In understanding the meaning of information security metrics, security practitioners and researchers have simplified their definitions of information security metrics and measures (as described in Table 1).
Table 1: Definitions of Information Security Metrics and Measures

Stoddard et al. [7]: A metric is a measurement that is compared to a scale or benchmark to produce a meaningful result. Metrics are a key component of risk management.

Savola [8]: A security metric is a quantitative and objective basis for security assurance. It eases making business and engineering decisions concerning information security. The metrics are derived from comparing two or more measurements taken over time with a predetermined baseline.

Brotby [9]: A metric is a term used to denote a measure based on a reference, involving at least two points: the measure and the reference. Security is the protection from, or absence of, danger. Security metrics are categorized by what they measure; the measures include process, performance, outcomes, quality, trends, conformance to standards and probabilities.

Masera et al. [10]: “Security metrics are indicators, and not measurements of security. Security metrics highly depend on the point of reference taken for the measurement, and shouldn't be considered as absolute values with respect to an external scale.”

Hallberg et al. [11]: “A security metric contains three main parts: a magnitude, a scale and an interpretation. The security values of systems are measured according to a specified magnitude and related to a scale. The interpretation prescribes the meaning of obtained security values.”

Lundholm et al. [12]: The measurement quantifies only a single dimension of the object of measurement and does not hold value (facilitate decision making) in itself. The metric is derived from two or more measurements to demonstrate an important correlation that can aid a decision.
From these definitions, we propose the following definition: an information security metric is a measurement standard for information security controls that can be quantified and reviewed to meet the security objectives. It facilitates the relevant actions for improvement, provides decision making and guides
compliance with security standards. Information security measurement is a process of measuring/assessing the effectiveness of information security controls; it can be described by the relevant measurement methods to quantify data, and the measurement results are comparable and reproducible. Hence, information security measurement is a subset of an information security metric.
1.2 Technical Security Metrics and
Measurement
We found that research activity on technical security metrics is very limited. There is also a lack of research on specific technical security metrics for measuring the technical security controls among the total of 133 security controls in the ISO/IEC 27001 standard.
Vaughn et al. [13] define a Technical Target of Assessment (TTOA) as a measure of how much a technical object, system or product is capable of providing assurance in terms of protection, detection and response.
According to Stoddard et al. [7],
technical security metrics are used to
assess technical objects, particularly
products or systems [8], against
standards; to compare such objects; or to
assess the risks inherent in such objects.
Additionally, technical security metrics should be able to evaluate the strength of resistance and response to attacks and weaknesses (in terms of threats, vulnerabilities, risks, and anticipated losses in the face of attack) [13]. At the same time, they indicate the security readiness with respect to a possible set of attack scenarios [10].
1.3 Effective Measurement
Requirement from ISO/IEC 27001
Standard
Information security measurement is a
mandatory requirement in ISO/IEC
27001 standard where it is indicated in a
few clauses in: 4.2.2(d) “Define how to
measure the effectiveness of the selected
controls or groups of controls and
specify how these measurements are to
be used to assess control effectiveness to
produce comparable and reproducible
results”, 4.2.3(c) “Measure the
effectiveness of controls to verify that
security requirements have been met”,
4.3.1(g) “documented procedures needed
by the organization to ensure the
effective planning, operation and control
of its information security processes and
describe how to measure the
effectiveness of controls”, 7.2(f) “results
from effectiveness measurements” and
7.3(e) “Improvement to how the
effectiveness of controls is being
measured”. The importance of
information security measurement is
well defined in these clauses.
2 SECURITY METRICS
DEVELOPMENT APPROACH
The development of technical security
metrics model (TSMM) is derived from
the following approach:
(1) Base the requirements of technical security controls on the ISO/IEC 27002 ISMS Code of Practice standard [14].
(2) Identify relevant security requirements.
(3) Achieve security performance objectives.
(4) Align to the risk assessment value.
(5) The development of technical security metrics should not be an extensive list, but should focus on the critical security controls that provide high impact to the organizations. According to Lennon [15], "the metrics must be prioritized to ensure that the final set selected for initial implementation facilitates improvement of high priority security control implementation. Based on current priorities, no more than 10 to 20 metrics at a time should be used. This ensures that an IT security metrics program will be manageable."
(6) Ensure ease of measurement.
(7) Provide the process to obtain data/evidence, and the method and formula to assess the security measurement.
(8) Address resistance and response to known and unknown attacks.
(9) Provide threshold values to determine the level of protection.
(10) Provide actions to improve.
(11) Comply with the ISO/IEC 27001 standard.
3 TECHNICAL SECURITY
METRICS MODEL (TSMM)
The development of TSMM is based on the Plan-Do-Check-Act (PDCA) model and is described in Figure 1.
3.1 PLAN Phase: (Selection of
Controls and Definition)
The focus is on the technical security
controls that will be extracted from the
total 133 security controls as stated in
the Annex A of ISO/IEC 27001
standard.
We define technical security metrics as a
measurement standard to address the
performance of security
countermeasures within the technical
security controls and to fulfill the
security requirements. The technical
security measures are based on
information security performance
objectives that can be accomplished by
quantifying the implementation,
efficiency, and effectiveness of security
controls.
ISO/IEC 27002 [14] provides the best
practice guidance in initiating,
implementing or maintaining the
security controls in the ISMS. This standard notes that "not all of the controls and guidance in this code of practice may be applicable and additional controls and guidelines not included in this standard may be required".
Federal Information Processing
Standards 200 (FIPS 200) [16] defines
technical controls as “the security
controls (i.e., safeguards or
countermeasures) for an information
system that are primarily implemented
and executed by the information system
through mechanisms contained in the
hardware, software, or firmware
components of the system”. These are
the basis of our definition for technical
security controls.
Based on NIST SP800-53 guidelines
[17], the technical security controls
comprise:
(1) Access Control (AC-19 controls)
(2) Audit and Accountability (AU-
14 controls)
(3) Identification and Authentication
(IA-8 controls)
(4) System and Communications
Protection (SC-34 controls)
The total of technical security controls from the NIST SP800-53 guidelines is seventy-five (75). In Appendix H of [18], the technical security controls are extracted from Table H-2, which maps the security controls in ISO/IEC 27001 (Annex A) to NIST Special Publication 800-53. We extract and analyze these technical security controls and discover that:
(1) The controls fall within three (3) main domains of ISO/IEC 27001 (Annex A):
A.10 Communications and operations management
A.11 Access control
A.12 Information systems acquisition, development and maintenance
(2) The initial total of technical security controls is forty-five (45).
(3) Some of the identified security controls only require a process or policy implementation and are not related to technical implementation, such as A.11.1.1 Access control policy, A.11.4.1 Policy on use of network services, A.11.5.1 Secure log-on procedures, A.11.6.2 Sensitive system isolation, A.11.7.2 Teleworking, A.12.3.1 Policy on the use of cryptographic controls and A.12.6.1 Control of technical vulnerabilities.
(4) There are relationships with other
security controls in NIST SP800-
53 document, including:
• Management controls:
Security Assessment and
Authorization (CA), Planning
(PL), System and Services
Acquisition (SA)
• Operational controls:
Configuration Management
(CM), Maintenance (MA),
Media Protection (MP),
Physical and Environmental
Protection (PE), Personnel
Security (PS), System and
Information Integrity (SI).
Figure 1: Technical Security Metrics Model
(TSMM)
The technical security controls should be practical, customized and measured according to the organization's business requirements and environment.
A risk management approach will be used in identifying the relevant security controls. A threat and vulnerability assessment will be carried out, and both impact and risk exposure will be identified to determine the prioritization of security controls.
Cyber-Risk Index: A cyber-risk index is used to evaluate the vulnerability and threat probabilities related to the success of current and future attacks. The Attack-Vulnerability-Damage (AVD) model [19] and the Common Vulnerability Scoring System (CVSS) Base Metric [20] are used to determine this weighted index. We will extend it to include the criticality or impact of loss to the organization. The CVSS base score is calculated using the information provided by the U.S. National Vulnerability Database (NVD) Common Vulnerability Scoring System Support v2 [21] and other relevant Cyber Emergency Response Team (CERT) advisories and reports.
3.2 DO Phase: (Effective
Measurement)
The security requirements describe the actual security functions of the technical security controls in protecting the information systems. Security functions include identification and authentication, access control, configurations/algorithms, architecture and communication.
A set of performance objectives is
developed for each security requirement.
Vulnerability Assessment (VA) Index: The VA index is derived by conducting a security or vulnerability assessment of the information systems through a simulation assessment, vulnerability scanning or penetration testing. It is based on the current assessment of potential attacks and is weighted using the numeric CVSS scores: "Low" severity (CVSS base score 0.0-3.9), "Medium" severity (CVSS base score 4.0-6.9) and "High" severity (CVSS base score 7.0-10.0). The VA index can also be derived from the Vulnerability-Exploits-Attack (VEA-bility) metric [22]. VEA-bility measures the security of a network as influenced by the severity of existing vulnerabilities, the distribution of services, the connectivity of hosts, and possible attack paths. These factors are modeled into three network dimensions: Vulnerability, Exploitability, and Attackability. The overall VEA-bility score, a numeric value in the range [0, 10], is a function of these three dimensions.
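The severity banding just described can be sketched as follows. The bands are the NVD CVSS v2 bands quoted in the text; the aggregation of several findings into one VA index (here a simple mean) is our own illustrative assumption, since the text does not fix a formula:

```python
def cvss_severity(base_score: float) -> str:
    """Map a CVSS v2 base score to the NVD severity bands cited in the text."""
    if not 0.0 <= base_score <= 10.0:
        raise ValueError("CVSS base score must be in [0.0, 10.0]")
    if base_score <= 3.9:
        return "Low"
    if base_score <= 6.9:
        return "Medium"
    return "High"

def va_index(scores):
    """Hypothetical VA index: mean CVSS base score over the assessment
    findings. Illustrative only -- the paper does not define the formula."""
    return sum(scores) / len(scores) if scores else 0.0

findings = [2.1, 5.0, 9.3]  # example scanner output
print([cvss_severity(s) for s in findings])  # ['Low', 'Medium', 'High']
print(round(va_index(findings), 2))          # 5.47
```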
At this phase, the data must be easily obtainable and the measurements must not be complicated. The measurement should be able to cater for current attacks (through audit reports and evidence of events) as well as future attacks.
3.3 CHECK Phase: (Security
Indicators and Corrective Action)
In verifying the effectiveness of controls, we measure how much the control decreases the probability of realization of the described risks. The attributes must be significant in determining the increase or decrease of risk. The expected measure function can be derived from the percentage of successful or failed occurrences; for example, the number of patches successfully installed on information systems (> 95%), or the number of security incidents caused by attacks from the network (< 3%). The determination of the percentage should consider that even though the security controls are implemented, attacks can still occur. Therefore, the percentage depicts the strength of the existing security controls in mitigating the risks.
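A minimal sketch of the measure function described above. The counts are invented examples; the thresholds mirror the two examples in the text (> 95% patches installed, < 3% incidents):

```python
def effectiveness(occurrences: int, total: int) -> float:
    """Percentage of occurrences for a control measure (success or failure)."""
    return 100.0 * occurrences / total

def control_effective(measured: float, target: float,
                      higher_is_better: bool = True) -> bool:
    """Compare a measured percentage against its target threshold, e.g.
    patches installed > 95%, or incidents caused by network attacks < 3%."""
    return measured >= target if higher_is_better else measured <= target

# Hypothetical audit counts mirroring the text's two example measures.
patch_pct = effectiveness(980, 1000)   # 98.0% of patches installed
incident_pct = effectiveness(2, 100)   # 2.0% of incidents from the network
print(control_effective(patch_pct, 95.0))                             # True
print(control_effective(incident_pct, 3.0, higher_is_better=False))   # True
```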
Security Indicator Index: If the measure is equal to or below the recommendation, the risk is adequately controlled, which demonstrates the effectiveness of the security controls. The proposed indicators are the trends of the derived measures, and they must be on the same measurement scale in order to establish that the risk is adequately controlled [23]. This indicator index can also act as a compliance index against the ISO/IEC 27001 standard. It is an algorithm or calculation combining one or more base and/or derived measures with associated decision criteria, for example: 0-60% Red; 60-90% Yellow; 90-100% Green.
Decision Criteria: Thresholds, targets, or
patterns used to determine the need for
action or further investigation, or to
describe the level of confidence in a
given result (for example, Red –
intervention is required, causation
analysis must be conducted to determine
reasons for non-compliance and poor
performance; Yellow – indicator should
be watched closely for possible slippage
to Red; Green – no action is required).
Corrective actions provide the range of potential changes for improving the efficiency and effectiveness of the security controls. They can be prioritized based on overall risk mitigation goals and selected based on cost-benefit analysis.
3.4 ACT Phase:
The developed technical security metrics and measurements will be validated by the respective organizations. The metrics must comply with the ISO/IEC 27001 standard requirements. The development of technical security metrics will be based on the information security measurement model in the ISO/IEC 27004 standard.
The measurement results should be reported to the management to ensure the continuity and improvement of information security in the organization.
4 CONCLUSIONS AND FUTURE
WORK
The Malaysian government has recognized the importance of Critical National Information Infrastructure (CNII) organizations protecting their critical information systems. In 2010, the government mandated that their systems be ISO/IEC 27001 ISMS certified within three years [24].
The ISO 27001 certification is one of the
most used corporate best practices for IT
security standards, addressing
management requirements as well as
identifying specific control areas for
information security. It provides a
comprehensive framework for designing
and implementing a risk-based
Information Security Management
System. The requirements and guidance
cover policies and actions that are
necessary across the whole range of
information security vulnerabilities and
threats. By customizing the security
requirements from ISO/IEC 27002 and
other relevant security standards and
guidelines, the CNII organizations will
implement the necessary security
controls in compliance with ISO/IEC
27001 ISMS standard.
The proposed TSMM provides guidance for CNII organizations to measure the effectiveness of network security controls in compliance with the ISO/IEC 27001 standard. The relevant types of information security measurements and metrics are interrelated and worth using in alignment with business risk management. We also plan to explore the usability of the ISO/IEC 27004 standard and to conduct a case study at several CNII organizations.
ACKNOWLEDGMENT
The authors wish to acknowledge and thank
members of the research teams of the Long
Term Fundamental Research Grant Scheme
(LRGS) number
LRGS/TD/2011/UKM/ICT/02/03 for this
work. The research scheme is supported by
the Ministry of Higher Education (MOHE)
under the Malaysian R&D National Funding
Agency Programme.
5 REFERENCES
1. International Organization for
Standardization and International
Electrotechnical Commission, “Information
technology - Security techniques -
Information security management systems-
Requirements,” ISO/IEC 27001:2005, 2005.
2. R. Barabanov, S. Kowalski, and L.
Yngström, “Information Security Metrics:
Research Directions,” FOI Swedish Defence
Research Agency, 2011.
3. C. Fruehwirth, S. Biffl, M. Tabatabai, and E.
Weippl, “Addressing misalignment between
information security metrics and business-
driven security objectives,” Proceedings of
the 6th International Workshop on Security
Measurements and Metrics - MetriSec ’10,
p. 1, 2010.
4. E. Hulitt and R. B. Vaughn, “Information
system security compliance to FISMA
standard: A quantitative measure,” 2008
International Multiconference on Computer
Science and Information Technology, no. 4,
pp. 799–806, Oct. 2008.
5. S. M. Bellovin, “On the Brittleness of
Software and the Infeasibility of Security
Metrics,” IEEE Security & Privacy
Magazine, vol. 4, no. 4, pp. 96–96, Jul.
2006.
6. K. Stouffer, J. Falco, and K. Scarfone,
“Guide to Industrial Control Systems (ICS)
Security,” National Institute of Standards
and Technology, NIST Special Publication
800-82, no. June, 2011.
7. M. Stoddard, D. Bodeau, R. Carlson, C. Glantz, Y. Haimes, C. Lian, J. Santos, and J. Shaw, “Process Control System Security Metrics – State of Practice,” Institute for Information Infrastructure Protection (I3P), Research Report, no. August, 2005.
8. R. Savola, “Towards a Security Metrics
Taxonomy for the Information and
Communication Technology Industry,” in
International Conference on Software
Engineering Advances, 2007.
9. W. K. Brotby, Information Security
Management Metrics: A Definitive Guide to
Effective Security Monitoring and
Measurement. Auerbach Publications, 2009.
10. M. Masera and I. N. Fovino, “Security
metrics for cyber security assessment and
testing,” Joint Research Centre of the
European Commission, vol. ESCORTS D4,
no. August, pp. 1–26, 2010.
11. J. Hallberg, M. Eriksson, H. Granlund, S.
Kowalski, K. Lundholm, Y. Monfelt, S.
Pilemalm, T. Wätterstam, and L. Yngström,
“Controlled Information Security: Results
and conclusions from the research project,”
FOI Swedish Defence Research Agency, pp.
1–42, 2011.
12. K. Lundholm, J. Hallberg, and H. Granlund, “Design and Use of Information Security Metrics,” FOI, Swedish Defence Research Agency, ISSN 1650-1942, 2011.
13. R. B. Vaughn, Jr., R. Henning, and A. Siraj, “Information Assurance Measures and Metrics - State of Practice and Proposed Taxonomy,” in Proceedings of the 36th Hawaii International Conference on System Sciences, 2003.
14. International Organization for
Standardization and International
Electrotechnical Commission, “Information
technology - security techniques - Code of
practice for information security
management,” ISO/IEC 27002:2005, 2005.
15. E. B. Lennon, M. Swanson, J. Sabato, J.
Hash, L. Graffo, and N. Sp, “IT Security
Metrics,” ITL Bulletin, National Institute of
Standards and Technology, no. August,
2003.
16. C. M. Gutierrez and W. Jeffrey, “Federal Information Processing Standards 200 -
Minimum Security Requirements for
Federal Information and Information
Systems,” National Institute of Standards
and Technology, no. March, 2006.
17. Computer Security Division and Information
Technology Laboratory, “Recommended
Security Controls for Federal Information
Systems and Organizations,” National
Institute of Standards and Technology, NIST
Special Publication 800-53 , Revision 3,
2010.
18. Computer Security Division and I. T.
Laboratory, “Security and Privacy Controls
for Federal Information Systems and
Organizations,” National Institute of
Standards and Technology, NIST Special
Publication 800-53 , Revision 4, no.
February, 2012.
19. T. Fleury, H. Khurana, and V. Welch,
“Towards A Taxonomy Of Attacks Against
Energy Control Systems,” in Proceedings of
the IFIP International Conference on
Critical Infrastructure Protection, 2008.
20. P. Mell, K. Scarfone, and S. Romanosky, “A
Complete Guide to the Common
Vulnerability Scoring System,” Forum of
Incident Response and Security Teams,
FIRST Organization, pp. 1–23, 2007.
21. “NVD Common Vulnerability Scoring
System Support v2,” NIST, National
Vulnerability Database (NVD),
http://nvd.nist.gov/cvss.cfm?version=2.
22. M. Tupper and A. N. Zincir-Heywood,
“VEA-bility Security Metric: A Network
Security Analysis Tool,” 2008 Third
International Conference on Availability,
Reliability and Security, pp. 950–957, Mar.
2008.
23. M. H. S. Peláez, “Measuring effectiveness in
Information Security Controls,” SANS
Institute InfoSec Reading Room,
http://www.sans.org/reading_room/whitepa
pers/basics/measuring-effectiveness-
information-security-controls_33398, 2010.
24. J. P. M. Malaysia, “Pelaksanaan Pensijilan
MS ISO/IEC 27001:2007 Dalam Sektor
Awam,” Unit Pemodenan Tadbiran dan
Perancangan Pengurusan Malaysia
(MAMPU), vol. MAMPU.BPIC, p. 1, 2010.
Trusted Document Signing based on use of biometric (Face) keys

Ahmed B. Elmadani
Department of Computer Science, Faculty of Science, Sebha University
Sebha, Libya
www.sebhau.edu.ly
[email protected]

International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 289-296, The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)

ABSTRACT

Online secured document exchange, secured bank transactions, and other e-commerce requirements need protection as the commercial environment grows, and the digital signature (DS) is the principal means of achieving it. This paper introduces a prototype online algorithm for signing and verifying a document digitally. The document's hash value is calculated and protected using keys derived from face characteristics. The paper presents a method of signing documents that differs from traditional systems using passwords, smart cards or directly accessed biometrics. It utilizes a wirelessly accessed biometric to provide:

1. Untampered biometrics in digital signatures.
2. Proof of a true identity.

It also investigates an existing digital signature system that is based on a smart card. The obtained results are discussed in terms of speed and security enhancement, which are highly in demand in the e-commerce society.

KEYWORDS Digital Signature, Smart card, Hash, True identity and Biometric (face).

1. INTRODUCTION

A mathematical scheme for demonstrating the authenticity of a digital message or document is known as a Digital Signature (DS) [1]. A DS convinces a recipient that a document was created by a known sender. DSs are commonly used for software distribution, financial transactions, and in other cases to avoid forgery and tampering [2]. Digitally signed messages may be anything that can be represented as a bit string; examples include electronic mail, contracts, or a message sent via some other cryptographic protocol [3]. A hash function is used in creating and verifying a DS; it is an algorithm which creates a digital representation of a document. A few hashing algorithms have been developed, such as the Secure Hash Algorithm (SHA-1) and Message Digest version 5 (MD5), for use in e-commerce [4]. SHA-1 is a secure hash algorithm that produces a 160-bit hash value; it was designed by NIST and the NSA in 1993, revised in 1995, and is the US standard for use with the digital signature algorithm (DSA) signature scheme. SHA-256, SHA-384, and SHA-512 were designed for compatibility with the increased security provided by the Advanced Encryption Standard (AES) cipher [3]. In traditional DS, a smart card is normally used to perform signatures because the cryptographic keys used are stored inside the card [6]. However, most existing DS systems provide signatures without proving true identity [5], because they rely on keys that anyone can use [7]. Therefore, documents have to be signed in a way that proves the true identity, to avoid the many attacks reported in [8][11]. This can be done only by using a user's personal characteristics such as the fingerprint, iris or face [7]. In automation security, faces are more secure than passwords, because of fine
differentiation between seemingly identical faces, and a face won't be forgotten or stolen [9]. Faces are also more secure than fingerprints, because a fingerprint can be spoofed using jelly [10]. A face image, like any digital image, always needs to be enhanced to bring out its features clearly, because of the low quality of images captured by camera devices. Once captured and resized, an image is filtered using one of the known filtering methods such as linear, Wiener, median, or Gaussian filters [9]. The image is filtered several times, using one or more filtering algorithms, until it becomes clear; then information can be constructed [12]. The constructed information is stored for future comparison. The face structure consists of the eyes, the mouth, and their positions, which differ from person to person; together they form a unique characteristic of the face [9]. There are more factors that can make recognition easy or difficult; they are listed in the FERET dataset [15]. Several face recognition algorithms have been introduced in recent years. One of them measures the triangle formed by the eyes and mouth, but this changes over time, so measurements should be taken at age intervals [16]. The first mention of eigenfaces in image processing, a technique that would become the dominant approach in the following years, was made by L. Sirovich and M. Kirby in 1986; it is based on principal component analysis (PCA) [16]. It became the basis for developing many new face algorithms, such as measuring the importance of certain intuitive features and geometric measures between eye distances with length ratios [17]. This work is considered an improvement of the research done by Costas et al. (2008), who performed face-based digital signatures for retrieving video segments using pre-extracted faces for detection and recognition [14]. They used signatures for retrieval, while in this work we use segments of a document to retrieve their signatures for verification. In our proposed DS system, we introduce a system that uses keys derived from the user's face, which helps in assuring true identity; the face factors mentioned in [15] are outside our concern. In our security analysis, we only consider secure signature-generation systems that use SMCs to protect the DS from the attacks mentioned in [8]. We then improve the use of biometrics in order to prove true user identity, as in [13], and to protect the DS, while avoiding systems based on biometrics which can be tampered with, such as fingerprints [14]. In the proposed system, we construct keys from the face and protect them using Ron's Code version 5 (RC5), a variable-key-size encryption algorithm; it is fast and suitable for protecting SMC keys [6]. Of course other solutions exist, but they are outside the scope of this paper.
2. METHODOLOGY AND
DISCUSSIONS
The following paragraphs discuss the proposed algorithm, the experiment and the obtained results.

2.1 PROPOSED ALGORITHM

The sequence of DS operations in the proposed system for any given document, shown in Figure 1, is performed in five steps, described as follows:

• Enhancement: face image adjustment and filtering.
• Feature extraction: information extraction and key construction.
• Document signing: obtaining the document fingerprint.
• Signature protection: document and key protection.
• Signing authenticity: signature matching.
Figure 1. Sequence of processes in the proposed system

2.2 FACE IMAGE ENHANCEMENT

At each sign-point, a fixed webcam is used to capture the face image.

Figure 2. Face image enhancement and noise removal using the Wiener filter

The selected area surrounds the eyes, nose and mouth, within a dimension of 200x200 pixels. Figure 2 A shows an original image, while B presents the histogram of A and shows that the information is not well distributed, so the image has to be filtered. In C the face image is shown after removing noise using the fast Fourier transform (FFT) "Wiener filter", applied several times to bring out the features. The histogram of the well-distributed information resulting from the filtering process is shown in D. The face image is then cropped to 150x150 pixels in an area rich in information; it contains the eyes, nose, and mouth, for use in feature extraction as shown in Figure 3.

Figure 3. Selected face image area that is rich in information.
2.3 INFORMATION EXTRACTION

The cropped face image prepared in paragraph 2.2 is used to extract features and calculate the user keys (key_s as the sender's key and key_r as the receiver's key). The user keys are calculated using equation (1):

key_s = \sum_{i=0}^{n} \sum_{j=0}^{m} x_1(i, j),   key_r = \sum_{i=0}^{n} \sum_{j=0}^{m} x_2(i, j)   (1)

where x_1 and x_2 are the sender's and receiver's cropped face images (160x160 pixels) and i = 0, ..., n, j = 0, ..., m. The user keys obtained by applying equation (1) are unique. Table 1 shows the obtained user keys; it confirms that users can be distinguished from each other.
As a requirement of the signing process, the user requires another key (key_sr); it is constructed after selecting a target user as the receiver of a document.

Table 1. User keys

User No. | User key (key_s or key_r)
6        | 581497
7        | 533018
8        | 668856
9        | 627684
18       | 632414

The key is constructed by combining the two keys (key_s and key_r) using equation (2), and the constructed key (key_sr) is used in the encryption process:

key_sr = key_s \,\|\, key_r   (2)

that is, the combined key is the concatenation of the sender's key and the receiver's key (the digits key_s(i), i = 1, ..., n, followed by key_r(j), j = 1, ..., m, giving a key of n + m digits). The constructed key_sr is used on both sides for encryption and decryption, protecting an outgoing document on the sender's side and an incoming document on the receiver's side. The third column of Table 2 shows the results of applying equation (2) to construct the key (key_sr) used in the encryption process.

Table 2. Constructed key key_sr between sender and receiver

Sender's key (key_s) | Receiver's key (key_r) | Combined key (key_sr)
581497               | 7533018                | 5814977533018
7533018              | 581497                 | 7533018581497
668856               | 668856                 | 668856668856
627684               | 632414                 | 627684632414
632414               | 627684                 | 632414627684
2.4 SIGNING PROCESS

A user who intends to sign a document (Doc) has to first select or prepare the document; a process then uses equation (3) to calculate the fingerprint of the document. SHA-1, a stable hash algorithm, was chosen to calculate the document's fingerprint. The sender then invokes the RC5 algorithm with the constructed key (key_sr) to encrypt the calculated fingerprint, as shown in equation (4):

Fingerprint = SHA-1(Doc)   (3)

Encrypted-fingerprint = RC5_{key_sr}(Fingerprint)   (4)

The sender prepares a message that contains the document, its encrypted fingerprint and the sender's key, and sends it to the receiver according to equation (5), as shown in Figure 4:

Message = (Encrypted-fingerprint, Doc, key_s)   (5)
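An illustrative end-to-end sketch of equations (3)-(5) and the verification described in section 2.6, not the paper's implementation: RC5 is not available in the Python standard library, so a repeating-key XOR stands in for the RC5 calls (it is NOT a secure cipher), and the document text, key bytes and helper names are hypothetical:

```python
import hashlib

def rc5_stand_in(data: bytes, key: bytes) -> bytes:
    """Placeholder for RC5: a repeating-key XOR, used only so the sketch
    runs without a third-party RC5 library. It is NOT secure. XOR is its
    own inverse, so the same call both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def sign(doc: bytes, key_sr: bytes) -> bytes:
    fingerprint = hashlib.sha1(doc).digest()   # equation (3)
    return rc5_stand_in(fingerprint, key_sr)   # equation (4), RC5 stand-in

def verify(doc: bytes, encrypted_fp: bytes, key_sr: bytes) -> bool:
    """Section 2.6: decrypt the received fingerprint and compare it with
    a freshly computed SHA-1 fingerprint of the received document."""
    received_fp = rc5_stand_in(encrypted_fp, key_sr)
    return hashlib.sha1(doc).digest() == received_fp

doc = b"contract: party A pays party B 100 dinars"
key_sr = b"627684632414"                    # combined key from equation (2)
message = (sign(doc, key_sr), doc, 627684)  # equation (5)

print(verify(message[1], message[0], key_sr))        # True: fingerprints match
print(verify(b"tampered text", message[0], key_sr))  # False: document altered
```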
Figure 4. Sequence of processes in document signing and message encryption

2.5 SECURE SIGNATURES

To avoid unauthorized use of the document and of the keys used in signatures, the RC5 cryptographic algorithm is used to protect them. The message containing the document (Doc), the fingerprint and the keys is prepared by the sender and protected using the formed key (key_sr), so that only the target receiver can decrypt it, according to equation (6):

Encrypted-Msg = RC5_{key_sr}(Message)   (6)
2.6 AUTHENTICITY OF SIGNATURES

The verification process is performed on the receiver's side. Once the receiver receives an encrypted message, he decrypts it using his key (key_r) to obtain the original document, the sender's key (key_s), and the encrypted fingerprint. Two processes are then used: one to calculate a new fingerprint and a second to construct the combined key (key_sr) as discussed in 2.3. The combined key is used to decrypt the received encrypted fingerprint. The signature is authenticated by comparing the two obtained fingerprints; a document is said to be authenticated and sent by a trusted person if the fingerprints are equal.

Figure 5. Received message decryption and signing authentication process

Figure 5 illustrates the verification process, which starts by decrypting the received message with the receiver's key to obtain the sender's key (key_s). The key_s is used to construct the combination key (key_sr) needed to decrypt the received fingerprint. The receiver calculates the fingerprint of the received document using the SHA-1 algorithm and compares the two fingerprints to see if they match.

2.7 TESTING THE ALGORITHM
Two signature points were configured using two connected computers, each equipped with a webcam; they are used to test the proposed algorithm. One is for document signing, while the second is for signature verification. The system was tested for acceptance and rejection in terms of the signature-verification running process; this test is used to discover the system's incorrect decisions. Use was made of 1030 matching trials (MT) and three security levels. Table 3 shows the intensity level used for each of the three security levels: group (1) uses 30 low-intensity face images, group (2) uses 400 medium-intensity face images, and group (3) uses 600 high-intensity face images.
Table 3: Number of recognized and rejected users by the proposed system

Group      Description  Number of Users  Recognized  Rejected  Recognized Rate (%)  Error Rate (%)
Group (1)  Low          30               28          2         93.33                6.67
Group (2)  Medium       400              393         7         98.25                1.75
Group (3)  High         600              592         8         98.67                1.33
Total                   1030             1013        17        98.35                1.65

Testing the system on the MT gave the following results: for group 1, 28 out of 30 low-intensity images were recognized (93.33%), while 2 images were rejected (6.67%), as demonstrated in Figure 6.
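The recognition and error rates in Table 3 follow directly from the recognized/total counts; as a quick arithmetic check (a sketch, not part of the original system):

```python
# Recognized/total counts per group, taken from Table 3.
groups = {"low": (28, 30), "medium": (393, 400), "high": (592, 600)}

for name, (recognized, total) in groups.items():
    rate = 100 * recognized / total
    print(f"{name}: recognized {rate:.2f}%, error {100 - rate:.2f}%")

total_recognized = sum(r for r, _ in groups.values())  # 1013
total_images = sum(t for _, t in groups.values())      # 1030
print(f"overall: recognized {100 * total_recognized / total_images:.2f}%")  # 98.35%
```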
Figure 6: Accepted users by the proposed system.

In group 2, which comprises medium-intensity face images, 393 out of 400 face images were recognized (98.25%) and only 7 were rejected (1.75%), as shown in Figure 7.
Figure 7: Rejected users by the proposed system.

In group 3, 600 high-intensity face images were used; 592 were recognized (98.67%) and 8 were rejected (1.33%), as shown in Figure 8. In summary, 1030 face images of different intensities were used; 1013 of them were recognized (98.35%) and only 17 were rejected (1.65%), which demonstrates the success of the proposed system.
Figure 8: Accepted and rejected users by the proposed system.

Table 4 reports tests of the proposed system on known users only; the false acceptance rate (FAR) was zero in all groups. The false rejection rate (FRR) decreases as group size grows, which means that configuring the system with a larger number of users yields less rejection, as the results for low intensity and for all intensities show.
Table 4: FAR and FRR ranges

No.  Description       FAR  FRR
1.   Low intensity     0    6.67
2.   Medium intensity  0    1.75
3.   High intensity    0    1.33
4.   All intensities   0    1.65
2.8 THE PROPOSED ALGORITHM AGAINST EXISTING ALGORITHMS
In recent years a few algorithms have been developed for signing documents digitally, but they fail to cover many issues; the proposed algorithm addresses these, as described below. Most DS systems, as in Sufreenmohd et al. (2002) or Elmadani et al. (2005), use a smart card to store keys and therefore suffer from forgery or tampering, whereas the proposed algorithm solves this problem by authenticating users with their faces, which cannot be stolen, forgotten or tampered with, so the user has nothing to carry. The existing DS algorithms, as in Sirovich and
Kirby (1987), are based on template selection for feature extraction, while the algorithms of Givens et al. (2003) and Yang (2010) are based on calculating values from an image to compare later with stored ones; such processes are time consuming. In the proposed system, by contrast, the features are used to form keys, which are numbers that are processed directly with no need to store them, giving protection from the attacks mentioned by Langweg (2006). The proposed algorithm also uses simpler mathematical functions for key calculation than the algorithms used by Costas et al. (2008) or by Kirby and Sirovich (1990); our system is fast because it is based on calculating numbers, and it requires minor processing and less memory space than they do.

3. CONCLUSION
A model for signing and verifying a document signature and protecting it was presented, along with an investigation of the drawbacks of existing digital signatures. The proposed algorithm uses a person's biometric characteristics (the face), which cannot be stolen, forged or tampered with, and it provides an easy-to-use method that requires the user to carry nothing. Our results show that the face is strongly recommended for online document signing.

4. REFERENCES
1. Nentwich F, Kirda E and Kruegel C. Practical Security Aspects of Digital Signature Systems. Technical University Vienna, Technical Report. 2006.
2. Introduction to digital signature. www.e-signature.gov.eg/ ElectronicSignature_Mechanizm_Arabic. 2010.
3. Robshaw M. MD2, MD5, SHA and Other Hash Functions. RSA Laboratories Technical Report TR-101. 1995.
4. Wang X, Feng D, Lai X and Yu H. Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD. Proceedings of the 24th Annual International Cryptology Conference (CRYPTO '04), Santa Barbara, CA. 2004.
5. Elmadani A. B. Digital Signature Forming and Keys Protection Based on Person's Characteristics. Proceedings of the IEEE International Conference on Information Technology and e-Services (ICITeS'2012), Sousse, Tunisia. 2012.
6. Elmadani A. B, Prakash V and Ramli A. R. Application of Smartcard & Secure Coprocessor, BICET conference. Brunei.2001.
7. Elmadani A. B. Human Authentication Using a FingerIris Algorithm Based on a Statistical Approach. The 2nd International Conference on Networked Digital Technologies (NDT '10), Prague, Czech Republic, pp. 288-296. 2010.
8. Spalka A. Cremers A and Langweg H. Protecting the Creation of Digital Signature with Trusted Computing Platform Technology Against Attacks by Trojan Horse. In IFIP Security Conference. 2001.
9. Fang, Y. Wang Y and Tan T. Combining Color, Contour and Region for Face Detection. ACCV2002: The 5th Asian Conference on Computer Vision, Melbourne, Australia. 2002.
10. Elmadani A. B, Prakash V, Ali, B. M, Ramli A. R and Jumari K. Fingerprint Access Control with Anti-spoofing Protection, Brunei Darussalam Journal of Technology and Commerce. Brunei. 2005.
11. Langweg H. Malware Attacks on Electronic Signatures Revisited. In Sicherheit 3rd Jahrestagug Fachbereich Sicherheit der Gesellschaft fuer Informatik. 2006.
12. Zhao W, Chellappa R, Phillips P. J and Rosenfeld A. Face Recognition: A Literature Survey. ACM Computing Survey. Vol. 35, no. 4. PP. 399–458. 2003.
13. Yang J. Biometrics Verification Techniques Combining with Digital Signature for Multimodal Biometrics Payment System. Proceedings of the Fourth International Conference on Management of e-Commerce and e-Government (ICMeCG), pp. 405-420, China. 2010.
14. Costas C, Nikolaidis N and Ioannis P. Face-based Digital Signatures for Video Retrieval. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 18, No. 4, pp. 549-553. 2008.
15. Givens G, Beveridge J, Bruce A, Draper B and Bolme D. A Statistical Assessment of Subject Factors in the PCA Recognition of Human Faces. Proceedings of Computer Vision and Pattern Recognition Workshop (CVPRW’03). Wisconsin USA. 2003.
16. Sirovich L and Kirby M. Low-dimensional procedure for the characterization of human faces. Journal of the Optical Society of America A - Optics, Image Science and Vision, Vol. 4, No. 3, pp. 519-524. 1987.
17. Kirby M and Sirovich L. Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 1, pp. 103-108. 1990.

Ahmed B. Elmadani was born in Libya in 1956. He received his Ph.D. degree from UPM University, Malaysia, in 2003. He worked in the Department of Computer Science, Faculty of Science, Sebha University (Libya) from 1997 to 1999 as an assistant lecturer and head of the department, and from 2003 to 2008 as a lecturer in the same department; since 2009 he has been an assistant professor and Vice Dean of the same faculty. His main research interests include cryptography, information security, imaging, digital signatures and biometric fingerprints.
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 297-310. The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)

A Comparative Study of the Perceptions of End Users in the Eastern, Western, Central, Southern and Northern Regions of Saudi Arabia about Email SPAM and Dealing with it

Hasan Alkahtani*, Robert Goodwin** and Paul Gardner-Stephen**

* Computer Science Department, College of Computer Science and Information Technology, King Faisal University, P.O. Box 400, Al-Hassa 31982, Kingdom of Saudi Arabia
** School of Computer Science, Engineering and Mathematics, Faculty of Science and Engineering, Flinders University, GPO Box 2100, Adelaide SA 5001, Australia
[email protected], [email protected]

ABSTRACT

This paper presents the results of a survey of email users in different regions of Saudi Arabia about email SPAM. The survey investigated the nature of email SPAM, how email users in the eastern, western, central, southern and northern regions dealt with it, and the efforts made to combat it. It also investigated the effectiveness of existing Anti-SPAM filters in detecting Arabic and English email SPAM. 1,500 participants located in the eastern, western, central, southern and northern regions of Saudi Arabia were surveyed, and completed surveys were collected from 1,020 of the participants.

The results showed that email users in Saudi Arabia defined email SPAM in different ways, and that the participants in the central and western regions were more aware of SPAM than the participants in other regions. The volume of email SPAM differed from one region to another: the volume received by the participants in the northern and central regions was larger than that received in other regions. The majority of email SPAM received by the participants in all regions was written in English. The most common type of email SPAM received in Arabic was emails related to forums; in English it was phishing and fraud, and business advertisements.

The results also showed that few participants in any region responded to SPAM, and that the proportion of participants who responded to SPAM was larger in the southern region than in other regions. Most of the participants were not aware of Anti-SPAM programs; the participants in the central region were more aware of Anti-SPAM programs than the participants in other regions. The participants in all regions estimated that the existing Anti-SPAM programs were more effective in detecting English SPAM than Arabic SPAM.

Most of the participants in all regions were not aware of the government efforts to combat SPAM, with the participants in the central region more aware of these efforts than the participants in other regions. Similarly, most of the participants in all regions were not aware of the ISPs' efforts to combat SPAM, with the participants in the central and western regions more aware of the ISPs' efforts than the participants in other regions.

KEYWORDS: SPAM, email, Arabic, users, English, Saudi.

1. INTRODUCTION

Email is an important tool for many people, who consider it a necessary part of their daily lives. It enables people to communicate with each other in a short time at low cost. Although email benefits the people who use it, some people, called spammers, have exploited it for their personal purposes. They send so-called SPAM to a large number of recipients. They can use programs known as spam-bots to harvest email addresses on the internet, or they can buy email addresses from individuals and organizations to send email SPAM to these addresses [11]. They
also use many methods to bypass SPAM filters
such as tokenization and obfuscation [27].
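As an illustration of those two tricks, the sketch below normalizes a token before keyword matching; the substitution map and example tokens are invented for illustration, not drawn from the cited work.

```python
import re

# Hypothetical substitution map used by obfuscators (digits/symbols for letters).
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s",
                      "@": "a", "$": "s"})

def normalize(token: str) -> str:
    """Undo simple obfuscation so a keyword filter can match the token."""
    token = token.lower().translate(LEET)
    # Drop separators inserted to defeat tokenization, e.g. "f-r-e-e" or "f r e e".
    return re.sub(r"[-._\s]+", "", token)

print(normalize("F-R-3-E"))  # -> "free"
print(normalize("v1agra"))   # -> "viagra"
```

A real filter would apply this kind of normalization to every token of a message before running its content rules.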
Email SPAM is defined as "Unsolicited,
unwanted email that is sent indiscriminately,
directly or indirectly, by a sender having no
current relationship with the recipient" [12],
[13]. It is also defined as Unsolicited Bulk
Email (UBE) that is sent to a large number of
recipients who were not asked if they wanted to
receive it [4], [14], [18]. Some studies [6], [7],
[25] defined email SPAM as Unsolicited
Commercial Email (UCE) that contains business
advertisements sent to a large number of
recipients.
There are legal and technical methods [2] to
combat SPAM. Legally, some countries enacted
laws against SPAM. Examples of these countries
include the United States of America [26],
European Union countries and Australia [5].
However, there are no laws in Saudi Arabia to
combat SPAM although research and projects
were conducted to assess the problem of SPAM
in the country.
Technically, there exist many filters to combat
SPAM. Examples of these filters include
content based filters such as Bayesian [24],
keywords [11] and genetic algorithms [15], and
origin based filters like black lists [11], white
lists [22], origin diversity analysis [16] and
challenge response systems [21]. However, some
of these techniques need to be updated to detect
new types of email SPAM due to spammers
developing ways to bypass these techniques.
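To make the keyword approach concrete, a minimal content-based filter can be sketched as below; the blocklist terms and threshold are hypothetical, and the cited filters are far more sophisticated than this.

```python
# Illustrative keyword-based filter: flag a message as SPAM when it
# contains at least `threshold` terms from a (hypothetical) blocklist.
BLOCKLIST = {"free", "winner", "viagra", "lottery", "click here"}

def is_spam(message: str, threshold: int = 2) -> bool:
    text = message.lower()
    hits = sum(1 for term in BLOCKLIST if term in text)
    return hits >= threshold

print(is_spam("You are a WINNER! Click here for your FREE lottery prize"))  # True
print(is_spam("Meeting moved to 3pm, click here for the agenda"))           # False
```

The weakness noted in the text is visible here: a spammer who writes "fr ee" or "w1nner" slips past the substring match, which is why such filters need continual updates.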
This study aimed to gain an understanding
about:
a. The nature of email SPAM, its definition
based on email users’ opinions, its volume
and its types in different regions of Saudi
Arabia.
b. Differences between Arabic SPAM and
English SPAM received by the participants
in different regions of Saudi Arabia.
c. The effects of email SPAM on email users in
different regions of Saudi Arabia.
d. How email users in the eastern, western, central, southern and northern regions deal with email SPAM.
e. The efforts of government to combat email
SPAM.
f. The efforts of ISPs to combat email SPAM.
g. Evaluation of email users' perceptions in different regions of Saudi Arabia of the effectiveness of Anti-SPAM filters in detecting Arabic and English email SPAM.
2. METHODOLOGY
2.1. Measures

It was decided that the best way to answer the research questions was through a questionnaire. Therefore, a questionnaire was distributed to participants in different regions of Saudi Arabia and the responses were analyzed.
Initially a pilot questionnaire was prepared and
distributed to a few participants to get their
comments about the questions. Then all the
participants completed the 10 page questionnaire
which included both yes/no answers and open
ended answers. The questionnaire consisted of
three main parts as follows.
2.1.1. General information questions
In this part, the participants were asked for the
following information: gender, age, nationality,
speaking language, highest level of education,
major area of study, work status and the nature
of the work. These questions helped in
understanding and comparing the level of
awareness of users about email SPAM.
Examples of the questions from the first part of the survey can be seen in Figure 1.
1. Gender:
O Male
O Female
2. What is your age?
3. Nationality:
O Saudi
O Other
4. What is your current work status?
O Student
O Employed
O Self employed
Figure 1: Examples of questions of the first part of the survey
2.1.2. Email SPAM questions
At the beginning of this part, the participants
were asked for a definition of email SPAM in
their own words in order to understand the
definition of email SPAM based on their
opinions.
Then the study defined email SPAM as “an unsolicited, unwanted, commercial or non-commercial email that is sent indiscriminately, directly or indirectly, to a large number of recipients without their permission and there is no relationship between the recipients and sender”. This definition was in the survey and
used to provide a reference point for the
remainder of the questions. Care was taken to
ensure that the respondents did not see the study
supplied definition until after they had supplied
their own definition of email SPAM to prevent
introducing a strong bias. The variety of
responses to the question of what is SPAM is
evidence that this approach was successful.
Some examples of email SPAM, keywords and
phrases used in email SPAM were given in the
survey.
The participants were asked if they knew about
email SPAM prior to reading the survey, and
what were the sources of their knowledge. The
participants were also asked if they received
email SPAM and how many email SPAMs they
received on average weekly. They were also
asked about the languages of the email SPAM they received and the types of Arabic and English email SPAM.
The study focused on English and Arabic email
SPAM because English is the main language in
the world and Arabic is the native language in
Saudi Arabia.
The participants were asked what they did when they received email SPAM (i.e. the actions of email users in dealing with SPAM).
The actions of emails users in dealing with
SPAM described in the survey were as follows:
reading the entire email SPAM, deleting the
email SPAM without reading it, and contacting
the ISP and notifying it about email SPAM. The
participants were asked to choose one of the following options for each action in dealing with SPAM: never, sometimes or always. Figure 2
shows an example for questions of email users in
Saudi Arabia about their actions in dealing with
email SPAM.
Note: the following question will ask you to choose
the appropriate option for your dealing with email
SPAM.
For example, when I am not reading the SPAM
email at all, I will circle the option "Never" in the
scale in the following table. If I sometimes read
SPAM, I will circle the option "Sometimes".
Read the entire email:    Never    Sometimes    Always
Figure 2: An example for questions of email users in Saudi
Arabia about their actions in dealing with email SPAM
The participants were asked if they purposely
responded to an offer made by a SPAM email
and what benefits they derived from email
SPAM. They were also asked if they were
affected by email SPAM and what were the
effects of email SPAM on them.
The participants were asked if they were aware
of Anti-SPAM filters to block email SPAM,
what were the sources of their knowledge about
these filters, and how effective these filters were
in detecting Arabic and English email SPAM.
Examples of the questions from the second part of the survey can be seen in Figure 3.
1. Everyone defines SPAM differently, in your own
words, how would you define email SPAM?
2. Did you know about SPAM emails prior to reading this
survey?
O Yes
O No
3. Have you received SPAM emails?
O Yes
O No
4. What is the
language of SPAM
email you receive on
average weekly? The
percentages should add
up to 100 %.
Percentage %
O English
O Arabic
O Other language
O Languages I do not
recognize
5. Are you aware of Anti-SPAM programs?
O Yes
O No
6. If you have used Anti-SPAM programs, please rate their
effectiveness in detecting English and Arabic email
SPAM?
Current programs \ Percentage                                          0%   25%   50%   75%   100%
The effectiveness of current programs in detecting Arabic email SPAM
The effectiveness of current programs in detecting English email SPAM
Figure 3: Examples of questions of the second part of the survey
2.1.3. Questions about the efforts of government
and ISPs to combat email SPAM
In this part, the participants were asked if they
were aware of government efforts to combat
SPAM and which efforts they were aware of.
The participants were also asked if they were
aware of ISPs efforts to combat SPAM and
which efforts they were aware of. Examples of the questions from the third part of the survey can be seen in Figure 4.
1. Are you aware of efforts by the government in Saudi Arabia to
combat email SPAM?
O Yes
O No
2. Are you aware of efforts by ISPs in Saudi Arabia to combat
email SPAM?
O Yes
O No
Figure 4: Examples of questions of the third part of the survey
2.2. Participants

The questionnaire was designed and
distributed to 1,500 participants in the central,
eastern, western, southern and northern regions
of Saudi Arabia. Completed questionnaires were
received from 1,020 participants in Saudi
Arabia.
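From these figures the overall response rate and approximate regional counts follow directly; this is a quick derived check, not analysis taken from the paper.

```python
distributed, completed = 1500, 1020

# Regional shares of completed surveys as reported in the text.
shares = {"central": 0.34, "eastern": 0.20, "western": 0.20,
          "southern": 0.13, "northern": 0.13}

print(f"response rate: {100 * completed / distributed:.0f}%")  # 68%
for region, share in shares.items():
    print(f"{region}: ~{round(share * completed)} participants")
```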
34% of the participants were from the central
region, 20% were from the eastern region, 20%
were from the western region, 13% were from
the southern region and 13% were from the
northern region. Table 1 shows general
information about the participants who were
located in the Eastern, Western, Central,
Southern and Northern regions in Saudi Arabia.
Table 1: General information about the participants in the Eastern (E), Western (W), Central (C), Southern (S) and Northern (N) regions of Saudi Arabia

Part 1: General Information                        N     S     C     W     E
Gender:          Male                             61%   64%   57%   59%   62%
                 Female                           39%   36%   43%   41%   38%
Age:             15-25                            35%   37%   35%   63%   58%
                 26-35                            47%   38%   41%   26%   25%
                 36-45                            12%   21%   17%   10%   14%
                 46-55                             6%    2%    6%    1%    2%
                 56 and more                       0%    2%    1%    0%    1%
Nationality:     Saudi                            86%   75%   81%   88%   90%
                 Other                            14%   25%   19%   12%   10%
Language of      Arabic                           99%   99%   99%  100%   99%
speaking:        English                          75%   73%   63%   81%   62%
                 Other                             1%    3%    3%    2%    2%
Highest level    High school                      12%   15%   11%   17%   17%
of education:    Diploma                           8%    5%    7%    2%    2%
                 Bachelor                         52%   49%   54%   70%   61%
                 Master                           19%   17%   16%    7%   12%
                 PhD                               9%   14%   12%    4%    8%
Major area of    Education and teaching           26%   16%   20%   13%   17%
study (diploma,  Computer science and IT          26%   31%   34%   40%   31%
bachelor,        Social sciences                  15%   20%   12%    5%    4%
master or        Physical and biological sciences  6%    5%   11%    7%   21%
PhD holders):    Health sciences and medicine     10%   12%    8%    7%   16%
                 Other                            17%   16%   15%   28%   11%
Work status:     Student                          45%   41%   29%   61%   58%
                 Employed                         51%   59%   70%   37%   42%
                 Self-employed                     4%    0%    1%    2%    0%
Nature of work   Educational                      58%   47%   48%   55%   44%
(employed        Medical                          16%    8%    8%    8%   17%
participants):   Technical                         9%   16%   18%   20%   14%
                 Management                        3%   24%   19%   16%   21%
                 Other                            14%    5%    7%    1%    4%
3. RESULTS

This section describes the responses of the participants in the eastern, western, central, southern and northern regions of Saudi Arabia to the email users' survey.
3.1. Respondents Definition and Awareness
of Email SPAM
Email users were asked for a definition of
email SPAM based on their opinions. The
responses showed that only 428 of 1,020
participants in different regions of Saudi Arabia
answered this question.
42% of the participants who answered this
question defined email SPAM as an email that
was sent randomly to numerous recipients and
contained Spyware, files, links, images or text
that aims to hack the computer or steal
confidential information such as email
passwords, credit card numbers and bank
account numbers.
39% defined email SPAM as an email that did
not contain an email address or that was sent
randomly, directly or indirectly by unknown
senders or sources to a large number of
recipients without their permission to receive it.
33% said that email SPAM was an email that
was sent randomly and contained malicious
programs such as Viruses, Trojans, Worms, or
contained hidden links, strange contents and
untrusted attachments that aimed to damage
computer, software and hardware, or aimed to
delete important information in a computer.
29% defined email SPAM as Unsolicited
Commercial Email (UCE) or email that was sent
to a large number of recipients and aimed to
promote commercial advertisements which
contained attractive words that were used to
encourage the recipient to buy medical, technical
and sexual products.
9% said that email SPAM was annoying and
unimportant email that was sent from friends,
but it was not sent in person and contained jokes,
greetings, invitations to subscribe to forums,
invitations for friendship by social networks
such as Facebook, competition, puzzles, political
and religious reviews, news, and scandals of
famous people in the world.
7% defined email SPAM as junk email or as
Unwanted, Unsolicited Bulk Email (UBE) that
was sent randomly to a large number of
recipients.
1% defined email SPAM as an email that was
not related to recipients’ work or was not related
to their interests.
From the definitions described above, it can be clearly seen that email users had no single agreed definition of email SPAM, and that the most common definition was "an email that was sent randomly to numerous recipients and contained Spyware, files, links, images or text that aims to hack the computer or steal confidential information such as email passwords, credit card numbers and bank account numbers". The
definitions described above indicated that some
definitions of users in Saudi Arabia for email
SPAM agreed with the international definitions
for email SPAM by defining email SPAM as
Unsolicited Commercial Email (UCE) and as
Unsolicited Bulk Email (UBE).
The differences in definition of email SPAM
could cause problems in enacting laws to combat
SPAM in Saudi Arabia and developing Anti-
SPAM filters for different languages such as
Arabic. This suggests that there is a scope to
specify an agreed definition for email SPAM
which could be used for enacting laws to combat
SPAM and developing Anti-SPAM techniques
in Saudi Arabia.
When the participants were asked if they knew
about email SPAM prior to reading the survey,
the results revealed that approximately a third of email users in Saudi Arabia did not know about email SPAM, which is significant and a risk for Saudi society. The results of the survey
revealed that most of the participants indicated
prior awareness of SPAM, suggesting that the
survey itself has acted as a means of educating
the participants about SPAM and its impact.
This suggests that a broader survey or
information campaign about SPAM would have
a further positive impact in different regions of
Saudi Arabia. Also, this suggests that conducting
research related to SPAM and funding
researchers who work in the field of SPAM
could help in increasing the awareness of email
users in all regions about email SPAM and
hence reducing the impact of email SPAM in
Saudi Arabia.
As seen in Table 2, the results revealed that the
participants in the central and western regions
were more aware of SPAM than the participants
in other regions of Saudi Arabia. This could be
because of the major area of study where the
results indicated that the percentages of the
participants who studied computer science and
information technology in the western and
central regions were higher than the percentages
of the participants who studied the same area of
study in the other regions. Also, it could be
because of the work nature where the results
indicated that the participants who worked in
technical positions in the central and western
regions outnumbered the participants who
worked in the same positions in the other
regions. The results suggest that there should be
a focus on awareness programs about SPAM for
users in different regions of Saudi Arabia,
especially in the eastern, southern and northern
regions. These awareness programs could be
executed by the government sectors or private
sectors.
The results, as shown in Table 2, revealed that
most of the participants in all regions knew
about SPAM by self-education through the
internet and forums, and friends and relatives.
The results showed that school and university education played a prominent role in informing users about SPAM in all regions compared to other public and private sectors, with the educational sector in the southern region having the highest percentage for user awareness of SPAM.
The results also revealed that government efforts to raise email users' awareness of SPAM were deficient in all regions, although the government's efforts to inform users about SPAM were better in the northern region than in other regions; indeed, the results revealed no government efforts to inform users about SPAM in the western region. The results also revealed a deficiency in the ISPs' efforts to raise users' awareness of SPAM, even though ISPs are among the sectors responsible for controlling internet service in Saudi Arabia.
This suggests that the government should focus on raising users' awareness of SPAM in all regions, especially the western region. The awareness programs could be run by educational sectors such as universities, by broadcast media such as magazines and newspapers, and by the sectors responsible for providing and controlling internet services in Saudi Arabia.
Table 2: Responses of the participants in the Eastern (E), Western (W), Central (C), Southern (S) and Northern (N) regions about their knowledge of email SPAM

Part 2: Email SPAM                                       N     S     C     W     E
Did you know about SPAM emails
prior to reading the survey?
    Yes                                                 37%   56%   72%   70%   57%
    No                                                  63%   44%   28%   30%   43%
How do you know about SPAM emails?
    Internet Service Providers (ISPs)                   13%   13%    6%    7%    9%
    The internet and forums                             50%   51%   59%   76%   67%
    Broadcast media such as radio, TV,
    newspapers and magazines                             8%   11%   13%   21%   10%
    Friends and relatives                               44%   48%   39%   56%   45%
    Government ministries and commissions                8%    4%    4%    0%    6%
    Through my school or university education           40%   44%   41%   29%   38%
    Other                                                6%    7%    5%    3%    4%
3.2. Volume and Nature of Email SPAM in
Saudi Arabia
When the participants were asked if they
received email SPAM, the results showed that
most of the participants in Saudi Arabia received
email SPAM. Email users estimated they
received an average of 108 SPAM emails per
week.
Another study, conducted by [17], showed that
the participants received an average of 94.5
SPAM emails per week. By comparing the
volume of SPAM received in Saudi Arabia to
the volume of SPAM in that study [17], it can be
clearly seen that the volume of SPAM in Saudi
Arabia was broadly similar to the volume in that
study.
The results shown in Table 3 revealed that the
highest percentage of the participants who
received SPAM was in the southern region. The
results indicated that the average number of SPAM emails received weekly by the participants differed from one region to another: 77 SPAM emails in the eastern region, 104 in the western region, 126 in the central region, 95 in the southern region and 129 in the northern region. This indicated that the
number of SPAM received was larger in the
northern and central regions than other regions.
When the participants were asked about the
language of email SPAM that they received, the
results showed that most of the email SPAM
received (59%) was in English, 34% was in
Arabic, 4% was not recognized and 3% was in
other languages.
A study conducted in Bahrain indicated that
64% of the respondents said that they received
English SPAM, 18% said that they received
Arabic SPAM and 18% said that they received
both Arabic and English SPAM [1]. The results
of this study indicated that the volume of
English SPAM received in Bahrain was similar
to the volume of English SPAM received in
Saudi Arabia. The results of the study also
revealed that the volume of Arabic SPAM
received in Bahrain was less than that received
in Saudi Arabia.
As seen in Table 3, the results revealed that the
volume of English SPAM received was larger in
the northern region than in the other regions,
while the volume of Arabic SPAM was larger in
the western region. The number of unrecognized
SPAM emails was larger in the southern and
northern regions than in the other regions. The
results also showed that the participants in the
southern region received more SPAM in other
languages, such as Chinese, Japanese, Russian,
Turkish, French, Brazilian Portuguese, Spanish,
Persian, German, Italian, Hindi, Urdu and Hebrew,
than participants in the other regions.
Table 3: Responses of the participants in the Eastern, Western, Central,
Southern and Northern regions about the languages of email SPAM

Part 2: Email SPAM
Question / Answer                       E      W      C      S      N
Have you received SPAM emails?
  Yes                                   70%    75%    73%    83%    65%
  No                                    30%    25%    27%    17%    35%
What is the language of SPAM email you receive on average weekly?
  English                               60%    51%    61%    61%    65%
  Arabic                                33%    43%    33%    30%    29%
  Not recognized                        4%     3%     4%     5%     5%
  Other language                        3%     3%     2%     4%     1%
When the participants were asked about the
types of Arabic and English SPAM emails they
received, the results showed that there were many
types of both, and that these types differed
between Arabic and English SPAM. The types of
Arabic and English SPAM, and the differences
between them, can be seen in Table 4.
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 297-310. The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)
Table 4: The differences between Arabic and English email SPAM
received by end users in Saudi Arabia
Types of email SPAM AR (%) EN (%)
Business 31 30
Religious and Political Party 5 2
Pornographic 10 24
Forums 36 3
Products and services 11 12
Phishing and Fraud 6 28
Other 1 1
Total 100 100
As described in Table 4, the volume of business
advertisements, emails from religious and
political parties, and forum-related emails was
larger in Arabic SPAM than in English SPAM. The
percentages indicate a significant difference in
composition between Arabic and English SPAM; for
example, the volume of forum emails was much
higher in Arabic SPAM than in English SPAM.
Also, the results showed that the volume of
pornographic emails, products and services
emails, and phishing and fraud emails was larger
in English SPAM than in Arabic SPAM. The
percentages indicate a significant difference
between Arabic and English SPAM in the volume of
pornographic emails and of phishing and fraud
emails, both of which were much higher in English
SPAM than in Arabic SPAM (see Table 4).
The results revealed other types of Arabic
SPAM that did not exist in English SPAM.
These types included news, training
consultations, jokes, scandals of famous people,
puzzles, greetings, competitions, and invitations
from social networking websites such as Facebook.
A study conducted by the Communication and
Information Technology Commission (CITC) in
Saudi Arabia in 2007 showed that 64% of the email
SPAM received in Saudi Arabia was direct
marketing, 25% sexual emails, 5% religious
emails, and 5% other types [20]. However, that
study did not specify whether the email SPAM
received was written in Arabic or English. The
results of the CITC study indicate that the
volume of religious emails, pornographic emails
and other types of email SPAM was similar to the
volume of the same types in this study.
The results, seen in Table 4, showed that the
volume of pornographic emails in both Arabic and
English SPAM was lower than for the same type in
other countries such as Bahrain. A study
conducted in Bahrain by [1] revealed that 76% of
the participants received pornographic emails
while 24% did not, although it did not specify
whether the volume of pornographic emails was
larger in English or Arabic. The lower volume of
pornographic emails in Saudi Arabia could be
because public access to pornographic websites is
not allowed in Saudi Arabia, which may have
contributed to reducing the volume of SPAM sent
from pornographic websites.
Table 5 shows the averages of Arabic email
SPAM received by the participants in the eastern,
western, central, southern and northern regions
of Saudi Arabia. The results revealed that the
participants in the southern region received more
business advertisements than the participants in
other regions. The volume of religious and
political emails received in the eastern region
was higher than for the same type in other
regions. The volume of pornographic emails
received in the western and central regions was
larger than in other regions.
In addition, the participants in the northern
region received more forum emails than the
participants in other regions. The volume of
products and services emails was larger in the
eastern and western regions, and the volume of
phishing and fraud emails was larger in the
western region. The percentages also showed that
the volume of other types of Arabic SPAM was
larger in the eastern, central and southern
regions than in other regions (see Table 5).
Table 5 also shows the averages of English
email SPAM received by the participants in the
five regions. The results showed that the volume
of business advertisements was larger in the
northern region than in other regions. The volume
of religious and political emails received in the
western and southern regions was larger than for
the same type in other regions. The participants
in the eastern region received more pornographic
emails than those in other regions. The volume of
forums, products and services, and other types of
English SPAM was larger in the western region
than in other regions.
The results also showed that the volume of
phishing and fraud emails was larger in the
southern region than in other regions.
Table 5: Averages of Arabic and English email SPAM received by the
participants in the Eastern, Western, Central, Southern and Northern
regions of Saudi Arabia

Types of email SPAM               E (AR/EN %)  W (AR/EN %)  C (AR/EN %)  S (AR/EN %)  N (AR/EN %)
Business                          31 / 27      29 / 28      32 / 31      34 / 30      31 / 32
Religious and Political Parties    6 / 2        5 / 3        5 / 2        4 / 3        5 / 2
Pornographic                       9 / 27      11 / 22      11 / 24       6 / 23       9 / 26
Forums                            35 / 3       30 / 6       36 / 2       39 / 3       42 / 2
Products and Services             13 / 9       13 / 17      10 / 13      11 / 9        8 / 10
Phishing and Fraud                 5 / 31      12 / 22       5 / 28       5 / 32       5 / 27
Other                              1 / 1        0 / 2        1 / 0        1 / 0        0 / 1
A study conducted by [3] described some
keywords and phrases used in Arabic and
English email SPAM in Saudi Arabia. These
keywords and phrases were collected from
different ISPs in Saudi Arabia.
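Lists like those that follow typically feed a rule-based filter that scores a message by how many known SPAM keywords it contains. A minimal sketch (the keyword set and threshold below are illustrative assumptions, not taken from the cited study):

```python
# Minimal keyword-based SPAM flagger: count how many known SPAM
# keywords/phrases occur in a message and flag it at or above a
# threshold. Keywords and threshold here are illustrative only.
SPAM_KEYWORDS = {
    "viagra", "loto winner", "you have won", "verify your account",
    "bank loans", "winning promotion", "work and live in usa",
}

def is_spam(message: str, threshold: int = 1) -> bool:
    text = message.lower()
    hits = sum(1 for kw in SPAM_KEYWORDS if kw in text)
    return hits >= threshold

print(is_spam("Congratulations! You have won the Loto winner prize"))  # True
print(is_spam("Meeting moved to Tuesday"))                             # False
```

Real filters of this family also weight individual keywords and combine them with other evidence (headers, sender reputation), but the counting core is the same.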
Examples of Arabic SPAM keywords and phrases include: "فياقرا" (Viagra), "أدوية" (medicines), "ألعاب" (games), "ريجيم" (diet), "فرصة للربح" (a chance to win), "مبروك لقد ربحت" (congratulations, you have won), "مسابقة" (competition), "بطاقة خضراء للسفر إلى أمريكا" (a Green Card for travel to America), "انضم إلينا" (join us), "تعليم" (education), "اربح مليون ريال سعودي" (win a million Saudi Riyals), "زواج" (marriage), "حصريا" (exclusive), "موضة" (fashion), "جنس" (sex), "شريك العمر" (life partner), "رومانسية" (romance), "+18 فما فوق" (18 and over), "تدريب" (training), "برامج" (software), "تبرعات" (donations), "مفاجآت" (surprises), "فضيحة" (scandal), "اشترك في المنتدى" (subscribe to the forum), "شارك واربح" (participate and win), "جائزة" (prize), "عرض خاص" (special offer), "أقل الأسعار" (lowest prices), "إباحية" (pornography), "هدية" (gift), "ثورة" (revolution), "أسهم" (stocks), "بشرى" (good news), "أموال" (money), "دورة" (course), "أزياء" (fashion), "مقاطع مضحكة" (funny clips), "اعمل من المنزل" (work from home), and "للرجال فقط" (for men only).
Examples of English SPAM keywords and phrases include: "sex", "Cialis", "gift", "Dollar", "discount", "bonus", "girls", "Viagra", "Loto winner", "Investment", "Forex", "Green", "Visa and Master", "reactivate your email account", "Incomplete personal information", "Verify your account", "Account not updated", "Financial Information Missing", "$USD", "You have won", "fund", "money", "winning promotion", "transferring", "Training", "South Africa", "Partnership", "Bank loans", and "work and live in USA".

3.3. Actions of Email Users in Dealing with SPAM

The participants were asked about the
appropriate action for dealing with email SPAM.
In the survey, the participants were offered
three possible actions: reading the entire SPAM
email, deleting the SPAM email without reading
it, and contacting the ISP to notify it about the
SPAM. The participants were then asked to rate
how often they took each action: never, sometimes
or always.
Firstly, when the participants were asked if
they read the entire SPAM email, most of them
said that they sometimes did. The participants in
the eastern and central regions fared better than
those in other regions: the proportion who said
that they never read the entire SPAM email was
larger in the eastern and central regions than
elsewhere (see Table 6).
Secondly, when the participants were asked if
they delete SPAM emails without reading them,
most said that they sometimes did. As shown in
Table 6, the participants in the central and
eastern regions fared better than those in other
regions: the proportion who said that they always
delete SPAM emails without reading them was
larger in the central and eastern regions than
elsewhere.
Thirdly, when the participants were asked if
they contact their ISP to notify it about SPAM,
most said that they never did (see Table 6). The
participants in the southern and northern regions
fared better than those in other regions: the
proportion who said that they always contact
their ISP about SPAM was larger in the southern
and northern regions than elsewhere.
A study conducted by [17] found that 11.7% of
the participants contacted their ISPs when they
received email SPAM. Comparing the two studies,
most email users in both did not contact their
ISPs regarding SPAM problems.
From the results shown above regarding how
users deal with email SPAM, it is suggested that
the ISPs in Saudi Arabia should inform users
about email SPAM, its impacts, the technical and
legal efforts of the ISPs to combat SPAM, and the
necessary procedures users should follow when
they receive SPAM.
When the participants were asked if they had
responded to an offer made by a SPAM email, the
results showed that most of the participants in
all regions had not (see Table 6). The
participants in the southern region responded to
offers made by SPAM emails more than the
participants in the other regions of Saudi
Arabia.
The results indicated that the participants in
the western and southern regions enjoyed the fun
emails included in SPAM more than the
participants in other regions. The participants
in the eastern and northern regions used
purchasing and selling offers included in SPAM
emails more than those in other regions. The
participants in the central, southern and
northern regions used SPAM as a learning tool
more than those in other regions, and the
participants in the northern region derived other
benefits from SPAM, such as friendship requests,
more than those in other regions (see Table 6).
The results indicate that as long as some users
respond to SPAM offers, email SPAM is likely to
increase and cause problems for other users
unless it is combated. This suggests that laws
against SPAM in Saudi Arabia could reduce its
incidence by greatly reducing the ability of
spammers to make sales without fear of penalties.
Table 6: Actions of users in the Eastern, Western, Central, Southern and
Northern regions of Saudi Arabia in dealing with email SPAM

Part 2: Email SPAM
Question / Answer                               E     W     C     S     N
What do you do when you receive a SPAM email?
1- Read the entire email
  Never                                         40%   33%   37%   28%   29%
  Sometimes                                     48%   62%   53%   62%   65%
  Always                                        12%   5%    10%   10%   6%
2- Delete the email without reading it
  Never                                         11%   6%    7%    13%   5%
  Sometimes                                     49%   59%   50%   52%   62%
  Always                                        40%   35%   43%   35%   33%
3- Contact the ISP and notify it about the SPAM
  Never                                         77%   87%   83%   73%   86%
  Sometimes                                     19%   12%   14%   15%   6%
  Always                                        4%    1%    3%    12%   8%
Have you ever purposely responded to an offer made by a SPAM email?
  Yes                                           19%   15%   20%   34%   20%
  No                                            81%   85%   80%   66%   80%
What benefits did you derive from SPAM emails?
  Purchasing and selling                        23%   10%   18%   16%   23%
  Learning                                      33%   39%   47%   47%   46%
  Fun                                           56%   71%   54%   71%   50%
  Other                                         3%    3%    0%    0%    4%
3.4. Effects of Email SPAM on End Users

When the participants were asked if they were
affected negatively by email SPAM, the results
revealed that approximately half of the
participants in all regions were affected (see
Table 7).
The results showed that the participants in the
southern and northern regions were affected by
email SPAM more than the participants in other
regions. This could be because most of the
participants in the southern and northern regions
were not aware of SPAM and of effective ways of
dealing with it. It could also be because of how
those participants dealt with offers made by SPAM
emails: the results revealed that the
participants in the southern and northern regions
responded to SPAM emails more than the
participants in other regions (see Table 7).
The results revealed that the main impact of
SPAM on users was the filling of inboxes with
SPAM; the participants in the southern region
were more affected by this than the participants
in other regions. The second main impact was the
infection of computers by a Virus, Worm or other
malicious program; the participants in the
northern and central regions were more affected
by this than the participants in other regions
(see Table 7).
The results showed that the participants in the
western region were affected by SPAM through lost
time and reduced productivity more than the
participants in other regions. The participants
in the eastern, southern and western regions were
affected through the theft of personal
information, such as user names, passwords and
credit card numbers, more than the participants
in other regions. The participants in the
eastern, western and central regions felt less
confidence in using email more than the
participants in other regions. Also, the
participants in the central region were affected
by email SPAM through other effects, such as
annoyance, more than the participants in other
regions (see Table 7).
Table 7: Effects of email SPAM on users in the Eastern, Western,
Central, Southern and Northern regions of Saudi Arabia

Part 2: Email SPAM
Question / Answer                                         E     W     C     S     N
Have you been affected negatively by email SPAM?
  Yes                                                     43%   37%   46%   51%   52%
  No                                                      57%   63%   54%   49%   48%
What was the impact of email SPAM?
  Stealing personal information such as user name,
  password and credit card numbers                        23%   22%   18%   23%   16%
  Losing time and reducing productivity                   45%   51%   44%   36%   35%
  Less confidence in using the email                      25%   23%   22%   7%    15%
  Filling email inbox                                     52%   66%   65%   71%   56%
  Computer was infected by a Virus, Worm or other
  malicious program                                       55%   51%   58%   43%   59%
  Other impacts                                           2%    3%    4%    3%    3%
3.5. Awareness of Anti-SPAM Filters and the
Effectiveness of Anti-SPAM Filters in Detecting
Arabic and English SPAM

When the participants were asked if they were
aware of Anti-SPAM programs, the results revealed
that most of the participants in all regions were
not. The participants in the central region were
more aware of Anti-SPAM programs than the
participants in other regions (see Table 8).
A study conducted in Bahrain [1] revealed that
26% of the participants knew about Anti-SPAM
programs while 74% did not. Comparing the
Bahraini study to this study, Saudi society was
more aware of Anti-SPAM programs than Bahraini
society, although most Saudis were still not
aware of them.
When the participants were asked how they knew
about Anti-SPAM programs, the results showed that
the majority of the participants in all regions
knew about them through the internet and forums
and through school and university education. The
results also revealed a deficiency in the efforts
of the government and the ISPs in informing users
about Anti-SPAM programs and how they work. As
seen in Table 8, there were no government efforts
to inform users about Anti-SPAM programs in the
western and southern regions. This suggests that
there should be coordination between the
government and the internet service providers in
Saudi Arabia to inform users in all regions,
especially in the western and southern regions,
about Anti-SPAM programs and how they detect
SPAM. It also suggests that the distribution of
free Anti-SPAM programs by the government or by
the internet service providers could reduce the
effects of SPAM in Saudi Arabia.
Table 8: Responses of the participants in the Eastern, Western, Central,
Southern and Northern regions of Saudi Arabia about their knowledge of
Anti-SPAM programs

Part 2: Email SPAM
Question / Answer                                         E     W     C     S     N
Are you aware of Anti-SPAM programs?
  Yes                                                     38%   38%   44%   31%   28%
  No                                                      62%   62%   56%   69%   72%
How did you know about Anti-SPAM programs?
  Internet Service Providers (ISPs)                       4%    8%    6%    10%   8%
  The internet and forums                                 67%   79%   62%   52%   67%
  Broadcast media such as radio, TV, newspapers
  and magazines                                           6%    3%    8%    5%    3%
  Friends and relatives                                   32%   25%   28%   48%   14%
  Government ministries and commissions                   6%    0%    3%    0%    11%
  Through my school or university education               33%   27%   47%   52%   36%
  Other                                                   1%    5%    5%    5%    6%
When the participants were asked to rate the
effectiveness of Anti-SPAM programs in detecting
Arabic and English SPAM, the results revealed
that the existing Anti-SPAM programs were not
completely effective in detecting either. This
suggests that the existing Anti-SPAM filters need
to be developed further to detect SPAM in
different languages such as Arabic and English.
The results showed that the participants in all
regions estimated that the existing Anti-SPAM
programs were more effective in detecting English
SPAM than Arabic SPAM. This suggests that there
should be a focus on producing and developing
techniques to detect email SPAM in the Arabic
language.
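One way to act on this gap is to route each message to a language-specific keyword set before filtering. A sketch (the script-ratio heuristic and both keyword sets are illustrative assumptions, not part of the survey):

```python
# Route a message to an Arabic- or English-specific keyword filter
# based on which script dominates its letters. Keyword sets are
# illustrative placeholders, not taken from the survey.
def arabic_ratio(text: str) -> float:
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for c in letters if "\u0600" <= c <= "\u06FF")
    return arabic / len(letters)

AR_KEYWORDS = {"فياقرا", "اربح"}          # e.g. "Viagra", "win"
EN_KEYWORDS = {"viagra", "you have won"}

def flag(message: str) -> bool:
    keywords = AR_KEYWORDS if arabic_ratio(message) > 0.5 else EN_KEYWORDS
    text = message.lower()
    return any(kw in text for kw in keywords)

print(flag("You have won a prize"))   # True
print(flag("اربح مليون ريال سعودي"))  # True
```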
The evaluation of the participants in all regions
for the effectiveness of Anti-SPAM programs in
detecting Arabic and English SPAM can be seen
in Figure 5 and Figure 6.
Figure 5: The effectiveness of Anti-SPAM filters in detecting Arabic
email SPAM based on the evaluation of the participants in the Eastern,
Western, Central, Southern and Northern regions of Saudi Arabia
Figure 6: The effectiveness of Anti-SPAM filters in detecting English email SPAM based on the evaluation of the participants in the Eastern,
Western, Central, Southern and Northern regions of Saudi Arabia
3.6. Efforts of Government and ISPs to
combat SPAM
When the participants were asked if they were
aware of the government's efforts to combat
SPAM, the results showed that only a few
participants were. The results revealed that
users in the central region were more aware of
the government's efforts than users in other
regions (see Table 9). This suggests that the
government should inform users about its efforts
to combat SPAM and should provide awareness
programs about SPAM, its impacts and methods of
combating it for users in all regions of Saudi
Arabia. This could help reduce the effects of
SPAM on email users in Saudi Arabia.
Table 9: The awareness of the participants in the Eastern, Western,
Central, Southern and Northern regions of Saudi Arabia about the
government and ISP efforts

Part 3: Efforts of Combating Email SPAM in Saudi Arabia
Question / Answer                                         E     W     C     S     N
Are you aware of efforts by the government in Saudi
Arabia to combat email SPAM?
  Yes                                                     20%   22%   30%   20%   23%
  No                                                      80%   78%   70%   80%   77%
Are you aware of efforts by ISPs in Saudi Arabia to
combat email SPAM?
  Yes                                                     11%   15%   16%   13%   10%
  No                                                      89%   85%   84%   87%   90%
The participants who were aware of government
efforts to combat SPAM were asked which efforts
they knew of. Most of the participants (62%) said
that the government's efforts could be observed
through King Abdulaziz City for Science and
Technology (KACST). They said that KACST blocks
unsecured websites and websites that send SPAM,
informs people about dangerous security attacks
and their impacts, and conducts and funds
research related to information technology [19].
24% of the participants said that the
government recommended that each government and
private sector organization in Saudi Arabia
should apply a security policy. The policy should
include: providing the organization with the
software and hardware necessary to avoid security
attacks such as Viruses and SPAM; raising
awareness among employees and customers about
security attacks and methods of combating them;
conducting research on security attacks and
their countermeasures; conducting training and
workshops on security issues for employees;
employing people qualified in network security to
deal with security attacks; providing a financial
budget to develop the security policy; and
reviewing the security policy regularly to
identify its strengths and weaknesses.
22% said that the government established and
funded centres to deal with information security
issues. Examples of these centres are the Centre
of Excellence in Information Assurance (COEIA)
[8], the Computer Emergency Response Team (CERT)
[10] and the Prince Muqrin Chair for Information
Security Technologies (PMC IT SECURITY) [23].
They said that the aims of these centres were to
inform people about security attacks, such as
Viruses and SPAM, and their
[Figure 5 data: effectiveness of Anti-SPAM filters in detecting Arabic
email SPAM, as rated by users in each region (0-100% scale); bar values
63%, 61%, 61%, 60% and 51%; legend: Northern (n=130), Southern (n=134),
Central (n=352), Eastern (n=203), Western (n=201).]
[Figure 6 data: effectiveness of Anti-SPAM filters in detecting English
email SPAM (0-100% scale); bar values 85%, 83%, 80%, 79% and 74%;
legend: Southern (n=134), Eastern (n=203), Central (n=352), Northern
(n=130), Western (n=201).]
impacts, conducting and funding research related
to security issues, and holding conferences and
workshops on security attacks.
19% of the participants said that the
government's efforts could be observed through
the Communication and Information Technology
Commission (CITC). They said that the CITC funded
the Saudi National Anti-SPAM Program project and
created a public website for the project
containing information about SPAM and methods of
combating it. They also said that the project
informed people about SPAM by publishing
brochures and by letting people subscribe to the
CITC mailing list to follow new developments in
SPAM. The participants also said that the project
conducted research on SPAM problems and published
the results publicly, and that the CITC received
complaints from people regarding SPAM problems
and processed them with the other responsible
government sectors [9].
18% said that some universities in Saudi
Arabia established information security centres
which provide the following services. First, the
centres raise people's awareness of security
attacks. Second, they conduct workshops,
conferences and ongoing training on security
issues and methods of combating them. Third, they
publish valuable research on security issues for
people and for various libraries in Saudi Arabia.
18% of the participants said that the
government enacted a law for combating electronic
crimes in Saudi Arabia, although there were no
SPAM-specific laws. They said that the government
sector responsible for executing the electronic
crimes law is the Communication and Information
Technology Commission (CITC), in coordination
with other legal sectors.
When the participants were asked if they were
aware of the ISPs' efforts to combat SPAM, the
results revealed that only a few participants
were. The results indicated that users in the
central and western regions were more aware of
the ISPs' efforts than users in other regions
(see Table 9). This suggests that the ISPs should
provide awareness programs about SPAM, its
impacts and their efforts to combat it for users
in all regions of Saudi Arabia, which could help
reduce the effects of SPAM on email users.
The participants who were aware of the ISPs'
efforts to combat SPAM were asked which efforts
they knew of. 42% of the participants said that
the ISPs used advanced Anti-SPAM filters to block
email SPAM before it reaches end users' inboxes.
26% said that the ISPs blocked websites or
forums that send email SPAM to recipients and put
them on blacklists.
13% of the participants said that the ISPs
informed people about email SPAM and methods of
combating it by email, brochures and the Short
Message Service (SMS).
13% said that the ISPs warned customers not to
send SPAM, received customers' complaints
regarding SPAM, and executed legal actions
against people who sent email SPAM, such as
disconnecting the internet service and cancelling
the contract.
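The blacklisting the participants describe is commonly implemented as a DNS-based blackhole list (DNSBL) lookup: the sender's IPv4 address is reversed and queried under a blacklist zone, and a successful resolution means the address is listed. A sketch of the mechanism (the zone name "zen.spamhaus.org" is a well-known public DNSBL, named here only as an illustration):

```python
# DNSBL-style check: reverse the sender's IPv4 address, look it up
# under a blacklist zone, and treat a successful resolution as
# "listed". The zone name is illustrative; real deployments choose
# and license their own lists.
import socket

def reverse_ip(ip: str) -> str:
    # "203.0.113.7" -> "7.113.0.203", the form DNSBL zones expect
    return ".".join(reversed(ip.split(".")))

def is_blacklisted(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    try:
        socket.gethostbyname(f"{reverse_ip(ip)}.{zone}")
        return True   # a record came back: the address is listed
    except socket.gaierror:
        return False  # NXDOMAIN or lookup failure: treat as not listed

print(reverse_ip("203.0.113.7"))  # 7.113.0.203
```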
4. CONCLUSION AND FUTURE WORK

This paper presented the results of a survey of
email users in the eastern, western, central,
southern and northern regions of Saudi Arabia
about email SPAM and how they deal with it. The
results showed that there was no single
definition of email SPAM; the most common
definition was "an email that was sent randomly
to numerous recipients and contained Spyware,
files, links, images or text that aimed to hack
the computer or steal confidential information
such as email passwords, credit card numbers and
bank account numbers".
The results revealed that approximately a third
of users in Saudi Arabia did not know about email
SPAM, which represents a significant risk for
Saudi society. The level of awareness of SPAM
differed from one region to another, and the
participants in the central and western regions
were more aware of SPAM than the participants in
other regions.
The results showed that the volume of email
SPAM was high in Saudi Arabia compared to other
countries. The volume of email SPAM differed from
one region to another, and the volume of SPAM
received by the participants was larger in the
northern and central regions than in other
regions. Most of the email SPAM received in all
regions was written in English
and the volume of English SPAM differed from one
region to another.
The results also showed that there were many
types of Arabic and English SPAM received by the
participants in all regions. The most common type
of Arabic SPAM was forum emails, while for
English SPAM it was business advertisements and
phishing and fraud emails; the volume of these
types, for both Arabic and English, differed from
one region to another.
The results showed that a few participants in
all regions responded to SPAM, and the proportion
of participants who responded was larger in the
southern region than in other regions.
Approximately half of the participants in all
regions were affected negatively by email SPAM,
and the proportion who were affected was larger
in the southern and northern regions than in
other regions.
The results showed that most of the
participants in all regions were not aware of
Anti-SPAM programs and the participants in the
central region were more aware of Anti-SPAM
programs than the participants in other regions.
The results showed that the participants in all
regions estimated that the existing Anti-SPAM
programs were more effective in detecting
English than Arabic SPAM.
The results showed that most of the
participants in all regions were not aware of the
government efforts to combat SPAM and the
participants in the central region were more
aware of the government efforts than the
participants in other regions.
Finally, the results showed that most of the
participants in all regions were not aware of the
ISPs efforts to combat SPAM and the
participants in the central and western regions
were more aware of the ISPs efforts than the
participants in other regions.
Future work could include investigating
government efforts to combat SPAM in order to
find more effective methods of combating it.
Laws to combat SPAM in Saudi Arabia could also
be investigated, drawing on the experiences of
developed countries. This could help in enacting
a new, clear anti-SPAM law in Saudi Arabia.
The legal and technical efforts of ISPs in
Saudi Arabia to combat email SPAM, and ways to
encourage ISPs to collaborate with each other and
with private sectors, government sectors and
customers, could also be investigated.
Effective awareness programs to inform users
in all regions of Saudi Arabia, private sectors
and government sectors about SPAM, its effects
and methods of combating it could be
investigated.
Improving the performance of existing Anti-SPAM
filters in detecting Arabic and English email
SPAM could be investigated. This could be
achieved by testing the effectiveness of existing
filters on Arabic and English SPAM, which could
help in creating and developing effective filters
to detect new types of Arabic and English SPAM.
A listing of keywords and phrases used in
Arabic email SPAM was included in this research,
which could help in designing and producing
special Anti-SPAM filters for Arabic SPAM.
5. REFERENCES
1. Al-A'ali, M.: A Study of Email Spam and How to
Effectively Combat It. Webology (2007).
http://www.webology.org/2007/v4n1/a37.html
2. Alkahtani, H. S., Gardner-Stephen, P., Goodwin, R.: A
taxonomy of email SPAM filters. In: Proc. The 12th
International Arab Conference on Information
Technology (ACIT), pp. 351--356, Riyadh, Saudi
Arabia (2011).
3. Alkahtani, H. S., Goodwin, R., Gardner-Stephen, P.:
Email SPAM related issues and methods of controlling
used by ISPs in Saudi Arabia. In: Proc. The 12th
International Arab Conference on Information
Technology (ACIT), pp. 344--351, Riyadh, Saudi
Arabia (2011).
4. Androutsopoulos, I., Koutsias, J., Chandrinos, K. V.,
Spyropoulos, C. D.: An experimental comparison of
naive Bayesian and keyword-based anti-spam filtering
with personal e-mail messages. In: Proc of the 23rd
annual international ACM SIGIR conference on
Research and development in information retrieval, pp.
160--167, Athens, Greece (2000).
5. Australian Communications & Media Authority
(ACMA),
http://www.efa.org.au/Issues/Privacy/spam.html#acts
6. Boykin, O., Roychowdhury, V.: Personal Email
networks: an effective anti-spam tool. Condensed
Matter cond-mat 0402143, pp. 1--10 (2004).
7. Carreras, X., Marquez, L.: Boosting Trees for Anti-
Spam Email Filtering. In: Proc. of RANLP, 4th
International Conference on Recent Advances in
Natural Language Processing, pp. 1--7, Tzigov Chark,
BG (2001).
8. Centre of Excellence in Information Assurance
(COEIA), http://coeia.edu.sa/index.php/en/about-
coeia/strategic-plan.html
9. Communication and Information Technology
Commission (CITC) ,
http://www.spam.gov.sa/eng_main.htm
309
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 297-310. The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)
A Survey on Privacy Issues in Digital Forensics
Asou Aminnezhad
Faculty of Computer Science
and Information Technology
University Putra Malaysia
[email protected]
Ali Dehghantanha
Faculty of Computer Science
and Information Technology
University Putra Malaysia
Mohd Taufik Abdullah
Faculty of Computer Science
and Information Technology
University Putra Malaysia
ABSTRACT
Privacy issues have always been a major concern in computer forensics and security, and they appear in almost any investigation, whether or not it pertains to a computer. Protecting privacy in the physical world requires legislation, but in the digital world, with rapidly growing technology and ever more devices generating huge amounts of private data, it is impossible to provide a fully protected space while data are transferred, stored and collected. Since the field's introduction, forensics investigators and developers have faced the challenge of finding the balance between retrieving key evidence and infringing user privacy. This paper looks into developmental trends in computer forensics and security in various aspects of achieving such a balance. In addition, the paper analyses each scenario to determine the trend of solutions in these aspects and evaluates their effectiveness in resolving the aforementioned issues.
KEYWORDS
Privacy, Computer Forensics, Digital Forensics,
Security.
1 INTRODUCTION
Computer forensics has always been a field that grows alongside technology. As networks become more widely available and data transfer through them gets faster, the risks involved get higher. Malicious software, tools and methodologies are designed and implemented every day to exploit networks and their associated data storage in order to extract useful private information that can be used in various crimes.
This is where computer forensics and security come in: the field applies scientific techniques and tools to collect, preserve, and recover latent evidence from crime scenes.
Computer forensics is the science of identifying, analyzing, preserving, documenting and presenting evidence and information from digital and electronic devices, and it is meant to protect the privacy of users from being exploited.
Forensic specialists have a duty to their clients to take care over which data are extracted as potential evidence, since that material may drive a digital investigation and guide any feasible litigation.
However, the process of extracting data evidence itself opens up avenues for forensic investigators to infringe user privacy themselves. The private material that computer forensics can disclose includes images, encryption keys and user passwords, and investigators may acquire knowledge beyond the aim of the investigation. In order to prevent such potential abuses and protect forensics investigators as well as users, research and analysis have been carried out in various fields to provide solutions to the problem.
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 311-323. The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)
This paper comprises five sections, presented as follows: Section 2 determines the limitations of the study, collects data from research publications and reviews related work in the field of privacy application in various fields and their solutions. Section 3 analyses these solutions and determines whether privacy can be preserved from both the user's and the forensic investigator's perspective. Section 4 identifies the privacy issues overlooked by current developmental trends of privacy preservation and their potential setbacks. Section 5 concludes the paper and summarizes the overall development of technology in privacy preservation.
1.1 Limitations of the study
This paper focuses on statistical analysis based on trends from 2006 onwards. Owing to the technicalities of each paper in its specific research field, it is not possible to rely solely on the results to reflect a holistic picture of the real trend in privacy issues in forensics investigations. It is also difficult to fully explain the development trends of privacy issues, as they are delicate in each research specimen. The research natures and scenarios used cannot be fully depended upon, as they are not necessarily applicable in another similar scenario. The number of specimens provided is also too small to sustain very significant research value. Since most of the papers reviewed are specific to their corresponding research field and purpose, it is difficult to generalize the specimens into statistical data with higher accuracy. We also note that most specimens are from the Elsevier journal platform, and thus acknowledge this as a limitation on the availability of related research publications from other sources.
Another limitation is the lack of graphical statistical data, as most of the papers reviewed do not belong to statistically based research. It is not practical to add statistical assumptions to such graphical data, as this would possibly distort the accurate picture of the research.
1.2 Data Collection
In this research, a stringent data collection procedure was set up. Such a procedure is required because the resources available for achieving high-level research results are scarce, so no important data can be risked being overlooked.
We consider three important analyses: research nature analysis, keyword analysis and an individual analytic platform. A total of 21 documents were analyzed using these three approaches.
Table 1 signifies the shift of research focus in preserving privacy. It is evident that the current focus of forensics and security solutions is now more towards databases and networking with the rise of dependency on cloud computing technology, with 8 papers focusing on that area. More data are stored in third-party databases than 5 years ago, and they have become a tempting source of valuable private information. A shift of focus from software and systems to databases and networking is inevitable in circumstances where it is harder to gain access to information, and to maintain that access for further exploitation, without network access. Methodologies and frameworks still receive adequate focus, as these are the foundation of many solutions to be proposed in the future.
The keyword analysis signifies the focus of each specific specimen analyzed. As shown in Table 2, the keywords used do not necessarily bear the same signature as published in the specimens, but are grouped by what they represent. For example, a computer forensics publication with a digital forensics representation will be grouped accordingly, as the two represent a similar research nature. Keyword analysis provides a picture of the techniques and theories emphasized within the timeframe of this research.
The clear distinction in researchers' focus on privacy and digital forensics issues marks the importance of balancing privacy and forensics. Excluding the specific related issues, general privacy and digital forensics topics achieved a total of 24 keyword matches across 21 papers. This means there are at least 3 papers that draw a comparison between both issues
in finding a balance as a major purpose of their research. The other important trend is the diversity of the research: only 11 of the 53 representable keywords identified bear more than 2 keyword matches. This means that more focus is given to individually specified research subjects rather than a holistic picture of the privacy-forensics balance.
The individual analytic platform was conducted as the final data collection step. This was done by summarizing each paper and briefly explaining what it tries to prove, along with possible benefits of the publication.
Before a forensics investigator or computer security designer works on finding evidence or putting up detection systems, the first step is always to gather information and plan. The problem with the Standard Operating Procedures (SOP) [1] of forensics investigations is that there are many instances where investigators step into information that is not necessarily related to a particular crime.
The Fourth Amendment of the Constitution of the United States of America is no stranger to digital forensics investigators.
Table 1. Research Nature Analysis (bar chart of paper counts per category: methodologies and framework; software and systems; database and networking; education and networking)
Table 2. Keyword Analysis (legend of keyword groups: Cybercrime; Computer Prevention; Computer/Digital Forensics; Fraud; Netflow; Network Forensics; Cryptography; Identity Based Encryption; Privacy; Privacy Preserving Semantics; Statistical Database; Onion Routing; Antiforensics; Anonymizers; Security; Log Files Analysis; Traffic Analysis; Network Intelligence; Privacy Enhancing Technologies; Transparency/Reliability; Legal Issues; Warrants; Forensics Readiness Capability; Information Privacy Incidents; Forensics Images; Privacy Protection; Forensics Computing Education; Forensics/Digital Investigation; Sequence Release Privacy Accurate; Privacy Preserving Object; Compound Document Format; Document Security; Electronic Document Information Leakage; Portable Document Format; Information Forensics; Private Browsing; Incognito; In-Private; Privacy Preserving Forensics; Encrypted Data Searching; Homomorphic Encryption; Commutative Encryption; Data Privacy/Protection; Sensor Web; Distributed Information System; Forensics Database; Relational Database; Suspicious Database Queries …; Phisher Network; Forensics Framework; PIS)
2 CURRENT TRENDS OF PRIVACY IN
DIGITAL FORENSICS
The Amendment protects people from unreasonable searches and seizures, and warrants that allow such seizures have to be specific to their cause. For example, if a warrant is issued against an individual to be searched for evidence of drugs, any material found that turns out to be child pornography will not be eligible to be used against the individual. The Amendment also stretches to the interception of communication networks, including wiretapping [2].
However, the Amendment only limits what type of information may be searched and seized, not the protocols for how it is to be searched and seized. On this ground, [2] proposed that an audit trail of the methodologies used by forensics investigators would be enough to verify whether the investigation protocols exceeded court authorization.
Apart from a general audit, much related research has also produced different models for forensics investigations in recent years. [3] proposed a framework whereby enterprises can achieve forensics readiness for approaching privacy-related violations. It consists of a series of business processes and forensics approaches, executed in hierarchical order such that enterprises can conduct quality privacy-related forensics investigations of information privacy incidents.
Two later models were proposed in 2010. Firstly, [4] proposed a cryptographic model to be incorporated into the current digital investigation framework, where forensics investigators first have the data owner encrypt his digital data with a key and perform indexing of the image of the data storage. Investigators then extract, with the encryption key, data from the image sectors that match the keywords they used. Image sectors without the keywords are thus never revealed to forensics investigators, guaranteeing privacy.
The next model, proposed by [5], introduces a layering system on data in order to protect both the privacy of users from being violated and the forensics investigators themselves from infringing privacy. It allows forensics investigators to first obtain information that is layered as unrelated to the individual before moving to the next layer. As each layer of information is justified and obtained, the layers get deeper and closer in relation to the individual, until the final layer, where the information needed for the forensics investigation is directly linked to the person.
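A minimal sketch of this layered-release idea, where the layer labels, their contents and the justification counter are hypothetical:

```python
# Hypothetical layers, ordered from least to most personally related.
layers = [
    ("system metadata", ["os version", "disk geometry"]),
    ("application traces", ["installed browsers", "network shares"]),
    ("personal content", ["mailboxes", "photo folders"]),
]

def release(level: int, justified_through: int):
    """Release a layer only if every shallower layer has already been
    justified; justified_through is the deepest layer justified so far
    (-1 when none has been)."""
    if level > justified_through + 1:
        raise PermissionError("justify shallower layers first")
    return layers[level][1]
```

The gate enforces the model's ordering: an investigator cannot jump straight to personal content without the shallower layers having been justified first.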
In [6], PPINA (Protect Private Information Not Abuser) is proposed: a framework embedded in Privacy Enhancing Technologies (PET), a technology designed to preserve user anonymity while accessing the Internet. The framework allows users to remain anonymous unless the server has enough evidence to prove that a user is attacking it, in which case it requests that a forensics investigation entity reveal the user's identity. The framework is designed to achieve a balance between user privacy and digital forensics, where both goals are met through a harmonious combination of network forensics and PET.
The development of digital forensics and security at the software level also raises many privacy-related issues. This includes information systems and related tools.
The first software type examined here is the counter-forensics privacy tool. A review was done in 2005 of this software type, which prevents forensics investigators from accessing private information by wiping out data such as caches, temporary files and registry values when executed. In [7], the researchers evaluated 6 tools in this category and found that while the tools potentially eliminate the vast majority of targeted data, they partially or fully failed in the 6 evaluated functions they claim to perform, including complete wiping of unallocated space, erasure of targeted user and system files and of registry usage records, recoverable registry archives from system restore points, recoverable data from special file system structures, and disclosure of the tool's own activity records. The authors suggested that encryption, such as the Encrypting File System, might be a better alternative to these tools.
A similar analysis of Privacy-Invasive Software (PIS) by [8], software that collects user information without the user's knowledge (such as spyware and advertisement software known as adware), found that the current tools designed to combat it (anti-spyware and anti-adware) fail to identify it fast enough, or at all, and have problems classifying PIS properly. The research concluded that these tools, which run on algorithms similar to those used against viruses and malware (signature identification), do not work well on PIS because it exists in a grey area between business facilitation and malice. Manual forensics methods, in the experiments, provided better results instead.
Browsers also raise privacy-related issues, as they are used to perform many activities, such as trading online, that require private information transfer. [9] published an analysis of three widely used browsers in terms of their private browsing effectiveness. Private browsing is a feature that prevents browsing history from being stored in the computer's data storage. The authors concluded that while all three browsers display no visible evidence of private browsing activity, related data can still be extracted with the proper forensics tools and methodology. From the user's viewpoint, the authors also concluded that Google Chrome and Mozilla Firefox are better private browsing solutions than Internet Explorer.
The Portable Document Format (PDF), invented by Adobe, is credited with better security than other document formats. In [10], the researchers released a review of this format, suggesting that PDF is subject to information leakage through several of its interactive features, including flagging content as "deleted" instead of really deleting it and allowing IP addresses to be traced on distribution, and that it is very open to hackers collecting this information while using PDF to conduct malicious cyber-attacks. The authors proved the point with several tools and attacks, and suggested a few administrator-level solutions for dealing with PDFs, such as checking the nature of received PDF files and using systems like EES (Elsevier Editorial System) to monitor PDF files.
In [11], the author reviewed the concept of Onion Routing, pointing out that the evolution of the concept in preserving privacy has raised difficulties for investigations. Onion Routing was created to prevent traffic analysis by third parties absolutely, by encrypting socket connections and acting as a proxy instead. Only the adjacent routers along the anonymous connection can "unpeel" the encryption as the packets approach their destination, preventing hijacking and man-in-the-middle attacks. However, the author argued that the same technology can be used by criminals to prevent traffic analysis by forensics investigators and to bypass censorship, or can be combined with other techniques to perform malicious attacks on networks. The concept makes it very difficult for forensics investigators to collect evidence, as there are too few avenues for third parties to access the information packets, unless access is gained from inside the chain of the connection or by tracing the last router's communication with the destination, which is the weakest protection in the chain.
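The layered "unpeeling" described above can be illustrated with a toy symmetric cipher; the XOR keystream and the three-router path are assumptions for illustration only, whereas a real onion-routing network uses authenticated public-key cryptography:

```python
# Sketch of onion routing's layered encryption: the sender wraps one layer
# per router, and each router removes exactly one layer with its own key.
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Expand a key into a pseudo-random keystream (illustrative, not secure).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def layer(data: bytes, key: bytes) -> bytes:
    # XOR is its own inverse, so the same call wraps or peels one layer.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

message = b"request to destination"
router_keys = [b"entry-key", b"middle-key", b"exit-key"]

# The sender wraps one layer per router, innermost (exit) layer first.
onion = message
for k in reversed(router_keys):
    onion = layer(onion, k)

# Each router along the path peels exactly one layer with its own key;
# only after the exit router's layer is removed does the plaintext appear.
for k in router_keys:
    onion = layer(onion, k)
assert onion == message
```

No single router can read the payload or see both endpoints, which is precisely what frustrates traffic analysis, whether by an attacker or by a forensics investigator.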
In [12], the researchers published their findings on preserving privacy in forensic DNA databases. Such databases are designed to be centralized and usable by forensics investigators globally to identify criminals based on DNA matches. To address the risk of such information leaking to parties for non-investigative purposes, the authors proposed a framework that reworks the database access controls to accept only legitimate forensics queries, such as queries on blood samples and cell tissues found at crime scenes.
In [13], the researcher outlined his research on privacy issues raised by sensor webs and distributed information systems, an active field after the 9/11 incident. Distributed information systems are information-collecting systems with huge data repositories, including private information such as financial and communications records. Sensor webs use small, independent sensors to collect and share information about their environment wirelessly. The author proposed several policies to maintain privacy in distributed information systems and sensor webs, including fundamental security primitives such as low-level encryption and authentication, human interfaces to limit queries, selective revelation of data, strong audits and better querying technologies, together with policy experimentation, security and legal analysis, and masking strategies.
Another networking issue arises with shared and remote servers, which store data for users as a form of third-party data storage. Essentially, there are two problems here. Firstly, these servers are owned by third-party service providers, so gaining access without their knowing what the investigators are looking for is difficult because of permission grants (privacy preservation). Secondly, the servers' remote nature makes it difficult to trace evidence across a large number of shared and distributed storage devices using the traditional forensics method of imaging (cloning) them. The usual privacy issue of tampering with irrelevant data also exists. To solve these problems, [14] proposed two schemes: homomorphic and commutative encryption. In the homomorphic encryption scheme, both the administrator of the remote servers and the investigators encrypt their data and queries. The administrator then uses the encrypted queries, with the investigator's key, to search the server for relevant data, and the investigator then decrypts the data with the administrator's key. The commutative encryption scheme introduces a Trusted Third Party (TTP) that supervises the administrator to prevent unfair play. The details are similar to homomorphic encryption, with another layer of commutative-law-based encryption applied by the TTP before the search of the data storage is conducted. Both schemes allow investigators to obtain the information they need without exposing it to the administrators of the remote servers.
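The commutative property these schemes rely on can be sketched with SRA-style modular exponentiation; the prime, key generation and keyword encoding below are illustrative assumptions, not the construction in [14]:

```python
# Sketch of commutative encryption: encrypting with two keys in either order
# gives the same ciphertext, so two parties can compare doubly-encrypted
# keywords for equality without revealing them to each other.
import random
from math import gcd

P = 2**127 - 1  # a large prime; exponentiation mod P commutes across keys

def make_key():
    # Pick e coprime to P - 1 so the inverse (decryption) exponent d exists.
    while True:
        e = random.randrange(3, P - 1)
        if gcd(e, P - 1) == 1:
            return e, pow(e, -1, P - 1)  # (encrypt, decrypt) exponents

def enc(m: int, e: int) -> int:
    return pow(m, e, P)

inv_e, inv_d = make_key()  # investigator's key pair
adm_e, adm_d = make_key()  # administrator's key pair

keyword = int.from_bytes(b"fraud", "big")  # encode a keyword as an int < P

# Encryption order does not matter, so each party can apply its own key and
# the doubly-encrypted values still match when the underlying keyword matches.
assert enc(enc(keyword, inv_e), adm_e) == enc(enc(keyword, adm_e), inv_e)
assert enc(enc(keyword, inv_e), inv_d) == keyword  # d undoes e
```

Because double encryption is order-independent, matching can be performed on ciphertexts alone: equal doubly-encrypted values imply equal keywords, while neither party ever sees the other's plaintext.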
In [15], the researchers presented an approach to detecting, through their queries, the parties accessing leaked information from a relational database. In this approach, the authors argued that a query can be determined to be suspicious if and only if the disclosed secret information could be inferred from its answers. To do this, a series of optimization steps involving the concepts of replaceable tuples, certificates and database instances is explained in relational mathematics. An algorithm is then constructed from these optimization steps to determine whether a query is suspicious with respect to a secret and a database instance.
In [16], a 2011 framework for preserving privacy while handling network flow records is proposed. Network flow recording collects information about network traffic sessions. This information can contain very private data, including network user information, their activities on the network, the amount of data transferred and the services used. The authors proposed a framework of integrated tools and concepts to prevent such data from falling into the wrong hands. The framework is divided into 3 sections: data collection and traffic flow recording; combined encryption with Identity Based Encryption and the Advanced Encryption Standard; and statistical database modelling and inference controls. The framework preserves privacy in two phases: encryption and decryption of the collected data, and construction of statistical reports in such a manner that inference controls are applied to prevent responses to suspicious queries.
To combat phishing, which often leads to identity theft, [17] proposed a framework in 2008. The framework counter-phishes the phishers, using a fake service (a phoneypot) with traceable credential data (phoneytokens). When a phisher is identified, he or she is directed to the phoneypot and transacts with it, transferring phoneytokens into the phisher's collection server. This allows investigators to trace and profile the phisher's identity through these tokens. The authors argued that even if the counter-phishing attempt is discovered, it will have caused enough problems for the phisher to avoid the target in the future, protecting the user from further exploitation by phishing attacks.
In general, database systems are supposed to store and handle data in a proper manner. In [18], the researchers published findings from 2007 that prove this wrong. They concluded that database systems do not necessarily remove stored data securely after deletion, so remnant data and operations can be found in allocated storage. Database systems also make redundant copies of data items that can be found in file systems. These data present a strong threat to privacy: not only may investigators find themselves dealing with unwarranted data, but criminals may also access them for malicious purposes. To avoid this, the authors designed a set of transparency principles to ensure secure deletion of data and modified the internals of a database system (MySQL) to encrypt the expunction log with minimal performance impact, which usually occurs with overwriting encryption.
In 2008, [19] published a paper explaining the importance of practicing computer forensics in today's networked organizations. It outlined the key questions, including the definition of computer forensics, its importance, the legal aspects involved and the online resources available for organizations to understand computer forensics in a nutshell.
In [20], a paper is published that addresses a rising problem of professionalism when digital forensics meets other fields. The author pointed out that in many scenarios where InfoSec professionals are deployed to work on digital crime investigations, their duties are limited by laws and legal systems and lack the intersection of business requirements from enterprises and government. He argued that coordination between different departments is essential to achieving investigation goals, and hence proposed a GRC-InfoSec compliance effort (GRC stands for Governance, Risk management and Compliance). A few suggestions put forth include a legal research database creating a cross-referencing table of regulatory actions and legal case citations against IT-specific laws and guidelines, and presentation of the resulting costs and business disruption.
As for education, [21] published a system that produces file system images for forensic computing training courses. The system, known as forensig and developed with Python and Qemu, allows instructors to script certain user behavior, such as deleting and copying files; the script is then executed to produce an image that the students can analyze, and the results can be matched against the input script. It solves the problem of instructors using second-hand hard disks for analysis practice, which often contain private data.
Besides that, [22] tackled cybercrime-related issues; matters regarding privacy as a fundamental right and a comparison of legal issues between countries were discussed in the workshop. In addition, there were a few works on privacy issues that may arise during malware analysis [23,24], the analysis of cloud and virtualized environments [25-27], and pervasive and ubiquitous systems [28-32]. With the growing usage of mobile devices and the Voice over IP (VoIP) protocol, several researchers have tried to provide privacy-sound models for investigation in these environments [33-36]. Finally, there were models for protecting forensics logs while considering user privacy on log access occasions [37,38].
3 DISCUSSION AND ANALYSIS OF
RESULTS
We believe that the development of solutions and frameworks to contain privacy issues in various fields is not synchronized. Our analysis is done field by field, with comparisons to related fields and their effects as a whole on privacy preservation. We found that while research in one field has contributed compelling solutions that might be a long-term answer to privacy preservation, this is not necessarily the case in another field. To analyze the development of each field, we split the stakeholders in each section into users' and forensics investigators' perspectives.
3.1 Privacy Preservation from User’s
Perspective
We found that, in the case of a user, the major problem of preserving privacy is the lack of knowledge and understanding. General users do not know the technicalities of how networks and data storage are managed, nor their rights concerning the use of their personal and private information by organizations. Hence, research and development of frameworks and systems that preserve the privacy of users' data are focused more
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 311-323, The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)
towards passive preservation, without users knowing how the frameworks and systems preserve their data.
We found this to be very effective, yet deceiving at the same time. In instances where frameworks are applied to networks and databases, for example the inference controls and encryption frameworks implemented in network flow recording and traffic analysis, onion routing, the cryptographic approach to DNA forensic databases, homomorphic and commutative encryption, and the sensor-web protection framework, the solutions provided are usually effective in tackling situational crises in data privacy, and users usually do not know that such solutions are implemented to protect their data from being exploited. However, the review of counter-forensic privacy tools, the analysis of how database systems delete data, and the problems of the Portable Document Format when it “deletes” data all paint a deceptive picture of whether these tools and systems live up to expectations or actually deliver on their tasks. Because users generally do not know whether these tools work as expected, and simply assume that they do, private data are constantly under threat of being exploited by malicious parties, with no warning to make users aware of the situation of their private data.
We also found that privacy preservation can never be achieved in full. The proposed frameworks and models, with their encryption and implemented technologies, share a similar issue in their findings: it is particularly hard to design a fully protected system, so constraints and assumptions are added to the calculus primarily to prove that the frameworks and models can function under those constraints. Mentions of “future work” or manual audits appear in particularly general models, including sensor webs and distributed information systems, database systems, relational database query controls and counter-phishing. This presents another issue: not all users are aware of the type of scenario in which their data would most likely be exploited, or of the type of scenario their current data storage is in. This contributes to yet another problem: when user privacy is breached, meeting the need for different kinds of professionalism to handle the investigation becomes difficult due to the lack of standardization and understanding of the scenarios and the status quo.
From these flaws, we understand that while research and development on preserving user privacy are improving, the idea of a fully protected framework or model will not materialize in the near future. It is important for users to understand the need to secure their private information in their own best interest, particularly as cloud computing technology is on the rise and more remote and shared data storage is made available to users. Users must accept responsibility for their own personal information, and combine as far as possible the several privacy-preserving solutions that have been developed to protect their data while networking. Picking the right browser for private browsing and using the services of trusted organizations with proven, functioning privacy preservation policies and technologies in place are a few of the decisions and combinations of models and frameworks that secure private data better.
We also think that users must always be aware and understand that their private data might be leaked. Such awareness is needed given that the status quo shows privacy preservation to be still at a developmental stage, redefining its borders and the extent of the protection it should provide. Users must always be prepared to face such scenarios and seek solutions when leaks happen, and to know how forensics investigators perform investigations without further threatening their privacy in this regard.
To conclude this subsection, we believe that users need a general understanding of and knowledge about how technologies aid privacy preservation while they store data on networks and use tools and services, and whether these technologies deliver their functions. We also believe that users must understand that technologies can only help privacy preservation so much, and that it takes a collective effort combining technologies with the professionalism and expertise of other fields to better privacy
preservation. It is also important that users are prepared to deal with situations in which their privacy has been breached, and to seek the best solutions available, including forensics investigations. It is also evident to us that the development of privacy preservation techniques and tools is directed more towards technical solutions than towards a holistic approach, desynchronizing the focus of efforts to tackle the problem.
3.2 Privacy Preservation from Forensics
Investigators’ Perspective
The jobs of forensics investigators are to collect, preserve and analyze information, then reconstruct the events of a crime. We found that, when it comes to privacy preservation from the forensics investigators' perspective, there is always a dilemma strongly linked with user privacy and legal systems, as pointed out by many related works.
We concur that the forensics investigators' procedural methodologies for collecting, preserving and analyzing information present potential avenues of user privacy infringement. Our agreement here is based on the general assumption that forensic investigators have a vested interest in this information: either it is important in proving a court case or a crime, or it is wanted for personal use, which often has malicious purposes.
We found that the related research and proposed solutions have both positive and negative effects on forensics investigations. We argue that the limitations and constraints implemented in these systems and models do help protect forensics investigators from infringing privacy but, on the other hand, keep them from conducting investigations in a more direct and effective manner.
We want to explain this on both levels. On the positive side, the constraints applied in various frameworks, such as homomorphic and commutative encryption, onion routing, inference controls, the use of DNA blood and tissue samples from the crime scene as key queries, sequential data release based on relational levels, and the network flow recording framework, all demonstrate a broad use of constraints to protect unrelated data from being exposed to forensics investigators during investigations. We believe that sequential data release based on relational levels is particularly important in addressing privacy issues and in balancing user privacy against the legal need to access such private data, as it allows a direct avenue to private information through a specific process, rather than something as general as organized queries and encryption. We believe that integrating these technologies can contribute more positively to aiding forensics investigators. Using sequential release of information based on relational levels as a framework in which to implement and shape organized queries is one example of integrating both techniques while conducting forensics investigations.
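As an illustration of the sequential-release idea, the following hedged sketch releases data tier by tier and stops once a tier proves irrelevant to the investigation. The field names, tiers and relevance test are hypothetical inventions for illustration, not the actual scheme of the reviewed work:

```python
# Sketch: private attributes are tiered by relational level, and an
# investigator only reaches the next, more sensitive tier if the current
# one justifies continuing. All fields and levels are hypothetical.

RECORD = {
    1: {"username": "jdoe"},                   # least sensitive tier
    2: {"login_times": ["09:02", "17:45"]},
    3: {"message_bodies": ["..."]},            # most sensitive tier
}

def sequenced_release(record, relevant):
    """Yield tiers one at a time; stop as soon as a tier proves irrelevant."""
    released = {}
    for level in sorted(record):
        released.update(record[level])
        if not relevant(record[level]):
            break  # deeper, more private tiers are never exposed
    return released

# the investigator's relevance test fails at tier 2, so tier 3 stays private
released = sequenced_release(RECORD, lambda tier: "username" in tier)
assert "message_bodies" not in released
```

The design choice this illustrates is that access to private data is a process with exit points, not a single all-or-nothing query.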
However, there are negative sides as well. The issues here lie in the non-technical part of dealing with privacy. We found that the most obvious impact comes from proposed frameworks such as cross-referencing encrypted queries with data, onion routing and strong audits, which directly limit the avenues forensics investigators can take to approach their investigations. We need to consider that all crime investigations are time sensitive, and the constraints placed by these frameworks may prolong the already time-consuming investigation process, as investigators now have to plan their investigation methods to be more technical and direct in order to extract the right evidence. Besides that, the possibility of extracting wrong or irrelevant evidence still exists regardless of which frameworks are in place. Tracing private information based only on keywords, without really knowing the content, does not necessarily reflect the nature of the data collected: the data might not be useful to the investigation, while still risking the exposure of private information.
Finally, we found that ambiguity always exists in privacy issues when it comes to forensics investigators. We argue that a forensics investigator is an individual equipped with decent knowledge of computer security. We believe that if an individual's purpose in obtaining private
information is malicious, the data will still be leaked into the wrong hands anyway. The idea is that, regardless of how far technology has gone in preserving privacy, data still run the risk of being leaked and exposed, considering their possible use and management by persons other than the users themselves. While such technologies deter forensics investigators from using the extracted information improperly, they still do not guarantee that the information will not be misused in the hands of forensics investigators, whether intentionally or unintentionally.
To conclude, we believe that the proposed frameworks, introduced technologies, and implemented models and tools believed to keep forensics investigators from infringing user privacy while conducting investigations might not be as one-sided as they seem. We believe that the rationale and professionalism of forensic investigators matter when handling private data, as their expertise in computer security is at a level sufficient to know how these technologies work in protecting private data. We also believe that such technologies still need to remain in place to deter forensics investigators from drifting away from their professionalism, but the negative impacts of such deterrence might jeopardize privacy even further, given the possibility of irrelevant information leaking out anyway and the prolonging of the forensics investigation process. We conclude that it is important for forensics investigators to know the sensitivity of the data they are going to handle in each investigation and to understand that their professionalism is important in preserving privacy.
3.3 Privacy Preservation from Technologies’
Perspective
We found that, from a technology perspective, the current development of cyber security and digital forensics in preserving privacy may have reached a bottleneck, with the latest developments constrained to very few general security measures. This in turn not only fails to bring much positive improvement to the field, but has negative effects as well.
We analyzed some of the reviews and would like to highlight several examples to support our findings. The first problem with current technologies is the similarity of techniques. We found that in almost all security measures taken in the various frameworks and models, be it database systems, remote servers, relational databases or network flow recording, the frameworks look similar in terms of their algorithms, which include encryption, data deletion and controls. We concur that some of the combinations are effective, such as onion routing and sequential data release, in keeping privacy from being exposed to unrelated parties. However, in general scenarios, similarity in security frameworks often means faster workarounds developed by malicious hackers, as the frameworks share a common structure and provide more examples for malicious parties to work their way around the security system. We also noticed that in some of the proposed frameworks, the authors state assumptions that would otherwise jeopardize the system, and offer a contingency solution. However, in one such scenario, onion routing, the author mentioned how it would also harm investigators should the framework be used against them. As onion routing renders traffic analysis by third parties impossible, it would be extremely difficult to trace or extract information from such a routing method when it is used by malicious users for tracking and profiling purposes. This is a typical example of how technologies, even in the cyber security field, can reverse the wanted results and have an unexpected and undesired effect when used by the wrong party.
The same happens with the commutative encryption example. The framework can only work properly under the assumption that the administrator provides all database information in an encrypted manner. Should this not be the case, not only does the extracted information risk being irrelevant, it also jeopardizes the investigation process, as the forensics investigators would likely miss important evidence when reconstructing the sequence of events of the crime.
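To make the commutativity property concrete, here is a minimal sketch of SRA-style commutative encryption (modular exponentiation with invertible exponents). The prime and key values are small illustrative choices, not parameters from the reviewed framework, and this is not a secure implementation:

```python
# Sketch of commutative encryption, illustrating the property
# E_a(E_b(m)) == E_b(E_a(m)) that such frameworks rely on: either party
# can add or strip its own encryption layer independently of the other.

p = 2**61 - 1  # a Mersenne prime; exponents must be coprime to p - 1

def make_key(e, p):
    pow(e, -1, p - 1)  # raises ValueError if e is not invertible mod p - 1
    return e

def encrypt(m, e, p):
    return pow(m, e, p)

def decrypt(c, e, p):
    d = pow(e, -1, p - 1)  # modular inverse of the exponent
    return pow(c, d, p)

investigator_key = make_key(65537, p)
admin_key = make_key(101, p)

m = 123456789
# encryption order does not matter: the two layers commute
c1 = encrypt(encrypt(m, admin_key, p), investigator_key, p)
c2 = encrypt(encrypt(m, investigator_key, p), admin_key, p)
assert c1 == c2
# either party can strip its own layer without the other's key
assert decrypt(decrypt(c1, admin_key, p), investigator_key, p) == m
```

The assumption criticized above maps directly onto this sketch: the scheme only helps if every record actually passes through `encrypt` before the investigator sees it.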
To conclude, the development of technologies in cyber security and digital forensics is very much predicated on technicalities alone, and does not necessarily improve privacy preservation as much as has been expected. The similarity of the proposed frameworks and models, plus the possibility of technologies falling into the wrong hands, are all issues that have to be solved at the grassroots level to ensure privacy preservation is successful. We believe that, apart from technical development, technologies will need to take into consideration the other aspects that influence digital forensics and cyber security, including education, business requirements and professionalism from other related fields, and work together so that a more holistic level of improvement in preserving privacy can be achieved. We also argue that technologies in digital forensics and security can backfire and become dangerous if used in reverse by malicious users intent on harming and infringing user privacy.
4 CRITICALLY OVERLOOKED ISSUES
As mentioned in the analysis section, we believe that privacy issues stem from intention and are made possible by the use of technology. However, technology has already evolved to a level at which it is applicable to almost every industry; a good example is how database technology is used in storing the DNA samples of criminals, which can extend into medical forensics, for a start. Research focus should now be placed more on solving the root problem rather than on introducing more technical countermeasures in the field, which many of the publications reviewed in this research proved to be applicable to both privacy preservation and privacy exploitation.
We also note that the focus on education and on awareness of the intention to protect and preserve privacy in the professional forensics field is not adequate to strike a balance between preserving privacy and getting the investigation done at a quality level. We find this particularly detrimental, as the technologies continuously being rolled out onto the commercial market will not be utilized at a satisfactory level by professional forensics investigators without proper training and awareness. This opens up more possibilities of abuse without consent, or abuse without a motive, by investigators. Awareness is also not emphasized on the user's side, and this exposes users to a higher risk of being abused under the same paradigm. Simply put, even with the latest technologies and frameworks in place to preserve privacy, they are rendered useless if the parties using them are not aware of their potential, leaving those parties at risk of being abused through such technologies instead.
5 CONCLUSION
This paper has identified various privacy issues in cyber security and digital forensics, examined the measures used for protecting data privacy in forensic investigation, shown how forensics investigators may infringe user privacy while conducting forensics investigations, and shown how user privacy is always under threat without proper protection. It has also reviewed the current shift in development trends in this industry, why such a trend could have happened, and what drives it.
The paper has reviewed various fields and their development in the technicalities and technologies that address this problem. The paper describes each field in a nutshell, explaining how these technologies work and their approaches to solving the problem of preserving privacy. The reviews are split into three sections, each with its corresponding fields of review and explanation. The paper then analyses these reviews and views them from the user's and the forensics investigator's perspectives, asking whether such developments in cyber security and digital forensics actually improve the efforts at preserving privacy. The paper concludes that while every development has its positive approach and solves what its authors set out to solve, the issue of privacy preservation still exists, with non-technical aspects of professionalism in practice and the ambiguity of scenarios causing some approaches to be counterproductive. The paper also analyses how, at a technical level, advanced technologies in
digital forensics and security are facing a bottleneck in development and could bring about equal harm to the current efforts in preserving privacy.
6 REFERENCES
[1] I-Long Lin, Yun-Sheng Yen, Annie Chang: “A
Study on Digital Forensics Standard Operation Procedure
for Wireless Cybercrime,” International Journal of Computer
Engineering Science (IJCES), Volume 2 Issue 3, 2012.
[2] C. W. Adams, “Legal issues pertaining to the
development of digital forensic tools,” Third International
Workshop on Systematic Approaches to Digital Forensic
Engineering, pp. 123-132, 2008.
[3] K. Reddy and H. Venter, “A Forensic Framework
for Handling Information Privacy Incidents,” Advances in
Digital Forensics, volume V, pp. 143-155, 2009.
[4] Frank Y.W. Law et al, “Protecting Digital Data
Privacy in Computer Forensic Examination,” Systematic
Approaches to Digital Forensic Engineering (SADFE), 2011.
[5] N. J. Croft, M.S. Olivier, “Sequenced release of
privacy-accurate information in a forensic investigation,”
Digital Investigation, volume 7, pp. 1-7, 2010.
[6] G. Antoniou, C. Wilson, and D. Geneiatakis, “PPINA – A Forensic Investigation Protocol for Privacy Enhancing Technologies,” Proceedings of the 10th IFIP Conference on Communication and Multimedia Security, pp. 185-195, 2006.
[7] M. Geiger and L. F. Cranor, “Counter-forensic
privacy tools,” Privacy in the Electronic Society, 2005.
[8] M. Boldt and B. Carlsson, “Analysing
countermeasures against privacy-invasive software,” in
ICSEA, 2006.
[9] H. Said, N. Al Mutawa, I. Al Awadhi and M.
Guimaraes, “Forensic analysis of private browsing
artifacts,” in International Conference on Innovations in
Information Technology, 2011.
[10] A. Castiglione, A. De Santis and C. Soriente,
“Security and privacy issues in the Portable Document
Format,” The Journal of Systems and Software, volume 83,
pp. 1813–1822, 2010.
[11] D. Forte, “Advances in Onion Routing: Description
and backtracing/investigation problems,” Digital
Investigation, volume 3, pp. 85-88, 2006.
[12] P. Bohannon, M. Jakobsson and S. Srikwan,
“Cryptographic Approaches to Privacy in Forensic DNA
Databases,” Lecture Notes in Computer Science Volume
1751, pp 373-390, 2000.
[13] J.D. Tygar, “Privacy in sensor webs and
distributed information systems,” Software Security, pp. 84-
95, 2003.
[14] Y. M. Lai, Xueling Zheng, K. P. Chow, Lucas Chi
Kwong Hui, Siu-Ming Yiu, “Privacy preserving confidential
forensic investigation for shared or remote servers,” in
International Conference on Intelligent Information Hiding
and Multimedia Signal Processing, pp.378-383, 2011.
[15] S. Böttcher, R. Hartel and M. Kirschner,
“Detecting suspicious relational database queries,” in The
Third International Conference on Availability, Reliability
and Security, 2008.
[16] B. Shebaro and J. R. Crandall, “Privacy-
preserving network flow recording,” Digital Investigation,
volume 8, pp. 90-100, 2011.
[17] S. Gajek and A. Sadeghi, “A forensic framework for tracing phishers,” LNCS volume 6102, pp. 19-33, Springer, 2008.
[18] P. Stahlberg, G. Miklau, and B. N. Levine,
“Threats to privacy in the forensic analysis of database
systems,” ACM Intl Conf. on Management of Data
(SIGMOD/PODS), 2007.
[19] US-CERT, Computer Forensics, 2008.
[20] S. M. Giordano, “Applying Information Security
and Privacy Principles to Governance,” Risk Management
& Compliance, 2010.
[21] C. Moch and F. C. Freiling, “The forensic image
generator generator,” in Fifth International Conference on
IT Security Incident Management and IT Forensics, 2009.
[22] J. R . Agustina and F. Insa, “Challenges before
crime in a digital era: Outsmarting cybercrime offenders,”
Workshop on Cybercrime, Computer Crime Prevention and
the Surveillance Society, volume 27, pp.211-212, 2011.
[23] F. Daryabar, A. Dehghantanha, HG. Broujerdi, “Investigation of Malware Defence and Detection Techniques,” International Journal of Digital Information and Wireless Communications (IJDIWC), volume 1, issue 3, pp. 645-650, 2012.
[24] F. Daryabar, A. Dehghantanha, NI. Udzir,
“Investigation of bypassing malware defences and malware
detections,” Conference on Information Assurance and
Security (IAS), pp. 173-178, 2011.
[25] M. Damshenas, A. Dehghantanha, R. Mahmoud, S.
Bin Shamsuddin, “Forensics investigation challenges in
cloud computing environments,” Cyber Warfare and Digital
Forensics (CyberSec), pp. 190-194, 2012.
[26] F. Daryabar, A. Dehghantanha, F. Norouzi, F
Mahmoodi, “Analysis of virtual honeynet and VLAN-based
virtual networks,” Science & Engineering Research
(SHUSER), pp.73-70, 2011.
[27] S. H. Mohtasebi, A. Dehghantanha, “Defusing the
Hazards of Social Network Services,” International Journal
of Digital Information, pp. 504-515, 2012.
[28] A. Dehghantanha, R. Mahmod, N. I Udzir, Z.A.
Zulkarnain, “User-centered Privacy and Trust Model in
Cloud Computing Systems,” Computer And Network
Technology, pp. 326-332, 2009.
[29] A. Dehghantanha, “XML-Based Privacy Model in Pervasive Computing,” Master's thesis, Universiti Putra Malaysia, 2008.
[30] C. Sagaran, A. Dehghantanha, R. Ramli, “A User-
Centered Context-sensitive Privacy Model in Pervasive
Systems,” Communication Software and Networks, pp. 78-
82, 2010.
[31] A. Dehghantanha, N. Udzir, R. Mahmod,
“Evaluating user-centered privacy model (UPM) in
pervasive computing systems,” Computational Intelligence in
Security for Information Systems, pp. 272-284, 2011.
[32] A. Dehghantanha, R. Mahmod, “UPM: User-
Centered Privacy Model in Pervasive Computing Systems,”
Future Computer and Communication, pp. 65-70, 2009.
[33] S. Parvez, A. Dehghantanha, HG. Broujerdi,
“Framework of digital forensics for the Samsung Star Series
phone,” Electronics Computer Technology (ICECT),
Volume 2, pp. 264-267, 2011.
[34] S. H. Mohtasebi, A. Dehghantanha, H. G.
Broujerdi, “Smartphone Forensics: A Case Study with Nokia
E5-00 Mobile Phone,” International Journal of Digital
Information and Wireless Communications
(IJDIWC), volume 1, issue 3, pp. 651-655, 2012.
[35] FN. Dezfouli, A. Dehghantanha, R. Mahmoud, “Volatile memory acquisition using backup for forensic investigation,” Cyber Warfare and Digital Forensic, pp. 186-189, 2012.
[36] M. Ibrahim, MT. Abdullah, A. Dehghantanha, “VoIP evidence model: A new forensic method for investigating VoIP malicious attacks,” Cyber Security, Cyber Warfare and Digital Forensic, pp. 201-206, 2012.
[37] Y. TzeTzuen, A. Dehghantanha, A. Seddon,
“Greening Digital Forensics: Opportunities and Challenges,”
Signal Processing and Information Technology, pp. 114-119,
2012.
[38] N. Borhan, R. Mahmod, A. Dehghantanha, “A
Framework of TPM, SVM and Boot Control for Securing
Forensic Logs,” International Journal of Computer
Application, volume 50, Issue 13, pp. 65-70, 2009.
Modelling Based Approach for Reconstructing Evidence of VoIP
Malicious Attacks
Mohammed Ibrahim, Mohd Taufik Abdullah and Ali Dehghantanha
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
[email protected], {mtaufik, alid}@fsktm.upm.edu.my
ABSTRACT
Voice over Internet Protocol (VoIP) is a new communication technology that uses the internet protocol to provide phone services. VoIP offers various benefits such as a low monthly fee and cheaper rates for long distance and international calls. However, VoIP is accompanied by novel security threats. Criminals often take advantage of such security threats and commit illicit activities. These activities require digital forensic experts to acquire, analyse, reconstruct and present digital evidence. Meanwhile, various methodologies and models have been proposed for detecting and analysing digital evidence in VoIP forensics. However, at the time of writing this paper, no model has been formalized for the reconstruction of VoIP malicious attacks. Reconstruction of the attack scenario is an important technique in exposing unknown criminal acts. Hence, this paper strives to address that gap. We propose a model for reconstructing VoIP malicious attacks. To achieve this, a formal logic approach called Secure Temporal Logic of Actions (S-TLA+) was adopted to rebuild the attack scenario. The expected result of this model is the generation of additional related evidence, whose consistency with the existing evidence can be determined by means of the S-TLA+ model checker.
KEYWORDS
Voice over IP, S-TLA+, Reconstruction,
malicious attack, Investigation, SIP,
Evidence Generation, attack scenario
1 INTRODUCTION
Voice over Internet Protocol (VoIP) phone services are prevalent in modern telecommunication settings and have the potential to become the next-generation telephone system. This novel telecommunication system provides an open platform that differs from the closed environment offered by conventional public switched telephone network (PSTN) service providers [1]. The exploitation of VoIP applications has drastically changed universal communication patterns by dynamically combining video and audio (voice) data to traverse a network together with the usual data packets [2]. The advantages of using VoIP services include cheaper call costs for long distance, local and international calls. Users make telephone calls with soft phones or IP phones (such as Skype) and send instant messages to their friends or loved ones via their computer systems [3].
The development of VoIP has brought a significant amount of benefits and satisfactory services to its subscribers [2]. However, VoIP services are exposed to various security threats derived from the Internet Protocol (IP) [4]. Threats related to this new technology include denial of service,
host and protocol vulnerability exploits, surveillance of calls, hijacking of calls, identity theft of users, eavesdropping, and the insertion, deletion and modification of audio streams [5]. Criminals take advantage of such security threats and commit illicit activities such as VoIP malicious attacks. This requires the acquisition, analysis and reconstruction of digital evidence. However, detecting and analysing evidence of attacks related to converged network applications is a most complicated task. Moreover, the complex settings of the service infrastructure, such as DHCP servers, AAA servers, routers, SIP registrars, SIP proxies, DNS servers, and wireless and wired network devices, also complicate the process of analysing digital evidence. As a result, reconstructing the root cause of the incident or crime scenario would be difficult without a specific model guiding the process.
1.1 Related Work
In recent times, researchers have developed
new models to assist forensic analysis by
providing comprehensive methodologies
and sound proving techniques.
Palmer [6] first proposed a framework with the following steps: identification, preservation, collection, examination, analysis, presentation and decision. The framework was presented at the proceedings of the first Digital Forensic Research Workshop (DFRWS) and served as the first attempt to apply forensic science to network systems. The framework was later built upon to produce an abstract digital forensic model, with the addition of preparation and approach-strategy phases; the decision phase was replaced by returning evidence. This model, however, works independently of system technology or digital crime [7].
Similarly, the work of Mandia and Prosise developed a simple and accurate methodology for incident response. The initial response phase of the methodology aims at determining the incident, and a response strategy phase is then formulated and added [8]. On the other hand, Casey and Palmer [9] proposed an investigative process model that ensures the appropriate handling of evidence and decreases the chance of mistakes through a comprehensive, systematic investigation. In another paper, Carrier and Spafford [10] adopted the process of physical investigation and proposed an integrated digital forensic process. In another approach, [11] combined existing models in digital forensics and came up with an extended model for investigating cybercrime that represents the flow of information and executes a full investigation. Baryamureeba and Tushabe reorganized the different phases of the work of Carrier and Spafford and enhanced the digital investigation process by adding two new phases (i.e. traceback and dynamite) [12].
Other frameworks include the work of Beebe and Clark, which is a hierarchical, objectives-based framework for the digital investigation process [13]. However, all the aforementioned models apply to digital investigation in a generalized form. Meanwhile, Ren and Jin [14] were the first to introduce a general model for network forensics that involves the following steps: capture, copy, transfer, analysis, investigation and presentation. The authors in [15], after surveying the existing models, suggested a new generic model for network forensics built from the aforementioned models. This model consists of preparation, detection, collection, preservation, examination, analysis, investigation and presentation.
Furthermore, many authors have proposed event reconstruction models; for instance, Stephenson [16] analysed the root cause of digital incidents and applied colored Petri nets to model the events that occurred. Gladyshev and Patel [17] developed an event reconstruction approach in which potential attack scenarios are constructed from a finite state machine (FSM), neglecting scenarios that deviate from the available evidence. The author in [18] used a computational model based on a finite state machine, together with the computer's history, to produce a model that supports existing investigations. Rekhis and Boudriga proposed in [19], [20] and [21] a formal logic entitled Investigation-based Temporal Logic of Actions (I-TLA), which can be used to prove the existence or non-existence of potential attack scenarios for the reconstruction and investigation of malicious network attacks. On the other hand, Pelaez and Fernandez [22], in an effort to analyse and reconstruct evidence of attacks in converged networks, proposed log correlation and normalization techniques. However, such techniques are effective only if the data in the files or forensic logs have not been altered.
The existing models stated above are generic rather than specific to a particular kind of attack. Reconstructing the evidence of malicious attacks against VoIP is therefore highly needed, because it plays an important role in revealing unknown attack scenarios. As a result, the reliability and integrity of the analysis of evidence in VoIP digital forensics would be improved, enhancing its admissibility in a court of law. In view of that, the work in this paper focuses on the reconstruction of malicious attacks on the Session Initiation Protocol (SIP) server. Hence, the VoIP evidence reconstruction model (VoIPERM) is proposed, which categorizes the previous model in [23] into main components and subcomponents. The model describes the VoIP system as a state machine through which information can be aggregated from the various components of the system and formulated into hypotheses that enable the investigator to model the attack scenario. Following the reconstruction of the attack scenario, actions that contradict the desirable properties of the system state machine are considered malicious [23]. Consequently, the collection of both legitimate and malicious actions enables the reconstruction of an attack scenario that will uncover new evidence. To determine the consistency of additional evidence with respect to the existing evidence, a state space representation is adopted that depicts the relationship between sets of evidence graphically. The graphical representation enables investigators to understand whether newly generated evidence supports the existing evidence; hence, it reduces the accumulation of unnecessary data during the investigation [23]. Additionally, the model is capable of reconstructing the actions executed during the attack that move the system from the initial state to an unsafe state. Thus, all the activities of the attacker are conceptualized to determine what, where and how the attack occurred, for proper analysis of the evidence [23]. To handle ambiguities in the reconstruction of the attack scenario, S-TLA+ is applied. Essentially, the application of S-TLA+ to computer security technology is efficient and generic. S-TLA+ is built on a logic formalism that accumulates forward hypotheses when there are insufficient details to comprehend the compromised system [19].
In addition, there have been several works on malware investigation [24, 25], analysis of cloud and virtualized environments [26-28], privacy issues that may arise during forensic investigation [29-34], mobile device investigation [35-37] and greening the digital forensics process [38].
The main contribution of this paper is a novel model for VoIP digital forensic analysis that can integrate digital evidence from the various components of a VoIP system and reconstruct the attack scenario. Our objective is to reconstruct VoIP malicious attacks in order to generate additional evidence from the existing evidence. The remainder of the paper is arranged as follows: Section 2 discusses VoIP malicious attacks; Section 3 discusses VoIP digital forensic investigation; Section 4 introduces the new model; Section 5 discusses the S-TLC model checker; Section 6 presents a case study; and Section 7 concludes.
2 VoIP MALICIOUS ATTACKS
In general, the term used for software built purposely to negatively affect a computer system without the consent of the user is malware [39], and the increased number of malicious activities during the last decade has caused most of the failures in computer systems [40]. Voice over IP is prone to malware attacks that exploit its related vulnerabilities. Having access to VoIP network devices, intruders can disrupt media services by flooding traffic, and can capture and control confidential information through illicit interception of call content or call signalling. By impersonating servers, intruders can hijack and make fake calls by spoofing identities [3]. Consequently, the confidentiality, integrity and availability of the users' services are negatively affected. VoIP services are also utilized by spammers to deliver instant messages, spam calls, or presence information. These spam calls are more problematic than the usual email spam, since they are hard to filter [3]. Similarly, attacks can traverse gateways to integrated network systems such as traditional telephony and mobile systems. Meanwhile, compromised VoIP applications constitute a link through which security mechanisms can be broken and internal networks attacked [39]. Attackers also make use of malformed SIP messages to attack embedded web servers through database injection vectors or cross-site scripting attacks [39].
2.1 SIP Malicious Attack
As previously explained, this paper considers SIP server attacks. Several attacks are related to the SIP server, but the threat of most concern within the research community is VoIP spam. Generally, spam is unwanted bulk email or calls intended for advertising or social engineering. The author in [3] notes that "Spam wastes network bandwidth and system resources. It exists in the form of instant message (IM), Voice and presence Spam within a VoIP setting" [3]. It affects the availability of network resources to legitimate users, which can result in a denial of service (DoS) attack. Spam originates from a collection of session initiations in an effort to set up a video or an audio communications session; if the user accepts, the attacker proceeds to transmit a message over the real-time media. This kind of spam is referred to as classic telemarketer spam; it applies to the SIP protocol and is well known as Spam over Internet Telephony (SPIT). Spam is further categorized into instant message spam (IM spam) and presence spam (SPPP). The former is like email spam, but consists of bulk, unwelcome sets of instant messages carrying the content the attacker wishes to send. IM spam is delivered using SIP MESSAGE requests with bulky subject headers, or SIP messages with text or HTML bodies. The latter is like the former, but is placed on presence requests (that is, SIP SUBSCRIBE requests) in an effort to obtain the "white list" of users in order to transmit an instant message to them or set off another kind of communication [3].
3 VoIP DIGITAL FORENSIC
INVESTIGATION
Lin and Yen [41] define digital forensic science as "to preserve, identify, extract, record as well as interpret the computer and network system evidence and analyse [it] through complete and perfect methods and procedures." On the other hand, forensic computing is a particularly important interdisciplinary research area founded on computer science and drawing on telecommunications and network engineering, law, justice studies, and social science [42]. To meet these security challenges, various organizations have developed numerous models and methodologies that satisfy their organizational security policies. Presently, more than a hundred digital forensic procedures have been developed globally [43]. The increasing number of security challenges in VoIP has also persuaded researchers to develop several models. In VoIP digital forensics, a standard operating procedure called the VoIP Digital Evidence Forensic Standard Operating Procedure (VoIP DEFSOP) has been established [41].
Moreover, a previous study noted that there was no established research agenda in digital forensics; to resolve this, six additional research areas were proposed at the 42nd Hawaii International Conference, including evidence modelling. In evidence modelling, the investigation procedure is replicated for practitioners, with case modelling for various categories of crimes [44]. However, the increasing number of computer-related crimes over the last decade pushes products and companies to support an understanding of what, who, where and how such attacks happened [45]. In line with this development, the model proposed in this paper supports the investigation and analysis of evidence by reconstructing attack scenarios related to VoIP malicious attacks. The reconstruction of the potential attack scenario will then assist investigators in conceptualizing what, where, and how the attack happened in the VoIP system.
4 VoIP EVIDENCE
RECONSTRUCTION MODEL
(VoIPERM)
The idea proposed in [43] is to assist investigators in finding and tracing the origin of attacks through the formulation of hypotheses. Our proposed model considers the VoIP system as a state machine (which observes the system properties in a given state), and the model is built from four main components, as shown below.
Figure 1. VoIP evidence reconstruction model
The explanation of each component is as
follows:
4.1 Terminal State/Available Evidence
This component observes the final state of the system at the occurrence of the crime; it is the primary source of evidence and is characterized by undesirable system behaviour. The terminal state provides the available evidence and gives an insight into the kind of action performed on the compromised system [23]. Other properties of system compromise described by [21] include any of the following:
Undesirable safety property of some system components
Unexpected temporal property
Let S be the set of all reachable states in the VoIP system and P be the collection of all desirable properties in a given state. If a state does not satisfy P, then the final state of the system is said to be unsafe. For every action in the sequence of actions associated with each reachable state, an action that leads the system to the unsafe state is said to be a malicious action, and such an action signifies one of the pieces of available evidence [23].
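The state-machine view above can be made concrete with a small sketch. The following illustration is not from the paper: the property and variable names are hypothetical, and it only shows how an action is classed as malicious when it drives the system into a state violating a desirable property.

```python
# Illustrative sketch of the terminal-state model: a state is a dict of
# component variables, and an action is malicious if executing it yields
# a state that violates a desirable property. All names are hypothetical.

def desirable(state):
    # Example desirable property: the SIP server is not flooding requests.
    return not state.get("flooding", False)

def apply_action(state, action):
    # An action maps an old state to a new state (old/new variable values).
    new_state = dict(state)
    new_state.update(action)
    return new_state

def is_malicious(state, action):
    # An action is malicious when the reached state is unsafe, i.e. it
    # fails the desirable property.
    return not desirable(apply_action(state, action))

initial = {"flooding": False}
assert not is_malicious(initial, {"registered": True})   # legitimate action
assert is_malicious(initial, {"flooding": True})         # malicious action
```

Under this encoding, the collection of actions that move the system to an unsafe state corresponds to the available evidence described above.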
4.2 Information Gathering
This component aims to collect and gather information that gives details about the VoIP system state. It comprises the following subcomponents.
VoIP components: these provide services such as voice mail access, user interaction media control, protocol conversion, call set-up, and so on. The components can be proxy servers, call processing servers, media gateways and so on, depending on the type of protocol in use [23]. Moreover, software and hardware behaviours are observed to give the investigator some clues about the VoIP system state. VoIP system states are defined as the valuation of component variables that change as a result of the actions acted upon them.
Let v1, ..., vn be the component variables that change when an action is executed in a given state; these are referred to as flexible variables. For any action that transforms a variable v into v′, where v and v′ are respectively the variable's values in the old and the new state, the properties of v and v′ are observed to decide whether they belong to the system's desirable properties [23].
VoIP vulnerabilities: these refer to any faults an adversary can abuse to commit a crime. Vulnerabilities make a system more prone to attack by a threat, or permit some degree of chance for an attack to be successful [46]. In VoIP systems, vulnerabilities include weaknesses of the operating systems and network infrastructures. Some weaknesses arise from poorly designed and implemented security mechanisms and mis-configured settings of network devices. The VoIP protocol stack is also associated with weaknesses that attackers exploit to access text-based credentials and other private information.
4.3 Evidence Generation
In this component, hypotheses are formulated based on the information gathered in the previous stage. The formulated hypotheses are used in the process of finding and generating additional evidence. The formal logic of digital investigation is applied to consider the available evidence collected from different sources and to handle its incompleteness by generating a series of crime scenarios according to the formulated hypotheses. This stage involves the following subcomponents:
Hypothesis formulation: to overcome the lack of system details encountered during the investigation, hypotheses are formulated based on the intruder's anticipated knowledge about the system and the details of the information captured from the VoIP components. The basis of hypothesis formulation is to predict the unknown VoIP malicious attack. In this case, there is a need to attach specific variables to the hypotheses and the VoIP components respectively, and to make assumptions that establish a relationship between the variables. This determines the effect of a hypothesis if it is applied to the VoIP components. To achieve this, three main requirements are set out:
Hypotheses should establish a relationship between system states (that is, VoIP component states in this regard), so as to avoid violating the original properties (the type invariant) of the system under investigation.
All hypotheses found to be contradictory are eliminated, to avoid adding deceptive hypotheses to a generated attack scenario.
To efficiently select and minimize the number of hypotheses through which a node is reached, the relationships among the hypotheses should be clearly expressed [19].
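The second and third requirements above can be sketched in code. The sketch below is illustrative only (the hypothesis strings and the contradiction convention are hypothetical): it eliminates contradictory hypothesis sets and drops supersets so that each node is reached through a minimal set of hypotheses.

```python
# Illustrative filter for hypothesis sets: contradictory sets are
# eliminated, and strict supersets are dropped to ensure minimality.
# Hypothesis names and the "not ..." negation convention are hypothetical.

def consistent(hyps):
    # A set is contradictory if it contains a hypothesis and its negation.
    return not any(("not " + h) in hyps for h in hyps)

def minimal_sets(candidates):
    consistent_sets = [set(h) for h in candidates if consistent(set(h))]
    # Keep only sets that are not strict supersets of another kept set.
    return [h for h in consistent_sets
            if not any(other < h for other in consistent_sets)]

candidates = [
    {"default password unchanged"},
    {"default password unchanged", "service port exposed"},  # superset: dropped
    {"plain-text SIP", "not plain-text SIP"},                # contradictory: dropped
]
print(minimal_sets(candidates))  # only the single minimal, consistent set survives
```

The same superset-pruning step reappears later in the S-TLC node-label update, where environments that are supersets of other label elements are deleted.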
Moreover, the process of investigation relies on the formulation of hypotheses to describe the occurrence of the crime. At the lowest levels of investigation, hypotheses are used to reconstruct events and to abstract data into files and complex storage types, while at higher levels of investigation, hypotheses are used to explain user actions and sequences of events [45]. An investigation is a process that applies scientific techniques to formulate and test hypotheses. At this point, VoIP variables are denoted as indigenous variables, while variables formed by hypotheses are denoted as exogenous variables. Consequently, this describes how VoIP components are expected to behave if the formulated hypotheses are executed. Assumptions are made based on the attacker's expected knowledge of the system. The sets of hypotheses are variables signifying the attacker's expected knowledge about the system, which differ from the flexible variables mentioned above; all the variables derived from hypothesis formulation are referred to as constrained variables. Meanwhile, as hypotheses are aggregated, care should be taken to avoid adding ambiguous hypotheses that can prevent the system from moving to the next state; in S-TLA+ this signifies an inconsistency [19].
Modelling of attack scenario: digital forensic practice demands the generation of temporal analyses that logically reconstruct the crime [26]. According to [47], in crime investigation it is possible to reason about crime scenarios: explanations of states, and of the events that change those states, that may have occurred in the real world. However, because attack scenarios are complex to understand, it is vital to develop a model that simplifies their description and representation within a collection of information and allows new attacks to be regenerated from existing ones [19]. For this reason, it is essential to model VoIP malicious attacks so as to enable investigators to understand the attack scenario and to describe how and where to acquire digital evidence. In this regard, instead of modelling both the system and the witness statements as finite automata, as in [40], S-TLA+ is used to model the attack scenario, since it supports logic formulation with uncertainty. In addition, evidence can easily be identified with S-TLA+ using a state predicate that evaluates the relevant system variables [19]. Moreover, S-TLA+ is an advancement over the Temporal Logic of Actions (TLA). A system is specified in TLA by a formula of the form Init ∧ □[Next]_v ∧ L, relating the set of all its authorised behaviours: it expresses a system whose initial state satisfies Init and in which every step satisfies the next-state relation Next or leaves the tuple v of specification variables unchanged. The infinite behaviour of the system is constrained by the liveness property L (written as a conjunction of weak and strong fairness conditions on actions). In this regard, TLA can be used in S-TLA+ to illustrate a system's progress from one state to another, in advance of the execution of an action under a given hypothesis [11]. Meanwhile, in S-TLA+ a constrained variable whose hypothesis has not yet been expressed assumes a fictive value [19].
An action is a Boolean function over a pair of states: it is true in states (s, t) if, when each unprimed variable in state s is replaced by the corresponding primed variable in state t, the resulting expression evaluates to true [19]. Likewise, each non-assumed constrained variable in state s is replaced by the assumed constrained variable in state t. If a set of actions satisfies the property characterizing the desirable behaviour of the system, it is said to be a set of legitimate actions; otherwise it is said to be a set of malicious actions [23]. An attack scenario fragment is the collection of both legitimate and malicious actions that move the system to an unsafe state, and the attack scenario is defined accordingly [23].
Testing the attack scenario: the purpose of testing a generated attack scenario is to ascertain its reliability with respect to the system's behaviours. The properties of the system at a given state are examined: the investigator should compare the properties of the generated attack scenario with the system's final state. If any of the scenarios satisfies the properties of the final state, then the investigator should generate and print the digital evidence; otherwise the hypotheses should be reformulated [23]. Formally, given the set of generated attack scenarios and the set of VoIP system states, a scenario satisfies the properties of the system's final state when its behaviour satisfies the property characterizing that final state [23].
4.4 Print Generated Evidence
Evidence can be generated from the attack scenario using the forward and backward chaining phases adopted from inferring scenarios with S-TLC [19]. After being logically proved with S-TLA+, the proposed model is expected to reconstruct the malicious attack scenario in the form of specifications that can be verified using the S-TLA+ model checker, called S-TLC. S-TLC builds a directed graph, founded on a state space representation, that verifies the logical flow of specifications written in the S-TLA+ formal language. Complete reconstructions of the attack scenario fragments are thus represented, and the logical relationships between them illustrated, on a directed graph [23]. At this point, the investigator is likely to realize what, how, where and why the incident was accomplished in the VoIP system. The resulting outcome of the graph is new evidence that matches the existing evidence. For every generated attack scenario, the flexible variables and the constrained variables of each reachable state are evaluated: the valuation of all the non-constrained variables is called the node core, and the valuation of all the constrained variables is called the node label. Each reachable state can then be represented on the directed graph G by its node core and node label.
5 S-TLC MODEL CHECKER AND STATE SPACE REPRESENTATION
A state is represented on the generated graph as a valuation of all its variables, including the constrained ones. This involves two notions:
Node core: the valuation of all the non-constrained variables, and
Node label: the valuation of all the constrained variables under a given hypothesis.
Given a state t, tn is used to denote its equivalent node core, tc to describe its resulting environment (a set of hypotheses), and Label(G, t) to refer to its label in graph G.
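As a rough illustration of the two notions above (variable names are hypothetical, not from the S-TLC specification), a state can be split into its node core and node label by separating constrained from non-constrained variables:

```python
# Illustrative split of a state into node core (non-constrained variables)
# and node label (constrained variables assumed under hypotheses).

def split_state(state, constrained):
    node_core = {k: v for k, v in state.items() if k not in constrained}
    node_label = {k: v for k, v in state.items() if k in constrained}
    return node_core, node_label

state = {"access": "admin", "h_default_password": True}
core, label = split_state(state, constrained={"h_default_password"})
assert core == {"access": "admin"}
assert label == {"h_default_password": True}
```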
The S-TLC algorithm is built on three data structures: G, UF and UB. G refers to the reachable directed graph under construction; UF and UB are FIFO (first in, first out) queues containing states whose successors have not yet been computed, used during the forward and backward chaining phases respectively. The S-TLC model checker works in three phases [19].
5.1 Initialization Phase
The initialization phase is the first stage of the S-TLC algorithm and involves the following steps:
1. G, UF and UB are created and initialized to the empty set and the empty sequence, respectively. At this step, each state satisfying the initial predicate is computed and then checked against the invariant predicate Invariant (a state predicate to be satisfied by each reachable state).
2. If the state satisfies the predicate Invariant, it is appended to graph G with a pointer to the null state and a label equal to the set of hypotheses relative to the current state; otherwise, an error is generated. If the state does not satisfy the evidence predicate EvidenceState (a predicate characterizing the system terminal states that represent digital evidence), it is attached to UF; otherwise it is considered a terminal state and appended to UB, from which it is retrieved in the backward chaining phase [19].
5.2 Forward Chaining Phase
In this phase, all the scenarios that originate from the set of initial system states are inferred in forward chaining. This involves the generation of the new sets of hypotheses and evidence consequent to these scenarios. During this phase, and until the queue becomes empty, a state s is retrieved from the tail of UF and its successor states are computed. For every successor state t satisfying the predicate Constraint (specified to assert a bound on the set of reachable states): if the predicate Invariant is not satisfied, an error is generated and the algorithm terminates; otherwise state t is appended to G as follows:
1. If a node core tn does not exist in G, a new node (set to tn) is appended to the graph with a label equal to tc and a predecessor equal to sn. State t is appended to UB if it satisfies predicate EvidenceState; otherwise it is attached to UF.
2. If there exists a node x in G that is equal to tn and whose label includes tc, then it can be concluded that node t was added to G previously. In that case, a pointer is simply added from x to the predecessor state sn.
3. If there exists a node x in G that is equal to tn, but whose label does not include tc, then the node label is updated as follows:
tc is added to Label(G, x).
Any environment in Label(G, x) which is a superset of some other element of this label is deleted, to ensure hypothesis minimality.
If tc is still in Label(G, x), then x is pointed to the predecessor state sn, and node t is appended to UB if it satisfies predicate EvidenceState; otherwise it is attached to UF [19].
The resulting graph is a set of scenarios that end in any state satisfying the predicate EvidenceState and/or Constraint.
5.3 Backward Chaining Phase
All the scenarios that could produce the states satisfying predicate EvidenceState generated in forward chaining are constructed. During this phase, and until the queue becomes empty, the tail of UB, denoted state t, is retrieved, and its predecessor states (i.e. the set of states si such that (si, t) satisfies action Next) which are not terminal states and satisfy the predicates Invariant and Constraint are computed (states that do not satisfy predicate Invariant are discarded, because this step aims simply to generate additional explanations). Each computed state s is appended to G as follows:
1. If sn is not in G, a new node (set to sn) is appended to G with a label equal to the environment sc. A pointer is then added from node tn to sn, and state s is appended to UB.
2. If there exists a node x in G that is equal to sn, and whose label includes sc, then node s was added to G previously. In that case, a pointer is simply added from tn to the predecessor state sn, and s is appended to UB.
3. If there exists a node x in G that is equal to sn, but whose label does not include sc, then Label(G, x) is updated as follows:
sc is added to Label(G, x).
Any environment in Label(G, x) which is a superset of some other element of this label is deleted, to ensure hypothesis minimality.
If sc is still contained in the label of state x, then node t is pointed to the predecessor state x and the node is appended to UB.
The outcome of the three phases is a graph G containing the set of possible causes relative to the collected evidence. It embodies different initial system states apart from those described by the specification [19].
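The three phases above can be sketched as ordinary code. The following is a deliberately simplified illustration, not the S-TLC implementation: hypothesis labels and environments are omitted, the transition relation and predicates are hypothetical, and the backward phase merely collects the predecessors already recorded during forward exploration.

```python
# Simplified sketch of the three S-TLC phases: initialization, forward
# breadth-first exploration from the initial states, and a backward pass
# from evidence states. Node labels (hypothesis environments) are omitted.
from collections import deque

def s_tlc(initial_states, successors, invariant, is_evidence):
    graph = {}                    # node -> set of predecessor nodes
    UF, UB = deque(), deque()     # FIFO queues for the two chaining phases
    # Initialization phase: check each initial state against the invariant.
    for s in initial_states:
        if not invariant(s):
            raise ValueError(f"invariant violated in initial state {s!r}")
        graph[s] = set()
        (UB if is_evidence(s) else UF).append(s)
    # Forward chaining phase: compute successors until UF is empty.
    while UF:
        s = UF.popleft()
        for t in successors(s):
            if not invariant(t):
                raise ValueError(f"invariant violated in state {t!r}")
            if t in graph:        # node already in G: just add a pointer
                graph[t].add(s)
            else:
                graph[t] = {s}
                (UB if is_evidence(t) else UF).append(t)
    # Backward chaining phase: collect the immediate causes of each
    # evidence state (their predecessors, recorded in the graph).
    causes = {}
    while UB:
        t = UB.popleft()
        causes[t] = graph[t]
    return graph, causes
```

Under these simplifications the result is the reachability graph G plus, for each evidence state, its immediate causes; the full algorithm of [19] additionally maintains node labels and prunes non-minimal hypothesis environments.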
6 CASE STUDY
To investigate a VoIP malicious attack using the proposed model, the following case study on the reconstruction of a Spam over Internet Telephony (SPIT) attack is presented: it investigates the denial of service experienced by some VoIP users as a result of VoIP spam. A direct investigation shows that the network bandwidth and other resources have been exhausted by the server, as it was busy receiving and sending audio message requests to SIP URIs (Uniform Resource Identifiers).
According to the VoIP evidence reconstruction model, the first stage emphasizes the identification of the terminal state and the available evidence of the attack.
6.1 Terminal State/Available Evidence
Exhaustion of bandwidth and other resources; the server busy sending audio message requests to SIP URIs.
6.2 Information Gathering
This includes the following:
VoIP components: these comprise both the signalling and the media infrastructure. The former is based on the Session Initiation Protocol (SIP) in particular, and includes the SIP stack (SS), which is responsible for sending and receiving, manufacturing and parsing SIP messages, and SIP addressing (SA), which is based on the URI. The latter concerns the Real-time Transport Protocol (RTP) stacks, which code and decode, compress and expand, and encapsulate and demultiplex media flows.
VoIP vulnerabilities: these can arise from the following:
a. Unchanged default passwords on deployed VoIP platforms can be strongly vulnerable to remote brute-force attack.
b. Many of the services that expose data also interact as web services with the VoIP system, and these are open to common vulnerabilities such as cross-site request forgeries and cross-site scripting.
c. Many phones expose services that allow administrators to gather statistics, information and remote configuration settings. These ports open the door for information disclosure that attackers can use to gain more insight into a network and identify the VoIP phones.
d. A wrongly configured access device that broadcasts messages enables an attacker to sniff messages in the VoIP domain.
e. The initial version of SIP allows plain text-based credentials to pass through the access device.
6.3 Evidence Generation
This stage involves the following:
Hypothesis formulation:
a. A hypothesis stating that a VoIP device running a service on a default password can grant access to an intruder after a remote brute-force attack.
b. A hypothesis stating that service ports on VoIP phones expose data and also interact as web services; an intruder with access to a VoIP service can exploit such a vulnerability in the form of cross-site scripting to gain administrator access.
c. Some phones expose a service that allows administrators to gather statistics, information and remote configuration; a hypothesis stating that such phones can grant an intruder direct access to administrative responsibility.
d. A hypothesis stating that there is a wrongly configured access device which broadcasts SIP messages; this enables the attacker to intercept SIP messages.
e. A hypothesis stating that the messages are running on the initial version of SIP, which has a vulnerability that sends SIP messages in plain text; an intruder who intercepts the messages can extract user information from them.
f. An intruder equipped with administrator functions can create, decode and send a request message.
g. An intruder can extract SIP extensions/URIs by sending an OPTION message request, after scanning all ports running on 5060 in the SIP domain, in order to send a SIP message.
h. A hypothesis stating that, where the credentials were encrypted as cipher text, an encryption engine is required to enable the intruder to digest the SIP message header and obtain other information.
Modelling of Attack Scenario: in
this case, we are to use STLA+
The specification describes the
available evidence with predicate
which uses the function request to state that the
machine is busy sending invite
audio messages.
In this segment we are to represent
hacking scenario fragment inform of
hypothetical action as described
below.
a. : There is a
Hypothesis stated that there is
vulnerability that VoIP running
service on a default password, an
intruder can easily brute force
and gain access and raise up his
privilege from no access( ) to access level ( )
on the VoIP network, by
performing brute force on
VoIP( ) default password.
b. : A hypothesis states that the
service ports on VoIP have
vulnerabilities which, if
exploited, can raise the
accessibility level of an attacker
from ( ) to
administrator access ( ) by exploiting a service port
vulnerability ( ).
c. : A hypothesis states
that some VoIP phones expose a
service that allows
administrators to gather
information for remote
configuration. If exploited, such a
vulnerability can grant direct access from
( ) to administrator
access ( ) by exploiting the phone
vulnerability ( ).
d. : A hypothesis states
that if there are wrongly configured
access devices which allow
messages to be broadcast, SIP
has vulnerabilities that send
messages with plain-text
credentials. If exploited, an
intruder can intercept SIP
messages ( ) and eavesdrop.
e. : A user with
administrative access can
manufacture ( ), decode and encapsulate SIP
messages using the SIP stack
(SS).
f. : The user requires SIP
extensions or URIs to send
invite messages; being equipped
with administrative access, the
intruder sends an OPTIONS message
request to extract the SIP URIs (
) provided that the
service port is running on port 5060.
International Journal of Cyber-Security and Digital Forensics (IJCSDF) 1(4): 324-340. The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2305-0012)
g. : The intruder takes
advantage of the fact that
the device has an encryption
engine, which enables him to digest
the cipher text in the SIP message
header field value and extract
other information related to the SIP
message credentials.
h. : The intruder, with
administrative access and a
manufactured SIP message, then
sends an invite audio message
( ) to the server as a message request.
i. : The user then logs out
of the VoIP domain.
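The chain of hypothetical actions a-i above can be read as a sequence of transitions over the intruder's access level. The following is an illustrative Python sketch, not the paper's S-TLA+ notation; the action names, access levels and evidence labels are all assumptions made for the example. Each action requires a minimum access level (its hypothesis), may change the intruder's access, and leaves a trace that can later serve as evidence.

```python
# Illustrative sketch of the scenario fragment: each action carries a
# precondition (minimum access level), an effect on the intruder's access,
# and the evidence it leaves. All names here are hypothetical.

NO_ACCESS, USER, ADMIN = 0, 1, 2

# (action name, required access, resulting access, evidence left)
ACTIONS = [
    ("brute_force_default_password", NO_ACCESS, USER,      "voip_login"),
    ("exploit_service_port",         USER,      ADMIN,     "port_scan"),
    ("intercept_sip_messages",       ADMIN,     ADMIN,     "plaintext_sip"),
    ("manufacture_sip_message",      ADMIN,     ADMIN,     "sip_stack_use"),
    ("extract_sip_uris",             ADMIN,     ADMIN,     "option_request"),
    ("send_invite_audio",            ADMIN,     ADMIN,     "invite_request"),
    ("logout",                       ADMIN,     NO_ACCESS, "logout_record"),
]

def run_scenario(actions):
    """Replay the scenario, checking each precondition; collect evidence."""
    access, evidence = NO_ACCESS, []
    for name, required, resulting, trace in actions:
        if access < required:   # hypothesis not satisfied: scenario invalid
            return None
        access = resulting
        evidence.append(trace)
    return evidence

print(run_scenario(ACTIONS))
```

A scenario whose preconditions cannot all be satisfied (for instance, sending an invite without first gaining administrator access) is rejected, mirroring how an unsupported hypothesis invalidates a scenario fragment.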
The S-TLA+ attack scenario fragment module is depicted in the figure below.
Figure 2. Generated attack scenario fragment using
S-TLA+
Testing the Generated Scenarios: given a set of generated attack scenarios, if any
of the scenarios satisfies the terminal state
of the system under investigation, then
digital evidence is generated and printed;
otherwise the hypothesis is
reformulated. In the case study
presented above, an action
in the generated scenarios
satisfied the available evidence of the
terminal state of the system.
Printing the Generated Evidence: to generate evidence from the attack scenario
fragment presented in Figure 2, we used
the forward and backward chaining phases
explained above. This has been
adopted from inferring scenarios with S-
TLC [19].
Figure 3. Forward chaining phase VoIP attack
scenario
The graph in Figure 3 shows the main
possible attack scenario on VoIP. Initially,
no user is accessing the VoIP system.
The default password was not changed
during implementation of the system. An
intruder exploits this vulnerability by
performing an action and gains access to the VoIP service. The
intruder further exploits a vulnerability in the
service ports with an action and
gains administrator access, or exploits a
VoIP phone vulnerability with an action
that grants access to administrative functions and thereby obtains
administrator access. The hacker can
intercept all the incoming messages to the
server by executing an action , as a result of exploiting a vulnerability by
which messages are sent as plain text in
the initial version of SIP. With
administrative power, the intruder accesses
SIP URIs from the intercepted messages
after executing an action and
sends invite audio messages to the
collected URIs by performing an action
, without any hypothesis being established for the last two actions.
Therefore the node labels remain the same;
the intruder then logs out and leaves evidence within
the system. The underlined text in the
generated graph is the available evidence,
while the rest is new evidence generated
during the investigation.
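The forward chaining phase described above starts from the available evidence and repeatedly fires actions whose hypotheses hold, deriving new facts until a fixed point is reached. The following is a minimal illustrative sketch in Python, not the S-TLC tool itself; the rule names and facts are assumptions chosen to mirror the VoIP case study.

```python
# Minimal forward-chaining sketch (illustrative; the paper uses S-TLC):
# starting from the available evidence, keep applying rules whose
# preconditions are satisfied until no new facts can be derived.

RULES = [
    # (rule name, preconditions, derived fact) -- all names hypothetical
    ("brute_force",  {"default_password"},             "voip_access"),
    ("exploit_port", {"voip_access", "open_port"},     "admin_access"),
    ("intercept",    {"admin_access", "plain_sip"},    "sip_messages"),
    ("extract_uris", {"admin_access", "sip_messages"}, "sip_uris"),
    ("send_invite",  {"admin_access", "sip_uris"},     "invite_sent"),
]

def forward_chain(facts, rules):
    """Derive the closure of `facts` under `rules` (fixed-point iteration)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for _, pre, post in rules:
            if pre <= facts and post not in facts:  # preconditions satisfied
                facts.add(post)
                changed = True
    return facts

initial = {"default_password", "open_port", "plain_sip"}
print(sorted(forward_chain(initial, RULES)))
```

Underlined nodes in the paper's graph correspond to the initial facts here; every fact added by the loop corresponds to new evidence derived during the investigation.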
The generated attack scenario prevented
inconsistency from occurring. The action
( ) is not part of the generated scenario because it contradicts the
action .
The generated graph after execution of the
forward and backward chaining phases is
shown in Figure 4. It shows a newly
generated scenario. It follows the same
pattern as the forward chaining phase, but
in this case the VoIP system holds
information on received messages that are
not accessible to the intruder. The intruder
performs the same actions as in the forward
chaining phase and is granted
administrator access. Thereafter, the
intruder manufactures a SIP invite
message by executing an action
( ). The intruder accesses SIP URIs and sends a SIP invite audio message
to the collected URIs by performing the actions
and
respectively. No hypotheses have been
established for these actions to be executed;
the intruder then logs out of the system
after executing an action and leaves digital evidence. The underlined text in the
generated graph is the available evidence,
while the rest is new evidence
generated during the reconstruction of the attack
scenario.
Figure 4. Backward chaining phase, scenario attacks
on VoIP
7 CONCLUSIONS
In this paper, we proposed a model for
reconstructing Voice over IP (VoIP)
malicious attacks. This model generates
more specific evidence that matches
the existing evidence through the
reconstruction of a potential attack scenario.
Consequently, it provides significant
information on what, where, why and how a
particular attack happens in a VoIP system.
To complement our study, there is a need for the
reconstruction of anonymous and peer-to-
peer SIP malicious attacks.
REFERENCES
1. Yun-Sheng Yen, I-Long Lin, Bo-Lin Wu: A
Study on the Mechanisms of VoIP Attacks:
Analysis and Digital Evidence. Journal of Digital
Investigation 8, 56-67, ScienceDirect (2011).
2. Jaun C. Pelaez: Using Misuse Patterns for VoIP
Steganalysis. 20th International Workshop on
Database and Expert Systems
Application (2009).
3. Patrick Park: Voice over IP Security. Cisco Press,
ISBN: 1587054698 (2009).
4. Hsien-Ming Hsu, Yeali S. Sun, Meng Chang
Chen. Collaborative Forensic Framework for
VoIP Services in Multi-network Environments.
In: Proc. 2008 IEEE International workshops on
intelligence and security informatics, pp. 260-
271 Springer-Verlag Berlin Heidelberg (2008)
5. Jill Slay and Mathew Simon: Voice over IP:
Privacy and Forensic Implication. International
Journal of Digital Crime and Forensics (IJDCF)
IGI Global (2009).
6. Palmer G. : A road map for digital forensic
research. In: First digital forensic research
workshop. DFRWS Technical Report New York
(2001).
7. Mark Reith, Clint Carr and Gregg Gunsch: An
Examination of Digital Forensic Models.
International Journal of Digital Evidence, Vol.
1, Issue 3, Fall (2002).
8. Mandia K, Procise C.: Incident Response and
Computer Forensics. In: Emmanuel S. Pilli, R.C.
Joshi, Rajdeep Niyogi: Network Forensic
Frameworks: Survey and Research Challenges.
Digital Investigation pp.1-14, Elsevier(2010).
9. Casey E, Palmer G.: The investigative process.
In: Emmanuel S. Pilli, R.C. Joshi, Rajdeep Niyogi:
Network Forensic Frameworks: Survey and Research Challenges. Digital Investigation pp.1-14,
Elsevier(2010).
10. Brian Carrier, Eugene Spafford: Getting
Physical with the Digital Investigation Process.
International Journal of Digital Evidence, Vol. 2,
Issue 2, Fall (2003).
11. Ciarduhain O.S.: An extended Model of
Cybercrime Investigation. International Journal
of Digital Evidence, Vol.3 Issue1.
Summer(2004).
12. Baryamureeba V. Tushabe F.: The Enhanced
Digital Investigation Process Model. In :
Proceedings of the fourth digital forensic
research workshop (DFRWS); (2004).
www.makerere.ac.ug/ics
13. Beebe NL, Clark JG: A Hierarchical,
Objectives-Based Framework For the Digital
Investigations Process. Digital Investigation
2(2) pp146-66. Elsevier(2005)
14. Ren W., Jin H.: Modeling the Network Forensic
Behavior. In: Security and Privacy for Emerging
Areas in Communication Networks, 2005.
Workshop of the 1st International Conference,
pp. 1-8, IEEE (2005).
15. Emmanuel S. Pilli, R.C. Joshi, Rajdeep Niyogi: Network Forensic Frameworks: Survey and Research
Challenges. Digital Investigation pp.1-14,
Elsevier(2010).
16. Peter Stephenson.: Modeling of Post-incident
Root Cause Analysis. International Journal
of Digital Evidence 2, pp. 1-16 (2003).
17. Pavel Gladyshev and Ahmed Patel: Finite State
Machine Approach to Digital Event
Reconstruction. International Journal of Digital
Forensics & Incident Response, pp. 130-149
(2004).
18. Brian D. Carrier and Eugene H. Spafford: An
Event-Based Digital Forensic Investigation
Framework. In: Proc. 2004 DFRWS 2004, pp.
1-12 (2004).
19. Slim Rekhis: Theoretical Aspects of Digital
Investigation of Security Incidents. PhD thesis,
Communication Network and Security (CN&S)
research Laboratory (2008).
20. Slim Rekhis and Noureddine Boudriga: Logic
Based approach for digital forensic
investigation in communication Networks.
Computers & Security pp 1-21, Elsevier (2011).
21. Slim Rekhis and Noureddine Boudriga: A
Formal Logic- Based Language and an
Automated Verification Tool for Computer
Forensic Investigation in communication
Networks. 2005 ACM symposium on Applied
Computing pp. 287-289 (2005)
22. Jaun C. Pelaez and Eduardo B Fernandez.
Network Forensic Models for Converged
Architectures. International Journal on
Advances in security, Vol 3 no 1 & 2 (2010).
23. Mohammed Ibrahim, Mohd Taufik Abdullah,
Ali Dehghantanha: VoIP Evidence Model: A
New Forensic Method for Investigating VoIP
Malicious Attacks. Cyber Security, Cyber
Warfare and Digital Forensic (CyberSec), IEEE
International Conference, Malaysia (2012).
24. F. Daryabar, A. Dehghantanha, HG. Broujerdi,
“Investigation of Malware Defence and
Detection Techniques,” International Journal of
Digital Information and Wireless
Communications (IJDIWC), volume 1, issue 3,
pp. 645-650, 2012.
25. F. Daryabar, A. Dehghantanha, NI. Udzir,
“Investigation of bypassing malware defences
and malware detections,” Conference on
Information Assurance and Security (IAS), pp.
173-178, 2011.
26. M. Damshenas, A. Dehghantanha, R.
Mahmoud, S. Bin Shamsuddin, “Forensics
investigation challenges in cloud computing
environments,” Cyber Warfare and Digital
Forensics (CyberSec), pp. 190-194, 2012.
27. F. Daryabar, A. Dehghantanha, F. Norouzi, F.
Mahmoodi, “Analysis of virtual honeynet and
VLAN-based virtual networks,” Science &
Engineering Research (SHUSER), pp. 73-70,
2011.
28. S. H. Mohtasebi, A. Dehghantanha, “Defusing
the Hazards of Social Network Services,”
International Journal of Digital Information,
pp. 504-515, 2012.
29. A. Dehghantanha, R. Mahmod, N. I Udzir,
Z.A. Zulkarnain, “User-centered Privacy and
Trust Model in Cloud Computing Systems,”
Computer And Network Technology, pp. 326-
332, 2009.
30. A. Dehghantanha, “Xml-Based Privacy Model
in Pervasive Computing,” Master thesis-
University Putra Malaysia 2008.
31. C. Sagaran, A. Dehghantanha, R Ramli, “A
User-Centered Context-sensitive Privacy
Model in Pervasive Systems,” Communication
Software and Networks, pp. 78-82, 2010.
32. A. Dehghantanha, N. Udzir, R. Mahmod,
“Evaluating user-centered privacy model
(UPM) in pervasive computing systems,”
Computational Intelligence in Security for
Information Systems, pp. 272-284, 2011.
33. A. Dehghantanha, R. Mahmod, “UPM: User-
Centered Privacy Model in Pervasive
Computing Systems,” Future Computer and
Communication, pp. 65-70, 2009.
34. A. Aminnezhad, A. Dehghantanha, M. T.
Abdullah, “A Survey on Privacy Issues in Digital
Forensics,” International Journal of Cyber-
Security and Digital Forensics (IJCSDF), Vol.
1, Issue 4, pp. 311-323, 2013.
35. S. Parvez, A. Dehghantanha, HG. Broujerdi,
“Framework of digital forensics for the
Samsung Star Series phone,” Electronics
Computer Technology (ICECT), Volume 2, pp.
264-267, 2011.
36. S. H. Mohtasebi, A. Dehghantanha, H. G.
Broujerdi, “Smartphone Forensics: A Case
Study with Nokia E5-00 Mobile Phone,”
International Journal of Digital Information
and Wireless Communications
(IJDIWC),volume 1, issue 3, pp. 651-655,
2012.
37. FN. Dezfouli, A. Dehghantanha, R. Mahmoud,
“Volatile memory acquisition using backup for
forensic investigation,” Cyber Warfare and
Digital Forensic, pp. 186-189, 2012.
38. Y. TzeTzuen, A. Dehghantanha, A. Seddon,
“Greening Digital Forensics: Opportunities and
Challenges,” Signal Processing and Information
Technology, pp. 114-119, 2012.
39. Mohammed Nassar, Radu State, Olivier
Festor: VoIP Malware: Attack Tool & Attack
Scenarios. In: 2009 IEEE International
Conference on Communications (2009).
40. Mouna Jouini, Anis Ben Aissa, Latifa Ben
Arfa Rabai, Ali Milli: Towards Quantitative
Measures of Information Security: A Cloud
Computing Case Study. International Journal of
Cyber-Security and Digital Forensics (IJCSDF)
1(3): 248-262. The Society of Digital Information
and Wireless Communications (ISSN: 2305-0012)
(2012).
41. I-Long Lin, Yun-Sheng Yen: VoIP Digital
Evidence Standard Operating Procedure.
International Journal of Research and Reviews
in Computer Science 2, pp. 173 (2011).
42. Jill Slay and Mathew Simon: Voice over IP
forensics. In: e-Forensics 08 Proceedings of the
1st international conference on Forensic
applications and techniques in
telecommunications, information, and
multimedia workshop. Adelaide, Australia
(2008).
43. Siti Rahayu Selamat, Robiah Yusof, Shaharin
Sahib, Nor Hafeizah Hassan, Mohd Faizal
Abdollah, Zaheera Zainal Abidin. Traceability
in Digital Forensic Investigation Process. In:
2011 IEEE Conference on Open Systems, pp.
101-106 (2011).
44. Kara Nance, Brian Hay, Matt Bishop: Digital
Forensics: Defining a Research Agenda. In:
Proc. 42nd Hawaii International
Conference on System Sciences (2009).
45. Karen Kent, Suzanne Chevalier, Tim Grance,
Hung Dang: Integrating Forensic Techniques
into Incident Response. A white paper submitted
by Guidance Software Inc., UK (2006).
46. Tamjidyamcholo A., Dawoud R. A.: Genetic
Algorithm for Risk Reduction of Information
Security. International Journal of Cyber-
Security and Digital Forensics (IJCSDF) 1(1):
59-66 (ISSN: 2305-0012). The Society of Digital
Information and Wireless Communications
(2012).
47. Jeroen Keppens and John Zeleznikow: “A
Model Based Approach for Generating Plausible
Crime Scenarios from Evidence.” In: Proc. of the
9th International Conference on Artificial
Intelligence and Law (2003).