CHAPTER 5
COMBINED WIRED AND WIRELESS NETWORK
INTRUSION DETECTION USING STATISTICAL DATA
STREAMS AND CLUSTERING METHOD
5.1 INTRODUCTION
The widespread use of internet applications over wireless media
alongside wired media means that an internet server now typically serves a
combined wired and wireless network. The criteria for intrusion detection in
a wireless scenario differ from those of traditional wired intrusion
detection. Routing and data transmission in a wired network rely on standard
physical routing, whereas data streams in a wireless network are carried by
radio signals that meet various obstacles in transit. With the evolution of
combined wired and wireless networks in the internet scenario, it is
essential to handle intrusion detection for both wired and wireless
intruders.
The combined model for detecting intrusion attacks in the wireless
network selects the optimal features for detecting 802.11-specific attacks.
The improved version of the combined model builds on an existing wireless
intrusion detection model specific to 802.11 features and integrates
statistical and clustering principles. The clustering method identifies a
reduced set of optimal features, which is found to be more efficient in
detecting intrusions in the low-level network framework. K-means clustering
is applied to select the optimal feature set, which improves the efficiency
of intrusion detection by classifying the attributes of the associated attack
class obtained from the record sets of the tracked wireless network MAC
frames.
5.2 CLUSTERED STATISTICAL BASED TRAFFIC ANOMALY
INTRUSION DETECTION IN COMBINED NETWORK
The traffic anomaly intrusion detection model addressed in this
work first collects the traffic stream at the target server. The traffic
streams are then filtered by packet type, which is indicated in the packet
header of the incoming data streams. The IP packets belonging to the wired
network and the MAC frames belonging to the wireless network (802.11) are
segregated. For the wired network, the proposed work analyzes the IP address,
port number, Record Route, Stream ID, Loose Source Routing, Strict Source
Routing and Internet Timestamp fields of the IP packet header. For the
wireless network, it analyzes the Port ID, Cache Address, Total number of
packets, Total number of bytes and Source-destination-pairs fields of the MAC
frame.
The separated IP packets and the volume of their data streams are
examined for anomalous traffic intrusion as described in Chapters 3 and 4.
The statistical wavelet transformation is carried out on the incoming traffic
stream to determine the volume of the data stream and to identify the first
level of anomaly intrusion traces. The IP packet headers are then verified
against five fields, which forms the second level of anomaly intrusion
detection for the traffic data streams. Incoming IP packet traffic is
monitored at regular intervals and compared with historical norms of normal
and abnormal streams to find anomaly intrusions (change detection).
Finally, the attacks detected by statistical anomaly intrusion
detection are clustered and recorded to visualize the traffic traces of
non-intrusive packet header data in different clusters. The clustered traffic
streams of detected traces contain all the relevant information the network
administrator needs to govern anomaly intrusions effectively and to list the
inappropriate objects in each cluster. Once the wired IP packet stream has
been processed, the wireless MAC frame streams are processed.
5.3 WIRELESS TRAFFIC DATA STREAMS
The combined network model for detecting intrusion attacks in the
wireless network selects the optimal features for detecting 802.11-specific
attacks. Feature selection in the combined (wired and wireless) network model
uses the information gain ratio to assess the relevance of 802.11-specific
features; the information gain ratio measures indicate the features that are
essential for detecting anomaly intrusion attacks in the network. The
enhanced k-means clustering is then applied to select the optimal feature
set, improving the efficiency of intrusion detection by classifying the
attributes into the associated attack class obtained from the record sets of
the tracked wireless MAC frames. To reduce the detection time and improve the
overall performance of wireless intrusion detection with an optimal set of
features as detection criteria, learning of the statistical characteristics
of data traffic is used.
Wireless networks are vulnerable to many routing attacks, such as
the selective forwarding attack, wormhole attack, HELLO flood attack and
sinkhole attack, because of the resource restrictions on sensor nodes, the
broadcast nature of the transmission medium, and the uncontrolled
environments in which the nodes are left unattended.
5.3.1 Selective Forwarding Attack
Selective forwarding attack is a type of Denial of Service (DoS)
attack. In this attack, during the route discovery process an adversary first
exhibits the same behavior as an honest node, and then silently drops some or
all of the data packets sent to it for further forwarding even when no
congestion occurs. There are two types of selective forwarding attack,
namely, Node id based and Time interval based attack.
5.3.1.1 Node ID based attack: Each packet carries the node id of the
originating sensor node, which is a unique identification number in the
network. The adversary node can forward or drop selected packets with a
particular node id, and the node id can be chosen using a random function of
node ids. As a result, the base station does not receive proper and complete
information from the sensor nodes. The base station is not aware of this
selective drop, because it continues to receive packets in a regular fashion.
Once the adversary node starts to drop important packets, the base station no
longer receives all the packets destined for it from the sender nodes.
Detection of the selective forwarding attack depends on the total number of
packets sent from the sensor nodes to the base station. Packets may also be
dropped because of congestion, but a drop ratio that rises above a threshold
value is one of the indications of a selective forwarding attack.
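
The drop-ratio condition can be illustrated with a minimal sketch. The function names and the default threshold of 0.2 are hypothetical; the source does not state a threshold value.

```python
def drop_ratio(sent: int, received: int) -> float:
    """Fraction of packets sent by the sensor nodes that never reached the base station."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def is_selective_forwarding_suspected(sent: int, received: int,
                                      threshold: float = 0.2) -> bool:
    # threshold is an assumed tolerance for ordinary congestion loss;
    # a drop ratio above it is only one indication of the attack.
    return drop_ratio(sent, received) > threshold
```

For example, 100 packets sent with 70 received gives a drop ratio of 0.3, which exceeds the assumed threshold, whereas 95 received does not.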
5.3.1.2 Time Interval based attack: The adversary node uses a time
interval to decide whether to drop or forward the packets it receives. Each
simulation run contains a time interval during which the adversary launches
the selective forwarding attack. This may cause the loss of significant
information. In critical applications such as military surveillance, the
selective forwarding attack is especially harmful, as movements of opposing
forces may not be reported for the given interval.
In the proposed work, the base station dynamically configures the
node id for each session. A node is activated at the time of transmission.
Whenever the base station needs information, it broadcasts the set of node
ids and starts a timer. The allotted node ids are stored temporarily in the
base station for each session. After the node ids are received, the network
topology is created. The source node sends a route_request packet to the
destination node, and the destination node replies with a route_reply packet
carrying the selected forward path. The forward path is selected according to
the dynamic source routing protocol. The destination port assigns a node to
act as a check point in the forward path. After successful reception of the
packet, the check point generates a trap message. After the forward path is
selected, the source node transmits the data to the destination node. When a
node receives the data packet successfully, it sends an acknowledgement to
the previous node. The previous node forwards the acknowledgement to its
neighbour node, so that the acknowledgement packets are accumulated. After
receiving the cumulative acknowledgement, the check point generates the trap
message and sends it to the next node in the forward path. When the
destination port receives the trap message generated by the last check point,
it is ensured that the data has been transmitted successfully from the source
node to the destination. Based on negative acknowledgements, the base station
detects the exact malicious node in the forward path. A node that holds its
id beyond the encoded time interval is also suspected of being malicious.
When a malicious node is detected, it is removed from the network and the
packet is forwarded through an alternate path.
5.3.2 De-authentication Attack
This work considers the de-authentication attack, an easy-to-mount
attack that works on any type of 802.11 network and modifies the 802.11 MAC
frame. The attacker sends a de-authentication frame with a destination
address. The stations that receive this frame are automatically detached from
the network. The operation is repeated continuously to prevent the stations
from maintaining their connections to the access point. In wireless networks,
an attacker does not need physical access to communication lines. The
attacker can be located anywhere within range of the wireless communication
equipment, and that range cannot be precisely defined or guaranteed because
radio communication is inherently influenced by many factors. As a
consequence of this imprecision, the traditional concepts of insider and
outsider attacks must be redefined for wireless networks.
Wireless networks face specific types of attacks that are not
possible in wired networks, such as the creation of unauthorized Access
Points (APs), so-called war driving (probe requests in which the values of
specific fields are not set), flooding APs with associations, MAC address
spoofing, etc. In order to defend not only the wireless network but also the
wired network, a combination of physical, technical and organizational
measures is implemented. The measures included in the proposed work are
firewalls, vulnerability scanners, virus detection software/hardware and
intrusion detection/prevention systems (IDS/IPS). Because of the changes to
the traditional concepts related to IDS, and because of the specific attacks
against wireless networks, a wireless IDS must implement new solutions
capable of detecting and responding to the new threats.
The most frequently exploited vulnerabilities of wireless networks
are first enumerated, and then some of the wireless IDS solutions intended to
defend such networks are identified and presented in this work. Special focus
is given to anomaly detection, since this type of IDS, although more
difficult to implement, offers a complete solution to the problems of
wireless network protection. Unlike in ordinary wired networks, in the
wireless network environment defined by the 802.11 standard it is possible to
detect some intrusions even at the physical level. For example, unauthorized
access points can be detected by carefully deploying radio sensors throughout
the protected area and by using goniometric algorithms to locate unknown
sources of radio transmission. This physical line of defense of the wireless
network is combined with the lines of defense at the network level in order
to detect other kinds of attacks.
5.4 802.11 MAC FRAMES
The 802.11 MAC frame consists of a MAC header, a frame body and a
frame check sequence (FCS), as shown in Figure 5.1.

Field               Size (bytes)
Frame Control       2
Duration/ID         2
Address 1           6
Address 2           6
Address 3           6
Sequence Control    2
Address 4           6
Frame Body          0-2312
FCS                 4

(The fields from Frame Control through Address 4 form the MAC header.)

Figure 5.1 802.11 MAC Frame Structure
Figure 5.1 shows the number of bytes in each field. The Frame
Control field, detailed in Figure 5.2, carries control information that
defines the type of 802.11 MAC frame and provides the information required
to interpret the fields that follow when processing the MAC frame.

Subfield            Size (bits)
Protocol Version    2
Type                2
Subtype             4
To DS               1
From DS             1
More Fragments      1
Retry               1
Power Mgt.          1
More Data           1
WEP                 1
Order               1

Figure 5.2 Frame Control Field
The subfields of the Frame Control field are explained below.
Protocol Version gives the version of the 802.11 protocol in use. Receiving
STAs use this value to find out whether the protocol version of the received
frame is supported. Type and Subtype determine the function of the frame. The
three frame types are data, management and control, and each frame type has
multiple subtypes. Each subtype identifies the specific function that a frame
of its type performs. The “To DS” and “From DS” subfields specify whether the
frame is entering or leaving the DS (Distribution System). They are used only
in data type frames of STAs associated with an AP.
The “More Fragments” subfield indicates whether more fragments of
the frame, of either data or management type, are to follow. “Retry”
indicates whether the frame, of either data or management type, is being
retransmitted. “Power Management” indicates whether the sending STA is in
power-save (PS) mode or active mode. “WEP” indicates whether encryption and
authentication are used in the frame. It can be set for all data frames and
for management frames whose subtype is authentication. “Order” indicates that
all received data frames must be processed in order. The Duration/ID field
indicates the remaining duration needed to receive the next frame
transmission. It is used in all control type frames except those of subtype
Power-Save Poll (PS-Poll). When the subtype is PS-Poll, the field carries the
association identity (AID) of the transmitting STA.
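
The layout in Figure 5.2 can be decoded with a few bit operations. The following is a minimal sketch, assuming the two Frame Control bytes are read least significant byte first and that Protocol Version occupies the lowest bits; the function and key names are illustrative only.

```python
import struct

# Frame type values from the 2-bit Type subfield.
FRAME_TYPES = {0: "management", 1: "control", 2: "data"}


def parse_frame_control(raw: bytes) -> dict:
    """Decode the 16-bit Frame Control field at the start of a MAC frame."""
    if len(raw) < 2:
        raise ValueError("need at least 2 bytes for Frame Control")
    (fc,) = struct.unpack("<H", raw[:2])  # least significant byte first
    flag_names = ["to_ds", "from_ds", "more_fragments", "retry",
                  "power_mgt", "more_data", "wep", "order"]
    fields = {
        "protocol_version": fc & 0x3,
        "type": FRAME_TYPES.get((fc >> 2) & 0x3, "reserved"),
        "subtype": (fc >> 4) & 0xF,
    }
    # The eight 1-bit flags occupy bits 8 to 15, in the order of Figure 5.2.
    for offset, name in enumerate(flag_names):
        fields[name] = bool(fc & (1 << (8 + offset)))
    return fields
```

With this decoding, the byte pair `C0 00` yields a management frame of subtype 12 (de-authentication), and `08 01` yields a data frame with “To DS” set.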
Depending on the frame type, the four address fields contain a
combination of the following address types. The BSS Identifier (BSSID)
uniquely identifies each BSS. When the frame is from an STA in an
infrastructure BSS, the BSSID is the MAC address of the AP. When the frame is
from an STA in an IBSS, the BSSID is randomly generated. The Destination
Address (DA) is the MAC address of the final destination that is to receive
the frame. The Source Address (SA) is the MAC address of the original source
that created and first transmitted the frame. The Receiver Address (RA) is
the MAC address of the next immediate STA on the wireless medium that is to
receive the frame. The Transmitter Address (TA) is the MAC address of the STA
that transmitted the frame onto the wireless medium.
The Sequence Control field contains two subfields, the Sequence
Number field and the Fragment Number field, as shown in Figure 5.3.

Subfield            Size (bits)
Sequence Number     12
Fragment Number     4

Figure 5.3 Sequence Control Field
Sequence Number gives the sequence number of each frame. The
sequence number remains the same for every fragment of a fragmented frame;
otherwise, it is incremented by one for each frame until it reaches 4095,
after which it starts at zero again. Fragment Number gives the number of each
fragment of a fragmented frame. Its initial value is zero, and it is
incremented by one for every successive fragment of the frame.
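
The two subfields can be extracted as follows. This is a minimal sketch, assuming the 16-bit Sequence Control field is read least significant byte first with the Fragment Number in the low 4 bits and the Sequence Number in the high 12 bits; the function names are illustrative.

```python
import struct


def parse_sequence_control(raw: bytes) -> tuple:
    """Return (sequence_number, fragment_number) from the 2-byte field."""
    if len(raw) < 2:
        raise ValueError("need at least 2 bytes for Sequence Control")
    (sc,) = struct.unpack("<H", raw[:2])
    fragment_number = sc & 0xF   # low 4 bits
    sequence_number = sc >> 4    # high 12 bits, range 0..4095
    return sequence_number, fragment_number


def next_sequence_number(current: int) -> int:
    """Increment by one and wrap to zero after 4095."""
    return (current + 1) % 4096
```

Gaps or repeats in the decoded sequence numbers of a single transmitter are one of the attributes a detector could derive from this field.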
The frame body contains the data or information carried in either
management type or data type frames. The transmitting STA applies a Cyclic
Redundancy Check (CRC) over all the fields of the MAC header and the frame
body field to produce the FCS value. The receiving STA then uses the same CRC
computation to obtain its own value of the FCS field and so verify whether
any errors occurred in the frame during transmission. Feature extraction is
responsible for extracting, from the 802.11 MAC frame fields, the attributes
and characteristics that are most effective for intrusion detection. The
attributes selected depend on the type of the frame and on the detection
algorithm. This set of characteristics is then sent to the central module and
to the local misuse detection for detecting anomalies.
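
The FCS check described above can be sketched with the CRC-32 routine in Python's standard `zlib` module, which implements the 32-bit CRC used by IEEE 802 frame check sequences. Storing the value least significant byte first is an assumption about how the captured frame is laid out, and the function names are illustrative.

```python
import struct
import zlib


def compute_fcs(header_and_body: bytes) -> bytes:
    """CRC-32 over the MAC header and frame body, packed as 4 bytes."""
    return struct.pack("<I", zlib.crc32(header_and_body) & 0xFFFFFFFF)


def fcs_is_valid(frame: bytes) -> bool:
    """Recompute the CRC as the receiving STA would and compare with the trailing FCS."""
    if len(frame) < 4:
        return False
    payload, received_fcs = frame[:-4], frame[-4:]
    return compute_fcs(payload) == received_fcs
```

A frame whose recomputed value differs from the trailing four bytes was corrupted in transit and would normally be discarded before feature extraction.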
Management frames are used to maintain communication. 802.11
authentication starts with the Wireless Network Interface Controller (WNIC)
sending an authentication frame containing its identity to the access point.
With open system authentication, the WNIC sends only a single authentication
frame, and the access point responds with an authentication frame of its own
that signifies acceptance or rejection. With shared key authentication, after
the WNIC sends its initial authentication request, it receives an
authentication frame from the AP that includes challenge text. The WNIC then
sends the access point an authentication frame containing the encrypted
version of the challenge text. The AP verifies that the text was encrypted
with the correct key by decrypting it with its own key. The outcome of this
process determines the WNIC's authentication status.
An association request frame sent from a station enables the
access point to allocate resources for the station and synchronize with it.
The frame carries information about the WNIC, including the supported data
rates and the SSID of the network with which the station wishes to associate.
If the request is accepted, the access point reserves memory and creates an
association ID for the WNIC. An association response frame, sent from an AP
to a station, carries the acceptance or rejection of an association request.
If the request is accepted, the frame contains information such as the
association ID and the supported data rates. A beacon frame, sent
periodically from an access point, announces its presence and provides the
SSID and other parameters to WNICs within range.
A de-authentication frame is sent from a station that wishes to
terminate its connection with another station. A disassociation frame is sent
from a station that wishes to terminate the connection. It is an elegant way
to let the AP relinquish the memory allocation and remove the WNIC from the
association table. A probe request frame is sent from a station when it
requires information from another station. A probe response frame, sent from
an access point after it receives a probe request frame, contains the
supported data rates, capability information, etc. A WNIC sends a
re-association request when it moves out of range of the currently associated
access point and finds another access point with a stronger signal. The new
access point coordinates the forwarding of any information that may still be
held in the buffer of the previous access point. A re-association response
frame, sent from an access point, contains the acceptance or rejection of a
WNIC's re-association request frame. The frame includes information needed
for association, such as the supported data rates and the association ID.
Control frames facilitate the exchange of data frames between
stations. The 802.11 control frames include the ACK (Acknowledgement) frame,
the RTS (Request To Send) frame and the CTS (Clear To Send) frame. When the
destination station receives a data frame and finds no errors, it sends an
ACK frame to the source station. If the source station does not receive an
ACK frame within a predetermined period of time, it resends the frame. The
RTS and CTS frames offer an optional collision reduction scheme for APs with
hidden stations. A station sends an RTS frame as the first step of the
two-way handshake that is required before sending data frames. A station
responds to the RTS frame with a CTS frame, which gives the requesting
station clearance to send a data frame. The CTS frame provides collision
control by carrying a time value during which all other stations must delay
transmission while the requesting station transmits. Data frames carry
packets from files, web pages, etc., within the body.
5.5 STATISTICAL ANOMALY TRAFFIC CHARACTERISTIC
IN MAC FRAMES
The MAC frame fields in the 802.11 packet header are analyzed to
detect anomalies in the traffic. Discrete values collected from the
individual fields of the traffic header data show discontinuities. The MAC
address space spans many addresses, and the addresses contained in a sample
are likely to exhibit many discontinuities over this space, which makes the
data very difficult to analyze over the address space. To overcome the
discontinuities over a discrete space, the packet header data is converted
into a continuous signal through correlation over successive samples. For
computational efficiency, a simplified correlation of time series is used to
investigate the sequence of the random process.
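
The conversion of per-address packet counts into a continuous signal can be pictured with the following minimal sketch. The source does not give the exact correlation formula, so the product of normalized packet counts used here is an assumption, as are the function names.

```python
def address_correlation(prev_counts: dict, curr_counts: dict) -> float:
    """Correlate the address distributions of two adjacent sampling instances.

    Each argument maps a MAC address to the number of packets seen in that
    sampling instance. Only addresses present in both instances contribute.
    """
    total_prev = sum(prev_counts.values())
    total_curr = sum(curr_counts.values())
    if total_prev == 0 or total_curr == 0:
        return 0.0
    shared = prev_counts.keys() & curr_counts.keys()
    return sum((prev_counts[a] / total_prev) * (curr_counts[a] / total_curr)
               for a in shared)


def correlation_signal(samples: list) -> list:
    """One correlation value per pair of adjacent sampling instances."""
    return [address_correlation(samples[n - 1], samples[n])
            for n in range(1, len(samples))]
```

A sudden drop or spike in this signal between instances n-1 and n indicates that the set of active addresses changed abruptly, which is the kind of event the later wavelet analysis is meant to expose.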
For each MAC address, the traffic count records the number of
packets sent in the sampling instance. To compute the address correlation
signal, two adjacent sampling instances, ‘n-1’ and ‘n’, are taken, and the
detection model defines the address correlation signal over these sampling
instances. An address that spans the two sampling instances yields a positive
contribution. To minimize storage and processing complexity, a linked data
structure is employed. A location count records, through scaling, the packet
count for the address ‘j’ in the ith field of the IP address. This gives a
compact description of the address, sufficient to store each address
occurrence uniquely. The statistical anomaly detection model filters this
signal by computing a correlation of the address over two successive samples.
This approximate representation of addresses reduces the computational and
storage demands by an appreciable factor. To generate the address correlation
signal at the end of a sampling point, each segment correlation is multiplied
by a scaling factor. From a statistical point of view, the results have
approximately the same mean and standard deviation as the cross-correlation
coefficient (Equation 5.1).
\[ D(a, b) = \frac{(a_j - b_j)^2}{f_j} \qquad (5.1) \]
In the above equation, f_j denotes the weighting factor for the jth
feature. Attributes of a data traffic event in the combined network that are
independent of a particular attack instance are used for clustering.
Attributes that depend on the attack instance are used in the clustered alert
aggregation process to distinguish different attack instances. A dependent
metric such as the source MAC address identifies the attacker. Destination
port 80, in the case of web-based attacks, is an example of an independent
metric: it hints at the attacker's actual target service, since such attacks
are designed to target a particular service only. For each attack instance, a
random process generates alerts that are distributed according to a certain
multivariate probability distribution. Reconstructing an attack situation
from observed samples amounts to estimating all the parameters of the mixture
distribution, and the Maximum Likelihood (ML) estimation approach is adopted
here. Combined network anomaly intrusion detection is situation-aware and
tries to keep the model updated to the current attack situation. There is a
trade-off between run-time (reaction time) and accuracy: a decision on the
existence of a new attack instance may have to be made when only one
observation is available.
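
Taking Equation 5.1 to be the squared difference between the jth feature values of a and b divided by the weighting factor f_j, it can be evaluated as sketched below. Summing the per-feature terms into a single distance between two alerts is an assumption made for illustration, and the function names are hypothetical.

```python
def feature_distance(a_j: float, b_j: float, f_j: float) -> float:
    """Weighted squared difference for a single feature j."""
    if f_j <= 0:
        raise ValueError("weighting factor must be positive")
    return (a_j - b_j) ** 2 / f_j


def total_distance(a: list, b: list, f: list) -> float:
    # Assumption: the overall distance is the sum of the per-feature terms.
    if not (len(a) == len(b) == len(f)):
        raise ValueError("a, b and f must have the same length")
    return sum(feature_distance(x, y, w) for x, y, w in zip(a, b, f))
```

A larger weighting factor reduces the influence of a feature, so features with a wide natural spread do not dominate the comparison of two alerts.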
The overall random process is non-stationary, which is reflected in
the mixing coefficients at given points in time. A mixing coefficient is
either zero or the reciprocal of the number of active components (for the
time interval of the respective attack instance). With suitable innovation
and obsoleteness detection mechanisms, the model seeks to track the
time-varying data traffic with both sufficient certainty and timeliness. When
a new component is formed, an appropriate aggregate that represents the
information about the component in an abstract way is created. Every time a
new alert is added to a component, the corresponding aggregate is updated
too.
5.6 BLACKLISTED INTRUSION RULES
Detection of anomalies caused by the de-authentication attack is
based on observing the behavior of the system to generate profiles and data
structures that describe the normal state of the system, using features
extracted from the 802.11 MAC frame. Any subsequent deviation from normal
behavior generates an alert. To build an efficient anomaly detector, the IDS
first identifies the behavior of the network in order to establish the
standard mode. Normal user behavior is assessed with various training samples
that differ from malicious activity. The anomaly intrusion detector may at
times generate a false alarm when uneven activity occurs in the traffic
observed over a period of time. Anomaly detection is effective for detecting
abnormal behavior, based on traditional and recent blacklist rules.
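
The idea of profiling the normal state and alerting on deviation can be sketched as follows. The choice of feature (number of de-authentication frames per observation interval), the mean-plus-k-standard-deviations rule and the value k = 3 are all assumptions for illustration; the source does not specify them.

```python
from statistics import mean, pstdev


def build_profile(training_counts: list) -> tuple:
    """Summarize attack-free training intervals as (mean, standard deviation)."""
    return mean(training_counts), pstdev(training_counts)


def is_anomalous(count: int, profile: tuple, k: float = 3.0) -> bool:
    """Flag an interval whose count deviates too far above the normal profile."""
    mu, sigma = profile
    return count > mu + k * sigma
```

A burst of de-authentication frames far above the learned profile raises an alert, while small fluctuations around the mean do not, although uneven but legitimate activity can still cause a false alarm.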
A blacklist is a list of entities that are denied a particular
privilege, recognition, mobility, access or service, as per the rules and
governance of the World Wide Web Consortium. The blacklist denies attribute
values of the data traffic when they are compared with the characteristics of
the standard list maintained universally. The particular attribute values
listed as abnormal in the standard are accumulated over a period of time.
A blacklist is an additional rule that must be processed for every
connection. A blacklist may include IP addresses of legitimate clients or
servers, and it cannot include the IP addresses of all possible intruders,
since intruders often use fake addresses. It is therefore advisable to set
the same actions for the blacklist as for detected intrusions. With the "Log
and drop" action, information about the detected traffic and the blocked IP
address is recorded in the Security log, and any network traffic from that IP
address is blocked. With the "Log" action, information about the detected
traffic and the IP address is only recorded in the Security log. With no
action, the detected blacklisted IP address is not treated as an intruder.
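
The three blacklist actions can be illustrated with a minimal sketch. The `Action` enumeration, the function name and the in-memory list standing in for the Security log are hypothetical, and the addresses come from a documentation-only range.

```python
from enum import Enum


class Action(Enum):
    LOG_AND_DROP = "log and drop"
    LOG = "log"
    NONE = "no action"


def handle_connection(src_ip: str, blacklist: set, action: Action,
                      security_log: list) -> bool:
    """Return True if the traffic is allowed through, False if it is blocked."""
    if src_ip not in blacklist or action is Action.NONE:
        return True  # not blacklisted, or the blacklist is not enforced
    security_log.append(f"blacklisted source {src_ip}")
    return action is not Action.LOG_AND_DROP
```

Only "log and drop" both records the event and blocks the traffic; "log" records it and lets the traffic pass.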
Firewall administrators can set up blacklists manually, straight
from the firewall logs, if they see something alarming there. Blacklisting
can stop worm propagation between network segments, and early quarantine
reduces both the time and the resources required for cleaning worm-infected
systems. Combining whitelisting and blacklisting allows a safe automatic
response to attacks while preserving production-critical traffic.
Dynamic (and also manual) blacklisting rules completely block
access for a remote host after confirmed security violations, either
permanently or for a defined period. The blacklisting range can differ from
incident to incident, from single IP addresses to whole network segments.
With remediation features such as the dynamic application of new rules or
blacklist conditions, the Security Information and Event Management (SIEM)
system acts as an additional detection layer, looking for wider threats over
the complete infrastructure and then applying point defenses against those
threats through an update to the IPS itself. This can be done automatically,
though many consider it better practice to remediate manually in order to
avoid the unintended blocking or disruption of potentially legitimate
traffic.
Blacklisting uses the NitroGuard IPS's firewall to block traffic,
and this operation differs from the normal IPS Block condition in three ways:
its ability to block subsequent traffic explicitly by source IP address
and/or source port; its ability to enforce the block on subsequent traffic
for a period of time; and its ability to block traffic conditionally. The
final point is essential, especially when used with rate-based or
threshold-based signatures. For instance, if a signature that detects
occurrences of 5 sequential DNS responses from a common source IP is set to
block, then all DNS responses will be blocked and an alert will be generated
after every fifth response. For this reason, the majority of threshold
signatures are set to alert only. In this example, setting the rule to
'block' would stop the operation of all DNS services.
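
The threshold-based signature in the DNS example can be sketched as a per-source counter that fires on every fifth response. This is a simplified illustration and not the NitroGuard implementation; the class name is hypothetical, and the sketch does not check that the responses are strictly consecutive.

```python
from collections import defaultdict


class ThresholdSignature:
    """Raise an alert each time a source reaches `threshold` matching events."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self.counts = defaultdict(int)

    def observe(self, src_ip: str) -> bool:
        """Record one DNS response from src_ip; return True when an alert fires."""
        self.counts[src_ip] += 1
        if self.counts[src_ip] >= self.threshold:
            self.counts[src_ip] = 0  # start counting towards the next alert
            return True
        return False
```

Because the condition is evaluated only after the threshold is reached, an "alert only" setting reports the pattern without interrupting the DNS service.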
Blacklisting = conventional intrusion detection / prevention
A blacklist rule blocks all messages from a specific sender. To
blacklist a sender directly from a Spam Report, see the Spam Filtering &
Reports section. To block a sender whose message is in the web mail account
on the mail server, add the sender to the account's global Blacklist as
follows:
a) Log in to the web mail account with the full email address and
password.
b) Navigate to the message from the sender to be blocked, and then
either:
   drag and drop this message onto the Blacklist folder (within the
   Filters folder) in the Folder pane; or
   right-click this message within the Message pane, select Blacklist
   Sender and click OK.
The blacklist rules adopted in this work follow the rule sets
listed below. Google's, Norton's and Sucuri's internal blacklists of sites
known to host malware are checked, and a warning is displayed before data
streams are allowed to pass through the server. Content-control blacklists
block URLs of sites deemed inappropriate for a work or educational
environment. E-mail spam filter blacklists of addresses prevent spam from
reaching its intended destination. DNS blacklists (DNSBL) and blacklists of
hostile IPs are other entities verified in anomaly detection. The latest
blacklist websites verified in the course of experimentation are Cleaning up
an infected WordPress site (posted on March 16, 2011 by Sucuri-research),
Cleaning up an infected osCommerce web site (posted on March 9, 2011 by
Sucuri-research), Cleaning up an infected Joomla web site (posted on March 9,
2011 by Sucuri-research), Cleaning up blacklisted sites (posted on March 8,
2011 by Sucuri-research) and WPSecurityLock (posted on January 19, 2011 by
Dremeda).
5.7 IMPLEMENTATION OF COMBINED NETWORK
ANOMALY TRAFFIC INTRUSION DETECTION
The anomaly detection mode is based on two functions: statistical
data transformation and the network traffic splitter. The traffic splitter
generates the network traffic signal from data flow records or packet header
traces. The statistical data transformation analysis applies wavelet
transforms to the IP address and port number correlation over several
timescales. Attacks and anomalies are then detected with the help of
thresholds: the analyzed information is compared with historical thresholds
to verify whether the traffic's characteristics are outside the regular
norms. This comparison leads to a detection signal that can be used to alert
the network administrator to potential anomalies in the network traffic.
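
The transform-then-threshold step can be illustrated with a single-level Haar wavelet transform. The source does not name the wavelet family or the threshold values, so both are assumptions here, and the function names are illustrative.

```python
import math


def haar_step(signal: list) -> tuple:
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in pairs]
    return approx, detail


def anomalous_indices(detail: list, threshold: float) -> list:
    """Positions whose detail coefficient exceeds the historical threshold."""
    return [i for i, d in enumerate(detail) if abs(d) > threshold]
```

The detail coefficients capture abrupt changes between neighbouring samples of the traffic signal, so a coefficient whose magnitude exceeds the threshold marks a candidate anomaly.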
The selective forwarding attack is a type of intrusion attack in
wireless mesh networks. In this attack, a misbehaving intermediate router
forwards a portion of the packets it receives and discards the others. Most
wireless channels also drop packets because of medium access collisions,
poor channel quality, etc. In this work, anomaly based channel aware
detection is provided to distinguish selective forwarding misbehavior from
the normal channel loss of hybrid network data traffic.
The anomaly based channel aware detection scheme uses a multi-hop
acknowledgement method to raise alarms by obtaining responses from
intermediate nodes. This scheme is competent and consistent in the sense that
an intermediate node will report any abnormal packet loss and the suspect
nodes to both the base station and the source node. Every intermediate node
along the forwarding path is responsible for detecting malicious nodes. When
an intermediate node detects the misbehavior of its downstream (upstream)
nodes, it produces an alarm packet and sends it to the source node (the base
station) through multiple hops. Downstream indicates the direction towards
the base station, and upstream indicates the direction towards the source
node. The flow chart for the Combined Wired and Wireless Network Anomaly
Traffic Intrusion Detection System is shown in Figure 5.4.
[Figure 5.4 flowchart blocks: Wired Network; Wireless Network; Extraction of
data (Wired & Wireless), based on specific time intervals; Signal Generation;
Alert Creation; Clustering of Alerts; Derivation of Black listed rules (as
available in www); Stop Process]
Figure 5.4 Flowchart showing Combined Network Anomaly Traffic
Intrusion Detection System
The Combined Wired and Wireless Network Anomaly Intrusion
Detection System extracts the IP packet header and MAC frame from the data
stream and applies the statistical wavelet analysis and clustering method to
find the anomalies through the following steps.
Step 1 (Initialization): Collect input data stream based on specific time
intervals from combined wired and wireless network.
Step 2 (Characteristic Extraction): Extract wired and wireless data
characteristics from the collected time specific data streams. The
traffic splitter splits the IP address header and MAC address header
separately.
Step 3 (Signal Generation): Packet header data is converted into a
continuous signal through correlation over successive samples.
Compute the data transform over several sampling points.
Step 4 (Statistical Analysis): Statistical wavelet analysis technique for
traffic anomaly intrusion detection is applied separately for wired
and wireless network traffic data.
Step 5 (Alert Creation): The detector assesses the attack events and
searches for known attack signatures and suspicious behavior.
Create alerts when suspicion is found and forward them to the alert
generation phase.
Step 6 (Clustering of alerts): Collected alerts are combined according to
the specific attack instance or type to form meta-alerts.
Step 7 (Derivation of Blacklisted Rules): Derive server admin specific
blacklisted rules from standard blacklists available on the web.
The formed clusters are compared with the blacklisted rules to
verify the intrusive data.
Step 8 (Iteration): Iterate step 3 to step 7 for all the characteristics of the
data stream to improve the anomaly intrusion detection rate.
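Steps 6 and 7 can be sketched as below. This is a simplified illustration under stated assumptions: alerts are plain dictionaries, alerts are grouped by attack type and source address to form meta-alerts, and the blacklist is a set of source addresses. The field names, the grouping key and the example addresses (taken from the documentation address ranges) are hypothetical.

```python
from collections import defaultdict

def cluster_alerts(alerts):
    """Step 6: combine alerts of the same attack type and source into meta-alerts."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["attack_type"], alert["source"])].append(alert)
    meta_alerts = []
    for (attack_type, source), members in groups.items():
        meta_alerts.append({
            "attack_type": attack_type,
            "source": source,
            "count": len(members),
            "first_seen": min(a["time"] for a in members),
            "last_seen": max(a["time"] for a in members),
        })
    return meta_alerts

def match_blacklist(meta_alerts, blacklisted_sources):
    """Step 7: keep the meta-alerts whose source appears in the blacklist."""
    return [m for m in meta_alerts if m["source"] in blacklisted_sources]

# Hypothetical alerts produced by Step 5.
alerts = [
    {"attack_type": "flood", "source": "192.0.2.10", "time": 1.0},
    {"attack_type": "flood", "source": "192.0.2.10", "time": 2.5},
    {"attack_type": "scan", "source": "198.51.100.7", "time": 3.0},
]
meta = cluster_alerts(alerts)
confirmed = match_blacklist(meta, {"192.0.2.10"})
print(len(meta), "meta-alerts,", len(confirmed), "matching the blacklist")
```

Grouping on a different key, for example the specific attack instance, would only change the tuple used in `cluster_alerts`.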
Anomaly detection approaches develop models of normal data and
then try to detect deviations from the normal model in observed data. As a
result, these algorithms can detect new types of intrusions since these new
intrusions may deviate from normal network usage. However, these
algorithms need a set of purely normal data on which to train their model.
If the training data contain traces of intrusions, the algorithm may fail to
detect future instances of those attacks, because it will treat them as normal.
Anomaly detection is an important element of intrusion detection, in
which deviations from normal behavior denote the presence of intentionally
or unintentionally induced attacks or faults.
To model network traffic, every connection record is scrutinized and
basic traffic features are extracted. After preprocessing, the aim of the
intrusion detection algorithm is to train the system with normal data and
to model normal network traffic from the given set of normal data. Given new
test data, the task is then to determine whether it represents normal or
abnormal behavior. The proposed Anomaly Detection Phase is composed of three
sub-modules: the Preprocessing Module, the Anomaly Analyzer Modules and the
Communication Module.
Each Anomaly Analyzer Module (TCP Anomaly Analyzer, UDP
Anomaly Analyzer) uses the Self Organizing Map (SOM) algorithm to build
profiles of normal traffic. The profile developed in the Anomaly Analyzer
Module is later used to find out whether a network connection is normal
or abnormal. The Communication Module handles communication with
the Decision Support System (DSS). Figure 5.5 displays the block diagram of
the Anomaly Detection Module.
Figure 5.5 Anomaly Detection Module
In the dataset competition, raw packet based network traffic data is
collected from the network by a network sniffer and is processed into a stream
of connections to form the intrusion detection dataset. In the intrusion
detection dataset, 41 features are derived to summarize connection
information. From each connection, six basic features are used in this work.
The Anomaly Analyzer Modules (TCP Anomaly Analyzer, UDP
Anomaly Analyzer) operate on different protocols, though their processing
procedures are the same. Every Anomaly Analyzer Module uses the SOM
algorithm to develop profiles of normal behavior. The corresponding SOM
structure is trained with the corresponding normal traffic data, and the
profile of normal behavior is modeled under the hypothesis that normal traffic
represents normal behavior and is clustered around one or more cluster centers
on the SOM lattice. Anomalous traffic, which represents abnormal and
possibly suspicious behavior, will be clustered outside the normal
clusters, or inside them but with a high quantization error. The profile is
subsequently used to find out whether a network connection is normal or
abnormal.
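A minimal sketch of SOM-based profiling is given below, assuming a small rectangular lattice, a Gaussian neighborhood and a quantization-error threshold fitted on the normal training data. The lattice size, learning schedule, `slack` factor and two-dimensional feature vectors are all illustrative assumptions (this work uses six basic features per connection), and the function names are hypothetical.

```python
import math
import random

def best_matching_unit(weights, sample):
    """Index of the lattice node whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], sample))

def train_som(samples, rows=2, cols=2, epochs=50, seed=0):
    """Train a small SOM on normal samples and return its weight vectors."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise every node from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in coords]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for sample in rng.sample(samples, len(samples)):
            decay = 1.0 - step / total_steps
            rate = 0.5 * decay                              # learning rate
            radius = 0.5 + max(rows, cols) / 2.0 * decay    # neighbourhood width
            bmu = best_matching_unit(weights, sample)
            for i, node in enumerate(weights):
                lattice_dist = math.dist(coords[i], coords[bmu])
                influence = math.exp(-(lattice_dist ** 2) / (2.0 * radius ** 2))
                for k in range(len(node)):
                    node[k] += rate * influence * (sample[k] - node[k])
            step += 1
    return weights

def quantization_error(weights, sample):
    """Distance between a sample and its best matching unit."""
    return math.dist(weights[best_matching_unit(weights, sample)], sample)

def fit_threshold(weights, normal_samples, slack=1.5):
    """Largest quantization error seen on normal data, times a slack factor."""
    return slack * max(quantization_error(weights, s) for s in normal_samples)

def is_abnormal(weights, threshold, sample):
    """A connection is abnormal when its quantization error exceeds the threshold."""
    return quantization_error(weights, sample) > threshold
```

A separate map would be trained for each protocol (TCP, UDP) on its own normal traffic, and each new connection would be checked with `is_abnormal` against the corresponding map.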
[Figure 5.5 block diagram blocks: Dataset; Preprocessing; TCP/UDP Anomaly
Analyzer; Communication Module; Decision Support System]
5.8 PERFORMANCE MEASURE OF COMBINED NETWORK
INTRUSIVE ANOMALY DETECTION
The simulation of anomaly intrusion traffic detection is based on
monitored traces of combined wired and wireless network traffic
generated from real-time data traffic on ISP servers. The real traces were
collected over a period of one month on a 10 Mbps broadband link
comprising wireless and wired network servers. The samples taken from the
ISP server contain thousands of wired and thousands of wireless
connections, at traffic rates ranging from 256 Kbps to 1 Mbps. These traces
are preserved with MAC and IP prefix relationships. Anomaly traffic
detection in the combined network is based on the clustering meta-aggregate
and the statistical characteristics of the data traffic. The simulation is run
in the NS2 simulator on an IBM PC with a 2.20 GHz dual-core processor and
2 GB of RAM.
The simulation assesses the performance of the proposed scheme in terms
of detection accuracy and communication overhead. A field size of
1000 × 1000 m is used, in which 80 nodes are uniformly distributed.
A stationary sink and a stationary source sit on opposite sides of the field,
with about 2 to 3 hops between them. Simulations are carried out in which
the source generates 500 reports in total, and one report is sent out every two
seconds. Packets can be sent hop-by-hop at 10 Kbps. To avoid detection,
the malicious nodes drop only part of the packets passing by. To make the
scheme more resilient in poor radio conditions, a hop-by-hop transport layer
retransmission mechanism is implemented. The retransmission limit is 5 by
default. The channel error rate is 10% by default, which is usually regarded as
a rather harsh radio condition. Every simulation is run 10 times and the
outcome shown is the average of these runs.
The proposed metrics assess the detection accuracy and
communication overhead of both the existing and proposed schemes. Alarm
reliability is the ratio of the number of detected maliciously-dropped
packets to the total number of detected lost packets, including those lost
due to poor radio conditions. Undetected rate is the ratio of the
number of undetected maliciously-dropped packets to the total number of
maliciously-dropped packets. Relative communication overhead is
the ratio of the total communication overhead in a system that incorporates
the proposed detection scheme to that in a system that does not.
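The three metrics defined above reduce to simple ratios, sketched below. The function and parameter names are hypothetical, and the numbers in the usage lines are invented purely to show the calculation; they are not results from this work.

```python
def alarm_reliability(detected_malicious, detected_channel_losses):
    """Detected malicious drops over all detected lost packets."""
    total_detected = detected_malicious + detected_channel_losses
    return detected_malicious / total_detected

def undetected_rate(undetected_malicious, total_malicious):
    """Undetected malicious drops over all malicious drops."""
    return undetected_malicious / total_malicious

def relative_overhead(overhead_with_scheme, overhead_without_scheme):
    """Communication overhead with the detection scheme relative to without."""
    return overhead_with_scheme / overhead_without_scheme

# Invented counts, for illustration only.
print(alarm_reliability(detected_malicious=45, detected_channel_losses=5))
print(undetected_rate(undetected_malicious=5, total_malicious=50))
print(relative_overhead(overhead_with_scheme=1200, overhead_without_scheme=1000))
```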
The simulation also evaluates the cluster aggregation approach. It
deploys different combinational network data sets to demonstrate the
feasibility of the integrated statistical and cluster based anomaly intrusion
system. After several weeks of training, test data were generated on a test
bed that emulates a small confidential data site. In the combinational network
set-up of both wired and wireless networks, the generated network traffic is
scrutinized to detect its intrusion characteristics. The simulation uses both
MAC frame and IP address dumps as input data and analyzes various attack
instances against different target hosts. Support vector machines (SVM) are
applied to the statistical information taken from the network traffic data to
classify the clustered data characteristic samples. The performance graph in
Figure 5.6 shows that the combinational network anomaly intrusion detection
proposed in this work has a better response time for the detection rate than
the classical intrusion detection scheme. The improvement is nearly 17% in
terms of response time for anomaly intrusion detection from the
combinational network data traffic.
The tabulated values of the detection rate response time against
number of data records taken from the combined network data traffic are
shown in Table 5.1. This indicates that the proposal has a better response
time in detecting anomaly intrusions in both wired and wireless networks. As
the threshold on the data traffic rate of the clustered aggregation is varied,
the true positive rate (TPR = number of true positives divided by the sum of
true positives and false negatives) and the false positive rate (FPR = number
of false positives divided by the sum of false positives and true negatives)
are identified for the trained data traffic records.
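The TPR and FPR definitions above can be evaluated at several thresholds as sketched below. The scores, labels and thresholds are invented for illustration, and the rule "classify as intrusive when the score is at or above the threshold" is an assumption of this sketch.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR); labels are True for intrusive records."""
    tp = fp = tn = fn = 0
    for score, intrusive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and intrusive:
            tp += 1
        elif predicted and not intrusive:
            fp += 1
        elif not predicted and intrusive:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Invented anomaly scores and ground-truth labels.
scores = [0.9, 0.6, 0.4, 0.2]
labels = [True, True, False, False]
for threshold in (0.3, 0.5, 0.7):
    print(threshold, rates_at_threshold(scores, labels, threshold))
```

Lowering the threshold raises both rates, which is why each threshold yields a different operating point.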
Table 5.1 Combined network intrusion detection rate against number
of data traffic records

Data      Classical Anomaly Intrusion     Combined Network Anomaly Intrusion
Records   Detection Rate Response time    Detection Rate Response time
          - Existing (ms)                 - Proposed (ms)
20        0.02                            0.025
25        0.03                            0.027
30        0.035                           0.028
35        0.04                            0.03
40        0.05                            0.04
[Figure 5.6 plot: x-axis, Data records (10 to 45); y-axis, intrusion detection
rate (0.01 to 0.06); series: Hybrid network anomaly intrusion detection
(proposed) and classical anomaly intrusion detection (existing)]
Figure 5.6 Performance of Anomaly Intrusion Detection in combined
network
Various operating points are marked to assess the normal traffic
scenario and anomaly traffic intrusion at different data rates. The simulation
is conducted to investigate aggregation under idealized conditions, in which a
perfect detector layer with no false aggregates and no misses is assumed. The
MAC frame and IP address, the attack type, the creation time differences
(based on the creation time stamps) and the source and destination ports are
used as attributes for the alerts. The hybrid network intrusion detection
shows an improved detection rate with a reduced set of features. The false
positive rate is the percentage of frames containing normal traffic that are
classified as intrusive frames; it is minimal in this scheme. The false
negative rate is the percentage of frames generated from wireless attacks that
are classified as normal traffic; it is precisely assessed in this scheme.
5.9 SUMMARY
The Anomaly Traffic Intrusion Detection Algorithm is framed to detect
threats in both wired and wireless networks. Software is implemented for the
proposed method, and its architecture and operations are described in detail
using a high level class diagram and pseudo-code. A number of experiments
have been carried out using a benchmark data set in order to show the efficacy
of the developed software. A major advantage of this technique is that, as
real-world intrusions become more complicated, the proposed detection system
can upload and update rules as new intrusions become known. Here the
blacklist rules are presented. The performance of statistical traffic anomaly
detection in the combined wired and wireless network, along with the
efficiency of clustering traffic anomaly intrusion aggregates, is evaluated
and shows better results when compared with anomaly traffic intrusion
detection in the traditional wired internet scenario.