As traffic flows through the router, the Staleness Detector monitors the characteristics of the traffic and triggers an alarm if its behavior has changed significantly. The alarm starts a process on the Signature Factory, which clusters the flows matching the alarmed signature into groups. Each new cluster is analyzed to extract a signature. The new signatures are merged with the purchased signatures, and the resulting set of signatures is then tested against a corpus of end-user traffic.
Algorithms for Identification of Network Data Streams
Background:
The volume of Internet traffic is enormous, and accurately identifying its essential traits is a challenging problem. Existing techniques typically rely on manually generated signatures specified in packet headers, which keeps traffic identification tests relatively simple. However, this approach lacks the flexibility required to deal with the constant changes in network traffic patterns.
Problems:
• How to continuously detect changes in network traffic streams
• How to identify suspicious traffic streams without pre-specified signatures
• How to generate network traffic signatures automatically (i.e., without consuming a network expert’s effort)
• How to allocate network resources only when needed
Proposed Solution:
AutoImmune System: an Intelligent IP Service Infrastructure
AutoImmune addresses a more general traffic stream identification problem, one that needs complex packet-payload-based membership tests without pre-specified signature sets. We implemented AutoImmune by integrating the three algorithms we developed, and tested the system against simulated data traffic. The system performs well in various networking environments with non-stationary traffic streams. It adapts automatically to changes in the characteristics of network traffic and identifies new types of traffic patterns in near real time (it takes less than 10 seconds in a Gbps communication network to obtain a new traffic pattern). Simulation results show that the system successfully identifies a new type of network traffic that accounts for as little as 0.2% of total network traffic. To the best of our knowledge, the lowest reachable worm detection rate reported in the literature is 1.1%, by a worm detection system referred to as DoWitcher. The smaller the share of the new type of traffic, the longer it takes to identify the new signature.
1. Change Detection
The algorithm (in Staleness Detector) keeps a dictionary of data elements that are deemed useful in predicting future data elements. New data points that are not well explained by this dictionary are signaled as alarms.
For each new data point:
• Compute the distance from this point to the points already in the dictionary
• If the point is very far, set a Red Alarm
• If it is somewhat far, set an Orange Alarm
• If it is close, raise no alarm
Periodically, evaluate the Orange Alarms and clean up the dictionary.
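The change detection steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in Staleness Detector: the Euclidean distance, the two thresholds `orange` and `red`, the policy applied when Orange Alarms are evaluated, the dictionary size cap, and the class name `StalenessSketch` are all hypothetical.

```python
import math


class StalenessSketch:
    """Toy dictionary-based change detector (names and thresholds are assumptions)."""

    def __init__(self, orange=1.0, red=3.0, max_size=100):
        self.orange = orange      # assumed "somewhat far" threshold
        self.red = red            # assumed "very far" threshold
        self.max_size = max_size  # assumed cap applied during clean-up
        self.dictionary = []      # representative data points
        self.pending = []         # points that raised an Orange Alarm

    def _distance(self, point):
        # Distance to the nearest dictionary element (Euclidean is an assumption).
        return min(math.dist(point, d) for d in self.dictionary)

    def observe(self, point):
        """Classify one new data point as 'red', 'orange', or 'none'."""
        if not self.dictionary:
            self.dictionary.append(point)  # the first point seeds the dictionary
            return "none"
        d = self._distance(point)
        if d > self.red:
            return "red"
        if d > self.orange:
            self.pending.append(point)
            return "orange"
        return "none"

    def review(self):
        """Periodic step: evaluate Orange Alarms and clean up the dictionary."""
        for p in self.pending:
            # Assumed policy: an Orange point that is still unexplained
            # is adopted as a new dictionary element.
            if self._distance(p) > self.orange:
                self.dictionary.append(p)
        self.pending.clear()
        # Keep only the most recent max_size elements.
        del self.dictionary[:-self.max_size]


detector = StalenessSketch()
for point in [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (10.0, 0.0)]:
    print(point, detector.observe(point))
```

With the default thresholds, the four points above yield no alarm, no alarm, an Orange Alarm, and a Red Alarm, in that order. After a call to `review()`, the Orange point joins the dictionary, so later points near it no longer raise alarms.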
A related study to our change detection algorithm is [1].
This research was supported in part by the MITACS Internship Program. The authors would like to acknowledge the contributions made by Katrina Rogers-Stewart, Yihui Tang, and Pin Yuan.
[1] T. Ahmed, M. Coates, and A. Lakhina, “Multivariate online anomaly detection using kernel recursive least squares,” in Proc. IEEE INFOCOM, Anchorage, AK, May 2007.
[2] V. Paxson, “Bro: A System for Detecting Network Intruders in Real-Time,” Lawrence Berkeley National Laboratory; Proc. 7th USENIX Security Symposium, San Antonio, TX, Jan. 26-29, 1998.
[3] M. Roesch, “Snort - Lightweight Intrusion Detection for Networks,” in Proc. USENIX LISA '99, Seattle, Nov. 7-12, 1999.
[4] F. Hao, M. S. Kodialam, T. V. Lakshman, and H. Zhang, “Fast Payload-Based Flow Estimation for Traffic Monitoring and Network Security,” in Proc. ANCS 2005, New Jersey, USA, Oct. 26-28, 2005.
INTRODUCTION
AUTOIMMUNE SYSTEM ARCHITECTURE
ALGORITHMS
ALGORITHMS (CONT.)
CONCLUSION
ACKNOWLEDGEMENT
Jun Li*, and Peter Rabinovitch** *Carleton University, **Bell Labs, Alcatel-Lucent
Supervisor: Dr. Yiqiang Q. Zhao (Carleton University)
RESULTS (CONT.)
2. Data Clustering and Classification
The algorithm (in Signature Factory) classifies test data points into two clusters: typical traffic and atypical traffic.
The data space is split into small regions, and two density estimates are obtained for each region:
1. The proportion of known observations
2. The proportion of test observations
Observations in regions that have a nil (or very small) density estimate under typical traffic but a relatively large estimate under test traffic are classified as atypical traffic.
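The region-based comparison can be illustrated with a short sketch. It assumes a regular grid over two flow features, plus hypothetical thresholds `eps` (what counts as "nil or very small") and `min_share` (what counts as "relatively large"); none of these values come from the poster, and the function name `classify_atypical` is made up for the example. Both input lists are assumed to be non-empty.

```python
from collections import Counter


def classify_atypical(known, test, cell=(100.0, 10.0), eps=0.001, min_share=0.02):
    """Return the test points lying in regions that are dense under test
    traffic but (almost) empty under known traffic. Thresholds are assumptions."""

    def region(point):
        # Map a point to the index of the grid cell that contains it.
        return tuple(int(v // c) for v, c in zip(point, cell))

    known_counts = Counter(region(p) for p in known)
    test_counts = Counter(region(p) for p in test)

    atypical = []
    for p in test:
        r = region(p)
        known_share = known_counts[r] / len(known)  # estimate 1: known observations
        test_share = test_counts[r] / len(test)     # estimate 2: test observations
        if known_share <= eps and test_share >= min_share:
            atypical.append(p)
    return atypical


# Points are (mean packet size, flow length in packets); values are made up.
known = [(1500.0, 6.0)] * 50 + [(200.0, 200.0)] * 50
test = [(1500.0, 6.0)] * 45 + [(200.0, 200.0)] * 45 + [(1000.0, 40.0)] * 10
print(len(classify_atypical(known, test)))
```

In this toy input only the ten points at (1000, 40) fall in a region that is empty under known traffic and holds 10% of the test traffic, so only they are returned as atypical.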
[Architecture diagram labels: Purchased Signatures; Signature Factory; Staleness Detector; Router; Packets of changed signature; 5-tuple, packet size, …; Alarms of signature changes; New signatures]
RESULTS
3. Signature Extraction
The algorithm is signature-based, similar to Bro [2] and Snort [3], and is based on [4].
Only the cluster of atypical traffic is examined when extracting signatures.
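The poster does not spell out the extraction step, so the sketch below only illustrates the general idea of deriving a payload signature from the atypical cluster. It is not the method of [4]: it swaps in a simple frequent-substring heuristic that returns the fixed-length byte string shared by the largest number of payloads. The function name `extract_signature`, the default `length`, and the `min_coverage` threshold are all hypothetical.

```python
from collections import Counter


def extract_signature(payloads, length=16, min_coverage=0.8):
    """Return the byte substring of the given length that occurs in the most
    payloads, or None if no substring reaches min_coverage (an assumed threshold)."""
    seen_in = Counter()
    for payload in payloads:
        # Count each substring once per payload, not once per occurrence.
        grams = {payload[i:i + length] for i in range(len(payload) - length + 1)}
        seen_in.update(grams)
    if not seen_in:
        return None
    candidate, hits = seen_in.most_common(1)[0]
    return candidate if hits / len(payloads) >= min_coverage else None


# Toy cluster: three payloads carry the marker b"WORM", one does not.
cluster = [b"aaaWORMbbb", b"ccWORMdddd", b"WORMeeeeee", b"ffffffffff"]
print(extract_signature(cluster, length=4, min_coverage=0.7))
```

Because the cluster of atypical traffic can contain flows that are not malware, the coverage threshold is deliberately below 100% in this example.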
Fig. 2: Clustering 2-dimensional Data (axes: mean packet size, flow length in packets; labels: Data space, Cluster of typical traffic, Cluster of atypical traffic)
REFERENCES
In the implemented system, 20 computers are connected through Router (shown in Fig. 1) and exchange multimedia traffic. Staleness Detector and Signature Factory are connected to Router and run separately. There are five types of traffic flows: Web, Mix, SMTP, VoIP, and Video. The statistics of the traffic flows are shown in Table 1.
Fig. 1: AutoImmune Architecture
Flow type   Avg. flow length (# of packets)   Std. flow length   Avg. packet size   Std. packet size
Web         6                                 2                  1500               100
SMTP        3                                 2                  1500               100
VoIP        200                               50                 200                100
Video       600                               100                400                200
Mix         40                                2                  1000               100
Table 1: Five Types of Traffic Flows
Network speed is assumed to be 1 Gbps. At the beginning of the simulation, each computer generates traffic without Mix flows. Once the simulation reaches steady state, each computer starts generating Mix flows in the proportion specified in Table 2. The payload of each Mix packet is injected with a synthetic worm. The injected Mix traffic is of Web type as it passes through the router.
Simulation run   Web     SMTP   VoIP   Video   Mix (or Malicious)
S1               45%     20%    20%    10%     5%
S2               49%     20%    20%    10%     1%
S3               49.8%   20%    20%    10%     0.2%
Table 2: Proportions of Traffic Flows
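To make the simulation setup concrete, the sketch below draws synthetic flows using the statistics of Table 1 and the proportions of Table 2. The source does not state which distributions were used, so the normal distribution for flow length and packet size, the clamping to a minimum of 1, and the names `FLOW_STATS`, `MIX_RATIOS`, and `sample_flows` are assumptions made for illustration only.

```python
import random

# (avg flow length in packets, std flow length, avg packet size, std packet size),
# copied from Table 1.
FLOW_STATS = {
    "Web": (6, 2, 1500, 100),
    "SMTP": (3, 2, 1500, 100),
    "VoIP": (200, 50, 200, 100),
    "Video": (600, 100, 400, 200),
    "Mix": (40, 2, 1000, 100),
}

# Flow-type proportions per simulation run, copied from Table 2.
MIX_RATIOS = {
    "S1": {"Web": 0.45, "SMTP": 0.20, "VoIP": 0.20, "Video": 0.10, "Mix": 0.05},
    "S2": {"Web": 0.49, "SMTP": 0.20, "VoIP": 0.20, "Video": 0.10, "Mix": 0.01},
    "S3": {"Web": 0.498, "SMTP": 0.20, "VoIP": 0.20, "Video": 0.10, "Mix": 0.002},
}


def sample_flows(run, n, seed=0):
    """Draw n synthetic flows as (type, flow length, mean packet size) tuples."""
    rng = random.Random(seed)
    ratios = MIX_RATIOS[run]
    kinds = rng.choices(list(ratios), weights=list(ratios.values()), k=n)
    flows = []
    for kind in kinds:
        avg_len, std_len, avg_size, std_size = FLOW_STATS[kind]
        # Assumption: both features are normally distributed and clamped to >= 1.
        length = max(1, round(rng.gauss(avg_len, std_len)))
        size = max(1.0, rng.gauss(avg_size, std_size))
        flows.append((kind, length, size))
    return flows


flows = sample_flows("S1", 1000)
print(sum(1 for kind, _, _ in flows if kind == "Mix"), "Mix flows out of", len(flows))
```

The two features produced here, flow length and mean packet size, are the same two dimensions used for clustering in Fig. 2.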
Fig. 3: Change Detection in S1
Fig. 4: Change Detection in S2
Fig. 5: Change Detection in S3
Fig. 6: Flow Clustering and Classification in S1
Fig. 7: Flow Clustering and Classification in S2
Fig. 8: Flow Clustering and Classification in S3
Simulation run   T       N     N’    MEAN     L
S1               0.25    679   48    1030.6   21
S2               0.679   859   80    1048.9   18
S3               3.14    789   117   1091.2   20
The following parameters are defined for each simulation run:
1) T -- Time from when the malware (e.g., Mix traffic) starts until a new signature is obtained by Router
2) N -- Number of items in the cluster of atypical traffic
3) N’ -- Number of items in the cluster of atypical traffic that are NOT malware (i.e., not of Mix type)
4) MEAN -- Mean length (in bytes) of packets in the cluster of atypical traffic
5) L -- Length (in bytes) of the extracted signature
Table 3: Numerical Values of Parameters
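One quantity that helps when reading Table 3 is the ratio N’/N, the share of the atypical cluster that is not malware. The ratio is not reported in the source; the short sketch below computes it from the tabulated values only, and the names `TABLE3` and `non_malware_share` are hypothetical.

```python
# (T, N, N', MEAN, L) per simulation run, copied from Table 3.
TABLE3 = {
    "S1": (0.25, 679, 48, 1030.6, 21),
    "S2": (0.679, 859, 80, 1048.9, 18),
    "S3": (3.14, 789, 117, 1091.2, 20),
}


def non_malware_share(run):
    """Fraction of the atypical cluster that is not malware (N' / N)."""
    _, n, n_prime, _, _ = TABLE3[run]
    return n_prime / n


for run in TABLE3:
    print(run, f"{non_malware_share(run):.1%}")
# prints:
# S1 7.1%
# S2 9.3%
# S3 14.8%
```

The share of non-malware items in the atypical cluster rises from S1 to S3 as the proportion of Mix traffic falls from 5% to 0.2%, and T increases over the same runs.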