A Statistical Anomaly Detection Technique based on Three Different Network Features
Yuji Waizumi Tohoku Univ.
Background
The Internet has entered the business world
Need to protect information and systems from hackers and attacks
Network security has become an important issue
Many intrusion/attack detection methods have been proposed
Intrusion Detection System
Two major detection principles:
Signature Detection
  Attempts to flag behavior that is close to a previously defined pattern (signature) of a known intrusion
Anomaly Detection
  Attempts to quantify usual or acceptable behavior and flags other, irregular behavior as potentially intrusive
Motivation
Anomaly detection system
  Pro: can detect unknown attacks
  Con: many false positives
Improve the performance of anomaly detection systems
  Analyze the characteristics of attacks
  Propose a method to construct features as numerical values from network traffic
  Construct a detection system using the features
Classification of Attacks
DARPA Intrusion Detection Evaluation
  DoS: Denial of Service
  Probe: Surveillance of Targets
  Remote to Local (R2L), User to Root (U2R): Unauthorized Access to a Host or Super User
Re-classification of Attacks
Classification by Traffic Characteristics
  DoS, Probe: Traffic Quantity, Access Range
  Probe: Structure of Communication Flows
  DoS, R2L, U2R: Contents of Communications
To detect attacks with the above characteristics, it is necessary to construct features corresponding to those classes.
Network Traffic Feature
Numerical values (vectors) expressing the state of traffic
We propose three different network feature sets
  Based on the re-classification of attacks
  Analyzed independently
Time Slot Feature (34 dimensions)
Count various packets, flags, transmitted and received bytes, and port variety per unit time
Estimate the scale and range of attacks
Target
  Probe (Scan)
  DoS
Each slot is expressed as a vector
  Ex) (TCP, ICMP, SYN, FIN, RST, UDP, DNS, …)
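The slot counting above can be sketched as follows. This is only an illustration: the packet records, their keys (`time`, `proto`, `flags`, `dst_port`), the one-second slot length, and the seven counters in `FIELDS` are all assumptions; the slides specify 34 dimensions but do not list them in full.

```python
from collections import defaultdict

# Hypothetical packet records; the field names are illustrative only.
packets = [
    {"time": 0.2, "proto": "TCP", "flags": {"SYN"}, "dst_port": 21},
    {"time": 0.7, "proto": "TCP", "flags": {"RST"}, "dst_port": 21},
    {"time": 1.1, "proto": "UDP", "flags": set(), "dst_port": 53},
    {"time": 1.4, "proto": "ICMP", "flags": set(), "dst_port": None},
]

# Assumed subset of vector elements (not the full 34-dimension layout).
FIELDS = ["TCP", "ICMP", "UDP", "SYN", "FIN", "RST", "port_variety"]


def time_slot_vectors(packets, slot_seconds=1.0):
    """Return {slot_index: count vector} for fixed-length time slots."""
    counts = defaultdict(lambda: defaultdict(int))
    ports = defaultdict(set)
    for p in packets:
        slot = int(p["time"] // slot_seconds)
        counts[slot][p["proto"]] += 1
        for flag in p["flags"]:
            counts[slot][flag] += 1
        if p["dst_port"] is not None:
            ports[slot].add(p["dst_port"])
    vectors = {}
    for slot in counts:
        # Port variety: number of distinct destination ports seen in the slot.
        counts[slot]["port_variety"] = len(ports[slot])
        vectors[slot] = [counts[slot][f] for f in FIELDS]
    return vectors


vectors = time_slot_vectors(packets)
```

Each value in `vectors` is one time slot expressed as a count vector in the order given by `FIELDS`.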
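Examples (Time Slot Feature)
[Figure: element values of time slot vectors (x-axis: vector element, y-axis: element value). Panels: normal traffic only; ftp scan, with the RST flag count on port 21 marked; telnet scan, with the RST flag count on port 23 marked]
Values are normalized to mean = 0, variance = 1.0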
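A minimal sketch of scaling each vector element to mean 0 and variance 1. The slides state only the target mean and variance; using the population standard deviation, and mapping constant columns to 0, are assumptions made here for illustration.

```python
from statistics import mean, pstdev


def standardize(vectors):
    """Scale every column to mean 0 and variance 1 (constant columns become 0)."""
    columns = list(zip(*vectors))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]  # population standard deviation (assumed)
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, mus, sigmas)]
        for row in vectors
    ]


raw = [[2, 10], [4, 10], [6, 10]]  # made-up count vectors
scaled = standardize(raw)
```

In practice the means and deviations would presumably be estimated on the learning data and reused for the test data, although the slides do not say so.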
Flow Counting Feature
A flow is specified by (srcIP, dstIP, srcPort, dstPort, protocol)
Count packets, flags, and transmitted and received bytes in a flow
Target
  Scans with illegal flags
  Ports used as backdoors
TCP: 19 dim., UDP: 7 dim.
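A sketch of grouping packets into flows by the 5-tuple and counting per flow. The record keys and the counter names are hypothetical, and treating a packet whose reversed 5-tuple is already known as the reception side of that flow is an assumption; the slides do not describe how the two directions are matched.

```python
from collections import Counter, defaultdict

# Hypothetical packet records; the keys are illustrative only.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1025, "dport": 80,
     "proto": "TCP", "flags": ("SYN",), "length": 60},
    {"src": "10.0.0.2", "dst": "10.0.0.1", "sport": 80, "dport": 1025,
     "proto": "TCP", "flags": ("SYN", "ACK"), "length": 60},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "sport": 4000, "dport": 23,
     "proto": "TCP", "flags": ("RST",), "length": 40},
]


def flow_counts(packets):
    """Return {5-tuple: Counter} with packet, byte, and flag counts per flow."""
    flows = defaultdict(Counter)
    for p in packets:
        forward = (p["src"], p["dst"], p["sport"], p["dport"], p["proto"])
        reverse = (p["dst"], p["src"], p["dport"], p["sport"], p["proto"])
        # Assumption: a reply to a known flow is counted as its reception side.
        if reverse in flows:
            key, direction = reverse, "rx"
        else:
            key, direction = forward, "tx"
        counter = flows[key]
        counter[direction + "_packets"] += 1
        counter[direction + "_bytes"] += p["length"]
        for flag in p["flags"]:
            counter[flag] += 1
    return flows


flows = flow_counts(packets)
web_flow = flows[("10.0.0.1", "10.0.0.2", 1025, 80, "TCP")]
```

Here `web_flow` holds the counts for both directions of the first connection, while the lone RST packet forms a separate flow.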
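Examples (Flow Counting Feature)
[Figure: element values of flow counting vectors (x-axis: vector element, y-axis: element value). Panels: normal traffic; port sweep (scan), with a decrease of SYN packets marked]
Counts of specific packets are extremely high or low in attacks.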
Flow Payload Feature
Represents the content of communication
Histogram of the character codes in a flow
  Counted in 8-bit units (256 classes)
  Transmission and reception are counted independently (512 classes in total)
Target
  Buffer overflow
  Malicious code
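The 512-class histogram can be sketched as below. Placing the transmission counts in the first 256 bins and the reception counts in the last 256, and keeping raw counts rather than normalized frequencies, are assumptions; the function name and the sample payloads are made up.

```python
def payload_histogram(sent: bytes, received: bytes):
    """Count byte values separately for each direction (256 + 256 = 512 bins)."""
    hist = [0] * 512
    for b in sent:
        hist[b] += 1        # bins 0..255: transmitted bytes (assumed layout)
    for b in received:
        hist[256 + b] += 1  # bins 256..511: received bytes (assumed layout)
    return hist


# Made-up payloads of one flow.
hist = payload_histogram(b"USER anonymous\r\n", b"331 ok\r\n")
```

A payload dominated by a single repeated byte would show up as one very large bin, which fits the buffer overflow target named above.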
Examples (Flow Payload Feature)
[Figure: character-code histograms. Panels: normal traffic; imap attack]
Counts of specific characters are extremely high or low in attacks.
Modeling Normal Behavior
Each packet appears according to a protocol
  Correlations exist between elements of the feature vectors
A profile based on these correlations can represent the normal behavior of network traffic
Principal Component Analysis: PCA
Extracts the correlation among samples as principal components
Principal components lie along the sample distribution
[Figure: scatter plots showing the principal component of correlated data, contrasted with non-correlated data]
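Discriminant Function
Projection distance: the distance from a sample to the principal component
[Figure: an anomaly sample lying far from the principal component, with its projection distance marked]
Samples at a long distance:
• Unusual traffic
• Break the correlation
Projection distance is used as the detection criterion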
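A minimal sketch of the projection distance, assuming NumPy is available. The function names, the synthetic two-dimensional data, and the choice of one retained component are illustrative assumptions; the slides do not state how many components are kept or how the threshold is set.

```python
import numpy as np


def fit_pca(normal, n_components):
    """Learn the mean and the leading principal directions from normal samples."""
    mean = normal.mean(axis=0)
    centered = normal - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def projection_distance(x, mean, components):
    """Distance from x to the subspace spanned by the principal components."""
    centered = x - mean
    projected = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - projected))


# Synthetic "normal" data whose two elements are strongly correlated (y = 2x).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
normal = np.column_stack([t, 2 * t + 0.01 * rng.normal(size=200)])

mean, components = fit_pca(normal, n_components=1)
d_normal = projection_distance(np.array([1.0, 2.0]), mean, components)
d_anomaly = projection_distance(np.array([1.0, -2.0]), mean, components)
```

The sample that follows the learned correlation stays close to the principal component, while the sample that breaks it lies far away and would be flagged once its distance exceeds a chosen threshold.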
Detection Algorithm
Independent Detection
  The three features are used for PCA independently
  The alerts from each feature are combined by a "logical OR" operation
[Diagram: Network Traffic → Features (Time Slot, Flow Counting, Flow Payload) → PCA (one per feature) → Alert (one per feature) → OR → Alert]
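The OR combination of the three detectors can be sketched as follows. The threshold values and distances are made up, and the sketch assumes the three distances have already been matched to the same event; how alerts from time slots and from flows are aligned is not described on the slides.

```python
def combined_alert(distances, thresholds):
    """Raise an alert if any feature's distance exceeds its own threshold."""
    per_feature = {
        name: distances[name] > thresholds[name] for name in thresholds
    }
    return any(per_feature.values()), per_feature


# Made-up thresholds and projection distances, one per feature.
thresholds = {"time_slot": 3.0, "flow_counting": 2.5, "flow_payload": 4.0}
distances = {"time_slot": 1.2, "flow_counting": 0.8, "flow_payload": 6.1}

alert, detail = combined_alert(distances, thresholds)
```

Here only the flow payload detector fires, and that alone is enough to raise the combined alert.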
Performance Evaluation
Two Evaluation Scenarios
  Scenario 1
    Learn: Weeks 1 and 3
    Test: Weeks 4 and 5
  Scenario 2
    Learn: Weeks 4 and 5
    Test: Weeks 4 and 5
    More practical situation: real network traffic may include attack traffic
Criterion for Evaluation
  Detection rate when the number of false detections (false positives) per day is 10
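One way to apply this criterion is sketched below: pick the lowest threshold that keeps false positives within the daily budget, then measure the detection rate at that threshold. The function names and the scores are hypothetical, and counting per sample is a simplification; the slides do not describe the exact scoring procedure.

```python
def threshold_for_fp_budget(normal_scores, days, fp_per_day=10):
    """Lowest threshold such that at most fp_per_day * days normal scores exceed it."""
    budget = fp_per_day * days
    ranked = sorted(normal_scores, reverse=True)
    if budget >= len(ranked):
        return float("-inf")  # the budget allows flagging every normal sample
    # Only the scores ranked above position `budget` can be strictly greater.
    return ranked[budget]


def detection_rate(attack_scores, threshold):
    """Fraction of attack scores strictly above the threshold."""
    detected = sum(1 for s in attack_scores if s > threshold)
    return detected / len(attack_scores)


# Made-up anomaly scores: 100 normal samples from one day and 4 attacks.
normal_scores = list(range(100))
attack_scores = [95, 120, 30, 200]

threshold = threshold_for_fp_budget(normal_scores, days=1)
false_positives = sum(1 for s in normal_scores if s > threshold)
rate = detection_rate(attack_scores, threshold)
```

With these made-up scores the threshold lands where exactly 10 normal samples are flagged, and three of the four attacks are detected.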
Data Set
1999 DARPA off-line intrusion detection evaluation test set
Contains 5 weeks of data (Monday to Friday)
  Weeks 1, 3: Normal traffic only
  Week 2: Including attacks (for learning)
  Weeks 4, 5: Including attacks (for testing)
Scenario 1 Result
Method            # of detections   # of targets   Detection rate
Proposed Method   104               171            60.8%
NETAD             132               185            71.4%
Forensics         15                27             55.6%
Expert1           85                169            50.3%
Expert2           81                173            46.8%
Dmine             41                102            40.2%
2003
2000
Scenario 2 Result
Method            # of detections   # of targets   Detection rate
Proposed Method   100               171            58.5%
NETAD             70                185            37.8%
NETAD
• Uses IP addresses as a white list
• Overfits the learning data
Proposed Method
• Independent of IP addresses
• Evaluates only the anomaly of traffic
Detection Results for Each Feature
[Figure: detection counts per feature for each scenario, laid out under the columns (FP), (FC), (TS)]
Scenario 1
  3 (TS) & (FC) & (FP)
  40 Flow Payload (FP)
  38 Flow Counting (FC)
  2737 Time Slot Feature (TS)
Scenario 2
  5 (TS) & (FC) & (FP)
  44 Flow Payload Feature (FP)
  613 Flow Counting Feature (FC)
  5922 Time Slot Feature (TS)
Figure callouts: # of detections by both TS & FP; # of detections by FP only; # of detections by all three features
Low detection overlap
Each feature detects attacks with different characteristics
Conclusion
For network security
  Classified attacks into three types
  Constructed three features corresponding to the three attack characteristics
Detection method with PCA
  Learns the three features independently
Higher detection accuracy
  With samples including attacks