Date posted: 21-Dec-2015
1
Towards Anomaly/Intrusion Detection and Mitigation on High-Speed Networks
Yan Gao, Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao
Northwestern Lab for Internet and Security Technology (LIST)
Department of Computer Science, Northwestern University
http://list.cs.northwestern.edu
2
Outline
• Motivation
• Architecture of RAIDM
• Statistical Sketch-based Anomaly/Intrusion Detection
  – Design
  – Evaluation Results
• Polymorphic Worm Signature Generation with Provable Attack Resilience
  – Design
  – Preliminary Evaluation Results
• Conclusions
3
Desired Features of Intrusion Detection Systems (IDS)
• Network-based and scalable to high-speed links
  – Slammer worm infected 75,000 machines in <10 mins
  – Flash worm can take less than 1 second to compromise 1M vulnerable machines in the Internet [Staniford04]
  – Host-based schemes are inefficient and user dependent
    » Have to install IDS on all user machines!
  – Existing network IDSs are unscalable: on a 10 Gbps link, each 40-byte packet has only 32 ns for processing!
• Aggregated detection over multiple vantage points
  – Multi-homing, load balancing, and policy routing are becoming popular, which leads to asymmetric routing
  – Even worse: per-packet load balancing
  – Cannot afford to move traffic around
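The 32 ns figure above follows from simple arithmetic, which the short sketch below reproduces. The function name `per_packet_budget_ns` is made up for illustration, and the calculation assumes back-to-back packets with no framing overhead.

```python
def per_packet_budget_ns(link_gbps: float, packet_bytes: int) -> float:
    """Time available to process one packet on a fully loaded link.

    Assumption: packets arrive back to back and framing overhead is ignored.
    Bits divided by (Gbit/s) gives nanoseconds directly.
    """
    bits = packet_bytes * 8
    return bits / link_gbps


# 40-byte packets on a 10 Gbps link, as on the slide.
budget = per_packet_budget_ns(10, 40)
print(f"{budget} ns per packet")
```

Any per-packet work, including memory accesses, has to fit in this budget, which is why the rest of the deck avoids exact per-flow state on the critical path.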
4
Desired Features of Intrusion Detection Systems (IDS)
• Attack resiliency
  – Human adversaries are difficult to handle. Attacks tend to fool or DoS the IDS.
  – Many IDSs use exact per-flow state, which is vulnerable to DoS attacks.
• Attack root cause analysis for mitigation
  – Detection is only the first step. We also need mitigation and prevention.
  – Approaches based on overall traffic can scale to high speed but cannot really help mitigation
  – We need flow-level information for mitigation
  – Signature generation for polymorphic worms
6
Router-based Anomaly/Intrusion Detection and Mitigation System (RAIDM)
• Online traffic recording
  – Design reversible sketch for data streaming computation
  – Record millions of flows (GB of traffic) in a few hundred KB
• Online flow-level anomaly/intrusion detection & mitigation
  – As a first step, detect TCP SYN flooding and horizontal and vertical scans, even when mixed
    » Existing schemes like TRW/AC and CPM will have high false positives
  – Infer key characteristics of malicious flows for mitigation
• DoS attack resiliency
• Applies to distributed detection environments
RAIDM: the first flow-level intrusion detection system that can sustain tens of Gbps of bandwidth, even for worst-case traffic of 40-byte packet streams
7
Router-based Anomaly/Intrusion Detection and Mitigation System (RAIDM)
• Attach to routers as a black box
[Figure: Attaching RAIDM black boxes to high-speed routers: (a) original configuration; (b) distributed configuration, in which each port is monitored separately; (c) aggregate configuration, in which a splitter aggregates the traffic from all the ports of a router.]
8
RAIDM Architecture
[Figure: RAIDM architecture. Part I: sketch-based monitoring & detection. Part II: per-flow monitoring & detection. Modules: reversible k-ary sketch monitoring; filtering; sketch-based statistical anomaly detection (SSAD); per-flow monitoring; signature-based detection; polymorphic worm detection (Hamsa); network fault diagnosis (DOD). Labeled flows: streaming packet data; local sketch records, sent out for aggregation; remote aggregated sketch records; normal flows; suspicious flows; keys of normal flows; keys of suspicious flows; intrusion or anomaly alarms. Legend: data path, control path, modules on the critical path, modules on the non-critical path.]
10
Statistical Sketch-based Anomaly/Intrusion Detection (SSAD)
• Recording stage: record the traffic of each router with different sketches, then transfer and combine them into the current reversible sketch.
• Detection stage: use time series analysis (Holt-Winters and EWMA) to detect large forecast errors as anomalies.
• False positive reduction & 2D sketch skipped (lack of time)
[Figure: SSAD workflow, divided into a recording process and a detection process. Labels: real traffic stream; sketch recording; sketches from other routers; aggregate reversible sketch; aggregate two-dimensional sketch; current reversible sketch; forecast sketch (update); forecast error; statistical detection; false positive reduction; attack mitigation.]
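As a rough illustration of the detection stage, the sketch below applies an EWMA forecast to a row of counters and flags buckets with a large forecast error. It is a minimal sketch under stated assumptions, not the authors' implementation: a flat list of counters stands in for a sketch, and the smoothing factor `alpha`, the `threshold`, and all function names are illustrative choices.

```python
def forecast_error(observed, forecast):
    """Per-bucket error: what was seen minus what was predicted."""
    return [o - f for o, f in zip(observed, forecast)]


def ewma_forecast(prev_forecast, prev_observed, alpha):
    """EWMA: next forecast = alpha * last observation + (1 - alpha) * last forecast."""
    return [alpha * o + (1 - alpha) * f
            for f, o in zip(prev_forecast, prev_observed)]


def alarms(error, threshold):
    """Indices of buckets whose forecast error exceeds the threshold."""
    return [i for i, e in enumerate(error) if e > threshold]


# Hypothetical per-interval counters (e.g. SYN minus SYN/ACK per bucket).
intervals = [
    [10, 12, 9, 11],
    [11, 12, 10, 10],
    [10, 95, 9, 11],   # bucket 1 spikes in the last interval
]
alpha, threshold = 0.5, 20

forecast = intervals[0]          # assumption: seed the forecast with the first interval
flagged = []
for observed in intervals[1:]:
    flagged = alarms(forecast_error(observed, forecast), threshold)
    forecast = ewma_forecast(forecast, observed, alpha)

print(flagged)
```

Because the forecast and the error are computed counter by counter, the same arithmetic can be applied to every counter of a sketch without knowing which keys hash to it; mapping a flagged bucket back to a key is the job of the reversible sketch.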
11
Statistical Sketch-based Anomaly/Intrusion Detection (SSAD)
• Distributed Detection Environment
[Figure: an internal network (Dept 1, Dept 2, ..., Dept n) reached through Router 1, Router 2, Router 3, ...; packets are labeled SYN1, SYN/ACK1, SYN2, and SYN/ACK2.]
12
Background: Reversible k-ary Sketch
• Array of hash tables: Tj[K] (j = 1, …, H)
  – Similar to count sketch, counting Bloom filter, multi-stage filter, …
• Update (k, u): Tj[hj(k)] += u (for all j)
[Figure: H hash tables (rows 1, …, j, …, H), each with K buckets (0, 1, …, K-1); key k is hashed to buckets h1(k), …, hj(k), …, hH(k) in the respective rows.]
13
Background: Reversible k-ary Sketch
• Estimate v(S, k): sum of updates for key k
• Sketches are linear
  – Can combine sketches
  – Can aggregate data from different times, locations, and sources
[Figure: sketch + sketch = sketch]
• Inference I(S, t): output the heavy keys whose values are larger than threshold t
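The update, estimate, and combine operations above can be sketched in a few lines. This is a plain k-ary sketch under stated assumptions, not the reversible variant from the slides: a keyed BLAKE2b digest stands in for the hash functions hj, the estimator is the usual bias-corrected median for k-ary sketches, and `heavy_keys` checks a caller-supplied candidate list rather than inferring keys from the sketch itself. The class and method names are illustrative.

```python
import hashlib
from statistics import median


class KarySketch:
    """H hash tables with K counters each (a plain, non-reversible k-ary sketch)."""

    def __init__(self, H=5, K=1024):
        self.H, self.K = H, K
        self.T = [[0.0] * K for _ in range(H)]

    def _h(self, j, key):
        # Assumption: keyed BLAKE2b stands in for the j-th hash function h_j.
        d = hashlib.blake2b(repr(key).encode(), digest_size=8,
                            key=bytes([j + 1])).digest()
        return int.from_bytes(d, "big") % self.K

    def update(self, key, u):
        """Update (k, u): T_j[h_j(k)] += u for all j."""
        for j in range(self.H):
            self.T[j][self._h(j, key)] += u

    def estimate(self, key):
        """Median over rows of the bias-corrected bucket value."""
        total = sum(self.T[0])  # every row holds the same total
        per_row = [
            (self.T[j][self._h(j, key)] - total / self.K) / (1 - 1 / self.K)
            for j in range(self.H)
        ]
        return median(per_row)

    def combine(self, other):
        """Linearity: adding sketches counter by counter merges their streams."""
        out = KarySketch(self.H, self.K)
        for j in range(self.H):
            out.T[j] = [a + b for a, b in zip(self.T[j], other.T[j])]
        return out

    def heavy_keys(self, candidates, t):
        """Stand-in for inference: test given candidate keys against threshold t."""
        return [k for k in candidates if self.estimate(k) > t]


# Two routers record parts of the same flow; the combined sketch sees the sum.
router1, router2 = KarySketch(), KarySketch()
router1.update(("10.0.0.1", 80), 60)
router2.update(("10.0.0.1", 80), 40)
router2.update(("10.0.0.2", 22), 7)
combined = router1.combine(router2)
estimate = combined.estimate(("10.0.0.1", 80))
heavy = combined.heavy_keys([("10.0.0.1", 80), ("10.0.0.2", 22)], t=50)
```

Combining only works when both sketches use the same H, K, and hash functions, which is why the hashing here is deterministic rather than seeded per instance.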
14
Detection Algorithm
• RS((DIP, Dport), SYN-SYN/ACK)
• RS((SIP, DIP), SYN-SYN/ACK)
• RS((SIP, Dport), SYN-SYN/ACK)

Attack types     | RS((DIP, Dport), SYN-SYN/ACK) | RS((SIP, DIP), SYN-SYN/ACK) | RS((SIP, Dport), SYN-SYN/ACK)
SYN flooding     | Yes                           | Yes                         | Yes
Vertical scans   | No                            | Yes                         | No
Horizontal scans | No                            | No                          | Yes

[Figure labels: HORIZONTAL, VERTICAL, BLOCK, PORT NUMBER, SOURCE IP]
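The table above reads as a lookup from "which sketches raised an alarm" to an attack type. A minimal sketch of that lookup follows; the function name `classify` and the "unclassified" fallback for combinations the table does not list are assumptions made for illustration.

```python
# Alarm pattern (DIP/Dport sketch, SIP/DIP sketch, SIP/Dport sketch) -> attack type,
# copied from the table on the slide.
PATTERNS = {
    (True, True, True): "SYN flooding",
    (False, True, False): "vertical scan",
    (False, False, True): "horizontal scan",
}


def classify(dip_dport_alarm: bool, sip_dip_alarm: bool, sip_dport_alarm: bool) -> str:
    """Map the three sketch alarms to an attack type.

    Assumption: combinations that the table does not list are reported
    as "unclassified" rather than guessed.
    """
    pattern = (dip_dport_alarm, sip_dip_alarm, sip_dport_alarm)
    return PATTERNS.get(pattern, "unclassified")


print(classify(False, True, False))
```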
15
DoS Resiliency Analysis
• Possible DoS attacks on existing approaches like TRW and TRW/AC
  – Source-spoofed SYN flooding attack
  – Source-spoofed packets to random locations
• Attacking SSAD is difficult. A possible attack is based on creating collisions in sketches. But…
  – Reverse engineering of hash functions is difficult
  – The probability of finding collisions through exhaustive search is very low
  – Attacks are limited even with collisions: they need the cooperation of compromised internal hosts.
16
Evaluation
• Data sets
  – NU traces (239M flows, 1.8TB traffic/day)
  – Lawrence Berkeley National Laboratory (LBL) trace (900M flows)
• Rest of results based on NU trace evaluation
• Scalability and memory usage
  – Total of 9.4MB used for recording hundreds of millions of flows
    » 3 reversible k-ary sketches, 2 two-dimensional sketches, and 1 original k-ary sketch
  – Small number of memory accesses per packet
17
Evaluation (cont’d)
• Fast
  – Recording speed for the worst-case traffic, all 40B pkts
    » 16 Gbps on a single FPGA board
    » 526 Mbps on a Pentium-IV 2.4GHz PC
  – Detection speed
    » On-site NU experiment covering 1430 minutes: 0.34 sec for one minute of traffic on average (std = 0.64 sec)
• Accurate anomaly detection w/ reversible sketch
  – Compared with detection using complete flow-level logs
  – Provable probabilistic accuracy guarantees
  – Even more accurate on real Internet traces
18
Evaluation (cont’d)
• 25 SYN flooding attacks, 936 horizontal scans, and 19 vertical scans detected (after sketch-based false positive reduction)
• 18 out of 25 SYN flooding attacks verified w/ backscatter
  – Complete flow-level connection info used for backscatter
• Scans verified (all for vscan, top and bottom 10 for hscan)
  – Unknown scans also found in DShield and other alert reports

Top 10 horizontal scans
Description     | Dport | Count
SQLSnake        | 1433  | 5
W32.Rahack      | 4899  | 2
unknown scan    | 6101  | 1
Scan SSH        | 22    | 1
MySQL Bot scans | 10202 | 1

Bottom 10 horizontal scans
Description           | Dport | Count
Sasser or Korgo worm  | 445   | 3
W32.Sasser.B.Worm     | 5554  | 1
Nachi or MSBlast worm | 135   | 3
NetBIOS scan          | 139   | 3
20
Requirements for Polymorphic Worm Signature Generation
• Network-based
  – Worms spread at exponential speed; at the early stage there are only limited worm samples.
  – Keep up with the network speed!
• Noise tolerant
  – Most network-level flow classifiers suffer from false positives.
• Attack resilience
  – Attackers always try to evade the signature generation system
• Efficient signature matching
  – Again, on high-speed networks!
• No existing work satisfies these requirements
21
Hamsa
• Content-based vs. behavior-based signatures
  – Content-based: treat a worm as a byte sequence
  – Behavior-based: the actual dynamics of the worm execution
  – Fast matching could be a problem for behavior-based signatures
• We propose Hamsa, a network-based signature generation system that meets the aforementioned requirements
  – Content-based system (token-based)
  – Accuracy comparable to the state-of-the-art technique [Polygraph] but much faster
  – Proposes an adversary model and provides an attack resilience guarantee
22
Hamsa Design
• Sniff traffic from networks
• Assemble the packets into flows
• Classify traffic based on protocol
• Filter out known worms
• Generate suspicious and normal pools
[Figure: Hamsa pipeline. Components: network tap; protocol classifier (TCP 25, TCP 53, TCP 80, ..., TCP 137, UDP 1434); known worm filter; worm flow classifier; suspicious traffic pool; normal traffic pool; Hamsa signature generator; signatures.]
23
Hamsa Design
• Iterative signature generation
• Extract the tokens from the suspicious pool
• Identify the same set of tokens in the normal pool
• Core part: greedy algorithm based on the token information
[Figure: Hamsa signature generator. Components: suspicious traffic pool; normal traffic pool; token extractor; token identification; tokens; core; signature; filter. Decision "Pool size too small?" with branches NO and YES (quit).]
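To make the "greedy algorithm based on the token information" more concrete, here is a simplified sketch of a greedy, token-based signature builder. It is not Hamsa's actual algorithm: the token list is supplied by the caller instead of being extracted from the suspicious pool, a signature is a plain conjunction of byte strings, and the per-step false-positive bounds in `fp_bounds` are invented for the example.

```python
def greedy_signature(suspicious, normal, tokens, fp_bounds):
    """Pick one token per step, maximizing coverage of the suspicious pool
    while keeping the false-positive rate on the normal pool under that
    step's bound. Returns the chosen tokens as a conjunction signature."""
    signature = []
    sus, nor = list(suspicious), list(normal)  # flows still matching the signature
    for bound in fp_bounds:
        best = None
        for tok in tokens:
            if tok in signature:
                continue
            # False-positive rate of (signature AND tok) over the whole normal pool.
            fp = sum(tok in flow for flow in nor) / max(len(normal), 1)
            if fp > bound:
                continue
            coverage = sum(tok in flow for flow in sus)
            if coverage and (best is None or coverage > best[1]):
                best = (tok, coverage)
        if best is None:
            break
        signature.append(best[0])
        sus = [flow for flow in sus if best[0] in flow]
        nor = [flow for flow in nor if best[0] in flow]
    return signature


# Hypothetical pools: two worm flows plus one noise flow in the suspicious pool.
suspicious_pool = [
    b"GET /x HTTP/1.1 \xff\xbf payload-a",
    b"GET /y HTTP/1.1 \xff\xbf payload-b",
    b"GET /index HTTP/1.1",
]
normal_pool = [b"GET /index HTTP/1.1", b"GET /a HTTP/1.1", b"POST /b HTTP/1.1"]
candidate_tokens = [b"GET", b"HTTP/1.1", b"\xff\xbf"]

strict = greedy_signature(suspicious_pool, normal_pool, candidate_tokens, [0.1])
relaxed = greedy_signature(suspicious_pool, normal_pool, candidate_tokens, [0.7, 0.0])
```

With a loose first bound the common token `GET` is picked first for its coverage, and the tighter second bound then forces the worm-specific token in; with a strict bound from the start, only the worm-specific token qualifies.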
24
Model-based Polymorphic Worm Signature Generation Problem & Algorithm
• No noise: O(n)
• Noise: NP-hard
[Figure panels: the problem, the model, the algorithm]
25
Evaluation Methodology
• Data collection
  – Normal pool:
    » 300MB of Web traffic for training
    » 20GB of Web traffic and a binary distribution of Linux for evaluation
  – Suspicious pool:
    » 10 ~ 500 samples (with noise and different worms)
    » 5000 samples of each worm as the false negative testing data set
  – Three pseudo worms: ATPhttpd, Apache-Knacker, Codered II
  – Two polymorphic engines from the Internet
• Simulation settings
  – Single worm with noise
    » Test pool size 100 and 200
    » Noise ratio 0, 10%, 30%, 50%
  – Multiple worms with noise
    » 3 worms together: ATPhttpd, Apache-Knacker, Codered II
    » Test pool size 100 and 200
    » Noise ratio 0, 10%, and 25%
26
Preliminary Evaluation Results
• Signature quality
• Signature generation speed
  – 64 ~ 361 times faster than Polygraph
  – Takes from several seconds to at most a few minutes to generate signatures
• Attack resilience
  – Provable attack resilience (details omitted)
  – For the token-fit attack, Polygraph fails but Hamsa succeeds
27
Backup Slides
28
Intrusion Detection and Mitigation

Attacks detected                                | Mitigation
Denial of Service (DoS), e.g., TCP SYN flooding | SYN defender, SYN proxy, or SYN cookie for victim
Port scans and worms                            | Ingress filtering with attacker IP
Vertical port scan                              | Quarantine the victim machine
Horizontal port scan                            | Monitor traffic with the same port # for compromised machines
Spyware                                         | Warn the end users being spied on

[Figure labels: HORIZONTAL, VERTICAL, BLOCK, PORT NUMBER, SOURCE IP]
29
Reversible Sketch Based Anomaly Detection
• Input stream: (key, update) (e.g., SIP, SYN-SYN/ACK)
[Figure: (k, u) … → Sketch module → Sketches → Forecast module(s) → Error sketch → Anomaly detection module → Alarms]
• Summarize input stream using sketches
• Build forecast models on top of sketches
• Report flows with large forecast errors
• Infer the (characteristics) key for mitigation
30
Reducing False Positives for SYN Flooding Detection
• Causes of false positives in SYN flooding detection
  – Network/server congestion or failures
  – Polluted or outdated DNS entries
• Filters to reduce false positives caused by bursty network/server congestion or failures
  1. (#SYN − #SYN/ACK) / #SYN > 1/2
  2. Lifetime > Threshold_life
• Filters to reduce false positives caused by misconfigurations or related problems
  – No connection history
31
Intrusion Classification with Two Dimensional Sketch
• We need to distinguish different types of attacks to apply the most effective mitigation scheme
  – However, one-dimensional information is not enough
    • Non-spoofed SYN flooding vs. horizontal scan {SIP, DP}
  – Bi-modal distribution
• Two-dimensional sketch
[Figure: structure of the 2D sketch: H two-dimensional hash matrices, indexed by hx(SIP, DIP) on one axis and hy(DP) on the other. Example UPDATE: packet {2.3.0.5, 9.7.2.3, 80, SYN} adds +1 to the cell at (hx(2.3.0.5, 9.7.2.3), hy(80)).]
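The example update from the figure can be sketched as follows. This is an illustrative structure, not the authors' implementation: keyed BLAKE2b digests stand in for hx and hy, the matrix sizes `Kx` and `Ky` are arbitrary, and the caller is assumed to invoke `update` only for SYN packets.

```python
import hashlib


def _bucket(tag, row, value, n_buckets):
    """Assumption: keyed BLAKE2b stands in for the hash functions hx / hy."""
    key = f"{tag}{row}".encode()
    digest = hashlib.blake2b(repr(value).encode(), digest_size=8, key=key).digest()
    return int.from_bytes(digest, "big") % n_buckets


class TwoDSketch:
    """H hash matrices; one axis is indexed by hx(SIP, DIP), the other by hy(DP)."""

    def __init__(self, H=3, Kx=64, Ky=16):
        self.H, self.Kx, self.Ky = H, Kx, Ky
        self.M = [[[0] * Ky for _ in range(Kx)] for _ in range(H)]

    def cell(self, j, sip, dip, dport):
        """Coordinates of the counter for this packet in matrix j."""
        x = _bucket("x", j, (sip, dip), self.Kx)
        y = _bucket("y", j, dport, self.Ky)
        return x, y

    def update(self, sip, dip, dport, u=1):
        """Add u to one counter in each of the H matrices."""
        for j in range(self.H):
            x, y = self.cell(j, sip, dip, dport)
            self.M[j][x][y] += u


# The packet from the slide: {2.3.0.5, 9.7.2.3, 80, SYN}.
sketch = TwoDSketch()
sketch.update("2.3.0.5", "9.7.2.3", 80)
x0, y0 = sketch.cell(0, "2.3.0.5", "9.7.2.3", 80)
```

Reading the counters along the hy axis for a fixed hx(SIP, DIP) position shows how one source/destination pair spreads its SYNs across destination ports, which is the second dimension a one-dimensional sketch lacks.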
32
Intrusion Classification with Two Dimensional Sketch
• The two-dimensional sketch is accurate
  – Refer to the paper for the accuracy proof
  – Can achieve a 99.9956% detection rate with the parameters given in the paper
33
Automated worm signature generation
• Manual signature generation is too slow for fast-propagating worms
• Automatic signature generation has been proposed [Earlybird][Autograph]
• But it is not hard for hackers to apply polymorphism to their worms
• Polymorphic worm signature generation becomes necessary