Tunnel Hunter: Detecting application-layer tunnels
with statistical fingerprinting
M. Dusi, M. Crotti, F. Gringoli, L. Salgarelli
Presented by: Vangelis Kafentarakis – S.N. 1780
Introduction (1/2)
Firewalls and Application Level Gateways (ALGs)
  Usually configured to protect against at least two types of attack
    ▪ Control which sites local users connect to
    ▪ (Try to) limit attacks coming from the Internet
Firewalls check TCP ports and destination addresses
ALGs verify that the traffic crossing the network boundary conforms to the security policies and is not malicious
Introduction (2/2)
Tunneling techniques
  Disguise one application-layer protocol as another one
  Make security policies ineffective and can lead to a dangerous illusion of security
  Can be based on the DNS, HTTP or SSH protocols
An overview of tunneling techniques (1/3)
Packets are encoded at the application layer so that they conform to one or more specific allowed protocols.
Most commonly, three protocols are used to tunnel Internet traffic: DNS, HTTP, SSH.
An overview of tunneling techniques (2/3)
DNS Tunneling
  Exploits the way regular DNS requests for a given domain are forwarded
  Powerful technique, since DNS is rarely blocked
  Can rarely achieve throughputs higher than a few kb/s due to the mechanism’s complexity, and is therefore rarely used
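To give a feel for why DNS tunnel throughput stays low, here is a minimal sketch of how a tunnel client might pack payload bytes into a single query name. The domain `t.example.com`, the choice of Base32, and the helper name `encode_query_name` are assumptions made for illustration and do not come from the presentation. The 63-byte label limit and 253-character name limit are the standard DNS limits.

```python
import base64

MAX_LABEL = 63   # DNS limit: bytes per label
MAX_NAME = 253   # DNS limit: characters in a full domain name

def encode_query_name(payload: bytes, domain: str = "t.example.com") -> str:
    """Pack payload into Base32 labels in front of a (hypothetical) tunnel domain."""
    # Base32 keeps the name case-insensitive and limited to letters and digits.
    encoded = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    # Split the encoded text into labels no longer than the DNS label limit.
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for a single DNS query name")
    return name

name = encode_query_name(b"hello tunnel")
```

Each query can therefore carry well under 253 characters of encoded data, which is one reason a tunnel built this way needs many round trips to move even small payloads.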
An overview of tunneling techniques (3/3)
HTTP Tunneling
  The packets of the tunneled flow are encoded so that they can be incorporated in one or more regular, semantically valid HTTP sessions
SSH Tunneling
  SSH tunneling is also known as port forwarding
    ▪ Deep-packet-inspection techniques become useless due to data encryption
    ▪ Therefore, today’s ALGs allow any protocol to be tunneled through SSH
    ▪ That makes SSH tunneling a very powerful technique
The SSH authentication process (1/2)
The SSH authentication process (2/2)
The two previous authentication phases are not used by possible tunnels:
  Host authentication is not encrypted, so its packets can easily be discarded.
  User authentication is encrypted, so it is difficult to know when it ends and the actual data exchange begins.
Background on statistical pattern recognition (1/3)
Definition: automatic (machine) recognition, description, classification, and grouping of patterns according to specific features.
If the information about how to group the data into classes is known before examining the data, the approach is called supervised; otherwise it is called unsupervised.
The goal of a pattern recognition technique is to represent each element as a pattern and to assign the pattern to the class that best describes it.
Background on statistical pattern recognition (2/3)
Stages in a pattern recognition problem
1. Data collection
2. Feature selection or feature extraction
3. Definition of patterns and classes
4. Definition and application of the discrimination procedure
5. Assessment and interpretation of results
Background on statistical pattern recognition (3/3)
Class description…revisited
  Once classes have been identified, a training set Ts(ωi) can be created for every class ωi
  A thorough inspection of Ts can lead to an analytical model describing the corresponding class
  Then a decision function f has to be determined that takes the observed data x as input and outputs a prediction of the class that generated it: ω(x) = f(x)
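As a minimal sketch of what a decision function ω(x) = f(x) can look like, the example below summarizes each training set by its per-feature mean and assigns x to the class with the nearest mean. The nearest-mean rule, the class labels, and the sample values are assumptions chosen for illustration. This is not the classifier used by Tunnel Hunter.

```python
from statistics import mean

def fit_class_means(training_sets):
    """Summarize each class's training set Ts(omega_i) by its per-feature mean."""
    return {
        label: [mean(column) for column in zip(*samples)]
        for label, samples in training_sets.items()
    }

def f(x, class_means):
    """Decision function: return the class whose mean is closest to x."""
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(class_means, key=lambda label: sq_dist(class_means[label]))

# Hypothetical training data: (packet size, inter-arrival time) samples per class.
training_sets = {
    "omega_1": [(100, 0.01), (120, 0.02)],
    "omega_2": [(1400, 0.5), (1500, 0.4)],
}
class_means = fit_class_means(training_sets)
predicted = f((110, 0.015), class_means)
```

Here `predicted` is the label of the class whose training data lies closest to the observed x.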
Our system: Tunnel Hunter (1/4)
Aims at detecting tunneling activities over the HTTP and SSH protocols
Focuses on building an accurate description of legitimate traffic
Builds on known pattern recognition techniques
Tunnel Hunter (2/4)
Building patterns and classes (1/2)
  The features are gathered directly from the legitimate flows composing the TCP session
  Each TCP flow is represented by a pattern that takes into account the:
    ▪ packet size s_i
    ▪ inter-arrival time Δt between two consecutive packets
    ▪ number of packets r that are useful for measurement
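The pattern described above can be sketched as a list of (size, inter-arrival time) pairs for the first r packets of a flow. The input format (timestamp and size tuples), the function name `flow_pattern`, and the choice of 0.0 as the inter-arrival time of the first packet are assumptions for illustration only.

```python
def flow_pattern(packets, r):
    """Return [(s_i, dt_i), ...] for the first r packets of a flow.

    packets: list of (timestamp_seconds, size) tuples in arrival order.
    The first packet has no predecessor, so its dt is set to 0.0
    (an assumption of this sketch).
    """
    pattern = []
    prev_ts = None
    for ts, size in packets[:r]:
        dt = 0.0 if prev_ts is None else ts - prev_ts
        pattern.append((size, dt))
        prev_ts = ts
    return pattern

# Hypothetical four-packet flow; only the first r = 3 packets are kept.
packets = [(0.00, 60), (0.25, 1500), (0.75, 40), (1.00, 512)]
pattern = flow_pattern(packets, r=3)
```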
Tunnel Hunter (3/4)
Building patterns and classes (2/2)
  Class model: the concept of protocol fingerprint
    ▪ A protocol may be used for N different purposes
    ▪ Issue: how many classes does one have to consider?
    ▪ Two approaches to the issue
1. Train the classifier with flows from a single target class (one-class classifier)
2. Add new classes composed of outlier flows to the analysis (multi-class classifier)
Tunnel Hunter (4/4)
One-class tunnel detection algorithm
  Algorithm definition: the decision function
    ▪ App = the application-layer protocol that is examined
    ▪ ωt = the acceptance region (“legitimate” use of App)
    ▪ ωr = the rejected region (complementary to ωt)
    ▪ Given an unknown flow F, the algorithm compares its pattern representation with the fingerprint (for ωt and ωr) and returns an index of (dis-)similarity (anomaly score)
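One possible way to turn the comparison above into a number is sketched below. It assumes the fingerprint is a list of per-packet-position probability tables estimated from legitimate traffic, and that the anomaly score is the mean negative log-probability of the flow under those tables. The data layout, the `EPS` floor, the threshold, and the function names are all assumptions of this sketch and are not details confirmed by the slides.

```python
import math

EPS = 1e-6  # floor probability for values never seen in training (assumption)

def anomaly_score(pattern, fingerprint):
    """Mean negative log-probability of a flow pattern under a fingerprint.

    pattern: list of hashable per-packet observations (e.g. binned (s, dt)).
    fingerprint: list of dicts, one per packet position, mapping an
    observation to its estimated probability in legitimate traffic.
    """
    n = min(len(pattern), len(fingerprint))
    if n == 0:
        raise ValueError("need at least one packet to score")
    total = 0.0
    for obs, pdf in zip(pattern[:n], fingerprint[:n]):
        total += -math.log(max(pdf.get(obs, EPS), EPS))
    return total / n

def classify(pattern, fingerprint, threshold):
    """Return 'omega_t' (accept) if the score is within threshold, else 'omega_r'."""
    return "omega_t" if anomaly_score(pattern, fingerprint) <= threshold else "omega_r"

# Hypothetical two-packet fingerprint with coarse 'small'/'large' size bins.
fingerprint = [{"small": 0.9, "large": 0.1}, {"small": 0.2, "large": 0.8}]
legit = classify(["small", "large"], fingerprint, threshold=1.0)
odd = classify(["large", "small"], fingerprint, threshold=1.0)
```

A flow that follows the typical sequence gets a low score and lands in ωt, while a flow that reverses it gets a high score and lands in ωr.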
Multi-class classification: fingerprinting the anomalies (1/2)
Tunnel Hunter can perform better if it is provided with more knowledge about the nature of the traffic.
Multi-class classification adds an outlier class ωo, which can reduce the number of cases where uncertainty lets through a packet that should have been rejected.
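The effect of the outlier class can be sketched as follows, assuming each class has its own fingerprint and the flow has already been scored against each one (lower score means more similar). The labels, scores, threshold, and the function name `multi_class_decision` are hypothetical.

```python
def multi_class_decision(scores, threshold):
    """Pick the class with the lowest anomaly score.

    scores: dict mapping a class label to the flow's anomaly score against
    that class's fingerprint (lower means more similar).
    The flow is accepted only when the target class wins and its score is
    within the threshold; every other outcome is a rejection.
    """
    best = min(scores, key=scores.get)
    if best == "omega_t" and scores[best] <= threshold:
        return "accept"
    return "reject"

# A borderline flow: within the threshold for omega_t, but closer to omega_o.
borderline = {"omega_t": 0.95, "omega_o": 0.40}
decision = multi_class_decision(borderline, threshold=1.0)
```

With only the target class and a threshold of 1.0, this borderline flow would pass. Once the outlier class is available, the flow is rejected because it resembles the outliers more than the legitimate traffic.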
Multi-class classification: fingerprinting the anomalies (2/2)
One-class algorithm: experimental results (1/5)
Experiments cover HTTP and SSH
Run on a 100Base-TX link
Packet size s range: [40, 1500]
Inter-arrival time Δt range: [10^-7, 10^3] sec
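Because the inter-arrival range above spans ten orders of magnitude, one plausible way to quantize Δt is on a logarithmic scale. The sketch below clamps values to the ranges from the slide and maps Δt to a log-scale bin index. The bin resolution (`bins_per_decade`) and the function names are assumptions, and the slides do not say how the values were actually quantized.

```python
import math

S_MIN, S_MAX = 40, 1500      # packet-size range from the slide
DT_MIN, DT_MAX = 1e-7, 1e3   # inter-arrival range in seconds from the slide

def clamp(value, lo, hi):
    """Force value into the closed interval [lo, hi]."""
    return max(lo, min(hi, value))

def size_bin(size):
    """Clamp a packet size to the measured range."""
    return clamp(size, S_MIN, S_MAX)

def dt_bin(dt, bins_per_decade=10):
    """Map an inter-arrival time to a log-scale bin index (0 at DT_MIN)."""
    dt = clamp(dt, DT_MIN, DT_MAX)
    return int(round((math.log10(dt) - math.log10(DT_MIN)) * bins_per_decade))

first_bin = dt_bin(1e-7)
one_second_bin = dt_bin(1.0)
last_bin = dt_bin(5e4)  # out-of-range value is clamped to DT_MAX
```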
One-class algorithm: experimental results (2/5)
The HTTP case (1/2)
  20,000 flows used for gathering the training sets Ts and T”s
  About 17,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P
  At the same time, about 15,000 non-tunneled sessions were collected to check whether the classifier lets legitimate HTTP traffic pass
One-class algorithm: experimental results (3/5)
The HTTP case (2/2)
One-class algorithm: experimental results (4/5)
The SSH case (1/2)
  4,000 flows used for gathering the training sets Ts and T”s
  About 10,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P
  At the same time, about 600 interactive sessions and about 1,700 bulk-transfer sessions were collected to check whether the classifier lets legitimate SSH/SCP traffic pass
One-class algorithm: experimental results (5/5)
The SSH case (2/2)
Multi-class algorithm: experimental results (1/3)
State A results (same as in one-class algorithm)
Multi-class algorithm: experimental results (2/3)
State B results
Multi-class algorithm: experimental results (3/3)
State C results
Discussion on potential attacks
Tunnel Hunter problems
  If an SSH tunnel is initially used for remote administration and then for tunneling other protocols
    ▪ The first state is legitimate and the classifier will label the session as authorized
  Sensitive to packet-size and timing value manipulation
Conclusion and future work (1/2)
Tunnel Hunter can successfully recognize when a generic application protocol is tunneled on top of HTTP or SSH
Increasing the knowledge of the system can significantly improve its performance
The experimental results are very promising
  Virtually no legitimate traffic is blocked
  The vast majority of tunneled traffic is blocked
  Completeness near 100% (exactly 100% for HTTP)
Conclusion and future work (2/2)
Tunnel Hunter can be used to improve existing ALGs
  By augmenting their ability to recognize tunneled traffic
The model can be improved
  By introducing new variables and better studying the role of the existing variables in order to produce stronger fingerprints
Thank you!
Questions?