
Tunnel Hunter: Detecting application-layer tunnels with statistical fingerprinting

M. Dusi, M. Crotti, F. Gringoli, L. Salgarelli

Presented by: Vangelis Kafentarakis – S.N. 1780

Introduction (1/2)

Firewalls and Application Level Gateways (ALGs)

Usually configured to provide at least two types of protection:
▪ Control which sites local users connect to
▪ (Try to) limit attacks coming from the Internet

Firewalls check TCP ports and destination addresses

ALGs verify that the traffic crossing the network boundary conforms to the security policies and is not malicious

Introduction (2/2)

Tunneling techniques

Disguise one application-layer protocol as another

Make security policies ineffective and can lead to a dangerous illusion of security

Can be based on the DNS, HTTP or SSH protocols

An overview of tunneling techniques (1/3)

Packets are encoded at the application layer so that they conform to one or more allowed protocols.

Most commonly, three protocols are used to tunnel Internet traffic: DNS, HTTP, SSH.


DNS Tunneling

Exploits the way regular DNS requests for a given domain are forwarded

Powerful technique since DNS is rarely blocked

Can rarely achieve throughput higher than a few kb/s due to the mechanism’s complexity, and is therefore rarely used
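To make the DNS idea concrete, here is a minimal sketch of how a payload could be packed into the labels of a query name. It is not taken from the presentation: the helper names `encode_as_qname` and `decode_qname`, the domain `tunnel.example.com`, and the choice of Base32 are all assumptions made for illustration. The only protocol fact relied on is that a DNS label may hold at most 63 characters. The sketch ignores the overall name-length limit and any chunking a real tool would need.

```python
import base64

MAX_LABEL = 63  # maximum length of a single DNS label


def encode_as_qname(payload: bytes, domain: str) -> str:
    """Pack payload bytes into Base32 labels placed in front of `domain` (hypothetical scheme)."""
    encoded = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [domain])


def decode_qname(qname: str, domain: str) -> bytes:
    """Recover the payload from a name produced by encode_as_qname."""
    encoded = qname[: -len(domain) - 1].replace(".", "").upper()
    padding = "=" * (-len(encoded) % 8)  # restore the stripped Base32 padding
    return base64.b32decode(encoded + padding)


qname = encode_as_qname(b"GET /index.html", "tunnel.example.com")
recovered = decode_qname(qname, "tunnel.example.com")
```

Each query carries only a few dozen bytes, which is consistent with the low throughput noted above.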


HTTP Tunneling

The packets of the tunneled flow are encoded so that they can be incorporated in one or more regular, semantically valid HTTP sessions
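As a rough illustration of what "semantically valid" means here, the sketch below wraps arbitrary bytes in a well-formed HTTP/1.1 POST request. The function names, the `/upload` path and the host `proxy.example.com` are hypothetical, and real tunneling tools use more elaborate encodings than this.

```python
def wrap_in_http_post(payload: bytes, host: str, path: str = "/upload") -> bytes:
    """Build a well-formed HTTP/1.1 POST request whose body is the tunneled payload."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


def unwrap_http_post(request: bytes) -> bytes:
    """Return the body of a request built by wrap_in_http_post."""
    _head, _sep, body = request.partition(b"\r\n\r\n")
    return body


request = wrap_in_http_post(b"\x00\x01tunneled bytes", "proxy.example.com")
body = unwrap_http_post(request)
```

A gateway that only checks HTTP syntax sees an ordinary upload, which is why the approach in this presentation looks at packet sizes and timing instead.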

SSH Tunneling

SSH tunneling is also known as port forwarding
▪ Deep-packet-inspection techniques become useless due to data encryption
▪ Therefore, today’s ALGs allow any protocol to be tunneled through SSH
▪ That makes SSH tunneling a very powerful technique

The SSH authentication process (1/2)

The SSH authentication process (2/2)

The two previous authentication phases are not used by possible tunnels:

The host authentication is not encrypted, therefore its packets can be easily discarded.

The user authentication is encrypted, therefore it is difficult to know when it ends and the actual data exchange begins.

Background on statistical pattern recognition (1/3)

Definition: Automatic (machine) recognition, description, classification, and grouping of patterns according to specific features.

If the information about how to group the data into classes is known before examining the data, the approach is called supervised; otherwise it is called unsupervised.

The goal of a pattern recognition technique is to represent each element as a pattern and to assign the pattern to the class that best describes it


Stages in a pattern recognition problem

1. Data collection

2. Feature selection or feature extraction

3. Definition of patterns and classes

4. Definition and application of the discrimination procedure

5. Assessment and interpretation of results


Class description…revisited

Once classes have been identified, a training set Ts(ωi) can be created for every class ωi

A thorough inspection of Ts can lead to an analytical model describing the corresponding class

Then, a decision function f has to be determined that takes the observed data x as input and outputs a prediction of the class that generated it, ω(x) = f(x)
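The chain from training sets to a decision function can be sketched in a few lines. This is a generic nearest-centroid classifier chosen only for illustration, not the method used by Tunnel Hunter. The class labels, the two features (mean packet size and mean inter-arrival time) and the sample values are made up.

```python
from statistics import fmean


def fit_centroids(training_sets):
    """Model each class by the mean feature vector of its training set Ts."""
    return {
        label: [fmean(column) for column in zip(*samples)]
        for label, samples in training_sets.items()
    }


def decide(x, centroids):
    """Decision function f: return the class whose centroid is closest to x."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=squared_distance)


# Hypothetical training sets: [mean packet size, mean inter-arrival time]
training_sets = {
    "interactive": [[80.0, 0.50], [100.0, 0.40], [90.0, 0.60]],
    "bulk": [[1400.0, 0.001], [1500.0, 0.002], [1450.0, 0.001]],
}
centroids = fit_centroids(training_sets)
predicted = decide([95.0, 0.45], centroids)
```

Here `centroids` plays the role of the analytical model of each class, and `decide` plays the role of f.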

Our system: Tunnel Hunter (1/4)

Aims at detecting tunneling activities over the HTTP and SSH protocols

Focuses on building an accurate description of legitimate traffic

Builds on known pattern recognition techniques

Tunnel Hunter (2/4)

Building patterns and classes (1/2)

The features are gathered directly from the legitimate flows composing the TCP session

Each TCP flow is represented by a pattern which takes into account the:
▪ packet size si
▪ inter-arrival time Δt between two consecutive packets
▪ number of packets r that are useful for measurement
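A minimal sketch of how such a pattern could be extracted from a captured flow follows. The input format (a list of `(timestamp, size)` tuples) and the function name `flow_pattern` are assumptions. The sketch also assumes that each packet is paired with the gap to the packet before it, so the first packet of the flow contributes no pair of its own.

```python
def flow_pattern(packets, r):
    """Return up to r (size, delta_t) pairs from a list of (timestamp, size) packets."""
    pattern = []
    for (t_prev, _), (t_cur, size) in zip(packets, packets[1:]):
        if len(pattern) == r:
            break
        pattern.append((size, t_cur - t_prev))
    return pattern


# Hypothetical flow: timestamps in seconds, sizes in bytes
packets = [(0.0, 60), (0.5, 1500), (0.75, 40), (2.0, 512)]
pattern = flow_pattern(packets, r=2)
# pattern == [(1500, 0.5), (40, 0.25)]
```

Limiting the pattern to r packets means a flow can be judged early, without waiting for it to finish.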

Tunnel Hunter (3/4)

Building patterns and classes (2/2)

Class model: the concept of protocol fingerprint
▪ A protocol may be used for N different purposes
▪ Issue: how many classes does one have to consider?
▪ Two approaches to the issue

1. Train the classifier with flows from a single target class (one-class classifier)

2. New classes composed of outlier flows are added to the analysis (multi-class classifier)

Tunnel Hunter (4/4)

One-class tunnel detection algorithm

Algorithm definition: the decision function
▪ App = the application-layer protocol that is examined
▪ ωt = the acceptance region (“legitimate” use of App)
▪ ωr = the rejection region (complementary to ωt)
▪ Given an unknown flow F, the algorithm compares its pattern representation with the fingerprint (for ωt and ωr) and returns an index of (dis-)similarity (anomaly score)
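The sketch below shows one way an anomaly score of this kind could be computed. It is an illustration under stated assumptions, not the scoring function from the paper: the fingerprint is a per-position histogram over coarse (size, log Δt) bins, the score is the mean negative log-likelihood, and the bin widths, the probability floor and the threshold of 1.0 are arbitrary choices. Flows are given as lists of `(size, delta_t)` pairs.

```python
import math
from collections import Counter


def bin_of(size, dt):
    """Quantize a (packet size, inter-arrival time) pair into a coarse bin."""
    return (size // 100, math.floor(math.log10(dt)))


def build_fingerprint(flows, r):
    """Count, per packet position, how often each bin occurs in legitimate flows."""
    counts = [Counter() for _ in range(r)]
    for flow in flows:
        for i, (size, dt) in enumerate(flow[:r]):
            counts[i][bin_of(size, dt)] += 1
    return counts


def anomaly_score(flow, fingerprint, floor=1e-3):
    """Mean negative log-likelihood of the flow under the fingerprint (higher = more anomalous)."""
    n = min(len(flow), len(fingerprint))
    if n == 0:
        raise ValueError("flow has no usable packets")
    total = 0.0
    for i in range(n):
        size, dt = flow[i]
        seen = sum(fingerprint[i].values())
        p = fingerprint[i][bin_of(size, dt)] / seen if seen else 0.0
        total += -math.log(max(p, floor))  # floor avoids log(0) for unseen bins
    return total / n


# Hypothetical legitimate training flows
legit = [
    [(80, 0.5), (120, 0.3)],
    [(90, 0.4), (110, 0.2)],
    [(85, 0.6), (130, 0.5)],
]
fingerprint = build_fingerprint(legit, r=2)

THRESHOLD = 1.0  # assumed boundary between the acceptance and rejection regions
score_legit = anomaly_score([(95, 0.45), (115, 0.35)], fingerprint)
score_tunnel = anomaly_score([(1400, 0.001), (1500, 0.002)], fingerprint)
is_tunnel = score_tunnel > THRESHOLD
```

A flow whose score stays below the threshold falls in the acceptance region ωt. Anything above it is treated as belonging to ωr.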

Multi-class classification: fingerprinting the anomalies (1/2)

Tunnel Hunter can perform better if it is provided with more knowledge about the nature of the traffic.

Multi-class classification adds an outlier class ωo, which can reduce the number of cases where uncertainty would let through traffic that should have been rejected.
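One possible reading of how the outlier class helps is sketched below. The rule itself is an assumption made for illustration: a flow is accepted only if it is close enough to the legitimate fingerprint and does not look even more like the outlier fingerprint. The function name, the scores and the threshold are hypothetical.

```python
def classify(score_target, score_outlier, threshold):
    """Return 'accept' or 'reject' given a flow's anomaly scores against the target and outlier classes."""
    if score_target > threshold:
        return "reject"  # too dissimilar from legitimate traffic
    if score_outlier < score_target:
        return "reject"  # resembles the known outlier class more closely
    return "accept"


clear_case = classify(score_target=0.4, score_outlier=2.0, threshold=1.0)
borderline = classify(score_target=0.8, score_outlier=0.3, threshold=1.0)
```

In the borderline case a one-class test with the same threshold would have accepted the flow (0.8 is below 1.0), while the extra comparison against the outlier class rejects it.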

Multi-class classification: fingerprinting the anomalies (2/2)

One-class algorithm: experimental results (1/5)

Experiments are for HTTP and SSH

Run on a 100Base-TX link

Packet size s range: [40, 1500] bytes

Inter-arrival time Δt range: [10^-7, 10^3] sec


The HTTP case (1/2)

20,000 flows used for gathering the training sets Ts and T”s

About 17,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P

At the same time, about 15,000 non-tunneled sessions were collected in order to check whether the classifier lets legitimate HTTP traffic pass


The HTTP case (2/2)


The SSH case (1/2)

4,000 flows used for gathering the training sets Ts and T”s

About 10,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P

At the same time, about 600 interactive sessions and about 1,700 bulk-transfer sessions were collected in order to check whether the classifier lets legitimate SSH/SCP traffic pass


The SSH case (2/2)

Multi-class algorithm: experimental results (1/3)

State A results (same as in one-class algorithm)


State B results


State C results

Discussion on potential attacks

Tunnel Hunter problems

If an SSH tunnel is initially used for remote administration and then for tunneling other protocols
▪ The first state is legitimate and the classifier will label the session as authorized

Sensitive to packet-size and timing value manipulation

Conclusion and future work (1/2)

Tunnel Hunter can successfully recognize when a generic application protocol is tunneled on top of HTTP or SSH

Increasing the knowledge of the system can significantly improve its performance

The experimental results are very promising
▪ Virtually no legitimate traffic is blocked
▪ The vast majority of tunneled traffic is blocked
▪ Completeness near 100% (exactly 100% for HTTP)

Conclusion and future work (2/2)

Tunnel Hunter can be used to improve existing ALGs
▪ By augmenting their ability to recognize tunneled traffic

The model can be improved
▪ By introducing new variables and better studying the role of the existing variables in order to produce stronger fingerprints

Thank you!

Questions?
