
Department of Computer Science and Engineering
University of Texas at Arlington
Arlington, TX 76019

Evading Stepping Stone Detection Under the Cloak of Streaming Media

Madhu Venkateshaiah and Matthew Wright

Technical Report CSE-2007-6

A longer version of this report was also submitted as an M.S. thesis.

April 12, 2007

Abstract

Network-based intrusions have become a serious threat to the users of the Internet. To help cover their tracks, attackers launch attacks from a series of previously compromised systems called stepping stones. Timing correlations on incoming and outgoing packets can lead to detection of the stepping stone and can be used to trace the attacker through each link. Existing approaches, however, deliberately ignore the fact that an attacker can add chaff packets to a traffic stream. An attacker who has complete control over the stepping stone node can install rogue applications that use chaff and introduce delays to make the incoming and outgoing streams have very different traffic characteristics. In this work, we show that such an attacker could avoid detection by the best stepping stone detection methods. We propose a simple buffering technique that could be used by an attacker on a stepping stone to evade detection. In our technique, packets are buffered and selectively dropped, and chaff packets are added to generate constant rate traffic. This traffic has the characteristics of a multimedia stream, such as voice over IP (VoIP), which is quite common on the Internet today. To test the effectiveness of our technique, we simulate both the traffic and detection using a watermark-based timing analysis algorithm. We show that our buffering technique can successfully evade detection with latencies that are reasonable for interactive streams.

1 Introduction

Hackers attack computers to compromise security and gain control of systems, to compromise privacy, or to create a disruption so as to deny or degrade access to legitimate users. It is increasingly recognized that detecting and preventing such attacks is an important requirement of computer systems. To increase the difficulty of detection, attackers typically launch attacks indirectly by relaying their attack through a chain of intermediate (previously compromised) systems called stepping stones. The attacker does this by constructing a chain of interactive connections using protocols like Telnet or SSH. The commands that the attacker types on his local terminal are relayed through the stepping stones until they reach the victim. This gives the attacker a certain level of anonymity and makes it hard for an investigator to trace the attack back to its origin.

Since the attacker is sending attack traffic to a stepping stone and the stepping stone is just relaying (forwarding) it to an external host, the stepping stone detection problem comes down to finding an outgoing connection with the same characteristics as a given incoming connection. An intuitive approach to this problem would be to compare the contents of the incoming and outgoing packets in a network to find packets with the same content. But the use of encrypted communication protocols like SSH has made this approach ineffective. We thus need to use other characteristics of the traffic, such as timing characteristics, to detect stepping stones. One of the more promising approaches to the detection problem is to actively perturb the timings of packets in incoming streams, making it easy to find outgoing streams with the same timing patterns. This approach can be made robust to random changes in packet timing introduced by the attacker.

Earlier approaches to stepping stone detection (see Section 2) make limiting assumptions about what the attacker can do on the stepping stone. In particular, they assume that the attacker does not use cover traffic. In many cases, however, the attacker has total control over the stepping stones. Thus, he can change the configuration of the system and install rogue applications with the ability to add cover traffic and delays to the traffic streams. In this work, we explore the ways in which such an attacker could operate to evade detection. Our primary contribution is a simple technique that an attacker can use on the stepping stone to reduce correlation between input and output streams. Our technique masks the interactive traffic as multimedia streams, and uses buffering of incoming packets, selective dropping, and adding chaff to make the outgoing stream appear unconnected with the incoming stream. We show that with reasonable buffering delay and packet drops, our technique completely stops existing detection methods.

We now describe the organization of the rest of the paper. In Section 2, we discuss the background concepts and related work to justify the design decisions explained in our description of the system. In Section 3, we present our proposed buffering technique in detail. In Section 4, we explain the simulation setup and aspects of the system that we tested. Section 5 presents the results of the simulations and their interpretation. Section 6 concludes with ideas for future work.

2 Background

We now present some background on the techniques that have been used to detect stepping stones and related work in timing analysis techniques.

First let us define some terminology. When a person logs into one computer and from there logs into another computer and so on, we refer to the sequence of logins as a connection chain [1]. Any intermediate host on a connection chain is called a stepping stone. Typically, connection chains are formed by using terminal emulation programs like telnet and SSH. The stepping stones thus formed are called interactive stepping stones, since the attacker types in a command and waits for a response. Though it is possible for a computer program to create a connection chain, we limit the scope of our research to interactive stepping stones.

Stepping stone detection is the process of observing all incoming and outgoing connections in a network and determining which ones are part of a connection chain. This problem is closely related to the problem of traceback, or tracing intruders through the Internet by following the connection chain. In both applications, the fundamental underlying problem is to compare and analyze two connections and determine if there is any correlation between them. Researchers have proposed two main approaches: passive monitoring and active perturbation.

2.1 Passive Monitoring

Passive monitoring is an approach in which traffic flows are observed and analyzed to find correlations. The interactive stepping stone problem was first formulated and studied by Staniford and Heberlein [1]. They proposed a content-based algorithm that creates thumbprints of streams and compares them, looking for good matches. The problem with this approach is that it requires that the traffic be unencrypted. Zhang and Paxson [2] were the first to propose a scheme to correlate traffic across stepping stones even if the traffic is encrypted by the stepping stone. The method is based on classifying traffic into on and off periods, e.g. active typing versus thinking periods in interactive SSH sessions, and correlating the timings of these periods. Since this method only uses the timing information of packets, it can be applied to encrypted traffic.

Yoda and Etoh [3] define the minimum average delay gap between the packet streams of two TCP connections as the deviation. They then propose to correlate streams using deviations, which does not require clock synchronization and is able to correlate connections observed at different points of the network. Wang, Reeves, and Wu [4] address the problem of correlation with a scheme based on inter-packet timing characteristics. While timing-based correlation approaches have the advantage that they are simple and do not disturb normal traffic, they are vulnerable to countermeasures by the attacker. The attacker can perturb the timing characteristics of connections by selectively or randomly delaying packets at the stepping stone [5]. This kind of perturbation reduces the effectiveness of timing-based correlation.

2.2 Active perturbation

A promising defense against the problem of random timing perturbation by an attacker is active perturbation by the observer. In this approach, an incoming connection is perturbed by inducing a packet loss or delay and the outgoing connections are checked to see if the perturbation is echoed in them. Since the attacker does not know what the perturbation is, he will not be able to use random perturbation to precisely affect the results. Wang and Reeves [6] proposed the first active correlation method that is designed to be robust against random timing perturbation. In this scheme, a unique delay-based watermark is embedded into a traffic flow by slightly adjusting the timing of selected packets in the flow. The watermarked flow can be uniquely identified and thus correlated with other flows in the connection chain.

Watermarking consists of two complementary processes: embedding the watermark and decoding the watermark. A watermark is simply a unique binary string. The process of embedding one bit of this string consists of changing some property of a traffic flow such that the change represents a bit. Decoding the watermark involves capturing candidate flows that might match the watermarked flow and looking for the bits in the flow characteristics. The bits of the watermark should have enough redundancy to ensure that they are decoded correctly with high probability.

In the technique proposed by Wang and Reeves [6], the packets that are watermarked are randomly chosen and paired to obtain inter-packet delays (IPDs). A watermark is embedded by manipulation of these IPDs. To make the scheme robust against random perturbations, multiple IPDs are manipulated to embed one bit. In their paper the authors make the following assumptions:

1. While the attacker can add extra delay to any or all packets of an outgoing flow of the stepping stone, the maximum delay that he can introduce is bounded

2. The attacker does not know which packets are being watermarked

3. The component that decodes watermarks in traffic flows knows which packets have been watermarked

The first assumption is reasonable for an interactive session, as very high delays will disrupt the attacker's ability to use the connection. The second assumption is reasonable, due to random packet selection, although detection may be possible when the amount of timing manipulation is large [7]. The third assumption, however, can be broken with the addition of just a few chaff packets. While this is a problem for earlier works that assume perfect packet matching [6, 8], new work by Pyun et al. handles this problem [9]. Our attacker approach defeats all of these methods.

2.3 Anonymity

Another related area is timing analysis attacks against systems for low-latency anonymous communications. Danezis [10] presents an attack based on traffic analysis of connection-based mix networks functioning in continuous mode. It uses signal detection techniques to compare a traffic pattern extracted from the stream that is being tracked with all the links in the network. Levine et al. [11] show that simple statistical correlation is an effective timing analysis technique, even if all users have the same constant rate of traffic. They propose a defense called defensive dropping, in which intermediate nodes on the path drop selected chaff packets, and show that it reduces the attacker's effectiveness. Shmatikov and Wang propose adaptive padding, in which packets are added at intermediate nodes on the path in response to gaps between packets [12]. Both defensive dropping and adaptive padding are inappropriate for evading stepping stone detection, as the resulting traffic flows are evidently not typical and could be flagged as abnormal.

Anonymous Voice over IP (VoIP) would have different traffic patterns from typical Web browsing, similar to a constant rate flow. As an attack on this anonymized service, Wang, Chen, and Jajodia propose a watermark-based technique to track anonymized VoIP calls [8]. This technique adapts the technique of [6], which cannot be used directly due to the more stringent real-time constraints for VoIP streams than for SSH. In the VoIP-tracking technique, the packets that are watermarked are chosen randomly. The chosen packets are delayed by a fixed amount. This delay is called the watermarking delay. Since the user has no way of knowing the watermarking delay and which packets are delayed, random perturbations by the attacker are not effective.

Figure 1: The system model.

3 System Model

In this section, we describe a model for the use and detection of stepping stones. We also propose a technique by which an attacker can evade sophisticated stepping stone detection methods, including all methods that have been proposed to date.

Consider an attacker, with node A, who is using a compromised system S in a network N1 as a stepping stone to launch an attack on the target system T in another network N2. For simplicity, let us assume that there is only one stepping stone, i.e. the attacker is relaying the attack through only one compromised system. Let us call the person who is trying to detect the stepping stone the observer. The observer's objective is to detect that an incoming connection is being relayed through node S to a host outside the network. The observer does this by embedding a timing-based watermark into all the connections coming into N1 and trying to detect whether any of the outgoing connections contain the watermark that he embedded. The attacker's objective is to evade detection by the observer by distorting the watermark to an extent that it is hard to detect.

We propose that the attacker do this by buffering packets and adding dummy packets to generate a constant rate traffic stream. Since all the timing information is lost, the observer will not be able to detect the watermark.

For the purpose of our study, we make the following assumptions:

• The attacker has complete control of A and S. Thus he can modify the system or use rogue applications on any of these systems.

• The attacker does not know which packets are watermarked.

• The attacker does not know the parameters of the watermark algorithm, including the watermark delay and the distance between packets.

3.1 Connection profiling

The attacker's objective is to remove all timing information from the traffic stream by generating constant rate traffic. To do this he needs to know the delay characteristics of the network. The attacker profiles the network connection between his host and the stepping stone. Before establishing the connection chain to launch his attack, the attacker sends a stream of packets from his system at a specific rate to the stepping stone. On the stepping stone, the attacker records the arrival time of these packets and calculates the inter-packet delays (IPDs) and the standard deviation of the IPDs. This allows him to profile the connection's delay characteristics. The attacker combines this information with his knowledge of the traffic rate to derive the expected arrival time of packets.

Given a packet stream $P_1, \ldots, P_n$ being sent at a rate $r$ and received on the stepping stone with time stamps $t_1, \ldots, t_n$ respectively ($t_i < t_j$ for $1 \le i < j \le n$), we define the IPD between $P_{k+1}$ and $P_k$ as:

$$ipd_k = t_{k+1} - t_k, \quad k = 1, \ldots, n-1$$

Since the attacker knows the traffic rate $r$ at the source, he knows the mean inter-packet delay $\overline{ipd} = 1/r$. He calculates the IPD standard deviation as:

$$\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n-1} \left(ipd_i - \overline{ipd}\right)^2}$$

Since the attacker sends packets at a constant rate, he can expect that the IPDs are normally distributed around the mean when the packets arrive at the stepping stone [6]. According to the empirical rule for a normal distribution, the attacker can deduce that:

• 68% of the packets will arrive within 1 standard deviation of the mean

• 95% of the packets will arrive within 2 standard deviations

• 99.7% of the packets will arrive within 3 standard deviations

While this is an approximation, it is sufficient to derive a useful buffer size.
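The profiling computation above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the names `profile_connection` and `tolerance_margin` are hypothetical, arrival times are assumed to be in seconds, and the tolerance multiplier `k` is an assumed parameter (the text later suggests 2σ as a reasonable choice).

```python
import math

def profile_connection(arrival_times, rate):
    """Return (mean_ipd, sigma) for probe packets sent at `rate` packets/sec.

    The deviation is measured around the known source mean 1/rate,
    following the formula for sigma given in the text.
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrival times")
    mean_ipd = 1.0 / rate
    # ipd_k = t_{k+1} - t_k for k = 1 .. n-1
    ipds = [t1 - t0 for t0, t1 in zip(arrival_times, arrival_times[1:])]
    variance = sum((d - mean_ipd) ** 2 for d in ipds) / len(ipds)
    return mean_ipd, math.sqrt(variance)

def tolerance_margin(sigma, k=2):
    """Tolerance of k standard deviations (k = 2 covers roughly 95%)."""
    return k * sigma
```

For example, `profile_connection(times, rate=30.0)` on a recorded probe stream yields the slot length and the σ from which a buffer size can be chosen.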


Figure 2: Traffic time-line

3.2 Evading detection

In this section we describe a technique that the attacker can use to evade detection by the observer. Once the attacker has profiled the connection, he establishes the connection chain through the stepping stone S to the target host T and starts the attack. When the attack packets arrive at the gateway to N1, the observer embeds a watermark. The attacker calculates the standard deviation of the inter-packet delay as explained in Section 3.1. To generate a constant rate traffic stream, the attacker needs to buffer packets that arrive early and drop packets when they arrive late. To achieve this, the attacker divides time into fixed-length slots, beginning with the arrival of the first packet at S. The length of each slot is 1/r, the mean inter-packet delay (IPD) at the source of the traffic. Each packet is expected to arrive in its respective slot, but packets may arrive earlier or later than expected due to the Internet and watermarking delays encountered by packets. This is illustrated in Figure 2. The attacker therefore needs a tolerance margin for each slot to decide whether the packet arrived in its respective slot or not. The tolerance is a configurable parameter that the attacker can adjust based on the standard deviation of the IPDs. Since the IPDs are normally distributed around the mean, the empirical rule for the normal distribution tells the attacker that around 95% of the packets arrive within two standard deviations of the mean. So a reasonable choice for the tolerance level may be 2σ. This would also be the maximum buffering delay B.

If a packet arrives early, the packet is delayed until the end of the slot. If a packet arrives late, i.e. after the end of its slot, a dummy packet is sent in place of the actual packet. When the delayed packet finally arrives, it is buffered until the end of the next available slot and then sent. Some packets may arrive very late, followed by packets that arrive on time. This could lead to multiple packets being queued in the buffer and have a cascading effect on the buffering delays. As increasingly longer delays would affect the quality of the connection, the stepping stone should drop packets that arrive very late. Since the attacker knows that almost all the packets would arrive within 4σ of the mean, there should be a maximum of two packets in the buffer. If there are more than two packets in the buffer, the attacker drops the first packet (which presumably arrived very late) and sends the next packet in the buffer. The entire process can be summarized by the following algorithm:

1. When the first packet arrives, delay the packet by the maximum buffering delay

2. From the time the first packet was sent, trigger an event at intervals of one slot length (the mean IPD, 1/r)

3. From the second packet onwards, buffer the packets that arrive before an event is triggered

4. When an event is triggered:

• if there is no packet in the buffer, send a dummy packet

• if there are more than two packets in the buffer, drop the first packet and send the second packet

• if there are one or two packets in the buffer, send the first packet in the buffer
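The four steps above can be sketched as a small offline simulation. This is an illustrative reading of the algorithm rather than the authors' implementation: the function name `shape_stream` and the `DUMMY` marker are hypothetical, arrivals are assumed to be sorted by time, packets are identified by their index in the input list, and a packet arriving exactly at an event time is treated as having arrived before it.

```python
from collections import deque

DUMMY = None  # marker for a chaff packet in the output stream

def shape_stream(arrivals, ipd, max_buffer_delay):
    """Turn packet arrival times into a constant rate output stream.

    Returns (sent, dropped): `sent` is a list of (send_time, packet_id or
    DUMMY) and `dropped` is a list of packet ids that were discarded.
    """
    if not arrivals:
        return [], []
    # Step 1: delay the first packet by the maximum buffering delay.
    first_send = arrivals[0] + max_buffer_delay
    sent = [(first_send, 0)]
    dropped = []
    buffer = deque()
    next_pkt = 1
    k = 1
    while next_pkt < len(arrivals) or buffer:
        # Step 2: events fire every `ipd` after the first packet was sent.
        event_time = first_send + k * ipd
        # Step 3: buffer everything that arrived before this event.
        while next_pkt < len(arrivals) and arrivals[next_pkt] <= event_time:
            buffer.append(next_pkt)
            next_pkt += 1
        # Step 4: decide what to send in this slot.
        if not buffer:
            sent.append((event_time, DUMMY))
        else:
            if len(buffer) > 2:
                dropped.append(buffer.popleft())  # drop the oldest packet
            sent.append((event_time, buffer.popleft()))
        k += 1
    return sent, dropped
```

In this sketch the output times are spaced exactly `ipd` apart regardless of the input timing, which is the property the technique relies on to erase the watermark.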

3.2.1 Sequence Numbers

One way to potentially improve the performance of the system is for the attacker's client to include a sequence number in each packet it sends. In this way, the stepping stone can determine when packets have arrived late for their slot, or out-of-order, and drop such packets. In some cases, a packet that arrives late or out-of-order is buffered, leading to a long series of packets that are buffered one extra time slot. This increases average latency in the system. By matching packets to their respective time slots, we can reduce this problem. Additionally, we can create direct trade-offs between latency and drop rate that are more malleable than with the use of buffer delay alone.

We show in Section 5 that this approach sometimes leads to significant drop rates. We note that drops can be countered by redundancy. In most cases, for interactive stepping stone traffic, the attacker's think times dominate the IPD. Additionally, each packet in a VoIP stream will be many times larger than necessary for passing encrypted characters (10 to 50 bytes per packet) [13]. Thus we can duplicate characters over multiple packets. When several characters are being sent in succession, they can be interleaved within the packets so that getting any of them through allows all the characters to be received.
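One simple way to realize this kind of redundancy is a sliding window, in which each packet carries the most recent few characters rather than only the newest one. The sketch below is an assumption about how such duplication could be framed, not a scheme specified in this report; the names `encode_with_redundancy` and `decode_received` and the `window` parameter are hypothetical.

```python
def encode_with_redundancy(chars, window):
    """Packet i carries the last `window` characters up to position i.

    Each packet is (start_offset, payload), so every character appears
    in up to `window` consecutive packets.
    """
    packets = []
    for i in range(len(chars)):
        start = max(0, i - window + 1)
        packets.append((start, chars[start:i + 1]))
    return packets

def decode_received(received):
    """Rebuild a position -> character map from whichever packets arrived."""
    recovered = {}
    for start, payload in received:
        for offset, ch in enumerate(payload):
            recovered[start + offset] = ch
    return recovered
```

With `window=3`, a command such as `"ls -l"` survives the loss of every other packet in this sketch, at the cost of a payload up to three times larger, which still fits easily within a typical VoIP packet.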

3.3 Observer countermeasures

Since the buffering algorithm generates a constant rate traffic stream, all timing information of the traffic flow is lost. In effect, this totally removes the watermark that was previously embedded. To make it harder for the attacker to evade detection, the observer can increase the watermarking delays. Increasing the watermarking delay makes the detection scheme more robust. This would also increase the jitter in the stepping stone connection. To work against this, the attacker has to buffer packets for a longer duration and would have to drop more packets, thus degrading the quality of his own connection. The observer could attempt to delay relatively few packets with much higher delays, e.g. 100 ms. In this case, the connection profiler would create a traffic profile similar to the un-watermarked case. The delayed packets would be treated as drops at S, slightly degrading usability for the attacker.

Note that increasing the watermarking delay affects connection quality. Since the observer needs to watermark all incoming connections, an increase in watermarking delay would affect not only the connection being used by the attacker, but all other incoming connections in the network. This substantially limits the options for active perturbation.

4 Simulation

We now describe simulation experiments intended to show the effectiveness and the costs of our buffering technique. We focus on hiding attacker traffic in streams that appear to be multimedia, like VoIP traffic. Because of this, many of the existing stepping stone detection techniques do not apply directly. However, we can use a watermarking technique that is capable of correlating constant rate traffic streams. The technique that comes closest to satisfying our requirements was proposed by Wang, Chen, and Jajodia [8] to track VoIP calls. We chose this technique for our simulations. We conducted experiments to test the effectiveness of our technique at evading watermark detection. We also performed experiments to determine the effect of different watermarking delays on the drop rates and amount of chaff.

Figure 3: Experimental setup

4.1 Simulation Design

The system was simulated in Java. Figure 3 shows the architecture of our experimental setup. We describe each individual component of the system in the following sections.

4.1.1 Traffic generator

The traffic generator generates Internet traffic as well as attack traffic based on a delay distribution. The delays simulate Internet packet delays. For our experiments we used a normal distribution with a standard deviation of one fourth of the mean for Internet delays. The characteristics of the generated traffic, such as rate, average end-to-end delay, and packet loss, can be controlled by configurable parameters. We chose parameters that reflect realistic VoIP traffic on the Internet. For each experiment we generated 100 traffic flows at a rate of 30 packets/sec for a duration of 900 seconds (15 minutes). The average end-to-end delay was set to 100 ms and the drop rate was set to an average of 1%. The generated traffic is stored in the form of traffic logs.
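A generator with these parameters can be sketched as follows. This is a Python illustration under stated assumptions, not the Java simulator used for the experiments: the function name `generate_flow` is hypothetical, each packet is dropped independently with the configured probability, and negative delay samples are clamped to zero.

```python
import random

def generate_flow(rate, duration, mean_delay, drop_rate, seed=None):
    """Simulate one constant rate flow as a list of (seq, arrival_time).

    Packets leave the source every 1/rate seconds. Each one is dropped
    with probability `drop_rate`, otherwise it is delayed by a normally
    distributed amount with standard deviation mean_delay / 4.
    """
    rng = random.Random(seed)
    sigma = mean_delay / 4.0
    flow = []
    for seq in range(int(rate * duration)):
        if rng.random() < drop_rate:
            continue  # packet lost in the network
        delay = max(0.0, rng.gauss(mean_delay, sigma))
        flow.append((seq, seq / rate + delay))
    return flow

# Parameters from the text: 30 packets/sec, 900 s, 100 ms delay, 1% loss.
example_flow = generate_flow(30, 900, 0.100, 0.01, seed=1)
```

Note that with independent per-packet delays, consecutive packets can arrive out of order, which is one reason the sequence numbers of Section 3.2.1 are useful.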

4.1.2 Watermarking engine

The watermarking engine uses the watermarking technique proposed by Wang, Chen, and Jajodia [8] to alter the inter-arrival timing of packets that are generated by the traffic generator.

Watermark generator: The watermark generator generates unique watermarks with a specified Hamming distance. The distance is important to ensure low false positives. We chose a Hamming distance of 9 based on the best results of [8]. These watermarks are embedded into traffic flows by the watermarking engine.

Parameter selection: The parameters used for the watermarking algorithm were chosen based on the results obtained by Wang, Chen, and Jajodia [8]; we used the parameters corresponding to their best results. All the watermarks are 24 bits in size and are generated such that the minimum Hamming distance between any two watermarks is 9. Wang, Chen, and Jajodia performed experiments to determine the optimal redundancy factor and found that a redundancy factor of 25 yields a very low average bit error rate, so we also use a redundancy factor of 25. A large redundancy makes the watermark robust against network jitter and results in a low bit error rate. The delay introduced by the watermark has to be small in order to make it difficult for the attacker to determine if his flow is watermarked. Wang, Chen, and Jajodia found that a watermarking delay of 3 ms was sufficient to confuse the attacker and achieve a high true positive rate.
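The report does not say how the watermarks with a minimum pairwise Hamming distance are produced. One straightforward possibility is rejection sampling, sketched below; the names `hamming` and `generate_watermarks` and the `max_tries` safeguard are assumptions made for illustration. Rejection sampling is simple but slows down as the set grows, so a larger set may call for a more structured code construction.

```python
import random

def hamming(a, b):
    """Number of bit positions in which integers a and b differ."""
    return bin(a ^ b).count("1")

def generate_watermarks(count, bits=24, min_distance=9, seed=None,
                        max_tries=100_000):
    """Draw random `bits`-bit watermarks, keeping only candidates that are
    at least `min_distance` away from every watermark accepted so far."""
    rng = random.Random(seed)
    marks = []
    tries = 0
    while len(marks) < count:
        if tries >= max_tries:
            raise RuntimeError("could not find enough watermarks")
        tries += 1
        candidate = rng.getrandbits(bits)
        if all(hamming(candidate, m) >= min_distance for m in marks):
            marks.append(candidate)
    return marks
```

Each returned integer can be read as a 24-bit binary string, e.g. with `format(mark, "024b")`.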

4.1.3 LAN delay simulator

The LAN delay simulator is a black box into which packets enter, suffer a delay, and exit the network. It simulates the network and processing delays that the packets encounter before reaching the stepping stone and after leaving it.

To obtain a model for delays in a Local Area Network (LAN), we performed some experiments on the local area network of the University of Texas at Arlington. Ping packets were sent to randomly chosen systems in the network and the round trip times (RTTs) were recorded. Ten ping packets were sent to each of 60 randomly chosen hosts. This experiment was repeated on three different systems on the network. A cumulative distribution function (CDF) was generated from the collected RTTs and curve fitting was used to obtain a function for the delay model. We chose an exponential function.
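As a rough illustration of such a delay model, the sketch below fits an exponential distribution to a list of RTT samples and draws delays from it. Note the substitution: the report fits a curve to the empirical CDF, whereas this sketch uses the maximum-likelihood estimate of the rate (the reciprocal of the sample mean), which is simpler and may give somewhat different parameters. The function names are hypothetical, and delays are in whatever unit the RTT samples use.

```python
import math
import random

def fit_exponential_rate(rtts):
    """Maximum-likelihood rate of an exponential distribution: 1 / mean."""
    if not rtts:
        raise ValueError("need at least one RTT sample")
    return len(rtts) / sum(rtts)

def exponential_cdf(x, rate):
    """P(delay <= x) under the fitted exponential model."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def sample_lan_delay(rate, rng):
    """Draw one simulated LAN delay from the fitted model."""
    return rng.expovariate(rate)
```

A simulator would call `sample_lan_delay` once per packet on each side of the stepping stone.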

4.1.4 Watermark decoder

The watermark decoder acts as the egress monitor, which checks outgoing traffic flows for watermarks. It is assumed that there is coordination between the watermarking engine and the decoder and that the decoder knows which packets are watermarked. The watermark that is found is compared with all the embedded watermarks and analyzed to determine false positives.
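The comparison step can be sketched as a nearest-match search by Hamming distance. The acceptance threshold used here is an assumption for illustration (the report does not state its decision rule in this section), and the names `hamming` and `match_watermark` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded watermarks."""
    return bin(a ^ b).count("1")

def match_watermark(decoded, embedded, threshold):
    """Return the embedded watermark closest to `decoded`, or None if even
    the closest one differs in more than `threshold` bits."""
    best = min(embedded, key=lambda w: hamming(decoded, w))
    return best if hamming(decoded, best) <= threshold else None
```

A match against the wrong flow's watermark, or any match on an un-watermarked flow, would be counted as a false positive.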

For a detection scheme to be effective, it should not only have a high detection rate but also a low false positive rate. A false positive can occur when the watermark decoder erroneously finds a watermark in an un-watermarked flow or in a flow that has a different watermark.

We performed two sets of experiments to test the effectiveness of our proposed technique to evade detection. For both sets, we used 100 simulated traffic flows generated as explained in Section 4.1.1. Our first set of experiments was to show the effectiveness of the watermarking technique when there is no buffering done by the attacker. This is an ideal case in which the attacker is not perturbing any characteristics of the traffic flows. To demonstrate the effectiveness of our proposed buffering technique to evade detection, we conducted a second set of experiments where we assume that the attacker is perturbing traffic flows that are being relayed through the stepping stone by using the buffering technique.

5 Results

In this section we present the results of experiments we performed to test the effectiveness of our proposed technique to evade watermark-based detection. We also conducted experiments to determine the effect of our technique and different watermarking delays on the buffering delay, drop rate, and amount of chaff.

5.1 Watermark detection

Figure 4 shows our results in the form of receiver operating characteristic (ROC) curves. An ROC curve is a graphical plot of the true positive rate (sensitivity) against the false positive rate (one minus specificity). As shown in Figure 4, when buffering is not used, the detection rate is high with a low false positive rate. When buffering is used, however, the ROC curve becomes almost linear with a 45 degree gradient, close to the random guessing line, indicating that the detection rate has dropped drastically and the false positive rate has increased. This also means that the detection rate can be increased only at the cost of drastically increasing false positives. In the remaining figures, we only show attacker success

Figure 4: ROC curve showing the effectiveness of buffering against watermark-based detection. (Plot of true positives versus false positives, with curves for Before Buffering, After Buffering, and Random Guessing.)
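ROC curves of this kind are typically traced by sweeping a decision threshold and recording the true and false positive rates at each setting. The sketch below assumes the detector's score is the Hamming distance between the decoded and the expected watermark, and that a flow is flagged when the distance is at most the threshold; both are assumptions for illustration, as is the function name `roc_points`.

```python
def roc_points(true_distances, false_distances, max_distance):
    """Return (false_positive_rate, true_positive_rate) for each threshold.

    `true_distances` holds distances for flows that really carry the
    watermark; `false_distances` holds distances for flows that do not.
    """
    points = []
    for h in range(max_distance + 1):
        tpr = sum(d <= h for d in true_distances) / len(true_distances)
        fpr = sum(d <= h for d in false_distances) / len(false_distances)
        points.append((fpr, tpr))
    return points
```

When the two distance populations overlap heavily, the points fall near the diagonal, which is the behavior described above for the buffered case.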

Figure 5: The amount of buffering delay needed as watermark delay increases. The Y-axis only ranges from 38 to 52 ms. (Plot of average buffer delay (ms) versus watermarking delay (ms), with curves for tolerance margins of 1, 2, and 3 standard deviations.)

5.2 Buffering delay

When the attacker uses buffering to evade detection, the observer may try to counter it by increasing the watermarking delay. If the attacker has to maintain a constant rate traffic stream, he would have to delay packets for a longer duration. We performed experiments to determine the effect of increased watermarking delays on the buffering delay. Figure 5 shows that even when the watermarking delay is varied from 5 ms to 50 ms, the average buffering delay varies by less than 10 ms. This shows that with a slight increase in buffering delay, an attacker would still be successful in evading detection. The choice of the tolerance margin used by the buffering algorithm could have an effect on the buffering delay. To determine this effect, we performed experiments by varying the tolerance margin from one standard deviation to three standard deviations. As seen in Figure 5, increasing the tolerance margin does not affect the buffering delay much. This is counter-intuitive because, in principle, one would expect the average buffering delay to increase as the tolerance margin increases. The average buffer delay remains nearly the same because, as the tolerance margin increases, packets that arrive late are sent in their respective slots rather than being buffered. So the packets that arrive after a late packet do not have to wait for the late packet to be sent. In effect, this reduces the average buffer delay.

5.3 Drop rate

Our next set of experiments was to determine the drop rates that we incur for different watermarking delays. We varied the watermarking delays from 5 ms to 50 ms. Figure 6(a) shows that even for a watermarking delay of 50 ms, the average drop rate is very small (0.3%). This supports our argument that watermark detection can be severely degraded with our buffering technique with a very low drop rate. The choice of the tolerance margin used by the buffering algorithm could have an effect on the drop rate. To determine this effect we performed experiments by varying the tolerance margin from one to three standard deviations. As seen in Figure 6(a), increasing the tolerance margin reduces the drop rate. The reason for this is that as the tolerance margin increases, packets that arrive late are sent in their respective slots rather than being buffered and eventually dropped.

5.4 Amount of chaff

Though the amount of chaff used by the attacker would not considerably affect the quality of the connection, it would still be desirable to determine the amount of chaff needed by the attacker to evade detection. We performed experiments to determine the amount of chaff needed to counter different watermarking delays. Figure 6(b) shows that the overall chaff rate is very low. Even with a watermarking delay of 50 ms, the chaff rate is only 1.4%. We also see that as the watermarking delay is varied from 5 ms to 50 ms, the chaff rate only varies by about 0.15%. This further corroborates our argument that with buffering and a very low amount of chaff the attacker can severely degrade watermark detection.

Figure 6: Effects of increasing watermark delay on the rates of packet drops and chaff packets at the stepping stone. (a) Drop rate: average drop rate (%) versus watermarking delay (ms), with curves for tolerance margins of 1, 2, and 3 standard deviations. (b) Chaff rate: average chaff rate (%) versus watermarking delay (ms).

5.5 Sequence Numbers

The use of sequence numbers allows the attacker to select which packets to drop and thereby trade off between drops and buffer delay. The attacker has a choice of how to treat late and out-of-order packets at the stepping stone. The attacker can drop all such packets, but this leads to very high drop rates. Instead, we allow some amount of delay, which we call the tolerance, of between one and four time slots. The higher the tolerance, the lower the drop rate and the higher the delay, as shown in Figure 7. This illustrates the trade-off involved in changing the tolerance.
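The trade-off between drops and delay can be illustrated with a minimal sketch of a slot-based sender. This is one possible reading of the scheme, not the simulator used for our experiments: it assumes that the packet with sequence number i is nominally due in slot i, that a packet is dropped once it can no longer be sent within `tolerance` slots of its nominal slot, and that idle slots are filled with chaff. The names `simulate` and `Result` are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Result:
    sent: int         # real packets forwarded
    dropped: int      # real packets discarded as too late
    chaff: int        # idle slots filled with chaff
    avg_delay: float  # mean buffering delay of forwarded packets


def simulate(arrivals, slot, tolerance):
    """Forward packets at one packet per slot (illustrative model only).

    arrivals[i] is the arrival time of the packet with sequence number i,
    slot is the slot length in the same time unit, and tolerance is the
    number of slots a packet may lag behind its nominal slot i.
    """
    order = sorted(range(len(arrivals)), key=lambda i: arrivals[i])
    nxt = 0        # index into order of the next packet yet to arrive
    buffered = []  # min-heap of sequence numbers waiting to be sent
    sent = dropped = chaff = 0
    total_delay = 0.0
    k = 0          # current slot number
    while nxt < len(order) or buffered:
        now = k * slot
        # Move everything that has arrived by this slot into the buffer.
        while nxt < len(order) and arrivals[order[nxt]] <= now:
            heapq.heappush(buffered, order[nxt])
            nxt += 1
        # Drop packets that have exceeded the tolerance (assumed rule).
        while buffered and buffered[0] + tolerance < k:
            heapq.heappop(buffered)
            dropped += 1
        if buffered:
            seq = heapq.heappop(buffered)
            total_delay += now - arrivals[seq]
            sent += 1
        else:
            chaff += 1  # keep the outgoing stream at a constant rate
        k += 1
    avg = total_delay / sent if sent else 0.0
    return Result(sent, dropped, chaff, avg)


if __name__ == "__main__":
    burst = [0, 20, 75, 76, 80]  # arrival times in ms, made up for illustration
    for tol in (0, 2):
        print(tol, simulate(burst, slot=20, tolerance=tol))
```

In this toy trace, a tolerance of zero drops the two packets that arrive behind schedule, while a tolerance of two slots forwards every packet at the cost of a longer average buffering delay, which mirrors the qualitative behavior described above.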

A tolerance of one time slot leads to significant reductions in delay versus not using sequence numbers. We see approximately 21% less delay when the watermark delay is 5 ms (a drop of 8.2 ms) and 43% less delay when the watermark delay is 50 ms (a drop of 21.5 ms). However, we get much higher drop rates with this low tolerance: from 0.14% drops without sequence numbers to 3.2% drops with a tolerance of one time slot (for a 5 ms watermark delay). As discussed in Section 3.2.1, a higher drop rate may be acceptable if we build redundancy into the data stream. It is beyond the scope of this paper to determine which approach is preferable to the attacker.
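As a quick consistency check on the figures quoted above, the short calculation below backs out the delays they imply. It assumes that each percentage is measured relative to the delay without sequence numbers; the helper name `implied_delays` is hypothetical, and the resulting values are derived from rounded figures rather than reported measurements.

```python
def implied_delays(fraction_saved, ms_saved):
    """Return (delay without sequence numbers, delay with them), in ms.

    Assumes fraction_saved is relative to the delay without sequence numbers.
    """
    baseline = ms_saved / fraction_saved
    return baseline, baseline - ms_saved


for wm_delay_ms, fraction, saved in [(5, 0.21, 8.2), (50, 0.43, 21.5)]:
    before, after = implied_delays(fraction, saved)
    print(f"watermark delay {wm_delay_ms} ms: about {before:.1f} ms -> {after:.1f} ms")

# Factor by which the drop rate grows at a 5 ms watermark delay.
drop_ratio = 3.2 / 0.14
print(f"drop rate increases about {drop_ratio:.0f}x")
```

Under that assumption, the buffer delay falls from roughly 39 ms to 31 ms at a 5 ms watermark delay and from roughly 50 ms to 28.5 ms at a 50 ms watermark delay, while the drop rate rises by a factor of about 23.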

Figure 7: The effects of increasing watermark delay on (a) the rate of packet drops and (b) the buffer delay at the stepping stone when sequence numbers are used. Each line represents a different tolerance.

6 Conclusion

In this paper, we argue against some of the assumptions about the capabilities of an attacker made by earlier approaches to stepping stone detection. We loosen some of these assumptions and assume that an attacker is capable of adding delays and cover traffic to his traffic streams on the stepping stone. We propose a simple buffering technique that, when used by an attacker on a stepping stone, is effective in severely degrading detection. Our technique involves buffering packets and adding chaff to generate constant-rate traffic streams. We perform simulations using a watermark-based detection scheme [8] designed to detect correlations between constant-rate traffic streams and show that our technique is effective in evading detection even by this correlation scheme.

There is a growing body of work on watermarking for both stepping stone detection and attacking anonymous communications. We have shown here that advances in this direction could be undone by smart attackers. Worse, the techniques studied in this paper could be made into software and distributed in the cracker community, so even script kiddies would have access. This issue calls into question the whole field of stepping stone detection, which already faces significant hurdles to cost-effective implementation. We believe that future detection mechanisms will need to take a broader view of detection to succeed.

References

[1] Staniford-Chen, S., Heberlein, L.: Holding Intruders Accountable on the Internet. Proceedings of the 1995 IEEE Symposium on Security and Privacy (1995) 39–49

[2] Zhang, Y., Paxson, V.: Detecting stepping stones. Proceedings of the 9th USENIX Security Symposium (2000) 171–184

[3] Yoda, K., Etoh, H.: Finding a Connection Chain for Tracing Intruders. F. Cuppens, Y. Deswarte, D. Gollmann and M. Waidner, editors, 6th European Symposium on Research in Computer Security–ESORICS 2000 LNCS-1895 (2000)

[4] Wang, X., Reeves, D., Wu, S.: Inter-packet delay-based correlation for tracing encrypted connections through stepping stones (2002)

[5] Donoho, D., Flesia, A., Shankar, U., Paxson, V., Coit, J., Staniford, S.: Multiscale stepping-stone detection: detecting pairs of jittered interactive streams by exploiting maximum tolerable delay. In: Recent Advances in Intrusion Detection (RAID 2002). (Oct. 2002) 16–18

[6] Wang, X., Reeves, D.: Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays. Proceedings of the 10th ACM conference on Computer and communications security (2003) 20–29

[7] Peng, P., Ning, P., Reeves, D.: On the secrecy of timing-based active watermarking trace-back techniques. In: Proc. 2006 IEEE Symposium on Security and Privacy. (May 2006)

[8] Wang, X., Chen, S., Jajodia, S.: Tracking anonymous peer-to-peer VoIP calls on the internet. Proceedings of the 12th ACM conference on Computer and communications security (2005) 81–91

[9] Pyun, Y.J., Park, Y.H., Wang, X., Reeves, D., Ning, P.: Tracing traffic through intermediate hosts that repacketize flows. In: Proc. IEEE Infocom. (April 2007)

[10] Danezis, G.: The traffic analysis of continuous-time mixes. In: Proceedings of Privacy Enhancing Technologies workshop (PET 2004). Volume 3424 of LNCS. (May 2004)

[11] Levine, B.N., Reiter, M.K., Wang, C., Wright, M.K.: Timing attacks in low-latency mix-based systems. In Juels, A., ed.: Proceedings of Financial Cryptography (FC ’04), Springer-Verlag, LNCS 3110 (February 2004)

[12] Shmatikov, V., Wang, M.: Timing analysis in low-latency mix networks: attacks and defenses. In: Proc. European Symposium on Research in Computer Security (ESORICS ’06). (Sep. 2006)

[13] Kuhn, D.R., Walsh, T.J., Fries, S.: Security Considerations for Voice Over IP Systems: Recommendation of the National Institute of Standards and Technology. National Institute of Standards and Technology. NIST Special Publication 800-58 edn. (January 2005)
