2008 First International Conference on the Applications of Digital Information and Web Technologies (ICADIWT), Ostrava, 4-6 August 2008
978-1-4244-2624-9/08/$25.00 ©2008 IEEE

A Scalable Method to Protect From IP Spoofing

Hikmat Farhat
Computer Science Department
Notre Dame University
Zouk Mosbeh, Lebanon

[email protected]

Abstract

Denial of Service (DoS) attacks present a serious problem for Internet communications. The problem is aggravated when the attacker(s) spoof their IP addresses. The Implicit Token Scheme (ITS), presented in [5], is an efficient method to defend against IP spoofing. ITS, however, requires perimeter routers to maintain state information for thousands of simultaneous connections. In this paper we add a component to ITS to improve its scalability using Bloom filters. It is found that implementing ITS using Bloom filters is simple, saves a substantial amount of router memory, and does not impose a large strain on routers. The efficiency of the method is demonstrated through simulations using real-world Internet data.

Keywords: Denial of Service, Bloom filters, IP spoofing.

1. Introduction

IP spoofing is usually employed in conjunction with one of the most damaging attacks that can be mounted on the Internet today, a Distributed Denial of Service (DDoS) attack. A DDoS attack floods the link of the victim network with a large number of packets, leading to a high rate of packet drops for legitimate users. DDoS attacks are often not publicized, but the threat is prevalent [7]. What makes protection against IP spoofing paramount is that many approaches that could mitigate DDoS are ineffective in the presence of IP spoofing. Because of the destination-based forwarding paradigm of the Internet Protocol, IP address spoofing is both simple and very effective in evading detection. The straightforward method of installing filters at border routers is rendered ineffective by IP spoofing. The attacker(s) can randomly choose a different source IP address for each packet and thus make the protection method infeasible.

Therefore, detecting and blocking packets with spoofed source addresses has been actively pursued in the research community. Most solutions to the IP spoofing problem are either a variation of the IP traceback method [9][11][12] or try to restrict the address space available to attacker(s) as in [3][8].

Recently we proposed the Implicit Token Scheme (ITS) as a method to mitigate DDoS attacks [5]. The key idea in ITS is that attackers cannot complete the TCP three-way handshake if they use spoofed source addresses. The main shortcoming of ITS was its need to maintain state information for many thousands of flows, which requires a large amount of router memory.

Wire-speed filters on Internet routers are usually stored in Ternary Content Addressable Memory (TCAM), which is expensive. Advanced router line cards usually have only one TCAM chip, which can hold 256k entries and must be shared with the router’s forwarding table. Therefore it is paramount for any proposed solution to IP spoofing to be scalable and to save router memory. For more information see [1] and references therein.

The rest of the paper is organized as follows. Bloom filters are reviewed in Section 2. The ITS method and its improvements using Bloom filters are discussed in Section 3. The results of the performed simulations are presented in Section 4. We conclude with Section 5.

2. Bloom Filters

We present a quick review of the theory of Bloom filters in this section. Bloom filters were introduced in 1970 by B. H. Bloom [2] and they have been widely used since, especially in database applications. Recently they have also been used in many areas in networking such as overlays, peer-to-peer networks, resource routing and packet routing.

A Bloom filter is a space-efficient data structure used to test set membership. It is an array of $m$ bits, initialized to zero, used to represent a set of $n$ elements, $S = \{x_1, \ldots, x_n\}$. The filter uses $k$ independent and uniform hash functions, $h_1, \ldots, h_k$, each with range $\{1, \ldots, m\}$. To "add" an element $x_i \in \{x_1, \ldots, x_n\}$ to the filter, the $k$ hash functions are applied to $x_i$ and the corresponding bits in the filter are set to one. Adding an element $x$ to the filter is written in pseudo-code as follows:

ADD ELEMENT X

    for j = 1 to k
        do
        filter[h_j(X)] ← 1

Note that setting a bit that is already one does not change it. To check if an element $y$ belongs to the set, the $k$ hash functions are applied to $y$ and the corresponding bits are checked. If any one of the bits is 0 then clearly the element is not in the set. If all the bits are equal to 1 then we report that the element belongs to the set. The following pseudo-code checks if $y$ is an element of the set:

CHECK ELEMENT Y

    for j = 1 to k
        do
        if filter[h_j(Y)] = 0 return False
    return True
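As an illustration, the two procedures above could be sketched in Python as follows. This is a minimal sketch and not the implementation used in the paper: the class name `BloomFilter`, the zero-based bit positions, and the hash family (SHA-256 of an index byte followed by the item, reduced modulo $m$) are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Bit-array Bloom filter with k hash functions over m bits."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # all bits start at zero

    def _positions(self, item: bytes):
        # Assumed hash family: SHA-256 of (index byte + item), reduced mod m.
        for j in range(self.k):
            digest = hashlib.sha256(bytes([j]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        # ADD ELEMENT: set the k bits selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # CHECK ELEMENT: a single zero bit proves the item was never added.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter(m=1024, k=4)
bf.add(b"alice")
print(b"alice" in bf)  # True: an added element is always reported present
```

An element that was never added is usually, but not always, reported absent, which is the subject of the analysis that follows.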

Obviously, an element $z$ could have all the corresponding bits equal to 1 without the element itself belonging to the set. This is called a false positive. It is in our interest that the rate of false positives be as small as possible. The false positive rate can be calculated as follows. When a given hash function $h_i$ is applied to an input $x_1$, the result is a value between 1 and $m$. Since the hash functions are uniform, the probability that this result is equal to a particular number $\nu$ is $\frac{1}{m}$. Therefore the probability of the bit at position $\nu$ being 1 after one hash function is $\frac{1}{m}$. The probability that it is 0 is $1 - \frac{1}{m}$. The probability that it is 0 after all $k$ hash functions are applied is $(1 - \frac{1}{m})^{k}$. Since there are $n$ elements in the set, the probability that the bit $\nu$ is equal to 0 after we process all $n$ elements is $(1 - \frac{1}{m})^{kn}$. Hence $1 - (1 - \frac{1}{m})^{kn}$ is the probability that a given bit $\nu$ is set to 1 after all input elements $x_1, \ldots, x_n$ are processed. Since we want the false positive rate, we need the probability that for an arbitrary input $y$ the corresponding $k$ bits are 1 without $y$ belonging to the set. This probability is

$$f_p = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \tag{1}$$

$$f_p \approx \left(1 - e^{-nk/m}\right)^{k} \tag{2}$$
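To see how close the approximation in equation (2) is to the exact expression in equation (1), both can be evaluated numerically. The sketch below is only illustrative: the function names and the parameter values $m = 10{,}000$, $n = 1{,}000$, $k = 7$ are assumptions chosen for the example and do not come from the paper.

```python
import math


def fp_exact(m: int, n: int, k: int) -> float:
    """Equation (1): (1 - (1 - 1/m)^(k n))^k."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


def fp_approx(m: int, n: int, k: int) -> float:
    """Equation (2): (1 - e^(-n k / m))^k."""
    return (1.0 - math.exp(-n * k / m)) ** k


m, n, k = 10_000, 1_000, 7  # illustrative values only
print(fp_exact(m, n, k), fp_approx(m, n, k))
```

For filters of this size the two values agree to several decimal places, which is why the approximate form is normally used.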

Figure 1. Defense model against DDoS attacks.

Asymptotically the false positive rate depends on $k$ and the ratio $m/n$. If we fix the ratio $m/n$, one can show [6] that the minimum of the false positive rate in equation (1), as a function of $k$, occurs when

$$k_o = \frac{m}{n} \ln 2 \tag{3}$$

and the optimal false positive rate is

$$f_{p_o} = \left(\frac{1}{2}\right)^{k_o} \tag{4}$$

Usually the false positive rate and $n$ are fixed and we need to deduce the number of bits needed. Combining equations (3) and (4) we get

$$m = -2.08\, n \ln f_{p_o} \tag{5}$$
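The sizing rule can be checked with a few lines of Python. In this sketch the function names are assumptions; the constant 2.08 is written out as $1/(\ln 2)^2$, which is what substituting equation (3) into equation (4) yields, and the inputs are the targets used later in Section 4 (500,000 flows, 1% false positive rate).

```python
import math


def optimal_k(m: float, n: float) -> float:
    """Equation (3): k_o = (m / n) ln 2."""
    return (m / n) * math.log(2)


def bits_needed(n: int, fp: float) -> float:
    """Equation (5): m = -n ln(fp) / (ln 2)^2, i.e. about -2.08 n ln(fp)."""
    return -n * math.log(fp) / math.log(2) ** 2


n, fp = 500_000, 0.01  # targets taken from Section 4
m = bits_needed(n, fp)
print(f"m = {m:.0f} bits = {m / 8 / 1e6:.2f} MB, k_o = {optimal_k(m, n):.2f}")
```

Since $k_o$ is generally not an integer, an implementation has to round it to a nearby whole number of hash functions.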

One disadvantage of Bloom filters is that it is not possible to delete entries stored in the filter. Doing so would require setting to zero all $k$ bits that the entry points to, but this could corrupt the filter since, as mentioned above, a bit can be set to 1 by multiple entries. To solve this problem a variation of Bloom filters called counting Bloom filters was introduced by Fan et al. [4]. In a counting Bloom filter each entry is a counter rather than a single bit. When an item is added the corresponding counters are incremented, and when the item is removed the corresponding counters are decremented.
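A counting Bloom filter along these lines might look as follows. This is a hedged sketch, not the construction of [4] in detail: the class name, the unbounded Python integers used as counters, and the SHA-256 based hash family are assumptions made for the example (practical designs use small fixed-width counters).

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant whose cells are counters, so items can be removed."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, item: bytes):
        # Assumed hash family: SHA-256 of (index byte + item), reduced mod m.
        for j in range(self.k):
            digest = hashlib.sha256(bytes([j]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: bytes) -> None:
        # Only valid for items that were previously added.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def __contains__(self, item: bytes) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=1024, k=4)
cbf.add(b"flow-a")
cbf.add(b"flow-b")
cbf.remove(b"flow-a")
print(b"flow-b" in cbf)  # True: removing flow-a does not disturb flow-b
```

The price of supporting deletion is memory: each cell now needs several bits instead of one.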

3. The ITS Method

The Implicit Token Scheme (ITS) [5] provides protection against IP-spoofed traffic by having Internet Service Providers (ISPs) install filters at the border router as shown in Figure 1. These filters are based on entries in a database that map a given IP address to the network path that a packet from that IP address takes to reach the victim. The basic ITS method requires intermediate routers to add a deterministic mark to the identification field of the IP header of transiting packets. The collection of marks by all intermediate routers is called the packet’s signature. According to [5], a mark consisting of a 2-bit hash related to the IP address of the intermediate router and its peer is optimal. Each entry in the database therefore contains the source IP address and the corresponding path signature. Such entries are used as filters at the border router, which drops any packet that does not carry the valid path signature. New entries are added to the database only when a client completes the TCP three-way handshake, which ensures that the added entry belongs to a legitimate IP address and not to a spoofed one.

ADDPACKETTOFILTER(PKT)

    TOKEN = PKT.SIG ‖ PKT.SOURCE
    for i = 0 to k-1
        do
        bitPos = h_i(TOKEN)
        filter[bitPos] ← 1

Figure 2. Adding a token to the filter

CHECKPACKET(PKT)

    TOKEN = PKT.SIG ‖ PKT.SOURCE
    for i = 0 to k-1
        do
        bitPos = h_i(TOKEN)
        if filter[bitPos] = 0 return False
    return True

Figure 3. Checking if a token is in the filter

3.1. Building the Bloom Filter

Originally the list of tokens was stored in what is called a tokens database. The implementation of the database was not specified but rather assumed to exist. Furthermore, an assumption was made that one can retrieve and store entries in the database. In this section we show how the above-mentioned database can be implemented as a single Bloom filter.

Each entry in the database contains a token, which is composed of the source IP address and the corresponding path signature stored in the 16-bit IP identification field of the IP header [5]. This field is marked by the routers along the path, from the source to the destination, where each router contributes 2 bits.
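For concreteness, a token of this form can be built by concatenating the 16-bit signature with the 32-bit IPv4 source address, giving 6 bytes. The sketch below assumes the signature comes first (following the order SIG ‖ SOURCE used in Figures 2 and 3) and network byte order; the function name `make_token` and the example values are hypothetical.

```python
import ipaddress
import struct


def make_token(signature: int, source_ip: str) -> bytes:
    """Concatenate the 16-bit path signature and the 32-bit IPv4 source."""
    if not 0 <= signature <= 0xFFFF:
        raise ValueError("signature must fit in the 16-bit identification field")
    addr = ipaddress.IPv4Address(source_ip)
    # "!H4s": big-endian unsigned 16-bit value followed by 4 raw bytes.
    return struct.pack("!H4s", signature, addr.packed)


token = make_token(0xBEEF, "192.0.2.10")
print(token.hex())  # beefc000020a
```

The resulting byte string is what would be fed to the hash functions of the Bloom filter.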

FILTER AT A BORDER ROUTER

 1  for each packet PKT
 2      do
 3      if PKT.SYN = 1
 4          then
 5          sendCookie
 6          Exit
 7      if CheckPacket(PKT) = True
 8          then
 9          forward PKT
10      elseif checkCookie(PKT.ACK) = True
11          then
12          forward PKT
13          AddPacketToFilter(PKT)
14      else
15          drop PKT

Figure 4. Packet filtering using exact match.

In the discussion of Bloom filters in Section 2 we assumed that the elements of the set and their number are known in advance. In ITS, a token is added to the filter every time a TCP connection is established. The number of elements is therefore not known in advance but increases with time. This is not a problem. Recall from Section 2 that the number of bits needed to reach the optimal false positive rate is proportional to $n$, the number of elements in the set. In our analysis we regard $n$ as an upper bound on the number of elements that can be stored in the Bloom filter. It can be seen from equation (2) that, for fixed $m$ and $k$, the smaller the value of $n$ the smaller the false positive rate. The pseudo-code for adding the token of a packet to the filter is shown in Figure 2.

3.2. Packet Filtering

When the border router receives a packet it executes the algorithm in Figure 4. If the SYN bit is on, a reply containing a cookie is sent to the client. If the SYN bit is not on, the router checks the validity of the packet token (line 7) using the CheckPacket() procedure shown in Figure 3. If the return value is True, the packet is taken to belong to a legitimate connection (subject to false positives, discussed later) and is forwarded. If the return value is False, the packet’s token has not yet been added to the filter; a legitimate packet in that state should carry a valid cookie, which is checked in line 10. If the packet contains a valid cookie then its token is added to the filter using the AddPacketToFilter() procedure shown in Figure 2. Finally, if neither case is true then the packet is dropped (line 15).
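The decision logic described above can be sketched as a small Python function. Several things here are assumptions made so the example is self-contained: the `Packet` fields, the action strings, the cookie validator passed in as a callable, and above all the use of a plain Python `set` as a stand-in for the Bloom filter (a real deployment would substitute the filter's add and membership operations).

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Packet:
    syn: bool     # TCP SYN flag
    token: bytes  # path signature concatenated with the source address
    ack: int      # value handed to the cookie check, as in Figure 4


def filter_packet(pkt: Packet, tokens: Set[bytes],
                  cookie_is_valid: Callable[[int], bool]) -> str:
    """Return the action Figure 4 takes for one packet."""
    if pkt.syn:
        return "send_cookie"       # lines 3-6
    if pkt.token in tokens:
        return "forward"           # lines 7-9
    if cookie_is_valid(pkt.ack):
        tokens.add(pkt.token)      # lines 10-13: learn the token
        return "forward"
    return "drop"                  # lines 14-15


known_tokens: Set[bytes] = set()
check = lambda ack: ack == 12345   # hypothetical cookie validator
for pkt in (Packet(True, b"tok", 0),       # SYN
            Packet(False, b"tok", 12345),  # valid cookie, token is learned
            Packet(False, b"tok", 0),      # token now known
            Packet(False, b"bad", 0)):     # unknown token, no cookie
    print(filter_packet(pkt, known_tokens, check))
```

Note that the token is learned only on the branch where the cookie check succeeds, which is what ties filter entries to completed handshakes.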


Figure 5. Efficiency of the filter as a function of the filter size. (Plot: fraction of bandwidth used by attackers, from 0 to 1, against filter size in MB, from 0 to 1.)

4. Implementation & Simulations

We have implemented ITS using 8 hash functions $h_0, \ldots, h_7$, all of them derived from two independent MD5 functions. Since the output of an MD5 function is 128 bits, we have used each 32-bit block as a different value, so a single MD5 plays the role of four hash functions. Our goal is to maintain about 500,000 flows. For a false positive rate of 1%, equation (5) gives a filter size of $-2.08 \times 5 \times 10^5 \times \ln 0.01 \approx 4.8 \times 10^6$ bits, or about 0.6 MB. As a comparison, without Bloom filters we need 6 bytes for each token, for a total of $6 \times 5 \times 10^5$ bytes $= 3$ MB. This is a memory saving of a factor of 5.
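One way to obtain 8 hash values from two MD5 computations is sketched below. The paper does not say how the two MD5 functions are made independent, so the one-byte salt prefix used here is an assumption, as are the function name `hash_positions`, the big-endian split into 32-bit words, and the reduction modulo the filter size.

```python
import hashlib
import struct


def hash_positions(token: bytes, m: int) -> list:
    """Derive 8 bit positions from two MD5 digests, four 32-bit words each."""
    positions = []
    for salt in (b"\x00", b"\x01"):  # assumed way to get two MD5 variants
        digest = hashlib.md5(salt + token).digest()  # 16 bytes = 128 bits
        words = struct.unpack("!4I", digest)         # four 32-bit values
        positions.extend(w % m for w in words)
    return positions


m_bits = 4_800_000  # roughly the 0.6 MB filter discussed above
print(hash_positions(b"\xbe\xef\xc0\x00\x02\x0a", m_bits))
```

Splitting a digest this way means only two MD5 computations are needed per packet rather than eight separate hashes.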

To test our method we ran a series of simulations using real-world topological data from Skitter [10]. We randomly selected 600 hosts: 100 were used as clients and 500 as attack sources. The IP addresses and path signatures of the clients were manually added to the Bloom filter (not through TCP). The attacking sources send data at the constant rate of 10M packets/s while the clients send at the rate of 1M packets/s. The victim’s link rate is set to 100MB, i.e., just enough for the legitimate clients. The metric used to measure the performance of our method is the fraction of the bandwidth of the link between the border router and the target that is consumed by attacking packets. As expected, the results in Figure 5 show that the bigger the filter size, the better the efficiency of the method, since the false positive rate is smaller.

It should be noted that for a size of 0.9 MB less than 5% of the bandwidth is used by the attackers. This is an excellent result, with a gain of a factor of more than 3 in memory size, since the original ITS method requires 3 MB of memory.

5. Conclusion

The Implicit Token Scheme (ITS) is an efficient method to defend against spoofed IP traffic. In this paper we have proposed the use of Bloom filters, a space-efficient data structure, to store the rules of ITS and thereby reduce the storage requirements on intermediate routers. Since Bloom filters can give rise to false positives, we also derived an expression for the false positive rate as a function of the filter size, as well as the optimal values needed to minimize the rate of false positives. Several simulations were performed on real-world data and the results show that the proposed method achieves its aim of reducing the memory requirements on intermediate routers by up to a factor of 5.

References

[1] K. Argyraki and D. R. Cheriton. Active Internet traffic filtering: real-time response to denial-of-service attacks. In Proceedings of the Annual USENIX Technical Conference, pp. 135–148, 2005.

[2] B. H. Bloom. Space/time trade-offs in hash coding with allowable errors. Commun. ACM, vol. 13, no. 7, pp. 422–426, 1970.

[3] Z. Duan, X. Yuan, and J. Chandrashekar. Constructing inter-domain packet filters to control IP spoofing based on BGP updates. In Proceedings of IEEE INFOCOM, pp. 1–12, 2006.

[4] L. Fan, P. Cao, J. Almeida, and A. Z. Broder. Summary cache: A scalable wide-area web cache sharing protocol. IEEE/ACM Trans. Netw., vol. 8, no. 3, pp. 281–293, 2000.

[5] H. Farhat. Protecting TCP services from denial of service attacks. In Proceedings of the ACM SIGCOMM workshop on Large-scale attack defense, pp. 155–160, 2006.

[6] M. Mitzenmacher. Compressed Bloom filters. IEEE/ACM Trans. Netw., vol. 10, no. 5, pp. 604–612, 2002.

[7] D. Moore, C. Shannon, D. J. Brown, G. M. Voelker, and S. Savage. Inferring Internet denial-of-service activity. ACM Trans. Comput. Syst., vol. 24, no. 2, pp. 115–139, 2006.

[8] K. Park and H. Lee. On the effectiveness of route-based packet filtering for distributed DoS attack prevention in power-law internets. In Proceedings of ACM SIGCOMM, pp. 15–26, 2001.

[9] S. Savage, D. Wetherall, A. Karlin, and T. Anderson. Network support for IP traceback. IEEE/ACM Trans. Netw., vol. 9, no. 3, pp. 226–237, 2001.

[10] CAIDA’s skitter initiative. http://www.caida.org.

[11] D. Song and A. Perrig. Advanced and authenticated marking schemes for IP traceback. In Proceedings of IEEE INFOCOM, pp. 878–886, 2001.

[12] A. Yaar, A. Perrig, and D. Song. FIT: Fast Internet traceback. In Proceedings of IEEE INFOCOM, pp. 1395–1406, 2005.
