Revisiting IP Traceback as a Coupon Collector’s...

University of California,Irvine

Revisiting IP Traceback as a Coupon Collector’s Problem

Thesis

submitted in partial satisfaction of the requirementsfor the degree of

Master of Science

in Electrical and Computer Engineering

by

Pegah Sattari

Thesis Committee:Professor Athina Markopoulou, Chair

Professor Hamid JafarkhaniProfessor Syed Ali Jafar

2007

The thesis of Pegah Sattariis approved:

Committee Chair

University of California, Irvine2007

ii

For my parents: Rohangiz Shojaei, Siavash Sattari.

iii

Table of Contents

List of Figures v

List of Tables vi

Acknowledgments vii

Abstract viii

1 Introduction 11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Model and Motivation 62.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.2 Key Ideas and Rationale . . . . . . . . . . . . . . . . . . . . . . . . . 8

3 Proposed Mechanisms 113.1 Unequal PPM: Optimal Marking Probabilities . . . . . . . . . . . . . 11

3.1.1 Single Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113.1.2 Multiple Paths . . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.2 PPM+NC: Using Network Coding . . . . . . . . . . . . . . . . . . . 193.2.1 Single Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203.2.2 Multiple Paths . . . . . . . . . . . . . . . . . . . . . . . . . . 28

4 Additional Evaluation 314.1 Discussion of Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314.2 Additional Simulation Results . . . . . . . . . . . . . . . . . . . . . . 32

4.2.1 Single Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324.2.2 Multiple Paths . . . . . . . . . . . . . . . . . . . . . . . . . . 33

5 Conclusion 38

Appendices 39A The Coupon Collector’s Problem with Network Coding . . . . . . . . 39B Tail Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40C Combining the Proposed Mechanisms with Previous Schemes . . . . . 42

iv

List of Figures

2.1 PPM over a single path of length d. . . . . . . . . . . . . . . . . . . . 7

3.1 Comparison of the tradeoff for constant and unequal PPM. . . . . . . 13

3.2 Example of an attack tree with 7 nodes and 4 attackers. . . . . . . . 16

3.3 Average number of packets needed for paths with varying d. . . . . . 21

3.4 Average number of packets needed for a path with varying p. . . . . . 21

3.5 Average number of packets needed for Algebraic and PPM+NC. . . . 23

3.6 Average and 95th percentile for the number of packets required. . . . 28

3.7 Average and 95th percentile for the number of packets required. . . . 28

4.1 Comparison of all three schemes for single path. . . . . . . . . . . . . 33

4.2 Average number of packets needed for a binary attack tree. . . . . . . 34

4.3 Average number of packets needed for a ternary attack tree. . . . . . 35

4.4 Average number of packets needed for a degree 4 attack tree. . . . . . 35

4.5 Average number of packets needed for a realistic attack tree. . . . . . 36

v

List of Tables

4.1 BRITE Topology Generator Parameters . . . . . . . . . . . . . . . . 36

vi

Acknowledgements

I would like to thank Prof. Athina Markopoulou for her detailed comments andsuggestions which helped to substantially improve this work.

vii

Abstract

Revisiting IP Traceback as a Coupon Collector’s Problem

By

Pegah Sattari

Master of Science in Electrical and Computer Engineering

University of California, Irvine, 2007

Professor Athina Markopoulou, Chair

Traceback schemes aim at identifying the source(s) of a sequence of network pack-

ets and the nodes these packets traversed. This is, for example, useful for tracing the

sources of a Distributed Denial-of-Service (DDoS) attack [11]. The main idea is to

have intermediate nodes mark packets with information about their identity and the

receiver uses the information on the marked packets to reconstruct the paths. Past

work has designed probabilistic and algebraic traceback schemes and has explored the

tradeoff between the number of bits required for marking and the number of packets

required for reconstruction.

In this work, we use the insight that probabilistic traceback is essentially a coupon

collector’s problem to design two mechanisms that improve its performance. First, we

optimize the assignment of marking probabilities at intermediate nodes and show that

it improves the tradeoff between the number of packets and the work at intermediate

nodes, compared to schemes where nodes use the same marking probability; this idea

can be used together with any probabilistic marking scheme. Second, we propose

a network coding based marking approach that stores random linear combinations

viii

of router ids; a special case of this approach advances algebraic traceback. Our

mechanisms can be combined with and complement previous marking schemes to

improve the overall traceback performance. We also provide performance models,

based on the coupon collector problem with unequal weights, that accurately capture

the performance of the proposed mechanisms as well as that of prior schemes.

ix

Chapter 1

Introduction

1.1 Overview

Distributed Denial-of-Service attacks (DDoS) are one of the hardest problems on the

Internet today. During a DDoS attack, a large number of compromised hosts coordi-

nate and send unwanted traffic to the victim thus exhausting the victim’s resources

and preventing it from serving its legitimate clients. For example, victims of DDoS

attacks can be companies that rely on the Internet for their business (in which case

DDoS attacks can result in severe financial loss or even in the company quitting

the business), government sites and other organizations (in which case disruption of

operation results in a political or reputation cost). We are particularly interested

in bandwidth flooding attacks, which flood the victim’s access link with unwanted

traffic.

Several approaches and mechanisms have been proposed to deal with DDoS at-

tacks. In this work, we focus on IP traceback mechanisms that try to trace the

attacks back to their sources. Traceback in itself does not stop the attacks, but it is

an important piece of the puzzle. Many techniques for IP traceback problem have

already been proposed. In this work, we are particularly interested in the family of

Probabilistic Packet Marking (PPM) schemes, that started with the work in [15] and

evolved into a number of improved marking schemes. In PPM, packets are marked

probabilistically with the IP addresses of the routers they traverse. The victim uses

this information in the marked packets to trace the attack back to its source. We

are also interested in Algebraic Traceback (AT) that uses algebraic techniques to

encode/decode information into/from packets.

1

Our key insight in this thesis is that probabilistic traceback is essentially a coupon

collector’s problem, where the coupons correspond to the router ids and the proba-

bilities of the coupons are affected by the marking probabilities at the routers. This

fundamental observation has been made even in the original paper [15] and is well-

known since [12]. However, to the best of our knowledge, it has not been exploited so

far to optimize the performance. Instead, researchers focused on marking algorithms

and on improving the tradeoff between the number of bits in the packet header and

the number of packets needed to reconstruct the path [2, 13, 15, 16]. In this work,

we are interested in minimizing the number of packets to reconstruct the path, while

keeping the load at the routers and bits on the header low, by tuning parameters that

affect the coupon collector problem, namely the marking probabilities and network

coding.

Based on this insight, we propose two new mechanisms that can improve prob-

abilistic traceback. First, we optimize the assignment of marking probabilities to

improve the tradeoff between the number of packets and the work at intermediate

nodes; we develop an optimal unequal marking scheme that outperforms equal mark-

ing probability schemes. Second, inspired by recent developments in network coding,

we propose a network-coding based marking approach that stores random linear com-

binations of router ids; a special case of this scheme improves over algebraic traceback.

We provide accurate performance models based on the coupon collector problem with

unequal weights. We evaluate our techniques through analysis and simulations, for

both paths and trees, and we compare them to appropriate baseline mechanisms.

However, our ideas are orthogonal to and therefore can be combined with previous

marking schemes to improve the overall traceback performance.

The rest of the thesis is organized as follows. In Section 1.2 we summarize the

related work. Chapter 2 discusses how the coupon collector’s problem can be used to

model probabilistic traceback and presents the key ideas. Chapter 3 presents the two

2

new mechanisms, namely unequal PPM (unequal Probabilistic Packet Marking) and

PPM+NC (Probabilistic Packet Marking with Network Coding), and evaluates them

through analysis and simulation for single and multipath scenarios. Chapter 4 dis-

cusses various costs and provides additional simulation results for realistic topologies.

Chapter 5 concludes the thesis and discusses future work.

1.2 Related Work

Many techniques for IP traceback problem have already been proposed. Savage et

al. [15] first proposed Compressed Edge Fragment Sampling (CEFS) scheme which

divides each IP address and its hash function into eight fragments and each router

probabilistically marks the IP packet with one of the eight fragments selected at

random. This approach works well for a single attacker, but suffers from high com-

putation overhead due to checking a large number of combinations of the fragments

in case of a Distributed Denial-of-Service attack. Savage et al. also took advantage

of reserving a distance field in each packet for the first time. When the packet arrives

at the victim, the distance field represents the number of hops traversed since the

content of the packet was sampled. By introducing this distance field, Savage et al.

minimized spoofing by the attacker such that a single attacker (or the closest attacker

for a distributed attack) can not forge any IP address between themselves and the

victim.

Song et al. [16] improved the efficiency and accuracy for reconstructing the attack

path under DDoS by predetermining the network topology and introducing new en-

coding schemes. Their advanced and authenticated marking schemes feature high pre-

cision and lower computation overhead for the victim to reconstruct the attack paths

under large scale DDoS attacks. Especially their authenticated marking scheme pre-

vents a compromised router from forging markings of other uncompromised routers.

3

Unfortunately they assumed that the victim has a map of its upstream routers. Al-

though they described how such maps can be obtained in practice, this assumption

is a drawback because such information could be difficult to obtain and maintain.

Dean et al. [3] proposed an algebraic approach by replacing XOR-based marking

scheme with one encoding the path information as points on polynomials. Their

scheme improves robustness over previous approaches, both for noise elimination and

distributed attack graph reconstruction. We will discuss more about the algebraic

approach in the evaluation of our proposed PPM+NC mechanism.

Goodrich [7] presented a new PPM based traceback approach which he called

randomize-and-link. The main idea of this scheme is to have each router mark with

a random fragment of its message together with a large checksum cord on its entire

message. The checksum cords make the reconstruction procedure much more efficient;

the scheme features fast and efficient traceback under the case of a distributed attack

without requiring a prior knowledge of the topology of the attack tree.

Park et al. [13] and Adler [2] studied the tradeoffs for different parameters in

PPM schemes. Park et al. showed the tradeoff between the ability of the victim to

localize the attacker and the severity of the attack which is related to the marking

probability, attack path length, and traffic volume. They showed that PPM is effective

at localizing the attack source in case of a single path attack although it is always

possible for the attacker to send packets with spoofed IP addresses. However, PPM

faces more difficulties as the number of attack sources increase in case of a DDoS

attack.

Adler introduced a new marking technique that only requires a single bit in the

packet header under the case of a single path attack. He showed the tradeoff between

b, the number of bits needed in the packet header, and the number of packets required.

He proved that the number of packets required for reconstructing the attack path

grows exponentially with n, but decreases doubly exponentially with b where n is the

4

number of bits for representing the attack path. In case of a multiple paths attack,

he provided a lower bound on the number of bits needed as a function of the number

of attack paths. He also demonstrated a closely matching upper bound for some

restricted scenarios.

Ma [10] introduced Tabu Marking Scheme (TMS) in which a router regards a

packet marked by an upstream router as a tabu and does not mark it again. TMS

reduces the convergence time compared with Song et al.’s Advanced Marking Scheme

I (AMS-I) under the case of DDoS attacks with multiple flooding sources, but has the

same convergence time under a DoS attack with a single source. This comparison is

also valid for any other PPM scheme that allows overwriting e.g. CEFS scheme and

AMS-II.

5

Chapter 2

Model and Motivation

2.1 Model

Consider the path of length d shown in Fig. 2.1. Attacker A sends packets towards

the victim. In Probabilistic Packet Marking approaches (which we call PPM), each

intermediate node i along this path marks the packet, with probability p(i), with its

IP address.

As a first step, let us make the same assumption as Savage et al. in [15], where

each router makes an independent decision for marking the packet and there is only

space for one mark on the header, and routers overwrite previous marks: every packet

finally contains at most one router’s mark after traversing the entire attack path. The

marks on the received packets allow the victim to sample the routers on the path.

After receiving a sufficient number of packets, X, the victim obtains at least one

sample for every router in the attack path and can reconstruct the entire path.

Traceback as a Coupon Collector’s Problem. Since the first paper on prob-

abilistic traceback [15], it has been observed that PPM resembles a coupon collector’s

problem [12]. In [15], it was assumed that all routers mark with the same probability

p and it was argued that the average number of packets required for the victim to

reconstruct a path of length d is bounded by:

E[X] <ln d

p(1− p)d−1(2.1)

The reason is that although router i marks with probability p(i) = p, its mark is

observed at the victim only if the routers between i and V do not overwrite it, which

6

123d d-1 VA

p(d) p(d-1) p(3) p(2) p(1)

Figure 2.1: PPM over a single path of length d.

happens with probability pi = p(i) ·∏1i−1(1 − p(j)) = p(1 − p)i−1. We call p(i) the

marking probability and pi the perceived probability of node i. Clearly pi > pd, ∀i,thus the bound of Eq. (2.1).

Clearly, in overwriting schemes, the perceived probability depends on the distance

of that router from the victim: the further a router is from the victim, the less likely

that its marks will survive (not be overwritten) as the packet moves along the path.

Therefore, the coupon collector’s problem with unequal weights provides an accurate

model for the IP traceback problem [14]. In this version, there are n distinct coupons

and a type i coupon is obtained with probability pi, independent of the previous

coupons, where∑n

i=1 pi = 1. The average number of boxes required to collect all n

for unequal coupon collector problem is known to be [14]:

E[X] =

∫ ∞

0

(1−n∏

i=1

(1− e−pix))dx (2.2)

In the IP traceback case, the coupons are the router ids and pi is the perceived

probability of observing a mark from the ith router. Clearly pi depends on the

marking probabilities p(j)’s: pi = p(i)∏1

i−1(1− p(j)).

One difference between the IP traceback problem and the coupon collector’s prob-

lem is the “null coupon”. In the coupon collector’s problem, there are n distinct

types of coupons and∑n

i=1 pi = 1; in the traceback problem, there are d distinct

router IP addresses, that need to be collected, plus a null event which is related to

those packets that finally do not contain any marks. In other words, in traceback

there exists n = d + 1 coupons (d distinct router IP addresses and no mark) and the

7

number of coupons the victim requires to reconstruct the attack path is d. We assign

the probability p0 to this null event:∑d

i=1 pi + p0 = 1.

Therefore, if all routers mark with the same p(i) = p as in [15], the exact value for

the average number of packets the victim needs to observe to reconstruct the path is:

E[X] =

∫ ∞

0

(1−d∏

i=1

(1− e−p(1−p)i−1x))dx (2.3)

which is a better model than Eq. (2.1). Compared to Eq. (2.2), in Eq. (2.3), the

product goes from 1 to d due to the desired d distinct IP addresses out of the total

d + 1 coupons and pi = p(1− p)i−1.

2.2 Key Ideas and Rationale

Our objective in this thesis is to minimize E[X] by tuning parameters that can affect

the behavior of the coupon collector’s model. In particular, we build on the following

key ideas:

• First, we observe that E[X] is minimum for equal weights of the coupons pi = 1n.

We then construct marking probabilities p(j) that lead to those equal weights

pi’s, for a single path and for multiple paths scenarios. We call this optimal

marking scheme unequal PPM and we discuss it in detail in Section 3.1.

• Second, we observe that recent developments in network coding [5] can further

reduce E[X] from n ln n + Θ(n) to Θ(n), by marking with random linear com-

binations of router ids instead of router ids themselves. We call this scheme

PPM+NC and we discuss it in detail in Section 3.2. In the special case that

all routers mark in PPM+NC (i.e. p(i) = 1) the scheme is comparable to but

outperforms Algebraic Traceback.

Assumptions. Let us briefly discuss some of the assumptions behind the above

model.

8

• We assume that routers place a mark on the packet with their entire IP address

(or linear combinations of IP addresses in the PPM+NC scheme). In practice,

due to the limited number of available bits on the header, each router encodes

only partial information about its IP address using methods like fragmentation

and hashing. These important considerations are part of the marking scheme

design, which is orthogonal to - and can be combined with - the ideas in this

thesis. Therefore, we consider as baseline for comparison the best case of PPM

schemes: without bits limitation they would mark their entire IP address, and

this is an upper limit to their performance. We refer to the basic overwriting

scheme with the same marking probability on all routers as constant PPM or

simply PPM.

• Similarly, we present our discussion in terms of router ids (node sampling algo-

rithm in [15]). The same analysis holds for edge sampling algorithms [15] that

mark with edge ids (XOR of two nodes) instead of router ids. We are interested

in collecting the full set of router ids and not about the order, which is trivial

in the case of edge ids, or can be inferred from the relative number of samples

per node.

• We assume that each router decides to mark a packet or not independently of

other routers; this is consistent with [15] and the rationale is that upstream

routers may have forged information.

• Our ideas extend from a single path to a multipath scenario, where several

attackers form an attack tree towards the victim. Multipath scenarios are con-

sidered throughout the thesis.

Other performance metrics of interest. We are interested in fast inference,

i.e. in reducing the number of packets X needed to reconstruct the path. So far, we

discussed only the computation of E[X]. More generally, one can compute the entire

9

distribution of X. For details on the computation of the percentile P (X > n), please

see the appendix.

Another metric of interest is the amount of work that routers are required to do

for marking. The average number of marks on all packets is (∑d

i=1 p(i)).E[X] : there

are E[X] of packets and each packet gets one mark on each router with marking

probability p(i). We are interested in schemes that generate low load at the routers.

Finally, another important performance metric is the number of bits on the packet

header required for marking. The tradeoff between number of packets and number of

bits has been extensively explored in the literature and intelligent marking schemes

have been designed to achieve traceback within a tight bit budget. This tradeoff

is orthogonal to tuning the marking probability of PPM schemes (please note the

difference between marking probability and the content of the mark itself). However,

the bits requirement is relevant for network coding and is discussed in the relevant

Sections 3.2 and 4.1.

10

Chapter 3

Proposed Mechanisms

3.1 Unequal PPM: Optimal Marking Probabili-

ties

As discussed in Section 2.1, when all routers mark packets with the same marking

probability p, the perceived probability of observing a mark from each router is pi =

p(1− p)i−1 for the ith router, which favors routers closer to the victim. An intuitive

idea is to assign higher marking probabilities to the routers further from the victim,

so as to balance out the above bias caused by overwriting. We are interested in

constructing a function that assigns marking probabilities p(i) to the routers according

to their distance from the victim i, so as to achieve the above goal. In particular, we

are interested in finding the optimal function p(i), i = 1...d that minimizes E[X], the

average number of packets required to reconstruct the path.

3.1.1 Single Path

If we consider only two routers, we can calculate E[X] from Eq. (2.2) for different

values of pi where p1 = p is the perceived probability of observing a mark from router

1 and p2 = 1−p is the perceived probability of observing a mark from router 2. First

we focus on making the probability of null event, i.e. the perceived probability of

receiving no mark, equal to 0. That is why we chose p and 1 − p for the perceived

probabilities of routers 1 and 2 respectively where∑2

i=1 pi = 1, the same condition

as that we mentioned in the coupon collector’s problem with unequal weights. We

will discuss other values of p0 at the end of this section. It is easy to see that E[X]

11

is minimized for p1 = p2 = 1/2.

More generally, for a path with d routers, E[X] is minimum when the perceived

probability (probability of observing a mark from every router) is the same and equal

to pi = 1/d. Indeed, Eq. (2.2) is symmetric with respect to all variables, i.e. p1,

p2, . . . , pn. And since∑n

i=1 pi = 1, p1 = p2 = · · · = pn = 1/n is the answer to the

minimization problem of E[X] in Eq. (2.2).

From the optimal perceived probabilities pi = 1n, i = 1...d, we can now construct

the marking probabilities p(i), i = 1...d, that lead to the optimal pi’s. We can calculate

these probabilities starting from the routers close to the victim i = 1, 2, ..., d:

p1 = p(1) ⇒ p(1) =1

d

p2 = p(2)(1− p(1)) ⇒ p(2) =1

d− 1

p3 = p(3)(1− p(2))(1− p(1)) ⇒ p(3) =1

d− 2

. . .

p(d− 2) =1

3, p(d− 1) =

1

2, p(d) = 1

This scheme assigns probability of marking 1 to the first router that receives the

packet and then the probability of marking gradually decreases as the router gets

closer to the victim. Interestingly, this result is similar to the reservoir sampling

problem [12]: the kth router that receives the packet in its path (router numbered as

d − k + 1), marks it with probability 1/k. This formula has an interesting practical

advantage: if a router knows that it is the kth router on the packet’s path, based

e.g. on a hop count or TTL (Time-to-Live), then it can configure itself to use the

optimal marking probability 1/k for each packet. If the optimal probability depended

on the number of routers after the current router, this would not be possible, and

an additional protocol/configuration would be needed to inform the routers what

12

101

102

103

104

101

102

103

104

(sum of p(i)).E[X]

E[X

]

constant PPMunequal PPM

Figure 3.1: Comparison of the tradeoff for constant and unequal PPM.

marking probability to use.

Let us now characterize the tradeoff between (i) the average number of packets re-

quired for reconstructing the attack path and (ii) the work at the intermediate routers

under this unequal marking scheme. As we mentioned above, p0, the probability of

observing no mark, has been assumed to be 0 till now. We want to change this

assumption and see how this effects the benefit we get from marking with unequal

probabilities instead of marking with constant probability for all routers. For this

purpose, we fix the path length d and gradually change p0 from 0 to 0.95. Then for

each value of p0 we construct the probabilities of marking from the following formula,

p(i) =1−p0

d∏i−1j=1(1− p(j))

=1− p0

d− (i− 1)(1− p0)

where the perceived probability is set to pi = (1−p0)/d for all routers; i = 1, 2, · · · , d.

Note that in the special case of p0 = 0, we get the expected result:

p(i) =1d∏i−1

j=1(1− p(j))=

1

d− i + 1

13

We now want to compare this optimal marking scheme, which we call unequal

PPM to the constant PPM [15] that uses the same marking probability p at all routers.

Comparing the two schemes in terms of E[X] alone would not be fair, because they can

both decrease the number of packets by increasing the number of marks. Therefore,

we compare the two schemes in terms of the tradeoff (E[X], (∑d

i=1 p(i)).E[X])) they

can achieve. We have used the analytical formula in Eq. (2.2) for calculating E[X]

with the corresponding perceived probabilities (pis) for each scheme and n replaced by

the appropriate number of coupons which is d. This comparison is shown in Fig. 3.1

for d = 14.

For the constant PPM scheme, we vary the constant marking probability p in the

range 0.01-0.40 and obtain the blue curve in the figure. In the same way that p is

the only parameter over which we have control in the constant PPM scheme, p0 is

the only parameter which can be changed in the unequal PPM mechanism. We have

considered the range 0.01-0.40 for p and 0-0.95 for p0 in this comparison. It is worth

noting that we have always considered probabilities of marking in the range 0.01 to

0.04 in this thesis. But for this comparison, we want to show that the optimized PPM

(that results to the same perceived probability equal to 1/d) improves the tradeoff

between the number of packets required and the work at the intermediate routers

over constant PPM for all values of p one might consider.

When p in the constant PPM scheme goes to 1, all packets will have a mark

from the closest router to the victim and therefore, the victim will never be able to

reconstruct the attack path; i.e. E[X] goes to infinity. In other words by increasing p

in the constant PPM scheme, after some point, both E[X] and (∑d

i=1 p).E[X] increase

and the curve moves to the right side which can be seen in the figure. However, in

the unequal PPM scheme, as we increase p0 or the probability of not receiving any

marks, E[X] increases, but the work at the intermediate routers decreases. Thus

the curve moves to the left side. The figure shows that for a small area at the left,

14

E[X] resulted from unequal PPM exceeds that of constant PPM. However this area

corresponds to very large values of p0 (p0 > 0.9). Over the range of our interest;

i.e. small value of p0, it can be seen that the optimized unequal PPM scheme always

performs better than the constant PPM scheme by moving the entire curve down.

In summary, the proposed unequal PPM mechanism improves the tradeoff between

the number of packets required and the work at intermediate nodes by optimizing the

assignment of probabilities of marking. It is worth noting that the unequal PPM

mechanism can be combined with and take advantage of any proposed reconstruction

algorithm for other PPM schemes that allow overwriting.

3.1.2 Multiple Paths

In the case of a DDoS attack, attackers send traffic towards the victim over an attack

tree rooted at the victim V . Each attack source {Ai} is located at a leaf node and

the attack path from {Ai} is the ordered list of routers between {Ai} and V that the

attack packet has traversed. For example, Fig. 3.2 shows a binary tree with 7 nodes in

which nodes {Ri} represent the routers and the {Ai} represent the flooding sources.

The attacker chooses one of the nodes out of the set of all possible attack sources for

each packet it sends. The choice of an attack source automatically determines the

path from that attack source to the victim. Probabilistic packet marking takes place

over the chosen path, and is the same as in the case of a single path attack. Therefore,

the multi-path problem is decomposed to a number of single-path problems, one for

each packet.

We now describe how to construct the optimal marking probabilities in the case of

a DDoS attack for tree topologies. In general, one should follow the same approach

as in single path to construct the probabilities of marking. The difference is that one

should also consider different probabilities of going through different routers in the

case of multiple attackers. If we denote the number of paths going through router i

15

R4

R7

R6

R5

R1

R3

R2

A1

A4

A3

V

A2

Figure 3.2: Example of an attack tree with 7 nodes and 4 attackers.

by Ni and the total number of paths in the tree by Ntotal, then we can write P (packet

goes through router i)=Ni/Ntotal.

In a symmetric tree of degree m1, e.g. binary (m = 2), ternary (m = 3), or degree4

(m = 4) tree, this probability has a simple expression and is equal to 1mli−1 for each

router i in layer li, li = 1, 2, · · · , L where L is the total number of layers in the tree

e.g. L = 3 in Fig. 3.2 and the first layer is the closest one to the victim that contains

only one router. The perceived probability is:

P (observing a mark from router i)=P (router i marks the packet and the next

routers do not mark).P (packet goes through router i)

We can now construct the marking probabilities for all routers from the formula

above similarly to what we did for the case of single path. We present two different

schemes for constructing the marking probabilities:

Scheme A: this scheme minimizes E[X]. In this scheme, one should set the per-

ceived probability formulated above to pi = 1n, i = 1...n (p0 = 0 in the optimal case)

where n is the total number of nodes in the tree and start constructing the proba-

bilities of marking from the first layer. Following this procedure, we can show that

for a binary tree, the kth router in the path the packet traverses (any router in layer

L−k+1) always marks the packet with probability 1/(2k−1) noting that any router

1By a symmetric tree of degree m, we mean a tree in which every node that is not a leaf nodehas exactly m children.

16

in layer L is the first router the packet may go through and so forth. It means that

after the path is chosen for the packet to traverse, the first router in its path always

puts a mark on it (the same as that in single path) and the probability of marking,

and therefore overwriting the current mark on the packet, decreases with the specified

formula for the routers closer to the victim. For a tree of degree 3 or 4 the probability

of marking for the kth router in the attack path becomes 1/∑k−1

j=0 3j or 1/∑k−1

j=0 4j

respectively. From these formulas, it can be seen that similar to the case of a single

path attack, the router needs to know the previous routers the packet has traversed

to decide with what probability it should mark.

Scheme B: in addition to the previous scheme, one might also think of a per path

optimization scheme where the probability of marking for every router depends on

which path the current attack packet has been sent through. Once the attacker

chooses a path for an attack packet to traverse, the probabilities of marking for the

routers in that path will be assigned exactly in the same way as in single path.

Therefore, a single router might mark with different probabilities for different attack

packets based on its placement in the path that packet is traversing. In fact a tree is

regarded as a concatenation of several single paths in this scheme and every time the

packet is sent through one of these paths, the routers in that path always mark with

probabilities 1, 1/2, 1/3, ... starting from the first router after the attacker. In case of

a binary, ternary, degree 4 or any other symmetric tree, this per path optimization

scheme results in the perceived probability equal to pi = 1L. 1mli−1 for each router i

in the tree, i = 1...n. For asymmetric trees in general, it results in the perceived

probability equal to 1/d for all routers in the path of length d the packet is traversing

(the same as single path).

In fact, scheme A is the optimal assignment of marking probabilities and scheme B

is suboptimal. As we discussed above, scheme A is the direct result of minimization of

E[X] in case of multiple attackers. However, there is no guarantee that one can always

17

construct the marking probabilities such that the perceived probability is constant

for all routers in the tree and is equal to 1/n as scheme A describes. One can always

apply scheme A for an optimal assignment of marking probabilities in the symmetric

trees e.g. binary, ternary, and degree 4 trees discussed above.

However, for asymmetric trees this optimal assignment might not result in logical

values for marking probabilities. On the other hand, we can not find a general closed

form for the formula of marking probability of the kth router in an asymmetric tree

like what we found for the symmetric trees. Therefore, in each asymmetric tree

one should start constructing the marking probabilities from the first layer using

the general formula mentioned previously; furthermore, he might not achieve logical

probabilities of marking for all nodes.

In asymmetric attack trees, scheme B can be used as a heuristic assignment of

marking probabilities. We can formulate the average number of packets needed for

reconstructing the attack paths in each of these schemes using Eq. (2.2) with the

appropriate perceived probability pi we described for each scheme. As a result, we

will have the following formula for E[X] of scheme A:

E[X] =

∫ ∞

0

(1−n∏

i=1

(1− e−1n

x))dx (3.1)

And for scheme B:

E[X] =

∫ ∞

0

(1−n∏

i=1

(1− e− 1

L. 1

mli−1 x))dx (3.2)

In Eq. (3.1) and Eq. (3.2), the product is calculated over all nodes in the tree

(i = 1...n). E[X] required for both schemes will be shown in Section 4.2.2 through

simulations. From the results, we conclude that scheme B is a reasonable suboptimal

assignment of marking probabilities for unequal PPM mechanism and substantially

improves constant PPM scheme.

18

3.2 PPM+NC: Using Network Coding

The second mechanism for improving PPM is combining it with the idea of network

coding. First, we start from the coupon collector’s problem. It is well known that

as the size of the collection increases, the probability of the next boxes contain new

types of coupons decreases. In other words, most of the time is spent on collecting the

last few coupons. More recently, it has been observed that network coding can help

solve this problem. Storing linear combinations of coupons in each box, instead of

individual coupons, increases the chance that each new box contains a useful coupon.

One can prove that when network coding is used, the average number of coupons

required for the entire collection, E[X], decreases from n ln n + Θ(n) to Θ(n). The

proof can be found in [5] and is given in the Appendix A.

In the case of IP traceback, we consider random linear network coding technique

combined with the probabilistic packet marking. It means that after decision of

marking a packet by a router, the current content of the packet will be updated to

a linear combination of the new IP address and the previous IP addresses. In other

words, once a router decides to mark the packet, it chooses a coefficient randomly out

of a field F2n [4], multiplies its IP address with the coefficient, and adds the result to

the current content of the packet.

It is worth noting that the router does not need to know anything about the

previous routers or the complete path the packet traverses. At the end, every packet

contains a linear combination of several routers’ IP addresses together with a vector

of the random coefficients for those routers. The reconstruction process is similar to

solving a system of linear equations. Therefore, at least d packets are required for the

victim to reconstruct a path of length d because the matrix of the linear equations

must become full rank to be solvable.

We should note that with this definition, network coding in the traceback problem

is different from that in the coupon collector’s problem. As mentioned earlier, in the

19

coupon collector’s problem we can have any linear combination of any coupons in

every box while in traceback, each packet contains a linear combination of some of

the routers’ IP addresses prior to the last router who has decided to mark it and the

last router itself. It can not contain any arbitrary linear combination of routers’ IP

addresses. It means that we should not directly use the result of applying network

coding to the coupon collector’s problem, which we discussed above, in the case of IP

traceback.

3.2.1 Single Path

To give a model for applying network coding to the single path traceback problem,

first we calculate pi which is the perceived probability of observing a mark from router

i. We can write:

P (observing a mark from the ith router)=P (the ith router marks the packet).P (obtaining

a full rank matrix out of the linear combinations)

We can show that the second probability can be assumed to be 1 with a very good

approximation using the lower bound on the success probability of a random network

code proposed in [9]. In fact, we are looking for the probability of obtaining a full

rank matrix out of the linear combinations to make the system of linear equations

solvable. Ho et al. proves that there is a certain probability of choosing linearly

dependent combinations under a random network code. This probability depends

on the field size 2n, but even for small field sizes e.g. 28 the probability becomes

negligible. Therefore, pi = p is a good approximation for the perceived probability

of observing a mark from the ith router in PPM+NC mechanism. This fact will also

be confirmed by plotting both the simulation results and the analytical model given

based on this pi in the rest of this thesis.

We can now replace p(1 − p)i−1 in Eq. (2.3) by this pi value. As a result, the

average number of packets required in PPM+NC mechanism will, with a very good

20

0 5 10 15 20 25 30 350

50

100

150

200

250

Path length

Ave

rage

num

ber

of p

acke

ts

simulations PPMmodel PPMsimulations PPM+NCmodel PPM+NC

Figure 3.3: Average number of packets needed for paths with varying d.

0.01 0.015 0.02 0.025 0.03 0.035 0.0450

100

150

200

250

300

350

400

Probability of marking the packets

Ave

rage

num

ber

of p

acke

ts

simulations PPMmodel PPMsimulations PPM+NCmodel PPM+NC

Figure 3.4: Average number of packets needed for a path with varying p.

approximation, reduce to:

E[X] =

∫ ∞

0

(1−d∏

i=1

(1− e−px))dx (3.3)

Which represents the benefit of applying network coding in the single path IP trace-

back problem.

Fig. 3.3 shows the average number of packets required to reconstruct paths of

varying lengths over 500 random test runs for each length value. We have considered

paths of length 1 to 31 hops and the marking probability p is set to 1/25. The figure

21

shows the models as well as the simulations for both PPM and PPM+NC mechanisms.

The model for PPM is given in Eq. (2.3) and the model for PPM+NC is given in

Eq. (3.3). We have calculated the analytical formulas in Eq. (2.3) and Eq. (3.3) for

path lengths less than or equal to d = 16. The results confirm the compatibility of

the given models with the simulation results for both PPM and PPM+NC schemes.

As we discussed above, the coefficients for PPM+NC mechanism can even be

selected out of a field of a small size. Here we have assumed a very small field of

size 22 = 4. Each realization terminates once the victim observes a full rank matrix

of the random coefficients. The figure shows that even for such a small field size,

the model completely agrees with the simulation results. Furthermore, PPM+NC

performs much better than the basic PPM scheme which allows overwriting.

The reason is that when the path length is long enough, even when we choose

the coefficients out of a very small field, it is very likely to obtain independent linear

combinations of routers’ IP addresses from different packets. In other words, although

larger fields show more benefit in random linear network coding schemes, because

of the large enough path lengths in most of the traceback problems, there is a low

possibility of selecting linearly dependent combinations according to the large number

of distinct router IP addresses even with less options for the coefficients to be chosen

randomly. This is shown in the figure as well; PPM+NC represents a bigger benefit

compared with PPM for longer paths rather than for shorter paths. But even for

short paths, the effect of the small field size is not too big according to Ho et al.’s

theorem and can be neglected. Therefore we have chosen a small field of size 22 = 4

in all our experiments on PPM+NC mechanism.

Fig. 3.4 shows the same comparison as Fig. 3.3 but with a fixed path length

(d = 14) and for different probabilities of marking (0.01-0.04). We have considered

the condition p ≤ 1d

for the optimum results [15].

We now want to compare PPM+NC mechanism to the algebraic approach pro-

22

0 200 400 600 800 1000 120010

15

20

25

30

35

Field size

Ave

rage

num

ber

of p

acke

ts

simulations Algebraicsimulations PPM+NC

Figure 3.5: Average number of packets needed for Algebraic and PPM+NC.

posed in [3]. Dean et al. also reconstructs the attack path by solving a system of linear

equations. It is very similar to our proposed PPM+NC scheme; the main difference is

that in the algebraic approach, there is a random packet id xj for each attack packet

and the matrix of the final linear equations looks like a Vandermonde matrix (if the

packet ids are distinct) with different powers of the pre-selected packet id as coeffi-

cients for different router IP addresses. But in PPM+NC, there is no pre-selected id

and each time a router decides to mark the packet based on some probability p, it

chooses a random coefficient and adds the multiplication of that coefficient with its

IP address to the current content of the packet. Therefore, the final matrix will have

different coefficients, selected uniformly at random out of a field, for different routers.

We claim that PPM+NC scheme performs better than the algebraic approach in

terms of the average number of packets required for reconstructing the attack path

especially over small field sizes. The reason is that although one obtains d distinct

packet ids, and therefore a full rank Vandermonde matrix, after sending d packets by

choosing the ids from a large field, for smaller field sizes one would need more than

d packets in the algebraic approach. It is obvious that in the algebraic approach, the

smallest possible field size should have at least d elements. Also all routers mark the

23

packet going through them i.e. p = 1. We can prove that the number of packets

required to obtain d distinct packet ids out of a field of size q is equal to:

E[X] = q

q∑

k=q−d+1

1

k(3.4)

Proof. It can be viewed as a coupon collector’s problem with the packet ids

representing the coupons. Let X be the number of packet ids chosen out of the field

until d distinct packet ids are obtained. If Xi is the number of packet ids chosen

while one had exactly i − 1 distinct packet ids, clearly X =∑d

i=1 Xi. Each Xi is a

geometric random variable with probability pi = q−(i−1)q

. Hence,

E[Xi] =1

pi

=q

q − (i− 1)

And using the linearity of expectations,

E[X] =d∑

i=1

E[Xi] = q

q∑

k=q−d+1

1

k

This proves Eq. (3.4).

But in PPM+NC, if we follow the same marking procedure as that of [3] and let

all routers mark with probability 1, even for very small field sizes, the victim always

needs to receive d packets for obtaining a full rank matrix and solving the system

of linear equations. It comes from the randomly selected coefficients in PPM+NC

scheme. This is shown in Fig. 3.5. In this figure d is set to 14. Since the field size

22 = 4 is not even possible for the algebraic approach with d = 14, we have considered

fields of sizes 24 to 210.

The figure shows that when we use the same probability of marking as the one in

the Algebraic approach (p = 1) for our proposed PPM+NC mechanism, the average

number of packets required for reconstructing the attack path of length d does not

24

depend on the size of the field out of which the coefficients are selected randomly.

Even for small field sizes, PPM+NC almost always needs d packets to get a full rank

matrix. However, the number of packets needed in the algebraic approach depends

on the size of the field out of which the packet ids are selected randomly and only for

large field sizes, it requires d packets to obtain a full rank Vandermonde matrix.

Let us discuss more about the analytical model we proposed for PPM+NC at

the beginning of this section. Ho et al. [9] considers a feasible multicast connection

problem with independent or linearly correlated sources and N receivers where the

components of local coding vectors are chosen independently and uniformly at random

over a finite field Fq with q > N . The probability that all N receivers can decode

the source processes is at least (1 − Nq)ν where ν is the maximum number of links

receiving signals with the random coefficients in any set of links from all sources to

any receiver. In other words, ν is the number of coding points that are encountered

in all paths from the source to any receiver, maximized over all receivers [5, 9].

In the traceback problem, there exists one receiver (the victim) which needs to

decode the routers’ IP addresses from the linear combinations it receive. Instead

of the links receiving signals with independent randomized coefficients in Ho et al.’s

theorem, here the linear combinations of routers’ IP addresses are stored in the attack

packets.

It is obvious that one should replace N in the formula of success probability by 1 for

the case of a single path attack. However, ν is not straightforward to be substituted

in the traceback problem. The reason is that the number of links in the network is

known for a multicast connection problem while in traceback, the number of packets

required for the victim to reconstruct the attack path is not known from the beginning.

Indeed, the number of packets required and as a result, the probability of success in

the traceback problem, is a function of marking probability p, path length d and field

size q.

25

One can obtain a bound on the number of packets by calculating the total number

of possible linear combinations of routers’ IP addresses which would be∑d

i=0

(di

)pi(1−

p)d−iqi. However, it is much larger than the real number of packets needed for re-

constructing the attack path in most cases. As we discussed at the beginning of

this section, for the range of marking probabilities we usually consider for traceback

problem (0.01-0.04) and regarding the usually large enough attack path length d,

approximating the success probability, i.e. the probability of obtaining a full rank

matrix, by 1 is reasonable even when we choose the coefficients out of a field of a

small size as it can be seen in Fig. 3.3 and Fig. 3.4.

We now want to consider the tradeoff between the number of bits needed and the

number of packets required for both algebraic and network coding approaches. By

splitting a router’s IP address into c chunks, the number of bits in the packet header

required by the algebraic approach would be equal to log2 2d32ce+dlog2 de+dlog2 ce or

even log2 2d32ce+ dlog2 de when each router substitutes each chunk of its IP address in

order [3]. d represents the attack path length. For example, when c = 4, one would

need 15 bits or in the best case, 13 bits per packet and 4d packets for a path of length

16 in the algebraic approach.

In PPM+NC mechanism, a vector of the randomly chosen coefficients is stored in

the packet along with the total content of the packet which is the linear combination of

IP addresses. Assuming a field of size 232, this would require log2 232 + dnavg ∗ log2 22ebits in the packet header. We can trade off the number of bit for the average number

of packets required by dividing a router’s IP address into c chunks and adding dlog2 cebits that represents the offset of the chunk. In this way, we can reduce the field size

to 2d32ce. Therefore, the number of bits needed by PPM+NC mechanism will be equal

to:

log2 2d32ce + dnavg ∗ log2 22e+ dlog2 ce (3.5)

which can further be simplified to log2 2d32ce + d2navge + dlog2 ce. navg is the average

26

number of coefficients stored as the elements of a vector in each packet. In other

words, it is the average number of routers who have put their marks with randomly

selected coefficients on one packet.

navg depends on the path length d and the probability of marking p and is equal

to navg = dp. Since p is usually a small value e.g. 0.04, the average number of marks

on each packet would be small e.g. 1.24 for a large path length of 31. On the other

hand, it is assumed that the random coefficients have been selected out of a field of

size 22 = 4. Therefore, the number of bits needed by Eq. (3.5) is in the order of the

number of bits required by the best case of algebraic approach.

We have also compared the average number of packets required for reconstructing

the attack path with the 95th percentile for the number of packets. The 95th per-

centile is obtained by using the empirical cdf. First we found the empirical cdf of

the values obtained from every realization for the number of packets required. Then,

we could find different percentiles e.g. 95th percentile for the number of packets re-

quired from the distribution. The figures are plotted versus both path length and

probability.

Fig. 3.6 shows the 95th percentiles as well as the average number of packets needed

for reconstructing paths of varying length from 1 to 31 hops. The marking proba-

bility p is set to 1/25. Both percentile and average are presented for both PPM and

PPM+NC mechanisms. Obviously, as the path length increases, both 95 percentile

and average increase and the 95th percentile curve is always above the average curve

for both of the mechanisms. Fig. 3.7 shows the same thing versus probability. The

attack path length is set to 14 and the considered range for probabilities of marking

is 0.01-0.04. The analysis of tail distribution can be found in the Appendix B of this

thesis.

27

0 5 10 15 20 25 30 350

50

100

150

200

250

300

350

400

Path length

Num

ber

of p

acke

ts r

equi

red

simulations 95% PPMsimulations 95% PPM+NCmodel 95% PPM+NCsimulations mean PPMsimulations mean PPM+NC

Figure 3.6: Average and 95th percentile for the number of packets required.

0.01 0.015 0.02 0.025 0.03 0.035 0.040

100

200

300

400

500

600

Probability of marking

Num

ber

of p

acke

ts r

equi

red

simulations 95% PPMsimulations 95% PPM+NCmodel 95% PPM+NCsimulations mean PPMsimulations mean PPM+NC

Figure 3.7: Average and 95th percentile for the number of packets required.


In the case of a distributed attack, we can apply random linear network coding tech-

nique to the probabilistic packet marking in the same way as what we did for a single

path attack. As we mentioned in Section 3.1.2, the attacker chooses one of the nodes

out of the set of all possible nodes for each packet it sends. The choice of the attack

source automatically determines the path from that attacker to the victim. Similar

to the two schemes for constructing marking probabilities in the unequal PPM mech-

anism described in Section 3.1.2, one can think of two different PPM+NC schemes:

28

Scheme A: This scheme considers the topology structure of the attack tree in

PPM+NC mechanism. In this scheme, every node maintains a list of its upstream

routers (children) in the tree. Once the node receives a packet from any of its chil-

dren, it will remember that child’s IP address for the next attack packets. One can

summarize scheme A as follows; after the path is determined for the attack packet to

traverse, each node in the attack path decides to mark the packet going through it

with some probability p (PPM). However, with the idea of network coding, each time

the node decides to mark, it marks not only with its own IP address, but also with

the IP addresses of those children of it for which it remembers the IP addresses from

the previous attack packets that went through those children. Therefore, the general

idea is the same as single path with the difference that after decision of marking a

packet by a router, several marks with several randomly selected coefficients may be

put on the packet at the same time instead of only 1 mark per decision in case of a

single path attack.

Scheme B: This scheme is, similarly to scheme B of unequal PPM in Section 3.1.2,

a per path based scheme. In fact, once a path is selected for the attack packet to

traverse, PPM+NC mechanism takes place in that path similarly to that of a single

path attack. Therefore, once a router in the attack path decides to mark the packet,

it chooses a coefficient randomly out of a field F2n , multiplies its IP address with

the coefficient, and adds the result to the current content of the packet. The tree is

regarded as a concatenation of several single paths in this scheme and it is assumed

that each router does not know anything about the IP addresses of its children.

Both PPM+NC schemes are plotted in Section 4.2.2. It can be seen that scheme

A gives more benefit in terms of the average number of packets required for recon-

structing the attack paths which is obvious from its description.

We can also give a model for PPM and PPM+NC mechanisms in the case of

multiple attackers in a tree topology. It is sufficient to calculate the appropriate pi

29

for both mechanisms and replace it in Eq. (2.2). We gave the formula of pi for an

overwriting scheme in a distributed attack scenario in Section 3.1.2,

P (observing a mark from router i)=P (router i marks the packet and the next

routers do not mark).P (packet goes through router i)

The first probability is equal to p(1−p)nnext(i) where nnext(i) is the number of routers

that are after router i in the path the packet traverses. For example in a binary tree,

the number of next routers for each router i in layer li is simply equal to li − 1. The

second probability was explained in Section 3.1.2. For PPM+NC mechanism, in its

second scheme (scheme B), pi can be calculated as follows:

P (observing a mark from router i)=P (router i marks the packet).P (obtaining a

full rank matrix out of the linear combinations).P (packet goes through router i)

The first probability is simply equal to p. With the same discussion as in the

case of single path (Section 3.2.1), the second probability can be assumed to be

approximately 1. Therefore, in case of a distributed attack, PPM mechanism and

scheme B of PPM+NC mechanism can be modeled by Eq. (2.2) with their pis specified

above.

30

Chapter 4

Additional Evaluation

4.1 Discussion of Costs

In general, one should consider three main cost factors in the traceback problem.

The first one is the average number of packets required for the victim to reconstruct

the attack path. We discussed this factor for both unequal PPM and PPM+NC

mechanisms in the previous sections and observed that both schemes reduce the

average number of packets needed compared to PPM mechanism. We want to focus

on the other two factors in this section. One of them is the number of bits in the

packet header each mechanism requires. The last one is the work at the intermediate

routers which is related to the amount of marks they put on the attack packets.

First we should note that for the unequal PPM mechanism, the number of bits

needed in the packet header is not different from that in the PPM scheme because

unequal PPM is based on the optimization of assignment of marking probabilities

and does not effect the number of bits required. Therefore, any technique that has

been proposed for reducing the number of bits needed in the packet header for any

PPM scheme that allows overwriting can be applied to unequal PPM mechanism as

well. In the same way, we do not need to consider the work at the intermediate nodes

for PPM+NC mechanism. The reason is that we did not make any changes in the

constant probability of marking p in our proposed PPM+NC mechanism.

As a result we only discuss about the number of bits needed by PPM+NC and

the work at the intermediate routers in unequal PPM. In Section 3.2.1, we gave the

number of bits that need to be allocated to PPM+NC scheme in Eq. (3.5). We

compared it to the number of bits required by the algebraic approach and showed

31

that they are similar. For example when c = 4, we would need a total of 12 bits in

the packet header for a path of length d = 16. We discussed the work at intermediate

nodes for unequal PPM scheme in Section 3.1.1 and from Fig. 3.1, we concluded that

unequal PPM mechanism improves the tradeoff between the number of packets and

the work at the intermediate nodes over the constant PPM scheme.

We should discuss another issue of PPM+NC mechanism. Network coding is

inherently susceptible to jamming attacks; interestingly the same applies for algebraic

traceback. In PPM+NC, this problem comes into play because each router adds its

IP address multiplied by a random coefficient to the current content of the packet

disregarding what packet contains. The victim reconstructs the attack path by solving

a system of linear equations resulted from the linear combinations of IP addresses in

the packets it has received. Thus, it has no way to understand whether it is decoding

the right information or not. Some solutions have been proposed to the problem

of jamming attacks in network coding technique which may be of interest in the

traceback problem as well [6].

4.2 Additional Simulation Results

4.2.1 Single Path

We have included our simulation results along with our proposed schemes in the

previous sections. We now want to compare our two mechanisms, unequal PPM and

PPM+NC, to constant PPM scheme in one figure. Fig. 4.1 shows such a comparison.

In this figure the constant probability of marking p is set to 0.04 for the constant

PPM mechanism and the path length varies from 1 to 31. Each path length result

represents the result of 500 independent simulation runs. We have simulated the best

possible unequal PPM scheme in which p0, the probability of observing no mark, is

set to 0; i.e. the perceived probability of observing a mark from all routers is set to

32

0 5 10 15 20 25 30 350

50

100

150

200

250

Path length

Ave

rage

num

ber

of p

acke

ts

PPMPPM+NCUnequal PPM

Figure 4.1: Comparison of all three schemes for single path.

1/d for an attack path of length d.

We can see that unequal PPM scheme performs much better than constant PPM

in terms of the average number of packets needed for reconstructing the attack path.

It is obvious according to the optimization of probabilities of marking in our proposed

unequal PPM mechanism. Unequal PPM even performs better than PPM+NC over

a wide range of attack path lengths. But it still allows overwriting and as a result,

is a different mechanism from PPM+NC in which no overwriting takes place. Both

schemes always perform better than constant PPM scheme as we discussed in the

previous sections.


In this section, we test our proposed mechanisms, unequal PPM and PPM+NC in case

of a distributed attack scenario. First, we have assumed symmetric trees of degrees

2, 3, 4 and set the probability of marking to 0.04 in all cases (the same as that in

single path) and plotted the average number of packets required versus the number

of nodes in the tree for trees of different sizes in the same degree. For unequal PPM

33

0 20 40 60 80 100 120 1400

2000

4000

6000

8000

10000

12000

14000

Number of nodes in the tree

Ave

rage

num

ber

of p

acke

ts r

equi

red

Binary tree

PPMPPM+NC BPPM+NC Aunequal PPM Bunequal PPM A

Figure 4.2: Average number of packets needed for a binary attack tree.

scheme, the same as what we did in single path, we have considered the best case in

which p0 = 0. We have shown both schemes A and B described in Chapter 3 for both

unequal PPM (constructing the marking probabilities) and PPM+NC mechanisms.

The results are shown in Fig. 4.2, Fig. 4.3, and Fig. 4.4 for binary, ternary and degree

4 trees respectively.

For PPM+NC mechanism, when we focus on each figure separately, we can see

that both schemes A and B of PPM+NC show more benefit for trees with more

depth among the trees with the same degree. This is similar to the larger benefit of

PPM+NC for longer paths in case of a single path attack. Also, it is obvious that

scheme A performs much better than scheme B. On the other hand, if we consider

scheme A of PPM+NC in all figures for the same number of nodes, we see that it

presents more benefit for wider trees. It was expected because the number of children

of each node, and therefore the number of marks which may be written simultaneously

into the packet header, increases by an increase in the degree of the attack tree in

scheme A. As a result, PPM+NC shows more benefit for both deeper and wider trees

in general.

About the unequal PPM mechanism, it can be seen that in all trees of different

34

0 20 40 60 80 100 120 1400

2000

4000

6000

8000

10000

12000

14000


Ave

rage

num

ber

of p

acke

ts r

equi

red

Ternary tree


Figure 4.3: Average number of packets needed for a ternary attack tree.

0 20 40 60 80 100 120 1400

2000

4000

6000

8000

10000

12000

14000


Ave

rage

num

ber

of p

acke

ts r

equi

red

Degree4 tree


Figure 4.4: Average number of packets needed for a degree 4 attack tree.

degrees both of its schemes (A and B) always perform much better than PPM; they

perform even better than both schemes of PPM+NC in terms of the average number

of required packets although they allow overwriting and are different from PPM+NC

mechanism in nature. Such a good performance was expected for scheme A as a

direct result of the optimization in the assignment of marking probabilities; we also

conclude that scheme B performs well enough as a suboptimal assignment of marking

probabilities.

At the last step, we want to test our proposed mechanisms on a realistic tree based

35

0 10 20 30 40 50 60 70 80 900

2000

4000

6000

8000

10000

12000

Number of attackers

Ave

rage

num

ber

of p

acke

ts r

equi

red

Realistic tree

PPMPPM+NC BPPM+NC Aunequal PPM B

Figure 4.5: Average number of packets needed for a realistic attack tree.

Table 4.1: BRITE Topology Generator ParametersRouter Only Parameters

Model GLPNode Placement RandomGrowth Type Incremental

Preferential Connectivity Onm 1

on the power-law structure of Internet. We chose to use BRITE topology generator [1]

on Router only mode. We took use of the experiments performed in [8] to choose

the right parameters that produce realistic results. Table 4.1 contains the parameters

used to generate the random topology that simulates a real topology. Incremental

growth and preferential connectivity are chosen as intuitive mechanisms for topology

generation. Parameter m sets the number of links added per new node that effects

the average node degree of the generated topology.

Using the parameters specified above, first we generated a 150 node graph and

randomly chose one of the nodes with one edge as the victim. Then we constructed

a tree out of the generated graph using Dijkstra algorithm. In the resulted tree, we

found all one-edge nodes which present the total possible attackers. Then each time

we chose a specific number of attackers out of the total possible ones and reconstructed

36

the attack graph using PPM, PPM+NC and unequal PPM mechanisms.

For PPM+NC mechanism, we have simulated both schemes A and B; similarly to

the symmetric trees shown above, scheme A performs much better than scheme B

and constant PPM mechanism. However, for unequal PPM mechanism we have only

used scheme B for constructing the marking probabilities; the reason is that as we

discussed in Section 3.1.2, scheme B of unequal PPM can be used as a suboptimal

assignment of marking probabilities in any arbitrary tree structure. For each number

of attackers, the result represents the average over 100 independent simulation runs.

This is shown in Fig. 4.5.

It can be seen that in the same way as a symmetric tree, both schemes A and B

of PPM+NC mechanism show more benefit with increasing the number of attackers

and therefore the size of the attack graph in a realistic asymmetric tree. It is obvious

that the suboptimal scheme of unequal PPM (scheme B) shown in Fig. 4.5 does

not perform as well as the optimal scheme (scheme A) shown previously in case of

symmetric trees; but it still performs much better than constant PPM which confirms

that scheme B is a reasonable suboptimal assignment of marking probabilities.

37

Chapter 5

Conclusion

In this thesis, we revisited the IP traceback problem as a coupon collector’s problem,

and used the insights obtained to design two mechanisms: (i) unequal PPM, which

assigns marking probabilities so as to minimize the expected number of packets and

(ii) PPM+NC, which uses random linear combinations of node ids instead of over-

writing. We evaluated these mechanisms through analysis and simulation for single-

and multi-path scenarios and showed that they improve the tradeoffs of interest. In

future work, we plan to combine our mechanisms with existing marking schemes to

further improve the overall traceback performance.

38

Appendices

A The Coupon Collector’s Problem with Net-

work Coding

In [5], it has been shown that the average number of coupons required for the entire

collection, E[X], when network coding is used is Θ(n).

The proof uses the following lemma:

Lemma 1 assume Π1 and Π2 are two subspaces in Fnq and Π1 is not a subspace of

Π2. Assume that v is chosen uniformly at random from Π1. Pr(v ∈ Π2) > 1q

Proof. The proof can be given from the formula of conditional probability. Since

the total number of elements in Fnq is q, after choosing v from Π1, we have a new

space with less than q number of elements. Hence, Pr(v ∈ Π2|v ∈ Π1) > 1q.

We now give the proof for E[X] after applying network coding to the coupon

collector’s problem:

Proof. In lemma 1, let Π1 = Fnq and Πi

2 = the subspace spanned by the coding

vectors after i experiments. Let X i2 be the rank of Πi

2; by rank we mean the number

of boxes after i trials which contain distinct types of coupons. In other words, X i2 is

the size of basis for Πi2 and in general, X i

2 ≤ i. According to lemma 1, we also know

that Xn2 is the sum of n bernoulli random variables, each of which has probability of

success less than 1− 1q. Now we can write:

E[Xn2 ] = E[X1

2 + X22 + · · ·+ Xn

2 ] =n∑

i=1

E[X i2] (A.1)

Thus E[Xn2 ] < n(1− 1

q) which means that E[X] = Θ(n).

39

B Tail Analysis

In the thesis, we mainly focused on analyzing E[X]. However, one can also analyze

the entire distribution of X. In [14], P (T > n) is calculated where T is the number

of coupons which needs to be collected until one obtains a complete set of at least

one of each type assuming there are N different types of coupons and the selection

is performed independently and equally likely every time. If we follow the same

approach as in [14] for our traceback problem, we can find a model for the percentile

(e.g. 95 percentile) of the number of packets required.

In the same way as in [14], first we fix n and define the events A1, A2, . . . , Ad as

follows: Aj is the event that no packet with a mark from router j is contained among

the first n packets, j = 1, · · · , d. Hence

P (X > n) = P (d⋃

j=1

Aj)

=∑

j

P (Aj)−∑j1<j2

∑P (Aj1Aj2)

+ · · ·+ (−1)k+1∑ ∑j1<j2<···<jk

∑P (Aj1Aj2 · · ·Ajk

)

· · ·+ (−1)d+1P (A1A2 · · ·Ad) (B.2)

Now Aj will occur if none of the first n packets contains a mark of router j. Since

each packet will not have a mark from the jth router with probability 1 − pj where

pj is the perceived probability of observing a mark from the jth router and is equal

to p(1− p)j−1, by the assumed independency, we have

P (Aj) = (1− pj)n

In general, the event Aj1Aj2 . . . Ajkwill occur if none of the first n packets contains

a mark from either router j1 or j2 or . . . jk. Therefore, P (Aj1Aj2 · · ·Ajk) = (1− pj1 −

pj2−· · ·−pjk)n. Now if we replace these probabilities in Eq. (B.2), we get the general

40

formula for modeling the percentile in the traceback problem. In the overwriting

mechanisms, pj is different for every router j due to its distance from the victim and

is equal to p(1− p)j−1. Therefore, we can not go further than this general analytical

formula in the case of basic PPM scheme. However, for PPM+NC approach, the

perceived probability is the same for all routers and is (with a good approximation)

equal to p. Therefore, for n > 0 in the case of PPM+NC, Eq. (B.2) will be simplified

to:

P (X > n) = d(1− p)n −(

d

2

)(1− 2p)n

+

(d

3

)(1− 3p)n − · · ·+ (−1)d

(d

d− 1

)(1− (d− 1)p)n (B.3)

Which is simply equal to:

P (X > n) =d−1∑i=1

(d

i

)(1− ip)n(−1)i+1 (B.4)

As a result, if we calculate the probability in Eq. (B.4) for different values of n,

the 95th percentile for PPM+NC mechanism would be that value of n for which

the probability becomes equal to 0.05. In Eq. (B.4), d is the number of routers in

the attack path from each of which we want to have a mark (equivalent to the N

distinct types of coupons in the coupon collector problem) and p is the perceived

probability (equal to the marking probability) in PPM+NC mechanism. We have

shown the model given for PPM+NC mechanism in Fig. 3.6 and Fig. 3.7 together

with the simulations. It can be seen that the given model completely agrees with the

simulation results.

41

C Combining the Proposed Mechanisms with Pre-

vious Schemes

As we described in our assumptions in Section 2.2, in this thesis we mainly discussed

in terms of router ids (node sampling algorithm in [15]). We now want to discuss

about edge sampling algorithms [15] that mark with edge ids (XOR of two nodes)

instead of router ids. Let us focus on Song et al.’s AMS-I [16] and explain unequal

PPM and PPM+NC mechanisms using the same framework as that scheme. AMS-I

is similar to CEFS, but instead of encoding a router’s IP address into eight fragments,

one simply encodes a hash value of the router’s IP address. In fact, two independent

hash functions, h and h′, are used in the encoding of the routers’ IP addresses to

distinguish the order of the two routers in the XOR result while reconstructing the

attack paths. Both hash functions have 11-bit outputs; the output together with the

5-bit distance field is stored in the 16-bit IP identification field.

In the reconstruction procedure, the victim uses the upstream router map Gm as

a road-map and performs a breadth-first search from the root. The set of edge fields

marked with a distance d is denoted as Ψd and the set of routers at distance d from

the victim in the reconstructed attack graph is denoted as Sd. If we go through the

reconstruction procedure of AMS-I [16], we will find that the computational com-

plexity of this scheme is O(∑

d |Sd|.|Ψd+1|). This is much lower than that of CEFS

scheme which is O(∑

d |Sd|.|Ψd+1|8) resulted from splitting each router’s IP address

and redundancy information into eight fragments and marking with one of the eight

fragments selected at random for each packet.

It is obvious that one can simply combine the unequal PPM mechanism with any

edge id based PPM scheme e.g. AMS-I. There is only one modification needed in the

marking procedure; one needs to assign the optimized marking probabilities to differ-

ent routers instead of the constant marking probability for all routers. The number

42

of packets required by AMS-I is less than one eighth of that of CEFS scheme given

the same marking probability. Unequal PPM decreases this number even further and

minimizes it as we described in the previous sections. The reconstruction procedure

for unequal PPM, and thus the formula for its computational complexity, are the

same as those of AMS-I. However, |Ψd+1| decreases as a result of the decrease in the

average number of packets required by the unequal PPM scheme.

For the PPM+NC mechanism, one would need to make more modifications to

make the mechanism work as an edge id based scheme. If we regard edge ids as the

concatenation of two nodes, PPM+NC can simply be used similar to the mechanism

discussed for the router ids. However, edge id is defined as the XOR value of the

start and end fields [15]. Therefore, an additional protocol would be needed to make

the PPM+NC mechanism compatible with edge ids. One solution might be similar

to the random partial path encoding scheme proposed by Dean et al. [3]; one can

add an additional parameter, l, that represents the maximum length of an encoded

path. The value of this parameter is set by the marking router and reduced by

every participating router who adds in its IP address. The routers can add in their

information until the value reaches 0. Therefore, l = 1 would represent encoding of

edges between routers. As Dean et al. [3] described, this would be obtained at a cost

of adding dlog2(l + 1)e bits to the packets.

It is worth noting the computational complexity of PPM+NC mechanism on the

other hand. PPM+NC always requires one step prior to the reconstruction procedures

commonly used for PPM schemes. This prior step is to solve the system of linear

equations which is of computational complexity O(n3) that will be added to the

complexity of the reconstruction algorithm. However, PPM+NC reduces the number

of required packets which mitigates the effect of this additional complexity.

43

Bibliography

[1] Brite topology generator. http://www.cs.bu.edu/brite/.

[2] M. Adler. Tradeoffs in probabilistic packet marking for ip traceback. ACMSymposium Theory of Computing (STOC), pages 407–418, May 2002.

[3] D. Dean, M. Franklin, and A. Stubblefield. An algebraic approach to ip trace-back. In Proceedings of NDSS, February 2001.

[4] C. Fragouli, J. Y. LeBoudec, and J. Widmer. Network coding: an instant primer.ACM SIGCOM Computer Communication Review, January 2006.

[5] C. Fragouli and E. Soljanin. Network Coding: Fundamentals and Applications.Monograph in Foundations and Trends in Networking, 2007.

[6] C. Gkantsidis and P. R. Rodriguez. Cooperative security for network coding filedistribution. Technical Report accepted at Infocom, 2006.

[7] M. T. Goodrich. Efficient packet marking for large-scale ip traceback. 9th ACMConference on Computer and Communications Security (CCS’02), pages 117–126, November 2002.

[8] O. Heckmann, M. Piringer, J. Schmitt, and R. Steinmetz. How to use topologygenerators to create realistic topologies. Technical Report, December 2002.

[9] T. Ho, R. Koetter, M. M’edard, D. R. Karger, and M. Effros. The benefits ofcoding over routing in a randomized setting. In IEEE International Symposiumon Information Theory (ISIT), July 2003.

[10] M. Ma. Tabu marking scheme for ip traceback. Proceedings of IEEE InternationalParallel and Distributed Processing Symposium (IPDPS’05), 2005.

[11] J. Mirkovic, S. Dietrich, D. Dittrich, and P. Reiher. Internet Denial of Service:Attack and Defense Mechanisms. Prentice Hall Professional Technical Reference,2005.

[12] M. Mitzenmacher and E. Upfal. Probability and Computing, Randomized Algo-rithms and Probabilistic Analysis. Cambridge, 2005.

[13] K. Park and H. Lee. On the effectiveness of probabilistic packet marking for iptraceback under denial of service attack. In Proceedings of IEEE Infocom 2001.

[14] S. Ross. A first course in probability. Pearson Prentice Hall, Upper Saddle River,New Jersey, 2006.

[15] S. Savage, D. Wetherall, A. Karlin, and T. Anderson. Network support for iptraceback. IEEE/ACM Transactions on Networking, 9(3):226–237, June 2001.

44

[16] D. X. Song and A. Perrig. Advanced and authenticated marking schemes for iptraceback. Proceedings of IEEE Infocom 2001, pages 878–886.

45

Revisiting IP Traceback as a Coupon Collector’s...

Documents

Transcript of Revisiting IP Traceback as a Coupon Collector’s...