University of California, Irvine
Revisiting IP Traceback as a Coupon Collector’s Problem
Thesis
submitted in partial satisfaction of the requirements for the degree of
Master of Science
in Electrical and Computer Engineering
by
Pegah Sattari
Thesis Committee:
Professor Athina Markopoulou, Chair
Professor Hamid Jafarkhani
Professor Syed Ali Jafar
2007
© 2007 Pegah Sattari
The thesis of Pegah Sattari is approved:
Committee Chair
University of California, Irvine
2007
For my parents: Rohangiz Shojaei, Siavash Sattari.
Table of Contents
List of Figures
List of Tables
Acknowledgments
Abstract
1 Introduction
  1.1 Overview
  1.2 Related Work
2 Model and Motivation
  2.1 Model
  2.2 Key Ideas and Rationale
3 Proposed Mechanisms
  3.1 Unequal PPM: Optimal Marking Probabilities
    3.1.1 Single Path
    3.1.2 Multiple Paths
  3.2 PPM+NC: Using Network Coding
    3.2.1 Single Path
    3.2.2 Multiple Paths
4 Additional Evaluation
  4.1 Discussion of Costs
  4.2 Additional Simulation Results
    4.2.1 Single Path
    4.2.2 Multiple Paths
5 Conclusion
Appendices
  A The Coupon Collector’s Problem with Network Coding
  B Tail Analysis
  C Combining the Proposed Mechanisms with Previous Schemes
List of Figures
2.1 PPM over a single path of length d.
3.1 Comparison of the tradeoff for constant and unequal PPM.
3.2 Example of an attack tree with 7 nodes and 4 attackers.
3.3 Average number of packets needed for paths with varying d.
3.4 Average number of packets needed for a path with varying p.
3.5 Average number of packets needed for Algebraic and PPM+NC.
3.6 Average and 95th percentile for the number of packets required.
3.7 Average and 95th percentile for the number of packets required.
4.1 Comparison of all three schemes for single path.
4.2 Average number of packets needed for a binary attack tree.
4.3 Average number of packets needed for a ternary attack tree.
4.4 Average number of packets needed for a degree 4 attack tree.
4.5 Average number of packets needed for a realistic attack tree.
List of Tables
4.1 BRITE Topology Generator Parameters
Acknowledgements
I would like to thank Prof. Athina Markopoulou for her detailed comments and suggestions, which helped to substantially improve this work.
Abstract
Revisiting IP Traceback as a Coupon Collector’s Problem
By
Pegah Sattari
Master of Science in Electrical and Computer Engineering
University of California, Irvine, 2007
Professor Athina Markopoulou, Chair
Traceback schemes aim at identifying the source(s) of a sequence of network packets
and the nodes these packets traversed. This is useful, for example, for tracing the
sources of a Distributed Denial-of-Service (DDoS) attack [11]. The main idea is to
have intermediate nodes mark packets with information about their identity, and to
have the receiver use the information in the marked packets to reconstruct the paths. Past
work has designed probabilistic and algebraic traceback schemes and has explored the
tradeoff between the number of bits required for marking and the number of packets
required for reconstruction.
In this work, we use the insight that probabilistic traceback is essentially a coupon
collector’s problem to design two mechanisms that improve its performance. First, we
optimize the assignment of marking probabilities at intermediate nodes and show that
it improves the tradeoff between the number of packets and the work at intermediate
nodes, compared to schemes where nodes use the same marking probability; this idea
can be used together with any probabilistic marking scheme. Second, we propose
a network coding based marking approach that stores random linear combinations
of router ids; a special case of this approach advances algebraic traceback. Our
mechanisms can be combined with and complement previous marking schemes to
improve the overall traceback performance. We also provide performance models,
based on the coupon collector problem with unequal weights, that accurately capture
the performance of the proposed mechanisms as well as that of prior schemes.
Chapter 1
Introduction
1.1 Overview
Distributed Denial-of-Service (DDoS) attacks are one of the hardest problems on the
Internet today. During a DDoS attack, a large number of compromised hosts coordinate
and send unwanted traffic to the victim, thus exhausting the victim’s resources
and preventing it from serving its legitimate clients. For example, victims of DDoS
attacks can be companies that rely on the Internet for their business (in which case
DDoS attacks can result in severe financial loss or even in the company going out of
business), government sites and other organizations (in which case disruption of
operation results in a political or reputation cost). We are particularly interested
in bandwidth flooding attacks, which flood the victim’s access link with unwanted
traffic.
Several approaches and mechanisms have been proposed to deal with DDoS at-
tacks. In this work, we focus on IP traceback mechanisms that try to trace the
attacks back to their sources. Traceback in itself does not stop the attacks, but it is
an important piece of the puzzle. Many techniques for the IP traceback problem have
already been proposed. In this work, we are particularly interested in the family of
Probabilistic Packet Marking (PPM) schemes, which started with the work in [15] and
evolved into a number of improved marking schemes. In PPM, packets are marked
probabilistically with the IP addresses of the routers they traverse. The victim uses
this information in the marked packets to trace the attack back to its source. We
are also interested in Algebraic Traceback (AT) that uses algebraic techniques to
encode/decode information into/from packets.
Our key insight in this thesis is that probabilistic traceback is essentially a coupon
collector’s problem, where the coupons correspond to the router ids and the probabilities
of the coupons are affected by the marking probabilities at the routers. This
fundamental observation was already made in the original paper [15] and has been well
known since [12]. However, to the best of our knowledge, it has not been exploited so
far to optimize the performance. Instead, researchers focused on marking algorithms
and on improving the tradeoff between the number of bits in the packet header and
the number of packets needed to reconstruct the path [2, 13, 15, 16]. In this work,
we are interested in minimizing the number of packets to reconstruct the path, while
keeping the load at the routers and bits on the header low, by tuning parameters that
affect the coupon collector problem, namely the marking probabilities and network
coding.
Based on this insight, we propose two new mechanisms that can improve prob-
abilistic traceback. First, we optimize the assignment of marking probabilities to
improve the tradeoff between the number of packets and the work at intermediate
nodes; we develop an optimal unequal marking scheme that outperforms equal mark-
ing probability schemes. Second, inspired by recent developments in network coding,
we propose a network-coding based marking approach that stores random linear com-
binations of router ids; a special case of this scheme improves over algebraic traceback.
We provide accurate performance models based on the coupon collector problem with
unequal weights. We evaluate our techniques through analysis and simulations, for
both paths and trees, and we compare them to appropriate baseline mechanisms.
Moreover, our ideas are orthogonal to, and can therefore be combined with, previous
marking schemes to improve the overall traceback performance.
The rest of the thesis is organized as follows. In Section 1.2 we summarize the
related work. Chapter 2 discusses how the coupon collector’s problem can be used to
model probabilistic traceback and presents the key ideas. Chapter 3 presents the two
new mechanisms, namely unequal PPM (unequal Probabilistic Packet Marking) and
PPM+NC (Probabilistic Packet Marking with Network Coding), and evaluates them
through analysis and simulation for single and multipath scenarios. Chapter 4 dis-
cusses various costs and provides additional simulation results for realistic topologies.
Chapter 5 concludes the thesis and discusses future work.
1.2 Related Work
Many techniques for the IP traceback problem have already been proposed. Savage et
al. [15] first proposed the Compressed Edge Fragment Sampling (CEFS) scheme, which
divides each IP address and its hash into eight fragments; each router
probabilistically marks the IP packet with one of the eight fragments selected at
random. This approach works well for a single attacker, but suffers from high computation
overhead in the case of a Distributed Denial-of-Service attack, because a large number of
combinations of the fragments must be checked. Savage et al. were also the first to
reserve a distance field in each packet. When the packet arrives
at the victim, the distance field represents the number of hops traversed since the
content of the packet was sampled. By introducing this distance field, Savage et al.
minimized spoofing by the attacker, so that a single attacker (or the closest attacker
in a distributed attack) cannot forge any IP address between themselves and the
victim.
Song et al. [16] improved the efficiency and accuracy of reconstructing the attack
path under DDoS by predetermining the network topology and introducing new encoding
schemes. Their advanced and authenticated marking schemes feature high precision
and lower computation overhead for the victim to reconstruct the attack paths
under large scale DDoS attacks. In particular, their authenticated marking scheme prevents
a compromised router from forging the markings of other uncompromised routers.
However, they assumed that the victim has a map of its upstream routers. Although
they described how such maps can be obtained in practice, this assumption
is a drawback because such information could be difficult to obtain and maintain.
Dean et al. [3] proposed an algebraic approach that replaces the XOR-based marking
scheme with one that encodes the path information as points on polynomials. Their
scheme improves robustness over previous approaches, both for noise elimination and
distributed attack graph reconstruction. We discuss the algebraic
approach further in the evaluation of our proposed PPM+NC mechanism.
Goodrich [7] presented a new PPM-based traceback approach, which he called
randomize-and-link. The main idea of this scheme is to have each router mark with
a random fragment of its message together with a large checksum cord on its entire
message. The checksum cords make the reconstruction procedure much more efficient;
the scheme features fast and efficient traceback in the case of a distributed attack
without requiring prior knowledge of the topology of the attack tree.
Park et al. [13] and Adler [2] studied the tradeoffs for different parameters in
PPM schemes. Park et al. showed the tradeoff between the ability of the victim to
localize the attacker and the severity of the attack which is related to the marking
probability, attack path length, and traffic volume. They showed that PPM is effective
at localizing the attack source in case of a single path attack although it is always
possible for the attacker to send packets with spoofed IP addresses. However, PPM
faces more difficulties as the number of attack sources increases in the case of a DDoS
attack.
Adler introduced a new marking technique that requires only a single bit in the
packet header in the case of a single path attack. He showed the tradeoff between
b, the number of bits needed in the packet header, and the number of packets required.
He proved that the number of packets required for reconstructing the attack path
grows exponentially with n, but decreases doubly exponentially with b, where n is the
number of bits for representing the attack path. In the case of a multiple path attack,
he provided a lower bound on the number of bits needed as a function of the number
of attack paths. He also demonstrated a closely matching upper bound for some
restricted scenarios.
Ma [10] introduced the Tabu Marking Scheme (TMS), in which a router regards a
packet marked by an upstream router as tabu and does not mark it again. TMS
reduces the convergence time compared with Song et al.’s Advanced Marking Scheme
I (AMS-I) in the case of DDoS attacks with multiple flooding sources, but has the
same convergence time under a DoS attack with a single source. This comparison is
also valid for any other PPM scheme that allows overwriting, e.g. the CEFS scheme and
AMS-II.
Chapter 2
Model and Motivation
2.1 Model
Consider the path of length d shown in Fig. 2.1. Attacker A sends packets towards
the victim. In Probabilistic Packet Marking approaches (which we call PPM), each
intermediate node i along this path marks the packet, with probability p(i), with its
IP address.
As a first step, let us make the same assumption as Savage et al. in [15], where
each router makes an independent decision for marking the packet and there is only
space for one mark on the header, and routers overwrite previous marks: every packet
finally contains at most one router’s mark after traversing the entire attack path. The
marks on the received packets allow the victim to sample the routers on the path.
After receiving a sufficient number of packets, X, the victim obtains at least one
sample for every router in the attack path and can reconstruct the entire path.
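The overwriting model described above can be explored with a small Monte Carlo sketch. This is an illustration only, not the simulator used in the thesis; the function names, the trial count and the seed are hypothetical choices. Router 1 is taken to be the router next to the victim and router d the one next to the attacker.

```python
import random

def packets_until_path_known(marking_probs, rng):
    """Simulate one traceback run; return the number of packets the victim needs.

    marking_probs[i-1] is p(i) for router i, where router 1 is next to the
    victim and router d is next to the attacker (an indexing assumption).
    """
    d = len(marking_probs)
    seen = set()
    packets = 0
    while len(seen) < d:
        packets += 1
        mark = None  # None models a packet that arrives unmarked
        # The packet travels from router d down to router 1; a later router
        # that decides to mark overwrites whatever mark is already there.
        for router in range(d, 0, -1):
            if rng.random() < marking_probs[router - 1]:
                mark = router
        if mark is not None:
            seen.add(mark)
    return packets

def average_packets(marking_probs, trials=20000, seed=1):
    """Estimate E[X] by averaging over independent simulated runs."""
    rng = random.Random(seed)
    total = sum(packets_until_path_known(marking_probs, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    d = 5
    print("estimated E[X] for constant p = 0.2:", average_packets([0.2] * d))
```

Passing a list with different entries per router lets the same sketch be reused for non-constant marking probabilities.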
Traceback as a Coupon Collector’s Problem. Since the first paper on prob-
abilistic traceback [15], it has been observed that PPM resembles a coupon collector’s
problem [12]. In [15], it was assumed that all routers mark with the same probability
p and it was argued that the average number of packets required for the victim to
reconstruct a path of length d is bounded by:
\[ E[X] < \frac{\ln d}{p(1-p)^{d-1}} \tag{2.1} \]
The reason is that although router i marks with probability p(i) = p, its mark is
observed at the victim only if the routers between i and V do not overwrite it, which
happens with probability $p_i = p(i)\prod_{j=1}^{i-1}(1-p(j)) = p(1-p)^{i-1}$. We call p(i) the
marking probability and $p_i$ the perceived probability of node i. Clearly $p_i \ge p_d$ for all i,
thus the bound of Eq. (2.1).

[Figure 2.1: PPM over a single path of length d. The path runs from the attacker A through routers d, d-1, ..., 3, 2, 1 to the victim V; router i marks with probability p(i).]
Clearly, in overwriting schemes, the perceived probability depends on the distance
of that router from the victim: the further a router is from the victim, the less likely
that its marks will survive (not be overwritten) as the packet moves along the path.
Therefore, the coupon collector’s problem with unequal weights provides an accurate
model for the IP traceback problem [14]. In this version, there are n distinct coupons
and a type i coupon is obtained with probability pi, independent of the previous
coupons, where $\sum_{i=1}^{n} p_i = 1$. The average number of boxes required to collect all n
coupons in the unequal coupon collector’s problem is known to be [14]:
\[ E[X] = \int_0^\infty \Bigl(1 - \prod_{i=1}^{n}\bigl(1 - e^{-p_i x}\bigr)\Bigr)\,dx \tag{2.2} \]
In the IP traceback case, the coupons are the router ids and pi is the perceived
probability of observing a mark from the ith router. Clearly $p_i$ depends on the
marking probabilities p(j): $p_i = p(i)\prod_{j=1}^{i-1}(1-p(j))$.
One difference between the IP traceback problem and the coupon collector’s problem
is the “null coupon”. In the coupon collector’s problem, there are n distinct
types of coupons and $\sum_{i=1}^{n} p_i = 1$; in the traceback problem, there are d distinct
router IP addresses that need to be collected, plus a null event corresponding to
those packets that arrive without any mark. In other words, in traceback
there exist n = d + 1 coupons (d distinct router IP addresses and no mark), and the
number of coupons the victim requires to reconstruct the attack path is d. We assign
the probability $p_0$ to this null event: $\sum_{i=1}^{d} p_i + p_0 = 1$.
Therefore, if all routers mark with the same p(i) = p as in [15], the exact value for
the average number of packets the victim needs to observe to reconstruct the path is:
\[ E[X] = \int_0^\infty \Bigl(1 - \prod_{i=1}^{d}\bigl(1 - e^{-p(1-p)^{i-1}x}\bigr)\Bigr)\,dx \tag{2.3} \]
which is a better model than Eq. (2.1). Compared to Eq. (2.2), in Eq. (2.3) the
product goes from 1 to d, because only the d distinct IP addresses out of the total
d + 1 coupons must be collected, and $p_i = p(1-p)^{i-1}$.
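As a sketch of how Eq. (2.3) can be evaluated in practice: expanding the product inside the integral and integrating term by term turns it into a finite inclusion-exclusion sum over non-empty subsets of routers, which is exact and cheap for short paths. The function names below are hypothetical, and enumerating subsets is only one possible way to evaluate the integral (it grows as 2^d, so it suits small d).

```python
import math

def expected_packets(perceived):
    """Exact E[X] for collecting every coupon whose probability is in `perceived`.

    Expanding the product in Eq. (2.2)/(2.3) and integrating term by term gives
    E[X] = sum over non-empty subsets S of (-1)^(|S|+1) / sum_{i in S} p_i.
    All entries of `perceived` must be strictly positive.
    """
    # Each entry is (sum of p_i over a subset, sign of that subset's term).
    # The empty subset starts with sign -1 so that singletons get sign +1.
    subsets = [(0.0, -1.0)]
    for p in perceived:
        subsets += [(s + p, -sign) for s, sign in subsets]
    # Skip the empty subset (index 0) and add the terms with accurate summation.
    return math.fsum(sign / s for s, sign in subsets[1:])

def constant_ppm_perceived(p, d):
    """Perceived probabilities p_i = p (1 - p)^(i-1) for i = 1..d."""
    return [p * (1 - p) ** (i - 1) for i in range(1, d + 1)]

if __name__ == "__main__":
    d, p = 10, 0.04
    probs = constant_ppm_perceived(p, d)
    print("p_0  =", 1 - sum(probs))           # null coupon: packet arrives unmarked
    print("E[X] =", expected_packets(probs))  # Eq. (2.3) for constant PPM
```

The null coupon needs no special handling here: it simply accounts for the probability mass missing from the list of perceived probabilities.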
2.2 Key Ideas and Rationale
Our objective in this thesis is to minimize E[X] by tuning parameters that can affect
the behavior of the coupon collector’s model. In particular, we build on the following
key ideas:
• First, we observe that E[X] is minimum for equal weights of the coupons, $p_i = 1/n$.
We then construct marking probabilities p(j) that lead to those equal weights
$p_i$, for a single path and for multiple paths scenarios. We call this optimal
marking scheme unequal PPM and we discuss it in detail in Section 3.1.
• Second, we observe that recent developments in network coding [5] can further
reduce E[X] from n ln n + Θ(n) to Θ(n), by marking with random linear com-
binations of router ids instead of router ids themselves. We call this scheme
PPM+NC and we discuss it in detail in Section 3.2. In the special case that
all routers mark in PPM+NC (i.e. p(i) = 1) the scheme is comparable to but
outperforms Algebraic Traceback.
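The second idea can be illustrated with a toy experiment. The sketch below is a simplification under loudly stated assumptions, not the PPM+NC marking format: coefficients are drawn uniformly from GF(2), every packet is assumed to carry a combination of all n router ids (the p(i) = 1 case), and header size is ignored. It counts how many random combinations are needed before the received coefficient vectors have full rank, i.e. before all n ids can be solved for.

```python
import random

def packets_until_full_rank(n, rng):
    """Count random GF(2) combinations of n ids until they span all n ids."""
    basis = {}  # leading bit position -> stored vector with that leading bit
    packets = 0
    while len(basis) < n:
        packets += 1
        vec = rng.getrandbits(n)  # random coefficient vector, one bit per router id
        # Gaussian elimination over GF(2): XOR away leading bits already covered.
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec  # the packet was innovative and raises the rank
                break
            vec ^= basis[lead]
    return packets

def average_coded_packets(n, trials=5000, seed=1):
    """Estimate the mean number of coded packets needed for full rank."""
    rng = random.Random(seed)
    return sum(packets_until_full_rank(n, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    n = 16
    harmonic = sum(1 / k for k in range(1, n + 1))
    print("uncoded, equal weights (n * H_n):", n * harmonic)
    print("coded over GF(2), simulated     :", average_coded_packets(n))
```

In this toy setting the coded count stays within a small additive constant of n, whereas the uncoded count grows like n ln n, which matches the scaling claimed in the bullet above.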
Assumptions. Let us briefly discuss some of the assumptions behind the above
model.
• We assume that routers place a mark on the packet with their entire IP address
(or linear combinations of IP addresses in the PPM+NC scheme). In practice,
due to the limited number of available bits on the header, each router encodes
only partial information about its IP address using methods like fragmentation
and hashing. These important considerations are part of the marking scheme
design, which is orthogonal to - and can be combined with - the ideas in this
thesis. Therefore, we consider as baseline for comparison the best case of PPM
schemes: without bits limitation they would mark their entire IP address, and
this is an upper limit to their performance. We refer to the basic overwriting
scheme with the same marking probability on all routers as constant PPM or
simply PPM.
• Similarly, we present our discussion in terms of router ids (node sampling algo-
rithm in [15]). The same analysis holds for edge sampling algorithms [15] that
mark with edge ids (XOR of two nodes) instead of router ids. We are interested
in collecting the full set of router ids and not in their order, which is trivial
to recover in the case of edge ids, or can be inferred from the relative number of samples
per node.
• We assume that each router decides to mark a packet or not independently of
other routers; this is consistent with [15] and the rationale is that upstream
routers may have forged information.
• Our ideas extend from a single path to a multipath scenario, where several
attackers form an attack tree towards the victim. Multipath scenarios are con-
sidered throughout the thesis.
Other performance metrics of interest. We are interested in fast inference,
i.e. in reducing the number of packets X needed to reconstruct the path. So far, we
discussed only the computation of E[X]. More generally, one can compute the entire
distribution of X. For details on the computation of the tail probability P(X > n), please
see Appendix B.
Another metric of interest is the amount of work that routers are required to do
for marking. The average number of marks on all packets is $\bigl(\sum_{i=1}^{d} p(i)\bigr)\cdot E[X]$: there
are E[X] packets on average, and each packet is marked by router i with
probability p(i). We are interested in schemes that generate low load at the routers.
Finally, another important performance metric is the number of bits on the packet
header required for marking. The tradeoff between number of packets and number of
bits has been extensively explored in the literature and intelligent marking schemes
have been designed to achieve traceback within a tight bit budget. This tradeoff
is orthogonal to tuning the marking probability of PPM schemes (please note the
difference between marking probability and the content of the mark itself). However,
the bits requirement is relevant for network coding and is discussed in the relevant
Sections 3.2 and 4.1.
Chapter 3
Proposed Mechanisms
3.1 Unequal PPM: Optimal Marking Probabilities
As discussed in Section 2.1, when all routers mark packets with the same marking
probability p, the perceived probability of observing a mark from each router is
$p_i = p(1-p)^{i-1}$ for the ith router, which favors routers closer to the victim. An intuitive
idea is to assign higher marking probabilities to the routers further from the victim,
so as to balance out the above bias caused by overwriting. We are interested in
constructing a function that assigns marking probabilities p(i) to the routers according
to their distance from the victim i, so as to achieve the above goal. In particular, we
are interested in finding the optimal function p(i), i = 1...d that minimizes E[X], the
average number of packets required to reconstruct the path.
3.1.1 Single Path
If we consider only two routers, we can calculate E[X] from Eq. (2.2) for different
values of $p_i$, where $p_1 = p$ is the perceived probability of observing a mark from router
1 and $p_2 = 1-p$ is the perceived probability of observing a mark from router 2. First
we focus on making the probability of the null event, i.e. the perceived probability of
receiving no mark, equal to 0. That is why we chose p and 1 − p for the perceived
probabilities of routers 1 and 2 respectively, so that $\sum_{i=1}^{2} p_i = 1$, the same condition
as in the coupon collector’s problem with unequal weights. We
will discuss other values of $p_0$ at the end of this section. It is easy to see that E[X]
is minimized for $p_1 = p_2 = 1/2$.
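For completeness, the two-router case can be worked out explicitly from Eq. (2.2) with n = 2, $p_1 = p$ and $p_2 = 1-p$. The steps below are a routine calculation added for illustration:

```latex
\begin{align*}
E[X] &= \int_0^\infty \Bigl(1 - \bigl(1 - e^{-px}\bigr)\bigl(1 - e^{-(1-p)x}\bigr)\Bigr)\,dx
      = \int_0^\infty \Bigl(e^{-px} + e^{-(1-p)x} - e^{-x}\Bigr)\,dx \\
     &= \frac{1}{p} + \frac{1}{1-p} - 1, \\
\frac{dE[X]}{dp} &= -\frac{1}{p^2} + \frac{1}{(1-p)^2} = 0
      \;\Longrightarrow\; p = \tfrac{1}{2},
      \qquad E[X]\Big|_{p=1/2} = 2 + 2 - 1 = 3 .
\end{align*}
```

The second derivative $2/p^3 + 2/(1-p)^3$ is positive on (0, 1), so the stationary point is indeed the minimum.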
More generally, for a path with d routers, E[X] is minimum when the perceived
probability (the probability of observing a mark from a given router) is the same for
every router and equal to $p_i = 1/d$. Indeed, Eq. (2.2) is symmetric with respect to all
variables, i.e. $p_1, p_2, \dots, p_n$. And since $\sum_{i=1}^{n} p_i = 1$, $p_1 = p_2 = \cdots = p_n = 1/n$ is the answer to the
minimization problem of E[X] in Eq. (2.2).
From the optimal perceived probabilities $p_i = 1/d$, i = 1...d, we can now construct
the marking probabilities p(i), i = 1...d, that lead to the optimal $p_i$. We can calculate
these probabilities starting from the routers close to the victim, i = 1, 2, ..., d:
\begin{align*}
p_1 = p(1) \;&\Rightarrow\; p(1) = \frac{1}{d} \\
p_2 = p(2)(1-p(1)) \;&\Rightarrow\; p(2) = \frac{1}{d-1} \\
p_3 = p(3)(1-p(2))(1-p(1)) \;&\Rightarrow\; p(3) = \frac{1}{d-2} \\
&\;\;\vdots \\
p(d-2) = \frac{1}{3}, \quad p(d-1) &= \frac{1}{2}, \quad p(d) = 1
\end{align*}
This scheme assigns probability of marking 1 to the first router that receives the
packet and then the probability of marking gradually decreases as the router gets
closer to the victim. Interestingly, this result is similar to the reservoir sampling
problem [12]: the kth router that receives the packet in its path (router numbered as
d − k + 1), marks it with probability 1/k. This formula has an interesting practical
advantage: if a router knows that it is the kth router on the packet’s path, based
e.g. on a hop count or TTL (Time-to-Live), then it can configure itself to use the
optimal marking probability 1/k for each packet. If the optimal probability depended
on the number of routers after the current router, this would not be possible, and
an additional protocol/configuration would be needed to inform the routers what
marking probability to use.

[Figure 3.1: Comparison of the tradeoff for constant and unequal PPM. Log-log plot of E[X] (vertical axis) against (sum of p(i)).E[X] (horizontal axis), both ranging from 10^1 to 10^4, with one curve for constant PPM and one for unequal PPM.]
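The construction above can be checked mechanically. The following sketch uses exact rational arithmetic to confirm that marking with probability 1/(d − i + 1) at router i yields a perceived probability of exactly 1/d at every router. The helper names are hypothetical, and router 1 is taken to be the one closest to the victim, as in the derivation.

```python
from fractions import Fraction

def unequal_marking_probs(d):
    """Marking probability p(i) = 1/(d - i + 1) for router i = 1..d (1 = nearest the victim)."""
    return [Fraction(1, d - i + 1) for i in range(1, d + 1)]

def perceived_probs(marking):
    """p_i = p(i) * prod_{j<i} (1 - p(j)): router i marks and routers 1..i-1 do not overwrite."""
    perceived = []
    survive = Fraction(1)  # probability that routers 1..i-1 all leave the mark alone
    for p in marking:
        perceived.append(p * survive)
        survive *= 1 - p
    return perceived

if __name__ == "__main__":
    d = 14
    marking = unequal_marking_probs(d)
    print("marking  :", [str(p) for p in marking])
    print("perceived:", [str(p) for p in perceived_probs(marking)])
```

Read from the attacker's side, the k-th router a packet meets is router d − k + 1, so its entry in the list is 1/k, which is the hop-count rule described in the text.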
Let us now characterize the tradeoff between (i) the average number of packets re-
quired for reconstructing the attack path and (ii) the work at the intermediate routers
under this unequal marking scheme. As we mentioned above, p0, the probability of
observing no mark, has been assumed to be 0 until now. We now relax this
assumption and see how it affects the benefit we get from marking with unequal
probabilities instead of marking with a constant probability at all routers. For this
purpose, we fix the path length d and gradually change $p_0$ from 0 to 0.95. Then for
each value of $p_0$ we construct the marking probabilities from the following formula,
\[ p(i) = \frac{(1-p_0)/d}{\prod_{j=1}^{i-1}(1-p(j))} = \frac{1-p_0}{d-(i-1)(1-p_0)} \]
where the perceived probability is set to $p_i = (1-p_0)/d$ for all routers i = 1, 2, ..., d.
Note that in the special case of $p_0 = 0$, we get the expected result:
\[ p(i) = \frac{1/d}{\prod_{j=1}^{i-1}(1-p(j))} = \frac{1}{d-i+1} \]
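The closed form for general $p_0$ follows from one intermediate step, spelled out here for illustration. The shorthand c is introduced only for this derivation. The product over routers 1, ..., i−1 is the probability that none of them marks the packet; its complement is the probability that the packet arrives carrying the mark of one of those routers, which is the sum of their perceived probabilities:

```latex
\begin{align*}
c &:= \frac{1-p_0}{d}, \qquad
      p_i = p(i)\prod_{j=1}^{i-1}\bigl(1-p(j)\bigr) = c \quad (i = 1,\dots,d), \\
\prod_{j=1}^{i-1}\bigl(1-p(j)\bigr) &= 1 - \sum_{j=1}^{i-1} p_j = 1 - (i-1)\,c, \\
p(i) &= \frac{c}{1-(i-1)\,c} = \frac{1-p_0}{d-(i-1)(1-p_0)} .
\end{align*}
```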
We now want to compare this optimal marking scheme, which we call unequal
PPM to the constant PPM [15] that uses the same marking probability p at all routers.
Comparing the two schemes in terms of E[X] alone would not be fair, because they can
both decrease the number of packets by increasing the number of marks. Therefore,
we compare the two schemes in terms of the tradeoff $\bigl(E[X],\ (\sum_{i=1}^{d} p(i))\cdot E[X]\bigr)$ they
can achieve. We have used the analytical formula in Eq. (2.2) for calculating E[X]
with the corresponding perceived probabilities $p_i$ for each scheme and n replaced by
the appropriate number of coupons, which is d. This comparison is shown in Fig. 3.1
for d = 14.
For the constant PPM scheme, we vary the constant marking probability p in the
range 0.01-0.40 and obtain the blue curve in the figure. In the same way that p is
the only parameter over which we have control in the constant PPM scheme, p0 is
the only parameter which can be changed in the unequal PPM mechanism. We have
considered the range 0.01-0.40 for p and 0-0.95 for $p_0$ in this comparison. Elsewhere
in this thesis we consider marking probabilities in the range 0.01 to
0.04; for this comparison, however, we want to show that the optimized PPM
(which gives every router the same perceived probability, 1/d when $p_0 = 0$) improves the tradeoff
between the number of packets required and the work at the intermediate routers
over constant PPM for all values of p one might consider.
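The tradeoff points behind such a comparison can be recomputed with a short script. This is a sketch under stated assumptions rather than the code used for Fig. 3.1: E[X] is obtained from Eq. (2.2) by expanding the product into an inclusion-exclusion sum, the function names are hypothetical, and the sample values of p and $p_0$ are arbitrary picks from the ranges quoted above.

```python
import math

def expected_packets(perceived):
    """Exact E[X] from Eq. (2.2): sum over non-empty subsets S of (-1)^(|S|+1) / sum_{i in S} p_i."""
    subsets = [(0.0, -1.0)]  # (subset sum, sign); the empty subset is skipped below
    for p in perceived:
        subsets += [(s + p, -sign) for s, sign in subsets]
    return math.fsum(sign / s for s, sign in subsets[1:])

def constant_ppm_point(p, d):
    """Return (E[X], work) when every router marks with the same probability p."""
    perceived = [p * (1 - p) ** (i - 1) for i in range(1, d + 1)]
    ex = expected_packets(perceived)
    return ex, d * p * ex  # work = (sum of p(i)) * E[X]

def unequal_ppm_point(p0, d):
    """Return (E[X], work) for unequal PPM with null probability p0."""
    marking = [(1 - p0) / (d - (i - 1) * (1 - p0)) for i in range(1, d + 1)]
    ex = expected_packets([(1 - p0) / d] * d)  # all perceived probabilities are equal
    return ex, sum(marking) * ex

if __name__ == "__main__":
    d = 14
    for p in (0.01, 0.05, 0.10, 0.20, 0.40):
        ex, work = constant_ppm_point(p, d)
        print(f"constant p={p:.2f}: E[X]={ex:10.1f}  work={work:10.1f}")
    for p0 in (0.0, 0.25, 0.50, 0.75, 0.95):
        ex, work = unequal_ppm_point(p0, d)
        print(f"unequal p0={p0:.2f}: E[X]={ex:10.1f}  work={work:10.1f}")
```

Plotting the two lists of (work, E[X]) pairs on log-log axes gives curves of the kind the figure describes; the subset enumeration costs 2^d steps per point, which is still small for d = 14.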
When p in the constant PPM scheme goes to 1, all packets will have a mark
from the closest router to the victim and therefore, the victim will never be able to
reconstruct the attack path; i.e. E[X] goes to infinity. In other words, by increasing p
in the constant PPM scheme, after some point, both E[X] and $\bigl(\sum_{i=1}^{d} p\bigr)\cdot E[X]$ increase
and the curve moves to the right, as can be seen in the figure. However, in
the unequal PPM scheme, as we increase p0 or the probability of not receiving any
marks, E[X] increases, but the work at the intermediate routers decreases. Thus
the curve moves to the left. The figure shows that for a small area at the left,
E[X] resulting from unequal PPM exceeds that of constant PPM. However, this area
corresponds to very large values of $p_0$ ($p_0 > 0.9$). Over the range of interest,
i.e. small values of $p_0$, it can be seen that the optimized unequal PPM scheme always
performs better than the constant PPM scheme by moving the entire curve down.
In summary, the proposed unequal PPM mechanism improves the tradeoff between
the number of packets required and the work at intermediate nodes by optimizing the
assignment of probabilities of marking. It is worth noting that the unequal PPM
mechanism can be combined with and take advantage of any proposed reconstruction
algorithm for other PPM schemes that allow overwriting.
3.1.2 Multiple Paths
In the case of a DDoS attack, attackers send traffic towards the victim over an attack
tree rooted at the victim V . Each attack source {Ai} is located at a leaf node and
the attack path from {Ai} is the ordered list of routers between {Ai} and V that the
attack packet has traversed. For example, Fig. 3.2 shows a binary tree with 7 nodes in
which nodes {Ri} represent the routers and the {Ai} represent the flooding sources.
The attacker chooses one of the nodes out of the set of all possible attack sources for
each packet it sends. The choice of an attack source automatically determines the
path from that attack source to the victim. Probabilistic packet marking takes place
over the chosen path, and is the same as in the case of a single path attack. Therefore,
the multi-path problem is decomposed to a number of single-path problems, one for
each packet.
We now describe how to construct the optimal marking probabilities in the case of
a DDoS attack for tree topologies. In general, one should follow the same approach
as in single path to construct the probabilities of marking. The difference is that one
should also consider different probabilities of going through different routers in the
case of multiple attackers. If we denote the number of paths going through router i
by $N_i$ and the total number of paths in the tree by $N_{total}$, then we can write P(packet
goes through router i) = $N_i/N_{total}$.

[Figure 3.2: Example of an attack tree with 7 nodes and 4 attackers. The figure shows routers R1 to R7, attackers A1 to A4, and the victim V.]
In a symmetric tree of degree m (see footnote 1), e.g. a binary (m = 2), ternary (m = 3), or degree 4
(m = 4) tree, this probability has a simple expression and is equal to $1/m^{l_i-1}$ for each
router i in layer $l_i$, $l_i = 1, 2, \dots, L$, where L is the total number of layers in the tree
(e.g. L = 3 in Fig. 3.2) and the first layer is the one closest to the victim, containing
only one router. The perceived probability is:
P(observing a mark from router i) = P(router i marks the packet and the next
routers do not mark) · P(packet goes through router i)
We can now construct the marking probabilities for all routers from the formula
above similarly to what we did for the case of single path. We present two different
schemes for constructing the marking probabilities:
Scheme A: this scheme minimizes E[X]. In this scheme, one should set the perceived
probability formulated above to $p_i = 1/n$, $i = 1 \dots n$ ($p_0 = 0$ in the optimal case),
where n is the total number of nodes in the tree, and start constructing the marking
probabilities from the first layer. Following this procedure, we can show that
for a binary tree, the kth router in the path the packet traverses (any router in layer
L − k + 1) always marks the packet with probability $1/(2^k - 1)$, noting that any router
in layer L is the first router the packet may go through, and so forth. This means that
after the path is chosen for the packet to traverse, the first router in its path always
puts a mark on it (the same as in the single path case), and the probability of marking,
and therefore of overwriting the current mark on the packet, decreases according to this
formula for the routers closer to the victim. For a tree of degree 3 or 4, the marking
probability of the kth router in the attack path becomes $1/\sum_{j=0}^{k-1} 3^j$ or
$1/\sum_{j=0}^{k-1} 4^j$, respectively. From these formulas, it can be seen that, similar to
the case of a single path attack, the router needs to know the previous routers the packet
has traversed to decide with what probability it should mark.
¹ By a symmetric tree of degree m, we mean a tree in which every node that is not a leaf
node has exactly m children.
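The construction of scheme A can be checked with exact arithmetic. The following is a minimal sketch, assuming a symmetric tree in which every attack path runs from a router in layer L down to layer 1; the function names are illustrative and do not come from the thesis.

```python
from fractions import Fraction

def scheme_a_marking_probability(m, k):
    """Marking probability of the k-th router on the path (k = 1 is the first router)."""
    return Fraction(1, sum(m ** j for j in range(k)))

def perceived_probabilities(m, L):
    """Perceived probability of a mark from one router in each layer (layer 1 = victim side)."""
    q = {k: scheme_a_marking_probability(m, k) for k in range(1, L + 1)}
    perceived = {}
    for layer in range(1, L + 1):
        k = L - layer + 1                        # position of this router on the path
        survive = Fraction(1)
        for later in range(k + 1, L + 1):        # routers closer to the victim must not overwrite
            survive *= 1 - q[later]
        through = Fraction(1, m ** (layer - 1))  # P(packet goes through this router)
        perceived[layer] = q[k] * survive * through
    return perceived

for m in (2, 3, 4):
    L = 3
    n = sum(m ** j for j in range(L))            # total number of nodes in the tree
    print(m, n, perceived_probabilities(m, L))
```

Under these assumptions, every layer should come out with the same perceived probability 1/n, which is the condition scheme A imposes.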
Scheme B: in addition to the previous scheme, one might also think of a per path
optimization scheme where the probability of marking for every router depends on
which path the current attack packet has been sent through. Once the attacker
chooses a path for an attack packet to traverse, the probabilities of marking for the
routers in that path will be assigned exactly in the same way as in single path.
Therefore, a single router might mark with different probabilities for different attack
packets based on its placement in the path that packet is traversing. In fact a tree is
regarded as a concatenation of several single paths in this scheme and every time the
packet is sent through one of these paths, the routers in that path always mark with
probabilities 1, 1/2, 1/3, ... starting from the first router after the attacker. In the
case of a binary, ternary, degree-4 or any other symmetric tree, this per-path optimization
scheme results in a perceived probability equal to
$p_i = \frac{1}{L} \cdot \frac{1}{m^{l_i - 1}}$ for each router i in the tree,
$i = 1 \dots n$. For asymmetric trees in general, it results in a perceived probability
equal to 1/d for all routers in the path of length d the packet is traversing
(the same as in the single path case).
In fact, scheme A is the optimal assignment of marking probabilities and scheme B
is suboptimal. As we discussed above, scheme A is the direct result of minimization of
E[X] in case of multiple attackers. However, there is no guarantee that one can always
construct the marking probabilities such that the perceived probability is constant
for all routers in the tree and is equal to 1/n as scheme A describes. One can always
apply scheme A for an optimal assignment of marking probabilities in the symmetric
trees e.g. binary, ternary, and degree 4 trees discussed above.
However, for asymmetric trees this optimal assignment might not result in valid
values for the marking probabilities, i.e. values in [0, 1]. Moreover, we cannot find a
general closed form for the marking probability of the kth router in an asymmetric tree
like the one we found for symmetric trees. Therefore, in each asymmetric tree
one should start constructing the marking probabilities from the first layer using
the general formula mentioned previously, and even then one might not obtain valid
marking probabilities for all nodes.
In asymmetric attack trees, scheme B can be used as a heuristic assignment of
marking probabilities. We can formulate the average number of packets needed for
reconstructing the attack paths in each of these schemes using Eq. (2.2) with the
appropriate perceived probability $p_i$ we described for each scheme. As a result, we
will have the following formula for E[X] of scheme A:
\[ E[X] = \int_0^\infty \left(1 - \prod_{i=1}^{n} \left(1 - e^{-\frac{1}{n} x}\right)\right) dx \tag{3.1} \]
And for scheme B:
\[ E[X] = \int_0^\infty \left(1 - \prod_{i=1}^{n} \left(1 - e^{-\frac{1}{L} \cdot \frac{1}{m^{l_i - 1}} x}\right)\right) dx \tag{3.2} \]
In Eq. (3.1) and Eq. (3.2), the product is calculated over all nodes in the tree
($i = 1 \dots n$). E[X] required for both schemes will be shown in Section 4.2.2 through
simulations. From the results, we conclude that scheme B is a reasonable suboptimal
assignment of marking probabilities for the unequal PPM mechanism and substantially
improves on the constant PPM scheme.
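As a rough numerical check of Eq. (3.1) and Eq. (3.2), the integrals can be evaluated with a simple trapezoid rule. The following is a minimal sketch, not the code used for the results in this thesis; the function names, the step size and the truncation point of the integral are illustrative assumptions.

```python
import math

def expected_packets(probs, step=0.01, tail=40.0):
    """Trapezoid estimate of E[X] = int_0^inf (1 - prod_i (1 - exp(-p_i x))) dx."""
    upper = tail / min(probs)        # assumed cutoff: the integrand is negligible beyond it
    steps = int(upper / step)

    def integrand(x):
        prod = 1.0
        for p in probs:
            prod *= 1.0 - math.exp(-p * x)
        return 1.0 - prod

    total = 0.5 * (integrand(0.0) + integrand(steps * step))
    for k in range(1, steps):
        total += integrand(k * step)
    return total * step

def scheme_a_probs(m, L):
    """Perceived probabilities 1/n for all n nodes of a symmetric tree (Eq. 3.1)."""
    n = sum(m ** (l - 1) for l in range(1, L + 1))
    return [1.0 / n] * n

def scheme_b_probs(m, L):
    """Perceived probabilities (1/L) * (1/m^(l-1)) for every router in layer l (Eq. 3.2)."""
    probs = []
    for l in range(1, L + 1):
        probs += [1.0 / (L * m ** (l - 1))] * (m ** (l - 1))
    return probs

print(expected_packets(scheme_a_probs(2, 3)), expected_packets(scheme_b_probs(2, 3)))
```

With equal perceived probabilities 1/n, the integral reduces to the classical coupon collector value n(1 + 1/2 + ... + 1/n), which gives a simple sanity check for the scheme A case.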
3.2 PPM+NC: Using Network Coding
The second mechanism for improving PPM combines it with the idea of network
coding. We start from the coupon collector's problem. It is well known that
as the size of the collection increases, the probability that the next box contains a new
type of coupon decreases. In other words, most of the time is spent on collecting the
last few coupons. More recently, it has been observed that network coding can help
solve this problem. Storing linear combinations of coupons in each box, instead of
individual coupons, increases the chance that each new box contains a useful coupon.
One can prove that when network coding is used, the average number of coupons
required for the entire collection, E[X], decreases from $n \ln n + \Theta(n)$ to
$\Theta(n)$. The proof can be found in [5] and is given in Appendix A.
In the case of IP traceback, we consider the random linear network coding technique
combined with probabilistic packet marking. This means that once a router decides to
mark a packet, the current content of the packet is updated to
a linear combination of the new IP address and the previous IP addresses. In other
words, once a router decides to mark the packet, it chooses a coefficient randomly out
of a field $\mathbb{F}_{2^n}$ [4], multiplies its IP address by the coefficient, and adds
the result to the current content of the packet.
It is worth noting that the router does not need to know anything about the
previous routers or the complete path the packet traverses. At the end, every packet
contains a linear combination of several routers’ IP addresses together with a vector
of the random coefficients for those routers. The reconstruction process is similar to
solving a system of linear equations. Therefore, at least d packets are required for the
victim to reconstruct a path of length d because the matrix of the linear equations
must become full rank to be solvable.
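The marking and reconstruction steps can be sketched as follows. This is a simplified illustration rather than the implementation evaluated in this thesis: it swaps the field $\mathbb{F}_{2^n}$ for arithmetic modulo a prime Q, stores a full-length coefficient vector indexed by router position, and uses hypothetical function names and router addresses.

```python
import random

Q = 2**61 - 1  # prime modulus; an assumed stand-in for the field F_{2^n} used in the text

def mark_packet(addresses, p, rng):
    """One packet traversing the path: returns (coefficient vector, coded content)."""
    coeffs = [0] * len(addresses)
    content = 0
    for i, addr in enumerate(addresses):
        if rng.random() < p:                   # router i decides to mark
            c = rng.randrange(Q)               # random coefficient
            coeffs[i] = c
            content = (content + c * addr) % Q
    return coeffs, content

def reduce_and_insert(basis, row):
    """Add one equation to an echelon basis; returns True if the rank increased."""
    d = len(row) - 1
    row = row[:]
    for col in range(d):
        if row[col] == 0:
            continue
        if col in basis:
            factor = row[col]
            row = [(a - factor * b) % Q for a, b in zip(row, basis[col])]
        else:
            inv = pow(row[col], Q - 2, Q)      # modular inverse (Fermat's little theorem)
            basis[col] = [(a * inv) % Q for a in row]
            return True
    return False

def back_substitute(basis, d):
    """Solve the full-rank echelon system for the d router addresses."""
    x = [0] * d
    for col in range(d - 1, -1, -1):
        row = basis[col]
        acc = row[d]
        for j in range(col + 1, d):
            acc -= row[j] * x[j]
        x[col] = acc % Q
    return x

def traceback(addresses, p, seed=0):
    """Collect packets until the coefficient matrix has rank d, then decode."""
    rng = random.Random(seed)
    d = len(addresses)
    basis, packets = {}, 0
    while len(basis) < d:
        coeffs, content = mark_packet(addresses, p, rng)
        packets += 1
        reduce_and_insert(basis, coeffs + [content])
    return packets, back_substitute(basis, d)

path = [0x0A000001 + i for i in range(5)]      # hypothetical router addresses
print(traceback(path, p=1 / 25))
```

The loop stops as soon as the matrix is full rank, so the returned packet count can never be smaller than the path length d.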
We should note that with this definition, network coding in the traceback problem
is different from that in the coupon collector's problem. As mentioned earlier, in the
coupon collector's problem we can have any linear combination of any coupons in
every box, while in traceback each packet contains a linear combination of the IP
addresses of the last router that decided to mark it and of some of the routers before it.
It cannot contain an arbitrary linear combination of routers' IP
addresses. This means that we should not directly apply the result for network
coding in the coupon collector's problem, discussed above, to the case of IP
traceback.
3.2.1 Single Path
To give a model for applying network coding to the single path traceback problem,
we first calculate $p_i$, the perceived probability of observing a mark from router
i. We can write:
P(observing a mark from the ith router) = P(the ith router marks the packet) ·
P(obtaining a full rank matrix out of the linear combinations)
We can show that the second probability can be assumed to be 1, to a very good
approximation, using the lower bound on the success probability of a random network
code proposed in [9]. In fact, we are looking for the probability of obtaining a full
rank matrix out of the linear combinations, which makes the system of linear equations
solvable. Ho et al. prove that there is a certain probability of choosing linearly
dependent combinations under a random network code. This probability depends
on the field size $2^n$, but even for small field sizes, e.g. $2^8$, it becomes
negligible. Therefore, $p_i = p$ is a good approximation for the perceived probability
of observing a mark from the ith router in the PPM+NC mechanism. This will also
be confirmed by plotting both the simulation results and the analytical model
based on this $p_i$ in the rest of this thesis.
Figure 3.3: Average number of packets needed for paths with varying d. (Axes: path length
vs. average number of packets. Curves: simulations PPM, model PPM, simulations PPM+NC,
model PPM+NC.)
Figure 3.4: Average number of packets needed for a path with varying p. (Axes: probability
of marking the packets vs. average number of packets. Curves: simulations PPM, model PPM,
simulations PPM+NC, model PPM+NC.)
We can now replace $p(1 - p)^{i-1}$ in Eq. (2.3) by this $p_i$ value. As a result, the
average number of packets required in the PPM+NC mechanism will, to a very good
approximation, reduce to:
\[ E[X] = \int_0^\infty \left(1 - \prod_{i=1}^{d} \left(1 - e^{-px}\right)\right) dx \tag{3.3} \]
which represents the benefit of applying network coding in the single path IP traceback
problem.
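Since every factor in the product of Eq. (3.3) is identical, the integral can be evaluated in closed form. The derivation below is a standard calculation added for illustration and is not part of the original analysis; it uses the substitution $u = 1 - e^{-px}$ and writes $H_d$ for the dth harmonic number.

```latex
\begin{align*}
E[X] &= \int_0^\infty \left(1 - \left(1 - e^{-px}\right)^d\right) dx
      = \frac{1}{p} \int_0^1 \frac{1 - u^d}{1 - u}\, du \\
     &= \frac{1}{p} \sum_{k=0}^{d-1} \int_0^1 u^k \, du
      = \frac{1}{p} \sum_{k=1}^{d} \frac{1}{k}
      = \frac{H_d}{p}.
\end{align*}
```

For example, with d = 14 and p = 0.04 this gives $H_{14}/0.04 \approx 81.3$ packets.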
Fig. 3.3 shows the average number of packets required to reconstruct paths of
varying lengths over 500 random test runs for each length value. We have considered
paths of length 1 to 31 hops, and the marking probability p is set to 1/25. The figure
shows the models as well as the simulations for both PPM and PPM+NC mechanisms.
The model for PPM is given in Eq. (2.3) and the model for PPM+NC is given in
Eq. (3.3). We have calculated the analytical formulas in Eq. (2.3) and Eq. (3.3) for
path lengths less than or equal to d = 16. The results confirm that the given models
agree with the simulation results for both PPM and PPM+NC schemes.
As we discussed above, the coefficients for the PPM+NC mechanism can even be
selected out of a small field. Here we have assumed a very small field of
size $2^2 = 4$. Each realization terminates once the victim observes a full rank matrix
of the random coefficients. The figure shows that even for such a small field size,
the model agrees with the simulation results. Furthermore, PPM+NC
performs much better than the basic PPM scheme, which allows overwriting.
The reason is that when the path is long enough, independent linear combinations
of routers' IP addresses are very likely to be obtained from different packets even
when the coefficients are chosen out of a very small field. In other words, although
larger fields offer more benefit in random linear network coding schemes in general,
the path lengths in most traceback problems are large enough that, given the large
number of distinct router IP addresses, linearly dependent combinations are unlikely
even with few options for each randomly chosen coefficient. This is visible in the
figure as well: the benefit of PPM+NC over PPM is larger for longer paths than for
shorter ones. Even for short paths, the effect of the small field size is small
according to Ho et al.'s theorem and can be neglected. We have therefore chosen a
small field of size $2^2 = 4$ in all our experiments on the PPM+NC mechanism.
Fig. 3.4 shows the same comparison as Fig. 3.3 but with a fixed path length
(d = 14) and for different probabilities of marking (0.01-0.04). We have considered
the condition $p \le 1/d$ for the optimum results [15].
We now want to compare the PPM+NC mechanism to the algebraic approach proposed
in [3]. Dean et al. also reconstruct the attack path by solving a system of linear
equations. Their approach is very similar to our proposed PPM+NC scheme; the main
difference is that in the algebraic approach, there is a random packet id $x_j$ for each
attack packet, and the matrix of the final linear equations looks like a Vandermonde
matrix (if the packet ids are distinct), with different powers of the pre-selected packet
id as coefficients for different router IP addresses. In PPM+NC, by contrast, there is no
pre-selected id: each time a router decides to mark the packet with some probability p, it
chooses a random coefficient and adds the product of that coefficient and its
IP address to the current content of the packet. Therefore, the final matrix will have
different coefficients, selected uniformly at random out of a field, for different routers.
Figure 3.5: Average number of packets needed for Algebraic and PPM+NC. (Axes: field size
vs. average number of packets. Curves: simulations Algebraic, simulations PPM+NC.)
We claim that the PPM+NC scheme performs better than the algebraic approach in
terms of the average number of packets required for reconstructing the attack path,
especially over small field sizes. The reason is that although, with ids chosen from a
large field, one obtains d distinct packet ids, and therefore a full rank Vandermonde
matrix, after sending d packets, for smaller field sizes one would need more than
d packets in the algebraic approach. Clearly, in the algebraic approach the
field must have at least d elements. Also, all routers mark the
packet going through them, i.e. p = 1. We can prove that the expected number of packets
required to obtain d distinct packet ids out of a field of size q is equal to:
\[ E[X] = q \sum_{k=q-d+1}^{q} \frac{1}{k} \tag{3.4} \]
Proof. It can be viewed as a coupon collector's problem with the packet ids
representing the coupons. Let X be the number of packet ids chosen out of the field
until d distinct packet ids are obtained. If $X_i$ is the number of packet ids chosen
while one had exactly i − 1 distinct packet ids, clearly $X = \sum_{i=1}^{d} X_i$. Each
$X_i$ is a geometric random variable with probability $p_i = \frac{q - (i-1)}{q}$. Hence,
\[ E[X_i] = \frac{1}{p_i} = \frac{q}{q - (i-1)} \]
And using the linearity of expectations,
\[ E[X] = \sum_{i=1}^{d} E[X_i] = q \sum_{k=q-d+1}^{q} \frac{1}{k} \]
This proves Eq. (3.4).
But in PPM+NC, if we follow the same marking procedure as that of [3] and let
all routers mark with probability 1, even for very small field sizes the victim almost
always needs to receive only d packets to obtain a full rank matrix and solve the system
of linear equations. This comes from the randomly selected coefficients in the PPM+NC
scheme and is shown in Fig. 3.5. In this figure d is set to 14. Since the field size
$2^2 = 4$ is not even possible for the algebraic approach with d = 14, we have considered
fields of sizes $2^4$ to $2^{10}$.
The figure shows that when we use the same probability of marking as in
the algebraic approach (p = 1) for our proposed PPM+NC mechanism, the average
number of packets required for reconstructing the attack path of length d does not
depend on the size of the field out of which the coefficients are selected randomly.
Even for small field sizes, PPM+NC almost always needs d packets to get a full rank
matrix. However, the number of packets needed in the algebraic approach depends
on the size of the field out of which the packet ids are selected randomly, and only for
large field sizes does it require d packets to obtain a full rank Vandermonde matrix.
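The two behaviours can be compared numerically. The sketch below evaluates Eq. (3.4) for the algebraic approach and, for PPM+NC with p = 1, a standard rank-growth estimate that is not derived in this thesis: it assumes every packet carries a coefficient vector drawn uniformly from $\mathbb{F}_q^d$, so that a packet raises the rank from r to r + 1 with probability $1 - q^{r-d}$. The function names are illustrative.

```python
def algebraic_expected_packets(q, d):
    """Eq. (3.4): expected packets until d distinct packet ids appear in a field of size q."""
    if d > q:
        raise ValueError("the field must contain at least d elements")
    return q * sum(1.0 / k for k in range(q - d + 1, q + 1))

def random_coefficient_expected_packets(q, d):
    """Expected packets until uniformly random vectors over F_q reach rank d (assumed model)."""
    return sum(1.0 / (1.0 - float(q) ** (r - d)) for r in range(d))

d = 14
for n in range(4, 11):                 # field sizes 2^4 to 2^10, as in Fig. 3.5
    q = 2 ** n
    print(q,
          round(algebraic_expected_packets(q, d), 2),
          round(random_coefficient_expected_packets(q, d), 2))
```

Under this assumed model the random-coefficient count stays close to d for all listed field sizes, while Eq. (3.4) approaches d only as q grows.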
Let us discuss further the analytical model we proposed for PPM+NC at
the beginning of this section. Ho et al. [9] consider a feasible multicast connection
problem with independent or linearly correlated sources and N receivers, where the
components of local coding vectors are chosen independently and uniformly at random
over a finite field $\mathbb{F}_q$ with q > N. The probability that all N receivers can
decode the source processes is at least $(1 - N/q)^{\nu}$, where ν is the maximum number
of links receiving signals with the random coefficients in any set of links from all
sources to any receiver. In other words, ν is the number of coding points that are
encountered in all paths from the source to any receiver, maximized over all receivers [5, 9].
In the traceback problem, there exists one receiver (the victim), which needs to
decode the routers' IP addresses from the linear combinations it receives. Instead
of the links receiving signals with independent randomized coefficients in Ho et al.'s
theorem, here the linear combinations of routers' IP addresses are stored in the attack
packets.
Clearly, one should replace N in the formula for the success probability by 1 in
the case of a single path attack. However, it is not straightforward to determine ν
in the traceback problem. The reason is that the number of links in the network is
known for a multicast connection problem, while in traceback the number of packets
required for the victim to reconstruct the attack path is not known from the beginning.
Indeed, the number of packets required, and as a result the probability of success in
the traceback problem, is a function of the marking probability p, the path length d and
the field size q.
One can obtain a bound on the number of packets by calculating the total number
of possible linear combinations of routers' IP addresses, which would be
$\sum_{i=0}^{d} \binom{d}{i} p^i (1-p)^{d-i} q^i$. However, this is much larger than the
actual number of packets needed for reconstructing the attack path in most cases. As we
discussed at the beginning of this section, for the range of marking probabilities we
usually consider for the traceback problem (0.01-0.04), and given the usually large
enough attack path length d, approximating the success probability, i.e. the probability
of obtaining a full rank matrix, by 1 is reasonable even when we choose the coefficients
out of a small field, as can be seen in Fig. 3.3 and Fig. 3.4.
We now want to consider the tradeoff between the number of bits needed and the
number of packets required for both algebraic and network coding approaches. By
splitting a router's IP address into c chunks, the number of bits in the packet header
required by the algebraic approach would be equal to
$\log_2 2^{\lceil 32/c \rceil} + \lceil \log_2 d \rceil + \lceil \log_2 c \rceil$, or
even $\log_2 2^{\lceil 32/c \rceil} + \lceil \log_2 d \rceil$ when each router substitutes
each chunk of its IP address in order [3]. Here d represents the attack path length. For
example, when c = 4, one would need 15 bits or, in the best case, 13 bits per packet and
4d packets for a path of length 16 in the algebraic approach.
In the PPM+NC mechanism, a vector of the randomly chosen coefficients is stored in
the packet along with the total content of the packet, which is the linear combination of
IP addresses. Assuming a field of size $2^{32}$, this would require
$\log_2 2^{32} + \lceil n_{avg} \cdot \log_2 2^2 \rceil$ bits in the packet header. We can
trade off the number of bits for the average number of packets required by dividing a
router's IP address into c chunks and adding $\lceil \log_2 c \rceil$ bits that represent
the offset of the chunk. In this way, we can reduce the field size
to $2^{\lceil 32/c \rceil}$. Therefore, the number of bits needed by the PPM+NC mechanism
will be equal to:
\[ \log_2 2^{\lceil 32/c \rceil} + \lceil n_{avg} \cdot \log_2 2^2 \rceil + \lceil \log_2 c \rceil \tag{3.5} \]
which can further be simplified to
$\log_2 2^{\lceil 32/c \rceil} + \lceil 2 n_{avg} \rceil + \lceil \log_2 c \rceil$. Here
$n_{avg}$ is the average number of coefficients stored as the elements of a vector in each
packet. In other words, it is the average number of routers that have put their marks with
randomly selected coefficients on one packet.
$n_{avg}$ depends on the path length d and the probability of marking p and is equal
to $n_{avg} = dp$. Since p is usually a small value, e.g. 0.04, the average number of marks
on each packet would be small, e.g. 1.24 for a large path length of 31. On the other
hand, it is assumed that the random coefficients have been selected out of a field of
size $2^2 = 4$. Therefore, the number of bits needed by Eq. (3.5) is in the order of the
number of bits required by the best case of the algebraic approach.
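Eq. (3.5) is easy to evaluate directly. The following is a minimal sketch with a hypothetical function name; the 2-bit coefficients and 32-bit addresses are the values assumed in the text above.

```python
import math

def ppm_nc_header_bits(c, d, p, coeff_field_bits=2, address_bits=32):
    """Eq. (3.5): coded chunk + coefficient vector + chunk offset, in bits."""
    n_avg = d * p                                      # average number of marks per packet
    chunk_bits = math.ceil(address_bits / c)           # log2 of the field size 2^ceil(32/c)
    coeff_bits = math.ceil(n_avg * coeff_field_bits)   # ceil(n_avg * log2(2^2))
    offset_bits = math.ceil(math.log2(c))              # offset of the chunk
    return chunk_bits + coeff_bits + offset_bits

print(ppm_nc_header_bits(c=4, d=16, p=0.04))
```

With c = 4, d = 16 and p = 0.04, the sketch evaluates to 12 bits, the value quoted in Section 4.1.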
We have also compared the average number of packets required for reconstructing
the attack path with the 95th percentile of the number of packets. The 95th
percentile is obtained by using the empirical cdf. First we found the empirical cdf of
the values obtained from every realization for the number of packets required. Then,
we could find different percentiles, e.g. the 95th percentile, of the number of packets
required from the distribution. The figures are plotted versus both path length and
probability.
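The procedure can be sketched for the basic overwriting PPM scheme as follows. This is an illustrative simulation, not the code used to produce the figures; the function names, the random seed and the percentile convention (the smallest sample whose empirical cdf reaches the requested level) are assumptions.

```python
import math
import random

def ppm_packets_needed(d, p, rng):
    """Packets until the victim has seen a mark from each of d routers (overwriting PPM)."""
    seen = set()
    packets = 0
    while len(seen) < d:
        packets += 1
        mark = None
        for router in range(d):        # router 0 is the farthest from the victim
            if rng.random() < p:
                mark = router          # routers closer to the victim overwrite earlier marks
        if mark is not None:
            seen.add(mark)
    return packets

def empirical_percentile(samples, level):
    """Smallest sample value whose empirical cdf is at least `level` (0 < level <= 1)."""
    ordered = sorted(samples)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]

rng = random.Random(1)
runs = [ppm_packets_needed(d=14, p=0.04, rng=rng) for _ in range(500)]
print(sum(runs) / len(runs), empirical_percentile(runs, 0.95))
```

Each run is one realization; the mean and the 95th percentile are then read off the same set of 500 samples, as described above.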
Fig. 3.6 shows the 95th percentiles as well as the average number of packets needed
for reconstructing paths of varying length from 1 to 31 hops. The marking probability
p is set to 1/25. Both percentile and average are presented for both PPM and
PPM+NC mechanisms. As the path length increases, both the 95th percentile
and the average increase, and the 95th percentile curve is always above the average curve
for both mechanisms. Fig. 3.7 shows the same comparison versus the marking probability.
The attack path length is set to 14 and the considered range for probabilities of marking
is 0.01-0.04. The analysis of the tail distribution can be found in Appendix B of this
thesis.
Figure 3.6: Average and 95th percentile for the number of packets required. (Axes: path
length vs. number of packets required. Curves: simulations 95% PPM, simulations 95%
PPM+NC, model 95% PPM+NC, simulations mean PPM, simulations mean PPM+NC.)
Figure 3.7: Average and 95th percentile for the number of packets required. (Axes:
probability of marking vs. number of packets required. Curves: simulations 95% PPM,
simulations 95% PPM+NC, model 95% PPM+NC, simulations mean PPM, simulations mean PPM+NC.)
3.2.2 Multiple Paths
In the case of a distributed attack, we can apply the random linear network coding
technique to probabilistic packet marking in the same way as we did for a single
path attack. As we mentioned in Section 3.1.2, the attacker chooses one of the nodes
out of the set of all possible nodes for each packet it sends. The choice of the attack
source automatically determines the path from that attacker to the victim. Similar
to the two schemes for constructing marking probabilities in the unequal PPM
mechanism described in Section 3.1.2, one can think of two different PPM+NC schemes:
Scheme A: This scheme takes the topology of the attack tree into account in the
PPM+NC mechanism. In this scheme, every node maintains a list of its upstream
routers (children) in the tree. Once the node receives a packet from any of its
children, it remembers that child's IP address for the next attack packets. One can
summarize scheme A as follows: after the path is determined for the attack packet to
traverse, each node in the attack path decides to mark the packet going through it
with some probability p (PPM). However, with the idea of network coding, each time
the node decides to mark, it marks not only with its own IP address, but also with
the IP addresses of those of its children that it remembers from previous attack
packets that went through them. Therefore, the general
idea is the same as in the single path case, with the difference that once a router
decides to mark a packet, several marks with several randomly selected coefficients may be
put on the packet at the same time, instead of only one mark per decision as in a
single path attack.
Scheme B: This scheme is, similarly to scheme B of unequal PPM in Section 3.1.2,
a per-path scheme. Once a path is selected for the attack packet to
traverse, the PPM+NC mechanism takes place in that path in the same way as in a single
path attack. Therefore, once a router in the attack path decides to mark the packet,
it chooses a coefficient randomly out of a field $\mathbb{F}_{2^n}$, multiplies its IP
address by the coefficient, and adds the result to the current content of the packet. The
tree is regarded as a concatenation of several single paths in this scheme, and it is
assumed that each router does not know anything about the IP addresses of its children.
Both PPM+NC schemes are plotted in Section 4.2.2. It can be seen that scheme
A gives more benefit in terms of the average number of packets required for
reconstructing the attack paths, as expected from its description.
We can also give a model for the PPM and PPM+NC mechanisms in the case of
multiple attackers in a tree topology. It is sufficient to calculate the appropriate $p_i$
for both mechanisms and substitute it in Eq. (2.2). We gave the formula of $p_i$ for an
overwriting scheme in a distributed attack scenario in Section 3.1.2:
P(observing a mark from router i) = P(router i marks the packet and the next
routers do not mark) · P(packet goes through router i)
The first probability is equal to $p(1-p)^{n_{next}(i)}$, where $n_{next}(i)$ is the
number of routers that are after router i in the path the packet traverses. For example,
in a binary tree, the number of next routers for each router i in layer $l_i$ is simply
equal to $l_i - 1$. The second probability was explained in Section 3.1.2. For the PPM+NC
mechanism, in its second scheme (scheme B), $p_i$ can be calculated as follows:
P(observing a mark from router i) = P(router i marks the packet) · P(obtaining a
full rank matrix out of the linear combinations) · P(packet goes through router i)
The first probability is simply equal to p. By the same argument as in the
case of a single path (Section 3.2.1), the second probability can be assumed to be
approximately 1. Therefore, in the case of a distributed attack, the PPM mechanism and
scheme B of the PPM+NC mechanism can be modeled by Eq. (2.2) with the $p_i$ values
specified above.
Chapter 4
Additional Evaluation
4.1 Discussion of Costs
In general, one should consider three main cost factors in the traceback problem.
The first one is the average number of packets required for the victim to reconstruct
the attack path. We discussed this factor for both unequal PPM and PPM+NC
mechanisms in the previous sections and observed that both schemes reduce the
average number of packets needed compared to the PPM mechanism. We want to focus
on the other two factors in this section. One of them is the number of bits in the
packet header each mechanism requires. The last one is the work at the intermediate
routers, which is related to the number of marks they put on the attack packets.
First we should note that for the unequal PPM mechanism, the number of bits
needed in the packet header is not different from that in the PPM scheme, because
unequal PPM is based on optimizing the assignment of marking probabilities
and does not affect the number of bits required. Therefore, any technique that has
been proposed for reducing the number of bits needed in the packet header for any
PPM scheme that allows overwriting can be applied to the unequal PPM mechanism as
well. In the same way, we do not need to consider the work at the intermediate nodes
for the PPM+NC mechanism. The reason is that we did not make any changes to the
constant probability of marking p in our proposed PPM+NC mechanism.
As a result, we only discuss the number of bits needed by PPM+NC and
the work at the intermediate routers in unequal PPM. In Section 3.2.1, we gave the
number of bits that need to be allocated to the PPM+NC scheme in Eq. (3.5). We
compared it to the number of bits required by the algebraic approach and showed
that they are similar. For example, when c = 4, we would need a total of 12 bits in
the packet header for a path of length d = 16. We discussed the work at intermediate
nodes for the unequal PPM scheme in Section 3.1.1 and, from Fig. 3.1, we concluded that
the unequal PPM mechanism improves the tradeoff between the number of packets and
the work at the intermediate nodes over the constant PPM scheme.
We should discuss another issue of the PPM+NC mechanism. Network coding is
inherently susceptible to jamming attacks; interestingly, the same applies to algebraic
traceback. In PPM+NC, this problem comes into play because each router adds its
IP address multiplied by a random coefficient to the current content of the packet
regardless of what the packet contains. The victim reconstructs the attack path by solving
a system of linear equations resulting from the linear combinations of IP addresses in
the packets it has received. Thus, it has no way to tell whether it is decoding
the right information or not. Some solutions have been proposed to the problem
of jamming attacks in network coding, which may be of interest in the
traceback problem as well [6].
4.2 Additional Simulation Results
4.2.1 Single Path
We have included our simulation results along with our proposed schemes in the
previous sections. We now want to compare our two mechanisms, unequal PPM and
PPM+NC, to the constant PPM scheme in one figure. Fig. 4.1 shows such a comparison.
In this figure the constant probability of marking p is set to 0.04 for the constant
PPM mechanism and the path length varies from 1 to 31. Each path length result
represents the result of 500 independent simulation runs. We have simulated the best
possible unequal PPM scheme, in which $p_0$, the probability of observing no mark, is
set to 0; i.e. the perceived probability of observing a mark from each router is set to
1/d for an attack path of length d.
Figure 4.1: Comparison of all three schemes for single path. (Axes: path length vs.
average number of packets. Curves: PPM, PPM+NC, Unequal PPM.)
We can see that the unequal PPM scheme performs much better than constant PPM
in terms of the average number of packets needed for reconstructing the attack path.
This follows from the optimization of the marking probabilities in our proposed
unequal PPM mechanism. Unequal PPM even performs better than PPM+NC over
a wide range of attack path lengths. However, it still allows overwriting and as a result
is a different mechanism from PPM+NC, in which no overwriting takes place. Both
schemes always perform better than the constant PPM scheme, as we discussed in the
previous sections.
4.2.2 Multiple Paths
In this section, we test our proposed mechanisms, unequal PPM and PPM+NC, in
a distributed attack scenario. First, we have assumed symmetric trees of degrees
2, 3, 4 and set the probability of marking to 0.04 in all cases (the same as in the
single path case), and plotted the average number of packets required versus the number
of nodes in the tree for trees of different sizes with the same degree. For the unequal
PPM scheme, as in the single path case, we have considered the best case in
which $p_0 = 0$. We have shown both schemes A and B described in Chapter 3 for both
unequal PPM (constructing the marking probabilities) and PPM+NC mechanisms.
The results are shown in Fig. 4.2, Fig. 4.3, and Fig. 4.4 for binary, ternary and
degree-4 trees respectively.
Figure 4.2: Average number of packets needed for a binary attack tree. (Axes: number of
nodes in the tree vs. average number of packets required. Curves: PPM, PPM+NC B,
PPM+NC A, unequal PPM B, unequal PPM A.)
For the PPM+NC mechanism, when we focus on each figure separately, we can see
that both schemes A and B of PPM+NC show more benefit for deeper trees
among trees of the same degree. This is similar to the larger benefit of
PPM+NC for longer paths in the case of a single path attack. Also, scheme A clearly
performs much better than scheme B. On the other hand, if we consider
scheme A of PPM+NC in all figures for the same number of nodes, we see that it
presents more benefit for wider trees. This was expected because, in scheme A, the number
of children of each node, and therefore the number of marks which may be written
simultaneously into the packet header, increases with the degree of the attack tree.
As a result, PPM+NC shows more benefit for both deeper and wider trees
in general.
Regarding the unequal PPM mechanism, it can be seen that in all trees of different
degrees both of its schemes (A and B) always perform much better than PPM; they
perform even better than both schemes of PPM+NC in terms of the average number
of required packets, although they allow overwriting and are different from the PPM+NC
mechanism in nature. Such good performance was expected for scheme A as a
direct result of the optimization in the assignment of marking probabilities; we also
conclude that scheme B performs well enough as a suboptimal assignment of marking
probabilities.
Figure 4.3: Average number of packets needed for a ternary attack tree. (Axes: number of
nodes in the tree vs. average number of packets required. Curves: PPM, PPM+NC B,
PPM+NC A, unequal PPM B, unequal PPM A.)
Figure 4.4: Average number of packets needed for a degree 4 attack tree. (Axes: number of
nodes in the tree vs. average number of packets required. Curves: PPM, PPM+NC B,
PPM+NC A, unequal PPM B, unequal PPM A.)
At the last step, we want to test our proposed mechanisms on a realistic tree based
35
0 10 20 30 40 50 60 70 80 900
2000
4000
6000
8000
10000
12000
Number of attackers
Ave
rage
num
ber
of p
acke
ts r
equi
red
Realistic tree
PPMPPM+NC BPPM+NC Aunequal PPM B
Figure 4.5: Average number of packets needed for a realistic attack tree.
Table 4.1: BRITE Topology Generator Parameters
Router Only Parameters
Parameter                    Value
Model                        GLP
Node Placement               Random
Growth Type                  Incremental
Preferential Connectivity    On
m                            1
on the power-law structure of the Internet. We used the BRITE topology generator [1]
in Router-only mode, and relied on the experiments performed in [8] to choose
parameters that produce realistic results. Table 4.1 contains the parameters
used to generate the random topology that simulates a real topology. Incremental
growth and preferential connectivity are chosen as intuitive mechanisms for topology
generation. Parameter m sets the number of links added per new node, which affects
the average node degree of the generated topology.
Using the parameters specified above, we first generated a 150-node graph and
randomly chose one of the nodes with a single edge as the victim. We then constructed
a tree from the generated graph using Dijkstra's algorithm. In the resulting tree, we
identified all one-edge nodes, which represent the set of possible attackers. Each
time, we then chose a specific number of attackers from this set and reconstructed
the attack graph using the PPM, PPM+NC and unequal PPM mechanisms.
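The topology preparation described above can be sketched as follows. This is only an illustration under stated assumptions: the function `preferential_attachment_graph` is a simplified stand-in for BRITE's GLP model (it is not BRITE itself), all links are given unit weight for Dijkstra's algorithm, and the seed and the choice of 10 attackers are hypothetical.

```python
import heapq
import random

def preferential_attachment_graph(n, m, rng):
    """Incremental growth: each new node links to m existing nodes chosen
    with probability proportional to their current degree (stand-in for GLP)."""
    adj = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, new):
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj

def shortest_path_tree(adj, root):
    """Dijkstra with unit link weights; returns parent pointers toward root."""
    dist, parent, heap = {root: 0}, {}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            if v not in dist or d + 1 < dist[v]:
                dist[v], parent[v] = d + 1, u
                heapq.heappush(heap, (d + 1, v))
    return parent

def attack_tree(parent, attackers, victim):
    """Union of the tree paths from every chosen attacker to the victim."""
    nodes = {victim}
    for a in attackers:
        while a not in nodes:
            nodes.add(a)
            a = parent[a]
    return nodes

rng = random.Random(7)                      # hypothetical seed
graph = preferential_attachment_graph(150, 1, rng)
victim = rng.choice(sorted(v for v in graph if len(graph[v]) == 1))
parent = shortest_path_tree(graph, victim)
# Leaves of the tree other than the victim are the possible attackers.
candidates = sorted(set(parent) - set(parent.values()))
attackers = rng.sample(candidates, 10)      # hypothetical number of attackers
tree_nodes = attack_tree(parent, attackers, victim)
```

Repeating the last two lines for each number of attackers, and averaging the packet counts of each marking mechanism over independent runs, would mirror the experiment reported here.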
For the PPM+NC mechanism, we simulated both schemes A and B; as in the symmetric
trees shown above, scheme A performs much better than both scheme B and the
constant PPM mechanism. For the unequal PPM mechanism, however, we only used
scheme B to construct the marking probabilities because, as discussed in
Section 3.1.2, scheme B of unequal PPM can be used as a suboptimal
assignment of marking probabilities in any arbitrary tree structure. For each number
of attackers, the result is the average over 100 independent simulation runs.
The results are shown in Fig. 4.5.
As in the symmetric trees, both schemes A and B of the PPM+NC mechanism show more
benefit as the number of attackers, and therefore the size of the attack graph,
increases in a realistic asymmetric tree. The suboptimal scheme of unequal PPM
(scheme B) shown in Fig. 4.5 does not perform as well as the optimal scheme
(scheme A) shown previously for symmetric trees, but it still performs much better
than constant PPM, which confirms that scheme B is a reasonable suboptimal
assignment of marking probabilities.
Chapter 5
Conclusion
In this thesis, we revisited the IP traceback problem as a coupon collector’s problem,
and used the insights obtained to design two mechanisms: (i) unequal PPM, which
assigns marking probabilities so as to minimize the expected number of packets, and
(ii) PPM+NC, which uses random linear combinations of node ids instead of overwriting.
We evaluated these mechanisms through analysis and simulation for single-
and multi-path scenarios and showed that they improve the tradeoffs of interest. In
future work, we plan to combine our mechanisms with existing marking schemes to
further improve the overall traceback performance.
Appendices
A The Coupon Collector’s Problem with Network Coding
In [5], it has been shown that the average number of coupons required for the entire
collection, E[X], when network coding is used is Θ(n).
The proof uses the following lemma:
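Lemma 1. Assume $\Pi_1$ and $\Pi_2$ are two subspaces of $\mathbb{F}_q^n$ and $\Pi_1$ is not a subspace of
$\Pi_2$. If $v$ is chosen uniformly at random from $\Pi_1$, then
$$\Pr(v \in \Pi_2) \le \frac{1}{q}.$$
Proof. Since $\Pi_1$ is not a subspace of $\Pi_2$, the intersection $\Pi_1 \cap \Pi_2$ is a proper
subspace of $\Pi_1$ and therefore contains at most $|\Pi_1|/q$ elements. Because $v$ is
uniform over $\Pi_1$, $\Pr(v \in \Pi_2) = |\Pi_1 \cap \Pi_2| / |\Pi_1| \le 1/q$.
We now give the proof for $E[X]$ after applying network coding to the coupon
collector’s problem:
Proof. In Lemma 1, let $\Pi_1 = \mathbb{F}_q^n$ and let $\Pi_2^i$ be the subspace spanned by the coding
vectors after $i$ experiments. Let $X_2^i$ be the rank of $\Pi_2^i$; by rank we mean the number
of boxes after $i$ trials which contain distinct types of coupons. In other words, $X_2^i$ is
the size of a basis for $\Pi_2^i$ and, in general, $X_2^i \le i$. Let $Y_i$ be the indicator that the
$i$-th coding vector falls outside $\Pi_2^{i-1}$ and thus increases the rank. For $i \le n$ we have
$X_2^{i-1} \le i - 1 < n$, so $\Pi_1$ is not a subspace of $\Pi_2^{i-1}$ and, according to Lemma 1,
$X_2^n$ is the sum of $n$ Bernoulli random variables, each of which has probability of
success at least $1 - \frac{1}{q}$. Now we can write:
$$E[X_2^n] = E[Y_1 + Y_2 + \cdots + Y_n] = \sum_{i=1}^{n} E[Y_i] \qquad \text{(A.1)}$$
Thus $E[X_2^n] \ge n\left(1 - \frac{1}{q}\right)$. More precisely, each increase of the rank takes at most
$q/(q-1)$ coupons in expectation, so $n \le E[X] \le nq/(q-1)$, which means that $E[X] = \Theta(n)$.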
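The linear growth of E[X] can be checked numerically with a small Monte Carlo sketch. It assumes, for illustration only, the binary field (q = 2) and coding vectors drawn uniformly from the whole space; the function names, the seed, and the number of runs are hypothetical. For q = 2, the average number of coupons should fall between n and 2n.

```python
import random

def coupons_until_full_rank(n, rng):
    """Draw uniform random vectors of GF(2)^n (as bitmasks) until they span
    the whole space; return how many vectors were drawn."""
    basis = {}   # highest set bit -> reduced basis vector with that leading bit
    draws = 0
    while len(basis) < n:
        v = rng.getrandbits(n)
        draws += 1
        # Reduce v against the current basis (Gaussian elimination over GF(2)).
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v   # v is linearly independent: rank grows by one
                break
            v ^= basis[top]
    return draws

def average_draws(n, runs=2000, seed=1):
    """Monte Carlo estimate of E[X] for n coupon types with coding over GF(2)."""
    rng = random.Random(seed)
    return sum(coupons_until_full_rank(n, rng) for _ in range(runs)) / runs

for n in (10, 20, 40):
    print(n, average_draws(n))
```

The estimate grows linearly in n, in contrast to the n ln n growth of the classical coupon collector's problem without coding.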
B Tail Analysis
In the thesis, we mainly focused on analyzing $E[X]$. However, one can also analyze
the entire distribution of $X$. In [14], $P(T > n)$ is calculated, where $T$ is the number
of coupons that need to be collected until one obtains a complete set of at least
one of each type, assuming there are $N$ different types of coupons and the selection
is performed independently and equally likely every time. If we follow the same
approach as in [14] for our traceback problem, we can find a model for the percentile
(e.g., the 95th percentile) of the number of packets required.
In the same way as in [14], first we fix $n$ and define the events $A_1, A_2, \ldots, A_d$ as
follows: $A_j$ is the event that no packet with a mark from router $j$ is contained among
the first $n$ packets, $j = 1, \ldots, d$. Hence
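$$
\begin{aligned}
P(X > n) &= P\Big(\bigcup_{j=1}^{d} A_j\Big) \\
&= \sum_{j} P(A_j) - \mathop{\sum\sum}_{j_1 < j_2} P(A_{j_1} A_{j_2}) \\
&\quad + \cdots + (-1)^{k+1} \mathop{\sum\sum\cdots\sum}_{j_1 < j_2 < \cdots < j_k} P(A_{j_1} A_{j_2} \cdots A_{j_k}) \\
&\quad + \cdots + (-1)^{d+1} P(A_1 A_2 \cdots A_d) \qquad \text{(B.2)}
\end{aligned}
$$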
Now $A_j$ will occur if none of the first $n$ packets contains a mark of router $j$. Since
each packet will not have a mark from the $j$th router with probability $1 - p_j$, where
$p_j$ is the perceived probability of observing a mark from the $j$th router and is equal
to $p(1-p)^{j-1}$, by the assumed independence we have
$$P(A_j) = (1 - p_j)^n$$
In general, the event $A_{j_1} A_{j_2} \cdots A_{j_k}$ will occur if none of the first $n$ packets contains
a mark from any of the routers $j_1, j_2, \ldots, j_k$. Therefore,
$P(A_{j_1} A_{j_2} \cdots A_{j_k}) = (1 - p_{j_1} - p_{j_2} - \cdots - p_{j_k})^n$.
Now if we replace these probabilities in Eq. (B.2), we get the general
formula for modeling the percentile in the traceback problem. In the overwriting
mechanisms, $p_j$ is different for every router $j$ due to its distance from the victim and
is equal to $p(1-p)^{j-1}$. Therefore, we cannot go further than this general analytical
formula in the case of the basic PPM scheme. However, for the PPM+NC approach, the
perceived probability is the same for all routers and is (to a good approximation)
equal to $p$. Therefore, for $n > 0$ in the case of PPM+NC, Eq. (B.2) simplifies
to:
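$$
\begin{aligned}
P(X > n) &= d(1-p)^n - \binom{d}{2}(1-2p)^n \\
&\quad + \binom{d}{3}(1-3p)^n - \cdots + (-1)^{d}\binom{d}{d-1}\bigl(1-(d-1)p\bigr)^n \qquad \text{(B.3)}
\end{aligned}
$$
Which is simply equal to:
$$P(X > n) = \sum_{i=1}^{d-1} \binom{d}{i} (1 - ip)^n (-1)^{i+1} \qquad \text{(B.4)}$$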
As a result, if we calculate the probability in Eq. (B.4) for different values of $n$,
the 95th percentile for the PPM+NC mechanism is the value of $n$ for which
the probability becomes equal to 0.05. In Eq. (B.4), $d$ is the number of routers in
the attack path from each of which we want to have a mark (equivalent to the $N$
distinct types of coupons in the coupon collector’s problem) and $p$ is the perceived
probability (equal to the marking probability) in the PPM+NC mechanism. We have
shown the model given for the PPM+NC mechanism in Fig. 3.6 and Fig. 3.7 together
with the simulations. It can be seen that the given model agrees with the
simulation results.
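The percentile computation described in this appendix can be sketched numerically as follows. The sketch evaluates Eq. (B.2) by direct enumeration of router subsets (feasible only for small d) and Eq. (B.4) in closed form, then searches for the smallest n at which the tail probability drops to 0.05. The values d = 5 and p = 0.2 and all function names are hypothetical choices for illustration; with p = 1/d the two expressions coincide for n > 0, which the last line checks.

```python
from itertools import combinations
from math import comb

def tail_general(probs, n):
    """P(X > n) from Eq. (B.2): inclusion-exclusion over sets of routers,
    where probs[j] is the perceived probability of a mark from router j+1."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = (-1) ** (k + 1)
        for subset in combinations(probs, k):
            # Probability that none of the n packets carries a mark from the subset.
            total += sign * max(0.0, 1.0 - sum(subset)) ** n
    return total

def tail_equal(d, p, n):
    """P(X > n) from Eq. (B.4): equal perceived probability p, valid for n > 0."""
    return sum((-1) ** (i + 1) * comb(d, i) * (1 - i * p) ** n for i in range(1, d))

def percentile_packets(tail, level=0.95, n_max=100000):
    """Smallest n with P(X > n) <= 1 - level, e.g. the 95th percentile."""
    for n in range(1, n_max + 1):
        if tail(n) <= 1 - level:
            return n
    raise ValueError("percentile not reached within n_max packets")

d, p = 5, 0.2                                   # hypothetical path length and marking probability
ppm_probs = [p * (1 - p) ** (j - 1) for j in range(1, d + 1)]   # basic PPM
n95_ppm = percentile_packets(lambda n: tail_general(ppm_probs, n))
n95_equal = percentile_packets(lambda n: tail_equal(d, p, n))
print(n95_ppm, n95_equal)
agree = all(abs(tail_equal(d, p, n) - tail_general([p] * d, n)) < 1e-9 for n in range(1, 50))
```

Because the farthest router in basic PPM is observed with the smallest perceived probability, the percentile for the unequal perceived probabilities is expected to be larger than for the equal ones.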
C Combining the Proposed Mechanisms with Previous Schemes
As we described in our assumptions in Section 2.2, in this thesis we mainly framed
the discussion in terms of router ids (the node sampling algorithm in [15]). We now
discuss edge sampling algorithms [15], which mark with edge ids (the XOR of two nodes)
instead of router ids. Let us focus on Song et al.’s AMS-I [16] and explain the unequal
PPM and PPM+NC mechanisms using the same framework as that scheme. AMS-I
is similar to CEFS, but instead of encoding a router’s IP address into eight fragments,
one simply encodes a hash value of the router’s IP address. In fact, two independent
hash functions, h and h′, are used in the encoding of the routers’ IP addresses to
distinguish the order of the two routers in the XOR result while reconstructing the
attack paths. Both hash functions have 11-bit outputs; the output together with the
5-bit distance field is stored in the 16-bit IP identification field.
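The field layout described above can be illustrated with a short sketch. Several details are assumptions made only for this example and are not taken from [16]: the two hash functions are built from truncated SHA-256 digests, the distance occupies the high-order bits of the 16-bit field, the distance saturates at 31, and the helper names (`start_mark`, `forward`) and IP addresses are hypothetical.

```python
import hashlib

HASH_BITS = 11                      # width of the outputs of h and h'
DIST_BITS = 5                       # width of the distance field
HASH_MASK = (1 << HASH_BITS) - 1
MAX_DIST = (1 << DIST_BITS) - 1

def _hash11(ip, salt):
    """Hypothetical 11-bit hash: truncated SHA-256 of a salted address."""
    digest = hashlib.sha256(salt + ip.encode("ascii")).digest()
    return int.from_bytes(digest[:2], "big") & HASH_MASK

def h(ip):
    return _hash11(ip, b"h:")

def h_prime(ip):
    return _hash11(ip, b"h':")

def pack(edge, distance):
    """Store the 11-bit edge value and 5-bit distance in one 16-bit field."""
    if not (0 <= edge <= HASH_MASK and 0 <= distance <= MAX_DIST):
        raise ValueError("field out of range")
    return (distance << HASH_BITS) | edge

def unpack(field):
    return field & HASH_MASK, field >> HASH_BITS

def start_mark(router_ip):
    """A router that decides to mark writes h(R) with distance 0."""
    return pack(h(router_ip), 0)

def forward(field, router_ip):
    """A non-marking router completes the edge if distance is 0, then
    increments the distance (saturating at the 5-bit maximum)."""
    edge, distance = unpack(field)
    if distance == 0:
        edge ^= h_prime(router_ip)
    return pack(edge, min(distance + 1, MAX_DIST))

field = start_mark("10.0.0.1")
field = forward(field, "10.0.0.2")
field = forward(field, "10.0.0.3")
edge, distance = unpack(field)
```

Using two different hash functions makes the XOR asymmetric, so that during reconstruction the edge from 10.0.0.1 to 10.0.0.2 is in general distinguishable from the edge in the opposite direction.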
In the reconstruction procedure, the victim uses the upstream router map $G_m$ as
a road-map and performs a breadth-first search from the root. The set of edge fields
marked with a distance $d$ is denoted as $\Psi_d$ and the set of routers at distance $d$ from
the victim in the reconstructed attack graph is denoted as $S_d$. If we go through the
reconstruction procedure of AMS-I [16], we find that the computational complexity
of this scheme is $O\left(\sum_d |S_d| \cdot |\Psi_{d+1}|\right)$. This is much lower than that of the CEFS
scheme, which is $O\left(\sum_d |S_d| \cdot |\Psi_{d+1}|^8\right)$ as a result of splitting each router’s IP address
and redundancy information into eight fragments and marking with one of the eight
fragments selected at random for each packet.
One can directly combine the unequal PPM mechanism with any edge-id-based PPM
scheme, e.g. AMS-I. Only one modification is needed in the marking procedure: one
assigns the optimized marking probabilities to the different routers instead of a
constant marking probability for all routers. The number
of packets required by AMS-I is less than one eighth of that of the CEFS scheme given
the same marking probability. Unequal PPM decreases this number even further and
minimizes it, as we described in the previous sections. The reconstruction procedure
for unequal PPM, and thus the formula for its computational complexity, are the
same as those of AMS-I. However, $|\Psi_{d+1}|$ decreases as a result of the decrease in the
average number of packets required by the unequal PPM scheme.
For the PPM+NC mechanism, one would need to make more modifications to
make the mechanism work as an edge-id-based scheme. If we regard edge ids as the
concatenation of two nodes, PPM+NC can be used in the same way as the mechanism
discussed for router ids. However, the edge id is defined as the XOR value of the
start and end fields [15]. Therefore, an additional protocol would be needed to make
the PPM+NC mechanism compatible with edge ids. One solution might be similar
to the random partial path encoding scheme proposed by Dean et al. [3]; one can
add an additional parameter, $l$, that represents the maximum length of an encoded
path. The value of this parameter is set by the marking router and reduced by
every participating router that adds in its IP address. The routers can add in their
information until the value reaches 0. Therefore, $l = 1$ would represent encoding of
edges between routers. As Dean et al. [3] described, this would be obtained at a cost
of adding $\lceil \log_2(l + 1) \rceil$ bits to the packets.
It is also worth noting the computational complexity of the PPM+NC mechanism.
PPM+NC always requires one step prior to the reconstruction procedures
commonly used for PPM schemes: solving the system of linear
equations, which has computational complexity $O(n^3)$ and is added to the
complexity of the reconstruction algorithm. However, PPM+NC reduces the number
of required packets, which mitigates the effect of this additional complexity.
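The decoding step can be illustrated with a minimal sketch of Gauss-Jordan elimination over a prime field. The setting is simplified and hypothetical: every packet is assumed to carry a full coefficient vector together with the corresponding linear combination of all router ids, the field size Q = 257 and the router ids are arbitrary, and `solve_mod_q` is a name introduced here for illustration.

```python
import random

Q = 257  # hypothetical prime field size; router ids must be smaller than Q

def solve_mod_q(rows, n, q=Q):
    """Gauss-Jordan elimination mod q on augmented rows [c_1, ..., c_n, y].
    Returns the unique solution, or None if the rows do not have rank n.
    With n rows the cost is O(n^3) field operations."""
    m = [row[:] for row in rows]
    pivot_row = 0
    for col in range(n):
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] % q), None)
        if pr is None:
            return None                      # no pivot: system is rank deficient
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        inv = pow(m[pivot_row][col], -1, q)  # modular inverse (Python 3.8+)
        m[pivot_row] = [(v * inv) % q for v in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % q for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return [m[i][n] for i in range(n)]

rng = random.Random(0)
router_ids = [17, 42, 99, 200]               # hypothetical ids on the attack path
packets, decoded = [], None
while decoded is None:
    coeffs = [rng.randrange(Q) for _ in router_ids]
    payload = sum(c * x for c, x in zip(coeffs, router_ids)) % Q
    packets.append(coeffs + [payload])
    if len(packets) >= len(router_ids):
        decoded = solve_mod_q(packets, len(router_ids))
print(len(packets), decoded)
```

With uniformly random coefficients, the system is usually solvable after only slightly more than n packets, so the cubic decoding cost is paid on a small system.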
Bibliography
[1] BRITE topology generator. http://www.cs.bu.edu/brite/.
[2] M. Adler. Tradeoffs in probabilistic packet marking for IP traceback. ACM Symposium on Theory of Computing (STOC), pages 407–418, May 2002.
[3] D. Dean, M. Franklin, and A. Stubblefield. An algebraic approach to IP traceback. In Proceedings of NDSS, February 2001.
[4] C. Fragouli, J. Y. LeBoudec, and J. Widmer. Network coding: an instant primer. ACM SIGCOMM Computer Communication Review, January 2006.
[5] C. Fragouli and E. Soljanin. Network Coding: Fundamentals and Applications. Monograph in Foundations and Trends in Networking, 2007.
[6] C. Gkantsidis and P. R. Rodriguez. Cooperative security for network coding file distribution. Technical Report accepted at Infocom, 2006.
[7] M. T. Goodrich. Efficient packet marking for large-scale IP traceback. 9th ACM Conference on Computer and Communications Security (CCS’02), pages 117–126, November 2002.
[8] O. Heckmann, M. Piringer, J. Schmitt, and R. Steinmetz. How to use topology generators to create realistic topologies. Technical Report, December 2002.
[9] T. Ho, R. Koetter, M. Médard, D. R. Karger, and M. Effros. The benefits of coding over routing in a randomized setting. In IEEE International Symposium on Information Theory (ISIT), July 2003.
[10] M. Ma. Tabu marking scheme for IP traceback. Proceedings of IEEE International Parallel and Distributed Processing Symposium (IPDPS’05), 2005.
[11] J. Mirkovic, S. Dietrich, D. Dittrich, and P. Reiher. Internet Denial of Service: Attack and Defense Mechanisms. Prentice Hall Professional Technical Reference, 2005.
[12] M. Mitzenmacher and E. Upfal. Probability and Computing, Randomized Algorithms and Probabilistic Analysis. Cambridge, 2005.
[13] K. Park and H. Lee. On the effectiveness of probabilistic packet marking for IP traceback under denial of service attack. In Proceedings of IEEE Infocom 2001.
[14] S. Ross. A First Course in Probability. Pearson Prentice Hall, Upper Saddle River, New Jersey, 2006.
[15] S. Savage, D. Wetherall, A. Karlin, and T. Anderson. Network support for IP traceback. IEEE/ACM Transactions on Networking, 9(3):226–237, June 2001.
[16] D. X. Song and A. Perrig. Advanced and authenticated marking schemes for IP traceback. Proceedings of IEEE Infocom 2001, pages 878–886.