Predicting Communication Latency in the Internet
Dragan Milic
Universität Bern
June 17, 2005
Table of Contents
> Introduction
> IDMaps
> Coordinate-based RTT prediction
— GNP
– Simplex Downhill
— Vivaldi
— ICS
– Principal Component Analysis
> Own research ideas
Introduction
> Why is communication latency prediction in the Internet important?
— TCP throughput
— QoS
— P2P
> Problems of latency measurements
— One-way latency
– time synchronization
– asymmetric links => RTT != 2 x one-way latency
— Scaling problem of full-mesh measurements
– time needed to measure the latency to all potential communication partners
– O(n^2)
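
The quadratic growth of full-mesh measurements can be illustrated with a small sketch. The function name `full_mesh_pairs` is made up for this example; it simply counts the unordered host pairs that would each need one RTT measurement.

```python
def full_mesh_pairs(n: int) -> int:
    """Number of unordered host pairs in a full mesh of n hosts: n(n-1)/2."""
    return n * (n - 1) // 2


# The number of measurements grows roughly with the square of the host count.
for hosts in (10, 100, 1000):
    print(hosts, "hosts ->", full_mesh_pairs(hosts), "RTT measurements")
```

Multiplying the number of hosts by ten multiplies the number of measurements by roughly one hundred, which is why measuring every pair directly does not scale.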
IDMaps [Francis et al. 2001]
> Pioneering work on RTT prediction in the Internet
> A global service for RTT estimation
> Estimation of the RTT using triangulation
> Proactive measurements
IDMaps: Architecture
> Address Prefix (AP): Consecutive address range of IP addresses within which all hosts with assigned addresses are equidistant (with some tolerance) to the rest of the Internet.
> Tracer: A host deployed in the access network (AS). The tracer measures the network distance to all other tracers in the Internet.
> Virtual Link (VL): The raw distance between two tracers (Tracer-Tracer VL) or between a tracer and an AP (Tracer-AP VL).
IDMaps: Architecture (2)
[Figure: autonomous systems AS 1 to AS 4 with Tracer 1 to Tracer 4, AS 5, and address prefixes AP1 to AP8. Legend: Physical Link; Tracer-AP Virtual Link (VL); Tracer-Tracer Virtual Link (VL).]
distance(AP1, AP6) = distance(AP1, Tracer1) + distance(Tracer1, Tracer3) + distance(Tracer3, AP6)
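
The triangulation formula above can be sketched in a few lines of Python. This is only an illustration: the table layout, the function name `estimate_distance` and all RTT values are assumptions invented for the example, not part of IDMaps itself.

```python
# Hypothetical virtual-link tables (RTT in ms); all numbers are made up.
# Tracer-Tracer VLs, keyed by the unordered pair of tracers.
tracer_to_tracer = {frozenset(("Tracer1", "Tracer3")): 40.0}
# Tracer-AP VLs: each AP maps to its nearest tracer and the distance to it.
ap_to_tracer = {"AP1": ("Tracer1", 5.0), "AP6": ("Tracer3", 8.0)}


def estimate_distance(ap_a: str, ap_b: str) -> float:
    """Estimate the AP-to-AP distance as AP->tracer + tracer->tracer + tracer->AP."""
    tracer_a, d_a = ap_to_tracer[ap_a]
    tracer_b, d_b = ap_to_tracer[ap_b]
    if tracer_a == tracer_b:
        between = 0.0  # both APs hang off the same tracer
    else:
        between = tracer_to_tracer[frozenset((tracer_a, tracer_b))]
    return d_a + between + d_b


print(estimate_distance("AP1", "AP6"))  # 5.0 + 40.0 + 8.0 = 53.0
```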
IDMaps: Drawbacks
> Deployment: Infrastructure support is needed. One tracer must be deployed to each access AS.
> Scalability: Each tracer measures and stores the RTT to all other tracers in the Internet, so storage and measurement traffic grow quadratically with the number of tracers deployed in the Internet: O(n^2).
Coordinate-based RTT prediction
> Idea:
— Each host in the Internet is assigned a point in a virtual n-dimensional Euclidean space.
— The Euclidean distance function in the virtual n-dimensional space predicts the RTT of the communication.
> Problem:
— The Internet cannot be projected to an ideal Euclidean space.
> Solution:
— The coordinates of each host must be chosen in such a way that the squared difference between measured and predicted RTT is minimized.
> Practical implementations:
— GNP, PIC, NPS
— Vivaldi, Big Bang simulation
— ICS
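
As a minimal sketch of the idea above, the following Python snippet predicts RTTs as Euclidean distances and sums the squared differences to the measured RTTs. The host names, coordinates and RTT values are invented for illustration.

```python
import math


def predicted_rtt(a, b):
    """Predicted RTT = Euclidean distance between two coordinate tuples."""
    return math.dist(a, b)


def squared_error(coords, measured):
    """Sum of squared differences between predicted and measured RTTs."""
    total = 0.0
    for (h1, h2), rtt in measured.items():
        total += (predicted_rtt(coords[h1], coords[h2]) - rtt) ** 2
    return total


# Hypothetical 2-D coordinates and measured RTTs (ms).
coords = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 0.0)}
measured = {("A", "B"): 5.0, ("B", "C"): 5.0, ("A", "C"): 7.0}

print(squared_error(coords, measured))
```

Here the pairs A-B and B-C are predicted exactly, while A-C is predicted as 6 ms instead of the measured 7 ms. A coordinate-based system searches for the coordinates that make this total error as small as possible.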
Global Network Positioning (GNP) [Ng et al. 2002]
> GNP procedure:
— Each host measures the RTT to a fixed set of hosts (landmarks). To uniquely determine the coordinates of a host, at least n+1 landmarks must exist for an n-dimensional space.
— Using the landmark positions and the measured RTTs, each host calculates its own coordinates by minimizing the squared difference between predicted and measured RTTs, using the simplex downhill method for function minimization.
> Determining landmark coordinates:
— Each landmark measures the RTT to all other landmarks.
— One landmark (the leading landmark) receives the measurement results from all landmarks and calculates the coordinates of every landmark by minimizing the sum of squared differences between the distances predicted from the landmark coordinates and the measured distances. The function is minimized using the simplex downhill method.
GNP (2)
Determining host coordinates
[Figure: landmarks L1, L2, L3 and the host]
Coordinates of the host are determined by minimizing the following function:
$$f(x_{Host}, y_{Host}) = \sum_{i=1}^{3} \left( \sqrt{(x_{Host} - x_{L_i})^2 + (y_{Host} - y_{L_i})^2} - d_{L_i,Host} \right)^2$$
$d_{L_i,Host}$: measured distance between landmark i and host
Determining landmark coordinates
[Figure: landmarks L1, L2, L3]
Coordinates of the landmarks are determined by minimizing the following function:
$$f(x_{L_1}, y_{L_1}, x_{L_2}, y_{L_2}, x_{L_3}, y_{L_3}) = \sum_{i,j \in \{1..3\},\, i<j} \left( \sqrt{(x_{L_i} - x_{L_j})^2 + (y_{L_i} - y_{L_j})^2} - d_{L_i,L_j} \right)^2$$
$d_{L_i,L_j}$: measured RTT between landmarks i and j
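
The host objective function can be written down directly. The sketch below assumes three landmarks with already known 2-D coordinates. The landmark positions and the "true" host position used to generate the measurements are invented for illustration.

```python
import math

# Hypothetical landmark coordinates (already determined) in a 2-D space.
landmarks = {"L1": (0.0, 0.0), "L2": (100.0, 0.0), "L3": (0.0, 100.0)}

# Synthetic "measured" RTTs, generated from an assumed true host position.
true_host = (30.0, 40.0)
measured = {name: math.dist(pos, true_host) for name, pos in landmarks.items()}


def host_objective(x: float, y: float) -> float:
    """Sum over landmarks of (predicted distance - measured distance)^2."""
    return sum(
        (math.dist((x, y), landmarks[name]) - measured[name]) ** 2
        for name in landmarks
    )


print(host_objective(30.0, 40.0))  # error at the true position
print(host_objective(50.0, 50.0))  # a worse candidate gives a larger error
```

With real measurements the minimum is generally not zero, because Internet RTTs do not fit a Euclidean space exactly.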
Simplex Downhill [Nelder et al. 1965]
> A numerical algorithm for n-dimensional function minimization
> Simplex: the simplest object that can be constructed using n+1 points in an n-dimensional space (e.g. a triangle in 2-D, a tetrahedron in 3-D).
> Input of the algorithm: a function to be minimized, initial corner points of a simplex (usually randomly chosen) and a condition for stopping the iteration (like the maximal number of iterations or the minimal progress for each iteration).
> Possible transformations of the simplex in each iteration: reflecting (and optionally expanding), contracting and contracting in all directions.
Simplex Downhill (2)
> The algorithm:
— Find the high (worst) and low (best) corner points of the simplex by evaluating the function at each corner point.
— Try to replace the high point with a better one by moving it relative to all other points: reflecting, stretching, reflecting and stretching, or contracting the simplex. If one of these transformations yields a lower function value, the new point replaces the high point.
— If none of the above transformations leads to a better high point, the simplex is contracted in the direction of the low point (all other points are moved in the direction of the low point).
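
The steps above can be sketched as a simplified simplex downhill routine in Python. This is not a reference implementation: the coefficients (reflection 1, expansion 2, contraction 0.5), the stopping rule and the function name `nelder_mead` are assumptions chosen for the example, and only an inside contraction is used. The demo objective is a GNP-style host error with invented landmark positions.

```python
import math


def nelder_mead(f, simplex, max_iter=500, tol=1e-10):
    """Minimise f starting from `simplex`, a list of n+1 points in n dimensions."""
    simplex = [list(p) for p in simplex]
    n = len(simplex) - 1
    for _ in range(max_iter):
        simplex.sort(key=f)                    # low point first, high point last
        low, high = simplex[0], simplex[-1]
        if abs(f(high) - f(low)) < tol:        # stop on minimal progress
            break
        # Centroid of all corner points except the high point.
        centroid = [sum(p[k] for p in simplex[:-1]) / n for k in range(n)]

        def along(t):
            # Point on the line through the high point and the centroid.
            return [c + t * (c - h) for c, h in zip(centroid, high)]

        reflected = along(1.0)
        if f(reflected) < f(low):
            expanded = along(2.0)              # reflect and stretch
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected            # plain reflection
        else:
            contracted = along(-0.5)           # contract towards the centroid
            if f(contracted) < f(high):
                simplex[-1] = contracted
            else:
                # Contract in all directions: move every other point
                # halfway towards the low point.
                simplex = [low] + [
                    [(a + b) / 2 for a, b in zip(low, p)] for p in simplex[1:]
                ]
    return min(simplex, key=f)


# Demo: GNP-style host positioning with hypothetical landmarks and RTTs.
landmarks = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
rtts = [math.dist(lm, (30.0, 40.0)) for lm in landmarks]


def host_error(p):
    return sum((math.dist(p, lm) - d) ** 2 for lm, d in zip(landmarks, rtts))


print(nelder_mead(host_error, [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)]))
```

The routine needs only function values, no derivatives. The point it returns depends on the initial simplex, so a different starting simplex may end in a different local minimum.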
GNP: Drawbacks
> Using the same landmarks for all hosts does not scale well.
— High network load on each landmark.
— Solution proposed in NPS [Ng 2004].
– Three levels of landmarks.
– Only the landmarks of the first level are real landmarks.
– The landmarks in the other levels are hosts that are used by other hosts as landmarks.
> Requires infrastructure (fixed landmarks).
— Landmarks must be deployed in the Internet.
— Solution proposed in PIC [Costa et al. 2004].
– The first n+1 hosts that need positioning become landmarks and compute their coordinates as described for GNP landmarks.
– Subsequent hosts use the first n+1 hosts as landmarks.
GNP: Drawbacks (2)
> Simplex downhill
— Does not always find the global minimum.
— The result depends on the starting points (initial simplex).
— Does not converge as fast as other function minimization methods (e.g. Gauss-Newton or Newton methods for non-linear problems) that exploit additional knowledge about the function to be minimized.
> The solution for the coordinates of the landmarks is under-determined: there are infinitely many solutions, because translating, rotating or mirroring all landmarks together leaves their pairwise distances unchanged.
[Figure: two different placements of the landmarks L1, L2 and L3]
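
Why the landmark solution is under-determined can be checked numerically: a rigid motion of all landmarks leaves every pairwise distance unchanged, so the objective function cannot tell the two placements apart. The coordinates, the rotation angle and the offsets below are arbitrary values chosen for illustration.

```python
import math


def pairwise_distances(points):
    """Distances between all unordered pairs of named points."""
    names = sorted(points)
    return {
        (a, b): math.dist(points[a], points[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def rotate_translate(points, angle, dx, dy):
    """Rotate all points by `angle` (radians) and shift them by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return {
        name: (c * x - s * y + dx, s * x + c * y + dy)
        for name, (x, y) in points.items()
    }


# Hypothetical landmark placement and a rotated, shifted copy of it.
placement_a = {"L1": (0.0, 0.0), "L2": (100.0, 0.0), "L3": (0.0, 100.0)}
placement_b = rotate_translate(placement_a, math.radians(37.0), 12.0, -5.0)

dist_a = pairwise_distances(placement_a)
dist_b = pairwise_distances(placement_b)
for pair in dist_a:
    print(pair, round(dist_a[pair], 6), round(dist_b[pair], 6))
```

Both placements reproduce the same landmark-to-landmark RTTs, so a minimizer may return either of them, or any other rotated, shifted or mirrored copy.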
Vivaldi [Dabek et al. 2004]
> Determining coordinates using physical model simulation (virtual springs)
— All hosts start at the same coordinates.
— Each host measures the RTT to a few other hosts.
— After each measurement, the host moves its coordinates in the direction of the spring force, so that the difference between the predicted and the measured distance (the potential energy of the virtual spring) is partially reduced.
— Requires no infrastructure.
— Distributed algorithm.
> Similar (but more complex) model: Big Bang Simulation [Shavitt et al. 2004]
— Takes kinetic energy and friction into account.
— More complex model, without a distributed algorithm.
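
A single spring-relaxation step can be sketched as follows. This is a minimal illustration under stated assumptions: a constant step size `delta`, a random direction when two hosts share the same coordinates, and a made-up RTT table for three hosts. The function name `vivaldi_update` is hypothetical.

```python
import math
import random


def vivaldi_update(xi, xj, rtt, delta=0.25):
    """Move host i along the spring to host j by a fraction delta of the error."""
    diff = [a - b for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        # Hosts at identical coordinates: push apart in a random direction.
        diff = [random.gauss(0.0, 1.0) for _ in xi]
        dist = math.sqrt(sum(d * d for d in diff))
    unit = [d / dist for d in diff]
    predicted = math.dist(xi, xj)
    error = rtt - predicted      # > 0: spring is compressed, move away from j
    return [a + delta * error * u for a, u in zip(xi, unit)]


# Hypothetical RTTs (ms) between three hosts; all hosts start at the origin.
rtt = {("H1", "H2"): 30.0, ("H1", "H3"): 40.0, ("H2", "H3"): 50.0}
pos = {h: [0.0, 0.0] for h in ("H1", "H2", "H3")}

random.seed(1)
for _ in range(2000):
    a, b = random.choice(list(rtt))
    if random.random() < 0.5:
        a, b = b, a              # either end of the pair may do the update
    key = (a, b) if (a, b) in rtt else (b, a)
    pos[a] = vivaldi_update(pos[a], pos[b], rtt[key])

for (a, b), measured in rtt.items():
    print(a, b, "measured", measured, "predicted", round(math.dist(pos[a], pos[b]), 2))
```

Each host only needs its own coordinates, the coordinates of the host it measured and the measured RTT, which is what makes the scheme distributed.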
Vivaldi: Example
[Figure: four panels, a) to d), each showing the positions of hosts H1, H2 and H3]
Vivaldi: Drawbacks
> Unstable
— A new host can affect positions in the whole system.
— Numerous “moves” are needed until a newly joined host reaches its ideal position.
> Oscillation
— When the algorithm is applied to measurements collected from the Internet, the whole system seems to oscillate (most of the hosts are constantly changing their coordinates).
— Possible solution: Stable Vivaldi [de Launois 2004], which applies to each spring a loss factor that increases with time.
Open Questions for GNP and Vivaldi
> GNP: Which hosts should be chosen as landmarks?
> GNP and Vivaldi: How many dimensions should the virtual space have?
Internet Coordinate System (ICS) [Lim et al. 2003]
> Uses Principal Component Analysis (PCA) to determine landmark and host coordinates.
> Lower computational overhead than GNP (only basic matrix transformations and eigenvalue decomposition).
> Positions of landmarks in the virtual space can be uniquely determined.
> The sufficient number of dimensions needed to represent the whole system as a virtual space can be computed!
Principal Component Analysis (PCA)
> Linear transformation of the sample data using eigenvalues and eigenvectors.
> The data is transformed so that the variance of the data decreases with each successive dimension.
> The new representation of the data allows reducing the number of dimensions with minimal loss of information.
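
The PCA steps can be illustrated with a dependency-free sketch. It assumes the textbook formulation (eigen-decomposition of the sample covariance matrix) and uses power iteration with deflation to find the eigenpairs. The function names and the sample data are invented for the example, and this is not the ICS implementation.

```python
import math


def covariance(data):
    """Sample covariance matrix of a list of equally long rows."""
    n, d = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(d)]
    centered = [[row[k] - mean[k] for k in range(d)] for row in data]
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def top_eigenpair(m, iters=1000):
    """Largest eigenvalue and its eigenvector of a symmetric matrix (power iteration)."""
    d = len(m)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[a] * sum(m[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    return eigenvalue, v


def pca(data, k):
    """First k principal components as (variance, direction) pairs."""
    m = covariance(data)
    d = len(m)
    components = []
    for _ in range(k):
        lam, v = top_eigenpair(m)
        components.append((lam, v))
        # Deflation: remove the component just found from the matrix.
        m = [[m[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return components


# Made-up 2-D samples that lie almost on a straight line.
samples = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.2), (3.0, 5.8), (4.0, 8.1)]
(var1, dir1), (var2, dir2) = pca(samples, 2)
print("variance along 1st component:", var1)
print("variance along 2nd component:", var2)
print("share kept with one dimension:", var1 / (var1 + var2))
```

Because almost all of the variance falls on the first component, the second dimension could be dropped with little loss of information. The same reasoning is used to decide how many dimensions are sufficient.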
ICS: Algorithm
> Landmarks:
— Determine the RTT between all landmarks (represented as an n×n matrix for n landmarks).
— Perform PCA of the RTTs. The result is the PCA transformation matrix.
— Determine the number of dimensions that are sufficient to represent most of the measured data.
— Scale the calculated transformation using a least-squares estimator so that the distances between the landmarks are preserved in the transformed space.
> Hosts:
— Measure the distance to all (or a subset of) landmarks. Represent the measurements as an n×1 matrix.
— Retrieve the scaled transformation matrix and calculate the host position by multiplying the distance matrix with the received transformation matrix.
Drawbacks of the ICS
> Part of the information that could be exploited (e.g. by function minimization) is lost.
> Seems to perform worse than GNP (results not yet verified).
Own Research Ideas
> Using multilateration (non-linear Newton iterative method) to determine the host and landmark coordinates.
— Is already used by GPS.
— Converges faster than Simplex Downhill.
— Can find a global minimum (uses more information about the function that is minimized than the Simplex Downhill method).
— Drawback: Solution for the landmarks is under-determined (there are multiple solutions) which leads to divergence of the non-linear Newton iterative method.
> Using PCA to determine the number of dimensions that is needed for the virtual space and as a starting point for the non-linear Newton iterative method.
> Analyzing how the choice of the landmarks influences the overall error of the system (simulations with generated topologies).
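
A possible shape of the multilateration idea for a single host is sketched below. It uses a Gauss-Newton iteration as one concrete choice of Newton-type method. The slides do not specify the exact variant, so the method, the function name `gauss_newton_position`, the landmark coordinates and the starting point are all assumptions made for illustration. The landmarks are treated as fixed, which avoids the under-determined landmark case.

```python
import math


def gauss_newton_position(landmarks, distances, start, iters=20):
    """Estimate a 2-D position from distances to landmarks with known coordinates."""
    x, y = start
    for _ in range(iters):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for (lx, ly), d in zip(landmarks, distances):
            dx, dy = x - lx, y - ly
            rng = math.hypot(dx, dy)
            if rng == 0.0:
                continue                 # derivative undefined exactly at a landmark
            jx, jy = dx / rng, dy / rng  # Jacobian row of the residual rng - d
            r = rng - d                  # residual: predicted minus measured
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            g1 += jx * r
            g2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break                        # normal equations are singular
        # Solve (J^T J) step = -J^T r for the 2x2 case.
        step_x = -(a22 * g1 - a12 * g2) / det
        step_y = -(a11 * g2 - a12 * g1) / det
        x += step_x
        y += step_y
        if math.hypot(step_x, step_y) < 1e-12:
            break
    return x, y


# Hypothetical landmarks and synthetic distances to an assumed true position.
landmarks = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
distances = [math.dist(lm, (30.0, 40.0)) for lm in landmarks]

print(gauss_newton_position(landmarks, distances, start=(50.0, 50.0)))
```

Unlike the simplex downhill method, each step uses the derivatives of the residuals, which is the additional information about the function mentioned above. A good starting point still matters, which is where a PCA-based initial estimate could help.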
Questions?