Rethinking Internet Traffic Management Using Optimization Theory Jennifer Rexford Princeton...

37
Rethinking Internet Traffic Management Using Optimization Theory Jennifer Rexford Princeton University t work with Jiayue He, Martin Suchara, Ma’ayan Bresler, and Mung Chi http://www.cs.princeton.edu/~jrex/papers/ conext07.pdf
  • date post

    20-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    0

Transcript of Rethinking Internet Traffic Management Using Optimization Theory Jennifer Rexford Princeton...

Rethinking Internet Traffic Management Using Optimization

Theory

Jennifer RexfordPrinceton University

Joint work with Jiayue He, Martin Suchara, Ma’ayan Bresler, and Mung Chiang

http://www.cs.princeton.edu/~jrex/papers/conext07.pdf

2

Clean-Slate Network Architecture

Network architecture More than designing a single protocol Definition and placement of function

Clean-slate design Without the constraints of today’s artifacts To have a stronger intellectual foundation And move beyond the incremental fixes

But, how do we do clean-slate design?

3

Two Ways to View This Talk

A design process Based on optimization decomposition

A new design For traffic management

4

Why Traffic Management?

Traffic management is important Determines traffic rate along each path Major resource-allocation issues Routing, congestion control, traffic

engineering, … Some traction studying mathematically

Reverse engineering of TCP Redesigning protocols (e.g., TCP variants) Mathematical tools for tuning protocols

But still does not have a holistic view…

5

Traffic Management Today

User:Congestion Control

Operator: Traffic Engineering

Routers:Routing Protocols

Evolved organically without conscious design

6

Shortcoming of Today’s Traffic Management

Protocol interactions ignored Congestion control assumes routing is

fixed TE assumes the traffic is inelastic

Inefficiency of traffic engineering Link-weight tuning problem is NP-hard TE at the timescale of hours or days

Only limited use of multiple paths

What would a clean-slate redesign look like?

7

Top-down Redesign

Problem Formulation

Distributed Solutions

TRUMP algorithm

Optimization decomposition

Compare using simulations

TRUMP Protocol

Translate into packet version

8

Congestion Control ImplicitlyMaximizes Aggregate User Utility

max.∑i Ui(xi)

s.t. ∑i Rlixi ≤ cl

var. x

aggregate utility

Source rate xi

UserUtilityUi(xi)

Source-destination pair indexed by i

source rate

Utility represents user satisfaction and elasticity of traffic

routing matrix

Fair rate allocation amongst greedy users

9

Traffic Engineering ExplicitlyMinimizes Network Congestion

min. ∑l f(ul)s.t. ul =∑i Rlixi/cl

var. R Link Utilization ul

Costf(ul)

aggregate congestion costLinks are indexed by l

ul =1

Cost function is a penalty for approaching capacity

Avoids bottlenecks in the network

link utilization

1010

A Balanced Objective

max. ∑i Ui(xi) - w∑l f(ul)

Network users:Maximize throughput Generate bottlenecks

Network operators:Minimize delay

Avoid bottlenecks

Penalty weight

11

Top-down Redesign

Problem Formulation

Distributed Solutions

TRUMP algorithm

Optimization decomposition

Compare using simulations

TRUMP Protocol

Translate into packet version

Optimization decomposition requires convexity

12

Convex Problems are Easier to Solve

Convex Non-convex

Convex problems have a global minimum Distributed solutions that converge to global

minimum can be derived using decomposition

13

i source-destination pair, j path number

How to make our problem convex?

max. ∑i Ui(∑j zji) – w∑l f(ul)

s.t. link load ≤ cl

var. path rates zz1

1

z21

z31

Single path routing is non convex Multipath routing + flexible splitting is convex

Path rate captures source rates and routing

14

Overview of Distributed Solutions

Edge node: Update path rates zRate limit incoming traffic

Operator: Tune w, U, f Other parameters

Routers: Set up multiple pathsMeasure link loadUpdate link prices s

ss

s

15

Optimization Decomposition

Deriving prices and path rates Prices: penalties for violating a constraint Path rates: updates driven by penalties

Example: TCP congestion control Link prices: packet loss or delay Source rates: AIMD based on prices

Our problem is more complicated More complex objective, multiple paths

16

Rewrite capacity constraint:

Subgradient feedback price update:

Stepsize controls the granularity of reaction Stepsize is a tunable parameter

Effective capacity keeps system robust

Effective Capacity (Links)

sl(t+1) = [sl(t) – stepsize*(yl – link load)]+

link load ≤ cl

link load = yl

effective capacity yl≤ cl

17

Key Architectural Principles Effective capacity

Advance warning of impending congestion

Simulates the link running at lower capacity and give feedback on that

Dynamically updated Consistency price

Allowing some packet loss Allowing some overshooting in exchange

for faster convergence

18

Four Decompositions - Differences

Algorithms Features Paramete

rs

Partial-dual Effective capacity

1

Primal-dual Effective capacity

3

Full-dual Effective capacity,Allow packet loss

2

Primal-driven Direct s update 1

Iterative updates contain parameters:They affect the dynamics of the distributed algorithms

Differ in how link & source variables are updated

19

Top-down Redesign

Problem Formulation

Distributed Solutions

TRUMP algorithm

Optimization decomposition

Compare using simulations

Final Protocol

Optimization doesn’t answer all the questions

Translate into packet version

20

Theoretical results and limitations: All proven to converge to global optimum for

well-chosen parameters No guidance for choosing parameters Only loose bounds for rate of convergence

Sweep large parameter space in MATLAB Effect of w on convergence Compare rate of convergence Compare sensitivity of parameters

Evaluating Four Decompositions

2121

Effect of Penalty Weight (w)

Can achieve high aggregate utility for a range of w

User utility w Operator penalty

Topology dependent

22

Convergence Properties: Partial Dual

Tunable parameters impact convergence time

Best rate

parameter sensitivity

stepsize

Itera

tion

s to

converg

ence

o average valuex actual values

23

Convergence Properties (MATLAB)

Algorithms Convergence Properties

All Converges slower for small w

Partial-dual vs.Primal-dual

Extra parameters do not improve convergence

Partial-dual vs.Full-dual

Allow some packet loss may improve convergence

Partial-dual vs.Primal-driven

Direct updates converges faster than iterative updates

Parameter sensitivity correlated to rate of convergence

24

Insights from simulations: Have as few tunable parameters as possible Use direct update when possible Allow some packet loss

Cherry-pick different parts of previous algorithms to construct TRUMP

One tunable parameter

TRUMP: TRaffic-management Using Multipath Protocol

25

TRUMP Algorithm

Source i: Path rate zj

i(t+1) = max. Ui(∑kzki) – zj

i *path price

Link l: loss pl(t+1) = [pl(t) – stepsize(cl – link load)]+

queuing delay ql(t+1) = wf’(ul)

Price for path j = ∑ l on path j (pl+ql)

26

TRUMP is not another decomposition We can prove convergence, but only

under more restrictive conditions From MATLAB:

Faster rate of convergence Easy to tune parameter

TRUMP versus Other Algorithms

27

Top-down Redesign

Problem Formulation

Distributed Solutions

TRUMP algorithm

Optimization decomposition

Compare using simulations

TRUMP Protocol

So far, assume fluid model, constant feedback delay

Translate into packet version

28

TRUMP: Packet-Based Version

Link l: link load = (bytes in period T) / T Update link prices every T

Source i: Update path rates at maxj {RTTj

i}

Arrival and departure of flows are notified implicitly through price changes

29

Set-up: Realistic topologies and delays of large ISPs Selected flows and paths Realistic ON-OFF traffic model

Questions: Do MATLAB results still hold? Does TRUMP react quickly to link dynamics? Can TRUMP handle ON-OFF flows?

Packet-level Experiments (NS-2)

3030

TRUMP versus Partial dual (in Sprint)

TRUMP “trumps” partial dual for w ≤ 1/3

TRUMP Partial dual

Total sending rate vs. time for TRUMP and Partial Dual

31

TRUMP Link Dynamics (NJ-IN Link)

TRUMP reacts quickly to link dynamics

Link failureor recovery

32

TRUMP versus file size

TRUMP’s performance is independent of variance

Worse for smaller filesStill faster than TCP

3333

TRUMP: A Few Paths Suffice

Do not need to incur the overhead of many paths…

Worse for smaller files but still faster than TCP

Having 2-3 paths is enough.

34

Summary of TRUMP PropertiesProperty TRUMP

Tuning Parameters

One easy to tune parameterOnly need to be tuned for small w

Robustness to link dynamics

Reacts quickly to link failures and recoveries

Robustness to flow dynamics

Independent of variance of file sizes, more efficient for larger files

General Trumps other algorithms

Feedback Possible with implicit feedback

35

Division of Functionality Today TRUMP

Operators

Tune link weightsSet penalty function

(Set-up multipath)Tune w & stepsize

Sources Adapt source rates Adapt path rates

Routers Shortest path routing

(Compute prices)

Sources: end hosts or edge routers? Feedback: implicit or explicit? Computation: centralized or distributed?

Mathematics leaves open architecture questions

36

Contributions: Design with multiple decompositions New TRUMP traffic-management protocol

Extensions to TRUMP Implicit feedback based on loss and

delay Interdomain realization of the protocol

Conclusions

37

Ongoing Work: Multiple Traffic Classes

Different application requirements Throughput-sensitive: file transfers Delay-sensitive: VoIP and gaming

Optimization formulation Weighted sum of the two objectives Per-class variables for routes and rates

Decompose into two subproblems Two virtual networks with custom protocols Simple dynamic update to bandwidth shares

Theoretical foundation for adaptive network virtualization