Declarative Routing: Extensible Routing with Declarative Queries
UC Berkeley: Boon Thau Loo, Joseph M. Hellerstein, Ion Stoica.
Intel Research: Joseph M. Hellerstein.
Wisconsin-Madison: Raghu Ramakrishnan
SIGCOMM’05
Outlines
• Abstract
• System Model
• The Basics
• Challenges
• Expressiveness
• Security
• Optimization
• Stability and Robustness
• Evaluation
• Conclusion
Abstract (1)
• Problem: It is difficult to extend the Internet routing infrastructure to support new applications.
• Other research on this problem:
  – New hard-coded routing protocols
    • Changing or upgrading a routing protocol requires access to each router, which is tedious.
  – Active Networks: allow new routing functionality to be deployed without direct access to routers.
    • Difficulties in router performance and in the security and reliability of the resulting infrastructure.
  – Overlay networks: allow third parties to replace Internet routing with new, “from-scratch” implementations of routing functionality that run at the application layer.
    • This simply moves the problem from the network layer to the application layer, where third parties have control.
• This paper explores a new design point that strikes a better balance between the extensibility and the robustness of a routing infrastructure.
  – Declarative Routing: express routing protocols using a database query language.
Abstract (2)
• The idea is based on an observation:
  – Recursive query languages studied in the deductive database literature are a natural fit for expressing routing protocols. Deductive database query languages focus on identifying recursive relationships among nodes of a graph, and are well suited for expressing paths among nodes in a network.
• Basic mechanism:
  – A routing protocol is implemented by writing a simple query in a declarative query language like Datalog, which is then executed in a distributed fashion at some or all of the nodes.
• Future applications:
  – Individual end-users will explicitly request routes with particular properties by submitting route construction queries to the network.
  – An administrator at an ISP might reconfigure the ISP’s routers by issuing a query to the network.
• The simplicity and safety of declarative routing have benefits over the current, relatively fragile approaches to upgrading routers.
System Model (1)
• We model the routing infrastructure as a directed graph, where each link is associated with a set of parameters (e.g., loss rate, available bandwidth, delay).
  – The nodes can either be IP routers or overlay nodes.
• Fully distributed implementation:
  – Like traditional routers, the nodes maintain links to their neighbors, compute routes, and set up the forwarding state.
  – But, instead of running a traditional routing protocol, each node runs a general-purpose query processor.
System Model (2)
System Model (3)
• The query processor can read the neighbor table and install entries into the forwarding table.
  – This simple interface is the only interaction between the query processor and the core forwarding logic.
• Declarative query:
  – Both routing protocols and route requests can be expressed.
  – The results of the query are used to establish router forwarding state.
• Base tuples: the local information that the node reads.
  – E.g. link tuple:
    • link(source, destination, …), a copy of an entry in the neighbor table.
• Derived tuples: the generated intermediate data.
  – E.g. path tuple:
    • path(source, destination, pathVector, cost)
• Lifetime of a query:
  – Each query is accompanied by a specification of its lifetime.
  – During this period, neighbor table updates trigger the re-computation of some of the existing derived and result tuples.
The Basics (1)
• A Datalog program (a query) consists of a set of declarative rules.
  – Rule form:
    • <head> :- <body>
    • Variable names begin with an upper-case letter.
    • Function symbols, predicates and constants begin with a lower-case letter.
  – Ex.
    • NR1: path(S, D, P, C) :- link(S, D, C), P = f_concatPath(link(S, D, C), nil).
    • NR2: path(S, D, P, C) :- link(S, Z, C1), path(Z, D, P2, C2), C = C1 + C2, P = f_concatPath(link(S, Z, C1), P2).
    • Query: path(S, D, P, C)
  – Since both S and D are unbound variables, this query computes the full transitive closure, consisting of the paths between all pairs of reachable nodes.
• Pitfall:
  – The above query will not terminate, because path tuples with cycles are generated.
    » Solution: add an extra predicate f_inPath(P2, S) = false to rule NR2.
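The effect of rules NR1 and NR2 with the f_inPath cycle check can be sketched as a centralized fixpoint loop in Python. This is only an illustration, not the distributed engine from the paper: the tuple-of-nodes encoding of the path vector P and the names f_in_path and network_reachability are assumptions made for this sketch.

```python
def f_in_path(path, node):
    """Analogue of f_inPath: True if `node` already appears on `path`."""
    return node in path


def network_reachability(links):
    """Naive fixpoint evaluation of NR1/NR2 over link(S, D, C) tuples.

    Returns a set of path tuples (S, D, P, C); P is a tuple of node names
    (an assumed encoding of the path vector).
    """
    # NR1: every link is a one-hop path.
    paths = {(s, d, (s, d), c) for (s, d, c) in links}
    while True:
        derived = set()
        # NR2: prepend link(S, Z, C1) to path(Z, D, P2, C2).
        for (s, z, c1) in links:
            for (z2, d, p2, c2) in paths:
                # Join on Z, and apply the cycle check f_inPath(P2, S) = false.
                if z2 == z and not f_in_path(p2, s):
                    derived.add((s, d, (s,) + p2, c1 + c2))
        if derived <= paths:  # nothing new was derived: fixpoint reached
            return paths
        paths |= derived


# A three-node directed cycle: without the cycle check the loop would never stop.
links = {("a", "b", 1), ("b", "c", 1), ("c", "a", 1)}
result = network_reachability(links)
```

On this cyclic input the loop stops once every cycle-free path has been derived; removing the f_in_path test makes each round produce longer and longer paths.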
The Basics (2)
• Query Plan Generation
  – Before executing a query, we need to generate a query plan.
  – A query plan: a dataflow diagram consisting of relational operators that are connected by arrows indicating the flow of tuples. Figure 2.
Query Plan
The Basics (3)
• Query Plan Distributed Execution
  – Upon receipt of the Datalog query, each node creates the query plan. Figure 3 (First, ignore the initial dissemination of the query.)
  – Query Initial Dissemination
    • Flood
    • Piggy-back
      – Embed the query into the first data tuple sent to each neighboring node.
      – Optimization: Nodes not involved in the query computation will not receive the query.
Query Plan Distributed Execution (1)
Each iteration represents the traversal of a “cloud” in Figure 2.
Query Plan Distributed Execution (2)
• A node needs up to k iterations to converge to a steady state after receiving a query, where k is the diameter of the network.
• The total time taken for a query to converge is proportional to 2k.
  – The initial query takes up to k iterations to reach the farthest node from the query node.
The Basics (4)
• Distance Vector Protocol Expression
  – DV1: path(S, D, D, C) :- link(S, D, C).
  – DV2: path(S, D, Z, C) :- link(S, Z, C1), path(Z, D, W, C2), C = C1 + C2.
  – DV3: shortestCost(S, D, min<C>) :- path(S, D, Z, C).
  – DV4: nextHop(S, D, Z, C) :- path(S, D, Z, C), shortestCost(S, D, C).
  – Query: nextHop(S, D, Z, C)
• Count-to-Infinity problem
  – Using split-horizon: a method of preventing a routing loop in a network.
    • The basic principle is simple: Information about the routing for a particular packet is never sent back in the direction from which it was received.
• Modification:
  – #include(DV1, DV3, DV4)
  – DV2: path(S, D, Z, C) :- link(S, Z, C1), path(Z, D, W, C2), C = C1 + C2, W ≠ S.
  – DV5: path(S, D, Z, ∞) :- link(S, Z, C1), path(Z, D, S, C2).
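The distance vector rules with the split-horizon condition W ≠ S can be sketched as a centralized loop. The sketch makes several simplifying assumptions that are not in the slides: it keeps only the current best (next hop, cost) entry per (S, D) instead of materializing every path tuple, it skips routes from a node to itself, and it does not model the infinite-cost tuples of DV5. The function name distance_vector is hypothetical.

```python
def distance_vector(links):
    """Distance-vector sketch with split horizon.

    links: set of directed link(S, Z, C) tuples with non-negative costs.
    Returns {(S, D): (next_hop, cost)}, the analogue of nextHop(S, D, Z, C).
    """
    table = {}
    # DV1: a direct link is a path whose next hop is the destination itself.
    for (s, d, c) in links:
        if (s, d) not in table or c < table[(s, d)][1]:
            table[(s, d)] = (d, c)
    changed = True
    while changed:
        changed = False
        snapshot = dict(table)  # routes as advertised at the start of the round
        for (s, z, c1) in links:
            for (z2, d), (w, c2) in snapshot.items():
                if z2 != z:
                    continue  # join condition: the route must belong to neighbor Z
                if d == s:
                    continue  # assumption of this sketch: ignore routes back to S itself
                if w == s:
                    continue  # split horizon (W != S): Z routes to D through S
                cost = c1 + c2
                # DV3/DV4: keep only the minimum-cost next hop per (S, D).
                if (s, d) not in table or cost < table[(s, d)][1]:
                    table[(s, d)] = (z, cost)
                    changed = True
    return table


# Bidirectional triangle: the direct a-c link is more expensive than going via b.
links = {
    ("a", "b", 1), ("b", "a", 1),
    ("b", "c", 1), ("c", "b", 1),
    ("a", "c", 5), ("c", "a", 5),
}
table = distance_vector(links)
```

In this example node a ends up forwarding traffic for c through b, and b never learns a route to c back through a because of the W ≠ S test.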
Challenges
• Four challenges to justify the feasibility of declarative routing:
  – Expressiveness
    • How to express various routing policies? Any limitation?
  – Security
    • Is it safe enough to execute queries issued by untrusted third parties?
  – Efficiency
    • How to adapt or develop the Query Plan Generation so that queries perform well in a large network?
    • How to reduce the redundant work performed by various routing queries issued concurrently?
  – Stability and Robustness
    • Since the network is dynamic, how to efficiently maintain the robustness and accuracy of long-term routes?
Expressiveness
• Main goals here:
  – To illustrate the natural connection between recursive queries and network routing.
  – To highlight the flexibility, ease of programming and ease of reuse afforded by a query language.
• Routing protocols to be expressed:
  – Best-Path Routing
  – Policy-Based Routing
  – Dynamic Source Routing
  – Link State
Expressiveness: Best-Path Routing
• Starting from the base rules of the first Network-Reachability example:
  – NR1: path(S, D, P, C) :- link(S, D, C), P = f_concatPath(link(S, D, C), nil).
  – NR2: path(S, D, P, C) :- link(S, Z, C1), path(Z, D, P2, C2), C = f_compute(C1, C2), P = f_concatPath(link(S, Z, C1), P2).
  – BPR1: bestPathCost(S, D, AGG<C>) :- path(S, D, P, C).
  – BPR2: bestPath(S, D, P, C) :- path(S, D, P, C), bestPathCost(S, D, C).
  – Query: bestPath(S, D, P, C)
• If the best path is the shortest path, replace f_compute with f_sum, and AGG with min.
• Then, add an extra condition f_inPath(P2, S) = false to rule NR2 to avoid cycles in paths.
• Additionally, QoS requirements can be specified with constraints.
  – Add an extra constraint C < k to rules NR1 and NR2 to restrict the set of paths to those with costs below a loss or latency threshold k.
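The Best-Path rules can be sketched in Python with f_compute and AGG passed in as ordinary functions, which mirrors how the slide swaps in f_sum and min for shortest paths. This is a centralized illustration under stated assumptions: the path vector is encoded as a tuple of nodes, the optional max_cost parameter stands for the threshold k, and the names all_paths, best_path and f_sum (as a Python function) are chosen for this sketch.

```python
from collections import defaultdict


def f_sum(c1, c2):
    """Cost combiner for shortest paths."""
    return c1 + c2


def all_paths(links, f_compute, max_cost=None):
    """NR1/NR2 with the cycle check; keeps only paths with C < max_cost if given."""
    paths = set()
    for (s, d, c) in links:  # NR1
        if max_cost is None or c < max_cost:
            paths.add((s, d, (s, d), c))
    frontier = set(paths)
    while frontier:
        new = set()
        for (s, z, c1) in links:  # NR2, joined only against newly derived paths
            for (z2, d, p2, c2) in frontier:
                if z2 != z or s in p2:  # join on Z; f_inPath(P2, S) = false
                    continue
                c = f_compute(c1, c2)
                if max_cost is None or c < max_cost:  # QoS constraint C < k
                    new.add((s, d, (s,) + p2, c))
        frontier = new - paths
        paths |= frontier
    return paths


def best_path(links, f_compute, agg, max_cost=None):
    """BPR1/BPR2: select, per (S, D), the paths whose cost equals AGG<C>."""
    paths = all_paths(links, f_compute, max_cost)
    costs = defaultdict(list)
    for (s, d, _p, c) in paths:
        costs[(s, d)].append(c)
    best_cost = {pair: agg(cs) for pair, cs in costs.items()}  # BPR1
    return {t for t in paths if t[3] == best_cost[(t[0], t[1])]}  # BPR2


links = {("a", "b", 1), ("b", "c", 1), ("a", "c", 5)}
shortest = best_path(links, f_sum, min)
bounded = best_path(links, f_sum, min, max_cost=2)
```

With no threshold, the two-hop route from a to c wins over the direct link of cost 5. With max_cost=2, neither route from a to c satisfies C < k, so no a-to-c tuple is returned.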
Expressiveness: Policy-Based Routing
• To restrict the scope of routing by precluding paths that involve “undesirable” nodes.
  – #include(NR1, NR2)
  – PBR1: permitPath(S, D, P, C) :- path(S, D, P, C), excludeNode(S, W), f_inPath(P, W) = false.
  – Query: permitPath(S, D, P, C).
• An additional table excludeNode is introduced.
  – excludeNode(S, W) represents that node S does not carry any traffic for node W. This table is stored at each node S.
• We can generate bestPath tuples meeting the above policy by adding BPR1 and BPR2.
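The policy filter of PBR1 can be sketched as a function over already-computed path tuples. Two points are assumptions of this sketch rather than statements from the slides: a path is rejected if it contains any node W listed for S (the rule as written joins against one excludeNode tuple at a time, which gives the same result when each S lists a single W), and a source with no excludeNode entry keeps all of its paths. The names permit_paths and f_in_path are hypothetical.

```python
def f_in_path(path, node):
    """Analogue of f_inPath: True if `node` appears on `path`."""
    return node in path


def permit_paths(paths, exclude_node):
    """Filter path(S, D, P, C) tuples using excludeNode(S, W) tuples."""
    excluded = {}
    for (s, w) in exclude_node:
        excluded.setdefault(s, set()).add(w)
    permitted = set()
    for (s, d, p, c) in paths:
        banned = excluded.get(s, set())
        # Keep the path only if none of S's excluded nodes is on it.
        if not any(f_in_path(p, w) for w in banned):
            permitted.add((s, d, p, c))
    return permitted


paths = {
    ("a", "c", ("a", "b", "c"), 2),
    ("a", "c", ("a", "d", "c"), 3),
}
exclude_node = {("a", "b")}  # node a must avoid node b
permitted = permit_paths(paths, exclude_node)
```

Here the cheaper route through b is dropped, so a later best-path selection would pick the route through d.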
Expressiveness: Dynamic Source Routing
• Flipping the order of path and link in the body of rule NR2 to use left recursion gives DSR.
  – #include(NR1)
  – DSR1: path(S, D, P, C) :- path(S, Z, P1, C1), link(Z, D, C2), P = f_concatPath(P1, link(Z, D, C2)), C = C1 + C2.
  – Query: path(S, D, P, C).
• We can also generate bestPath tuples by adding BPR1 and BPR2.
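Left recursion grows each path forward from its source, one link at a time, which is how a source-routing request explores the network. The sketch below assumes a single fixed source (the query on the slide leaves S unbound), encodes the path vector as a tuple of nodes, and adds a cycle check on the new destination so that the loop terminates. The name dsr_paths is hypothetical.

```python
def dsr_paths(links, source):
    """Left-recursive path discovery from one source (NR1 plus DSR1)."""
    # NR1, restricted to links that leave the source.
    paths = {(s, d, (s, d), c) for (s, d, c) in links if s == source}
    frontier = set(paths)
    while frontier:
        new = set()
        # DSR1: extend path(S, Z, P1, C1) with link(Z, D, C2).
        for (s, z, p1, c1) in frontier:
            for (z2, d, c2) in links:
                if z2 == z and d not in p1:  # join on Z; skip nodes already visited
                    new.add((s, d, p1 + (d,), c1 + c2))
        frontier = new - paths
        paths |= frontier
    return paths


links = {("a", "b", 1), ("b", "c", 1), ("c", "a", 1)}
from_a = dsr_paths(links, "a")
```

Only tuples whose source is a are ever created, so nodes that a cannot reach never take part in the computation.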
Expressiveness: Link State
• Flood the links to all nodes in the network.
  – LS1: floodLink(S, S, D, C, S) :- link(S, D, C).
  – LS2: floodLink(M, S, D, C, N) :- link(N, M, C1), floodLink(N, S, D, C, W), M ≠ W.
  – Query: floodLink(M, S, D, C, N)
• Then, once all the links are available at each node, a local version of the Best-Path query is executed locally using the floodLink tuples.
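Rules LS1 and LS2 can be sketched as a centralized simulation of flooding. The reading of floodLink(M, S, D, C, N) as "node M holds link(S, D, C), learned from node N" follows the rule bodies. The function names flood_links and link_state_db are assumptions of this sketch.

```python
def flood_links(links):
    """Evaluate LS1/LS2; returns floodLink tuples (M, S, D, C, N)."""
    # LS1: every node starts by holding its own outgoing links.
    flood = {(s, s, d, c, s) for (s, d, c) in links}
    frontier = set(flood)
    while frontier:
        new = set()
        # LS2: N forwards each link it holds to neighbor M, except back to
        # the node W it was received from (M != W).
        for (n, m, _c1) in links:
            for (n2, s, d, c, w) in frontier:
                if n2 == n and m != w:
                    new.add((m, s, d, c, n))
        frontier = new - flood
        flood |= frontier
    return flood


def link_state_db(flood):
    """Group floodLink tuples into the set of links known at each node M."""
    db = {}
    for (m, s, d, c, _n) in flood:
        db.setdefault(m, set()).add((s, d, c))
    return db


# Bidirectional line a - b - c.
links = {("a", "b", 1), ("b", "a", 1), ("b", "c", 2), ("c", "b", 2)}
db = link_state_db(flood_links(links))
```

After flooding, every node holds the complete link table and could run a Best-Path computation locally.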
Security
• Security is a key concern with any extensible system.
  – Bound on resource consumption:
    • Queries written in Datalog have polynomial time and space complexity in the size of the input.
  – Termination of an augmented Datalog query:
    • With the addition of arbitrary functions, the time complexity of a Datalog program is no longer polynomial.
    • Several powerful static tests for termination exist [18].
  – Side-effect-free language:
    • A query takes a set of stored tables as input and produces a set of derived tables.
  – The execution is “sandboxed” within the query engine.
• As a result, Datalog eliminates many of the risks usually associated with extensible systems.
• Many other security issues remain, but they are orthogonal to network extensibility and are not addressed:
  – Denial-of-service attacks
  – Compromised routers
Optimization (1)
• Pruning Unnecessary Paths
  – The Inefficiency:
    • Queries with aggregates start by enumerating all possible paths.
  – Solution:
    • Use a query optimization technique, aggregate selections [25, 22].
    • E.g. Figure 3:
      – By maintaining a “min-so-far” aggregate value for the current shortest path cost from node a to its destination nodes, we can selectively avoid sending path tuples to neighbors if we know they cannot be involved in the shortest path.
• In general, aggregate selections are useful when the AGG function can be used to prune communication.
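The saving from a “min-so-far” aggregate selection can be illustrated by counting how many path tuples get propagated with and without pruning. This is a centralized sketch that assumes positive link costs; the counter sent is only a stand-in for messages between nodes, and the name shortest_costs is hypothetical.

```python
def shortest_costs(links, prune):
    """Return (best, sent).

    best[(S, D)] is the minimum path cost found; sent counts the path tuples
    that were kept and propagated to neighbors.
    """
    best = {}
    sent = 0
    frontier = [(s, d, (s, d), c) for (s, d, c) in links]  # NR1 tuples
    while frontier:
        next_frontier = []
        for (s, d, p, c) in frontier:
            improves = c < best.get((s, d), float("inf"))
            if improves:
                best[(s, d)] = c  # update the min-so-far for (S, D)
            if prune and not improves:
                continue  # aggregate selection: this tuple cannot be the minimum
            sent += 1
            # NR2: every X with link(X, S, C1) extends the path, unless X is on it.
            for (x, s2, c1) in links:
                if s2 == s and x not in p:
                    next_frontier.append((x, d, (x,) + p, c1 + c))
        frontier = next_frontier
    return best, sent


# Fully connected three-node network with unit costs.
nodes = ["a", "b", "c"]
links = {(x, y, 1) for x in nodes for y in nodes if x != y}
best_full, sent_full = shortest_costs(links, prune=False)
best_pruned, sent_pruned = shortest_costs(links, prune=True)
```

Both runs find the same minimum costs, but the pruned run stops propagating the two-hop tuples as soon as a one-hop cost is already known for the same pair.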
Optimization (2)
• Subsets of Sources and Destinations (2 techniques)
  – Magic Sets Rewrite:
    • To limit query computation to the relevant portion (nodes) of the network.
    • E.g.: if nodes b and c are the only nodes issuing the path query: (After this rewrite, only nodes reachable from b and c participate in the query.)
Optimization (3)
  – Left-Right Recursion Rewrite:
    • To generate best paths from magicSources to magicDsts nodes, Best-Path-Pair: (Using left recursion)
Optimization (4)
• If all nodes are running the same query, use right recursion to directly utilize the path information sent by neighboring nodes.
• If only a small subset of nodes is issuing the same query, use left recursion to lower the message overhead.
• Drawback of Best-Path-Pair:
  – No sharing if magicSource(b) is added.
Optimization (5)
• Multi-Query Sharing:
Stability and Robustness (1)
• Each query is accompanied by a lifetime. During this period, changes in the network might result in some re-computation.
• Using continuous queries to re-compute new results based on changes in the network:
  – Each router is responsible for detecting changes to its local information and reporting these changes to its local query processor.
  – E.g. consider the Network-Reachability query:
Stability and Robustness (2)
Evaluation (Size of network)
Evaluation (All-Pairs Shortest Paths 1)
Evaluation (All-Pairs Shortest Paths 2)
• Two observations:
  – Convergence latency for the Best-Path query is proportional to the network diameter, and the query converges in the same time as the path vector protocol.
  – Per-node communication overhead increases linearly with the number of nodes.
• Both observations are consistent with the scalability properties of the traditional distance vector and path vector protocols.
Evaluation (Source/Destination Queries 1)
Evaluation (Source/Destination Queries 2)
Evaluation (Mixed Query Workload)
Conclusion
• Observation:
  – Finding a connection between two different domains can help address difficult problems in one of them by using techniques from the other, well-studied domain.