CS728 Lecture 17: Web Indexes III
Date posted: 20-Dec-2015
CS728
Lecture 17
Web Indexes III
Last Time
• Showed how to build indexes for graph connectivity
• Based on 2-hop covers
Today
• Look at the more general problem of compact encodings for graphs and network problems
• Applications
- fast queries for path information
- routing & routing table construction
- topology control
- spanning trees
- dominating sets & clustering
- hierarchical clustering
Main Problem Considered
• arbitrary topology
• goal: small routing tables to find a path to the destination
• related problem: finding the closest item of a certain type
Routing: how do I get there from here?
[Figure: a route from a source node to a destination node]
Definitions:
Spanner: a subgraph in which the distance between any two nodes is close to their distance in the original graph
We will see that radio networks need energy-spanners, i.e., subgraphs that contain energy-efficient paths
Spanning Trees:
• minimum connected subgraph
• useful for routing
• single point of failure
• non-minimal routes
• many variants
K-Dominating Sets:
• set of nodes such that every node in the network is within K hops of some node in the set
• used to define a partition of the network into zones
[Figure: 1-dominating set]
Graph Clustering:
• Flat Clustering: K-center problem – find k nodes (centers) that minimize the maximum distance from any node to its nearest center
Hierarchical Clustering
• tree clustering with internal and border nodes and edges
Hierarchical Clustering
• The hierarchy imposes a natural addressing scheme
• Each node labeled with the path in the hierarchy tree
• Problem: give a compact labeling for a tree
– Clearly need log n bits to identify some nodes.
– Need to add information about tree structure
– Complete binary tree
– Other n-node trees
• Interval labeling scheme
– Label the leaves of the tree uniquely: log n bits
– Label each internal node with the range of its descendants: 2 log n bits.
– Given two nodes x, y and their labels
• Can you test if x is an ancestor of y?
• Can you describe the path from x to y?
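The ancestor question above can be explored with a minimal sketch of interval labeling. The tree representation (a `children` dictionary), the function names, and the example tree are illustrative assumptions, not part of the lecture. The sketch also assumes every internal node has at least two children, since a node with a single child would receive the same range as that child.

```python
from typing import Dict, List, Tuple

def interval_labels(children: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Number the leaves left to right; label every node with the
    (lowest, highest) leaf number found in its subtree."""
    labels: Dict[str, Tuple[int, int]] = {}
    next_leaf = 0

    def visit(node: str) -> Tuple[int, int]:
        nonlocal next_leaf
        kids = children.get(node, [])
        if not kids:
            # A leaf gets a one-point range holding its unique number.
            labels[node] = (next_leaf, next_leaf)
            next_leaf += 1
        else:
            ranges = [visit(k) for k in kids]
            labels[node] = (ranges[0][0], ranges[-1][1])
        return labels[node]

    visit(root)
    return labels

def is_ancestor_or_self(x: Tuple[int, int], y: Tuple[int, int]) -> bool:
    """x is an ancestor of y (or y itself) iff x's range contains y's range."""
    return x[0] <= y[0] and y[1] <= x[1]

# Hypothetical example tree: r has children a and b, each with two leaves.
tree = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e", "f"]}
labels = interval_labels(tree, "r")
a_over_d = is_ancestor_or_self(labels["a"], labels["d"])
a_over_e = is_ancestor_or_self(labels["a"], labels["e"])
```

Here `a_over_d` is true and `a_over_e` is false, using only the two labels and no access to the tree itself.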
• Greedy Dewey Labeling scheme
• Label each edge with small unique string
• Nodes are concatenation of edge labels
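[Figure: node v0 with children v1 to v4; example edge labels 00, 01, 10]
Out-degree 4 requires edge labels of maximum length 2.
[Figure: node v0 with children v1 to v600; example edge labels 0 and 1101101110]
Out-degree 600 requires edge labels of maximum length 10.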
Theorem: Upper bound on GDL label length with unary delimiters is $d_v + 2\log n$ bits, where
- $d_v$ is the depth of v in T
- n is the number of nodes in T
• Alternative: use binary (fixed length) for delimiting each edge
– Seems to do worse in practice
• Can remove dependence on depth by converting encodings of long interior paths using count labels
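To make the Dewey idea concrete, here is a simplified sketch in which a node label is the sequence of edge labels on its root-to-node path. It is not the greedy assignment from the slides: as an assumption for illustration, the i-th child edge simply gets the binary form of i, and a Python tuple stands in for the delimiters between edge labels. All names are hypothetical.

```python
from typing import Dict, List, Tuple

def dewey_labels(children: Dict[str, List[str]], root: str) -> Dict[str, Tuple[str, ...]]:
    """Label each node with the tuple of edge labels on its path from the root.
    The i-th child edge of a node is labeled with i written in binary."""
    labels: Dict[str, Tuple[str, ...]] = {root: ()}
    stack = [root]
    while stack:
        node = stack.pop()
        for i, child in enumerate(children.get(node, [])):
            labels[child] = labels[node] + (format(i, "b"),)
            stack.append(child)
    return labels

def is_proper_ancestor(x: Tuple[str, ...], y: Tuple[str, ...]) -> bool:
    """x is a proper ancestor of y iff x's label is a strict prefix of y's."""
    return len(x) < len(y) and y[:len(x)] == x

# Hypothetical tree: v0 has out-degree 4, and v1 has one child v5.
tree = {"v0": ["v1", "v2", "v3", "v4"], "v1": ["v5"]}
labels = dewey_labels(tree, "v0")

# Longest edge label needed for out-degree 4 and out-degree 600.
max_len_deg_4 = max(len(format(i, "b")) for i in range(4))
max_len_deg_600 = max(len(format(i, "b")) for i in range(600))
```

With this numbering the longest edge label has length 2 for out-degree 4 and length 10 for out-degree 600, matching the two figure captions.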
Spanners and Stretch
• Stretch of a subgraph H is the maximum, over all pairs of nodes, of the ratio of their distance in H to their distance in G
– Extensively studied in the graph algorithms and graph theory literature [Eppstein 96]
• Distance stretch and topological stretch
• A spanner is a subgraph that has constant stretch
– The Delaunay triangulation yields a planar Euclidean distance-spanner
– The Yao-graph [Yao 82] is also a simple distance-spanner
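The definition of stretch can be checked on small graphs with a brute-force sketch. It measures hop distance (topological stretch) with breadth-first search; the adjacency-list representation, function names, and example graphs are assumptions for illustration, and H is assumed to contain every node of G.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[int, List[int]]

def bfs_dist(adj: Graph, src: int) -> Dict[int, int]:
    """Hop distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def hop_stretch(g: Graph, h: Graph) -> float:
    """Maximum over node pairs of (distance in H) / (distance in G).
    Returns infinity if H disconnects a pair that is connected in G."""
    worst = 1.0
    for u in g:
        dist_g = bfs_dist(g, u)
        dist_h = bfs_dist(h, u)
        for v, d in dist_g.items():
            if v != u:
                worst = max(worst, dist_h.get(v, float("inf")) / d)
    return worst

# G is a 4-cycle; H drops the edge (3, 0), leaving a path 0-1-2-3.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
stretch = hop_stretch(cycle, path)
```

In this example nodes 0 and 3 are adjacent in G but three hops apart in H, so the stretch is 3.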
Energy Stretch and Energy Spanners
• Commonly adopted power attenuation model: $\text{Received Power} \propto \text{Transmit Power} / \text{distance}^{\alpha}$
– $\alpha$ is between 2 and 4
• Assuming uniform threshold for reception power and interference/noise levels, the energy consumed for transmitting from $u$ to $v$ needs to be proportional to $d(u,v)^{\alpha}$
• Power control: Radios have the capability to adjust their power levels so as to reach the destination with the desired fidelity
• Energy consumed along a path is simply the sum of the transmission energies along the path links
• Define energy-stretch analogous to distance-stretch
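A small numeric sketch shows how path energy behaves under this model. It assumes a proportionality constant of 1 and an attenuation exponent of 2; the function name and the coordinates are made up for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def path_energy(points: List[Point], alpha: float = 2.0) -> float:
    """Sum of d(u, v) ** alpha over consecutive hops of the path."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += math.hypot(x2 - x1, y2 - y1) ** alpha
    return total

# One hop of length 4 versus four relayed hops of length 1 each.
direct = path_energy([(0, 0), (4, 0)])
relayed = path_energy([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)])
```

Under these assumptions the direct hop costs 16 units and the relayed path costs 4, which is why many short hops can beat a few long ones even though the total distance is the same.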
Energy-Aware Routing
• A path with many short hops consumes less energy than a path with a few large hops
– Which edges to use? (Considered in topology control)
– Can maintain “energy cost” information to find minimum-energy paths [Rodoplu-Meng 98]
• Routing to maximize network lifetime [Chang-Tassiulas 99]
– Formulate the selection of paths and power levels as an optimization problem
– Suggests the use of multiple routes between a given source-destination pair to balance energy consumption
• Energy consumption also depends on transmission rate
– Schedule transmissions lazily [Prabhakar et al 2001]
– Can split traffic among multiple routes at reduced rate [Shah-Rabaey 02]
Topology Control
• Given:
– A collection of nodes in the plane
– Transmission range of the nodes (assumed equal)
• Goal: To determine a subgraph of the transmission graph G that is
– Connected
– Low-degree
– Small stretch, hop-stretch, and power-stretch
The Yao Graph
• Divide the space around each node into sectors (cones) of angle $\theta$
• Each node has an edge to the nearest node in each sector
• Number of edges is $O(n)$
• For any edge (u,v) in the transmission graph
– There exists an edge (u,w) in the same sector such that w is closer to v than u is
• Theorem: The Yao Graph has stretch $1/(1 - 2\sin(\theta/2))$
[Figure: node u with nodes w and v in the same sector]
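The construction can be sketched directly from the bullets above. This is a brute-force version that assumes points in the plane, cones aligned to the positive x-axis, and an optional common transmission range; the function name, parameters, and sample coordinates are illustrative assumptions.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]

def yao_graph(points: List[Point], num_cones: int = 6,
              max_range: float = float("inf")) -> Set[Tuple[int, int]]:
    """For each node, keep an edge to the nearest in-range node in each cone.
    Edges are returned as undirected pairs (i, j) with i < j."""
    cone_angle = 2 * math.pi / num_cones
    edges: Set[Tuple[int, int]] = set()
    for i, (xi, yi) in enumerate(points):
        nearest: Dict[int, Tuple[float, int]] = {}  # cone -> (distance, node)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            d = math.hypot(xj - xi, yj - yi)
            if d > max_range:
                continue
            angle = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            # Clamp guards against rounding pushing the index to num_cones.
            cone = min(int(angle // cone_angle), num_cones - 1)
            if cone not in nearest or d < nearest[cone][0]:
                nearest[cone] = (d, j)
        for _, j in nearest.values():
            edges.add((min(i, j), max(i, j)))
    return edges

# Four sample nodes; nodes 0, 1, 2 are collinear, so 0 skips the far node 2.
sample_points = [(0, 0), (2, 1), (4, 2), (-1, 3)]
yao_edges = yao_graph(sample_points)
```

Each node contributes at most `num_cones` edges, which is where the linear edge count comes from when the cone angle is fixed.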
Dominating Set
• Applications: facility location
– A set of $k$-dominating centers can be selected to locate servers or copies of a distributed directory
– Dominating sets can serve as a location database for storing routing information in ad hoc networks [Liang Haas 00]
• NP-hard for general graphs
• Reduces to the minimum set cover problem
• Recall last time: Greedy gives a log n approximation
• Admits a PTAS for planar graphs [Baker 94]
Greedy Algorithm
• An Example
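As a minimal sketch of the greedy heuristic for the 1-dominating case: repeatedly pick the node whose closed neighborhood covers the most still-uncovered nodes. The adjacency-list representation, function name, and example graph are assumptions for illustration; a k-dominating variant would use k-hop neighborhoods instead.

```python
from typing import Dict, List

def greedy_dominating_set(adj: Dict[int, List[int]]) -> List[int]:
    """Greedy set-cover style selection of a dominating set."""
    uncovered = set(adj)
    chosen: List[int] = []
    while uncovered:
        # Gain of u = number of uncovered nodes in u's closed neighborhood.
        best = max(adj, key=lambda u: len(({u} | set(adj[u])) & uncovered))
        chosen.append(best)
        uncovered -= {best} | set(adj[best])
    return chosen

# Hypothetical graph: a star centered at 0 with a tail 3-4-5.
graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
centers = greedy_dominating_set(graph)
```

On this graph the first pick is node 0 (covering four nodes) and the second is node 4 (covering the remaining two), so two centers suffice.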
Hierarchical Network Decomposition
• Sparse neighborhood covers [Awerbuch-Peleg 89, Linial-Saks 92]
– Applications in location management, replicated data management, routing
– Provable guarantees, though difficult to adapt to a dynamic environment
• Routing scheme using hierarchical partitioning [Dolev et al 95]
– Adaptive to topology changes
– Weak guarantees in terms of stretch and memory per node
Sparse Neighborhood Covers
• An r-neighborhood cover is a set of overlapping clusters such that the r-zone of any node is in one of the clusters
• Aim: Have covers that are low diameter and have small overlap
• Overlap is measured by the max number of clusters a node is in
• Tradeoff between diameter and overlap
– Set of all r-zones: diameter 2r but overlap n
– The entire network as a single cluster: overlap 1 but diameter could be n
• Sparse r-neighborhood covers with $O(r \log n)$ diameter clusters and $O(\log n)$ overlap [Peleg 89, Awerbuch-Peleg 90]
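The two extreme covers in the tradeoff can be compared with a short sketch. It builds r-zones by bounded breadth-first search and measures overlap as defined above; the function names and the complete-graph example are assumptions chosen so that the worst-case overlap of n shows up.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

def r_zone(adj: Dict[int, List[int]], src: int, r: int) -> Set[int]:
    """All nodes within r hops of src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond radius r
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def max_overlap(clusters: List[Set[int]], nodes: Iterable[int]) -> int:
    """Maximum number of clusters that any single node belongs to."""
    return max(sum(1 for c in clusters if v in c) for v in nodes)

# Complete graph on 4 nodes: every 1-zone is the whole network.
complete = {u: [v for v in range(4) if v != u] for u in range(4)}
all_zones = [r_zone(complete, u, 1) for u in complete]
single_cluster = [set(complete)]

overlap_zones = max_overlap(all_zones, complete)
overlap_single = max_overlap(single_cluster, complete)
```

Here the cover made of all 1-zones has overlap 4, which equals n, while the single-cluster cover has overlap 1.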
Sparse Neighborhood Covers
• Set of sparse neighborhood covers
– { $2^i$-neighborhood cover: $0 \le i \le \log n$ }
• For each node:
– For any $r$, the $r$-zone is contained within a cluster of diameter $O(r \log n)$
– The node is in $O(\log^2 n)$ clusters
• Applications:
– Tracking mobile users
– Distributed directories for replicated objects
Online Tracking of Mobile Users
• Given a fixed network with mobile users
• Need to support location query operations
• Home location register (HLR) approach:
– Whenever a user moves, the corresponding HLR is updated
– Inefficient if the user is near the seeker, yet the HLR is far
• Performance issues:
– Cost of query: ratio with “distance” between source and destination
– Cost of updating the data structure when a user moves
Mobile User Tracking: Initial Setup
• The sparse $2^i$-neighborhood cover forms a regional directory at level $i$
• At level $i$, each node u selects a home cluster that contains the $2^i$-zone of u
• Each cluster has a leader node.
• Initially, each user registers its location with the home cluster leader at each of the $O(\log n)$ levels
The Location Update Operation
• When a user X moves, X leaves a forwarding pointer at the previous host.
• User X updates its location at only a subset of home cluster leaders
– For every sequence of moves that add up to a distance of at least $2^i$, X updates its location with the leader at level $i$
• Amortized cost of an update is $O(d \log n)$ for a sequence of moves totaling distance $d$
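The lazy update rule can be sketched with one distance counter per level. This models only which levels get refreshed after each move; forwarding pointers, clusters, and leaders are left out, and the class and method names are hypothetical.

```python
from typing import List

class TrackedUser:
    """Tracks, per level i, the distance moved since level i was last updated."""

    def __init__(self, num_levels: int) -> None:
        self.moved: List[float] = [0.0] * num_levels

    def move(self, distance: float) -> List[int]:
        """Record a move and return the levels whose leaders must be updated."""
        updated = []
        for i in range(len(self.moved)):
            self.moved[i] += distance
            if self.moved[i] >= 2 ** i:
                updated.append(i)     # re-register with the level-i leader
                self.moved[i] = 0.0   # start accumulating again
        return updated

# Four unit-distance moves with levels 0..3.
user = TrackedUser(num_levels=4)
history = [user.move(1) for _ in range(4)]
```

For unit moves, level 0 is refreshed every time, level 1 every second move, and level 2 every fourth move, so high levels are touched rarely.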
The Location Query Operation
• To locate user X, go through the levels starting from 0 until the user is located
• At level $i$, query each of the $O(\log n)$ clusters u belongs to in the $2^i$-neighborhood cover
• Follow the forwarding pointers, if necessary
• Cost of query: $O(d \log n)$, if $d$ is the distance between the querying node and the current location of the user
Comments on the Tracking Scheme
• Distributed construction of sparse covers in $O(m \log n + n \log^2 n)$ time [Awerbuch et al 93]
• The storage load for leader nodes may be excessive; use hashing to distribute the leadership role (per user) over the cluster nodes
• Distributed directories for accessing replicated objects [Awerbuch-Bartal-Fiat 96]
– Allows reads and writes on replicated objects
– An $O(\log n)$-competitive algorithm assuming each node has $O(\log n)$ times more memory than the optimal
• Unclear how to maintain sparse neighborhood covers in a dynamic network
Bubbles Routing and Partitioning Scheme
• Adaptive scheme by [Dolev et al 95]
• Hierarchical Partitioning of a spanning tree structure
• Provable bounds on efficiency for updates
[Figure: 2-level partitioning of a spanning tree, with the root marked]
Bubbles (cont.)
• Size of clusters at each level is bounded
• Cluster size grows exponentially
• # of levels equal to # of routing hops
• Tradeoff between number of routing hops and update costs
• Each cluster has a leader who has routing information
• General idea:
- route up the tree until in the same cluster as destination,
- then route down
- maintain by rebuilding/fixing things locally inside subtrees
Bubbles Algorithm
• A partition is an [x,y]-partition if all its clusters are of size between x and y
• A partition P is a refinement of another partition P’ if each cluster in P is contained in some cluster of P’.
• An (x_1, x_2, …, x_k)-hierarchical partitioning is a sequence of partitions P_1, P_2, .., P_k such that
- P_i is an [x_i, d x_i] partitioning (d is the degree)
- P_i is a refinement of P_(i-1)
• Choose x_(k+1) = 1 and x_i = x_(i+1) · n^(1/k)
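Unrolling this recurrence gives x_i = n^((k+1-i)/k), so x_1 = n and the lower bound on cluster size shrinks by a factor of n^(1/k) per level. A minimal sketch, with a hypothetical function name and sample values n = 1000 and k = 3:

```python
from typing import List

def cluster_size_bounds(n: int, k: int) -> List[float]:
    """Return [x_1, ..., x_(k+1)] with x_(k+1) = 1 and x_i = x_(i+1) * n**(1/k)."""
    sizes = [1.0]                      # x_(k+1)
    for _ in range(k):
        sizes.append(sizes[-1] * n ** (1.0 / k))
    sizes.reverse()                    # now ordered x_1 first
    return sizes

bounds = cluster_size_bounds(1000, 3)
rounded = [round(x) for x in bounds]
```

For these sample values the rounded sequence is 1000, 100, 10, 1, so the top-level cluster is the whole tree and each refinement is ten times finer.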
Clustering Construction
• Build a spanning tree, say, using BFS
• Let P_1 be the cluster consisting of the entire tree
• Partition P_1 into clusters, resulting in P_2
• Recursively partition each cluster
• Maintenance rules:
- when a new node is added, try to include in existing cluster, else split cluster
- when a node is removed, if necessary combine clusters
Performance Bounds
• memory requirement
• adaptability
• k hops during routing
• matching lower bound for bounded degree graphs
• Note: Bubbles does not provide a non-trivial upper bound on stretch in the non-hop model
kk nd /123
nkdn k log/11