Distributed Data Structures: A Survey Cyril Gavoille (LaBRI, University of Bordeaux)

Contents
1. Efficient data structures
2. Distributed data structures
3. Informative labeling schemes
4. Conclusion

1. Efficient data structures (Tarjan’s like)
Example 1:
A tree (static) T with n vertices
Question: nearest common ancestor nca(x,y) for some vertices x,y?
Note: queries (x,y) are not known in advance (on-line queries on a static tree)

[Harel-Tarjan ’84]
Each tree with n vertices has a data structure of O(n) space (computable in linear time) such that nca queries can be answered in constant time.
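
To make the nca query concrete, here is a minimal sketch. It is not the Harel-Tarjan structure: it swaps in the simpler Euler tour plus sparse table technique, which answers queries in constant time but uses O(n log n) space instead of O(n). The function name `build_nca` and the parent-array input format are assumptions made for this illustration.

```python
def build_nca(parent):
    """parent[v] is the parent of vertex v, or -1 for the root.
    Returns a function nca(x, y). Sketch only: O(n log n) space."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for v, p in enumerate(parent):
        if p < 0:
            root = v
        else:
            children[p].append(v)

    # Euler tour: record (depth, vertex) every time the walk is at a vertex.
    tour, first, depth = [], [0] * n, [0] * n
    stack = [(root, 0)]  # (vertex, index of the next child to explore)
    while stack:
        v, i = stack.pop()
        if i == 0:
            first[v] = len(tour)
        tour.append((depth[v], v))
        if i < len(children[v]):
            stack.append((v, i + 1))
            c = children[v][i]
            depth[c] = depth[v] + 1
            stack.append((c, 0))

    # Sparse table: table[k][i] is the minimum of tour[i : i + 2**k].
    table = [tour]
    j = 1
    while 2 * j <= len(tour):
        prev = table[-1]
        table.append([min(prev[i], prev[i + j])
                      for i in range(len(tour) - 2 * j + 1)])
        j *= 2

    def nca(x, y):
        lo, hi = sorted((first[x], first[y]))
        k = (hi - lo + 1).bit_length() - 1
        # Two overlapping windows of length 2**k cover tour[lo..hi].
        return min(table[k][lo], table[k][hi - (1 << k) + 1])[1]

    return nca


nca = build_nca([-1, 0, 0, 1, 1, 2])  # vertex 0 is the root
print(nca(3, 4), nca(3, 5))
```

The shallowest vertex visited between the first occurrences of x and y on the tour is their nearest common ancestor, which is why a range-minimum query suffices.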

Example 2:
A weighted graph G with n vertices, and a parameter k ≥ 1
Question: a k-approximation δ(x,y) on dist(x,y) in G for some vertices x,y?
with dist(x,y) ≤ δ(x,y) ≤ k·dist(x,y)

[Thorup-Zwick - J.ACM ’05]
Each undirected weighted graph G with n vertices, and each integer k ≥ 1, has a data structure of O(k·n^(1+1/k)) space (computable in O(km·n^(1/k)) expected time) such that (2k-1)-approximate distance queries can be answered in O(k) time.
Essentially optimal, related to an Erdős conjecture.

2. Distributed data structures
Typical questions are:
Answer to query Q with the local knowledge of x (or its vicinity), so without any access to a global data structure.
[Figure: a network, with a vertex x]

Example 1: Distributed Hash Tables (DHT)
Query at x: who has any mpeg file named ‘‘Sta*Wa*’’?
Answer: go to w and ask it. x does not know, but w certainly knows … at least a pointer
[Figure: set of peers, logical network, with the vertex x]

Example 2: Routing in a physical network
Query at x: next hop to go to y?
[Figure: a network with vertices x and y]

Example 3: in a dynamic setting
A growing rooted tree
Query at x: the number of descendants of x (or a constant approximation of it)
It is possible to maintain a 2-approximation on the number of descendants with O(log^2 n) amortized messages of O(log log n) bits each, where n is the number of inserted vertices.
[Afek,Awerbuch,Plotkin,Saks – J.ACM ’96]

Goals are:
► The same as for global data structures:
• Low preprocessing time
• Small size data structure
• Fast query time
• Efficient updates
+ Smaller and balanced local data structures
+ Low communication cost (trade-offs), for multiple hops answers

3. Informative Labeling Schemes
For the talk:
• A static network/graph
• Queries: involve only vertices
• Answers: do not require any communication (direct data structures)

Question: dist(x,y) in a graph G?
Answering dist(x,y) consists only in inspecting the local data structure of x and of y.
Main goal: minimize the maximal size of a local data structure.
Wish: |DS(x,G)| « |DS(G)|, ideally |DS(x,G)| ≈ (1/n)·|DS(G)|
[Figure: the data structure for graph G, split into local parts at x and y]

[Thorup-Zwick - J.ACM ’05]
… Moreover, each vertex w has a label L(w) of Õ(n^(1/k) log D) bits (D = weighted diameter of G) such that a (2k-1)-approximation on dist(x,y) can be answered from L(x) and L(y) only.
[Figure: a global structure of size n^(1+1/k) split into labels of size n^(1/k) for vertices w, x, y. Overlap: Õ(log D)]

Informative labeling schemes (more formally) [Peleg ’00]
Let P be a graph property defined on pairs of vertices (can be extended to any tuple), and let F be a graph family.
A P-labeling scheme for F is a pair ‹L,f› such that: ∀ G ∈ F, ∀ u,v ∈ G:
• (labeling) L(u,G) is a binary string
• (decoder) f(L(u,G),L(v,G)) = P(u,v,G)

Some P-labeling schemes
► Adjacency
► Distance (exact or approximate)
► First edge on a (near) shortest path (compact routing, label-based routing)
► Ancestry, parent, nca, sibling relation in trees
► Edge connectivity, flow
► General predicate P described in monadic second order logic [Courcelle]
► Proof labeling systems [Korman,Kutten,Peleg]

Ancestry in rooted trees
Motivation: [Abiteboul,Kaplan,Milo ’01]
The <TAG> … </TAG> structure of a huge XML database is a rooted tree. Some queries are ancestry relations in this tree.
Use a compact index for a fast XML search engine. Here the constants do matter: saving 1 byte on each entry of the index table is important. Here n is very large, ~ 10^9.
Ex: Is <“distributed computing”> a descendant of <book_title>?

Folklore? [Santoro, Khatib ’85]
DFS labeling: 2 log n bit labels
Ancestry test: [a,b] ⊆ [c,d]?
[Figure: a rooted tree with vertices numbered 1 to 27 in DFS order, with sample labels L(x)=[2,18], [13,18] and [22,27]]
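
The DFS interval labeling can be sketched as follows. This is an illustration under stated assumptions, not the cited authors' code: the names `interval_labels` and `is_ancestor`, and the dictionary-of-children input, are hypothetical. Each vertex gets the pair (its DFS number, the largest DFS number in its subtree), so two numbers of about log n bits each.

```python
def interval_labels(children, root):
    """children: {v: [child, ...]}. Returns {v: (a, b)} where a is the DFS
    (preorder) number of v and b the largest DFS number in v's subtree."""
    label, start, counter = {}, {}, 0
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            # All descendants of v have been numbered by now.
            label[v] = (start[v], counter)
        else:
            counter += 1
            start[v] = counter
            stack.append((v, True))
            for c in reversed(children.get(v, [])):
                stack.append((c, False))
    return label


def is_ancestor(label_u, label_v):
    """Decoder: u is an ancestor of v (or u == v) iff v's interval lies
    inside u's interval. Only the two labels are inspected."""
    return label_u[0] <= label_v[0] and label_v[1] <= label_u[1]


tree = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
L = interval_labels(tree, "r")
print(L["a"], is_ancestor(L["a"], L["d"]), is_ancestor(L["a"], L["e"]))
```

The decoder never touches the tree itself, which is the point of a labeling scheme: the answer comes from the two labels alone.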

[Alstrup,Rauhe – SODA ’02]
Upper bound: log n + O(√log n) bits
Lower bound: log n + Ω(log log n) bits
[Figure: the same rooted tree, with a single number from 1 to 27 on each vertex]

Adjacency Labeling / Implicit Representation
P(x,y,G)=1 iff xy in E(G)
[Kannan,Naor,Rudich – STOC ’92]
O(log n) bit labels for:
• trees (and forests)
• bounded arboricity graphs (planar, …)
• bounded treewidth graphs
In particular:
• 2 log n bits for trees
• 4 log n bits for planar
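
The 2 log n bound for trees can be illustrated with a short sketch. It assumes vertices are numbered 0 to n-1 and that the root is recorded as its own parent; the function names are hypothetical. The label of a vertex is its identifier followed by its parent's identifier, each on ⌈log2 n⌉ bits.

```python
def tree_adjacency_labels(parent):
    """parent[v] is the parent of v; the root is its own parent.
    Returns one binary string of 2 * ceil(log2 n) bits per vertex."""
    n = len(parent)
    w = max(1, (n - 1).bit_length())  # bits needed for one identifier
    return [format(v, f"0{w}b") + format(p, f"0{w}b")
            for v, p in enumerate(parent)]


def adjacent(label_u, label_v):
    """Decoder: u and v are adjacent iff one is the parent of the other."""
    h = len(label_u) // 2
    u, pu = label_u[:h], label_u[h:]
    v, pv = label_v[:h], label_v[h:]
    return u != v and (pu == v or pv == u)


L = tree_adjacency_labels([0, 0, 0, 1, 1, 2])  # vertex 0 is the root
print(L[3], adjacent(L[3], L[1]), adjacent(L[3], L[4]))
```

The same idea extends to bounded arboricity graphs by storing one parent per forest in a decomposition of the edges into forests.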

Actually, the problem is equivalent to an old combinatorial problem:
[Babai,Chung,Erdős,Graham,Spencer ’82]
Small Universal Induced Graph
U is a universal graph for the family F if every graph of F is isomorphic to an induced subgraph of U
[Figure: a universal graph and graphs of the family embedded in it as induced subgraphs, with vertices labeled a to g]

[Figure: a universal graph U (fixed for F) and a graph G of F embedded in it as an induced subgraph, with vertices labeled a to g]
|L(x,G)| = log2 |V(U)|

Best known results / Open questions
► Bounded degree graphs: 1.867 log n [Alon,Asodi - FOCS ’02]
► Trees: log n + O(log* n) [Alstrup,Rauhe - FOCS ’02]
Planar: 3 log n + O(log* n)
[Figure: vertices x, v, y and a vertex set Z]
log* n = min{ i ≥ 0 | log^(i) n ≤ 1 }
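
A small sketch of the log* definition, assuming base-2 logarithms (the slide does not state the base); the function name `log_star` is hypothetical.

```python
import math


def log_star(n):
    """Smallest i >= 0 such that applying log2 i times to n gives a value <= 1."""
    i, x = 0, n
    while x > 1:
        x = math.log2(x)
        i += 1
    return i


print([log_star(n) for n in (1, 2, 4, 16, 65536)])
```

The function grows so slowly that the O(log* n) additive terms above are at most a handful of units for any practical n.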

Lower bounds?: log n + Ω(1) for planar
No hereditary family with n!·2^O(n) labeled graphs (trees, planar, bounded genus, bounded treewidth, …) is known to require labels of log n + ω(1) bits.
log n + O(1) bits for this family?

Distance
P(x,y,G)=dist(x,y) in G
Motivation: [Peleg ’99]
If a short label (say of polylogarithmic size) can be added to the address of the destination, then routing to any destination can be done without routing tables and with a “limited” number of messages.
[Figure: a message travelling from x to y at distance dist(x,y); message header = hop-count]

A selection of results
► Θ(n) bits for general graphs
• 1.56n bits, but with O(n) time decoder! [Winkler ’83 (Squashed Cube Conjecture)]
• 11n bits and O(log log n) time decoder [Gavoille,Peleg,Pérennès,Raz ’01]
► Θ(log^2 n) bits for trees and bounded treewidth graphs, … [Peleg ’99, GPPR ’01]
► Θ(log n) bits and O(1) time decoder for interval, permutation graphs, … [ESA ’03]: O(n) space O(1) time data structure, even for m = Ω(n^2)
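
To see where a log^2 n bound for trees can come from, here is a minimal sketch based on centroid decomposition, a standard separator-based construction. It is not claimed to be the exact scheme of the cited papers. It assumes an unweighted tree given as an adjacency dictionary; all names are hypothetical. Each vertex stores its distance to the O(log n) centroids chosen for the nested components containing it, so O(log n) entries of O(log n) bits each.

```python
def centroid_distance_labels(adj):
    """adj: {v: [neighbours]} of an unweighted tree.
    Returns {v: {centroid: distance from v to that centroid}}."""
    labels = {v: {} for v in adj}
    removed = set()

    def component(start):
        # BFS over the part of the tree not removed yet.
        order, par = [start], {start: None}
        for v in order:  # the list grows while we iterate over it
            for w in adj[v]:
                if w not in removed and w not in par:
                    par[w] = v
                    order.append(w)
        return order, par

    def find_centroid(start):
        order, par = component(start)
        size = {v: 1 for v in order}
        for v in reversed(order):
            if par[v] is not None:
                size[par[v]] += size[v]
        total = len(order)
        for v in order:
            heaviest = total - size[v]  # the piece on the parent side
            for w in adj[v]:
                if w not in removed and par.get(w) == v:
                    heaviest = max(heaviest, size[w])
            if 2 * heaviest <= total:
                return v

    stack = [next(iter(adj))]
    while stack:
        c = find_centroid(stack.pop())
        order, par = component(c)
        dist = {c: 0}
        for v in order:
            if par[v] is not None:
                dist[v] = dist[par[v]] + 1
            labels[v][c] = dist[v]
        removed.add(c)
        stack.extend(w for w in adj[c] if w not in removed)
    return labels


def tree_distance(label_x, label_y):
    """Decoder: the best route through a centroid stored in both labels."""
    return min(d + label_y[c] for c, d in label_x.items() if c in label_y)


adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
L = centroid_distance_labels(adj)
print(tree_distance(L[0], L[4]), tree_distance(L[0], L[5]))
```

The first centroid that separates x from y lies on the path between them, so the minimum over shared centroids is the exact distance.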

Results (cont’d)
► Θ(log n·log log n) bits and (1+o(1))-approximation for trees and bounded treewidth graphs [GKKPP – ESA ’01]
► More recently: doubling dimension-α graphs
Every radius-2r ball can be covered by ≤ 2^α radius-r balls
• Euclidean graphs have α = O(1)
• Include bounded growing graphs
• Robust notion

Distance labeling for doubling dimension graphs
O(ε^(-O(α)) log n·log log n) bits
(1+ε)-approximation for doubling dimension-α graphs
[Gupta,Krauthgamer,Lee – FOCS ’03]
[Talwar – STOC ’04]
[Mendel,Har-Peled – SoCG ’05]
[Slivkins - PODC ’05]

Distance labeling for planar
► O(log^2 n) bits for 3-approximation [Gupta,Kumar,Rastogi – SICOMP ’05]
► O(ε^(-1) log^2 n) bits for (1+ε)-approximation [Thorup – J.ACM ’04]
► Ω(n^(1/3)) ≤ ? ≤ Õ(√n) for exact distance

Lower bounds for planar [Gavoille,Peleg,Pérennès,Raz – SODA ’01]
#vertices ~ k^3
#critical edges ~ k^2
#labels = 2k
|label| > k^2/(2k) ~ n^(1/3)

Proof Labeling Systems [Korman,Kutten,Peleg – PODC ’05]
► A graph G with a state S_u at each vertex u: (G,S)
► A global property P (MST, 3-coloring, …)
► A marker algorithm applied on (G,S) that returns a label L(u) for u
► A binary decoder (checker) for u applied on N(u): f_u = f(S_u, L(u), L(v_1)…L(v_k)) ∈ {0,1}
G has property P ⇒ f_u = 1 ∀u
G hasn't prop. P ⇒ ∃w, f_w = 0 whatever the labels are
[Figure: a vertex u with neighbors v_1, v_2, v_3, and states S_1 to S_5 on the vertices]
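
The marker/checker pair can be sketched on a small property that is not one of the slide's examples: "the parent pointers stored in the states form a spanning tree of G". This choice, the assumption that G is connected, and the names `marker`, `checker` and `verify` are all assumptions made for illustration. The label of u is (its identifier, the root's identifier, its depth in the tree).

```python
def marker(parent):
    """parent: {u: parent of u, or None for the root}. Only meant to be run
    on a legal configuration. Returns L(u) = (id of u, id of root, depth)."""
    labels = {}
    for u in parent:
        path = [u]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        labels[u] = (u, path[-1], len(path) - 1)
    return labels


def checker(u, state, label, neighbour_labels):
    """Local test at u: uses only S_u, L(u) and the labels of u's neighbours."""
    ident, root, depth = label
    # The label must carry u's own identifier, and all neighbours must
    # agree with u on the identity of the root.
    if ident != u or any(r != root for _, r, _ in neighbour_labels):
        return 0
    if state is None:
        # A vertex claiming to be the root must be the agreed root, at depth 0.
        return int(depth == 0 and root == u)
    # Otherwise the parent must be a neighbour exactly one level higher.
    return int(any(i == state and d == depth - 1
                   for i, _, d in neighbour_labels))


def verify(adj, parent, labels):
    """G is accepted iff every vertex outputs 1."""
    return all(checker(u, parent[u], labels[u], [labels[v] for v in adj[u]])
               for u in adj)


adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # a 4-cycle
good = {0: None, 1: 0, 2: 1, 3: 0}
print(verify(adj, good, marker(good)))
```

Depths strictly decrease along parent pointers, so a cycle of pointers cannot be labeled consistently, and two roots cannot both match the single root identifier that a connected graph forces every vertex to agree on.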

What is the knowledge needed for local verifications of global properties?
[Figure: a graph with states S_1 to S_5 on the vertices]

Conclusion
► Labeling schemes for distributed computing are a rich concept.
► Many things remain to be done, especially lower bounds