Algorithm Design: Theory & Practice
Robert E. Tarjan
Princeton University & HP Labs
Observations
Over the last 50 years, theoreticians have developed many beautiful and asymptotically efficient algorithms. But many such algorithms have yet to be used in practice, and some, when used, are less efficient in practice than simpler methods with worse theoretical efficiency.
Why?
Implementers, pressed for time, will choose the simplest solution that works, or seems to. Einstein: “Make everything as simple as possible, but not simpler.”
Practical instances are not worst-case. Indeed, they may have some structure that can be exploited by an algorithm.
How should we theoreticians respond?
Develop and analyze simple methods: apply theory to analyze and improve methods used in practice. Find the structure in empirical datasets and devise algorithms that exploit such structure.
Two examples
Data structure: CB Trees. Devise a simple kind of search tree that exploits differences in access frequencies, does little restructuring, and gracefully supports concurrency.
Algorithm: IBFS Max Flow Algorithm. Devise a max flow algorithm that exploits the structure of networks arising in vision applications and that has robust and provable efficiency.
CBTree: A Practical Concurrent Self-Adjusting Search Tree
Boris Korenfeld
joint with Yehuda Afek, Haim Kaplan, Adam Morrison, Robert E. Tarjan
DISC 2012
Binary search trees
Access time: O(log n) :-)
Search trees in practice
Access time: O(log n) :-(
Some items are more popular than others
Self-adjusting search trees
Access time: f(access pattern) :-)
Ideal time: O(log(# total accesses / # accesses to v))
Splay tree: Lookup(s)
Traverse the tree to find the node s.
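Zig-zag step
[Figure: zig-zag step. Before: z has left child y and right subtree D; y has left subtree A and right child x; x has subtrees B and C. After: x is the root; its left child y has subtrees A and B, and its right child z has subtrees C and D.]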
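Zig-zig step
[Figure: zig-zig step. Before: z has left child y and right subtree D; y has left child x and right subtree C; x has subtrees A and B. After: x is the root with left subtree A and right child y; y has left subtree B and right child z; z has subtrees C and D.]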
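The zig-zag and zig-zig steps named on these slides can be sketched in a few lines. This is a minimal illustration of standard bottom-up splaying, not code from the talk; the names `Node`, `rotate_up`, `splay`, `bst_insert` and `inorder` are hypothetical helpers introduced here.

```python
class Node:
    """Binary search tree node with a parent pointer (illustrative only)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate x above its parent; the in-order sequence of keys is unchanged."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                      # right rotation
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                # left rotation
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root with zig, zig-zig and zig-zag steps."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                 # zig: parent is the root
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                 # zig-zig: rotate the parent first,
            rotate_up(x)                 # then x
        else:
            rotate_up(x)                 # zig-zag: rotate x twice
            rotate_up(x)
    return x

def bst_insert(root, key):
    """Plain (non-splaying) insertion; returns (root, new node)."""
    node = Node(key)
    if root is None:
        return node, node
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = node
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = node
                break
            cur = cur.right
    node.parent = cur
    return root, node

def inorder(n):
    """Keys of the subtree rooted at n, in sorted order."""
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)
```

After inserting some keys with `bst_insert`, calling `splay` on any node makes it the root while `inorder` of the new root returns the same sorted key list as before.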
Splay tree: Lookup(h)
[Figure: animation of Lookup(h) over three slides; the accessed node h ends at the root.]
Ideal amortized access time: O(log(# total accesses / # accesses to v))
Concurrent splay tree?
Every thread tries to move its node up to the root, so the root becomes a hot spot.
Our work: Concurrent self-adjusting tree without the hot spot
Contributions
• Counter Based Tree: a new sequential self-adjusting tree with provably few rotations
• Efficient concurrent self-adjusting tree without a hot spot
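Counter-based tree
[Figure: search tree in which each node carries a counter. Below, children are indented under their parent, left child first.]
L (50)
    F (20)
        D (7)
            A (1)
            E (1)
        J (8)
            G (1)
            K (7)
    S (12)
        O (5)
            N (1)
            R (1)
        W (7)
            U (3)
            Z (1)
50 operations on the tree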
20 operations on the left subtree, 12 operations on the right subtree
18 accesses to node L
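The captions above follow from simple arithmetic on the counters, which the sketch below spells out. It assumes that a node's counter counts operations on its whole subtree, as the captions suggest, and that the parent/child links are as inferred from the label order on the slide; `TREE`, `own_accesses` and `ideal_time` are hypothetical names introduced for illustration.

```python
import math

# node -> (subtree counter, left child, right child).
# Counters are the ones shown on the slide; the parent/child links are an
# assumption inferred from the label order and the search-tree ordering.
TREE = {
    "L": (50, "F", "S"),
    "F": (20, "D", "J"),
    "S": (12, "O", "W"),
    "D": (7, "A", "E"),
    "J": (8, "G", "K"),
    "O": (5, "N", "R"),
    "W": (7, "U", "Z"),
    "A": (1, None, None), "E": (1, None, None),
    "G": (1, None, None), "K": (7, None, None),
    "N": (1, None, None), "R": (1, None, None),
    "U": (3, None, None), "Z": (1, None, None),
}

def own_accesses(v):
    """Operations that ended at v itself: v's counter minus its children's."""
    counter, left, right = TREE[v]
    below = sum(TREE[c][0] for c in (left, right) if c is not None)
    return counter - below

def ideal_time(v, root="L"):
    """log2(# total accesses / # accesses to v), the ideal access time up to
    a constant factor; infinite if v itself was never accessed."""
    own = own_accesses(v)
    if own == 0:
        return math.inf
    return math.log2(TREE[root][0] / own)
```

With these numbers, `own_accesses("L")` is 50 - 20 - 12 = 18, matching the last caption.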
CBTree traversal
After traversal (bottom-up) or during traversal (top-down), do single and double rotations when they improve the frequency balance of the tree.
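Zig-zag step
Rotate only if x is heavy, then take two steps down along the search path.
[Figure: zig-zag double rotation on x, y, z with subtrees A, B, C, D; after the rotation x is the parent of y (subtrees A, B) and z (subtrees C, D).]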
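Semi zig-zig step
Rotate only if z – y is heavy, then take two steps down along the search path.
[Figure: semi zig-zig rotation on x, y, z with subtrees A, B, C, D; after the rotation y is the parent of x (subtrees A, B) and z (subtrees C, D).]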
Analysis + Algorithm Design
Use the splay tree potential function in both design and analysis: do rotations when they decrease the splay tree potential; otherwise just follow the search path.
Total time = total traversal steps + number of rotations = sum of ideal times plus total potential decrease.
Potential increase can be made O(1) per insertion.
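To make the "rotate only when the potential decreases" rule concrete, here is a small sketch for a single rotation. It assumes the usual splay potential, the sum over all nodes of log2 of the node's subtree weight, with the CBTree counters used as weights. The function names and the `epsilon` threshold are hypothetical, and the rule actually used by CBTree (which also covers double rotations) may differ in detail.

```python
import math

def rotation_potential_change(w_z, w_y, w_b):
    """Change in potential when y, a child of z, is rotated above z.

    w_z: subtree weight (counter) of z before the rotation
    w_y: subtree weight of y before the rotation
    w_b: weight of the subtree of y that moves under z (0 if empty)

    After the rotation y has weight w_z and z has weight w_z - w_y + w_b;
    no other node changes, so the change is log2(new w(z)) - log2(old w(y)).
    """
    # Assumption: every node has weight at least 1, so the log is defined.
    new_w_z = max(w_z - w_y + w_b, 1)
    return math.log2(new_w_z) - math.log2(w_y)

def should_rotate(w_z, w_y, w_b, epsilon=0.0):
    """Rotate only if the potential drops by more than epsilon (hypothetical)."""
    return rotation_potential_change(w_z, w_y, w_b) < -epsilon
```

With the counters from the counter-based tree figure, and assuming J is the subtree of F that would move under L, rotating F above L gives `should_rotate(50, 20, 8) == False`: the potential would rise by log2(38/20), so the search just follows the path. A heavier child, say weight 40 out of 50 with a moving subtree of weight 5, does trigger the rotation.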
Sequential CBTree
• Self-adjusting tree with ideal access time and provably few rotations
• The price: space to store a counter per node.
Sequential to concurrent
• Synchronizing rotations
  – As in balanced BSTs!
  – We use Bronson et al.’s [PPoPP’10] algorithm.
• Maintaining counters
  – Plain read/write.
Experiments
• Algorithms:
  – AVL, CBTree, Splay, Treap
• Real-life sequences
  – Lookups
• Hardware architecture:
  – Sun UltraSPARC T2+: 64 threads
Maximum Flow by Incremental Breadth First Search
ESA 2011
Sagi Hed, Tel Aviv University
Haim Kaplan, Tel Aviv University
Renato F. Werneck, Microsoft Research
Andrew V. Goldberg, Microsoft Research
Robert E. Tarjan, Princeton University & HP Labs
Maximum Flow in Computer Vision
• Graphs have specific structure
• Regular low-degree grids
• Arc capacities: different models for grid arcs and s-t arcs
Maximum Flow
• Input: directed graph G = (V, E), vertices s, t ∈ V, and a capacity assignment c(e) for e ∈ E
• Output: flow function f satisfying
  – conservation: for every v ≠ s, t: Σ_{(u,v) ∈ E} f(u,v) = Σ_{(v,u) ∈ E} f(v,u)
  – capacity: for every e: f(e) ≤ c(e)
  with maximal |f| = sum of flow out of s (into t)
• Well-studied problem
• Equivalent to the Minimum s-t Cut problem
• Solution methods: Augmenting Path (and blocking flow), Network Simplex, Push-Relabel
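The two constraints in the problem statement are easy to check mechanically. The sketch below validates a candidate flow and returns its value |f|. The dictionary-of-arcs representation and the name `flow_value` are assumptions made for illustration, and the check f(e) ≥ 0 is the usual implicit requirement rather than something stated on the slide.

```python
def flow_value(capacity, flow, s, t):
    """Validate a flow and return |f|, the net flow out of s.

    capacity, flow: dicts mapping arcs (u, v) to numbers.
    Raises ValueError if a capacity or conservation constraint is violated.
    """
    net_out = {}
    for (u, v), f in flow.items():
        # Capacity constraint (plus non-negativity, assumed implicitly).
        if f < 0 or f > capacity.get((u, v), 0):
            raise ValueError(f"capacity violated on arc {(u, v)}")
        net_out[u] = net_out.get(u, 0) + f
        net_out[v] = net_out.get(v, 0) - f
    # Conservation: flow in equals flow out at every vertex except s and t.
    for v, excess in net_out.items():
        if v not in (s, t) and excess != 0:
            raise ValueError(f"conservation violated at vertex {v}")
    return net_out.get(s, 0)
```

For example, with capacities `{("s","a"): 3, ("a","t"): 2, ("s","t"): 1}` and flow `{("s","a"): 2, ("a","t"): 2, ("s","t"): 1}`, the function returns 3.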
BK
• Boykov and Kolmogorov developed an algorithm (BK) that is the fastest in practice on vision instances [Boykov, Kolmogorov 04]
• Used as the standard min-cut algorithm in computer vision
• Usually outperforms Push-Relabel implementations by considerable factors
• Problem: BK has no known polynomial time guarantee. The best bound is O(mnF) for integral capacities (F is the maximum flow value)
• Indeed, on some instances BK performs poorly and is outperformed by Push-Relabel implementations
Our Contribution: IBFS
• We develop the IBFS algorithm: Incremental Breadth First Search
• Has many similarities to BK
• However, always performs shortest-path augmentations
• Competitive in practice with BK; usually outperforms BK by small factors
• Has a polynomial worst-case time guarantee of O(mn²)
Augmenting Path Algorithms
• Augmenting path algorithms maintain a flow function f at all times; the value |f| only increases.
• When the algorithm terminates, f is maximal
• Augmentation: add the maximal possible amount of flow along an s-t path
• Residual graph: Gf = (V, Ef), where
  Ef = {(u,v) | (u,v) ∈ E ∧ f(u,v) < c(u,v)} ∪ {(v,u) | (u,v) ∈ E ∧ f(u,v) > 0}
  Extend f and c to f(v,u) = -f(u,v) and c(v,u) = 0 for (u,v) ∈ E
• Ford & Fulkerson: augmentations in Gf give a maximal flow
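As a point of reference for the augmenting-path framework, here is a compact sketch that repeatedly augments along a shortest s-t path in the residual graph, found by a fresh breadth-first search each round. This is the textbook Edmonds-Karp rule, not BK or IBFS, which avoid restarting the search by maintaining trees; the name `max_flow_shortest_paths` and the dictionary representation are assumptions.

```python
from collections import deque

def max_flow_shortest_paths(capacity, s, t):
    """Maximum flow value via shortest augmenting paths (Edmonds-Karp).

    capacity: dict mapping arcs (u, v) to non-negative capacities.
    """
    # residual[(u, v)] = c(u, v) - f(u, v); reverse arcs start at 0.
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total = 0
    while True:
        # BFS from s over arcs with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total          # no augmenting path left: flow is maximal

        # Collect the path, find its bottleneck, and augment along it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[arc] for arc in path)
        for (u, v) in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        total += delta
```

Pushing `delta` on an arc and adding it to the reverse arc is exactly the residual-graph bookkeeping defined above: the reverse arc records flow that a later augmentation may cancel.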
BK Overview
• Maintain trees S, T in the residual graph
• Iterate 3 phases: Growth, Augmentation, Adoption
• Growth: grow S and T bi-directionally
• Augmentation: when the trees meet, we augment flow
• Adoption: after an augmentation, we try to reconnect “orphaned” subtrees
IBFS Overview
• We maintain S, T as BFS trees with heights ≈ Ds, Dt
• Augment on shortest paths
[Figure: trees S and T between s and t, with connecting paths labeled “shortest” and “shortest+1”.]
IBFS Overview
• Adoption (how to rebuild the trees):
  – If an orphaned subtree reconnects at the same level, we’re done
  – Otherwise:
    – Relabel: set the label to that of the lowest potential parent + 1
    – Make the children into orphan subtrees
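The two adoption cases can be sketched as a single function for the S tree. This is a simplified illustration loosely following the bullets above, not the authors' implementation: it omits details such as the order in which orphans are processed. The names `process_orphan` and `potential_parents` are hypothetical, and `max_label` is an assumed cap standing in for the current height bound Ds, which keeps relabeling from looping forever.

```python
def process_orphan(v, label, parent, children, potential_parents, max_label):
    """Handle one orphan v of the S tree; return the vertices it orphans.

    label:    dict vertex -> BFS level, for vertices currently in the tree
    parent:   dict vertex -> parent (None for the root and for orphans)
    children: dict vertex -> set of tree children
    potential_parents(v): tree vertices u with a residual arc (u, v)
    """
    candidates = list(potential_parents(v))

    # Case 1: some potential parent sits exactly one level above v,
    # so the subtree reconnects at the same level and nothing else changes.
    for u in candidates:
        if label[u] == label[v] - 1:
            parent[v] = u
            children[u].add(v)
            return []

    # Case 2: relabel. The children of v become orphans themselves.
    new_orphans = list(children[v])
    for w in new_orphans:
        parent[w] = None
    children[v].clear()

    best = min(candidates, key=label.get, default=None)
    if best is None or label[best] + 1 > max_label:
        # No usable parent: v leaves the tree (assumed behavior).
        del label[v]
        parent[v] = None
        return new_orphans

    label[v] = label[best] + 1          # lowest potential parent + 1
    parent[v] = best
    children[best].add(v)
    return new_orphans
```

A caller would keep a worklist of orphans, call `process_orphan` on each, and append whatever it returns until the worklist is empty.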
IBFS Overview
• BFS trees => worst-case bound O(mn²)
IBFS vs. BK
• Maintaining BFS trees => more work rebuilding the trees after each augmentation
• Shortest augmenting paths => less work in each augmentation
• Shortest augmenting paths lead to fewer augmentations => fewer growth steps
• We get rid of the parent traversal step
IBFS Experiments
• Ran on computer vision instances:
  – public benchmark [http://vision.csd.uwo.ca/maxflow-data/]
  – our own creation [http://www.cs.tau.ac.il/~sagihed/ibfs/]
• BK implementation available publicly [http://vision.csd.uwo.ca/code/]
• We compare to a modified version of BK, with the same low-level optimizations as our own (≈ 20% faster)
• IBFS outperforms BK on all but two instances: two different capacity versions of the instance “bone”
• Factors are mostly modest; for a few instances they are large.
IBFS Experiments
Operation Counts (per vertex)
Instance       OT (BK)  Orphans (BK)  Orphans (IBFS)  Growth (BK)  Growth (IBFS)  Pushes (BK)  Pushes (IBFS)  Speedup
digged         38.4     7.7           87.8            7.7          6.7            160.0        16.9           3
hessi1a        126.5    43.9          601.7           25.4         7.3            353.2        108.4          1.11
house          43.7     13.3          129.6           10.2         6.3            122.2        33.0           1.24
anthra         83.3     27.3          348.3           17.3         6.8            153.0        53.5           1.07
bone_subx100   23.0     6.8           30.1            8.8          6.8            10.9         2.8            1.17
liver100       66.5     13.6          56.0            12.3         6.9            23.2         7.5            2.15
babyface100    39.5     9.5           46.3            10.7         6.6            12.7         4.5            1.76
bone100        7.0      5.1           35.6            8.1          6.9            2.0          0.5            0.79
bunny-med      0.6      0.4           0.6             6.2          6.2            0.5          0.3            1.23
camel-med      61.2     13.0          92.4            9.4          6.8            74.0         20.4           1.54
gargoyle-med   250.5    20.7          121.6           12.1         8.7            337.2        22.7           6.16
kz2-venus      8.1      13.5          18.0            11.2         8.8            6.2          3.3            1.39
Conclusions
Do both theory and practice! Each gives direction to the other.