15-211 Fundamental Data Structures and Algorithms Peter Lee April 24, 2003 Union-Find.
15-211 Fundamental Data Structures and Algorithms
Peter Lee, April 24, 2003
Union-Find
Announcements
Quiz #4 is available until midnight tonight!
HW6 is due next week! Tournament on May 7.
Final Exam on May 8, 8:30am! Review session TBA.
Building Mazes
Thinking about the problem
Think about a grid of rooms separated by walls.
Each room can be given a name.
a b c d
e f g h
i j k l
m n o p
Randomly knock out walls until we get a good maze.
Mathematical formulation
A set of rooms: {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}
Pairs of adjacent rooms that have an open wall between them. For example, (a,b) and (g,k) are pairs.
a b c d
e f g h
i j k l
m n o p
Mazes as graphs
a b c d
e f g h
i j k l
m n o p
{(a,b), (b,c), (a,e), (e,i), (i,j), (f,j), (f,g), (g,h), (d,h), (g,k), (m,n), (n,o), (k,o), (o,p), (l,p)}
Mazes as graphs
{(a,b), (b,c), (a,e), (e,i), (i,j), (f,j), (f,g), (g,h), (d,h), (g,k), (m,n), (n,o), (k,o), (o,p), (l,p)}
a b c d
e f g h
i j k l
m n o p
For a maze to have a unique solution, its graph must be a tree.
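The tree condition can be checked mechanically: a graph on n nodes is a tree exactly when it has n - 1 edges and is connected. The following is a minimal sketch of that check for the 16-room example above. The class name MazeTreeCheck, the helpers isTree and parse, and the mapping of rooms a..p to indexes 0..15 are assumptions made for illustration, not part of the lecture.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MazeTreeCheck {
    // Returns true if the undirected graph on nodes 0..n-1 is a tree:
    // exactly n-1 edges and every node reachable from node 0.
    static boolean isTree(int n, int[][] edges) {
        if (n < 1 || edges.length != n - 1) return false;
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        boolean[] seen = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        seen[0] = true;
        int count = 1;                       // nodes reached so far
        while (!stack.isEmpty()) {
            int v = stack.pop();
            for (int w : adj.get(v)) {
                if (!seen[w]) {
                    seen[w] = true;
                    count++;
                    stack.push(w);
                }
            }
        }
        return count == n;
    }

    // Converts pairs such as "ab" (rooms a and b) into index pairs (a = 0, ..., p = 15).
    static int[][] parse(String... pairs) {
        int[][] edges = new int[pairs.length][];
        for (int i = 0; i < pairs.length; i++) {
            edges[i] = new int[] { pairs[i].charAt(0) - 'a', pairs[i].charAt(1) - 'a' };
        }
        return edges;
    }

    public static void main(String[] args) {
        // The 15 open walls listed on the slide.
        int[][] maze = parse("ab", "bc", "ae", "ei", "ij", "fj", "fg", "gh",
                             "dh", "gk", "mn", "no", "ko", "op", "lp");
        System.out.println(isTree(16, maze));   // prints true
    }
}
```

The edge count alone is not enough: 15 edges that leave some room unreachable would have to contain a cycle somewhere else, which is why the sketch also checks connectivity.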
Mazes as trees
A spanning tree is a tree that includes all of the nodes.
Why is it good to have a spanning tree?
[Figure: the maze redrawn as a tree over rooms a through p]
Algorithm
Essentially:
Randomly pick a wall and delete it (add it to the tree) if it won’t create a cycle.
Stop when a spanning tree has been created.
This is Kruskal’s Algorithm.
Creating a spanning tree
When adding a wall to the tree, how do we detect that it won’t create a cycle?
When adding wall (x,y), we want to know if there is already a path from x to y in the tree.
Using the union-find algorithm
We put rooms into an equivalence class if there is a path connecting them.
Before adding an edge (x,y) to the tree, make sure that x and y are not in the same equivalence class.
a b c d
e f g h
i j k l
m n o p
Partially-constructed maze
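The wall-removal loop can be sketched in a few lines once an equivalence-class test is available. The version below is an illustration under stated assumptions, not the lecture's code: the class name MazeBuilder, the build method, the row-major room numbering, and the fixed random seed are all hypothetical, and the union-find inside it is the plain forest form with no balancing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class MazeBuilder {
    private final int[] parent;   // parent[i] < 0 marks room i as the root of its class

    MazeBuilder(int rooms) {
        parent = new int[rooms];
        Arrays.fill(parent, -1);
    }

    // Follows parent links up to the representative of x's class.
    int find(int x) {
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    // Returns the walls knocked out for a rows x cols grid.
    // Rooms are numbered row by row: room = r * cols + c.
    static List<int[]> build(int rows, int cols, long seed) {
        List<int[]> walls = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int room = r * cols + c;
                if (c + 1 < cols) walls.add(new int[] { room, room + 1 });     // wall to the right
                if (r + 1 < rows) walls.add(new int[] { room, room + cols });  // wall below
            }
        }
        Collections.shuffle(walls, new Random(seed));   // random wall order

        MazeBuilder uf = new MazeBuilder(rows * cols);
        List<int[]> removed = new ArrayList<>();
        for (int[] w : walls) {
            int a = uf.find(w[0]);
            int b = uf.find(w[1]);
            if (a != b) {            // different classes: no path yet, so no cycle
                uf.parent[a] = b;    // merge the two classes
                removed.add(w);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<int[]> removed = build(4, 4, 42L);
        System.out.println(removed.size() + " walls removed");   // prints "15 walls removed"
    }
}
```

Whatever the shuffle order, a 4 x 4 grid always ends with 15 removed walls, because a spanning tree on 16 rooms has exactly 15 edges.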
Dynamic Equivalence Relations
Equivalence relations
The two-place relation “~” is an equivalence relation if (for all a, b, and c):
a ~ a (reflexive)
a ~ b iff b ~ a (symmetric)
a ~ b & b ~ c ⇒ a ~ c (transitive)
Equivalence relations?
< transitive, not reflexive, not symmetric
<= transitive, reflexive, not symmetric
e1 = O(e2) transitive, reflexive, not symmetric
== transitive, reflexive, symmetric
connected transitive, reflexive, symmetric
Equivalence classes
For any given element x ∈ S and two-place equivalence relation ~, the equivalence class of x is
{ y | y ∈ S ∧ x ~ y }
Making equivalence dynamic
Dynamic operations on an equivalence relation. For example, when removing walls in a maze.
Operations:
find(i): returns the equivalence class of i.
union(i,j): joins the classes of i and j.
Dynamic equivalence
Operations
find(i) returns the name of the set containing i.
union(i,j) joins the sets containing i and j.
{1} {2} {3} {4} {5} {6} {7}
union(2,3): {1} {2,3} {4} {5} {6} {7}
union(3,4): {1} {2,3,4} {5} {6} {7}
union(5,6): {1} {2,3,4} {5,6} {7}
union(6,3): {1} {2,3,4,5,6} {7}
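The sequence of unions above can be replayed in code to confirm which elements end up equivalent. This is a small sketch only: the class DynamicEquivalenceDemo and its same and replay helpers are hypothetical names, and the representation (each element points at a parent, a root points at itself) is one simple choice among several.

```java
class DynamicEquivalenceDemo {
    private final int[] parent;   // parent[i] == i marks i as the name of its set

    DynamicEquivalenceDemo(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // every element starts alone
    }

    // Returns the name of the set containing i.
    int find(int i) {
        while (parent[i] != i) i = parent[i];
        return i;
    }

    // Joins the sets containing i and j.
    void union(int i, int j) {
        parent[find(i)] = find(j);
    }

    // True when i and j are in the same equivalence class.
    boolean same(int i, int j) {
        return find(i) == find(j);
    }

    // Replays the four unions from the slide on elements 1..7 (index 0 is unused).
    static DynamicEquivalenceDemo replay() {
        DynamicEquivalenceDemo d = new DynamicEquivalenceDemo(8);
        d.union(2, 3);
        d.union(3, 4);
        d.union(5, 6);
        d.union(6, 3);
        return d;
    }

    public static void main(String[] args) {
        DynamicEquivalenceDemo d = replay();
        System.out.println(d.same(2, 6));   // prints true:  {2,3,4,5,6}
        System.out.println(d.same(1, 2));   // prints false: {1} is still alone
        System.out.println(d.same(7, 5));   // prints false: {7} is still alone
    }
}
```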
Union Find
The UnionFind interface
class UnionFind {
    UnionFind(int n) { . . . };
    int find(int i) { . . . };
    void union(int i, int j) { . . . };
}
To simplify matters, use integers {0,1,2,…,n-1} to represent the set elements.
Implementing Union-Find
A key question:
How should we represent the equivalence classes?
Let’s consider a naïve approach first, and then a better way…
A naïve array representation
Array with set indexes (entry i holds the name of the set containing i)
index:  0 1 2 3 4 5 6 7 8
set:    1 1 2 1 4 2 4 3 2
4 sets: {0,1,3}, {2,5,8}, {4,6}, {7}
union(1,4) yields:
set:    1 1 2 1 1 2 1 3 2
3 sets: {0,1,3,4,6}, {2,5,8}, {7}
Running time for naïve approach
With this naïve representation, find(n) runs in O(1) time.
What about union(n,m)?
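One way to see the cost of union in this representation is to write it out: every element carrying the old set name has to be relabeled, which means a scan of the whole array. The sketch below assumes a hypothetical class NaiveUnionFind that is seeded directly with the array from the slide; the choice to rename j's set to i's name follows the union(1,4) example.

```java
import java.util.Arrays;

class NaiveUnionFind {
    final int[] name;   // name[i] is the name of the set containing i

    NaiveUnionFind(int[] names) {
        name = names.clone();
    }

    // O(1): a single array lookup.
    int find(int i) {
        return name[i];
    }

    // O(n): every member of j's set is relabeled with i's set name.
    void union(int i, int j) {
        int from = name[j];
        int to = name[i];
        if (from == to) return;              // already in the same set
        for (int k = 0; k < name.length; k++) {
            if (name[k] == from) name[k] = to;
        }
    }

    public static void main(String[] args) {
        NaiveUnionFind uf = new NaiveUnionFind(new int[] { 1, 1, 2, 1, 4, 2, 4, 3, 2 });
        uf.union(1, 4);
        System.out.println(Arrays.toString(uf.name));   // prints [1, 1, 2, 1, 1, 2, 1, 3, 2]
    }
}
```

With find at O(1) and union at O(n), a run of n - 1 unions costs O(n²) in total, which is what motivates the forest representation.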
Forest and tree representation
Each set is a tree
{1} {2} {0,3} {4} {5}
union(2,1) adds a new subtree to a root
{1,2} {0,3} {4} {5}
union(0,1) adds a new subtree to a root
{1,2,0,3} {4} {5}
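demo
[Figure: the forest after each step: five trees (1; 2; 3 with child 0; 4; 5), then 2 attached under root 1, then 3 with its child 0 attached under root 1]
Forest and trees: array repn
{1,2,0,3} {4} {5}
find(2) = 1
find(4) = 4
Array representation
index:  0  1  2  3  4  5
s:      3 -1  1  1 -1 -1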
Find, v.0
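[Figure: tree rooted at 1 with children 2 and 3, node 0 under 3; single-node trees 4 and 5]
{1,2,0,3} {4} {5}
find(0) = 1
index:  0  1  2  3  4  5
s:      3 -1  1  1 -1 -1
public int find(int x) {
    if (s[x] < 0) return x;   // a negative entry marks x as a root
    return find(s[x]);        // otherwise keep climbing toward the root
}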
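Union, v.-1
[Figure: before, 2 hangs under root 1 and 0 hangs under root 3; after, root 3 hangs under node 2]
{1,2} {0,3} {4} {5} before
{1,2,0,3} {4} {5} after
union(0,2)
index:  0  1  2  3  4  5
s:      3 -1  1 -1 -1 -1   before
s’:     3 -1  1  2 -1 -1   after
public void union(int x, int y) {
    s[find(x)] = y;   // links x's root to y itself, which need not be a root
}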
Union, v.0
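[Figure: before, 2 hangs under root 1 and 0 hangs under root 3; after, root 3 hangs directly under root 1]
{1,2} {0,3} {4} {5} before
{1,2,0,3} {4} {5} after
union(0,2)
index:  0  1  2  3  4  5
s:      3 -1  1 -1 -1 -1   before
s’:     3 -1  1  1 -1 -1   after
public void union(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx != ry) s[rx] = ry;   // link root to root; skip if already in the same set
}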
Union v.0 is still O(n)!
• Find must walk the path to the root
• Unlucky combinations of unions can result in long paths
[Figure: a tall, thin tree over nodes 0 through 6 with a long path to the root]
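A concrete unlucky sequence is union(0,1), union(1,2), …, union(n-2,n-1): each call hangs the current root under a fresh single node, so the forest degenerates into one long chain. The sketch below illustrates this with a hypothetical ChainDemo class; its depth helper exists only to measure the path length and is not part of the interface from the lecture.

```java
import java.util.Arrays;

class ChainDemo {
    final int[] s;   // s[i] >= 0 is i's parent; s[i] < 0 marks a root

    ChainDemo(int n) {
        s = new int[n];
        Arrays.fill(s, -1);
    }

    // Walks to the root (written as a loop so long chains cannot overflow the stack).
    int find(int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    // Root-to-root linking with no balancing, as in union v.0.
    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx != ry) s[rx] = ry;
    }

    // Number of links find(x) has to follow.
    int depth(int x) {
        int d = 0;
        while (s[x] >= 0) {
            x = s[x];
            d++;
        }
        return d;
    }

    // Builds the chain 0 -> 1 -> ... -> n-1.
    static ChainDemo chain(int n) {
        ChainDemo uf = new ChainDemo(n);
        for (int i = 0; i + 1 < n; i++) uf.union(i, i + 1);
        return uf;
    }

    public static void main(String[] args) {
        ChainDemo uf = chain(8);
        System.out.println(uf.depth(0));   // prints 7: find(0) visits every other node
    }
}
```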
Trick 1: union by height
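union shallow trees into deep trees
• Tree depth increases only when depths equal
Track the tree height at the root, stored as a negative number
index:  0  1  2  3  4  5
s:      3 -3  1  1 -1 -1
Tree depth at most O(log₂ N)
[Figure: tree rooted at 1 with children 2 and 3, node 0 under 3; single-node trees 4 and 5]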
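Union by height is described above but not coded in the lecture, so here is a minimal sketch of how it might look. It assumes the convention suggested by the array on the slide: a root stores the negated height of its tree, counted in nodes, so a single node holds -1. The class name HeightUnionFind and the height helper are hypothetical.

```java
import java.util.Arrays;

class HeightUnionFind {
    // s[i] >= 0: parent of i.  s[i] < 0: i is a root and -s[i] is its tree height in nodes.
    final int[] s;

    HeightUnionFind(int n) {
        s = new int[n];
        Arrays.fill(s, -1);
    }

    int find(int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;                  // already in the same set
        if (s[rx] < s[ry]) {
            s[ry] = rx;                        // rx is deeper: hang ry under it, height unchanged
        } else {
            if (s[rx] == s[ry]) s[ry]--;       // equal heights: merged tree is one level taller
            s[rx] = ry;                        // hang rx under ry
        }
    }

    // Height in nodes of the tree containing x.
    int height(int x) {
        return -s[find(x)];
    }

    public static void main(String[] args) {
        HeightUnionFind uf = new HeightUnionFind(8);
        for (int i = 0; i + 1 < 8; i++) uf.union(i, i + 1);   // the chain-building sequence
        System.out.println(uf.height(0));                     // prints 2: no chain forms
    }
}
```

The same sequence of unions that produces a chain of length n under plain linking leaves a tree of height 2 here, since each new single node is hung under the existing deeper root.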
Trick 1’: union by size
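union small trees into big trees
• (Tree size always increases)
Track the tree size at the root, stored as a negative number
index:  0  1  2  3  4  5
s:      3 -4  1  1 -1 -1
Tree depth at most ???
[Figure: tree rooted at 1 with children 2 and 3, node 0 under 3; single-node trees 4 and 5]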
Trick 2: Path compression
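find flattens trees
• Redirect nodes to point directly to the root
Example: find(0)
Do this whenever traversing a path from node to root.
[Figure: before find(0), node 0 hangs under 3, which hangs under root 1; after, 0, 2, and 3 all point directly at root 1]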
Path compression
find flattens trees
• Redirect nodes to point directly to the root
• Do this whenever traversing a path from node to root.
public int find(int x) {
    if (s[x] < 0) return x;      // x is a root
    return s[x] = find(s[x]);    // point x at the root on the way back down
}
This implies that union does path compression (through its calls to find)
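To make the flattening visible, the recursive find can be run on the example forest and the array inspected afterwards. This is a sketch under assumptions: the class CompressionDemo is hypothetical and is seeded directly with the array s = 3 -1 1 1 -1 -1 used on the earlier slides.

```java
import java.util.Arrays;

class CompressionDemo {
    final int[] s;   // s[i] >= 0 is i's parent; s[i] < 0 marks a root

    CompressionDemo(int[] init) {
        s = init.clone();
    }

    // Recursive find with path compression, as on the slide.
    int find(int x) {
        if (s[x] < 0) return x;
        return s[x] = find(s[x]);
    }

    public static void main(String[] args) {
        CompressionDemo uf = new CompressionDemo(new int[] { 3, -1, 1, 1, -1, -1 });
        System.out.println(uf.find(0));             // prints 1
        System.out.println(Arrays.toString(uf.s));  // prints [1, -1, 1, 1, -1, -1]
    }
}
```

Before the call, node 0 reached the root through 3; afterwards s[0] holds 1, so the next find(0) takes a single step.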
The Code
All the code
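class UnionFind {
    int[] u;   // u[i] >= 0: parent of i; u[i] < 0: i is a root of a tree with -u[i] nodes

    UnionFind(int n) {
        u = new int[n];
        for (int i = 0; i < n; i++) u[i] = -1;
    }

    int find(int i) {
        int j, root;
        for (j = i; u[j] >= 0; j = u[j]) ;   // first pass: locate the root
        root = j;
        while (u[i] >= 0) {                  // second pass: point every node on the path at the root
            j = u[i];
            u[i] = root;
            i = j;
        }
        return root;
    }

    void union(int i, int j) {
        i = find(i);
        j = find(j);
        if (i != j) {
            if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }   // i's tree is larger: absorb j
            else             { u[j] += u[i]; u[i] = j; }   // otherwise j absorbs i
        }
    }
}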
The UnionFind class
class UnionFind {
    int[] u;

    UnionFind(int n) {
        u = new int[n];
        for (int i = 0; i < n; i++) u[i] = -1;
    }

    int find(int i) { ... }

    void union(int i, int j) { ... }
}
Trick 2: Iterative find
int find(int i) {
    int j, root;

    for (j = i; u[j] >= 0; j = u[j]) ;   // first pass: locate the root
    root = j;

    while (u[i] >= 0) {                  // second pass: compress the path
        j = u[i];
        u[i] = root;
        i = j;
    }

    return root;
}
Trick 1’: union by size
void union(int i, int j) {
    i = find(i);
    j = find(j);

    if (i != j) {
        if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }   // i's tree is larger: absorb j
        else             { u[j] += u[i]; u[i] = j; }   // otherwise j absorbs i
    }
}
Time bounds
Variables
M operations. N elements.
Algorithms
Simple forest representation
• Worst: find O(N). Mixed operations O(MN).
• Average: tricky
Union by height; Union by size
• Worst: find O(log N). Mixed operations O(M log N).
• Average: mixed operations O(M) [see text]
Path compression in find
• Worst: mixed operations: “nearly linear” [analysis in 15-451]