15-211 Fundamental Data Structures and Algorithms Peter Lee April 24, 2003 Union-Find.

15-211 Fundamental Data Structures and Algorithms

Peter Lee, April 24, 2003

Union-Find

Announcements

Quiz #4 is available until midnight tonight!

HW6 is due next week! Tournament on May 7.

Final Exam on May 8, 8:30am! Review session TBA.

Building Mazes

Thinking about the problem

Think about a grid of rooms separated by walls.

Each room can be given a name.

a b c d

e f g h

i j k l

m n o p

Randomly knock out walls until we get a good maze.

Mathematical formulation

A set of rooms: {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}

Pairs of adjacent rooms that have an open wall between them. For example, (a,b) and (g,k) are pairs.

a b c d

e f g h

i j k l

m n o p

Mazes as graphs

a b c d

e f g h

i j k l

m n o p

{(a,b), (b,c), (a,e), (e,i), (i,j), (f,j), (f,g), (g,h), (d,h), (g,k), (m,n), (n,o), (k,o), (o,p), (l,p)}


For a maze to have a unique solution, its graph must be a tree.

Mazes as trees

A spanning tree is a tree that includes all of the nodes. Why is it good to have a spanning tree?

[Figure: the maze graph drawn as a spanning tree over rooms a through p]

Algorithm

Essentially:

• Randomly pick a wall and delete it (add it to the tree) if it won’t create a cycle.
• Stop when a spanning tree has been created.

This is Kruskal’s Algorithm.

Creating a spanning tree

When adding a wall to the tree, how do we detect that it won’t create a cycle?

When adding wall (x,y), we want to know if there is already a path from x to y in the tree.

Using the union-find algorithm

We put rooms into an equivalence class if there is a path connecting them.

Before adding an edge (x,y) to the tree, make sure that x and y are not in the same equivalence class.

a b c d

e f g h

i j k l

m n o p

Partially-constructed maze
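The wall-removal loop described above can be sketched in Java as follows. This is a minimal illustration, not the course's code: the class name MazeSketch, the method buildMaze, the row-major room numbering, and the fixed random seed are all assumptions made for the example, and the find used here is the plainest possible version with no balancing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: names and room numbering are illustrative assumptions.
class MazeSketch {
    // Follow parent links until reaching the representative of x's class.
    static int find(int[] parent, int x) {
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    // Returns the list of walls removed, each as a pair of adjacent rooms.
    // Rooms are numbered row by row: room = row * cols + col.
    static List<int[]> buildMaze(int rows, int cols, long seed) {
        List<int[]> walls = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int room = r * cols + c;
                if (c + 1 < cols) walls.add(new int[] {room, room + 1});     // wall to the right
                if (r + 1 < rows) walls.add(new int[] {room, room + cols});  // wall below
            }
        }
        Collections.shuffle(walls, new Random(seed));  // consider walls in random order

        int[] parent = new int[rows * cols];
        Arrays.fill(parent, -1);                       // every room starts in its own class

        List<int[]> removed = new ArrayList<>();
        for (int[] w : walls) {
            int a = find(parent, w[0]);
            int b = find(parent, w[1]);
            if (a != b) {        // no path between the rooms yet, so no cycle is created
                parent[a] = b;   // merge the two equivalence classes
                removed.add(w);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<int[]> removed = buildMaze(4, 4, 42L);
        System.out.println("walls removed: " + removed.size());
    }
}
```

Because the grid is connected, the loop always ends with a spanning tree, so a grid of R x C rooms has exactly R*C - 1 walls removed regardless of the random order.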

Dynamic Equivalence Relations

Equivalence relations

The two-place relation “~” is an equivalence relation if (for all a, b, and c):

a ~ a (reflexive)

a ~ b iff b ~ a (symmetric)

a ~ b and b ~ c implies a ~ c (transitive)

Equivalence relations?

<: transitive, not reflexive, not symmetric

<=: transitive, reflexive, not symmetric

e1 = O(e2): transitive, reflexive (every function is O of itself), not symmetric

==: transitive, reflexive, symmetric

connected: transitive, reflexive, symmetric
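The three properties can be checked by brute force on a small finite domain. The sketch below is only an illustration: the class RelationCheck, the interface IntRelation, and the choice of the domain {0, ..., n-1} are assumptions introduced here, and passing the check on a finite domain is evidence, not a proof, for the relation in general.

```java
// Hypothetical helper for testing relation properties on the integers 0..n-1.
class RelationCheck {
    interface IntRelation {
        boolean holds(int a, int b);
    }

    static boolean isReflexive(IntRelation r, int n) {
        for (int a = 0; a < n; a++)
            if (!r.holds(a, a)) return false;
        return true;
    }

    static boolean isSymmetric(IntRelation r, int n) {
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                if (r.holds(a, b) != r.holds(b, a)) return false;
        return true;
    }

    static boolean isTransitive(IntRelation r, int n) {
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                for (int c = 0; c < n; c++)
                    if (r.holds(a, b) && r.holds(b, c) && !r.holds(a, c)) return false;
        return true;
    }

    // An equivalence relation must have all three properties.
    static boolean isEquivalence(IntRelation r, int n) {
        return isReflexive(r, n) && isSymmetric(r, n) && isTransitive(r, n);
    }

    public static void main(String[] args) {
        System.out.println("<  is an equivalence: " + isEquivalence((a, b) -> a < b, 6));
        System.out.println("<= is an equivalence: " + isEquivalence((a, b) -> a <= b, 6));
        System.out.println("== is an equivalence: " + isEquivalence((a, b) -> a == b, 6));
    }
}
```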

Equivalence classes

For any given element x ∈ S and two-place equivalence relation ~, the equivalence class of x is

{ y | y ∈ S ∧ x ~ y }

Making equivalence dynamic

Dynamic operations on an equivalence relation. For example, when removing walls in a maze.

Operations:

find(i): returns the equivalence class of i.

union(i,j): joins the classes of i and j.

Dynamic equivalence

Operations:

find(i) returns the name of the set containing i.

union(i,j) joins the sets containing i and j.

{1} {2} {3} {4} {5} {6} {7}

union(2,3)

{1} {2,3} {4} {5} {6} {7}

union(3,4)

{1} {2,3,4} {5} {6} {7}

union(5,6)

{1} {2,3,4} {5,6} {7}

union(6,3)

{1} {2,3,4,5,6} {7}

Union Find

The UnionFind interface

    class UnionFind {
        UnionFind(int n) { . . . }

        int find(int i) { . . . }

        void union(int i, int j) { . . . }
    }

To simplify matters, use integers {0,1,2,…,n-1} to represent the set elements.

Implementing Union-Find

A key question: How should we represent the equivalence classes?

Let’s consider a naïve approach first, and then a better way…

A naïve array representation

Array with set indexes (entry i holds the name of the set containing i):

index:  0 1 2 3 4 5 6 7 8
set:    1 1 2 1 4 2 4 3 2

4 sets: {0,1,3}, {2,5,8}, {4,6}, {7}

union(1,4) yields:

index:  0 1 2 3 4 5 6 7 8
set:    1 1 2 1 1 2 1 3 2

3 sets: {0,1,3,4,6}, {2,5,8}, {7}

Running time for naïve approach

With this naïve representation, find(n) runs in O(1) time.

What about union(n,m)?
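One way to see the cost of union in this representation is to write it out. The following is a minimal sketch, assuming a hypothetical class NaiveUnionFind that stores one set name per element; it relabels the second argument's set with the first argument's name, which matches the union(1,4) example in these slides.

```java
// Hypothetical sketch of the naive representation: label[i] is the name of i's set.
class NaiveUnionFind {
    final int[] label;

    NaiveUnionFind(int[] initialLabels) {
        label = initialLabels.clone();
    }

    // O(1): a single array lookup.
    int find(int i) {
        return label[i];
    }

    // O(n): every element must be inspected, because any of them
    // might carry the label that is being retired.
    void union(int i, int j) {
        int keep = label[i];
        int retire = label[j];
        if (keep == retire) return;   // already in the same set
        for (int k = 0; k < label.length; k++) {
            if (label[k] == retire) label[k] = keep;
        }
    }

    public static void main(String[] args) {
        NaiveUnionFind uf = new NaiveUnionFind(new int[] {1, 1, 2, 1, 4, 2, 4, 3, 2});
        uf.union(1, 4);
        System.out.println(java.util.Arrays.toString(uf.label));
    }
}
```

So find is a constant-time lookup, but each union scans the whole array, which is what motivates the forest representation.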

Forest and tree representation

Each set is a tree:

{1}{2}{0,3}{4}{5}

union(2,1) adds a new subtree to a root:

{1,2}{0,3}{4}{5}

union(0,1) adds a new subtree to a root:

{1,2,0,3}{4}{5}

[Figures: the forest drawn before and after union(2,1) and union(0,1)]

Forest and trees: array repn

{1,2,0,3}{4}{5}

find(2) = 1

find(4) = 4

Array representation:

index:  0  1  2  3  4  5
s:      3 -1  1  1 -1 -1

[Figure: the same forest drawn as trees]

Find, v.0

{1,2,0,3}{4}{5}

find(0) = 1

index:  0  1  2  3  4  5
s:      3 -1  1  1 -1 -1

    public int find(int x) {
        if (s[x] < 0) return x;   // x is a root: return x itself, not the marker s[x]
        return find(s[x]);        // otherwise keep following parent links
    }

Union, v.-1

{1,2}{0,3}{4}{5} becomes {1,2,0,3}{4}{5}

union(0,2)

index:       0  1  2  3  4  5
s (before):  3 -1  1 -1 -1 -1
s’ (after):  3 -1  1  2 -1 -1

    public void union(int x, int y) {
        s[find(x)] = y;   // hangs x's root under y itself, even when y is not a root
    }

Union, v.0

{1,2}{0,3}{4}{5} becomes {1,2,0,3}{4}{5}

union(0,2)

index:       0  1  2  3  4  5
s (before):  3 -1  1 -1 -1 -1
s’ (after):  3 -1  1  1 -1 -1

    public void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx != ry) s[rx] = ry;   // guard: linking a root to itself would create a cycle
    }

Union v.0 is still O(n)!

• Find must walk the path to the root.
• Unlucky combinations of unions can result in long paths.

[Figure: a tall tree over nodes 0 through 6 with a long path to the root]
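An unlucky sequence is easy to construct. The sketch below, with the hypothetical names ChainDemo, buildChain, and depth, applies the simple root-to-root union to the pairs (0,1), (1,2), (2,3), and so on; under that union rule each step hangs the whole existing chain under a fresh node, so the tree degenerates into a linked list.

```java
import java.util.Arrays;

// Hypothetical demo of how unbalanced unions produce a long path.
class ChainDemo {
    static int find(int[] s, int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    // Simple union: the root of x's tree is attached under the root of y's tree.
    static void union(int[] s, int x, int y) {
        int rx = find(s, x), ry = find(s, y);
        if (rx != ry) s[rx] = ry;
    }

    // Number of parent links between x and its root.
    static int depth(int[] s, int x) {
        int d = 0;
        while (s[x] >= 0) { x = s[x]; d++; }
        return d;
    }

    // Builds n singleton sets, then performs union(0,1), union(1,2), ..., union(n-2,n-1).
    static int[] buildChain(int n) {
        int[] s = new int[n];
        Arrays.fill(s, -1);
        for (int i = 0; i + 1 < n; i++) union(s, i, i + 1);
        return s;
    }

    public static void main(String[] args) {
        int[] s = buildChain(8);
        System.out.println("depth of node 0: " + depth(s, 0));
    }
}
```

With n elements, node 0 ends up n-1 links away from the root, so a single find can cost time linear in n.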

Trick 1: union by height

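• Union shallow trees into deep trees.
• Tree depth increases only when the depths are equal.

Track path length to root (in this example the root entry -3 appears to encode a longest path of 3 nodes, and a single-node tree stores -1):

index:  0  1  2  3  4  5
s:      3 -3  1  1 -1 -1

Tree depth at most O(log₂ N)

[Figure: the tree rooted at 1, alongside the single-node trees 4 and 5]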
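A union-by-height rule along these lines might look like the sketch below. The encoding is an assumption inferred from the example array: a root stores the negated count of nodes on its longest path, so a singleton stores -1. The class name UnionByHeight and the helper height are hypothetical, and find here does no path compression so that the stored heights stay exact.

```java
import java.util.Arrays;

// Hypothetical sketch of union by height; encoding inferred from the slide's array.
class UnionByHeight {
    final int[] s;   // s[i] >= 0: parent of i; s[i] < 0: i is a root storing -(height + 1)

    UnionByHeight(int n) {
        s = new int[n];
        Arrays.fill(s, -1);
    }

    int find(int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        if (s[rx] < s[ry]) {          // rx is deeper (more negative): ry goes under it
            s[ry] = rx;
        } else if (s[ry] < s[rx]) {   // ry is deeper: rx goes under it
            s[rx] = ry;
        } else {                      // equal heights: the merged tree grows by one level
            s[rx] = ry;
            s[ry]--;
        }
    }

    // Height in edges of the tree containing x.
    int height(int x) {
        return -s[find(x)] - 1;
    }

    public static void main(String[] args) {
        UnionByHeight uf = new UnionByHeight(8);
        for (int i = 0; i + 1 < 8; i++) uf.union(i, i + 1);
        System.out.println("height after chained unions: " + uf.height(0));
    }
}
```

The chained unions that produced a linear path with the simple rule now leave a tree of height 1, because each singleton is attached under the existing, deeper root.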

Trick 1’: union by size

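• Union small trees into big trees.
• (Tree size always increases.)

Track subtree size (the root stores the negated size of its tree):

index:  0  1  2  3  4  5
s:      3 -4  1  1 -1 -1

Tree depth at most ???

[Figure: the tree rooted at 1, alongside the single-node trees 4 and 5]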

Trick 2: Path compression

• find flattens trees.
• Redirect nodes to point directly to the root.

Example: find(0)

Do this whenever traversing a path from node to root.

[Figure: the tree before and after find(0); afterward node 0 points directly to the root]

Path compression

• find flattens trees.
• Redirect nodes to point directly to the root.
• Do this whenever traversing a path from node to root.

    public int find(int x) {
        if (s[x] < 0) return x;
        return s[x] = find(s[x]);
    }

This implies that union does path compression (through its calls to find).
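To make the flattening visible, the sketch below runs the recursive find on the six-element array used throughout these slides. The class name PathCompressionDemo is hypothetical, and the array is passed as a parameter only so that the example is self-contained.

```java
import java.util.Arrays;

// Hypothetical demo: the recursive find with path compression, applied to the slide's array.
class PathCompressionDemo {
    static int find(int[] s, int x) {
        if (s[x] < 0) return x;          // x is a root
        return s[x] = find(s, s[x]);     // re-point x at the root on the way back
    }

    public static void main(String[] args) {
        int[] s = {3, -1, 1, 1, -1, -1};   // 0 -> 3 -> 1, and 2 -> 1
        System.out.println("before: " + Arrays.toString(s));
        int root = find(s, 0);
        System.out.println("find(0) = " + root);
        System.out.println("after:  " + Arrays.toString(s));
    }
}
```

Before the call, node 0 reaches the root through node 3; afterward s[0] holds the root 1 directly, so the next find(0) takes a single step.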

The Code

All the code

    class UnionFind {
        int[] u;   // u[i] >= 0: parent of i; u[i] < 0: i is a root and -u[i] is its set size

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;
        }

        int find(int i) {
            int j, root;
            for (j = i; u[j] >= 0; j = u[j]) ;   // first pass: walk up to the root
            root = j;
            while (u[i] >= 0) {                  // second pass: point each node at the root
                j = u[i]; u[i] = root; i = j;
            }
            return root;
        }

        void union(int i, int j) {
            i = find(i); j = find(j);
            if (i != j) {
                if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }   // i's tree is larger
                else             { u[j] += u[i]; u[i] = j; }
            }
        }
    }

The UnionFind class

    class UnionFind {
        int[] u;

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;
        }

        int find(int i) { ... }

        void union(int i, int j) { ... }
    }

Trick 2: Iterative find

    int find(int i) {
        int j, root;

        for (j = i; u[j] >= 0; j = u[j]) ;
        root = j;

        while (u[i] >= 0) {
            j = u[i]; u[i] = root; i = j;
        }

        return root;
    }

Trick 1’: union by size

    void union(int i, int j) {
        i = find(i); j = find(j);

        if (i != j) {
            if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }
            else             { u[j] += u[i]; u[i] = j; }
        }
    }
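As a usage sketch, the class from these slides can replay the earlier dynamic-equivalence trace: union(2,3), union(3,4), union(5,6), union(6,3). The class body below repeats the slides' code so the example stands alone; the main method, the choice of 8 elements, and reading the set size as -u[root] are illustrative additions.

```java
// The slides' UnionFind (union by size, iterative find with path compression),
// followed by a hypothetical main that replays the earlier trace.
class UnionFind {
    int[] u;   // u[i] >= 0: parent of i; u[i] < 0: i is a root and -u[i] is its set size

    UnionFind(int n) {
        u = new int[n];
        for (int i = 0; i < n; i++) u[i] = -1;
    }

    int find(int i) {
        int j, root;
        for (j = i; u[j] >= 0; j = u[j]) ;   // locate the root
        root = j;
        while (u[i] >= 0) {                  // compress the path
            j = u[i]; u[i] = root; i = j;
        }
        return root;
    }

    void union(int i, int j) {
        i = find(i); j = find(j);
        if (i != j) {
            if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }
            else             { u[j] += u[i]; u[i] = j; }
        }
    }

    public static void main(String[] args) {
        UnionFind uf = new UnionFind(8);     // elements 0..7; the trace uses 1..7
        uf.union(2, 3);
        uf.union(3, 4);
        uf.union(5, 6);
        uf.union(6, 3);
        System.out.println("2 and 6 together: " + (uf.find(2) == uf.find(6)));
        System.out.println("1 and 2 together: " + (uf.find(1) == uf.find(2)));
        System.out.println("size of 2's set:  " + (-uf.u[uf.find(2)]));
    }
}
```

After the four unions, elements 2 through 6 share one representative while 1 and 7 remain singletons, matching the final line of the trace, {1} {2,3,4,5,6} {7}.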

Time bounds

Variables: M operations, N elements.

Algorithms:

Simple forest representation
• Worst: find O(N); mixed operations O(MN).
• Average: tricky

Union by height; union by size
• Worst: find O(log N); mixed operations O(M log N).
• Average: mixed operations O(M) [see text]

Path compression in find
• Worst: mixed operations “nearly linear” [analysis in 15-451]
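The O(log N) worst case for find under union by size follows from a short counting argument, sketched here in LaTeX. It is the standard argument and assumes find does no path compression (compression can only shorten paths); the symbols d, T_1, and T_2 are introduced just for this sketch.

```latex
\begin{align*}
&\textbf{Claim: } \text{a node at depth } d \text{ lies in a tree with at least } 2^{d} \text{ nodes.} \\
&\textbf{Base: } d = 0 \text{ and the tree has at least } 1 = 2^{0} \text{ node.} \\
&\textbf{Step: } \text{a node's depth grows from } d \text{ to } d+1 \text{ only when its tree } T_1
  \text{ is attached under the root of a tree } T_2 \text{ with } |T_2| \ge |T_1|, \\
&\qquad \text{so the merged tree has } |T_1| + |T_2| \;\ge\; 2\,|T_1| \;\ge\; 2 \cdot 2^{d} \;=\; 2^{d+1} \text{ nodes.} \\
&\textbf{Conclusion: } \text{no tree has more than } N \text{ nodes, so } 2^{d} \le N
  \;\Longrightarrow\; d \le \log_2 N .
\end{align*}
```

Since find walks at most d links, each find costs O(log N), and M mixed operations cost O(M log N).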
