Data Structures & Algorithms: Union-Find Example
Richard Newman


Steps to Develop an Algorithm
• Define the problem – model it
• Determine constraints
• Find or create an algorithm to solve it
• Evaluate algorithm – speed, space, etc.
• If algorithm isn’t satisfactory, why not?
• Try to fix algorithm
• Iterate until solution found (or give up)

Dynamic Connectivity Problem

• Given a set of N elements

• Support two operations:
  – Connect two elements
  – Given two elements, is there a path between them?

Example

Connect (4, 3)

Connect (3, 8)

Connect (6, 5)

Connect (9, 4)

Connect (2, 1)

Are 0 and 7 connected? (No)

Are 8 and 9 connected? (Yes)

0 1 2 3 4

5 6 7 8 9

Example (cont’d)

Connect (5, 0)

Connect (7, 2)

Connect (6, 1)

Connect (1, 0)

Are 0 and 7 connected? (Yes)

Now consider a problem with 10,000 elements and 15,000 connections….

0 1 2 3 4

5 6 7 8 9

Modeling the Elements
Various interpretations of the elements:
• Pixels in a digital photo
• Computers in a network
• Socket pins on a PC board
• Transistors in a VLSI design
• Variable names in a C++ program
• Locations on a map
• Friends in a social network
• …
Convenient to just number 0 to N-1
Use as array index, suppress details

Modeling the Connections
Assume “is connected to” is an equivalence relation
• Reflexive: a is connected to a
• Symmetric: if a is connected to b, then b is connected to a
• Transitive: if a is connected to b, and b is connected to c, then a is connected to c

Connected Components

A connected component is a maximal set of elements that are mutually connected (i.e., an equivalence class)

0 1 2 3 4

5 6 7 8 9

{0} {1,2} {3,4,8,9} {5,6} {7}

Implementing the Operations
Recall – connect two elements, and answer if two elements have a path between them
• Find: in which component is element a?
• Union: replace components containing elements a and b with their union
• Connected: are elements a and b in the same component?

Example

Union(1,6)

0 1 2 3 4

5 6 7 8 9

{0} {1,2} {3,4,8,9} {5,6} {7}

0 1 2 3 4

5 6 7 8 9

{0} {1,2,5,6} {3,4,8,9} {7}

Components?

Union-Find Data Type
Goal: Design an efficient data structure for union-find
• Number of elements can be huge
• Number of operations can be huge
• Union and find operations can be intermixed

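public class UF {
    UF(int N);                         // create N elements, numbered 0 to N-1
    void union(int a, int b);          // connect a and b
    int find(int a);                   // id of the component containing a
    boolean connected(int a, int b);   // are a and b in the same component?
}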

Dynamic Connectivity Client
• Read in number of elements N from stdin
• Repeat:
  – Read in pair of integers from stdin
  – If not yet connected, connect them and print out pair

read in int N
while stdin is not empty
    read in pair of ints a and b
    if not connected(a, b)
        union(a, b)
        print out a and b
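The client loop above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name ConnectivityClient and the method run are hypothetical, run takes a Scanner (rather than reading stdin directly) so it can be tried on a string, and the union-find inside is the simplest possible id-array version, standing in for any implementation of the UF API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical client sketch; names are illustrative, not from the slides.
class ConnectivityClient {
    private final int[] id;  // id[a] = label of the component containing a

    ConnectivityClient(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;  // every element starts alone
    }

    boolean connected(int a, int b) {
        return id[a] == id[b];
    }

    void union(int a, int b) {
        int from = id[a], to = id[b];
        for (int i = 0; i < id.length; i++)
            if (id[i] == from) id[i] = to;  // relabel a's whole component
    }

    // Reads N, then pairs of ints (assumes an even number of ints after N).
    // Returns the pairs that were not yet connected, in input order.
    static List<String> run(Scanner in) {
        int n = in.nextInt();
        ConnectivityClient uf = new ConnectivityClient(n);
        List<String> printed = new ArrayList<>();
        while (in.hasNextInt()) {
            int a = in.nextInt();
            int b = in.nextInt();
            if (!uf.connected(a, b)) {
                uf.union(a, b);
                printed.add(a + " " + b);
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        for (String pair : run(new Scanner(System.in)))
            System.out.println(pair);
    }
}
```

Pairs that are already connected are silently skipped, so the output lists only the connections that actually merged two components.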

Quick-Find Data Structure

– Integer array id[] of length N
– Interpretation: id[a] is the id of the component containing a

0 1 2 3 4

5 6 7 8 9

i:     0 1 2 3 4 5 6 7 8 9
id[i]: 0 1 1 4 4 5 5 7 4 4


• Find: what is the id of a?
• Connected: do a and b have the same id?
• Union: change all the entries in id[] that have the same id as a to be the id of b.

Quick-Find

Union(1,6)

0 1 2 3 4

5 6 7 8 9

Before:
i:  0 1 2 3 4 5 6 7 8 9
id: 0 1 1 4 4 5 5 7 4 4

After:
i:  0 1 2 3 4 5 6 7 8 9
id: 0 5 5 4 4 5 5 7 4 4

It works – so is there a problem?
Well, there may be many values to change, and many to search!
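A minimal Java sketch of Quick-Find, replaying the Union(1,6) example. The class name QuickFindUF, the array-taking constructor and the ids() accessor are assumptions added for illustration; only find, union and connected come from the slides.

```java
import java.util.Arrays;

// Minimal Quick-Find sketch; class and helper names are illustrative.
class QuickFindUF {
    private final int[] id;  // id[a] = id of the component containing a

    QuickFindUF(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;  // N array accesses
    }

    // Hypothetical helper: start from a given id[] (copied) to replay an example.
    QuickFindUF(int[] initial) {
        id = initial.clone();
    }

    int find(int a) {
        return id[a];  // one array access
    }

    boolean connected(int a, int b) {
        return id[a] == id[b];
    }

    void union(int a, int b) {
        int aid = id[a], bid = id[b];
        if (aid == bid) return;
        // Scan the whole array: every entry with a's id gets b's id.
        for (int i = 0; i < id.length; i++)
            if (id[i] == aid) id[i] = bid;
    }

    int[] ids() {
        return id.clone();
    }

    public static void main(String[] args) {
        QuickFindUF uf = new QuickFindUF(new int[] {0, 1, 1, 4, 4, 5, 5, 7, 4, 4});
        uf.union(1, 6);
        System.out.println(Arrays.toString(uf.ids()));
        // prints [0, 5, 5, 4, 4, 5, 5, 7, 4, 4]
    }
}
```

The loop in union always visits all N entries, which is where the cost of Quick-Find comes from.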

Quick-Find
Quick-Find operation times
– Initialization takes time O(N)
– Union takes time O(N)
– Find takes time O(1)
– Connected takes time O(1)

Union is too slow – it takes O(N^2) array accesses to process N union operations on N elements

Quadratic Algos Do Not Scale!
Rough Standards (for now)
– 10^9 operations per second
– 10^9 words of memory
– Touch all words in 1 second (+/- truism since 1950!)
Huge problem for Quick-Find:
• 10^9 union commands on 10^9 elements
• Takes more than 10^18 operations
• This is 30+ years of computer time!
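The 30+ year figure follows from the rough standards above. A worked version of the arithmetic, assuming about 3.15 × 10^7 seconds in a year:

```latex
\[
\frac{10^{18}\ \text{operations}}{10^{9}\ \text{operations/second}}
  = 10^{9}\ \text{seconds}
  \approx \frac{10^{9}\ \text{seconds}}{3.15 \times 10^{7}\ \text{seconds/year}}
  \approx 31.7\ \text{years}
\]
```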

Quadratic Algos Do Not Scale!
They do not keep pace with technology
• New computer may be 10x as fast
• But it has 10x as much memory
• Want to solve problems 10x as big
• With quadratic algorithm, it takes… 10x as long!!! (100x the work at only 10x the speed)
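Why 10x as long: a sketch of the calculation, where T is the running time and s is the machine speed (both symbols are introduced here for illustration).

```latex
\[
T(N) \propto \frac{N^{2}}{s}
\quad\Longrightarrow\quad
\frac{T_{\text{new}}}{T_{\text{old}}}
  = \frac{(10N)^{2} / (10s)}{N^{2} / s}
  = \frac{100}{10}
  = 10
\]
```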

Quick-Union Data Structure

– Integer array id[] of length N
– Interpretation: id[a] is the parent of a
– Root of a is id[id[…id[a]…]] (keep going until the value stops changing); the root identifies a’s component

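[Figure: forest of trees. Roots: 0, 1, 3, 5, 7. 2 is a child of 1; 4 and 8 are children of 3; 9 is a child of 4; 6 is a child of 5.]

i:     0 1 2 3 4 5 6 7 8 9
id[i]: 0 1 1 3 3 5 5 7 3 4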

Quick-Union Data Structure
– Find: what is the root of the tree of a?
– Connected: do a and b have the same root?
– Union: set id of root of b’s tree to be root of a’s tree

Quick-Union

[Figure: forest for the id[] array below]

i:     0 1 2 3 4 5 6 7 8 9
id[i]: 0 1 1 3 3 5 5 7 3 4

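– Find 9: follow 9 → 4 → 3; the root is 3
– Connected 8, 9: both have root 3, so yes
– Union 7, 5: 7 and 5 are both roots, so one is simply linked to the other

Only ONE value changes! = FAST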
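A minimal Java sketch of Quick-Union, replaying the three operations above. The class name QuickUnionUF, the array-taking constructor and the parent() accessor are assumptions for illustration. The direction of the link in union follows the wording "set id of root of b’s tree to be root of a’s tree"; the opposite direction would work equally well.

```java
// Minimal Quick-Union sketch; class and helper names are illustrative.
class QuickUnionUF {
    private final int[] id;  // id[a] = parent of a; a root is its own parent

    QuickUnionUF(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;  // every element is a root
    }

    // Hypothetical helper: start from a given parent array (copied).
    QuickUnionUF(int[] parents) {
        id = parents.clone();
    }

    // Chase parent pointers until reaching a root (cost = depth of a).
    int find(int a) {
        while (a != id[a]) a = id[a];
        return a;
    }

    boolean connected(int a, int b) {
        return find(a) == find(b);
    }

    // Point the root of b's tree at the root of a's tree: one entry changes.
    void union(int a, int b) {
        int rootA = find(a), rootB = find(b);
        if (rootA != rootB) id[rootB] = rootA;
    }

    int parent(int a) {
        return id[a];
    }

    public static void main(String[] args) {
        QuickUnionUF uf = new QuickUnionUF(new int[] {0, 1, 1, 3, 3, 5, 5, 7, 3, 4});
        System.out.println(uf.find(9));          // prints 3
        System.out.println(uf.connected(8, 9));  // prints true
        uf.union(7, 5);
        System.out.println(uf.parent(5));        // prints 7
    }
}
```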

Quick-Union
Quick-Union operation times (worst case)
– Initialization takes time O(N)
– Union takes time O(N) (must find two roots)
– Find takes time O(N)
– Connected takes time O(N)

Now union AND find are too slow – it takes O(N^2) array accesses to process N operations on N elements

Quick-Find/Quick-Union
Observations:
Problem with Quick-Find is unions
– May take N array accesses
– Trees are flat, but too expensive to keep them flat!
Problem with Quick-Union
– Trees may get tall
– Find (and hence, connected and union) may take N array accesses

Weighted Quick-Union
• Make Quick-Union trees stay short!
• Keep track of tree size
• Join smaller tree into larger tree
  – May alternatively do union by height/rank
  – Need to keep track of “weight”

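[Figure: two ways to join a smaller tree and a larger tree. Quick-Union may put the larger tree below the smaller one, but we always want the smaller tree below the larger one.]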
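A minimal Java sketch of Weighted Quick-Union using tree size as the weight. The class name WeightedQuickUnionUF, the size[] array name and the depth() helper are assumptions for illustration; union by height/rank would be an equally valid choice.

```java
// Minimal Weighted Quick-Union sketch (union by size); names are illustrative.
class WeightedQuickUnionUF {
    private final int[] id;    // id[a] = parent of a; a root is its own parent
    private final int[] size;  // size[r] = number of elements in the tree rooted at r

    WeightedQuickUnionUF(int n) {
        id = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            size[i] = 1;
        }
    }

    int find(int a) {
        while (a != id[a]) a = id[a];
        return a;
    }

    boolean connected(int a, int b) {
        return find(a) == find(b);
    }

    // Always hang the smaller tree under the root of the larger tree.
    void union(int a, int b) {
        int rootA = find(a), rootB = find(b);
        if (rootA == rootB) return;
        if (size[rootA] < size[rootB]) {
            id[rootA] = rootB;
            size[rootB] += size[rootA];
        } else {
            id[rootB] = rootA;
            size[rootA] += size[rootB];
        }
    }

    // Hypothetical helper: number of links from a up to its root.
    int depth(int a) {
        int d = 0;
        while (a != id[a]) {
            a = id[a];
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        WeightedQuickUnionUF uf = new WeightedQuickUnionUF(16);
        for (int i = 0; i < 15; i++) uf.union(i, i + 1);  // a chain of unions
        int max = 0;
        for (int i = 0; i < 16; i++) max = Math.max(max, uf.depth(i));
        System.out.println("max depth = " + max);
    }
}
```

Only the size of a root is ever read, so size[] entries for non-root elements are allowed to go stale.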

Weighted Quick-Union
Weighted Quick-Union operation times
– Initialization takes time O(N)
– Union takes time O(1) (given roots)
– Find takes time O(depth of a)
– Connected takes time O(max {depth of a, b})

Weighted Quick-Union
Proposition: Depth of any node x is at most lg N
Pf: What causes depth of x to increase?
• Only union! And only if x is in the smaller tree.
• So x’s tree must at least double in size each time union increases x’s depth
• Which can happen at most lg N times. (Why?)
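One way to answer the "(Why?)", written out as a short derivation. The symbol s_k (the size of the tree containing x after x’s depth has increased k times) is introduced here for illustration.

```latex
\[
s_0 \ge 1, \qquad s_k \ge 2\, s_{k-1}
\quad\Longrightarrow\quad s_k \ge 2^{k}
\]
\[
s_k \le N
\quad\Longrightarrow\quad 2^{k} \le N
\quad\Longrightarrow\quad k \le \lg N
\]
```

The doubling step holds because x’s tree, of size s, is merged into a tree of size at least s, so the merged tree has at least 2s elements.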

Next – Lecture 3
• Read Chapter 2
• Empirical analysis
• Asymptotic analysis of algorithms
• Basic recurrences