Union-Find: A data structure for maintaining a collection of disjoint sets

Course: Data Structures
Lecturer: Hanoch Levy

January 2010

Sets with MERGE and FIND

- Collections of sets.
- We merge disjoint sets and want to know which set each object is in.
- Operations: MERGE and FIND.

Example: an equivalence relation (≡) satisfies:

Reflexive: a ≡ a
Symmetric: a ≡ b ⇒ b ≡ a
Transitive: a ≡ b, b ≡ c ⇒ a ≡ c

Given: a sequence of equivalence declarations
1 ≡ 2, 3 ≡ 4, 5 ≡ 6, 2 ≡ 3

Goal: produce the equivalence classes.

Use FIND to look up which class an element is equivalent to, and MERGE to unite the classes.

Data Structures, CS, TAU - 5.21
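The equivalence example above can be traced with a few lines of Python. This is only an illustrative sketch: the function name `equivalence_classes` and the dictionary-of-sets representation are assumptions made for the example, not the data structure developed in the lecture.

```python
def equivalence_classes(elements, pairs):
    """Group elements into classes, given pairs declared equivalent."""
    # cls[x] is the (shared) set object that currently holds x.
    cls = {x: {x} for x in elements}
    for a, b in pairs:
        if cls[a] is not cls[b]:          # FIND: are a and b already together?
            merged = cls[a] | cls[b]      # MERGE the two classes
            for x in merged:
                cls[x] = merged
    # Each class is shared by all its members; keep one copy per class.
    return sorted({id(c): sorted(c) for c in cls.values()}.values())


classes = equivalence_classes(range(1, 7), [(1, 2), (3, 4), (5, 6), (2, 3)])
print(classes)
```

With the declarations 1 ≡ 2, 3 ≡ 4, 5 ≡ 6, 2 ≡ 3 this yields the two classes {1, 2, 3, 4} and {5, 6}.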

Operations

MERGE(A, B): unite A and B and store the result in A or B
FIND(x): find which set x belongs to
INITIAL(A, x): put x into A

Simple implementation:
- An array in which each entry holds the name of the set its element belongs to.

A={1, 3, 5}, B={2, 4}, C={6, 7, 8}

Element: 1 2 3 4 5 6 7 8
Set:     A B A B A C C C

Efficiency:
FIND, INIT: O(1)
MERGE: O(N) (the whole array must be scanned)

Efficiency measure: the cost of N MERGE and FIND operations.
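A minimal Python sketch of the array implementation described above. The class name `ArraySets` and the method names are chosen for this example; elements are numbered 1..n as on the slide, so index 0 is left unused.

```python
class ArraySets:
    """Each array entry stores the name of the set its element belongs to."""

    def __init__(self, n):
        # Index 0 is unused so that elements are numbered 1..n.
        self.name = [None] * (n + 1)

    def initial(self, set_name, x):
        self.name[x] = set_name               # O(1)

    def find(self, x):
        return self.name[x]                   # O(1)

    def merge(self, a, b):
        # Relabel every member of b as a. This scans the whole array: O(N).
        for x in range(1, len(self.name)):
            if self.name[x] == b:
                self.name[x] = a


sets = ArraySets(8)
for x, s in zip(range(1, 9), "ABABACCC"):     # A={1,3,5}, B={2,4}, C={6,7,8}
    sets.initial(s, x)
sets.merge("A", "B")
```

After `merge("A", "B")`, elements 1 through 5 all report set A, while 6, 7, 8 stay in C.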

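A faster implementation

- Link the elements of A together and the elements of B together (a linked list per set).
- There is no need to scan the whole domain, only the members of the set being merged.
- Still, n merges can cost O(n^2).

Because: consider a sequence of n-1 merges in which the set built so far is merged into a single element each time:

$\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = O(n^2)$

Solution:
- Keep track of the set sizes.
- Merge the smaller set into the larger one.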

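Complexity

1) Charge the cost to each element separately (not to the set).
2) When an element moves to a new set, the size of the set it belongs to at least doubles.
3) Size of the element's initial set: 1
   Size of its second set: $\ge 2$
   Size of its third set: $\ge 4$
   Size of its fourth set: $\ge 8$
   Size of its i-th set: $\ge 2^{i-1}$

But the size of the last set is at most N:

$2^{\#steps} \le \text{size of last set} \le N \;\Rightarrow\; \#steps \le \log_2 N$

Each element moves at most $\log_2 N$ times.

Total complexity: $O(N \log_2 N)$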
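The bound can be checked empirically by counting how many times elements change sets. The sketch below is an illustration only: it uses Python lists in place of linked lists, numbers elements from 0, and the function name `merge_small_into_large` is made up for the example.

```python
import math


def merge_small_into_large(n, pairs):
    """Merge along `pairs`, always moving the smaller set; count element moves."""
    name = list(range(n))                 # name[x] = set that x belongs to
    members = [[x] for x in range(n)]     # members[s] = elements of set s
    moves = 0
    for x, y in pairs:
        a, b = name[x], name[y]
        if a == b:
            continue
        if len(members[a]) > len(members[b]):
            a, b = b, a                   # a is now the smaller set
        for z in members[a]:              # only the smaller set is traversed
            name[z] = b
            moves += 1
        members[b].extend(members[a])
        members[a] = []
    return name, moves


n = 16
# The sequence that costs O(n^2) without sizes: grow one set an element at a time.
_, chain_moves = merge_small_into_large(n, [(0, i) for i in range(1, n)])
# Merging equal-sized sets in rounds makes every element move log2(n) / 2 times on average.
rounds = [(i, i + s) for s in (1, 2, 4, 8) for i in range(0, n, 2 * s)]
_, balanced_moves = merge_small_into_large(n, rounds)
```

In the first sequence the single new element is always the one that moves, so only n-1 moves occur; in the second, each of the four rounds moves n/2 elements. Both stay within $n \log_2 n$.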

Data structure

Needed:
1) For each set: (a) its size, (b) its first element.
2) For each element: (a) the set it belongs to, (b) the next element in the set.

Implementation (assumption: all elements are integers):

    type
      nametype = 1..n;
      elementype = 1..n;
      MFSET = record
        { for each set: its size and its first element }
        setheaders: array[1..n] of record
          count: 0..n;
          firstelement: 0..n
        end;
        { for each element: the name of its set and the next element }
        names: array[1..n] of record
          setname: nametype;
          nextelement: 0..n
        end
      end;

Performing MERGE

- Determine which set is smaller (say A).
- Walk along A's list and change each element's set name to B.
- At the last element, splice A's list onto B's list.
- In the headers, update the first element and the set size.

Complexity:
- Whenever an element moves to a new owner, the size of its owner (at least) doubles.
- Hence each element moves at most log n times.

Complexity: O(n log n)
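The MERGE steps above can be sketched in Python on top of the two arrays from the MFSET record. This is a hedged illustration rather than the lecture's code: the class `MFSet` and its field names are assumptions that mirror `count`, `firstelement`, `setname` and `nextelement`, with 0 meaning "none".

```python
class MFSet:
    """Array-based merge-find set over elements 1..n (0 means 'none')."""

    def __init__(self, n):
        # Per set: size and first element. Initially set i holds only element i.
        self.count = [0] + [1] * n
        self.first = list(range(n + 1))
        # Per element: name of its set and the next element in that set's list.
        self.setname = list(range(n + 1))
        self.next = [0] * (n + 1)

    def find(self, x):
        return self.setname[x]

    def merge(self, a, b):
        """Merge sets a and b; returns the surviving name (the larger set's)."""
        if a == b:
            return a
        if self.count[a] > self.count[b]:
            a, b = b, a                       # a is now the smaller set
        if self.count[a] == 0:
            return b                          # nothing to move
        x = self.first[a]
        while True:                           # rename every element of a to b
            self.setname[x] = b
            if self.next[x] == 0:
                break
            x = self.next[x]
        self.next[x] = self.first[b]          # splice b's list after a's last element
        self.first[b] = self.first[a]         # update b's header
        self.count[b] += self.count[a]
        self.count[a], self.first[a] = 0, 0
        return b


s = MFSet(8)
s.merge(s.find(1), s.find(2))
s.merge(s.find(3), s.find(4))
s.merge(s.find(2), s.find(3))
```

Only the smaller set's list is traversed, which is what makes the per-element doubling argument apply.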

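Implementation using a tree

- An attempt to avoid running over all the elements of A when moving them to B.
- The nodes of the tree represent elements.
- Each node points to its parent.
- The root holds the name of the set.

[Figure: three trees, for the sets A, B and C; every node points to its parent and the set name sits at the root.]

Performing the operations:
MERGE(A, B): hang the root of A under the root of B.
FIND(x): walk upward to the root.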
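A small Python sketch of the parent-pointer forest described above. As a simplifying assumption, the root element itself serves as the set name, and a root is marked by pointing to itself; the function names are chosen for the example.

```python
def make(parent, x):
    parent[x] = x                 # a root points to itself


def find(parent, x):
    while parent[x] != x:         # walk upward until the root is reached
        x = parent[x]
    return x


def merge(parent, root_a, root_b):
    parent[root_a] = root_b       # O(1): hang A's root under B's root


parent = {}
for x in (1, 3, 5, 7):
    make(parent, x)
merge(parent, 5, 7)               # 5 hangs under 7
merge(parent, 7, 1)               # 7 (with 5 below it) hangs under 1
merge(parent, 3, 1)
```

Here `find(parent, 5)` follows 5 → 7 → 1 and returns 1, so the cost of FIND is the depth of the node.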

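Complexity

MERGE = O(1)
FIND = O(n) (possible)
N merges and finds: O(n^2)
(If the large tree is hung on the small one, the result is a list.)

Improvement: hang the small tree on the large one.
- Each hanging increases the depth of the hung nodes by 1.
- Each hanging at least doubles the number of nodes in the tree of a hung node.
- A node takes part in at most $\log_2 N$ hangings.
- The depth of every node is at most $\log_2 N$.

Complexity: O(N log N) for N FIND operations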
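The depth bound can be observed on a small example. The sketch below assumes elements numbered 0..n-1 and a separate `size` array kept at the roots; `find_with_depth` and `merge_by_size` are names invented for this illustration.

```python
import math


def find_with_depth(parent, x):
    """Return (root, number of parent links followed)."""
    depth = 0
    while parent[x] != x:
        x = parent[x]
        depth += 1
    return x, depth


def merge_by_size(parent, size, x, y):
    rx, _ = find_with_depth(parent, x)
    ry, _ = find_with_depth(parent, y)
    if rx == ry:
        return rx
    if size[rx] > size[ry]:
        rx, ry = ry, rx           # rx is the root of the smaller tree
    parent[rx] = ry               # hang small on large
    size[ry] += size[rx]
    return ry


n = 16
parent, size = list(range(n)), [1] * n
# Always merging equal-sized trees is the worst case for depth.
for step in (1, 2, 4, 8):
    for i in range(0, n, 2 * step):
        merge_by_size(parent, size, i, i + step)
max_depth = max(find_with_depth(parent, x)[1] for x in range(n))
```

Even on this adversarial sequence the deepest node sits exactly $\log_2 16 = 4$ links below the root.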

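Path compression

• When performing FIND, fold the path to the root (all nodes on the path become children of the root).
• Easy to implement in two passes (the first identifies the root, the second folds the path and re-hangs the nodes).

[Figure: a tree of set A before and after FIND(7); afterwards 7 and the other nodes on its path hang directly from the root.]

Complexity analysis:
- Single operation: O(n) is still possible.
- Average: complicated to analyze.
- If small trees are not hung on large ones, performing N FINDs takes O(N log N) (hard to analyze).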
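The two-pass FIND described above might look as follows in Python. This is a sketch under the assumption that a root points to itself; the name `find_compress` is chosen for the example.

```python
def find_compress(parent, x):
    # Pass 1: identify the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Pass 2: re-hang every node on the path directly under the root.
    while x != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root


# A chain 4 -> 3 -> 2 -> 1 -> 0, the worst shape for a plain FIND.
parent = [0, 0, 1, 2, 3]
root = find_compress(parent, 4)
```

After the call every node on the path is a child of the root, so a repeated FIND on any of them follows a single link.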

Definition

If small trees are hung on large ones, the complexity of N operations is $O(N \alpha(N))$.

$\alpha(N)$ is not a constant but is close to one: it grows very slowly with N.

Ackermann's function A(x, y):

A(0, y) = 1
A(1, 0) = 2
A(x, 0) = x+2 for $x \ge 2$
A(x, y) = A(A(x-1, y), y-1) for $x, y \ge 1$

Hence:

A(x, 0) = x+2
A(x, 1) = A(A(x-1, 1), 0) = A(x-1, 1)+2 = 2x
A(x, 2) = A(A(x-1, 2), 1) = 2A(x-1, 2) = $2^x$
A(x, 3) = A(A(x-1, 3), 2) = $2^{A(x-1, 3)} = 2^{2^{\cdot^{\cdot^{2}}}}$ (a tower of x twos)
A(x, 4) = no closed mathematical form
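The recurrence can be evaluated directly for small arguments to confirm the closed forms listed above. The sketch follows the four defining cases as stated on the slide; memoization with `functools.lru_cache` is an addition for the example. Only small inputs are practical, since the recursion depth grows with the values computed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def A(x, y):
    if x == 0:
        return 1                          # A(0, y) = 1
    if y == 0:
        return 2 if x == 1 else x + 2     # A(1, 0) = 2, A(x, 0) = x + 2
    return A(A(x - 1, y), y - 1)          # x, y >= 1


row1 = [A(x, 1) for x in range(1, 6)]     # 2x for x >= 1
row2 = [A(x, 2) for x in range(6)]        # 2 ** x
row3 = [A(x, 3) for x in range(4)]        # tower of x twos
```

The next value in the last row, A(4, 3), is already $2^{16} = 65536$, which illustrates how quickly the function grows.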


Union-Find

• Make(x): Create a set containing x
• Union(x,y): Unite the sets containing x and y
• Find(x): Return a representative of the set containing x

Union Find

            make    union     find
Amortized   O(1)    O(α(n))   O(α(n))

[Figure: example sets over the elements a, b, c, d, e.]

Fun applications: Generating mazes

1 2 3 4

5 6 7 8

9 10 11 12

13 14 15 16

make(1), make(2), …, make(16)

Choose edges in random order and remove them if they connect two different regions:

find(6) = find(7)? If not, union(6,7)
find(7) = find(11)? If not, union(7,11)
…


Generating mazes: a larger example

[Figure: a maze generated on an n × n grid.]

Construction time: $O(n^2 \alpha(n^2))$
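The maze procedure can be sketched in Python as below. Several details are assumptions made for the illustration: cells are numbered 0..n²-1 rather than 1..16, the function name `generate_maze` and the `seed` parameter are invented, and the union step links roots arbitrarily with path halving in `find` instead of full union by rank.

```python
import random


def generate_maze(n, seed=0):
    """Return the list of removed walls (pairs of adjacent cells) of an n x n maze."""
    parent = list(range(n * n))           # make(c) for every cell c

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving, a form of compression
            x = parent[x]
        return x

    # Every wall between horizontally or vertically adjacent cells.
    walls = [(r * n + c, r * n + c + 1) for r in range(n) for c in range(n - 1)]
    walls += [(r * n + c, (r + 1) * n + c) for r in range(n - 1) for c in range(n)]
    random.Random(seed).shuffle(walls)    # choose edges in random order

    removed = []
    for a, b in walls:
        ra, rb = find(a), find(b)
        if ra != rb:                      # two different regions: remove the wall
            parent[ra] = rb               # union
            removed.append((a, b))
    return removed


maze = generate_maze(4)
```

Because a wall is removed only when it joins two different regions, exactly n² - 1 walls are removed and every pair of cells is connected by a unique path.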

More serious applications:

• Maintaining an equivalence relation

• Incremental connectivity in graphs

• Computing minimum spanning trees

• …

Union Find

Represent each set as a rooted tree

Union by rank

Path compression

The parent of a vertex x is denoted by p[x]

Find(x) traces the path from x to the root

Path Compression

Union by rank

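[Figure: linking two roots by rank. If r1 < r2, the root of rank r1 is hung under the root of rank r2; if both roots have rank r, one is hung under the other, whose rank becomes r+1.]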

Union by rank on its own gives O(log n) find time

A tree of rank r contains at least $2^r$ elements

If x is not a root, then rank(x)<rank(p[x])

Rank = height (disregarding compressions)

Cs

6/1/2010

Union Find - pseudocode
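A possible Python rendering of the three operations, combining union by rank with path compression. It is a sketch under stated assumptions, not the slide's own pseudocode: the class name `UnionFind` and the `union` helper are additions, while `make`, `link`, `find`, `p[x]` and `rank` follow the names used elsewhere in these slides.

```python
class UnionFind:
    def __init__(self):
        self.p = {}       # p[x] is the parent of x; a root is its own parent
        self.rank = {}

    def make(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def find(self, x):
        root = x
        while self.p[root] != root:       # first pass: locate the root
            root = self.p[root]
        while x != root:                  # second pass: compress the path
            nxt = self.p[x]
            self.p[x] = root
            x = nxt
        return root

    def link(self, x, y):
        """Link two roots; the root of smaller rank goes under the other."""
        if x == y:
            return x
        if self.rank[x] > self.rank[y]:
            x, y = y, x
        self.p[x] = y
        if self.rank[x] == self.rank[y]:  # equal ranks: the new root's rank grows
            self.rank[y] += 1
        return y

    def union(self, x, y):
        return self.link(self.find(x), self.find(y))


uf = UnionFind()
for c in "abcde":
    uf.make(c)
uf.union("a", "b")
uf.union("c", "d")
uf.union("b", "d")
```

In this run the two rank-1 roots b and d are linked, d becomes the root with rank 2, and e remains alone in its own set.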

Cs

5/1/2010

Union-Find

             make    link      find
Worst case   O(1)    O(1)      O(log n)
Amortized    O(1)    O(α(n))   O(α(n))

Nesting / Repeated application

Inverse functions

Union by Size

Hang the smaller tree (by number of nodes) on the larger tree.

Continue from Notes

www.cse.yorku.ca/~andy/courses/4101/lecture-notes/LN6.pdf

Assume: UNION is done by size.

Lemma 1: After any number of UNION operations, if a node has height h then it has $\ge 2^h$ descendants.

Proof: By induction. When the height grows, the size at least doubles: $2^h + 2^h = 2^{h+1}$.

Cor 0: With UNION + FIND (path compression): if the height of a tree is h, it has $\ge 2^h$ descendants.

Cor 1: height $\le \log n$

Cont 2

Cor 2: Worst case of UNION = O(1)

Worst case of find = O(log n)

Will show amortized = O(log*(n))

Fact 1: lg*(r) = g iff exp*(g-1) < r <= exp*(g)

Reminder: rank(x) = height of node x in the uncompressed forest
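Fact 1 can be made concrete with two small functions. The slides do not spell out exp*; the sketch assumes it is the tower function with exp*(0) = 1 and exp*(g) = 2^exp*(g-1), which matches the rank/group values 2, 4, 16, 65536 used in the analysis. The names `exp_star` and `lg_star` are chosen for the example.

```python
def exp_star(g):
    """Tower of g twos (assumed definition): exp*(0) = 1, exp*(g) = 2 ** exp*(g - 1)."""
    t = 1
    for _ in range(g):
        t = 2 ** t
    return t


def lg_star(r):
    """Smallest g with exp*(g) >= r, for an integer r >= 1."""
    g, t = 0, 1
    while t < r:
        t = 2 ** t
        g += 1
    return g


groups = [lg_star(r) for r in (2, 4, 16, 65536)]
```

Working with integers avoids floating-point logarithms; for example lg*(17) = 4, since exp*(3) = 16 < 17 <= 65536 = exp*(4).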

Cont 3

Lemma 2: For any sequence s of operations (Union + Find), the number of nodes of rank r is at most $n/2^r$.

Proof:

• By Lemma 1, each such node has at least $2^r$ descendants.

• All nodes of rank r must have disjoint sets of descendants.

• Summing over them must give fewer than n+1 nodes.

Cont 4

Lemma 3: If, during the execution of sequence s, node x is a proper descendant of node y, then rank(x) < rank(y) in s.

Proof: a) If x becomes a descendant of y due to a union, then after the union rank(x) < rank(y).

b) If due to a compression (find), then x was already a descendant of y due to an (earlier) union, as before.

[Figure: linking two trees by rank.]

Recall: rank = height in the uncompressed forest

Cont 5

• Put nodes in groups

• For x: group(x) = lg*(rank(x))

• Analyze FIND (+compression)

• Look at x1, x2, ..., xk on the path being compressed.

• If group(xi) = group(xi+1), charge to xi

• Else: charge to FIND

rank (up to)   group
2              1
4              2
16             3
65536          4
2^65536        5

• Cost attributed to a single find = O(log* n)

Cont 6

• Cost attributed to x:

• Every compression gives x a new parent (x moves up)

• => the new parent has a higher rank than the old parent

• As long as parent(x) remains in group(x), charge to x

• After that, x is a child of a parent in another group (x and its parent are not in the same group), and subsequent compressions are charged to FIND.

• A node in group g is charged at most exp*(g)-exp*(g-1) times.

Recall: groups are defined by rank (as accounted for in the uncompressed tree)

Cont 7

• A node in group g is charged at most exp*(g)-exp*(g-1) times.

• Number of nodes in group g: N(g)

$$N(g) \le \sum_{r=\exp^*(g-1)+1}^{\exp^*(g)} \frac{n}{2^r} \le \frac{n}{2^{\exp^*(g-1)+1}}\left[1 + \frac{1}{2} + \frac{1}{4} + \dots\right] = \frac{n}{2^{\exp^*(g-1)}} = \frac{n}{\exp^*(g)}$$

Cont 8

• Total number of moves in group g:

$$N(g)\,\bigl(\exp^*(g) - \exp^*(g-1)\bigr) \le \frac{n}{\exp^*(g)}\,\bigl(\exp^*(g) - \exp^*(g-1)\bigr) = O(n)$$

There are O(log*(n)) groups, so the overall work = O(n log*(n))

Amortized work = O(log*(n))