datastructII - web.cse.ohio-state.edu
CSE 2331
Table Doubling
Dynamic Arrays
Allocate an array of size 10.
What if you try to insert 11 elements in the array?
Need to reallocate the array.
Reallocate to size 11. Insert element. Reallocate to size 12. Insert element. Etc.
How much time does it take to insert 100 elements?
How much time does it take to insert n elements?
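One way to explore the two questions above is to count element copies. The sketch below assumes each reallocation copies every existing element and grows the array by exactly one slot; the function name and the cost model are illustrative assumptions, not part of the slides.

```python
def copies_when_growing_by_one(n, initial_size=10):
    """Count element copies when the array grows by one slot on each overflow."""
    size, count, copies = initial_size, 0, 0
    for _ in range(n):
        if count == size:
            copies += count  # every existing element is copied to the new array
            size += 1
        count += 1
    return copies


print(copies_when_growing_by_one(11))   # a single reallocation
print(copies_when_growing_by_one(100))  # copies grow roughly like n*n/2
```

Under this model the copy count for n insertions is a sum of consecutive integers, so it grows quadratically in n.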
Open Address Hashing
Hash table of size m = 20.
For Θ(1) expected running time, need:
# elements n ≤ 20/2 = 10.
After inserting 10 elements, need to increase hash table size m.
Array Doubling
Start with array of size 2.
After inserting 2 elements, replace with array of size 4.
After inserting 4 elements, replace with array of size 8.
After inserting 8 elements, replace with array of size 16.
etc.
Array Doubling
function Insert(x, A, m, n)
    /* A is an array of size m. */
    /* n = # elements in A. */
    if (n = m) then
        Create new array A2 with size 2m;
        for i ← 1 to m do
            A2[i] ← A[i];
        end
        Replace array A with A2;
        m ← 2m;
    end
    n ← n + 1;
    A[n] ← x;
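A minimal Python sketch of the Insert pseudocode above. The class name `DynamicArray` and the use of 0-based indexing are assumptions made for illustration; the slides use 1-based arrays.

```python
class DynamicArray:
    """Array that doubles its capacity when full (0-based sketch of Insert)."""

    def __init__(self, capacity=2):
        self.m = capacity            # array size
        self.n = 0                   # number of stored elements
        self.A = [None] * capacity

    def insert(self, x):
        if self.n == self.m:
            A2 = [None] * (2 * self.m)   # create new array of size 2m
            for i in range(self.m):      # copy the m existing elements
                A2[i] = self.A[i]
            self.A = A2
            self.m *= 2
        self.A[self.n] = x               # 0-based, so store before incrementing n
        self.n += 1


arr = DynamicArray()
for x in [21, 22, 41, 42, 81]:
    arr.insert(x)
print(arr.m, arr.A)
```

Inserting 21, 22, 41, 42, 81 into an array that starts at size 2 leaves it at size 8 with five elements.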
Example
A : [ , ]
Insert: 21
A : [21, ]
Insert: 22
A : [21, 22]
Insert: 41
A : [21, 22, 41, ] (Array size m = 4.)
Insert: 42
A : [21, 22, 41, 42]
Insert: 81
A : [21, 22, 41, 42, 81, , , ] (Array size m = 8.)
Insert: 82
Insert: 83
Insert: 84
A : [21, 22, 41, 42, 81, 82, 83, 84]
Insert: 61
A : [21, 22, 41, 42, 81, 82, 83, 84, 61, , , , , , , ] (m = 16.)
Running Time Analysis
What is the running time of Insert when m ≠ n?
What is the running time of Insert when m = n?
Running Times
# elements   Array size m   Insert time
     0             2         c
     1             2         c
     2             2         c + 4c
     3             4         c
     4             4         c + 8c
     5             8         c
     6             8         c
     7             8         c
     8             8         c + 16c
     9            16         c
    10            16         c
   ...           ...         ...
    15            16         c
    16            16         c + 32c
    17            32         c
Running Time Analysis
Total running time for n insertions:
T(n) = cn + (Cost of doubling)
     = cn + (4c + 8c + 16c + 32c + ... + nc/2 + nc + 2nc)
     = cn + (2nc + nc + nc/2 + ... + 32c + 16c + 8c + 4c)
     = cn + 2nc(1 + 1/2 + 1/4 + ... + 2/n)
     ≤ cn + 4cn = 5cn.
T(n) = cn + (Cost of doubling) ≥ cn.
Since cn ≤ T(n) ≤ 5cn,
T(n) ∈ Θ(n).
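The bounds can be checked numerically. This sketch assumes the cost model of the running-time table: each insertion costs c, plus 2mc whenever a full array of size m is doubled. The function name and defaults are illustrative.

```python
def total_insert_cost(n, c=1, initial_size=2):
    """Total cost of n insertions: c each, plus 2*m*c per doubling of a size-m array."""
    m, count, total = initial_size, 0, 0
    for _ in range(n):
        if count == m:
            total += 2 * m * c  # cost of creating and filling the array of size 2m
            m *= 2
        total += c
        count += 1
    return total


for n in (1, 17, 1000):
    print(n, total_insert_cost(n), 5 * n)
```

For every n tried, the total stays between cn and 5cn, in line with the derivation.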
Amortized Analysis
Cost of one single operation may vary greatly.
Average cost is much lower than the highest cost.
Example: Array doubling.
Insert cost: c or c + 2cn.
Total cost of n Inserts: at most 5cn.
Average Insert cost: at most 5cn/n = 5c, which is Θ(1).
Hash Table Doubling
function Dict.Insert(K, D)
    m ← HashTable.size;
    if (HashTable.NumElements ≥ m/2) then
        Create new hash table HashTable2 with size 2m;
        for i ← 1 to m do
            if (HashTable[i] is not empty) then
                /* Use a separate variable so that the key K being inserted is not overwritten. */
                K2 ← HashTable[i].key;
                HashTable2.Insert(K2, HashTable[i].data);
            end
        end
        Replace HashTable with HashTable2;
    end
    HashTable.Insert(K, D);
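A hedged Python sketch of the same idea. The slides do not specify a probing scheme, so linear probing, the class name `OpenAddressTable`, and the helper `_raw_insert` are assumptions chosen for illustration.

```python
class OpenAddressTable:
    """Open-address hash table (linear probing assumed) that doubles at load 1/2."""

    def __init__(self, size=4):
        self.slots = [None] * size   # each slot is None or a (key, data) pair
        self.num_elements = 0

    @staticmethod
    def _raw_insert(slots, key, data):
        """Insert without resizing; return True if the key was not already present."""
        m = len(slots)
        i = hash(key) % m
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % m
        is_new = slots[i] is None
        slots[i] = (key, data)
        return is_new

    def insert(self, key, data):
        m = len(self.slots)
        if 2 * self.num_elements >= m:           # NumElements >= m/2
            new_slots = [None] * (2 * m)
            for entry in self.slots:             # rehash every stored pair
                if entry is not None:
                    self._raw_insert(new_slots, entry[0], entry[1])
            self.slots = new_slots
        if self._raw_insert(self.slots, key, data):
            self.num_elements += 1

    def lookup(self, key):
        m = len(self.slots)
        i = hash(key) % m
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % m
        return None
```

Because the table is doubled before it becomes more than half full, an empty slot always exists and the probe loops terminate.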
Deletions from Dynamic Tables
Pop
function Pop(x, A, m, n)
    /* A is an array of size m. */
    /* n = # elements in A. */
    if (n = 0) then error “Empty array.”;
    x ← A[n];
    n ← n − 1;
    return (x);
Dynamic Table Deletions
m = size of array A.
n = number of elements in A.
Want to shrink the array if n is a lot less than m.
Proposal: Create new array of size m/2 if n ≤ m/2.
What’s the problem?
Insert/Delete problem
Operation   Time       # elements after   Array size m after
(start)                       32                  32
Insert      c + 64c           33                  64
Delete      c + 32c           32                  32
Insert      c + 64c           33                  64
Delete      c + 32c           32                  32
Insert      c + 64c           33                  64
Delete      c + 32c           32                  32
Solution: Create new array of size m/2 if n ≤ m/4.
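The difference between the two shrink rules can be seen by replaying the alternating Insert/Delete sequence. The sketch below counts only resizing work, charging the size of the new array as in the tables; the function name and the `shrink_divisor` parameter are illustrative assumptions.

```python
def resize_cost(ops, shrink_divisor, n=32, m=32):
    """Resizing cost (in units of c) for a string of 'I' (insert) and 'D' (delete).

    The array is halved after a delete when n <= m / shrink_divisor.
    """
    cost = 0
    for op in ops:
        if op == "I":
            if n == m:
                m *= 2
                cost += m          # new array of size 2m
            n += 1
        else:
            n -= 1
            if shrink_divisor * n <= m and m >= 4:
                m //= 2
                cost += m          # new array of size m/2
    return cost


print(resize_cost("ID" * 3, shrink_divisor=2))  # halve when n <= m/2
print(resize_cost("ID" * 3, shrink_divisor=4))  # halve when n <= m/4
```

With the m/2 rule every operation in the sequence triggers a resize; with the m/4 rule only the first insert does.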
Dynamic Table Pop
function DynamicPop(x, A, m, n)
    /* A is an array of size m. */
    /* n = # elements in A. */
    if (n = 0) then error “Empty array.”;
    x ← A[n];
    n ← n − 1;
    if ((n ≤ m/4) and (m ≥ 4)) then
        Create new array A2 with size m/2;
        for i ← 1 to n do
            A2[i] ← A[i];
        end
        Replace array A with array A2;
        m ← m/2;
    end
    return (x);
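A minimal Python sketch of DynamicPop, using 0-based indexing. The class name `ShrinkingArray` and its constructor are assumptions for illustration; array sizes are assumed to be powers of 2.

```python
class ShrinkingArray:
    """Array that halves its capacity when at most a quarter full after a pop."""

    def __init__(self, items, capacity):
        self.m = capacity
        self.n = len(items)
        self.A = list(items) + [None] * (capacity - len(items))

    def pop(self):
        if self.n == 0:
            raise IndexError("Empty array.")
        x = self.A[self.n - 1]
        self.A[self.n - 1] = None
        self.n -= 1
        if 4 * self.n <= self.m and self.m >= 4:   # n <= m/4 and m >= 4
            A2 = [None] * (self.m // 2)
            for i in range(self.n):                # copy the n remaining elements
                A2[i] = self.A[i]
            self.A = A2
            self.m //= 2
        return x


s = ShrinkingArray([31, 32, 33, 34, 35, 36], capacity=16)
for _ in range(4):
    print(s.pop(), s.n, s.m)
```

Starting from six elements in an array of size 16, four pops shrink the array to size 8 and then to size 4.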
Example
A : [31, 32, 33, 34, 35, 36, , , , , , , , , , ]
(Array has size m = 16.)
DynamicPop
A : [31, 32, 33, 34, 35, , , , , , , , , , , ]
DynamicPop
A : [31, 32, 33, 34, , , , ] (Array has size m = 8.)
DynamicPop
A : [31, 32, 33, , , , , ]
DynamicPop
A : [31, 32, , ] (Array has size m = 4.)
Dynamic Table Pop
What is the running time of DynamicPop when n ≠ m/4?
What is the running time of DynamicPop when n = m/4?
Running Times
# elements   Array size m   DynamicPop time
    18            64         c
    17            64         c + 32c
    16            32         c
   ...           ...         ...
    10            32         c
     9            32         c + 16c
     8            16         c
     7            16         c
     6            16         c
     5            16         c + 8c
     4             8         c
     3             8         c + 4c
     2             4         c + 2c
     1             2         c
Running Time Analysis
Assume number of elements n = array size m = 2^k.
Total running time for n deletes:
T(n) = cn + (Cost of table halving)
     = cn + (cn + cn/2 + cn/4 + cn/8 + ... + 2c)
     = cn + cn(1 + 1/2 + 1/4 + 1/8 + ... + 2/n)
     ≤ cn + 2cn = 3cn.
T(n) = cn + (Cost of table halving) ≥ cn.
Since cn ≤ T(n) ≤ 3cn,
T(n) ∈ Θ(n).
Insert & Delete Running Time
n inserts.
n deletes.
The inserts need not all precede the deletes.
Total running time of 2n operations is still Θ(n).
Data Structures for Disjoint Sets
Union-Find Data Structure
Represent disjoint sets: Each element is in exactly one set.
Operations:
MakeSet(x) - Create a new set containing element x.
Union(x, y) - Union of the sets containing x and y.
FindSet(x) - Return a reference to a representative element of the
set containing x.
Connected Component
[Figure: graph on vertices v1, ..., v12.]
Are vi and vj in the same connected component?
Connected Component
/* Create a set for each vertex */
foreach vertex vi do MakeSet(vi);
while (Not Done) do
    . . .
    /* Add edge (vi, vj) */
    if (FindSet(vi) ≠ FindSet(vj)) then
        Union(vi, vj);
    end
    . . .
    /* Report if vx and vy are in the same component */
    if (FindSet(vx) = FindSet(vy)) then
        Print “vx and vy are in the same component.”
    end
    . . .
end
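The loop above can be sketched in Python as follows. The union-find here is a deliberately simple parent-pointer dictionary, which is an assumption for illustration; the function name `same_component_answers` is likewise hypothetical.

```python
def same_component_answers(vertices, edges, queries):
    """Add all edges, then answer 'same component?' for each query pair."""
    parent = {v: v for v in vertices}      # MakeSet for each vertex

    def find_set(v):
        while parent[v] != v:              # follow parent pointers to the representative
            v = parent[v]
        return v

    for u, v in edges:                     # add edge (u, v)
        ru, rv = find_set(u), find_set(v)
        if ru != rv:
            parent[rv] = ru                # Union
    return [find_set(x) == find_set(y) for x, y in queries]


print(same_component_answers(
    vertices=range(1, 7),
    edges=[(1, 2), (2, 3), (4, 5)],
    queries=[(1, 3), (1, 4), (4, 5)],
))
```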
Linked List Based Union-Find
[Figure: two linked lists, one containing x1, ..., x5 and the other containing x6, ..., x11.]
Linked List Based Union-Find
[Figure: the two linked lists x1, ..., x5 and x6, ..., x11, each with a tail pointer at its first node.]
procedure FindSet(x)
    return (x.head);
Linked List Based Union-Find
[Figure: two linked lists, x1, ..., x5 and x6, ..., x9, each with a tail pointer at its first node.]
procedure Union(x, y)
    /* Assumes x and y are in different sets. */
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    x′.tail.next ← y′;
    w ← y′;
    while (w ≠ NULL) do
        w.head ← x′;
        w ← w.next;
    end
    x′.tail ← y′.tail;
Linked List Based Union-Find
procedure MakeSet(x)
    x.head ← x;
    x.tail ← x;
    x.next ← NULL;
procedure FindSet(x)
    return (x.head);
procedure Union(x, y)
    /* Assumes x and y are in different sets. */
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    x′.tail.next ← y′;
    w ← y′;
    while (w ≠ NULL) do
        w.head ← x′;
        w ← w.next;
    end
    x′.tail ← y′.tail;
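A small Python sketch of the three procedures. The `Node` class, the `value` field, the `members` helper, and the early return when both elements are already in the same set are assumptions added for illustration.

```python
class Node:
    """List node; MakeSet is performed by the constructor."""

    def __init__(self, value):
        self.value = value
        self.head = self   # x.head <- x
        self.tail = self   # x.tail <- x (only meaningful at the head node)
        self.next = None   # x.next <- NULL


def find_set(x):
    return x.head


def union(x, y):
    xh, yh = find_set(x), find_set(y)
    if xh is yh:
        return                 # guard added in this sketch: same set, nothing to do
    xh.tail.next = yh          # append y's list to x's list
    w = yh
    while w is not None:       # update the head pointer of every appended node
        w.head = xh
        w = w.next
    xh.tail = yh.tail


def members(x):
    """Values in the set containing x, in list order."""
    out, w = [], find_set(x)
    while w is not None:
        out.append(w.value)
        w = w.next
    return out
```

FindSet is a single pointer lookup, while Union walks the whole appended list.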
Linked List: Weighted Union
procedure WeightedUnion(x, y)
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    if (x′ = y′) then return;
    if (x′.length ≥ y′.length) then
        x′.tail.next ← y′;
        w ← y′;
        while (w ≠ NULL) do
            w.head ← x′;
            w ← w.next;
        end
        x′.length ← x′.length + y′.length;
        x′.tail ← y′.tail;
    else
        WeightedUnion(y, x);
    end
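A Python sketch of WeightedUnion that also returns how many head pointers it changed, which is the quantity that dominates the running time. The `Node` class, the return value, and the `pairing_rounds` driver are assumptions for illustration; the driver swaps roots instead of recursing.

```python
class Node:
    """List node; head, tail and length are only meaningful at the head node."""

    def __init__(self, value):
        self.value = value
        self.head = self
        self.tail = self
        self.next = None
        self.length = 1


def weighted_union(x, y):
    """Append the shorter list to the longer one; return the number of head updates."""
    xh, yh = x.head, y.head
    if xh is yh:
        return 0
    if xh.length < yh.length:
        xh, yh = yh, xh            # same effect as calling WeightedUnion(y, x)
    xh.tail.next = yh
    updates = 0
    w = yh
    while w is not None:
        w.head = xh
        updates += 1
        w = w.next
    xh.length += yh.length
    xh.tail = yh.tail
    return updates


def pairing_rounds(n):
    """Merge n singletons in rounds (pairs, then pairs of pairs, ...); n a power of 2."""
    nodes = [Node(i) for i in range(n)]
    total, step = 0, 1
    while step < n:
        for i in range(0, n - step, 2 * step):
            total += weighted_union(nodes[i], nodes[i + step])
        step *= 2
    return total


print(pairing_rounds(16))
```

Merging equal-sized sets round by round is the pattern that forces the most head updates, since every round relabels half of all elements.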
Weighted Union: Analysis
For a given node xi, how many times does xi.head change?
Min size of set containing xi   Cost of changing xi.head
             1                             c
             2                             c
             4                             c
             8                             c
            ...                           ...
            n/2                            c
             n                             c
$K_i$ = total cost of changing xi.head:
$K_i \le \underbrace{c + c + \cdots + c}_{\log_2(n)} = c \log_2(n).$
Total for all xi:
$\sum_{i=1}^{n} K_i \le \sum_{i=1}^{n} c \log_2(n) = c\,n \log_2(n).$
Weighted Union: Analysis
Lower bound:
WeightedUnion(x1, x2); WeightedUnion(x3, x4); WeightedUnion(x5, x6); WeightedUnion(x7, x8); . . .
WeightedUnion(x1, x3); WeightedUnion(x5, x7); WeightedUnion(x9, x11); . . .
WeightedUnion(x1, x5); WeightedUnion(x9, x13); . . .
Time: $\underbrace{c + c + \cdots + c}_{n} + \underbrace{2c + 2c + \cdots + 2c}_{\lfloor n/2 \rfloor} + \underbrace{4c + 4c + \cdots + 4c}_{\lfloor n/4 \rfloor} + \cdots \approx c\,n \log_2(n)$
Takes Ω(n log(n)) time.
Linked List: Weighted Union
procedure MakeSet(x)
    x.head ← x;
    x.tail ← x;
    x.next ← NULL;
    x.length ← 1;
procedure FindSet(x)
    return (x.head);
Tree Based Union-Find
[Figure: forest of trees on elements x1, ..., x17; each node points to its parent.]
procedure FindSet(x)
    if (x = x.parent) then return (x);
    else
        z ← FindSet(x.parent);
        return (z);
    end
procedure Union(x, y)
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    y′.parent ← x′;
Tree Based Union-Find
procedure MakeSet(x)
    x.parent ← x;
procedure FindSet(x)
    if (x = x.parent) then return (x);
    else
        z ← FindSet(x.parent);
        return (z);
    end
procedure Union(x, y)
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    y′.parent ← x′;
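A Python sketch of the tree-based procedures. The `Node` class and the `depth` helper are assumptions for illustration, and FindSet is written iteratively rather than recursively.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = self          # MakeSet: x.parent <- x


def find_set(x):
    while x is not x.parent:        # walk up to the root
        x = x.parent
    return x


def union(x, y):
    xr, yr = find_set(x), find_set(y)
    yr.parent = xr                  # root of y's tree becomes a child of x's root


def depth(x):
    """Number of parent pointers followed from x to its root."""
    d = 0
    while x is not x.parent:
        x = x.parent
        d += 1
    return d


nodes = [Node(i) for i in range(6)]
for i in range(5):
    union(nodes[i + 1], nodes[i])   # each union puts the old root under a new one
print(depth(nodes[0]))
```

This particular sequence of unions builds a single chain, so FindSet on the deepest node follows one pointer per earlier union.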
Height Based Union
procedure MakeSet(x)
    x.parent ← x;
    x.height ← 0;
procedure UnionByHeight(x, y)
    /* Assumes x and y are in different trees. */
    x′ ← FindSet(x);
    y′ ← FindSet(y);
    if (x′.height > y′.height) then y′.parent ← x′;
    else if (x′.height < y′.height) then x′.parent ← y′;
    else
        y′.parent ← x′;
        x′.height ← x′.height + 1;
    end
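A Python sketch of UnionByHeight. The `Node` class and the explicit same-root check are assumptions added for illustration; without the check, a union of two elements already in the same tree would increase the stored height incorrectly.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = self   # MakeSet
        self.height = 0


def find_set(x):
    while x is not x.parent:
        x = x.parent
    return x


def union_by_height(x, y):
    xr, yr = find_set(x), find_set(y)
    if xr is yr:
        return                       # guard added in this sketch
    if xr.height > yr.height:
        yr.parent = xr
    elif xr.height < yr.height:
        xr.parent = yr
    else:
        yr.parent = xr
        xr.height += 1               # equal heights: the merged tree is one taller


nodes = [Node(i) for i in range(6)]
for i in range(5):
    union_by_height(nodes[i + 1], nodes[i])
print(find_set(nodes[5]).value, find_set(nodes[5]).height)
```

The union sequence that would build a chain under the unweighted rule now yields a tree of height 1.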
Proposition 1. If r is the root of tree T, then r.height is the height of tree T.
Height Based Union: Proof of Proposition 1
Proof. By induction on the number of vertices.
Base case: If a tree T has one vertex, r, then r.height equals 0 and height(T) equals 0.
Induction hypothesis: If tree T has fewer than n vertices and r is the root of T, then r.height = height(T).
Induction step: Show that if tree T has n > 1 vertices and r is the root of T, then r.height = height(T).
Tree T was created by UnionByHeight(x, y) where x is in tree Tx and y is in tree Ty. Let x′ be the root of Tx and y′ be the root of Ty.
By the induction hypothesis, x′.height = height(Tx) and y′.height = height(Ty).
Case I: x′.height > y′.height.
    r.height = x′.height = height(Tx) = height(T). (Why?)
Case II: x′.height < y′.height. Same argument as Case I.
Case III: x′.height = y′.height.
    r.height = x′.height + 1 = height(Tx) + 1 = height(T). (Why?)
Definition. size(T) = number of vertices of tree T.
Proposition 2. If T has height h, then $\text{size}(T) \ge 2^{h}$.
Height Based Union: Proof of Proposition 2
Proof. By induction on size(T).
Base case: If size(T) = 1, then height(T) equals 0 and $\text{size}(T) = 1 \ge 2^{\text{height}(T)}$.
Induction hypothesis: If size(T) < n, then $\text{size}(T) \ge 2^{\text{height}(T)}$.
Induction step: Show that if size(T) = n > 1, then $\text{size}(T) \ge 2^{\text{height}(T)}$.
Tree T was created by UnionByHeight(x, y) where x is in tree Tx and y is in tree Ty. Let x′ be the root of Tx and y′ be the root of Ty.
By the induction hypothesis, $\text{size}(T_x) \ge 2^{\text{height}(T_x)}$ and $\text{size}(T_y) \ge 2^{\text{height}(T_y)}$.
Case I: x′.height > y′.height.
    height(T) = r.height = x′.height = height(Tx). (Proposition 1.)
    $\text{size}(T) = \text{size}(T_x) + \text{size}(T_y) \ge \text{size}(T_x) \ge 2^{\text{height}(T_x)} = 2^{\text{height}(T)}$. (Why?)
Case II: x′.height < y′.height. Same argument as Case I.
Case III: x′.height = y′.height.
    height(T) = r.height = x′.height + 1 = height(Tx) + 1. (Proposition 1.)
    height(Tx) = x′.height = y′.height = height(Ty). (Proposition 1.)
    $\text{size}(T) = \text{size}(T_x) + \text{size}(T_y) \ge 2^{\text{height}(T_x)} + 2^{\text{height}(T_y)} = 2 \times 2^{\text{height}(T_x)} = 2^{\text{height}(T_x)+1} = 2^{\text{height}(T)}$. (Why?)
Corollary. If T has height h, then h ≤ log2(size(T)).
Proof. By Proposition 2, $\text{size}(T) \ge 2^{h}$, so $\log_2(\text{size}(T)) \ge \log_2(2^{h}) = h$.
Find With Path Compression
procedure FindSetPathCompress(x)
    if (x = x.parent) then return (x);
    else
        z ← FindSetPathCompress(x.parent);
        x.parent ← z;
        return (z);
    end
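A direct Python transcription of FindSetPathCompress. The `Node` class and the hand-built chain in the usage lines are assumptions for illustration.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = self


def find_set_path_compress(x):
    if x is x.parent:
        return x
    z = find_set_path_compress(x.parent)
    x.parent = z          # point x directly at the root
    return z


# Build the chain n0 -> n1 -> n2 -> n3 -> n4 by hand, then compress it.
nodes = [Node(i) for i in range(5)]
for i in range(4):
    nodes[i].parent = nodes[i + 1]
root = find_set_path_compress(nodes[0])
print(root.value, [n.parent.value for n in nodes])
```

After one call on the deepest node, every node on the path points directly at the root, so later finds on those nodes follow a single pointer.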
Find With Path Compression: Example
[Figure: example tree on elements x1, ..., x18 illustrating path compression.]
Union by Rank with Path Compression
Combine height-based union with path compression.
Replace x.height with x.rank.
Proposition 1: If r is the root of tree T, then height(T) ≤ r.rank.
Proposition: The cost of n UnionByRank and FindSetPathCompress operations is O(n α(n)), where α(n) is the inverse of the Ackermann function.
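A compact sketch combining both techniques. It stores parents and ranks in arrays indexed by element number and compresses paths iteratively; the class name `DisjointSets` and these representation choices are assumptions for illustration.

```python
class DisjointSets:
    """Union by rank with path compression over the elements 0 .. n-1."""

    def __init__(self, n):
        self.parent = list(range(n))   # MakeSet for every element
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:     # first pass: locate the root
            root = self.parent[root]
        while self.parent[x] != root:        # second pass: compress the path
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x, y):
        xr, yr = self.find(x), self.find(y)
        if xr == yr:
            return
        if self.rank[xr] < self.rank[yr]:
            xr, yr = yr, xr                  # make xr the root of larger rank
        self.parent[yr] = xr
        if self.rank[xr] == self.rank[yr]:
            self.rank[xr] += 1


ds = DisjointSets(8)
for a, b in [(0, 1), (2, 3), (0, 2), (4, 5)]:
    ds.union(a, b)
print(ds.find(3) == ds.find(1), ds.find(4) == ds.find(0))
```

Because compression can shorten a tree without lowering the stored value, the rank is only an upper bound on the height.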
The Ackermann Function
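$A_1(n) = 2n.$
$A_2(n) = \underbrace{2 \times 2 \times \cdots \times 2}_{n} = 2^{n}.$
$A_3(n) = \underbrace{2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}}_{n}$ (a tower of n 2's).
$A_k(n) = A_{k-1}^{(n)}(1) = \underbrace{A_{k-1}(A_{k-1}(A_{k-1}(\ldots(A_{k-1}(1))\ldots)))}_{n}.$
Inverse of $A_1(n)$ = ⌈n/2⌉.
Inverse of $A_2(n)$ = ⌈log2(n)⌉.
Inverse of $A_3(n)$ = log*(n), the number of times log2 must be applied to n to reach a value ≤ 1.
$\alpha(n) = \min\{k : A_k(k) \ge n\}.$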
(Slightly different than definition in Intro to Algorithms by CLRS.)
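The hierarchy can be evaluated directly for small arguments. The sketch below follows the definitions on this slide; the function names `A` and `alpha` are illustrative. Since $A_4(4)$ is a tower of 65536 twos, `alpha` only evaluates the first three levels and returns 4 for every larger n that could be stored.

```python
def A(k, n):
    """A_1(n) = 2n; for k > 1, A_k(n) applies A_{k-1} n times, starting from 1."""
    if k == 1:
        return 2 * n
    value = 1
    for _ in range(n):
        value = A(k - 1, value)
    return value


def alpha(n):
    """min k such that A_k(k) >= n, for n >= 1."""
    for k in (1, 2, 3):
        if A(k, k) >= n:
            return k
    return 4   # A_4(4) exceeds any integer that fits in memory


print([A(3, n) for n in range(1, 5)])
print([alpha(n) for n in (2, 4, 16, 17, 10**80)])
```

The growth of $A_k(k)$ is so fast that α(n) never exceeds 4 for inputs of any practical size.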