Linked Lists and Hash Tablesaebnenas/teaching/fall2007/cs5321/lectures/… · Single and Multi...

Linked Lists and Hash Tables Jon Woods CS 5321


Linked Lists and Hash Tables

Jon Woods

CS 5321


Stacks and Queues

Stack – LIFO (last in, first out)

Queue – FIFO (first in, first out)

[Figure: the elements 1 through 9 drawn as a stack and as a queue.]


Stack

Push – O(1)

Pop – O(1)

Stack Empty – O(1)

[Figure: stack example with the elements 3, 3, 2, 1.]
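The three stack operations can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the class name `Stack` and the use of a Python list as backing storage are assumptions, and `append` is amortized O(1) rather than strictly O(1).

```python
class Stack:
    """Array-backed LIFO stack (illustrative sketch)."""

    def __init__(self):
        self._items = []

    def is_empty(self):
        # Stack Empty: O(1)
        return len(self._items) == 0

    def push(self, x):
        # Push: amortized O(1) on a Python list
        self._items.append(x)

    def pop(self):
        # Pop: O(1); popping an empty stack is an underflow
        if self.is_empty():
            raise IndexError("stack underflow")
        return self._items.pop()


s = Stack()
for value in (1, 2, 3):
    s.push(value)
print(s.pop(), s.pop(), s.pop())  # last in, first out
```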


Queue

Enqueue – O(1)

Dequeue – O(1)

[Figure: queue example with the elements 3, 1, 2, 3.]
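One way to get O(1) enqueue and dequeue is a fixed-size circular array. The sketch below assumes that layout; the class name `Queue`, the `capacity` parameter, and the choice to raise exceptions on overflow and underflow are illustrative choices, not taken from the slides.

```python
class Queue:
    """Fixed-capacity FIFO queue on a circular array (illustrative sketch)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0   # index of the oldest element
        self._size = 0   # number of stored elements

    def enqueue(self, x):
        if self._size == len(self._slots):
            raise OverflowError("queue overflow")
        tail = (self._head + self._size) % len(self._slots)
        self._slots[tail] = x
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue underflow")
        x = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._size -= 1
        return x


q = Queue(3)
for value in (1, 2, 3):
    q.enqueue(value)
print(q.dequeue(), q.dequeue(), q.dequeue())  # first in, first out
```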


Linked List

Singly Linked List

Doubly Linked List

Circularly Linked List

[Figure: the list 1, 2, 3 with a Head pointer, drawn as a singly linked, a doubly linked, and a circularly linked list.]


List Search

List-Search(L, k)
    x = head[L]
    while x != NIL and key[x] != k
        do x = next[x]
    return x

List Search = Θ(n) in the worst case
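The search loop translates almost line for line into Python. In this sketch the `Node` class and the function name `list_search` are illustrative, and `None` plays the role of NIL.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def list_search(head, k):
    """Return the first node whose key is k, or None if there is none."""
    x = head
    while x is not None and x.key != k:
        x = x.next
    return x


head = Node(1, Node(2, Node(3)))
found = list_search(head, 3)      # walks the whole list
missing = list_search(head, 42)   # also walks the whole list, then gives up
```

Both calls above visit every node, which is where the linear worst case comes from.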


List Insert

List-Insert(L, x)
    next[x] = head[L]
    if head[L] != NIL
        then prev[head[L]] = x
    head[L] = x
    prev[x] = NIL

List Insert = O(1)


List Delete

List-Delete(L, x)
    if prev[x] != NIL
        then next[prev[x]] = next[x]
        else head[L] = next[x]
    if next[x] != NIL
        then prev[next[x]] = prev[x]

List Delete = O(1) or Θ(n)? Why?
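A doubly linked list with both operations might look like the sketch below. The names `DoublyLinkedList`, `delete_key`, and `keys` are illustrative additions. The sketch also shows one way to think about the question above: unlinking a node you already hold touches a constant number of pointers, while deleting by key has to find the node first.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def insert(self, x):
        """Splice node x onto the front of the list: O(1)."""
        x.next = self.head
        if self.head is not None:
            self.head.prev = x
        self.head = x
        x.prev = None

    def delete(self, x):
        """Unlink node x, given the node itself: O(1)."""
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.head = x.next
        if x.next is not None:
            x.next.prev = x.prev

    def search(self, k):
        x = self.head
        while x is not None and x.key != k:
            x = x.next
        return x

    def delete_key(self, k):
        """Delete by key: the search dominates, so linear in the worst case."""
        x = self.search(k)
        if x is not None:
            self.delete(x)

    def keys(self):
        out, x = [], self.head
        while x is not None:
            out.append(x.key)
            x = x.next
        return out
```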


Single and Multi Arrays

Multi array implementations represent linked lists with three arrays: key, next, and prev.

Single array implementations represent linked lists as a single array, with key, next, and prev stored as sequential values within that array.


Multi Array Implementation

index   1     2     3     4     5     6     7     8
next    -     3    NIL    -     2     -     5     -
key     -     4     1     -    16     -     9     -
prev    -     5     2     -     7     -    NIL    -

L = 7

The variable L holds the index of the head, 7 in this case. Slots marked "-" are not part of the list.
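A multi array list can be walked with nothing but integer indices. The sketch below is a hedged illustration: the array contents are one reading of the slide's figure (head index 7, keys 9, 16, 4, 1 in list order), index 0 is padding so that the arrays are 1-based, and the names `next_`, `NIL`, and `list_keys` are assumptions.

```python
NIL = 0  # index 0 is unused padding, so it can stand in for NIL

# 1-based parallel arrays; position 0 is padding (assumed reading of the figure)
next_ = [NIL, NIL, 3, NIL, NIL, 2, NIL, 5, NIL]
key = [None, None, 4, 1, None, 16, None, 9, None]
prev = [NIL, NIL, 5, 2, NIL, 7, NIL, NIL, NIL]
L = 7  # index of the head


def list_keys(head):
    """Follow next_ from the head and collect keys in list order."""
    out, x = [], head
    while x != NIL:
        out.append(key[x])
        x = next_[x]
    return out


print(list_keys(L))
```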


Single Array Implementation

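[Figure: a single array with indices 1 to 21. Each object occupies three consecutive cells holding key, next, and prev.]

object starts at   key   next   prev
        4            4     7     13
        7            1    NIL     4
       13           16     4     19
       19            9    13    NIL

L = 19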

What are the advantages of using this implementation? Disadvantages?
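The single array layout can be sketched the same way, with a pointer being the index of an object's first cell and fields reached by fixed offsets. The cell contents below are an assumed reading of the slide's figure (head index 19), and the names `KEY`, `NEXT`, `PREV`, `store`, and `list_keys` are illustrative.

```python
NIL = 0
KEY, NEXT, PREV = 0, 1, 2  # field offsets within one object

A = [None] * 22  # cells 1..21 are used; cell 0 is padding


def store(i, k, nxt, prv):
    """Write one object into cells i, i+1, i+2."""
    A[i + KEY], A[i + NEXT], A[i + PREV] = k, nxt, prv


# Assumed reading of the figure
store(4, 4, 7, 13)
store(7, 1, NIL, 4)
store(13, 16, 4, 19)
store(19, 9, 13, NIL)
L = 19


def list_keys(head):
    out, x = [], head
    while x != NIL:
        out.append(A[x + KEY])
        x = A[x + NEXT]
    return out


print(list_keys(L))
```

As one point for the discussion, a commonly cited trade-off is that a single array can hold objects of different sizes, at the cost of doing the offset arithmetic by hand.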


Allocate and Free

Allocate-Object()
    if free != NIL
        x = free
        free = next[x]
        return x

Free-Object(x)
    next[x] = free
    free = x

These functions both take O(1) time.
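The free list is itself a singly linked list threaded through the next array, which is why both operations touch only a constant number of cells. In the sketch below the class name `Pool` and the initial chaining 1 -> 2 -> ... -> size are assumptions, and raising `MemoryError` when the free list is empty is an added choice (the pseudocode above does not say what happens in that case).

```python
NIL = 0


class Pool:
    """Fixed pool of object slots 1..size with a free list (sketch, size >= 1)."""

    def __init__(self, size):
        # next_[i] chains the free slots: 1 -> 2 -> ... -> size -> NIL
        self.next_ = [NIL] + [i + 1 for i in range(1, size)] + [NIL]
        self.free = 1

    def allocate_object(self):
        if self.free == NIL:
            raise MemoryError("out of space")  # assumed error behavior
        x = self.free
        self.free = self.next_[x]
        return x

    def free_object(self, x):
        self.next_[x] = self.free
        self.free = x


pool = Pool(3)
a = pool.allocate_object()
b = pool.allocate_object()
pool.free_object(a)          # slot a becomes the new head of the free list
c = pool.allocate_object()   # so it is the next slot handed out
```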


Allocate

[Figure: the next, key, and prev arrays (indices 1 to 8) before and after allocation. Before: L = 7, free = 4. After: L = 4, free = 8, and the newly allocated object 4 holds key 25.]

Allocate-Object() returns 4 (the next item on the free list), and we then call List-Insert(L, 4).

The new head of the free list is 8.


Free

[Figure: the next, key, and prev arrays (indices 1 to 8) before and after freeing object 5. Before: L = 4, free = 8. After: L = 4, free = 5.]

After calling List-Delete(L, 5), we call Free-Object(5).

Object 5 now becomes the new head of the free list.


Direct Address Table

[Figure: a direct address table T with slots 0 to 9. U (universe of keys) = {0, 1, ..., 9}. K (actual keys) = {2, 3, 5, 8}. Slots 2, 3, 5, and 8 each hold a key and its satellite data.]


Direct Address Table

DIRECT_ADDRESS_SEARCH(T, k)
    return T[k]

DIRECT_ADDRESS_INSERT(T, x)
    T[key[x]] = x

DIRECT_ADDRESS_DELETE(T, x)
    T[key[x]] = NIL

All functions are O(1)
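Because the key is the array index, each operation is a single array access. A minimal sketch, assuming integer keys in the range 0 to `universe_size - 1`; the class name and the `(key, data)` tuple layout are illustrative.

```python
class DirectAddressTable:
    """One slot per possible key; None plays the role of NIL."""

    def __init__(self, universe_size):
        self.slots = [None] * universe_size

    def search(self, k):
        return self.slots[k]

    def insert(self, key, data):
        self.slots[key] = (key, data)

    def delete(self, key):
        self.slots[key] = None


T = DirectAddressTable(10)
for k in (2, 3, 5, 8):
    T.insert(k, "satellite data for %d" % k)
```

The cost of this simplicity is space: the table needs one slot for every key in the universe, however few keys are actually stored.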


Collisions and Chaining

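[Figure: keys k1 to k8 from the set of actual keys K (a subset of the universe U) hashed into a table. Keys that collide share a slot and are chained in a linked list: (k1, k4), (k2, k5, k7), (k3), (k6, k8).]

h(k1) = h(k4), h(k2) = h(k5) = h(k7), h(k6) = h(k8)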
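Chaining can be sketched as a table of per-slot lists. In this illustration Python lists stand in for the linked lists, and the hash function `k % m` is an assumption chosen only to make collisions easy to see; the slide does not fix a particular h.

```python
class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one chain per slot

    def _h(self, k):
        return k % self.m  # assumed hash function, for illustration only

    def insert(self, k, value):
        self.slots[self._h(k)].append((k, value))

    def search(self, k):
        # Only the chain in slot h(k) is scanned
        for key, value in self.slots[self._h(k)]:
            if key == k:
                return value
        return None

    def delete(self, k):
        chain = self.slots[self._h(k)]
        chain[:] = [(key, value) for key, value in chain if key != k]


table = ChainedHashTable(3)
table.insert(1, "k1")
table.insert(4, "k4")   # 4 % 3 == 1 % 3, so k1 and k4 share a chain
table.insert(2, "k2")
```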


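Analysis of Chaining

During a successful search for x, we examine 1 more than the number of elements preceding x in its list.

Assuming simple uniform hashing, P{h(ki) = h(kj)} = 1/m for any two distinct keys ki and kj.

Thus the expected number of collisions contributed by any single pair of keys is 1/m, which is the 1/m term inside the sum below.

$$
\begin{aligned}
\mathrm{E}\left[\frac{1}{n}\sum_{i=1}^{n}\left(1+\sum_{j=i+1}^{n}\frac{1}{m}\right)\right]
&= 1+\frac{1}{nm}\sum_{i=1}^{n}(n-i) \\
&= 1+\frac{1}{nm}\left(\sum_{i=1}^{n}n-\sum_{i=1}^{n}i\right) \\
&= 1+\frac{1}{nm}\left(n^{2}-\frac{n(n+1)}{2}\right) \\
&= 1+\frac{n-1}{2m} \\
&= 1+\frac{\alpha}{2}-\frac{\alpha}{2n}
\end{aligned}
$$

Including the time to compute the hash function, a successful search takes

$$
\Theta\left(2+\frac{\alpha}{2}-\frac{\alpha}{2n}\right)=\Theta(1+\alpha)
$$

If the number of slots is proportional to the number of elements in the table, then n = O(m).

Since α = n/m, we get α = O(m)/m = O(1), so searching takes O(1) time on average.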


Hash Functions

Division: h(k) = k mod m

Multiplication: h(k) = ⌊m (kA mod 1)⌋, where A is a constant with 0 < A < 1

We should choose a power of 2 for m in the multiplication hashing scheme, but NOT for the division scheme. Why?
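Both schemes fit in a few lines. In the sketch below the function names are illustrative, and the constant A = (√5 − 1)/2 is an assumption (it is the value Knuth suggests; the slide does not fix A). The last lines hint at the question above: with m a power of 2, the division method keeps only the low-order bits of k.

```python
import math

A = (math.sqrt(5) - 1) / 2  # assumed constant, roughly 0.618


def division_hash(k, m):
    return k % m


def multiplication_hash(k, m, a=A):
    fractional = (k * a) % 1      # kA mod 1: the fractional part of kA
    return math.floor(m * fractional)


# With m = 2**4, division keeps only the low 4 bits of the key,
# so keys that differ only in their high bits always collide.
low_bits = division_hash(0b10110101, 16)
same_slot = division_hash(0b01000101, 16)
```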


Universal Hashing

Choosing the hash function at random, independently of the keys, gives a probabilistic guarantee on efficiency.

This ensures good average case performance for any set of keys.

With universal hashing, we can achieve Θ(1+α) expected search time without making assumptions about the keys.


Universal Hashing

Let Y_k be the number of keys other than k that hash to the same slot as k, and let n_{h(k)} be the length of the list in slot h(k).

As before, a single pair of distinct keys collides with probability at most 1/m, so

$$
\mathrm{E}[Y_k]\le\sum_{l\in T,\ l\ne k}\frac{1}{m}
$$

If the key k is not in the table, then every key in slot h(k) is different from k, so the list length equals Y_k. The number of keys in T that are not equal to k is n. We therefore expect to examine at most α keys before concluding that k is absent.

$$
k\notin T:\quad n_{h(k)}=Y_k,\qquad \left|\{l : l\in T\wedge l\ne k\}\right|=n,\qquad
\mathrm{E}[n_{h(k)}]=\mathrm{E}[Y_k]\le\frac{n}{m}=\alpha
$$

If the key k is in the table, then the list in slot h(k) includes k itself. The number of keys in T that are not equal to k is n-1. We therefore expect to examine fewer than 1+α keys to find k.

$$
k\in T:\quad n_{h(k)}=Y_k+1,\qquad \left|\{l : l\in T\wedge l\ne k\}\right|=n-1,\qquad
\mathrm{E}[n_{h(k)}]=\mathrm{E}[Y_k]+1\le\frac{n-1}{m}+1=1+\alpha-\frac{1}{m}<1+\alpha
$$


Designing a Universal Hash Function

We choose a prime number p such that every possible key is in the range 0 to p-1.

We choose two values at random: a from the range 1 to p-1 (a must be nonzero) and b from the range 0 to p-1.

h(k) = ((ak + b) mod p) mod m
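Drawing a and b at random and closing over them gives one member of the family. This is a sketch under stated assumptions: the function names and the optional `rng` parameter are illustrative, and p is taken on trust to be a prime larger than every key.

```python
import random


def universal_hash(k, a, b, p, m):
    """h(k) = ((a*k + b) mod p) mod m for fixed a, b."""
    return ((a * k + b) % p) % m


def random_universal_hash(p, m, rng=random):
    """Pick a from 1..p-1 and b from 0..p-1, return the resulting h."""
    a = rng.randrange(1, p)   # a must be nonzero
    b = rng.randrange(0, p)

    def h(k):
        return universal_hash(k, a, b, p, m)

    return h


h = random_universal_hash(p=101, m=9, rng=random.Random(0))
slots = [h(k) for k in range(101)]  # every key in 0..p-1 lands in 0..m-1
```

The random choice is made once, when the table is created; the same h must then be used for every insert and search on that table.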


Open Addressing

Instead of following pointers, we compute a probing sequence: a function of the key and the probe number that yields the next slot to examine.

By not storing pointers, we free memory for more slots in the same space, which may yield fewer collisions and faster retrieval.

Truly uniform hashing requires each of the m! permutations of the slots to be equally likely as a probing sequence.


Linear and Quadratic Probing

Linear probing: h(k, i) = (h'(k) + i) mod m

Suffers from primary clustering

Only offers m distinct probing sequences

Quadratic probing: h(k, i) = (h'(k) + c1 i + c2 i^2) mod m

Suffers from secondary clustering

Also offers only m distinct probing sequences
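The two probing rules differ only in how the probe number i shifts the start slot. The sketch below assumes h'(k) = k mod m and, for the quadratic example, c1 = 1 and c2 = 3 with m = 11; those values and all function names are illustrative, not taken from the slides.

```python
def make_linear_probe(m):
    def probe(k, i):
        return (k % m + i) % m
    return probe


def make_quadratic_probe(m, c1, c2):
    def probe(k, i):
        return (k % m + c1 * i + c2 * i * i) % m
    return probe


def insert(table, k, probe):
    """Try probes i = 0..m-1 and store k in the first empty slot."""
    for i in range(len(table)):
        j = probe(k, i)
        if table[j] is None:
            table[j] = k
            return j
    # Quadratic probing may fail here even if some slot is still empty
    raise OverflowError("no empty slot found in m probes")


linear = [None] * 5
linear_slots = [insert(linear, k, make_linear_probe(5)) for k in (0, 5, 10)]

quadratic = [None] * 11
quad_probe = make_quadratic_probe(11, c1=1, c2=3)
quadratic_slots = [insert(quadratic, k, quad_probe) for k in (0, 11, 22)]
```

In both tables all three keys share the same start slot. With linear probing they pile up in adjacent slots; with quadratic probing they spread out but still follow the same sequence, since the sequence depends only on h'(k).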


Double Hashing

h(k, i) = (h1(k) + i h2(k)) mod m

h1(k) = k mod 13

h2(k) = 1 + (k mod 11)

h1(14) = 1, h2(14) = 4

slot   0    1    2    3    4    5    6    7    8    9    10   11   12
key    -    79   -    -    69   98   -    72   -    14   -    50   -

In double hashing, we calculate two hashes, one for the initial position and one for the offset should that position be full.

In this example, we choose the hash functions shown above. After inserting 5 values into the table, we try to insert 14.

Position 1 is full, so we advance by the offset 4 to position 5. Position 5 is also full, so we advance by 4 again and put our data into position 9.

Double hashing offers m^2 distinct probing sequences.
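The example can be replayed in code. The hash functions and the key 14 come from the slide; the order in which the first five keys are inserted is an assumption (the slide only shows where they end up), as are the function names.

```python
M = 13


def h1(k):
    return k % 13


def h2(k):
    return 1 + (k % 11)


def probe(k, i):
    return (h1(k) + i * h2(k)) % M


def insert(table, k):
    """Return the slot where k is stored, trying probes i = 0..M-1."""
    for i in range(M):
        j = probe(k, i)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("table is full")


table = [None] * M
for k in (79, 69, 72, 98, 50):   # assumed insertion order
    insert(table, k)

probes_for_14 = [probe(14, i) for i in range(3)]
slot_of_14 = insert(table, 14)
```

Because the offset h2(k) depends on the key, two keys that start in the same slot usually diverge on the second probe, which is what distinguishes this scheme from linear and quadratic probing.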


Analysis of Open Addressing

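Let X be the number of probes made while looking for an empty slot, and let α = n/m < 1.

The expected number of probes is the sum over i of the probability that at least i probes are needed, which is the probability that the first i-1 slots probed were all full.

$$
\begin{aligned}
\mathrm{E}[X] &= \sum_{i=1}^{\infty}\Pr\{X\ge i\}
= \sum_{i=1}^{\infty}\frac{n}{m}\cdot\frac{n-1}{m-1}\cdots\frac{n-i+2}{m-i+2} \\
&\le \sum_{i=1}^{\infty}\left(\frac{n}{m}\right)^{i-1} \\
&= \sum_{i=1}^{\infty}\alpha^{i-1} \\
&= \sum_{i=0}^{\infty}\alpha^{i} = \frac{1}{1-\alpha}\ \text{probes}
\end{aligned}
$$

By bounding each factor by n/m and summing the geometric series, we bound the expected number of probes.

Thus, we expect at most 1/(1-α) probes on average.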


Perfect Hashing

When used with a static set of keys and two levels of universal hashing, we can construct a structure with no collisions and O(1) worst case search time.

Why is this better than other hash schemes?


Perfect Hashing

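h(k) = ((ak + b) mod p) mod m

a = 3, b = 42, p = 101, m = 9

[Figure: primary table T with slots 0 to 8. Each nonempty slot j points to a secondary table Sj with its own parameters mj, aj, bj.]

slot j   mj   aj   bj   keys stored in Sj
  0       1    0    0   10
  2       9   10   18   60, 72, 75
  5       1    0    0   70
  7      16   23   88   40, 52, 22, 37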

h(75) = 2, so 75 hashes to slot 2 of table T.

h'(75) = 7, so 75 hashes to slot 7 of secondary hash table S2.
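The two-level lookup can be checked in code. The primary parameters a = 3, b = 42, p = 101, m = 9 and the two results for key 75 come from the slide; the per-slot secondary parameters and the key set are an assumed reading of the figure, and all names are illustrative.

```python
P = 101
A, B, M = 3, 42, 9


def uhash(k, a, b, m):
    return ((a * k + b) % P) % m


# slot j -> (m_j, a_j, b_j); assumed reading of the figure
SECONDARY_PARAMS = {0: (1, 0, 0), 2: (9, 10, 18), 5: (1, 0, 0), 7: (16, 23, 88)}
KEYS = [10, 22, 37, 40, 52, 60, 70, 72, 75]


def build():
    """Place every key in its secondary table; fail loudly on any collision."""
    tables = {j: [None] * mj for j, (mj, _, _) in SECONDARY_PARAMS.items()}
    for k in KEYS:
        j = uhash(k, A, B, M)
        mj, aj, bj = SECONDARY_PARAMS[j]
        slot = uhash(k, aj, bj, mj)
        if tables[j][slot] is not None:
            raise ValueError("secondary collision")
        tables[j][slot] = k
    return tables


def contains(tables, k):
    """Two hash evaluations and one comparison: constant worst case time."""
    j = uhash(k, A, B, M)
    if j not in tables:
        return False
    mj, aj, bj = SECONDARY_PARAMS[j]
    return tables[j][uhash(k, aj, bj, mj)] == k


tables = build()
primary_slot = uhash(75, A, B, M)
secondary_slot = uhash(75, *SECONDARY_PARAMS[2][1:], SECONDARY_PARAMS[2][0])
```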


This man owns the patent on linked lists

Linked List - Patent No. 10260471

Patent Issued April 11, 2006 to LSI Logic Corporation

“A computerized list is provided with auxiliary pointers for traversing the list in different sequences. One or more auxiliary pointers enable a fast, sequential traversal of the list with a minimum of computational time. Such lists may be used in any application where lists may be reordered for various purposes.”

Abhi Talwalkar, CEO LSI Logic