Designing Hash Tables

22
1 Designing Hash Tables • Sections 5.3, 5.4, 5.5

description

Designing Hash Tables. Sections 5.3, 5.4, 5.5. Designing a hash table. Hash function: establishing a key with an indexed location in a hash table E.g. Index = hash(key) % table_size; Resolve conflicts: Need to handle case where multiple keys mapped to the same index. - PowerPoint PPT Presentation

Transcript of Designing Hash Tables

Page 1: Designing Hash Tables

1

Designing Hash Tables

• Sections 5.3, 5.4, 5.5

Page 2: Designing Hash Tables

2

Designing a hash table

1. Hash function: establishing a key with an indexed location in a hash table

– E.g. Index = hash(key) % table_size;

2. Resolve conflicts: – Need to handle case where multiple keys mapped to

the same index.– Two representative solutions

• Chaining with separate lists

• Probing open addressing

Page 3: Designing Hash Tables

3

Separate Chaining

• Each table entry stores a list of items

• Multiple keys mapped to the same entry maintained by the list

• Example– Hash(k) = k mod 10

– (10 is not a prime, just for illustration)

Page 4: Designing Hash Tables

4

Separate Chaining Implementation

Type Declaration forSeparate ChainingHash Table

Page 5: Designing Hash Tables

5

HashedObj

• Needs to provide– Hash function

• Provided for string and int (the two non-member functions)

– Equality operators (operator== or operator!= )

Page 6: Designing Hash Tables

6

An example class for HashedObj

Page 7: Designing Hash Tables

7

Chaining

Page 8: Designing Hash Tables

8

Chaining (contd.)

Page 9: Designing Hash Tables

9

Chaining (contd.)

Page 10: Designing Hash Tables

10

Analysis of Chaining

• Consider an array of size M with N records– Worst case insert without uniqueness check = O(1)

• Find location to insert and push_back/front

– Worst case remove/find/unique insert = O(N)– Expected case unique insert/find/remove

• 1 + O(N/M)– Let us resize the table if N/M exceeds some constant – Expected time = 1 + O() = O(1)

Page 11: Designing Hash Tables

11

Hash Tables Without Chaining

• Try to avoid buckets with separate lists

• How use Probing Hash Tables– If collision occurs, try another cell in the hash table.

– More formally, try cells h0(x), h1(x), h2(x), h3(x)… in succession until a free cell is found.• hi(x) = hash(x) + f(i)

• And f(0) = 0

Page 12: Designing Hash Tables

12

• f(i)=i

Insert (assume no duplicated keys)1. Index = hash(key) % table_size;

2. If table[index] is empty, put information (key and others) in entry table[index].

3. If table[index] is not empty thenIndex ++; index = index % table_size;

goto 2.

Search (key)1. Index = hash(key) % table_size;

2. If (table[index] is empty) return –1 (not found).

3. Else if (table[index].key == key) return index;

4. Index ++; index = index % table_size; goto 2.

Linear Probing

Page 13: Designing Hash Tables

13

Example

Insert 89, 18, 49, 58, 69 (hash(k) = k mod 10)

Page 14: Designing Hash Tables

14

Linear probing

• Delete– Can be tricky, must maintain the consistency of the hash table.

– What is the simplest deletion strategy you can think of??

Page 15: Designing Hash Tables

15

Quadratic Probing

f(i) = i2

Page 16: Designing Hash Tables

16

Probing strategy hash table

Page 17: Designing Hash Tables

17

Double Hashing

• f(i) = i*hash2(x)

• E.g. hash2(x) = 7 – (x % 7)

What if hash2(x) = 0 for some x?

Page 18: Designing Hash Tables

18

Analysis of Hash Table Without Chaining

• Expected case analysis of insertion into a table of size M containing n records

i=1 iprobability of trying i buckets

– = 1(M-n)/M + 2(n/M)(M-n)/M + 3(n/M)2(M-n)/M + ... • Let = n/M

– Time = 1*(1-) + 2 (1-) + 32(1-) + ... – = 1 - + 2 - 22 + 32 - 33 + 43 - 44 ... – = 1 + + 2 + 3 + ... = 1/(1-)

• Assume < 1– Keep bounded by some constant < 1

Page 19: Designing Hash Tables

19

Rehashing

• Hash Table may get full– No more insertions possible

• Hash table may get too full– Insertions, deletions, search take longer time

• Solution: Rehash– Build another table that is twice as big and has a new hash

function– Move all elements from smaller table to bigger table

• Cost of Rehashing = O(N)– But happens only when table is close to full– Close to full = table is X percent full, where X is a tunable

parameter

Page 20: Designing Hash Tables

20

Rehashing Example

After RehashingOriginal Hash Table

After Inserting 23

Page 21: Designing Hash Tables

21

Rehashing Implementation

Page 22: Designing Hash Tables

22

Rehashing implementation