COSC 1030 Lecture 10 Hash Table. Topics Table Hash Concept Hash Function Resolve collision...


Page 1: COSC 1030 Lecture 10 Hash Table. Topics Table Hash Concept Hash Function Resolve collision Complexity Analysis.

COSC 1030 Lecture 10

Hash Table

Page 2: Topics

– Table
– Hash Concept
– Hash Function
– Resolve collision
– Complexity Analysis

Page 3: Table

Table
– A collection of entries
– Entry: <key, info>
– Insert, search, and delete
– Update and retrieve

Array representation
– Indexed
– Maps key to index

Page 4: Hash Table

Hash Table
– A table
– Key range >> table size
– Many-to-one mapping (hashing)
– Indexed: hash code as index

Tabbed Address Book
– Map names to A:Z
– Multiple names start with the same letter
– Same tab, sequential slots

Page 5: Hash Table ADT

interface Hashtable {
    void insert(Item anItem);
    Item search(Key aKey);
    boolean remove(Key aKey);
    boolean isFull();
    boolean isEmpty();
}

Page 6: Hash Function

Maps key to index evenly
For any n in N,
    hash(n) = n mod M
where M is the size of the hash table.
    hash(k*M + n) = n, where 0 <= n < M and k is an integer
Map to an integer first if the key is not an integer
– A:Z → 0:25
– String s → h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1)
– String s → h(s[0])*26^(n-1) + … + h(s[n-1])

Page 7: Hash Function

String s → h(s[0])*26^(n-1) + … + h(s[n-1])

// Horner's rule: reads s as a base-26 number, most significant letter first.
// Relies on an overload toInt(char) that maps A:Z to 0:25 (not shown on the slide).
int toInt(String s) {
    assert(s != null);
    int c = 0;
    for (int i = 0; i < s.length(); i++) {
        c = c*26 + toInt(s.charAt(i));
    }
    return c;
}

int hash(String s) { return hash(toInt(s)); }
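The conversion above can be sketched as a small self-contained class. The class name `StringHash`, the table size `M = 7`, and the `toInt(char)` helper are assumptions made for illustration; the slide does not show them. Only uppercase letters A to Z are accepted, and `Math.floorMod` is used so that the index stays in range even if a long string overflows `int`.

```java
// Hypothetical sketch: names and M = 7 are illustrative, not from the lecture.
class StringHash {
    static final int M = 7; // assumed table size

    // Maps 'A'..'Z' to 0..25 (assumes uppercase ASCII letters only).
    static int toInt(char ch) {
        if (ch < 'A' || ch > 'Z') throw new IllegalArgumentException("expected A-Z: " + ch);
        return ch - 'A';
    }

    // Horner's rule: h(s[0])*26^(n-1) + ... + h(s[n-1]).
    static int toInt(String s) {
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));
        }
        return c;
    }

    // hash(n) = n mod M; floorMod keeps the result in 0..M-1 even for negative n.
    static int hash(int n) { return Math.floorMod(n, M); }

    static int hash(String s) { return hash(toInt(s)); }

    public static void main(String[] args) {
        System.out.println(toInt("BA")); // 1*26 + 0 = 26
        System.out.println(hash("BA"));  // 26 mod 7 = 5
    }
}
```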

Page 8: Example

Table[7] – HASHTABLE_SIZE = 7
Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table
– Hash codes (numeric part of the key mod 7): 2, 0, 5, 4, 5
Collision
– The slot indexed by the hash code is already occupied (‘Z26’ hashes to slot 5, which ‘M12’ already holds)
A simple solution
– Sequentially decrease the index until an empty slot is found or the table is full
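The example can be traced with a short sketch. It assumes that only the numeric part of each label is the key (B2 becomes 2, and so on), which is consistent with the hash codes on the slide; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: only the numeric part of each label (B2 -> 2) is used as the key.
class ExampleTable {
    static final int HASHTABLE_SIZE = 7;

    // Inserts keys with the "decrease index until empty" rule and returns the table.
    static Integer[] insertAll(int[] keys) {
        Integer[] table = new Integer[HASHTABLE_SIZE];
        int count = 0;
        for (int key : keys) {
            if (count == HASHTABLE_SIZE) break;   // table is full
            int index = key % HASHTABLE_SIZE;     // keys assumed non-negative
            while (table[index] != null) {        // collision: step down, wrapping at 0
                index = (index - 1 + HASHTABLE_SIZE) % HASHTABLE_SIZE;
            }
            table[index] = key;
            count++;
        }
        return table;
    }

    public static void main(String[] args) {
        Integer[] table = insertAll(new int[] {2, 7, 12, 4, 26});
        System.out.println(Arrays.toString(table)); // [7, null, 2, 26, 4, 12, null]
    }
}
```

Key 26 finds slot 5 taken by 12 and slot 4 taken by 4, so it settles in slot 3.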

Page 9: Collision Possibility

How often may a collision occur?
Insert 100 random numbers into a table of 200 slots
$1 - \prod_{i=0}^{99} \frac{200 - i}{200} = 1 - 6.66 \times 10^{-14} > 0.99999999999993$
Load factor
– 100/200 = 0.5 = 50% → 0.99999999999993
– 20/200 = 0.1 = 10% → 0.63
– 10/200 = 0.05 = 5% → 0.2
Default load factor is 75% in Java Hashtable
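The probabilities listed for each load factor can be reproduced with a few lines of code. This is a sketch under the slide's assumption that every key lands in a uniformly random slot; the class and method names are hypothetical.

```java
// Hypothetical sketch: chance of at least one collision among n random keys in m slots.
class CollisionOdds {
    static double atLeastOneCollision(int n, int m) {
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            // The (i+1)-th key must avoid the i slots already taken.
            allDistinct *= (double) (m - i) / m;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        System.out.println(atLeastOneCollision(10, 200));  // about 0.2
        System.out.println(atLeastOneCollision(20, 200));  // about 0.63
        System.out.println(atLeastOneCollision(100, 200)); // very close to 1
    }
}
```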

Page 10: Primary Cluster

The biggest solid block in the hash table
Adjacent clusters join
The bigger the primary cluster is, the more easily it grows
Distribute keys evenly to avoid a primary cluster

Page 11: Probe Method

What can we do when a collision occurs?
– A consistent way of searching for an empty slot
– Probe
Linear probe – decrease index by 1, wrap around at 0
Double hash – use quotient to calculate decrement
– Max(1, (Key / M) % M)
Separate chaining – linked list to store colliding items
Hash tree – link to another hash table (A4)
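The slides give code for the two probing methods but not for separate chaining, so here is a minimal sketch of that option. The class `ChainedTable` and its methods are hypothetical and store bare `int` keys rather than the lecture's `Item` type.

```java
import java.util.LinkedList;

// Hypothetical sketch of separate chaining; class and method names are illustrative.
class ChainedTable {
    private final LinkedList<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.floorMod(key, buckets.length); }

    void insert(int key) {
        LinkedList<Integer> chain = buckets[hash(key)];
        if (!chain.contains(key)) chain.add(key);  // colliding keys share one chain
    }

    boolean search(int key) { return buckets[hash(key)].contains(key); }

    boolean remove(int key) {
        // Integer.valueOf selects remove(Object), i.e. removal by value, not by position.
        return buckets[hash(key)].remove(Integer.valueOf(key));
    }

    int chainLength(int slot) { return buckets[slot].size(); }
}
```

With chaining the table never becomes full, and a removal only touches one list, at the cost of extra memory for the links.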

Page 12: Probe sequence coverage

Ensure the probe sequence covers the whole table
– Utilizes the whole table
– Even distribution
– M and the probe decrement are relatively prime
  No common factor except 1
– Make M a prime number
  M and any decrement (< M) are then relatively prime
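A small experiment shows why the decrement and M should share no common factor. The sketch below counts how many distinct slots a fixed decrement reaches before the walk returns to its start; the class and method names are hypothetical.

```java
// Hypothetical sketch: counts how many distinct slots a fixed decrement reaches.
class ProbeCoverage {
    // Starting at slot 0, repeatedly subtract dec (mod m) until the walk returns to 0.
    static int slotsVisited(int m, int dec) {
        int index = 0;
        int visited = 0;
        do {
            visited++;
            index = Math.floorMod(index - dec, m);
        } while (index != 0);
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(slotsVisited(8, 2)); // 4: gcd(8, 2) = 2, so half the table is never probed
        System.out.println(slotsVisited(7, 2)); // 7: 7 is prime, so every slot is reached
    }
}
```

In general the walk reaches M / gcd(M, dec) slots, which equals M exactly when the two are relatively prime.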

Page 13: Probe Method

void insert(Item item) {
    if (!isFull()) {
        int index = probe(item.key);         // first free slot on the probe sequence
        assert(index >= 0 && index < M);     // probe returns -1 if it finds no slot
        table[index] = item;
        count++;
    }
}

Page 14: Linear Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;     // assumes key >= 0
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        do {
            index--;
            if (index < 0) index += HASHTABLE_SIZE;   // wrap around
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1;             // no empty slot found
        else return index;
    }
}

Page 15: Double Hash Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;     // assumes key >= 0
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;
        dec = Math.max(1, dec);              // the decrement must never be 0
        do {
            index -= dec;
            if (index < 0) index += HASHTABLE_SIZE;   // wrap around
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1;             // no empty slot found
        else return index;
    }
}

Page 16: Search Method

Item search(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
    int index = hashcode;
    while (table[index] != null) {
        if (table[index].key == key) return table[index];
        index -= dec;
        if (index < 0) index += HASHTABLE_SIZE;   // wrap around, as in probe()
        if (index == hashcode) break;             // back at the start: key is absent
    }
    return null;                                  // empty slot reached or full circle
}

Page 17: Delete Method

Difficulty with delete under open addressing
– Emptying a slot destroys the probe chain that passes through it
Solution
– Set a deleted flag
– Search treats a flagged slot as occupied
– Insert treats a flagged slot as available
– Forms primary clusters
Separate chaining
– Move one up from the chained structure
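The deleted-flag idea can be sketched with linear probing and a per-slot state. Everything here (the class `LazyDeleteTable`, the `State` enum, bare `int` keys) is an assumption made for illustration rather than the lecture's implementation.

```java
import java.util.Arrays;

// Hypothetical sketch of lazy deletion with linear probing; names are illustrative.
class LazyDeleteTable {
    private enum State { EMPTY, OCCUPIED, DELETED }

    private final int[] keys;
    private final State[] states;

    LazyDeleteTable(int size) {
        keys = new int[size];
        states = new State[size];
        Arrays.fill(states, State.EMPTY);
    }

    // Returns the slot holding key, or -1. DELETED slots do not stop the search.
    private int find(int key) {
        int start = Math.floorMod(key, keys.length);
        int index = start;
        do {
            if (states[index] == State.EMPTY) return -1;
            if (states[index] == State.OCCUPIED && keys[index] == key) return index;
            index = Math.floorMod(index - 1, keys.length);
        } while (index != start);
        return -1;
    }

    boolean search(int key) { return find(key) >= 0; }

    // Reuses the first EMPTY or DELETED slot on the probe path; false if none is free.
    boolean insert(int key) {
        if (find(key) >= 0) return true;          // already present
        int start = Math.floorMod(key, keys.length);
        int index = start;
        do {
            if (states[index] != State.OCCUPIED) {
                keys[index] = key;
                states[index] = State.OCCUPIED;
                return true;
            }
            index = Math.floorMod(index - 1, keys.length);
        } while (index != start);
        return false;
    }

    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        states[index] = State.DELETED;            // keep the probe chain intact
        return true;
    }
}
```

Because flagged slots still count as occupied during a search, a table with many deletions behaves as if its load factor were higher than the number of live entries suggests.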

Page 18: Efficiency

Successful search
– Best case: first hit, one comparison
– Average
  Half of the average length of a probe sequence
  Load factor dependent
  O(1) if load factor < 0.5
– Worst case: longest probe sequence
  Load factor dependent
Unsuccessful search
– Average: average length of a probe sequence
– Worst case: longest probe sequence

Page 19: Advanced Topics

Choosing Hash Functions
– Generate hash codes randomly and uniformly
– Use all bits of the key
– Assume K = b0b1b2b3
– Division
  h(k) = k % M; p(k) = max(1, (k / M) % M)
– Folding
  h(k) = (b1 ^ b3) % M; p(k) = (b0 ^ b2) % M; // ^ is XOR
– Middle squaring
  h(k) = (b1b2)^2, i.e. b1b2 squared
– Truncating
  h(k) = b3;
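The four schemes can be compared side by side in a sketch. It assumes the key is a 32-bit `int` whose bytes are b0 (most significant) to b3, and a prime table size `M = 13`; neither is stated on the slide. The final `% M` on middle squaring and truncating, and the `max(1, …)` guard on the folding decrement, are additions so that each result is usable as an index or a non-zero step.

```java
// Hypothetical sketch: treats a 32-bit key as four bytes b0 b1 b2 b3 (b0 most significant).
class HashChoices {
    static final int M = 13; // assumed prime table size

    // Byte i of k, as a value in 0..255.
    static int b(int k, int i) { return (k >>> (8 * (3 - i))) & 0xFF; }

    static int division(int k)      { return Math.floorMod(k, M); }
    static int divisionProbe(int k) { return Math.max(1, Math.floorMod(k / M, M)); }

    static int folding(int k)       { return (b(k, 1) ^ b(k, 3)) % M; }              // XOR of two bytes
    static int foldingProbe(int k)  { return Math.max(1, (b(k, 0) ^ b(k, 2)) % M); } // guard added

    static int middleSquaring(int k) {
        int mid = (b(k, 1) << 8) | b(k, 2);     // the two middle bytes, b1b2
        return (int) (((long) mid * mid) % M);  // squared in long to avoid int overflow
    }

    static int truncating(int k)    { return b(k, 3) % M; }  // lowest byte only

    public static void main(String[] args) {
        int k = 0x01020304;                     // b0=1, b1=2, b2=3, b3=4
        System.out.println(folding(k));         // (2 ^ 4) % 13 = 6
        System.out.println(truncating(k));      // 4 % 13 = 4
    }
}
```

Truncating ignores three of the four bytes, which is why the slide lists "use all bits of the key" as a goal.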

Page 20: Advanced Topics

Hash Tree
– Separate chained collision resolution
– Recursively hashing the key

[Figure: a tree of hash tables; slots of a parent hash table link to child hash tables]

Page 21: Hash Tree

void insert(int key, Item item) {
    int h = h(key);
    int k = g(key); // one-to-one mapping Key → Key
    if (table[h] == null) {
        table[h] = item;
    } else {
        if (table[h].link == null) table[h].link = new HashTree();
        table[h].link.insert(k, item);       // rehash the mapped key one level down
    }
}
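A runnable version of the hash tree needs concrete choices that the slide leaves open. The sketch below assumes `h(key) = key mod M` and `g(key) = key div M` with `M = 7`, so that two different keys must part ways at some level; it also keeps the original key in each node so that a search can recognize it. All names other than `insert`, `h`, `g`, `link`, and `HashTree` are hypothetical.

```java
// Hypothetical sketch of a hash tree; h, g, and M = 7 are assumptions for illustration.
class HashTree {
    static final int M = 7;

    private static class Node {
        final int key;       // original key, kept so lookups can recognize it
        final String info;
        HashTree link;       // child table for keys that collide in this slot
        Node(int key, String info) { this.key = key; this.info = info; }
    }

    private final Node[] table = new Node[M];

    private static int h(int key) { return Math.floorMod(key, M); }
    private static int g(int key) { return Math.floorDiv(key, M); } // drops the digit h just used

    void insert(int key, String info) { insert(key, key, info); }

    // k is the key as rewritten by g at each level; key is the original.
    private void insert(int k, int key, String info) {
        int slot = h(k);
        if (table[slot] == null) {
            table[slot] = new Node(key, info);
        } else if (table[slot].key != key) {     // duplicates are ignored
            if (table[slot].link == null) table[slot].link = new HashTree();
            table[slot].link.insert(g(k), key, info);
        }
    }

    String search(int key) { return search(key, key); }

    private String search(int k, int key) {
        Node node = table[h(k)];
        if (node == null) return null;
        if (node.key == key) return node.info;
        return node.link == null ? null : node.link.search(g(k), key);
    }
}
```

For example, keys 5, 12, and 61 all hash to slot 5 at the root; 12 moves to a child table and 61 to a grandchild, since both map to slot 1 one level down.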