Post on 30-May-2018
8/14/2019 Hashing (DASTAL)
Hashing
Hashing is the transformation of a string of characters into a usually
shorter fixed-length value or key that represents the original string.
Hashing is used to index and retrieve items in a database because it is
faster to find the item using the shorter hashed key than to find it using
the original value. It is also used in many encryption algorithms.
Hash Table
A hash table is a data structure that associates keys with values.
A small phone book as a hash table.
http://en.wikipedia.org/wiki/File:HASHTB08.svg
Hash Table (1)
The primary operation it supports efficiently is a lookup: given a key (a
person's name), find the corresponding value (that person's telephone
number). It works by transforming the key using a hash function into a
hash, a number that is used as an index in an array to locate the slot
where the value should be.
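The lookup described above can be sketched in a few lines of Python. This is only an illustration: the names, phone numbers, table size of 16, and the character-sum function `hash_name` are assumptions made for the example, and a collision is simply rejected rather than resolved.

```python
SIZE = 16  # assumed table size for this sketch

def hash_name(name):
    """Toy hash function: sum of character codes reduced to an array index."""
    return sum(ord(ch) for ch in name) % SIZE

table = [None] * SIZE  # each slot holds a (key, value) pair or None

def put(name, number):
    index = hash_name(name)
    if table[index] is not None and table[index][0] != name:
        raise ValueError("collision: slot already holds another key")
    table[index] = (name, number)

def lookup(name):
    entry = table[hash_name(name)]
    if entry is not None and entry[0] == name:
        return entry[1]
    return None  # nothing stored under this key

# Hypothetical phone book entries.
put("Ann", "555-0101")
put("Bob", "555-0102")
put("Cy", "555-0103")
print(lookup("Bob"))
```

Finding a number takes one hash computation and one array access, regardless of how many entries the phone book holds.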
Hash Function
A hashing algorithm (hash function) is any well-defined procedure or
mathematical function that converts a large, possibly variable-sized
amount of data into a small datum, usually a single integer that may serve
as an index into an array. The values returned by a hash function are
called hash values, hash codes, hash sums, or simply hashes.
1. Direct Hashing
The key is the address without any algorithmic manipulation. The data
structure must therefore contain an element for every possible key.
While the situations where you can use direct hashing are limited, when it
can be used it is very powerful because it guarantees that there are no
synonyms (two or more keys that hash to the same address).
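Figure: Direct hashing. The records are 001 Elmer, 002 Markh, 005 Reymund,
007 Hubert, and 100 Rollyn. The hash function returns each key unchanged
as its address: key 005 gives address 5, key 100 gives address 100, and
key 002 gives address 2.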
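As a rough sketch of direct hashing, the Python below uses the five records from the figure. The array length of 101 (so that addresses 0 to 100 all exist) and the function name `direct_hash` are assumptions made for illustration.

```python
def direct_hash(key):
    """Direct hashing: the key itself is the address."""
    return key

# One element for every possible key, 0 through 100.
table = [None] * 101

records = {1: "Elmer", 2: "Markh", 5: "Reymund", 7: "Hubert", 100: "Rollyn"}
for key, name in records.items():
    table[direct_hash(key)] = name

print(table[5])    # record stored under key 005
print(table[100])  # record stored under key 100
```

Only 5 of the 101 elements are used here, which shows the cost of the method: the array must cover the whole key range even when few keys exist.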
2. Subtraction Method
Sometimes we have keys that are consecutive but do not start from one.
Example: A company may have only 100 employees, but the employee numbers
start from 1000 and go to 1100. In this case, we use a very simple hashing
function that subtracts 1000 from the key to determine the address.
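The subtraction method amounts to a one-line function. The sketch below assumes the base value 1000 from the employee-number example; the name `subtraction_hash` is made up for illustration.

```python
BASE = 1000  # value subtracted from every key, taken from the example

def subtraction_hash(key):
    """Subtraction method: address = key - base."""
    return key - BASE

for employee_number in (1001, 1050, 1100):
    print(employee_number, "->", subtraction_hash(employee_number))
```

Like direct hashing, this produces no synonyms, because distinct keys always give distinct addresses.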
3. Digit Extraction
Selected digits are extracted from the key and used as the address.
Example: Using a six-digit employee number to hash to a three-digit
address (000-999), we could select the first, third, and fourth digits.
379452 → 394
121267 → 112
378845 → 388
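Digit extraction can be sketched as string slicing. The function name `digit_extraction` and its parameters are assumptions; the digit positions (first, third, fourth) and the six-digit key width come from the example.

```python
def digit_extraction(key, positions=(1, 3, 4), width=6):
    """Build the address from selected digit positions (1-based) of the key."""
    digits = str(key).zfill(width)  # keep leading zeros, e.g. 45128 -> "045128"
    return int("".join(digits[p - 1] for p in positions))

for key in (379452, 121267, 378845):
    print(key, "->", digit_extraction(key))
```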
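4. Modulo Division
Divides the key by the array size and uses the remainder + 1 as the
address.
Figure: Modulo-division hashing into an array with addresses [001] to
[307]. The records are 379452 Elmer, 121267 Markh, 378845 Hubert,
160252 Arno, and 045128 Rollyn. The hash function sends key 121267 to
address 3, key 045128 to address 307, and key 379452 to address 1.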
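The modulo-division rule (remainder + 1) can be checked against the three keys in the figure. The sketch below assumes the array size of 307 implied by the addresses [001] to [307]; the name `modulo_hash` is made up for illustration.

```python
LIST_SIZE = 307  # array size implied by addresses [001]..[307]

def modulo_hash(key, list_size=LIST_SIZE):
    """Modulo division: remainder of key / list size, plus 1 for 1-based addresses."""
    return key % list_size + 1

# Key 045128 is written without its leading zero, which Python integer literals do not allow.
for key in (121267, 45128, 379452):
    print(key, "->", modulo_hash(key))
```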
5. Midsquare Hashing
The key is squared and the address is selected from the middle of the
squared number.
Example:
9452 * 9452 = 89340304 : address is 3403
As a variation, we can select a portion of the key and use that portion
rather than the whole key.
379452 : 379 * 379 = 143641 : address is selected from the middle of 143641
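A minimal midsquare sketch follows. The function name `mid_square` is an assumption, and so is the way the "middle" is chosen: the function takes the centred run of digits and, when the leftover digits cannot be split evenly, leans one position to the left.

```python
def mid_square(key, address_digits):
    """Square the key and take address_digits digits from the middle of the square."""
    squared = str(key * key)
    start = (len(squared) - address_digits) // 2
    return int(squared[start:start + address_digits])

print(9452 * 9452)          # the squared key
print(mid_square(9452, 4))  # its middle four digits
```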
6. Folding Methods
There are two folding methods that are used:
Fold Shift: the key value is divided into parts whose size matches the
size of the required address. Then, the left and right parts are shifted
and added to the middle part.
Fold Boundary: the left and right numbers are folded on a fixed boundary
between them and the center number. This results in the two outside values
being reversed.
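Figure: Folding the key 123456789 into a three-digit address.
Fold shift: 123 + 456 + 789 = 1368. The leading 1 is discarded, giving
address 368.
Fold boundary: the digits of the outer parts 123 and 789 are reversed, so
321 + 456 + 987 = 1764. The leading 1 is discarded, giving address 764.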
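Both folding methods can be sketched in Python and checked against the key 123456789. The function names are made up for illustration, and the sketch assumes the key splits into at least two parts of the requested size.

```python
def split_parts(key, size):
    """Cut the key's decimal digits into consecutive parts of the given size."""
    digits = str(key)
    return [digits[i:i + size] for i in range(0, len(digits), size)]

def fold_shift(key, size):
    """Add the parts as they are and discard any carry beyond `size` digits."""
    total = sum(int(part) for part in split_parts(key, size))
    return total % 10 ** size

def fold_boundary(key, size):
    """Reverse the digits of the two outer parts, then add and discard the carry."""
    parts = split_parts(key, size)
    parts[0] = parts[0][::-1]
    parts[-1] = parts[-1][::-1]
    total = sum(int(part) for part in parts)
    return total % 10 ** size

print(fold_shift(123456789, 3))
print(fold_boundary(123456789, 3))
```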
Collision
A collision is the event that occurs when a hashing algorithm produces an
address for an insertion key and that address is already occupied.
Home Address: the address produced by the hashing algorithm.
Prime Area: the memory that contains all of the home addresses.
Probe: the calculation of an address and the test for success.
Figure: Collision example with addresses [1], [5], [9], and [17]. Three
keys are inserted in order: 1. hash(A), 2. hash(B), 3. hash(C). B collides
with A, and C collides with B.
Collision Resolution
The process of finding an alternate location for a key that collides.
Collision strategy techniques:
Separate chaining
Open addressing
Coalesced hashing
Perfect hashing
Dynamic perfect hashing
Probabilistic hashing
Robin hood hashing
Cache-conscious collision resolution
Separate Chaining
Sometimes called simply chaining or direct chaining. In its simplest form,
each slot in the array is a linked list, or the head cell of a linked
list, where the list contains the elements that hashed to the same
location. Insertion requires finding the correct slot, then appending to
either end of the list in that slot.
http://en.wikipedia.org/wiki/File:HASHTB32.svg
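Separate chaining can be sketched as follows. For brevity this sketch uses a Python list per slot in place of a hand-built linked list; the class name `ChainedHashTable`, the table size of 8, and the sample keys are assumptions made for illustration.

```python
class ChainedHashTable:
    def __init__(self, size=8):
        # One chain per slot; a Python list stands in for the linked list.
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.slots)

    def insert(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # replace the value of an existing key
                return
        chain.append((key, value))       # otherwise append to the end of the chain

    def lookup(self, key):
        for existing_key, value in self.slots[self._index(key)]:
            if existing_key == key:
                return value
        return None

table = ChainedHashTable()
# Keys 1, 9, and 17 all leave remainder 1 when divided by 8, so they share one chain.
for key, value in ((1, "A"), (9, "B"), (17, "C")):
    table.insert(key, value)
print(table.lookup(9))
```

A lookup hashes to the slot and then walks only that slot's chain, so colliding keys cost extra comparisons but never displace one another.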
Open Addressing
Open addressing hash tables store the records directly within the array.
This approach is also called closed hashing. A hash collision is resolved
by probing, or searching through alternate locations in the array
(following a probe sequence) until either the target record is found, or
an unused array slot is found, which indicates that there is no such key
in the table.
Linear Probing
Collision is resolved by adding one (1) to the current address.
Figure: Linear probing in an array with addresses [001] to [307]. The
table already holds 379452 Elmer, 121267 Markh, 378845 Hubert,
160252 Arno, and 045128 Rollyn. The keys 070918 Redjie and
166702 Reymund are then passed through the hash function and inserted.
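Linear probing can be sketched with the keys from the figure. The sketch assumes the modulo-division hash (remainder + 1) with a list size of 307 as the home-address function, inserts the keys in the order listed, and wraps from address 307 back to 1; the function names are made up for illustration.

```python
LIST_SIZE = 307

def home_address(key):
    """Modulo-division hash giving a 1-based home address."""
    return key % LIST_SIZE + 1

def insert(table, key, name):
    """Probe forward one address at a time until a free slot is found."""
    address = home_address(key)
    for _ in range(LIST_SIZE):
        if table[address] is None:
            table[address] = (key, name)
            return address
        address = address % LIST_SIZE + 1  # add 1, wrapping 307 -> 1
    raise RuntimeError("table is full")

def find(table, key):
    """Follow the same probe sequence; an empty slot means the key is absent."""
    address = home_address(key)
    for _ in range(LIST_SIZE):
        if table[address] is None:
            return None
        if table[address][0] == key:
            return table[address][1]
        address = address % LIST_SIZE + 1
    return None

table = [None] * (LIST_SIZE + 1)  # index 0 is unused so addresses run 1..307
people = [(379452, "Elmer"), (121267, "Markh"), (378845, "Hubert"),
          (160252, "Arno"), (45128, "Rollyn"), (70918, "Redjie"),
          (166702, "Reymund")]
placed = {key: insert(table, key, name) for key, name in people}
print(placed[70918], placed[166702])
```

Under these assumptions, keys 070918 and 166702 share home address 2. The first takes it, and the second probes address 3 (already held by 121267) before settling at address 4.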
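Quadratic Probing
The increment is the collision probe number squared. Each new address is
the collision location plus the increment, and it becomes the collision
location of the next probe.

Probe    Collision   Probe² and    New
Number   Location    Increment     Address
1        1           1² = 1        2
2        2           2² = 4        6
3        6           3² = 9        15
4        15          4² = 16       31
5        31          5² = 25       56
6        56          6² = 36       92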
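The quadratic probe sequence can be generated with a short loop. The list size of 100 and the wrap-around to 1-based addresses are assumptions (the slide does not state a list size), and the name `quadratic_probe_sequence` is made up for illustration.

```python
def quadratic_probe_sequence(first_collision, probes, list_size):
    """Return the addresses tried, adding probe_number squared at each step."""
    addresses = []
    location = first_collision
    for probe in range(1, probes + 1):
        increment = probe * probe
        # Wrap around so addresses stay within 1..list_size.
        location = (location + increment - 1) % list_size + 1
        addresses.append(location)
    return addresses

print(quadratic_probe_sequence(1, 6, 100))
```

Because the increment grows with every probe, colliding keys spread out across the array instead of clustering next to the home address as they do with linear probing.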
Key Offset
Is a double hashing method that produces a different collision path for
different keys.
Formula:
offset = (key / listsize), keeping only the integer quotient
address = ((offset + old address) modulo listsize) + 1
For example, if the key is 166702 and the list size is 307, and modulo
division produced the old address 002:
offset = (166702 / 307) = 543
address = ((543 + 002) modulo 307) + 1 = 239
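The key-offset formula can be written directly in Python, with integer division (`//`) standing in for "keep the integer quotient". The function name `key_offset_address` is an assumption made for illustration.

```python
def key_offset_address(key, old_address, list_size):
    """Key offset: the step size depends on the key, so paths differ per key."""
    offset = key // list_size  # integer quotient
    return (offset + old_address) % list_size + 1

print(166702 // 307)                        # the offset for key 166702
print(key_offset_address(166702, 2, 307))   # the new address after a collision at 002
```

A different key that collides at the same address gets a different offset and therefore follows a different path.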
Figure: Hash table with addresses [001] to [307] holding the records
379452 Elmer, 070918 Redjie, 121267 Markh, 378845 Hubert, 160252 Arno,
045128 Rollyn, 166702 Reymund, and 572556 Angelus.
Hash collision resolved by linear probing (interval = 1).
http://en.wikipedia.org/wiki/File:HASHTB12.svg