12. Hashing

  • 8/8/2019 12. Hashing

    (C) GOYANI MAHESH

    DATA STRUCTURES

    MAHESH GOYANI

    MAHATMA GANDHI INSTITUTE OF TECHNICAL EDUCATION & RESEARCH CENTER

    [email protected]

    HASHING

    TERMINOLOGY

    Hashing provides O(1) (constant) time complexity for searching on average, provided collisions are rare.

    Generally, an array index works as a search index if there is a one-to-one correspondence between key and element.

    In practice, it is very hard to maintain a one-to-one relationship between key and element when the key range is large, for example ID = 12345.
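    To make the size problem concrete, the following minimal sketch contrasts direct addressing with a hashed table. The ID 12345 comes from the text; the record value and the table size of 100 are assumptions chosen only for illustration.

```python
# Direct addressing: the key itself is the array index,
# so the array must be at least as large as the biggest key.
key = 12345
direct_table = [None] * (key + 1)      # 12346 slots for a single record
direct_table[key] = "record-12345"     # hypothetical record value

# Hashing: the key is folded into a small, fixed range of indices.
TABLE_SIZE = 100                       # assumed table size
hashed_table = [None] * TABLE_SIZE
hashed_table[key % TABLE_SIZE] = "record-12345"

print(len(direct_table), len(hashed_table))   # 12346 100
print(key % TABLE_SIZE)                       # 45
```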

    COMPARISON

    Key = 12704
    Hash function: Key % 100

    Hashed Insertion            Linear Insertion

    Index   Record              Index   Record
    00      31300               00      31300
    01      49001               01      49001
    02      52202               02      52202
    03      Empty               03      12704
    04      12704               04      65606
    05      Empty               05
    06      65606               06
    ..                          ..
    98                          98
    99                          99
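    The comparison can be sketched in a few lines of Python. The keys and the hash function Key % 100 are taken from the slide; the function name hash_key and the step counting are assumptions added for illustration.

```python
TABLE_SIZE = 100  # two-digit indices 00..99

def hash_key(key: int) -> int:
    """Hash function from the slide: Key % 100."""
    return key % TABLE_SIZE

keys = [31300, 49001, 52202, 12704, 65606]

# Hashed insertion: each key goes straight to its computed index.
hashed = [None] * TABLE_SIZE
for k in keys:
    hashed[hash_key(k)] = k

# Linear insertion: keys are simply stored in arrival order.
linear = list(keys)

# Searching for 12704: one index computation versus a sequential scan.
target = 12704
found_at = hash_key(target)
scan_steps = linear.index(target) + 1   # comparisons made by a linear scan

print(found_at, hashed[found_at])   # 4 12704
print(scan_steps)                   # 4
```

    With only five records the scan is cheap, but its cost grows with the number of records, while the hashed lookup stays a single index computation as long as no collision occurs.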

    COLLISION

    If two keys produce the same hash value, such as 12345 and 23645 under Key % 100, then both hash to the same location, ARRAY[45]. This is known as a collision.

    Collisions are one of the main problems in designing a good hash function.

    Minimizing collisions is also difficult.
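    A small sketch of how such a collision can be detected, assuming the hash function Key % 100 used earlier. The keys 12345 and 23645 are from the text; 31300 is added only as a non-colliding contrast.

```python
from collections import defaultdict

TABLE_SIZE = 100

def hash_key(key: int) -> int:
    return key % TABLE_SIZE

keys = [12345, 23645, 31300]

# Group keys by the index they hash to.
by_index = defaultdict(list)
for k in keys:
    by_index[hash_key(k)].append(k)

# Any index that received more than one key is a collision.
collisions = {i: ks for i, ks in by_index.items() if len(ks) > 1}
print(collisions)   # {45: [12345, 23645]}
```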

    LINEAR PROBING

    A simple approach to resolving collisions is to store the colliding record in the next available slot.

    Order of Insertion: 14001, 00104, 50003, 77003, 42504, 33099

    Index   Actual Scene
    00
    01      14001
    02
    03      50003
    04      00104
    05      77003
    06      42504
    07
    08
    ..
    ..
    99      33099
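    The insertions on this slide can be replayed with a minimal linear probing sketch. It assumes the hash function Key % 100 from the earlier slides and stores keys as integers, so 00104 appears as 104; the function name insert is an illustrative choice.

```python
TABLE_SIZE = 100

def insert(table: list, key: int) -> int:
    """Insert key with linear probing; return the index actually used."""
    home = key % TABLE_SIZE
    for step in range(TABLE_SIZE):
        i = (home + step) % TABLE_SIZE   # wrap around past index 99
        if table[i] is None:
            table[i] = key
            return i
    raise OverflowError("table is full")

table = [None] * TABLE_SIZE
# Insertion order from the slide; 00104 is written as the integer 104.
for key in [14001, 104, 50003, 77003, 42504, 33099]:
    print(f"{key:05d} -> index {insert(table, key):02d}")
```

    Keys 77003 and 42504 find their home slots (03 and 04) occupied and are pushed forward to 05 and 06.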

    DELETION

    Index   Actual Scene
    00
    01      14001
    02
    03      50003
    04      00104
    05      77003
    06      42504
    07
    08
    ..
    ..
    99      33099

    Linear probing solves one problem but generates another.
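    The slide does not spell out the deletion problem. One well-known difficulty, which is presumably what is meant, is that emptying a slot breaks the probe chain for records stored beyond it. The sketch below shows that effect and one common remedy, a tombstone marker. The marker DELETED and the helper names are assumptions, not part of the slides, and keys are assumed to be unique.

```python
TABLE_SIZE = 100
DELETED = object()   # tombstone marker (an assumption, not from the slides)

def insert(table, key):
    """Linear probing insert; assumes key is not already present."""
    home = key % TABLE_SIZE
    for step in range(TABLE_SIZE):
        i = (home + step) % TABLE_SIZE
        if table[i] is None or table[i] is DELETED:
            table[i] = key
            return i
    raise OverflowError("table is full")

def search(table, key):
    """Probe from the home slot; stop at the first truly empty slot."""
    home = key % TABLE_SIZE
    for step in range(TABLE_SIZE):
        i = (home + step) % TABLE_SIZE
        if table[i] is None:
            return None
        if table[i] is not DELETED and table[i] == key:
            return i
    return None

def build():
    table = [None] * TABLE_SIZE
    for key in [14001, 104, 50003, 77003, 42504, 33099]:
        insert(table, key)
    return table

# Naive deletion: emptying slot 04 cuts the probe chain 03 -> 04 -> 05.
naive = build()
naive[search(naive, 104)] = None
print(search(naive, 77003))    # None: 77003 is still at 05 but unreachable

# Tombstone deletion: the slot is marked, so probing continues past it.
marked = build()
marked[search(marked, 104)] = DELETED
print(search(marked, 77003))   # 5
```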

    CLUSTERING

    Index   Actual Scene
    00
    01      14001
    02
    03      50003
    04      00104
    05      77003
    06      42504
    07
    08
    ..
    ..
    99      33099

    A record whose key hashes to 03, 04, 05, 06, or 07 would be inserted at array slot 07; that is, array slot 07 is five times as likely as array slot 08 to be filled next.

    Clustering results in inconsistent efficiency of insertion and retrieval.
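    The "five times as likely" claim can be checked with a short sketch. It assumes linear probing with Key % 100 and that slots 01, 03, 04, 05, 06, and 99 are occupied, which is the layout linear probing produces for the six records shown; the function name landing_slot is illustrative.

```python
from collections import Counter

TABLE_SIZE = 100
occupied = {1, 3, 4, 5, 6, 99}   # assumed filled slots (see lead-in)

def landing_slot(home: int) -> int:
    """Slot where a new key with this home index ends up under linear probing."""
    i = home
    while i in occupied:
        i = (i + 1) % TABLE_SIZE
    return i

# Count how many of the 100 possible hash values lead to each free slot.
landings = Counter(landing_slot(h) for h in range(TABLE_SIZE))
print(landings[7], landings[8])   # 5 1
```

    Five hash values (03 to 07) funnel into slot 07, while only one (08) leads to slot 08, so the occupied run tends to keep growing.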

    REHASHING

    If the hash function produces a collision, the hash value is used as the input to a rehash function.

    Linear probing:

    (Hash Value + 1) % Array Size

    Rehashing with linear probing (general step):

    (Hash Value + Constant) % Array Size

    The constant and the array size should be relatively prime, so that repeated rehashing can reach every index, odd and even alike.

    Quadratic probing, where i is the probe attempt number (1, 2, 3, ...):

    (Hash Value + i^2) % Array Size

    Pseudo-random generator probing:

    (Hash Value + random()) % Array Size
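    The effect of the relatively prime requirement can be illustrated with a small sketch. It treats the i-th probe as the result of applying the rehash formula i times, that is (hash value + i * constant) % array size; the function names and the sample values 45, 2, and 3 are assumptions for illustration. The pseudo-random variant is left out.

```python
from math import gcd

def linear_probe(h: int, i: int, size: int, c: int = 1) -> int:
    """i-th probe with a constant step: (hash value + i * constant) % array size."""
    return (h + i * c) % size

def quadratic_probe(h: int, i: int, size: int) -> int:
    """i-th probe: (hash value + i^2) % array size."""
    return (h + i * i) % size

def slots_visited(h: int, size: int, c: int) -> int:
    """Number of distinct slots reached by stepping with constant c."""
    return len({linear_probe(h, i, size, c) for i in range(size)})

size = 100
print(gcd(3, size), slots_visited(45, size, 3))   # 1 100
print(gcd(2, size), slots_visited(45, size, 2))   # 2 50
print([quadratic_probe(45, i, size) for i in range(5)])   # [45, 46, 49, 54, 61]
```

    With a step of 3 (relatively prime to 100) every slot is eventually probed; with a step of 2 only half of the table is ever reached.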

    BUCKET & CHAINING

    INDEX   RECORD
    00      10100
    01
    02
    03      33303  11103  22203
    04
    05
    06      33306  11106  22206
    07
    ..
    99
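    A minimal chaining sketch, assuming the hash function Key % 100 from the earlier slides. Python lists stand in for the linked chains, and the function names are illustrative; the keys are the ones shown on the slide.

```python
TABLE_SIZE = 100

def insert(table: list, key: int) -> None:
    """Append key to the chain stored at its hashed index."""
    table[key % TABLE_SIZE].append(key)

def search(table: list, key: int) -> bool:
    """Only the chain at the key's index needs to be scanned."""
    return key in table[key % TABLE_SIZE]

table = [[] for _ in range(TABLE_SIZE)]
for key in [10100, 33303, 11103, 22203, 33306, 11106, 22206]:
    insert(table, key)

print(table[0])    # [10100]
print(table[3])    # [33303, 11103, 22203]
print(table[6])    # [33306, 11106, 22206]
print(search(table, 22203), search(table, 44403))   # True False
```

    Colliding keys share an index instead of spilling into neighbouring slots, so a collision at 03 does not affect lookups at 04 or 05.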