Hash Table Concepts & Implementations. Sorting by theory Hash Table Concepts Implementation.

Hash Table Concepts & Implementations



Concepts: hash table & hash function

Hash table

We have seen one way to implement a dictionary.

Binary Search Trees: 3 types and 2 kinds of implementations

3 types:

• Basic BST

• AVL (balanced)

• Red-black (balanced)

2 kinds of implementations:

• Array

• Pointers

We’re done with trees. We’ll see another way to implement a dictionary.

We have a pair {key, data} and we want:

add(key, data) reassign(key, data) remove(key) lookup(key)

[Figure: an example pair {key, « True North »}]


We have a special array called a hash table.

Given a pair {key, data}, you use a function that turns the key into a cell of the array.

{ , « 我的教授是坏的 »}

{ , « moalem khoob hast »}

{ , « Just before dark »}

[Figure: the hash table, with « 我的教授是坏的 », « moalem khoob hast », « Just before dark » and « True North » each stored in a cell.]


0 1 2 3 4 5 6 7

How can you know if a key is in the hash table?

Take the key, turn it into a cell of the array, and check whether there is something in that cell.


Concepts: hash table & hash function

Hash table

In a nutshell

A hash table is an implementation of a dictionary using an array.

A hash function transforms the key of an element into its position in the hash table.

Implementation

Hash table

Content

• key

• data

BasicHashFunction

• key → position

BasicHashTable

• Array of Content

• BasicHashFunction

• Add, Lookup, Remove

Implementation

Hash table

The simplest way of building a hash function is to turn the key into a number and to use the modulo to make it fit in a small array.

Example: my key is « Vercingétorix » and my array has a size of 7. The key has 13 characters, so I compute 13 % 7 = 6 and it goes into cell 6.
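The example above can be sketched in Java as follows. This is only a minimal sketch: the class name BasicHashFunction comes from the slides, but the method name position and the choice of the key length as "the number" are assumptions made for illustration.

```java
// Hypothetical sketch of a modulo-based hash function.
// Assumption: the key is turned into a number by taking its length.
class BasicHashFunction {
    private final int tableSize;

    // The hash function is told the size of the array it must fit into.
    BasicHashFunction(int tableSize) {
        if (tableSize <= 0) {
            throw new IllegalArgumentException("table size must be positive");
        }
        this.tableSize = tableSize;
    }

    // Turn the key into a number, then fold it into the range [0, tableSize).
    int position(String key) {
        return key.length() % tableSize;
    }

    public static void main(String[] args) {
        BasicHashFunction f = new BasicHashFunction(7);
        // "Vercing\u00e9torix" is « Vercingétorix »: 13 characters, 13 % 7 = 6.
        System.out.println(f.position("Vercing\u00e9torix")); // prints 6
    }
}
```

Using the length keeps the arithmetic easy to follow, but it sends every key of the same length to the same cell, so a real hash function would mix all the characters of the key.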

Implementation

Hash table

When I create my hash function, I tell it the size of the array so it can do the modulo accordingly.

To add something, get its position and just insert it.

To know if an element is inside, check if its position is used.

To remove something, we cannot access the data of something null, so we first check if the cell is used.

Then we save the data, empty the cell and return the data.
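The three operations described above could look roughly like this. It is a sketch under assumptions, not the code from the slides: the names BasicHashTable and Content come from the class diagram, while the String types, the method signatures and the length-based hash function are hypothetical. Collisions are deliberately ignored at this stage.

```java
// Hypothetical sketch of the basic hash table, without collision handling.
class BasicHashTable {
    // One cell of the table: a key and the data stored under it.
    static final class Content {
        final String key;
        final String data;

        Content(String key, String data) {
            this.key = key;
            this.data = data;
        }
    }

    private final Content[] cells;

    BasicHashTable(int size) {
        cells = new Content[size];
    }

    // Assumed hash function: key length modulo the array size.
    private int position(String key) {
        return key.length() % cells.length;
    }

    // Add: get the position and just insert (overwrites whatever was there).
    void add(String key, String data) {
        cells[position(key)] = new Content(key, data);
    }

    // Lookup: the element is considered inside if its position is used.
    boolean lookup(String key) {
        return cells[position(key)] != null;
    }

    // Remove: check that the cell is used, save the data,
    // empty the cell and return the data (null if the cell was empty).
    String remove(String key) {
        int p = position(key);
        if (cells[p] == null) {
            return null;
        }
        String data = cells[p].data;
        cells[p] = null;
        return data;
    }
}
```

Note that lookup only checks whether the cell is used, exactly as described, so two keys sharing a position cannot be told apart yet.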

Implementation

Hash table

http://java.sun.com/j2se/1.5.0/docs/api/java/util/Hashtable.html

Like all basic structures, a hash table is already available in Java, so use it.
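As a short illustration of the library class linked above, the sketch below maps the four dictionary operations onto java.util.Hashtable. The class name HashtableDemo and the sample keys and data are made up for the example; put, remove, containsKey and get are the standard methods of the class.

```java
import java.util.Hashtable;

// Hypothetical demo class; only java.util.Hashtable itself is the real API.
class HashtableDemo {
    // Builds a small table using the dictionary operations from the slides.
    static Hashtable<Integer, String> build() {
        Hashtable<Integer, String> table = new Hashtable<>();
        table.put(7, "Busy");   // add(key, data)
        table.put(2, "Bee");    // add(key, data)
        table.put(2, "Big");    // reassign(key, data): put replaces the old data
        table.remove(7);        // remove(key)
        return table;
    }

    public static void main(String[] args) {
        Hashtable<Integer, String> table = build();
        System.out.println(table.containsKey(2)); // lookup(key): prints true
        System.out.println(table.get(2));         // prints Big
        System.out.println(table.containsKey(7)); // prints false
    }
}
```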

Now that it has been implemented, we must DEBUG!!!

Concepts

Hash table

Let’s say that our keys are words of at most 13 characters.

That’s 26¹³ possible values for the 13-letter words alone…

Hash function

Since the hash table is much smaller than that, two (or more) keys may end up with the same position. When this happens, we say that we have a collision.
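To make the size mismatch concrete, here is a small sketch that counts the 13-letter words over a 26-letter alphabet and shows two keys landing in the same cell. The class name CollisionDemo, the key « Caesar » and the length-based hash function are assumptions chosen for the illustration.

```java
// Hypothetical sketch: why collisions are unavoidable with a small table.
class CollisionDemo {
    // Number of words of exactly `length` letters over a 26-letter alphabet.
    static long wordCount(int length) {
        long count = 1;
        for (int i = 0; i < length; i++) {
            count *= 26;
        }
        return count;
    }

    // Assumed hash function: key length modulo the table size.
    static int position(String key, int tableSize) {
        return key.length() % tableSize;
    }

    // Two keys collide when they are assigned the same position.
    static boolean collide(String a, String b, int tableSize) {
        return position(a, tableSize) == position(b, tableSize);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(13)); // prints 2481152873203736576
        // 13 % 7 = 6 and 6 % 7 = 6, so both keys want cell 6.
        System.out.println(collide("Vercing\u00e9torix", "Caesar", 7)); // prints true
    }
}
```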

Concepts

Hash table

You’re taking the train and you have a ticket with your seat number.

Some people actually don’t give a damn about the tickets and they sit where they feel like. So, how do you find a seat?

Concepts

Hash table

First, you try to go where you’re supposed to be seated.

If it’s taken, try to take over the next seat.

And continue until you get a seat.

Concepts

Hash table

In a nutshell

A collision happens when two keys are assigned the same position.

Resolving a collision means that we look for a free location to be assigned to the incoming key.

Linear probing resolves a collision by a sequential search.

Implementation

Hash table

Add: we start with the seat we’re supposed to have.

As long as we don’t have a seat, we try to take over the next one.

Got a seat, so sit.

Lookup: we’re looking for somebody, so start with his ‘official’ seat.

If you find an empty seat, he should have been there, so the answer is false.

Otherwise, look who’s sitting in the next seat.
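The seat-finding steps above can be sketched as a linear-probing table. This is a hedged illustration rather than the code shown on the slides: the class name LinearProbingTable, the int keys, the wrap-around to cell 0 and the return values are all assumptions. It does not handle reassigning an existing key.

```java
// Hypothetical sketch of add and lookup with linear probing.
class LinearProbingTable {
    private final Integer[] keys;   // null means the cell is empty
    private final String[] data;

    LinearProbingTable(int size) {
        keys = new Integer[size];
        data = new String[size];
    }

    // The 'official' seat of a key.
    private int home(int key) {
        return Math.floorMod(key, keys.length);
    }

    // Add: start at the home cell and try the next one while it is taken.
    // Returns the cell finally used, or -1 if every cell is taken.
    int add(int key, String value) {
        int start = home(key);
        for (int i = 0; i < keys.length; i++) {
            int cell = (start + i) % keys.length;
            if (keys[cell] == null) {
                keys[cell] = key;
                data[cell] = value;
                return cell;
            }
        }
        return -1;
    }

    // Lookup: start at the home cell; an empty cell means the key
    // would have been placed there, so it is not in the table.
    boolean lookup(int key) {
        int start = home(key);
        for (int i = 0; i < keys.length; i++) {
            int cell = (start + i) % keys.length;
            if (keys[cell] == null) {
                return false;
            }
            if (keys[cell] == key) {
                return true;
            }
        }
        return false;
    }
}
```

With a table of size 5, adding the keys 7, 2, 1 and 36 in that order places them in cells 2, 3, 1 and 4.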

Implementation

Hash table

Hash function: f(x) = x % 5

Add: {7, « Busy »}

Add: {2, « Bee »}

Add: {1, « Big »}

Add: {36, « Bitching »}

Cell:  0  1    2     3    4
Key:   -  1    7     2    36
Data:  -  Big  Busy  Bee  Bitching

Implementation

Hash table

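Cell:  0  1    2     3    4
Key:   -  1    7     2    36
Data:  -  Big  Busy  Bee  Bitching

Lookup(2): cell 2 holds key 7, so we probe cell 3 and find key 2.

Lookup(4): cell 4 holds key 36, so we probe cell 0, which is empty: 4 is not in.

Previously, we removed an element by just setting it to null.

But if we do this, it creates problems for the probe. For example, if the element in cell 2 is removed and the cell is set to null:

Lookup(2) starts at cell 2, finds it empty and answers false.

Incorrect since 2 is in!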
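The problem can be reproduced in a few lines. The sketch below assumes the table of the example (keys 1, 7, 2 and 36 in cells 1 to 4) and assumes that the removed element is key 7 in cell 2; the class and method names are hypothetical.

```java
// Hypothetical sketch: removing by setting a cell to null breaks the probe.
class NaiveRemovalDemo {
    // Linear-probing lookup over an array of keys (null = empty cell).
    static boolean lookup(Integer[] keys, int key) {
        int start = Math.floorMod(key, keys.length);
        for (int i = 0; i < keys.length; i++) {
            Integer found = keys[(start + i) % keys.length];
            if (found == null) {
                return false; // empty cell: the probe stops here
            }
            if (found == key) {
                return true;
            }
        }
        return false;
    }

    // Table state assumed from the example: cells 0..4.
    static Integer[] exampleTable() {
        return new Integer[] {null, 1, 7, 2, 36};
    }

    public static void main(String[] args) {
        Integer[] keys = exampleTable();
        System.out.println(lookup(keys, 2)); // prints true
        keys[2] = null;                      // naive removal of key 7
        System.out.println(lookup(keys, 2)); // prints false, although 2 is in cell 3
    }
}
```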

Concepts

Hash table

How should we remove something?

We want to remove an element so that its cell becomes free, while still recording that there was an element in it, so that the lookup does not stop there.

We create a special Available value.

private static final Content available = new Content("", null);

Implementation

Hash table

Remove: we know it’s in, so search until you get it.

Then save the data, make the cell available, and return the data.

Add: we only continue if we cannot add, i.e. the position is neither empty nor available.

What should we change to lookup?

Nothing.
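Put together, removal with a special Available value might look like the sketch below. Several details are assumptions rather than the slides' code: the class name ProbingTableWithRemoval, the Integer keys, a marker whose key is null (the slides use an empty string), and a remove that returns null when the key is absent instead of assuming it is in. Reassigning an existing key is not handled.

```java
// Hypothetical sketch of linear probing with an "available" marker.
class ProbingTableWithRemoval {
    static final class Content {
        final Integer key;
        final String data;

        Content(Integer key, String data) {
            this.key = key;
            this.data = data;
        }
    }

    // Marks a cell that was used and then freed; its null key equals no real key.
    private static final Content AVAILABLE = new Content(null, null);

    private final Content[] cells;

    ProbingTableWithRemoval(int size) {
        cells = new Content[size];
    }

    private int cellAt(int key, int i) {
        return (Math.floorMod(key, cells.length) + i) % cells.length;
    }

    // Add: keep probing only while the cell is neither empty nor available.
    boolean add(int key, String data) {
        for (int i = 0; i < cells.length; i++) {
            int c = cellAt(key, i);
            if (cells[c] == null || cells[c] == AVAILABLE) {
                cells[c] = new Content(key, data);
                return true;
            }
        }
        return false; // table full
    }

    // Lookup: unchanged in spirit. Only a never-used (null) cell stops the probe;
    // an available cell simply does not match and the search moves on.
    boolean lookup(int key) {
        for (int i = 0; i < cells.length; i++) {
            Content cell = cells[cellAt(key, i)];
            if (cell == null) {
                return false;
            }
            if (Integer.valueOf(key).equals(cell.key)) {
                return true;
            }
        }
        return false;
    }

    // Remove: search until the key is found, save the data,
    // make the cell available and return the data.
    String remove(int key) {
        for (int i = 0; i < cells.length; i++) {
            int c = cellAt(key, i);
            if (cells[c] == null) {
                return null;
            }
            if (Integer.valueOf(key).equals(cells[c].key)) {
                String data = cells[c].data;
                cells[c] = AVAILABLE;
                return data;
            }
        }
        return null;
    }
}
```

After removing key 7 from the example table, a lookup of key 2 still succeeds because the probe walks over the available cell instead of stopping at it.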

Concepts

Hash table

Linear probing solves your immediate problem, but it makes things worse for the next time.

Indeed, once you have had one collision, there is a high probability that you will have more.

We see…

We see…

More probing!

Yet you’ll earn $ and be happy


Hash function: f(x) = x % 7

Add {10, Creek}

Add {3, Livingston}

Add {17, Suite}

Add {2, Braided}

Add {16, Jim}

Cell:  0  1  2        3      4           5      6
Data:  -  -  Braided  Creek  Livingston  Suite  Jim

Linear probing: 4 probes to place Jim (starting from cell 2, cells 3, 4 and 5 are taken and cell 6 is free).

Instead of a linear probing that adds i, we can try quadratic with i².


Quadratic probing: 2 probes to place Jim (cell 2 + 1² = 3 is taken, cell 2 + 2² = 6 is free).
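The two counts can be checked with a small sketch. It assumes the table state of the example just before Jim is added (cells 2, 3, 4 and 5 taken) and counts how many extra cells are tried after the home cell; the class name ProbeCountDemo and the method probes are hypothetical.

```java
// Hypothetical sketch: counting probes for linear vs quadratic probing.
class ProbeCountDemo {
    // Returns how many extra cells are tried after the home cell before a free
    // one is found, or -1 if none is found within used.length tries.
    static int probes(boolean[] used, int home, boolean quadratic) {
        int n = used.length;
        for (int i = 0; i < n; i++) {
            int step = quadratic ? i * i : i; // i-th try: home + i or home + i²
            if (!used[(home + step) % n]) {
                return i;
            }
        }
        return -1;
    }

    // Assumed state before adding {16, Jim}: cells 2 to 5 are taken.
    static boolean[] exampleTable() {
        boolean[] used = new boolean[7];
        used[2] = true; // Braided
        used[3] = true; // Creek
        used[4] = true; // Livingston
        used[5] = true; // Suite
        return used;
    }

    public static void main(String[] args) {
        boolean[] used = exampleTable();
        int home = 16 % 7; // 2
        System.out.println(probes(used, home, false)); // prints 4
        System.out.println(probes(used, home, true));  // prints 2
    }
}
```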

Concepts

Hash table

In a nutshell

Linear probing tends to create clusters: a contiguous group of elements.

Quadratic probing offers a better spacing, i.e. it reduces the likelihood that one collision is followed by many.

There are other probes…