Hashing Part Two: Static Perfect Hashing

Post on 20-Jan-2017



Advanced Algorithms – COMS31900

Hashing part two

Static Perfect Hashing

Benjamin Sach

Dictionaries and Hashing recap

• A dynamic dictionary stores (key, value)-pairs and supports:

add(key, value), lookup(key) (which returns value) and delete(key)

Universe U of u keys. Hash table T of size $m \ge n$.

n arbitrary operations arrive online, one at a time.

A hash function maps a key x to position h(x)

- i.e. T[h(x)] = (key, value).

Collisions were fixed by chaining (building linked lists)

A set H of hash functions is weakly universal if for any two keys $x, y \in U$ (with $x \ne y$),

$$\Pr\big(h(x) = h(y)\big) \le \frac{1}{m}$$

(h is picked uniformly at random from H)

Using weakly universal hashing:

For any n operations, the expected run-time is O(1) per operation.

But this doesn’t tell us much about the worst-case behaviour
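The recap above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the slides do not say which weakly universal family is used, so the sketch assumes the classic family h(x) = ((a·x + b) mod P) mod m with a prime P larger than every key, and assumes integer keys. The names `P`, `random_hash` and `ChainedDict` are hypothetical.

```python
import random

# Assumption: keys are integers in [0, P), where P is the Mersenne prime 2^31 - 1.
P = 2_147_483_647


def random_hash(m, rng=random):
    """Pick h(x) = ((a*x + b) mod P) mod m uniformly at random from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


class ChainedDict:
    """Dynamic dictionary: table T of size m, collisions resolved by chaining."""

    def __init__(self, m, rng=random):
        self.h = random_hash(m, rng)
        self.T = [[] for _ in range(m)]  # each slot holds a chain (a Python list)

    def add(self, key, value):
        chain = self.T[self.h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))

    def lookup(self, key):
        for k, v in self.T[self.h(key)]:
            if k == key:
                return v
        return None  # key is not stored

    def delete(self, key):
        i = self.h(key)
        self.T[i] = [(k, v) for k, v in self.T[i] if k != key]
```

Each operation walks one chain, so its cost is proportional to the length of that chain. This is why the O(1) bound holds only in expectation.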

Static Dictionaries and Perfect hashing

• A static dictionary stores (key, value)-pairs and supports:

lookup(key) (which returns value) - no inserts or deletes are allowed

we are given n different (key, value)-pairs and want to pick a good h

Universe U of u keys. Hash table T of size $m \ge n$.

A hash function maps a key x to position h(x)

- i.e. T[h(x)] = (key, value).

Collisions were fixed by chaining (building linked lists)

THEOREM

The FKS hashing scheme:

• Has no collisions,

• Every lookup takes O(1) worst-case time,

• Uses O(n) space,

• Can be built in O(n) expected time.

The rest of this lecture is devoted to the FKS scheme

The construction is based on weak universal hashing (with an O(1) time hash function)

Perfect hashing - a first attempt

A set H of hash functions is weakly universal if for any two keys $x, y \in U$ ($x \ne y$),

$$\Pr\big(h(x) = h(y)\big) \le \frac{1}{m}$$

where h is picked uniformly at random from H

Step 1: Insert everything into a hash table of size m = n using a weakly universal hash function (where any h(x) can be computed in O(1) time)

Step 2: Check for collisions

Step 3: Repeat if necessary

How many collisions do we get on average?

Let C be the number of collisions. Then

$$E(C) = E\Big(\sum_{x,y \in T,\, x<y} I_{x,y}\Big) = \sum_{x,y \in T,\, x<y} E(I_{x,y}) \le \sum_{x,y \in T,\, x<y} \frac{1}{m} = \binom{n}{2} \cdot \frac{1}{m} \le \frac{n^2}{2m} \le \frac{n}{2}$$

where indicator random variable $I_{x,y} = 1$ iff $h(x) = h(y)$.

The second equality is linearity of expectation, the first inequality follows from the definition of expectation, the step to $\frac{n^2}{2m}$ uses $\binom{n}{2} \le n^2/2$, and the last step uses m = n.

Linearity of Expectation

Let $Y_1, Y_2, \ldots, Y_k$ be k random variables. Then

$$E\Big(\sum_{i=1}^{k} Y_i\Big) = \sum_{i=1}^{k} E(Y_i)$$

By the definition of expectation. . .

$$E(I_{x,y}) = 1 \cdot \Pr(I_{x,y} = 1) + 0 \cdot \Pr(I_{x,y} = 0) \le \frac{1}{m}$$

$$\binom{n}{2} = \frac{n(n-1)}{2} \le \frac{n^2}{2}$$
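The collision count C can also be measured empirically. The sketch below counts colliding pairs for one hash function and averages over many random draws. It assumes integer keys and the family h(x) = ((a·x + b) mod P) mod m, which the slides do not specify; `random_hash`, `count_collisions` and `average_collisions` are hypothetical names.

```python
import random
from collections import Counter

# Assumption: keys are integers in [0, P), with P the Mersenne prime 2^31 - 1.
P = 2_147_483_647


def random_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m uniformly at random from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def count_collisions(keys, h):
    """Number of pairs x < y with h(x) = h(y).

    A slot that receives k keys contributes k*(k-1)/2 colliding pairs.
    """
    loads = Counter(h(x) for x in keys)
    return sum(k * (k - 1) // 2 for k in loads.values())


def average_collisions(keys, m, trials, rng):
    """Empirical estimate of E(C) over independently drawn hash functions."""
    total = sum(count_collisions(keys, random_hash(m, rng)) for _ in range(trials))
    return total / trials
```

For example, `average_collisions(range(100), 100, 1000, random.Random(1))` estimates E(C) for n = m = 100, which the bound above says is at most n/2 = 50.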

Perfect hashing - a second attempt

Step 1: Insert everything into a hash table of size $m = n^2$ using a weakly universal hash function

Step 2: Check for collisions

Step 3: Repeat if necessary

How many collisions do we get on average?

Let C be the number of collisions. Then

$$E(C) = E\Big(\sum_{x,y \in T,\, x<y} I_{x,y}\Big) = \sum_{x,y \in T,\, x<y} E(I_{x,y}) \le \sum_{x,y \in T,\, x<y} \frac{1}{m} = \binom{n}{2} \cdot \frac{1}{m} \le \frac{n^2}{2m} \le \frac{1}{2}$$

where indicator random variable $I_{x,y} = 1$ iff $h(x) = h(y)$.

The second equality is linearity of expectation, the first inequality follows from the definition of expectation, the step to $\frac{n^2}{2m}$ uses $\binom{n}{2} \le n^2/2$, and the last step uses $m = n^2$.

much better! (except we cheated)
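A worked instance of the bound may help. The value n = 100 is chosen purely for illustration and does not come from the slides.

```latex
% Second attempt: n = 100 keys, m = n^2 = 10000 slots
\[
E(C) \le \binom{100}{2} \cdot \frac{1}{10000} = \frac{4950}{10000} = 0.495 \le \frac{1}{2}
\]
% First attempt for comparison: n = 100 keys, m = n = 100 slots
\[
E(C) \le \binom{100}{2} \cdot \frac{1}{100} = 49.5 \le \frac{100}{2} = 50
\]
```

Squaring the table size divides the bound on the expected number of collisions by n, at the price of a table with $n^2$ slots.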

Expected construction time

Step 1: Insert everything into a hash table of size $m = n^2$ using a weakly universal hash function

Step 2: Check for collisions

Step 3: Repeat if there was a collision

How many times do we repeat on average?

The expected number of collisions: $E(C) \le \frac{1}{2}$

The probability of at least one collision: $\Pr(C \ge 1) \le \frac{1}{2}$ (by Markov’s inequality)

Markov’s inequality

If X is a non-negative r.v., then for all a > 0,

$$\Pr(X \ge a) \le \frac{E(X)}{a}.$$

The probability of zero collisions is at least $\frac{1}{2}$

i.e. at least as good as tossing a heads on a fair coin

$E(\text{runs}) \le E(\text{coin tosses to get a heads}) = 2$

$E(\text{construction time}) = O(m) \cdot E(\text{runs}) = O(m) = O(n^2)$

. . . and then the look-up time is always O(1) (because any h(x) can be computed in O(1) time)
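The three steps with a table of size $n^2$ can be sketched as a retry loop. This is an illustrative sketch under stated assumptions, not the lecturer's implementation: keys are assumed to be distinct integers, and the hash family h(x) = ((a·x + b) mod P) mod m is an assumption. The names `random_hash`, `build_collision_free` and `lookup` are hypothetical.

```python
import random

# Assumption: keys are distinct integers in [0, P), P the Mersenne prime 2^31 - 1.
P = 2_147_483_647


def random_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m uniformly at random from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def build_collision_free(pairs, rng):
    """Return (h, T, runs): T has n^2 slots and no two keys share a slot."""
    n = len(pairs)
    m = max(1, n * n)
    runs = 0
    while True:
        runs += 1
        h = random_hash(m, rng)          # Step 1: fresh random hash function
        T = [None] * m
        collided = False
        for key, value in pairs:
            slot = h(key)
            if T[slot] is not None:      # Step 2: collision detected
                collided = True
                break
            T[slot] = (key, value)
        if not collided:                 # Step 3: repeat only on a collision
            return h, T, runs


def lookup(h, T, key):
    """Worst-case O(1): one hash evaluation and one table probe."""
    entry = T[h(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None
```

The returned `runs` counter is the quantity bounded by E(runs) ≤ 2 above. Allocating the table of size m on each run is what makes a single run cost O(m).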

Expected construction time

Step 1: Insert everything into a hash table of size m = n using a weakly universal hash function

Step 2: Check for collisions

Step 3: Repeat if there are more than n collisions

This looks rubbish but it will be useful in a bit!

How many times do we repeat on average?

The expected number of collisions: $E(C) \le \frac{n}{2}$

The probability of at least n collisions: $\Pr(C \ge n) \le \frac{1}{2}$ (by Markov’s inequality, where a = n)

The probability of at most n collisions is at least $\frac{1}{2}$

i.e. at least as good as tossing a heads on a fair coin

$E(\text{runs}) \le E(\text{coin tosses to get a heads}) = 2$

$E(\text{construction time}) = O(m) \cdot E(\text{runs}) = O(m) = O(n)$

. . . but the look-up time could be rubbish (lots of collisions)
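The two probabilistic steps can be written out in full. The first line applies Markov's inequality with X = C and a = n. The second is the standard expectation of a geometric random variable, under the assumption that runs are independent and each succeeds with probability p ≥ 1/2.

```latex
\[
\Pr(C \ge n) \le \frac{E(C)}{n} \le \frac{n/2}{n} = \frac{1}{2}
\]
% If each run succeeds independently with probability p >= 1/2:
\[
E(\text{runs}) = \sum_{k \ge 1} k\,(1-p)^{k-1}\,p = \frac{1}{p} \le 2
\]
```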

Perfect hashing - attempt three

Step 1: Insert everything into a hash table, T, of size n using a weakly universal hash function, h

. . . but don’t use chaining

Let $n_i$ be the number of items in T[i]

[Figure: the main table T of size n, where each T[i] leads to a second table $T_i$ of size $n_i^2$. In the pictured example, $n_1 = 2$, $n_5 = 2$ and $n_8 = 3$.]

Step 2: The $n_i$ items in T[i] are inserted into another hash table $T_i$ of size $n_i^2$ using another weakly universal hash function denoted $h_i$ (there is one for each i)

(Step 3) Immediately repeat a step if either

a) T has more than n collisions

b) some $T_i$ has a collision

i.e. check (and if necessary rebuild) each table immediately after building it

The look-up time is always O(1)

1. Compute i = h(x) (x is the key)

2. Compute $j = h_i(x)$

3. The item is in $T_i[j]$

Two questions remain:

What is the space usage?

What is the expected construction time?
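The two-level construction and its three-step lookup can be sketched as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: keys are assumed to be distinct integers, the hash family h(x) = ((a·x + b) mod P) mod m is an assumption the slides do not confirm, and `random_hash`, `build_fks` and `lookup` are hypothetical names.

```python
import random

# Assumption: keys are distinct integers in [0, P), P the Mersenne prime 2^31 - 1.
P = 2_147_483_647


def random_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m uniformly at random from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def build_fks(pairs, rng):
    """Build the two-level structure; returns (h, secondary)."""
    n = len(pairs)
    size_T = max(1, n)
    # Step 1: top-level table of size n; rebuild while it has more than n collisions.
    while True:
        h = random_hash(size_T, rng)
        buckets = [[] for _ in range(size_T)]
        for key, value in pairs:
            buckets[h(key)].append((key, value))
        collisions = sum(len(b) * (len(b) - 1) // 2 for b in buckets)
        if collisions <= n:
            break
    # Step 2: bucket i gets its own table T_i of size n_i^2; rebuild on any collision.
    secondary = []
    for bucket in buckets:
        size = len(bucket) ** 2
        while True:
            h_i = random_hash(max(1, size), rng)
            T_i = [None] * size
            ok = True
            for key, value in bucket:
                j = h_i(key)
                if T_i[j] is not None:
                    ok = False
                    break
                T_i[j] = (key, value)
            if ok:
                break
        secondary.append((h_i, T_i))
    return h, secondary


def lookup(h, secondary, key):
    """1. i = h(x); 2. j = h_i(x); 3. the item, if present, is in T_i[j]."""
    h_i, T_i = secondary[h(key)]
    if not T_i:  # empty bucket: nothing hashed to slot i
        return None
    entry = T_i[h_i(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None
```

Each table is checked, and rebuilt if necessary, immediately after it is built, which mirrors Step 3. A lookup evaluates two hash functions and probes one slot, regardless of n.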

Perfect Hashing - Space usage

Step 1: Insert everything into a hash table, T, of size n using a weakly universal (w.u.) hash function, h

Step 2: The $n_i$ items in T[i] are inserted into another hash table $T_i$ of size $n_i^2$ using w.u. hash function $h_i$

(Step 3) Immediately repeat if either

a) T has more than n collisions

b) some $T_i$ has a collision

How much space does this use?

The size of T is O(n)

The size of $T_i$ is $O(n_i^2)$

Storing $h_i$ uses O(1) space

So the total space is. . .

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

Storing hi uses O(1) space

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) space

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i]

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .∑i

n2i4

6∑i

(ni2

)6 n

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .∑i

n2i4

6∑i

(ni2

)6 n

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .

(ni2

)=

ni(ni−1)2 >

n2i4

∑i

n2i4

6∑i

(ni2

)6 n

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .∑i

n2i4

6∑i

(ni2

)6 n

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .∑i

n2i4

6∑i

(ni2

)6 n

∑i

n2i 6 4nor

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) spacehow big is this?

How big is∑

i n2i ?

There are(ni2

)collisions in T [i] so there are

∑i

(ni2

)collisions in T

but we know that there are at most n collisions in T . . .∑i

n2i4

6∑i

(ni2

)6 n

∑i

n2i 6 4nor

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

= O(n)

Perfect Hashing - Space usage

n n2i

Step 1: Insert everything into a hash table, T , of size nusing a weakly universal (w.u.) hash function, h

Step 2: The ni items in T [i] are inserted into another hash table Ti

of size n2i using w.u hash function hi

How much space does this use?

(Step 3) Immediately repeat if eithera) T has more than n collisionsb) some Ti has a collision

T

Ti

The size of T is O(n)

The size of Ti is O(ni2)

So the total space is. . .

Storing hi uses O(1) space

O(n)+∑i

O(n2i ) = O(n)+O

∑i

n2i

= O(n)
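A small numeric check of the space bound, as a sketch only: the function `space_summary` and the example bucket sizes are made up for illustration. It counts the colliding pairs in T and the total number of cells in the second-level tables, using the identity $n_i^2 = 2\binom{n_i}{2} + n_i$.

```python
from math import comb

def space_summary(bucket_sizes):
    """Return (n, collisions in T, total cells over all second-level tables)."""
    n = sum(bucket_sizes)
    collisions = sum(comb(k, 2) for k in bucket_sizes)
    secondary_cells = sum(k * k for k in bucket_sizes)
    return n, collisions, secondary_cells

# Hypothetical outcome of Step 1 for n = 6 keys hashed into 6 buckets.
n, collisions, cells = space_summary([3, 0, 1, 2, 0, 0])
print(n, collisions, cells)  # 6 4 14
```

Here T has 4 collisions, which is at most n = 6, so Step 3 accepts it, and the second-level tables need 14 = 2·4 + 6 cells, comfortably below 4n = 24.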

Perfect Hashing - Expected construction time

Step 1: Insert everything into a hash table, T, of size n using a weakly universal (w.u.) hash function, h

Step 2: The $n_i$ items in T[i] are inserted into another hash table $T_i$ of size $n_i^2$ using w.u. hash function $h_i$

(Step 3) Immediately repeat if either
a) T has more than n collisions
b) some $T_i$ has a collision

The expected construction time for T is O(n)
(we considered this on a previous slide)

The expected construction time for each $T_i$ is $O(n_i^2)$
- we insert $n_i$ items into a table of size $m = n_i^2$
- then repeat if there was a collision
(we also considered this on a previous slide)

The overall expected construction time is therefore:

$E(\text{construction time}) = E\left(\text{construction time of } T + \sum_i \text{construction time of } T_i\right)$

$= E(\text{construction time of } T) + \sum_i E(\text{construction time of } T_i)$ (by linearity of expectation)

$= O(n) + \sum_i O(n_i^2) = O(n) + O\left(\sum_i n_i^2\right) = O(n)$

(the last step uses $\sum_i n_i^2 \le 4n$)
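The claim that each $T_i$ needs only a constant expected number of attempts can be checked empirically. The sketch below makes the same assumptions as before (integer keys below a prime `P`, and the family ((a·x + b) mod P) mod m as the weakly universal family); the function name and parameters are hypothetical. If each attempt fails with probability at most 1/2, which is the bound the earlier slides are assumed to establish for a table of size $n_i^2$, the expected number of attempts is at most 2.

```python
import random

P = (1 << 61) - 1  # prime modulus; assumption: integer keys lie in [0, P)

def attempts_until_collision_free(keys, rng):
    """Count how many hash functions are drawn before one has no
    collision on a table of size len(keys)**2."""
    m = len(keys) ** 2
    attempts = 0
    while True:
        attempts += 1
        a, b = rng.randrange(1, P), rng.randrange(P)
        slots = {((a * x + b) % P) % m for x in keys}
        if len(slots) == len(keys):  # all keys landed in distinct slots
            return attempts

rng = random.Random(0)
keys = rng.sample(range(10**9), 20)  # 20 distinct keys, table size 400
trials = [attempts_until_collision_free(keys, rng) for _ in range(2000)]
average = sum(trials) / len(trials)
print(average)
```

The printed average should stay close to or below 2; each attempt costs $O(n_i)$ time to hash the items plus $O(n_i^2)$ to allocate the table, which is consistent with the $O(n_i^2)$ expected construction time per $T_i$.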

Perfect Hashing - Summary

Step 1: Insert everything into a hash table, T, of size n using a weakly universal (w.u.) hash function, h

Step 2: The $n_i$ items in T[i] are inserted into another hash table $T_i$ of size $n_i^2$ using w.u. hash function $h_i$

(Step 3) Immediately repeat if either
a) T has more than n collisions
b) some $T_i$ has a collision

THEOREM

The FKS hashing scheme:

• Has no collisions,

• Every lookup takes O(1) worst-case time,

• Uses O(n) space,

• Can be built in O(n) expected time.

The look-up time is always O(1)

1. Compute i = h(x) (x is the key)
2. Compute j = $h_i(x)$
3. The item is in $T_i[j]$