PowerPoint Presentation - Topology Based Methods in Shape...
![Page 1: PowerPoint Presentation - Topology Based Methods in Shape ...web.cse.ohio-state.edu/~wang.1016/courses/2331/cse2331-lec8.pdf · Simple Uniform Hashing Simple uniform hashing assumption:](https://reader034.fdocuments.in/reader034/viewer/2022050503/5f954420ceec086e136db391/html5/thumbnails/1.jpg)
CSE 2331/5331
Topic 8: Hash Tables
Dictionary Operations
Given a universe of elements U
Need to store some keys
Need to perform the following operations on keys:
  Insert
  Search
  Delete
Let’s call this a dictionary.
First Try:
Use an array T of the size of the universe.
Each operation: O(1).
Not great if the universe size is much larger than the number of keys ever needed.
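The "first try" can be sketched in a few lines of Python. This is only an illustration under assumptions: keys are taken to be integers in the range 0 to universe_size - 1, and the class and method names are hypothetical, not from the slides.

```python
class DirectAddressTable:
    """Dictionary over the universe U = {0, 1, ..., universe_size - 1}."""

    def __init__(self, universe_size):
        # One slot per possible key, so memory grows with |U|,
        # not with the number of keys actually stored.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value      # O(1)

    def search(self, key):
        return self.slots[key]       # O(1); None means "not present"

    def delete(self, key):
        self.slots[key] = None       # O(1)


table = DirectAddressTable(universe_size=10)
table.insert(3, "three")
print(table.search(3))  # three
table.delete(3)
print(table.search(3))  # None
```

Every operation is a single array access, but the array has one slot per element of the universe, which is the drawback the slide points out.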
Hash Table
U: universe
T[0 … m-1]: a hash table of size m, with $m \ll |U|$
Hash function $h: U \to \{0, 1, \ldots, m-1\}$
$h(k)$ is called the hash value of key k.
Given a key k, we will store it in location h(k) of hash table T.
Collisions
Since the size of the hash table is smaller than the universe, multiple keys may hash to the same slot.
How to handle collisions?
  Chaining
  Open addressing
Collision Resolved by Chaining
T[j]: a pointer to the head of the linked list of all stored elements that hash to j, or NIL if there are no such elements
Dictionary Operations
Chained-Hash-Insert(T, x): insert x at the head of list T[h(key(x))]. Time: O(1)
Chained-Hash-Search(T, k): search for an element with key k in list T[h(k)]. Time: O(length(T[h(k)]))
Chained-Hash-Delete(T, x): delete x from the list T[h(key(x))]. Time: O(length(T[h(key(x))]))
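A minimal Python sketch of the three chained operations follows. Several details are assumptions made for illustration: the class names are hypothetical, Python's built-in `hash` reduced mod m stands in for h, lists are singly linked, and delete takes a key rather than a pointer to the element x.

```python
class Node:
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [None] * m  # table[j]: head of chain j, or None (NIL)

    def _h(self, key):
        # Assumption: built-in hash reduced mod m plays the role of h.
        return hash(key) % self.m

    def insert(self, key, value):
        # O(1): push at the head of the chain (assumes key is not yet stored).
        j = self._h(key)
        self.table[j] = Node(key, value, self.table[j])

    def search(self, key):
        # Time proportional to the length of chain h(key).
        node = self.table[self._h(key)]
        while node is not None:
            if node.key == key:
                return node
            node = node.next
        return None

    def delete(self, key):
        # Walk the chain, remembering the predecessor so we can unlink.
        j = self._h(key)
        prev, node = None, self.table[j]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.table[j] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False


t = ChainedHashTable(m=4)
for k in (1, 5, 9):          # small ints hash to themselves; all are 1 mod 4
    t.insert(k, str(k))
print(t.search(5).value)     # 5
print(t.delete(5), t.search(5))  # True None
```

The three keys in the usage example collide in slot 1, so search and delete walk a chain of length three while insert stays constant time.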
Average-case Analysis
n: number of elements in the table
m: size of the table (number of slots)
Load factor: $\alpha = n/m$, the average number of elements per linked list
Intuitively, $\alpha$ is the optimal time needed for an operation.
An individual operation can still be slow (O(n) time).
Under a certain assumption on the distribution of keys, we analyze the expected performance.
Simple Uniform Hashing
Simple uniform hashing assumption: any given element is equally likely to hash into any of the m slots in T.
Let $n_j$ be the length of list T[j], so $n = n_0 + n_1 + \cdots + n_{m-1}$.
Under the simple uniform hashing assumption, the expected value is $E[n_j] = \alpha = n/m$.
Why?
Let $k_1, k_2, \ldots, k_n$ be the set of keys.
Goal: estimate $E[n_j]$.
Let $X_i = 1$ if $h(k_i) = j$, and $X_i = 0$ otherwise.
Note: $n_j = \sum_{i=1}^{n} X_i$, and $E[X_i] = \Pr[h(k_i) = j] \cdot 1 = 1/m$.
Hence
$$E[n_j] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{1}{m} = \frac{n}{m}$$
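The identity E[n_j] = n/m can be checked empirically. The sketch below is a simulation, not part of the proof: it models simple uniform hashing by sending each key to a uniformly random slot, and the function names, parameter values, and seed are arbitrary choices made for illustration.

```python
import random


def chain_lengths(n, m, rng):
    """Send n keys to m slots uniformly at random; return the lengths n_j."""
    counts = [0] * m
    for _ in range(n):
        counts[rng.randrange(m)] += 1
    return counts


def average_length_of_slot(j, n, m, trials, seed=0):
    """Average n_j over many independent trials (an estimate of E[n_j])."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += chain_lengths(n, m, rng)[j]
    return total / trials


n, m = 200, 20
estimate = average_length_of_slot(j=0, n=n, m=m, trials=500)
print(f"alpha = {n / m}, empirical average of n_0 = {estimate:.3f}")
```

With these parameters the printed average should land close to alpha = 10, although any single trial can deviate noticeably from it.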
Search Complexity – Case 1
If the search is unsuccessful:
Based on simple uniform hashing, a new key k is equally likely to hash to any slot, so
$$E[n_{h(k)}] = \sum_{j=0}^{m-1} \frac{1}{m} E[n_j] = \alpha = \frac{n}{m}$$
Expected search time: $\Theta(1 + \alpha)$
Search Complexity – Case 2
If the search for k is successful:
Note that the simple uniform hashing assumption does not necessarily imply that k is equally likely to be in any slot.
Assume: k is equally likely to be any of the n elements already stored in the hash table.
Theorem: Under the simple uniform hashing assumption, the search procedure takes $\Theta(1 + \alpha)$ expected time when using collision resolution by chaining.
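The slides state the theorem without proof. One standard way to obtain the successful-search bound is sketched here, under the assumptions that keys $k_1, \ldots, k_n$ were inserted in this order, each at the head of its list, and that the indicator $Y_{i\ell}$ (a symbol introduced only for this sketch) equals 1 when $h(k_i) = h(k_\ell)$, so that $E[Y_{i\ell}] = 1/m$. The elements examined when searching for $k_i$ are $k_i$ itself plus the later-inserted keys in the same list, so the expected number of elements examined is

$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(1 + \sum_{\ell=i+1}^{n} Y_{i\ell}\right)\right] = 1 + \frac{1}{nm}\sum_{i=1}^{n}(n - i) = 1 + \frac{n-1}{2m} = 1 + \frac{\alpha}{2} - \frac{\alpha}{2n}$$

Adding the constant time to compute the hash value gives $\Theta(1 + \alpha)$, matching the unsuccessful case.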
Hash Functions
Ideally, a hash function satisfies the assumption of simple uniform hashing.
This is hard to achieve without knowledge of the distribution from which keys are drawn.
We give a few heuristic examples.
Division Method
$h(k) = k \bmod m$, e.g., $h(k) = k \bmod 701$
The choice of m is important.
A power of 2 is not very good: h(k) depends only on a few least significant bits, and the higher bits are not used.
A good choice is a prime number not too close to an exact power of 2.
Related: $h(k) = (k\,p) \bmod m$
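The warning about powers of 2 is easy to see on a small example. In this sketch the function names and the three sample keys are made up for illustration; the keys are chosen to differ only in their high-order bits.

```python
def division_hash(k, m=701):
    """Division method: h(k) = k mod m, here with the prime m = 701."""
    return k % m


def power_of_two_hash(k, p=4):
    """Same method with m = 2**p: only the p lowest bits of k survive."""
    return k % (2 ** p)


# Three keys that share their low 4 bits and differ only in higher bits.
keys = [0b10110000, 0b01010000, 0b11110000]   # 176, 80, 240

print([power_of_two_hash(k) for k in keys])   # [0, 0, 0]
print([division_hash(k) for k in keys])       # [176, 80, 240]
```

With m = 16 all three keys collide in slot 0, while the prime modulus keeps them apart.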
Multiplication Method
Choose some $0 < A < 1$
$h(k) = \lfloor m \, (kA \bmod 1) \rfloor$
Slower than the division method, but the choice of m is not so critical.
One reasonable choice of A: $A \approx (\sqrt{5} - 1)/2 \approx 0.6180339887\ldots$
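A minimal floating-point sketch of the multiplication method, assuming non-negative integer keys. The function name is hypothetical, and practical implementations often use fixed-point integer arithmetic instead of floats; this version only mirrors the formula.

```python
import math

# The constant suggested on the slide: (sqrt(5) - 1) / 2 = 0.618...
A = (math.sqrt(5) - 1) / 2


def multiplication_hash(k, m, a=A):
    """h(k) = floor(m * (k*a mod 1)) for a non-negative integer key k."""
    fractional_part = (k * a) % 1.0      # "k A mod 1": lies in [0, 1)
    return int(m * fractional_part)      # floor, since the product is >= 0


m = 2 ** 14   # a power of 2 is acceptable here, unlike in the division method
for k in (123456, 123457, 123458):
    print(k, multiplication_hash(k, m))
```

Consecutive keys are scattered across the table because multiplying by A mixes all bits of k before the fractional part is taken.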
Open-address Hashing
All keys are stored in the table itself; no extra pointers.
Each slot holds either a key or NIL.
To hash a key k, in the i-th iteration compute h(k, i):
  If slot h(k, i) is taken (not NIL), go to the next iteration.
  If slot h(k, i) is free, store k in this slot and terminate.
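The insertion loop described above, together with a matching search, might look as follows in Python. This is a sketch under assumptions: the probe function is linear probing with h'(k) = k mod m, chosen only as a placeholder, the function names are hypothetical, keys are non-negative integers, and None plays the role of NIL. Deletion is not handled.

```python
def probe(k, i, m):
    # Assumption: linear probing on top of h'(k) = k mod m.
    return (k % m + i) % m


def hash_insert(T, k):
    """Try slots h(k, 0), h(k, 1), ... until a free one is found."""
    m = len(T)
    for i in range(m):
        j = probe(k, i, m)
        if T[j] is None:         # free slot: store k and stop
            T[j] = k
            return j
    raise OverflowError("hash table overflow")


def hash_search(T, k):
    """Follow the same probe sequence; a NIL slot means k is absent."""
    m = len(T)
    for i in range(m):
        j = probe(k, i, m)
        if T[j] is None:
            return None
        if T[j] == k:
            return j
    return None


T = [None] * 7
for key in (10, 17, 24):         # all three are 3 mod 7
    hash_insert(T, key)
print(T)                         # [None, None, None, 10, 17, 24, None]
print(hash_search(T, 24), hash_search(T, 31))   # 5 None
```

Search must retrace exactly the sequence insert would have followed, which is why it can stop at the first empty slot.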
Pseudo-code
Search
Re-hashing Functions
$h: U \times \{0, 1, \ldots, m-1\} \to \{0, 1, \ldots, m-1\}$
The second argument is the probe number; the result is the slot hashed to.
Possible choices:
Linear probing: $h(k, i) = (h'(k) + i) \bmod m$. Tends to cause primary clustering.
Quadratic probing: $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$. Secondary clustering.
Double hashing: $h(k, i) = (h_1(k) + i\, h_2(k)) \bmod m$
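The three probe sequences can be compared side by side. Everything concrete in this sketch is an assumption made for illustration: h'(k) = h1(k) = k mod m, h2(k) = 1 + (k mod (m - 1)), the constants c1 = 1 and c2 = 3, the prime table size m = 13, and the function names.

```python
def linear_probe(k, i, m):
    return (k % m + i) % m


def quadratic_probe(k, i, m, c1=1, c2=3):
    return (k % m + c1 * i + c2 * i * i) % m


def double_hash_probe(k, i, m):
    h1 = k % m
    h2 = 1 + (k % (m - 1))       # never 0, so the sequence always moves
    return (h1 + i * h2) % m


m = 13
for k in (14, 27):               # both keys have h'(k) = 1
    print(k, "linear:   ", [linear_probe(k, i, m) for i in range(5)])
    print(k, "quadratic:", [quadratic_probe(k, i, m) for i in range(5)])
    print(k, "double:   ", [double_hash_probe(k, i, m) for i in range(5)])
```

Keys 14 and 27 start in the same slot, so they share their entire linear and quadratic probe sequences; under double hashing their step sizes differ (3 and 4), so the sequences separate after the first probe.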
Remarks
Advantages: no pointers, no memory allocation during the course of operations
But:
  Load factor $\alpha = n/m < 1$
  Need a resizing strategy when n would exceed m
Summary
Hash Table
Very practical data structure for dictionary operations
Especially when the number of keys needed is much smaller than the size of the universe
Need to choose hash functions properly
There exist more intelligent hashing schemes