Sets, maps and hash tables (Java Collections)
Uploaded by fulvio-corno
![Page 1: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/1.jpg)
Sets, maps and hash tables
![Page 2: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/2.jpg)
Long way to Ruzzle…
A.A. 2012/2013 Tecniche di programmazione 2
![Page 3: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/3.jpg)
Sets
Collection that cannot contain duplicate elements.
![Page 4: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/4.jpg)
Collection Family Tree
![Page 5: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/5.jpg)
ArrayList vs. LinkedList
| Operation | ArrayList | LinkedList |
|---|---|---|
| add(element) | IMMEDIATE | IMMEDIATE |
| remove(object) | SLUGGISH | IMMEDIATE once the element is found (finding it is SLUGGISH) |
| get(index) | IMMEDIATE | SLUGGISH |
| set(index, element) | IMMEDIATE | SLUGGISH |
| add(index, element) | SLUGGISH | SLUGGISH |
| remove(index) | SLUGGISH | SLUGGISH |
| contains(object) | SLUGGISH | SLUGGISH |
| indexOf(object) | SLUGGISH | SLUGGISH |
![Page 6: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/6.jpg)
Collection Family Tree
![Page 7: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/7.jpg)
Set interface
Add/remove elements
boolean add(element)
boolean remove(object)
Search
boolean contains(object)
No positional Access!
![Page 8: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/8.jpg)
Lists vs. Sets
| Operation | ArrayList | LinkedList | Set |
|---|---|---|---|
| add(element) | O(1) | O(1) | O(1) |
| remove(object) | O(n) + O(n) | O(n) + O(1) | O(1) |
| get(index) | O(1) | O(n) | n.a. |
| set(index, elem) | O(1) | O(n) + O(1) | n.a. |
| add(index, elem) | O(1) + O(n) | O(n) + O(1) | n.a. |
| remove(index) | O(n) | O(n) + O(1) | n.a. |
| contains(object) | O(n) | O(n) | O(1) |
| indexOf(object) | O(n) | O(n) | n.a. |

n.a. = not applicable (sets have no positional access)
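The difference between the list and set columns can be tried out with a minimal sketch like the one below. The class name `ListVsSetDemo` and the helper `sizeAfterDuplicateAdd` are illustrative names, not part of the slides; the set used is a `HashSet`, which the deck introduces later.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListVsSetDemo {
    // Adds the same word twice and reports how many elements the collection holds.
    static int sizeAfterDuplicateAdd(Collection<String> c) {
        c.add("ruzzle");
        c.add("ruzzle");
        return c.size();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        Set<String> set = new HashSet<>();
        System.out.println(sizeAfterDuplicateAdd(list)); // 2: a list keeps duplicates
        System.out.println(sizeAfterDuplicateAdd(set));  // 1: a set rejects the second add
        System.out.println(list.contains("ruzzle"));     // true, found by a linear scan
        System.out.println(set.contains("ruzzle"));      // true, found by a hash lookup
    }
}
```

Both `contains` calls give the same answer; only the cost differs, as the table states.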
![Page 9: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/9.jpg)
Hash Tables
A data structure implementing an associative array
![Page 10: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/10.jpg)
Notation
A set stores keys
U – Universe of all possible keys
K – Set of keys actually stored
$K \subseteq U$
![Page 11: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/11.jpg)
Hash Table
Devise a function to transform each key into an index
Use an array
[Figure: the hash function h(∙) maps each key k in K to the slot h(k) of the array T[0..m–1]]
![Page 12: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/12.jpg)
Hash Function
Mapping from U to the slots of a hash table T[0…m–1]
$h : U \to \{0, 1, \dots, m-1\}$
h(k) is the “hash value” of key k
“Any key should be equally likely to hash into any of the m slots, independent of where any other key hashes to” (Simple uniform hashing)
Compression/expansion
$h_N : U \to \mathbb{N}^+$
$h(k) = h_N(k) \bmod m$
$h_R : U \to [0, 1) \subset \mathbb{R}$
$h(k) = \lfloor h_R(k) \cdot m \rfloor$
![Page 13: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/13.jpg)
Hash Function - Complexity
Usually, h(k) = O(length(k))
length(k) ≪ N ⇒ h(k) = O(1)
![Page 14: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/14.jpg)
A simple hash function
$h : A \to \mathbb{N}^+ \to [0, m-1]$
Split the key into its “components”, then sum their integer representations:
$h_N(k) = h_N(x_0 x_1 x_2 \dots x_n) = \sum_{i=0}^{n} x_i$
$h(k) = h_N(k) \bmod m$
![Page 15: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/15.jpg)
A simple hash (problems)
Problems
hN(“NOTE”) = 78+79+84+69 = 310
hN(“TONE”) = 310
hN(“STOP”) = 83+84+79+80 = 326
hN(“SPOT”) = 326
Problems (m = 173)
h(74,778) = 42
h(16,823) = 42
h(1,611,883) = 42
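The collisions listed above can be reproduced with a short sketch. The class and method names (`SimpleHashDemo`, `sumHash`, `compress`) are illustrative; the sketch assumes the components of a key are its character codes and that the compression step is the remainder modulo m.

```java
class SimpleHashDemo {
    // h_N: sum of the character codes of the key.
    static int sumHash(String key) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i);
        }
        return sum;
    }

    // Compression by remainder: maps a non-negative value to a slot in [0, m-1].
    static int compress(int value, int m) {
        return value % m;
    }

    public static void main(String[] args) {
        // Anagrams always collide, because addition ignores the order of the characters.
        System.out.println(sumHash("NOTE") + " " + sumHash("TONE")); // 310 310
        System.out.println(sumHash("STOP") + " " + sumHash("SPOT")); // 326 326

        // Unrelated numeric keys can also land in the same slot.
        int m = 173;
        System.out.println(compress(74_778, m));    // 42
        System.out.println(compress(16_823, m));    // 42
        System.out.println(compress(1_611_883, m)); // 42
    }
}
```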
![Page 16: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/16.jpg)
Collisions
[Figure: the actual keys k1…k5 in K (a subset of U, the universe of keys) are mapped to slots 0…m–1 of the table; h(k2) = h(k5) is a collision]
![Page 17: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/17.jpg)
Collisions
Collisions are possible!
Multiple keys can hash to the same slot
Design hash functions such that collisions are minimized
But avoiding collisions is impossible.
Design collision-resolution techniques
Search will cost Θ(n) time in the worst case
However, all operations can be made to have an expected complexity of Θ(1).
![Page 18: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/18.jpg)
Hash functions
Simple uniform hashing
Hash value should be independent of any patterns that
might exist in the data
No funneling
![Page 19: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/19.jpg)
Natural numbers
A hash function may assume that the keys are natural numbers
When they are not, they have to be “interpreted” as natural numbers
![Page 20: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/20.jpg)
Natural numbers hashing
Division Method (compression)
h(k) = k mod m
Pros
Fast, since it requires just one division operation
Cons
Have to avoid certain values of m
Good choice for m (recipe)
Prime
Not “too close” to powers of 2
Not “too close” to powers of 10
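Why the value of m matters can be seen with a small experiment. The sketch below is only an illustration under stated assumptions: the keys are the hypothetical sequence 0, 8, 16, …, 792 (all multiples of 8), and the names `DivisionMethodDemo` and `distinctSlots` are not from the slides.

```java
import java.util.HashSet;
import java.util.Set;

class DivisionMethodDemo {
    // Division method: h(k) = k mod m, for non-negative k.
    static int hash(int k, int m) {
        return k % m;
    }

    // Counts how many different slots are hit by the keys 0, 8, 16, ..., 792.
    static int distinctSlots(int m) {
        Set<Integer> used = new HashSet<>();
        for (int i = 0; i < 100; i++) {
            used.add(hash(8 * i, m));
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctSlots(16)); // 2: only slots 0 and 8 are ever used
        System.out.println(distinctSlots(17)); // 17: a prime m spreads the same keys over every slot
    }
}
```

With m a power of 2, the hash value depends only on the low-order bits of the key, so any regularity in those bits funnels keys into a few slots.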
![Page 21: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/21.jpg)
Natural numbers hashing
Multiplication Method I
$h_R(k) = \langle k \cdot A \rangle = k \cdot A - \lfloor k \cdot A \rfloor$
$h(k) = \lfloor m \cdot h_R(k) \rfloor$
Pros
Value of m is not critical (typically $m = 2^p$)
Cons
Value of A is critical
Good choice for A (Donald Knuth)
$A = \frac{\sqrt{5} - 1}{2} \approx 0.618$
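A minimal floating-point sketch of Multiplication Method I follows. The class name `MultiplicationHashDemo` is illustrative, and the use of `double` arithmetic is an assumption made for readability: it loses precision for very large keys, so this is a demonstration rather than a production hash.

```java
class MultiplicationHashDemo {
    // Knuth's suggested constant A = (sqrt(5) - 1) / 2.
    static final double A = (Math.sqrt(5.0) - 1.0) / 2.0;

    // h(k) = floor(m * frac(k * A)), for a non-negative key k of moderate size.
    static int hash(long k, int m) {
        double product = k * A;
        double fractional = product - Math.floor(product); // <k * A>, a value in [0, 1)
        return (int) (m * fractional);
    }

    public static void main(String[] args) {
        int m = 1024; // a power of 2, as the slide suggests
        for (long k = 1; k <= 5; k++) {
            System.out.println(k + " -> " + hash(k, m));
        }
    }
}
```

Consecutive keys end up far apart in the table, which is the point of choosing an irrational multiplier.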
![Page 22: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/22.jpg)
Natural numbers hashing
Multiplication Method II
h(k) = k ∙ 2,654,435,761
Pros
Works well for addresses
Caveat (Donald Knuth)
$2{,}654{,}435{,}761 \approx 2^{32} \cdot \frac{\sqrt{5} - 1}{2}$
![Page 23: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/23.jpg)
Resolution of collisions
Open Addressing
When collisions occur, use a systematic (consistent) procedure to store elements in free slots of the table
“Double hashing”, “linear probing”, …
Chaining
Store all elements that hash to the same slot in a linked list
[Figure: table slots 0…m–1, each pointing to a linked list of the keys that collide there: {k1, k4}, {k2, k5, k6}, {k3, k7}, {k8}]
![Page 24: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/24.jpg)
Chaining
[Figure: the actual keys K (a subset of U, the universe of keys) are mapped to table slots 0…m–1, with h(k1) = h(k4), h(k2) = h(k5) = h(k6), h(k3) = h(k7), and h(k8) alone in its slot; unused slots are marked X]
![Page 25: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/25.jpg)
Chaining
[Figure: the same table, with the keys of each occupied slot stored in a linked list: {k1, k4}, {k2, k5, k6}, {k3, k7}, {k8}]
![Page 26: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/26.jpg)
Chaining (analysis)
Load factor α = n/m = average keys per slot
n – number of elements stored in the hash table
m – number of slots
![Page 27: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/27.jpg)
Chaining (analysis)
Worst-case complexity:
Θ(n) (+ time to compute h(k))
![Page 28: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/28.jpg)
Chaining (analysis)
Average depends on how h(∙) distributes keys among the m slots
Let us assume
Any key is equally likely to hash into any of the m slots
h(k) = O(1)
Expected length of a linked list = load factor = α = n/m
Search(x) = O(α) + O(1) ≈ O(1), as long as α is bounded by a constant
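Chaining and the load factor can be put together in a minimal sketch. `ChainedHashSet` is a hypothetical teaching class, not the JCF implementation: it stores non-null strings only, never resizes, and reuses `String.hashCode()` compressed with a remainder.

```java
import java.util.LinkedList;

class ChainedHashSet {
    private final LinkedList<String>[] table; // one chain per slot
    private int size = 0;                     // n, the number of stored keys

    @SuppressWarnings("unchecked")
    ChainedHashSet(int m) {
        table = new LinkedList[m];
        for (int i = 0; i < m; i++) {
            table[i] = new LinkedList<>();
        }
    }

    // Compression step: floorMod keeps the slot in [0, m-1] even for negative hash codes.
    private int slot(String key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    boolean add(String key) {
        LinkedList<String> chain = table[slot(key)];
        if (chain.contains(key)) {
            return false; // already present: sets hold no duplicates
        }
        chain.add(key);
        size++;
        return true;
    }

    boolean contains(String key) {
        return table[slot(key)].contains(key); // scans one chain only, O(alpha) expected
    }

    boolean remove(String key) {
        boolean removed = table[slot(key)].remove(key);
        if (removed) {
            size--;
        }
        return removed;
    }

    // alpha = n / m, the average number of keys per slot.
    double loadFactor() {
        return (double) size / table.length;
    }
}
```

Every operation touches a single chain, so its cost is the time to compute the hash plus the length of that chain.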
![Page 29: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/29.jpg)
A note on iterators
Collection extends Iterable
An Iterator is an object that enables you to traverse a collection (and to remove elements from the collection selectively)
You get an Iterator for a collection by calling its iterator() method
public interface Iterator<E> {
    boolean hasNext();
    E next();
    void remove(); // optional
}
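Selective removal during traversal might look like the following sketch. The class `IteratorDemo` and the method `removeShortWords` are illustrative names; the sketch relies on `HashSet` iterators supporting the optional `remove()` operation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class IteratorDemo {
    // Removes every word shorter than minLength while iterating over the set.
    static void removeShortWords(Set<String> words, int minLength) {
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String word = it.next();
            if (word.length() < minLength) {
                it.remove(); // removes the element last returned by next()
            }
        }
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("a", "note", "to", "stop"));
        removeShortWords(words, 3);
        System.out.println(words.size()); // 2: only "note" and "stop" remain
    }
}
```

Removing through the iterator is the safe way to delete while traversing; calling `words.remove(...)` inside the loop would typically cause a `ConcurrentModificationException`.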
![Page 30: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/30.jpg)
Collection Family Tree
![Page 31: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/31.jpg)
HashSet
Add/remove elements
boolean add(element)
boolean remove(object)
Search
boolean contains(object)
No positional Access
Unpredictable iteration order!
![Page 32: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/32.jpg)
Collection Family Tree
![Page 33: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/33.jpg)
LinkedHashSet
Add/remove elements
boolean add(element)
boolean remove(object)
Search
boolean contains(object)
No positional Access
Predictable iteration order
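The difference in iteration order between the two set classes can be observed with a sketch like this one. `SetOrderDemo` and `iterationOrder` are illustrative names; the sample words are arbitrary.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetOrderDemo {
    // Adds the words in the given order and returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> set, String... words) {
        for (String w : words) {
            set.add(w);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        String[] words = {"stop", "note", "tone", "spot", "note"};
        // LinkedHashSet iterates in insertion order; the duplicate "note" is ignored.
        System.out.println(iterationOrder(new LinkedHashSet<>(), words)); // [stop, note, tone, spot]
        // HashSet holds the same four elements, but their order is unspecified.
        System.out.println(iterationOrder(new HashSet<>(), words));
    }
}
```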
![Page 34: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/34.jpg)
![Page 35: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/35.jpg)
Constructors
public HashSet()
public HashSet(Collection<? extends E> c)
public HashSet(int initialCapacity)
public HashSet(int initialCapacity, float loadFactor)
Default load factor: 0.75 (75%)
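The four constructors can be exercised as in the sketch below. The class name `HashSetConstructorsDemo`, the helper `distinctCount`, and the capacity and load-factor values are arbitrary choices for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetConstructorsDemo {
    // Copy constructor: duplicates in the source collection are dropped.
    static int distinctCount(List<String> words) {
        return new HashSet<>(words).size();
    }

    public static void main(String[] args) {
        Set<String> empty = new HashSet<>();           // default capacity and load factor
        Set<String> presized = new HashSet<>(1024);    // room for many elements before the first resize
        Set<String> tuned = new HashSet<>(1024, 0.5f); // resizes when half full: fewer collisions, more memory
        System.out.println(empty.size() + " " + presized.size() + " " + tuned.size()); // 0 0 0
        System.out.println(distinctCount(Arrays.asList("note", "tone", "note")));      // 2
    }
}
```

A lower load factor trades memory for shorter chains; a higher one does the opposite.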
![Page 36: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/36.jpg)
JCF’s HashSet
Built-in hash function
Dynamic hash table resize
Smoothly handles collisions (chaining)
Θ(1) operations (well, usually)
Take it easy!
![Page 37: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/37.jpg)
Default hash function in Java
In Java every class provides a hashCode() method (inherited from Object unless overridden), which digests the data stored in an instance of the class into a single 32-bit value
In Java 1.2, Joshua Bloch implemented the java.lang.String hashCode() using a product sum over the entire text of the string
$h(s) = \sum_{i=0}^{n-1} s[i] \cdot 31^{\,n-1-i}$
But the basic Object’s hashCode() is typically implemented by converting the internal address of the object into an integer!
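The product sum can be checked against the library with a short sketch. `StringHashDemo` and `polynomialHash` are illustrative names; the loop evaluates the polynomial with Horner's rule in 32-bit `int` arithmetic, where overflow simply wraps around.

```java
class StringHashDemo {
    // Computes s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] with wrap-around int arithmetic.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String[] samples = {"", "NOTE", "TONE", "Tecniche di programmazione"};
        for (String s : samples) {
            // The two columns are expected to match for every string.
            System.out.println(polynomialHash(s) + " " + s.hashCode());
        }
    }
}
```

Unlike the plain sum of character codes, this function gives "NOTE" and "TONE" different values, because each position is weighted by a different power of 31.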
![Page 38: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/38.jpg)
Understanding hash in Java
public class MyData {
    public String name;
    public String surname;
    int age;
}
![Page 39: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/39.jpg)
Understanding hash in Java
MyData foo = new MyData();
MyData bar = new MyData();
if (foo.hashCode() == bar.hashCode()) {
    System.out.println("FLICK");
} else {
    System.out.println("FLOCK");
}
![Page 40: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/40.jpg)
Understanding hash in Java
MyData foo = new MyData();
MyData bar = new MyData();
foo.name = "Stephane";
foo.surname = "Hessel";
foo.age = 95;
bar.name = "Stephane";
bar.surname = "Hessel";
bar.age = 95;
if (foo.hashCode() == bar.hashCode()) {
    System.out.println("FLICK");
} else {
    System.out.println("FLOCK");
}
![Page 41: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/41.jpg)
Default hash function in Java
If two objects are equal according to the equals() method, then hashCode() must produce the same result
If two objects are not equal according to the equals() method, performance is better if hashCode() produces different results
public boolean equals(Object obj);
public int hashCode();
![Page 42: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/42.jpg)
Hash functions in Java
hashCode() and equals() should
always be defined together
public boolean equals(Object obj);
public int hashCode();
![Page 43: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/43.jpg)
public class MyData
    public String name;
    public String surname;
    int age;
    public MyData() { }
    public MyData(String n, String s, int a) {
        name = n;
        surname = s;
        age = a;
    }
    […]
![Page 44: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/44.jpg)
![Page 45: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/45.jpg)
public class MyData
    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true; // same reference: quite obvious ;-)
        }
        if (!(obj instanceof MyData)) {
            return false; // null or another type: not even comparable
        }
        // the real check! Objects.equals (from java.util.Objects) also copes with null fields
        MyData other = (MyData) obj;
        return Objects.equals(name, other.name)
                && Objects.equals(surname, other.surname);
    }
    […]
The @Override annotation tells the compiler that the method is meant to override a superclass method; compilation fails if no override actually occurs
![Page 46: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/46.jpg)
public class MyData
    @Override
    public int hashCode() {
        String tmp = name + ":" + surname;
        return tmp.hashCode();
    }
tmp will be “null:null” if MyData has not been initialized
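Putting equals() and hashCode() together, a self-contained variant of the class might look like the sketch below. It is an assumption-laden illustration, not the slides' code: the wrapper class `MyDataDemo` and the method `distinctPeople` are hypothetical, the fields are made final, and `Objects.hash` replaces the string concatenation. As in the slides, age does not take part in equality.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MyDataDemo {
    static final class MyData {
        final String name;
        final String surname;
        final int age;

        MyData(String n, String s, int a) {
            name = n;
            surname = s;
            age = a;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) {
                return true;
            }
            if (!(obj instanceof MyData)) {
                return false;
            }
            MyData other = (MyData) obj;
            return Objects.equals(name, other.name)
                    && Objects.equals(surname, other.surname);
        }

        @Override
        public int hashCode() {
            // Uses exactly the fields compared in equals(), so equal objects share a hash code.
            return Objects.hash(name, surname);
        }
    }

    // Adds two equal objects to a HashSet and reports how many it keeps.
    static int distinctPeople() {
        Set<MyData> people = new HashSet<>();
        people.add(new MyData("Stephane", "Hessel", 95));
        people.add(new MyData("Stephane", "Hessel", 95));
        return people.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPeople()); // 1: the second object is recognized as a duplicate
    }
}
```

Without the two overrides the set would keep both objects, because Object's default methods distinguish instances rather than contents.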
![Page 47: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/47.jpg)
Implementing your own hash functions
Grab your hash function from a professional
![Page 48: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/48.jpg)
Trivial Hash Function
This hash function helps create predictable collisions (e.g., “ape” and “pea”)
public long TrivialHash(String str)
{
    long hash = 0;
    for(int i = 0; i < str.length(); i++)
    {
        hash = hash + str.charAt(i);
    }
    return hash;
}
![Page 49: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/49.jpg)
BKDR Hash Function
This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds, which all follow the pattern 31...31...31 etc. It seems to be very similar to the DJB hash function.
public long BKDRHash(String str)
{
    long seed = 131; // 31 131 1313 13131 131313 etc..
    long hash = 0;
    for(int i = 0; i < str.length(); i++)
    {
        hash = (hash * seed) + str.charAt(i);
    }
    return hash;
}
![Page 50: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/50.jpg)
RS Hash Function
A simple hash function from Robert Sedgewick's book Algorithms in C
public long RSHash(String str)
{
    int b = 378551;
    int a = 63689;
    long hash = 0;
    for(int i = 0; i < str.length(); i++)
    {
        hash = hash * a + str.charAt(i);
        a = a * b;
    }
    return hash;
}
![Page 51: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/51.jpg)
DJB Hash Function
An algorithm produced by Professor Daniel J. Bernstein and shown first to the world on the Usenet newsgroup comp.lang.c. It is one of the most efficient hash functions ever published
public long DJBHash(String str)
{
    long hash = 5381;
    for(int i = 0; i < str.length(); i++)
    {
        hash = hash * 33 + str.charAt(i);
    }
    return hash;
}
![Page 52: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/52.jpg)
JS Hash Function
A bitwise hash function written by Justin Sobel
public long JSHash(String str)
{
    long hash = 1315423911;
    for(int i = 0; i < str.length(); i++)
    {
        hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2));
    }
    return hash;
}
![Page 53: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/53.jpg)
SDBM Hash Function
This is the algorithm of choice used in the open source SDBM project. The hash function seems to have a good overall distribution for many different data sets. It seems to work well in situations where there is a high variance in the MSBs of the elements in a data set.
public long SDBMHash(String str)
{
    long hash = 0;
    for(int i = 0; i < str.length(); i++)
    {
        hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash;
    }
    return hash;
}
![Page 54: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/54.jpg)
DEK Hash Function
An algorithm proposed by Donald E. Knuth in Section 6.4 of The Art of Computer Programming, Volume 3 (Sorting and Searching).
public long DEKHash(String str)
{
    long hash = str.length();
    for(int i = 0; i < str.length(); i++)
    {
        hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i);
    }
    return hash;
}
![Page 55: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/55.jpg)
DJB Hash Function
The algorithm by Professor Daniel J. Bernstein (alternative take)
public long DJBHash(String str)
{
    long hash = 5381;
    for(int i = 0; i < str.length(); i++)
    {
        hash = ((hash << 5) + hash) ^ str.charAt(i);
    }
    return hash;
}
![Page 56: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/56.jpg)
PJW Hash Function
This hash algorithm is based on work by Peter J. Weinberger of AT&T Bell Labs. The book Compilers (Principles, Techniques and Tools) by Aho, Sethi and Ullman recommends the use of hash functions that employ the hashing methodology found in this particular algorithm
public long PJWHash(String str)
{
    long BitsInUnsigned = (long)(4 * 8);
    long ThreeQuarters = (long)((BitsInUnsigned * 3) / 4);
    long OneEighth = (long)(BitsInUnsigned / 8);
    long HighBits = (long)(0xFFFFFFFF) << (BitsInUnsigned - OneEighth);
    long hash = 0;
    long test = 0;
    […]
![Page 57: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/57.jpg)
PJW Hash Function
    […]
    for(int i = 0; i < str.length(); i++)
    {
        hash = (hash << OneEighth) + str.charAt(i);
        if((test = hash & HighBits) != 0)
        {
            hash = ((hash ^ (test >> ThreeQuarters)) & (~HighBits));
        }
    }
    return hash;
}
![Page 58: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/58.jpg)
ELF Hash Function
Similar to the PJW hash function, but tweaked for 32-bit processors. It is the hash function widely used on most UNIX systems
public long ELFHash(String str)
{
    long hash = 0, x = 0;
    for(int i = 0; i < str.length(); i++)
    {
        hash = (hash << 4) + str.charAt(i);
        if((x = hash & 0xF0000000L) != 0)
        {
            hash ^= (x >> 24);
        }
        hash &= ~x; // executed on every iteration, not only inside the if
    }
    return hash;
}
![Page 59: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/59.jpg)
![Page 60: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/60.jpg)
Maps
a.k.a. associative array, map, or dictionary
![Page 61: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/61.jpg)
Definition
In computer science, an associative array, map, or dictionary is an abstract data type composed of (key, value) pairs, such that each possible key appears at most once
Native support in most modern languages (Perl, Python, Ruby, Go, …). E.g.,
V1[42] = “h2g2”
V2[“h2g2”] = 42
Implemented through hash tables
![Page 62: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/62.jpg)
Java Collection Framework
![Page 63: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/63.jpg)
Collection Family Tree
![Page 64: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/64.jpg)
![Page 65: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/65.jpg)
Map interface
Map<K,V>
K: the type of keys maintained by this map
V: the type of mapped values
Add/remove elements
value put(key, value)
value remove(key)
Search
boolean containsKey(key)
boolean containsValue(value)
containsValue() will probably require time linear in the map size for most implementations of the Map interface, i.e., it is O(N)
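The basic Map operations might be used as in the sketch below. `MapBasicsDemo` and `wordCount` are illustrative names; `get(key)` is a standard `Map` method that does not appear in the list above, and `HashMap` is assumed as the implementation.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasicsDemo {
    // Counts how many times each word occurs, using containsKey, get and put.
    static Map<String, Integer> wordCount(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            if (counts.containsKey(w)) {
                counts.put(w, counts.get(w) + 1);
            } else {
                counts.put(w, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> v2 = new HashMap<>();
        System.out.println(v2.put("h2g2", 42));     // null: there was no previous value
        System.out.println(v2.put("h2g2", 43));     // 42: put returns the value it replaced
        System.out.println(v2.containsKey("h2g2")); // true, a hash lookup
        System.out.println(v2.containsValue(42));   // false, after scanning every entry
        System.out.println(v2.remove("h2g2"));      // 43: remove returns the removed value
        System.out.println(v2.isEmpty());           // true
    }
}
```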
![Page 66: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/66.jpg)
Map interface (cont)
Nested Class
Map.Entry<K,V>
A map entry (key-value pair).
Set<Map.Entry<K,V>> entrySet()
Returns a Set view of the mappings contained in this map
Set<K> keySet()
Returns a Set view of the keys contained in this map
Collection<V> values()
Returns a Collection view of the values contained in this map
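The three views can be traversed as in this sketch. The names `MapViewsDemo`, `sumValues` and `keyOfLargestValue` are hypothetical helpers introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MapViewsDemo {
    // values(): iterate over the values only.
    static int sumValues(Map<String, Integer> map) {
        int sum = 0;
        for (int v : map.values()) {
            sum += v;
        }
        return sum;
    }

    // entrySet(): iterate over (key, value) pairs; returns null for an empty map.
    static String keyOfLargestValue(Map<String, Integer> map) {
        String bestKey = null;
        int bestValue = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            if (bestKey == null || e.getValue() > bestValue) {
                bestKey = e.getKey();
                bestValue = e.getValue();
            }
        }
        return bestKey;
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("NOTE", 310);
        scores.put("STOP", 326);
        System.out.println(scores.keySet().contains("NOTE")); // true: keySet() is a Set view of the keys
        System.out.println(sumValues(scores));                // 636
        System.out.println(keyOfLargestValue(scores));        // STOP
    }
}
```

The views are backed by the map, so they reflect later changes to it rather than being independent copies.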
![Page 67: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/67.jpg)
Collection Family Tree
![Page 68: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/68.jpg)
LinkedHashMap
Maintains a doubly-linked list running through all of its
entries
This linked list defines the iteration ordering (normally
insertion-order)
Insertion order is not affected if a key is re-inserted
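The re-insertion rule can be checked with a minimal sketch. `LinkedHashMapDemo` and `keysAfterReinsert` are illustrative names; the map is created with the default constructor, which uses insertion order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinkedHashMapDemo {
    // Inserts a, b, c, then re-inserts a with a new value, and returns the keys in iteration order.
    static List<String> keysAfterReinsert() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.put("a", 10); // replaces the value, but "a" keeps its original position
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterReinsert()); // [a, b, c]
    }
}
```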
![Page 69: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/69.jpg)
Easter Wrap-up
![Page 70: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/70.jpg)
Java Collection Framework
![Page 71: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/71.jpg)
![Page 72: Sets, maps and hash tables (Java Collections)](https://reader034.fdocuments.in/reader034/viewer/2022042601/554ba4b6b4c905b3618b4db0/html5/thumbnails/72.jpg)
License
These slides are distributed under a Creative Commons license “Attribution - NonCommercial - ShareAlike (CC BY-NC-SA)”
You are free: to copy, distribute, communicate to the public, display in public, represent, perform and recite this work
to modify this work
Under the following conditions: Attribution: You must attribute the work to the original authors, and in a way that does not suggest that they endorse you or your use of the work.
NonCommercial: You may not use this work for commercial purposes.
ShareAlike: If you alter or transform this work, or use it to create another, you may distribute the resulting work only under a license identical or equivalent to this one.
http://creativecommons.org/licenses/by-nc-sa/3.0/