Authentication: keys, MAC, hashes, message digests, digital signatures.
A lecture on hashes
-
Upload
bruce-nguyen -
Category
Documents
-
view
35 -
download
3
description
Transcript of A lecture on hashes
A lecture on hashes
By Charles Morris
What is a hash table?
• A hash table is an array-like data structure that associates its input (the key) with the associated output (the record, or value).
• They use a ‘Hashing Function’ to create the association; more on this later.
• Hashes were first known as “Associative Arrays” (and you may think of them as so); but using seven syllables is tiring.
What is a hash table?
Why a hash?
• Hashes have many distinct benefits.– Hashes are designed so that you may find a
variable without knowing its location.– Computational complexity for lookup varies;
Almost always O(1) or O(2), but in the case of `c` collisions, (where c <= n) it is O(c) (like searching an array).
What is a ‘hash function’?
• A hash function is simply a function that generates a fingerprint based on it’s input.
• Hashing functions are used in many fields, in cryptography one-way hash functions are used to create a small pseudo-random checksum out of (normally) much larger data. This checksum is used for authentication and secure network transmissions, amongst other applications.
Hash functions and Collisions
• In a one-way cryptographic hash function, the amount of collisions is not as important as the randomness of the collisions. These functions try to minimize the computational feasibility of finding a key ‘j’ such that j = f(k); where k is the original key.
Hash functions and Collisions
• In a hash function used to store data in a hash table, collisions should be minimized as much as possible; as every time a collision happens for key ‘j’ where j = f(k); it increases the O(lookup of f(k)).
• This assumes that your hash table watches for collisions, if it doesn’t, the old value will be trampled with the new value.
Collision mitigation
• As was stated, if two keys ‘k’ and ‘j’ both hash to the same index ( f(j) == f(k) ), this causes a collision.
• Collisions are avoided by using a good hashing algorithm, however they always happen to some degree when a hash function is given enough data to run on.
Collision resolution
• Chaining• Instead of one value being at the location f(k), there is a
chain (maybe a linked-list) of values. • Linear or Quadratic Probing is a similar solution, where
space is reserved at certain locations.
• Double hashing– Double hashing requires another hash computation,
O(2n), however the records become so sparse that collisions become very rare.
Hash function example
• The hashing function used in Perl 5.005:» (close relative of the popular ‘djb2’ algorithm)
// (Defined by the PERL_HASH macro in hv.h)
// ported by Charles Morris (me) for C++ programmers
unsigned long hashingfunction( string key )
{
unsigned long fingerprint = 0;
for( int j = 0; j < key.length(); j++) //for each letter in the string
{
fingerprint = fingerprint * 33 + (int)key.at(j); //sum
}
return fingerprint;
}
Hash function example
• hf(‘abc’) using the previous functionfingerprint = 0
//fingerprint = fingerprint * 33 + (int)’a’;
fingerprint = 0 * 33 + 97; //fingerprint = 97
//fingerprint = fingerprint * 33 + (int)’b’;
fingerprint = (97 * 33) + 98; //fingerprint = 3299
//fingerprint = fingerprint * 33 + (int)’c’;
fingerprint = (3299 * 33) + 99; //fingerprint = 108966