The use of MD5 Encryption and Salts in MYSQL Databases
Internet and Computer Security (CSY3023) – Amir Minai
Michael Couzens – 20200389 – Bsc Computing (Internet Technology)
Table of Contents
Abstract/Preface
Introduction
Main Body
    MD5
        How it Works
        Vulnerabilities
        Collision Vulnerability
        Application use of MD5
    Salt
        Example
        Other Benefits
Summary
References
Michael Couzens 20200389 – Bsc Computing – Year 3 – Internet Computer Security – Page 1
Abstract/Preface
This assignment explores the use of MD5 and salts for storing passwords in MYSQL databases. The
topic was chosen because MYSQL databases are widely used on the Internet to store passwords and
other login details for websites, as well as users' personal details.
The assignment also looks at how the data is captured in the first place, using an XHTML front end
to create the user form and a PHP back end to connect to the MYSQL database and submit the
captured data. MD5 and salts are examined in terms of how they protect that data between the point
where the end user enters it into a form and the point where it is stored in the MYSQL database on
the web host of the company the end user is dealing with.
Introduction
The topic of this assignment is the use of MD5 and salts for storing and securing passwords and
other personal user data in MYSQL databases. The reason for choosing it is the wide use of MYSQL
databases on the Internet to store login details and other personal details for users of different
web pages.
For data to be stored it first has to be captured, so part of this assignment explores how data
entered by a user into an XHTML and PHP form is passed to a MYSQL database. MD5 and salts are
explored in terms of how the data entered into the form is protected before it is stored in the
MYSQL database.
The primary element to be researched is MD5. MD5 is a widely used cryptographic hash function
that produces a 128-bit hash value. It is an Internet standard (RFC 1321), has been employed in a
variety of security applications, and is also commonly used to check the integrity of files. MD5 is
not suitable for applications such as SSL certificates or digital signatures because it is not
collision resistant. An MD5 hash is typically expressed as a 32-digit hexadecimal number.
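As a quick illustration of the digest size, the following minimal sketch uses Python's standard hashlib module. Python is used here purely for illustration (the assignment's own code is PHP), and the helper name md5_hex is hypothetical. It shows that the 128-bit digest is 16 bytes long and is written as 32 hexadecimal digits.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hexadecimal string."""
    return hashlib.md5(data).hexdigest()

raw_digest = hashlib.md5(b"abc").digest()
print(len(raw_digest) * 8)      # size of the digest in bits
print(len(md5_hex(b"abc")))     # number of hexadecimal digits
print(md5_hex(b"abc"))          # the digest itself
```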
To add another level of security to the MYSQL database, a salt will be used with MD5. In
cryptography, a salt comprises random bits that are used as one of the inputs to a key derivation
function; the other input is usually a password or pass-phrase. The output of the key derivation
function is stored as the protected (hashed) version of the password. A salt can also be used as
part of a key in a cipher or other cryptographic algorithm. The key derivation function typically
uses a cryptographic hash function. Sometimes the initialisation vector, a previously generated
value, is used as a salt.
The back end of the program for this assignment will be based on MYSQL, which will be used to
store the hashes of the user passwords, the user names and other personal user information.
MYSQL is a relational database management system that runs as a server providing multi-user
access to a number of databases. Its vendor describes it as the world's most popular open source
database software and attributes this to its speed, reliability and ease of use, which are said to
eliminate major problems associated with downtime, maintenance and administration for modern,
online applications.
To submit data to the MYSQL database, PHP is used to connect to the database, submit the entered
user data and, if necessary, compare it against what is already in the database. PHP is a widely
used, general-purpose scripting language that was originally designed for web development to
produce dynamic web pages. To do this, PHP code is embedded in the HTML source and interpreted
by a web server with a PHP processor module, which generates the web page document.
When users access the web page, a means is needed to create the page and display it to the user.
This means is XHTML. XHTML (Extensible Hypertext Markup Language) is a family of XML mark-up
languages that mirror or extend versions of the widely used Hypertext Markup Language (HTML),
the language in which web pages are written. The main difference between XHTML and HTML is that
XHTML must be well-formed XML, while HTML does not need to be.
To provide the formatting, style and layout of the text and images on the website, CSS is used.
CSS (Cascading Style Sheets) is a style sheet language used to describe the look and formatting of
a document written in a mark-up language. Its most common use is to style web pages written in
HTML and XHTML. CSS was designed primarily to separate document content (written in HTML or
XHTML) from document presentation, including elements such as layout, colours and fonts. This
separation improves content accessibility, provides more flexibility and control in the
specification of presentation characteristics, allows multiple pages to share formatting, and
reduces complexity and repetition in the structural content (for example, by allowing table-less
design).
To demonstrate the security aspects of MD5 and salts in terms of MYSQL databases, it will be
necessary to write a program. The program will be a login system for a website. It will be written
in XHTML to provide the main content and structure of the site, while CSS will describe how that
content and structure are formatted. PHP will be used to connect to the MYSQL database, to submit
the data that a user enters into the login system and, if necessary, to compare the data entered by
the user against the data held in the database. PHP will also be used to pass the data through MD5
and salts on its way to the MYSQL database. The back end of the program will be a MYSQL database
that holds the login and personal details of the users who are registered to log in to the system.
The major issue to be encountered in doing the assignment is the implementation of MD5 and salts,
specifically deciding where the best place to implement them would be. It is possible to implement
MD5 and salts in both PHP and MYSQL.
Main Body
MD5
How it Works
MD5 works by mangling bits in a way complex enough that every output bit is affected by every
input bit. It starts by padding the message so that its length is congruent to 448 bits modulo 512.
The original length of the message is then appended as a 64-bit integer, giving a total input whose
length is a multiple of 512 bits. The final pre-computation step is to initialise a 128-bit buffer
to a fixed value. Once the pre-computation is complete, the computation begins. Each pass takes a
512-bit block of input and mixes it in with the 128-bit buffer. For good measure, a table
constructed from the sine function is also thrown in. The point of using a known function like the
sine is not that it is more random than a random number generator, but to avoid any suspicion that
the designer built in a back door through which only he can enter. Four passes are performed on
each input block. This process carries on until all the input blocks have been consumed. The
contents of the 128-bit buffer then form the message digest.
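The pre-computation steps described above can be sketched in a few lines of Python. This is only an illustration of the padding arithmetic, the fixed starting buffer and the sine-derived table as defined in RFC 1321, not a full MD5 implementation; the function and constant names are hypothetical.

```python
import math

def padded_length_bits(message_bytes: int) -> int:
    """Total input length in bits after MD5 padding, for a message of the given size."""
    length = message_bytes * 8
    # A single 1 bit is always added, then 0 bits until the length is 448 (mod 512).
    padding = (448 - (length + 1)) % 512 + 1
    # The original message length is then appended as a 64-bit integer.
    return length + padding + 64

# Fixed starting value of the 128-bit buffer: four 32-bit words (RFC 1321).
INITIAL_BUFFER = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

# The 64-entry table built from the sine function (angles in radians).
SINE_TABLE = [int(abs(math.sin(i + 1)) * 2**32) for i in range(64)]

for size in (0, 55, 56, 64):
    print(size, padded_length_bits(size))   # always a multiple of 512
print(hex(SINE_TABLE[0]))
```

A 55-byte message still fits in one 512-bit block, while a 56-byte message spills into a second block because at least one padding bit and the 64-bit length must always be added.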
Vulnerabilities
MD5 has existed for over a decade, and many attacks have been carried out on it. Some
vulnerabilities have been found, most notably in its collision resistance, but its remaining
internal defences have so far prevented it from being broken outright. However, if those remaining
barriers fail, it may eventually fall.
A number of projects have created MD5 rainbow tables, which are easily accessible online. A
rainbow table is a lookup table used to recover plain-text passwords from password hashes
generated by a hash function, often a cryptographic hash function. Its usual purpose is to make
attacks against hashed passwords feasible. Rainbow tables can be used to reverse many unsalted
MD5 hashes into strings that hash to the same value as the original input, usually for the purposes
of password cracking. Adding a salt to each password before the MD5 digest is generated makes this
attack much more difficult, and often infeasible, because rainbow tables become far less useful.
MD5 hashes also appear in some website URLs, which means that search engines can sometimes
function as a limited tool for reverse lookup of MD5 hashes. This technique is also rendered
ineffective by the use of a salt.
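The reverse-lookup idea can be illustrated with a plain pre-computed dictionary, which is the simplest relative of a rainbow table (real rainbow tables use hash chains to save space, which is not shown here). The word list and the salt value are hypothetical, and Python is used only for illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """MD5 of a string, as a hexadecimal digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical word list; real tables cover millions of candidate passwords.
COMMON_PASSWORDS = ["password", "letmein", "dragon", "sunshine"]

# Pre-computed once by the attacker: hash -> plain text.
LOOKUP = {md5_hex(word): word for word in COMMON_PASSWORDS}

def reverse_lookup(stored_hash: str):
    """Return the plain text for a known hash, or None if it is not in the table."""
    return LOOKUP.get(stored_hash)

unsalted_hash = md5_hex("letmein")
salted_hash = md5_hex("x9$Kq2" + "letmein")   # hypothetical salt added before hashing

print(reverse_lookup(unsalted_hash))   # the password is recovered
print(reverse_lookup(salted_hash))     # the table has no entry for the salted hash
```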
Collision Vulnerability
If two prefixes of equal length with the same hash can be constructed, a common suffix can be added
to both to make the collision more likely to be accepted as valid data by the application using it.
Furthermore, current collision-finding techniques allow an arbitrary prefix to be specified, so an
attacker can create two colliding files that both begin with the same content. All the attacker
needs to generate two colliding files is a template file with a 128-byte block of data, aligned on
a 64-byte boundary, that can be changed freely by the collision-finding algorithm.
Application use of MD5
MD5 has been widely used in the software world to provide some assurance that a transferred file
has arrived in the same condition in which it was sent. For example, file servers often provide a
pre-computed MD5 checksum for their files, so that the user can compare the checksum of the
downloaded file to it. Unix-based operating systems include MD5 sum utilities in their distribution
packages, whereas Windows users use third-party applications.
Unfortunately, it is now easy to generate MD5 collisions, so the person who created a file can
create a second file with the same checksum. This technique therefore cannot protect against some
forms of malicious tampering. Also, in some cases the checksum cannot be trusted (for example, if
it was obtained over the same channel as the downloaded file), in which case MD5 can only provide
error-checking functionality: it will recognise a corrupt or incomplete download, which becomes
more likely when downloading larger files.
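A checksum comparison of this kind might look like the following sketch, written in Python for illustration. The function names are hypothetical; the file is hashed in chunks so that a large download does not have to fit in memory. As the text notes, a match only shows that the file is not corrupt or incomplete, not that it is free from deliberate tampering.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks and return the hexadecimal MD5 digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals the end of the file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path: str, published: str) -> bool:
    """Compare the computed checksum with the one published by the file server."""
    return md5_of_file(path) == published.strip().lower()
```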
MD5 is widely used to protect passwords stored in a MYSQL database, so that the passwords
themselves are not held in plain text. To protect against the vulnerabilities mentioned
previously, a salt can be added to the passwords before hashing them. An example of a web-based
login application that uses MD5 to protect users' login passwords can be seen below.
The screen dump above shows a MYSQL database that contains the details a user needs to log in to a
website. As can be seen from the password column of the table, the user's password has been
scrambled using the MD5 hash algorithm. A salt has not been used.
The screen dump above shows the email that is sent to a new user of the website after they have
registered their login details. Most importantly, the screen dump shows what the user's actual
password is at the time of registration.
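In order to create the password whose MD5 hash is stored in the database, the following line of
code is used in the PHP script: -
$newpass = substr(md5(time()),0,6);
Note that this line does not hash a password supplied by the user. It takes the MD5 hash of the
current Unix timestamp and keeps the first six hexadecimal characters as a newly generated
password. The screen dumps suggest that this password is emailed to the user and that an MD5 hash
of it, rather than the password itself, is what is stored in the MYSQL database.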
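To make the behaviour of the PHP line explicit, here is a rough Python equivalent, given only as a sketch. timestamp_password mirrors substr(md5(time()),0,6); hash_for_storage is an assumption about how the stored value is produced, based on the unsalted MD5 value visible in the password column; random_password is a hypothetical alternative that is not part of the script. Because the timestamp is predictable, a password derived from it can in principle be guessed by anyone who knows roughly when the account was registered.

```python
import hashlib
import secrets
import time

def timestamp_password(now=None) -> str:
    """Python equivalent of the PHP line: first six hex digits of md5(time())."""
    seconds = int(time.time() if now is None else now)
    return hashlib.md5(str(seconds).encode("ascii")).hexdigest()[:6]

def random_password(length: int = 6) -> str:
    """Hypothetical alternative (not in the script): use a secure random source."""
    return secrets.token_hex(length)[:length]

def hash_for_storage(password: str) -> str:
    """Unsalted MD5 of the password (assumed to be what the password column holds)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

new_password = timestamp_password()
print(new_password, hash_for_storage(new_password))
```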
Salt
As mentioned previously, a salt can be used to strengthen password hashes produced with the MD5
hash algorithm. A salt is made up of random bits that are used as one of the inputs to a key
derivation function. The other input is usually a password or pass-phrase. The output of the key
derivation function is stored as the protected (hashed) version of the password. A salt can also be
used as part of a key in a cipher or other cryptographic algorithm. The key derivation function
typically uses a cryptographic hash function. Sometimes the initialisation vector, a previously
generated value, is used as a salt.
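A minimal sketch of salting, assuming the simplest possible construction in which the salt is placed in front of the password and the result is hashed once with MD5. It is written in Python for illustration; the function names, the 16-byte salt size and the choice to put the salt first are all assumptions rather than details taken from the login system.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest) where digest = MD5(salt + password)."""
    if salt is None:
        salt = os.urandom(16)   # 16 random bytes; the size is an assumption
    digest = hashlib.md5(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: str) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, attempt = hash_password(password, salt)
    return hmac.compare_digest(attempt, stored_digest)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))   # right password
print(verify_password("wrong guess", salt, stored))     # wrong password
```

The salt has to be kept alongside (or otherwise associated with) the digest, because the same salt is needed again to check a login attempt.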
Salt data makes dictionary attacks that rely on pre-computed hashes of dictionary entries more
complicated. Each bit of salt used doubles the amount of storage and computation required for such
a pre-computed dictionary.
For the best security, the salt value is kept secret and stored separately from the password
database. This provides an advantage if the database is stolen but the salt is not. To determine a
password from a stolen hash, an attacker cannot simply try common passwords (such as
English-language words or names). Rather, they must calculate the hashes of random characters (at
least for the portion of the input they know is the salt), which is much slower.
Within some protocols, the salt is transmitted in clear text along with the encrypted data,
sometimes together with the number of iterations used in generating the key (for key
strengthening). SSL is an example of a cryptographic protocol that uses salts.
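Key strengthening with a salt and an iteration count can be sketched with PBKDF2 from Python's standard library. This is an illustration under stated assumptions: PBKDF2-HMAC-SHA256 is swapped in as a readily available key derivation function and is not something the assignment uses, and the "iterations$salt$key" record layout and the function names are hypothetical. The salt and the iteration count are stored in clear text next to the derived key, as described above.

```python
import hashlib
import hmac
import os

def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA256 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def make_record(password: str, iterations: int = 100_000) -> str:
    """Build 'iterations$salt$key'; salt and iteration count are kept in clear text."""
    salt = os.urandom(16)
    key = derive_key(password, salt, iterations)
    return f"{iterations}${salt.hex()}${key.hex()}"

def check_record(password: str, record: str) -> bool:
    """Recompute the key using the salt and iteration count read from the record."""
    iterations, salt_hex, key_hex = record.split("$")
    key = derive_key(password, bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(key.hex(), key_hex)

record = make_record("secret", iterations=1_000)
print(check_record("secret", record))
print(check_record("Secret", record))
```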
The benefit of using a salted password is that a simple dictionary attack against the hashed values
becomes impractical if the salt is large enough. That is, an attacker cannot simply create a
rainbow table (a dictionary of hashed password-plus-salt values), because it would take either too
much time or too much space. This forces the attacker to use the provided authentication
mechanism (which 'knows' the correct salt value).
Example
Assume a user’s (hashed) secret key is stolen and the user is known to use one of 200,000 English
words as their password. The system uses a 32-bit salt. The salted key is now the original password
appended to this random 32-bit salt. Because of this salt, the attacker’s pre-calculated hashes are
of no value. They must calculate the hash of each word with each of the 2^32 (4,294,967,296)
possible salts appended until a match is found. The total number of possible inputs can be obtained
by multiplying the number of words in the dictionary by the number of possible salts:
2^32 × 200,000 = 858,993,459,200,000 (approximately 8.59 × 10^14)
To complete a brute-force attack, the attacker must now compute about 859 trillion hashes, instead
of only 200,000. Even though the password itself is known to be simple, the secret salt makes
breaking the password far more difficult.
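The multiplication in this example can be checked directly. The short Python sketch below does so and also estimates how long the search would take; the hash rate is a purely hypothetical figure chosen to give a sense of scale, not a measurement.

```python
DICTIONARY_WORDS = 200_000
SALT_BITS = 32

possible_salts = 2 ** SALT_BITS
total_hashes = DICTIONARY_WORDS * possible_salts

# Hypothetical attacker speed, chosen only to give a sense of scale.
HASHES_PER_SECOND = 1_000_000_000

seconds = total_hashes / HASHES_PER_SECOND
print(possible_salts)     # 4294967296
print(total_hashes)       # 858993459200000
print(seconds / 86400)    # days needed at the assumed rate
```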
Other Benefits
Salts help protect against rainbow tables as they, in effect, extend the length and potentially the
complexity of the password. If the rainbow tables do not have passwords matching the length (e.g.
an 8-byte password and a 2-byte salt are effectively a 10-byte password) and complexity (a
non-alphanumeric salt increases the complexity of strictly alphanumeric passwords) of the salted
password, then the password will not be found. If it is found, the salt has to be removed from the
recovered string before the password can be used.
Salts also make dictionary attacks and brute-force attacks much slower when cracking a large
number of passwords (but not when cracking just one password). Without salts, an attacker who is
cracking many passwords at the same time only needs to hash each password guess once and compare
it to all the hashes. With salts, however, the passwords will most likely all have different salts,
so each guess must be hashed separately for each salt, which is much slower because the hashing
work can no longer be shared between accounts.
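The difference in effort can be expressed as a simple count of hash computations. The following Python sketch is a deliberately simplified model (the function name is hypothetical, and it assumes every account has a distinct salt and that every guess is tried against every account).

```python
def hashes_needed(guesses: int, accounts: int, salted: bool) -> int:
    """Hash computations needed to test every guess against every account."""
    if salted:
        # Assumes every account has a distinct salt, so no work can be shared.
        return guesses * accounts
    # Unsalted: each guess is hashed once and compared with all stored hashes.
    return guesses

for accounts in (1, 1_000, 1_000_000):
    print(accounts,
          hashes_needed(200_000, accounts, salted=False),
          hashes_needed(200_000, accounts, salted=True))
```

For a single account the two counts are identical, which matches the point that salts do not slow down the cracking of just one password.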
Another (lesser) benefit of a salt is as follows: two users might choose the same string as their
password, or the same user might choose to use the same password on two machines. Without a salt,
this password would be stored as the same hash string in the password file. This would disclose the
fact that the two accounts have the same password, allowing anyone who knows the password of one
account to access the other. If the password hashes are salted with two random characters, the
odds are that, even if two accounts use the same password, no one can discover this by reading the
password files.
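This effect is easy to demonstrate. In the Python sketch below, two hypothetical accounts share a password; the two-character salts are fixed values so that the output is repeatable, whereas a real system would generate them randomly for each account.

```python
import hashlib

def unsalted_hash(password: str) -> str:
    """MD5 of the password alone."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: str) -> str:
    """MD5 of a two-character salt followed by the password."""
    return hashlib.md5((salt + password).encode("utf-8")).hexdigest()

# Two hypothetical accounts that happen to share a password.
shared_password = "sunshine"
alice_salt, bob_salt = "a7", "Qz"

print(unsalted_hash(shared_password) == unsalted_hash(shared_password))    # identical rows
print(salted_hash(shared_password, alice_salt) ==
      salted_hash(shared_password, bob_salt))                              # rows differ
```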
Summary
In conclusion, carrying out this assignment has shown that any hashing or encryption of passwords
stored in a database is better than none at all. It is possible to create a MYSQL database that
stores passwords as plain text, but this means that anyone who gains access to the database can
read users' passwords and log in to the site. If the passwords are hashed, anyone who gains access
to the database has to do more work to reveal what the stored passwords are.
While MD5 is vulnerable to collisions and unsalted MD5 hashes can be cracked using rainbow tables,
it is still more secure than a plain-text password. To prevent passwords from being cracked using
rainbow tables, a salt should also be used. Using a salt also helps to protect against dictionary
attacks. In addition, if the database with the passwords is stolen but the salt is stored
separately, it would be close to impossible for the hacker to recover the passwords contained
within the database.
In terms of the practical side of the assignment, the login system only uses MD5. As mentioned
previously, this is not as insecure as using no hashing at all. Users should be encouraged to use
strong passwords containing a mixture of letters, numbers and other characters, rather than just
words, especially easily guessable ones. Using a salt would make it harder still for a hacker to
crack passwords.
References
http://phpsec.org/articles/2005/password-hashing.html
http://www.devshed.com/c/a/PHP/Creating-a-Secure-PHP-Login-Script/
http://en.wikipedia.org/wiki/Cascading_Style_Sheets
http://www.securityfocus.com/infocus/1726
http://en.wikipedia.org/wiki/Salt_(cryptography)
http://en.wikipedia.org/wiki/XHTML
http://en.wikipedia.org/wiki/PHP
http://www.mysql.com/about/
http://en.wikipedia.org/wiki/MySQL
http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html
http://www.osix.net/modules/article/?id=507
http://en.wikipedia.org/wiki/Collision_resistance
http://en.wikipedia.org/wiki/MD5
http://en.wikipedia.org/wiki/Rainbow_table
http://articles.sitepoint.com/article/users-php-sessions-mysql/3
http://www.phpeasystep.com/phptu/6.html
http://www.phpeasystep.com/workshopview.php?id=26
http://en.wikipedia.org/wiki/Salt_(cryptography)#Examples
http://en.wikipedia.org/wiki/Salt_(cryptography)#Additional_Benefits
http://en.wikipedia.org/wiki/MD5#Collision_vulnerability
http://en.wikipedia.org/wiki/MD5#Other_vulnerabilities
http://en.wikipedia.org/wiki/MD5#Applications
Computer Networks – Fourth Edition – Andrew S. Tanenbaum – Chapter 8 Network Security - Page
760 - MD5