Ensuring data integrity with tamper evident encryption of ...gsc/pub/master/bbaker/doc/bbaker... ·...
Tamper evident encryption of integers using
keyed Hash Message Authentication Code
Brad Baker
November 16, 2009
UCCS
11/16/2009 Brad Baker - Master's Project Report
Master’s Project Report
Agenda
Introduction / Motivation
Background
Design
Analysis
Implementation
Testing
Conclusion / Future Work
References
Introduction
Confidentiality and integrity of data are important features in a database environment [16, 26]
Integrity is also referred to as tamper detection for this project
Database tampering is defined as loss of relationship between sensitive data and other data in the record
Standard solutions exist including [16]:
Symmetric and asymmetric encryption for confidentiality
Message authentication codes and hash digests for integrity
Standard solutions require the end user to build a complex process combining hash and encryption functions
This project presents the “HMAC based Tamper Evident Encryption” scheme (HTEE) as an alternative solution
HMAC is keyed-Hash Message Authentication Code
Motivation
Create an efficient and simple-to-use tamper evident encryption technique
Single step, single column tamper detection
Focus on processing numeric data in a database system
Improve performance of the encryption operation compared to standard approaches
Improve on previous work that introduced an HMAC based encryption/decryption process
Investigate uses of HMAC as an encryption and key generation function
Related Work
File system and application level integrity [21, 22]
Checksums, CRC, RAID Parity, Cryptographic file systems
OpenSSL, Intrusion detection, Tripwire, Samhain
Forensic analysis and tamper detection [23]
Notarization with hash function and reliance on audit log
Analysis of how and when data was tampered
Parallel encryption and authentication code [24, 25]
Various implementations of encryption combined with MAC
Original HMAC encryption scheme [1]
Integer encryption with HMAC
Foundation for HTEE tamper detection
Comparison of Solutions
Solutions for integrity and confidentiality considered:
HTEE: Encryption and tamper detection with HMAC function
AES & SHA-1: Encryption and hash, detects tampering
AES: Encryption, detects random changes only
Each provides a unique benefit:
Solution      Encryption Strength   Tamper Detection   Simple Usage   Encrypt Efficiency   Decrypt Efficiency
HTEE          Medium/High*          Yes                Yes            Fast                 Slow
AES & SHA-1   High                  Yes                No             Moderate             Moderate
AES           High                  No                 Yes            Moderate             Moderate
* Security of the HTEE scheme is variable and relies on the hash algorithm used.
Background - HMAC
HMAC – keyed Hash Message Authentication Code [13]
Produces a secure authentication code (digest) using message and secret key, providing integrity and authenticity
Proposed in [3], and standardized as FIPS PUB 198 [12]
Unauthorized individual cannot generate digest without key
Can use any underlying hash function, MD5, SHA-1, etc.
Function generates two keys from secret key
The HMAC process is:
HMAC(key, msg) = Hash((key XOR opad) || Hash((key XOR ipad) || msg))
Where opad=“0x5c5c…” and ipad=“0x3636…”
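The construction above can be sketched in a few lines. This is a minimal illustration, assuming SHA-1 as the underlying hash (as used later in this project) and the 64-byte block size from the HMAC specification; the function name hmac_sha1 is made up for this sketch, and the result is checked against Python's standard hmac module.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 processes 64-byte blocks


def hmac_sha1(key: bytes, msg: bytes) -> bytes:
    """Hand-rolled HMAC-SHA1 following Hash((key XOR opad) || Hash((key XOR ipad) || msg))."""
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    # The two derived keys: the secret key XORed with the ipad and opad constants.
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)

    inner_digest = hashlib.sha1(inner_key + msg).digest()
    return hashlib.sha1(outer_key + inner_digest).digest()


# Cross-check against the standard library implementation.
digest = hmac_sha1(b"secret key", b"message")
assert digest == hmac.new(b"secret key", b"message", hashlib.sha1).digest()
```

The "two keys" mentioned on the slide correspond to inner_key and outer_key here.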
Background – Integer Encryption
Integer encryption with HMAC
Original HMAC integer encryption scheme proposed in [1]
The scheme operates on integer plaintext values, decomposed into two components or buckets
Encryption is performed with HMAC calculation, decryption is performed with exhaustive search
The scheme is inefficient on encryption and for large integers
Encryption is recursive HMAC rather than direct calculation
Two buckets result in large search ranges for decryption
A detailed analysis, including testing results, is available in [2]
HTEE is based on this scheme, and improves upon it
Introductory Example
Original HMAC example:
Plaintext integer value 567,212 and bucket size 5,000
Bucket 1 = 113, Bucket 2 = 2212
Plaintext can be retrieved as (567,212 = 113*5,000 + 2212)
HMAC digest / ciphertext output:
113 becomes “fG7Agfw4OErQw+IX2iBw853LBKg=”
2212 becomes “YOLpnTHGIHurCvkrgczFMM1C5PI=”
Decryption searches through 5,000 values to find a ciphertext match for each bucket
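The two-bucket idea in this example can be sketched as follows. This is an illustration only: the key is a made-up placeholder, and a single direct HMAC-SHA1 call stands in for the recursive HMAC of the original scheme [1], so the digests will not match the ones shown on the slide.

```python
import base64
import hashlib
import hmac

BUCKET_SIZE = 5000
DEMO_KEY = b"hypothetical demo key"  # assumption: not the key used in the slides


def split_two(value: int, bucket_size: int = BUCKET_SIZE) -> tuple[int, int]:
    """Return (bucket1, bucket2) so that value = bucket1 * bucket_size + bucket2."""
    return divmod(value, bucket_size)


def digest(bucket: int, key: bytes = DEMO_KEY) -> str:
    """Base64 HMAC-SHA1 digest of one bucket value (simplified, non-recursive)."""
    mac = hmac.new(key, str(bucket).encode(), hashlib.sha1).digest()
    return base64.b64encode(mac).decode()


def search(ciphertext: str, key: bytes = DEMO_KEY, limit: int = BUCKET_SIZE):
    """Exhaustive search: try every candidate below limit until the digest matches."""
    for candidate in range(limit):
        if hmac.compare_digest(digest(candidate, key), ciphertext):
            return candidate
    return None  # no candidate produced this digest


b1, b2 = split_two(567212)          # (113, 2212)
c1, c2 = digest(b1), digest(b2)
recovered = search(c1) * BUCKET_SIZE + search(c2)
assert recovered == 567212
```

Each decryption may need up to 5,000 HMAC evaluations per bucket, which is the search cost the slide refers to.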
HTEE Design
Processes positive integer values
Decomposition of plaintext into multiple buckets of size 1,000
For example: 2,412,345,678 becomes four buckets:
Bucket 1 = 2; Bucket 2 = 412; Bucket 3 = 345; Bucket 4 = 678;
In the original scheme, a 50,000 bucket size would make two buckets:
Bucket 1 = 48246; Bucket 2 = 45678;
Key transformation based on a unique value related to plaintext
Each encryption operation uses a different key
Encryption keys depend on original key and unique related data
The unique value is any data that must remain the same in relation to the plaintext, for example:
Record’s primary key, other unique data, hash digest of unique data
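The decomposition step described above is a base-1,000 digit expansion. A minimal sketch, with function names chosen for illustration:

```python
def to_buckets(value: int, bucket_size: int = 1000) -> list[int]:
    """Split a non-negative integer into buckets, most significant first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    buckets = []
    while True:
        value, remainder = divmod(value, bucket_size)
        buckets.append(remainder)
        if value == 0:
            break
    return buckets[::-1]


def from_buckets(buckets: list[int], bucket_size: int = 1000) -> int:
    """Inverse of to_buckets: rebuild the integer from its buckets."""
    value = 0
    for bucket in buckets:
        value = value * bucket_size + bucket
    return value


print(to_buckets(2412345678))        # [2, 412, 345, 678]
print(divmod(2412345678, 50000))     # (48246, 45678), the two-bucket split
```

With buckets of size 1,000, each bucket needs at most 1,000 candidates during decryption, regardless of how large the plaintext is.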
HTEE Design
Encryption operation:
Calculate HMAC digest for each bucket
Decryption operation:
Search for digest match between ciphertext and all values (0-999)
Tamper detection:
Decryption operation cannot find matching value
Two key transformation functions used: element and bucket
Element transformation creates a key for each plaintext
HMAC executed recursively four times with unique value and original key
Bucket transformation creates key for each bucket value
HMAC executed iteratively with ciphertext output and original key
Encryption performed with transformed keys, not original key
HTEE Design
HMAC digests for all buckets in a plaintext are concatenated to form ciphertext
Decryption follows key generation process, plus an exhaustive search for ciphertext match
No match indicates tampering: the ciphertext or the unique related data has changed
The HTEE process is:
HTEE(Plaintext, Key, Unique) =
HMAC(Bucket1, fKey(Key, Unique)) ||
HMAC(Bucket2, fKey(Key, Unique)) || … ||
HMAC(BucketN, fKey(Key, Unique))
Where fKey is the key transformation (element and bucket) and Bucket1 through BucketN are decomposed from Plaintext
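The overall flow (key transformation, per-bucket HMAC, concatenation, and search-based decryption with tamper detection) can be sketched as below. The slides do not give the exact key transformation functions, so element_key and bucket_key here are assumptions made for illustration: the element key chains HMAC four times over the unique value, and each bucket key is an HMAC of the previous output under the original key. This is not the project's implementation and will not reproduce its ciphertexts.

```python
import base64
import hashlib
import hmac

BUCKET_SIZE = 1000
BLOCK_CHARS = 28  # base64 length of one 160-bit SHA-1 digest


def _hmac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha1).digest()


def element_key(original_key: bytes, unique: bytes, rounds: int = 4) -> bytes:
    """Assumed element transformation: chain HMAC four times over the unique value."""
    key = original_key
    for _ in range(rounds):
        key = _hmac(key, unique)
    return key


def bucket_key(original_key: bytes, previous_output: bytes) -> bytes:
    """Assumed bucket transformation: HMAC of the previous output under the original key."""
    return _hmac(original_key, previous_output)


def to_buckets(value: int) -> list[int]:
    buckets = []
    while True:
        value, remainder = divmod(value, BUCKET_SIZE)
        buckets.append(remainder)
        if value == 0:
            return buckets[::-1]


def htee_encrypt(value: int, original_key: bytes, unique: bytes) -> str:
    chain = element_key(original_key, unique)
    blocks = []
    for bucket in to_buckets(value):
        block = _hmac(bucket_key(original_key, chain), str(bucket).encode())
        blocks.append(base64.b64encode(block).decode())
        chain = block  # the next bucket key depends on this ciphertext block
    return "".join(blocks)


def htee_decrypt(ciphertext: str, original_key: bytes, unique: bytes):
    """Return the plaintext integer, or None when no match is found (tampering)."""
    chain = element_key(original_key, unique)
    value = 0
    for i in range(0, len(ciphertext), BLOCK_CHARS):
        target = base64.b64decode(ciphertext[i:i + BLOCK_CHARS])
        key = bucket_key(original_key, chain)
        for candidate in range(BUCKET_SIZE):  # exhaustive search over 0-999
            if hmac.compare_digest(_hmac(key, str(candidate).encode()), target):
                break
        else:
            return None  # no bucket value matches: data was tampered with
        value = value * BUCKET_SIZE + candidate
        chain = target
    return value


demo_key = bytes(range(64))  # placeholder 512-bit key, for illustration only
cipher = htee_encrypt(654321, demo_key, b"1001")
assert htee_decrypt(cipher, demo_key, b"1001") == 654321
assert htee_decrypt(cipher, demo_key, b"1002") is None  # unique value changed
```

Changing either the ciphertext or the unique value changes the key sequence, so the search finds no match and the record is reported as tampered.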
Example of HTEE
Record contents (DATA value is sensitive, must be encrypted):
ID = 1001; DATA = 654321
After decomposition of DATA value:
bucket1 = 654; bucket2 = 321
Original Key, 512 bit:
fwWe6MNL5WC9gRgCfVbUsuFLeX8IfwKbnkWmlKhj5Tx2Ods+VkmKS73AeFt0EsXy+zmfWEsyOEaKSx/oYMSmRA==
Generated keys for buckets (dependent on ID value and original key):
Bucket1 key: qi5K5JmBNRfOuPf8qQvgPVVZ5nHZjlgoDb8un4GS/NxFhbRNdnE5B80kPe3rpqIvHRDzdZsiEmpk+2Ozcb5yXg==
Bucket2 key: ylT5vKaGkdc1XMtW0z+HOb1Td2eqLkrkmYE1F8649/ypC+A9VVnmcdmOWCgNvy6fgZL83EWFtE12cTkHzSQ97Q==
Ciphertext result from HMAC (bucket, key):
Bucket1 cipher: Ziuytd9t8Vn1h5ldqZjv57sTe2k=
Bucket2 cipher: uk/ACtScX2oxJUPyEPdPWSPCXQk=
Final Ciphertext:
Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=
Final Output:
ID = 1001; CIPHER = Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=
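Because every bucket digest is a fixed 28 base64 characters, the concatenated ciphertext in this example can be split back into per-bucket blocks before the search begins. A small sketch (the helper name split_blocks is made up here):

```python
import base64

CIPHERTEXT = "Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk="
BLOCK_CHARS = 28  # one base64-encoded 160-bit digest


def split_blocks(ciphertext: str) -> list[str]:
    """Cut a concatenated ciphertext into one 28-character block per bucket."""
    if len(ciphertext) % BLOCK_CHARS != 0:
        raise ValueError("ciphertext length is not a multiple of the block size")
    return [ciphertext[i:i + BLOCK_CHARS]
            for i in range(0, len(ciphertext), BLOCK_CHARS)]


blocks = split_blocks(CIPHERTEXT)
raw_digests = [base64.b64decode(block) for block in blocks]
print(blocks)                          # two blocks, one per bucket
print([len(d) for d in raw_digests])   # [20, 20], i.e. 160 bits each
```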
Security Analysis
Cryptographic strength of HTEE is based on HMAC
Key transformation and encryption use HMAC function
Cryptographic strength of HMAC is based on underlying hash function [3, 4, 5]
For this project, SHA-1 is used as underlying hash
Hash can be changed for additional security of HMAC [3]
HMAC proven secure from forgery if hash compression operation is a pseudo-random function [4, 7, 11]
HMAC is not susceptible to hash collision attacks that affect MD5 and SHA-1 [3, 4, 5]
Collisions are still produced but are more difficult to exploit
Security Analysis
HMAC can be attacked by forgery or key recovery attacks [3, 6]
Key recovery attacks typically have chosen or known plaintext
The birthday paradox bounds the probability of finding an HMAC collision [3, 5, 11, 15]
For SHA-1, 2^80 (message, digest) pairs from HMAC are needed
Research shows key recovery attacks that are better than brute force, but still worse than birthday attack [6, 7, 10]
For the HTEE scheme key recovery attacks are the primary concern
Forgeries are less of a concern as they could only break a single record’s tamper detection capability
Security Analysis
The layering of key generation in HTEE makes analysis difficult:
Attacker knows the unique value and final digest/ciphertext
Given the digest it is difficult to find the key or message value
Given the unique value, it is difficult to obtain original key
Consider general form: HTEE(P,K,U) = HMAC(P, fK(K,U))
Intermediate keys and plaintexts are masked and HMAC is difficult to break if using an effective underlying hash
HMAC operation protects plaintext and intermediate key, makes derivation of original key more difficult
A key recovery attack will take over 2^80 message pairs
Most applications will not use the same secret key for a large number of records (over 2^40, approx. 1 trillion)
This is far short of the more than 2^80 pairs needed for key recovery
Tamper Detection Analysis
HTEE creates a distinct key sequence based on the unique value related to plaintext
Identical keys only occur on hash collisions
This is improbable unless a very large number of records is processed
If the ciphertext or unique value is changed then the key sequence or HMAC output will differ
Tamper detection will only fail if the original and changed HTEE processes produce a collision
Probability of collision for each bucket is approx. 3.42x10^-43
Based on the birthday attack with 1,000 values [15, 16]
Probability is P = 1 - e^(-k^2 / (2N)) with k = 1000 and N = 2^160
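The quoted probability can be reproduced numerically. A short sketch of the birthday approximation, using expm1 so the tiny result is not lost to floating-point rounding (the function name is chosen for illustration):

```python
import math


def collision_probability(k: int, digest_bits: int) -> float:
    """Birthday approximation P = 1 - e^(-k^2 / (2N)) with N = 2^digest_bits."""
    n = 2 ** digest_bits
    # -expm1(-x) equals 1 - e^(-x) but stays accurate when x is extremely small.
    return -math.expm1(-(k * k) / (2 * n))


p = collision_probability(1000, 160)
print(f"{p:.3e}")  # on the order of 3.42e-43 per bucket
```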
Implementation
HTEE process implemented as a PostgreSQL add-on and a command line program
Built in the C language
Microsoft Visual C++ 2008 Express Edition
PostgreSQL server versions 8.3.8 and 8.4.1
Implemented versions:
Command line program used for validation and flat file processing
PostgreSQL add-on is considered the primary implementation
Two functions added to PostgreSQL server:
Encryption: htee_enc(plaintext, unique value)
Decryption: htee_dec(ciphertext, unique value)
Simple operation, example SQL for encryption:
SELECT htee_enc(data,unique) FROM test
Maximum of six buckets, or an integer value of 9x10^17, supported
Implementation
SHA-1 used for underlying hash function
Specifies use of 512 bit key, blocks of 160 bit ciphertext output
Input key is 88 base64 characters, output is 28 base64 characters per bucket value
Ciphertext output for six buckets is 168 bytes of base64 encoded data
Comparable AES output is 116 bytes, HTEE is a 44% increase
Compared to plaintext data, a 21-fold increase
Several challenges encountered:
Extending PostgreSQL in Windows environment
Interfacing with the PostgreSQL backend
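The size figures on this slide follow from base64 arithmetic and can be checked directly. In this sketch the 116-byte AES figure is taken from the slide as given, and the 8-byte plaintext size is an assumption inferred from the 21-fold ratio:

```python
import base64

KEY_BYTES = 512 // 8       # 512-bit key
DIGEST_BYTES = 160 // 8    # 160-bit SHA-1 digest per bucket
MAX_BUCKETS = 6
AES_OUTPUT_BYTES = 116     # figure quoted on the slide, not derived here
PLAINTEXT_BYTES = 8        # assumption: plaintext stored as an 8-byte integer

key_chars = len(base64.b64encode(bytes(KEY_BYTES)))        # 88
block_chars = len(base64.b64encode(bytes(DIGEST_BYTES)))   # 28
htee_output = MAX_BUCKETS * block_chars                    # 168

increase_over_aes = htee_output / AES_OUTPUT_BYTES - 1     # roughly 0.448
fold_over_plaintext = htee_output / PLAINTEXT_BYTES        # 21.0

print(key_chars, block_chars, htee_output)
print(f"{increase_over_aes:.1%} larger than AES, {fold_over_plaintext:.0f}x the plaintext")
```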
Testing
Compared three methods for encryption:
Basic AES (aes1): Does not provide tamper detection
AES & unique value (aes2): Provides tamper detection
HTEE scheme: Provides tamper detection
Tested six datasets, 20,000 random integers in each
Each dataset with different number of buckets, one through six
Results verified tamper detection with AES2 and HTEE methods
HTEE on average was four times faster on encryption but four times slower on decryption than AES
Performance analysis
The performance of HTEE and the original scheme [1] are compared with algorithmic analysis
HTEE is significantly more efficient on encryption, and on decryption for large numbers [2]
Original scheme increases with n^0.5, HTEE increases with log_1000(n)
Testing verifies that HTEE is much faster for similar datasets
The large bucket size required for two buckets makes decryption prohibitively expensive to calculate
Encryption Scheme     Relative complexity
HTEE Encryption       2*log_1000(n)         Constant
HTEE Decryption       1001*log_1000(n)      Constant
Original Encryption   2*n^0.5               Polynomial
Original Decryption   2*n^0.5               Polynomial
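To get a feel for these expressions, the relative operation counts can be evaluated for concrete plaintext sizes. This sketch approximates log_1000(n) by the number of base-1,000 buckets and n^0.5 by the integer square root; the function names are illustrative only.

```python
import math


def bucket_count(n: int, bucket_size: int = 1000) -> int:
    """Number of base-1000 buckets needed for n (a rounded-up log_1000)."""
    count = 1
    while n >= bucket_size:
        n //= bucket_size
        count += 1
    return count


def htee_encrypt_ops(n: int) -> int:
    return 2 * bucket_count(n)


def htee_decrypt_ops(n: int) -> int:
    return 1001 * bucket_count(n)


def original_ops(n: int) -> int:
    """Original two-bucket scheme: about 2 * n^0.5 for encryption and for decryption."""
    return 2 * math.isqrt(n)


for n in (654321, 2412345678, 10**18 - 1):
    print(n, htee_encrypt_ops(n), htee_decrypt_ops(n), original_ops(n))
```

For the largest supported values the HTEE counts stay in the thousands, while the two-bucket count grows to roughly two billion.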
Lessons Learned
Encountered and solved implementation challenges
Null bytes, memory management, hash processing
PostgreSQL extension in Windows environment
Interfacing with PostgreSQL backend, operating on data types
Challenges in algorithm design
Properly protecting key information in the transformation process
Adapting key transformation for a database environment
Created custom key generation for random 512 bit keys
Generating simple random strings with the OpenSSL package proved difficult
Effect of implementation on security
Processing time can expose information about plaintext values
Effect of small input values
Can be mitigated by expanding the size of the unique value
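For comparison with the custom key generation mentioned above, a random 512-bit key in the same 88-character base64 form can be produced with a cryptographically secure random source. This sketch uses Python's secrets module and is not the project's C implementation:

```python
import base64
import secrets


def generate_key_b64(bits: int = 512) -> str:
    """Return a random key of the given size, base64 encoded."""
    raw = secrets.token_bytes(bits // 8)  # OS-provided secure random bytes
    return base64.b64encode(raw).decode()


key = generate_key_b64()
print(len(key))  # 88 characters for a 512-bit key
```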
Conclusion
HTEE provides strong tamper detection and data integrity
Ciphertext and other related data are tied together
HTEE provides strong confidentiality
Security based on the underlying HMAC and hash functions
Can be improved with stronger hash functions
Where regulatory requirements apply, AES encryption is recommended
HTEE is more efficient on encryption and less efficient on decryption than AES
Ideal for encryption-heavy applications where tamper detection is needed
Examples include archival and auditing systems, including financial information
Additional information available: http://cs.uccs.edu/~gsc/pub/master/bbaker/
Future Work
Plaintext value range:
HTEE scheme is limited to positive integer values
Future work can expand operation to negative values, floating point values, or ASCII encoded data
Floating point can be encoded with multiplication by a positive power of 10; the factor must be stored in the ciphertext data
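The floating point idea can be sketched as scaling a decimal value to an integer and keeping the scale alongside it. This is a hypothetical illustration of the proposed future work, not part of HTEE; it uses Python's decimal module to avoid binary rounding, and how the scale would be stored with the ciphertext is left open.

```python
from decimal import Decimal


def encode_decimal(text: str) -> tuple[int, int]:
    """Return (integer, scale) such that integer = value * 10**scale."""
    value = Decimal(text)
    if not value.is_finite() or value < 0:
        raise ValueError("only finite, non-negative values are supported")
    scale = max(0, -value.as_tuple().exponent)  # digits after the decimal point
    return int(value.scaleb(scale)), scale


def decode_decimal(integer: int, scale: int) -> Decimal:
    """Undo encode_decimal by dividing out the stored power of 10."""
    return Decimal(integer).scaleb(-scale)


integer, scale = encode_decimal("123.45")
print(integer, scale)                   # 12345 2
print(decode_decimal(integer, scale))   # 123.45
```

The integer part could then go through the normal bucket decomposition, with the scale kept as the stored factor.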
Security Proof
A conceptual analysis of cryptographic strength is presented
Future work can prove the security of HTEE, focused on:
HMAC as a pseudo-random function
Effect of unique value and bucket values on HMAC randomness
References
1. Dong Hyeok Lee; You Jin Song; Sung Min Lee; TaekYong Nam; Jong Su Jang, "How to Construct a New Encryption Scheme Supporting Range Queries on Encrypted Database," International Conference on Convergence Information Technology, 2007, pp. 1402-1407, 21-23 Nov. 2007. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4420452&isnumber=4420217
2. Brad Baker, "Analysis of an HMAC Based Database Encryption Scheme," UCCS Summer 2009 Independent Study, July 2009. URI: http://cs.uccs.edu/~gsc/pub/master/bbaker/doc/final_paper_bbaker_cs592.doc
3. Mihir Bellare; Ran Canetti; Hugo Krawczyk, "Keying Hash Functions for Message Authentication," IACR Crypto 1996. URI: http://cseweb.ucsd.edu/users/mihir/papers/kmd5.pdf
4. Mihir Bellare, "New Proofs for NMAC and HMAC: Security without Collision-Resistance," IACR Crypto 2006. URI: http://eprint.iacr.org/2006/043.pdf
5. Mihir Bellare, "Attacks on SHA-1," 2005. URI: http://www.openauthentication.org/pdfs/Attacks%20on%20SHA-1.pdf
6. Pierre-Alain Fouque; Gaëtan Leurent; Phong Q. Nguyen, "Full Key-Recovery Attacks on HMAC/NMAC-MD4 and NMAC-MD5," IACR Crypto 2007. URI: ftp://ftp.di.ens.fr/pub/users/pnguyen/Crypto07.pdf
7. Scott Contini; Yiqun Lisa Yin, "Forgery and Partial Key-Recovery Attacks on HMAC and NMAC using Hash Collisions (Extended Version)," 2006. URI: http://eprint.iacr.org/2006/319.pdf
8. Hyrum Mills; Chris Soghoian; Jon Stone; Malene Wang, "NMAC: Security Proof," 2004. URI: http://www.cs.jhu.edu/~astubble/dss/proofslides.pdf
9. Ran Canetti, "The HMAC construction: A decade later," 2007. URI: http://people.csail.mit.edu/canetti/materials/hmac-10.pdf
10. Yu Sasaki, "A Full Key Recovery Attack on HMAC-AURORA-512," 2009. URI: http://eprint.iacr.org/2009/125.pdf
11. Jongsung Kim; Alex Biryukov; Bart Preneel; Seokhie Hong, "On the Security of HMAC and NMAC Based on HAVAL, MD4, MD5, SHA-0 and SHA-1," 2006. URI: http://eprint.iacr.org/2006/187.pdf
12. NIST, March 2002. FIPS Pub 198 HMAC specification. URI: http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf
13. Wikipedia, October 2009. HMAC reference material. URI: http://en.wikipedia.org/wiki/Hmac
14. Wikipedia, October 2009. SHA-1 reference material. URI: http://en.wikipedia.org/wiki/SHA-1
15. Wikipedia, October 2009. Birthday Attack reference. URI: http://en.wikipedia.org/wiki/Birthday_attack
16. Forouzan, Behrouz A. 2008. Cryptography and Network Security. McGraw Hill Higher Education. ISBN 978-0-07-287022-0
17. Simon Josefsson, 2006. GPL implementation of HMAC-SHA1. URI: http://www.koders.com/c/fidF9A73606BEE357A031F14689D03C089777847EFE.aspx
18. Scott G. Miller, 2006. GPL implementation of SHA-1 hash. URI: http://www.koders.com/c/fid716FD533B2D3ED4F230292A6F9617821C8FDD3D4.aspx
19. Bob Trower, August 2001. Open source base64 encoding implementation, adapted for test program. URI: http://base64.sourceforge.net/b64.c
20. PostgreSQL, October 2009. Server Documentation. URI: http://www.postgresql.org/docs/8.4/static/index.html
21. Gopalan Sivathanu; Charles P. Wright; Erez Zadok, "Ensuring data integrity in storage: techniques and applications," Workshop On Storage Security And Survivability, Nov. 2005. URI: http://doi.acm.org/10.1145/1103780.1103784
22. Vishal Kher; Yongdae Kim, "Securing Distributed Storage: Challenges, Techniques, and Systems," Workshop On Storage Security And Survivability, Nov. 2005. URI: http://doi.acm.org/10.1145/1103780.1103783
23. Kyriacos Pavlou; Richard Snodgrass, "Forensic Analysis of Database Tampering," ACM Transactions on Database Systems (TODS), 2008. URI: http://doi.acm.org/10.1145/1412331.1412342
24. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M.; Rigaud, J.B., "How to Add the Integrity Checking Capability to Block Encryption Algorithms," Ph.D. Research in Microelectronics and Electronics 2006, pp. 369-372. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1689972&isnumber=35631
25. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M., "PE-ICE: Parallelized Encryption and Integrity Checking Engine," IEEE Design and Diagnostics of Electronic Circuits and Systems, 2006, pp. 141-142. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1649595&isnumber=34591
26. Wikipedia, October 2009. Information Security Reference. URI: http://en.wikipedia.org/wiki/Information_security