Tamper evident encryption of integers using keyed Hash Message Authentication Code

Brad Baker

November 16, 2009

UCCS

11/16/2009Brad Baker - Master's Project Report1

Master’s Project Report

Agenda

Introduction / Motivation

Background

Design

Analysis

Implementation

Testing

Conclusion / Future Work

References

Section 1: Introduction

Introduction

Confidentiality and integrity of data are important features in a database environment [16, 26]

Integrity is also referred to as tamper detection for this project

Database tampering is defined as loss of relationship between sensitive data and other data in the record

Standard solutions exist including [16]:

Symmetric and asymmetric encryption for confidentiality

Message authentication codes and hash digests for integrity

Standard solutions require the end user to build a complex process combining hash and encryption functions

This project presents the “HMAC based Tamper Evident Encryption” scheme (HTEE) as an alternative solution

HMAC stands for keyed Hash Message Authentication Code

Motivation

Create an efficient and simple-to-use tamper evident encryption technique

Single step, single column tamper detection

Focus on processing numeric data in a database system

Improve performance of the encryption operation compared to standard approaches

Improve on previous work that introduced an HMAC based encryption/decryption process

Investigate uses of HMAC as an encryption and key generation function

Related Work

File system and application level integrity [21, 22]

Checksums, CRC, RAID Parity, Cryptographic file systems

OpenSSL, Intrusion detection, Tripwire, Samhain

Forensic analysis and tamper detection [23]

Notarization with hash function and reliance on audit log

Analysis of how and when data was tampered

Parallel encryption and authentication code [24, 25]

Various implementations of encryption combined with MAC

Original HMAC encryption scheme [1]

Integer encryption with HMAC

Foundation for HTEE tamper detection

Comparison of Solutions

Solutions for integrity and confidentiality considered:

HTEE: Encryption and tamper detection with HMAC function

AES & SHA-1: Encryption and hash, detects tampering

AES: Encryption, detects random changes only

Each provides a unique benefit:

Solution      Encryption Strength   Tamper Detection   Simple Usage   Encrypt Efficiency   Decrypt Efficiency
HTEE          Medium/High*          Yes                Yes            Fast                 Slow
AES & SHA-1   High                  Yes                No             Moderate             Moderate
AES           High                  No                 Yes            Moderate             Moderate

* Security of the HTEE scheme is variable and relies on the hash algorithm used.

Section 2: Background

Background - HMAC

HMAC – keyed Hash Message Authentication Code [13]

Produces a secure authentication code (digest) using message and secret key, providing integrity and authenticity

Proposed in [3], and standardized as FIPS PUB 198 [12]

Unauthorized individual cannot generate digest without key

Can use any underlying hash function, MD5, SHA-1, etc.

Function generates two keys from secret key

The HMAC process is:

HMAC(key, msg) = Hash((key XOR opad) || Hash((key XOR ipad) || msg))

Where opad = “0x5c5c…” and ipad = “0x3636…”
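The HMAC formula above can be written out directly and checked against a standard implementation. The following is a minimal sketch in Python, assuming SHA-1 as the underlying hash (as used later in this project). The key and message values are illustrative only and do not come from the presentation.

```python
import hashlib
import hmac


def hmac_sha1(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA1 built step by step from the formula above."""
    block_size = hashlib.sha1().block_size  # 64 bytes for SHA-1
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()
    key = key.ljust(block_size, b"\x00")

    # The two derived keys: key XOR ipad and key XOR opad.
    ipad_key = bytes(b ^ 0x36 for b in key)
    opad_key = bytes(b ^ 0x5C for b in key)

    inner = hashlib.sha1(ipad_key + msg).digest()
    return hashlib.sha1(opad_key + inner).digest()


key = b"demo-secret-key"  # illustrative key
msg = b"113"              # illustrative message
digest = hmac_sha1(key, msg)

# Cross-check against the standard library implementation.
reference = hmac.new(key, msg, hashlib.sha1).digest()
print(digest == reference)
```

The comparison with the standard library `hmac` module is there only to confirm that the hand-written steps follow the same construction.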

Background – Integer Encryption

Integer encryption with HMAC

Original HMAC integer encryption scheme proposed in [1]

The scheme operates on integer plaintext values, decomposed into two components or buckets

Encryption is performed with HMAC calculation, decryption is performed with exhaustive search

The scheme is inefficient on encryption and for large integers

Encryption is recursive HMAC rather than direct calculation

Two buckets result in large search ranges for decryption

A detailed analysis, including testing results, is available in [2]

HTEE is based on this scheme, and improves upon it

Original HMAC process

Introductory Example

Original HMAC example:

Plaintext integer value 567,212 and bucket size 5,000

Bucket 1 = 113, Bucket 2 = 2212

Plaintext can be retrieved as (567,212 = 113*5,000 + 2212)

HMAC digest / ciphertext output:

113 becomes “fG7Agfw4OErQw+IX2iBw853LBKg=”

2212 becomes “YOLpnTHGIHurCvkrgczFMM1C5PI=”

Decryption searches through 5,000 values to find a ciphertext match for each bucket
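The two-bucket example above can be illustrated with a short Python sketch. It is simplified: it uses one direct HMAC-SHA1 call per bucket, whereas the original scheme computes HMAC recursively. The key, the decimal-string encoding of bucket values, and all function names are assumptions made for illustration, so the digests will not match the ones shown on the slide.

```python
import base64
import hashlib
import hmac
from typing import Optional, Tuple

BUCKET_SIZE = 5000


def bucket_digest(key: bytes, bucket: int) -> str:
    """Base64 HMAC-SHA1 digest of one bucket (decimal ASCII encoding assumed)."""
    raw = hmac.new(key, str(bucket).encode(), hashlib.sha1).digest()
    return base64.b64encode(raw).decode()


def encrypt(value: int, key: bytes) -> Tuple[str, str]:
    """Split the value into two buckets and digest each one."""
    high, low = divmod(value, BUCKET_SIZE)
    return bucket_digest(key, high), bucket_digest(key, low)


def search(key: bytes, target: str) -> Optional[int]:
    """Exhaustive search over all bucket values for a digest match."""
    for candidate in range(BUCKET_SIZE):
        if bucket_digest(key, candidate) == target:
            return candidate
    return None


def decrypt(cipher: Tuple[str, str], key: bytes) -> Optional[int]:
    high = search(key, cipher[0])
    low = search(key, cipher[1])
    if high is None or low is None:
        return None
    return high * BUCKET_SIZE + low


demo_key = b"illustrative-key"  # not the key used in the presentation
cipher = encrypt(567212, demo_key)
recovered = decrypt(cipher, demo_key)
print(divmod(567212, BUCKET_SIZE), recovered)
```

Each call to `search` may evaluate up to 5,000 digests, which is the decryption cost the slide refers to.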

Section 3: Design

HTEE Design

Processes positive integer values

Decomposition of plaintext into multiple buckets of size 1,000

For example: 2,412,345,678 becomes four buckets:

Bucket 1 = 2; Bucket 2 = 412; Bucket 3 = 345; Bucket 4 = 678;

In the original scheme, a 50,000 bucket size would make two buckets:

Bucket 1 = 48246; Bucket 2 = 45678;

Key transformation based on a unique value related to plaintext

Each encryption operation uses a different key

Encryption keys depend on original key and unique related data

The unique value is any data that must remain the same in relation to the plaintext, for example:

Record’s primary key, other unique data, hash digest of unique data
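The bucket decomposition described on this slide amounts to writing the plaintext in base 1,000. A minimal Python sketch follows; the function names are hypothetical and chosen only for this illustration.

```python
from typing import List


def to_buckets(value: int, bucket_size: int = 1000) -> List[int]:
    """Decompose a non-negative integer into buckets, most significant first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    buckets = []
    while True:
        value, remainder = divmod(value, bucket_size)
        buckets.append(remainder)
        if value == 0:
            break
    return buckets[::-1]


def from_buckets(buckets: List[int], bucket_size: int = 1000) -> int:
    """Rebuild the integer from its buckets."""
    value = 0
    for bucket in buckets:
        value = value * bucket_size + bucket
    return value


htee_buckets = to_buckets(2412345678)             # bucket size 1,000
original_buckets = divmod(2412345678, 50000)      # two buckets of size 50,000
print(htee_buckets, original_buckets)
```

With a bucket size of 1,000 each bucket needs at most 1,000 search steps on decryption, compared with up to 50,000 per bucket in the two-bucket case shown above.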

HTEE Design

Encryption operation:

Calculate HMAC digest for each bucket

Decryption operation:

Search for digest match between ciphertext and all values (0-999)

Tamper detection:

Decryption operation cannot find matching value

Two key transformation functions used: element and bucket

Element transformation creates a key for each plaintext

HMAC executed recursively four times with unique value and original key

Bucket transformation creates key for each bucket value

HMAC executed iteratively with ciphertext output and original key

Encryption performed with transformed keys, not original key

HTEE Design

HMAC digests for all buckets in a plaintext are concatenated to form ciphertext

Decryption follows key generation process, plus an exhaustive search for ciphertext match.

No match indicates tampering: the ciphertext or the unique related data has changed

The HTEE process is:

HTEE(Plaintext, Key, Unique) = HMAC(Bucket1, fKey(Key, Unique)) || HMAC(Bucket2, fKey(Key, Unique)) || … || HMAC(BucketN, fKey(Key, Unique))

Where fKey is the key transformation (element and bucket) and Bucket 1 through Bucket N are decomposed from Plaintext
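The process above can be sketched end to end in Python. The key transformation here is only a stand-in: the presentation defines the element and bucket transformations in diagrams that are not reproduced in this transcript, so `element_key` and `next_bucket_key` are hypothetical functions that merely follow the textual description (HMAC chained four times with the unique value, then a per-bucket key derived from the previous ciphertext block and the original key). The key, the decimal encoding of buckets, and all names are assumptions.

```python
import base64
import hashlib
import hmac
from typing import Optional

BUCKET_SIZE = 1000
DIGEST_CHARS = 28  # base64 length of one 20-byte SHA-1 digest


def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha1).digest()


def element_key(key: bytes, unique: bytes) -> bytes:
    # Hypothetical element transformation: chain HMAC four times.
    derived = key
    for _ in range(4):
        derived = _mac(derived, unique)
    return derived


def next_bucket_key(key: bytes, prev_key: bytes, prev_digest: bytes) -> bytes:
    # Hypothetical bucket transformation: mix in the previous ciphertext block.
    return _mac(key, prev_key + prev_digest)


def htee_encrypt(plaintext: int, key: bytes, unique: bytes) -> str:
    if plaintext < 0:
        raise ValueError("only non-negative integers are supported")
    buckets = []
    while True:
        plaintext, remainder = divmod(plaintext, BUCKET_SIZE)
        buckets.insert(0, remainder)
        if plaintext == 0:
            break
    bucket_key = element_key(key, unique)
    blocks = []
    for bucket in buckets:
        digest = _mac(bucket_key, str(bucket).encode())
        blocks.append(base64.b64encode(digest).decode())
        bucket_key = next_bucket_key(key, bucket_key, digest)
    return "".join(blocks)


def htee_decrypt(ciphertext: str, key: bytes, unique: bytes) -> Optional[int]:
    """Return the plaintext, or None when no match is found (tampering)."""
    if not ciphertext:
        return None
    bucket_key = element_key(key, unique)
    value = 0
    for start in range(0, len(ciphertext), DIGEST_CHARS):
        block = ciphertext[start:start + DIGEST_CHARS]
        match = None
        for candidate in range(BUCKET_SIZE):  # exhaustive search over 0-999
            digest = _mac(bucket_key, str(candidate).encode())
            if base64.b64encode(digest).decode() == block:
                match = (candidate, digest)
                break
        if match is None:
            return None
        value = value * BUCKET_SIZE + match[0]
        bucket_key = next_bucket_key(key, bucket_key, match[1])
    return value


demo_key = bytes(range(64))  # illustrative 512-bit key
cipher = htee_encrypt(654321, demo_key, b"1001")
print(htee_decrypt(cipher, demo_key, b"1001"))  # original unique value
print(htee_decrypt(cipher, demo_key, b"1002"))  # changed unique value
```

Decrypting with a changed unique value produces a different key sequence, so the search finds no match and the function reports tampering by returning None.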

Example of HTEE

Record contents (DATA value is sensitive, must be encrypted):

ID = 1001; DATA = 654321

After decomposition of DATA value:

bucket1 = 654; bucket2 = 321

Original Key, 512 bit:

fwWe6MNL5WC9gRgCfVbUsuFLeX8IfwKbnkWmlKhj5Tx2Ods+VkmKS73AeFt0EsXy+zmfWEsyOEaKSx/oYMSmRA==

Generated keys for buckets (dependent on ID value and original key):

Bucket1 key: qi5K5JmBNRfOuPf8qQvgPVVZ5nHZjlgoDb8un4GS/NxFhbRNdnE5B80kPe3rpqIvHRDzdZsiEmpk+2Ozcb5yXg==

Bucket2 key: ylT5vKaGkdc1XMtW0z+HOb1Td2eqLkrkmYE1F8649/ypC+A9VVnmcdmOWCgNvy6fgZL83EWFtE12cTkHzSQ97Q==

Ciphertext result from HMAC (bucket, key):

Bucket1 cipher: Ziuytd9t8Vn1h5ldqZjv57sTe2k=

Bucket2 cipher: uk/ACtScX2oxJUPyEPdPWSPCXQk=

Final Ciphertext: Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=

Final Output: ID = 1001; CIPHER = Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=

HTEE Encryption Concept

Element Key Transformation [3, 4, 9, 11]

Bucket Key Transformation

Section 4: Analysis

Security Analysis

Cryptographic strength of HTEE is based on HMAC

Key transformation and encryption use HMAC function

Cryptographic strength of HMAC is based on the underlying hash function [3, 4, 5]

For this project, SHA-1 is used as the underlying hash

Hash can be changed for additional security of HMAC [3]

HMAC is proven secure from forgery if the hash compression operation is a pseudo-random function [4, 7, 11]

HMAC is not susceptible to the hash collision attacks that affect MD5 and SHA-1 [3, 4, 5]

Collisions are still produced but more difficult to attack

Security Analysis

HMAC can be attacked by forgery or key recovery attacks [3, 6]

Key recovery attacks typically have chosen or known plaintext

The birthday paradox governs the probability of finding an HMAC collision [3, 5, 11, 15]

For SHA-1, 2^80 (message, digest) pairs from HMAC are needed

Research shows key recovery attacks that are better than brute force, but still worse than birthday attack [6, 7, 10]

For the HTEE scheme key recovery attacks are the primary concern

Forgeries are less of a concern as they could only break a single record’s tamper detection capability

Security Analysis

The layering of key generation in HTEE makes analysis difficult:

Attacker knows the unique value and final digest/ciphertext

Given the digest it is difficult to find the key or message value

Given the unique value, it is difficult to obtain original key

Consider general form: HTEE(P,K,U) = HMAC(P, fK(K,U))

Intermediate keys and plaintexts are masked and HMAC is difficult to break if using an effective underlying hash

HMAC operation protects plaintext and intermediate key, makes derivation of original key more difficult

A key recovery attack will take over 2^80 message pairs

Most applications will not use the same secret key for a large number of records (over 2^40, approx. 1 trillion)

This is short of the more than 2^80 pairs needed for key recovery

Tamper Detection Analysis

HTEE creates a distinct key sequence based on the unique value related to plaintext

Identical keys only occur on hash collisions

This is improbable unless a very large number of records are processed

If ciphertext or unique value is changed then the key sequence or HMAC output will differ

Tamper detection will only fail if the original and changed HTEE processes produce a collision

Probability of collision for each bucket is approx. 3.42x10^-43

Based on the birthday attack with 1,000 values [15, 16]

Probability is P = 1 - e^(-k^2/(2N)) with k = 1000 and N = 2^160
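The per-bucket collision probability can be reproduced numerically from the birthday formula with k = 1000 and N = 2^160. The short Python check below uses `math.expm1` because computing 1 - exp(-x) directly rounds to zero in floating point for an exponent this small.

```python
import math

k = 1000      # candidate values searched per bucket
N = 2 ** 160  # size of the SHA-1 digest space

x = k * k / (2 * N)
# expm1(-x) = exp(-x) - 1, evaluated without loss of precision for tiny x.
p = -math.expm1(-x)
print(f"{p:.2e}")  # approximately 3.42e-43, in line with the figure above
```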

Section 5: Implementation

Implementation

HTEE process implemented as a PostgreSQL add-on and a command line program

Built in the C language

Microsoft Visual C++ 2008 Express Edition

PostgreSQL server versions 8.3.8 and 8.4.1

Implemented versions:

Command line program used for validation and flat file processing

PostgreSQL add-on is considered the primary implementation

Two functions added to PostgreSQL server:

Encryption: htee_enc(plaintext, unique value)

Decryption: htee_dec(ciphertext, unique value)

Simple operation, example SQL for encryption:

SELECT htee_enc(data,unique) FROM test

Maximum of six buckets or 9x10^17 integer value supported

Implementation

SHA-1 used for underlying hash function

Specifies use of 512 bit key, blocks of 160 bit ciphertext output

Input key is 88 base64 characters, output is 28 base64 characters per bucket value

Ciphertext output for six buckets is 168 bytes of base64 encoded data

Comparable AES output is 116 bytes, HTEE is a 44% increase

Compared to plaintext data, a 21-fold increase

Several challenges encountered:

Extending PostgreSQL in Windows environment

Interfacing with the PostgreSQL backend
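The key and ciphertext sizes quoted on this slide follow from base64 encoding, which emits four characters for every three input bytes (rounded up). A small Python check is sketched below; the 8-byte plaintext size is an assumption (a 64-bit integer column) chosen because it is consistent with the 21-fold figure.

```python
import base64

key_chars = len(base64.b64encode(bytes(64)))     # 512-bit key
digest_chars = len(base64.b64encode(bytes(20)))  # 160-bit SHA-1 digest

six_bucket_chars = 6 * digest_chars              # maximum ciphertext length
plaintext_bytes = 8                              # assumed 64-bit integer
expansion = six_bucket_chars / plaintext_bytes

print(key_chars, digest_chars, six_bucket_chars, expansion)
```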

Section 6: Testing

Testing

Compared three methods for encryption:

Basic AES (aes1): Does not provide tamper detection

AES & unique value (aes2): Provides tamper detection

HTEE scheme: Provides tamper detection

Tested six datasets, 20,000 random integers in each

Each dataset with different number of buckets, one through six

Results verified tamper detection with aes2 and HTEE methods

HTEE on average was four times faster on encryption but four times slower on decryption than AES

Performance comparison

HTEE performance details

Performance analysis

The performance of HTEE and the original scheme [1] are compared with algorithmic analysis

HTEE is significantly more efficient on encryption, and on decryption for large numbers [2]

Cost of the original scheme increases with n^0.5, cost of HTEE increases with log1000(n)

Testing verifies that HTEE is much faster for similar datasets

The large bucket size required for two buckets makes decryption prohibitively expensive

Encryption Scheme      Relative complexity
HTEE Encryption        2*log1000(n)        Constant
HTEE Decryption        1001*log1000(n)     Constant
Original Encryption    2*n^0.5             Polynomial
Original Decryption    2*n^0.5             Polynomial
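To get a feel for how far apart these growth rates are, the relative complexity expressions from the table can be evaluated for a sample plaintext magnitude n. The Python sketch below only plugs numbers into the table's formulas; it does not measure the implementation, and the function name is hypothetical.

```python
import math


def relative_costs(n: int) -> dict:
    """Evaluate the table's relative complexity expressions for magnitude n."""
    log_term = math.log(n) / math.log(1000)  # log base 1000 of n
    root_term = math.sqrt(n)                 # n^0.5
    return {
        "HTEE encryption": 2 * log_term,
        "HTEE decryption": 1001 * log_term,
        "Original encryption": 2 * root_term,
        "Original decryption": 2 * root_term,
    }


costs = relative_costs(10 ** 12)
for scheme, cost in costs.items():
    print(f"{scheme:22s} {cost:,.0f}")
```

For n = 10^12 the HTEE decryption expression is on the order of a few thousand operations, while the original scheme's expression is on the order of millions.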

Section 7: Conclusion

Lessons Learned

Encountered and solved implementation challenges

Null bytes, memory management, hash processing

PostgreSQL extension in Windows environment

Interfacing with PostgreSQL backend, operating on data types

Challenges in algorithm design

Properly protecting key information in the transformation process

Adapting key transformation for a database environment

Created custom key generation for random 512 bit keys

Generating simple random strings with the OpenSSL package proved difficult

Effect of implementation on security

Processing time exposing information about plaintext values

Effect of small input values

Can be mitigated by expanding the size of the unique value

Conclusion

HTEE provides strong tamper detection and data integrity

Ciphertext and other related data are tied together

HTEE provides strong confidentiality

Security based on the underlying HMAC and hash functions

Can be improved with stronger hash functions

For regulatory requirements recommend AES encryption

HTEE is more efficient on encryption and less efficient on decryption than AES

Ideal for encryption-heavy applications where tamper detection is needed

Examples include archival and auditing systems, including financial information

Additional information available: http://cs.uccs.edu/~gsc/pub/master/bbaker/

Future Work

Plaintext value range:

HTEE scheme is limited to positive integer values

Future work can expand operation to negative values, floating point values, or ASCII encoded data

Floating point values can be encoded by multiplying by a positive power of 10; the factor must be stored in the ciphertext data

Security Proof

A conceptual analysis of cryptographic strength is presented

Future work can prove the security of HTEE, focused on:

HMAC as a pseudo-random function

Effect of unique value and bucket values on HMAC randomness

Questions?

References

1. Dong Hyeok Lee; You Jin Song; Sung Min Lee; TaekYong Nam; Jong Su Jang, "How to Construct a New Encryption Scheme Supporting Range Queries on Encrypted Database," International Conference on Convergence Information Technology, 2007, pp. 1402-1407, 21-23 Nov. 2007. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4420452&isnumber=4420217

2. Brad Baker, "Analysis of an HMAC Based Database Encryption Scheme," UCCS Summer 2009 Independent Study, July 2009. URI: http://cs.uccs.edu/~gsc/pub/master/bbaker/doc/final_paper_bbaker_cs592.doc

3. Mihir Bellare; Ran Canetti; Hugo Krawczyk, "Keying Hash Functions for Message Authentication," IACR Crypto 1996. URI: http://cseweb.ucsd.edu/users/mihir/papers/kmd5.pdf

4. Mihir Bellare, "New Proofs for NMAC and HMAC: Security without Collision-Resistance," IACR Crypto 2006. URI: http://eprint.iacr.org/2006/043.pdf

5. Mihir Bellare, "Attacks on SHA-1," 2005. URI: http://www.openauthentication.org/pdfs/Attacks%20on%20SHA-1.pdf

6. Pierre-Alain Fouque; Gaëtan Leurent; Phong Q. Nguyen, "Full Key-Recovery Attacks on HMAC/NMAC-MD4 and NMAC-MD5," IACR Crypto 2007. URI: ftp://ftp.di.ens.fr/pub/users/pnguyen/Crypto07.pdf

7. Scott Contini; Yiqun Lisa Yin, "Forgery and Partial Key-Recovery Attacks on HMAC and NMAC using Hash Collisions (Extended Version)," 2006. URI: http://eprint.iacr.org/2006/319.pdf

8. Hyrum Mills; Chris Soghoian; Jon Stone; Malene Wang, "NMAC: Security Proof," 2004. URI: http://www.cs.jhu.edu/~astubble/dss/proofslides.pdf

9. Ran Canetti, "The HMAC construction: A decade later," 2007. URI: http://people.csail.mit.edu/canetti/materials/hmac-10.pdf

10. Yu Sasaki, "A Full Key Recovery Attack on HMAC-AURORA-512," 2009. URI: http://eprint.iacr.org/2009/125.pdf

11. Jongsung Kim; Alex Biryukov; Bart Preneel; Seokhie Hong, "On the Security of HMAC and NMAC Based on HAVAL, MD4, MD5, SHA-0 and SHA-1," 2006. URI: http://eprint.iacr.org/2006/187.pdf

12. NIST, March 2002. FIPS Pub 198 HMAC specification. URI: http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf

13. Wikipedia, October 2009. HMAC reference material. URI: http://en.wikipedia.org/wiki/Hmac

14. Wikipedia, October 2009. SHA-1 reference material. URI: http://en.wikipedia.org/wiki/SHA-1

15. Wikipedia, October 2009. Birthday Attack reference. URI: http://en.wikipedia.org/wiki/Birthday_attack

16. Forouzan, Behrouz A. 2008. Cryptography and Network Security. McGraw-Hill Higher Education. ISBN 978-0-07-287022-0

17. Simon Josefsson, 2006. GPL implementation of HMAC-SHA1. URI: http://www.koders.com/c/fidF9A73606BEE357A031F14689D03C089777847EFE.aspx

18. Scott G. Miller, 2006. GPL implementation of SHA-1 hash. URI: http://www.koders.com/c/fid716FD533B2D3ED4F230292A6F9617821C8FDD3D4.aspx

19. Bob Trower, August 2001. Open source base64 encoding implementation, adapted for test program. URI: http://base64.sourceforge.net/b64.c

20. PostgreSQL, October 2009. Server Documentation. URI: http://www.postgresql.org/docs/8.4/static/index.html

21. Gopalan Sivathanu; Charles P. Wright; Erez Zadok, "Ensuring data integrity in storage: techniques and applications," Workshop On Storage Security And Survivability, Nov. 2005. URI: http://doi.acm.org/10.1145/1103780.1103784

22. Vishal Kher; Yongdae Kim, "Securing Distributed Storage: Challenges, Techniques, and Systems," Workshop On Storage Security And Survivability, Nov. 2005. URI: http://doi.acm.org/10.1145/1103780.1103783

23. Kyriacos Pavlou; Richard Snodgrass, "Forensic Analysis of Database Tampering," ACM Transactions on Database Systems (TODS), 2008. URI: http://doi.acm.org/10.1145/1412331.1412342

24. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M.; Rigaud, J.B., "How to Add the Integrity Checking Capability to Block Encryption Algorithms," Research in Microelectronics and Electronics 2006, Ph.D., pp. 369-372. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1689972&isnumber=35631

25. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M., "PE-ICE: Parallelized Encryption and Integrity Checking Engine," Design and Diagnostics of Electronic Circuits and Systems, 2006 IEEE, pp. 141-142. URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1649595&isnumber=34591

26. Wikipedia, October 2009. Information Security Reference. URI: http://en.wikipedia.org/wiki/Information_security