Ensuring data integrity with tamper evident encryption of integers ...

Tamper evident encryption of integers using
keyed Hash Message Authentication Code

Brad Baker

November 16, 2009

UCCS

11/16/2009

Brad Baker
-

Master's Project Report

1

Master’s Project Report


Agenda



Introduction / Motivation


Background


Design


Analysis


Implementation


Testing


Conclusion / Future Work


References

Section 1:

Introduction


Introduction



Confidentiality and integrity of data are important features in a
database environment [16, 26]


Integrity is also referred to as tamper detection for this project


Database tampering is defined as loss of relationship between
sensitive data and other data in the record


Standard solutions exist including [16]:


Symmetric and asymmetric encryption for confidentiality


Message authentication codes and hash digests for integrity


Standard solutions require the end user to build a complex
process combining hash and encryption functions


This project presents the “HMAC based Tamper Evident
Encryption” scheme (HTEE) as an alternative solution


HMAC stands for keyed Hash Message Authentication Code

Motivation



Create an efficient, simple-to-use tamper evident
encryption technique


Single step, single column tamper detection


Focus on processing numeric data in a database system


Improve performance of the encryption operation
compared to standard approaches


Improve on previous work that introduced an HMAC
based encryption/decryption process


Investigate uses of HMAC as an encryption and key
generation function

Related Work



File system and application level integrity [21, 22]


Checksums, CRC, RAID Parity, Cryptographic file systems


OpenSSL, intrusion detection, Tripwire, Samhain


Forensic analysis and tamper detection [23]


Notarization with hash function and reliance on audit log


Analysis of how and when data was tampered


Parallel encryption and authentication code [24, 25]


Various implementations of encryption combined with MAC


Original HMAC encryption scheme [1]


Integer encryption with HMAC


Foundation for HTEE tamper detection


Comparison of Solutions



Solutions for integrity and confidentiality considered:


HTEE: Encryption and tamper detection with HMAC function


AES & SHA-1: Encryption and hash, detects tampering


AES: Encryption, detects random changes only


Each provides a unique benefit:


Solution      Encryption Strength   Tamper Detection   Simple Usage   Encrypt Efficiency   Decrypt Efficiency
HTEE          Medium/High *         Yes                Yes            Fast                 Slow
AES & SHA-1   High                  Yes                No             Moderate             Moderate
AES           High                  No                 Yes            Moderate             Moderate

* Security of the HTEE scheme is variable and relies on the hash algorithm used.

Section 2:

Background


Background - HMAC



HMAC


keyed Hash Message Authentication Code [13]


Produces a secure authentication code (digest) using message and
secret key, providing integrity and authenticity


Proposed in [3], and standardized as FIPS PUB 198 [12]


Unauthorized individual cannot generate digest without key


Can use any underlying hash function: MD5, SHA-1, etc.


Function generates two keys from secret key


The HMAC process is:


HMAC(key, msg) = Hash((key XOR opad) || Hash((key XOR ipad) || msg))


Where opad = 0x5c5c… and ipad = 0x3636…
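The HMAC construction on this slide can be checked with a short Python sketch. It assumes SHA-1 as the underlying hash (the choice made later in this project) and adds the key-padding step from the HMAC standard, which the slide formula leaves implicit. The function name hmac_sha1 is illustrative only.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 processes its input in 64-byte blocks


def hmac_sha1(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA1 written out step by step from the slide formula."""
    # Standard key preparation: hash over-long keys, then zero-pad to one block.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    ipad_key = bytes(b ^ 0x36 for b in key)  # key XOR ipad
    opad_key = bytes(b ^ 0x5C for b in key)  # key XOR opad

    inner = hashlib.sha1(ipad_key + msg).digest()   # Hash((key XOR ipad) || msg)
    return hashlib.sha1(opad_key + inner).digest()  # Hash((key XOR opad) || inner)


digest = hmac_sha1(b"secret key", b"message")
# Cross-check against the standard library implementation.
assert digest == hmac.new(b"secret key", b"message", hashlib.sha1).digest()
print(digest.hex())
```

The two derived values ipad_key and opad_key are the "two keys" generated from the secret key.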

Background - Integer Encryption



Integer encryption with HMAC


Original HMAC integer encryption scheme proposed in [1]


The scheme operates on integer plaintext values, decomposed
into two components or buckets


Encryption is performed with HMAC calculation, decryption is
performed with exhaustive search


The scheme is inefficient on encryption and for large integers


Encryption is recursive HMAC rather than direct calculation


Two buckets result in large search ranges for decryption


A detailed analysis, including test results, is available in [2]


HTEE is based on this scheme, and improves upon it

Original HMAC process


Introductory Example



Original HMAC example:


Plaintext integer value 567,212 and bucket size 5,000


Bucket 1 = 113, Bucket 2 = 2212


Plaintext can be retrieved as (567,212 = 113*5,000 + 2212)


HMAC digest / ciphertext output:


113 becomes "fG7Agfw4OErQw+IX2iBw853LBKg="


2212 becomes "YOLpnTHGIHurCvkrgczFMM1C5PI="


Decryption searches through 5,000 values to find a ciphertext
match for each bucket
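The bucket arithmetic in this example is plain integer division. The sketch below reproduces only the decomposition and recomposition; it does not attempt the recursive HMAC encryption of [1], and the function names are illustrative.

```python
from typing import Tuple


def split_two_buckets(value: int, bucket_size: int) -> Tuple[int, int]:
    """Split a non-negative integer into (bucket1, bucket2) for a given bucket size."""
    if value < 0 or bucket_size <= 0:
        raise ValueError("value must be >= 0 and bucket_size must be > 0")
    return divmod(value, bucket_size)


def join_two_buckets(bucket1: int, bucket2: int, bucket_size: int) -> int:
    """Recover the plaintext from its two buckets."""
    return bucket1 * bucket_size + bucket2


bucket1, bucket2 = split_two_buckets(567_212, 5_000)
print(bucket1, bucket2)                            # 113 2212
print(join_two_buckets(bucket1, bucket2, 5_000))   # 567212
```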

Section 3:

Design


HTEE Design



Processes positive integer values


Decomposition of plaintext into multiple buckets of size 1,000


For example: 2,412,345,678 becomes four buckets:


Bucket 1 = 2; Bucket 2 = 412; Bucket 3 = 345; Bucket 4 = 678;


In the original scheme, a 50,000 bucket size would make two buckets:


Bucket 1 = 48246; Bucket 2 = 45678;


Key transformation based on a unique value related to plaintext


Each encryption operation uses a different key


Encryption keys depend on original key and unique related data


The unique value is any data that must remain the same in
relation to the plaintext, for example:


Record’s primary key, other unique data, hash digest of unique data
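The decomposition step on this slide amounts to writing the plaintext in base 1,000. A minimal sketch follows, with illustrative function names; passing a bucket size of 50,000 reproduces the two-bucket split of the original scheme for comparison.

```python
from typing import List


def decompose(value: int, bucket_size: int = 1000) -> List[int]:
    """Split a non-negative integer into buckets, most significant bucket first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    buckets = []
    while True:
        value, remainder = divmod(value, bucket_size)
        buckets.append(remainder)
        if value == 0:
            break
    buckets.reverse()
    return buckets


def recompose(buckets: List[int], bucket_size: int = 1000) -> int:
    """Inverse of decompose()."""
    value = 0
    for bucket in buckets:
        value = value * bucket_size + bucket
    return value


print(decompose(2_412_345_678))           # [2, 412, 345, 678]
print(decompose(2_412_345_678, 50_000))   # [48246, 45678]
```

With a bucket size of 1,000, each bucket can take only 1,000 values, which keeps the decryption search per bucket small.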

HTEE Design



Encryption operation:


Calculate HMAC digest for each bucket


Decryption operation:


Search for digest match between ciphertext and all values (0-999)


Tamper detection:


Decryption operation cannot find matching value


Two key transformation functions used: element and bucket


Element transformation creates a key for each plaintext


HMAC executed recursively four times with unique value and original key


Bucket transformation creates key for each bucket value


HMAC executed iteratively with ciphertext output and original key


Encryption performed with transformed keys, not original key


HTEE Design



HMAC digests for all buckets in a plaintext are
concatenated to form the ciphertext


Decryption follows the key generation process, plus an
exhaustive search for a ciphertext match


No match indicates tampering: the ciphertext or the
unique related data has changed


The HTEE process is:


HTEE(Plaintext, Key, Unique) =
    HMAC(Bucket1, f_Key(Key, Unique)) ||
    HMAC(Bucket2, f_Key(Key, Unique)) || … ||
    HMAC(BucketN, f_Key(Key, Unique))


Where f_Key is the key transformation (element and bucket) and
Bucket 1 through Bucket N are decomposed from Plaintext
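The encrypt, decrypt, and tamper-detection flow can be sketched in Python as follows. This is not the author's C implementation. The slides do not give the internals of the element and bucket key transformations, so element_key and bucket_key below are assumptions chosen only to match the description (four chained HMAC rounds over the unique value, then one HMAC per bucket that mixes in the previous ciphertext output and the original key). All function names are hypothetical.

```python
import base64
import hashlib
import hmac

BUCKET_SIZE = 1000
DIGEST_CHARS = 28  # base64 length of one 20-byte SHA-1 digest


def _hmac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha1).digest()


def element_key(master_key: bytes, unique: str) -> bytes:
    """ASSUMED element transformation: four chained HMAC rounds over the unique value."""
    key = master_key
    for _ in range(4):
        key = _hmac(key, unique.encode())
    return key


def bucket_key(master_key: bytes, prev_key: bytes, prev_digest: bytes) -> bytes:
    """ASSUMED bucket transformation: mixes previous key and ciphertext output."""
    return _hmac(master_key, prev_key + prev_digest)


def decompose(value: int) -> list:
    buckets = []
    while True:
        value, remainder = divmod(value, BUCKET_SIZE)
        buckets.append(remainder)
        if value == 0:
            return buckets[::-1]


def htee_encrypt(value: int, master_key: bytes, unique: str) -> str:
    key, digest, parts = element_key(master_key, unique), b"", []
    for bucket in decompose(value):
        key = bucket_key(master_key, key, digest)
        digest = _hmac(key, str(bucket).encode())
        parts.append(base64.b64encode(digest).decode())
    return "".join(parts)  # concatenated per-bucket digests


def htee_decrypt(ciphertext: str, master_key: bytes, unique: str) -> int:
    if not ciphertext:
        raise ValueError("empty ciphertext")
    key, digest, value = element_key(master_key, unique), b"", 0
    for i in range(0, len(ciphertext), DIGEST_CHARS):
        target = base64.b64decode(ciphertext[i:i + DIGEST_CHARS])
        key = bucket_key(master_key, key, digest)
        # Exhaustive search over all 1,000 possible bucket values.
        for candidate in range(BUCKET_SIZE):
            if hmac.compare_digest(_hmac(key, str(candidate).encode()), target):
                break
        else:
            raise ValueError("tamper detected: no bucket value matches")
        value = value * BUCKET_SIZE + candidate
        digest = target
    return value


master = bytes(range(64))  # placeholder 512-bit key, for illustration only
cipher = htee_encrypt(654321, master, "1001")
print(htee_decrypt(cipher, master, "1001"))  # 654321
```

Decrypting the same ciphertext with a different unique value (for example "1002") derives a different key sequence, so the search finds no match and the sketch raises ValueError, which is the tamper signal.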

Example of HTEE



Record contents (DATA value is sensitive, must be encrypted):


ID = 1001; DATA = 654321


After decomposition of DATA value:


bucket1 = 654; bucket2 = 321


Original Key, 512 bit:


fwWe6MNL5WC9gRgCfVbUsuFLeX8IfwKbnkWmlKhj5Tx2Ods+VkmKS73AeFt0EsXy+zmfWEsyOEaKSx/oYMSmRA==


Generated keys for buckets (dependent on ID value and original key):


Bucket1 key:
qi5K5JmBNRfOuPf8qQvgPVVZ5nHZjlgoDb8un4GS/NxFhbRNdnE5B80kPe3rpqIvHRDzdZsiEmpk+2Ozcb5yXg==


Bucket2 key:
ylT5vKaGkdc1XMtW0z+HOb1Td2eqLkrkmYE1F8649/ypC+A9VVnmcdmOWCgNvy6fgZL83EWFtE12cTkHzSQ97Q==


Ciphertext result from HMAC(bucket, key):


Bucket1 cipher: Ziuytd9t8Vn1h5ldqZjv57sTe2k=


Bucket2 cipher: uk/ACtScX2oxJUPyEPdPWSPCXQk=


Final Ciphertext: Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=


Final Output:


ID = 1001; CIPHER = Ziuytd9t8Vn1h5ldqZjv57sTe2k=uk/ACtScX2oxJUPyEPdPWSPCXQk=

HTEE Encryption Concept


Element Key Transformation [3, 4, 9, 11]


Bucket Key Transformation


Section 4:

Analysis


Security Analysis



Cryptographic strength of HTEE is based on HMAC


Key transformation and encryption use HMAC function


Cryptographic strength of HMAC is based on underlying
hash function [3, 4, 5]


For this project, SHA-1 is used as the underlying hash


Hash can be changed for additional security of HMAC [3]


HMAC is proven secure from forgery if the hash compression
operation is a pseudo-random function [4, 7, 11]


HMAC is not susceptible to the hash collision attacks that
affect MD5 and SHA-1 [3, 4, 5]


Collisions still occur but are more difficult to exploit

Security Analysis



HMAC can be attacked by forgery or key recovery
attacks [3, 6]


Key recovery attacks typically have chosen or known plaintext


The birthday paradox governs the probability of finding an
HMAC collision [3, 5, 11, 15]


For SHA-1, 2^80 (message, digest) pairs from HMAC are needed


Research shows key recovery attacks that are better than
brute force, but still worse than birthday attack [6, 7, 10]


For the HTEE scheme key recovery attacks are the
primary concern


Forgeries are less of a concern as they could only break a
single record’s tamper detection capability

Security Analysis



The layering of key generation in HTEE makes analysis difficult:


Attacker knows the unique value and final digest/ciphertext


Given the digest it is difficult to find the key or message value


Given the unique value, it is difficult to obtain original key


Consider general form: HTEE(P, K, U) = HMAC(P, f_K(K, U))


Intermediate keys and plaintexts are masked and HMAC is difficult to
break if using an effective underlying hash


HMAC operation protects plaintext and intermediate key, makes
derivation of original key more difficult


A key recovery attack will take over 2^80 message pairs


Most applications will not use the same secret key for a large
number of records (over 2^40, approx. 1 trillion)


This is far short of the more than 2^80 pairs needed for key recovery

Tamper Detection Analysis



HTEE creates a distinct key sequence based on the
unique value related to plaintext


Identical keys only occur on hash collisions


This is improbable unless a very large number of records are
processed


If the ciphertext or unique value is changed then the key
sequence or HMAC output will differ


Tamper detection will only fail if the original and changed HTEE
processes produce a collision


Probability of collision for each bucket is approx. 3.42x10^-43


Based on the birthday attack with 1,000 values [15, 16]


Probability is P = 1 - e^(-k^2/2N) with k = 1000 and N = 2^160
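The quoted per-bucket collision probability can be reproduced numerically. The sketch below evaluates the birthday approximation with k = 1000 and N = 2^160. It uses math.expm1 because a direct 1 - exp(-x) rounds to zero in double precision for an exponent this small. The function name is illustrative.

```python
import math


def collision_probability(k: int, digest_bits: int) -> float:
    """Birthday approximation P = 1 - e^(-k^2 / 2N) with N = 2^digest_bits."""
    n = 2 ** digest_bits
    exponent = k * k / (2 * n)
    # -expm1(-x) equals 1 - e^(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


p = collision_probability(1000, 160)
print(f"{p:.3e}")
```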

Section 5:

Implementation


Implementation



HTEE process implemented as a PostgreSQL add-on and a
command line program


Built in the C language


Microsoft Visual C++ 2008 Express Edition


PostgreSQL server versions 8.3.8 and 8.4.1


Implemented versions:


Command line program used for validation and flat file processing


PostgreSQL add-on is considered the primary implementation


Two functions added to PostgreSQL server:


Encryption: htee_enc(plaintext, unique value)


Decryption: htee_dec(ciphertext, unique value)


Simple operation, example SQL for encryption:


SELECT htee_enc(data, unique) FROM test


Maximum of six buckets or 9x10^17 integer value supported

Implementation



SHA-1 used for underlying hash function


Specifies use of 512 bit key, blocks of 160 bit ciphertext output


Input key is 88 base64 characters, output is 28 base64
characters per bucket value


Ciphertext output for six buckets is 168 bytes of base64
encoded data


Comparable AES output is 116 bytes, HTEE is a 44% increase


Compared to plaintext data, a 21-fold increase


Several challenges encountered:


Extending PostgreSQL in Windows environment


Interfacing with the PostgreSQL backend
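The size figures on this slide follow from base64 arithmetic and can be checked with a short sketch. The 116-byte AES figure is taken from the slide as given. The 8-byte plaintext width is an assumption (a 64-bit integer), chosen because it reproduces the 21-fold figure.

```python
import base64
import hashlib
import hmac
import os

key = os.urandom(64)                                   # 512-bit key
digest = hmac.new(key, b"654", hashlib.sha1).digest()  # 160-bit HMAC-SHA1 output

key_chars = len(base64.b64encode(key))        # base64 length of the key
bucket_chars = len(base64.b64encode(digest))  # base64 length per bucket
six_bucket_chars = 6 * bucket_chars           # maximum ciphertext size

AES_BYTES = 116       # from the slide
PLAINTEXT_BYTES = 8   # ASSUMED: plaintext stored as a 64-bit integer

increase_percent = (six_bucket_chars - AES_BYTES) * 100 // AES_BYTES
fold_increase = six_bucket_chars // PLAINTEXT_BYTES

print(key_chars, bucket_chars, six_bucket_chars)  # 88 28 168
print(increase_percent, fold_increase)            # 44 21
```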


Section 6:

Testing


Testing



Compared three methods for encryption:


Basic AES (aes1): Does not provide tamper detection


AES & unique value (aes2): Provides tamper detection


HTEE scheme: Provides tamper detection


Tested six datasets, 20,000 random integers in each


Each dataset with different number of buckets, one through six


Results verified tamper detection with the aes2 and HTEE
methods


HTEE on average was four times faster on encryption but
four times slower on decryption than AES

Performance comparison


HTEE performance details


Performance analysis



The performance of HTEE is compared with that of the original
scheme [1] using algorithmic analysis


HTEE is significantly more efficient on encryption, and on
decryption for large numbers [2]


Cost of the original scheme increases with n^0.5, HTEE increases with log_1000(n)


Testing verifies that HTEE is much faster for similar datasets


The large bucket size required for two buckets makes
decryption prohibitively expensive to calculate

Encryption Scheme     Relative complexity
HTEE Encryption       2*log_1000(n)        Constant
HTEE Decryption       1001*log_1000(n)     Constant
Original Encryption   2*n^0.5              Polynomial
Original Decryption   2*n^0.5              Polynomial
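To get a feel for the gap between these expressions, the sketch below evaluates them for a few plaintext magnitudes. It treats each expression as a rough count of HMAC operations and uses the whole number of base-1,000 buckets as a stand-in for log_1000(n). Both choices are simplifying assumptions, and the function names are illustrative.

```python
import math


def bucket_count(n: int, bucket_size: int = 1000) -> int:
    """Number of base-1000 buckets needed for n (stand-in for log_1000(n))."""
    count = 1
    while n >= bucket_size:
        n //= bucket_size
        count += 1
    return count


def htee_encrypt_ops(n: int) -> int:
    return 2 * bucket_count(n)


def htee_decrypt_ops(n: int) -> int:
    return 1001 * bucket_count(n)


def original_ops(n: int) -> int:
    return 2 * math.isqrt(n)  # 2 * n^0.5, same for encryption and decryption


for n in (10**6, 10**12, 10**18 - 1):
    print(n, htee_encrypt_ops(n), htee_decrypt_ops(n), original_ops(n))
```

For the largest supported values, the HTEE counts stay in the thousands while the original scheme's count is on the order of billions.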

Section 7:

Conclusion


Lessons Learned



Encountered and solved implementation challenges


Null bytes, memory management, hash processing


PostgreSQL extension in Windows environment


Interfacing with PostgreSQL backend, operating on data types


Challenges in algorithm design


Properly protecting key information in the transformation process


Adapting key transformation for a database environment


Created custom key generation for random 512 bit keys


The OpenSSL package proved difficult to use for generating simple random strings


Effect of implementation on security


Processing time exposing information about plaintext values


Effect of small input values


Can be mitigated by expanding the size of the unique value

Conclusion



HTEE provides strong tamper detection and data integrity


Ciphertext and other related data are tied together


HTEE provides strong confidentiality


Security based on the underlying HMAC and hash functions


Can be improved with stronger hash functions


For regulatory requirements, AES encryption is recommended


HTEE is more efficient on encryption and less efficient on
decryption than AES


Ideal for encryption-heavy applications where tamper detection
is needed


Examples include archival and auditing systems, including financial
information


Additional information available:
http://cs.uccs.edu/~gsc/pub/master/bbaker/



Future Work



Plaintext value range:


HTEE scheme is limited to positive integer values


Future work can expand operation to negative values, floating
point values, or ASCII encoded data


Floating point values can be encoded by multiplying by a positive
power of 10; the factor must be stored in the ciphertext data


Security Proof


A conceptual analysis of cryptographic strength is presented


Future work can prove the security of HTEE, focused on:


HMAC as a pseudo-random function


Effect of unique value and bucket values on HMAC randomness
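The floating point idea listed under plaintext value range can be illustrated with a small sketch. It is a hypothetical encoding, not part of the implemented scheme: a non-negative decimal value is scaled to an integer by a power of 10, and the exponent is kept alongside so the original value can be restored after decryption. Function names are illustrative.

```python
from decimal import Decimal
from typing import Tuple


def encode_decimal(text: str) -> Tuple[int, int]:
    """Return (scaled_integer, exponent) with value == scaled_integer / 10**exponent."""
    value = Decimal(text)
    if not value.is_finite() or value < 0:
        raise ValueError("only finite, non-negative values are supported")
    exponent = max(0, -value.as_tuple().exponent)  # number of decimal places
    return int(value.scaleb(exponent)), exponent


def decode_decimal(scaled: int, exponent: int) -> Decimal:
    """Inverse of encode_decimal()."""
    return Decimal(scaled).scaleb(-exponent)


scaled, exponent = encode_decimal("12.345")
print(scaled, exponent)                  # 12345 3
print(decode_decimal(scaled, exponent))  # 12.345
```

The scaled integer would then go through the usual bucket decomposition, while the exponent travels with the ciphertext.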

Questions?



References


1. Dong Hyeok Lee; You Jin Song; Sung Min Lee; Taek Yong Nam; Jong Su Jang, "How to Construct a New Encryption Scheme Supporting Range Queries on Encrypted Database," Convergence Information Technology, 2007. International Conference on, pp. 1402-1407, 21-23 Nov. 2007
URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4420452&isnumber=4420217

2. Brad Baker, "Analysis of an HMAC Based Database Encryption Scheme," UCCS Summer 2009 Independent Study, July 2009
URI: http://cs.uccs.edu/~gsc/pub/master/bbaker/doc/final_paper_bbaker_cs592.doc

3. Mihir Bellare; Ran Canetti; Hugo Krawczyk, "Keying Hash Functions for Message Authentication," IACR Crypto 1996
URI: http://cseweb.ucsd.edu/users/mihir/papers/kmd5.pdf

4. Mihir Bellare, "New Proofs for NMAC and HMAC: Security without Collision-Resistance," IACR Crypto 2006
URI: http://eprint.iacr.org/2006/043.pdf

5. Mihir Bellare, "Attacks on SHA-1," 2005
URI: http://www.openauthentication.org/pdfs/Attacks%20on%20SHA-1.pdf

6. Pierre-Alain Fouque; Gaëtan Leurent; Phong Q. Nguyen, "Full Key-Recovery Attacks on HMAC/NMAC-MD4 and NMAC-MD5," IACR Crypto 2007
URI: ftp://ftp.di.ens.fr/pub/users/pnguyen/Crypto07.pdf

7. Scott Contini; Yiqun Lisa Yin, "Forgery and Partial Key-Recovery Attacks on HMAC and NMAC using Hash Collisions (Extended Version)," 2006
URI: http://eprint.iacr.org/2006/319.pdf


8. Hyrum Mills; Chris Soghoian; Jon Stone; Malene Wang, "NMAC: Security Proof," 2004
URI: http://www.cs.jhu.edu/~astubble/dss/proofslides.pdf

9. Ran Canetti, "The HMAC construction: A decade later," 2007
URI: http://people.csail.mit.edu/canetti/materials/hmac-10.pdf

10. Yu Sasaki, "A Full Key Recovery Attack on HMAC-AURORA-512," 2009
URI: http://eprint.iacr.org/2009/125.pdf

11. Jongsung Kim; Alex Biryukov; Bart Preneel; Seokhie Hong, "On the Security of HMAC and NMAC Based on HAVAL, MD4, MD5, SHA-0 and SHA-1," 2006
URI: http://eprint.iacr.org/2006/187.pdf

12. NIST, March 2002. FIPS Pub 198 HMAC specification.
URI: http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf

13. Wikipedia, October 2009. HMAC reference material.
URI: http://en.wikipedia.org/wiki/Hmac

14. Wikipedia, October 2009. SHA-1 reference material.
URI: http://en.wikipedia.org/wiki/SHA-1


15. Wikipedia, October 2009. Birthday Attack reference.
URI: http://en.wikipedia.org/wiki/Birthday_attack

16. Forouzan, Behrouz A. 2008. Cryptography and Network Security. McGraw Hill Higher Education. ISBN 978-0-07-287022-0

17. Simon Josefsson, 2006. GPL implementation of HMAC-SHA1.
URI: http://www.koders.com/c/fidF9A73606BEE357A031F14689D03C089777847EFE.aspx

18. Scott G. Miller, 2006. GPL implementation of SHA-1 hash.
URI: http://www.koders.com/c/fid716FD533B2D3ED4F230292A6F9617821C8FDD3D4.aspx

19. Bob Trower, August 2001. Open source base64 encoding implementation, adapted for test program.
URI: http://base64.sourceforge.net/b64.c

20. PostgreSQL, October 2009. Server Documentation.
URI: http://www.postgresql.org/docs/8.4/static/index.html

21. Gopalan Sivathanu; Charles P. Wright; Erez Zadok, "Ensuring data integrity in storage: techniques and applications," Workshop On Storage Security And Survivability, Nov. 2005
URI: http://doi.acm.org/10.1145/1103780.1103784


22. Vishal Kher; Yongdae Kim, "Securing Distributed Storage: Challenges, Techniques, and Systems," Workshop On Storage Security And Survivability, Nov. 2005
URI: http://doi.acm.org/10.1145/1103780.1103783

23. Kyriacos Pavlou; Richard Snodgrass, "Forensic Analysis of Database Tampering," ACM Transactions on Database Systems (TODS), 2008
URI: http://doi.acm.org/10.1145/1412331.1412342

24. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M.; Rigaud, J.B., "How to Add the Integrity Checking Capability to Block Encryption Algorithms," Research in Microelectronics and Electronics 2006, Ph. D., pp. 369-372
URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1689972&isnumber=35631

25. Elbaz, R.; Torres, L.; Sassatelli, G.; Guillemin, P.; Bardouillet, M., "PE-ICE: Parallelized Encryption and Integrity Checking Engine," Design and Diagnostics of Electronic Circuits and systems, 2006 IEEE, pp. 141-142
URI: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1649595&isnumber=34591

26. Wikipedia, October 2009. Information Security Reference.
URI: http://en.wikipedia.org/wiki/Information_security