Secure Computation Over Encrypted Data Liangliang Xiao.

Post on 04-Jan-2016

Secure Computation Over Encrypted Data

Liangliang Xiao

Introduction

Cloud
  Customers outsource their data and computing needs to the cloud
    Hardware technology hits its limit
    Systems become harder to maintain

Security issues in the cloud
  Adversary attacks
  Human error
    Mistakenly sending disks with bank files to eBay [Ham03]
  Reorganization or buyout [Nan04]

[Figure: the cloud DB stores a Salary column x1, …, xn; the query SELECT SUM(salary) returns x1 + … + xn]

Introduction

Protect data in the cloud: encryption

How to process the encrypted data?
  Decrypt the data for computation (not secure!)
    The key would have to be stored with the data at the server (not secure!)
  Compute directly on the encrypted data

Existing Works

Homomorphic Encryption (HE)
Order-preserving Encryption (OPE)
Prefix-preserving Encryption (PPE)

HE

The encryption function has homomorphic properties:
  E(x + y) = E(x) + E(y)
  E(x * y) = E(x) * E(y)

HE supports computations on ciphertexts

[Figure: the DB stores the Salary column as E(x1), …, E(xn)]

  1. Alice sends the query SELECT SUM(salary)
  2. DB computes E(x1 + … + xn) = E(x1) + … + E(xn) and returns it
  3. Alice decrypts to get x1 + … + xn

Example of “Partial” HE

RSA: E(x) = x^e mod n
  e is the public exponent (the public key is the pair (e, n))
  n = p ∙ q

Homomorphic with respect to multiplication:
  E(x) * E(y) = x^e * y^e = (x * y)^e = E(x * y)  (mod n)

Not homomorphic with respect to addition:
  E(x) + E(y) = x^e + y^e ≠ (x + y)^e = E(x + y)
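The two RSA properties above can be checked numerically. The following is a minimal sketch of textbook RSA without padding; the toy parameters (p = 61, q = 53, e = 17) and the test values are assumptions chosen only for illustration and offer no security.

```python
# Toy textbook RSA (no padding). Parameters are illustrative assumptions.
p, q = 61, 53
n = p * q          # modulus n = p * q = 3233
e = 17             # public exponent, coprime to (p - 1) * (q - 1) = 3120


def E(x: int) -> int:
    """Textbook RSA encryption: E(x) = x^e mod n."""
    return pow(x, e, n)


x, y = 2, 3

# Multiplicative homomorphism: E(x) * E(y) = E(x * y) (mod n)
mult_ok = (E(x) * E(y)) % n == E(x * y)

# Addition is not preserved: E(x) + E(y) differs from E(x + y) here
add_ok = (E(x) + E(y)) % n == E(x + y)

print("multiplication preserved:", mult_ok)
print("addition preserved:", add_ok)
```

The multiplicative check holds for every pair of inputs, while the additive check fails for these values, which is why RSA is only a "partial" HE.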

Existing HEs

Boolean circuit based HE
  Plaintexts: {0,1}
  Gentry’s construction [Gen09]
    High security level but expensive computation

Ring based HE
  Plaintexts: Z_N
  More efficient than Boolean circuit based HE
  Polly Cracker encryption scheme [Fel94]
    Lacks conclusive security evidence

OPE

Encryption preserves order:
  x < y ⇒ E(x) < E(y)

OPE supports range searches on ciphertexts

[Figure: the DB stores the Salary column as OPE(x1), …, OPE(xn) and the Name column as E(N1), …, E(Nn)]

  1. Alice sends the query SELECT Name WHERE Salary > OPE(a)
  2. DB returns E(Ni) if OPE(xi) > OPE(a)
  3. Alice decrypts E(Ni) to get Ni

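Existing OPEs

RN
  Randomly generate positive numbers r1, …, rx, …, ry, …
  E(x) = r1 + … + rx
  E(y) = r1 + … + rx + … + ry

Poly
  Randomly generate a strictly increasing polynomial f
  E(x) = f(x)

[Figure: a strictly increasing curve f mapping plaintexts x < y to ciphertexts E(x) < E(y)]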
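The RN scheme and the range query from the OPE slides can be sketched as follows. This is an illustration only: the gap size `max_gap`, the seed, and the sample salaries are assumptions, and Python's `random` module is not a cryptographic generator.

```python
import random
from itertools import accumulate


def rn_keygen(m: int, seed: int, max_gap: int = 1000) -> list:
    """Secret key: m random positive gaps r_1, ..., r_m (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [rng.randint(1, max_gap) for _ in range(m)]


def rn_table(key: list) -> list:
    """Prefix sums of the gaps, so that table[x - 1] = r_1 + ... + r_x."""
    return list(accumulate(key))


def rn_encrypt(table: list, x: int) -> int:
    """E(x) = r_1 + ... + r_x for a plaintext x in {1, ..., m}."""
    return table[x - 1]


m = 100
table = rn_table(rn_keygen(m, seed=42))

# The DB only sees the encrypted salary column.
salaries = [17, 64, 35, 90, 8]
column = [rn_encrypt(table, s) for s in salaries]

# Range query "Salary > 30": the client sends OPE(30), the DB compares ciphertexts.
threshold = rn_encrypt(table, 30)
hits = [i for i, c in enumerate(column) if c > threshold]
print(hits)  # row indices of salaries 64, 35 and 90
```

Because every gap is positive, the prefix sums are strictly increasing, so comparing ciphertexts gives the same answer as comparing plaintexts.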

PPE

Encryption preserves prefix

  Plaintexts     Ciphertexts
  101000    →    001010
  101110    →    001111

Range searches can be transformed to prefix-matching searches:
  [32, 111] → [00100000, 01101111] → {001*, 010*, 0110*}
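The range-to-prefix transformation can be sketched as a greedy cover of the range by aligned power-of-two blocks. The function name `range_to_prefixes` and the `*` wildcard convention are assumptions made for this example.

```python
def range_to_prefixes(lo: int, hi: int, bits: int) -> list:
    """Cover the integer range [lo, hi] with a minimal list of bit prefixes."""
    prefixes = []
    while lo <= hi:
        # Grow the largest aligned block that starts at lo and stays within hi.
        size = 1
        while (lo % (size * 2) == 0
               and lo + size * 2 - 1 <= hi
               and size * 2 <= (1 << bits)):
            size *= 2
        free = size.bit_length() - 1      # number of wildcard (low) bits
        fixed = bits - free               # number of fixed (high) bits
        prefix = format(lo >> free, "0{}b".format(fixed)) if fixed > 0 else ""
        prefixes.append(prefix + "*")
        lo += size
    return prefixes


print(range_to_prefixes(32, 111, 8))  # ['001*', '010*', '0110*']
```

With these prefixes, a range search reduces to at most a handful of prefix-matching queries, which is what a PPE-encrypted column can answer.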

PPE

PPE supports range searches on ciphertexts

[Figure: the DB stores the Salary column as PPE(x1), …, PPE(xn) and the Name column as E(N1), …, E(Nn)]

  1. Alice sends the query SELECT Name WHERE Salary = PPE(aj), 1 ≤ j ≤ m
  2. DB returns E(Ni) if some PPE(aj) is a prefix of PPE(xi)
  3. Alice decrypts E(Ni) to get Ni

Main Problem of the Existing Works

HE/OPE/PPE only consider one encryption key

One encryption key for all users
  If the DB colludes with any user, all data is compromised

Different users use different keys
  Computation cannot be performed

Other Problems

HE: circuit-based HE has a very high computation cost

  Gentry’s algorithm
  32-bit integer addition          900 s
  32-bit integer multiplication    67,000 s (about 18 hours)

Gentry’s algorithm:
  • Computation of each binary operation is 6 seconds [Gen]
  • Multiplication requires ~ 11,000 gates; addition requires 160 gates [Mor]
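The timings in the table follow from the two figures quoted beneath it. A quick check, assuming the cost is simply the gate count times the per-gate time, gives values of the same order as the slide's 900 s and 67,000 s (which are presumably rounded or based on slightly different gate counts):

```python
SECONDS_PER_GATE = 6      # per binary operation, as quoted from [Gen]
ADD_GATES = 160           # 32-bit addition, as quoted from [Mor]
MUL_GATES = 11_000        # 32-bit multiplication (approximate), from [Mor]

add_seconds = ADD_GATES * SECONDS_PER_GATE
mul_seconds = MUL_GATES * SECONDS_PER_GATE
mul_hours = mul_seconds / 3600

print(add_seconds, "s for one addition")
print(mul_seconds, "s for one multiplication, about", round(mul_hours, 1), "hours")
```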

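Other Problems

Attacks against OPE
  Plaintexts: {1, …, m}
  Suppose the adversary A knows the pair (m/2, OPE(m/2))
  A can retrieve the most significant bit of the plaintexts behind other ciphertexts
    [Figure: ciphertexts below OPE(m/2) come from plaintexts below m/2; ciphertexts above OPE(m/2) come from plaintexts above m/2]
  Need to quantify the security of OPE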
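The attack needs nothing beyond one known pair and a comparison. The sketch below uses a toy RN-style OPE as a stand-in for any order-preserving scheme; the domain size, seed, and gap range are assumptions for illustration.

```python
import random
from itertools import accumulate

m = 64
rng = random.Random(7)
# Toy OPE: strictly increasing table of prefix sums of positive random gaps.
table = list(accumulate(rng.randint(1, 50) for _ in range(m)))


def ope(x: int) -> int:
    """Encrypt a plaintext x in {1, ..., m}."""
    return table[x - 1]


# The adversary knows a single plaintext/ciphertext pair (m/2, OPE(m/2)).
known_plain = m // 2
known_cipher = ope(known_plain)


def guess_upper_half(ciphertext: int) -> bool:
    """Adversary's guess: does the hidden plaintext lie above m/2?"""
    return ciphertext > known_cipher


# The guess is right for every plaintext in the domain.
attack_always_correct = all(
    guess_upper_half(ope(x)) == (x > known_plain) for x in range(1, m + 1)
)
print(attack_always_correct)
```

Each additional known pair halves the candidate interval again in the same way, which motivates measuring how many plaintext bits remain secure.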

Other Problems

Existing security analysis
  Reduces the security of the real OPE scheme to that of the ideal OPE object

Ideal OPE
  The encryption function is selected uniformly at random from all order-preserving functions

No security analysis of the ideal OPE object

Objective of My Research

Bridge the gaps

HE
  Design a more efficient HE algorithm
  Enhance it for multi-user systems

OPE
  Prove the security of the ideal OPE object
  Develop a multi-user OPE protocol

PPE
  Prove the security of the ideal PPE object
  Design a multi-user PPE protocol based on an existing PPE

Our HE Construction

Basic construction (ring based)
  E(x, k) = M
    M is a matrix with the eigenvalue x w.r.t. the eigenvector k, i.e., M ∙ k = x ∙ k
    Over the ring Z_N, where N = p ∙ q

Homomorphic in addition and multiplication
  If x ∙ k = M ∙ k and y ∙ k = M’ ∙ k, then
    (x + y) ∙ k = (M + M’) ∙ k
    (x ∙ y) ∙ k = (M ∙ M’) ∙ k
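The eigenvector idea can be tried out with small numbers. The sketch below is not the construction from the talk: it assumes one simple way to obtain a matrix with a prescribed eigenpair (M = x∙I + R, where every row of R is orthogonal to k mod N), a toy modulus, and dimension 3. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import math
import random

N = 61 * 53   # toy modulus N = p * q (assumption; real parameters are far larger)
D = 3         # matrix dimension (assumption for this sketch)


def keygen(rng):
    """Random key vector k whose last entry is invertible mod N."""
    while True:
        k = [rng.randrange(1, N) for _ in range(D)]
        if math.gcd(k[-1], N) == 1:
            return k


def encrypt(x, k, rng):
    """Return M = x*I + R with R*k = 0 (mod N), so that M*k = x*k (mod N)."""
    inv_last = pow(k[-1], -1, N)
    M = []
    for i in range(D):
        row = [rng.randrange(N) for _ in range(D - 1)]
        # Pick the last entry so that the whole row is orthogonal to k mod N.
        row.append((-sum(a * b for a, b in zip(row, k)) * inv_last) % N)
        row[i] = (row[i] + x) % N          # add x on the diagonal
        M.append(row)
    return M


def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % N for row in M]


def mat_add(A, B):
    return [[(a + b) % N for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(D)) % N for j in range(D)]
            for i in range(D)]


def decrypt(M, k):
    """Recover x from M*k = x*k by dividing the last coordinate by k[-1]."""
    return (mat_vec(M, k)[-1] * pow(k[-1], -1, N)) % N


rng = random.Random(1)
k = keygen(rng)
Mx, My = encrypt(20, k, rng), encrypt(22, k, rng)

sum_plain = decrypt(mat_add(Mx, My), k)    # decrypts to 20 + 22
prod_plain = decrypt(mat_mul(Mx, My), k)   # decrypts to 20 * 22
print(sum_plain, prod_plain)
```

As in the basic construction, a single known pair (x, M) lets an attacker solve (M − x∙I)∙k = 0 for k, so this sketch only illustrates the homomorphism, not the security.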

Security Definition

Attack model
  The adversary knows some plaintext/ciphertext pairs
  The adversary tries to reverse another ciphertext, called the challenge

Security Analysis

Attack based on a plaintext/ciphertext pair (x, M)
  Solve k from x ∙ k = M ∙ k over Z_N
    k is the only eigenvector associated with the eigenvalue x
  Use k to reverse other ciphertexts

Our HE Construction

Need to improve the basic construction
  One common eigenvector: enables homomorphic computation
  A second, distinct eigenvector: resists the attack

Improved construction
  Consider a 4×4 matrix M
  k, u, v, w are randomly chosen eigenvectors
  Associate x with k and z
    z = u, v, or w, subject to a distribution D
  Randomly select r
    Associate r with the two remaining eigenvectors

  The three possible assignments of eigenvalues to eigenvectors in M:
    eigenvalue x: k, u    eigenvalue r: v, w
    eigenvalue x: k, v    eigenvalue r: u, w
    eigenvalue x: k, w    eigenvalue r: u, v

Our HE Construction

Consider m rings

[Figure: Z_N and the m rings Z_f1, Z_f2, …, Z_fm with keys k1, k2, …, km derived from k; probability p per ring, p^m overall]

Security of Our HE

Security Theorem: the probability for the adversary to reverse any other ciphertext is p^m
  p = 1 − (1 − q) q^n

p^m becomes negligibly small if n < m ln poly(·), where the argument of poly is the security parameter

Achieves one-wayness security

Further computes q to minimize (1 − (1 − q) q^n)^m
  q is the probability that x is associated with u
  q = 1 − 1/n

Performance Comparison

Compare our algorithm with Gentry’s

                                   Gentry’s algorithm           Our algorithm
  32-bit integer addition          900 s                        0.0992 ms
  32-bit integer multiplication    67,000 s (about 18 hours)    108 ms
  Space cost                       1 bit → 200,000 bits         1024 bits → 262,144 bits

Our algorithm:
  • Choose m = 16 to sustain 1109 chosen plaintext attacks

Gentry’s algorithm:
  • Computation of each binary operation is 6 seconds [Gen]
  • Multiplication requires ~ 11,000 gates; addition requires 160 gates [Mor]

HE for Multi-User System

Key transformation: similarity transform
  k’ ∙ E(x, k) ∙ k’^(−1) = E(x, k’ ∙ k)

Request protocol
  kj: user key (different users hold different user keys)
  kj’ and kj’’: matching keys
  mk: master key, with kj’’ ∙ kj’ ∙ kj = mk

  User Uj (holds kj)  →  Key agent (holds kj’)  →  DB (holds kj’’)
  x  →  E(x, kj)  →  E(x, kj’ ∙ kj)  →  E(x, mk)

Response protocol: reverse the request protocol
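The key transformation can be checked on a toy instance. In this sketch the matching keys are assumed to be 2×2 unimodular matrices (determinant 1), chosen so that their inverses are easy to write down; the helper `eigen_matrix`, the fixed entries 5 and 11, and the toy modulus are all illustrative assumptions rather than the protocol's real parameters. It needs Python 3.8+ for `pow(a, -1, N)`.

```python
N = 61 * 53   # toy modulus (assumption)


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) % N for j in range(n)]
            for i in range(n)]


def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % N for row in M]


def eigen_matrix(x, k):
    """A 2x2 matrix M with M*k = x*k (mod N); requires k[1] invertible mod N."""
    inv = pow(k[1], -1, N)
    rows = []
    for i, a in enumerate((5, 11)):           # fixed entries, for illustration only
        row = [a, (-a * k[0] * inv) % N]      # this row is orthogonal to k mod N
        row[i] = (row[i] + x) % N             # add x on the diagonal
        rows.append(row)
    return rows


def has_eigenpair(M, x, k):
    return mat_vec(M, k) == [(x * c) % N for c in k]


# Matching keys held by the key agent (T1) and the DB (T2), with their inverses.
T1, T1_inv = [[1, 4], [0, 1]], [[1, N - 4], [0, 1]]
T2, T2_inv = [[1, 0], [3, 1]], [[1, 0], [N - 3, 1]]

x, kj = 123, [7, 9]                           # plaintext and user key
c_user = eigen_matrix(x, kj)                  # E(x, kj), produced by the user

c_agent = mat_mul(mat_mul(T1, c_user), T1_inv)     # E(x, kj' * kj)
c_db = mat_mul(mat_mul(T2, c_agent), T2_inv)       # E(x, mk)

mk = mat_vec(T2, mat_vec(T1, kj))             # master key kj'' * kj' * kj

user_ok = has_eigenpair(c_user, x, kj)
agent_ok = has_eigenpair(c_agent, x, mat_vec(T1, kj))
db_ok = has_eigenpair(c_db, x, mk)
print(user_ok, agent_ok, db_ok)
```

Each hop re-keys the ciphertext without decrypting it, so neither the key agent nor the DB sees x, and the stored ciphertext is under the master key regardless of which user submitted it.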

HE for Multi-User System

Security Theorem: our protocols are as secure as the HE unless both the DB and the KA are compromised

Further security improvement: use a chain of KAs

Performance Study

[Figure: experimental setup for the request/response protocols: user, key agent, DB]

Results
  λ: data length
  NE: “No Encryption”
  HE-Q: the DB receives the data encrypted by the user
  HE-P: the user decrypts the data sent from the DB

  λ (bit)   NE (ms)   HE-Q (ms)   HE-P (ms)   Request Protocol (ms)   Response Protocol (ms)
  32        86.03     301.03      120.03      807                     806
  64        86.23     301.23      120.23      807                     806
  1024      91.62     306.99      125.62      807                     806

Security Analysis of the Ideal OPE

Security metric
  z_h = average number of secure bits of the plaintext under h known plaintext attacks
      = H̃∞(X | Y, KP_h)
    H̃∞ is the average min-entropy
    X is the plaintext; Y is a challenge (a randomly generated ciphertext)
    KP_h is the set of h plaintext/ciphertext pairs known by the adversary

Challenge of computing z_h
  It is difficult to find a closed-form expression for z_h

Security Analysis of the Ideal OPE

Instead, estimate upper and lower bounds on z_h

Upper bound on z_h
  Choose KP_h = { (x_i, E*(x_i)) | x_i = i∙(m+1)/(h+1), 1 ≤ i ≤ h }
    The x_i are evenly spaced over the domain
    [Figure: x_1, x_2, x_3, …, x_h evenly spaced on a line]
  z_h ≤ log2((m − h)/(h + 1))

Lower bound on z_h
  The strongest plaintext attack is not known, so how can the bound be obtained?

Security Analysis of the Ideal OPE

Our approach to estimate the lower bound on z_h

Observation
  KP_h = {(x_i, y_i)}, 1 ≤ i ≤ h, divides the domain [m] and the range [n] into h+1 subdomains and subranges
  There is no plaintext attack within each subdomain and subrange
  [Figure: the subdomain between x_j and x_(j+1) in [m] maps to the subrange between y_j and y_(j+1) in [n]; no plaintext attack inside]

Estimate the lower bound for the case of no plaintext attack
  Defined as z_0

Apply z_0 to each subdomain and subrange
  The (x_i, y_i) are variables

Accordingly, estimate z_h ≥ c ∙ log2((m − h)/(h + 1))
  Optimize over the h pairs (x_i, y_i)
  0 < c < 1

Security Analysis of the Ideal OPE

z_h = Θ(log2((m − h)/(h + 1))) for n ≥ m^3
  Combines the lower bound and the upper bound
  Θ denotes big-theta notation

Theorem: a constant ratio of the bits is secure

OPE for Multi-User System

Challenge
  All data should be encrypted under one key
  Simple solution: one key agent holds the key and encrypts the data for users
    The agent has to have some knowledge of the key in order to encrypt
  No entity should hold the encryption key
    But then how to encrypt the data?

Possible solution
  Use a group of key agents to encrypt the data in a distributed manner
  How to design it so that the resulting ciphertexts are still order-preserving?
    Existing data sharing schemes cannot achieve this

Basic-DOPE (digit-based OPE)
  The user partitions the plaintext into “digits”
  Each key agent encrypts a single “digit”
  The DB integrates the encrypted “digits”

Request protocol
  [Figure: User → key agents KA1 (with k1), …, KAj (with kj), …, KAp (with kp) → DB]
  1. The user expresses x in the base-A number system, i.e., x = Σ_{1≤j≤p} x_j ∙ A^(j−1), and sends digit x_j to KAj
  2. KAj encrypts x_j by an OPE with kj, giving y_j
  3. The DB integrates the ciphertext in the base-B number system: COPE(x) = Σ_{1≤j≤p} y_j ∙ B^(j−1)

Example
  Base A = 10; base B = 20; p = 3
  x = 583

Request protocol
  1. The user maps 583 to the digits (5, 8, 3)
  2. KAj encrypts x_j by an OPE with kj:
       KA1 with k1: y1 = OPE(3, k1)
       KA2 with k2: y2 = OPE(8, k2)
       KA3 with k3: y3 = OPE(5, k3)
  3. The DB integrates the ciphertext: COPE(x) = y3*20^2 + y2*20 + y1
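The example above can be played through in a few lines. The per-agent digit OPE below is a hypothetical stand-in (a random strictly increasing table from {0, …, A−1} into {0, …, B−1}, keyed by a seed); any OPE whose outputs stay below B would serve the same role.

```python
import random

A, B, P = 10, 20, 3   # plaintext base, ciphertext base, number of digits


def digit_ope_table(seed: int) -> list:
    """Hypothetical digit OPE: a strictly increasing map {0..A-1} -> {0..B-1}."""
    return sorted(random.Random(seed).sample(range(B), A))


# One key (here simply a seed) per key agent KA1, KA2, KA3.
tables = [digit_ope_table(seed) for seed in (11, 22, 33)]


def to_digits(x: int) -> list:
    """Base-A digits (x1, ..., xp), least significant digit first."""
    return [(x // A**j) % A for j in range(P)]


def cope(x: int) -> int:
    """Request protocol: each KAj encrypts digit xj, the DB combines in base B."""
    ys = [tables[j][d] for j, d in enumerate(to_digits(x))]
    return sum(y * B**j for j, y in enumerate(ys))


digits_583 = to_digits(583)   # x1 = 3, x2 = 8, x3 = 5

# The combined ciphertext is order-preserving over the whole 3-digit domain.
order_preserved = all(cope(v) < cope(v + 1) for v in range(A**P - 1))
print(digits_583, cope(583), order_preserved)
```

Order is preserved because every encrypted digit stays below B, so comparing two combined ciphertexts is decided by the most significant digit in which the plaintexts differ.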

Security Issues of Basic-DOPE

If the DB and one KA are compromised
  The adversary can get one “digit” of each data item in the DB

[Figure: the DB stores OPE(5 8 3), OPE(2 5 6), OPE(7 3 1); an adversary holding k3 decrypts the digit OPE(5, k3) of 583, and likewise the corresponding digit of every other item]

Security Issues of Basic-DOPE

Solution
  Substitute each KA by a chain of key agents

  Example with x = 583 and chains of two key agents:
    Digit 3: KA11 computes y1 = OPE(3, k11); KA12 computes z1 = OPE(y1, k12)
    Digit 8: KA21 computes y2 = OPE(8, k21); KA22 computes z2 = OPE(y2, k22)
    Digit 5: KA31 computes y3 = OPE(5, k31); KA32 computes z3 = OPE(y3, k32)

Security Issues of Basic-DOPE

If the first KA in the chain is compromised
  The adversary A can view the raw “digit”

Potential solution: two-party computation
  Too expensive

OE-DOPE

OE (Oblivious Encryption)
  “digit” x → “micro-digits” (x1, x2, …, xu)
  Insert the micro-digits into a random matrix (r denotes a random element):

      r    r    x1   r    r
      r    r    r    r    x2
      r    x3   r    r    r
      x4   r    r    r    r

  User → Key agent → DB: the key agent encrypts the elements in the matrix

  Position information: (1,3), (2,5), (3,2), (4,1)
    The DB knows which micro-digits to select and encrypt further

  The probability to derive digit x is negligibly small
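The hiding and selection steps can be sketched as below. The function names, the value range of the random filler, and the toy element-wise map `enc` standing in for the key agent's OPE are all assumptions made for illustration; how the position information reaches the DB is not modelled here.

```python
import random


def hide_micro_digits(micro_digits, cols, rng, max_value=10):
    """Place micro-digit i at a random column of row i; fill the rest randomly."""
    matrix, positions = [], []
    for i, d in enumerate(micro_digits, start=1):
        row = [rng.randrange(max_value) for _ in range(cols)]
        col = rng.randrange(cols)
        row[col] = d
        matrix.append(row)
        positions.append((i, col + 1))     # 1-based (row, column), as on the slide
    return matrix, positions


def encrypt_elements(matrix, enc):
    """Key agent: encrypt every element without knowing which ones are real."""
    return [[enc(v) for v in row] for row in matrix]


def select(matrix, positions):
    """DB: pick the entries named by the position information."""
    return [matrix[r - 1][c - 1] for r, c in positions]


def enc(v):
    """Toy strictly increasing map, a stand-in for the key agent's OPE."""
    return 3 * v + 1


rng = random.Random(0)
micro = [4, 2, 7, 1]
matrix, positions = hide_micro_digits(micro, cols=5, rng=rng)

encrypted = encrypt_elements(matrix, enc)
selected = select(encrypted, positions)    # encrypted micro-digits only
print(positions, selected)
```

The key agent processes real and random entries alike, so without the position information it cannot tell which of the matrix elements make up the digit.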

OPE for Multi-User Systems

Response protocol
  Can simply reverse the request protocol
    A response may contain a large amount of confidential data
    The reverse protocol can be very inefficient
  The DB maintains both COPE(x) and CCE(x)
    CCE(x) is encrypted using a conventional encryption (e.g., AES)
    The key is granted to users with access privilege
    The DB returns CCE(x) to the user

Properties of OE-DOPE

Security Theorem: the probability for A to retrieve any “digit” is negligible if
  The underlying OPE has one-wayness security
  A cannot compromise all the key agents in a chain simultaneously

Versatile
  Our OE-DOPE can be applied to any OPE algorithm

Performance Study

[Figure: experimental setups with user, key agents, and DB. Basic-DOPE: 4 key agents. OE-DOPE: 4 * 2 = 8 key agents]

Results for the OPE algorithms Poly and Hyper
  λ: data length
  NE: “No Encryption”

  λ (bit)   NE (ms)   Poly (ms)   Basic-DOPE + Poly (ms)   OE-DOPE + Poly (ms)   Hyper (ms)    Basic-DOPE + Hyper (ms)   OE-DOPE + Hyper (ms)
  8         85.87     85.87       166.62                   194.35                106.24        506.06                    7718.90
  32        86.03     86.03       167.83                   214.18                20537         9.19E+07
  64        86.23     86.23       169.32                   239.34                4965977.56
  128       86.64     86.64       172.05                   285.09
  1024      91.62     91.99       197.23                   786.72

Basic-DOPE is at most 2 times slower

OE-DOPE is at most 8 times slower

OE-DOPE is more expensive but more secure

PPE

Prove the security of the ideal PPE object
  Weaken the security notion from IND-CPA to IND-PCPA
  Show that IND-PCPA can exactly qualify the security of the ideal PPE object
    By mapping the prefix-preserving function to a tree-based function

Design a multi-user PPE protocol
  Based on an existing PPE construction, which consists of
    A pseudo-random function (PRF)
    A least significant bit extractor (LSB)
  Compute the PRF in a distributed manner by DL
  Remove the LSB
    But this causes the ciphertext to be too long
    Develop a reduction method to reduce the size
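A single-key PPE built from a PRF and an LSB extractor can be sketched as follows: bit i of the ciphertext is bit i of the plaintext XORed with the least significant bit of the PRF applied to the preceding plaintext bits. HMAC-SHA256 as the PRF, the key value, and the function names are assumptions for this example; the distributed, multi-user variant from the talk is not shown.

```python
import hashlib
import hmac


def prf_bit(key: bytes, prefix: str) -> int:
    """Least significant bit of HMAC-SHA256(key, prefix), used as a pad bit."""
    return hmac.new(key, prefix.encode(), hashlib.sha256).digest()[-1] & 1


def ppe_encrypt(key: bytes, bits: str) -> str:
    """Encrypt a bit string so that shared plaintext prefixes stay shared."""
    out = []
    for i, b in enumerate(bits):
        # The pad for position i depends only on the first i plaintext bits.
        out.append(str(int(b) ^ prf_bit(key, bits[:i])))
    return "".join(out)


def common_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


key = b"demo key, not secret"
c1 = ppe_encrypt(key, "101000")
c2 = ppe_encrypt(key, "101110")
print(c1, c2, common_prefix_len(c1, c2))
```

Two plaintexts that agree on their first t bits get the same pads for those positions and for position t+1, so their ciphertexts agree on exactly the first t bits, as in the 101000 / 101110 example earlier in the talk.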

Future Research

Theoretical work
  Further improve the security and performance of HE
  Construct better OPE and PPE algorithms to achieve better security and improved performance

  OPE                        PPE
  IND-OCPA ?                 IND-PCPA √
  One-wayness security √     One-wayness security ?

Future Research

HE application
  Current key management systems have a centralized manager to generate and refresh keys and validate entities
  In a large-scale system, a centralized solution won’t work
    E.g., the SCADA (supervisory control and data acquisition) system includes a large number of meters and devices
    Expected to have billions of entities
  Distributed key management
    Individual key managers will not be as trustworthy
    The probability of one of the many key managers being malicious or compromised is very high
  Use HE for key manager computation
    A large data space and a uniform distribution are suitable for our HE

HE application

[Figure: a centralized key manager versus distributed key managers, with HE applied at each distributed key manager]

Future Research

OPE application
  Privacy-preserving data mining
    Decision tree (with continuous attributes)
      At each step, an attribute X and a threshold t are selected
      Compute the information gain about X and t
        Comparisons are needed to determine the number of instances in each class
  Use our multi-user OPE
    Handle training data from different sources
    Privacy requirement: each data owner wishes not to disclose its data to other parties

OPE application

[Figure: privacy-preserving data mining. Several data owners contribute tables with entries xij, yij, zij through the multi-user OPE to the data mining party]

Future Research

PPE application
  Anonymous analysis of Internet traffic traces
    E.g., studying web performance, routing performance analysis, or clustering of end-systems
  Traffic log owners hesitate to make the traces public
    The traces leak the identities of senders and receivers
  Use our PPE protocol
    Handle traffic data from different sources
    Privacy requirement: no entity in the system has global knowledge of the traffic information

Questions?