Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing...
Motivation
Two-Cloud Setting
PPkNN Classification
Conclusion and Future Research
Privacy-Preserving Query Processing over Encrypted Data in Cloud
CS573 Data Privacy and Security
Yousef M. Elmehdwi
Department of Mathematics and Computer Science
Emory University
October 31, 2016
Yousef M. Elmehdwi Privacy-Preserving Query Processing 1 / 57
Cloud Computing
Definition
A type of computing that relies on sharing computing resources rather than having local servers or personal devices handle applications
Outsourcing
Data owner outsources its data as well as processing functionalities to a cloud
Reduced management cost, less data storage overhead, and improved quality of service
Key Challenge
Cloud cannot be fully trusted
Protect data confidentiality, query privacy, and data access patterns
Cloud Computing 2
How to Ensure Data Confidentiality
Data owners encrypt their data before outsourcing it to a cloud
Key challenge: query processing over encrypted data without the cloud ever decrypting the data
Computing on Encrypted Data
Basic idea
Party P1 sends encrypted data to party P2
Party P2 performs some computation and returns the encrypted result to party P1
Party P1 decrypts to find out the answer
Ways to perform computations on encrypted data
Fully homomorphic encryption (impractical)
Additive/multiplicative homomorphic encryption schemes (additive adopted in this work)
The Goal of this Work
Develop distributed protocols to allow the cloud to perform queries directly over encrypted data
During query processing, the cloud cannot infer any information about the outsourced data, the user queries, or data access patterns
Such a protocol is termed privacy-preserving query processing (PPQP)
Desired Output and Security Guarantee
Basic formulation
$\mathrm{PPQP}(\langle C : T'\rangle, \langle Bob : q\rangle) \rightarrow \langle Bob : q_{out}\rangle$
Input: T′ denotes the encrypted database and q the user query
Output: q_out denotes the set of records that satisfy q
Security requirements
1. Data confidentiality and query privacy
2. Privacy of (hiding) data access patterns
3. Output security
4. Information that can be inferred from the input/output is not a security violation
Other desirable requirements
1. End-user efficiency and correctness
Example: Insurance Company
Two-Cloud Environment
Basic Assumptions
Existence of two cloud service providers denoted by C1 and C2 (e.g., Google and Amazon)
Alice owns a database T of n records t1, ..., tn and m attributes
Alice generates a key pair (pk, sk) based on the additive homomorphic encryption (AH-ENC) system
Alice encrypts T attribute-wise, and sends the encrypted database T′ to C1 and sk to C2
Bob wants to execute his input query q = ⟨q1, ..., qm⟩ on T′ in the cloud in a privacy-preserving manner
C1 is the data host, which stores all uploaded (encrypted) data T′
C2 is called the key holder since it stores Alice’s private key sk
Architecture of Two-Cloud Setting Based Solution
Adopted Security Model
More realistic model: Secure multiparty computation (SMC)
Parties collaboratively compute the functionality in a secure fashion without using a trusted third party
In SMC, security means guaranteeing the correctness of the output as well as the privacy of the parties' inputs
Adopted Adversarial Model
Adversarial model
Generally specifies what an adversary or attacker is allowed to do during an execution of a secure protocol
Common adversary models under SMC
Semi-honest: parties follow the protocol faithfully, but can try to infer the secret information of the other parties from the data they see during the protocol execution
Malicious: parties may do anything to infer secret information (e.g., input modification, sending the wrong values)
In our work, we adopt the semi-honest adversary model
Justification of Use of Semi-Honest Model
Two reasons for adopting Semi-Honest Model
Developing protocols under the semi-honest setting is an important first step towards constructing protocols with stronger security guarantees
Both C1 and C2 are assumed to be cloud service providers. Today, cloud service providers in the market are legitimate, well-known companies (e.g., Amazon, Google, and Microsoft). These companies maintain reputations that are invaluable assets that need to be protected at all costs. Thus, a collusion between them is highly unlikely, as it would damage their reputation, which, in turn, would affect their revenues.
Additive Homomorphic Probabilistic Encryption
Let $E_{pk}$ and $D_{sk}$ be the encryption and decryption functions. Given $m_1, m_2 \in \mathbb{Z}_N$, the AH-ENC system exhibits the following properties.
Homomorphic Addition
$D_{sk}(E_{pk}(m_1 + m_2)) = D_{sk}(E_{pk}(m_1) * E_{pk}(m_2))$
Homomorphic Multiplication
Given a constant $c$ and a ciphertext $E_{pk}(m_1)$
$D_{sk}(E_{pk}(c * m_1)) = D_{sk}(E_{pk}(m_1)^c)$
Probabilistic
Let $c_1 = E_{pk}(m_1)$ and $c_2 = E_{pk}(m_2)$
The probability that $c_1 \neq c_2$ is very high even if $m_1 = m_2$
Semantic Security
Given $E_{pk}(m_1)$, an adversary cannot derive any information about $m_1$
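The two homomorphic properties above can be checked numerically. The sketch below assumes the Paillier cryptosystem as the AH-ENC instance (the slides do not name a scheme at this point) and uses tiny, insecure primes purely for illustration; the function names `keygen`, `encrypt`, and `decrypt` are hypothetical.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    """Toy Paillier key pair (assumed scheme); real keys use far larger primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)   # public modulus n, private key (lam, mu)

def encrypt(n, m):
    """E_pk(m): fresh randomness r makes the encryption probabilistic."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, sk, c):
    """D_sk(c) for the g = n + 1 variant of Paillier."""
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

n, sk = keygen()
m1, m2, const = 123, 456, 7
e1, e2 = encrypt(n, m1), encrypt(n, m2)

# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
sum_plain = decrypt(n, sk, (e1 * e2) % (n * n))
# Homomorphic multiplication: raising a ciphertext to a constant scales the plaintext.
scaled_plain = decrypt(n, sk, pow(e1, const, n * n))
```

Here `sum_plain` equals `m1 + m2` and `scaled_plain` equals `const * m1`, both modulo `n`.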
Sub-protocol
Secure Multiplication (SM)
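$\mathrm{SM}(\langle C_1 : E_{pk}(a), E_{pk}(b)\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : E_{pk}(a * b)\rangle, \langle C_2 : \emptyset\rangle)$
Input: $E_{pk}(a)$, $E_{pk}(b)$, and the private key $sk$
Output: encryption of $a * b$
Secure Squared Euclidean Distance (SSED)
$\mathrm{SSED}(\langle C_1 : E_{pk}(X), E_{pk}(Y)\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : E_{pk}(|X - Y|^2)\rangle, \langle C_2 : \emptyset\rangle)$
Input: $X$ and $Y$ are $m$-dimensional vectors, and $sk$ is the private key, where $E_{pk}(X) = \langle E_{pk}(x_1), \ldots, E_{pk}(x_m)\rangle$ and $E_{pk}(Y) = \langle E_{pk}(y_1), \ldots, E_{pk}(y_m)\rangle$
Output: encryption of the squared Euclidean distance between $X$ and $Y$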
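The slides specify only the inputs and outputs of SM and SSED. The sketch below shows one way they can be realized, assuming Paillier as the AH-ENC scheme and an additive-masking construction for SM (both assumptions, with toy parameters). C1 and C2 are simulated in one process, and all function names are hypothetical.

```python
import math
import secrets

P, Q = 1009, 1013                    # toy primes, for illustration only
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)                 # (LAM, MU) plays the role of sk, held by C2

def enc(m):
    """E_pk(m) under toy Paillier with generator N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return (pow(N + 1, m % N, N2) * pow(r, N, N2)) % N2

def dec(c):
    """D_sk(c); in the two-cloud setting only C2 can do this."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def c2_multiply(masked_a, masked_b):
    """C2's step: decrypt the two masked values, multiply, re-encrypt."""
    return enc(dec(masked_a) * dec(masked_b) % N)

def sm(ea, eb):
    """C1's steps: derive E(a*b) from E(a), E(b) without seeing a or b."""
    ra, rb = secrets.randbelow(N), secrets.randbelow(N)
    masked_a = ea * enc(ra) % N2             # E(a + ra)
    masked_b = eb * enc(rb) % N2             # E(b + rb)
    eh = c2_multiply(masked_a, masked_b)     # E((a + ra)(b + rb))
    # a*b = h - a*rb - b*ra - ra*rb, evaluated on ciphertexts
    s = eh * pow(ea, N - rb, N2) % N2
    s = s * pow(eb, N - ra, N2) % N2
    return s * enc(-(ra * rb)) % N2

def ssed(ex, ey):
    """C1's steps: E(|X - Y|^2) from attribute-wise encryptions."""
    total = enc(0)
    for exi, eyi in zip(ex, ey):
        diff = exi * pow(eyi, N - 1, N2) % N2    # E(x_i - y_i)
        total = total * sm(diff, diff) % N2      # add E((x_i - y_i)^2)
    return total

product = dec(sm(enc(6), enc(7)))
distance = dec(ssed([enc(1), enc(2), enc(3)], [enc(4), enc(6), enc(3)]))
```

In this sketch C2 only ever decrypts values masked by random `ra` and `rb`, and C1 only ever handles ciphertexts, which matches the stated outputs: C1 receives the encrypted result and C2 receives nothing.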
Sub-Protocols 2
Secure Bit-OR (SBOR)
$\mathrm{SBOR}(\langle C_1 : E_{pk}(o_1), E_{pk}(o_2)\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : E_{pk}(o_1 \lor o_2)\rangle, \langle C_2 : \emptyset\rangle)$
Input: $o_1$ and $o_2$ are two bits, and $sk$ is the private key
Output: encryption of the Boolean OR of $o_1$ and $o_2$
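The slides give only the input and output of SBOR. One way to obtain it, assuming access to the SM sub-protocol and the AH-ENC properties (a sketch, not necessarily the construction used in this work), is to rewrite OR arithmetically for bits and evaluate it on ciphertexts, using the exponent $N - 1 \equiv -1 \pmod{N}$ to negate the product:

```latex
o_1 \lor o_2 = o_1 + o_2 - o_1 o_2, \qquad o_1, o_2 \in \{0, 1\}

E_{pk}(o_1 \lor o_2)
  = E_{pk}(o_1) * E_{pk}(o_2) * \mathrm{SM}\big(E_{pk}(o_1), E_{pk}(o_2)\big)^{N-1}
```

For example, with $o_1 = 1$ and $o_2 = 1$ the plaintext is $1 + 1 - 1 = 1$, and with $o_1 = o_2 = 0$ it is $0$.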
Secure Bit-Decomposition (SBD)
$\mathrm{SBD}(\langle C_1 : E_{pk}(z)\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : [z]\rangle, \langle C_2 : \emptyset\rangle)$
Input: $E_{pk}(z)$ such that $0 \le z < 2^l$, and $sk$ is the private key
Output: $[z] = \langle E_{pk}(z_1), \ldots, E_{pk}(z_l)\rangle$
SBD: Example
Let $z = 5$, $l = 3$. $\mathrm{SBD}(E_{pk}(5), sk) \rightarrow [z] = \langle E_{pk}(1), E_{pk}(0), E_{pk}(1)\rangle$
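To make the output format concrete, the following plaintext sketch lists the bits that SBD would return in encrypted form. It does not perform the secure protocol. The most-significant-bit-first ordering is an assumption, chosen to agree with the slides' [3] = ⟨Epk(0), Epk(1), Epk(1)⟩; `bits_msb_first` is a hypothetical helper name.

```python
def bits_msb_first(z, l):
    """Plaintext bits z_1..z_l of z (most significant first), for 0 <= z < 2**l."""
    if not 0 <= z < 2 ** l:
        raise ValueError("z must satisfy 0 <= z < 2**l")
    return [(z >> i) & 1 for i in range(l - 1, -1, -1)]

bits_of_5 = bits_msb_first(5, 3)
bits_of_3 = bits_msb_first(3, 3)
```

SBD would give C1 the encryption of each of these bits, so `bits_of_5` corresponds to the slide's ⟨Epk(1), Epk(0), Epk(1)⟩.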
New Secure Minimum Sub-Protocol
Secure Minimum (SMIN)
$\mathrm{SMIN}(\langle C_1 : (u', v')\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : [\min(u, v)], E_{pk}(s_{\min(u,v)})\rangle, \langle C_2 : \emptyset\rangle)$
Input: $u' = ([u], E_{pk}(s_u))$, $v' = ([v], E_{pk}(s_v))$, and $sk$ is the private key
$[u]$ (resp., $[v]$) denotes the encryptions of the individual bits of the binary representation of $u$ (resp., $v$)
$s_u$ (resp., $s_v$) denotes the secret corresponding to $u$ (resp., $v$)
Output:
$[\min(u, v)]$: encryptions of the individual bits of the minimum of $u$ and $v$
$E_{pk}(s_{\min(u,v)})$: encryption of the secret corresponding to the minimum of $u$ and $v$
SMIN: Example
Let $u' = ([5], E_{pk}(s_5))$ and $v' = ([3], E_{pk}(s_3))$, where $[5] = \langle E_{pk}(1), E_{pk}(0), E_{pk}(1)\rangle$ and $[3] = \langle E_{pk}(0), E_{pk}(1), E_{pk}(1)\rangle$. Then $\mathrm{SMIN}(u', v') = ([3], E_{pk}(s_3))$
New Secure Minimum out of n numbers Sub-Protocol
Secure Minimum out of n numbers (SMINn)
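$\mathrm{SMIN}_n(\langle C_1 : (\theta_1, \ldots, \theta_n)\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : [d_{min}], E_{pk}(s_{d_{min}})\rangle, \langle C_2 : \emptyset\rangle)$
Input: $\theta_i = ([d_i], E_{pk}(s_{d_i}))$ for $1 \le i \le n$, and $sk$ is the private key
$[d_i]$: encryptions of the individual bits of the binary representation of $d_i$, for $1 \le i \le n$
$s_{d_i}$: secret corresponding to $d_i$, for $1 \le i \le n$
Output:
$[\min(d_1, \ldots, d_n)] = [d_{min}]$: encryptions of the individual bits of the global minimum
$E_{pk}(s_{\min(d_1, \ldots, d_n)}) = E_{pk}(s_{d_{min}})$: encryption of the secret corresponding to the global minimum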
SMAX and SMAXn
Similarly, one can design SMAX and SMAXn to compute the global maximum
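As a plaintext reference for what SMIN and SMINn compute, the sketch below pairs each value with its secret and reduces n pairs with repeated pairwise minimums. The tournament-style reduction and the tie-breaking rule are assumptions made for illustration; the real sub-protocols operate on encrypted bits, and the names `smin` and `smin_n` are hypothetical.

```python
def smin(u, v):
    """Return the (value, secret) pair with the smaller value; ties keep u (assumed)."""
    return u if u[0] <= v[0] else v

def smin_n(pairs):
    """Reduce n (value, secret) pairs with rounds of pairwise smin."""
    layer = list(pairs)
    if not layer:
        raise ValueError("need at least one pair")
    while len(layer) > 1:
        nxt = [smin(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:           # an unpaired element advances unchanged
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

pair_min = smin((5, "s5"), (3, "s3"))
global_min = smin_n([(7, "s7"), (2, "s2"), (9, "s9"), (4, "s4"), (6, "s6")])
```

Replacing `<=` with `>=` in `smin` gives the corresponding plaintext reference for SMAX and SMAXn.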
New Secure Frequency Sub-Protocol
Secure Frequency (SF)
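$\mathrm{SF}(\langle C_1 : (\Lambda, \Lambda')\rangle, \langle C_2 : sk\rangle) \rightarrow (\langle C_1 : E_{pk}(f(c_1)), \ldots, E_{pk}(f(c_w))\rangle, \langle C_2 : \emptyset\rangle)$
Input: $\Lambda = \langle E_{pk}(c_1), \ldots, E_{pk}(c_w)\rangle$, $\Lambda' = \langle E_{pk}(c'_1), \ldots, E_{pk}(c'_k)\rangle$, and $sk$ is the private key
Output:
$E_{pk}(f(c_j))$: encryption of the frequency of $c_j$ in the list $\langle c'_1, \ldots, c'_k\rangle$, for $1 \le j \le w$
The $c_j$'s are unique and $c'_i \in \{c_1, \ldots, c_w\}$ for $1 \le i \le k$
SF: Example
Let $w = 3$ and $k = 6$, with $\Lambda = \langle E_{pk}(2), E_{pk}(3), E_{pk}(5)\rangle$ and $\Lambda' = \langle E_{pk}(3), E_{pk}(2), E_{pk}(3), E_{pk}(2), E_{pk}(5), E_{pk}(2)\rangle$. Then $\mathrm{SF}(\Lambda, \Lambda') \rightarrow \langle E_{pk}(f(2)) = E_{pk}(3), E_{pk}(f(3)) = E_{pk}(2), E_{pk}(f(5)) = E_{pk}(1)\rangle$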
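The frequencies in the SF example can be reproduced on plaintexts. The sketch below is only a reference for the values that SF returns in encrypted form, not the secure protocol; `sf_plain` is a hypothetical name.

```python
from collections import Counter

def sf_plain(classes, labels):
    """Frequency of each class c_j in the list of labels c'_1..c'_k."""
    counts = Counter(labels)
    return [counts[c] for c in classes]

frequencies = sf_plain([2, 3, 5], [3, 2, 3, 2, 5, 2])
```

The result lists f(2), f(3), and f(5) in the order of the classes in Λ.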
Proving Security Under The Semi-Honest Model
Key ideas
Information deduced from the messages in the real execution image of a protocol should be computationally indistinguishable from the information deduced from the corresponding messages in the simulated view
For this, all the intermediate messages seen by an adversary during the execution of a protocol should be either random or pseudo-random
To prove that a protocol is secure under the semi-honest model, it is required to show that the execution image of the protocol does not leak any information regarding the private inputs of the participating parties; that is, we need to show that the simulated execution image of the protocol is computationally indistinguishable from its actual execution image
An execution image generally includes the messages exchanged and the information computed from these messages
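The requirement that intermediate messages look random can be illustrated with a small exhaustive check. The setup is hypothetical: assume a party sees a private value a only after it has been masked as (a + r) mod N with a uniformly random r. Over all masks, the observed value is uniform regardless of a, so a simulator that samples uniformly at random produces the same distribution without knowing a.

```python
from collections import Counter

def masked_view_distribution(a, modulus):
    """Distribution of (a + r) mod modulus over all equally likely masks r."""
    return Counter((a + r) % modulus for r in range(modulus))

MODULUS = 11                       # tiny modulus so the check is exhaustive
uniform = Counter(range(MODULUS))  # what a simulator would sample from
views = {a: masked_view_distribution(a, MODULUS) for a in range(MODULUS)}
all_identical = all(view == uniform for view in views.values())
```

Because every private input yields the same distribution, the masked message reveals nothing about a; a full proof also has to account for every other message in the execution image, including ciphertexts.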