A Framework for Secure Computations With Two Non-Colluding ...

13
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015 445 A Framework for Secure Computations With Two Non-Colluding Servers and Multiple Clients, Applied to Recommendations Thijs Veugen, Robbert de Haan, Ronald Cramer, and Frank Muller Abstract—We provide a generic framework that, with the help of a preprocessing phase that is independent of the inputs of the users, allows an arbitrary number of users to securely outsource a computation to two non-colluding external servers. Our approach is shown to be provably secure in an adversarial model where one of the servers may arbitrarily deviate from the protocol specification, as well as employ an arbitrary number of dummy users. We use these techniques to implement a secure recom- mender system based on collaborative filtering that becomes more secure, and significantly more efficient than previously known implementations of such systems, when the preprocessing efforts are excluded. We suggest different alternatives for preprocessing, and discuss their merits and demerits. Index Terms— Secure multi-party computation, malicious model, secret sharing, client-server systems, preprocessing, recommender systems. I. I NTRODUCTION R ECOMMENDATION systems consist of a processor together with a multitude of users, where the processor provides recommendations to requesting users, which are deduced from personal ratings that were initially submitted by all users. It is easy to see that, in a non-cryptographic setup of such a system, the processor is both able to learn all data submitted by the users, and spoof arbitrary, incorrect recommendations. In this work we replace the recommendation processor by a general two-server processor in such a way that, as long as one of the two servers is not controlled by an adversary and behaves correctly, Manuscript received February 17, 2014; revised September 2, 2014 and October 29, 2014; accepted October 29, 2014. Date of publication Novem- ber 13, 2014; date of current version January 22, 2015. The work of T. Veugen and F. Muller was supported by the Dutch National COMMIT Programme. The work of R. de Haan was supported by the Dutch Science Foundation through the Vrije Competitie Project entitled Applications of Arithmetic Secret Sharing. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. C.-C. Jay Kuo. T.Veugen is with the Cyber Security Group, Delft University of Technology, Delft, The Netherlands, and also with the Department of Technical Sciences, TNO, The Netherlands (e-mail: [email protected]). R. de Haan was with the Centrum Wiskunde and Informatica (CWI), Amsterdam, The Netherlands. R. Cramer is with the Centrum Wiskunde and Informatica (CWI), Amsterdam, The Netherlands, and also with Leiden University, Leiden, The Netherlands (e-mail: [email protected]). F. Muller is with the Department of Technical Sciences, TNO, The Nether- lands (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIFS.2014.2370255 1) the privacy of the ratings and recommendations of the users is maintained to the fullest extent possible, and 2) a server that is under adversarial control is unable to disrupt the recommendation process in such a way that an incorrectly computed recommendation will not be detected by the requesting user. Our result uses a modified version of the standard model for secure multi-party computation, which is a cryptologic paradigm in which the players jointly perform a single secure computation and then abort. In our model the computation is ongoing (recommendations are repeatedly requested) and outsourced to two external servers that do not collude. This approach allows for the involvement of many users that need only be online for very short periods of time in order to provide input data to, or request output data from, the servers. In practice, one of the two servers could be the service provider that wishes to recommend particular services to users, and the other server could be a governmental organisation guarding the privacy protection of users. The role of the second server could also be commercially exploited by a privacy service provider, supporting service providers in protecting the privacy of their customers. While this work focuses on the application of secure recommendation systems, we point out that our underlying framework is sufficiently generic for use in other, similar applications, and also easily extends to model variations involving more than two servers. In general, any configuration is foreseen consisting of multiple servers that have a joint goal of securely delivering a service to a multitude of users. The online phase will be computationally secure as long as one of the servers is honest [1]. However, the application should allow for a significant amount of preprocessing, which is independent of the data, and could be performed during inactive time. Although most related work (see Subsection I-A) is only secure in the semi-honest model, we provide security in the malicous model (see Subsection III-B). There are several reasons why security in the malicious model is important. First of all, the main goal of securing the system is that we do not want the servers to learn the personal data of users. We therefore should not just trust them to follow the rules of the protocol. Second, in a malicious model the correctness of the user outputs is better preserved, because outputs cannot be corrupted by one (malicious) server on his own. Third, a malicious server might introduce a couple of new, so called 1556-6013 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Transcript of A Framework for Secure Computations With Two Non-Colluding ...

Page 1: A Framework for Secure Computations With Two Non-Colluding ...

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015 445

A Framework for Secure Computations With TwoNon-Colluding Servers and Multiple Clients,

Applied to RecommendationsThijs Veugen, Robbert de Haan, Ronald Cramer, and Frank Muller

Abstract— We provide a generic framework that, with the helpof a preprocessing phase that is independent of the inputs of theusers, allows an arbitrary number of users to securely outsource acomputation to two non-colluding external servers. Our approachis shown to be provably secure in an adversarial model whereone of the servers may arbitrarily deviate from the protocolspecification, as well as employ an arbitrary number of dummyusers. We use these techniques to implement a secure recom-mender system based on collaborative filtering that becomes moresecure, and significantly more efficient than previously knownimplementations of such systems, when the preprocessing effortsare excluded. We suggest different alternatives for preprocessing,and discuss their merits and demerits.

Index Terms— Secure multi-party computation, maliciousmodel, secret sharing, client-server systems, preprocessing,recommender systems.

I. INTRODUCTION

RECOMMENDATION systems consist of a processortogether with a multitude of users, where the processor

provides recommendations to requesting users, which arededuced from personal ratings that were initially submittedby all users. It is easy to see that, in a non-cryptographicsetup of such a system, the processor is both able to learnall data submitted by the users, and spoof arbitrary, incorrectrecommendations.

In this work we replace the recommendation processor bya general two-server processor in such a way that, as long asone of the two servers is not controlled by an adversary andbehaves correctly,

Manuscript received February 17, 2014; revised September 2, 2014 andOctober 29, 2014; accepted October 29, 2014. Date of publication Novem-ber 13, 2014; date of current version January 22, 2015. The work of T. Veugenand F. Muller was supported by the Dutch National COMMIT Programme.The work of R. de Haan was supported by the Dutch Science Foundationthrough the Vrije Competitie Project entitled Applications of ArithmeticSecret Sharing. The associate editor coordinating the review of this manuscriptand approving it for publication was Prof. C.-C. Jay Kuo.

T. Veugen is with the Cyber Security Group, Delft University of Technology,Delft, The Netherlands, and also with the Department of Technical Sciences,TNO, The Netherlands (e-mail: [email protected]).

R. de Haan was with the Centrum Wiskunde and Informatica (CWI),Amsterdam, The Netherlands.

R. Cramer is with the Centrum Wiskunde and Informatica (CWI),Amsterdam, The Netherlands, and also with Leiden University, Leiden, TheNetherlands (e-mail: [email protected]).

F. Muller is with the Department of Technical Sciences, TNO, The Nether-lands (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are availableonline at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIFS.2014.2370255

1) the privacy of the ratings and recommendations of theusers is maintained to the fullest extent possible, and

2) a server that is under adversarial control is unable todisrupt the recommendation process in such a way thatan incorrectly computed recommendation will not bedetected by the requesting user.

Our result uses a modified version of the standard modelfor secure multi-party computation, which is a cryptologicparadigm in which the players jointly perform a single securecomputation and then abort. In our model the computationis ongoing (recommendations are repeatedly requested) andoutsourced to two external servers that do not collude. Thisapproach allows for the involvement of many users thatneed only be online for very short periods of time in orderto provide input data to, or request output data from, theservers. In practice, one of the two servers could be theservice provider that wishes to recommend particular servicesto users, and the other server could be a governmentalorganisation guarding the privacy protection of users. Therole of the second server could also be commercially exploitedby a privacy service provider, supporting service providers inprotecting the privacy of their customers.

While this work focuses on the application of securerecommendation systems, we point out that our underlyingframework is sufficiently generic for use in other, similarapplications, and also easily extends to model variationsinvolving more than two servers. In general, any configurationis foreseen consisting of multiple servers that have a jointgoal of securely delivering a service to a multitude of users.The online phase will be computationally secure as long asone of the servers is honest [1]. However, the applicationshould allow for a significant amount of preprocessing, whichis independent of the data, and could be performed duringinactive time.

Although most related work (see Subsection I-A) is onlysecure in the semi-honest model, we provide security in themalicous model (see Subsection III-B). There are severalreasons why security in the malicious model is important.First of all, the main goal of securing the system is that wedo not want the servers to learn the personal data of users.We therefore should not just trust them to follow the rules ofthe protocol. Second, in a malicious model the correctness ofthe user outputs is better preserved, because outputs cannotbe corrupted by one (malicious) server on his own. Third, amalicious server might introduce a couple of new, so called

1556-6013 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Page 2: A Framework for Secure Computations With Two Non-Colluding ...

446 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

dummy users, which he controls. These dummy users mighthelp him deduce more personal data than is available throughthe protocol outputs. We show that in our malicious modelthis is not possible.

A. Related Work

Most related work on privacy preserving recommenda-tions is secure in the semi-honest model, so parties areassumed to follow the rules of the protocol. As mentioned byLagendijk et al. [2], “Achieving security against maliciousadversaries is a hard problem that has not yet been studiedwidely in the context of privacy-protected signal processing.”

Erkin et al. [3] securely computed recommendationsbased on collaborative filtering. They used homomorphicencryption within a semi-honest security model just likeBunn and Ostrovsky [4]. Goethals et al. [5] stated that althoughsuch techniques can be made secure in the malicious model,that will make them unsuitable for real life applicationsbecause of the increased computational and communicationcosts.

Polat and Du [6] used a more lightweight approach bystatistically hiding personal data, which unfortunately has beenproven insecure by Zhang et al. [7]. Atallah et al. [8] useda threshold secret-sharing approach for secure collaborativeforecasting with multiple parties.

Nikolaenko et al. [9] securely computed collaborativefiltering by means of matrix factorization. They used bothhomomorphic encryption and garbled circuits in a semi-honestsecurity model. In another paper [10], these authors use similartechniques to securely implement Ridge regression, a differentapproach of collaborative filtering.

Catrina and de Hoogh [11] developed an efficient frameworkfor secure computations in the semi-honest model, based onsecret sharing and statistical security, which could also be usedfor a recommender system. Peter et al. [12] considered a modelwhere users can outsource computations to two non-colludingservers. They use homomorphic encryption, each user havingits own key, but require the servers to follow the rules of theprotocol.

Canny’s approach [13] on private collaborative filtering isable to cope with malicious behaviour. A detailed comparisonwith our approach can be found in Subsection V-E.

In the last several years, a couple of computation protocolshave been developed, which are both practical and secure inthe malicious model. The idea is to use public-key techniquesin a data-independent pre-processing phase, such that cheapinformation-theoretic primitives can be exploited in the onlinephase, which makes the online phase efficient. In 2011,Bendlin et al. [1] presented such a framework with a some-what homomorphic encryption scheme for implementing thepre-processing phase. This has been improved lately byDamgård et al. [14], which has become known as SPDZ(pronounced “Speedz”). Last year, Damgård et al. [15] showedhow to further reduce the precomputation effort.

B. Our Contribution

We used the SPDZ framework, as introduced in the previoussubsection, which enables secure multi-party computations in

the malicious model, extended it to the client-server model,and worked out a secure recommendation system within thissetting. Not only did this lead to a recommendation systemthat is secure in the malicious model, but also the onlinephase became very efficient in terms of computation andcommunication complexity.

To extend SPDZ to the client-server model, we developedsecure protocols that enable users (clients) to upload their datato the servers, and afterwards obtain the computed outputsfrom the servers. This required a subprotocol for generatingduplicate sharings, as described in Appendix A.

To securely compute a recommendation within SPDZ,we had to develop a secure comparison protocol and asecure integer division protocol. For the comparison proto-col, we combined ideas from Nishide and Ohta [16], andSchoenmakers and Tuyls [17]. In fact, we finetuned thebitwise comparison protocol from [17] to a linear-roundsecret-sharing setting, and replaced the constant-round solutionof [16]. For the integer division, which forms the bottle-neck of our application, we took a secure solution fromBunn and Ostrovsky [4], translated it to a secret-sharingsetting, and put in our secure comparison protocol. Thisyielded an efficient secure integer division, because its outputwas relatively small in our application.

Finally, we took the effort for implementing and testingour recommender system within the SPDZ framework, andcomparing it with the current state-of-the-art.

C. Organization of the Paper

We first explain the basics of collaborative filtering andauthenticated secret sharing, which are necessary for ourpaper. In Section III we describe how we fit secure multi-party computation in the client-server model, and the resultingsecurity model. Section IV focusses on the secure implemen-tation of our recommender system, the various options forpreprocessing, and the more complicated subprotocols that areneeded. The complexity of this implementation is analysed inSection V, and compared with related work. Finally, we stressthe wide applicability of our secure framework.

II. PRELIMINARIES

After explaining collaborative filtering, we introduce theconcepts of SPDZ: tagged secret sharing, and the basic pro-tocols of the computation phase.

A. User-Based Collaborative Filtering

We provide as an example application, taken from [3],a basic implementation for generating recommendations,more specifically a collaborative filtering method with user-neighborhood-based rating prediction [18]. We do not pretendto present the best recommender system, but merely want tointroduce some basic components, and show how they can beimplemented securely.

There is one processor R, with N users, and M differentpredefined items. A small subset of size S (1 ≤ S < M)of these items is assumed to have been rated by each user,

Page 3: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 447

reflecting his or her personal taste. The remaining M − Sitems have only been rated by a small subset of users that haveexperienced the particular item before. A user that is lookingfor new, unrated items, can ask the processor to produceestimated ratings for (a subset of) the M−S items. The numberN of users can be large, say 1 million, M is in the order ofhundreds, and S usually is a few tens [18].

During initialisation, each user n uploads to the processora list of at most M ratings of items, where each rating V(n,m)is represented by a value within a pre-specified interval. Userscan update their ratings at any time during the lifetime ofthe system. A user can, at any time after the initialization,request a recommendation from the processor. When theprocessor receives such a request from a user, it computesa recommendation for this user as follows. First, it uses theinitial S ratings in each list to determine which other usersare considered to be similar to the requesting user, i.e. havesimilarly rated the first S items. The remaining M−S entries inthe lists are then used to compute and return a recommendationfor the requesting user, consisting of M − S ratings averagedover all similar users.

To get an idea of the required computation we describe therequired computational steps. Let Um be the set of users thathave rated item m, S < m ≤ M .

1) Each user uploads his (between S and M) ratings to theprocessor. Like in [3], we assume the first S (similarity)ratings have been normalized and scaled beforehand.A rating is normalized by dividing it by the length(Euclidean norm) of the vector (V(n,1), . . . , V(n,S)),yielding a real number between 0 and 1. Next, thisreal number is scaled and rounded to a positive integerconsisting of a few (e.g. four) bits. The remaining ratingsshould only be scaled and rounded to an integer with thesame maximal number of bits.

2) When user A asks for a recommendation, the processorcomputes M − S estimated ratings for A. First, the sim-ilarities SimA,n = ∑S

m=1 V(n,m) · V(A,m) are computedfor each user n.

3) Each similarity value is compared with a public thresh-old t ∈ N

+, and its outcome is presented by the bitδn = (t < SimA,n).

4) The recommendation for user A consists of M − Sestimated ratings, the estimated rating for item m,S < m ≤ M , simply being an average of the ratingsof the similar users: Recm = (

∑n∈Um

δn · V(n,m)) ÷(∑

n∈Umδn), where ÷ denotes integer division.

5) The processor sends back the recommendationRecS+1 . . .RecM to user A.

Here, we used the notation (x < y) to denote the binary resultof the comparison of two numbers x and y, which equals 1when the comparison is true, and 0 otherwise.

B. Tagged Secret Sharing

Let p be a prime consisting of � bits. We denote Fp

the field of integers modulo p with ordinary addition andmultiplication, and use the elements as if they were integers ofthe set {0, 1, . . . , p−1}. We use + for addition and · for multi-plication, both in the field Fp, and therefore omit the reduction

modulo p. All secret values, which are elements of Fp,are additively shared between the servers. The framework oftagged secret sharing used here is from Bendlin et al. [1].

1) Authenticated Secret Sharing: An external dealer distrib-utes shares of a secret value x ∈ Fp to the two servers asfollows:

1) The dealer selects a value r ∈ Fp uniformly at random.2) The dealer sends the value r to server 1 and the value

x − r to server 2.The values x1 = r and x2 = x − r are considered to bethe share of server 1 and the share of server 2, respectively.It should be clear from the description above that the sharesx1 and x2 are both individually statistically independent of thesecret x , while they together allow to determine the value of x ,by simply adding these shares together.

In addition to the distribution of the shares, the dealerdistributes authentication tags on the shares with respect tothe authentication code C : Fp × F

2p → Fp , defined as

C(x, (α, β)) = α · x + β. Here the value (α, β) is called theauthentication key and the value α · x + β the authenticationtag for the share x .

For every share x1 for server 1, the dealer generates arandom authentication key (α2, β2), computes the correspond-ing authentication tag m1 = α2 · x1 + β2 and sends thekey (α2, β2) to server 2, and the share x1 and tag m1 toserver 1. Symmetrically, this is done for the share x2 ofserver 2. We use notation 〈x〉 to denote such a randomizedshare distribution with corresponding authentication keys andtags for the value x , and call 〈x〉 a(n) (authenticated) secretsharing. Equivalently, we say that the value x is secret-sharedwhen a secret sharing 〈x〉 has been constructed.

Suppose that for a secret sharing 〈x〉 the value x is to berevealed to server 1. This proceeds by server 2 sending theshare x2 and the tag m2 to server 1. Server 1 then verifieswhether m2 = α1 · x2 + β1 and recovers x = x1 + x2.If the tag values are incorrect, the protocol is aborted. This isreferred to as opening 〈x〉 to server 1, and works symmetricallyfor server 2. A secret sharing can also be opened to a partythat is external to servers 1 and 2, by sending the party allrelevant shares, authentication tags and authentication keys,and letting this party do the corresponding verifications.

It is easy to see that any attempt by, for example, server 1 tocorrectly adjust a tag m1 with key (α2, β2) during a maliciousshare modification from x1 to the value x ′

1 �= x1, boils downto guessing the value α2 · (x ′

1 − x1), which only succeeds withprobability 1/p if the value of α2 is unknown.

Although we only focus on two servers, the framework cancope with multiple servers. When n servers are available, thebasic idea is that each server is given authentication keys foreach of the n shares [1]. Intuitively, this allows an honestserver to detect malicious behaviour of the remaining servers,so in SPDZ one honest server is sufficient for providingsecurity. The quadratic amount of authentication keys hasbeen improved lately by Damgård et al. [14], where tagauthentication is linear in the number of players.

2) Structured-Authentication: As we demonstrate in thenext section, it is possible to perform basic linear operations onsecret-shared values without interaction. However, in order to

Page 4: A Framework for Secure Computations With Two Non-Colluding ...

448 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

make it possible to non-interactively update the correspondingauthentication keys and tags, we require both servers to fixtheir respective α-components of their authentication keys.Henceforth, each server i maintains a fixed private valueαi ∈ Fp that is used in all of its authentication keys, andwe use notation [x] for a secret sharing of x based on sucha-priori fixed α-components.

Having an incorruptible external dealer, the servers caneasily create such a structured distribution [x] using thestandard distribution 〈x〉 obtained from the dealer. We describethe necessary steps for server 1; they are similar for server 2.All that is required is that for every αx

1 received from thedealer, server 1 sends the difference δx

1 = α1 −αx1 to server 2,

after which server 2 updates his tag m2 to m2 + δx1 · x2. Since

the values α1 and αx1 are only known to server 1, δx

1 revealsno information about α1 to server 2 [1].

C. Core Protocols for the Computation Phase

In this section we describe the core protocols that are neededfor our recommender system. Suppose that servers 1 and 2hold fixed partial authentication keys α1, α2 ∈ Fp as describedin Subsection II-B.2.

1) Linear Operations: Let [x] and [y] be secret sharings ofarbitrary values x, y ∈ Fp and let c ∈ Fp be a public constant.We will show how to non-interactively compute secret sharingsfor [x + y], [cx] and [x + c] respectively.

An authenticated secret sharing of the sum z = x + yis computed by locally adding the shares, keys, and tags ofx and y. Let zi = xi + yi , β

zi = βx

i + βyi and mz

i = mxi + my

i ,for i = 1, 2, where zi , mz

i , and βzi are held by server i . Then

indeed z = z1 + z2 = x + y, and the authentication tags mzi of

the shares of z are correct, with authentication keys (αi , βzi ).

From a sharing [x], an authenticated secret sharing of cxis computed by local multiplication of the shares, keys, andtags of x . Let zi = c · xi , β

zi = c · βx

i , and mzi = c · mx

i , fori = 1, 2. Then indeed z = z1+z2 = cx , and the authenticationtags mz

i of the shares of z are correct.To add a public constant to a secret sharing [x], one party

adds the constant to its share, and the other party adjusts itsauthentication key. Let z1 = c + x1, z2 = x2, βz

1 = βx1 ,

βz2 = βx

2 − c · α2, mz1 = mx

1, and mz2 = mx

2 . Then indeedz = z1 + z2 = c + x , and the authentication tags remaincorrect.

2) Multiplication: We have seen in the previous subsectionthat linear operations on secret-shared values can be handlednon-interactively, including the updates of the correspondingauthentication keys and tags. In this section we describe howto turn the multiplication of two secret-shared values intoan operation that is essentially linear, using a precomputedmultiplication triplet. Since the multiplication protocol belowrequires some secret sharings to be opened, the multiplica-tion of two shared values requires interaction between thetwo servers. For clarity, we omit the updating of authenticationkeys and tags, and the details of opening a secret shared value,which have been described earlier.

Let [x] and [y] denote secret sharings of arbitrary valuesx, y ∈ Fp , and let ([a], [b], [c]) be a given multiplication

triplet such that c = ab. The following sequence of local linearoperations and interactions is used to compute a sharing [z]where z = xy, making use of the precomputed multiplicationtriplet:

1) The servers locally compute the secret sharing [v] =[x − a] from [x] and [a], and then open it towards eachother.

2) The servers locally compute the secret sharing [w] =[y − b] from [y] and [b], and then open it towards eachother.

3) The servers locally compute the secret sharing

[z] = [xy] = w[a] + v[b] + [c] + vw.

The correctness of the outcome of the multiplication protocolis easily verified by working out xy = (v + a)(w + b).

III. THE SECURE CLIENT-SERVER MODEL

A. Overview

Our secure framework relies heavily on techniques from thecryptologic area of secure multi-party computation [19], [20].The problem of secure multi-party computation considers afully connected communication network of n parties, wherethe parties wish to compute the outcome of a given function fon their respective inputs. However, apart from the output, noinformation on the inputs should leak during the course ofthe computation. In this work we consider functions f thatcorrespond with the computation of a recommendation for auser.

The ideal solution to the problem of secure multi-partycomputation involves an independent, incorruptible mediatorthat privately takes all the required inputs, computes andreveals the desired output to the appropriate parties, andthen forgets everything. The research area of secure multi-party computation studies techniques that allow the parties tosimulate the behavior of such a mediator.

These simulations have the following structure. First, thereis an input phase that enables the parties to encrypt theirrespective inputs for use in the secure computation. Then acomputation phase takes place during which an encryptedoutput of the function f is computed from the encryptedinputs. Finally, an output phase takes place where the outputis decrypted, and sent to the appropriate parties.

We consider secure multi-party computation in the pre-processing model [21], where at some point in time priorto the selection of the inputs, a preprocessing phase takesplace that establishes the distribution of an arbitrary amountof correlated data between the parties involved in the compu-tation. Consequently, this data is completely independent ofthe input data of the parties. The goal of the preprocessing isto remove as much of the complexity and interaction from theactual computation as possible, which as a result makes thiscomputation extremely efficient. We discuss possible ways toimplement such a preprocessing phase in Section IV-A.

In theory, using techniques from secure multi-party compu-tation, it is possible for the users of the recommender systemto securely and jointly compute the recommendations them-selves. However, for a large number of users that approach

Page 5: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 449

quickly becomes impractical. In this work we instead let theusers outsource the problem of multi-party computation totwo dedicated servers that execute a number of two-partycomputations with preprocessing. Unlike in ordinary securemulti-party computation, the inputs are here provided by someexternal parties, and the outputs are also returned to externalparties. We therefore require special techniques to correctlyintegrate these operations into the computation. Moreover, wedesign the system in such a manner that the users need onlyprovide every input (and possible update) once, so they neednot retransmit their current inputs for the computation of everyindividual recommendation.

B. Adversarial Behavior and Security

We consider an adversary that is able to take full control ofone of the two servers involved in the two-party computationwith preprocessing. Moreover, the adversary can initially intro-duce an arbitrary number of so-called dummy users into thesystem, which are also under his complete control. Dummyusers can in particular, like ordinary users, input or updatetheir ratings, and request recommendations. The adversary canread all data available to the entities that are under his control,and can make them deviate from the cryptographic protocolspecification arbitrarily.

The goal is to prove that the actions of the adversary haveessentially no impact on the outcome or the security of theprotocol. In order to describe the security level of our securerecommendation system we make use of the real/ideal worldparadigm [22].

This paradigm is based on the comparison between twoscenarios. In the first scenario, the real world, the adversaryparticipates in a normal protocol execution. In the secondscenario, the ideal world, all participants in the protocol haveblack-box access to the functionality that the protocol inthe real world attempts to emulate. Moreover, there existsa simulator that creates a virtual environment around theadversary, and interacts with the adversary in a special fashion.

The goal of the simulator in the ideal world is to simulate theexpected world-view of the adversary during a real protocolexecution. The security analysis now consists of showing thatthe adversary cannot distinguish whether it is acting duringa protocol execution in the real world, or in the ideal world.Since no effective attack is possible in the ideal world, it thenfollows that no effective attack is possible in the real worldeither.

Our description of the real/ideal world paradigm so farhas been generic. We now provide additional details forthe real/ideal world setup for our specific setting of securerecommender systems. The real world corresponds with themodel outlined in Section III-A, except that we additionallyhave an entity called the environment that fully controls theactions of the adversary and additionally chooses the inputsof the non-adversarial users.

The ideal world describes how we would ideally expect therecommendation processor R to behave. The users in the idealworld have access to an independent incorruptible recommen-dation processor I that privately takes all the required inputs

and updates from the users and, upon request, privately returnsthe correct recommendations to the users. The two servers thatare used to implement the processor R in the real world donot exist in the ideal world.

Just like in the real world, the environment in the ideal worldfully controls the actions of the adversary, and provides inputsto the non-adversarial users, except that all communicationbetween the adversary and the non-adversarial entities (i.e., thenon-dummy users and the non-existent second server of thereal-world setting) is intercepted by the simulator. Specifically,the goal of the simulator is to simulate all communicationbetween the two servers, and all communication from the non-adversarial server to the dummy users, as it would occur duringa protocol execution in the real world.

To help with this simulation, the simulator has access to alldata generated in the preprocessing phase for the two servers,and it may in fact even be assumed without loss of generalitythat it actually generates this data. Moreover, since in the idealworld non-adversarial users communicate directly with theideal processor, this processor notifies the simulator whenevera non-adversarial user either provides it with an input, orrequests a recommendation.

The level of security that our secure recommendation systemachieves can now be described as follows. It is possible todesign a simulator for our secure recommendation systemin such a manner that regardless of what the environment(and therefore the adversary) does, it cannot significantlydistinguish whether it is acting in the real world or in theideal world. Since no meaningful attack is possible in the idealworld, the system is therefore secure in the real world.

Because a server is able to change his share of a particularsecret-shared value, and correctly guess the updated tag valuewith probability 1/p, perfect security is not achievable. Thebest security level is attained by having a trusted dealerperform the preprocessing phase, in which case the real worldand the ideal world are statistically indistinguishable by thesimulator, and our recommender system is statistically secure.When the preprocessing has been performed by public-keycryptography between the two servers, this will reduce tocomputationally indistinguishable worlds, and eventually acomputationally secure recommender system.

Since our core protocols, as described in Section II-C, fallwithin the general framework of [1], we for these proto-cols refer to their proof of security in the malicious model.A key point in which our model differs is that all inputsoriginate from, and all output is returned to, external parties.We therefore include appropriate ideal world simulation detailsto our input (see Section IV-D) and output (see Section IV-E)subprotocols.

C. Our Approach

Every computation corresponds with a function f , whichcan be represented via an arithmetic circuit consisting of basicoperations like addition and multiplication. For the recom-mender application we consider here, it suffices to considerthese basic operations together with the more complex opera-tions of comparison and integer division, which are composedof basic operations.

Page 6: A Framework for Secure Computations With Two Non-Colluding ...

450 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

As mentioned earlier, as a result of the ‘outsourcing aspect’of the secure computation, the input phase is non-standardin the sense that the inputs are not provided by the twoservers. We nevertheless assume for now (and provide detailson how to accomplish this in Section IV-D) that the inputphase results in ‘encryptions’ of the inputs that are suitablefor the computation phase of the two-party computation withpreprocessing.

The encryption we use for the inputs is linear secret-sharing with authentication tags. To be more precise, for everyencrypted input, both servers end up with a random value astheir share. This is done in such a way that the shares togetherdetermine the value of the input, while they are individuallystatistically independent of the value of the input. Moreover,each of these shares is accompanied by an authentication tagthat together with the share is linked to an authentication keythat is held by the other server. As a result, the servers arecommitted to the values of their shares.

Although the users providing the inputs could in principletake care of the share distribution, these users cannot be trustedto provide the authentication keys and tags, since they mightbe under control of one of the servers. Moreover, we wantto introduce additional structure to the authentication keys inorder to enable our secure two-party computation approach,as explained in Subsection II-B2.

Once the encryption of the inputs has been established, thecomputation recursively handles the operations in the circuitwhile maintaining the encryption structure as an invariant.In other words, every operation in the circuit is initiatedwith two encrypted inputs, and produces an encrypted outputwithout leaking any information on the encrypted values.Therefore, at the end of the circuit evaluation, an encryptionof the final output becomes available.

The output phase is also non-standard, as the output needsto be revealed to an external party. Here the idea is that alldata related to the encryption of the output is sent back tothe relevant user, so that this user can verify the correctnessof the shares using the authentication keys and tags, and thendecrypt the output using the shares.

However, we need to recycle the fixed α-keys in order toprevent the users from having to resubmit their inputs for everyrecommendation computation. Therefore, the used authenti-cation keys may never be sent to a (dummy) user, as thiswould break security for future recommendation computations.We provide our solution to this problem in Section IV-E.

IV. IMPLEMENTATION

In this section we describe how the recommendation appli-cation has been implemented, using the Bendlin et al. [1]framework. We mention the required secret sharing operations,and refer to Section II-B for the details on updating ofauthentication keys and tags.

A. Remarks on the Preprocessing Phase

In the preprocessing phase we expect the followingresources to be computed:

1) “Multiplication triplets”, i.e., ([a], [b], [c]) witha, b ∈ Fp selected uniformly at random and c ∈ Fp

such that c = ab.2) Bitwise secret sharings [r ]B of random elements r ∈ Fp,

see Subsection IV-B.3) “Duplicate” secret sharings [r ] and 〈r〉 for random

r ∈ Fp , see Appendix A.Since each secure multiplication requires a precomputed mul-tiplication triplet, the computation of these triplets takes themain effort of the preprocessing phase.

For all of these sharings it is assumed that the distributionis as if all shares, keys, and tags were first generated and dis-tributed directly by an incorruptible external dealer, and then(where required) updated via server interaction as describedat the end of Subsection II-B2.

There are many ways to implement preprocessing. In thissection we discuss some methods that can be used to generatethe random sharings and multiplication triplets. While some ofthese methods can also be used to generate the bitwise sharingsand ‘duplicate’ sharings of random values, we describe in theAppendix a generic solution that allows creating the lattersharings using the former sharings.

The easiest way to implement preprocessing is to outsourcethe generation of triplets and random values to a trustedthird party, who then computes the proper values and sendseach server the corresponding shares and authentication tags.Although such a third party never sees any of the user dataor other data related to the computation, it is still crucial forthe security of the recommendation system that this party doesnot maliciously collaborate with any of servers. Although intheory also the online phase could be outsourced to this trustedthird party, this would not only require the third party to beonline continuously, but also give insight into sensitive userdata, and increase the risk of data leakage.

Similarly, the preprocessing could be outsourced to multipleexternal parties for which more than two-thirds are believedto be honest. These parties can then perform information-theoretically secure multi-party computation [19], [20] togenerate encryptions of the shares, authentication keys andtags for the required triplets and random values, and openthem to the respective servers.

When no suitable external parties are available, the pre-processing can instead be taken care of by the two serversthemselves using cryptography. This could for instance beimplemented using Yao’s garbled circuit approach [23],semi-homomorphic encryption [1], or fully-homomorphicencryption [24]. Standard Yao’s garbled circuits are onlysecure in the semi-honest model, but recent cut-and-choose protocols [25], [26] enable implementing the pre-processing by garbled circuits in the malicious model withtwo parties.

Although jointly precomputing triplets requires a substantialamount of work due to the high computational costs involvedwith the use of heavy cryptographic techniques, the compu-tations can easily be done in parallel by dedicated hardware.Recently, the precomputation effort from the original SPDZframework [1] has been reduced by Damgård et al. [15],lowering the time needed for generating a multiplication triplet

Page 7: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 451

from 2.5 seconds to 0.02 seconds. When covert security issufficient, this can be improved further to 0.0014 seconds.

Regardless of the approach that is chosen, the nature ofthe application allows for many triplets to be computedduring inactive time, which can then be used to generate therecommendations efficiently in active time.

B. Secure Comparison

In this section we describe how to securely compute theoutcome of the comparison x < y, for any two elementsx, y ∈ Fp, based on a solution by Nishide and Ohta [16].This is essentially based on the following two observations.

1) Interval Comparison: The first observation is that itis sufficient to have a protocol that compares the secret-shared value z = (x − y) mod p with p

2 , i.e., that determines[(z < p

2 )] from [z]. Although the original protocol byNishide and Ohta also requires computing [(x < p

2 )] and[(y < p

2 )], we have chosen p in our implementation largeenough such that we are assured that in every execution ofthe secure comparison protocol the inputs do not exceed p

2 ,which reduces the execution time of the secure comparisonprotocol roughly by a factor three.

Given the shared interval comparison bit [(z < p2 )], the

shared comparison bit [(x < y)] is easily derived through(x < y) = 1 − (z < p

2 ) [16].2) Least Significant Bit Protocol: In this paragraph we

frequently switch between integers (which are not reducedmodulo p) and field elements, and use + for both types ofadditions.

Let ζ be the integer 2z. The second observation is that ifz > p

2 , then the integer ζ will be larger than p, so the fieldelement 2z ∈ Fp will equal z − p and will therefore be odd.On the other hand, if z < p

2 , then the field element 2z ∈ Fp

will equal the integer ζ , and will be even. In order to determine(z < p

2 ), it is therefore sufficient to determine the value of theleast significant bit of the integer ζ .

In order to securely compute the least significant bit ζ0 of ζ ,a precomputed bitwise secret sharing [r ]B is used to compute[s] = [ζ + r ], which is then opened towards both servers [16].

If the integer ζ + r is smaller than p, then ζ0 = s0 ⊕ r0,where s0 and r0 are the least significant bits of s, and r ,respectively. On the other hand, if the integer ζ + r isat least p, then ζ0 = 1 − (s0 ⊕ r0). Since s is known toboth parties, [s0 ⊕ r0] is easily computed as [r0], if s0 = 0,and 1 − [r0], otherwise.

The remaining problem is therefore to determine whetherthe integer ζ + r is smaller than p. This is equivalent todetermining the value of (s < r), because the field element sis smaller than the field element r exactly when the integerζ + r is smaller than p.

The comparison of s and r can be completed by goingthrough the bits of field elements s and r , since the (secret-shared) bits of r are available. Let ri , 0 ≤ i < �, be the � bitsof r . Then the following procedure can be used to securelycompute [δ] = [(s < r)]:

1) If s0 = 0 then [δ] := [r0] else [δ] := [0]The premise δ = (s0 < r0) is now true.

2) For i = 1 to �− 1 do

If si = 0, then [δ] := [ri ] + [1 − ri ] · [δ].Otherwise, [δ] := [ri ] · [δ].The invariant condition δ = (si . . . s0 < ri . . . r0)is now true.

Given [δ], as explained above [(z < p2 )] is easily derived as

[δ⊕ s0 ⊕ r0] = [δ]− [s0 ⊕ r0]− 2[δ] · [s0 ⊕ r0]. Therefore, onesecure comparison takes � rounds and � secure multiplicationsin total.

This linear round bitwise comparison protocol, similarto the circuit of Schoenmakers and Tuyls [17], computes[δ] = [(s < r)] within � − 1 rounds, requiring onlyone multiplication per round. Constant round solutions byDamgård et al. [27], or Nishide and Ohta [16] offer only slightround improvements, requiring 19 and 15 rounds respectively,and on the other hand will increase drasticly the number ofsecure multiplications (e.g. 279�+ 5 for [16]), leading to morecommunication and computations.

C. Integer Division

In our implementation of secure recommendations inSubsection IV-F we need a protocol for integer divi-sion between two secret-shared values. In a recent paper,Veugen [28] demonstrated integer division in a setting withadditive homomorphic encryption, and public or privately helddivisors. Constant-round protocols based on secret sharing forinteger division are also available [27]. However, since thesemake use of a large amount of precomputed data, and sincethe prime p remains relatively small in our application, weinstead make use of a simple protocol based on long division.

Let x, y, z ∈ Zp be such that z = y ÷ x , where thesymbol ÷ is used to denote integer division without remainder.Moreover, suppose that zv−1, . . . , z0 are the v bits of z forsome v, 0 < v ≤ �. The following algorithm, inspired by [4],will compute the values zv−1, . . . , z0 one by one.

1) ψ := y; z := 0;2) For i = v − 1 downto 0 do

a) ξ := 2i · x ;b) zi := (ξ ≤ ψ);c) z := zi + 2 · z;

The invariant z = ∑v−1j=i z j 2 j−i is now true.

d) ψ := ψ − zi · ξ ;The invariant conditions y = ψ+x ·z and ψ < 2i xare now true.

Because of the invariant conditions, we end up with ψ < xand y = ψ + x · z, so z will be the result of the integer divisionand ψ will equal the remainder. Although the algorithm forinteger division is described on integers here, all listed opera-tions can equivalently be performed on values that are secret-shared modulo p. The algorithm requires v rounds with onesecure comparison (and one additional secure multiplication)per round, but in our application z represents a rating, so vwill be small.

From a complexity point of view, the integer divisionprotocol will form the bottleneck of our implementation. As asidenote, observe that z = 1 . . .1 = 2v − 1 whenever x = 0,so we could even cope with the case of zero similar users,if we require ratings to be limited to 2v − 2.

Page 8: A Framework for Secure Computations With Two Non-Colluding ...

452 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

D. The Input Phase

Although our framework allows clients to upload any valuesto the processor, we focus on our recommender application(see Subsection II-A). Initially, each user n (1 ≤ n ≤ N) willhave to upload his ratings V(n,m) (1 ≤ m ≤ M) once, beforerecommendations can be requested. A user could easily act asa dealer by splitting his rating into two shares, and sendingeach server a share accompanied by proper authenticationtags. However, since some users may collude with one ofthe servers, it is easy to see that any solution along theselines will leak the private authentication keys α1 and α2 to theadversary.

Therefore, a small additional protocol is needed for upload-ing user ratings. We assume the servers have a precomputedpair of random secret sharings [r ] and 〈r〉, as described inAppendix A, leading to the following protocol for uploadingrating V(n,m).

1) The servers open 〈r〉 towards user n by sending him thecorresponding shares, message authentication tags, andmessage authentication keys.

2) After verifying the correctness of the tags, user n sendsthe value (V(n,m) − r) ∈ Fp to both servers.

3) After verifying that they both received the same value,the servers compute [V(n,m)] = [r ] + (V(n,m) − r) asdescribed in Subsection II-C1.

Only user n will learn r , which he uses to blind his rating.By adding the blinded value to [r ], the servers obtain therequired sharing with their regular authentication keys. Theuser will only learn the random keys (α′

i , β), so the privatekeys αi remain secure.

1) Ideal World Simulation: In this section we describethe behavior of the simulator during the input phase of theideal world. First we point out that, due to knowledge of allpreprocessed data, the simulator has access to all shares andauthentication keys that are held, or have been computed, bythe adversarial server.

Let 〈r〉 and [r ] again be a precomputed pair as is used inthe real world input protocol. We need to distinguish betweentwo different scenarios, namely one where an honest (non-adversarial) user provides an input, and one where a dummyuser provides an input. First, suppose an honest user providesthe input. In that case the input is delivered straight to theideal processor I, which implies that the simulator has noknowledge of the value of this input. Upon being notified bythe processor of such an input event, the simulator selects arandom (or default) value x ∈ Fp , and then acts on behalf ofthe honest user and the non-existent honest server towards theadversarial server, via the following steps.

1) The adversarial server sends the simulator its shares,message authentication tags, and message authenticationkeys corresponding to 〈r〉.

2) The simulator verifies that the three values sent bythe adversarial server are correct, and aborts otherwise.If the simulator does not abort, it sends the value x − rto the dishonest server on behalf of the honest userproviding the input.

3) The dishonest server performs the verification of thevalue x − r with the simulator (which again emulates

the honest server) as it would in the real world, and theservers compute [x] = [r ] + [x − r ].

Now suppose that the input is provided by a dummy user.The simulator then acts as follows.

1) The adversarial server and the simulator open 〈r〉towards the dummy user.

2) The dummy user sends a value y ∈ Fp to the dishonestserver and the simulator, who verify that they havereceived the same value, using the same method as isused in the real world.

3) If the adversarial server and the simulator received thesame value, they compute the sharing [r + y].

4) Using knowledge of the shares of the adversarial server,the simulator now determines the input x = r + y of thedummy user. The simulator then inputs the value x tothe ideal processor I on behalf of the dummy user.

It is straightforward to verify that the view of the adversarialserver in the ideal world now corresponds with that in the realworld.

2) Input Checking: To further reduce the possibilities formalicious adversaries, we could easily check the size of userinputs. Given the shared uploaded value [x], we could securelycompute the secret-shared bit [(x < 2v )], and open it. Thesecure comparison protocol, as described in Subsection IV-Bcould be used for this end, but since we are not assured thatx < p/2, we would have to run the ‘least significant bit’protocol twice, namely for x and z = (x − 2v ) mod p, andcombine the results as shown by Nishide and Ohta [16].

E. The Output Phase

Although our framework allows clients to download anyvalue from the processor, we focus on our recommenderapplication (see Subsection II-A). After a recommendation hasbeen computed, the outputs Recm ∈ Fq (S < m ≤ M) haveto be sent to the requesting user. Since we want the user tobe able to verify the correctness of the output, but we do notwant to reveal the private authentication keys αi to him, weneed an additional protocol.

Assuming the servers have access to a precomputed pair ofrandom sharings [r ] and 〈r〉 (see Appendix A), we have thefollowing protocol for downloading the estimated rating Recm .

1) The servers compute [Recm − r ] = [Recm] − [r ], andopen the resulting secret sharing towards each other byrevealing the shares and authentication tags.

2) After checking the correctness of the tags, the serverscompute 〈Recm〉 = 〈r〉+ (Recm −r). This is an additionof a public constant to a sharing, but using completelyrandom authentication keys (α′, β).

3) The servers open 〈Recm〉 to the user. They also sendhim the corresponding tags and (α′, β) keys, so he cancheck the correctness of his output.

Again, the user will only learn the random keys (α′, β), sothe private keys α remain secure.

1) Ideal World Simulation: In this section we describe thebehavior of the simulator during the output phase in the idealworld. We remind the reader that the simulator has access toall shares and authentication keys that are held or have been

Page 9: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 453

computed by the adversarial server, and all inputs provided bythe dummy users. However, the simulator has no knowledge ofthe inputs of the honest users since these are directly submittedto the ideal processor I.

We again distinguish between the request of an honest userand a dummy user. Suppose first that the request comes froman honest user. The simulator then acts as follows.

1) It first emulates the first two protocol steps in the realworld towards the adversarial server. This results in asharing 〈x〉.

2) The adversarial server then sends the simulator its share,key and tag corresponding to 〈x〉. If the adversarialserver sends an incorrect value, the simulator aborts.

Now suppose that the output request comes from a dummyuser. The simulator then needs to provide the dummy userwith the correct recommendation output. This is done via thefollowing steps.

1) The simulator emulates the first two protocol steps in thereal world towards the adversarial server. This results ina sharing 〈x〉.

2) The simulator now requests the correct estimated rat-ing y from the ideal processor I on behalf of the dummyuser.

3) The simulator uses the knowledge of the shares andauthentication keys of the dishonest server to locallymodify its share and tag in the sharing 〈x〉, thus trans-forming it into a sharing 〈y〉.

4) The adversarial server and the simulator now open thesharing 〈y〉 towards the dummy user as they would inthe real world.

It is again straightforward to verify that the views of theadversarial server and the dummy user in the ideal worldcorrespond with that in the real world.

F. Application: Secure Recommendations

Using the approach described in Subsection III-C, and thebasic operations described in Subsection II-C, we can securelycompute the example application detailed in Subsection II-A.Based on the size and number of the ratings and the desiredsecurity level, we have decided to use a 20-bit prime numberp for the implementation. This for instance implies thatauthentication tags can only be forged with probability 2−20.The parameters for the number of items were set to M = 1000and S = 20, just as in [3], and we used ratings consisting ofv = 4 bits. Consequently, the maximal size of a similarityvalue is S(2v − 1)2 = 4500 which will never exceed p/2,justifying the input requirement of the comparison protocol(see Subsection IV-B). We assume that on average 3% of theusers rated a particular item m, S < m ≤ M , which is similarto the EachMovie dataset used in [13].

Assuming all required values have been precomputed(see Section IV-A), the on-line phase of such a securerecommendation computation will consist of the followingfive steps, which are the secured versions of the steps describedin Subsection II-A.

1) Initially, each user will have to upload his (at most)M ratings, so both servers will obtain authenticated

secret sharings [V(n,m)] (see Section IV-D). After theuploading, the servers will know Um for each m,S < m ≤ M , specifying the users that rated item m.1

2) When user A asks for a recommendation, the servers willcompute similarities [SimA,n ] for each (other) user n.The computation of each similarity value takes S mul-tiplications, and S − 1 local additions.

3) Next, each similarity value is compared with a pub-lic threshold t , leading to N − 1 shared bits [δn](see Subsection IV-B). Each comparison will cost �secure multiplications and some local additions.

4) The shared comparison bits can be used to compute theactual estimated rating for each item m, S < m ≤ M .For item m, the computation of the cumulative sum ofratings of similar users requires |Um |−1 multiplicationsand |Um |− 2 additions. The computation of the numberof similar users requires |Um | − 2 additions. As a finalstep, for each item these two have to be divided as shownin Section IV-C. This integer division protocol requiresv secure comparisons, and a total per item of v(� + 1)multiplications, and some additions.

5) Once the estimated ratings [Recm ], S < m ≤ M , havebeen computed, they have to be downloaded to user A(see Section IV-E).

We need∑

n∈Umδn · V(n,m) < p/2 to correctly perform the

integer division, but in practice only a small portion of userswill be similar to user A, and the user-item matrix will besparsely filled.

V. COMPLEXITY ANALYSIS

We analyse the computation, communication, storage, andprecomputation complexity of computing recommendations inour framework. We did not add extra time due to networklatency during the protocol simulation, just as in [3], but wecounted the number of communication rounds, which is a goodmeasure for network latency. We distinghuished between theinitial step in which the ratings are uploaded, and the on-linerecommendation: steps 2, 3, and 4.

A. Computation Complexity

The application was implemented in C, using Visual Studio2012 on a 64-bits Windows 7 laptop with a 2.5 GHz processorand 4 GB of memory. Figure 1 shows the computation timein seconds for a varying number of users. We did not useparallel computing, so in practice the execution time for twoservers will be approximately half as much. The time foruploading the user ratings could even be parallelized amongstall users which would reduce the uploading time by a factor 1

N .For 10,000 users, we need 0.34 seconds to compute onerecommendation, excluding precomputation time.

B. Communication Complexity

Figure 2 shows the amount of numbers of size p that have tobe sent during the on-line phase. For 10,000 users and a 20 bit

1It is possible to also hide Um , by having each user securely upload exactlyM ratings, and having the servers securely compute the bit [(V(n,m) < 1)],assuming an unrated item has rating zero.

Page 10: A Framework for Secure Computations With Two Non-Colluding ...

454 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

Fig. 1. Amount of on-line computation.

Fig. 2. Amount of on-line communication.

TABLE I

NUMBER OF COMMUNICATION ROUNDS PER PROTOCOL STEP

size p this leads to a total of 35 MB of communication. Theinitial uploading of the ratings takes a relatively large amountof communication.

Besides the amount of communicated information, the num-ber of communication rounds is also important for determiningthe time needed for communication, because each roundintroduces an extra delay depending on the communicationtechnology used. Table I depicts the number of communicationrounds per protocol step.

Given that p consists of � = 20 bits, and each rating consistsof v = 4 bits, the total number of communication roundsis 105. Although many computations can be parallelized,

TABLE II

NUMBER OF SECURE MULTIPLICATIONS

the integer division protocol, used for averaging the ratingof a recommended item over all similar users, clearly requiresthe highest number of rounds.

C. Storage Complexity

The servers mainly need to store the initial ratings ofall users, and the numbers precomputed for performing allcalculations, for which we count the number of requiredtriplets and bitwise shared random values. Given that eachsecure comparison needs � secure multiplications, we derivethe number of secure multiplications (see Table II), andconsequently the number of required triplets, for our entireprotocol. We used here that on average 3% of the users ratedeach item.

Given that � = 20, M = 1000, S = 20, and v = 4,our protocol requires 69.4(N − 1)+82,320 triples for onerecommendation. To store one triple, each server needs to store3 shares, 3 tags and 3 keys, so 9� bits in total.

For each secure comparison, the servers additionally needone bitwise shared random value. Each bitwise shared randomvalue consists for each server of � shares, � tags and � keys.The protocol consists of one secure comparison per user, andv(M − S) for the final integer division.

Thus, for computing one recommendation, each serverneeds to store (69.4(N − 1)+ 82,320) · 9� + (N − 1 +v(M − S)) · 3�2 precomputed bits, and additionallyN · (S + 0.03(M − S)) · 3� bits to store the initial ratings.For 10,000 users this amounts to 22 MB per server.

D. Precomputation Complexity

In practice the precomputation effort is dominated bythe generation of multiplication triplets. As described inSection IV-A, the easiest way to obtain all precomputed datais to involve an honest third party. This minimizes the requiredamount of communication and computation during the offlineprecomputation phase to a level that is at most that of theonline phase.

We assume such a third party is not available, so theservers themselves jointly precompute all required triples andsharings. In their implementation Bendlin et al. [1] require2 to 3 seconds to precompute one triple, in a way that isprovably secure in the malicious model. This precomputationeffort has recently been reduced by Damgård et al. [15] to0.02 seconds per triple (with a 32-bits prime). Based on theirestimation, the precomputation time for our protocol would beroughly 69.4 · 0.02 = 1.388 seconds per user. However, forcommercial purposes, dedicated and highly parallelized hard-ware can be acquired to substantially reduce the time needed.Even on standard hardware, 69.4 · 0.0014 = 0.097 seconds is

Page 11: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 455

achievable, when covert security is sufficient, leading to only16 minutes for 10,000 users.

E. Comparison With Related Work

The only work on recommender systems, the authorsknow is provably secure in the malicious model, isCanny’s paper [13]. Because they use a different matrix-based type of collaborative filtering, within a peer-to-peerenvironment, a fair overall comparison is difficult, but wemention the main differences:

• Although both use collaborative filtering, they use aniterative Singular Value Decomposition algorithm, whilewe use user-neighborhood-based rating prediction, whichin general is better able to cope with dynamic users, butprovides less accurate recommendations [18].

• They finetuned their algorithm to the encrypted domain,such that only vector summations of encrypted data areneeded, while we also allow more complicated operationslike secure comparison and integer division.

• Their computations are performed by the users, half ofwhich have to be honest, and only static adversaries areallowed. In our system, the computations are performedby the servers, where only one needs to be honest, andadversaries can be dynamic.

• Having users jointly perform the computations,and having the private key secret-sharedamong all clients, makes their solution lessscalable. On the other hand, we have to find asecond independent server that is assured not to colludewith the first server.

• They estimated the time for computing one recommen-dation in a setting with 74,442 users at 15 hours. Thisnumber is based on a benchmark from 2002 on 1024-bitElGamal. Although computer power has increased sincethen, a key length of 2048 would be more appropriatenow, yielding a estimated running time of 3 hours in 2014.Time for generating keys and random coins was notincluded. Our time for producing one recommendationwith 10,000 users is 0.34 seconds, excluding four hoursof precomputing.

• Although users can be untrusted, they assume a secureblackboard, where users can upload values, which canbe read by other users but not changed. We assume theexistence of at least one honest server.

• They assume the existence of a trusted source that canproduce many random coin tosses, and optionally alsogenerate the keys, both of which are needed in the cryp-tographic protocols. We assume multiplication triplets andbitwise-shared random numbers have been precomputed,but we do not need an additional trusted source for thisend.

• They allow some amount of information leakage.An aggregated model matrix A is published, and usedby the users in their computations. We do not allowinformation leakage.

• They need 4 GB, of data communication per client, wherewe need 35 MB in total for 10,000 users, which isconsiderably less.

• They estimate a 10-50 MB storage per client, where werequire 44 MB in total for 10,000 users to be stored bythe two servers, which is considerably less.

Although the security model is different, our recommendersystem is more similar to Erkin et al.’s [3], and we alsocompare our experimental results with theirs. They used theGMP library on a single computer with a 2.33 GHz processorand 16 GB memory. Because our numbers are smaller, we didnot need GMP, and we had only 4GB of memory availablewith a 2.5 GHz processor. They did not cope with unrateditems while computing the number of similar users, leading toless accurate rate predictions, where we allowed the serversto learn the sets Um . It took us 0.34 seconds to compute arecommendation for 10,000 users, excluding four hours ofprecomputation (or 16 minutes with covert security), whilethey needed 50 minutes. They needed a total communicationamount of 147 MB, somewhat more than our 35 MB. Also,we needed only 105 communication rounds while their imple-mentation required at least 1024 communication rounds dueto packing. In our setting, each server has to store 22 MBfor 10,000 users, and their service provider needs 112 MB forstoring all user data, and an additional 35 MB for computinga recommendation.

The differences with respect to security are more significant.In [3], a user requesting a recommendation will learn theamount of similar users. The reason is that this approachavoids having to perform a secure integer division protocolwhich is quite expensive. We mitigated that threat by imple-menting a secure integer division protocol. The most importantdifference is that their protocol is secure in the semi-honestmodel, while ours is secure in the malicious model. In ourrecommender system, the user is moreover assured that hisoutputs, the predicted ratings, are correctly computed.

Although [3] also requires some form of precomputation,precomputation is an essential part of our solution, and takesa substantial amount of time. On the other hand, in the covertsecurity model, which is still an improvement w.r.t. the semi-honest model, our total execution time becomes favorable.

VI. CONCLUSIONS

We provide a general framework for outsourcing ongoingcomputations, which is suitable for dealing with a largenumber of users, and is provably secure in the maliciousmodel. This framework is then applied to the problem ofsecure recommendation and, given a sufficient amount of pre-computed data, leads to extremely efficient implementations.

The approach is very generic and easily allows for vari-ations. Although we only consider the case of two servers,our model described in Section III is easily extended to anarbitrary number of servers. As a consequence, many othersecure applications are possible using our framework.

APPENDIX

A. Generating Duplicate Sharings of a Random Secret

For the uploading of user ratings, and downloading ofestimated ratings, we need to precompute “duplicate” sharings[r ] and 〈r〉 for random elements r ∈ Fp . To avoid leakage of

Page 12: A Framework for Secure Computations With Two Non-Colluding ...

456 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 3, MARCH 2015

the private α’s, for each sharing 〈r〉 the servers should usefresh random α-values in their authentication keys that areindependent of their private fixed values for α1 and α2.

Generating a sharing [r ] of a fresh random number r ∈ Fp

is described in [1], resulting in two random shares r1 and r2,one for each server. A new sharing 〈r〉 of the same randomnumber r , but with different authentication keys, is createdby first converting the two shares of r into shared secrets, andthen computing the new tags from fresh random authenticationkeys.

1) The servers generate a fresh random sharing [s].2) Server 1 transmits r1 − s1 and its tag mr−s

1 = mr1 − ms

1to server 2. Simultaneously, server 2 sends s2 and ms

2 toserver 1.

3) The servers verify the tags and, if these are correct, bothdetermine r1 − s = (r1 − s1)− s2.

4) The servers obtain a sharing [r1] of the secret share r1via a (local) computation of [s] + (r1 − s).

5) By repeating the first four steps, the servers similarlyobtain a sharing [r2] of the secret share r2.

6) The servers generate fresh random sharings [α′1], [α′

2],[β ′1] and [β ′

2] that will serve as new authentication keys.7) The servers securely compute [m′

1] = [r1] · [α′2] + [β ′

2].8) The servers open α′

2 and β ′2 to server 2, and open m′

1to server 1. By verifying the tags, the correctness of theopened values is checked.

9) By repeating steps 7 and 8, server 1 will obtainα′

1 and β ′1, and server 2 m′

2.It follows that m′

1 = r1 · α′2 + β ′

2 and m′2 = r2 · α′

1 + β ′1,

so indeed we created different authentication keys for the sameshared value r = r1 + r2, and thus a second sharing 〈r〉 withindependent authentication keys.

REFERENCES

[1] R. Bendlin, I. Damgård, C. Orlandi, and S. Zakarias, “Semi-homomorphic encryption and multiparty computation,” in Advances inCryptology (Lecture Notes in Computer Science), vol. 6632. Berlin,Germany: Springer-Verlag, 2011, pp. 169–188.

[2] R. Lagendijk, Z. Erkin, and M. Barni, “Encrypted signal processing forprivacy protection: Conveying the utility of homomorphic encryptionand multiparty computation,” IEEE Signal Process. Mag., vol. 30, no. 1,pp. 82–105, Jan. 2013.

[3] Z. Erkin, T. Veugen, T. Toft, and R. L. Lagendijk, “Generating pri-vate recommendations efficiently using homomorphic encryption anddata packing,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 3,pp. 1053–1066, Jun. 2012.

[4] P. Bunn and R. Ostrovsky, “Secure two-party k-means clustering,” inProc. 14th ACM Conf. Comput. Commun. Secur., 2007, pp. 486–497.

[5] B. Goethals, S. Laur, H. Lipmaa, and T. Mielikäinen, “On private scalarproduct computation for privacy-preserving data mining,” in Proc. 7thInt. Conf. Inf. Secur. Cryptol., 2004, pp. 104–120.

[6] H. Polat and W. Du, “Privacy-preserving collaborative filtering usingrandomized perturbation techniques,” in Proc. 3rd IEEE ICDM,Nov. 2003, pp. 625–628.

[7] S. Zhang, J. Ford, and F. Makedon, “Deriving private information fromrandomly perturbed ratings,” in Proc. 6th SIAM Int. Conf. Data Mining,2006, pp. 59–69.

[8] M. Atallah, M. Bykova, J. Li, K. Frikken, and M. Topkara, “Privatecollaborative forecasting and benchmarking,” in Proc. ACM WorkshopPrivacy Electron. Soc. (WPES), 2004, pp. 103–114.

[9] V. Nikolaenko, S. Ioannidis, U. Weinsberg, M. Joye, N. Taft, andD. Boneh, “Privacy-preserving matrix factorization,” in Proc. ACMSIGSAC Conf. Comput. Commun. Secur., 2013, pp. 801–812.

[10] V. Nikolaenk, U. Weinsberg, S. Ionnidis, M. Joye, D. Boneh, and N. Taft,“Privacy-preserving ridge regression on hundreds of millions of records,”in Proc. IEEE Symp. Secur. Privacy, May 2013, pp. 334–348.

[11] O. Catrina and S. de Hoogh, “Improved primitives for securemultiparty integer computation,” in Security and Cryptography for Net-works (Lecture Notes in Computer Science), vol. 6280. Berlin, Germany:Springer-Verlag, 2010, pp. 182–199.

[12] A. Peter, E. Tews, and S. Katzenbeisser, “Efficiently outsourcing mul-tiparty computation under multiple keys,” IEEE Trans. Inf. ForensicsSecurity, vol. 8, no. 12, pp. 2046–2058, Dec. 2013.

[13] J. Canny, “Collaborative filtering with privacy,” in Proc. IEEE Symp.Secur. Privacy, May 2002, pp. 45–57.

[14] I. Damgård, V. Pastro, N. Smart, and S. Zakarias, “Multiparty com-putation from somewhat homomorphic encryption,” in Advances inCryptology (Lecture Notes in Computer Science), vol. 7417. Berlin,Germany: Springer-Verlag, 2012, pp. 643–662.

[15] I. Damgård, M. Keller, E. Larraia, V. Pastro, P. Scholl, and N. P. Smart,“Practical covertly secure MPC for dishonest majority—Or: Breakingthe SPDZ limits,” in Computer Security (Lecture Notes in ComputerScience), vol. 8134. Berlin, Germany: Springer-Verlag, 2013.

[16] T. Nishide and K. Ohta, “Multiparty computation for interval, equal-ity, and comparison without bit-decomposition protocol,” in PublicKey Cryptography (Lecture Notes in Computer Science), vol. 4450,T. Okamoto and X. Wang, Eds. Berlin, Germany: Springer-Verlag, 2007,pp. 343–360.

[17] B. Schoenmakers and P. Tuyls, “Practical two-party computation basedon the conditional gate,” in Advances in Cryptology (Lecture Notes inComputer Science), vol. 3329. Berlin, Germany: Springer-Verlag, 2004,pp. 119–136.

[18] F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, Recommender SystemsHandbook. New York, NY, USA: Springer-Verlag, 2011.

[19] O. Goldreich, S. Micali, and A. Wigderson, “How to play any mentalgame or a completeness theorem for protocols with honest majority,”in Proc. 19th Annu. ACM Symp. Theory Comput. (STOC), May 1987,pp. 218–229.

[20] M. Ben-Or, S. Goldwasser, and A. Wigderson, “Completeness theoremsfor non-cryptographic fault-tolerant distributed computation,” in Proc.20th Annu. ACM Symp. Theory Comput., 1988, pp. 1–10.

[21] D. Beaver, “Efficient multiparty protocols using circuit randomization,”in Advances in Cryptology. Berlin, Germany: Springer-Verlag, 1991,pp. 420–432.

[22] R. Canetti, “Universally composable security: A new paradigm forcryptographic protocols,” in Proc. 42nd IEEE Symp. Found. Comput.Sci., Oct. 2001, pp. 136–145.

[23] A. C. Yao, “Protocols for secure computations,” in Proc. IEEE 54thAnnu. Symp. Found. Comput. Sci. (FOCS), Nov. 1982, pp. 160–164.

[24] C. Gentry, “Fully homomorphic encryption using ideal lattices,” in Proc.41st Annu. ACM Symp. Theory Comput., 2009, pp. 169–178.

[25] Y. Huang, J. Katz, and D. Evans, “Efficient secure two-party com-putation using symmetric cut-and-choose,” in Advances in Cryptology(Lecture Notes in Computer Science), vol. 8043. Berlin, Germany:Springer-Verlag, 2013, pp. 18–35.

[26] Y. Lindell, “Fast cut-and-choose based protocols for malicious andcovert adversaries,” in Advances in Cryptology (Lecture Notes inComputer Science), vol. 8043. Berlin, Germany: Springer-Verlag, 2013,pp. 1–17.

[27] I. Damgård, M. Fitzi, E. Kiltz, J. B. Nielsen, and T. Toft, “Uncon-ditionally secure constant-rounds multi-party computation for equality,comparison, bits and exponentiation,” in Theory of Cryptography. Berlin,Germany: Springer-Verlag, 2006, pp. 285–304.

[28] T. Veugen, “Encrypted integer division and secure comparison,” Int.J. Appl. Cryptograph., vol. 3, no. 2, pp. 166–180, 2014.

Thijs Veugen received two M.Sc. degrees in math-ematics and computer science (both cum laude)and the Ph.D. degree in information theory fromthe Eindhoven University of Technology Eindhoven,The Netherlands. He was a Scientific SoftwareEngineer with Statistics Netherlands, Heerlen, TheNetherlands. Since 1999, he has been a Senior Sci-entist with the Information Security Research Group,Department of Technical Sciences, TNO, Delft, TheNetherlands. He is currently a Senior Researcherwith the Cyber Security Group, Delft University of

Technology, Delft, and has specialized in applications of cryptography. He haswritten many scientific papers on computing with encrypted data, and servesfrequently as a member of the program committee board of informationsecurity related conferences, and holds numerous related patents in variouscountries.

Page 13: A Framework for Secure Computations With Two Non-Colluding ...

VEUGEN et al.: FRAMEWORK FOR SECURE COMPUTATIONS 457

Robbert de Haan received M.S. degrees in com-puter science and mathematics from the Universityof Amsterdam, Amsterdam, The Netherlands, in2003 and 2004, respectively, and the Ph.D. degreein mathematics from Leiden University, Leiden, TheNetherlands, in 2009, based on cryptologic researchdone at the Centrum Wiskunde and Informatica,Amsterdam. His research interests include cryptol-ogy and information security.

Ronald Cramer received the Ph.D. degree incryptology from the University of Amsterdam,Amsterdam, The Netherlands, in 1997, and theM.Sc. degree in pure mathematics from LeidenUniversity, Leiden, The Netherlands, in 1992. Since2004, he has been the head and founder of theCryptology Group with Centrum Wiskunde andInformatica, Amsterdam, The Netherlands, which isthe Dutch National Research Center for Mathematicsand Computer Science. Since 2004, he has beena Full Professor with the Mathematical Institute,

Leiden University, Leiden, The Netherlands. Prior to that, he held researchpositions with the Swiss Federal Institute of Technology in Zurich, Switzer-land, and Aarhus University, Aarhus, Denmark. His research areas includecryptology and applications of algebraic number theory and algebraic geom-etry. He is Member of the Royal Netherlands Academy of Arts and Sciencesand Fellow of the International Association for Cryptologic Research.

Frank Muller received the M.Sc. degree in electri-cal engineering from Delft University of Technology,Delft, The Netherlands. He is currently a SeniorScientist with TNO, Delft, The Netherlands. Hisresearch interests include cryptographic protocols,payment systems, and mobile network security.