LightDP: Towards Automating Differential Privacy Proofsdbz5017/pub/popl17-slides.pdfRelated Work DP...

LightDP: Towards Automating Differential Privacy Proofs

Danfeng Zhang, Daniel Kifer
Penn State University

Database w/ Alice's data

Database w/o Alice's data

Alice's data remain private if the output distributions $\mu_1$ and $\mu_2$ are close

(Pure) Differential Privacy

If, for any adjacent databases and any value $v$, $\mu_1(v)/\mu_2(v) \le e^{\epsilon}$ for some constant $\epsilon$, then the computation is $\epsilon$-private.

Privacy cost: $\epsilon$
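To make the ratio condition concrete, here is a minimal Python sketch that computes the smallest $\epsilon$ for which two finite output distributions satisfy it. The function names and the two toy distributions are made up for illustration; they are not from the talk.

```python
import math

def privacy_cost(mu1, mu2):
    """Smallest eps with mu1(v)/mu2(v) <= e^eps for every value v.

    mu1 and mu2 are dicts mapping each output value to its probability.
    Returns math.inf if mu1 can produce a value that mu2 cannot.
    """
    worst = 0.0
    for v, p1 in mu1.items():
        if p1 == 0.0:
            continue
        p2 = mu2.get(v, 0.0)
        if p2 == 0.0:
            return math.inf
        worst = max(worst, math.log(p1 / p2))
    return worst

def is_private(mu1, mu2, eps):
    # Adjacency is symmetric, so the ratio is checked in both directions.
    cost = max(privacy_cost(mu1, mu2), privacy_cost(mu2, mu1))
    return cost <= eps + 1e-12

# Hypothetical output distributions on two adjacent databases.
mu1 = {"yes": 0.6, "no": 0.4}
mu2 = {"yes": 0.5, "no": 0.5}
```

With these toy numbers the worst ratio is 0.5/0.4, so the pair is 0.25-private but not 0.2-private.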

Motivation

Rigorous methods are needed for differential privacy proofs

DP has seen explosive growth since 2006
• U.S. Census Bureau LEHD OnTheMap tool [Machanavajjhala et al. 2008]
• Google Chrome Browser [Erlingsson et al. 2014]
• Apple's new data collection efforts [Greenberg 2016]

But this growth has been accompanied by flawed (paper-and-pencil) proofs
• e.g., ones categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]

Related Work

DP programming platforms (e.g., PINQ, Airavat)
• Use (instead of verify) basic DP mechanisms
• Cannot offer tight bounds for sophisticated algorithms

Methods based on customized logics
• Steep learning curve
• Heavy annotation burden

LightDP offers a better balance between expressiveness and usability

LightDP: Overview

Source Program → Relational, Dependent Type System → Target Program with distinguished variable $v_\epsilon$

Main Theorem: if the source program type checks and $v_\epsilon$ is bounded by constant $\epsilon$ in the target program, then the source program is $\epsilon$-private

Source Language: Syntax

[Figure: grammar of the source language. Annotations: random variable; random expression (e.g., Laplace dist.)]

Source Language: Semantics

Memory: mapping from variables to values

Initial memory → Final memory dist.

Adjacent memory → Final memory dist.

Relational reasoning via type system

Relational Types

Example: $\Gamma(x): \mathsf{num}_0$, $\Gamma(y): \mathsf{num}_1$

Here num is the base type (e.g., int, real) and the subscript is the distance.

Related memories:
  $x: u$      $x: u$
  $y: v$      $y: v + 1$

Dependent Types

The distance can be a program variable

Example: $\Gamma(x): \mathsf{num}_0$, $\Gamma(y): \mathsf{num}_x$

Related memories:
  $x: u$      $x: u$
  $y: v$      $y: v + u$

Dependent Types

The distance can be a non-prob. expression

Example: $\Gamma(x): \mathsf{num}_0$, $\Gamma(y): \mathsf{num}_{x \ge 1\,?\,2\,:\,0}$

Related memories:
  $x: u$      $x: u$
  $y: v$      $y: v + 2$ if $u \ge 1$; $y: v$ if $u < 1$
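The three kinds of distance shown so far (a constant, a program variable, a non-probabilistic expression) can be sketched as a small Python check of whether two memories are related. This is only an illustration: the `related` helper, the dict encoding of memories, and the choice to evaluate distances in the first memory are assumptions, not the talk's formal definition.

```python
def related(gamma, m1, m2):
    """Return True if m2 equals m1 shifted by the distances gamma prescribes.

    gamma maps each variable to a distance, given as a number, the name of
    another variable, or a function of the first memory (dependent distance).
    """
    for var, dist in gamma.items():
        if callable(dist):
            d = dist(m1)      # distance given by an expression over the memory
        elif isinstance(dist, str):
            d = m1[dist]      # distance given by a program variable
        else:
            d = dist          # constant distance
        if m2[var] != m1[var] + d:
            return False
    return True

# The three example typings, with u = 3 and v = 10.
gamma_const = {"x": 0, "y": 1}
gamma_var = {"x": 0, "y": "x"}
gamma_expr = {"x": 0, "y": lambda m: 2 if m["x"] >= 1 else 0}

m1 = {"x": 3, "y": 10}
```

For instance, `related(gamma_var, m1, {"x": 3, "y": 13})` holds because the distance of `y` is the value of `x`.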

Notation: $m_1 \,\Gamma\, m_2$ if $m_1$ and $m_2$ are related by $\Gamma$

Types form an invariant on two related program executions (for the non-probabilistic subset):

If the initial memories satisfy $m_1 \,\Gamma\, m_2$, then after executing a well-typed program, the final memories satisfy $m_1' \,\Gamma\, m_2'$.

Enforced by a type system

Type System

Expression: [typing rules not preserved in this transcript], e.g., for the operators + | − and < | > | = | ≤ | ≥

Type System

Command: [typing rules not preserved in this transcript], e.g., with the annotations:
• Distance must be identical
• Related executions take same branch

Relating Two Distributions

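$\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$

Program: $\eta := \mathsf{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$)

$\Gamma(\eta) = \mathsf{num}_0$: with no cost

Observation: $\eta$ may have an arbitrary distance, which affects the added cost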

Relating Two Distributions

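$\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$

Program: $\eta := \mathsf{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$)

$\Gamma(\eta) = \mathsf{num}_1$: with cost $1/r$ due to dist. property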
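The distribution property behind the $1/r$ cost can be spelled out as a short derivation. It uses only the standard Laplace density $p(x) = \frac{1}{2r}\exp(-|x|/r)$; the symbol $d$ for the distance assigned to $\eta$ is introduced here for illustration.

```latex
\[
\frac{p(x)}{p(x+d)}
  = \frac{\frac{1}{2r}\exp(-|x|/r)}{\frac{1}{2r}\exp(-|x+d|/r)}
  = \exp\!\left(\frac{|x+d|-|x|}{r}\right)
  \le \exp\!\left(\frac{|d|}{r}\right)
\]
```

The last step is the triangle inequality. Taking $d = 1$ gives the bound $e^{1/r}$, i.e., cost $1/r$, and $d = 0$ gives ratio 1, i.e., no cost.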

Observation: $\eta$ may have an arbitrary distance, which affects the added cost

$\eta$ has a polymorphic type

Source program → target program, which explicitly tracks the added privacy cost (annotation on the target program: non-deterministic operation)

Intuitively, the target program computes the added cost for one sample from distribution $\mu$
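As a rough Python sketch of this idea, the sampling command `eta := Lap r` can be replaced on the target side by an arbitrary choice of `eta` plus an update of a cost counter. The names `added_cost`, `run_target` and `v_eps` are hypothetical, and the general charge `|distance| / r` is an assumption that extends the two cases stated in the talk (distance 0 with no cost, distance 1 with cost 1/r).

```python
def added_cost(distance, r):
    """Cost charged for one sample eta := Lap r whose distance is `distance`."""
    return abs(distance) / r

def run_target(eta_value, distance, r, v_eps=0.0):
    """Target-side view of `eta := Lap r`.

    eta_value stands in for the non-deterministic choice of eta; the
    function returns that value together with the updated cost counter.
    """
    eta = eta_value
    v_eps = v_eps + added_cost(distance, r)
    return eta, v_eps
```

Calling `run_target(-5.0, 1, 2.0)` returns `(-5.0, 0.5)`: the sampled value no longer matters for the cost, only the chosen distance and the scale do.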

In General

Source program → Type System → Target program with distinguished variable $v_\epsilon$

[Figure: typing judgment that transforms the source program into the target program]

Target Language

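[Figure: syntax of the target language. Annotation on the added non-deterministic command: set x to arbitrary value]

Verification task in the target language: proving that $v_\epsilon$ is bounded by some constant $\epsilon$ in any execution (in a non-probabilistic program)

This is a safety property. It can be verified using off-the-shelf tools (e.g., Hoare logic, model checking)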

Putting Together

The Sparse Vector Method [Dwork and Roth '14]

Source Program
• Correctness proof is subtle: incorrect variants categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]
• Formally verified very recently [Barthe et al. 2016] with heavy annotation burden
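The source program shown on this slide is not preserved in the transcript. For orientation, the following Python sketch gives one commonly cited formulation of the Sparse Vector method. The noise scales `2/eps` and `4*n_above/eps`, the parameter names, and the `laplace` helper are assumptions, not read off the slide, and the sketch assumes every query answer changes by at most 1 between adjacent databases.

```python
import random

def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with mean 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sparse_vector(queries, threshold, n_above, eps, rng):
    """Report, per query answer, whether its noisy value reaches the noisy threshold.

    Stops after n_above positive reports or when the answers run out.
    """
    noisy_threshold = threshold + laplace(rng, 2.0 / eps)
    out = []
    positives = 0
    for q in queries:
        if positives >= n_above:
            break
        eta = laplace(rng, 4.0 * n_above / eps)
        above = q + eta >= noisy_threshold
        out.append(above)
        if above:
            positives += 1
    return out
```

A call such as `sparse_vector([0, 100, 0, 100, 100], 50, 2, 1.0, random.Random(0))` returns a list of booleans with at most two `True` entries. The subtlety mentioned above lies in choosing the noise scales and in what is released, which is exactly where the categorized incorrect variants differ.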

Required Types

Distance depends on the value of the $i$th query answer ($q[i]$)

Type inference: types can be inferred by the inference algorithm of LightDP

Target Program

[Figure: target program for the Sparse Vector method; not preserved in this transcript]

Completing the Proof

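Loop Invariant: [formula not preserved in this transcript]

Postcondition: [formula not preserved in this transcript]

Main Theorem: source program type checks + $v_\epsilon$ bounded by constant $\epsilon$ implies source program is $\epsilon$-private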
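The loop invariant and postcondition from this slide are lost, so the following Python sketch only illustrates the kind of bound being proved. It replays a Sparse Vector loop with the noise supplied by the caller and accumulates a cost counter `v_eps`. The assumed types (distance 1 for the threshold noise, distance 2 for a query noise when the query is reported above the threshold and 0 otherwise) and the assumed scales `2/eps` and `4*n_above/eps` are not taken from the transcript, and enumerating a small grid of noise values is a test, not a proof.

```python
import itertools

def svt_target_cost(queries, threshold, n_above, eps, eta1, eta2s):
    """Replay the loop with caller-chosen noise values and return v_eps."""
    v_eps = 0.0
    v_eps += 1 * (eps / 2.0)              # eta1: assumed distance 1, scale 2/eps
    noisy_threshold = threshold + eta1
    positives = 0
    for q, eta2 in zip(queries, eta2s):
        if positives >= n_above:
            break
        above = q + eta2 >= noisy_threshold
        distance = 2 if above else 0      # assumed dependent distance of eta2
        v_eps += distance * eps / (4.0 * n_above)
        if above:
            positives += 1
    return v_eps

def max_cost_on_grid(queries, threshold, n_above, eps, noise_grid):
    """Largest v_eps over every combination of noise values from the grid."""
    worst = 0.0
    for eta1 in noise_grid:
        for eta2s in itertools.product(noise_grid, repeat=len(queries)):
            cost = svt_target_cost(queries, threshold, n_above, eps, eta1, eta2s)
            worst = max(worst, cost)
    return worst
```

In this sketch each positive report adds `eps / (2 * n_above)` and the loop stops after `n_above` of them, so `v_eps` never exceeds `eps / 2 + eps / 2 = eps`. A verifier would establish that with a loop invariant instead of enumeration.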

More in the Paper

Type inference algorithm

Searching for proof with minimum cost w/ MaxSMT

Formal proof for the main theorem

More verified examples (with little manual effort)

Summary

Source Program → Relational, Dependent Type System (automated by inference engine) → Target Program with distinguished variable $v_\epsilon$ (a safety property, verified by existing tools)

Decomposing differential privacy into subtasks substantially simplifies language-based proofs
