LightDP: Towards Automating Differential Privacy Proofsdbz5017/pub/popl17-slides.pdfRelated Work DP...
Transcript of LightDP: Towards Automating Differential Privacy Proofsdbz5017/pub/popl17-slides.pdfRelated Work DP...
-
LightDP: Towards Automating Differential Privacy Proofs
Danfeng Zhang Daniel KiferPenn State University
-
2
Database w/Alice’s data
Database w/oAlice’s data
Alice’s data remain private if 𝜇", 𝜇$ are close
𝜇" 𝜇$
-
(Pure) Differential Privacy
3
𝜇" 𝜇$If for any adjacent databases and value 𝑣, 𝜇"(𝑣)/𝜇$(𝑣) ≤ 𝑒+for some constant 𝜖, thena computation is 𝜖-private
𝜇"(𝑣) 𝜇$(𝑣)
PrivacyCost
-
Motivation
4
Rigorous methods are needed for differential privacy proofs
DP has seen explosive growth since 2006•U.S. Census Bureau LEHD OnTheMap tool
[Machanavajjhala et al. 2008]•Google Chrome Browser [Erlingsson et al. 2014]•Apple’s new data collection efforts [Greenberg 2016]
But also accompanied with flawed (paper-and-pencil) proofs• e.g., ones categorized in [Chen&Machanavajjhala’15, Lyu et al.’16]
-
Related WorkDP programming platforms (e.g., PINQ, Airavat)•Use (instead of verify) basic DP mechanisms•Cannot offer tight bounds for sophisticated algorithms
Methods based on customized logics• Steep learning curve• Heavy annotation burden
5
LightDP offers a better balance between expressiveness and usability
-
LightDP: Overview
6
SourceProgram
Relational,DependentTypeSystem
TargetProgramwithdistinguished variable
Source program type checks
Source program is 𝜖-private
Main Theoremv+ bounded by constant 𝜖in the target program
-
Source Language: Syntax
7
Random variable
Random Expression
(e.g., Laplace dist.)
-
Source Language: SemanticsMemory: mapping from variables to values
8
Initial memory
Final memory dist.
Adjacent memory
Final memory dist.
RelationalReasoningviaTypeSystem
-
Relational Types
9
Related Memories𝑥:u 𝑥: u𝑦: v 𝑦: v+1
ExampleΓ 𝑥 : num6Γ(𝑦): num"
Base Type Distance
e.g., int, real
-
Dependent Types
10
Can be a program variable
Related Memories𝑥: u 𝑥: u𝑦: v 𝑦: v + u
ExampleΓ 𝑥 : num6Γ(𝑦): num8
-
Dependent Types
11
Related Memories𝑥:u 𝑥: u
𝑦: v 𝑦: 9v + 2, u ≥ 1v, u < 1
Can be a non-prob. expression
ExampleΓ 𝑥 : num6
Γ(𝑦): num8>"?$:6
𝑚"Γ𝑚$ if 𝑚" and𝑚$are related by Γ
Notation
-
12
(for the non-probabilistic subset)Types form an invariant on two related program executions:
Then after executinga well-typed program,final memories
𝑚" 𝑚$
𝑚"A 𝑚$A
If initial memories Γ
Γ
Enforced by a type system
-
Type System
13
Expression:
e.g.,
+| −
< | > | = | ≤ | ≥
-
Type System
14
Command:
e.g.,
Distance must be identical
Related executions take same branch
-
Relating Two Distributions
15
𝜇" 𝜇$
Program𝜂:= Lap 𝑟
Laplace dist. w/ mean 0 and a scale factor 𝑟
Γ 𝜂 = num6With no
cost
𝜇"Γ𝜇$ w.r.t.privacycost𝝐 if ∀𝑚.𝜇"(𝑚)/𝜇$(Γ(𝑚)) ≤ 𝑒+
-
𝜂 mayhaveanarbitrarydistance,whichaffectstheaddedcost
Observation
Relating Two Distributions
16
𝜇"Γ𝜇$ w.r.t.privacycost𝜖 if ∀𝑚.𝜇"(𝑚)/𝜇$(Γ(𝑚)) ≤ 𝑒+
𝜇" 𝜇$
Program𝜂:= Lap 𝑟
Laplace dist. w/ mean 0 and a scale factor 𝑟
Γ 𝜂 = num"With cost 𝟏/𝒓 due to
dist. property
-
17
𝜂has a polymorphic type
source program target program, explicitly tracks added privacy cost
Non-deterministic operation
𝜇
Intuitively, target program computesthe added cost for one sample fromdistribution
𝜂 mayhaveanarbitrarydistance,whichaffectstheaddedcost
Observation
-
In General
18
SourceprogramTargetprogramwithdistinguished variable
TypeSystem
source program target program
-
Target Language
19
Verification task in the target language:Proving is bounded by some constant 𝜖 in any execution
(in a non-probabilistic program)
A safety property. Can be verified using off-the-shelf tools
(e.g., Hoare logic, model checking)
set x to arbitrary value
-
Putting TogetherThe Sparse Vector Method [Dwork and Roth’14]
20
Source Program•Correctness proof is subtle
Incorrect variants categorized in [Chen&Machanavajjhala’15, Lyu et al.’16]
•Formally verified very recently [Barthe et al. 2016] with heavy annotation burden
-
Required Types
21
Distance depends on the value of 𝑖th query
answer (𝑞[𝑖])
Types can be inferred by the inference algorithm of LightDP
Type Inference
-
Target Program
22
-
Completing the Proof
23
Loop Invariant
Postcondition:
Sourceprogramtypechecks+boundedbyconstant𝜖=sourceprogramis𝜖-private
Main Theorem
-
More in the PaperType inference algorithm
Searching for proof with minimum cost w/ MaxSMT
Formal proof for the main theorem
More verified examples (with little manual efforts)
24
-
Summary
25
Automated by inference engine
A safety property (verified by
existing tools)
SourceProgram
Relational,DependentTypeSystem
TargetProgramwithdistinguished variable
Decomposing differential privacy into subtasks substantially simplifies language-based proof