DELPHI Cryptographic Inference for Neural Networks

DELPHI: Cryptographic Inference for Neural Networks

Ryan Lehmkuhl Akshayaram Srinivasan Wenting Zheng Raluca Ada PopaPratyush Mishra

UC Berkeley

Neural Network Inference

User data is sensitive Server’s model is proprietary

Client

x

Server

MM(x)

A growing number of applications use neural networks in user interactions

• Home monitoring: detect and recognize visitors • Baby monitor: motion detection to alert parents

2

Client-side inference

Client Server

Client sees server’s model!This reveals model weights and leaks information about private training data

Mx M(x)

Client Server

Mx

M(x)

Server-side inference

Server sees client data!

[SRS17], [CLEKS18], [MSCS18] 3

https://arxiv.org/pdf/1709.07886.pdf



Secure inference goals

Client (& server) should learn only prediction M(x)

Server should not learn private client input x Client should not learn private model weights M

Client

x

Server

MM(x)

4

Prior work on secure inference

Protocol type FHE based 2PCbased Desired

Examples CryptoNets, CHET, TAPAS

SecureML, Gazelle, MiniONN Delphi

Performance

Functionality/Accuracy

5

Delphi

Cryptographic system for secure inference on convolutional neural networks

• improves bandwidth (9x) and inference latency (22x) • can utilize GPU/TPU for linear layers • evaluated on realistic workloads (CIFAR-100, ResNet-32)

Efficiency:

supports arbitrary CNNsFunctionality:

Security: achieves semi-honest simulation-based security

6

Machine LearningCryptography

Systems

7

Linear Layers

Non-linear Layers

Recap: Convolutional Neural Networks

Con

volu

tion

Activ

atio

n (R

eLU

)

Con

volu

tion

Activ

atio

n (R

eLU

)

Fully

-con

nect

ed

Input Prediction

8

Starting point: GAZELLE [JVC18]Key insight: use crypto specialized for each layer.

Client ServerEnc(x)

c1. Linear layer c ← Enc(Lx + s)

2. Activation

y ← Dec(c)

GCy s

Enc(ReLU(Lx))

y - s ↓ Lx ↓

ReLU

Linearly-homomorphic Encryption

Enc(x) + Enc(y) = Enc(x + y)

Garbled circuits: 2PC protocol for bitwise operations like ReLU

9

Expensive parts of Gazelle

Client ServerEnc(x)

c1. Linear layer c ← Enc(Lx + s)

2. Activation

y ← Dec(c)

GCy s

Enc(ReLU(Lx))

y - s ↓ Lx ↓

ReLU

For ResNet-32, per inference:~600MB communication, and ~82 sec latency.

no GPU!

10

Systems

Machine LearningCryptography

11

GPU compatible!

Delphi: Optimizing Linear layers

Client ServerEnc(r)

c c ← Enc(Lr + s)y ← Dec(c)

x + r z := L(x + r) + s = Lx + yGet input x

Sample r Sample s

Online phase

Preprocessing phase

GCy zReLU(Lx) + r2

= x2 + r2ReLU(z-y)

Per inference:>600MB communication, ~82 s latency

~350MB

~13 sr2

12

Delphi: Optimizing Non-linear Activations

Problem: ReLU is cheap for CPUs, but costly in 2PC.

Solution Idea: Replace ReLUs with quadratic activations, which are cheap in 2PC [CryptoNets, SecureML]

Problem: Training accurate quad. networks is difficult:algorithms are optimized for all-ReLU networks

13

Crypto

Systems

Machine Learning

14

Delphi’s Machine Learning Planner

Delphi’s Plannerall-ReLU CNN

Accuracy threshold t hybrid CNN

Contains a mixture of ReLU and quadratic activations,

and has accuracy > t

Better techniques for training hybrid networks• Clipping gradients• Blending in quadratic layers slowly

Specializing Neural Architecture Search to discover hybrid networks• Adapt PBT algorithm• Iterative exploration of search space

15

16

Delphi’s end-to-end workflowClient Server

Server Online

ClientOnline

Server Preprocessing

Client Preprocessing

Preprocessing for linear, ReLU, and quadratic layers

Online phase for linear, ReLU, and quadratic layers

Planner

Train all-ReLU CNN

Accuracythreshold t

Optimize accuracy and efficiency

Train initial all-ReLU network

x

All-ReLU CNN

Hybrid CNN

M(x)

Crypto

Systems

Machine Learning

17

Evaluation1. Does Delphi’s planner preserve accuracy?2. Does Delphi’s protocol reduce latency & bandwidth?

Benchmark: ResNet-32 network on CIFAR-100

github.com/mc2-project/delphi

18

Implementation

Rust + C++ library with support for GPU acceleration

0 5 10 15 20 25

1umbeU RI QRQ-5eL8 layeUs

56

58

60

62

64

66

68

Accu

Uacy

(%)

all-5eL8 baselLQe5eL8 + IdeQtLty5eL8 + 4uadUatLc

Planner accuracy

ReLUs are not redundant:

accuracy loss > 10%

19

0 5 10 15 20 25

1umbeU RI QRQ-5eL8 layeUs

56

58

60

62

64

66

68

Accu

Uacy

(%)

all-5eL8 baselLQe5eL8 + IdeQtLty5eL8 + 4uadUatLc

Planner accuracy

Most efficient planned network achieves loss of

< 2%

20

0 5 10 15 20 251umbeU RI nRn-5eL8 lDyeUs

0

20

40

60

80

InIe

Uenc

e tLm

e (s

)

DelphLGDzelle

0 5 10 15 20 251umbeU Rf nRn-5eLU lDyeUs

0.2

0.4

0.6

DDt

D tU

Dnsf

eUUe

G (G

B)

DelphLGDzelle

Latency and communication

> 20x ~ 9x

21

Comparison with Gazelle

Delphi

• Secure inference on convolutional neural networks • 9-22x more efficient than prior work • Combines techniques from systems, cryptography, and ML

ia.cr/2020/050

github.com/mc2-project/delphi

22

https://ia.cr/2020/050

DELPHI Cryptographic Inference for Neural Networks

Documents

Transcript of DELPHI Cryptographic Inference for Neural Networks