The Symbiosis of Program Analysis and Machine Learning

Prateek Saxena, Associate Professor, National University of Singapore

Page 1: The Symbiosis of Program Analysis and Machine LearningMLSecurity/talks/prateek.pdf · 2019-09-05 · Prateek Saxena Associate Professor National University of Singapore. Program Analysis,

The Symbiosis of Program Analysis and Machine Learning

Prateek Saxena, Associate Professor

National University of Singapore

Page 2: Program Analysis, Classically

[Diagram: Program + Property 𝜋 → Deductive Verification (Rules) → True/False]

Page 3: But, In Practice…

Program:
• Too complex to analyze/model
• Probabilistic system

Property 𝜋:
• Probabilistic/stochastic properties
• Ambiguous spec (e.g., what is a good patch?)

Rules:
• Not re-targetable
• Intractable analysis

Can Machine Learning Help?

Page 4: This Talk…

[Diagram: ML, Verification, Security]

3 Mainstream Security Analysis Tasks

Page 5: Machine Learning → Program Analysis: New Representations & Inference Tools

[Diagram: Program Representations, Property 𝜋, Induction (ML), Rules]

Page 6: Program Analysis → Machine Learning: Deductive Reasoning

[Diagram: ML System, Property 𝜋, Deductive Verification, Rules]

Page 7: ML → PA: New Representations & Inference Tools

For Symbolic Execution

Joint work with: Shiqi Shen, Shweta Shinde, Soundarya Ramesh, Abhik Roychoudhury (NDSS 2019)

Page 8: Symbolic Execution

    def f(x, y):
        if x > y:
            x = x + y
            if x - y > 0:
                assert False
        return (x, y)

x and y are symbolic variables; A and B are symbolic values.

Symbolic execution tree:
• Start: x ↦ A, y ↦ B
• Branch A > B: x ↦ A+B, y ↦ B
  • Branch A > 0: x ↦ A+B, y ↦ B, reaches assert false
  • Branch A ≤ 0: x ↦ A+B, y ↦ B
• Branch A ≤ B: x ↦ A, y ↦ B

Dynamic Symbolic Execution (DSE): a widely used variation of SE
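The path conditions in the tree can be checked mechanically. The following is a minimal sketch, not a real symbolic executor: each path is written out by hand as a list of predicates over the symbolic values A and B, and a brute-force search over a small integer range stands in for an SMT solver. The names `PATHS` and `find_witness` are illustrative assumptions.

```python
from itertools import product

# Each path of f is described by its branch conditions over the symbolic values (A, B).
# After `x = x + y`, the test `x - y > 0` becomes `(A + B) - B > 0`.
PATHS = {
    "A > B, A > 0 (assert fails)": [lambda A, B: A > B, lambda A, B: (A + B) - B > 0],
    "A > B, A <= 0": [lambda A, B: A > B, lambda A, B: not ((A + B) - B > 0)],
    "A <= B": [lambda A, B: not (A > B)],
}

def find_witness(conditions, lo=-3, hi=3):
    """Brute-force stand-in for a solver: return some (A, B) satisfying every condition, or None."""
    for A, B in product(range(lo, hi + 1), repeat=2):
        if all(cond(A, B) for cond in conditions):
            return (A, B)
    return None

witnesses = {name: find_witness(conds) for name, conds in PATHS.items()}
for name, witness in witnesses.items():
    print(name, "->", witness)
```

Any witness for the first path is a concrete input that drives f into the failing assertion; a real engine would obtain it from a solver rather than by enumeration.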

Page 9: Symbolic Execution for Finding Security Bugs

Tools shown: Kite, SAGE, jCUTE, Manticore, S²E, Angr

Page 10: The Path Explosion Problem

    void copy_data(..., int *file, ...) {
        static double data[4096], value;
        read_double_value(file, ...);
        value = fabs(data[0]);
        for (i = 0; i < 4096; i++)
            if (file[i] == 0.0) count++;
        data[1] /= (value + count - 3);
        …
    }

[Diagram: the loop branches once per iteration, i = 0, 1, …, 4095, giving 2^4096 paths to the statement data[1] /= (value+count-3);]

Prior approaches: learn a better representation, then solve symbolically!
• Express in the SMT theory of floating-point
• Infer that 'count' = # of 0s in the input bytes
• Assert: value + count - 3 = 0

SMT theories involved: FPA, linear arithmetic, vectors
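The growth in path count can be seen on a scaled-down loop. This is an illustrative sketch only: `explore_paths` is a hypothetical helper that enumerates every sequence of branch decisions for a loop with one `if` per iteration, and also records how many distinct final values of `count` those paths produce.

```python
from itertools import product

def explore_paths(n):
    """Enumerate all branch-decision sequences of an n-iteration loop with one `if` per iteration."""
    # True means the `file[i] == 0.0` branch is taken in that iteration.
    paths = list(product([False, True], repeat=n))
    # `count` at loop exit is simply the number of taken branches on the path.
    distinct_counts = {sum(path) for path in paths}
    return len(paths), len(distinct_counts)

for n in (1, 2, 4, 8, 12):
    n_paths, n_counts = explore_paths(n)
    print(f"n={n}: {n_paths} paths, {n_counts} distinct values of count")
```

The number of paths doubles with every iteration while the number of distinct `count` values grows only linearly, which is why a more compact representation of `count` is attractive.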

Page 11: But, why choose this specific constraint representation?

(Same copy_data example; the statement of interest is data[1] /= (value+count-3);)

SMT theories: BV, String, Real, Bool, UF, FPA, linear arithmetic, vectors

A Universal Approximate Representation?

Page 12: Key Insights

(Same copy_data example.)

Desired representation:

$count == \sum_{i \in [0, 4095]} sign(file[i] == 0)$

A neural network is an approximate representation of the desired…

[Diagram: file → data[1] /= (value+count-3);]

Remarks:
- Neural networks are universal approximators
- Increasing practical success

Page 13: Key Insights

Learn an approximation, from values of the symbolic variables to values of the variables in the CVP (candidate vulnerability point), with a small number of I/O examples.

    int main(…) {
        if (strlen(filename) > 1 && filename[0] == '-')
            exit(1);
        copy_data(…);
        …
    }

    void copy_data(..., int *file, ...) {
        static double data[4096], value;
        read_double_value(file, ...);
        value = fabs(data[0]);
        for (i = 0; i < 4096; i++)
            if (file[i] == 0.0) count++;
        data[1] /= (value + count - 3);
        …
    }

Symbolic variables: filename, file

CVP: data[1] /= (value+count-3); (divide-by-zero)

Approximate constraint (as a neural net): file → count & value
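Learning from I/O examples can be illustrated without a neural network library. The sketch below is an assumption-laden stand-in: a linear model over hand-picked features replaces the neural net, `program_io` is a toy Python port of the loop with a 4-element input, and all names are hypothetical. It only shows the shape of the workflow: run the program on sampled inputs, record the values seen at the CVP, and fit an approximation.

```python
import random

N = 4  # toy input length (the slide's loop runs over 4096 entries)

def program_io(file):
    """Toy stand-in for copy_data: return value + count as observed at the CVP."""
    value = abs(file[0])
    count = sum(1 for b in file if b == 0)
    return value + count

def features(file):
    # Hand-picked features (an assumption): zero indicators plus |file[0]| scaled to [0, 1].
    return [1.0 if b == 0 else 0.0 for b in file] + [abs(file[0]) / 3.0]

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def fit(data, steps=300, lr=0.1):
    """Full-batch gradient descent on mean squared error for a linear model."""
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

rng = random.Random(0)
samples = [[rng.randint(0, 3) for _ in range(N)] for _ in range(50)]
data = [(features(f), program_io(f)) for f in samples]  # the collected I/O examples
weights = fit(data)
print("training MSE:", mse(weights, data))
```

A neural network plays the same role in the talk, with the advantage that the features do not need to be chosen by hand.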

Page 14: A New Approach: Neuro-symbolic Execution

[Diagram elements: Program, Property 𝜋, Symbolic (SMT) Constraints, Neural Network (for PATH EXPLOSION), Symbolic Exploit Condition, SATISFIABLE?]

Neuro-Symbolic Execution – Shen et al. [NDSS'19]

Page 15: Constraint Solving: Satisfiability Checking

Neural constraint: $N: infile \to value, count$

1. Reachability constraints: $strlen(filename) \le 1 \lor filename[0] \ne '-'$
2. Vulnerability condition: $value + count - 3 == 0$

Purely symbolic constraints: no variable shared with neural constraints (here, the reachability constraints over filename). These go to the SMT solver.

Mixed constraints: include both neural constraints and symbolic constraints with shared variables (here, the vulnerability condition over value and count).

Page 16: Solving SMT + Neural Constraints: Encode SMT constraints as the loss function

Constraints: $N: infile \to value, count \;\land\; value + count - 3 == 0$

Symbolic constraint encoded as a loss: $L = abs(value + count - 3)$

Criterion for crafting the loss function: the minimum point of the loss function satisfies the symbolic constraints.

[Plot: loss L against value+count, with the minimum at value+count = 3]

Page 17: Optimize using Gradient Descent

[Diagram: file → neural network → count, value]

[Table: count, value, loss: 257202741201802017… … …030]

Gradient: $\nabla_{file} L$

file: 000… Concretely validate the exploit.
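The loss encoding and the gradient-descent search can be sketched in a few lines. Everything here is an assumption made for illustration: `surrogate` is a hand-written smooth function standing in for the trained network N (a soft count of near-zero entries and a smooth absolute value), the input has 6 entries instead of 4096, and the gradient is derived by hand rather than by a deep-learning framework.

```python
import math

def surrogate(x):
    """Hypothetical smooth stand-in for the learned network N: file -> (value, count)."""
    value = math.sqrt(x[0] ** 2 + 1e-9)           # smooth version of |x[0]|
    count = sum(math.exp(-xi ** 2) for xi in x)   # soft count of entries close to 0
    return value, count

def loss(x):
    value, count = surrogate(x)
    return abs(value + count - 3)                 # L = abs(value + count - 3)

def grad_loss(x):
    value, count = surrogate(x)
    sign = 1.0 if value + count - 3 >= 0 else -1.0     # derivative of abs()
    g = [-2 * xi * math.exp(-xi ** 2) for xi in x]     # d count / d x_i
    g[0] += x[0] / value                               # d value / d x_0
    return [sign * gi for gi in g]

def minimize(x, steps=500, lr=0.01):
    """Plain gradient descent on the input; keeps the lowest-loss point seen."""
    best = list(x)
    for _ in range(steps):
        x = [xi - lr * gi for xi, gi in zip(x, grad_loss(x))]
        if loss(x) < loss(best):
            best = list(x)
    return best

start = [2.0] * 6
candidate = minimize(start)
print("loss before:", loss(start), "loss after:", loss(candidate))
```

A low surrogate loss only yields a candidate input. As the slide notes, the candidate still has to be run on the real program to validate the exploit concretely, because the surrogate is an approximation.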

Page 18: NeuEx = Neuro-symbolic Execution + KLEE

Pipeline stages: Reaching CVPs, Collecting I/O Examples, Neural Net Training, Constraint Solving

Diagram labels: Starting point, Bottleneck, DSE, Random fuzz

Page 19: NeuEx Tool Overview

Inputs: Source Code, CVPs, Symbolic Variables, Input Grammar (optional)

Components: DSE Engine (KLEE), SMT Solver (Z3), Neural Mode (forked at bottlenecks)

Bottlenecks:
1. Unmodeled APIs
2. Loop unrolled count > 10K
3. Z3 timeout (> 10 mins)
4. Memory cap > 3 GB

Output: Validated Exploits
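The four bottleneck triggers amount to a simple predicate over per-path statistics. The sketch below is hypothetical: the thresholds are the ones quoted on the slide, but `DseStats`, its fields, and the function names are invented for illustration and are not part of KLEE or NeuEx.

```python
from dataclasses import dataclass

# Thresholds quoted from the slide; all names below are hypothetical.
LOOP_UNROLL_LIMIT = 10_000        # "Loop unrolled count > 10K"
SOLVER_TIMEOUT_S = 10 * 60        # "Z3 timeout (> 10 mins)"
MEMORY_CAP_BYTES = 3 * 1024 ** 3  # "Memory cap > 3 GB" (binary gigabytes assumed)

@dataclass
class DseStats:
    """Statistics a DSE engine might track for the path it is exploring."""
    hit_unmodeled_api: bool = False
    loop_unroll_count: int = 0
    solver_seconds: float = 0.0
    memory_bytes: int = 0

def bottlenecks(stats):
    """Return the list of bottleneck conditions that the given statistics trigger."""
    reasons = []
    if stats.hit_unmodeled_api:
        reasons.append("unmodeled API")
    if stats.loop_unroll_count > LOOP_UNROLL_LIMIT:
        reasons.append("loop unroll count")
    if stats.solver_seconds > SOLVER_TIMEOUT_S:
        reasons.append("solver timeout")
    if stats.memory_bytes > MEMORY_CAP_BYTES:
        reasons.append("memory cap")
    return reasons

def should_fork_neural_mode(stats):
    return bool(bottlenecks(stats))

print(bottlenecks(DseStats(loop_unroll_count=25_000, solver_seconds=700.0)))
```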

Page 20: Evaluation

● Recall: neural mode is only triggered when DSE encounters bottlenecks
● Benchmarks: 7 programs known to be difficult for classic DSE
  ○ 4 real programs
    ■ cURL: data transferring
    ■ SQLite: database
    ■ libTIFF: image processing
    ■ libsndfile: audio processing
  ○ LESE benchmarks
    ■ BIND, Sendmail, and WuFTP

These include: 1. complex loops, 2. floating-point variables, 3. unmodeled APIs

Page 21: NeuEx vs KLEE

The number of CVPs reached or covered by NeuEx is 25% higher than vanilla KLEE.

KLEE gets stuck (e.g., on complex loops).

NeuEx finds 94% and 89% more bugs than vanilla KLEE in BFS and RAND mode in 12 hours.

Page 22: ML → PA: New Representations & Inference

For Taint Analysis

Joint work with: Shiqi Shen, Shweta Shinde, Soundarya Ramesh, Abhik Roychoudhury (NDSS 2019)

Page 23: Taint Analysis

• Taint analysis tracks the information flow within a program:
  • E.g., T[v] is the taint bit for operand "v"
• Taint analysis is the basis for many security applications:
  • Information leakage detection
  • Enforcing program integrity
  • Vulnerability detection
  • …

    int parse_buffer(char buffer[100], struct pkt_info *info) {
        char check_flag;
        int err;

        check_flag = buffer[5] & 0x16;

        err = init_pkt_info(info);
        if (!err)
            return err;
        info->flag = check_flag;
        /* … */
        strncpy(info->data, buffer + 6, 50);
        info->seq = get_current_seq();
        return OK;
    }

Is the return address tainted?

Page 24: Taint Analysis on Binaries

    /* tainted input from network socket */
    int parse_buffer(char buffer[100], struct pkt_info *info) {
        char check_flag;
        int err;

        check_flag = buffer[5] & 0x16;

        err = init_pkt_info(info);
        if (!err)
            return err;
        info->flag = check_flag;
        /* … */
        strncpy(info->data, buffer + 6, 50);
        info->seq = get_current_seq();
        return OK;
    }

Assembly:

    movsx eax, byte ptr [rsi + 5]
    and eax, 16
    mov cl, al
    mov byte ptr [rbp - 25], cl

Write binary taint rules based on instruction semantics:

T[check_flag] = T[buffer+5]

[Diagram: Taint Map T[] with entries for buffer, check_flag, and info]
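The rule T[check_flag] = T[buffer+5] falls out of chaining one simple rule per instruction. The sketch below is a deliberately simplified illustration: taint is tracked per whole register (so `al` shares the taint of `eax`), locations are plain strings, and the rule names and helpers (`step`, `run`, `TRACE`) are hypothetical.

```python
# Register-level granularity: a sub-register shares the taint of its parent (a simplification).
PARENT = {"al": "eax", "cl": "ecx"}

def loc(name):
    return PARENT.get(name, name)

def step(taint, op, dst, src):
    """Apply one simplified taint rule and return the updated taint map."""
    taint = dict(taint)
    if op in ("mov", "movsx"):
        taint[loc(dst)] = taint.get(loc(src), False)  # T[dst] = T[src]
    elif op == "and_imm":
        taint[loc(dst)] = taint.get(loc(dst), False)  # T[dst] = T[dst]
    else:
        raise ValueError(f"no taint rule for {op}")
    return taint

# The four instructions from the slide, in (operation, destination, source) form.
TRACE = [
    ("movsx", "eax", "mem[rsi+5]"),
    ("and_imm", "eax", "16"),
    ("mov", "cl", "al"),
    ("mov", "mem[rbp-25]", "cl"),
]

def run(initial):
    taint = dict(initial)
    for op, dst, src in TRACE:
        taint = step(taint, op, dst, src)
    return taint

print(run({"mem[rsi+5]": True})["mem[rbp-25]"])
```

Here `mem[rsi+5]` plays the role of buffer[5] and `mem[rbp-25]` the role of check_flag, so the final map reproduces the rule on the slide.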

Page 25: Taint Rule Representations in Existing Systems

• What is the taint rule for and eax, 16 on the x86 architecture?

Taint Engine 1:
T[eax] = T[eax]

Taint Engine 2:
T[eax] = T[eax]
T[pf] = T[sf] = T[zf] = T[eax]
T[of] = T[cf] = 0

Taint Engine 3:
T[eax] = T[eax]
T[pf] = T[sf] = T[zf] = T[eax]
T[of] = T[cf] = 0
if imm == 0 { T[eax] = 0 }

One Engine To Serve 'Em All – Leong et al. [NDSS'19]
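Writing the three rules as functions makes their disagreement concrete. This is a sketch that transcribes the slide's rules literally, in the order shown; the function names and the dictionary-based taint map are assumptions for illustration, not the API of any of the engines.

```python
FLAGS_FROM_RESULT = ("pf", "sf", "zf")  # flags whose taint follows the result
CLEARED_FLAGS = ("of", "cf")            # flags that `and` always clears

def engine1(t, imm):
    # T[eax] = T[eax]; flags are not modelled at all.
    return dict(t)

def engine2(t, imm):
    out = dict(t)
    for flag in FLAGS_FROM_RESULT:
        out[flag] = t["eax"]
    for flag in CLEARED_FLAGS:
        out[flag] = 0
    return out

def engine3(t, imm):
    out = engine2(t, imm)
    if imm == 0:          # and with 0 always yields 0, so the result is untainted
        out["eax"] = 0
    return out

before = {"eax": 1, "pf": 0, "sf": 0, "zf": 0, "of": 1, "cf": 1}
for engine in (engine1, engine2, engine3):
    print(engine.__name__, engine(before, 16), engine(before, 0))
```

The same instruction yields three different taint maps, which is the inconsistency that motivates learning the rules instead of writing them by hand.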

Page 26: Complexity of "Real" Taint Rules

• Input dependent propagation
• Size dependent propagation
• Architectural quirks for backwards compatibility
• There can be 1000s of opcodes per instruction set!

    if (size == 64 || size == 32 || size == 16) {
        for (x = 0; x < size / 8; x++) {
            if (t1[x] & t2[x]) t1[x] = 1;
            else if (t1[x] & !t2[x])
                t1[x] = t1[x] & op2[x];
            else if (!t1[x] & t2[x])
                t1[x] = t2[x] & op1[x];
            else t1[x] = 0;
        }
    } else if (size == 8) {
        // 0 if it's lower 8 bits, 1 if it's upper 8 bits
        pos1 = isUpper(op1); pos2 = isUpper(op2);
        if (t1[pos1] & t2[pos2]) t1[pos1] = 1;
        else if (t1[pos1] & !t2[pos2])
            t1[pos1] = t1[pos1] & op2[pos2];
        else if (!t1[pos1] & t2[pos2])
            t1[pos1] = t2[pos2] & op1[pos1];
        else t1[pos1] = 0;
    }
    if (mode64bit == 1 && size == 64)
        for (x = 32; x < size; x++) t1[x] = 0;

Page 27: Learning Taint Rules Automatically

• Generate observations (input-output pairs)
• Infer a sound rule from the observations (exact mode)
• Generalize the rule with a simple change in the inference algorithm (generalization mode)

[Pipeline: Instruction (e.g., cmovb eax, ebx) → Observation Engine → Observations, e.g. (10110…, 11100) … (10111…, 11000) → Inference Engine → Rules (A→B, X→A, Y→Z) → Taint Computation]

Page 28: TaintInduce: Sample and Learn

Example instruction: mov eax, ebx

• Flip a bit and observe the output for changes.
  • ∆EBX0 → ∆EAX0
  • ∆EBX0 → ∆EBX0
• Influence (Inf) only valid if:
  • EAX = 11100011, EBX = 00101000
• Form a truth table with all of the collected observations.
  • True if there is a change, False otherwise
• Unseen values are conservatively set to "Don't-Cares"

[Figure: bit-level register states (EAX7…EAX0, EBX7…EBX0) before and after execution, for the original input and for inputs with one bit flipped]

EAX0  EAX1  …  EBX0  EBX1  …  Inf
1     1     …  0     0     …  1
1     1     …  1     0     …  1
0     0     …  1     1     …  1
0     0     …  0     0     …  1
…     …     …  …     …     …  0
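The flip-and-observe step can be sketched on 8-bit toy registers. This is an illustration under stated assumptions, not the TaintInduce implementation: `mov_eax_ebx` is a Python function standing in for executing the instruction, and `observe` is a hypothetical helper that records every (input bit, output bit) pair where a flip caused a change.

```python
BITS = 8  # toy register width, matching the 8-bit values on the slide

def mov_eax_ebx(state):
    """Stand-in for executing `mov eax, ebx` on hardware or in an emulator."""
    return {"eax": state["ebx"], "ebx": state["ebx"]}

def observe(execute, state):
    """Flip each input bit once and record which output bits change."""
    base = execute(state)
    flows = set()
    for src in state:
        for i in range(BITS):
            flipped = dict(state)
            flipped[src] ^= 1 << i
            out = execute(flipped)
            for dst in out:
                diff = out[dst] ^ base[dst]
                for j in range(BITS):
                    if (diff >> j) & 1:
                        flows.add((f"{src}{i}", f"{dst}{j}"))
    return flows

flows = observe(mov_eax_ebx, {"eax": 0b11100011, "ebx": 0b00101000})
print(sorted(flows))
```

For this state the observed flows include ebx0 → eax0 and ebx0 → ebx0, matching the two ∆ entries on the slide, and no flow starts from any bit of eax.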

Page 29: Captures Conditional & Indirect Dependencies

Instruction: cmovb eax, ebx

[Diagram: state before and state after execution, each consisting of EAX, EBX, ECX, CF, and memory slots]

Observed flow ebx → eax:
CF=1, EAX=542, EBX=19, ECX=7, …
CF=1, EAX=32, EBX=3, ECX=0, …
CF=1, EAX=873, EBX=32, ECX=1, …

Observed flow eax → eax:
CF=0, EAX=12, EBX=4, ECX=1023, …
CF=0, EAX=42, EBX=11, ECX=13, …
CF=0, EAX=2, EBX=3, ECX=33, …
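The conditional dependency can be reproduced by sampling random states and checking when a flip of EBX0 reaches EAX0. This is a minimal sketch: `cmovb` is a Python stand-in for the instruction, the state layout is invented for illustration, and the "inference" is just a check that the carry flag alone separates the two groups of observations.

```python
import random

def cmovb(state):
    """Stand-in for `cmovb eax, ebx`: copy ebx into eax only when the carry flag is set."""
    out = dict(state)
    if state["cf"] == 1:
        out["eax"] = state["ebx"]
    return out

def ebx0_influences_eax0(state):
    """Flip bit 0 of ebx and report whether bit 0 of eax changes after execution."""
    flipped = dict(state, ebx=state["ebx"] ^ 1)
    return ((cmovb(state)["eax"] ^ cmovb(flipped)["eax"]) & 1) == 1

rng = random.Random(1)
observations = []
for _ in range(200):
    state = {
        "cf": rng.randint(0, 1),
        "eax": rng.randrange(1024),
        "ebx": rng.randrange(1024),
        "ecx": rng.randrange(1024),
    }
    observations.append((state, ebx0_influences_eax0(state)))

# Candidate condition: the flow ebx0 -> eax0 is present exactly when CF == 1.
cf_explains_flow = all((state["cf"] == 1) == influenced for state, influenced in observations)
print("CF == 1 explains every observation:", cf_explains_flow)
```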

Page 30: TaintInduce: Outputs a succinct rule

• Use an inference technique to learn a succinct rule for the observed function

Observations (state → Inf):
CF=0, EAX=12, … → False
CF=1, EAX=333, … → True
CF=0, EAX=42, … → False
CF=0, EAX=44, … → False
CF=1, EAX=873, … → True
CF=0, EAX=1023, … → False
CF=0, EAX=33, … → False
CF=1, EAX=32, … → True
CF=0, EAX=2, … → False
… → DC

Inference: CF=1 → True

Rule:
IF CF=1 THEN (EBX0 → EAX0) ELSE (EAX0 → EAX0)

Page 31: Results: Comparison with State-of-the-art

• Compared with TEMU, Triton, libdft
• On LAVA-M, libtiff, binutils, etc.
• Checks taint propagation for each individual instruction between TaintInduce and each of the tools
• Only 0.28% of the discrepancies are errors in TaintInduce
• All of the errors made by TaintInduce are due to ZF

Matches: 93.27% to 99.5% with existing hand-written tools.

x86 instructions:

             Arith  Comp  Jump  Move  Cond  FPU  SIMD  Misc  Total
TaintInduce  43     9     33    33    60    85   259   28    550
libdft       15     5     1     30    32    X    X     8     91
Triton       38     9     19    33    32    X    144   13    288
TEMU         7      1     2     3     X     X    X     X     13

Page 32: Results: Coverage and Correctness

Auto-generated taint rules for 4 architectures (x86, x64, AArch64, MIPS-I), with no mistakes for ~71% of the instructions.

         Arith  Comp  Jump  Move  Cond  FPU  SIMD  Misc
x86      √      √     √     √     √     √    √     √
x64      √      √     √     √     √     √    √     √
AArch64  √      √     √     √     √     √    √     √
MIPS-I   √      √     √     √     -     -    -     -

Methodology: train for 100 seeds, test on 1000 random inputs for each instruction

Room for future work: learn precise rules for all the instructions…

Page 33: PA → ML: Deductive Reasoning

Joint work with: Teodora Baluta, Shiqi Shen, Shweta Shinde, Kuldeep S. Meel (CCS 2019)

Page 34: Concerns with ML Systems

Robustness, Fairness, Memorization

[Diagram: ML MODEL]

[Trojaning Attacks – Liu et al.] [Adversarial Examples – Goodfellow et al.]

Page 35: Qualitative Verification

[Diagram: network $N$ → VERIFIER → Boolean answer to $\exists x,\ \pi(N, x)$]

But, the network or property is often stochastic…

Page 36: Quantitative Verification

[Diagram: network $N$ → QUANTITATIVE VERIFIER → How many?]

Quantitative Verification of Neural Networks and Its Security Applications – Baluta et al. [CCS'19]

Page 37: NPAQ: A Quantitative Verifier For Neural Nets

[Pipeline: $N$ → COUNT-PRESERVING ENCODERS → CNF FORMULA ($\varphi$) → APPROX. MODEL-COUNTER → COUNT $|R(\varphi)|$]

PAC-style soundness guarantees:

$\Pr[(1 + \epsilon)^{-1} |R(\varphi)| \le r \le (1 + \epsilon) |R(\varphi)|] \ge 1 - \delta$

where $|R(\varphi)|$ is the true count, $r$ is the approximate count, $\epsilon$ is the error tolerance, and $1 - \delta$ is the confidence.
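The tolerance interval inside the guarantee can be checked on a formula small enough to count exactly. This is a sketch under stated assumptions: `count_models` is a brute-force exact counter (nothing like an approximate model counter), clauses use the common convention that integer k means variable k and -k its negation, and `within_tolerance` is a hypothetical helper that tests the bracketed condition for one estimate r.

```python
from itertools import product

def count_models(clauses, n_vars):
    """Exact model count |R(phi)| by enumeration; literal k is variable k, -k its negation."""
    total = 0
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            total += 1
    return total

def within_tolerance(r, true_count, eps):
    """Check (1 + eps)^-1 * |R(phi)| <= r <= (1 + eps) * |R(phi)|."""
    return true_count / (1 + eps) <= r <= (1 + eps) * true_count

phi = [[1, 2], [-1, 3]]  # (x1 or x2) and (not x1 or x3)
true_count = count_models(phi, 3)
for estimate in (2, 3, 5, 8):
    print(estimate, within_tolerance(estimate, true_count, eps=0.8))
```

The guarantee says that an approximate counter returns an estimate passing this check with probability at least 1 − δ; the sketch only evaluates the check itself.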

Page 38: NPAQ Results

• 84 models of up to 51,410 parameters
• 1,056 encoded CNF formulae
• 3 applications on MNIST and UCI Adult datasets:
  • Robustness: How many adversarial samples are there within some perturbation distance?
  • Fairness: How often will a prediction change (favorably) if the gender of the applicant is changed, keeping all else constant?
  • Trojan attack efficiency: How often will an image with a trigger result in the desired misclassification?

97.1% of the encoded formulae were solved within a 24-hour timeout each.
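The fairness question is a counting question, which a toy example can answer by enumeration. Everything below is hypothetical: `classifier` is a hand-written threshold function over four binary features standing in for a binarized network, feature 0 plays the role of gender, and the count covers any change in prediction rather than only favorable ones.

```python
from itertools import product

GENDER = 0  # index of the hypothetical binary gender feature

def classifier(x):
    """Toy stand-in for a binarized network: a thresholded weighted vote."""
    weights = [2, 1, 1, 1]
    return int(sum(w * xi for w, xi in zip(weights, x)) >= 3)

def count_gender_sensitive(n_features=4):
    """Count inputs whose prediction changes when only the gender bit is flipped."""
    sensitive = 0
    for x in product([0, 1], repeat=n_features):
        flipped = list(x)
        flipped[GENDER] ^= 1
        if classifier(x) != classifier(flipped):
            sensitive += 1
    return sensitive, 2 ** n_features

sensitive, total = count_gender_sensitive()
print(f"{sensitive} of {total} inputs change prediction when gender is flipped")
```

Enumeration stops being feasible beyond a few dozen input bits, which is where encoding the network as a CNF formula and using an approximate model counter comes in.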

Page 39: Key Takeaways

● Machine Learning Helps Program Analysis
  ○ When program, property or analysis rules are uncertain
  ○ Provides powerful approximate representations and solving tools
  ○ Specific applications:
    ■ Neuro-Symbolic Execution
    ■ Automatically Learning Taint Rules
● Program Analysis Helps Machine Learning
  ○ By verifying properties
  ○ SAT/SMT-based quantitative reasoning is a powerful tool
  ○ Specific applications: Fairness, Robustness, Memorization

Thank you!