09/09/16
1
WebApplica0onSecurity:fromSta0cAnalysistoDynamic
Protec0onsandRecoveryMiguelCorreia
jointworkwithIbériaMedeiros,NunoNeves,MiguelBeatriz,DárioNascimento,...
BuildingTrustintheInforma0onAge–SummerSchoolonComputerSecurityandPrivacy–Cagliari,Sep.2016
ULisboa/IST/INESC-ID• UniversidadedeLisboa–Portugal
– largestuniv.inPortugal;~50Kstudents;~460programs;18schools• Ins0tutoSuperiorTécnico
– largestengineeringschoolinPortugal;~12Kstudents;80programs• INESC-ID
– largelabincomputerscienceandelectricalengineering;100+PhDs(mostISTfaculty);~250PhD/MScstudents;manyresearchgroups
• DistributedSystemsGroup(GSD)– 12ISTfaculty,~30PhDstudents,~40MSCstudents,3ECprojects
2
09/09/16
2
Researchoverview(1)IntrusionTolerance
• ToapplytheFaultToleranceparadigminthedomainofSecurity
• Dothebestweknowtoprotectsystems…butvulnerabili7ess7llremain…sotolerateintrusionsthats7lloccur
3
Researchoverview(2)Intrusion-TolerantServices
Servers (N)
Clients
I-T Distributed Service
Request Reply
NFS,DNS,on-lineCA,Webserver,etc.
0-Dayvulnerability
RedundancyDiversity CORR
ECT
oraccidentalfaultByzan0neFT
protocol
securecomponents
4
09/09/16
3
Researchoverview(3)MinBFT
• FirstefficientBFTSMRprotocol:PBFT(1999)– 3f+1replicas– 5communic.steps
• MinBFT(2009-13)– requireslocalsecurecomponent:monotoniccounter(simplerthanTPM)
– 2f+1replicas– 4communic.steps
5
Servers (N)
Clients
I-T Distributed Service
Request Reply
securecomponents
Byzan0neFTprotocol
G.S.Veronese,M.Correia,A.N.Bessani,L.C.Lung,P.Verissimo.EfficientByzan8neFaultTolerance.IEEETransac0onsonComputers2013.
Researchoverview(4)DepSky
• Service:intrusion-tolerantcloudstorage– Client-sidesogware– Server-sidearecloudstorageservices(diversity!)
• Byzan0nequorumprotocol(consistency)+erasurecodes(space)+symmetriccripto(confiden0ality)
• Wide-areaexperiments:+availability+readspeed-writespeed
AmazonS3
Nirvanix
Rackspace
WindowsAzureA.N.Bessani,M.Correia,B.Quaresma,F.André,P.Sousa,
DepSky:DependableandSecureStorageinaCloud-of-Clouds.EuroSys2011andACMTransac0onsonStorage2013.
6
09/09/16
4
Overviewofmyresearch(5)SogwareSecurity
• Diversityisameanstogetdifferentvulnerabili0esinreplicas,mostlyinsogware,buthow?Thismo0vatedmetounderstandsogwarevulnerabili0es
• Alsoreducingvulnerabili0esiscrucialsoaudi0ng,sta0canalysis,dynamicprotec0on,securecoding...
• =>SogwareSecuritythatisthemajortopicofthispresenta0on
7
Overviewofmyresearch(6)SogwareSecurity
• Olderwork:– Aqackinjec0on/fuzzing– Vulnerabili0esinsogwareportedfrom32to64-bitCPUs
– Anomaly-basedintrusiondetec0oninwebapps• Teachingacoursesince2004
8
09/09/16
5
OVERVIEWOFTHEPRESENTATION
9
Outline
1. WAP:vulnerabilitydetec0onwithsta0canalysisusingtaintanalysis+classifier
2. DEKANT:vulnerabilitydetec0onwithsta0canalysisusingasequencemodel
3. SEPTIC:blockingaqacksintheDBMS
4. SHUTTLE:intrusionrecoveryinthecloud
10
09/09/16
6
PapersWAP:I.Medeiros,N.F.Neves,M.Correia.Automa8cDetec8onandCorrec8onofWebApplica8onVulnerabili8esusingDataMiningtoPredictFalsePosi8ves.WWW2014
WAP:___.Detec8ngandRemovingWebApplica8onVulnerabili8eswithSta8cAnalysisandDataMining.IEEETransac0onsonReliability2016
WAP:___.EquippingWAPwithWEAPONStoDetectVulnerabili8es.DSN2016
DEKANT:___.DEKANT:ASta8cAnalysisToolthatLearnstoDetectWebApplica8onVulnerabili8es.ISSTA2016
SEPTIC:I.Medeiros,M.Beatriz,N.NevesandM.Correia.HackingtheDBMStoPreventInjec8onASacks.CODASPY2016
SHUTTLE:D.Nascimento,M.Correia.ShuSle:IntrusionRecoveryforPaaS.ICDCS2015.
11
WAP:VULNERABILITYDETECTIONWITHSTATICANALYSISUSINGTAINTANALYSIS+CLASSIFIER
1
12
09/09/16
7
Mo0va0on
• Webapplica0onsareexposedtomalicioususerinputs;ifvulnerable,theycanbeaqackedsuccessfully
• “Sowhydodeveloperskeepmakingthesamemistakes?(…)Insteadofrelyingonprogrammers’memories,weshouldstrivetoproducetoolsthatcodifywhatisknownaboutcommonsecurityvulnerabili0esandintegrateitdirectlyintothedevelopmentprocess.”– DavidEvansandDavidLarochelle,ImprovingSecurityUsingExtensible
LightweightSta0cAnalysis,2002
13
Sta0c(source)codeanalysis
• Objec0ve:tofindvulnerabili0esintheapplica0ons’(source)codeautoma0cally– Similartocompiler’serrorcheckingbutforvulnerabili0es
– Similartomanualcodereviewingbutautoma0cally
• Sta0cbecausethecodeisnotexecuted
14
09/09/16
8
Genericsta0canalysistool
15
WAP:outline
• Overview• Taintanalysis• Falseposi0veclassifica0on• Codecorrec0on• TheWAPtool• Results
16
09/09/16
9
Vulnerabilityexample(SQLI)
17
Vulnerabilityexample(SQLI)PHPcode:$u=$_POST[’user’];$p=$_POST[’password’];$q=“SELECT*FROMusersWHEREuser='$u'ANDpass='$p'”;$r=mysql_query($q);$q=“SELECT*FROMusersWHEREuser=''or1=1--'ANDpass='any'”;$r=mysql_query($q);
18
09/09/16
10
Mechanism1:TaintAnalysis
19
If we could track the user inputs and verify if they reachsensitive functions, then we could detect vulnerabilities...
...Taint Analysis
● taints all entry points (user inputs, e.g., $_POST)● follows the code propagating its taintedness● until it reaches a sensitive sink
(some functions, e.g., mysql_query)
How?
SQL Injectiondetected
$u = $_POST[’user’];
$p = $_POST[’password’];
$q = “SELECT * FROM users WHERE user='$u' AND pass='$p'”;
$r = mysql_query($q);
Taint Analysis: vulnerabilities detectedTaint Analysis: vulnerabilities detected
Taint Analysis: untaintednessTaint Analysis: untaintedness
Taint analysis: - handles sanitization functions - does not propagate the taintedness
$u = $_POST[’user’];
$p = $_POST[’password’];
$uu = mysql_real_escape_string($u);
$pp = mysql_real_escape_string($p);
$q = “SELECT * FROM users WHERE user='$uu' AND pass='$pp'”;
$r = mysql_query($q);
OK!
Vulnerability!
• Analysesthesourcecode,star0ngateveryentrypoint,propaga0ngtaintedness,checkingifasensivesinkisfedwithtainteddata
somefunc0onssani0zes,so“untaints”,thedataflow
Challenge:FalsePosi0ves
• Falseposi0ve:theanalyzersaysthere’savulnerability,butthat’sfalse
– Cause:sani0za0onfunc0on(s)missingfromlist
– Obvioussolu0on:addmissinginfototheanalyzer
• Howdoweknowwhichfunc0onsuntaintdata?– Someareobvious,likemysql_real_escape_string
– Somearen’t,likesubstrortrim
20
09/09/16
11
Programming
• Howdocomputers“know”howtodosomething?• Humanscreateprograms,i.e.,sequencesofinstruc0ons– Knowledgeistheprogramplusdata(config.,DBs)– Ourcase:program=analyser;data=sani0za0onfunc0ons,etc.
• Drawback:humanshavefirsttosynthe0zethisknowledgeinapreciseway
21
MachineLearning
• Programslearnautoma0callyfromdata
– Noneedtoexpressknowledgeprecisely!– Humaneffortcanbemuchsmaller
• “Wecanthinkofmachinelearningastheinverseofprogramming”(PedroDomingos)
• Extensivelyusedtodaytosolvecomplexproblems
– voicerecogni0on,naturallanguagetransla0on,playingJeopardy...
22
09/09/16
12
Mechanism2:Classifica0on• Keyidea:
– forlessobvioussani0za0onfunc0ons(orcombina0ons)don’taskexperts,letthetoollearn
– weletthetaintanalyzerproducefalseposi0ves,butuseaclassifiertodis0nguishtruefromfalse
• Classifierworksbasedonasetofexamples– ausercanaddmoreexamplestomakethetoolmoreprecise;noneedtoprogramknowledge
– othertools:userlearnsfunc0onXsani0zes,thencodesX– ourtool:userseesexampleYnotvulnerable,thenaddsY
23
Mechanism3:CodeCorrec0on
• Correc0ngvulnerabili0esis0resomeandtheycanberemovedmostlyautoma0callyusingfixes
• Letthetooltodoitwhenitdetectsavulnerability
24
09/09/16
13
WAP:outline
• Overview• Taintanalysis• Falseposi0veclassifica0on• Codecorrec0on• TheWAPtool• Results
25
Scheme
26
ep:entrypointsss:sensi0vesinkssan:sani0za0onfunc0ons
09/09/16
14
WAP:outline
• Overview• Taintanalysis• Falseposi8veclassifica8on• Codecorrec0on• TheWAPtool• Results
29
Keyidea
• Codeslice:sequenceofallinstruc0onsfromanentrypointtoasensi0vesinkthataffectdataflow
• Keyidea:givenacodesliceinwhichthetaintanalyzerdetectedavulnerability,classifyitasvulnerableornot– confirmingtheconclusionofthetaintanalyzer– orsayingitwasafalseposi0ve
• Howtodis0nguishvulnerablefromnon-vulnerableslices?Usingsymptoms/features
30
09/09/16
15
FeaturesforFPclassifica0on
31
• Whatarethefeaturesofthepossibleexistenceofafalseposi0ve?Asymptomexistswhentheuserinputis(examples):– changed
• stringmanipula0onfunc0ons(e.g.,substr)• concatena0onopera0ons
– validated• typecheckingfunc0ons(e.g.,isset,is_string)• whiteandblacklis0ng
• Featuresarebinary:presenceornotofoneofthese
FPclassifica0on:otheringredients
• Whatdoweneedforclassifica0on?• Asetoffeaturestocharacterizefalseposi0ves• Classifica0onclasses;weusetwo:
– isaFP(Y);isnotaFP(N=realVulnerability)• LearningdatasetofslicesannotatedasYorN
– originalset:76instances(32Y,44N)– obtainedmanually,tedious
• Aclassifica0onalgorithm:wedidn’tselectonebutdefinedaprocesstodotheselec0on
32
09/09/16
16
Originallearningdataset• 76instances:32falseposi0ves+44realvulnerabili0es• 15features,correspondingto24symptoms(func0ons)
33
Evalua0onofclassifiers
• WiththeWEKAtoolwe:• evaluated10machinelearningclassifiers
– ID3,C4.5/J48,RandomForest,RandomTree,K-NN,NaiveBayes,BayesNet,MLP,SVM,andLogis0cRegression
• testedtheclassifierswith10-foldcrossvalida0on– datasetdividedinto10buckets,traintheclassifierwith9ofthemandtestitwiththe10th;repeattheprocesswitheverycombina0on(100mes)
• used10metricstoevaluatetheclassifiersperformance
34
09/09/16
17
Evalua0onofclassifiers
• ResultsforLogis0cRegression(thebest):
– Accuracy=(TP+TN)/(P+N)=92.1%(instanceswellclassified)– Precision=TP/(TP+FP)=96.4%(FPinstanceswellclassified)
• Laterwerepeatedthiswithmuchmoredata
35
TP FP
FN TN
Classifiersimplemented
• Firstversion:wefirstimplementedLR• Secondversion:weimplementedacombina0onofthetop3classifiers(LR,RT,SVM)(samedataset)
36
09/09/16
18
WAP:outline
• Overview• Taintanalysis• Falseposi0veclassifica0on• Codecorrec8on• TheWAPtool• Results
37
Codecorrec0on
• Idea:whenavulnerabilityisfound,insertafixthatdoessani0za0onorvalida0onofthedata– Afixisjustacalltoafunc0onthatdoesit– Sani0za0on:escapingmetacharacters/metadata– Valida0on:checkingthedataandexecu0ngthesensi0vesinkornotdependingonthisverifica0on
• SQLIexample:– fixcallsaPHPsani0za0onfunc0onthatdependsontheDBMS(e.g.,pg_escape_string)
– fixinsertedinthelastwriteinthequerystring
38
09/09/16
19
Correc0onofcodecorrec0on(!)
• Weneverobservedfixesbreakinganapplica0onfunc0oning,butit’snotimpossible
• Solu0on:regressiontes0ng– consistsinrunningthesametestsbeforeandagerprogrammodifica0ons
– tocheckifwhatwasworkingcorrectlys0lldoes• WedidsomesimpleexperimentswithSelenium
39
WAP:outline
• Overview• Taintanalysis• Falseposi0veclassifica0on• Codecorrec0on• TheWAPtool• Results
40
09/09/16
20
WAP-WebApplica0onProtec0on• DoeswhatwesawforPHP:analysis,classifica0on,correc0on• Givesfeedback:
– reportsvulnerabili0esdetectedandhowwerecorrected– outputsacorrectedversionofthewebapplica0on– reportsthefalseposi0vesiden0fied
• Availableonline:~9000downloads!– hqp://awap.sourceforge.net/andatOWASP
41
WAP
42
09/09/16
21
Vulnerabili0esconsidered• Mostexploited:
– SQLInjec0on– CrossSiteScrip0ng(XSS)
• Others:– Remotefileinclusion– Localfileinclusion– Directorytraversal/pathtraversal– Sourcecodedisclosure– OScommandinjec0on– PHPcodeinjec0on
43
Challengesofimplemen0ngWAP
• PHPsyntaxuncertainty:PHPisnotformallyspecifiedandpoorlydocumentedfeaturesareusedogen
• Environmentvariables:resolvenameoftheincludedfiles
• Interprocedural,global,context-sensi0ve,classanalysis
44
09/09/16
22
WAPe
• Extendingsta0canalysistoolstofindnewvulnerabilityclassesrequiresprogramming,itscomplexandtakes0me
• Solu0on:modifyWAPtodealwithnewvulnerabilityclassesdefinedbytheuserswithoutprogramming
• “EquippingWAPwithWEAPONS”(WAPextensions)
45
WAPe:Basicscheme
46
09/09/16
23
WAPe:Classifieranddataset
• Weincreasedthedatasetandredonetheclassifierstudy:
WAP WAPe
48
WAP WAPe15features 60features24symptoms(func0ons) 60symptoms(func0ons)datasetwith76instances datasetwith256instancesClassifiers:SupportVectorMachineLogis0cRegressionRandomTree
Classifiers:SupportVectorMachineLogis0cRegressionRandomForest
WAPe:newvulnerabili0es
• LDAPinjec0on(LDAPi)• XPathinjec0on(XPathI)• NoSQLinjec0on(NoSQLi)• Commentspamming(CS)• Sessionfixa0on(SF)• Headerinjec0on/HTTPresponsespli�ng(HI)• Emailinjec0on(EI)• SQLIforWordPress
52
09/09/16
24
WAP:outline
• Overview• Taintanalysis• Falseposi0veclassifica0on• Codecorrec0on• TheWAPtool• Results
53
WAPvsPixy• PixydoestaintanalysistodetectSQLIandXSSvulnerabili0es
54
09/09/16
25
WAPvsPhpMinerII• PhpMinerIIpredictsthepresenceofSQLI/XSSvulnerabili0es
inPHPcode(inslices)usingaMLclassifier• unlikeWAP,itdoesnotiden0fywherevulnerabili0esare• alsoonlySQLIandXSS
55
Summary
56
09/09/16
26
WAPwithallvulnerabilityclasses
57
WAPtotals
58
1.38MLOCs388vulnerabili0es
09/09/16
27
WAPetotals
59
WAPe:0-dayvulnerabili0es
• WordPressisthemostpopularCMS;manyplugins• 115WordPresspluginsanalyzed
– somehavemorethan1Mdownloads– someareinstalledinmorethan10Kwebsites
• 23werefoundvulnerable– 153zero-dayvulnerabili0es– 16knownvulnerabili0es– 55SQLI,71XSS,31DT/RFI/LFI,etc.
60
09/09/16
28
WAPwrap-up
• Anapproachandatool(WAP)– toautoma0callyiden0fyandcorrectthesevulnerabili0es
– andtopredictfalseposi0vesusingdatamining– leveragingtheideaoflearninginsteadofprogramingknowledge
• MillionsofLOCsanalyzed,many0-daysfound
61
WAP:beqerinputvalida0on
62
09/09/16
29
DEKANT:VULNERABILITYDETECTIONWITHSTATICANALYSISUSINGASEQUENCEMODEL
2
63
Mo0va0on
• Typicalsta0canalysistools:– detectvulnerabili0estheyareprogrammedto– learningwouldbeinteres0ng,asseenalready
• WAP:limitedcapacitytolearn– doesclassifica0onofFPsbasedonsymptoms– doesnottakeintoaccounttheorderofelementsthatappearinthecode
• Isitpossibletohaveatoolthatlearns“everything”?
64
09/09/16
30
DEKANT:outline
• Overview• Intermediateslicelanguage• Sequencemodel• TheDEKANTtool• Results
65
DEKANT
• Novulnerabilityknowledgeisprogrammedinthetool– not100%true:slicingisprogrammed;expertassignsfunc0onstoclasses
• Thetoolextractsknowledge(learns)fromacorpus,i.e.,asetofannotatedsourcecodesamples
• Thisknowledgeismodeledusingasequencemodel(aHiddenMarkovModel–HMM)
66
09/09/16
31
Naturallanguageprocessing
• Example:part-of-speech(POS)tagging– NelsonÉvoraisexpectedtowintomorrow– Nelson_Évora/NNPis/VBZexpected/VBNto/TOwin/VBtomorrow/NN
• POSclassifieseachword(observa0on)ofasentence(sequence)withatag– takingintoaccountthecontextoftheword(i.e.,itsplaceinthesentence,order)
• context/orderaremodeledusingaHMM• knowledgeabouttagsislearnedfromacorpus
67
HiddenMarkovModel
• Statesarehiddenandemitobserva0ons• Forasequenceofobserva0ons,theHMMallowsdiscoveringthesequenceofstatesthatemitsthatsequence
68
09/09/16
32
HiddenMarkovModel
• Goal:calculatewhichstateemitsobsn• How:bycalcula0ngtheprobabilitythateachstateemitsobsngiventhepreviousstates
• Winner:thesequencewithhighestprobability
69
Sta0canalysisvsHMM
• Pu�ngthetwotogetherwehaveSATthatlearnstodetectvulnerabili0esusingaHMM
70
09/09/16
33
Knowledgeandlearning• Createthecorpus:
– collectslices(vulnerableandotherwise)– translateslicesintoISL(IntermediateSliceLanguage)– annotatethesliceswithstates(VulandN-Vul)– removeduplicates
• Learnvulnerabilitycharacteris0cs:– generatematricesofprobabili0es– traintheHMM
71
DEKANT:outline
• Overview• Intermediateslicelanguage• Sequencemodel• TheDEKANTtool• Results
72
09/09/16
34
Intermediateslicelanguage(ISL)
• Alanguagethatrepresentsabstractlythesourcecodeelements
• Composedbytokensandagrammar
73
...
Transla0ngasliceintoISL
7418 / 41
A new language...A new language...
● Translates a slice to ISL● Creates the variable map of the slice
Inte
rmedia
te S
lice L
anguage (
ISL)
ISL | S
lice T
ransla
tion P
rocess
$u = $_POST[‘username’];
$q = "SELECT pass FROM users WHERE user=’".$u."’";
$result = mysql_query($q);
inputvar varinput
09/09/16
35
Transla0ngasliceintoISL
7519 / 41
A new language...A new language...
● Translates a slice to ISL● Creates the variable map of the slice
Inte
rmedia
te S
lice L
anguage (
ISL)
ISL | S
lice T
ransla
tion P
rocess
$u = $_POST[‘username’];
$q = "SELECT pass FROM users WHERE user=’".$u."’";
$result = mysql_query($q);
input
var
var
var
Transla0ngasliceintoISL
7620 / 41
A new language...A new language...
● Translates a slice to ISL
● Creates the variable map of the slice
Inte
rmedia
te S
lice L
anguage (
ISL)
ISL | S
lice T
ransla
tion P
rocess
$u = $_POST[‘username’];
$q = "SELECT pass FROM users WHERE user=’".$u."’";
$result = mysql_query($q);
input
var
ss var
var
var
var
09/09/16
36
Transla0ngasliceintoISL
7721 / 41
A new language...A new language...
● Translates a slice to ISL● Creates the variable map of the slice
Inte
rmedia
te S
lice L
anguage (
ISL)
ISL | S
lice T
ransla
tion P
rocess
$u = $_POST[‘username’];
$q = "SELECT pass FROM users WHERE user=’".$u."’";
$result = mysql_query($q);
1,0 : is an assignment instruction or not- : is not a variableu : the name of the variable in the slice
variable mapslice-isl
input
var
ss var
var
var
var
1 - u
1 u q
1 - q result
DEKANT:outline
• Overview• Intermediateslicelanguage• Sequencemodel• TheDEKANTtool• Results
78
09/09/16
37
SequenceModel
• ThemodelistheHMMmodelalreadypresented• anISLinstruc0on
– isasequenceofobserva0onsfortheHMM– isclassifiedastaintorn-taint
• thelastobserva0onfromlastinstruc0oncarriestheclassifica0onofthewholeslice-isl:taintorn-taint,i.e.,vulnerableornot
79
SequenceModel
80
09/09/16
38
Classifica0onexample
81Vulnerability!
DEKANT:outline
• Overview• Intermediateslicelanguage• Sequencemodel• TheDEKANTtool• Results
82
09/09/16
39
TheDEKANTTool• Implementsthelearningphaseandthesequencemodel• Corpuswith510slicesextractedfromrealwebapplica0ons
(414vulnerable,96non-vulnerable)• Detects8vulnerabilityclasses:SQLI,XSS,RFI,LFI,DTSCD,
OSCI,PHPCI• Composedby4modules:
– knowledgeextractor– sliceextractor– slicetranslator– vulnerabilitydetector
83
DEKANT:outline
• Overview• Intermediateslicelanguage• Sequencemodel• TheDEKANTtool• Results
84
09/09/16
40
Evalua0on:WordPressplugins
85
Evalua0on:realwebapplica0ons
86
09/09/16
41
Evalua0on:realwebapplica0ons
88
DEKANTwrap-up
• NewapproachinspiredinNLPtodetectwebapplica0onvulnerabili0es
• Knowledgeislearned(except...)– firstlearnaboutvulnerabili0esfromcorpus– thendetectvulnerabili0estakingtheorderofinstruc0onsintoconsidera0on
• Niceresultsincomparisonwithothertools• Justafirststepinapromisingresearchdirec7on
89
09/09/16
42
SEPTIC:BLOCKINGATTACKSINTHEDBMS
3
91
Mo0va0on:dynamicprotec0on
• Widelysuccessfulinthebinaryapplica0onworld• Todaybufferoverflowsautoma0callyblockedby:
– canariesinthestack–detectreturnaddressmodifica0on– heaphardening–detectsheapmeta-datamodifica0on– non-executablepages–jumpsintoinjectedcodemakeprogramcrash
– addressspacelayoutrandomiza0on–makesaddresseshardtoguess
– andmanymore,e.g.,hqps://wiki.debian.org/Hardening
92
09/09/16
43
Mo0va0on:dynamicprotec0on
• Idea:blockaqacksthatmayexploitexis0ngvulnerabili0es
• Benefit:canbedeployedtransparently(opera0ngsystem,compiler,virtualmachine),independentlyofvulnerabili0esexis0ngornot
• Successfulwithbinaryapplica7ons,whynotwithwebapplica7ons?
93
SEPTIC• Problem:
– SQLIinjec0onaqacksretrieve/storedatainDB– Some0mestheycircumventsani0za0onfunc0ons– Seman0cmismatchbetweenserver-sidelanguageandDBMS
• Oursolu0on:– DBMSself-protectedagainstinjec0onaqacks– Detectandblockinjec0onaqacksinsidetheDBMS
• How:– “hacking”theDBMSàSEPTICmechanism
94
09/09/16
44
Seman0cmismatchexample
• Inputsani0zedwithmysql_real_escape_string– usernameadmin'--à'isescaped– usernameadmin%27--à%27notescapedbutMySQLinterprets%27asaprimeandexecutesSELECTnameFROMusersWHEREuser='admin'
• Seman0cmismatch– differentviewsfromPHPandMySQL– PHPprogrammersdon’tknowthisaqackworks
95
Seman0cmismatchcases
96
09/09/16
45
SEPTIC:outline
• ASackdetec8oninSEPTIC• RunningSEPTIC• Results
97
AqackshandledbySEPTIC
98
09/09/16
46
QueryprocessingvsSEPTIC
99
detec0on:queryiscomparedtomodel(s);nomismatchasmechanismrunsjustbeforequeryisexecuted!
SElf-Protec0ngdaTabasespreventIngaqaCks
SEPTIC:crea0ngquerymodelsSELECTnameFROMusersWHEREuser='alice'ANDpass='foo'
100eachqueryshouldhaveitsowniden0fier(ID)
09/09/16
47
QueryIDcrea0on:SSLEIDs
101
ZendengineforPHP
QueryIDcrea0on:SSLEIDs• SSLEbestplacetocreateIDs
– programmernotinvolved– lot’sofinfoaboutthecode
• BasicID:– file:line–filepathnameandlinenumberwhereDBMSiscalled(e.g.,mysql_query)
– problem:singlefunc0onusedfordifferentqueries• FullID:
– file:line|...|file:line–1stpairhassamemeaning– otherpairs:lineswherequeryispassedasargumenttoafunc0on
102
09/09/16
48
QueryIDcrea0on:DBMSIDs
103
SQLIdetec0on:step1-structurally
• comparethenumberofnodesofQSwithitsQM• if#nodesisdifferent,thenSQLIaqackdetected
– otherwisegotostep2– quickandcoversmanyaqacks,e.g.,admin’--
104
09/09/16
49
SQLIdetec0on:step2-syntac0cally
• comparethecontentofnodesofQSwithitsQM• ifapairdoesnotmatch,aSQLIaqackisdetected
105
Example:secondorderSQLI
106
09/09/16
50
Example:syntaxmimicry
107
Storedinjec0ondetec0on
• Storedinjec0onaqack– Maliciousdata:JavaScript(storedXSS),shellcommands,PHPcode
– 1ststep:maliciousdatainsertedintheDB– 2ndstep:maliciousdataretrievedfromDBandused
• Detec0onusingcodedetectors(plugins)– inputsfromINSERT/UPDATEqueriesarecheckedlookingformaliciousdata
– wedidn’tgomuchdeepinthis(onlyXSS,basic)
108
09/09/16
51
SEPTIC:outline
• Aqackdetec0oninSEPTIC• RunningSEPTIC• Results
109
SEPTICopera0onmodes
110
09/09/16
52
Crea0ng/storingquerymodel
111
SEPTIC
parsedQparse
querymodels
getID
executeQIDen0fierQueryQueryModelQueryStructure
createQS
validateIDQ
DBMS
createQM
generateDBMS-ID
ID
Training mode | training phase Normal mode | incremental
Detec0ng/blockingSQLI
112
SEPTIC
parsedQparse
querymodels
logofaqacks
getID
getQM
dropQ
executeQIDen0fierQueryQueryModelQueryStructure
createQS
detectaqacks
validateIDQ
DBMS
getDBMS-ID
ID
Normal mode | prevention or detection
09/09/16
53
Detec0ng/blockingstoredinjec0on
113
SEPTIC
parsedQparse
logofaqacks
dropQ
executeQIDen0fierQueryQueryModelQueryStructure
createQS
detectaqacks
applyplugins
validateIDQ
DBMS
ID
Normal mode | prevention or detection
SEPTICfullarchitecture
114
SEPTIC
parsedQparse
querymodels
logofaqacks
getID
getQM
dropQ
executeQIDen0fierQueryQueryModelQueryStructure
createQS
detectaqacks
applyplugins
validateIDQ
DBMS
createQM
generate/getDBMS-ID
ID
09/09/16
54
SEPTIC:outline
• Aqackdetec0oninSEPTIC• RunningSEPTIC• Results
115
SEPTICimplementa0on(#changes)• MySQLDBMS–SEPTICitself
– 1file:14loc– SEPTICdetector– SEPTICsetup– sep0c_trainingmodule
• PHP/Zendengine–inser0onofIDsintheSSLE– 3files:27loc– SEPTICiden0fier
• Java/Springframework–toshowit’snotspecifictoPHP– 1file:16loc– SEPTICiden0fier
• AlsoanalyzedcasesofMariaDBandPostgreSQL116
09/09/16
55
• SQLIunrelatedtoseman0cmismatch– 23fromthesqlmapproject– 11byRay&Liga�(4arenotaqacks/vulnerab.)– 7othersamples(forotherSQLIaqacks)
• SQLIrelatedtoseman0cmismatch– 17codesamples
• Storedinjec0on– 5codesamples
• Total:59aqacks/vuln.,4non-aqacks/vuln.
117
SEPTICdetec0onw/codesamples
Comparisonwithothertools
120
DBMSBrowser
SEPTIC
Webapplica8on
an0-SQLItoolsWAF
SQLrandAMNESIA
CANDIDDIGLOSSIA
ModSecuritySEPTIC
010203040506070
Summary of 63 tests
Flagged attacks False positives False negatives
09/09/16
56
• Vulnerabili0esdetected/blockedinrealwebapps• ZeroCMS
– CVE-2014-4194– CVE-2014-4034– OSVDBID108025
• WebChess– 13vulnerabili0es
• measureit– 1storedXSS
121
SEPTIC:realopensourcesogware
122
Apache&ZendWebapplica0onsBenchLab
MySQL&SEPTIC
each1to5browsers
SEPTICcombina8ons
SQLIdetector Storedinj.det.
off off
on off
off on
on on
0.82%
2.24%
SEPTIC:performance
09/09/16
57
SEPTICwrap-up
• Pu�ngprotec0onintheDBMSallowsdetec0ng/blockingaqacksefficiently– Subtleaqacksrelatedtoseman0cmismatch
• (Mostly)transparentprotec0onforwebapplica0ons• Lowperformanceoverhead• Mayhaveprac7calimpactinwebappsecurity?
123
SHUTTLE:INTRUSIONRECOVERYINTHECLOUD
4
124
09/09/16
58
• Cloudprovidervsconsumers• Fundamentalideas
– Compu0ngasau0lity– Pay-as-you-go– Resourcepooling– Elas0city
• Large-scaledatacenters
125
Cloudcompu0ng(publiccloud)
• InfrastructureasaService(IaaS)– virtualmachines,storage(e.g.,AmazonEC2,AmazonS3)
• Pla�ormasaService(PaaS)– programmingandexecu0on(e.g.,GoogleAppEngine,Force.com,WindowsAzure)
• SogwareasaService(SaaS)– mostlywebapplica0ons(e.g.,Yahoo!Mail,GoogleDocs,Facebook,…)
126
Cloudcompu0ngservicemodels
09/09/16
59
Pla�ormasaService(PaaS)
• PaaSservicesallowrunningapplica0ons• Consumerdevelopsapplica0ontoruninthatenvironment,using– Supportedlanguages,e.g.,Java,Python,Go,PHP– Supportedcomponents,e.g.,SQL/NoSQLdatabases,loadbalancers
– Examples:GoogleAppEngine,WindowsAzureCloudServices,SalesforceForce.com,...
127
Mo0va0on
• IntrusionsinPaaSapplica0onsmayhappendueto– Sogwarevulnerabili0es(e.g.,Shellshock)– Configura0onandusagemistakes– Corruptedlegi0materequests(e.g.,SQLI)
• Aqackercanruncommandsintheapplica0onanddelete,add,andmodifydata
• Legi0mateuserscanthendocommandsoncorrupteddata...
128
09/09/16
60
Mo0va0on
129
Shuqle:outline
• ShuSle• Evalua0on
130
09/09/16
61
Shuqle
• RecoversthestateintegrityofPaaSapplica0onswhenthereareintrusions
• Isn’titwhatbackupsdo?– Backups:removebothbadandgoodopera0ons– Shuqle:removesbadopera0onsbutkeepsgoodones
131
Stateoftheart• Previousworks
– Opera0ngsystems:Taser,Retro– Databases:ITDB,Phoenix– Webapplica0ons:Goelet.al,Warp,Aire– Others(Email):UndoforOperators
• Limita0ons– Max.complexity:1appserver,1databaseinstance– Allrequiresetupandconfigura0on– Causeapplica0ondown0meduringrecovery
132
09/09/16
62
Shuqle
• Supportedbythecloud:availablewithoutconsumersetup
• Supportsapplica0onsdeployedinvariousinstances• Avoidsapplica0ondown0measnoneedtostoptheapplica0onduringrecovery
• Leverageelas0citytomakerecoveryfaster
133
PaaSapplica0onsarchitecture
134
User Request
Proxy
Load Balancer
Application Server
Application Server
Database Instance
Database Instance
09/09/16
63
Shuqlearchitecture
normalexecu0on:log,takesnapshots
135
User Request
Proxy
Load Balancer
Application Server
Application Server
Database Instance
Database Instance
Manager
Storage
DB Proxy DB Proxy
Interceptor Interceptor
Shuqleduringrecovery
136
User Request
Proxy
Load Balancer
Application Server
Application Server
Database Instance
Database Instance
Manager
Storage
DB Proxy DB Proxy
Replay Instances
Interceptor Interceptor
09/09/16
64
Recoveryprocess
1. Detect/iden0fythemaliciousopera0ons(notShuqle)
2. Startnewinstancesoftheapplica0onanddatabase3. Loadasnapshotprevioustointrusioninstant;create
anewbranch(applica0onstaysrunninginpreviousbranch)4. Replayrequestsinnewbranch5. Blockincomingrequests;replaylastrequests
6. Changetonewbranch;shutdownunnecessaryinstances
137
Recoverymodes• Full-Replay:Replayeveryopera0onagersnapshot• Selec0ve-Replay:Replayonlyaffected(tainted)opera0ons
• Serial:Replayalldependencygraphsequen0ally• Clustered:Replayindependentclusters
concurrently;allowedbythecloudelas0city
• Modessupported:
138
Full-Replay Selec0ve-Replay1Cluster(Serial) ✔ ✔Clustered ✔ ✗
09/09/16
65
Shuqle:outline
• Shuqle• Evalua8on
139
Evalua0onenvironment
• AmazonEC2,c3.xlargeinstances,GbEthernet
• WildFlyapplica0onserver(formelyJBoss)• Voldemortdatabase
• AskQ&Aapplica0on;datafromStackExchange
140
09/09/16
66
Accuracy• IntrusionScenarios:
– 1.Maliciousrequests– 2.Sogwarevulnerabili0es– 3.Externalchannels(e.g.SSHduetoShellshock)
141
#dataitemsaffected
#requeststainted
#requestsreplayed–Selec8veReplay
#requestsreplayed–FullReplay
1a 106 0 <605 38620 1b 58 14 <379 38620
1c 48 52 <253 38620 2a 4338 0 - 38620 2b 18286 1278 - 38620 3 >2000 - - 38620
Performanceoverhead
• innormalexecu0on
142
Overheadseemsacceptable;penaltymostlyduetosingleproxy
50%Reads50%Inserts 95%Reads5%Inserts ops/sec latency(ms) ops/sec latency(ms)
Shuqle 6325 5.78 15346 3.62 NoShuqle 7148 5.07 17821 3.01 overhead 13% 14% 16% 20%
09/09/16
67
Recovery0me
• for1millionrequests
143Clusteringgreatlyreducesrecovery0me
Restraindura0on
144
0
500
1000
1500
2000
2500
3000
00:00 03:00 06:00 09:00 12:00
Req
uest
s pe
r sec
ond
Time (minutes:seconds)
clustered replayconcurrent client
Beginrestrain
Restrain:46seconds
09/09/16
68
#objects Size(MB) ShuSleStorage:
Requests 1million 212 Response 1million 8767
Start/End0mestamps 2million 16 Keys 137million 488 Total 9648
Databasenode: VersionList 14593 1.4
Opera0onList 9million 277 Total 282
Manager Graph 1million 718
Storageoverhead• for1millionrequests
Storageisconsiderablebutmostlyduetostoringfullresponses$47permonthif20Millionrequestsperday(withoutresponses)
SHUTTLEwrap-up
• NewintrusionrecoveryserviceforPaaSofferings• Supportsapplica0onsrunninginvariousinstances,backedbydistributeddatabases
• Leveragestheresourceelas0cityandpay-per-usemodeltoreducetherecovery0meandcosts
• Providesintrusionrecoverywithoutservicedown0meusingabranchingmechanism
146
09/09/16
69
Outline
1. WAP:vulnerabilitydetec0onwithsta0canalysisusingtaintanalysis+classifier
2. DEKANT:vulnerabilitydetec0onwithsta0canalysisusingasequencemodel
3. SEPTIC:blockingaqacksintheDBMS
4. SHUTTLE:intrusionrecoveryinthecloud
147
PapersWAP:I.Medeiros,N.F.Neves,M.Correia.Automa8cDetec8onandCorrec8onofWebApplica8onVulnerabili8esusingDataMiningtoPredictFalsePosi8ves.WWW2014
WAP:___.Detec8ngandRemovingWebApplica8onVulnerabili8eswithSta8cAnalysisandDataMining.IEEETransac0onsonReliability2016
WAP:___.EquippingWAPwithWEAPONStoDetectVulnerabili8es.DSN2016
DEKANT:___.DEKANT:ASta8cAnalysisToolthatLearnstoDetectWebApplica8onVulnerabili8es.ISTTA2016
SEPTIC:I.Medeiros,M.Beatriz,N.NevesandM.Correia.HackingtheDBMStoPreventInjec8onASacks.CODASPY2016
SHUTTLE:D.Nascimento,M.Correia.ShuSle:IntrusionRecoveryforPaaS.ICDCS2015.
G.S.Veronese,M.Correia,A.N.Bessani,L.C.Lung,P.Verissimo.EfficientByzan8neFaultTolerance.IEEETransac0onsonComputers2013.
A.N.Bessani,M.Correia,B.Quaresma,F.André,P.Sousa,DepSky:DependableandSecureStorageinaCloud-of-Clouds.EuroSys2011andACMTransac0onsonStorage2013.
148
Top Related