Attributing Hacks
Yu-Xiang Wang
Joint work with Ziqi Liu, Alex Smola, Kyle Soska, Qinghua Zheng
1
Amazon AI
• Making machine learning and AI technologies accessible to all developers.
• We are hiring!
  – PhD internship positions all year round.
  – Full-time positions also available.
  – Contact me, Anima, Alex, or any other folks there.
2
Background
• There are 1,000,000,000 websites on the internet as of Sep 2014.
• About 1% of them are currently hacked or infected (source: securi.net)
• That's about 10 million malicious websites!
3
What can we do about it?
• Typically focus on detection and remediation.
  – Using small iFrames (Mavrommatis & Monrose, 08)
  – Norton Safe Web, McAfee SiteAdvisor.
• Forensics / Attribution of hacks
  – Much harder problems
  – What? How? When?
  – This paper: use statistics, ML tools!
4
Outline
1. Challenges
2. Put ourselves in the hackers' shoes
3. Our solution: survival analysis + trend filtering
4. Results on real data
5
Challenge 1: hidden hacking procedure
None of the three is known to us!
Websites get hacked…
6
Challenge 2: Unknown hacking time
No explicit labels for supervised learner.
7
Challenge 3: time varying risk
• Security risk is time sensitive.
  – Hackers keep discovering new exploits.
  – Websites keep patching bugs/vulnerabilities.
  – New versions of software are being installed.
Sharp changes triggered by events!
8
From a hacker's point of view
I found an exploit! What to do?
Money?
Fame?
Hack as many sites as possible as quickly as possible
Share the exploit with peers. Script kiddies will kick in.
What can we learn from this?
- Searchable string snippets are indicative features (Soska & Christin 2014), e.g., HTML tags <meta>WordPress 2.9.2</meta>
- Change points in hacking volume reveal hidden events/activities. (This paper!)
9
Outline
1. Challenges
2. Put ourselves in the hackers' shoes
3. Our solution: survival analysis + trend filtering
4. Results on real data
10
Recall the input and output
• Task: estimate the risk of getting hacked.
• Input:
  – Censored hack time.
  – Features of websites.
• This is survival analysis!
11
Survival analysis
What the heck is that?
It's our bread and butter.
- Dates back to the late 1600s, in studying smallpox and life expectancy.
- Still an active research area today.
Halley    Bernoulli
Modern formulation: (Kaplan & Meier, 1958; Cox, 1972)
- A density estimation problem for r.v. T: time of death.
12
13
Machine Learning    Statistics
• Regression
• Clustering
• Classification
• Bayesian inference
• Graphical models
• Online learning
Hacking as a survival problem
• A website got hacked ⇔ A patient had a heart attack.
• Vulnerable features ⇔ Genes assoc. with heart disease
• Relay checkpoint ⇔ A regular physical checkup.
• Blacklisted ⇔ Diagnosed with heart failure
• Inferential tasks of interest:
  – Prob(Heart attack before age 40 | DNA sequence x, healthy until 30)
  – Prob(hacked before May 1 | feature vector x, not hacked yet today)
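The second inferential task can be made concrete once a hazard function is available, since P(T ≤ t2 | T > t1) = 1 − exp(−∫ λ(s) ds) with the integral taken from t1 to t2. Below is a minimal sketch, assuming a piecewise-constant hazard on a hypothetical daily grid; the rates, grid, and function names are illustrative and not taken from the talk.

```python
import numpy as np

def cumulative_hazard(rates, edges, t):
    """Integral of a piecewise-constant hazard from edges[0] to t.

    rates[k] is the hazard on the interval [edges[k], edges[k+1]).
    """
    rates = np.asarray(rates, dtype=float)
    edges = np.asarray(edges, dtype=float)
    # Length of the overlap between each grid interval and [edges[0], t].
    overlap = np.clip(t - edges[:-1], 0.0, np.diff(edges))
    return float(np.sum(rates * overlap))

def prob_event_before(rates, edges, t_now, t_future):
    """P(T <= t_future | T > t_now) = 1 - exp(-(H(t_future) - H(t_now)))."""
    delta = (cumulative_hazard(rates, edges, t_future)
             - cumulative_hazard(rates, edges, t_now))
    return 1.0 - np.exp(-delta)

# Hypothetical hazard: low risk for 10 days, then a spike (say, an exploit is released).
edges = [0.0, 10.0, 20.0, 30.0]
rates = [0.01, 0.10, 0.02]
risk = prob_event_before(rates, edges, t_now=5.0, t_future=15.0)
```

Conditioning on "not hacked yet today" only changes the lower limit of the integral, which is why the hazard (rather than the density of T) is the convenient object to model.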
14
The Cox model
• A semi-parametric model.
• The "default" survival analysis model…
• Cited 44903 times (Google Scholar)!
Sir David Cox
Cox (1972). "Regression Models and Life-Tables". Journal of the Royal Statistical Society.
$$\lambda(x, t; w) = \lambda_0(t) \exp(\langle w, x \rangle)$$
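A small sketch of what the Cox formula implies: the ratio of hazards between two sites is exp(⟨w, x_a − x_b⟩) and does not depend on t. The baseline function, weights, and feature vectors below are hypothetical values chosen for illustration.

```python
import numpy as np

def cox_hazard(baseline_hazard, w, x, t):
    """Cox proportional hazard: lambda_0(t) * exp(<w, x>)."""
    return baseline_hazard(t) * np.exp(np.dot(w, x))

# Hypothetical baseline and weights, for illustration only.
def baseline(t):
    return 0.01 + 0.001 * t

w = np.array([0.7, -0.2])
site_a = np.array([1.0, 0.0])  # site with the first (risky) feature
site_b = np.array([0.0, 0.0])  # reference site with no features

# The hazard ratio between the two sites is the same at every time point.
ratios = [
    cox_hazard(baseline, w, site_a, t) / cox_hazard(baseline, w, site_b, t)
    for t in (0.0, 50.0, 100.0)
]
```

Because the ratio is constant in time, a feature cannot become risky only after an exploit appears, which is the limitation a time-varying model addresses.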
15
From Cox model to our model
• Cox model:
  – Low dimensional generalized linear model
$$\lambda(x, t; w) = \lambda_0(t) \exp(\langle w, x \rangle)$$
• Our model:
  – Time varying, additive hazard function.
  – High dimensional. w is a vector of functions in t.
  – Fully nonparametric for each feature.
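To illustrate "w is a vector of functions in t", here is a sketch under an assumed additive form, hazard(x, t) = w_0(t) + Σ_j x_j w_j(t), with every w_j a step function on a shared grid. The exact formula used in the talk is not recoverable from the slide text, so the form, the array layout, and all numbers below are assumptions.

```python
import numpy as np

def additive_hazard(W, edges, x, t):
    """Assumed form: w_0(t) + sum_j x_j * w_j(t), with step functions w_j.

    W has shape (p + 1, K): row 0 is the baseline w_0, row j is the curve
    of feature j; column k holds the values on [edges[k], edges[k+1]).
    """
    k = int(np.searchsorted(edges, t, side="right")) - 1
    k = min(max(k, 0), W.shape[1] - 1)  # clamp to the grid
    return float(W[0, k] + np.dot(x, W[1:, k]))

# Hypothetical example: one binary feature whose risk jumps on days 10-20.
edges = np.array([0.0, 10.0, 20.0, 30.0])
W = np.array([
    [0.01, 0.01, 0.01],  # baseline w_0(t)
    [0.00, 0.20, 0.05],  # w_1(t): sharp change when an exploit appears
])
x = np.array([1.0])
h_before = additive_hazard(W, edges, x, 5.0)
h_during = additive_hazard(W, edges, x, 12.0)
h_after = additive_hazard(W, edges, x, 25.0)
```

Each feature gets its own curve over time, so a jump in one row can be read as an event affecting only sites that carry that feature.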
16
Comparing to existing time-varying survival models
• Kernel, smoothing splines (Kooperberg '94; Sauerbrei '07)
  – Curse of dimensionality.
  – Require homogeneous smoothness.
• How are we doing it differently?
  – Additive in each dimension.
  – Use trend filtering (Kim et al., 2009; Tibshirani, 2013) to handle heterogeneous smoothness / sharp changes.
17
Locally adaptive nonparametric regression via trend filtering
• For functions with bounded variation:
  – Trend filtering: n^(-2/3) minimax rate
  – All linear smoothers: n^(-1/2) suboptimal rate
18
(Kim et al. 2009, SIAM Review), (Tibshirani, AoS 2013), (W., Smola and Tibshirani, ICML'14)
Fused lasso
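The fused lasso (zeroth-order trend filtering) fits a piecewise-constant signal by penalizing the total variation of the fit. A minimal sketch follows, using projected gradient on the dual problem as a simple stand-in solver; it is not the algorithm from the cited papers, and the signal and penalty weight are made up for illustration.

```python
import numpy as np

def tv_denoise(y, lam, n_iter=20000):
    """Solve min_x 0.5 * ||x - y||^2 + lam * sum_i |x[i+1] - x[i]|.

    Projected gradient on the dual: x = y - D^T u with |u_i| <= lam,
    where D is the first-difference operator.
    """
    y = np.asarray(y, dtype=float)
    u = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        x = y + np.diff(u, prepend=0.0, append=0.0)    # equals y - D^T u
        u = np.clip(u + 0.25 * np.diff(x), -lam, lam)  # step 1/4 <= 1/||D D^T||
    return y + np.diff(u, prepend=0.0, append=0.0)

def total_variation(x):
    return float(np.sum(np.abs(np.diff(x))))

# Hypothetical noisy step signal with one sharp change.
rng = np.random.default_rng(0)
truth = np.repeat([0.0, 2.0], 20)
y = truth + 0.2 * rng.normal(size=truth.size)
x_hat = tv_denoise(y, lam=1.0)
```

The penalty flattens the noise inside each segment while keeping the jump, which is the locally adaptive behavior that linear smoothers lack.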
Learning by regularized MLE
• Technical challenges:
  – This is optimizing over functions!
  – Interval censoring loss is non-convex
  – TV operator is non-smooth.
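$$\min_{(w_0, w_1, \ldots, w_p) \in \mathcal{F}^p} \; -\log\Big[ \prod_{i \in B} p(t_i \le \tau_i < T_i) \prod_{i \notin B} p(\tau_i > T) \Big] + \gamma \sum_{j=0}^{p} \mathrm{TV}(w_j)$$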
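The objective above can be sketched for the simplest case of a single piecewise-constant hazard curve with no features. The code uses p(lo ≤ τ < hi) = S(lo) − S(hi) and p(τ > T) = S(T) with S(t) = exp(−H(t)). The names `lo`, `hi`, `censor_time`, and `gamma` (the regularization weight) are this sketch's own, and the numbers are hypothetical.

```python
import numpy as np

def cum_hazard(w, edges, t):
    """H(t) for a step hazard w on the grid `edges`, vectorized over t."""
    t = np.asarray(t, dtype=float)[:, None]
    overlap = np.clip(t - edges[:-1], 0.0, np.diff(edges))
    return overlap @ w

def penalized_nll(w, edges, lo, hi, censor_time, gamma):
    """Negative log-likelihood of censored hack times plus gamma * TV(w)."""
    s_lo = np.exp(-cum_hazard(w, edges, lo))
    s_hi = np.exp(-cum_hazard(w, edges, hi))
    nll = -np.sum(np.log(s_lo - s_hi))                # hacked somewhere in [lo, hi)
    nll += np.sum(cum_hazard(w, edges, censor_time))  # -log S(T) for sites not hacked
    return float(nll + gamma * np.sum(np.abs(np.diff(w))))

# Hypothetical data: one site hacked between day 10 and 20, one clean until day 30.
edges = np.array([0.0, 10.0, 20.0, 30.0])
w = np.array([0.01, 0.10, 0.02])
value = penalized_nll(w, edges, lo=np.array([10.0]), hi=np.array([20.0]),
                      censor_time=np.array([30.0]), gamma=1.0)
```

The interval-censored term is the log of a difference of exponentials, which is where the non-convexity noted in the bullet list comes from.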
19
Our contributions
• Functions => Vectors in Euclidean space
  – The solution is parameterized by a small number of step functions. (A cute re-parameterization and use of Mammen & van de Geer, 1997)
• Handling non-smoothness via proximal SVRG.
  – Combine linear-time proximal map using dynamic programming (Johnson, 2013) with results in (Yu, 2014)
  – Convergence rate despite non-convexity (Reddi et al., 2016)
• Efficient implementation.
  – Represents only active sets.
  – Highly scalable, up to millions of features and data points.
20
Key step of the prox-SVRG algorithm
21
- Doubly robust estimation
- Control variate.
(Reddi et al., 2016; Allen-Zhu, 2016.)
Stationarity convergence rate:
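As a generic illustration of the prox-SVRG update (not the talk's implementation), the sketch below runs proximal SVRG on a toy problem. It swaps in a convex least-squares loss for the censored likelihood and an L1 soft-threshold for the TV proximal map; the data, step size, and function names are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (stand-in for the TV proximal map)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, b, w, lam):
    return float(0.5 * np.mean((A @ w - b) ** 2) + lam * np.sum(np.abs(w)))

def prox_svrg(A, b, lam, step, n_epochs=50, seed=0):
    """min_w (1/2n) * ||A w - b||^2 + lam * ||w||_1 via proximal SVRG."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w_snap = np.zeros(d)
    for _ in range(n_epochs):
        full_grad = A.T @ (A @ w_snap - b) / n  # full gradient at the snapshot
        w = w_snap.copy()
        for _ in range(n):
            i = rng.integers(n)
            g_w = A[i] * (A[i] @ w - b[i])
            g_snap = A[i] * (A[i] @ w_snap - b[i])
            v = g_w - g_snap + full_grad        # control-variate gradient estimate
            w = soft_threshold(w - step * v, step * lam)
        w_snap = w
    return w

# Hypothetical sparse regression problem.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
w_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
b = A @ w_true + 0.01 * rng.normal(size=100)
step = 0.1 / np.max(np.sum(A ** 2, axis=1))
w_hat = prox_svrg(A, b, lam=0.05, step=step)
```

The term `g_w - g_snap + full_grad` is an unbiased gradient estimate whose variance shrinks as `w` approaches the snapshot, which is the control-variate idea named on the slide.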
Proximal decomposition
• Johnson (2013)'s DP algorithm solves:
• But how to deal with the non-negativity?
  – Using Yaoliang Yu (2015)'s general characterization, we show that it decomposes!
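One reading of "it decomposes" is that the proximal map of TV plus a non-negativity constraint equals the TV proximal map followed by clipping at zero. The sketch below assumes that reading and checks it numerically on a small random vector. The TV proximal map is computed by dual projected gradient rather than the dynamic program cited on the slide, and all names and values are illustrative.

```python
import numpy as np

def tv_prox(y, lam, n_iter=20000):
    """argmin_x 0.5 * ||x - y||^2 + lam * TV(x), by dual projected gradient."""
    y = np.asarray(y, dtype=float)
    u = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        x = y + np.diff(u, prepend=0.0, append=0.0)
        u = np.clip(u + 0.25 * np.diff(x), -lam, lam)
    return y + np.diff(u, prepend=0.0, append=0.0)

def tv_nonneg_prox(y, lam):
    """Assumed decomposition: TV proximal map first, then projection onto x >= 0."""
    return np.maximum(tv_prox(y, lam), 0.0)

def objective(x, y, lam):
    return float(0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(np.diff(x))))

rng = np.random.default_rng(1)
y = rng.normal(size=12)
lam = 0.5
z = tv_nonneg_prox(y, lam)

# Compare against random non-negative perturbations of z.
worse_or_equal = all(
    objective(np.maximum(z + 0.1 * rng.normal(size=z.size), 0.0), y, lam)
    >= objective(z, y, lam) - 1e-6
    for _ in range(200)
)
```

The perturbation check is only a sanity test: it shows that no nearby feasible point does better, not a proof of the decomposition.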
22
TV penalty is not sensitive to sparsity.
• Does not distinguish between:
23
More sparsity (less bias) with TV-log
24
More sparsity (less bias) with TV-log
25
For piecewise constant functions, TV_log is strictly smaller! A novel variational definition.
How do we optimize it?
• Discrete TV_log = Discrete TV + Concave
• The concave part can be shown to be continuously differentiable.
• Combine the concave part with the loss functions. The same proximal SVRG!
26
Outline
1. Challenges
2. Put ourselves in the hackers' shoes
3. Our solution: survival analysis + trend filtering
4. Results on simulation and real data
27
Simulated example: recovery against the ground truth
28
TV penalty    TV-log penalty
Simulated example: recovery against the ground truth
29
Unregularized    Polsplines in R
Experiments on millions of sites and millions of features, from 2010-2014.
Training error    Test error
30
Case study: WordPress features
• Attackers tend to work in batches
Start of an attack by an elite hacker
Second campaign of script kiddies
31
Interpreting the monotone model: once a vulnerability is known, always at risk
32
Other applications?
• User dropout rate estimation
  – Check responses of groups of people to certain promotions.
• Alipay.com data from Ant Financial.
  – Active user if login for 7 days in a row.
  – Otherwise considered dropped out.
  – Data of 4 million users (1% of the Alipay users)
33
Results on the Alipay Data Set
34
March 10: Cash rebate promotion
April 18: Health insurance bonus promo
Conclusion
• Using 3x effective parameters, our model significantly outperforms the classic Cox model in prediction accuracy.
• Interpretability: allows us to attribute hacks to features and specific exploits.
• Scalability: faster and more locally adaptive than existing time-varying models.
35
Open problems
• Statistical properties:
  – Consistency and sample complexity of the model.
  – Implicit sparsity regularization? Sublinear dependence in d?
• Computational properties:
  – Nonconvex, but convergence to near global minima under statistical assumptions?
• Application:
  – Use higher order trend filtering on other survival analysis problems, e.g., marriage, divorce…
36
Thank you for your attention!
Ziqi Alex Kyle Qinghua
Code/demo available at: https://github.com/ziqilau/Experimental-HazardRegression
37