An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection...

34
An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) Jinoh Kim Computer Science Department Texas A&M University, Commerce, TX 75429, USA

Transcript of An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection...

Page 1: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

AnEmpiricalStudyonNetworkAnomalyDetectionusingConvolutionalNeuralNetworks

July2,2018(PresentedatSNTA2018)

Jinoh KimComputerScienceDepartment

TexasA&MUniversity,Commerce,TX75429,USA

Page 2: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

WhatareAnomalies?

• Anomalyisapatterninthedatathatdoesnotconformtotheexpectedbehaviour– Outliers,exceptions,peculiarities,etc.

• Realworldanomalies– Cyberintrusions– Creditcardfraud

Page 3: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

WhatisMachineLearning?• Abranchofartificialintelligence,concernedwiththe

designanddevelopmentofalgorithmsthatallowcomputerstoevolvebehaviorsbasedonempiricaldata.

ImagefromGoogleimage

Page 4: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

LearningSystemModel

InputData

LearningMethod

System

Testing

Training

Page 5: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Supervisedvs.unsupervised

• Supervisedlearning– Provisionoftheassociated“label”– Tryingto“predict”aspecificquantity– Example:neuralnetworks,decisiontrees,etc

• Unsupervisedlearning– Noassumptionoftheprovisionoflabels– Tryingto“understand”thedata– Example:clustering

Page 6: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Supervisedvs.unsupervised

Imagefromhttps://towardsdatascience.com/unsupervised-learning-with-python-173c51dc7f03

Page 7: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

CyberIntrusionDetection• Evolutionofcyber-attacks

– Growingcyber-attacksurface– Greaterscale&impacts– Increasinglydifficulttoidentify

• Incidents– WannaCry affected10,000

organizationsin150+countriesinMay2017

– DDoScausedTwitter,Spotify,etc toclosedowninOct.2016

– NewtypeofDOSattacksutilizeIoT devices(2016)

• NeedmoreIntelligenttoolstoidentifynetworkanomalies

https://www.workiva.com/blog/improve-your-cybersecurity-risk-managemet

Page 8: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

IntrusionDetectionApproaches

• Misusedetection– Basedonrules(or

signatures)– Accuratewithwell-

knowntextpatterns– Limiteddueto:

• Encryptionofpackets• Legalissueconcerningprivacy

• Anomalydetection– Basedonprofilingofnormal

and/oranomalousbehaviors– Statisticalinformationiswidely

used• e.g.,duration,numberofpackets/connection,etc

– Lessaccuratethansignature-baseddetection(ingeneral)

– Gainedgreaterattentionwithsignificantlyimprovingmachinelearningtechnologies

Page 9: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Shallowvs.DeepML• The"deep"in"deeplearning"referstothenumberoflayersthrough

whichthedataistransformed.(fromWikipedia)• Shallowlearningisoneotherthandeeplearning• Shallowlearningworkswellforrelativelysimplequestions(Fig.aandb)• However,shallowlearningcannotdealwithaquestionlikeFig.c• Deeplearningworksbettertodealwithmorecomplicateddata

Imagefromhttps://towardsdatascience.com/radial-basis-functions-neural-networks-all-we-need-to-know-9a88cc053448

(a) (b) (c)

Page 10: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Learning-basedAnomalyDetection

• Lotsofstudiesfornetworkanomalydetectionwithitsadvantages– However,usingconventionalshallowMLtechniquesislimitedin

accuracytoidentify(<83%accuracy)– E.g.,SVM,randomforest,Adaboosting,etc.

(IdentificationaccuracyagainstNSL-KDDdatasets)

Page 11: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Non-linearProperty• WhynotgoodenoughwithshallowMLtechniques?

– Networkdatasetsoftenhavenon-linearproperty• t-SNE(t-DistributedStochasticNeighborEmbedding)

– Dimensionreductiontoolwidelyemployed• t-SNEresultsshownormalandattackdatapointssharethesamefeature

space– Hardtoclassifywellusingashallowlearningmethodduetonon-linearity

=NormalTraffic=AttackTraffic

t-SNEresultagainstNSL-KDD t-SNEresultagainstKyoto-Honeypot

Deeplearningisknownasgoodatdealingwithhighdimensionaldatawiththenon-linearityproperty

Page 12: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Earlierandcurrentwork• Inourearlierwork:

– Wesetupasetofdeeplearningmodelsfornetworkanomalydetection• BasedonFullyConnectedNeuralNetwork(FCN),Variational AutoEncoder (VAE),and

Seq2Seqstructure(Seq2Seq)– Weevaluatedthedeeplearningmodelswithtwodatasetswithdifferent

characteristicswrt thepopulationofnormalandattackrecords• NSL-KDDisbalanced,whileKyotoUniversityHoneypotdataishighlyskewedwithalotof

attackrecords– Malaiya,Ritesh K.,etal."AnEmpiricalEvaluationofDeepLearningfor

NetworkAnomalyDetection." 2018InternationalConferenceonComputing,NetworkingandCommunications(ICNC).IEEE,2018.

• WearecurrentlyevaluatingCNNmodelsfornetworkanomalydetection– TestedCNNmodelstakingtheinputasone-dimensionalvector– Butobservednotthatinterestingresults– Currently,wearestudyingonmakingtwo-dimensionalinputdatatobetter

useCNNmodels

Page 13: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

In the earlier work …

Fully Connected Network (FCN) Variational Autoencoder (VAE)LSTM-Seq2Seq

Normal

Anomaly

Page 14: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

EvaluationResult:NSL-KDD

• Seq2Seqmodelsshowmoderatetrainingcomplexities(conductedonGooglecloud)

• Seq2Seqmodelsmuchoutperformtheotherswrt anomalydetectionperformance– Deep-LSTMyields99%ofaccuracytoidentifyforallcombinationsoftraining

andtestingdatasets• Seq2Seqmodelsalsoworkgreatagainstothernetworktraces(Kyoto

HoneypotdataandMAWILab data)

Page 15: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

What about CNN models?

CNN applicability to

network anomaly detection?

Page 16: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

In this presentation …• This is very initial work to see the following

questions:– Can we simply feed in a1D vector to CNN for network

anomaly detection?– Can detection accuracy be improved once the CNN

model gets deeper?

Shallow CNN Moderate CNN Deep CNN

Page 17: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

EvaluationDataSets• NSL-KDDdata– ModifiedversionofKDDCup 1999connectiondata– Consistsof41featureswiththelabels– http://www.unb.ca/cic/datasets/nsl.html

• Kyoto-Honeypotdata– Collectedfromhoneypots,andthusthevastmajorityofrecordsareforattacks(97%ofdatapoints)

– http://www.takakura.com/Kyoto_data/• MAWILab data– CollectedfromthebackbonenetworkinJapan,withthelabelsindicatingtrafficanomalies

– http://www.fukuda-lab.org/mawilab/

Page 18: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

NSL-KDDData• ModifiedversionofKDDCup 1999connectiondata• Consistsof41featureswiththeassociatelabel

– Extended122featuresusingone-hotencoding• Fourfilesinthedataset:2fortrainingand2fortesting

Page 19: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

• Collectedfromhoneypots,andthusthevastmajorityofrecordsareforattacks(97%ofdatapoints)

• Numberoffeaturesis24(14basicand10extendedfeatures)– Excludedsixminorfeaturesrelatedtothehostandportinformationin

ourexperiments.– Extendedto47featuresbyone-hotencodingincludinglabels

• Kyoto-Honeypotisseverelyimbalanced– F-measurecanmisleadtothefailureoftheinterpretationofresults

– MCC(Mattew CorrelationCoefficient)estimatesthequalityofbinaryclassification:-1.0(poor),0.0(random),and1.0(good)

Kyoto-HoneypotData

Page 20: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

• ConvertedthetrafficdatatoNetFlow formatdata• 5featuresoutof29featuresareextracted:– pro,packets,bytes,durat,andstatuspluslabel– Categoricalfeatures:one-hotencoded

MAWILab Data

Page 21: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

CNN framework

• Designed three CNN models– Shallow, moderate, and deep– Takes 1D vector as input– Evaluated with three different data sets (NSL-KDD, Kyoto

Honeypot, MAWILab)

Page 22: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Feature Maps

1. Pre-processed input data are given to convolutional 1D layer(s).• Shallow: 1 Conv1D Layer• Moderate: 2 Conv1D Layers• Deep: 3 Conv1D Layers

2. Filters• Shallow: 64 filters with size 3*1• Moderate: 64 and 128 filters with

size 3*1• Deep: 64, 128, and 256 filters with

size 3*13. Stride: 24. Padding: Same

Page 23: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Feature Maps• Pre-processed input data are given to

convolutional 1D layer(s).o Shallow: 1 Conv1D Layero Moderate: 2 Conv1D Layerso Deep: 3 Conv1D Layers

• Filter: size 3*1• Batch size

o Shallow: 64o Moderate: 64 and 128 o Deep: 64, 128, and 256

• Stride: 2• Padding: set to “same” to make

outputs of the convolutional layer same as inputs.

Page 24: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Feature Map Calculation

1. With ReLu non-linear activation

2. With tanh non-linear activation

– No significant difference observed in our evaluation

where hk denotes the kth feature map at a given layer, i is the index in the feature map, xi indicates the input, and wk denotes the weights.

Page 25: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Pooling Methods• Pooling Layer 1. Average Pooling

– 𝑓"#$ 𝑥 = ()∑ 𝑥+)+,(

2. Max Pooling– 𝑓-". 𝑥 = max(𝑥+)

where x denotes a vector of input datawith activation values and N indicatesa local pooling region.

• Max Pooling is widelyemployed

Input for next layer

Max Avg

Feature maps

Page 26: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Pooling Layers1. Shallow CNN

o 1 max pooling layer in the Conv1D layero Flatten: 3904 for NSL-KDD, 1408 for Kyoto, and 192 for

MAWILab2. Moderate CNN

o 1 max pooling layer in the 2nd Conv1D layero Flatten: 7808 for NSL-KDD, 2816 for Kyoto, and 384 for

MAWILab3. Deep CNN

o 2 max pooling layers in the 2nd and 3rd Conv1D layero Flatten: 7680 for NSL-KDD, 2816 for Kyoto, and 512 for

MAWILab

Page 27: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Fully Connected Network Layer

• Fully Connected Network 1. Hidden layers– Shallow: 1 hidden layer with 64

neurons– Moderate: 2 hidden layers with

64 and 32 neurons– Deep: 3 hidden layers with 64,

32, and 16 neurons2. Other parameters

– Batch normalization– Dropout = 0.5– Loss: binary cross entropy– Epochs: 10 and 20– Learning rate: 1e-3

Page 28: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Experimental Results: NSL-KDD

• Shows less than 80% F-measure score• Using more layers does not improve the performance

F-score

Page 29: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Comparison with other DL models

• CNN model does not work better than other DL models

F-score

Page 30: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Experimental Results: Kyoto Honeypot

• MCC measures the quality of binary classification: -1.0 (poor), 0 (random), 1.0 (good)• Training: January 1, 2014• Testing: December 1 (#1), 15 (#2), 31 (#3), 2015• Result sensitive to testing data sets

MCC

Page 31: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Experimental Results: MAWILab

• F-score ranges between 56% - 68%, which are not that satisfactory

Model Flow 001 / Flow 002 Flow 001 / Flow 003 Flow 001 / Flow 003Shallow CNN 65.44% 59.27% 61.33%

Moderate CNN 65.41% 67.66% 59.76%Deep CNN 65.45% 67.86% 56.76%

Flow # data points # normal # anomaly % anomalyFlow 001 407,807 291,488 116,319 28.5%Flow 002 472,654 327,413 145,241 30.7%Flow 003 423,984 261,426 162,558 38.3%Flow 004 425,994 300,759 125,235 29.4%

Page 32: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Summary

Page 33: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Future Direction

– Convertingisbasedonbinningandone-hotencoding

– Li,Zhipeng,etal."IntrusionDetectionUsingConvolutionalNeuralNetworksforRepresentationLearning." InternationalConferenceonNeuralInformationProcessing.Springer,Cham,2017

Page 34: An Empirical Study on Network Anomaly Detection …An Empirical Study on Network Anomaly Detection using Convolutional Neural Networks July 2, 2018 (Presented at SNTA 2018) JinohKim

Questions?

Contact:[email protected]