Research Article
Network Traffic Anomaly Detection Based on ML-ESN for Power Metering System
S. T. Zhang,1 X. B. Lin,1 L. Wu,1 Y. Q. Song,2 N. D. Liao,2 and Z. H. Liang3
1CSG Power Dispatching Control Center, Guangzhou 510663, China
2Changsha University of Science and Technology, Changsha 410114, China
3CSG Power Digital Grid Research Institute, Guangzhou 510623, China
Correspondence should be addressed to Y. Q. Song: acl158474361@stu.csust.edu.cn
Received 25 February 2020; Revised 20 June 2020; Accepted 2 July 2020; Published 14 August 2020

Academic Editor: Ivo Petras

Copyright © 2020 S. T. Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Due to the diversity and complexity of power network system platforms, some traditional network traffic detection methods work well for small sample datasets. However, network data detection for complex power metering system platforms suffers from low accuracy and a high false-positive rate. In this paper, through a combination of exploration and feedback, a solution for power network traffic anomaly detection based on a multilayer echo state network (ML-ESN) is proposed. This method first relies on the Pearson and Gini coefficient methods to calculate the statistical distribution and correlation of network flow characteristics and then uses the ML-ESN method to classify network attack anomalies. Because the ML-ESN method abandons the back-propagation mechanism, the model trains quickly while retaining strong nonlinear fitting ability. In order to verify the effectiveness of the proposed method, a simulation test was conducted on the UNSW_NB15 network security dataset. The test results show that the average accuracy of this method is more than 97%, which is significantly better than a single-layer echo state network, a shallow BP neural network, and some traditional machine learning methods.
1 Introduction
At present, the traditional power grid is developing towards the smart grid. Due to the need to improve efficiency, flexibility, and reliability and to reduce losses, power advanced metering infrastructure (AMI) has been rapidly developed. The system integrates smart meters, communication networks, data centers, and software systems [1].
Various application servers are mainly responsible for data collection, business application operations, and system maintenance. Large-scale measurement terminals need to access the measurement automation master station through a virtual private network. These communication processes are very vulnerable to attacks [2]. Therefore, the safe operation of power metering systems must rely on reliable communication networks and on security protection, detection, and analysis technologies.
Network security experts have discovered that AMI, as an important piece of infrastructure in modern society, is one of the important targets of cyberattacks launched by hostile organizations. The main attack methods against power networks include malicious attacks, denial-of-service attacks, data spoofing, and network monitoring [3].
Due to the key information exchanged in AMI communication, AMI needs reliable protection to prevent unauthorized access and malicious attacks. Therefore, when migrating to AMI facilities, we must use security mechanisms and intrusion detection technology [3].
At present, intrusion detection methods are divided into host-based intrusion detection and network-based intrusion detection. Host-based intrusion detection mainly addresses the collection, forensics, and auditing of host intrusion traces; network-based intrusion detection is mainly used to analyze network flows and judge network attack behavior in real time.
Among them, researchers at home and abroad have applied network intrusion detection technology to anomaly detection of AMI network flows and proposed a variety of anomaly detection and analysis models, such as deep neural networks [1], Markov models [4], density statistics [5], BP neural networks [6], attack-graph-based information fusion [7], and principal component analysis [8].

Hindawi, Mathematical Problems in Engineering, Volume 2020, Article ID 7219659, 21 pages. https://doi.org/10.1155/2020/7219659
In [5], Fathnia and Javidi tried to use OPTICS density-based technology to immediately diagnose AMI anomalies in customer information and intelligent data. In order to improve the efficiency of the method, they used LOF indexing technology. This technology actually detects factors related to data anomalies and judges abnormal behavior based on factor scores.
In [7], an AMI intrusion detection system (AMIDS) was proposed. This system uses information fusion technology to combine sensors and consumption data in smart meters to detect energy theft more accurately.
From most existing research, we find that there are many studies on the detection of AMI theft-behavior anomalies but few on the detection of AMI network traffic anomaly attacks.
At present, there are still some problems in the existing research on AMI network traffic anomaly detection; for example, the attack rules in [5] for the dynamic AMI network environment must be updated regularly. In [6], the authors established a BP neural network training model based on six kinds of simple AMI data and carried out simulation tests in Matlab software. However, that model is still a long way from real engineering applications.
In this study, unlike previous AMI anomaly detection work, we focus on abnormal situations in the AMI platform's network flow. By continuously extracting AMI network traffic characteristics, such as protocol type, average packet size, maximum and minimum packet size, packet duration, and other related flow-based characteristics, it is possible to accurately analyze the types of attack anomalies encountered by the AMI platform.
We make the following contributions to AMI network attack anomaly detection by using deep learning methods based on stream feature extraction and multilayer echo state networks:
(1) This paper proposes a deep learning method for AMI network attack anomaly detection based on multilayer echo state networks.
(2) By extracting the statistical features of the collected network data streams, the importance and correlation of the statistical features of the network streams are found, the data input of deep learning is optimized, the model training effect is improved, and the model training time is greatly reduced.
(3) In order to verify the validity and accuracy of the method, we tested it on the UNSW_NB15 public benchmark dataset. Experimental results show that our method can detect AMI anomalous attacks and is superior to other methods.
The rest of the paper is organized as follows. Section 2 describes related research. Section 3 introduces the AMI network architecture and security issues. Section 4 proposes security solutions. Section 5 focuses on the application of the ML-ESN classification method in AMI. Section 6 completes experiments and comparisons. Finally, this paper summarizes the research work and puts forward some problems that need to be solved in the future.
2 Related Work
The smart grid introduces computer and network communication technology and physical facilities to form a complex system, which is essentially a huge cyber-physical system (CPS) [9].
AMI is regarded as one of the most basic enabling technologies of the smart grid, but so far a large number of potential vulnerabilities have been discovered. For example, in the AMI network, smart meters, smart data collectors, and data processing centers have their own storage spaces, and these spaces store a lot of information. However, this information can easily be tampered with through the placement of malware.
In order to solve the security problems of the AMI system, the AMI Network Engineering Task Force (AMI-SEC) [10] pointed out that intrusion detection systems or related technologies can better monitor the AMI network and analyze and discover different attacks through technical means.
At present, domestic and foreign scholars have conducted many research studies on the security of AMI, mainly focusing on power fraud detection, malicious code detection, and network attack detection [11].
2.1. Power Fraud Detection. In terms of power spoofing, attacks are generally divided into two cases according to the consequences of the attack.

One is to inject wrong data into the power grid to launch an attack that causes the power grid to oscillate; once successful, it will have a large-scale impact on the grid and its users. The second is to enable attackers to obtain direct economic benefits by stealing electricity.
Jokar et al. [12] present a new energy theft detector based on consumption patterns. The detector uses the predictability of the normal and malicious consumption patterns of users and distribution transformer electricity meters to shortlist areas with a high probability of power theft, and it identifies suspicious customers by monitoring abnormal conditions in consumption patterns.
The authors in [13] proposed a semisupervised anomaly detection framework to solve the problem of energy theft in public utility databases that leads to changes in user usage patterns. Compared with other methods (such as one-class SVM and autoencoders), the framework can control the detection intensity through a detection index threshold.
2.2. Malicious Code Detection. Since the smart meter transmits power consumption information to the grid terminal, the detection of malicious code can be extended to the detection of executable code. Once it is confirmed that the data uploaded by the meter contain executable code, the data are likely to be malicious code [14].
In order to achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy re-encryption operations in the Paillier cryptosystem.
In [16], Euijin et al. used a disassembler and statistical analysis to handle AMI malicious code detection. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.
2.3. Network Attack Detection. At present, many statistics show that the main attack point for hackers against the AMI network is the smart meter (SM).
The SM is the key equipment constituting the AMI network. It realizes two-way communication between the power company and the user: on the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other, the company's electricity prices and instructions are presented to users.
The intrusion detection mechanism is an important part of current smart meter security protection. It monitors the events that occur in the smart meter and analyzes them. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding response mechanisms.
Current research on AMI network security threats mainly analyzes whether abnormalities exist from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that physical attacks against AMI, while often strong and the most effective, are easier to detect.
Existing AMI network attack detection methods mainly include simulation methods [17, 18], k-means clustering [1, 19, 20], data mining [21–23], prequential evaluation [24], and PCA [25].
In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of the puppet attack on AMI through simulation experiments.
In [18], the authors also used the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information communication infrastructure of the smart grid AMI network.
In order to analyze AMI network anomalies more accurately, some researchers start with AMI network traffic and use machine learning methods to determine whether various anomalous attacks have occurred on the network.
In [20], the authors use distributed intrusion detection and sliding-window methods to monitor the data flow of AMI components and propose a real-time, unsupervised AMI data flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to cluster network flows by type to discover abnormal attack types.
In [22], the authors use an artificial immune system to detect AMI network attacks. This method first uses the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.
With the increase in AMI traffic feature dimensions and noisy data, traffic anomaly detection methods based on traditional machine learning face the problems of low accuracy and poor robustness of traffic feature extraction, which reduces the performance of traffic attack detection to a certain extent. Therefore, anomaly detection methods based on deep learning have become a hot topic in current network security research [26–34].
Wang et al. [27] proposed a technique that uses deep learning to complete malicious traffic detection. This technology is mainly divided into two implementation steps: one is to use a CNN (convolutional neural network) to learn the spatial characteristics of traffic, and the other is to extract data packets from the data stream and learn spatiotemporal characteristics through a CNN and an RNN (recurrent neural network).
Currently, there are three main methods of anomaly detection based on deep learning:
(1) Anomaly detection based on the deep Boltzmann machine [28]: this kind of method can extract the essential features of high-dimensional traffic data through learning, so as to improve the detection rate of traffic attacks. However, this type of method has poor robustness in extracting features; when the input data contain noise, its attack detection performance degrades.
(2) Anomaly detection based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.
(3) Anomaly detection based on CNNs [27, 30]: the traffic features extracted by this type of method are highly robust and attack detection performance is high, but the network traffic needs to be converted into an image first, which increases the data processing burden, and the influence of network structure information on the accuracy of feature extraction is not fully considered.
In recent years, the achievements of deep learning in the field of time series prediction have also received more and more attention. When a task needs to process sequence information, an RNN offers advantages in time series processing compared with the single-input processing of fully connected neural networks and CNNs.
As a new type of RNN, the echo state network is composed of an input layer, a hidden layer (i.e., the reservoir), and an output layer. One of the advantages of the ESN is that the entire network only needs to train the output weight matrix Wout, so its training process is very fast. In addition, for the processing and prediction of one-dimensional time series, the ESN has a very good advantage [32].
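To illustrate why only the output weights need training, a minimal single-layer ESN can be sketched as follows. This is a sketch, not the paper's ML-ESN: the reservoir size, input scaling, spectral radius, and ridge parameter are illustrative choices, and the readout is fitted by ridge regression rather than back-propagation.

```python
import numpy as np

rng = np.random.default_rng(0)

class SimpleESN:
    """Minimal single-layer echo state network: only W_out is trained."""

    def __init__(self, n_in, n_res=100, spectral_radius=0.9, ridge=1e-6):
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Scale the reservoir so its spectral radius stays below 1,
        # which is the usual heuristic for the echo state property.
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W, self.ridge = W, ridge
        self.W_out = None

    def _states(self, U):
        # Run the input sequence through the fixed, untrained reservoir.
        x = np.zeros(self.W.shape[0])
        states = []
        for u in U:
            x = np.tanh(self.W_in @ u + self.W @ x)
            states.append(x.copy())
        return np.array(states)

    def fit(self, U, Y):
        X = self._states(U)
        # Ridge regression: W_out is the ONLY trained weight matrix.
        self.W_out = np.linalg.solve(
            X.T @ X + self.ridge * np.eye(X.shape[1]), X.T @ Y)
        return self

    def predict(self, U):
        return self._states(U) @ self.W_out
```

Because fitting reduces to one linear solve, training is fast regardless of sequence length, which is the advantage the text describes.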
Because the ESN has these advantages, it is being used by more and more researchers to analyze and predict network attacks [33, 34].
Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that it has faster convergence and better performance in IDS.
At present, some researchers have found through experiments that there are still some problems with the single-layer echo state network: (1) model training can only adjust the output weights; (2) the randomly generated reservoir has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, the application of echo state networks to AMI network traffic anomaly detection needs to be improved and optimized.
From the previous review, we can find that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.
The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and trying to solve the problem that a single-layer ESN cannot be directly applied to complex AMI network traffic detection.
3 AMI Network Architecture and Security Issues
The AMI network is generally divided into three network layers from the bottom up: the home area network (HAN), the neighborhood area network (NAN), and the wide area network (WAN). The hierarchical structure is shown in Figure 1.
In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters, or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, and aggregate data to the FAN's area data concentrator. Many NANs and FANs are interconnected to form a WAN through switches or routers to achieve communication with power company data and control centers.
The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only requirements for information and network security but also the security of physical equipment and human safety [35].
As Fadwa and Zeyar [20] mention, AMI faces various security threats, such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.
Researchers generally believe that AMI security detection, defense, and control rely mainly on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after the attack.
4 Proposed Security Solution
At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices have provided certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.
To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect the AMI network traffic in real time; second, have each traffic probe upload a unified, standard traffic file to the control center; and finally, analyze the network flow anomalies in real time to improve the security detection and identification capabilities of AMI.
As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information for all nodes.
Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In that case, the data can also be processed by device and by area: for example, collect data from different devices; before analysis, perform preprocessing such as data cleaning, data filtering, and data completion; then use the Pearson and Gini coefficient methods mentioned in this article to find important feature correlations. It is also feasible to then use the ML-ESN algorithm to classify network attack anomalies.
The main reasons for adopting standardized processing are as follows:

(1) It improves the centralized processing and visual display of network flow information.

(2) It partly eliminates and overcomes the problem of inadequate information collection due to single or too few devices.

(3) It uses multiple devices to collect information and standardizes the process to improve information fusion, thereby enhancing the accuracy and robustness of classification.
Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.
The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.
4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.
Metadata are composed of strings; each information element occupies a fixed position in the string, the elements are separated by "^", and the last element is also terminated by "^". In addition, an information element that is absent from a metadata record is handled as follows: if an information element defined below does not need to be filled in at its position, two "^" characters are adjacent at that point. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.
The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^...^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata
Figure 2: Traffic probe deployment diagram (showing the data processing center, electricity users, data concentrator, firewall, flow probes, and smart electric meters).
Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT-translated source IP
20  NatSrcPort        String    1024    NAT-translated source port
21  NatDestIP         IPstring  1024    NAT-translated destination IP
22  NatDestPort       String    1024    NAT-translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol
Figure 1: AMI network layered architecture [35] (smart meters, repeaters, smart home applications, energy storage, and PHEV/PEV connect through the HAN (ZigBee, Bluetooth, RFID, PLC) and the NAN (mesh network, Wi-Fi, WiMAX, PLC) with its data concentrator, and reach the utility centre over the WAN (fiber optic, WiMAX, satellite, BPL)).
version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol (TCP).
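As a minimal sketch, the caret-delimited record layout explained above can be parsed as follows. The field names are illustrative, only the first seven positions follow the worked example, and escape handling for literal carets is omitted since the escape string itself is not reproduced in the text.

```python
# Hypothetical field names for the first seven positions, following the
# worked example: version ^ metadata ID ^ source IP ^ destination IP ^
# source port ^ destination port ^ protocol ^ ...
FIELDS = ["version", "metadata_id", "src_ip", "dst_ip",
          "src_port", "dst_port", "protocol"]

def parse_probe_record(record: str) -> dict:
    """Split a caret-delimited probe stream record into named fields.

    Two adjacent carets mean the information element at that position
    was not filled in; such fields are returned as None.
    """
    parts = record.split("^")
    named = dict(zip(FIELDS, parts))
    return {k: (v if v != "" else None) for k, v in named.items()}

# One of the records shown in Figure 3
rec = ("6^69085d3e5432360300000000^10107110^1010721241^"
       "19341^22^6^40^1^40^1^1564365874^1564365874^^^^")
```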
4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of units, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data are analyzed directly, first, the importance of a single metadata unit cannot be directly reflected; second, the analysis data dimensions are particularly high, resulting in particularly long calculation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, and the three steps are as follows:
Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, segment the collected network flow metadata by time series or by section to obtain the statistical characteristics of each part of the network flow. Second, standardize the statistically obtained characteristic values according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, filter the standardized features further.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to classify and identify network traffic, it is necessary to obtain features that better reflect the traffic and statistical behavior characteristics of different network attack behaviors.
A network flow [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.
As is well known, some simple network characteristics can be extracted from the network, such as the source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses, and the source and destination ports, are also interchanged, which reflects the bidirectionality of the flow.
In order to reflect the characteristics of different types of network attacks more accurately, it is necessary to cluster network flows and collect their statistical characteristics.
First, network packets are aggregated into network flows, that is, each network flow is distinguished according to whether it is generated by a different network behavior. Second, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
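The packet-to-flow aggregation step, keyed on the five-tuple and treating both directions as one flow (as noted above), can be sketched as follows. The packet dictionary schema is an illustrative assumption, not the paper's probe format.

```python
from collections import defaultdict

def group_into_flows(packets):
    """Group packets into flows by a canonical (bidirectional) 5-tuple.

    Each packet is a dict with src_ip, dst_ip, src_port, dst_port, and
    proto keys (illustrative schema). The two directions of one
    connection share a single flow key, reflecting the bidirectionality
    of the flow described in the text.
    """
    flows = defaultdict(list)
    for p in packets:
        a = (p["src_ip"], p["src_port"])
        b = (p["dst_ip"], p["dst_port"])
        # Sort the endpoints so both directions map to the same key.
        key = (min(a, b), max(a, b), p["proto"])
        flows[key].append(p)
    return dict(flows)
```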
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:
Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward-to-backward packet ratio.

Statistical characteristics of time: duration, and the maximum, minimum, average, and standard deviation of forward and backward packet intervals.
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

Time interval: maximum, minimum, and average interval time, and standard deviation.

Packet size: maximum, minimum, and average size, and packet distribution.

Number of data packets: out and in.

Data amount: input byte amount and output byte amount.

Stream duration: duration from start to end.

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3: Part of the real probe stream data.
Some of the main features of network traffic extracted in this paper are shown in Table 2.
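The per-flow statistics listed above (packet sizes, inter-arrival times, packet counts, byte totals, and duration) can be computed with a short sketch like the following; the dictionary field names are illustrative, not the paper's exact feature names.

```python
import statistics

def flow_features(sizes, timestamps):
    """Compute per-flow statistical features from packet sizes (bytes)
    and packet timestamps (seconds), one entry per packet in order."""
    # Inter-arrival times between consecutive packets; a single-packet
    # flow gets a zero gap so the statistics below are still defined.
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])] or [0.0]
    return {
        "pkt_count": len(sizes),
        "bytes_total": sum(sizes),
        "pkt_max": max(sizes),
        "pkt_min": min(sizes),
        "pkt_mean": statistics.mean(sizes),
        "pkt_std": statistics.pstdev(sizes),
        "iat_max": max(gaps),
        "iat_min": min(gaps),
        "iat_mean": statistics.mean(gaps),
        "iat_std": statistics.pstdev(gaps),
        "duration": timestamps[-1] - timestamps[0],
    }
```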
4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced-data elimination.
At present, the main methods of feature standardization are [38] Z-score, min-max, and decimal scaling.
Because there may be some nondigital data in the standard protocol fields, such as protocol names, IP addresses, and TCP flags, these data cannot be standardized directly, so nondigital data need to be converted to digital form first. For example, the character string "dhcp" can be changed to the value "1".
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:
$$x' = \frac{x - \bar{x}}{\delta}, \tag{1}$$

where $\bar{x}$ is the mean value of the original data, $\delta$ is the standard deviation of the original data, $\mathrm{std} = ((x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2)/n$ ($n$ is the number of samples per feature), and $\delta = \sqrt{\mathrm{std}}$.
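Formula (1), together with the nondigital-to-digital conversion described above, can be sketched as follows for one feature column. The integer coding of string values (e.g. "dhcp" → 1) is an illustrative assumption; any consistent mapping would do.

```python
import statistics

def encode_and_standardize(column):
    """Z-score standardization of one feature column, per formula (1).

    Non-numeric values (e.g. protocol names such as "dhcp") are first
    mapped to integer codes in order of first appearance.
    """
    codes = {}
    numeric = []
    for v in column:
        if isinstance(v, str):
            codes.setdefault(v, len(codes) + 1)  # e.g. "dhcp" -> 1
            v = codes[v]
        numeric.append(float(v))
    mean = statistics.mean(numeric)
    # Population standard deviation, matching the std formula in the
    # text; fall back to 1.0 to avoid dividing by zero on constants.
    std = statistics.pstdev(numeric) or 1.0
    return [(v - mean) / std for v in numeric]
```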
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is the currently popular feature filtering approach: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects the important features that meet the requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.
Because the power probe flow contains many statistical characteristics and the main characteristics of different attack types differ, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain, in order to quickly locate the important characteristics of different attacks.
The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its computation is efficient and simple and is therefore well suited to real-time processing of large-scale power probe streams.
The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its value ρxy is computed by the following formula:
ρxy = cov(x, y)/(σx · σy) = E[(x − μx)(y − μy)]/(σx · σy),  (2)
where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. Estimating the covariance and standard deviations from the sample yields the sample Pearson correlation coefficient, usually denoted r:
r = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / √( Σᵢ₌₁ⁿ (xᵢ − x̄)² · Σᵢ₌₁ⁿ (yᵢ − ȳ)² ),  (3)
where n is the number of samples, xᵢ and yᵢ are the observations at point i for variables x and y, and x̄ and ȳ are the sample means of x and y. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework (probe traffic collection → feature extraction of statistical flow characteristics → feature standardization → characteristic filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).
Table 2: Some of the main features

ID   Name            Description
1    SrcIP           Source IP address
2    SrcPort         Source IP port
3    DestIP          Destination IP address
4    DestPort        Destination IP port
5    Proto           Network protocol, mainly TCP, UDP, and ICMP
6    total_fpackets  Total number of forward packets
7    total_fvolume   Total size of forward packets
8    total_bpackets  Total number of backward packets
9    total_bvolume   Total size of backward packets
...
29   max_biat        Maximum backward packet arrival interval
30   std_biat        Standard deviation of the backward packet time interval
31   duration        Network flow duration
Mathematical Problems in Engineering 7
the value is −1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly independent.
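The sample coefficient in formula (3) can be computed directly; the following is a pure-Python sketch on hypothetical column values:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient, as in formula (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x) *
                    sum((yi - my) ** 2 for yi in y))
    return num / den

# Perfectly linearly related columns give r = 1
print(pearson_r([1, 2, 3, 4], [10, 20, 30, 40]))  # -> 1.0
```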
Because the Pearson method can only detect linear relationships between features and classification categories, nonlinear relationships between the two would be lost. To further find the nonlinear relationships between the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.
In the classification problem, assuming that there are K classes and the probability that a sample point belongs to the i-th class is Pᵢ, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = Σᵢ₌₁ᴷ Pᵢ(1 − Pᵢ) = 1 − Σᵢ₌₁ᴷ Pᵢ².  (4)
Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 − Σₖ₌₁ᴷ (|Cₖ|/|D|)²,  (5)

where Cₖ is the subset of samples in D belonging to the k-th class and K is the number of classes.
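Formulas (4) and (5) can be sketched by treating the per-class sample counts |Cₖ| as input (illustrative values, not taken from the dataset):

```python
def gini_index(counts):
    """Gini index of a class distribution, formula (5):
    1 - sum_k (|C_k| / |D|)^2, given per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# A pure subset (one class only) has Gini 0; a uniform two-class split has 0.5
print(gini_index([10, 0]))  # -> 0.0
print(gini_index([5, 5]))   # -> 0.5
```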
5 ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in many fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction. The basic ESN network model is shown in Figure 5.
In this model, the network has three layers: input layer, hidden layer (the reservoir), and output layer. At time t, assuming the input layer has K nodes, the reservoir N nodes, and the output layer L nodes, then

u(t) = [u₁(t), u₂(t), …, u_K(t)]ᵀ,
x(t) = [x₁(t), x₂(t), …, x_N(t)]ᵀ,
y(t) = [y₁(t), y₂(t), …, y_L(t)]ᵀ.  (6)
W_in (N × K) represents the connection weight from the input layer to the reservoir; W (N × N) represents the connection weight from x(t − 1) to x(t); W_out (L × (K + N + L)) represents the connection weight from the reservoir to the output layer; and W_back (N × L) represents the connection weight from y(t − 1) to x(t), which is optional.
When u(t) is input, the updated state equation of the reservoir is

x(t + 1) = f(W_in · u(t + 1) + W · x(t) + W_back · y(t)),  (7)

where f is the selected activation function of the reservoir and f′ is the activation function of the output layer. The output state equation of the ESN is then

y(t + 1) = f′(W_out · [u(t + 1); x(t + 1)]).  (8)
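A minimal NumPy sketch of the state update in formula (7), with illustrative dimensions; the optional feedback term W_back · y(t) is omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 5, 100                         # input and reservoir sizes (illustrative)

W_in = rng.uniform(-0.5, 0.5, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # keep spectral radius < 1

def reservoir_step(x, u):
    """One reservoir state update as in formula (7); the optional
    feedback term W_back * y(t) is omitted in this sketch."""
    return np.tanh(W_in @ u + W @ x)

x = reservoir_step(np.zeros(N), rng.standard_normal(K))
print(x.shape)  # (100,)
```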
Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.
To overcome these problems of the ESN, several improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while a multilayer ESN has more than one. The updated state equation of the ML-ESN is given by [41]:
x₁(n + 1) = f(W_in · u(n + 1) + W₁ · x₁(n)),
x_k(n + 1) = f(W_inter(k−1) · x_{k−1}(n + 1) + W_k · x_k(n)),
⋮
x_M(n + 1) = f(W_inter(M−1) · x_{M−1}(n + 1) + W_M · x_M(n)).  (9)
The ML-ESN output is then calculated from the states of formula (9) as

y(n + 1) = f_out(W_out · x_M(n + 1)).  (10)
5.1. ML-ESN Classification Algorithm. In general, when the AMI system operates normally and securely, the statistical entropy of the network traffic characteristics within a period of time does not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations may occur.
Figure 5: ESN basic model (input layer u(t) → reservoir state x(t) via W_in, with internal weights W, optional feedback W_back, and output y(t) via W_out).
As can be seen from Figure 5, the ESN is an improved model for training RNNs. The steps are to use a large-scale, randomly and sparsely connected network of neurons (the reservoir) as the processing medium for the data, map the input feature value set from the low-dimensional input space to the high-dimensional state space, and finally train the network on that high-dimensional state space using linear regression or similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if it is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.
In contrast, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.
This paper therefore selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6 Simulation Test and Result Analysis
To verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy rate, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of each feature value is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly. Before data processing, the data are standardized; some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. To evaluate the performance of this method objectively, this article mainly uses three indicators to evaluate the experimental results: accuracy (correct rate), FPR (false-positive rate), and F-score (balance score). Their calculation formulas are as follows:
accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(FN + TP),
precision = TP/(TP + FP),
recall = TP/(FN + TP),
F-score = 2 · precision · recall/(precision + recall).  (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer u(t) → reservoirs 1 … M, with input weights W_in, internal weights W₁ … W_M, and inter-reservoir weights W_inter → output layer y(t) via W_out).
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
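These counts map to the indicators of formula (11) as in the following pure-Python sketch; note that it uses the conventional FPR denominator FP + TN, i.e., the share of normal traffic raised as false alarms:

```python
def metrics(tp, tn, fp, fn):
    """Evaluation indicators in the spirit of formula (11)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)            # share of normal traffic flagged as abnormal
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)         # also the TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

# 90 attacks caught, 95 normal flows passed, 5 false alarms, 10 misses
print(metrics(90, 95, 5, 10))  # -> (0.925, 0.05, ~0.923)
```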
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. In the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset

ID   Type            Number of packets   Size (MB)
1    Normal          56000               3.63
2    Analysis        1560                0.108
3    Backdoors       1746                0.36
4    DoS             12264               2.42
5    Exploits        33393               8.31
6    Fuzzers         18184               4.62
7    Generic         40000               6.69
8    Reconnaissance  10491               2.42
9    Shellcode       1133                0.28
10   Worms           130                 0.044
Input:
    D1: training dataset; D2: test dataset; U(t): input feature value set;
    N: number of neurons per reservoir; Ri: number of reservoirs;
    α: interconnection weight spectral radius.
Output:
    Training and testing classification results.
Steps:
(1) Initialize the ML-ESN parameters and determine the corresponding number of input and output units according to the dataset:
    (i) set the training data length trainLen;
    (ii) set the test data length testLen;
    (iii) set the number of reservoirs Ri;
    (iv) set the number of neurons per reservoir N;
    (v) set the reservoir update rate α;
    (vi) set xᵢ(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal reservoir connection weights Wᵢ (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
    (i) randomly initialize the values of W_in, Wᵢ, and W_inter;
    (ii) through statistical normalization and spectral radius calculation, rescale W_inter and Wᵢ to meet the sparsity requirements: Wᵢ = α · Wᵢ/|λ_in| and W_inter = α · W_inter/|λ_inter|, where λ_in and λ_inter are the spectral radii of the Wᵢ and W_inter matrices, respectively.
(3) Feed the training samples into the initialized ML-ESN, collect the state variables using equation (9), and pass them through the activation function of the reservoir processing units to obtain the final state variables:
    (i) for t from 1 to T: compute x₁(t) according to equation (7); for i from 2 to M, compute xᵢ(t) according to equations (7) and (9); collect the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the reservoir-to-output weight matrix W_out to obtain the trained ML-ESN network structure: W_out = D·Hᵀ(H·Hᵀ + βI)⁻¹, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the output f_out value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
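Steps (1)-(5) of Algorithm 1 can be sketched compactly in NumPy. This is a minimal illustration on synthetic data with made-up sizes, using a plain linear readout in place of the SoftMax output step, so it is a sketch under those assumptions rather than the authors' exact implementation:

```python
import numpy as np

rng = np.random.default_rng(50)                 # "random seed 50" as in Table 4

def scaled(shape, alpha):
    """Random weights rescaled to spectral radius alpha (step 2)."""
    w = rng.uniform(-0.5, 0.5, shape)
    return alpha * w / np.max(np.abs(np.linalg.eigvals(w)))

def train_ml_esn(U, D, n_res=2, N=50, alpha=0.9, beta=1e-6):
    """Steps 1-4 of Algorithm 1: run stacked reservoirs over the training
    inputs U (T x K), collect states, and solve the readout by ridge
    regression against the one-hot targets D (T x L)."""
    T, K = U.shape
    W_in = rng.uniform(-0.5, 0.5, (N, K))
    W = [scaled((N, N), alpha) for _ in range(n_res)]
    W_inter = [scaled((N, N), alpha) for _ in range(n_res - 1)]
    x = [np.zeros(N) for _ in range(n_res)]
    H = np.zeros((N + K, T))                    # state collection matrix
    for t in range(T):
        x[0] = np.tanh(W_in @ U[t] + W[0] @ x[0])              # formula (7)
        for k in range(1, n_res):                              # formula (9)
            x[k] = np.tanh(W_inter[k - 1] @ x[k - 1] + W[k] @ x[k])
        H[:, t] = np.concatenate([x[-1], U[t]])
    # Step 4: W_out = D H^T (H H^T + beta I)^{-1}
    W_out = D.T @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(N + K))
    return W_out, H

U = rng.standard_normal((200, 5))               # 5 selected features
D = np.eye(10)[rng.integers(0, 10, 200)]        # one-hot labels, 10 classes
W_out, H = train_ml_esn(U, D)
print(W_out.shape)  # (10, 55)
```

Prediction for a new sample would then be `W_out @ np.concatenate([x_M, u])`, followed by a SoftMax over the 10 outputs as in step (5).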
Step 2. Perform data preprocessing on the AMI metadata or on the UNSW_NB15 CSV-format data, mainly data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. Standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalization most of the attack-type data are concentrated between 0.4 and 0.6, Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indices for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, while the correlation between spkts and ct_srv_src (number of connections with the same service and source address among the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset and to retain the distribution of the original data as much as possible, the initial Pearson correlation threshold is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
Accordingly, Figure 9 shows that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
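The threshold rule described above can be sketched as a greedy filter over feature columns (hypothetical column values; the 0.5 threshold is from the text):

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def filter_correlated(columns, threshold=0.5):
    """Greedy filter: keep a feature only if its |r| with every
    already-kept feature stays below the threshold."""
    kept = []
    for name, col in columns.items():
        if all(abs(pearson_r(col, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

cols = {
    "spkts": [1, 2, 3, 4, 5],
    "sloss": [2, 4, 6, 8, 10],   # perfectly correlated with spkts -> dropped
    "noise": [5, 1, 4, 2, 3],
}
print(filter_correlated(cols))  # -> ['spkts', 'noise']
```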
To further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, equal 1. From the principle of Gini coefficients, the smaller the Gini value of a feature, the lower its impurity in the dataset and the better its training effect.
Based on the Pearson and Gini coefficient results for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in ms), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set initially in the experiment; the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined according to the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, dbytes).
Figure 8: Normalized data distribution across the classes (Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, Normal).
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, while the detection accuracy does not increase monotonically but first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the internal states can be combined linearly to produce the required output. To increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which helps improve training efficiency, and because tanh detects significant differences in characteristics more effectively. In addition, the neuron fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because its output value lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.
The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.
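As a quick consistency check, the per-class counts from Table 3 can be summed; they total slightly below the 175,320 packets reported here, and give a normal-to-attack ratio close to the stated 0.46 : 1:

```python
# Per-class packet counts from Table 3 (training set)
counts = {
    "Normal": 56000, "Analysis": 1560, "Backdoors": 1746, "DoS": 12264,
    "Exploits": 33393, "Fuzzers": 18184, "Generic": 40000,
    "Reconnaissance": 10491, "Shellcode": 1133, "Worms": 130,
}
total = sum(counts.values())
attacks = total - counts["Normal"]
print(total)                                  # -> 174901
print(round(counts["Normal"] / attacks, 2))   # -> 0.47
```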
Figure 9: The Pearson coefficient values for UNSW_NB15 (heat map over spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experiments were run on Windows 10 Home 64-bit, with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment in the Simulation Data. To fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on small or large data samples, the classification effect without filtering is lower than with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, on the 160000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; with only the Pearson index, the accuracy is 0.95; with only the Gini index, 0.97; and with the combination of the Pearson and Gini indices, the accuracy reaches 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indices and then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (heat map over service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10⁻⁶
algorithm to learn, and then uses the test data to verify the training model, obtaining the test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks after filtering and optimizing the network traffic features with the combination of Pearson and Gini coefficients.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment in the Simulation Data. To fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy under the same depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases; for example, with 1000 neurons, a reservoir depth of 5 takes 21.1 ms, while a depth of 3 takes only 11.6 ms. In addition, at the same reservoir depth, the more neurons the model has, the more training time it consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model first gradually increases; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth increases to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, some overfitting occurs, which reduces the accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree takes the least, only 0.0013 seconds, and BP takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree reaches only 0.77. These results show that, after self-learning, the proposed method detects different attack types well.
Step 5. To fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods (None, Pearson, Gini, Pearson + Gini) across training data sizes from 20000 to 160000 packets.
performance of a variety of different classifiers on the UNSW_NB15 dataset.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of features A and B are concentrated around 50, and for feature A in particular the values hardly exceed 60. In addition, a small part of feature B's values are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment focuses on five test datasets of different scales, namely 5000, 20000, 60000, 120000, and 160000 samples, and each dataset contains the 9 different attack types. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high; for example, on the 20000-sample data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially GaussianNB, whose accuracy falls below 50%, while the other algorithms are very close to 80%.
In contrast, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly; for example, on the 120000-sample dataset the accuracy reaches 96.75%, and on the 160000-sample dataset it reaches 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point; when the number of samples is small, the algorithm may overfit, and the overall performance is not the best.
To further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment uses ROC (receiver operating characteristic) graphs to evaluate performance. A ROC curve plots the FPR (false-positive rate) on the horizontal axis against the TPR
Figure 12: Classification results of the ML-ESN method (per-attack-type accuracy, F1-score, and FPR; accuracy and F1-score range from 0.94 to 1.0, FPR from 0.01 to 0.02).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings; (c) accuracy and time of the BP, ESN, DecisionTree, and ML-ESN methods.
200
175
150
125
100
75
50
25
000 20000 40000 60000 80000 120000100000 140000 160000
Number of packages
Feature AFeature B
Feat
ure d
istrib
utio
n
Figure 14 Distribution map of the first two statistical characteristics
16 Mathematical Problems in Engineering
(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
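The AUC need not be read off a plot; it can be computed directly from classifier scores. A minimal sketch using the Mann-Whitney rank formulation (ignoring tied scores, for brevity):

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney rank formulation: the probability that a
    randomly chosen positive is scored above a randomly chosen negative.
    Tied scores are not rank-averaged here, for brevity."""
    y_true = np.asarray(y_true)
    order = np.argsort(y_score)
    ranks = np.empty(len(order))
    ranks[order] = np.arange(1, len(order) + 1)   # rank 1 = lowest score
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    rank_sum = ranks[y_true == 1].sum()
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

For example, `auc_score([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])` is 1.0, a perfect ranking.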
The ROC graphs of the four algorithms obtained in the experiments are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (20,000 to 160,000 records).
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).
other attack types are 99%. By contrast, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate approaches 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies open problems in AMI network security, and reviews the contributions of existing researchers to AMI network security.
Second, to address the low accuracy and high false-positive rate of existing methods on large volumes of network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing an AMI network streaming metadata standard; (2) combining the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's strong self-learning, storage, and memory capabilities to classify unknown and abnormal AMI network attacks accurately and quickly; and (4) testing and verifying the proposed method on a simulation dataset. The test results show that this method has clear advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric-power information networks, it is difficult to form a centralized, unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.
Therefore, the authors suggest that, before analyzing the network flows, it is best to perform multidevice fusion preprocessing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale verification of the proposed method on real AMI network flows, to identify the method's limitations in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to address abnormal attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, to greatly reduce learning and classification time; and (4) study of AMI-specific network protocols, to establish an optimized ML-ESN traffic deep learning model better matched to practical AMI applications, so that it can be applied in actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional-degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.
[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96-111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350-355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854-3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946-959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1-6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1-17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
anomaly detection and analysis models, such as deep neural networks [1], Markov models [4], density statistics [5], BP neural networks [6], attack-graph-based information fusion [7], and principal component analysis [8].
In [5], Fathnia and Javidi tried to use OPTICS density-based technology to immediately diagnose AMI anomalies in customer information and smart-meter data. To improve the efficiency of the method, they used LOF indexing technology. This technology detects factors related to data anomalies and judges abnormal behavior based on factor scores.
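The density-based idea behind such detectors can be illustrated with a much simpler stand-in: score each point by its mean distance to its k nearest neighbours, so that points in sparse regions score high. This is a sketch of the general principle only, not the OPTICS/LOF algorithms used in [5]:

```python
import numpy as np

def knn_distance_score(X, k=3):
    """Mean distance to the k nearest neighbours: points lying in sparse
    (low-density) regions receive large scores. A simplified stand-in for
    OPTICS/LOF-style density scoring, not the full LOF computation."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbour
    return np.sort(d, axis=1)[:, :k].mean(axis=1)
```

A tight cluster of meter readings plus one far-away reading would give the outlier by far the largest score; thresholding these scores yields a basic anomaly flag.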
In [7], an AMI intrusion detection system (AMIDS) was proposed. This system uses information fusion technology to combine sensors and consumption data in smart meters to more accurately detect energy theft.
From most existing research, we find that there are many studies on detecting anomalous AMI electricity-theft behavior, but far fewer on detecting anomalous AMI network traffic attacks.
At present, there are still some problems in existing research on AMI network traffic anomaly detection. For example, the attack rules in [5] for the dynamic AMI network environment must be updated regularly. In [6], the authors established a BP neural network training model based on 6 kinds of simple AMI data and carried out simulation tests in Matlab software; however, that model is still a long way from real engineering application.
In this study, unlike previous AMI anomaly detection work, we focus on anomalies in AMI platform network flows. By continuously extracting AMI network traffic characteristics, such as protocol type, average packet size, maximum and minimum packet size, packet duration, and other related flow-based characteristics, it is possible to accurately analyze the type of attack anomalies encountered by the AMI platform.
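The flow-based characteristics listed above can be computed from per-flow packet records. A minimal sketch, with illustrative field names rather than the paper's exact feature schema:

```python
# Illustrative per-flow feature extraction; the field names below are
# assumptions for this sketch, not the paper's exact feature schema.
def flow_features(packets):
    """packets: list of (timestamp_s, size_bytes, protocol) for one flow."""
    times = [t for t, _, _ in packets]
    sizes = [s for _, s, _ in packets]
    return {
        "protocol":   packets[0][2],
        "pkt_count":  len(packets),
        "avg_size":   sum(sizes) / len(sizes),
        "max_size":   max(sizes),
        "min_size":   min(sizes),
        "duration_s": max(times) - min(times),
    }
```

Each flow then becomes one fixed-length feature vector suitable as classifier input.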
We make the following contributions to AMI network attack anomaly detection, using deep learning methods based on stream feature extraction and multilayer echo state networks:
(1) This paper proposes a deep learning method for AMI network attack anomaly detection based on multilayer echo state networks.
(2) By extracting the statistical features of the collected network data streams, the importance and correlation of these features are identified, the data input for deep learning is optimized, the model training effect is improved, and the model training time is greatly reduced.
(3) To verify the validity and accuracy of the method, we tested it on the UNSW_NB15 public benchmark dataset. Experimental results show that our method can detect AMI anomalous attacks and is superior to other methods.
The rest of the paper is organized as follows. Section 2 describes related research; Section 3 introduces the AMI network architecture and its security issues; Section 4 proposes security solutions; Section 5 focuses on the application of the ML-ESN classification method in AMI; Section 6 presents experiments and comparisons; finally, the paper summarizes the research work and identifies problems to be solved in the future.
2. Related Work
The smart grid introduces computer and network communication technology into physical facilities, forming a complex system that is essentially a huge cyber-physical system (CPS) [9].
AMI is regarded as one of the most basic enabling technologies of the smart grid, but a large number of potential vulnerabilities have already been discovered. For example, in the AMI network, smart meters, smart data collectors, and data processing centers each have their own storage spaces, and these spaces hold a lot of information. However, this information can easily be tampered with through planted malware.
To address the security problems of the AMI system, the AMI Network Engineering Task Force (AMI-SEC) [10] pointed out that intrusion detection systems or related technologies can better monitor the AMI network and can analyze and discover different attacks through technical means.
At present, domestic and foreign scholars have conducted many studies on AMI security, mainly focusing on power fraud detection, malicious code detection, and network attack detection [11].
2.1. Power Fraud Detection. Power spoofing attacks are generally divided into two cases according to the consequences of the attack.
One is to inject wrong data into the power grid to launch an attack that causes the grid to oscillate; once successful, it has a large-scale impact on the grid and its users. The second is to enable attackers to obtain direct economic benefits by stealing electricity.
Jokar et al. in [12] present a new energy theft detector based on consumption patterns. The detector uses the predictability of users' normal and malicious consumption patterns, together with distribution transformer electricity meters, to shortlist areas with a high probability of power theft, and it identifies suspicious customers by monitoring abnormal conditions in consumption patterns.
The authors in [13] proposed a semisupervised anomaly detection framework to address energy theft that alters user usage patterns in utility databases. Compared with other methods (such as one-class SVM and autoencoders), the framework can control the detection intensity through a detection index threshold.
2.2. Malicious Code Detection. Since the smart meter transmits power consumption information to the grid terminal, the detection of malicious code can be extended to the detection of executable code. Once it is confirmed that the data uploaded by a meter contain executable code, the data are likely to be malicious code [14].
To achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy reencryption operations in the Paillier cryptosystem.
In [16], Euijin et al. used a disassembler and statistical analysis to detect AMI malicious code. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.
2.3. Network Attack Detection. Extensive statistics show that the main attack point for hackers against the AMI network is the smart meter (SM).
The SM is the key equipment of the AMI network. It realizes two-way communication between the power company and the user: on the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other hand, the company's electricity prices and instructions are presented to users.
The intrusion detection mechanism is an important part of current smart meter security protection. It monitors and analyzes the events that occur in the smart meter. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding responses.
Current research on AMI network security threats mainly analyzes whether there are abnormalities from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that physical attacks against AMI are often forceful and highly effective, but they are also easier to detect.
Existing AMI network attack detection methods mainly include simulation methods [17, 18], k-means clustering [1, 19, 20], data mining [21-23], prequential evaluation [24], and PCA [25].
In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of the puppet attack on AMI through simulation experiments.
In [18], the authors likewise used the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information communication infrastructure of the smart grid AMI network.
To analyze AMI network anomalies more accurately, some researchers start with AMI network traffic and use machine learning methods to determine whether various anomalous attacks have occurred on the network.
In [20], the authors use distributed intrusion detection and sliding-window methods to monitor the data flow of AMI components and propose a real-time unsupervised AMI data-flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to cluster network flows by type in order to discover abnormal attack types.
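The clustering step of such a system can be sketched with scikit-learn's MiniBatchKMeans on synthetic flow-feature vectors; the data, cluster count, and batch size here are assumptions for illustration, not the settings of [20]:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
# Synthetic "flow feature" vectors: a dense normal cluster plus a small,
# clearly offset cluster standing in for an anomalous traffic pattern.
normal = rng.normal(0.0, 1.0, size=(2000, 8))
attack = rng.normal(6.0, 1.0, size=(100, 8))
X = np.vstack([normal, attack])

km = MiniBatchKMeans(n_clusters=2, batch_size=256, n_init=10, random_state=0)
labels = km.fit_predict(X)
minority = int(np.bincount(labels).argmin())  # the small (suspicious) cluster
```

Because mini-batch k-means fits on small random batches, it scales to streaming flow data far better than full-batch k-means, which is the appeal for real-time AMI monitoring.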
In [22], the authors use an artificial immune system to detect AMI network attacks. This method first uses the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.
With the increase in AMI traffic feature dimensions and noisy data, traffic anomaly detection methods based on traditional machine learning face low accuracy and poor robustness of traffic feature extraction, which reduces traffic attack detection performance to a certain extent. Therefore, anomaly detection methods based on deep learning have become a hot topic in current network security research [26-34].
Wang et al. [27] proposed a technique that uses deep learning for malicious traffic detection. The technique has two implementation steps: one is to use a CNN (convolutional neural network) to learn the spatial characteristics of traffic, and the other is to extract data packets from the data stream and learn spatiotemporal characteristics through a CNN and an RNN (recurrent neural network).
Currently, there are three main methods of anomaly detection based on deep learning:
(1) Anomaly detection based on the deep Boltzmann machine [28]: this kind of method can extract essential features by learning from high-dimensional traffic data, so as to improve the detection rate of traffic attacks. However, this type of method is not robust in extracting features; when the input data contain noise, its attack detection performance degrades.
(2) Anomaly detection based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.
(3) Anomaly detection based on CNNs [27, 30]: the traffic features extracted by this type of method are robust, and attack detection performance is high, but the network traffic first needs to be converted into an image, which increases the data processing burden; moreover, the influence of network structure information on the accuracy of feature extraction is not fully considered.
In recent years, the achievements of deep learning in the field of time-series prediction have received more and more attention. When a task must process sequence information, an RNN offers advantages in time-series processing compared with the single-input processing of fully connected neural networks and CNNs.
As a new type of RNN, the echo state network is composed of an input layer, a hidden layer (i.e., the reservoir), and an output layer. One of the advantages of the ESN is that the entire network only needs to train the W_out layer, so its training process is very fast. In addition, for the processing and
prediction of one-dimensional time series, the ESN has a clear advantage [32].
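The key ESN property, that only the readout W_out is trained by a single linear regression with no backpropagation, can be shown in a minimal NumPy sketch. The reservoir size, spectral radius, and ridge parameter below are illustrative choices, not the paper's Table 4 settings:

```python
import numpy as np

def esn_fit_predict(u_train, y_train, u_test, n_res=100, rho=0.9, seed=0):
    """Minimal single-layer ESN: W_in and the reservoir W are random and
    fixed; only the linear readout W_out is fitted (ridge regression),
    so there is no backpropagation at all. u_* and y_train are 1-D."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius

    def run(u, x0):
        states, x = [], x0
        for ut in u:
            x = np.tanh(W_in[:, 0] * ut + W @ x)      # reservoir update
            states.append(x)
        return np.array(states), x

    X, x_end = run(np.asarray(u_train), np.zeros(n_res))
    beta = 1e-6                                       # ridge regularisation
    W_out = np.linalg.solve(X.T @ X + beta * np.eye(n_res), X.T @ y_train)
    X_test, _ = run(np.asarray(u_test), x_end)        # warm-started test run
    return X_test @ W_out
```

Fitting W_out is one linear solve, which is why ESN training is so much faster than gradient-trained RNNs; the ML-ESN of this paper stacks several such reservoirs.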
Because the ESN has these advantages, more and more researchers are using it to analyze and predict network attacks [33, 34].
Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that the method converges faster and performs better in IDS.
At present, some researchers have found through experiments that the single-layer echo state network still has problems: (1) model training can only adjust the output weights; (2) the randomly generated reservoir is unrelated to the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying the echo network to AMI network traffic anomaly detection requires improvement and optimization.
From the previous review, we can see that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, or based on information theory (entropy); in addition, different deep learning methods are constantly being tried and applied.
The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and on solving the problem that a single-layer ESN network cannot be directly applied to complex AMI network traffic detection.
3. AMI Network Architecture and Security Issues
The AMI network is generally divided into three network layers from the bottom up: the home area network (HAN), the neighborhood area network (NAN), and the wide area network (WAN). The hierarchical structure is shown in Figure 1.
In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters, or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, aggregating data to the FAN's area data concentrator. Many NANs and FANs are interconnected through switches or routers to form a WAN, achieving communication with power company data and control centers.
The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].
As Fadwa and Zeyar [20] mention, AMI faces various security threats, such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.
Researchers generally believe that AMI security detection, defense, and control rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after an attack.
4. Proposed Security Solution
At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices provide certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.
To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect AMI network traffic in real time; second, each traffic probe uploads a traffic file in a unified standard format to the control center; and finally, network flow anomalies are analyzed in real time to improve the security detection and identification capabilities of AMI.
As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information from all nodes.
Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, the data can also be processed by device and by area. For example, to collect data from different devices, perform preprocessing such as data cleaning, data filtering, and data completion before analysis, then use the Pearson and Gini coefficient methods described in this paper to find important feature correlations; using the ML-ESN algorithm to classify network attacks is equally feasible.
The main reasons for adopting standardized processing are as follows:
(1) Improve the centralized processing and visual dis-play of network flow information
(2) Partly eliminate and overcome the problem of inadequate information collection due to single or too few devices
(3) Use multiple devices to collect information and standardize the process to improve information fusion, so as to enhance the accuracy and robustness of classification
For other power companies that have not performed centralized and standardized processing, they can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.

4 Mathematical Problems in Engineering
The goal is the same as in this paper: to quickly find abnormal network attacks in a large amount of network flow data.
4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.
Metadata are composed of strings; each information element occupies a fixed position in the string. The strings are separated by ^, and the last string is also terminated by ^. In addition, an information element that does not exist in a metadata record is represented as follows: if an information element defined below is absent from the record, its position is simply left empty, so two ^ characters are adjacent at that point. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.
The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^2019-07-29T030823969^^^TCP^^^10107110^1010721241^^^".
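A record of this kind is just a caret-separated string with empty positions for absent elements; the following minimal parser sketch illustrates the convention (the backslash escape for a literal caret is an assumption based on the description above, not a confirmed detail of the standard):

```python
def parse_probe_record(record: str) -> list:
    """Split a caret-separated probe stream record into fields.

    Fields are separated by '^'; an empty field appears as two adjacent
    carets. A literal caret inside a field is assumed here to be escaped
    as '\\^'.
    """
    fields, current, i = [], [], 0
    while i < len(record):
        ch = record[i]
        if ch == "\\" and i + 1 < len(record) and record[i + 1] == "^":
            current.append("^")              # unescape a literal caret
            i += 2
        elif ch == "^":
            fields.append("".join(current))  # field boundary
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields

record = "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6"
print(parse_probe_record(record))
# ['6', '69085d3e5432360300000000', '10107110', '1010721241', '19341', '22', '6']
```

Each parsed field can then be interpreted by its fixed position against the information element definitions of Table 1.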
Part of the above probe stream is explained as follows, according to the metadata standard definition: (1) 6, metadata
Figure 2: Traffic probe simple deployment diagram (data processing center, firewall, data concentrator, flow probes, smart electric meters, and electricity users).
Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT-translated source IP
20  NatSrcPort        String    1024    NAT-translated source port
21  NatDestIP         IPstring  1024    NAT-translated destination IP
22  NatDestPort       String    1024    NAT-translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol
Figure 1: AMI network layered architecture [35] (smart meters and smart home applications in the HAN over Zigbee, Bluetooth, RFID, and PLC; repeaters and data concentrators in the NAN mesh network over Wi-Fi, WiMAX, and PLC; the utility centre reached over the WAN via fiber optic, WiMAX, satellite, and BPL; plus energy storage and PHEV/PEV devices).
version; (2) 69085d3e5432360300000000, metadata ID; (3) 10107110, source IP; (4) 1010721241, destination IP; (5) 19341, source port; (6) 22, destination port; and (7) 6, protocol TCP.
4.2. Proposed Framework. The metadata of the power probe stream used contain hundreds of elements, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data are used directly for analysis, first, the importance of a single metadata element cannot be directly reflected, and second, the analysis data dimensions are particularly high, resulting in particularly long calculation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, and the three steps are as follows:

Step 1: collect network flow metadata information in real time through the network probe flow collection devices deployed in different areas.

Step 2: first, aggregate the collected network flow metadata by time series or by segment to obtain the statistical characteristics of each part of the network flow. Second, standardize the statistically obtained characteristic values according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, further filter the standardized features.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to capture the statistical behavior characteristics that distinguish the network traffic generated by different network attack behaviors.
A network flow [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it is the set of all network data packets with the same five-tuple within a limited time, together with the data characteristics carried by the related packets in that set.
Some simple network characteristics can be extracted directly, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and the source and destination ports are also interchanged, which reflects the bidirectionality of the flow.
In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to cluster network flows and collect their statistical characteristics.
Firstly, network packets are aggregated into network flows; that is, we distinguish whether each network flow is generated by a different network behavior. Secondly, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:
Statistical characteristics of data size: forward and backward packets (maximum, minimum, average, and standard deviation) and the forward-to-backward packet ratio.

Statistical characteristics of time: duration, and forward and backward packet intervals (maximum, minimum, average, and standard deviation).
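Directional statistics of this kind follow mechanically once packets are grouped into flows. A minimal sketch, under the assumption that each packet is reduced to a (direction, size) pair (the field layout is illustrative, not the extraction code of [36]):

```python
import statistics

def flow_features(packets):
    """Compute directional size statistics for one network flow.

    `packets` is a list of (direction, size_bytes) tuples, where direction
    is 'fwd' (source -> destination) or 'bwd' (destination -> source).
    Returns max/min/mean/stdev of packet size per direction plus the
    forward-to-backward packet-count ratio.
    """
    feats = {}
    for d in ("fwd", "bwd"):
        sizes = [s for direc, s in packets if direc == d]
        feats[f"{d}_max"] = max(sizes)
        feats[f"{d}_min"] = min(sizes)
        feats[f"{d}_mean"] = statistics.mean(sizes)
        feats[f"{d}_std"] = statistics.pstdev(sizes)
    n_fwd = sum(1 for d, _ in packets if d == "fwd")
    n_bwd = sum(1 for d, _ in packets if d == "bwd")
    feats["fwd_bwd_ratio"] = n_fwd / n_bwd
    return feats

pkts = [("fwd", 40), ("bwd", 40), ("fwd", 1500), ("bwd", 60)]
f = flow_features(pkts)
print(f["fwd_max"], f["fwd_bwd_ratio"])  # 1500 1.0
```

The time-based features (inter-arrival intervals, duration) can be computed the same way from per-packet timestamps.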
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:
6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^
Figure 3: Part of the real probe stream data.
Time interval: maximum, minimum, average interval time, and standard deviation.
Packet size: maximum, minimum, average size, and packet distribution.
Number of data packets: outgoing and incoming.
Data amount: input bytes and output bytes.
Stream duration: duration from start to end.
Some of the main features of network traffic extracted in this paper are shown in Table 2.
4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.
At present, the main feature standardization methods are [38] Z-score, min-max, decimal scaling, etc.
Because there may be some nondigital data in the standard protocol fields, such as protocol names, IP, and TCP flags, these data cannot be processed directly by standardization, so nondigital data need to be converted to digital form. For example, change the character string "dhcp" to the value "1".
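The character-to-number conversion described here can be sketched as a simple first-appearance encoding (the vocabulary and numbering below are illustrative assumptions; the paper only gives the "dhcp" → "1" example):

```python
def encode_categorical(values):
    """Map each distinct string to a small integer, in order of first
    appearance, so nondigital fields such as protocol names become numeric."""
    mapping = {}
    encoded = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping) + 1  # start at 1, as in 'dhcp' -> 1
        encoded.append(mapping[v])
    return encoded, mapping

codes, table = encode_categorical(["dhcp", "tcp", "udp", "tcp"])
print(codes)   # [1, 2, 3, 2]
print(table)   # {'dhcp': 1, 'tcp': 2, 'udp': 3}
```

The resulting integer columns can then be standardized together with the numeric features.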
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

x′ = (x − x̄)/δ, (1)

where x̄ is the mean value of the original data, δ is the standard deviation of the original data, δ = sqrt(((x1 − x̄)² + (x2 − x̄)² + ⋯ + (xn − x̄)²)/n), and n is the number of samples per feature.
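Formula (1) can be sketched directly in code (a minimal illustration of Z-score standardization for one feature column, not the paper's implementation):

```python
import math

def z_score(column):
    """Standardize one feature column: x' = (x - mean) / std,
    with std computed over the column itself (population std)."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / std for x in column]

vals = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print([round(v, 2) for v in vals])
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After this transformation each feature has zero mean and unit standard deviation, so features with very different value ranges become comparable.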
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature filtering approach: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects the important features that meet requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy and mutual information, etc.
Because the power probe flow contains many statistical characteristics, and the main characteristics of different types of attacks differ, in order to quickly locate the important characteristics of different attacks, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain.
The Pearson coefficient is used to calculate the correlation of the feature data. The main reason is that the calculation of the Pearson coefficient is efficient and simple and is therefore more suitable for real-time processing of large-scale power probe streams.
The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation ρxy is shown in the following formula:

ρxy = cov(x, y)/(σx σy) = E[(x − μx)(y − μy)]/(σx σy), (2)
where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, which is usually denoted by r:
r = Σ_{i=1}^{n} (xi − x̄)(yi − ȳ) / sqrt(Σ_{i=1}^{n} (xi − x̄)² · Σ_{i=1}^{n} (yi − ȳ)²), (3)
where n is the number of samples, xi and yi are the observations at point i for variables x and y, x̄ is the sample mean of x, and ȳ is the sample mean of y. The value of r is between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework (probe traffic collection → statistical flow feature extraction → feature standardization → characteristic filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).
Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
…
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Time interval standard deviation of backward packets
31  duration        Network flow duration
the value is −1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly independent.
Because the Pearson method can only detect the linear relationship between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships between the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.
In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is Pi, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = Σ_{i=1}^{K} Pi(1 − Pi) = 1 − Σ_{i=1}^{K} Pi². (4)
Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 − Σ_{k=1}^{K} (|Ck|/|D|)², (5)

where Ck is the subset of samples in D belonging to the kth class and K is the number of classes.
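The two filtering measures, formulas (3) and (5), can be sketched together as follows (a minimal illustration; the example data and thresholds are made up, not taken from the paper's experiments):

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient, formula (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x) *
                    sum((yi - my) ** 2 for yi in y))
    return num / den

def gini(labels):
    """Gini index of a label multiset, formula (5): 1 - sum((|Ck|/|D|)^2)."""
    n = len(labels)
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((cnt / n) ** 2 for cnt in counts.values())

print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 3))    # 1.0
print(round(gini(["dos", "dos", "normal", "normal"]), 3))  # 0.5
```

A pair of features with |r| near 1 is redundant (one of them can be dropped), while a feature whose value partitions yield a low Gini index is a purer, and thus more useful, classification feature.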
5. ML-ESN Classification Method
The ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.
In this model, the network has 3 layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

u(t) = [u1(t), u2(t), …, uK(t)]^T,
x(t) = [x1(t), x2(t), …, xN(t)]^T,
y(t) = [y1(t), y2(t), …, yL(t)]^T. (6)
Win (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); Wout (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and Wback (N × L) represents the connection weights from y(t − 1) to x(t), where this last matrix is optional.
When u(t) is the input, the update equation for the reservoir state is

x(t + 1) = f(Win u(t + 1) + W x(t) + Wback y(t)), (7)

where f is the selected activation function of the reservoir and f′ is the activation function of the output layer. The output equation of the ESN is then

y(t + 1) = f′(Wout [u(t + 1); x(t + 1)]). (8)
Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while the multilayer version has more than one. The updated state equation of the ML-ESN is given by [41]:
x1(n + 1) = f(Win u(n + 1) + W1 x1(n)),
xk(n + 1) = f(Winter(k−1) x(k−1)(n + 1) + Wk xk(n)),
⋮
xM(n + 1) = f(Winter(M−1) x(M−1)(n + 1) + WM xM(n)). (9)
The ML-ESN output is then calculated from formula (9) as

y(n + 1) = fout(Wout xM(n + 1)). (10)
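Equations (9) and (10) can be sketched with NumPy as a stacked-reservoir forward pass. This is a minimal illustration under assumed sizes and random weights; the spectral-radius scaling mirrors step (2) of Algorithm 1, and the readout training is omitted (the readout is left linear here rather than using the paper's SoftMax):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, M, L = 5, 50, 3, 10  # inputs, neurons per reservoir, reservoirs, outputs

def scaled(w, alpha=0.9):
    """Rescale a weight matrix so its spectral radius equals alpha."""
    return alpha * w / np.max(np.abs(np.linalg.eigvals(w)))

W_in = rng.uniform(-1, 1, (N, K))
W = [scaled(rng.uniform(-1, 1, (N, N))) for _ in range(M)]          # W_k
W_inter = [scaled(rng.uniform(-1, 1, (N, N))) for _ in range(M - 1)]  # W_inter
W_out = rng.uniform(-1, 1, (L, N))  # untrained readout, for shape only

def ml_esn_step(u, states):
    """One update of equations (9)-(10): layer 1 sees the input, each later
    layer sees the previous layer's fresh state; output reads the last layer."""
    new = [np.tanh(W_in @ u + W[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W[k] @ states[k]))
    y = W_out @ new[-1]  # f_out left linear in this sketch
    return y, new

states = [np.zeros(N) for _ in range(M)]
y, states = ml_esn_step(rng.uniform(-1, 1, K), states)
print(y.shape)  # (10,)
```

In a full implementation, W_out would be fitted by the ridge regression of Algorithm 1, step (4), rather than drawn at random.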
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time does not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations can occur.
Figure 5: ESN basic model (input u(t) connected to the reservoir state x(t) through Win, internal reservoir weights W, optional feedback Wback, and readout Wout producing y(t)).
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, map the input feature value set from the low-dimensional input space to a high-dimensional state space, and finally train the network on the high-dimensional state space using linear regression or similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance. If the number of neurons is relatively large, the fitting effect is weakened; if the number of neurons is relatively small, the generalization ability cannot be guaranteed. Therefore, it is not suitable for directly classifying AMI network traffic anomalies.
In contrast, when the size of a single reservoir is small, the ML-ESN network model can still satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of each eigenvalue is not uniform. For example, most of the data are of numerical type, but some features contain character types and the special symbol "-", so they cannot be used directly for data processing. Before analysis, the data are standardized, and some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:
accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(TP + FN),
precision = TP/(TP + FP),
recall = TP/(TP + FN),
F-score = (2 × precision × recall)/(precision + recall). (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:
TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input u(t) feeding reservoir 1 through Win; reservoirs 1 … M with internal weights W1 … WM, connected in sequence through Winter; readout Wout from the last reservoir state xM producing y(t)).
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
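The indicators follow directly from these four counts; a minimal sketch using the standard definitions of these rates (the example counts are made up):

```python
def metrics(tp, tn, fp, fn):
    """Compute accuracy, false-positive rate, and F-score from a
    binary confusion matrix (standard definitions)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                     # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                  # also the TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

acc, fpr, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(fpr, 3), round(f1, 3))  # 0.925 0.05 0.923
```

For the multiclass experiments of this paper, these counts would be computed per attack type and averaged.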
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. In the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
(1) Input:
(2) D1: training dataset
(3) D2: test dataset
(4) U(t): input feature value set
(5) N: the number of neurons in each reservoir
(6) Ri: the number of reservoirs
(7) α: interconnection weight spectral radius
(8) Output:
(9) Training and testing classification results
(10) Steps:
(11) (1) Initially set the parameters of the ML-ESN and determine the corresponding number of input and output units according to the dataset:
    (i) Set the training data length trainLen
    (ii) Set the test data length testLen
    (iii) Set the number of reservoirs Ri
    (iv) Set the number of neurons in each reservoir N
    (v) Set the reservoir update speed α
    (vi) Set xi(0) = 0 (1 ≤ i ≤ M)
(12) (2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the weights of the external connections between reservoirs, Winter:
    (i) Randomly initialize the values of Win, Wi, and Winter
    (ii) Through statistical normalization and spectral radius calculation, Winter and Wi are scaled to meet the sparsity requirements. The calculation formulas are Wi = α(Wi/|λi|) and Winter = α(Winter/|λinter|), where λi and λinter are the spectral radii of the Wi and Winter matrices, respectively
(13) (3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
    (i) For t from 1 to T, compute x1(t):
        (a) Calculate x1(t) according to equation (7)
        (b) For i from 2 to M, compute xi(t):
            (i) Calculate xi(t) according to equations (7) and (9)
        (c) Get the matrix H: H = [x(t + 1); u(t + 1)]
(14) (4) Solve the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure:
    (i) Wout = D H^T(H H^T + βI)^−1, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix
(15) (5) Calculate the ML-ESN output according to formula (10):
    (i) Select the SoftMax activation function and calculate the output fout value
(16) (6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate

Algorithm 1: AMI network traffic classification.
Step 2. Perform data preprocessing on the AMI metadata or on the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalizing the data, most of the attack type data are concentrated between 0.4 and 0.6, but Generic attack type data are concentrated between 0.7 and 0.9, and normal type data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and Gini indexes for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, while the correlation between spkts and ct_srv_src (the number of connections with the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial Pearson correlation coefficient threshold is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, loss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better its training effect.
Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (mSec)), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined by the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (the first columns shown are dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).
Figure 8: Normalized data distribution by class label (Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal), with normalized values on a 0.0–1.0 scale.
UNSW_NB15 test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of the ML-ESN is to let the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layers because its value range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency, and because tanh gives a better detection effect when the characteristics differ significantly. In addition, the neuron fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10−6, respectively, based on relatively optimized parameter values obtained through multiple experiments.
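The regularization rate is the ridge (Tikhonov) term conventionally used when an ESN's output weights are solved in closed form instead of by backpropagation. The paper does not spell out its training equations, but a minimal pure-Python sketch of such a readout solve, assuming the standard formula W = (SᵀS + λI)⁻¹SᵀY over collected reservoir states S and targets Y, might look like:

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def ridge_readout(states, targets, lam=1e-6):
    """Closed-form ESN readout: solve (S^T S + lam*I) W = S^T Y.
    states: n_samples x n_features; targets: n_samples x n_outputs."""
    st = list(map(list, zip(*states)))            # S^T
    a = matmul(st, states)                        # S^T S
    for i in range(len(a)):
        a[i][i] += lam                            # + lam * I (regularization)
    b = matmul(st, targets)                       # S^T Y
    # Gaussian elimination with partial pivoting, solving A W = B
    n = len(a)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            a[r] = [x - f * y for x, y in zip(a[r], a[col])]
            b[r] = [x - f * y for x, y in zip(b[r], b[col])]
    w = [[0.0] * len(b[0]) for _ in range(n)]
    for r in range(n - 1, -1, -1):                # back substitution
        for c in range(len(b[0])):
            s = b[r][c] - sum(a[r][k] * w[k][c] for k in range(r + 1, n))
            w[r][c] = s / a[r][r]
    return w
```

With λ as small as 1.0 × 10−6 the solution is effectively least squares, while still keeping the normal equations well conditioned.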
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.
The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45:1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (heat map over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small or a large data sample, the classification effect without filtering is lower than that with filtering.
In addition, using a single filtering method is not as good as using a combination of the two. For example, on the 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter the features and then applies the ML-ESN training algorithm.
Figure 10: The Gini values for UNSW_NB15 (heat map over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment.

Input dimension number: 5
Output dimension number: 10
Reservoir number: 3
Reservoir neurons number: 1000
Reservoir activation fn: tanh
Output layer activation fn: sigmoid
Update rate: 0.9
Random seed: 50
Regularization rate: 1.0 × 10−6
The ML-ESN algorithm learns from the training data, and the test data are then used to verify the trained model, yielding test results for the different attack types. The classification results obtained for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model, with network traffic features filtered and optimized by the combination of Pearson and Gini coefficients, to quickly classify anomalous network traffic attacks. The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model gradually increases at first as the reservoir depth increases; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, it is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain degree of overfitting appears in the model, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.
Figure 11: Classification effect of different filtering methods (none, Pearson, Gini, and Pearson + Gini) at dataset sizes from 20,000 to 160,000 packets.

In order to fully verify the correctness of the proposed method, this paper further tests the detection performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Second, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, containing 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieve 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially for the GaussianNB algorithm, whose accuracy falls below 50%, while the other algorithms stay only close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset, the accuracy reaches 96.75%, and on the 160,000-record dataset, it reaches 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR (true-positive rate) on the vertical axis.
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each attack type: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2–5 with 500, 1000, and 2000 reservoir neurons; (b) detection accuracy for the same settings; (c) accuracy and time consumption of BP, ESN, DecisionTree, and ML-ESN.
Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B) over the number of packages.
Generally speaking, a ROC chart is judged by the AUC (area under the ROC curve): the larger the AUC value, the better the model performance.
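For reference, AUC can also be computed directly from scores without drawing the curve, via the Mann-Whitney rank statistic; a minimal sketch (illustrative only, not the paper's evaluation code):

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    outranks a randomly chosen negative one; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every attack sample scores above every normal sample; 0.5 is no better than random guessing.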
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rate for the other attack types is 99%.
Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (0 to 160,000).
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).
However, for the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. For the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and the false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers in this area.
Second, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain degree of multicollection device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, to greatly reduce learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so as to apply it in actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.
In order to achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy re-encryption operations in the Paillier cryptosystem.
In [16], Euijin et al. used a disassembler and statistical analysis to detect AMI malicious code. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.
2.3. Network Attack Detection. A large number of statistics show that the main attack point for hackers targeting the AMI network is currently the smart meter (SM).
The SM is the key equipment of the AMI network, realizing two-way communication between the power company and the user. On the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other hand, the company's electricity prices and instructions are presented to users.
The intrusion detection mechanism is an important part of current smart meter security protection. It monitors and analyzes the events that occur in the smart meter. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding response mechanisms.
Current research on AMI network security threats mainly analyzes whether there are abnormalities from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that, although physical attacks against AMI are often strong and the most effective, they are easier to detect.
The existing AMI network attack detection methods mainly include simulation methods [17, 18], k-means clustering [1, 19, 20], data mining [21–23], prequential evaluation [24], and PCA [25].
In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of the puppet attack on AMI through simulation experiments.
In [18], the authors also used the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information and communication infrastructure of the smart grid AMI network.
In order to analyze AMI network anomalies more accurately, some researchers start with AMI network traffic and use machine learning methods to determine whether various anomalous attacks have occurred on the network.
In [20], the authors use distributed intrusion detection and sliding-window methods to monitor the data flow of AMI components and propose a real-time unsupervised AMI data flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to cluster network flows by type to discover abnormal attack types.
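The flow clustering step of such a system can be illustrated with a minimal mini-batch k-means sketch (synthetic two-dimensional flow features and hand-picked initial centers; this is an illustration, not the DIDS implementation):

```python
import numpy as np

def mini_batch_kmeans(X, centers, batch_size=32, n_iters=200, seed=0):
    """Minimal mini-batch k-means: each small batch nudges its nearest centers."""
    rng = np.random.default_rng(seed)
    centers = centers.astype(float).copy()
    counts = np.zeros(len(centers))            # per-center update counts
    for _ in range(n_iters):
        batch = X[rng.choice(len(X), batch_size)]
        d = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)              # nearest center per batch point
        for point, c in zip(batch, labels):
            counts[c] += 1
            eta = 1.0 / counts[c]              # per-center learning rate
            centers[c] = (1 - eta) * centers[c] + eta * point
    return centers

# two well-separated synthetic clusters of 2-D flow statistics
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (100, 2)), rng.normal(5.0, 0.1, (100, 2))])
centers = mini_batch_kmeans(X, centers=np.stack([X[0], X[-1]]))
```

In a real deployment the columns of `X` would be the standardized flow statistics described later in this paper rather than synthetic values.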
In [22], the authors use an artificial immune system to detect AMI network attacks. This method first uses the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.
With the increase of AMI traffic feature dimensions and noisy data, traffic anomaly detection methods based on traditional machine learning face the problems of low accuracy and poor robustness of traffic feature extraction, which reduces the performance of traffic attack detection to a certain extent. Therefore, anomaly detection methods based on deep learning have become a hot topic in current network security research [26–34].
Wang et al. [27] proposed a technique that uses deep learning for malicious traffic detection. This technique has two implementation steps: one is to use a CNN (convolutional neural network) to learn the spatial characteristics of traffic, and the other is to extract data packets from the data stream and learn spatiotemporal characteristics through the CNN and an RNN (recurrent neural network).
Currently, there are three main methods of anomaly detection based on deep learning:

(1) Anomaly detection methods based on the deep Boltzmann machine [28]: this kind of method can extract the essential features of high-dimensional traffic data through learning, so as to improve the detection rate of traffic attacks. However, this type of method has poor robustness in feature extraction; when the input data contain noise, its attack detection performance worsens.

(2) Anomaly detection methods based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.

(3) Anomaly detection methods based on CNN [27, 30]: the traffic features extracted by this type of method are highly robust and the attack detection performance is high, but the network traffic first needs to be converted into an image, which increases the data processing burden, and the influence of network structure information on the accuracy of feature extraction is not fully considered.
In recent years, the achievements of deep learning in the field of time series prediction have also received more and more attention. When a task needs to process sequence information, an RNN can exploit its advantages in time series processing, compared with the single-input processing of fully connected neural networks and CNNs.
As a new type of RNN, the echo state network (ESN) is composed of an input layer, a hidden layer (i.e., the reservoir), and an output layer. One of the advantages of ESN is that only the W_out layer of the entire network needs to be trained, so its training process is very fast. In addition, ESN has a very good advantage for the processing and prediction of one-dimensional time series [32].

Mathematical Problems in Engineering 3
Because ESN has these advantages, more and more researchers use it to analyze and predict network attacks [33, 34].
Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that it converges faster and performs better in IDS.
At present, some researchers have found through experiments that a single-layer echo state network still has some problems: (1) the model training can only adjust the output weights; (2) the randomly generated reservoir has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying echo state networks to AMI network traffic anomaly detection requires improvement and optimization.
From the previous review, we can find that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information theory (entropy)-based. In addition, different deep learning methods are constantly being tried and applied.
The above methods have different advantages and disadvantages depending on the research object and purpose. This article focuses on making full use of the advantages of the ESN method and on solving the problem that a single-layer ESN network cannot be directly applied to complex AMI network traffic detection.
3. AMI Network Architecture and Security Issues
The AMI network is generally divided into three network layers from the bottom up: the home area network (HAN), the neighborhood area network (NAN), and the wide area network (WAN). The hierarchical structure is shown in Figure 1.
In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in a grid user's home, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, and aggregate data to the FAN's area data concentrator. Many NANs and FANs are interconnected through switches or routers to form a WAN, achieving communication with the power company's data and control centers.
The reliable deployment and safe operation of the AMI network are the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].
As Fadwa and Zeyar [20] mention, AMI faces various security threats, such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.
Researchers generally believe that AMI security detection, defense, and control rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after an attack.
4. Proposed Security Solution
At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices provide certain areas with security detection and defense capabilities but bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.
To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect the AMI network traffic in real time; second, each traffic probe uploads a traffic file in a unified standard format to the control center; finally, network flow anomalies are analyzed in real time to improve the security detection and identification capabilities of AMI.
As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information from all nodes.
Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, processing can also be done per device and per area: collect data from different devices, perform preprocessing such as data cleaning, filtering, and completion before analysis, and then use the Pearson and Gini coefficient methods described in this article to find important feature correlations; using the ML-ESN algorithm to classify network attack anomalies is likewise feasible.
The main reasons for adopting standardized processing are as follows:

(1) Improve the centralized processing and visual display of network flow information.

(2) Partly eliminate and overcome the problem of inadequate information collection caused by a single device or too few devices.

(3) Use multiple devices to collect information and standardize the process to improve information fusion, thereby enhancing the accuracy and robustness of classification.
Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions. The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.
4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.
Metadata are composed of strings; each information element occupies a fixed position in the string, the elements are separated by '^', and the last element is also terminated by '^'. In addition, an information element that is absent from a metadata record is handled as follows: if a defined information element's position does not need to be filled in, two '^' characters are adjacent at that position. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.
The first record in Figure 3 is as follows: "6 69085d3e5432360300000000 10107110 1010721241^19341^22^6^40 1^40^1 1564365874 15643 ^ 2019-07-29T030823969^ ^ ^ TCP^ ^ ^ 10107110^1010721241^ ^ ^ ".
Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata
Figure 2: Traffic probe simple deployment diagram (flow probes between the smart electric meters, data concentrator, firewall, and data processing center).
Table 1: Some important metadata information.

ID | Name | Type | Length | Description
1 | EventID | String | 64 | Event ID
2 | ReceiveTime | Long | 8 | Receive time
3 | OccurTime | Long | 8 | Occur time
4 | RecentTime | Long | 8 | Recent time
5 | ReporterID | Long | 8 | Reporter ID
6 | ReporterIP | IPstring | 128 | Reporter IP
7 | EventSrcIP | IPstring | 128 | Event source IP
8 | EventSrcName | String | 128 | Event source name
9 | EventSrcCategory | String | 128 | Event source category
10 | EventSrcType | String | 128 | Event source type
11 | EventType | Enum | 128 | Event type
12 | EventName | String | 1024 | Event name
13 | EventDigest | String | 1024 | Event digest
14 | EventLevel | Enum | 4 | Event level
15 | SrcIP | IPstring | 1024 | Source IP
16 | SrcPort | String | 1024 | Source port
17 | DestIP | IPstring | 1024 | Destination IP
18 | DestPort | String | 1024 | Destination port
19 | NatSrcIP | IPstring | 1024 | NAT-translated source IP
20 | NatSrcPort | String | 1024 | NAT-translated source port
21 | NatDestIP | IPstring | 1024 | NAT-translated destination IP
22 | NatDestPort | String | 1024 | NAT-translated destination port
23 | SrcMac | String | 1024 | Source MAC address
24 | DestMac | String | 1024 | Destination MAC address
25 | Duration | Long | 8 | Duration (seconds)
26 | UpBytes | Long | 8 | Up traffic bytes
27 | DownBytes | Long | 8 | Down traffic bytes
28 | Protocol | String | 128 | Protocol
29 | AppProtocol | String | 1024 | Application protocol
Figure 1: AMI network layered architecture [35] (HAN: Zigbee, Bluetooth, RFID, PLC; NAN: mesh network, Wi-Fi, WiMAX, PLC; WAN: fiber optic, WiMAX, satellite, BPL; smart meters, repeaters, and data concentrators link home appliances, energy storage, and PHEV/PEV to the utility centre).
version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol (TCP).
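A minimal parser for this record layout might look as follows (a sketch based on the field positions explained above; caret escaping is not handled, and the field names are our own, not part of the standard):

```python
def parse_probe_record(record):
    """Split a '^'-separated probe-stream record into named fields.

    Field positions follow the worked example in the text (version,
    metadata ID, source/destination IP and port, protocol); the remaining
    elements are kept raw. Two adjacent '^' yield an empty field.
    """
    parts = record.split("^")
    names = ["version", "metadata_id", "src_ip", "dst_ip",
             "src_port", "dst_port", "protocol"]
    fields = dict(zip(names, parts))
    fields["extra"] = parts[len(names):]   # unparsed trailing elements
    return fields

rec = ("6^69085d3e5432360300000000^10107110^1010721241"
       "^19341^22^6^40^1^40^1^1564365874^1564365874^^^^")
f = parse_probe_record(rec)
# e.g. f["src_port"] == "19341", f["dst_port"] == "22", f["protocol"] == "6"
```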
4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of information units, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of a single metadata item could not be directly reflected, and second, the data dimensionality would be particularly high, resulting in very long computation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, as follows:
Step 1: collect network flow metadata information in real time through the network probe flow collection devices deployed in different areas.

Step 2: first, the collected network flow metadata are segmented by time series to obtain the statistical characteristics of each part of the network flow; second, the statistically obtained characteristic values are standardized according to certain data standardization guidelines; finally, in order to quickly find important features, and correlations between features, that reflect network attack anomalies, the standardized features are further filtered.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two sets to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to extract features that better reflect the traffic and statistical behavior characteristics of different network attack behaviors.
Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.
As is well known, some simple characteristics can be extracted directly from network traffic, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.
In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.
Firstly, network packets are aggregated into network flows, that is, distinguishing whether each network flow is generated by a different network behavior. Secondly, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward/backward packet ratio.

Statistical characteristics of time: duration, forward and backward packet intervals, and their maximum, minimum, average, and standard deviation.
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:
6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^
Figure 3: Part of the real probe stream data.
Time interval: maximum, minimum, average interval time, and standard deviation.

Packet size: maximum, minimum, average size, and packet distribution.

Number of data packets: outgoing and incoming.

Data amount: input bytes and output bytes.

Stream duration: duration from start to end.
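A subset of these per-flow statistics can be sketched as follows (illustrative packet sizes and timestamps; the field names are ours, not the exact identifiers of Table 2):

```python
import statistics

def flow_statistics(pkt_sizes, pkt_times):
    """Per-flow statistics of the kind listed above (illustrative subset)."""
    # inter-arrival times between consecutive packets
    iats = [t2 - t1 for t1, t2 in zip(pkt_times, pkt_times[1:])]
    return {
        "pkt_count": len(pkt_sizes),
        "total_bytes": sum(pkt_sizes),
        "size_max": max(pkt_sizes),
        "size_min": min(pkt_sizes),
        "size_mean": statistics.mean(pkt_sizes),
        "size_std": statistics.pstdev(pkt_sizes),
        "iat_max": max(iats),
        "iat_min": min(iats),
        "iat_mean": statistics.mean(iats),
        "duration": pkt_times[-1] - pkt_times[0],
    }

# a toy 4-packet flow: sizes in bytes, timestamps in seconds
stats = flow_statistics([40, 40, 80, 28], [0.0, 0.1, 0.3, 0.4])
```

In practice these statistics would be computed separately for the forward and backward directions of each flow, as in Table 2.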
Some of the main features of network traffic extracted inthis paper are shown in Table 2
4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing on the statistical features, mainly including operations such as feature standardization and unbalanced-data elimination.
At present, the main feature standardization methods [38] are Z-score, min-max, and decimal scaling.
Because there may be some nonnumeric data in the standard protocol fields, such as protocol names, IPs, and TCP flags, these data cannot be standardized directly, so nonnumeric data first need to be converted to numeric form; for example, the character string "dhcp" is changed to the value "1".
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

$x' = \frac{x - \bar{x}}{\delta}$  (1)

where $\bar{x}$ is the mean value of the original data and $\delta$ is the standard deviation of the original data, $\delta = \sqrt{((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2)/n}$, with $n$ the number of samples per feature.
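Formula (1) can be sketched in a few lines (population standard deviation, as in the formula; the input values are toy data):

```python
import statistics

def z_score(values):
    """Z-score standardization per formula (1): x' = (x - mean) / std."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)   # population standard deviation
    return [(v - mean) / std for v in values]

scaled = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
# the standardized result has mean 0 and population standard deviation 1
```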
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature filtering approach: it regards features as independent objects, evaluates their importance according to quality metrics, and selects the important features that meet the requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and covariance matrices, correlation coefficients, unary and multiple regression, information entropy, and mutual information.
Because the power probe flow contains many statistical characteristics and the main characteristics of different attack types differ, in order to quickly locate the important characteristics of different attacks, this paper filters the network flow characteristics based on the correlation of the statistical feature data and on information gain.
The Pearson coefficient is used to calculate the correlation of feature data, mainly because its calculation is efficient and simple and therefore more suitable for real-time processing of large-scale power probe streams.
The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y); its calculation $\rho_{xy}$ is shown in the following formula:

$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y}$  (2)

where cov(x, y) is the covariance of x and y, $\sigma_x$ is the standard deviation of x, and $\sigma_y$ is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted r:

$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$  (3)
where n is the number of samples, $x_i$ and $y_i$ are the observations at point i for variables x and y, and $\bar{x}$ and $\bar{y}$ are the sample means of x and y. The value of r is between −1 and 1. When the value is 1, there is a completely positive linear correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework (probe traffic collection → feature extraction, statistical flow characteristics, and standardization → characteristic filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).
Table 2: Some of the main features.

ID | Name | Description
1 | SrcIP | Source IP address
2 | SrcPort | Source IP port
3 | DestIP | Destination IP address
4 | DestPort | Destination IP port
5 | Proto | Network protocol, mainly TCP, UDP, and ICMP
6 | total_fpackets | Total number of forward packets
7 | total_fvolume | Total size of forward packets
8 | total_bpackets | Total number of backward packets
9 | total_bvolume | Total size of backward packets
29 | max_biat | Maximum backward packet arrival interval
30 | std_biat | Standard deviation of backward packet time intervals
31 | duration | Network flow duration
the value is −1, there is a completely negative linear correlation between the two random variables; and when the value is 0, the two random variables are linearly uncorrelated.
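Formula (3) can be illustrated directly (a pure-Python sketch on toy data):

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient, formula (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x)
                    * sum((yi - my) ** 2 for yi in y))
    return num / den

r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly linear → r = 1.0
```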
Because the Pearson method can only detect the linear relationship between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships in the probe flow characteristics, this paper calculates the information entropy of the features and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.
In the classification problem, assuming that there are K classes and that the probability that a sample point belongs to class i is $P_i$, the Gini index of the probability distribution is defined as follows [39]:

$\operatorname{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2$  (4)

Given the sample set D, the Gini coefficient is expressed as follows:

$\operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2$  (5)

where $C_k$ is the subset of samples belonging to the kth class in D and K is the number of classes.
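Formula (5) reduces to a few lines over class counts (the labels below are illustrative):

```python
from collections import Counter

def gini_index(labels):
    """Gini index of a label set D, formula (5): 1 - sum((|Ck|/|D|)^2)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

g_pure = gini_index(["normal"] * 10)                  # single class → 0.0
g_mixed = gini_index(["normal"] * 5 + ["dos"] * 5)    # 50/50 split → 0.5
```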
5. ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction problems. The basic ESN network model is shown in Figure 5.
In this model, the network has three layers: the input layer, the hidden layer (reservoir), and the output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

$u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T$
$x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$
$y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T$  (6)

$W_{in}$ (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); $W_{out}$ (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and $W_{back}$ (N × L) represents the connection weights from y(t − 1) to x(t), which are optional.
When u(t) is input, the updated state equation of the reservoir is given by

$x(t + 1) = f(W_{in} u(t + 1) + W x(t))$  (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

$y(t + 1) = f'(W_{out} [u(t + 1); x(t + 1)])$  (8)
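Equations (7) and (8) can be illustrated with a minimal NumPy sketch (a toy reservoir with hypothetical sizes and a random, untrained readout; this shows only the state update, not the paper's trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 3, 20, 2                          # input, reservoir, and output sizes

W_in = rng.uniform(-0.5, 0.5, (N, K))       # input-to-reservoir weights
W = rng.uniform(-0.5, 0.5, (N, N))          # reservoir-internal weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # rescale spectral radius to 0.9
W_out = rng.uniform(-0.5, 0.5, (L, K + N))  # readout weights, untrained here

def esn_step(x, u):
    """Equation (7): x(t+1) = f(W_in u(t+1) + W x(t)), with f = tanh."""
    return np.tanh(W_in @ u + W @ x)

x = np.zeros(N)
for t in range(10):                         # drive with a toy input sequence
    u = np.array([np.sin(t), np.cos(t), 1.0])
    x = esn_step(x, u)
y = W_out @ np.concatenate([u, x])          # equation (8) with a linear f'
```

Only `W_out` would be trained (by linear/ridge regression); `W_in` and `W` stay fixed after initialization, which is what makes ESN training fast.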
Researchers have found through experiments that the traditional echo state network's reservoir is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while the multilayer network has more than one. The updated state equations of ML-ESN are given by [41]:

$x_1(n + 1) = f(W_{in} u(n + 1) + W_1 x_1(n))$
$x_k(n + 1) = f(W_{inter(k-1)} x_{k-1}(n + 1) + W_k x_k(n))$
⋮
$x_M(n + 1) = f(W_{inter(M-1)} x_{M-1}(n + 1) + W_M x_M(n))$  (9)
The ML-ESN output is then calculated from the final reservoir state obtained by formula (9):

$y(n + 1) = f_{out}(W_{out} x_M(n + 1))$  (10)
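The stacked update of equation (9) can likewise be sketched (a toy three-reservoir network with assumed sizes; the readout training of Section 5.1 is omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, M = 3, 20, 3                       # input size, reservoir size, layer count

def init_weights(shape, rho=0.9):
    """Random matrix; square matrices are rescaled to spectral radius rho."""
    w = rng.uniform(-0.5, 0.5, shape)
    if shape[0] == shape[1]:
        w = w * rho / max(abs(np.linalg.eigvals(w)))
    return w

W_in = init_weights((N, K))
W_layer = [init_weights((N, N)) for _ in range(M)]      # internal weights W_k
W_inter = [init_weights((N, N)) for _ in range(M - 1)]  # inter-reservoir weights

def ml_esn_step(states, u):
    """Equation (9): layer 1 is driven by u; layer k by layer k-1's new state."""
    new = [np.tanh(W_in @ u + W_layer[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W_layer[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]
for t in range(10):                      # drive with a toy input sequence
    states = ml_esn_step(states, np.array([np.sin(t), np.cos(t), 1.0]))
# states[-1] is x_M(t), which feeds the readout of formula (10)
```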
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations will occur.
Figure 5: ESN basic model (input layer U(t) connected via W_in to the reservoir state x(t) with internal weights W, read out via W_out to y(t), with optional feedback W_back).
It can be seen from Figure 5 that ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, map the input feature set from the low-dimensional input space to a high-dimensional state space, and finally train the network on the high-dimensional state space using linear regression or similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, a single-layer ESN is not suitable for directly classifying AMI network traffic anomalies.
By contrast, when a single reservoir is small, the ML-ESN network model can still satisfy the echo state training conditions by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex modern network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform; for example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be processed directly. Before processing, the data are standardized, and some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators: accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score). Their calculation formulas are as follows:
$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

$\text{FPR} = \frac{FP}{FP + TN}$

$\text{TPR} = \frac{TP}{TP + FN}$

$\text{precision} = \frac{TP}{TP + FP}$

$\text{recall} = \frac{TP}{TP + FN}$

$F\text{-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$  (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.

TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer U(t) connected via W_in to reservoir 1; reservoirs 1 … M chained via W_inter with internal weights W_1 … W_M and states x_1 … x_M; output y(t) via W_out).
FP (false positive): the number of normal network traffic flows identified as abnormal.

FN (false negative): the number of abnormal network traffic flows identified as normal.
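These indicators follow directly from the four confusion counts; a small sketch (the counts below are illustrative, and FPR uses the standard definition FP/(FP + TN)):

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, FPR, and F-score from the confusion counts defined above."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                      # standard false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                   # identical to TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

acc, fpr, f1 = detection_metrics(tp=90, tn=95, fp=5, fn=10)
# acc = 0.925, fpr = 0.05, f1 ≈ 0.923
```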
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time (such metadata are shown in Figure 3); for the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID | Type | Number of packets | Size (MB)
1 | Normal | 56000 | 3.63
2 | Analysis | 1560 | 0.108
3 | Backdoors | 1746 | 0.36
4 | DoS | 12264 | 2.42
5 | Exploits | 33393 | 8.31
6 | Fuzzers | 18184 | 4.62
7 | Generic | 40000 | 6.69
8 | Reconnaissance | 10491 | 2.42
9 | Shellcode | 1133 | 0.28
10 | Worms | 130 | 0.044
Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the parameters of ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
  (i) Set the training data length trainLen.
  (ii) Set the test data length testLen.
  (iii) Set the number of reservoirs Ri.
  (iv) Set the number of neurons in each reservoir N.
  (v) Set the reservoir update speed α.
  (vi) Set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
  (i) Randomly initialize the values of W_in, W_i, and W_inter.
  (ii) Through statistical normalization and spectral-radius calculation, W_i and W_inter are scaled to meet the sparsity requirements. The calculation formulas are W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
  (i) For t from 1 to T, compute x_1(t):
    (a) Calculate x_1(t) according to equation (7).
    (b) For i from 2 to M, compute x_i(t): calculate x_i(t) according to equations (7) and (9).
    (c) Get the matrix H = [x(t + 1); u(t + 1)].
(4) Solve for the weight matrix W_out from the reservoirs to the output layer using the following, to obtain the trained ML-ESN network structure:
  (i) W_out = DH^T(HH^T + βI)^(-1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
  (i) Select the SoftMax activation function and calculate the output f_out value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.
Algorithm 1: AMI network traffic classification.
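Algorithm 1 can be sketched in NumPy. This is not the authors' implementation: equations (7) and (9) are approximated here by the standard leaky-integrator ESN update, the uniform weight initialization and all sizes are illustrative assumptions, and only the spectral-radius scaling, the reservoir stacking, the ridge readout W_out = DH^T(HH^T + βI)^(-1), and the SoftMax output follow the steps above.

```python
import numpy as np

rng = np.random.default_rng(50)                      # random seed as in Table 4

def scale_spectral(w, alpha):
    """Rescale a weight matrix to spectral radius alpha (step (2)(ii))."""
    return alpha * w / max(abs(np.linalg.eigvals(w)))

def init_ml_esn(n_in, n_res, n_neurons, alpha=0.9):
    """Step (2): random input, internal, and inter-reservoir weights."""
    w_in = rng.uniform(-0.5, 0.5, (n_neurons, n_in))
    w = [scale_spectral(rng.uniform(-0.5, 0.5, (n_neurons, n_neurons)), alpha)
         for _ in range(n_res)]
    w_inter = [scale_spectral(rng.uniform(-0.5, 0.5, (n_neurons, n_neurons)), alpha)
               for _ in range(n_res - 1)]
    return w_in, w, w_inter

def collect_states(u_seq, w_in, w, w_inter, leak=0.9):
    """Step (3): drive the stacked reservoirs with u(t) and collect H = [x(t); u(t)]."""
    x = [np.zeros(m.shape[0]) for m in w]
    cols = []
    for u in u_seq:
        for i in range(len(w)):
            # Layer 1 is driven by the input; deeper layers by the previous reservoir.
            pre = w_in @ u if i == 0 else w_inter[i - 1] @ x[i - 1]
            x[i] = (1 - leak) * x[i] + leak * np.tanh(w[i] @ x[i] + pre)
        cols.append(np.concatenate([x[-1], u]))
    return np.array(cols).T                          # shape: (n_neurons + n_in, T)

def ridge_readout(H, D, beta=1e-6):
    """Step (4): closed-form W_out = D H^T (H H^T + beta*I)^(-1)."""
    return D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(H.shape[0]))

def softmax(z):
    """Step (5): SoftMax over the output units (columns are samples)."""
    e = np.exp(z - z.max(axis=0))
    return e / e.sum(axis=0)
```

With 5 input features and 10 output classes as in Table 4, training reduces to `W_out = ridge_readout(collect_states(...), D)`; no backpropagation is involved, which is the point of the ESN family.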
10 Mathematical Problems in Engineering
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data. The standardized data are as shown in Figure 7, and the normalized data distribution is as shown in Figure 8.
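The paper does not give the preprocessing code; a plausible NumPy sketch of the deduplication, completion, standardization, and normalization steps (the row order after deduplication and the mean-fill strategy are assumptions of this sketch):

```python
import numpy as np

def preprocess(x):
    """Deduplicate rows, fill missing values with the column mean,
    then z-score standardize (Figure 7) and min-max normalize (Figure 8)."""
    x = np.unique(x, axis=0)                        # data deduplication
    col_mean = np.nanmean(x, axis=0)
    x = np.where(np.isnan(x), col_mean, x)          # data completion
    std = (x - x.mean(axis=0)) / x.std(axis=0)      # standardized data
    norm = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))  # normalized data
    return std, norm
```

A production pipeline would also guard against zero-variance columns and encode the categorical fields (proto, service, state) numerically first.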
As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, the Generic attack-type data are concentrated between 0.7 and 0.9, and the normal-type data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient values and the Gini indices for the standardized data. In the experiment, the Pearson coefficient values and the Gini indices for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only -0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the threshold for the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and the ACK packets) all exceed 0.9, a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, i.e., very small.
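The Pearson-based redundancy filter described above can be sketched as follows; the paper does not state which feature of a highly correlated pair is kept, so this greedy version keeps the earlier one (an assumption of the sketch):

```python
import numpy as np

def pearson_filter(x, names, threshold=0.5):
    """Keep a feature only if its |Pearson r| with every already-kept
    feature does not exceed the threshold; drop the redundant one."""
    r = np.corrcoef(x, rowvar=False)   # feature-by-feature correlation matrix
    kept = []
    for j in range(x.shape[1]):
        if all(abs(r[j, k]) <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept], kept
```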
In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.
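The per-feature Gini index can be read as the weighted class impurity within each feature value, as in decision-tree splitting; a short sketch under that assumption:

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a label multiset: 1 - sum(p_k^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def feature_gini(feature_values, labels):
    """Weighted Gini impurity of the class distribution within each distinct
    feature value; one plausible reading of the per-feature index of Figure 10."""
    n = len(labels)
    total = 0.0
    for v in set(feature_values):
        idx = [i for i, f in enumerate(feature_values) if f == v]
        total += len(idx) / n * gini_impurity([labels[i] for i in idx])
    return total
```

A feature that perfectly separates the classes scores 0; a feature whose values say nothing about the class keeps the full impurity of the label set, matching the "smaller is purer" reading in the text.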
Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in ms), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment; the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined according to the number of selected features; for example, in the UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

Figure 7: Partial feature data after standardization (a table of standardized values for the features dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).

Figure 8: Normalized data distribution (box plots over the data label classes Normal, Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; value range 0.0 to 1.0).
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.
Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the model detection accuracy does not increase monotonically; it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is to generate, from the reservoirs, a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between -1 and 1 with a zero-centered mean, which is more conducive to improving training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because the sigmoid output value lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10^-6, respectively, based mainly on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1. The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45:1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (a correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; color scale 0.0 to 1.0).
The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit) with Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran experiments on the training dataset with neither of these two filtering methods, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether the data sample is small or large, the classification effect without filtering is lower than that with filtering.
In addition, using a single filtering method is not as good as using the combination of the two. For example, with 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indices is used, the accuracy reaches 0.99.
Figure 10: The Gini values for UNSW_NB15 (a heatmap over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit; color scale 0.0 to 1.0).

Table 4: The parameters of the ML-ESN experiment.

Parameter                    Value
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      tanh
Output layer activation fn   sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10^-6

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indices to filter, then uses the ML-ESN training algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
From the detection results in Figure 12, it is completely feasible to use the ML-ESN network learning model, with the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization, to quickly classify anomalous network traffic attacks.
The detection results for accuracy, F1-score, and FPR are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and the F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the reservoir depth increases; for example, with 1000 neurons, a reservoir depth of 5 takes 21.1 ms, while a depth of 3 takes only 11.6 ms. In addition, at the same reservoir depth, the more neurons the model has, the more training time it consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model gradually increases with reservoir depth at first; for example, with a depth of 3 and 1000 neurons, the detection accuracy is 0.96, while with a depth of 2 and 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at first, as the depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results reflect that, after model self-learning, the method proposed in this paper has good detection ability for different attack types.
Figure 11: Classification effect of different filtering methods (accuracy versus data size from 20,000 to 160,000 packets, for no filtering, Pearson only, Gini only, and Pearson + Gini).

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
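The comparison loop itself is generic: any object with the scikit-learn fit/predict interface can be plugged in. A sketch with a toy nearest-centroid classifier as a hypothetical stand-in (it is not one of the cited methods; in the actual experiment the scikit-learn models listed above would be passed in):

```python
import numpy as np

class NearestCentroid:
    """Toy stand-in exposing the same fit/predict interface as the
    scikit-learn classifiers (GaussianNB, KNeighborsClassifier, ...)."""
    def fit(self, x, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([x[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, x):
        # Squared distance of every sample to every class centroid.
        d = ((x[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def compare(models, x_train, y_train, x_test, y_test):
    """Fit each named model and report its test accuracy, as in Figure 15."""
    return {name: float((m.fit(x_train, y_train).predict(x_test) == y_test).mean())
            for name, m in models.items()}
```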
This simulation experiment focuses on five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 records, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method were compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, while the other algorithms are very close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy of the algorithm reached 96.75%, and on the 160,000-record dataset it reached 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each attack type; accuracy and F1-score values range from 0.94 to 1.0, and FPR values from 0.01 to 0.02).

Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy at depths 2 to 5 with 500, 1000, and 2000 neurons; (c) accuracy and time of BP, ESN, DecisionTree, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B over the number of packages).

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment uses ROC (receiver operating characteristic) graphs to evaluate the experimental performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR (true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
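This ROC/AUC construction can be sketched directly from the definition: sweep the decision threshold over the scores, accumulate FPR/TPR points, and integrate with the trapezoidal rule.

```python
import numpy as np

def roc_points(scores, labels):
    """Sweep the threshold over the scores (descending) and return the
    (FPR, TPR) pairs; plotting TPR against FPR gives the ROC curve."""
    order = np.argsort(-scores)
    labels = np.asarray(labels)[order]
    tpr = np.cumsum(labels) / labels.sum()
    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

def auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A perfect classifier, whose attack scores all exceed its normal scores, yields an AUC of 1.0; a classifier no better than chance yields about 0.5.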
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC per class: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).

Figure 17: Classification ROC diagram of the BP algorithm (AUC per class: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).

Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC per class: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC per class: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers in this area.
Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, there are still some issues in this paper that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors of this article suggest that, before analyzing the network flow, it is best to perform a certain degree of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main points of the next work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special AMI network protocols and establishing an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so as to apply it to real industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" for professional degree postgraduates (no. SJCX201970) of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96-111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350-355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854-3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese
[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005
[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin
[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018
[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
20 Mathematical Problems in Engineering
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] "UNSW-NB15 dataset," 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03, IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
prediction of one-dimensional time series, ESN has a very good advantage [32].

Because ESN has such advantages, it is used by more and more researchers to analyze and predict network attacks [33, 34].

Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that the method converges faster and performs better in IDS.
At present, some researchers have found through experiments that there are still some problems with the single-layer echo state network: (1) the model training is limited in that only the output weights can be adjusted; (2) the randomly generated reservoir has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying the echo state network to AMI network traffic anomaly detection requires improvement and optimization.
From the previous review, we can see that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, clustering-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.
The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method while solving the problem that a single-layer ESN network cannot be directly applied to complex AMI network traffic detection.
3. AMI Network Architecture and Security Issues
The AMI network is generally divided into three network layers from the bottom up: the home area network (HAN), the neighborhood area network (NAN), and the wide area network (WAN). The hierarchical structure is shown in Figure 1.
In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters, or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, aggregating data to the FAN's area data concentrator. Many NANs and FANs are interconnected through switches or routers to form a WAN, achieving communication with power company data and control centers.
The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].
As Fadwa and Zeyar [20] mention, AMI faces various security threats such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.
Researchers generally believe that AMI security detection, defense, and control rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after the attack.
4. Proposed Security Solution
At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices provide certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.
To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect AMI network traffic in real time; second, each traffic probe uploads a unified, standard traffic file to the control center; finally, network flow anomalies are analyzed in real time to improve the security detection and identification capabilities of AMI.
As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information of all nodes.
Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, data can also be processed by device and by area: for example, when collecting data from different devices, perform preprocessing such as data cleaning, data filtering, and data completion before data analysis, then use the Pearson and Gini coefficient methods described in this article to find the important feature correlations; using the ML-ESN algorithm to classify abnormal network attacks is likewise feasible.
The main reasons for adopting standardized processing are as follows:
(1) Improve the centralized processing and visual dis-play of network flow information
(2) Partly eliminate and overcome the problem of inadequate information collection caused by a single device or too few devices
(3) Use multiple devices to collect information and standardize the process to improve information fusion, thus enhancing the accuracy and robustness of classification
Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.
The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.
4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.
Metadata are composed of strings: each information element occupies a fixed position in the string, the elements are separated by "^", and the last element is also terminated by "^". In addition, an information element that is absent from a metadata record is represented as follows: if an information element defined below does not need to be filled in at its position, two "^" characters are adjacent at that point. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.
The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^2019-07-29T030823969^^^TCP^^^10107110^1010721241^^^^".
Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata
Figure 2: Traffic probe simple deployment diagram (electricity users, smart electric meters, flow probes, data concentrator, firewall, and data processing center).
Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT translated source IP
20  NatSrcPort        String    1024    NAT translated source port
21  NatDestIP         IPstring  1024    NAT translated destination IP
22  NatDestPort       String    1024    NAT translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol
Figure 1: AMI network layered architecture [35]. The figure shows smart meters, repeaters, smart home applications, energy storage, PHEV/PEV, data concentrators, and the utility centre, connected through the HAN (ZigBee, Bluetooth, RFID, PLC), the NAN (mesh network, Wi-Fi, WiMAX, PLC), and the WAN (fiber optic, WiMAX, satellite, BPL).
version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol (TCP).
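As an illustration, a caret-delimited probe record of the kind shown above can be split with a short routine. This is a sketch under our own assumptions: the function name is ours, and we assume a backslash serves as the escape string for literal carets, since the paper's escape convention is partly garbled.

```python
def parse_probe_record(record: str) -> list[str]:
    """Split a caret-delimited probe stream record into fields.

    Assumptions: fields are separated by '^', an empty field appears as
    two adjacent '^', and a literal caret inside a field is escaped
    (here assumed to be written as '\\^').
    """
    fields, buf, escape = [], [], False
    for ch in record:
        if escape:                 # previous char was the escape string
            buf.append(ch)
            escape = False
        elif ch == "\\":           # assumed escape character
            escape = True
        elif ch == "^":            # field separator / terminator
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    fields.append("".join(buf))    # trailing (possibly empty) field
    return fields

# First record of Figure 3: trailing '^^^^' yields empty fields
rec = "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^"
f = parse_probe_record(rec)
# f[0]: metadata version, f[2]/f[3]: source/destination IP,
# f[4]/f[5]: source/destination port, f[6]: protocol (6 = TCP)
```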
4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of information units, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of individual metadata could not be directly reflected, and second, the data dimensionality would be particularly high, resulting in particularly long computation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use a multilayer echo state network to classify probe flows and determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, and the three steps are as follows:
Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, aggregate the collected network flow metadata over time series or segments to obtain the statistical characteristics of each part of the network flow. Second, standardize the statistically obtained characteristic values according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, further filter the standardized features.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two sets to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to extract statistical behavior characteristics that well reflect the network traffic of different network attack behaviors.
Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.
As is well known, some simple characteristics can be extracted directly from network traffic, such as the source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.
In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.
First, network packets are aggregated into network flows, that is, distinguishing whether each network flow is generated by a different network behavior. Second, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packet sizes, and the forward/backward packet ratio.

Statistical characteristics of time: duration, and the maximum, minimum, average, and standard deviation of the forward and backward packet intervals.
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:
6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^
Figure 3: Part of the real probe stream data.
Time interval: maximum, minimum, and average interval time, and its standard deviation.
Packet size: maximum, minimum, and average size, and the packet size distribution.
Number of data packets: outgoing and incoming.
Data amount: input byte count and output byte count.
Stream duration: duration from start to end.
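The per-flow statistics listed above can be sketched as a small routine. The function and field names are our own illustrative choices, not taken from [36, 37]; inter-arrival and size statistics use the population standard deviation.

```python
from statistics import mean, pstdev

def flow_stats(timestamps, sizes):
    """Per-flow statistics of the kind listed above: packet/byte totals,
    stream duration, and packet-size / inter-arrival-time statistics."""
    # inter-arrival times between consecutive packets
    iats = [t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:])]
    return {
        "n_packets": len(sizes),
        "n_bytes": sum(sizes),
        "duration": timestamps[-1] - timestamps[0],
        "size_max": max(sizes),
        "size_min": min(sizes),
        "size_mean": mean(sizes),
        "size_std": pstdev(sizes),
        "iat_max": max(iats) if iats else 0.0,
        "iat_min": min(iats) if iats else 0.0,
        "iat_mean": mean(iats) if iats else 0.0,
        "iat_std": pstdev(iats) if iats else 0.0,
    }

# toy flow: four packets with arrival times (s) and sizes (bytes)
s = flow_stats([0.0, 0.1, 0.3, 0.6], [40, 40, 80, 40])
```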
Some of the main features of network traffic extracted in this paper are shown in Table 2.
4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, the attributes cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.
At present, the main feature standardization methods are Z-score, min-max, and decimal scaling [38].
There may be some nonnumeric data in the standard protocol fields, such as protocol names and TCP flags. Such data cannot be processed directly by standardization, so nonnumeric data need to be converted to numeric form; for example, the character string "dhcp" is changed to the value "1".
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:
x' = \frac{x - \bar{x}}{\delta}, (1)

where \bar{x} is the mean of the original data, \delta is the standard deviation of the original data, \text{std} = ((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2)/n with n the number of samples per feature, and \delta = \sqrt{\text{std}}.
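Formula (1) can be sketched as follows; the function name is our own, and the population standard deviation is used, as in the formula.

```python
import math

def z_score(values):
    """Z-score standardization per formula (1): x' = (x - mean) / delta,
    with delta the population standard deviation of the feature column."""
    n = len(values)
    m = sum(values) / n
    delta = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / delta for v in values]

# a standardized feature column has zero mean and unit variance
z = z_score([40, 40, 80, 40])
```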
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, which is a very difficult problem. The filter method is the currently popular feature filtering approach: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects the important features that meet requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.
Because the power probe flow contains many statistical characteristics, and the main characteristics of different types of attacks differ, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain, in order to quickly locate the important characteristics of different attacks.
The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple and therefore more suitable for real-time processing of large-scale power probe streams.
The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its calculation \rho_{xy} is shown in the following formula:
\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y}, (2)
where cov(x, y) is the covariance of x and y, \sigma_x is the standard deviation of x, and \sigma_y is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted by r:
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}, (3)
where n is the number of samples, x_i and y_i are the observations at point i corresponding to variables x and y, \bar{x} is the mean of the x samples, and \bar{y} is the mean of the y samples. The value of r lies between -1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework (probes and traffic collection; feature extraction with statistical flow characteristics, standardized features, and characteristic filtering; construction of the multilayer echo state network; classification, verification, and performance evaluation).
Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
...
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Time interval standard deviation of backward packets
31  duration        Network flow duration
the value is -1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly independent.
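Formula (3) can be sketched directly; the function name is our own.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient per formula (3)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)
                    * sum((y - my) ** 2 for y in ys))
    return num / den

# perfectly correlated features give r = 1, anti-correlated give r = -1
r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_neg = pearson_r([1, 2, 3], [3, 2, 1])
```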
Because the Pearson method can only detect linear relationships between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the level of the data distribution, the nonlinear relationship between the selected characteristics and the network attack behavior.
In a classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:
\operatorname{Gini}(P) = \sum_{i=1}^{K} P_i(1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2. (4)
Given the sample set D, the Gini coefficient is expressed as follows:
\operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left(\frac{|C_k|}{|D|}\right)^2, (5)
where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
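Formula (5) can be sketched over a list of class labels; the names are our own.

```python
from collections import Counter

def gini_index(labels):
    """Gini index of a label multiset per formula (5):
    Gini(D) = 1 - sum_k (|C_k| / |D|)^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

pure = gini_index(["a", "a", "a"])          # one class only
mixed = gini_index(["a", "a", "b", "b"])    # two balanced classes
```

A pure set has Gini index 0; the index grows as the class distribution becomes more mixed, which is why it can flag features whose value ranges separate attack classes poorly.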
5. ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.
In this model, the network has 3 layers: the input layer, the hidden layer (reservoir), and the output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then:
U(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T. (6)
W_{in} (N \times K) represents the connection weights from the input layer to the reservoir; W (N \times N) represents the connection weights from x(t-1) to x(t); W_{out} (L \times (K + N + L)) represents the connection weights from the reservoir to the output layer; and W_{back} (N \times L) represents the connection weights from y(t-1) to x(t), this matrix being optional.
When u(t) is input the updated state equation of thereservoir is given by
x(t+1) = f(W_{in} u(t+1) + W x(t) + W_{back} y(t)), (7)
where f is the selected activation function and f' is the activation function of the output layer. Then the output state equation of the ESN is given by:
y(t+1) = f'(W_{out} [u(t+1); x(t+1)]). (8)
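Equations (7) and (8) can be sketched with NumPy. All concrete choices here are our own assumptions, not fixed by the paper: the toy sizes K, N, L, the tanh activation, a linear readout, and scaling the reservoir's spectral radius below 1 (a common echo-state heuristic).

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 3, 20, 2                      # input, reservoir, output sizes (toy)
Win = rng.uniform(-0.5, 0.5, (N, K))    # input-to-reservoir weights
W = rng.uniform(-0.5, 0.5, (N, N))      # reservoir internal weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius -> 0.9
Wback = rng.uniform(-0.5, 0.5, (N, L))  # optional output feedback
Wout = rng.uniform(-0.5, 0.5, (L, K + N))  # readout over [u; x]

x = np.zeros(N)                         # reservoir state
y = np.zeros(L)                         # previous output
u = rng.uniform(-1, 1, K)               # one input sample

# Equation (7): reservoir state update, including the optional Wback term
x = np.tanh(Win @ u + W @ x + Wback @ y)
# Equation (8): readout over the concatenated input and state
y = Wout @ np.concatenate([u, x])
```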
Researchers have found through experiments that the traditional echo state network reservoir is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome the existing problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]:
x_1(n+1) = f(W_{in} u(n+1) + W_1 x_1(n)),
x_k(n+1) = f(W_{inter}^{(k-1)} x_{k-1}(n+1) + W_k x_k(n)),
\vdots
x_M(n+1) = f(W_{inter}^{(M-1)} x_{M-1}(n+1) + W_M x_M(n)). (9)
The ML-ESN output is then calculated from the states in formula (9):
y(n+1) = f_{out}(W_{out} x_M(n+1)). (10)
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations may occur.
Figure 5: ESN basic model (input U(t) feeding the reservoir state x(t) through W_in; internal weights W; readout W_out producing y(t); optional feedback W_back).
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, mapping the input feature value set from the low-dimensional input space to a high-dimensional state space; the network is then trained and learned on the high-dimensional state space using linear regression and similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the single-layer ESN is not suitable for directly classifying AMI network traffic anomalies.
By contrast, when a single reservoir is small, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform: for example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly for processing. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN},
\text{FPR} = \frac{FP}{FP + TN},
\text{TPR} = \frac{TP}{FN + TP},
\text{precision} = \frac{TP}{TP + FP},
\text{recall} = \frac{TP}{FN + TP},
F\text{-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}. (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:
TP (true positive): the number of abnormal network traffic flows successfully detected. TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer U(t) feeding reservoir 1 through W_in; reservoirs 1 to M with internal weights W_1, ..., W_M, linked by W_inter; readout W_out producing y(t)).
FP (false positive): the number of normal network traffic flows identified as abnormal. FN (false negative): the number of abnormal network traffic flows identified as normal.
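Formula (11) can be sketched as follows; note that the false-positive rate is taken over the actually-normal flows, FP/(FP + TN). The counts in the usage line are illustrative only.

```python
def metrics(tp, tn, fp, fn):
    """Evaluation indicators of formula (11) from a confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                  # share of normal flows flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # same as TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

# illustrative counts: 100 abnormal flows, 100 normal flows
acc, fpr, prec, rec, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
```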
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is simply omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
Input:
  D1: training dataset; D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the parameters of ML-ESN and determine the corresponding number of input and output units according to the dataset:
    (i) set the training data length trainLen;
    (ii) set the test data length testLen;
    (iii) set the number of reservoirs Ri;
    (iv) set the number of neurons in each reservoir N;
    (v) set the reservoir update rate α;
    (vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
    (i) randomly initialize the values of W_in, W_i, and W_inter;
    (ii) through statistical normalization and spectral radius calculation, rescale W_inter and W_i to meet the sparsity requirements. The calculation formulas are W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
    (i) for t from 1 to T:
        (a) calculate x_1(t) according to equation (7);
        (b) for i from 2 to M, calculate x_i(t) according to equations (7) and (9);
        (c) get the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
    (i) W_out = DH^T(HH^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
    (i) select the SoftMax activation function and calculate the output f_out value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifier, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
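The core of Algorithm 1 can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' code: the state updates are simplified stand-ins for the paper's equations (7) and (9), sizes are toy values, and only the spectral-radius rescaling (α = 0.9), state collection, and ridge-regression readout (β = 1.0 × 10⁻⁶) from Table 4 and steps (2)-(5) are reproduced.

```python
import numpy as np

rng = np.random.default_rng(50)  # random seed value from Table 4

def scaled(shape, alpha):
    """Random weights rescaled so the spectral radius equals alpha
    (the normalization step (2)(ii) of Algorithm 1)."""
    w = rng.uniform(-0.5, 0.5, shape)
    return alpha * w / max(abs(np.linalg.eigvals(w)))

def ml_esn_fit(U, D, n_res=3, n=40, alpha=0.9, beta=1e-6):
    """Drive a chain of n_res reservoirs with inputs U (T x d) and solve the
    readout for targets D (T x k) by ridge regression (steps (3)-(4))."""
    T, d = U.shape
    w_in = rng.uniform(-0.5, 0.5, (n, d))                        # input weights W_in
    w_res = [scaled((n, n), alpha) for _ in range(n_res)]        # internal weights W_i
    w_inter = [scaled((n, n), alpha) for _ in range(n_res - 1)]  # inter-reservoir W_inter
    x = [np.zeros(n) for _ in range(n_res)]                      # x_i(0) = 0
    H = np.zeros((T, n + d))                                     # state collection matrix
    for t in range(T):
        x[0] = np.tanh(w_in @ U[t] + w_res[0] @ x[0])            # first reservoir
        for i in range(1, n_res):                                # deeper reservoirs
            x[i] = np.tanh(w_inter[i - 1] @ x[i - 1] + w_res[i] @ x[i])
        H[t] = np.concatenate([x[-1], U[t]])                     # H = [x(t); u(t)]
    # W_out = D^T H (H^T H + beta I)^(-1): the ridge-regression readout
    w_out = D.T @ H @ np.linalg.inv(H.T @ H + beta * np.eye(n + d))
    return w_out, H

def ml_esn_predict(w_out, H):
    """SoftMax over the linear readout, as in step (5)."""
    z = H @ w_out.T
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

With the paper's settings, U would have five columns (the selected features) and D would be one-hot over ten classes; the toy defaults here just keep the sketch fast.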
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, deduplication, completion, and normalization to obtain normalized and standardized data; standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
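As an illustration of Step 2, the sketch below (hypothetical helper names, not the authors' code) shows three of the basic operations on a raw column: replacing the special symbol "-", label-encoding character-type features such as proto/service/state, and the z-score standardization displayed in Figure 7.

```python
import math

def clean(column, default=0.0):
    """Data cleaning: replace the special symbol '-' with a default value."""
    return [default if v == "-" else float(v) for v in column]

def label_encode(column):
    """Map character-type features (e.g. proto, service, state) to integer codes."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in column]

def standardize(column):
    """Z-score standardization: (x - mean) / std."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / std for x in column]

print(label_encode(["-", "http", "dns", "http"]))  # [0, 1, 2, 1]
print(standardize(clean(["1", "2", "-", "3"])))    # '-' becomes 0.0 before scaling
```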
As can be seen from Figure 8, after normalization most of the attack-type data are concentrated between 0.4 and 0.6, Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini index for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, while the correlation between spkts and ct_srv_src (the number of connections containing the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial Pearson correlation threshold is set to 0.5: redundant features whose Pearson value exceeds 0.5 are discarded, and features below 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (components of TCP connection setup time: the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service equal 1. From the principle of the Gini coefficient, the smaller a feature's Gini value, the lower its impurity in the dataset and the better its training effect.
Based on the Pearson and Gini coefficient results for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter in ms), and dtcpb (destination TCP base sequence number).
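The filtering described in Step 3 can be sketched as follows. This is an illustrative sketch with toy data, not the authors' pipeline: `redundant_pairs` applies the 0.5 Pearson threshold stated above, and `gini` is a simple distribution-impurity proxy, not necessarily the exact Gini index computation used in the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two feature columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def redundant_pairs(features, threshold=0.5):
    """Feature pairs whose |Pearson| exceeds the paper's 0.5 threshold;
    one member of each pair is a candidate for removal."""
    names = list(features)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if abs(pearson(features[a], features[b])) > threshold]

def gini(values, n_bins=4):
    """Impurity of a discretized feature: lower means purer (more useful)."""
    lo, hi = min(values), max(values)
    bins = [0] * n_bins
    for v in values:
        bins[min(int((v - lo) / (hi - lo + 1e-12) * n_bins), n_bins - 1)] += 1
    return 1.0 - sum((c / len(values)) ** 2 for c in bins)

# Hypothetical toy columns: 'spkts' and 'sloss' nearly linear,
# mirroring the 0.97 correlation reported for UNSW_NB15.
feats = {"spkts": [1, 2, 3, 4, 5],
         "sloss": [1.1, 2.0, 3.2, 3.9, 5.1],
         "rate":  [5, 1, 4, 2, 3]}
print(redundant_pairs(feats))  # [('spkts', 'sloss')]
```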
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment; the specific values are shown in Table 4.
In Table 4, the input dimension is determined by the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, dbytes).
Figure 8: Normalized data distribution.
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, under the same dataset, model training time gradually increases as the number of reservoirs increases, but detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is to generate, from the reservoirs, a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained by linearly combining the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with an average of 0, which is more conducive to training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because sigmoid's output lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, with a ratio of normal to abnormal (attack) packets of 0.46:1.
The test dataset contains 82,311 data packets, with a normal-to-abnormal ratio of 0.45:1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experiments were run on the Windows 10 Home 64-bit operating system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using filtering is generally better than not using it: whether for small or large data samples, the classification effect without filtering is lower than with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, on the 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 without filtering, 0.95 with Pearson filtering alone, and 0.97 with Gini filtering alone, while the combined Pearson and Gini filtering reaches 0.99.
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indexes, then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment.

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10⁻⁶
algorithm to learn, and finally uses the test data to verify the trained model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is entirely feasible to quickly classify anomalous network traffic attacks with the ML-ESN learning model when the traffic features have been filtered and optimized by the combination of Pearson and Gini coefficients.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, model training time increases with reservoir depth; for example, with 1000 neurons, a depth of 5 takes 21.1 ms while a depth of 3 takes only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time it consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model first increases gradually with reservoir depth; for example, with a depth of 3 and 1000 neurons the detection accuracy is 0.96, while with a depth of 2 and 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at the beginning, the model's training parameters are gradually optimized as the depth increases, so the training accuracy also improves continually. However, when the model depth increases to 5, a certain amount of overfitting occurs, which causes the accuracy to decrease.
From the results in Figure 13(c), the overall performance of the proposed method is better than the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree reaches only 0.77. These results show that, after self-learning, the proposed method detects different attack types well.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods (None, Pearson, Gini, Pearson + Gini) across dataset sizes from 20,000 to 160,000 packets.
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B lie between 5 and 10, and only a few exceed 10.
Second, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales (5,000, 20,000, 60,000, 120,000, and 160,000 records), each containing the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially GaussianNB, whose accuracy falls below 50%, while the other algorithms are close to 80%.
In contrast, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy reaches 96.75%, and on the 160,000-record dataset it reaches 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment uses ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR per attack type; accuracy 0.94-1.0, F1-score 0.94-0.99, FPR 0.01-0.02).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for 500, 1000, and 2000 neurons at depths 2-5; (b) accuracy for 500, 1000, and 2000 neurons at depths 2-5; (c) accuracy and time comparison of BP, ESN, DecisionTree, and ML-ESN.
Figure 14: Distribution map of the first two statistical characteristics.
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
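The ROC/AUC evaluation described above can be sketched as follows (an illustrative implementation, not the paper's plotting code; score ties are not handled specially here): thresholds are swept over the decision scores to trace (FPR, TPR) points, and the AUC is the trapezoid-rule area under that curve.

```python
def roc_points(scores, labels):
    """Sweep thresholds over scores; return (FPR, TPR) points of the ROC graph."""
    pairs = sorted(zip(scores, labels), reverse=True)  # most confident first
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:
        if y:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Perfectly separated scores give AUC = 1.0.
print(auc(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])))  # 1.0
```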
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15: Detection results of GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN under different data sizes.
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC per class: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 17: Classification ROC diagram of the BP algorithm (AUC per class: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).

Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC per class: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC per class: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
the other attack types are 99%. In the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
7 Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, highlights some problems in AMI network security, and introduces the contributions of existing researchers to AMI network security.
Second, in order to solve the existing methods' problems of low accuracy and high false-positive rates on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the method's limitations in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) studying the special AMI network protocols and establishing an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it to real industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A Maamar and K Benahmed ldquoA hybrid model for anomaliesdetection in AMI system combining k-means clustering anddeep neural networkrdquo Computers Materials amp Continuavol 60 no 1 pp 15ndash39 2019
[2] Y Liu Safety Protection Technology of Electric Energy Mea-surement Collection and Billing China Electric Power PressBeijing China 2014
[3] B M Nasim M Jelena B M Vojislav and K Hamzeh ldquoAframework for intrusion detection system in advancedmetering infrastructurerdquo Security and Communication Net-works vol 7 no 1 pp 195ndash205 2014
[4] H Ren Z Ye and Z Li ldquoAnomaly detection based on adynamic Markov modelrdquo Information Sciences vol 411pp 52ndash65 2017
[5] F Fathnia and D B M H Javidi ldquoDetection of anomalies insmart meter data a density-based approachrdquo in Proceedings ofthe 2017 Smart Grid Conference (SGC) pp 1ndash6 Tehran Iran2017
[6] Z Y Wang G J Gong and Y F Wen ldquoAnomaly diagnosisanalysis for running meter based on BP neural networkrdquo inProceedings of the 2016 International Conference on Com-munications Information Management and Network SecurityGold Coast Australia 2016
[7] M Stephen H Brett Z Saman and B Robin ldquoAMIDS amulti-sensor energy theft detection framework for advancedmetering infrastructuresrdquo IEEE Journal on Selected Areas inCommunications vol 31 no 7 pp 1319ndash1330 2013
[8] Y Chen J Tao Q Zhang et al ldquoSaliency detection via im-proved hierarchical principle component analysis methodrdquoWireless Communications and Mobile Computing vol 2020Article ID 8822777 12 pages 2020
Mathematical Problems in Engineering 19
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, IEEE, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, IEEE, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, IEEE, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
learning classification algorithms according to their actual conditions.
The goal is the same as that of this article: to quickly find abnormal network attacks in a large volume of network flow data.
4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.
Metadata are composed of strings: each information element occupies a fixed position in the string, the elements are separated by "^", and the last element is also terminated by "^". An information element that does not exist in a metadata record is handled as follows: if an element defined below does not appear in a record, its position is simply left empty, so two "^" characters are adjacent at that point. If an extracted information element itself contains a caret, it is escaped with the escape string. Part of the real probe stream data is shown in Figure 3.
The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".
Part of the above probe flow is explained as follows according to the metadata standard definition: (1) 6: metadata
Figure 2: Traffic probe simple deployment diagram (smart electric meter with flow probe, data concentrator with firewall, data processing center, and electricity users).
Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT translated source IP
20  NatSrcPort        String    1024    NAT translated source port
21  NatDestIP         IPstring  1024    NAT translated destination IP
22  NatDestPort       String    1024    NAT translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol
Figure 1: AMI network layered architecture [35]. Smart meters, smart home applications, energy storage, and PHEV/PEV devices sit in the HAN (Zigbee, Bluetooth, RFID, PLC); repeaters and data concentrators form the NAN (mesh network, Wi-Fi, WiMAX, PLC); and the utility centre is reached over the WAN (fiber optic, WiMAX, satellite, BPL).
version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol, i.e., TCP.
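Such a caret-separated record can be unpacked with a few lines of code. This is a minimal sketch assuming Python; the field names and the protocol-number mapping are illustrative choices, not the official IPFIX information-element identifiers:

```python
# Minimal sketch of parsing a caret-separated probe stream record.
# Field order follows the explanation above (version, metadata ID,
# source/destination IP and port, protocol number).

PROTO_NAMES = {"1": "ICMP", "6": "TCP", "17": "UDP"}  # illustrative mapping

def parse_probe_record(record: str) -> dict:
    # Empty positions appear as adjacent '^' separators and parse to ''.
    fields = record.split("^")
    return {
        "version": fields[0],
        "metadata_id": fields[1],
        "src_ip": fields[2],
        "dst_ip": fields[3],
        "src_port": fields[4],
        "dst_port": fields[5],
        "protocol": PROTO_NAMES.get(fields[6], fields[6]),
    }

record = ("6^69085d3e5432360300000000^10107110^1010721241"
          "^19341^22^6^40^1^40^1^1564365874^1564365874^^^^")
print(parse_probe_record(record)["protocol"])  # TCP
```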
4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of elements, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of a single metadata element could not be directly reflected, and second, the analysis data dimensions would be particularly high, resulting in particularly long computation times. Therefore, the original probe stream metadata cannot be used directly and need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, and the three steps are as follows:
Step 1: collect network flow metadata information in real time through the network probe flow collection devices deployed in different areas.
Step 2: first, compute statistical characteristics over the collected network flow metadata, by time series or by segment; second, standardize the statistically obtained characteristic values according to certain data standardization guidelines; finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, further filter the standardized features.
Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two parts to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to extract statistical behavior characteristics that better distinguish the network traffic of different network attack behaviors.
Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it is the set of all network data packets with the same five-tuple within a limited time, together with the data characteristics carried by the related packets in the set.
As is well known, some simple characteristics can be extracted directly from the network, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.
In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.
Firstly, network packets are aggregated into network flows, that is, each network flow is distinguished according to the network behavior that generated it. Secondly, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:
Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward/backward packet ratio.
Statistical characteristics of time: duration, and the maximum, minimum, average, and standard deviation of the forward and backward packet intervals.
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:
6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^
Figure 3: Part of the real probe stream data.
Time interval: maximum, minimum, average interval time, and standard deviation.
Packet size: maximum, minimum, average size, and packet distribution.
Number of data packets: outgoing and incoming.
Data amount: input byte amount and output byte amount.
Stream duration: duration from start to end.
Some of the main features of network traffic extracted in this paper are shown in Table 2.
4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.
At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.
Because there may be some nonnumeric data in the standard protocol, such as protocol names, IPs, and TCP flags, these data cannot be processed directly by standardization, so nonnumeric data first need to be converted to numeric form. For example, the character string "dhcp" is changed to the value "1".
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:
$x' = \frac{x - \bar{x}}{\delta}$, (1)

where $\bar{x}$ is the mean value of the original data and $\delta$ is the standard deviation of the original data, $\delta = \sqrt{((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2)/n}$, with n the number of samples per feature.
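Formula (1) can be sketched in a few lines of plain Python (in practice a library routine such as scikit-learn's StandardScaler computes the same transform); the sample values below are illustrative:

```python
# Z-score standardization as in formula (1): x' = (x - mean) / std,
# with std computed over the n samples of one feature.
import math

def z_score(values):
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Illustrative feature column (e.g., packet sizes in bytes).
scaled = z_score([40.0, 56.0, 80.0, 28.0])
print([round(s, 3) for s in scaled])
```

After the transform, each feature column has zero mean and unit standard deviation, which puts attributes with very different value ranges on a comparable scale.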
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature filtering approach: it regards features as independent objects, evaluates their importance according to quality metrics, and selects the important features that meet the requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.
Because the power probe flow contains many statistical characteristics and the main characteristics of different types of attacks differ, in order to quickly locate the important characteristics of each attack type, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain.
The Pearson coefficient is used to calculate the correlation of feature data, mainly because its calculation is efficient and simple and is therefore suitable for real-time processing of large-scale power probe streams.
The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation $\rho_{xy}$ is shown in the following formula:

$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y}$, (2)
where cov(x, y) is the covariance of x and y, $\sigma_x$ is the standard deviation of x, and $\sigma_y$ is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted r:
$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$, (3)
where n is the number of samples, $x_i$ and $y_i$ are the observations at point i corresponding to variables x and y, $\bar{x}$ is the mean of the x samples, and $\bar{y}$ is the mean of the y samples. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework. Probes collect flow traffic; statistical flow characteristics are extracted, standardized, and filtered; a multilayer echo state network is constructed; and classification, verification, and performance evaluation follow.
Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
...
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Time interval standard deviation of backward packets
31  duration        Network flow duration
the value is −1, there is a completely negative correlation between the two random variables; when the value is 0, the two random variables are linearly independent.
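Formula (3) amounts to a short computation per feature pair; a plain-Python sketch (the sample vectors are illustrative):

```python
# Sample Pearson correlation coefficient, formula (3).
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x) *
                    sum((yi - my) ** 2 for yi in y))
    return num / den

# Two perfectly positively correlated feature columns give r = 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

Feature pairs whose |r| is near 1 carry largely redundant information, which is exactly what the filtering step below exploits.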
Because the Pearson method can only detect the linear relationship between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.
In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is $P_i$, the Gini index of the probability distribution is defined as follows [39]:

$\operatorname{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2$. (4)

Given the sample set D, the Gini coefficient is expressed as follows:

$\operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2$, (5)

where $C_k$ is the subset of samples in D belonging to the kth class and K is the number of classes.
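Formula (5) over a labeled sample set reduces to counting class frequencies; a small sketch with illustrative labels:

```python
# Gini index of a label set, formula (5): Gini(D) = 1 - sum((|Ck|/|D|)^2).
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["normal"] * 5))          # 0.0 -> a pure set
print(gini(["normal", "dos"] * 5))   # 0.5 -> two maximally mixed classes
```

A smaller Gini value means lower impurity, which is why features with small Gini values are preferred below.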
5. ML-ESN Classification Method

ESN is a type of recurrent neural network proposed by Jaeger in 2001, and it has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has 3 layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then
$U(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^{T}$,
$x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^{T}$, (6)
$y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^{T}$.
$W_{in}$ ($N \times K$) represents the connection weights from the input layer to the reservoir, $W$ ($N \times N$) represents the connection weights from x(t − 1) to x(t), $W_{out}$ ($L \times (K + N + L)$) represents the connection weights from the reservoir to the output layer, and $W_{back}$ ($N \times L$) represents the connection weights from y(t − 1) to x(t); this last matrix is optional.
When u(t) is input, the updated state equation of the reservoir is given by

$x(t+1) = f(W_{in} u(t+1) + W x(t) + W_{back} y(t))$, (7)

where f is the selected activation function of the reservoir and f′ is the activation function of the output layer. Then the output state equation of ESN is given by

$y(t+1) = f'(W_{out}[u(t+1); x(t+1)])$. (8)
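One state update in the style of formulas (7) and (8) can be sketched with NumPy; the sizes, uniform random weights, tanh activation, identity readout, and the omission of the optional $W_{back}$ feedback are all illustrative assumptions:

```python
# One ESN state update, formulas (7)-(8), with the optional W_back omitted.
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 5, 50, 2                          # input, reservoir, output sizes
W_in = rng.uniform(-0.5, 0.5, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # rescale spectral radius to 0.9
W_out = rng.uniform(-0.5, 0.5, (L, K + N))  # here trained weights would go

x = np.zeros(N)                             # reservoir state x(t)
u = rng.standard_normal(K)                  # one input vector u(t+1)
x = np.tanh(W_in @ u + W @ x)               # formula (7)
y = W_out @ np.concatenate([u, x])          # formula (8), identity readout
print(y.shape)  # (2,)
```

Rescaling W so its spectral radius stays below 1 is the usual way to preserve the echo state property.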
Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems of ESN, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while the multilayer version has more than one. The updated state equations of ML-ESN are given by [41]:

$x_1(n+1) = f(W_{in} u(n+1) + W_1 x_1(n))$,
$x_k(n+1) = f(W_{inter(k-1)} x_{k-1}(n+1) + W_k x_k(n))$, (9)
⋮
$x_M(n+1) = f(W_{inter(M-1)} x_{M-1}(n+1) + W_M x_M(n))$.
The ML-ESN output is then calculated from the states in formula (9):

$y(n+1) = f_{out}(W_{out} x_M(n+1))$. (10)
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations will occur.
Figure 5: ESN basic model. The input U(t) feeds the reservoir state x(t) through Win; W connects the reservoir to itself; Wout produces the output y(t); Wback optionally feeds the output back into the reservoir.
As can be seen from Figure 5, ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for data information, then map the input feature value set from the low-dimensional input space to the high-dimensional state space, and finally train the network on the high-dimensional state space using linear regression and similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, ESN is not suitable for directly classifying AMI network traffic anomalies.
By contrast, even when each single reservoir is small, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper therefore selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex modern network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be processed directly. Before processing, the data are standardized, and some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$,

$\mathrm{FPR} = \frac{FP}{FP + TN}$,

$\mathrm{TPR} = \frac{TP}{TP + FN}$,

$\mathrm{precision} = \frac{TP}{TP + FP}$, (11)

$\mathrm{recall} = \frac{TP}{TP + FN}$,

$F\text{-}\mathrm{score} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$.
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:
TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model. The input U(t) feeds reservoir 1 through Win; reservoirs 1 to M (states x1, x2, ..., xM, internal weights W1, ..., WM) are chained by Winter; Wout produces the output y(t) from the last reservoir.
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
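Given these four counts, the indicators of formula (11) follow directly; a small helper, using the standard false-positive-rate denominator FP + TN (the example counts are illustrative):

```python
# Evaluation indicators of formula (11) from the TP/TN/FP/FN counts above.
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                  # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # identical to TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

acc, fpr, prec, rec, f1 = evaluate(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(fpr, 3), round(f1, 3))  # 0.925 0.05 0.923
```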
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
(1) Input:
(2) D1: training dataset.
(3) D2: test dataset.
(4) U(t): input feature value set.
(5) N: the number of neurons in each reservoir.
(6) Ri: the number of reservoirs.
(7) α: interconnection weight spectral radius.
(8) Output:
(9) Training and testing classification results.
(10) Steps:
(11) (1) Initially set the parameters of ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons in each reservoir N;
(v) set the reservoir update speed α;
(vi) set $x_i(0) = 0$ ($1 \le i \le M$).
(12) (2) Initialize the input connection weight matrix $w_{in}$, the internal connection weights of the reservoirs $w_i$ ($1 \le i \le M$), and the weights of the external connections between reservoirs $w_{inter}$:
(i) randomly initialize the values of $w_{in}$, $w_i$, and $w_{inter}$;
(ii) through statistical normalization and spectral radius calculation, rescale $w_{inter}$ and $w_i$ to meet the sparsity requirements. The calculation formulas are $w_i = \alpha(w_i/|\lambda_{in}|)$ and $w_{inter} = \alpha(w_{inter}/|\lambda_{inter}|)$, where $\lambda_{in}$ and $\lambda_{inter}$ are the spectral radii of the $w_i$ and $w_{inter}$ matrices, respectively.
(13) (3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T, compute $x_1(t)$:
(a) calculate $x_1(t)$ according to equation (7);
(b) for i from 2 to M, calculate $x_i(t)$ according to equations (7) and (9);
(c) get the matrix H: $H = [x(t+1); u(t+1)]$.
(14) (4) Solve the weight matrix $W_{out}$ from the reservoir to the output layer to obtain the trained ML-ESN network structure:
(i) $W_{out} = DH^{T}(HH^{T} + \beta I)^{-1}$, where β is the ridge regression parameter, I is the identity matrix, and $D = [e(t)]$ and $H = [x(t+1); u(t+1)]$ are the expected output matrix and the state collection matrix.
(15) (5) Calculate the ML-ESN output according to formula (10):
(i) select the SoftMax activation function and calculate the output $f_{out}$ value.
(16) (6) The data in D2 are input into the trained ML-ESN network, the corresponding category identifiers are obtained, and the classification error rate is calculated.

Algorithm 1: AMI network traffic classification.
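The core of the algorithm, driving M stacked reservoirs with equation (9) and solving the readout by ridge regression, might be sketched as follows. All sizes, the random placeholder data, and the plain linear readout (in place of SoftMax) are illustrative assumptions, not the authors' configuration:

```python
# Sketch of Algorithm 1's core: stacked reservoirs (equation (9)) plus the
# ridge-regression readout W_out = D H^T (H H^T + beta*I)^(-1) of step (4).
import numpy as np

rng = np.random.default_rng(1)
K, N, M, T, L = 5, 30, 3, 200, 2   # inputs, neurons/reservoir, reservoirs, steps, outputs
beta = 1e-6                        # ridge regression parameter

def init_weights(shape, rho=0.9):
    w = rng.uniform(-0.5, 0.5, shape)
    # Rescale square (reservoir) matrices to spectral radius rho.
    return w * (rho / max(abs(np.linalg.eigvals(w)))) if shape[0] == shape[1] else w

W_in = init_weights((N, K))
W = [init_weights((N, N)) for _ in range(M)]           # internal weights W_1..W_M
W_inter = [init_weights((N, N)) for _ in range(M - 1)] # reservoir-to-reservoir weights

U = rng.standard_normal((T, K))    # input feature sequence (placeholder)
D = rng.standard_normal((L, T))    # desired outputs (placeholder labels)

x = [np.zeros(N) for _ in range(M)]
H = np.zeros((N + K, T))           # collected states [x_M(t); u(t)]
for t in range(T):
    x[0] = np.tanh(W_in @ U[t] + W[0] @ x[0])              # first reservoir
    for k in range(1, M):                                   # equation (9)
        x[k] = np.tanh(W_inter[k - 1] @ x[k - 1] + W[k] @ x[k])
    H[:, t] = np.concatenate([x[M - 1], U[t]])

W_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(N + K))
print(W_out.shape)  # (2, 35)
```

Because only W_out is trained, there is no backpropagation through time, which is the property the paper relies on for fast training.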
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data; standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalization most of the attack type data are concentrated between 0.4 and 0.6, but Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.
Step 3 Calculate the Pearson coefficient value and the Giniindex for the standardized data In the experiment thePearson coefficient value and the Gini index for theUNSW_NB15 standardized data are as shown in Figures 9and 10 respectively
It can be observed from Figure 9 that the Pearson coef-ficients between features are quite different for example thecorrelation between spkts (source to destination packet count)and sloss (source packets retransmitted or dropped) is rela-tively large reaching a value of 097 However the correlationbetween spkts and ct_srv_src (no of connections that containthe same service and source address in 100 connectionsaccording to the last time) is the smallest only minus0069
In the experiment in order not to discard a largenumber of valuable features at the beginning but to retainthe distribution of the original data as much as possible theinitial value of the Pearson correlation coefficient is set to05 Features with a Pearson value greater than 05 will bediscarded and features less than 05 will be retained
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and the ACK packets) all exceed 0.9, a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features, and these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of Gini coefficients, it is known that the smaller the Gini coefficient value of a feature, the lower the impurity of that feature in the dataset, and the better the training effect of the feature.
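The Gini computation described here can be sketched as follows. `feature_gini` is a hypothetical helper, not the authors' code: it partitions the samples by feature value and returns the weighted Gini impurity of the class labels, so lower values indicate a purer, more useful feature.

```python
from collections import Counter

def gini_impurity(labels):
    # Gini impurity of a label multiset: 1 - sum(p_i^2)
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def feature_gini(values, labels):
    # Weighted Gini impurity after partitioning samples by feature value;
    # a perfectly class-separating feature scores 0.0.
    n = len(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())
```

A feature whose values align exactly with the class labels yields 0.0, while a feature whose values mix the classes evenly yields 0.5 for two balanced classes.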
Based on the results of the Pearson and Gini coefficients for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter in ms), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. Relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined according to the number of selected features. For example, in the
      dur          proto        service      state        spkts        dpkts        sbytes       dbytes
0    −0.19102881   0.151809388  −0.70230738  −0.40921807  −0.10445581  −0.1357688   −0.04913362  −0.10272556
1    −0.10948479   0.151809388  −0.70230738  −0.40921807  −0.04601353   0.172598967 −0.04640996   0.188544124
2     0.040699218  0.151809388  −0.70230738  −0.40921807  −0.08984524  −0.02693312  −0.04852709  −0.01213277
3     0.049728681  0.151809388   0.599129702 −0.40921807  −0.0606241   −0.06321168  −0.04701649  −0.09856278
4    −0.14041703   0.151809388  −0.70230738  −0.40921807  −0.07523467  −0.11762952  −0.04755436  −0.10205729
5    −0.15105199   0.151809388  −0.70230738  −0.40921807  −0.07523467  −0.11762952  −0.04755436  −0.10205729
6    −0.11145895   0.151809388  −0.70230738  −0.40921807  −0.07523467  −0.09949024  −0.04755436  −0.10145863
7    −0.12928625   0.151809388  −0.70230738  −0.40921807  −0.07523467  −0.09949024  −0.04755436  −0.10145863
8    −0.12599609   0.151809388  −0.70230738  −0.40921807  −0.07523467  −0.09949024  −0.04755436  −0.10145863
Figure 7: Partial feature data after standardization.
[Figure 8: Normalized data distribution. Box plots of the normalized values (0.0 to 1.0) for each data label: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal.]
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10, and these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, under the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically; it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is that the reservoir generates a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
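The reservoir dynamics described above can be sketched for a single reservoir as below (ML-ESN stacks several such reservoirs). The weight scaling and the leaky-integrator update are standard ESN practice, not the authors' exact code; the leak rate, seed, and regularization follow Table 4, and the reservoir size is reduced from 1000 to 100 so the sketch runs quickly. The readout is trained by ridge regression, which is why no backpropagation is needed.

```python
import numpy as np

rng = np.random.default_rng(50)      # random seed, as in Table 4
n_in, n_res, n_out = 5, 100, 10      # 5 features, 10 classes; 100 neurons here (1000 in the paper)
alpha, ridge = 0.9, 1e-6             # update (leak) rate and regularization rate from Table 4

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
# Rescale so the spectral radius is below 1 (echo state property)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def run_reservoir(U):
    # U: (T, n_in) input sequence -> (T, n_res) reservoir states
    x = np.zeros(n_res)
    states = []
    for u in U:
        x = (1 - alpha) * x + alpha * np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def train_readout(X, Y):
    # Ridge-regression readout: W_out = (X^T X + ridge I)^-1 X^T Y
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
```

Because tanh is bounded, every reservoir state stays inside [−1, 1], which matches the discussion of the activation functions below.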
In Table 4, the reason the tanh activation function is used in the reservoir layer is that its value range is between −1 and 1 and the mean of the data is 0, which is more conducive to improving training efficiency. Second, when features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoir continuously amplifies the feature effect.
The reason the output layer uses the sigmoid activation function is that the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.
The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.
[Figure 9: The Pearson coefficient values for UNSW_NB15. Correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; notable values include spkts–sloss at 0.97 and spkts–ct_srv_src at −0.069.]
The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.70 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we completed experiments on the training dataset with neither filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small data sample or a large data sample, the classification effect without the filtering technology is lower than that with it.
In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used for filtering, the accuracy is 0.97; and when the combination of the Pearson and Gini indices is used, the accuracy reaches 0.99.
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indices to filter, and then uses the ML-ESN training
[Figure 10: The Gini values for UNSW_NB15. Heatmap over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit, with values ranging from about 0.56 to 1.0.]
Table 4: The parameters of the ML-ESN experiment.

Parameters                    Values
Input dimension number        5
Output dimension number       10
Reservoir number              3
Reservoir neurons number      1000
Reservoir activation fn       Tanh
Output layer activation fn    Sigmoid
Update rate                   0.9
Random seed                   50
Regularization rate           1.0 × 10⁻⁶
algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.
It can be seen from the detection results in Figure 12 that it is completely feasible to use the ML-ESN network learning model, based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization, to quickly classify anomalous network traffic attacks.
We found that the accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reached 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and numbers of neurons, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, when the number of neurons is 1000, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a reservoir depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, when the reservoir depth is 3 and the number of neurons is 1000, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, with the increase in training depth, the training parameters of the model are gradually optimized, so the training accuracy also constantly improves. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to a decrease in accuracy.
From the results of Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
[Figure 11: Classification effect of different filtering methods. Accuracy (0.4 to 1.0) versus training data size (20,000 to 160,000 packets) for no filtering, Pearson only, Gini only, and Pearson + Gini.]
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of Feature A and Feature B are concentrated around 50; in particular, the values of Feature A hardly exceed 60. In addition, a small part of the values of Feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment focuses on five test datasets of different scales, namely 5,000, 20,000, 60,000, 120,000, and 160,000 records, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method were compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms dropped significantly; in particular, the accuracy of the GaussianNB algorithm fell below 50%, while the other algorithms are very close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy rapidly improves. For example, on the 120,000-record dataset, the accuracy of the algorithm reached 96.75%, and on the 160,000-record dataset, the accuracy reached 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) curves to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR
[Figure 12: Classification results of the ML-ESN method. Detection rate (accuracy, F1-score, and FPR) for the attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; accuracy and F1-score lie between 0.94 and 1.0, and FPR between 0.01 and 0.02.]
[Figure 13: ML-ESN results at different reservoir depths. (a) Detection time (ms) at depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy (0.91 to 0.96) for the same settings; (c) accuracy and time (s) comparison of BP, DecisionTree, ESN, and ML-ESN.]
[Figure 14: Distribution map of the first two statistical characteristics. Feature distribution of Feature A and Feature B over 0 to 160,000 packets.]
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
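The AUC used here can also be computed without plotting the curve: it equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one (ties count one half). A minimal sketch:

```python
def auc_score(y_true, y_score):
    # y_true: 0/1 labels; y_score: classifier scores.
    # AUC = P(score of a random positive > score of a random negative),
    # with ties contributing 0.5.
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking gives 1.0, a reversed ranking gives 0.0, and random scores give about 0.5, matching the interpretation that a larger AUC means better model performance.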
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the
[Figure 15: Detection results of different classification methods under different data sizes. Accuracy (0.4 to 1.0) of GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN for 0 to 160,000 records.]
[Figure 16: Classification ROC diagram of the single-layer ESN algorithm. AUC per class: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]
[Figure 17: Classification ROC diagram of the BP algorithm. AUC per class: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]
[Figure 18: Classification ROC diagram of the DecisionTree algorithm. AUC per class: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]
[Figure 19: Classification ROC diagram of our ML-ESN algorithm. AUC per class: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]
other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, since its detection success rate is generally below 80% and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.
Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, there are still some issues in this paper that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and different regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors of this article suggest that, before analyzing the network flow, it is best to perform multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main points of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example, through parallel training, greatly reducing the learning and classification time; and (4) studying the special AMI network protocols and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it in actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012
[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx
[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013
[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[13] Q R Zhang M Zhang T H Chen et al ldquoElectricity theftdetection using generative modelsrdquo in Proceedings of the 2018IEEE 30th International Conference on Tools with ArtificialIntelligence (ICTAI) Volos Greece 2018
[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese
[15] S Neetesh J C Bong and G Santiago ldquoSecure and privacy-preserving concentration of metering data in AMI networksrdquoin Proceedings of the 2017 IEEE International Conference onCommunications (ICC) Paris France 2017
[16] C Euijin P Younghee and S Huzefa ldquoIdentifying maliciousmetering data in advanced metering infrastructurerdquo in Pro-ceedings of the 2014 IEEE 8th International Symposium onService Oriented System Engineering pp 490ndash495 OxfordUK 2014
[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014
[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015
[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018
[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015
[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016
[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203
[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017
[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017
[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015
[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019
[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] "Data standardization," Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] "UNSW-NB15 dataset," 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
Mathematical Problems in Engineering
version; (2) 69085d3e5432360300000000, metadata ID; (3) 10107110, source IP; (4) 1010721241, destination IP; (5) 19341, source port; (6) 22, destination port; and (7) 6, protocol (TCP).
4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of fields, and it can be seen from the data in Figure 3 that not every stream contains all of the metadata content. Using these data directly for analysis has two problems: the importance of a single metadata field cannot be reflected directly, and the data dimensionality is very high, resulting in very long computation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.
In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify the probe flows and determine the type of network attack. The specific implementation framework is shown in Figure 4.
The framework mainly includes three processing stages, as follows:
Step 1: collect network flow metadata in real time through the network probe flow collection devices deployed in different areas.
Step 2: first, compute statistics over the collected network flow metadata, by time series or by segment, to obtain the statistical characteristics of each part of the network flow. Second, standardize the statistically obtained characteristic values according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, further filter the standardized features.
Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.
4.3. Feature Extraction. Generally speaking, to classify and identify network traffic, it is necessary to extract statistical behavior characteristics that distinguish the traffic generated by different network attack behaviors.
Network traffic [36] refers to the collection of all network data packets exchanged between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related packets in the set.
As is well known, some simple characteristics can be extracted directly from network traffic, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source IP address, destination IP address, source port, and destination port are also interchanged, which reflects the bidirectionality of the flow.
In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.
Firstly, network packets are aggregated into network flows, that is, each network flow is distinguished according to the network behavior that generated it. Secondly, this paper follows the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packet sizes, and the forward/backward packet ratio.
Statistical characteristics of time: duration, and maximum, minimum, average, and standard deviation of forward and backward packet intervals.
In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:
Figure 3: Part of the real probe stream data. Each record is a caret-separated line, for example:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
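A record like those in Figure 3 can be split into named fields with a few lines of Python. This is only a sketch: the first seven field names follow the description in Section 4.1, while the remaining names (byte/packet counters and timestamps) are assumptions for illustration.

```python
# Sketch: parsing one caret-separated probe stream record like those in Figure 3.
# Field names beyond the first seven (version .. protocol) are assumed, not
# taken from the paper.
FIELDS = ["version", "meta_id", "src_ip", "dst_ip", "src_port",
          "dst_port", "proto", "fwd_bytes", "fwd_pkts",
          "bwd_bytes", "bwd_pkts", "ts_start", "ts_end"]

def parse_probe_record(line: str) -> dict:
    """Split a '^'-separated probe record into named fields; the empty trailing
    fields (the '^^^' tail in the raw data) are dropped."""
    parts = [p for p in line.strip().split("^") if p != ""]
    return dict(zip(FIELDS, parts))

record = parse_probe_record(
    "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6"
    "^40^1^40^1^1564365874^1564365874^^^^")
```

Such a parser would be the first step of the preprocessing stage sketched in Figure 4.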
6 Mathematical Problems in Engineering
Time interval: maximum, minimum, average interval time, and standard deviation.
Packet size: maximum, minimum, average size, and packet distribution.
Number of data packets: out and in.
Data amount: input byte amount and output byte amount.
Stream duration: duration from start to end.
Some of the main features of network traffic extracted inthis paper are shown in Table 2
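The per-flow statistics listed above can be sketched in a short Python function; the exact feature set and names in [36, 37] differ, so the fields below are illustrative, and the packet timestamps are assumed to be sorted.

```python
# Sketch of per-flow statistics (packet-size and inter-arrival statistics) of
# the kind listed above; feature names are illustrative assumptions.
from statistics import mean, pstdev

def flow_features(pkt_sizes, pkt_times):
    """Compute size/time statistics for one flow; pkt_times must be sorted."""
    iats = [t2 - t1 for t1, t2 in zip(pkt_times, pkt_times[1:])]  # inter-arrival times
    feats = {
        "n_packets": len(pkt_sizes),
        "total_bytes": sum(pkt_sizes),
        "min_size": min(pkt_sizes), "max_size": max(pkt_sizes),
        "mean_size": mean(pkt_sizes), "std_size": pstdev(pkt_sizes),
        "duration": pkt_times[-1] - pkt_times[0],   # stream duration
    }
    if iats:
        feats.update(min_iat=min(iats), max_iat=max(iats),
                     mean_iat=mean(iats), std_iat=pstdev(iats))
    return feats

f = flow_features([40, 1500, 60, 1500], [0.00, 0.02, 0.05, 0.09])
```

Forward and backward directions would each get such a set of statistics, matching the forward/backward split used in [36].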
4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced-data elimination.
At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.
Because there may be some nondigital data in the standard protocol fields, such as protocol names and TCP flags, these data cannot be processed by standardization directly, so nondigital data need to be converted to digital form. For example, the character string "dhcp" is changed to the value "1".
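The conversion just described can be sketched as a simple label-encoding pass; apart from the "dhcp" to 1 example given in the text, the code assignments are arbitrary assumptions.

```python
# Minimal sketch of converting nondigital protocol/flag values to numeric
# codes before standardization; codes other than 'dhcp' -> 1 are assumptions.
def encode_nominal(column):
    """Assign each distinct string a stable integer code (e.g. 'dhcp' -> 1)."""
    codes = {}
    out = []
    for v in column:
        if v not in codes:
            codes[v] = len(codes) + 1   # start at 1, as in the 'dhcp' example
        out.append(codes[v])
    return out, codes

encoded, mapping = encode_nominal(["dhcp", "tcp", "dhcp", "udp"])
```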
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and the differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:
x' = (x − x̄)/δ, (1)

where x̄ is the mean value of the original data and δ is the standard deviation of the original data, δ = sqrt(((x1 − x̄)² + (x2 − x̄)² + ⋯ + (xn − x̄)²)/n), with n the number of samples per feature.
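Formula (1) can be checked with a short pure-Python sketch; the sample values below are arbitrary.

```python
# Z-score standardization as in formula (1): x' = (x - mean) / std, using the
# population standard deviation over one feature column.
from math import sqrt

def zscore(values):
    n = len(values)
    m = sum(values) / n                                  # mean of the feature
    std = sqrt(sum((v - m) ** 2 for v in values) / n)    # population std
    return [(v - m) / std for v in values]

z = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # mean 5, std 2
```

After this transform each feature column has zero mean and unit variance, which puts the differently scaled probe-stream attributes on a common footing.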
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is the currently popular feature filtering approach: it regards features as independent objects, evaluates the importance of each feature according to quality metrics, and selects the important features that meet the requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.
Because the power probe flow contains many statistical characteristics, and the main characteristics of different types of attacks differ, in order to quickly locate the important characteristics of each attack, this paper filters the network flow characteristics based on the correlation of the statistical feature data and on information gain.
The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple and is therefore more suitable for real-time processing of large-scale power probe streams.
The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation ρxy is shown in the following formula:
ρxy = cov(x, y)/(σx σy) = E[(x − μx)(y − μy)]/(σx σy), (2)
where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted r:
r = Σⁿᵢ₌₁ (xi − x̄)(yi − ȳ) / sqrt(Σⁿᵢ₌₁ (xi − x̄)² · Σⁿᵢ₌₁ (yi − ȳ)²), (3)
where n is the number of samples, xi and yi are the observations at point i corresponding to variables x and y, x̄ is the sample mean of x, and ȳ is the sample mean of y. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
Figure 4: Proposed AMI network traffic detection framework (probe traffic collection, statistical flow feature extraction, feature standardization, characteristic filtering, construction of the multilayer echo state network, and classification, verification, and performance evaluation).
Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
...
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Standard deviation of backward packet time intervals
31  duration        Network flow duration
the value is −1, it indicates that there is a completely negative correlation between the two random variables; when the value is 0, the two random variables are linearly independent.
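Formula (3) is straightforward to compute directly; the two toy feature columns below are arbitrary and chosen only to exercise the +1 and −1 extremes described above.

```python
# Sample Pearson correlation coefficient r, following formula (3).
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mx) ** 2 for xi in x) *
               sum((yi - my) ** 2 for yi in y))
    return num / den

r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # completely positive correlation
r_neg = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # completely negative correlation
```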
Because the Pearson method can only detect the linear relationship between features and classification categories, any nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships between the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.
In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is pi, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = Σᴷᵢ₌₁ pi(1 − pi) = 1 − Σᴷᵢ₌₁ pi². (4)
Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 − Σᴷₖ₌₁ (|Ck|/|D|)², (5)

where Ck is the subset of samples in D belonging to the kth class and K is the number of classes.
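Formula (5) amounts to counting class frequencies; a minimal sketch with made-up class labels:

```python
# Gini coefficient of a labelled sample set, following formula (5):
# Gini(D) = 1 - sum_k (|C_k| / |D|)^2.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

g_pure = gini(["normal"] * 10)                # single class: lowest impurity
g_even = gini(["normal"] * 5 + ["dos"] * 5)   # 50/50 split of two classes
```

A pure subset gives Gini 0 and an even two-class split gives 0.5, matching the interpretation used later in Section 6.3 that smaller Gini values mean lower impurity.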
5. ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction. The basic ESN network model is shown in Figure 5.
In this model, the network has three layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then
u(t) = [u1(t), u2(t), …, uK(t)]ᵀ,
x(t) = [x1(t), x2(t), …, xN(t)]ᵀ,
y(t) = [y1(t), y2(t), …, yL(t)]ᵀ. (6)
Win (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); Wout (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and Wback (N × L) represents the connection weights from y(t − 1) to x(t), this last matrix being optional.
When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(Win u(t + 1) + W x(t) + Wback y(t)), (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f′(Wout [u(t + 1); x(t + 1)]). (8)
Researchers have found through experiments that the traditional echo state network reservoir is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems of the ESN, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equation of the ML-ESN is given by [41]:
x1(n + 1) = f(Win u(n + 1) + W1 x1(n)),
xk(n + 1) = f(Winter(k−1) xk−1(n + 1) + Wk xk(n)),
⋮
xM(n + 1) = f(Winter(M−1) xM−1(n + 1) + WM xM(n)). (9)
The ML-ESN output is then calculated from the states of formula (9) as

y(n + 1) = fout(Wout xM(n + 1)). (10)
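The layer-by-layer state propagation of equation (9) can be illustrated with a toy pure-Python sketch. The dimensions, random weights, and tanh activation below are illustrative assumptions (the trained readout of equation (10) is omitted here), not the authors' implementation.

```python
# Toy forward pass through stacked reservoirs, following equation (9):
# the first reservoir is driven by the input, each later reservoir by the
# previous reservoir's new state. All sizes/weights are made-up assumptions.
import math
import random

random.seed(0)

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

K, N, M = 3, 4, 2                                    # inputs, neurons, reservoirs
W_in = rand_matrix(N, K)                             # input -> first reservoir
W_res = [rand_matrix(N, N, 0.2) for _ in range(M)]   # internal weights W_k
W_inter = [rand_matrix(N, N) for _ in range(M - 1)]  # reservoir k-1 -> k

def ml_esn_states(u, states):
    """One update step with tanh activations, per equation (9)."""
    new = [[math.tanh(a + b) for a, b in
            zip(matvec(W_in, u), matvec(W_res[0], states[0]))]]
    for k in range(1, M):
        new.append([math.tanh(a + b) for a, b in
                    zip(matvec(W_inter[k - 1], new[k - 1]),
                        matvec(W_res[k], states[k]))])
    return new

states = [[0.0] * N for _ in range(M)]
states = ml_esn_states([0.3, -0.1, 0.7], states)
```

In a real model the last reservoir state would then be passed through the trained readout Wout of equation (10).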
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations may occur.
Figure 5: ESN basic model (input layer U(t) with weights Win; reservoir state x(t) with internal weights W; output layer y(t) with weights Wout; optional feedback weights Wback).
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, mapping the input feature set from the low-dimensional input space to the high-dimensional state space, and finally to train the network using linear regression or similar methods on the high-dimensional state space.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.
In contrast, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.
This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test measures multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex modern network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of each feature value is not uniform: for example, most of the data are numerical, but some features contain character types or the special symbol "-", so the data cannot be used directly. Before processing, the data are therefore standardized; some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators to evaluate the experimental results: accuracy (correct rate), FPR (false-positive rate), and F-score (balance score). Their calculation formulas are as follows:
accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(TP + FN),
precision = TP/(TP + FP),
recall = TP/(TP + FN),
F-score = 2 · precision · recall/(precision + recall). (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer U(t) with weights Win; reservoirs 1 to M with internal weights W1, …, WM and states x1, …, xM, connected by weights Winter; output layer y(t) with weights Wout).
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
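The indicators of formula (11) follow directly from these four counts; a small sketch with made-up confusion-matrix counts:

```python
# Evaluation indicators from formula (11), computed from confusion-matrix
# counts; the example counts are arbitrary.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                       # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                    # recall equals TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

acc, fpr, f1 = metrics(tp=90, tn=80, fp=20, fn=10)
```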
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56,000             36.32
2   Analysis        1,560              0.108
3   Backdoors       1,746              0.36
4   DoS             12,264             2.425
5   Exploits        33,393             8.316
6   Fuzzers         18,184             4.62
7   Generic         40,000             6.69
8   Reconnaissance  10,491             2.42
9   Shellcode       1,133              0.28
10  Worms           130                0.044
Algorithm 1: AMI network traffic classification.

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initialize the ML-ESN parameters and determine the number of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons in each reservoir N;
  (v) set the reservoir update speed α;
  (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the external connection weights between reservoirs Winter:
  (i) randomly initialize the values of Win, Wi, and Winter;
  (ii) through statistical normalization and spectral radius calculation, scale Winter and Wi to meet the sparsity requirements, using Wi = α(Wi/|λi|) and Winter = α(Winter/|λinter|), where λi and λinter are the spectral radii of the Wi and Winter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them into the activation function of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T, calculate x1(t) according to equation (7);
  (ii) for i from 2 to M, calculate xi(t) according to equations (7) and (9);
  (iii) collect the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure: Wout = DHᵀ(HHᵀ + βI)⁻¹, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the fout value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.
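The readout solution of step (4), Wout = DHᵀ(HHᵀ + βI)⁻¹, can be sketched in plain Python for a two-dimensional state, where the matrix inverse has a closed form. The state and target matrices below are illustrative assumptions, chosen so the target is an exact linear function of the state.

```python
# Sketch of step (4): ridge-regression readout Wout = D H^T (H H^T + beta*I)^-1,
# written out for a 2 x T state matrix so the 2x2 inverse is explicit.
def ridge_readout(H, D, beta=1e-6):
    """H: 2 x T state-collection matrix, D: 1 x T desired outputs."""
    # G = H H^T + beta*I (2 x 2)
    g11 = sum(h * h for h in H[0]) + beta
    g22 = sum(h * h for h in H[1]) + beta
    g12 = sum(a * b for a, b in zip(H[0], H[1]))
    det = g11 * g22 - g12 * g12
    inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]
    # P = D H^T (1 x 2)
    p = [sum(d * h for d, h in zip(D[0], H[0])),
         sum(d * h for d, h in zip(D[0], H[1]))]
    return [p[0] * inv[0][0] + p[1] * inv[1][0],
            p[0] * inv[0][1] + p[1] * inv[1][1]]

# Target is exactly 2*state1 + 3*state2, so the readout should recover [2, 3].
H = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
D = [[2 * a + 3 * b for a, b in zip(H[0], H[1])]]
w = ridge_readout(H, D)
```

For realistic reservoir sizes one would solve the same system with a general linear-algebra routine rather than a hand-written 2x2 inverse; the closed form here only keeps the sketch self-contained.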
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV data, mainly including data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. Standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalization, most of the attack type data are concentrated between 0.4 and 0.6, but the Generic attack type data are concentrated between 0.7 and 0.9, and the normal type data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient values and the Gini indices for the standardized data. In the experiment, the Pearson coefficient values and the Gini indices for the standardized UNSW_NB15 data are as shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the Pearson correlation threshold was initially set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with values less than 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
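The pruning rule described above can be sketched as a greedy pass over a correlation matrix. The matrix below reuses a few Pearson values quoted from Figure 9 (0.97, 0.21, 0.39, −0.069, −0.076); the rest of the entries and the greedy keep-first strategy are assumptions for illustration.

```python
# Sketch of the 0.5-threshold filtering rule: keep a feature only if its
# absolute Pearson correlation with every already-kept feature is <= 0.5.
def filter_by_correlation(corr, names, threshold=0.5):
    """corr[i][j] is the Pearson coefficient between features i and j."""
    kept = []
    for i, _ in enumerate(names):
        if all(abs(corr[i][j]) <= threshold for j in kept):
            kept.append(i)
    return [names[i] for i in kept]

names = ["spkts", "sloss", "rate", "ct_srv_src"]
corr = [[1.00,  0.97,  0.11, -0.069],
        [0.97,  1.00,  0.21, -0.076],
        [0.11,  0.21,  1.00,  0.39],
        [-0.069, -0.076, 0.39, 1.00]]
selected = filter_by_correlation(corr, names)   # sloss dropped (0.97 > 0.5)
```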
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values were calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower its impurity in the dataset and the better its training effect.
Based on the Pearson and Gini coefficient results for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter in ms), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set at the start of the experiment; the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined according to the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (rows 0–8 of the columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes, showing Z-score standardized values such as −0.191, 0.152, and −0.702).
Figure 8: Normalized data distribution (box plots over the class labels Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal; y-axis from 0.0 to 1.0).
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs was initially set to 3.
The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input; when this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when the characteristics differ significantly, tanh gives a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because the output value of sigmoid lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values were set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1. The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45:1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (heat map over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; values range from about −0.41 to 1.0).
The experimental environment was Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU at 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether for a small or a large data sample, the classification effect without filtering is lower than that with filtering.
In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 with no filtering, 0.95 with Pearson filtering alone, and 0.97 with Gini filtering alone, while the combination of the Pearson and Gini indices raises the accuracy of the model to 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters the features with the Pearson and Gini indices and then uses the ML-ESN training
[Figure 10 shows a heatmap of Gini values (0.0 to 1.0) for UNSW_NB15 features such as service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit.]
Figure 10: The Gini values for UNSW_NB15.
Table 4: The parameters of the ML-ESN experiment.

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10^−6
algorithm to learn, then uses the test data to verify the trained model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is completely feasible to use the ML-ESN learning model, with Pearson- and Gini-based feature filtering and optimization, to quickly classify anomalous network traffic attacks.
The accuracy, F1-score, and FPR results are very good across all nine attack types. For example, for the Generic attack type, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low at only 0.02; for the Shellcode and Worms attack types, both accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring detection accuracy at the same reservoir depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, training takes 211 ms at a reservoir depth of 5 but only 116 ms at a depth of 3. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model first increases gradually as the reservoir depth increases; for example, with 1000 neurons, the detection accuracy is 0.96 at a depth of 3 but only 0.93 at a depth of 2. However, when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting appears in the model, which causes the accuracy to decrease.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 s, and the BP method takes the most, 0.0024 s. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method detects the different attack types well.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
[Figure 11 plots accuracy (0.4–1.0) against data size (20,000–160,000 packets) for four settings: no filtering, Pearson only, Gini only, and Pearson + Gini.]
Figure 11: Classification effect of different filtering methods.
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper compares, through simulation experiments, against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, containing 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively; each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieve 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, while the other algorithms are close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy reaches 96.75%, and on the 160,000-record dataset it reaches 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment uses ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR
[Figure 12 plots accuracy, F1-score, and FPR for the nine attack types (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms); accuracy and F1-score lie between 0.94 and 1.0, and FPR between 0.01 and 0.02.]
Figure 12: Classification results of the ML-ESN method.
[Figure 13(a) shows detection time (ms) versus reservoir depth (2–5) for 500, 1000, and 2000 neurons; Figure 13(b) shows the corresponding accuracy (0.800–1.000); Figure 13(c) compares the accuracy and time (s) of the BP, DecisionTree, ESN, and ML-ESN algorithms.]
Figure 13: ML-ESN results at different reservoir depths.
[Figure 14 plots the distributions of feature A and feature B against the number of packets (0–160,000).]
Figure 14: Distribution map of the first two statistical characteristics.
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
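For reference, the AUC can be computed without plotting the full curve by using the rank-sum (Mann-Whitney U) identity. The following Python sketch is a generic illustration with illustrative names, not the code used in the paper's experiments:

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum identity:
    AUC = (R_pos - n_pos*(n_pos+1)/2) / (n_pos * n_neg),
    where R_pos is the sum of (1-based, tie-averaged) ranks of positives."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # group tied scores and assign them their average rank
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    r_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (r_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of positives above negatives, which is why the figures below compare the algorithms by their per-class AUC values.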
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that, for the classification of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
[Figure 15 plots accuracy (0.4–1.0) against data size (0–160,000) for GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN.]
Figure 15: Detection results of different classification methods under different data sizes.
[Figure 16 plots true-positive rate against false-positive rate; per-class AUC values: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, and Worms 0.99.]
Figure 16: Classification ROC diagram of the single-layer ESN algorithm.
[Figure 18 plots true-positive rate against false-positive rate; per-class AUC values: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, and Worms 0.81.]
Figure 18: Classification ROC diagram of the DecisionTree algorithm.
[Figure 19 plots true-positive rate against false-positive rate; per-class AUC values: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, and Worms 1.00.]
Figure 19: Classification ROC diagram of our ML-ESN algorithm.
[Figure 17 plots true-positive rate against false-positive rate; per-class AUC values: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, and Worms 0.96.]
Figure 17: Classification ROC diagram of the BP algorithm.
the other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers to AMI network security.
Secondly, in order to solve the existing methods' problems of low accuracy and a high false-positive rate on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing an AMI network streaming metadata standard; (2) using the combination of the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a degree of multicollection device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of AMI-specific network protocols and establishment of an optimized ML-ESN network traffic deep learning model better suited to actual AMI applications, so as to apply it in industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
Time interval: maximum, minimum, average interval time, and standard deviation.
Packet size: maximum, minimum, average size, and packet distribution.
Number of data packets: outgoing and incoming.
Data amount: input byte amount and output byte amount.
Stream duration: duration from start to end.
Some of the main features of network traffic extracted in this paper are shown in Table 2.
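Per-flow statistics of this kind can be computed directly from packet timestamps and sizes. The following minimal Python sketch illustrates the idea; the function and field names are illustrative assumptions, not the paper's actual implementation:

```python
from statistics import mean, pstdev

def flow_features(timestamps, sizes):
    """Compute basic per-flow statistics from packet timestamps
    (seconds) and packet sizes (bytes)."""
    # inter-arrival times between consecutive packets
    iats = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        "total_packets": len(sizes),
        "total_volume": sum(sizes),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "mean_size": mean(sizes),
        "min_iat": min(iats) if iats else 0.0,
        "max_iat": max(iats) if iats else 0.0,
        "mean_iat": mean(iats) if iats else 0.0,
        "std_iat": pstdev(iats) if iats else 0.0,
        "duration": timestamps[-1] - timestamps[0],
    }

feats = flow_features([0.0, 0.1, 0.3, 0.6], [60, 1500, 40, 1500])
print(feats["total_packets"], feats["duration"])  # 4 0.6
```

In practice these statistics would be computed separately for the forward and backward directions of each flow, matching features such as total_fpackets and max_biat in Table 2.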
4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.
At present, the main methods of feature standardization [38] are Z-score, min-max, and decimal scaling.
Because the standard protocol may contain some nonnumeric data, such as the protocol name, IP address, and TCP flags, which cannot be processed directly by standardization, nonnumeric data first need to be converted to numeric form; for example, the string "dhcp" is mapped to the value "1".
In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is given by the following formula:

x' = \frac{x - \bar{x}}{\delta},  (1)

where \bar{x} is the mean value of the original data and \delta is the standard deviation of the original data,

\delta = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}},

with n the number of samples per feature.
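As a concrete illustration, formula (1) takes only a few lines of Python; the protocol mapping shown is an assumed example, not the paper's actual encoding table:

```python
import math

def zscore(values):
    """Standardize per Eq. (1): x' = (x - mean) / std (population std)."""
    m = sum(values) / len(values)
    delta = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / delta for v in values]

# Nonnumeric fields are first mapped to integers, e.g. "dhcp" -> 1
# (illustrative mapping only):
proto_map = {"tcp": 0, "dhcp": 1, "udp": 2}

print(zscore([2, 4, 4, 4, 5, 5, 7, 9]))  # mean 5, std 2
```

After standardization every feature has zero mean and unit variance, so features with large raw ranges (byte counts) no longer dominate features with small ranges (packet counts).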
4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and precisely find the statistical features that characterize network attack behavior, which is a very difficult problem. The filter method is the currently popular feature filtering approach: it treats features as independent objects, evaluates their importance according to quality metrics, and selects the important features that meet the requirements.
At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, and information entropy and mutual information.
Because the power probe flow contains many statistical features, and the main features differ between attack types, this paper filters the network flow features based on the correlation of the statistical feature data and on information gain, in order to quickly locate the important features of different attacks.
The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple and therefore well suited to real-time processing of large-scale power probe streams.
The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its value \rho_{xy} is given by the following formula:

\rho_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y},  (2)

where cov(x, y) is the covariance of x and y, \sigma_x is the standard deviation of x, and \sigma_y is the standard deviation of y. Estimating the covariance and standard deviations from the sample gives the sample Pearson correlation coefficient, usually denoted r:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},  (3)
where n is the number of samples, x_i and y_i are the observations at point i for variables x and y, and \bar{x} and \bar{y} are the sample means of x and y. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when
[Figure 4 depicts the pipeline: probe traffic collection → feature extraction (statistical flow characteristics) → feature standardization → feature filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation.]
Figure 4: Proposed AMI network traffic detection framework.
Table 2: Some of the main features.

ID   Name             Description
1    SrcIP            Source IP address
2    SrcPort          Source IP port
3    DestIP           Destination IP address
4    DestPort         Destination IP port
5    Proto            Network protocol, mainly TCP, UDP, and ICMP
6    total_fpackets   Total number of forward packets
7    total_fvolume    Total size of forward packets
8    total_bpackets   Total number of backward packets
9    total_bvolume    Total size of backward packets
...  ...              ...
29   max_biat         Maximum backward packet arrival interval
30   std_biat         Standard deviation of backward packet time intervals
31   duration         Network flow duration
the value is −1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly independent.
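A direct implementation of the sample coefficient in formula (3), shown here as a generic Python sketch rather than the paper's code:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation per Eq. (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0  (completely positive)
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0 (completely negative)
```

In feature filtering, |r| between each feature and the class label (or between feature pairs) serves as the ranking score; features with near-zero |r| carry little linear information about the attack label.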
Because the Pearson method can only detect the linear relationship between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships among the probe flow features, this paper calculates the information entropy of the features and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected features and the network attack behavior.
In the classification problem, assuming that there are K classes and that the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

\mathrm{Gini}(P) = \sum_{i=1}^{K} P_i\,(1 - P_i) = 1 - \sum_{i=1}^{K} P_i^{2} \qquad (4)
Given the sample set D, the Gini coefficient is expressed as follows:

\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^{2} \qquad (5)
where C_k is the subset of samples in D belonging to the k-th class and K is the number of classes.
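As an illustration (our own sketch, not the authors' released code), equations (3)–(5) can be computed directly with NumPy: `pearson_r` implements equation (3), and `gini_index` implements the empirical Gini coefficient of equation (5) over a label sample.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient, equation (3)."""
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def gini_index(labels):
    """Empirical Gini coefficient of a label sample, equation (5)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

x = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson_r(x, 2 * x + 1))             # perfectly correlated -> 1.0
print(gini_index(np.array([0, 0, 1, 1])))  # two balanced classes -> 0.5
```

In a filtering pass, one would evaluate `pearson_r` over feature pairs and `gini_index` over the class labels restricted by each feature, keeping the features with low pairwise correlation and low impurity.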
5 ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.
In this model, the network has three layers: the input layer, the hidden layer (reservoir), and the output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then
u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^{T}
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^{T}
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^{T} \qquad (6)
W_in (N × K) represents the connection weights from the input layer to the reservoir, W (N × N) represents the connection weights from x(t − 1) to x(t), W_out (L × (K + N + L)) represents the connection weights from the reservoir to the output layer, and W_back (N × L) represents the connection weights from y(t − 1) to x(t); this last matrix is optional.
When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(W_{in}\,u(t + 1) + W\,x(t) + W_{back}\,y(t)) \qquad (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f′(W_{out}\,[u(t + 1);\, x(t + 1)]) \qquad (8)
Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems of the ESN, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]
x_1(n + 1) = f(W_{in}\,u(n + 1) + W_1\,x_1(n))
x_k(n + 1) = f(W_{inter}^{(k-1)}\,x_{k-1}(n + 1) + W_k\,x_k(n))
⋮
x_M(n + 1) = f(W_{inter}^{(M-1)}\,x_{M-1}(n + 1) + W_M\,x_M(n)) \qquad (9)
The ML-ESN output is then calculated from the states of formula (9):

y(n + 1) = f_{out}(W_{out}\,x_M(n + 1)) \qquad (10)
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations can occur.
Figure 5: ESN basic model (input layer U(t) connected to the reservoir state x(t) by W_in, internal reservoir weights W, output y(t) produced through W_out, and optional feedback W_back).
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, mapping the input feature set from the low-dimensional input space to a high-dimensional state space. Finally, the network is trained on the high-dimensional state space using linear regression or similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is too small, the fitting ability is weakened, and if it is too large, the generalization ability cannot be guaranteed. Therefore, a single ESN is not well suited for directly classifying AMI network traffic anomalies.
By contrast, when a single reservoir is small, the ML-ESN network model can still satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6 Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset is more recent and more accurately reflects the characteristics of complex network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.
6.2. Evaluation Indicators. In order to objectively evaluate the performance of the method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN) / (TP + TN + FP + FN)
FPR = FP / (FP + TN)
TPR = TP / (TP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
F-score = (2 × precision × recall) / (precision + recall) \qquad (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer U(t) connected by W_in to reservoir 1; reservoirs 1 … M chained by W_inter with internal weights W_1 … W_M and states x_1 … x_M; output y(t) produced from x_M through W_out).
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
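Using these counts, the indicators of equation (11) can be computed in a few lines. This sketch is our own (not the authors' evaluation script) and uses the standard definition FPR = FP/(FP + TN):

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, FPR, and F-score from binary labels (1 = abnormal traffic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, fpr, f_score

print(detection_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))  # accuracy 0.6, FPR 0.5, F ≈ 0.667
```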
6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, the AMI probe stream metadata are first collected in real time (these metadata are as shown in Figure 3); for the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: number of neurons in each reservoir
  Ri: number of reservoirs
  α: spectral-radius scaling factor for the connection weights
Output:
  Training and testing classification results
Steps:
(1) Initialize the ML-ESN parameters and determine the corresponding numbers of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons in each reservoir N;
  (v) set the reservoir update speed α;
  (vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the interconnection weights between reservoirs W_inter:
  (i) randomly initialize the values of W_in, W_i, and W_inter;
  (ii) through statistical normalization and spectral radius calculation, rescale W_i and W_inter to meet the sparsity requirements: W_i = α(W_i/|λ_i|) and W_inter = α(W_inter/|λ_inter|), where λ_i and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation functions of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T:
    (a) calculate x_1(t) according to equation (7);
    (b) for i from 2 to M, calculate x_i(t) according to equations (7) and (9);
    (c) collect the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
  (i) W_out = DH^T(HH^T + βI)^{−1}, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
  (i) select the SoftMax activation function and calculate the output value f_out.
(6) Input the data in D2 into the trained ML-ESN, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, deduplication, completion, and normalization, to obtain standardized and normalized data; the standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalization, most of the attack-type data are concentrated between 0.4 and 0.6, the Generic attack-type data are concentrated between 0.7 and 0.9, and the normal-type data are concentrated between 0.1 and 0.3.
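The standardization and normalization of Step 2 can be sketched as follows (our illustration; the `dur` values are hypothetical, and the paper does not show the authors' exact preprocessing code):

```python
import numpy as np

def zscore(col):
    """Standardize a numeric column to zero mean, unit variance (Figure 7 style)."""
    std = col.std()
    return (col - col.mean()) / std if std else col - col.mean()

def minmax(col):
    """Normalize a numeric column into [0, 1] (Figure 8 style)."""
    lo, hi = col.min(), col.max()
    return (col - lo) / (hi - lo) if hi > lo else np.zeros_like(col)

dur = np.array([0.1, 0.4, 2.5, 0.0, 1.2])   # hypothetical 'dur' feature values
print(zscore(dur))
print(minmax(dur))
```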
Step 3. Calculate the Pearson coefficient values and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini index for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, while the correlation between spkts and ct_srv_src (number of connections containing the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the Pearson correlation threshold is initially set to 0.5: features with a pairwise Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. By contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better its training effect.
Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (ms)), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment; the specific values are shown in Table 4.

In Table 4, the input dimension is determined by the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (rows 0–8 of the columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes, with values standardized to small magnitudes around zero).
Figure 8: Normalized data distribution (box plot of normalized values, on a 0.0–1.0 scale, for each class label: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal).
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type.
Generally speaking, under the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of the ML-ESN is to generate, from the reservoirs, a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layers because its range is between −1 and 1 with an average value of 0, which is conducive to training efficiency, and because tanh responds well when the characteristics differ significantly, so the detection effect is better. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because its output value lies between 0 and 1, which directly reflects the probability of a given attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175320 data packets, with a ratio of normal to abnormal (attack) packets of 0.46 : 1. The test dataset contains 82311 data packets, with a ratio of normal to abnormal packets of 0.45 : 1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation heatmap, on a 0.0–1.0 color scale, over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experiments were run on the Windows 10 Home 64-bit operating system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on small or large data samples, the classification effect without filtering is lower than with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, on 160000 training packets, the recognition accuracy for abnormal traffic is only 0.94 with no filtering, 0.95 with Pearson filtering only, and 0.97 with Gini filtering only, while the combination of the Pearson and Gini indexes reaches an accuracy of 0.99.
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indexes and then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (heatmap, on a 0.0–1.0 color scale, over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment

Parameters                  Values
Input dimension number      5
Output dimension number     10
Reservoir number            3
Reservoir neurons number    1000
Reservoir activation fn     Tanh
Output layer activation fn  Sigmoid
Update rate                 0.9
Random seed                 50
Regularization rate         1.0 × 10⁻⁶
algorithm to learn, and then uses the test data to verify the trained model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is entirely feasible to quickly classify anomalous network traffic attacks with the ML-ESN network learning model after optimizing the traffic features with the combined Pearson and Gini coefficient filtering.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in the Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in the Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the reservoir depth increases; for example, with 1000 neurons, a depth of 5 takes 211 ms, while a depth of 3 takes only 116 ms. In addition, at the same depth, the more neurons in the model, the more training time it consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy first rises as the reservoir depth increases; for example, with 1000 neurons, the detection accuracy is 0.96 at a depth of 3 but only 0.93 at a depth of 2. However, when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, initially, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving; however, when the model depth reaches 5, a certain amount of overfitting occurs, which reduces the accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method detects different attack types well.
Step 5. In order to further verify the correctness of the proposed method, this paper tests the detection
Figure 11: Classification accuracy of different filtering methods versus training data size (20000–160000 packets), comparing no filtering, Pearson only, Gini only, and Pearson + Gini.
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first examined the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of features A and B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small portion of the feature B values are concentrated between 5 and 10, and only a few exceed 10.
Second, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, containing 5000, 20000, 60000, 120000, and 160000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small test datasets the detection accuracy of the traditional machine learning methods is relatively high; for example, on 20000 records, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieve 100% success rates. However, on the large test datasets, the classification accuracy of the traditional algorithms drops significantly, especially for GaussianNB, whose accuracy falls below 50%, while the other algorithms are close to 80%.
By contrast, the ML-ESN algorithm has a lower accuracy rate on small samples: the smaller the number of samples, the lower the accuracy. However, once the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160000-record dataset it reaches 97.26%.
The reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point; when the number of samples is small, the algorithm may overfit, and its overall performance is not optimal.
In order to further verify the performance of the ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) graphs to evaluate performance; an ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR
Figure 12: Classification results of the ML-ESN method (detection rate per attack type, showing accuracy, F1-score, and FPR for Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; accuracy and F1-score lie between 0.94 and 1.0, and FPR between 0.01 and 0.02).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2–5 with 500, 1000, and 2000 neurons; (b) accuracy for depths 2–5 with 500, 1000, and 2000 neurons; (c) accuracy and time comparison of BP, DecisionTree, ESN, and ML-ESN.
Figure 14: Distribution map of the first two statistical characteristics (features A and B over the number of packages).
(true-positive rate) on the vertical axis. Generally speaking, an ROC graph is judged by the AUC (area under the ROC curve): the larger the AUC value, the better the model performance.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19.
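The per-class curves in Figures 16–19 are standard one-vs-rest ROC curves. As a small self-contained sketch (ours; the labels and scores below are hypothetical, not taken from the experiments), the FPR/TPR pairs and the AUC can be computed as follows:

```python
import numpy as np

def roc_points(y_true, scores):
    """Sweep the decision threshold over the scores, collecting (FPR, TPR) pairs."""
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(y_true)[order]
    tpr = np.cumsum(y) / max(y.sum(), 1)            # true-positive rate at each cut
    fpr = np.cumsum(1 - y) / max((1 - y).sum(), 1)  # false-positive rate at each cut
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

y_true = np.array([1, 1, 0, 1, 0, 0])              # 1 = attack class (one-vs-rest)
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.3, 0.1])  # hypothetical classifier scores
fpr, tpr = roc_points(y_true, scores)
print(auc(fpr, tpr))   # 8/9 ≈ 0.889: 8 of 9 positive/negative pairs are ranked correctly
```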
From the experimental results in Figures 16–19, it can be seen that, for the classification of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the
Figure 15: Detection accuracy of the different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under data sizes from 0 to 160000.
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC per class: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC per class: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC per class: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC per class: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).
the other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the general detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies some open problems in AMI network security, and introduces the contributions of existing researchers in this area.
Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-volume network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) combining the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on a simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, there are still some issues that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it in industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.
the value is −1, it indicates that there is a completely negative correlation between the two random variables; when the value is 0, it indicates that the two random variables are linearly uncorrelated.
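As an illustration of these boundary values, the following minimal NumPy sketch computes the Pearson coefficient directly from its definition (the helper name `pearson` and the toy vectors are our own, not from the paper):

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two feature vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Perfectly positively / negatively correlated toy features
a = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson(a, 2 * a + 1))   # 1.0  (complete positive correlation)
print(pearson(a, -a))          # -1.0 (complete negative correlation)
```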
Because the Pearson method can only detect the linear relationship between features and classification categories, any nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data-distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.
In the classification problem, assuming that there are K classes and the probability that a sample point belongs to the i-th class is P_i, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2.  (4)
Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2,  (5)

where C_k is the subset of samples in D belonging to the k-th class and K is the number of classes.
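Equations (4) and (5) can be checked with a short sketch; the function name and the toy label sets below are illustrative only:

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity of a label set D: Gini(D) = 1 - sum_k (|C_k|/|D|)^2."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# A pure set has Gini 0 (lowest impurity, best training effect);
# an evenly split binary set has Gini 0.5.
print(gini_index(["normal"] * 10))               # 0.0
print(gini_index(["normal"] * 5 + ["dos"] * 5))  # 0.5
```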
5. ML-ESN Classification Method
ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction problems. The basic ESN network model is shown in Figure 5.
In this model, the network has 3 layers: an input layer, a hidden layer (the reservoir), and an output layer. Assuming that, at time t, the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T.  (6)
W_in (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t−1) to x(t); W_out (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and W_back (N × L) represents the connection weights from y(t−1) to x(t), which are optional.
When u(t) is input, the updated state equation of the reservoir is given by

x(t+1) = f(W_in u(t+1) + W x(t) + W_back y(t)),  (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t+1) = f′(W_out [u(t+1); x(t+1)]).  (8)
Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.
In order to overcome these problems of the ESN, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.
The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while the ML-ESN has more than one. The updated state equations of the ML-ESN are given by [41]

x_1(n+1) = f(W_in u(n+1) + W_1 x_1(n)),
x_k(n+1) = f(W_inter^{(k-1)} x_{k-1}(n+1) + W_k x_k(n)),
⋮
x_M(n+1) = f(W_inter^{(M-1)} x_{M-1}(n+1) + W_M x_M(n)).  (9)
The ML-ESN output is then calculated from the final reservoir state of equation (9):

y(n+1) = f_out(W_out x_M(n+1)).  (10)
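A minimal sketch of the layered state updates in equation (9), including the spectral-radius scaling used later in Algorithm 1; all sizes, seeds, and helper names here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_spectral_radius(w, alpha):
    """Rescale a reservoir matrix so its spectral radius equals alpha."""
    return alpha * w / np.abs(np.linalg.eigvals(w)).max()

K, N, M = 5, 50, 3   # input dimension, reservoir size, number of reservoirs
alpha = 0.9          # target spectral radius (illustrative)

w_in = rng.uniform(-0.5, 0.5, (N, K))
w_res = [scale_spectral_radius(rng.uniform(-0.5, 0.5, (N, N)), alpha)
         for _ in range(M)]
w_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]

def ml_esn_step(u, states):
    """One update of equation (9): layer 1 is driven by the input,
    layer k by the fresh state of layer k-1."""
    new = [np.tanh(w_in @ u + w_res[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(w_inter[k - 1] @ new[k - 1] + w_res[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]          # x_i(0) = 0
states = ml_esn_step(rng.uniform(-1, 1, K), states)
print(len(states), states[-1].shape)              # 3 (50,)
```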
5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time does not change much. However, when the network system is attacked, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations may occur.
Figure 5: ESN basic model (input layer u(t) connected through W_in to the reservoir x(t) with internal weights W, output y(t) through W_out, and optional feedback W_back).
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale, random, sparse network of neurons (the reservoir) as the processing medium for the data; the input feature value set is then mapped from the low-dimensional input space to the high-dimensional state space. Finally, the network is trained on the high-dimensional state space using linear regression and similar methods.
However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the single-layer ESN is not suitable for directly classifying AMI network traffic anomalies.
In contrast, when the size of a single reservoir is small, the ML-ESN network model can still satisfy the echo-state training requirement by adding multiple reservoirs, thereby improving the overall training performance of the model.
This paper therefore selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack detection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex modern network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used for processing directly. Before processing, the data are therefore standardized; some of the processed feature results are shown in Figure 7.
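A plausible preprocessing sketch in the spirit of this step, using pandas and scikit-learn (the toy rows and the column subset are invented for illustration; the paper does not specify its exact tooling):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy rows in the spirit of the UNSW_NB15 CSVs: numeric columns mixed with
# character columns ('proto', 'service'), where '-' marks a missing service.
df = pd.DataFrame({
    "dur":     [0.12, 0.03, 0.50],
    "proto":   ["tcp", "udp", "tcp"],
    "service": ["http", "-", "dns"],
    "sbytes":  [496, 1762, 1068],
})

# Encode character-type features as integers, then z-score standardize so
# every column is centred at 0 with unit variance.
for col in ["proto", "service"]:
    df[col] = LabelEncoder().fit_transform(df[col])

scaled = StandardScaler().fit_transform(df)
print(scaled.shape)  # (3, 4)
```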
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN) / (TP + TN + FP + FN),
FPR = FP / (FP + TN),
TPR = TP / (FN + TP),
precision = TP / (TP + FP),
recall = TP / (FN + TP),
F-score = (2 × precision × recall) / (precision + recall).  (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (input layer u(t) feeding reservoir 1 through W_in; reservoirs 1..M with internal weights W_1..W_M chained through W_inter; output y(t) through W_out).
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
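These counts map to the indicators of equation (11) as in the following sketch; note that it uses the standard false-positive-rate definition FP/(FP+TN), and the counts are made-up toy numbers:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, false-positive rate, and F-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                 # standard false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # = TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

# Toy confusion counts: 90 attacks caught, 10 missed, 5 false alarms
acc, fpr, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(fpr, 3), round(f1, 3))  # 0.925 0.05 0.923
```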
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
 1  Normal          56000              3.63
 2  Analysis         1560              0.108
 3  Backdoors        1746              0.36
 4  DoS             12264              2.42
 5  Exploits        33393              8.31
 6  Fuzzers         18184              4.62
 7  Generic         40000              6.69
 8  Reconnaissance  10491              2.42
 9  Shellcode        1133              0.28
10  Worms             130              0.044
(1) Input:
(2) D1: training dataset
(3) D2: test dataset
(4) u(t): input feature value set
(5) N: the number of neurons in each reservoir
(6) Ri: the number of reservoirs
(7) α: interconnection weight spectral radius
(8) Output:
(9) Training and testing classification results
(10) Steps:
(11) (1) Initially set the parameters of the ML-ESN and determine the corresponding number of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons in each reservoir N;
(v) set the reservoir update speed α;
(vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(12) (2) Initialize the input connection weight matrix W_in, the internal reservoir connection weights W_i (1 ≤ i ≤ M), and the inter-reservoir connection weights W_inter:
(i) randomly initialize the values of W_in, W_i, and W_inter;
(ii) through statistical normalization and spectral radius calculation, W_inter and W_i are tuned to meet the sparsity requirements; the calculation formulas are W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(13) (3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T, compute x_1(t):
(a) calculate x_1(t) according to equation (7);
(b) for i from 2 to M, compute x_i(t):
(i) calculate x_i(t) according to equations (7) and (9);
(c) get the matrix H = [x(t+1); u(t+1)].
(14) (4) Solve for the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
(i) W_out = D H^T (H H^T + βI)^{−1}, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t+1); u(t+1)] are the expected output matrix and the state collection matrix.
(15) (5) Calculate the ML-ESN output according to formula (10):
(i) select the SoftMax activation function and calculate the output f_out value.
(16) (6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
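Step (4) of Algorithm 1, the ridge-regression readout W_out = D H^T (H H^T + βI)^{−1}, can be sketched as follows; the shapes and random data are illustrative assumptions rather than the paper's actual matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

# H: collected state matrix (columns of stacked [x(t); u(t)] vectors),
# D: expected output matrix (one-hot class targets per time step).
T, dim, classes = 200, 60, 10
H = rng.standard_normal((dim, T))                     # state collection matrix
D = np.eye(classes)[rng.integers(0, classes, T)].T    # expected outputs

beta = 1e-6                                           # ridge regression parameter
w_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(dim))

print(w_out.shape)   # (10, 60)
```

The ridge term βI keeps H H^T well conditioned, which is why a small nonzero β is used even when the states look full-rank.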
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. Standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalization, most of the attack type data are concentrated between 0.4 and 0.6, Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient values and the Gini indexes for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address among the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the threshold for the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
It can therefore be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing strong positive correlations. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
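One plausible way to implement this 0.5-threshold Pearson filter with pandas (the column names echo the paper's features, but the synthetic data and helper logic are our own illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Toy frame: 'sloss' is almost a copy of 'spkts', so that pair exceeds the cut
spkts = rng.normal(size=200)
df = pd.DataFrame({
    "spkts": spkts,
    "sloss": spkts * 0.97 + rng.normal(scale=0.05, size=200),
    "rate":  rng.normal(size=200),
})

corr = df.corr(method="pearson").abs()
# Keep only the upper triangle so each pair is tested once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
dropped = [c for c in upper.columns if (upper[c] > 0.5).any()]
kept = df.drop(columns=dropped)
print(dropped, list(kept.columns))  # ['sloss'] ['spkts', 'rate']
```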
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, loss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better its training effect.
Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (ms)), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set at the start of the experiment; the specific parameters are shown in Table 4.
In Table 4, the input dimension is determined by the number of selected features; for example, in the
Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).
Figure 8: Normalized data distribution across the class labels Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal.
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.
Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of the ML-ESN is to use the reservoirs to generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 and the mean of its output is 0, which is more conducive to improving training efficiency. In addition, when the characteristics differ significantly, tanh yields a better detection effect, and the neuron fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because its output value is between 0 and 1, which directly reflects the probability of a given attack type.
In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based mainly on the relatively optimal parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.
The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (pairwise correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
12 Mathematical Problems in Engineering
The experimental environment is a Windows 10 Home 64-bit operating system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
The experimental results in Figure 11 show that using the filtering technology is generally better than not using it. Whether the data sample is small or large, the classification effect without filtering is lower than that with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, with 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 when no filtering is used, 0.95 with the Pearson index alone, 0.97 with the Gini index alone, and 0.99 with the combination of the Pearson and Gini indices.
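The two-stage filtering described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the thresholds `pearson_min` and `gini_min` are made-up values, and the Gini score here is an impurity reduction from a simple median split.

```python
import numpy as np

def pearson_scores(X, y):
    """Absolute Pearson correlation of each feature column with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_gain(x, y):
    """Impurity reduction from splitting feature x at its median."""
    mask = x <= np.median(x)
    if mask.all() or (~mask).all():
        return 0.0
    n = len(y)
    return gini(y) - mask.sum() / n * gini(y[mask]) - (~mask).sum() / n * gini(y[~mask])

def filter_features(X, y, pearson_min=0.1, gini_min=0.01):
    """Indices of features that pass both the Pearson and the Gini filter."""
    p = pearson_scores(X, y)
    g = np.array([gini_gain(X[:, j], y) for j in range(X.shape[1])])
    return np.where((p >= pearson_min) & (g >= gini_min))[0]

# Toy data: feature 0 tracks the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 1000)
X = np.column_stack([y + 0.1 * rng.normal(size=2000), rng.normal(size=2000)])
kept = filter_features(X, y)
```

On the toy data, only the informative feature survives both filters, which mirrors how the combined Pearson + Gini stage discards weakly correlated, low-importance flow features before training.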
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indices for filtering and then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (heatmap over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment.

Parameters                  | Values
Input dimension number      | 5
Output dimension number     | 10
Reservoir number            | 3
Reservoir neurons number    | 1000
Reservoir activation fn     | tanh
Output layer activation fn  | sigmoid
Update rate                 | 0.9
Random seed                 | 50
Regularization rate         | 1.0 × 10⁻⁶
algorithm to learn; test data are then used to verify the trained model, yielding detection results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is entirely feasible to quickly classify anomalous network traffic attacks with the ML-ESN learning model once feature filtering has been optimized with the combination of Pearson and Gini coefficients.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is only 0.02; for the Shellcode and Worms attack types, both accuracy and F1-score reach 0.99 with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN model, this paper completed three comparative experiments: (1) measuring time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results in Figure 13(a); (2) measuring detection accuracy at the same depths and neuron counts, with results in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases with the depth of the reservoir stack; for example, with 1000 neurons, training takes 2.11 ms at a reservoir depth of 5 but only 1.16 ms at a depth of 3. In addition, at the same reservoir depth, the more neurons in the model, the more training time it consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy first rises gradually as the reservoir depth increases; for example, at depth 3 with 1000 neurons the detection accuracy is 0.96, whereas at depth 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at the beginning, the training parameters of the model are progressively optimized as the depth increases, so the training accuracy keeps improving. However, when the model depth reaches 5, a certain degree of overfitting occurs, which reduces the accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method is fastest at only 0.0013 seconds, and the BP method is slowest at 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest at 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods (accuracy versus data size, from 20,000 to 160,000 packets, for no filtering, Pearson, Gini, and Pearson + Gini).
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most values of feature A and feature B are concentrated at 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B lies between 5 and 10, and only a few exceed 10.
Secondly, this paper compares simulation results with those of traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales (5,000, 20,000, 60,000, 120,000, and 160,000 packets), each containing the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on small test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-packet dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly: the GaussianNB algorithm falls below 50% accuracy, and the other algorithms remain close to 80%.
In contrast, the ML-ESN algorithm has a lower accuracy rate on small sample data: the smaller the number of samples, the lower the accuracy. However, once the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-packet dataset the accuracy reaches 96.75%, and on the 160,000-packet dataset it reaches 97.26%.
The reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and overall performance suffers.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for the nine attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings; (c) accuracy and time comparison of the BP, ESN, DecisionTree, and ML-ESN methods.
Figure 14: Distribution map of the first two statistical characteristics (feature distribution versus number of packets for feature A and feature B).
(true-positive rate) on the vertical axis. Generally speaking, a ROC graph uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
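As a hedged illustration of how such per-attack-type ROC curves and AUC values can be computed (assuming scikit-learn is available; the labels and scores below are toy one-vs-rest numbers, not the paper's data):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# One-vs-rest labels and classifier scores for a single attack type (toy numbers).
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.7])

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # FPR is the x-axis, TPR the y-axis
roc_auc = auc(fpr, tpr)                            # area under the ROC curve
```

Plotting `fpr` against `tpr` for each of the nine attack types, with `roc_auc` in the legend, reproduces the format of the graphs in Figures 16–19.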
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15: Detection results of different classification methods under different data sizes (accuracy versus data size for GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN).
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).
the other attack types are 99%. With the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate approaches 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies some open problems in AMI network security, and reviews the contributions of existing researchers in this area.
Secondly, in order to address the low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning and storage-memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on a simulation dataset. The test results show that this method has clear advantages over single-layer ESN networks, BP neural networks, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still require attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, owing to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform multicollection device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, to find the method's limitations in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, to greatly reduce learning and classification time; and (4) study of AMI-specific network protocols and establishment of an optimized ML-ESN network traffic deep learning model better suited to actual AMI applications, so as to apply it in actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System (no. ZDKJXM20170002)" of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability (no. SJCX201970)" for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, IEEE, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, IEEE, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, IEEE, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are as follows: a large-scale, random, sparse network of neurons (the reservoir) serves as the processing medium for the data; the input feature set is mapped from the low-dimensional input space to a high-dimensional state space; finally, the network is trained on the high-dimensional state space using linear regression or similar methods.
However, in the ESN, the number of neurons in the reservoir is difficult to balance. If the number of neurons is too large, the generalization ability cannot be guaranteed; if it is too small, the fitting ability is weakened. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.
In contrast, the ML-ESN model can satisfy the echo state property of the internal training network by stacking multiple reservoirs even when each individual reservoir is small, thereby improving the overall training performance of the model.
This paper therefore selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.
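The stacked-reservoir idea can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the spectral radius 0.9, random seed 50, three reservoirs, and the 1.0 × 10⁻⁶ regularization are taken from Table 4, while the reservoir size, uniform weight ranges, and toy input stream are arbitrary assumptions. Note that the readout is fitted by ridge regression, with no backpropagation.

```python
import numpy as np

rng = np.random.default_rng(50)  # random seed 50, as in Table 4

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """Random input weights plus a recurrent matrix rescaled to the target spectral radius."""
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    w = rng.uniform(-0.5, 0.5, (n_res, n_res))
    w *= spectral_radius / max(abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoirs(inputs, layers, alpha=0.9):
    """Drive the stacked reservoirs with an input sequence; return the last layer's states."""
    seq = inputs
    for w_in, w in layers:
        x = np.zeros(w.shape[0])
        states = []
        for u in seq:
            # Leaky-integrator update with tanh activation
            x = (1 - alpha) * x + alpha * np.tanh(w_in @ u + w @ x)
            states.append(x.copy())
        seq = np.array(states)  # this layer's states feed the next reservoir
    return seq

# Three stacked reservoirs (Table 4), with toy sizes and toy data.
n_in, n_res = 5, 50
layers = [make_reservoir(n_in, n_res)] + [make_reservoir(n_res, n_res) for _ in range(2)]
U = rng.normal(size=(200, n_in))        # toy input stream
y = (U[:, 0] > 0).astype(float)         # toy binary target
X = run_reservoirs(U, layers)

# Closed-form ridge-regression readout, regularization 1.0e-6 (Table 4)
reg = 1e-6
w_out = np.linalg.solve(X.T @ X + reg * np.eye(n_res), X.T @ y)
```

Because only `w_out` is trained, there is no backpropagation through the reservoirs, which is the property the abstract credits for avoiding the nonlinear fitting difficulties of gradient-trained networks.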
6. Simulation Test and Result Analysis
In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is analyzed.
6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack detection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].
Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset is more recent and more accurately reflects the characteristics of complex network attacks.
The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].
In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.
In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types or the special symbol "-", so the data cannot be processed directly. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.
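A minimal sketch of this kind of preprocessing, assuming pandas and using made-up rows (the column names mirror UNSW_NB15 features, but the category encoding and min-max scaling choices are an illustration, not necessarily the authors' exact procedure):

```python
import pandas as pd

# Toy rows shaped like UNSW_NB15 records: numeric fields mixed with
# character-type fields; "-" marks a record with no resolvable service.
df = pd.DataFrame({
    "sload":   [1200.5, 88.0, 430.2, 9.1],
    "service": ["http", "-", "dns", "http"],
    "state":   ["FIN", "CON", "FIN", "INT"],
})

# Map character-type features to integer codes.
for col in ["service", "state"]:
    df[col] = df[col].astype("category").cat.codes

# Min-max standardization of every column into [0, 1].
df = (df - df.min()) / (df.max() - df.min())
```

After this step, every feature is numeric and on a comparable scale, which is what the reservoir input layer expects.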
6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balance score), to evaluate the experimental results. Their calculation formulas are as follows:
accuracy = (TP + TN) / (TP + TN + FP + FN),
FPR = FP / (FP + TN),
TPR = TP / (FN + TP),
precision = TP / (TP + FP),
recall = TP / (FN + TP),
F-score = (2 × precision × recall) / (precision + recall).    (11)
The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.
Figure 6: ML-ESN basic model (the input layer U(t) feeds Reservoir 1 through Win; reservoirs 1..M with internal weights W1..WM are chained through the inter-reservoir weights Winter; the final reservoir state xM drives the output layer y(t) through Wout).
Mathematical Problems in Engineering 9
FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
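The indicators above can be computed directly from the four confusion counts; a minimal sketch (the helper name is ours, and FPR is written with the conventional FP/(FP + TN) denominator):

```python
def detection_metrics(tp, tn, fp, fn):
    """Evaluation indicators of Section 6.2 from raw confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                 # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)              # identical to TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

# Hypothetical counts: 90 attacks caught, 10 missed, 5 false alarms on 100 normal flows.
acc, fpr, prec, rec, f1 = detection_metrics(tp=90, tn=95, fp=5, fn=10)
```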
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              36.3
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
Algorithm 1: AMI network traffic classification.

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the parameters of ML-ESN and determine the corresponding number of input and output units according to the dataset:
    (i) set the training data length trainLen;
    (ii) set the test data length testLen;
    (iii) set the number of reservoirs Ri;
    (iv) set the number of neurons in each reservoir N;
    (v) set the update speed of the reservoirs α;
    (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the weights of the external connections between reservoirs Winter:
    (i) randomly initialize the values of Win, Wi, and Winter;
    (ii) through statistical normalization and spectral radius calculation, Winter and Wi are rescaled to meet the sparsity requirements. The calculation formulas are Wi = α(Wi/|λin|) and Winter = α(Winter/|λinter|), where λin and λinter are the spectral radii of the Wi and Winter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
    (i) for t from 1 to T, compute x1(t):
        (a) calculate x1(t) according to equation (7);
        (b) for i from 2 to M, compute xi(t), calculating xi(t) according to equations (7) and (9);
        (c) get the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure:
    (i) Wout = DH^T(HH^T + βI)^−1, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
    (i) select the SoftMax activation function and calculate the output fout value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.
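The core of Algorithm 1 can be sketched compactly in numpy. The following is an illustrative implementation, not the authors' code: the layer count, reservoir size, and weight-scaling choices are assumptions; reservoirs are chained through inter-reservoir weights rescaled to a fixed spectral radius; and the readout is solved with the ridge-regression formula Wout = DH^T(HH^T + βI)^−1. For brevity, the SoftMax output stage is replaced by an argmax over the linear readout.

```python
import numpy as np

rng = np.random.default_rng(50)  # seed value as in Table 4

def scaled(shape, rho):
    """Random matrix rescaled so its spectral radius equals rho."""
    w = rng.uniform(-0.5, 0.5, shape)
    return w * (rho / max(abs(np.linalg.eigvals(w))))

class MLESN:
    def __init__(self, n_in, n_out, n_res=100, n_layers=3, rho=0.9, beta=1e-6):
        self.beta, self.n_res, self.n_layers, self.n_out = beta, n_res, n_layers, n_out
        self.w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        self.w = [scaled((n_res, n_res), rho) for _ in range(n_layers)]
        # external connections between consecutive reservoirs (Winter)
        self.w_inter = [scaled((n_res, n_res), rho) for _ in range(n_layers - 1)]

    def states(self, U):
        """Run the samples through the chained reservoirs; one state column each."""
        x = [np.zeros(self.n_res) for _ in range(self.n_layers)]
        H = []
        for u in U:
            x[0] = np.tanh(self.w_in @ u + self.w[0] @ x[0])
            for i in range(1, self.n_layers):
                x[i] = np.tanh(self.w_inter[i - 1] @ x[i - 1] + self.w[i] @ x[i])
            H.append(np.concatenate([x[-1], u]))  # H = [x; u] as in step (3)
        return np.array(H).T

    def fit(self, U, labels):
        H = self.states(U)
        D = np.eye(self.n_out)[labels].T            # one-hot expected outputs
        reg = self.beta * np.eye(H.shape[0])        # ridge regularization βI
        self.w_out = D @ H.T @ np.linalg.inv(H @ H.T + reg)

    def predict(self, U):
        return (self.w_out @ self.states(U)).argmax(axis=0)
```

On a toy two-class problem with well-separated features, this sketch trains to near-perfect accuracy, since the ridge readout essentially interpolates the one-hot targets when the state dimension exceeds the sample count.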
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, deduplication, completion, and normalization to obtain normalized and standardized data; the standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
As can be seen from Figure 8, after normalizing the data, most of the attack type data are concentrated between 0.4 and 0.6, but Generic attack type data are concentrated between 0.7 and 0.9, and normal type data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini index for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the threshold of the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with values below 0.5 are retained.
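A sketch of this redundancy filter, assuming the rule is applied pairwise (for each feature pair correlated above the threshold, the later feature is dropped); the helper names and feature dictionary are illustrative:

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm_a = sum((x - ma) ** 2 for x in a) ** 0.5
    norm_b = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (norm_a * norm_b)

def filter_redundant(features, threshold=0.5):
    """Keep a feature only if its |Pearson| with every already-kept
    feature stays at or below `threshold`; otherwise it is redundant."""
    kept = []
    for name, col in features.items():
        if all(abs(pearson(col, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Illustrative columns: "b" is a scaled copy of "a" and gets dropped.
feats = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}
selected = filter_redundant(feats)  # -> ["a", "c"]
```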
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, that is, very small.
In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of Gini coefficients, it is known that the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better its training effect.
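For reference, the Gini value of a feature can be computed as the class-label impurity weighted over the groups induced by that feature's values. This small sketch (our own helper, not the paper's exact procedure) shows why a perfectly class-separating feature scores 0 while an uninformative one scores high:

```python
from collections import Counter

def gini_index(feature_values, labels):
    """Weighted Gini impurity of the labels after grouping records by a
    feature's value; lower means the feature separates classes better."""
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    total = len(labels)
    gini = 0.0
    for members in groups.values():
        counts = Counter(members)
        impurity = 1.0 - sum((c / len(members)) ** 2 for c in counts.values())
        gini += len(members) / total * impurity
    return gini

# A feature whose value determines the class exactly has Gini 0;
# a constant feature over a balanced two-class sample has Gini 0.5.
perfect = gini_index([0, 0, 1, 1], ["a", "a", "b", "b"])
useless = gini_index([0, 0, 0, 0], ["a", "a", "b", "b"])
```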
Based on the results of the Pearson and Gini coefficients for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (msec)), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes; the standardized values fall roughly between −0.70 and 0.60).
Figure 8: Normalized data distribution (data labels: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal; values range from 0.0 to 1.0).
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, under the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is to generate a complex dynamic space that changes with the input from the reservoir. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 and the average value of the data is 0, which is more conducive to improving training efficiency. Second, when the characteristics differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoir continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10−6, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.
The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45:1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experimental environment was Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either of these two filtering methods, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small data sample or a large data sample, the classification effect without filtering is lower than that with filtering.
In addition, using a single filtering method is not as good as using the combination of the two. For example, with 160000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter, then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit; values range from about 0.56 to 1).
Table 4: The parameters of the ML-ESN experiment.

Parameter                     Value
Input dimension number        5
Output dimension number       10
Reservoir number              3
Reservoir neurons number      1000
Reservoir activation fn       Tanh
Output layer activation fn    Sigmoid
Update rate                   0.9
Random seed                   50
Regularization rate           1.0 × 10−6
algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.
It can be seen from the detection results in Figure 12 that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization.
The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and numbers of neurons, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 211 ms, while the time consumption at a reservoir depth of 3 is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, when the reservoir depth is 3 and the number of neurons is 1000, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon occurs in the model, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods (x-axis: data size from 20000 to 160000; y-axis: accuracy from 0.4 to 1.0; curves: None, Pearson, Gini, and Pearson + Gini).
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated at 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper focuses on comparative simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, containing 5000, 20000, 60000, 120000, and 160000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially that of the GaussianNB algorithm, whose accuracy falls below 50%, while the other algorithms are very close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy of the algorithm reached 96.75%, and on the 160000-record dataset, the accuracy reached 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate the experimental performance. An ROC graph is composed of FPR (false-positive rate) as the horizontal axis and TPR
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR per attack type; detection rates between 0.94 and 1.0 and FPR values between 0.01 and 0.02 across Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for reservoir depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings (between about 0.91 and 0.96); (c) accuracy and time of BP, DecisionTree, ESN, and ML-ESN (best accuracy 0.96 for ML-ESN, lowest 0.77 for DecisionTree; times between 0.0013 and 0.0024 s).
Figure 14: Distribution map of the first two statistical characteristics (features A and B over 0-160000 packets; feature distribution values between 0.0 and 20.0).
(true-positive rate) as the vertical axis. Generally speaking, an ROC graph uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
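AUC can also be computed without drawing the curve, via the equivalent Mann-Whitney rank statistic; a small sketch (the helper name is ours, not from the paper):

```python
def auc_from_scores(scores, labels):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A perfect ranking gives AUC 1.0; a fully inverted one gives 0.0.
perfect = auc_from_scores([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
```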
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the
Figure 15: Detection results of different classification methods under different data sizes (x-axis: 0-160000 records; y-axis: accuracy from 0.4 to 1.0; methods: GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN).
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).
other attack types are 99%. In the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.
7. Conclusion

This article first analyzes the current situation of AMI network security research at home and abroad, elicits some problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.
Secondly, in order to solve the problems of low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main points of the next stage of this work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, greatly reducing the learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it to industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A Maamar and K Benahmed ldquoA hybrid model for anomaliesdetection in AMI system combining k-means clustering anddeep neural networkrdquo Computers Materials amp Continuavol 60 no 1 pp 15ndash39 2019
[2] Y Liu Safety Protection Technology of Electric Energy Mea-surement Collection and Billing China Electric Power PressBeijing China 2014
[3] B M Nasim M Jelena B M Vojislav and K Hamzeh ldquoAframework for intrusion detection system in advancedmetering infrastructurerdquo Security and Communication Net-works vol 7 no 1 pp 195ndash205 2014
[4] H Ren Z Ye and Z Li ldquoAnomaly detection based on adynamic Markov modelrdquo Information Sciences vol 411pp 52ndash65 2017
[5] F Fathnia and D B M H Javidi ldquoDetection of anomalies insmart meter data a density-based approachrdquo in Proceedings ofthe 2017 Smart Grid Conference (SGC) pp 1ndash6 Tehran Iran2017
[6] Z Y Wang G J Gong and Y F Wen ldquoAnomaly diagnosisanalysis for running meter based on BP neural networkrdquo inProceedings of the 2016 International Conference on Com-munications Information Management and Network SecurityGold Coast Australia 2016
[7] M Stephen H Brett Z Saman and B Robin ldquoAMIDS amulti-sensor energy theft detection framework for advancedmetering infrastructuresrdquo IEEE Journal on Selected Areas inCommunications vol 31 no 7 pp 1319ndash1330 2013
[8] Y Chen J Tao Q Zhang et al ldquoSaliency detection via im-proved hierarchical principle component analysis methodrdquoWireless Communications and Mobile Computing vol 2020Article ID 8822777 12 pages 2020
Mathematical Problems in Engineering 19
[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012
[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx
[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013
[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[13] Q R Zhang M Zhang T H Chen et al ldquoElectricity theftdetection using generative modelsrdquo in Proceedings of the 2018IEEE 30th International Conference on Tools with ArtificialIntelligence (ICTAI) Volos Greece 2018
[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
FP (false positive): the number of normal network traffic instances identified as abnormal.
FN (false negative): the number of abnormal network traffic instances identified as normal.
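Together with TP and TN, these counts determine the accuracy, F1-score, and FPR reported in the experiments below. A minimal sketch of the standard formulas (function and variable names are ours, not from the paper):

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, F1-score, and false-positive rate from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # detection rate (TPR)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)                         # normal traffic flagged as abnormal
    return accuracy, f1, fpr

# Illustrative counts only:
acc, f1, fpr = metrics(tp=95, tn=90, fp=2, fn=5)
```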
6.3. Simulation Experiment Steps and Results
Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.
Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044
Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initialize the ML-ESN parameters and determine the number of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons in each reservoir N;
  (v) set the reservoir update rate α;
  (vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix w_in, the internal reservoir connection weights w_i (1 ≤ i ≤ M), and the external inter-reservoir connection weights w_inter:
  (i) randomly initialize the values of w_in, w_i, and w_inter;
  (ii) through statistical normalization and spectral radius scaling, rescale w_i and w_inter to meet the sparsity requirements. The calculation formulas are w_i = α(w_i/|λ_in|) and w_inter = α(w_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the w_i and w_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T, compute x_1(t) according to equation (7);
  (ii) for i from 2 to M, compute x_i(t) according to equations (7) and (9);
  (iii) collect the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
  (i) W_out = D·H^T (H·H^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
  (i) select the SoftMax activation function and calculate the output value f_out.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifier, and calculate the classification error rate.

ALGORITHM 1: AMI network traffic classification.
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
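A hedged sketch of this preprocessing chain (deduplication, completion, standardization, then min-max normalization) using pandas; the paper does not specify its exact cleaning rules, so the choices below are assumptions:

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Step 2 sketch: dedup, fill gaps, standardize, then normalize to [0, 1]."""
    df = df.drop_duplicates().copy()                 # data deduplication
    df = df.fillna(df.mean(numeric_only=True))       # data completion
    num = df.select_dtypes("number").columns
    # Standardization: zero mean, unit variance (as in Figure 7)
    df[num] = (df[num] - df[num].mean()) / df[num].std()
    # Normalization to [0, 1] (as in the Figure 8 distribution)
    df[num] = (df[num] - df[num].min()) / (df[num].max() - df[num].min())
    return df
```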
As can be seen from Figure 8, after normalization most attack-type data are concentrated between 0.4 and 0.6, Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient values and the Gini indices for the standardized data. In the experiment, the Pearson coefficient values and Gini indices for the UNSW_NB15 standardized data are shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections with the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the Pearson correlation threshold is set to 0.5: features with a pairwise Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.
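This greedy pairwise filtering can be sketched as follows; the traversal order and tie-breaking (which member of a correlated pair is dropped) are our assumptions:

```python
import pandas as pd

def pearson_filter(df: pd.DataFrame, threshold: float = 0.5) -> list:
    """Keep a feature only if its |Pearson r| with every kept feature <= threshold."""
    corr = df.corr(method="pearson").abs()
    keep = []
    for col in corr.columns:
        # Discard col if it correlates too strongly with an already kept feature
        if all(corr.loc[col, k] <= threshold for k in keep):
            keep.append(col)
    return keep
```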
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt (TCP connection setup time) and ackdat (the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1 and are therefore very small.
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, loss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service equal 1. From the principle of the Gini coefficient, the smaller a feature's Gini value, the lower its impurity in the dataset and the better its training effect.
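The Gini principle described here matches the usual CART impurity; a small sketch (the paper does not specify how continuous feature values are binned, so this assumes discrete values):

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label set: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def feature_gini(values, labels):
    """Weighted Gini impurity after splitting the samples by feature value."""
    values, labels = np.asarray(values), np.asarray(labels)
    total = len(labels)
    return sum((values == v).sum() / total * gini(labels[values == v])
               for v in np.unique(values))
```

A feature that perfectly separates the classes scores 0; a feature whose values carry no class information scores the impurity of the whole label set, which is why low-Gini features train better.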
Based on the Pearson and Gini coefficient results for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features for model classification: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, msec), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initialized in the experiment; the specific values are shown in Table 4.
In Table 4, the input dimension is determined by the number of selected features. For example, in the
Figure 7: Partial feature data after standardization (sample rows of the standardized features dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).
Figure 8: Normalized data distribution (normalized value ranges for the Normal class and the nine attack classes: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, for the same dataset, as the number of reservoirs increases, model training time gradually increases, while detection accuracy does not increase monotonically but first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is to use the reservoirs to generate a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. To increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its output range is between −1 and 1 with a zero-centered mean, which is conducive to improving training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies this feature effect during training.
The output layer uses the sigmoid activation function because its output lies between 0 and 1, which directly reflects the probability of a given attack type.
In Table 4, the last three parameters are the important tuning parameters of the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10^−6, respectively, based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.
The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.
Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation matrix over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).
The experimental environment is a Windows 10 Home 64-bit operating system, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on small or large data samples, the classification accuracy without filtering is lower than with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, on 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 with no filtering, 0.95 with Pearson filtering alone, and 0.97 with Gini filtering alone, while with the combined Pearson and Gini filtering the accuracy reaches 0.99.
6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indices and then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15 (over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4: The parameters of the ML-ESN experiment.

Parameters                        Values
Input dimension number            5
Output dimension number           10
Reservoir number                  3
Reservoir neurons number          1000
Reservoir activation function     tanh
Output layer activation function  sigmoid
Update rate                       0.9
Random seed                       50
Regularization rate               1.0 × 10^−6
algorithm to learn, and finally uses the test data to verify the trained model, obtaining detection results for the different attack types. The classification results for the nine abnormal attack types are shown in Figure 12.
The detection results in Figure 12 show that it is entirely feasible to quickly classify anomalous network traffic attacks with the ML-ESN learning model when the network traffic features are filtered and optimized by the combination of Pearson and Gini coefficients.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, for the Generic attack, the accuracy is 0.98, the F1-score is 0.98, and the FPR is only 0.02; for the Shellcode and Worms attack types, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) detection time at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) detection accuracy at the same depths and neuron counts, with results shown in Figure 13(b); and (3) a comparison of time consumption and accuracy against three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).
As can be seen from Figure 13(a), for the same dataset and the same number of neurons, training time increases with reservoir depth; for example, with 1000 neurons, a reservoir depth of 5 takes 2.11 ms, while a depth of 3 takes only 1.16 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time it consumes.
As can be seen from Figure 13(b), for the same dataset and the same number of neurons, training accuracy first increases with reservoir depth; for example, with 1000 neurons, a depth of 3 achieves a detection accuracy of 0.96, while a depth of 2 achieves only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at first, the training parameters of the model are gradually optimized as the depth increases, so training accuracy keeps improving. However, when the model depth increases to 5, a degree of overfitting occurs, which causes the accuracy to decrease.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods (accuracy versus training data size for no filtering, Pearson only, Gini only, and Pearson + Gini).
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most values of feature A and feature B are concentrated at 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, namely 5000, 20000, 60000, 120000, and 160000 packets, each containing the 9 different attack types. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, the detection accuracy of the traditional machine learning methods is relatively high on the small test sets. For example, on 20000 packets, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieve 100% success rates. However, on the large test sets, the classification accuracy of the traditional algorithms drops significantly, especially GaussianNB, whose accuracy falls below 50%, while the other algorithms stay close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small sample data: the smaller the number of samples, the lower the accuracy. However, once the test set grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-packet dataset the accuracy reaches 96.75%, and on the 160000-packet dataset it reaches 97.26%.
In the experiment, the reason for the poor classification on small samples is that ML-ESN generally requires large-capacity data for self-learning in order to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and overall performance is not at its best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment uses ROC (receiver operating characteristic) graphs to evaluate performance, with FPR (false-positive rate) as the horizontal axis and TPR
Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each attack type).
Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings; (c) accuracy and time (s) comparison of BP, ESN, DecisionTree, and ML-ESN.
Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B versus the number of packages).
(true-positive rate) as the vertical axis. Generally speaking, the ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
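The AUC can also be computed directly from classifier scores via the equivalent pairwise-ranking (Mann-Whitney) formulation; a small sketch, independent of the authors' tooling:

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one
    (equivalent to the area under the TPR-vs-FPR curve)."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # Count pairwise wins; ties count half
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))
```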
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes.
Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).
Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).
other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the general detection success rate is 94%. In the BP algorithm, the detection rate of the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current situation of AMI network security research at home and abroad, raises some problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.
Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, there are still some issues that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform some multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it in actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project of "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System (no. ZDKJXM20170002)" of China Southern Power Grid Corporation, the project of "Practical Innovation and Enhancement of Entrepreneurial Ability (no. SJCX201970)" for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation,"
2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of GLOBECOM '03, IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.
Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization, yielding normalized and standardized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
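The preprocessing chain listed in this step can be sketched as follows. This is an illustrative pure-Python outline, not the authors' pipeline; the field names dur and spkts and the toy records are assumptions standing in for the real schema:

```python
# Minimal sketch of the Step-2 preprocessing chain: deduplication,
# completion of missing values, and min-max normalization to [0, 1].

def preprocess(rows):
    # data deduplication: keep the first occurrence of each record
    seen, deduped = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            deduped.append(dict(r))
    fields = {k for r in deduped for k in r}
    # data completion: replace missing values with the field mean
    for f in fields:
        vals = [r[f] for r in deduped if r.get(f) is not None]
        mean = sum(vals) / len(vals)
        for r in deduped:
            if r.get(f) is None:
                r[f] = mean
    # min-max normalization of every field to [0, 1]
    for f in fields:
        lo = min(r[f] for r in deduped)
        hi = max(r[f] for r in deduped)
        span = (hi - lo) or 1.0
        for r in deduped:
            r[f] = (r[f] - lo) / span
    return deduped

data = preprocess([
    {"dur": 0.1, "spkts": 2},
    {"dur": 0.1, "spkts": 2},   # duplicate record, removed
    {"dur": 0.5, "spkts": 10},
    {"dur": None, "spkts": 4},  # missing duration, filled with the mean
])
```

A real pipeline would also apply z-score standardization (as in Figure 7) before normalizing; the min-max step alone is shown here for brevity.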
As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, while the Generic attack-type data are concentrated between 0.7 and 0.9 and the normal-type data are concentrated between 0.1 and 0.3.
Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.
It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.
In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the threshold for the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.
Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
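The Pearson-threshold filter described above can be sketched as follows; the feature vectors and the greedy keep/discard order are illustrative assumptions, not the paper's implementation:

```python
# Pearson correlation between two feature vectors, and a greedy filter
# that drops any feature whose |r| with an already-kept feature exceeds 0.5.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def filter_features(features, threshold=0.5):
    kept = []
    for name, vec in features.items():
        # discard this feature if it correlates strongly with a kept one
        if all(abs(pearson(vec, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

features = {
    "spkts": [1.0, 2.0, 3.0, 4.0],
    "sloss": [1.1, 2.0, 3.2, 3.9],   # nearly identical to spkts -> dropped
    "rate":  [4.0, 1.0, 3.0, 2.0],   # weakly correlated -> kept
}
kept = filter_features(features)     # -> ["spkts", "rate"]
```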
In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.
As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.
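One way to obtain such a per-feature Gini value, consistent with the principle stated above (lower impurity means the feature separates the classes more cleanly), is the weighted Gini impurity of the class labels after grouping samples by feature value. The toy features and labels below are invented for illustration:

```python
# Weighted Gini impurity of the class labels after partitioning by a
# (discretized) feature: 0 means a perfect split, 0.5 an uninformative one.
from collections import Counter, defaultdict

def gini_impurity(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def feature_gini(values, labels):
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())

labels = ["normal", "normal", "dos", "dos"]
good_feature = [0, 0, 1, 1]   # separates the classes perfectly -> Gini 0.0
poor_feature = [0, 1, 0, 1]   # uninformative -> Gini 0.5
```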
Based on the results of the Pearson and Gini coefficients for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in ms), and dtcpb (destination TCP base sequence number).
Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific values are shown in Table 4.
In Table 4, the input dimension is determined according to the number of selected features. For example, in the
   dur          proto        service      state        spkts        dpkts        sbytes       dbytes
0  -0.19102881  0.151809388  -0.70230738  -0.40921807  -0.10445581  -0.1357688   -0.04913362  -0.10272556
1  -0.10948479  0.151809388  -0.70230738  -0.40921807  -0.04601353  0.172598967  -0.04640996  0.188544124
2  0.040699218  0.151809388  -0.70230738  -0.40921807  -0.08984524  -0.02693312  -0.04852709  -0.01213277
3  0.049728681  0.151809388  0.599129702  -0.40921807  -0.0606241   -0.06321168  -0.04701649  -0.09856278
4  -0.14041703  0.151809388  -0.70230738  -0.40921807  -0.07523467  -0.11762952  -0.04755436  -0.10205729
5  -0.15105199  0.151809388  -0.70230738  -0.40921807  -0.07523467  -0.11762952  -0.04755436  -0.10205729
6  -0.11145895  0.151809388  -0.70230738  -0.40921807  -0.07523467  -0.09949024  -0.04755436  -0.10145863
7  -0.12928625  0.151809388  -0.70230738  -0.40921807  -0.07523467  -0.09949024  -0.04755436  -0.10145863
8  -0.12599609  0.151809388  -0.70230738  -0.40921807  -0.07523467  -0.09949024  -0.04755436  -0.10145863
Figure 7: Partial feature data after standardization.
Figure 8: Normalized data distribution. (Box plot over the class labels Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal; y-axis: normalized value from 0.0 to 1.0.)
UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10, and these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, under the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase indefinitely; it first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of the ML-ESN is that the reservoir generates a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the tanh activation function is used in the reservoir layer because its value range lies between −1 and 1 with an average value of 0, which is more conducive to improving training efficiency. Second, when the characteristics differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoir continuously amplifies the feature effect.
The output layer uses the sigmoid activation function because its output value lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are the important parameters for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10^-6, respectively, based mainly on relatively optimized parameter values obtained through multiple experiments.
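Putting the Table 4 parameters together, a scaled-down sketch of stacked leaky-integrator reservoirs with a ridge-regression readout might look like the following. The weight initialization, the spectral-radius scaling, and the 50-neuron reservoirs (instead of 1000) are assumptions for illustration, not the authors' code:

```python
# Compact ML-ESN-style forward pass: three stacked tanh reservoirs with
# leaky updates (rate 0.9), a ridge readout (regularization 1e-6), and a
# sigmoid squashing of the outputs to [0, 1].
import numpy as np

rng = np.random.default_rng(50)           # "random seed" parameter
ALPHA, RIDGE = 0.9, 1e-6                  # update rate, regularization rate

def make_layer(n_in, n_res=50):
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    w = rng.uniform(-0.5, 0.5, (n_res, n_res))
    w *= 0.9 / max(abs(np.linalg.eigvals(w)))   # keep the echo state property
    return w_in, w

def run_reservoir(u_seq, w_in, w):
    x = np.zeros(w.shape[0])
    states = []
    for u in u_seq:
        # leaky-integrator update with tanh activation
        x = (1 - ALPHA) * x + ALPHA * np.tanh(w_in @ u + w @ x)
        states.append(x.copy())
    return np.array(states)

# each layer's state sequence feeds the next layer
u_seq = rng.normal(size=(20, 5))          # 5 input features, 20 time steps
states = u_seq
for n_in in (5, 50, 50):
    w_in, w = make_layer(n_in)
    states = run_reservoir(states, w_in, w)

# ridge-regression readout, then sigmoid to get per-class probabilities
targets = rng.normal(size=(20, 10))       # 10 outputs (9 attacks + normal)
w_out = np.linalg.solve(states.T @ states + RIDGE * np.eye(50),
                        states.T @ targets)
probs = 1.0 / (1.0 + np.exp(-(states @ w_out)))
```

Because the readout is a closed-form least-squares solve rather than backpropagation, training cost is dominated by running the reservoirs forward, which is the property the paper exploits.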
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.
The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45:1.
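As a quick sanity check, the packet counts implied by these ratios can be recovered arithmetically (approximately, since the ratios are reported rounded to two decimals):

```python
# Recover normal/attack counts from a total and a normal:attack ratio.

def split_counts(total, ratio):
    # ratio = normal / attack, total = normal + attack
    attack = total / (1 + ratio)
    normal = total - attack
    return round(normal), round(attack)

train_normal, train_attack = split_counts(175320, 0.46)  # ~55238 / ~120082
test_normal, test_attack = split_counts(82311, 0.45)     # ~25545 / ~56766
```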
Figure 9: The Pearson coefficient values for UNSW_NB15. (Heat map over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; individual cell values are not legible in the extracted text.)
The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether for a small data sample or a large data sample, the classification effect without filtering is lower than with filtering.
In addition, a single filtering method is not as good as the combination of the two. For example, on the 160000 training packets, when no filtering is used, the recognition accuracy for abnormal traffic is only 0.94; with Pearson filtering alone, the accuracy is 0.95; with Gini filtering alone, 0.97; and with the combination of the Pearson and Gini indexes, the accuracy reaches 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training
Figure 10: The Gini values for UNSW_NB15. (Heat map over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit; individual cell values are not legible in the extracted text.)
Table 4: The parameters of the ML-ESN experiment.

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      tanh
Output layer activation fn   sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10^-6
algorithm to learn, and finally uses the test data to verify the trained model, obtaining the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
The detection results in Figure 12 show that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks after filtering and optimizing the network traffic features with the combination of Pearson and Gini coefficients.
The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in the Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in the Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
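The quoted per-class figures follow from one-vs-rest confusion counts; a minimal sketch, with invented counts chosen to mirror the Generic row (accuracy 0.98, F1 0.98, FPR 0.02):

```python
# Accuracy, F1-score, and false-positive rate from one-vs-rest
# confusion counts (tp, fp, tn, fn) for a single attack class.

def classification_metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)          # false-positive rate
    return accuracy, f1, fpr

acc, f1, fpr = classification_metrics(tp=98, fp=2, tn=98, fn=2)
# acc = 0.98, f1 = 0.98, fpr = 0.02
```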
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a reservoir depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model at first gradually increases with the depth of the model reservoir; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method has good detection ability for the different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11: Classification effect of different filtering methods. (Legend: none, Pearson, Gini, Pearson + Gini; x-axis: data size from 20000 to 160000; y-axis: accuracy from 0.4 to 1.0.)
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper carried out comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, namely, 5000, 20000, 60000, 120000, and 160000 packets, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-packet data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, and the other algorithms are very close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-packet dataset the accuracy of the algorithm reached 96.75%, and on the 160000-packet dataset it reached 97.26%.
In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with FPR (false-positive rate) as the horizontal axis and TPR
[Figure 12: Classification results of the ML-ESN method. Bar chart of the detection rate (accuracy, F1-score, FPR) for the nine attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; accuracy and F1-score values range from 0.94 to 1.0, and FPR values from 0.01 to 0.02.]
Mathematical Problems in Engineering 15
[Figure 13: ML-ESN results at different reservoir depths. (a) Detection time (ms) at reservoir depths 2–5 with 500, 1000, and 2000 neurons. (b) Accuracy (0.91–0.96) at the same depths and neuron counts. (c) Accuracy and time (s) comparison of BP, ESN, DecisionTree, and ML-ESN; accuracies range from 0.77 to 0.96 and times from 0.0013 to 0.0024 s.]
[Figure 14: Distribution map of the first two statistical characteristics (Feature A and Feature B) over 0–160000 packages; feature distribution axis 0.0–20.0.]
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
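Since the comparison below rests on AUC, it is worth noting that AUC can be computed directly from classifier scores without plotting. A minimal one-vs-rest sketch using the rank-statistic formulation (which equals the area under the ROC curve); this is illustrative only, not the authors' evaluation code:

```python
# Minimal per-class AUC (one-vs-rest), assuming `scores` are the
# classifier's scores for the positive ("attack") class.
def auc_score(y_true, scores):
    """AUC for binary labels (1 = attack, 0 = normal) via the rank statistic."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):  # assign average ranks to tied scores
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos = [r for r, t in zip(ranks, y_true) if t == 1]
    n_pos, n_neg = len(pos), len(y_true) - len(pos)
    return (sum(pos) - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# A perfectly separated toy example gives AUC = 1.0
print(auc_score([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))  # -> 1.0
```

Running this separately for each of the nine attack types against all other traffic yields the per-class AUC values reported in Figures 16–19.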
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate of four attack types is 100%, and the detection rates for
[Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, Our_ML-ESN) under data sizes from 20000 to 160000; accuracy axis 0.4–1.0.]
[Figure 16: Classification ROC diagram of the single-layer ESN algorithm. AUC by attack type: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]
[Figure 18: Classification ROC diagram of the DecisionTree algorithm. AUC by attack type: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]
[Figure 19: Classification ROC diagram of our ML-ESN algorithm. AUC by attack type: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]
[Figure 17: Classification ROC diagram of the BP algorithm. AUC by attack type: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]
the other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate of the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current situation of AMI network security research at home and abroad, raises some open problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.
Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in the existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, there are still some issues that need attention and optimization in this paper, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors of this article suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, greatly reducing the learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model that is more in line with the actual application of AMI, so as to apply it to actual industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, pp. 96–111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350–355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
In the UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
Generally speaking, under the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the model detection accuracy does not increase monotonically: it first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.
The basic idea of ML-ESN is to generate, from the reservoir, a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
In Table 4, the reason the tanh activation function is used in the reservoir layer is that its value range is between −1 and 1 with a mean of 0, which is conducive to improving training efficiency. Second, when features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoir continuously amplifies the feature effect.
The reason the output layer uses the sigmoid activation function is that the output value of sigmoid lies between 0 and 1, which directly reflects the probability of a certain attack type.
In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.
6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.
The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.
The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.
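As a quick sanity check, those ratios imply approximate class counts (derived here for illustration; the paper reports only the totals and ratios):

```python
# Back-of-the-envelope class counts implied by the stated normal:attack
# ratios (0.46:1 for training, 0.45:1 for test).
def split_counts(total, ratio):
    normal = total * ratio / (1 + ratio)
    return round(normal), total - round(normal)

print(split_counts(175320, 0.46))  # -> (55238, 120082)
print(split_counts(82311, 0.45))   # -> (25545, 56766)
```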
[Figure 9: The Pearson coefficient values for UNSW_NB15, shown as a correlation heatmap (scale 0.0–1.0) over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm.]
The experimental environment was Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on a small or a large data sample, the classification effect without filtering is lower than that with filtering.
In addition, using a single filtering method is not as good as using a combination of the two. For example, on the 160000 training packets, when no filter method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy of the model reaches 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter, and then uses the ML-ESN training
[Figure 10: The Gini values for UNSW_NB15, shown as a heatmap (scale 0.0–1.0) over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit.]
Table 4: The parameters of the ML-ESN experiment.

Parameters                      Values
Input dimension number          5
Output dimension number         10
Reservoir number                3
Reservoir neurons number        1000
Reservoir activation fn         Tanh
Output layer activation fn      Sigmoid
Update rate                     0.9
Random seed                     50
Regularization rate             1.0 × 10⁻⁶
algorithm to learn, and then uses the test data to verify the training model, obtaining the test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.
It can be seen from the detection results in Figure 12 that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization.
The accuracy, F1-score, and FPR results are very good across all nine attack types. For example, in the Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in the Shellcode and Worms attack type detection, both the accuracy and F1-score reached 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
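The three metrics quoted above follow the standard one-vs-rest confusion-matrix definitions. A minimal sketch (the counts are illustrative, chosen to reproduce the Generic row, not taken from the paper's confusion matrix):

```python
# Accuracy, F1-score, and FPR from binary confusion counts
# (tp, fp, tn, fn) for one attack type treated one-vs-rest.
def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return accuracy, f1, fpr

acc, f1, fpr = metrics(tp=98, fp=2, tn=98, fn=2)
print(round(acc, 2), round(f1, 2), round(fpr, 2))  # -> 0.98 0.98 0.02
```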
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy against the other three algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, when the number of neurons is 1000, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a reservoir depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, the training accuracy of the model at first increases with the reservoir depth; for example, when the reservoir depth is 3 and the number of neurons is 1000, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, as the depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results reflect that the method proposed in this paper has good detection ability for different attack types after model self-learning.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
[Figure 11: Classification effect of different filtering methods (None, Pearson, Gini, Pearson + Gini) for data sizes from 20000 to 160000; accuracy axis 0.4–1.0.]
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
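A sketch of such a comparison loop, using scikit-learn's implementations of the four baselines; the UNSW_NB15 loading step is replaced here by a synthetic multi-class dataset, so the scores will not reproduce Figure 15:

```python
# Baseline comparison sketch (synthetic stand-in for UNSW_NB15;
# 5 features and a fixed seed of 50 echo the paper's setup).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=4, random_state=50)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.32, random_state=50)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNeighbors": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=50),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=50),
}
for name, clf in baselines.items():
    clf.fit(X_tr, y_tr)
    print(name, round(clf.score(X_te, y_te), 4))
```

In the paper's experiment, the same loop is simply run at each dataset scale (5000 to 160000 records) and the accuracies plotted against the ML-ESN result.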
Figure 18 Classification ROC diagram of DecisionTree algorithm
10
10
08
08
06
06
00
00
02
02
04
04
True
-pos
itive
rate
False-positive rate
Analysis ROC curve (area = 099)Backdoor ROC curve (area = 099)Shellcode ROC curve (area = 100)Worms ROC curve (area = 100)
Generic ROC curve (area = 097)Exploits ROC curve (area = 100)
DoS ROC curve (area = 099)Fuzzers ROC curve (area = 099)
Reconnaissance ROC curve (area = 100)
Figure 19 Classification ROC diagram of our ML-ESN algorithm
Analysis ROC curve (area = 095)Backdoor ROC curve (area = 097)Shellcode ROC curve (area = 096)Worms ROC curve (area = 096)
Generic ROC curve (area = 099)Exploits ROC curve (area = 096)
DoS ROC curve (area = 097)Fuzzers ROC curve (area = 087)
Reconnaissance ROC curve (area = 095)
10
10
08
08
06
06
00
00
02
02
04
04
True
-pos
itive
rate
False-positive rate
Figure 17 Classification ROC diagram of BP algorithm
18 Mathematical Problems in Engineering
other attack types are 99 However in the single-layer ESNalgorithm the best detection success rate is only 97 andthe general detection success rate is 94 In the BP algo-rithm the detection rate of the Fuzzy attack type is only 87and the false-positive rate exceeds 20 In the traditionalDecisionTree algorithm its detection effect is the worstBecause the detection success rate is generally less than 80and the false-positive rate is close to 35
7 Conclusion
is article firstly analyzes the current situation of AMInetwork security research at home and abroad elicits someproblems in AMI network security and introduces thecontributions of existing researchers in AMI networksecurity
Secondly in order to solve the problems of low accuracyand high false-positive rate of large-capacity network trafficdata in the existing methods an AMI traffic detection andclassification algorithm based onML-ESN deep learning wasproposed
e main contributions of this article are as follows (1)establishing the AMI network streaming metadata standard(2) the combination of Pearson and Gini coefficients is usedto quickly solve the problem of extracting important featuresof network attacks from large-scale AMI network streamswhich greatly saves model detection and training time (3)using ML-ESNrsquos powerful self-learning and storage andmemory capabilities to accurately and quickly classify un-known and abnormal AMI network attacks and (4) theproposed method was tested and verified in the simulationdataset Test results show that this method has obviousadvantages over single-layer ESN network BP neural net-work and other machine learning methods with high de-tection accuracy and low time consumption
Of course there are still some issues that need attentionand optimization in this paper For example how to establishAMI network streaming metadata standards that meet therequirements of different countries and different regions Atpresent due to the complex structure of AMI and otherelectric power informatization networks it is difficult to forma centralized and unified information collection source somany enterprises have not really established a securitymonitoring platform for information fusion
erefore the author of this article suggests that beforeanalyzing the network flow it is best to perform certainmulticollection device fusion processing to improve thequality of the data itself so as to better ensure the accuracy ofmodel training and detection
e main points of the next work in this paper are asfollows (1) long-term large-scale test verification of theproposed method in the real AMI network flow so as to findout the limitations of the method in the real environment(2) carry out unsupervised ML-ESN AMI network trafficclassification research to solve the problem of abnormalnetwork attack feature extraction analysis and accuratedetection (3) further improve the model learning abilitysuch as learning improvement through parallel traininggreatly reducing the learning time and classification time (4)
study the AMI network special protocol and establish anoptimized ML-ESN network traffic deep learning model thatis more in line with the actual application of AMI so as toapply it to actual industrial production
Data Availability
e data used to support the findings of this study areavailable from the corresponding author upon request
Conflicts of Interest
e authors declare that they have no conflicts of interest
Acknowledgments
is work was supported by the Key Scientific and Tech-nological Project of ldquoResearch and Application of KeyTechnologies for Network Security Situational Awareness ofElectric PowerMonitoring System (no ZDKJXM20170002)rdquoof China Southern Power Grid Corporation the project ofldquoPractical Innovation and Enhancement of EntrepreneurialAbility (no SJCX201970)rdquo for Professional Degree Post-graduates of Changsha University of Technology and OpenFund Project of Hunan Provincial Key Laboratory of Pro-cessing of Big Data on Transportation (no A1605)
References
[1] A Maamar and K Benahmed ldquoA hybrid model for anomaliesdetection in AMI system combining k-means clustering anddeep neural networkrdquo Computers Materials amp Continuavol 60 no 1 pp 15ndash39 2019
[2] Y Liu Safety Protection Technology of Electric Energy Mea-surement Collection and Billing China Electric Power PressBeijing China 2014
[3] B M Nasim M Jelena B M Vojislav and K Hamzeh ldquoAframework for intrusion detection system in advancedmetering infrastructurerdquo Security and Communication Net-works vol 7 no 1 pp 195ndash205 2014
[4] H Ren Z Ye and Z Li ldquoAnomaly detection based on adynamic Markov modelrdquo Information Sciences vol 411pp 52ndash65 2017
[5] F Fathnia and D B M H Javidi ldquoDetection of anomalies insmart meter data a density-based approachrdquo in Proceedings ofthe 2017 Smart Grid Conference (SGC) pp 1ndash6 Tehran Iran2017
[6] Z Y Wang G J Gong and Y F Wen ldquoAnomaly diagnosisanalysis for running meter based on BP neural networkrdquo inProceedings of the 2016 International Conference on Com-munications Information Management and Network SecurityGold Coast Australia 2016
[7] M Stephen H Brett Z Saman and B Robin ldquoAMIDS amulti-sensor energy theft detection framework for advancedmetering infrastructuresrdquo IEEE Journal on Selected Areas inCommunications vol 31 no 7 pp 1319ndash1330 2013
[8] Y Chen J Tao Q Zhang et al ldquoSaliency detection via im-proved hierarchical principle component analysis methodrdquoWireless Communications and Mobile Computing vol 2020Article ID 8822777 12 pages 2020
Mathematical Problems in Engineering 19
[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012
[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx
[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013
[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[13] Q R Zhang M Zhang T H Chen et al ldquoElectricity theftdetection using generative modelsrdquo in Proceedings of the 2018IEEE 30th International Conference on Tools with ArtificialIntelligence (ICTAI) Volos Greece 2018
[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese
[15] S Neetesh J C Bong and G Santiago ldquoSecure and privacy-preserving concentration of metering data in AMI networksrdquoin Proceedings of the 2017 IEEE International Conference onCommunications (ICC) Paris France 2017
[16] C Euijin P Younghee and S Huzefa ldquoIdentifying maliciousmetering data in advanced metering infrastructurerdquo in Pro-ceedings of the 2014 IEEE 8th International Symposium onService Oriented System Engineering pp 490ndash495 OxfordUK 2014
[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014
[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015
[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018
[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015
[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016
[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203
[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017
[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017
[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015
[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019
[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018
[28] N Gao L Gao Y He et al ldquoA lightweight intrusion detectionmodel based on autoencoder network with feature reductionrdquoActa Electronica Sinica vol 45 no 3 pp 730ndash739 2017 inChinese
[29] M Yousefi-Azar V Varadharajan L Hamey andU Tupalula ldquoAutoencoder-based feature learning for cybersecurity applicationsrdquo in Proceedings of the 2017 InternationalJoint Conference on Neural Networks (IJCNN) IEEE NeuralNetworks pp 3854ndash3861 Anchorage AK USA 2017
[30] Y Wang H Zhou H Feng et al ldquoNetwork traffic classifi-cation method basing on CNNrdquo Journal on Communicationsvol 39 no 1 pp 14ndash23 2018 in Chinese
[31] S Kaur and M Singh ldquoHybrid intrusion detection and sig-nature generation using deep recurrent neural networksrdquoNeural Computing and Applications vol 32 no 12pp 7859ndash7877 2019
[32] H Jaeger M Lukosevicius D Popovici and U SiewertldquoOptimization and applications of echo state networks withleaky- integrator neuronsrdquo Neural Networks vol 20 no 3pp 335ndash352 2007
[33] S Saravanakumar and R Dharani ldquoImplementation of echostate network for intrusion detectionrdquo International Journalof Advanced Research in Computer Science Engineering andInformation Technology vol 4 no 2 pp 375ndash385 2015
[34] Y Kalpana S Purushothaman and R Rajeswari ldquoImple-mentation of echo state neural network and radial basisfunction network for intrusion detectionrdquo Data Mining andKnowledge Engineering vol 5 no 9 pp 366ndash373 2013
[35] X X Liu ldquoResearch on the network security mechanism ofsmart grid AMIrdquo MS thesis National University of DefenseScience and Technology Changsha China 2014 in Chinese
[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese
[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005
[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin
[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018
[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017
[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo
20 Mathematical Problems in Engineering
2020 httpsarxivorgftparxivpapers1806180601016pdf
[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015
[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets
[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004
[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007
[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014
[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152
[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017
Mathematical Problems in Engineering 21
The experiments were conducted on a 64-bit Windows 10 Home system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.
6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset under three conditions: without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.
From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether the data sample is small or large, the classification accuracy without filtering is lower than with filtering.

In addition, a single filtering method is not as good as the combination of the two. For example, on the 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 with no filtering, 0.95 with the Pearson index alone, and 0.97 with the Gini index alone, while the combination of the Pearson and Gini indexes reaches 0.99.
6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter the features and then uses the ML-ESN training
Figure 10. The Gini values for UNSW_NB15 (heatmap over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).
Table 4. The parameters of the ML-ESN experiment.

Parameters                    Values
Input dimension number        5
Output dimension number       10
Reservoir number              3
Reservoir neurons number      1000
Reservoir activation fn       Tanh
Output layer activation fn    Sigmoid
Update rate                   0.9
Random seed                   50
Regularization rate           1.0 × 10^-6
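Table 4's settings can be read as a concrete forward pass. The sketch below is a hedged illustration of a multilayer ESN with a ridge-regression readout (no backpropagation), using the depth of 3, tanh reservoirs, update rate 0.9, seed 50, sigmoid output, and 1.0 × 10^-6 regularization from Table 4; the reservoir is shrunk from 1000 to 100 neurons to keep it fast, and the weight initialisation is an assumption, since the paper does not specify it:

```python
import numpy as np

rng = np.random.default_rng(50)  # "random seed 50" from Table 4

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    # Random input and recurrent weights; the recurrent matrix is
    # rescaled to a target spectral radius (an assumed initialisation).
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(U, W_in, W, leak=0.9):
    # Leaky-integrator update x <- (1-a)x + a*tanh(W_in u + W x);
    # leak a = 0.9 is the "update rate" of Table 4.
    X = np.zeros((U.shape[0], W.shape[0]))
    x = np.zeros(W.shape[0])
    for t, u in enumerate(U):
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        X[t] = x
    return X

# Table 4: 5 inputs, 10 outputs, 3 stacked reservoirs.
n_in, n_res, n_layers, n_out = 5, 100, 3, 10
U = rng.normal(size=(200, n_in))                # toy input sequence
Y = np.eye(n_out)[rng.integers(0, n_out, 200)]  # toy one-hot targets

H = U
for _ in range(n_layers):
    W_in, W = make_reservoir(H.shape[1], n_res)
    H = run_reservoir(H, W_in, W)               # states feed the next layer

# Ridge-regression readout, lambda = 1e-6 from Table 4, followed by
# the sigmoid output layer; only W_out is trained.
lam = 1e-6
W_out = np.linalg.solve(H.T @ H + lam * np.eye(n_res), H.T @ Y)
scores = 1.0 / (1.0 + np.exp(-(H @ W_out)))
pred = scores.argmax(axis=1)
print(pred.shape)  # → (200,)
```

Only the readout weights are fitted, which is why, as the abstract notes, the method abandons the backpropagation mechanism entirely.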
Mathematical Problems in Engineering 13
algorithm to learn; it then uses the test data to verify the trained model and obtains detection results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.
From the detection results in Figure 12, it is clearly feasible to classify anomalous network traffic attacks quickly with the ML-ESN learning model when the network traffic features are filtered and optimized by the combination of Pearson and Gini coefficients.

The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, for the Generic attack, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; for the Shellcode and Worms attack types, both the accuracy and the F1-score reach 0.99, with an FPR of only 0.02. Overall, the detection rate exceeds 0.94 and the F1-score exceeds 0.96 for all nine attack types.
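The three reported metrics follow directly from the confusion counts. As a small worked example (the counts below are hypothetical, chosen only to reproduce the Generic row's 0.98/0.98/0.02):

```python
def metrics(tp, fp, tn, fn):
    # Accuracy, F1-score, and false-positive rate from confusion counts.
    acc = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return acc, f1, fpr

# Hypothetical counts matching the Generic row of Figure 12.
acc, f1, fpr = metrics(tp=980, fp=20, tn=980, fn=20)
print(round(acc, 2), round(f1, 2), round(fpr, 2))  # → 0.98 0.98 0.02
```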
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy against three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, a reservoir depth of 5 takes 21.1 ms, while a reservoir depth of 3 takes only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model at first increases gradually as the depth of the model reservoir increases; for example, with a reservoir depth of 3 and 1000 neurons, the detection accuracy is 0.96, while with a depth of 2 and 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.
The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting occurs, which leads to the decrease in accuracy.
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
Figure 11. Classification effect of different filtering methods (accuracy vs. data size for no filtering, Pearson, Gini, and Pearson + Gini).
performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated near 50; for feature A in particular, the values hardly ever exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper compares the proposed method with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
This simulation experiment uses five test datasets of different scales, namely 5000, 20,000, 60,000, 120,000, and 160,000 samples, each containing the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-sample data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. On the large-volume test data, however, the classification accuracy of the traditional machine learning algorithms drops significantly, especially that of the GaussianNB algorithm, whose accuracy falls below 50%, while the other algorithms come very close to 80%.
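A comparison of this kind can be reproduced with scikit-learn's stock implementations of the four baselines. The sketch below uses synthetic stand-in data, not UNSW_NB15:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic multi-class data standing in for the UNSW_NB15 splits.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNeighbors": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=0),
}
# Fit each baseline and record its held-out accuracy.
scores = {name: clf.fit(Xtr, ytr).score(Xte, yte)
          for name, clf in baselines.items()}
print(scores)
```

Sweeping this loop over increasing sample sizes reproduces the shape of Figure 15, although the exact accuracies depend on the data.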
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-sample dataset the accuracy of the algorithm reaches 96.75%, and on the 160,000-sample dataset it reaches 97.26%.
In this experiment, the reason for the poor classification performance on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments, with the ML-ESN experiment parameters set as in Table 4. The experiments used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR
Figure 12. Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each of the nine attack types).
Figure 13. ML-ESN results at different reservoir depths: (a) detection time (ms) for reservoir depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy for the same configurations; (c) accuracy and time of BP, ESN, DecisionTree, and ML-ESN.
Figure 14. Distribution map of the first two statistical characteristics (feature A and feature B vs. number of packages).
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15. Detection results of different classification methods under different data sizes (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN).
Figure 16. Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).
Figure 18. Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).
Figure 19. Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).
Figure 17. Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).
the other attack types are 99%. With the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
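The per-class curves behind Figures 16-19 are one-vs-rest ROC curves. A minimal sketch of how such AUC values are computed, using a toy three-class problem and a logistic-regression stand-in classifier (not the paper's models):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

# Toy three-class problem; the paper's setting has nine attack classes.
X, y = make_classification(n_samples=1500, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)
Y = label_binarize(y, classes=[0, 1, 2])  # one-vs-rest binary targets

aucs = {}
for k in range(3):
    # FPR on the horizontal axis, TPR on the vertical axis, as in the text.
    fpr, tpr, _ = roc_curve(Y[:, k], probs[:, k])
    aucs[k] = auc(fpr, tpr)
print(aucs)
```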
7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers in this area.
Secondly, to address the low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to classify unknown and abnormal AMI network attacks accurately and quickly; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has clear advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main points of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so as to apply it to industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
Mathematical Problems in Engineering 19
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI Security Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
algorithm to learn, and then uses test data to verify the training model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

It can be seen from the detection results in Figure 12 that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks after the network traffic features have been filtered and optimized with the combination of Pearson and Gini coefficients.
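The feature-filtering step referred to above can be sketched in code. The paper's exact rule for combining the two coefficients is not spelled out in this section, so the sketch below simply scores each feature by the product of its absolute Pearson correlation with the class label and the Gini impurity reduction of a single median split, then keeps the top-k features; all function names are illustrative.

```python
import numpy as np

def pearson_score(x, y):
    # Absolute Pearson correlation between one feature column and the label.
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return 0.0 if denom == 0 else abs(float((xc * yc).sum() / denom))

def gini_impurity(labels):
    # Gini impurity 1 - sum(p_c^2) of a label vector.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float((p ** 2).sum())

def gini_gain(x, y):
    # Impurity reduction obtained by splitting feature x at its median.
    mask = x <= np.median(x)
    if mask.all() or (~mask).all():
        return 0.0
    n = len(y)
    children = (mask.sum() / n) * gini_impurity(y[mask]) \
             + ((~mask).sum() / n) * gini_impurity(y[~mask])
    return gini_impurity(y) - children

def select_features(X, y, k):
    # Rank features by the combined Pearson * Gini score and keep the top k.
    scores = [pearson_score(X[:, j], y) * gini_gain(X[:, j], y)
              for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]
```

On labeled flow records, `select_features(X, y, k)` would return the indices of the k most informative statistical flow features before they are passed to the classifier.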
The detection results for accuracy, F1-score, and FPR are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. Overall, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
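For reference, the three reported metrics follow directly from the confusion-matrix counts; this small helper (ours, not the paper's code) computes them for one attack class treated one-vs-rest:

```python
def binary_metrics(tp, fp, tn, fn):
    # Accuracy, F1-score, and false-positive rate for one class (one-vs-rest).
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # equals the detection rate (TPR)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return accuracy, f1, fpr
```

For example, 98 true positives and 2 false positives against 98 true negatives and 2 false negatives give accuracy 0.98, F1-score 0.98, and FPR 0.02, matching the magnitudes quoted above for the Generic class.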
6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).
As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.
As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, with a reservoir depth of 3 and 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. However, when the depth is increased to 5, the training accuracy drops to 0.95.
The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting occurs, which leads to the decrease in accuracy.
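The depth/width trade-off discussed above can be made concrete with a minimal NumPy sketch of a multilayer echo state network forward pass in the spirit of [32, 40]: each layer is a fixed random reservoir with the leaky-integrator update x ← (1 − a)x + a·tanh(W_in·u + W·x), layers are stacked by feeding one reservoir's state sequence to the next, and only a linear readout on the final states would be trained, which is why no backpropagation is needed. The leak rate, spectral radius, and sizes here are illustrative, not the paper's settings.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    # Fixed random input and recurrent weights; the recurrent matrix is
    # rescaled so its largest eigenvalue magnitude equals spectral_radius.
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    w = rng.uniform(-0.5, 0.5, (n_res, n_res))
    w *= spectral_radius / np.abs(np.linalg.eigvals(w)).max()
    return w_in, w

def run_layer(inputs, w_in, w, leak=0.3):
    # Leaky-integrator state update over a sequence of input vectors.
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(w_in @ u + w @ x)
        states.append(x.copy())
    return np.array(states)

def ml_esn_states(inputs, layer_sizes):
    # Stack reservoirs: each layer reads the previous layer's state sequence.
    layer_in = inputs
    for depth, n_res in enumerate(layer_sizes):
        w_in, w = make_reservoir(layer_in.shape[1], n_res, seed=depth)
        layer_in = run_layer(layer_in, w_in, w)
    return layer_in  # final-layer states; a ridge-regression readout is trained on these
```

Deeper stacks (more entries in `layer_sizes`) and wider reservoirs both enlarge the state that must be updated at every time step, which is consistent with the training-time growth seen in Figure 13(a).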
From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.
Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection
[Figure 11: Classification effect of different filtering methods. Accuracy (0.4–1.0) versus dataset size (20,000–160,000) for four feature-filtering strategies: none, Pearson only, Gini only, and Pearson + Gini.]
performance on the UNSW_NB15 dataset with a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.
Secondly, this paper compares the proposed method with traditional machine learning methods through simulation experiments on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
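These baselines correspond to standard scikit-learn estimators, so the comparison loop can be sketched as below. Synthetic data stands in for the UNSW-NB15 flow records here, and the models use library-default hyperparameters rather than the paper's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for labeled flow records (UNSW-NB15 is not bundled with sklearn).
X, y = make_classification(n_samples=5000, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "GaussianNB": GaussianNB(),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=0),
}
for name, model in models.items():
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: accuracy = {acc:.3f}")
```

The same loop, pointed at increasingly large slices of a real flow dataset, reproduces the kind of accuracy-versus-data-size comparison reported in Figure 15.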
This simulation experiment uses five test datasets of different scales, containing 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively, with each dataset containing the 9 different types of attack data. After repeated experiments, the detection results of the proposed method were compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieve 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the accuracy of the GaussianNB algorithm falls below 50%, while the other algorithms are very close to 80%.
In contrast, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy of the algorithm reaches 96.75%, and on the 160,000-record dataset it reaches 97.26%.
The reason for the poor classification performance on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find its optimal operating point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be at its best.
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) curves to evaluate performance: a ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR
[Figure 12: Classification results of the ML-ESN method. For the nine attack types (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms), accuracy and F1-score range from 0.94 to 1.0, while FPR stays between 0.01 and 0.02.]
[Figure 13: ML-ESN results at different reservoir depths. (a) Detection time (ms) versus reservoir depth (2–5) for 500, 1000, and 2000 neurons. (b) Accuracy versus reservoir depth for the same neuron counts. (c) Accuracy and time (s) for BP, ESN, DecisionTree, and ML-ESN.]
[Figure 14: Distribution map of the first two statistical characteristics (features A and B) versus the number of packages (0–160,000).]
(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
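The per-class ROC/AUC evaluation just described can be sketched with scikit-learn: each attack type is treated as the positive class against all the others, and the area under each resulting curve is reported (the function name is ours):

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

def per_class_auc(y_true, y_score, classes):
    # One ROC curve per attack class (one-vs-rest); y_score holds one
    # column of classifier scores per class, e.g. a predict_proba output.
    y_bin = label_binarize(y_true, classes=classes)
    aucs = {}
    for i, c in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_bin[:, i], y_score[:, i])
        aucs[c] = auc(fpr, tpr)
    return aucs
```

With a perfect scorer the AUC is 1.0 for every class, which is how the 100% entries in Figure 19 should be read.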
From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm the detection success rate for four attack types is 100%, and the detection rates for
[Figure 15: Detection results of different classification methods under different data sizes. Accuracy (0.4–1.0) versus dataset size (0–160,000) for GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN.]
[Figure 16: Classification ROC diagram of the single-layer ESN algorithm. AUC per class: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]
[Figure 18: Classification ROC diagram of the DecisionTree algorithm. AUC per class: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]
[Figure 19: Classification ROC diagram of our ML-ESN algorithm. AUC per class: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]
[Figure 17: Classification ROC diagram of the BP algorithm. AUC per class: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]
the other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo
20 Mathematical Problems in Engineering
2020 httpsarxivorgftparxivpapers1806180601016pdf
[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015
[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets
[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004
[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007
[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014
[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152
[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017
Mathematical Problems in Engineering 21
performance of the UNSW_NB15 dataset by a variety of different classifiers.
6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.
It can be seen from Figure 14 that the values of feature A and feature B are mainly concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.
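The Pearson-plus-Gini filtering step can be sketched as follows. This is a minimal NumPy illustration, not the authors' exact implementation: the threshold values `corr_min` and `gini_max` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def gini_coefficient(x):
    """Gini coefficient of a non-negative 1-D sample (0 = perfectly even)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cum = np.cumsum(x)
    if cum[-1] == 0:  # an all-zero feature carries no information
        return 0.0
    return (n + 1 - 2 * np.sum(cum / cum[-1])) / n

def select_features(X, y, corr_min=0.2, gini_max=0.99):
    """Keep columns whose |Pearson r| with the label is large enough and
    whose value distribution is not pathologically concentrated."""
    keep = []
    for j in range(X.shape[1]):
        col = X[:, j]
        r = 0.0 if np.std(col) == 0 else np.corrcoef(col, y)[0, 1]
        if abs(r) >= corr_min and gini_coefficient(np.abs(col)) <= gini_max:
            keep.append(j)
    return keep

# Toy data: one label-correlated feature, one pure-noise feature.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500).astype(float)
informative = 2.0 * y + rng.normal(0, 0.5, 500)
noise = rng.normal(0, 1.0, 500)
X = np.column_stack([informative, noise])
print(select_features(X, y))
```

On this toy data only the correlated column survives the filter; on real flow statistics the thresholds would need tuning.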
Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
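A comparison of this kind can be reproduced with scikit-learn's implementations of the four baseline classifiers. The sketch below is only illustrative: it assumes scikit-learn is installed and uses a synthetic multiclass dataset as a stand-in for the UNSW_NB15 flow features, so the sizes and hyperparameters are not the paper's settings.

```python
# Baseline-classifier comparison sketch (synthetic stand-in data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=3000, n_features=20, n_informative=12,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}
# Fit each model and record its test accuracy.
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
print(scores)
```

Repeating this loop over test sets of increasing size reproduces the kind of accuracy-versus-scale comparison reported in Figure 15.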
This simulation experiment uses five test datasets of different scales, containing 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively; each dataset contains 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the accuracy of the GaussianNB algorithm falls below 50%, while the other algorithms stay close to 80%.
On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, once the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy of the algorithm reached 96.75%, and on the 160,000-record dataset it reached 97.26%.
In the experiment, the reason for the poor classification performance on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance is not optimal.
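The self-learning behaviour described above rests on the echo state network design: a fixed random reservoir whose linear readout is the only trained part, so no backpropagation is needed. The following is a minimal single-reservoir sketch in NumPy (the multilayer variant stacks such reservoirs); the reservoir size, scaling constants, and toy data are illustrative assumptions, not the Table 4 settings.

```python
import numpy as np

def run_reservoir(U, n_res=100, spectral_radius=0.9, seed=0):
    """Drive a fixed random reservoir with input sequence U of shape (T, d_in);
    return the reservoir state sequence X of shape (T, n_res)."""
    rng = np.random.default_rng(seed)
    T, d_in = U.shape
    W_in = rng.uniform(-0.5, 0.5, (n_res, d_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    # Rescale so the spectral radius is < 1 (echo state property).
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    X = np.zeros((T, n_res))
    x = np.zeros(n_res)
    for t in range(T):
        x = np.tanh(W_in @ U[t] + W @ x)  # W_in and W stay fixed: no backprop
        X[t] = x
    return X

def train_readout(X, Y, ridge=1e-6):
    """Train only the linear readout, by ridge regression."""
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(1)
U = rng.normal(size=(200, 8))           # stand-in for flow-feature sequences
Y = np.eye(2)[rng.integers(0, 2, 200)]  # one-hot class targets
X = run_reservoir(U)
W_out = train_readout(X, Y)
pred = (X @ W_out).argmax(axis=1)       # class decision from the readout
```

Because only `W_out` is fitted (one least-squares solve), training cost grows with the data volume rather than with gradient iterations, which is consistent with the large-sample behaviour discussed above.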
In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with the FPR (false-positive rate) as the horizontal axis and the TPR
Figure 12: Classification results of the ML-ESN method. [Bar chart omitted: detection rate (accuracy, F1-score, and FPR) for each of the nine attack types (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, Worms); accuracy and F1-score range from 0.94 to 1.00, while the FPR stays between 0.01 and 0.02.]
Mathematical Problems in Engineering 15
Figure 13: ML-ESN results at different reservoir depths. [Charts omitted: (a) detection time (ms) at reservoir depths 2-5 for reservoir sizes 500, 1000, and 2000; (b) accuracy (0.91-0.96) over the same depths and sizes; (c) accuracy (0.77-0.96) and per-sample time (0.0013-0.0024 s) for the BP, ESN, DecisionTree, and ML-ESN methods.]
Figure 14: Distribution map of the first two statistical characteristics. [Chart omitted: feature distribution (0-200) of feature A and feature B versus the number of packages (0-160,000).]
(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
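This ROC/AUC evaluation is standard and can be illustrated with scikit-learn's metrics; the labels and scores below are made-up toy values, and in the multiclass setting of Figures 16-19 one such curve is computed per attack type (one-vs-rest).

```python
from sklearn.metrics import roc_curve, auc

# Toy binary example: true labels and classifier scores (illustrative values).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, scores)  # FPR on x-axis, TPR on y-axis
print(auc(fpr, tpr))                              # prints 0.75
```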
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
From the experimental results in Figures 16-19, it can be seen that, for the classification and detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
Figure 15: Detection results of different classification methods under different data sizes. [Chart omitted: accuracy (0.4-1.0) of GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN on test sets of 0 to 160,000 records.]
Figure 16: Classification ROC diagram of the single-layer ESN algorithm. [Curves omitted; per-class AUC: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]
Figure 18: Classification ROC diagram of the DecisionTree algorithm. [Curves omitted; per-class AUC: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]
Figure 19: Classification ROC diagram of our ML-ESN algorithm. [Curves omitted; per-class AUC: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]
Figure 17: Classification ROC diagram of the BP algorithm. [Curves omitted; per-class AUC: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]
other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and reviews the contributions of existing researchers in this area.
Secondly, in order to address the low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using a combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has clear advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
Of course, some issues raised in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it to real industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.
[10] "The AMI network engineering task force (AMI-SEC)," 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, pp. 96-111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350-355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854-3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] "Data standardization," Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946-959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1-6, Canberra, Australia, 2015.
[43] "UNSW-NB15 dataset," 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1-17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese
[15] S Neetesh J C Bong and G Santiago ldquoSecure and privacy-preserving concentration of metering data in AMI networksrdquoin Proceedings of the 2017 IEEE International Conference onCommunications (ICC) Paris France 2017
[16] C Euijin P Younghee and S Huzefa ldquoIdentifying maliciousmetering data in advanced metering infrastructurerdquo in Pro-ceedings of the 2014 IEEE 8th International Symposium onService Oriented System Engineering pp 490ndash495 OxfordUK 2014
[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014
[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015
[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018
[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015
[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016
[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203
[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017
[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017
[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015
[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019
[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018
[28] N Gao L Gao Y He et al ldquoA lightweight intrusion detectionmodel based on autoencoder network with feature reductionrdquoActa Electronica Sinica vol 45 no 3 pp 730ndash739 2017 inChinese
[29] M Yousefi-Azar V Varadharajan L Hamey andU Tupalula ldquoAutoencoder-based feature learning for cybersecurity applicationsrdquo in Proceedings of the 2017 InternationalJoint Conference on Neural Networks (IJCNN) IEEE NeuralNetworks pp 3854ndash3861 Anchorage AK USA 2017
[30] Y Wang H Zhou H Feng et al ldquoNetwork traffic classifi-cation method basing on CNNrdquo Journal on Communicationsvol 39 no 1 pp 14ndash23 2018 in Chinese
[31] S Kaur and M Singh ldquoHybrid intrusion detection and sig-nature generation using deep recurrent neural networksrdquoNeural Computing and Applications vol 32 no 12pp 7859ndash7877 2019
[32] H Jaeger M Lukosevicius D Popovici and U SiewertldquoOptimization and applications of echo state networks withleaky- integrator neuronsrdquo Neural Networks vol 20 no 3pp 335ndash352 2007
[33] S Saravanakumar and R Dharani ldquoImplementation of echostate network for intrusion detectionrdquo International Journalof Advanced Research in Computer Science Engineering andInformation Technology vol 4 no 2 pp 375ndash385 2015
[34] Y Kalpana S Purushothaman and R Rajeswari ldquoImple-mentation of echo state neural network and radial basisfunction network for intrusion detectionrdquo Data Mining andKnowledge Engineering vol 5 no 9 pp 366ndash373 2013
[35] X X Liu ldquoResearch on the network security mechanism ofsmart grid AMIrdquo MS thesis National University of DefenseScience and Technology Changsha China 2014 in Chinese
[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese
[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005
[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin
[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018
[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017
[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo
20 Mathematical Problems in Engineering
2020 httpsarxivorgftparxivpapers1806180601016pdf
[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015
[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets
[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004
[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007
[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014
[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152
[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017
Mathematical Problems in Engineering 21
(true-positive rate) as the vertical axis. Generally speaking, a ROC chart is summarized by the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
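The AUC criterion used above can be computed directly from a detector's scores. The following minimal NumPy sketch (not the paper's code; the score values are made up for illustration) traces the ROC curve by sorting predictions by descending score and integrates it with the trapezoidal rule:

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC via an explicit threshold sweep (ties broken arbitrarily).

    Sorting by descending score traces the ROC curve point by point:
    cumulative true positives give the TPR (vertical axis), cumulative
    false positives the FPR (horizontal axis); the area is then the
    trapezoidal integral of TPR over FPR.
    """
    y = np.asarray(y_true)[np.argsort(-np.asarray(scores, dtype=float))]
    tpr = np.concatenate(([0.0], np.cumsum(y) / y.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - y) / (1 - y).sum()))
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

# A detector that ranks every attack above every normal flow has AUC 1.0.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # -> 1.0
```

In the multiclass setting of Figures 16–19, one such curve is computed per attack class in one-vs-rest fashion, which is why each legend lists a separate AUC per class.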
The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for
[Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, our ML-ESN) under different data sizes; x-axis: data size (0–160,000), y-axis: accuracy (0.4–1.0).]
[Figure 16: Classification ROC diagram of the single-layer ESN algorithm (false-positive rate vs. true-positive rate). Per-class AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97.]
Mathematical Problems in Engineering 17
[Figure 18: Classification ROC diagram of the DecisionTree algorithm (false-positive rate vs. true-positive rate). Per-class AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78.]
[Figure 19: Classification ROC diagram of our ML-ESN algorithm (false-positive rate vs. true-positive rate). Per-class AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00.]
[Figure 17: Classification ROC diagram of the BP algorithm (false-positive rate vs. true-positive rate). Per-class AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95.]
other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate approaches 35%.
7. Conclusion
This article first analyzes the current state of AMI network security research at home and abroad, identifies open problems in AMI network security, and reviews the contributions of existing researchers in this area.
Second, to address the low accuracy and high false-positive rate of existing methods on large-volume network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
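The defining property of the ESN family, recalled throughout this paper, is that the recurrent reservoir weights stay fixed and only a linear readout is trained, so no backpropagation is needed. A hypothetical single-reservoir NumPy sketch (not the authors' implementation; the layer size, leak rate, spectral radius, and stand-in data are illustrative, and the ML-ESN stacks several such reservoirs) shows the idea:

```python
import numpy as np

rng = np.random.default_rng(0)

def reservoir_states(X, n_res=100, spectral_radius=0.9, leak=0.3):
    """Drive a leaky-integrator reservoir with one input vector per step.

    The input and recurrent weights are fixed random matrices; nothing
    inside the reservoir is ever trained.
    """
    n_in = X.shape[1]
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius,
    # the usual heuristic for the echo state property.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    H = np.zeros((X.shape[0], n_res))
    h = np.zeros(n_res)
    for t, x in enumerate(X):
        h = (1 - leak) * h + leak * np.tanh(W_in @ x + W @ h)
        H[t] = h
    return H

def train_readout(H, Y, ridge=1e-6):
    """Closed-form ridge regression for the output weights (no backprop)."""
    return np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]), H.T @ Y)

# Stand-in data: 200 records of 8 flow features with a binary attack label.
X = rng.normal(size=(200, 8))
Y = (X[:, 0] > 0).astype(float).reshape(-1, 1)
H = reservoir_states(X)
W_out = train_readout(H, Y)
pred = (H @ W_out > 0.5).astype(float)
```

Because training reduces to one linear solve instead of iterative gradient descent, this is also why the paper reports low training time compared with BP-style networks.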
The main contributions of this article are as follows: (1) an AMI network streaming metadata standard is established; (2) a combination of Pearson and Gini coefficients is used to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model training and detection time; (3) ML-ESN's strong self-learning, storage, and memory capabilities are used to classify unknown and abnormal AMI network attacks accurately and quickly; and (4) the proposed method was tested and verified on a simulation dataset. The test results show that this method has clear advantages over the single-layer ESN, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
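Contribution (2) can be sketched concretely. The following hypothetical NumPy snippet (not the authors' code; the toy data and median-split choice are illustrative) ranks features by the two statistics the paper combines — absolute Pearson correlation with the label and Gini-impurity reduction:

```python
import numpy as np

def pearson_scores(X, y):
    """|Pearson correlation| of every feature column with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom

def gini_impurity(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def gini_gain(x, y):
    """Gini-impurity reduction from splitting one feature at its median."""
    left = x <= np.median(x)
    n = len(y)
    return (gini_impurity(y)
            - left.sum() / n * gini_impurity(y[left])
            - (~left).sum() / n * gini_impurity(y[~left]))

# Toy data: feature 0 fully determines the label, features 1-3 are noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(float)
pearson_rank = np.argsort(-pearson_scores(X, y))  # feature 0 ranks first
```

Features scoring highly on both statistics would be kept as the reduced input set for the classifier, which is what shortens training and detection.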
Of course, some issues in this paper still require attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, because of the complex structure of AMI and other electric power information networks, it is difficult to form a centralized, unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform fusion processing across the multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN classification of AMI network traffic, to solve the problems of abnormal-attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of the special protocols of AMI networks and construction of an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so that it can be applied in industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).
References
[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI Security (AMI-SEC) Task Force, 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.
[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014
[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015
[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018
[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015
[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016
[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203
[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017
[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017
[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015
[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019
[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018
[28] N Gao L Gao Y He et al ldquoA lightweight intrusion detectionmodel based on autoencoder network with feature reductionrdquoActa Electronica Sinica vol 45 no 3 pp 730ndash739 2017 inChinese
[29] M Yousefi-Azar V Varadharajan L Hamey andU Tupalula ldquoAutoencoder-based feature learning for cybersecurity applicationsrdquo in Proceedings of the 2017 InternationalJoint Conference on Neural Networks (IJCNN) IEEE NeuralNetworks pp 3854ndash3861 Anchorage AK USA 2017
[30] Y Wang H Zhou H Feng et al ldquoNetwork traffic classifi-cation method basing on CNNrdquo Journal on Communicationsvol 39 no 1 pp 14ndash23 2018 in Chinese
[31] S Kaur and M Singh ldquoHybrid intrusion detection and sig-nature generation using deep recurrent neural networksrdquoNeural Computing and Applications vol 32 no 12pp 7859ndash7877 2019
[32] H Jaeger M Lukosevicius D Popovici and U SiewertldquoOptimization and applications of echo state networks withleaky- integrator neuronsrdquo Neural Networks vol 20 no 3pp 335ndash352 2007
[33] S Saravanakumar and R Dharani ldquoImplementation of echostate network for intrusion detectionrdquo International Journalof Advanced Research in Computer Science Engineering andInformation Technology vol 4 no 2 pp 375ndash385 2015
[34] Y Kalpana S Purushothaman and R Rajeswari ldquoImple-mentation of echo state neural network and radial basisfunction network for intrusion detectionrdquo Data Mining andKnowledge Engineering vol 5 no 9 pp 366ndash373 2013
[35] X X Liu ldquoResearch on the network security mechanism ofsmart grid AMIrdquo MS thesis National University of DefenseScience and Technology Changsha China 2014 in Chinese
[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese
[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005
[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin
[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018
[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017
[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo
20 Mathematical Problems in Engineering
2020 httpsarxivorgftparxivpapers1806180601016pdf
[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015
[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets
[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004
[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007
[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014
[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152
[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017
Mathematical Problems in Engineering 21
other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the general detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzy attack type is only 87%, and the false-positive rate exceeds 20%. The traditional Decision Tree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.
7. Conclusion
This article first analyzes the current situation of AMI network security research at home and abroad, identifies some open problems in AMI network security, and reviews the contributions of existing researchers in this area.
Secondly, in order to solve the existing methods' problems of low accuracy and high false-positive rates on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
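The key property the method builds on is that an echo state network trains only its readout, in closed form, with no backpropagation. The following is a minimal single-layer sketch of that idea, not the paper's ML-ESN itself; all sizes, weight scales, and the ridge parameter are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    # Random, fixed input and recurrent weights; never trained.
    r = np.random.default_rng(seed)
    w_in = r.uniform(-0.5, 0.5, (n_res, n_in))
    w = r.uniform(-0.5, 0.5, (n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius,
    # the usual condition aimed at the echo state property.
    w *= spectral_radius / max(abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(x_seq, w_in, w, leak=0.3):
    # Leaky-integrator state update, one state vector per input record.
    h = np.zeros(w.shape[0])
    states = []
    for x in x_seq:
        h = (1 - leak) * h + leak * np.tanh(w_in @ x + w @ h)
        states.append(h.copy())
    return np.array(states)

def train_readout(states, targets, ridge=1e-6):
    # Closed-form ridge regression: the only "training" an ESN needs,
    # replacing the backpropagation mechanism the text refers to.
    s = states
    return np.linalg.solve(s.T @ s + ridge * np.eye(s.shape[1]), s.T @ targets)

# Toy usage with random data standing in for AMI flow features.
X = rng.normal(size=(200, 8))              # 200 records, 8 features
Y = np.eye(2)[rng.integers(0, 2, 200)]     # one-hot labels, 2 classes
w_in, w = make_reservoir(8, 50)
H = run_reservoir(X, w_in, w)
W_out = train_readout(H, Y)
pred = (H @ W_out).argmax(axis=1)          # predicted class per record
```

The paper's ML-ESN stacks several such reservoirs, but the readout at each stage is still solved rather than iterated, which is why training time stays low.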
The main contributions of this article are as follows: (1) establishing an AMI network streaming metadata standard; (2) combining the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network flows, which greatly reduces model training and detection time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on a simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
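Contribution (2) can be illustrated with a small feature-ranking sketch that blends the absolute Pearson correlation with a Gini-impurity decrease. The blending weight `alpha` and the single median split are assumptions made for illustration, not the paper's exact procedure.

```python
import numpy as np

def pearson_score(x, y):
    # |Pearson correlation| between one feature column and the label.
    x, y = x - x.mean(), y - y.mean()
    denom = np.sqrt((x * x).sum() * (y * y).sum())
    return 0.0 if denom == 0 else abs((x * y).sum() / denom)

def gini_score(x, y):
    # Gini impurity decrease from a single median split on the feature.
    def gini(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - (p * p).sum()
    mask = x <= np.median(x)
    left, right = y[mask], y[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    return gini(y) - weighted

def rank_features(X, y, alpha=0.5):
    # Blend the two criteria; alpha is an assumed mixing weight.
    scores = np.array([alpha * pearson_score(X[:, j], y)
                       + (1 - alpha) * gini_score(X[:, j], y)
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1], scores

# Toy data: feature 0 tracks the label, feature 1 is pure noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 300)
X = np.column_stack([y + 0.1 * rng.normal(size=300), rng.normal(size=300)])
order, scores = rank_features(X, y)   # informative feature ranks first
```

Keeping only the top-ranked columns before training is what shrinks the input dimension and, with it, the training and detection time the text mentions.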
Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized, unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.
Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a degree of multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.
The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN classification of AMI network traffic, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of AMI-specific network protocols and establishment of an optimized ML-ESN network traffic deep learning model better matched to actual AMI applications, so as to apply it in industrial production.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).