Linköping Studies in Science and Technology

Licentiate Thesis No. 1586

Anomaly Detection and its Adaptation: Studies on Cyber-Physical Systems

by

Massimiliano Raciti

Department of Computer and Information Science
Linköpings universitet

SE-581 83 Linköping, Sweden

Linköping 2013

This is a Swedish Licentiate’s Thesis

Swedish postgraduate education leads to a Doctor’s degree and/or a Licentiate’s degree. A Doctor’s degree comprises 240 ECTS credits (4 years of full-time studies).

A Licentiate’s degree comprises 120 ECTS credits.

Copyright © 2013 Massimiliano Raciti

ISBN 978-91-7519-644-2
ISSN 0280–7971

Printed by LiU Tryck 2013

URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-89617

March 2013

LiU–Tek–Lic–2013:20

ABSTRACT

Cyber-Physical Systems (CPS) are complex systems where physical operations are supported and co-ordinated by Information and Communication Technology (ICT). From the point of view of security, ICT offers new opportunities to increase vigilance and real-time responsiveness to physical security faults. On the other hand, the cyber domain carries all the security vulnerabilities typical of information systems, making security a major new challenge in critical systems.

This thesis addresses anomaly detection as a security measure in CPS. Anomaly detection consists of modelling the good behaviour of a system using machine learning and data mining algorithms, and detecting anomalies when deviations from the normality model occur at runtime. Its main feature is the ability to discover kinds of attack not seen before, making it suitable as a second line of defence.

The first contribution of this thesis addresses the application of anomaly detection as an early warning system in water management systems. We describe the evaluation of an anomaly detection software when integrated in a Supervisory Control and Data Acquisition (SCADA) system where water quality sensors provide data for real-time analysis and detection of contaminants. Then, we focus our attention on smart metering infrastructures. We study a smart metering device that uses a trusted platform for storage and communication of electricity metering data, and show that despite the hard core security, there is still room for deployment of a second level of defence as an embedded real-time anomaly detector that can cover both the cyber and physical domains. In both scenarios, we show that anomaly detection algorithms can efficiently discover attacks, in the form of contamination events in the first case and cyber attacks for electricity theft in the second.

The second contribution focuses on online adaptation of the parameters of anomaly detection applied to a Mobile Ad hoc Network (MANET) for disaster response. Since survivability of the communication under network attacks is as crucial as the lifetime of the network itself, we devised a component that is in charge of adjusting the parameters based on the current energy level, using the trade-off between the node’s response to attacks and the energy consumption induced by the intrusion detection system. Adaptation increases the network lifetime without significantly deteriorating the detection performance.

This work has been supported by the Swedish National Graduate School of Computer Science (CUGS) and the EU FP7 SecFutur Project.

Acknowledgments

When I came to Sweden for the first time I would have never imagined what was going to happen to me. The time I spent here writing my Master’s thesis in 2009 was so exciting that when I was done with it I had the feeling that six months at LiU were definitely not enough. Once I got the opportunity, I headed back to Sweden for a new, longer adventure.

My first gratitude goes to my supervisor Simin Nadjm-Tehrani. Thank you for accepting me as a PhD student and guiding me through the years. Your never-ending energy and optimism have an impact on our inspiration and motivation. Thank you for your advice in teaching and research.

Special thanks go to my current and former colleagues at RTSLab for your valuable feedback and for making the environment more than just a workplace. We have slowly polluted the Swedish habits with Mediterranean standards; it was interesting to see even the most Swedish-style fellows knocking on our doors at 6pm for a fika. Thank you for the nice time we have spent together inside and outside the lab.

I also want to thank Anne, Eva, Asa and Inger for the administrative support, and all the other colleagues at IDA for making the environment so enjoyable.

While pursuing PhD studies one inevitably has to encounter hills and valleys; sometimes your mood reaches the peak and sometimes it slopes down to the ground. I would like to express my gratitude to my family, who has always been there to support me. I dedicate this thesis to all of you.

Last but not least, I would like to thank my dear Simona for lighting up my heart and giving me your full support in the last stages of this work. I love you with all my heart. Finally, an acknowledgement goes to all my friends around the world for the pleasant time spent together in Linköping.

Massimiliano Raciti
Linköping

March 2013

Contents

1 Introduction
    1.1 Motivation
    1.2 Problem formulation
    1.3 Contributions
    1.4 Thesis outline
    1.5 List of publications

2 Background
    2.1 Anomaly Detection with ADWICE
    2.2 Water Quality in Distribution Systems
        2.2.1 Security considerations
        2.2.2 Related Work on Contamination Detection
    2.3 Advanced Metering Infrastructures
        2.3.1 Security considerations
        2.3.2 Related Work on AMI Security
    2.4 Disaster Area Networks
        2.4.1 Random-Walk Gossip protocol
        2.4.2 Security considerations

3 Anomaly Detection in Water Management Systems
    3.1 Scenario: the event detection systems challenge
    3.2 Modelling Normality
        3.2.1 Training
        3.2.2 Feature selection
        3.2.3 Challenges
    3.3 Detection Results
        3.3.1 Station A
        3.3.2 Station D
    3.4 Detection Latency
    3.5 Missing and inaccurate data problem
    3.6 Closing remarks

4 Anomaly Detection in Smart Metering Infrastructures
    4.1 Scenario: the trusted sensor network
        4.1.1 Trusted Meter Prototype
        4.1.2 Threats
    4.2 Embedded Anomaly Detection
        4.2.1 Data Logger
        4.2.2 Data Preprocessing and Feature Extraction
        4.2.3 Cyber-layer Anomaly Detection Algorithm
        4.2.4 Physical-layer Anomaly Detection Algorithm
        4.2.5 Alert Aggregator
    4.3 Evaluation
        4.3.1 Data collection
        4.3.2 Cyber attacks and data partitioning
        4.3.3 Results
    4.4 Closing remarks

5 Online Parameter Adaptation of Intrusion Detection
    5.1 Scenario: challenged networks
    5.2 Adaptive detection
        5.2.1 Adaptation component
        5.2.2 Energy-based adaptation in the ad hoc scenario
    5.3 Energy modelling
        5.3.1 A CPU model for network simulation
        5.3.2 Application of the model
    5.4 Evaluation
        5.4.1 Implementation and simulation setup
        5.4.2 Simulation results
    5.5 Closing remarks

6 Conclusion
    6.1 Future work


List of Figures

2.1 NIST smart grid interoperability model [1]

3.1 Effect of contaminants on the WQ parameters
3.2 Example of Event Profiles
3.3 Station A contaminant A ROC curve
3.4 Concentration Sensitivity of the Station A
3.5 ROC curve station D contaminant A
3.6 Concentration sensitivity station D
3.7 Detection latency in station A
3.8 Detection latency in station D

4.1 Trusted Smart Metering Infrastructure [2]
4.2 Trusted Meter
4.3 Manipulation of consumption values
4.4 Manipulation or injection of control commands
4.5 Proposed Cyber-Physical Anomaly Detection Architecture

5.1 The General Survivability Framework control loop
5.2 Energy-aware adaptation of IDS parameters
5.3 State machine with associated power consumption
5.4 CPU power consumption of the RWG and IDS framework over some time interval
5.5 Energy consumption and number of active nodes in the drain attack
5.6 Survivability performance in the draining attack
5.7 Energy consumption and number of active nodes in the grey hole attack
5.8 Survivability performance in the grey hole attack
5.9 Energy consumption and number of active nodes in the drain attack with random initial energy levels
5.10 Survivability performance in the drain attack with random initial energy levels


List of Tables

3.1 Station A detection results of 1mg/L of concentration

5.1 Aggregation interval based on energy levels
5.2 Current draw of the RWG application
5.3 Current draw of the IDS application


Chapter 1

Introduction

Since computer-based systems are now pervading all aspects of our everyday life, security is one of the main concerns in our digital era. New systems are often poorly understood at the beginning from the security point of view. Unprotected assets can be exploited by attackers, who can target the vulnerabilities of the systems to get some kind of benefit out of them. In addition, security is often the least developed attribute at design time [3]. Designers tend to focus on and optimise other functional and non-functional requirements rather than security, which is typically added afterwards. This is the case, for instance, for smartphone security, where the proliferation of malware that exploits weaknesses of architectures and protocols has raised the need for efficient detection techniques, a hot topic at the time of writing this thesis.

Old and well understood systems, on the other hand, still expose vulnerabilities that can be discovered long after their deployment, so security is typically a never-ending challenge.

Security mechanisms can be classified as active and passive. Active security mechanisms are normally proactive, meaning that the assets they protect are secured in advance. Data encryption, for instance, is one of the main active security mechanisms. Passive security mechanisms, on the other hand, do not perform any action until an attack occurs.

Intrusion detection, the main mechanism of this category, consists of passively monitoring the system in order to detect attacks and, possibly, apply an appropriate countermeasure. There are two subcategories of intrusion detection that differ in the way they discover attacks. Misuse detection, the most common in commercial tools, consists of creating models (often called signatures) of the attacks. When the current condition matches any known signature, an alarm is raised. Misuse detection requires exact characterisation of known attacks based on historical data sets and gives accurate matches for those cases that are modelled. While misuse detection provides immediate diagnosis when successful, it is unable to detect cases for which no previous information exists (earlier similar cases in history, a known signature, etc.).

Anomaly detection, the complementary technique of misuse detection, is a paradigm in which what is modelled is an approximate normality of a system, using machine learning or data mining techniques. This requires a learning phase during which the normality model is built using historical observations. At runtime, anomalous conditions are detected when observations of the current state are sufficiently different from the normality model. As opposed to misuse detection, anomaly detection is able to uncover new attacks not seen earlier, since it does not rely on any knowledge about them. This feature makes it suitable both in environments where vulnerabilities are not known in advance and as a second line of defence in an operational system when new vulnerabilities are discovered. Its main limitation, however, is related to the construction of a good model of normality. A typical problem that occurs when applying anomaly detection algorithms is a high rate of false alarms if attacks and normality data are similar in subtle ways, and when normality is subject to natural changes over time, referred to in the literature as concept drift [4]. The availability of a sufficiently large number of labelled instances that can constitute the basis for a training dataset is also a challenge in some domains.
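The contrast between the two detection paradigms can be illustrated with a minimal Python sketch. It is not taken from any tool discussed in this thesis: the signature tuples, the function names and the use of a plain nearest-point distance as the "normality model" are all assumptions chosen for brevity.

```python
import math

# Hypothetical signatures of known attacks (misuse detection).
KNOWN_SIGNATURES = {("tcp", 23, "root-login-fail"), ("udp", 161, "snmp-sweep")}


def misuse_detect(event):
    """Alarm only when the event exactly matches a known attack signature."""
    return event in KNOWN_SIGNATURES


def anomaly_detect(observation, normal_points, threshold):
    """Alarm when the observation is far from every point seen during learning."""
    nearest = min(math.dist(observation, p) for p in normal_points)
    return nearest > threshold
```

In this sketch a previously unseen attack passes `misuse_detect` silently, whereas `anomaly_detect` flags it as long as it lies far enough from the learned normal observations. The price is that a poorly chosen `threshold` produces false alarms.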

A model of normality often includes a number of parameters that need to be tuned to fit the training data. These typically determine a tradeoff between a number of factors such as the desired detection accuracy, tolerable false alarm rate, resource utilisation etc. The parameters are often set statically by the system manager. Recently the focus has also been on adaptation approaches devised to make anomaly detection completely autonomic [5].

Anomaly detection has been applied to a range of applications including network or host-based intrusion detection, fraud detection, and medical and public health anomaly detection [6]. It is now being considered as a prominent technique to discover attacks in new domains, such as smartphone malware detection [7, 8]. This thesis addresses the more general application of anomaly detection techniques to protect cyber-physical systems, as motivated in the next section.

1.1 Motivation

Critical infrastructure systems, such as electricity, gas and water distribution systems, have been subject to changes in the last decades. The need for distributed monitoring and control to support their operations has fuelled the practice of integrating information and communication technology into physical systems. Supervisory Control and Data Acquisition (SCADA) systems have been the first approach to distributed monitoring and control by means of information technology. From a larger perspective, this integration has led to the term “Cyber-Physical System” (CPS), where physical processes, computation and information exchange are coupled together to provide improved efficiency, functionality and reliability. The inherently distributed nature of production and distribution, the incorporation of mass-scale sensors, faster management dynamics, and fine-grained adaptability to local failures and overloads are the means to achieve this. The notion of cyber-physical systems, aiming to cover the “virtually global and locally physical” [9], is often used to encompass smart grids as an illustrating example of such complex distributed sensor and actuator networks aimed at controlling a physical domain. Pervasive computing, on the other hand, has generated a subcategory of cyber-physical systems, the Mobile Cyber-Physical Systems (MCPS) [10]. Using the advances in Wireless Sensor Networks (WSN) and Mobile Ad hoc Networks (MANETs), mobile devices such as smartphones are prominent tools that allow applications based on the interaction between the physical environment and the cyber world. Since these devices have adequate resources to perform local task processing, storage, networking and sensing (camera, GPS, microphone, light sensors etc.), a number of applications of MCPS have been implemented, including healthcare, intelligent transportation, navigation and disaster response [9].

Security in critical systems has historically been an important matter of concern, even when the cyber domain was not present. Attacks on the physical domain can have severe impacts on society and can have disastrous consequences. In the past, most of the security mechanisms were implemented using physical protection. Critical assets were typically located in controlled environments, and this often prevented the occurrence of undesired manipulations.

In some cases, however, physical protection is not fully applicable. Water management systems, for example, deserve special attention in critical infrastructure protection. In contrast to some other infrastructures where the physical access to the critical assets may be possible to restrict, in water management systems there is a large number of remote access points that are difficult to control and protect from accidental or intentional contamination events. Since the quality of distributed water affects every single citizen with obvious health hazards, in the event of contamination, there are few defence mechanisms available. Water treatment facilities are typically the sole barrier to potential large scale contaminations, and distributed containment of the event leads to widespread water shortages. Today’s SCADA systems provide a natural opportunity to increase vigilance against water contaminations. Specialised event detection mechanisms for water management could be included such that 1) a contamination event is detected as early as possible and with high accuracy with few false positives, and 2) predictive capabilities ease preparedness actions in advance of full scale contamination in a utility. The available data from water management system sensors are based on biological, chemical and physical features of the environment and the water sources. Since these change over seasons, the normality model is rather complex. Also, it is hard to create a static set of rules or constraints that clearly capture all significant attacks, since these can affect various features in non-trivial ways and we get a combinatorial problem. This suggests that learning-based anomaly detection techniques should be explored as a basis for contamination event detection in water management systems.

Power distribution networks are less vulnerable to physical attacks, since their assets can be protected by access restriction more easily. In this context, however, physical attacks typically target the end point devices, the electricity meters. Electricity fraud by meter tampering is a known problem, especially in developing countries [11]. Here, anomaly detection algorithms for electricity fraud detection are employed to detect considerable deviations of user profiles from average profiles of customers belonging to the same class. To summarise, ICT for real-time monitoring and control has the potential to improve the security of the physical layer in various domains of CPS.

Cyber security, on the other hand, is a new issue in cyber-physical critical systems. While security is indeed part of the grand challenges facing large scale development of cyber-physical systems, the focus has initially been on threats to control systems [12]. The ICT itself suffers from vulnerability to attacks, and can bring with it traditional security threats to ICT networks. Since sensitive information is exchanged through the network, integrity, authenticity, accountability and confidentiality are new fundamental security requirements in cyber-physical systems. Smart grids, once again, are the typical example of CPS where cyber security has become an important matter of concern [13]. Although active security mechanisms have been extensively designed, studies [14] show that there is still room for anomaly detection as a second line of defence.

1.2 Problem formulation

This thesis addresses anomaly detection in the context of cyber-physical systems. The hypothesis is that anomaly detection can be adopted for cyber-physical systems security in both the cyber and physical domain. In the physical domain, anomaly detection should be explored in order to detect events that can be attributed to malicious activity. The challenge is to provide reliable alerts with few false positives and low latency. In the cyber domain, anomaly detection should be explored to discover attacks on the communication networks.

We then proceed to study the problem of automatic parameter adaptation in intrusion detection. We focus our attention on wireless communications, where the more complicated dynamics (mobility and network topology), the constrained resources (battery, bandwidth) and the frequent miscommunications make automatic adaptation a desired property in anomaly detection.

1.3 Contributions

The contributions of this thesis are twofold: the study of anomaly detection for critical infrastructure protection and the study of adaptive strategies to adjust the parameters of an intrusion detection architecture during runtime. Our studies have been performed in three different domains related to cyber-physical systems: two critical infrastructures (water management systems and smart grids) and a disaster response scenario. More specifically:

1. Anomaly detection as an event detection system in water management systems
In the first contribution, we apply a method for Anomaly Detection With fast Incremental ClustEring (ADWICE) [15] in a water management system for water contamination detection. We analyse the performance of the approach on real data using metrics such as detection rate, false positives, detection latency, and sensitivity to the contamination level of the attacks. We also discuss the reliability of the analysis when data sets are not perfect (as seen in real life scenarios), where data values may be missing or less accurate as indicated by sensor alerts.

2. Anomaly detection in smart meters
The second contribution focuses on a smart metering infrastructure which uses trusted computing technology to enforce strong security requirements, and we show the existence of weaknesses in the forthcoming end-nodes that justify embedded real-time anomaly detection. We propose an architecture for embedded anomaly detection for both the cyber and physical domains in smart meters and create an instance of a clustering-based anomaly detection algorithm in a prototype under industrial development, illustrating the detection of cyber attacks in pseudo-real settings.

3. Adaptive Intrusion Detection in disaster area networks
In this contribution we present the impact of energy-aware parameter adaptation of an intrusion detection framework earlier devised for disaster area scenarios, built on top of an energy-efficient message dissemination protocol. We show that adaptation provides extended lifetime of the network despite attack-induced energy drain and protocol/intrusion detection system overhead. We demonstrate that evaluation of energy-aware adaptation can be based on fairly simple models of CPU utilisation applied to networking protocols in simulation platforms, thus enabling evaluations of communication scenarios that are hard to evaluate by large scale deployments.
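The energy-aware adaptation in the third contribution can be pictured as a lookup from a node's remaining energy to an intrusion detection parameter. The Python sketch below is illustrative only. The energy bounds, the interval values and the names `ENERGY_LEVELS` and `aggregation_interval` are hypothetical, and the assumption that the aggregation interval grows as energy drops is made here for the sake of the example rather than taken from the results of this thesis.

```python
# Hypothetical pairs: (lower bound on remaining energy fraction, interval in seconds).
# Ordered from the highest energy level to the lowest.
ENERGY_LEVELS = [(0.5, 10), (0.2, 30), (0.0, 60)]


def aggregation_interval(remaining_energy):
    """Pick a longer aggregation interval as the battery drains (assumed policy)."""
    if not 0.0 <= remaining_energy <= 1.0:
        raise ValueError("remaining_energy must be a fraction between 0 and 1")
    for lower_bound, interval in ENERGY_LEVELS:
        if remaining_energy >= lower_bound:
            return interval
    raise AssertionError("unreachable: the last level has lower bound 0.0")
```

A longer interval means the detector runs less often, which trades some responsiveness to attacks for lower energy consumption.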

1.4 Thesis outline

The thesis is organised as follows. Chapter 2 presents background information on the anomaly detection technique adopted in this thesis and the considered domains. The chapter first gives an overview of ADWICE, the anomaly detection algorithm used as reference algorithm in our studies, then presents and discusses the security issues in the two domains addressed in the forthcoming chapters: Water Management Systems and Smart Metering Infrastructures. The chapter is concluded with an introduction to the mobile ad hoc communication security framework in which an approach to adaptation has been proposed. Chapter 3 presents the evaluation of ADWICE when implemented as an event detection system for water quality anomalies. Chapter 4 analyses the design of a smart metering infrastructure, proposes an architecture for embedded anomaly detection in smart meters and evaluates the detection performance on cyber attacks performed on a smart meter prototype. In Chapter 5 the energy-based security adaptation approach in mobile ad hoc networks is described and evaluated. Finally, Chapter 6 concludes the work and presents our future work.

1.5 List of publications

The work presented in this thesis is the result of the following publications:

• M. Raciti, J. Cucurull, and S. Nadjm-Tehrani, Energy-based Adaptation in Simulations of Survivability of Ad hoc Communication, in Wireless Days (WD) Conference, 2011 IFIP, IEEE, October 2011

• M. Raciti, J. Cucurull, and S. Nadjm-Tehrani, Anomaly Detection in Water Management Systems, in Advances in Critical Infrastructure Protection: Information Infrastructure Models, Analysis, and Defence (J. Lopez, R. Setola, and S. Wolthusen, eds.), vol. 7130 of Lecture Notes in Computer Science, pp. 98–119, Springer Berlin / Heidelberg, 2012

• M. Raciti and S. Nadjm-Tehrani, Embedded Cyber-Physical Anomaly Detection in Smart Meters, in Proceedings of the 7th International Conference on Critical Information Infrastructures Security (CRITIS’12), 2012, LNCS, forthcoming

The following publications are peripheral to the work presented in this thesis and are not part of its content:

• H. Eriksson, M. Raciti, M. Basile, A. Cunsolo, A. Froberg, O. Leifler, J. Ekberg, and T. Timpka, A Cloud-Based Simulation Architecture for Pandemic Influenza Simulation, in AMIA Annual Symposium Proceedings 2011, AMIA Symposium, AMIA, October 2011

• J. Cucurull, S. Nadjm-Tehrani, and M. Raciti, Modular anomaly detection for smartphone ad hoc communication, in 16th Nordic Conference on Secure IT Systems, NordSec 2011, LNCS, Springer Verlag, October 2011

• J. Sigholm and M. Raciti, Best-Effort Data Leakage Prevention in Inter-Organizational Tactical MANETs, in Military Communications Conference 2012 - MILCOM 2012, IEEE, October 2012

Chapter 2

Background

This chapter provides background information to introduce the reader to the algorithms and domains where anomaly detection and its parameter adaptation have been applied. In the first section, we describe ADWICE, an instance of a clustering-based anomaly detection algorithm earlier developed for securing IP networks. We present the main mechanisms of the algorithm and describe the main features that have been considered while selecting the approach to anomaly detection suitable to our studies on security in critical systems. Next, we introduce the water management system and the smart metering infrastructure, the two domains where the algorithm has been applied, first as an event detection system and then as an intrusion detection system. We give an introduction to these domains and present their security issues and related work on security. Finally, we present the context of disaster area networks, describing a protocol for which a comprehensive security framework has been devised into which our adaptation module is integrated.

2.1 Anomaly Detection with ADWICE

ADWICE (Anomaly Detection With fast Incremental ClustEring) [16] is a clustering-based anomaly detector that has been developed in an earlier project targeting infrastructure protection. Originally designed to detect anomalies in network traffic sessions using features derived from TCP or UDP packets, ADWICE represents the collection of features as multidimensional numeric vectors, in which each dimension represents a feature. Vectors are therefore data points in the multidimensional feature space. Similar observations (i.e. data points that, using a certain distance metric, are close to each other) can be grouped together to form clusters. The basic idea of the algorithm is then to model normality as a set of clusters that summarise all the observed normal behaviour during the learning process. ADWICE assumes semi-supervised learning, where only the data instances provided to represent the normality model are labelled and assumed not to be affected by malicious activity. In the detection phase, if the new data point is close enough (using a threshold) to any normality cluster, it can be classified as an observation of normal behaviour, otherwise it is classified as an outlier.

In ADWICE, each cluster is represented through a summary denoted Cluster Feature (CF). A CF is a data structure that has three fields, $CF_i = (n, S, SS)$, where $n$ is the number of points in the cluster, $S$ is the sum of the points and $SS$ is the square sum of the points in the cluster. The first two elements can be used to compute the average of the points in the cluster, which represents its centroid:

$$v_0 = \frac{\sum_{i=1}^{n} v_i}{n}$$

The third element, the square sum of the points, can be used to calculate how large a circle would have to be to cover all the points in the cluster, through the radius:

$$R(CF) = \sqrt{\frac{\sum_{i=1}^{n} (v_i - v_0)^2}{n}}$$

With all this information one can measure how far a new data point is from the centre of the cluster (as the Euclidean distance between the cluster centroid and the new point) and whether the new point falls within or near the radius of the cluster. This is used both for building up the normality model (is the new point close enough to any existing cluster so it can become part of it, or should it form a new cluster?) and during detection (is the new point close enough to any normality cluster, or is it an outlier?).
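The radius can be obtained from the three CF fields alone, without revisiting the individual points. The short derivation below assumes that $SS$ denotes the scalar sum of squared norms, $SS = \sum_{i=1}^{n} \lVert v_i \rVert^2$. The text does not state whether $SS$ is stored as a scalar or per dimension, so this is an assumption.

```latex
\begin{align*}
R(CF)^2 &= \frac{1}{n}\sum_{i=1}^{n}\lVert v_i - v_0\rVert^2
         = \frac{1}{n}\sum_{i=1}^{n}\lVert v_i\rVert^2
           - \frac{2}{n}\, v_0\cdot\sum_{i=1}^{n} v_i
           + \lVert v_0\rVert^2 \\
        &= \frac{SS}{n} - 2\lVert v_0\rVert^2 + \lVert v_0\rVert^2
         = \frac{SS}{n} - \left\lVert \frac{S}{n} \right\rVert^2 .
\end{align*}
```

The second line uses $\sum_{i=1}^{n} v_i = S = n\,v_0$, so both the centroid and the radius depend only on $(n, S, SS)$.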

Using the above structure, during the training phase, a new point can be easily included into a cluster, and two clusters $CF_i = (n_i, S_i, SS_i)$ and $CF_j = (n_j, S_j, SS_j)$ can be merged to form a new cluster just by computing the sums of the individual components of the cluster features, $(n_i + n_j, S_i + S_j, SS_i + SS_j)$.
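The incremental update and merge operations described above can be sketched in a few lines of Python. This is a minimal illustration, not the ADWICE implementation: the class and method names are hypothetical, and $SS$ is assumed to be stored as a scalar sum of squared norms.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterFeature:
    n: int       # number of points in the cluster
    s: list      # per-dimension linear sum of the points
    ss: float    # scalar sum of squared norms (assumed representation of SS)

    @classmethod
    def from_point(cls, point):
        return cls(1, list(point), sum(x * x for x in point))

    def add_point(self, point):
        """Absorb one point by updating the three summary fields."""
        self.n += 1
        self.s = [a + b for a, b in zip(self.s, point)]
        self.ss += sum(x * x for x in point)

    def merge(self, other):
        """Merge two clusters by summing their fields component-wise."""
        return ClusterFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.s, other.s)],
            self.ss + other.ss,
        )

    def centroid(self):
        return [a / self.n for a in self.s]

    def radius(self):
        # R^2 = SS/n - ||S/n||^2; clamp tiny negative rounding errors to zero.
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))
```

Neither operation needs the original points, which is what makes updates to the model cheap.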

When a new data point is processed, both during training and detection, the search for the closest cluster needs to be efficient (and fast enough for the application). We therefore need an efficient indexing structure that helps to find the closest cluster to a given point. The cluster summaries, which constitute the normality observations, are organised in a tree structure. Each level in the tree summarises the CFs at the level below by creating a new CF which is the sum of them. The search then proceeds from the root of the tree down to the leaves, in logarithmic computational time.
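A minimal sketch of such a search is given below, assuming a simple greedy descent in which each inner node stores the sum of its children's summaries. The Node class and the function names are hypothetical and do not reproduce the BIRCH or ADWICE index. A greedy descent of this kind may end in a leaf that is not the globally closest cluster, which is one way an indexing error can arise.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    n: int                      # number of points summarised below this node
    s: List[float]              # per-dimension sum of those points
    children: List["Node"] = field(default_factory=list)

    def centroid(self) -> List[float]:
        return [x / self.n for x in self.s]


def parent_of(children: List[Node]) -> Node:
    """Build an inner node whose summary is the sum of its children's."""
    dim = len(children[0].s)
    total = [sum(c.s[d] for c in children) for d in range(dim)]
    return Node(sum(c.n for c in children), total, children)


def find_leaf(root: Node, v: List[float]) -> Node:
    """Greedy descent: at each level follow the child with the nearest centroid."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: math.dist(c.centroid(), v))
    return node
```

With a bounded branching factor and a balanced tree, the number of centroid comparisons grows with the depth of the tree rather than with the total number of clusters.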

ADWICE is based on the original BIRCH data mining algorithm, which has been shown to be fast for incremental updates to the model during learning, and efficient when searching through clusters during detection. The difference is the indexing mechanism used in one of its adaptations (namely ADWICE-Grid), which has been demonstrated to give better performance due to fewer indexing errors [15].

The ease of merging or splitting clusters provided by the way of describing them (using CFs) and the efficiency in the reconfiguration of their indexing enable incremental updates even during the deployment phase in order to cope with changes to the normality of the system. The possibility of forgetting unused clusters or incorporating new ones makes ADWICE a good choice for exploring adaptation strategies in changing environments, where normality is subject to concept drift and the detector needs to be efficiently updated online without the need to recompute the whole normality model from scratch.

The implementation of ADWICE consists of a Java library that can be embedded in a new setting by feeding the preprocessing unit (e.g. when inputs are alphanumeric and have to be encoded into numeric vectors) from a new source of data. The size of the compiled package is about 2 MB. The algorithm has two parameters that have to be tuned during the pre-study of data (with some detection test cases) in order to tune the search efficiency: the maximum number of clusters (M), and the threshold for comparing the distance to the centroid (E). The threshold implicitly reflects the maximum size of each cluster. The larger a cluster, the larger the likelihood that points not belonging to the cluster are classified as part of the cluster, thus decreasing the detection rate. Clusters that are too small, on the other hand, lead to overfitting and increase the likelihood that new points are considered as outliers, thus adding to the false positive rate.
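The role of the two parameters can be illustrated with a flat (tree-less) sketch of training and detection. The function names are hypothetical, the nearest cluster is found by a linear scan instead of the index, and the policy of absorbing a point into the nearest cluster once M clusters exist is a simplifying assumption, not the behaviour of the ADWICE library.

```python
import math
from typing import List, Sequence, Tuple

Cluster = Tuple[int, List[float]]  # (n, S): enough to recover the centroid


def centroid(c: Cluster) -> List[float]:
    n, s = c
    return [x / n for x in s]


def closest(clusters: List[Cluster], v: Sequence[float]) -> Tuple[int, float]:
    """Index of and distance to the nearest centroid (linear scan, no tree)."""
    dists = [math.dist(centroid(c), v) for c in clusters]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]


def train(points, max_clusters: int, threshold: float) -> List[Cluster]:
    """Build a normality model from attack-free points (parameters M and E)."""
    clusters: List[Cluster] = []
    for v in points:
        if clusters:
            i, d = closest(clusters, v)
            # Assumption: absorb the point if it is close enough,
            # or if the model already holds max_clusters clusters.
            if d <= threshold or len(clusters) >= max_clusters:
                n, s = clusters[i]
                clusters[i] = (n + 1, [a + b for a, b in zip(s, v)])
                continue
        clusters.append((1, list(v)))
    return clusters


def is_outlier(clusters: List[Cluster], v: Sequence[float], threshold: float) -> bool:
    """A point is an outlier if no centroid lies within the threshold."""
    return closest(clusters, v)[1] > threshold
```

Raising the threshold in this sketch makes clusters absorb more distant points, which lowers the false positive rate at the price of missed detections; lowering it has the opposite effect.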

The performance metrics used to evaluate ADWICE are those commonly used to evaluate intrusion detection systems: the Detection Rate (DR) and the False Positive Rate (FPR). The detection rate is the fraction of anomalous instances that are correctly classified as outliers, DR = TP/(TP + FN), where TP is the number of true positives and FN the number of false negatives. The false positive rate is the fraction of normal instances that are erroneously classified as outliers, FPR = FP/(FP + TN), where FP is the number of false positives and TN the number of true negatives. The efficiency of the detection is typically analysed by building Receiver Operating Characteristic (ROC) curves, where the DR (normally represented on the Y axis) is plotted in relation to the FPR.
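As a small worked illustration of the two formulas, the following sketch computes DR and FPR from labelled instances. The function name and the convention of returning 0.0 when a denominator is empty are assumptions made for the example.

```python
from typing import Iterable, Tuple


def detection_metrics(truth: Iterable[bool], predicted: Iterable[bool]) -> Tuple[float, float]:
    """Return (DR, FPR); True marks an anomalous instance or a raised alarm."""
    tp = fn = fp = tn = 0
    for is_anomalous, alarm in zip(truth, predicted):
        if is_anomalous and alarm:
            tp += 1
        elif is_anomalous:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    dr = tp / (tp + fn) if tp + fn else 0.0    # DR = TP / (TP + FN)
    fpr = fp / (fp + tn) if fp + tn else 0.0   # FPR = FP / (FP + TN)
    return dr, fpr
```

Evaluating this function for a range of detection thresholds yields the (FPR, DR) pairs from which a ROC curve is drawn.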

2.2 Water Quality in Distribution Systems

A water distribution system is an infrastructure designed to transport and deliver water from several sources, like reservoirs or tanks, to consumers. This infrastructure is characterised by the interconnection of pipes using connection elements such as valves, pumps and junctions. Water flows through pipes with a certain pressure, and valves and pumps are elements used to adjust this to desired values. Junctions are connection elements through which water can be served to customers. Before entering the distribution system, water is first treated in the treatment plants in order to ensure its potability. Once processed by the treatment plant, water enters the distribution system so it can be directly pumped to the final user, or stored in tanks or reservoirs for further use when the demand on the system is greater than the system capacity.

Modelling hydraulic water flow in distribution systems has always been an aspect of interest when designing and evaluating water distribution systems [17]. Water distribution networks are typically modelled using graphs where nodes are connection elements and edges represent pipes between nodes. The flow of water through the distribution system is typically described by mathematical formulations of fluid dynamics [18], and computer-based simulation (EPANET [19] is an example of a popular tool) is very commonly used to study the hydraulic dynamics throughout the system. Since water must be checked for quality prior to distribution to the user, system modelling and water quality decay analysis have been especially helpful for finding the appropriate locations to place treatment facilities.

Water quality is determined by the analysis of its chemical composition: to be safe to drink, some water parameters are allowed to vary only within a certain range of values, where typically the boundary values are established by law. In general, water quality (WQ) is measured by the analysis of some parameters, for example:

• Chlorine (CL2) levels: free chlorine is added for disinfection. Free chlorine levels decrease with time, so for instance levels of CL2 in water that is stagnant in tanks differ from levels in water coming from the treatment plants.

• Conductivity: estimates the amount of dissolved salts in the water. It is usually constant in water from the same source, but mixing waters can cause a significant change in the final conductivity.

• Oxidation Reduction Potential (ORP): measures the cleanliness of the water.

• pH: measures the concentration of hydrogen ions.

• Temperature: is usually constant if measured in short periods of time, but it changes with the seasons. It differs in waters from different sources.

• Total Organic Carbon (TOC): measures the concentration of organic matter in the water. It may decrease over time due to the decomposition of organic matter in the water.

• Turbidity: measures how clear the water is.

Online monitoring and prediction of water quality in a distribution system is, however, a highly complex and sensitive process that is affected by many different factors. The different water qualities coming from multiple sources and treatment plants, the multiplicity of paths that water follows in the system and the changing demand over the week from the final users make it difficult to predict the water quality at a given point of the system lifetime. System operations have a considerable impact on water quality. For instance, pumping water coming from two or more different sources can radically modify the quality parameters of the water contained in a reservoir.

In normal conditions, it is possible to extract some usage patterns from the system operations, relating the changes of WQ parameters to changes of some system configurations: for example, a cyclic increment of conductivity and temperature of the water contained in a reservoir can be related to the fact that water of a well-known characteristic coming from a treatment plant is cyclically being pumped into the reservoir. Other factors, however, can make the prediction of water quality difficult. In particular, the presence of contaminants, which constitutes a safety issue as discussed in the next section, can affect water quality and make its prediction hard.

2.2.1 Security considerations

As mentioned earlier, water distribution systems have been subject to particular attention from a security point of view. Since physical protection is not easily applicable, intentional or accidental injection of contaminants at some points of the distribution system constitutes a serious threat for citizens. Supervisory control and data acquisition (SCADA) systems provide a natural opportunity to increase vigilance against water contamination. A specialised Event Detection System (EDS) for water management can be included such that a contamination event is detected as early as possible, with high accuracy and few false positives. EDSs must distinguish changes caused by normal system operations from events caused by contamination; since different contaminants affect the water quality parameters in different ways, this distinction is not always clear and represents one of the main challenges of EDS tools.

Chapter 3 discusses the application of ADWICE within an event detection system for water quality when integrated into a SCADA system.

2.2.2 Related Work on Contamination Detection

In this section we first describe work that is closely related to ours (water quality anomaly detection), and then we continue with an overview of other works which are related to the big picture of water quality and monitoring.

Water quality anomalies

The security issues in water distribution systems are typically categorised in two ways: hydraulic faults and quality faults [20]. Hydraulic faults (broken pipes, pump faults, etc.) are intrinsic to mechanical systems and, similarly to other infrastructures, fault tolerance must be considered at design time to make the system reliable. Hydraulic faults can cause economic loss and, in certain circumstances, water quality deterioration. Online monitoring techniques are developed to detect hydraulic faults, and alarms are raised when sensors detect anomalous conditions (like a sudden change of the pressure in a pipe). Hydraulic fault detection is often performed by using specific direct sensors and it is not the area of our interest.

The second group of security threats, water quality faults, has been subject to increased attention in the past decade. Intentional or accidental injection of contaminant elements can cause severe risks to the population, and Contamination Warning Systems (CWS) are needed in order to prevent, detect, and proactively react in situations in which a contaminant injection occurs in parts of the distribution system [21]. An EDS is the part of the CWS that monitors the water quality parameters in real time in order to detect anomalous quality changes. Detecting an event consists of gathering and analysing data from multiple sensors and detecting a change in the overall quality. Although specific sensors for certain contaminants are currently available, EDSs are more general solutions not limited to a set of contaminants.

Byers and Carlsson are among the pioneers in this area. They tested a simple online early warning system by performing real-world experiments [22]. Using a multi-instrument panel that measures five water quality parameters at the same time, they collected 16,000 data points by sampling one measurement of tap water every minute. The values of these data, normalised to have zero mean and unit standard deviation, were used as baseline data. They then emulated a contamination in the laboratory by adding four different contaminants (in specific concentrations) to the same water in beakers or using bench-scale distribution systems. The detection was based on a simple rule: an anomaly is raised if the difference between the measured values and the mean from the baseline data exceeds three times the standard deviation. They evaluated the approach comparing normality based on large data samples and small data samples. Among others, they evaluated the sensitivity of the detection, and successfully demonstrated detection of contaminants at concentrations that are not lethal for human health. To our knowledge this approach has not been applied on a large scale to a broad number of contaminants at multiple concentrations.
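The three-standard-deviation rule described above can be sketched as follows. The function names are hypothetical, and raising an anomaly as soon as any single parameter exceeds its bound is an assumption about how the rule is applied across the five parameters.

```python
import statistics
from typing import List, Sequence, Tuple


def fit_baseline(samples: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-parameter mean and (population) standard deviation of baseline data."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means, stds


def is_anomalous(x: Sequence[float], means: Sequence[float],
                 stds: Sequence[float], k: float = 3.0) -> bool:
    """Flag x if any parameter deviates from its baseline mean by more than k std devs."""
    return any(abs(v - m) > k * s for v, m, s in zip(x, means, stds))
```

If the measurements are first normalised to zero mean and unit standard deviation, the test reduces to checking whether any normalised value has magnitude above 3.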

Klise and McKenna [23] designed an online detection mechanism called the multivariate algorithm: the distance between the current measurement and an expected value is computed and then checked against a fixed threshold that determines whether the current measurement is a normal value or an anomaly. The expected value is assigned using three different approaches: last observation, closest past observation in a multivariate space within a sliding time window, or by taking the closest cluster centroid in clusters of past observations using k-means clustering [24]. The detection mechanism was tested on data collected by monitoring four water quality parameters at four different locations, taking one measurement every hour during 125 days. Their contamination was simulated by superimposing values according to certain profiles on the water quality parameters of the last 2000 samples of the collected data. Results of simulations have shown that the algorithm achieves the required level of detection at the cost of a high number of false positives, and that a change of background quality can severely deteriorate the overall performance.
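The second of the three variants, where the expected value is the closest past observation within a sliding window, might look like the sketch below. The class name is hypothetical, and the choice to add every observation to the window (including those flagged as anomalous) is an assumption that the description above does not settle.

```python
import math
from collections import deque
from typing import Deque, Sequence, Tuple


class NearestNeighbourDetector:
    """Fixed-threshold test against the closest observation in a sliding window."""

    def __init__(self, window: int, threshold: float):
        self.history: Deque[Tuple[float, ...]] = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, x: Sequence[float]) -> bool:
        """Return True if x is anomalous, then add it to the window."""
        point = tuple(x)
        anomalous = False
        if self.history:
            # Expected value = closest past observation in the multivariate space.
            nearest = min(math.dist(point, h) for h in self.history)
            anomalous = nearest > self.threshold
        self.history.append(point)  # assumption: anomalies also enter the window
        return anomalous
```

Under this assumption a sustained change of background quality is flagged once and then accepted as the new baseline, which illustrates how sensitive such a detector is to the window policy.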

A comprehensive work on this topic was initiated by the U.S. EPA, resulting in the CANARY tool [25]. CANARY is a software tool for online water quality event detection that reads data from sensors and considers historical data to detect events. Event detection is performed in two online parallel phases. The first phase, called state estimation, predicts the future quality value: history is combined with new data to generate the estimated sensor values that will be compared with actually measured data. In the second phase, residual computation and classification, the differences between the estimated values and the new measured values are computed and the highest difference among them is checked against a threshold. If that value exceeds the threshold, it is declared an outlier. The number of outliers in the recent past is then combined through a binomial distribution to compute the probability of an event in the current time step.
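The second phase can be illustrated with the sketch below. It is one plausible reading of the description, not CANARY's actual code or formula: the function names are hypothetical, and the event probability is assumed to be P(X < k) for X ~ Binomial(n, p), where k is the number of outliers among the last n time steps and p is an assumed per-step outlier probability under normal conditions.

```python
import math
from typing import Sequence


def is_outlier(predicted: Sequence[float], measured: Sequence[float],
               threshold: float) -> bool:
    """Compare the largest residual across sensors against the threshold."""
    return max(abs(p - m) for p, m in zip(predicted, measured)) > threshold


def event_probability(outlier_flags: Sequence[bool], p_outlier: float) -> float:
    """Assumed form: P(X < k), X ~ Binomial(n, p_outlier), k = observed outliers."""
    n = len(outlier_flags)
    k = sum(outlier_flags)
    return sum(
        math.comb(n, i) * p_outlier ** i * (1.0 - p_outlier) ** (n - i)
        for i in range(k)
    )
```

With this form a single outlier in a long window barely moves the probability, while a run of outliers pushes it towards one.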

CANARY integrates old information with new data to estimate the state of the system. Thus, their EDS is context-aware. A change in the background quality due to normal operation would be captured by the state estimator, and would not generate too many false alarms. Singular outliers due to signal noise or background change would not immediately generate an alarm, since the probability of raising alarms depends on the number of outliers in the past, which must be high enough to generate an alarm. Sensor faults and missing data are treated in such a way that their value does not affect the residual classification: their values (or lack thereof) are ignored until the sensor resumes its correct operational state.

CANARY allows the integration and test of different algorithms for state estimation. Several implementations are based on the field of signal processing or time series analysis, like time series increment models or linear filtering. However, it is suggested that artificial intelligence techniques such as multivariate nearest neighbour search, neural networks, and support vector machines can also be applied. A systematic evaluation of different approaches on the same data is needed to clearly summarise the benefits of each approach. This is the target of the current EPA challenge of which our work is a part.

So far, detection has been carried out on single monitoring stations. In a water distribution network, several monitoring stations could cooperate on the detection of contamination events by combining their alarms. This can help to reduce false alarms and facilitate localisation of the contamination source. Koch and McKenna have recently proposed a method that considers events from monitoring stations as values in a random time-space point process, and by using Kulldorff’s scan test they identify the clusters of alarms [26].

Contamination diffusion

Modelling water quality in distribution networks allows the prediction of how a contaminant is transported and spread through the system. Using the equations of advection/reaction, Kurotani et al. initiated the work on computation of the concentration of a contaminant in nodes and pipes [27]. They considered the topographical layout of the network, the changing demand from the users, and information regarding the point and time of injection. Although the model is quite accurate, this work does not take into account realistic assumptions like water leakage, pipe ageing, etc. A more realistic scenario has been considered by Doglioni et al. [28]. They evaluate the contaminant diffusion on a real case study of an urban water distribution network that, in addition to the previous hypotheses, also considers water leakage and contamination decay.

Sensor location problem

The security problem in water distribution systems was first addressed by Kessler et al. [29]. Initially, the focus was on the accidental introduction of pollutant elements. The defence consisted of identifying how to place sensors in the network in such a way that the detection of a contaminant can be done in all parts of the distribution network. Since the cost of installation and maintenance of water quality sensors is high, the problem consists of finding the optimal placement of the minimum number of sensors such that the cost is minimised while performing the best detection. Research in this field accelerated after 2001, encompassing the threat of intentional injection of contaminants as a terrorist action. A large number of techniques to solve this optimisation problem have been proposed in recent years [30, 31, 32, 33, 20].

The latest work in this area [34] proposes a mathematical framework to describe a wider number of water security faults (both hydraulic and quality faults). Furthermore, it builds on top of this a methodology for solving the sensor placement optimisation problem subject to fault-risk constraints.

Contamination source identification

Another direction of work has been contamination source identification. This addresses the need to react when a contamination is detected, and to take appropriate countermeasures to isolate the compromised part of the system. The focus is on identifying the time and the unknown location in which the contamination started spreading.

Laird et al. propose the solution of the inverse water quality problem, i.e. backtracking from the contaminant diffusion to identify the initial point. The problem is described again as an optimisation problem, and solved using a direct nonlinear programming strategy [35, 36]. Preis and Ostfeld used coupled model trees and a linear programming algorithm to represent the system, and computed the inverse quality problem using linear programming on the tree structure [37].

Guan et al. propose a simulation-optimisation approach applied to complex water distribution systems using EPANET [38]. To detect the contaminated nodes, the system initially assumes arbitrarily selected nodes as the source. The simulated data is fed into a predictor that is based on the optimisation of a cost function taking the difference between the simulated data and the measured data at the monitoring stations. The output of the predictor is a new configuration of contaminant concentrations at (potentially new) simulated nodes, fed again to the simulator. This process is iterated in a closed loop until the cost function reaches a chosen lower bound and the actual sources are found. Extensions of this work have appeared using evolutionary algorithms [39].

Huang et al. use data mining techniques instead of inverse water quality or simulation-optimisation approaches [40]. This approach makes it possible to deal with system and sensor data uncertainties. Data gathered from sensors is first processed with an algorithm to remove redundancies and narrow the search for possible initial contaminant sources. Then, using the maximum likelihood method, the nodes are associated with the probability of being the sources of injection.

Attacks on SCADA system

A further security risk that must be addressed is the security of the event detection system itself. As with any other critical infrastructure, an outage or corruption of the communication network of the SCADA infrastructure can constitute a severe risk, as dangerous as the water contamination. Therefore, protection mechanisms have to be deployed in response to that threat, too. Since control systems often share components and make extensive use of information exchange to coordinate and perform operations, several new vulnerabilities and potential threats emerge [41].

2.3 Advanced Metering Infrastructures

Another critical infrastructure that has been subject to widespread attention from governments, industry and academia is the electricity distribution grid. An electricity distribution grid is an infrastructure in which electricity is generated, transmitted over long distances at high voltages and finally delivered to the end users at low voltages. Today’s power demand, combined with the limitations of the current infrastructure and the need for sustainable energy in a deregulated market, has led to the promotion of smart grid infrastructures. The term smart emphasises the idea of a more intelligent process of electricity generation, transmission, distribution and consumption where automatic control provides improved efficiency, reliability, fault-tolerance, maintainability and security. All of this is enabled by the support of advanced Information and Communication Technology (ICT), adding the "cyber" network to the traditional "physical" electricity distribution network.

Figure 2.1: NIST smart grid interoperability model [1]

Due to the number of different solutions designed in the early stage, the U.S. National Institute of Standards and Technology (NIST) has devised a reference model to be used for smart grid standardisation [1] in order to improve the interoperability of different smart grid architectures. The model, depicted in Figure 2.1, describes the system as an interconnection of six different domains, namely bulk generation (power plants where electricity is produced), transmission (electricity carriers over long distances at high voltage), distribution (electricity suppliers at low voltages), customers (residential, commercial or industrial customers), operations (management and control of electricity) and markets (actors involved in price setting and trading). The actors participate in an open market in the process of generation and distribution of electricity, where generation can occur at any stage (therefore consumers can produce and sell electricity generated using photovoltaic or wind power systems) and a multiplicity of operators can interact during the activities. In the model we can distinguish the flow of energy and secure information between the domains.

The main components of a smart grid infrastructure, from a technical point of view, have been summarised as follows [42]:

• Smart infrastructure system: is the (cyber-physical) infrastructure for energy distribution and information exchange. It includes smart electricity generation, delivery and distribution; advanced metering, monitoring and communication.

• Smart management system: is the subsystem that provides advanced control services.

• Smart protection system: is the subsystem that provides advanced services to improve reliability, resilience, fault-tolerance and security.

The Advanced (Smart) Metering Infrastructure (AMI), which is the focus of our study in Chapter 4, is the part of the smart infrastructure system that is in charge of automatically collecting consumption data read from the electricity meters for storage in central databases. The Smart Meters (SM), the new type of computer-based electricity meters devised to replace the old electromechanical versions, are connected to the communication network (proprietary or public IP-based), allowing the operator to perform real-time bi-directional communication. This enables a number of innovative features such as fine-grained electricity billing for new pricing schemes, real-time demand-side power monitoring and analysis, and remote meter management. The last feature, in particular, allows the operator to connect, configure and disconnect the smart meter remotely. This reduces the management costs of the infrastructure and improves the control over the meters as opposed to the past, when electromechanical meters were working offline and physical access was required for maintenance, control and electricity consumption reporting.

2.3.1 Security considerations

The information system which is now integrated into the electricity grid enables, on the one hand, intelligent real-time control for increased efficiency, reliability and resilience of the power grid. On the other hand, it exposes the system to new threats which are inherent to the ICT domain. Attacks performed on the communication network can exploit software, protocol or hardware vulnerabilities to gain access to the network nodes or control software. In analogy with the intentional contamination threat in water distribution systems, Denial of Service (DoS) attacks or attacks that affect integrity and availability of information on the state estimation of the grid can cause severe damage or even disasters when performed on a large scale.

The advanced metering infrastructure, as part of the smart infrastructure system, is especially vulnerable to security violations since the end devices, the smart meters, are not located in controlled environments. One of the driving motivations that has led the development of this practice was the reduction of the so-called Non-Technical Losses (NTL, losses that are not caused by normal power loss along the distribution network). The annual revenue losses due to NTL were estimated at up to 40% in developing countries [11], where meter tampering and illegal connections to the low-voltage distribution network are the main practices for electricity theft. Smart metering was devised as part of a strategy to prevent NTL, on the assumption that online load monitoring and tamper detection could alleviate the phenomenon. The ICT supporting smart metering, however, has introduced more vulnerabilities exploitable for electricity theft than in the past [43]. Apart from physical tampering (which is still feasible to some extent), smart meters exhibit the typical vulnerabilities inherent to networked devices, allowing a potential attacker to also operate remotely. Among different types of attacks, modification or fabrication of consumption data can be a new means of performing electricity theft. In addition, the granularity of the measurements and the sensitivity of the information that is exchanged through the communication network have raised valid privacy concerns.

2.3.2 Related Work on AMI Security

Smart grid cyber security is a crucial issue to solve prior to the deployment of the new systems and a lot of effort has been spent on it. Academia, industry and organisations have been actively involved in the definition of security requirements and standard solutions [44, 45, 46, 47, 48].

The Advanced Metering Infrastructure is particularly vulnerable to cyber attacks, and careful attention has been given to the analysis of its specific security requirements [49]. Confidentiality is a crucial aspect in smart metering, since sensitive information about the user’s activity or habits is available. Although accumulated consumption has always been displayed on electromechanical meters without concern, detailed load profiles available with fine-grained measurements can be analysed to determine which home appliances are creating the load and give a detailed description of the activities. Confidentiality of data and credentials should be assured at all stages, from the metering phase to storage and management on the operator side. Integrity is a strict requirement since data or commands must be authentic, i.e. it should not be possible to modify or replace them with bogus equivalents. Accountability (or non-repudiation) is required since entities that generate data or commands should not be able to deny their actions. Finally, availability is now becoming critical since operation of online components is an integrated part of electricity delivery. Since metering data can be used to estimate the power demand and adjust the supply generation accordingly, a large number of unavailable measurements can severely affect the stability of the grid. Cleveland points out that encryption alone is not the solution that matches all the requirements, and automated diagnostics and physical and cyber intrusion detection can be means of preventing loss of availability.

Intrusion detection has been considered as a possible defence strategy in AMIs. Berthier et al. [14, 50] highlight the need for real-time monitoring in AMI systems. They propose a distributed specification-based approach to anomaly detection in order to discover and report suspicious behaviours during network or host operations. The advantage of this approach, which consists of detecting deviations from high-level models (specifications) of the system under study, is its effectiveness in checking whether the system follows the specified security policies. The main disadvantages are the high development cost and the complexity of the specifications.

A recent paper by Kush et al. [51] focuses on the gap between conventional IDS systems and the specific requirements for smart grid systems. They find that an IDS must support legacy hardware and protocols, due to the variety of products available, and be scalable, standard compliant, adaptive to changes, deterministic and reliable. They evaluate a number of existing IDS approaches for SCADA systems, the Berthier et al. approach and a few conventional IDS systems that could be applied to the AMI, and they verify that none of them satisfies all the non-functional requirements.

Besides cyber attacks, physical attacks are also a major matter of concern. As mentioned earlier, stealing electricity is the main motivation that induces unethical customers to tamper with the meters, and the minimisation of energy theft is a major reason why the smart metering practice was initiated in the early 2000s. However, McLaughlin et al. [43, 52] show that smart meters offer even more vulnerabilities than the old electromechanical meters. Physical tampering, password extraction, eavesdropping and meter spoofing can be easily performed with commodity devices.

An approach for detecting theft with smart metering data is discussed in Kadurek et al. [53]. They devise two phases: during the first phase the energy balance at particular substations of the distribution system is monitored. If the reported consumption is different from the measured one, an investigation phase aims at localising the point where the fraud is taking place.
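The first phase can be illustrated with a simple energy balance check. This is a sketch under stated assumptions rather than the method of [53]: the function names, the fixed technical loss fraction and the tolerance are all hypothetical parameters chosen for the example.

```python
from typing import Dict


def balance_mismatch(substation_kwh: float, reported_kwh: Dict[str, float],
                     technical_loss_fraction: float = 0.05) -> float:
    """Energy delivered by the substation that no meter accounts for."""
    # Assumption: technical losses are a fixed fraction of the delivered energy.
    expected = substation_kwh * (1.0 - technical_loss_fraction)
    return expected - sum(reported_kwh.values())


def needs_investigation(substation_kwh: float, reported_kwh: Dict[str, float],
                        tolerance_kwh: float = 1.0,
                        technical_loss_fraction: float = 0.05) -> bool:
    """Trigger the investigation phase when the unexplained energy is too large."""
    mismatch = balance_mismatch(substation_kwh, reported_kwh, technical_loss_fraction)
    return mismatch > tolerance_kwh
```

A positive mismatch only indicates that some energy is unaccounted for behind the substation; locating the responsible meter is the task of the second phase.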

In Chapter 4 we present the design of a smart metering infrastructure which uses trusted computing technology to enforce strong security requirements, and we show that the existence of a weakness in the forthcoming end-nodes makes them exploitable for electricity theft and justifies the presence of real-time anomaly detection.

2.4 Disaster Area Networks

The last domain on which we have focused our studies is mobile ad hoc networking.

This is a networking paradigm in which the network nodes communicate by creating peer-to-peer connections without the support of an existing infrastructure. Routing and message dissemination in mobile ad hoc networks (MANETs), an extensive area of research, is performed by the nodes themselves, which create chains of connections. Mobile ad hoc networks can be deployed in different application scenarios in which infrastructure-based systems are hard to deploy. Among them, a disaster scenario is presented as a context in which existing infrastructures can be disrupted and spontaneous networks deployed with commodity devices. The hastily formed network can be a fast and early communication system to support rescue operations. The main challenge in such a scenario is the presence of partitions, i.e. disruption in pockets of connectivity that change over time due to mobility, lack of power, etc. In the following section, we describe a dissemination protocol for disaster area networks designed to overcome partitions.

2.4.1 Random-Walk Gossip protocol

The Random-Walk Gossip (RWG) protocol [54] is a manycast partition-tolerant protocol designed to efficiently disseminate messages in disaster area networks. The protocol does not assume any knowledge about the network topology or the identity of the participants, since this information may not be available before deployment time, and in such environments both are expected to be highly dynamic. To overcome this limitation, the protocol is intended to disseminate a message to k nodes in the network, irrespective of their identity. The number of recipients k is a parameter configurable by the user.

To cope with network partitions, the protocol uses a store-carry-and-forward approach, meaning that a node stores the messages in a buffer in order to forward them to other nodes when links are available. The name of the protocol derives from the way the messages are spread in the network. When a message is injected by a node, it will perform a random walk on that partition until all the nodes have been reached, or the message is k-delivered. In the first case, when one of the message holders moves to another partition, the spreading of the message will be resumed, and this loop is repeated until the message is k-delivered or expires.

The random walk of the message is performed by a three-way handshake using specific control packets. When a node is actively trying to disseminate a message, it is said to be the custodian of that message. In order to start the dissemination, the custodian sends a Request to Forward (REQF) packet, which contains the message payload. The nodes in the vicinity that hear that packet store the message in their buffer and reply with an acknowledgement (ACK) packet. The custodian, after updating a data structure called the bit vector, which tracks how many and which nodes have received the message, randomly selects one of the nodes from which it received ACK packets and sends it an OK to Forward (OKTF) packet. The recipient of this packet, now the new custodian, will start this process again to continue the dissemination. The first node that realises that a message has been k-delivered sends a Be Silent (BS) packet to inform its neighbours about the completed dissemination of the message.

The gossip component of the name derives from the fact that each time a node receives a packet, it checks whether the sender has not yet been informed about one of the messages it has in its buffer. In such a case, the node holding the message will start disseminating it with the same procedure described above. This is very useful for quickly resuming the dissemination process when a node moves to another partition.
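The REQF/ACK/OKTF exchange can be illustrated with a small simulation of a single custodian round. This is a minimal sketch, not the RWG implementation from [54]: the class and function names are hypothetical, the radio is idealised (no packet loss or timers), a plain set stands in for the bit vector, and counting the current holder towards k is an assumption.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    buffer: dict = field(default_factory=dict)  # message id -> payload


def custodian_round(custodian, neighbours, msg_id, informed, k, rng):
    """One REQF/ACK/OKTF round (hypothetical helper, idealised radio).

    `informed` is a set of node ids standing in for the bit vector.
    Returns (next_custodian, k_delivered).
    """
    payload = custodian.buffer[msg_id]
    informed.add(custodian.node_id)  # assumption: the holder counts towards k

    # REQF: every neighbour in range stores the payload and answers with ACK.
    ackers = []
    for node in neighbours:
        node.buffer[msg_id] = payload
        ackers.append(node)

    # The custodian records which nodes acknowledged the message.
    informed.update(node.node_id for node in ackers)

    if len(informed) >= k:
        return None, True             # a BS packet would be sent here
    if not ackers:
        return custodian, False       # isolated: keep carrying the message
    return rng.choice(ackers), False  # OKTF: hand over custody at random


# Example: node 1 injects message "m" in a partition with two neighbours.
rng = random.Random(0)
origin, n2, n3 = Node(1, {"m": "help"}), Node(2), Node(3)
holders = set()
next_custodian, delivered = custodian_round(origin, [n2, n3], "m", holders, 5, rng)
```

In this sketch an isolated custodian simply keeps the message, which mirrors the store-carry-and-forward behaviour: dissemination resumes once the node meets new neighbours.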


2.4.2 Security considerations

Mobile ad hoc networking has been a subject of intense research during the last decade. Apart from the development of protocols and architectures to improve network robustness, delay tolerance, throughput rates, routing performance etc., the application areas of such networks have also raised the need for techniques to protect them from various security threats. Malicious nodes can exploit protocol vulnerabilities to disrupt the communication, cause node failures or simply behave selfishly, exploiting the network resources without participating in the collaborative routing efforts [55]. These issues are present in RWG, where nodes rely on each other for message dissemination but trust relationships between them cannot be assumed. Malicious nodes can freely join the network to perform attacks by exploiting vulnerabilities of the three-way handshake mechanism, as shown in Cucurull et al. [56]. Several approaches to intrusion detection have been proposed for MANETs, ranging from standalone fully distributed architectures, where every node in the network works independently to discover anomalies, to hierarchical solutions with some centralised elements, where nodes collaborate to increase detection performance. For a broad overview of MANET security architectures, the reader is referred to the comprehensive surveys [57] and [58].

Chapter 5 presents our work on an adaptation component of a fully distributed standalone framework for surviving attacks in disaster area networks, where every node works independently to capture the state of the network, detect anomalies and take countermeasures to mitigate the impact of the attacks.


Chapter 3

Anomaly Detection in Water Management Systems

This chapter addresses the first application of ADWICE in the physical domain of a cyber-physical system. The hypothesis is that ADWICE, which has earlier been successfully applied to detect attacks in IP networks, can also be deployed for real-time anomaly detection in water management systems. Analysis of the physical domain's values and indicators should raise accurate alarms with low latency and few false positives when changes in quality parameters indicate anomalies.

The chapter therefore describes the evaluation of the anomaly detection software when integrated in a SCADA system of a water distribution infrastructure. The analysis is carried out within the Water Security initiative of the U.S. Environmental Protection Agency (EPA), described in Section 3.1. Performance of the algorithm in terms of detection rate, false positive rate, detection latency and contaminant sensitivity on data from two monitoring stations is illustrated in Sections 3.2 to 3.4. Finally, improvements to the collected data to deal with the data unreliability that arises when dealing with physical sensors are discussed in Section 3.5.

3.1 Scenario: the event detection systems challenge

The United States Environmental Protection Agency, in response to Homeland Security Presidential Directive 9, has launched an Event Detection System challenge to "develop robust, comprehensive, and fully coordinated surveillance and monitoring systems, including international information, for water quality that provides early detection and awareness of disease, pest, or poisonous agents." [59].

In particular, EPA is interested in the development of Contaminant Warning Systems (CWS) that detect the presence of contaminants in the water distribution system in real time. The goal is to take the appropriate countermeasures upon unfolding events to limit or cut the supply of contaminated water to users.

The challenge is conducted by providing water quality (WQ) data from sensors of six monitoring stations from four US water utilities. Data comes directly from the water utilities without any alteration by the evaluators, in order to keep the data in the same condition as if it came from real-time sensing of the parameters. Data contains WQ parameter values as well as additional information such as operational indicators (levels of water in tanks, active pumps, valves, etc.) and equipment alarms (which indicate whether sensors are working or not). Each station differs from the others in the number and type of those parameters. A baseline dataset is then provided for each of the six stations. It consists of 3 to 5 months of observations coming from the real water utilities. Each station's data has a different time interval between two observations, in the order of a few minutes. The contaminated testing dataset is obtained from the baseline data by simulating the superimposition of the contaminant effects on the WQ parameters. Figure 3.1 [60] is an example of the effects (increase or decrease) of different types of contaminants on Total Organic Carbon, Chlorine level, Oxidation Reduction Potential, Conductivity, pH and Turbidity.

Figure 3.1: Effect of contaminants on the WQ parameters

EPA has provided a set of 14 simulated contaminants, denoted contaminant A to contaminant N. Contaminants are not injected along the whole testing sequence; instead, the attack can be placed in a certain interval inside the testing data, with a duration limited to a few timesteps. Contaminant concentrations are added following a certain profile, which defines the rise, the fall, the length of the peak concentration and the total duration of the attack. Figure 3.2 shows some examples of profiles.
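To make the notion of an event profile concrete, the sketch below builds a bell-shaped concentration profile and adds its effect to a baseline series. It is only an illustration under stated assumptions: the function names are hypothetical, the `sigmas_covered` width parameter is invented for the example, and the linear `effect_per_unit` model is an assumption, since the way EDDIES maps a concentration to a change in a WQ parameter is not described here.

```python
import math


def gaussian_profile(duration=64, peak=1.0, sigmas_covered=6.0):
    """Bell-shaped concentration values, one per timestep.

    `sigmas_covered` (assumed) controls how much of the bell fits
    inside the event duration.
    """
    centre = (duration - 1) / 2.0
    sigma = duration / sigmas_covered
    return [peak * math.exp(-0.5 * ((t - centre) / sigma) ** 2)
            for t in range(duration)]


def superimpose(baseline, profile, start, effect_per_unit):
    """Return a copy of `baseline` with the contaminant effect added.

    Assumes the parameter changes linearly with concentration; a negative
    `effect_per_unit` models a parameter that decreases.
    """
    contaminated = list(baseline)
    for offset, concentration in enumerate(profile):
        contaminated[start + offset] += effect_per_unit * concentration
    return contaminated
```

For example, `superimpose(chlorine, gaussian_profile(peak=1.0), start, -0.3)` would lower a hypothetical chlorine series over 64 timesteps, with the largest drop in the middle of the event.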

To facilitate the deployment and the evaluation of the EDS tools, a software tool called EDDIES has been developed and distributed by EPA to the participants. EDDIES has four main functionalities:


Figure 3.2: Example of Event Profiles

• Real-time execution of EDS tools in interaction with SCADA systems (collecting data from sensors, analysing them by the EDS and sending the response back to the SCADA tool to be viewed by the utility staff).

• Offline evaluation of EDS tools by using stored data.

• Management of the datasets and simulations.

• Creation of new testing datasets by injection of contaminants.

Having the baseline data and the possibility to create simulated contaminations, EDS tools can be tuned and tested in order to see if they suit this kind of application. In the next sections we explain how we adapted an existing anomaly detection tool and present the results obtained by applying ADWICE to data from two monitoring stations.

3.2 Modelling Normality

3.2.1 Training

The training phase is the first step of anomaly detection. It is necessary to build a model of normality of the system to be able to detect deviations from normality. As mentioned in Section 2.1, ADWICE uses the approach of semi-supervised anomaly detection, meaning that training data is supposed to be unaffected by attacks. Training data should also be long enough to capture as much as possible of the normality of the system. In our scenario, the data that EPA has provided is clean from contaminants. The baseline data contains the measurements of water quality parameters and other operational indicators of uncontaminated water over a period of some months. Semi-supervised anomaly detection is thereby applicable.

For our purpose, we divide the baseline data into two parts: the first is used to train the anomaly detector, while the second is first processed to add the contaminations and then used as testing data. To see how the anomaly detector reacts separately to the effect of each contaminant, 14 different testing datasets are created, each one with a different contaminant in the same timesteps and with the same profile.

3.2.2 Feature selection

A feature selection is made to decide which parameters to consider for the anomaly detection. In the water domain, one possibility is to consider the water quality parameters as they are. Some parameters are usually common to all the stations (general WQ parameters), but some other station-specific parameters can also be helpful to train the anomaly detector on the system normality. The available parameters are:

• Common WQ Parameters: Chlorine, pH, Temperature, ORP, TOC, Conductivity, Turbidity

• Station-Specific Features: active pumps or pump flows, alarms, CL2 and pH measured at different time points, valve status, pressure.

Sensor alarms are boolean values which indicate whether sensors are working properly or not. The normal value is 1, while 0 means that the sensor is not working or that, for some reason, the value is not accurate and should not be taken into account. The information regarding the pump status could be useful to correlate the changes of some WQ parameter with the particular kind of water being pumped to the station. There are other parameters that give information about the status of the system at different points, for example the measurement of pH and CL2 of water coming from other pumps.

Sensor values and indicators are collected at a certain frequency, as mentioned in Section 3.1. The resulting vectors of numerical values are what we consider as observations of the state of the physical system at these points in time.

Additional features could be considered in order to improve the detection or reduce the false positive rate. Those features can be derived from some parameters of the same observation, or they can consider some characteristic of the parameters across different observations. For instance, to emphasise the intensity and the direction of the parameter changes over time, one possible feature to be added would be the difference between the value of a WQ parameter and its value in previous observations. This models the derivative function of the targeted parameter. Another feature, called sliding average, is obtained by adding to each observation a feature whose value is the average of the last n values of a WQ parameter. Feature selection and customisation must be made separately for each individual station, since they have some common parameters but differ in many other aspects.
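The two derived features can be computed in a single pass over the series of one WQ parameter. The following is a minimal sketch; the function name is hypothetical, and two details are assumptions rather than choices stated in the text: the difference is taken with respect to the immediately preceding observation, and the first observations use a shortened averaging window.

```python
def derived_features(values, n=3):
    """Return (value, difference, sliding_average) for each observation.

    difference      : value minus the previous value (0.0 for the first one)
    sliding_average : mean of the last n values, including the current one
    """
    rows = []
    for i, value in enumerate(values):
        difference = value - values[i - 1] if i > 0 else 0.0
        window = values[max(0, i - n + 1): i + 1]
        rows.append((value, difference, sum(window) / len(window)))
    return rows
```

The difference reacts sharply to sudden changes, while the sliding average smooths out isolated spikes, which is why the two features complement each other.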

ADWICE assumes the data to be in numerical format to create an n-dimensional space state vector. Since the timestep series of numerical data from water utilities suit the input requirements of ADWICE, our testing data does not require any particular preprocessing phase before feeding it to the anomaly detector, other than the creation of the derived features and normalisation.

3.2.3 Challenges

The earlier application of ADWICE has been to IP packet headers in network traffic. In its native domain, the main problem is finding a pure dataset, not affected by attacks, but the quantity and quality of data is always high. Network traffic generates a lot of data, which is good for having a reasonable knowledge of normality as long as resources for labelling the data are available. Feature selection from IP headers, for example, is easy and does not lead to many problems, while the difficult issues would arise if payload data needed to be analysed, where we would face privacy concerns and anonymisation. In a SCADA system, sensors can give inaccurate values and faults can cause missing observations. This makes the environment more complicated, so feature selection and handling are complex. Dealing with inaccurate or missing data requires more effort to distinguish whether an event is caused by those conditions or by contamination. Furthermore, the result of the anomaly detection varies depending on where the attack is placed. It is not easy, for example, to detect a contamination when at the same time some sensor readings for some WQ parameters are inaccurate and some others are missing. Training the system with a limited dataset can result in a sub-optimal normality model, which causes many false alarms when testing with data that resembles normality conditions of the system not covered during training. In the next section we show some results that we obtained by testing our anomaly detector with data from two different monitoring stations, proposing some possible solutions for the kinds of problems described.



3.3 Detection Results

Of the six available stations, we have chosen to test our anomaly detection with two representative cases: a station that is slightly affected by sensor inaccuracies and another where faulty values are more frequent.

As mentioned before, we have generated testing datasets by using the second half of the baseline data and adding one contamination per dataset. The contamination has been introduced in the middle of the dataset according to profile A, depicted in Figure 3.2, which is a normal distribution of the concentration during 64 timesteps. Details about the single stations will be presented separately.

Each testing dataset then contains just one attack spanning several timesteps. The detection classifies each timestep as normal or anomalous. The outcome of the detection is expressed in terms of DR and FPR, as described in Section 2.1.
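Since every timestep receives a label, both metrics reduce to simple counts. The sketch below assumes the usual per-timestep definitions (detected attack timesteps over all attack timesteps, and false alarms over all normal timesteps); Section 2.1 gives the definitions actually used in this thesis, and the function name is hypothetical.

```python
def detection_metrics(is_attack, is_alarm):
    """Return (detection_rate, false_positive_rate) from per-timestep labels.

    is_attack[t] : True if timestep t belongs to the injected event
    is_alarm[t]  : True if the detector classified timestep t as anomalous
    """
    pairs = list(zip(is_attack, is_alarm))
    detected = sum(1 for attack, alarm in pairs if attack and alarm)
    false_alarms = sum(1 for attack, alarm in pairs if not attack and alarm)
    attacks = sum(1 for attack, _ in pairs if attack)
    normals = len(pairs) - attacks
    dr = detected / attacks if attacks else 0.0
    fpr = false_alarms / normals if normals else 0.0
    return dr, fpr
```

As a purely arithmetic illustration, 39 alarms inside a 64-timestep event correspond to a DR of about 0.609.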

3.3.1 Station A

Station A is located at the entry point of a distribution system. It is the best station in terms of reliability of values. It only has the common features and three pump status indicators. Values are not affected by inaccuracies and there are no missing values in either the training or the testing datasets. The baseline data consists of one observation every 5 minutes over a period of five months. The first attempt at generating a dataset is done by injecting contaminants according to the normal distribution during 64 timesteps, in which the peak contaminant concentration is 1 mg/L.

Table 3.1 shows the results that we obtained by doing a common training phase and then running a test for each of the contaminants. Training and testing have been carried out with the threshold value E in ADWICE set to 2 and the maximum number of clusters M set to 150. Considering the fact that the amount of contaminant is the lowest possible, the results from Table 3.1 are not discouraging. Some contaminants affect several parameters at the same time and their effect is more evident; some others only affect a few parameters with slight changes. Contaminant F, for instance, only affects the ORP, which is not measured in this station, therefore it is undetectable in this case. As can be observed from the table, the FPR is the same irrespective of the contaminant. This is due to legitimate normal data that is classified erroneously as anomalous, a misclassification that is not dependent on the contaminant injected.

The anomaly detector must be tuned in order to fit the clusters over the normality points and let the furthest points be recognised as attacks. To determine the best threshold value of E in ADWICE, the ROC curves can be calculated by plotting the detection rate (Y axis) as a function of the false positive rate (X axis) while changing the threshold value (the label beside the point).

Figure 3.3 shows an example of a ROC curve obtained with the peak concentration of 1 mg/L of contaminant A, injected according to profile A (Figure 3.2). It is possible to observe that by setting E to 2.0 we obtain the DR and FPR presented in Table 3.1. By increasing E the detector expands the region around the clusters


Contaminant ID    False Positive Rate    Detection Rate
A                 0.057                  0.609
B                 0.057                  0.484
C                 0.057                  0
D                 0.057                  0
E                 0.057                  0
F                 0.057                  0
G                 0.057                  0
H                 0.057                  0.422
I                 0.057                  0
J                 0.057                  0.547
K                 0.057                  0
L                 0.057                  0.406
M                 0.057                  0.156
N                 0.057                  0.516

Table 3.1: Station A detection results at a concentration of 1 mg/L

where data is considered normal. In this circumstance, the FPR tends to decrease, since legitimate normal data that is further away from the cluster centroids and was previously misclassified as anomalous now falls within the boundaries of the accepted region. On the other hand, attack data that falls in that region is misclassified as normal, affecting the DR negatively. The contrary holds when decreasing E, leading to a higher DR at the cost of a higher FPR.
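The trade-off governed by E can be reproduced with a deliberately simplified detector. The sketch below is not ADWICE: it assumes that a point is normal when its Euclidean distance to the nearest cluster centroid is at most E, ignores the cluster tree and the incremental training, and uses hypothetical function names.

```python
import math


def is_anomalous(point, centroids, e):
    """Simplified rule (assumption): anomalous if farther than e from every centroid."""
    nearest = min(math.dist(point, centroid) for centroid in centroids)
    return nearest > e


def roc_points(normal_points, attack_points, centroids, thresholds):
    """Return one (E, FPR, DR) triple per threshold value."""
    points = []
    for e in thresholds:
        false_alarms = sum(is_anomalous(p, centroids, e) for p in normal_points)
        detections = sum(is_anomalous(p, centroids, e) for p in attack_points)
        points.append((e,
                       false_alarms / len(normal_points),
                       detections / len(attack_points)))
    return points
```

Sweeping `thresholds` from small to large values moves along the curve from the high-DR, high-FPR corner towards the origin, which is the behaviour described above.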

Evaluation of the ROC curves of all the contaminants can give hints to determine the best trade-off between detection rate and false positive rate, but all of those curves refer to a contaminant concentration peak of 1 mg/L. As non-experts it was not clear to us whether this could be a significant level of contamination. For this reason we have tested the sensitivity of the anomaly detection by incrementally increasing the contaminant concentration. In our tests, we increased the concentration in steps of 4 mg/L at a time, up to 24 mg/L, under the same configuration of E set to 2 and M to 150. Figure 3.4 shows the variation of the detection rates of three significant contaminants with respect to the increase of the concentration. Contaminants A and L are efficiently detected already at low concentrations, and their DR increases steadily above 80% (up to 98%) with higher concentrations. Contaminant E is not detected at all at concentrations of 4 mg/L or below. This is justifiable since this contaminant only slightly affects the TOC. Increasing the concentration to higher levels, we obtain a DR of up to 67%. In this figure the false positive rate is not represented, since with a higher concentration of contaminants it is easier to detect the deviation from normality without any increase in the false alarms. These results confirm that even if ADWICE was not originally designed for this kind of application, by finding the optimal trade-off between detection and false positive rates for 1 mg/L, this anomaly detector would give good results for any greater concentration. We conclude therefore that ADWICE is a good candidate



Figure 3.3: Station A contaminant A ROC curve

Figure 3.4: Concentration Sensitivity of the Station A

tool to be applied as EDS for the physical domain in this monitoring station.

3.3.2 Station D

The situation becomes more complicated when a source of uncertainty is added to the system. Station D is located in a reservoir that holds 81 million gallons of water. The water quality in this station is affected by many operational parameters of co-located pump stations and reservoirs. Station D contains more parameters than station A and some sensors are affected by inaccuracy. In detail, Station D has the following parameters:

• Common features: pH, Conductivity, Temperature, Chlorine levels, TOC, Turbidity.

• Alerts: CL2, TOC and Conductivity; 1 means normal functioning, while 0 means inaccuracy.


• System indicators: three pump flows; two of them supply the station while the third is the pipe which the station is connected to.

• Valves: indicates the position of the key valve; 0 if open, 1 if closed.

• Supplemental parameters: Chlorine levels and pH measured in water coming from pump1 and pump2.

By checking the data that EPA has provided, we noted that the only sensor inaccuracy alert that is sometimes raised is the TOC alert, but in general we will assume that the other alerts could be raised as well. There are some missing values at different points scattered within the baseline file. The baseline data consists of one observation every 2 minutes over a period of three months. The same procedure as for station A has then been applied to this station.

Figure 3.5 shows the ROC curve obtained with the peak concentration of 1 mg/L and the same profile (profile A, Figure 3.2). In this case, we can observe that the relation between DR and FPR is not as smooth as in the case of station A. Furthermore, the results are generally worse both in terms of DR and FPR.

Figure 3.5: ROC curve station D contaminant A

An accurate feature selection had to be carried out to get reasonable results, since with all the station parameters included the false positive rate is very high, which makes it not worthwhile to explore concentration variations with such bad results.

To mitigate the effects of the missing data and the inaccuracy, the derivatives and sliding averages of the common parameters have been added as new features. The outcome was that the derivatives emphasise the intensity of the changes, thus improving the detection of the effects of the contaminations, while the sliding window averages mitigated the effect of the abrupt changes in data caused by the inaccuracies or missing data. Some parameters have been ignored, like the pump flows and the key valve, since they caused lots of false positives if included as features.

After applying the changes described above, ADWICE was run with the parameter E set to 2. Since the dataset is more complex and there are more possible combinations of data to represent, the maximum number of clusters M was set to 200.

Figure 3.6 shows the sensitivity of the detection with the increase of the concentrations. The reported FPR, not represented here, was 4%, an acceptable level according to the specifications provided by EPA. Again, Contaminant A is detected

Figure 3.6: Concentration sensitivity station D

already at low concentrations, even though Figure 3.6 shows a lower DR compared to the previous case. Contaminant L this time is not detected at a concentration of 1 mg/L, but the detection performance increases suddenly at higher concentrations. Contaminant E follows the same trend as in the previous case, but the range between 4 mg/L and 12 mg/L shows improvements in this case. This is due to the addition of the derived features based on the TOC, the only WQ parameter affected by this contaminant.

Data from the above two stations have resulted in clustered models consisting of 115 clusters for station A and 186 clusters for station D.

3.4 Detection Latency

This section focuses on the adequacy of the contaminant detection latency. As mentioned in Section 3.1, the final goal of the EPA challenge is to apply the best EDS tools in real water utilities to proactively detect anomalous variations of the WQ parameters. Real-time monitoring makes it possible to take timely countermeasures upon unfolding contaminations. This makes the response time as crucial as the correctness of the detection in general, since even with a good detection rate (on average) a late response may allow contaminated water to leave the system and be delivered to users, causing severe risks for their health.

The first issue that arises when measuring the detection latency is from when to start counting the elapsed amount of time before the first alarm is raised. This problem is caused by the fact that different event profiles make it necessary to consider the latency in different ways. In the case of the normal distribution depicted as profile A (Figure 3.2), a possible approach could be counting the latency of the detection event from the initiation of the contamination event, since the concentration rapidly reaches its peak. If the peak concentration was reached very slowly, the evaluation of latency based on the first raised alarm from the beginning would result in unnecessary pessimism (e.g. see profile D in Figure 3.2). In this case it would be more appropriate to start counting the reaction time from the time when the peak of the event has taken place. In that case, an earlier detection would give rise to a negative delay and this would signal a predictive warning. For the purpose of our experiments, the normal distribution of profile A suits the computation of the latency based on counting the number of samples from the beginning of the event.

Since in the baseline data time is represented as discrete timesteps, we measure the latency by counting the number of timesteps elapsed before the first alarm is raised. Figures 3.7 and 3.8 show the measured latencies for Station A and Station D respectively, with respect to the detection of the three contaminants presented in the previous section. The curves indicate that in the case of the lowest concentration the latencies are high. For instance, the detection of contaminants A and L in station A has a latency of 13 and 20 timesteps respectively. They are around one fourth and one third of the total duration of the contamination event (64 timesteps). Contaminant E is not detected at all, so its latency is not represented in the graph. The situation changes positively when the concentrations are gradually increased. The latencies of Contaminants A and L drop sharply until a concentration of 8 mg/L is reached, decreasing by 60% and 75% respectively. At the same time, there are some detections of Contaminant E, characterised by a high latency. From this point the latencies of Contaminants A and L drop steadily, while the latency of Contaminant E decreases more rapidly. Latencies for station D follow the same pattern, although the values are slightly higher.

Figure 3.7: Detection latency in station A

Figure 3.8: Detection latency in station D

Checking the results against reality, a latency of 13 discrete timesteps for Contaminant A in Station A would correspond to a latency of 65 minutes, which is quite a long time. One should note that the time interval between two observations has a high impact on the real latency, since for example 20 timesteps of detection latency for Contaminant A in Station D with a concentration of 1 mg/L corresponds to 40 minutes of latency, 25 minutes less than the latency in Station A. Even in this case the results are definitely improved by the increases in the contaminant concentrations, but domain knowledge is required to evaluate whether the selected increments to a certain concentration are meaningful.
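The latency measure and its conversion to wall-clock time can be written down in a few lines. This is a minimal sketch with hypothetical function names; the optional `reference` argument is an assumption added to cover the alternative of counting from the peak of the event, where an early alarm yields a negative value.

```python
def detection_latency(alarms, event_start, event_end, reference=None):
    """Timesteps between `reference` and the first alarm inside the event.

    alarms      : per-timestep booleans produced by the detector
    event_start : first timestep of the contamination event
    event_end   : timestep just after the event (exclusive)
    reference   : timestep to count from (defaults to event_start)
    Returns None if no alarm is raised during the event.
    """
    if reference is None:
        reference = event_start
    for t in range(event_start, event_end):
        if alarms[t]:
            return t - reference
    return None


def latency_minutes(timesteps, sampling_interval_min):
    """Convert a latency in timesteps to minutes for a given sampling interval."""
    return timesteps * sampling_interval_min
```

With the sampling intervals given for the two stations, `latency_minutes(13, 5)` gives 65 minutes for Station A and `latency_minutes(20, 2)` gives 40 minutes for Station D, matching the figures above.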

3.5 Missing and inaccurate data problem

In Section 3.3 we have seen that data inaccuracies and missing data were a major problem in station D. Our approach for the tests carried out so far has been to use workarounds but not provide a solution to the original problem.

More specifically, our workaround for missing data was as follows. We have replaced the missing data values with a zero in the preprocessing stage. When learning takes place, the use of a derivative as a derived feature helps to detect the missing data and classify the data points in their own cluster. Now, if the training period is long enough and includes the missing data (e.g. inactivity of some sensors or other operational faults) as normality, then these clusters will be used to recognise similar cases in the detection phase as long as no other sensor values are significantly affected. During our tests we avoided injecting contaminants during the periods of missing data.

Sensor inaccuracies are indicated with a special alert in the provided data set (a 0 when the data is considered inaccurate, i.e. the internal monitoring system warns about the quality of the data). According to our experiments it is not good to train the system during periods with data inaccuracies, even when workarounds are applied. First, learning inaccurate values as normality may result in excessive false positives when accurate values are dominant later. Second, the detection rate can be affected if the impact of the contaminant is similar to some of the inaccurate values. Thus a more principled way of treating this problem is needed.

Our suggestion for reducing the impact of both problems is the classical approach in dependability, i.e. introducing redundancy. Consider two sensor values (from identical or diversified technologies) that are supposed to provide measurements for the same data. Then the likelihood of both sensors being inaccurate or both having missing values during the same time interval would be lower than the likelihood of each sensor "failing" individually. Thus, for important contaminants that essentially need a given sensor value's reliability we could learn the normal value based on both data sets. When missing data is observed (0 in the alert cell) the preprocessing would replace both sensor values with the "healthy" one. When one sensor value is inaccurate the presence of the other sensor has an impact on the normality cluster, and vice versa. So, on the whole we expect to have a better detection rate and lower false positive rate with sensor replicas (of course at the cost of more hardware).
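The proposed preprocessing step for a replicated sensor could look as follows. This is only a sketch of the suggestion above, which was not implemented as part of the experiments: the function name is hypothetical, the alert convention (1 healthy, 0 inaccurate or missing) follows the data description, and returning None when both replicas are flagged is an assumption.

```python
def fuse_redundant(value_a, alert_a, value_b, alert_b):
    """Return the pair of values to feed to the detector for two replicas.

    If exactly one replica is flagged (alert 0), both values are replaced
    by the healthy reading. If both are flagged, None is returned so the
    caller can decide how to handle the observation (assumed behaviour).
    """
    if alert_a == 1 and alert_b == 1:
        return value_a, value_b
    if alert_a == 1:
        return value_a, value_a
    if alert_b == 1:
        return value_b, value_b
    return None
```

Keeping both values when both replicas are healthy lets the normality model learn how closely the two readings normally agree.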

3.6 Closing remarks

This chapter focused on the application of anomaly detection in the physical domain of water management systems. Thus, our implicit assumption was that the sensor values received by our anomaly detection system had not been subject to manipulation in the cyber layer. This assumption is relaxed in the next domain of application in Chapter 4.


Chapter 4

Anomaly Detection in Smart Metering Infrastructures

This chapter focuses on anomaly detection applied to another critical system, the Advanced Metering Infrastructure. The chapter starts (Section 4.1) by describing the design of a smart metering infrastructure that uses trusted computing technology to meet the smart grid security requirements [49]. In that section we show that despite the hard core security, there will still be weaknesses in the forthcoming smart meters that justify embedded real-time anomaly detection as a second line of defence. In Section 4.2 we propose an architecture for embedded anomaly detection for both the cyber and physical domains in smart meters. Section 4.3 presents the evaluation of an embedded instance of ADWICE for the cyber domain in a prototype under industrial development and illustrates the detection performance on some examples of cyber attacks.

4.1 Scenario: the trusted sensor network

The Trusted Sensor Network (TSN) [2] is a smart metering infrastructure defined as a use case within the EU FP7 SecFutur Project [3]. The main goal of this solution is to ensure authenticity, integrity, confidentiality and accountability of the metering process in an environment where multiple organisations can operate and where legal calibration requirements must be fulfilled [2], in accordance with the model provided by NIST [1]. This goal is achieved by a careful definition of the security requirements supported by trusted computing techniques. As depicted in Figure 4.1, a Trusted Sensor Module (TSM) is located in each household. The energy measurements produced by one or more sensors are encrypted and certified by the TSM, which sends them to a Trusted Sensor Module Collector (TSMC). This component gathers the data coming from several TSMs and relays it to the operator server infrastructure for storage. Through the general purpose network several organisations can get remote access to the functionality of the metering system for installation, configuration and maintenance, but strict access policies and accountability of the actions are enforced.

Figure 4.1: Trusted Smart Metering Infrastructure [2]

A smart meter in this architecture is called a Trusted Meter (TM), and it can be composed of one or more physical sensors, one or more TSMs and one TSMC. A detailed description of the architecture is presented elsewhere [2].

MixedModeTM, a partner company within the SecFutur project, has developed a prototype of a Trusted Meter, described in the next section.

4.1.1 Trusted Meter Prototype

The prototype of a TM (depicted in Figure 4.2) is composed of one physical sensor and includes the functionalities of a TSM and a TSMC. The sensor is an ADE7758 integrated circuit, which is able to measure the accumulated active, reactive and apparent power. The functionality of the sensor is accessible via several registers that can be read or written through its interface to the Serial Peripheral Interface (SPI) bus. There is a variety of registers that can be accessed for reading out energy measurements, configuring the calibration parameters, operational states etc. The sensor is then interfaced with an OMAP 35x system where the functionalities of the TSM and TSMC are implemented in software, with the addition of specialised hardware, namely a Trusted Platform Module (TPM), that provides the trusted computing functionalities. The system, running Angstrom Linux, uses secure boot to ensure that the hardware and software modules are not corrupted, and encryption is used to send out the readings to the operator servers.


Figure 4.2: Trusted Meter

4.1.2 Threats

After a careful study of the security requirements of the system, the applied security mechanisms and the design of the meter prototype, we came up with the following observations:

• The software, certificates, and credentials recorded in the processor module (OMAP 35x) will not be subject to change by malware or external applications due to the use of TPM technology, which also prevents typical smart meter vulnerabilities reported in an earlier work [43].

• The metering data that is sent out to the operator servers is encrypted by the (secure) application using the TPM, hence secure while in transmission.

• The weakest point in the system is represented by the unprotected connection between the sensor and the OMAP 35x system where the TSM and TSMC functionalities are implemented.

The main threat is hence represented by potential man-in-the-middle attacks on the SPI bus that affect the values of data or commands transmitted, as depicted in Figures 4.3 and 4.4.

Figure 4.3: Manipulation of consumption values


Figure 4.4: Manipulation or injection of control commands

A possible solution would again be based on encryption of the messages prior to the transmission on the SPI bus. This could be applicable in the cases when the sensor and the TSM are two physically separate modules, but when it comes to an embedded system, encryption would dramatically increase the complexity of the sensor circuitry, which must be kept cheap due to the large scale deployment.

The above analysis shows that the need for real-time monitoring for intrusion detection is still present, although trusted platforms offer a higher level of protection than earlier solutions. This motivates proposing embedded anomaly detection as a potential technology to explore.

4.2 Embedded Anomaly Detection

A proposed embedded anomaly detection architecture is devised to be included in the functionality of the Trusted Meter. Figure 4.5 illustrates the main components of the architecture. It consists of five modules: a data logger, a data preprocessor, two anomaly detection modules and an alert aggregator. The data logger is in charge of listening for communication events and data exchange through the sensor-TSM channel. It will record both the cyber domain information, i.e. packet headers or connections, as well as the physical domain information, i.e. energy measurements.

The data preprocessor is in charge of transforming the raw signals detected on the channel into feature vectors that can be fed to the anomaly detection modules for evaluation of the current state.

Anomaly detection consists of two modules: one is used for the cyber layer, i.e. the communication protocol on the SPI bus in the prototype meter, while the second is used to detect anomalies on the physical layer, i.e. the actual energy consumption that is reported by the sensor. The motivation for this distinction is the need for two different time scales for state estimation in the two domains: while the cyber communication can be monitored and suspicious events detected within seconds, anomalies in the physical domain need to be discovered in the order of days or weeks. This is due to the fact that load profile change detection is only meaningful when based on a sufficiently long time window. The two modules complement each other: while command injection on the bus can be detected by the cyber layer anomaly detector, consumption data manipulation leading to changing consumption statistics can be detected by the physical layer anomaly detector.

Figure 4.5: Proposed Cyber-Physical Anomaly Detection Architecture

The last component of the architecture, the alert aggregator, takes as input the alarms generated by the two anomaly detector modules and decides whether anomalous behaviour should be reported to the central system.

In the following sections we will describe the modules in depth and present their implementation in our current test system.

4.2.1 Data Logger

In the prototype meter, the unprotected communication channel between the sensor and the TSM+TSMC module is the SPI bus. The SPI communication is always initiated by the processor (in the OMAP 35x system), which writes, in the communication register of the sensor, a bit that specifies whether the operation is a read or a write command, followed by the address of the register that needs to be accessed. The second part of the communication is the actual data transfer from or to the addressed register of the sensor. The application that implements the TSM and TSMC functionalities performs an energy reading cycle every second, sending calibration or configuration commands when required.

Our bus logger records the following information: the timestamp of the operation, the command type (read or write), the register involved in the operation, and the value that is read or written. In our experimental setup, as described later, the bus logger has been implemented at the driver API level of the SPI interface of the TSM+TSMC, intercepting the communication prior to the transfer to the higher levels.

4.2.2 Data Preprocessing and Feature Extraction

The data preprocessor receives records in the format presented in the previous section, and produces vectors of features that will be processed by the anomaly detectors. A common data preprocessor for both domains avoids processing the received data twice. The features selected are numerical variables that all together represent the normal operation of the system. For the cyber domain, these are based on information regarding the frequency and types of operations carried out on the SPI bus during a period of observation time I. There are three categories of features:

• Operation type: percentage of read and of write operations performed in the period of observation I. An additional feature counts the number of times the read-only registers are accessed, which is useful to capture the fact that most of the time (every second in our case) the communication is performed to read out energy measurements.

• Category type: percentage of the number of times the registers of the following categories are accessed in the period of observation I: reading, configuration, interrupt, calibration, event, info. The first category includes registers used for accumulation of active, reactive and apparent energy for the three different phases. Registers categorised as configuration are those used for configuring different operational parameters of the energy measurement. Registers in the category interrupt are interrupt status flags. Registers included in the event category are used to store information on events such as voltage or current peak detection etc. The calibration category groups the important registers used to calibrate the different parameters of the sensor. Finally, the info category groups registers where checksums and the version of the sensor are stored.

• Register frequencies: usage frequency of each individual register addressed in the period of observation I.

These features are designed to characterise the typical communication patterns, so that anomalous communication sequences or register access rates should be discovered by the anomaly detector. Note that these features are specific to the smart metering device that is being monitored, in contrast with the physical domain indicators, which, as we see below, are of a general nature.
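The three feature categories can be illustrated with a short sketch. It is only an approximation under stated assumptions: the register addresses, the register-to-category map and the set of read-only registers below are hypothetical placeholders (the real ADE7758 register map is larger), and `BusRecord` and `extract_features` are illustrative names rather than the prototype's code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

CATEGORIES = ["reading", "configuration", "interrupt", "calibration", "event", "info"]

# Hypothetical register map (address -> category), for illustration only.
REGISTER_CATEGORY: Dict[int, str] = {
    0x01: "reading", 0x02: "reading", 0x13: "configuration",
    0x19: "interrupt", 0x2A: "calibration", 0x7F: "info",
}
READ_ONLY = {0x01, 0x02, 0x7F}  # hypothetical set of read-only registers


@dataclass
class BusRecord:
    timestamp: float
    is_write: bool
    register: int
    value: int


def extract_features(window: List[BusRecord]) -> List[float]:
    """Build one feature vector from the records of one observation period I."""
    n = len(window)
    size = 3 + len(CATEGORIES) + len(REGISTER_CATEGORY)
    if n == 0:
        return [0.0] * size
    writes = sum(r.is_write for r in window)
    read_only_hits = sum(r.register in READ_ONLY for r in window)
    cat_counts = Counter(REGISTER_CATEGORY.get(r.register) for r in window)
    reg_counts = Counter(r.register for r in window)
    # Operation type: % reads, % writes, count of read-only register accesses.
    features = [100.0 * (n - writes) / n, 100.0 * writes / n, float(read_only_hits)]
    # Category type: % of accesses per register category.
    features += [100.0 * cat_counts[c] / n for c in CATEGORIES]
    # Register frequencies: relative usage of each known register.
    features += [reg_counts[reg] / n for reg in sorted(REGISTER_CATEGORY)]
    return features
```

With a fixed register map the vector length is constant, which is what a distance-based detector needs.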

In the physical domain, commonly used indices for customer characterisation, based on load profiles, can be utilised as features. These include daily indices, such as the widely used ones proposed in Ernoult et al. [61]: the non-uniformity coefficient $\alpha = P_{min}/P_{max}$, the fill-up coefficient $\beta = P_{avg}/P_{max}$, the modulation coefficient at peak hours $MC_{ph} = P_{avg,ph}/P_{avg}$ and the modulation coefficient at off-peak hours $MC_{oph} = P_{avg,oph}/P_{avg}$, where $P_{min}$ is the minimum power demand reported during the day, $P_{max}$ is the maximum power demand, $P_{avg}$ is the average power demand, $P_{avg,ph}$ is the average power demand during the peak hours and $P_{avg,oph}$ is the average power demand during the off-peak hours. More refined indices that take into account weekly patterns (working days and weekends) can be added, such as those described in Chicco et al. [62].
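As a worked illustration of the daily indices, the sketch below computes them from one day of hourly average power values. The function name `daily_indices`, the hourly resolution and the way peak hours are passed in are assumptions made for the example; the source does not prescribe them.

```python
from typing import Dict, Sequence


def daily_indices(load: Sequence[float], peak_hours: Sequence[int]) -> Dict[str, float]:
    """Compute the daily load-profile indices for one day.

    load: average power demand per hour (index = hour of day).
    peak_hours: hour indices treated as peak; all others count as off-peak.
    Assumes a positive maximum demand and at least one hour in each group.
    """
    peak = set(peak_hours)
    p_min, p_max = min(load), max(load)
    p_avg = sum(load) / len(load)
    ph = [p for h, p in enumerate(load) if h in peak]
    oph = [p for h, p in enumerate(load) if h not in peak]
    p_avg_ph = sum(ph) / len(ph)
    p_avg_oph = sum(oph) / len(oph)
    return {
        "alpha": p_min / p_max,       # non-uniformity coefficient
        "beta": p_avg / p_max,        # fill-up coefficient
        "mc_ph": p_avg_ph / p_avg,    # modulation coefficient, peak hours
        "mc_oph": p_avg_oph / p_avg,  # modulation coefficient, off-peak hours
    }
```

For a profile of 1 kW during twelve off-peak hours and 3 kW during twelve peak hours, this gives alpha = 1/3, beta = 2/3, MC_ph = 1.5 and MC_oph = 0.5.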

4.2.3 Cyber-layer Anomaly Detection Algorithm

The sensor-processor communication is based on a series of messages exchanged through the bus. In this context, the set of possible combinations is not very large, due to the fact that the set of accessible registers is bounded. The behaviour in terms of the command sequences, captured by the features selected, can be considered as data points that fall into certain regions of the multidimensional feature space. In order to identify the good behaviour, an algorithm that is able to identify these regions and consider them as the normality space is needed.

Therefore, we have adopted and embedded an instance of ADWICE, since its computationally efficient smart indexing strategy makes it suitable for use in a resource-constrained system such as an embedded system for smart metering. Section 4.3 presents the evaluation of this algorithm.
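The idea of treating regions of the feature space as normality can be sketched with a deliberately simplified clustering scheme. The code below is not ADWICE and does not reproduce its indexing strategy: it is a plain leader-style clustering with a linear scan over centroids, written only to show how a maximum number of clusters and a distance threshold interact. All names are hypothetical, and Euclidean distance is an assumption.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(vectors: List[Vector], max_clusters: int, threshold: float) -> List[List[float]]:
    """Group normal training vectors into at most max_clusters centroids."""
    sums: List[List[float]] = []  # per-cluster sum of member vectors
    counts: List[int] = []        # per-cluster number of members
    for v in vectors:
        if sums:
            cents = [[s / n for s in cs] for cs, n in zip(sums, counts)]
            i = min(range(len(cents)), key=lambda k: distance(v, cents[k]))
            # Merge into the nearest cluster if it is close enough,
            # or if no further clusters may be created.
            if distance(v, cents[i]) <= threshold or len(sums) >= max_clusters:
                sums[i] = [s + x for s, x in zip(sums[i], v)]
                counts[i] += 1
                continue
        sums.append(list(v))
        counts.append(1)
    return [[s / n for s in cs] for cs, n in zip(sums, counts)]


def is_anomalous(v: Vector, centroids: List[List[float]], threshold: float) -> bool:
    """Flag a vector that is farther than threshold from every centroid."""
    return min(distance(v, c) for c in centroids) > threshold
```

A real embedded detector would replace the linear scan with an index over the clusters, which is the property that motivates the choice of ADWICE here.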

4.2.4 Physical-layer Anomaly Detection Algorithm

The features available for modelling the physical domain suggest that when the load profile changes due to a possible attack, the statistics would be affected over a long period. A lightweight change detection algorithm can therefore be embedded into the smart meter. A statistical anomaly detector using the indicators described in Section 4.2.2 has been developed. However, due to the absence of long-term data, and hence of the ability to train and test the anomaly detector on electricity consumption profiles, we have focused development and tests on the cyber-level attacks.

4.2.5 Alert Aggregator

The last component of the architecture is in charge of collecting the alerts generated by the anomaly detector modules, and performing aggregation in order to reduce the number of alarms sent to the central operator. The alert aggregation module can gather additional information in order to provide statistics that show the evidence of an attack or of anomalous conditions. This creates an indication of smart meter health, although individual analysis would still require a lot of effort and privacy concerns would hinder its "careless" deployment. However, this can be useful when investigating areas in which non-technical losses (i.e. losses that are not caused by transmission and distribution operations) are detected, supporting for example localisation strategies as in [53]. Future work includes further investigation and privacy-aware development of this module.


4.3 Evaluation

In this section we present the evaluation of the anomaly detection on a number of attacks performed in the cyber domain. We start by presenting the methodology used to collect the data for evaluation. Then we introduce the cyber attacks we performed, and finally we show the outcomes of the clustering-based anomaly detection algorithm.

4.3.1 Data collection

In order to obtain data for training and testing the anomaly detection algorithm, the trusted meter prototype has been installed in a household and real energy consumption measurements on one electrical appliance have been collected during a period of two weeks in January 2012. Although this time frame is not long enough to capture normality for the physical domain, it is representative enough for the cyber domain, where a 187 MB data log file has been collected. The log, produced by the data logger module as explained in Section 4.2.1, is composed of bus communication transactions that involve several registers for energy reading, sensor configuration, calibration commands and sensor events. The energy reading operations are performed with a period of one second, and they are predominant in the dataset. The data preprocessor, as presented in Section 4.2.2, gathers the transactions during a period of observation I which has been set to 10 seconds, and produces a feature vector that is processed by the anomaly detection algorithm.
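Gathering transactions per observation period can be sketched as below. This is an illustrative helper rather than the prototype's preprocessor: the tuple layout of a record and the choice of consecutive, non-overlapping periods are assumptions, and `windows` is a hypothetical name.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed record layout: (timestamp in seconds, command type, register, value)
Record = Tuple[float, str, int, int]


def windows(records: Iterable[Record], period: float = 10.0) -> Iterator[List[Record]]:
    """Yield consecutive, non-overlapping groups of records, one per period.

    Assumes records are sorted by timestamp; periods without any
    record are skipped rather than yielded as empty lists.
    """
    current: List[Record] = []
    start = None
    for rec in records:
        ts = rec[0]
        if start is None:
            start = ts
        while ts >= start + period:
            if current:
                yield current
                current = []
            start += period
        current.append(rec)
    if current:
        yield current
```

With one reading cycle per second and I set to 10 seconds, each full window of an attack-free trace would hold about ten read transactions plus any occasional configuration or calibration commands.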

4.3.2 Cyber attacks and data partitioning

Four types of attack have been implemented:

1. Data manipulation attack: In this scenario, the attacker performs a man-in-the-middle attack in which the values of the registers involved in the energy measurement are lowered. This can be easily done by overwriting the signal on the bus on every reading cycle.

2. Recalibration attack: This command is injected on the bus in order to change the value of some registers that hold calibration parameters, causing the sensor to perform erroneous measurement adjustments during its operation.

3. Reset attack: This command causes the content of the energy accumulation registers to be wiped out. It has to be executed within every reading cycle in order to reduce the reported energy consumption. In our scenario, we executed it with a period of one second, interleaving it with the period of the measurement process, which has the same length.

4. Sleep mode attack: This command puts the sensor into sleep mode, i.e. no measurements are taken. While in sleep mode, the sensor SPI interface still replies to the commands executed by the processor module, but the energy consumption is not accumulated by the sensor.



In our evaluation, we tested attacks 2 to 4, since attack 1 does not produce new messages on the bus and would only be detectable by the physical-layer anomaly detector. The attacks were first implemented as malware at application level, acting through the SPI interface drivers of the processor module, in order to see their impact on the messaging through the bus. Since the data was collected in an attack-free scenario, a script has been implemented to weave and reproduce the attack information into the clean data. Our traces consist of two weeks of logs in which 2/3 of the data has been left clean to represent normal conditions, and the remaining 1/3 is affected by one attack at a time; thus 3 different testing traces were generated. A physical hardware device that can implement such attacks on the bus (SecFat) is under development in the SecFutur project.

The anomaly detection algorithm is therefore trained with the feature vectors obtained from the first third of the data, while testing uses the 3 traces, in each of which half of the trace contains the remaining third of normal data and the other half contains normal data interleaved with the attack. When the data preprocessor computed the feature vectors, we manually set an oracle bit to indicate whether the features are affected by an attack or not. This is helpful for comparison when evaluating the outcomes of the anomaly detection algorithm.

4.3.3 Results

During our evaluation, we have tuned the two classical parameters of the clustering-based anomaly detection algorithm which need to be configured manually in order to create a good normality model and optimise the search efficiency. As mentioned in Section 2.1, these are the maximum number of clusters (M), and a cluster centroid distance threshold (E) that is used when determining whether a new data point falls within its closest cluster or not. The optimal number of clusters typically depends on the distribution of the input data in the multidimensional space. The threshold is also important, since during real-time monitoring it determines whether a new feature vector belongs to any pre-existing cluster or not. Therefore, in order to select a suitable combination of the two parameters, the outcomes of the detection were explored with M ranging from 10 to 100, and E ranging from 1 to 2.5. The metrics used for evaluating the detection algorithm were the detection rate (DR) and the false positive rate (FPR).
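The two metrics can be computed from the oracle bit and the detector output as in the sketch below. It assumes the usual definitions (DR as the fraction of attack-affected vectors that are flagged, FPR as the fraction of attack-free vectors that are flagged); the function name `detection_metrics` is hypothetical.

```python
from typing import Iterable, Tuple


def detection_metrics(results: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (DR, FPR) from pairs of (oracle_is_attack, flagged_as_anomalous)."""
    detected = false_alarms = attacks = normals = 0
    for is_attack, flagged in results:
        if is_attack:
            attacks += 1
            detected += int(flagged)
        else:
            normals += 1
            false_alarms += int(flagged)
    dr = detected / attacks if attacks else 0.0
    fpr = false_alarms / normals if normals else 0.0
    return dr, fpr
```

A detector that flags every observation yields DR = 1.0 and FPR = 1.0 under these definitions, which is why both figures have to be read together.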

Our results show that the algorithm does not build a correct partitioning of the normality data when M is set to 10 or 20, for all the tested values of E. In the detection phase, all the observations (with or without attacks) are classified as anomalous, leading to 100% DR but also 100% FPR. The explanation for that result is that with so few clusters the multidimensional space is exactly partitioned in order to index the clusters in fine-grained detail. Overfitting the points during learning normally leads to a large classification error at detection time; in fact, all the observations are marked as attacks. Better partitioning of the normality data is found when M is set to at least 30. In this case, the algorithm uses 18 clusters to model the data, and for every configuration of E in the range between 1 and 2 we get 100% DR with no false positives for all three types of attacks. In the cases when E is over 2 we allow a very large threshold and the detection rate is reduced to zero for the recalibration attack, since it is the attack type that is most similar to normal conditions, where recalibration takes place in the training period. The results are similar when increasing M up to 100. This means that the algorithm finds 18 clusters to be the best number for modelling the normality data.

4.4 Closing remarks

This chapter proposed the application of anomaly detection in the cyber and physical domains of a smart grid infrastructure, at the smart metering end point. Evaluation of the proposed architecture with examples of cyber attacks has indicated that anomaly detection can be efficient as a second line of defence even in systems where security is a well developed attribute.

In the evaluations presented in this and the previous chapter we have shown that the selection of the parameter configuration of the anomaly detector (the threshold E and the number of clusters M in the case of ADWICE) has an impact on the trade-off between detection rate and false positive rate. A suitable configuration should be selected in a pre-study phase prior to the deployment of the system. However, there has been increasing attention to adaptation strategies in order to obtain an autonomic configuration of the parameters [5].

Chapter 5 focuses on the problem of online adaptation of the parameters of an intrusion detection framework in order to adjust this trade-off at runtime, applied to a domain where adaptation is a challenging but desired functionality.


Chapter 5

Online Parameter Adaptation of Intrusion Detection

The first challenge that occurs when applying anomaly detection is the definition of the normal operating region of the observed system. Modelling normality is often complex and relevant features must be selected to capture the normal behaviour of the system. Another challenge is the definition of the boundaries between normal and anomalous behaviour. In the previous chapters we have focused our studies on anomaly detection in two domains of cyber-physical systems, showing that with careful feature selection and parameter tuning for a given anomaly detector we could achieve acceptable trade-offs between detection performance and false alarm rates.

A third challenge of anomaly detection that typically hinders its applicability in real settings is the autonomic adaptation when applied to domains where normality might naturally change over time (this is often referred to as concept drift). In the machine learning literature there are efficient techniques able to detect and classify changes due to concept drift [4], allowing online adaptation of the normality model [63] or detection parameters [64]. Unfortunately, their general applicability to real-time anomaly detection is limited. The unconstrained distinction of whether a change is due to benign drifts of the normality or due to an attack is still an open issue when learning in adversarial environments [65]. Most of the existing solutions that cope with adaptation applied to anomaly detection either assume gradual drifts [66] [63] or rely on human intervention, which is the case for ADWICE [15].

In this chapter we focus our attention on an intrusion detection and response architecture devised to protect a mobile cyber-physical system: a disaster area communication network, as presented earlier in Chapter 2. In this scenario, an ad hoc mode of communication is employed whereby each mobile device opportunistically connects to neighbours in its vicinity in order to establish an ad hoc chain of dissemination and collaboration for rescue operations management. As shown in previous work [56], even in disaster scenarios there is a need for intrusion detection to provide network survivability in case of attacks. In this domain, however, changes can occur at any time. Different load conditions, a changing number of nodes and mobility patterns could radically modify the "state" of the network compared to what could have been considered as normality in the past. Adaptation is therefore a required property but extremely challenging in this context.

When implementing a distributed intrusion detection scheme for ad hoc communication, another important factor must also be considered: handheld devices are power-hungry. The adoption of security solutions in this domain is in fact hampered by the power drain on the handsets. There is an obvious trade-off one should consider with an impact on survivability: we would get more protection if we had endless energy.

In this chapter we show that we can adapt the sensitivity of security mechanisms, tuning them based on energy estimates. We demonstrate this idea of energy-based adaptation using RWG (described in Chapter 2), an ad hoc protocol that was created for persistent dissemination in a disaster scenario, even in hostile environments, e.g. under partitioned networks and energy deficiency. This dissemination protocol and the associated general survivability framework, in which local anomaly detection, diagnosis, and mitigation are part of the application needs, have been developed in a larger project on Hastily Formed Networks [67] and published elsewhere [68, 69, 70].

Since large scale ad hoc networking experiments are difficult to realise, this chapter also focuses on how an environment for large scale studies, such as the ns-3 simulation platform, can be extended in order to model energy-based adaptations of protocol and service layer modules. This will enable studies of network lifetime in the presence of various threats and different mobility patterns.

The chapter is organised as follows. Section 5.1 describes the General Survivability Framework, the fully-distributed intrusion detection and response architecture designed to protect the disaster area communication network; Section 5.2 presents the proposed adaptation module; Section 5.3 describes the proposed model for accounting for CPU energy consumption in network simulation; and finally simulation results are presented in Section 5.4.

5.1 Scenario: challenged networks

The General Survivability Framework (GSF) [69] is a modular architecture designed to provide a coherent security approach for mobile ad hoc communication in challenging environments, such as disaster area networks. In such environments, with the hypothesis that spontaneous ad hoc networks need to be created on the fly, pre-existing trust relationships among nodes cannot be assumed, thus limiting the use of encryption-based or collaborative protection mechanisms. The GSF, instead, is a fully-distributed architecture which is installed on each node. The framework is composed of four functional modules that cooperate in order to detect, diagnose, and react to network attacks (see Figure 5.1).

Here we describe an instance of the framework in which we have embedded our adaptation module. The first module, the anomaly detector, is responsible for monitoring the network traffic from the vicinity in order to detect anomalous conditions. The features gathered by the module are based on variables that are believed to characterise the statistical behaviour of the routing protocol, and are represented as a numerical vector. The anomaly detection algorithm is based on a distance function that compares the current observation vector with an average vector that represents what is learnt during a training phase prior to deployment in an attack state. If the distance is greater than a threshold (which is also obtained during the training phase) the algorithm raises an alarm.

When an alarm is raised, the diagnosis component is triggered to classify the attack type among the known cases, comparing the distance of the observation vector with exemplar vectors that are known to represent attack situations.
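The detection and diagnosis steps just described can be sketched as follows. The source does not specify the distance function or how a failed match is decided, so Euclidean distance and the optional `max_distance` limit are assumptions, and all names (`detect`, `diagnose`, the attack labels) are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance (assumed; the framework's metric may differ)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect(observation: Vector, normal_mean: Vector, threshold: float) -> bool:
    """Raise an alarm when the observation is too far from the learnt average."""
    return distance(observation, normal_mean) > threshold


def diagnose(observation: Vector, exemplars: Dict[str, Vector],
             max_distance: Optional[float] = None) -> Optional[str]:
    """Return the label of the closest attack exemplar.

    If max_distance is given and even the closest exemplar is farther
    away, None is returned to signal that no known case matches.
    """
    label = min(exemplars, key=lambda name: distance(observation, exemplars[name]))
    if max_distance is not None and distance(observation, exemplars[label]) > max_distance:
        return None
    return label
```

In use, `diagnose` would only be called for observations on which `detect` has raised an alarm, mirroring the order of the modules in the framework.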

Once the attack is classified, a relevant mitigation component is engaged to apply the appropriate countermeasures to reduce or eliminate its impact. If the attack cannot be matched to any of the known cases, a generic mitigation strategy that may help to protect the communication in hostile environments is employed.

The adaptation module, which is the focus of this chapter, is responsible for changing the parameters of the other components in order to cope with the changes in the state of the network and the node. Figure 5.1 shows the interconnections of the four modules, which are similar to a closed loop control system, with the difference that the state of a given node is highly dependent on the global state of the network that is outside our control, i.e. the collective behaviour of other nodes. The framework has been tested on top of Random-Walk Gossip (RWG) [68], a manycast partition-tolerant protocol designed to energy-efficiently disseminate messages in disaster area networks, introduced earlier in Chapter 2 of this thesis.

Figure 5.1: The General Survivability Framework control loop

5.2 Adaptive detection

The adaptation process normally involves monitoring the system under control, detecting changes, deciding and reacting to adjust the system parameters in order to bring it to the desired state, which often optimises towards some target performance. The survivability framework presented earlier differs from this concept due to the fact that the state of the network is not fully observable and controllable from the point of view of an individual node, which has a partial view restricted to its vicinity. The emerging global response of the network determines a node's reaction to the attack. In this context the adaptation process that we aim for is a self-adaptation [71] approach, meaning that the control system itself (i.e. the GSF) should be adjusted with the changing conditions of both the node and the network.

Our approach to adaptation consists of adapting the trade-off between the sensitivity of the intrusion detector and its energy consumption. In essence, it falls in the class of adaptation algorithms that trade off security provisioning against resource consumption [72] [73]. The solution proposed in this chapter takes into account the perceived state of the network, focusing mainly on the most valuable feature of the internal state of the node, the energy. The assumption is that while supporting resistance to the attacks, the nodes should also be aware of their energy budget, adapting their behaviour in an efficient way in order to extend their lifetime as much as possible. With the support of energy modelling and simulation, we show that aggressive energy-agnostic attack survivability strategies, with excellent detection performance, could become useless once their impact on the energy consumption reduces the lifetime of the network.

A similar approach based on a mathematical model of the system has been proposed by Mitchell et al. [74]. In their case, however, the intrusion detection scheme is collaborative, where the final decision is based on a voting strategy, and their results have only been proven theoretically without the support of simulation.

Figure 5.2: Energy-aware adaptation of IDS parameters

5.2.1 Adaptation component

The adaptation component proposed (see Figure 5.2) consists of a function that takes as input the current energy level of the node, the perceived attack situation and the current parameter set configuration. The output is a new set of parameters that is both relevant for the detection accuracy and has a strong impact on the energy consumption.

The function is based on a decision table. Each action is a parameter set configuration selected according to the perceived attack situation, modelled as conditions, and the different energy states, modelled as rules. Rules based on energy levels could make the system reactive to energy changes, tilting the trade-off towards the energy aspect when this resource is limited. Prediction of the energy depletion based on the actual traffic load and CPU consumption could be the basis of a more proactive adaptation of the IDS parameters. All the combinations can be selected during a pre-design study on the trade-off between resource usage and security provisioning.

5.2.2 Energy-based adaptation in the ad hoc scenario

As a case study, we consider the GSF presented earlier. In this framework, there is a relevant parameter that governs the detection, diagnosis and mitigation cycle and has a strong impact on the CPU utilisation. It is the aggregation interval Ia in which the network state observations are aggregated and the alarm state is evaluated. The shorter the interval the faster the response to the attacks, but at the same time the detection accuracy is lower and the power consumption is higher. The longer the interval, the better the detection accuracy, but the detection latency would increase as well. In this case the power consumption of the detector would be lower, but a high latency could allow the attack to spread throughout the network causing subsequent negative consequences.

In the work presented in [69], a trade-off between detection accuracy and minimum latency has been obtained with Ia at 50 seconds. A longer interval, providing a better detection accuracy at the cost of a higher latency, could still be acceptable if its overall impact on the energy consumption is better than in the case in which the latency is shorter.

The decision table implemented in this scenario takes into account two conditions: attack or non-attack. A reactive solution has been implemented considering the following four battery level ranges: 100% to 60%, 60% to 40%, 40% to 20%, and 20% to 0% (this makes sense with the linear battery discharge model, otherwise it should be replaced with a lifetime prediction based on the current load and the specific discharge model). Since, as mentioned earlier, the global response to the attack is given by the collective response of the nodes in the network, the choice of the aggregation interval is based on the idea that nodes with more energy should compensate for the higher latency of nodes with a limited battery level that try to extend their lifetime by slowing down their detection engine. Nodes that have a battery level between 100% and 60% set, in normal conditions, the shortest interval Ia (namely IDS_FAST), to react more quickly in case of attacks. During attacks, instead, the interval is set to a longer value (IDS_MEDIUM) in order to improve the detection accuracy and save energy at the same time. With a lower battery level, between 60% and 40%, the interval during attack is further increased (IDS_SLOW), to save more energy. When the level is between 40% and 20%, the interval during attack-free periods is set to IDS_MEDIUM, since at that point the nodes should start preserving battery more consciously. Below 20%, the node uses the longest interval in all cases, to preserve its energy. Table 5.1 summarises the interval adaptations.

Energy fraction (%)   Non-attack    Attack
100-60                IDS_FAST      IDS_MEDIUM
60-40                 IDS_FAST      IDS_SLOW
40-20                 IDS_MEDIUM    IDS_SLOW
20-0                  IDS_SLOW      IDS_SLOW

Table 5.1: Aggregation interval based on energy levels
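As an illustration, the decision table of Table 5.1 could be sketched in Python as follows. This is a minimal sketch, not the implementation used in the simulations: the function name `select_ids_state` is hypothetical, and since adjacent ranges in the table share their boundary values, the sketch assumes that a boundary value (for example exactly 60%) belongs to the higher range.

```python
IDS_FAST, IDS_MEDIUM, IDS_SLOW = "IDS_FAST", "IDS_MEDIUM", "IDS_SLOW"

# Each rule: (lower bound of the energy range in %, state without attack, state under attack).
# Rules are ordered from the highest range to the lowest, following Table 5.1.
DECISION_TABLE = [
    (60.0, IDS_FAST, IDS_MEDIUM),
    (40.0, IDS_FAST, IDS_SLOW),
    (20.0, IDS_MEDIUM, IDS_SLOW),
    (0.0, IDS_SLOW, IDS_SLOW),
]


def select_ids_state(energy_pct: float, under_attack: bool) -> str:
    """Return the IDS state for the given battery level (%) and attack condition."""
    if not 0.0 <= energy_pct <= 100.0:
        raise ValueError("energy_pct must be between 0 and 100")
    for lower_bound, normal_state, attack_state in DECISION_TABLE:
        # Assumption: a boundary value belongs to the higher energy range.
        if energy_pct >= lower_bound:
            return attack_state if under_attack else normal_state
    raise AssertionError("unreachable: the last rule covers 0%")
```

Keeping the rules in a data structure rather than in nested conditionals makes it straightforward to replace the table with the outcome of a different pre-design study.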

5.3 Energy modelling

Energy-aware applications require access to the energy level of the system in which they are running. Network simulators typically focus on the energy consumption of the network interface, ignoring the contribution of other power-hungry components such as the CPU. When evaluating the energy efficiency of network protocols in simulators, it is common practice to estimate their energy consumption from the amount of network traffic they create. The wireless card is assumed to be the most power-consuming component of the system used by the network protocols, hence other elements are not considered. More realistic simulations should also include, for example, the contribution of the processor.

In ns-3, a widely used network simulator, an energy modelling framework is available [75]. In this framework, the device energy consumption and the energy source are modelled as two separate elements. Energy sources are modelled by mathematical equations that describe their discharge curves. Device energy models specify the current draw in each of the states in which the device can operate. Each device drains a certain amount of current from the source, and the total energy consumption is computed from the sum of all the individual current draw contributions. Although there are many energy source models, characterised by different discharge curves, there is currently only one device energy model available, the WiFi energy consumption model.
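The separation between an energy source and the devices that draw current from it can be illustrated with a small Python sketch. It is not the ns-3 API: the class `LinearEnergySource` and its method `drain` are hypothetical names, and the source is assumed to be ideal, with a constant voltage and a linear discharge.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class LinearEnergySource:
    """Ideal battery (assumption): constant voltage and linear discharge."""

    initial_energy_j: float
    voltage_v: float
    remaining_j: float = field(init=False)

    def __post_init__(self) -> None:
        self.remaining_j = self.initial_energy_j

    def drain(self, currents_a: Iterable[float], dt_s: float) -> float:
        """Subtract the energy drawn by all devices during dt_s seconds.

        The total current is the sum of the individual device current draws,
        and the consumed energy is voltage * current * time.
        """
        total_current_a = sum(currents_a)
        consumed_j = self.voltage_v * total_current_a * dt_s
        self.remaining_j = max(0.0, self.remaining_j - consumed_j)
        return self.remaining_j
```

For instance, two devices drawing 0.034 A and 0.040 A from a 3.7 V source for 10 seconds consume 3.7 × 0.074 × 10 = 2.738 J.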

A similar energy model targeting wireless sensor networks (thus not really suitable for our scenario) has earlier been proposed by Chen et al. [76] for OMNeT++ [77]. In addition, a simple CPU model that accounts for the energy consumption in the active or inactive state is included. For the same simulation platform, a generic energy model for wireless networks has been proposed by Feeney et al. [78]. It improves the battery depletion handling compared to the existing model, but CPU energy consumption is not modelled. The common limitation of network simulators is that they normally assume unlimited computational power and ignore process execution time. This hinders CPU modelling and more detailed energy accounting.

Simulation of the power consumption of programs running on real systems can be done with instruction set simulators, such as ARMulator [79] or SimpleScalar [80], among others. Unfortunately, they are not suitable for simulating complex programs, nor can they simulate networking scenarios.

Combined network simulation and CPU instruction set simulation has been proposed in SunFlower [81], Real-Time Network Simulator (RTNS) [82] and SliceTime [83]. However, the first lacks node mobility support, while the latter two do not simulate energy consumption. In all cases the CPU instruction set simulator and the network simulator need to be synchronised, and the simulation is rather slow. The CPU energy model for network simulation proposed in the next section could represent a way to include CPU energy consumption, thus enabling adaptation studies based on energy.

5.3.1 A CPU model for network simulation

Ideally, a CPU utilisation indicator could be directly translated into current draw values. Unfortunately, this information is not available in network simulators due to the limitation discussed earlier. In order to account for the CPU energy consumption based on the impact of network protocols and/or additional applications, a simple CPU energy consumption model is introduced. Our model is based on the assignment of an energy footprint to each simulated application. The model consists of a state machine for each application that runs in the simulation environment (Figure 5.3). Each state represents a possible operational mode of the application, which differs from the others in terms of behaviour and consequent CPU usage. By running the application in each of its states on a real device, its isolated impact on the power consumption can be profiled. Functions that produce power consumption values, depending on those inputs of the application that have an impact on the energy consumption, can then be associated with the corresponding states. When the power consumption cannot be directly measured, one could for example analyse the CPU utilisation increase caused by the application, which may be translated into power consumption values for simulation purposes. In a work published later, for instance, we profile the energy consumption of the GSF and RWG in terms of CPU increase on modern Android-based handsets [84]. In the simulation, the global CPU energy consumption is then given by the combination of all the individual power contributions of the modelled applications at the current inputs.

Figure 5.3: State machine with associated power consumption

To illustrate the approach, we show how this model can be applied to account for the CPU energy consumption of our case study.

5.3.2 Application of the model

In our environment, the applications whose energy consumption needs to be accounted for are the RWG protocol and the GSF. In order to assign the current draw at each state, we isolate the power consumption of real devices when running in similar conditions. For the RWG protocol, we can characterise the following states: RWG, RWG_mit_gray and RWG_mit_drain. The first state represents the normal working conditions of the RWG protocol. The individual energy footprint of the protocol running at different transmission rates can be measured as in [70]. The other states represent the protocol behaviour when attack mitigation strategies (against the grey hole and drain attacks) change the way the protocol operates. In the current implementation, RWG can be operated in grey hole or drain attack mitigation mode. A power consumption function can be assigned to those states with the same logic as before.

A separate finite state machine is created for the GSF application, which consists of the intrusion detection, diagnosis and mitigation selection components. Three states, which capture the frequency at which the analysis cycle is performed, are defined: IDS_FAST, IDS_MEDIUM and IDS_SLOW, for the short, medium and long interval Ia used to tune the GSF operation frequency. Again, power consumption functions for each of these states can be extracted from real devices emulating this application under similar workloads. Finally, the total CPU energy consumption is given by the sum of the power consumption contributions of the RWG application and the GSF application, as depicted in Figure 5.4.

Figure 5.4: CPU power consumption of the RWG and IDS framework over some time interval

Given this CPU energy consumption model, the impact of RWG and the GSF operations can be made more visible, while energy-awareness and self-adaptivity can be implemented and simulated.
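The per-application state machines and their summation can be sketched as follows. This is a minimal Python illustration of the idea rather than the simulator code: the class `AppEnergyModel`, the helper `total_cpu_power_w` and all power functions in the usage lines are hypothetical, and real power functions would come from profiling the applications on a device.

```python
from typing import Callable, Dict, Iterable

PowerFunction = Callable[[float], float]  # input (e.g. message rate) -> power in watts


class AppEnergyModel:
    """One state machine per application; each state has its own power function."""

    def __init__(self, name: str, power_functions: Dict[str, PowerFunction],
                 initial_state: str) -> None:
        if initial_state not in power_functions:
            raise ValueError(f"unknown state: {initial_state}")
        self.name = name
        self.power_functions = power_functions
        self.state = initial_state

    def switch(self, new_state: str) -> None:
        """Move the application to another operational state."""
        if new_state not in self.power_functions:
            raise ValueError(f"unknown state: {new_state}")
        self.state = new_state

    def power_w(self, load: float) -> float:
        """Power drawn in the current state for the given input."""
        return self.power_functions[self.state](load)


def total_cpu_power_w(models: Iterable[AppEnergyModel], load: float) -> float:
    """Total CPU power is the sum of the individual application contributions."""
    return sum(model.power_w(load) for model in models)


# Usage with placeholder (not measured) power functions.
rwg = AppEnergyModel(
    "RWG",
    {"RWG": lambda rate: 0.01 * rate, "RWG_mit_gray": lambda rate: 0.005 * rate},
    initial_state="RWG",
)
gsf = AppEnergyModel(
    "GSF",
    {"IDS_FAST": lambda rate: 0.15, "IDS_SLOW": lambda rate: 0.04},
    initial_state="IDS_FAST",
)
normal_power = total_cpu_power_w([rwg, gsf], load=10.0)
rwg.switch("RWG_mit_gray")
gsf.switch("IDS_SLOW")
mitigation_power = total_cpu_power_w([rwg, gsf], load=10.0)
```

Switching states changes only the function that is evaluated, so the combined curve over time has the stepwise shape suggested by Figure 5.4.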

5.4 Evaluation

The goal of this section is to show that local adaptation based on per-node estimates of the available energy leads to better overall performance in the network and extends its lifetime.

5.4.1 Implementation and simulation setup

As a baseline, the same scenario presented in [69] has been considered. The network consists of 25 nodes moving in a disaster area network [85]. The load of the network is 15 messages per second sent from randomly chosen nodes. The messages are disseminated in manycast to at least k=10 nodes and have an expiration time of 400 seconds.

[Figure 5.4 plot: power against time, labelled with the combined states RWG+IDS_SLOW, RWG+IDS_FAST, RWG_MIT_GRAY+IDS_FAST and RWG_MIT_GRAY+IDS_SLOW]


The ns-3 energy framework has been included and extended with the proposed CPU energy model from Section 5.3 to create a model for energy awareness. For simplicity, we have used the energy source model characterised by an ideal linear discharge curve. To consider the energy impact of the wireless device, the WiFi energy model has been employed in the simulation. The current draw values assigned to the different operational states of the wireless interface are the same as in the work by Wu et al. [75].

In order to assign the power consumption values to the states that characterise the RWG application model, the results from Vergara et al. [70] have been considered. To be compliant with the ns-3 energy model, the CPU energy model should specify current draw values in amperes instead of power in watts. As the ns-3 source model assumes a constant battery voltage, the conversion between watts and amperes is immediate. Assuming, for example, that the transmission rate is 15 messages per second, according to the load injected in the network, the power consumed by the RWG application in the normal operation mode at this rate is 0.025 W, as depicted in Figure 7 in Vergara et al. [70]. Furthermore, we consider an additional contribution of 0.1 W as the power consumption due to message deletion at the same rate. The power consumption at a rate of 15 messages per second is then 0.125 W. Considering a constant battery voltage of 3.7 V, the current draw that we associate with this state is 0.125 W / 3.7 V ≈ 0.034 A. RWG in mitigation mode usually performs fewer operations, since the information contained in some signalling packets is discarded or not processed. A lower footprint has therefore been estimated and assigned to both of the considered mitigation states. Table 5.2 summarises the current draw assigned to RWG.

RWG application state      Current draw (A)
RWG                        0.034
RWG Drain Mitigation       0.018
RWG Greyhole Mitigation    0.018

Table 5.2: Current draw of the RWG application
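The conversion from power to current draw used for the normal RWG state can be reproduced with a few lines of Python. The helper name `current_draw_a` is hypothetical; the numeric values are those given in the text.

```python
def current_draw_a(power_w: float, voltage_v: float) -> float:
    """Convert power (W) to current (A) for a constant-voltage source: I = P / V."""
    if voltage_v <= 0:
        raise ValueError("voltage must be positive")
    return power_w / voltage_v


# RWG at 15 messages per second: 0.025 W for normal operation plus 0.1 W for message deletion.
rwg_power_w = 0.025 + 0.1
rwg_current_a = current_draw_a(rwg_power_w, voltage_v=3.7)
rwg_current_rounded_a = round(rwg_current_a, 3)  # value reported in Table 5.2
```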

For GSF, as mentioned earlier, the three modelled states correspond to the detection-diagnosis-mitigation analysis being performed at short, medium or long intervals, as specified by the parameter Ia. The aggregation interval Ia chosen for IDS_FAST is 50 seconds, for IDS_MEDIUM 75 seconds and for IDS_SLOW 100 seconds. The constant energy footprint assigned to the IDS states is shown in Table 5.3.

IDS application state    Current draw (A)
IDS_FAST                 0.040
IDS_MEDIUM               0.025
IDS_SLOW                 0.010

Table 5.3: Current draw of the IDS application
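To give an intuition of how the values in Tables 5.1 to 5.3 interact with a linear battery model, the following Python sketch estimates the lifetime of a single node in attack-free conditions with and without adaptation. It is a simplified illustration and not the ns-3 simulation: the wireless interface is replaced by a constant current `WIFI_CURRENT_A`, whose value is purely hypothetical, the function names are illustrative, and a boundary battery level is assumed to belong to the higher range.

```python
VOLTAGE_V = 3.7                 # constant battery voltage used in the text
RWG_CURRENT_A = 0.034           # Table 5.2, normal RWG state
IDS_CURRENT_A = {"IDS_FAST": 0.040, "IDS_MEDIUM": 0.025, "IDS_SLOW": 0.010}  # Table 5.3
WIFI_CURRENT_A = 0.050          # hypothetical constant draw, NOT taken from the thesis


def adaptive_state(energy_pct: float, under_attack: bool) -> str:
    """IDS state according to Table 5.1 (boundaries assigned to the higher range)."""
    if energy_pct >= 60.0:
        return "IDS_MEDIUM" if under_attack else "IDS_FAST"
    if energy_pct >= 40.0:
        return "IDS_SLOW" if under_attack else "IDS_FAST"
    if energy_pct >= 20.0:
        return "IDS_SLOW" if under_attack else "IDS_MEDIUM"
    return "IDS_SLOW"


def lifetime_s(initial_energy_j: float, adaptive: bool, step_s: float = 1.0) -> float:
    """Time until the battery is depleted, without attacks, in steps of step_s seconds."""
    remaining_j = initial_energy_j
    elapsed_s = 0.0
    while remaining_j > 0.0:
        if adaptive:
            state = adaptive_state(100.0 * remaining_j / initial_energy_j, False)
        else:
            state = "IDS_FAST"  # non-adaptive baseline keeps the shortest interval
        current_a = WIFI_CURRENT_A + RWG_CURRENT_A + IDS_CURRENT_A[state]
        remaining_j -= VOLTAGE_V * current_a * step_s  # linear discharge
        elapsed_s += step_s
    return elapsed_s


baseline_lifetime_s = lifetime_s(500.0, adaptive=False)
adaptive_lifetime_s = lifetime_s(500.0, adaptive=True)
```

Because the adaptive node draws less CPU current once its battery falls below 40%, its estimated lifetime is longer than that of the baseline; the size of the gain depends strongly on the assumed wireless draw.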


In order to illustrate the benefits of adaptation, we focus on two of the possible attack types: the drain attack and the grey hole attack. In the first attack type the malicious nodes act in order to drain the battery of the victims, injecting fake signalling packets that cause benign nodes to perform a lot of unnecessary disseminations. In the second type of attack, malicious nodes target the message dissemination by sending fake signalling packets that cause the interruption of the process before the messages have actually been delivered to the intended k nodes. Both attacks are interesting to analyse with regard to the adaptive function, since they are complementary in terms of energy consumption. In the drain attack nodes waste energy as a consequence of the attack, thus good detection accuracy and low latency are necessary to avoid energy waste. On the contrary, the goal of the grey hole attack is to disrupt the expedient dissemination of the messages in the network, thus the nodes should consume less energy due to the reduced amount of network traffic. Longer detection latency in this case could be tolerated from the energy perspective, but adaptation should still guarantee good detection to ensure that the network dissemination goals are fulfilled.

5.4.2 Simulation results

In order to test the complete set of actions performed by the adaptivity function during an entire simulation period (3000 seconds as in [69]), the initial energy level assigned to all nodes is selected to be lower than needed to conclude the simulation over 3000 seconds, which turns out to be 500 J. In this way, one can determine whether the adaptation extends the lifetime of the network compared with the non-adaptive case. In all of the following simulations, five randomly placed malicious nodes start the attack at second 2067, and the attack lasts until the end of the simulation (details same as [56]). Ten runs of the same simulation are performed, and the results are averaged.

Figure 5.5: Energy consumption and number of active nodes in the drain attack. (a) Average energy in the network. (b) Number of active nodes.

The first attack type we simulated is the drain attack. Figure 5.5a shows the comparison of the average available energy in the network in the adaptive case compared to the non-adaptive one when the network is under the drain attack with mitigation enabled. As can be observed from the figure, the adaptive function outperforms the non-adaptive case by extending the lifetime of the network by over 250 seconds. The difference in the energy level is larger in the second half of the time window, since there the nodes start preserving their battery by adopting longer aggregation intervals Ia. The confidence interval of the ten rounds is up to 2% as long as all the nodes have battery capacity, but can reach 50% when the nodes start to drop out due to depletion. Figure 5.5b shows the number of active nodes in the network. A good detection accuracy and the lower energy consumption due to the lower CPU utilisation of the adapted IDS application have an impact on the lifetime of the network, although the linear discharge behaviour in both cases is still caused by the energy consumption of the wireless interface in ad hoc mode (recall that this mode has a strong impact on the consumption due to the constant idle listening [86]). This causes the nodes to have a similar discharge behaviour, which results in a sharp drop in the number of active nodes, as can be observed in Figure 5.5b.
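Averaging a metric over the ten runs and attaching a confidence interval to it can be done as in the Python sketch below. The confidence level is not stated in the text, so a 95% interval based on the Student t distribution is an assumption here, and the function name `mean_and_ci95` is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% quantile of the Student t distribution with 9 degrees of freedom
# (ten runs). The 95% level is an assumption of this sketch.
T_95_DF9 = 2.262


def mean_and_ci95(samples: Sequence[float]) -> Tuple[float, float]:
    """Return the mean of ten samples and the half-width of its 95% confidence interval."""
    if len(samples) != 10:
        raise ValueError("this sketch is written for exactly ten runs")
    mean = statistics.mean(samples)
    # Standard error of the mean from the sample standard deviation.
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_95_DF9 * std_error
```

The half-width can be divided by the mean to express the interval as a percentage, which is the form in which it is quoted above.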

Figure 5.6: Survivability performance in the drain attack. (a) Number of packets sent. (b) k-delivery rate.

The impact of the adaptation on the survivability performance during the same attack is shown in Figure 5.6. Two metrics are used to measure the network performance: the packet transmission rate and the packet k-delivery rate. The first indicates the number of transmitted packets (including data and signalling packets) during the interval of study, which is useful to analyse the impact of the attack on the bandwidth usage. The second metric shows the performance of the network as the number of packets successfully delivered to k nodes over the interval of study. We expect that, if adaptation of the interval is successful, the results of attack detection measured in terms of network performance are no worse than without adaptation. In Figure 5.6a, we can see a peak in the number of transmissions at the beginning of the drain attack (at second 2100). This is caused by some detection latency, which allows some bogus messages to be disseminated in the network. This number is slightly higher in the adaptive case, but afterwards the number of messages sent during the attack is similar to the case in which the interval is not adapted, meaning that the adaptation does not decrease the detection accuracy. In Figure 5.6b, we can see that the delivery ratio is also very similar to the non-adaptive case, indicating that the overall performance is preserved.
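The k-delivery metric can be made concrete with a short Python sketch. The data layout (a mapping from packet identifier to the set of nodes that received it) and the function names are hypothetical, and expressing the metric as a fraction of all packets in the interval is an assumption; the text defines it as the number of packets delivered to k nodes.

```python
from typing import Dict, Hashable, Set


def k_delivery_count(receivers_by_packet: Dict[Hashable, Set[Hashable]], k: int) -> int:
    """Number of packets that reached at least k distinct nodes."""
    return sum(1 for receivers in receivers_by_packet.values() if len(receivers) >= k)


def k_delivery_rate(receivers_by_packet: Dict[Hashable, Set[Hashable]], k: int) -> float:
    """Fraction of packets that reached at least k distinct nodes (0.0 if no packets)."""
    if not receivers_by_packet:
        return 0.0
    return k_delivery_count(receivers_by_packet, k) / len(receivers_by_packet)


# Usage with a toy log and k=2 (the simulations in this chapter use k=10).
deliveries = {"m1": {1, 2, 3}, "m2": {1}, "m3": {4, 5}}
delivered = k_delivery_count(deliveries, k=2)
rate = k_delivery_rate(deliveries, k=2)
```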


Figure 5.7: Energy consumption and number of active nodes in the grey hole attack


Figure 5.8: Survivability performance in the grey hole attack

The results of the adaptation under the grey hole attack are presented in Figures 5.7 and 5.8. As in the case of the drain attack, the average available energy in the network is higher in the adaptive case compared to the non-adaptive one, as shown in Figure 5.7. However, a more careful analysis of the survivability performance should be undertaken, since in this attack scenario a bad detection (i.e. a persistent grey hole attack) could save energy by disrupting the attempted dissemination flows. As shown in Figure 5.8, there is again some latency at the beginning of the attack, during which the malicious nodes cause packets to be dropped and the number of transmissions decreases sharply. After that, however, the number of transmissions is greater in the adaptive case compared to the non-adaptive one, showing that a longer aggregation interval in this case is a benefit and improves the detection accuracy. Figure 5.8b in fact shows that the number of deliveries is higher after second 2200.

For a more realistic scenario, we studied the case in which nodes start with random initial battery levels. This test shows how heterogeneity in adaptation and attack response impacts our major metrics: the available energy and the survivability of the network. In this case, all the nodes are assigned a random initial energy between 100 J and 500 J.


Figure 5.9: Energy consumption and number of active nodes in the drain attack with random initial energy levels


Figure 5.10: Survivability performance in the drain attack with random initial energy levels

Again, Figure 5.9 shows that adaptation provides energy efficiency while Figure 5.10 shows that the global impact on the network survivability is further improved.



5.5 Closing remarks

This chapter described the design and implementation of an on-line energy-based adaptation component for an attack survivability framework in the context of mobile ad hoc communication, where the cyber and physical aspects of the system are extremely intertwined. The results, based on a model for CPU energy consumption in network simulations, have shown that the adaptation component extends the network lifetime by about 14% without significantly degrading the survivability performance. Further improvements can address proactive energy-based approaches, for example foreseeing the depletion based on the current workload and adapting in advance. Although our simple CPU energy consumption model allowed us to study the network lifetime and understand the impact of adaptation, a more detailed CPU model should also be included in network simulators to enable more realistic CPU usage and energy consumption simulation.


Chapter 6

Conclusion

In dealing with security challenges in cyber-physical systems our hypothesis was that anomaly detection could be applicable as a second line of defence in both the cyber and physical domains. This hypothesis was explored by investigations in three application domains.

The first domain of application has been the physical layer in water distribution systems, described in Chapter 3. These systems are monitored by water quality sensors that provide chemical properties of the water, which are processed and used to discover contaminations. Despite the fact that specialised sensors and detectors of specific pollutant substances are available, the water security research community is seeking generalised event detectors that may help to detect any kind of contamination event. Once again, there is a need for both misuse detection to match specific known cases, and anomaly detection to discover any patterns that do not conform to normal behaviour.

The introduction of such a system is challenging since the chemical properties of the water can change over time depending on its source, and such changes can be mistaken for a contamination event. Nevertheless, the use of a learning-based anomaly detection technique, which allows the characterisation of all the variations of the system normality, has proved to be effective. Besides, additional features based on sliding windows and derivatives of the analysed data have been introduced to improve the efficiency of the solution under certain circumstances. The performance of the approach has been analysed using real data from two water stations together with synthetic contaminants superimposed with the EDDIES application provided by EPA. The results, in terms of detection rate and false positive rate, have shown that some contaminants are easier to detect than others. The sensitivity of the anomaly detector has also been studied by creating new testing data sets with different contaminant concentrations. The results have shown that with higher contaminant concentrations the detector obtains higher detection rates with low false positive rates. Domain knowledge is however required to draw conclusions about the performance related to the increasing level of concentrations.

The inaccuracy of the data provided in one of the stations has negatively affected the performance. This is a typical issue that arises when dealing with a physical domain where sensors can be subject to faults. We have discussed the potential of classical dependability measures to improve the detection outcome, but the drawback is the cost of additional hardware. This problem could also be addressed in software by estimating the feature trends in the preprocessing phase, using expectation-maximisation algorithms for instance.

The latency of the detection, a very important factor, has also been analysed, showing reasonable results that are again qualitatively improved as the contaminant concentration is increased.

Next, anomaly detection has been explored as a security mechanism in the cyber layer of a CPS. In Chapter 4 we have focused our attention on smart metering infrastructures, a prominent example of CPS where the cyber and physical layers are closely related to each other. In our case study, although confidentiality, authenticity, accountability, integrity and privacy are provided by the use of trusted computing technology embedded into smart meters, some vulnerabilities persist and real-time monitoring for cyber and physical tampering attacks is still a security solution that must be considered when designing new smart meters. Therefore, we have explored the deployment of a lightweight embedded anomaly detection architecture that takes into account both cyber and physical domains, and implemented and evaluated part of this architecture on a smart meter prototype. The evaluation performed on attacks in pseudo-real settings has shown that our anomaly detection approach is able to efficiently detect several types of cyber attacks without emitting any false positives. The cyber domain in this case has been easier to model for normality, since the set of messages that can be exchanged through the bus is bounded and, primarily, because the application generating the communication events has a regular behaviour. Misuse detection could of course efficiently detect the same set of attacks with less effort, but the advantage of anomaly detection is that no information about attacks needs to be modelled.

The work discussed in this thesis confirms the known research insights that challenge anomaly detection:

1. Difficulty of the definition of the normal behaviour of the monitored system.

2. Difficulty of the definition of the boundaries between normal and anomalous behaviour.

3. Lack of availability of labelled training data.

4. Change of normality.

With regard to challenges 1 and 2, we have seen that feature selection and parameter tuning play an important role in the success of applicability. Although we have shown good detection/alarm rate trade-offs, the definition of the boundaries between normality and attack is still a challenge: we have selected them based on a pre-study phase with attack information, using a training-validation-testing approach. Will the trade-off be effective when new attack types occur? This is particularly challenging when change of normality occurs.


Our work on online adaptation addresses challenge 4 by tuning the parameters of intrusion detection. Chapter 5 described the design and implementation of an on-line energy-based adaptation component for an attack survivability framework in the context of mobile ad hoc communication. This component adjusts the trade-off between attack response time and energy consumption based on the available energy. Intrusion detection parameters were initially tuned based on detection-latency trade-offs, again using training, validation and testing on some examples of attack. Our work on adaptive trade-off selection targeted improvements towards that aspect: a set of parameters can be devised in a phase prior to deployment, but online adaptation selects the parameters to adjust the detector to the current condition. The results have shown that adaptation increases the network lifetime. However, the adaptation strategy is based on fixed energy intervals and is rather simple. Smarter strategies to forecast the energy depletion and adapt in advance could be adopted in order to further improve network availability.

6.1 Future work

One of the remaining challenges in anomaly detection is the availability of labelled data. This creates difficulties for the research community. Already with trials in three application domains of cyber-physical systems it has been difficult to obtain data to build the normality model. In the first domain real data was available but had sensor-related issues and we suffered from lack of domain knowledge. In the second domain capturing real consumption profiles was hard due to privacy issues. In the third domain we could not even get rudimentary data, but had to rely on simulation.

Obtaining data that is affected by concept drift, in order to address the challenge of dealing with change of normality, is an even harder problem. However, dealing with concept drift is an interesting challenge that is worth studying, especially in cyber-physical systems, which are likely to be subject to this phenomenon. Creation of synthetic datasets, in particular for the physical domain, would be a way to perform such studies. In the water scenario, we have seen how synthetic contaminations can be superimposed by software; with the same principle, one could synthetically extend the real dataset with estimated values and emulate a possible change of normality.

Concept drift research is well established in the context of machine learning; extensions of similar techniques to adversarial environments are possible research directions. The clustering algorithm adopted in the first two domains has retraining and forgetting capabilities which can in principle be exploited to autonomously adapt the normality model to changes over time. The adaptation strategies included in the existing package are quite rudimentary, relying on human intervention for incremental learning (when a new cluster should be added) and using simple windowing for forgetting of unused clusters. Future research will focus on algorithms for online self-adaptation of normality to cope with concept drift.

The thesis has focused on ad hoc communication as a third domain of study. In such networks, the chaotic environment, characterised by node mobility, miscommunication and constrained resource availability (energy, bandwidth, etc.), makes concept drift difficult to capture from the point of view of a single node. Since the survivability framework is deployed in an adversarial environment, an attacker could learn the adaptive behaviour of the node and act against it. A game-theoretical approach to adaptation can therefore be devised to improve the survivability and lifetime in the presence of two types of players in the network, attackers and defenders.


Bibliography

[1] NIST. NIST Framework and Roadmap for Smart Grid Interoperability Standards, Release 1.0, http://www.nist.gov/public_affairs/releases/upload/smartgrid_interoperability_final.pdf, accessed Dec. 2012.

[2] M. Broxvall. Metering devices with legal calibration requirements, Deliverable D2.1, http://www.secfutur.eu/WP2/Deliverables/D2.1/Deliverable_D2_1.pdf, accessed May 2012.

[3] EU SecFutur FP7 Project. http://www.secfutur.eu, accessed Jun. 2012.

[4] I. Zliobaite, “Learning under concept drift: an overview,” CoRR, vol. abs/1010.4784, 2010.

[5] A. Sperotto, M. Mandjes, R. Sadre, P. de Boer, and A. Pras, “Autonomic parameter tuning of anomaly-based idss: an ssh case study,” Network and Service Management, IEEE Transactions on, vol. 9, pp. 128–141, June 2012.

[6] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Comput. Surv., vol. 41, pp. 15:1–15:58, July 2009.

[7] G. Dini, F. Martinelli, A. Saracino, and D. Sgandurra, “Madam: A multi-level anomaly detector for android malware,” in Computer Network Security (I. Kotenko and V. Skormin, eds.), vol. 7531 of Lecture Notes in Computer Science, pp. 240–253, Springer Berlin Heidelberg, 2012.

[8] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani, “Crowdroid: behavior-based malware detection system for android,” in Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices, SPSM ’11, (New York, NY, USA), pp. 15–26, ACM, 2011.

[9] R. R. Rajkumar, I. Lee, L. Sha, and J. Stankovic, “Cyber-physical systems: the next computing revolution,” in Proceedings of the 47th Design Automation Conference, DAC ’10, pp. 731–736, ACM, 2010.

[10] J. White, S. Clarke, C. Groba, B. Dougherty, C. Thompson, and D. Schmidt, “R&d challenges and solutions for mobile cyber-physical applications and supporting internet services,” Journal of Internet Services and Applications, vol. 1, pp. 45–56, 2010.

[11] S. Depuru, L. Wang, V. Devabhaktuni, and N. Gudi, “Measures and set-backs for controlling electricity theft,” in North American Power Symposium(NAPS), 2010, pp. 1 –8, sept. 2010.

[12] E. Goetz and S. Shenoi, eds., Critical Infrastructure Protection. Springer,2008.

[13] P. McDaniel and S. McLaughlin, “Security and privacy challenges in thesmart grid,” Security Privacy, IEEE, vol. 7, pp. 75 –77, May-June 2009.

[14] R. Berthier, W. Sanders, and H. Khurana, “Intrusion detection for ad-vanced metering infrastructures: Requirements and architectural directions,”in Smart Grid Communications (SmartGridComm), 2010 First IEEE Interna-tional Conference on, pp. 350 –355, Oct. 2010.

[15] K. Burbeck and S. Nadjm-Tehrani, “Adaptive real-time anomaly detectionwith incremental clustering,” Information Security Technical Report - Else-vier, vol. 12, no. 1, pp. 56–67, 2007.

[16] K. Burbeck and S. Nadjm-Tehrani, “ADWICE: Anomaly detection with real-time incremental clustering,” in Proceedings of 7th International Conferenceon Information Security and Cryptology (ICISC 04), vol. 3506 of LNCS,pp. 407–424, Springer, 2004.

[17] T. M. Walski, D. V. Chase, D. A. Savic, W. Grayman, S. Beckwith, and E. Koelle, eds., Advanced water distribution modeling and management. Haestad Press, 2004.

[18] S. Friedlander and D. Serre, eds., Handbook of mathematical fluid dynamics, Vol. 1. Elsevier B.V, 2002.

[19] http://www.epa.gov/nrmrl/wswrd/dw/epanet.html, accessed 19 November 2010.

[20] D. Eliades and M. Polycarpou, “Security of water infrastructure systems,” in Critical Information Infrastructure Security (R. Setola and S. Geretshuber, eds.), vol. 5508 of Lecture Notes in Computer Science, pp. 360–367, Springer Berlin / Heidelberg, 2009.

[21] ASCE, Interim voluntary guidelines for designing an online contaminant monitoring system. American Society of Civil Engineers, Reston, VA, 2004.

[22] D. Byer and K. Carlson, “Real-time detection of intentional chemical contamination in the distribution system,” Journal American Water Works Association, vol. 97, July 2005.

[23] K. A. Klise and S. A. McKenna, “Multivariate applications for detecting anomalous water quality,” vol. 247, pp. 130–130, ASCE, 2006.

[24] J. Han, Data Mining: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005.

[25] D. Hart, S. A. McKenna, K. Klise, V. Cruz, and M. Wilson, “Canary: A water quality event detection algorithm development tool,” vol. 243, pp. 517–517, ASCE, 2007.

[26] M. W. Koch and S. McKenna, “Distributed sensor fusion in water quality event detection,” to appear in Journal of Water Resources Planning and Management, vol. 137, January/February 2011.

[27] K. Kurotani, M. Kubota, H. Akiyama, and M. Morimoto, “Simulator for contamination diffusion in a water distribution network,” in Industrial Electronics, Control, and Instrumentation, 1995., Proceedings of the 1995 IEEE IECON 21st International Conference on, vol. 2, pp. 792–797 vol. 2, 1995.

[28] A. Doglioni, F. Primativo, O. Giustolisi, and A. Carbonara, “Scenarios of contaminant diffusion on a medium size urban water distribution network,” vol. 340, pp. 84–84, ASCE, 2008.

[29] A. Kessler, A. Ostfeld, and G. Sinai, “Detecting accidental contaminations in municipal water networks,” Journal of Water Resources Planning and Management, vol. 124, no. 4, pp. 192–198, 1998.

[30] B. H. Lee and R. A. Deininger, “Optimal locations of monitoring stations in water distribution system,” Journal of Environmental Engineering, vol. 118, no. 1, pp. 4–16, 1992.

[31] A. Ostfeld and E. Salomons, “Optimal layout of early warning detection stations for water distribution systems security,” Journal of Water Resources Planning and Management, vol. 130, no. 5, pp. 377–385, 2004.

[32] J. W. Berry, L. Fleischer, W. E. Hart, C. A. Phillips, and J.-P. Watson, “Sensor placement in municipal water networks,” Journal of Water Resources Planning and Management, vol. 131, no. 3, pp. 237–243, 2005.

[33] M. Propato, “Contamination warning in water networks: General mixed-integer linear models for sensor location design,” Journal of Water Resources Planning and Management, vol. 132, no. 4, pp. 225–233, 2006.

[34] D. Eliades and M. Polycarpou, “A fault diagnosis and security framework for water systems,” Control Systems Technology, IEEE Transactions on, vol. 18, no. 6, pp. 1254–1265, 2010.

[35] C. D. Laird, L. T. Biegler, B. G. van Bloemen Waanders, and R. A. Bartlett, “Contamination source determination for water networks,” Journal of Water Resources Planning and Management, vol. 131, no. 2, pp. 125–134, 2005.

[36] C. D. Laird, L. T. Biegler, and B. G. van Bloemen Waanders, “Mixed-integer approach for obtaining unique solutions in source inversion of water networks,” Journal of Water Resources Planning and Management, vol. 132, no. 4, pp. 242–251, 2006.

[37] A. Preis and A. Ostfeld, “Contamination source identification in water systems: A hybrid model trees–linear programming scheme,” Journal of Water Resources Planning and Management, vol. 132, no. 4, pp. 263–273, 2006.

[38] J. Guan, M. M. Aral, M. L. Maslia, and W. M. Grayman, “Identification of contaminant sources in water distribution systems using simulation–optimization method: Case study,” Journal of Water Resources Planning and Management, vol. 132, no. 4, pp. 252–262, 2006.

[39] E. M. Zechman and S. R. Ranjithan, “Evolutionary computation-based methods for characterizing contaminant sources in a water distribution system,” Journal of Water Resources Planning and Management, vol. 135, no. 5, pp. 334–343, 2009.

[40] J. J. Huang and E. A. McBean, “Data mining to identify contaminant event locations in water distribution systems,” Journal of Water Resources Planning and Management, vol. 135, no. 6, pp. 466–474, 2009.

[41] A. A. Cardenas, S. Amin, and S. Sastry, “Research challenges for the security of control systems,” in Proceedings of the 3rd conference on Hot topics in security, (Berkeley, CA, USA), pp. 6:1–6:6, USENIX Association, 2008.

[42] X. Fang, S. Misra, G. Xue, and D. Yang, “Smart grid-the new and improved power grid: A survey,” Communications Surveys & Tutorials, IEEE, vol. PP, no. 99, pp. 1–37, 2011.

[43] S. McLaughlin, D. Podkuiko, and P. McDaniel, “Energy theft in the advanced metering infrastructure,” in Proceedings of the 4th international conference on Critical information infrastructures security, CRITIS ’09, pp. 176–187, Springer-Verlag, 2010.

[44] NISTIR 7628. Guidelines for Smart Grid Cyber Security Requirements, http://csrc.nist.gov/publications/nistir/ir7628/introduction-to-nistir-7628.pdf, accessed Jun. 2012.

[45] U.S. Department of Energy Office of Electricity Delivery and Energy Reliability. Study of Security Attributes of Smart Grid Systems. Current Cyber Security Issues, http://www.inl.gov/scada/publications/d/securing_the_smart_grid_current_issues.pdf, accessed Jan. 2012.

[46] E. Pallotti and F. Mangiatordi, “Smart grid cyber security requirements,” in Environment and Electrical Engineering (EEEIC), 2011 10th International Conference on, pp. 1–4, 2011.

[47] A. Metke and R. Ekl, “Smart grid security technology,” in Innovative Smart Grid Technologies (ISGT), 2010, pp. 1–7, 2010.

[48] Z. Lu, X. Lu, W. Wang, and C. Wang, “Review and evaluation of security threats on the communication networks in the smart grid,” in Military Communication Conference, 2010 - MILCOM 2010, pp. 1830–1835, 2010.

[49] F. Cleveland, “Cyber security issues for advanced metering infrastructure (AMI),” in Power and Energy Society General Meeting - Conversion and Delivery of Electrical Energy in the 21st Century, 2008 IEEE, pp. 1–5, July 2008.

[50] R. Berthier and W. Sanders, “Specification-based intrusion detection for advanced metering infrastructures,” in Dependable Computing (PRDC), 2011 IEEE 17th Pacific Rim International Symposium on, pp. 184–193, Dec. 2011.

[51] N. Kush, E. Foo, E. Ahmed, I. Ahmed, and A. Clark, “Gap analysis of intrusion detection in smart grids,” in 2nd International Cyber Resilience Conference (C. Valli, ed.), pp. 38–46, secau - Security Research Centre, August 2011.

[52] S. McLaughlin, D. Podkuiko, S. Miadzvezhanka, A. Delozier, and P. McDaniel, “Multi-vendor penetration testing in the advanced metering infrastructure,” in Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC ’10, pp. 107–116, ACM, 2010.

[53] P. Kadurek, J. Blom, J. Cobben, and W. Kling, “Theft detection and smart metering practices and expectations in the Netherlands,” in Innovative Smart Grid Technologies Conference Europe (ISGT Europe), 2010 IEEE PES, pp. 1–6, 2010.

[54] M. Asplund and S. Nadjm-Tehrani, “A partition-tolerant manycast algorithm for disaster area networks,” IEEE Symposium on Reliable Distributed Systems, pp. 156–165, 2009.

[55] H. Yang, H. Luo, F. Ye, S. Lu, and L. Zhang, “Security in mobile ad hoc networks: challenges and solutions,” IEEE Wireless Communications, vol. 11, no. 1, pp. 38–47, 2004.

[56] J. Cucurull, M. Asplund, and S. Nadjm-Tehrani, “Anomaly detection and mitigation for disaster area networks,” in Recent Advances in Intrusion Detection (S. Jha, R. Sommer, and C. Kreibich, eds.), vol. 6307 of Lecture Notes in Computer Science, pp. 339–359, Springer Berlin / Heidelberg, 2010.

[57] C. Xenakis, C. Panos, and I. Stavrakakis, “A comparative evaluation of intrusion detection architectures for mobile ad hoc networks,” Computers & Security, vol. 30, no. 1, pp. 63–80, 2011.

[58] M. Lima, A. dos Santos, and G. Pujolle, “A survey of survivability in mobile ad hoc networks,” Communications Surveys & Tutorials, IEEE, vol. 11, pp. 66–77, quarter 2009.

[59] http://cfpub.epa.gov/safewater/watersecurity/initiative.cfm, accessed 26 April 2010.

[60] S. C. Allgeier and K. Umberg, “Systematic evaluation of contaminant detection through water quality monitoring,” in Water Security Congress Proceedings, American Water Works Association, 2008.

[61] M. Ernoult and F. Meslier, “Analysis and forecast of electrical energy demand,” in Revue generale de l’electricite, vol. 4, 1982.

[62] G. Chicco, R. Napoli, P. Postolache, M. Scutariu, and C. Toader, “Electric energy customer characterisation for developing dedicated market strategies,” in Power Tech Proceedings, 2001 IEEE Porto, vol. 1, p. 6 pp. vol. 1, 2001.

[63] M. Hossain, S. Bridges, and J. Vaughn, R.B., “Adaptive intrusion detection with data mining,” in Systems, Man and Cybernetics, 2003. IEEE International Conference on, vol. 4, pp. 3097–3103 vol. 4, Oct. 2003.

[64] Z. Yu, J. J. P. Tsai, and T. Weigert, “An adaptive automatically tuning intrusion detection system,” ACM Trans. Auton. Adapt. Syst., vol. 3, pp. 10:1–10:25, Aug. 2008.

[65] P. Laskov and R. Lippmann, “Machine learning in adversarial environments,” Mach. Learn., vol. 81, pp. 115–119, Nov. 2010.

[66] G. F. Cretu-Ciocarlie, A. Stavrou, M. E. Locasto, and S. J. Stolfo, “Adaptive anomaly detection via self-calibration and dynamic updating,” in Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection, RAID ’09, (Berlin, Heidelberg), pp. 41–60, Springer-Verlag, 2009.

[67] HFN project, www.ida.liu.se/~rtslab/HFN, accessed 21 July 2011.

[68] M. Asplund and S. Nadjm-Tehrani, “A partition-tolerant manycast algorithm for disaster area networks,” IEEE Symposium on Reliable Distributed Systems, pp. 156–165, 2009.

[69] J. Cucurull, M. Asplund, S. Nadjm-Tehrani, and T. Santoro, “Surviving attacks in challenged networks,” Dependable and Secure Computing, IEEE Transactions on, vol. 9, pp. 917–929, Nov.-Dec. 2012.

[70] E. J. Vergara, S. Nadjm-Tehrani, M. Asplund, and U. Zurutuza, “Resource footprint of a manycast protocol implementation on multiple mobile platforms,” in Next Generation Mobile Applications, Services and Technologies, 2011 NGMAST ’11. The Fifth International Conference on, IEEE, Sept. 2011.

[71] M. Salehie and L. Tahvildari, “Self-adaptive software: Landscape and research challenges,” ACM Trans. Auton. Adapt. Syst., vol. 4, pp. 14:1–14:42, May 2009.

[72] A. Ferrante, A. V. Taddeo, M. Sami, F. Mantovani, and J. Fridkins, “Self-adaptive security at application level: a proposal,” in Proceedings of ReCoSoC 2007, June 2007.

[73] C. Chigan, L. Li, and Y. Ye, “Resource-aware self-adaptive security provisioning in mobile ad hoc networks,” in Wireless Communications and Networking Conference, 2005 IEEE, vol. 4, pp. 2118–2124 Vol. 4, March 2005.

[74] R. Mitchell and I.-R. Chen, “Survivability analysis of mobile cyber physical systems with voting-based intrusion detection,” in Wireless Communications and Mobile Computing Conference (IWCMC), 2011 7th International, pp. 2256–2261, July 2011.

[75] H. Wu, S. Nabar, and R. Poovendran, “An energy framework for the network simulator 3 (ns-3),” in 4th International ICST Conference on Simulation Tools and Techniques (SIMUTools ’11), 2011.

[76] F. Chen, I. Dietrich, R. German, and F. Dressler, “An Energy Model for Simulation Studies of Wireless Sensor Networks using OMNeT++,” Praxis der Informationsverarbeitung und Kommunikation (PIK), vol. 32, pp. 133–138, June 2009.

[77] OMNeT++, http://www.omnetpp.org/, accessed 21 July 2011.

[78] L. M. Feeney and D. Willkomm, “Energy framework: an extensible framework for simulating battery consumption in wireless networks,” in Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques, SIMUTools ’10, (Brussels, Belgium), pp. 20:1–20:4, ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2010.

[79] ARMulator, www.arm.com, accessed 21 July 2011.

[80] T. Austin, E. Larson, and D. Ernst, “Simplescalar: An infrastructure for computer system modeling,” Computer, vol. 35, pp. 59–67, February 2002.

[81] P. Stanley-Marbell and D. Marculescu, “Sunflower: full-system, embedded, microarchitecture evaluation,” in Proceedings of the 2nd international conference on High performance embedded architectures and compilers, HiPEAC ’07, (Berlin, Heidelberg), pp. 168–182, Springer-Verlag, 2007.

[82] L. Palopoli, G. Lipari, G. Lamastra, L. Abeni, G. Bolognini, and P. Ancilotti, “An object-oriented tool for simulating distributed real-time control systems,” Software: Practice and Experience, vol. 32, no. 9, pp. 907–932, 2002.

[83] E. Weingartner, F. Schmidt, H. V. Lehn, T. Heer, and K. Wehrle, “Slicetime: a platform for scalable and accurate network emulation,” in Proceedings of the 8th USENIX conference on Networked systems design and implementation, NSDI ’11, (Berkeley, CA, USA), pp. 19–19, USENIX Association, 2011.

[84] J. Cucurull, S. Nadjm-Tehrani, and M. Raciti, “Modular anomaly detection for smartphone ad hoc communication,” in Proceedings of the 16th Nordic conference on Information Security Technology for Applications, NordSec ’11, (Berlin, Heidelberg), pp. 65–81, Springer-Verlag, 2012.

[85] N. Aschenbruck, E. Gerhards-Padilla, M. Gerharz, M. Frank, and P. Martini, “Modelling mobility in disaster area scenarios,” in MSWiM ’07: Proceedings of the 10th ACM Symposium on Modeling, analysis, and simulation of wireless and mobile systems, pp. 4–12, ACM, 2007.

[86] L. Feeney and M. Nilsson, “Investigating the energy consumption of a wireless network interface in an ad hoc networking environment,” in INFOCOM 2001. Twentieth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, vol. 3, pp. 1548–1557 vol. 3, 2001.

Department of Computer and Information Science Linköpings universitet

Licentiate Theses Linköpings Studies in Science and Technology

Faculty of Arts and Sciences

No 17 Vojin Plavsic: Interleaved Processing of Non-Numerical Data Stored on a Cyclic Memory. (Available at: FOA, Box 1165, S-581 11 Linköping, Sweden. FOA Report B30062E)

No 28 Arne Jönsson, Mikael Patel: An Interactive Flowcharting Technique for Communicating and Realizing Al-gorithms, 1984.

No 29 Johnny Eckerland: Retargeting of an Incremental Code Generator, 1984. No 48 Henrik Nordin: On the Use of Typical Cases for Knowledge-Based Consultation and Teaching, 1985. No 52 Zebo Peng: Steps Towards the Formalization of Designing VLSI Systems, 1985. No 60 Johan Fagerström: Simulation and Evaluation of Architecture based on Asynchronous Processes, 1985. No 71 Jalal Maleki: ICONStraint, A Dependency Directed Constraint Maintenance System, 1987. No 72 Tony Larsson: On the Specification and Verification of VLSI Systems, 1986. No 73 Ola Strömfors: A Structure Editor for Documents and Programs, 1986. No 74 Christos Levcopoulos: New Results about the Approximation Behavior of the Greedy Triangulation, 1986. No 104 Shamsul I. Chowdhury: Statistical Expert Systems - a Special Application Area for Knowledge-Based Computer

Methodology, 1987. No 108 Rober Bilos: Incremental Scanning and Token-Based Editing, 1987. No 111 Hans Block: SPORT-SORT Sorting Algorithms and Sport Tournaments, 1987. No 113 Ralph Rönnquist: Network and Lattice Based Approaches to the Representation of Knowledge, 1987. No 118 Mariam Kamkar, Nahid Shahmehri: Affect-Chaining in Program Flow Analysis Applied to Queries of Pro-

grams, 1987. No 126 Dan Strömberg: Transfer and Distribution of Application Programs, 1987. No 127 Kristian Sandahl: Case Studies in Knowledge Acquisition, Migration and User Acceptance of Expert Systems,

1987. No 139 Christer Bäckström: Reasoning about Interdependent Actions, 1988. No 140 Mats Wirén: On Control Strategies and Incrementality in Unification-Based Chart Parsing, 1988. No 146 Johan Hultman: A Software System for Defining and Controlling Actions in a Mechanical System, 1988. No 150 Tim Hansen: Diagnosing Faults using Knowledge about Malfunctioning Behavior, 1988. No 165 Jonas Löwgren: Supporting Design and Management of Expert System User Interfaces, 1989. No 166 Ola Petersson: On Adaptive Sorting in Sequential and Parallel Models, 1989. No 174 Yngve Larsson: Dynamic Configuration in a Distributed Environment, 1989. No 177 Peter Åberg: Design of a Multiple View Presentation and Interaction Manager, 1989. No 181 Henrik Eriksson: A Study in Domain-Oriented Tool Support for Knowledge Acquisition, 1989. No 184 Ivan Rankin: The Deep Generation of Text in Expert Critiquing Systems, 1989. No 187 Simin Nadjm-Tehrani: Contributions to the Declarative Approach to Debugging Prolog Programs, 1989. No 189 Magnus Merkel: Temporal Information in Natural Language, 1989. No 196 Ulf Nilsson: A Systematic Approach to Abstract Interpretation of Logic Programs, 1989. No 197 Staffan Bonnier: Horn Clause Logic with External Procedures: Towards a Theoretical Framework, 1989. No 203 Christer Hansson: A Prototype System for Logical Reasoning about Time and Action, 1990. No 212 Björn Fjellborg: An Approach to Extraction of Pipeline Structures for VLSI High-Level Synthesis, 1990. No 230 Patrick Doherty: A Three-Valued Approach to Non-Monotonic Reasoning, 1990. No 237 Tomas Sokolnicki: Coaching Partial Plans: An Approach to Knowledge-Based Tutoring, 1990. No 250 Lars Strömberg: Postmortem Debugging of Distributed Systems, 1990. No 253 Torbjörn Näslund: SLDFA-Resolution - Computing Answers for Negative Queries, 1990. No 260 Peter D. Holmes: Using Connectivity Graphs to Support Map-Related Reasoning, 1991. 
No 283 Olof Johansson: Improving Implementation of Graphical User Interfaces for Object-Oriented Knowledge- Bases,

1991. No 298 Rolf G Larsson: Aktivitetsbaserad kalkylering i ett nytt ekonomisystem, 1991. No 318 Lena Srömbäck: Studies in Extended Unification-Based Formalism for Linguistic Description: An Algorithm for

Feature Structures with Disjunction and a Proposal for Flexible Systems, 1992. No 319 Mikael Pettersson: DML-A Language and System for the Generation of Efficient Compilers from Denotational

Specification, 1992. No 326 Andreas Kågedal: Logic Programming with External Procedures: an Implementation, 1992. No 328 Patrick Lambrix: Aspects of Version Management of Composite Objects, 1992. No 333 Xinli Gu: Testability Analysis and Improvement in High-Level Synthesis Systems, 1992. No 335 Torbjörn Näslund: On the Role of Evaluations in Iterative Development of Managerial Support Systems, 1992. No 348 Ulf Cederling: Industrial Software Development - a Case Study, 1992. No 352 Magnus Morin: Predictable Cyclic Computations in Autonomous Systems: A Computational Model and Im-

plementation, 1992. No 371 Mehran Noghabai: Evaluation of Strategic Investments in Information Technology, 1993.

No 378 Mats Larsson: A Transformational Approach to Formal Digital System Design, 1993. No 380 Johan Ringström: Compiler Generation for Parallel Languages from Denotational Specifications, 1993. No 381 Michael Jansson: Propagation of Change in an Intelligent Information System, 1993. No 383 Jonni Harrius: An Architecture and a Knowledge Representation Model for Expert Critiquing Systems, 1993. No 386 Per Österling: Symbolic Modelling of the Dynamic Environments of Autonomous Agents, 1993. No 398 Johan Boye: Dependency-based Groudness Analysis of Functional Logic Programs, 1993. No 402 Lars Degerstedt: Tabulated Resolution for Well Founded Semantics, 1993. No 406 Anna Moberg: Satellitkontor - en studie av kommunikationsmönster vid arbete på distans, 1993. No 414 Peter Carlsson: Separation av företagsledning och finansiering - fallstudier av företagsledarutköp ur ett agent-

teoretiskt perspektiv, 1994. No 417 Camilla Sjöström: Revision och lagreglering - ett historiskt perspektiv, 1994. No 436 Cecilia Sjöberg: Voices in Design: Argumentation in Participatory Development, 1994. No 437 Lars Viklund: Contributions to a High-level Programming Environment for a Scientific Computing, 1994. No 440 Peter Loborg: Error Recovery Support in Manufacturing Control Systems, 1994. FHS 3/94 Owen Eriksson: Informationssystem med verksamhetskvalitet - utvärdering baserat på ett verksamhetsinriktat och

samskapande perspektiv, 1994. FHS 4/94 Karin Pettersson: Informationssystemstrukturering, ansvarsfördelning och användarinflytande - En komparativ

studie med utgångspunkt i två informationssystemstrategier, 1994. No 441 Lars Poignant: Informationsteknologi och företagsetablering - Effekter på produktivitet och region, 1994. No 446 Gustav Fahl: Object Views of Relational Data in Multidatabase Systems, 1994. No 450 Henrik Nilsson: A Declarative Approach to Debugging for Lazy Functional Languages, 1994. No 451 Jonas Lind: Creditor - Firm Relations: an Interdisciplinary Analysis, 1994. No 452 Martin Sköld: Active Rules based on Object Relational Queries - Efficient Change Monitoring Techniques, 1994. No 455 Pär Carlshamre: A Collaborative Approach to Usability Engineering: Technical Communicators and System

Developers in Usability-Oriented Systems Development, 1994. FHS 5/94 Stefan Cronholm: Varför CASE-verktyg i systemutveckling? - En motiv- och konsekvensstudie avseende

arbetssätt och arbetsformer, 1994. No 462 Mikael Lindvall: A Study of Traceability in Object-Oriented Systems Development, 1994. No 463 Fredrik Nilsson: Strategi och ekonomisk styrning - En studie av Sandviks förvärv av Bahco Verktyg, 1994. No 464 Hans Olsén: Collage Induction: Proving Properties of Logic Programs by Program Synthesis, 1994. No 469 Lars Karlsson: Specification and Synthesis of Plans Using the Features and Fluents Framework, 1995. No 473 Ulf Söderman: On Conceptual Modelling of Mode Switching Systems, 1995. No 475 Choong-ho Yi: Reasoning about Concurrent Actions in the Trajectory Semantics, 1995. No 476 Bo Lagerström: Successiv resultatavräkning av pågående arbeten. - Fallstudier i tre byggföretag, 1995. No 478 Peter Jonsson: Complexity of State-Variable Planning under Structural Restrictions, 1995. FHS 7/95 Anders Avdic: Arbetsintegrerad systemutveckling med kalkylprogram, 1995. No 482 Eva L Ragnemalm: Towards Student Modelling through Collaborative Dialogue with a Learning Companion,

1995. No 488 Eva Toller: Contributions to Parallel Multiparadigm Languages: Combining Object-Oriented and Rule-Based

Programming, 1995. No 489 Erik Stoy: A Petri Net Based Unified Representation for Hardware/Software Co-Design, 1995. No 497 Johan Herber: Environment Support for Building Structured Mathematical Models, 1995. No 498 Stefan Svenberg: Structure-Driven Derivation of Inter-Lingual Functor-Argument Trees for Multi-Lingual

Generation, 1995. No 503 Hee-Cheol Kim: Prediction and Postdiction under Uncertainty, 1995. FHS 8/95 Dan Fristedt: Metoder i användning - mot förbättring av systemutveckling genom situationell metodkunskap och

metodanalys, 1995. FHS 9/95 Malin Bergvall: Systemförvaltning i praktiken - en kvalitativ studie avseende centrala begrepp, aktiviteter och

ansvarsroller, 1995. No 513 Joachim Karlsson: Towards a Strategy for Software Requirements Selection, 1995. No 517 Jakob Axelsson: Schedulability-Driven Partitioning of Heterogeneous Real-Time Systems, 1995. No 518 Göran Forslund: Toward Cooperative Advice-Giving Systems: The Expert Systems Experience, 1995. No 522 Jörgen Andersson: Bilder av småföretagares ekonomistyrning, 1995. No 538 Staffan Flodin: Efficient Management of Object-Oriented Queries with Late Binding, 1996. No 545 Vadim Engelson: An Approach to Automatic Construction of Graphical User Interfaces for Applications in

Scientific Computing, 1996. No 546 Magnus Werner : Multidatabase Integration using Polymorphic Queries and Views, 1996. FiF-a 1/96 Mikael Lind: Affärsprocessinriktad förändringsanalys - utveckling och tillämpning av synsätt och metod, 1996. No 549 Jonas Hallberg: High-Level Synthesis under Local Timing Constraints, 1996. No 550 Kristina Larsen: Förutsättningar och begränsningar för arbete på distans - erfarenheter från fyra svenska företag.

1996. No 557 Mikael Johansson: Quality Functions for Requirements Engineering Methods, 1996. No 558 Patrik Nordling: The Simulation of Rolling Bearing Dynamics on Parallel Computers, 1996. No 561 Anders Ekman: Exploration of Polygonal Environments, 1996.

No 563 Niclas Andersson: Compilation of Mathematical Models to Parallel Code, 1996. No 567 Johan Jenvald: Simulation and Data Collection in Battle Training, 1996. No 575 Niclas Ohlsson: Software Quality Engineering by Early Identification of Fault-Prone Modules, 1996. No 576 Mikael Ericsson: Commenting Systems as Design Support—A Wizard-of-Oz Study, 1996. No 587 Jörgen Lindström: Chefers användning av kommunikationsteknik, 1996. No 589 Esa Falkenroth: Data Management in Control Applications - A Proposal Based on Active Database Systems,

1996. No 591 Niclas Wahllöf: A Default Extension to Description Logics and its Applications, 1996. No 595 Annika Larsson: Ekonomisk Styrning och Organisatorisk Passion - ett interaktivt perspektiv, 1997. No 597 Ling Lin: A Value-based Indexing Technique for Time Sequences, 1997. No 598 Rego Granlund: C3Fire - A Microworld Supporting Emergency Management Training, 1997. No 599 Peter Ingels: A Robust Text Processing Technique Applied to Lexical Error Recovery, 1997. No 607 Per-Arne Persson: Toward a Grounded Theory for Support of Command and Control in Military Coalitions, 1997. No 609 Jonas S Karlsson: A Scalable Data Structure for a Parallel Data Server, 1997. FiF-a 4 Carita Åbom: Videomötesteknik i olika affärssituationer - möjligheter och hinder, 1997. FiF-a 6 Tommy Wedlund: Att skapa en företagsanpassad systemutvecklingsmodell - genom rekonstruktion, värdering och

vidareutveckling i T50-bolag inom ABB, 1997. No 615 Silvia Coradeschi: A Decision-Mechanism for Reactive and Coordinated Agents, 1997. No 623 Jan Ollinen: Det flexibla kontorets utveckling på Digital - Ett stöd för multiflex? 1997. No 626 David Byers: Towards Estimating Software Testability Using Static Analysis, 1997. No 627 Fredrik Eklund: Declarative Error Diagnosis of GAPLog Programs, 1997. No 629 Gunilla Ivefors: Krigsspel och Informationsteknik inför en oförutsägbar framtid, 1997. No 631 Jens-Olof Lindh: Analysing Traffic Safety from a Case-Based Reasoning Perspective, 1997 No 639 Jukka Mäki-Turja:. Smalltalk - a suitable Real-Time Language, 1997. No 640 Juha Takkinen: CAFE: Towards a Conceptual Model for Information Management in Electronic Mail, 1997. No 643 Man Lin: Formal Analysis of Reactive Rule-based Programs, 1997. No 653 Mats Gustafsson: Bringing Role-Based Access Control to Distributed Systems, 1997. FiF-a 13 Boris Karlsson: Metodanalys för förståelse och utveckling av systemutvecklingsverksamhet. Analys och värdering

av systemutvecklingsmodeller och dess användning, 1997. No 674 Marcus Bjäreland: Two Aspects of Automating Logics of Action and Change - Regression and Tractability,

1998. No 676 Jan Håkegård: Hierarchical Test Architecture and Board-Level Test Controller Synthesis, 1998. No 668 Per-Ove Zetterlund: Normering av svensk redovisning - En studie av tillkomsten av Redovisningsrådets re-

kommendation om koncernredovisning (RR01:91), 1998. No 675 Jimmy Tjäder: Projektledaren & planen - en studie av projektledning i tre installations- och systemutveck-

lingsprojekt, 1998. FiF-a 14 Ulf Melin: Informationssystem vid ökad affärs- och processorientering - egenskaper, strategier och utveckling,

1998. No 695 Tim Heyer: COMPASS: Introduction of Formal Methods in Code Development and Inspection, 1998. No 700 Patrik Hägglund: Programming Languages for Computer Algebra, 1998. FiF-a 16 Marie-Therese Christiansson: Inter-organisatorisk verksamhetsutveckling - metoder som stöd vid utveckling av

partnerskap och informationssystem, 1998. No 712 Christina Wennestam: Information om immateriella resurser. Investeringar i forskning och utveckling samt i

personal inom skogsindustrin, 1998. No 719 Joakim Gustafsson: Extending Temporal Action Logic for Ramification and Concurrency, 1998. No 723 Henrik André-Jönsson: Indexing time-series data using text indexing methods, 1999. No 725 Erik Larsson: High-Level Testability Analysis and Enhancement Techniques, 1998. No 730 Carl-Johan Westin: Informationsförsörjning: en fråga om ansvar - aktiviteter och uppdrag i fem stora svenska

organisationers operativa informationsförsörjning, 1998. No 731 Åse Jansson: Miljöhänsyn - en del i företags styrning, 1998. No 733 Thomas Padron-McCarthy: Performance-Polymorphic Declarative Queries, 1998. No 734 Anders Bäckström: Värdeskapande kreditgivning - Kreditriskhantering ur ett agentteoretiskt perspektiv, 1998. FiF-a 21 Ulf Seigerroth: Integration av förändringsmetoder - en modell för välgrundad metodintegration, 1999. FiF-a 22 Fredrik Öberg: Object-Oriented Frameworks - A New Strategy for Case Tool Development, 1998. No 737 Jonas Mellin: Predictable Event Monitoring, 1998. No 738 Joakim Eriksson: Specifying and Managing Rules in an Active Real-Time Database System, 1998. FiF-a 25 Bengt E W Andersson: Samverkande informationssystem mellan aktörer i offentliga åtaganden - En teori om

aktörsarenor i samverkan om utbyte av information, 1998. No 742 Pawel Pietrzak: Static Incorrectness Diagnosis of CLP (FD), 1999. No 748 Tobias Ritzau: Real-Time Reference Counting in RT-Java, 1999. No 751 Anders Ferntoft: Elektronisk affärskommunikation - kontaktkostnader och kontaktprocesser mellan kunder och

leverantörer på producentmarknader, 1999. No 752 Jo Skåmedal: Arbete på distans och arbetsformens påverkan på resor och resmönster, 1999. No 753 Johan Alvehus: Mötets metaforer. En studie av berättelser om möten, 1999.

No 754 Magnus Lindahl: Bankens villkor i låneavtal vid kreditgivning till högt belånade företagsförvärv: En studie ur ett agentteoretiskt perspektiv, 2000.

No 766 Martin V. Howard: Designing dynamic visualizations of temporal data, 1999. No 769 Jesper Andersson: Towards Reactive Software Architectures, 1999. No 775 Anders Henriksson: Unique kernel diagnosis, 1999. FiF-a 30 Pär J. Ågerfalk: Pragmatization of Information Systems - A Theoretical and Methodological Outline, 1999. No 787 Charlotte Björkegren: Learning for the next project - Bearers and barriers in knowledge transfer within an

organisation, 1999. No 788 Håkan Nilsson: Informationsteknik som drivkraft i granskningsprocessen - En studie av fyra revisionsbyråer,

2000. No 790 Erik Berglund: Use-Oriented Documentation in Software Development, 1999. No 791 Klas Gäre: Verksamhetsförändringar i samband med IS-införande, 1999. No 800 Anders Subotic: Software Quality Inspection, 1999. No 807 Svein Bergum: Managerial communication in telework, 2000. No 809 Flavius Gruian: Energy-Aware Design of Digital Systems, 2000. FiF-a 32 Karin Hedström: Kunskapsanvändning och kunskapsutveckling hos verksamhetskonsulter - Erfarenheter från ett

FOU-samarbete, 2000. No 808 Linda Askenäs: Affärssystemet - En studie om teknikens aktiva och passiva roll i en organisation, 2000. No 820 Jean Paul Meynard: Control of industrial robots through high-level task programming, 2000. No 823 Lars Hult: Publika Gränsytor - ett designexempel, 2000. No 832 Paul Pop: Scheduling and Communication Synthesis for Distributed Real-Time Systems, 2000. FiF-a 34 Göran Hultgren: Nätverksinriktad Förändringsanalys - perspektiv och metoder som stöd för förståelse och

utveckling av affärsrelationer och informationssystem, 2000. No 842 Magnus Kald: The role of management control systems in strategic business units, 2000. No 844 Mikael Cäker: Vad kostar kunden? Modeller för intern redovisning, 2000. FiF-a 37 Ewa Braf: Organisationers kunskapsverksamheter - en kritisk studie av ”knowledge management”, 2000. FiF-a 40 Henrik Lindberg: Webbaserade affärsprocesser - Möjligheter och begränsningar, 2000. FiF-a 41 Benneth Christiansson: Att komponentbasera informationssystem - Vad säger teori och praktik?, 2000. No 854 Ola Pettersson: Deliberation in a Mobile Robot, 2000. No 863 Dan Lawesson: Towards Behavioral Model Fault Isolation for Object Oriented Control Systems, 2000. No 881 Johan Moe: Execution Tracing of Large Distributed Systems, 2001. No 882 Yuxiao Zhao: XML-based Frameworks for Internet Commerce and an Implementation of B2B e-procurement,

2001. No 890 Annika Flycht-Eriksson: Domain Knowledge Management in Information-providing Dialogue systems, 2001. FiF-a 47 Per-Arne Segerkvist: Webbaserade imaginära organisationers samverkansformer: Informationssystemarkitektur

och aktörssamverkan som förutsättningar för affärsprocesser, 2001. No 894 Stefan Svarén: Styrning av investeringar i divisionaliserade företag - Ett koncernperspektiv, 2001. No 906 Lin Han: Secure and Scalable E-Service Software Delivery, 2001. No 917 Emma Hansson: Optionsprogram för anställda - en studie av svenska börsföretag, 2001. No 916 Susanne Odar: IT som stöd för strategiska beslut, en studie av datorimplementerade modeller av verksamhet som

stöd för beslut om anskaffning av JAS 1982, 2002. FiF-a-49 Stefan Holgersson: IT-system och filtrering av verksamhetskunskap - kvalitetsproblem vid analyser och beslutsfattande som bygger på uppgifter hämtade från polisens IT-system, 2001. FiF-a-51 Per Oscarsson: Informationssäkerhet i verksamheter - begrepp och modeller som stöd för förståelse av informationssäkerhet och dess hantering, 2001. No 919 Luis Alejandro Cortes: A Petri Net Based Modeling and Verification Technique for Real-Time Embedded

Systems, 2001. No 915 Niklas Sandell: Redovisning i skuggan av en bankkris - Värdering av fastigheter, 2001. No 931 Fredrik Elg: Ett dynamiskt perspektiv på individuella skillnader av heuristisk kompetens, intelligens, mentala

modeller, mål och konfidens i kontroll av mikrovärlden Moro, 2002. No 933 Peter Aronsson: Automatic Parallelization of Simulation Code from Equation Based Simulation Languages, 2002. No 938 Bourhane Kadmiry: Fuzzy Control of Unmanned Helicopter, 2002. No 942 Patrik Haslum: Prediction as a Knowledge Representation Problem: A Case Study in Model Design, 2002. No 956 Robert Sevenius: On the instruments of governance - A law & economics study of capital instruments in limited

liability companies, 2002. FiF-a 58 Johan Petersson: Lokala elektroniska marknadsplatser - informationssystem för platsbundna affärer, 2002. No 964 Peter Bunus: Debugging and Structural Analysis of Declarative Equation-Based Languages, 2002. No 973 Gert Jervan: High-Level Test Generation and Built-In Self-Test Techniques for Digital Systems, 2002. No 958 Fredrika Berglund: Management Control and Strategy - a Case Study of Pharmaceutical Drug Development,

2002. FiF-a 61 Fredrik Karlsson: Meta-Method for Method Configuration - A Rational Unified Process Case, 2002. No 985 Sorin Manolache: Schedulability Analysis of Real-Time Systems with Stochastic Task Execution Times, 2002. No 982 Diana Szentiványi: Performance and Availability Trade-offs in Fault-Tolerant Middleware, 2002. No 989 Iakov Nakhimovski: Modeling and Simulation of Contacting Flexible Bodies in Multibody Systems, 2002. No 990 Levon Saldamli: PDEModelica - Towards a High-Level Language for Modeling with Partial Differential

Equations, 2002.

No 991 Almut Herzog: Secure Execution Environment for Java Electronic Services, 2002. No 999 Jon Edvardsson: Contributions to Program- and Specification-based Test Data Generation, 2002. No 1000 Anders Arpteg: Adaptive Semi-structured Information Extraction, 2002. No 1001 Andrzej Bednarski: A Dynamic Programming Approach to Optimal Retargetable Code Generation for Irregular

Architectures, 2002. No 988 Mattias Arvola: Good to use! : Use quality of multi-user applications in the home, 2003. FiF-a 62 Lennart Ljung: Utveckling av en projektivitetsmodell - om organisationers förmåga att tillämpa

projektarbetsformen, 2003. No 1003 Pernilla Qvarfordt: User experience of spoken feedback in multimodal interaction, 2003. No 1005 Alexander Siemers: Visualization of Dynamic Multibody Simulation With Special Reference to Contacts, 2003. No 1008 Jens Gustavsson: Towards Unanticipated Runtime Software Evolution, 2003. No 1010 Calin Curescu: Adaptive QoS-aware Resource Allocation for Wireless Networks, 2003. No 1015 Anna Andersson: Management Information Systems in Process-oriented Healthcare Organisations, 2003. No 1018 Björn Johansson: Feedforward Control in Dynamic Situations, 2003. No 1022 Traian Pop: Scheduling and Optimisation of Heterogeneous Time/Event-Triggered Distributed Embedded

Systems, 2003. FiF-a 65 Britt-Marie Johansson: Kundkommunikation på distans - en studie om kommunikationsmediets betydelse i

affärstransaktioner, 2003. No 1024 Aleksandra Tešanovic: Towards Aspectual Component-Based Real-Time System Development, 2003. No 1034 Arja Vainio-Larsson: Designing for Use in a Future Context - Five Case Studies in Retrospect, 2003. No 1033 Peter Nilsson: Svenska bankers redovisningsval vid reservering för befarade kreditförluster - En studie vid

införandet av nya redovisningsregler, 2003. FiF-a 69 Fredrik Ericsson: Information Technology for Learning and Acquiring of Work Knowledge, 2003. No 1049 Marcus Comstedt: Towards Fine-Grained Binary Composition through Link Time Weaving, 2003. No 1052 Åsa Hedenskog: Increasing the Automation of Radio Network Control, 2003. No 1054 Claudiu Duma: Security and Efficiency Tradeoffs in Multicast Group Key Management, 2003. FiF-a 71 Emma Eliason: Effektanalys av IT-systems handlingsutrymme, 2003. No 1055 Carl Cederberg: Experiments in Indirect Fault Injection with Open Source and Industrial Software, 2003. No 1058 Daniel Karlsson: Towards Formal Verification in a Component-based Reuse Methodology, 2003. FiF-a 73 Anders Hjalmarsson: Att etablera och vidmakthålla förbättringsverksamhet - behovet av koordination och

interaktion vid förändring av systemutvecklingsverksamheter, 2004. No 1079 Pontus Johansson: Design and Development of Recommender Dialogue Systems, 2004. No 1084 Charlotte Stoltz: Calling for Call Centres - A Study of Call Centre Locations in a Swedish Rural Region, 2004. FiF-a 74 Björn Johansson: Deciding on Using Application Service Provision in SMEs, 2004. No 1094 Genevieve Gorrell: Language Modelling and Error Handling in Spoken Dialogue Systems, 2004. No 1095 Ulf Johansson: Rule Extraction - the Key to Accurate and Comprehensible Data Mining Models, 2004. No 1099 Sonia Sangari: Computational Models of Some Communicative Head Movements, 2004. No 1110 Hans Nässla: Intra-Family Information Flow and Prospects for Communication Systems, 2004. No 1116 Henrik Sällberg: On the value of customer loyalty programs - A study of point programs and switching costs,

2004. FiF-a 77 Ulf Larsson: Designarbete i dialog - karaktärisering av interaktionen mellan användare och utvecklare i en

systemutvecklingsprocess, 2004. No 1126 Andreas Borg: Contribution to Management and Validation of Non-Functional Requirements, 2004. No 1127 Per-Ola Kristensson: Large Vocabulary Shorthand Writing on Stylus Keyboard, 2004. No 1132 Pär-Anders Albinsson: Interacting with Command and Control Systems: Tools for Operators and Designers,

2004. No 1130 Ioan Chisalita: Safety-Oriented Communication in Mobile Networks for Vehicles, 2004. No 1138 Thomas Gustafsson: Maintaining Data Consistency in Embedded Databases for Vehicular Systems, 2004. No 1149 Vaida Jakoniené: A Study in Integrating Multiple Biological Data Sources, 2005. No 1156 Abdil Rashid Mohamed: High-Level Techniques for Built-In Self-Test Resources Optimization, 2005. No 1162 Adrian Pop: Contributions to Meta-Modeling Tools and Methods, 2005. No 1165 Fidel Vascós Palacios: On the information exchange between physicians and social insurance officers in the sick

leave process: an Activity Theoretical perspective, 2005. FiF-a 84 Jenny Lagsten: Verksamhetsutvecklande utvärdering i informationssystemprojekt, 2005. No 1166 Emma Larsdotter Nilsson: Modeling, Simulation, and Visualization of Metabolic Pathways Using Modelica,

2005. No 1167 Christina Keller: Virtual Learning Environments in higher education. A study of students’ acceptance of educational technology, 2005. No 1168 Cécile Åberg: Integration of organizational workflows and the Semantic Web, 2005. FiF-a 85 Anders Forsman: Standardisering som grund för informationssamverkan och IT-tjänster - En fallstudie baserad på

trafikinformationstjänsten RDS-TMC, 2005. No 1171 Yu-Hsing Huang: A systemic traffic accident model, 2005. FiF-a 86 Jan Olausson: Att modellera uppdrag - grunder för förståelse av processinriktade informationssystem i

transaktionsintensiva verksamheter, 2005. No 1172 Petter Ahlström: Affärsstrategier för seniorbostadsmarknaden, 2005. No 1183 Mathias Cöster: Beyond IT and Productivity - How Digitization Transformed the Graphic Industry, 2005. No 1184 Åsa Horzella: Beyond IT and Productivity - Effects of Digitized Information Flows in Grocery Distribution, 2005. No 1185 Maria Kollberg: Beyond IT and Productivity - Effects of Digitized Information Flows in the Logging Industry,

2005.

No 1190 David Dinka: Role and Identity - Experience of technology in professional settings, 2005. No 1191 Andreas Hansson: Increasing the Storage Capacity of Recursive Auto-associative Memory by Segmenting Data,

2005. No 1192 Nicklas Bergfeldt: Towards Detached Communication for Robot Cooperation, 2005. No 1194 Dennis Maciuszek: Towards Dependable Virtual Companions for Later Life, 2005. No 1204 Beatrice Alenljung: Decision-making in the Requirements Engineering Process: A Human-centered Approach,

2005. No 1206 Anders Larsson: System-on-Chip Test Scheduling and Test Infrastructure Design, 2005. No 1207 John Wilander: Policy and Implementation Assurance for Software Security, 2005. No 1209 Andreas Käll: Översättningar av en managementmodell - En studie av införandet av Balanced Scorecard i ett

landsting, 2005. No 1225 He Tan: Aligning and Merging Biomedical Ontologies, 2006. No 1228 Artur Wilk: Descriptive Types for XML Query Language Xcerpt, 2006. No 1229 Per Olof Pettersson: Sampling-based Path Planning for an Autonomous Helicopter, 2006. No 1231 Kalle Burbeck: Adaptive Real-time Anomaly Detection for Safeguarding Critical Networks, 2006. No 1233 Daniela Mihailescu: Implementation Methodology in Action: A Study of an Enterprise Systems Implementation

Methodology, 2006. No 1244 Jörgen Skågeby: Public and Non-public gifting on the Internet, 2006. No 1248 Karolina Eliasson: The Use of Case-Based Reasoning in a Human-Robot Dialog System, 2006. No 1263 Misook Park-Westman: Managing Competence Development Programs in a Cross-Cultural Organisation - What

are the Barriers and Enablers, 2006. FiF-a 90 Amra Halilovic: Ett praktikperspektiv på hantering av mjukvarukomponenter, 2006. No 1272 Raquel Flodström: A Framework for the Strategic Management of Information Technology, 2006. No 1277 Viacheslav Izosimov: Scheduling and Optimization of Fault-Tolerant Embedded Systems, 2006. No 1283 Håkan Hasewinkel: A Blueprint for Using Commercial Games off the Shelf in Defence Training, Education and

Research Simulations, 2006. FiF-a 91 Hanna Broberg: Verksamhetsanpassade IT-stöd - Designteori och metod, 2006. No 1286 Robert Kaminski: Towards an XML Document Restructuring Framework, 2006. No 1293 Jiri Trnka: Prerequisites for data sharing in emergency management, 2007. No 1302 Björn Hägglund: A Framework for Designing Constraint Stores, 2007. No 1303 Daniel Andreasson: Slack-Time Aware Dynamic Routing Schemes for On-Chip Networks, 2007. No 1305 Magnus Ingmarsson: Modelling User Tasks and Intentions for Service Discovery in Ubiquitous Computing,

2007. No 1306 Gustaf Svedjemo: Ontology as Conceptual Schema when Modelling Historical Maps for Database Storage, 2007. No 1307 Gianpaolo Conte: Navigation Functionalities for an Autonomous UAV Helicopter, 2007. No 1309 Ola Leifler: User-Centric Critiquing in Command and Control: The DKExpert and ComPlan Approaches, 2007. No 1312 Henrik Svensson: Embodied simulation as off-line representation, 2007. No 1313 Zhiyuan He: System-on-Chip Test Scheduling with Defect-Probability and Temperature Considerations, 2007. No 1317 Jonas Elmqvist: Components, Safety Interfaces and Compositional Analysis, 2007. No 1320 Håkan Sundblad: Question Classification in Question Answering Systems, 2007. No 1323 Magnus Lundqvist: Information Demand and Use: Improving Information Flow within Small-scale Business

Contexts, 2007. No 1329 Martin Magnusson: Deductive Planning and Composite Actions in Temporal Action Logic, 2007. No 1331 Mikael Asplund: Restoring Consistency after Network Partitions, 2007. No 1332 Martin Fransson: Towards Individualized Drug Dosage - General Methods and Case Studies, 2007. No 1333 Karin Camara: A Visual Query Language Served by a Multi-sensor Environment, 2007. No 1337 David Broman: Safety, Security, and Semantic Aspects of Equation-Based Object-Oriented Languages and

Environments, 2007. No 1339 Mikhail Chalabine: Invasive Interactive Parallelization, 2007. No 1351 Susanna Nilsson: A Holistic Approach to Usability Evaluations of Mixed Reality Systems, 2008. No 1353 Shanai Ardi: A Model and Implementation of a Security Plug-in for the Software Life Cycle, 2008. No 1356 Erik Kuiper: Mobility and Routing in a Delay-tolerant Network of Unmanned Aerial Vehicles, 2008. No 1359 Jana Rambusch: Situated Play, 2008. No 1361 Martin Karresand: Completing the Picture - Fragments and Back Again, 2008. No 1363 Per Nyblom: Dynamic Abstraction for Interleaved Task Planning and Execution, 2008. No 1371 Fredrik Lantz: Terrain Object Recognition and Context Fusion for Decision Support, 2008. No 1373 Martin Östlund: Assistance Plus: 3D-mediated Advice-giving on Pharmaceutical Products, 2008. No 1381 Håkan Lundvall: Automatic Parallelization using Pipelining for Equation-Based Simulation Languages, 2008. No 1386 Mirko Thorstensson: Using Observers for Model Based Data Collection in Distributed Tactical Operations, 2008. No 1387 Bahlol Rahimi: Implementation of Health Information Systems, 2008. No 1392 Maria Holmqvist: Word Alignment by Re-using Parallel Phrases, 2008. No 1393 Mattias Eriksson: Integrated Software Pipelining, 2009. No 1401 Annika Öhgren: Towards an Ontology Development Methodology for Small and Medium-sized Enterprises,

2009. No 1410 Rickard Holsmark: Deadlock Free Routing in Mesh Networks on Chip with Regions, 2009. No 1421 Sara Stymne: Compound Processing for Phrase-Based Statistical Machine Translation, 2009. No 1427 Tommy Ellqvist: Supporting Scientific Collaboration through Workflows and Provenance, 2009. No 1450 Fabian Segelström: Visualisations in Service Design, 2010. No 1459 Min Bao: System Level Techniques for Temperature-Aware Energy Optimization, 2010.

No 1466 Mohammad Saifullah: Exploring Biologically Inspired Interactive Networks for Object Recognition, 2011. No 1468 Qiang Liu: Dealing with Missing Mappings and Structure in a Network of Ontologies, 2011. No 1469 Ruxandra Pop: Mapping Concurrent Applications to Multiprocessor Systems with Multithreaded Processors and Network on Chip-Based Interconnections, 2011. No 1476 Per-Magnus Olsson: Positioning Algorithms for Surveillance Using Unmanned Aerial Vehicles, 2011. No 1481 Anna Vapen: Contributions to Web Authentication for Untrusted Computers, 2011. No 1485 Loove Broms: Sustainable Interactions: Studies in the Design of Energy Awareness Artefacts, 2011. FiF-a 101 Johan Blomkvist: Conceptualising Prototypes in Service Design, 2011. No 1490 Håkan Warnquist: Computer-Assisted Troubleshooting for Efficient Off-board Diagnosis, 2011. No 1503 Jakob Rosén: Predictable Real-Time Applications on Multiprocessor Systems-on-Chip, 2011. No 1504 Usman Dastgeer: Skeleton Programming for Heterogeneous GPU-based Systems, 2011. No 1506 David Landén: Complex Task Allocation for Delegation: From Theory to Practice, 2011. No 1507 Kristian Stavåker: Contributions to Parallel Simulation of Equation-Based Models on

Graphics Processing Units, 2011. No 1509 Mariusz Wzorek: Selected Aspects of Navigation and Path Planning in Unmanned Aircraft Systems, 2011. No 1510 Piotr Rudol: Increasing Autonomy of Unmanned Aircraft Systems Through the Use of Imaging Sensors, 2011. No 1513 Anders Carstensen: The Evolution of the Connector View Concept: Enterprise Models for Interoperability Solutions in the Extended Enterprise, 2011. No 1523 Jody Foo: Computational Terminology: Exploring Bilingual and Monolingual Term Extraction, 2012. No 1550 Anders Fröberg: Models and Tools for Distributed User Interface Development, 2012. No 1558 Dimitar Nikolov: Optimizing Fault Tolerance for Real-Time Systems, 2012. No 1586 Massimiliano Raciti: Anomaly Detection and its Adaptation: Studies on Cyber-Physical Systems, 2013.
