ENERGY EFFICIENT WIRELESS SENSOR NETWORKS BASED ON
MACHINE LEARNING
MOHAMMAD ABDULAZIZ ALWADI
A Thesis Submitted for the
Degree of Doctor of Philosophy
Faculty of Education Science Technology and Mathematics
October 2015
In the name of Allah, the most Merciful, the most compassionate
"رب اشرح لي صدري * ويسر لي أمري * واحلل عقدة من لساني * يفقهوا قولي" (سورة طه 25-28)
My Lord, expand my chest, and ease my task for me. (Grant me self-confidence,
contentment, and boldness.)
Unloose the knot upon my tongue,
that they may understand my speech. (Quran 20:25-28)
I dedicate this to My Mother and My Father, My wife EMAN and My Son
OMAR.
Abstract
Wireless sensor networks (WSNs) have become a focus of intensive research in recent years,
especially for monitoring and characterizing large physical environments and for tracking
environmental or physical conditions such as temperature, pressure, wind and humidity.
WSNs can be used in many applications, such as wildlife monitoring, military target tracking
and surveillance, hazardous environment exploration, and natural disaster relief. Given the
huge amount of sensed data, classifying it automatically becomes a critical task in many of
these applications. Energy efficiency is a key issue in WSNs, where energy sources and battery
capacity are very limited. To address some of the key WSN challenges, a novel integrated
framework for achieving energy efficiency is proposed, consisting of three stages of modelling
from data. The first stage is a joint energy efficiency and event detection model, in which a
novel sensor node selection technique conserves the energy in the wireless sensor network and
at the same time maximizes the event recognition performance. The scheme uses fewer sensor
nodes at a time and places unwanted sensor nodes in sleep mode. A novel objective quantitative
metric, the life time extension factor (LTEF), is proposed to assess the energy efficiency
achieved. Extensive experimental evaluation shows that this joint scheme selects the most
significant and influential sensor nodes for participation in different WSN tasks, and
contributes significantly towards energy savings and event detection accuracy. Because the WSN
needs to adapt dynamically to the state of the environment being monitored, the number of
sensor nodes participating in the routing tree cannot remain fixed and must adapt in order to
monitor and predict the physical environment accurately. The second stage of the framework
therefore proposes adaptive models for sensor selection and classifier learning that achieve
energy efficiency and prediction accuracy based on specified performance targets. The third
stage is a joint energy efficiency and adaptive routing model, in which an appropriate sensor
selection and adaptive routing strategy addresses the WSN challenges of energy efficiency,
prediction accuracy, and MAC layer adaptation. We show that this joint model also meets
non-functional performance targets, such as tolerance to missing or faulty sensors and the
model building time needed for adaptation of the routing protocol.
Acknowledgments
I would like to express my sincere appreciation to all the staff at the University of Canberra and
Dr. Girija Chetty for all her support and assistance in the preparation of this thesis. My sincere
appreciation to all my Family, Dad, Mum, wife and My Son for supporting me and their
encouragement throughout my life.
Table of Contents
FORM B
CERTIFICATE OF AUTHORSHIP OF THESIS
KEY TERMS
CHAPTER 1 INTRODUCTION
1.1 Introduction
1.2 Significance and Motivation
1.3 Background
1.4 Research Questions
1.5 Thesis Contributions
1.6 Publications
1.7 Organisation of Thesis
CHAPTER 2 RELATED WORK AND LITERATURE REVIEW
2.1 Machine Learning Based Approaches
2.2 Related Work on Machine Learning for WSNs
2.2.1 Supervised Machine Learning
2.2.2 Unsupervised Machine Learning
2.2.3 Reinforcement Machine Learning
2.3 Operational Challenges
2.3.1 WSN Routing Issues
2.3.2 Data Collection and Clustering Issues
2.3.3 Event Recognition & Query Processing Issues
2.3.4 Challenges Related to Localisation and Object Targeting
2.3.5 Medium Access Control (MAC) Issues
2.4 Non-operational Aspects of WSN
2.4.1 Security and Anomaly Intrusion Detection
2.4.2 Data Integrity, Fault Detection, and QoS Enhancement
2.4.3 Application Specific Unique Challenges
2.5 Research Gap in Wireless Sensor Networks Based on Machine Learning/Data Mining Techniques
2.5.1 Better Methods for Selecting Sensors
2.5.2 Adaptive and Distributed Machine Learning Approaches For WSNs
2.5.3 Managing Resources Using Machine Learning
2.5.4 Spatio-Temporal Correlation Detection
2.6 Research Plan and Thesis Road Map
CHAPTER 3 JOINT SENSOR SELECTION - EVENT DETECTION SCHEME
3.1 Introduction
3.2 Joint Energy Efficiency - Event Detection Scheme
3.2.1 Energy Efficiency with Feature Ranking Algorithm
3.2.2 Naïve Bayes Machine Learning Classifier Algorithm
3.3 Experimental Validation
3.3.1 Experiment 1 (Isolet Data set)
3.3.2 Experiment 2 (Ionosphere Data set)
3.3.3 Experiment 3 (Forest Cover Type Data set)
3.3.4 Experiment 4 (Forest Fires Data set)
3.4 Chapter Summary
CHAPTER 4 ADAPTIVE MODELS FOR ENERGY EFFICIENCY
4.1 Introduction
4.2 Adaptive Classifier Model Based Scheme
4.2.1 Data set Description
4.2.2 Classification Algorithms
4.2.3 Experimental Evaluation
4.3 Discussion
4.4 Adaptive Classifier Scheme with Gas Sensor Drift Dataset
4.4.1 Experimental Validation with Gas Drift Dataset
4.4.2 Experimental Validation with Gas Drift Dataset using Ensemble Learning for Weak Classifiers
4.5 Chapter Summary
CHAPTER 5 JOINT SENSOR SELECTION - ADAPTIVE ROUTING MODEL
5.1 Introduction
5.2 Intel Berkeley Lab WSN dataset
5.3 Intel Lab data file versus Intel Lab data file restructured for experiments
5.4 Sensor Selection and Adaptive Routing Model
5.5 Experimental Results and Discussion
5.6 Chapter Summary
CHAPTER 6 CONCLUSIONS AND FUTURE DIRECTIONS
BIBLIOGRAPHY
List of Figures
FIGURE 1 A TYPICAL WIRELESS SENSOR NETWORK [6]
FIGURE 2 TAXONOMY OF ENERGY EFFICIENT APPROACHES FOR WIRELESS SENSOR NETWORKS
FIGURE 3 ESTIMATING NODE LOCALIZATION CO-ORDINATES IN WSN USING NEURAL NETWORKS [82]
FIGURE 4 SCHEMATIC OF SVM CLASSIFICATION PROCESS [91]
FIGURE 5 TWO DIMENSIONAL VISUALIZATION OF PCA PROCESS [103]
FIGURE 6 VISUALIZATION OF Q-LEARNING ALGORITHM [108]
FIGURE 7 SIMPLIFIED NETWORK ROUTING BASED ON MACHINE LEARNING [46]
FIGURE 8 VISUALIZATION OF Q-LEARNING ALGORITHM [118]
FIGURE 9 EVENT DETECTION AND QUERY PROCESSING USING MACHINE LEARNING [46]
FIGURE 10 HMM AND NAÏVE BAYES EVENT DETECTION AND QUERY PROCESSING [132]
FIGURE 11 LOCALIZATION USING BEACON NODES IN WSN [82]
FIGURE 12 ALOHA-QIR SCHEME FOR MAC LAYER IN WSN [152]
FIGURE 13 ADAPTIVE DECISION TREE BASED MAC PROTOCOL (SAML) [155]
FIGURE 14 BASIC CONCEPTS OF ANOMALY INTRUSION DETECTION [54]
FIGURE 15 WSN BASED Q-LEARNING FOR OBJECT TRACKING APPLICATION [174]
FIGURE 16 BLOCK SCHEMATIC FOR JOINT ENERGY EFFICIENCY - EVENT DETECTION SCHEME
FIGURE 17 SENSOR SELECTION AND RANKING ALGORITHM
FIGURE 18 EVENT DETECTION ACCURACY VS. LIFE TIME EXTENSION FACTOR (LTEF) (ISOLET 5 DATA SET)
FIGURE 19 ACCURACY AND LIFE TIME EXTENSION FACTOR (IONOSPHERE)
FIGURE 20 ACCURACY AND LIFE TIME EXTENSION FACTOR (FOREST COVER TYPE DATA SET)
FIGURE 21 ACCURACY AND LIFE TIME EXTENSION FACTOR FOR FOREST FIRES DATA SET
FIGURE 22 ADAPTIVE FEATURE SELECTION AND CLASSIFIER MODEL FOR ENERGY EFFICIENCY
FIGURE 23 PERFORMANCE OF CLASSIFIERS WITH 10 FOLDS CROSS VALIDATION
FIGURE 24 PERFORMANCE OF CLASSIFIERS WITH FULL TRAINING SET
FIGURE 25 PERFORMANCE OF CLASSIFIERS WITH FEATURE SELECTION
FIGURE 26 PERFORMANCE OF CLASSIFIERS WITH FEATURE SELECTION ON FULL TRAINING SET
FIGURE 27 COMPARATIVE CLASSIFIER PERFORMANCE
FIGURE 28 GAS DRIFTS SUMMARY OF EXPERIMENTAL RESULTS
FIGURE 29 INTEL BERKELEY WIRELESS SENSOR NETWORK DATA SET: LOCATION OF 54 SENSORS IN AN AREA OF 1200 M²
FIGURE 30 JOINT SENSOR SELECTION - ADAPTIVE ROUTING MODEL
FIGURE 31 INTEL LAB MAIN SOURCE FILE STRUCTURE
FIGURE 32 SAMPLE FILES TEMPERATURE READINGS 35, 2700 AND 5400 SAMPLES
FIGURE 33 SAMPLE FILES TEMPERATURE READINGS 35, 2700 AND 5400 SAMPLES
FIGURE 34 TEMPERATURE SENSOR SELECTION MAP FOR 3 EXPERIMENT SCENARIOS 1, 2 AND 3
FIGURE 35 HUMIDITY SENSOR SELECTION MAP FOR 3 EXPERIMENT SCENARIOS 1, 2 AND 3
FIGURE 36 TEMPERATURE EXPERIMENT 1, 2 AND 3 RESULTS
FIGURE 37 HUMIDITY EXPERIMENT 1, 2 AND 3 RESULTS
FIGURE 38 TEMPERATURE, ROOT MEAN SQUARE ERROR
FIGURE 39 HUMIDITY, ROOT MEAN SQUARE ERROR
FIGURE 40 TIME TAKEN TO BUILD THE MODEL, TEMPERATURE
FIGURE 41 TIME TAKEN TO BUILD THE MODEL, HUMIDITY
List of Tables
TABLE 1 DATA SETS FOR EXPERIMENTAL VALIDATION
TABLE 2 FEATURES SELECTED IN ISOLET 5
TABLE 3 NAÏVE BAYES CLASSIFIER PERFORMANCE
TABLE 4 RESULTS OF EXPERIMENT 1
TABLE 5 EXPERIMENT 1 ACCURACY WITH SENSOR FAILURE PROBABILITY
TABLE 6 EXPERIMENT 2 FEATURES SELECTED AND RANKED ON IONOSPHERE DATASET
TABLE 7 EXPERIMENT 2 ACCURACY
TABLE 8 EXPERIMENT 2 RESULTS
TABLE 9 EXPERIMENT 3 FEATURES RANKED AND SELECTED FOR FOREST COVER TYPE DATASET
TABLE 10 EXPERIMENT 3 ACCURACY AND LIFE TIME EXTENSION FACTOR
TABLE 11 EXPERIMENT 3 RESULTS
TABLE 12 SELECTED FEATURES ON FOREST FIRES DATASET
TABLE 13 EXPERIMENT 2 ACCURACY FOREST FIRES DATA SET
TABLE 14 EXPERIMENT 4 RESULTS
TABLE 15 FOREST COVER TYPE ORIGINAL DATA SET AND SUBSET DATA SET DESCRIPTION
TABLE 16 GAS SENSOR ARRAY DRIFT DATA SET DESCRIPTION
TABLE 17 PERFORMANCE OF GAS DRIFTS SENSOR DATASET
TABLE 18 ENSEMBLE LEARNING ON GAS DRIFT SENSOR ARRAY DATA SET
TABLE 19 INTEL LAB DATA SET FILE SCHEMA
TABLE 20 TEMPERATURE RESULTS FROM THREE EXPERIMENTS SCENARIOS
TABLE 21 HUMIDITY RESULTS FROM THREE EXPERIMENTS SCENARIOS
Form B
Certificate of Authorship of Thesis
Except where clearly acknowledged in footnotes, quotations and the bibliography, I certify
that I am the sole author of the thesis submitted today entitled
ENERGY EFFICIENT WIRELESS SENSOR NETWORKS BASED ON MACHINE
LEARNING.
(Thesis title)
I further certify that to the best of my knowledge the thesis contains no material previously
published or written by another person except where due reference is made in the text of the
thesis.
The material in the thesis has not been the basis of an award of any other degree or diploma
except where due reference is made in the text of the thesis.
The thesis complies with University requirements for a thesis as set out in Gold Book Part 7:
Examination of Higher Degree by Research Theses Policy, Schedule Two (S2). Refer to
http://www.canberra.edu.au/research-students/goldbook
Signature of Candidate
........................................................................
Signature of chair of the supervisory panel
Date: 12/7/15
Key Terms
Sensor network
Wireless sensor network
Wired sensor network
Data mining
Classification
Feature selection
Data set
Attributes
Physical environment
Environment Monitoring
Environment characterization
Source node
Sink node
Sensor Failure
Active mode
Sleep mode
Accuracy
Life time extension factor
Energy Efficiency
WEKA data mining software
UCI Repository
Intel lab Wireless sensor network
Mote ID
Root Mean squared error
Feature ranking Algorithm
Feature selection Algorithm
Intelligent monitoring
Intel Berkeley lab
Routing approach
Routing map
Ensemble Learning
Trade off
Simulation tools
Chapter 1
Chapter 1 Introduction
1.1 Introduction
The real-world physical environment contains large and diverse information sources, such as
light, temperature, motion, seismic waves, and many others. Understanding the environment
requires capturing information from multiple disparate sources, and a wireless sensor network
is an easy-to-deploy infrastructure for capturing such rich information.
A wireless sensor network (WSN) consists of spatially distributed autonomous sensors to
monitor the physical environment, and to co-operatively pass their data through the network to a
main node or central location (base station). Modern wireless sensor networks are bi-directional,
allowing transmission of information being monitored from nodes to central node or base station,
as well as enabling control of sensor activity from base station to sensors. The development of
wireless sensor networks was motivated primarily by military applications such as battlefield
surveillance; but today such networks are used in many industrial and consumer applications,
such as industrial process monitoring and control, machine health monitoring, environmental
detection, and habitat monitoring. A WSN is built of "nodes" (sometimes called motes), from a
few to several hundred or even thousands, where each node is connected to one
(or sometimes several) sensors. Each such sensor network node typically has several parts: a
radio transceiver with an internal antenna or connection to an external antenna, a microcontroller,
an electronic circuit for interfacing with the sensors and an energy source, usually a battery or an
embedded form of energy harvesting. A sensor node might vary in size from that of a shoebox
down to the size of a grain of dust, although functioning "motes" of genuine microscopic
dimensions have yet to be created. The cost of sensor nodes is similarly variable, ranging from a
few to hundreds of dollars, depending on the complexity of the individual sensor nodes. Size and
cost constraints on sensor nodes result in corresponding constraints on resources such as energy,
memory, computational speed and communications bandwidth. The topology of the WSNs can
vary from a simple star network to an advanced multi-hop wireless mesh network. Data can be
propagated between the hops of the network by either routing or flooding [1, 2].
A wireless sensor network can be used for various applications. Some of the most useful are
summarized below:
1. Habitat/Area monitoring: Area monitoring is a common application of WSNs. In area
monitoring, the WSN is deployed over a region where some phenomenon is to be
monitored. A military example is the use of sensors to detect enemy intrusion; a civilian
example is the geo-fencing of gas or oil pipelines. When the sensors detect the event
being monitored (heat, pressure), the event is reported to one of the base stations, which
then takes appropriate action (e.g., sending a message over the Internet or to a satellite).
Similarly, wireless sensor networks can use a range of sensors to detect the presence of
vehicles ranging from motorcycles to trains and cars.
2. Environmental/Earth monitoring: The term Environmental Sensor Networks [3] has
evolved to cover many applications of WSNs to earth science research. This includes
sensing volcanoes, oceans, glaciers and forests.
3. Critical Events/Forest fire detection: A network of sensor nodes can be installed in a
forest to detect when a fire has started. The nodes can be equipped with sensors to
measure temperature, humidity and gases which are produced by fire in the trees or
vegetation. Early detection is crucial as it will allow protection of highly valued
resources.
4. Data Logging: Wireless sensor networks are also used to collect data for monitoring
information from the environment. Examples range from monitoring the temperature in a
fridge to monitoring the level of water in overflow tanks in nuclear power plants.
As outlined above, a wide spectrum of applications, ranging from habitat monitoring to battlefield
surveillance, can benefit from deploying wireless sensor network (WSN) technology [1,
2]. The benefits include low cost, easy deployment, high fidelity sensing, and
self-organization, among others [2]. However, despite the many opportunities that
wireless sensor networks provide, using WSN technology comes with great challenges. These
challenges are associated with the following characteristics of wireless sensor networks:
1. Power consumption constraints for nodes using batteries or energy harvesting.
2. Ability to cope with node failures.
3. Mobility of nodes.
4. Communication failures.
5. Scalability to large scale of deployment.
6. Ability to withstand harsh environmental conditions.
7. Ease of use.
Of these characteristics, the need to operate under severe resource constraints is one of the
biggest challenges for WSNs, which makes efficient design essential.
1.2 Significance and Motivation
A wireless sensor network, or WSN for short, is a large-scale network of wirelessly
interconnected transducer devices called sensor nodes or "motes". A sensor node, as the name
implies, can have one or more sensor modules for sensing quantities such as light, temperature,
humidity, pressure, and sound. In addition, each sensor node can include four other components:
memory, processing, communication, and battery modules. The first use of sensor networks can be
traced back to the Cold War era, when distributed networks of radars and hydrophones were
deployed to monitor the skies and oceans, respectively [4]. Contemporary monitoring networks
use tiny, resource-constrained sensor nodes.
The field of wireless sensor networks (WSN) has become a focus of intensive research in recent
years and various theoretical and practical questions have been addressed. It has drawn a lot of
attention as a result of the possibility of coupling these devices with their surroundings. Well
beyond their direct use, such as surveillance and environmental monitoring, WSNs can help us
pursue one of the ultimate goals in information technology, namely ambient intelligence [5]. The
small size and wireless communication capability of sensor nodes in a WSN provide us not only
with information about the physical world around us, but also with the flexibility to integrate
them deeply within building materials and fabrics, and to embed them in inaccessible or hostile
locations in real-world operating scenarios. Using wireless sensor networks, we can develop
automated intelligent systems that co-operate with each other to exchange information
about their internal states and the conditions of the physical environment around them,
provide services to users, and prevent disasters with better efficiency and robustness, without any
human intervention [5].
The evolution of sensor networks has extended the computing horizon from desktop computing
to computing across the entire physical environment (ambient computing). As a result, the
user-driven model of traditional computing has shifted to an event-driven model in sensor networks.
Notably, in the event-driven model the volume of data generated by stimuli from
environmental phenomena exceeds the rate of any user input many times over. The traditional
model for interpreting this large volume of measurements normally involves sending large
amounts of sensory data to a base station for analysis. The collected data may be processed
locally before being sent into the network, and intermediate sensor nodes may process it
further. Finally, the sensory data is integrated centrally at the base station
to infer the status of the observed environment. The base station performs
optimal detection and tracking based on conventional signal processing methods.
This traditional model, however, suffers from many limitations due to resource constraints and
bandwidth limitations. The computational power and speed of the base station computers can
become a processing bottleneck, and a failure of the base station can cause total system failure.
Further, relaying all sensory data from geographically dispersed sensor nodes to a centralized
base station is generally inefficient, as it requires significant communication overhead,
leading to resource depletion and a shorter network lifetime.
Several research works in the past [9], [10], [11], [12], [13] tried to address these challenges
using methods drawn from signal communication theory in telephony and telegraphy, where the
main purpose is the reliable transmission of data over noisy channels. However,
these approaches did not appear to work well for wireless sensor networks, as the purpose of
WSNs is not just the reliable transmission of data from sender to receiver, but also the detection
of catastrophic events, such as earthquakes, tsunamis and forest fires, and of land cover usage,
from large sets of sensory data. Most of the current methods focus on solving the
local, short-term problem of enhancing the communication capacity between nodes or managing
resources efficiently for a small WSN, with studies conducted on simulated setups.
Interpreting catastrophic global events from large volumes of data is a challenging task, and
research efforts need to focus on developing novel approaches that improve the accuracy and
quality with which high-level information, such as physical environment events, is detected
where the WSN is deployed, in addition to reducing the amount of data and the
energy consumption of the sensor nodes in the network. Reducing energy
consumption is one of the most important requirements, as there is no continuous power supply
for battery-powered sensors in WSNs deployed in the field. The lifetime of a sensor is
severely restricted by its very limited power source. Keeping energy consumption at the
lowest possible level is therefore one of the key requirements.
1.3 Background
Figure 1 shows a typical wireless sensor network [6], consisting of a base station and a
collection of sensor nodes (also called "motes"). Although WSNs are anticipated to be deployed
densely, with thousands of nodes, some current deployments range from ten to hundreds of sensor nodes.
Generally, in sensor networks for environmental monitoring and surveillance applications, the
events of interest occur rarely and suddenly. The network traffic is therefore typically very low.
However, when an event of interest occurs, the traffic increases abruptly, because large
amounts of sensory data from various sensor nodes are conveyed to the base station. To ensure
that the phenomenon of interest is captured properly and accurately, sensor nodes are deployed
densely. Dense deployment not only ensures coverage and communication but also
tolerates node failures.
Figure 1 A typical wireless sensor network [6]
Deploying sensor nodes densely, in close proximity to each other, can result in overlapping
coverage and communication. As a result, sensory measurements can contain high correlations
and redundancies. For instance, when the sensing ranges of two nodes cover the same area, both
sensor nodes are likely to transmit identical sensory data. Although this makes the
sensor network robust to node failures and noisy sensory measurements, it causes the sensor
nodes to consume precious battery resources conveying redundant data. One effective
approach to controlling the redundant data being communicated is to adjust the physical location of
sensor nodes so as to minimize the overlap in their sensing ranges. However, adjusting the
location of sensor nodes is not always possible, especially in
applications that require ad hoc and random deployment, e.g. battlefield and emergency
applications. Hence, redundant data preferably needs to be detected and removed.
One effective approach to minimizing this energy consumption is the use of an appropriate
communication protocol, called "multi-hop communication". With a multi-hop protocol,
sensory data travels to the base station in several "hops" through neighbouring nodes, instead
of being delivered directly over a maximum-range radio link. This is usually
better in terms of energy consumption. Multi-hop communication not only
routes the data through several intermediate nodes, but also allows some node-level processing,
such as removing data redundancy or combining the data with data from other nodes. This
behaviour, called in-network processing, can contribute significantly to maximizing the longevity
of the WSN in two ways. First, since the events of interest occur rarely, idle
battery-powered sensor nodes located in inaccessible locations can be switched off to conserve
energy. Second, the processing capabilities of sensor nodes on the multi-hop path can help reduce the
volume of data transmitted in the network.
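As a rough illustration of why multi-hop delivery is usually, but not always, cheaper than a
single long-range link, the sketch below compares the two under a first-order radio energy model.
The model and its constants (50 nJ/bit for the electronics, 100 pJ/bit/m^2 for the amplifier,
path loss growing with the square of distance) are assumptions borrowed from the wider WSN
literature, not values from this thesis, and the function names are hypothetical.

```python
# Illustrative first-order radio model; all constants are assumed values,
# not measurements reported in this thesis.
E_ELEC = 50e-9      # J/bit spent in transmitter/receiver electronics (assumed)
EPS_AMP = 100e-12   # J/bit/m^2 spent in the transmit amplifier (assumed)


def tx_energy(bits: int, distance_m: float) -> float:
    """Energy to transmit `bits` over `distance_m` with d^2 path loss."""
    return bits * (E_ELEC + EPS_AMP * distance_m ** 2)


def rx_energy(bits: int) -> float:
    """Energy to receive `bits`."""
    return bits * E_ELEC


def direct_energy(bits: int, distance_m: float) -> float:
    """One maximum-range transmission straight to the base station."""
    return tx_energy(bits, distance_m)


def multi_hop_energy(bits: int, distance_m: float, hops: int) -> float:
    """`hops` equal-length transmissions; each relay also receives the packet.

    Reception at the base station is not counted, on the assumption that
    the base station is not energy constrained.
    """
    per_hop = distance_m / hops
    return hops * tx_energy(bits, per_hop) + (hops - 1) * rx_energy(bits)


if __name__ == "__main__":
    packet = 2000  # bits, hypothetical packet size
    for dist in (10.0, 500.0):
        print(dist, direct_energy(packet, dist), multi_hop_energy(packet, dist, 5))
```

Under these assumed constants, five hops cost less than a direct link over 500 m but more over
10 m, where the fixed electronics cost of each extra hop dominates the amplifier saving.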
In order to predict the energy costs of different algorithms and protocols, and develop energy
efficient techniques, it is important to have an accurate understanding of the amount of energy
consumed at the sensor nodes. There are several sources of power consumption in a typical sensor
node, such as:
1. Sensor start up power,
2. Signal sampling rate,
3. Physical signal-to-electrical conversion,
4. Signal conditioning, and
5. Analogue-to-digital conversion.
In general, the power consumed in the sensors by the processing stages listed above
is negligible compared to the energy consumed in communicating the signal. To manage
the power consumed by a sensor node, for different types of sensors including temperature, photo
resistor, barometric pressure, humidity, passive infrared sensors, sonar rangers, and array sensors,
the processor within the sensor node supports several operating modes, including active and
sleep modes. In sleep mode a sensor node suspends all its activities and shuts
down almost all of its components. The power consumption of Berkeley motes [7], an example
wireless sensor node, is 8 milliwatts in active mode and 75 microwatts in sleep mode (roughly 100
times less power). The energy consumed in communicating data
between sensor nodes is much higher than the energy consumed within the nodes themselves. In
general, a sensor node radio has four communication modes: transmit, receive, idle, and sleep.
In transmit mode, the energy consumption depends on the data rate (40kbps, 38.4kbps, and 250kbps). Other
factors associated with the power consumption and performance of a radio component (wireless
node) include the type of modulation scheme used, the choice of antenna, and the duty cycle. The
receive mode of a sensor node also consumes a lot of energy, and radios often provide an
additional operating mode, called the "idle" mode. The idle mode is different from the sleep mode. In sleep mode all the
radio components within the sensor node are completely shut down (larger energy saving),
whereas in idle mode a sensor node switches off all its components except the receive radio
antenna.
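The active and sleep power figures quoted above for the Berkeley mote (8 milliwatts and 75
microwatts) can be turned into a small worked example. The sketch below computes the
active-to-sleep power ratio and the average power under a duty cycle. The duty-cycle values and
the 10 kJ battery energy are hypothetical, chosen only for illustration, and the lifetime
estimate is idealised: it ignores radio transmission and battery self-discharge.

```python
P_ACTIVE_W = 8e-3   # active-mode power quoted in the text (8 milliwatts)
P_SLEEP_W = 75e-6   # sleep-mode power quoted in the text (75 microwatts)


def average_power(duty_cycle: float) -> float:
    """Mean power when the node is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie in [0, 1]")
    return duty_cycle * P_ACTIVE_W + (1.0 - duty_cycle) * P_SLEEP_W


def lifetime_days(battery_joules: float, duty_cycle: float) -> float:
    """Idealised lifetime: battery energy divided by mean power."""
    return battery_joules / average_power(duty_cycle) / 86400.0


if __name__ == "__main__":
    print(f"active/sleep power ratio: {P_ACTIVE_W / P_SLEEP_W:.0f}")
    battery = 10_000.0  # J, hypothetical battery energy
    for duty in (1.0, 0.1, 0.01):
        print(duty, average_power(duty), lifetime_days(battery, duty))
```

The lower the duty cycle, the closer the average power gets to the sleep-mode floor, which is
why keeping nodes asleep for as long as possible has such a large effect on lifetime.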
Figure 2 Taxonomy of energy efficient approaches for wireless sensor networks.
For environment monitoring using wireless sensor networks, many applications are expected to
run continuously and unattended for several days or months. However, sensor nodes
are constrained by limited energy resources. Since communication between
sensors, and from sensors to the central base station, consumes more energy than
processing within the individual sensor nodes, the design of sensor data processing
and collection techniques that limit the amount of transmitted data continues
to be an important and central issue for the diffusion of wireless sensor network technology in
real-world applications, particularly civilian operating scenarios, in spite of the development
of several protocol-driven approaches, such as multi-hop communication protocols. Figure 2 shows a
taxonomy of some of the earliest energy-efficient methods for WSNs, divided into information
processing-based and sleep mode-based methods. The sleep mode-based approaches conserve
energy by keeping as many nodes as possible in sleep mode for as long as possible. In the
information processing-based approaches, energy is saved by reducing the
amount of data communicated in the network through intermediate node-level processing.
The authors in [8] proposed an information fusion-based approach that saves energy by
node-level processing of data to reduce the communication load; this approach also determines the
routing strategy. An approach based on collaborative routing is proposed by the authors in [9]. The
authors show that this approach, called CRAWL, adapts to a non-uniform distribution of
available energy in sensor networks. Collaborative and non-collaborative algorithms perform
equally well when the available energy distribution is uniform, but when the distribution is
non-uniform, collaborative algorithms are found to give a 20.2% longer network life. To achieve
this, the authors in [9] propose different node scheduling options:
1. Initial network with all surviving nodes.
2. Uneven distribution of surviving sensor nodes.
3. More uniform distribution of surviving sensor nodes.
4. Optimal distribution for the last four surviving nodes for area coverage.
The wireless sensor network is fully effective when all of the sensor nodes are alive and
cover the entire region of interest. The CRAWL algorithm, with collaborative and
non-collaborative scheduling, can increase WSN scalability and adaptability, and was suggested
by its authors as the next generation WSN energy management scheme.
A handoff algorithm for conserving energy was proposed by the authors in [10]. One
important characteristic of wireless systems is the signal variation caused by the movement of
mobile stations. The existing radio link between a base station and a mobile station may
degrade or terminate because of the motion of the mobile terminal, and it then becomes necessary
to switch, or hand off, the communication link
from one base station to another. This ensures that signal quality is maintained and that the
interference caused to other radio links is minimized, leading to energy-efficient management of
WSN nodes.
A frequency hopping Spread Spectrum (FHSS) technique was proposed by authors in [11] for
managing the energy in WSN, where the transmitter broadcasts on one frequency for a small
amount of time then switches to another frequency using a known switching algorithm called as
hopping or hopping pattern. The receiver knows the same hopping code so it is able to slide the
code past the incoming signal until it synchronies with the sender. Once they are synchronized,
the transmitter and receiver follow the hopping code to switch frequencies and communicate.
The resulting transmission is spread over a large frequency range and therefore appears as noise
to other receivers unless they know (or can decipher) the hopping code. Four different algorithms
were proposed to decipher the hopping codes:
1. Brute-force method: attempts to decode the signal by trying every possible hopping
code.
2. Sequential scanning algorithm: observes one frequency at a time to determine
the hopping sequence.
3. Parallel scanning algorithm: uses one receiver for each possible channel used in the
hopping code.
4. Hybrid algorithm: combines concepts from the first three techniques, with a set of parallel
receivers that switch through the possible channels.
Each algorithm was analysed theoretically and by simulation with a view to reducing power
consumption, and the authors in [11] reported positive results for the ability to decipher the
hopping codes.
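The idea of a shared hopping code, and of the brute-force method listed above, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation in [11]: the hopping code is modelled as a seed for a pseudo-random generator, and the names `hopping_sequence` and `brute_force_decipher` are hypothetical.

```python
import random


def hopping_sequence(seed, channels, length):
    """Channel sequence derived from a shared seed (standing in for the hopping code)."""
    rng = random.Random(seed)
    return [rng.choice(channels) for _ in range(length)]


def brute_force_decipher(observed, channels, candidate_seeds):
    """Try every candidate code and return the first that reproduces the observed hops."""
    for seed in candidate_seeds:
        if hopping_sequence(seed, channels, len(observed)) == observed:
            return seed
    return None  # no candidate code explains the observation
```

A receiver that knows the seed regenerates the same channel sequence as the transmitter; a receiver that does not must search the code space, which is why the cost of the brute-force method grows with the number of possible codes.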
The authors in [12] proposed a WSN scheme that focuses on environment sensing and event detection
rather than on energy management. Wireless sensor networks have a number of strengths, such
as distribution, parallelism, redundancy, and comparatively high cost-effectiveness due to the lack
of wires. On the other hand, their low cost, the need to operate continuously over long periods, and
their dependency on batteries impose severe restrictions on the system. Hence, services provided in
sensor networks need to be lightweight in terms of memory and processing power and should not
require high communication costs. The authors of [12] proposed an algorithm in the
context of an office monitoring system, which can distinguish abnormal office access patterns from
normal access using an Adaptive Resonance Theory (ART) based anomaly detection technique.
A resource reservation scheme for managing the energy in WSNs was proposed by the authors in
[13]. The scheme involves a handoff strategy for small cells that uses transfer probabilities to
predict the destination cell. Here, the resources reserved in each base station are proportional to
the user's transfer probabilities. In order to obtain accurate values of the transfer probabilities, they
construct a movement model to study the relationship between the user's initial state
and its transfer probabilities. According to the authors, this algorithm turned out to be very easy to
implement and adaptable to different situations. It can offer an accurate classification of the
user's random movement in small cells and improves efficiency when resources are
limited in wireless systems [13].
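The proportional reservation rule can be illustrated with a short sketch. The function name `reserve_resources`, the cell identifiers and the notion of "resource units" are assumptions made for this example; the motion model that produces the transfer probabilities in [13] is not reproduced here.

```python
def reserve_resources(transfer_probs, user_demand):
    """Split a user's demand across neighbouring cells in proportion to transfer probability.

    transfer_probs -- dict mapping neighbouring cell id to the probability of moving there
    user_demand    -- resource units (for example channels) currently held by the user
    """
    total = sum(transfer_probs.values())
    if total <= 0:
        # No movement is predicted, so nothing is reserved.
        return {cell: 0.0 for cell in transfer_probs}
    return {cell: user_demand * p / total for cell, p in transfer_probs.items()}
```

For a user holding 10 units with transfer probabilities 0.6, 0.3 and 0.1 towards three neighbouring cells, the reservations are approximately 6, 3 and 1 units respectively.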
An approach that again focuses on environment sensing and event detection rather than on energy
management was proposed by the authors in [14]. Event detection is the process of observing
and evaluating an event using multiple sensor nodes without the help of a base station or other
means of central coordination and processing. In this work, the authors propose a distributed event
detection approach based on distributed sampling of sensor nodes. It is a self-contained approach
that operates without a central component or base station centre for coordination or processing,
and makes active use of the redundantly placed sensor nodes in the network to improve detection
accuracy.
Schurgers and Srivastava [15] propose an energy-efficient routing scheme based on energy
histograms. The scheme involves aggregating packet streams in a robust way (reducing energy
consumption by a factor of 2 to 3) and shaping the traffic flow for uniform resource
utilization.
An approach based on opportunistic communication topology control is proposed by the authors in
[16] to improve energy efficiency without sacrificing network performance. This technique
involves carefully choosing a group of nodes to form a connected infrastructure, allowing other
nodes to connect directly to the infrastructure. The nodes that belong to the infrastructure are
called coordinator nodes. A non-coordinator node only turns on when it needs to connect to the
infrastructure, so its energy use can be significantly reduced. A control algorithm for topology control
was developed to minimize the number of coordinator nodes while satisfying given end-to-end network
performance requirements from all the sensor nodes to the single sink [16].
Ali and Uzmi [17] proposed a scheme based on node address naming for energy-efficient
management. The node address naming scheme assigns locally unique addresses without extra
overhead bits, and allows the address size to be reduced by a factor of 3.6. Reducing the number of
overhead bits in each packet transmission ultimately leads to greater energy
efficiency and increases the lifetime of the network.
A Voronoi diagram based approach was proposed by the authors in [18] for energy management
in WSNs. In a network with a high density of sensor nodes, several problems may arise,
such as the intersection of sensing areas, redundant data, communication interference, and energy
waste. At the same time, a high density network can provide fault tolerance, increase precision and
provide multi-resolution data. The authors in this work developed a mechanism to control the
network density based on a criterion to decide which nodes should be turned off or on. Their
solution is based on the Voronoi Diagram, which decomposes the space into regions around each
node, to determine which sensor node should be turned off or on. Given the location of the nodes
and the area to be monitored, each node represents a point, and the desired area to monitor is the
polygon that is defined by the Voronoi diagram [18].
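A rough sketch of how Voronoi regions could inform such a density-control decision is given below. It approximates each node's Voronoi cell by counting the grid points nearest to that node; the switch-off criterion (a minimum cell area), the one-shot decision that does not recompute cells after a node is turned off, and the names `voronoi_cell_areas` and `nodes_to_switch_off` are all assumptions made for illustration and are not taken from [18].

```python
import math


def voronoi_cell_areas(nodes, width, height, step=1.0):
    """Approximate each node's Voronoi cell area by assigning grid points to the nearest node.

    nodes -- dict mapping node name to its (x, y) position
    """
    counts = {name: 0 for name in nodes}
    y = step / 2
    while y < height:
        x = step / 2
        while x < width:
            nearest = min(nodes, key=lambda n: math.dist((x, y), nodes[n]))
            counts[nearest] += 1
            x += step
        y += step
    return {name: c * step * step for name, c in counts.items()}


def nodes_to_switch_off(nodes, width, height, min_area, step=1.0):
    """Hypothetical criterion: a node with a very small cell adds little coverage."""
    areas = voronoi_cell_areas(nodes, width, height, step)
    return sorted(name for name, area in areas.items() if area < min_area)
```

A node whose cell is small lies close to its neighbours, so its sensing area largely overlaps theirs; this is the intuition behind using the cell size as a redundancy indicator in this sketch.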
The authors in [19] proposed a power aware routing protocol for energy efficient WSNs, which
involves adapting the routes to available power. This allows a reduction in the total power used
as well as more even power usage across nodes. The authors included three major considerations
in developing this approach: the overall power dissipation, DSAP routing, and Power-DSAP
routing. When the power considerations were added to the protocol, the overall power
consumption was much more balanced than without taking power into account.
A highly resilient multipath routing scheme for energy management was proposed by the authors in
[20]. In this work the authors proposed a novel braided multipath scheme to enable energy-efficient
recovery from failure, together with localized algorithms to compute
approximations to the idealized disjoint and braided paths. The two algorithms were evaluated
using different failure modes: isolated node failures, where each individual node has an
independent probability of failure; and patterned failures, in which all nodes within a certain fixed
radius fail simultaneously. They further evaluate the performance of these approaches across
several parameters: density, probability of isolated failure, spatial separation of source and sink,
and frequency and radius of patterned failures. They found that, for comparable resilience to
patterned failures, braided multipath expends only 33% of the energy of disjoint paths for
alternate path maintenance in some cases, and has 50% higher resilience to isolated failures
[20].
An approach based on a balanced cost cluster-head selection protocol for reducing the power
consumption in WSNs is proposed in [21]. The protocol, named LEACH (Low-Energy Adaptive
Clustering Hierarchy), is completely decentralized, and allows a better distribution of the
transmission energy in the network and a long, stable network lifetime [21].
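To make the decentralized election concrete, the sketch below uses the threshold rule commonly associated with LEACH in the literature, T(n) = p / (1 - p (r mod 1/p)), where p is the desired fraction of cluster heads and r is the current round. Whether the balanced cost variant in [21] uses exactly this rule is an assumption, and the function names are hypothetical.

```python
import random


def leach_threshold(p, round_number):
    """Election threshold T(n) for a node that is still eligible in the current period."""
    period = round(1 / p)  # number of rounds after which every node has served once
    return p / (1 - p * (round_number % period))


def elect_cluster_heads(eligible_nodes, p, round_number, rng=random):
    """Each eligible node decides locally, without coordination, whether to become a head.

    eligible_nodes -- nodes that have not yet served as cluster head in this period
    """
    threshold = leach_threshold(p, round_number)
    return [node for node in eligible_nodes if rng.random() < threshold]
```

The threshold rises as the period progresses, so nodes that have not yet served become increasingly likely to be elected; in the last round of a period every remaining eligible node becomes a cluster head, which spreads the transmission energy over the whole network.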
The authors in [22] introduced a new energy efficient WSN algorithm called e3D (energy-
efficient Distributed Dynamic Diffusion routing algorithm), and compared it to two other similar
algorithms, namely directed, and random clustering communication. The authors take into
account the setup costs and analyse the energy-efficiency and the useful lifetime of the system.
In order to better understand the characteristics of each algorithm and how well e3D really
performs, they also compare e3D with its optimum counterpart and an optimum clustering
algorithm. The benefit of introducing these ideal algorithms is to show the upper bound on
performance, which comes at the cost of prohibitively high synchronization costs. They compare the
algorithms in terms of system lifetime, power dissipation distribution, cost of synchronization,
and simplicity of the algorithm. Their simulation results show that e3D performs comparably to
its optimal counterpart while having significantly less overhead. The proposed algorithm e3D
performed well in terms of achieving its goal of evenly distributing the power dissipation
throughout the network while not creating a very large synchronization burden [22].
The authors in [23] propose a WSN energy management scheme to conserve energy during
routing. Routing is one of the main energy-demanding operations once nodes are ready to transfer
data to the sink, and an ample amount of research has been conducted to overcome routing energy
issues. However, quality of service (QoS) also has a very important role, especially in critical
applications such as defence, chemical and healthcare, where accurate and guaranteed timely
data transfer is an important issue. Hence, besides energy efficiency, QoS-based routing is also
required to ensure the best use of nodes. In this work, the authors focus on the operational and
architectural challenges of handling QoS routing traffic in sensor networks and propose a new
protocol for QoS-based routing that applies different techniques simultaneously, showing a
significant improvement in network efficiency and QoS [23].
The brief discussion above of some of the earlier attempts at addressing the challenges of energy
efficiency shows that most of these attempts at designing energy-efficient
solutions revolved around classical approaches drawn from telecommunication theory and
communication protocols. An efficient way to address this challenge, however, is to
combine some of these classical approaches with new developments in soft computing and
evidence-based, data-driven approaches. This means exploiting the immense amount of data produced by
sensors and the correlation and redundancy between the sensors in the network, and
understanding the energy consumption within and between the sensor nodes. In
other words, it means learning the relationships from the data available in the network and modelling
the spatio-temporal relationships by means of mathematical models. Learning from data with appropriate
mathematical modelling approaches can inform the evolution of the measurements taken by
sensors over space and/or time. By building a mathematical model from data or true
measurements, with an appropriate algorithm, it is possible to obtain significant improvements
in the prediction of events occurring in the WSN environment, and to manage communication capacity
and energy efficiency within the wireless sensor network. These mathematical models, which learn
from real data collected by the sensors in the environment, will provide flexible options to the
user in terms of the strategy for WSN setup, the optimal selection of the number of sensors and base
station nodes in the WSN, their locations and their grouping into subnets or clusters, the choice of
appropriate protocols for transmitting optimal information in the network, and accurate monitoring
of the overall physical environment. As this is a myriad set of requirements,
no single approach can address all of them. Although some work has been
done previously on using data-driven approaches or machine learning based techniques to address
some of the above requirements, these efforts were mostly fragmented and often work well only in isolation.
These previously proposed techniques mostly focus on local, within-network protocols
for resource and capacity management and fall short of achieving higher level benefits in terms
of overall event detection capability and energy efficiency for large, dense networks deployed in
real-world physical environments. Some of these earlier approaches are reviewed in
detail in Chapter 2.
The construction of mathematical models from data needs to be automatic, since most of the
time there is little or no information about the variations captured by sensor measurements.
Further, these mathematical models need to be simple and not computation intensive, since sensor
nodes have limited computation capacity and limited energy sources. This calls for some novel
strategies for WSN setup, selection of the number of sensors and base station nodes in the WSN,
their locations and their grouping into subnets or clusters, choice of appropriate protocols for
transmitting optimal information in the network, and accurate monitoring of the overall physical
environment. The hypothesis proposed here is that it is possible to achieve many of these
objectives coherently by exploiting the spatial and temporal relationships within the large volume
of sensor data that is continuously available within the WSN, discovering the hidden relationships
between them, and identifying the redundancies and correlations, in order to achieve the most important
objectives of energy efficiency and global event detection capability. This will address the
current gap that exists in this area, and is possible with some novel machine learning and data
mining based approaches, which allow the modelling and prediction of the evolution of future
measurements and states of the wireless sensor network, and the detection of the higher level information
that exists in the physical environment, based on past measurements or data. For this purpose,
the research questions identified and the contributions made in addressing these questions are
presented in the next two Sections.
1.4 Research Questions
1. Is it possible to develop a strategy for jointly addressing the goals of energy
efficiency and event detection accuracy?
2. Is it possible to develop an adaptive learning strategy to address the dynamically
changing requirements of WSNs and the challenges corresponding to energy
efficiency, event detection accuracy and QoS targets?
3. Is it possible to develop a strategy for jointly addressing the goals of
energy efficiency, prediction accuracy and MAC layer routing issues?
1.5 Thesis Contributions
To address various WSN challenges, a novel integrated framework for achieving energy
efficiency is proposed and consists of three stages as discussed below:
The first main contribution is the proposal of a joint energy efficiency–event detection
accuracy model, in which a novel sensor node selection technique is designed that
conserves the energy in the wireless sensor network and at the same time maximizes the
event recognition performance. Here, the scheme works by utilising fewer sensor nodes at a time,
and placing unneeded sensor nodes in sleep mode. For this, a novel objective
quantitative metric is proposed to assess the energy efficiency achieved, namely the
lifetime extension factor (LTEF). We show that this joint scheme allows selection of the most
significant and influential sensor nodes for participation in different WSN tasks, and
contributes significantly towards energy savings and event detection accuracy.
As the WSN needs to adapt dynamically to the state of the environment being monitored,
the number of sensor nodes participating in the routing tree cannot remain fixed and needs
to adapt in order to accurately monitor and predict the physical environment. The
second contribution of this work is therefore a proposal for adaptive models for sensor selection
and classifier learning that achieve energy efficiency and prediction accuracy based on
specified performance targets. It turns out that this scheme, which involves selection of
an appropriate classifier model in conjunction with the previous sensor selection
approach, not only results in better prediction accuracy but also contributes towards
quality of service (QoS) enhancements.
The third and final contribution is a joint energy efficiency–adaptive routing model,
where an appropriate sensor selection and adaptive routing strategy can address the WSN
challenges corresponding to energy efficiency, prediction accuracy, and MAC layer
adaptation. We show that this joint model also meets non-functional performance targets,
such as handling of missing or faulty sensors and the model building time needed for adaptation
of the routing protocol.
To summarise, this thesis attempts to address some of the important challenges in wireless sensor
networks for physical environment monitoring, such as the energy efficiency, the event
detection/monitoring accuracy, and quality of service aspects, based on evidence-based data
driven machine learning techniques. As can be seen in the next Chapter (Chapter 2) on related work
and literature review, to the best of my knowledge, not many integrated and joint
approaches have been investigated in the past that can address the multiple objectives of energy
efficiency, event detection accuracy, and quality of service simultaneously. Some of my attempts to
address these challenges have been published as peer reviewed contributions, and are outlined in
the next Section.
1.6 Publications
The list of peer reviewed publications made during this thesis work is summarised below in
chronological order.
1. Alwadi, M.d. and G. Chetty, “A novel feature selection scheme for energy efficient
wireless sensor networks”, in Y. Xiang et al. (Eds.): Proceedings ICA3PP 2012, Part II,
Springer LNCS 7440, pp. 264-273, 2012.
2. Moh'd ALWADI, and Girija CHETTY, “Feature Selection and Energy Management
for Wireless Sensor Networks”, IJCSNS International Journal of Computer Science and
Network Security, VOL.12 No.6, June 2012, 46 – 51.
3. Alwadi, Mohammad; Chetty, Girija, “Energy Efficiency Data Mining for Wireless
Sensor Networks Based on Random Forests”, International Journal on Data Mining and
Intelligent Information Technology Applications, 4.1 (Jun 2014): 1-8.
4. Alwadi, Mohammad; Chetty, Girija, “Energy Efficient Data Mining Scheme for Big
Data Biodiversity Environment”, Proceedings 2014 ASE Big Data/Social Comp/Cyber
Security Conference, Stanford University, 27th -31st May 2014. ISBN: 978-1-62561-
000-3. URI: http://www.ase360.org/handle/123456789/100 .
5. Mohammad Alwadi, Girija Chetty, “Energy Efficient Data Mining Scheme for High
Dimensional Data”, Procedia Computer Science 46(2015), 483-490.
doi:10.1016/j.procs.2015.02.047.
6. Alwadi, M. and G. Chetty, “Sensor Selection Scheme in Wireless Sensor
Networks: A New Routing Approach”, pp. 73–79, 2015. © CS & IT-CSCP 2015.
7. Mohammad Alwadi and Girija Chetty, “Sensor Selection Scheme in Temperature
Wireless Sensor Network”, International Journal of Wireless and Mobile Networks,
ISSN: 0975-3834. June 2015, Vol 7, No. 3, pp. 47-53. DOI: 10.5121/ijwmn.2015.7304.
8. Mohammad Alwadi and Girija Chetty, “A Novel Sensor Selection Scheme For Energy
Efficient Environment Monitoring of Wireless Sensor Networks”, Journal of Advances
in Computer Networks, ISSN: 1793-8244. (Accepted and in Press).
1.7 Organisation of Thesis
The rest of the thesis is organised as follows. Chapter 2 provides the related work and background
literature review on challenges associated with wireless sensor networks for physical
environment monitoring, on some of the earlier research efforts using machine learning techniques,
and on contributions made by the research community in addressing these challenges. Chapter 3
presents the first contribution of this thesis, the joint energy efficiency and event
detection model, and discusses the development of a sensor selection scheme and an objective
measure to assess the energy efficiency achieved. Chapter 4 discusses the problem of the dynamic
behaviour of wireless sensor networks and how adaptive learning models based on
machine learning approaches can address this problem and maintain the prediction and
monitoring accuracy for the physical environment being monitored. The attempts to address the
challenges corresponding to the MAC layer routing adaptation protocol are discussed in Chapter 5,
where a joint sensor selection–adaptive routing model to address the challenges of energy
efficiency, prediction accuracy and adaptive routing under dynamic WSN changes is presented.
The thesis concludes with conclusions and the further scope of this work in Chapter 6, with some
key references listed in the Bibliography Section.
Chapter 2 Related Work and Literature Review
In this Chapter, a review of machine learning approaches proposed in the literature to address the
design challenges in WSNs is presented. As can be seen in this Chapter, a myriad of attempts
have been made so far, and many design challenges in wireless sensor networks have been resolved
using several machine learning methods. The use of machine learning based algorithms in WSNs
has to take several constraints into account, such as the limited resources of the network,
applications that require different events to be monitored, and other operational and
non-operational aspects.
2.1 Machine Learning Based Approaches
The recent advancements in machine learning and soft computing techniques allow better
prediction models to be developed based on a set of measurements. The learned model could
be just a simple parametric function learned from data: given a set of input variables, normally
historical measurements or observations, it permits the output state or variable to be predicted
accurately.
As discussed before, a wireless sensor network (WSN) can consist of multiple heterogeneous,
autonomous, tiny, low-cost and low-power sensor nodes. The purpose of these nodes is to gather
data about the physical environment being monitored, and to collaborate with each other to
forward sensed data to centralized controller units, called base station nodes or sink nodes, for
further processing. The sensor nodes in the WSN could be heterogeneous, that is, they could
be equipped with various types of sensors, including thermal/temperature, acoustic, chemical,
pressure, weather, and optical sensors. Due to this heterogeneity, WSNs have tremendous data
diversity allowing powerful applications to be built, with different characteristics and
requirements. Developing efficient algorithms that are suitable for many different applications
is a challenging task. WSN designers have to address several issues pertaining to collection or
aggregation of data, and reliability of data, in addition to node clustering, energy aware routing,
events scheduling, fault detection and security.
In the late 1950s, machine learning (ML) was initially introduced as a special technique for
artificial intelligence (AI) [41]. Over the years, its focus slowly shifted and evolved more towards
algorithms that are computationally feasible and robust. Its application grew
extensively in the last few years in several areas, including bioinformatics, speech recognition,
spam detection, fraud detection and advertising networks. The machine learning tasks involved
were mainly those of classification, regression and density estimation, and involved algorithms
and techniques drawn from many diverse fields, including statistics, mathematics,
neuroscience and computer science. The essence of machine learning can be captured by
the following two classical definitions:
o The learning processes for development of computer models that can enhance the
performance of systems and provide solutions to the problem of knowledge acquisition
[41].
o Detecting and describing consistencies and patterns in training data by employing
computational methods that can improve machine performance [42].
Machine learning technology appears very promising as per these definitions, to address
challenges in WSNs, as it allows exploiting historical data to improve the performance of
network on given task, or predict the future performance. For WSNs, using machine learning
technology can be immensely beneficial for a number of reasons, such as:
Better monitoring of dynamic environments that change rapidly over time. For instance,
in a soil monitoring scenario, it is possible that the location of sensor nodes may change
due to soil erosion or ocean turbulence, and a WSN based on machine learning can allow
automatic adaptation and efficient operation in such dynamic environments.
Acquisition of new knowledge from unreachable, dangerous locations in exploratory
applications [43], for example volcanic eruptions and early detection of tremors before
earthquakes. By detecting anomalies and unexpected behaviour patterns, a WSN that
can learn from data can provide early warnings of catastrophic events well in advance
for emergency evacuations, and can calibrate and configure itself to collect additional
data from crucial nodes for better tracking of events.
Providing computationally feasible, low-complexity mathematical models for
complicated environments. For these environments, it is difficult to build accurate
mathematical models, and difficult for sensor nodes to compute the algorithms
corresponding to these mathematical models. Under such circumstances, a WSN based
on machine learning techniques can provide low complexity approximations for the
system models, allowing their implementation within sensor nodes. The routing problem is a
representative example here [44], [45].
Providing opportunity to extract spatial and temporal correlations between sensor nodes.
There is significant correlation in spatial and temporal dimension in the data that is being
collected in a WSN. A WSN based on machine learning can leverage several algorithms
that operate on historical data captured by sensors to identify the correlations,
eliminate redundant sensors, place the sensors at optimal locations, or help invoke a failure
recovery mechanism in the event of a breakdown in the network [46].
Increased automation and novel applications development, such as ubiquitous, ambient
computing systems. A WSN based on machine learning can allow increased automation and
new uses through integration with other WSNs, leading to fully instrumented, very large applications
such as Internet of Things technologies, cyber-physical systems and machine-to-machine
communications. These applications use several different types of WSNs and, if
based on machine learning, can support more intelligent decision-making and
autonomous control, with extraction of the different levels of abstraction needed to perform
the AI tasks with limited human intervention [47], [48].
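Among the benefits listed above, the extraction of correlations to eliminate redundant sensors lends itself to a short illustration. The sketch below is a minimal example under stated assumptions and is not the selection scheme developed in this thesis: it uses the Pearson correlation between historical readings, a hypothetical threshold of 0.95, and a simple greedy pass; the names `pearson` and `redundant_sensors` are introduced only for this example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series of readings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (var_x * var_y) ** 0.5


def redundant_sensors(readings, threshold=0.95):
    """Greedy pass: a sensor is redundant if it closely tracks a sensor already kept.

    readings -- dict mapping sensor name to its list of historical readings
    """
    kept, redundant = [], []
    for name in sorted(readings):
        if any(abs(pearson(readings[name], readings[k])) >= threshold for k in kept):
            redundant.append(name)
        else:
            kept.append(name)
    return kept, redundant
```

Sensors flagged as redundant in this way are candidates for a sleep mode, since their readings can be approximated from the sensors that remain active.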
However, it is quite possible that a WSN based on machine learning techniques may not lead to
any improvements if some of the issues outlined below are not considered during the design stage.
As the WSN environment is resource limited, significant energy is expended on
predicting the hypothesis accurately, and for global event detection scenarios,
energy efficiency and prediction accuracy essentially form a trade-off [50].
Since the WSN becomes intelligent by learning from data, there is a need for a large data
set. However, a large data set alone does not ensure better learning or the intended
generalization, and it is essential that the right type of data, not just a large amount of data, is used for
building the mathematical model. Without the right type of data, the designer will not have full
control over the knowledge discovery process [49], [50].
Recently, machine learning technologies have increasingly been used to automate WSN operations.
The authors in [51] present an excellent survey of machine learning
approaches applied to WSNs for processing the information in the network and improving
performance. A similar survey, more focussed on ad-hoc networks and on how machine
learning techniques have been adopted in them, is presented by the authors in [52].
Another seminal work on applications of three popular machine learning algorithms (i.e.,
reinforcement learning, neural networks and decision trees) at all communication layers in
WSNs is presented in [53]. Some works also addressed specific challenges in WSNs, such
as the authors in [54], [55], who developed an efficient outlier detection technique based on machine
learning concepts. The authors in [56] proposed an approach based on computational intelligence
techniques for addressing challenges corresponding to data aggregation, routing, task scheduling,
and optimal deployment and localisation. Computational intelligence techniques are a class of
machine learning techniques that focus on biologically inspired learning approaches such as
neural networks, fuzzy logic and evolutionary algorithms [57].
Most of the earlier work on using machine learning techniques for WSNs focussed on
reinforcement learning, neural networks and decision trees, which had a well-established
reputation for being efficient at both the conceptual and the implementation level. Some
machine learning algorithms address functional or operational challenges in WSNs, such as routing,
localization, clustering, data aggregation, query processing and medium access control. The
operational or functional issues are those which are essential for the basic operation of
WSNs. Then, there are some approaches which have addressed the non-operational or
non-functional issues in WSNs, such as those that determine the quality or enhance the performance of
functional components, including security, quality of service (QoS) and data integrity.
We present a comprehensive review of some of the related approaches where machine learning
technology has been used for WSNs, which can also act as a design primer and comparative
guide.
2.2 Related Work on Machine Learning for WSNs
In practice, the data science community depicts machine learning techniques as a collection of
algorithms and tools for creation of prediction models. However, machine learning researchers
recognize it as a rich field with very large goals and objectives. Appreciating such large goals
will be useful for designers who wish to apply machine learning to WSNs, which are very
complex in their own way as well. This understanding can provide better insight into
the tremendous flexibility and benefit that machine learning algorithms can provide to a wide
range of complex WSN applications. For this, it is necessary to visit some of the theoretical concepts
that form the basis of machine learning technology in the context of WSNs.
Existing machine learning techniques can be categorized into supervised, unsupervised and
reinforcement learning techniques [58]. In the supervised learning category, the learning
algorithm is provided with a labelled training data set. The system model is built by using the
labelled training data to make the machine learn the relation between the input, output and
system parameters. In contrast, no labelled data is provided (there is no output vector) for
unsupervised learning algorithms. For an unsupervised learning algorithm, the relationship is
discovered by clustering the data into different groups
or clusters, based on the similarity between input data samples. The third category is
reinforcement learning, where the machine learns interactively, through online
learning from its environment. Finally, a machine can also learn through a combination of
supervised and unsupervised learning; these are called hybrid algorithms or
semi-supervised learning approaches, and they try to inherit the strengths of both supervised and
unsupervised learning approaches [59]. A thorough discussion of the theoretical concepts
of machine learning is presented in [60].
2.2.1 Supervised Machine Learning
For supervised machine learning, the system model is built with a labelled training set (known
outputs and predefined inputs). The relationship between the input, output and system
parameters is learned by the system model. This type of learning approach is extensively used
to solve several challenges in WSNs, such as localization and object targeting [61], [62], [63],
query processing and event detection [64], [65], [66], [67], medium access control [68], [69],
[70], intrusion detection and security [71], [72], [73], [74], data integrity, quality of service
(QoS) and detection of faults [75], [76], [77]. Some well-known supervised machine learning
algorithms are discussed next.
2.2.1.1 K-nearest neighbour (k-NN):
In this supervised learning algorithm, a test data sample is classified based on the labels (or
output values) of the nearest data samples. For example, a missing or unknown sensor
measurement is predicted by computing the average of the readings within its neighbourhood.
The set of nearest nodes can be determined in different ways; one of the simplest is to use the
Euclidean distance between different sensors [81]. As the distance measure is computed using
only a few local points, with k normally a small positive integer, the k-NN approach does not
need high computational power. Due to its simplicity, the k-NN algorithm is suitable for query
processing tasks in WSNs [64], [65].
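The neighbourhood averaging described above can be sketched as follows. This is a minimal
illustration, not the method of [64] or [65]; the function name knn_predict, the sensor positions
and the readings are hypothetical values chosen for the example.

```python
import math

def knn_predict(known, query_pos, k=3):
    """Estimate a missing reading as the mean of the k nearest sensors' readings.

    known: list of ((x, y), reading) pairs for sensors with valid readings.
    query_pos: (x, y) position of the sensor whose reading is missing.
    """
    if not 1 <= k <= len(known):
        raise ValueError("k must be between 1 and the number of known sensors")
    # Rank the sensors by Euclidean distance to the query position.
    by_distance = sorted(known, key=lambda item: math.dist(item[0], query_pos))
    nearest = by_distance[:k]
    return sum(reading for _, reading in nearest) / k

# Hypothetical deployment: three nearby sensors and one distant sensor.
sensors = [((0.0, 0.0), 20.0), ((1.0, 0.0), 22.0),
           ((0.0, 1.0), 24.0), ((10.0, 10.0), 40.0)]
estimate = knn_predict(sensors, (0.5, 0.5), k=3)
print(estimate)
```

With k = 3 the distant sensor is ignored, so the estimate depends only on local readings, which
is what keeps the computational cost low.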
2.2.1.2 Decision Trees
Decision tree classification predicts output labels by iterating the input data through a
learning tree [80]. During the iterative process, feature properties are compared against
decision conditions until a particular category is reached. A significant amount of research
has used decision trees to address different design challenges in WSNs, such as identifying
link reliability. Here, decision trees provide a simple method to identify critical features for
link reliability, including loss rate, mean time to failure (MTTF), and mean time to restore
(MTTR). However, a limitation of decision trees is that they require linearly separable
data [80].
2.2.1.3 Neural Networks
Neural networks are one of the most popular algorithms for learning from data and
can be constructed by cascading chains of decision units, often perceptrons or radial basis
functions [49]. The cascading chains of decision units allow the recognition of non-linear and
complex relationships in data. However, the learning process with multiple cascading chains is
highly computationally intensive [81]. An illustrative example of using neural networks for
WSNs is the sensor node localization problem, that is, determining the node’s geographical
position in three dimensions. The sensor node’s geographical position has a complex, nonlinear
relationship with the propagation angle and distance measurements of the signals received from
anchor nodes
[82]. When a neural network is trained in a supervised manner with different WSN measurements
as the inputs, including RSSI (Received Signal Strength Indicator), TOA (Time of Arrival) and
TDOA (Time Difference of Arrival), the network learns the relationship between RSSI, TOA,
TDOA and the node’s geographical position, and can estimate the three-dimensional node
localisation co-ordinates. Figure 3 shows the schematic of this WSN node localisation
estimation using cascaded layers of neurons (computational units).
Figure 3 Estimating Node Localization co-ordinates in WSN Using Neural Networks [82]
There are several algorithms for training the network of neurons to learn the complex, nonlinear
relationship between the inputs and outputs, including Kohonen’s maps (self organising maps)
and LVQ (learning vector quantisation) [83]. One of the problems with most neural
network based estimation techniques is the significant amount of hand-crafted feature
engineering required for precise estimation. However, recent work on deep learning
architectures allows the relationships between different variables to be learned directly from
high dimensional streaming big data, without any feature engineering [84].
2.2.1.4 Support Vector Machines
Support Vector Machines (SVMs) provide an alternative to neural networks, whose training is a
non-convex, unconstrained optimization problem [79]. In the context of WSNs, they
have been used for intrusion detection (detecting malicious behaviour of sensor nodes) and
security [73], [74], [86], [87], [88], and for localisation [89], [90], [91]. With SVM, it is
possible to uncover the spatio-temporal correlations in data, as the algorithm involves
constructing a set
of hyperplanes (by optimizing a quadratic function with linear constraints) that separate WSN
data measurements in feature space by margins that are as wide as possible. Figure 4 shows the
schematic of how an SVM classifies WSN measurements.
Figure 4 Schematic of SVM Classification Process [91]
2.2.1.5 Bayesian Learners:
While most machine learning algorithms require a large number of training samples to
learn, learning techniques based on Bayesian statistics require fewer training samples [92].
Bayesian methods learn by adapting a probability distribution to efficiently learn
the uncertain labels. The important aspect of this learning technique is that it uses the current
knowledge (that is, the collected data samples D) to refine prior beliefs into posterior
beliefs (Eq. 3.1).
$$p(\theta \mid D) \propto p(\theta)\, p(D \mid \theta) \qquad (3.1)$$
where $p(\theta \mid D)$ is the posterior probability of the parameter $\theta$ given the
observations $D$, $p(\theta)$ is the prior probability of $\theta$, and $p(D \mid \theta)$ is the
likelihood of the observations $D$ given the parameter $\theta$. In WSNs, Bayesian learners of
this type are useful for assessing event consistency ($\theta$) using incomplete data sets ($D$)
by investigating prior knowledge about the environment. Several variations of Bayesian learners
allow better learning of relationships, such as Gaussian Mixture Models, Hidden Markov
Models, Conditional Random Fields, and Dynamic Bayesian Networks [93].
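As a worked illustration of Eq. 3.1, the sketch below refines a prior over a small discrete set
of parameter values into a posterior. It assumes, purely for illustration, that each observation
is a binary event report (1 or 0) and that the parameter is the probability of a report being 1;
the function names and numbers are hypothetical and are not taken from [92] or [93].

```python
def posterior(prior, likelihood_fn, data):
    """Return the normalised posterior p(theta | D) over discrete theta values.

    prior: dict mapping theta -> p(theta).
    likelihood_fn(d, theta): p(d | theta) for a single observation d.
    data: list of observations, assumed independent given theta.
    """
    unnormalised = {}
    for theta, p_theta in prior.items():
        likelihood = 1.0
        for d in data:
            likelihood *= likelihood_fn(d, theta)
        unnormalised[theta] = p_theta * likelihood   # p(theta) * p(D | theta)
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("the data has zero probability under every theta")
    return {theta: value / total for theta, value in unnormalised.items()}

def bernoulli(d, theta):
    """Likelihood of one binary report d given event probability theta."""
    return theta if d == 1 else 1.0 - theta

# Two equally likely hypotheses about the event probability, three reports.
prior = {0.2: 0.5, 0.8: 0.5}
post = posterior(prior, bernoulli, [1, 1, 0])
print(post)
```

Only three observations are needed to shift most of the belief towards the hypothesis 0.8,
which illustrates why Bayesian learners can work with comparatively few samples.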
2.2.2 Unsupervised Machine Learning
In unsupervised learning, no labels are provided, that is, there is no output vector. The
unsupervised learning algorithm classifies the sample set into different groups by investigating
the similarity between the samples. This type of learning algorithm finds use in WSN node
clustering and in data aggregation at a sink node [94], [95], [96], [97], [98], [99], [100].
With no labels provided, the unsupervised machine learning algorithm discovers the hidden
relationships itself, and is suitable for WSN problems with complex relationships between
variables. The two most important types of algorithm in this category are K-means clustering
[101] and principal component analysis [102], [103].
2.2.2.1 K-Means Clustering
This unsupervised learning algorithm classifies data into different clusters or classes and works
in sequential steps: randomly select k nodes as the initial centroids of the clusters; use a
distance function to label each node with its closest centroid; re-compute the centroids from
the current node memberships; and repeat the labelling and re-computation until the
convergence condition is met, for example when the centroids move by less than a predefined
threshold value. The K-means clustering algorithm is widely used in WSN sensor node
clustering due to its simplicity and linear complexity [101].
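The steps of the K-means algorithm can be sketched as below. This is a minimal illustration under
stated assumptions: the optional init argument (fixed initial centroids, used here only to make
the example reproducible), the tolerance tol and the node coordinates are hypothetical and do not
come from [101].

```python
import math
import random

def assign(points, centroids):
    """Label each point with the index of its closest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def k_means(points, k, init=None, max_iter=100, tol=1e-9, seed=None):
    # Step 1: choose k nodes as initial centroids (randomly unless init is given).
    if init is not None:
        centroids = list(init)
    else:
        centroids = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        # Step 2: label each node with its closest centroid.
        labels = assign(points, centroids)
        # Step 3: re-compute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        # Step 4: stop once no centroid moves by more than the threshold.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, assign(points, centroids)

nodes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(nodes, k=2, init=[(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels)
```

Each iteration touches every node once per centroid, which is the source of the linear
complexity noted above. With random initialisation the result can depend on the seed.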
2.2.2.2 Principal Component Analysis
This unsupervised learning algorithm is quite popular in the data compression field, and is used
for dimensionality reduction. It is a multivariate method that aims to extract important
information from data and present it as a set of new orthogonal variables called
principal components [102]. These principal components are ordered such that the first principal
component is aligned with the highest-variance direction of the data, with decreasing variance
for the other components in order. This allows the least-variance components to be discarded, as
they contain the least information, leading to dimensionality reduction. For WSN scenarios,
this can help to reduce the amount of data being transmitted among sensor nodes, by finding a
small set of uncorrelated linear combinations of the original readings [103]. Further, it can
turn a big data problem into a small data problem by allowing selection of only the significant
principal components and discarding the other, lower-order insignificant components from the
model. The details of PCA theory, also known as eigenvalue/eigenvector or covariance matrix
analysis, are discussed elsewhere [102], [103]. Figure 5 shows a simple two dimensional
visualisation of the principal component analysis (PCA) algorithm in dealing with high
dimensional data.
Figure 5 Two dimensional visualization of PCA process [103]
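The idea of keeping only the highest-variance component can be sketched as follows. This is an
illustrative sketch only: it finds the first principal component by power iteration on the sample
covariance matrix, which is one of several possible ways to compute it and is not claimed to be
the procedure of [102] or [103]. The readings are hypothetical, and the sketch assumes the data
has non-zero variance.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, direction); direction is the unit vector of highest variance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the dominant eigenvector of the covariance matrix.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def compress(rows, mean, v):
    """Project each reading vector onto the component: one number per row."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

def reconstruct(scores, mean, v):
    """Approximate the original readings from the compressed scores."""
    return [[mean[j] + s * v[j] for j in range(len(v))] for s in scores]

# Hypothetical readings of two perfectly correlated sensors.
readings = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
mean, direction = first_principal_component(readings)
scores = compress(readings, mean, direction)
restored = reconstruct(scores, mean, direction)
print(direction, scores)
```

Here each two-value reading is transmitted as a single score, halving the data volume. With
real, imperfectly correlated readings the reconstruction would be approximate rather than exact.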
2.2.3 Reinforcement Machine Learning
This type of learning algorithm for WSNs involves learning by interaction with the
environment. Here, a reward process is involved, and a sensor node learns to take the best
actions so that its long-term rewards are maximized with experience. Q-learning is the most
well-known reinforcement learning algorithm and is useful for WSN routing problems, where each
node seeks to choose actions that are expected to maximize its long-term rewards [104], [105],
[106], [107], [108]. In Q-learning, the sensor node regularly updates the rewards it
achieves based on the action it takes at a given state. The future total reward
(also known as the Q-value) of performing an action at a given state is computed using Eqn. 3.2 as:
$$Q(s_{t+1}, \alpha_{t+1}) = Q(s_t, \alpha_t) + \gamma \left( r(s_t, \alpha_t) - Q(s_t, \alpha_t) \right) \qquad (3.2)$$
where $r(s_t, \alpha_t)$ denotes the immediate reward of performing an action $\alpha_t$ at a
given state $s_t$, and $\gamma$ is the learning rate that determines how fast learning occurs
(usually set to a value between 0 and 1).
Figure 6 shown below illustrates how the sensor node can regularly update its achieved rewards
based on action taken at a given state.
Figure 6 visualization of Q-learning Algorithm [108]
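The update of Eqn. 3.2 can be sketched as a table update, as below. The sketch follows the
equation as written in this section (the new Q-value moves towards the immediate reward at the
learning rate); the node names, rewards and the helper best_action are hypothetical and are used
only for illustration.

```python
from collections import defaultdict

def update_q(q_table, state, action, reward, learning_rate=0.5):
    """Apply Eqn. 3.2: Q <- Q + learning_rate * (reward - Q) for one (state, action)."""
    old = q_table[(state, action)]
    q_table[(state, action)] = old + learning_rate * (reward - old)
    return q_table[(state, action)]

def best_action(q_table, state, actions):
    """Pick the action with the highest learned Q-value at this state."""
    return max(actions, key=lambda a: q_table[(state, a)])

# Hypothetical routing choice: node "A" can forward a packet to "B" or "C".
q = defaultdict(float)           # all Q-values start at 0
for _ in range(3):
    update_q(q, "A", "B", reward=1.0)    # forwarding via B keeps succeeding
update_q(q, "A", "C", reward=0.2)        # forwarding via C gave a low reward
next_hop = best_action(q, "A", ["B", "C"])
print(q[("A", "B")], q[("A", "C")], next_hop)
```

After three rewarded attempts the Q-value for forwarding via B has risen to 0.875, so the node
prefers B as its next hop without any global knowledge of the network.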
2.3 Operational Challenges
There are several operational or functional challenges in the design of WSNs, such as the power
and memory constraints of sensor nodes, topology changes, communication link failures, and
decentralized management. These operational challenges can be addressed by adopting
machine learning paradigms in the way WSNs work, so that they can make intelligent, informed
decisions for achieving energy efficiency, real-time adaptive routing,
query processing, global event detection, localization, node clustering and data
collection/aggregation at sink nodes.
2.3.1 WSN Routing Issues
As the sensor nodes have limited processing capabilities, small memory and low bandwidth,
the design of a routing protocol for WSNs has to consider various challenges such as energy
consumption, fault tolerance, scalability and data coverage [46].
A routing problem in wireless sensor networks is traditionally formulated as a graph
problem, G = (V, E), where V represents the set of all nodes, and E represents the set of
bidirectional communication channels connecting the nodes. With this graph modelling
approach, the routing problem can be described as the process of finding the minimum cost
path from the source vertex to all destination vertices, by using the available graph edges.
This path is a spanning tree T = (V, E), whose vertices include the source (root node) and
the destination nodes (leaf nodes). Solving such a spanning tree problem with optimal data
aggregation is NP-hard, even when the full topology is known [45].
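For illustration, the minimum cost paths from a source to every other node can be computed with
Dijkstra's algorithm, which yields a tree rooted at the source. This sketch covers only the
minimum-cost-path formulation with known, non-negative link costs; it does not address the
NP-hard optimal data aggregation problem. The graph, node names and link costs are hypothetical.

```python
import heapq

def shortest_path_tree(graph, source):
    """Dijkstra's algorithm.

    graph: dict node -> dict neighbour -> non-negative link cost
           (a bidirectional link is listed in both directions).
    Returns (cost, parent): cost of the cheapest path to each reachable node,
    and each node's parent in the resulting tree (None for the source).
    """
    cost = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    finished = set()
    while heap:
        c, u = heapq.heappop(heap)
        if u in finished:
            continue                      # stale heap entry
        finished.add(u)
        for v, w in graph[u].items():
            new_cost = c + w
            if v not in cost or new_cost < cost[v]:
                cost[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    return cost, parent

def path_to(parent, node):
    """Walk the parent links back to the source and return the path in order."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

# Hypothetical four-node network with source "S".
network = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 2, "D": 5},
    "B": {"S": 4, "A": 2, "D": 1},
    "D": {"A": 5, "B": 1},
}
cost, parent = shortest_path_tree(network, "S")
print(cost, path_to(parent, "D"))
```

Note that this computation needs the full topology and all link costs at one place, which is the
requirement that the learning-based approaches discussed next try to relax.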
Learning from previous experiences is an important feature of machine learning, and sensor
networks can benefit immensely from it, for instance in selecting optimal routing
actions and adapting to the dynamic environment. Some of the benefits can be summarized as
follows:
• Learn the optimal routing paths that can lead to energy efficiency and prolong the
lifetime of dynamically changing WSNs.
• Divide the complex routing problem into simpler sub-routing problems, where the
nodes in each sub-problem formulate the graph structures by considering only their local
neighbours, thereby achieving low-cost, efficient and real-time routing.
• Use relatively simple computational methods and classifiers to meet Quality of
Service (QoS) requirements in routing problems.
Figure 7 illustrates a simple sensor network routing problem using a graph, together with the
traditional spanning tree routing algorithm, in which the network nodes have to exchange their
routing information with each other to find the optimal routing paths. It also illustrates how
machine learning reduces the complexity of a typical routing problem by considering only
neighbouring nodes’ information to predict the full path quality. With a routing procedure
backed by a machine learning algorithm, each node independently decides which
channels to assign and how to optimise the transmission power. Such an approach provides
near-optimal routing decisions with very low computational complexity.
Figure 7 Simplified Network Routing Based on Machine Learning [46]
A summary of different WSN routing protocols that have used machine learning based
approaches is given below.
2.3.1.1 Distributed Regression Approach:
A general framework for modelling sensor data was proposed by Guestrin et al. in [109]. In
this framework, the network nodes fit a global function to match their own measurements. A
kernel linear regression type of machine learning algorithm is run at the nodes. A set of kernel
functions maps the training samples so as to learn the correlation between different features,
exploiting the fact that the readings of multiple sensors are highly correlated [111], [112]. Due
to the kernel mapping, the communication overhead in detecting the structure in the sensor data
is minimized. This approach contributes a distributed learning framework for
wireless networks based on linear regression methods, the main advantages being good fitting
results and the small overhead of the learning phase. However, its main disadvantage is that it
cannot learn non-linear and complex functions.
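The core fitting step, a weighted sum of kernel functions fitted to sensor readings by least
squares, can be sketched as follows. This is a centralised, simplified illustration and not the
distributed procedure of [109]: the exchange of information between nodes is not modelled, and
the Gaussian kernel, its width, the kernel centres, the small ridge term and the sensor data are
all assumptions made for the example.

```python
import math

def gaussian_kernel(x, centre, width=1.0):
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(positions, readings, centres, ridge=1e-9):
    """Least-squares weights for f(x) = sum_i w_i * kernel_i(x)."""
    phi = [[gaussian_kernel(p, c) for c in centres] for p in positions]
    k = len(centres)
    # Normal equations: (Phi^T Phi + ridge * I) w = Phi^T y.
    a = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * y for row, y in zip(phi, readings)) for i in range(k)]
    return solve(a, b)

def predict(x, centres, weights):
    return sum(w * gaussian_kernel(x, c) for w, c in zip(weights, centres))

# Hypothetical 1-D deployment: readings generated from known weights, then refitted.
centres = [0.0, 3.0]
true_weights = [2.0, -1.0]
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
readings = [predict(p, centres, true_weights) for p in positions]
weights = fit(positions, readings, centres)
print(weights)
```

The fitted model is linear in the weights, which keeps the learning phase cheap; only the
weights, not the raw readings, would need to be communicated.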
2.3.1.2 SOM (Self Organising Map) based data routing approach:
A self organising map (SOM) based unsupervised machine learning approach was proposed
by Barbancho et al. in [110]; it involves detecting optimal routing paths, as illustrated in
Figure 3.5. This approach is slightly different from the well-known Dijkstra’s algorithm, which
allows the network backbone and shortest paths to be formed from the base station to every node
in the network. In this approach, the second layer neurons compete with each other during route
learning to retain high weights in the learning chain, and the weights of the winning neuron
and its neighbouring neurons are updated further to match the input patterns. This learning
phase, being highly computationally intensive, has to run at the central base station node or
sink node. However, the execution phase is not computationally intensive and can be run on the
network nodes. Due to its hybrid nature (a combination of Dijkstra’s algorithm and
the SOM), this algorithm takes the QoS (quality of service) requirements, including latency,
throughput, packet error rate and duty cycle, into account during the process of updating the
neurons’ weights. However, some of the drawbacks of this algorithm are its complexity and the
computational overhead of the learning phase when the network topology and settings change.
2.3.1.3 Reinforcement learning based routing enhancement:
Reinforcement learning based algorithms, such as the Q-learning algorithm, can enhance a routing
protocol to guarantee reliable resource allocation. Sun et al. [105] have shown how a Q-learning
algorithm, called the Q-MAP algorithm, can enhance multicast routing in wireless ad hoc
networks, where a node has to send the same message to several receivers. For a mobile ad hoc
network consisting of heterogeneous nodes, with different nodes having different capabilities,
it is not feasible to maintain global, up-to-date knowledge about the whole network
structure, and the Q-MAP multicast routing algorithm is designed to guarantee reliable resource
allocation for such complex scenarios. The multicast routes are determined in two phases. The
first phase, “Join Query Forward”, discovers an optimal route and updates the Q-values
(prediction of Q-values) of the Q-learning algorithm. In the second
phase, called “Join Reply Forward”, an optimal path is created for facilitating multicast
transmissions. Hence, using a machine learning approach based on Q-learning can reduce the
overhead of route searching for multicast routing in mobile ad hoc networks. However, this
path may not be energy efficient, and hence Q-MAP needs to be modified to make it an energy
efficient routing strategy.
An alternate routing scheme in WSNs is based on UWB (Ultra Wide Band) communication. A
frequency band of 3.1 to 10.6 GHz (7,500 MHz of spectrum) has been dedicated by the FCC
(Federal Communications Commission) to unlicensed UWB use [103]. In the UWB
technique, bulky data is transmitted over short distances using a wide spectrum of frequency
bands with relatively low power. An enhanced geographical routing approach for UWB-equipped
sensor networks was proposed by Dong et al. [106]. The authors used a
reinforcement learning algorithm to enhance a geographical routing protocol (RLGR), where an
optimal route is computed by using sensor node energy and delay as the metrics for
formulating the learning reward function. The benefit of a reinforcement learning based
routing protocol is that it does not require information about the global network structure to
obtain an optimal routing path. This routing protocol leverages UWB technology for detecting the
nodes’ locations, with UWB devices placed on cluster heads only. Further, each node
maintains a simple look-up table with information about its neighbouring nodes, and uses
their location and energy information for network learning. The best
routing actions are learnt by exchanging short “hello” messages between these neighbouring
nodes.
Another enhanced geographic routing scheme based on a reinforcement learning algorithm, called
“Q-probabilistic Routing” (Q-PR), was proposed by Arroyo-Valles in [108] for WSNs; it
can learn from previous routing decisions (for instance, by selecting the routing path that has
had the highest delivery rate over the past period of time). The difference between this
protocol and the one previously discussed, RLGR, is its support for QoS. Q-PR uses the message
priority, expected delivery rate and power constraints to determine the optimal route, and uses
a learning model based on reinforcement learning and Bayesian decision models. Here, the
Bayesian method handles the decision to transmit packets to a set of candidate neighbouring
nodes, by incorporating knowledge about data priority, the profile of the nodes, reception
energy, and expected transmission rate. Further, it can discover the next hop online during
message routing.
Another enhanced reinforcement learning based WSN routing scheme, called “Feedback Routing for
Optimizing Multiple Sinks” (FROMS), was proposed by Forster and Murphy [107]; it involves nodes
exchanging local information as a feedback response to other nodes. This
routing algorithm allows efficient routing between multiple sources and multiple sinks, with
the Q-values initialised based on the hop counts to every node in the network. The hop counts,
which could be carried in short “hello” messages, are exchanged between the nodes at the early
stages of network deployment. FROMS essentially extends the basic mechanism of RLGR in [106],
with the assumption that there is direct communication between all neighbouring nodes.
However, the main shortcoming of reinforcement learning based WSN routing algorithms is
their inability to look ahead, that is, their limited recognition of future knowledge. Hence
they are unable to perform well in highly dynamic environments, as learning optimal routes in
such environments can take a long time.
2.3.2 Data Collection and Clustering Issues
For large scale energy-constrained sensor networks, it is inefficient to transmit all data
directly to the sink. As proposed by the authors in [114], a more efficient alternative is to
pass the data to an intermediate cluster head (also called a local data collector), which
collects data from all the sensors within its cluster and forwards it to the sink node or the
base station node. Depending on how the cluster head selection or election is done, it is
possible to achieve significant energy savings. For this reason, several algorithms have been
proposed for cluster head selection/election to maximise energy efficiency [115], [116], [117].
A detailed taxonomy and comparison of different clustering algorithms is given by the authors
in [118]. Figure 8 shows an example WSN architecture with different clusters of nodes
categorised as working, dead or cluster head nodes.
Figure 8 Example WSN architecture with clusters of working, dead and cluster head nodes [118]
As discussed in [118], machine learning based approaches can improve the benefits of
clustering and data collection mechanisms between nodes in WSNs in different ways, such as:
• Identify non-functional nodes and remove them from routing schemes, and compress data
locally at cluster heads using dimensionality reduction techniques that extract
similarity and dissimilarity in different sensors’ readings.
• Identify (select or elect) an appropriate cluster head that can maximise energy efficiency
and increase the lifetime of WSNs, using appropriate feature ranking and feature selection
approaches from the machine learning field.
Several solutions have been proposed to this end, for forming the different clusters in
a large WSN, choosing the cluster heads, assigning nodes to each cluster and devising
a routing scheme from source nodes to sink nodes.
2.3.2.1 Self-Managed Clustering Scheme
Hongmei et al. [119] suggested a scheme based on neural networks for self-managed clusters.
This clustering approach works well for large WSNs with short transmission distances, but for
WSNs with large transmission distances, the gains in energy efficiency and quality of service
are not significant.
2.3.2.2 LEACH Algorithm
A decision tree based machine learning algorithm was suggested by Ahmed et al. [120] for
solving the cluster head selection problem. In this work, the authors used several critical
features in the decision tree algorithm for learning the input vector iteratively, including the
distance from nodes to cluster centroids, the battery energy level, the degree of mobility, and
vulnerability indicators. Their simulation study showed improved performance compared
to the “Low Energy Adaptive Clustering Hierarchy” (LEACH) algorithm proposed by the authors
in [127].
2.3.2.3 Gaussian Process Modeling
An approach based on Gaussian modelling of sensor data was proposed by several authors in
[121], [122], [93], and [94]. Gaussian models represent the sensor data using random variables
(stochastic variables) that are parameterized by mean and covariance functions of the sensor
data. Ertin in [121] proposed an approach based on Gaussian process regression for initializing
probabilistic models, whereas Kho et al. [122] extended this Gaussian regression approach to
adaptively sample sensor data depending on its importance. The authors in [122] proposed an
approach focussing on energy consumption, which provides a trade-off between optimal
solution and computational cost. In general, with smaller training data sets (fewer than a few
thousand samples), Gaussian models are preferable for prediction of smooth functions [93].
However, they become computationally intensive for large scale WSNs, and WSN designers need
appropriate strategies to deal with this complexity.
2.3.2.4 CODA Algorithm
Another machine learning architecture, based on self organizing maps (SOM), for data
collection at cluster heads was proposed by Lee et al. in [94]. The SOM approach is an
unsupervised competitive learning technique for mapping high dimensional spaces to lower
dimensions, and in this novel architecture, called “Cluster based self organization and data
aggregation” (CODA), nodes can classify the collected data using a self-organising
algorithm. For a SOM algorithm, the winning neuron $n^*$ is the one whose weight vector
$w_n(t)$ is closest to the input vector $x_n(t)$, obtained by solving the minimization:
$$n^* = \arg\min_{n} \lVert x_n(t) - w_n(t) \rVert, \quad n = 1, \cdots, N \qquad \text{(Eqn. 3.3)}$$
where $N$ represents the number of neurons in the second layer. The winning neuron
and its neighbours are updated as follows:
$$w_n(t+1) = w_n(t) + h(t) \left( x_n(t) - w_n(t) \right) \qquad \text{(Eqn. 3.4)}$$
Here $w_n(t)$ and $w_n(t+1)$ represent the values of a neuron at time $t$ and $t+1$,
respectively, and $h(t)$ is a Gaussian neighbourhood function defined as:
$$h(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\lVert n^* - n \rVert^2}{2\sigma^2(t)} \right) \qquad \text{(Eqn. 3.5)}$$
An improvement in energy efficiency and a reduction in network traffic were observed by using
the CODA based machine learning approach for WSNs.
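One SOM learning step, covering the winner selection, the neighbourhood function and the weight
update described above, can be sketched as below. This is a minimal sketch under stated
assumptions, not the CODA implementation of [94]: the neurons are assumed to lie on a
one-dimensional chain so that the distance between neurons is the difference of their indices,
the same input vector is presented to every neuron, the neighbourhood width sigma is held fixed,
and the exponent of the Gaussian is taken to be negative. The weights and input are hypothetical.

```python
import math

def winning_neuron(weights, x):
    """Index of the neuron whose weight vector is closest to input x."""
    return min(range(len(weights)), key=lambda n: math.dist(x, weights[n]))

def neighbourhood(n_star, n, sigma):
    """Gaussian neighbourhood value for neuron n given the winner n_star."""
    return (math.exp(-((n_star - n) ** 2) / (2.0 * sigma ** 2))
            / (math.sqrt(2.0 * math.pi) * sigma))

def som_step(weights, x, sigma=1.0):
    """Move every neuron towards x, scaled by its neighbourhood value."""
    n_star = winning_neuron(weights, x)
    updated = []
    for n, w in enumerate(weights):
        h = neighbourhood(n_star, n, sigma)
        updated.append([wi + h * (xi - wi) for wi, xi in zip(w, x)])
    return n_star, updated

# Hypothetical map of three neurons with two-dimensional weight vectors.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
x = [1.2, 0.8]
n_star, new_weights = som_step(weights, x)
print(n_star, new_weights)
```

The winner receives the largest update and its chain neighbours a smaller one, so nearby
neurons come to represent similar inputs.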
2.3.2.5 ALVQ Algorithm
While most of the methods discussed above need complete knowledge of the network topology,
some algorithms are free from such restrictions. One algorithm in this category is “Adaptive
Learning Vector Quantization” (ALVQ), proposed by the authors in [123]. This algorithm uses
data correlation and historical patterns to accurately retrieve compressed versions of readings
from sensor nodes. The ALVQ algorithm uses the well-known learning vector quantization (LVQ)
algorithm, and uses the past training samples to predict the code-book. The extension of LVQ to
ALVQ enhances the accuracy of recovering the original data from the compressed data, and
reduces the bandwidth required during transmission. Although this algorithm can represent large
amounts of data with few vectors [83], it does not use isolated unused nodes in prediction, and
hence it is not robust against outliers.
2.3.2.6 Dimensionality Reduction Techniques
Finally, a few other algorithms have been proposed for data collection that reduce the bandwidth
required during transmission using dimensionality reduction techniques, such as:
2.3.2.6.1 Compressive Sensing Approach
The Compressive Sensing (CS) approach replaces the traditional “sample first and then compress”
scheme with a “sample while compressing” scheme. In compressive sensing, the sparsity of the
signal is used to recover the original signal from a few random measurements, as discussed in
detail in [128].
2.3.2.6.2 EM Approach
The Expectation Maximization (EM) approach is an iterative algorithm with two
main steps, an expectation (E) step and a maximization (M) step. The E-step formulates the cost
function while fixing the current expected values of the system parameters, and the M-step
recomputes the system parameters so as to minimize the estimation error of the cost function
[129].
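The alternation between the two steps can be illustrated with a small, self-contained example.
The sketch below applies EM to a mixture of two one-dimensional Gaussians, a standard textbook
use of EM chosen here only for illustration; it is not the formulation of [129], and the
readings, initial means and variance floor min_var are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(data, mean_a, mean_b, iterations=50, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture, starting from the given means."""
    var_a, var_b, weight_a = 1.0, 1.0, 0.5
    n = len(data)
    for _ in range(iterations):
        # E-step: with the parameters fixed, compute for each sample the
        # expected membership (responsibility) of component A.
        resp_a = []
        for x in data:
            pa = weight_a * gaussian_pdf(x, mean_a, var_a)
            pb = (1.0 - weight_a) * gaussian_pdf(x, mean_b, var_b)
            resp_a.append(pa / (pa + pb))
        # M-step: with the memberships fixed, recompute the parameters.
        na = sum(resp_a)
        nb = n - na
        mean_a = sum(r * x for r, x in zip(resp_a, data)) / na
        mean_b = sum((1.0 - r) * x for r, x in zip(resp_a, data)) / nb
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp_a, data)) / na,
                    min_var)
        var_b = max(sum((1.0 - r) * (x - mean_b) ** 2 for r, x in zip(resp_a, data)) / nb,
                    min_var)
        weight_a = na / n
    return (mean_a, var_a), (mean_b, var_b), weight_a

# Hypothetical readings drawn around two levels, near 1.0 and near 5.0.
readings = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
comp_a, comp_b, weight_a = em_two_gaussians(readings, mean_a=0.0, mean_b=6.0)
print(comp_a, comp_b, weight_a)
```

In this example the two estimated means settle near the two reading levels, so ten readings
are summarised by a handful of parameters.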
2.3.2.6.3 PCA Approach
Principal Component Analysis (PCA), one of the most popular dimensionality
reduction techniques, has also found its way into improving WSN performance. A PCA-based method
for estimating distributed observations using a few collected samples was proposed
by Masiero et al. [95], [96]. This technique uses PCA to produce orthogonal components, which
are used by a compressive sensing scheme to reconstruct the original readings. As the PCA
technique here exploits spatial and temporal correlations, this method is independent of the
routing protocol. In similar work, Rooshenas et al. [97] proposed a PCA-based approach to
optimize the direct transmission of readings to the base station or sink node. The use of
PCA here leads to a considerable reduction in traffic by extracting fewer packets from the
combined nodes’ collected data. Reducing the data using PCA at intermediate nodes, instead
of forwarding it all to the destination sink, results in a significant reduction in
communication in the WSN, and hence makes the network energy efficient. Another approach
involving the use of PCA for improving WSN performance was proposed by Macua et al. in [98].
This approach uses
a distributed consensus-based method for dimensionality reduction, and uses a combination of
PCA and a maximum likelihood measure of the observed data. There are two variations of this
method: consensus based distributed PCA (CB-DPCA), which extracts the eigenvectors of local
covariance matrices, and consensus based EM distributed PCA (CB-EM-DPCA), which uses a
distributed EM algorithm. Both variations use a consensus algorithm proposed in [130]
to predict the probability distribution of the data, and compute the global dominant
eigenvectors based on single hop (local) communication parameters. It is possible to achieve a
trade-off between the dimensionality reduction and the communication costs by tuning the
consensus round parameter in both the CB-DPCA and CB-EM-DPCA algorithms. Increasing the number
of consensus rounds improves the accuracy of the algorithm, but at the cost of
increased computation requirements.
2.3.2.6.4 Distributed PCA
Another recent contribution on using PCA for improving WSN performance is by Fenxiong et
al. [124], who addressed the problem of data reduction in WSNs by transforming the data
from a high dimensional space to a lower one using the PCA technique. Here, the data that is
continuously collected over time by each node is sent to its corresponding cluster head. At the
cluster head, the data redundancy is eliminated by compressing the data matrix using PCA
and ignoring the least significant components. By choosing the number of PCA components
appropriately, it is possible to achieve a trade-off between the computational cost and
compression accuracy in WSN.
2.3.2.7 Collaborative Mobile Node Processing
Collaborative mobile node processing with a machine learning approach, proposed by the authors
in [99], [100], involves the use of mobile nodes in the WSN architecture, unlike the
fixed-location WSN layouts discussed so far. The use of mobile nodes is particularly needed for
collecting massive data from surveillance camera networks. Here, mobile sensor nodes are
deployed along with traditional surveillance cameras, to enhance the intelligence gathering
capabilities of integrated mobile surveillance wireless sensor network systems [99]. The
mobile sensor nodes are grouped into several clusters using the k-means unsupervised learning
algorithm, with each cluster monitored by a single mobile sensor. However, the clustering of
sites with the k-means algorithm, though simple and straightforward to implement with
low complexity, is sensitive to outliers and to the selection of initial seed values.
2.3.2.8 Role-free clustering
The role-free clustering approach was proposed by Forster and Murphy in [125], where a
Q-learning technique is used for WSN cluster formation. This approach, called the CLIQUE
method, uses a reward criterion, instead of an election or selection criterion, to assign a
node as a cluster-head node. The method uses the Q-learning algorithm in combination with
certain dynamic network parameters such as energy levels.
2.3.2.9 Decentralised Learning
A decentralised learning approach for reducing data latency was
proposed by Mihaylov et al. [126] to address the latency that can creep into
Q-learning based WSNs with random topology sensor network setups. Here, learning
happens locally, in a decentralised manner, in the cluster head nodes to optimise the data
aggregation, instead of at a central control or base station node. As a result, the efficiency of the whole
WSN is improved, with smaller learning-related transmission overheads. The savings in the node
energy budget during the data collection process extend the lifetime of the network.
2.3.3 Event Recognition & Query Processing Issues
In addition to the routing and node clustering issues discussed in the previous two sections,
event recognition and query processing are also important operational requirements of large
scale WSNs. The functionality needed here is trustworthy event scheduling and recognition
with minimal human intervention. In general, WSN monitoring can be classified as
event-driven, continuous or query-driven [46]. With a machine learning based event monitoring
approach, it is possible to obtain efficient event detection and query processing solutions in
constrained environments with restricted query areas. Figure 9 illustrates different query
processing and event detection operations in WSNs.
Figure 9 Event Detection and Query Processing Using Machine Learning [46]
Adopting machine learning based techniques for these operations can lead to several benefits,
including:
Facilitating the development of efficient event detection techniques using learning
algorithms and simple classifiers, particularly with limited availability of storage and
computing resources.
Facilitating the development of effective query processing techniques for WSNs, for
instance, determining the search region whenever a query arrives and localising
the communication efforts there, instead of flooding the whole network.
Several research works have focussed on the design of efficient event detection and query
processing strategies for WSNs. Some of the simpler approaches define a strict
threshold value for the phenomenon being sensed and trigger an alarm on any violation,
while recent WSN setups use more complex approaches than simple threshold values. These
emerging approaches cast event detection and query processing as machine learning
problems. Some of them are discussed next.
2.3.3.1 Bayesian Event Detection Algorithm
Using decentralised Bayesian learning, Krishnamachari and Iyengar [131] investigated the use
of WSNs for detecting environmental phenomena, and obtained a fault detection accuracy of
up to 95% with simple threshold criteria. With a decentralised learning approach, they could
isolate the faulty region and focus the query processing in that region, leading to better event
recognition accuracy. A follow-up approach was proposed by Chen et al. [134], which addressed
some of the errors in the formulation of the distributed learning problem in [131], leading
to enhanced performance calculations.
2.3.3.2 HMM-Bayes Activity Event Recognition
The work of Zappi et al. [132] extended event recognition and query processing
in WSNs to activity recognition. Here, the authors presented a real-time
approach for activity recognition using WSNs that accurately detects body gestures and motion.
The WSN nodes were spread over the body and detected the motion of body parts
through three-axis accelerometer sensors, with each axis reading taken as positive, negative or
null. These measurements were then used to build a Hidden
Markov Model (HMM) that predicts the activity at each sensor. The prediction accuracy depends
on selecting appropriate sensors that can provide the most informative description of the
gesture. A final gesture decision is obtained by using a naïve Bayes classifier, which combines
the independent node predictions and maximises the Bayes posterior probability. The
architecture of this system is shown in Figure 10.
Figure 10 HMM and Naïve Bayes Event Detection and Query Processing [132]
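The fusion step can be illustrated with a small naïve Bayes combiner. This is a sketch under stated assumptions, not the system of [132]: the gesture names, node names, priors and per-node confusion probabilities are hypothetical, and each node's HMM is represented only by the label it outputs.

```python
import math

def fuse(predictions, confusion, priors):
    """Return the gesture with the largest naive Bayes posterior.

    predictions: {node: label output by that node's classifier}
    confusion:   {node: {true_gesture: {predicted_label: probability}}}
    priors:      {gesture: prior probability}
    All probabilities are assumed to be non-zero so that the logarithm is defined.
    """
    scores = {}
    for gesture, prior in priors.items():
        log_post = math.log(prior)
        for node, label in predictions.items():
            # Nodes are treated as conditionally independent given the gesture.
            log_post += math.log(confusion[node][gesture][label])
        scores[gesture] = log_post
    return max(scores, key=scores.get), scores

# Hypothetical reliability of three body-worn nodes.
confusion = {
    "wrist":    {"wave": {"wave": 0.9, "point": 0.1}, "point": {"wave": 0.2, "point": 0.8}},
    "elbow":    {"wave": {"wave": 0.7, "point": 0.3}, "point": {"wave": 0.4, "point": 0.6}},
    "shoulder": {"wave": {"wave": 0.6, "point": 0.4}, "point": {"wave": 0.5, "point": 0.5}},
}
priors = {"wave": 0.5, "point": 0.5}
predictions = {"wrist": "wave", "elbow": "point", "shoulder": "wave"}
decision, scores = fuse(predictions, confusion, priors)
```

In this toy case the more reliable wrist node outweighs the disagreeing elbow node, which reflects the remark above that accuracy depends on selecting the most informative sensors.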
2.3.3.3 Neural Networks for Forest Fire Event Recognition
The authors in [135] presented a fire detection and rescue system using WSNs
and showed that WSNs can achieve better forest fire detection performance
than satellite based solutions, at a much lower cost. Further, a real time
forest fire detection scheme based on neural network classifiers was proposed in [66]. In this
distributed processing scheme, data is processed at the cluster heads, and the important data is
communicated to and collected at the central station for final decision making. The system,
however, is complex to interpret, especially in real time detection environments, and needs
better strategies for data processing, communication and collection for final decision making
than those proposed.
2.3.3.4 K-Nearest Neighbour for Query Processing
One simple but highly effective query processing technique in WSNs is the k-nearest
neighbour (k-NN) method. An in-network query processing solution
using the k-nearest neighbour algorithm, called the k-NN boundary tree or KBT algorithm, was
proposed by Winter et al. in [64], where each node, aware of its location, can determine its
k-NN search region whenever a query arrives from the application manager. An extension of
the KBT query processing approach to 3D space was proposed by Jayaraman et al. [65], called the
“3D-KNN” processing scheme for WSNs, where the query region is restricted to bound at least
the k nearest nodes in 3D space. Further, SNR (signal to noise ratio) and distance
measurements are used to refine the k-nearest neighbours. The main real time constraints
of using such machine learning approaches for query processing, including k-NN based
algorithms, are the large memory footprint needed in WSN nodes to store every collected sample,
and the high latency, or processing delay, in large sensor networks when communicating k-NN
classifier outputs from cluster heads to sink or base station control nodes for final decision
making.
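The idea of a k-NN search region can be illustrated as follows. This is a brute-force, centralised sketch and not the KBT or 3D-KNN algorithm itself: the node identifiers and coordinates are hypothetical, and the region is taken to be the sphere around the query point whose radius is the distance to the k-th nearest node.

```python
import math

def knn_region(query, nodes, k):
    """Return the ids of the k nodes nearest to query and the radius that bounds them."""
    ranked = sorted(nodes.items(), key=lambda item: math.dist(query, item[1]))
    nearest = ranked[:k]
    radius = math.dist(query, nearest[-1][1])  # distance to the k-th nearest node
    return [node_id for node_id, _ in nearest], radius

# Hypothetical node positions in 3D space.
nodes = {
    "n1": (0.0, 0.0, 0.0),
    "n2": (3.0, 4.0, 0.0),
    "n3": (10.0, 0.0, 0.0),
    "n4": (1.0, 1.0, 1.0),
    "n5": (0.0, 0.0, 6.0),
}
ids, radius = knn_region((0.0, 0.0, 0.0), nodes, k=3)
```

Only the nodes inside the returned radius need to take part in answering the query, which is what allows the communication effort to stay local instead of flooding the network.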
2.3.3.5 Decision Trees for Distributed Event Detection for Disaster Management
Bahrepour et al. [67] developed a decision tree based event detection and recognition approach
using WSNs for disaster prevention systems. It uses a decentralised mechanism, and its main
application is fire detection in residential areas. Here, the final decision on event detection
is made by a simple voting scheme among the highest reputation nodes.
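A reputation-restricted vote of this kind might look like the sketch below. How reputation is computed in [67] is not described here, so the reputation values, the node names and the `top_n` cut-off are hypothetical, and each node's local decision stands in for the output of its decision tree.

```python
def vote(decisions, reputation, top_n):
    """Majority vote restricted to the top_n nodes with the highest reputation."""
    trusted = sorted(decisions, key=lambda n: reputation[n], reverse=True)[:top_n]
    yes = sum(1 for n in trusted if decisions[n])
    return yes * 2 > len(trusted), trusted

# Hypothetical local decisions (True means "fire detected") and reputations.
decisions = {"a": True, "b": False, "c": True, "d": False, "e": False}
reputation = {"a": 0.9, "b": 0.4, "c": 0.8, "d": 0.85, "e": 0.3}
fire_detected, trusted = vote(decisions, reputation, top_n=3)
```

Here the three most reputable nodes outvote the less reliable majority, so an event is declared even though most nodes reported none.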
2.3.3.6 Principal Component Analysis (PCA) for Query Optimization
An optimized query processing approach using WSN data attributes and PCA was proposed
by Malik et al. [133], where the PCA can dynamically detect dominant principal components
(i.e. important WSN data attributes) from the correlated data set.
The four stage workflow of the fundamental steps involved in this algorithm is shown in Fig. 3.8.
In stage 1, an SQL request containing the human friendly and intelligible attributes is sent to the
DBMS. In stage 2, the DBMS optimizes this original query by using only the high variance components
of the PCA output extracted from historical data. In stages 3 and 4, this
optimized query is transmitted to the WSN nodes to extract the data from individual sensor nodes.
The original attributes are then reconstructed from the optimized attributes by reversing the
PCA process. The authors in [133] have shown that this four stage query optimization process
with PCA can result in around 25% energy savings in the WSN nodes at 93% event recognition
accuracy. However, this enhancement does not fully exploit the abundant data: although
large amounts of data are collected at the sensor nodes in the first place, they are not fully used. As such,
the process is not cost effective. Therefore, for applications with high accuracy and
precision requirements, this solution may not be ideal.
2.3.4 Challenges Related to Localisation and Object Targeting
The process of determining the geographic coordinates of a network’s nodes is called
localisation. Location awareness of sensor nodes in WSNs is an important capability, since
most WSN operations are based on location [136]. Although GPS hardware in each node
of a WSN can provide location awareness, it is not feasible cost wise. Further, GPS services
may not be available in remote and certain indoor locations. For such use case
scenarios, relative location measurements may be sufficient, and by using absolute location
measurements for a small group of nodes, the relative locations of other nodes can be converted
into absolute location measurements [137].
Further, additional measurements relying on distance,
angle or a hybrid of the two can be used to enhance the performance of proximity based
localization. Distance measurements can be obtained by different approaches, including
RSSI (received signal strength indication), TOA (time of arrival), and TDOA (time difference
of arrival). Angular measurements can be obtained by using compasses or special
smart antennas [138]. The authors in [82] provide more details about different range based
localization techniques. Sensor nodes can also change location after
WSN deployment, for example due to the movement of nodes. Machine learning techniques
can aid the WSN node localisation process in different ways, such as:
Conversion of the relative locations of nodes to absolute ones using a few anchor or
beacon nodes, eliminating the need for range measurement hardware to obtain distance
estimations.
Division of the monitored sites into a number of clusters in surveillance and object targeting
systems, where each cluster represents a
specific location indicator.
Figure 11 Localization Using Beacon Nodes in WSN [82]
Figure 11 shows the layout of beacon nodes (anchor nodes) and an unknown node (a node which
cannot determine its location). The beacon nodes can determine their location using the
positioning hardware they carry, and this location serves as a reference point to estimate the
co-ordinates of the unknown nodes.
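As a simple, non-learning baseline for how beacon positions serve as reference points, the sketch below estimates the 2D position of an unknown node from its distances to three beacons (trilateration). It assumes that range estimates are available, for example derived from RSSI or TOA, and that they are noise free. The beacon coordinates and the function name are hypothetical.

```python
import math

def trilaterate(beacons, distances):
    """Estimate (x, y) from three beacon positions and the distances to them."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two gives a linear system.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("beacons are collinear; the position is not unique")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Hypothetical beacon layout and an unknown node used only to generate distances.
beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
distances = [math.hypot(true_position[0] - bx, true_position[1] - by) for bx, by in beacons]
x, y = trilaterate(beacons, distances)
```

With noisy range estimates the three circles no longer meet in a single point, which is one motivation for the learning based localisation schemes reviewed in this section.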
Some important approaches proposed by researchers for WSN localisation using machine
learning approaches can be described as follows.
2.3.4.1 WSN node localisation using Bayesian approach
A WSN node localisation scheme based on a Bayesian approach with very few anchor points
(beacon nodes) was proposed by Morelande et al. in [61]. The approach extends the
progressive correction technique proposed in [149], in which the predictive samples from
likelihoods get closer to the posterior likelihood. This algorithm works well for localisation in
both small and large WSNs, with a few thousand nodes, as the Bayesian algorithm can
gracefully handle incomplete data sets due to its capability to learn from priors (previous data)
and probabilities.
2.3.4.2 Location Aware Bayesian approach for Activity Recognition
The problem of both WSN sensor and activity localization in smart homes was addressed by Lu
and Fu [62], where the activities of interest, including using the phone, listening to music, using
the refrigerator and studying, were detected. The authors reiterated that, in such applications,
designers need to take into consideration both human and environmental constraints. Their
framework, named “Ambient Intelligent Compliant Object”, detects the human interaction with
the home electric devices in a more intelligent manner. This is done using several naive Bayes
classifiers to determine the resident’s current location and to evaluate the reliability of the system
by detecting any sensors that are not working. This turns out to be a simple and robust mechanism
for localization, though the scope of the ambient environment is
limited to predefined activities only. If the activities deviate from these, the location awareness
and the activity detection do not work well. To overcome this limitation in this centralized
system, there is a need for less feature engineering, using unsupervised feature learning
techniques such as those proposed in [49], [150].
2.3.4.3 Neural Network based WSN Localisation Approach
Using different neural networks, Shareef et al. [63] developed a localisation scheme for WSNs.
By comparing MLP (multi-layer perceptron), RBF (radial basis function network), and
RNN (recurrent neural network) models, the authors show that the RBF network results in the minimum
error at the cost of high resource requirements, whereas the MLP minimizes the
computational and memory or storage resources.
A slightly different approach was proposed by Yun et al. [139], where two different processing
modules were used along with RSSI information from anchor/beacon nodes for localisation.
The first processing module uses a combination of fuzzy logic and a genetic algorithm,
whereas the second processing module is an adaptive neural network that uses the RSSI
measurements from all anchor/beacon nodes as an input vector to predict the sensor
location. A similar approach for WSN localisation, with RSSI from anchor nodes as an input to
a set of neural networks, was proposed by Chagas et al. in [140]. The advantage of these
multiple NN based localisation algorithms with RSSI information is their capability to express the
location in terms of 3D space coordinates (continuous valued vectors). However,
the weakness of neural network based classifiers as compared to Bayesian or statistical
classifiers is their inability to work under uncertainty, as most of the neural networks that have
been used here are non-probabilistic approaches. Hence, the prediction estimates cannot exploit
prior knowledge effectively, leading to an increase in localisation errors.
2.3.4.4 Support Vector Machine (SVM) based WSN Localisation Approach
For those scenarios where sensors cannot be equipped with self positioning devices, SVM
based WSN localisation approaches have been used. To this end, Yang et al. [91] developed a mobile
node localization scheme employing SVM and connectivity information. The
algorithm first detects node movement using the RSSI metric, and the SVM then estimates the new
location in the second step. A similar approach, called the “LSVM” approach,
was proposed by Tran and Nguyen in [90] for node localization in WSNs. LSVM adopts several decision
metrics, including connectivity information and RSSI indicators, and offers fast and
effective localisation, but it suffers from sensitivity to outliers in the training samples, causing
reduced performance when there are many outlier samples.
2.3.4.5 Light Weight Support Vector Regression (LWSVR) based Localisation
Kim et al. [89] proposed a light weight implementation of the SVR approach, because the
normal SVR approach is difficult to adopt given the limited processing resources in WSN nodes and
the high dimensionality of incoming data. In this approach, the original regression problem is
divided into several sub-problems, and the algorithm works on several subnetworks, each with a smaller
data processing load and its own regression algorithm, which they call a sub-predictor. Then, using
a custom ensemble combination technique, the learnt sub-predictor models are
combined to predict overall network estimates with better performance, including
low computational requirements, robustness against noisy data, and convergence to the
preferred solution.
2.3.4.6 Localisation using Decision Trees
A different application, involving acoustic target localisation for WSNs based on
decision tree learning, was proposed by Merhi et al. [141], where the exact locations of targets
are determined using the time difference of arrival (TDOA) metric and a spatial correlation
decision tree. This work also introduces an EB-MAC protocol (Event Based Medium Access Control)
that allows event-based localization and targeting in acoustic WSNs. This framework was
implemented using MicaZ sensor boards that support the ZigBee 802.15.4 specification for
personal area networks. As using GPS functionality in underwater WSN applications may
not be feasible, due to the limited propagation of GPS signals through water [151],
another approach was proposed by Erdal et al. [142] for submarine detection in underwater
surveillance systems, in which a randomly deployed node can find its location in 3D space using beacon
node co-ordinates. Here, a sensor is fixed with a cable to a surface buoy in each monitoring
unit, and data is collected using the buoys and transmitted to a central controller and processing
unit. The central unit contains a decision tree classifier, which can detect any submarines in
the monitored sites.
2.3.4.7 Localisation using Gaussian Processes
For a WSN temperature monitoring system, an optimized solution to sensor placement based
on spatially correlated data was proposed by Krause et al. [143]. Here, the authors developed a
lazy learning scheme based on a Gaussian process model, which involves storing training
samples and delaying the major processing task until a classification request arrives. When
choosing optimal locations for sensors, this solution aims to achieve robustness against node
failures and model ambiguity. In another work, Gu and Hu [144] developed a distributed protocol
for collective node motion based on spatial Gaussian process regression. A
distributed Gaussian process regression (DGPR) was used to predict optimal locations for
the mobile nodes’ movements. Traditional Gaussian process regression (GPR) has a
computational complexity of O(N^3), where N is the number of samples, so the protocol
uses a sparse version of the GPR algorithm to reduce this complexity.
Using only spatiotemporal information from local neighbours,
each node executes the regression algorithm independently.
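The cubic cost quoted above can be seen from the standard GPR predictive equations, given here as a reminder and not as the specific formulation of [144]. The symbols are assumptions introduced for this note: K is the N x N kernel matrix over the training inputs, k_* is the vector of kernel values between the training inputs and a test input x_*, y is the vector of N observations, and sigma_n^2 is the noise variance.

```latex
\mu_* = \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y},
\qquad
\sigma_*^2 = k(x_*, x_*) - \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{k}_*
```

Factorising or inverting the N x N matrix K + sigma_n^2 I is what costs O(N^3). Sparse approximations typically replace it with a summary built from M << N representative points, which in common inducing-point schemes brings the cost down to roughly O(N M^2); the particular sparse variant used in [144] may differ.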
2.3.4.8 Localisation using Self Organising Map (SOM)
Paladina et al. [105] proposed a SOM based localisation solution for WSNs consisting of
thousands of nodes. In each WSN node, the SOM algorithm is implemented with 2 neurons in the
output layer connected to a 3x3 input layer. The input layer is constructed using the spatial
coordinates of the 8 anchor nodes surrounding the unknown node. In the output layer, the unknown
node’s 2D spatial co-ordinates evolve after sufficient training. However, the shortcoming of
this scheme is that, since it uses neighbouring nodes, the algorithm expects the nodes to
be distributed uniformly and equally spaced throughout the area being monitored. While
most of the traditional methods use the absolute locations of a few nodes to find the positions of
the unknown nodes, Giorgetti et al. [146] proposed a localisation algorithm that uses only
connectivity information and the SOM algorithm. Since this method does not require a GPS
enabled device, it is highly suitable for networks with limited resources. However,
it suffers from latency issues, as the algorithm is implemented in a centralised manner, with
each node transmitting its neighbouring node information to the central control station node,
which calculates the adjacency matrix and hence the node’s location. Hu and Lee [147]
proposed another scheme that does not require anchor nodes for the node
localization service in WSNs. The difference between [147] and [146] is that the
algorithm in [147] is distributed: by distributing
the computation tasks to all nodes in the network, it eliminates the need for a central unit and
minimizes the transmission overhead of the algorithm.
2.3.4.9 Reinforcement Learning based Localisation
A reinforcement learning-based localization scheme for WSNs based on Q-learning was
developed by Li et al. [148], which allows real-time management of the mobile beacons. In
this method, called “Dynamic Path determination of Mobile Beacons” (DPMB), the mobile
beacon (MB) is aware of its physical location during its movement and is used to determine the
positions of a large number of sensor nodes. Here, the different positions of the MB are
determined from the different states of the Q-learning algorithm, and due to its mobility, the MB
can cover all the sensors in the monitored area, sending location update messages at
different times. This style of mobile beacon operation can save the resources of the unknown
nodes, as the entire operation is run on mobile devices. However, because the scheme is centralised,
a malfunctioning mobile beacon could lead to failure of the entire system.
2.3.5 Medium Access Control (MAC) Issues
There are several challenges in the design of MAC protocols for WSNs, such as energy
consumption, latency and prediction accuracy, in addition to the basic operational feature that a
number of different sensors cooperate to efficiently transfer data [152]. Therefore, MAC
protocols have to be designed appropriately to allow efficient data transmission and reception
by the sensor nodes. The authors in [153] have provided a comprehensive survey of MAC
protocols in WSNs. Recently, a few machine learning methods have also been proposed for
designing appropriate MAC protocols and enhancing the performance of WSNs. In these
works, machine learning plays a role in a variety of ways, including:
Using the transmission history of the network to adaptively determine the duty cycle of
a node. Here, the assumption is that nodes which are able to predict when the other
nodes’ transmissions will finish can sleep in the meantime and wake up (to transmit
data) just when the channel is expected to be idle and no other node is transmitting.
Using the concepts of secured data transmission along with machine learning in
designing the MAC layer protocol. Such a secure MAC layer scheme would be
independent of the proposed application and could learn sporadic attack patterns
iteratively.
A brief description of how the WSN protocol design issues were addressed by machine
learning and other related approaches is given next.
2.3.5.1 MAC design using Bayesian Statistical Models
A contention-based MAC protocol for managing active and sleep times in WSNs was proposed
by Kim and Park [68]. By using a Bayesian statistical model to learn when the channel can be
allocated, it reduces the need for continuous sensing of the medium, and hence saves energy. Some
extensions of this scheme target CSMA contention based protocols, and were proposed
as “S-MAC” (Sensor MAC) and “T-MAC” (Timeout MAC) by the authors in [156], [157].
2.3.5.2 MAC design using Neural Network Models
One of the popular medium access protocols in traditional computer networking is TDMA, or
time division multiple access, which employs periodic time slots to separate the medium
access of different machines, and uses a central server unit to broadcast a transmission schedule
whenever the topology of the network changes. This can be adopted for the WSN scenario, and Shen
and Wang [69] proposed a MAC protocol which involves broadcasting the transmission
schedule in TDMA using a fuzzy Hopfield neural network (FHNN) approach. To prevent any
potential transmission collisions and latency issues, the authors propose distributing
time-slots among the different nodes in the network. Another similar approach was proposed by
Kulkarni and Venayagamoorthy [70], which includes security aspects in addition to MAC
issues in the WSN protocol. Their CSMA-based MAC approach can prevent denial-of-service
(DoS) attacks in WSNs, and uses neural network learning to prevent flooding of the WSN with
fake and mendacious data by learning network properties and variations such as the packet
request rate and the average packet waiting time. A DoS attack
generates large amounts of useless data that flood the network and prevent the delivery of useful data.
WSNs are particularly easy to attack in this way, as the attacker exploits the
vulnerability of WSNs in terms of limited buffering and storage capacity and limited bandwidth.
With the neural network based MAC protocol, if the neural network output exceeds a
predefined threshold level, the MAC layer is blocked. Further, blocking does not impact
the functioning of the whole network, as the scheme is implemented in a distributed manner
and only the affected site is blocked.
2.3.5.3 MAC design using Reinforcement Learning Models
The use of reinforcement learning based techniques for medium access control (MAC) was
proposed by Liu and Elhanany in [154], as the RL-MAC protocol. The adaptive RL-MAC
protocol for WSNs optimizes the duty cycle of the network node for reduced energy
consumption and increased throughput. RL-MAC works in a similar manner to S-MAC [156]
and T-MAC [157], and synchronises the nodes’ transmissions on a common schedule in a
frame-based structure. By using the traffic load and channel bandwidth, RL-MAC adaptively
determines the slot length, duty cycle and transmission active time. Chu
et al. [152] proposed a combination of slotted ALOHA and the Q-learning algorithm to
introduce a new MAC protocol for WSNs, called ALOHA-QIR, the ALOHA and Q-Learning
based MAC with Informed Receiving. By using the best features of both ALOHA and
Q-Learning, it provides benefits in terms of simple design, low resource requirements and low
collision probability. The method works by nodes broadcasting their future transmission
allocation in their transmission frames, so that other nodes can be put in sleep mode. The willingness
to reserve a slot is represented by a Q-value map in each node, where the node with the higher
Q-value attains the right of slot allocation and hence transmission of its own data. An
illustration of the steps involved in updating the Q-values over three frames for a node that is
allowed to transmit a maximum of two packets in each frame is shown in Figure 12. Q-learning
based medium access control can suffer from high collision rates in the initial
exploration phases, though it is appealing due to its distributed mode of operation and its small
storage and computational resource requirements.
Figure 12 ALOHA-QIR Scheme For MAC Layer in WSN [152]
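A per-slot Q-value update of the kind illustrated in Figure 12 can be sketched as follows. This is a simplified sketch and not the ALOHA-QIR protocol itself: the learning rate `alpha`, the rewards of +1 for a successful transmission and -1 for a collision, and the function names are assumptions made for illustration.

```python
def update_q(q_values, slot, success, alpha=0.1):
    """Move the Q-value of a slot towards the reward just observed in that slot."""
    reward = 1.0 if success else -1.0  # assumed reward scheme
    q_values[slot] += alpha * (reward - q_values[slot])
    return q_values

def preferred_slot(q_values):
    """Slot the node is most willing to reserve in the next frame."""
    return max(range(len(q_values)), key=lambda s: q_values[s])

# One node, four slots per frame, three observed transmission outcomes.
q = [0.0, 0.0, 0.0, 0.0]
update_q(q, slot=2, success=True)    # first success in slot 2
update_q(q, slot=0, success=False)   # collision in slot 0
update_q(q, slot=2, success=True)    # second success in slot 2
best = preferred_slot(q)
```

Slots that repeatedly succeed accumulate higher Q-values and are chosen again, while collisions push a slot's value down. Early on all values are equal, which is consistent with the high collision rates in the initial exploration phase noted above.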
2.3.5.4 MAC design using Adaptive Decision Trees
For modern application scenarios, such as healthcare and assisted living systems, the design of
the MAC layer in WSNs is quite challenging, particularly in addressing communication
patterns and service requirements that change over time, and the WSN nodes need to directly share the
collected data with the users’ mobile phones or smart phones. To this end, Sha et al. in [155]
proposed a “Self Adapting MAC layer” (SAML) design consisting of two components, the
RMA component and the MAC engine component. The RMA, or reconfigurable MAC
architecture, allows switching between different MAC protocols, and the MAC engine learns which
MAC protocol to choose from the current network data. For learning, the MAC engine uses a
decision tree classifier with several features, including IPI (interpacket
interval), RSSI (Received Signal Strength Indicator), the application QoS requirements
(reliability, energy usage and latency), statistical parameters (mean and variance), traffic
pattern, and PDR (packet delivery rate). Figure 13 shows the design of the SAML protocol.
Figure 13 Adaptive Decision Tree Based MAC Protocol (SAML) [155]
2.4 Non-operational Aspects of WSN
While the operational challenges are directly related to the basic operational or functional
behaviour of systems with WSNs, the non-operational aspects are not related to the basic
operational needs of the system. Though non-functional, they are highly desirable, performance
enhancing requirements that can be used by vendors for differentiation and for achieving a competitive
edge in the market. Some of the performance enhancing requirements could include updates
and analytics on the environment being monitored by WSNs, QoS (quality of service), security,
and data integrity. Recent advances in machine learning techniques can be harnessed to address
the non-operational aspects and enhance WSN performance. Some of the work reported in
this area is discussed next.
2.4.1 Security and Anomaly Intrusion Detection
Due to limited resources, security and intrusion management techniques are
challenging to implement in WSNs [54]. Some of the machine learning based methods
proposed for intrusion detection involve introducing anomalous (unexpected, misleading)
observations into the network to emulate an attack scenario. A brief schematic of the general concept
of anomaly detection in monitoring the WSN system is shown in Figure 14.
Figure 14 Basic Concepts of Anomaly Intrusion Detection [54]
Here, the data is classified into two classes, with most observations belonging
to one of these two regions, while measurements that are inconsistent and unusual due to suspected
attacks are considered anomalies or intrusions. Outliers and misleading
measurements can be detected by different machine learning algorithms, including supervised,
unsupervised and reinforcement learning algorithms, and by analysing well known malicious
activities and vulnerabilities, several attacks and intrusions can be detected. Such WSN security
enhancements based on machine learning techniques can lead to several benefits, including:
Preventing the transmission of anomalous and suspicious data by detecting outliers, which
saves WSN node energy and significantly extends WSN lifetime.
Eliminating faulty and malicious readings, and avoiding the discovery of unexpected
information impacting on critical actions, so as to enhance WSN reliability.
Preventing malicious attacks and vulnerabilities by automatic online
learning.
Some of the approaches based on machine learning that address the security issue in WSNs are
presented next.
2.4.1.1 Outlier detection
An outlier detection scheme based on Bayesian belief networks (BBN) is proposed by
Janakiram et al. [71]. In this scheme, the conditional relationships between the nodes’
readings are modelled first, since most of a node’s neighbours have similar readings due to spatial
and temporal correlations. Then, the BBN learns the conditional dependencies in the
observations for detecting the outliers in the collected data.
Another approach based on k-nearest neighbours for outlier detection was developed by Branch
et al. [72]. Here, the anomaly is detected by computing the average value of the k-nearest
neighbour readings, and comparing it with a pre-determined threshold.
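A neighbour-average check of this kind can be sketched as below. This is one possible reading of the description above, not the algorithm of [72]: a node is flagged when its own reading deviates from the mean of its k nearest neighbours' readings by more than a fixed threshold. The node names, positions, readings, k and threshold are hypothetical.

```python
import math

def is_outlier(node, positions, readings, k, threshold):
    """Flag node if its reading is far from the mean of its k nearest neighbours."""
    others = [n for n in positions if n != node]
    nearest = sorted(others, key=lambda n: math.dist(positions[node], positions[n]))[:k]
    neighbour_mean = sum(readings[n] for n in nearest) / len(nearest)
    return abs(readings[node] - neighbour_mean) > threshold

# Hypothetical temperature readings; node "d" reports a suspicious value.
positions = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1), "e": (5, 5)}
readings = {"a": 20.1, "b": 20.4, "c": 19.8, "d": 30.0, "e": 20.0}
flagged = [n for n in positions if is_outlier(n, positions, readings, k=3, threshold=5.0)]
```

Only node "d" is flagged here. Its neighbours are not, even though the anomalous value enters their neighbour averages, because averaging over k readings dampens the effect of a single bad value.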
2.4.1.2 Anomaly detection
Kaplantzis et al. [73] proposed a scheme for detecting black hole attacks and selective
forwarding attacks, using routing information, bandwidth and hop count to identify the
malicious WSN nodes. In black hole attacks, misleading RREP (Route Reply) messages are
sent by malicious nodes in response to “Route Request” messages from weak and vulnerable
(prone to attack) nodes, indicating incorrectly that routes to the destinations have been found. This
leads source nodes to assume that their packets are being delivered correctly to the
destination, whereas the malicious nodes drop all of the network’s messages. A selective packet
dropping attack prevention technique based on a one-class SVM (support vector machine) was
proposed by the authors in [73] to address this issue. However, the traditional SVM is highly
computationally intensive, and Rajasegarar et al. [74] proposed a light weight SVM approach for
anomaly detection, called the quarter-sphere one-class SVM, to alleviate this problem. The
approach allows distributed implementation in WSNs, and can distinguish anomalies in data
while minimizing communication overheads. Further, Yang et al. in [96], improvised this
algorithm by using an unsupervised clustering technique to learn the anomalies at the
distributed nodes and the one-class quarter-sphere SVM at the centralised control station
nodes, and showed a significant improvement in computational complexity. Their approach is
similar to the one proposed in [74]. Another approach, using an artificial immunity algorithm
in conjunction with SVM for intrusion detection, was proposed by Chen et al. [87]. The
artificial immunity algorithm is a computational intelligence algorithm for problem solving
inspired by biological immune systems [164], and involves automatic generation of immune
bodies (antibodies) against the antigen or virus through the cell fission mechanism. In the
intrusion detection scheme, the immunity algorithm was used to pre-process the sensor data,
which was then fed to the SVM for anomaly-based intrusion detection. Another approach,
using a one-class ellipsoid SVM, was proposed by Zhang et al. [88]; it extracts the temporal
and spatial correlations from the collected readings to train the SVM for outlier detection.
The ellipsoid SVM method uses linear optimization instead of the quadratic optimization
used for the traditional SVM, leading to efficient learning, good performance, and the
ability to learn nonlinear and complex problems. However, high computational and memory
requirements are the main disadvantages, due to scalability problems with large data sets
[85]. An alternative approach using a self-organising map (SOM) was proposed by Avram et
al. [163], who addressed the detection of network attacks in wireless ad hoc networks using
an unsupervised learning approach based on SOM, where the weights are learnt from
statistical analysis of the input data vectors. However, this approach is also not capable of
detecting malicious attacks in complex data sets from large scale WSNs.
2.4.2 Data Integrity, Fault Detection, and QoS Enhancement:
The state of the art and general QoS requirements in WSNs have been reviewed in [166],
where the authors reiterate that WSNs suffer from energy and bandwidth constraints, which
can limit the quantity of information that can be transmitted from a source to a destination
node. Further, due to random network topologies and faulty, unreliable data
aggregation/dissemination in WSNs, QoS (Quality of Service) guarantees are necessary. The
QoS enhancements guarantee high priority delivery of real-time events and data, particularly
for complex WSN architectures with multi-hop transmission of data to the end user, and
distribution of queries from a central system controller to the WSN nodes [165]. Some recent
efforts on using machine learning techniques to achieve specific QoS and data integrity
metrics are discussed next; they offer several advantages, such as:
Use of machine learning approaches can eliminate the need for flow-aware and stream-
aware management techniques, as they can be trained to recognise different types of
streams automatically.
Machine learning methods can automatically detect the type of network service and the
type of WSN application, and it is possible to meet requirements corresponding to QoS
guarantees, data integrity and detection of faults, while ensuring efficient resource
utilization, mainly bandwidth and power utilization.
Some of the approaches proposed on using machine learning methods for QoS guarantees, data
integrity and fault detection in WSNs are as follows.
2.4.2.1 Using Neural Networks for QoS estimation
Of late, there has been significant interest in estimating and enhancing WSN performance
using automated approaches. A sensor network dependability metric was proposed by Snow et
al. [75] to represent the availability, reliability, maintainability and survivability of the
sensor network. To estimate the dependability metric, the authors used performance measures
such as MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair) as features.
Another approach, for modelling dynamic fault detection, was proposed by Moustapha and
Selmic [76], where the method models the dynamic behaviour of nodes and their effects on
neighbouring nodes. Further, they used a variation of the backpropagation method for neural
network learning for node identification and fault detection, similar to how it was used in
[75]. This variation allowed a nonlinear sensor model to be derived that can adapt to
different applications with fault detection requirements.
2.4.2.2 Learning Based Quality Estimation Framework
Wang et al. [77] proposed a link quality estimation framework called MetricMap, which
addresses the inadequacies of traditional link quality measurement tools: different
operating conditions, such as signal variations and interference, lead to inaccurate and
unstable readings across different environments [172]. The MetricMap framework uses
supervised learning techniques to obtain link quality indicators. MetricMap is an
enhancement over the earlier MintRoute protocol proposed in [173]; it adopts a combination
of online and offline learning of decision tree classifiers to obtain link quality indicators.
MetricMap builds the classification tree using several local features, such as RSSI (received
signal strength indicator), size of the transmission buffer, channel load, and
forward/backward probabilities. Here, the ratio of received to total transmitted packets over
a link l is termed the forward probability p_f(l), whereas the same ratio calculated over the
reverse path is the backward probability p_b(l). Further, as global features over far away
nodes are communication intensive, local features from the neighbouring nodes are preferred.
In experimental validation, the MetricMap framework achieved around three times
improvement in data delivery rate compared to the basic MintRoute method.
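As a small illustration of the local features named above, the sketch below assembles a per-link feature vector including the forward and backward probabilities. It is not taken from MetricMap [77]: the function names, the dictionary keys, the packet counts and the choice to return 0.0 when nothing was transmitted are all assumptions for the example.

```python
def delivery_probability(received, transmitted):
    """Ratio of received to transmitted packets; 0.0 when nothing was sent (assumption)."""
    if transmitted <= 0:
        return 0.0
    return received / transmitted

def link_features(rssi_dbm, buffer_size, channel_load, fwd_counts, bwd_counts):
    """Assemble the local feature vector for one link from the features listed in the text."""
    return {
        "rssi_dbm": rssi_dbm,
        "buffer_size": buffer_size,
        "channel_load": channel_load,
        "p_f": delivery_probability(*fwd_counts),  # forward path: (received, transmitted)
        "p_b": delivery_probability(*bwd_counts),  # reverse path: (received, transmitted)
    }

# Hypothetical measurements for one link.
features = link_features(-78.0, 12, 0.35, fwd_counts=(90, 100), bwd_counts=(40, 50))
```

A vector of this kind would then be passed to a trained decision tree classifier to obtain the link quality indicator.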
2.4.2.3 Use of Multi Output Gaussian Processes for WSN node Accuracy and
Reliability Assessment
A real time algorithm to discover a set of nodes that can handle information processing tasks
corresponding to assessment of accuracy of collected sensor readings, and prediction of
missing readings was proposed by Osborne et al. in [167]. Here, as shown in Eqn 3.6, the
algorithm uses a probabilistic Gaussian process to estimate a reasonable size of training data
by using the priors (historical data/previous experience) and a multivariate Gaussian process
to predict the posterior distribution of an observed environmental variable x (the sea-surface
temperature).
\[ p(x \mid \mu, K, I) \triangleq \frac{1}{\sqrt{\det(2\pi K)}} \exp\left( -\frac{1}{2} (x - \mu)^{T} K^{-1} (x - \mu) \right) \]    (Eqn 3.6)
where μ, K are the prior mean and covariance of the variable x, respectively, and I denotes the
historical data that is updated online (a sequence of time-stamped samples) to include the new
sequentially collected observations.
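The Gaussian density in Eqn 3.6 can be evaluated numerically as in the sketch below. This only illustrates the formula and is not the algorithm of Osborne et al. [167]; the function name `gaussian_density` is an assumption, and the covariance K is assumed to be symmetric positive definite.

```python
import numpy as np

def gaussian_density(x, mu, K):
    """Evaluate the multivariate Gaussian density with mean mu and covariance K at x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    K = np.atleast_2d(np.asarray(K, dtype=float))
    diff = x - mu
    # Solve K z = diff rather than forming the explicit inverse of K.
    quad = diff @ np.linalg.solve(K, diff)
    norm = np.sqrt(np.linalg.det(2.0 * np.pi * K))
    return float(np.exp(-0.5 * quad) / norm)

# One-dimensional standard normal evaluated at its mean.
p = gaussian_density([0.0], [0.0], [[1.0]])
```

For the one-dimensional standard normal the value at the mean is 1/sqrt(2π), which gives a quick sanity check of the normalising term.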
2.4.2.4 QoS guarantee based on reinforcement learning
A Q-learning based approach for QoS guarantee was proposed by Ouferhat and Mellouk [168].
Here, the authors introduced a QoS task scheduler for multimedia sensor networks based on
the Q-learning type of reinforcement learning, and showed that it is possible to enhance the
network throughput significantly by reducing the transmission delay. Seah et al. [169], on
the other hand, used WSN coverage as the QoS metric and showed how a Q-learning method
allows efficient monitoring of the area of interest in a WSN setup. They developed a
Q-learning based distributed learner that can detect weakly monitored regions, which need to
be scheduled for upgrades in future WSN deployment stages.
Another approach that used the capabilities of Q-learning, considering energy harvesting for
QoS guarantees, was proposed by Hsu et al. [170]. The authors introduced a QoS-aware power
management scheme for WSNs with energy harvesting capabilities, called "Reinforcement
Learning based QoS-aware Power Management" (RLPM). RLPM employs Q-learning to adapt to
the dynamic levels of the nodes' energy in systems with energy harvesting capabilities, and
manages the nodes' duty cycle under the specified energy constraints while remaining
QoS-aware. A different approach for QoS guarantee, called "Multiagent Reinforcement
Learning based multi-hop mesh Cooperative Communication" (MRL-CC), was proposed by Liang
et al. [171], where MRL-CC is adopted to reliably assess the data in a cooperative manner.
MRL-CC can also examine the impact of traffic load and node mobility on the performance of
the whole network.
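For reference, the schemes in this subsection build on the standard tabular Q-learning update, which can be written as below. The symbols are the usual textbook notation and are not taken from [168]-[171]: s_t and a_t are the state and action at step t, r_{t+1} is the reward received, α is the learning rate and γ is the discount factor.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

How states, actions and rewards are defined (for example, duty cycle levels as actions and a QoS measure as the reward) differs between the cited schemes and is not specified here.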
2.4.3 Application Specific Unique Challenges
There are some novel application specific challenges, which cannot be categorized into
mainstream machine learning WSN literature, but nevertheless, are unique and provide insight
into how some unforeseen aspects of WSNs were addressed. Some of these are briefly
discussed here.
2.4.3.1 Reinforcement Learning for WSN Resource Management
An algorithm that exploits local information and the constraints imposed on the WSN
application to optimize various tasks over a period of time, while maximising energy
efficiency, was presented by Shah and Kumar [174]. In this algorithm, termed DIRL
(Distributed Independent Reinforcement Learning), each WSN node learns the minimum
resources required to perform its scheduled tasks, with rewards assigned by the Q-learning
method, and finds the optimal parameters of the application running on the WSN. As an
example, for the object recognition and tracking application shown in Figure 15, the
Q-learning based DIRL algorithm can learn the task priorities for a given task schedule of
this application. The object tracking application consists of five different tasks:
Aggregation of two or more readings into a single reading
Transmission of a message to the next hop
Receipt of incoming messages
Reading of next sample
Placing the node into sleep mode.
These tasks need to be performed in a certain priority order to maximise the network
lifetime, but the WSN does not have a predetermined schedule for achieving this performance
goal (for example, it has no prior knowledge of the physical proximity of an object to a node
that would trigger the task of reading samples). Under such circumstances, the Q-learning
based DIRL task scheduler can learn from the penalties and rewards assigned for wrong and
right decisions during the learning stage, and can perform better in real time based on this
knowledge.
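The idea of a node learning task priorities from rewards and penalties can be sketched with a small tabular Q-learning scheduler. This is an illustrative toy and not the DIRL algorithm of [174]: the task labels, the two states `object_near` and `idle`, the reward function `toy_reward` and all parameter values are hypothetical.

```python
import random

# The five tasks listed in the text, as short labels (labels are illustrative).
TASKS = ["aggregate", "transmit", "receive", "sample", "sleep"]

class TaskScheduler:
    """Tabular Q-learning over (state, task) pairs, run independently on one node."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}                      # state -> {task: estimated value}
        self.rng = random.Random(seed)

    def _row(self, state):
        return self.q.setdefault(state, {task: 0.0 for task in TASKS})

    def best_task(self, state):
        row = self._row(state)
        return max(row, key=row.get)

    def choose(self, state):
        # Epsilon-greedy: explore occasionally, otherwise exploit current estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(TASKS)
        return self.best_task(state)

    def update(self, state, task, reward, next_state):
        # Standard Q-learning update towards reward plus discounted best next value.
        row = self._row(state)
        best_next = max(self._row(next_state).values())
        row[task] += self.alpha * (reward + self.gamma * best_next - row[task])


def toy_reward(state, task):
    """Hypothetical reward: sampling pays off near an object, sleeping pays off otherwise."""
    if state == "object_near":
        return 1.0 if task == "sample" else -1.0
    return 1.0 if task == "sleep" else -1.0


scheduler = TaskScheduler()
for step in range(2000):
    state = "object_near" if step % 2 == 0 else "idle"
    next_state = "idle" if state == "object_near" else "object_near"
    task = scheduler.choose(state)
    scheduler.update(state, task, toy_reward(state, task), next_state)
```

After training on this toy reward, the scheduler prefers sampling when an object is near and sleeping when idle. A real reward would instead reflect the energy cost and application value of each task.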
Figure 15 WSN Based Q-Learning for Object Tracking Application [174]
2.4.3.2 Decision Tree Based Learning for Animal Behaviour Classification Application
Applications such as habitat and environment monitoring have also used WSNs, together with
simple machine learning classifiers, to learn the behaviour of herds of animals [175]. Nadimi
et al. [176] utilized a decision tree learner to classify each animal in a herd as active or
inactive, using features such as the pitch angle of the neck and movement velocity. This
application performed well because of its simple implementation and low complexity, with a
decision tree learner and only a few critical features.
2.4.3.3 SOM (Self Organising Map) based Clock Synchronisation
As modern WSN nodes have to perform several tasks under limited resources, clock
synchronisation between sensor nodes is an important requirement for maintaining
consistency in the execution of tasks across sensor nodes in large scale WSNs. A SOM (self
organising map) based reliable clock synchronisation technique was proposed by Paladina et
al. [177], where nodes with restricted storage and computing resources can predict a near
optimal estimate of the current time without the need for a central timing device. This
method, however, presumes that the nodes are deployed uniformly over the monitored area
and that all nodes have the same transmission power, which is not always the case.
2.4.3.4 Neural network based Air Quality Monitoring
A neural network based air quality monitoring approach for measuring pollution levels was
proposed by Postolache et al. in [178]. Here the detection of air quality and gas concentration
was done, by making the neural network learn the readings of inexpensive gas sensor nodes in
the WSN set-up. The implementation was done in a distributed manner by client and server
side scripting on web server and end-user computers.
2.4.3.5 Neural network based Intelligent Lighting Control
A new standard for lighting control in smart buildings based on neural networks was
presented by Gao et al. [179]. Here, an RBF (Radial Basis Function) neural network was used
to extract a computational entity called the "I-Matrix" (Illuminance Matrix), which measures
the degree of illuminance in the lighted area. This is quite a unique application field with
several challenges, particularly in converting the data detected by photo sensors into a
quantitative or qualitative feature that can be processed by computers, a step that can
impact the performance of the system significantly. The authors show that their I-Matrix
based approach results in a 60% improvement in performance over the standard methods.
2.5 Research Gap in Wireless Sensor Networks Based on Machine
Learning/Data Mining Techniques
As can be seen from the previous work reviewed in this Chapter, a large body of work exists
on using machine learning techniques to address various challenges in WSNs, including
operational, non-operational, and application specific challenges. There is still a research
gap, however, and further research efforts are needed, as many issues remain open. Some of
the gaps and the further research needed are discussed below:
2.5.1 Better Methods for Selecting Sensors
A large number of sensor measurements are needed in practice to monitor events and
maintain the desired detection accuracy. With the requirement for WSN nodes to operate
under resource constraints, network designers face several design challenges related to
network management and communication bandwidth. Since around 80% of the energy in the
sensor nodes is consumed by communication activity (sending and receiving data), efficient
data compression and dimensionality reduction techniques are needed to reduce transmission
and hence extend the network lifetime. Most of the previous approaches discussed here used
the PCA (principal component analysis) technique for dimensionality reduction or data
compression. However, PCA is too computationally intensive to be implemented on WSN
nodes: it increases memory requirements and causes severe latency issues if implemented on
the nodes, or incurs extra energy consumption if the data must be transmitted to cluster
heads or sink nodes for feature extraction. Although there is a trade-off between energy
consumption and the dimensionality reduction or compression achieved, there is a need for
lightweight alternatives to PCA and its variants, given their computationally intensive
nature and the limited resources available on WSN nodes for computing the PCA components.
2.5.2 Adaptive and Distributed Machine Learning Approaches For WSNs
Because WSN sensors are devices with limited resources, distributed machine learning
techniques are better suited to WSNs than centralised learning algorithms. Distributed
techniques require less computational power and a smaller memory footprint, since the nodes
do not need to know about the whole network. Further, the algorithms need to be adaptive,
allowing nodes to learn the current environment conditions and rapidly adapt their future
behaviour and predictions dynamically. Hence, adaptive and distributed learning algorithms
are needed to reduce the communication overheads and alleviate the computational burden
on the nodes.
2.5.3 Managing Resources Using Machine Learning
As discussed in the previous sections of this Chapter, WSN designers face different types of
challenges, including operational, non-operational and application specific challenges.
Energy efficiency is one of the key challenges, and an energy efficient design can be achieved
by improving operational aspects, such as enhanced communication protocols (routing and
MAC protocol design), and by detecting non-operational, energy wasteful activities, such as
listening to neighbouring nodes, transmitting redundant information, and remaining in
active listening mode all the time. While the first aspect, the design of enhanced
communication layer protocols based on machine learning approaches, has received
significant research attention, with a large body of literature available, the second aspect,
the design of energy saving approaches, has received less attention, and not many approaches
are available.
2.5.4 Spatio-Temporal Correlation Detection
With several sensor nodes, it is quite possible that a large amount of redundant information
is being communicated within the network. This could lead to wastage of energy. If
correlations and dependencies between the sensors can be detected, both spatially and
temporally, a reduced number of sensors can be used for communication, event detection and
monitoring, and significant energy savings become possible. Among the earlier approaches
examined in the previous sections of this Chapter, there seem to be few that exploit
spatio-temporal correlations to achieve energy efficiency in WSNs.
2.6 Research Plan and Thesis Road Map
To address these research gaps in achieving energy efficient WSN design with machine
learning techniques, this thesis proposes a novel integrated framework, which takes into
consideration operational, non-operational and application-specific challenges. The
integrated framework for energy efficient WSNs based on machine learning consists of three
stages:
Stage 1: Stage 1 is based on a joint energy efficiency–event detection model, in which we
develop a novel sensor node selection scheme that conserves the energy in the wireless
sensor network and at the same time maximizes the event recognition performance. The
scheme uses fewer sensor nodes at a time and places unwanted sensor nodes in sleep mode. A
novel objective quantitative measure, the life time extension factor (LTEF), is proposed to
assess the energy efficiency achieved. We show that this joint scheme allows selection of the
most significant and influential sensor nodes for participation in different WSN tasks, and
contributes significantly towards energy savings and event detection accuracy. The detailed
design and experimental validation of this scheme are presented in Chapter 3.
Stage 2: As the WSN components need to adapt dynamically to the state of the environment
being monitored, the number of sensor nodes participating in the routing tree cannot remain
fixed and needs to adapt in order to accurately monitor and predict the physical
environment. The second contribution of this work is the design of adaptive models for
sensor selection and classifier learning, which can meet specified performance targets for
energy efficiency and prediction accuracy. It turns out that this scheme, which involves
selection of an appropriate classifier model in conjunction with the previous sensor selection
approach, not only results in better prediction accuracy but also contributes towards
quality of service (QoS) enhancements. This stage can be implemented in a decentralised
manner in the WSN nodes or collectively at the central base station control node. The
detailed design and experimental validation of this scheme are presented in Chapter 4.
Stage 3: The third and final contribution is a joint sensor selection and adaptive routing
model for addressing the dynamic WSN environment, which needs to adapt the routing
scheme while maintaining the energy efficiency and prediction accuracy targets. This scheme
also leads to improvement in some non-functional challenges, such as recovery from sensor
failure and model building time, which are important for maintaining QoS guarantees. The
detailed design and experimental validation of this scheme are presented in Chapter 5.
The details of each of these stages are presented in the next three Chapters of this thesis.
Chapter 3
Joint Sensor Selection - Event Detection Scheme
3.1 Introduction
In this Chapter, the details of the joint sensor selection and event detection scheme are
presented. In this scheme, a data driven method is used to learn the most significant sensors.
The sensors are modelled by the features extracted from data sets corresponding to different
WSN application scenarios, including the acoustic Isolet data, the Ionosphere data and the
Forest Cover Type data. In this formulation, minimizing the number of sensors for energy
efficient management becomes equivalent to minimizing the number of features [25]. For this
minimization, a feature ranking approach is used, where the features are ranked according to
their significance in the wireless sensor network. That is, we first rank the sensors from the
most significant to the least significant, and then select the optimal number of sensors to
meet a specified accuracy target [26].
For validating the proposed scheme, we used different publicly available datasets
corresponding to wireless sensor networks from the UCI Machine Learning Repository [25].
This Chapter presents the results of the studies done on the Isolet, Ionosphere, Forest Cover
Type and Forest Fires datasets. Each data set consists of a different number of sensors
(features).
3.2 Joint Energy Efficiency - Event Detection Scheme
The block schematic of the joint sensor selection and event detection scheme proposed for
the integrated framework is shown in Figure 16.
3.2.1 Energy Efficiency with Feature Ranking Algorithm
The sensor selection algorithm uses a feature selection and ranking technique to determine
the most influential sensors by learning the influence of each feature on the event detection
performance. It discards the insignificant sensors in the WSN cluster and keeps the
significant sensors for predicting the application event. For this, a feature selection and
ranking algorithm has been developed which uses 'independent features' significance testing
[175] to extract the significant sensors in the WSN. This involves calculating the significance
level of each sensor from the input data measurements, that is, its ability to distinguish WSN
event categories, comparing it with a pre-determined threshold, and sorting the sensors for
ranking. Figure 17 shows the implementation of the algorithm for selecting the significant
sensors.
Figure 16 Block Schematic for Joint Energy Efficiency - Event Detection Scheme
Figure 17 Sensor Selection and Ranking Algorithm
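Since Figure 17 is not reproduced in this text, the sketch below shows one common form of independent-features significance testing for the two-class case: each feature is scored by the absolute difference of its class means divided by the standard error of that difference, features are sorted by score, and those above a threshold are kept. The function name `rank_sensors`, the restriction to two classes, the threshold value of 2.0 and the toy data are assumptions, and the exact algorithm used in this thesis may differ.

```python
import numpy as np

def rank_sensors(X, y, sig_threshold=2.0):
    """Rank features (sensors) by a two-class significance score, most significant first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two event classes")
    a, b = X[y == classes[0]], X[y == classes[1]]
    # Standard error of the difference between the two class means, per feature.
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    se = np.where(se == 0.0, np.finfo(float).eps, se)  # guard against constant features
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / se
    order = np.argsort(-score, kind="stable")          # indices sorted by descending score
    selected = [int(i) for i in order if score[i] > sig_threshold]
    return order, score, selected

# Toy data: sensor 0 separates the two events, sensor 1 does not.
X = [[0.0, 5.0], [0.1, 4.0], [0.2, 6.0], [1.0, 5.0], [1.1, 6.0], [1.2, 4.0]]
y = [0, 0, 0, 1, 1, 1]
order, score, selected = rank_sensors(X, y)
```

Here sensor 0 is ranked first and is the only one that passes the threshold, so sensor 1 would be a candidate for sleep mode.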
3.2.3 Naïve Bayes Machine Learning Classifier Algorithm
The Naive Bayes classifier algorithm is based on Bayes' theorem and is particularly suited to
cases where the dimensionality of the inputs is high and the number of instances is low.
Given a set of variables X = {x1, x2, ..., xd}, we want to construct the posterior probability
for the event Cj among a set of possible outcomes C = {c1, c2, ..., cd}. In a more familiar
nomenclature, X is the set of predictors and C is the set of categorical levels present in the
dependent variable. Using Bayes' rule:
\[ p(C_j \mid x_1, x_2, \ldots, x_d) \propto p(x_1, x_2, \ldots, x_d \mid C_j) \, p(C_j) \]    (Equation 1: Bayes' Rule)
where p(Cj | x1, x2, ..., xd) is the posterior probability of class membership, i.e., the
probability that X belongs to Cj. Since Naive Bayes assumes that the conditional probabilities
of the independent variables are statistically independent, we can decompose the likelihood
into a product of terms:
\[ p(X \mid C_j) \propto \prod_{k=1}^{d} p(x_k \mid C_j) \]    (Equation 2)
and rewrite the posterior as:
\[ p(C_j \mid X) \propto p(C_j) \prod_{k=1}^{d} p(x_k \mid C_j) \]    (Equation 3)
Using Bayes' rule above, we label a new case X with the class level Cj that achieves the
highest posterior probability.
Although the assumption that the predictor (independent) variables are independent is not
always accurate, it does simplify the classification task dramatically, since it allows the class
conditional densities p(xk | Cj) to be calculated separately for each variable, i.e., it reduces a
multidimensional task to a number of one-dimensional ones. In effect, Naive Bayes reduces a
high-dimensional density estimation task to a one-dimensional kernel density estimation.
Furthermore, the assumption does not seem to greatly affect the posterior probabilities,
especially in regions near decision boundaries, thus, leaving the classification task unaffected.
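The classification rule described in this section can be sketched in a few lines. The example assumes Gaussian class-conditional densities p(xk | Cj) for each feature, which is one common choice and not necessarily the density estimate used in the experiments of this thesis. The class name `GaussianNaiveBayes` and the toy data are illustrative only.

```python
import numpy as np

class GaussianNaiveBayes:
    """Naive Bayes with an assumed one-dimensional Gaussian density per feature and class."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Class priors p(Cj) estimated from class frequencies.
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small variance floor avoids division by zero for constant features.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # log p(xk | Cj) for every sample, class and feature: shape (n, classes, d).
        diff = X[:, None, :] - self.mean_[None, :, :]
        log_lik = -0.5 * (np.log(2.0 * np.pi * self.var_)[None] + diff**2 / self.var_[None])
        # Independence assumption: sum per-feature log-likelihoods, then add the log prior.
        log_post = self.log_prior_[None, :] + log_lik.sum(axis=2)
        # Label each case with the class that has the highest posterior.
        return self.classes_[np.argmax(log_post, axis=1)]

# Toy two-sensor readings for two event classes.
X_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y_train = ["a", "a", "a", "b", "b", "b"]
model = GaussianNaiveBayes().fit(X_train, y_train)
predicted = model.predict([[1.1, 2.1], [5.1, 5.9]])
```

Working with log probabilities turns the product of per-feature terms into a sum, which avoids numerical underflow when the number of features is large, as with the 617 features of ISOLET.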
3.3 Experimental Validation
Four different data sets corresponding to different event recognition applications were used
for experimental validation. The data sets were taken from a publicly available repository.
Table 1. Data sets for experimental validation
Data set       # of instances   # of attributes   Missing values?   Associated task
ISOLET         7797             617               No                Classification
Ionosphere     351              34                No                Classification
Cover Type     581012           54                No                Classification
Forest fires   517              13                N/A               Regression
The purpose of the ISOLET dataset is to predict which letter name was spoken. As can be seen
in Table 1, ISOLET is a large data set with 7797 instances and 617 attributes (features). The
data set is divided into batches, Isolet 1+2+3+4 and Isolet5. In this section the Isolet5 part
was used, consisting of 1559 instances and 617 features.
The Ionosphere data set contains radar data collected by a system in Goose Bay, Labrador.
The targets were free electrons in the ionosphere. "Good" radar returns are those showing
evidence of some type of structure in the ionosphere. "Bad" returns are those that do not;
their signals pass through the ionosphere [28]. In the Ionosphere experiment we used all 34
attributes in addition to the class labels "good" and "bad".
The Forest Cover Type data set is a very large data set, with 581012 instances and 54
attributes. It is used to predict the forest cover type from cartographic variables [25]. In
Experiment 3 we used all the attributes and instances to determine the application's event
detection accuracy.
Forest Fires is a regression dataset whose aim is to predict the area burned by forest fires.
Several of the attributes in the Forest Fires data set could be correlated, so feature selection
and ranking can reduce the dimensionality of the sensors used for detecting the application
events [30]. In our experiments, the features were reduced to 5, as some of the attributes,
such as date, time and month, are not sensor readings and need not be included in the
machine learning scheme.
The main aim of our experiments was to show to what extent the number of features selected
affects the accuracy and the life time extension factor (the life time of the sensor network
before the sensors become unavailable). The following experiments show that the accuracy
and the life time of a sensor network depend on a variety of factors.
3.3.1 Experiment 1 (Isolet Data set)
The first experiment is on the ISOLET dataset. The data we used consists of 1559 instances
with 617 features (the Isolet5 part), whereas the full dataset has 7797 instances and 617
features. After applying our feature ranking algorithm to the Isolet5 dataset, the ranking of
the most significant features is as shown in Table 2, where one hundred features have been
ranked from 1 to 100.
Table 2 Features selected in Isolet 5
As can be seen from this table, the first row lists the first 10 features in order of
significance, ranked 1 to 10 (455, 453, 454, ..., 462), the second row lists the next 10
features, ranked 11 to 20 (69, 6, 101, 38, 37, ..., 261), and so on until all features are
ranked. This ranking process determines which sensors are most significant in the first batch
of data that has arrived in the WSN (1559 instances out of 7797). To determine how many
sensors need to be active to detect the events in the WSN, the network needs to be trained
first and then used for prediction.
Ranked feature numbers (each row lists ten features in rank order)
Column                1    2    3    4    5    6    7    8    9   10
Most significant    455  453  454  456  457  458  459  460  461  462
                     69    6  101   38   37   70   39    5  262  261
                      7  102   40   71   72  103   43  104    8   44
                     76   73   42    2   41  133   74   75  230    9
                    106   11  110  108  109   78   77  263  105   45
                    107   10   12  293    3   46  264  134  229  135
                     34  111   66  290   98  226   79   47  137  140
                    227  258  294  231  139  136  225  165  332  166
                    138  265  130  112   80  486  259  142   48  232
                    233  141  295   13   81  545  266  167  481  113
                    ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
Least significant   236  467  157  177  329  485   94  147  270  239
This is done by training a machine learning classifier on different numbers of ranked
features: the first 10 most significant features, then 20, 30, and so on. A simple Naïve Bayes
machine learning classifier was used, as Bayesian classifiers work well with less data, and the
prediction accuracy achieved was noted in order to decide the number of sensors that need to
be active in the WSN at any point in time. Table 3 shows the prediction accuracy for the
Naïve Bayes classifier:
Table 3 Naïve Bayes Classifier Performance
\[ \text{Life time extension factor (LTEF)} = \frac{\text{Total number of features}}{\text{Number of features used}} \]    (see footnote 1)
The third column in Table 3 is the proposed measure of the energy efficiency achieved, the
LTEF metric. As can be seen from Table 3, prediction accuracy improves as the number of
features/sensors selected for the classifier increases. However, this comes at the cost of the
life time extension factor. The LTEF increases when fewer features/sensors are used and
redundant features are eliminated, and an increase in LTEF represents an increase in energy
efficiency. There is a trade-off between energy efficiency (LTEF) and prediction/event
detection accuracy: the performance target for one is met at the cost of the other. Here, an
appropriate feature ranking and selection algorithm can determine the most influential
sensors or most significant features, and allow redundant features to be eliminated. Figure
18 shows a visualisation of the trade-off between the number of features/sensors and event
detection accuracy.
1 Life time Extension factor Equation
Features   Accuracy   Lifetime extension factor
10         9.62%      617/10 = 61.7
20         11.80%     617/20 = 30.85
30         13.79%     20.56
40         14.62%     15.42
50         16.10%     12.34
100        23.92%     6.17
200        41.05%     3.08
Figure 18 Event Detection Accuracy vs. Life time extension Factor(LTEF) (Isolet 5 data set)
Table 4 Results of experiment 1
Number of Features 10 20 30 40 50 100 200
Accuracy 9.62% 11.80% 13.79% 14.62% 16.10% 23.92% 41.05%
Life time extension factor 61.7 30.85 20.56 15.42 12.34 6.17 3.08
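The LTEF values in Tables 3 and 4 follow directly from the definition, as the short sketch below illustrates using the Isolet5 accuracies reported above. The function names and the helper `smallest_subset_meeting` are illustrative assumptions; the helper shows how an accuracy target could be traded against LTEF. The tabulated values for 30, 40 and 200 features appear to be truncated to two decimals rather than rounded.

```python
def lifetime_extension_factor(total_features, features_used):
    """LTEF = total number of features / number of features used."""
    if features_used <= 0 or features_used > total_features:
        raise ValueError("features_used must be between 1 and total_features")
    return total_features / features_used

TOTAL_FEATURES = 617  # Isolet5
# Accuracy (%) per number of ranked features, copied from Table 4.
ACCURACY = {10: 9.62, 20: 11.80, 30: 13.79, 40: 14.62, 50: 16.10, 100: 23.92, 200: 41.05}

def smallest_subset_meeting(target_accuracy):
    """Return (features, LTEF) for the fewest features that reach the target, or None."""
    for n in sorted(ACCURACY):
        if ACCURACY[n] >= target_accuracy:
            return n, lifetime_extension_factor(TOTAL_FEATURES, n)
    return None

ltef_table = {n: lifetime_extension_factor(TOTAL_FEATURES, n) for n in ACCURACY}
choice = smallest_subset_meeting(20.0)
```

With a target accuracy of 20%, the fewest features that qualify among the tabulated settings is 100, giving an LTEF of 6.17.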
From Figure 18, it can be seen that the life time extension factor increases when fewer sensors
are used, at the cost of accuracy; conversely, the event detection accuracy of the network can be
increased at the cost of a lower life time extension factor. Further, in the event of a sensor
failure or unavailability, it is possible to maintain the accuracy by increasing the number of
features used. To emulate sensor failure, we assumed that a sensor Si is unavailable with
probability p = 0, 0.01, 0.05, 0.10 or 0.50 [31]. In this experiment, we multiplied our Isolet5
data set by each of the probability values above. We selected 10 features and used a Naïve Bayes
classifier to find the event detection accuracy; the results are shown in Table 5.
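The exact procedure used to apply the failure probabilities to the data is not spelled out in the text. One possible reading, sketched below under that assumption, is that each sensor reading is independently marked as unavailable with probability p. The function name `emulate_sensor_failure` and the use of `None` for a missing reading are illustrative choices, not the thesis's implementation.

```python
import random


def emulate_sensor_failure(rows, p, seed=0):
    """Return a copy of `rows` where each reading is replaced by None with probability p.

    None marks a reading from an unavailable sensor (an assumed convention).
    """
    rng = random.Random(seed)
    return [[None if rng.random() < p else value for value in row] for row in rows]


readings = [[0.2, 0.5, 0.1], [0.4, 0.3, 0.9]]  # toy data: 2 samples x 3 sensors
for p in (0, 0.01, 0.05, 0.10, 0.50):
    degraded = emulate_sensor_failure(readings, p)
    missing = sum(value is None for row in degraded for value in row)
    print(f"p={p}: {missing} of 6 readings unavailable")
```

A classifier would then be trained and tested on the degraded data for each value of p to obtain an accuracy column per failure probability.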
[Figure 18 chart, "Experiment 1": series features, accuracy and life time extension factor; x-axis: # of features]
Table 5 Experiment 1 Accuracy with Sensor Failure Probability
Features  Accuracy without P  Accuracy P=0.01  Accuracy P=0.05  Accuracy P=0.10  Accuracy P=0.5
10        9.62%               9.55%            9.42%            9.56%            9.56%
As shown in Table 5, the system is quite stable with respect to occasional sensor faults. When
20, 30, 40, 50, 100 and 200 features were used with sensor failure taken into consideration, the
accuracy achieved was also quite stable.
3.3.2 Experiment 2 (Ionosphere dataset)
This experiment was based on the Ionosphere data set. We used all 34 attributes, with only two
output states of the application class, either "good" or "bad". After applying the proposed feature
ranking algorithm to the Ionosphere data set, the ranking of features from most significant to
least significant is as shown in Table 6, and the performance of these ranked features in terms of
detection accuracy and energy efficiency is shown in Table 7.
Table 6 Experiment 2 Features selected and Ranked on Ionosphere dataset
Feature number/Column number   1   2   3   4   5   6   7   8   9  10
1                              2   3   5   7   1   9  31  33  29  21
2                             15  23   8  13  25  14  11  12  16   6
3                             19  10  18  22  27   4  17  34  28  32
4                             20  24  30  26
Table 7 Experiment 2 Accuracy
Features  Accuracy  Life time extension factor
10        38.74%    34/10 = 3.4
20        35.89%    34/20 = 1.7
30        35.89%    34/30 = 1.1
34        35.89%    34/34 = 1
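A comparison of the performance on the Ionosphere (Figure 19) and Isolet (Figure 18) data sets
shows that using more sensors can improve the prediction accuracy, as for Isolet, but at a higher
cost in terms of reduced energy efficiency. For Ionosphere, the accuracy did not improve beyond
10 features (Table 8).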
Figure 19 Accuracy and life time extension factor (Ionosphere)
Table 8 Experiment 2 results
Number of Features Accuracy Life time extension factor
10 38.74% 3.4
20 35.89% 1.7
30 35.89% 1.1
34 35.89% 1
3.3.3 Experiment 3 (Forest Cover Type data set)
The data set used in experiment 3 is the Forest Cover Type data set. It is a large data set,
consisting of 581012 instances and 54 attributes. After applying the feature ranking algorithm to
the Forest Cover Type data, the features are ranked in Table 9 from the most significant to the
least significant in terms of relative importance.
[Figure 19 chart, "Experiment 2": series features, accuracy and life time extension factor; x-axis: # of features]
Table 9 Experiment 3 features ranked and selected for forest cover type dataset
Feature Number   1   2   3   4   5   6   7   8   9  10
1               15  19  28  29  51   1  26  36  37  52
2               24  53  12  25  27  54  44  14  18  43
3               10   6  32   8  40  17  48  38  20  49
4               16  35  42   7  33   5  23   3  13  31
5               30   4  45   2  11  21  41   9  39  22
6               47  46  50  34
Table 10 Experiment 3 Accuracy and life time Extension factor
Features  Accuracy  Life time extension factor
10        68.00%    54/10 = 5.4
20        68.16%    54/20 = 2.7
30        68.27%    54/30 = 1.8
40        68.37%    54/40 = 1.3
54        68.49%    54/54 = 1
The results of this experiment are shown in Table 11, and Figure 20 shows the prediction
accuracy versus the energy efficiency in terms of the LTEF metric. As can be seen from Table 11
and Figure 20, for the same number of features, say 10 sensors, the prediction accuracy achieved
is better than for the previous two data sets. This could be due to the larger amount of data
available for the training stage, and hence the ability to learn a better model for the Forest
Cover Type data than for the Ionosphere and Isolet data. Further, it can be seen that increasing
the number of sensors used barely improves the prediction accuracy: as the number of sensors
increases from 10 to 54, the prediction accuracy improves only from 68% to 68.49%, while the
impact on energy efficiency is severe, as the LTEF drops from 5.4 to 1.0.
Figure 20 Accuracy and Life time extension factor (Forest cover Type data set)
Table 11 Experiment 3 results
Number of Features Accuracy Life time extension factor
10 68.00% 5.4
20 68.16% 2.7
30 68.27% 1.8
40 68.37% 1.3
54 68.49% 1
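The trade-off in Table 11 can be read as a simple selection rule: choose the smallest number of sensors whose accuracy still meets a given target, since that maximises the LTEF. The sketch below illustrates this with the Table 11 values; the function name `fewest_sensors_meeting` and the idea of an explicit accuracy target parameter are assumptions made for illustration.

```python
# (number of features, accuracy in %, LTEF), values taken from Table 11
TABLE_11 = [(10, 68.00, 5.4), (20, 68.16, 2.7), (30, 68.27, 1.8),
            (40, 68.37, 1.3), (54, 68.49, 1.0)]


def fewest_sensors_meeting(results, target_accuracy):
    """Return the row with the fewest sensors whose accuracy reaches the target, or None."""
    feasible = [row for row in results if row[1] >= target_accuracy]
    return min(feasible, key=lambda row: row[0]) if feasible else None


print(fewest_sensors_meeting(TABLE_11, 68.0))   # a modest target keeps most sensors asleep
print(fewest_sensors_meeting(TABLE_11, 68.3))   # a slightly higher target costs most of the LTEF
print(fewest_sensors_meeting(TABLE_11, 70.0))   # unreachable with this classifier
```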
3.3.4 Experiment 4 (Forest fires Dataset)
For the fourth experiment the forest fires dataset was used. This data set has a size of 517 * 13
(517 samples with 13 features). The features were reduced to 5 because the other 8 attributes,
such as date, time and month, were not relevant. After applying the feature ranking algorithm,
Table 12 shows the features in order of their significance, from the most significant to the
least significant.
[Figure 20 chart, "Experiment 3": series features, accuracy and life time extension factor; x-axis: # of features]
Table 12 Selected features on forest fires Dataset
Feature 1  Feature 2  Feature 3  Feature 4  Feature 5
3          4          5          1          2
Table 13 Experiment 4 Accuracy Forest Fires data set
Features        Accuracy  Life time extension factor
1 (3)           10.77%    5/1 = 5
2 (3,4)         11.04%    5/2 = 2.5
3 (3,4,5)       13.37%    5/3 = 1.6
4 (3,4,5,1)     13.56%    5/4 = 1.25
5 (3,4,5,1,2)   13.75%    5/5 = 1
The relationship between accuracy and life time extension factor for the forest fires data set is
similar to that in experiment 1: increasing the number of features increases the accuracy at the
cost of the life time extension factor. Further, in the event of sensor failure or
unavailability, it is possible to maintain the specified accuracy by including more sensors for
classifying the area affected by fire. However, for a healthy sensor network, using more features
or sensors costs more resources and reduces the life time of the sensor network. It would be more
energy efficient to use a smaller number of more significant sensors. The accuracy versus life
time extension factor for the selected features is shown in Table 13. The poor detection accuracy
is attributed to the small data size and the inability of the network to learn the relationship
between input and output without enough data. Figure 21 shows the performance for dataset 4.
Figure 21 Accuracy and life time extension factor for forest fires data set
Table 14 Experiment 4 results
Number of Features Accuracy Life time extension factor
1 10.77% 5
2 11.04% 2.5
3 13.37% 1.6
4 13.56% 1.25
5 13.75% 1
3.4 Chapter Summary
In this Chapter, a joint sensor selection and event detection model/scheme for WSNs was proposed,
based on a machine learning approach. As the method is data driven, and tries to learn the
relationship between sensor data and event detection capability, different types of publicly
available datasets from the UCI Machine Learning repository were used to test the proposed joint
model/scheme. Here, the sensors in a WSN are modelled by the features/attributes of a dataset,
the application events by the output classes or variables, and the sensor samples/measurements
by the instances of the dataset. A feature ranking algorithm was
developed which ranks the features or sensors in order of their significance in being able to
predict the output or WSN state. Also, an objective measure of the energy efficiency achieved was
devised, a metric called the life time extension factor (LTEF), which the WSN seeks to improve
while learning from the features and predicting the output with a Bayesian (Naïve Bayes) machine
learning classifier. As reiterated before, due to resource constraints on the WSN and its sensor
nodes, there is a need for lightweight machine learning approaches, and the scheme proposed in
this Chapter, based on a joint sensor selection and event detection model, is one such scheme
that can easily be implemented in WSN nodes in both decentralised and centralised topologies. It
turns out that this scheme, based on a feature ranking technique and a Naïve Bayes classifier,
can also address the non-operational or non-functional challenges, such as QoS guarantees, as it
can take sensor failures into account and maintain event detection accuracy under sensor
failures. It allows graceful management of the sensor network in the event of sensor failures, by
increasing the number of sensors to meet the specified accuracy requirements. However, the event
detection performance is quite low and needs improvement. It is possible that better sensor
selection and machine learning approaches can address this issue. In the next Chapter, we discuss
the next stage of the proposed integrated framework, which addresses this shortcoming by
extending the joint sensor selection and event detection model with adaptive classifier models,
instead of the simple Naïve Bayes classifier and feature ranking algorithm used here, to address
both operational (functional) and non-operational (non-functional) challenges in WSNs.
Chapter 4 Adaptive Models for Energy Efficiency
4.1 Introduction
In this Chapter, we extend the scheme developed in the previous Chapter with adaptive classifier
and adaptive sensor selection models to improve the performance of the integrated framework in
addressing the WSN challenges.
As the WSN nodes need to adapt dynamically to the state of the environment being monitored, the
number of sensor nodes participating in the routing tree cannot remain fixed; it needs to adapt
in order to accurately monitor and predict the physical environment. In this Chapter, the design
of data driven adaptive classifier models for improving prediction accuracy, based on specified
performance targets, is presented. It turns out that this scheme, which involves selection of an
appropriate classifier model in conjunction with the previous sensor selection approach, not only
results in better prediction accuracy but also contributes towards quality of service (QoS)
enhancements. As with the joint sensor selection and event detection scheme discussed in the
previous Chapter, the scheme can detect sensor failures and gracefully manage the performance
targets. The adaptive classifier model scheme discussed in this Chapter can be implemented in a
decentralised manner in the WSN nodes or collectively in the control code of the central base
station, depending on the algorithm complexity and the computational resources available.
4.2 Adaptive Classifier Model Based Scheme
The block schematic for the adaptive classifier model scheme is shown in Figure 22. Random
forest, random tree and decision tree classifiers were compared with the baseline Naïve Bayes
classifier to achieve energy efficiency. Depending on the WSN configuration, this scheme can be
implemented in a decentralised, distributed manner at the WSN cluster head nodes, or in a
centralised manner at the central control station. Together with the sensor selection scheme
proposed in the previous Chapter, augmentation with adaptive classifier models allows better
energy efficiency and prediction accuracy than the simple Naïve Bayes classifier.
Figure 22 Adaptive Feature Selection and Classifier Model for Energy Efficiency
The experimental validation of the proposed scheme on a publicly available UCI machine learning
dataset shows that the proposed adaptive classifier models, based on random forests and random
trees, perform significantly better than conventional statistical classifiers, such as Naïve
Bayes, discriminant classifiers and decision trees, and can lead towards energy efficient,
intelligent event detection and monitoring and QoS enhancements in WSNs [32].
4.2.1 Data set Description
Accurate natural resource inventory information is vital to any private, state, or federal land
management agency. The Forest Cover Type dataset provides such information and is publicly
available through the UCI machine learning repository [25]. The original Cover Type data set is
very large, containing 581012 instances and 54 attributes. There are seven forest cover type
classes (Class 1 to Class 7): spruce/fir, lodgepole pine, Ponderosa pine, Cottonwood/Willow,
Aspen, Douglas-fir and Krummholz. We used smaller subsets of this data, with each subset
containing around 500 instances from each class (Class 1 to 7), giving a total of 500 * 7 = 3500
instances. Table 15 describes the Forest Cover Type data set [33].
Table 15 Forest cover type original data set and subset data set Description
Data set                                                Number of Attributes  Number of Instances
Forest Cover Type original data set                     54                    581012
Forest Cover Type subset data set used for experiments  54                    3500
Classes (both data sets): Class 1 spruce/fir, Class 2 lodgepole Pine, Class 3 Ponderosa Pine,
Class 4 Cottonwood/Willow, Class 5 Aspen, Class 6 Douglas-fir, Class 7 Krummholz
4.2.2 Classification Algorithms
For baseline comparison with conventional classification schemes, four different classification
algorithms were examined in this work: Naïve Bayes, Decision Trees, Random Forests and Random
Trees. The Naïve Bayes classifier was described in the previous Chapter; the remaining
classifiers are discussed in this Chapter.
4.2.2.1 Decision Tree Classifier
The decision tree classifier is another statistical classifier, similar to the Naïve Bayes
classifier, that builds decision trees from a set of training data using the concept of
information entropy. The training data is a set S of already classified samples. Each sample Si
consists of a p-dimensional vector X = (X1, ..., Xp), where the Xj represent the attributes or
features of the sample, together with the class in which Si falls. Details of the decision tree
algorithm are discussed in [34].
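To make the role of information entropy in the decision tree classifier concrete, the following sketch computes the Shannon entropy of a set of class labels and the information gain of a categorical attribute, which is one common entropy-based splitting criterion. Which exact criterion the decision tree implementation used in the experiments applies is not stated here, so this should be read as an illustration rather than the thesis's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a categorical attribute."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


labels = ["good", "good", "bad", "bad"]
print(entropy(labels))                                 # two equally likely classes
print(information_gain(["a", "a", "b", "b"], labels))  # attribute separates the classes
print(information_gain(["a", "b", "a", "b"], labels))  # attribute carries no information
```

A tree builder would evaluate such a score for every candidate attribute at a node and split on the best one.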
4.2.2.2 Random forests and random trees
Random forests are an ensemble learning method for classification (and regression) that operates
by constructing a multitude of decision trees at training time and outputting the class that is
the mode of the classes output by the individual trees.
A random tree, on the other hand, involves constructing multiple decision trees randomly. When
constructing each tree, the algorithm picks a "remaining" feature randomly at each node
expansion without any purity function check. A categorical feature (such as gender) is considered
"remaining" if the same categorical feature has not been chosen previously in a particular
decision path from the root of the tree to the current node. The details of random forests and
random trees are available in [35].
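The final step of a random forest, outputting the mode of the classes predicted by the individual trees, can be sketched as a simple majority vote. The per-tree predictions below are hypothetical values chosen for illustration; training the trees themselves is not shown.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most common class among the individual tree predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical outputs of five trees for one sample
votes = ["Aspen", "Krummholz", "Aspen", "Douglas-fir", "Aspen"]
print(majority_vote(votes))
```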
4.2.3 Experimental Evaluation
For all the experiments, 10-fold cross validation was used, with the data partitioned into 10
folds, of which 9 were used for training and 1 for testing with unseen data. Further, to estimate
performance benchmarks, full training set mode was also used for evaluation. We also examined the
performance with and without the feature selection/ranking algorithm to find the optimal number
of sensors needed to meet the energy efficiency and prediction accuracy targets. The results are
shown in Figures 23, 24, 25 and 26.
Figure 23 Performance of classifiers with 10 folds cross validation
Figure 24 Performance of classifiers with full training set
Figure 25 Performance of Classifiers with feature selection
Figure 26 Performance of classifiers with feature selection on full training set
4.3 Discussion
The comparative performance evaluation of the adaptive classifier model is shown in Figures
23-26. As can be seen in these figures, the proposed adaptive classifier model scheme based on
random forest and random tree classifiers performs significantly better than the conventional
statistical classifier approaches based on Naïve Bayes and decision trees. With 10-fold cross
validation, it was possible to achieve an accuracy of 86.45% with random forests and 78.14% with
random trees, as compared to 71.08% with Naïve Bayes and 86.05% with decision trees. With full
training set mode, which serves as a benchmark mode, random forest achieves 99.94% and random
tree 100% accuracy. This gap means that appropriate strategies for improving generalisation
ability are needed, so that the classifier model scheme performs in test mode as closely as
possible to learning or training mode.
For benchmarking in full training set mode, we use the entire training data to build the model
with each classifier, and use the same data for testing it. When we use k-fold cross validation
(k = 10 here), however, we partition the data into 10 equal-sized subsets. For the first fold,
the first nine subsets (90% of the labelled data) are used for training, and the last subset (10%
of the data) is used for testing. For the next fold, the training data consists of subsets 2 to
10, and the test set consists of subset 1. Likewise, for each subsequent fold the training data
rotates to the next 9 subsets, so that for each fold the test data is an unseen 10% of the data,
as compared to the 90% used for training. As can be expected, testing with unseen data (i.e.
10-fold cross validation) results in only a marginal improvement for the proposed random forest
(86.45%) and random tree (78.14%) classifiers as compared to the conventional Naïve Bayes
(71.08%) and decision tree (86.05%) classifiers.
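The partitioning described above can be sketched as an index generator. This is a minimal sketch under stated assumptions: the data is split into contiguous subsets without shuffling or stratification, and the order in which subsets are held out is arbitrary; the function name `k_fold_splits` is illustrative. Tools that implement cross validation typically also shuffle or stratify the data first.

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 3500 instances, as in the Forest Cover Type subset used here
folds = list(k_fold_splits(3500, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

Each sample appears in exactly one test fold, so the averaged accuracy over the folds is computed entirely on data the corresponding model did not see during training.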
However, the improvement is significantly higher when a feature selection algorithm is involved.
A wrapper type feature selection method is used here, unlike in the previous Chapter, where a
significant feature test was the criterion for selecting the significant features. With 10-fold
cross validation and feature selection, the prediction accuracy achieved is 77.94% (random
forest) and 74.20% (random tree), as compared to 66.74% (Naïve Bayes) and 75.85% (decision
tree). To compare how these classifier models fare against the benchmark mode, testing with the
full training set was also done.
With the full training set (testing done on the same data as training), the improvement achieved
was much higher, as is evident from Figure 27 (Comparative classifier performance). It must be
noted that the use of a feature selection method denotes an improvement in energy efficiency, as
fewer features result in lower computational power and storage requirements. So, a trade-off
between accuracy and energy efficiency can be achieved with an appropriate choice of feature
selection and classification model. As the WSN environment changes dynamically, the classifier
model is adapted from a choice of four different classifier models, so as to meet the energy
efficiency and prediction accuracy targets. With a joint and adaptive scheme combining feature
selection techniques and classifier models, it is possible to monitor a large, complex WSN for
different event recognition applications.
For the feature selection method used here, we selected 8 features (sensors) using a wrapper
method. A wrapper method searches for the best subset of features, assessing the quality of each
candidate subset with a specific classification algorithm by internal cross validation. Here,
the wrapper type feature selection method allows selection of the 8 most significant features
instead of the full feature set (54 features), resulting in reduced energy consumption in terms
of sensor computation and storage requirements.
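The search strategy used by the wrapper method is not specified in the text. As one possible illustration, the sketch below uses a greedy forward search: starting from an empty subset, it repeatedly adds the feature that most improves a score and stops when no feature helps. The `score` callable stands in for "cross-validated accuracy of a chosen classifier on this subset"; both it and the toy scoring function are assumptions made for the example.

```python
def forward_wrapper_selection(n_features, score, max_features):
    """Greedy forward search: repeatedly add the feature that most improves score(subset)."""
    selected = []
    best_score = score(selected)
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        top_score, top_feature = max((score(selected + [f]), f) for f in candidates)
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected, best_score


# Toy stand-in for a cross-validated accuracy: features 2 and 5 are useful,
# and every extra feature costs a small penalty.
def toy_score(subset):
    useful = {2: 0.4, 5: 0.3}
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


print(forward_wrapper_selection(8, toy_score, max_features=8))
```

In a real wrapper, each call to `score` would train and cross-validate the classifier on the candidate subset, which is why wrapper methods are more expensive than filter-style ranking.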
Figure 27 Comparative classifier performance
As each feature represents a sensor in the WSN, the use of a reduced feature set (8) here implies
8 sensors in active mode and 46 sensors in sleep mode for classifying the forest cover type
environment. This can lead to increased life for the sensors, which we measure with a metric
called the life time extension factor. The life time extension factor is obtained as the ratio
of the total number of features to the number of features in active mode. In this case, the life
time extension factor achieved is 54/8 = 6.75, that is, a more than six-fold increase in the life
of the sensors or improvement in energy efficiency. Further, the combination of feature selection
and adaptive classifier models here can also handle sensor failures, similarly to the scheme
discussed in the previous Chapter, as the sensor selection scheme adapts to a different set of
sensors and a different type of classifier to gracefully manage the performance targets,
including energy efficiency, prediction accuracy and QoS guarantees.
However, the weakness of the scheme is its generalisation ability, as the benchmark performance
with the full training set is higher than in 10-fold cross validation mode. This could be due to
the characteristics of the data set used or to the approach itself. To ascertain this, we
examined the scheme with adaptive classifier models on a different data set, as discussed in the
next Section.
4.4 Adaptive Classifier Scheme with Gas Sensor Drift Dataset
The Gas Sensor Array drift dataset is larger than the Forest Cover Type subset used above, and
consists of 13,910 measurements from 16 chemical sensors used to predict 6 different gases at
different concentration levels. The dataset also provides information about the concentration
level to which the sensors were exposed for each measurement. The data set is divided into 10
batches collected over 36 months, each containing the number of measurements per class and month
indicated in the following table, with details of the data set description provided in [25, 35].
Table 16 Gas sensor Array drift data set description
Number of Attributes  Number of Instances  Number of Classes
129                   13910                6
Class 1  Class 2   Class 3  Class 4       Class 5  Class 6
Ethanol  Ethylene  Ammonia  Acetaldehyde  Acetone  Toluene
4.4.1 Experimental Validation with Gas Drift Dataset
For experimental evaluation with this dataset, the adaptive classifier model was enhanced with
more powerful machine learning classifiers and an adaptive feature selection model. The adaptive
classifier model examined here draws on several classification algorithms, including Naïve
Bayes, J48, MLP, Random Forests, Random Trees and Random Committee. As can be seen from the
experimental validation with the Gas Sensor Array drift dataset, the results obtained are
consistent with the experiments on the Forest Cover Type data and those in the previous Chapter.
As the Gas Sensor Array drift data set is large, we performed the experiments on each of the 10
batches: after applying Naïve Bayes, Random Forest, J48 (decision tree), Random Tree and Random
Committee classification to each batch, the results were averaged over the 10 batches. The
details are shown in Table 17 and Figure 28.
Table 17 Performance of Gas drifts sensor dataset
Gas drift Sensor data set                                          Naive Bayes  Random Forest  J48     Random Tree  Random committee
Gas drift/10 folds                                                 89.50%       99.91%         99.54%  100.00%      100.00%
Gas drift/Training set                                             88%          100%           100%    100%         100%
Gas drift/10 folds, Feature selection (best first method)          86.95%       99.98%         99.57%  100%         100%
Gas drift/Training set, Feature selection (best first method)      86.53%       99.97%         99.57%  100.00%      100.00%
Gas drift/10 folds, Feature selection (Greedy stepwise method)     87.89%       99.97%         99.55%  100.00%      100.00%
Gas drift/Training set, Feature selection (Greedy stepwise method) 86.89%       99.97%         99.54%  100.00%      100.00%
Figure 28 Gas drifts summary of experimental results
The performance for the Gas Sensor Array drift dataset was much better than for the Forest Cover
Type data, particularly when the adaptive classifier model was extended with an ensemble learning
(random committee) classifier and the adaptive feature selection model with best first and greedy
search methods. The Naïve Bayes classifier achieves a detection accuracy between about 86.5% and
89.5% (Table 17), while Random Forest, J48, Random Tree and Random Committee achieve very high
accuracy, from about 99.5% to 100%. Using these feature selection methods instead of the wrapper
method or the significant feature method, the life time extension factor (LTEF) rises to about 25
(128/5 = 25.6) for 10-fold cross validation, with the same result for the full training set. With
only 5 features selected instead of the 128 features in this dataset, the energy efficiency is
improved significantly, at the highest prediction accuracy of 100%. The combination of the
adaptive classifier model and the adaptive feature selection model also improves generalisation
ability, as the 10-fold cross validation performance was 100%, equal to the performance achieved
with the full training set. This can also support further QoS enhancements, in terms of handling
sensor failures and resource management.
4.4.2 Experimental Validation with Gas Drift Dataset using Ensemble Learning for Weak Classifiers
In this set of experiments, ensemble learning methods were used to examine whether the
performance of weak classifiers, such as Naïve Bayes and J48 (decision trees), can be improved.
As can be seen in Table 18, the performance of the weak classifiers is improved. In 10-fold cross
validation mode, bagging improves the classification accuracy of the Naïve Bayes classifier from
67.14% to 71%, and that of the J48 classifier from 81.94% to 88%. The improvement in performance
due to ensemble learning is similar with the full training set. This suggests that the bagging
method of ensemble learning generalises better, as the performance is improved for both
previously seen data (full training data, a benchmark performance) and unseen data (10-fold
cross validation). For the rest of the experiments, we used only the benchmark case, i.e. the
full training set.
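The bagging idea referred to above can be sketched as follows: each model is trained on a bootstrap resample of the training data (sampling with replacement), and the ensemble predicts by majority vote. The base learner here is a deliberately simple one-dimensional threshold rule, a toy stand-in for Naïve Bayes or J48; the function names `train_stump` and `bagging` and the toy data are assumptions for illustration only.

```python
import random
from collections import Counter


def train_stump(samples):
    """Toy weak learner for 1-D data with labels 0/1: threshold between the class means."""
    xs0 = [x for x, y in samples if y == 0]
    xs1 = [x for x, y in samples if y == 1]
    if not xs0 or not xs1:          # the resample contains a single class only
        only = 0 if xs0 else 1
        return lambda x: only
    m0, m1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    threshold = (m0 + m1) / 2
    return lambda x: int((x >= threshold) == (m1 >= m0))


def bagging(samples, n_models=11, seed=0):
    """Train each model on a bootstrap resample and predict by majority vote."""
    rng = random.Random(seed)
    models = [train_stump([rng.choice(samples) for _ in samples]) for _ in range(n_models)]
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]


data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
predict = bagging(data)
print(predict(0.05), predict(0.95))
```

Averaging over resampled models mainly reduces variance, which is why bagging tends to help unstable learners such as decision trees more than stable ones.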
Table 18 Ensemble Learning on Gas drift sensor Array data set
Ensemble Learning Methods   Use cross Validation 10 folds  Use Training set
MLP Multilayer Perceptron   97.94%                         99.58%
Meta-Bagging-NB             59.25%                         59.50%
Meta-Bagging-J48            98.61%                         99.72%
Meta-Adaboost-NB            59.16%                         59.38%
Meta-Adaboost-J48           99.38%                         100%
Meta-Stacking-NB            16.66%                         16.66%
Meta-Stacking-J48           16.66%                         16.66%
4.5 Chapter Summary
In this Chapter, adaptive classifier models were proposed to address the WSN challenges. The
adaptive classifier model scheme performed extremely well and, along with the adaptive feature
selection scheme, could achieve the energy efficiency and prediction accuracy targets, as well as
address the QoS and resource management issues. For experimental validation, two different types
of large datasets were used, the Forest Cover Type dataset and the Gas drift dataset, to emulate
a large physical environment instrumented with a WSN, with each attribute/feature of the data set
modelling a WSN node/sensor set up for monitoring a large and complex physical environment. The
findings with the Gas sensor data set were consistent with those from the Forest Cover Type
experiments: a significant performance improvement with the combined adaptive classifier and
feature selection model, using random forests, random trees and random committees, and best
first and greedy search feature selection techniques. Further, using a different learning scheme
within the adaptive classifier model, the ensemble learning scheme, it was possible to improve
the combined performance of weak classifiers, such as Naïve Bayes and J48 decision trees, and
raise their prediction accuracy. This supports the hypothesis that powerful machine learning
approaches can indeed address different WSN challenges, including the operational/functional
challenges such as energy efficiency and event detection accuracy, and the
non-operational/non-functional challenges such as failure recovery and resource management. In
the next Chapter, the proposed integrated framework is further extended with a joint sensor
selection and adaptive routing model/scheme for energy efficiency.
Chapter 5
Joint Sensor Selection- Adaptive Routing Model
5.1 Introduction
In this Chapter, the third stage of the integrated framework for addressing WSN challenges is
presented. The third stage involves the joint sensor selection and adaptive routing scheme, which
can address WSN challenges with missing data or a lack of sufficient data due to sensor failures.
The proposed approach involves an adaptive routing scheme for energy efficiency, and works in
conjunction with extensions to the sensor selection scheme proposed in earlier chapters. The
experimental validation of the proposed scheme on the publicly available Intel Berkeley Lab
Wireless Sensor Network dataset shows that it is indeed possible to achieve energy efficiency
with an adaptive routing protocol, even under missing data or insufficient data scenarios.
Here, the adaptive routing scheme is based on selecting the most significant sensors, using the
Akaike criterion, for learning the physical environment from sensor measurements. The
experimental validation of this scheme was done with a publicly available WSN dataset acquired
from a real indoor physical environment, the Intel Berkeley Lab [38].
5.2 Intel Berkeley Lab WSN dataset
The publicly available data set used for experimental validation was collected by Mica2Dot
sensors with weather boards, which recorded time-stamped topology information along with
humidity, temperature, light and voltage values once every 31 seconds. Data was collected using
the TinyDB in-network query processing system, built on the TinyOS platform [38]. The sensors
were arranged as shown in Figure 29. The x and y coordinates of the sensors (in meters relative
to the upper right corner of the lab) are given in a separate file, whose three columns
correspond to mote id, x location and y location.
The csv file extracted from the downloaded dataset includes a log of about 2.3 million readings
collected from these sensors. The file is 34MB gzipped, 150MB uncompressed. The schema is as
follows:
Table 19 Intel lab data set file schema
date: yyyy-mm-dd | time: hh:mm:ss.xxx | epoch: int | moteid: int | temperature: real | humidity: real | light: real | voltage: real
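A record following the schema in Table 19 could be parsed as sketched below. The field delimiter is an assumption: the sketch splits on whitespace, which would need to be changed to a comma for a comma-separated export. The `Reading` class, the `parse_reading` function and the sample lines are made up for illustration. Incomplete lines are rejected rather than repaired, since readings are known to be missing from this data set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    date: str
    time: str
    epoch: int
    moteid: int
    temperature: float
    humidity: float
    light: float
    voltage: float


def parse_reading(line: str) -> Optional[Reading]:
    """Parse one record; return None for incomplete or malformed lines."""
    fields = line.split()  # assumed whitespace-delimited
    if len(fields) != 8:
        return None
    try:
        return Reading(fields[0], fields[1], int(fields[2]), int(fields[3]),
                       *(float(f) for f in fields[4:]))
    except ValueError:
        return None


# Made-up records, for illustration only
print(parse_reading("2004-03-01 00:00:30.123 5 12 19.5 38.2 45.1 2.69"))
print(parse_reading("2004-03-01 00:01:01.456 6 12"))
```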
To examine how WSN performance depends on the quantity of data available for learning the
relationships between different variables, and hence on prediction capability, we used three
different sample sizes (35, 2700 and 5400 samples) of temperature and humidity sensor
measurements, which come from a deployment of 54 sensors in the Intel research laboratory at
Berkeley [38]. A picture of the deployment is provided in Figure 29, which shows the location of
the 54 sensors in an area of 1200 m², with the sensor nodes identified by numbers ranging from 1
to 54.
Figure 29 Intel Berkeley Wireless sensor network Data set: location of 54 sensors in an area of 1200 m²
Many sensor readings from WSN test bed were missing, due to this being a simple prototype
testbed. This gives us a challenging opportunity and test whether the proposed integrated
machine learning framework can cope with missing and insufficient information. We selected
from this data set few subsets of measurements. The readings were originally sampled every
thirty-one seconds. A pre-processing stage where data was partitioned and normalised was
Chapter 5
93
applied to the data set. In this WSN test bed, every sensor can act as either a source node or
a sink node, and its role can be configured for each test session. Nodes in most distributed
WSNs are set up in this way: each node is configured as a source or sink depending on whether the
topology is centralised or decentralised, and on which nodes are assigned as cluster heads,
control nodes, sensing nodes and so on. This arrangement also allows different types of routing
protocols to be tested under different operational or functional challenges. Nodes that actively
participate in sensing the environment, whether as source or sink, transmit data and consume
power; nodes that do not participate are assumed to consume no power. This allows an energy
efficient WSN design in which an optimum number of sensors participates in the sensing and
transmission task, while the non-participating sensors remain in sleep mode (no energy
consumption). However, sensing accuracy can suffer if the number of sensors participating in the
routing scheme is not chosen properly. To achieve a trade-off between accuracy and energy
efficiency, a dynamic or adaptive routing scheme is essential, in which the machine learning/data
mining technique uses larger training data from previous (historical) data sets to decide which
sensors participate in the routing scheme, so that the performance targets for energy efficiency,
prediction accuracy and other QoS metrics are met. The block schematic of this joint sensor
selection and adaptive routing model is shown in Figure 30 below.
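The active/sleep split described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function name `schedule_nodes` and the rule that the sink node always stays awake are assumptions made for the example.

```python
def schedule_nodes(all_nodes, selected, sink):
    """Split the nodes into an active set and a sleeping set for one routing round."""
    all_nodes = set(all_nodes)
    # Assumption: the sink must stay awake to receive the readings.
    active = set(selected) | {sink}
    unknown = active - all_nodes
    if unknown:
        raise ValueError(f"unknown node ids: {sorted(unknown)}")
    sleeping = all_nodes - active
    return sorted(active), sorted(sleeping)


# Example: 54 nodes numbered 1..54, two of them selected, node 50 as sink.
active, sleeping = schedule_nodes(range(1, 55), [17, 50], sink=50)
print(active, len(sleeping))
```

Only the nodes in `active` would sense and transmit in that round; the remaining nodes are treated as consuming no energy, which is the simplification stated in the text.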
Figure 30 Joint Sensor Selection – Adaptive Routing Model
5.3 Intel Lab data file versus Intel Lab data file restructured for
experiments
The files used for the experiments contain approximately 65000 samples for each mote ID.
The following diagram shows the processing applied to the main file to support the sensor selection
and routing approach.
Figure 31 Intel lab main source file structure
The original data set contains a huge number of readings, about 65000 samples for all 54
sensors. In this research, samples of 35, 2700 and 5400 readings were extracted into three separate
files for temperature and three for humidity, giving six files in total for all experiments.
The same set of experiments was run on the temperature and the humidity samples, which were taken
from the main source file and arranged as per the following structure.
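One way to picture this restructuring is as a pivot from one reading per row to one column per sensor. The sketch below is an assumption about the general shape of that step, not the procedure actually used: the record layout `(epoch, mote_id, value)`, the function name `to_wide`, and the choice to skip epochs with a missing sensor are all hypothetical.

```python
from collections import defaultdict


def to_wide(records, mote_ids, n_samples):
    """Pivot (epoch, mote_id, value) records into rows with one column per mote."""
    by_epoch = defaultdict(dict)
    for epoch, mote, value in records:
        by_epoch[epoch][mote] = value

    rows = []
    for epoch in sorted(by_epoch):
        if len(rows) >= n_samples:
            break
        readings = by_epoch[epoch]
        # Assumption: an epoch is kept only if every requested mote reported.
        if all(m in readings for m in mote_ids):
            rows.append([readings[m] for m in mote_ids])
    return rows


records = [(1, 1, 20.0), (1, 2, 21.0), (2, 1, 20.5), (3, 1, 20.7), (3, 2, 21.2)]
print(to_wide(records, mote_ids=[1, 2], n_samples=2))
```

Calling the function with `n_samples` set to 35, 2700 or 5400 would give sample files of the three sizes mentioned above.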
Figure 32 Sample files temperature readings 35, 2700 and 5400 samples
Figure 33 Sample files temperature readings 35, 2700 and 5400 samples
5.4 Sensor Selection and Adaptive Routing Model
The proposed sensor selection and routing approach is based on a feature selection technique
called the Akaike criterion [40], [41]. It selects the attributes (sensors) by evaluating the worth of
a subset of attributes, considering the individual predictive ability of each feature/sensor along
with the degree of redundancy between them. Subsets of features that are highly correlated with
the class while having low inter-correlation are preferred [39, 40].
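The preference for subsets that correlate strongly with the class but weakly with each other can be illustrated with a correlation-based subset merit in the style of [39]. The sketch below is an assumption about how such a score could be computed, not the exact procedure used here: it uses absolute Pearson correlation, a plain greedy forward search, and the hypothetical names `cfs_merit` and `greedy_select`. The additional "locally predictive" step described in the text is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, target, subset):
    """Merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff) for a subset of k sensors."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean sensor-to-target correlation.
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    # Mean sensor-to-sensor correlation (redundancy).
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_select(columns, target):
    """Add the sensor that most improves the merit until no sensor improves it."""
    selected, best = [], 0.0
    remaining = list(columns)
    while remaining:
        score, cand = max((cfs_merit(columns, target, selected + [c]), c) for c in remaining)
        if score <= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best


# Toy readings: sensor 17 tracks the sink closely, sensor 3 does not.
columns = {17: [2, 4, 6, 8, 10], 3: [5, 1, 4, 2, 3]}
sink = [1, 2, 3, 4, 5]
print(greedy_select(columns, sink))
```

In this toy case, adding the weakly correlated sensor lowers the merit, so the search stops with a single sensor.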
Further, this feature/sensor selection algorithm identifies locally predictive attributes: it
iteratively adds the attribute with the highest correlation with the class, as long as the subset
does not already contain an attribute that has a higher correlation with the attribute in question. Once
the appropriate group of sensors is selected, the sensor output at the sink node or base
station is predicted by a linear regression algorithm using the Akaike criterion [40], [41]. This
involves stepping through the attributes and removing the one with the smallest standardized coefficient
until no improvement is observed in the error estimate given by the Akaike information metric.
Figure 34 shows how the sensor selection evolves as the training data (historic data) used for
predicting the sink sensor output is increased, while the prediction error is
kept at a particular threshold value. Here, prediction error (RMSE) was used as the metric,
in contrast to the detection/prediction accuracy used in earlier chapters.
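The backward elimination step described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used for the experiments: the AIC is taken in its usual least-squares form n ln(RSS/n) + 2k, the helper names (`solve`, `fit`, `aic`, `backward_eliminate`) are hypothetical, and the code assumes the sensor columns are not collinear.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, y, features):
    """Least squares with intercept via normal equations; returns (beta, RSS)."""
    design = [[1.0] + [row[j] for j in features] for row in rows]
    p = len(features) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y))
    return beta, rss


def aic(rss, n, n_params):
    """AIC for a Gaussian linear model; RSS is floored to avoid log(0)."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * n_params


def std(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def backward_eliminate(rows, y, features):
    """Drop the weakest sensor while doing so lowers the AIC."""
    features = list(features)
    n = len(y)
    beta, rss = fit(rows, y, features)
    best = aic(rss, n, len(features) + 1)
    while len(features) > 1:
        # beta[0] is the intercept, so beta[i + 1] belongs to features[i].
        # Scaling by std(x) ranks coefficients as standardized coefficients would.
        strength = [abs(beta[i + 1]) * std([row[j] for row in rows])
                    for i, j in enumerate(features)]
        weakest = features[strength.index(min(strength))]
        trial = [j for j in features if j != weakest]
        trial_beta, trial_rss = fit(rows, y, trial)
        trial_aic = aic(trial_rss, n, len(trial) + 1)
        if trial_aic >= best:
            break
        features, beta, best = trial, trial_beta, trial_aic
    return features, best
```

Here `rows` holds one list of sensor readings per sample, `y` holds the sink readings, and `features` lists the column indices of the candidate sensors.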
Figure 34 Temperature Sensor selection map for 3 experiment scenarios- 1, 2 and 3
Figure 35 Humidity sensor selection map for 3 experiment scenario 1,2 and 3
5.5 Experimental Results and Discussion
Different sets of experiments were performed to examine the relative performance of the sensor
selection and adaptive routing model proposed here. The k-fold stratified cross validation technique
was used, with k = 2, 5 and 10 depending on the training data
available (more folds for larger training data). Further, to estimate the relative energy
efficiency achieved, we performed experiments with all sensors (without the feature/sensor
selection algorithm) and with only the sensors chosen by the feature selection algorithm. As mentioned
before, the feature selection algorithm selects an optimal number of features or sensor
nodes needed to characterize or classify the environment, which in turn leads to an energy
efficient scheme. The time taken to build the model is also an important parameter,
particularly for an adaptive sensor routing scheme to be used for real time environment monitoring.
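The evaluation protocol can be sketched as a fold generator plus the RMSE metric. This is a simplified illustration, assuming plain contiguous folds without stratification; the names `kfold_indices` and `rmse` are hypothetical and do not refer to any tool used in this work.

```python
import math


def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Example: the 35-sample scenario evaluated with k = 2.
for train, test in kfold_indices(35, 2):
    print(len(train), len(test))
```

A model would be fitted on each `train` split and scored with `rmse` on the matching `test` split, and the fold scores would then be averaged.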
Table 20 Temperature results from three experiment scenarios

| Experiment # (Temperature) | Number of Sensors | Number of Samples | Features Selected | Time, no F selection (sec) | Time, F selection (sec) | RMSE, no F selection | RMSE, with F selection |
|---|---|---|---|---|---|---|---|
| Experiment 1 | 54 | 35 | 17, 50 | 0.02 | 0.01 | 20.26 | 0.04 |
| Experiment 2 | 53 | 2700 | 3, 14, 16, 19, 39 | 0.43 | 0.02 | 5.02 | 2.23 |
| Experiment 3 | 53 | 5400 | 3, 13, 14, 16, 19, 24, 53 | 0.57 | 0.03 | 3.93 | 2.93 |
Table 21 Humidity results from three experiment scenarios

| Experiment # (Humidity) | Number of Sensors | Number of Samples | Features Selected | Time, no F selection (sec) | Time, F selection (sec) | RMSE, no F selection | RMSE, with F selection |
|---|---|---|---|---|---|---|---|
| Experiment 1 | 52 | 35 | 7, 24, 41, 44, 50 | 0.01 | 0.01 | 3.82 | 0.04 |
| Experiment 2 | 52 | 2700 | 3, 7, 11, 14, 16, 22, 28, 29, 34, 41 | 0.14 | 0.01 | 0.96 | 2.11 |
| Experiment 3 | 52 | 5400 | 14, 19, 24, 25, 36 | 0.17 | 0.03 | 1.91 | 4.56 |
The temperature and humidity experiments used 54, 53 or 52 sensors, depending on the scenario. The first
set of experiments used a small set of training samples (35 humidity measurements). As can be seen from the sensor
locations shown in the humidity sensor selection map for experiment scenarios 1, 2 and 3 (Figure 35), sensor
50 is the sink node (emulating the base station node), and sensors 1 to 49 measure
the environment around them and transmit the measurements to the sink node. The machine learning
prediction task is to estimate the measurement at the sink node (sensor 50). The RMS error (root
mean squared error) at the sink node (node 50) provides a measure of prediction quality. With all source
sensor nodes (1-49) in the WSN measuring the humidity in the environment and
sending it to the sink node, the RMS error is 3.82%; with the sensor selection scheme, in which only
5 sensors participate in the routing scheme, the RMS error is 0.04%. As can be seen in Table 21,
the RMS error falls from 3.82% to 0.04%, while the energy efficiency achieved is of the
order of 10.4 (52/5). The measure for energy efficiency is the life time extension factor (LTEF)
metric, which can be defined as:
$$\text{Life time extension factor (LTEF)} = \frac{\text{Total number of features}}{\text{Number of features used}}$$
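As a worked check of this ratio, the short sketch below evaluates it for sensor counts taken from Tables 20 and 21. The function name `ltef` is a hypothetical helper introduced only for this example.

```python
def ltef(total_features, features_used):
    """Life time extension factor: total sensors divided by active sensors."""
    if not 0 < features_used <= total_features:
        raise ValueError("features_used must be between 1 and total_features")
    return total_features / features_used


# Sensor counts taken from Tables 20 and 21.
print(ltef(54, 2))  # temperature, experiment 1 -> 27.0
print(ltef(53, 5))  # temperature, experiment 2 -> 10.6
print(ltef(52, 5))  # humidity, experiment 1 -> 10.4
```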
With 2 sensors out of 54 sensor nodes in active mode, the LTEF achieved is around 27 times,
and 52 sensor nodes are in sleep mode. Any loss of accuracy in this scenario could
be due to the small amount of training data: only 35 temperature samples were used for the prediction scheme.
With more data samples used in the prediction scheme, performance could be better. To test this
hypothesis, we performed the next set of experiments.
Figure 36 Temperature experiment 1, 2 and 3 results
Figure 37 Humidity experiment 1,2 and 3 results
For the second set of experiments, we used 2700 training samples collected on different days. As can
be seen in Table 20, with the larger training data size, the sensors participating in the
routing scheme are different, as the proposed feature selection algorithm chooses a different set of
sensors (3, 14, 16, 19, 39). We used 53 sensors for this set of experiments, as one of the sensors
(sensor 5) did not have more than 35 measurements. With all 53 sensors in the routing scheme,
the RMS error is 5.02%, and with 5 sensor nodes (3, 14, 16, 19, 39), the error is 2.23%. This is
a significant improvement in prediction accuracy (from 5.02% to 2.23%), with a life time extension
of 10.6 (53/5). As is evident here, by using larger training data (2700 temperature measurements),
it was possible to achieve an improvement in both prediction accuracy and energy efficiency.
To examine the influence of increasing training data size, we performed a third set of experiments
with 5400 samples. The performance achieved for this set of experiments is shown in Table 20.
Here the adaptive routing scheme based on the proposed feature selection technique selects 8 sensors
(3, 13, 14, 16, 19, 24, 34, 53). For this set of experiments, the RMS error falls from 3.93% with
all sensors participating in the scheme to 2.93%, with an LTEF of 6.6 (53/8). Though there is no
degradation in prediction accuracy, doubling the training data size used for building the model brings
little improvement in energy efficiency. This could be due to overtraining,
with the network losing its generalisation ability. So increasing the training data size alone
may not improve both prediction accuracy (RMS
error) and energy efficiency (LTEF), and a trade-off may be needed. An optimal combination of
training data size and number of sensors actively participating in the routing scheme can result in
an energy efficient WSN without compromising the prediction accuracy.
Figure 38 Temperature, root mean square error
Figure 39 Humidity, Root mean square error
Another important parameter is the model building time, which represents the time needed to
learn a new route. For an adaptive sensor routing scheme to be implemented in a real time WSN
environment, the routing scheme has to compute dynamically which sensors are in active mode and which are
in sleep mode. For the 3 experimental scenarios considered here, as can be seen from Table 20, the
model building time improves from 0.02 seconds to 0.01 seconds for experiment 1, from 0.43
seconds to 0.02 seconds for experiment 2, and from 0.57 seconds to 0.03 seconds for experiment
3. So, the proposed adaptive routing scheme for sensor selection provides the added benefit of
reduced model building times, suitable for real time deployment. Figures 40 and 41 show the time
taken to build the model.
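Model building time of this kind can be measured with a wall-clock timer around the training call. The sketch below is only an illustration of that measurement: `timed` is a hypothetical helper, and `build` stands for whatever model-fitting function is being compared with and without sensor selection.

```python
import time


def timed(build, *args, **kwargs):
    """Run a model-building function and return (model, elapsed seconds)."""
    start = time.perf_counter()
    model = build(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return model, elapsed


# Stand-in for a real training routine: summing a list of readings.
model, seconds = timed(sum, [20.1, 20.4, 20.9])
print(f"built in {seconds:.4f} sec")
```

Timings as short as those reported here are sensitive to system load, so repeating the measurement and averaging would give a steadier estimate.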
Figure 40 Time taken to build the model, Temperature
Figure 41 Time taken to build the model, Humidity
5.6 Chapter Summary
In this Chapter, we proposed a joint sensor selection - adaptive routing model for sensor nodes
in WSN, based on machine learning with a feature selection algorithm based on the Akaike criterion,
which can adapt the selected sensors continuously as time evolves (as more data arrives). The experimental
evaluation on a real world, publicly available WSN dataset, the Intel Berkeley Lab WSN test bed,
validated our hypothesis, and allowed WSN operational and non-operational challenges to be
addressed, including energy efficiency, prediction error, and QoS enhancements such as
robustness to sensor failures and quick MAC layer adaptation (with fast learning times). The next
Chapter concludes this work with the three major contributions of the proposed integrated
framework and the future directions of this research.
Chapter 6 Conclusions and Future Directions
In this thesis, a novel integrated framework for energy efficiency based on machine learning and data
mining techniques is proposed. The three stages of this framework, namely the joint sensor selection – event
detection model, the adaptive models for energy efficiency, and the joint sensor selection and adaptive routing
model, allow various functional and non-functional challenges in WSN to be addressed, including energy
efficiency, event detection accuracy, MAC layer routing adaptation, QoS enhancements, sensor failures
and model building or learning time.
In Chapter 3, a joint sensor selection and event detection model was proposed for WSN, based on a machine
learning approach. As the method is data driven and tries to learn the relationship between sensor data and
event detection capability, different types of publicly available datasets from the UCI Machine Learning
repository were used to test the proposed joint model/scheme. Here, the sensors in a WSN are modelled
by the features/attributes of a dataset, the application events by the output classes or variables, and the sensor
measurements by the instances of the dataset. A feature ranking algorithm was developed which ranks
the features or sensors in order of their significance in predicting the output or WSN state. Also,
an objective measure of the energy efficiency achieved was devised, a metric called the life
time extension factor (LTEF). To validate the hypothesis proposed in this work, the LTEF needs to be improved
while the WSN learns from the features and still predicts the output class/variable accurately. An extensive
experimental evaluation with several publicly available datasets shows that the proposed joint sensor selection
– event detection model allows learning from historical data and meets the operational/functional WSN
challenges such as energy efficiency (LTEF), event detection (prediction accuracy) and QoS guarantees
(sensor failures).
As reiterated before, the resource constraints on a WSN and its sensor nodes call for
lightweight machine learning approaches. The joint sensor selection – event
detection model proposed in Chapter 3 is one such simple and effective scheme that can be implemented
in WSN nodes and is amenable to both decentralised and centralised topologies.
In Chapter 4, an adaptive learning model was proposed to address the WSN challenges. This adaptive
classifier model performed extremely well and, together with the adaptive feature selection scheme, it could
achieve the energy efficiency and prediction accuracy targets as well as address the resource management
issues. For experimental validation, two different types of large datasets, the Forest cover type dataset
and the Gas drift dataset, were used to emulate a large physical environment instrumented with a WSN, with
each attribute/feature of the data set modelling a node/sensor of a WSN set up to monitor or
characterize a complex and large physical environment. The results for the Gas sensor data set were
consistent with those for the Forest cover type data set. There was a significant performance improvement
with the combined adaptive classifier - feature selection model with random forests, random trees and random
committees, and with best-first and greedy search feature selection techniques. Further, using a
different learning scheme within the adaptive classifier model, the ensemble learning scheme, it was
possible to pull up the combined performance of weak classifiers such as Naïve Bayes and J48 decision trees.
This validates the hypothesis that the power of machine learning/data mining approaches can indeed address
different WSN challenges, including operational/functional challenges such as energy efficiency and
event detection accuracy, and non-operational/non-functional challenges such as failure recovery and
resource management.
In Chapter 5, a joint sensor selection - adaptive routing model was proposed for WSN, based on machine
learning with a feature selection algorithm based on the Akaike criterion, which selects the few most significant
sensors to be active at a time and adapts them continuously as time evolves. The experimental evaluation
on a real world, publicly available WSN dataset, the Intel Berkeley Lab WSN test bed, validates our
hypothesis and allows WSN operational and non-operational challenges to be addressed, including energy
efficiency, prediction accuracy, and QoS enhancements such as robustness to sensor failures and quick
MAC layer adaptation (with fast learning times).
The proposed integrated framework addressed some of the key WSN challenges
well. However, there are a myriad of other challenges, operational, non-operational and application specific,
which could also be addressed by this framework if it is extended with advanced machine learning
algorithms. Hence, there is still a need for future research in this interdisciplinary area. Some future
directions of this work include the use of emerging techniques that extract spatio-temporal correlations better
than the techniques examined in this work, such as canonical correlation analysis, independent
component analysis, dictionary learning, and non-negative matrix factorization for sensor selection, as they
have proved to be highly efficient in other machine learning application contexts.
Another promising direction for further extending the proposed integrated framework is to investigate
unsupervised, self-learning and online approaches for addressing WSN challenges, and to consider
alternative WSN topologies (instead of just centralised or decentralised WSN topologies). Hierarchical
clustering is one such candidate, which uses unsupervised learning and can be deployed in hybrid WSN
topologies (a combination of centralised and decentralised topologies). Another potential extension of the
proposed framework is that, instead of reducing the amount of data in the network, it could be possible to gain
more insight into WSN behaviour from the abundant data available, using recent big data analytics
and scalable machine learning approaches, and to devise solutions that achieve energy efficiency, event
detection accuracy and QoS targets. Some of these aspects can be investigated in future.
Bibliography
1. Dargie, W. and C. Poellabauer, Fundamentals of wireless sensor networks: theory and
practice. 2010: John Wiley & Sons Inc.
2. Sohraby, K., D. Minoli, and T.F. Znati, Wireless sensor networks: technology, protocols, and
applications. 2007: Wiley-Blackwell.
3. David J. Stein, E., Wireless Sensor Network Simulator. 2006(1.1).
5. Chong, C.-Y. and S.P. Kumar, Sensor networks: evolution, opportunities, and challenges.
Proceedings of the IEEE, 2003. 91(8): p. 1247-1256.
6. Guo, B., D. Zhang, and M. Imai, Toward a cooperative programming framework for context-
aware applications. Personal and Ubiquitous Computing, 2011. 15(3): p. 221-233.
7. Baqer, M. and A. Khan. Energy-efficient pattern recognition approach for wireless sensor
networks. 2007. IEEE.
8. Nakamura, E.F. and A.A.F. Loureiro, Information fusion in wireless sensor networks, in
Proceedings of the 2008 ACM SIGMOD international conference on Management of data. 2008,
ACM: Vancouver, Canada. p. 1365-1372.
9. Bashyal, S. and G.K. Venayagamoorthy. Collaborative routing algorithm for wireless sensor
network longevity. 2007. IEEE.
10. Narasimhan, R. and D.C. Cox. A handoff algorithm for wireless systems using pattern
recognition. 1998. IEEE.
11. Song, M. and T. Allison, Frequency Hopping Pattern Recognition Algorithms for Wireless
Sensor Networks.
12. Wälchli, M. and T. Braun, Efficient signal processing and anomaly detection in wireless sensor
networks. Applications of Evolutionary Computing, 2009: p. 81-86.
13. Yu, W. and C. He. Resource reservation in wireless networks based on pattern recognition. 2001.
IEEE.
14. Dziengel, N., G. Wittenburg, and J. Schiller. Towards distributed event detection in wireless
sensor networks. 2008.
15. Schurgers, C. and M.B. Srivastava. Energy efficient routing in wireless sensor networks. 2001.
IEEE.
16. Ma, J., et al. Energy-efficient opportunistic topology control in wireless sensor networks. 2007.
ACM.
17. Ali, M. and Z.A. Uzmi. An energy-efficient node address naming scheme for wireless sensor
networks. 2004. IEEE.
18. Viera, M., et al. Scheduling nodes in wireless sensor networks: A Voronoi approach. 2003. IEEE.
19. Salhieh, A., et al. Power efficient topologies for wireless sensor networks. 2001. IEEE.
20. Ganesan, D., et al., Highly-resilient, energy-efficient multipath routing in wireless sensor
networks. ACM SIGMOBILE Mobile Computing and Communications Review, 2001. 5(4): p.
11-25.
21. Zytoune, Q., Y. Fakhri, and D. Aboutajdine, A balanced cost cluster-heads selection algorithm
for wireless sensor networks. International Journal of Computer Science, 2009. 4(1): p. 21-24.
22. Raicu, L., et al. e3D: an energy-efficient routing algorithm for wireless sensor networks. 2004.
IEEE.
23. Abdullah, N.Z.a.A.B., Different Techniques Towards Enhancing Wireless Sensor Network
(WSN) Routing Energy Efficiency and Quality of Service (QoS). World Applied Sciences
Journal 2011. 4.
24. Ping, S., Delay measurement time synchronization for wireless sensor networks. Intel Research
Berkeley Lab, 2003.
25. Asuncion, A. and D. Newman, UCI machine learning repository. University of California, Irvine,
School of Information and Computer Sciences, 2007. URL: http://www.ics.uci.edu/mlearn/MLRepository.html, 2010.
26. Alwadi, M.d. and G. Chetty, A novel feature selection scheme for energy efficient wireless
sensor networks, in Algorithms and Architectures for Parallel Processing. 2012, Springer. p. 264-273.
27. MATLAB. 20/01/2012]; Available from: http://www.mathworks.com.au/.
28. Hall, M., et al., The WEKA data mining software: an update. ACM SIGKDD Explorations
Newsletter, 2009. 11(1): p. 10-18.
29. ALWADI, M. and G. CHETTY, Feature Selection and Energy Management for Wireless Sensor
Networks. IJCSNS, 2012. 12(6): p. 46.
30. Cortez, P. and A.J.R. Morais, A data mining approach to predict forest fires using meteorological
data. 2007.
31. Csirik, J., P. Bertholet, and H. Bunke. Pattern recognition in wireless sensor networks in presence
of sensor failures. 2011.
32. ALWADI, M. and G. CHETTY, Energy Efficiency Data Mining for Wireless Sensor Networks
Based on Random Forests.
33. Alwadi, M. and G. Chetty, Energy Efficient Data Mining Scheme for Big Data Biodiversity
Environment. 2014.
34. Karimi, K. and H.J. Hamilton, Logical Decision Rules: Teaching C4.5 to Speak Prolog, in
Intelligent Data Engineering and Automated Learning—IDEAL 2000. Data Mining, Financial
Engineering, and Intelligent Agents. 2000, Springer. p. 85-90.
35. Hastie, T., et al., The elements of statistical learning: data mining, inference and prediction. The
Mathematical Intelligencer, 2005. 27(2): p. 83-85.
36. Alwadi, M. and G. Chetty, Energy Efficient Data Mining Scheme for High Dimensional Data.
Procedia Computer Science, 2015. 46: p. 483-490.
37. Richter, R., Distributed Pattern Recognition in Wireless Sensor Networks.
38. Bodik, P., et al., Intel lab data. Online dataset, 2004.
39. Hall, M.A., Correlation-based feature selection for machine learning, 1999, The University of
Waikato.
40. Akaike, H., Information theory and an extension of the maximum likelihood principle, in
Selected Papers of Hirotugu Akaike. 1998, Springer. p. 199-213.
41. Ashraf, M., et al., A New Approach for Constructing Missing Features Values. International
Journal of Intelligent Information Processing, 2012. 3(1).
42. T. O. Ayodele, “Introduction to machine learning,” in New Advances in Machine Learning.
InTech, 2010.
43. A. H. Duffy, “The “what” and “how” of learning in design,” IEEE Expert, vol. 12, no. 3, pp. 71–
76, 1997.
44. P. Langley and H. A. Simon, “Applications of machine learning and rule induction,”
Communications of the ACM, vol. 38, no. 11, pp. 54–64, 1995.
45. L. Paradis and Q. Han, “A survey of fault management in wireless sensor networks,” Journal of
Network and Systems Management, vol. 15, no. 2, pp.171–190, 2007.
46. B. Krishnamachari, D. Estrin, and S. Wicker, “The impact of data aggregation in wireless sensor
networks,” in 22nd International Conference on Distributed Computing Systems Workshops,
2002, pp. 575–578.
47. J. Al-Karaki and A. Kamal, “Routing techniques in wireless sensor networks: A survey,” IEEE
Wireless Communications, vol. 11, no. 6, pp. 6–28, 2004.
48. K. Romer and F. Mattern, “The design space of wireless sensor networks,” IEEE Wireless
Communications, vol. 11, no. 6, pp. 54–61, 2004.
49. J. Wan, M. Chen, F. Xia, L. Di, and K. Zhou, “From machine-to-machine communications
towards cyber-physical systems,” Computer Science and Information Systems, vol. 10, pp.
1105–1128, 2013.
50. Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends in Machine Learning,
vol. 2, no. 1, pp. 1–127, 2009.
51. A. G. Hoffmann, “General limitations on machine learning,” pp. 345–347, 1990
52. M. Di and E. M. Joo, “A survey of machine learning in wireless sensor networks from
networking and application perspectives,” in 6th International Conference on Information,
Communications Signal Processing, 2007, pp. 1–5.
53. A. Forster, “Machine learning techniques applied to wireless ad-hoc networks: Guide and
survey,” in 3rd International Conference on Intelligent Sensors, Sensor Networks and
Information. IEEE, 2007, pp. 365–370.
54. A. Förster and M. Amy L, Machine learning across the WSN layers. InTech, 2011.
55. Y. Zhang, N. Meratnia, and P. Havinga, “Outlier detection techniques for wireless sensor
networks: A survey,” IEEE Communications Surveys & Tutorials, vol. 12, no. 2, pp. 159–170,
2010.
56. V. J. Hodge and J. Austin, “A survey of outlier detection methodologies,” Artificial Intelligence
Review, vol. 22, no. 2, pp. 85–126, 2004.
57. R. Kulkarni, A. Förster, and G. Venayagamoorthy, “Computational intelligence in wireless
sensor networks: A survey,” IEEE Communications Surveys & Tutorials, vol. 13, no. 1, pp. 68–
96, 2011.
58. S. Das, A. Abraham, and B. K. Panigrahi, Computational intelligence: Foundations, perspectives,
and recent trends. John Wiley & Sons, Inc., 2010, pp. 1–37.
59. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin, Learning from data. AMLBook, 2012.
60. O. Chapelle, B. Schölkopf, and A. Zien, Semi-supervised learning. MIT press Cambridge, 2006,
vol. 2.
61. S. Kulkarni, G. Lugosi, and S. Venkatesh, “Learning pattern classification-a survey,” IEEE
Transactions on Information Theory, vol. 44, no. 6, pp.2178–2206, 1998.
62. M. Morelande, B. Moran, and M. Brazil, “Bayesian node localisation in wireless sensor
networks,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 2008,
pp. 2545–2548.
63. C.-H. Lu and L.-C. Fu, “Robust location-aware activity recognition using wireless sensor
network in an attentive home,” IEEE Transactions on Automation Science and Engineering, vol.
6, no. 4, pp. 598–609, 2009.
64. A. Shareef, Y. Zhu, and M. Musavi, “Localization using neural networks in wireless sensor
networks,” in Proceedings of the 1st International Conference on Mobile Wireless Middleware,
Operating Systems, and Applications, 2008, pp. 1–7.
65. J. Winter, Y. Xu, and W.-C. Lee, “Energy efficient processing of k nearest neighbor queries in
location-aware sensor networks,” in 2nd International Conference on Mobile and Ubiquitous
Systems: Networking and Services. IEEE, 2005, pp. 281–292.
66. P. P. Jayaraman, A. Zaslavsky, and J. Delsing, “Intelligent processing of k-nearest neighbors
queries using mobile data collectors in a location aware 3D wireless sensor network,” in Trends
in Applied Intelligent Systems. Springer, 2010, pp. 260–270.
67. L. Yu, N. Wang, and X. Meng, “Real-time forest fire detection with wireless sensor networks,”
in International Conference on Wireless Communications, Networking and Mobile Computing,
vol. 2, 2005, pp. 1214–1217.
68. M. Bahrepour, N. Meratnia, M. Poel, Z. Taghikhaki, and P. J. Havinga, “Distributed event
detection in wireless sensor networks for disaster management,” 2nd International Conference
on Intelligent Networking and Collaborative Systems. IEEE, 2010, pp. 507–512.
69. M. Kim and M.-G. Park, “Bayesian statistical modeling of system energy saving effectiveness
for MAC protocols of wireless sensor networks,” in Software Engineering, Artificial
Intelligence, Networking and Parallel/Distributed Computing, ser. Studies in Computational
Intelligence. Springer Berlin Heidelberg, 2009, vol. 209, pp. 233–245.
70. Y.-J. Shen and M.-S. Wang, “Broadcast scheduling in wireless sensor networks using fuzzy
Hopfield neural network,” Expert Systems with Applications, vol. 34, no. 2, pp. 900–907, 2008.
71. R. V. Kulkarni and G. K. Venayagamoorthy, “Neural network based secure media access control
protocol for wireless sensor networks,” in Proceedings of the 2009 International Joint Conference
on Neural Networks, ser. IJCNN’09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 3437–3444.
72. D. Janakiram, V. Adi Mallikarjuna Reddy, and A. Phani Kumar, “Outlier detection in wireless
sensor networks using Bayesian belief networks,” in 1st International Conference on
Communication System Software and Middleware. IEEE, 2006, pp. 1–6.
73. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, “In-network outlier
detection in wireless sensor networks,” Knowledge and information systems, vol. 34, no. 1, pp.
23–54, 2013.
74. S. Kaplantzis, A. Shilton, N. Mani, and Y. Sekercioglu, “Detecting selective forwarding attacks
in wireless sensor networks using support vector machines,” in 3rd International Conference on
Intelligent Sensors, Sensor Networks and Information. IEEE, 2007, pp. 335–340.
75. S. Rajasegarar, C. Leckie, M. Palaniswami, and J. Bezdek, “Quarter sphere based distributed
anomaly detection in wireless sensor networks,” in International Conference on
Communications, 2007, pp. 3864–3869.
76. A. Snow, P. Rastogi, and G. Weckman, “Assessing dependability of wireless networks using
neural networks,” in Military Communications Conference. IEEE, 2005, pp. 2809–2815 Vol. 5.
77. A. Moustapha and R. Selmic, “Wireless sensor network modeling using modified recurrent
neural networks: Application to fault detection,” IEEE Transactions on Instrumentation and
Measurement, vol. 57, no. 5, pp. 981–988, 2008.
78. Y. Wang, M. Martonosi, and L.-S. Peh, “Predicting link quality using supervised learning in
wireless sensor networks,” ACM SIGMOBILE Mobile Computing and Communications
Review, vol. 11, no. 3, pp. 71–83, 2007.
79. K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is “nearest neighbor”
meaningful?” in Database Theory. Springer, 1999, pp. 217–235.
80. T. O. Ayodele, “Types of machine learning algorithms,” in New Advances in Machine Learning.
InTech, 2010.
81. S. R. Safavian and D. Landgrebe, “A survey of decision tree classifier methodology,” IEEE
Transactions on Systems, Man and Cybernetics, vol. 21, no. 3, pp. 660–674, 1991.
82. R. Lippmann, “An introduction to computing with neural nets,” ASSP Magazine, IEEE, vol. 4,
no. 2, pp. 4–22, 1987.
83. W. Dargie and C. Poellabauer, Localization. John Wiley & Sons, Ltd, 2010, pp. 249–266.
84. T. Kohonen, Self-organizing maps, ser. Springer Series in Information Sciences. Springer Berlin
Heidelberg, 2001, vol. 30.
85. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural
networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
86. I. Steinwart and A. Christmann, Support vector machines. Springer, 2008.
87. Z. Yang, N. Meratnia, and P. Havinga, “An online outlier detection technique for wireless sensor
networks using unsupervised quarter-sphere support vector machine,” in International
Conference on Intelligent Sensors, Sensor Networks and Information Processing. IEEE, 2008,
pp. 151–156.
88. Y. Chen, Y. Qin, Y. Xiang, J. Zhong, and X. Jiao, “Intrusion detection system based on immune
algorithm and support vector machine in wireless sensor network,” in Information and
Automation, ser. Communications in Computer and Information Science. Springer Berlin
Heidelberg, 2011, vol. 86, pp. 372–376.
89. Y. Zhang, N. Meratnia, and P. J. Havinga, “Distributed online outlier detection in wireless sensor
networks using ellipsoidal support vector machine,” Ad Hoc Networks, vol. 11, no. 3, pp. 1062–
1074, 2013.
90. W. Kim, J. Park, and H. Kim, “Target localization using ensemble support vector regression in
wireless sensor networks,” in Wireless Communications and Networking Conference, 2010, pp.
1–5.
91. D. Tran and T. Nguyen, “Localization in wireless sensor networks based on support vector
machines,” IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 981–994,
2008.
92. B. Yang, J. Yang, J. Xu, and D. Yang, “Area localization algorithm for mobile nodes in wireless
sensor networks based on support vector machines,” in Mobile Ad-Hoc and Sensor Networks.
Springer, 2007, pp. 561–571.
93. G. E. Box and G. C. Tiao, Bayesian inference in statistical analysis. John Wiley & Sons, 2011,
vol. 40.
94. C. E. Rasmussen, “Gaussian processes for machine learning,” in Adaptive Computation and
Machine Learning. Citeseer, 2006.
95. S. Lee and T. Chung, “Data aggregation for wireless sensor networks using self-organizing map,”
in Artificial Intelligence and Simulation, ser. Lecture Notes in Computer Science. Springer
Berlin Heidelberg, 2005, vol. 3397, pp. 508–517.
96. R. Masiero, G. Quer, D. Munaretto, M. Rossi, J. Widmer, and M. Zorzi, “Data acquisition
through joint compressive sensing and principal component analysis,” in Global
Telecommunications Conference. IEEE, 2009, pp. 1–6.
97. R. Masiero, G. Quer, M. Rossi, and M. Zorzi, “A Bayesian analysis of compressive sensing data
recovery in wireless sensor networks,” in International Conference on Ultra Modern
Telecommunications Workshops, 2009, pp. 1–6.
98. A. Rooshenas, H. Rabiee, A. Movaghar, and M. Naderi, “Reducing the data transmission in
wireless sensor networks using the principal component analysis,” in 6th International
Conference on Intelligent Sensors, Sensor Networks and Information Processing. IEEE, 2010,
pp. 133–138.
99. S. Macua, P. Belanovic, and S. Zazo, “Consensus-based distributed principal component analysis
in wireless sensor networks,” in 11th International Workshop on Signal Processing Advances in
Wireless Communications, 2010, pp. 1–5.
100. Y.-C. Tseng, Y.-C. Wang, K.-Y. Cheng, and Y.-Y. Hsieh, “iMouse: An integrated mobile
surveillance and wireless sensor system,” Computer, vol. 40, no. 6, pp. 60–66, 2007.
101. D. Li, K. Wong, Y. H. Hu, and A. Sayeed, “Detection, classification, and tracking of targets,”
IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 17–29, 2002.
102. T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “An
efficient k-means clustering algorithm: Analysis and implementation,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
103. I. T. Jolliffe, Principal component analysis. Springer-Verlag, 2002.
104. D. Feldman, M. Schmidt, and C. Sohler, “Turning big data
into tiny data: Constant-size coresets for k-means, PCA and projective clustering,” in SODA,
2013, pp. 1434–1453.
105. C. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3-4, pp. 279–292, 1992.
106. R. Sun, S. Tatsumi, and G. Zhao, “Q-MAP: A novel multicast routing method in wireless ad hoc
networks with multiagent reinforcement learning,” in Region 10 Conference on Computers,
Communications, Control and Power Engineering, vol. 1, 2002, pp. 667–670.
107. S. Dong, P. Agrawal, and K. Sivalingam, “Reinforcement learning based geographic routing
protocol for UWB wireless sensor network,” in Global Telecommunications Conference. IEEE,
2007, pp. 652–656.
108. A. Förster and A. Murphy, “FROMS: Feedback routing for optimizing multiple sinks in WSN
with reinforcement learning,” in 3rd International Conference on Intelligent Sensors, Sensor
Networks and Information. IEEE, 2007, pp. 371–376.
109. R. Arroyo-Valles, R. Alaiz-Rodriguez, A. Guerrero-Curieses, and J. Cid-Sueiro, “Q-probabilistic
routing in wireless sensor networks,” in 3rd International Conference on Intelligent Sensors,
Sensor Networks and Information. IEEE, 2007, pp. 1–6.
110. C. Guestrin, P. Bodik, R. Thibaux, M. Paskin, and S. Madden, “Distributed regression: An
efficient framework for modeling sensor network data,” in 3rd International Symposium on
Information Processing in Sensor Networks, 2004, pp. 1–10.
111. J. Barbancho, C. León, F. Molina, and A. Barbancho, “A new QoS routing algorithm based on
self-organizing maps for wireless sensor networks,” Telecommunication Systems, vol. 36, pp.
73–83, 2007.
112. B. Schölkopf and A. J. Smola, Learning with kernels: Support vector machines, regularization,
optimization, and beyond. Cambridge, MA, USA: MIT Press, 2001.
113. J. Kivinen, A. Smola, and R. Williamson, “Online learning with kernels,” IEEE Transactions on
Signal Processing, vol. 52, no. 8, pp. 2165–2176, 2004.
114. G. Aiello and G. Rogerson, “Ultra-wideband wireless systems,” IEEE Microwave Magazine,
vol. 4, no. 2, pp. 36–47, 2003.
115. R. Rajagopalan and P. Varshney, “Data-aggregation techniques in sensor networks: A survey,”
IEEE Communications Surveys & Tutorials, vol. 8, no. 4, pp. 48–63, 2006.
116. G. Crosby, N. Pissinou, and J. Gadze, “A framework for trust-based cluster head election in
wireless sensor networks,” in 2nd IEEE Workshop on Dependability and Security in Sensor
Networks and Systems, 2006, pp. 10–22.
117. J.-M. Kim, S.-H. Park, Y.-J. Han, and T.-M. Chung, “CHEF: Cluster head election mechanism
using fuzzy logic in wireless sensor networks,” in 10th International Conference on Advanced
Communication Technology, vol. 1. IEEE, 2008, pp. 654–659.
118. S. Soro and W. Heinzelman, “Prolonging the lifetime of wireless sensor networks via unequal
clustering,” in 19th IEEE International Parallel and Distributed Processing Symposium, 2005,
pp. 4–8.
119. A. A. Abbasi and M. Younis, “A survey on clustering algorithms for wireless sensor networks,”
Computer Communications, vol. 30, no. 14, pp. 2826–2841, 2007.
120. H. He, Z. Zhu, and E. Makinen, “A neural network model to minimize the connected dominating
set for self-configuration of wireless sensor networks,” IEEE Transactions on Neural Networks,
vol. 20, no. 6, pp. 973–982, 2009.
121. G. Ahmed, N. M. Khan, Z. Khalid, and R. Ramer, “Cluster head selection using decision trees
for wireless sensor networks,” in International Conference on Intelligent Sensors, Sensor
Networks and Information Processing. IEEE, 2008, pp. 173–178.
122. E. Ertin, “Gaussian process models for censored sensor readings,” in 14th Workshop on
Statistical Signal Processing. IEEE, 2007, pp. 665–669.
123. J. Kho, A. Rogers, and N. R. Jennings, “Decentralized control of adaptive sampling in wireless
sensor networks,” ACM Transactions on Sensor Networks (TOSN), vol. 5, no. 3, pp. 19:1–19:35,
2009.
124. S. Lin, V. Kalogeraki, D. Gunopulos, and S. Lonardi, “Online information compression in sensor
networks,” in IEEE International Conference on Communications, vol. 7. IEEE, 2006, pp. 3371–
3376.
125. C. Fenxiong, L. Mingming, W. Dianhong, and T. Bo, “Data compression through principal
component analysis over wireless sensor networks,” Journal of Computational Information
Systems, vol. 9, no. 5, pp. 1809–1816, 2013.
126. A. Förster and A. Murphy, “CLIQUE: Role-free clustering with Q-learning for wireless sensor
networks,” in 29th IEEE International Conference on Distributed Computing Systems, 2009, pp.
441–449.
127. M. Mihaylov, K. Tuyls, and A. Nowe, “Decentralized learning in wireless sensor networks,” in
Adaptive and Learning Agents, ser. Lecture Notes in Computer Science. Springer Berlin
Heidelberg, 2010, vol. 5924, pp. 60–73.
128. W. B. Heinzelman, “Application-specific protocol architectures for wireless networks,” Ph.D.
dissertation, Massachusetts Institute of Technology, 2000.
129. M. Duarte and Y. Eldar, “Structured compressed sensing: From theory to applications,” IEEE
Transactions on Signal Processing, vol. 59, no. 9, pp. 4053–4085, 2011.
130. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via
the EM algorithm,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 1–
38, 1977.
131. M. H. DeGroot, “Reaching a consensus,” Journal of the American Statistical Association, vol.
69, no. 345, pp. 118–121, 1974.
132. B. Krishnamachari and S. Iyengar, “Distributed Bayesian algorithms for fault-tolerant event
region detection in wireless sensor networks,” IEEE Transactions on Computers, vol. 53, no. 3,
pp. 241–250, 2004.
133. P. Zappi, C. Lombriser, T. Stiefmeier, E. Farella, D. Roggen, L. Benini, and G. Tröster, “Activity
recognition from on-body sensors: Accuracy-power trade-off by dynamic sensor selection,” in
Wireless Sensor Networks. Springer, 2008, pp. 17–33.
134. H. Malik, A. Malik, and C. Roy, “A methodology to optimize query in wireless sensor networks
using historical data,” Journal of Ambient Intelligence and Humanized Computing, vol. 2, pp.
227–238, 2011.
135. Q. Chen, K.-Y. Lam, and P. Fan, “Comments on "Distributed Bayesian algorithms for fault-
tolerant event region detection in wireless sensor networks",” IEEE Transactions on Computers,
vol. 54, no. 9, pp. 1182–1183, 2005.
136. K. Sha, W. Shi, and O. Watkins, “Using wireless sensor networks for fire rescue applications:
Requirements and challenges,” in IEEE International Conference on Electro/information
Technology, 2006, pp. 239–244.
137. H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of wireless indoor positioning techniques and
systems,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and
Reviews, vol. 37, no. 6, pp. 1067–1080, 2007.
138. Wang, R. Ghosh, and S. Das, “A survey on sensor localization,” Journal of Control Theory and
Applications, vol. 8, no. 1, pp. 2–11, 2010.
139. A. Nasipuri and K. Li, “A directionality based location discovery scheme for wireless sensor
networks,” in Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks
and Applications. ACM, 2002, pp. 105–111.
140. S. Yun, J. Lee, W. Chung, E. Kim, and S. Kim, “A soft computing approach to localization in
wireless sensor networks,” Expert Systems with Applications, vol. 36, no. 4, pp. 7552–7561,
2009.
141. S. Chagas, J. Martins, and L. de Oliveira, “An approach to localization scheme of wireless sensor
networks based on artificial neural networks and genetic algorithms,” in 10th International
Conference on New Circuits and Systems. IEEE, 2012, pp. 137–140.
142. Z. Merhi, M. Elgamel, and M. Bayoumi, “A lightweight collaborative fault tolerant target
localization system for wireless sensor networks,” IEEE Transactions on Mobile Computing,
vol. 8, no. 12, pp. 1690–1704, 2009.
143. E. Cayirci, H. Tezcan, Y. Dogan, and V. Coskun, “Wireless sensor networks for underwater
surveillance systems,” Ad Hoc Networks, vol. 4, no. 4, pp. 431–446, 2006.
144. A. Krause, A. Singh, and C. Guestrin, “Near-optimal sensor placements in Gaussian processes:
Theory, efficient algorithms and empirical studies,” The Journal of Machine Learning Research,
vol. 9, pp. 235–284, 2008.
145. D. Gu and H. Hu, “Spatial Gaussian process regression with mobile sensor networks,” IEEE
Transactions on Neural Networks and Learning Systems, vol. 23, no. 8, pp. 1279–1290, 2012.
146. L. Paladina, M. Paone, G. Iellamo, and A. Puliafito, “Self organizing maps for distributed
localization in wireless sensor networks,” in 12th IEEE Symposium on Computers and
Communications, 2007, pp. 1113–1118.
147. G. Giorgetti, S. K. S. Gupta, and G. Manes, “Wireless localization using self-organizing maps,”
in Proceedings of the 6th International Conference on Information Processing in Sensor
Networks, ser. IPSN ’07. New York, NY, USA: ACM, 2007, pp. 293–302.
148. Hu and G. Lee, “Distributed localization of wireless sensor networks using self-organizing
maps,” in IEEE International Conference on Multisensor Fusion and Integration for Intelligent
Systems, 2008, pp. 284–289.
149. S. Li, X. Kong, and D. Lowe, “Dynamic path determination of mobile beacons employing
reinforcement learning for wireless sensor localization,” in 26th International Conference on
Advanced Information Networking and Applications Workshops, 2012, pp. 760–765.
150. C. Musso, N. Oudjane, and F. Le Gland, “Improving regularised particle filters,” in Sequential
Monte Carlo methods in practice. Springer, 2001, pp. 247–271.
151. Y.-X. Wang and Y.-J. Zhang, “Non-negative matrix factorization: A comprehensive review,”
IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 6, pp. 1336–1353, 2013.
152. H.-P. Tan, R. Diamant, W. K. Seah, and M. Waldmeyer, “A survey of techniques and challenges
in underwater localization,” Ocean Engineering, vol. 38, no. 14, pp. 1663–1676, 2011.
153. Y. Chu, P. Mitchell, and D. Grace, “ALOHA and Q-learning based medium access control for
wireless sensor networks,” in International Symposium on Wireless Communication Systems,
2012, pp. 511–515.
154. A. Bachir, M. Dohler, T. Watteyne, and K. K. Leung, “MAC essentials for wireless sensor
networks,” IEEE Communications Surveys & Tutorials, vol. 12, no. 2, pp. 222–248, 2010.
155. Z. Liu and I. Elhanany, “RL-MAC: A reinforcement learning based MAC protocol for wireless
sensor networks,” International Journal of Sensor Networks, vol. 1, no. 3, pp. 117–124, 2006.
156. M. Sha, R. Dor, G. Hackmann, C. Lu, T.-S. Kim, and T. Park, “Self-adapting MAC layer for
wireless sensor networks,” Washington University in St. Louis, Tech. Rep. WUCSE-2013-75,
2013.
157. W. Ye, J. Heidemann, and D. Estrin, “An energy-efficient MAC protocol for wireless sensor
networks,” in 21st Annual Joint Conference of the IEEE Computer and Communications
Societies, vol. 3, 2002, pp. 1567–1576.
T. van Dam and K. Langendoen, “An adaptive energy-efficient MAC protocol for wireless
sensor networks,” in Proceedings of the 1st International Conference on Embedded Networked
Sensor Systems, ser. SenSys ’03. New York, NY, USA: ACM, 2003, pp. 171–180.
158. K. Klues, G. Hackmann, O. Chipara, and C. Lu, “A component-based architecture for power-
efficient media access control in wireless sensor networks,” in Proceedings of the 5th
International Conference on Embedded Networked Sensor Systems. ACM, 2007, pp. 59–72.
159. C. Doerr, M. Neufeld, J. Fifield, T. Weingart, D. C. Sicker, and D. Grunwald, “MultiMAC-an
adaptive MAC framework for dynamic radio networking,” in International Symposium on New
Frontiers in Dynamic Spectrum Access Networks. IEEE, 2005, pp. 548–555.
160. D. Moss and P. Levis, “BoX-MACs: Exploiting physical and link layer boundaries in low-power
networking,” Computer Systems Laboratory Stanford University, 2008.
161. Y. Sun, O. Gurewitz, and D. B. Johnson, “RI-MAC: A receiver-initiated asynchronous duty cycle
MAC protocol for dynamic traffic loads in wireless sensor networks,” in Proceedings of the 6th
ACM Conference on Embedded Network Sensor Systems. ACM, 2008, pp. 1–14.
162. ZigBee Alliance, “ZigBee-2007 specification,” Online:
http://www.zigbee.org/Specifications/ZigBee/Overview.aspx, 2007.
163. T. Avram, S. Oh, and S. Hariri, “Analyzing attacks in wireless ad hoc network with self-
organizing maps,” in 5th Annual Conference on Communication Networks and Services
Research, 2007, pp. 166–175.
L. N. De Castro and J. Timmis, Artificial immune systems: A new computational intelligence
approach. Springer, 2002.
164. G. J. Pottie and A. Pandya, Quality of service in wireless sensor networks. John Wiley & Sons,
Inc., 2008, pp. 401–435.
165. D. Chen and P. K. Varshney, “QoS support in wireless sensor networks: A survey,” in
International Conference on Wireless Networks, vol. 233, 2004.
166. M. A. Osborne, S. J. Roberts, A. Rogers, S. D. Ramchurn, and N. R. Jennings, “Towards real-
time information processing of sensor network data using computationally efficient multi-output
Gaussian processes,” in Proceedings of the 7th International Conference on Information
Processing in Sensor Networks. IEEE Computer Society, 2008, pp. 109–120.
167. N. Ouferhat and A. Mellouk, “A QoS scheduler packets for wireless sensor networks,” in
International Conference on Computer Systems and Applications, 2007, pp. 211–216.
168. M. Seah, C.-K. Tham, V. Srinivasan, and A. Xin, “Achieving coverage through distributed
reinforcement learning in wireless sensor networks,” in 3rd International Conference on
Intelligent Sensors, Sensor Networks and Information. IEEE, 2007, pp. 425–430.
169. R. Hsu, C.-T. Liu, K.-C. Wang, and W.-M. Lee, “QoS-aware power management for energy
harvesting wireless sensor network utilizing reinforcement learning,” in International
Conference on Computational Science and Engineering, vol. 2. IEEE, 2009, pp. 537–542.
170. X. Liang, M. Chen, Y. Xiao, I. Balasingham, and V. C. M. Leung, “A novel cooperative
communication protocol for QoS provisioning in wireless sensor networks,” in 5th International
Conference on Testbeds and Research Infrastructures for the Development of Networks
Communities and Workshops, 2009, pp. 1–6.
171. N. Baccour, A. Koubaa, L. Mottola, M. A. Zuniga, H. Youssef, C. A. Boano, and M. Alves,
“Radio link quality estimation in wireless sensor networks: A survey,” ACM Transactions on
Sensor Networks (TOSN), vol. 8, no. 4, p. 34, 2012.
172. A. Woo, T. Tong, and D. Culler, “Taming the underlying challenges of reliable multihop routing
in sensor networks,” in Proceedings of the 1st International Conference on Embedded Networked
Sensor Systems, ser. SenSys ’03. New York, NY, USA: ACM, 2003, pp. 14–27.
173. K. Shah and M. Kumar, “Distributed independent reinforcement learning (DIRL) approach to
resource management in wireless sensor networks,” in International Conference on Mobile Adhoc
and Sensor Systems, 2007, pp. 1–9.
174. A. Mainwaring, D. Culler, J. Polastre, R. Szewczyk, and J. Anderson, “Wireless sensor networks
for habitat monitoring,” in Proceedings of the 1st ACM International Workshop on Wireless
Sensor Networks and Applications. ACM, 2002, pp. 88–97.
175. Weiss and Indurkhya, Predictive data mining: A practical guide. Morgan Kaufmann Publishers.
ISBN: 1-55860-403-0.