Neonatal Baby Monitoring
Alexander Spengler
Master of Science
School of Informatics
University of Edinburgh
2003
Abstract
In this thesis we investigate the use of probabilistic graphical models for neonatal baby
monitoring applications. In particular, we concentrate on detecting artefact patterns in
physiological data using a conditional Gaussian approach. We describe a system that
learns the necessary parameters from the given data and produces marginal posterior
probabilities for the latent variables that have been used to model the artefact processes.
It should be emphasised that the current system does not include the temporal evolution
of the measured signals, but we indicate how this can be done within the presented
framework. We also discuss our approach in the context of prior work and present
ways to overcome identified problems.
Declaration
I declare that this thesis was composed by myself, that the work contained herein is
my own except where explicitly stated otherwise in the text, and that this work has not
been submitted for any other degree or professional qualification except as specified.
(Alexander Spengler)
Acknowledgements
I would like to thank my supervisor, Dr Chris Williams, for supporting me (not only)
throughout the time I was working on the MSc project here in Edinburgh. He spent
a lot of time and effort organising meetings, requesting software updates, reviewing my
progress and providing me with literature. His ability to keep my interests on the right
track has helped me a lot; as well as his confidence in my work—which sometimes
seemed to be greater than my own. In addition to this I really enjoyed his inspiring
lectures.
I would like to thank Professor Neil McIntosh for answering my numerous questions
on artefact patterns in the monitoring data and for evaluating the results of my work.
He together with Chris also made my visit to the neonatal intensive care unit at the
Royal Infirmary Edinburgh possible, for which I am very grateful and which provided
me with motivation in times when things didn’t turn out to be the way they should have
been.
I furthermore would like to thank Professor Jim Hunter and Paul McCue for their
tremendously fast updates to the Time Series Workbench software, John Quinn for his
support with the machines at the Royal Infirmary and Dr David Barber for his effort,
interest and often funny lectures—even if I finally decided to do the project with Chris.
I would like to thank Dr Ralf Schoknecht for his encouragement throughout the whole
time I was working at the Institut für Logik, Komplexität und Deduktionssysteme,
University of Karlsruhe and later.
Thanks also to Dr Barthelmeß and Professors Calmet, Menzel and Waibel (all University
of Karlsruhe) for helping me make this year in Edinburgh a reality.
I would like to thank all the new friends I made here in Edinburgh as well as the ones
who are back home in Germany.
My deepest gratitude, however, is to my family since without them all of this would
only have been a dream.
Contents
1 Introduction
1.1 Monitoring in intensive care units
1.2 Prior Work
1.3 Overview of the remaining chapters
2 Data
2.1 General description
2.2 Data formats and their conversion
2.3 Artefact processes
2.3.1 Drop outs
2.3.2 Recording device artefacts
2.3.3 Recalibration or relocation of the gas probe
2.3.4 Recalibration of the blood pressure transducer
2.3.5 Endotracheal Suctioning
2.3.6 Drawing blood gas
3 Methods
3.1 The conditional Gaussian model
3.1.1 Modelling artefacts
3.1.2 Modelling observations
3.2 Construction of the belief network
3.2.1 General considerations
3.2.2 Creation of latent variables
3.2.3 Creation of the CG distribution
3.3 Computing marginal posterior probabilities
3.4 Experiments
3.4.1 Experiment 1340 05 Nov 2001 4
3.4.2 Experiment 1344 12 Nov 2001 5
3.4.3 Experiment 1369 22 Nov 2001 7
3.4.4 Experiment 1355 14 Nov 2001 8
3.4.5 Experiment 1369 21 Nov 2001 9
4 Results
4.1 Experiment 1340 05 Nov 2001 4
4.2 Experiment 1344 12 Nov 2001 5
4.3 Experiment 1369 22 Nov 2001 7
4.4 Experiment 1355 14 Nov 2001 8
4.5 Experiment 1369 21 Nov 2001 9
5 Conclusions and Future Work
5.1 Conclusions
5.2 Future work
A Additional plots
Bibliography
Chapter 1
Introduction
Every second of our life we humans process immense amounts of information which
we receive from all our senses. And at first sight, this does not seem to be an outstand-
ing ability to us—maybe because we do so immediately and without conscious effort.
But humans are able to outperform machines in many walks of life. This is especially
true for data-rich environments or situations that are characterised by greatly varying
patterns. We can, for example, infer someone’s emotional state just by looking at the
facial expression or predict an approaching storm without having any expert knowledge
about it, simply by adequately interpreting familiar phenomena like dark clouds,
strong wind, lightning and thunder.
Both examples have some interesting features in common. First, the observations we
make comprise characteristic patterns which we have encountered before. Hence we
are able to identify and match them to something we already know, to something we
have learned—a tear running down someone’s cheek, say, or the lightning in the sky.
This ability is referred to as generalisation. Those patterns then are often indicative
signs or symptoms of latent/hidden causes; and the inferred knowledge about those
causes may help us to predict the future and thence influence impending decisions.
Tears, for instance, give rise to the conclusion that the person we look at is currently
sad and we might think about cheering her/him up.
It is quite obvious that the combination and timing of the observations are of great
importance and may well change our interpretation. Dark clouds alone do not make a
proper storm, and neither does thunder. Only the timely combination of both, the dark
clouds and the thunder, may increase our belief in an upcoming storm. And spotting
a tear while a person laughs may lead to its reinterpretation as being a tear of joy, of
course.
A further key point here is the probabilistic nature of both the information we seek
to process and the form in which we should express the results. Being able to handle
variations in the observed patterns as well as uncertainties about their latent causes
plays a crucial role in robustly processing the data. No two storms on this planet have
been, are or will ever be the same.
Sometimes, however, we do not seem to have sufficient knowledge to understand the
underlying causes which generated the patterns we observe—often due to their com-
plexity or to the variability of the observations. Nevertheless, there has always been
a need for an appropriate interpretation of our observations—answers to the questions
“why?” and “how?”. And humans have never been short of explanations. Actually,
we are rather creative: In ancient times we declared a storm the result of a god being
angry, which appears to be somewhat funny today. But how much do we understand
about how our own mind machine works?
In the last decades, more and more technical environments, such as power plants, operating
theatres, airplane cockpits or even cars, have also become sources of rich and
complex information. And the monitoring and exploitation of this information is one
of the main goals of today’s science. Again it is clear that an at least partial understanding
of the data-generating processes is fundamental to the successful accomplishment
of these tasks.
However, this is not always as trivial as interpreting a teardrop. On the contrary, most technical
environments are considered to provide more information than can be
evaluated appropriately. The NASA space shuttle, for instance, produces over
one hundred different sensor signals every fraction of a second and it would be foolish
to believe a human could easily spot suspicious patterns and relationships within these
time series data, especially over hours, days or even weeks.
In an intensive care unit (ICU), the situation is not all that different. Patients are usually
connected to several devices with multiple sensors which monitor vital body functions
such as blood pressure, temperature, heart rate and so forth; and the progress of in-
formation technology in recent years has added numerous other sources to learn about
a patient’s state of health. Mainly for the above mentioned reasons, it turns out to be
difficult to put together the many displayed sensor readings to form sensible hypotheses
about a patient’s well-being.
This is where techniques from the fields of machine learning and pattern recognition
might be valuable. Automated and reliable classification of characteristic, yet varying
patterns in the time series data would help to learn about a patient’s state and thus assist
the medical staff in their future treatments.
This thesis is primarily about a statistical approach to the important task of recognising
patterns in neonatal monitoring data by means of latent variable modelling, in particu-
lar focussing on identifying artefacts.
The next section gives a short introduction to the field of patient monitoring. In
the second section of this chapter we briefly review the key examples from previous
work that has been undertaken in the field of automated patient monitoring and artefact
detection in time series data. The last part of this introductory chapter then presents an
outline of the thesis’ structure.
1.1 Monitoring in intensive care units
McIntosh (2002, page 349) defines monitoring as
“[. . . ] the serial evaluation of time-stamped data.”
and it is clear that there is an almost innumerable number of systems that produce
data over time that is worthy of a proper evaluation. Examples range from natural
phenomena like thunder storms and floods, to technical environments like cars, planes
or even modern cooking devices in the kitchen. But one of the most classic and also
hugely important areas is without doubt patient monitoring.
The most common example of monitoring a patient’s condition is perhaps watching
the body temperature using a clinical thermometer. It is general knowledge that a body
temperature of about 37 °C is normal, whereas too low or too high temperatures can be
dangerous. Another example that almost everyone should have undergone is taking
the pulse by putting one’s finger on the inside of the forearm next to the wrist. After
an accident, that’s what everyone should be able to do to figure out the condition of
the injured person. Of course, with the dawn of information technology, nowadays
critical care areas of hospitals such as ICUs and operating rooms have by far more
intelligent means to monitor a patient’s state than listening to the sound of the chest
with a stethoscope.
In a modern ICU, a seriously ill patient is attached to many devices with sometimes
multiple probes in an attempt to aid watching and judging her/his condition. It goes
without saying that not all of those measured sensor signals are easy to interpret. Another
crucial point regarding the interpretability of the displayed sensor readings is
their corruption by various artefact processes. Even something as simple as a patient
movement can in fact lead to heavily varying measurements and thus reduce the usefulness
of the monitors themselves as well as the quality of the medical and nursing care.
In the next few paragraphs we give some reasons why care suffers from the presence of
artefactual data in the physiological traces.
Alberdi et al. (2001) report on the outcomes of a cognitive engineering investigation
that analysed the differences between junior and senior physicians in their interpretation
of monitored physiological data in a neonatal intensive care unit (NICU). They
show that senior doctors are not only more often in the position to detect relevant patterns
in the data, but they also relate a bigger percentage of the characteristic traces in
the displayed data to their causes. The senior doctors identified on average 68% of the
relevant patterns, whereas the junior colleagues spotted 54%. Even clearer are the results
about how often the correct underlying causes of those patterns could be inferred.
Out of 172 possible inferences, the senior physicians generated on average 56%. The
junior doctors however only provided 28% of correct inferences, probably partly as a
consequence of the smaller proportion of identified relevant patterns. In addition to
this senior doctors recognised artefacts seven times more often than the junior doctors.
In other words, inexperienced or less well-trained staff are less likely to detect relevant
events in the data and also find it more difficult to infer the patient’s real state of health.
And it should be noted that it is the junior doctors and the nurses who spend most of
the time at the bedside.
But the presence of artefacts in the monitoring data does not only decrease its interpretability
by clinical staff, it also immensely increases the number of false alarms
in the critical care areas. This is especially worrying in the context of rising patient
numbers and medical staff shortages, since the sounding of alarms becomes a crucial
indicator of a patient’s deteriorating condition or need for assistance in the absence of
personnel.
Several studies (e.g. Tsien and Fackler (1997); Lawless (1994); Koski et al. (1990))
were carried out to accurately determine the quality and quantity of monitoring alarms
in the ICU. The results are disillusioning: The percentages of clinically significant
alarms range from 5.5% (Lawless (1994)), 8% (Tsien and Fackler (1997)) to 10.6%
(Koski et al. (1990)). This means that the false positive rate—i.e. the number of in-
appropriate alarms divided by all alarm soundings—is extraordinarily high. Tsien and
Fackler distinguish further between alarms that occur during some treatment or diagnostic test
of a patient (so-called patient intervention alarms) and those that do not (non-patient intervention
alarms). The false positive rates are almost equal, being 82% for alarms during an
intervention by a caregiver and 86% without one. However, 88.9% of the true alarms during
patient interventions are reported to be clinically irrelevant, whereas 78.6% of the true
alarms not associated with patient interventions are clinically significant. In addition,
four out of five alarms go off while no personnel are attending the patient.
The most reliable alarm seems to be the mean systemic blood pressure taken from an
arterial line with a false positive rate of 46% and the most frequent cause of a false
alarm is the pulse oximeter with over 90%. In section 2.3 we discuss these results in
the light of the artefactual data we have examined.
False alarms, in general, pose a serious menace to the healthcare of a patient (see
Meredith and Edworthy, 1995), in particular because there are different devices that
are very likely to create different auditory warning signals and those signals are not
necessarily related to the medical urgency of the alarm. Hence, staff can easily become
annoyed, irritated and confused by the false alarms or simply get accustomed to them.
Or they silence the alarm—in the worst case by turning the alarm completely off,
thereby creating a deceiving calmness which is probably worse than having no alarms
at all.
According to Tsien and Fackler (1997) the most prevalent reasons for a nurse or doctor
silencing the alarm are drawing blood gas, suctioning, patient movements, examina-
tions, recalibrations and probes falling off the patient. Interestingly, almost all of them
fall into the category of true, but clinically irrelevant alarms. And even more will be
present in the monitored data traces—as artefacts.
There are, of course, numerous attempts to remedy this situation and reduce the number
of false alarms in an ICU. Most of them have realised the inherent relationship between
false alarm rates and the recognition of artefactual processes. Therefore the majority of
the approaches are indeed artefact detection methods and we review some of them in
section 1.2.
Let us briefly consider some intensive care scenarios, how the ideal monitoring system
should work there and why this is in practice not as easy as one wishes. Please note
that this paragraph follows the discussion in Tsien (2000b, page 57). First, consider a
child with breathing difficulties, which is quite likely to have an increased heart rate
along with less than normal values for the respiratory rate and the arterial oxygen
saturation. In contrast, a child whose pulse oximeter probe has just fallen off may not
exhibit unusual respiratory or heart rates, but an immediate drop in the saturation of
oxygen. And another child may just have turned around in the bed so that the reading
for the arterial oxygen saturation became corrupted for this time, say generating values
below the lower threshold alarm limit, while all the other physiological parameters are
normal. Currently available monitors would sound the same alarm in all three cases,
due to the fall of the saturation of oxygen below the previously set limit.
An intelligent monitoring system however would be in the position to distinguish the
three cases by examining the available evidence in the recorded physiological signals
and issue an appropriate alarm—should it be necessary at all. In the first scenario, the
monitor could sound an urgent alarm and, if the child is artificially ventilated, adjust
the settings of the ventilator. In the second case, the system could set off a less urgent
alarm to indicate that the oximeter probe has just fallen off and it needs to be corrected.
And finally, in the last case, there would not be a need for an alarm at all; yet the system
should recognise and record the period of time when the infant was rolling over in the
bed as a motion artefact.
Again, this is not as trivial in practice as it might sound in theory. The reasons are
our uncertainty about the underlying cause of the observed data and the variations in
the observed patterns. For example, the oxygen saturation could drop far below the lower
threshold limit in the third scenario, blurring cases two and three; or the child
with breathing difficulties could roll over, causing the probe to fall off, and so forth.
1.2 Prior Work
In this section we review some of the more important approaches to condition mon-
itoring in general, and to patient monitoring, alarm and artefact detection systems in
particular. We will focus on the latter, but begin with an application of condition mon-
itoring in a different field, namely online failure detection in antenna pointing systems
(Smyth (1994a); Smyth (1994b)).
The antenna system described in this work is used to track deep-space spacecraft in
real-time. The aim of the monitoring application is to quickly identify the causes of
any problems, so that loss of telemetry data or early shut down of the track can be
avoided. The author reports on an experiment in which hardware faults are introduced
into the pointing system of a huge antenna; those faults are either a noisy tachometer,
the complete failure of a tachometer or a short-circuit in an amplifier. Furthermore,
there exists a normal state. Eight autoregressive-exogenous (ARX) coefficients and
four standard deviation measurements have been used as the observable feature vector.
The goal of the experiment is to determine the type of fault for each of a sequence
of 12-dimensional feature vectors. First, two static models are used: a Gaussian
mixture model (GMM) and a single hidden layer neural network. Neither
of these models is able to utilise the temporal aspects within the data. Thus, neither
model is reported to produce particularly accurate trackings of the underlying faults,
though the neural network seems to model the causes slightly better. Then the temporal
evolution of the observations is addressed by introducing a hidden Markov
model (HMM) whose transition matrix correlates the estimates of the GMM and the
neural network, respectively. Although some improvement for the GMM plus HMM is
stated, the neural network in combination with the HMM performs significantly better
and tracks the underlying faults properly.
This paper is important since it exemplifies an approach similar to the one we take
in this thesis. More specifically, our current model is static, like the GMM and
the neural network, but it is intended to include the temporal context soon after we
have evaluated it. The result that the temporal context improves the accuracy in both
cases makes us wonder if this will be true for our approach as well. Nonetheless, there
is one major difference to our model: Smyth employed only one multinomial hidden
variable—the fault state, whereas our strategy is to combine several latent processes
which generate the observation.
But let us turn to the patient monitoring setting now. Altogether, there are quite a
lot of different approaches depending on the background of the author. We will try to
cover the most important works from several fields here, although the reader should be
aware that this is no extensive literature review, more an overview. Having said this,
let us go in medias res.
First, we will briefly describe the approach by Tsien et al. (2000) (see also Tsien
(2000b), Tsien et al. (2001) and Tsien (2000a)). The key idea here is to use decision
trees and logistic regression models to detect artefacts in monitoring data from a
neonatal ICU. More precisely, both models are built to detect artefacts in four physiological
channels which provide observations at a one-minute granularity. The channels
used are heart rate (HR), mean blood pressure (BM) as well as partial pressures of
oxygen (OX) and carbon dioxide (CO).
From the four raw data signals, several additional features are constructed, including
moving mean and median as well as the best-fit linear regression slope, for example. It
is important to remark that artefact detection is done channel by channel, although
features derived from all channels are used to classify a specific observation
as artefactual or not. Due to the one-minute granularity of the raw data, window sizes
for those features were 3, 5 and 10 minutes. Then standard software packages were
used to compute decision trees and logistic regression models from the derived
features only. “Ground truth” for the labels was provided by retrospective analysis by
a clinical expert. The results were evaluated on a separate test set using performance
metrics such as accuracy, specificity, sensitivity and area under the receiver operating
characteristic (ROC) curve.
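The derived features mentioned above can be illustrated with a minimal Python sketch. It is not the implementation used by Tsien et al.; the function names `ls_slope` and `window_features`, the choice of a trailing window and the sample heart-rate values are all assumptions made for illustration only.

```python
from statistics import mean, median

def ls_slope(values):
    """Slope of the least-squares line through (0, v0), (1, v1), ...

    Requires at least two values, otherwise the denominator is zero.
    """
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def window_features(signal, window):
    """Trailing-window (mean, median, slope) per sample.

    Entries are None until a full window of past samples is available.
    """
    rows = []
    for i in range(len(signal)):
        if i + 1 < window:
            rows.append(None)
            continue
        w = signal[i + 1 - window : i + 1]
        rows.append((mean(w), median(w), ls_slope(w)))
    return rows

# Hypothetical heart-rate trace, one value per minute (made-up numbers).
hr = [140, 142, 141, 180, 143, 142, 141, 140, 139, 141]
features = {w: window_features(hr, w) for w in (3, 5, 10)}
```

With one sample per minute, the window lengths 3, 5 and 10 correspond directly to the 3, 5 and 10 minute windows described in the text. A single outlier such as the value 180 shifts the moving mean far more than the moving median, which is one reason both might be offered to a classifier.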
The reported areas under the ROC curve for the four final decision tree models range
from 89.4% for BP to 99.9% for OX. The logistic regression models are said to be
worse.
In the last approach, the preprocessing step is maybe the most interesting. Unfortu-
nately, the authors do not examine the influence of the preprocessing step in detail.
Besides, it is our opinion that this approach is a rather naïve application of machine
learning techniques and the results are not too impressive.
There are numerous other works that discuss abstraction as a means of improving
monitoring applications, for example Cao and McIntosh (1998), Cao and McIntosh
(2000), Miksch et al. (1996) or Haimowitz et al. (1995).
Another interesting strategy is to use time series methods, such as ARIMA, to predict
the next data point and hence decide whether it is artefactual or not (Hoare and Beatty (2000); Hoare
et al. (2002)).
There are also some approaches based on knowledge-based systems, as for example
discussed in Becker et al. (1997).
It is our point of view that the only principled calculus for dealing with the probabilistic
nature of artefact patterns in monitoring data is simply probability theory.
1.3 Overview of the remaining chapters
Chapter 2 describes the monitoring data that has been collected over several years at
the NICU at the Royal Infirmary in Edinburgh. After a short general introduction to the
structure and content of the data set, we go on to describe various artefact patterns that
can be found within the multiple traces of the physiological signals. We again restrict
our discussion to the most prevalent artefacts.
Chapter 3 details the theory and practical construction of the latent variable model we
used in our approach as well as how we learned its parameters and calculated posterior
and marginal posterior probabilities of an artefact being present at a particular time.
We first introduce the conditional Gaussian model itself and then explain in detail how
these models can be constructed given the monitoring data. We also demonstrate how
this can be accomplished with the programs we have written. Moreover, we describe
how the model parameters (means, covariances and prior probabilities) can be com-
puted.
Chapter 4 presents the results of applying the conditional Gaussian model to detecting artefacts.
For five different preterm neonates and various artefacts we show marginal posterior
probabilities for periods of at least six hours. Due to the absence of “ground truth”
labels for the artefact processes the evaluation is twofold, however. As far as feasible,
we tried to measure classification accuracy automatically. For the remaining artefact
processes, an experienced medical expert evaluated our results. Together with the
annotations and remarks that have been stored in the data set with the help of the TIME
SERIES WORKBENCH software, he also served as the gold standard for the evaluation.
The final chapter discusses the results in the context of other approaches, identifies sev-
eral problems with the conditional Gaussian approach and discusses how these prob-
lems can be addressed and overcome in the future. We also provide a brief conclusion.
Chapter 2
Data
This chapter provides a description of the neonatal monitoring data with which we will
be working, and of the format we will use in our experiments. Furthermore, we will
present plots of multiple physiological sensor signals in which interesting patterns can
be spotted. As far as possible, we will explain the cause of these characteristic patterns.
2.1 General description
The source of data in this project is a database of neonatal monitoring data which has
been collected by Prof Neil McIntosh and colleagues over the last few years. The part
of the data which is available to us includes 129 recordings (over 500 hours) of 42
preterm-born infants that were made between September 1st, 2001 and February
13th, 2002 at the neonatal intensive care unit (NICU) of the Royal Infirmary in
Edinburgh, Scotland.
Of the 42 different infants, 17 are female, and 21 of the 129 data sources belong
to neonates who were born within or before the 29th week of gestation. One baby was
born in the 23rd week of gestation. The collection does not only include the recorded
sensor readings of multiple physiological signals such as the heart rate or saturation of
oxygen, it also provides elaborate annotations which have been gathered by a research
nurse who was attending the cot-side full-time. Those annotations include the actions
taken by medical personnel, observations of the nurse such as sporadic movements or
skin colour, laboratory results, and device settings for example.
The TIME SERIES WORKBENCH (TSW) software developed by Prof Jim Hunter from
the University of Aberdeen (Hunter, 2001) provides excellent functionality for
displaying and manipulating the data sources, all of which have been recorded at a one-second
granularity. Moreover, all annotations are easily accessible within this tool and
the physiological data can be exported to various formats such as ASCII text. Unfor-
tunately, the author did not have the time to develop his own software for use within
the TSW. Instead, the preferred approach was to implement the required routines in
MATLAB (The MathWorks, Inc., 2003), a widely-used mathematical software pack-
age. But even then the TSW was frequently used to access annotations and further
detailed information.
Although they greatly facilitated our project, the annotations recorded by the cot-side nurse
were not overly helpful with regard to the automatic selection of artefactual data. This
is because the remarks only very rarely indicate the period of time for which
a particular process can be observed. In addition, the stored information is
detailed but incomplete, which renders the automatic selection of data via labels impossible.
Therefore the author had to create machine-usable labels himself, greatly
supported by the annotations available within the TSW and by notes from a meeting
with Prof Neil McIntosh.
Moreover, the author was in the position to use centiles of physiological sensor signals
with respect to variables such as gestation and post-natal age, which have also been
collected by Prof Neil McIntosh.
Recapitulating, we can say it is our sincere belief that the described database is a
unique resource in the field of neonatal monitoring and provides great opportunities
for improved patient care.
2.2 Data formats and their conversion
The original source data was stored in a Microsoft Access database of size 385 646 kilobytes,
including annotations. As this format is rather inappropriate for the computations
we intended to do, we had to convert the raw data with the help of the TSW into
a format MATLAB can process.
Fortunately, this could be achieved within hours as the TSW allows us to export the
physiological data channels to an ASCII text file and MATLAB can be programmed to
read it. Below we show an example of what the exported ASCII text file looks like:
Context: Badger Source: 1340 Date: 05/11/2001 Time: 08:37:09 SampInt: 1 Second NumSamp: 37181
Date       Time     HR     TC    TP    OX   CO   BS    BD    BM    . . .
05/11/2001 08:37:09 137.00 37.40 36.30 0.00 0.00 36.00 24.00 31.00 . . .
05/11/2001 08:37:10 137.00 37.40 36.30 0.00 0.00 36.00 24.00 31.00 . . .
05/11/2001 08:37:11 137.00 37.40 36.30 0.00 0.00 36.00 25.00 31.00 . . .
05/11/2001 08:37:12 137.00 37.40 36.30 0.00 0.00 37.00 25.00 31.00 . . .
05/11/2001 08:37:13 137.00 37.40 36.30 0.00 0.00 37.00 25.00 31.00 . . .
. . .
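To make the structure of the export concrete, the following is a minimal sketch of a reader for it. It is written in Python rather than MATLAB and is not the import routine used in this project; the function name parse_tsw_export and the regular expression are our own assumptions, and the layout (one line of "Key: value" pairs, one line of column names, then one line per sample with day/month/year dates) is inferred solely from the example above.

```python
import re
from datetime import datetime

# "Key: value" pairs of the first line; a value ends where the next "Key: " starts.
# This pattern is an assumption derived from the single example export.
HEADER_FIELD = re.compile(r"(\w+):\s+(.*?)(?=\s+\w+:\s|$)")


def parse_tsw_export(text):
    """Return (header, channel_names, timestamps, samples) for one export."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = dict(HEADER_FIELD.findall(lines[0]))
    channel_names = lines[1].split()[2:]  # drop the Date and Time columns
    timestamps, samples = [], []
    for line in lines[2:]:
        date, time, *values = line.split()
        # Dates are day/month/year, as in the example export.
        timestamps.append(datetime.strptime(f"{date} {time}", "%d/%m/%Y %H:%M:%S"))
        samples.append([float(v) for v in values])
    return header, channel_names, timestamps, samples


# Shortened copy of the example export (three channels only).
EXAMPLE = """\
Context: Badger Source: 1340 Date: 05/11/2001 Time: 08:37:09 SampInt: 1 Second NumSamp: 37181
Date Time HR TC TP
05/11/2001 08:37:09 137.00 37.40 36.30
05/11/2001 08:37:10 137.00 37.40 36.30
"""

header, channel_names, timestamps, samples = parse_tsw_export(EXAMPLE)
print(header["Source"], channel_names, samples[0])
```

The samples come back as a list of rows, which mirrors the "Number of samples" × "Number of channels" layout of the matrix described below.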
In order to be able to easily access the physiological data as well as additional information
provided not only by the ASCII text file but also within the original database,
such as details about the week of gestation and the birthday of the baby, we created a
class called neonate in MATLAB. Thus we could utilise the principle of “information
hiding”. It also allows us to overload special functions like plot. We decided to store
the following information in a neonate object:
• The fifteen physiological data channels HR, TC, TP, OX, CO, BS, BD, BM, RR,
SO, HS, FO, PH, Unused and P2, holding the heart rate (in beats per minute),
central and peripheral temperatures (in degrees Celsius), oxygen
and carbon dioxide pressures (in kilopascals), systolic and diastolic blood pressures
as well as their mean value (in mmHg), respiratory rate (in breaths per
minute), oxygen saturation (in percent) and the heart rate recorded by the SO device
(in beats per minute), as well as four other channels which have always been empty
so far and thus have never been used. All the above data is stored in a large
matrix called channels of dimensions “Number of samples” × “Number of
channels”.
• For each of the fifteen-dimensional data points we also saved the time and the
date of its recording as separate character arrays.
• The ID of the baby and its gender (as strings).
• The week of gestation and the baby’s birthday and -time (integer and strings).
From this core information we can derive other information, such as the post-natal day or
the number of sampled data points and their granularity.
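The relation between stored and derived information can be sketched as follows. This is a Python illustration under our own assumptions, not the MATLAB neonate class itself: the field and property names are hypothetical, the post-natal day is taken to be the number of calendar days between birth and the first sample, and the two channel rows are made-up values.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Neonate:
    """Hypothetical stand-in for the stored core information."""
    baby_id: str
    sex: str
    gestation_weeks: int
    birth: datetime
    timestamps: list   # one datetime per sample
    channels: list     # one row per sample, one column per channel

    @property
    def num_samples(self):
        return len(self.channels)

    @property
    def sampling_interval(self):
        """Granularity in seconds, taken from the first two samples."""
        if len(self.timestamps) < 2:
            return None
        return (self.timestamps[1] - self.timestamps[0]).total_seconds()

    @property
    def postnatal_day(self):
        """Calendar days between birth and the first sample (our assumption)."""
        return (self.timestamps[0].date() - self.birth.date()).days


n1344 = Neonate(
    baby_id="1344", sex="male", gestation_weeks=26,
    birth=datetime(2001, 11, 6, 12, 36, 33),
    timestamps=[datetime(2001, 11, 12, 8, 32, 54), datetime(2001, 11, 12, 8, 32, 55)],
    channels=[[137.0, 37.4], [137.0, 37.4]],  # illustrative values only
)
print(n1344.num_samples, n1344.sampling_interval, n1344.postnatal_day)  # 2 1.0 6
```

Keeping the derived quantities as computed properties means they cannot drift out of step with the stored data, which is in the spirit of the information hiding mentioned above.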
To be able to manipulate the data in a convenient way, we also overloaded some important
operators such as display and set. Furthermore, we added some of our own functions.
The most crucial one is without doubt plot, which enables us to visualise and
extract the physiological channels. There is also a method called importFromTSW
which, when entered from the MATLAB shell, opens a dialog box asking for the
TSW-generated ASCII text file, imports the contained information and asks for the rest, such
as the week of gestation. The imported data is then returned in the form of a neonate
object:
>> n1344_12_Nov_2001 = importFromTSW
n1344_12_Nov_2001 is a neonate object with the following properties:
15 Channels, labeled
(1) HR (2) TC (3) TP (4) OX (5) CO (6) BS (7) BD (8) BM (9) FO (10) RR (11) PH (12) UNUSED (13) P2 (14) SO (15) HS
36337 samples available
from : 12/11/2001, 8:32:54
to   : 12/11/2001, 18:38:30
ID                : 1344
Sex               : male
Week of Gestation : 26
Birthday / -time  : 6 November 2001, 12:36:33
(Time check off)
Altogether, we imported 13 different recordings. All of them contained at least 8
channels and 6 hours of data. Histograms of the individual channels can be found in
Appendix A, as well as plots of five entire recordings, all of which have been chosen
for our experiments.
2.3 Artefact processes
As indicated in the introduction of this chapter, we give a brief overview of some of
the most prevalent patterns present in the data we analysed. We certainly do not claim
this list to be complete or the descriptions to be overly precise since the author has to
admit a certain lack of medical background knowledge.
2.3.1 Drop outs
Quite frequently, and often in more than one physiological channel, there are drop outs.
These drop outs usually occur completely independently of other channels' values¹ and
almost never follow a specific timing, i.e. they occur arbitrarily. Moreover, we
exclude drop outs whose channels do not plummet to zero. Most often one can observe
these patterns in the respiratory rate and oxygen saturation recordings. Figure 2.1
shows an example.
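Since only drops to exactly zero are counted as drop outs, candidate regions can be located with a simple scan. The sketch below is a Python illustration of that idea only; it is not the detection method of this thesis (which is the conditional Gaussian model of Chapter 3), and the function name, the min_length parameter and the sample values are our own assumptions.

```python
def zero_runs(values, min_length=1):
    """Return (start, end) index pairs, end exclusive, of runs where a channel is zero."""
    runs = []
    start = None
    for i, v in enumerate(values):
        if v == 0:
            if start is None:
                start = i          # a new run of zeros begins here
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    # A run may extend to the very end of the recording.
    if start is not None and len(values) - start >= min_length:
        runs.append((start, len(values)))
    return runs


rr = [42.0, 40.0, 0.0, 0.0, 0.0, 41.0, 0.0, 43.0]  # made-up respiratory rate values
print(zero_runs(rr))                # [(2, 5), (6, 7)]
print(zero_runs(rr, min_length=2))  # [(2, 5)]
```

Such a scan could at most help to find regions worth inspecting and labelling by hand; it says nothing about the cause of a drop out.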
2.3.2 Recording device artefacts
Another rather common pattern is the absence of many, sometimes even all, channels.
Figure 2.2 shows a good example. Unfortunately, we do not really know which state
the recording device produces when. Nevertheless, we will refer to the pattern in which
all channels are zero as the one in which the device is supposedly off, and every time
the temperatures are at 20 °C we call it a recalibration, irrespective of the truth.
¹ As long as two different channels are not based on the same device's recordings.
[Figure 2.1: four panels showing HR [bpm], HS [bpm], SO [%] and RR [1/min] for baby 1369 (born 21 November 2001 at 12:21), recorded on 21/11/2001 from 17:53 to 18:03.]
Figure 2.1 Drop outs in the heart rate HR and HS, the oxygen saturation SO and
the respiratory rate RR. Please note that HS and SO are recorded from the same
probe, which explains the synchronous patterns.
2.3.3 Recalibration or relocation of the gas probe
The first pattern which is slightly more interesting from a modelling viewpoint is a
recalibration of the combined O2/CO2 probe. As one can see from Figure 2.3, there are
at least three distinct stages in the pattern. First, both the oxygen and carbon dioxide
pressures fall to zero. Then there is a stage in which the O2 takes on values around
20 kPa and the CO2 is about 5 kPa. Finally, the oxygen pressure returns to normal
values. So does the carbon dioxide, but before that it usually drops to zero. This last
stage in the CO2 channel is highly variable and the author has seen many different
patterns, ranging from smooth, somewhat exponential increases through oscillations to
spikes.
If the first stage is missing, we will usually refer to this artefact as a relocation
rather than a recalibration. Whether this is true or not, we leave for the experts to
decide.²
² The great number of variations in the data have unsettled the author's confidence in these matters.
[Figure 2.2: eleven panels showing HR [bpm], TC [°C], TP [°C], OX [kPa], CO [kPa], BS [mmHg], BD [mmHg], BM [mmHg], RR [1/min], SO [%] and HS [bpm] for baby 1369 (born 21 November 2001 at 12:21), recorded on 21/11/2001 from 12:44 to 13:06.]
Figure 2.2 An example in which the recording device is said to be off in the first
three minutes of the shown period, whereas we call it a recalibration at 13:05.
[Figure 2.3: two panels showing OX [kPa] and CO [kPa] for baby 1355 (born 12 November 2001 at 17:17), recorded on 14/11/2001 from 9:40 to 10:00.]
Figure 2.3 An example of a gas probe recalibration.
[Figure 2.4: four panels showing HR [bpm], BS [mmHg], BD [mmHg] and BM [mmHg] for baby 1369 (born 21 November 2001 at 12:21), recorded on 22/11/2001 from 13:58:00 to 14:06:00.]
Figure 2.4 An example of a recalibration of the blood pressure transducer with
drop outs in heart rate (HR).
2.3.4 Recalibration of the blood pressure transducer
Another artefact with a complex set of distinct states is the recalibration of the blood
pressure transducer, as illustrated in Figure 2.4 and Figure 2.5. As this artefact influences
only the HR, BS, BD and BM channels, we do not show the others. How to
model this artefact is an interesting question, but we leave its answer to the reader for
now. It is, however, interesting to observe that the same pattern can occur with and
without synchronous drop outs in heart rate.
Patterns such as those shown in Figure 2.5 from 16:22:40 to 16:23:20 are often, especially
when spotted individually, not a recalibration but the flushing of the line of the probe.
[Figure 2.5: four panels showing HR [bpm], BS [mmHg], BD [mmHg] and BM [mmHg] for baby 1355 (born 12 November 2001 at 17:17), recorded on 14/11/2001 from 16:21:00 to 16:24:00.]
Figure 2.5 An example of a recalibration of the blood pressure transducer
without drop outs in heart rate (HR).
2.3.5 Endotracheal Suctioning
The endotracheal suctioning is the second artefact which modifies the HR, BS, BD
and BM channels (Figure 2.6). The characteristic pattern is a short bowl-shaped
drop in heart rate, usually lasting for about 30 seconds. This is when the actual
suctioning takes place. Starting some seconds later, we can see the suctioning's
influence on the blood pressures. Their values rise fast during the event, only to slowly
return to normal. This normalisation can take up to 30 minutes, depending on the baby
and her/his condition.
It is sometimes helpful to know that the nurses usually do two or three suctionings
within a short period of time.
[Figure 2.6: four panels showing HR [bpm], BS [mmHg], BD [mmHg] and BM [mmHg] for baby 1369 (born 21 November 2001 at 12:21), recorded on 22/11/2001 from 11:15 to 11:30.]
Figure 2.6 A characteristic example of an endotracheal suctioning.
[Figure 2.7: four panels showing HR [bpm], BS [mmHg], BD [mmHg] and BM [mmHg] for baby 1340 (born 4 November 2001 at 14:27), recorded on 19/11/2001 from 11:30 to 11:40.]
Figure 2.7 An example where blood gas is being taken and there is no drop out
in heart rate.
2.3.6 Drawing blood gas
Drawing the blood gas from the radial arterial line is one of the most obvious patterns in
the data sets. Like the endotracheal suctioning and the recalibration of the
blood pressure transducer, it modifies HR, BS, BD and BM. Depending on the time
scale used, the pattern can look like a sharp spike or a steady, more or less linear increase
in the blood pressures (systolic as well as diastolic). At the same time, the heart rate
usually shows drop outs to zero. Figure 2.7 and Figure 2.8 give two clear examples, one
without the drop out in HR and one with it.
[Figure 2.8: four panels showing HR [bpm], BS [mmHg], BD [mmHg] and BM [mmHg] for baby 1369 (born 21 November 2001 at 12:21), recorded on 22/11/2001 from 11:50 to 12:00.]
Figure 2.8 In this example the blood gas is being taken and there is a drop out in
heart rate.
Chapter 3
Methods
This chapter details the approach of modelling the monitoring data at hand via a spe-
cific Bayesian network (Pearl, 1988) in which the distribution of the observations given
the latent causes is conditional Gaussian (CG). First, we describe how artefacts and
observations can be expressed by discrete and continuous random variables. We also
formally introduce the CG distribution here. Then we show how the parameters of
the belief network can be estimated (learned) from the data. We focus on procedures
rather than theory and include a description of how this can be done with the MATLAB
routines we implemented. In the final section of this chapter we briefly talk about the
setup of the experiments carried out.
3.1 The conditional Gaussian model
The approach we are taking in this thesis is to model artefact processes in the neonatal
baby monitoring data as discrete latent random variables while the observed multiple
physiological data channels are determined by a continuous variable. More precisely,
we model the artefacts as binary or multinomial variables and the observation at a
specific time as having a normal distribution. In other words, the joint state space
follows a conditional Gaussian distribution as defined below.
3.1.1 Modelling artefacts
As we have seen in section 2.3, most artefact processes do not comprise several different
stages. A drop to zero in the oxygen saturation channel, for instance, will
either be present or not. In this case, assuming that an artefact does not depend on any
other process, we can endow its binary latent random variable with an unconditional
distribution. Thus, if we let $X_{\text{SO dropout}} = \text{“present”}$ be the event that there is a zero
dropout in the oxygen saturation, we only need to determine its prior probability
$\pi_{\text{SO dropout}} = P\{X_{\text{SO dropout}} = \text{“present”}\}$, because from the definition of our
sample space $\Omega = \{\text{“present”}, \text{“absent”}\}$ and the fact that the artefact can either be
present or absent, it follows that
$$P\{X_{\text{SO dropout}} = \text{“absent”}\} = 1 - P\{X_{\text{SO dropout}} = \text{“present”}\}.$$
Similar considerations hold for artefacts which comprise more than two distinct states.
The recalibration of the O2/CO2 probe is a good example. Again, we assume that
artefact processes do not depend on other hidden causes so that we can model multi-stage
processes using multinomial variables with unconditional distributions. Letting
$X_{\text{gas}} = i$ ($i \in \{1, 2, \ldots, N_{\text{gas}}\}$) denote the event that the recalibration of the O2/CO2
probe is in stage $i$ and $X_{\text{gas}} = 0$ that there is currently no recalibration of the probe,
we need to find the $N_{\text{gas}}$ prior probabilities $\pi^{i}_{\text{gas}} = P\{X_{\text{gas}} = i\}$.
In theory, this is clear and straightforward; in practice, however, it is sometimes difficult
to tell how many distinct states or what prior probabilities there are for an artefact.
Section 3.4 explains in more detail how many states and which prior probabilities
we assigned to a particular latent random variable in a particular experiment. Please
note also that binary random variables can easily be treated as multinomial ones, and
that is exactly what we do.
3.1.2 Modelling observations
Before we carry on to illustrate the mathematical model associated with an observation
given the artefacts, let us introduce the standard notation for CG distributions (Lauritzen
and Wermuth, 1984, 1989).
First, let the set of variables $V = \Delta \cup \Gamma$ be partitioned into discrete ($\Delta$) and continuous
($\Gamma$) ones. Then let $Z$ be a random vector of the joint state space indexed by $U \subseteq V$, so
$Z = Z_V$. In addition, we define $Y = Z_\Gamma$ and $X = Z_\Delta$, so that a typical element of the
discrete state space is given by $x = (x_\delta)_{\delta \in \Delta}$, where every $x_\delta$ takes on a finite number
of values. The set of all possible realisations $x$ in the discrete state space is referred to
as $H$, which is the Cartesian or cross product of the state spaces of the $X_\delta$, $\delta \in \Delta$. The
conditional distribution of the continuous random variable $Y$ given the discrete $X$ is
assumed to be multivariate normal:
$$P\{Y \mid X = x\} = N_{|\Gamma|}\big(\mu(x), \Sigma(x)\big) \quad \text{whenever} \quad \pi(x) = P\{X = x\} > 0. \tag{3.1}$$
We write $|\Gamma|$ to denote the cardinality of the set $\Gamma$ and $N_{|\Gamma|}(\mu, \Sigma)$ for the
$|\Gamma|$-dimensional Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$. Given
$\Sigma(x)$ is positive semidefinite¹, we then say $Z$ follows a conditional Gaussian distribution.
Now, we transfer the theory into the monitoring context. We could define $\Delta =$
{“zero drop of oxygen saturation”, “recalibration of gas probe”, . . . , “drawing blood
gas”} to include all artefacts we wish to model. Similarly, $\Gamma$ would contain variables
representing all physiological channels that can be observed, say $\Gamma =$ {“HR”, “TP”,
“TC”, “OX”, “CO”, “BS”, “BD”, “BM”, “RR”, “SO”}. For notational reasons only,
let us use the simpler set $\Delta = \{1, 2, \ldots, K\}$ to refer to the discrete variables. Thus,
$X = (X_1, X_2, \ldots, X_K)$ contains the $K$ binary or multinomial random variables modelling
the artefact processes, such as $X_{\text{gas}}$.
In order to completely determine the conditional distribution of $Y$ given $X$, we need
to find the moment characteristics of the CG distribution for every realisation in $H$. In
other words, we have to find $|H|$ mean vectors $\mu(x)$ of dimension $|\Gamma|$, $|H|$
covariance matrices $\Sigma(x)$ of size $|\Gamma| \times |\Gamma|$ and $|H|$ priors $\pi(x)$.
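Once these three sets of parameters are known, Bayes' rule gives the posterior probability of each realisation, $P\{X = x \mid Y = y\} \propto \pi(x)\, N_{|\Gamma|}(y; \mu(x), \Sigma(x))$. The following Python sketch illustrates this for a single binary artefact and one channel. It is not the inference code of this project: the function names are hypothetical, the covariance is assumed to be diagonal to keep the example free of matrix algebra, and all numbers are invented.

```python
import math


def log_gaussian_diag(y, mean, variances):
    """Log density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, variances)
    )


def posterior(y, states):
    """states maps a realisation x to (prior, mean vector, variances)."""
    log_joint = {
        x: math.log(prior) + log_gaussian_diag(y, mean, variances)
        for x, (prior, mean, variances) in states.items()
        if prior > 0  # Equation 3.1 is only defined for realisations with positive prior
    }
    top = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {x: math.exp(v - top) for x, v in log_joint.items()}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


# Invented parameters for a zero drop out in the oxygen saturation channel (in percent).
so_states = {
    "absent": (0.95, [95.0], [9.0]),
    "present": (0.05, [0.0], [1.0]),
}
print(posterior([0.5], so_states))
print(posterior([94.0], so_states))
```

Working with log densities and subtracting the maximum before exponentiating avoids numerical underflow when one realisation explains the observation far better than the others.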
¹ In the case of $\Sigma(x)$ being singular, the probability density of the degenerate distribution does not exist.
[Figure 3.1: graphical model with the nodes $X_1, X_2, \ldots, X_K$ and $Y$ and the parameter nodes $\pi_1, \pi_2, \ldots, \pi_K$, $\mu(x)$ and $\Sigma(x)$.]
Figure 3.1 Graphical model of the CG distribution which we applied to the
problem of artefact detection in neonatal monitoring data. $X_1, X_2, \ldots, X_K$ are the
discrete random variables with prior probabilities $\pi_1, \pi_2, \ldots, \pi_K$ which model $K$
different artefacts. The conditional distribution of the multidimensional continuous
random variable $Y$ given the discrete ones is multivariate Gaussian with mean $\mu(x)$ and
covariance $\Sigma(x)$. As the prior probability $\pi(x)$ is the product of the priors of the
hidden nodes, we do not show it (see Equation 3.4). Discrete random variables are
illustrated via square nodes while round nodes indicate continuous variables. A
node with outgoing dotted arrows visualises a random variable’s parameter.
If $N_\delta$, $\delta \in \Delta$, denotes the total number of distinct states in a particular artefact plus the
state in which this artefact is absent, then $|H| = \prod_{\delta \in \Delta} N_\delta$. This means we have to
compute $\frac{1}{2}\left(|\Gamma|^2 + 3|\Gamma| + 2\right) \prod_{\delta \in \Delta} N_\delta$ individual parameters if we refrain from using
spherical or isotropic covariance matrices: every realisation contributes $|\Gamma|$ mean entries,
$\frac{1}{2}|\Gamma|(|\Gamma| + 1)$ distinct covariance entries and one prior. Hence, the combinatorial explosion in the
number of free parameters poses a serious threat to every application of the CG model.
As an example, 10 binary artefact processes and 10 monitored data channels give rise to
$66 \times 2^{10} = 67584$ parameters that need to be set. Despite this theoretically prohibitive
increase in the number of free model parameters, the situation in practice is not overly
bad, because a huge number of the means and covariance matrices turn out to be equal.
This very fact is due to the kind of artefacts present in the data. More specifically, there
are processes, such as a recalibration of the recording device, which overrule all other
artefacts, leaving the same observations for all latent variable realisationsx in which it
is present. In our example of a recalibration of the recording device, all values would
therefore always be zero.
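The parameter count above is easy to reproduce. The short Python sketch below (the function name is our own) evaluates the formula for full covariance matrices and confirms the figure quoted for 10 binary artefacts and 10 channels.

```python
from math import prod


def cg_parameter_count(num_channels, states_per_artefact):
    """Number of CG parameters with full covariance matrices.

    Per realisation: num_channels mean entries, num_channels * (num_channels + 1) / 2
    distinct covariance entries and one prior.
    """
    per_realisation = (num_channels ** 2 + 3 * num_channels + 2) // 2
    return per_realisation * prod(states_per_artefact)


print(cg_parameter_count(10, [2] * 10))  # 67584
```

Adding a single further binary artefact doubles the count, which is the combinatorial explosion referred to above.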
Unfortunately, the time constraints on this project did not allow us to research this
issue in more detail. Nevertheless, we briefly discuss a sensible approach to effectively
represent, learn and apply the parameters of a CG distribution of the kind mentioned
here in the last chapter. For now, let us turn to the slightly more mundane field of
determining all necessary parameters—including a discussion of how to avoid troubles
such as singular covariance matrices.
3.2 Construction of the belief network
This section is intended to demonstrate how the moment characteristics $\{\mu, \Sigma, \pi\}$
of the CG distribution can be estimated. But before we describe the general procedure
to do that, it is a good idea to have a look at the graphical model associated
with the CG distribution. Figure 3.1 illustrates the belief network for the
random variables $Y$ and $X_1, X_2, \ldots, X_K$ together with their parameters.
3.2.1 General considerations
In principle, it is very easy to estimate the parameters for a single cross product state $x$
of the latent variables, once the appropriate multi-channel data for that state is available.
There are merely two minor caveats here:
1. Even for a restricted number of identified artefacts, there will be an enormous
number of different state combinations of the latent variables.
2. Not all of those combinations might be present in the data set that is available to
us.
The consequences are twofold. First, we need a reliable and at least moderately fast
method to automatically compute the parameters for all cross product states from the
data and, second, we must also be in the position to easily but accurately create artificial
data for those artefact state combinations which are not available in the provided
sources.
And indeed, the endeavour of designing and writing adequate software for the above
issues took up a considerable amount of project time.
Regarding the second point, the resulting methods utilise the fact that most artefacts
do not exhibit characteristic changes in all physiological channels, but only in a few of
them. Hence the untouched channels can be replaced with data that does not contain
any artefact patterns, that is, data we refer to as normal². We are also exploiting
the phenomenon we mentioned a little earlier: the observation that some
artefacts overwrite others depending on various influences, such as the devices that are
used to support and monitor the baby in the NICU and the way care is provided and
by whom. As another example, consider the two artefacts of drawing blood gas from
the radial arterial line and endotracheal suctioning. If both processes happen simultaneously,
the usual moderate drop in heart rate which is characteristic of a suctioning
will not be shown in the channel data, as taking the blood gas causes the heart rate
channel values to be zero, irrespective of whether the suctioning takes place or not.
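The two ideas of this paragraph, filling untouched channels with normal data and letting one artefact overwrite another, can be pictured with a small Python sketch. It is only an illustration under our own assumptions: the function name and all channel values are invented, and the order in which overlays are applied stands in for the precedence between artefacts. Only the heart rate channel is overlaid because that is the only channel for which the text states which artefact wins.

```python
def compose_state_sample(normal, overlays):
    """Start from normal channel values and apply artefact overlays in order.

    Later overlays overwrite earlier ones on the channels they alter;
    channels not altered by any artefact keep their normal values.
    """
    sample = dict(normal)
    for altered_channels in overlays:
        sample.update(altered_channels)
    return sample


normal = {"HR": 150.0, "SO": 95.0}   # invented normal values
suctioning = {"HR": 110.0}           # moderate drop in heart rate
blood_gas = {"HR": 0.0}              # heart rate drops out to zero

both = compose_state_sample(normal, [suctioning, blood_gas])
print(both["HR"], both["SO"])  # 0.0 95.0
```

Applied alone, the suctioning overlay would leave a heart rate of 110; applied together with the blood gas overlay, the drop out to zero wins, as described above.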
Subsection 3.2.3 details the above discussion and also adds the remedy for situations
in which the covariance matrix has originally been estimated as being singular. For
now, let us quickly state how the parameters of the CG model can be learned, given a
set $O = \{o^{(t)}\}_{t=1,\ldots,T}$ of multivariate data samples and a particular realisation $x$ of the
hidden discrete random variables.
² We are careful in the usage of our language here, because none of the babies in a NICU can be said to be healthy and moreover because at the moment we do not model the baby's state of health, so that the supposedly normal data might actually show irregularities.
Then this problem can be readily solved using parametric density estimation. This is
especially trivial since the distribution we have to model is assumed to be unimodal
and Gaussian. In chapter 2 we briefly investigated to what extent this is true. Based on
the common and related assumption that the observed data samples are independently
and identically distributed (IID), we use the maximum likelihood estimators (MLEs)
to set the elements of the mean and the covariance matrix:
$$\mu = \frac{1}{T} \sum_{t=1}^{T} o^{(t)} \tag{3.2}$$
and
$$\Sigma = \frac{1}{T} \sum_{t=1}^{T} \left(o^{(t)} - \mu\right)\left(o^{(t)} - \mu\right)' \tag{3.3}$$
where $v'$ denotes the transpose of a vector $v$. For a more thorough review of the
maximum likelihood principle and the properties of its estimators we refer the reader
to one of the many good resources, including Bishop (1995, chapter 2), Tipping (1999,
chapter 5) and Jordan (2002, chapter 5).
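The two estimators translate directly into code. The Python sketch below (function name and toy data are our own, and it is not the MATLAB routine of this project) computes the maximum likelihood mean and covariance of a list of samples. Note the division by $T$ rather than $T - 1$, which distinguishes the MLE from the unbiased sample covariance.

```python
def mle_mean_cov(samples):
    """Maximum likelihood estimates of mean and covariance (Equations 3.2 and 3.3)."""
    T = len(samples)
    d = len(samples[0])
    mean = [sum(o[j] for o in samples) / T for j in range(d)]
    cov = [
        [sum((o[i] - mean[i]) * (o[j] - mean[j]) for o in samples) / T for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


# Three invented two-channel samples.
samples = [[1.0, 2.0], [3.0, 2.0], [5.0, 8.0]]
mean, cov = mle_mean_cov(samples)
print(mean)  # [3.0, 4.0]
print(cov)
```

For the toy data the covariance entries are 8/3 and 8 on the diagonal and 4 off the diagonal, and the matrix is symmetric by construction.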
Finally, we are left with the prior probabilities $\pi(x)$. Due to the (assumed) independence
of the artefact processes, we have
$$\pi(x) = P\{X_1 = x_1, X_2 = x_2, \ldots, X_K = x_K\} = \prod_{\delta \in \Delta} P\{X_\delta = x_\delta\} = \prod_{\delta \in \Delta} \pi^{x_\delta}_{\delta}, \tag{3.4}$$
where every $\pi^{x_\delta}_{\delta}$ can again be determined using the maximum likelihood approach, i.e.
by the ratio of the number of samples from the entire data set in which artefact $\delta$ is
in state $x_\delta$ to the total number of available samples in this data set. Should a known
artefact state not be present at all, one has to fall back on heuristically guessing the
corresponding prior probability.
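Both steps, estimating the per-artefact priors as sample ratios and multiplying them according to Equation 3.4, are sketched below in Python. The function names are hypothetical. The gas probe counts mirror the sample numbers of the gas probe example in subsection 3.2.2, whereas the drop out counts are invented for illustration.

```python
from math import prod


def state_priors(state_counts):
    """Maximum likelihood priors: samples in each state over all samples."""
    total = sum(state_counts.values())
    return {state: count / total for state, count in state_counts.items()}


def joint_prior(priors_per_artefact, realisation):
    """Prior of a cross product state under the independence assumption (Equation 3.4)."""
    return prod(priors[x] for priors, x in zip(priors_per_artefact, realisation))


gas = state_priors({"stage 1": 1003, "stage 2": 589, "stage 3": 169, "absent": 34311})
so_dropout = state_priors({"present": 360, "absent": 35712})  # invented counts

print(round(gas["stage 1"], 6), round(gas["absent"], 5))
print(joint_prior([gas, so_dropout], ("absent", "absent")))
```

A state with a count of zero receives a prior of zero here, which is exactly the situation in which the text falls back on a heuristic guess.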
So, as discussed at the beginning of this section, the main goal is to create a multivariate
data sample that is representative for a specific cross product state. The approach we
take in this work is described in detail in the next two sections, but the general outline
of the procedure is as follows.
First, we have to determine the artefacts, how many distinct states they comprise and
which physiological channels they alter in order to be able to recombine this informa-
tion later on when we create the numerous cross product states in order to estimate the
means and covariances.
This can in principle be done by introducing adequate labels. There is a problem with
this approach, however: how should we label data for artefact states which have been
identified but which are not present in the source we are currently looking at? One might
argue that we do not really need to include such a state when we examine this individual
source only; but in the more realistic case where one wants to have consistent artefact
models for all sources, this is trickier, not least because recorded channel values from
different days, and even more so from different infants, can vary greatly. Hence it might be
necessary to fake data not only for certain cross product states but also for
artefact states which have not been observed in the inspected source. As
an example, one state of a fictitious artefact might correspond to a spike whose shape
is precisely known and which is also clinically important, yet which is extremely rare, say
occurring once in 100 hours. In addition to this, the labelling approach might result in
problems when we estimate parameters from sparse data.
Because of these concerns we model the individual artefact states more explicitly. That
is, we select and extract the observable artefact state data, whereas we manually con-
struct data samples for the missing states using prior knowledge. The extracted samples
together with the constructed ones and their learned prior probabilities can then be
stored in a convenient structure, to ease further processing. Nevertheless, labels are
certainly useful to analyse the results of the artefact detection models.
With the data of the latent variables at hand, it is only a matter of appropriately re-
combining it for all realisations ofX to be able to estimate the elements ofµ(x) and
Σ(x).
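The realisations of $X$ over which this recombination has to run are simply the cross product of the individual artefact state spaces. A minimal Python sketch of the enumeration is given below; the function name and the two example artefacts are our own choice, and the code only lists the realisations without constructing any data for them.

```python
from itertools import product


def cross_product_states(artefact_states):
    """List every realisation x as a dict mapping artefact name to state."""
    names = list(artefact_states)
    return [
        dict(zip(names, combination))
        for combination in product(*(artefact_states[name] for name in names))
    ]


artefact_states = {
    "gas probe recalibration": ["absent", "stage 1", "stage 2", "stage 3"],
    "SO drop out": ["absent", "present"],
}
realisations = cross_product_states(artefact_states)
print(len(realisations))  # 8
print(realisations[0])
```

The number of realisations equals the product of the numbers of states, here 4 × 2 = 8, and for each of them a mean vector, a covariance matrix and a prior have to be estimated.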
3.2.2 Creation of latent variables
In this subsection we explain how to create a random variable associated with an artefact
process using the software we implemented in MATLAB³. Details about the artefacts
we used in the different experiments, what states they comprise and which prior
probabilities we assigned to them are given in section 3.4.
Please note that we will not go into implementation details either. For our purposes
here it is sufficient to know that the software is object oriented and there are classes for
the source data, the multinomial latent variables and the CG distribution. Each class
possesses some useful methods, such as plot in the neonate class, which visualises a
neonate object’s physiological data channels.
Identifying latent variables and their states
As we mentioned before, the first thing that needs to be done is to meticulously examine
a large amount of the source data. This allows us to get a feeling for the data set and
its most prevalent patterns. The interesting part then is to develop consistent models
for the artefacts that can be spotted within the data. Theoretically, it is clear that one
has to determine the number of distinct states and the physiological data channels that
are altered by the underlying artefact process. In practice, however, this is by no means
as trivial as one might expect.
First of all, there are problems with the data itself. Even if we classified a common
pattern as an artefact, there might still be large variations regarding the quality and
quantity of individual states. Examples include varying pattern onsets and durations
as well as different shapes, such as wild spikes when there should be a steady rise. An
even more concrete example is the heart rate pattern while a nurse is drawing blood
gas from the arterial line. The common and theoretical value for the heart rate in this
case is supposed to be zero. Sometimes, however, this is not true. There might be
several periods within the procedure where its values are actually perfectly normal, or
sometimes the onset differs from the onset spotted in other channels.
³ The MathWorks, Inc. (2003).
Moreover, it is obvious that technical and medical background knowledge does help a
lot when one has to decide on the number of artefact states and the channels affected by
them. If one knows the procedure of taking blood gas, it is easier to infer the changes
in the observed sensor signals.
Unfortunately, this classic expert knowledge was only rarely available to us. As there is
not enough medical staff to observe all infants around the clock (which in fact is one
of the reasons for having monitors in the NICU), the clinical annotations are incomplete.
In addition, the annotations provided by the medical personnel are sometimes
anything but trivial, at least for the author with his limited medical background.
Also, they only indicate an event in time and never durations. For instance, a nurse
might make a remark saying that she/he took blood gas, but it would be almost impossible
for her/him to note its precise beginning and end.
Despite all those inconveniences, the decisions regarding the number and quality of an
artefact’s states as well as the channels affected by them have a crucial impact on the
performance of the detection of this artefact.
Finally, it should be noted that we assume all states of an artefact process to alter the
same channels.
Selecting and generating artificial data
Reliable selection and extraction of the data associated with the previously identified
artefact states is straightforward, but time-consuming. This is especially true when it
is not possible to visualise the data. Even within MATLAB, which offers some very
high-level operations, the extraction of more than two artefact processes without appropriate
graphical support is unrealistic.
This is why we devoted a lot of our time to developing a tool called plot that enables us
to graphically display selected physiological channels for specified periods of time in
a reasonably intuitive fashion. It is clear, however, that its design is not the declared
goal of the project.
Figure 3.2 Example of how a selection of two different physiological data channels
(oxygen and carbon dioxide) can be saved to a workspace variable (here
recalGas21).
Apart from being able to visualise the data, one can also create zoomed plots and—
more important in the context of this section—mark up specific regions of interest in
order to save the corresponding data to a workspace variable.
Figure 3.2 illustrates an example session in which the values of the oxygen and carbon
dioxide channels associated with the first state of the gas probe recalibration artefact
are saved to a workspace variable called recalGas21. The corresponding command
shell output is given below:
>> plot(n1355_14_Nov_2001, 4 5)
Creating plots...
Finished.
New figure with...
Start Time: 15:39:31
End Time  : 17:20:37
Creating plots...
Finished.
Selected channel data (16:10:41 to 16:42:42) written to recalGas21.
The only command the user needs to execute from the shell is the plot command
on the first line. It generates a new figure showing the oxygen and carbon dioxide
(indicated by 4 and 5 respectively) channels for the data stored in the neonate object
n1355_14_Nov_2001. Then a new, zoomed version of this figure is created and in it
the data from 16:10:42 to 16:42:42 is selected (shown in Figure 3.2). A new dialog
appears which asks the user to specify a name for the workspace variable to which the
marked up data will be saved. Note that we store only the data of those channels which
are modified by the latent process.
This process of selecting and saving channel values must be repeated for all identified
artefact states present in the source. Should there be two occurrences of a gas probe recalibration,
for example, we would have to save six regions, as there are three different
non-normal states in this artefact.⁴
Faking state data was not really necessary for the artefacts we modelled during this
project. Yet we exploited the fact that some artefact states do not change irrespective
of the source given. Thus we were able to save time and effort by reusing such a state's
data. A state associated with channel values which are solely zero (such as the highlighted
region in Figure 3.2) is a good example.
Even if one needs to fake data for some artefact states, it can easily be incorporated
into the model, no matter what techniques are used to generate it.
Constructing the multinomial object
Constructing the multinomial object after all data has been saved to workspace variables
ables is really trivial. We simply call the constructor of the multinomial class with
the correct arguments.
⁴ For the reasons given in subsection 3.2.3, we usually do not need to include the normal state's data of an artefact explicitly.
Suppose we previously selected and saved data associated with the three non-normal
states of the gas probe recalibration artefact to workspace variables called recalGas11,
recalGas12, recalGas21, recalGas22, recalGas31 and recalGas32, where the
first number corresponds to the artefact’s state and the second to its occurrence in the
data source, so that recalGas21 contains the data from the first occurrence of the
second non-normal artefact state. Then we can invoke the constructor as follows:
>> nstates = 4;
>> data = {{recalGas11 recalGas12} {recalGas21 recalGas22} {recalGas31 recalGas32} {}};
>> labels = {'OX 0/CO 0', 'OX 20/CO 5', 'OX high/CO low', 'Normal'};
>> priors = [1 0 0 0];
>> colors = zeros(nstates,3);
>> name = 'wrong priors';
>> recalGasTmp = multinomial('NumberOfStates', nstates, 'Priors', priors, 'Labels', labels,...
'Colors', colors, 'Data', data, 'Name', name)
Warning: 4. cell array in Data contains no neonate objects
> In multinomial.m at line 138
recalGasTmp is a multinomial object called "wrong priors" and has the following properties:
                  Prior         Number of         Number of
Realisation       Probability   Neonate Objects   Available Samples
-------------------------------------------------------------------------------
OX 0/CO 0         1             2                 1003
OX 20/CO 5        0             2                 589
OX high/CO low    0             2                 169
Normal            0             0                 0
It acts on the following channels: OX CO
In case we want to copy an already existing multinomial object to modify some of
its properties later, we could use the copy constructor, as shown below:
>> recalGas = recalGasTmp;
>> recalGas.priors = [1003 589 169 34311]/36072;
>> recalGas.name = 'correct priors'
recalGas is a multinomial object called "correct priors" and has the following properties:
                  Prior         Number of         Number of
Realisation       Probability   Neonate Objects   Available Samples
-------------------------------------------------------------------------------
OX 0/CO 0         0.027806      2                 1003
OX 20/CO 5        0.016328      2                 589
OX high/CO low    0.0046851     2                 169
Normal            0.95118       0                 0
It acts on the following channels: OX CO
The above example also demonstrates how we can compute the prior probabilities for
the different artefact states. Given that the source from which we extracted the artefact
state data comprises a total of 36072 samples, and that the data associated with the
artefact states (recalGas11, recalGas12, etc.) has been extracted properly, the above
computed values are the MLEs of the states’ prior probabilities.
This method is especially handy if there are several occurrences of an artefact.
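The prior computation above amounts to taking relative frequencies. As a minimal sketch outside MATLAB (the function name mle_priors is hypothetical and not part of the toolbox described here), the numbers of the example can be reproduced in Python as follows:

```python
# Sample counts of the three non-normal recalGas states (both occurrences
# combined) and the total number of samples in the source, as quoted above.
counts = {"OX 0/CO 0": 1003, "OX 20/CO 5": 589, "OX high/CO low": 169}
total_samples = 36072


def mle_priors(state_counts, total):
    """Relative frequencies; all remaining samples count as the normal state."""
    priors = {state: n / total for state, n in state_counts.items()}
    priors["Normal"] = (total - sum(state_counts.values())) / total
    return priors


priors = mle_priors(counts, total_samples)
for state, p in priors.items():
    print(f"{state:15s} {p:.5g}")
```

The normal state receives the remaining 34311 samples, which is the fourth entry of the vector passed to recalGas.priors above.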
3.2.3 Creation of the CG distribution
Given that we have already built multinomial objects for the various artefact
processes, the creation of a conditionalGaussian object is technically accomplished by
calling the class’s constructor method. Nevertheless, there is one caveat here, which is
the order of the multinomial objects in the argument list. The specified order does,
as we discuss below, determine which artefacts possibly overwrite others. Also we
have to be sure to incorporate data representing some kind of normality or, in other
words, the absence of any artefacts. Finally, there are some issues that need to be
addressed in the case of a singular covariance matrix Σ(x).
Determining the order of the latent variables
Every time two or more different artefacts have in common at least one channel which
is affected by them, it is interesting to see what happens when those processes occur
simultaneously. For example, one artefact’s presence could diminish another one’s
influence or in the extreme case cause it to be absent.
This is an important issue since we have to generate the cross product states
automatically and hence need to know about the observations caused by the interaction of
various artefacts on the considered data channels in order to reproduce them.
Fortunately, there is a simple solution to the problem, which is based on the
observation that the artefacts we considered in this project share the property that one
completely overwrites another one’s patterns or vice versa. We therefore did not
encounter an artefact pair whose members interacted so that the result was
a mixture of their usual patterns or something new, i.e. belonging neither to the first
nor to the second artefact. But we are sure that such pairs exist and must be taken care of
in more elaborate models. For now, let us return to how we utilise this observation to
determine the order of the multinomial objects in the argument list.

[Figure 3.3: Hasse diagram with the nodes zeroHR, zeroRR, zeroSO, endoSuc, abg, recalGas, recalBP, recal and off; image not reproduced]
Figure 3.3 Hasse diagram for ∆ = {zeroSO, zeroHR, zeroRR, recalGas,
endoSuc, abg, recalBP, recal, off}, which is used in experiment
1369 21 Nov 2001 9 (see section 3.4). The number in brackets before an artefact’s
label δ is given by N_δ, the number of its distinct states.
Mathematically, we can define a binary relation “is overwritten by” on the Cartesian
product ∆ × ∆, so that ∆ is actually a partially ordered set on this relation. Figure 3.3
illustrates one partially ordered set for ∆ = {zeroSO, zeroHR, zeroRR, recalGas,
endoSuc, abg, recalBP, recal, off}.
A subset C ⊆ ∆ whose elements are artefacts altering the same channels is a chain in
(∆, “is overwritten by”).⁵ The set {endoSuc, abg, recalBP} is one of these chains,
for example.
As a consequence, the argument list is determined by the structure of the partially
ordered set ∆, so that the process which overwrites at least some channels of all the
other artefacts in ∆ is the last in the list.⁶ In Figure 3.3 one such list could be (endoSuc,
zeroSO, recalGas, abg, recalBP, zeroHR, zeroRR, recal, off). Given this list, we
can then start to construct the data sample for a specific cross product state as follows:
⁵ Please note that these subsets clearly do not represent all possible chains in (∆, “is overwritten by”).
⁶ It has to be the last if there exists a unique maximal element of ∆ called the largest element,
otherwise the order of the maximal elements can be chosen arbitrarily.
1. We calculate the maximal number M of available data points of the artefact
states currently considered.
2. We randomly select M of those points from every considered artefact state data
set and save them in S_δ, δ ∈ ∆ respectively. In addition to this, we randomly
select M points from the data set which corresponds to the normal state in which
all artefacts are absent. This set is saved to S_normal. All data points might be
multivariate, depending on how many channels an artefact changes.
3. We create the cross product state sample S_x of size M by assigning to it the
individual artefact state samples S_δ in the order specified in the argument list.
Moreover, we always initialise S_x with S_normal. Channels which are not affected
by an artefact’s state need not be overwritten.⁷
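The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: data points are dictionaries mapping channel names to values, points are drawn with replacement (so that M may exceed the size of a smaller data set), and the function name build_cross_product_sample as well as the toy values are hypothetical.

```python
import random


def build_cross_product_sample(normal, active_states, order, m, seed=0):
    """Build S_x for one cross product state.

    normal        -- list of dicts (channel -> value) for the normal state
    active_states -- artefact name -> list of dicts holding only the channels
                     that the artefact's current non-normal state modifies
    order         -- artefact names in argument list order (later overwrites)
    m             -- number of points to draw (M in the text)
    """
    rng = random.Random(seed)
    # Step 3 (initialisation): start from M randomly drawn normal points.
    sample = [dict(rng.choice(normal)) for _ in range(m)]
    # Steps 2 and 3: overwrite channels artefact by artefact, in list order.
    for name in order:
        if name not in active_states:  # artefact absent: nothing to overwrite
            continue
        for row in sample:
            row.update(rng.choice(active_states[name]))
    return sample


# Toy data (assumed values, three channels only).
normal = [{"HR": 150, "HS": 151, "SO": 95}, {"HR": 148, "HS": 149, "SO": 96}]
zero_so = [{"HS": 0, "SO": 0}]
off = [{"HR": 0, "HS": 0, "SO": 0}]
order = ["zeroSO", "off"]

only_zero_so = build_cross_product_sample(normal, {"zeroSO": zero_so}, order, 4)
both = build_cross_product_sample(normal, {"zeroSO": zero_so, "off": off}, order, 4)
```

In the first call only the SO and HS channels are overwritten, whereas in the second call the later artefact off overwrites everything, which mirrors the role of the argument list order.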
Issues about the normal state
The reason why we usually construct multinomial objects without assigning data to
the states which represent the absence of the artefact is given by the way we build the
cross product state sample S_normal. We clearly do not wish an artefact which is absent
to overwrite other artefacts’ states.⁸ Imagine we include the normal state’s data of the
maximal element of an arbitrary chain C ⊆ ∆, then all samples of the other artefacts
in this chain will be overwritten by the normal state’s sample of the maximal element,
irrespective of the state of the rest of the artefacts in C.
Therefore only the first artefact in the argument list should comprise the normal data
in its normal state, so that no absent/normal state of any other artefact overwrites
important and earlier assigned samples.
Of further interest is what we regard as normal and what not. A simple definition
would be everything which is not artefactual. The problem with this definition is that
the data is not that well-behaved, and even if we exclude all the samples we consider
to be artefactual, there might still be difficulties. It is most likely that there are
unidentified artefacts or different states of health of the baby, patterns only a medical
expert can spot and distinguish adequately.

⁷ Actually, we do not even store the non-influenced channels, neither in S_δ nor in the original data set.
⁸ Thus, a precise version of the previously given definition of the relation on ∆ is “is overwritten in at least one channel by the non-normal states of”.
Therefore the construction of the normal state itself might be tricky. And unfortunately,
we cannot give a general explanation of what data we assumed to be normal and what
not. The selection process, however, usually did not include regions with high
variability. On the other hand, this also means that the channels of the normal states we
learned have small standard deviations, which leads to the problem that some artefact
states might be wrongly rendered responsible for patterns neither declared normal nor
artefactual.
One solution—which is not implemented at the moment—is the incorporation of a
model of normality with a large variance; yet we do have our doubts to what extent
this approach could work.
Before we illustrate how the conditionalGaussian class constructor computes the
mean and covariance matrix of a specific cross product sample S_x, let us briefly show
its function call from the command line together with assigning normal data to the first
multinomial object in its argument list:
>> zeroSO.data{2} = normal;
>> parents = {zeroSO recalGas abg recal};
>> CG = conditionalGaussian('Parents', parents)
Initialising given data...
Computing means, covariances and priors for 32 cross-product states...
CG is a conditionalGaussian object named "not specified" with 4 parent nodes called:
Zero SO / HS (2 states)
Recalibration or Relocation of OX/CO Probe (4 states)
Taking Blood Gas (2 states)
Recalibration of Badger (2 states)
Altogether there are 32 parent node state combinations (i.e. cross-product states)
on the following channels: BD BM BS CO HR HS OX RR SO TC TP
Rows and columns in the covariance matrices which contain only zeros are adjusted
Computing mean and covariance matrix
In essence, there should not be a lot to say in this subsection as we set out the
fundamentals in subsection 3.2.1. We simply construct the cross product sample S_x
as explained in the previous paragraphs and compute its MLEs using Equation 3.2,
Equation 3.3 and Equation 3.4.
Problems arise in the case where Σ(x) is not positive definite, which is, of course, the
case when complete columns or rows are equal to zero.
Unfortunately, this is often true for the MLE of Σ(x) since the cross product sample
frequently contains zero values for entire physiological channels. A recalibration of the
recording device could be responsible for this, for instance. Thus we have to sensibly
adjust the corresponding elements in Σ(x).
The procedure we use works as follows. First, we replace the zero-entry columns and
rows in the matrix with data from S_normal. Then we recompute the MLE of Σ(x) and
adjust the elements of the previous zero-entry columns/rows by multiplying them by a
small constant (α = 0.05). This ensures that the adjusted matrix elements are equally
small relative to the normal state. Should this not yield the desired effect, so that there
are still complete columns or rows equal to zero (maybe because S_normal contains
only zero values as well), we replace the corresponding main diagonal elements with a
small constant (β = 0.0001).
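One possible reading of this procedure is sketched below in plain Python. It is not the MATLAB implementation: we assume that "replacing with data from S_normal" means substituting the normal-state values on the degenerate channels before recomputing the covariance, that every element lying in an affected row or column is scaled by α exactly once, and that both samples have the same size M. The function names are hypothetical.

```python
def covariance(rows):
    """Maximum likelihood covariance (divides by n) of equal-length rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / n
             for j in range(d)] for i in range(d)]


def adjusted_covariance(sample, normal, alpha=0.05, beta=1e-4):
    """Covariance of S_x with all-zero rows/columns adjusted as in the text."""
    cov = covariance(sample)
    d = len(cov)
    zero = {j for j in range(d) if all(cov[j][k] == 0 for k in range(d))}
    if not zero:
        return cov
    # Step 1: use normal-state data on the degenerate channels and recompute.
    patched = [[n_row[j] if j in zero else s_row[j] for j in range(d)]
               for s_row, n_row in zip(sample, normal)]
    cov = covariance(patched)
    # Step 2: shrink every element in a previously all-zero row or column.
    for i in range(d):
        for j in range(d):
            if i in zero or j in zero:
                cov[i][j] *= alpha
    # Step 3: fall back to a small diagonal entry if a row is still all zero.
    for j in zero:
        if all(cov[j][k] == 0 for k in range(d)):
            cov[j][j] = beta
    return cov


# Channel 1 is constantly zero in the cross product sample.
sample = [[1.0, 0.0], [3.0, 0.0]]
cov_a = adjusted_covariance(sample, normal=[[2.0, 5.0], [2.0, 7.0]])
cov_b = adjusted_covariance(sample, normal=[[2.0, 0.0], [2.0, 0.0]])
```

In the first call the degenerate channel inherits a scaled-down version of the normal state's variability; in the second call the normal data is zero as well, so only the β fallback on the main diagonal applies.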
Time complexity
The time complexity of computing the parameters of the CG distribution is O(|H|),
which is unfortunate, since |H| grows exponentially with the number of artefacts and we
therefore cannot use vast numbers of them. But
this is an intrinsic problem of the CG model itself. One reasonable thing to do when
one wants to consider more artefacts while reducing the unfortunate combinatorial
explosion in the number of the cross product states |H| at the same time, is to
exploit the structure of the poset ∆. First of all, we could combine all artefacts of
a chain C, whose elements modify the same channels, to form one big latent
variable with N_C = 1 + ∑_{δ∈C} (N_δ − 1) distinct states. Substituting the new variable
for the original random variables yields another partially ordered set on the set of
specific chains whose artefacts change the same channels. This method alone can
reduce |H| considerably. For example, if we use the naïve approach for experiment
1369 21 Nov 2001 9, which uses nine different latent variables (see section 3.4), we
have to process 2^7 × 3 × 4 = 1536 distinct cross product states, whereas the newly
constructed latent variable set of size 6 consists of only 2^3 × 3 × 4 × 5 = 480.
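The two state counts can be checked with a short Python calculation. The numbers of states per artefact are those listed for this experiment in section 3.4; which artefacts are merged into chains is our assumption, chosen because it reproduces the quoted product 2^3 × 3 × 4 × 5: the chain {endoSuc, abg, recalBP} named earlier, and recal together with off, which both act on all channels.

```python
from math import prod

# Number of distinct states per artefact (including the normal state).
n_states = {"zeroSO": 2, "zeroHR": 2, "zeroRR": 2, "recalGas": 4,
            "endoSuc": 2, "abg": 2, "recalBP": 3, "recal": 2, "off": 2}

# Assumed chains of artefacts that modify the same channels.
chains = [("endoSuc", "abg", "recalBP"), ("recal", "off")]


def merged_states(chain):
    """N_C = 1 + sum over the chain of (N_delta - 1)."""
    return 1 + sum(n_states[a] - 1 for a in chain)


in_chain = {a for chain in chains for a in chain}
naive = prod(n_states.values())
merged = (prod(merged_states(c) for c in chains)
          * prod(n for a, n in n_states.items() if a not in in_chain))

print(naive, merged)
```

With these assumptions the merged variables have 5 and 3 states respectively, and the two products come out as 1536 and 480.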
But we can reduce the number of cross product states even more effectively. As an
artefact’s non-normal states overwrite all its children’s states in the newly constructed
partially ordered set, it is superfluous to represent all of their cross product
states individually. If the channels an artefact modifies are a subset of its parent’s
channels, then they will be entirely overwritten every time the parent artefact is present
(non-normal). This means all data samples of the cross product states of the child
and parent processes are the same. Should the parent’s state be normal or the child’s
modified channels not be a subset of the parent’s channels, all of the child’s states are
relevant in the cross product state sample. Algorithm 3.1 presents a formal description
of the above discussion.
Maybe an example clarifies this point. Consider the recal artefact in Figure 3.3. It
overwrites all of its children’s channels. Thus the cross product of its non-normal states
with the children’s states will exhibit only as many distinct cross product state samples
as there are non-normal states in recal, reducing its size from 2^3 × 4 × 5 = 160 to 1.
When the recal artefact is absent, however, we need to consider all 160 distinct states.
Applying this principle to the complete set ∆ in Figure 3.3 reduces the cross product
state space from an original value of 1536 to 162.
Hence we can exploit this result to speed up the estimation of the CG parameters
as well as the computation of the posterior probabilities. In the estimation problem,
we would transform the original multinomial latent variables into the new partially
ordered set, compute the means, covariance matrices and priors and invert the
transformation again. Fortunately for us, these transformations are computationally
inexpensive. Therefore the discussed method is a little bit like a Fourier transform.
Algorithm 3.1 Recursive construction of the reduced cross product state space.
S = reduceCPSS(artefact)
if artefact.children = ∅ then
    return artefact.states
else
    // Split the children in two disjoint sets; one with children whose channels are a subset
    // of the artefact’s channels (Ω) and one where this is not the case (Λ)
    Ω = Λ = ∅
    for all c ∈ artefact.children do
        if c.channels ⊆ artefact.channels then
            Ω = Ω ∪ {c}
        else
            Λ = Λ ∪ {c}
        end if
    end for
    // Cross product of artefact and elements of Λ
    S1 = artefact.states × “normal”^|Ω|
    for all c ∈ Λ do
        S1 = S1 × reduceCPSS(c)
    end for
    // Cross product of elements of Ω and Λ
    S2 = “normal”
    for all c ∈ Ω do
        S2 = S2 × reduceCPSS(c)
    end for
    for all c ∈ Λ do
        S2 = S2 × reduceCPSS(c)
    end for
    // Return union of both sets
    return S1 ∪ S2
end if
Whether the time savings are really as great as expected still needs to be seen in
practice, as we have not had the time to implement the transformation yet.
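To make the recursion of Algorithm 3.1 concrete, the following Python sketch represents a cross product state as a set of (artefact, state) pairs. It is an illustration only: the class Artefact, the helper names and the flat hierarchy used in the example (a merged recal/off variable with three states above five children, one of them a merged blood pressure chain with five states) are assumptions made to match the counts quoted above, not the structure of the MATLAB toolbox.

```python
from itertools import product


class Artefact:
    """A latent variable in the poset (illustrative, not the MATLAB class)."""

    def __init__(self, name, states, channels, children=()):
        self.name = name
        self.states = list(states)      # must contain the state "normal"
        self.channels = set(channels)
        self.children = list(children)


def subtree(artefact):
    """The artefact together with all of its descendants."""
    nodes = [artefact]
    for child in artefact.children:
        nodes.extend(subtree(child))
    return nodes


def cross(*factors):
    """Cross product of sets of partial assignments (frozensets of pairs)."""
    return {frozenset().union(*combo) for combo in product(*factors)}


def reduce_cpss(artefact):
    own = {frozenset({(artefact.name, s)}) for s in artefact.states}
    if not artefact.children:
        return own
    omega = [c for c in artefact.children if c.channels <= artefact.channels]
    lam = [c for c in artefact.children if not c.channels <= artefact.channels]
    # S1: every state of the artefact, overwritten children pinned to normal.
    pinned = {frozenset((n.name, "normal") for c in omega for n in subtree(c))}
    s1 = cross(own, pinned, *(reduce_cpss(c) for c in lam))
    # S2: artefact normal, all children range over their reduced state spaces.
    normal = {frozenset({(artefact.name, "normal")})}
    s2 = cross(normal, *(reduce_cpss(c) for c in omega + lam))
    return s1 | s2


ALL = {"HR", "HS", "SO", "OX", "CO", "TC", "TP", "BS", "BD", "BM", "RR"}
leaves = [
    Artefact("zeroSO", ["normal", "zero"], {"SO", "HS"}),
    Artefact("zeroHR", ["normal", "zero"], {"HR"}),
    Artefact("zeroRR", ["normal", "zero"], {"RR"}),
    Artefact("recalGas", ["normal", "s1", "s2", "s3"], {"OX", "CO"}),
    Artefact("bpChain", ["normal", "a", "b", "c", "d"], {"HR", "BS", "BD", "BM"}),
]
root = Artefact("recalOff", ["normal", "recal", "off"], ALL, leaves)
reduced = reduce_cpss(root)
print(len(reduced))
```

For this assumed hierarchy the union contains the 160 states in which the top variable is normal plus one state for each of its two non-normal realisations, which agrees with the figure of 162 given above.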
3.3 Computing marginal posterior probabilities
Given the estimates of the CG parameters µ, Σ and π, we use Bayes’ theorem to
compute posterior probabilities for every cross product state x and a given observation
Y = y:⁹

\[ P(X = x \mid Y = y) = \frac{P(Y = y \mid X = x)\, P(X = x)}{P(Y = y)} \tag{3.5} \]

where the conditional probability P(Y = y | X = x) is given by Equation 3.1.
With these posterior probabilities we can then calculate marginal posterior
probabilities by summing over latent variable states:

\[ P(X_\nu = x_\nu \mid Y = y) = \sum_{\delta \in \Delta} P(X_\delta = x_\delta \mid Y = y), \quad \text{where } \nu \in \Delta \tag{3.6} \]

In words, P(X_ν = x_ν | Y = y) is the probability of a single artefact ν being in state x_ν
given the observation y.
Then, chiefly for visualisation purposes, it is also helpful to compute the probability
of an artefact being present, which is given by 1 − P(X_ν = “normal” | Y = y).
As is often the case when multiplying small numbers, one has to ensure that
underflow does not occur. This can be achieved by taking logarithms, so that we sum
log-probabilities instead of multiplying probabilities. In our case the critical part is the
multiplication of the prior probabilities with the conditional probabilities because the priors are
calculated via Equation 3.4 and most of the π_δ^{x_δ} are tiny. Hence it is not unusual to get
division by zero warnings during the application of Equation 3.5.
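A minimal Python sketch of the log-domain computation follows. It assumes that log priors and log likelihoods are already available for every cross product state, normalises them with the usual log-sum-exp trick, and reads the marginalisation as a sum of the joint posterior over all cross product states that agree with the state of interest. The function names and the toy numbers are hypothetical.

```python
import math


def log_sum_exp(values):
    """log(sum(exp(v))) computed without underflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def posteriors(log_prior, log_lik):
    """Posterior of every cross product state from log prior and log likelihood."""
    log_joint = {x: log_prior[x] + log_lik[x] for x in log_prior}
    log_evidence = log_sum_exp(list(log_joint.values()))
    return {x: math.exp(v - log_evidence) for x, v in log_joint.items()}


def marginal(post, index, value):
    """Sum the posterior over all states whose component `index` equals `value`."""
    return sum(p for x, p in post.items() if x[index] == value)


# Two artefacts (zeroSO, recal) with assumed, marginally independent priors.
p_zero, p_recal = 0.05, 0.02
log_prior = {
    ("normal", "normal"): math.log((1 - p_zero) * (1 - p_recal)),
    ("zero", "normal"): math.log(p_zero * (1 - p_recal)),
    ("normal", "recal"): math.log((1 - p_zero) * p_recal),
    ("zero", "recal"): math.log(p_zero * p_recal),
}
# Assumed log likelihoods of one observation; exp() of these underflows to 0.
log_lik = {
    ("normal", "normal"): -1230.0,
    ("zero", "normal"): -1205.0,
    ("normal", "recal"): -1200.0,
    ("zero", "recal"): -1200.0,
}

post = posteriors(log_prior, log_lik)
p_recal_present = 1 - marginal(post, 1, "normal")
p_zero_present = 1 - marginal(post, 0, "normal")
```

Working directly with probabilities would fail here, because every likelihood evaluates to zero in double precision and Equation 3.5 would divide by zero; in the log domain the same observation yields well-defined posteriors.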
⁹ Please note that the marginally independent hidden random variables become conditionally dependent given an observation y, which can be easily verified using the d-separation criterion (Pearl, 1988).
Advanced marginalisation methods such as those described in Lauritzen and Jensen (1999)
are not necessary, since our model does not include continuous hidden variables.
The considerations in previous sections regarding the time complexity also hold for
this one and a cross product state space reduction would allow us to compute the
above probabilities with more artefacts in less time. At the moment, we calculate
the marginal probabilities for approximately 35 000 multivariate observations at a time
as this is more efficient than the successive computation of individual data points, at
least in MATLAB. Nevertheless, it takes about ten minutes on a recent PC (600 MHz
AMD Athlon processor, 384 MByte RAM) to produce the marginal posterior
probabilities for data sets of the above mentioned dimensions. On the other hand, this is not
too bad, since we could easily calculate the probabilities for a single data point in less
than a second, i.e. in real-time, provided that an observation is produced every second.
3.4 Experiments
In this last section of the Methods chapter we briefly specify the setup of our
experiments, including details of how we modelled the used artefacts. Before we start, let us
remark that the names we gave the artefact processes might in some cases misdescribe
the actual underlying cause; the observations, however, looked very similar, at least
to the author with his limited medical knowledge.
From the over one hundred source data sets, we imported little more than ten
from the TIME SERIES WORKBENCH into MATLAB neonate objects.¹⁰ All of
the selected sets were sufficiently big (over 6 hours), exhibited a reasonable amount
of variability and were annotated frequently. We did not choose data sets in which for
huge periods of time the channels were just completely blank or in which they behaved
entirely erratically. In other words, we tried to focus on data sets with a large number of
more or less consistent artefacts whose states were indicated rather clearly. This does
not mean that we skipped the relevant or complicated parts.
¹⁰ The import of TSW generated ASCII files containing physiological data into neonate objects can be accomplished within seconds using a method called importFromTSW.
The set of imported sources was further narrowed down to six sets on which we
computed posterior and marginal posterior probabilities as well as various data statistics.
On five of those sets, totalling 163 112 seconds or over 45 hours, a clinical expert
(Neil McIntosh) evaluated our artefact detection results. These five sets are the ones
we will describe in the following.
3.4.1 Experiment 1340 05 Nov 2001 4
The first experiment we detail is based on source data set 1340 05 Nov 2001 from
November 5th, 2001. The available physiological channels, namely heart rate (HR as well
as HS), central temperature (TC), peripheral temperature (TP), systolic blood pressure
(BS), diastolic blood pressure (BD), mean blood pressure (BM) and saturation of
oxygen (SO), were collected from a female infant who was born one day before in the
25th week of gestation. The set comprises 37 181 samples, one every second, starting
at 8:37:09 in the morning and ending at 18:56:49 in the evening.
Artefact          Realisation           Prior       Samples  Source
Zero SO / HS      Zero SO / HS          0.045668    2678     1369 22 Nov 2001
                  Normal                0.95433     22919    1340 05 Nov 2001
Drawing           Drawing Blood Gas     0.014524    540      1340 05 Nov 2001
Blood Gas         Normal                0.98548     0        -
Recalibration     Recalibration         0.022754    846      1340 05 Nov 2001
of Badger         Normal                0.97725     0        -
Recording         Recording Device Off  0.01        1703     1344 06 Nov 2001
Device Off        Normal                0.99        0        -
Table 3.1 Latent variable details for experiment 1340 05 Nov 2001 4,
showing the artefacts, their states (Realisation) and prior probabilities (Prior) as well
as the number of data points that have been available to estimate the parameters
(Samples) and the source data set from which the samples come.
Although there are almost certainly other artefacts present in this data set, we modelled
the four apparent ones:
1. Drops to 0 in the channels SO and HS, shortly referred to as artefact zeroSO;
2. Drawing blood gas from the radial arterial line (abg), causing the channels HR,
BS, BD and BM to steadily increase;
3. Recalibration or time test of the recording device called Badger (recal),
setting all available channels to zero apart from the temperatures which change to
20 Celsius;
4. Temporary failure of the recording device in which all channels (including TC
and TP) are zero (off).
Details can be found in Table 3.1, where upper artefacts are overwritten by lower ones
(this is always the case in the following tables).
Please note that the samples of zeroSO and off come from a different source; this
is fine since the samples contain only zeros anyway and the latter artefact does
actually not occur in source 1340 05 Nov 2001, but has been modelled for reasons of
consistency.
3.4.2 Experiment 1344 12 Nov 2001 5
The second experiment we carried out uses the source 1344 12 Nov 2001, from
November 12th, 2001. The male infant monitored in this data set was born six days earlier, in
the 26th week of gestation. The period of time for which data from the channels HR,
HS, SO, TC, TP, BS, BD, BM as well as from two further channels, oxygen (OX) and
carbon dioxide (CO), was available is 8:32:54 to 18:38:30, totalling 36 337 samples
on a second by second basis.
Apart from the artefacts described in subsection 3.4.1, we modelled the recalibration
or relocation¹¹ of the combined oxygen/carbon dioxide probe, to which we refer as
recalGas. It exclusively modifies the OX and CO channel data. Table 3.2 shows more
information on this and the other four artefacts we used in this experiment.

¹¹ Recalibration and relocation of the oxygen/carbon dioxide probe share some states, so that we were in the position to model both of them with one variable, although the order of the individual artefact states might be different.

Artefact          Realisation           Prior       Samples  Source
Zero SO / HS      Zero SO / HS          0.26986     2678     1369 22 Nov 2001
                  Normal                0.73014     16167    1344 12 Nov 2001
Recalibration or  OX 0/CO 0             0.03902     1151     1344 12 Nov 2001
Relocation of     OX 20/CO 5            0.042008    1473     1344 12 Nov 2001
OX/CO Probe       OX high/CO low        0.014624    543      1344 12 Nov 2001
                  Normal                0.90435     0        -
Drawing           Drawing Blood Gas     0.022869    831      1344 12 Nov 2001
Blood Gas         Normal                0.97713     0        -
Recalibration     Recalibration         0.0040426   115      1369 22 Nov 2001
of Badger         Normal                0.99596     0        -
Recording         Recording Device Off  0.01        1703     1344 06 Nov 2001
Device Off        Normal                0.99        0        -
Table 3.2 Latent variable details for experiment 1344 12 Nov 2001 5,
showing the artefacts, their states (Realisation) and prior probabilities (Prior) as well
as the number of data points that have been available to estimate the parameters
(Samples) and the source data set from which the samples come.
Again there are some artefacts which were constructed using data from other sources.
This can be done as long as these data samples are accurately defined. Artefact states
whose samples are relative to the source’s normal state have to be created from the data
set they are a part of. Ways to circumvent this restriction are very briefly discussed in
chapter 5.
3.4.3 Experiment 1369 22 Nov 2001 7
This experiment is based on source 1369 22 Nov 2001 from November 22nd, 2001.
The data was collected from a male neonate born the day before, who was monitored
on the same channels as in experiment 1344 12 Nov 2001 5, from 8:39:44 to 16:33:50,
which results in a total of 28 447 samples. The boy was born in the 24th week of
gestation.
Artefact          Realisation           Prior       Samples  Source
Zero SO / HS      Zero SO / HS          0.09414     2678     1369 22 Nov 2001
                  Normal                0.90586     9357     1369 22 Nov 2001
Recalibration or  OX 0/CO 0             0.03902     1110     1369 22 Nov 2001
Relocation of     OX 20/CO 5            0.042008    1195     1369 22 Nov 2001
OX/CO Probe       OX high/CO low        0.014624    416      1369 22 Nov 2001
                  Normal                0.90435     0        -
Endotracheal      HR low/BP rising      0.0023904   68       1369 22 Nov 2001
Suction           BP high               0.039582    605      1369 22 Nov 2001
                  Normal                0.95803     0        -
Drawing           Drawing Blood Gas     0.0067142   191      1369 22 Nov 2001
Blood Gas         Normal                0.99329     0        -
Recalibration of  HR 0/BS low/BD high   0.0018983   54       1369 22 Nov 2001
BP transducer     BP 0                  0.00049214  98       1369 22 Nov 2001
                  HR 0/BP 0             0.0031989   91       1369 22 Nov 2001
                  HR 0/BP maximal       0.00028122  80       1369 22 Nov 2001
                  Normal                0.99413     0        -
Recalibration     Recalibration         0.0040426   115      1369 22 Nov 2001
of Badger         Normal                0.99596     0        -
Recording         Recording Device Off  0.01        1703     1344 06 Nov 2001
Device Off        Normal                0.99        0        -
Table 3.3 Latent variable details for experiment 1369 22 Nov 2001 7,
showing the artefacts, their states (Realisation) and prior probabilities (Prior) as well
as the number of data points that have been available to estimate the parameters
(Samples) and the source data set from which the samples come.
Here we introduce two further artefacts, endotracheal suctioning (endoSuc) and the
recalibration of the blood pressure transducer (recalBP). Both were modelled on HR,
BS, BD and BM channel data. Thus we use seven different artefacts in this experiment:
endoSuc, recalBP, off, recal, abg, recalGas and zeroSO. Table 3.3 summarises the
artefact model details.
3.4.4 Experiment 1355 14 Nov 2001 8
As a consequence of the presence of an additional channel, the respiratory rate RR,
we introduce a new artefact below. First, let us outline the source of this experiment,
1355 14 Nov 2001. The male baby, born two days before the source was
recorded, was monitored for 36 072 seconds from 8:21:52 to 18:23:03. At the time of the
recording, his mother was in the 29th week of gestation.
Artefact          Realisation           Prior       Samples  Source
Zero SO / HS      Zero SO / HS          0.0021346   2678     1369 22 Nov 2001
                  Normal                0.99787     15679    1355 14 Nov 2001
Zero RR           Zero RR               0.065092    6095     1344 06 Nov 2001
                  Normal                0.93491     0        -
Zero HR           Zero HR               0.0036871   208      1355 14 Nov 2001
                  Normal                0.99631     0        -
Recalibration or  OX 0/CO 0             0.081282    2932     1355 14 Nov 2001
Relocation of     OX 20/CO 5            0.024312    877      1355 14 Nov 2001
OX/CO Probe       OX high/CO low        0.0072633   262      1355 14 Nov 2001
                  Normal                0.88714     0        -
Drawing           Drawing Blood Gas     0.014859    536      1355 14 Nov 2001
Blood Gas         Normal                0.98514     0        -
Recalibration of  BS low/BD high        0.00080395  58       1355 14 Nov 2001
BP transducer     BP 0                  0.0012198   55       1355 14 Nov 2001
                  BP maximal            0.00011089  56       1355 14 Nov 2001
                  Normal                0.99787     0        -
Recalibration     Recalibration         0.0040426   1618     1344 06 Nov 2001
of Badger         Normal                0.99596     0        -
Recording         Recording Device Off  0.01        1618     1344 06 Nov 2001
Device Off        Normal                0.99        0        -
Table 3.4 Latent variable details for experiment 1355 14 Nov 2001 8,
showing the artefacts, their states (Realisation) and prior probabilities (Prior) as well
as the number of data points that have been available to estimate the parameters
(Samples) and the source data set from which the samples come.
As the respiratory rate showed similar drop outs as the oxygen saturation, we created
an artefact (zeroRR) to track them. Beyond that we also had to change the artefacts
recalBP¹², abg and endoSuc, all of which usually influence HR. In this data set,
however, there was no change in HR to be found for typical patterns in the BS, BD and
BM channels. Hence, we excluded HR from those models and built a new artefact
to cover the still existing, but seemingly independent HR drop outs to zero (zeroHR).
Moreover, we changed recal and off to include the RR channel. For details on the
eight artefact models, see Table 3.4.
3.4.5 Experiment 1369 21 Nov 2001 9
The final experiment was based on source 1369 21 Nov 2001, in which the eleven
channels HR, HS, SO, OX, CO, TC, TP, BS, BD, BM and RR were present. The
artefacts modelled differ from previous ones in the following way:
• abg and endoSuc were again modelled on HR, BS, BD and BM, whereas recalBP
was not seen to alter HR so that we modelled it on BS, BD and BM only. We also
remark that the recalBP variable might not necessarily represent a recalibration
of the blood pressure transducer; it might well be that the pattern is caused by
the starting procedure of the recording device.
• endoSuc was modelled without the patterns of high blood pressure caused by
preceding suctions.
The nine artefact model details are illustrated by Table 3.5.
The source 1369 21 Nov 2001 is from the same baby as in experiment
1369 22 Nov 2001 7, but from the previous day—the day he was born. Altogether,
25 075 samples on a one second basis are available, starting at 12:21:54 and ending at
19:19:48.
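The sample counts quoted for the five sources follow directly from the recording times, given one sample per second and counting both the first and the last second. A short Python check (the helper name n_samples is hypothetical; the times are those stated in subsections 3.4.1 to 3.4.5):

```python
from datetime import datetime

# Start and end time of each recording as given in the text.
recordings = {
    "1340 05 Nov 2001": ("8:37:09", "18:56:49"),
    "1344 12 Nov 2001": ("8:32:54", "18:38:30"),
    "1369 22 Nov 2001": ("8:39:44", "16:33:50"),
    "1355 14 Nov 2001": ("8:21:52", "18:23:03"),
    "1369 21 Nov 2001": ("12:21:54", "19:19:48"),
}


def n_samples(start, end, fmt="%H:%M:%S"):
    """Number of one-second samples between two times of day, inclusive."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds()) + 1


counts = {name: n_samples(*times) for name, times in recordings.items()}
total = sum(counts.values())
print(counts, total, round(total / 3600, 1))
```

The individual counts agree with the figures given for each experiment, and their sum is the 163 112 seconds (a little over 45 hours) evaluated by the clinical expert.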
¹² Please note that recalBP might in this experiment as well refer to flushing the line.
Artefact          Realisation           Prior       Samples  Source
Zero SO / HS      Zero SO / HS          0.16403     2678     1369 22 Nov 2001
                  Normal                0.83597     2851     1369 21 Nov 2001
Zero HR           Zero HR               0.1324      208      1355 14 Nov 2001
                  Normal                0.8676      0        -
Zero RR           Zero RR               0.16634     6095     1344 06 Nov 2001
                  Normal                0.83366     0        -
Recalibration or  OX 0/CO 0             0.03902     1110     1369 22 Nov 2001
Relocation of     OX 20/CO 5            0.042008    1195     1369 22 Nov 2001
OX/CO Probe       OX high/CO low        0.014624    416      1369 22 Nov 2001
                  Normal                0.90435     0        -
Endotracheal      HR low/BP rising      0.0063011   158      1369 21 Nov 2001
Suction           Normal                0.9937      0        -
Drawing           Drawing Blood Gas     0.030708    770      1369 21 Nov 2001
Blood Gas         Normal                0.96929     0        -
Recalibration of  BP maximal            0.00055833  70       1369 21 Nov 2001
BP transducer     BP 0                  0.0010768   108      1369 21 Nov 2001
                  Normal                0.99836     0        -
Recalibration     Recalibration         0.0015155   1618     1344 06 Nov 2001
of Badger         Normal                0.99848     0        -
Recording         Recording Device Off  0.054477    1618     1344 06 Nov 2001
Device Off        Normal                0.94552     0        -
Table 3.5 Latent variable details for experiment 1369 21 Nov 2001 9,
showing the artefacts, their states (Realisation) and prior probabilities (Prior) as well
as the number of data points that have been available to estimate the parameters
(Samples) and the source data set from which the samples come.
Chapter 4
Results
In this chapter we present the results of the experiments we carried out. Due to the
vast amounts of data that had to be analysed, our strategy is twofold. On the one hand,
we pick concrete examples to discuss the quality of the returned marginal posterior
probabilities of various underlying artefact processes. In addition, we show plots of these
probabilities to illustrate our examples and to give the reader a feeling for the types of
problems at hand. On the other hand, we present more quantitative—and therefore less
subjective—measures of the performance of the individual artefact models, including
accuracy and the area under the receiver operating characteristic (ROC) curve.
4.1 Experiment 1340 05 Nov 2001 4
From subsection 3.4.1 we recall that we have modelled four different artefacts in
experiment 1340 05 Nov 2001 4, viz drop outs in the oxygen saturation (SO), drawing
blood gas from the arterial line and two further processes: recal, whose observed
channel values are zero except for the temperatures, and off, where the temperatures,
as well as the other channels, are zero.
Although we did not model some oddities in the HR, HS and SO channels in this
experiment, the overall quality of the data set is very good. There are only a limited number
of artefacts, and although their patterns vary, they are clearly shaped. Thus
the main focus of this first experiment is to determine how well the static CG model
can do under almost ideal conditions.
[Figure 4.1 plot. Panels: HS [bpm], SO [%], marginal posterior of Zero SO / HS. Baby 1340, born 4 November 2001 at 14:27; 05/11/2001, 8:38 to 9:04.]
Figure 4.1 Drop outs in the physiological channels HS and SO together with the
marginal posterior probability of zeroSO for a selected period of time. The last
third of the example illustrates the explaining away effect, as the recording device
is recalibrating at that time.
Let us begin with the zeroSO artefact. From our experience and intuition, as well as
from the results below, drop outs to zero values, no matter in which channel
they occur, are amongst the artefactual patterns whose cause we can infer most
reliably. Figure 4.1 shows approximately 27 minutes of the HS and SO channels
together with the marginal posterior probability for zeroSO. All but the last zero drop
out seem to be recognised perfectly. The low marginal posterior probability
(circa 0.05) between 8:55:40 and 9:05 is a consequence of the structure of the
CG model. We noted in the previous chapter that the latent variables
become conditionally dependent once we observe some physiological data. So finding
that another artefact is very probably responsible for the observed pattern
renders zeroSO's responsibility for this particular observation less credible. This
phenomenon is most commonly referred to as "explaining away" (Pearl (1988); Williams
(2002)).
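The explaining away effect can be reproduced with a toy model of two binary causes. The sketch below is only an illustration under assumed numbers: the priors and likelihoods are hypothetical and are not taken from the thesis, and the brute-force enumeration stands in for the inference in the CG model.

```python
from itertools import product

# Hypothetical priors that each artefact is active (illustration only).
PRIOR = {"zeroSO": 0.05, "recal": 0.02}


def lik_so_zero(zero_so, recal):
    """P(SO reads zero | artefact states): either artefact forces a zero."""
    return 0.99 if (zero_so or recal) else 0.001


def lik_others_zero(recal):
    """P(the other non-temperature channels read zero | recal)."""
    return 0.99 if recal else 0.001


def posterior_zero_so(observe_others_zero):
    """Marginal posterior P(zeroSO = 1 | evidence) by enumeration."""
    num = den = 0.0
    for z, r in product([0, 1], repeat=2):
        p = PRIOR["zeroSO"] if z else 1 - PRIOR["zeroSO"]
        p *= PRIOR["recal"] if r else 1 - PRIOR["recal"]
        p *= lik_so_zero(z, r)          # evidence: SO is zero
        if observe_others_zero:
            p *= lik_others_zero(r)     # extra evidence pointing at recal
        den += p
        if z:
            num += p
    return num / den


p_so_only = posterior_zero_so(False)
p_with_recal_evidence = posterior_zero_so(True)
print(f"P(zeroSO | SO = 0)                     = {p_so_only:.3f}")
print(f"P(zeroSO | SO = 0, other channels = 0) = {p_with_recal_evidence:.3f}")
```

Once the other channels are also observed at zero, recal accounts for the zero in SO and the posterior of zeroSO falls, although nothing about SO itself has changed.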
So what is it then that makes the presence of zeroSO less likely? It is a recalibration
of the recording device named Badger (after Peter Badger; see footnote 1). Figure 4.2 shows the
same incident on a slightly larger time scale for all recorded channels, together with
the marginal posterior probabilities for recal and off. As all channels except TC and
TP are zero from 8:56 to 9:10, it is of course most likely recal that causes SO to be
zero. Note that the temperatures are actually at 20 °C and not at
0, as the plot might suggest.
[Figure 4.2 plot. Panels: HR [bpm], HS [bpm], SO [%], TP [°C] and TC [°C], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Recalibration of Badger and Recording Device Off. Baby 1340, born 4 November 2001 at 14:27; 05/11/2001, 8:40 to 9:20.]
Figure 4.2 Example of a recalibration of the Badger system including the
inferred marginal posterior probabilities for the artefacts recal and off.
For these three artefacts there were no further complications in the data set. zeroSO
seems to model drop outs in SO correctly (including the 14 minute period starting at
8:56), and so do recal and off in their domains. Note, though, that no pattern in this
data set showed typical off values.
1 As we have remarked before, the recalibration event we refer to might as well be a time test of the same device; but because "ground truth" was not available for this event, we cannot be precise about it.
[Figure 4.3 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posterior of Taking Blood Gas. Baby 1340, born 4 November 2001 at 14:27; 05/11/2001, 9:00 to 18:30.]
Figure 4.3 Plot of the physiological channels that exhibit the patterns
identified with drawing blood gas (abg). The spikes in BS, BD and BM and the
simultaneous HR drop outs at 9:50, 15:30 and 17:25 are clear examples of this artefact's
pattern.
The results are slightly less convincing for the fourth latent variable we examined in
this experiment: the drawing of blood gas (abg). From Figure 4.3 we can see that,
especially after 17:30, the artefact model incorrectly classifies some variations in
HR, BS, BD and BM. From the magnified view in Figure 4.4 we can see more clearly
that the low HR is most likely the reason for this misinterpretation. Here we could
have exploited the explaining away effect, for instance by modelling another artefact
that makes abg less probable.
The real causes (see footnote 2) of the patterns for which abg's marginal posterior
probabilities are misleadingly high are the following:
10:16:29: an unknown artefact in channel HR.
11:02:50: the HR drop was possibly due to emptying the water from the ventilator trap.
16:04 – 16:18: the incubator doors were open and the doctors were preparing for extubation.
17:30 – 18:20: the incubator doors were open again and the doctors were intubating
the baby or using a hand bag.
2 As far as those causes can be inferred a posteriori.
[Figure 4.4 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posterior of Taking Blood Gas. Baby 1340, born 4 November 2001 at 14:27; 05/11/2001, 17:22 to 17:40.]
Figure 4.4 Detailed plot of abg's marginal posterior probabilities including two
incorrectly marked regions (17:33 and 17:38).
To quantify our qualitative findings we calculated the numbers of true and false positives,
the true and false positive rates, the accuracy and the area under the receiver operating
characteristic (ROC) curve (Hanley and McNeil (1982); Hand et al. (2001)) for all
artefacts in all experiments. We computed those values for two thresholds,
θ = 0.1 and θ = 0.98; that is, we classify every data point whose marginal
posterior probability is greater than the threshold as the outcome of an underlying
artefact process. In other words, we are once quite restrictive (0.98) and once
rather accommodating (0.1) as to which points are considered artefactual.
The necessary labels were constructed from the annotations available within
the TSW as well as from notes the author took during the evaluation by Neil McIntosh.
The results for experiment 1340 05 Nov 2001 4 are given in Table 4.1 and Table 4.2.
Interestingly, the explaining away effect is also visible there, as the true positive rate
for zeroSO is relatively low at 49.7%. The overall results are excellent.
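The evaluation just described can be sketched in a few lines. This is a minimal illustration rather than the code used for the thesis: the function names and the toy data are hypothetical, and the AUC is computed as the rank (Mann-Whitney) statistic, whose equivalence to the area under the ROC curve is the subject of Hanley and McNeil (1982). The text does not say how the AUC was actually computed.

```python
def confusion_counts(posteriors, labels, theta):
    """Count (TP, FP, TN, FN) when a point is flagged if posterior > theta."""
    tp = fp = tn = fn = 0
    for p, y in zip(posteriors, labels):
        flagged = p > theta
        if flagged and y:
            tp += 1
        elif flagged and not y:
            fp += 1
        elif not flagged and not y:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """TP rate, FP rate and accuracy; a rate is None if it is undefined."""
    tpr = tp / (tp + fn) if tp + fn else None   # no positives at all
    fpr = fp / (fp + tn) if fp + tn else None
    acc = (tp + tn) / (tp + fp + tn + fn)
    return tpr, fpr, acc


def auc(posteriors, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [p for p, y in zip(posteriors, labels) if y]
    neg = [p for p, y in zip(posteriors, labels) if not y]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy marginal posteriors and ground-truth labels (hypothetical values).
posteriors = [0.99, 0.05, 0.6, 0.02, 0.3, 0.995]
labels = [1, 1, 1, 0, 0, 1]

for theta in (0.1, 0.98):
    counts = confusion_counts(posteriors, labels, theta)
    print(theta, counts, rates(*counts))
print("AUC:", auc(posteriors, labels))
```

The AUC does not depend on the threshold, which is why the AUC columns of the tables for θ = 0.1 and θ = 0.98 are identical. An artefact without any positive labels has an undefined true positive rate and AUC, as for off in the tables below.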
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     844    0     49.7         0.0          97.7          1
abg        542    304   99.1         0.8          99.2          0.995
recal      854    0     100.0        0.0          100.0         1
off        0      0     n/a          0.0          100.0         n/a
Total      2240   304   82.93        0.21         99.2          0.998
Table 4.1 Summary of the artefact detection performance analysis for the source
1340 05 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.1.
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     844    0     49.7         0.0          97.7          1
abg        537    217   98.2         0.6          99.4          0.995
recal      854    0     100.0        0.0          100.0         1
off        0      0     n/a          0.0          100.0         n/a
Total      2235   217   82.63        0.15         99.3          0.998
Table 4.2 Summary of the artefact detection performance analysis for the source
1340 05 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.98.
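A note on reading the Total rows: the printed rates appear to be unweighted means of the per-artefact values rather than rates computed from pooled counts. This is an inference from the numbers in Table 4.1 and Table 4.2, not something the text states. The short check below uses the rounded table entries, so only approximate agreement can be expected.

```python
# Per-artefact TP rates (%) copied from Table 4.1 and Table 4.2.
# off is left out because its TP rate is undefined (no positive labels).
tp_rate_theta_01 = {"zeroSO": 49.7, "abg": 99.1, "recal": 100.0}
tp_rate_theta_098 = {"zeroSO": 49.7, "abg": 98.2, "recal": 100.0}


def unweighted_mean(rates):
    """Mean over artefacts, each artefact counting equally."""
    return sum(rates.values()) / len(rates)


mean_01 = unweighted_mean(tp_rate_theta_01)
mean_098 = unweighted_mean(tp_rate_theta_098)
print(f"theta = 0.1 : {mean_01:.2f}  (Table 4.1 reports 82.93)")
print(f"theta = 0.98: {mean_098:.2f}  (Table 4.2 reports 82.63)")
```

Under this reading, an artefact with few samples influences the Total rate as much as one with many samples.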
4.2 Experiment 1344 12 Nov 2001 5
In this experiment we concentrate on the recalibration or relocation of the gas probe
artefact. We begin all the same by looking at another plot of the recal and off
artefacts (Figure 4.5). Again, the plotted period of time is the only one that exhibits
features interesting with regard to these processes. But because off is not present at all
and recal classifies all instances correctly, as can be seen from both Table 4.3 and
Table 4.4, we turn to the next artefact.
[Figure 4.5 plot. Panels: HR [bpm], HS [bpm], SO [%], TP [°C] and TC [°C], OX [kPa], CO [kPa], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Recalibration of Badger and Recording Device Off. Baby 1344, born 6 November 2001 at 12:36; 12/11/2001, 8:57 to 9:13.]
Figure 4.5 This plot shows the only three interesting patterns present in source
1344 12 Nov 2001 with regard to recal and off.
So let us consider the recalGas artefact, which modifies the oxygen and carbon dioxide
channels. Figure 4.6 shows its performance on the entire data set
1344 12 Nov 2001, whereas Figure 4.7 chiefly illustrates the marginal posterior
probabilities for the three distinct non-normal states. From the latter we can see that the
different states are modelled quite well, partly because we constructed the artefact
from the same source. The marginals shown also visualise which data we used to
construct the artefact.
Nevertheless, there was one problem with recalGas's marginals: at 10:28 and 10:38
air is getting under the probe, which is why the probe is in fact relocated at 10:40.
On both occasions we observe high marginal posteriors, which is incorrect.
One can also spot further explaining away effects at 9:10:40 and 9:12:36 due to a time
test of the recording device (compare Figure 4.5). Although these effects are indeed
desirable, they negatively influence the true positive rate.
[Figure 4.6 plot. Panels: OX [kPa], CO [kPa], marginal posterior of Recalibration or Relocation of OX/CO Probe. Baby 1344, born 6 November 2001 at 12:36; 12/11/2001, 9:00 to 18:30.]
Figure 4.6 Marginal posterior probabilities for the artefact process "Recalibration
or relocation of the gas probe". The high probabilities at 10:28 and 10:38, when
air is getting under the probe, are the only wrong inferences made on this data set.
[Figure 4.7 plot. Panels: OX [kPa], CO [kPa], marginal posteriors of the states OX 0/CO 0, OX 20/CO 5 and OX high/CO low, and of Recalibration or Relocation of OX/CO Probe. Baby 1344, born 6 November 2001 at 12:36; 12/11/2001, 15:16 to 15:40.]
Figure 4.7 Marginals for all states of recalGas. The inferred states visualise
quite exactly which data we used to construct the distinct non-normal states of
recalGas.
[Figure 4.8 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posterior of Taking Blood Gas. Baby 1344, born 6 November 2001 at 12:36; 12/11/2001, 9:00 to 18:30.]
Figure 4.8 Marginal posterior probabilities of the abg artefact for the complete
source 1344 12 Nov 2001. Although the number of false positives is more than
half the number of true positives (Table 4.3), the situation looks worse than it
actually is, especially if we recall that the CG model does not include the temporal
evolution of the signals, which is fairly important in this case.
For abg the situation is worse (Figure 4.8 and Figure 4.9). Often the process of
flushing the line is classified as drawing blood gas (10:23:20 to 10:23:50 and 18:20:00
to 18:20:40 as well as 15:58 to 15:59, where the probe was additionally turned off).
Moreover, there are numerous short regions of high probability due to high blood
pressure values and simultaneous low heart rate. The reason why the results are rather bad
is that abg is the only modelled process that modifies HR, BS, BD and BM.
Therefore abg is quite likely to be declared responsible for any variability in those channels,
especially because the normal state is constructed in a way that results in small
variances in its channel data. As we will see in the next section, where we included two
further artefact processes which alter HR, BS, BD and BM, abg's marginal posterior
probabilities become more accurate.
[Figure 4.9 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posterior of Taking Blood Gas. Baby 1344, born 6 November 2001 at 12:36; 12/11/2001, 10:17 to 10:35.]
Figure 4.9 Zoomed plot to illustrate the abg artefact, where the first region of
high probability (10:16:50 – 10:22:30) actually corresponds to the process of
blood gas being taken, the next period starting shortly before 10:23 is caused by
flushing the line of the probe and the last spikes at 10:34 are perhaps due to the
variability in heart rate (HR). Thus, the latter two could be misleadingly classified
as drawing blood gas.
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     9803   0     100.0        0.0          100.0         1
recalGas   5571   517   98.3         1.7          98.3          0.988
abg        780    530   97.9         1.5          98.5          0.989
recal      147    0     100.0        0.0          100.0         1
off        0      0     n/a          0.0          100.0         n/a
Total      16301  1047  99.03        0.64         99.4          0.994
Table 4.3 Summary of the artefact detection performance analysis for the source
1344 12 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.1.
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     9656   0     98.5         0.0          99.6          1
recalGas   5536   483   97.7         1.6          98.3          0.988
abg        770    365   96.6         1.0          98.9          0.989
recal      147    0     100.0        0.0          100.0         1
off        0      0     n/a          0.0          100.0         n/a
Total      16109  848   98.18        0.52         99.4          0.994
Table 4.4 Summary of the artefact detection performance analysis for the source
1344 12 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.98.
The mean duration of the recalGas artefact was found to be 944.8333 seconds, or
15 minutes and 44 seconds, in this experiment. Drawing blood gas took on average
4 minutes and 25 seconds, whereas the time tests of the Badger system have a mean
duration of 29.4000 seconds, which is far shorter than in our first experiment, where
this procedure took exactly 854 seconds. This might indicate two different
causes that generate the same channel values, albeit with different durations. The static
CG model examined in this thesis cannot cope with the temporal aspects of
patterns like these, which is one of the limitations that should certainly be addressed in the future.
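Mean durations of this kind can be obtained from a per-sample label sequence by averaging the lengths of contiguous runs. The sketch below is an assumption about the procedure, not the author's code: the helper names are hypothetical, and it assumes one sample per second, which the match between recal's 854 true positives and its 854-second duration in the first experiment suggests but the text does not state.

```python
from itertools import groupby


def run_lengths(labels):
    """Lengths of the maximal runs of consecutive active (truthy) labels."""
    return [sum(1 for _ in group)
            for active, group in groupby(labels, key=bool) if active]


def mean_duration_seconds(labels, sample_period_s=1.0):
    """Mean episode duration in seconds, or None if there is no episode."""
    runs = run_lengths(labels)
    if not runs:
        return None
    return sample_period_s * sum(runs) / len(runs)


def as_min_sec(seconds):
    """Split a duration into whole minutes and whole seconds (truncating)."""
    return divmod(int(seconds), 60)


# Toy label sequence (hypothetical): three episodes of 3, 1 and 2 samples.
labels = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(run_lengths(labels), mean_duration_seconds(labels))
print(as_min_sec(944.8333))   # the recalGas mean duration quoted above
```

Truncating rather than rounding the seconds reproduces the "15 minutes and 44 seconds" quoted for 944.8333 seconds.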
4.3 Experiment 1369 22 Nov 2001 7
As we mentioned before, in this third experiment we introduce two new artefact
processes whose influence is present in the heart rate and blood pressure signals. Our
hope is that the larger number of artefacts whose patterns are observed on shared
physiological channels actually reduces the number of wrongly marked regions.
Before we go into more detail regarding this issue, we briefly state the
major problems of the remaining artefacts. For zeroSO, recal and off the situation is
still the same as in the previous experiments. That is, off is not present at all,
recal's true positive rate is 99.2% without any false positives, and zeroSO suffers
only from explaining away, at least with regard to Table 4.5 and Table 4.6.
In addition, we remark that the values in both tables are equal for
those three artefacts, which means that the marginal posterior probabilities are above 0.98
for all true positives and below 0.1 for all false negatives. In other words, the
computed marginals were very decisive for these artefacts.
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     2565   0     95.6         0.0          99.6          1
recalGas   2797   52    99.3         0.2          99.7          1
endoSuc    86     2079  73.5         7.3          92.6          0.875
abg        180    35    87.4         0.1          99.8          0.984
recalBP    184    56    57.5         0.2          99.3          0.776
recal      117    0     99.2         0.0          100.0         0.994
off        0      0     n/a          0.0          100.0         n/a
Total      5929   2222  85.40        1.12         98.7          0.938
Table 4.5 Summary of the artefact detection performance analysis for the source
1369 22 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.1.
Not much can be said about the next artefact we consider, recalGas, either. The
marginal posterior probabilities are very sharply peaked twice, before and after the recal
pattern. This is because the oxygen and carbon dioxide channels drop to zero a little
earlier than the remaining channels and normalise later, which results in a high
probability of being in artefact state one, "OX 0/CO 0".
Then there is another region with a high probability of the artefact being present
although this is not the case. At approximately 14:47, air is perhaps getting under the probe,
leading to high oxygen pressures. These are accommodated in the artefact's third state,
which essentially models patterns of high variability. As a recalibration or relocation
of the gas probe most likely does not start with this state, a model which includes the
temporal evolution of the signals should be able to recognise that fact.
Artefact   TP     FP    TP rate (%)  FP rate (%)  Accuracy (%)  AUC
zeroSO     2565   0     95.6         0.0          99.6          1
recalGas   2772   4     98.4         0.0          99.8          1
endoSuc    79     1033  67.5         3.6          96.2          0.875
abg        98     0     47.6         0.0          99.6          0.984
recalBP    141    13    44.1         0.0          99.3          0.776
recal      117    0     99.2         0.0          100.0         0.994
off        0      0     n/a          0.0          100.0         n/a
Total      5772   1050  75.39        0.53         99.2          0.938
Table 4.6 Summary of the artefact detection performance analysis for the source
1369 22 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.98.
Again, it is interesting to take a look at the tables. Now there is a difference between
the two thresholds θ = 0.1 and θ = 0.98. In the first case, the number of false positives
is relatively high at 52, whereas there are only 4 false positives for θ = 0.98.
Hence the marginals are very decisive for those 4 points, which might be
caused by the sharp peaks mentioned above. They are far less decisive
for the region around 14:47.
Now let us focus on the rest, endoSuc, abg and recalBP, all of which are modelled
as modifying the same channels: HR, BS, BD and BM. From our observations in the
data, we assumed that the order of the artefacts in the previous sentence is their order in
the argument list of the conditionalGaussian class constructor, i.e. endoSuc is
overwritten by abg, which is itself overwritten by recalBP (see footnote 3). In addition, let us recall that
the goal of this experiment was in principle to investigate the interrelationship between
artefacts which alter the same channels. In particular, we hope that the introduction of
variables other than abg might improve its accuracy, especially by reducing the number of regions where
moderately variable heart rate or blood pressure leads to wrong associations.
3 The tables in this chapter also indicate the order of an artefact in the poset. Upper items in the list are overwritten by lower ones, as in the previous chapter's tables.
[Figure 4.10 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Endotracheal Suction, Taking Blood Gas and Recalibration of BP transducer. Baby 1369, born 21 November 2001 at 12:21; 22/11/2001, 9:00 to 16:30.]
Figure 4.10 Marginal posterior probabilities for the artefacts endotracheal
suctioning, drawing blood gas and recalibrating the blood pressure transducer. The low
accuracy of endoSuc is chiefly due to a lack of specificity of the data model. See
text for details.
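The overwrite order mentioned above can be pictured as a simple precedence rule on a shared channel. The helper below is purely hypothetical: it is not the interface of the conditionalGaussian class, and it treats the order as a plain list, whereas the text speaks of a poset.

```python
# Order as stated in the text: earlier entries are overwritten by later ones.
OVERWRITE_ORDER = ["endoSuc", "abg", "recalBP"]


def governing_artefact(active, order=OVERWRITE_ORDER):
    """Return the active artefact that comes last in the overwrite order.

    `active` is a collection of artefact names that are currently present.
    Returns None if no artefact from `order` is active.
    """
    winner = None
    for name in order:
        if name in active:
            winner = name   # a later entry overwrites an earlier one
    return winner


print(governing_artefact({"endoSuc", "abg"}))             # abg
print(governing_artefact({"endoSuc", "abg", "recalBP"}))  # recalBP
print(governing_artefact(set()))                          # None
```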
So let us consider one artefact and its problems after the other, starting with the drawing
of blood gas, abg. Looking at Figure 4.10, we can see that there are three blocks
with higher marginal posterior probabilities, one of which, the one in the middle, seems
to be less probable than the other two. Compared with Figures 4.3 and 4.8, there is
astonishingly little noise in this plot. This is what we were looking for.
Figure 4.11 shows the spike from 14:00 in detail. We can see that, in the period of time
where the marginal posterior of abg is higher, the observed pattern does not look that
different from the typical one, as depicted in Figure 4.12. Actually, it is very likely that
the observation is caused by an abg artefact. But why is the pattern of the marginal
posterior probability now so different from the one we saw in the previous experiments?
The answer is as follows:
1. The recalBP state "HR 0/BS low/BD high" matches the early part of the abg pattern.
2. The steadily rising pattern in BS, BD and BM which corresponds to the drawing
of blood gas is modelled with a single state, so that this state has a mean which
lies somewhere between the top and the bottom of the pattern, and its variance is
relatively large, as we estimate it from all the values, the low ones and the high
ones.
3. The explaining away effect plays a crucial part as well.
[Figure 4.11 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Endotracheal Suction, Taking Blood Gas and Recalibration of BP transducer. Baby 1369, born 21 November 2001 at 12:21; 22/11/2001, 13:59:00 to 14:04:30.]
Figure 4.11 Blood pressure transducer recalibration. Note the first pattern of the
marginal posteriors for abg and recalBP in the plot. The period from 14:04:00 to
14:04:30 might also be flushing the line.
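The second point in the list above can be made concrete with a small numerical sketch. All numbers are hypothetical: a steadily rising pressure ramp is summarised by one Gaussian, which is then compared with a narrow competing state located at the low end of the ramp.

```python
import math
import statistics


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return (-0.5 * math.log(2 * math.pi * std * std)
            - (x - mean) ** 2 / (2 * std * std))


# Hypothetical blood pressure ramp (mmHg), rising steadily from 30 to 90.
ramp = [30 + i for i in range(61)]
ramp_mean = statistics.mean(ramp)     # lies in the middle of the ramp
ramp_std = statistics.pstdev(ramp)    # large, as low and high values are pooled

# Hypothetical competing state that only covers low pressures.
low_mean, low_std = 30.0, 5.0

for x in (30, 45, 60, 90):
    ll_ramp = gaussian_logpdf(x, ramp_mean, ramp_std)
    ll_low = gaussian_logpdf(x, low_mean, low_std)
    better = "single ramp state" if ll_ramp > ll_low else "low-pressure state"
    print(f"x = {x:2d} mmHg: ramp {ll_ramp:7.2f}, low {ll_low:7.2f} -> {better}")
```

At the start of the ramp the narrow low-pressure state explains the data better than the broad single state; the single state only wins once the values have risen towards its mean.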
[Figure 4.12 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Endotracheal Suction, Taking Blood Gas and Recalibration of BP transducer. Baby 1369, born 21 November 2001 at 12:21; 22/11/2001, 11:53:40 to 11:56:40.]
Figure 4.12 Typical patterns that can be observed when blood gas is being taken
from the radial arterial line. We modelled a recalBP state ("HR 0/BS low/BD
high") similar to the beginning of the pattern (11:54 – 11:55).
As a result, the early stage of drawing blood gas is actually more likely to be classified
as a recalBP event, due to the way we constructed their state data. But the more the blood
pressures rise, the more likely abg becomes as the cause of the pattern, and the
explaining away effect therefore starts to diminish recalBP's marginal posterior
probability.
This means that recalBP is classified wrongly as abg approximately as often as abg
is classified as recalBP. Table 4.5 and Table 4.6 reflect these findings at least partially.
For example, abg's true positive rate is 87.4% for θ = 0.1, whereas it is only 47.6% for
θ = 0.98, when we cut the marginals close to their top.
Finally, let us say a few words about the endoSuc artefact. First of all, it is clear
that the marginals shown in Figure 4.10 do not look very good, and neither do the error
statistics. The main problem when modelling endotracheal suctioning is the lack of
a temporal model, since the only clear indication is a small drop in heart rate for about
half a minute, which also makes the collection of adequate samples more difficult.
Usually there are also some related effects on the blood pressures, though on a
larger time scale. In this experiment we modelled the endoSuc artefact with
two different non-normal states: the first as the short HR drop and the second as its
effect on BS, BD and BM, which is an increase.
Because of the lack of specificity, this artefact takes on the role of a "garbage collector".
This becomes particularly apparent from circa 15:30 to 16:33.
The mean durations we computed for this experiment are as follows:
recalGas: 939 seconds = 15 minutes and 39 seconds
endoSuc: 39 seconds (without aftereffects; see footnote 4)
abg: 103 seconds = 1 minute and 43 seconds
recalBP: 320 seconds = 5 minutes and 20 seconds
4 We also evaluated the endoSuc artefact without including the aftereffects, which partially explains the bad AUC values et cetera.
4.4 Experiment 1355 14 Nov 2001 8
This experiment contains little that we have not encountered before, apart
from an additional channel, the respiratory rate (RR), and we will therefore not spend
too much time on it. Since the only interesting artefact pattern we were able to identify
in conjunction with this channel is the drop out to zero, the other artefact models
stay essentially the same. abg and recalBP, however, had to be changed as they do not
modify the HR channel in this source data set. As a consequence, we also introduced
an artefact model for heart rate drop outs to zero, called zeroHR. We begin by stating
that all artefacts which model drop outs on a single channel, i.e. zeroSO, zeroRR and
zeroHR, do their job extraordinarily well (Figure 4.13), which is not really surprising
since one could also find them deterministically. Thus, when considering reductions of
the cross product state space, the removal of those artefacts, which are easy to handle in
preprocessing, should be preferred over the removal of other, more complex ones.
From Table 4.7 and Table 4.8 we see that the
accuracy is at least 99.7%, while the true positive rate of zeroHR is only 42.1%, and that of
zeroSO is even 0. The latter, in combination with the accuracy, means that there are
only a few points where the SO value plummets to zero and that the model was certain
that they are not zeroSO artefacts. Similarly, less than half of the
observations with zeros in HR are regarded as being a consequence of zeroHR being
present, which is due to the recalibration of the recording device, as can be seen from
the following simple calculation: 56/(56 + 74) ≈ 43.1%.
[Figure 4.13 plot. Panels: HR [bpm], HS [bpm], SO [%], RR [1/min], marginal posteriors of Zero SO / HS, Zero RR and Zero HR. Baby 1355, born 12 November 2001 at 17:17; 14/11/2001, 16:44:00 to 16:48:00.]
Figure 4.13 Example of zero drop outs in HR and RR. The SO signal seemed to
be measured reliably in source 1355 14 Nov 2001. As one can see, the marginal
posterior probabilities model the drop outs well, which is not really astonishing.
From 11:13 on, for about one hour, air seems to be getting under the combined
oxygen/carbon dioxide probe, which is again wrongly attributed to
recalGas. Figure 4.14 shows one of those misinterpretations,
which is modelled as the "OX high/CO low" state of recalGas. A model which
took care of the temporal evolution should be able to get rid of those
misinterpretations, as the recalGas artefact usually starts in the "OX 0/CO 0" state, as can be
seen from the second half of the previously mentioned figure. Moreover, it should be
remarked that the false positives of recalGas differ dramatically for the two different
thresholds, while the true positive rate changes very little.
[Figure 4.14 plot. Panels: OX [kPa], CO [kPa], marginal posteriors of the states OX 0/CO 0, OX 20/CO 5 and OX high/CO low, and of Recalibration or Relocation of OX/CO Probe. Baby 1355, born 12 November 2001 at 17:17; 14/11/2001, 11:40 to 13:10.]
Figure 4.14 Plot of the marginal posterior probabilities for all states of
artefact recalGas. The real cause of the misleadingly high probabilities of state "OX
high/CO low" at 11:50 is most probably air getting under the probe.
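The temporal argument can be illustrated with a generic hidden Markov model forward filter. This is not part of the system described in the thesis: the transition probabilities and likelihoods below are hypothetical, and only the static priors are taken from Table 3.5. The key assumption is that an episode can be entered only through the "OX 0/CO 0" state.

```python
STATES = ["normal", "OX 0/CO 0", "OX 20/CO 5", "OX high/CO low"]

# Static priors for the recalGas states as listed in Table 3.5.
STATIC_PRIOR = {"normal": 0.90435, "OX 0/CO 0": 0.03902,
                "OX 20/CO 5": 0.042008, "OX high/CO low": 0.014624}

# Hypothetical transitions: normal can only move on to "OX 0/CO 0".
TRANSITION = {
    "normal":         {"normal": 0.99, "OX 0/CO 0": 0.01,
                       "OX 20/CO 5": 0.0, "OX high/CO low": 0.0},
    "OX 0/CO 0":      {"normal": 0.0, "OX 0/CO 0": 0.9,
                       "OX 20/CO 5": 0.1, "OX high/CO low": 0.0},
    "OX 20/CO 5":     {"normal": 0.0, "OX 0/CO 0": 0.0,
                       "OX 20/CO 5": 0.9, "OX high/CO low": 0.1},
    "OX high/CO low": {"normal": 0.1, "OX 0/CO 0": 0.0,
                       "OX 20/CO 5": 0.0, "OX high/CO low": 0.9},
}


def normalise(weights):
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: w / total for s, w in weights.items()}


def static_posterior(lik):
    """Posterior for one time step when time is ignored."""
    return normalise({s: STATIC_PRIOR[s] * lik[s] for s in STATES})


def forward_filter(likelihoods, initial):
    """Filtered posteriors P(state_t | observations up to t)."""
    belief, out = dict(initial), []
    for t, lik in enumerate(likelihoods):
        if t > 0:   # predict step: propagate through the transition model
            belief = {s: sum(belief[r] * TRANSITION[r][s] for r in STATES)
                      for s in STATES}
        belief = normalise({s: belief[s] * lik[s] for s in STATES})
        out.append(belief)
    return out


# Hypothetical likelihoods: one normal sample, then three samples with high
# oxygen pressure, as when air gets under the probe.
LIK_NORMAL = {"normal": 0.9, "OX 0/CO 0": 0.001,
              "OX 20/CO 5": 0.001, "OX high/CO low": 0.005}
LIK_HIGH_OX = {"normal": 0.005, "OX 0/CO 0": 0.001,
               "OX 20/CO 5": 0.001, "OX high/CO low": 0.9}

start = {"normal": 1.0, "OX 0/CO 0": 0.0, "OX 20/CO 5": 0.0, "OX high/CO low": 0.0}
filtered = forward_filter([LIK_NORMAL] + [LIK_HIGH_OX] * 3, start)

p_static = static_posterior(LIK_HIGH_OX)["OX high/CO low"]
p_temporal = filtered[-1]["OX high/CO low"]
print(f"static posterior of 'OX high/CO low':   {p_static:.3f}")
print(f"filtered posterior of 'OX high/CO low': {p_temporal:.6f}")
```

Because the high-oxygen samples are not preceded by the "OX 0/CO 0" state, the filtered posterior of "OX high/CO low" stays close to zero, whereas the static posterior for the same sample is high.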
Figure 4.15 presents an overview of the marginal posterior probabilities for the abg
and recalBP artefacts. Apart from the periods of time where blood gas has really
been taken (9:05, 13:16 and 16:11), the causes of the patterns which have misleadingly
caused abg's marginal posteriors to be high are given below:
11:42:30 is an unknown artefact;
14:42 – 14:43 is possibly due to emptying water from the ventilator trap;
16:21:55 and 16:23:15 seem to be problems with adapting to high or low blood
pressures in conjunction with recalBP (its "high/high" or "low/low" states are caught
by abg, as illustrated in Figure 4.16);
17:01 – 17:06 is an unknown artefact;
17:27 – 17:39 is another unknown artefact;
18:09 – 18:23 is unknown as well.
[Figure 4.15 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Taking Blood Gas and Recalibration of BP transducer. Baby 1355, born 12 November 2001 at 17:17; 14/11/2001, 8:30 to 18:00.]
Figure 4.15 Overview of the marginal posteriors for the artefacts abg and
recalBP. Especially recalBP's model seems to work rather well here.
The recalBP artefact reveals only two very short periods of slightly higher marginal
posterior probabilities (approximately 0.6), the second of which is shown more clearly
in the first quarter of Figure 4.16.
Finally, there is a one-second drop at 10:24:56 where almost all channels (including the
temperatures) plummet to zero. But because the carbon dioxide (CO) value is normal,
this is correctly recognised as not being an off artefact. Nonetheless, we could
certainly integrate drop outs such as the one above into a new multinomial variable, together
with recal's non-normal states, as discussed in subsection 3.2.3.
[Figure 4.16 plot. Panels: HR [bpm], BS [mmHg], BD [mmHg], BM [mmHg], marginal posteriors of Taking Blood Gas and Recalibration of BP transducer. Baby 1355, born 12 November 2001 at 17:17; 14/11/2001, 16:09 to 16:24.]
Figure 4.16 Detail of the marginals in Figure 4.15.
The mean durations for experiment 1355 14 Nov 2001 8 are:
recalGas: 1364.7 seconds = 22 minutes and 44 seconds
abg: 181 seconds = 3 minutes and 1 second
recalBP: 95 seconds = 1 minute and 35 seconds
4.5 Experiment 1369 21 Nov 2001 9
The absence of theoff artefact in the last four experiments certainly leads to theques-
tion why we need to have it at all. In this experiment we demonstrate its usefulness.
Furthermore, and in fact in conjunction with theoff artefact we present a clear ex-
ample of the effects the explaining away phenomenon has on various channels, con-
74 Chapter 4. Results
Artefact TP FP TP rate (%) FP rate (%) Accuracy (%) AUC
zeroSO 0 0 0.0 0.0 99.8 0.971
zeroRR 2267 0 96.6 0.0 99.8 0.999
zeroHR 56 0 42.1 0.0 99.8 0.983
recalGas 4023 718 98.3 2.2 97.8 0.996
abg 539 1824 99.3 5.1 94.9 0.994
recalBP 92 6 96.8 0.0 100.0 0.986
recal 74 0 97.4 0.0 100.0 0.98
off 0 0 — 0.0 100.0 —
Total 7051 2548 75.77 0.92 99.0 0.987
Table 4.7 Summary of the artefact detection performance analysis for the source
1355 14 Nov 2001 showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
thresholdθ = 0.1.
Artefact   TP     FP     TP rate (%)   FP rate (%)   Accuracy (%)   AUC
zeroSO     0      0      0.0           0.0           99.8           0.971
zeroRR     2235   0      95.2          0.0           99.7           0.999
zeroHR     56     0      42.1          0.0           99.8           0.983
recalGas   4001   39     97.7          0.1           99.6           0.996
abg        487    315    89.7          0.9           99.0           0.994
recalBP    92     2      96.8          0.0           100.0          0.986
recal      74     0      97.4          0.0           100.0          0.98
off        0      0      -             0.0           100.0          -
Total      6945   356    74.13         0.13          99.7           0.987
Table 4.8 Summary of the artefact detection performance analysis for the source
1355 14 Nov 2001, showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.98.
Artefact   TP      FP     TP rate (%)   FP rate (%)   Accuracy (%)   AUC
zeroSO     4111    0      100.0         0.0           100.0          1
zeroHR     2907    523    87.6          2.4           96.3           0.961
zeroRR     4168    2      99.9          0.0           100.0          1
recalGas   11699   1883   95.9          14.6          90.5           0.986
endoSuc    140     2497   51.7          10.1          89.5           0.791
abg        815     411    69.8          1.7           97.0           0.897
recalBP    14      3461   82.4          13.8          86.2           0.85
recal      40      3      100.0         0.0           100.0          1
off        1515    0      100.0         0.0           100.0          1
Total      25409   8780   87.46         4.74          95.5           0.943
Table 4.9 Summary of the artefact detection performance analysis for the source
1369 21 Nov 2001, showing the number of true positives (TP), the number of false
positives (FP), the true positive rate (TP rate), the false positive rate (FP rate) as
well as the accuracy and the area under the ROC curve (AUC) for the evaluation
threshold θ = 0.1.
centrating on the drop-outs to zero in HR, RR and SO. We also give further examples
for abg and recalBP. In subsection 3.4.5 we stated that recalBP is modelled on BD,
BS and BM only, whereas abg and endoSuc are constructed using HR as well. Moreover,
endoSuc has been modified to include one non-normal state only, the
one which corresponds to small bowl-shaped drops in heart rate. The second state, which
was meant to take the after-effects of this decrease in HR into consideration, has therefore
been removed. Altogether, our model comprises nine distinct artefacts.
We begin our discussion of experiment 1369 21 Nov 2001 9 with off and recal.
First, and perhaps most importantly, we note that both have been classified perfectly, even
for θ = 0.98, as can be seen from Table 4.9, Table 4.10 and also from Figure 4.17.
Second, as Figure 4.17 shows, off infers the zero values correctly
from 12:22 onward to circa 12:47, the time at which the first channel’s signal takes on a
non-zero value. Unfortunately, the different channels do not return to normal at the same
time. This has major implications for the classification measures we provide, since
all of the zero values prior to the first normal ones have been included in the labels
Artefact   TP      FP     TP rate (%)   FP rate (%)   Accuracy (%)   AUC
zeroSO     2556    0      62.1          0.0           93.8           1
zeroHR     532     13     16.0          0.1           88.8           0.961
zeroRR     2601    0      62.4          0.0           93.7           1
recalGas   10923   192    89.6          1.5           94.2           0.986
endoSuc    36      331    13.3          1.3           97.7           0.791
abg        656     59     56.2          0.2           97.7           0.897
recalBP    14      3461   82.4          13.8          86.2           0.85
recal      40      3      100.0         0.0           100.0          1
off        1515    0      100.0         0.0           100.0          1
Total      18873   4059   64.66         1.88          94.7           0.943
Table 4.10 Summary of the artefact detection performance analysis for the
source 1369 21 Nov 2001, showing the number of true positives (TP), the number
of false positives (FP), the true positive rate (TP rate), the false positive rate
(FP rate) as well as the accuracy and the area under the ROC curve (AUC) for the
evaluation threshold θ = 0.98.
which help to determine those measures. Take a look at Figure 4.18. Here we can
clearly see how off and recal reduce the marginal posterior probabilities
of other artefacts, i.e. how they render them unlikely. What can hardly be seen from
Figure 4.18 is that zeroHR is below the θ = 0.1 threshold for the period off is inferred
to be present. This becomes obvious when comparing Table 4.9 and Table 4.10. In
the first table, zeroHR exhibits a far larger number of false positives than zeroSO
or zeroRR, whereas in the second table, for θ = 0.98, the false positive counts are
certainly more similar. This means that the difference in accuracy, but not in AUC, is
chiefly a consequence of zeroHR’s marginal posterior probability being greater than
θ = 0.1 when off is present. There is also an immense difference between the true
positives of zeroHR and those of zeroSO/zeroRR, as well as between the two thresholds
θ = 0.1 and θ = 0.98. For example, for θ = 0.1 the true positive rate of zeroHR is 87.6%,
more than five times the rate for θ = 0.98. We believe that this is mainly a consequence
of the abg artefact explaining parts of zeroHR’s responsibilities away. And we will see
below that the converse is true as well. Therefore artefacts that alter the
Figure 4.17 This plot shows the marginal posterior probabilities for the artefacts
off and recal. The latter differs from the former only in that the temperatures
are 20 °C and not 0. (Plot not reproduced: Baby 1369, born 21 November 2001 at
12:21; recording of 21/11/2001, 12:44 to 13:06; channels HR, HS, SO, RR, TC, TP, OX,
CO, BS, BD and BM with the marginal posteriors for Recalibration of Badger and
Recording Device Off.)
same channels in similar ways tend to interact greatly.
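The threshold-dependent measures reported in the tables of this section can be sketched as follows. This is a minimal illustration and not the evaluation code used for the experiments: the function names `detection_metrics` and `auc` are hypothetical, and we assume that an artefact is declared present whenever its marginal posterior reaches θ. The sketch also shows why the AUC is the same for both thresholds: it depends only on the ranking of the marginals, not on θ.

```python
from typing import Optional, Sequence


def detection_metrics(posteriors: Sequence[float], labels: Sequence[int], theta: float) -> dict:
    """Threshold marginal posteriors at theta and compare them with 0/1 labels."""
    if len(posteriors) != len(labels) or not labels:
        raise ValueError("posteriors and labels must be non-empty and of equal length")
    # Assumption: the artefact counts as detected when the marginal reaches theta.
    detected = [1 if p >= theta else 0 for p in posteriors]
    tp = sum(1 for y, d in zip(labels, detected) if y == 1 and d == 1)
    fp = sum(1 for y, d in zip(labels, detected) if y == 0 and d == 1)
    tn = sum(1 for y, d in zip(labels, detected) if y == 0 and d == 0)
    fn = sum(1 for y, d in zip(labels, detected) if y == 1 and d == 0)
    return {
        "TP": tp,
        "FP": fp,
        # Rates are undefined (None) when the source contains no such samples.
        "TP rate": 100.0 * tp / (tp + fn) if tp + fn else None,
        "FP rate": 100.0 * fp / (fp + tn) if fp + tn else None,
        "accuracy": 100.0 * (tp + tn) / len(labels),
    }


def auc(posteriors: Sequence[float], labels: Sequence[int]) -> Optional[float]:
    """Probability that an artefact sample is ranked above a normal one (ties count 1/2)."""
    pos = [p for p, y in zip(posteriors, labels) if y == 1]
    neg = [p for p, y in zip(posteriors, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up marginals and labels, for illustration only.
posteriors = [0.05, 0.2, 0.99, 0.6, 0.01, 0.995]
labels = [0, 0, 1, 1, 0, 1]
low = detection_metrics(posteriors, labels, theta=0.1)
high = detection_metrics(posteriors, labels, theta=0.98)
```

Raising θ from 0.1 to 0.98 in this toy example removes the single false positive at the cost of one true positive, which mirrors the trade-off visible between the two tables of each experiment.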
Next we consider recalGas in brief. Although not illustrated, there was only one
explicit problem, between 15:08 and 15:09, where the CO channel is zero and the OX
values are high (9 to 12 kPa). Its real cause is unknown. In addition, the
artefact pattern from 16:27 to 17:20 is probably not a relocation of the gas probe, although
one might infer this from the observations; it is known that the incubator was
open, and the annotations in TSW suggest that the peripheral venous line was being
removed or inserted.
Figure 4.19 visualises the marginal posterior probabilities for the three artefacts endoSuc,
abg and recalBP. This time we start with recalBP, which is, at least in this experiment,
the wrong name for the patterns we modelled. A better label would be “Start up
Figure 4.18 This is a clear example of how one artefact’s presence (off from
12:42 to 12:47 and actually recal as well around 13:05) changes another one’s
marginal posterior probability (zeroSO, zeroHR and zeroRR). This effect is commonly
referred to as “explaining away”. (Plot not reproduced: Baby 1369, born 21 November
2001 at 12:21; recording of 21/11/2001, 12:42 to 13:06; channels HR, HS, SO and RR
with the marginal posteriors for Zero SO / HS, Zero HR and Zero RR.)
of the blood pressure transducer” or something similar. Its states are reduced to two,
one where the blood pressures are close to their maximum and one where they are zero. Thus
there should not be an interaction with abg, and the period of zero values in which off
is absent leads to the given values in Tables 4.9 and 4.10. The latter is also shown in
Figure 4.19, whereas Figure 4.20 indicates the former. Overall, the true positive rate is
rather high (82.4%) because of the specificity of the model.
Now if there is no interaction between recalBP and abg, why can we observe the
typical pattern in Figure 4.20, from 17:30 to 17:31? Recalling that zeroHR is part of
the current model, we are in a position to explain this pattern. As HR is obviously
a good indicator for both zeroHR and abg, the only evidence on which the resulting
marginals might be based are the blood pressure values. And this is exactly what is
depicted in Figure 4.20.
Figure 4.19 Marginal posterior probabilities for the complete source
1369 21 Nov 2001 and the three artefacts endoSuc, abg and recalBP. Note that
we cannot really explain endoSuc’s high marginals between 13:00 and 13:30.
(Plot not reproduced: Baby 1369, born 21 November 2001 at 12:21; recording of
21/11/2001, 12:30 to 19:00; channels HR, BS, BD and BM with the marginal posteriors
for Endotracheal Suction, Taking Blood Gas and Recalibration of BP transducer.)
Hence, going back to Figure 4.19, abg’s main problems are regions with low, in particular
zero, HR values and almost normal blood pressures.
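The competition between two artefacts for the same evidence can be illustrated with a deliberately small discrete example. It is a toy sketch and not the conditional Gaussian model used in this thesis: we assume two binary causes A and B (think of off and zeroHR), hypothetical prior probabilities of 0.05 each, and an observation "HR reads zero" that occurs with certainty if either cause is present and with a small leak probability otherwise. The function name `posterior_of_cause` is ours.

```python
from itertools import product
from typing import Callable, Optional


def posterior_of_cause(prior_a: float, prior_b: float,
                       likelihood: Callable[[int, int], float],
                       evidence_a: Optional[int] = None) -> float:
    """P(B = 1 | effect observed) by enumeration, optionally also given A."""
    numerator = 0.0
    denominator = 0.0
    for a, b in product((0, 1), repeat=2):
        if evidence_a is not None and a != evidence_a:
            continue  # configuration is inconsistent with the observed state of A
        weight = ((prior_a if a else 1.0 - prior_a)
                  * (prior_b if b else 1.0 - prior_b)
                  * likelihood(a, b))
        denominator += weight
        if b == 1:
            numerator += weight
    return numerator / denominator


def zero_reading(a: int, b: int) -> float:
    """Hypothetical likelihood of a zero HR reading given the two causes."""
    return 1.0 if (a or b) else 0.001


# B on its own is a plausible explanation of the zero reading ...
p_b_alone = posterior_of_cause(0.05, 0.05, zero_reading)
# ... but once A is known to be present, B falls back to its prior.
p_b_given_a = posterior_of_cause(0.05, 0.05, zero_reading, evidence_a=1)
```

With these assumed numbers the posterior of B is slightly above one half when only the zero reading is observed, and returns to its prior of 0.05 once A is known to be present; A has explained the evidence away.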
Before we finish this chapter by looking at another set of mean artefact durations, we
briefly describe the endoSuc artefact. In general, it is very hard
to accurately model endotracheal suctionings based almost exclusively on the small
drops in heart rate without the ability to utilise temporal structures. Furthermore, the
data from which we could learn the parameters of its states were really sparse. But
the main reason for the rather disappointing results is the heart rate variability (see
Figure 4.19). Every time it is a little below normal, the probability of a suctioning
goes up and there is, in principle, nothing we can do about it without using additional
information, such as trends or floating means. More suitable, elaborate models
are briefly discussed in chapter 5. Nevertheless, Figure 4.21 illustrates that it should be
Figure 4.20 Another example of the explaining away phenomenon. The steady
rise in the first marginal posterior of abg is this time not caused by recalBP, as can
be seen from its own marginal posterior probability, but by zeroHR, which “competes”
with abg on HR. (Plot not reproduced: Baby 1369, born 21 November 2001 at 12:21;
recording of 21/11/2001, 17:29 to 17:46; channels HR, BS, BD and BM with the
marginal posteriors for Endotracheal Suction, Taking Blood Gas and Recalibration of
BP transducer.)
possible to recognise this artefact more effectively.
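One way in which a floating mean could provide the additional information mentioned above is sketched here. This is only an illustration under our own assumptions and is not part of the system described in this thesis: the helper names, the window length and the minimum drop are hypothetical, and the heart rate values are made up. The idea is to compare each sample with the mean of the preceding samples rather than with a fixed notion of "normal".

```python
from typing import List, Optional, Sequence


def trailing_mean(values: Sequence[float], window: int) -> List[Optional[float]]:
    """Mean of the `window` samples preceding each position (None until enough history)."""
    means: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        if i >= window:
            means.append(running / window)
            running -= values[i - window]  # drop the oldest sample from the window
        else:
            means.append(None)
        running += value
    return means


def drops_below_baseline(hr: Sequence[float], window: int, min_drop: float) -> List[bool]:
    """Flag samples lying at least `min_drop` below the trailing mean."""
    baseline = trailing_mean(hr, window)
    return [b is not None and (b - x) >= min_drop for x, b in zip(hr, baseline)]


# Made-up heart rate values in bpm; window and min_drop are hypothetical choices.
hr = [150, 150, 150, 150, 135, 130, 140, 150]
flags = drops_below_baseline(hr, window=4, min_drop=10)
```

Because the baseline follows the signal, a heart rate that is permanently a little below the population mean no longer raises the flag; only a drop relative to the recent past does.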
And here are some mean artefact durations as calculated from 1369 21 Nov 2001:
recalGas: 3048.3 seconds = 50 minutes and 48 seconds
endoSuc: 38.7 seconds
abg: 233.6 seconds = 3 minutes and 53 seconds
recalBP: 8.5 seconds
recal: 20 seconds
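Mean durations of this kind can be obtained from a label sequence by averaging the lengths of its runs. The following sketch shows the idea; the function names are hypothetical and we assume, for illustration only, one label per second. The conversion to minutes truncates the fractional seconds, which is consistent with the figures quoted above (e.g. 3048.3 seconds is given as 50 minutes and 48 seconds).

```python
from typing import List, Optional, Sequence, Tuple


def run_lengths(labels: Sequence[int]) -> List[int]:
    """Lengths of the maximal runs of 1s in a 0/1 label sequence."""
    runs: List[int] = []
    current = 0
    for y in labels:
        if y:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)  # a run that lasts until the end of the source
    return runs


def mean_duration_seconds(labels: Sequence[int], sample_period_s: float = 1.0) -> Optional[float]:
    """Mean run length in seconds; None if the artefact never occurs."""
    runs = run_lengths(labels)
    if not runs:
        return None
    return sample_period_s * sum(runs) / len(runs)


def as_minutes_and_seconds(seconds: float) -> Tuple[int, int]:
    """Split a duration into whole minutes and whole seconds (fraction truncated)."""
    return divmod(int(seconds), 60)


example_labels = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]  # made-up labels: runs of 3, 1 and 2
example_mean = mean_duration_seconds(example_labels)
```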
Figure 4.21 Example of an endotracheal suctioning. (Plot not reproduced: Baby 1369,
born 21 November 2001 at 12:21; recording of 21/11/2001, 16:10:30 to 16:15:30;
channels HR, BS, BD and BM with the marginal posteriors for Endotracheal Suction,
Taking Blood Gas and Recalibration of BP transducer.)
Chapter 5
Conclusions and Future Work
This chapter asks what conclusions we can draw from the material covered in the
preceding chapters and makes suggestions for future work, including enhancements to
our approach of using a conditional Gaussian model to detect artefacts in the neonatal
monitoring data.
5.1 Conclusions
In this thesis we aimed to detect artefact processes in neonatal monitoring data using a
probabilistic approach, the conditional Gaussian (CG) model. And although our results
are actually promising, this model has to be seen as the first part of the construction of
a more elaborate model—one that includes the temporal aspects of the data.
This is, to the best of our knowledge, the first approach which tries to infer multiple
latent causes from patient monitoring data. The results presented in chapter 4
support the claim that this goal has been accomplished. It is, of course, true that there
are still several problems that need to be addressed in future; but the recognition of
artefactual patterns in multi-channel data is feasible.
We devoted a large amount of the allotted time to solving data mining and preprocessing
tasks as well as to the creation of a comfortable computing environment within
MATLAB. Now we are in a position to build new instances of the CG models quickly
and accurately.
Moreover, we gave an algorithm which dramatically reduces the cross product state
space of the latent variables by exploiting the structure of artefacts. This is useful not
only while we are learning the parameters of the CG distribution, but also when we
compute posterior or marginal posterior probabilities. To what extent this algorithm will
be useful in conjunction with models that acknowledge the temporal evolution of the
recorded signals remains to be seen.
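To see why a reduction of the cross product state space matters, consider how quickly the unreduced space grows. The sketch below does not implement the reduction algorithm itself; it merely enumerates the full cross product for a hypothetical set of artefacts. The numbers of states per artefact are made up for illustration and the function names are ours.

```python
from itertools import product
from math import prod
from typing import Dict, Iterator


def joint_state_count(states_per_artefact: Dict[str, int]) -> int:
    """Size of the unreduced cross product of all artefact state spaces."""
    return prod(states_per_artefact.values())


def joint_states(states_per_artefact: Dict[str, int]) -> Iterator[Dict[str, int]]:
    """Enumerate every joint configuration as a mapping artefact -> state index."""
    names = list(states_per_artefact)
    for combo in product(*(range(states_per_artefact[name]) for name in names)):
        yield dict(zip(names, combo))


# Hypothetical state counts (state 0 standing for "normal"); not taken from the model.
artefacts = {"off": 2, "recal": 2, "zeroHR": 2, "abg": 3}
size_before = joint_state_count(artefacts)

# Adding a single further artefact with three states triples the joint space.
artefacts_extended = dict(artefacts, recalGas=3)
size_after = joint_state_count(artefacts_extended)
```

Every additional artefact multiplies the number of joint configurations, and with it the number of CG components whose parameters have to be learned and evaluated during inference.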
Perhaps most significantly, we have shown that the conditional Gaussian model is a
valid approach to the detection of multiple hidden causes in the data. The application
of this model, in turn, helped us understand the interactions between the
various artefact processes themselves.
5.2 Future work
Simple enhancements of this work include the generalisation of the software we created,
so that, for instance, evidence can be entered into the belief network or annotations
can be stored. We could also modify the current software tools to create labels
sufficient for the automatic estimation of parameters.
More important, however, would be the implementation of the proposed algorithm to
reduce the cross product state space and thus increase the number of usable artefacts
and speed up the inference of their states.
We might also benefit from computing more detailed data statistics such as the physiological
centiles collected by Neil McIntosh. Especially in the light of modelling the
baby’s normal state of health it might be wise to know more about the distributions
of the individual channels, or subsets of them. Then we could decide with a lot more
confidence on how to model the observations. We might even want to have some kind
of artefact process inventory in the long run.
In any case it is necessary to have access to data with labels that can be directly
applied in the machine learning context. The absence of these labels made the construction
as well as the evaluation of the model much more difficult.
Furthermore, it is our opinion that there is still a lot to be learned from the everyday
procedures within an ICU and a detailed knowledge about the processes at the cot-side
would certainly facilitate the construction of other artefacts.
In addition to this, we have to include preprocessing techniques such as the ones discussed in
section 1.2. The pattern matching approach based on piecewise linear segmentations
(Keogh et al. (2003); Keogh and Smyth (1997)) of the channel data did not really
work, however. Maybe the usage of autoregressive hidden Markov models (Penny and
Roberts (1999); Woodland (1992)) would be a reasonable approach. A much simpler
option could be to exploit the fact that systolic and diastolic blood pressure become
the same when the nurse is drawing blood gas. Simple tests showed that this method
could be a sensible extension to the current system.
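A test of this kind could look roughly as follows. The sketch is our own illustration and not the code used for the simple tests mentioned above: the function name, the tolerance in mmHg and the minimum run length are hypothetical and would have to be chosen from the data.

```python
from typing import List, Sequence


def equal_pressure_flags(bs: Sequence[float], bd: Sequence[float],
                         tolerance_mmhg: float = 2.0, min_run: int = 3) -> List[bool]:
    """Flag samples where systolic (BS) and diastolic (BD) pressure coincide.

    A sample is flagged only if it belongs to a run of at least `min_run`
    consecutive samples with |BS - BD| <= tolerance_mmhg, so that single
    coincidences are ignored.
    """
    close = [abs(s - d) <= tolerance_mmhg for s, d in zip(bs, bd)]
    flags = [False] * len(close)
    start = None
    # The trailing False acts as a sentinel that closes a run at the end.
    for i, is_close in enumerate(close + [False]):
        if is_close and start is None:
            start = i
        elif not is_close and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags


# Made-up pressures in mmHg: the two channels coincide for four samples.
bs = [60, 61, 45, 44, 45, 46, 62]
bd = [35, 36, 44, 44, 45, 45, 36]
candidate_abg = equal_pressure_flags(bs, bd)
```

Such flags could either be used as a preprocessing step or be entered as additional evidence for abg.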
Finally, and certainly most significantly, we have to include temporal aspects in our
models. That is, we have to replicate the CG model through time. The result of this
operation would be a factorial hidden Markov model (FHMM) with a conditional Gaussian
observation model instead of a conditional linear Gaussian one as described in
Ghahramani and Jordan (1997).
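The replication through time can be written down as follows. The notation is our own assumption for this sketch: M is the number of artefact processes, S_t^{(m)} the state of artefact m at time t, S_t the collection of all M states, and Y_t the vector of channel observations. The first two lines are the standard FHMM factorisation; the third line states the conditional Gaussian observation model, in which every joint configuration s has its own mean and covariance.

```latex
\begin{align}
P\bigl(\{S_t, Y_t\}_{t=1}^{T}\bigr)
  &= P(S_1)\,P(Y_1 \mid S_1) \prod_{t=2}^{T} P(S_t \mid S_{t-1})\,P(Y_t \mid S_t), \\
P(S_1) = \prod_{m=1}^{M} P\bigl(S_1^{(m)}\bigr), \qquad
P(S_t \mid S_{t-1})
  &= \prod_{m=1}^{M} P\bigl(S_t^{(m)} \mid S_{t-1}^{(m)}\bigr), \\
Y_t \mid S_t = s \;&\sim\; \mathcal{N}(\mu_s, \Sigma_s).
\end{align}
```

Since the number of joint configurations s grows multiplicatively with M, the number of pairs (\mu_s, \Sigma_s) does so as well, which is why the question of state space reduction becomes even more pressing in the temporal setting.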
Do our results regarding the cross product state space reduction carry over to FHMMs?
Appendix A
Additional plots
In this appendix we gathered some plots which might be interesting to the reader,
although they are not particularly relevant to the understanding of our approach. We
present histograms for the channels available in the data sources as well as plots of the
entire data for the five experiments carried out.
Figure A.1 Histogram for channel HR. (Plot not reproduced: x-axis HR [bpm], 0 to 250; y-axis counts, 0 to 3.5 × 10^4.)
Figure A.2 Histogram for channel HS. (Plot not reproduced: x-axis HS [bpm], 0 to 250; y-axis counts, 0 to 4.5 × 10^4.)
Figure A.3 Histogram for channel TC. (Plot not reproduced: x-axis TC [°C], 0 to 45; y-axis counts, 0 to 14 × 10^4.)
Figure A.4 Histogram for channel TP. (Plot not reproduced: x-axis TP [°C], 0 to 40; y-axis counts, 0 to 10 × 10^4.)
Figure A.5 Histogram for channel OX. (Plot not reproduced: x-axis OX [kPa], 0 to 30; y-axis counts, 0 to 8 × 10^4.)
Figure A.6 Histogram for channel CO. (Plot not reproduced: x-axis CO [kPa], 0 to 30; y-axis counts, 0 to 9 × 10^4.)
Figure A.7 Histogram for channel BS. (Plot not reproduced: x-axis BS [mmHg], 0 to 300; y-axis counts, 0 to 9 × 10^4.)
Figure A.8 Histogram for channel BD. (Plot not reproduced: x-axis BD [mmHg], 0 to 200; y-axis counts, 0 to 9 × 10^4.)
Figure A.9 Histogram for channel BM. (Plot not reproduced: x-axis BM [mmHg], 0 to 250; y-axis counts, 0 to 9 × 10^4.)
Figure A.10 Histogram for channel SO. (Plot not reproduced: x-axis SO [%], 0 to 100; y-axis counts, 0 to 4.5 × 10^4.)
Figure A.11 Histogram for channel RR. (Plot not reproduced: x-axis RR [1/min], −20 to 160; y-axis counts, 0 to 2.5 × 10^5.)
Figure A.12 Source 1340 05 Nov 2001. (Plot not reproduced: Baby 1340, born 4 November 2001 at 14:27; recording of 05/11/2001, 9:00 to 18:30; channels HR, HS, SO, TP, TC, BS, BD and BM.)
Figure A.13 Source 1344 12 Nov 2001. (Plot not reproduced: Baby 1344, born 6 November 2001 at 12:36; recording of 12/11/2001, 9:00 to 18:30; channels HR, HS, SO, TP, TC, OX, CO, BS, BD and BM.)
Figure A.14 Source 1369 22 Nov 2001. (Plot not reproduced: Baby 1369, born 21 November 2001 at 12:21; recording of 22/11/2001, 9:00 to 16:30; channels HR, HS, SO, TP, TC, OX, CO, BS, BD and BM.)
Figure A.15 Source 1355 14 Nov 2001. (Plot not reproduced: Baby 1355, born 12 November 2001 at 17:17; recording of 14/11/2001, 8:30 to 18:00; channels HR, HS, SO, TP, TC, OX, CO, BS, BD, BM and RR.)
Figure A.16 Source 1369 21 Nov 2001. (Plot not reproduced: Baby 1369, born 21 November 2001 at 12:21; recording of 21/11/2001, 12:30 to 19:00; channels HR, HS, SO, TC, TP, OX, CO, BS, BD, BM and RR.)
Bibliography
Alberdi, E., Becher, J.-C., Gilhooly, K., Hunter, J. R. W., Logie, R., Lyon, A.,
McIntosh, N., and Reiss, J. (2001). Expertise and the interpretation of computerized
physiological data: Implications for the design of computerized monitoring
in neonatal intensive care. International Journal of Human Computer Studies,
55(3):191–216.
Becker, K., Thull, B., Kasmacher-Leidinger, H., Stemmer, J., Rau, G., Kalff, G., and
Zimmermann, H.-J. (1997). Design and validation of an intelligent patient monitoring
and alarm system based on a fuzzy logic process model. Artificial Intelligence
in Medicine, 11:33–53.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford University
Press, Oxford, UK.
Cao, C. and McIntosh, N. (1998). Empirical study on artifact detection in monitoring
data. Informatics Program, Children’s Hospital, 310 Longwood Avenue, Boston,
USA.
Cao, C. and McIntosh, N. (2000). An event-based approach to identifying artifacts in
multiple channel monitoring data from preterm infants. Technical report, Department
of Child Life and Health, University of Edinburgh.
Ghahramani, Z. and Jordan, M. I. (1997). Factorial hidden Markov models. Machine
Learning, 29:245–273.
Haimowitz, I. J., Le, P. P., and Kohane, I. S. (1995). Clinical monitoring using
regression-based trend templates. Artificial Intelligence in Medicine, 7(6):473–496.
Hand, D., Mannila, H., and Smyth, P. (2001). Principles of Data Mining. Adaptive
Computation and Machine Learning. The MIT Press, Cambridge, USA.
Hanley, J. A. and McNeil, B. J. (1982). The meaning and use of the area under a
Receiver Operating Characteristic (ROC) curve. Radiology, 143(1):29–36.
Hoare, S. W., Asbridge, D., and Beatty, P. C. W. (2002). On-line novelty detection for
artefact identification in automatic anaesthesia record keeping. Medical Engineering
& Physics, 24:673–681.
Hoare, S. W. and Beatty, P. C. W. (2000). Automatic artifact identification in anaesthesia
patient record keeping: a comparison of techniques. Medical Engineering &
Physics, 22:547–553.
Hunter, J. (2001). Time Series Workbench: User’s Manual. Department of Computing
Science, University of Aberdeen.
Jordan, M. I. (2002). An introduction to probabilistic graphical models. Unpublished
manuscript.
Keogh, E., Chu, S., Hart, D., and Pazzani, M. (2003). Data Mining in Time Series
Databases, chapter Segmenting Time Series: A Survey and Novel Approach. World
Scientific Publishing Company.
Keogh, E. and Smyth, P. (1997). A probabilistic approach to fast pattern matching
in time series databases. In Heckerman, D., Mannila, H., Pregibon, D., and
Uthurusamy, R., editors, Third International Conference on Knowledge Discovery and
Data Mining, pages 24–30, Newport Beach, CA, USA. AAAI Press, Menlo Park,
California.
Koski, E. M. J., Makivirta, A., Sukuvaara, T., and Kari, A. (1990). Frequency and
reliability of alarms in the monitoring of cardiac postoperative patients. International
Journal of Clinical Monitoring and Computing, 7:129–133.
Lauritzen, S. L. and Jensen, F. (1999). Stable local computation with conditional
gaussian distributions. Technical Report R-99-2014, Department of Mathematical
Sciences, Aalborg University.
Lauritzen, S. L. and Wermuth, N. (1984). Mixed interaction models. Technical Report
R-84-8, Institute for Electronic Systems, Aalborg University.
Lauritzen, S. L. and Wermuth, N. (1989). Graphical models for associations between
variables, some of which are qualitative and some quantitative. In Annals of Statistics,
volume 17, pages 31–57.
Lawless, S. T. (1994). Crying wolf: false alarms in the pediatric intensive care unit.
Critical Care Medicine, 22:981–985.
McIntosh, N. (2002). Intensive care monitoring: past, present and future. Clinical
Medicine, 2(4):349–355.
Meredith, C. and Edworthy, J. (1995). Are there too many alarms in the intensive care
unit? An overview of the problems. Journal of Advanced Nursing, 21:15–20.
Miksch, S., Horn, W., Popow, C., and Paky, F. (1996). Utilizing temporal data abstraction
for data validation and therapy planning for artificially ventilated newborn
infants. Artificial Intelligence in Medicine, 8(6):543–576.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible
inference. Morgan Kaufmann, San Mateo, USA.
Penny, W. and Roberts, S. (1999). Dynamic models for nonstationary signal segmentation.
Computers and Biomedical Research, 32(6):483–502.
Smyth, P. (1994a). Hidden Markov models for fault detection in dynamic systems.
Pattern Recognition, 27(1):149–164.
Smyth, P. (1994b). Markov monitoring with unknown states. IEEE Journal on Selected
Areas in Communications (JSAC), Special Issue on Intelligent Signal Processing for
Communications.
The MathWorks, Inc. (2003). MATLAB. Natick, USA.
http://www.mathworks.com/products/matlab/.
Tipping, M. (1999). Statistical pattern analysis. Unpublished manuscript.
Tsien, C. L. (2000a). Event discovery in medical time-series data. In AMIA 2000
Annual Symposium, pages 858–862. American Medical Informatics Association
(AMIA).
Tsien, C. L. (2000b). TrendFinder: Automated Detection of Alarmable Trends. PhD
thesis, Department of Electrical Engineering and Computer Science, Massachusetts
Institute of Technology, Cambridge, USA.
Tsien, C. L. and Fackler, J. C. (1997). Poor prognosis for existing monitors in the
intensive care unit. Critical Care Medicine, 25(4):614–619.
Tsien, C. L., Kohane, I. S., and McIntosh, N. (2000). Multiple signal integration by
decision tree induction to detect artifacts in the neonatal intensive care unit. Artificial
Intelligence in Medicine, 19(3):189–202.
Tsien, C. L., Kohane, I. S., and McIntosh, N. (2001). Building ICU artifact detection
models with more data in less time. In AMIA 2001 Fall Symposium, pages 706–710.
American Medical Informatics Association (AMIA), Hanley and Belfus, Inc.
Williams, C. K. I. (2002). Probabilistic modelling and reasoning. Lecture notes, School
of Informatics, University of Edinburgh.
Woodland, P. C. (1992). Hidden Markov models using vector linear predictors and
discriminative output distributions. In Proceedings of the International Conference
on Acoustics, Speech and Signal Processing, ICASSP-92, volume 1, pages 509–512.