Developing a System for High-Resolution Detection of Driver Drowsiness
Using Physiological Signals
by
Ahnaf Rashik Hassan
A thesis submitted in conformity with the requirements
for the degree of Master of Applied Science
Institute of Biomaterials and Biomedical Engineering
University of Toronto
© Copyright by Ahnaf Rashik Hassan 2018
Abstract
Background: This research aims to develop a high-resolution, reliable, and efficient drowsiness detection system.
Existing systems for detecting drowsiness are low-resolution, expensive, dependent on external parameters, or
inconvenient for the driver.
Method: Two studies were conducted: First, we analyzed electroencephalogram (EEG) data collected during a sleep
study to develop a high-resolution drowsiness detection algorithm. This algorithm was then tested in a second study
that actively engaged participants in a reaction time task.
Results: In the sleep study, a sigmoid wake probability model yielded high drowsiness detection rates. In the
reaction time study, however, the same method showed low sensitivity. Instead, a time-domain feature-based
algorithm performed best, with high accuracy, sensitivity, and specificity.
Significance: Upon successful validation of the developed algorithm in a driving study, this research will help to
develop a reliable, wearable, and convenient device to detect drowsy driving that could increase road safety.
Acknowledgments
First and foremost, I am deeply grateful for the guidance, help, and support from my supervisors: Dr. Azadeh
Yadollahi and Dr. Behrang Keshavarz. I am sincerely grateful that they granted me an opportunity to pursue my
thesis with them. The constant supervision, support, and constructive criticism of Dr. Yadollahi and Dr. Keshavarz
immensely helped to improve the quality of my work and greatly contributed to my academic growth. I would also
like to thank my committee members: Prof. Geoff Fernie and Dr. Bruce Haycock for their thoughtful comments that
helped to improve the quality of the thesis.
Additionally, I would like to genuinely thank Dr. Muammar Kabir for his constant support, guidance and mentorship
throughout my master’s, especially in the data collection process. I would like to sincerely thank Prof. Stefan Berti
of Johannes Gutenberg-Universität, Mainz, Germany who personally taught me how to perform EEG data
collection. I would also like to express my special appreciation to Dr. Nasim Montazeri, Cathy Zhu, Joseph
Makanjuola, Bojan Gavrilovic, and Jamie Zhang for their support throughout my master’s study. Additional thanks
are extended to all the organizations that are funding this project.
Finally, I will eternally remain indebted to my mother whose constant support continues to be the source of strength
in all my endeavors.
Table of Contents
Acknowledgments
List of Tables
List of Figures
List of Appendices
Chapter 1: Introduction
1.1 Overview
1.2 Drowsiness Physiology
1.3 Motivation
1.4 Thesis Structure
Chapter 2: Background
2.1 Defining Drowsiness
2.2 Measures Used for Drowsiness Detection
2.2.1 Behavioral Measures
2.2.2 Questionnaire-Based Measures
2.2.3 Vehicle-Based Measures
2.2.4 Physiological Measures
2.3 Signal Processing and Pattern Recognition Techniques Used for Drowsiness Detection
2.4 Commercially Available Systems
Chapter 3: Objectives
Chapter 4: Study 1 - Sleep Study
4.1 Method
4.1.1 EEG Data
4.1.2 Preprocessing of EEG Signals
4.1.3 Feature Extraction from EEG Frequency Bands
4.1.4 Sigmoid Wake Probability Model
4.1.5 Parameter Selection for the Model
4.1.6 Validation
4.2 Results
4.3 Discussions
Chapter 5: Study 2 - Reaction Time Study
5.1 Methods
5.1.1 Study Design
5.1.2 Stimulus and Data Collection
5.1.3 Drowsiness Detection from Facial Video
5.1.4 Performance Evaluation Metrics
5.1.5 Development of Drowsiness Detection Algorithm
5.2 Results
5.2.1 Drowsiness Ratings
5.2.2 Modified Sigmoid Wake Probability Model
5.2.3 Step Function Model
5.2.4 Time-Domain Feature-Based Algorithm
5.3 Discussion
Chapter 6: General Discussions
6.1 Summary of the Findings
6.2 Comparison with Other Drowsiness Detection Systems
6.3 Practical Implications
6.4 Limitations
Chapter 7: Conclusions and Future Directions
References
Appendix A4: Results from F3-M2
Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study
Appendix B5: Mobility Parameter
Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm
List of Tables
Table 2.1: Rechtschaffen and Kales Sleep Staging Criteria.
Table 2.2: A summary of widely used drowsiness scoring schemes.
Table 2.3: A summary of drowsiness detection methodologies using physiological signals.
Table 2.4: A summary of drowsiness detection methodologies that employ behavioral measures.
Table 2.5: Overview of commercially available driver fatigue or drowsiness monitoring systems.
Table 4.1: Demographic information of the participants for Study 1.
Table 4.2: Average and standard deviation of the number of 3-s segments from F4-M1 used in this study for model development and validation.
Table 4.3: Sigmoid parameters computed from the training data for F4-M1.
Table 4.4: Results of the one-way repeated measures analysis of variance suggesting that the feature values (mean ± standard deviation) significantly change in the three clusters.
Table 5.1: Drowsiness scale proposed in this thesis.
Table 5.2: Average and standard deviation of the number of 1-s segments used per electrode.
Table 5.3: Inter-rater agreement of the proposed video-based drowsiness scale for binary classification of alert vs. non-alert (scores 0 and 1).
Table 5.4: Inter-rater agreement of the proposed video-based drowsiness scale for the levels of drowsiness (scores 0 to 10).
Table 5.5: Performance comparison of the proposed methods with existing works in the literature.
List of Figures
Figure 2.1: An illustration of commonly used measures for drowsiness detection.
Figure 4.1: EEG electrode placement map commonly used in sleep studies. The six electrodes available in this study are highlighted with red circles.
Figure 4.2: Spectrogram (3-s window, 50% overlap) of the first few minutes of EEG from the F4 channel of a single participant.
Figure 4.3: Sigmoid functions used in the proposed model.
Figure 4.4: An example of bootstrapping and out-of-bag instances.
Figure 4.5: Sigmoid parameters to be estimated for alpha band.
Figure 4.6: The resultant sigmoid functions for the three features for F4-M1.
Figure 4.7: Pr(W) distribution of arousal and deep sleep segments.
Figure 4.8: Silhouette and Davies-Bouldin index values computed from the 3-s segments of awake and non-REM 1 data obtained from the training participants.
Figure 4.9: Scatter diagram of the awake, drowsy, and sleep clusters for the testing data.
Figure 4.10: Distributions of relative power of alpha, beta, and delta for three clusters of awake, drowsy and sleep.
Figure 4.11: Post hoc multiple comparison tests suggest that alpha, beta, and delta power features are significantly different between the clusters.
Figure 5.1: Illustration of the stimulus used in the present study.
Figure 5.2: Illustration of a participant performing the task.
Figure 5.3: Electrode locations of the international 10-20 electrode placement scheme.
Figure 5.4: A schematic outline of the modified sigmoid wake probability model proposed in this thesis.
Figure 5.5: Component maps of the independent component analysis (ICA) of the EEG recordings of a subject.
Figure 5.6: Sigmoid functions used in the modified sigmoid wake probability model.
Figure 5.7: A schematic outline of the step function model proposed in this thesis.
Figure 5.8: A schematic outline of the proposed time-domain feature-based algorithm.
Figure 5.9: Illustration of bootstrap aggregating classifier.
Figure 5.10: Reaction time in seconds of a participant in response to the experimental task (i.e., identifying color change of the fixation cross) throughout a single experimental block.
Figure 5.11: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the sigmoid wake probability model on each of the electrodes of the 10-20 electrode placement system.
Figure 5.12: Feature weights of F3 electrode computed from random forest.
Figure 5.13: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the step function based algorithm on each of the electrodes of the 10-20 electrode placement system.
Figure 5.14: Mean and standard deviation and maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter based algorithm on each of the electrodes of the 10-20 electrode placement system.
Figure 5.15: Mean and standard deviation and maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter based algorithm on each of the electrodes of the 10-20 electrode placement system.
List of Appendices
Appendix A4: Results from F3-M2
Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study
Appendix B5: Mobility Parameter
Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm
Chapter 1: Introduction
1.1 Overview
According to the World Health Organization, 1.25 million people die in road accidents each year across the globe [1].
Drowsy driving is one of the leading causes of car crashes around the world. According to the US National Highway
Traffic Safety Administration (NHTSA), 100,000 crashes related to driver fatigue result in an estimated 1,550 deaths,
71,000 injuries, and 12.5 billion dollars in monetary losses each year in the US alone [2]. Road accidents due to
drowsy driving therefore lead to significant human and material costs and reduced productivity. In September 2015,
the United Nations General Assembly adopted a set of Sustainable Development Goals (SDGs), one of which is to halve
the deaths and injuries from road accidents around the world by 2020 [1]. Reducing car crashes due to drowsy or
fatigued driving is a precondition for achieving this goal and ensuring road safety.
Many existing works have addressed the problem of drowsy driving detection and have proposed various drowsiness and
fatigue monitoring systems. However, most of these research prototypes have not made their way into the real world
owing to their high cost and poor detection performance [3, 4]. Even though automobile companies have developed
various commercial driver alertness or fatigue monitoring systems, these systems are only used in the vehicles of the
respective companies [3-8]. Most of the existing systems use behavioral measure-based drowsiness detection algorithms
that analyze facial video or eye tracking data [3]. However, these systems depend on lighting conditions [9]. They
also require the driver to be constantly monitored by a camera, which compromises the driver’s privacy. In addition,
many existing drowsiness detection systems use vehicle parameters such as lane deviation, which are affected by
external factors such as weather, road markings, and lighting conditions [4]. Moreover, variations in vehicle
parameters cannot be uniquely attributed to drowsiness, since driving under the influence of alcohol,
anti-depressants, or other drugs and impaired driving also affect these parameters [10-12]. Therefore, there is a
need for an efficient, highly accurate, and cost-effective drowsiness detection system that can be used in the
community.
1.2 Drowsiness Physiology
Drowsiness, also termed sleep onset, sleepiness, low arousal state, or somnolence, refers to the state of strong
desire for sleep that occurs just before a person falls asleep [13]. Since falling asleep (and therefore sleepiness)
is a neural process, brain rhythms are affected by drowsiness. Consequently, the electroencephalogram (EEG) is the
most commonly used means of detecting drowsiness. Brain waves consist of five dominant frequency bands: delta
(1-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-100 Hz). In sleep-EEG data, drowsiness or
sleep onset is characterized by a decrease in alpha waves [14], because alpha waves are associated with relaxed
wakefulness. In contrast, delta and theta power increase during drowsy episodes [14, 15]. In the context of a
reaction time or tracking task, or of a simulator study, drowsiness is associated with a decrease in high-frequency
gamma band (30-100 Hz) power and an increase in the power of lower frequency bands such as delta, theta, alpha, and
beta [16].
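The band definitions above can be illustrated with a minimal sketch that estimates the relative power of each band in a short EEG segment. It is not the feature extraction used in this thesis: the naive DFT periodogram, the function names, the 256 Hz sampling rate, and the synthetic test segment are all assumptions made for illustration, and only the band edges come from the text.

```python
import cmath
import math

# Frequency bands in Hz, as listed in the text.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def power_spectrum(signal, fs):
    """Return (freqs, power) of a one-sided periodogram computed with a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def relative_band_power(signal, fs, bands=BANDS):
    """Share of the summed band power that falls in each band (lower edge inclusive)."""
    freqs, power = power_spectrum(signal, fs)
    absolute = {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
        for name, (lo, hi) in bands.items()
    }
    total = sum(absolute.values())
    return {name: value / total for name, value in absolute.items()}


# Synthetic 2-s segment at an assumed 256 Hz: strong 10 Hz (alpha) plus weak 20 Hz (beta).
fs = 256
segment = [math.sin(2 * math.pi * 10 * t / fs) + 0.2 * math.sin(2 * math.pi * 20 * t / fs)
           for t in range(2 * fs)]
rel = relative_band_power(segment, fs)
print({name: round(value, 3) for name, value in rel.items()})
```

On real recordings a windowed, averaged estimator (for example Welch's method) would normally replace the naive DFT; it is used here only to keep the sketch dependency-free.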
Besides changes in brain wave activity, drowsiness is also associated with other physiological processes of the human
body, such as autonomic nervous system (ANS) activity [17-19]. Drowsiness is characterized by increased
parasympathetic dominance and decreased sympathetic dominance. ANS activity can be measured noninvasively from
electrocardiogram (ECG) signals. The low frequency (LF) band (0.04-0.15 Hz) power captures sympathetic activity,
whereas the high frequency (HF) band (0.15-0.4 Hz) power signifies parasympathetic activity [18, 19]. The
sympathetic-parasympathetic balance is obtained from the ratio of LF to HF power, which progressively decreases as a
subject moves from a vigilant to a drowsy state. Another frequency band of interest is the very low frequency (VLF)
band (0.003-0.04 Hz). The transition from wakefulness to sleep is associated with a decrease in power in the VLF
band. Thus, drowsiness strongly influences heart rate variability. Drowsiness is also associated with a decrease in
muscle activity [20]. Furthermore, drowsiness is associated with oxygen desaturation, which triggers a loss of
alertness and concentration. Studies have shown that peripheral oxygen saturation (SpO2) in the forehead decreases
when drowsiness gets stronger and increases when drowsiness gets weaker [21]. Drowsiness is also reflected in eye
movements (as measured by the electrooculogram or eye tracking): increased eye blinks, partial or full closure of the
eyelids, and changes in eye blink duration and amplitude can be observed [22]. Drowsiness is also manifested by
changes in skin conductance: the decrease in sympathetic dominance causes the removal of the ionic fillings of the
sweat glands of the skin [23], which gives rise to a sudden decrease in skin conductance. Taken together, it is
evident that drowsiness changes various physiological traits, and that these changes are reflected in the
physiological signals.
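As an illustration of the LF/HF ratio described above, the following sketch computes it from a series of RR intervals. Only the LF and HF band edges are taken from the text. The 4 Hz resampling rate, the linear interpolation, the naive DFT, and the synthetic RR series are assumptions chosen to keep the example self-contained, and it should not be read as the cardiorespiratory algorithm of this thesis.

```python
import cmath
import math

LF_BAND = (0.04, 0.15)  # Hz, as given in the text
HF_BAND = (0.15, 0.40)  # Hz, as given in the text


def resample_rr(rr_intervals, fs=4.0):
    """Linearly interpolate RR intervals (seconds) onto an even time grid at fs Hz."""
    beat_times, t = [], 0.0
    for rr in rr_intervals:
        t += rr
        beat_times.append(t)
    n = int((beat_times[-1] - beat_times[0]) * fs)
    series, j = [], 0
    for i in range(n):
        ti = beat_times[0] + i / fs
        while beat_times[j + 1] < ti:
            j += 1
        frac = (ti - beat_times[j]) / (beat_times[j + 1] - beat_times[j])
        series.append(rr_intervals[j] + frac * (rr_intervals[j + 1] - rr_intervals[j]))
    return series


def band_power(series, fs, band):
    """Sum of squared DFT magnitudes over the bins that fall inside the band."""
    n = len(series)
    mean = sum(series) / n
    lo, hi = band
    total = 0.0
    for k in range(1, n // 2 + 1):
        if lo <= k * fs / n < hi:
            coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(series))
            total += abs(coeff) ** 2
    return total


def lf_hf_ratio(rr_intervals, fs=4.0):
    series = resample_rr(rr_intervals, fs)
    return band_power(series, fs, LF_BAND) / band_power(series, fs, HF_BAND)


def synthetic_rr(lf_amp, hf_amp, duration=120.0, base=0.8):
    """Hypothetical RR series with sinusoidal modulation at 0.10 Hz (LF) and 0.25 Hz (HF)."""
    rr, t = [], 0.0
    while t < duration:
        interval = (base + lf_amp * math.sin(2 * math.pi * 0.10 * t)
                    + hf_amp * math.sin(2 * math.pi * 0.25 * t))
        rr.append(interval)
        t += interval
    return rr


# Two made-up cases: LF-dominated ("vigilant-like") and HF-dominated ("drowsy-like").
vigilant_ratio = lf_hf_ratio(synthetic_rr(lf_amp=0.05, hf_amp=0.02))
drowsy_ratio = lf_hf_ratio(synthetic_rr(lf_amp=0.02, hf_amp=0.05))
print(round(vigilant_ratio, 2), round(drowsy_ratio, 2))
```

The two synthetic series only show the direction of the effect stated in the text (a lower LF/HF ratio when HF power dominates); they carry no physiological meaning.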
1.3 Motivation
Over the years, researchers have attempted to develop drowsiness and fatigue detection systems that can be installed
in vehicles to monitor drivers [2-4, 16, 24-36]. Upon detection of a drowsy episode, such a system would activate an
alarm to alert the driver. A drowsy driving detection system would greatly benefit populations at higher risk of
drowsy-driving-related car crashes, including shift workers, patients with sleep-related disorders, individuals who
take sedative medications, and occupational drivers. A convenient and reliable drowsiness detection system could
greatly reduce car crashes related to drowsy or fatigued driving. Furthermore, even though this work reviews
drowsiness in the context of driving, a drowsiness detection system would also be useful for mining workers, pilots,
and locomotive operators.
1.4 Thesis Structure
The present thesis is organized as follows. Chapter 2 discusses the existing drowsiness detection algorithms and
commercially available fatigue and drowsiness detection systems. Chapter 3 presents the objectives of this thesis.
Chapter 4 describes the first study, which was conducted in a sleep laboratory, explains the algorithm developed from
the sleep study data, and discusses the results. Chapter 5 presents the second study, based on a reaction time task,
and discusses the developed algorithms together with the results obtained from the collected data. Chapter 6 provides
a general discussion of the two studies and the drowsiness detection algorithms proposed in this thesis. Finally,
Chapter 7 highlights future directions and provides concluding remarks.
Chapter 2: Background
This chapter provides an overview of the current literature on drowsiness detection. It first defines drowsiness and
then discusses the commonly used measures, the signal processing and pattern recognition techniques, and the
commercially available systems for drowsiness detection.
2.1 Defining Drowsiness
Unlike the various stages of sleep, drowsiness is not a physiologically well-defined stage. This has led to the
development of various definitions and scoring schemes for drowsiness. Before describing drowsiness, we first discuss
how the sleep stages are defined, since the first study uses sleep-EEG data to detect drowsiness. Sleep stage scoring
has traditionally been performed using the Rechtschaffen and Kales (R&K) guideline [37], which was originally
proposed in 1968 and is presented in Table 2.1.
Table 2.1: Rechtschaffen and Kales Sleep Staging Criteria [37].
Sleep stage | Scoring criteria
Awake | >50% of the epoch consists of alpha (8-13 Hz) activity or low voltage, mixed (2-7 Hz) frequency activity.
Non-REM stage 1 | 50% of the epoch consists of relatively low voltage mixed (2-7 Hz) activity, and <50% of the epoch contains alpha activity. Slow rolling eye movements that last several seconds are often seen in early stage 1.
Non-REM stage 2 | Appearance of sleep spindles and/or K complexes, and <20% of the epoch may contain high voltage (>75 µV, <2 Hz) activity. Sleep spindles and K complexes must each last >0.5 seconds.
Non-REM stage 3 | 20-50% of the epoch consists of high voltage (>75 µV), low frequency (<2 Hz) activity.
Non-REM stage 4 | >50% of the epoch consists of high voltage (>75 µV), <2 Hz delta activity.
REM | Relatively low voltage mixed (2-7 Hz) frequency EEG with episodic rapid eye movements and absent or reduced chin EMG activity.
In 2005, the American Academy of Sleep Medicine (AASM) published a modified guideline that combined S3 and S4 into
one stage, namely N3. A salient trait of both the R&K and AASM guidelines is that they score EEG signals in 30-s
epochs. Unlike wakefulness and the various stages of sleep, drowsiness is not explicitly defined in either guideline.
Thus, the wake-to-sleep transition, or sleep onset, is discretized in both scoring guidelines. In other words, this
approach treats the transition from wake to sleep, and vice versa, as an instantaneous process and completely
overlooks the interplay of the neural system and behavior that occurs just before sleep onset. Table 2.2 summarizes
some of the most widely used definitions of drowsiness.
Table 2.2: A summary of widely used drowsiness scoring schemes.
Scoring scheme | Brief description
Non-REM 1 [7] | Sleep stage non-REM 1 is identified as drowsiness.
Karolinska drowsiness scoring method (KDS) [3][5] | EEG signals are segmented on a 2-second basis. Each segment is checked for sleepiness using EEG and EMG.
Self-evaluated [13] | Evaluation of sleepiness by the vehicle driver.
Evaluation by external human observer [13] | Evaluation of sleepiness by external human observers who rate the drowsiness based on eye closure, head movement, etc.
Event related lane departure paradigm [50] | The driver has to steer the car back towards a particular lane as the vehicle drifts away from it. The reaction time of the driver is measured. A higher reaction time indicates drowsiness.
Wierwille and Ellsworth criteria [11][31][12] | The level of drowsiness is rated from videos of the driver. The scale varies from 1 to 5, where 1 denotes not drowsy and 5 denotes extremely drowsy.
Behavioral task [29] | Reaction in a button pressing task.
Johns drowsiness scale [49] | A scale that combines different variables representing the variability of the duration and velocity characteristics of eyelid closures and blinks, measured each minute.
Vehicle-based parameters [17] | Thresholding vehicle-based measures, such as standard deviation of lane position.
Ocular and facial features from video [17] | Thresholding common ocular features, such as percentage of eye closure (PERCLOS).
Unfortunately, most of the existing works on drowsiness and fatigue detection define sleep stage non-REM 1 as
drowsiness according to the R&K guidelines, treating sleep as a discrete process. Drowsiness, however, is a state
that precedes non-REM 1: non-REM 1 is a sleep stage, whereas an individual is drowsy near the transition from wake to
sleep [14, 15]. It can be seen from Table 2.2 that only the Karolinska drowsiness scoring method (KDS) employs
physiological signals such as EEG and electromyogram (EMG). In contrast, all the other scoring schemes employ
behavioral or vehicle-based measures for the detection of drowsy driving.
2.2 Measures Used for Drowsiness Detection
Measures used for detecting drowsy driving can be separated into driver-based measures and vehicle-based measures.
Driver-based measures refer to various behavioral and physiological characteristics recorded from the driver [9, 16,
20, 24-29, 31-33, 38-41]. Figure 2.1 depicts the commonly used measures for driver drowsiness detection. Driver-based
measures can be further subdivided into three categories, namely physiological, behavioral, and questionnaire-based
measures. Vehicle-based measures are parameters calculated from the vehicle [42, 43], such as lane deviation and
steering wheel movement. In the following, we describe each of these measures in detail.
Figure 2.1: An illustration of commonly used measures for drowsiness detection.
2.2.1 Behavioral Measures
Several behavioral changes can be observed during drowsiness. These changes include frequent yawning, increased eye
blinking, nodding or sudden sideways movements of the head, and changes in facial features. Behavioral measure-based
methodologies attempt to capture and exploit these characteristic features in their sleepiness detection algorithms.
Table 2.4 summarizes some of the existing works that use behavioral measures for drowsy driving detection.
Table 2.4: A summary of drowsiness detection methodologies that employ behavioral measures.
Authors | Signals used | Experimental paradigm | Number of subjects | Drowsiness scoring scheme | Feature extraction | Classification or regression model | Accuracy
Bergasa et al. [34] | Active IR illuminator and a miniature CCD camera sensitive to IR to capture facial images | Night and day driving in a motorway | Unspecified | Fuzzy rules based on the six ocular features extracted in this work | Six features: PERCLOS, eye closure duration, blink frequency, nodding frequency, face position, and fixed gaze | Kalman filtering and fuzzy classifier | 95.62%
D’Orazio et al. [35] | Facial image | Image acquisition in different lighting conditions of subjects while driving a car | 2 | PERCLOS values | Hough transform | ANN | 95%
Flores et al. [36] | Facial image | Driving by day and at night | Unspecified | PERCLOS values | Gabor filter and PERCLOS | SVM | 93%
Sommer et al. [9] | Facial video | Night-time driving in a driving simulator | 16 | KDS, driving related parameter (SDL), and PERCLOS | PERCLOS and spectral domain features | SVM | 66-74% for PERCLOS
Yin et al. [44] | Video | Videos collected from webcam as the subjects operated computers on a worktable | 30 | Annotation based on yawning, eye closure etc. from videos | Local binary pattern feature | Adaptive boosting | 98.33%
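Several of the works in Table 2.4 rely on PERCLOS, the percentage of eye closure. A minimal sketch of how such a value could be computed from per-frame eyelid closure estimates is given below. The 80% closure threshold, the window length, the function names, and the example data are assumptions for illustration and are not taken from the cited works.

```python
def perclos(closure, threshold=0.8):
    """Percentage of frames whose eyelid closure is at or above the threshold.

    closure: per-frame eyelid closure fractions in [0, 1] (0 = fully open, 1 = fully shut).
    threshold: assumed closure level above which the eye counts as closed.
    """
    if not closure:
        raise ValueError("closure sequence is empty")
    closed = sum(1 for c in closure if c >= threshold)
    return 100.0 * closed / len(closure)


def sliding_perclos(closure, window, step, threshold=0.8):
    """PERCLOS for each window of `window` frames, advanced by `step` frames."""
    return [perclos(closure[start:start + window], threshold)
            for start in range(0, len(closure) - window + 1, step)]


# Made-up closure estimates for 20 video frames.
frames = [0.1] * 8 + [0.9] * 2 + [0.95] * 6 + [0.2] * 4
values = sliding_perclos(frames, window=10, step=10)
print(values)  # [20.0, 60.0]
```

In practice the per-frame closure estimate would come from an eye tracker or a facial landmark detector, which is the step that is sensitive to lighting and to glasses.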
The advantage of behavioral measures for drowsiness detection is that they are contactless. In other words, these
systems do not require a device or sensor to be attached to the driver’s body. However, these systems have a number
of limitations. First, systems using ocular measures show poor performance for drivers wearing glasses [43]. Second,
some of the works in the literature use eye measures that depend on lighting conditions. As a result, they do not
work well in poor lighting conditions, such as on cloudy days and at night. Even though some studies used infrared
cameras to overcome this limitation, these detection schemes fail to perform adequately during the daytime [4].
Third, eye and head movement-based systems are not very popular among drivers, because most drivers are uncomfortable
having a camera focused on their faces or bodies all the time [43].
2.2.2 Questionnaire-Based Measures
Prior works on drowsiness detection have also used subjective questionnaire-based measures, wherein the subject fills
out a questionnaire to rate their level of drowsiness [45, 46]. The intensity of sleepiness is then determined from
the ratings. Some of the most commonly used questionnaires are the Dundee Stress State Questionnaire (DSSQ) [47], the
Karolinska Sleepiness Scale (KSS) [48], and the Epworth Sleepiness Scale (ESS). The DSSQ attempts to assess the level
of task-induced stress and arousal of the subject. The KSS measures acute sleepiness by asking the subject to rate
their level of sleepiness on a scale of 1 (extremely alert) to 10 (extremely sleepy). The ESS measures general
sleepiness by asking the respondent to rate on a 4-point scale (0-3) their usual chances of having dozed off or
fallen asleep while engaged in eight different activities. Despite their use in some existing studies,
questionnaire-based measures have several caveats. For instance, it is problematic to obtain feedback from drivers on
their own drowsiness while driving, since it would require a person to accompany the driver at all times. This
approach also affects the attention and level of alertness of the driver. Furthermore, sudden variations in
drowsiness cannot be measured using such questionnaires, because most questionnaires are presented and filled out at
5-min intervals. Moreover, questionnaire-based ratings do not fully concur with other measures, such as
physiological, behavioral, and vehicle-based measures [4, 47]. Taken together, questionnaire-based measures are not
effective for developing a drowsiness detection algorithm.
2.2.3 Vehicle-Based Measures
Vehicle-based measures have also been used for drowsiness detection. The two most commonly used measures
are the standard deviation of lane position (SDL or SDLP) and steering wheel movement (SWM). An inattentive or
sleepy driver often drifts while driving. Therefore, SWM and SDLP change when the driver becomes drowsy. The
changes in SWM are manifested by increased standard deviation of steering angle
and increased amplitude of steering movement [42]. Moreover, acceleration and brake patterns are influenced by the
sleepiness of the driver [43]. Studies have shown that fatigue or drowsiness of the driver is manifested by increases
in the standard deviation of the vehicle speed [4, 42, 43]. Therefore, the aforementioned parameters have been used in
prior studies that use vehicle-based measures for drowsy driving identification.
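As a minimal illustration of the two measures named above, the following Python sketch computes SDLP and the steering-angle variability for one analysis window. The function names, units, and sample values are hypothetical and are not taken from the cited studies.

```python
import statistics

def sdlp(lateral_positions_m):
    """Standard deviation of lane position over one analysis window (metres)."""
    return statistics.stdev(lateral_positions_m)

def steering_variability(steering_angles_deg):
    """Standard deviation and peak-to-peak amplitude of the steering angle (degrees)."""
    sd = statistics.stdev(steering_angles_deg)
    amplitude = max(steering_angles_deg) - min(steering_angles_deg)
    return sd, amplitude

# Hypothetical samples: lateral offset from the lane centre and steering angle.
alert_lane = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
drowsy_lane = [0.10, -0.25, 0.30, -0.15, 0.40, -0.35]
alert_steer = [0.5, -0.4, 0.3, -0.2, 0.4, -0.5]
drowsy_steer = [2.0, -6.0, 5.5, -4.0, 7.0, -3.5]

print(sdlp(alert_lane), sdlp(drowsy_lane))
print(steering_variability(alert_steer), steering_variability(drowsy_steer))
```

In this made-up example both measures are larger for the drifting trace, which is the pattern the cited studies associate with drowsiness.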
The main advantage of vehicle-based systems is that they are non-intrusive. Consequently, these systems
are convenient and comfortable for the driver. Nevertheless, vehicle-based measures have several limitations for
drowsy driving identification. First, prior studies have shown that vehicle-based metrics are poor predictors of
drowsy driving [3, 4, 43]. Second, vehicle-based measures can be affected by factors other than fatigue or
drowsiness. For instance, they can be influenced by drugs such as antidepressants or by alcohol [10-12]. Third, these
vehicle-based parameters vary greatly from driver to driver. That is, an algorithm that yields good performance for
one driver may yield poor performance for another. Finally, vehicle-based measures depend strongly on vehicle
type and external factors, including road geometry, weather, and road marking.
2.2.4 Physiological Measures
It is evident from Section 1.2 that drowsiness is associated with alterations in the characteristics of physiological
signals. Consequently, drowsy driving detection systems that use physiological signals attempt to capture these
alterations in the signals recorded from the driver.
Table 2.3 summarizes some of the existing works that use physiological signals to detect drowsiness. EEG [4, 16,
24, 25, 27, 49-51] is the most widely used physiological signal for drowsiness detection. Some of the most
commonly used features in EEG-based drowsy driving detection techniques are power spectral density (PSD)
based features, average and relative power of various brain rhythms including alpha, delta, and theta waves,
time-domain features (e.g. mean, variance, minimum, maximum, kurtosis of EEG signal amplitude), fractal dimension,
approximate entropy, and Lempel-Ziv complexity. Drowsiness is also associated with partial or full eyelid closure
and higher blink frequency. Consequently, electrooculogram (EOG) [20, 22, 49, 51] and electromyogram (EMG)
[20, 22, 51] have also been used in prior studies to quantify drowsiness. The most common features extracted from
EOG include: peak eyelid closing velocity, delay of eyelid reopening, blink amplitude, blink duration, peak opening
velocity of eyelid, eyelid opening speed, eyelid closure speed, percentage of eye closure (PERCLOS), which denotes the
percentage of time during which the eyes were at least 80% closed, average eye closure (AVECLOS), eyelid
closing time, eye-blink interval, and eye-blink frequency every 20 seconds. On the other hand, EMG-based methods
extracted features that track the decrease in muscle tension due to drowsiness
[20]. Furthermore, the studies that use ECG signals [17-19] mostly focus on calculating heart rate
variability (HRV) features in the time and frequency domains. The time-domain features include the standard deviation
of the RR intervals, the number and proportion of RR intervals, and the root mean square of the differences between
successive RR intervals, among others. High frequency (HF) power, low frequency (LF) power, and the ratio of LF to HF
are the frequency-domain features used for drowsiness detection. The LF to HF ratio continues to decrease as an
individual becomes increasingly drowsy [18, 19]. It is also evident from Table 2.3 that most of the existing studies
use multiple physiological signals to improve performance.
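To make three of the features named above concrete, the following Python sketch computes the standard deviation of the RR intervals, the root mean square of successive RR differences, and PERCLOS. The function names and the sample data are hypothetical. PERCLOS is computed here as a share of uniformly sampled eyelid-closure values, which equals the share of time only under that sampling assumption.

```python
import math
import statistics

def sdnn(rr_ms):
    """Standard deviation of the RR intervals (ms)."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of the differences between successive RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def perclos(closure_fraction, threshold=0.8):
    """Percentage of samples in which the eye is at least `threshold` closed."""
    closed = sum(1 for c in closure_fraction if c >= threshold)
    return 100.0 * closed / len(closure_fraction)

# Hypothetical data: RR intervals in ms and eyelid closure as a fraction (0 = open, 1 = closed).
rr = [800, 810, 790, 800]
eyelid = [0.1, 0.2, 0.9, 1.0, 0.85, 0.3, 0.1, 0.0, 0.95, 0.2]

print(sdnn(rr), rmssd(rr), perclos(eyelid))
```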
Table 2.3: A summary of drowsiness detection methodologies using physiological signals.
Authors | Signals used | Experimental paradigm | N | Drowsiness scoring scheme | Feature extraction | Classification or regression model | Accuracy
Akin et al. [20] | Chin EMG and EOG | Overnight sleep study | 30 | Expert scoring from EEG and EMG | Discrete wavelet transform (DWT) | Artificial neural network (ANN) | 98-99%
Khushaba et al. [49] | ECG, EOG, and EEG | 1 h monotonous driving in a driving simulator | 31 | Wierwille and Ellsworth criteria | Fuzzy mutual-information (MI)-based wavelet packet transform | Support vector machine (SVM), k nearest neighbor, linear discriminant analysis (LDA) | 95-97%
He et al. [50] | Horizontal and vertical EOG, chin EMG, and EEG | 45 min driving in a moving base driving simulator | 37 | Karolinska Drowsiness Scale (KDS) | Eyelid movement parameters extracted from EOG | SVM | 90%
Patel et al. [17] | ECG | Sensory motor driver simulator task | 12 | Unspecified | Power spectrum density of RR interval | ANN | 90%
Su et al. [22] | EOG, EEG, and EMG | Monotonous driving in a third generation moving base simulator after a full-night shift with no sleep | 44 | KDS | 14 eyelid features (e.g. blink duration/amplitude, peak opening/closing velocity, lid opening/closing speed etc.) extracted from EOG | Partial least squares regression | 90%
Fu et al. [27] | EEG | Event-related lane departure paradigm in a virtual-reality based driving simulator | 6 | Reaction time and lane deviation based scoring commonly used in event-related lane departure paradigm | ICA and spectral powers computed by FFT | Self-organizing neural fuzzy inference network | 96.7%
Correa et al. [24] | EEG | Overnight sleep study | 16 | Considers non-REM 1 as drowsiness | 19 features extracted from frequency and time domain and wavelet decomposition of the EEG | ANN | 87.4% for drowsiness and 83.6% for alertness
Kurt et al. [51] | EMG, EEG, and EOG | Overnight sleep study | 10 | Expert scoring from EEG, EOG, and EMG | DWT features | ANN | 97-98%
Vicente et al. [18] | ECG | 2 h driving in a driving simulator | 11 | Expert annotation based on EEG, percentage of eye closure, video recording | Heart rate variability features in the frequency domain | LDA | Positive predictive value: 86.31%; Sensitivity: 70.58%
Chin et al. [25] | EEG | Night-time driving in a virtual reality based driving simulator | 10 | Alpha and theta rhythm and alertness model | Mahalanobis distance of alpha and theta rhythm | Thresholding | 88.7%
The advantages of physiological signal-based drowsiness detection systems are manifold. They are highly reliable
and tend to provide higher accuracy [4, 16] than behavioral and vehicle-based measures.
Moreover, these systems do not depend on surrounding lighting conditions and spare the driver the discomfort of
being constantly monitored by a camera [3].
Notwithstanding their benefits and widespread use in the drowsiness detection literature, physiological
signals have some limitations for detecting drowsy driving. For instance, unlike other driver-based measures, these
systems are contact-based and require many sensors and electrodes to be placed on the driver’s body, which
is uncomfortable for the driver. Furthermore, most of the physiological signal-based algorithms use signal segments
of 30s or longer [20, 24, 33, 51], whereas drowsy episodes can be as short as 1s [52]. Moreover, physiological
signal acquisition equipment is expensive and difficult to place. Also, some of the existing works in the literature
analyze overnight sleep study data for drowsiness detection, the findings of which cannot be generalized to drowsy
driving.
2.3 Signal Processing and Pattern Recognition Techniques Used for Drowsiness Detection
The goals of signal processing are to clean the data and to identify effective markers of drowsiness, which are later
fed into pattern recognition tools to develop a drowsiness detection algorithm. In this section, the signal processing
and pattern recognition techniques used for driver drowsiness detection are discussed. It is often seen that
transform domain features capture more information on drowsiness than features computed directly from the signal
domain. Therefore, most of the existing works in the literature perform feature extraction in the transform domain
[18-20, 35, 51]. A salient trait of all physiological signal-based systems is that the method attempts to
capture information from a particular frequency band. For example, in EEG, alpha (8-13 Hz), theta (4-8 Hz), and
delta (1-4 Hz) may be bands of interest for feature extraction. In ECG, HF and LF bands of the RR interval time series
are often used for feature extraction. The goal of extracting a particular frequency band has motivated the use of
various frequency domain signal decomposition techniques in the drowsy driving detection literature. These
decompositions include power spectrum density analysis [17], discrete wavelet transform (DWT) [20, 24, 51], fast
Fourier transform [27], and wavelet packet transform [49].
Even though these signal analysis techniques successfully decompose the data into various frequency bands, each of
them has limitations. For example, the Fourier transform is not well-suited for
analyzing nonlinear and nonstationary signals such as EEG [53]. The wavelet transform, on the other hand, is reliant
on the choice of basis functions. The use of data-adaptive signal decomposition schemes, such as empirical mode
decomposition [53], is not present in the literature. The signal processing techniques used in behavioral
measure-based driver drowsiness detection systems are employed to detect the face and eyes from facial video frames.
The most commonly used techniques for this purpose include texture-based features, such as local binary pattern [44]
and its variants, Hough transform [35], and Gabor filters [36, 54].
Following feature extraction, various pattern recognition techniques are used to classify drowsy and non-drowsy
signal episodes or to distinguish among various levels of drowsiness. A wide variety of classification techniques
have been used in the existing literature. These include artificial neural network (ANN) [17, 20, 24, 35, 51], support
vector machine (SVM) [36, 54-56], k nearest neighbors [49], Kalman filtering [34], particle filter [42], discriminant
analysis [19], fuzzy logic based classifiers [34], and adaptive boosting [44], among others. Nevertheless, there is a
dearth of studies that employ forecasting or prediction of drowsiness episodes.
2.4 Commercially Available Systems
The organizations that have conducted research to develop a drowsy driving detection system include government
organizations (e.g. Canada Safety Council, Ministry of Transportation of Ontario, Railway Association of Canada,
Transport Canada, Spanish Science and Technology council, U.S. Department of Defense, U.S. Federal Motor
Carrier Safety Administration), specialized companies (e.g. AcuMine, Neurocom, Sleep Diagnostics, Seeing
Machines, Ospat Pty, Pacific Science and Engineering Group, Pernix, Precision Control Design Inc., Security
Electronic Systems), original equipment manufacturers (e.g. General Motors/Saab, Caterpillar, Daimler Chrysler,
BMW, Audi), and universities (e.g. University of Pennsylvania, CMU, University of Tokyo, Royal Institute of
Technology, Sweden, University of Technology of Berlin) [3, 4]. Table 2.5 summarizes some commercially
available drowsiness or fatigue monitoring systems developed by the aforementioned organizations.
Table 2.5: Some commercially available driver fatigue or drowsiness monitoring systems [3].
Company name | Product name | Signals used in the system
Audi [3] | Rest recommendation system | Ocular features
BMW [6] | Active Driving Assistant | HRV
Volkswagen [57] | Fatigue detection system | Ocular features
Delphi Corporation [58] | Driver State Monitor | -
Volvo [8] | Driver Alert Control | Head movement, ocular features
Carnegie Mellon University [3] | - | Head movement
US Army [3] | - | EEG
Scania [3] | - | Lane deviation, SWM
NHTSA [3] | - | Lane deviation, head movement, EEG, ocular features
AcuMine [59] | HaulCheck | Lane deviation
Attention Technologies [60] | Driver Fatigue Monitor | Ocular features
Subaru [3] | EyeSight Driver Assist | -
Seeing Machines [61] | Facelab | Head movement, ocular features
Smart Eye [62] | AntiSleep | Head movement, ocular features
Sleep Diagnostics [3] | Optalert | Ocular features
AssistWare Technologies [60] | SafeTrac | Lane deviation
Saab [3] | - | Head movement, ocular features
Bosch [7] | Driver drowsiness detection | SWM, vehicle speed, facial image
Siemens [63] | - | Lane deviation, HRV, ocular features
SMI [3] | InSight | Head movement, ocular features
Welkin [3] | Nap Zapper | Head movement
Denso [60] | - | HRV, ocular features
Neurocom [3] | EDVTCS | Skin conductance
Pernix [3] | ASTID | SWM
Ospat Pty [64] | OSPAT | Reaction time
Muirhead/Remote Control Tech. [3] | Fatigue Warning System | Reaction time
Security Electronic Systems [3] | Sleep Control Helmet System | Head movement
MCJ [3] | EyeCheck | Ocular features
Mobileye NV [65] | Vision/Radar Sensor | Lane deviation, SWM
Precision Control Design Inc. [3] | SleepWatch | -
ARRB Transport Research [3] | Fatigue Management System | Reaction time
International Mining Technologies [3] | Voice Commander | Reaction time
Mercedes-Benz [60] | Attention Assist | -
Nissan [66] | Driver Attention Alert | -
Mazda [3] | Lane Departure Warning System | -
Iteris Inc. [60] | AutoVue | Lane deviation
Advanced Safety Concepts [3] | Proximity Array Sensing System (PASS) | Head movement
A close inspection of Table 2.5 reveals that most of the commercially available systems are based on behavioral
measures. Systems that employ vehicle-based signals are fewer, and only a handful of systems use
physiological measures. Furthermore, none of the existing systems use questionnaire-based measures. Moreover,
Table 2.5 shows that most of the existing systems attempt to combine multiple signals to improve detection
performance. Also, some of the companies did not reveal details of the signals that they used in their fatigue
monitoring systems.
There are several drawbacks of commercially available drowsy driving detection systems. Commercially developed
systems are not easily available to everyone [4], and one must purchase vehicles from a particular company to use
their drowsiness and fatigue detection system. The technology used in the commercialized systems is not
open-source. Furthermore, most of the systems presented in Table 2.5 generate a lot of false alarms [3], making those
systems unreliable. Moreover, since most of the existing commercialized systems use either behavioral
measures or vehicle-based measures, they share the limitations of behavioral and vehicle-based drowsiness
detection systems described in Sections 2.2.1 and 2.2.3, respectively.
Chapter 3: Objectives
As previously described, the current methods to detect drowsy driving are of low resolution (30s or longer window
size), even though many of them give high detection accuracy. Only a handful of methods detect drowsiness at a
high (i.e., less than 10s) resolution. However, these studies yield poor detection performance. Furthermore, most of
the existing algorithms are reliant on various external factors such as weather, road geometry, and lighting
conditions. The goal of the present thesis is to overcome these issues. In detail, the two main objectives of this thesis
are:
1. To develop a high-resolution drowsiness detection algorithm using EEG data collected from a sleep study.
2. To test the algorithms developed on sleep study data in a natural drowsiness-inducing setting that is similar to
daily driving.
The first objective of my thesis project is to develop a high-resolution and accurate drowsiness detection algorithm
using EEG data collected from a sleep study. In a sleep study, the subject lies in bed with eyes closed and
without much body movement, resulting in EEG signals that are relatively less contaminated by eye movement,
motion, and eye blink artifacts. Therefore, we use the relatively noise-free sleep-EEG data to develop a
high-resolution drowsiness detection algorithm.
The second objective is to test the algorithms developed on sleep study data in an experiment that is more similar to
driving. To achieve this objective, we will design an interactive reaction time task that requires the participant’s
attention yet is monotonous at the same time. In this study, we will use facial videos to identify the episodes of
drowsiness. The algorithm developed in the first objective will be validated against a gold standard behavioral scale
of drowsiness based on facial video.
Chapter 4: Study 1 - Sleep Study
In this chapter, a detailed description of the overnight sleep study and the algorithm developed for detection of
drowsiness are presented. EEG data collected from a sleep study were analyzed to develop a high-resolution
drowsiness detection algorithm.
4.1 Method
For the development of the EEG-based drowsiness detection algorithm, data collected as part of an overnight sleep
study at Toronto Rehabilitation Institute were used. The data were collected as part of another study, and we used
these data in a post-hoc analysis. The rationale for using sleep data was that they generally tend to be less noisy,
with fewer artifacts such as eye movements, than data recorded during wakefulness. Furthermore, compared to a
simulated driving task, where it can be challenging to accumulate large amounts of data, a sleep study provides enough
data for primary model development. A fully attended overnight polysomnography (PSG) was
conducted using Embla® N7000/S4500 (Natus Medical Incorporated) at the Toronto Rehabilitation Institute Sleep
laboratory. Standard surface electrodes were applied to record EEG, electrocardiogram (ECG), and electromyogram (EMG).
Respiratory rate and volume were monitored using chest and abdominal respiratory inductance plethysmography
bands, airflow by nasal pressure cannula, and arterial oxyhemoglobin saturation (SaO2) using pulse oximetry. Sleep
stages and arousals were scored in accordance with standard rules discussed in Chapter 2 [37].
4.1.1 EEG Data
The EEG recordings contained data from six electrodes: two frontal (F3/F4), two central (C3/C4), and two occipital
(O1/O2) electrodes. The electrodes were referenced against the mastoid electrodes (M1 and M2). The sampling rate
of the EEG data was 128 Hz. Fig. 4.1 illustrates the EEG electrode locations used in this study.
Figure 4.1: EEG electrode placement map commonly used in sleep studies [67]. The six electrodes available in this study are highlighted with red circles.
4.1.2 Preprocessing of EEG Signals
The EEG data were bandpass filtered using a Butterworth filter with 0.5-30 Hz cut-off frequencies. Next, the data
were divided into 3s segments. Since we aimed to extract features from the delta band (1-4 Hz) of EEG, the
segment length was chosen to contain at least two cycles of delta. For drowsiness detection, all episodes which were
rated by sleep technicians based on the whole night EEG recordings as awake and non-rapid eye movement 1
(non-REM 1) were considered. The sleep technicians score the sleep-EEG data on a 30s basis. That is, the technicians
rated a 30s EEG episode as awake when the segment consisted of at least 50% wakefulness; otherwise it was scored as
non-REM 1 [15]. As a result, there could be short wakefulness, drowsy, and non-REM 1 episodes within scored
wakefulness as well as non-REM 1 segments. In this study, we also used arousal segments as extreme cases of alertness
and deep sleep segments as extreme cases of non-alertness for parameter selection and validation of the proposed method.
Arousal segments occur after a respiratory event and are typically 3s long [68]. During arousals, the participant is
cortically awake. For deep sleep segments, we chose non-rapid eye movement 2 and 3 stages, during which the
participant is definitely asleep.
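The preprocessing steps described above can be sketched in Python with NumPy and SciPy as follows. This is only an illustration under stated assumptions: the filter order, the zero-phase (forward-backward) filtering, and the use of non-overlapping segments are not specified in the text, and the function names are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 128        # sampling rate in Hz (Section 4.1.1)
SEGMENT_S = 3   # segment length in seconds

def bandpass(eeg, fs=FS, low=0.5, high=30.0, order=4):
    """Butterworth bandpass, 0.5-30 Hz. Order and zero-phase filtering are assumptions."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg)

def segment(eeg, fs=FS, seconds=SEGMENT_S):
    """Split a 1-D signal into non-overlapping segments, dropping the incomplete tail."""
    n = fs * seconds
    usable = (len(eeg) // n) * n
    return np.asarray(eeg[:usable]).reshape(-1, n)

# Synthetic stand-in for one EEG channel: 10 s of white noise.
rng = np.random.default_rng(0)
raw = rng.standard_normal(FS * 10)
segments = segment(bandpass(raw))
print(segments.shape)
```

Each row of `segments` then holds one 3s segment (384 samples at 128 Hz) ready for feature extraction.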
4.1.3 Feature Extraction from EEG Frequency Bands
After performing preprocessing, we extracted features from the EEG frequency bands. Prior studies have shown that
as an individual moves from wakefulness to non-REM 1, alpha and beta band powers decrease, and delta band
power increases [13, 15]. The spectrogram in Fig. 4.2 shows an example of the changes in the EEG frequency band
powers alpha and delta at sleep onset, as scored by a sleep technician. The beta band changes in the spectrogram are
not visible, since the beta power levels are smaller compared to other bands.
In this study, the relative power features of alpha, delta, and beta bands were investigated. Relative power of a band
was defined as:
\[ \text{Relative power of a band} = \frac{\text{Average power of the band}}{\text{Average power from 1--30 Hz}} \tag{4.1} \]
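A minimal NumPy sketch of Eq. (4.1) is given below. It is not the implementation used in this study. The "average power" of a band is interpreted here as the signal power contained in that band (the periodogram summed over the band), which keeps the ratio between 0 and 1. The alpha and beta band edges are conventional values assumed for illustration, and the function name is hypothetical.

```python
import numpy as np

FS = 128  # sampling rate in Hz
# Delta edges follow the text (1-4 Hz); alpha and beta edges are assumed conventional values.
BANDS = {"delta": (1, 4), "alpha": (8, 13), "beta": (13, 30)}

def relative_powers(segment, fs=FS, bands=BANDS, total=(1, 30)):
    """Relative band powers of one segment from its periodogram."""
    freqs = np.fft.rfftfreq(len(segment), d=1 / fs)
    psd = np.abs(np.fft.rfft(segment)) ** 2

    def band_power(lo, hi):
        mask = (freqs >= lo) & (freqs < hi)
        return psd[mask].sum()

    total_power = band_power(*total)
    return {name: band_power(lo, hi) / total_power for name, (lo, hi) in bands.items()}

# Synthetic 3 s segment dominated by a 10 Hz (alpha) rhythm with a weak 2 Hz (delta) component.
t = np.arange(3 * FS) / FS
alpha_dominated = np.sin(2 * np.pi * 10 * t) + 0.1 * np.sin(2 * np.pi * 2 * t)
rel = relative_powers(alpha_dominated)
print(rel)
```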
Figure 4.2: Spectrogram (3s window, 50% overlap) of the first few minutes of EEG recording at channel F4 of a single participant. The F4 electrode was referenced against the left mastoid (M1). Based on the scoring by sleep
technicians, the transition from wakefulness to non-REM 1 occurs at 4.5 minutes.
4.1.4 Sigmoid Wake Probability Model
A sigmoid function has a characteristic "S"-shaped curve. If the input to the sigmoid is extremely high or low, the
output saturates close to either 0 or 1. Towards the middle, the curve smoothly increases or
decreases. Using the three relative power values of alpha, delta, and beta, we developed a probability model using
sigmoid functions. This model outputs the likelihood of wakefulness for a 3s long EEG signal segment. The sigmoid
functions used in the model are depicted in Fig. 4.3. The probability of wakefulness for each feature (PrF(W)) should
be high if the relative power values of alpha or beta are high. Since alpha and beta band powers are high during
wakefulness, the red curve is used to capture the changes in these two bands. Delta band power, on the other hand,
increases as an individual goes from wakefulness to sleep. PrF(W) should be low if the
relative power values of delta are high, which is why the black curve is used to compute PrF(W) from the delta band.
Thus, each of the features is fed into a sigmoid function to obtain the probability of wakefulness for that feature
(PrF(W)).
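The two curve shapes can be sketched in Python as follows. The exact functional form is not given in the text, so a logistic curve is assumed here that passes through 0.05 at parameter a and 0.95 at parameter b; the `tail` value and the function names are hypothetical choices for illustration only.

```python
import math

def rising_sigmoid(x, a, b, tail=0.05):
    """Logistic curve equal to `tail` at x = a and 1 - tail at x = b (assumed form)."""
    mid = (a + b) / 2.0
    slope = 2.0 * math.log((1.0 - tail) / tail) / (b - a)
    z = max(-50.0, min(50.0, slope * (x - mid)))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def falling_sigmoid(x, a, b, tail=0.05):
    """Mirror image of the rising curve: high input gives low probability of wakefulness."""
    return 1.0 - rising_sigmoid(x, a, b, tail)

# Rising curve (alpha or beta): higher relative power gives higher PrF(W).
print(rising_sigmoid(0.02, a=0.005, b=0.237), rising_sigmoid(0.30, a=0.005, b=0.237))
# Falling curve (delta): higher relative power gives lower PrF(W).
print(falling_sigmoid(0.2, a=0.162, b=0.972), falling_sigmoid(0.9, a=0.162, b=0.972))
```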
Figure 4.3: Sigmoid functions used in the proposed model. Probability of wakefulness for each feature (PrF(W)) should be high, if relative power values of alpha/beta are high. Therefore, the red curve is used to capture the
changes in these two bands. The opposite scenarios are seen for delta band, which is why the black curve is used to compute PrF(W) from delta band.
Subsequently, a weighted average of the PrF(W) values was taken to obtain the final probability of wakefulness Pr(W)
using the following equation:
\[ \Pr(W) = w_1 P_\alpha + w_2 P_\beta + w_3 P_\delta \tag{4.2} \]
Here, w1, w2, and w3 are the weights and Pα, Pβ, and Pδ are the sigmoid outputs for the alpha, beta, and delta relative
power features, respectively.
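Eq. (4.2) amounts to a few lines of Python. In the sketch below the weights are divided by their sum so that Pr(W) stays in [0, 1]; this normalisation is an assumption, and the weight values and function name are hypothetical.

```python
def wake_probability(p_alpha, p_beta, p_delta, weights):
    """Weighted average of the per-feature wake probabilities (Eq. 4.2)."""
    w1, w2, w3 = weights
    total = w1 + w2 + w3  # normalise so the result is a proper average (assumption)
    return (w1 * p_alpha + w2 * p_beta + w3 * p_delta) / total

# Hypothetical sigmoid outputs and weights for one 3s segment.
pr_w = wake_probability(0.9, 0.8, 0.7, weights=(0.5, 0.3, 0.2))
print(pr_w)
```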
4.1.5 Parameter Selection for the Model
In order to determine the optimal choice of the sigmoid parameters (a and b, Fig. 4.3) and to validate the proposed
model, we selected arousal and deep sleep segments from the sleep-EEG data. To select the sigmoid parameters, we
divided the arousal and deep sleep segments into training and testing data. That is, the training data were used to
compute the model parameters, and the testing data were used to validate the choices of the sigmoid parameters and
weights. To estimate sigmoid parameter a, the feature values of the deep sleep distribution were sorted. Afterwards,
the maximum of the lower 80% of data was selected as the sigmoid parameter a. To estimate sigmoid parameter b,
the feature values of the arousal distribution were sorted. Afterwards, the minimum of the higher 80% of data was
selected as the sigmoid parameter b.
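The selection rule for a and b can be sketched in Python as follows. The rounding used to decide how many values make up "80% of data" is an assumption, and the function names and sample values are hypothetical.

```python
def sigmoid_param_a(deep_sleep_values, keep=0.8):
    """Maximum of the lower 80% of the sorted deep sleep feature values."""
    ordered = sorted(deep_sleep_values)
    n_keep = max(1, int(len(ordered) * keep))  # rounding down is an assumption
    return ordered[n_keep - 1]

def sigmoid_param_b(arousal_values, keep=0.8):
    """Minimum of the higher 80% of the sorted arousal feature values."""
    ordered = sorted(arousal_values)
    n_keep = max(1, int(len(ordered) * keep))
    return ordered[len(ordered) - n_keep]

# Hypothetical relative alpha power values from training segments.
deep_sleep = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
arousal = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00]
print(sigmoid_param_a(deep_sleep), sigmoid_param_b(arousal))
```

Discarding the top 20% of the deep sleep values and the bottom 20% of the arousal values makes both parameters less sensitive to outliers in the two distributions.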
To compute the weights of the features (w1, w2, w3), we employed the out-of-bag (OOB) permuted predictor delta error
method [69, 70]. The advantage of the OOB permuted predictor delta error method over other methods such as linear
regression is its diversity. Because of the use of bootstrapped replicas of the original dataset and out-of-bag
examples, the weights calculated by the OOB permuted predictor delta error method are more robust [70]. At first, we
grew an ensemble of decision trees. Every decision tree in the ensemble was grown on an independently drawn
bootstrap replica of equal size of the input data. Observations not included in this replica were "out-of-bag" for that
tree. Fig. 4.4 gives an example of how bootstrapped decision trees were formed. Each of the examples in Fig. 4.4 is
a multidimensional vector wherein the dimension is equal to the number of features used. By drawing samples with
replacement from S (original dataset) in Fig. 4.4, three bootstrap replicas S1, S2, and S3 were formed. A decision
tree was trained using each of these replicas. The out-of-bag instances are examples that occur in S, but not in a
bootstrap replica. For instance, out-of-bag examples for bootstrap replica S1 are d and e.
After constructing the bootstrapped decision trees, we computed OOB permuted predictor delta error for each of the
three features. For any feature, OOB permuted predictor delta error was the increase in prediction error if the values
of that variable were permuted across the out-of-bag observations. This measure was computed for every tree, then
averaged over the entire ensemble and divided by the standard deviation over the entire ensemble. If OOB permuted
predictor delta error was higher for a particular feature than others, it would indicate that the particular feature was
more important than others.
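The procedure described above can be sketched in NumPy as follows. To keep the example short, each decision tree is replaced by a nearest-centroid classifier; this substitution, the function names, and the synthetic data are assumptions made for illustration and do not reproduce the implementation used in this study.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in learner (NOT a decision tree): one centroid per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(model, X):
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def oob_permuted_delta_error(X, y, n_learners=200, seed=0):
    """Mean increase in OOB error after permuting each feature, divided by its std."""
    rng = np.random.default_rng(seed)
    n, n_features = X.shape
    deltas = []
    for _ in range(n_learners):
        boot = rng.integers(0, n, size=n)       # bootstrap replica, drawn with replacement
        oob = np.setdiff1d(np.arange(n), boot)  # observations left out of this replica
        if oob.size == 0 or np.unique(y[boot]).size < 2:
            continue
        model = fit_centroids(X[boot], y[boot])
        base_err = np.mean(predict_centroids(model, X[oob]) != y[oob])
        row = []
        for j in range(n_features):
            X_perm = X[oob].copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the link between feature j and y
            perm_err = np.mean(predict_centroids(model, X_perm) != y[oob])
            row.append(perm_err - base_err)
        deltas.append(row)
    deltas = np.asarray(deltas)
    std = deltas.std(axis=0)
    std[std == 0] = 1.0  # guard against division by zero
    return deltas.mean(axis=0) / std

# Synthetic data: feature 0 separates the two classes, feature 1 is pure noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = np.column_stack([y + 0.3 * rng.standard_normal(200), rng.standard_normal(200)])
importance = oob_permuted_delta_error(X, y)
print(importance)
```

Permuting the informative feature raises the out-of-bag error sharply, while permuting the noise feature changes it very little, so the first importance value is much larger than the second.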
Figure 4.4: An example of bootstrapping and out-of-bag instances. The original data-set S has 5 examples- a, b, c, d, and e. Each of the examples is a multidimensional vector wherein the dimension is equal to the number of features
used. By drawing samples with replacement from S, three bootstrap replicas S1, S2, and S3 are formed. Each one of the bootstrap replicas is used to train a decision tree. The out-of-bag instances are examples that occur in S, but not
in a bootstrap replica. Out-of-bag examples for bootstrap replica S1 are, for example, d and e.
The last stage was to identify wakefulness, sleep, and drowsy clusters by thresholding Pr(W). To identify three
clusters, we essentially had to find the upper and lower bounds of the drowsy cluster. We selected the lower and
upper bounds of the drowsy cluster by varying the upper and lower cut-offs and computing the clustering evaluation
metrics for every choice of cutoff values. The final bounds were those for which the three clusters were maximally
dissimilar from one another.
Even though the arousal and deep sleep segments used for validation would manifest the extent of the three clusters,
we used cluster quality evaluation metrics, including Davies-Bouldin [71] and silhouette [72] indices to determine
the upper and lower cut-offs of the drowsy cluster. Davies-Bouldin index (DB) is defined as [71]:
\[ DB = \frac{1}{n} \sum_{i=1}^{n} \max_{j \neq i} \left( \frac{\sigma_i + \sigma_j}{d(c_i, c_j)} \right) \tag{4.3} \]
Here, n is the number of clusters, σi is the average distance of all patterns in cluster i to their cluster center ci , σj is
the average distance of all patterns in cluster j to their cluster center cj , and d(ci , cj ) is the distance of cluster centers
ci and cj. Small values of DB correspond to clusters that are compact, and whose centers are far away from each
other.
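As a minimal illustration, the following Python sketch computes the Davies-Bouldin index for one-dimensional Pr(W) values and uses it to pick the lower and upper cut-offs of the drowsy cluster from a small grid of candidates. The function names, the candidate grid, and the sample values are hypothetical; the sketch assumes that no two cluster centers coincide.

```python
from statistics import mean

def davies_bouldin(clusters):
    """Davies-Bouldin index for clusters given as lists of 1-D values."""
    centers = [mean(c) for c in clusters]
    spreads = [mean(abs(x - ctr) for x in c) for c, ctr in zip(clusters, centers)]
    n = len(clusters)
    total = 0.0
    for i in range(n):
        total += max(
            (spreads[i] + spreads[j]) / abs(centers[i] - centers[j])
            for j in range(n) if j != i
        )
    return total / n

def split(values, lower, upper):
    """Sleep, drowsy, and awake clusters obtained by thresholding Pr(W)."""
    sleep = [v for v in values if v < lower]
    drowsy = [v for v in values if lower <= v <= upper]
    awake = [v for v in values if v > upper]
    return [sleep, drowsy, awake]

def best_cutoffs(values, candidates):
    """Pair of cut-offs with the smallest Davies-Bouldin index (non-empty clusters only)."""
    best = None
    for lower in candidates:
        for upper in candidates:
            if lower >= upper:
                continue
            clusters = split(values, lower, upper)
            if any(len(c) == 0 for c in clusters):
                continue
            score = davies_bouldin(clusters)
            if best is None or score < best[0]:
                best = (score, lower, upper)
    return best

# Hypothetical Pr(W) values for nine segments.
pr_w = [0.05, 0.08, 0.10, 0.45, 0.50, 0.55, 0.90, 0.93, 0.95]
candidates = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8]
score, lower, upper = best_cutoffs(pr_w, candidates)
print(score, lower, upper)
```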
For each data point k, we computed the silhouette value s(k) as [72]:
\[ s(k) = \begin{cases} 1 - \dfrac{a(k)}{b(k)}, & a(k) < b(k) \\ \dfrac{b(k)}{a(k)} - 1, & a(k) \geq b(k) \end{cases} \tag{4.4} \]
Here, a(k) = the average dissimilarity/distance of k with all other data within the same cluster and b(k)= the lowest
average dissimilarity of k to any other cluster, of which k is not a member. Silhouette value s(k) close to 1 suggests
that the data point belongs to the proper cluster. Silhouette value close to -1, on the other hand, suggests that the
particular data point was assigned to the wrong cluster.
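Eq. (4.4) can be sketched in Python for one-dimensional data as follows. The sketch also counts the share of points with a positive silhouette value, which Section 4.1.6 uses as a measure of detection accuracy. The function names and sample values are hypothetical, and assigning a value of 0 to points in single-member clusters is a convention assumed here.

```python
def silhouette_values(clusters):
    """Silhouette value of every point, listed cluster by cluster (1-D data)."""
    result = []
    for i, cluster in enumerate(clusters):
        for idx, point in enumerate(cluster):
            same = cluster[:idx] + cluster[idx + 1:]
            if not same:
                result.append(0.0)  # convention for single-member clusters (assumption)
                continue
            a = sum(abs(point - q) for q in same) / len(same)
            b = min(
                sum(abs(point - q) for q in other) / len(other)
                for j, other in enumerate(clusters) if j != i and other
            )
            if a < b:
                result.append(1 - a / b)
            elif a > b:
                result.append(b / a - 1)
            else:
                result.append(0.0)  # a == b gives 0 in Eq. (4.4)
    return result

def share_positive(values):
    """Fraction of points with a positive silhouette value."""
    return sum(1 for v in values if v > 0) / len(values)

# Hypothetical Pr(W) values; 0.95 is deliberately placed in the wrong cluster.
clusters = [[0.0, 0.1, 0.95], [0.9, 1.0]]
s = silhouette_values(clusters)
print(s, share_positive(s))
```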
4.1.6 Validation
After identifying three clusters from the data, we performed cluster validation. For validation of the proposed
method, we randomly selected 50% of participants as the training dataset and the remainder as the testing dataset. The
arousal and the deep sleep segments from the training dataset were used to determine the parameters of the method.
Furthermore, the arousal and the deep sleep segments from the testing dataset were used to determine the
effectiveness of parameter selection and the proposed method. The proposed model was validated by determining if
it can successfully separate arousal and deep sleep segments. In other words, if the sigmoid wake probability model
performs well, it can be expected to give a high probability of wakefulness for arousal segments and a low
probability of wakefulness for deep sleep segments. Using the awake and non-REM 1 periods from the testing
dataset, we visually investigated the distribution of alpha, delta, and beta bands of the awake, drowsy, and sleep
clusters as a qualitative assessment of the clustering performance. To quantitatively assess the clustering performance, we
performed one-way repeated measures analysis of variance to test if the three features were significantly different
among the three clusters. The clusters were identified based on the feature distributions. We also determined the
clustering quality evaluation metric values to validate the quality of the estimated clusters from the awake and
non-REM 1 data points. Finally, we determined the average detection accuracy by considering the data points with positive
silhouette values as correctly classified and data points with negative silhouette values as misclassified instances.
4.2 Results
In this section, the results of sigmoid wake probability model on sleep study data are presented. The demographic
information of the participants is presented in Table 4.1.
Table 4.1: Demographic information of the participants. Data are presented as mean ± standard deviation.
Characteristics n= 53
Female, n (%) 26 (49.06%)
Body mass index (kg/m2) 29.09 ± 6.16
Age (years) 49.58 ± 16.18
Table 4.2 summarizes the number of 3-s segments used in this study for model development and validation. Results
obtained from the F4-M1 electrode (F4 referenced against left mastoid M1) were used for data analyses and are
presented here, but note that similar results were achieved for all other electrodes (Appendix 4A).
Table 4.2: Average and standard deviation of the number of 3-s segments from F4-M1 used in this study for model development and validation. Data are presented as mean ± standard deviation.
Stage Number of Segments per Participant
Arousal 5 ± 1
Deep sleep 23 ± 14
Non-REM 1 396 ± 112
Awake 1447 ± 337
In order to select the sigmoid parameters (a and b in Fig. 4.3) and to validate the efficacy of the sigmoid wake
probability model, 304 arousal segments and 1267 deep sleep segments were selected from all the participants.
Of the selected arousal and deep sleep segments, data from 26 participants were used for training, and the remaining
data were used for testing. Sigmoid parameters and weights computed from the training data were used to determine
Pr(W) for the testing data.
Figure 4.5: Panel a: Sigmoid parameters to be estimated for the alpha band. Panel b: Sigmoid parameter b is selected as the minimum feature value of the arousal distribution after removing outliers. Panel c: Sigmoid parameter a is
selected as the maximum feature value of the deep sleep distribution after removing outliers.
Fig. 4.5 shows the distribution of the relative power of the alpha band for the arousal and deep sleep segments of the
training data. To compute sigmoid parameter b (Fig. 4.5a) for the alpha band, the feature values of the arousal
distribution were sorted, and the lowest 20% were discarded as outliers. The minimum feature value of the remaining
upper 80% of the data was selected as the sigmoid parameter b for the alpha band (b = 0.237, Fig. 4.5b). Similarly, to
compute sigmoid parameter a (Fig. 4.5a) for the alpha band, the relative power values of the deep sleep distribution
were sorted, and the maximum value of the lower 80% of the data was selected as the sigmoid parameter a for the
alpha band (a = 0.005, Fig. 4.5c). Table 4.3 shows the sigmoid parameter values obtained for the three features, and
Fig. 4.6 (a) shows the resultant sigmoid functions obtained by using the parameters in Table 4.3.
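The parameter selection described above can be sketched as follows. This is an illustrative reading of the procedure, not the thesis code: the function name and the rounding used to find the 80% boundary are assumptions.

```python
def sigmoid_parameters(arousal_values, deep_sleep_values, keep=0.8):
    """Return (a, b) for one feature from the training distributions.

    b: minimum of the upper `keep` fraction of the sorted arousal values.
    a: maximum of the lower `keep` fraction of the sorted deep sleep values.
    """
    arousal = sorted(arousal_values)
    deep = sorted(deep_sleep_values)
    # Drop the lowest (1 - keep) fraction of arousal values as outliers.
    n_drop = len(arousal) - round(keep * len(arousal))
    b = arousal[n_drop]
    # Keep the lowest `keep` fraction of deep sleep values; take their maximum.
    n_keep = max(1, round(keep * len(deep)))
    a = deep[n_keep - 1]
    return a, b
```

With ten values per distribution, the two smallest arousal values and the two largest deep sleep values are ignored when choosing b and a.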
Table 4.3: Sigmoid parameters computed from the training data for F4-M1
Frequency band a b
Alpha 0.005 0.237
Beta 0.018 0.167
Delta 0.162 0.972
The features’ weights were calculated using OOB permuted predictor delta error from the training data. Fig. 4.6 (b)
shows the OOB permuted predictor delta error for each of the three features. It is clear from Fig. 4.6 (b) that relative
power of alpha should have the highest weight. The OOB permuted predictor delta error values obtained from the
training data were used to analyze the test data in this work.
Figure 4.6: (a) The resultant sigmoid functions for the three features for F4-M1. (b) Out-of-bag (OOB) permuted predictor delta error for the three features computed from the training data.
The sigmoid parameters and weights computed from the training dataset were used to compute the probability of
wakefulness (Pr(W)) of the test dataset (Eq. 4.2). The choice of model parameters (w1, w2, w3) was validated by
the model's ability to discriminate between arousal and deep sleep segments of the training and testing data, as shown
in Fig. 4.7.
Furthermore, Fig. 4.7 (b) shows that the proposed model distinguishes between arousal and deep sleep segments of
the testing data, which indicates the efficacy of the sigmoid wake probability model. Fig. 4.7 (a) also indicates the
extent of the drowsy cluster: segments with Pr(W) > 28 and Pr(W) < 65 may belong to the drowsy cluster. This is
further validated by the Davies-Bouldin (DB, Eq. 4.3) and silhouette (Eq. 4.4) index maps in Fig. 4.8.
Figure 4.7: Pr(W) distribution of arousal and deep sleep segments of (a) training data and (b) testing data obtained by weights and sigmoid parameters computed from the training data. The proposed model yields lower Pr(W) for
deep sleep segments and higher Pr(W) for arousal segments in the testing data, which indicates its efficacy.
Figure 4.8: (a) Silhouette and (b) Davies-Bouldin index values computed from the 3-s segments of awake and non-REM 1 data obtained from the training participants. For both maps, the upper and lower bounds of the drowsy
cluster have been varied and the two metrics were computed for the awake, sleep, and drowsy clusters.
In Fig. 4.8, warmer colors indicate more dissimilar clusters. Fig. 4.8 (a) shows that higher silhouette values are
achieved if the lower cutoff of the drowsy cluster is between 21 and 30 and the upper cutoff is between 54 and 58.
Fig. 4.8 (b) shows that a lower cutoff of 21 to 27 and an upper cutoff of 54 to 55 give the lowest DB index. For
smaller values of the upper (<54%) and lower (<21%) cutoffs in Fig. 4.8, the DB index is very high and the silhouette
values are very low. Therefore, based on Fig. 4.7 and Fig. 4.8, the lower cutoff of the drowsy cluster was set to
Pr(W) = 28% and the upper cutoff was set to Pr(W) = 55%.
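Once the two cutoffs are fixed, assigning a segment to a cluster reduces to thresholding Pr(W). A minimal sketch, assuming Pr(W) is expressed in percent and that values exactly on a cutoff are treated as drowsy (the text does not specify the boundary handling; the function name is hypothetical):

```python
def assign_cluster(pr_w, lower=28.0, upper=55.0):
    """Map a Pr(W) value in percent to 'sleep', 'drowsy', or 'awake'."""
    if pr_w < lower:
        return "sleep"
    if pr_w > upper:
        return "awake"
    return "drowsy"  # boundary values fall here by assumption


segments = [12.0, 28.0, 41.5, 55.0, 90.3]
clusters = [assign_cluster(p) for p in segments]
```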
Figure 4.9: Scatter diagram of the awake, drowsy, and sleep clusters for the testing data. Here, the lower cutoff of the drowsy cluster is set to Pr(W) = 28% and the upper cutoff is set to Pr(W) = 55%.
Fig. 4.9 shows the clusters obtained when the aforementioned upper and lower cutoffs are applied to the testing
data. The quality of the clusters and the choice of model parameters were further validated using the feature
distributions obtained when the model was applied to all non-REM 1 and wakefulness segments of the testing participants.
As expected, the relative power values of the alpha and beta bands were lower for the sleep cluster (Fig. 4.10 (a) and
Fig. 4.10 (b)) and higher for the awake cluster. The opposite pattern is seen for the delta band in Fig. 4.10 (c).
Furthermore, mean silhouette values greater than 0.6 indicate compact clusters [72].
Fig. 4.9 shows that the three clusters obtained by thresholding Pr(W) are compact, with a mean
silhouette value close to 0.74. Lastly, the mean and maximum detection accuracies were 93.21% and 94.73%,
respectively.
Figure 4.10: Distributions of relative power of (a) alpha, (b) beta, and (c) delta for three clusters of awake, drowsy and sleep. Here, all episodes of non-REM 1 and wakefulness were considered. The sleep segments (Pr(W)<28) have low alpha and beta power and high delta power, while awake segments (Pr(W)>55) have high alpha and beta power
and low delta power.
Table 4.4: One-way repeated measures analysis of variance suggests that the feature values (mean ± standard deviation) differ significantly among the three clusters.
Frequency Band Wakefulness Drowsy Sleep p-Value
Delta 0.149 ± 0.042 0.651 ± 0.024 0.916 ± 0.030 <.001
Alpha 0.193 ± 0.034 0.079 ± 0.013 0.017 ± 0.005 <.001
Beta 0.522 ± 0.086 0.095 ± 0.034 0.017 ± 0.005 <.001
Table 4.4 shows the results of the repeated measures analysis of variance of the three features, and Fig. 4.11 presents the
results of the post hoc analysis, for which we employed Tukey’s multiple comparison test. The repeated measures
analysis of variance results in Table 4.4 show that the relative powers of the alpha, delta, and beta bands are significantly
different among the three clusters. The post hoc analysis shows that, compared to the sleep or drowsy
clusters, the relative delta power is significantly lower and the relative alpha and beta powers are significantly higher in
the awake cluster. These results further validate the clusters in the proposed framework.
Figure 4.11: Post hoc multiple comparison test suggests that (a) alpha, (b) beta, and (c) delta power features are significantly different between the clusters. Error bars indicate standard deviation. * indicates p<.0001.
4.3 Discussions
In this study, we developed a high-resolution drowsy driving detection algorithm by extracting features from
the EEG frequency band changes during sleep. The sigmoid wake probability model developed herein provided a
likelihood of wakefulness (Pr(W)) for 3-s signal segments. By choosing appropriate thresholds for Pr(W), we
identified three clusters. The feature distributions of the clusters suggest that they correspond to wakefulness,
drowsiness, and sleep. The proposed scheme was validated using arousal and deep sleep segments, cluster
quality evaluation metrics, and graphical and statistical analyses. The results presented in the foregoing section address
objective 1 in Section 4 and suggest that the spectral properties of EEG indeed reflect the likelihood of wakefulness
for short episodes and can support the development of a high-resolution drowsiness detection algorithm.
The choice of sigmoid parameter a necessitated the use of deep sleep segments. This is because for any feature
value below a in Fig. 4.3, Pr_F(W) is close to zero or one. For the same reason, arousal segments were used to
determine the optimal choice of the other sigmoid parameter, b. Moreover, the distributions in Fig. 4.10 as well as
the results of the repeated measures analysis of variance further validate that the three resultant clusters can be identified
as wakefulness, drowsiness, and sleep. The slight overlap in some of the distributions in Fig. 4.10 could be due to
variations in the power levels of the EEG frequency bands across participants. However, inter-participant variation in
the power level of a particular frequency band should have only a minor effect on the overall results. For example, if one
participant has high delta power for an awake 3-s segment, the relative powers of alpha and beta will still make Pr(W)
for that segment high enough for the segment to fall into the awake cluster.
The proposed sigmoid wake probability model has several advantages. First, the proposed scheme, once trained, is
independent of a sleep technician’s labels. Prior studies have shown that inter-rater disagreement can be up to 20%
[14]. Also, a technician’s scoring accuracy is subject to bias and to error due to fatigue. Since the proposed method
uses the relative power of three EEG bands to detect drowsiness, it is free from these limitations.
Second, the use of relative power makes it more robust to noise in the EEG signals. Third, even though the foregoing
sections present results from one of the frontal electrodes only, the proposed model yields similar results for the
other five electrodes as well. This suggests that the proposed method can detect drowsiness using single-channel
EEG signals. Fourth, the proposed model can be used to quantify sleep disorders such as insomnia,
in which the time course of the awake-to-sleep transition is pathologically protracted [73]. The
proposed scheme can also be used to quantify conditions such as narcolepsy or sleep deprivation, in which
the wake-to-sleep transition occurs too rapidly [74]. Therefore, using the proposed framework, one can characterize
the sleep onset process phenotypes of different clinical populations and the natural heterogeneity among healthy
participants. Fifth, since the sigmoid wake probability model separates arousal and deep sleep segments quite well
(Fig. 4.7), it can be used for automated arousal identification. Sixth, the sigmoid wake probability model can also
help sleep technicians identify awake episodes and pinpoint sleep onset. Lastly, conventional clustering
algorithms such as k-means or hierarchical clustering require a large amount of data to discover groups or clusters in
the data [75]. In contrast, the sigmoid wake probability model, once trained, can determine whether an
arbitrary 3-s episode is awake, drowsy, or asleep.
A limitation of the proposed model is that it cannot detect drowsy episodes that are shorter than 3 s. While this
resolution is higher than that of most existing EEG-based works in the literature, lapses or behavioral microsleeps
can last only 1 s or a fraction of a second [52]. The inclusion of a feature based on the delta band, which ranges from
1 to 4 Hz, necessitates signal segments of at least 2 s, because two cycles of a 1 Hz component span 2 s. In the future,
features based on only alpha and beta could be used so that segments as short as 0.25 s (necessary to see at least two
cycles of alpha) can be used to detect drowsiness. Time-domain features such as entropy- and/or complexity-based
features can also be used to overcome this limitation. Another limitation of the proposed model is that it has not been
validated against behavioral scales of drowsiness. In addition, the sleep study did not require participants to stay awake
and interact with a task. The subsequent study addresses these limitations.
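The segment-length argument in the paragraph above (at least two cycles of the slowest component must fit into a segment) can be checked with a short calculation. The function name is hypothetical, and the 8 Hz lower edge of the alpha band is taken from the band definitions used later in this thesis.

```python
def min_segment_length(lowest_freq_hz, n_cycles=2):
    """Shortest segment, in seconds, that holds n_cycles of the slowest component."""
    return n_cycles / lowest_freq_hz


delta_limit = min_segment_length(1.0)  # delta band starts at 1 Hz
alpha_limit = min_segment_length(8.0)  # alpha band starts at 8 Hz
```

Under this rule the delta band limits the resolution to 2 s, whereas a model using only alpha and faster bands could work with 0.25-s segments.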
Chapter 5: Study 2- Reaction Time Study
In Study 1, we developed an algorithm that showed promising results in detecting drowsiness.
However, several open questions remain. For instance, it needs to be tested whether this algorithm shows similar
performance in detecting drowsiness in a non-sleep-related context. In a sleep study, the subjects lie in a supine
position trying to fall asleep. This minimizes movement artifacts but is not comparable to a driving scenario, where
the subject sits, tries to stay awake, and interacts with the environment. Furthermore, sleep studies
lack the cognitive components, such as attention, memory, or perception, that are involved during driving. Also, in a
sleep study, participants are usually not healthy and have various sleep disorders. Finally, the categorization of sleep
stages based on the ratings provided by the sleep technician is rather coarse. The technicians rated only 30-s-long
segments based solely on EEG data. Other helpful cues, such as facial components that could indicate exactly when
the subject became drowsy, are missing. To overcome these issues, we conducted a second study that required the
participant to perform a task that is engaging yet monotonous enough to induce drowsiness.
5.1 Methods
5.1.1 Study Design
We took several requirements into consideration during study design. Firstly, the participant should
try to stay alert in a sitting position, similar to a driving scenario. Secondly, a monotonous task should keep
the subject engaged while inducing drowsiness. Thirdly, the stimulus must be rich in visual information to keep the
participant engaged, so that the developed algorithms can be tested in a more active, engaging, and
demanding setting that is more similar to the processes involved in driving. Lastly, one must be able to
pinpoint episodes of drowsiness to develop a gold standard for detection. We designed the study with
these requirements in mind.
5.1.2 Stimulus and Data Collection
Fifteen healthy participants (7 females) between 18 and 49 years old (29.33 ± 7.64 years) took part in this
study. One participant’s data were corrupted and were therefore excluded from further analysis. Participants with a
history of sleep disorders, stroke, active vestibular disorders, disabling musculoskeletal disorder, acute psychiatric
disorder, or a diagnosis of dementia or mild cognitive impairment, and those who had regular intake of sedating medication (e.g.,
opioids), and/or engaged in shiftwork were excluded from the study. The study was conducted in the afternoon after
lunch since at this time of the day the participants were the most likely to be sleepy [76]. The participants were
instructed to have lunch before the experiment. They were also instructed to refrain from having attention-altering
food and drinks such as tea, coffee, or alcohol that day prior to the experimental session. Fig. 5.1 shows an
illustration of the stimulus used in Study 2.
Figure 5.1: Illustration of the stimulus used in the present study. The stripes move horizontally with the red fixation cross at the center. The participant presses a button as soon as the cross momentarily turns blue.
For the experiment, participants were seated in front of a projection screen and asked to perform a monotonous
reaction time task, illustrated in Fig. 5.1. The stimulus for the task consisted of a pattern of horizontally moving
black-and-white stripes with a fixation cross at the center. The task was chosen to create a potential sensation of
self-motion (i.e., vection) [77, 78]. This is because the drowsiness detection algorithms developed in this
thesis must ultimately be tested in a driving simulator, since it is unsafe to make a driver drowsy in a real driving
study, and in a driving simulator one can perceive self-motion in the absence of true, physical motion. The fixation
cross was programmed to be red most of the time and would occasionally (~10% of the time) turn blue for 500-750
ms. The alternating black-and-white stripes shown in Fig. 5.1 moved either to the right or to the left in each trial. The
spatial frequency of the stripes was 0.13 cycles/degree and the speed was 1 cycle/s. The
duration of each trial was 45 seconds. A large projection screen (300 cm × 196 cm) with an Optoma HD 850
projector was used to display the stimulus. The refresh rate of the projector was 60 Hz, and the display resolution was
1920 × 1080 pixels. The field of view of the projection screen was 78° × 52°. The participants were seated 215 cm
away from the screen in a height-adjustable chair with eye height level with the screen’s center. The participants
were instructed to press a button as soon as the red cross changed its color to blue. The response time served
as an indirect measure of the participant’s level of attention [79]. The experiment comprised one training block (4 trials)
and 10 testing blocks. Each testing block lasted about 12 minutes (16 trials). The lights of the room were turned off to
create a drowsiness-inducing environment. Fig. 5.2 shows an illustration of a participant performing the task.
Figure 5.2: Illustration of a participant performing the task. For better visibility, the participant is shown with the lights on even though the experiments were performed in a dark room.
As the participants performed the task, facial video, response time (i.e., button press information), and physiological
signals such as EEG, electrocardiogram (ECG), and respiratory inductance plethysmography (RIP) signals were
recorded. The facial videos of the participants were recorded using an infrared camera so that drowsy
episodes could be identified by visual inspection. RIP and ECG signals were recorded to develop drowsiness detection
algorithms based on cardiorespiratory signals. EEG, ECG, and RIP signals were recorded using a Grael V2 amplifier
(Compumedics, Melbourne, Australia). The RIP signals were obtained from two belts, one around the chest and the
other around the abdomen. For the EEG data acquisition, all nineteen electrodes (Quick-Cap, Melbourne,
Australia) of the international 10–20 electrode placement system were used. Fig. 5.3 shows the electrode placement
map of the international 10–20 electrode placement scheme. The reference electrode was placed on the nose.
Between blocks, the participants were asked about the degree and duration of their self-motion perception as well as
their level of motion sickness and sleepiness.
Figure 5.3: Electrode locations of the international 10-20 electrode placement scheme [80].
5.1.3 Drowsiness Detection from Facial Video
Facial videos of each participant during the experiment were used to obtain their level of drowsiness. The facial
videos were rated on a 1-s basis, without knowledge of the button-press data, using a newly developed drowsiness
rating scheme. The scale is presented in Table 5.1.
In the proposed drowsiness rating scheme, the rater first rates each 1-s video segment as drowsy (score 1)
or non-drowsy (score 0). If an episode is rated as drowsy, it is further rated on a scale of 1 to 10, as described in Table 5.1.
This rating denotes the strength of drowsiness based on behavioral cues. As is evident from Table 5.1, the
proposed scale employs various behavioral cues of drowsiness such as eye closure, head nodding, facial contortion,
changes in muscle tone, and rapid eye blinks.
The advantage of the proposed drowsiness scale over existing rating schemes such as the Wierwille and Ellsworth
[81] method is that its levels are better defined. Furthermore, unlike existing rating schemes, this scheme
captures both the confidence of observing a drowsy episode and the degree or severity of drowsiness. Moreover,
the Wierwille and Ellsworth scale does not offer a guideline on time resolution, and it is often used to score videos on
a 1-min basis [49]. However, drowsy episodes or microsleeps can be as short as 1 s [52]. Thus, the proposed scoring
guidelines help to capture both longer and shorter drowsy episodes. Lastly, the Wierwille and Ellsworth scale offers
generic rating guidelines, whereas the proposed scoring scheme pinpoints the behavioral cues corresponding to each
drowsiness score.
Table 5.1: Drowsiness scale proposed in this work
Score Summary
0 Alert, eye fully open and moving, often the subject is moving
1 Eyelids are slightly (about 30% compared to alert) closed
2 Eyelids are more (about 60% compared to alert) closed than previous stage, very little eye movement
3 Eyelids are more (about 90% compared to alert) closed than previous stage, glassy-eyed appearance, subject staring at a fixed position
4 Eyes are barely open, often facial contortions are visible
5 Slow eye blinks; different from the usual eye blinks of an alert person
6 Increased eye blinks
7 Rapid eye blinks; after the episode the subject usually fully closes his eyes
8 Eye fully closed, head movement/nodding; episode shorter than rating 9
9 Eye fully closed, head nodding, change of muscle tone; episode often terminated by head jerk
10 Eye fully closed, head nodding, change of muscle tone, the subject does not wake up unlike 9
To assess the inter-rater variability of the video ratings based on the proposed scale, 70 minutes of data from 15
participants, comprising 126000 video frames, were rated by a second rater. For each participant, the videos were
selected to contain the maximum number of drowsy episodes. The second rater rated the videos
without access to the button-press information.
The inter-rater agreement was obtained by determining the percentage agreement of the ratings. Furthermore,
normality tests (Anderson-Darling and Kolmogorov-Smirnov) followed by Pearson or Spearman
correlation analysis were performed to determine the agreement between the two raters. We also determined the mean
and standard deviation of the difference between the two raters' ratings. Finally, a confusion matrix analysis was
performed to assess inter-rater reliability.
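Two of the agreement measures mentioned above, percentage agreement and the confusion matrix, can be sketched as follows. The sketch assumes binary per-second ratings (0 = non-drowsy, 1 = drowsy); the function names and the example ratings are illustrative only and are not data from this study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of 1-s segments on which both raters gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same segments")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def confusion_matrix(rater_a, rater_b, labels=(0, 1)):
    """Rows: rating by rater A. Columns: rating by rater B."""
    counts = Counter(zip(rater_a, rater_b))
    return [[counts[(a, b)] for b in labels] for a in labels]


# Made-up ratings for ten 1-s segments.
rater_a = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
rater_b = [0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
agreement = percent_agreement(rater_a, rater_b)
matrix = confusion_matrix(rater_a, rater_b)
```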
5.1.4 Performance Evaluation Metrics
The performance metrics used in this work are accuracy, sensitivity, and specificity. They are expressed using the
following equations.
$$\text{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{5.1}$$
$$\text{Specificity} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}} \tag{5.2}$$
$$\text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{True Positive} + \text{True Negative} + \text{False Positive} + \text{False Negative}} \tag{5.3}$$
Sensitivity denotes the proportion of positive class (drowsy) segments that are correctly classified. Specificity, on the
other hand, expresses the proportion of negative class (non-drowsy) segments that are correctly classified. Accuracy
denotes the proportion of all segments, positive and negative, that are correctly classified. To compute the performance
metrics, 60% of the participants were randomly selected for training, and the remainder were used for testing. We then
computed the performance metrics using the training and testing data. This process was repeated 100 times, and the resulting
mean and standard deviation of the accuracy, sensitivity, and specificity values are reported here. This ensures that each
participant's data were used as either training or testing data in a given run, but never as both. The aforementioned analysis
was done for each of the 19 electrodes.
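The evaluation protocol above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, `evaluate` stands in for whichever model is trained and tested, and the sketch assumes that both classes occur in the test data so that no denominator is zero.

```python
import random
import statistics


def performance_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from confusion counts (Eqs. 5.1-5.3)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


def split_participants(participant_ids, train_fraction=0.6, rng=random):
    """Randomly assign whole participants to either training or testing."""
    ids = list(participant_ids)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def repeated_evaluation(participant_ids, evaluate, n_runs=100, seed=0):
    """Mean and standard deviation of each metric over repeated random splits.

    `evaluate(train_ids, test_ids)` is a hypothetical callback that trains a
    model and returns (tp, tn, fp, fn) counted on the test participants.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        train_ids, test_ids = split_participants(participant_ids, rng=rng)
        runs.append(performance_metrics(*evaluate(train_ids, test_ids)))
    # zip(*runs) groups the values per metric across all runs.
    return [(statistics.mean(m), statistics.stdev(m)) for m in zip(*runs)]
```

Splitting by participant rather than by segment keeps all segments of one person on the same side of the split within a run.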
5.1.5 Development of Drowsiness Detection Algorithm
In the following, the three methods developed using the Study 2 data are presented: the modified sigmoid wake
probability model, the step function model, and the time-domain feature-based drowsiness detection algorithm.
5.1.5.1 Modified Sigmoid Wake Probability Model
Figure 5.4: A schematic outline of the modified sigmoid wake probability model proposed in this thesis. A. At first, ocular noise removal was performed using independent component analysis (ICA). Then the signal was band-pass and notch-filtered. B. Relative power of theta and gamma were extracted from 1s EEG segments. C. Each of the
features was fed into a sigmoid function. D. The sigmoid function outputs were weighted and averaged using weights computed from the training data using random forest. E. The probability of wakefulness (Pr(W)) obtained from the model was used to divide the data into awake, drowsy, and sleep clusters using cluster quality evaluation metrics. The
output is compared against a threshold to classify a 1s episode as alert or drowsy.
The sigmoid wake probability model developed using the Study 1 data was tested on the data collected in Study 2.
However, in Study 2, gamma and theta band-based features were used instead of the delta, alpha, and beta power-based
features, because changes in these two bands were the most prominent in the EEG data of Study 2
(details in Appendix 5A). A schematic outline of the modified sigmoid wake probability model is shown in
Fig. 5.4.
Data Preprocessing
EEG data collected in Study 2 contained eye movement and eye blink noise, as is evident from Fig. 5.5. The
smoothly decreasing EEG spectrum and the strong far-frontal projection of independent component 2 (circled in red)
in Fig. 5.5 indicate that the EEG data are corrupted by ocular artifacts. Therefore, we performed independent
component analysis (ICA) to remove the ocular noise using the EEGLAB software [82]. Next, the EEG signals were
band-pass filtered from 1 to 100 Hz. Subsequently, we applied a notch filter at 60 Hz to remove power-line noise.
Afterwards, we divided the EEG signals into 1-s segments.
Figure 5.5: Component maps of the independent component analysis (ICA) of the EEG recordings of a subject. The smoothly decreasing EEG spectrum and a strong far-frontal projection of independent component 2 (circled in red)
are typical of ocular artifacts.
Feature Extraction
Instead of the original model’s delta, alpha, and beta features, we computed relative power values of the gamma and
theta bands, since changes in these two features appeared to be the most prominent between the awake and drowsy
states. The formula used to compute relative power values for the sleep study data (Eq. 4.1) was also used to compute
the feature values in this method.
Modified Sigmoid Wake Probability Model
Figure 5.6: Sigmoid functions used in the modified sigmoid wake probability model. The probability of wakefulness for a feature should be high if the relative power values of gamma are high. Therefore, the red curve is used to capture the changes in the gamma band. The opposite holds for the theta band, which is why the black curve is used to
capture the changes in the theta band.
Each of the two extracted features was fed into a sigmoid function as shown in Fig. 5.6. Prior studies have
demonstrated that as an individual becomes drowsy, gamma band power decreases and theta power increases [16].
Therefore, the red sigmoid function in Fig. 5.6 was used for gamma band, and the black sigmoid function in Fig. 5.6
was used for theta band. Sigmoid outputs were weighted and averaged using weights computed from out-of-bag
permuted predictor delta error method similar to the sleep study data explained in section 4.1.5. Thus, a probability
of wakefulness value (Pr(W)) was obtained from each of the 1s segments using the following equation.
$$\Pr(W) = w_1 P_1 + w_2 P_2 \tag{5.4}$$
Subsequently, we divided the signal segments into awake, drowsy, and sleep clusters using cluster quality evaluation
metrics and tested the result against the video ratings. The model gave high accuracy and specificity but low sensitivity.
5.1.5.2 Step Function Model
The step function model, which was developed to overcome the low sensitivity of the sigmoid wake probability
model, requires the determination of one parameter per feature instead of the sigmoid wake probability model’s two.
Therefore, unlike the sigmoid wake probability model, it does not require extreme cases of non-alertness for model
parameter tuning. A schematic outline of the step function model is shown in Fig. 5.7.
Figure 5.7: A schematic outline of the step function model proposed in this thesis. A. At first, ocular noise removal was performed using independent component analysis (ICA). Then the signal was band-pass and notch-filtered. B. Relative power of delta, theta, alpha, beta, and gamma were extracted from 1s EEG segments. Each of the features
was fed into a step function. C. The step function outputs were weighted and averaged using weights computed from the training data using random forest. D. The output is compared against a threshold to classify a 1s episode as alert
or drowsy.
Data Preprocessing and Feature Extraction
Data preprocessing was performed using the same steps as for the sigmoid wake probability model. In the step function
model, we first divided the data into 1-s segments. Subsequently, we extracted relative power feature values of the
delta (2-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-100 Hz) bands. Even though the delta
band starts from 1 Hz, we used only the upper part of the delta band (2-4 Hz) so that each 1-s segment contains at least
two cycles of the lowest frequency analyzed.
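The relative power features can be sketched as follows. Eq. 4.1 is not reproduced in this section, so the sketch assumes that relative power is the power in a band divided by the total power over all analyzed bands; the 256 Hz sampling rate in the usage example and the function names are also assumptions. A plain discrete Fourier transform is used to keep the example free of external libraries.

```python
import cmath
import math

# Band edges in Hz, as listed in the text.
BANDS = {"delta": (2, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}


def power_spectrum(segment):
    """Squared DFT magnitudes for frequency bins 0..N/2 of a mean-removed segment."""
    n = len(segment)
    mean = sum(segment) / n
    x = [v - mean for v in segment]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def relative_band_power(segment, fs, bands=BANDS):
    """Power in each band divided by the total power over all listed bands."""
    spectrum = power_spectrum(segment)
    resolution = fs / len(segment)  # Hz per frequency bin

    def band_power(lo, hi):
        return sum(p for k, p in enumerate(spectrum) if lo <= k * resolution < hi)

    total = sum(band_power(lo, hi) for lo, hi in bands.values())
    return {name: band_power(lo, hi) / total for name, (lo, hi) in bands.items()}


# Usage: a 1-s, 10 Hz sine sampled at an assumed 256 Hz lies in the alpha band.
fs = 256
segment = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = relative_band_power(segment, fs)
```

For a 1-s segment the frequency resolution is 1 Hz, which is why the lowest usable band edge matters for short segments.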
Development of Step Function Model
Each of the relative power features was fed into one step function. Prior studies have demonstrated that as an
individual becomes drowsy in a situation similar to driving, delta, theta, alpha, and beta band powers increase and
gamma power decreases [16]. Therefore, the top function in Fig. 5.7 is used to capture the changes in the gamma band,
and the bottom function in Fig. 5.7 is used to capture the changes in the delta, theta, alpha, and beta bands. As is
evident from Fig. 5.7, each step function requires the determination of only one threshold, whereas the sigmoid
function requires two parameters.
Therefore, extreme cases of non-alertness are no longer required in the step function model.
The step function outputs were weighted and averaged using weights computed by random forest [70].
Subsequently, this value was compared with a threshold to classify the segment as alert or drowsy. The thresholds in
the model were determined heuristically. As with the sigmoid wake probability model, we randomly selected 60%
of the participants for training and the remainder for testing in each run. Both the thresholds of the step functions and
the feature weights were computed from the training data. The output of the step function model was compared
against the facial video ratings for model validation. The performance metrics reported here are the mean and standard
deviation over 100 runs. The step function model yielded better accuracy, sensitivity, and specificity than the sigmoid
wake probability model. Nevertheless, its sensitivity was still low, and hence a more robust algorithm needed to be
developed.
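The decision rule of the step function model can be sketched as follows. All thresholds, weights, and the decision threshold below are hypothetical placeholders, not the values used in this study, and the sketch assumes that each step function outputs 1 when the feature value points toward wakefulness.

```python
def wake_votes(features, thresholds):
    """One step function per band: 1.0 if the feature suggests wakefulness, else 0.0."""
    votes = {}
    for band, value in features.items():
        if band == "gamma":
            # Gamma power decreases with drowsiness, so high gamma suggests wakefulness.
            votes[band] = 1.0 if value >= thresholds[band] else 0.0
        else:
            # Delta, theta, alpha, and beta power increase with drowsiness.
            votes[band] = 1.0 if value < thresholds[band] else 0.0
    return votes


def step_model_score(features, thresholds, weights):
    """Weighted average of the step function outputs."""
    votes = wake_votes(features, thresholds)
    total_weight = sum(weights[band] for band in votes)
    return sum(weights[band] * votes[band] for band in votes) / total_weight


def classify(features, thresholds, weights, decision_threshold=0.5):
    score = step_model_score(features, thresholds, weights)
    return "alert" if score >= decision_threshold else "drowsy"


# Hypothetical parameters, for illustration only.
bands = ["delta", "theta", "alpha", "beta", "gamma"]
thresholds = {band: 0.2 for band in bands}
weights = {band: 1.0 for band in bands}
alert_like = {"delta": 0.1, "theta": 0.1, "alpha": 0.1, "beta": 0.1, "gamma": 0.5}
drowsy_like = {"delta": 0.4, "theta": 0.3, "alpha": 0.15, "beta": 0.1, "gamma": 0.05}
```

Because each feature contributes only a binary vote, a single threshold per band is enough to configure the model.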
5.1.5.3 Time-Domain Feature-Based Algorithm
In order to overcome the low sensitivity problem of sigmoid and step function models, we analyzed the time-domain
properties of EEG to identify effective markers of drowsiness. Time-domain features have been used in the literature
to extract markers for alertness monitoring [29], sleep stage classification [83, 84], and anesthesia-EEG data analysis
[85]. We then developed a time-domain feature-based drowsiness detection algorithm using machine learning
techniques. A schematic outline of the proposed method is shown in Fig. 5.8.
Figure 5.8: A schematic outline of the proposed time-domain feature-based algorithm.
Data Preprocessing
The EEG signals were first band-pass filtered from 1 to 100 Hz. Next, a notch-filter was used at 60 Hz to remove
power line noise. Subsequently, the filtered data were divided into 1s segments.
Feature Extraction
After preprocessing the EEG signals, we extracted Hjorth parameters from each of the 1s signal segment. Hjorth
parameters capture the temporal dynamics of an EEG signal segment [86]. For an EEG signal segment x(t), the three
Hjorth parameters, namely activity, mobility, and complexity are defined as follows.
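$$\text{Activity} = \operatorname{Var}\big(x(t)\big) \tag{5.5}$$
$$\text{Mobility} = \sqrt{\frac{\operatorname{Var}\left(\frac{dx(t)}{dt}\right)}{\operatorname{Var}\big(x(t)\big)}} \tag{5.6}$$
$$\text{Complexity} = \frac{\text{Mobility}\left(\frac{dx(t)}{dt}\right)}{\text{Mobility}\big(x(t)\big)} \tag{5.7}$$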
Activity represents the total energy of the signal x(t). As seen from equation 5.6, mobility expresses the ratio of the
standard deviation of the slope to the standard deviation of the amplitude. It can also be regarded as the standard
deviation of the power spectrum along the frequency axis (see Appendix B5 for more details). Complexity is defined as the
ratio of the mobility of the signal’s first derivative to the mobility of the EEG signal itself (equation 5.7). With this
definition, complexity is at least 1 for a continuous signal, and values closer to 1 indicate that the EEG signal segment is more similar to
a pure sine wave.
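As a minimal sketch of equations 5.5 to 5.7, the three Hjorth parameters can be computed from a sampled segment with finite differences. The function name `hjorth_parameters` and the use of `np.diff` as the derivative estimate are choices made here for illustration, not details taken from the thesis.

```python
import numpy as np

def hjorth_parameters(x, fs=1.0):
    """Return (activity, mobility, complexity) of a 1-D signal segment."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x) * fs    # first derivative (finite-difference estimate)
    ddx = np.diff(dx) * fs  # second derivative
    activity = np.var(x)                                        # Eq. 5.5
    mobility = np.sqrt(np.var(dx) / activity)                   # Eq. 5.6
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility   # Eq. 5.7
    return activity, mobility, complexity

fs = 1024
t = np.arange(fs) / fs                      # one 1 s segment
sine = np.sin(2 * np.pi * 10 * t)           # pure 10 Hz sine wave
noise = np.random.default_rng(0).standard_normal(fs)  # broadband signal
print(hjorth_parameters(sine, fs))
print(hjorth_parameters(noise, fs))
```

For the pure sine wave, the activity equals the variance of the sinusoid, the mobility is close to its angular frequency, and the complexity is close to 1; the broadband noise segment yields a complexity further from 1.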
Classification
The three time-domain based features extracted from the EEG signals were fed into a classification model. The
classification model used in the proposed framework is bootstrap aggregating or bagging [87]. Bagging is an
ensemble learning-based classifier which combines multiple weak or base classifiers to create stronger and more
accurate classification models. A schematic illustration of bagging is presented in Fig. 5.9.
Figure 5.9: Illustration of bootstrap aggregating classifier. A percentage of the training data is drawn with
replacement to create each of the bootstrapped replicas. Afterwards, each of the replicas is used to train a weak
classifier. Final output of bootstrap aggregating is determined by majority voting of all the decision tree outputs.
Bagging creates multiple bootstrapped replicas of the original training data by sampling a percentage of the training
data with replacement as shown in Fig. 5.9. Subsequently, each of the bootstrapped replicas is used to train a weak
or base classifier. Thus, using resampling, bagging builds an ensemble that is as diverse as possible. In this work,
decision tree is used as the base classifier. To classify a test instance, each of the trained decision trees gives an
output. Next, the class chosen by most decision trees is selected as the ensemble decision. The majority voting
ensures that incorrect decisions are discarded and correct decisions are amplified. Despite being simple and intuitive,
bagging has proven to be fast, robust, and more accurate than other ensemble learning-based classification models
[87, 88].
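The mechanics of bootstrap aggregating (resampling with replacement, one base classifier per replica, majority voting) can be sketched in plain NumPy. To keep the example short, the base classifier here is a one-split decision stump rather than the full decision trees used in the thesis, and the data are synthetic; all function names and parameter values are assumptions of this sketch.

```python
import numpy as np

def fit_stump(X, y):
    """Fit a one-split decision stump by exhaustive search over thresholds."""
    best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - thr) > 0).astype(int)
                err = np.mean(pred != y)
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best[1:]

def predict_stump(stump, X):
    j, thr, pol = stump
    return (pol * (X[:, j] - thr) > 0).astype(int)

def fit_bagging(X, y, n_estimators=25, sample_frac=1.0, seed=0):
    """Train one stump per bootstrap replica (sampling with replacement)."""
    rng = np.random.default_rng(seed)
    n = int(sample_frac * len(X))
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), size=n)  # bootstrap indices
        models.append(fit_stump(X[idx], y[idx]))
    return models

def predict_bagging(models, X):
    """Majority vote over the base classifiers (binary labels 0/1)."""
    votes = np.stack([predict_stump(m, X) for m in models])
    return (votes.mean(axis=0) > 0.5).astype(int)

# Synthetic two-class data standing in for three-feature Hjorth vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 3)), rng.normal(3.0, 1.0, (100, 3))])
y = np.repeat([0, 1], 100)
models = fit_bagging(X, y)
accuracy = np.mean(predict_bagging(models, X) == y)
print(accuracy)
```

An odd number of base classifiers avoids tied votes in the binary case. In practice a library implementation with full decision trees would replace the stump.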
5.2 Results
The segment statistics per electrode in our study are presented in Table 5.2.
Table 5.2: Average and standard deviation of the number of 1 s segments used per electrode. Data are presented as mean ± standard deviation.
Class | Segments per participant | Total segments
Alert | 5264.71 ± 210.85 | 73706
Drowsy | 1732.00 ± 210.86 | 24248
5.2.1 Drowsiness Ratings
In this study, button press information and facial video ratings were used simultaneously to identify episodes of
drowsiness. The use of button press information, however, is problematic given that it is prone to false positives and
false negatives.
Fig. 5.10 (a) shows the reaction time in seconds of a participant in response to blue crosses throughout a block.
Negative reaction time indicates the blue crosses missed by the participant. As Fig. 5.10 shows, at lower
levels of drowsiness the participant may still be able to perform the task correctly. If we compare the
button-press and facial video rating in the region marked by blue squares in Fig. 5.10 (a), it becomes evident that the
participant is able to do the task at lower levels of drowsiness. Therefore, button-press information alone is not
capable of detecting lower degrees or intensities of drowsy episodes. In addition, the participant can miss button
presses for reasons other than drowsiness such as inattention and mind-wandering. Fig. 5.10 (b), which shows the
same plot of Fig. 5.10 (a) zoomed at the beginning, illustrates this fact. It is clear from Fig. 5.10 (b) that the
participant missed the first two crosses due to inattention, since the facial video rating showed that the participant
was awake. Therefore, the use of button-press information in such cases will introduce false positives. Also, drowsy
episodes might appear between two blue crosses causing them to remain undetected by button press information.
Finally, button-press information cannot precisely pinpoint the beginning and end of a drowsy episode. Due to the
aforementioned reasons, we adopted the facial video rating as our method of choice in this study to identify episodes
of drowsiness in accordance with the current literature.
Figure 5.10: (a) (Top panel) Reaction time in seconds of a participant in response to blue crosses throughout a block. Negative reaction time indicates the blue crosses missed by the participant. (Bottom panel) Facial video rating of the
same block. (b) Reaction time and video rating plot of the same participant zoomed at the beginning.
Table 5.3 shows the 2-class (alert or score 0 and non-alert or score 1) confusion matrix of the ratings of the two
independent raters. The diagonal of the matrix in Table 5.3 shows the number of alert and non-alert segments that both
of the raters agreed on. The inter-rater agreement between the two raters was 96.27%.
Table 5.3: Inter-rater agreement of the proposed scale for scores 0 and 1. The inter-rater agreement between the two raters was 96.27%. The diagonal of the confusion matrix is highlighted in bold.
Rows: Rater #1; columns: Rater #2.
Score | 0 | 1
0 | 2745 | 71
1 | 75 | 1020
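The reported 2-class agreement can be reproduced from the counts in Table 5.3 as the fraction of segments on the diagonal. The sketch below also computes Cohen's kappa, a chance-corrected agreement measure that is not reported in the thesis and is included here only as an illustrative addition.

```python
import numpy as np

# Counts from Table 5.3 (scores 0 and 1 for the two raters).
confusion = np.array([[2745, 71],
                      [75, 1020]])

total = confusion.sum()
observed = np.trace(confusion) / total  # fraction of segments rated identically
# Chance agreement from the raters' marginal score frequencies.
expected = (confusion.sum(axis=1) * confusion.sum(axis=0)).sum() / total**2
kappa = (observed - expected) / (1 - expected)

print(f"agreement = {100 * observed:.2f}%")  # agreement = 96.27%
print(f"kappa = {kappa:.3f}")
```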
Table 5.4 shows the 11-class (score 0 indicating alert and scores 1-10 indicating various levels of non-alertness)
confusion matrix of the ratings of the two independent raters. The diagonal of the matrix in Table 5.4 shows the
number of alert and non-alert segments that both of the raters agreed on. As Table 5.1 shows, scores 1 to 4
each correspond to a percentage of eye closure. The subsequent three levels (i.e. scores 5 to 7) are primarily
based on the slow eye blinks that occur right before sleep onset. Usually, after these stages, the subject enters
deeper sleep stages. Since the descriptions of scores 5-7 do not contain an empirical value, the inter-rater agreement
for these three classes is lower than that of scores 0-4, as we can see from Table 5.4. The inter-rater agreement
between the two raters was 85.96%.
Anderson-Darling and Kolmogorov-Smirnov tests showed that the data were not normally distributed. Therefore, we
performed Spearman correlation analysis which showed very high correlation (r= 0.93, p<.00001) between the 11-
class (scores 0 to 10) ratings of the two raters. Again, in the 11-class (scores 0 to 10) ratings, the mean difference of
the ratings was 0.038609, the standard deviation of the rating was 0.98534, the median was 0, and the inter-quartile
range was 0.
Table 5.4: Inter-rater agreement of the proposed scale for scores 0 to 10. The inter-rater agreement between the two raters was 85.96%. The diagonal of the confusion matrix is highlighted in bold.
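Rows: Rater #1; columns: Rater #2.
Score | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
0 | 2745 | 16 | 10 | 11 | 7 | 9 | 6 | 0 | 6 | 5 | 1
1 | 28 | 109 | 34 | 15 | 3 | 2 | 0 | 0 | 1 | 0 | 0
2 | 18 | 31 | 3 | 11 | 3 | 3 | 0 | 0 | 1 | 0 | 0
3 | 7 | 1 | 3 | 55 | 3 | 1 | 0 | 0 | 2 | 0 | 0
4 | 8 | 2 | 0 | 11 | 72 | 3 | 7 | 0 | 1 | 0 | 0
5 | 5 | 1 | 0 | 1 | 31 | 5 | 15 | 0 | 0 | 0 | 0
6 | 3 | 4 | 0 | 0 | 1 | 0 | 3 | 0 | 23 | 7 | 0
7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26 | 0 | 0
8 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 49 | 8 | 0
9 | 3 | 0 | 0 | 4 | 0 | 0 | 1 | 6 | 4 | 111 | 44
10 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 52 | 210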
Another important observation that can be made from the button-press information and facial video rating plot in
Fig. 5.10 is that even though button-press data often fail to detect low-intensity drowsy episodes, they consistently
match the video rating when the rating is close to 10. This indicates that the extremely non-alert episodes were
correctly captured by the facial video rating.
5.2.2 Modified Sigmoid Wake Probability Model
Figure 5.11 shows the performance metrics of the sigmoid wake probability model in each of the 19 electrodes. It is evident from
Fig. 5.11 that even though the accuracy and specificity values of the sigmoid wake probability model are high, the
sensitivity is on the lower side. In all of the electrodes, the sigmoid wake probability model gives less than 40%
mean sensitivity. Hence, the step function model was developed to overcome this limitation of the sigmoid wake
probability model.
Figure 5.11: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the sigmoid wake probability model on each electrode of the 10-20 electrode placement system.
5.2.3 Step Function Model
Figure 5.12: Feature weights of F3 electrode computed from random forest.
Figure 5.13: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the step function-based algorithm on each electrode of the 10-20 electrode placement system.
Fig. 5.12 shows the feature weights computed from the training participants of a run using random forest. This
feature weight determination approach ensures that the EEG frequency bands that are dominant in a particular
electrode have higher weights when analyzing EEG signals of that electrode. Figure 5.13 shows the performance
metrics of the step function model in each of the 19 electrodes. We can see from Fig. 5.13 that the accuracy,
sensitivity, and specificity values are similar across various electrodes. In comparison with Fig. 5.11, it is clear that
the accuracy, sensitivity, and specificity values of this model are similar to those of the sigmoid wake probability
model. The high specificity values indicate that the step function model gives very few false alarms. However, the
sensitivity values of this model are still low.
Figure 5.14: (a) Mean and standard deviation and (b) maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter-based algorithm on each electrode of the 10-20 electrode placement system.
Here, alert (score 0) vs. non-alert (score 1) segments are classified.
5.2.4 Time-Domain Feature-Based Algorithm
The accuracy, sensitivity, and specificity values of the proposed time-domain based algorithm to classify alert and
drowsy segments (binary classification) are presented in Fig. 5.14. It is evident from Fig. 5.14 that the values of the
performance metrics do not vary much across the 19 electrodes. Furthermore, the low standard deviations in all of
the electrodes for all of the metrics suggest that the metric values were more or less similar in each of the 100 runs.
Considering both the accuracy and sensitivity, frontal electrodes F3 or F8 can be ideal in implementing the proposed
time domain feature-based algorithm in a wearable EEG system.
Figure 5.15: (a) Mean and standard deviation and (b) maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter-based algorithm on each electrode of the 10-20 electrode placement system.
Here, 11-class (scores 0-10) classification results are shown.
Table 5.5: Performance comparison of the proposed methods with existing works in the literature.
Method | Description | n | Modalities Used | Total Features | Resolution | Accuracy (%)
Heart rate variability (HRV) features [17] | Driving simulator study | 12 | Electrocardiogram | 30 | 5 mins | 90
Principal component analysis of EEG [41] | Eye open/close task | 15 | EEG | 3 | 5 mins | 0.7 correlation with Karolinska Sleepiness Scale
Artificial neural network [31] | Eye open/close task | 17 | EEG | 42 | 5 mins | 94.37 ± 1.95
Alpha and beta band powers [89] | Eye open/close task | 10 | EEG | 2 | 3 mins | 84.8
HRV features from time-frequency analysis [19] | Driving simulator study | 30 | Electrocardiogram | 8 | 1 min | Sensitivity: 62%, Specificity: 88%
Cardiorespiratory phase synchronization features [21] | Driving simulator study | 16 | Respiration, electrocardiogram | 2 | 1 min | 97.2
Ocular features and partial least squares regression [22] | Driving simulator study | 44 | Electrooculogram | 15 | 30s | 85
Temporal and spectral features in the wavelet domain [24] | Sleep study | 16 | EEG | 19 | 30s | 87.4
Frequency domain features and neural network [20] | Sleep study | 30 | Chin electromyogram, electrooculogram | 5 | 30s | 83
EEG power features [25] | Driving simulator study | 15 | EEG | 2 | 30s | 82
Wavelet packet analysis [38] | Sleep study | 20 | EEG | 2 | 30s | 91.8
Wavelet transform and neural network [51] | Sleep study | 10 | EEG, electromyogram, electrooculogram | 4 | 30s | 97
Entropy and power-based features [28] | Tracking task | 8 | EEG (16 electrodes) | 24 | 2s | 61.2
Beta band power and oxyhemoglobin changes [16] | Driving simulator study | 9 | EEG, near infrared spectroscopy | 2 | 2s | 79.2 ± 9.4
Sigmoid wake probability model | Sleep study | 8 | EEG | 3 | 3s | 93.11
Sigmoid wake probability model | Reaction time task | 8 | EEG | 2 | 1s | Accuracy: 74.70 ± 0.16%, Sensitivity: 39.04 ± 0.53%, Specificity: 100.00 ± 0.00%
Step function model | Reaction time task | 8 | EEG | 5 | 1s | Accuracy: 70.96 ± 0.07%, Sensitivity: 31.12 ± 0.14%, Specificity: 99.99 ± 0.01%
Time domain feature-based algorithm | Reaction time task | 8 | EEG | 3 | 1s | Accuracy: 91.54 ± 0.29%, Sensitivity: 95.38 ± 0.25%, Specificity: 88.52 ± 0.50%
Fig. 5.15 shows the performance of the time-domain feature-based algorithm for 11-class classification. It is clear
from Fig. 5.15 that the proposed scheme performs reasonably well in detecting not only the non-alert
episodes but also the degree or intensity of a drowsy episode.
The performance of the proposed Hjorth parameter-based method is also compared with existing works in the
literature in Table 5.5. Table 5.5 also lists the performance of the sigmoid wake probability model and step function
model. The sigmoid wake probability model performs better in the sleep-EEG data. Even though the model's
accuracy seems better in comparison with existing works in the literature, the algorithm misses a lot of drowsy
episodes. Step function model, on the other hand, gives high specificity. But its sensitivity is on the lower side, as
we can see from Table 5.5. It is evident from Table 5.5 that the time domain feature-based algorithm proposed
herein yields comparable or better performance in terms of accuracy, sensitivity, and specificity but at a higher
resolution (1s) than the existing studies in the literature. Furthermore, the use of only 3 features makes the proposed
algorithm computationally less expensive. The proposed method takes about 15ms to process and classify an
unlabeled 1s segment.
5.3 Discussion
In this study we have developed and tested three algorithms for drowsiness detection. The first algorithm developed
in Study 1 was a sigmoid wake probability model, which achieved 93.11% accuracy in Study 1. However, in Study
2, the sensitivity values of the algorithm were low (38.95 ± 0.54% in Fp2 electrode). Thus, we developed a model
that employed a step function or thresholding on five relative power based features. The step function model gave
similar accuracy (70.96 ± 0.07% in Fp2 electrode) and specificity (99.99 ± 0.01% in Fp2 electrode) with only a few
false detections. Nevertheless, the step model’s sensitivity (31.12 ± 0.14% in Fp2) was still not satisfactory. The
third algorithm developed in this study, which overcomes the low sensitivity issue of sigmoid and step function
models, exploits time-domain properties of EEG to detect drowsiness.
Although the sigmoid wake probability model developed in the sleep data had high accuracy, it gave poor detection
performance in Study 2. Lower sensitivity of a drowsiness detection algorithm implies more missed detections,
and lower specificity indicates more false detections. In the context of drowsiness
detection, where a missed drowsy episode can have safety consequences, higher sensitivity is more important than higher specificity. There are three possible reasons for the poor
sensitivity of the sigmoid and step models. First, in a sleep study, participants usually lie in bed with eyes closed and
without much movement, trying to fall asleep. In contrast, in the current study, participants tried to stay awake,
blinked, and moved. This makes the EEG data recorded in this study noisier than sleep-EEG data. Second, prior
studies have demonstrated that neural rhythms are affected by attention [90] or self-motion perception [91]. Unlike
the sleep study, the participants here were trying to focus on performing the task. Therefore, the EEG frequency
bands could be affected by the participant’s attention or self-motion perception. Since the sigmoid wake probability
model and the step function model completely rely on the power levels of EEG frequency bands, both of the
proposed models end up giving poor drowsiness detection performance. Third, in the development of the model
using sleep-EEG data, we used arousals as extreme cases of alertness and deep sleep stages such as non-rapid eye
movement stages 2 and 3 (non-REM 2 and non-REM 3) as extreme cases of non-alertness. While the training blocks
at the beginning of the study were used as extreme cases of alertness, only two participants fell asleep. Furthermore,
the two participants who fell asleep were unlikely to reach a deeper stage such as N2 or N3, since the sleep episodes lasted
only a few minutes. Therefore, we did not have any deep sleep data (extreme cases of non-alertness) to re-train our
model, and hence the sigmoid wake probability model did not work well in the drowsiness study.
The time domain feature-based algorithm has various advantages. The proposed algorithm is of 1s resolution and
gave high accuracy. Furthermore, it is fast and only requires one frontal EEG channel and hence is highly suitable
for drowsiness detection during driving once validated in a driving study. Another advantage of the proposed
algorithm is that it does not require any rigorous and time-consuming preprocessing step such as ICA. This makes
the algorithm more appealing in the context of drowsy driving detection. Thus, the time-domain feature-based
algorithm overcomes the limitations of sigmoid wake probability and step function models and is more suitable for
practical implementation.
From Table 5.3, it is clear that in the 2-class (score 0 indicating alert and scores 1-10 indicating various levels of
non-alertness) case, the two raters show high agreement. The inter-rater agreement slightly decreases when the raters rate
the degree or level of drowsiness in the 11-class (scores 0 to 10) case. In both cases, the agreement was not
higher because the raters disagreed on the beginning of some drowsy episodes. That is why we can see from
the Score 0 column of Table 5.4 that four segments were rated as 9 or 10 by rater #1, even though they were
scored as 0 (i.e. alert) by rater #2. In the context of this thesis, the agreement of the 2-class (score 0
indicating alert and scores 1-10 indicating various levels of non-alertness) case is more important, since our primary
goal is to detect drowsiness and not the level or degree of drowsiness at this stage. Nevertheless, the inter-rater
agreement in both cases is reasonably high [81]. Table 5.4 also shows that the video segments were chosen such that
they contain as many non-alert segments as possible so that the video ratings and reliability of the proposed scale
can be fully determined. It is worth mentioning that the second rater was blinded to the ratings of the first rater as
well as the button press data.
However, there are two main limitations. In the current study, as we can see from Table 5.2, the number of alert
episodes is much higher than the number of drowsy episodes, which makes it harder to train the model for classification.
As a result, we could not achieve higher accuracy. Another limitation of the study is that the video rating stems only
from one observer. For better rating accuracy, it is recommended to have ratings from more than one observer.
Despite efforts to maintain consistency of video rating, there could be some missed or false detections of drowsy
episodes.
Chapter 6: General Discussions
6.1. Summary of the Findings
In this thesis, we developed three drowsiness detection algorithms, namely sigmoid wake probability model, step
function model, and Hjorth parameter-based algorithm across two separate studies. We first conducted a sleep study
(Study 1) which yielded EEG data relatively free of noise (e.g., eye blink, motion, and eye movement artifacts). We
observed changes in EEG frequency bands alpha, delta, and beta at sleep onset. To capture and exploit these
changes for drowsiness detection, we employed a sigmoid function. The motivation for using a sigmoid function was
that in the sleep study data we had extreme cases of alertness and extreme cases of non-alertness. The former is the
arousal state and the latter are deep sleep stages. The idea was that once we develop a model that can separate
these two extremes, the episodes which lie ‘in the middle’ of these two extremes can be modeled by a slowly
increasing or decreasing curve such as a sigmoid function. Three clusters were defined in the data by thresholding the
likelihood of wakefulness values using commonly used cluster quality evaluation metrics. The model gave high
accuracy (>90%) for drowsiness detection on the sleep-EEG data in Study 1.
In order to test this model in an experiment more similar to driving (i.e., more engaging and active task, strong
visual stimulation), we designed a reaction time study that, unlike Study 1, required the participant to try to stay
alert in a sitting position and to perform a task (Study 2). In Study 2, we also had facial video scores available for
validation of the algorithms. In spite of having high accuracy and specificity, the sigmoid wake probability model
developed in Study 1 gave a large number of missed detections in Study 2 (sensitivity <40%). In the sleep study,
we used deep sleep segments as extreme cases of non-alertness for model development, which were absent in the Study 2
data. This caused the model to yield low sensitivity on the Study 2 data.
To counteract this problem, we applied a step function on each of the five frequency bands to develop the step
function model. Unlike the sigmoid wake probability model, the step function model does not require extreme cases of
non-alertness for drowsiness detection. This model’s accuracy and specificity values were similar to those of the
sigmoid model. However, it still gave a large number of missed detections of drowsy episodes (sensitivity <35%).
We then explored time-domain properties of the EEG data to identify effective markers of drowsiness. Hjorth
parameters have been designed to capture changes in the time-domain properties of EEG [86] and have widely been
used for alertness monitoring [29], sleep stage classification [83, 84], and anesthesia-EEG data analysis [85].
Hjorth parameters comprise activity, mobility, and complexity. Instead of band power, these parameters capture the
time-domain properties of EEG. Complexity captures the EEG signal’s similarity to a pure sine wave [86]. Mobility, on
the other hand, captures the degree of fluctuations of energy of an EEG signal segment. Finally, activity captures the
energy of the EEG signal segment. Fig. 6.1 shows the variations of complexity parameter values in different levels
of drowsiness. In general, the complexity parameter decreases as the subject becomes drowsy.
Figure 6.1: Complexity parameter value (top panel) and the facial video rating (bottom panel). In general, the complexity decreases as the subject becomes drowsy.
Fig. 6.2 shows the variations of mobility parameter values in different levels of drowsiness. It can be seen from Fig.
6.2 that in general, the mobility parameter decreases as the subject becomes drowsy. Fig. 6.3 illustrates the
variations of activity parameter values in different levels of drowsiness. It can be seen from Fig. 6.3 that in general,
the activity parameter decreases as the subject becomes drowsy.
Figure 6.2: Mobility parameter value (top panel) and the facial video rating (bottom panel). In general, the mobility decreases as the subject becomes drowsy.
Figure 6.3: Activity parameter value (top panel) and the facial video rating (bottom panel). In general, the activity decreases as the subject becomes drowsy.
The changes in Hjorth parameters can be explained from a neurophysiological standpoint. Sleep onset is
characterized by the activity of the inhibitory projections of the GABAergic and galaninergic neurons in the
ventrolateral preoptic nucleus on cells in the ascending arousal system [92]. This causes the activity of the
neurons in the neocortex to decrease, resulting in lower complexity and fluctuations of energy [93, 94]. In fact, the
complexity of EEG has been reported to decrease with increasingly deeper sleep stages [94]. Therefore, all three
Hjorth parameters decrease when the subject becomes drowsy.
In Study 2, we used facial video ratings to apply supervised learning algorithms to develop a more robust and
accurate drowsiness detection algorithm without the need of any parameter tuning or weight determination.
Therefore, using bootstrap aggregating classifier, we developed a Hjorth parameter-based algorithm that gave high
accuracy, sensitivity, and specificity (Accuracy: 91.54 ± 0.29%, Sensitivity: 95.38 ± 0.25%, and Specificity: 88.52 ±
0.50% in P7).
In the sleep-EEG data, the sampling rate was 128 Hz for most of the participants. From Nyquist’s sampling theorem,
we could only extract frequency content up to 64 Hz, which does not cover the full range of the gamma band (30-100 Hz). In contrast, in Study
2, the sampling rate was 1024 Hz, which allowed us to use all five frequency bands for model development.
Furthermore, for the sigmoid wake probability model in the sleep study data, we used features computed from the
delta (1-4 Hz) band. Therefore, the minimum segment length had to be 2s. In Study 2, on the other hand, we used
theta (4-8 Hz) and gamma (30-100 Hz) bands which allowed us to detect drowsiness at a higher resolution. Prior
studies reported microsleep episodes as short as 1 s [52]. Even though the average human reaction time for a visual
stimulus is 0.25 s [95], rating drowsiness at a resolution higher than 1 s might cause the scoring scheme to miss
behavioral cues (eye closure, facial contortion, eye blink, head nodding) of drowsiness. Therefore, we used 1 s
resolution in Study 2 to develop our algorithms.
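The sampling-rate argument above can be checked with a few lines of code. The delta, theta, and gamma band limits are those quoted in this section; the alpha and beta limits are commonly used values assumed here for illustration, and the function name `fully_resolved_bands` is hypothetical.

```python
# EEG bands in Hz. Delta, theta, and gamma limits follow the thesis; the alpha
# and beta limits are commonly used values assumed here for illustration.
BANDS = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 13),
    "beta": (13, 30),
    "gamma": (30, 100),
}

def fully_resolved_bands(fs):
    """Return the bands whose upper edge does not exceed the Nyquist frequency."""
    nyquist = fs / 2
    return [name for name, (_, high) in BANDS.items() if high <= nyquist]

print(fully_resolved_bands(128))   # ['delta', 'theta', 'alpha', 'beta']
print(fully_resolved_bands(1024))  # all five bands
```

At 128 Hz the Nyquist frequency is 64 Hz, so the gamma band is only partly covered, whereas at 1024 Hz all five bands are available.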
6.2. Comparison with Other Drowsiness Detection Systems
To the best of our knowledge, none of the existing works in the literature detect drowsiness at 1s resolution. As one
uses smaller and smaller window sizes, the amount of information to be extracted from EEG becomes lower. Hence,
extracting successful markers of drowsiness becomes increasingly challenging. Therefore, prior works that detected
drowsiness at a lower resolution usually report higher detection performance than higher resolution (<10s) studies.
Peiris et al. [28] and Nguyen et al. [16] are perhaps the only two studies that detect drowsiness at 2s resolution. The
former study reports maximum accuracy, sensitivity, and specificity values of 61.2%, 73.5%, and 25.5%
respectively using multichannel EEG [28]. Furthermore, all the missing detections were from episodes shorter than
20s. The latter study reports a maximum accuracy of 79.2% but does not report the sensitivity and specificity values
[16]. In contrast, the Hjorth parameter-based algorithm proposed in this thesis gives accuracy, sensitivity, and
specificity of 91.54 ± 0.29%, 95.38 ± 0.25%, and 88.52 ± 0.50%.
One common challenge of commercial fatigue monitors is their frequent false alarms, which dissuade the
driver from using these systems [4]. In Study 2 data, the sigmoid wake probability and step function models yield close to
100% specificity, and the Hjorth parameter-based algorithm yields 88.52 ± 0.50% (Table 5.5). Since higher specificity means fewer false positives, the
algorithms developed in this thesis are less affected by the aforementioned problem.
Unlike most of the existing works in the literature [16, 18-20, 51], we reported accuracy, sensitivity, and specificity
of the developed algorithms. An algorithm with low sensitivity, despite having high accuracy, misses most of the
drowsy episodes. On the contrary, an algorithm with low specificity, despite having high accuracy, gives a lot of
false alarms. Therefore, in order to totally characterize and measure the performance of a drowsiness detection
algorithm, one must report all three measures.
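The point that accuracy alone can hide a low sensitivity is easy to demonstrate on an imbalanced toy example. The labels below are made up for illustration and are not data from the study; the function name `detection_metrics` is hypothetical.

```python
import numpy as np

def detection_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = drowsy)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of drowsy segments detected
        "specificity": tn / (tn + fp),  # fraction of alert segments left unflagged
    }

# Made-up imbalanced example: 90 alert and 10 drowsy segments, and a detector
# that finds only 2 of the drowsy segments but never raises a false alarm.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.array([0] * 90 + [1] * 2 + [0] * 8)
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Here the accuracy is 0.92 and the specificity is 1.0, yet the sensitivity is only 0.2, so most drowsy segments are missed.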
While analyzing data in both of the studies, we selected a percentage of the participants as training and the
remainder of the participants as testing. Thus, we ensured that each participant appears in either the training or the test data
but never in both. This helps ensure that the performances of the algorithms are generalizable. This data
setting also has a benefit in terms of practical application. When implementing the developed algorithms in a
vehicle, the algorithm does not need to be trained by the driver but can be pre-trained using other persons’ data.
Furthermore, most of the existing works in the literature [16, 20, 28, 51] do not randomize their training and testing
data, which might indicate that the reported performance metric values were obtained on only a certain choice of
train and test data.
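A participant-wise random split of this kind can be sketched as follows. The 60% training fraction and the 100 repeated runs follow the description in Chapter 5; the function name `participant_split` and the toy layout of 10 participants with 50 segments each are assumptions made for illustration.

```python
import numpy as np

def participant_split(participant_ids, train_frac=0.6, seed=None):
    """Boolean train/test masks with no participant on both sides."""
    participant_ids = np.asarray(participant_ids)
    rng = np.random.default_rng(seed)
    # Shuffle the unique participants, then assign whole participants to a side.
    subjects = rng.permutation(np.unique(participant_ids))
    n_train = int(round(train_frac * len(subjects)))
    train_mask = np.isin(participant_ids, subjects[:n_train])
    return train_mask, ~train_mask

# Hypothetical layout: 10 participants with 50 segments each.
ids = np.repeat(np.arange(10), 50)
for run in range(100):  # repeated random splits, one per run
    train, test = participant_split(ids, seed=run)
    # No participant contributes segments to both sets.
    assert not set(ids[train]) & set(ids[test])
print(train.sum(), test.sum())
```

Splitting by participant rather than by segment keeps segments from the same person out of both sets, which is what allows a model to be pre-trained on other people's data.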
Existing commercialized drowsy driving detection systems are either based on vehicle measures or eye-tracking [3].
The former is dependent on road geometry, weather, and driving skills of the driver [4, 43]. Vehicle parameters also
vary from driver to driver. Therefore, these algorithms’ performances vary with drivers [4]. Eye-tracking based
measures are reliant on lighting conditions [34]. Furthermore, the use of a camera to constantly monitor the driver
hampers his/her privacy. Due to the use of EEG, the proposed scheme is free from the aforementioned caveats.
Most of the physiological signal-based algorithms in the literature either use multiple signals [16, 20, 49] or multiple
channels of data [28, 50] or both [4, 24] for drowsiness detection. The disadvantage of using multiple physiological
signals is that it requires more sensors and electrodes attached to the driver’s body. Multichannel EEG-based
algorithms, on the other hand, require the driver to wear an EEG cap which has to be precisely positioned and is
inconvenient. In both of the studies and in all of the developed algorithms, we used EEG data only. Since the
proposed schemes in this thesis are single-channel EEG-based, they are free from the aforementioned problems.
6.3. Practical Implications
From the button-press data in Fig. 5.10 and the corresponding discussion, it becomes evident that a driver might
have some level of drowsiness, yet he/she might still be able to drive. Therefore, when trying to practically
implement the Hjorth parameter-based algorithm in driving, one must decide at which level or degree of drowsiness
the driver must be warned. Furthermore, the finely-grained scale of drowsiness also gives us an opportunity to
investigate the effect of drowsiness on driving performance. Thus, in a study in a driving simulator for instance, one
can record facial video and vehicle measures such as lane deviation and speed variability. Using these vehicle
measures, one can determine at which point of the proposed scale (from score 1 to score 10) the driving becomes
impaired and/or the driver loses control of the vehicle. Thus, the scale developed in this thesis can shed light on the
effect of drowsiness and fatigue on driving performance.
Despite the sigmoid wake probability model’s poor performance in drowsiness detection under experimental conditions
similar to driving, the model can be useful for drowsiness detection in sleep. It can be used to characterize and
compare the sleep onset phenotypes of different clinical populations as well as to characterize the natural
inhomogeneity of healthy subjects [15]. Furthermore, the algorithm can act as a diagnostic tool for disorders of
sleep onset such as narcolepsy or insomnia [14, 15]. Moreover, it can be used to dynamically track loss of alertness
in situations where alertness is vital (e.g. depth of anesthesia estimation). Fig. 4.7 also shows that the proposed
model successfully separates arousal and deep sleep segments. Therefore, the sigmoid wake probability model can be
used for arousal detection as well. It can also assist sleep technicians in identifying awake episodes and
pinpointing sleep onset. This work also highlights that existing drowsiness detection algorithms developed on
sleep-EEG data [20, 51] may not be applicable to drowsiness detection in the context of driving.
Also, the features used for drowsiness detection in the Hjorth parameter-based method can be combined with time
series forecasting algorithms such as Kalman filtering or autoregressive integrated moving average (ARIMA) models to
predict upcoming drowsiness. Upon further validation in a driving study on a larger population, the proposed
algorithm can be implemented in a single-channel wearable EEG headband. In a driving simulator-based study, however,
the participant's movements while driving might contaminate the EEG data with movement-related noise. Therefore,
signal decomposition schemes such as empirical mode decomposition [96] or the wavelet transform [97] can be applied
for noise removal before extracting the Hjorth parameters from the EEG. Furthermore, the data collected in Study 2
can be instrumental in developing more accurate drowsiness detection algorithms using cardiorespiratory signals (more
details can be found in Appendix C5). Moreover, this thesis also proposes a new and well-defined guideline for rating
the level of drowsiness from video using behavioral cues such as eye closure, change of muscle tone, eye blink, and
head nodding. Lastly, if the performance of the Hjorth parameter-based algorithm can be improved slightly by
extracting more features or applying advanced machine learning algorithms, it can be used as a gold standard for
drowsiness detection. This will reduce the need to manually annotate large amounts of data and will expedite and
assist future research.
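As an illustration of the forecasting idea mentioned above, the following is a minimal sketch of a scalar Kalman filter that produces one-step-ahead predictions of a single feature series. It assumes a random-walk (local level) state model; the function name kalman_one_step, the noise variances, and the example values are hypothetical and would need to be tuned and validated on real data.

```python
def kalman_one_step(series, process_var=1e-3, meas_var=1e-1):
    """One-step-ahead predictions of a feature series under a random-walk model.

    predictions[t] is the estimate of series[t] made before series[t] is observed.
    """
    estimate, error_var = series[0], 1.0   # assumed initial state and uncertainty
    predictions = []
    for z in series:
        # Predict: the level is carried over and its uncertainty grows.
        pred, pred_var = estimate, error_var + process_var
        predictions.append(pred)
        # Update: blend the prediction with the new measurement z.
        gain = pred_var / (pred_var + meas_var)
        estimate = pred + gain * (z - pred)
        error_var = (1.0 - gain) * pred_var
    return predictions

# Hypothetical mobility values for consecutive EEG segments (illustrative only).
mobility = [0.42, 0.41, 0.43, 0.40, 0.35, 0.31, 0.30, 0.28]
print(kalman_one_step(mobility))
```

A larger process_var makes the prediction follow recent segments more closely, while a larger meas_var smooths it more strongly.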
6.4. Limitations
A limitation of Study 1 was that the experimental scenario of the sleep study was not similar to that of driving. The
subjects were in a supine position and were trying to fall asleep without much body movement. In driving, by
contrast, the driver is seated, is trying to stay alert, and moves as part of the driving task.
Since Study 2 only mimics driving scenarios, its results cannot be directly extrapolated to driver drowsiness
detection without validation in a car or a driving simulator. However, given the cognitively engaging reaction time
task with strong visual motion stimulation, it can be anticipated that the EEG data collected in a driving
simulator-based study will be similar to those of the reaction time study, except for additional movement noise in
the EEG due to driving.
The participants in Study 2 were asked to try to stay alert and perform the task to the best of their ability.
However, it is possible that some of the participants did not feel motivated to stay alert and perform the task. This
is unlike driving, where the driver is more motivated to stay alert to avoid accidents, and it is a further
limitation of Study 2.
Even though the algorithms developed in this thesis are single-channel based and have the potential for
implementation in a wearable EEG headband, the resulting drowsiness detection systems are still obtrusive, unlike
camera-based or vehicle parameter-based systems. Lastly, although the Hjorth parameter-based scheme gives high
accuracy, sensitivity, and specificity, none of these performance metrics reaches 100%.
Chapter 7: Conclusions and Future Directions
In this thesis, we have developed a single-channel EEG-based high-resolution drowsiness detection algorithm. In the
future, the algorithm must be validated in a driving simulator-based study. The video rating scale could also be
further optimized in future research. Furthermore, cardiorespiratory signals can be investigated to develop
algorithms that can potentially yield higher accuracy, sensitivity, and specificity. Moreover, more advanced machine
learning algorithms such as deep learning can be applied to improve the performance of the proposed time-domain
feature-based algorithm. This work is the first step towards a highly accurate, high-resolution, and efficient drowsy
driving detection system that will greatly benefit populations at higher risk of drowsy driving-related car crashes,
including shift workers, patients with sleep-related disorders that induce daytime sleepiness (such as obstructive
sleep apnea), individuals who take sedative medications, and occupational drivers.
Since the algorithm is high-resolution and fast and gives high detection accuracy and sensitivity, upon its
validation in a driving study it can be commercialized as a drowsiness detection system that will benefit the target
population. If a convenient and reliable drowsy driving detection system is developed, it can be used to reduce
drowsy driving or fatigue-related car crashes. Furthermore, even though this thesis focuses on drowsiness only in the
context of vehicle driving, a drowsiness detection system, once developed, will also be useful for populations
outside road driving, such as mining workers, pilots, and locomotive operators.
References
[1] "Global Status Report on Road Safety 2015," 2015.
[2] P. Rau, "Drowsy Driver Detection and Warning System for Commercial Vehicle Drivers: Field Operational Test Design, Analysis, and Progress," 2005.
[3] M. I. Chacon-Murguia and C. Prieto-Resendiz, "Detecting Driver Drowsiness: A survey of system designs and technology," IEEE Consumer Electronics Magazine, vol. 4, pp. 107-119, 2015.
[4] A. Sahayadhas, K. Sundaraj, and M. Murugappan, "Detecting driver drowsiness based on sensors: a review," Sensors (Basel), vol. 12, pp. 16937-53, Dec 07 2012.
[5] (October 09, 2017). Audi Rest Recommendation System. Available: https://www.audi-mediaservices.com/publish/ms/content/en/public/hintergrundberichte/2012/03/05/a_statement_about/driver_assistance.html
[6] (October 09, 2017). BMW Driver Assistant. Available: https://www.bmw.ca/en/topics/experience/connected-drive/BMW%20ConnectedDrive:%20Driver%20Assistance%20.html
[7] (October 09, 2017). Bosch Driver Drowsiness Detection System.
[8] (October 09, 2017). Volvo. Available: https://www.media.volvocars.com/global/en-gb/media/pressreleases/12130
[9] D. Sommer and M. Golz, "Evaluation of PERCLOS based current fatigue monitoring technologies," in 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, 2010, pp. 4456-4459.
[10] R. Simons, M. Martens, J. Ramaekers, A. Krul, I. Klöpping-Ketelaars, and G. Skopp, "Effects of dexamphetamine with and without alcohol on simulated driving," Psychopharmacology, vol. 222, pp. 391-399, August 01 2012.
[11] D. Das, S. Zhou, and J. D. Lee, "Differentiating Alcohol-Induced Driving Behavior Using Steering Wheel Signals," IEEE Transactions on Intelligent Transportation Systems, vol. 13, pp. 1355-1368, 2012.
[12] M. A. J. Mets, E. Kuipers, L. M. de Senerpont Domis, M. Leenders, B. Olivier, and J. C. Verster, "Effects of alcohol on highway driving in the STISIM driving simulator," Human Psychopharmacology: Clinical and Experimental, vol. 26, pp. 434-439, 2011.
[13] R. D. Ogilvie, "The process of falling asleep," Sleep Medicine Reviews, vol. 5, pp. 247-270, 2001.
[14] M. J. Prerau, R. E. Brown, M. T. Bianchi, J. M. Ellenbogen, and P. L. Purdon, "Sleep Neurophysiological Dynamics Through the Lens of Multitaper Spectral Analysis," Physiology, vol. 32, pp. 60-92, 2017.
[15] M. J. Prerau, K. E. Hartnack, G. Obregon-Henao, A. Sampson, M. Merlino, K. Gannon, et al., "Tracking the sleep onset process: an empirical model of behavioral and physiological dynamics," PLoS Comput Biol, vol. 10, p. e1003866, Oct 2014.
[16] T. Nguyen, S. Ahn, H. Jang, S. C. Jun, and J. G. Kim, "Utilization of a combined EEG/NIRS system to predict driver drowsiness," Scientific Reports, vol. 7, p. 43933, 2017.
[17] M. Patel, S. K. L. Lal, D. Kavanagh, and P. Rossiter, "Applying neural network analysis on heart rate variability data to assess driver fatigue," Expert Systems with Applications, vol. 38, pp. 7235-7242, 2011.
[18] J. Vicente, P. Laguna, A. Bartra, and R. Bailón, "Detection of driver's drowsiness by means of HRV analysis," in 2011 Computing in Cardiology, 2011, pp. 89-92.
[19] J. Vicente, P. Laguna, A. Bartra, and R. Bailón, "Drowsiness detection using heart rate variability," Medical & Biological Engineering & Computing, vol. 54, pp. 927-937, June 01 2016.
[20] M. Akin, M. B. Kurt, N. Sezgin, and M. Bayram, "Estimating vigilance level by using EEG and EMG signals," Neural Computing and Applications, vol. 17, pp. 227-236, June 01 2008.
[21] I. Takahashi, T. Takaishi, and K. Yokoyama, "Overcoming Drowsiness by Inducing Cardiorespiratory Phase Synchronization," IEEE Transactions on Intelligent Transportation Systems, vol. 15, pp. 982-991, 2014.
[22] H. Su and G. Zheng, "A Partial Least Squares Regression-Based Fusion Model for Predicting the Trend in Drowsiness," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 38, pp. 1085-1092, 2008.
[23] J. A. Healey and R. W. Picard, "Detecting stress during real-world driving tasks using physiological sensors," IEEE Transactions on Intelligent Transportation Systems, vol. 6, pp. 156-166, 2005.
[24] A. Garces Correa, L. Orosco, and E. Laciar, "Automatic detection of drowsiness in EEG records based on multimodal analysis," Med Eng Phys, vol. 36, pp. 244-9, Feb 2014.
[25] C. T. Lin, C. J. Chang, B. S. Lin, S. H. Hung, C. F. Chao, and I. J. Wang, "A Real-Time Wireless Brain–Computer Interface System for Drowsiness Detection," IEEE Transactions on Biomedical Circuits and Systems, vol. 4, pp. 214-222, 2010.
[26] C.-T. Lin, R.-C. Wu, S.-F. Liang, W.-H. Chao, Y.-J. Chen, and T.-P. Jung, "EEG-based drowsiness estimation for safety driving using independent component analysis," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 52, pp. 2726-2738, 2005.
[27] F. C. Lin, L. W. Ko, C. H. Chuang, T. P. Su, and C. T. Lin, "Generalized EEG-Based Drowsiness Prediction System by Using a Self-Organizing Neural Fuzzy System," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 59, pp. 2044-2055, 2012.
[28] M. T. R. Peiris, P. R. Davidson, P. J. Bones, and R. D. Jones, "Detection of lapses in responsiveness from the EEG," Journal of Neural Engineering, vol. 8, p. 016003, 2011.
[29] M. Matousek and I. Petersén, "A method for assessing alertness fluctuations from EEG spectra," Electroencephalography and Clinical Neurophysiology, vol. 55, pp. 108-113, 1983.
[30] S. Otmani, T. Pebayle, J. Roge, and A. Muzet, "Effect of driving duration and partial sleep deprivation on subsequent alertness and performance of car drivers," Physiology & Behavior, vol. 84, pp. 715-724, 2005.
[31] A. Vuckovic, V. Radivojevic, A. C. N. Chen, and D. Popovic, "Automatic recognition of alertness and drowsiness from EEG by an artificial neural network," Medical Engineering & Physics, vol. 24, pp. 349-360, 2002.
[32] Y.-T. Wang, K.-C. Huang, C.-S. Wei, T.-Y. Huang, L.-W. Ko, C.-T. Lin, et al., "Developing an EEG-based on-line closed-loop lapse detection and mitigation system," Frontiers in Neuroscience, vol. 8, 2014.
[33] M. V. M. Yeo, X. Li, K. Shen, and E. P. V. Wilder-Smith, "Can SVM be used for automatic EEG detection of drowsiness during car driving?," Safety Science, vol. 47, pp. 115-124, 2009.
[34] L. M. Bergasa, J. Nuevo, M. A. Sotelo, R. Barea, and M. E. Lopez, "Real-time system for monitoring driver vigilance," IEEE Transactions on Intelligent Transportation Systems, vol. 7, pp. 63-77, 2006.
[35] T. D’Orazio, M. Leo, C. Guaragnella, and A. Distante, "A visual approach for driver inattention detection," Pattern Recognition, vol. 40, pp. 2341-2355, 2007.
[36] M. J. Flores, J. M. Armingol, and A. de la Escalera, "Driver drowsiness detection system under infrared illumination for an intelligent vehicle," IET Intelligent Transport Systems, vol. 5, pp. 241-251, 2011.
[37] M. H. Silber, S. Ancoli-Israel, M. H. Bonnet, S. Chokroverty, M. M. Grigg-Damberger, M. Hirshkowitz, et al., "The visual scoring of sleep in adults," J Clin Sleep Med, vol. 3, pp. 121-31, Mar 15 2007.
[38] T. L. T. da Silveira, A. J. Kozakevicius, and C. R. Rodrigues, "Automated drowsiness detection through wavelet packet analysis of a single EEG channel," Expert Systems with Applications, vol. 55, pp. 559-565, 2016.
[39] R. R. Johnson, D. P. Popovic, R. E. Olmstead, M. Stikic, D. J. Levendowski, and C. Berka, "Drowsiness/alertness algorithm development and validation using synchronized EEG and cognitive performance to individualize a generalized model," Biological Psychology, vol. 87, pp. 241-250, 2011.
[40] C. Papadelis, Z. Chen, C. Kourtidou-Papadeli, P. D. Bamidis, I. Chouvarda, E. Bekiaris, et al., "Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents," Clinical Neurophysiology, vol. 118, pp. 1906-1922, 2007.
[41] A. A. Putilov and O. G. Donskaya, "Construction and validation of the EEG analogues of the Karolinska sleepiness scale based on the Karolinska drowsiness test," Clinical Neurophysiology, vol. 124, pp. 1346-1352, 2013.
[42] A. S. Aghaei, B. Donmez, C. C. Liu, D. He, G. Liu, K. N. Plataniotis, et al., "Smart Driver Monitoring: When Signal Processing Meets Human Factors: In the driver's seat," IEEE Signal Processing Magazine, vol. 33, pp. 35-48, 2016.
[43] Y. Dong, Z. Hu, K. Uchimura, and N. Murayama, "Driver Inattention Monitoring System for Intelligent Vehicles: A Review," IEEE Transactions on Intelligent Transportation Systems, vol. 12, pp. 596-614, 2011.
[44] B.-C. Yin, X. Fan, and Y.-F. Sun, "Multiscale dynamic features based driver fatigue detection," International Journal of Pattern Recognition and Artificial Intelligence, vol. 23, pp. 575-589, 2009.
[45] P. Philip, P. Sagaspe, N. Moore, J. Taillard, A. Charles, C. Guilleminault, et al., "Fatigue, sleep restriction and driving performance," Accident Analysis & Prevention, vol. 37, pp. 473-478, 2005.
[46] R. Tremaine, J. Dorrian, L. Lack, N. Lovato, S. Ferguson, X. Zhou, et al., "The relationship between subjective and objective sleepiness and performance during a simulated night-shift with a nap countermeasure," Applied Ergonomics, vol. 42, pp. 52-61, 2010.
[47] G. Matthews, S. E. Campbell, S. Falconer, L. A. Joyner, J. Huggins, K. Gilliland, et al., "Fundamental dimensions of subjective state in performance settings: Task engagement, distress, and worry," Emotion, vol. 2, pp. 315-340, 2002.
[48] M. Ingre, T. Åkerstedt, B. Peters, A. Anund, and G. Kecklund, "Subjective sleepiness, simulated driving performance and blink duration: examining individual differences," Journal of Sleep Research, vol. 15, pp. 47-53, 2006.
[49] R. N. Khushaba, S. Kodagoda, S. Lal, and G. Dissanayake, "Driver Drowsiness Classification Using Fuzzy Wavelet-Packet-Based Feature-Extraction Algorithm," IEEE Transactions on Biomedical Engineering, vol. 58, pp. 121-131, 2011.
[50] S. Hu and G. Zheng, "Driver drowsiness detection with eyelid related parameters by Support Vector Machine," Expert Systems with Applications, vol. 36, pp. 7651-7658, 2009.
[51] M. B. Kurt, N. Sezgin, M. Akin, G. Kirbas, and M. Bayram, "The ANN-based computing of drowsy level," Expert Systems with Applications, vol. 36, pp. 2534-2542, 2009.
[52] G. R. Poudel, C. R. Innes, P. J. Bones, R. Watts, and R. D. Jones, "Losing the struggle to stay awake: divergent thalamic and cortical activity during microsleeps," Hum Brain Mapp, vol. 35, pp. 257-69, Jan 2014.
[53] S. M. S. Alam and M. I. H. Bhuiyan, "Detection of Seizure and Epilepsy Using Higher Order Statistics in the EMD Domain," IEEE Journal of Biomedical and Health Informatics, vol. 17, pp. 312-318, 2013.
[54] M. J. Flores, J. M. Armingol, and A. de la Escalera, "Driver drowsiness warning system using visual information for both diurnal and nocturnal illumination conditions," EURASIP J. Adv. Signal Process, vol. 2010, pp. 1-19, 2010.
[55] D. Sommer, M. Golz, U. Trutschel, and D. Edwards, "Biosignal Based Discrimination between Slight and Strong Driver Hypovigilance by Support-Vector Machines," in Agents and Artificial Intelligence: International Conference, ICAART 2009, Porto, Portugal, January 19-21, 2009. Revised Selected Papers, J. Filipe, A. Fred, and B. Sharp, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 177-187.
[56] S. R. Jagannathan, A. Ezquerro-Nassar, B. Jachs, O. V. Pustovaya, C. A. Bareham, and T. A. Bekinschtein, "Tracking wakefulness as it fades: Micro-measures of alertness," NeuroImage, vol. 176, pp. 138-151, 2018.
[57] (July 24, 2018). http://inside.volkswagen.com/Take-a-break.html.
[58] N. Edenborough, R. Hammoud, A. Harbach, A. Ingold, B. Kisacanin, P. Malawey, et al., "Driver state monitor from DELPHI," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, pp. 1206-1207 vol. 2.
[59] (October 09, 2017). Acumine. Available: http://www.acumine.com/
[60] D. J. Edwards, B. Sirois, T. Dawson, A. Aguirre, B. Davis, and U. Trutschel, Evaluation of Fatigue Management Technologies Using Weighted Feature Matrix Method, 2017.
[61] (July 24, 2018). http://www.seeingmachines.com/.
[62] (October 09, 2017). Smart Eye. Available: http://smarteye.se/
[63] (July 24, 2018). https://www.siemens.com/global/en/home.html.
[64] (July 24, 2018). https://www.ospat.com/.
[65] (July 24, 2018). https://www.mobileye.com/.
[66] (July 24, 2018). https://www.nissanusa.com/experience-nissan/news-and-events/drowsy-driver-attention-alert-car-feature.html.
[67] (July 21, 2018). https://www.slideshare.net/elaghoury/eeg-for-sleep-lab.
[68] A. Azarbarzin, M. Ostrowski, P. Hanly, and M. Younes, "Relationship between Arousal Intensity and Heart Rate Response to Arousal," Sleep, vol. 37, pp. 645-653, 2014.
[69] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, 2002.
[70] L. Breiman, "Random Forests," Machine Learning, vol. 45, pp. 5-32, 2001.
[71] D. L. Davies and D. W. Bouldin, "A Cluster Separation Measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, pp. 224-227, 1979.
[72] P. J. Rousseeuw, "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65, 1987.
[73] C. M. Morin, C. L. Drake, A. G. Harvey, A. D. Krystal, R. Manber, D. Riemann, et al., "Insomnia disorder," Nature Reviews Disease Primers, vol. 1, p. 15026, 2015.
[74] B. R. Kornum, S. Knudsen, H. M. Ollila, F. Pizza, P. J. Jennum, Y. Dauvilliers, et al., "Narcolepsy," Nature Reviews Disease Primers, vol. 3, p. 16100, 2017.
[75] C. Bishop, Pattern Recognition and Machine Learning: Springer-Verlag New York, 2006.
[76] D. C. Dolan, D. J. Taylor, R. Okonkwo, P. M. Becker, A. O. Jamieson, W. Schmidt-Nowara, et al., "The Time of Day Sleepiness Scale to assess differential levels of sleepiness across the day," Journal of Psychosomatic Research, vol. 67, pp. 127-133, 2009.
[77] L. J. Hettinger, T. Schmidt, D. L. Jones, and B. Keshavarz, "Illusory self-motion in virtual environments," Boca Raton, FL: CRC Press, 2014.
[78] S. Palmisano, R. S. Allison, M. M. Schira, and R. J. Barry, "Future challenges for vection research: definitions, functional significance, measures, and neural bases," Frontiers in Psychology, vol. 6, 2015.
[79] M. Basner and D. F. Dinges, "Maximizing Sensitivity of the Psychomotor Vigilance Test (PVT) to Sleep Loss," Sleep, vol. 34, pp. 581-591, 2011.
[80] (July 21, 2018). https://en.wikipedia.org/wiki/10%E2%80%9320_system_(EEG).
[81] W. W. Wierwille and L. A. Ellsworth, "Evaluation of driver drowsiness by trained raters," Accident Analysis & Prevention, vol. 26, pp. 571-581, 1994.
[82] A. Delorme and S. Makeig, "EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis," Journal of Neuroscience Methods, vol. 134, pp. 9-21, 2004.
[83] J. Fell, J. Röschke, K. Mann, and C. Schäffner, "Discrimination of sleep stages: a comparison between spectral and nonlinear EEG measures," Electroencephalography and Clinical Neurophysiology, vol. 98, pp. 401-410, 1996.
[84] T. Penzel and R. Conradt, "Computer based sleep recording and analysis," Sleep Medicine Reviews, vol. 4, pp. 131-148, 2000.
[85] I. J. Rampil, "A Primer for EEG Signal Processing in Anesthesia," Anesthesiology, vol. 89, pp. 980-1002, 1998.
[86] B. Hjorth, "EEG analysis based on time domain properties," Electroencephalography and Clinical Neurophysiology, vol. 29, pp. 306-310, 1970.
[87] L. Breiman, "Bagging Predictors," Machine Learning, vol. 24, pp. 123-140, August 01 1996.
[88] R. Polikar, "Ensemble based systems in decision making," IEEE Circuits and Systems Magazine, vol. 6, pp. 21-45, 2006.
[89] A. K. Tripathy, S. Chinara, and M. Sarkar, "An application of wireless brain–computer interface for drowsiness detection," Biocybernetics and Biomedical Engineering, vol. 36, pp. 276-284, 2016.
[90] U. Friese, J. Daume, F. Göschl, P. König, P. Wang, and A. K. Engel, "Oscillatory brain activity during multisensory attention reflects activation, disinhibition, and cognitive control," Scientific Reports, vol. 6, p. 32775, 2016.
[91] B. Keshavarz, J. L. Campos, and S. Berti, "Vection lies in the brain of the beholder: EEG parameters as an objective measurement of vection," Frontiers in Psychology, vol. 6, 2015.
[92] C. B. Saper, T. E. Scammell, and J. Lu, "Hypothalamic regulation of sleep and circadian rhythms," Nature, vol. 437, p. 1257, 2005.
[93] M. M. Schartner, A. Pigorini, S. A. Gibbs, G. Arnulfo, S. Sarasso, L. Barnett, et al., "Global and local complexity of intracranial EEG decreases during NREM sleep," Neuroscience of Consciousness, vol. 2017, pp. niw022-niw022, 2017.
[94] U. R. Acharya, S. Bhat, O. Faust, H. Adeli, E. C. P. Chua, W. J. E. Lim, et al., "Nonlinear Dynamics Measures for Automated EEG-Based Sleep Stage Detection," European Neurology, vol. 74, pp. 268-287, 2015.
[95] D. L. Woods, J. M. Wyma, E. W. Yund, T. J. Herron, and B. Reed, "Age-related slowing of response selection and production in a visual choice reaction time task," Frontiers in Human Neuroscience, vol. 9, 2015.
[96] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, et al., "The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis," Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, vol. 454, p. 903, 1998.
[97] I. Daubechies, "The wavelet transform, time-frequency localization and signal analysis," IEEE Transactions on Information Theory, vol. 36, pp. 961-1005, 1990.
Appendix A4: Results from F3-M2
Here we present the results of the sigmoid wake probability model on sleep-EEG data for the F3-M2 electrode. Table
A4.1 summarizes the number of 3-s segments used in this study for model development and validation.
Table A4.1: Average and standard deviation of the number of 3-s segments from F3-M2 used in this study for
model development and validation.
Stage         Number of Segments per Subject (mean ± SD)
Arousal       6 ± 1
Deep sleep    38 ± 5
Non-REM 1     388 ± 86
Awake         1438 ± 212
Table A4.2 shows the sigmoid parameters computed from the training data for the F3-M2 electrode. The resultant
sigmoid functions and the feature weights are shown in Fig. A4.1. It is also evident from Fig. A4.1 (b) that the
feature weights in frontal electrodes are similar.
Table A4.2: Sigmoid parameters computed from the training data for F3-M2
Frequency band    a        b
Alpha             0.017    0.157
Beta              0.045    0.413
Delta             0.172    0.919
Figure A4.1: (a) The resultant sigmoid functions for the three features for F3-M2; (b) out-of-bag (OOB) permuted
predictor delta error for the three features computed from the training data.
Figure A4.2: Distributions of relative power of (a) alpha, (b) beta, and (c) delta for three ranges of Pr(W) for
F3-M2. Here, all transitions from non-REM 1 to wakefulness and vice versa are considered. The segments with low Pr(W)
(sleep cluster, Pr(W) < 28) have low alpha and beta power and high delta power, while those with high Pr(W) (awake
cluster, Pr(W) > 55) have high alpha and beta power and low delta power.
The quality of the clusters and the choice of model parameters are further validated by the feature distributions for
the F3-M2 electrode when the model is applied to all non-REM 1 to wakefulness transitions of the testing subjects.
Prior studies [13, 15] have shown that as an individual goes from wakefulness to non-REM 1, alpha and beta power
decrease and delta power increases. Thus, if Pr(W) is low, the relative power values of the alpha and beta bands
should be low, which is exactly what we see in Fig. A4.2 (a) and Fig. A4.2 (b); the relative power values of alpha
and beta are higher for high Pr(W). The opposite pattern is seen for delta in Fig. A4.2 (c). Therefore, the
consistent feature distributions validate the choice of sigmoid parameters and the efficacy of the sigmoid wake
probability model.
Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study
Figure A5.1: Variation of the relative power of the gamma (30-100 Hz) band (top panel), the corresponding
button-press data (middle panel), and facial video rating (bottom panel). The relative power of the gamma band is
high when the participant is alert and low when the participant is drowsy.
Fig. A5.1 (top panel) shows the relative power of the gamma band (30-100 Hz) for an entire block of a particular
participant. The button-press data (middle panel) and the facial video rating (bottom panel) of the same block are
also shown. Fig. A5.1 reveals that the relative power of the gamma band follows the gold standards, yielding high
values in alert episodes and low values in drowsy episodes. Therefore, the gamma band was used in the modified
sigmoid wake probability model.
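For illustration, a relative band power feature of this kind can be computed as sketched below. This is a minimal example, not the implementation used in the thesis: the function name relative_band_power, the 0.5-100 Hz reference range used for normalization, the plain periodogram, and the synthetic test signal are all assumptions. A direct DFT is used so that the sketch needs only the standard library; an FFT routine would normally be preferred.

```python
import cmath
import math

def relative_band_power(x, fs, band, total=(0.5, 100.0)):
    """Fraction of spectral power in `band` relative to the `total` range (Hz)."""
    n = len(x)
    mean_x = sum(x) / n
    centered = [v - mean_x for v in x]           # remove the DC offset
    band_power = total_power = 0.0
    for k in range(1, n // 2 + 1):               # positive-frequency DFT bins
        freq = k * fs / n
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        if total[0] <= freq <= total[1]:
            total_power += power
            if band[0] <= freq < band[1]:
                band_power += power
    return band_power / total_power if total_power > 0 else 0.0

# Synthetic 1-s segment: a strong 40 Hz component plus a weak 10 Hz component.
fs = 256
segment = [math.sin(2 * math.pi * 40 * t / fs)
           + 0.1 * math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
gamma_rel = relative_band_power(segment, fs, band=(30, 100))
theta_rel = relative_band_power(segment, fs, band=(4, 8))
print(gamma_rel, theta_rel)
```

Because the feature is a ratio, it is insensitive to overall amplitude scaling of the EEG, which is one reason relative rather than absolute power is often used across participants.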
In the EEG data of the reaction time study, changes in theta (4-8 Hz) band power were also among the most prominent.
Fig. A5.2 (top panel) shows the relative power of the theta band for an entire block of a particular participant. The
button-press data (middle panel) and the facial video rating (bottom panel) of the same block are also shown. Fig.
A5.2 reveals that the relative power of the theta band follows the gold standards, yielding high values in alert
episodes and low values in drowsy episodes. Therefore, the theta band was used in the modified sigmoid wake
probability model.
Figure A5.2: Variation of the relative power of the theta (4-8 Hz) band (top panel), the corresponding button-press
data (middle panel), and facial video rating (bottom panel). The relative power of the theta band is high when the
participant is alert and low when the participant is drowsy.
Appendix B5: Mobility Parameter
Here we elucidate how mobility can be interpreted in the frequency domain. Let x(t) be a zero-mean signal with
Fourier transform X(ω).
Then, up to a constant factor, the variance of x(t) is
$\sigma^2 = \text{Energy of } x(t) = \text{Area under the squared magnitude spectrum of } x(t) = \int |X(\omega)|^2 \, d\omega$ (B5.1)
Again, the Fourier transform of dx/dt is jωX(ω).
Therefore, $\text{Variance of } dx/dt = \int \omega^2 |X(\omega)|^2 \, d\omega$ (B5.2)
$\text{Mobility of } x(t) = \sqrt{(B5.2) \div (B5.1)} = \sqrt{\int \omega^2 \, \frac{|X(\omega)|^2}{\sigma^2} \, d\omega}$ (B5.3)
The fraction $|X(\omega)|^2 / \sigma^2$ is the normalized power spectrum of x(t). Therefore, mobility
represents the frequency standard deviation of the power spectrum.
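The equivalence above can be checked numerically. The following is a minimal sketch, assuming a synthetic two-tone signal and hypothetical function names (mobility_time, mobility_freq). The time-domain version approximates dx/dt by a first difference, so the two values agree only approximately, and the agreement is better when the signal content lies well below the sampling rate.

```python
import cmath
import math
from statistics import pvariance

def mobility_time(x, fs):
    """Hjorth mobility from the time domain: sqrt(var(dx/dt) / var(x))."""
    dx = [(b - a) * fs for a, b in zip(x, x[1:])]   # first-difference derivative
    return math.sqrt(pvariance(dx) / pvariance(x))

def mobility_freq(x, fs):
    """Mobility as the square root of the second moment of the normalized power spectrum."""
    n = len(x)
    mean_x = sum(x) / n
    centered = [v - mean_x for v in x]
    num = den = 0.0
    for k in range(1, n // 2):                       # positive frequencies below Nyquist
        omega = 2 * math.pi * k * fs / n             # angular frequency in rad/s
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        num += omega ** 2 * power
        den += power
    return math.sqrt(num / den)

# Synthetic 1-s signal (illustrative only): 6 Hz and 20 Hz components.
fs = 256
x = [math.sin(2 * math.pi * 6 * t / fs) + 0.5 * math.sin(2 * math.pi * 20 * t / fs)
     for t in range(fs)]
m_time = mobility_time(x, fs)
m_freq = mobility_freq(x, fs)
print(m_time, m_freq)
```

Both values are in rad/s; dividing by 2π gives the corresponding frequency in Hz.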
Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm
The proposed method attempts to detect drowsy episodes using respiratory inductance plethysmography (RIP)
signals. A schematic outline of the proposed scheme is shown in Fig. C5.1.
Figure C5.1: A schematic outline of the proposed method.
First, the RIP signals were filtered using a low-pass filter with a cut-off at 0.6 Hz. Next, the data were divided
into 10-s segments. If a 10-s segment contained an episode of drowsiness (i.e. at least 1 s of drowsiness), the
segment was rated as drowsy. Next, the variance, skewness, kurtosis, and average mean to max features were extracted
from each 10-s segment. Then 60% of the participants’ data were selected for training and the remainder for testing.
The training data were used to train a random forest classifier, which classified the segments as drowsy or
non-drowsy. Over 100 runs, the accuracy, sensitivity, and specificity were 85.31 ± 3.87%, 73.45 ± 18.97%, and 90.04 ±
6.35%, respectively. The maximum accuracy, sensitivity, and specificity over the 100 runs were 90.35%, 91.76%, and
99.87%, respectively.
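The segmentation, labeling, and feature extraction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function names are hypothetical, the labeling rule counts the total number of drowsy samples in a segment (rather than requiring one contiguous episode), the split is made at the participant level, and the low-pass filter, the "average mean to max" feature, and the random forest classifier are omitted.

```python
import math
import random
from statistics import mean

def segment_features(signal, drowsy_mask, fs, seg_s=10, min_drowsy_s=1):
    """Split a signal into non-overlapping segments and return
    ((variance, skewness, kurtosis), label) for each segment."""
    seg_len = int(seg_s * fs)
    rows = []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        mask = drowsy_mask[start:start + seg_len]
        mu = mean(seg)
        m2 = mean((v - mu) ** 2 for v in seg)        # variance
        m3 = mean((v - mu) ** 3 for v in seg)
        m4 = mean((v - mu) ** 4 for v in seg)
        skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
        kurt = m4 / m2 ** 2 if m2 > 0 else 0.0
        # Assumed rule: drowsy if at least min_drowsy_s seconds are marked drowsy.
        label = int(sum(mask) >= min_drowsy_s * fs)
        rows.append(((m2, skew, kurt), label))
    return rows

def split_participants(ids, train_frac=0.6, seed=0):
    """Randomly assign participant IDs to training and testing sets."""
    ids = sorted(ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_frac * len(ids))
    return ids[:n_train], ids[n_train:]

# Synthetic example: 20 s of a 0.2 Hz breathing-like signal sampled at 10 Hz.
fs = 10
rip = [math.sin(2 * math.pi * 0.2 * t / fs) for t in range(200)]
mask = [1] * 10 + [0] * 90 + [1] * 5 + [0] * 95   # 1 s drowsy, then 0.5 s drowsy
rows = segment_features(rip, mask, fs)
train_ids, test_ids = split_participants(range(10))
print(rows, train_ids, test_ids)
```

Repeating the random split with different seeds and retraining each time would correspond to the repeated runs reported above.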