Developing a System for High-Resolution Detection of Driver Drowsiness

Using Physiological Signals

by

Ahnaf Rashik Hassan

A thesis submitted in conformity with the requirements

for the degree of Master of Applied Science

Institute of Biomaterials and Biomedical Engineering

University of Toronto

© Copyright by Ahnaf Rashik Hassan 2018

Developing a System for High-Resolution Detection of Driver Drowsiness Using Physiological

Signals

Ahnaf Rashik Hassan

Master of Applied Science

Institute of Biomaterials and Biomedical Engineering

University of Toronto

2018

Abstract

Background: This research aims to develop a high-resolution, reliable, and efficient drowsiness detection system.

Existing systems for detecting drowsiness have low resolution, are expensive, depend on external parameters, or are

inconvenient for the driver.

Method: Two studies were conducted: First, we analyzed electroencephalogram (EEG) data collected during a sleep

study to develop a high-resolution drowsiness detection algorithm. This algorithm was then tested in a second study

that actively engaged participants in a reaction time task.

Results: In the sleep study, a sigmoid wake probability model yielded high drowsiness detection rates. In the

reaction time study, however, the same method showed low sensitivity. Instead, a time-domain feature-based

algorithm performed best, with high accuracy, sensitivity, and specificity.

Significance: Upon successful validation of the developed algorithm in a driving study, this research will help to

develop a reliable, wearable, and convenient device to detect drowsy driving that could increase road safety.

Acknowledgments

First and foremost, I am deeply grateful for the guidance, help, and support from my supervisors: Dr. Azadeh

Yadollahi and Dr. Behrang Keshavarz. I am sincerely grateful that they granted me an opportunity to pursue my

thesis with them. The constant supervision, support, and constructive criticism of Dr. Yadollahi and Dr. Keshavarz

immensely helped to improve the quality of my work and greatly contributed to my academic growth. I would also

like to thank my committee members: Prof. Geoff Fernie and Dr. Bruce Haycock for their thoughtful comments that

helped to improve the quality of the thesis.

Additionally, I would like to genuinely thank Dr. Muammar Kabir for his constant support, guidance and mentorship

throughout my master’s, especially in the data collection process. I would like to sincerely thank Prof. Stefan Berti

of Johannes Gutenberg-Universität, Mainz, Germany who personally taught me how to perform EEG data

collection. I would also like to express my special appreciation to Dr. Nasim Montazeri, Cathy Zhu, Joseph

Makanjuola, Bojan Gavrilovic, and Jamie Zhang for their support throughout my master’s study. Additional thanks

are extended to all the organizations that are funding this project.

Finally, I will eternally remain indebted to my mother whose constant support continues to be the source of strength

in all my endeavors.

Table of Contents

Acknowledgments
List of Tables
List of Figures
List of Appendices

Chapter 1: Introduction
    1.1 Overview
    1.2 Drowsiness Physiology
    1.3 Motivation
    1.4 Thesis Structure

Chapter 2: Background
    2.1 Defining Drowsiness
    2.2 Measures Used for Drowsiness Detection
        2.2.1 Behavioral Measures
        2.2.2 Questionnaire-Based Measures
        2.2.3 Vehicle-Based Measures
        2.2.4 Physiological Measures
    2.3 Signal Processing and Pattern Recognition Techniques Used for Drowsiness Detection
    2.4 Commercially Available Systems

Chapter 3: Objectives

Chapter 4: Study 1 - Sleep Study
    4.1 Method
        4.1.1 EEG Data
        4.1.2 Preprocessing of EEG Signals
        4.1.3 Feature Extraction from EEG Frequency Bands
        4.1.4 Sigmoid Wake Probability Model
        4.1.5 Parameter Selection for the Model
        4.1.6 Validation
    4.2 Results
    4.3 Discussions

Chapter 5: Study 2 - Reaction Time Study
    5.1 Methods
        5.1.1 Study Design
        5.1.2 Stimulus and Data Collection
        5.1.3 Drowsiness Detection from Facial Video
        5.1.4 Performance Evaluation Metrics
        5.1.5 Development of Drowsiness Detection Algorithm
    5.2 Results
        5.2.1 Drowsiness Ratings
        5.2.2 Modified Sigmoid Wake Probability Model
        5.2.3 Step Function Model
        5.2.4 Time-Domain Feature-Based Algorithm
    5.3 Discussion

Chapter 6: General Discussions
    6.1 Summary of the Findings
    6.2 Comparison with Other Drowsiness Detection Systems
    6.3 Practical Implications
    6.4 Limitations

Chapter 7: Conclusions and Future Directions

References

Appendix A4: Results from F3-M2
Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study
Appendix B5: Mobility Parameter
Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm

List of Tables

Table 2.1: Rechtschaffen and Kales Sleep Staging Criteria

Table 2.2: A summary of widely used drowsiness scoring schemes

Table 2.3: A summary of drowsiness detection methodologies using physiological signals

Table 2.4: A summary of drowsiness detection methodologies that employ behavioral measures

Table 2.5: Overview of commercially available driver fatigue or drowsiness monitoring systems

Table 4.1: Demographic information of the participants for Study 1

Table 4.2: Average and standard deviation of the number of 3-s segments from F4-M1 used in this study for model development and validation

Table 4.3: Sigmoid parameters computed from the training data for F4-M1

Table 4.4: Results of the one-way repeated measures analysis of variance suggesting that the feature values (mean ± standard deviation) significantly change in the three clusters

Table 5.1: Drowsiness scale proposed in this thesis

Table 5.2: Average and standard deviation of the number of 1-s segments used per electrode

Table 5.3: Inter-rater agreement of the proposed video-based drowsiness scale for binary classification of alert vs. non-alert (scores 0 and 1)

Table 5.4: Inter-rater agreement of the proposed video-based drowsiness scale for the levels of drowsiness (scores 0 to 10)

Table 5.5: Performance comparison of the proposed methods with existing works in the literature

List of Figures

Figure 2.1: An illustration of commonly used measures for drowsiness detection

Figure 4.1: EEG electrode placement map commonly used in sleep studies. The six electrodes available in this study are highlighted with red circles

Figure 4.2: Spectrogram (3-s window, 50% overlap) of the first few minutes of EEG from the F4 channel of a single participant

Figure 4.3: Sigmoid functions used in the proposed model

Figure 4.4: An example of bootstrapping and out-of-bag instances

Figure 4.5: Sigmoid parameters to be estimated for alpha band

Figure 4.6: The resultant sigmoid functions for the three features for F4-M1

Figure 4.7: Pr(W) distribution of arousal and deep sleep segments

Figure 4.8: Silhouette and Davies-Bouldin index values computed from the 3-s segments of awake and non-REM 1 data obtained from the training participants

Figure 4.9: Scatter diagram of the awake, drowsy, and sleep clusters for the testing data

Figure 4.10: Distributions of relative power of alpha, beta, and delta for three clusters of awake, drowsy and sleep

Figure 4.11: Post hoc multiple comparison tests suggest that alpha, beta, and delta power features are significantly different between the clusters

Figure 5.1: Illustration of the stimulus used in the present study

Figure 5.2: Illustration of a participant performing the task

Figure 5.3: Electrode locations of the international 10-20 electrode placement scheme

Figure 5.4: A schematic outline of the modified sigmoid wake probability model proposed in this thesis

Figure 5.5: Component maps of the independent component analysis (ICA) of the EEG recordings of a subject

Figure 5.6: Sigmoid functions used in the modified sigmoid wake probability model

Figure 5.7: A schematic outline of the step function model proposed in this thesis

Figure 5.8: A schematic outline of the proposed time-domain feature-based algorithm

Figure 5.9: Illustration of bootstrap aggregating classifier

Figure 5.10: Reaction time in seconds of a participant in response to the experimental task (i.e., identifying color change of the fixation cross) throughout a single experimental block

Figure 5.11: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the sigmoid wake probability model on each of the electrodes of the 10-20 electrode placement system

Figure 5.12: Feature weights of F3 electrode computed from random forest

Figure 5.13: Mean and standard deviation of accuracy, sensitivity, and specificity of 100 runs of the step function based algorithm on each of the electrodes of the 10-20 electrode placement system

Figure 5.14: Mean and standard deviation and maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter based algorithm on each of the electrodes of the 10-20 electrode placement system

Figure 5.15: Mean and standard deviation and maximum values of accuracy, sensitivity, and specificity of 100 runs of the Hjorth parameter based algorithm on each of the electrodes of the 10-20 electrode placement system

List of Appendices

Appendix A4: Results from F3-M2

Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study

Appendix B5: Mobility Parameter

Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm

Chapter 1: Introduction

1.1 Overview

According to the World Health Organization, 1.25 million people die on the roads due to accidents each year across

the globe [1]. Drowsy driving is one of the leading causes of car crashes around the world. According to the US

National Highway Traffic Safety Administration (NHTSA), 100,000 crashes related to driver fatigue result in an

estimated 1,550 deaths, 71,000 injuries, and 12.5 billion dollars in monetary losses each year in the US alone [2].

Road accidents due to drowsy driving therefore lead to significant human and material costs and productivity

reduction. The United Nations General Assembly adopted a set of Sustainable Development Goals (SDGs) in

September 2015, one of which is to halve the deaths and injuries from road accidents around the world by 2020 [1].

Reduction of car crashes due to drowsy or fatigued driving is a precondition for achieving this goal and ensuring

road safety.

A large number of existing works in the literature have addressed the problem of drowsy driving

detection and have proposed various drowsiness and fatigue monitoring systems. However, most of these

research prototypes have not made their way into the real world owing to their high cost and poor detection

performance [3, 4]. Even though various commercialized driver alertness or fatigue monitoring systems have been

developed by automobile companies, these systems are only being used in the vehicles of the respective companies

[3-8]. Most of the existing systems use behavioral measure-based drowsiness detection algorithms that involve

analyzing facial video or eye tracking data [3]. However, these systems are reliant on lighting conditions [9]. Also,

these systems require the driver to be constantly monitored by a camera, which compromises the privacy of the

driver. In addition, many existing drowsiness detection systems use vehicle parameters such as lane deviation,

which are affected by external factors such as weather, road markings, and lighting conditions [4]. Moreover,

vehicle parameter variations cannot uniquely be attributed to drowsiness, since driving under the influence of

alcohol, antidepressants, or other drugs and other forms of impaired driving also affect these parameters [10-12]. Therefore, there is a

need for an efficient, highly accurate, and cost-effective drowsiness detection system that can be used in the

community.

1.2 Drowsiness Physiology

Drowsiness, also termed sleep onset, sleepiness, low arousal state, or somnolence, refers to the state of

strong desire for sleep that occurs just before a person falls asleep [13]. Since falling asleep (and therefore

sleepiness) is a neural process, brain rhythms are affected by drowsiness. Consequently, the electroencephalogram

(EEG) is the most commonly used means for detecting drowsiness. Brain waves consist of five dominant frequency

bands: delta (1-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-100 Hz). In sleep-EEG data,

drowsiness or sleep onset is characterized by a decrease of alpha waves [14], because alpha

waves are associated with relaxed wakefulness. In contrast, delta and theta power increase during drowsy episodes

[14, 15]. In the context of a reaction time or tracking task, or in a simulator study, drowsiness is associated with

a decrease in high-frequency gamma band (30-100 Hz) power and an increase in the power of lower frequency bands, namely

delta, theta, alpha, and beta (13-30 Hz) [16].
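
To make the notion of band power concrete, the following minimal Python sketch estimates the relative power of the five bands listed above for a single EEG segment. It is an illustration only and not the feature extraction pipeline of this thesis: it uses a plain DFT periodogram written with the standard library, whereas in practice a library routine (for example, a Welch estimate) would normally be preferred. The function names, the 128 Hz sampling rate, and the synthetic test signal are all assumptions made for the example.

```python
import cmath
import math

# Band edges in Hz, taken from the text above.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def periodogram(segment, fs):
    """Return (freqs, power) of the one-sided DFT periodogram of a segment."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def relative_band_power(segment, fs, bands=BANDS):
    """Fraction of the summed in-band power that falls in each band."""
    freqs, power = periodogram(segment, fs)
    absolute = {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
        for name, (lo, hi) in bands.items()
    }
    total = sum(absolute.values())
    if total <= 0:
        return {name: 0.0 for name in bands}
    return {name: value / total for name, value in absolute.items()}


# Hypothetical 3-s segment sampled at 128 Hz (assumed rate):
# a strong 10 Hz (alpha) rhythm plus a weaker 2 Hz (delta) component.
fs = 128
segment = [
    math.sin(2 * math.pi * 10 * i / fs) + 0.2 * math.sin(2 * math.pi * 2 * i / fs)
    for i in range(3 * fs)
]
rel = relative_band_power(segment, fs)
```

For this synthetic segment nearly all of the power falls in the alpha band. A drop in the alpha fraction together with a rise in the delta and theta fractions across successive segments would correspond to the sleep-onset pattern described above.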

Besides changes in brain wave activities, drowsiness is also associated with other physiological processes of the

human body such as autonomic nervous system (ANS) activity [17-19]. Drowsiness is characterized by increased

parasympathetic dominance and decreased sympathetic dominance. ANS activity can be noninvasively measured

from electrocardiogram (ECG) signals. In the heart rate variability spectrum, the low frequency (LF) band (0.04-0.15 Hz) power captures

sympathetic activity, whereas the high frequency (HF) band (0.15-0.4 Hz) power signifies parasympathetic activity

[18, 19]. The sympathetic-parasympathetic balance is obtained from the ratio of LF to HF power, which progressively

decreases as a subject moves from a vigilant to a drowsy state. Another frequency band of interest is the very low

frequency (VLF) band (0.003-0.04 Hz). The transition from wakefulness to sleep is associated with a decrease in

power in the VLF band. Thus, drowsiness strongly influences heart rate variability. Drowsiness is also associated

with a decrease in muscle activity [20]. Furthermore, drowsiness is associated with oxygen desaturation, which

triggers a loss of alertness and concentration. Studies have shown that peripheral oxygen saturation (SpO2) in the

forehead decreases when drowsiness gets stronger and increases when drowsiness gets weaker [21]. Drowsiness is

also reflected in eye movements (as measured by the electrooculogram or eye tracking). Increased eye blinks, partial

or full closure of the eyelids, and changes in eye blink duration and amplitude can be observed [22]. Drowsiness

also manifests as changes in skin conductance: the decrease in sympathetic dominance

causes the removal of the ionic fillings of the sweat glands of the skin [23], which gives rise to a sudden

decrease in skin conductance. Taken together, it is evident that drowsiness changes various physiological traits,

and these changes are reflected in the physiological signals.
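
The LF/HF ratio mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the cardiorespiratory algorithm of this thesis: the input is assumed to be a series of RR intervals that has already been resampled to an even rate, the spectrum is a plain DFT periodogram, and the function names, the 4 Hz resampling rate, and the two synthetic tachograms are hypothetical.

```python
import cmath
import math

# Heart rate variability band edges in Hz, as given in the text above.
HRV_BANDS = {"VLF": (0.003, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}


def hrv_band_power(rr_series, fs, bands=HRV_BANDS):
    """Sum DFT periodogram power in each HRV band.

    rr_series: RR intervals in seconds, assumed already resampled evenly at fs Hz.
    """
    n = len(rr_series)
    mean = sum(rr_series) / n
    centered = [x - mean for x in rr_series]  # remove the mean RR interval
    f_max = max(hi for _, hi in bands.values())
    power = {name: 0.0 for name in bands}
    k = 1
    # Only the bins below the highest band edge are needed.
    while k <= n // 2 and k * fs / n < f_max:
        f = k * fs / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        for name, (lo, hi) in bands.items():
            if lo <= f < hi:
                power[name] += abs(coeff) ** 2 / n
        k += 1
    return power


def lf_hf_ratio(rr_series, fs):
    """Ratio of LF to HF power; infinite if no HF power is present."""
    power = hrv_band_power(rr_series, fs)
    return power["LF"] / power["HF"] if power["HF"] > 0 else float("inf")


def synthetic_tachogram(lf_amp, hf_amp, fs=4, seconds=300):
    """Hypothetical RR series: 0.8 s mean with 0.1 Hz (LF) and 0.25 Hz (HF) oscillations."""
    return [
        0.8
        + lf_amp * math.sin(2 * math.pi * 0.10 * i / fs)
        + hf_amp * math.sin(2 * math.pi * 0.25 * i / fs)
        for i in range(fs * seconds)
    ]


fs = 4  # assumed resampling rate in Hz
alert_ratio = lf_hf_ratio(synthetic_tachogram(0.05, 0.02), fs)
drowsy_ratio = lf_hf_ratio(synthetic_tachogram(0.02, 0.05), fs)
```

With these made-up amplitudes the "alert" series yields a larger LF/HF ratio than the "drowsy" one, which mirrors the direction of change described in the text. Real RR series are unevenly sampled and noisy, so interpolation and artifact handling would be needed before such an estimate.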

1.3 Motivation

Over the years, researchers have attempted to develop drowsiness and fatigue detection systems that can be installed in

vehicles to monitor drivers [2-4, 16, 24-36]. Upon detection of drowsy episodes, the system would activate an alarm

to alert the vehicle driver. A drowsy driving detection system will greatly benefit the populations who are at higher

risk of drowsy driving related car crashes, including shift workers, patients with sleep related disorders, individuals

who take sedative medications, and occupational drivers. If a convenient and reliable drowsiness detection system is

developed, it could greatly reduce drowsy driving or fatigue-related car crashes. Furthermore, even though this work

reviews drowsiness particularly in the context of driving, a drowsiness detection system will also be useful for

mining workers, pilots, and locomotive operators.

1.4 Thesis Structure

The present thesis is organized as follows: In Chapter 2, we will discuss the existing drowsiness detection

algorithms and commercially available fatigue and drowsiness detection systems. Chapter 3 presents the objectives

of this thesis. In Chapter 4, we will describe the first study that was conducted in a sleep laboratory, explain the

algorithm developed based on the sleep study data, and discuss the obtained results. Next, in Chapter 5, we present

the second study based on a reaction time task and will discuss the developed algorithms together with the results

obtained from the collected data. Chapter 6 provides a general discussion of the two studies and the drowsiness

detection algorithms proposed in this thesis. Finally, Chapter 7 highlights future directions of this thesis and

provides concluding remarks.

Chapter 2: Background

In this chapter, we will provide an overview of the current literature about drowsiness detection. First, this chapter

will define drowsiness before discussing commonly used measures, signal processing and pattern recognition tools

used, and commercially available systems for drowsiness detection.

2.1 Defining Drowsiness

Unlike various stages of sleep, drowsiness is not a physiologically well-defined stage. This has led to the

development of various definitions or scoring schemes for drowsiness. Before describing drowsiness, we first

discuss how various sleep stages are defined, since in the first study, we will use sleep-EEG data to detect

drowsiness. Sleep stage scoring has traditionally been performed using the Rechtschaffen and Kales (R&K)

guideline [37], originally proposed in 1968 and presented in Table 2.1.

Table 2.1: Rechtschaffen and Kales Sleep Staging Criteria [37].

Sleep stage | Scoring criteria

Awake | >50% of the epoch consists of alpha (8-13 Hz) activity or low voltage, mixed (2-7 Hz) frequency activity.

Non-REM stage 1 | 50% of the epoch consists of relatively low voltage mixed (2-7 Hz) activity, and <50% of the epoch contains alpha activity. Slow rolling eye movements that last several seconds are often seen in early stage 1.

Non-REM stage 2 | Appearance of sleep spindles and/or K complexes, and <20% of the epoch may contain high voltage (>75 µV, <2 Hz) activity. Sleep spindles and K complexes each must last >0.5 seconds.

Non-REM stage 3 | 20-50% of the epoch consists of high voltage (>75 µV), low frequency (<2 Hz) activity.

Non-REM stage 4 | >50% of the epoch consists of high voltage (>75 µV), <2 Hz delta activity.

REM | Relatively low voltage mixed (2-7 Hz) frequency EEG with episodic rapid eye movements and absent or reduced chin EMG activity.

In 2005, the American Academy of Sleep Medicine (AASM) published a modified guideline which combined S3 and

S4 into one stage, namely N3. A salient trait of both the R&K and AASM guidelines is that they score EEG signals on a

30 s basis. Neither the R&K nor the AASM guideline gives an explicit definition of drowsiness, unlike wakefulness and

various states of sleep. Thus, awake to sleep transition or sleep onset points are discretized in both scoring

guidelines. In other words, this approach considers wake to sleep transition and vice versa as an instantaneous

process and completely overlooks the interplay of neural system and behavior that occurs just before sleep onset.

Table 2.2 summarizes some of the most widely used definitions of drowsiness.

Table 2.2: A summary of widely used drowsiness scoring schemes.

Scoring scheme | Brief description

Non-REM 1 [7] | Identifying sleep stage non-REM 1 as drowsiness.

Karolinska drowsiness scoring method (KDS) [3][5] | EEG signals are segmented on a 2-second basis. Each segment is checked for sleepiness using EEG and EMG.

Self-evaluated [13] | Evaluation of sleepiness by the vehicle driver.

Evaluation by external human observer [13] | Evaluation of sleepiness by an external human observer who rates the drowsiness based on eye closure, head movement, etc.

Event related lane departure paradigm [50] | The driver has to steer the car back towards a particular lane as the vehicle drifts away from it. The reaction time of the driver is measured. Higher reaction time indicates drowsiness.

Wierwille and Ellsworth criteria [11][31][12] | The level of drowsiness is rated from videos of the driver. The scale varies from 1 to 5, where 1 denotes not drowsy and 5 denotes extremely drowsy.

Behavioral task [29] | Reaction in a button pressing task.

Johns drowsiness scale [49] | A scale that combines different variables representing the variability of the duration and velocity characteristics of eyelid closures and blinks, measured each minute.

Vehicle-based parameters [17] | Thresholding vehicle-based measures, such as standard deviation of lane position.

Ocular and facial features from video [17] | Thresholding common ocular features, such as percentage of eye closure (PERCLOS).

Unfortunately, most of the existing works on drowsiness and fatigue detection define sleep stage non-REM 1 as

drowsiness according to R&K guidelines, considering sleep as a discrete process. Drowsiness, however, is a state

that precedes non-REM 1. This is because non-REM 1 is a sleep stage, and an individual will be drowsy near the

transition of wake to sleep [14, 15]. It can be seen from Table 2.2 that only the Karolinska drowsiness scoring method

(KDS) employs physiological signals such as EEG and electromyogram (EMG). In contrast, all the other scoring

schemes employ behavioral or vehicle-based measures for detection of drowsy driving.

2.2 Measures Used for Drowsiness Detection

Measures used for detecting drowsy driving can be separated into driver-based measures and vehicle-based

measures. Driver-based measures refer to various behavioral and physiological characteristics recorded from the

driver [9, 16, 20, 24-29, 31-33, 38-41]. Figure 2.1 depicts the commonly used measures for detecting driver

drowsiness. Driver-based measures can further be subdivided into three categories, namely physiological, behavioral,

and questionnaire-based measures. Vehicle-based measures are the parameters calculated from the vehicle [42, 43].

These measures include lane deviation and steering wheel movement. In the following, we will focus on describing

each of these measures in detail.

Figure 2.1: An illustration of commonly used measures for drowsiness detection.

2.2.1 Behavioral Measures

Several behavioral changes can also be observed during drowsiness. These changes include frequent yawning,

increased eye blinking, nodding or sudden movement of the head to one side, and changes in facial features.

Behavioral measure-based methodologies attempt to capture and exploit these characteristic features to develop their

sleepiness detection algorithm. Table 2.4 summarizes some of the existing works that use behavioral measures for

drowsy driving detection.

[Figure 2.1 diagram: drowsiness detection measures divide into driver-based measures (physiological, behavioral, and questionnaire-based measures) and vehicle-based measures.]

Table 2.4: A summary of drowsiness detection methodologies that employ behavioral measures.

Authors | Signals used | Experimental paradigm | Number of subjects | Drowsiness scoring scheme | Feature extraction | Classification or regression model | Accuracy

Bergasa et al. [34] | Active IR illuminator and a miniature CCD camera sensitive to IR to capture facial images | Night and day driving in a motorway | Unspecified | Fuzzy rules based on the six ocular features extracted in this work | Six features: PERCLOS, eye closure duration, blink frequency, nodding frequency, face position, and fixed gaze | Kalman filtering and fuzzy classifier | 95.62%

D’Orazio et al. [35] | Facial image | Image acquisition in different lighting conditions of subjects while driving a car | 2 | PERCLOS values | Hough transform | ANN | 95%

Flores et al. [36] | Facial image | Driving by day and at night | Unspecified | PERCLOS values | Gabor filter and PERCLOS | SVM | 93%

Sommer et al. [9] | Facial video | Night-time driving in a driving simulator | 16 | KDS, driving-related parameter (SDL), and PERCLOS | PERCLOS and spectral domain features | SVM | 66-74% for PERCLOS

Yin et al. [44] | Video | Videos collected from webcam as the subjects operated computers on a worktable | 30 | Annotation based on yawning, eye closure, etc. from videos | Local binary pattern feature | Adaptive boosting | 98.33%

The advantage of behavioral measures for drowsiness detection is that they are contactless. In other words, these

systems do not require a device or sensor to be attached to the driver’s body. However, these systems have a number of

limitations. First, systems using ocular measures show poor performance for drivers wearing glasses [43]. Second,

some of the works in the literature use eye measures that are dependent on lighting conditions. As a result, they do

not work well in poor lighting conditions, such as cloudy days and at night. Even though some studies used infrared

cameras to overcome this limitation, these detection schemes fail to perform adequately during day time [4]. Third,

eye and head movement-based systems are not very popular among vehicle drivers because most drivers are

uncomfortable having a camera focused on their faces or bodies all the time [43].


2.2.2 Questionnaire-Based Measures

Prior works on drowsiness detection have also used subjective questionnaire-based measures for drowsiness

detection wherein the subject had to fill out a questionnaire to rate their level of drowsiness [45, 46]. Next, the

intensity of sleepiness was measured based on the ratings. Some of the most commonly used questionnaires are the

Dundee Stress State Questionnaire (DSSQ) [47], the Karolinska Sleepiness Scale (KSS) [48], and the Epworth

Sleepiness Scale (ESS). The DSSQ attempts to assess the level of task-induced stress and arousal of the subject. The KSS

measures acute sleepiness by asking the subject to rate their level of sleepiness on a scale of 1 (extremely alert) to 10

(extremely sleepy). The ESS measures general sleepiness by asking the respondent to rate on a 4-point scale (0-3)

their usual chances of having dozed off or fallen asleep while engaged in eight different activities. Despite their use in

some existing studies, questionnaire-based measures have several caveats associated with them. For instance, it

is problematic to obtain feedback from the driver on his/her own drowsiness while driving, since it would require a

person to accompany the driver all the time. This approach also affects the attention and level of alertness of the

driver. Furthermore, sudden variations in drowsiness cannot be measured using such questionnaires because

most of the questionnaires are presented and filled out at 5 min intervals. Moreover, questionnaire-based

ratings do not fully concur with other measures, such as physiological, behavioral, and vehicle-based measures [4, 47]. Taken

together, questionnaire-based measures are not effective for developing a drowsiness detection algorithm.

2.2.3 Vehicle-Based Measures

Vehicle-based measures have also been used for drowsiness detection. The two most commonly used such measures

are standard deviation of lane position (SDL or SDLP) and steering wheel movement (SWM). An inattentive or

sleepy driver often drifts while driving. Therefore, SWM and SDLP change when the driver becomes drowsy. The

changes in SWM are manifested by increased standard deviation of steering angle

and increased amplitude of steering movement [42]. Moreover, acceleration and brake patterns are influenced by the

sleepiness of the driver [43]. Studies have shown that fatigue or drowsiness of the driver is manifested by increases

in standard deviation of the vehicle speed [4, 42, 43]. Therefore, the aforementioned parameters have been used in

prior studies that use vehicle-based measures for drowsy driving identification.

The main advantage of vehicle-based measure systems is that they are non-intrusive. Consequently, these systems

are convenient and comfortable for the driver. Nevertheless, there are various caveats of vehicle-based measures for

drowsy driving identification. First, prior studies have shown that vehicle-based metrics are poor predictors of


drowsy driving [3, 4, 43]. Second, vehicle-based measures can be affected by factors other than fatigue or

drowsiness. For instance, they can be influenced by drugs such as antidepressants or by alcohol [10-12]. Third, these

vehicle-based parameters vary greatly from driver to driver. That is, an algorithm that yields good performance for

one driver may yield poor performance for another. Finally, vehicle-based measures are greatly reliant on vehicle

type and external factors, including road geometry, weather, and road marking.

2.2.4 Physiological Measures

It is evident from Section 1.2 that drowsiness is associated with alterations in characteristics of physiological

signals. Consequently, drowsy driving detection systems that use physiological signals attempt to capture these

alterations in the signal characteristics recorded from the driver.

Table 2.3 summarizes some of the existing works that use physiological signals to detect drowsiness. EEG [4, 16, 24, 25, 27, 49-51] is the most widely used physiological signal for drowsiness detection. The most commonly used features in EEG-based drowsy driving detection techniques are power spectral density (PSD) based features, average and relative power of various brain rhythms including alpha, delta, and theta waves, time-domain features (e.g. mean, variance, minimum, maximum, kurtosis of EEG signal amplitude), fractal dimension, approximate entropy, and Lempel-Ziv complexity. Drowsiness is also associated with partial or full eyelid closure and higher blink frequency. Consequently, electrooculogram (EOG) [20, 22, 49, 51] and electromyogram (EMG) [20, 22, 51] have also been used in prior studies to quantify drowsiness. The most common features extracted from EOG include peak eyelid closing velocity, delay of eyelid reopening, blink amplitude, blink duration, peak opening velocity of eyelid, eyelid opening speed, eyelid closure speed, percentage of eye closure (PERCLOS), which denotes the percentage of time during which the eyes were at least 80% closed, average eye closure (AVECLOS), eyelid closing time, eye-blink interval, and eye-blink frequency every 20 seconds. EMG-based methods, on the other hand, extract features that track the decrease in muscle tension due to drowsiness [20]. Furthermore, the studies that use ECG signals [17-19] mostly focus on calculating heart rate variability (HRV) features in the time and frequency domains. The time-domain features include the standard deviation of the RR intervals, the number and proportion of RR intervals, and the root mean square of the differences of successive RR intervals, among others. High frequency (HF) and low frequency (LF) power and the ratio of LF to HF are the frequency-domain features used for drowsiness detection. The LF to HF ratio continues to decrease as an individual becomes increasingly drowsy [18, 19]. It is also evident from Table 2.3 that most of the existing studies use multiple physiological signals to improve performance.
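As an illustration of two of the time-domain HRV features mentioned above, the following minimal Python sketch computes the standard deviation of the RR intervals and the root mean square of successive RR differences. The function name, the labels SDNN and RMSSD (conventional names that this thesis does not use), and the example RR values are assumptions made for illustration; they are not taken from the cited studies.

```python
import math


def hrv_time_domain(rr_ms):
    """Return (SDNN, RMSSD) in ms for a list of RR intervals given in ms."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of the differences of successive RR intervals.
    diffs = [rr_ms[i + 1] - rr_ms[i] for i in range(n - 1)]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return sdnn, rmssd


# Hypothetical RR intervals in milliseconds.
rr = [800, 810, 790, 805, 795]
sdnn, rmssd = hrv_time_domain(rr)
print(f"SDNN = {sdnn:.2f} ms, RMSSD = {rmssd:.2f} ms")
```

The sketch covers only the time domain; the LF and HF powers would additionally require a spectral estimate of the RR interval time series.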


Table 2.3: A summary of drowsiness detection methodologies using physiological signals.

| Authors | Signals used | Experimental paradigm | N | Drowsiness scoring scheme | Feature extraction | Classification or regression model | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Akin et al. [20] | Chin EMG and EOG | Overnight sleep study | 30 | Expert scoring from EEG and EMG | Discrete wavelet transform (DWT) | Artificial neural network (ANN) | 98-99% |
| Khushaba et al. [49] | ECG, EOG, and EEG | 1 h monotonous driving in a driving simulator | 31 | Wierwille and Ellsworth criteria | Fuzzy mutual-information (MI)-based wavelet packet transform | Support vector machine (SVM), k nearest neighbor, linear discriminant analysis (LDA) | 95-97% |
| He et al. [50] | Horizontal and vertical EOG, chin EMG, and EEG | 45 min driving in a moving base driving simulator | 37 | Karolinska Drowsiness Scale (KDS) | Eyelid movement parameters extracted from EOG | SVM | 90% |
| Patel et al. [17] | ECG | Sensory motor driver simulator task | 12 | Unspecified | Power spectrum density of RR interval | ANN | 90% |
| Su et al. [22] | EOG, EEG, and EMG | Monotonous driving in a third generation moving base simulator after a full-night shift with no sleep | 44 | KDS | 14 eyelid features (e.g. blink duration/amplitude, peak opening/closing velocity, lid opening/closing speed etc.) extracted from EOG | Partial least squares regression | 90% |
| Fu et al. [27] | EEG | Event-related lane departure paradigm in a virtual-reality based driving simulator | 6 | Reaction time and lane deviation based scoring commonly used in event-related lane departure paradigm | ICA and spectral powers computed by FFT | Self-organizing neural fuzzy inference network | 96.7% |
| Correa et al. [24] | EEG | Overnight sleep study | 16 | Considers non-REM 1 as drowsiness | 19 features extracted from the frequency and time domains and wavelet decomposition of the EEG | ANN | 87.4% for drowsiness and 83.6% for alertness |
| Kurt et al. [51] | EMG, EEG, and EOG | Overnight sleep study | 10 | Expert scoring from EEG, EOG, and EMG | DWT features | ANN | 97-98% |
| Vicente et al. [18] | ECG | 2 h driving in a driving simulator | 11 | Expert annotation based on EEG, percentage of eye closure, video recording | Heart rate variability features in the frequency domain | LDA | Positive predictive value: 86.31%; Sensitivity: 70.58% |
| Chin et al. [25] | EEG | Night-time driving in a virtual reality based driving simulator | 10 | Alpha and theta rhythm and alertness model | Mahalanobis distance of alpha and theta rhythm | Thresholding | 88.7% |


The advantages of physiological signal-based drowsiness detection systems are manifold. They are highly reliable

and tend to provide high accuracy values [4, 16] compared to those of behavioral-based and vehicle-based measures.

Moreover, these systems are not dependent on surrounding lighting conditions and eliminate the discomfort of the

driver of constantly being monitored by a camera [3].

Notwithstanding their benefits and widespread use in the drowsiness detection literature, physiological signals have some limitations for detecting drowsy driving. For instance, unlike other driver-based measures, these systems are not contactless and require many sensors and electrodes to be placed on the driver’s body, which is uncomfortable for the driver. Furthermore, most of the physiological signal-based algorithms use signal segments of 30 s or longer [20, 24, 33, 51], whereas drowsy episodes can be as short as 1 s [52]. Moreover, physiological signal acquisition equipment is expensive and difficult to place. Also, some of the existing works in the literature analyze overnight sleep study data for drowsiness detection, the findings of which cannot be generalized to drowsy driving.

2.3 Signal Processing and Pattern Recognition Techniques Used for Drowsiness Detection

The goals of signal processing are to clean the data and to identify effective markers of drowsiness, which are later fed into pattern recognition tools to develop a drowsiness detection algorithm. In this section, the signal processing and pattern recognition techniques used for driver drowsiness detection are discussed. Transform-domain features often capture more information on drowsiness than features computed directly from the signal domain. Therefore, most of the existing works in the literature perform feature extraction in the transform domain [18-20, 35, 51]. A salient trait seen in all physiological signal-based systems is that the method attempts to capture information from a particular frequency band. For example, in EEG, alpha (8-13 Hz), theta (4-8 Hz), and delta (1-4 Hz) may be bands of interest for feature extraction. In ECG, the HF and LF bands of the RR interval time series are often used for feature extraction. The goal of extracting a particular frequency band has motivated the use of various frequency-domain signal decomposition techniques in the drowsy driving detection literature. These decompositions include power spectrum density analysis [17], discrete wavelet transform (DWT) [20, 24, 51], fast Fourier transform [27], and wavelet packet transform [49].

Even though these signal analysis techniques successfully decompose the data into various frequency bands, each of these techniques has caveats. For example, the Fourier transform is not well-suited for analyzing nonlinear and nonstationary signals such as EEG [53]. The wavelet transform, on the other hand, is reliant on the choice of basis functions. The use of data-adaptive signal decomposition schemes, such as empirical mode decomposition [53], is not present in the literature. The signal processing techniques used in behavioral measure-based driver drowsiness detection systems are employed to detect the face and eyes in facial video frames. The most commonly used techniques for this purpose include texture-based features, such as local binary pattern [44] and its variants, Hough transform [35], and Gabor filters [36, 54].

Following feature extraction, various pattern recognition techniques are used to distinguish drowsy from non-drowsy signal episodes or to discriminate among various levels of drowsiness. A wide variety of classification techniques have been used in the existing literature. These include artificial neural network (ANN) [17, 20, 24, 35, 51], support vector machine (SVM) [36, 54-56], k nearest neighbors [49], Kalman filtering [34], particle filter [42], discriminant analysis [19], fuzzy logic based classifiers [34], and adaptive boosting [44], among others. Nevertheless, there is a dearth of studies that employ forecasting or prediction of drowsiness episodes.

2.4 Commercially Available Systems

The organizations that have conducted research to develop a drowsy driving detection system include government

organizations (e.g. Canada Safety Council, Ministry of Transportation of Ontario, Railway Association of Canada,

Transport Canada, Spanish Science and Technology council, U.S. Department of Defense, U.S. Federal Motor

Carrier Safety Administration), specialized companies (e.g. AcuMine, Neurocom, Sleep Diagnostics, Seeing

Machines, Ospat Pty, Pacific Science and Engineering Group, Pernix, Precision Control Design Inc., Security

Electronic Systems), original equipment manufacturers (e.g. General Motors/Saab, Caterpillar, Daimler Chrysler,

BMW, Audi etc.), and universities (e.g. University of Pennsylvania, CMU, University of Tokyo, Royal Institute of

Technology in Sweden, University of Technology of Berlin) [3, 4]. Table 2.5 summarizes some commercially

available drowsiness or fatigue monitoring systems developed by the aforementioned organizations.


Table 2.5: Some commercially available driver fatigue or drowsiness monitoring systems [3].

| Company name | Product name | Signals used in the system |
| --- | --- | --- |
| Audi [3] | Rest recommendation system | Ocular features |
| BMW [6] | Active Driving Assistant | HRV |
| Volkswagen [57] | Fatigue detection system | Ocular features |
| Delphi Corporation [58] | Driver State Monitor | - |
| Volvo [8] | Driver Alert Control | Head movement, ocular features |
| Carnegie Mellon University [3] | - | Head movement |
| US Army [3] | - | EEG |
| Scania [3] | - | Lane deviation, SWM |
| NHTSA [3] | - | Lane deviation, head movement, EEG, ocular features |
| AcuMine [59] | HaulCheck | Lane deviation |
| Attention Technologies [60] | Driver Fatigue Monitor | Ocular features |
| Subaru [3] | EyeSight Driver Assist | - |
| Seeing Machines [61] | Facelab | Head movement, ocular features |
| Smart Eye [62] | AntiSleep | Head movement, ocular features |
| Sleep Diagnostics [3] | Optalert | Ocular features |
| AssistWare Technologies [60] | SafeTrac | Lane deviation |
| Saab [3] | - | Head movement, ocular features |
| Bosch [7] | Driver drowsiness detection | SWM, vehicle speed, facial image |
| Siemens [63] | - | Lane deviation, HRV, ocular features |
| SMI [3] | InSight | Head movement, ocular features |
| Welkin [3] | Nap Zapper | Head movement |
| Denso [60] | - | HRV, ocular features |
| Neurocom [3] | EDVTCS | Skin conductance |
| Pernix [3] | ASTID | SWM |
| Ospat Pty [64] | OSPAT | Reaction time |
| Muirhead/Remote Control Tech. [3] | Fatigue Warning System | Reaction time |
| Security Electronic Systems [3] | Sleep Control Helmet System | Head movement |
| MCJ [3] | EyeCheck | Ocular features |
| Mobileye NV [65] | Vision/Radar Sensor | Lane deviation, SWM |
| Precision Control Design Inc. [3] | SleepWatch | - |
| ARRB Transport Research [3] | Fatigue Management System | Reaction time |
| International Mining Technologies [3] | Voice Commander | Reaction time |
| Mercedes-Benz [60] | Attention Assist | - |
| Nissan [66] | Driver Attention Alert | - |
| Mazda [3] | Lane Departure Warning System | - |
| Iteris Inc. [60] | AutoVue | Lane deviation |
| Advanced Safety Concepts [3] | Proximity Array Sensing System (PASS) | Head movement |

A close inspection of Table 2.5 reveals that most of the commercially available systems are based on behavioral measures. Systems that employ vehicle-based signals are fewer, and only a handful of systems use physiological measures. Furthermore, none of the existing systems use questionnaire-based measures. Moreover, Table 2.5 shows that most of the existing systems attempt to combine multiple measures to improve detection performance. Also, some of the companies did not reveal details of the signals that they used in their fatigue monitoring systems.

There are several drawbacks of commercially available drowsy driving detection systems. Commercially developed systems are not easily available to everyone [4], and one must purchase vehicles from a particular company to use their drowsiness and fatigue detection system. The technology used in the commercialized systems is not open-source. Furthermore, most of the systems presented in Table 2.5 generate a lot of false alarms [3], making those systems unreliable. Moreover, since most of the existing commercialized systems use either behavioral measures or vehicle-based measures, they share the limitations of behavioral and vehicle-based drowsiness detection systems described in Sections 2.2.1 and 2.2.3, respectively.


Chapter 3: Objectives

As previously described, the current methods to detect drowsy driving are of low resolution (30 s or longer window size) even though many of them give high detection accuracy. Only a handful of methods detect drowsiness at a high (i.e., less than 10 s) resolution. However, these studies yield poor detection performance. Furthermore, most of the existing algorithms are reliant on various external factors such as weather, road geometry, and lighting conditions. The goal of the present thesis is to overcome these issues. In detail, the two main objectives of this thesis are:

1. To develop a high-resolution drowsiness detection algorithm using EEG data collected from a sleep study.

2. To test the algorithms developed on sleep study data in a natural drowsiness-inducing setting that is similar to

daily driving.

The first objective of my thesis project is to develop a high-resolution and accurate drowsiness detection algorithm

using EEG data collected from a sleep study. In a sleep study, the subject is lying in the bed with eyes closed and

without much body movement, resulting in EEG signals that are relatively less contaminated by eye movement,

motion, and eye blink artifacts. Therefore, we use the relatively noise-free sleep-EEG data to develop a high-

resolution drowsiness detection algorithm.

The second objective is to test the algorithms developed on sleep study data in an experiment that is more similar to

driving. To achieve this objective, we will design an interactive reaction time task that will require the participant’s

attention yet be monotonous at the same time. In this study, we will use facial videos to identify the episodes of

drowsiness. The algorithm developed in the first objective will be validated against a gold-standard behavioral scale of

drowsiness based on facial video.


Chapter 4: Study 1 - Sleep Study

In this chapter, a detailed description of the overnight sleep study and the algorithm developed for detection of

drowsiness are presented. EEG data collected from a sleep study were analyzed to develop a high-resolution

drowsiness detection algorithm.

4.1 Method

For the development of the EEG-based drowsiness detection algorithm, data collected as part of an overnight sleep study at Toronto Rehabilitation Institute were used. The data were collected as part of another study, and we used these data in a post-hoc analysis. The rationale for using sleep data was that they generally tend to be less noisy, with fewer artifacts such as eye movements, than data recorded during wakefulness. Furthermore, compared to a simulated driving task, where it can be challenging to accumulate large amounts of data, a sleep study provides enough data for primary model development. A full attended overnight polysomnography (PSG) was conducted using Embla® N7000/S4500 (Natus Medical Incorporated) at the Toronto Rehabilitation Institute Sleep laboratory. Standard surface electrodes were applied to record EEG, electrocardiogram (ECG), and electromyogram (EMG). Respiratory rate and volume were monitored using chest and abdominal respiratory inductance plethysmography bands, airflow by nasal pressure cannula, and arterial oxyhemoglobin saturation (SaO2) using pulse oximetry. Sleep stages and arousals were scored in accordance with standard rules discussed in Chapter 2 [37].

4.1.1 EEG Data

The EEG recordings contained data from six electrodes – two frontal (F3/F4), two central (C3/C4), and two occipital

(O1/O2) electrodes. The electrodes were referenced against the mastoid electrodes (M1 and M2). The sampling rate

of the EEG data was 128 Hz. Fig. 4.1 illustrates the EEG electrode locations used in this study.


Figure 4.1: EEG electrode placement map commonly used in sleep studies [67]. The six electrodes available in this study are highlighted with red circles.

4.1.2 Preprocessing of EEG Signals

The EEG data were bandpass filtered using a Butterworth filter with 0.5-30 Hz cut-off frequencies. Next, the data were divided into 3 s segments. Since we aimed to extract features from the delta band (1-4 Hz) of EEG, the segment length was chosen to contain at least two cycles of delta activity. For drowsiness detection, all episodes that were rated by sleep technicians, based on the whole-night EEG recordings, as awake or non-rapid eye movement 1 (non-REM 1) were considered. The sleep technicians score the sleep-EEG data in 30 s epochs. That is, the technicians rated a 30 s EEG episode as awake when the segment consisted of at least 50% wakefulness; otherwise it was scored as non-REM 1 [15]. As a result, there could be short wakefulness, drowsy, and non-REM 1 episodes in segments scored as wakefulness as well as in segments scored as non-REM 1. In this study, we also used arousal segments as extreme cases of alertness and deep sleep segments as extreme cases of non-alertness for parameter selection and validation of the proposed method. Arousal segments occur after a respiratory event and are typically 3 s long [68]. During arousals, the participant is cortically awake. For deep sleep segments, we chose non-rapid eye movement 2 and 3 stages, during which the participant is definitely asleep.
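The segmentation step described above can be sketched as follows. This is a minimal illustration, assuming non-overlapping segments and discarding any incomplete tail (the text does not state either choice); the function names are hypothetical, and the band-pass filtering step is omitted.

```python
def segment_signal(samples, fs_hz=128, seg_len_s=3):
    """Split a sample list into non-overlapping segments of seg_len_s seconds.

    Assumption: an incomplete segment at the end of the recording is dropped.
    """
    seg_len = int(fs_hz * seg_len_s)
    n_segments = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def segments_per_epoch(epoch_len_s=30, seg_len_s=3):
    """Number of short segments contained in one technician-scored epoch."""
    return epoch_len_s // seg_len_s


eeg = [0.0] * (128 * 10)  # 10 s of placeholder samples at 128 Hz
segments = segment_signal(eeg)
print(len(segments), len(segments[0]), segments_per_epoch())
```

With a 128 Hz sampling rate, each 3 s segment holds 384 samples, and each 30 s scored epoch contains ten such segments, which is why a single epoch label can hide short wake or drowsy episodes.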


4.1.3 Feature Extraction from EEG Frequency Bands

After performing preprocessing, we extracted features from the EEG frequency bands. Prior studies have shown that

as an individual moves from wakefulness to non-REM 1, alpha and beta band powers decrease, and delta band

power increases [13, 15]. The spectrogram in Fig. 4.2 shows an example of the changes in the EEG frequency band

powers alpha and delta at sleep onset, as scored by a sleep technician. The beta band changes in the spectrogram are

not visible, since the beta power levels are smaller compared to other bands.

In this study, the relative power features of alpha, delta, and beta bands were investigated. Relative power of a band

was defined as:

\[ \text{Relative power of a band} = \frac{\text{Average power of the band}}{\text{Average power from 1--30 Hz}} \tag{4.1} \]
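A minimal sketch of Eq. 4.1 for one 3 s segment is given below. It assumes a plain DFT periodogram as the power estimate, and it assumes band edges of 8-13 Hz for alpha and 13-30 Hz for beta (only the 1-4 Hz delta band is stated explicitly in this chapter). The function names and the synthetic 10 Hz test signal are illustrative and are not part of the original analysis.

```python
import cmath
import math


def band_mean_power(segment, fs_hz, f_lo, f_hi):
    """Mean periodogram power over the DFT bins with f_lo <= f <= f_hi."""
    n = len(segment)
    k_lo = math.ceil(f_lo * n / fs_hz)
    k_hi = math.floor(f_hi * n / fs_hz)
    powers = []
    for k in range(k_lo, k_hi + 1):
        # Direct DFT coefficient for bin k (slow but dependency-free).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(segment))
        powers.append(abs(coeff) ** 2 / n)
    return sum(powers) / len(powers)


def relative_power(segment, fs_hz, band, reference=(1.0, 30.0)):
    """Average band power divided by average power from 1 to 30 Hz (Eq. 4.1)."""
    return (band_mean_power(segment, fs_hz, *band)
            / band_mean_power(segment, fs_hz, *reference))


FS = 128  # sampling rate in Hz
BANDS = {"delta": (1.0, 4.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}  # assumed edges

# Synthetic 3 s segment containing a pure 10 Hz (alpha-range) oscillation.
segment = [math.sin(2 * math.pi * 10 * i / FS) for i in range(3 * FS)]
rel = {name: relative_power(segment, FS, band) for name, band in BANDS.items()}
print(rel)
```

Because Eq. 4.1 is a ratio of averages rather than a ratio of sums, the value can exceed 1 when the power is concentrated in a narrow band, as it is for this synthetic signal.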

Figure 4.2: Spectrogram (3s window, 50% overlap) of the first few minutes of EEG recording at channel F4 of a single participant. The F4 electrode was referenced against the left mastoid (M1). Based on the scoring by sleep

technicians, the transition from wakefulness to non-REM 1 occurs at 4.5 minutes.


4.1.4 Sigmoid Wake Probability Model

A sigmoid function has a characteristic "S"-shaped curve. If the input to the sigmoid is extremely high or low, the output is close to either 0 or 1. Towards the middle, the curve smoothly increases or decreases. Using the three relative power values of alpha, delta, and beta, we developed a probability model using sigmoid functions. This model outputs the likelihood of wakefulness for a 3 s long EEG signal segment. The sigmoid functions used in the model are depicted in Fig. 4.3. The probability of wakefulness for each feature (PrF(W)) should be high if the relative power values of alpha or beta are high. Since alpha and beta band powers are high during wakefulness, the red curve is used to capture the changes in these two bands. Delta band power, on the other hand, increases as an individual goes from wakefulness to sleep. PrF(W) should therefore be low if the relative power values of delta are high, which is why the black curve is used to compute PrF(W) from the delta band. Thus, each feature is fed into a sigmoid function to obtain the probability of wakefulness for that feature (PrF(W)).

Figure 4.3: Sigmoid functions used in the proposed model. Probability of wakefulness for each feature (PrF(W)) should be high, if relative power values of alpha/beta are high. Therefore, the red curve is used to capture the

changes in these two bands. The opposite scenarios are seen for delta band, which is why the black curve is used to compute PrF(W) from delta band.

Subsequently, a weighted average of the PrF(W) values was taken to obtain the final probability of wakefulness Pr(W) using the following equation.

\[ \Pr(W) = w_1 P_\alpha + w_2 P_\beta + w_3 P_\delta \tag{4.2} \]


Here, w1, w2, and w3 are the weights and Pα, Pβ, and Pδ are the sigmoid outputs for alpha, beta, and delta relative

power features, respectively.
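The model in Eq. 4.2 can be sketched in a few lines of Python. The exact sigmoid parametrization of Fig. 4.3 is not given in the text, so this sketch assumes a logistic curve that equals about 0.05 at parameter a and about 0.95 at parameter b, mirrored for the delta band. The function names, parameter values, and weights below are hypothetical placeholders, not values from this study.

```python
import math


def rising_sigmoid(x, a, b):
    """Logistic curve that is 0.05 at x = a and 0.95 at x = b (assumes a < b)."""
    mid = (a + b) / 2.0
    slope = 2.0 * math.log(19.0) / (b - a)
    return 1.0 / (1.0 + math.exp(-slope * (x - mid)))


def wake_probability(rel_power, params, weights):
    """Weighted combination of per-feature wake probabilities (Eq. 4.2)."""
    p_alpha = rising_sigmoid(rel_power["alpha"], *params["alpha"])
    p_beta = rising_sigmoid(rel_power["beta"], *params["beta"])
    # Delta power rises toward sleep, so its curve is mirrored.
    p_delta = 1.0 - rising_sigmoid(rel_power["delta"], *params["delta"])
    w1, w2, w3 = weights
    return w1 * p_alpha + w2 * p_beta + w3 * p_delta


# Hypothetical (a, b) pairs and weights, chosen only to exercise the code.
params = {"alpha": (0.5, 1.5), "beta": (0.2, 0.8), "delta": (1.0, 3.0)}
weights = (0.4, 0.2, 0.4)

pr_awake = wake_probability({"alpha": 1.5, "beta": 0.8, "delta": 1.0}, params, weights)
pr_asleep = wake_probability({"alpha": 0.5, "beta": 0.2, "delta": 3.0}, params, weights)
print(pr_awake, pr_asleep)
```

With weights that sum to 1, Pr(W) stays between 0 and 1, so it can be thresholded directly.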

4.1.5 Parameter Selection for the Model

In order to determine the optimal choice of the sigmoid parameters (a and b, Fig. 4.3) and to validate the proposed

model, we selected arousal and deep sleep segments from the sleep-EEG data. To select the sigmoid parameters, we

divided the arousal and deep sleep segments into training and testing data. That is, the training data were used to

compute the model parameters, and the testing data were used to validate the choices of the sigmoid parameters and

weights. To estimate sigmoid parameter a, the feature values of the deep sleep distribution were sorted. Afterwards,

the maximum of the lower 80% of data was selected as the sigmoid parameter a. To estimate sigmoid parameter b,

the feature values of the arousal distribution were sorted. Afterwards, the minimum of the higher 80% of data was

selected as the sigmoid parameter b.
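The percentile-style rule described above can be written out literally as follows. This is a sketch under stated assumptions: the function names are hypothetical, the 80% fraction is truncated to a whole number of samples, and the example feature values are made up.

```python
def sigmoid_param_a(deep_sleep_values, fraction=0.8):
    """Maximum of the lower `fraction` of the sorted deep-sleep feature values."""
    ordered = sorted(deep_sleep_values)
    count = max(1, int(fraction * len(ordered)))
    return ordered[:count][-1]


def sigmoid_param_b(arousal_values, fraction=0.8):
    """Minimum of the upper `fraction` of the sorted arousal feature values."""
    ordered = sorted(arousal_values)
    count = max(1, int(fraction * len(ordered)))
    return ordered[len(ordered) - count:][0]


# Made-up feature values (ten per distribution) in arbitrary order.
deep_sleep = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6]
arousal = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6]
print(sigmoid_param_a(deep_sleep), sigmoid_param_b(arousal))
```

For ten sorted values 1 to 10, the lower 80% is 1 to 8, so a is 8, and the upper 80% is 3 to 10, so b is 3.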

To compute the weights of the features (w1, w2, w3), we employed the out-of-bag (OOB) permuted predictor delta error method [69, 70]. The advantage of the OOB permuted predictor delta error method over other methods such as linear regression is its diversity. Because of the use of bootstrapped replicas of the original dataset and out-of-bag examples, the weights calculated by the OOB permuted predictor delta error method are more robust [70]. First, we

grew an ensemble of decision trees. Every decision tree in the ensemble was grown on an independently drawn

bootstrap replica of equal size of the input data. Observations not included in this replica were "out-of-bag" for that

tree. Fig. 4.4 gives an example of how bootstrapped decision trees were formed. Each of the examples in Fig. 4.4 is

a multidimensional vector wherein the dimension is equal to the number of features used. By drawing samples with

replacement from S (original dataset) in Fig. 4.4, three bootstrap replicas S1, S2, and S3 were formed. A decision

tree was trained using each of these replicas. The out-of-bag instances are examples that occur in S, but not in a

bootstrap replica. For instance, out-of-bag examples for bootstrap replica S1 are d and e.

After constructing the bootstrapped decision trees, we computed OOB permuted predictor delta error for each of the

three features. For any feature, OOB permuted predictor delta error was the increase in prediction error if the values

of that variable were permuted across the out-of-bag observations. This measure was computed for every tree, then

averaged over the entire ensemble and divided by the standard deviation over the entire ensemble. If OOB permuted

predictor delta error was higher for a particular feature than others, it would indicate that the particular feature was

more important than others.
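The procedure described above can be illustrated with a small, dependency-free sketch. To keep it short, it swaps full decision trees for one-level decision stumps and uses a synthetic two-feature dataset in which only the first feature carries class information; both choices, along with all function names, are assumptions for illustration and do not reproduce the implementation used in this study.

```python
import random
import statistics


def predict(stump, row):
    feature, threshold, polarity = stump
    return polarity if row[feature] > threshold else 1 - polarity


def error_rate(stump, rows, labels):
    wrong = sum(predict(stump, r) != y for r, y in zip(rows, labels))
    return wrong / len(rows)


def fit_stump(rows, labels):
    """Exhaustively pick the single-feature threshold rule with the fewest errors."""
    best, best_err = None, None
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows}):
            for polarity in (0, 1):
                stump = (feature, threshold, polarity)
                err = error_rate(stump, rows, labels)
                if best_err is None or err < best_err:
                    best, best_err = stump, err
    return best


def oob_permuted_delta_error(rows, labels, n_trees=30, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    deltas = [[] for _ in range(n_features)]
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap replica
        in_bag = set(bag)
        oob = [i for i in range(n) if i not in in_bag]  # out-of-bag examples
        if not oob:
            continue
        stump = fit_stump([rows[i] for i in bag], [labels[i] for i in bag])
        oob_rows = [rows[i] for i in oob]
        oob_labels = [labels[i] for i in oob]
        base = error_rate(stump, oob_rows, oob_labels)
        for f in range(n_features):
            column = [r[f] for r in oob_rows]
            rng.shuffle(column)  # permute one feature across the OOB examples
            permuted = [r[:f] + [v] + r[f + 1:] for r, v in zip(oob_rows, column)]
            deltas[f].append(error_rate(stump, permuted, oob_labels) - base)
    scores = []
    for d in deltas:
        spread = statistics.pstdev(d)
        # Mean increase in error divided by its standard deviation over the ensemble.
        scores.append(statistics.mean(d) / spread if spread > 0 else 0.0)
    return scores


# Synthetic data: feature 0 determines the label, feature 1 is unrelated.
rows = [[i / 40, (i * 7 % 40) / 40] for i in range(40)]
labels = [int(r[0] >= 0.5) for r in rows]
scores = oob_permuted_delta_error(rows, labels)
print(scores)
```

Permuting the uninformative feature leaves every prediction unchanged, so its score is zero, whereas permuting the informative feature raises the out-of-bag error. How such scores are converted into the weights w1, w2, and w3 is not shown here.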


Figure 4.4: An example of bootstrapping and out-of-bag instances. The original data-set S has 5 examples- a, b, c, d, and e. Each of the examples is a multidimensional vector wherein the dimension is equal to the number of features

used. By drawing samples with replacement from S, three bootstrap replicas S1, S2, and S3 are formed. Each one of the bootstrap replicas is used to train a decision tree. The out-of-bag instances are examples that occur in S, but not

in a bootstrap replica. Out-of-bag examples for bootstrap replica S1 are, for example, d and e.

The last stage was to identify wakefulness, sleep, and drowsy clusters by thresholding Pr(W). To identify three

clusters, we essentially had to find the upper and lower bounds of the drowsy cluster. We selected the lower and

upper bounds of the drowsy cluster by varying the upper and lower cut-offs and computing the clustering evaluation

metrics for every choice of cutoff values. The final bounds were those for which the three clusters were maximally

dissimilar from one another.

Even though the arousal and deep sleep segments used for validation would manifest the extent of the three clusters,

we used cluster quality evaluation metrics, including Davies-Bouldin [71] and silhouette [72] indices to determine

the upper and lower cut-offs of the drowsy cluster. Davies-Bouldin index (DB) is defined as [71]:

\[ DB = \frac{1}{n} \sum_{i=1}^{n} \max_{j \neq i} \left( \frac{\sigma_i + \sigma_j}{d(c_i, c_j)} \right), \tag{4.3} \]


Here, n is the number of clusters, σi is the average distance of all patterns in cluster i to their cluster center ci , σj is

the average distance of all patterns in cluster j to their cluster center cj , and d(ci , cj ) is the distance of cluster centers

ci and cj. Small values of DB correspond to clusters that are compact, and whose centers are far away from each

other.
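To make the role of the Davies-Bouldin index in choosing cut-offs concrete, the sketch below evaluates it for two candidate pairs of cut-offs. It assumes, for simplicity, that the clustered patterns are one-dimensional Pr(W) values and that distances are absolute differences; the function names, the Pr(W) values, and the candidate cut-offs are hypothetical.

```python
def split_by_cutoffs(pr_w, lower, upper):
    """Threshold Pr(W) values into sleep, drowsy, and awake clusters."""
    sleep = [p for p in pr_w if p < lower]
    drowsy = [p for p in pr_w if lower <= p <= upper]
    awake = [p for p in pr_w if p > upper]
    return [sleep, drowsy, awake]


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of non-empty 1-D clusters."""
    if any(len(c) == 0 for c in clusters):
        raise ValueError("every cluster must contain at least one point")
    centers = [sum(c) / len(c) for c in clusters]
    # Average distance of the points in each cluster to the cluster center.
    scatter = [sum(abs(x - m) for x in c) / len(c)
               for c, m in zip(clusters, centers)]
    n = len(clusters)
    total = 0.0
    for i in range(n):
        total += max((scatter[i] + scatter[j]) / abs(centers[i] - centers[j])
                     for j in range(n) if j != i)
    return total / n


# Made-up Pr(W) values with three visible groups.
pr_w = [0.05, 0.1, 0.45, 0.5, 0.55, 0.9, 0.95]
db_good = davies_bouldin(split_by_cutoffs(pr_w, 0.3, 0.7))
db_bad = davies_bouldin(split_by_cutoffs(pr_w, 0.08, 0.52))
print(db_good, db_bad)
```

The cut-offs that follow the natural gaps in the data give the smaller index, so a search over candidate cut-offs would keep that pair.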

For each data point k, we computed the silhouette value s(k) as [72]:

\[ s(k) = \begin{cases} 1 - \dfrac{a(k)}{b(k)}, & a(k) < b(k) \\[2ex] \dfrac{b(k)}{a(k)} - 1, & a(k) \geq b(k) \end{cases} \tag{4.4} \]

Here, a(k) = the average dissimilarity/distance of k with all other data within the same cluster and b(k)= the lowest

average dissimilarity of k to any other cluster, of which k is not a member. Silhouette value s(k) close to 1 suggests

that the data point belongs to the proper cluster. Silhouette value close to -1, on the other hand, suggests that the

particular data point was assigned to the wrong cluster.
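
The silhouette computation of Eq. 4.4 can be sketched in a few lines of Python. This is an illustrative example only: the function name silhouette_values, the Euclidean distance, and the value 0 assigned to single-point clusters are assumptions, not details taken from the thesis.

```python
import math


def silhouette_values(clusters):
    """Return one silhouette value per point, listed cluster by cluster."""
    values = []
    for i, own in enumerate(clusters):
        for k, point in enumerate(own):
            same = [q for m, q in enumerate(own) if m != k]
            if not same:
                # Assumed convention: a single-point cluster gets s(k) = 0.
                values.append(0.0)
                continue
            # a(k): mean distance to the other points of the same cluster.
            a = sum(math.dist(point, q) for q in same) / len(same)
            # b(k): lowest mean distance to the points of any other cluster.
            b = min(
                sum(math.dist(point, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            if a < b:
                values.append(1 - a / b)
            elif a > b:
                values.append(b / a - 1)
            else:
                values.append(0.0)  # a(k) == b(k) gives 0 in Eq. 4.4
    return values
```

Averaging the returned values over all points gives the mean silhouette value used to compare candidate cutoffs.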

4.1.6 Validation

After identifying three clusters from the data, we performed cluster validation. For validation of the proposed

method, we randomly selected 50% of participants as training dataset and the remainder as testing dataset. The

arousal and the deep sleep segments from the training dataset were used to determine the parameters of the method.

Furthermore, the arousal and the deep sleep segments from the testing dataset were used to determine the

effectiveness of parameter selection and the proposed method. The proposed model was validated by determining if

it can successfully separate arousal and deep sleep segments. In other words, if the sigmoid wake probability model

performs well, it can be expected that it will give high likelihood of wakefulness for arousal segments and low

probability of wakefulness for deep sleep segments. Using the awake and non-REM 1 periods from the testing

dataset, we visually inspected the distributions of the alpha, delta, and beta bands in the awake, drowsy, and sleep clusters as a

qualitative assessment of the clustering performance. To quantitatively assess the clustering performance, we

performed one-way repeated measures analysis of variance to test if the three features were significantly different

among the three clusters. The clusters were identified based on the feature distributions. We also determined the

clustering quality evaluation metric values to validate the quality of the estimated clusters from the awake and non-

REM 1 data points. We also determined the average detection accuracy by considering the data points with positive

silhouette values as correctly classified and data points with negative silhouette values as misclassified instances.

4.2 Results

In this section, the results of sigmoid wake probability model on sleep study data are presented. The demographic

information of the participants is presented in Table 4.1.

Table 4.1: Demographic information of the participants. Data are presented as mean ± standard deviation.

Characteristic | n = 53
Female, n (%) | 26 (49.06%)
Body mass index (kg/m²) | 29.09 ± 6.16
Age (years) | 49.58 ± 16.18

Table 4.2 summarizes the number of 3-s segments used in this study for model development and validation. Results

obtained from the F4-M1 electrode (F4 referenced against left mastoid M1) were used for data analyses and are

presented here, but note that similar results were achieved for all other electrodes (Appendix 4A).

Table 4.2: Average and standard deviation of the number of 3-s segments from F4-M1 used in this study for model development and validation. Data are presented as mean ± standard deviation.

Stage | Number of Segments per Participant
Arousal | 5 ± 1
Deep sleep | 23 ± 14
Non-REM 1 | 396 ± 112
Awake | 1447 ± 337

In order to select the sigmoid parameters (a and b in Fig. 4.3) and to validate the efficacy of the sigmoid wake

probability model, 304 arousal segments and 1267 deep sleep segments were selected from all the participants.

Of the selected arousal and deep sleep segments, data from 26 participants were used for training, and the remaining

data were used for testing. Sigmoid parameters and weights computed from the training data were used to determine

Pr(W) for the testing data.

Figure 4.5: Panel a: Sigmoid parameters to be estimated for the alpha band. Panel b: Sigmoid parameter b is selected as the minimum feature value of the arousal distribution after removing outliers. Panel c: Sigmoid parameter a is selected as the maximum feature value of the deep sleep distribution after removing outliers.

Fig. 4.5 shows the distribution of the relative power of the alpha band for the arousal and deep sleep segments of the

training data. To compute sigmoid parameter b (Fig. 4.5a) for the alpha band, the feature values of the arousal

distribution were sorted. Next, the minimum feature value of the higher 80% of data was selected as the sigmoid

parameter b for the alpha band (b = 0.237, Fig. 4.5b). Similarly, to compute sigmoid parameter a (Fig. 4.5a) for the

alpha band, the relative power values of the deep sleep distribution were sorted and the maximum value of the lower

80% of data was selected as the sigmoid parameter a for the alpha band (a= 0.005, Fig. 4.5c). Table 4.3 shows the

sigmoid parameter values obtained for the three features, and Fig. 4.6-a shows the resultant sigmoid functions

obtained by using the parameters in Table 4.3.
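
The parameter selection described above for the alpha band can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name sigmoid_bounds, the keep argument, and the rounding used to locate the 80% boundary are assumptions, since the text does not specify how the boundary index is rounded.

```python
def sigmoid_bounds(arousal_values, deep_sleep_values, keep=0.8):
    """Return (a, b) from deep-sleep and arousal feature values."""
    arousal = sorted(arousal_values)
    deep = sorted(deep_sleep_values)

    # b: smallest value among the upper `keep` fraction of arousal values,
    # i.e. the lowest 20% are treated as outliers and dropped.
    n_drop_low = len(arousal) - round(keep * len(arousal))
    b = arousal[n_drop_low]

    # a: largest value among the lower `keep` fraction of deep-sleep values,
    # i.e. the highest 20% are treated as outliers and dropped.
    n_keep = round(keep * len(deep))
    a = deep[n_keep - 1]
    return a, b
```

With ten values per distribution, for instance, the two lowest arousal values and the two highest deep-sleep values are discarded before taking the minimum and maximum, respectively.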

Table 4.3: Sigmoid parameters computed from the training data for F4-M1

Frequency band | a | b
Alpha | 0.005 | 0.237
Beta | 0.018 | 0.167
Delta | 0.162 | 0.972

The features’ weights were calculated using OOB permuted predictor delta error from the training data. Fig. 4.6 (b)

shows the OOB permuted predictor delta error for each of the three features. It is clear from Fig. 4.6 (b) that relative

power of alpha should have the highest weight. The OOB permuted predictor delta error values obtained from the

training data were used to analyze the test data in this work.

Figure 4.6: (a) The resultant sigmoid functions for the three features for F4-M1. (b) Out-of-bag (OOB) permuted predictor delta error for the three features computed from the training data.

The sigmoid parameters and weights computed from the training dataset were used to compute the probability of

wakefulness (Pr (W)) of the test dataset (Eq. 4.2). The choice of model parameters (w1, w2, w3) was validated from

the effectiveness of the model to discriminate arousal and deep sleep segments of the train and the test data as shown

in Fig. 4.7.

Furthermore, Fig. 4.7 (b) shows that the proposed model distinguishes between arousal and deep sleep segments of

the testing data. This indicates the efficacy of the sigmoid awake probability model. Fig. 4.7 (a) also manifests the

extent of the drowsy cluster. In other words, it is clear from Fig. 4.7 (a) that segments with Pr(W)> 28 and

Pr(W)<65 may belong to the drowsy cluster. This is further validated by the DB (Eq 4.3) and silhouette (Eq 4.4)

indices maps in Fig. 4.8.

Figure 4.7: Pr(W) distribution of arousal and deep sleep segments of (a) training data and (b) testing data obtained by weights and sigmoid parameters computed from the training data. The proposed model yields lower Pr(W) for

deep sleep segments and higher Pr(W) for arousal segments in the testing data, which indicates its efficacy.

Figure 4.8: (a) Silhouette and (b) Davies-Bouldin index values computed from the 3-s segments of awake and non-REM 1 data obtained from the training participants. For both maps, the upper and lower bounds of the drowsy

cluster have been varied and the two metrics were computed for the awake, sleep, and drowsy clusters.

In Fig. 4.8, warmer colors indicate more dissimilar clusters. Fig. 4.8 (a) shows that higher silhouette values are

achieved if the lower cutoff of the drowsy cluster is between 21% and 30% and the upper cutoff is between 54% and 58%. It is

clear from Fig. 4.8 (b) that a lower cutoff of 21% to 27% and an upper cutoff of 54% to 55% give the lowest DB index. For

smaller values of the upper (<54%) and lower (<21%) cutoffs in Fig. 4.8, DB is very high and silhouette values are very

low. Therefore, based on Figs. 4.7 and 4.8, the lower cutoff of the drowsy cluster was set to Pr(W) = 28% and the upper

cutoff was set to Pr(W) = 55%.
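
Once the two cutoffs are fixed, assigning a segment to a cluster reduces to a pair of comparisons. The sketch below illustrates this with the cutoffs of 28% and 55%; the function name label_segment and the treatment of values lying exactly on a cutoff (assigned to the drowsy cluster) are assumptions made for the example.

```python
def label_segment(pr_w, lower=28.0, upper=55.0):
    """Map a wake probability Pr(W), in percent, to a cluster label."""
    if pr_w < lower:
        return "sleep"
    if pr_w > upper:
        return "awake"
    # Assumed convention: values on either cutoff count as drowsy.
    return "drowsy"
```

For example, a 3-s segment with Pr(W) = 40% would be labeled drowsy, whereas segments at 10% and 80% would be labeled sleep and awake, respectively.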

Figure 4.9: Scatter diagram of the awake, drowsy, and sleep clusters for the testing data. Here, the lower bound of the drowsy cluster is set to Pr(W) = 28% and the upper bound is set to Pr(W) = 55%.

Fig. 4.9 shows the clusters that are obtained if we use the aforementioned upper and lower bounds for the testing

data. The quality of the clusters and the choice of model parameters were further validated from the feature

distributions, when the model was applied on all non-REM 1 and wakefulness segments of the testing participants.

As expected, for sleep clusters, relative power values of alpha and beta bands were smaller (Fig. 4.10 (a) and Fig.

4.10 (b)), and for the awake clusters, the relative power values of alpha and beta were higher. The opposite scenario

is seen in Fig. 4.10 (c). Furthermore, mean silhouette values greater than 0.6 indicate clusters that are compact [72].

It is evident from Fig. 4.9 that the three clusters obtained by thresholding Pr(W) are compact, since the mean

silhouette value is close to 0.74. Lastly, the mean and maximum detection accuracies were 93.21% and 94.73%,

respectively.

Figure 4.10: Distributions of relative power of (a) alpha, (b) beta, and (c) delta for three clusters of awake, drowsy and sleep. Here, all episodes of non-REM 1 and wakefulness were considered. The sleep segments (Pr(W)<28) have low alpha and beta power and high delta power, while awake segments (Pr(W)>55) have high alpha and beta power

and low delta power.

Table 4.4: One-way repeated measures analysis of variance suggests that the feature values (mean ± standard deviation) significantly change in the three clusters.

Frequency Band | Wakefulness | Drowsy | Sleep | p-Value
Delta | 0.149 ± 0.042 | 0.651 ± 0.024 | 0.916 ± 0.030 | <.001
Alpha | 0.193 ± 0.034 | 0.079 ± 0.013 | 0.017 ± 0.005 | <.001
Beta | 0.522 ± 0.086 | 0.095 ± 0.034 | 0.017 ± 0.005 | <.001

Table 4.4 shows the results of repeated measures analysis of variance of the three features and Fig. 4.11 presents the

results of the post hoc analysis. We employed Tukey’s multiple comparison test. From the repeated measures

analysis of variance results in Table 4.4, it is clear that relative power of alpha, delta and beta bands are significantly

different in the three clusters. Based on the post-hoc analysis, it can be seen that compared to the sleep or drowsy

clusters, the relative delta power significantly decreases and relative alpha and beta powers significantly increase for

awake cluster. These results further validate the clusters in the proposed framework.

Figure 4.11: Post hoc multiple comparison test suggests that (a) alpha, (b) beta, and (c) delta power features are significantly different between the clusters. Error bars indicate standard deviation. * indicates

p<.0001.

4.3 Discussions

In this study, we have developed a high-resolution drowsy driving detection algorithm by extracting features from

the EEG frequency band changes during sleep. The sigmoid awake probability model developed herein provided a

likelihood of wakefulness (Pr(W)) for 3-s signal segments. By choosing appropriate thresholds for Pr(W), we have

identified three clusters. The feature distributions of the clusters suggest that the clusters indicate wakefulness,

drowsiness, and sleep. The proposed scheme has been validated using arousal and deep sleep segments, cluster

quality evaluation metrics, and graphical and statistical analyses. The results presented in the foregoing section address

objective 1 in Section 4 and suggest that the spectral properties of EEG indeed manifest the likelihood of wakefulness

for short episodes and lead to the development of a high-resolution drowsiness detection algorithm.

The choice of sigmoid parameter a has necessitated the use of deep sleep segments. This is because for any feature

value below a in Fig. 4.3, the PrF(W) is close to zero or one. This is also why arousal segments were used to

determine the optimal choice of the other sigmoid parameter, b. Moreover, the distributions in Fig. 4.10 as well as

the results of repeated measures analysis of variance further validate that the three resultant clusters can be identified

as wakefulness, drowsy, and sleep. The slight overlap in some of the distributions in Fig. 4.10 could be due to the

variations of EEG frequency bands' power levels across participants. However, inter-participant variation of power

level of a particular frequency band should only have a minor effect on the overall results. For example, if one

participant has high delta power for an awake 3s segment, the relative powers of alpha and beta will make the Pr(W)

for that segment high such that the segment falls into the awake cluster.

The proposed sigmoid awake probability model has several advantages. First, the proposed scheme, once trained, is

independent of sleep technician’s labels. Prior studies have shown that inter-rater disagreement could be up to 20%

[14]. Also, a technician’s scoring accuracy is subject to bias and error due to fatigue. Since the proposed method

utilizes relative power of three EEG bands to detect drowsiness, it is free from the aforementioned limitation.

Second, the use of relative power makes it more robust to noise in EEG signals. Third, even though the foregoing

sections present results from one of the frontal electrodes only, the proposed model yields similar results for the

other five electrodes as well. This suggests that the proposed method can be used to detect drowsiness using

single-channel EEG signals. Fourth, the proposed model can be utilized in quantifying sleep disorders such as insomnia,

which causes the time course of the awake-to-sleep transition to be pathologically protracted [73]. The

proposed scheme can also be utilized to quantify conditions such as narcolepsy or sleep deprivation, wherein

the wake to sleep transition occurs too rapidly [74]. Therefore, using the proposed framework, one can characterize

the sleep onset process phenotypes of different clinical populations and the natural heterogeneity among healthy

participants. Fifth, since the sigmoid wake probability model separates arousal and deep sleep segments quite well

(Fig. 4.7), it can be used for automated arousal identification. Sixth, the sigmoid wake probability model can also

help sleep technicians in identifying awake episodes and pinpointing sleep onset. Lastly, conventional clustering

algorithms such as k-means or hierarchical clustering require a large amount of data to discover groups or clusters of

data [75]. On the contrary, the sigmoid wake probability model, once trained, is capable of detecting whether an

arbitrary 3s episode is awake, drowsy or asleep.

A limitation of the proposed model is that it cannot detect drowsy episodes that are shorter than 3s. While the

resolution is higher than most of the existing EEG-based works in the literature, lapses or behavioral microsleeps

can still be 1s or a fraction of a second [52]. The inclusion of a feature based on the delta band, which ranges from 1 to 4

Hz, necessitates the use of signal segments of at least 2s. In the future, features based on only alpha and beta

can be used so that we can use segments as small as 0.25s (necessary to see at least two cycles of alpha) to detect

drowsiness. Time-domain features such as entropy and/or complexity based features can also be used to overcome

this limitation. Another limitation of the proposed model is that it has not been validated against behavioral scales of

drowsiness. Also, the sleep study did not require participants to stay awake and interact with a task. The

subsequent study will focus on addressing this limitation.
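
The segment-length argument in the paragraph above follows from requiring a fixed number of cycles of the slowest frequency of interest. The short sketch below reproduces the two figures quoted in the text; the function name min_segment_seconds is hypothetical, and the 8 Hz lower edge of the alpha band is an assumption inferred from the stated 0.25 s.

```python
def min_segment_seconds(lowest_freq_hz, cycles=2):
    """Shortest window containing `cycles` full periods of `lowest_freq_hz`."""
    return cycles / lowest_freq_hz


# Delta band lower edge (1 Hz): two cycles need 2 s.
delta_window = min_segment_seconds(1.0)
# Alpha band lower edge (assumed 8 Hz): two cycles need 0.25 s.
alpha_window = min_segment_seconds(8.0)
```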

Chapter 5: Study 2- Reaction Time Study

In Study 1, we successfully developed an algorithm that showed promising results in detecting drowsiness.

However, several open questions remain. For instance, it needs to be tested whether this algorithm will show similar

performance in detecting drowsiness in a non-sleep related context. In a sleep study, the subjects are lying in supine

position trying to fall asleep. This minimizes movement artifacts, but is not comparable to a driving scenario where

the subject is in sitting position trying to stay awake and interacts with the environment. Furthermore, sleep studies

lack the cognitive components such as attention, memory, or perception that are involved during driving. Also, in a sleep

study, participants usually are not healthy and have various sleep disorders. Finally, the categorization of sleep

stages based on the ratings provided by the sleep technician is rather coarse. The technicians rated only 30s long

segments based solely on EEG data. Other helpful cues such as facial components that could indicate when exactly

the subject became drowsy are missing. To overcome these issues, we conducted a second study that requires the

participant to perform a task that is engaging yet monotonous to induce drowsiness.

5.1 Methods

5.1.1 Study Design

We took several characteristics of the study into consideration during study design. Firstly, the participant would be

trying to stay alert in a sitting position similar to the scenarios of driving. Secondly, a monotonous task would keep

the subject engaged and induce drowsiness. Thirdly, the stimulus must be rich in visual information to keep the

participant engaged. This must be done so that we can test the developed algorithms in a more active, engaging, and

demanding setting, that is more similar to the processes that are involved during driving. Lastly, one must be able to

pinpoint episodes of drowsiness to develop gold standard for detection. Thus, we designed the study by taking the

aforementioned features into consideration.

5.1.2 Stimulus and Data Collection

Fifteen healthy participants (7 females), who were between 18 and 49 years old (29.33 ± 7.64 years), took part in this

study. One participant’s data were corrupted and were excluded from further analysis. Participants with a history of

sleep disorders, stroke, active vestibular disorders, disabling musculoskeletal disorder, acute psychiatric disorder, a

diagnosis of dementia or mild cognitive impairment, and who had regular intake of sedating medication (e.g.,

opioids), and/or engaged in shiftwork were excluded from the study. The study was conducted in the afternoon after

lunch since at this time of the day the participants were the most likely to be sleepy [76]. The participants were

instructed to have lunch before the experiment. They were also instructed to refrain from having attention-altering

food and drinks such as tea, coffee, or alcohol that day prior to the experimental session. Fig. 5.1 shows an

illustration of the stimulus used in Study 2.

Figure 5.1: Illustration of the stimulus used in the present study. The stripes move horizontally with the red fixation cross at the center. The participant presses a button as soon as the cross momentarily turns blue.

For the experiment, participants were seated in front of a projected screen and asked to perform a monotonous

reaction time task, illustrated in Fig. 5.1. The stimulus for the task consisted of a pattern of horizontally moving

black-and-white stripes with a fixation cross at the center. The task was chosen to create a potential sensation of

self-motion (i.e., vection) [77, 78]. This is due to the fact that the drowsiness detection algorithms developed in this

thesis must be tested in a study in a driving simulator, since it is unsafe to make a driver drowsy in a real driving

study. In a driving simulator, one can perceive self-motion in the absence of true, physical motion. The fixation

cross was programmed to be red most of the time and would occasionally (~10% of the time) turn blue for 500-750

ms. The alternating black-and-white stripes shown in Fig. 5.1 moved either to the right or to the left in each trial. The

spatial frequency of the alternating black-and-white stripes was 0.13 cycles/degree and the speed was 1 cycle/s. The

duration of each trial was 45 seconds. A large projection screen (300 cm ×196 cm) with an Optoma HD 850

projector was used to display the stimulus. The refresh rate of the projector was 60 Hz, and the display resolution was

1920 × 1080 pixels. The field-of-view of the projection screen was 78° × 52°. The participants were seated 215 cm

away from the screen in a height-adjustable chair with eye-height leveled to the screen’s center. The participants

were instructed to press a button as soon as the red cross changed its color to blue; the response time served

as an indirect measure of the participant’s level of attention [79]. There were 1 training block (4 trials) and 10 testing

blocks. Each testing block lasted for about 12 minutes (16 trials). The lights of the room were turned off to create a

drowsiness-inducing environment. Fig. 5.2 shows an illustration of the participant performing the task.

Figure 5.2: Illustration of a participant performing the task. For better visibility, the participant is shown with the lights on even though the experiments were performed in a dark room.

As the participants performed the task, facial video, response time (i.e., button press information), physiological

signals such as EEG, electrocardiogram (ECG), and respiratory inductance plethysmography (RIP) signals were

recorded. The facial videos of the participants were recorded using an infrared camera to identify the drowsy

episodes by visual inspection of the videos. RIP and ECG signals were recorded to develop cardiorespiratory signal

based drowsiness detection algorithms. EEG, ECG, and RIP signals were recorded using a Grael V2 amplifier

(Compumedics, Melbourne, Australia). The RIP signals were obtained from two belts, one around the chest and the

other around the abdomen. For the EEG data acquisition, all of the nineteen electrodes (Quick-Cap, Melbourne,

Australia) of the international 10–20 electrode placement system were used. Fig. 5.3 shows the electrode placement

map of the international 10-20 electrode placement scheme. The reference electrode was placed on the nose. In

between the blocks, the participants were asked about their degree and duration of self-motion perception as well as

their level of motion sickness and sleepiness.

Figure 5.3: Electrode locations of the international 10-20 electrode placement scheme [80].

5.1.3 Drowsiness Detection from Facial Video

Facial videos of each participant during the experiment were used to obtain their level of drowsiness. The facial

videos were rated on 1s basis without the knowledge of the button-press data using a newly developed drowsiness

rating scheme. The scale is presented in Table 5.1.

In the proposed drowsiness rating scheme, the rater at first rates each of the 1s video segments as drowsy (score 1)

or non-drowsy (score 0). If an episode is rated as drowsy, it is further rated on the scale of 1 to 10 given in Table 5.1.

This rating denotes the strength of drowsiness based on behavioral cues. As is evident from Table 5.1, the

proposed scale employs various behavioral cues of drowsiness such as eye closure, head nodding, facial contortion,

changes in muscle tone, and rapid eye blinks.

The advantage of the proposed drowsiness scale over existing rating schemes such as the Wierwille and Ellsworth

[81] method is that here the levels are more well-defined. Furthermore, unlike existing rating schemes, this scheme

captures both the confidence of observing a drowsy episode and the degree or severity of drowsiness. Moreover,

the Wierwille and Ellsworth scale does not offer a guideline on time resolution, and it is often used to score videos on a 1

min basis [49]. However, drowsy episodes or microsleeps can be as short as 1s [52]. Thus, the proposed scoring

guidelines will help to capture both longer and shorter drowsy episodes. Lastly, the Wierwille and Ellsworth scale offers

generic guidelines in rating whereas the proposed scoring scheme pinpoints behavioral cues corresponding to each

score of drowsiness.

Table 5.1: Drowsiness scale proposed in this work

Score Summary

0 Alert, eye fully open and moving, often the subject is moving

1 Eyelids are slightly (about 30% compared to alert) closed

2 Eyelids are more (about 60% compared to alert) closed than previous stage, very little eye movement

3 Eyelids are more (about 90% compared to alert) closed than previous stage, glassy-eyed appearance, subject staring at a fixed position

4 Eyes are barely open, often facial contortions are visible

5 Slow eye blinks; different from the usual eye blinks of an alert person

6 Increased eye blinks

7 Rapid eye blinks; after the episode the subject usually fully closes his eyes

8 Eye fully closed, head movement/nodding; episode shorter than rating 9

9 Eye fully closed, head nodding, change of muscle tone; episode often terminated by head jerk

10 Eye fully closed, head nodding, change of muscle tone, the subject does not wake up unlike 9

In order to assess the inter-rater variability of the videos based on the proposed scale, 70 minutes of data from 15

participants consisting of 126000 video frames were rated by a second rater. For each participant, the videos were

selected such that they contained the maximum number of drowsy episodes. The videos were rated by the second rater

independent of the button-press information.

The inter-rater agreement was obtained by determining the percentage agreement of the ratings. Furthermore,

normality tests (Anderson-Darling and Kolmogorov-Smirnov) followed by Pearson or Spearman

correlation analysis were performed to determine the agreement of the two raters. We also determined the mean and

standard deviation of the difference between the two raters' ratings. Moreover, confusion matrix analysis was performed to assess

inter-rater reliability.
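
Two of the agreement measures named above, percentage agreement and the confusion matrix, can be illustrated with a short Python sketch. The function names and the example ratings are hypothetical; the sketch assumes both raters scored the same 1s segments in the same order.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Percentage of segments on which the two raters gave the same score."""
    matches = sum(x == y for x, y in zip(rater1, rater2))
    return 100.0 * matches / len(rater1)


def confusion_counts(rater1, rater2):
    """Count each (rater 1 score, rater 2 score) pair across all segments."""
    return Counter(zip(rater1, rater2))


# Hypothetical binary ratings (1 = drowsy, 0 = non-drowsy) for five segments.
ratings_1 = [0, 0, 1, 1, 0]
ratings_2 = [0, 1, 1, 1, 0]
agreement = percent_agreement(ratings_1, ratings_2)
counts = confusion_counts(ratings_1, ratings_2)
```

The off-diagonal entries of the confusion counts, such as the pair (0, 1), show where the raters disagreed and in which direction.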

5.1.4 Performance Evaluation Metrics

The performance metrics used in this work are accuracy, sensitivity, and specificity. They are expressed using the

following equations.

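\[ \text{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{5.1} \]

\[ \text{Specificity} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}} \tag{5.2} \]

\[ \text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{True Positive} + \text{True Negative} + \text{False Positive} + \text{False Negative}} \tag{5.3} \]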

Sensitivity denotes the proportion of positive-class (drowsy) episodes that are correctly classified. Specificity, on the other hand,

expresses the proportion of negative-class (non-drowsy) episodes that are correctly classified. Accuracy denotes the proportion of

all episodes, positive and negative, that are correctly classified. In order to compute the performance metrics, 60% of the

participants were randomly selected as training, and the remainder was selected as testing. Afterwards, we computed

the performance metrics using the training and testing data. This process was repeated 100 times, and the resulting

mean and standard deviation of accuracy, sensitivity, and specificity values are reported here. Splitting by participant ensures that each

participant's data were used for either training or testing in a given repetition, but never for both. The aforementioned analysis

was done for each of the 19 electrodes.
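
The evaluation procedure described above can be sketched as follows. This is an illustrative example under stated assumptions: the function names, the label coding (1 = drowsy, 0 = non-drowsy), and the rounding of the 60% training fraction are not taken from the thesis.

```python
import random


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy (Eqs. 5.1 to 5.3)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


def split_participants(participant_ids, train_fraction=0.6, rng=None):
    """Randomly split participants (not segments) into training and testing."""
    rng = rng or random.Random()
    ids = list(participant_ids)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]
```

Repeating split_participants 100 times and collecting the three metrics on each test set would yield the mean and standard deviation reported in this work; because the split is made over participant identifiers, no participant contributes to both sets within a repetition.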

5.1.5 Development of Drowsiness Detection Algorithm

In the following, the three methods developed on the Study 2 data are presented: the modified sigmoid wake probability model, the step

function model, and the time-domain feature based drowsiness detection algorithm.

5.1.5.1 Modified Sigmoid Wake Probability Model

Figure 5.4: A schematic outline of the modified sigmoid wake probability model proposed in this thesis. A. At first, ocular noise removal was performed using independent component analysis (ICA). Then the signal was band-pass and notch-filtered. B. Relative power of theta and gamma were extracted from 1s EEG segments. C. Each of the

features was fed into a step function. D. The step function outputs were weighted and averaged using weights computed from the training data using random forest. E. The probability of wakefulness (Pr (W)) obtained from the model was used to divide the data in awake, drowsy, and sleep clusters using cluster quality evaluation metrics. The

output is compared against a threshold to classify a 1s episode as alert or drowsy.

The sigmoid wake probability model developed on the Study 1 data was tested on the data collected in Study 2.

However, in Study 2, gamma and theta band-based features were used instead of delta, alpha, and beta power-based

features. Since changes in these two bands were the most prominent in the Study 2 EEG data, they were used in the

model (details in Appendix 5A). A schematic outline of the modified sigmoid wake probability model is shown in

Fig. 5.4.

Data Preprocessing

EEG data collected in Study 2 contained eye movement and eye blink noise, as is evident from Fig. 5.5. The

smoothly decreasing EEG spectrum and a strong far-frontal projection of independent component 2 (circled in red)

in Fig. 5.5 indicate that the EEG data are corrupted by ocular artifacts. Therefore, we performed independent

component analysis (ICA) to remove the ocular noise using EEGLAB software [82]. Next, the EEG signals were

band-pass filtered from 1 to 100 Hz. Subsequently, we applied a notch-filter at 60 Hz for power-line noise removal.

Afterwards, we divided the EEG signals into 1s segments.

Figure 5.5: Component maps of the independent component analysis (ICA) of the EEG recordings of a subject. The smoothly decreasing EEG spectrum and a strong far-frontal projection of independent component 2 (circled in red)

are typical of ocular artifact.

Feature Extraction

Instead of the original model’s delta, alpha, and beta features, we computed relative power values of the gamma and

theta bands, since changes in these two features appeared to be most prominent between the awake and drowsy states. The

formula used to compute relative power values in the sleep study data in Eqn. 4.1 was also used to compute the

feature values in this method.

Modified Sigmoid Wake Probability Model

Figure 5.6: Sigmoid functions used in the modified sigmoid wake probability model. The per-feature probability of wakefulness should be high if the relative power of gamma is high. Therefore, the red curve is used to capture the changes in the gamma band. The opposite holds for the theta band, which is why the black curve is used to capture the changes in the theta band.

Each of the two extracted features was fed into a sigmoid function, as shown in Fig. 5.6. Prior studies have demonstrated that as an individual becomes drowsy, gamma band power decreases and theta band power increases [16]. Therefore, the red sigmoid function in Fig. 5.6 was used for the gamma band, and the black sigmoid function was used for the theta band. The sigmoid outputs were weighted and averaged using weights computed with the out-of-bag permuted predictor delta error method, as was done for the sleep study data (Section 4.1.5). Thus, a probability of wakefulness value (Pr(W)) was obtained for each of the 1s segments using the following equation.

$$\Pr(W) = w_1 P_1 + w_2 P_2 \qquad (5.4)$$
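As an illustration of Eqn. 5.4, the following minimal Python sketch passes two relative power features through a rising and a falling sigmoid and combines the outputs with a weighted average. The sigmoid midpoints and slopes, the equal weights, and all function names are hypothetical placeholders chosen for the example; they are not values from this study.

```python
import math


def sigmoid(x, midpoint, slope):
    """Logistic curve; slope > 0 rises with x, slope < 0 falls with x."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def wake_probability(rel_gamma, rel_theta, w_gamma=0.5, w_theta=0.5):
    """Weighted average of two sigmoid outputs (weights assumed to sum to 1).

    rel_gamma: relative gamma power (high when awake, so a rising sigmoid).
    rel_theta: relative theta power (high when drowsy, so a falling sigmoid).
    The midpoints (0.2, 0.3) and slopes (+/-20) are illustrative assumptions.
    """
    p_gamma = sigmoid(rel_gamma, midpoint=0.2, slope=20.0)
    p_theta = sigmoid(rel_theta, midpoint=0.3, slope=-20.0)
    return w_gamma * p_gamma + w_theta * p_theta
```

With these placeholder settings, a segment with high gamma and low theta power yields a value close to 1, and a segment with low gamma and high theta power yields a value close to 0.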


Subsequently, we divided the signal segments into awake, drowsy, and sleep clusters using cluster quality evaluation metrics and tested the result against the video rating. The model gave high accuracy and specificity but low sensitivity.

5.1.5.2 Step Function Model

The step function model, which was developed to overcome the low sensitivity of the sigmoid wake probability model, requires one parameter to be determined instead of the sigmoid wake probability model's two. Therefore, unlike the sigmoid wake probability model, it does not require extreme cases of non-alertness for parameter tuning. A schematic outline of the step function model is shown in Fig. 5.7.

Figure 5.7: A schematic outline of the step function model proposed in this thesis. A. First, ocular noise was removed using independent component analysis (ICA), and the signal was then band-pass and notch filtered. B. Relative powers of the delta, theta, alpha, beta, and gamma bands were extracted from 1s EEG segments, and each feature was fed into a step function. C. The step function outputs were weighted and averaged using weights computed from the training data with random forest. D. The output was compared against a threshold to classify a 1s episode as alert or drowsy.

Data Preprocessing and Feature Extraction


Data preprocessing was performed using the same steps as for the sigmoid wake probability model. In the step function model, we first divided the data into 1s segments. Subsequently, we extracted relative power features for the delta (2-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-100 Hz) bands. Even though the delta band starts at 1 Hz, we used only its upper part (2-4 Hz) to observe the Nyquist criterion.
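The feature extraction step can be sketched as follows. Eqn. 4.1 is not reproduced in this section, so the normalization used here (band power divided by the summed power of the five listed bands) is an assumption, as are the function names. A naive discrete Fourier transform is used only to keep the sketch dependency-free; a fast Fourier transform routine would normally take its place.

```python
import cmath
import math

# Band edges in Hz as listed in the text for the step function model.
BANDS = {"delta": (2, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}


def power_spectrum(segment, fs):
    """Return (frequencies, power) of the one-sided DFT of a real segment."""
    n = len(segment)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(segment))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def relative_band_powers(segment, fs):
    """Power in each band divided by the summed power of all listed bands."""
    freqs, power = power_spectrum(segment, fs)
    band_power = {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
        for name, (lo, hi) in BANDS.items()
    }
    total = sum(band_power.values())
    if total == 0.0:  # flat segment: no power in any band
        return {name: 0.0 for name in BANDS}
    return {name: bp / total for name, bp in band_power.items()}
```

For a 1s segment, the frequency bins are spaced 1 Hz apart, so a pure 10 Hz sine places essentially all of its power in the alpha band.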

Development of Step Function Model

Each of the relative power features was fed into one step function. Prior studies have demonstrated that as an individual becomes drowsy in a situation similar to driving, delta, theta, alpha, and beta band powers increase and gamma band power decreases [16]. Therefore, the top function in Fig. 5.7 is used to capture the changes in the gamma band, and the bottom function in Fig. 5.7 is used to capture the changes in the delta, theta, alpha, and beta bands. As is evident from Fig. 5.7, a step function is defined by a single threshold, whereas the sigmoid function requires two parameters. Therefore, extreme cases of non-alertness are no longer required in the step function model.
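The pipeline of Fig. 5.7 (steps B to D) can be sketched as below. The convention that a step output of 1 votes for "alert", the example decision threshold of 0.5, and all names are assumptions made for illustration; the thresholds and weights of the study itself are not given in this section.

```python
def step_outputs(features, thresholds):
    """Map each relative power feature to 1 (alert-like) or 0 (drowsy-like)."""
    outputs = {}
    for band, value in features.items():
        if band == "gamma":
            # Gamma decreases with drowsiness: high gamma votes for alert.
            outputs[band] = 1.0 if value >= thresholds[band] else 0.0
        else:
            # The other bands increase with drowsiness: low power votes for alert.
            outputs[band] = 1.0 if value < thresholds[band] else 0.0
    return outputs


def classify_segment(features, thresholds, weights, decision_threshold=0.5):
    """Weighted average of the step outputs, then a final threshold."""
    outputs = step_outputs(features, thresholds)
    total_weight = sum(weights[band] for band in outputs)
    score = sum(weights[band] * outputs[band] for band in outputs) / total_weight
    return "alert" if score >= decision_threshold else "drowsy"
```

Each band contributes a single threshold, which is the one-parameter property that distinguishes this model from the two-parameter sigmoid.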

The step function outputs were weighted and averaged using weights computed by random forest [70]. Subsequently, this value was compared with a threshold to classify the segment as alert or drowsy. The thresholds in the model were determined heuristically. As with the sigmoid wake probability model, we randomly selected 60% of the participants for training and the remainder for testing in each run. Both the step function thresholds and the feature weights were computed from the training data. The performance of the step function model was compared against the facial video rating for model validation. The performance metrics reported here are the mean and standard deviation over 100 runs. The step function model yields accuracy, sensitivity, and specificity similar to those of the sigmoid wake probability model (see Table 5.5). In particular, its sensitivity is still low, and hence a more robust algorithm needs to be developed.
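The validation protocol described above (a random 60/40 split of participants, repeated 100 times, with the mean and standard deviation reported) can be sketched as follows. The `evaluate` callback is a hypothetical stand-in for training a model on the training participants and scoring it on the test participants; the seed is likewise an assumption added for reproducibility.

```python
import random
import statistics


def split_participants(participant_ids, train_fraction=0.6, rng=random):
    """Randomly split participants (not segments) into training and test sets."""
    ids = list(participant_ids)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def repeated_evaluation(participant_ids, evaluate, n_runs=100, seed=0):
    """Return the mean and standard deviation of a metric over repeated splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        train_ids, test_ids = split_participants(participant_ids, 0.6, rng)
        scores.append(evaluate(train_ids, test_ids))
    return statistics.mean(scores), statistics.stdev(scores)
```

Splitting by participant rather than by segment keeps all segments of a given person on one side of the split, so the test performance reflects unseen participants.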

5.1.5.3 Time-Domain Feature-Based Algorithm

To overcome the low sensitivity of the sigmoid and step function models, we analyzed the time-domain properties of EEG to identify effective markers of drowsiness. Time-domain features have been used in the literature to extract markers for alertness monitoring [29], sleep stage classification [83, 84], and anesthesia-EEG data analysis [85]. We then developed a time-domain feature-based drowsiness detection algorithm using machine learning techniques. A schematic outline of the proposed method is shown in Fig. 5.8.


Figure 5.8: A schematic outline of the proposed time-domain feature-based algorithm.

Data Preprocessing

The EEG signals were first band-pass filtered from 1 to 100 Hz. Next, a notch filter was applied at 60 Hz to remove power-line noise. Subsequently, the filtered data were divided into 1s segments.
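A minimal sketch of these three preprocessing steps is given below. The text specifies only the pass band (1 to 100 Hz), the notch frequency (60 Hz), and the segment length (1s); the use of second-order sections, the Q factors, and causal single-pass filtering are assumptions made to keep the example self-contained. The coefficient formulas are the widely used Audio EQ Cookbook biquad designs.

```python
import math


def biquad_coefficients(kind, f0, fs, q):
    """Normalized second-order filter coefficients (Audio EQ Cookbook formulas)."""
    w0 = 2.0 * math.pi * f0 / fs
    cos_w0, alpha = math.cos(w0), math.sin(w0) / (2.0 * q)
    if kind == "highpass":
        b = [(1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2]
    elif kind == "lowpass":
        b = [(1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2]
    elif kind == "notch":
        b = [1.0, -2 * cos_w0, 1.0]
    else:
        raise ValueError(kind)
    a = [1 + alpha, -2 * cos_w0, 1 - alpha]
    return [c / a[0] for c in b], [c / a[0] for c in a]


def apply_biquad(signal, b, a):
    """Causal direct-form I filtering (single pass, not zero-phase)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1, y2, y1 = x1, x0, y1, y0
    return out


def preprocess(signal, fs):
    """High-pass at 1 Hz, low-pass at 100 Hz, notch at 60 Hz, then 1 s segments."""
    stages = [("highpass", 1.0, math.sqrt(0.5)),   # assumed Butterworth-like Q
              ("lowpass", 100.0, math.sqrt(0.5)),
              ("notch", 60.0, 30.0)]               # assumed narrow notch
    for kind, f0, q in stages:
        b, a = biquad_coefficients(kind, f0, fs, q)
        signal = apply_biquad(signal, b, a)
    n = int(fs)  # samples per 1 s segment
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
```

Once the filter transients have decayed, a 60 Hz component is suppressed almost completely, while a 10 Hz component passes with nearly unchanged amplitude. Any trailing samples that do not fill a complete 1s segment are dropped.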

Feature Extraction

After preprocessing the EEG signals, we extracted Hjorth parameters from each of the 1s signal segments. Hjorth parameters capture the temporal dynamics of an EEG signal segment [86]. For an EEG signal segment x(t), the three Hjorth parameters, namely activity, mobility, and complexity, are defined as follows.

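$$\text{Activity} = \operatorname{Var}\big(x(t)\big) \qquad (5.5)$$

$$\text{Mobility} = \sqrt{\frac{\operatorname{Var}\left(\frac{dx(t)}{dt}\right)}{\operatorname{Var}\big(x(t)\big)}} \qquad (5.6)$$

$$\text{Complexity} = \frac{\text{Mobility}\left(\frac{dx(t)}{dt}\right)}{\text{Mobility}\big(x(t)\big)} \qquad (5.7)$$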

Activity represents the total energy of the signal x(t). As seen from Eqn. 5.6, mobility is the ratio of the standard deviation of the slope to the standard deviation of the amplitude. It can also be interpreted as the standard deviation of the power spectrum along the frequency axis (see Appendix B5 for more details). Complexity is defined as the ratio of the mobility of the signal's first derivative to the mobility of the signal itself. Under this definition, complexity equals 1 for a pure sine wave and is larger for any other waveform. Complexity values closer to 1 therefore indicate that the EEG signal segment is more similar to a pure sine wave.
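The three Hjorth parameters can be computed from a sampled segment as sketched below. The derivative is approximated by the first difference with unit sample spacing, which is an assumption of this example; with that convention, mobility is expressed per sample and would be multiplied by the sampling rate to obtain a value per second, while complexity is unaffected.

```python
import math
import statistics


def first_difference(x):
    """Discrete approximation of the derivative (unit sample spacing)."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_parameters(segment):
    """Return (activity, mobility, complexity) of one signal segment."""
    d1 = first_difference(segment)
    d2 = first_difference(d1)
    var_x = statistics.pvariance(segment)
    var_d1 = statistics.pvariance(d1)
    var_d2 = statistics.pvariance(d2)
    activity = var_x
    mobility = math.sqrt(var_d1 / var_x)
    # Mobility of the derivative divided by mobility of the signal.
    complexity = math.sqrt(var_d2 / var_d1) / mobility
    return activity, mobility, complexity
```

For a unit-amplitude sine wave, the activity is approximately 0.5 and the complexity is approximately 1; for white noise, the complexity is noticeably larger than 1.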

Classification

The three time-domain based features extracted from the EEG signals were fed into a classification model. The

classification model used in the proposed framework is bootstrap aggregating or bagging [87]. Bagging is an

ensemble learning-based classifier which combines multiple weak or base classifiers to create stronger and more

accurate classification models. A schematic illustration of bagging is presented in Fig. 5.9.

Figure 5.9: Illustration of the bootstrap aggregating classifier. A percentage of the training data is drawn with replacement to create each of the bootstrapped replicas. Afterwards, each of the replicas is used to train a weak classifier. The final output of bootstrap aggregating is determined by majority voting over all the decision tree outputs.

Bagging creates multiple bootstrapped replicas of the original training data by sampling a percentage of the training data with replacement, as shown in Fig. 5.9. Subsequently, each of the bootstrapped replicas is used to train a weak or base classifier. Thus, through resampling, bagging builds an ensemble that is as diverse as possible. In this work, a decision tree is used as the base classifier. To classify a test instance, each of the trained decision trees gives an output, and the class chosen by most decision trees is selected as the ensemble decision. Majority voting tends to outvote the incorrect decisions of individual trees and reinforce the correct ones. Despite being simple and intuitive, bagging has been reported to be fast, robust, and more accurate than other ensemble learning-based classification models [87, 88].
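The bagging procedure can be sketched as follows for binary labels (0 for alert, 1 for drowsy). To keep the example short, a one-split decision tree (a decision stump) is substituted for the full decision trees used in this work, each replica is drawn at the same size as the training set, and the number of estimators and the seed are arbitrary; all of these are assumptions of the sketch.

```python
import random
from collections import Counter


def train_stump(samples, labels):
    """Fit a one-split decision tree (stump) on binary labels 0/1."""
    majority = Counter(labels).most_common(1)[0][0]
    best_correct = sum(1 for y in labels if y == majority)
    # Fallback rule: threshold at infinity, so the majority class is always predicted.
    best_rule = (0, float("inf"), 1 - majority)
    for feature in range(len(samples[0])):
        values = sorted({s[feature] for s in samples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            for above_label in (0, 1):
                correct = sum(
                    1 for s, y in zip(samples, labels)
                    if (above_label if s[feature] > threshold
                        else 1 - above_label) == y
                )
                if correct > best_correct:
                    best_correct = correct
                    best_rule = (feature, threshold, above_label)
    return best_rule


def predict_stump(rule, sample):
    feature, threshold, above_label = rule
    return above_label if sample[feature] > threshold else 1 - above_label


def train_bagging(samples, labels, n_estimators=25, seed=0):
    """Train one stump per bootstrapped replica of the training data."""
    rng = random.Random(seed)
    n = len(samples)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ensemble.append(train_stump([samples[i] for i in idx],
                                    [labels[i] for i in idx]))
    return ensemble


def predict_bagging(ensemble, sample):
    """Majority vote over the outputs of all base classifiers."""
    votes = Counter(predict_stump(rule, sample) for rule in ensemble)
    return votes.most_common(1)[0][0]
```

In the setting of this chapter, each sample would be the triple (activity, mobility, complexity) of a 1s segment, and an odd number of estimators avoids tied votes in the two-class case.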

5.2 Results

The segment statistics per electrode in our study are presented in Table 5.2.

Table 5.2: Average and standard deviation of the number of 1s segments used per electrode. Data are presented as mean ± standard deviation.

          Segments per participant   Total segments
Alert     5264.71 ± 210.85           73706
Drowsy    1732.00 ± 210.86           24248

5.2.1 Drowsiness Ratings

In this study, button press information and facial video ratings were used simultaneously to identify episodes of

drowsiness. The use of button press information, however, is problematic given that it is prone to false positives and

false negatives.

Fig. 5.10 (a) shows the reaction time in seconds of a participant in response to blue crosses throughout a block. A negative reaction time indicates a blue cross missed by the participant. As Fig. 5.10 shows, at lower levels of drowsiness the subject may be drowsy and still be able to perform the task correctly. Comparing the button-press data and the facial video rating in the region marked by blue squares in Fig. 5.10 (a) makes it evident that the participant is able to do the task at lower levels of drowsiness. Therefore, button-press information alone cannot detect drowsy episodes of lower degree or intensity. In addition, the participant can miss button presses for reasons other than drowsiness, such as inattention and mind-wandering. Fig. 5.10 (b), which shows the same plot as Fig. 5.10 (a) zoomed in on the beginning of the block, illustrates this. The participant missed the first two crosses even though the facial video rating showed that the participant was awake, which points to inattention rather than drowsiness. The use of button-press information in such cases would therefore introduce false positives. Also, drowsy episodes might occur between two blue crosses and thus remain undetected by button-press information. Finally, button-press information cannot precisely pinpoint the beginning and end of a drowsy episode. For these reasons, and in accordance with the current literature, we adopted the facial video rating as the method for identifying episodes of drowsiness in this study.

Figure 5.10: (a) (Top panel) Reaction time in seconds of a participant in response to blue crosses throughout a block. Negative reaction time indicates the blue crosses missed by the participant. (Bottom panel) Facial video rating of the same block. (b) Reaction time and video rating plot of the same participant zoomed in on the beginning of the block.

Table 5.3 shows the 2-class (alert, score 0, and non-alert, score 1) confusion matrix of the ratings of the two independent raters. The diagonal of the matrix in Table 5.3 shows the number of alert and non-alert segments that both raters agreed on. The inter-rater agreement between the two raters was 96.27%.

Table 5.3: Inter-rater agreement of the proposed scale for scores 0 and 1. The inter-rater agreement between the two raters was 96.27%. The diagonal of the confusion matrix is highlighted in bold.

                    Rater #2
Rater #1   Score       0       1
               0    2745      71
               1      75    1020
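The reported agreement can be reproduced from the counts in Table 5.3 as the share of segments on the diagonal of the confusion matrix. The function name below is illustrative.

```python
def percent_agreement(confusion):
    """Share of segments on the diagonal of a square confusion matrix, in percent."""
    agreed = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * agreed / total


# Counts from Table 5.3 (rows: rater #1, columns: rater #2).
table_5_3 = [[2745, 71],
             [75, 1020]]
print(round(percent_agreement(table_5_3), 2))  # 96.27
```

Here 3765 of the 3911 rated segments lie on the diagonal, which gives the 96.27% stated in the text.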


Table 5.4 shows the 11-class (score 0 indicating alert and scores 1-10 indicating various levels of non-alertness) confusion matrix of the ratings of the two independent raters. The diagonal of the matrix in Table 5.4 shows the number of segments for which both raters gave the same score. As Table 5.1 shows, scores 1 to 4 each correspond to a percentage of eye closure. The subsequent three levels (i.e. scores 5 to 7) are primarily based on the slow eye blinks that occur right before sleep onset. Usually, after these stages, the subject progresses to deeper sleep stages. Since the descriptions of scores 5-7 do not contain an empirical value, the inter-rater agreement for these three classes is lower than that for scores 0-4, as we can see from Table 5.4. The inter-rater agreement between the two raters was 85.96%.

Anderson-Darling and Kolmogorov-Smirnov tests showed that the data were not normally distributed. Therefore, we performed a Spearman correlation analysis, which showed a very high correlation (r = 0.93, p < .00001) between the 11-class (scores 0 to 10) ratings of the two raters. Furthermore, in the 11-class ratings, the mean difference between the ratings was 0.038609, the standard deviation of the difference was 0.98534, the median was 0, and the inter-quartile range was 0.
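Spearman correlation is the Pearson correlation of the rank-transformed ratings, which is why it does not require normally distributed data. A minimal sketch follows; because ordinal ratings contain many ties, tied values are given the mean of their ranks. The function names are illustrative, and the computation of the p-value is omitted.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Applied to the two raters' per-segment scores, `spearman` returns a value between -1 and 1, with values near 1 indicating that the raters order the segments in nearly the same way.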

Table 5.4: Inter-rater agreement of the proposed scale for scores 0 to 10. The inter-rater agreement between the two raters was 85.96%. The diagonal of the confusion matrix is highlighted in bold.

                                        Rater #2
Rater #1   Score      0     1     2     3     4     5     6     7     8     9    10
               0   2745    16    10    11     7     9     6     0     6     5     1
               1     28   109    34    15     3     2     0     0     1     0     0
               2     18    31     3    11     3     3     0     0     1     0     0
               3      7     1     3    55     3     1     0     0     2     0     0
               4      8     2     0    11    72     3     7     0     1     0     0
               5      5     1     0     1    31     5    15     0     0     0     0
               6      3     4     0     0     1     0     3     0    23     7     0
               7      2     0     0     0     0     0     0     0    26     0     0
               8      0     0     0     1     1     0     1     0    49     8     0
               9      3     0     0     4     0     0     1     6     4   111    44
              10      1     0     0     1     1     0     1     0     0    52   210


Another important observation that can be made from the button-press information and facial video rating plot in Fig. 5.10 is that even though the button-press data often fail to detect low-intensity drowsy episodes, they consistently match the video rating when the rating is close to 10. This indicates that the extremely non-alert episodes were correctly captured by the facial video rating.

5.2.2 Modified Sigmoid Wake Probability Model

Figure 5.11 shows the performance metrics of the sigmoid wake probability model for each of the 19 electrodes. It is evident from Fig. 5.11 that even though the accuracy and specificity values of the sigmoid wake probability model are high, the sensitivity is low. For every electrode, the sigmoid wake probability model gives a mean sensitivity of less than 40%. Hence, the step function model was developed to overcome this limitation of the sigmoid wake probability model.
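The three performance metrics can be computed from per-segment labels as sketched below, with "drowsy" treated as the positive class so that sensitivity measures the share of drowsy segments that are detected. The label strings and function name are assumptions of this example, and both classes are assumed to be present in the reference labels.

```python
def detection_metrics(true_labels, predicted_labels, positive="drowsy"):
    """Accuracy, sensitivity, and specificity with `positive` as the target class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # drowsy segment detected
            else:
                fn += 1  # drowsy segment missed
        else:
            if pred == positive:
                fp += 1  # false alarm on an alert segment
            else:
                tn += 1  # alert segment correctly left unflagged
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example: 6 alert and 4 drowsy segments, only one drowsy segment detected.
truth = ["alert"] * 6 + ["drowsy"] * 4
pred = ["alert"] * 6 + ["drowsy"] + ["alert"] * 3
print(detection_metrics(truth, pred))
# {'accuracy': 0.7, 'sensitivity': 0.25, 'specificity': 1.0}
```

The toy example shows how a detector that rarely flags drowsiness can still reach high accuracy and perfect specificity when alert segments dominate, which is the pattern seen in Fig. 5.11.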

Figure 5.11: Mean and standard deviation of accuracy, sensitivity, and specificity over 100 runs of the sigmoid wake probability model on each electrode of the 10-20 electrode placement system.


5.2.3 Step Function Model

Figure 5.12: Feature weights of F3 electrode computed from random forest.

Figure 5.13: Mean and standard deviation of accuracy, sensitivity, and specificity over 100 runs of the step function-based algorithm on each electrode of the 10-20 electrode placement system.

Fig. 5.12 shows the feature weights computed with random forest from the training participants of one run. This feature weight determination approach ensures that the EEG frequency bands that are dominant in a particular electrode receive higher weights when the EEG signals of that electrode are analyzed. Figure 5.13 shows the performance metrics of the step function model for each of the 19 electrodes. We can see from Fig. 5.13 that the accuracy, sensitivity, and specificity values are similar across the electrodes. A comparison with Fig. 5.11 shows that the accuracy, sensitivity, and specificity values of this model are similar to those of the sigmoid wake probability model. The high specificity values indicate that the step function model gives very few false alarms. However, the sensitivity values of this model are still low.

Figure 5.14: (a) Mean and standard deviation and (b) maximum values of accuracy, sensitivity, and specificity over 100 runs of the Hjorth parameter-based algorithm on each electrode of the 10-20 electrode placement system. Here, alert (score 0) versus non-alert (score 1) segments are classified.


5.2.4 Time-Domain Feature-Based Algorithm

The accuracy, sensitivity, and specificity values of the proposed time-domain feature-based algorithm for classifying alert and drowsy segments (binary classification) are presented in Fig. 5.14. It is evident from Fig. 5.14 that the values of the performance metrics do not vary much across the 19 electrodes. Furthermore, the low standard deviations for all of the electrodes and all of the metrics suggest that the metric values were similar in each of the 100 runs. Considering both accuracy and sensitivity, the frontal electrodes F3 or F8 would be well suited for implementing the proposed time-domain feature-based algorithm in a wearable EEG system.

Figure 5.15: (a) Mean and standard deviation and (b) maximum values of accuracy, sensitivity, and specificity over 100 runs of the Hjorth parameter-based algorithm on each electrode of the 10-20 electrode placement system. Here, 11-class (scores 0-10) classification results are shown.


Table 5.5: Performance comparison of the proposed methods with existing works in the literature.

Method | Description | n | Modalities Used | Total Features | Resolution | Accuracy (%)
Heart rate variability (HRV) features [17] | Driving simulator study | 12 | Electrocardiogram | 30 | 5 mins | 90
Principal component analysis of EEG [41] | Eye open/close task | 15 | EEG | 3 | 5 mins | 0.7 correlation with Karolinska Sleepiness Scale
Artificial neural network [31] | Eye open/close task | 17 | EEG | 42 | 5 mins | 94.37 ± 1.95
Alpha and beta band powers [89] | Eye open/close task | 10 | EEG | 2 | 3 mins | 84.8
HRV features from time-frequency analysis [19] | Driving simulator study | 30 | Electrocardiogram | 8 | 1 min | Sensitivity: 62%, Specificity: 88%
Cardiorespiratory phase synchronization features [21] | Driving simulator study | 16 | Respiration, electrocardiogram | 2 | 1 min | 97.2
Ocular features and partial least squares regression [22] | Driving simulator study | 44 | Electrooculogram | 15 | 30s | 85
Temporal and spectral features in the wavelet domain [24] | Sleep study | 16 | EEG | 19 | 30s | 87.4
Frequency domain features and neural network [20] | Sleep study | 30 | Chin electromyogram, electrooculogram | 5 | 30s | 83
EEG power features [25] | Driving simulator study | 15 | EEG | 2 | 30s | 82
Wavelet packet analysis [38] | Sleep study | 20 | EEG | 2 | 30s | 91.8
Wavelet transform and neural network [51] | Sleep study | 10 | EEG, electromyogram, electrooculogram | 4 | 30s | 97
Entropy and power-based features [28] | Tracking task | 8 | EEG (16 electrodes) | 24 | 2s | 61.2
Beta band power and oxyhemoglobin changes [16] | Driving simulator study | 9 | EEG, near infrared spectroscopy | 2 | 2s | 79.2 ± 9.4
Sigmoid wake probability model | Sleep study | 8 | EEG | 3 | 3s | 93.11
Sigmoid wake probability model | Reaction time task | 8 | EEG | 2 | 1s | Accuracy: 74.70 ± 0.16%, Sensitivity: 39.04 ± 0.53%, Specificity: 100.00 ± 0.00%
Step function model | Reaction time task | 8 | EEG | 5 | 1s | Accuracy: 70.96 ± 0.07%, Sensitivity: 31.12 ± 0.14%, Specificity: 99.99 ± 0.01%
Time domain feature-based algorithm | Reaction time task | 8 | EEG | 3 | 1s | Accuracy: 91.54 ± 0.29%, Sensitivity: 95.38 ± 0.25%, Specificity: 88.52 ± 0.50%


Fig. 5.15 shows the performance of the time-domain feature-based algorithm for 11-class classification. It is clear from Fig. 5.15 that the proposed scheme performs reasonably well in detecting not only the non-alert episodes but also the degree or intensity of a drowsy episode.

The performance of the proposed Hjorth parameter-based method is compared with existing works in the literature in Table 5.5, which also lists the performance of the sigmoid wake probability model and the step function model. The sigmoid wake probability model performs better on the sleep-EEG data than on the reaction time data. Even though its accuracy compares well with existing works in the literature, the algorithm misses many drowsy episodes. The step function model, on the other hand, gives high specificity, but its sensitivity is low, as we can see from Table 5.5. The time-domain feature-based algorithm proposed herein yields comparable or better accuracy, sensitivity, and specificity than the existing studies in the literature, but at a higher resolution (1s). Furthermore, the use of only 3 features makes the proposed algorithm computationally inexpensive. The proposed method takes about 15ms to process and classify an unlabeled 1s segment.

5.3 Discussion

In this study, we developed and tested three algorithms for drowsiness detection. The first algorithm, developed in Study 1, was a sigmoid wake probability model, which achieved 93.11% accuracy in Study 1. However, in Study 2, the sensitivity of the algorithm was low (38.95 ± 0.54% in the Fp2 electrode). Thus, we developed a model that employed a step function, or thresholding, on five relative power-based features. The step function model gave similar accuracy (70.96 ± 0.07% in the Fp2 electrode) and specificity (99.99 ± 0.01% in the Fp2 electrode), with only a few false detections. Nevertheless, the step function model's sensitivity (31.12 ± 0.14% in Fp2) was still not satisfactory. The third algorithm developed in this study, which overcomes the low sensitivity of the sigmoid and step function models, exploits time-domain properties of EEG to detect drowsiness.

Although the sigmoid wake probability model developed on the sleep data had high accuracy, it gave poor detection performance in Study 2. Lower sensitivity of a drowsiness detection algorithm implies more missed detections, and lower specificity implies more false detections. In the context of drowsiness detection, where a missed drowsy episode is more costly than a false alarm, higher sensitivity is therefore more important than higher specificity. There are three possible reasons for the poor sensitivity of the sigmoid and step function models. First, in a sleep study, participants usually lie in bed with their eyes closed and without much movement, trying to fall asleep. In contrast, in the current study, participants tried to stay awake, blinked, and moved. This makes the EEG data recorded in this study noisier than the sleep-EEG data. Second, prior studies have demonstrated that neural rhythms are affected by attention [90] or self-motion perception [91]. Unlike in the sleep study, the participants here were trying to focus on performing the task. Therefore, the EEG frequency bands could be affected by the participants' attention or self-motion perception. Since the sigmoid wake probability model and the step function model rely entirely on the power levels of the EEG frequency bands, both models end up giving poor drowsiness detection performance. Third, in developing the model on the sleep-EEG data, we used arousals as extreme cases of alertness and deep sleep stages such as non-rapid eye movement stages 2 and 3 (non-REM 2 and non-REM 3, i.e. N2 and N3) as extreme cases of non-alertness. In Study 2, the training blocks at the beginning of the study could be used as extreme cases of alertness, but only two participants fell asleep. Furthermore, the two subjects who fell asleep were unlikely to have reached a deeper stage such as N2 or N3, since the sleep episodes lasted only a few minutes. Therefore, we did not have any deep sleep data (extreme cases of non-alertness) to re-train our model, and hence the sigmoid wake probability model did not work well in the drowsiness study.

The time-domain feature-based algorithm has several advantages. It has a resolution of 1s and gave high accuracy. Furthermore, it is fast and requires only one frontal EEG channel, and hence would be well suited to drowsiness detection during driving once validated in a driving study. Another advantage of the proposed algorithm is that it does not require any rigorous and time-consuming preprocessing step such as ICA, which makes it more appealing in the context of drowsy driving detection. Thus, the time-domain feature-based algorithm overcomes the limitations of the sigmoid wake probability and step function models and is more suitable for practical implementation.

From Table 5.3, it is clear that in the 2-class case (score 0 indicating alert and scores 1-10 indicating various levels of non-alertness), the two raters show high agreement. The inter-rater agreement decreases from 96.27% to 85.96% when the raters rate the degree or level of drowsiness in the 11-class case (scores 0 to 10). In both cases, the agreement was limited mainly by disagreement in identifying the beginning of some drowsy episodes. This is why the Score 0 column of Table 5.4 shows four segments that were rated as 9 or 10 by rater #1 even though they were scored as 0 (i.e. alert) by rater #2. In the context of this thesis, the agreement in the 2-class case is more important, since our primary goal at this stage is to detect drowsiness and not its level or degree. Nevertheless, the inter-rater agreement in both cases is reasonably high [81]. Table 5.4 also shows that the video segments were chosen to contain as many non-alert segments as possible, so that the video ratings and the reliability of the proposed scale could be fully assessed. It is worth mentioning that the second rater was blinded to the ratings of the first rater as well as to the button-press data.

However, there are two main limitations. In the current study, as we can see from Table 5.2, the number of alert episodes is much higher than the number of drowsy episodes, which makes it harder to train the model for classification. As a result, we could not achieve higher accuracy. Another limitation of the study is that the video rating stems from only one observer. For better rating accuracy, ratings from more than one observer are recommended. Despite efforts to maintain the consistency of the video rating, some drowsy episodes may have been missed or falsely detected.


Chapter 6: General Discussions

6.1. Summary of the Findings

In this thesis, we developed three drowsiness detection algorithms across two separate studies, namely the sigmoid wake probability model, the step function model, and the Hjorth parameter-based algorithm. We first conducted a sleep study (Study 1), which yielded EEG data relatively free of noise (e.g., eye blink, motion, and eye movement artifacts). We observed changes in the alpha, delta, and beta EEG frequency bands at sleep onset. To capture and exploit these changes for drowsiness detection, we employed a sigmoid function. The motivation for using a sigmoid function was that the sleep study data contained extreme cases of alertness and extreme cases of non-alertness. The former are arousal states and the latter are deep sleep stages. The idea was that once we had a model that could separate these two extremes, the episodes that lie 'in the middle' could be modeled by a smoothly increasing or decreasing curve such as a sigmoid function. Three clusters were defined in the data by thresholding the likelihood of wakefulness values using commonly used cluster quality evaluation metrics. The model gave high accuracy (>90%) for drowsiness detection on the sleep-EEG data in Study 1.

To test this model in an experiment more similar to driving (i.e., a more engaging and active task with strong visual stimulation), we designed a reaction time study (Study 2) that, unlike Study 1, required the participant to try to stay alert in a sitting position and to perform a task. In Study 2, we also had facial video scores available for validation of the algorithms. In spite of having high accuracy and specificity, the sigmoid wake probability model developed in Study 1 missed a large number of drowsy episodes in Study 2 (sensitivity <40%). In the sleep study, we used deep sleep segments as extreme cases of non-alertness for model development, and such segments were absent from the Study 2 data. This caused the model to yield low sensitivity on the Study 2 data.

To counteract this problem, we applied a step function to each of the five frequency bands to develop the step function model. Unlike the sigmoid wake probability model, the step function model does not require extreme cases of non-alertness for drowsiness detection. This model's accuracy and specificity values were similar to those of the sigmoid model. However, it still missed a large number of drowsy episodes (sensitivity <35%).


We then explored time-domain properties of the EEG data to identify effective markers of drowsiness. Hjorth

parameters have been designed to capture changes in the time-domain properties of EEG [86] and have widely been

used for alertness monitoring [29], sleep stage classification [83, 84], and anesthesia-EEG data analysis [85].

The Hjorth parameters are activity, mobility, and complexity. Instead of band power, these parameters capture the time-domain properties of EEG. Complexity captures the EEG signal's similarity to a pure sine wave [86]. Mobility, on the other hand, captures the degree of fluctuation of the energy of an EEG signal segment. Finally, activity captures the energy of the EEG signal segment. Fig. 6.1 shows the variations of the complexity parameter at different levels of drowsiness. In general, the complexity parameter decreases as the subject becomes drowsy.

Figure 6.1: Complexity parameter value (top panel) and the facial video rating (bottom panel). In general, the complexity decreases as the subject becomes drowsy.

Fig. 6.2 shows the variations of mobility parameter values in different levels of drowsiness. It can be seen from Fig.

6.2 that in general, the mobility parameter decreases as the subject becomes drowsy. Fig. 6.3 illustrates the


variations of activity parameter values in different levels of drowsiness. It can be seen from Fig. 6.3 that in general,

the activity parameter decreases as the subject becomes drowsy.

Figure 6.2: Mobility parameter value (top panel) and the facial video rating (bottom panel). In general, the mobility decreases as the subject becomes drowsy.


Figure 6.3: Activity parameter value (top panel) and the facial video rating (bottom panel). In general, the activity decreases as the subject becomes drowsy.

The changes in the Hjorth parameters can be explained from a neurophysiological standpoint. Sleep onset is characterized by the activity of the inhibitory projections of the GABAergic and galaninergic neurons in the ventrolateral preoptic nucleus onto cells in the ascending arousal system [92]. This causes the activity of the neurons in the neocortex to decrease, resulting in lower complexity and lower fluctuations of energy [93, 94]. In fact, the complexity of EEG has been reported to decrease with increasingly deeper sleep stages [94]. Therefore, all three Hjorth parameters decrease when the subject becomes drowsy.

In Study 2, we used facial video ratings as labels for supervised learning, with the aim of developing a more robust and accurate drowsiness detection algorithm that needs no parameter tuning or weight determination. Using a bootstrap aggregating (bagging) classifier, we developed a Hjorth parameter-based algorithm that gave high accuracy, sensitivity, and specificity (Accuracy: 91.54 ± 0.29%, Sensitivity: 95.38 ± 0.25%, and Specificity: 88.52 ± 0.50% in P7).

In the sleep-EEG data, the sampling rate was 128 Hz for most of the participants. By the Nyquist sampling theorem, we could therefore only extract frequencies up to 64 Hz, which does not cover the full gamma band (30-100 Hz). In contrast, the sampling rate in Study 2 was 1024 Hz, which allowed us to use all five frequency bands for model development.

Furthermore, for the sigmoid wake probability model in the sleep study data, we used features computed from the delta (1-4 Hz) band. Therefore, the minimum segment length had to be 2 s. In Study 2, on the other hand, we used the theta (4-8 Hz) and gamma (30-100 Hz) bands, which allowed us to detect drowsiness at a higher resolution. Prior studies have reported microsleep episodes as short as 1 s [52]. Even though the average human reaction time to a visual stimulus is 0.25 s [95], rating drowsiness at a resolution finer than 1 s might cause the scoring scheme to miss behavioral cues of drowsiness (eye closure, facial contortion, eye blink, head nodding). Therefore, we used 1 s resolution in Study 2 to develop our algorithms.
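The sampling-rate argument above can be checked with a few lines of arithmetic. The sketch below lists which EEG bands lie entirely below the Nyquist frequency for the two sampling rates mentioned. The delta, theta, and gamma limits are taken from the text; the alpha and beta limits are not stated in this section and are assumed conventional values, and the helper name `fully_covered_bands` is hypothetical.

```python
# Band limits in Hz. Delta, theta, and gamma are quoted in the text;
# alpha and beta are ASSUMED conventional ranges, not taken from this section.
BANDS_HZ = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 13),    # assumption
    "beta": (13, 30),    # assumption
    "gamma": (30, 100),
}

def fully_covered_bands(fs_hz, bands=BANDS_HZ):
    """Return the names of bands whose upper edge is at or below fs/2."""
    nyquist = fs_hz / 2.0
    return [name for name, (low, high) in bands.items() if high <= nyquist]

for fs in (128, 1024):   # sleep-study and Study 2 sampling rates
    print(fs, "Hz ->", fully_covered_bands(fs))
```

At 128 Hz the gamma band is excluded because its upper edge (100 Hz) exceeds the 64 Hz Nyquist limit, whereas at 1024 Hz all five bands are available.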

6.2. Comparison with Other Drowsiness Detection Systems

To the best of our knowledge, none of the existing works in the literature detects drowsiness at 1 s resolution. As the window size shrinks, the amount of information that can be extracted from the EEG decreases, so extracting reliable markers of drowsiness becomes increasingly challenging. Consequently, prior works that detected drowsiness at a lower resolution usually report higher detection performance than higher-resolution (<10 s) studies. Peiris et al. [28] and Nguyen et al. [16] are perhaps the only two studies that detect drowsiness at 2 s resolution. The former reports maximum accuracy, sensitivity, and specificity values of 61.2%, 73.5%, and 25.5%, respectively, using multichannel EEG [28]. Furthermore, all the missed detections were from episodes shorter than 20 s. The latter reports a maximum accuracy of 79.2% but does not report sensitivity or specificity [16]. In contrast, the Hjorth parameter-based algorithm proposed in this thesis gives accuracy, sensitivity, and specificity of 91.54 ± 0.29%, 95.38 ± 0.25%, and 88.52 ± 0.50%, respectively.

A common problem with commercial fatigue monitors is their frequent false alarms, which dissuade drivers from using these systems [4]. A common feature of all three algorithms developed in this thesis is that they yield close to 100% specificity on the Study 2 data. Since higher specificity means fewer false positives, the algorithms developed in this thesis largely avoid this problem.

Unlike most of the existing works in the literature [16, 18-20, 51], we reported the accuracy, sensitivity, and specificity of the developed algorithms. An algorithm with low sensitivity, despite having high accuracy, misses most of the drowsy episodes. Conversely, an algorithm with low specificity, despite having high accuracy, gives many false alarms. Therefore, to fully characterize the performance of a drowsiness detection algorithm, one must report all three measures.
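The point that accuracy alone can mislead is easy to see with a small worked example. The sketch below computes the three measures from confusion-matrix counts using their standard definitions. The counts are hypothetical and chosen only to illustrate an imbalanced recording with few drowsy segments; they are not results from either study, and the function name `detection_metrics` is an assumption.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: drowsy segments detected as drowsy     fn: drowsy segments missed
    tn: alert segments detected as alert       fp: alert segments flagged drowsy
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # fraction of drowsy segments detected
    specificity = tn / (tn + fp)   # fraction of alert segments left alone
    return accuracy, sensitivity, specificity

# HYPOTHETICAL counts: 1000 one-second segments, only 50 of them drowsy.
acc, sens, spec = detection_metrics(tp=5, fn=45, tn=940, fp=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

With these counts the accuracy is 94.5% even though 90% of the drowsy segments are missed, which is why sensitivity and specificity need to be reported alongside accuracy.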

While analyzing data in both studies, we selected a percentage of the participants for training and the remainder for testing. Thus, each participant’s data appeared in either the training set or the test set, but never in both. This ensured that the reported performance generalizes to unseen participants. This setting also has a practical benefit: when the developed algorithms are implemented in a vehicle, they do not need to be trained on the driver’s own data but can be pre-trained using other persons’ data. Furthermore, most of the existing works in the literature [16, 20, 28, 51] do not randomize their training and testing data, so their reported performance values might hold only for one particular choice of training and testing data.
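The participant-wise, randomized split described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: the 70% training fraction, the helper name `subject_wise_split`, and the toy data layout (10 participants with 5 segments each) are all hypothetical, since the text only says that "a percentage" of participants was used for training.

```python
import numpy as np

def subject_wise_split(subject_ids, train_fraction=0.7, seed=0):
    """Split segments so that every participant is entirely in train or in test.

    subject_ids: one participant label per segment.
    Returns boolean masks (train_mask, test_mask) over the segments.
    train_fraction is an ASSUMED value used for illustration only.
    """
    subject_ids = np.asarray(subject_ids)
    subjects = np.unique(subject_ids)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(subjects)            # randomize participant order
    n_train = int(round(train_fraction * len(subjects)))
    train_mask = np.isin(subject_ids, shuffled[:n_train])
    return train_mask, ~train_mask

# Toy layout: 10 participants, 5 segments each.
subject_ids = np.repeat(np.arange(10), 5)

# Repeating the split with different seeds gives different train/test choices,
# so performance can be reported as mean ± SD over the repetitions.
for seed in range(3):
    train_mask, test_mask = subject_wise_split(subject_ids, seed=seed)
    print(seed, sorted(set(subject_ids[test_mask])))
```

Because the shuffle is applied to participants rather than to individual segments, no participant can contribute data to both sets.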

Existing commercialized drowsy driving detection systems are based on either vehicle measures or eye-tracking [3]. The former depend on road geometry, weather, and the driving skills of the driver [4, 43]. Vehicle parameters also vary from driver to driver, so the performance of these algorithms varies across drivers [4]. Eye-tracking based measures are reliant on lighting conditions [34]. Furthermore, the use of a camera to constantly monitor the driver compromises his/her privacy. Because it relies on EEG, the proposed scheme is free from these drawbacks.

Most of the physiological signal-based algorithms in the literature use multiple signals [16, 20, 49], multiple channels of data [28, 50], or both [4, 24] for drowsiness detection. The disadvantage of using multiple physiological signals is that it requires more sensors and electrodes attached to the driver’s body. Multichannel EEG-based algorithms, on the other hand, require the driver to wear an EEG cap, which has to be precisely positioned and is inconvenient. In both studies and in all of the developed algorithms, we used EEG data only. Since the proposed schemes in this thesis are single-channel EEG-based, they are free from these problems.

6.3. Practical Implications

From the button-press data in Fig. 5.10 and the corresponding discussion, it becomes evident that a driver might have some level of drowsiness and yet still be able to drive. Therefore, when implementing the Hjorth parameter-based algorithm in practice, one must decide at which level of drowsiness the driver should be warned. Furthermore, the fine-grained scale of drowsiness also gives us an opportunity to investigate the effect of drowsiness on driving performance. In a driving simulator study, for instance, one can record facial video together with vehicle measures such as lane deviation and speed variability. Using these vehicle measures, one can determine at which point of the proposed scale (from score 1 to score 10) driving becomes impaired and/or the driver loses control of the vehicle. Thus, the scale developed in this thesis can shed light on the effect of drowsiness and fatigue on driving performance.

Despite the sigmoid wake probability model’s poor performance in drowsiness detection under experimental conditions similar to driving, the model can be useful for drowsiness detection in sleep. The model can be used to characterize and compare the sleep onset phenotypes of different clinical populations as well as to characterize the natural inhomogeneity of healthy subjects [15]. Furthermore, this algorithm can act as a diagnostic tool for disorders of sleep onset such as narcolepsy or insomnia [14, 15]. Moreover, it can be used to dynamically track loss of alertness in situations wherein alertness is vital (e.g. depth of anesthesia estimation). In addition, Fig. 4.7 shows that the proposed model separates arousal and deep sleep segments successfully. Therefore, the sigmoid wake probability model can be used for arousal detection as well. The model can also assist sleep technicians in identifying awake episodes and pinpointing sleep onset. This work also highlights that existing works in the literature that employ sleep-EEG data [20, 51] to develop drowsiness detection algorithms may not be applicable to drowsiness detection in the context of driving.

Also, the features used for drowsiness detection in the Hjorth parameter-based method can be used in conjunction with time series forecasting algorithms such as Kalman filtering or autoregressive integrated moving average (ARIMA) to develop a model that predicts future drowsiness. Upon further validation in a driving study on a larger population, the proposed algorithm can be implemented in a single-channel wearable EEG headband. In a driving simulator-based study, however, the movement of the participant due to driving might contaminate the EEG data with movement-related noise. Therefore, signal decomposition schemes such as empirical mode decomposition [96] or the wavelet transform [97] can be applied for noise removal before extracting the Hjorth parameters from EEG in such a study. Furthermore, the data collected in Study 2 can be instrumental in developing more accurate drowsiness detection algorithms using cardiorespiratory signals (more details can be found in Appendix C5). Moreover, this thesis also proposes a new and well-defined guideline for video rating of the level of drowsiness using behavioral cues such as eye closure, change of muscle tone, eye blink, and head nodding. Lastly, if the performance of the Hjorth parameter-based algorithm can be slightly improved by extracting more features or applying more advanced machine learning algorithms, it can be used as a gold standard for drowsiness detection. This will reduce the need for manually annotating large amounts of data and will expedite and assist future research.
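As a rough illustration of the forecasting idea mentioned at the start of this paragraph, the sketch below applies a scalar Kalman filter with a random-walk state model to a noisy feature series. It is a minimal example under stated assumptions, not a method evaluated in this thesis: the random-walk model, the noise variances `q` and `r`, the function name `kalman_random_walk`, and the synthetic input series are all hypothetical choices.

```python
import numpy as np

def kalman_random_walk(z, q=1e-3, r=1.0):
    """Scalar Kalman filter with state model x[k] = x[k-1] + w, z[k] = x[k] + v.

    q: ASSUMED process-noise variance, r: ASSUMED measurement-noise variance.
    Returns the filtered state at every step. Under a random-walk model the
    one-step-ahead forecast equals the most recent filtered state.
    """
    z = np.asarray(z, dtype=float)
    x, p = z[0], 1.0                 # initial state estimate and its variance
    filtered = np.empty_like(z)
    for k, zk in enumerate(z):
        p = p + q                    # predict: state unchanged, variance grows
        gain = p / (p + r)           # Kalman gain
        x = x + gain * (zk - x)      # update with the new measurement
        p = (1.0 - gain) * p
        filtered[k] = x
    return filtered

# Synthetic stand-in for a per-second feature series (e.g. one Hjorth parameter).
rng = np.random.default_rng(0)
true_level = 5.0
z = true_level + rng.standard_normal(500)
filtered = kalman_random_walk(z)
print("forecast for next step:", filtered[-1])
```

A small `q` relative to `r` makes the filter trust its running estimate more than any single measurement, which smooths the series at the cost of reacting more slowly to genuine changes.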

6.4. Limitations

A limitation of Study 1 was that the experimental scenario of the sleep study was not similar to driving. The subjects were in a supine position and were trying to fall asleep without much body movement. In driving, by contrast, the driver is seated and trying to stay alert, and there are body movements due to driving.

Since Study 2 only mimics the conditions of driving, its results cannot be directly extrapolated to driver drowsiness detection without validation in a car or driving simulator. However, given the cognitively engaging reaction time task with strong visual motion stimulation, it can be anticipated that the EEG data collected in a driving simulator-based study will be similar to those of the reaction time study, except for additional movement noise in the EEG due to driving.

The participants in Study 2 were asked to try to stay alert and perform the task to the best of their ability. However, it is possible that some of the participants did not feel motivated to stay alert and perform the task. This is unlike driving, where the driver is more motivated to stay alert to avoid accidents. This is also a limitation of Study 2.

Even though the algorithms developed in this thesis are single-channel based and have the potential for implementation in a wearable EEG headband, the resulting drowsiness detection systems, unlike camera-based or vehicle parameter-based systems, are still obtrusive. Lastly, although the Hjorth parameter-based scheme gives high accuracy, sensitivity, and specificity, the performance metric values are still below 100%.

Chapter 7: Conclusions and Future Directions

In this thesis, we have developed a single-channel EEG-based high-resolution drowsiness detection algorithm. In the future, the algorithm must be validated in a driving simulator-based study. Further optimization of the video rating scale could also be done in future research. Furthermore, cardiorespiratory signals can also be investigated to develop algorithms that can potentially yield higher accuracy, sensitivity, and specificity. Moreover, more advanced machine learning algorithms such as deep learning can be applied to improve the performance of the proposed time-domain feature-based algorithm. This work is the first step toward a highly accurate, high-resolution, and efficient drowsy driving detection system that will greatly benefit populations at higher risk of drowsy driving-related car crashes, including shift workers, patients with sleep-related disorders that induce daytime sleepiness (such as obstructive sleep apnea), individuals who take sedative medications, and occupational drivers. Since the algorithm is high-resolution and fast and gives high detection accuracy and sensitivity, upon its validation in a driving study, it can be commercialized as a drowsiness detection system that will benefit the target population. If a convenient and reliable drowsy driving detection system is developed, it can be used to reduce drowsy driving or fatigue-related car crashes. Furthermore, even though this thesis focuses on drowsiness only in the context of vehicle driving, a drowsiness detection system, once developed, will be useful for target populations that are not related to driving, such as mining workers, pilots, and locomotive operators.

References

[1] "Global Status Report on Road Safety 2015," 2015.

[2] P. Rau, "Drowsy Driver Detection and Warning System for Commercial Vehicle Drivers: Field Operational Test Design, Analysis, and Progress," 2005.

[3] M. I. Chacon-Murguia and C. Prieto-Resendiz, "Detecting Driver Drowsiness: A survey of system designs and technology," IEEE Consumer Electronics Magazine, vol. 4, pp. 107-119, 2015.

[4] A. Sahayadhas, K. Sundaraj, and M. Murugappan, "Detecting driver drowsiness based on sensors: a review," Sensors (Basel), vol. 12, pp. 16937-53, Dec 07 2012.

[5] (October 09, 2017). Audi Rest Recommendation System. Available: https://www.audi-mediaservices.com/publish/ms/content/en/public/hintergrundberichte/2012/03/05/a_statement_about/driver_assistance.html

[6] (October 09, 2017). BMW Driver Assistant. Available: https://www.bmw.ca/en/topics/experience/connected-drive/BMW%20ConnectedDrive:%20Driver%20Assistance%20.html

[7] (October 09, 2017). Bosch Driver Drowsiness Detection System.

[8] (October 09, 2017). Volvo. Available: https://www.media.volvocars.com/global/en-gb/media/pressreleases/12130

[9] D. Sommer and M. Golz, "Evaluation of PERCLOS based current fatigue monitoring technologies," in 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, 2010, pp. 4456-4459.

[10] R. Simons, M. Martens, J. Ramaekers, A. Krul, I. Klöpping-Ketelaars, and G. Skopp, "Effects of dexamphetamine with and without alcohol on simulated driving," Psychopharmacology, vol. 222, pp. 391-399, August 01 2012.

[11] D. Das, S. Zhou, and J. D. Lee, "Differentiating Alcohol-Induced Driving Behavior Using Steering Wheel Signals," IEEE Transactions on Intelligent Transportation Systems, vol. 13, pp. 1355-1368, 2012.

[12] M. A. J. Mets, E. Kuipers, L. M. de Senerpont Domis, M. Leenders, B. Olivier, and J. C. Verster, "Effects of alcohol on highway driving in the STISIM driving simulator," Human Psychopharmacology: Clinical and Experimental, vol. 26, pp. 434-439, 2011.

[13] R. D. Ogilvie, "The process of falling asleep," Sleep Medicine Reviews, vol. 5, pp. 247-270, 2001.

[14] M. J. Prerau, R. E. Brown, M. T. Bianchi, J. M. Ellenbogen, and P. L. Purdon, "Sleep Neurophysiological Dynamics Through the Lens of Multitaper Spectral Analysis," Physiology, vol. 32, pp. 60-92, 2017.

[15] M. J. Prerau, K. E. Hartnack, G. Obregon-Henao, A. Sampson, M. Merlino, K. Gannon, et al., "Tracking the sleep onset process: an empirical model of behavioral and physiological dynamics," PLoS Comput Biol, vol. 10, p. e1003866, Oct 2014.

[16] T. Nguyen, S. Ahn, H. Jang, S. C. Jun, and J. G. Kim, "Utilization of a combined EEG/NIRS system to predict driver drowsiness," Scientific Reports, vol. 7, p. 43933, 2017.

[17] M. Patel, S. K. L. Lal, D. Kavanagh, and P. Rossiter, "Applying neural network analysis on heart rate variability data to assess driver fatigue," Expert Systems with Applications, vol. 38, pp. 7235-7242, 2011.

[18] J. Vicente, P. Laguna, A. Bartra, and R. Bailón, "Detection of driver's drowsiness by means of HRV analysis," in 2011 Computing in Cardiology, 2011, pp. 89-92.

[19] J. Vicente, P. Laguna, A. Bartra, and R. Bailón, "Drowsiness detection using heart rate variability," Medical & Biological Engineering & Computing, vol. 54, pp. 927-937, June 01 2016.

[20] M. Akin, M. B. Kurt, N. Sezgin, and M. Bayram, "Estimating vigilance level by using EEG and EMG signals," Neural Computing and Applications, vol. 17, pp. 227-236, June 01 2008.

[21] I. Takahashi, T. Takaishi, and K. Yokoyama, "Overcoming Drowsiness by Inducing Cardiorespiratory Phase Synchronization," IEEE Transactions on Intelligent Transportation Systems, vol. 15, pp. 982-991, 2014.

[22] H. Su and G. Zheng, "A Partial Least Squares Regression-Based Fusion Model for Predicting the Trend in Drowsiness," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 38, pp. 1085-1092, 2008.

[23] J. A. Healey and R. W. Picard, "Detecting stress during real-world driving tasks using physiological sensors," IEEE Transactions on Intelligent Transportation Systems, vol. 6, pp. 156-166, 2005.

[24] A. Garces Correa, L. Orosco, and E. Laciar, "Automatic detection of drowsiness in EEG records based on multimodal analysis," Med Eng Phys, vol. 36, pp. 244-9, Feb 2014.

[25] C. T. Lin, C. J. Chang, B. S. Lin, S. H. Hung, C. F. Chao, and I. J. Wang, "A Real-Time Wireless Brain–Computer Interface System for Drowsiness Detection," IEEE Transactions on Biomedical Circuits and Systems, vol. 4, pp. 214-222, 2010.

[26] C.-T. Lin, R.-C. Wu, S.-F. Liang, W.-H. Chao, Y.-J. Chen, and T.-P. Jung, "EEG-based drowsiness estimation for safety driving using independent component analysis," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 52, pp. 2726-2738, 2005.

[27] F. C. Lin, L. W. Ko, C. H. Chuang, T. P. Su, and C. T. Lin, "Generalized EEG-Based Drowsiness Prediction System by Using a Self-Organizing Neural Fuzzy System," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 59, pp. 2044-2055, 2012.

[28] M. T. R. Peiris, P. R. Davidson, P. J. Bones, and R. D. Jones, "Detection of lapses in responsiveness from the EEG," Journal of Neural Engineering, vol. 8, p. 016003, 2011.

[29] M. Matousek and I. Petersén, "A method for assessing alertness fluctuations from EEG spectra," Electroencephalography and Clinical Neurophysiology, vol. 55, pp. 108-113, 1983.

[30] S. Otmani, T. Pebayle, J. Roge, and A. Muzet, "Effect of driving duration and partial sleep deprivation on subsequent alertness and performance of car drivers," Physiology & Behavior, vol. 84, pp. 715-724, 2005.

[31] A. Vuckovic, V. Radivojevic, A. C. N. Chen, and D. Popovic, "Automatic recognition of alertness and drowsiness from EEG by an artificial neural network," Medical Engineering & Physics, vol. 24, pp. 349-360, 2002.

[32] Y.-T. Wang, K.-C. Huang, C.-S. Wei, T.-Y. Huang, L.-W. Ko, C.-T. Lin, et al., "Developing an EEG-based on-line closed-loop lapse detection and mitigation system," Frontiers in Neuroscience, vol. 8, 2014.

[33] M. V. M. Yeo, X. Li, K. Shen, and E. P. V. Wilder-Smith, "Can SVM be used for automatic EEG detection of drowsiness during car driving?," Safety Science, vol. 47, pp. 115-124, 2009.

[34] L. M. Bergasa, J. Nuevo, M. A. Sotelo, R. Barea, and M. E. Lopez, "Real-time system for monitoring driver vigilance," IEEE Transactions on Intelligent Transportation Systems, vol. 7, pp. 63-77, 2006.

[35] T. D’Orazio, M. Leo, C. Guaragnella, and A. Distante, "A visual approach for driver inattention detection," Pattern Recognition, vol. 40, pp. 2341-2355, 2007.

[36] M. J. Flores, J. M. Armingol, and A. de la Escalera, "Driver drowsiness detection system under infrared illumination for an intelligent vehicle," IET Intelligent Transport Systems, vol. 5, pp. 241-251, 2011.

[37] M. H. Silber, S. Ancoli-Israel, M. H. Bonnet, S. Chokroverty, M. M. Grigg-Damberger, M. Hirshkowitz, et al., "The visual scoring of sleep in adults," J Clin Sleep Med, vol. 3, pp. 121-31, Mar 15 2007.

[38] T. L. T. da Silveira, A. J. Kozakevicius, and C. R. Rodrigues, "Automated drowsiness detection through wavelet packet analysis of a single EEG channel," Expert Systems with Applications, vol. 55, pp. 559-565, 2016.

[39] R. R. Johnson, D. P. Popovic, R. E. Olmstead, M. Stikic, D. J. Levendowski, and C. Berka, "Drowsiness/alertness algorithm development and validation using synchronized EEG and cognitive performance to individualize a generalized model," Biological Psychology, vol. 87, pp. 241-250, 2011.

[40] C. Papadelis, Z. Chen, C. Kourtidou-Papadeli, P. D. Bamidis, I. Chouvarda, E. Bekiaris, et al., "Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents," Clinical Neurophysiology, vol. 118, pp. 1906-1922, 2007.

[41] A. A. Putilov and O. G. Donskaya, "Construction and validation of the EEG analogues of the Karolinska sleepiness scale based on the Karolinska drowsiness test," Clinical Neurophysiology, vol. 124, pp. 1346-1352, 2013.

[42] A. S. Aghaei, B. Donmez, C. C. Liu, D. He, G. Liu, K. N. Plataniotis, et al., "Smart Driver Monitoring: When Signal Processing Meets Human Factors: In the driver's seat," IEEE Signal Processing Magazine, vol. 33, pp. 35-48, 2016.

[43] Y. Dong, Z. Hu, K. Uchimura, and N. Murayama, "Driver Inattention Monitoring System for Intelligent Vehicles: A Review," IEEE Transactions on Intelligent Transportation Systems, vol. 12, pp. 596-614, 2011.

[44] B.-C. Yin, X. Fan, and Y.-F. Sun, "Multiscale dynamic features based driver fatigue detection," International Journal of Pattern Recognition and Artificial Intelligence, vol. 23, pp. 575-589, 2009.

[45] P. Philip, P. Sagaspe, N. Moore, J. Taillard, A. Charles, C. Guilleminault, et al., "Fatigue, sleep restriction and driving performance," Accident Analysis & Prevention, vol. 37, pp. 473-478, 2005.

[46] R. Tremaine, J. Dorrian, L. Lack, N. Lovato, S. Ferguson, X. Zhou, et al., "The relationship between subjective and objective sleepiness and performance during a simulated night-shift with a nap countermeasure," Applied Ergonomics, vol. 42, pp. 52-61, 2010.

[47] G. Matthews, S. E. Campbell, S. Falconer, L. A. Joyner, J. Huggins, K. Gilliland, et al., "Fundamental dimensions of subjective state in performance settings: Task engagement, distress, and worry," Emotion, vol. 2, pp. 315-340, 2002.

[48] M. Ingre, T. Åkerstedt, B. Peters, A. Anund, and G. Kecklund, "Subjective sleepiness, simulated driving performance and blink duration: examining individual differences," Journal of Sleep Research, vol. 15, pp. 47-53, 2006.

[49] R. N. Khushaba, S. Kodagoda, S. Lal, and G. Dissanayake, "Driver Drowsiness Classification Using Fuzzy Wavelet-Packet-Based Feature-Extraction Algorithm," IEEE Transactions on Biomedical Engineering, vol. 58, pp. 121-131, 2011.

[50] S. Hu and G. Zheng, "Driver drowsiness detection with eyelid related parameters by Support Vector Machine," Expert Systems with Applications, vol. 36, pp. 7651-7658, 2009.

[51] M. B. Kurt, N. Sezgin, M. Akin, G. Kirbas, and M. Bayram, "The ANN-based computing of drowsy level," Expert Systems with Applications, vol. 36, pp. 2534-2542, 2009.

[52] G. R. Poudel, C. R. Innes, P. J. Bones, R. Watts, and R. D. Jones, "Losing the struggle to stay awake: divergent thalamic and cortical activity during microsleeps," Hum Brain Mapp, vol. 35, pp. 257-69, Jan 2014.

[53] S. M. S. Alam and M. I. H. Bhuiyan, "Detection of Seizure and Epilepsy Using Higher Order Statistics in the EMD Domain," IEEE Journal of Biomedical and Health Informatics, vol. 17, pp. 312-318, 2013.

[54] M. J. Flores, J. M. Armingol, and A. de la Escalera, "Driver drowsiness warning system using visual information for both diurnal and nocturnal illumination conditions," EURASIP J. Adv. Signal Process, vol. 2010, pp. 1-19, 2010.

[55] D. Sommer, M. Golz, U. Trutschel, and D. Edwards, "Biosignal Based Discrimination between Slight and Strong Driver Hypovigilance by Support-Vector Machines," in Agents and Artificial Intelligence: International Conference, ICAART 2009, Porto, Portugal, January 19-21, 2009. Revised Selected Papers, J. Filipe, A. Fred, and B. Sharp, Eds., ed Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 177-187.

[56] S. R. Jagannathan, A. Ezquerro-Nassar, B. Jachs, O. V. Pustovaya, C. A. Bareham, and T. A. Bekinschtein, "Tracking wakefulness as it fades: Micro-measures of alertness," NeuroImage, vol. 176, pp. 138-151, 2018.

[57] (July 24, 2018). http://inside.volkswagen.com/Take-a-break.html.

[58] N. Edenborough, R. Hammoud, A. Harbach, A. Ingold, B. Kisacanin, P. Malawey, et al., "Driver state monitor from DELPHI," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, pp. 1206-1207 vol. 2.

[59] (October 09, 2017). Acumine. Available: http://www.acumine.com/

[60] D. J. Edwards, B. Sirois, T. Dawson, A. Aguirre, B. Davis, and U. Trutschel, Evaluation of Fatigue Management Technologies Using Weighted Feature Matrix Method, 2017.

[61] (July 24, 2018). http://www.seeingmachines.com/.

[62] (October 09, 2017). Smart Eye. Available: http://smarteye.se/

[63] (July 24, 2018). https://www.siemens.com/global/en/home.html.

[64] (July 24, 2018). https://www.ospat.com/.

[65] (July 24, 2018). https://www.mobileye.com/.

[66] (July 24, 2018). https://www.nissanusa.com/experience-nissan/news-and-events/drowsy-driver-attention-alert-car-feature.html.

[67] (July 21, 2018). https://www.slideshare.net/elaghoury/eeg-for-sleep-lab.

[68] A. Azarbarzin, M. Ostrowski, P. Hanly, and M. Younes, "Relationship between Arousal Intensity and Heart Rate Response to Arousal," Sleep, vol. 37, pp. 645-653, 2014.

[69] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, 2002.

[70] L. Breiman, "Random Forests," Machine Learning, vol. 45, pp. 5-32, 2001.

[71] D. L. Davies and D. W. Bouldin, "A Cluster Separation Measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, pp. 224-227, 1979.

[72] P. J. Rousseeuw, "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65, 1987.

[73] C. M. Morin, C. L. Drake, A. G. Harvey, A. D. Krystal, R. Manber, D. Riemann, et al., "Insomnia disorder," Nature Reviews Disease Primers, vol. 1, p. 15026, 2015.

[74] B. R. Kornum, S. Knudsen, H. M. Ollila, F. Pizza, P. J. Jennum, Y. Dauvilliers, et al., "Narcolepsy," Nature Reviews Disease Primers, vol. 3, p. 16100, 2017.

[75] C. Bishop, Pattern Recognition and Machine Learning: Springer-Verlag New York, 2006.

[76] D. C. Dolan, D. J. Taylor, R. Okonkwo, P. M. Becker, A. O. Jamieson, W. Schmidt-Nowara, et al., "The Time of Day Sleepiness Scale to assess differential levels of sleepiness across the day," Journal of Psychosomatic Research, vol. 67, pp. 127-133, 2009.

[77] L. J. Hettinger, T. Schmidt, D. L. Jones, and B. Keshavarz, "Illusory self-motion in virtual environments," Boca Raton, FL: CRC Press, 2014.

[78] S. Palmisano, R. S. Allison, M. M. Schira, and R. J. Barry, "Future challenges for vection research: definitions, functional significance, measures, and neural bases," Frontiers in Psychology, vol. 6, 2015.

[79] M. Basner and D. F. Dinges, "Maximizing Sensitivity of the Psychomotor Vigilance Test (PVT) to Sleep Loss," Sleep, vol. 34, pp. 581-591, 2011.

[80] (July 21, 2018). https://en.wikipedia.org/wiki/10%E2%80%9320_system_(EEG).

[81] W. W. Wierwille and L. A. Ellsworth, "Evaluation of driver drowsiness by trained raters," Accident Analysis & Prevention, vol. 26, pp. 571-581, 1994.

[82] A. Delorme and S. Makeig, "EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis," Journal of Neuroscience Methods, vol. 134, pp. 9-21, 2004.

[83] J. Fell, J. Röschke, K. Mann, and C. Schäffner, "Discrimination of sleep stages: a comparison between spectral and nonlinear EEG measures," Electroencephalography and Clinical Neurophysiology, vol. 98, pp. 401-410, 1996.

[84] T. Penzel and R. Conradt, "Computer based sleep recording and analysis," Sleep Medicine Reviews, vol. 4, pp. 131-148, 2000.

[85] I. J. Rampil, "A Primer for EEG Signal Processing in Anesthesia," Anesthesiology, vol. 89, pp. 980-1002, 1998.

[86] B. Hjorth, "EEG analysis based on time domain properties," Electroencephalography and Clinical Neurophysiology, vol. 29, pp. 306-310, 1970.

[87] L. Breiman, "Bagging Predictors," Machine Learning, vol. 24, pp. 123-140, August 01 1996.

[88] R. Polikar, "Ensemble based systems in decision making," IEEE Circuits and Systems Magazine, vol. 6, pp. 21-45, 2006.

[89] A. K. Tripathy, S. Chinara, and M. Sarkar, "An application of wireless brain–computer interface for drowsiness detection," Biocybernetics and Biomedical Engineering, vol. 36, pp. 276-284, 2016.

[90] U. Friese, J. Daume, F. Göschl, P. König, P. Wang, and A. K. Engel, "Oscillatory brain activity during multisensory attention reflects activation, disinhibition, and cognitive control," Scientific Reports, vol. 6, p. 32775, 2016.

[91] B. Keshavarz, J. L. Campos, and S. Berti, "Vection lies in the brain of the beholder: EEG parameters as an objective measurement of vection," Frontiers in Psychology, vol. 6, 2015.

[92] C. B. Saper, T. E. Scammell, and J. Lu, "Hypothalamic regulation of sleep and circadian rhythms," Nature, vol. 437, p. 1257, 2005.

[93] M. M. Schartner, A. Pigorini, S. A. Gibbs, G. Arnulfo, S. Sarasso, L. Barnett, et al., "Global and local complexity of intracranial EEG decreases during NREM sleep," Neuroscience of Consciousness, vol. 2017, pp. niw022-niw022, 2017.

[94] U. R. Acharya, S. Bhat, O. Faust, H. Adeli, E. C. P. Chua, W. J. E. Lim, et al., "Nonlinear Dynamics Measures for Automated EEG-Based Sleep Stage Detection," European Neurology, vol. 74, pp. 268-287, 2015.

[95] D. L. Woods, J. M. Wyma, E. W. Yund, T. J. Herron, and B. Reed, "Age-related slowing of response selection and production in a visual choice reaction time task," Frontiers in Human Neuroscience, vol. 9, 2015.

[96] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, et al., "The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis," Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, vol. 454, p. 903, 1998.

[97] I. Daubechies, "The wavelet transform, time-frequency localization and signal analysis," IEEE Transactions on Information Theory, vol. 36, pp. 961-1005, 1990.

Appendix A4: Results from F3-M2

Here we present the results of the sigmoid wake probability model for the F3-M2 electrode on the sleep-EEG data. Table A4.1 summarizes the number of 3-s segments used in this study for model development and validation.

Table A4.1: Average and standard deviation of the number of 3-s segments from F3-M2 used in this study for model development and validation.

Stage         Number of Segments per Subject (mean ± SD)
Arousal       6 ± 1
Deep sleep    38 ± 5
Non-REM 1     388 ± 86
Awake         1438 ± 212

Table A4.2 shows the sigmoid parameters computed from the training data for the F3-M2 electrode. The resultant sigmoid functions and the feature weights are shown in Fig. A4.1. It is also evident from Fig. A4.1 (b) that the feature weights in frontal electrodes are similar.

Table A4.2: Sigmoid parameters computed from the training data for F3-M2

Frequency band    a        b
Alpha             0.017    0.157
Beta              0.045    0.413
Delta             0.172    0.919

Figure A4.1: (a) The resultant sigmoid functions for the three features for F3-M2. (b) Out-of-bag (OOB) permuted predictor delta error for the three features computed from the training data.

Figure A4.2: Distributions of relative power of (a) alpha, (b) beta, and (c) delta for three ranges of Pr(W) for F3-M2. Here, all non-REM 1 to wakefulness transitions and vice versa are considered. The segments with low Pr(W) (sleep cluster, Pr(W) < 28) have low alpha and beta power and high delta power, while those with high Pr(W) (awake cluster, Pr(W) > 55) have high alpha and beta power and low delta power.

The quality of the clusters and the choice of model parameters are further validated by the feature distributions for the F3-M2 electrode when the model is applied to all non-REM 1 to wakefulness transitions of the testing subjects. Prior studies [13, 15] have shown that as an individual goes from wakefulness to non-REM 1, alpha and beta power decrease and delta power increases. Thus, if Pr(W) is low, the relative power of the alpha and beta bands should also be low, which is what Fig. A4.2 (a) and Fig. A4.2 (b) show; the relative power of alpha and beta is higher for high Pr(W). The opposite pattern is seen for delta in Fig. A4.2 (c). These consistent feature distributions therefore support the choice of sigmoid parameters and the efficacy of the sigmoid wake probability model.
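
The grouping behind Fig. A4.2 can be sketched as follows. This is only an illustration: it assumes Pr(W) is expressed on the same scale as the thresholds 28 and 55 quoted in the caption, the label "intermediate" for the middle range is a hypothetical name, and the numbers at the bottom are made up rather than taken from the study.

```python
def assign_cluster(pr_w, sleep_below=28.0, awake_above=55.0):
    """Assign a segment to a cluster from its wake probability Pr(W)."""
    if pr_w < sleep_below:
        return "sleep"
    if pr_w > awake_above:
        return "awake"
    return "intermediate"  # hypothetical name for the middle range


def mean_feature_by_cluster(pr_w_values, feature_values):
    """Average one feature (e.g. relative delta power) within each cluster."""
    sums, counts = {}, {}
    for pr_w, value in zip(pr_w_values, feature_values):
        label = assign_cluster(pr_w)
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


# Made-up values for illustration only (not data from the study).
example_pr_w = [10, 20, 40, 60, 90]
example_delta = [0.8, 0.6, 0.4, 0.2, 0.1]
example_means = mean_feature_by_cluster(example_pr_w, example_delta)
```

With real data, a higher mean relative delta power in the sleep cluster than in the awake cluster would correspond to the pattern described for Fig. A4.2 (c).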

Appendix A5: Gamma and Theta Band Power Changes in Reaction Time Study

Figure A5.1: Variation of relative power of the gamma (30-100 Hz) band (top panel), the corresponding button-press data (middle panel), and facial video rating (bottom panel). Relative power of the gamma band is high when the participant is alert and low when the participant is drowsy.

Fig. A5.1 (top panel) shows the relative power of the gamma band (30-100 Hz) over an entire block for one participant. The button-press data (middle panel) and the facial video rating (bottom panel) of the same block are also shown in Fig. A5.1. Fig. A5.1 reveals that the relative power of the gamma band follows the gold standards, yielding high values in alert episodes and low values in drowsy episodes. Therefore, the gamma band was used in the modified sigmoid wake probability model.
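
As a rough illustration of the relative band power feature, the sketch below computes the fraction of spectral power that falls in a band. It is not the thesis implementation: the naive DFT, the total band of 0.5-100 Hz used for normalization, the sampling rate, and the synthetic test signal are all assumptions made for this example.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided power spectrum of x via a naive O(N^2) DFT (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def relative_band_power(x, fs, band, total_band=(0.5, 100.0)):
    """Power in `band` divided by power in `total_band` (assumed range)."""
    freqs, power = power_spectrum(x, fs)
    in_band = sum(p for f, p in zip(freqs, power) if band[0] <= f < band[1])
    total = sum(p for f, p in zip(freqs, power)
                if total_band[0] <= f <= total_band[1])
    return in_band / total if total > 0 else 0.0


# Synthetic 1-s signal (assumed fs = 256 Hz): a 6 Hz component with
# amplitude 1 and a 40 Hz component with amplitude 2.
fs = 256
signal = [math.sin(2 * math.pi * 6 * i / fs)
          + 2.0 * math.sin(2 * math.pi * 40 * i / fs) for i in range(fs)]
theta_rel = relative_band_power(signal, fs, (4.0, 8.0))
gamma_rel = relative_band_power(signal, fs, (30.0, 100.0))
```

In practice the feature would be computed per short EEG segment, and a faster spectral estimator (for example an FFT-based one) would replace the naive DFT.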

In the EEG data of the reaction time study, changes in theta (4-8 Hz) band power were also among the most prominent. Fig. A5.2 (top panel) shows the relative power of the theta band over an entire block for one participant. The button-press data (middle panel) and the facial video rating (bottom panel) of the same block are also shown in Fig. A5.2. Fig. A5.2 reveals that the relative power of the theta band follows the gold standards, yielding high values in alert episodes and low values in drowsy episodes. Therefore, the theta band was used in the modified sigmoid wake probability model.

Figure A5.2 Variation of relative power of theta (4-8 Hz) band (top panel), the corresponding button-press data

(middle panel), and facial video rating (bottom panel). Relative power of theta band is low when the participant is

alert and low when the participant is drowsy.

Appendix B5: Mobility Parameter

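Here we show how mobility can be interpreted in the frequency domain. Let x(t) be a zero-mean signal with Fourier transform X(ω).

Then, up to a normalization constant, the variance of x(t) is

$$\sigma^2 = \text{Energy of } x(t) = \text{Area under the power spectrum of } x(t) = \int |X(\omega)|^2 \, d\omega \tag{B5.1}$$

The Fourier transform of dx/dt is jωX(ω), so its power spectrum is |jωX(ω)|² = ω²|X(ω)|². Therefore,

$$\text{Variance of } \frac{dx}{dt} = \int \omega^2 \, |X(\omega)|^2 \, d\omega \tag{B5.2}$$

Mobility is the square root of the ratio of (B5.2) to (B5.1):

$$\text{Mobility of } x(t) = \sqrt{\frac{\text{(B5.2)}}{\text{(B5.1)}}} = \sqrt{\int \omega^2 \, \frac{|X(\omega)|^2}{\sigma^2} \, d\omega} \tag{B5.3}$$

The fraction |X(ω)|²/σ² is the normalized power spectrum of x(t). Therefore, mobility represents the frequency standard deviation of the power spectrum.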
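
The frequency-domain interpretation can be checked numerically. The sketch below compares the time-domain definition of mobility (square root of the variance of the derivative over the variance of the signal) with the square root of the second moment of the normalized power spectrum. The circular first difference, the naive DFT, the sampling rate, and the test sinusoid are assumptions made for this example only.

```python
import cmath
import math


def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def mobility_time(x, fs):
    """sqrt(var(dx/dt) / var(x)), with dx/dt from a circular first difference."""
    n = len(x)
    dx = [(x[(i + 1) % n] - x[i]) * fs for i in range(n)]
    return math.sqrt(variance(dx) / variance(x))


def mobility_spectral(x, fs):
    """sqrt of the second moment of the normalized two-sided power spectrum."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    num = den = 0.0
    for k in range(n):
        signed_k = k if k <= n // 2 else k - n      # signed frequency index
        omega = 2 * math.pi * signed_k * fs / n     # angular frequency, rad/s
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        p = abs(coeff) ** 2
        num += omega ** 2 * p
        den += p
    return math.sqrt(num / den)


# Assumed example: a 5 Hz sinusoid sampled at 200 Hz for 2 s.
fs, f0 = 200, 5
x = [math.sin(2 * math.pi * f0 * i / fs) for i in range(2 * fs)]
m_time = mobility_time(x, fs)
m_spec = mobility_spectral(x, fs)
```

For a pure sinusoid both values are close to its angular frequency 2π·5 rad/s; the small gap between them comes from approximating the derivative by a finite difference.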

Appendix C5: Cardiorespiratory Signal Based Drowsiness Detection Algorithm

The proposed method attempts to detect drowsy episodes using respiratory inductance plethysmography (RIP) signals. A schematic outline of the proposed scheme is shown in Fig. C5.1.

Figure C5.1: A schematic outline of the proposed method.

First, the RIP signals were filtered using a low-pass filter with a cut-off frequency of 0.6 Hz. Next, the data were divided into 10-s segments. If a 10-s segment contained at least one episode of drowsiness (i.e., at least 1 s of drowsiness), the segment was rated as drowsy. The variance, skewness, kurtosis, and average mean to max features were then extracted from each 10-s segment. Then 60% of the participants’ data were selected for training and the remainder for testing. The training data were used to train a random forest classifier, which classified the segments as drowsy or non-drowsy. Over 100 runs, the accuracy, sensitivity, and specificity were 85.31 ± 3.87%, 73.45 ± 18.97%, and 90.04 ± 6.35%, respectively. The maximum accuracy, sensitivity, and specificity over the 100 runs were 90.35%, 91.76%, and 99.87%, respectively.
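
The segmentation, labelling, and evaluation steps described above might look roughly like the sketch below. It is an illustration under several assumptions: the low-pass filter and the random forest are omitted, the "average mean to max" feature is left out because its definition is not given in this appendix, "at least 1 s of drowsiness" is interpreted as the total drowsy time within a segment, drowsy is treated as the positive class, and all function and parameter names are hypothetical.

```python
def segment_features(seg):
    """Variance, skewness, and kurtosis (population moments) of one segment."""
    n = len(seg)
    mean = sum(seg) / n
    dev = [v - mean for v in seg]
    var = sum(d * d for d in dev) / n
    if var == 0:
        return {"variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    skew = (sum(d ** 3 for d in dev) / n) / var ** 1.5
    kurt = (sum(d ** 4 for d in dev) / n) / var ** 2
    return {"variance": var, "skewness": skew, "kurtosis": kurt}


def segment_and_label(signal, drowsy_mask, fs, seg_s=10, min_drowsy_s=1):
    """Split into non-overlapping segments; label a segment drowsy (True)
    when it contains at least min_drowsy_s seconds of drowsy samples."""
    seg_len = int(seg_s * fs)
    min_len = int(min_drowsy_s * fs)
    rows = []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        mask = drowsy_mask[start:start + seg_len]
        rows.append((segment_features(seg), sum(mask) >= min_len))
    return rows


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity with drowsy (True) as positive."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up 30-s recording at an assumed 10 Hz: 1.5 s of drowsiness in the
# second segment and only 0.5 s in the third.
fs = 10
signal = [float(i % 7) for i in range(30 * fs)]
mask = [105 <= i < 120 or 250 <= i < 255 for i in range(30 * fs)]
labels = [label for _, label in segment_and_label(signal, mask, fs)]
```

The resulting feature dictionaries and labels would then be passed to a classifier, with the train/test split made by participant as described above.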