
Work-in-progress, PupilWare-M: Cognitive Load Estimation Using Unmodified Smartphone Cameras

Sohail Rafiqi(1), Chatchai Wangwiwattana(1), Ephrem Fernandez(2), Suku Nair(1), Eric Larson(1)

(1) Computer Science and Engineering, Southern Methodist University, Dallas, TX
(2) Department of Psychology, University of Texas, San Antonio, TX

Abstract—Cognitive load refers to the amount of information a person can process or hold in working memory. Historically, the psychology community has estimated this quantity objectively by monitoring the involuntary dilations and constrictions of the pupil using medical grade equipment known as pupillometers. At the same time, researchers in the HCI and UbiComp communities have hypothesized how cognitive load sensing might be integrated into context aware computing systems, but limitations in sensing cognitive load ubiquitously and reliably prevent the mass integration of such a technology. Our system, PupilWare-M, seeks to begin bridging this sensing gap. We build upon a recent platform, PupilWare, which measures a user's sub-millimeter pupil dilation from an unmodified camera. We update the PupilWare sensing system with a calibration protocol that brings the pupillary responses of a diverse range of people and lighting conditions onto a single 0.0-1.0 scale called CogPoint. Furthermore, we update and optimize the algorithms to run in real time on a smartphone. We validate the calibration process using eight users in a controlled experiment where cognitive load is simple to determine from its situational context. We conclude with a discussion of future work and remaining challenges.

Keywords—pupillary response; cognitive load; sensing; image processing; context aware

I. INTRODUCTION

Ubiquitous and mobile sensing systems have seen dramatic accomplishments in recent years: sensing heart rate with cameras [13], stress levels with wearable technology [11], and even how interruptible an individual might be [12]. However, one aspect of ubiquitous sensing remains elusive: user context, i.e., understanding a user's decision-making ability, affective states, and cognitive load [1]. It is difficult to imagine methods for measuring these quantities with both reliability and scalability (non-intrusive, low cost, etc.). However, the cognitive psychology community has estimated these quantities in situational contexts for decades using pupillary response. Much research has been done to understand and validate the correlation between pupillary responses and a user's cognitive and affective states, showing that involuntary pupil dilation increases proportionally with cognitive load but can also change with affective state or sexual arousal [2]–[4], [7]–[9]. Nearly all of this previous research on detecting pupillary responses uses infrared light sources and high-resolution imaging, including remote gaze trackers [2], [18]. While gaze trackers provide accurate measurements of pupil response, they are not yet used ubiquitously, especially in mobile contexts. To this end, we present PupilWare-M, a system that analyzes pupillary response in real time and determines cognitive load using unmodified smartphone cameras.

PupilWare-M builds upon previous research called PupilWare [14]. That system was a feasibility study for measuring pupil responses from a webcam. The present study makes three contributions over the original feasibility study:

• (1) Algorithms are refined to run in real time on a smartphone and are evaluated with eight users.

• We evaluate a custom calibration procedure that has two purposes: (2) optimizing signal processing parameters for different lighting and eye color, and (3) scaling and projecting the pupil dilation from various users onto a unified axis that estimates cognitive load similarly across users.

PupilWare-M uses the embedded smartphone camera to segment, enhance, and estimate the sub-millimeter dilations of a user's pupil while they interact with a smartphone. These dilations are then mapped onto a new axis, called CogPoint, that can be used across users for ascertaining cognitive load in simple contexts. We validate PupilWare-M against a high-resolution gaze tracker using a classic experiment from cognitive psychology: the digit span task [15][18].

Figure 1—PupilWare-M stages running on iPhone 5S


Pilot experiments are first run offline on four users to verify that real-time pupil extraction can be performed from a smartphone. These "offline" findings are used to design a real-time PupilWare-M prototype, which we evaluate on eight different users (an "online" experiment). We conclude that, in a normal room environment, PupilWare-M is as accurate as an infrared-based gaze tracker for assessing situational cognitive load. Furthermore, a simple calibration allows many users to map to the same cognitive load scale, which is important for future context aware systems that would use PupilWare-M. Though significant research is still required to claim such a technology could be used ubiquitously, PupilWare-M is a first step toward realizing context aware, cognitive monitoring systems.

II. BACKGROUND AND MOTIVATION

Pupil response has been studied in a variety of situational contexts, with some remarkable insights into human cognition: understanding human intention [9], efficient human-computer interaction [8], understanding the impact of interruptions on productivity [17], identifying emotional changes in the user [6], and understanding how people reach decisions [1]. These works were informed by a vast body of previous research [2]–[4], [18] that studied pupillary responses using infrared gaze trackers (either medical grade devices known as pupillometers, ~$4000 USD, or high-resolution eye trackers, ~$700 USD). Because of this research, it is generally accepted that (1) pupillary responses accurately reflect the workload induced by a task and (2) pupillary responses change in response to a user's emotional state [19]. This pupillary change may not reveal which emotion is felt, but it signifies that an emotional transition occurred [19].

PupilWare-M is focused on using a mobile device to assess pupillary response in real time, but not at the expense of practicality. PupilWare-M requires no augmentation to the user, nor does it require any additional hardware. Instead of using infrared light (like a gaze tracker), PupilWare-M uses computer vision and machine learning to extract the eye and pupil diameter from an RGB camera.

It is important to note that PupilWare-M shares some overlap with modern eye trackers but is also quite distinct. For example, Wood and Bulling [20] recently demonstrated that eye gaze can be ascertained from the camera of a tablet without the use of an infrared light source. Their prototype, EyeTab, is closely related to PupilWare-M but infers the user's gaze from measurements of the iris (not the pupil). Wood and Bulling's system clearly influenced the inception of PupilWare-M by demonstrating that eye segmentation and processing is possible on mobile devices; however, PupilWare-M employs methods that are distinct from iris detection. Furthermore, PupilWare-M is validated for its ability to measure sub-millimeter pupil dilations, not to track saccades or foveation.

III. DATA COLLECTION

In order to validate the PupilWare system, we conducted two user studies:

(1) A study in which user data was collected with a normal web camera for offline analysis (four users; data used only for testing processing speed and optimizing parameters to run on a mobile device).

(2) A follow-up study that used a smartphone for real-time cognitive analysis (eight new users; data used for evaluating ability to estimate pupillary response in online study).

Each study artificially induced cognitive load using memorization of spoken digits, known in psychology as the digit span task. In this task, a participant is presented with a sequence of spoken digits and asked to memorize and then repeat the sequence out loud. The act of memorization induces cognitive load in a consistent manner, though the effort and the amount of pupil dilation are user dependent. Each participant was presented with increasingly longer (and therefore more difficult) sequences to memorize, varying in length from 5 to 8 digits (with four iterations per sequence length, using different random digits each time). Initially we also presented a 9-digit sequence, but nearly all participants had difficulty with the task: they would "give up" on memorizing, and the cognitive load induced became dependent on the effort of the user. This is consistent with other studies of cognitive load [2][4]. As such, we only present results for sequences up to length 8. This results in 128 trials of the digit span task (8 users x 4 sequence lengths x 4 iterations).
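To make the protocol concrete, the following is a minimal sketch of how such a trial schedule could be generated; the function and its defaults are illustrative assumptions, not the authors' actual stimulus code.

```python
import random

def make_digit_span_trials(lengths=(5, 6, 7, 8), iterations=4, seed=None):
    """Generate a digit span schedule: `iterations` random digit
    sequences per sequence length, presented in increasing length
    to induce increasing cognitive load."""
    rng = random.Random(seed)
    trials = []
    for length in lengths:
        for _ in range(iterations):
            trials.append([rng.randint(0, 9) for _ in range(length)])
    return trials

# 4 lengths x 4 iterations = 16 trials per participant
# (128 trials across the 8 participants in the study)
trials = make_digit_span_trials(seed=42)
```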

At the start of the session, participants were also asked to “relax and stare at the phone for about 10 seconds.” This “baseline” data is used to calibrate the parameters of our algorithm (discussed in detail later on).

Table 1—Participant eye color

Participants   Brown   Black   Blue   Green
8              1       2       4      1

For measuring the ground-truth pupil dilation, we use a Gazept remote eye tracker [21] that is calibrated using a medical grade pupillometer with accuracy down to 0.1 mm of pupil dilation. At the beginning of the test, the distance between the participant's eyes (i.e., pupillary distance) was measured using a digital caliper; this is used for converting pixel data to millimeters. During the experiment, the calibrated gaze tracker and the smartphone camera collected the participant's pupillary feature data simultaneously. The experiment was conducted in several environments: (1) an office under normal (but uncontrolled) lighting conditions and (2) a conference room surrounded by open windows. Prior to commencing the test and between iterations, we asked participants to relax.

For the study, participant demographics are as follows: 8 participants (5 female/3 male) ranging in age from 19 to 23. Eye color is described in Table 1.


IV. ALGORITHM

In this section we present the PupilWare-M algorithm. PupilWare-M uses many of the same processing techniques as the original PupilWare algorithm [14]; however, it introduces two key differences:

• Adaptive filtering algorithms have been added to denoise the pupil diameter estimates.

• An iterative calibration protocol has been added to help adapt the PupilWare algorithm to different users and ambient light conditions.

We first present an abridged explanation of the PupilWare algorithm before explaining the new components in more detail.

A. Abridged PupilWare-M Algorithm Explanation

PupilWare-M employs computer vision to: (1) eliminate noise via preprocessing, (2) segment the left and right eyes from a video frame, (3) use a "modified starburst" algorithm to estimate pupil diameter, and (4) eliminate outliers in the pupil diameter estimate. Pupil size is obtained from a sequence of images from an iPhone 5S FaceTime camera at 1280x720 resolution (i.e., 720p) running at 15 frames per second. We employ OpenCV [22] and GPU-accelerated filtering in Apple's CoreImage library for image processing.

Segmentation: As in the original study [14], we start by locating and extracting the face using a Haar cascade detector. We then constrain the portion of the face where the eyes are most likely to reside, based upon the face bounds. We then convert to a grayscale image, and the eye center is estimated using dynamic thresholding, morphology, and the centroid of connected components, as described in [14]. We denote this area the eye region-of-interest, or eye-ROI.
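A minimal OpenCV sketch of this segmentation step follows; the eye-band fractions and the cascade file are illustrative assumptions, since the paper does not specify them.

```python
import cv2

# Standard Haar cascade shipped with OpenCV (an assumption; the paper
# does not name the exact cascade used).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def extract_eye_band(frame_bgr):
    """Detect the largest face, then crop the horizontal band of the
    face where the eyes most likely reside."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                          minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
    # Illustrative geometric prior: eyes sit in the upper half of the
    # face bounds, below the forehead.
    return gray[y + h // 5 : y + h // 2, x : x + w]
```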

Eye-ROI preprocessing: One of the difficulties of capturing pupil size from a camera without infrared light is dealing with occlusion from eyelashes, light reflection, and shadow. Moreover, for dark eyes, the small intensity difference between iris and pupil makes it difficult to find a clear boundary. In order to get higher quality results, we preprocess the eye regions to reduce reflection and increase contrast. These preprocessing steps include median filtering with a 3x3 kernel and histogram equalization of the eye region. The area around the seed point is typically about 120x120 pixels, and the pupil is typically 15x15 pixels.
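These two preprocessing operations map directly onto standard OpenCV calls; a minimal sketch:

```python
import cv2

def preprocess_eye_roi(eye_gray):
    """Reduce reflection noise and boost iris/pupil contrast:
    3x3 median filter followed by histogram equalization, as
    described in the paper."""
    denoised = cv2.medianBlur(eye_gray, 3)  # 3x3 median kernel
    return cv2.equalizeHist(denoised)       # stretch local contrast
```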

PupilWare Starburst Algorithm: The starburst algorithm is a method typically employed in infrared eye trackers after the pupil is illuminated with infrared light [23]. Intuitively, the starburst algorithm uses a seed point in the eye to begin "looking" for the edges of the pupil at various angles around the seed point. It marches in many directions, looking for strong edges (thresholding on the image gradient). It then formulates a better estimate of the pupil center from the edge points and repeats the march iteratively. Outliers are removed using RANSAC, and the remaining boundary points are used in an ellipse-fitting algorithm [24]. The starburst algorithm is run separately on each eye. We estimate pupil diameter by calculating the diameter of the fitted ellipse for each eye and averaging the two.
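A simplified, illustrative version of the starburst march is sketched below. It omits the iterative seed re-estimation, RANSAC, and ellipse fitting; the parameter names and defaults are assumptions for clarity, not the authors' implementation.

```python
import numpy as np

def starburst_edges(grad_mag, seed, angles_deg, threshold, max_radius=60):
    """March outward from the seed along each ray and record the first
    pixel whose gradient magnitude exceeds the threshold; these are
    candidate pupil boundary points."""
    h, w = grad_mag.shape
    cx, cy = seed
    edges = []
    for theta in np.deg2rad(angles_deg):
        dx, dy = np.cos(theta), np.sin(theta)
        for r in range(2, max_radius):
            x = int(round(cx + r * dx))
            y = int(round(cy + r * dy))
            if not (0 <= x < w and 0 <= y < h):
                break  # ray left the eye-ROI
            if grad_mag[y, x] > threshold:
                edges.append((x, y))
                break
    return edges
```

In the full algorithm, the mean of these edge points becomes the new seed, the march repeats, and the surviving (RANSAC-inlier) points feed an ellipse fit per eye.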

Modifications in PupilWare-M: We modified the starburst algorithm to march only from 35 degrees to 125 degrees, to avoid the shadow from eyelashes and to be more robust to squinting. The exact threshold value for an "edge" in the image gradient is optimized during calibration. We also adjusted the "march" of the starburst algorithm to be biased toward finding edges that are relatively "far" from the seed point. That is, we decrease the magnitude of the threshold needed on the image gradient for a point to be considered an "edge."

This location bias is achieved using a 10x10 pixel Gaussian kernel, G(x,y), centered at the current seed point, where x and y are the pixel positions with origin at the Gaussian center (the maximum of the Gaussian is normalized to 1, with covariance set during calibration). We weight the image gradient by 1 - G(x,y). As a result, edges close to the center are more likely to be ignored, and edges farther from the eye center are more likely to be chosen as "pupil edges." This also helps eliminate edges caused by light reflection (and noise in general) in the pupil. The covariance matrix of this Gaussian kernel is adjusted during the calibration phase. We assume the covariance is diagonal, adjusting the standard deviation along the vertical and horizontal axes of the eye-ROI; off-diagonal covariance elements are assumed to be negligible.
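A minimal sketch of this weighting follows, applied over the whole eye-ROI rather than a literal 10x10 kernel for readability; the sigmas stand in for the calibrated diagonal covariance.

```python
import numpy as np

def center_suppression_map(shape, seed, sigma_x, sigma_y):
    """Build 1 - G(x, y): a Gaussian centered at the seed with peak
    normalized to 1, so gradient responses near the seed are
    suppressed and edges farther away are favored."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-(((xs - seed[0]) ** 2) / (2.0 * sigma_x ** 2)
                 + ((ys - seed[1]) ** 2) / (2.0 * sigma_y ** 2)))
    return 1.0 - g

# Usage: weight the gradient magnitude before thresholding for "edges."
# weighted = grad_mag * center_suppression_map(grad_mag.shape, seed, sx, sy)
```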

Post Processing: PupilWare-M first converts the ellipse diameters from pixels to millimeters. This is achieved by calculating the Euclidean distance (in pixels) between the user's left and right pupil centers, as computed by the starburst algorithm, and scaling the estimated pupil diameters by the known inter-pupillary distance (measured via calipers in the study session). Note that this approach assumes the eyes lie in a plane perpendicular to the camera, necessitating that we detect and discard large head pose changes. This is achieved using median filtering over a one-second moving window. After filtering, the original and raw feature values are compared; any value that changes by more than 5% after median filtering is discarded. The missing values are then imputed with linear interpolation.
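A sketch of this post-processing chain under the stated assumptions (15 fps, a one-second median window, the 5% rejection rule); `scipy` is used here for convenience and is not named in the paper.

```python
import numpy as np
from scipy.signal import medfilt

def postprocess_diameters(diam_px, ipd_px, ipd_mm, fps=15):
    """Convert pixel diameters to mm via the caliper-measured
    inter-pupillary distance, reject samples that a ~1 s median
    filter changes by more than 5% (likely head-pose artifacts),
    then fill the gaps with linear interpolation."""
    diam_mm = np.asarray(diam_px, float) * (ipd_mm / ipd_px)
    win = fps if fps % 2 == 1 else fps + 1       # odd window ~ 1 second
    smoothed = medfilt(diam_mm, kernel_size=win)
    bad = np.abs(diam_mm - smoothed) > 0.05 * np.abs(smoothed)
    keep = np.flatnonzero(~bad)
    return np.interp(np.arange(len(diam_mm)), keep, diam_mm[keep])
```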

Calibration: In PupilWare-M we introduce a calibration process to accommodate light variability in the mobile environment. The purpose of the calibration process is to dynamically find the optimal parameters for thresholding and for the Gaussian kernel, G(x,y). That is, three values are investigated: the threshold for designating an "edge" and the two diagonal elements of the Gaussian covariance matrix. The calibration uses a 10-second recording of the user relaxing and staring at the screen of the phone (i.e., the baseline video taken at the start of the experiment).


We then process the baseline video over many iterations, using a random grid search to change the parameters [25]. Each parameter can take on 5 different values, resulting in 125 possible combinations for the grid search. Each parameter set is allowed to run to completion (on the smartphone), and the pupil diameter measurement is saved in the smartphone document directory. To keep the runtime manageable, we stop the randomized search after 45 seconds. We then choose the parameter set with the minimum Mean Absolute Deviation Ratio (MADR) of the pupil diameter in the baseline calibration video:

$$\mathrm{MADR} = \frac{\mathrm{MAD}(x) + |\mathrm{median}(x)|}{|\mathrm{median}(x)|}$$

where x is the estimated pupil diameter signal and MAD is the mean absolute deviation. We choose the minimum because, during calibration, the user's pupil should not be dilating or constricting; therefore, the parameter set with the most consistently small deviations is likely not finding noise boundaries such as shadows, eyelashes, or reflections. Moreover, these parameters help PupilWare-M adapt to eye color because the gradient threshold is chosen dynamically. These parameters are then saved to the smartphone for the current user.
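The scoring and search described above can be summarized in a few lines; `run_pupilware` is a hypothetical stand-in for processing the baseline video with one parameter set and returning the diameter signal.

```python
import itertools
import random
import time

import numpy as np

def madr(x):
    """Mean Absolute Deviation Ratio:
    (MAD(x) + |median(x)|) / |median(x)|.
    A flat, noise-free baseline signal minimizes this score."""
    x = np.asarray(x, float)
    med = np.median(x)
    mad = np.mean(np.abs(x - med))
    return (mad + abs(med)) / abs(med)

def calibrate(run_pupilware, param_grid, time_budget_s=45):
    """Randomized grid search over (edge threshold, sigma_x, sigma_y):
    5 values per parameter -> 125 combinations, stopped after the
    45-second budget; returns the parameter set with minimum MADR."""
    combos = list(itertools.product(*param_grid))
    random.shuffle(combos)
    best, best_score = None, float("inf")
    start = time.time()
    for params in combos:
        if time.time() - start > time_budget_s:
            break
        score = madr(run_pupilware(params))
        if score < best_score:
            best, best_score = params, score
    return best
```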

We use this calibration process at the beginning of each online experiment; the results are then used for that participant in all remaining digit span task experiments. As a qualitative example, Figure 2 shows the result of three sets of parameters during the calibration process. Once the calibration process is completed, the best parameter values are set as the default for any future video processing until the next calibration is performed.

V. COGNITIVE LOAD DETERMINATION

We now discuss the implementation of a novel scale, called CogPoint, that maps the relative cognitive load of different users onto a single, comparable dimension. In cognitive psychology, the relative change in pupil diameter during the course of a task is typically used for measuring cognitive load [19], although no single standard is used by all researchers. The percent change of pupil dilation from baseline for the ith frame, D_i, is calculated by

$$D_i = \frac{d_i - \min(d_{5\mathrm{sec}})}{\min(d_{5\mathrm{sec}})}$$

where d_i is the estimated pupil diameter, in millimeters, for the ith frame, and d_5sec is the pupil diameter during the first five seconds of the digit span task. The maximum, max(D_i), is then the maximum deviation from baseline during a given task. To customize this for each individual participant, we introduce a value called CogPoint that ranges from 0.0 (low cognitive load) to 1.0 (high cognitive load). The CogPoint range is calibrated using two videos: max(D_baseline) is taken from the baseline video, and max(D_8digit) is taken from one iteration of the 8-digit span task, which should induce the highest cognitive load. We then define CogPoint for each frame, i, as

$$\mathrm{CogPoint}(i) = \frac{D_i - \max(D_{\mathrm{baseline}})}{\max(D_{\mathrm{8digit}})}$$

In this way, using two brief videos, PupilWare-M can be calibrated to measure the cognitive load of any individual, personalized for lighting, eye color, and pupillary percentage change on simple cognitive tasks. CogPoint does not account for changes in brightness or scene context, only the current situated environment.
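Both formulas reduce to a few lines of NumPy; a minimal sketch, assuming 15 fps diameter signals already converted to millimeters:

```python
import numpy as np

def percent_change(diam_mm, fps=15):
    """D_i = (d_i - min(d_5sec)) / min(d_5sec), where d_5sec is the
    first five seconds of the task."""
    d = np.asarray(diam_mm, float)
    baseline = d[: 5 * fps].min()
    return (d - baseline) / baseline

def cogpoint(D_task, D_baseline, D_8digit):
    """Map percent dilation onto the 0.0-1.0 CogPoint scale using the
    baseline and 8-digit calibration videos."""
    return (np.asarray(D_task) - np.max(D_baseline)) / np.max(D_8digit)
```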

VI. RESULTS

In the following, we present the results of our eight-person user study. In the first subsection, we present the results of baseline calibration, using the gaze tracker as ground truth for the selected algorithm parameters. The second subsection compares pupil size changes resulting from the cognitive load induced by the digit span task. Finally, we discuss the utility of the CogPoint estimation.

A. Initial Calibration

As part of the initial calibration, we use a randomized grid search to select the parameter set with the lowest MADR, as described above. Explanation: Figure 3 shows the result of this calibration for one participant. The x-axis represents the video frames, and the y-axis represents the pupil size in mm. Solid lines represent the result of one iteration of the grid search; the four parameter sets with the lowest MADR are graphed (P1-P4). The red dashed line represents the ground truth from the gaze tracker, and the bold green line is the best result chosen automatically (P4 in Table 2). Result: The parameters that result in the lowest MADR score also correspond to the results that most closely match the gaze tracker. Interestingly, P4 and P1 are relatively good choices, whereas P2 and P3 deviate considerably from the gaze tracker. In retrospect, the latter corresponded to the algorithm confusing the pupil/iris boundary with a reflection in the eye. It is encouraging that these two series also had higher MADR.

Figure 2—Calibration overview (left) and results for one user as displayed on smartphone (right). Top three iterations shown with selected parameterization result in green.


Implication: The calibration successfully selects parameters that correspond to the gaze tracker, even though it has no knowledge of the underlying gaze tracker ground truth. The results presented here are highly similar for all participants. However, even though these results show a close relationship between the best parameters picked by the application and the ground truth, more testing is required to ensure that this procedure works in varying or dynamic lighting conditions.

B. Pupil Dilation Comparison

In this section we evaluate the accuracy of capturing real-time changes in pupil size during the digit span task; specifically, we want to know how the PupilWare-M results compare with the gaze tracker. Explanation: Figure 4 compares the percentage pupil dilation measured by the gaze tracker against that calculated on the mobile device. This plot is known as a Bland-Altman plot and is used in the medical community for comparing a ground-truth device to a sensor under test. The y-axis represents the point-by-point difference in percentage pupil change between the gaze tracker and PupilWare-M for every iteration and every participant; the x-axis is the percentage pupil size change measured by the gaze tracker. Each participant is represented by a different color on the graph. Result: The interquartile range of the residual difference is -0.05881 to 0.05886, with a mean difference of -0.0008. PupilWare-M and the gaze tracker have remarkable point-by-point similarity, with very few differences greater than 5%. The majority of outliers come from one participant with black eye color (i.e., very dark brown).
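The reported statistics can be reproduced from the two percent-dilation signals in a few lines; this sketch computes the residuals a Bland-Altman plot would show, assuming the signals are already time-aligned.

```python
import numpy as np

def bland_altman_stats(gaze_pct, mobile_pct):
    """Point-by-point residuals between the gaze tracker and
    PupilWare-M percent dilation, summarized by the interquartile
    range and mean difference reported in the paper."""
    diff = np.asarray(gaze_pct, float) - np.asarray(mobile_pct, float)
    q1, q3 = np.percentile(diff, [25, 75])
    return {"mean_diff": float(diff.mean()), "iqr": (float(q1), float(q3))}
```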

C. Cognitive Load Classification

Explanation: Figure 5 shows boxplots of the maximum CogPoint values for one participant (as a qualitative example) and for all participants. The y-axis is the maximum CogPoint during an iteration of the span task, and the x-axis is the length of the digit sequence used in the span task. Result: The CogPoint values are spread evenly from 0 to 1 for all participants as the difficulty of the memorization task increases. For instance, the 5-digit task is, on average, lower than the 6-, 7-, and 8-digit tasks.

Implication: The data shows considerable overlap between the cognitive load induced by the five- and six-digit tasks. While this could indicate that CogPoint fails to distinguish the difference, it could also be that the tasks elicit similar cognitive load. For example, in Kahneman's original study of cognitive load [4], pupil response between the 5- and 6-digit sequences overlapped completely. The results also point to some natural breakpoints for classifying the induced cognitive load: a CogPoint value less than 0.3 may be considered "low or relaxed"; a CogPoint value spanning 0.3-0.6 could be considered "medium," corresponding to memorization of 5-6 digits; and a CogPoint above 0.6 might be "maximal or high" cognitive processing, corresponding to memorization of 8 digits.
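These breakpoints translate into a trivial classifier; the cut-offs below are the study's observed values, not a validated standard.

```python
def classify_cogpoint(value):
    """Map a CogPoint value to the coarse load levels suggested by
    the results (illustrative thresholds from this study only)."""
    if value < 0.3:
        return "low/relaxed"
    if value <= 0.6:
        return "medium"   # roughly 5-6 digit memorization
    return "high"         # roughly 8 digit memorization
```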

VII. LIMITATIONS

Many factors contribute to pupil dilation, such as an individual's pattern recognition ability, level of concentration, and emotional state. Therefore, many unaccounted-for variables may have contributed to deviation in the calibration of digit span tasks and cannot be separated from "poor performance" of the algorithm. Another limitation of our experiment is the sample size: we cannot perform statistical tests that involve subgrouping without collecting more data.

Table 2—MADR values

        P1       P2       P3       P4
MADR    1.1034   1.1150   1.1126   1.0907

Figure 3—Qualitative calibration comparison

Figure 4—Gaze tracker and mobile device comparison

Figure 5—Maximum CogPoint grouped by experimental task (5-8 digits)


These experiments were conducted in mostly consistent lighting conditions; it is unclear how PupilWare-M will perform in dynamic or low-light conditions.

VIII. CONCLUSION

In this paper we presented PupilWare-M, a system that determines a user's cognitive load from pupil dilation measurements using an unmodified camera in a mobile device. We used a classic psychology experiment to artificially induce cognitive load, and a remote eye tracker was used during the experiments to validate the accuracy of PupilWare-M. A novel calibration process was used to improve the accuracy of cognitive load classification. Our results demonstrate a high level of correlation between measurements from PupilWare-M and the remote gaze tracker. We also provide a way for applications to define cognitive load levels for their own specific purposes.

IX. REFERENCES

[1] J. F. Cavanagh, T. V. Wiecki, A. Kochar, and M. J. Frank, "Eye tracking and pupillometry are indicators of dissociable latent decision processes," J. Exp. Psychol. Gen., vol. 143, no. 4, pp. 1476–1488, 2014.

[2] J. Klingner, "Measuring Cognitive Load During Visual Tasks by Combining Pupillometry and Eye Tracking," Ph.D. dissertation, Stanford University, 2010.

[3] J. Beatty, "Task-evoked pupillary responses, processing load, and the structure of processing resources," Psychological Bulletin, vol. 91, no. 2, pp. 276–292, 1982.

[4] D. Kahneman and P. Wright, “Changes of pupil size and rehearsal strategies in a short-term memory task.,” Q. J. Exp. Psychol., vol. 23, no. 2, pp. 187–96, May 1971.

[5] A. A. Zekveld, D. J. Heslenfeld, I. S. Johnsrude, N. J. Versfeld, and S. E. Kramer, "The eye as a window to the listening brain: Neural correlates of pupil size as a measure of cognitive listening load," Neuroimage, vol. 101, pp. 76–86, 2014.

[6] P. Ren, A. Barreto, J. Huang, Y. Gao, F. R. Ortega, and M. Adjouadi, “Off-line and on-line stress detection through processing of the pupil diameter signal,” Ann. Biomed. Eng., vol. 42, no. 1, pp. 162–176, 2014.

[7] M. Nakayama and Y. Hayashi, “Prediction of recall accuracy in contextual understanding tasks using features of oculo-motors,” Univers. Access Inf. Soc., pp. 1–16, 2013.

[8] Y. M. Jang, R. Mallipeddi, S. Lee, H. W. Kwak, and M. Lee, “Human intention recognition based on eyeball movement pattern and pupil size variation,” Neurocomputing, vol. 128, pp. 421–432, 2014.

[9] Y. M. Jang, R. Mallipeddi, and M. Lee, “Identification of human implicit visual search intention based on eye movement and pupillary analysis,” User Model. User-Adapted Interact., pp. 1–30, 2013.

[11] Mark, Gloria, Yiran Wang, and Melissa Niiya. "Stress and multitasking in everyday college life: an empirical study of online activity." In SIGCHI Conference on Human Factors in Computing Systems, pp. 41-50. ACM, 2014.

[12] Okoshi, Tadashi, Hideyuki Tokuda, and Jin Nakazawa. "Attelia: Sensing user's attention status on smart phones." In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, pp. 139-142. ACM, 2014.

[13] Poh, Ming-Zher, Daniel J. McDuff, and Rosalind W. Picard. "Advancements in noncontact, multiparameter physiological measurements using a webcam." Biomedical Engineering, IEEE Transactions on 58, no. 1 (2011): 7-11.

[14] S. Rafiqi, C. Wangwiwattana, J. Kim, E. Fernandez, S. Nair, and E. C. Larson, "PupilWare: Towards pervasive cognitive load measurement using commodity devices," in Proc. 8th Int. Conf. on PErvasive Technologies Related to Assistive Environments (PETRA), ACM, 2015.

[15] E. H. Hess and J. M. Polt, “Pupil Size in Relation to Mental Activity during Simple Problem-Solving.,” Science, vol. 143, no. 3611, pp. 1190–2, Mar. 1964.

[16] S. Chen, J. Epps, and F. Chen, “A comparison of four methods for cognitive load measurement,” Proc. 23rd Aust. Comput. Interact. Conf. - OzCHI ’11, pp. 76–79, 2011.

[17] I. Katidioti, J. P. Borst, and N. A. Taatgen, "What happens when we switch tasks: Pupil dilation in multitasking," J. Exp. Psychol. Appl., vol. 20, no. 4, pp. 380–396, 2014.

[18] B. P. Bailey and S. T. Iqbal, “Understanding changes in mental workload during execution of goal-directed tasks and its application for interruption management,” ACM Trans. Comput. Interact., vol. 14, no. 4, pp. 1–28, 2008.

[19] W. Fordham and U. Tryon, "Survey of sources of variations," Psychophysiology, vol. 12, no. 1, pp. 90–93, 1975.

[20] E. Wood and A. Bulling, “EyeTab: Model-based gaze estimation on unmodified tablet computers,” in Proceedings of the Symposium on Eye Tracking Research and Applications, 2014, pp. 207–210.

[21] S. Zugal and J. Pinggera, Low-Cost Eye-Trackers: Useful for Information Systems Research?, vol. 178. Cham: Springer International Publishing, 2014.

[22] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, 2008.

[23] D. Winfield and D. J. Parkhurst, “Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) - Workshops, vol. 3, pp. 79–79.

[24] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981.

[25] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," J. Mach. Learn. Res., vol. 13, pp. 281–305, 2012.