1. Introduction
Many individuals, especially those in information work, spend an increasingly large proportion of their day at a computer. Some workplace computer tasks are known to be associated with stress, such as answering emails [1,2] and presenting to a remote audience [3,4]. Besides cognitively demanding tasks, workplace stressors include time pressure [5], social pressure [6], interruptions [7], and anticipatory stress from upcoming deadlines [8,9]. Excessive exposure to workplace stress has direct effects on health and quality of life, as it can lead to burnout, diminished productivity, and several health problems, including cardiovascular disease and impaired immune function [10,11,12]. Thus, capturing stress levels in the workplace is vital for improving our understanding of real-life stress and the factors surrounding it. Measuring stress unobtrusively and in real time at the workplace can enable affective computing applications that incorporate users’ stress and new forms of context-aware interactions [13,14]. Mental health professionals and organizational psychologists can also benefit from stress monitoring at the workplace, to better understand stress and associated factors, and to deliver interventions.
To capture stress in the workplace, several methods have been tested. Self-reported stress, also referred to as ‘perceived stress’ [15], is often considered a ground truth of stress. Several instruments have been developed, such as the Perceived Stress Scale [16], the Daily Stress Inventory [17], and one-item surveys deployed through experience sampling [18]. Although self-report instruments are commonly used in the literature, they have several limitations for stress monitoring at the workplace. Self-reports are subjective and are affected by memory and emotion expression biases. They can also be disruptive, as they require the full cognitive attention of the user, and they do not allow continuous stress measurement. Advances in sensor technologies embedded in wearable devices have motivated researchers to investigate the usability of unobtrusive and wearable sensors for stress measurement in the workplace [19], especially during computer use. As stress produces several physiological reactions, capturing physiological signals with sensors provides the potential to measure stress objectively, unobtrusively, and in real time.
In this paper, we review recent research on stress measurement with physiological sensors in workplace and computer use settings. We identify a gap in the literature, as most studies focus on specific high-stress short-duration computer tasks to induce stress [20,21,22], which might not be representative of those in real workplace settings and can overlook issues and challenges related to stress measurement with physiological sensors during different computer activities. Workplace computer use includes activities that vary in the level of cognitive or emotional stress they could induce, the physical motions and dexterity they require, the user posture, and their duration, all of which can potentially affect sensor performance. An empirical study to examine the usability and reliability of a set of unobtrusive sensors across a spectrum of computer activities is lacking.
This paper addresses the following research question: which sensor modality functions best to measure stress across computer tasks? To answer this question, we compare the use of different sensor modalities across varied computer tasks, investigating the usability, reliability, and problems of each type of sensor. We report our results based on testing the sensors in a simulated office environment. The contributions of this work are as follows:
A review of the literature on stress measurement with sensors in the workplace and in laboratory studies examining computer use. Unlike reviews focusing on the results of these studies, we focus on the methods and present a summary of the sensors used, the tasks performed, the ground truth measures and other dependent stress variables, the number of subjects, and the duration of measurement.
An empirical comparison of the usability and reliability of a set of sensor channels for stress response measurement during computer use, including an emerging non-contact method using thermal imaging.
Identifying challenges for some sensors, specific to certain common computer tasks, which limit the efficacy of these sensors for continuous stress monitoring in situ in a workplace setting.
Recommendations for researchers and system builders interested in stress measurement with unobtrusive physiological sensors during computer use.
To the best of our knowledge, this is the first study to include this collection of physiological sensor streams (heart rate from ECG and PPG sensors, breathing rate, skin conductance, and thermal imaging) collected simultaneously for several computer tasks, and the first study to compare thermal imaging against wearable physiological sensors as a stress measurement technique in computer tasks (for previous studies using thermal imaging for stress detection in other contexts, see [22,23,24,25,26]).
5. Discussion
Given the variety of computer tasks conducted at the workplace, our analysis showed that some sensors do not capture stress accurately during certain tasks. For wrist-worn sensors, several factors could cause the failure to capture stress. Sensor readings are prone to different types of artifacts. For example, sensor electrodes can move, detach from the skin, or change in pressure on the skin, all of which can affect the sensor signals, especially in dexterous tasks such as typing. In addition, we used a wrist sensor with dry electrodes, which depend on sweat for conductance. Thus, for calm sedentary users in an air-conditioned lab, the electrodes might need to be in contact with the skin for some time before an EDA signal appears. Previous studies have reported that detection of small EDA responses with wrist sensors is problematic [64], which might also explain why EDA did not detect the mild stress from computer tasks in our study. Palmar EDA (EDA obtained from the palm or the palm side of the fingers) has shown better results for classifying calm and distress in sedentary settings in previous studies [65], but palmar sensors can be uncomfortable to wear during some computer activities. Finally, some subjects naturally do not produce an adequate EDA signal in at least one wrist [66].
Many studies on unobtrusively capturing workplace stress with physiological sensors focus on specific high-stress computerized tasks. With a rationale similar to that of our study with common office tasks, McDuff et al. [34] considered more realistic everyday computer activities that require cognitive processing and dexterity. Their selected computer activities could introduce motion artifacts that can negatively influence the quality of the physiological readings and introduce physiological changes associated with body motions. Among other findings, McDuff et al. [34] report that HR and BR alone were not very discriminative indicators of cognitive stress, although their previous work showed BR to be significantly different during cognitive tasks compared to rest periods. Therefore, they suggest that BR might be dependent on the type of task and thus less generalizable. Another study with common office computer tasks (i.e., email interruptions) also reported low accuracy for predicting stress with HR and EDA [47]. Our findings are consistent with previous studies on common workplace computer tasks, showing that the ability of BR and HR to capture stress is task-dependent.
While chest-worn sensors can provide accurate readings for HR and BR, several considerations must be taken into account to acquire a good signal and reduce noise. For example, posture is important to avoid an abnormal signal. HR signals from the chest-worn sensor can drop to zero if the sensor disconnects due to crouching. The HR signal can also be abnormally high when sensor friction with the skin produces strong high-frequency responses. Ramos et al. [60] reported that they instructed participants to refrain from leaning against the back of the chair, to avoid the noise introduced into the BR readings when the chest-worn sensor was pressed against other objects; such a restriction makes wearing the sensor in real-life work contexts uncomfortable. Lastly, BR is not an accurate measure of stress when the subject is talking, which rules out using this signal to detect stress in some workplace scenarios. These limitations introduce a usability problem with a cost-benefit tradeoff, where producing a good signal might require uncomfortable posture and restricted activities.
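The two HR failure modes described above (a drop to zero on sensor disconnection and abnormally high readings from friction) can be screened with a simple plausibility check before any stress analysis. The following is a minimal sketch only: the bounds of 30 and 220 beats per minute, the function names, and the sample values are illustrative assumptions, not values or procedures from this study.

```python
from typing import List, Optional

# Hypothetical plausibility bounds (beats per minute) for a seated adult.
# These are assumptions for illustration, not thresholds used in the study.
HR_MIN_BPM = 30.0
HR_MAX_BPM = 220.0


def mask_implausible_hr(
    hr_bpm: List[float],
    low: float = HR_MIN_BPM,
    high: float = HR_MAX_BPM,
) -> List[Optional[float]]:
    """Replace samples outside [low, high] with None (treated as missing)."""
    return [x if low <= x <= high else None for x in hr_bpm]


def missing_fraction(masked: List[Optional[float]]) -> float:
    """Fraction of samples marked as missing; 0.0 for an empty series."""
    if not masked:
        return 0.0
    return sum(1 for x in masked if x is None) / len(masked)


# Made-up samples: two zeros (disconnection) and one spike (friction).
samples = [72.0, 74.0, 0.0, 0.0, 75.0, 240.0, 73.0, 71.0]
masked = mask_implausible_hr(samples)
print(masked)
print(missing_fraction(masked))
```

Reporting the missing fraction per task, rather than silently dropping samples, keeps the amount of unusable data visible when sensors are compared across tasks.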
Additional filtering for noise reduction can partially address artifact-contaminated signals. However, since the focus of this study is to highlight the issues for different sensor streams during several common computer tasks, we did not pursue developing algorithms for further denoising. Previous work has investigated approaches to process artifact-contaminated data. For example, Hernandez [45] used a motion sensor to detect ‘still’ moments in daily activities and to opportunistically measure HR and respiration within those moments. In another approach, Alamudun et al. [67] suggest a preprocessing technique to remove the effects of factors interfering with physiological signals (e.g., posture or physical activity). They used a method called orthogonal signal correction, which attempts to remove any source of variance that is orthogonal to the dependent variable of stress level. Another method they used is linear discriminant correction, which models the source of noise (i.e., posture or physical activity) and removes it from the physiological signals. Their methods improved the accuracy of stress prediction from physiological data from 53.5% to 76.3%. However, it is unclear whether these approaches, which were developed for physical activities such as walking, can successfully address motion artifacts from crouching or finer-grained activities such as typing.
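The idea of opportunistic measurement during ‘still’ moments can be illustrated with a short sketch. It is not the implementation from the cited work: it assumes that HR and accelerometer magnitude are sampled at the same rate, it treats a sample as still when the standard deviation of the accelerometer magnitude over a trailing window falls below a threshold, and the window length, threshold, function names, and data are all hypothetical.

```python
from statistics import pstdev
from typing import List, Optional


def still_mask(
    accel_mag: List[float], window: int = 5, threshold: float = 0.05
) -> List[bool]:
    """Mark sample i as still when the standard deviation of the
    accelerometer magnitude over the trailing window ending at i is
    below the threshold. Samples without a full window are not still."""
    mask = []
    for i in range(len(accel_mag)):
        if i + 1 < window:
            mask.append(False)
        else:
            recent = accel_mag[i + 1 - window : i + 1]
            mask.append(pstdev(recent) < threshold)
    return mask


def mean_hr_when_still(
    hr_bpm: List[float],
    accel_mag: List[float],
    window: int = 5,
    threshold: float = 0.05,
) -> Optional[float]:
    """Average HR over still samples only; None if no sample is still."""
    if len(hr_bpm) != len(accel_mag):
        raise ValueError("hr_bpm and accel_mag must have the same length")
    mask = still_mask(accel_mag, window, threshold)
    kept = [h for h, is_still in zip(hr_bpm, mask) if is_still]
    return sum(kept) / len(kept) if kept else None


# Made-up data: a quiet period followed by vigorous movement.
accel = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.8, 0.3, 1.7, 0.2]
hr = [70.0, 70.0, 72.0, 72.0, 74.0, 74.0, 120.0, 130.0, 125.0, 128.0]
print(still_mask(accel, window=3))
print(mean_hr_when_still(hr, accel, window=3))
```

A gate of this kind trades continuity for signal quality: during tasks with sustained movement, few or no samples pass, which is one reason such approaches may not transfer directly to fine-grained activities such as typing.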
Considering all methods, we found that measuring perinasal perspiration with a thermal camera is the most generalizable method to capture stress across different tasks, as it can capture even slight changes and is robust against subject movement during computer tasks, providing reliable and continuous measurement with minimal missing data.
Our findings reveal that stress measurement in workplace environments, though important, is challenging, and that relying on a single modality has many limitations. Previous studies have provided support for multimodal stress measurement, given that physiology, personality, gender, sensor location, and subject posture affect the selection of the best features to predict stress [8,45,50,68,69]. We extend those findings to show that the performed tasks also affect the choice of the best sensor signals. It may, however, be impractical to use multiple types of sensors. Therefore, thermal imaging appears to offer the most benefit in terms of usability, signal validity, and reliability in the context of sedentary computer work.
In terms of usability, all sensors used in the study are unobtrusive and do not interfere with people’s ability to perform computer tasks. Sensors using electrodes (i.e., EDA and ECG sensors) can be uncomfortable for long-term use, as the electrodes become sticky after prolonged contact with the skin. This problem is avoided with non-contact thermal imaging. For data collection and analysis, all devices used come with software that collects and processes raw signals in real time, which is useful for human–computer interaction researchers who want to use these sensors in lab or in situ studies. Thermal imaging has the additional advantage of having the thermal video, which allows for revisiting the video to investigate abnormalities and re-extract features. In terms of cost, all devices have low costs during use and the main cost is the upfront cost of the device.
5.1. Scientific Contribution
The main contribution of our work lies in the breadth of sensor comparisons we used and the context in which they took place. A few other studies in affective computing have conducted sensor comparisons (e.g., [32,70,71]). However, our study is the first to compare thermal imaging and wearable sensors, capturing multiple physiological variables from different parts of the body with different measurement techniques. The breadth of sensors investigated positions this study as a reference for researchers and practitioners (see Section 5.3).
Another distinct and important contribution is the context of this study. Most previous empirical comparisons of sensors either use a highly restricted experimental task in the laboratory (e.g., the Stroop Color-Word test) or rely on observations in the wild. In the case of restricted, standardized tasks conducted in the laboratory, experimenters have control over confounding factors, but ecological validity is compromised, which raises questions about the relevance of the findings for real-world applications. Sensor measurements done with an experimental task in an abstract lab environment may lack important characteristics that are associated with office tasks, such as time pressure or semantic context. In the case of field studies, ecological validity is high, but confounding factors are hard to control, which affects the robustness of sensor comparisons. The current study aimed for ecological validity by using multiple common computer tasks instead of an abstract laboratory task. Hence, while we controlled for confounding factors, the computer tasks we used are generalizable to real-world office tasks, which makes the sensor comparisons more relevant for use in the workplace. We used a variety of office tasks that are complex and common in the information workplace, such as answering email and giving presentations.
Lastly, our empirical sensor comparisons showed thermal imaging to be a robust stress measurement technique that is suitable for workplace and computer use settings, as it is less affected by the confounding variables that introduce noise into wearable sensor streams. Moreover, physiological sensing with thermal imaging has a capacity for correction, because its output is not a one-dimensional temporal signal captured directly, but a signal derived from imagery, which can be improved with better extraction processes or algorithms, even years after its original capture. This finding and the empirical testing of thermal imaging in a realistic context advance affective sensing methods and have implications for researchers and system builders.
5.2. Limitations
Our analysis investigated five common sensor streams. However, other physiological signals that can be unobtrusively monitored to measure stress were not covered in our study. For example, heart rate variability (HRV), blood volume pulse (BVP), and skin temperature (ST) can be extracted from sensors embedded in wearables [72]. Future work can compare HRV, BVP, and ST with other physiological signals during different workplace computer tasks.
Finally, despite having simulated a workplace environment which allowed us to investigate specific computer tasks, deploying sensors in real-life contexts can have additional challenges that cannot be modeled in lab settings. In the lab setting, careful instrumentation and real-time inspection of the sensor streams ensured high-quality signals. While our study discussed some challenges that are likely to occur in real-life settings, in situ studies can uncover additional validity and usability challenges for unobtrusive stress monitoring in the wild.
5.3. Insights for Researchers and System Builders
Our work offers insights and implications for researchers and system builders, which can be summarized as follows:
1. Controlled experiments are necessary to study cause and effect by isolating nuisance factors. This, however, does not imply that experimentation needs to be void of realism. In studies of stressful computer-based tasks, researchers relied for too long on standardized treatments alone, such as the Stroop Color-Word test (CWT), to investigate phenomena of interest. Such standardized treatments need to be accompanied by carefully designed realistic tasks (e.g., report writing interrupted by emails in the present study) if the goal is to generalize to real-world applications. Importantly, as the sensing results demonstrated in our study, the stress responses generated by standardized treatments often underestimate the stress responses generated by controlled realistic tasks, and thus potentially by real tasks in the wild as well.
2. All unobtrusive physiological sensors—wearable and imaging—are affected by motion artifacts. The advantage of imaging (thermal imaging in this case), however, is that the physiological signals are extracted algorithmically from video streams. Hence, one can visually identify the cause of noise (e.g., head turn) in the original source and compensate for it, either by removing the specific signal segment or by applying an algorithmic correction. In wearable sensor signals, this is more difficult, because there is no primary source of information (i.e., a 3D matrix) out of which these signals are extracted. The 1D temporal signal is all that the wearable sensor provides, and thus identification of motion artifacts is purely conjectural.