Wright State University

CORE Scholar

Browse all Theses and Dissertations

2018

Impact of Noise Level on Task Performance and Workload and Correlation to Personality

Kaylee Marie Eakins, Wright State University

Follow this and additional works at: https://corescholar.libraries.wright.edu/etd_all

Part of the Operations Research, Systems Engineering and Industrial Engineering Commons

Repository Citation

Eakins, Kaylee Marie, "Impact of Noise Level on Task Performance and Workload and Correlation to Personality" (2018). Browse all Theses and Dissertations. 1948. https://corescholar.libraries.wright.edu/etd_all/1948

This Thesis is brought to you for free and open access by the Theses and Dissertations at CORE Scholar. It has been accepted for inclusion in Browse all Theses and Dissertations by an authorized administrator of CORE Scholar. For more information, please contact [email protected].

Impact of Noise Level on Task Performance and Workload and Correlation to Personality

A thesis submitted in partial fulfillment

of the requirements for the degree of

Master of Science in Industrial and Human Factors Engineering

By

KAYLEE MARIE EAKINS

B.S.B.E., Wright State University 2017

2018

Wright State University

WRIGHT STATE UNIVERSITY

GRADUATE SCHOOL

April 19, 2018

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY SUPERVISION

BY Kaylee Marie Eakins ENTITLED Impact of Noise Level on Task Performance and

Workload and Correlation to Personality TO BE ACCEPTED IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in Industrial and

Human Factors Engineering

___________________________________

Mary Fendley, Ph.D., Thesis Director

___________________________________

Jaime Ramirez-Vick, Ph.D., Chair

Department of Biomedical, Industrial

and Human Factors Engineering

Committee on

Final Examination

___________________________________

Mary Fendley, Ph.D.

___________________________________

Frank Ciarallo, Ph.D.

___________________________________

Matthew Sherwood, Ph.D.

__________________________________

Barry Milligan, Ph.D.,

Interim Dean of the Graduate School

ABSTRACT

Eakins, Kaylee Marie. M.S.I.H.E., Department of Biomedical, Industrial and Human Factors Engineering, Wright State University, 2018. Impact of Noise Level on Task Performance and Workload and Correlation to Personality.

An ideal work environment supports a culture of high performance, low mental workload, and quick turnarounds. The impact of noise on three types of tasks in a lab work environment was examined while attempting to identify correlations between a subject’s personality and their tolerance to noise. Neuroticism, agreeableness, conscientiousness, and extroversion correlated significantly with subjective (NASA-TLX) and physiological mental workload measures (heart rate variability and eye-tracking). The results show that task type impacts performance, task duration, and mental workload. Although the physiological workload measures showed significant impact, the parameters standard deviation of R-R intervals and LF/HF ratio agreed with the NASA-TLX scores while the parameters RMSSD value and standardized mean of R-R intervals disagreed. Noise level nearly showed statistical significance with task duration and LF/HF ratio; however, more research is necessary to completely rule out the influence of noise level on the human participants.

Table of Contents

1.0 Introduction

1.1 Background

1.2 Research Objective

2.0 Literature Review

2.1 Task and Multiple Resource Theory

2.2 Noise Interruption

2.3 Mental Workload Analysis

2.3.1 NASA TLX

2.3.2 Eye-tracking

2.3.3 Heart Rate Variability

2.4 Personality

3.0 Methods

3.1 Experimental Design

3.2 Participants

3.3 Stimuli and Apparatus

3.4 Procedure

3.5 Data Analysis

3.5.1 Performance Scoring

3.5.2 Duration of task

3.5.3 NASA-TLX

3.5.4 Eye-tracking

3.5.5 Heart Rate Variability

3.6 Hypotheses

3.6.1 Performance

3.6.2 Duration of the Tasks

3.6.3 NASA-TLX Mental Workload

3.6.4 Eye-tracking and HRV Parameters

4.0 Results

4.1 Performance

4.2 Duration

4.3 NASA-TLX Mental Workload

4.4 Physiological Mental Workload

4.4.1 Heart Rate Variability Analysis

4.4.2 Eye-tracking Analysis

4.4.3 Summary of Physiological Parameters

4.5 Correlation Testing

5.0 Discussion

5.1 Performance, Duration, and NASA-TLX

5.2 Physiological Mental Workload

5.3 Correlation Tests

6.0 Conclusion

7.0 Appendix

7.1 Appendix A: Experimental Design and Combinations Table

7.2 Appendix B: Questionnaires and Task Problems (with answers)

7.2.1 Noise Tolerance Questionnaire

7.2.2 Big Five Inventory (Link)

7.2.3 NASA-TLX (Link)

7.2.4 Anomaly Detection Task (with answers)

7.2.5 Data Entry Task (with answers)

7.2.6 Mathematical Arithmetic Task (with answers)

7.3 Appendix C: Eye-tracking Illustrations

7.4 Appendix D: Residual Plots

7.4.1 Normal Distribution Checks

7.4.2 Residual vs. Predicted Plots

7.5 Appendix E: Connecting Letters Reports and Interaction Plots

7.5.1 Task Performance

7.5.2 Task Duration

7.5.3 Mental Workload

7.5.4 Heart Rate Parameters

7.5.5 Eye-tracking Parameters

7.6 Appendix F: Correlation Tables

8.0 References

List of Figures

Figure 1: Illustration of R-R interval in EKG signal

Figure 2: Graphical user interface for data entry task

Figure 3: Anomaly detection task with anomaly circled

Figure 4: Arithmetic task set-up with manual pill counter, beads, and pill bottle

Figure 5: Average task type and noise level vs. performance (error bars are standard deviation)

Figure 6: Task type, noise level, and task type*noise level interaction connecting letters reports for performance

Figure 7: Average of task duration vs. task type and noise level


Figure 8: Task type, noise level, and task type*noise level interaction connecting letters reports for task duration

Figure 9: Averages of NASA-TLX scores vs. task type and noise level


Figure 10: Task type, noise level, and task type*noise level interaction connecting letters reports for NASA-TLX scores

Figure 11: Correlation scatterplot with ellipse of agreeableness vs. office noise mean pupil diameter

Figure 12: Correlation scatterplot with ellipse of agreeableness vs. data entry mean pupil diameter

Figure 13: Correlation scatterplot with ellipse of neuroticism vs. office noise MWL (NASA-TLX)

Figure 14: Correlation scatterplot with ellipse of agreeableness vs. anomaly detection mean pupil diameter

Figure 15: Correlation scatterplot with ellipse of agreeableness vs. no noise mean pupil diameter

Figure 16: Graphical illustration of all mental workload measures for task type

Figure 17: Graphical illustration of all mental workload measures for noise level

List of Tables

Table 1: HR and HRV parameters with expected changes with increased mental workload

Table 2: Independent and Dependent Variable Lists

Table 3: ANOVA for task performance

Table 4: ANOVA for task duration

Table 5: ANOVA for NASA-TLX mental workload scores

Table 6: ANOVA for LF/HF ratio

Table 7: ANOVA for mean HRV (not standardized)

Table 8: ANOVA for standardized mean HRV

Table 9: ANOVA for standard deviation HRV

Table 10: ANOVA for root mean squared differences of successive R-R intervals (RMSSD)

Table 11: ANOVA for mean difference in pupil diameter

Table 12: ANOVA for pupil diameter standard deviation

Table 13: Table of f-ratios/t-ratios and p values for data entry task's fixation rate, duration, and counts

Table 14: Table of f-ratios/t-ratios and p values for anomaly detection task's fixation rate, duration, and counts

Table 15: Summary table of physiological parameters when mental workload increases

Table 16: Correlation coefficient, p-values, and variables for all correlations that showed significance

Table 17: Top five correlations from Table 15

ACKNOWLEDGEMENTS

I would like to thank my graduate advisor Dr. Mary Fendley for her patience, dedication, kindness, and support throughout my college experience. I would like to thank Dr. Ciarallo for his assistance and willingness to answer all of my questions no matter how long it took. Another thank you to Dr. Sherwood for also giving me input when I was hitting some dead ends. Special thanks to Wright State’s Biomedical, Industrial, and Human Factors Engineering Department for their hard work and care for students. I would also like to thank Dr. Joe Tritschler for hiring me on, allowing me to focus on school rather than where my funding is coming from. I want to also mention my great appreciation to my research colleagues Alex Dominic, Noel Fleeman, and Josh Pilcher for their willingness to help me out in the last months of this research.

1.0 Introduction

1.1 Background

Human beings are able to control several different aspects of their work environment; however, some aspects cannot be changed and must be tolerated. Whether these include temperature, atmosphere, lighting, mood, or sound level, they influence the work environment and, in turn, the people in it. Noise is present in many work environments and is easily controlled to benefit the persons working in the environment. For example, some individuals choose a quiet space away from people, while others choose a space with lots of people and plentiful noise. Additionally, while some individuals may play soft music without lyrics to focus on their work, others may blast lyrical melodies until their work is completed.

So, how does one know whether the environmental sound around an individual supports accurate task completion in an effortless, timely manner? Multiple studies have found that task performance deteriorates when noise is produced in the background (Jahncke, Björkeholm, Marsh, Odelius, & Sörqvist, 2016; Jerison, 1959; Levy-Leboyer, 1989; Nassiri et al., 2013; Weistein, 1974). Other studies have shown that task performance is not the only outcome affected: reaction time and the mental load on the person during the task are also impacted (Becker, Warm, Dember, & Hancock, 1995; Lahtela, Niemi, Kuusela, & Hypén, 1986; Ljungberg & Neely, 2007; Nassiri et al., 2013; Tafalla & Evans, 1997).

Studies have found that personality may have an impact on the quality of a subject’s performance when there is noise in the background (Belojevic, Jakovljevic, & Slepcevic, 2003; Dobbs, Furnham, & McClelland, 2011; Furnham & Strbac, 2002; Jafari & Kazempour, 2013; Kou, Furnham, McClelland, & Furnham, 2018). A few other studies have looked at the effect of noise on people based on the ambient, environmental sounds experienced in their daily lives (Dockrell, Shield, & Dockrell, 2008; Lercher, Evans, & Meis, 2003; Pujol et al., 2014).

This research study utilized different types of tasks to examine whether performance, mental workload, or duration of task were impacted when a subject was exposed to three different background noise levels. The tasks used in this study were designed to simulate those specific to professionals in a medical field, while the accompanying ambient noise was based on sounds heard in a typical medical environment (e.g., people talking, copier running, typing, and phone ringing). Correlation tests were then used to study whether the influences of noise and task type relate back to the personality and noise tolerance of the individual. In addition, correlations were examined to look for a relationship between the effect of noise and task types versus the personality and lifestyle of the subject.

1.2 Research Objective

The purpose of this study was to assess the influence of noise on human subject task performance, more specifically: the time it took to complete the tasks, the performance of the subject, and the mental workload of the subject. Noise tolerance and personality correlation analyses were taken into consideration to examine the effects of the individual’s personality on the dependent variables when noise was introduced.

2.0 Literature Review

2.1 Task and Multiple Resource Theory

This study incorporates three types of tasks (data entry, anomaly detection, and mathematical arithmetic) that have been commonly used to study human performance and mental workload (Cail & Aptel, 2003; Church, 2015; Colligan, Potts, Finn, & Sinkin, 2015; Dimitrakopoulos et al., 2017; Gabbard, 2017; Galy & Mélan, 2015; Khan & Rizvi, 2010; Kotani, Takamasu, & Tachibana, 2007; Nickels, 2014; Peng, He, Ji, Wang, & Yang, 2006; Piasecki, 2016). Each of these tasks was designed to simulate actual healthcare practices performed by different medical personnel. The first task, data entry, relates to the job that a nurse must complete to catalogue patient information into Electronic Health Records (EHR) (a graphical user interface representing the EHR interface is found in Appendix B). In fact, Colligan, Potts, Finn, and Sinkin (2015) analyzed the mental workload of pediatric nurses during a data entry task of filling out EHRs in their actual work environment. The study examined the reaction of the nurses when switching to the new EHRs, and found that mental workload increased only during the initial switch to the new EHR.

Both Gabbard (2017) and Piasecki (2016) utilized an anomaly detection task to investigate mental workload and overall performance of human subjects. Gabbard’s anomalies were set in a video, while Piasecki used a static scene in which the anomalies sometimes flashed and moved. These types of anomalies are in contrast to this research study’s focus, since both the anomalies and the setting were static. Sets of x-rays were used as a static image set from which the subject picked out abnormalities (images found in Appendix B). These static scenarios were appropriate, since a radiologist’s profession is far more complex than the task for this experiment and requires years of training; thus, the task in this research only attempts to simulate a limited portion of the full responsibilities of a radiologist in interpreting images.

The complexity of the actual medical professional’s job also affected the choice of the mathematical/arithmetic task. An arithmetic task was simulated with counting and categorizing medical pills, a task that pharmacists are familiar with and one that this study has dubbed the “pharmaceutical task” (set-up and problems found in Appendix B). Filling prescriptions requires years of pharmaceutical training and is not a task that just anyone can pick up and perform; therefore, this study simplified the task utilizing basic math, colorful beads, and a manual pill counter. Arithmetic tasks are a common means of testing mental workload (Dimitrakopoulos et al., 2017; Galy & Mélan, 2015; Kotani, Takamasu, & Tachibana, 2007; Peng, He, Ji, Wang, & Yang, 2006). In this study, the role of mathematical arithmetic in mental workload studies and the way subjects use mental mathematics elicited interesting performance scores.

Some tasks require more resources than others. Multiple Resource Theory (MRT) states that there are limits on what an individual can do at one time, depending on the amount of resources that a given task demands (Basil, 1994; Wickens, 2002; Wickens, 2008). The theory describes these resources along four dimensions: processing stages, perceptual modalities, visual channels, and processing codes. Each dimension has two discrete levels. While all the tasks in this study incorporate visual modalities, the anomaly detection task relies heavily on the ability to visualize and interpret the differences among x-rays. The mathematical arithmetic task weighs more heavily on working memory in terms of processing the information given. However, the data entry task requires both verbal and auditory resources. One of these resources, auditory, was disrupted by the different noise levels in this study. This theory drove the choice of different types of tasks in the design of this experiment to study the effect of noise on performance, task duration, and mental workload.

2.2 Noise Interruption

Several studies have shown that introducing noise to an individual during a task has measurable effects on workload and task performance (Becker, Warm, Dember, & Hancock, 1995; Gabbard, 2017; Hygge & Knez, 2001; Jerison, 1959; Levy-Leboyer, 1989; Szalma & Hancock, 2011; Tafalla & Evans, 1997). In addition, a number of these studies characterize noise in different ways: low-frequency hum, music, office noise, and background speech (Dobbs et al., 2011; Furnham & Strbac, 2002; Jafari & Kazempour, 2013; Jahncke, Björkeholm, Marsh, Odelius, & Sörqvist, 2016). However, they all treat noise as an auditory interruption to the main work of the subjects in the experiment.

MRT is based on the idea of multiple separate resources (verbal, auditory, visual, perceptual, cognitive, and spatial), each with a limited capacity. Each task presented to an individual is allocated to a specific resource (Basil, 1994; Rubio et al., 2004; Horrey & Wickens, 2006; Wickens & Wickens, 2008). Though these resources are separate from each other, there can be interference or “resource competition” when two or more resources are occupied simultaneously in a subject (Horrey & Wickens, 2006). This theory explains why human performance during a task focusing on one resource (e.g., visual) can suffer when another resource (e.g., auditory) is causing interference.

A study at the University of Cincinnati had students perform a detection task under no noise, low noise, and high noise conditions (Becker et al., 1995). Not only did their detection performance decrease with noise, but their perceived mental workload increased. Another study found that noise appeared to have the same effect (Tafalla & Evans, 1997). Given an arithmetic task with two levels of complexity, the study demonstrated that noise increased heart rate under high effort conditions and thus high workload conditions. When noise was present, reaction time slowed and effort was low.

A more recent study analyzed the performance of 40 male subjects during noise interruption (Nassiri et al., 2013). The task involved the use of hand tools while testing steadiness and dexterity. The results showed that intermediate noise worsened the subjective work environment of the participants, but treble noise reduced the subjects’ performance. Another study found similar results when office noise was in the background during task performance of 30 students (Jahncke et al., 2016). In this case, each participant experienced office noise in the background during their task. Some participants were split into another experimental group that wore headphones with nature sounds to mask the office noises. The students who performed the tasks without masking had lower performance than those for whom the noise was masked.

This research study used headphones to play the noise, and the headphones were worn for all three noise levels: office noise, white noise, and no noise. Varying the background sound across these levels makes it possible to determine whether one noise level has a stronger effect than another.

2.3 Mental Workload Analysis

2.3.1 NASA TLX

The NASA Task Load Index (NASA-TLX) is a validated instrument for measuring subjective mental workload in studies concerned with the cognitive load on subjects (Hart, 2006; Francisco Ruiz-Rabelo et al., 2015; Gerhard & de Winter, 2015; Hu, Lu, Tan, & Lomanto, 2016; Liang, Rau, Tsai, & Chen, 2014; Marquart, Cabrall, & de Winter, 2015; Rubio, Díaz, Martín, & Puente, 2004; Sönmez, Oğuz, Kutlu, & Yıldırım, 2017). The dimensions that the NASA-TLX assesses are mental demand, physical demand, temporal demand, performance, effort, and frustration. Each dimension is rated by subjects on a scale out of 20, and the ratings can either be weighted to obtain a global score or summed across all six dimensions to obtain a raw score (Francisco Ruiz-Rabelo et al., 2015; Rubio et al., 2004). Other mental workload questionnaires exist, such as the Subjective Workload Assessment Technique (SWAT) and Workload Profile (WP). NASA-TLX was chosen for this study because of its better range for a rating scale, the number of dimensions that directly relate to the study (mental workload, frustration, temporal demand, and effort), and the lack of time pressure ratings, which were not necessary for the study (Rubio et al., 2004).
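The two scoring options described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure used in this study: the function names, the example ratings, and the example weights are hypothetical, and the weighted score assumes the standard NASA-TLX procedure in which each weight is the number of times a dimension is chosen across the 15 pairwise comparisons.

```python
DIMENSIONS = (
    "mental demand", "physical demand", "temporal demand",
    "performance", "effort", "frustration",
)

def raw_tlx(ratings):
    """Raw score: the sum of the six dimension ratings (each 0-20 here)."""
    return sum(ratings[d] for d in DIMENSIONS)

def weighted_tlx(ratings, weights):
    """Weighted global score on the same 0-20 scale as the ratings.

    Assumption: weights are tallies from 15 pairwise comparisons,
    so they must sum to 15.
    """
    total = sum(weights[d] for d in DIMENSIONS)
    if total != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / total

# Hypothetical ratings and weights for one subject on one task.
example_ratings = {
    "mental demand": 14, "physical demand": 4, "temporal demand": 10,
    "performance": 6, "effort": 12, "frustration": 8,
}
example_weights = {
    "mental demand": 5, "physical demand": 0, "temporal demand": 3,
    "performance": 1, "effort": 4, "frustration": 2,
}

raw_score = raw_tlx(example_ratings)
global_score = weighted_tlx(example_ratings, example_weights)
```

The raw score treats all six dimensions equally, whereas the weighted score lets dimensions that a subject judged more relevant to the task contribute more to the result.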

2.3.2 Eye-tracking

Eye-tracking parameters, such as blinks, fixations, saccades, and pupil dilation, are a relevant way to track the mental workload of a subject (Cardona & Quevedo, 2014; Gao, Wang, Li, Dong, & Song, 2013; Gerhard & Joost, 2015; He, Wang, Gao, & Chen, 2012; Holmqvist et al., 2011; Marquart, Cabrall, & de Winter, 2015; Tokuda, Obinata, Palmer, & Chaparro, 2011; Van Orden, Limbert, Makeig, & Jung, 2001). Marquart, Cabrall, and de Winter (2015) utilized an arithmetic task and viewed the changes in pupil diameter. The results showed that the mean pupil diameter and the change in pupil diameter correlated with the difficulty of the arithmetic problem that was presented to the subject. He et al. (2012) investigated pupil size as a measure of mental workload and also examined fixation duration. Fixation duration decreased with increased time pressure of the task for smaller time pressures, but it increased when the pressure began to overload the subject mentally. As in previous studies, He et al. (2012) showed that pupil size increases with mental workload, as well as time pressure. Gao, Wang, Li, Dong, and Song (2013) compared seven different eye-tracking measures and found that blink rate was sensitive to overall task complexity and that blink duration increased over the task period. However, Faure, Lobjois, and Benguigui (2016) attempted to use blink rate as a means of quantifying mental workload and found that there was not a significant correlation.

Cardona and Quevedo (2014) found that blink rate did not vary significantly across

complexity levels; however, large-amplitude saccades (amplitude being the angular distance the eye moves)

accompanied by a blink were related to high cognitive demands. The main findings of Tokuda,

Obinata, Palmer, and Chaparro (2011) showed that saccadic intrusions (SI) were regularly

observed during higher mental workload. SI is a type of eye movement that is jerky and quick.

Compared to a micro-saccade (also a quick, jerky eye movement) SI has a larger amplitude and

does not usually change gaze direction. Pupil diameter is also related to mental workload;

however, SI was still a better indicator of mental workload for this study (Tokuda, Obinata,

Palmer, & Chaparro, 2011).

Van Orden, Limbert, Makeig, and Jung (2001) found that different eye-tracking

parameters distinguished mental workload. They used target density in a mock air warfare task to

vary the task complexity, thus changing the mental workload of the subjects that performed the

tasks. There was a decrease in blink duration and frequency when the target density (complexity)

increased. The opposite happened with fixation frequency; more fixations occurred when the

target density increased.

This research study utilized several eye-tracking parameters. Pupil diameter mean was

compared to a control due to the different pupil sizes of the subjects. Pupil diameter standard


deviation for the subjects was also calculated. Automatic gaze mapping was utilized on two

tasks to assess the fixation rates, fixation durations, and fixation counts during those tasks.

2.3.3 Heart Rate Variability

Heart rate variability can be used as a physiological measure of mental workload in

several ways. There are several HRV parameters that can be calculated by using the R-R

interval. Figure 1 shows an illustration of an R-R interval.

Figure 1: Illustration of R-R interval in EKG signal

Mansikka, Virtanen, Harris, and Simola (2016) provide a table that helps in interpreting

heart rate variability and how it changes with pilot mental workload (PMWL). Though the tasks

presented in this study are not the same as those required of a pilot, they are tasks that may cause

increased cognitive load. Thus, the PMWL presented in Table 1 created by Mansikka et al.

(2016) can relate to this study as well. Table 1 summarizes how the HRV measures in this study

are expected to change with increased mental workload. The parameters found in the table can be

calculated using the R-R intervals (also referred to as normal to normal or N-N intervals).


Table 1: HR and HRV parameters with expected changes with increased mental workload

| Measure | Description | Expected Change |
| --- | --- | --- |
| MEANRR | The mean of RR intervals | Decrease |
| SMEANRR | The mean compared to control "resting" heart rate | Decrease |
| SDRR | The standard deviation of RR intervals | Decrease |
| RMSSD | The square root of the mean squared differences between successive RR intervals | Increase |
| LF/HF | The ratio between the power of low frequency (LF) and high frequency (HF) components of HRV | Increase |

Mansikka, Virtanen, Harris, and Simola (2016) were able to differentiate autonomic

nervous system response variation between the task segments, rather than just between rest and

trial conditions. While there were significant HRV/HR differences between segments of the

tasks, there were no significant differences in performance. This is unexpected,

considering that other studies found performance to decrease when mental workload increases

(Hu et al., 2016; Nickels, 2014; Prabhu, Smith, Yurko, Acker, & Stefanidis, 2010).

Analysis of HRV defines two different categories of parameters: time domain measures

and frequency domain measures. LF, HF, and LF/HF all assess variance over a longer period of

time in terms of frequency. RMSSD, SDNN, and MEANRR utilize the normal to normal beat

intervals (time between heart beats) (Heine, Lenis, Reichensperger, Beran, Doessel, & Deml,

2017; Mansikka et al., 2016; Sugimoto, Kitamura, Murai, Wang, & Wang, 2016). These time

and frequency domain measures have been analyzed in studies concerned with mental workload.

One study examined drivers’ mental workload by using a series of R-R intervals gathered from

the subjects and analyzed them based on rhythmical or morphological features (based on the

quantifying form of the ECG waves) (Heine et al., 2017). Unfortunately, none of the features

could distinguish the different levels of mental workload. Sugimoto, Kitamura, Murai, Wang,

and Wang (2016) not only used the time domain measures of R-R intervals, but the low/high

frequency (LF/HF) components of the ECG as well. The study showed an increase in LF/HF,

thus an increase in mental workload, through specific events during the study.

The present study gathered R-R intervals and used them to obtain frequency domain

and time domain measures. The frequency domain parameter taken from the R-R interval data

was LF/HF. The ratio of LF/HF is assumed to show a shift in dominance from the

parasympathetic to the sympathetic (Billman, 2013). The ratio of LF/HF has been shown to

increase when mental workload increases, which correlates with this shift to the sympathetic

nervous system. The high frequency range is 0.15 to 0.4 Hz, while the low frequency range is 0.04

to 0.15 Hz. These frequency components are partitioned from the total variance of the continuous

series of beats (Billman, 2013; Mansikka, et al., 2016). Because the time interval the data was

collected from was short, statistical significance was not expected from the frequency domain

parameter; thus, only the LF/HF ratio was used in the analysis. The time domain parameters that

were taken from the data were the mean of the normal to normal beats, the standard deviation of

the normal to normal beats, and the square root of the mean squared difference (RMSSD)

between normal to normal beats. Because the mean is not normalized, and humans tend to have

different resting heart rates, another parameter, standardized mean R-R intervals, was

analyzed as well to account for this difference between subjects. The standard deviation examines

the change in the R-R intervals of each subject, so it is not necessary to compare this value to a

control. This also holds true for the RMSSD value (Equation 1).

$$\mathrm{RMSSD} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n-1}\left(RR_{i+1} - RR_i\right)^2} \qquad (1)$$


To compute the RMSSD, each normal to normal or R-R interval ($RR_i$) is

subtracted from the next R-R interval ($RR_{i+1}$), and the difference is squared. This is repeated

for each successive pair of intervals until the end of the specified time interval. The squared

differences are then summed and divided by $n-1$, where $n$ is the number of sampled R-R

intervals in this time interval, before the square root is taken (Vollmer, 2015). The mean and

standard deviation of the R-R intervals are expected to decrease whereas the RMSSD is expected

to increase (Mansikka et al., 2016). A study by Guo, Tian, Tan, Zhao, and Li (2016) agreed with

the RMSSD value increasing with mental workload, but found that the standard deviation of R-R

intervals increased. Yet another study, conducted by Arnrich, Cinaz, Arnrich, La Marca, and

Troester (2011), found a decrease in the RMSSD value, disagreeing with Table 1. Research is inconclusive and more is necessary to

definitively prove that a certain change in a physiological parameter like RMSSD or standard

deviation of R-R intervals correlates with a change in mental workload.
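
The time domain parameters described in this section can be sketched in Python as follows. This is a minimal illustration and not the analysis code used in the study. The function names, the sample intervals, and the sign convention for the standardized mean (task mean minus control mean, chosen so that the value decreases as in Table 1) are assumptions made for illustration.

```python
import math
import statistics


def mean_rr(rr):
    """Mean of the R-R intervals (same unit as the input, e.g. ms)."""
    return statistics.fmean(rr)


def sd_rr(rr):
    """Sample standard deviation of the R-R intervals."""
    return statistics.stdev(rr)


def rmssd(rr):
    """Root mean square of successive differences, following Equation (1)."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]  # n - 1 successive differences
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def standardized_mean_rr(task_rr, control_rr):
    """Task mean minus control mean (assumed sign convention)."""
    return mean_rr(task_rr) - mean_rr(control_rr)


# Hypothetical R-R intervals in milliseconds, for illustration only.
control = [820, 810, 830, 825, 815]
task = [760, 770, 750, 765, 755]

print(mean_rr(task), sd_rr(task), rmssd(task))
print(standardized_mean_rr(task, control))  # negative when task intervals are shorter
```

Because the standard deviation and RMSSD describe variation within one recording, only the mean is compared against the control here.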

2.4 Personality

Personality tests are widely used in psychology to assess the mindset of humans. If

features of personality interact with environmental variables to affect performance, personality

tests can be further used to ultimately customize an individual’s work environment.

Since there are a myriad of personality tests to choose from, this study conducted a review

to determine the most appropriate test for the participants of this study. The tests that were

reviewed include the Myers-Briggs, Eysenck Personality Questionnaire, and Big Five Inventory.

Myers-Briggs is among the most common personality tests (Pittenger, 1993; Gerras &

Wong, 2016; Cooper, McCord, & Campbell, 2017; Boonghee Neelankavil, de Guzman, & Lim,

2013; Pittenger, 2005; Brotherton, 2012). Every year, millions of copies are distributed to


schools, churches, the workplace, and counseling centers (Pittenger, 1993). Many users have

accessed the true form of the Myers-Briggs while others have found an online version that was

not a true copy of the test. Myers-Briggs personalities are based on typologies, or distinct

groups of people that the user taking the test can fit into, such as ISTJ (Introverted, Sensing,

Thinking, Judging) or ENFP (Extroverted, Intuitive, Feeling, Perceiving). The assignment of

people into one of 16 groups is restrictive because the idea of the user fitting into more than one

group is not accounted for (Myers, 2016; Pittenger, 2005). While a person may fall into an ISFP

group, their other personality traits that are not covered by the test may closely resemble one of

the other fifteen groups. The Myers-Briggs will also place someone into a single group even when

one of their scores only barely qualifies them for that group because it falls near the middle

of the scale. If a user takes the test and scores 1 percent more for introversion than extroversion,

they are placed into a group known as “Introverts.” The scale scores for introvert vs. extrovert or

any of the three other personality type pairs must be sufficiently large to make a definite

distinction and place an individual into a group (Pittenger, 2005).

Another personality test that is widely used is the Eysenck Personality Questionnaire

(EPQ). Several studies that have tested a noisy and quiet work environment have provided this

test to their participants (Furnham & Strbac, 2002; Belojevic, Jakovljevic, & Slepcevic, 2003;

Dobbs et al., 2017; Jafari & Kazempour, 2013). The noisy backgrounds common in the studies

that incorporated the EPQ ranged from a “white noise” effect with a low frequency hum to

music. These studies found that under noisy conditions, extroverts performed better and/or faster

on one or more tasks, whereas introverts did not improve or even decreased in performance levels

compared to when tasks were performed under quiet conditions.


Over the span of 50 years, the EPQ has changed dramatically, with the early form, called

the Maudsley Medical Questionnaire, having 40 items and the Revised Eysenck Personality

Questionnaire (EPQR) in 1985 containing 100 items (Francis, Lewis, & Ziebertz, 2006; Gentry,

Wakefield, & Friedman, 1985). While a shorter form was devised in 1985 (EPQR-S), the 48-

question test requires further research to test the validity (Francis et al., 2006). The EPQ test

surveys an individual’s extroversion, neuroticism, and psychoticism while also producing a “lie

score” that tells the experimenter whether the person taking the test is lying to make themselves

look good (Gentry et al., 1985).

The last personality test reviewed was the Big Five Inventory. Created in the 1980s, the

Big Five Inventory is composed of short phrases in a 44-item inventory, measures five different

dimensions, and takes 5 minutes or less to finish (Rammstedt & John, 2007). Multiple studies

have validated the Big Five Inventory’s use in determining job performance (Aarde, Meiring, &

Wiernik, 2017; Rodriques & Rebelo, 2013; Judge, Rodell, Klinger, & Simon, 2013; Alessandri

& Veccione, 2012). Despite its short test time, the Big Five Inventory has proven to be a reliable

and valid survey around the world (Alansari, 2016; Aterberry, Martens, Cadigan, & Rohrer,

2014; Lovik, Verbeke, & Molenberghs, 2017; Reyes Zamorano, Álvarez Carrillo, Peredo Silva,

Sandoval, & Rebolledo Pastrana, 2014). One study tested the validity of the Big Five using 685

undergraduate students from Kuwait. The study’s primary purpose was to test the reliability and

validity of the Arabic translation of the Big Five (Alansari, 2016). A Belgian paper analyzed a

study that used a Dutch version of the Big Five (Lovik et al., 2017). The study utilized a Flemish

population sample where nearly 10,000 surveys were collected. The analysis of the original

study validated the imposed five-factor structure of the Dutch version of the personality test.

Another study tested participants from a Midwestern university in the United States and found


that the reliability score of the Big Five Inventory was acceptable (Aterberry et al., 2014). In

Mexico City, a population of 472 adult males and females took the Big Five Inventory (Reyes

Zamorano et al., 2014). The large sample size was used to determine the reliability of the test,

which the study confirmed using a procedure that produced a reliability score.

Taken together, all of these recent studies found the Big Five Inventory to be reliable, even as it

travelled across borders.

The Myers-Briggs is often passed around to individuals within companies, schools, and

churches to assess personality while others take a free yet potentially inaccurate version online.

Because there is the chance that answers will be skewed due to participants already taking a form

of the test before and being placed into a particular typology group, the widespread Myers-

Briggs was not chosen for this study. The EPQ is a test with a lengthy read time, but only three

personality dimensions analyzed. While the three facets in this test do break down further, the

three aspects are not favorable for a study that is looking for the effect of not only noise, but

stress when performing a task. Because there are three tasks in the study, each being performed

more than once, a shorter test than the EPQ was desired (Francis et al., 2006). Thus, a personality

test that was quicker, easier, and had more facets that supported the study was preferred. The

Big Five Inventory, as stated previously, takes only a few minutes to administer to a user. The

five dimensions presented in the Big Five were believed to best correlate to the test variables

presented in this study. These items include extroversion vs. introversion, openness vs.

closedness to experience, emotional stability vs. neuroticism, agreeableness vs. disagreeableness,

and conscientiousness vs. lack of direction (John, Naumann, & Soto, 2008). In addition to this,

the Big Five Inventory has been validated across different cultures and ethnicities, which is an


important measure for this study since many of the participants come from a university with a

diverse population of students.

3.0 Methods

3.1 Experimental Design

The goal of this study is to understand whether task performance, mental workload, and

duration of different types of tasks are impacted by noise. The study was a 3 x 3 experimental

design with no repetition of any factor level. Thus, this experiment was neither a within- nor a

between-subject design. This type of design allowed for all the factor levels to be experienced

once by every subject without the subject potentially getting used to a level that they have

already experienced. Further tables and illustrations of the experimental design and combination

distribution can be found in Appendix A. The independent variables are task (data entry,

anomaly detection, and mathematical arithmetic) and noise type (white, office, and no noise).

The dependent variables are performance, duration, and mental workload (subjective and

physiological measures).


Table 2: Independent and Dependent Variable Lists

| Independent Variable | Levels of Independent Variable |
| --- | --- |
| Task Type | Arithmetic; Anomaly Detection; Data Entry |
| Noise Level | Office Noise; White Noise; No Noise |

| Dependent Variables |
| --- |
| Performance |
| Task Duration |
| Subjective Mental Workload (NASA-TLX) |
| Pupil Dilation Mean |
| Pupil Dilation Standard Deviation |
| Fixation Counts |
| Fixation Rate |
| Fixation Duration |
| Mean/Standardized Mean R-R Intervals |
| Standard Deviation R-R Intervals |
| RMSSD of R-R intervals |
| LF/HF Ratio |

3.2 Participants

The experiment included 60 participants (29 females and 31 males) recruited from

Wright State University. Ages of participants ranged from 18 to 31 (M=22.7, SD=2.1). Each subject

was assigned all three task types and all three sound conditions, but the combination of the sound

condition and task types was determined randomly. The order of the task-noise pairings (nine

possible pairings in total) was randomized using Microsoft Office Excel 2016. This study was approved

by Wright State University’s Institutional Review Board. Participants were not monetarily

compensated for their participation, but those who were eligible for extra credit in their classes

for participating in a study were able to use their participation for this credit.
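
The random assignment of task-noise pairings can be sketched as follows. This is a minimal Python illustration, not the Excel procedure used in the study; the function name, the condition labels, and the seed are assumptions. Each subject receives every task and every noise condition exactly once, with both the pairing and the order shuffled.

```python
import random

TASKS = ["data entry", "anomaly detection", "arithmetic"]
NOISES = ["no noise", "white noise", "office noise"]


def assign_pairings(rng):
    """Return three (task, noise) pairs in random order.

    Every task and every noise level appears exactly once, so a subject
    never repeats a factor level.
    """
    tasks = TASKS[:]
    noises = NOISES[:]
    rng.shuffle(tasks)   # random task order
    rng.shuffle(noises)  # random pairing of noise condition to task
    return list(zip(tasks, noises))


rng = random.Random(2018)  # hypothetical seed, used only for reproducibility
schedule = {subject: assign_pairings(rng) for subject in range(1, 61)}
print(schedule[1])
```

Shuffling the two lists independently gives each of the nine possible pairings the same chance of occurring for a given subject.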


3.3 Stimuli and Apparatus

The experiment was conducted in the Neuroscience Engineering Collaboration building

Lab 431 where both temperature and lighting were held constant. Prior to the study, a Well at

Walgreens Deluxe Arm Blood Pressure Monitor was used to collect blood pressure and pulse

rate data. Additionally, a noise tolerance questionnaire and the Big Five Questionnaire (found in

Appendix B) were given to the subject at the beginning of the experiment. The NASA-TLX was

administered after each task. During the actual experiment, a Biopac Student Lab MP36 Data

Acquisition Unit was used to collect EKG data. Tobii Pro Glasses 2 were placed on each

participant along with headphones. The computer used for the data entry task had a monitor that

measured 15 inches diagonally, while the GUI itself on the computer screen measured 2 by 6

inches. A Grafco 5709 manual pill counter was used for the mathematical arithmetic task, while

a Restar 2.4 GHz Laser Presenter was used by the subject to circle anomalies in the anomaly

detection task.

Noise Level. ATH-M40x Professional Monitor Headphones were kept on the participant

throughout the experiment regardless of the sound playing. The noise was set at a safe level,

approximately 67 dB (Rabinowitz, 2000). The control level was simply no noise, the next level

was a crowd of people talking indistinctly (representing a white noise effect), and the last noise level

was intermittent office noises: a copy machine, typing on a keyboard, a phone ringing, and a jet flying

overhead.

Task Type. The three tasks were related to the medical field. The data entry task mimics the task

a nurse must do to gather patient information and type it into an EHR on a computer (the

graphical user interface used for this task is displayed in Figure 2). The anomaly detection task


used four x-rays of the same body part in each of five sets. In each set, one of the four x-rays

had an anomaly. An example x-ray set with the anomaly circled can be found in Figure 3. The

third task was a mathematical arithmetic task that used a pharmaceutical pill counter, pill bottles,

and beads (to represent pills). A photo of the mathematical arithmetic task set-up and problem set

can be seen in Figure 4.

Figure 2: Graphical user interface for data entry task

Figure 3: Anomaly detection task with anomaly circled


Figure 4: Arithmetic task set-up with manual pill counter, beads, and pill bottle

3.4 Procedure

After signing the informed consent, the subject filled out the noise tolerance

questionnaire and the Big Five Inventory (questionnaires found in Appendix B). Once

completed, the subject was asked their age. Then, their blood pressure and pulse rate were taken,

and they were asked whether these values were normal. If the blood pressure cuff reading was not

successful after three attempts, the pulse rate was taken manually at the left wrist by the principal

investigator. The subject was then asked to clean the skin on both inner ankles and the inner

wrist of their non-dominant side. Once dry, disposable electrodes were attached so the Biopac

system could collect EKG data. The Tobii Pro Glasses 2 wearable eye tracker was placed on the face

of the subject. Headphones were placed over the subject’s ears to be worn throughout the

experiment while the glasses were calibrated and the EKG signal was checked for any

problems. To ensure that the subject could hear the sound coming from the headphones, the

sound was played and they were asked if they could hear the sound and if it was too loud. If the

sound was too loud for the subject, the noise was turned down until the subject was comfortable

with the sound level.


The subject was given instruction on each task and verbally confirmed their

understanding of the task to be performed. The anomaly detection task had two example sets of

x-rays given to the subject for brief training on what sort of anomalies will appear in the five

sets. As the subject performed the three tasks, they were exposed to three different background

noise levels. At the end of each task, a NASA-TLX was completed by the subject based on the

task just performed. The experiment was considered complete once all three tasks were finished

and the corresponding NASA-TLX questionnaire for the last task was filled out. The EKG signal

was saved, the eye-tracking recording stopped and stored, and the subject was permitted to

remove the data collecting equipment and leave.

3.5 Data Analysis

3.5.1 Performance Scoring

Each task was graded on a different scale, but all were converted to percentages for

analysis. The anomaly detection task gave the subject three opportunities to identify the anomaly

correctly on each of the five sets. The anomaly detection task was graded on a 15-point scale,

meaning that each set had 3 points possible associated with the 3 guesses allotted. The subject

started with 15 points, and each wrong answer subtracted a point from their total score. The

lowest score possible of 0/15 meant that the subject did not answer any of the anomalies

correctly, where a 15/15 meant that the subject had no wrong answers.

The data entry task was graded on a 10-point scale, each point associated with one entry

on the GUI. A completely wrong answer or a blank entry meant no point was received for that

entry. If an entry was misspelled or pieces of the entry were missing, a half point was taken off.


So, a 0/10 meant that all the entries were wrong or empty and a 10/10 meant all the entries were

filled in completely with the correct spelling.

The mathematical arithmetic task was the most complex task to score because of all the

possible mistakes that could occur. Each correctly placed bead (pill) in each bottle was therefore

worth one point, for a total of 38 points. Since there was no way to

interpret the mental arithmetic of the subject during the task, each bottle was scored solely on the

correct bead placement for each problem. Each bottle was associated with one problem for a total

of five problems/bottles.

The three scores associated with the three tasks for each subject were placed in JMP® by

SAS® (2014) for ANOVA and correlational testing.
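
The three scoring rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scoring procedure as implemented in the study: the function names, the labels "correct", "partial", and "wrong", and the example inputs are hypothetical.

```python
def anomaly_score(wrong_guesses_per_set):
    """15-point scale: start at 15 and lose one point per wrong guess.

    Each of the five sets allows at most three guesses, so at most
    three points can be lost per set.
    """
    lost = sum(min(w, 3) for w in wrong_guesses_per_set)
    return 100 * (15 - lost) / 15


def data_entry_score(entries):
    """10-point scale with half credit for misspelled or incomplete entries."""
    points = {"correct": 1.0, "partial": 0.5, "wrong": 0.0}  # "wrong" includes blank
    return 100 * sum(points[e] for e in entries) / 10


def arithmetic_score(correct_beads_per_bottle, total_beads=38):
    """One point per correctly placed bead across the five bottles."""
    return 100 * sum(correct_beads_per_bottle) / total_beads


# Hypothetical results for one subject.
print(anomaly_score([0, 1, 0, 3, 0]))
print(data_entry_score(["correct"] * 8 + ["partial", "wrong"]))
print(arithmetic_score([8, 8, 7, 8, 7]))
```

Converting every task to a percentage places the three scales on a common footing before the ANOVA.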

3.5.2 Duration of task

Each task was timed, starting when the principal investigator verbally told the subject to

start and ending when the subject completed the task. None of the subjects were told that the

tasks were timed, and no time constraint was set in place to avoid undesired mental workload

due to time pressure felt by the subject. The stress of noise on each of the tasks was the desired

factor to be assessed, not time pressure. The duration of each task was placed into JMP® by

SAS® (2014) for ANOVA and correlational tests.

3.5.3 NASA-TLX

The NASA-TLX is based on a 20-point scale. The scales from each NASA-TLX were

summed and the raw score was taken for analysis (Rubio et al., 2004). Because there were three

tasks to grade with their associated noise level, each subject had three different raw subjective


mental workload scores. The scores were placed into JMP® by SAS® (2014) for ANOVA and

correlational tests.
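
The raw NASA-TLX score can be sketched as a simple sum of the six subscale ratings. The example below is illustrative only; the function name, the assumed 0 to 20 rating bounds, and the sample ratings are assumptions rather than details taken from the study.

```python
TLX_DIMENSIONS = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Sum the six subscale ratings (each assumed to lie between 0 and 20)."""
    for dimension in TLX_DIMENSIONS:
        if dimension not in ratings:
            raise ValueError(f"missing rating: {dimension}")
        if not 0 <= ratings[dimension] <= 20:
            raise ValueError(f"rating out of range: {dimension}")
    return sum(ratings[d] for d in TLX_DIMENSIONS)


# Hypothetical ratings for one task.
example = {
    "mental demand": 12,
    "physical demand": 4,
    "temporal demand": 9,
    "performance": 6,
    "effort": 14,
    "frustration": 10,
}
print(raw_tlx(example))  # 55
```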

3.5.4 Eye-tracking

The files created from the Tobii Eyetracking Controller Software were imported into the

Tobii Analyzer software for analysis. Only the anomaly detection task and the data entry tasks

were given areas of interest (AOI) to analyze the fixation rate, fixation duration, and fixation

counts from the task. An AOI defines a set region within the task’s stimulus for which more

information is to be gathered, in this case the fixation parameters (Holmqvist et al., 2011). The

AOIs for this study can be seen in Appendix C.

The task intervals were manually marked, and the gaze data for the AOIs was mapped

automatically using the software. All the tasks were analyzed for pupil diameter, whereas the AOI-based

fixation parameters for the anomaly detection and data entry tasks were analyzed separately in JMP® by SAS® (2014). All three tasks

and noise levels were included in the analysis of the dependent variable pupil diameter, expressed as the average

difference from the control, which was recorded when no task was being performed, from the

start of the recording to the start of the first task.
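
The baseline correction of pupil diameter can be sketched as follows. This is a minimal illustration, assuming pupil diameter samples in millimetres and a hypothetical function name; it is not the Tobii or JMP workflow used in the study.

```python
from statistics import fmean, stdev


def pupil_metrics(task_samples, control_samples):
    """Mean pupil diameter relative to the pre-task control, plus the raw SD.

    The standard deviation is left unnormalized, while the mean is
    expressed as a difference from the control period.
    """
    baseline = fmean(control_samples)
    return {
        "mean_change": fmean(task_samples) - baseline,
        "sd": stdev(task_samples),
    }


# Hypothetical pupil diameters in millimetres.
control = [3.0, 3.1, 2.9]
task = [3.4, 3.6, 3.5]
print(pupil_metrics(task, control))
```

Subtracting each subject’s own control mean removes differences in resting pupil size between subjects.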

3.5.5 Heart Rate Variability

Using Python IDLE Version 3.6, Biopac acquisition files were converted into a readable

text file format. A MATLAB (The Mathworks, Inc) code scanned through the file to obtain the

R-R intervals. These intervals were sifted through to obtain the LF/HF, R-R interval means,

standard deviation of R-R intervals, and the root mean squared differences of successive R-R

intervals. Three values were obtained for each parameter from each subject to compare

the three tasks and the three noise levels. For control comparison, the 60 seconds before the

first task started were recorded from most subjects. Not all subjects had a full 60 seconds before

the start of their task; for those subjects, the available time before the task was used instead. During

this time, the subject was given instructions. Using only the 60 seconds of EKG signal before

the task ensured that longer intervals, during which electrodes may have needed

adjustment, were not taken as the control. This control allowed for further comparison between

when the subject was not working a task to when they were for the standardized mean R-R

interval parameter.
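
One way to estimate the LF/HF ratio from a series of R-R intervals is sketched below in pure Python. This is not the MATLAB code used in the study. The 4 Hz resampling rate, the linear interpolation, the plain periodogram, and all function names are assumptions chosen for a short, self-contained example; only the band limits come from the text.

```python
import cmath
import math

LF_BAND = (0.04, 0.15)  # Hz, low frequency range given in the text
HF_BAND = (0.15, 0.40)  # Hz, high frequency range given in the text


def resample_rr(rr_s, fs=4.0):
    """Linearly interpolate R-R intervals (seconds) onto an even time grid."""
    times, t = [], 0.0
    for rr in rr_s:
        t += rr
        times.append(t)  # time of the beat that ends each interval
    n_samples = int((times[-1] - times[0]) * fs)
    out, j = [], 0
    for k in range(n_samples):
        tk = times[0] + k / fs
        while j < len(times) - 2 and times[j + 1] < tk:
            j += 1
        frac = (tk - times[j]) / (times[j + 1] - times[j])
        out.append(rr_s[j] + frac * (rr_s[j + 1] - rr_s[j]))
    return out


def band_power(x, fs, band):
    """Sum of periodogram power at the DFT frequencies inside the band."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    power = 0.0
    for k in range(1, n // 2 + 1):
        if band[0] <= k * fs / n < band[1]:
            coef = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                       for i, v in enumerate(centered))
            power += abs(coef) ** 2
    return power


def lf_hf_ratio(rr_s, fs=4.0):
    """Ratio of low frequency to high frequency power of the R-R series."""
    x = resample_rr(rr_s, fs)
    return band_power(x, fs, LF_BAND) / band_power(x, fs, HF_BAND)


def synthetic_rr(freq_hz, n_beats=375):
    """Hypothetical R-R series (seconds) oscillating at one frequency."""
    rr, t = [], 0.0
    for _ in range(n_beats):
        value = 0.8 + 0.05 * math.sin(2 * math.pi * freq_hz * t)
        rr.append(value)
        t += value
    return rr


print(lf_hf_ratio(synthetic_rr(0.10)) > 1.0)  # oscillation inside the LF band
print(lf_hf_ratio(synthetic_rr(0.25)) < 1.0)  # oscillation inside the HF band
```

The synthetic series only check that power is assigned to the intended band; real recordings would also need artifact handling before such an estimate is meaningful.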

3.6 Hypotheses

3.6.1 Performance

This study hypothesized that when noise interrupts the human subject, task performance

will decrease. Based on the hypothesis, the no noise condition should show the best performance

scores on average, and the office noise (deemed as the more disruptive noise condition) should

show the lowest performance averages. As for task conditions, it is predicted that being a simpler

task, the data entry task will show relatively high scores. However, due to the concept of MRT,

this task will have worse performance scores when noise interrupts (the interaction of this task

and noise will show lower scores than other tasks and noise interactions). Data entry requires the

auditory and verbal cognitive channels while the other tasks do not. The noise will interfere with

the auditory channel, which leaves more possibilities for errors.

3.6.2 Duration of the Tasks

The study hypothesized that the duration of the task will increase when noise is played in

the background. The reasoning for this was that when the subject is interrupted, the subject must

recover before returning to the task at hand, a process that will require time. The task that is

predicted to take the most time is the arithmetic task due to the intense working memory and

computation necessary for solving the problems and the motor skills necessary for maneuvering


the beads and placing them into the bottles. However, the task type and noise interaction that is

predicted to show the highest task durations is, again, the data entry task with office noise due to the

auditory channels being overwhelmed when they are needed for the task.

3.6.3 NASA-TLX Mental Workload

The NASA-TLX subjective workload has different predictions in terms of the task.

Despite MRT, it is hypothesized that the subject will rate the arithmetic task higher because of

the amount of working memory necessary to complete this task. The noise that is predicted to

show the higher raw NASA-TLX scores on average is the office noise because of the disruptive

work environment the sounds create for the subject. As far as interaction, the office noise and

arithmetic task interaction is predicted to show the highest subjective mental workload scores

because of the intermittent, non-continuous nature of this auditory work environment and the

mental math the subject must do to complete the task.

3.6.4 Eye-tracking and HRV Parameters

The eye-tracking parameters that were analyzed are pupil diameter for all tasks and

fixation rates, counts, and durations for the anomaly detection and data entry tasks. It is

hypothesized that an increase in pupil diameter will indicate an increase in subject mental

workload (He et al., 2012). This parameter was normalized by taking the difference of it and the

control (time before the first task start). However, the pupil diameter’s standard deviation (not

normalized) will decrease with increased workload (Othman & Romli, 2016). For heart rate

variability, the predictions follow those of Table 1 when mental workload increases: mean and

standard deviation of R-R intervals will decrease, root mean squared differences of successive R-

R intervals will increase, and the LF/HF ratio will increase. If the mean is suspected to decrease

when mental workload increases, then the standardized version of this parameter should do the


same if the task’s values are subtracted from the control. The control should have a greater mean

R-R interval than when the subject is performing a mentally demanding task. The more demanding the task,

the lower the mean, and the smaller the difference between the control and task mean R-R intervals will be

(some will even be negative). It is predicted that the same tasks, noises, and task type noise

interactions observed as mental overload using subjective measures will also be observed using

physiological measures.

Fixation rates, counts, and durations were analyzed in two separate one-way ANOVAs in order

to ensure that noise was the only factor analyzed. It is hypothesized that the office noise will

create a more mentally taxing work environment for the subject because it is more disruptive.

The no noise condition will show less of an impact on mental workload. For these parameters,

when mental workload increases, the fixation counts will increase while the fixation duration and

rates will decrease (He et al., 2012; Holmqvist et al., 2011; Van Orden et al., 2001).

4.0 Results

From this study’s 60 subjects, the data for 2 subjects were completely excluded due to an

issue with the sound system and a misunderstanding with one of the subject’s arithmetic tasks.

To perform the statistical analysis, all the data was placed into JMP® by SAS® (2014). Two-

way ANOVAs for the two independent variables (task type and noise level) were performed. All

residual plots testing for normality and unequal variances can be found in Appendix D. The

connecting letters reports, created using Tukey’s HSD test, and interaction plots (for

performance, task duration, and NASA-TLX scores) can be found in Appendix E.

All analysis of the physiological measures of mental workload used the same sample size,

but a different sample size than the performance, duration, and NASA-TLX analysis. Subjects


were excluded from the analysis if less than 70% gaze data was collected, their EKG signal was

too noisy, or a technical difficulty occurred. After consideration, a total of 10 of the 60 subjects

were removed from this analysis. The eye-tracking and HRV data was also analyzed using

JMP® by SAS® (2014). However, instead of a two-way ANOVA for the AOI parameters

(fixation counts, fixation duration, and fixation rate) a one-way ANOVA was conducted for each

of the two tasks that had AOIs (data entry and anomaly detection). The values provided in this

section with ANOVA as the statistcal analysis are written as (mean ± standard error). The F ratio

and p-value are also given in the text as well as in the ANOVA tables. Both Microsoft Office

Excel 2016 and JMP® by SAS® (2014) were used for the correlation tests of personality and

noise tolerance measures, two factors that could not easily be controlled in the experiment.
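As a quick check on the two sample sizes, the degrees of freedom reported in the ANOVA tables of this section can be reproduced from the design. The sketch below assumes a full two-way model (3 task types by 3 noise levels, with interaction) and assumes that each retained subject contributes one observation per task; the function name and arguments are illustrative only and are not part of the JMP analysis.

```python
def two_way_anova_df(n_subjects, tasks_per_subject=3, task_levels=3, noise_levels=3):
    """Return the degrees of freedom for a two-way ANOVA with interaction.

    Assumption (not stated explicitly in the text): every retained subject
    contributes one observation per task, so N = n_subjects * tasks_per_subject.
    """
    n_obs = n_subjects * tasks_per_subject
    df_task = task_levels - 1
    df_noise = noise_levels - 1
    df_interaction = df_task * df_noise
    df_model = df_task + df_noise + df_interaction
    df_total = n_obs - 1
    df_error = df_total - df_model
    return {
        "task": df_task,
        "noise": df_noise,
        "interaction": df_interaction,
        "model": df_model,
        "error": df_error,
        "total": df_total,
    }


main_analysis = two_way_anova_df(58)   # performance, duration, NASA-TLX
physiological = two_way_anova_df(50)   # HRV and pupil diameter
print(main_analysis)
print(physiological)
```

Under these assumptions, 58 subjects give model, error, and total df of 8, 165, and 173, and 50 subjects give 8, 141, and 149, which matches the values in the ANOVA tables.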

4.1 Performance

There was no strong evidence, provided by the residual plots in Appendix D, that the

variances were different for the performance scores among the three tasks. The residual plots that

tested for normality did not show significant evidence to say that the data was not normally

distributed.

Figure 5: Average task type and noise level vs. performance (error bars are standard deviation)

Figure 5 shows the average performance score for each task type under the three noise conditions. The lowest performance appears to come from the anomaly detection task, while the arithmetic task appears to have the highest scores. Statistical analysis agrees with this trend. The model showed significance in terms of performance [F(8, 165) = 13.22, p < .0001]. The task type with the highest average performance score was the arithmetic task (92.60 ± 1.78), while the lowest was the anomaly detection task (70.02 ± 1.77). The noise level with the highest performance average was white noise (86.82 ± 1.77) and the lowest was office noise (81.3 ± 1.78). Among the interactions, the highest average performance score was the arithmetic task with white noise (96.18 ± 3.02) and the lowest was anomaly detection with office noise (64.33 ± 2.95).

Table 3 shows the ANOVA table for the performance metric. The task type factor was significant in influencing the performance scores (p-value < 0.0001), but the noise level was not. Figure 6 shows the connecting letters reports for each task type, each noise level, and the interactions, which further reveal the separation of anomaly detection from the rest of the tasks for performance scores.

Table 3: ANOVA for task performance

Source             DF   F ratio    p-value
Model              8    13.2252    <.0001
Error              165
Total              173
Task Type          2    49.8471    <.0001
Noise              2    1.5199     0.2218
Task Type * Noise  4    1.6065     0.175

Figure 6: Task type, noise level, and task type*noise level interaction connecting letters reports for performance

4.2 Duration

According to the residual plots in Appendix D, there was no strong evidence to suggest

that there were unequal variances for the duration times (recorded strictly in minutes, not

seconds) or that the duration data was not normally distributed.

Figure 7: Average of task duration vs. task type and noise level

From Figure 7, the data entry task appears to take the shortest amount of time to complete and the arithmetic task the longest. The statistics support the same conclusion as the averages presented in the graph. The model showed significance in terms of duration [F(8, 165) = 12.30, p < .0001]. The arithmetic task had the longest average task duration (4.46 min ± 0.19) and data entry had the shortest (1.89 min ± 0.19). For noise level, the longest average task duration occurred with no noise (3.51 min ± 0.19) and the shortest with white noise (2.85 min ± 0.19). For task type and noise interactions, the longest average task duration occurred for the arithmetic task with no noise (4.90 min ± 0.32) and the shortest for the data entry task with white noise (1.66 min ± 0.32).

Table 4 displays the ANOVA table for the task duration times. Figure 8 displays the connecting letters reports, which show the differences between the task types and show that the no noise and white noise levels differ from each other while both remain connected to the office (disruptive) noise level.

Table 4: ANOVA for task duration

Source             DF   F ratio    p-value
Model              8    12.30435   <.0001
Error              165
Total              173
Task Type          2    45.5581    <.0001
Noise              2    2.9149     0.057


Figure 8: Task type, noise level, and task type*noise level interaction connecting letters reports for task duration

4.3 NASA-TLX Mental Workload

According to the residual plots in Appendix D, there was no strong evidence to suggest that there were unequal variances for the NASA-TLX raw scores or that the data was not normally distributed.

Figure 9: Averages of NASA-TLX scores vs. task type and noise level

As with the other graphs that display the averages of the dependent variables, the conclusion drawn from Figure 9 on NASA-TLX scores matches that of the statistical analysis. Data entry appears to have the lowest scores, meaning the lowest stress felt by the subject. The anomaly detection task has the highest scores in most cases, except for the no noise condition, where arithmetic pulls ahead slightly. The model showed significance in terms of NASA-TLX raw scores [F(8, 165) = 7.885, p < .0001]. The task type with the highest average raw NASA-TLX mental workload score was anomaly detection (52.47 ± 2.32), and the task type with the lowest was data entry (28.92 ± 2.32). For noise level, the highest average NASA-TLX score was found for tasks with office noise playing in the background (45.24 ± 2.32) and the lowest when white noise played in the background (40.95 ± 2.32). For the interactions, the highest average score came from anomaly detection with office noise (55.86 ± 3.85) and the lowest from data entry with no noise (25.16 ± 4.05).

Table 5 shows the ANOVA for the subjective mental workload raw scores, obtained by summing the NASA-TLX scores for each task. Figure 10 shows the connecting letters reports for task type and noise level. It shows that the data entry task differs from the other two tasks in mental workload scores, but that the scores did not change with noise level, because all noise levels share the same letter.

Table 5: ANOVA for NASA-TLX mental workload scores

Source             DF   F ratio    p-value
Model              8    7.885192   <.0001
Error              165
Total              173
Task Type          2    27.9313    <.0001
Noise              2    0.6995     0.4983


Figure 10: Task type, noise level, and task type*noise level interaction connecting letters reports for NASA-TLX scores

4.4 Physiological Mental Workload

As stated before, the physiological mental workload measures had a different sample size due to issues that occurred with the Tobii Eye-tracking and Biopac systems. The same two subjects that were removed for the main ANOVA were removed from this analysis. Three more subjects were removed due to noisy EKG signals, and another subject was removed because the cord for the EKG recording came unplugged during the experiment, unbeknownst to the PI until after the task was finished. Five more subjects were removed because of a low gaze data score (<70%). Thus, the sample size subjected to analysis was 50 subjects, as opposed to the original 60 subjects that were run for the experiment. According to the residual plots in Appendix D, there was no significant evidence that the variances for any of the physiological data were unequal or that the data was not normally distributed. Appendix E has all of the connecting letters reports.

4.4.1 Heart Rate Variability Analysis

As stated earlier, four parameters were analyzed for heart rate variability: one frequency-domain measure (LF/HF) and three time-domain measures (mean, standard deviation, and RMSSD).
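The three time-domain measures can be sketched directly from a series of R-R intervals. This is a minimal illustration, not the Biopac processing used in the study: it assumes the intervals are given in seconds, uses the sample (n - 1) standard deviation, and uses made-up interval values. LF/HF is omitted because it requires a spectral estimate.

```python
import math


def hrv_time_domain(rr_intervals):
    """Compute mean R-R, SDRR, and RMSSD from a list of R-R intervals.

    Assumptions for illustration: intervals are in seconds and the
    standard deviation uses the sample (n - 1) denominator.
    """
    n = len(rr_intervals)
    if n < 2:
        raise ValueError("at least two R-R intervals are required")

    mean_rr = sum(rr_intervals) / n

    # Standard deviation of the R-R intervals (SDRR).
    sdrr = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_intervals) / (n - 1))

    # Root mean square of successive differences (RMSSD).
    diffs = [b - a for a, b in zip(rr_intervals, rr_intervals[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    return {"mean_rr": mean_rr, "sdrr": sdrr, "rmssd": rmssd}


# Hypothetical R-R intervals (seconds), not data from this study.
example = hrv_time_domain([0.80, 0.82, 0.78, 0.81])
print(example)
```

RMSSD depends only on beat-to-beat changes, while SDRR reflects the overall spread around the mean, which is why the two can rank conditions differently.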

The LF/HF ratio showed significance in the model [F(8, 141) = 4.0032, p = 0.0003]. The highest LF/HF value for task type came from the anomaly detection task (0.914 ± 0.038) and the lowest from the data entry task (0.65 ± 0.039). The noise level with the highest LF/HF ratio was office noise (0.812 ± 0.038), while white noise had the lowest ratio (0.6957 ± 0.0385). The task and noise combination with the highest average LF/HF value was anomaly detection with office noise played in the background (0.994 ± 0.066), and the lowest was the data entry task with white noise in the background (0.606 ± 0.062). An ANOVA table for this ratio is found in Table 6.

Table 6: ANOVA for LF/HF ratio

Source             DF   F ratio    p-value
Model              8    4.003171   0.0003
Error              141
Total              149
Task Type          2    11.4415    <.0001
Noise              2    3.0275     0.0516
Task Type * Noise  4    0.5664     0.6874

The non-standardized mean did not show significance within the model [F(8, 141) = 1.649, p = 0.116]. However, the task type did have a statistically significant impact on the non-standardized mean, despite the difference in resting heart rates [F(2, 141) = 4.7176, p = 0.0104]. The task type with the lowest mean R-R interval on average was data entry (0.694 ± 0.015), while anomaly detection had the highest (0.755 ± 0.015). The highest mean R-R interval for noise level was office noise (0.723 ± 0.015), while the lowest was white noise (0.709 ± 0.015). The interaction with the highest mean R-R interval average was anomaly detection with office noise (0.770 ± 0.026). The lowest value was found for data entry with office noise (0.675 ± 0.026).

The standardized mean showed significance in the model [F(8, 141) = 3.82, p = 0.0004]. For the standardized mean, the highest task type was anomaly detection (0.055 ± 0.01) and the lowest was data entry (-0.008 ± 0.01). The noise level with the highest standardized mean was office noise (0.022 ± 0.01) and the lowest was white noise (0.008 ± 0.01). Anomaly detection with office noise in the background had the highest standardized mean on average (0.067 ± 0.017), and the arithmetic task with white noise in the background had the lowest (-0.028 ± 0.018). The ANOVA tables for the non-standardized and standardized means are presented in Tables 7 and 8.

Table 7: ANOVA for mean HRV (not standardized)

Source             DF   F ratio    p-value
Model              8    1.649596   0.116
Error              141
Total              149
Task Type          2    4.7176     0.0104
Noise              2    0.2387     0.7880


Table 8: ANOVA for standardized mean HRV

Source             DF   F ratio    p-value
Model              8    3.82       0.0004
Error              141
Total              149
Task Type          2    11.673     <0.0001
Noise              2    0.6072     0.5463
Task Type * Noise  4    1.5910     0.1800

The standard deviation of R-R intervals showed significance in the model [F(8, 141) = 3.02, p = 0.0036]. The task type with the lowest average standard deviation in R-R intervals was the anomaly detection task (0.07 ± 0.005), while the highest came from the data entry task (0.102 ± 0.005). Among noise levels, office noise had the lowest average standard deviation (0.08 ± 0.005) and no noise had the highest (0.09 ± 0.005). The task type and noise level pairing with the lowest average standard deviation was anomaly detection with office noise played in the background (0.067 ± 0.009). The pairing with the highest average for this HRV parameter was data entry with no noise (0.121 ± 0.010). Table 9 displays the ANOVA values for the standard deviation of R-R intervals.

Table 9: ANOVA for standard deviation HRV

Source             DF   F ratio    p-value
Model              8    3.022596   0.0036
Error              141
Total              149
Task Type          2    9.5674     0.0001
Noise              2    0.7837     0.4587

RMSSD of the successive R-R intervals showed significance in the model as well [F(8, 141) = 4.19, p = 0.0002]. The highest RMSSD for a task type was data entry (0.117 ± 0.007) and the lowest was anomaly detection (0.0634 ± 0.007). The noise condition with the highest average RMSSD was no noise (0.094 ± 0.007) and the lowest was office noise (0.081 ± 0.007). The highest RMSSD among the interactions was data entry with no noise (0.137 ± 0.0137) and the lowest was anomaly detection with office noise (0.053 ± 0.012). Table 10 displays the RMSSD ANOVA values.

Table 10: ANOVA for root mean squared differences of successive R-R intervals (RMSSD)

Source             DF   F ratio    p-value
Model              8    4.192103   0.0002
Error              141
Total              149
Task Type          2    14.815     0.0001
Noise              2    0.8665     0.4226

4.4.2 Eye-tracking Analysis

Pupil diameter was measured for all tasks and noise levels, and a control measurement was also taken before the start of the first task for each participant (maximum 60 seconds). This allowed the pupil diameter to be normalized by subtracting the control average pupil diameter from the average pupil diameter for each task. The greatest mean difference in pupil diameter compared to the control came from the data entry task (0.16 ± 0.076), while the lowest was anomaly detection (-0.121 ± 0.075). The noise level with the greatest mean difference in pupil diameter was no noise (0.041 ± 0.076), while the lowest was seen during the white noise condition (-0.014 ± 0.076). Table 11 shows the ANOVA values for mean difference in pupil diameter.

Table 11: ANOVA for mean difference in pupil diameter

Source             DF   F ratio    p-value
Model              8    1.3601     0.2192
Error              141
Total              149
Task Type          2    3.5097     0.0325
Noise              2    0.1741     0.8404
Task Type * Noise  4    0.9966     0.4116
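The baseline normalization behind the mean difference in pupil diameter can be sketched as follows. This is only an illustration of the subtraction described in the text: the function names and the sample values are hypothetical, and the units are whatever the eye tracker reports.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def normalized_pupil_diameter(task_samples, control_samples):
    """Task-average pupil diameter minus the pre-task control average.

    A positive result means the pupil was, on average, wider during the
    task than during the control period; a negative result means narrower.
    """
    return mean(task_samples) - mean(control_samples)


# Hypothetical samples, not data from this study.
control = [3.1, 3.1, 3.1]
task = [3.2, 3.4, 3.3]
print(normalized_pupil_diameter(task, control))
```

With this sign convention, the negative mean difference reported for anomaly detection indicates a smaller average pupil diameter during that task than during the control period.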

The task with the lowest pupil diameter standard deviation on average was data entry (0.261 ± 0.015), while the highest came from the arithmetic task (0.309 ± 0.015). The noise level with the lowest standard deviation in pupil diameter on average was office noise (0.274 ± 0.015), while the highest was no noise (0.293 ± 0.015). The interaction with the lowest average pupil diameter standard deviation was data entry with office noise in the background (0.249 ± 0.025), while the highest was the arithmetic task with no noise (0.334 ± 0.024). Table 12 shows the analysis for pupil diameter standard deviation.

Table 12: ANOVA for pupil diameter standard deviation

Source             DF   F ratio    p-value
Model              8    1.0550     0.3982
Error              141
Total              149
Task Type          2    2.7887     0.0649
Noise              2    0.4544     0.6358
Task Type * Noise  4    0.3058     0.8737

The fixation parameters (duration, rate, and count) were collected via automatic mapping by the Tobii Eye-tracking Analyzer Software. Two AOIs were created, one for the data entry task and one for the anomaly detection task. These AOIs can be found in Appendix C. Tables 13 and 14 below show the F ratios for the one-way ANOVAs done for the 3 noise levels and the t-ratios for each of the levels. The p-values presented for each show that there was no significant evidence, at the 95% confidence level, that noise level influenced any of the fixation parameters for either task.
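For reference, the F ratio of a one-way ANOVA such as the one run on each fixation parameter can be sketched in a few lines. This is a generic textbook computation, not the JMP procedure used in the study, and the group values below are invented for illustration.

```python
def one_way_anova_f(groups):
    """Return (F ratio, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per factor level
    (here, one list per noise level).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical fixation counts for three noise levels (not study data).
no_noise = [1, 2, 3]
white_noise = [2, 3, 4]
office_noise = [3, 4, 5]
print(one_way_anova_f([no_noise, white_noise, office_noise]))
```

A large F ratio means the noise-level means differ by more than the within-level scatter would suggest; the p-value then comes from the F distribution with the two returned degrees of freedom.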

Table 13: Table of f-ratios/t-ratios and p values for data entry task's fixation rate, duration, and counts

Data Entry Fixation Parameters (P-value)

Source/Level    Fixation Rate    Fixation Duration    Fixation Count
Model           0.1732           0.9185               0.631
No Noise        0.2076           0.688                0.339
White Noise     0.0657           0.7726               0.5869
Office Noise    0.6              0.8876               0.6283

Table 14: Table of f-ratios/t-ratios and p values for anomaly detection task's fixation rate, duration, and counts

Anomaly Fixation Parameters (P-value)

Source/Level    Fixation Rate    Fixation Duration    Fixation Count
Model           0.7633           0.4691               0.2163
No Noise        0.4678           0.5427               0.158
White Noise     0.7812           0.5214               0.8234
Office Noise    0.6578           0.2211               0.1089

4.4.3 Summary of Physiological Parameters

Because not all of the physiological parameters presented in the results section agree with one another, a comprehensive table, Table 15, is presented to summarize the findings. The table shows the physiological measure, the hypothesis, how the measure actually changed (relative to the NASA-TLX subjective scores), and the task types that the measure would place as the most and least mentally straining if the hypothesis were true. Red text notes that what occurred in the study did not match the hypothesis. The correlation coefficient and p-value resulted from a Pearson Correlation Test between the NASA-TLX scores and the physiological parameter.
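The logic of the summary table can be sketched as a small check: given the hypothesized direction of each measure and the task-type means reported in Section 4.4, find the task each measure would place at the highest mental workload and compare it with the NASA-TLX result. The means below are the ones reported in the text (only the highest and lowest task per measure are reported); the data structure and function names are illustrative.

```python
# (hypothesized direction as workload increases, reported task-type means)
MEASURES = {
    "LF/HF": ("increase", {"anomaly detection": 0.914, "data entry": 0.65}),
    "MEANRR": ("decrease", {"anomaly detection": 0.755, "data entry": 0.694}),
    "SMEANRR": ("decrease", {"anomaly detection": 0.055, "data entry": -0.008}),
    "SDRR": ("decrease", {"anomaly detection": 0.07, "data entry": 0.102}),
    "RMSSD": ("increase", {"anomaly detection": 0.0634, "data entry": 0.117}),
    "Pupil diameter (mean)": ("increase", {"anomaly detection": -0.121, "data entry": 0.16}),
    "Pupil diameter (SD)": ("decrease", {"arithmetic": 0.309, "data entry": 0.261}),
}

# Task with the highest NASA-TLX score in Section 4.3.
NASA_TLX_HIGHEST = "anomaly detection"


def implied_highest_workload(direction, task_means):
    """Task the measure places at the highest workload if the hypothesis holds."""
    pick = max if direction == "increase" else min
    return pick(task_means, key=task_means.get)


def agreement_with_nasa_tlx(measures, nasa_highest):
    """Map each measure to True if its implied highest-workload task matches NASA-TLX."""
    return {
        name: implied_highest_workload(direction, means) == nasa_highest
        for name, (direction, means) in measures.items()
    }


for name, agrees in agreement_with_nasa_tlx(MEASURES, NASA_TLX_HIGHEST).items():
    print(f"{name}: {'agrees' if agrees else 'disagrees'}")
```

On the reported means, only LF/HF and SDRR place anomaly detection at the highest workload, matching the NASA-TLX ranking.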

Table 15: Summary table of physiological parameters when mental workload increases

Physiological Mental Workload         Hypothesis  What happened      Task with highest   Task with lowest    Correlation       p-value
                                                  compared to        MWL if hypothesis   MWL if hypothesis   coefficient with
                                                  NASA-TLX results   true                true                NASA-TLX
LF/HF**                               Increase    Increase           Anomaly Detection   Data Entry          0.0675            0.4118
MEANRR*                               Decrease    Increase           Data Entry          Anomaly Detection   0.1292            0.115
SMEANRR**                             Decrease    Increase           Data Entry          Anomaly Detection   0.2023            0.013
SDRR**                                Decrease    Decrease           Anomaly Detection   Data Entry          -0.0765           0.3522
RMSSD**                               Increase    Decrease           Data Entry          Anomaly Detection   -0.0886           0.281
Pupil Diameter (Mean)*                Increase    Decrease           Data Entry          Anomaly Detection   -0.1689           0.0388
Pupil Diameter (Standard Deviation)   Decrease    Increase           Data Entry          Arithmetic          -0.0089           0.9139

* = Task Type Statistical Significance
** = Task Type and Model Statistical Significance

4.5 Correlation Testing

Microsoft Office Excel 2016 was used to determine correlation for as many measures as possible, by performing nearly 200 correlation tests using the “CORREL” function. The significant correlation coefficient comparisons were placed into JMP® by SAS® (2014) for further analysis. Depending on the correlation test being done, the N value is either 58 or 50, meaning the df (N-2) is either 56 or 48. The critical values of the correlation coefficient for df of 50 and 60, the closest df values in the table to 48 and 56, are between 0.21 and 0.23 (Rummel, 1976). The correlation coefficients whose absolute values met this criterion and showed statistical significance are featured below in Table 16. The five lowest p-values are highlighted in green. Correlation coefficient values found using Microsoft Excel can be found in Appendix F.
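The correlation coefficient and its critical value can be sketched as below. The first function is the standard Pearson formula, which is what a spreadsheet correlation function such as CORREL computes. The second converts a critical t value into a critical r using r = t / sqrt(t² + df). The t values 1.676 (df = 50) and 1.671 (df = 60) are standard table values for a one-tailed 0.05 (two-tailed 0.10) level; treating that as the level behind the 0.21 to 0.23 range is an inference from the numbers, not something stated in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_critical, df):
    """Smallest |r| that reaches significance for a given critical t and df."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Assumed critical t values (one-tailed 0.05) from standard t tables.
print(round(critical_r(1.676, 50), 2))  # 0.23
print(round(critical_r(1.671, 60), 2))  # 0.21
```

Because the critical r shrinks as df grows, the same coefficient can be significant for the N = 58 tests but not for the N = 50 tests.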

Table 16: Correlation coefficient, p-values, and variables for all correlations that showed significance

Variable 1         Variable 2                                    Correlation Coefficient  p-value
Agreeableness      Office Noise Mean Pupil Dilation              -0.442                   0.0013
Agreeableness      Data Entry Mean Pupil Dilation                -0.4036                  0.0037
Agreeableness      Anomaly Detection Mean Pupil Dilation         -0.3881                  0.0054
Agreeableness      No Noise Mean Pupil Dilation                  -0.3766                  0.007
Agreeableness      Data Entry RMSSD                              0.3636                   0.0094
Agreeableness      Arithmetic Mean Pupil Dilation                -0.3601                  0.0102
Agreeableness      No Noise LF/HF                                -0.3498                  0.0128
Agreeableness      Arithmetic Pupil Dilation Standard Deviation  -0.3201                  0.0234
Agreeableness      Data Entry Standard Deviation HRV             0.3144                   0.0262
Agreeableness      White Noise Mean Pupil Dilation               -0.2835                  0.046
Agreeableness      No Noise RMSSD                                0.2819                   0.0474
Agreeableness      Noise Tolerance                               0.2383                   0.0716
Agreeableness      No Noise Standard Deviation HRV               0.2494                   0.0806
Agreeableness      No Noise Mean HRV                             -0.2421                  0.0902
Agreeableness      White Noise RMSSD                             0.2404                   0.0927
Conscientiousness  No Noise Mean HRV                             -0.3552                  0.0114
Conscientiousness  Data Entry LF/HF                              -0.3502                  0.0127
Conscientiousness  Arithmetic Mean HRV                           -0.3181                  0.0244
Conscientiousness  No Noise LF/HF                                -0.2902                  0.0409
Conscientiousness  Data Entry Mean HRV                           -0.288                   0.0426
Conscientiousness  White Noise Mean HRV                          -0.2618                  0.0663
Extroversion       Anomaly Detection MWL                         -0.3174                  0.0152
Extroversion       Office Noise Duration of Task                 0.221                    0.0956
Neuroticism        Office Noise MWL                              0.3664                   0.0047
Neuroticism        Data Entry RMSSD                              -0.3462                  0.0138
Neuroticism        Data Entry Standard Deviation HRV             -0.3216                  0.0228
Neuroticism        Arithmetic MWL                                0.2895                   0.0275
Neuroticism        MWL Overall Average                           0.2876                   0.0286
Neuroticism        White Noise Standard Deviation HRV            -0.3026                  0.0327
Neuroticism        Average Noise (White and Office) MWL          0.2661                   0.0435
Neuroticism        Anomaly Detection MWL                         0.265                    0.0444
Neuroticism        White Noise RMSSD                             -0.2619                  0.0662
Neuroticism        Office Noise Standard Deviation HRV           -0.2438                  0.0879
Neuroticism        No Noise Mean HRV                             -0.2421                  0.0902
Neuroticism        Anomaly Detection LF/HF                       0.2357                   0.0994

The personality trait that appears most frequently in Table 16 was agreeableness. The second most frequent personality trait was neuroticism. The third most frequent was conscientiousness, and the least common personality trait that showed significance was extroversion. Openness and noise tolerance failed to show significance in the correlation tests. The dependent variable data was gathered by task but had to be sorted into noise categories. There were also two different sample sizes (and therefore different df) within the correlation tests: one for the main ANOVA dependent variables (performance, task duration, and NASA-TLX scores) and one for the physiological dependent variables. The NASA-TLX is represented in Table 16 and in the correlation graphs below simply as MWL. Because several correlation tests showed significance, the five with the lowest p-values will be discussed. The smallest p-value was seen in the test between agreeableness personality scores and office noise mean pupil diameter (r = -0.442; p = 0.0013); the scatterplot of these variables is shown in Figure 11. The ellipse in each scatterplot illustrates the general direction of the correlation between the two variables.

Figure 11: Correlation scatterplot with ellipse of agreeableness vs. office noise mean pupil diameter

The next lowest p-value was the agreeableness personality test scores and the data entry

mean pupil diameter (r = -0.4036; p = 0.0037). The third lowest correlation test p-value was

neuroticism personality scores and the office noise MWL, or NASA-TLX raw scores (r =

0.3664; p = 0.0047). These two scatterplots can be found in Figures 12 and 13.

Figure 12: Correlation scatterplot with ellipse of agreeableness vs. data entry mean pupil diameter

Figure 13: Correlation scatterplot with ellipse of neuroticism vs. office noise MWL (NASA-TLX)

The fourth lowest p-value among the correlation tests was for agreeableness personality scores and anomaly detection mean pupil diameter (r = -0.3881; p = 0.0054). The fifth lowest p-value to be discussed is the test between agreeableness personality scores and no noise mean pupil diameter (r = -0.3766; p = 0.007). These two scatterplots can be found in Figures 14 and 15.

Figure 14: Correlation scatterplot with ellipse of agreeableness vs. anomaly detection mean pupil diameter

Figure 15: Correlation scatterplot with ellipse of agreeableness vs. no noise mean pupil diameter

5.0 Discussion

5.1 Performance, Duration, and NASA-TLX

The main analysis of this study, concerning the impact on performance, task duration, and mental workload when noise was introduced in the background, produced notable findings that both supported and contradicted the hypotheses. It was hypothesized that when noise was played in the background, especially the more disruptive office noise, the subject’s performance would decrease, while their task duration and mental workload would increase.

There was significant evidence that task type impacted performance, task duration, and subjective mental workload. The effects of noise level and of the task type*noise interaction were not significant for any of these three dependent variables, so there was no evidence that noise affected different task types differently. While the lowest performance scores for noise level came with office noise, the highest came from the white noise work environment. Office noise was hypothesized to cause a decline in performance, but it is surprising that the white noise condition, rather than the no noise condition, showed the highest scores. It is possible that the subjects who participated are accustomed to working in environments where people are talking, which was the chosen background sound for the white noise condition. White noise also had the lowest NASA-TLX scores, which might explain the better performance scores, because the subjects felt less mentally overloaded. This relationship would also explain why the anomaly detection task, which had the highest subjective mental workload among subjects, had the lowest performance scores on average.

This relationship does not explain why the task that was perceived as the least mentally straining, data entry, did not have the highest scores. One explanation could be that, because the task was perceived as too easy, its difficulty was underestimated, resulting in careless errors. The arithmetic task had the highest scores, and this task type fell in the middle in terms of NASA-TLX scores. One explanation for this observation is that too much mental workload decreases performance, but so does too little. There may be a “happy-medium” mental workload, determined by the complexity of a task, that supports good performance; this is supported by the Yerkes-Dodson Law (Wickens, Lee, Liu, & Becker, 2004).

The best performance averages occurred for the arithmetic task with white noise, and the lowest performance averages occurred for anomaly detection with office noise. This result is consistent with the results for each factor separately. Additionally, the task type and noise level interaction with the highest NASA-TLX scores showed the lowest performance scores. However, the interaction with the highest performance score did not have the lowest mental workload; the lowest NASA-TLX scores occurred during the no noise data entry condition. The data entry task required the auditory cognitive channels of the subject, so having no noise to crowd these channels is consistent with this condition being perceived as “simpler”, resulting in lower NASA-TLX scores.

The duration analysis also both supported and contradicted the hypotheses. Task duration is one of two dependent variables that were on the margin of statistical significance for noise level. It was predicted that when background noise was present, the task duration would increase. The longest durations were predicted to be seen in the arithmetic task type and in the data entry task with office noise interaction. The task type results follow the hypothesis, but among the interactions the arithmetic task with no noise had the longest duration. The noise level with the longest duration was no noise, and white noise had the shortest duration on average. The noise results contradicted the hypothesis that background noise causes longer task duration due to attention gravitating toward the distracting sounds, with corresponding recovery time. It is possible that introducing noise during a task creates a work environment in which the subject feels they must move faster. Having no noise may have relaxed the subjects to a point where they took their time completing the task.

The shortest durations came from the data entry task and from the data entry task with white noise interaction. These results suggest that the data entry task may simply have taken less time to complete. However, it is interesting that both the interaction and the noise level results show the white noise condition taking the shortest time on average. White noise also showed the highest performance of all the noise levels and the lowest NASA-TLX scores compared to no noise and office noise. As stated in the beginning, the ideal work environment supports a culture of high performance, low mental workload, and quick task completion. It might not be a coincidence that white noise appears to encourage this environment. Further research should be conducted to confirm that white noise in the background can create a positive work environment.

5.2 Physiological Mental Workload

All the physiological parameters, except for pupil diameter standard deviation, showed statistical significance for task type. The LF/HF ratio nearly showed significance for noise level at the 95% confidence level (p = 0.0516). Whether the measures were predicted to decrease or increase with mental workload, the main hypothesis for these physiological measures was that they would agree with the subjective mental workload quantified by the NASA-TLX scores.

The mean, standard deviation, and standardized mean values of R-R intervals were hypothesized to decrease when mental workload increased (Mansikka et al. 2016). Relative to the NASA-TLX scores, the only one of these parameters that supported the hypothesis was the standard deviation of R-R intervals, which showed the same task, anomaly detection, to have the most mental workload. If the mean and standardized mean of the R-R intervals decreased with increased mental workload, the data entry task would have had the highest mental workload, which disagrees with the NASA-TLX. The RMSSD had the same trend. The hypothesis was that this value would increase along with the mental workload; but if this were true, the data entry task had the highest mental workload of all the tasks, disagreeing with the NASA-TLX. The only other HRV parameter that agrees with the NASA-TLX is the LF/HF ratio. The LF/HF ratio, the other dependent variable that nearly showed statistical significance for noise level, shows that office noise has the highest mental workload and white noise has the lowest.

The pupil dilates under stress, and this study hypothesized that the difference in pupil diameter should increase with mental workload (He et al., 2012; Marquart et al., 2015). The data entry task showed the highest mental workload for pupil dilation, which does not match the subjective mental workload scores. The pupil diameter standard deviation showed similar results, indicating that data entry had the highest mental workload if this measure decreases with workload, again disagreeing with the NASA-TLX. Unlike the other physiological measures, which place either anomaly detection or data entry as the highest or lowest, pupil diameter standard deviation shows that the arithmetic task has the lowest mental workload.

To better illustrate the contradicting mental workload measures, Figure 16 displays the trend for each. The NASA-TLX trend is based on the results, and the trends for the physiological measures are based on the results interpreted according to how each measure is hypothesized to indicate an increase in mental workload (see Table 15). As stated above, the only two physiological parameters that appear to agree with the NASA-TLX scores are the SDRR (standard deviation of R-R intervals) and the LF/HF ratio.

52

Figure 16: Graphical illustration of all mental workload measures for task type

Figure 17 displays mental workload dependent variable trends in terms of noise level.

The only physiological parameter that agrees with the NASA-TLX is the LF/HF ratio. The rest

of the physiological measures were inconsistent in their agreement with which mental workload

level correlates with each noise level. Pupil diameter standard deviation and standard deviation

of R-R intervals show that the office noise has the highest mental workload and no noise has the

lowest, where the RMSSD value results show the opposite. The mean and standardized mean of

R-R intervals show the white noise having the highest mental workload, but pupil diameter mean

shows this noise level as having the lowest mental workload.

Anomaly Detection Data Entry Arithmetic

Low

MW

L

Med

ium

MW

L

Hig

h M

WL

Task Type

NASA-TLX

LF/HF

MEANRR

SMEANRR

SDRR

RMSSD

Pupil Diameter (Mean)

Pupil Diameter (StandardDeviation)

53

Figure 17: Graphical illustration of all mental workload measures for noise level

Despite these findings, there are several explanations for this contradicting data. The

NASA-TLX is considered the gold standard of subjective mental workload, while the

physiological measures are still being validated (Hart, 2006; Francisco Ruiz-Rabelo et al., 2015;

Gerhard & de Winter, 2015; Hu, Lu, Tan, & Lomanto, 2016; Liang, Rau, Tsai, & Chen, 2014;

Marquart, Cabrall, & de Winter, 2015; Rubio, Díaz, Martín, & Puente, 2004; Sönmez, Oğuz,

Kutlu, & Yıldırım, 2017). Previous studies of physiological workload measures, such as the

RMSSD value and the standard deviation of R-R intervals, have shown mixed results (Arnrich et

al., 2011; Guo et al., 2016; Mansikka, 2016).

The HRV findings might be skewed because of the noise present in the EKG. The tasks

all required motion: the data entry task required typing; the arithmetic task required a manual pill

counter; and the anomaly detection task required a laser pointer to circle the abnormalities. This

motion created noise in the EKG signal. Because of this noise, the signal was filtered using

one of the Biopac software’s digital filters. This data manipulation and/or the leftover noise in

the EKG signal could be a reason why the HRV data contradicted the NASA-TLX.

[Figure 17 axes: x-axis, Noise Level (Office Noise, White Noise, No Noise); y-axis, mental workload level (Low MWL, Medium MWL, High MWL); series: NASA-TLX, LF/HF, MEANRR, SMEANRR, SDRR, RMSSD, Pupil Diameter (Mean), Pupil Diameter (Standard Deviation)]

The pupil diameter findings may also have been skewed. A pupil has multiple jobs, and

some of them (keeping out light in bright spaces, letting in more light in dark spaces, and focusing

on objects at different distances) require the pupil to dilate or constrict (Spector, 1990). Each

task was a different distance from the subject. The data entry task required the subject to look at

a bright computer screen directly in front of them while the anomaly detection task was a couple

feet away. So, even though the lighting was kept constant, the monitor in the data entry task

presented more light than the anomaly detection task while the anomaly detection task was a

greater distance away. These slight differences could account for the pupil mental workload

measures opposing the NASA-TLX.

Fixation duration, counts, and rates were evaluated for the data entry and anomaly

detection tasks. The fixation rate, count, and duration for the anomaly detection task failed to

show any significance. The fixation rate, count, and duration for the data entry task failed to

show any significance as well. Thus, there is not enough evidence to support that noise level

influenced these parameters. This may be due to the large AOIs created for the two tasks that

resulted from the differing heights of the subjects and the distance between the subject and task

varying amongst the experimental sample. The benefit of using a mobile eye-tracking device,

such as the Tobii Pro Glasses 2, is that the subjects can be allowed the mobility similar to a real-

life setting. However, it is highly suggested that the scene be set or adjusted for each participant

so that the eye levels of the participants are comparable to one another to get more accurate

results.


5.3 Correlation Tests

Table 15 shows the five correlations with the associated correlation coefficients and p-values.

Table 17: Top five correlations from Table 15

Agreeableness, as four of the five significant correlation tests in Table 15 show, is

closely tied to mean pupil diameter. The four conditions where this occurs are the office noise

and no noise conditions and the data entry and anomaly detection tasks. All correlation

coefficients for these four tests are negative, indicating that a more disagreeable person

(one with a lower agreeableness score) has a greater mean pupil diameter (pupil dilation). Having

four significant negative correlations between agreeableness scores and the physiological mental

workload parameter pupil diameter (shown as pupil dilation in the table) is interesting. Further

research is necessary, but this trend could make a case for personality tests helping predict a

better work environment in terms of mental workload.

Neuroticism scores were tied to office noise subjective mental workload (MWL)

measured by the NASA-TLX. The correlation coefficient is positive, meaning that as

neuroticism scores increase, the subject's subjective mental workload also increases during the

office noise condition, regardless of which task is performed under that noise. The office noise

condition was considered the most disruptive noise condition. It is plausible that a more

neurotic person, someone with less emotional/mental stability, does not cope well with noise.

Variable 1      Variable 2                                 Correlation Coefficient   p-value
Agreeableness   Office Noise Mean Pupil Dilation           -0.442                    0.0013
Agreeableness   Data Entry Mean Pupil Dilation             -0.4036                   0.0037
Neuroticism     Office Noise MWL                            0.3664                   0.0047
Agreeableness   Anomaly Detection Mean Pupil Dilation      -0.3881                   0.0054
Agreeableness   No Noise Mean Pupil Dilation               -0.3766                   0.007
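To make the correlation coefficients in the table above concrete, the following minimal sketch shows how a Pearson correlation coefficient and its associated t statistic can be computed for a personality score paired with a workload measure. The function names and the example data are hypothetical, and the sketch assumes a Pearson correlation; the section does not state which correlation test or sample size was used for each pair. The p-value itself needs the t distribution with n - 2 degrees of freedom, which is left to a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length, at least 3 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical data: a trait score and a pupil measure for five subjects
trait_scores = [1, 2, 3, 4, 5]
pupil_values = [5, 4, 3, 2, 2]
r = pearson_r(trait_scores, pupil_values)
print(r, t_statistic(r, len(trait_scores)))
```

A negative r, as in the agreeableness rows, means that the pupil measure tends to fall as the trait score rises.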


6.0 Conclusion

The goal of the study was to investigate the impact of noise on different tasks performed

by a human subject and determine the effect of noise on performance, mental workload, and time

taken to complete the task. There was significant evidence that task type influenced performance,

duration of task, and mental workload (subjective and physiological). Two dependent

variables nearly showed significance for noise level: the duration of the task and

the LF/HF ratio, a physiological measure of mental workload. White noise appeared to have

the traits of an ideal work environment based on the results: higher performance scores, lower

mental workload scores, and shorter amounts of time spent on tasks. The physiological mental

workload parameters did not necessarily agree with each other or the hypothesis, but most of

them identified the same two tasks (anomaly detection and data entry) as the most and least

mentally taxing, in one order or the other. As research continues on the use of

physiological measures as real-time indicators of mental workload, the benefits and

drawbacks of their use in studies such as these can be better understood. There were

several correlation tests that showed significance; agreeableness and neuroticism are

the two personality traits that appeared most often in these tests as being significant. It is interesting

that of all the correlation tests performed, the only ones that showed significance were those that

were related to the subjective and physiological mental workload measures. These correlation

tests provide a starting point for analyzing the best work environment for lower mental workload

characterized by the individual’s personality. Additional research must be done to verify the

findings of these correlation tests and further examine the phenomenon of personality’s

correlation with mental workload.


7.0 Appendix

7.1 Appendix A: Experimental Design and Combinations Table

Subject Number Anomaly Data Entry Arithmetic

1 No White Office

2 White Office No

3 Office No White

4 Office No White

5 White Office No

6 No White Office

7 No White Office

8 Office No White

9 No Office White

10 White Office No

11 White Office No

12 No White Office

[Figure: Three examples of a subject’s experiment with three tasks and corresponding noises. Each timeline starts with questionnaires and equipment set-up, followed by the three tasks (Arithmetic, Data Entry, Anomaly Detection), each paired with one noise condition (Office, White, or None).]


13 Office No White

14 No White Office

15 Office No White

16 White Office No

17 White Office No

18 Office No White

19 White Office No

20 No White Office

21 No Office White

22 Office White No

23 Office White No

24 Office White No

25 Office White No

26 White No Office

27 No Office White

28 No Office White

29 White No Office

30 Office White No

31 Office No White

32 No White Office

33 White Office No

34 White No Office

35 Office No White

36 No White Office

37 Office White No

38 White No Office

39 No Office White

40 White No Office


41 No White Office

42 Office No White

43 White Office No

44 White Office No

45 No White Office

46 Office No White

47 No Office White

48 White No Office

49 Office White No

50 Office No White

51 White Office No

52 White No Office

53 Office White No

54 No Office White

55 No Office White

56 White No Office

57 Office White No

58 Office No White

59 No White Office

60 White Office No

Task Type Noise Level

Data Entry White

Data Entry No

Data Entry Office

Anomaly Detection White

Anomaly Detection No

Anomaly Detection Office

Math Arithmetic White

Math Arithmetic No

Math Arithmetic Office

9 Possible Combinations
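The nine task and noise pairings listed above can be enumerated programmatically, as in the minimal sketch below. It also lists the six ways of assigning the three noise levels to the three tasks so that each noise is used once, which is the pattern each row of the subject table appears to follow. The variable names are illustrative, and the sketch does not reproduce how subjects were actually assigned to orders.

```python
from itertools import permutations, product

TASKS = ("Data Entry", "Anomaly Detection", "Arithmetic")
NOISES = ("White", "No", "Office")

# Every task paired with every noise level: 3 x 3 = 9 combinations
task_noise_pairs = list(product(TASKS, NOISES))

# Each subject sees every noise level once, one per task: 3! = 6 assignments
assignments = [dict(zip(TASKS, order)) for order in permutations(NOISES)]

for task, noise in task_noise_pairs:
    print(f"{task}: {noise}")
print(len(task_noise_pairs), "task/noise pairs;", len(assignments), "per-subject assignments")
```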


7.2 Appendix B: Questionnaires and Task Problems (with answers)

7.2.1 Noise Tolerance Questionnaire

Participant #: ______

Pre-Questionnaire

Circle the fill-in to the statement that best fits for you. Please circle only one for each statement.

1. I am between 18-65 years old:________.

a. True

b. False

2. I am fluent in the English Language: ________.

a. True

b. False

3. I am not colorblind:__________. (If you could be colorblind, circle False)

a. True

b. False

**If you’ve answered false to any of the above questions, please turn in your pre-questionnaire now

4. Do you have experience with medical imaging or x-rays? ________

a. Yes

b. No

5. Have you taken the Myers-Briggs personality test? __________

a. Yes

b. No

5a. If yes to question 5, what four letters were your result? ___________ (Example: ISTJ)

Reminder of Myers-Briggs:

Introvert/Extrovert (I or E) Sensing/Intuition (S or N)

Thinking/Feeling (T or F) Judging/Perceiving (J or P)

6. I find that I studied the best:

a. In a very quiet place

b. With light music or a small amount of people

c. In a place with lots of people or loud music


7. I find that I sleep the best when:

a. It is very quiet

b. With white noise (such as a fan or AC running)

c. When there is a lot of noise outside my window/room

8. I’ve lived over half of my life in:

a. The country/rural (little to no sound)

b. Suburban type area (some sound)

c. City (lots of sound)

9. When I drive/travel in the car, I like to listen to:

a. Little to no music

b. Music mild in volume (can still hold conversation in car)

c. Loud music (must turn down to have a conversation)

10. My ability to cancel out a conversation when I have to concentrate is:

a. Poor

b. Average

c. Excellent

7.2.2 Big Five Inventory (Link)

http://fetzer.org/sites/default/files/images/stories/pdf/selfmeasures/Personality-BigFiveInventory.pdf

7.2.3 NASA-TLX (Link)

https://humansystems.arc.nasa.gov/groups/tlx/downloads/TLXScale.pdf


7.2.4 Anomaly Detection Task (with answers)

7.2.5 Data Entry Task (with answers)


7.2.6 Mathematical Arithmetic Task (with answers)

7.3 Appendix C: Eye-tracking Illustrations

8 green beads

5 blue beads

Only bead shape and

colors that should be

used

10 blue beads, 4 pink beads, and 2 green beads

2 pink beads and 1 blue bead

1 green bead, 3 pink beads, and 2 blue beads


Data Entry Task Area of Interest

Anomaly Detection Task Area of Interest


Example data entry gaze plot of subject who only focused on GUI

Example data entry gaze plot of subject who focused on both the keyboard and

GUI


Example data entry gaze plot of subject who focused on GUI and Principal Investigator

Example gaze plot for anomaly detection task


7.4 Appendix D: Residual Plots

7.4.1 Normal Distribution Checks


7.4.2 Residual vs. Predicted Plots


7.5 Appendix E: Connecting Letters Reports and Interaction Plots

7.5.1 Task Performance


7.5.2 Task Duration


7.5.3 Mental Workload


7.5.4 Heart Rate Parameters

LF/HF

Mean R-R Interval

Standard Deviation


RMSSD

Standardized Mean HRV


7.5.5 Eye-tracking Parameters

Pupil Diameter Mean

Pupil Diameter Standard Deviation


7.6 Appendix F: Correlation Tables

Performance Scores by Task vs. Personality/Noise Tolerance Scores

Noise Tolerance vs. Anomaly Detection       0.016594
Noise Tolerance vs. Data Entry              0.1715
Noise Tolerance vs. Arithmetic              0.037862
Agreeableness vs. Anomaly Detection        -0.01532
Agreeableness vs. Data Entry                0.02795
Agreeableness vs. Arithmetic                0.033015
Conscientiousness vs. Anomaly Detection    -0.02843
Conscientiousness vs. Data Entry            0.048
Conscientiousness vs. Arithmetic            0.069696
Neuroticism vs. Anomaly Detection          -0.12451
Neuroticism vs. Data Entry                 -0.02903
Neuroticism vs. Arithmetic                 -0.17541
Openness vs. Anomaly Detection              0.094041
Openness vs. Data Entry                     0.008617
Openness vs. Arithmetic                     0.03632
Extroversion vs. Anomaly Detection          0.114438
Extroversion vs. Data Entry                -0.06425
Extroversion vs. Arithmetic                 0.07508


NASA-TLX Mental Workload (MWL) Scores by Task vs. Personality/Noise Tolerance Scores

Noise Tolerance vs. Anomaly Detection MWL               -0.03498
Noise Tolerance vs. Data Entry MWL                      -0.04264
Noise Tolerance vs. Arithmetic MWL                      -0.10975
Agreeableness vs. Anomaly Detection MWL                  0.053231
Agreeableness vs. Data Entry MWL                        -0.02493
Agreeableness vs. Arithmetic MWL                        -0.03576
Conscientiousness vs. Anomaly Detection MWL              0.036895
Conscientiousness vs. Data Entry MWL                    -0.008
Conscientiousness vs. Arithmetic MWL                    -0.09948
Neuroticism vs. Anomaly Detection MWL                    0.265037
Neuroticism vs. Data Entry MWL                           0.128036
Neuroticism vs. Arithmetic MWL                           0.272319
Openness vs. Anomaly Detection MWL                       0.1226
Openness vs. Data Entry MWL                             -0.09011
Openness vs. Arithmetic MWL                              0.055618
Extroversion vs. Anomaly Detection MWL                  -0.31736
Extroversion vs. Data Entry MWL                         -0.02714
Extroversion vs. Arithmetic MWL                         -0.05365
Noise Tolerance vs. MWL Average (over three tasks)      -0.08122
Agreeableness vs. MWL Average (over three tasks)        -0.00318
Conscientiousness vs. MWL Average (over three tasks)    -0.03134
Neuroticism vs. MWL Average (over three tasks)           0.28758
Openness vs. MWL Average (over three tasks)              0.040067
Extroversion vs. MWL Average (over three tasks)         -0.17175

Personality vs. Noise Tolerance Scores


Noise Tolerance vs. Noise Tolerance 1

Agreeableness vs. Noise Tolerance 0.238318

Conscientiousness vs Noise Tolerance 0.134094

Neuroticism vs Noise Tolerance -0.18865

Openness vs Noise Tolerance -0.13839

Extroversion vs. Noise Tolerance 0.11816

Personality/Noise Tolerance Scores vs. Performance

Scores by Noise Level

Extroversion vs. No Noise -0.022748768

Noise Tolerance vs No Noise 0.035836108

Agreeableness vs No Noise -0.000774025

Conscientiousness vs. No Noise 0.11990811

Neuroticism vs. No Noise -0.170906224

Openness vs No Noise -0.057195205

Extroversion vs. White Noise 0.141599354

Noise Tolerance vs White Noise 0.016775334

Agreeableness vs White Noise 0.080286332

Conscientiousness vs. White Noise -0.138204121

Neuroticism vs. White Noise 0.008313073

Openness vs White Noise 0.150512419

Extroversion vs. Office Noise -0.025252986

Noise Tolerance vs Office Noise 0.103132358

Agreeableness vs Office Noise 0.028077652


Conscientiousness vs. Office Noise 0.067395146

Neuroticism vs. Office Noise -0.108630887

Openness vs Office Noise -0.041737765

Average Performance

Extroversion 0.049739119

Noise Tolerance 0.098969352

Agreeableness 0.063082779

Conscientiousness 0.046846631

Neuroticism -0.18311397

Openness 0.020356565

Noise Performance Average (Two tasks with noise

only)




Conscientiousness -0.046274585


Openness 0.074581605

Personality/Noise Tolerance Scores vs. Task

Duration by Noise Level



Noise Tolerance vs No Noise 0.109929

Agreeableness vs No Noise 0.059981

Conscientiousness vs. No Noise -0.17992

Neuroticism vs. No Noise 0.220944

Openness vs No Noise -0.03097

Extroversion vs. White Noise -0.07958

Noise Tolerance vs White Noise 0.018494


Conscientiousness vs. White Noise 0.174026

Neuroticism vs. White Noise -0.12586

Openness vs White Noise 0.098807

Extroversion vs. Office Noise 0.270089

Noise Tolerance vs Office Noise -0.16615

Agreeableness vs Office Noise 0.029978

Conscientiousness vs. Office Noise -0.13453

Neuroticism vs. Office Noise -0.05926

Openness vs Office Noise 0.121776

Average Overall Duration





Conscientiousness -0.1334

Neuroticism 0.08461

Openness 0.098405

Noise Duration Average (Two tasks with noise

only)


Noise Tolerance -0.11084




Openness 0.163478

Personality/Noise Tolerance

Scores vs. NASA-TLX

MWL by Noise Level


Noise Tolerance vs No Noise -0.00642

Agreeableness vs No Noise -0.01852

Conscientiousness vs. No Noise -0.19956

Neuroticism vs. No Noise 0.156542

Openness vs No Noise 0.004953

Extroversion vs. White Noise -0.27159

Noise Tolerance vs White Noise -0.10899


Conscientiousness vs. White Noise 0.178901

Neuroticism vs. White Noise 0.053029

Openness vs White Noise -0.05244


Extroversion vs. Office Noise -0.0707

Noise Tolerance vs Office Noise -0.04356

Agreeableness vs Office Noise -0.00352

Conscientiousness vs. Office Noise 0.027704

Neuroticism vs. Office Noise 0.325507

Openness vs Office Noise 0.109319

MWL Noise Average (Two tasks with

noise only)

Extroversion -0.21891

Noise Tolerance -0.09817



Neuroticism 0.254648

Openness 0.041328

Personality/Noise Tolerance Scores vs. HRV

Parameters for Data Entry Task

Noise Tolerance vs. LF/HF Data Entry -0.11042

Noise Tolerance vs Mean HRV Data Entry 0.135907

Noise Tolerance vs. Standard Deviation HRV Data Entry 0.100076

Noise Tolerance vs. RMSSD Data Entry 0.041487

Agreeableness vs LF/HF Data Entry -0.24139

Agreeableness vs. Mean HRV Data Entry 0.035521

Agreeableness vs. Standard Deviation HRV Data Entry 0.322372

Agreeableness vs. RMSSD Data Entry 0.370827

Conscientiousness vs LF/HF Data Entry -0.33965

Conscientiousness vs Mean HRV Data Entry -0.28977

Conscientiousness vs Standard Deviation HRV Data Entry 0.120079

Conscientiousness vs RMSSD Data Entry 0.204418

Neuroticism vs. LF/HF Data Entry 0.020921

Neuroticism vs. Mean HRV Data Entry -0.09506

Neuroticism vs. Standard Deviation HRV Data Entry -0.3297

Neuroticism vs RMSSD Data Entry -0.3538


Openness vs LF/HF Data Entry -0.05779

Openness vs Mean HRV Data Entry -0.16198

Openness vs Standard Deviation HRV Data Entry -0.2074

Openness vs RMSSD Data Entry -0.18865

Extroversion vs. LF/HF Data Entry -0.02858

Extroversion vs Mean HRV Data Entry -0.1186

Extroversion vs. Standard Deviation HRV Data Entry -0.11725

Extroversion vs. RMSSD Data Entry -0.1

Personality/Noise Tolerance Scores vs. HRV Parameters for

Anomaly Detection Task

Noise Tolerance vs. LF/HF Anomaly Detection -0.29437

Noise Tolerance vs Mean HRV Anomaly Detection 0.173206

Noise Tolerance vs. Standard Deviation HRV Anomaly Detection 0.130326

Noise Tolerance vs. RMSSD Anomaly Detection 0.043917

Agreeableness vs LF/HF Anomaly Detection -0.18109

Agreeableness vs. Mean HRV Anomaly Detection 0.087904

Agreeableness vs. Standard Deviation HRV Anomaly Detection 0.064341

Agreeableness vs. RMSSD Anomaly Detection 0.108255

Conscientiousness vs LF/HF Anomaly Detection -0.11429

Conscientiousness vs Mean HRV Anomaly Detection -0.21318

Conscientiousness vs Standard Deviation HRV Anomaly Detection 0.185376

Conscientiousness vs RMSSD Anomaly Detection 0.080377

Neuroticism vs. LF/HF Anomaly Detection 0.251707

Neuroticism vs. Mean HRV Anomaly Detection -0.05951

Neuroticism vs. Standard Deviation HRV Anomaly Detection -0.23961

Neuroticism vs RMSSD Anomaly Detection -0.22318

Openness vs LF/HF Anomaly Detection 0.042183

Openness vs Mean HRV Anomaly Detection -0.02354

Openness vs Standard Deviation HRV Anomaly Detection -0.15493

Openness vs RMSSD Anomaly Detection -0.15608

Extroversion vs. LF/HF Anomaly Detection -0.03361


Extroversion vs Mean HRV Anomaly Detection -0.09883

Extroversion vs. Standard Deviation HRV Anomaly Detection -0.01366

Extroversion vs. RMSSD Anomaly Detection 0.028114


Arithmetic Task

Noise Tolerance vs. LF/HF Arithmetic -0.13366

Noise Tolerance vs Mean Arithmetic HRV 0.115213

Noise Tolerance vs. Standard Deviation HRV Arithmetic 0.051962

Noise Tolerance vs. RMSSD Arithmetic 0.06056

Agreeableness vs LF/HF Arithmetic -0.22496

Agreeableness vs. Mean HRV Arithmetic 0.083366

Agreeableness vs. Standard Deviation HRV Arithmetic 0.136534

Agreeableness vs. RMSSD Arithmetic 0.177938

Conscientiousness vs LF/HF Arithmetic -0.16897

Conscientiousness vs Mean HRV Arithmetic -0.31982

Conscientiousness vs Standard Deviation HRV Arithmetic 0.153044

Conscientiousness vs RMSSD Arithmetic 0.173451

Neuroticism vs. LF/HF Arithmetic 0.015285

Neuroticism vs. Mean HRV Arithmetic -0.11667

Neuroticism vs. Standard Deviation HRV Arithmetic -0.19283

Neuroticism vs RMSSD Arithmetic -0.12771

Openness vs LF/HF Arithmetic -0.05551

Openness vs Mean HRV Arithmetic -0.14122

Openness vs Standard Deviation HRV Arithmetic -0.18512

Openness vs RMSSD Arithmetic -0.17766

Extroversion vs. LF/HF Arithmetic -0.02455

Extroversion vs Mean HRV Arithmetic -0.13699

Extroversion vs. Standard Deviation HRV Arithmetic -0.06578

Extroversion vs. RMSSD Arithmetic -0.12756


No Noise Tasks

Noise Tolerance vs. LF/HF -0.09209

Noise Tolerance vs Mean HRV 0.074328


Noise Tolerance vs. Standard Deviation HRV 0.091491

Noise Tolerance vs. RMSSD 0.027043

Agreeableness vs LF/HF -0.34985

Agreeableness vs. Mean HRV 0.008364

Agreeableness vs. Standard Deviation HRV 0.249448

Agreeableness vs. RMSSD 0.281866

Conscientiousness vs LF/HF -0.29021

Conscientiousness vs MEAN -0.35518

Conscientiousness vs HRV Standard Deviation 0.219654

Conscientiousness vs RMSSD 0.228316

Neuroticism vs. LF/HF 0.100183

Neuroticism vs. MEAN -0.08938

Neuroticism vs. HRV Standard Deviation -0.22604

Neuroticism vs RMSSD -0.22593

Openness vs LF/HF 0.070492

Openness vs Mean HRV -0.2037

Openness vs Standard Deviation HRV -0.1922

Openness vs RMSSD -0.18003

Extrovert vs. LF/HF 0.166907

Extrovert vs Mean HRV -0.09624

Extrovert vs. Standard Deviation HRV -0.13203

Extrovert vs. RMSSD -0.15052

Personality/Noise Tolerance Scores vs. HRV Parameters

for White Noise Tasks















Neuroticism vs. MEAN -0.04969



Openness vs LF/HF -0.10274




Extrovert vs. LF/HF -0.13336




Personality/Noise Tolerance Scores vs. HRV Parameters

for Office Noise Tasks















Neuroticism vs. MEAN 0.106753



Openness vs LF/HF -0.01792




Extrovert vs. LF/HF -0.10355




Extroversion, Noise Tolerance, and Agreeableness vs. Pupil Diameter Difference (from Control) by Noise Level

Noise Level     Extroversion   Noise Tolerance   Agreeableness
No Noise        0.151738       -0.08802          -0.37662
White Noise     0.200556       -0.00497          -0.28355
Office Noise    0.158821       -0.26256          -0.44199


No Noise 0.134334356 No Noise -0.01491 No Noise 0.24384

White Noise -0.04324164 White Noise -0.12207 White Noise 0.021463

Office Noise -0.00557524 Office Noise 0.022873 Office Noise 0.166714

Openness vs. Pupil Diameter Difference

(from Control) by Noise Level

Conscientiousness vs. Pupil Diameter


Neuroticism vs. Pupil Diameter


Extroversion, Noise Tolerance, and Agreeableness vs. Standard Deviation (Std.d) Pupil Diameter by Noise Level

Noise Level                    Extroversion   Noise Tolerance   Agreeableness
No Noise Std.d Dilation        0.072009       -0.10471           0.117678
White Noise Std.d Dilation     0.101576       -0.18032          -0.00417
Office Noise Std.d Dilation    0.031565       -0.13536          -0.03787


No Noise Std.d Dilation -0.03882 No Noise Std.d Dilation -0.0628 No Noise Std.d Dilation 0.12458

White Noise Std.d Dilation 0.018493 White Noise Std.d Dilation -0.01646 White Noise Std.d Dilation 0.239569

Office Noise Std.d Dilation -0.00855 Office Noise Std.d Dilation -0.08294 Office Noise Std.d Dilation -0.03321

Openness vs. Standard Deviation


Conscientiousness vs. Standard


Noise Level

Neuroticism vs. Standard Deviation


Data Entry 0.242012 Data Entry 0.211357 Data Entry -0.14679

Anomaly Detection 0.242012 Anomaly Detection 0.088629 Anomaly Detection -0.1645

Arithmetic 0.306569 Arithmetic 0.163438 Arithmetic -0.09939

Extroversion vs. Pupil Diameter

Difference (from Control) by

Task Type

Noise Tolerance vs. Pupil

Diameter Difference (from

Control) by Task Type

Agreeableness vs. Pupil



Data Entry 0.033953 Data Entry -0.05129 Data Entry -0.06264

Anomaly Detection -0.09294 Anomaly Detection 0.046656 Anomaly Detection -0.1023


Openness vs. Pupil Diameter


Task Type

Conscientiousness vs. Pupil



Neuroticism vs. Pupil Diameter


Task Type

Data Entry 0.011469 Data Entry 0.116233 Data Entry 0.056845

Anomaly Detection -0.00379 Anomaly Detection 0.249274 Anomaly Detection -0.05663


Extroversion vs. Standard

Deviation (Std.d) Pupil

Diameter by Task Type

Noise Tolerance vs. Standard


Task Type

Agreeableness vs. Standard


Task Type

Data Entry 0.170295 Data Entry -0.17873 Data Entry 0.209977

Anomaly Detection 0.007025 Anomaly Detection 0.02287 Anomaly Detection 0.100182

Arithmetic 0.04289 Arithmetic 0.06229 Arithmetic 0.141966

Conscientiousness vs.

Standard Deviation (Std.d)

Pupil Diameter by Task Type

Neuroticism vs. Standard



Openness vs. Standard




8.0 References

Aarde, N., Meiring, D., & Wiernik, B. (2017). The validity of the Big Five personality traits for

job performance: Meta-analyses of South African studies. International Journal of

Selection and Assessment. 25(3). 223-239.

Alansari, B. (2016). The Big Five Inventory (BFI): Reliability and validity of its Arabic

translation in non-clinical sample. European Psychiatry, 33, S248. doi:10.1016/j.eurpsy.2016.01.500

Alessandri, G. and Vecchione, M. (2012). The higher-order factors of the Big Five as predictors

of job performance. Personality and Individual Differences. 53. 779-784.

Arnrich, B., Cinaz, B., Arnrich, B., La Marca, R., & Troester, G. (2011). Monitoring of mental

workload levels during an everyday life office-work scenario. Personal and Ubiquitous

Computing, 17(2), 229-239.

Arterberry, B. J., Martens, M. P., Cadigan, J. M., & Rohrer, D. (2014). Application of

Generalizability Theory to the Big Five Inventory. Personality and Individual

Differences, 69, 98-103. doi:10.1016/j.paid.2014.05.015

Basil, M. D. (1994). Multiple resource theory I: Application to television viewing.

Communication Research, 21(2), 177.

Becker, A. B., Warm, J. S., Dember, W. N., & Hancock, P. (1995). Effects of jet engine noise

and performance feedback on perceived workload in a monitoring task. International

Journal of Aviation Psychology, 5(1).


Belojevic, G., Jakovljevic, B., & Slepcevic, V. (2003). Noise and mental performance:

personality attributes and noise sensitivity. Noise & Health, 6(21), 77-89.

Billman, G. (2013) The LF/HF ratio does not accurately measure cardiac sympatho-vagal

balance. Frontiers in Physiology. 4(26).

Boonghee, Y., Neelankavil, J. P., de Guzman, G. M., & Lim, R. A. (2013). Personality Type

Preferences of Asian Managers: A Cross-Country Analysis Using the MBTI Instrument.

International Journal of Global Management Studies, 5(1), 1-23.

Brotherton, P. (2012). 360 Instruments Are the Most Popular Way to Assess Leadership. T+D,

66(8), 18.

Cail, F., & Aptel, M. (2003). Biomechanical stresses in computer-aided design and in data entry.

International Journal of Occupational Safety and Ergonomics, 9(3), 235-255.

Cardona, G., & Quevedo, N. (2014). Blinking and driving: the influence of saccades and

cognitive workload. Current Eye Research, 39(3), 239-244.

doi:10.3109/02713683.2013.841256

Church, D. G. (2015). Reducing Error Rates in Intelligence, Surveillance, and Reconnaissance

(ISR) Anomaly Detection via Information Presentation Optimization. Wright State

University. Retrieved from

https://etd.ohiolink.edu/pg_10?0::NO:10:P10_ACCESSION_NUM:wright1452858183

Colligan, L., Potts, H. W., Finn, C. T., & Sinkin, R. A. (2015). Cognitive workload changes for

nurses transitioning from a legacy system with paper documentation to a commercial



electronic health record. International Journal of Medical Informatics, 84(7), 469-476.

doi:10.1016/j.ijmedinf.2015.03.003

Cooper, C. A., McCord, D. M., & Campbell-Bridges, W. (2017). Personality and the Teaching of

Public Administration: A Case for the Big Five. Journal of Public Affairs Education,

23(2), 677.

Dimitrakopoulos, G. N., Kakkos, I., Dai, Z., Lim, J., deSouza, J. J., Bezerianos, A., & Sun, Y.

(2017). Task-Independent mental workload classification based upon common multiband

EEG cortical connectivity. IEEE transactions on neural systems and rehabilitation

engineering: A publication of the IEEE engineering in medicine and biology society,

25(11), 1940-1949. doi:10.1109/TNSRE.2017.2701002

Dobbs, S., Furnham, A., & McClelland, A. (2011). The effect of background music and noise on

the cognitive test performance of introverts and extraverts. Applied Cognitive

Psychology, 25(2), 307-313.

Dockrell, J., Shield, B. M., & Dockrell, J. E. (2008). The effects of environmental and classroom

noise on the academic attainments of primary school children. Journal of the Acoustical

Society of America, 123(1), 133-144.

Faure, V., Lobjois, R., & Benguigui, N. (2016). The effects of driving environment complexity

and dual tasking on drivers’ mental workload and eye blink behavior. Transportation

Research Part F: Traffic Psychology and Behavior, 40. 78-90.

doi:10.1016/j.trf.2016.04.007


Francis, L. J., Lewis, C. A., & Ziebertz, H. (2006). The short-form revised Eysenck personality

questionnaire (EPQR-S): A German edition. Social Behavior & Personality: An

International Journal, 34(2), 197-203.

Francisco Ruiz-Rabelo, J., Navarro-Rodriguez, E., Diaz-Jimenez, N., Cabrera-Bermon, J., Diaz-

Iglesias, C., Gomez-Alvarez, M., & ... Luigi Di-Stasi, L. (2015). Validation of the

NASA-TLX score in ongoing assessment of mental workload during a laparoscopic

learning curve in bariatric surgery. Obesity Surgery, 25(12), 2451-2456.

Furnham, A., & Strbac, L. (2002). Music is as distracting as noise: The differential distraction of

background music and noise on the cognitive test performance of introverts and

extraverts. Ergonomics, 45(3), 203-217. doi:10.1080/00140130210121932

Galy, E., & Mélan, C. (2015). Effects of cognitive appraisal and mental workload factors on

performance in an arithmetic task. Applied Psychophysiology and Biofeedback, 40(4),

313-325. doi:10.1007/s10484-015-9302-0

Gao, Q., Wang, Y., Li, Z., Dong, X., & Song, F. (2013). Mental workload measurement for

emergency operating procedures in digital nuclear power plants. Ergonomics, 56(7),

1070-1085.

Gabbard, R. D. (2017). Identifying the Impact of Noise on Anomaly Detection through

Functional Near-Infrared Spectroscopy (fNIRS) and Eye-tracking. Wright State

University. Retrieved from

https://etd.ohiolink.edu/!etd.send_file?accession=wright1501711461736129&disposition=inline

Gentry, T. A., Wakefield Jr., J. A., & Friedman, A. F. (1985). MMPI Scales for measuring

Eysenck's Personality Factors. Journal of Personality Assessment, 49(2), 146.

Gerhard, M., & de Winter, J. (2015). Workload assessment for mental arithmetic tasks using the

task-evoked pupillary response. PeerJ Computer Science, 1, e16. doi:10.7717/peerj-cs.16

Gerras, S. J., & Wong, L. (2016). Moving Beyond the MBTI. Military Review, 96(2), 54.

Guo, W., Tian, X., Tan, J., Zhao, L., & Li, L. (2016). Driver's mental workload estimation

based on empirical physiological indicators. 2016 31st Youth Academic Annual

Conference of Chinese Association of Automation (YAC), 344.

doi:10.1109/YAC.2016.7804916

Hart, S. G. (2006). NASA-task load index (NASA-TLX); 20 years later. Proceedings of the

Human Factors and Ergonomics Society Annual Meeting.

He, X., Wang, L., Gao, X., & Chen, Y. (2012). IEEE 10th International Conference on

Industrial Informatics (INDIN), 502. doi:10.1109/INDIN.2012.6301203

Heine, T., Lenis, G., Reichensperger, P., Beran, T., Doessel, O., & Deml, B. (2017).

Electrocardiographic features for the measurement of drivers' mental workload. Applied

Ergonomics, 61, 31-43. doi:10.1016/j.apergo.2016.12.015

Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & Van de Weijer, J.

(2011). Eye tracking: A comprehensive guide to methods and measures. New York, NY:

Oxford University Press.

Horrey, W.J., & Wickens, C.D. (2003). Multiple resource modeling of task interference in

vehicle control, hazard awareness and in-vehicle task performance. In Proceedings of the

Second International Driving Symposium on Human Factors in Driver Assessment,

Training, and Vehicle Design 2013. 7-12

Hu, J. L., Lu, J., Tan, W. B., & Lomanto, D. (2016). Training improves laparoscopic tasks

performance and decreases operator workload. Surgical Endoscopy, 30(5), 1742-1746.

doi:10.1007/s00464-015-4410-8

Hygge, S., & Knez, I. (2001). Effects of noise, heat, and indoor lighting on cognitive

performance and self-reported affect. Journal of Environmental Psychology, 21, 291-299.

doi:10.1006/jevp.2001.0222

Jafari, M. J., & Kazempour, M. (2013). Mental processing of human subjects with different

individual characters exposed to low frequency noise. International Journal of

Occupational Hygiene, 5(3), 64-70.

Jahncke, H., Björkeholm, P., Marsh, J. E., Odelius, J., & Sörqvist, P. (2016). Office noise: Can

headphones and masking sound attenuate distraction by background speech? Work, 55(3),

505-513. doi:10.3233/WOR-162421

Jerison, H. J. (1959). Effects of noise on human performance. Journal of Applied Psychology,

43(2), 96-101. doi:10.1037/h0042914

John, O. P., Naumann, L. P., & Soto, C. J. (2008). Paradigm shift to the integrative Big-Five trait

taxonomy: History, measurement, and conceptual issues. In O. P. John, R. W. Robins, &

L. A. Pervin (Eds.), Handbook of personality: Theory and research (pp. 114-158). New

York, NY: Guilford Press.

Judge, T., Rodell, J., Klinger, R., & Simon, L. (2013). Hierarchical representations of the

Five-Factor Model of Personality in predicting job performance: Integrating three organizing

frameworks with two theoretical perspectives. Journal of Applied Psychology, 98(6), 875-925.

Khan, Z. A., & Rizvi, S. H. (2010). A study on the effects of human age, type of computer and

noise on operators' performance of a data entry task. International Journal of

Occupational Safety and Ergonomics, 16(4), 455-463.

Kotani, K., Takamasu, K., & Tachibana, M. (2007). Respiratory-phase domain analysis of heart

rate variability can accurately estimate cardiac vagal activity during a mental arithmetic

task. Methods of Information in Medicine, 46(3), 376-385.

Kou, S., McClelland, A., & Furnham, A. (2018). The effect of background music

and noise on the cognitive test performance of Chinese introverts and extraverts.

Psychology of Music, 46(1), 125-135.

Lahtela, K., Niemi, P., Kuusela, V., & Hypén, K. (1986). Noise and visual choice-reaction time:

A large-scale population survey. Scandinavian Journal of Psychology, 27(1), 52-57.

doi:10.1111/j.1467-9450.1986.tb01186.x

Lercher, P., Evans, G. W., & Meis, M. (2003). Ambient noise and cognitive processes among

primary schoolchildren. Environment and Behavior, 35(6), 725-735.

doi:10.1177/0013916503256260

Levy-Leboyer, C. (1989). Noise effects on two industrial tasks. Work & Stress, 3(4), 315-322.

doi:10.1080/02678378908256949

Liang, S. M., Rau, C., Tsai, P., & Chen, W. (2014). Validation of a task demand measure for

predicting mental workloads of physical therapists. International Journal of Industrial

Ergonomics, 44, 747-752. doi:10.1016/j.ergon.2014.08.002

Ljungberg, J. K., & Neely, G. (2007). Stress, subjective experience and cognitive performance

during exposure to noise and vibration. Journal of Environmental Psychology, 27, 44-54.

doi:10.1016/j.jenvp.2006.12.003

Lovik, A., Verbeke, G., & Molenberghs, G. (2017). Evaluation of a very short test to measure

the big five personality factors on a Flemish sample. Journal of Psychological &

Educational Research, 25(2), 7-17.

Marquart, G., Cabrall, C., & de Winter, J. (2015). Review of eye-related measures of drivers’

mental workload. Procedia Manufacturing, 3, 2854-2861. doi:10.1016/j.promfg.2015.07.783

Mansikka, H., Virtanen, K., Harris, D., & Simola, P. (2016). Fighter pilots’ heart rate, heart

rate variation and performance during an instrument flight rules proficiency test. Applied

Ergonomics, 56, 213-219.

Myers, S. (2016). Myers-Briggs typology and Jungian individuation. Journal of Analytical

Psychology, 61(3), 289-308. doi:10.1111/1468-5922.12233.

Nassiri, P., Monazam, M., Fouladi Dehaghi, B., Abadi, L., Zakerian, S. A., & Azam, K. (2013).

The effect of noise on human performance: A clinical trial. The International Journal of

Occupational and Environmental Medicine, 4(2), 87-95.

Nickels, M. (2014). Improving motion imagery analysis: Investigating

detection failures, remembering to perform deferred intentions

(Electronic Thesis or Dissertation). Retrieved from https://etd.ohiolink.edu/

Othman, N., & Romli, F. (2016). Mental workload evaluation of pilots using pupil dilation.

International Review of Aerospace, 9(3).

Peng, X., He, Q., Ji, T., Wang, Z., & Yang, L. (2006). [Mental workload for mental arithmetic on

visual display terminal]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi

[Chinese Journal of Industrial Hygiene and Occupational Diseases], 24(12), 726-729.

Pittenger, D. J. (1993). The Utility of the Myers-Briggs Type Indicator. Review of Educational

Research, (4), 467.

Pittenger, D. J. (2005). Cautionary Comments Regarding the Myers-Briggs Type Indicator.

Consulting Psychology Journal: Practice & Research, 57(3), 210-221. doi:10.1037/1065-9293.57.3.210

Piasecki, A. M. (2016). Improving Anomaly Detection through Identification of Physiological

Signatures of Unconscious Awareness. Wright State University. Retrieved from


Prabhu, A., Smith, W., Yurko, Y., Acker, C., & Stefanidis, D. (2010). Simulation-Based Surgical

Education: Increased stress levels may explain the incomplete transfer of simulator-acquired

skill to the operating room. Surgery, 147, 640-645.

doi:10.1016/j.surg.2010.01.007

Pujol, S., Levain, J., Houot, H., Petit, R., Berthillier, M., Defrance, J., & ... Mauny, F. (2014).

Association between Ambient Noise Exposure and School Performance of Children

Living in An Urban Area: A Cross-Sectional Population-Based Study. Journal of Urban

Health, 91(2), 256-271. doi:10.1007/s11524-013-9843-6.

Rabinowitz, P. M. (2000). Noise-induced hearing loss. American Family Physician, 61(9),

2749-2759.

Rammstedt, B., & John, O. (2007). Measuring personality in one minute or less: A 10-item short

version of the Big Five Inventory in English and German. Journal of Research in

Personality, 41(1), 203-212.

Reyes Zamorano, E., Álvarez Carrillo, C., Peredo Silva, A., Sandoval, A. M., & Rebolledo

Pastrana, I. M. (2014). Psychometric properties of the big five inventory in a Mexican

sample. Salud Mental, 37(6), 491-497.

Rodrigues, N., & Rebelo, T. (2013). Incremental validity of proactive personality over the Big

Five for predicting job performance of software engineers in an innovative context.

Revista de Psicología del Trabajo y de las Organizaciones, 29(1), 21-27.

doi:10.5093/tr2013a4

Rubio, S., Díaz, E., Martín, J., & Puente, J. M. (2004). Evaluation of subjective mental

workload: A comparison of SWAT, NASA-TLX, and workload profile methods. Applied

Psychology: An International Review, 53(1), 61-86. doi:10.1111/j.1464-0597.2004.00161.x

Rummel, R. J. (1976). Understanding Correlation. Retrieved from

http://www.hawaii.edu/powerkills/UC.HTM

Sönmez, B., Oğuz, Z., Kutlu, L., & Yıldırım, A. (2017). Determination of nurses' mental

workloads using subjective methods. Journal of Clinical Nursing, 26(3-4), 514-523.

doi:10.1111/jocn.13476

Spector, R. (1990). Chapter 58: The Pupils. In H. K. Walker, W. D. Hall, & J. W. Hurst (Eds.),

Clinical Methods: The History, Physical, and Laboratory Examinations (3rd ed.). Boston:

Butterworths.

Sugimoto, I., Kitamura, K., Murai, K., Wang, Y., & Wang, J. (2016). Study on relation between

operator and trainee's mental workload for ship maneuvering simulator exercise using

heart rate variability. 2016 IEEE International Conference on Systems, Man, and

Cybernetics (SMC), Budapest, Hungary, Oct. 2016.

Szalma, J. L., & Hancock, P. A. (2011). Noise effects on human performance: A meta-analytic

synthesis. Psychological Bulletin, 137(4), 682-707. doi:10.1037/a0023987

Tafalla, R. J., & Evans, G. W. (1997). Noise, physiology, and human performance: The potential

role of effort. Journal of Occupational Health Psychology, 2(2), 148-155.

doi:10.1037/1076-8998.2.2.148

Tokuda, S., Obinata, G., Palmer, E., & Chaparro, A. (2011). Estimation of mental workload

using saccadic eye movements in a free-viewing task. Conference Proceedings: Annual

International Conference of the IEEE Engineering in Medicine and Biology Society,

2011, 4523-4529. doi:10.1109/IEMBS.2011.6091121

Van Orden, K. F., Limbert, W., Makeig, S., & Jung, T. (2001). Eye activity correlates of

workload during a visuospatial memory task. Human Factors, 43(1), 111-121.

doi:10.1518/001872001775992570

Vollmer, M. (2015). A robust, simple and reliable measure of heart rate variability using relative

RR intervals. 2015 Computing in Cardiology Conference (CinC), 609.

doi:10.1109/CIC.2015.7410984

Weinstein, N. D. (1974). Effect of noise on intellectual performance. Journal of Applied

Psychology, 59(5), 548-554.

Wickens, C. D. (2008). Multiple resources and mental workload. Human

Factors, 50(3), 449-455.

Wickens, C. D. (2002). Multiple resources and performance prediction. Theoretical Issues in

Ergonomics Science, 3(2), 159-177. doi:10.1080/14639220210123806

Wickens, C., Lee, J. D., Liu, Y., & Becker, S. E. G. (2004). An Introduction to Human Factors

Engineering (2nd ed.). New Jersey: Prentice Hall.