EmoChat: Emotional Instant Messaging with the Epoc Headset
ABSTRACT

Title of Document: EMOCHAT: EMOTIONAL INSTANT MESSAGING WITH THE EPOC HEADSET

Franklin Pierce Wright, Master of Science, 2010

Directed By: Dr. Ravi Kuber, Asst. Professor of Human-Centered Computing, Information Systems
Interpersonal communication benefits greatly from the emotional information
encoded by facial expression, body language, and tone of voice; however, this
information is noticeably missing from typical instant message communication. This
work investigates how instant message communication can be made richer by
including emotional information provided by the Epoc headset. First, a study
establishes that the Epoc headset is capable of inferring some measures of affect with
reasonable accuracy. Then, the novel EmoChat application is introduced which uses
the Epoc headset to convey facial expression and levels of basic affective states
during instant messaging sessions. A study compares the emotionality of
communication between EmoChat and a traditional instant messaging environment.
Results suggest that EmoChat facilitates the communication of emotional information
more readily than a traditional instant messaging environment.
EMOCHAT: EMOTIONAL INSTANT MESSAGING WITH THE EPOC HEADSET
By
Franklin Pierce Wright
Thesis submitted to the Faculty of the Graduate School of the University of Maryland, Baltimore County, in partial fulfillment
of the requirements for the degree of Master of Science
2010
© Copyright by Franklin Pierce Wright
2010
Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
1 Introduction
  1.1 The Importance of Emotion
  1.2 Instant Messaging
  1.3 The Emotional Problem with Instant Messaging
  1.4 Purpose of this Work
  1.5 Structure of this Document
2 Emotion
  2.1 What is Emotion?
    2.1.1 The Jamesian Perspective
    2.1.2 The Cognitive-Appraisal Approach
    2.1.3 Component-Process Theory
  2.2 Emotion and Related Affective States
    2.2.1 Primary versus Complex
  2.3 Expressing Emotion
    2.3.1 Sentic Modulation
  2.4 Measuring Emotion
    2.4.1 Self-Report Methods
    2.4.2 Concurrent Expression Methods
  2.5 Conclusion
3 Existing Techniques for Conveying Emotion during Instant Messaging
  3.1 Introduction
  3.2 Input Techniques
    3.2.1 Textual Cues
    3.2.2 Automated Expression Recognition
    3.2.3 Physiologic Data
    3.2.4 Manual Selection
  3.3 Output Techniques
    3.3.1 Emoticons
    3.3.2 Expressive Avatars
    3.3.3 Haptic Devices
    3.3.4 Kinetic Typography
  3.4 Conclusion
4 Study 1: Validating the Emotiv Epoc Headset
  4.1 Introduction
  4.2 Overview of the Epoc Headset
    4.2.1 Expressiv Suite
    4.2.2 Affectiv Suite
    4.2.3 Cognitiv Suite
  4.3 The Need for Validation
  4.4 Experimental Design
    4.4.1 TetrisClone System Development
    4.4.2 Measures
  4.5 Experimental Procedures
  4.6 Results and Analysis
    4.6.1 Headset-Reported versus Self-Reported Levels of Affect
    4.6.2 Subjective Causes of Affect during Gameplay
  4.7 Discussion
    4.7.1 Consistency in the Present Study
    4.7.2 Future Direction
  4.8 Conclusion
5 Study 2: Emotional Instant Messaging with EmoChat
  5.1 Introduction
  5.2 EmoChat System Development
    5.2.1 Overview
    5.2.2 Traditional Environment
    5.2.3 Application Architecture
  5.3 Experimental Design
    5.3.1 Measures
  5.4 Experimental Setup
  5.5 Experimental Procedures
  5.6 Results and Analysis
    5.6.1 Emotional Transfer Accuracy
    5.6.2 Richness of Experience
    5.6.3 Chat Transcripts
    5.6.4 System Usability
    5.6.5 Informal Interviews
  5.7 Discussion
    5.7.1 Summary of Results
    5.7.2 Consistency with Related Work
    5.7.3 Future Direction
  5.8 Conclusion
6 Conclusion
  6.1 Limitations
  6.2 Summary of Contributions
  6.3 EmoChat Design Considerations
  6.4 EmoChat Compared with Existing Methods of Emotional Instant Messaging
    6.4.1 Input Technique
    6.4.2 Output Technique
  6.5 Future Direction
  6.6 Closing
Appendix A: Validation Study Demographics Questionnaire
Appendix B: Validation Study Experiment Questionnaire
Appendix C: Validation Study Post-Experiment Questionnaire
Appendix D: EmoChat Demographics Questionnaire
Appendix E: EmoChat Emotional Transfer Questionnaire
Appendix F: EmoChat Richness Questionnaire
Bibliography
List of Tables

Table 3.1 Common emoticons in Western and Eastern cultures
Table 3.2 Example of hapticons
Table 4.1 Facial expression features measured by the Epoc headset
Table 4.2 Headset and self-reported levels of affect per subject, per trial
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
Table 4.7 Major themes identified in subjective affect survey results
Table 5.1 Facial movements and affective information used by EmoChat
Table 5.2 EmoChat experimental groups
Table 5.3 ETAQ scores for each subject-pair and both experimental conditions
Table 5.4 Spearman correlation matrix between avatar features and perceived frequency of emotional states (N=5)
Table 5.5 Wilcoxon's signed rank test for significant difference between score means (N=10)
Table 5.6 Linguistic categories with significant differences between experimental conditions
Table 5.7 LIWC affective processes hierarchy
Table 5.8 LIWC relativity hierarchy
List of Figures

Figure 3.1 Examples of expressive avatars
Figure 4.1 Initialization screen for the TetrisClone application
Figure 4.2 TetrisClone application during trials
Figure 4.3 Example output from the TetrisClone application
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement
Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement
Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
Figure 5.1 The EmoChat client application
Figure 5.2 EmoChat server application
Figure 5.3 Mean scores from the richness questionnaire, questions 1-5
Figure 5.4 Mean scores from the richness questionnaire, questions 5-10
Figure 5.5 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q1-5 (N=5)
Figure 5.6 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q5-10 (N=5)
1 Introduction
This chapter introduces the importance of emotion in interpersonal communication,
and presents some of the challenges with including emotion in instant messages. The
purpose of this thesis is then stated, followed by an overview of the document
structure.
1.1 The Importance of Emotion
Consider the following statement:
“The Yankees won again.”
Does the person who makes this remark intend to be perceived as pleased or
disappointed? Enthusiastic or resentful? The remark is purposely emotionally
ambiguous to illustrate just how powerful the inclusion or absence of emotion can be.
If the same remark were said with a big grin on the face, or with the sound of
excitement in the voice, we would certainly understand that this person was quite
pleased that his team was victorious.
If the speaker displayed slumped shoulders and a head tilted downward we would
assume that he was certainly less than jubilant.
It is clear that emotions play a very important role in interpersonal communication,
and without them, communication would be significantly less efficient. A statement
that contains emotion implies context, without the necessity of explicit clarification.
In some cases, how something is said may be just as important as what is said.
1.2 Instant Messaging
Real-time text-based communication is still on the rise. Instant messaging, in one
form or another, has infiltrated nearly all aspects of our digital lives, and shows no
sign of retreat. From work, to school, to play, it’s becoming more and more difficult
to shield ourselves from that popup, or that minimized window blinking in the task
bar, or that characteristic sound our phones make when somebody wants to chat with
us. We are stuck with this mode of communication for the foreseeable future.
1.3 The Emotional Problem with Instant Messaging
As convenient as it is, this text-based communication has inherent difficulties
conveying emotional information. It generally lacks intonation and the subtle non-
verbal cues that make face-to-face communication the rich medium that it is. Facial
expression, posture, and tone of voice are among the highest bandwidth vehicles of
emotional information transfer (Pantic, Sebe, Cohn, & Huang, 2005), but are
noticeably absent from typical text-based communication. Kiesler and colleagues
describe computer-mediated communication (CMC) as "observably poor" at
facilitating the exchange of affective information, and note that CMC
participants perceive the interaction as more impersonal, resulting in less favorable
evaluations of partners (Kiesler, Zubrow, Moses, & Geller, 1985).
The humble emoticon has done its best to remedy the situation by allowing text
statements to be qualified with the ASCII equivalent of a smile or frown. While this
successfully aids in conveying positive and negative affect (Rivera, Cooke, & Bauhs,
1996), emoticons may have trouble communicating more subtle emotions. Other
solutions that have been proposed to address this problem are reviewed in chapter 3.
Each solution is successful in its own right, and may be applicable in different
situations. This work examines a novel method for conveying emotion in CMC,
which is offered to the community as another potential solution.
1.4 Purpose of this Work
The main goal of this body of work is to investigate how instant message
communication is enriched by augmenting messages with emotional content, and
whether this can be achieved through the use of brain-computer interface (BCI)
technology. The Emotiv Epoc headset is a relatively new BCI peripheral intended
for use by consumers and is marketed as being capable of inferring levels of basic
affective states including excitement, engagement, frustration, and meditation. A
study presented in this work attempts to validate those claims by comparing data
reported by the headset with self-reported measures of affect during game play at
varied difficulty levels. The novel EmoChat application is then introduced, which
integrates the Epoc headset into an instant messaging environment to control the
facial expressions of a basic animated avatar, and to report levels of basic affective
states. A second study investigates differences between communication with
EmoChat and a traditional instant messaging environment. It is posited that the
EmoChat application, when integrated with the Epoc headset, facilitates
communication that contains more emotional information, that can be described as
richer, and that conveys emotional information more accurately than traditional
IM environments.
In the end, this work intends to provide, first, a starting point for other
researchers interested in investigating applications that implement the Epoc headset,
and second, results which may support the decision to apply the Epoc in computer-
mediated communication settings.
1.5 Structure of this Document
The remaining chapters of this work are structured as follows:
Chapter 2 provides an overview of emotion, including historical perspectives, and
how emotions are related to affective computing. Chapter 3 reviews existing
techniques for conveying emotion in instant messaging environments. Chapter 4
details a study to determine the accuracy of the Epoc headset. Chapter 5 introduces
EmoChat, a novel instant messaging environment for exchanging emotional
information. A study compares EmoChat with a traditional instant messaging
environment. Chapter 6 summarizes the contributions this work makes, and
compares the techniques for conveying emotion used in EmoChat with techniques
described in the literature.
2 Emotion
This chapter describes some of the historical perspectives on emotion, and introduces
its role in affective computing. It intends to provide a background helpful for the
study of emotional instant messaging.
2.1 What is Emotion?
The problem of defining what constitutes human emotion has plagued psychologists
and philosophers for centuries, and there is still no generally accepted description
among researchers or laypersons. A complicating factor in attempting to define
emotion is our incomplete understanding of the complexities of the human brain.
Some theorists have argued that our perceptions enter the limbic system of the brain
and trigger immediate action without consultation with the more developed cortex.
Others argue that the cortex plays a very important role in assessing how we relate to
any given emotionally relevant situation, and subsequently provides guidance about
how to feel.
2.1.1 The Jamesian Perspective
In 1884, psychologist William James hypothesized that, for any emotion associated
with physiological changes, those same changes must be expressed before the
emotion itself is experienced (James,
1884). In essence, James believed that humans feel afraid because we run from a
bear, and not that we run from a bear because we feel afraid. James emphasized the
physical aspect of emotional experience causation over the cognitive aspect. This
physical action before the subjective experience of an emotion has subsequently been
labeled a "Jamesian" response. For historical accuracy, note that at about the same
time James was developing his theory, Carl Lange independently developed a
very similar theory. Collectively, their school of thought is referred to as the
James-Lange theory (Picard, 1997).
2.1.2 The Cognitive-Appraisal Approach
In contrast to James’ physical theory of emotion, a string of psychologists later
developed several different cognitive-based theories. Notable among them is the
cognitive-appraisal theory, developed by Magda Arnold, and later extended by
Richard Lazarus, which holds that emotional experience starts not with a physical
response, but with a cognitive interpretation (appraisal) of an emotionally-inspiring
situation (Reisenzein, 2006). Continuing with the bear example, Arnold and
Lazarus would have us believe that we hold in our minds certain evaluations of the
bear-object (it is dangerous and bad for us), we see that the bear is running toward us
(is, or will soon be present), we anticipate trouble if the bear reaches us (poor coping
potential), and so we experience fear and run away.
Certainly, there are valid examples of situations that seem to trigger Jamesian
responses. Consider the fear-based startle response when we catch a large object
quickly approaching from the periphery. It is natural that we sometimes react to
startling stimuli before the experience of the fear emotion, jumping out of the way
before we even consciously know what is happening to us. Conversely, consider joy-
based pride after a significant accomplishment. It seems as though pride could only
be elicited after a cognitive appraisal determines that (a) the accomplishment-object is
positive, (b) it has been achieved despite numerous challenges, and (c) it will not be
stripped away. If examples can be found that validate both the James-Lange
approach and the cognitive-appraisal approach, is one theory more correct than the
other?
2.1.3 Component-Process Theory
It is now suggested that emotional experience may result from very complex
interaction between the limbic system and cortex of the brain, and that emotions can
be described as having both physical and cognitive aspects (Picard, 1997).
Encompassing this point of view, that a comprehensive theory of emotion should
consider both cognitive and physical aspects, is the component-process model
supported by Klaus Scherer (Scherer, 2005). This model describes emotion as
consisting of synchronized changes in several neurologically based subsystems,
including cognitive (appraisal), neurophysiologic (bodily symptoms), motivational
(action tendencies), motor expression (facial and vocal expression), and subjective
feeling (emotional experience) components. Note that Scherer regards “subjective
feeling” as a single element among many in what constitutes an “emotion.”
2.2 Emotion and Related Affective States
2.2.1 Primary versus Complex
Some classes of emotion are more basic than others. These are the emotions
that seem most Jamesian in nature: hard-coded, almost reflex-like responses that,
from an evolutionary perspective, contribute the most to our survival. Fear and anger
are among these basic, primary, emotions. Picard labels these types of emotions as
“fast primary,” and suggests that they originate in the limbic system. This is in
contrast to the “slow secondary,” or cognitive-based emotions that require time for
introspection and appraisal, and therefore require some cortical processing. Scherer
calls this slow type of emotion "utilitarian," in contrast with the fast type, which he
terms "aesthetic." An important distinction should be made between emotions and other
related affective states such as moods, preferences, attitudes, and sentiments. A
distinguishing factor of emotions is their comparatively short duration relative to
these other affective states.
2.3 Expressing Emotion
2.3.1 Sentic Modulation
If emotion has both physical and cognitive aspects, it seems natural that some
emotions can be experienced without much, if any, outward expression. Interpersonal
communication may benefit from those overt expressions of emotion that can be
perceived by others. Picard discusses what she calls "sentic modulation": overt or
covert changes in physiological features that, although they do not constitute emotion
on their own, act as symptoms of emotional experience (Picard, 1997). The easiest
of these sentic responses to recognize are arguably facial expression, tone of voice,
and posture or body language. Of these three, research suggests that facial
expression is the highest-bandwidth channel for conveying emotional
state (Pantic, et al., 2005). There are other, more covert, symptoms of emotional
experience, including heart rate, blood pressure, skin conductance, pupil dilation,
perspiration, respiration rate, and temperature (Picard, 1997). Recent research has
also demonstrated that some degree of emotionality may be inferred from
neurologic response as measured by electroencephalogram (Khalili & Moradi, 2009;
Sloten, Verdonck, Nyssen, & Haueisen, 2008). Facial expression deserves additional
attention, being one of the most widely studied forms of sentic modulation. Ekman
and others have identified six basic emotion/facial expression combinations that
appear to be universal across cultures: fear, anger, happiness, sadness,
disgust, and surprise (Ekman & Oster, 1979). Because these facial
expressions are so widely understood, it should be comparatively easy
to infer emotion from them.
2.4 Measuring Emotion
2.4.1 Self-Report Methods
Perhaps the most widely used method for determining emotional state is through self-
report. This technique asks a subject to describe an emotion he or she is
experiencing, or to select one from a pre-made list. Scherer discusses some of the
problems associated with relying on self-reported emotional experience. On one
hand, limiting self-report of emotion to a single list of words from which the subject
must choose the most appropriate response may lead to emotional “priming,” and/or a
misrepresentation of true experience. On the other hand, allowing a subject to
generate a freeform emotional word to describe experience adds significant difficulty
to any type of analysis (Scherer, 2005). Another method for self-report measurement
is to have a subject identify emotional state within the space of some dimension.
Emotional response is often described as falling somewhere in the two-dimensional
valence/arousal space proposed by Lang. This dimensional model of affect
deconstructs specific emotions into some level of valence (positive feelings versus
negative feelings), and some level of arousal (high intensity versus low intensity)
(Lang, 1995). As an example, joyful exuberance is characterized by high valence and
high arousal, while sadness is characterized by low valence and low arousal.
Lang posits that all emotion falls somewhere in this two-dimensional space. Problems
may arise when emotion is represented in this space without knowing the triggering
event, since several distinct emotions may occupy similar locations in valence-arousal
space, e.g., intense anger versus intense fear, both located in high-arousal, low-
valence space (Scherer, 2005).
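To make the dimensional model concrete, the short Python sketch below places the example emotions above as points in valence/arousal space. The `AffectPoint` type is invented for this illustration, and the coordinate values are assumptions chosen to mirror the examples in the text, not empirical measurements; the sketch shows why anger and fear are difficult to separate in this space.

```python
from dataclasses import dataclass

@dataclass
class AffectPoint:
    """A point in Lang's two-dimensional valence/arousal space.

    Both axes are normalized to [-1.0, 1.0]: valence runs from negative
    to positive feeling, arousal from low to high intensity.
    """
    valence: float
    arousal: float

# Illustrative placements only -- coordinates are assumptions made for
# this example, chosen to reflect the descriptions in the text.
emotions = {
    "joyful exuberance": AffectPoint(valence=0.9, arousal=0.9),
    "sadness":           AffectPoint(valence=-0.7, arousal=-0.6),
    "intense anger":     AffectPoint(valence=-0.8, arousal=0.9),
    "intense fear":      AffectPoint(valence=-0.8, arousal=0.85),
}

def distance(a: AffectPoint, b: AffectPoint) -> float:
    """Euclidean distance between two points in valence/arousal space."""
    return ((a.valence - b.valence) ** 2 + (a.arousal - b.arousal) ** 2) ** 0.5

# Anger and fear occupy nearly the same high-arousal, low-valence
# region, so the dimensional model alone cannot distinguish them.
print(distance(emotions["intense anger"], emotions["intense fear"]))      # small
print(distance(emotions["intense anger"], emotions["joyful exuberance"])) # large
```

Any classifier operating purely on these two dimensions would need additional context, such as the triggering event, to separate emotions that share a region.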
2.4.2 Concurrent Expression Methods
An objective way to measure emotional state is to infer user affect by monitoring
sentic modulation. This method requires algorithms and sensors to perceive the
symptoms of emotion and infer state, and is significantly more
challenging than simply asking a person how he or she feels (Tetteroo, 2008). Still, this is
an active area of research within the affective computing domain because user
intervention is not required to measure the emotion, which may be beneficial in some
cases. Techniques include using camera or video input along with classification
algorithms to automatically detect emotion from facial expression (Kaliouby &
Robinson, 2004), monitoring galvanic skin response to estimate level of arousal
(Wang, Prendinger, & Igarashi, 2004), and using AI learning techniques to infer
emotional state from electroencephalograph signals (Khalili & Moradi, 2009; Sloten,
et al., 2008).
2.5 Conclusion
Picard defines affective computing as "computing that relates to, arises from, or
influences emotion.” (Picard, 1997) According to Picard, some research in this
domain focuses on developing methods of inferring emotional state from sentic user
characteristics (facial expression, physiologic arousal level, etc.), while other research
focuses on methods that computers could use to convey emotional information
(avatars, sound, color, etc.) (Picard, 1997). A study of emotional instant messaging
is necessarily a study of affective computing. In the context of instant messaging,
Tetteroo separates these two research areas of affective computing into the study of
input techniques, and output techniques (Tetteroo, 2008). The next chapter reviews
how these techniques are used during instant message communication to convey
emotional information.
3 Existing Techniques for Conveying Emotion during Instant Messaging
3.1 Introduction
A review of the current literature on emotional communication through instant
message applications has identified several techniques for enriching text-based
communication with affective content. These techniques can be broadly classified as
either input techniques, inferring or otherwise reading in the emotion of the user, or
output techniques, displaying or otherwise conveying the emotion to the partner
(Tetteroo, 2008). These categories are reviewed in turn.
3.2 Input Techniques
Research concerning how emotions can be read into an instant messaging system
generally implements one of several methods: inference from textual cues, inference
through automated facial expression recognition, inference from physiologic data, or
manual selection.
3.2.1 Textual Cues
Input methods that use text cues to infer the emotional content of a message generally
implement algorithms that parse the text of a message and compare its contents with a
database of phrases or keywords for which emotional content is known. Yeo
implements a basic dictionary of emotional terms, such as “disappointed,” or
“happy,” that incoming messages are checked against. When a match is found, the
general affective nature of the message can be inferred (Yeo, 2008). Others have
used more complicated natural language processing algorithms to account for the
subtleties of communicative language (Neviarouskaya, Prendinger, & Ishizuka,
2007).
Another example of using text cues to infer the emotion of a message involves simply
parsing text for occurrences of standard emoticons. The presence of the smiley
emoticon could indicate a positively valenced message, while a frowning emoticon
could indicate negative valence, as implemented by Rovers & Essen (2004).
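To make the approach concrete, the following sketch infers coarse message valence from a small keyword dictionary combined with standard emoticon matching. This is illustrative only, and is not the actual implementation used by Yeo or by Rovers and Essen; the keyword and emoticon lists are assumptions.

```python
# Illustrative sketch (not the actual Yeo or Rovers & Essen
# implementation): infer coarse message valence from a small
# dictionary of emotional keywords plus standard emoticons.
import re

AFFECT_KEYWORDS = {"happy": "positive", "glad": "positive",
                   "disappointed": "negative", "sad": "negative"}
EMOTICONS = {":-)": "positive", ":)": "positive",
             ":-(": "negative", ":(": "negative"}

def infer_valence(message: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a message."""
    score = 0
    # Keyword matches: +1 for positive terms, -1 for negative terms.
    for word in re.findall(r"[a-z']+", message.lower()):
        if word in AFFECT_KEYWORDS:
            score += 1 if AFFECT_KEYWORDS[word] == "positive" else -1
    # Emoticon matches contribute in the same way.
    for emoticon, valence in EMOTICONS.items():
        if emoticon in message:
            score += 1 if valence == "positive" else -1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A real system would need a much larger lexicon and handling for negation ("not happy"), which is what the natural language processing approaches cited above attempt to address.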
3.2.2 Automated Expression Recognition
Automated expression recognition uses classification algorithms to infer emotion
from camera or video images of a subject's face.
Kaliouby and Robinson use an automated “facial affect analyzer” in this manner to
infer happy, surprised, agreeing, disagreeing, confused, indecisive, and neutral states
of affect (Kaliouby & Robinson, 2004). The classifier makes an evaluation about
affective state based on information about the shape of the mouth, the presence of
teeth, and head gestures such as nods.
3.2.3 Physiologic Data
Some physiologic data is known to encode levels of affect, including galvanic skin
response, skin temperature, heart beat and breathing rate, pupil dilation, and electrical
activity measured from the surface of the scalp (Picard, 1997). Wang and colleagues
used GSR data to estimate levels of arousal in an instant messaging application
(Wang, et al., 2004). Specifically, spikes in the GSR data were used to infer high
levels of arousal, and the return to lower amplitudes signaled decreased level of
arousal.
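The spike-based inference can be sketched as follows. This is an illustrative simplification: Wang and colleagues do not specify their exact signal processing here, and the threshold and baseline-tracking parameters are assumptions.

```python
# Illustrative sketch of GSR-based arousal labeling (a simplification
# of the idea behind Wang et al.'s approach, not their actual code):
# flag high arousal when the signal rises sharply above a slowly
# adapting baseline, and low arousal when it returns.
def arousal_from_gsr(samples, threshold=0.2):
    """Label each GSR sample 'high' or 'low' arousal relative to a
    running baseline (simple exponential moving average)."""
    labels = []
    baseline = samples[0]
    for value in samples:
        labels.append("high" if value - baseline > threshold else "low")
        baseline = 0.9 * baseline + 0.1 * value  # slowly track the signal
    return labels
```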
Output from electroencephalograph (EEG) has also been used to classify emotional
state into distinct categories within arousal/valence space by several researchers
(Khalili & Moradi, 2009; Sloten, et al., 2008). These studies use AI learning
techniques to classify affective state into a small number of categories with moderate
success.
3.2.4 Manual Selection
The most basic method of adding emotional content to an instant message is by
simple manual selection or insertion. This method can take the form of a user
selecting from a list of predefined emotions or emotional icons with a mouse click, or
by explicitly inserting a marker, e.g., an emoticon, directly into the body of the message
text. This type of input technique is widely used (Fabri & Moore, 2005; Sanchez,
Hernandez, Penagos, & Ostrovskaya, 2006; Wang, et al., 2004).
3.3 Output Techniques
Output techniques describe methods that can be used to display emotional content to
a chat participant after it has been input into the system. These techniques generally
involve using emoticons, expressive avatars, haptic devices, and kinetic typography.
3.3.1 Emoticons
Emoticons are typically understood as small text-based or graphical representations of
faces that characterize different affective states, and have been ever evolving in an
attempt to remedy the lack of non-verbal cues during text chat (Lo, 2008). Examples
of commonly used emoticons are presented in the table below.
Meaning     Western Emoticon   Eastern Emoticon
Happy       :-)                (^_^)
Sad         :-(                (T_T)
Surprised   :-o                O_o
Angry       >:-(               (>_<)
Wink        ;-)                (~_^)
Annoyed     :-/                (>_>)
Table 3.1 Common emoticons in Western and Eastern cultures
Emoticons are perhaps the most widely used method for augmenting textual
communication with affective information. A survey of 40,000 Yahoo Messenger
users reported that 82% of respondents used emoticons to convey emotional
information during chat (Yahoo, 2010). A separate study by Kayan and colleagues
explored differences in IM behavior between Asian and North American users and
reported that of 34 total respondents, 100% of Asian subjects used emoticons while
72% of North Americans (with an aggregate of 85% of respondents) used emoticons
on a regular basis (Kayan, Fussell, & Setlock, 2006). These usage statistics
underscore the prevalence of emoticons in IM communication.
Sanchez and colleagues introduced an IM application with a unique twist on the
standard emoticon. Typical emoticons scroll with the text they are embedded in, and
so convey only brief glimpses of emotion.
This novel application has a persistent area for emoticons that can be updated as often
as the user sees fit, and does not leave the screen as messages accumulate (Sanchez,
et al., 2006). Building on Russell's model of affect (Russell, 1980), the team
developed 18 different emoticons, each with three levels of intensity to represent a
significant portion of valence/arousal emotional space.
3.3.2 Expressive Avatars
Using expressive avatars to convey emotion during IM communication may be
considered a close analog to the way emotion is encoded by facial expression during
face-to-face conversation, considering that facial expression is among the highest
bandwidth channels of sentic modulation (Pantic, et al., 2005).
A study by Kaliouby and Robinson presents an instant messaging application called
FAIM which uses automated facial expression recognition to infer affect, and
displays an expressive avatar reflecting that affect to the chat partner (Kaliouby &
Robinson, 2004). Affective states currently supported by FAIM include happy,
surprised, agreeing, disagreeing, confused, indecisive, and neutral.

Figure 3.1 Examples of expressive avatars

Fabri and Moore investigated the use of animated avatars capable of emotional facial
expressions in an instant messaging environment (Fabri & Moore, 2005). They
compared this with a condition in which the avatar was not animated and did not
change facial expressions, except for minor random eyebrow movement. The
hypothesis was that the condition in question would result in a higher level of
"richness," comprised of high levels of task involvement, enjoyment, sense of
presence, and sense of copresence. Participants interacted through the IM application
during a classical survival exercise, in which both subjects were tasked with
collectively ordering a list of survival items in terms of importance. An avatar
representing each chat partner could be made to display one of Ekman's six universal
facial expressions including happiness, surprise, anger, fear, sadness, and disgust
(Ekman & Oster, 1979), by clicking on a corresponding icon in the interface.
Significant results from the study indicated higher levels of task involvement and
copresence in the expressive avatar condition, equally high levels of presence in both
conditions, and a higher level of enjoyment in the non-expressive avatar condition.
The AffectIM application developed by Neviarouskaya also uses expressive avatars
to convey emotion during instant message communication (Neviarouskaya, et al.,
2007). Rather than requiring a user to select an expression from a predefined set,
AffectIM infers the emotional content of a message by analyzing the text of the
message itself, and automatically updates an avatar with the inferred emotion. A
comparison study identified differences between separate configurations of the
AffectIM application: one in which emotions were automatically inferred, one that
required manual selection of a desired emotion, and one that selected an emotion
automatically in a pseudo-random fashion (Neviarouskaya, 2008). The study
compared “richness” between conditions, comprised of interactivity, involvement,
sense of copresence, enjoyment, affective intelligence, and overall satisfaction.
Significant differences indicated a higher sense of copresence in the automatic
condition than in the random condition, and higher levels of emotional intelligence in
both the automatic and manual conditions than in the random condition.
3.3.3 Haptic Devices
Haptic instant messaging is described as instant messaging that employs waveforms
of varying frequencies, amplitudes, and durations, transmitted and received by
purpose-built haptic devices (force-feedback joysticks, haptic touchpads, etc.), to
which special emotional meaning can be attached (Rovers & Essen, 2004). Rovers
and Essen introduce their idea of "hapticons," which are described as "small
programmed force patterns that can be used to communicate a basic notion in a
similar manner as ordinary icons are used in graphical user interfaces." Their
preliminary application, HIM, parses instant message text for occurrences of
traditional emoticons, e.g., :), etc., and sends a predefined waveform to any connected
haptic devices. For example, a smiley face sends a waveform with moderate
frequency that slowly grows in amplitude, while a frowny face is represented by
several abrupt pulses with high frequency and amplitude.

Emoticon   Meaning   Haptic Waveform
:-)        Happy     (waveform illustration)
:-(        Sad       (waveform illustration)

Table 3.2 Example of hapticons

The ContactIM application developed by Oakley and O'Modhrain takes a different
approach to integrating haptic information with an instant messaging environment. A
plugin for the Miranda IM environment was created that mimics the effects of tossing
a ball between partners by using a force-enabled haptic device such as the Phantom or
a standard force-feedback joystick (Oakley, 2003). The application is designed to
allow each user to impart a specific velocity and trajectory to the ball during a throw.
The generated momentum of the ball is persistent until the chat partner picks it up. In
this way, the act of tossing the ball may convey some degree of emotionality, e.g., a
lightly thrown ball as a playful flirtatious gesture, or a fast throw to indicate
disagreement or anger. Emphasis is placed on the asynchronous nature of typical
instant message use, and the application has been designed to suit this mode of
interaction by keeping the general characteristics of the tossed ball persistent until
interaction by the receiver changes it.
3.3.4 Kinetic Typography
Kinetic typography is described as the real-time modification of typographic
characteristics such as animation, color, font, and size, and may be used to
convey affective information (Yeo, 2008). Yeo developed an IM client that inferred
affective meaning through keyword pattern matching, and used kinetic typography to
update the text of messages in real time (Yeo, 2008).
An instant messaging client developed by Wang and colleagues represents emotion in
arousal/valence space by combining kinetic typography with galvanic skin response.
Manually selected text animations are meant to represent valence, while GSR that is
recorded and displayed to the chat partner represents level of arousal. Users were
asked when they felt the most involved during the online communication, and
answers typically corresponded to peaks in GSR level (Wang, et al., 2004). The
study participants reported that the inclusion of arousal/valence information made the
communication feel more engaging and that it was preferred over traditional text-only
chat, although some users indicated that they would not always want their partner to
be aware of their arousal level (Wang, et al., 2004).
22
3.4 Conclusion
This chapter has separated the major components of emotional instant messaging into
two categories: input techniques and output techniques. Among input techniques,
inference from textual cues, inference through automated facial expression
recognition, inference from physiologic data, and manual selection have been
reviewed. Output techniques that were discussed include emoticons, expressive
avatars, haptic devices, and kinetic typography.
The next chapter introduces the Epoc headset and describes some of its capabilities.
This headset is used as the emotional input device for the EmoChat system discussed
in a subsequent chapter, and can be thought of as using automated facial expression
recognition in combination with physiologic data to infer and convey emotion. The
next chapter also presents a study that investigates the validity of the Epoc affect
classifier.
4 Study 1: Validating the Emotiv Epoc Headset
4.1 Introduction
This study investigates the validity of the Epoc headset in terms of how accurately it
measures levels of excitement, engagement, and frustration. Self-reported measures
of excitement, engagement, and frustration are collected after games of Tetris are
played at varied difficulty levels. The self-reported measures are compared with data
from the headset to look for relationships.
4.2 Overview of the Epoc Headset
The EmoChat application makes use of the Epoc headset for measuring affective state
and facial expression information. This headset, developed by Emotiv, was one of
the first consumer-targeted BCI devices to become commercially available.
Alternative BCI devices that were considered include the Neurosky Mindset and the
OCZ Neural Impulse Actuator. The Epoc was selected because of the comparatively
large number of electrodes (14) that it uses to sense electroencephalography (EEG)
and electromyography (EMG) signals, and the resulting capabilities. Additionally,
the Epoc has a growing community of active developers who form a support network
for other people using the software development kit to integrate headset capabilities
with custom applications.
Traditional EEG devices require the use of a conductive paste in order to reduce
electrical impedance and improve conductivity between the electrode and the scalp.
The Epoc device, however, replaces this conductive paste with saline-moistened felt
pads, which reduces set up time and makes clean up much easier.
A software development kit provides an application programming interface to allow
integration with homegrown applications, and a utility called EmoKey can be used to
associate any detection with any series of keystrokes for integration with legacy
applications. The developers have implemented three separate detection “suites”
which monitor physiologic signals in different ways, and are reviewed below.
4.2.1 Expressiv Suite
This suite monitors EMG activity to detect facial expressions including left/right
winks, blinks, brow furrowing/raising, left/right eye movement, jaw clenching,
left/right smirks, smiles, and laughter. The detection sensitivity can be modified
independently for each feature and for different users. Universal detection signatures
are included for each feature, but signatures can also be trained to increase accuracy.
Lower Face Movements   Upper Face Movements   Eye Movements
Smirk Right            Brow Raise             Look Left
Smirk Left             Brow Furrow            Look Right
Smile                  Wink Right
Laugh                  Wink Left
Jaw Clench             Blink

Table 4.1 Facial expression features measured by the Epoc headset
4.2.2 Affectiv Suite
The Affectiv suite monitors levels of basic affective states including instantaneous
excitement, average excitement, engagement, frustration, and meditation. Detection
algorithms for each state are proprietary and have not been released to the public,
therefore the given labels may be somewhat arbitrary, and may or may not accurately
reflect affective state. The goal of the present study is to determine the accuracy of
these detections with a longer-term goal of investigating whether this information can
be used to augment the instant messaging experience through the presentation of
emotional content.
4.2.3 Cognitiv Suite
This suite allows a user to train the software to recognize an arbitrary pattern of
electrical activity measured by EEG/EMG that is associated with a specific,
repeatable thought or visualization. The user may then reproduce this specific pattern
to act as the trigger for a binary switch. Skilled users may train and be monitored for
up to 4 different thought patterns at once.
4.3 The Need for Validation
The Epoc affectiv suite purports to measure levels of excitement, engagement,
frustration, and meditation; however, the algorithms used to infer these states are
proprietary and closed-source. There has been little research that references the Epoc
headset, perhaps because it is still new and relatively unknown. No studies thus far
have evaluated the accuracy of its affective inference algorithms. Campbell and
colleagues used raw EEG data from the headset as input to a P300-based selection
engine (Campbell, et al., 2010), and several others have reviewed the device (Andrei,
2010; Sherstyuk, Vincent, & Treskunov, 2009). The API provides methods to retrieve
each affectiv suite score at any given moment, but does not reveal how any score is
calculated. It is understandable that Emotiv has chosen to keep this part of
their intellectual property out of the public domain, but if these affectiv measurements
are to be used in any serious capacity by researchers or developers, evidence should
be provided to support the claim that reported affectiv suite excitement levels are
reasonable estimates of actual subject excitement levels, that affectiv suite
engagement levels are reasonable estimates of actual subject engagement levels, and
so on.
4.4 Experimental Design
A study was designed to determine the accuracy of the Epoc affectiv suite by
presenting study participants with stimuli intended to elicit different levels of
affective and physiologic responses (in the form of game play at varied levels), and
measuring for correlation between output from the headset and self-reported affective
experience. Since the overall goal of this thesis work is to investigate how the
inclusion of affective information enriches instant message communication, the
excitement, engagement, and frustration headset detections are validated here. It is
thought that they are the most applicable to a study of affective communication.
The study design used in the present study is adapted from similar work by Chanel
and colleagues, during which physiological metrics were monitored as participants
played a Tetris game (Chanel, Rebetez, Betrancourt, & Pun, 2008). The difficulty
level of the game was manipulated in order to elicit differing affective states. Self-
report methods were also used to collect data about participants’ subjective
experience of affect. The goal of the study was to use these physiological metrics
with machine learning techniques to classify affective experience into three
categories, including, anxiety, engagement, and boredom. It is thought that the
anxiety category of the Chanel study may be a close analog to the frustration
component of the Epoc affectiv suite.
4.4.1 TetrisClone System Development

A small Tetris application (TetrisClone) was developed to automate the experiment
and to aid with data collection. The application was written in C# using Microsoft
Visual Studio 2008 and interfaces with the Emotiv Epoc headset through the supplied
API.
The initialization screen for TetrisClone can be seen in fig. 4.1. This screen collects
the test subject’s name and is used to start logging data coming from the Epoc
headset. After logging begins, the right panel can be hidden so that it does not
distract the subject during the experiment. A screenshot of the TetrisClone application
during one of the trials is presented in fig. 4.2.
Figure 4.1 Initialization screen for the TetrisClone application

Figure 4.2 TetrisClone application during trials

4.4.2 Measures

4.4.2.1 Questionnaires

All participants completed a total of 3 surveys to collect basic demographic
information, self-reported levels of affect during the experiment, and open-ended
opinions about the causes of affect during game play. These questionnaires are
provided in appendices A-C. Demographics questions asked about age, gender,
familiarity with Tetris, and skill at computer/console games. Self-reported levels of
affect questions asked subjects to rate their experiences of excitement, engagement,
and frustration on a 5 point likert-scale between trial games. Open ended questions
asked subjects to describe what game events made them feel excited, engaged, and
frustrated.
4 T
h
in
an
In
ar
T
ex
m
en
.4.2.2 Hea
The TetrisClo
eadset at app
n the left pan
nd moves on
n this way, it
re associated
The output C
xcitement (s
meditation, h
ngagement,
adset Data
one applicati
proximately
ne of fig. 4.1
n to the next
t becomes ea
d with which
Figure 4
SV files them
short term), e
owever the p
and frustrati
ion receives
2 Hz. and w
). As a part
, the content
asier to deter
h points in th
4.3 Example ou
mselves con
excitement (
present study
ion compone
29
affective sta
writes these t
ticipant comp
ts of the text
rmine which
he experimen
utput from the T
ntain time-sta
long term), e
y is only con
ents.
ate informati
to a hidden t
pletes one p
t box control
h groupings o
nt.
TetrisClone app
amped, head
engagement
ncerned with
ion from the
text box cont
ortion of the
l are output t
of headset d
plication
dset-reported
t, frustration
h excitement
e Epoc
trol (visible
e experiment
to a CSV file
data records
d levels of
, and
t (short term
t
e.
m),
30
4.5 Experimental Procedures
A total of seven subjects participated in the experiment; however, data from one
subject was incomplete due to problems with maintaining a consistently good signal
quality from the headset. This incomplete data is not included in the analysis. The
remaining subjects (N=6) were all male, aged 25-30 (mean=28.5). They reported
being pretty to very familiar with the Tetris game (mode=very familiar), playing
computer or console games occasionally to very frequently (mode=very frequently),
but playing Tetris itself never to rarely (mode=rarely). Participants rated themselves
as being of average to far above average skill (mode=above average) when it came to
playing computer or console games.
Subjects arrived, were seated in front of a laptop, asked to review and sign consent
forms, and then completed the Demographics Questionnaire (Appendix A). The
subjects were then fitted with the Epoc headset. Care was taken to ensure that each
headset sensor reported strong contact quality in the Control Panel software. The
self-scaling feature of the Epoc headset required 15-20 minutes prior to data
collection. During this time, the subjects were asked to play several leisurely games
of Tetris as a warm-up activity.
Once the headset adjusted to the subjects, the Tetris game/Headset data recorder was
launched. Subjects played through a series of three Tetris games to determine “skill
level,” as calculated by averaging the highest levels reached during each game. The
TetrisClone application has no maximum level cap, although levels above 10 are so
difficult that progressing further is not practical. After each game the subjects rested
for 45 seconds to allow any heightened emotional states to return to baseline.
Once skill level was determined, three experimental conditions were calculated
automatically as follows:
High Difficulty Level = Skill Level (+ 2)
Med Difficulty Level = Skill Level
Low Difficulty Level = Skill Level (– 2)
The subjects then played through a randomly ordered set of 6 trials consisting of 2
games at each difficulty level, e.g., [8,6,6,4,8,4]. Trials were randomized to account
for order effects. During each trial, games lasted for 3 minutes at a constant
difficulty/speed. If the subjects reached the typical game over scenario, the playing
field was immediately reset and the subjects continued playing until the 3 minutes
were over. At the end of each round, the subjects completed a portion of the
Experiment Questionnaire (Appendix B). The subjects rested for 45 seconds after
completing the questionnaire, but before beginning the next round, to allow emotional
state to return to baseline. Headset logging stopped after all 6 rounds had been
played.
After the subjects finished all game play tasks the facilitator removed the headset, and
subjects completed the final Post-Experiment Questionnaire (Appendix C). The
subject was paid for his time, signed a receipt, and was free to leave.
4.6 Results and Analysis
4.6.1 Headset-Reported versus Self-Reported Levels of Affect
The main goal for this study was to determine whether or not the Epoc headset
reports data that is congruent with self-reported data of the same features. This was
done in order to establish the validity of the Epoc affectiv suite. Headset and self-
report data were compared a number of different ways. Each trial yielded 3 minutes
worth of headset data sampled at 2 Hz. (approximately 360 samples x 3 affective
features per sample = 1080 individual data elements), which were to be compared
with a single self-reported level of affect for each of the 3 affective features in
question (excitement, engagement, and frustration). Headset data from each trial was
reduced to 3 individual data elements by taking the mean of all sampled values for
each of the three affective features. Headset means from each trial were then paired
with the corresponding self-reported levels of affect for that trial. These data are
reproduced in the table below.
Subject Trial Condition H_exc H_eng H_fru S_exc S_eng S_fru
Subject_1 1 low 0.283975 0.488978 0.347653 2 2 1
Subject_1 2 high 0.315356 0.559145 0.400994 4 5 3
Subject_1 3 med 0.316626 0.52268 0.430862 5 4 4
Subject_1 4 low 0.317267 0.573343 0.396643 4 5 1
Subject_1 5 med 0.389669 0.543603 0.371616 5 5 2
Subject_1 6 high 0.305706 0.596423 0.454396 5 5 3
Subject_2 1 high 0.301968 0.654293 0.580068 3 3 3
Subject_2 2 med 0.384194 0.660933 0.785552 3 3 3
Subject_2 3 high 0.323292 0.559863 0.368594 3 4 4
Subject_2 4 med 0.271679 0.505589 0.336272 2 3 2
Subject_2 5 low 0.289168 0.604071 0.505809 2 2 2
Subject_2 6 low 0.304198 0.530313 0.37589 3 2 2
Subject_3 1 high 0.302396 0.406588 0.34533 3 3 3
Subject_3 2 med 0.279739 0.409497 0.356766 3 3 2
Subject_3 3 low 0.353323 0.402969 0.413634 2 2 1
Subject_3 4 low 0.311233 0.425248 0.360592 2 3 1
Subject_3 5 med 0.365323 0.427358 0.429031 3 4 2
Subject_3 6 high 0.361752 0.543518 0.380738 4 4 4
Subject_4 1 low 0.233371 0.557115 0.391817 2 3 1
Subject_4 2 high 0.334173 0.485377 0.448109 3 4 4
Subject_4 3 med 0.265334 0.532051 0.463684 2 2 3
Subject_4 4 med 0.416719 0.519389 0.522566 2 2 1
Subject_4 5 low 0.281803 0.445416 0.424542 2 3 1
Subject_4 6 high 0.305135 0.508839 0.474708 3 3 3
Subject_5 1 low 0.27187 0.597617 0.362354 3 4 2
Subject_5 2 med 0.256403 0.744225 0.361584 4 4 2
Subject_5 3 high 0.292769 0.659934 0.377663 4 5 3
Subject_5 4 med 0.307077 0.650888 0.533751 4 5 1
Subject_5 5 high 0.256848 0.710124 0.460227 2 2 2
Subject_5 6 low 0.24097 0.594766 0.555613 2 4 1
Subject_6 1 high 0.250623 0.563784 0.455011 3 4 3
Subject_6 2 low 0.271444 0.59069 0.452902 5 5 1
Subject_6 3 high 0.282375 0.55344 0.458738 2 3 4
Subject_6 4 med 0.257751 0.573329 0.482535 4 4 1
Subject_6 5 med 0.235875 0.558884 0.460652 3 4 2
Subject_6 6 low 0.305082 0.622437 0.490638 4 4 1

Table 4.2 Headset and self-reported levels of affect per subject, per trial
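The reduction from raw headset samples to the per-trial means shown in table 4.2 can be sketched as follows. This is illustrative only; the actual reduction was performed on the TetrisClone logs, with subsequent analysis in SPSS.

```python
# Illustrative sketch of the per-trial data reduction: a trial's ~360
# headset samples (2 Hz over 3 minutes) are collapsed to one mean per
# affective feature, ready to pair with that trial's self-report.
from statistics import mean

def reduce_trial(samples):
    """samples: list of (excitement, engagement, frustration) tuples
    logged over one trial. Returns the per-feature means."""
    exc, eng, fru = zip(*samples)
    return {"H_exc": mean(exc), "H_eng": mean(eng), "H_fru": mean(fru)}
```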
The non-parametric Spearman’s rho was selected as the correlation metric for
determining statistical dependence between headset and self-reported levels of affect
because of the mixed ordinal-interval data. The resulting correlation coefficients and
significances were calculated with SPSS and are presented in the table below (N=36).
Correlation Pair   Coefficient   Sig. (2-tailed)
Excitement         0.261         0.125
Engagement         0.361         0.030
Frustration        -0.033        0.849
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
This analysis suggests that of the three features of affect that were examined,
engagement appears to significantly correlate (p=.03) between what is reported by the
headset and what is experienced by the subject. No significant correlation of
excitement or frustration between headset and self-reported affect levels was found.
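For reference, Spearman's rho is simply Pearson correlation computed on ranks (with ties given their average rank), which is what makes it appropriate for the mixed ordinal-interval data here. The minimal sketch below is illustrative; the thesis analysis itself was performed in SPSS.

```python
# Minimal Spearman's rho: rank-transform both variables (ties get
# their average rank), then compute Pearson correlation on the ranks.
# Illustrative only; the thesis analysis was done in SPSS.
def _ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation of two equal-length sequences (assumes
    neither sequence is constant)."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```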
The levels of headset reported excitement, engagement, and frustration presented in
table 4.2 are average levels over entire 3 minute trials. It could be possible that self-
reported affect levels collected after the trials might better relate to headset levels
averaged over smaller subdivisions of the trial time. For example, a subject who
experienced high levels of frustration during the last 15 seconds of game play, but
low levels of frustration at all other times may have self-reported a high level of
frustration after the trial, even though the subject generally experienced low levels.
To investigate whether headset data averaged over smaller subdivisions of trial time
better correlated with self-reported data, headset data was averaged from time slices
of the last 15, 30, 60, and first 60 seconds of trial data. Spearman’s rho was
calculated for each new dataset by comparing with the same self-report data. These
results are provided in the table below (N=36), along with original correlation results
from table 4.3.
Correlation Pair   Time Division   Coefficient   Sig. (2-tailed)
Excitement         all 3 min       0.261         0.125
                   last 60 s       0.133         0.439
                   last 30 s       0.210         0.219
                   last 15 s       0.174         0.310
                   first 60 s      0.285         0.092
Engagement         all 3 min       0.361*        0.030
                   last 60 s       0.340*        0.042
                   last 30 s       0.291         0.085
                   last 15 s       0.316         0.061
                   first 60 s      0.229         0.179
Frustration        all 3 min       -0.033        0.849
                   last 60 s       -0.102        0.554
                   last 30 s       -0.049        0.775
                   last 15 s       0.005         0.977
                   first 60 s      0.176         0.305
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
This analysis suggests that no new significant relationships between headset and self-
report data are found when analyzing headset data from specific subdivisions of time
(last 60, 30, 15 seconds, and first 60 seconds), however, it does appear that self-
reported excitement and frustration levels correlate best with averaged headset data
from the first 60s of each trial.
The data were further analyzed by calculating grand means for each difficulty
condition, and for both headset and self-reported levels of affect. Grand means are
presented in the table below.
Condition H_exc H_eng H_fru S_exc S_eng S_fru
low    0.29   0.54   0.42   2.75   3.25   1.25
med    0.31   0.55   0.46   3.33   3.58   2.08
high   0.30   0.57   0.43   3.25   3.75   3.25
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
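Computing these grand means amounts to grouping per-trial values by difficulty condition and averaging within each group. A minimal sketch follows; the (condition, value) pair layout is an assumption for illustration:

```python
def grand_means(rows):
    # rows: iterable of (condition, value) pairs, e.g. ("low", 0.29).
    # Returns {condition: mean of all values recorded under that condition}.
    totals = {}
    for condition, value in rows:
        s, n = totals.get(condition, (0.0, 0))
        totals[condition] = (s + value, n + 1)
    return {c: s / n for c, (s, n) in totals.items()}
```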
Spearman’s rho was again used as the metric for determining statistical dependence
between headset and self-reported affective levels of the grand mean data. Resulting
correlation coefficients and significances were calculated with SPSS and are
presented in the table below (N=3).
Correlation Pair Coefficient Sig. (2-tailed)
Excitement    1.00   0.01
Engagement    1.00   0.01
Frustration   0.50   0.667
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
This analysis suggests a very high correlation of excitement and engagement between
headset and self-reported levels of affect (p=.01); however, these results should be
interpreted with caution, considering that this is a correlation between means and the
number of values being compared is small (N=3). No significant correlation of
frustration between headset and self-reported levels of affect was found.
To visualize the relationship between grand means of headset and self-reported affect
features, line charts are presented below.
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement

Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement

Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
The significant correlations (p=.01) between headset and self-report levels of
excitement and engagement are apparent in fig. 4.4 and fig. 4.5. Frustration is seen to
correlate moderately well from low to medium difficulty conditions; however the
degree and direction of change from medium to high difficulty conditions are clearly
in disagreement. As noted above, cautious interpretation of figures 4.4-4.6 is prudent
considering the use of grand means and the small number of data points, but the
emergence of a relationship between headset and self-reported levels of excitement
and engagement is suggested.
4.6.2 Subjective Causes of Affect during Gameplay
After the game play portion of the experiment concluded, subjects were asked to
complete a brief survey to collect their subjective opinions about what caused their
experiences of excitement, engagement, and frustration during game play. Participant
responses were analyzed to identify general themes. The data were coded for these
themes, which have been aggregated and are presented in the table below.
Q1. What types of events caused you to get excited during game play?
Extracted Themes Participant Responses
Competent game performance
(S3) “Doing well” (S1) “Clearing many lines at once” (S1) “Seeing a particularly good move open up”
Game speed/difficulty (S5) “Speed increase of block decent” (S2) “When blocks got faster” (S4) “Game speed”
Poor game performance
(S5) “Block failing to land where desired” (S3) “A poor move” (S4) “A mistake during otherwise good game play” (S6) “End of game when doing bad” (S2) “When the blocks would get too high”
Positively perceived game variables
(S3) “A good sequence of pieces” (S1) “Getting pieces I was hoping for” (S6) “Seeing blocks I needed to make 4 rows”
Q2. What made the game engaging?
Extracted Themes Participant Responses
Game speed/difficulty (S2) “The intensity” (S1) “Speed increases”
Cognitive load
(S4) “You have to think fast” (S6) “Have to think about what’s happening on board” (S1) “Anticipating future moves” (S1) “Seeing game board get more full”
Game simplicity (S3) “Small learning curve in general” (S1) “Simplicity” (S4) “Few input options, enough to stay interesting”
Q3. What happened during the game that made you feel frustrated?
Extracted Themes Participant Responses
Negatively perceived game variables
(S5) “Too many of same block in a row” (S3) “Bad sequence of blocks” (S1) “Getting the same piece over and over” (S6) “Not getting the blocks I needed” (S5) “Block did not fit pattern I was constructing” (S1) “Getting undesired pieces”
Poor game performance (S5) “When a block would land out of place” (S3) “Poor move”
Game speed/difficulty (S3) “Game moving too quickly” (S4) “Game speed increase”
Table 4.7 Major themes identified in subjective affect survey results
4.7 Discussion
4.7.1 Consistency in the Present Study
The main goal of this study was to determine how accurately the Epoc headset
measures levels of excitement, engagement, and frustration in order to establish the
validity of the Epoc affectiv suite. To this end, the TetrisClone application was
developed and used to log headset output while subjects played games of Tetris at
varied difficulty levels. During the study, subjects were asked to self-report how
excited, engaged, and frustrated they felt for each game that they played.
The responses to the self-report excitement, engagement, and frustration questions
were then statistically compared with the output from the headset. This analysis
suggested that self-reported levels of engagement correlated well with levels reported
by the headset. To a lesser degree, the analysis suggested that excitement levels
measured by the headset correlated fairly well with self-reported levels. Frustration
levels measured by the headset, however, did not appear to correlate with self-
reported levels.
Subjective responses about what made the game engaging seem to corroborate the
self-report and headset data. General trends in the data described engagement as
increasing over low, medium, and high difficulty levels. The two main themes
identified in responses to “what makes the game engaging,” were game
speed/difficulty and cognitive load. As level increases, game speed inherently also
increases. It makes sense that increased difficulty of the game should demand greater
concentration, more planning, and more efficient decision making—all suggestive of
increased cognitive load. With respect to existing literature, the Chanel study, on
which the experimental design of the present study was based, found a similar upward
linear trend in participant arousal, however, the relationship between engagement and
arousal has not been established.
Excitement trends in self-report and headset data generally showed an increase from
low to medium difficulty, then a slight decrease in the high difficulty condition.
Responses to the question, “what types of events caused you to get excited during
game play,” support this trend. General themes extracted from responses to the
question include competent game performance, game speed/difficulty, poor game
performance, and positively perceived game variables (such as getting a block type
you were hoping for). Game speed increases as difficulty condition increases, so its
contribution to overall excitement level is always present in quantities that increase
with difficulty level. It might be assumed that competent game performance and poor
game performance have a balancing effect on one another, i.e., when one increases
the other decreases, thereby creating a single contributing factor to excitement that is
always present, and arguably stable. The decisive contributor to excitement level
may be the positively perceived game variables. It seems feasible that game variables
that happen to be in the player’s favor should occur at a similar frequency, regardless
of difficulty level. It may also be that these occurrences are less noticed at
higher difficulty levels due to increased cognitive load, as the mind is occupied by
game tasks of greater importance. This lack of recognition of positive game variables may
be the reason that excitement increases from low to medium difficulty conditions, but
then decreases in the high condition. A similar trend reported by the Chanel study
occurs in the valence dimension. Valence is shown to increase from low to medium
difficulty conditions, then decrease in the high condition, although the relationship
between valence and excitement has not been established.
4.7.2 Future Direction
It might be beneficial to take a more granular approach to validating output from the
Epoc headset to determine whether specific game events, e.g., a mistake or an optimal
placement, influence affect levels reported by the headset. The present study only
looked at average headset output over large spans of time, but there is a great deal of
variability in the data, some of which might have a relationship with game events.
This more granular approach would require the ability to record specific game events,
e.g., clearing a line, and cross referencing with data from the headset. This could be
accomplished by recording video of the game play. It might also yield interesting
results if headset data were tested for any correlation with other known physiological
measures of affect such as GSR, or skin temperature.
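The proposed event-level analysis could be prototyped by aligning time-stamped game events with the headset's sample stream. The sketch below is only an outline of that alignment step; the sampling rate and post-event window length are illustrative assumptions, not values from the study:

```python
def mean_after_events(event_times, samples, fs=4.0, window_s=2.0):
    # For each game-event timestamp (seconds from trial start), average
    # the headset samples in the window_s seconds that follow the event.
    # samples: one affect channel, recorded at fs Hz for the whole trial.
    results = []
    for t in event_times:
        start = int(t * fs)
        stop = int((t + window_s) * fs)
        segment = samples[start:stop]
        if segment:  # skip events whose window falls outside the recording
            results.append(sum(segment) / len(segment))
    return results
```

Event-window averages computed this way (e.g., after every cleared line) could then be compared against a trial baseline, or against simultaneous GSR or skin-temperature recordings.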
4.8 Conclusion

With the accuracy of at least some of the headset-reported levels of affect established,
an instant message application was developed that uses output from the headset to
control a simple animated avatar and display data from the affectiv suite during
messaging sessions. This application is called EmoChat. In the next chapter, a study
investigates how EmoChat can be used to enrich emotional content during IM
sessions.
5 Study 2: Emotional Instant Messaging with EmoChat
5.1 Introduction
Existing instant messaging environments generally fail to capture non-verbal cues
that greatly increase information transfer bandwidth during face-to-face
communication. The present study is a step toward investigating the use of a low-cost
EEG device as a means to reintroduce non-verbal information that is otherwise lost
during computer-mediated communication.
An Instant Messaging application (EmoChat) has been developed that integrates with
the Emotiv Epoc headset to capture facial movements that are used to animate an
expressive avatar. Output from the headset is also used to convey basic affective
states of the user (levels of excitement, engagement, and frustration).
The present study examines how emotional information is transferred differently
between users of the EmoChat application and a “traditional” instant messaging
environment in terms of emotionality. This study addresses the following research
questions:
1. Does the system facilitate communication that generally contains more
emotional information?
2. Does the system provide a greater degree of richness (as defined in section
5.3.1)?
3. Is the emotional state of participants more accurately conveyed/interpreted?
4. How usable is a system that implements this technology?
5.2 EmoChat System Development

5.2.1 Overview

EmoChat is a client/server application that facilitates the exchange of emotional
information during instant message communication. The application was developed
with C# in Microsoft Visual Studio 2008.

Traditional instant messaging environments typically rely on manually generated
emoticons in order to shape the emotional meaning of a message. EmoChat
introduces a novel way to capture and convey emotional meaning by integrating with
the Emotiv Epoc headset, a low cost, commercially available EEG device that is
capable of inferring facial expression and basic affective information from raw EEG
data.

Figure 5.1 The EmoChat client application
Facial expression information that is captured by the Epoc headset is passed to
EmoChat and used to animate a simple avatar with brow, eye, and mouth movements.
Affective information captured by the headset is used to modify the value of a series
of progress bar style widgets. The previous validation study suggested that
excitement and engagement are reasonably estimated by the headset; however, it was
decided that other affective measures from the headset (frustration and meditation)
would also be presented in the EmoChat application to give the users a chance to
decide for themselves whether or not these measures are of any value.
Although the application has been specifically designed to integrate with the Epoc
headset, a headset is not required. All facial movements and affective levels may be
manually manipulated by the user at a very granular level, i.e., users may override
brow control, but leave eye, mouth, and affect control determined by the headset.
Manual override of facial and affect control is permitted whether or not a headset is
being used. A summary of the facial movements and affective information conveyed
by EmoChat is presented below:
Eyebrow         Eyes         Mouth            Affect
Strong raise    Blink        Laugh            Excitement
Weak raise      Left wink    Strong smile     Average excitement
Neutral         Right wink   Weak smile       Engagement
Weak furrow     Neutral      Neutral          Frustration
Strong furrow   Left look    Clench (frown)   Meditation
                Right look

Table 5.1 Facial movements and affective information used by EmoChat
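The per-channel override behavior described above can be modeled as a simple merge: any channel the user has overridden wins, while the remaining channels continue to track the headset. This is only an illustrative model in Python, not the application's actual C# code, and the channel names are taken from the table for convenience:

```python
def resolve_avatar_state(headset_state, manual_overrides):
    # headset_state: channel -> value inferred by the headset, e.g.
    #   {"brow": "neutral", "eyes": "blink", "mouth": "weak smile"}
    # manual_overrides: the subset of channels the user controls by hand.
    # Overridden channels win; all others follow the headset.
    return {channel: manual_overrides.get(channel, value)
            for channel, value in headset_state.items()}
```

With an empty override map the avatar is fully headset-driven; overriding only "brow" leaves eye, mouth, and affect control with the headset, as described above.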
5.2.2 Traditional Environment

A “traditional” instant messaging application was approximated by removing avatar
and affect meters from the original EmoChat application, so that subjects would be
required to convey emotional information strictly through text. This trimmed-down
IM environment presents subjects with standard chat input and output panes.

5.2.3 Application Architecture

The EmoChat application architecture follows the client/server model. A server
application listens for connections from networked client applications. After a
connection is established, the server monitors for data transmissions. Once data is
received, the server retransmits the data to all other connected clients. The clients and
server may reside on one or more computers networked across a LAN or WAN. This
configuration allows the server to log all communication events for offline analysis
without requiring any extra effort by client users.

Figure 5.2 EmoChat server application
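The relay behavior of the server (receive from one client, retransmit to all others, log everything) can be modeled compactly. The sketch below is a behavioral model in Python, not the actual C# server; the class and method names are invented for illustration:

```python
import time

class RelayServer:
    """Behavioral model of the EmoChat server: each received message is
    logged with a timestamp and retransmitted to every other client."""

    def __init__(self):
        self.inboxes = {}  # client_id -> messages delivered to that client
        self.log = []      # (timestamp, sender_id, payload) tuples

    def connect(self, client_id):
        # Register a new client with an empty delivery queue.
        self.inboxes[client_id] = []

    def receive(self, sender_id, payload):
        # Log the event, then fan the payload out to all *other* clients.
        self.log.append((time.time(), sender_id, payload))
        for client_id, inbox in self.inboxes.items():
            if client_id != sender_id:
                inbox.append(payload)
```

Because every message, facial-expression update, and affect update passes through `receive`, the time-stamped log falls out of the design for free, which is what enables the offline analysis described above.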
EmoChat introduces a novel way to share emotional information during instant
message communication by integrating with one of the first commercially available
brain-computer interface devices. This application not only demonstrates a new and
interesting way to enrich computer-mediated communication, but also serves as an
evaluation tool for the Epoc EEG headset and its inference algorithms. The results
from studies performed with EmoChat may contribute toward a foundation for future
research into BCI applications.
5.3 Experimental Design
The present study follows a crossover-repeated measures design (within-subjects).
Paired subjects spend time chatting with one another using both the EmoChat
application (HE condition), and a “traditional” instant messaging environment (TE
condition). The order in which subjects are presented with these environments is
determined by random assignment to one of two groups.
Group   Condition Order
I       TE then HE
II      HE then TE
Table 5.2 EmoChat experimental groups
Additionally, roles of expresser (EX) and perceiver (PX) are divided between each
subject pair. These roles are used to determine which version of the Emotional
Transfer Accuracy Questionnaire (ETAQ) is completed (discussed below). EX and
PX roles are swapped by paired subjects between each experimental condition.
During each chat session, participants are asked to chat for 15 minutes about any
topic(s) of interest. Discussion guides for eliciting emotional responses were
considered but decided against. It was thought that since this was a repeated
measures design, two separate topic lists would need to be generated; one for each
experimental condition. These two topic lists would need to elicit similar emotional
responses in order to generate data that were comparable. The challenges posed by
creating two equivalent topic lists led to the decision to eliminate them entirely and to
allow subjects to guide their own conversations. It was also thought that unguided
conversation would lead to a more natural exchange.
5.3.1 Measures
5.3.1.1 Questionnaires
Subjects complete a number of surveys during the experiment. Basic demographic
information and responses to the International Positive Affect Negative Affect
Schedule Short Form (I-PANAS-SF) are collected prior to the experiment. After
participating in each condition, subjects complete another battery of questionnaires,
including the Emotional Transfer Accuracy Questionnaire (ETAQ), and Richness of
Experience Questionnaire (REQ). After the HE condition subjects complete the
System Usability Scale (SUS). Descriptions of each survey, and sources, when
applicable, are presented below. Questionnaires are provided in appendices D-F,
where permitted by copyright.
I-PANAS-SF
Prior to the experiment, each subject is asked to complete Thompson’s International
Positive Affect Negative Affect Schedule Short Form (I-PANAS-SF) (Thompson,
2007), a shortened, but psychometrically sound version of the original PANAS
developed by Watson and colleagues (Watson, Clark, & Tellegen, 1988), in order to
assess prevailing emotional state at the time of the experiment.
ETAQ
Subjects from both experimental conditions are asked to complete a novel
questionnaire designed to assess how accurately emotional information was conveyed
and perceived during the condition (Appendix E). Wording of the questions is
slightly altered, depending on subject role. Subjects are asked to indicate, “how often
during this chat session [EX role: you / PX role: your partner] experienced the
following emotions,” and are presented with a list of 16 emotional adjectives; 4
representative adjectives from each quadrant of the arousal/valence space defined by
Russell’s circumplex model of affect (Russell, 1980). Subjects are asked to rate
frequency of experience on a five-point likert-scale from never/very seldom to very
often (see Appendix D for the complete version of the emotional transfer accuracy
questionnaire).
REQ
The concept of “richness” has been applied in related studies of instant message
communication involving expressive avatars, and has been used to compare IM
applications having expressive avatar capabilities with those that do not (Fabri, 2006;
Neviarouskaya, 2008). Fabri defines a high degree of “richness” as manifesting
through greater task involvement, greater enjoyment, a higher sense of presence, and
a higher sense of copresence (Fabri, 2006), while Neviarouskaya defines “richness”
in terms of interactivity, involvement, copresence, enjoyment, affective intelligence,
and overall satisfaction (Neviarouskaya, 2008). For the purposes of the present study,
a combination of richness components from both Fabri and Neviarouskaya are
adapted and measured, and include: task enjoyment, copresence, affective
intelligence, and overall satisfaction. Questions related to each richness component
are presented and responses are collected on a 5-point symmetric likert-scale ranging
from strongly disagree to strongly agree (See Appendix F for complete version of
richness questionnaire). Subjects from both experimental conditions are asked to
complete this richness questionnaire.
SUS
The usability of any new system is an important measure that should be used to
inform future design revisions. Although the usability of the prototype EmoChat
system is expected to be lower than that of an established traditional instant
messaging environment, it is nonetheless a very worthwhile measure. The present
study implements Brooke’s “Quick and Dirty” system usability scale (SUS) to obtain
a subjective account of usability for the HE condition only (Brooke, 1996) (see
Appendix F for a complete version of the SUS questionnaire).
5.3.1.2 Chat Transcripts
Logs of chat sessions are collected by the EmoChat server for offline analysis. These
logs include time-stamped messages that are exchanged between participants and in
the HE condition, time-stamped facial expression and affect information that is either
automatically generated by the Epoc headset, or manually generated by user
manipulation of the EmoChat override controls.
Of particular interest to the present study is the number of affect and affect-related
terms that are used during each experimental condition. To facilitate the recognition
and frequency analysis of these specific terms, the Linguistic Inquiry and Word
Count tool developed by Pennebaker and colleagues is used (Pennebaker, Booth, &
Francis, 2007). This tool was created to aid in the measurement of specific types of
words representing 74 different linguistic dimensions, and will be used to analyze
chat logs from both experimental conditions.
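As a rough illustration of this kind of word-count analysis, the rate of affect terms in a transcript can be computed against a category word list. LIWC itself uses large, validated category dictionaries; the tiny lexicon in the sketch below is a placeholder, and the function is an assumption for illustration rather than LIWC's actual algorithm:

```python
def affect_term_rate(transcript, affect_lexicon):
    # Fraction of words in the transcript that appear in the affect
    # lexicon (a toy stand-in for one LIWC category dictionary).
    words = [w.strip(".,!?;:\"'()").lower() for w in transcript.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in affect_lexicon)
    return hits / len(words)
```

Comparing this rate between the TE and HE chat logs is the kind of per-condition frequency comparison the LIWC analysis supports.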
5.3.1.3 Unstructured Interviews
After subjects have completed the experiment, they join the facilitator as a pair for a
short informal interview to collect opinions about each experimental condition.
Discussion topics and subjective experiences are documented for each pair of
subjects. The data collected during the interviews is used to supplement other
measures taken during the study.
5.4 Experimental Setup
Two laptop computers (Apple MacBook and IBM ThinkPad X41) were located in lab
areas separated by a closed door. The computers were both running Windows XP
and connected to a private network via Ethernet cable and a Linksys WRT160N
router. An external standard-sized mouse was provided so that subjects would not
have to interact with the computers by using the built-in track pad (MacBook) or
isometric joystick (ThinkPad). A wireless dongle for communicating with the Epoc
headset was connected to the MacBook.
5.5 Experimental Procedures
Subjects recruited for the study (N=10) were aged 22 to 32 years (mean=28.1).
Gender of subjects was divided with 7 males and 3 females participating. All
subjects reported having an intermediate or higher level of computer experience
(mode=intermediate) and between basic and expert typing skills
(mode=intermediate). Subjects’ highest level of education ranged from high school
to bachelor’s degree (mode=bachelor’s degree). Reported use of instant message
applications varied from never to every day (mode=every day).
All subjects scored extremely low on the negative affect component of the I-PANAS-SF
(mean=5.6 out of 25) and scored between 10 and 22 on the positive affect
component (mean=16.4 out of 25).
Subjects who had previously participated in the Epoc validation study (N=5) were
paired with subjects who had no prior experience with the headset. Paired subjects
were randomly assigned to one of the two experimental groups described in section
(5.3). Six subjects were assigned to experimental group I and four subjects were assigned
to experimental group II.
Subject pairs arrived and were asked to review and sign consent forms. One subject in
each pair had participated in the previous Epoc validation study. Subjects were
briefed on the tasks they were asked to complete, including participating in two 15
minute chat sessions, and filling out several questionnaires. The pre-experiment
questionnaire was administered, which included basic demographic information and
the I-PANAS-SF (International Positive Affect Negative Affect Schedule Short
Form). Prior to the experiment, subject pairs were randomly assigned to one of the
experimental groups described above.
Subjects were separated in their respective areas of the lab, and the facilitator
initialized the chat application corresponding to the first experimental condition. In
the (HE) condition, the facilitator fitted the (EX) subject with the Epoc headset, and
then briefly explained the functions of the EmoChat application independently to both
subjects. The facilitator then asked the subjects to begin chatting, and left the room.
After 15 minutes had elapsed, the facilitator informed the subjects that the first chat
session had concluded, and administered a battery of questionnaires containing the
Emotional Transfer Accuracy Questionnaire (ETAQ) and the Richness of Experience
Questionnaire (REQ). In the (HE) condition, a third System Usability Scale (SUS)
questionnaire was also administered. While the questionnaires were being completed,
the facilitator initialized the chat application corresponding to the second
experimental condition. Once both subjects completed their respective
questionnaires, the facilitator asked the subjects to begin the second 15 minute chat
session and left the room. At the conclusion of the second chat session, subjects were
asked to complete another battery of questionnaires including the ETAQ, the
Richness of Experience Questionnaire, and in the (HE) condition, the SUS.
After both subjects completed this final battery of questionnaires they joined the
facilitator for a short informal interview about their experiences with each chat
application. Questions were asked, such as, “What did you think of each chat
program,” “how often did you manually change the avatar,” and “which program did
you enjoy more,” however the interview was open ended and frequently led to
participants discussing a very wide range of topics.
5.6 Results and Analysis
5.6.1 Emotional Transfer Accuracy
The results from the ETAQ were used to identify differences in emotional transfer
accuracy between conditions. Subject pairs chatted with each other during both
experimental conditions: traditional environment (TE) and the EmoChat environment
(HE). During each condition, one subject was assigned the expresser role (EX), and
one subject was assigned the perceiver role (PX). After each chat session concluded,
the subject with the EX role answered questions on the ETAQ about him/herself, e.g.,
“How often did YOU experience the following emotions…,” while the subject with
the PX role answered questions about his/her partner, e.g., “How often did YOUR
PARTNER experience the following emotions.” This procedure yielded matched sets
of responses for each experimental condition.
The absolute value of the difference between the rank of each matched response was
calculated for each item on the ETAQ and then summed to give a total score
indicating how accurate emotional transfer was for that condition. A score of 0
indicates perfect transfer accuracy, i.e., EX and PX roles selected exactly the same
responses for every item on the questionnaire. The ETAQ scores for each subject
pair are presented below.
                 ETAQ Score (TE)   ETAQ Score (HE)
Subject-Pair 1          8                 5
Subject-Pair 2         20                11
Subject-Pair 3          7                10
Subject-Pair 4          9                11
Subject-Pair 5          5                 8
Mean                  9.8                 9

Table 5.3 ETAQ scores for each subject-pair and both experimental conditions
To determine whether the mean scores for each condition were statistically different,
the Wilcoxon signed rank test was used to calculate a significance value. Results
from the test indicate that emotional transfer accuracy was not significantly different
between conditions (p=.891).
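The Wilcoxon signed-rank statistic itself is straightforward to compute; the sketch below produces only the test statistic W, whereas the significance values reported in this thesis were obtained from SPSS. Zero differences are dropped and tied absolute differences share an average rank:

```python
def wilcoxon_w(a, b):
    # Wilcoxon signed-rank statistic for paired samples a and b: rank the
    # nonzero |differences| (ties get average ranks), then return the
    # smaller of the positive- and negative-difference rank sums.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j + 1 < len(diffs) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_pos, w_neg)
```

The statistic would then be referred to the Wilcoxon distribution (or a normal approximation) for a p-value, which is the step SPSS performed here.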
5.6.1.1 Effects of Avatar Features on Perception of Affect
Additional analysis was performed in the HE condition by testing for correlation
between avatar features of EX role with perception of affect in the PX role (measured
by the ETAQ). This was a way to see how changes in the EmoChat affectiv meters
and avatar expressions influenced the perception of emotions by the chat partner.
Headset data recorded during the chat session were used to calculate average levels of
excitement, frustration, engagement, and meditation displayed by the affect meters.
Headset data were also used to calculate the proportion of time that avatars displayed
mouth and brow facial expression features, e.g., smile, neutral mouth, raised brow,
etc. Data calculated from the headset were tested for correlation with the non-headset
partner’s responses on the ETAQ, e.g., “how often was your partner annoyed,” scored
on a 5 point likert-scale from seldom to very often. A summary of salient
relationships is provided below, and an abridged correlation matrix supporting these
relationships follows in table 5.4.
Strong Correlation (r² ≥ .95)
1. Participants with avatars displaying a lower ratio of smiles were perceived as
annoyed more often.
2. Participants with avatars displaying higher average engagement levels were
perceived as annoyed more often.
Moderate Correlation (r² ≥ .80)
1. Participants with avatars displaying higher average excitement levels were
perceived as pleased more often.
2. Participants with avatars displaying higher average frustration levels were
perceived as satisfied more often.
Table 5.4 Spearman correlation matrix between avatar features and perceived frequency of emotional states (N=5)

             |-------- Affect Meters --------|  |------- Lower Face (Mouth) ------|  |-- Upper Face (Brow) --|
             eng    excLT  excST  med    frus    clench  neutral  smile   laugh      furrow  neutral  raise
angry        .866   .866   .577   -.866  .000    .289    .866     -.866   -.289      .866    -.289    -.577
  r-squared  .750   .750   .333   .750   .000    .083    .750     .750    .083       .750    .083     .333
annoyed      .975   .564   .205   -.718  .359    .462    .821     -.975   -.154      .718    -.103    -.872
  r-squared  .950   .318   .042   .516   .129    .213    .674     .950    .024       .516    .011     .761
frustrated   -.051  .154   -.103  -.667  .103    .667    -.154    .051    -.462      -.154   .667     -.205
  r-squared  .003   .024   .011   .445   .011    .445    .024     .003    .213       .024    .445     .042
bored        .289   -.289  -.289  .577   .577    .000    .289     -.289   .000       .000    -.289    -.289
  r-squared  .083   .083   .083   .333   .333    .000    .083     .083    .000       .000    .083     .083
tired        .289   .289   .289   .289   .289    .000    .577     -.289   -.577      .289    -.577    .000
  r-squared  .083   .083   .083   .083   .083    .000    .333     .083    .333       .083    .333     .000
interested   .289   .577   .289   -.866  .000    .577    .289     -.289   -.577      .289    .289     -.289
  r-squared  .083   .333   .083   .750   .000    .333    .083     .083    .333       .083    .083     .083
astonished   -.866  -.289  .000   .577   -.289   -.289   -.577    .866    -.289      -.577   .000     .866
  r-squared  .750   .083   .000   .333   .083    .083    .333     .750    .083       .333    .000     .750
excited      -.211  .527   .527   -.580  -.580   .000    -.053    .211    -.316      .158    .053     .369
  r-squared  .044   .278   .278   .336   .336    .000    .003     .044    .100       .025    .003     .136
happy        .000   .707   .707   -.354  -.354   .000    .354     .000    -.707      .354    -.354    .354
  r-squared  .000   .500   .500   .125   .125    .000    .125     .000    .500       .125    .125     .125
pleased      .000   .738   .949   -.158  -.791   -.632   .369     .000    -.105      .580    -.791    .580
  r-squared  .000   .544   .900   .025   .625    .400    .136     .000    .011       .336    .625     .336
content      -.369  .000   .158   .527   .053    -.105   .000     .369    -.632      -.211   -.316    .527
  r-squared  .136   .000   .025   .278   .003    .011    .000     .136    .400       .044    .100     .278
relaxed      .205   -.051  .103   .616   .154    -.410   .359     -.205   .103       .205    -.667    .051
  r-squared  .042   .003   .011   .379   .024    .168    .129     .042    .011       .042    .445     .003
satisfied    .224   -.224  -.447  .224   .894    .671    .224     -.224   -.671      -.224   .224     -.447
  r-squared  .050   .050   .200   .050   .800    .450    .050     .050    .450       .050    .050     .200
5.6.2 Richness of Experience

Subject responses to the Richness of Experience questionnaire were used to
determine whether “richness” was higher in the headset condition than in the
traditional condition. Responses were analyzed a number of different ways. First,
mean scores were calculated per condition for each question. A summary of these
scores compared between conditions is presented in the figures below.

Figure 5.3 Mean scores from the richness questionnaire, questions 1-5

Figure 5.4 Mean scores from the richness questionnaire, questions 5-10

It is apparent from the visualization that mean scores for all but Q5 are higher in the
headset environment than in the traditional environment. To determine if the
difference in mean scores for each question is significant, the Wilcoxon signed-rank
test was performed on the results. This particular test for significance was selected
instead of the paired t-test or ANOVA because the data do not have normal
distributions or equal variances, and only two conditions are being compared.
Question                                                                        TE-Mean  HE-Mean  Wilcoxon Sig.
Q1   I felt it was important to respond after each of my partner’s statements     4.1      4.2        -
Q2   I was interested in my partner’s responses                                   4.3      4.5        -
Q3   I enjoyed communicating with my partner                                      4.4      4.5        -
Q4   I had the sensation that my partner was aware of me                          3.9      4.3        -
Q5   I felt as though I was sharing the same space with my partner                3.3      3.3        -
Q6   I had the sensation that my partner was responding to me                     4.1      4.2        -
Q7   I was able to successfully convey my feelings with this application          3.3      3.9        -
Q8   My partner was able to successfully convey feelings with this application    3.0      3.7       .05
Q9   I understood the emotions of my partner                                      2.8      3.6       .05
Q10  I am satisfied with the experience of communication with this system         3.7      4.1        -
Table 5.5 Wilcoxon's signed rank test for significant difference between score means (N=10)
This analysis suggests that responses to Q8 and Q9 are significantly different (p=.05)
between experimental conditions. Means of the remaining questions from the survey,
although typically higher in the HE condition, do not differ significantly between
conditions.
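For illustration, the paired comparison described above can be sketched in pure Python. The function below computes the standard signed-rank statistic W; the score vectors are hypothetical stand-ins, not the study's actual per-subject data.

```python
def wilcoxon_signed_rank(a, b):
    """Wilcoxon signed-rank statistic W for paired samples a, b:
    rank the nonzero |differences| (ties receive the average rank)
    and return the smaller of the positive- and negative-rank sums."""
    diffs = [y - x for x, y in zip(a, b) if y != x]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover the whole block of tied |differences|
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_pos, w_neg)

# Hypothetical paired scores for one question: traditional environment
# (TE) versus headset environment (HE) for the same ten subjects.
te = [3, 3, 4, 3, 2, 3, 4, 3, 3, 2]
he = [4, 4, 4, 5, 3, 4, 4, 4, 3, 3]
print(wilcoxon_signed_rank(te, he))
```

The resulting W would then be compared against critical-value tables (or a normal approximation) for the effective sample size; in practice a statistics package such as SPSS, as used in this study, handles that step.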
5.6.2.1 Differences between Headset and Non-Headset Users in EmoChat
During the HE condition (when subjects were chatting with the EmoChat application)
only one subject from each pair was wearing an Epoc headset. This provided an
opportunity to identify differences between richness of experience when using a
headset with the application, which allows the automatic update of avatar expression
and affective meters, and when not using a headset, which requires manual updating.
Mean scores of REQ responses can be visualized in the figures below.

Figure 5.5 Comparison of mean responses to REQ between subjects with headsets
versus without headsets, in the EmoChat condition, Q1-5 (N=5)

Figure 5.6 Comparison of mean responses to REQ between subjects with headsets
versus without headsets, in the EmoChat condition, Q5-10 (N=5)
The Wilcoxon signed-rank test for significant differences did not produce any p-
values at or below .05 for this dataset, largely due to the small sample size (N=5);
however, responses with the largest mean differences between user groups (Q1, Q8,
and Q9) may still suggest meaningful relationships. Additional studies are needed to
determine any significance.
Compared with the headset users, non-headset users typically felt that it was more
important to respond to each of their chat partners’ statements, and believed that their
partners were more able to successfully convey feelings with the application. Non-
headset users also typically indicated that they were better able to understand the
emotions of their headset-wearing chat counterparts.
5.6.3 Chat Transcripts
Chat logs were analyzed by using the Linguistic Inquiry and Word Count (LIWC)
tool developed by Pennebaker and colleagues (Pennebaker, et al., 2007). This
application reports on the frequency of words belonging to specific linguistic
categories within the text that it processes, and has been used to analyze both
transcribed spoken passages, and written text. Output from the application is entirely
context-independent, meaning that the sentence “Quick brown foxes jump over
dogs,” produces the exact same output as a nonsensical sentence using the same
words, like, “Brown over jump foxes dogs quick.” The present study was primarily
interested in comparing the difference in number of affect-related terms that appear in
the transcripts from each experimental condition; however, additional linguistic
categories are considered in the analysis.
Transcripts from each trial were analyzed with LIWC. Output from the tool consisted
of frequencies of words from over 70 different categories. Output was analyzed with
SPSS for significant differences between experimental conditions using Wilcoxon’s
signed-rank test. Results are provided below.
Category      HE-Mean   TE-Mean   Mean Dif.   Wilcoxon sig.
word count    1376.8    1566.4    -189.6      0.05
affect (%)    4.66      3.44      1.22        0.05
you (%)       1.608     1.072     0.536       0.05
relative (%)  5.988     7.086     -1.098      0.05

Table 5.6 Linguistic categories with significant differences between experimental conditions
This analysis suggests that the HE condition corresponds with significantly lower
word count and fewer relativity-related words, and significantly more affect-related
and you-related words. These metrics are described below in more detail.
Word Count
This metric represents the total number of words in the processed text and includes
both words that are found in the program’s dictionary file, e.g., “together,” and
nonsense words like, “orrrrly.” Most other LIWC generated metrics are calculated as
percentage of total word count. For all subject pairs that participated in the
experiment, total word count was smaller in the HE condition than in the TE
condition.
Affect Processes
The affect word category consists of 916 distinct emotion-related words, and is
subdivided into positive and negative emotion words. The negative emotion category
is further subdivided into anxiety, anger, and sadness categories. Example words for
each category within this hierarchy are presented below.
Word Category           Abbrev    Examples
Affective processes     affect    Happy, cried, abandon
---Positive emotion     posemo    Love, nice, sweet
---Negative emotion     negemo    Hurt, ugly, nasty
------Anxiety           anx       Worried, fearful, nervous
------Anger             anger     Hate, kill, annoyed
------Sadness           sad       Crying, grief, sad
Table 5.7 LIWC affective processes hierarchy
Words that are encountered by the LIWC processing engine are counted as part of all
parent categories, thus an anger-related word would also be counted as a negative
emotion word, and an affective process word.
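LIWC itself is a proprietary tool, but the hierarchical counting scheme just described — a matched word is credited to its own category and to every parent category, with metrics reported as a percentage of total word count — can be sketched with a toy dictionary. The words and categories below are illustrative stand-ins, not LIWC's actual dictionary:

```python
# Toy sketch of LIWC-style counting: each matched word increments its
# category and all ancestor categories; metrics are percentages of the
# total word count. Dictionary contents are illustrative only.
CATEGORY_PARENTS = {
    "anger": "negemo", "negemo": "affect", "posemo": "affect", "affect": None,
}
WORD_CATEGORIES = {"happy": "posemo", "nice": "posemo",
                   "hate": "anger", "ugly": "negemo"}

def liwc_counts(text):
    words = text.lower().split()
    counts = {c: 0 for c in CATEGORY_PARENTS}
    for w in words:
        cat = WORD_CATEGORIES.get(w.strip(".,!?"))
        while cat is not None:          # walk up the category hierarchy
            counts[cat] += 1
            cat = CATEGORY_PARENTS[cat]
    total = len(words)
    return total, {c: 100.0 * n / total for c, n in counts.items()}

total, pct = liwc_counts("I hate rain but I am happy today")
# "hate" counts toward anger, negemo, and affect; "happy" toward posemo
# and affect, so affect = 2 of 8 words = 25%.
```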
Individually, the ratios of positive and negative emotion words to total word count do
not differ significantly between HE and TE conditions, but collectively as affect
words, ratios were higher in the HE condition for all subject pairs.
You Words
This category consists of 20 different second-person pronoun words, such as you,
your, and thou. It is a child of the personal pronouns category, which itself is a child
of the total pronouns category. You-word ratios were higher in the HE condition than
in the TE condition for all subject pairs.
Relativity Words
This category consists of 638 relativity-related words. It is a top-level category with
motion, space, and time categories as children. The hierarchy is presented below with
examples.
Word Category   Abbrev    Examples
Relativity      relative  Area, bend, exit, stop
---Motion       motion    Arrive, car, go
---Space        space     Down, in, thin
---Time         time      End, until, season
Table 5.8 LIWC relativity hierarchy
Ratios of child categories motion, space, and time were not significantly different
between HE and TE conditions, but ratios of the collective relativity-words category
were smaller in the HE condition for all subject pairs.
5.6.4 System Usability
After subjects finished the HE test condition they were asked to rate the EmoChat
application in terms of usability by completing the SUS questionnaire. The SUS
produces output on a scale from 0 to 100, with higher scores considered to indicate
greater usability. Results for the EmoChat application were typically high (N=10),
ranging from 72.5 to 87.5 (mean=76.25).
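For reference, Brooke's SUS is conventionally scored from the ten 1-5 item responses as sketched below: odd-numbered items contribute (score − 1), even-numbered items contribute (5 − score), and the sum is scaled to 0-100. The example responses are illustrative, not the study's data.

```python
def sus_score(responses):
    """Standard SUS scoring: odd items contribute (score - 1),
    even items (5 - score); the sum is scaled to the 0-100 range."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # i is 0-based, so even i = odd item
                for i, r in enumerate(responses))
    return total * 2.5

# A uniformly neutral respondent (all 3s) scores exactly 50.
print(sus_score([3] * 10))  # -> 50.0
```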
5.6.5 Informal Interviews
Subject pairs jointly participated in informal interviews at the end of the experiment.
They were asked to discuss differences between experimental conditions. The
facilitator led the discussion and manually recorded interesting conversation topics.
Notes from the interviews are presented below.
Subject1 / Subject2
Experimental Group I
Relationship: same sex long-time friends
Both subjects indicated a general distrust of the avatar. Subject2, who was not
wearing the headset, told me that he thought Subject1 may have been manually
manipulating the movement of the avatar and affective bars. After briefly paying
attention to the avatar early in the session, both subjects eventually disregarded the avatar
and affective information presented in the application.
The subjects mentioned that the lack of a “partner is entering a message…” indicator
made it more difficult to have a conversation. Subject2 is a particularly fast typist
and felt as though he was dominating the conversation, since he couldn’t see when his
partner was entering text.
Subject1 mentioned that the eye blink avatar behavior was very accurate. He also
said that he did not manually manipulate any of the Emo controls. Subject2 felt
obligated to manipulate some of the avatar and affective meters after seeing the
movements in Subject1’s Emo controls, however he said his manipulation choices
were random and did not really consciously reflect a desire to convey specific
emotional information.
Subject1 mentioned that the location of the avatar and meters drew his focus away
from the chat window, and was part of the reason for his general disregard.
Subject3 / Subject4
Experimental Group I
Relationship: mixed sex romantic partners
Subjects legitimately enjoyed communicating with the headset system. At one point,
Subject4 made a positive comment about Subject3’s appearance which induced a
very large smile on Subject3’s face. This smile was reflected in the avatar. When
Subject4 noticed the avatar with the large smile she seemed very pleased and giggled.
Subject3 mentioned that the frustration meter may not have been entirely accurate,
but that he did not immediately distrust it. He thought that it was possible that the
frustration meter was not necessarily reflecting frustration caused by the
conversation, but instead caused by the recurrence of pain from a hand burn received
earlier in the day.
Subject3 mentioned that the excitement meter seemed to be the most dynamic and
accurate of the affective meters. At one point during the HE conversation, Subject4
was telling Subject3 a joke. During waiting periods between lines of the joke,
Subject3 noticed that excitement levels and frustration levels were increasing, which
he thought was appropriate. When asked which condition the subjects preferred they
strongly indicated the headset condition. Subject4 noted that she was much more
interested in communicating with Subject3 with the avatar and affective meters
present, than not present. She said the experiment was actually fun.
Subject5 / Subject6
Experimental Group II
Relationship: mixed sex short-term friends
Subject5 reported not manually manipulating the EmoControls, even though he was
informed that they were available for use as he saw fit. Subject6 said she wished she
had a headset; she would have enjoyed the communication more and would have
been able to focus more on the conversation and less on manually making the avatar
face and affective meters accurately reflect her emotional state.
Subject5 stated that he felt the avatar and affective meters were reasonably accurate,
and that more than increased accuracy, he wished the presentation was more
aesthetically pleasing. As it stands, the avatar is very simple, and animations between
states are abrupt.
Subject6 purposely tried influencing the affective levels reported by Subject5 by
telling jokes. Both Subject6 and Subject5 noticed that excitement and engagement
levels increased when Subject5 was told an entertaining joke. Subject5’s avatar also
reflected his smile reaction to the jokes.
Both Subject5 and Subject6 agreed that the (TE) condition was less fun than the (HE)
condition.
Subject7 / Subject8
Experimental Group II
Relationship: brothers
Subject7 mentioned that several times during the conversation he switched to manual
control of the frustration meter, because he didn’t think it was returning to base line
quickly enough.
Subject8 noticed that every time he mentioned his new girlfriend, of whom his
brother Subject7 does not entirely approve, Subject7’s frustration meter spiked.
Subject8 mentioned that he felt that manually manipulating the avatar (since he did
not have a headset) caused him to pay less attention to the content of the dialog, and
more attention to making his avatar reflect his current emotional state. He said that
he wished he also had a headset.
The issue of distrusting the avatar was brought up by Subject8 since participants were
permitted to manually manipulate facial expression and affective states. He
mentioned he would have been happier if the partner was notified whenever manual
EmoControls were invoked.
Subject9 / Subject10
Experimental Group I
Relationship: Mixed sex long-time friends
Subject10 mentioned that Subject9 seemed to be quite frustrated, but Subject9 said he
wasn’t very frustrated. Subject10 also noticed increases in excitement and
engagement when she asked Subject9 about a recent party they both attended.
Subject9 said that he was unsure whether or not Subject10 was wearing a headset for
the experiment. Subject10 mentioned that she didn’t think she was supposed to
divulge that information.
Subject10 characterized the HE condition as “funny”. She appreciated the facial
expressions of her partner’s avatar.
5.6.5.1 Trends Identified During Interviews

Some subjects reported a general distrust of the avatar expressions and affective
meters since subjects were permitted to manually manipulate these features without
their partner being aware. In one case, this distrust caused the chat partners to
completely disregard changes in avatar features, but this was isolated. Most headset
users indicated that they did not manually manipulate anything, opting to let the
output from the headset control any changes to avatar states.
Subject pairs in the EmoChat condition consisted of one partner with a headset and
one partner without. The majority of non-headset users indicated that they wished
they had also been using a headset. It was noted several times that having to
manually manipulate avatar features was time-consuming and took away from the
main conversation. During manual manipulation, focus was shifted from the main
chat window to the override control pane of the application, and substantial effort was
required to alter avatar features to reflect present emotional state.
Participants generally preferred using EmoChat over the traditional environment.
Most agreed that EmoChat provided an environment that was interesting and fun to
use. Several participants spontaneously started telling jokes to their headset-wearing
chat partners because they wanted to see how it affected the avatar features. There
seemed to be a genuine interest in the emotional information presented in changes to
avatar facial expression and movement in the affective meters.
Frustration was perceived as inaccurate by several participants. One participant in
particular switched to manual manipulation of the frustration meter because he
thought it was not returning to baseline quickly enough, and presumably did not want
his partner to conclude that he was frustrated. Other participants noted that their
frustration meter was spiking when they did not perceive themselves as being
particularly frustrated. The meditation meter was mentioned only once during chat
sessions and informal interviews. It was used for comedic effect when one
participant maxed out the level and stated that he was “meditating like a Jedi.” On
the whole, participants felt that excitement and engagement meters were substantially
more accurate than frustration and meditation meters.
5.7 Discussion
5.7.1 Summary of Results
This study has introduced a novel instant messaging application called EmoChat that
integrates with the Emotiv Epoc headset to detect and convey brow, eye, and mouth
movements, and basic affective states of excitement, engagement, frustration, and
meditation.
The main goals of the study were to compare EmoChat with a traditional instant
messaging environment in terms of the amount of emotional information contained in
messages, richness of experience, accuracy of emotional transfer, and usability, and
to evaluate the effectiveness of using a brain-computer interface device like the Epoc
to facilitate emotional transfer in an IM environment.
Results suggest that communication with EmoChat contains a significantly higher
percentage of affect-related and you-related words, than communication with a
traditional environment.
In terms of richness, the data suggests that the level of affective intelligence was
higher with EmoChat than with a traditional environment. Subjects rated their chat
partners as significantly better able to convey emotional information with EmoChat.
Subjects also felt that using EmoChat helped them to understand the emotions of their
partners significantly more than with a traditional environment. The data also
suggests that other measures of richness, including involvement, enjoyment, sense of
copresence, and overall satisfaction were marginally higher with EmoChat.
Accuracy of emotional transfer was not determined to be significantly different
between chat environments; however, additional analysis was able to identify
significant correlation between behavior of the avatar and emotion perceived by the
chat partner.
Usability ratings of the application on the SUS scale ranged from 72.5 to 87.5 with a
mean of 76.25. Subjects who used the headset with EmoChat reported slightly higher
usability ratings (72.5 to 87.5, mean=79) than subjects without the headset (72.5 to
75, mean=74).
5.7.2 Consistency with related work
The results of the present study are partially consistent with findings in related
research, as described in the following subsections.
5.7.2.1 Involvement
Involvement in the present study was measured by responses to Q1-2 on the richness
questionnaire. Responses to the questions were marginally, but not significantly
higher in the EmoChat condition. This is consistent with results from Neviarouskaya
in that no significant differences were found; however, the present study does identify
higher mean differences between conditions than the Neviarouskaya study.
Involvement can also be estimated by characters per minute, total word count, or
number of messages sent. The present study found that total word counts in the
EmoChat condition were typically lower than in the traditional environment. This
finding is inconsistent with work by Fabri that reports considerably higher message
length in an expressive avatar condition. A contributing factor to the lower total word
count in the EmoChat condition might be that only one participant in the subject pair
was using a headset. The other participant was required to manually change avatar
and affective features (if desired). Post-experiment interviews confirm that many
subjects felt like the avatar and affect meters were taking their focus away from the
chat window, and might explain the lower total word count.
5.7.2.2 Enjoyment
Enjoyment was measured by Q3 on the richness questionnaire, and was not
significantly different between conditions. This result is consistent with findings by
Fabri and Neviarouskaya. Although results from the questionnaire did not suggest a
significant difference between the enjoyment of test conditions, during informal
interviews, users of the EmoChat application almost unanimously reported having
more “fun” than with the traditional environment.
5.7.2.3 Copresence
The copresence dimension was measured by Q4-6 on the richness questionnaire.
Analysis did not identify any significant difference between conditions, although
mean differences were typically higher in the EmoChat condition. Fabri also notes
higher levels of copresence in his expressive avatar condition, though in his study the
difference was significant. Copresence in the Neviarouskaya study is arguably higher in her
expressive avatar condition, but not significantly.
5.7.2.4 Affective Intelligence
Affective intelligence in the present study was measured by Q7-9 on the richness
questionnaire. This indicated how well the system was able to facilitate emotional
expression. Responses were generally higher in the EmoChat condition than in the
traditional condition, significantly so for Q8 and Q9. This is inconsistent with the Neviarouskaya study,
which was not able to identify significant differences between comparable conditions.
5.7.2.5 Overall Satisfaction
Overall satisfaction was measured by Q10 on the richness questionnaire. Comparison
of responses between conditions did not identify significant differences; however
mean difference was marginally higher in the EmoChat condition. This is consistent
with the Neviarouskaya study in that no significant differences were found, however
mean differences in the present study are larger.
5.7.2.6 Usability
Participants in the EmoChat condition were asked to rate the usability of the system
with Brooke’s SUS. Responses ranged from 72.5 to 87.5 with a mean of 76.25. This
rating is comparable to the usability of the expressive avatar system in the Fabri study
(mean=78.4).
5.7.3 Future Direction
Although the results of the present study are promising in terms of their support for
the hypothesis that EmoChat facilitates emotional communication more readily than
traditional IM environments, the relatively small sample size (N=10) creates
challenges with respect to determining statistical differences between environments.
The present study lays the foundation for related future studies that include a larger
number of participants.
This study only investigated the use of EmoChat when one participant was using an
Epoc headset. The other participant was required to manually manipulate avatar
features and affective state meters (if desired). It became clear during the study that
the experience of participants who did not have the benefit of the automatic facial
expression and affect meter updates permitted by the headset was very different from
their counterparts’. A second study where both participants use headsets during
EmoChat sessions is likely to generate substantially different data.
5.8 Conclusion

This chapter introduced the EmoChat application and compared it with a traditional
instant messaging environment. The results of the comparison have suggested that
EmoChat facilitates communication that contains more affective information, and
provides a “richer” experience. The data has not demonstrated any significant
difference in the accuracy of the emotional information that is transferred;
however, some correlation was found between the frequency of perceived affective
states and incidence of avatar features including facial expressions and average levels
of affect meters.
6 Conclusion
6.1 Limitations
The studies presented in this work generally produced favorable results; however,
there are some limitations that should be addressed. These limitations are presented
below.
Since conversation topics were not assigned or suggested during study 2, participants
were free to discuss anything that came to mind during the session. Perhaps because
of this freedom, in the EmoChat condition many subjects talked about their avatars,
e.g., “I like your avatar,” or, “you look frustrated.” This may have contributed to the
greater frequency of affect and you-related words measured by the LIWC tool in the
EmoChat condition.
The novel questionnaire used to measure the accuracy of emotional transfer (the
ETAQ), was designed specifically for this experiment, and had not been established
as reliable or valid prior to the study. Results from the survey were difficult to
interpret and compare between experimental conditions. Neither experimental
condition seemed to produce consistent ETAQ results.
The participant sample consisted entirely of young adults, aged 22-32, who all had
substantial computer and typing experience, most of whom used instant messaging
applications very frequently. This sample may not be representative of the general
population, so conclusions that are suggested by this study may not generalize beyond
the sample.
The affectiv suite components of frustration and meditation were included as features
in the EmoChat application, although they had not been shown to accurately reflect
self-reported levels in the previous validation study. As noted in section 5.2.1, it was
a conscious decision to include these additional features to allow participants to
decide for themselves whether or not headset-reported levels of frustration and
meditation were of any value. During post-experiment interviews, several subjects
mentioned that frustration levels reported by the headset may not have been accurate.
This is in agreement with the previous validation study in which no strong correlation
was found between headset and self-reported levels of frustration. References to
meditation in chat transcripts and interviews indicate that the feature was generally
disregarded and not considered an important measure during interpersonal
communication. The decision to include these non-validated features of the affectiv
suite in the EmoChat application may have skewed the study results, particularly with
respect to affect-related word frequency in chat transcripts. Some participants
mentioned frustration during chat sessions, e.g., “are you frustrated?” This may have
caused affect-related word frequency to appear artificially high in the EmoChat
condition. Future studies may omit these non-validated affectiv suite features from
the application entirely.
The EmoChat application used headset output from both the Emotiv affectiv and
expressive suites; however, the previous study was only concerned with validating a
subset of the affectiv suite. The affectiv suite inferences were expected to use
complex classification algorithms to transform raw electroencephalogram (EEG) data
into levels of basic affective states. It was decided that the challenges involved with
this transformation warranted testing to determine accuracy. In contrast, it was
assumed that the facial expression recognition capabilities of the expressive suite
were based on existing research in the comparatively mature facial electromyography
(EMG) field, and did not necessitate validation. The assumptions that were made
during the previous validation study design may have been inappropriate, and it is
noted that a more thorough study should also investigate the accuracy of the Emotiv
expressive suite if these detections are to be used in future work.
The chat log analysis performed with the LIWC tool to identify significant
differences in word-type frequency was primarily concerned with differences in
affect-related words between experimental conditions. For the sake of completeness,
and to identify potential word frequency differences beyond the scope of the study, all
72 word categories were tested between conditions. It is understood that applying
such a large number of tests may have produced results that incorrectly identify
significant differences between conditions, when no true difference exists. It should
be expected that significant differences (p ≤ .05) are incorrectly determined to exist in
5 out of every 100 tests performed. Applied to the present study, this means that of
72 tests performed, 3.6 of those tests are expected to report differences that are falsely
identified as significant. The present study notes that 4 word categories were found
to be significantly different between conditions, but the large number of tests
performed merits careful interpretation of these results. The Bonferroni correction
could have been applied in this situation as a safeguard. This would have
dictated that the standard significance level of (p=.05) be divided by the number of
tests performed (72) to produce a corrected significance level of (p=.000694). Due to
the small initial sample size, testing with this corrected significance level would be
likely to identify false-negatives, and may have failed to detect real significant
differences. Ideally, the study should have tested only in-scope hypotheses, i.e.,
differences in affect-related words only. It might be prudent, however, for any future
related studies that use this methodology to also test for differences in you and
relativity-related word frequency, and total word count between experimental
conditions. If significant differences are also found in future studies, then it becomes
more likely that these differences actually exist and are not appearing purely by
chance. It may also be informative to test for frequency differences in the child word-
categories of positive and negative emotion in addition to the parent affect-related
category.
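The multiple-comparisons arithmetic above can be checked directly:

```python
# Expected false positives at alpha = .05 across 72 category tests, and
# the Bonferroni-corrected per-test threshold discussed above.
n_tests, alpha = 72, 0.05

expected_false_positives = n_tests * alpha   # = 3.6
bonferroni_alpha = alpha / n_tests           # ~ .000694

print(round(expected_false_positives, 1), round(bonferroni_alpha, 6))
```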
6.2 Summary of Contributions
This thesis work has presented two related studies. The overall contributions of each
study are discussed below.
The first study investigated the accuracy of the Emotiv Epoc headset in terms of its
ability to correctly identify levels of excitement, engagement, and frustration. Results
from the study suggest that the Epoc is reasonably capable of inferring levels of
engagement, and to a lesser degree, excitement. This information should be
applicable to any future research that involves the use of the Epoc headset affectiv
suite. Additionally, the techniques used during study 1 may be applicable to
validation studies of alternative systems that infer affective states from physiologic
metrics. Future research may wish to adapt the methods used in the present study to
elicit differing affective responses through controlled variability of game play
difficulty.
The second study introduced the EmoChat application, an instant messaging
environment that integrates with the Epoc headset to convey facial expression
information, and levels of basic affective states during IM communication. EmoChat
was compared with a traditional instant messaging environment, and was found to be
more capable of communicating emotional information. The findings from this study
add to existing bodies of research concerning affective computing, instant messaging,
and brain-computer interfacing. The methods used here may be applicable to future
studies that seek to evaluate the “affective intelligence” capabilities of an instant
messaging environment, regardless of how emotional information may be input to the
system. Researchers are free to adapt the experimental framework presented in this
study, including procedures followed, measures used, and analyses performed. In a
more abstract sense, another contribution of this work is that it establishes the
potential utility of commercial BCI devices in interaction design. This work may
generate additional interest within the research community to assess how available
BCI-based tools can be applied to the field of human-computer interaction, outside of
the traditional role of assistive technology.
6.3 EmoChat Design Considerations

Although EmoChat was generally well received by users, modifications to the
application may improve the overall experience of messaging. The majority of users
did not feel the need to manually override any of the EmoChat controls (such as
locking a particular facial feature, or manually manipulating the affect meters), but
the possibility of doing so without disclosure of manual manipulation to the partner
led some participants to question whether or not they were being deceived, i.e., “is
my partner really smiling, or has he locked that control?” Future iterations of
EmoChat should either inform the partner when override controls are invoked, or
prevent overriding the controls altogether.
The general appearance of the avatar is very basic. One participant mentioned that he
would have had a more enjoyable experience if the aesthetic quality of the avatar had
been improved. Future versions of EmoChat should present more realistic avatars
that perhaps smoothly animate between facial expressions, instead of the current
abrupt transitions.
Avatar placement in the current version of EmoChat is off to the side of the main chat
window. One participant mentioned that having to divert his gaze from the chat window
to see changes in the avatar made doing so less appealing. Subsequent revisions of
the application may investigate placement of the avatar within the main chat window,
or slightly above, to make monitoring changes easier for the user.
6.4 EmoChat Compared with Existing Methods of Emotional Instant Messaging
Existing methods of conveying emotion in instant messaging environments were
reviewed in chapter 3. This section provides a comparison between EmoChat and
applications with similar input and output techniques.
6.4.1 Input Technique
EmoChat uses a form of automated facial expression recognition that is thought to
rely on interpretation by the Epoc headset of electromyographic signals produced by
facial muscle movement. These electrical signals are detected by the Epoc and used
to infer changes in states of brow, mouth, and eye facial features. Existing facial
expression recognition algorithms typically rely on visual input from a camera or
video stream and require that the user’s face remain situated in front of the camera
with no more than a minor angle of incidence. This means that if a user looks down
at the keyboard to type a response, or looks away from the camera to attend to
something else, the system may fail. By contrast, the Epoc headset transmits data
wirelessly to a USB dongle connected to the computer running EmoChat at ranges up
to 15 feet. This provides the user with much greater freedom of movement, at the
expense of cost, setup time, and comfort associated with using an EEG headset.
In addition to facial expression recognition, the Epoc is thought to use
electroencephalographic signals to infer levels of basic affective states including
excitement, engagement, frustration, and meditation. To date, no other instant
messaging application has used EEG input in this manner. This may potentially
provide emotional information that is not associated with overt sentic modulation, and
may be capable of detecting very subtle emotional changes. This is in contrast to
most existing research that uses only overt symptoms of emotional experience to infer
affect, e.g., solely facial expression, haptic techniques, and explicit text or phrase
matching. The addition of covert symptoms of emotional experience in the EmoChat
system permits the communication of emotional information that might be considered
more intimate.
6.4.2 Output Technique
EmoChat presents users with a very simple animated avatar that might be perceived
as far less realistic than some expressive avatars used in other instant message
applications. Additionally, the technology used in EmoChat does not permit
simultaneous automatic changes to both mouth and brow features. This may prevent
the avatar from expressing certain emotions that rely on simultaneous feature
changes, such as fear or sadness. However, the application does provide override
controls to prevent facial features from changing based on headset input, and allows
users to manually select a preferred feature. In this way, a user could manually
override the mouth feature to force a persistent smile while allowing input from the
headset to modulate brow movement.
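The override mechanism described above can be sketched as a small state model: each facial feature may be pinned to a manually chosen value, in which case headset-inferred updates for that feature are ignored. This is a minimal illustration, not the actual EmoChat implementation; the class, feature names, and state labels are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class AvatarFace:
    """Hypothetical model of EmoChat-style feature overrides: each feature
    (brow, mouth, eyes) can be locked to a fixed value, in which case
    headset-driven updates for that feature are ignored."""
    states: Dict[str, str] = field(
        default_factory=lambda: {"brow": "neutral", "mouth": "neutral", "eyes": "open"})
    locks: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"brow": None, "mouth": None, "eyes": None})

    def lock(self, feature: str, value: str) -> None:
        """Pin a feature to a manually selected value."""
        self.locks[feature] = value
        self.states[feature] = value

    def unlock(self, feature: str) -> None:
        """Resume headset control of the feature."""
        self.locks[feature] = None

    def apply_headset_update(self, feature: str, value: str) -> None:
        """Apply a headset-inferred state only if the feature is not locked."""
        if self.locks[feature] is None:
            self.states[feature] = value

face = AvatarFace()
face.lock("mouth", "smile")                    # persistent smile, as in the example
face.apply_headset_update("mouth", "neutral")  # ignored: mouth is locked
face.apply_headset_update("brow", "raised")    # applied: brow is unlocked
print(face.states)  # → {'brow': 'raised', 'mouth': 'smile', 'eyes': 'open'}
```

A design like this also makes the deception concern from section 6.3 easy to address: the `locks` dictionary records exactly which features are manually pinned, so that information could be disclosed to the chat partner.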
This implementation of an animated avatar is in contrast with techniques presented in
section 3.3.2. While existing implementations modulate multiple facial features to
convey a specific emotion, e.g., brow lowered, lips pursed, to indicate anger,
EmoChat takes a more granular approach. Users are free to change facial features
independent of one another. The facial movement in the EmoChat avatar is not
determined by which emotion is inferred or manually selected, but instead attempts to
directly mimic the expression of the user. The ability to combine multiple,
independent facial features lets the EmoChat avatar display a larger number of
distinct expressions than existing methods. By manipulating only brow and mouth
movements, for example, EmoChat can display up to 25 distinct facial expressions.
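The figure of 25 expressions follows from simple combinatorics: independently combinable features multiply, so five brow states crossed with five mouth states yield 5 × 5 = 25 combinations. The specific state labels below are assumptions for illustration; only the count of five states per feature is implied by the thesis's figure.

```python
from itertools import product

# Hypothetical state labels; the thesis reports 25 brow-mouth combinations,
# which is consistent with five independent states per feature.
BROW = ["neutral", "raised", "furrowed", "half-raised", "half-furrowed"]
MOUTH = ["neutral", "smile", "clench", "smirk-left", "smirk-right"]

# Independent features combine multiplicatively.
expressions = list(product(BROW, MOUTH))
print(len(expressions))  # → 25
```

Adding a third independent feature (e.g., eye state) would multiply the count again, which is why this granular approach scales past emotion-template systems that hard-code one whole-face configuration per emotion.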
A drawback to the avatar implementation in EmoChat is that it may not be able to
faithfully reproduce some facial expressions present in existing systems, including
disgust, fear, or surprise, because the individual component images associated with
changes in brow and mouth state are limited.
6.5 Future Directions

Beyond the future directions previously described in sections 4.7.2 and 5.7.3, the
present work may provide impetus for additional related studies.
The existing EmoChat application is designed for a maximum of two users. It may be
interesting to study how similar emotion input and output techniques translate to a
larger group chat environment, and how the group dynamic is changed by adding this
type of emotional information channel.
Although EmoChat was well received, and users seemed genuinely interested in the
emotional information it was providing, the real-world feasibility of such a system
has not been established. A longitudinal study might suggest how, when, and why a
system like this might be used, and by whom.
Presently, the EmoChat application does not make inferences as to where a particular
user’s experience of emotion lies in arousal-valence space. It may be possible to use
data recorded by the headset to classify emotional experience into some set of
discrete categories, e.g., positive/negative valence, or high/low arousal.
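Such a classification could be as simple as thresholding the two axes of Russell's circumplex model. The sketch below is a hypothetical illustration of the proposed direction, not part of EmoChat; it assumes valence and arousal estimates have already been normalized to [-1, 1].

```python
def classify_affect(valence: float, arousal: float) -> str:
    """Map a point in arousal-valence space (each axis normalized to
    [-1, 1]) to one of four discrete quadrants, following Russell's
    circumplex model of affect."""
    v = "positive" if valence >= 0 else "negative"
    a = "high" if arousal >= 0 else "low"
    return f"{v} valence / {a} arousal"

# Excited-type states sit in the positive-valence, high-arousal quadrant;
# bored-type states in the negative-valence, low-arousal quadrant.
print(classify_affect(0.7, 0.6))    # → positive valence / high arousal
print(classify_affect(-0.5, -0.4))  # → negative valence / low arousal
```

A real system would need a principled mapping from headset-derived measures (e.g., excitement, frustration) onto these axes, which is itself an open research question.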
6.6 Closing
The work presented in this thesis suggests that computer-mediated communication
may benefit from the integration of brain-computer interface devices like the Emotiv
Epoc for facilitating communication that contains more emotional information than
traditional environments. These results may generalize to the larger affective
computing domain if states like engagement and excitement can be accurately
inferred. One might envision using this technology in a virtual environment setting
like Second Life, to endow an avatar with a greater degree of humanity, or in a
distance-learning scenario, to monitor the interest level of students. It is up to the
scholars and proponents of human-centered computing and its related disciplines to
investigate the best uses for this technology, perhaps, in applications that have not yet
been conceived. The author believes that future iterations of BCI technology will
incorporate greatly increased capabilities, opening up a vast number of new
possibilities for human-machine interaction, and he is proud to present the modest
findings of this thesis to the community.
Appendix A: Validation Study Demographics Questionnaire

Name: _________________________________ Age: ________ Gender: ________

1. How often do you play computer or console games?
   Never | Rarely | Occasionally | Frequently | Very Frequently

2. How familiar are you with the game of Tetris?
   Not at All Familiar | Somewhat Familiar | Moderately Familiar | Pretty Familiar | Very Familiar

3. How often do you play Tetris, or a similar game?
   Never | Rarely | Occasionally | Frequently | Very Frequently

4. What is your typical skill level when it comes to playing computer or console games?
   Far Below Average | Below Average | Average | Above Average | Far Above Average

5. What are some of your favorite computer or console games, if any?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________
Appendix B: Validation Study Experiment Questionnaire

Round [n]

Scale: 1 = Not at All, 2 = Slightly, 3 = Moderately, 4 = Very, 5 = Extremely

How Difficult was this round?                               1 2 3 4 5
How Fun was this round?                                     1 2 3 4 5
On average, how Engaged did you feel during this round?     1 2 3 4 5
On average, how Excited did you feel during this round?     1 2 3 4 5
On average, how Frustrated did you feel during this round?  1 2 3 4 5

(repeated for each round of play)
Appendix C: Validation Study Post-Experiment Questionnaire

1. What types of events caused you to get excited during game play?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

2. What makes a game like Tetris engaging?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

3. What happened during the game that made you feel frustrated?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

4. What makes Tetris fun?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

5. Do you have any additional comments?
   _____________________________________________________________________
   _____________________________________________________________________
Appendix D: EmoChat Demographics Questionnaire

Name: _________________________________ Age: ________ Gender: ________

1. Rate your level of computer experience.
   None | Basic | Intermediate | Expert

2. Rate your keyboard typing skills.
   None | Basic | Intermediate | Expert

3. Please indicate your highest level of education completed.
   None | High School | Some College | Bachelor’s Degree | Master’s Degree | PhD or Higher

4. How often do you use Instant Message tools, such as MSN, AIM, ICQ, etc.?
   Never | Occasionally | Regularly | Every Day
Appendix E: EmoChat Emotional Transfer Questionnaire

During this chat session, how often did [EX role: you | PX role: your chat partner] experience the following emotions:

Scale: 1 = Very Seldom or Not at All, 2 = Seldom, 3 = Sometimes, 4 = Often, 5 = Very Often
Afraid 1 2 3 4 5
Angry 1 2 3 4 5
Annoyed 1 2 3 4 5
Astonished 1 2 3 4 5
Bored 1 2 3 4 5
Content 1 2 3 4 5
Excited 1 2 3 4 5
Frustrated 1 2 3 4 5
Happy 1 2 3 4 5
Interested 1 2 3 4 5
Miserable 1 2 3 4 5
Pleased 1 2 3 4 5
Relaxed 1 2 3 4 5
Sad 1 2 3 4 5
Satisfied 1 2 3 4 5
Tired 1 2 3 4 5
Appendix F: EmoChat Richness Questionnaire

Scale: Strongly Disagree | Disagree | Neither | Agree | Strongly Agree

1. I felt it was important to respond after each of my partner’s statements.
2. I was interested in my partner’s responses.
3. I enjoyed communicating with my chat partner.
4. I had the sensation that my partner was aware of me.
5. I felt as though I was sharing the same space with my partner.
6. I had the sensation that my partner was responding to me.
7. I was able to successfully convey my feelings with this application.
8. My partner was able to successfully convey his or her feelings with this application.
9. The emotional behavior of the avatars was appropriate.
10. I understood the emotions of my partner.
11. I am satisfied with the experience of communicating with this system.
Bibliography
Andrei, S. (2010). Toward Natural Selection in Virtual Reality. IEEE Computer Graphics and Applications, 30, 93-96.
Brooke, J. (1996). A "Quick and Dirty" Usability Scale. In Jordan, McLelland, Thomas & Weerdmeester (Eds.), Usability Evaluation in Industry (pp. 189-194): Taylor & Francis.
Campbell, A., Choudhury, T., Hu, S., Lu, H., Mukerjee, M. K., Rabbi, M., et al. (2010). NeuroPhone: brain-mobile phone interface using a wireless EEG headset. Paper presented at the Proceedings of the second ACM SIGCOMM workshop on Networking, systems, and applications on mobile handhelds.
Chanel, G., Rebetez, C., Betrancourt, M., & Pun, T. (2008). Boredom, engagement and anxiety as indicators for adaptation to difficulty in games. Paper presented at the Proceedings of the 12th international conference on Entertainment and media in the ubiquitous era.
Ekman, P., & Oster, H. (1979). Facial Expressions of Emotion. Annual Review of Psychology, 30(1), 527-554.
Fabri, M. (2006). Emotionally Expressive Avatars for Collaborative Virtual Environments. Leeds Metropolitan University.
Fabri, M., & Moore, D. (2005, October). Is Empathy the Key? Effective Communication via Instant Messaging. Paper presented at the Proceedings of the 11th EATA International Conference on Networking Entities, St. Poelten, Austria.
James, W. (1884). What is an Emotion? Mind, 9(34), 188-205.
Kaliouby, R. E., & Robinson, P. (2004). FAIM: integrating automated facial affect analysis in instant messaging. Paper presented at the Proceedings of the 9th international conference on Intelligent user interfaces.
Kayan, S., Fussell, S. R., & Setlock, L. D. (2006). Cultural differences in the use of instant messaging in Asia and North America. Paper presented at the Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work.
Khalili, Z., & Moradi, M. H. (2009). Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG. Paper presented at the Proceedings of the 2009 international joint conference on Neural Networks.
Kiesler, S., Zubrow, D., Moses, A. M., & Geller, V. (1985). Affect in computer-mediated communication: an experiment in synchronous terminal-to-terminal discussion. Hum.-Comput. Interact., 1(1), 77-104.
Lang, P. J. (1995). The Emotion Probe: Studies of Motivation and Attention. American Psychologist, 50(5), 372-385.
Lo, S.-K. (2008). The Nonverbal Communication Functions of Emoticons in Computer-Mediated Communication. CyberPsychology & Behavior, 11(5), 595-597.
Neviarouskaya, A. (2008). AffectIM: An Avatar-based Instant Messaging System Employing Rule-based Sensing. Unpublished MS, University of Tokyo, Tokyo.
Neviarouskaya, A., Prendinger, H., & Ishizuka, M. (2007). Recognition of Affect Conveyed by Text Messaging in Online Communication. In Online Communities and Social Computing (Vol. 4564, pp. 141-150). Heidelberg: Springer Berlin.
Oakley, I., & O’Modhrain, S. (2003). Contact IM: Exploring asynchronous touch over distance. Paper presented at CSCW '02.
Pantic, M., Sebe, N., Cohn, J. F., & Huang, T. (2005). Affective multimodal human-computer interaction. Paper presented at the Proceedings of the 13th annual ACM international conference on Multimedia.
Pennebaker, J. W., Booth, R., & Francis, M. (2007). Linguistic Inquiry and Word Count: LIWC2007.
Picard, R. W. (1997). Affective computing: MIT Press.
Reisenzein, R. (2006). Arnold's theory of emotion in historical perspective. Cognition and Emotion, 20(7), 32.
Rivera, K., Cooke, N. J., & Bauhs, J. A. (1996). The effects of emotional icons on remote communication. Paper presented at the Conference companion on Human factors in computing systems: common ground.
Rovers, A. F., & Essen, H. A. v. (2004). HIM: a framework for haptic instant messaging. Paper presented at the CHI '04 extended abstracts on Human factors in computing systems.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161-1178.
Sanchez, J. A., Hernandez, N. P., Penagos, J. C., & Ostrovskaya, Y. (2006). Conveying mood and emotion in instant messaging by using a two-dimensional model for affective states. Paper presented at the Proceedings of VII Brazilian symposium on Human factors in computing systems.
Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695-729.
Sherstyuk, A., Vincent, D., & Treskunov, A. (2009). Towards Virtual Reality games. Paper presented at the Proceedings of the 8th International Conference on Virtual Reality Continuum and its Applications in Industry.
Sloten, J. V., Verdonck, P., Nyssen, M., & Haueisen, J. (2008, November 23-27). On modeling user's EEG response during a human-computer interaction: a mirror neuron system-based approach. Paper presented at the 4th European Conference of the International Federation for Medical and Biological Engineering, Antwerp, Belgium.
Tetteroo, D. (2008). Communicating emotions in instant messaging, an overview. University of Twente, Enschede.
Thompson, E. R. (2007). Development and Validation of an Internationally Reliable Short-Form of the Positive and Negative Affect Schedule (PANAS). Journal of Cross-Cultural Psychology, 38(2), 227-242.
Wang, H., Prendinger, H., & Igarashi, T. (2004). Communicating emotions in online chat using physiological sensors and animated text. Paper presented at the CHI '04 extended abstracts on Human factors in computing systems.
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063-1070.
Yahoo. (2010). Emoticon survey results. Retrieved October 4, 2010, from http://www.ymessengerblog.com/blog/2007/07/10/emoticon-survey-results/
Yeo, Z. (2008). Emotional instant messaging with KIM. Paper presented at the CHI '08 extended abstracts on Human factors in computing systems.