EmoChat: Emotional Instant Messaging with the Epoc Headset
ABSTRACT

Title of Document: EMOCHAT: EMOTIONAL INSTANT MESSAGING WITH THE EPOC HEADSET

Franklin Pierce Wright, Master of Science, 2010

Directed By: Dr. Ravi Kuber, Asst. Professor of Human-Centered Computing, Information Systems
Interpersonal communication benefits greatly from the emotional information
encoded by facial expression, body language, and tone of voice; however, this
information is noticeably missing from typical instant message communication. This
work investigates how instant message communication can be made richer by
including emotional information provided by the Epoc headset. First, a study
establishes that the Epoc headset is capable of inferring some measures of affect with
reasonable accuracy. Then, the novel EmoChat application is introduced which uses
the Epoc headset to convey facial expression and levels of basic affective states
during instant messaging sessions. A study compares the emotionality of
communication between EmoChat and a traditional instant messaging environment.
Results suggest that EmoChat facilitates the communication of emotional information
more readily than a traditional instant messaging environment.
EMOCHAT: EMOTIONAL INSTANT MESSAGING WITH THE EPOC HEADSET
By
Franklin Pierce Wright
Thesis submitted to the Faculty of the Graduate School of the University of Maryland, Baltimore County, in partial fulfillment
of the requirements for the degree of Master of Science
2010
© Copyright by Franklin Pierce Wright
2010
Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
1 Introduction
  1.1 The Importance of Emotion
  1.2 Instant Messaging
  1.3 The Emotional Problem with Instant Messaging
  1.4 Purpose of this Work
  1.5 Structure of this Document
2 Emotion
  2.1 What is Emotion?
    2.1.1 The Jamesian Perspective
    2.1.2 The Cognitive-Appraisal Approach
    2.1.3 Component-Process Theory
  2.2 Emotion and Related Affective States
    2.2.1 Primary versus Complex
  2.3 Expressing Emotion
    2.3.1 Sentic Modulation
  2.4 Measuring Emotion
    2.4.1 Self-Report Methods
    2.4.2 Concurrent Expression Methods
  2.5 Conclusion
3 Existing Techniques for Conveying Emotion during Instant Messaging
  3.1 Introduction
  3.2 Input Techniques
    3.2.1 Textual Cues
    3.2.2 Automated Expression Recognition
    3.2.3 Physiologic Data
    3.2.4 Manual Selection
  3.3 Output Techniques
    3.3.1 Emoticons
    3.3.2 Expressive Avatars
    3.3.3 Haptic Devices
    3.3.4 Kinetic Typography
  3.4 Conclusion
4 Study 1: Validating the Emotiv Epoc Headset
  4.1 Introduction
  4.2 Overview of the Epoc Headset
    4.2.1 Expressiv Suite
    4.2.2 Affectiv Suite
    4.2.3 Cognitiv Suite
  4.3 The Need for Validation
  4.4 Experimental Design
    4.4.1 TetrisClone System Development
    4.4.2 Measures
  4.5 Experimental Procedures
  4.6 Results and Analysis
    4.6.1 Headset-Reported versus Self-Reported Levels of Affect
    4.6.2 Subjective Causes of Affect during Gameplay
  4.7 Discussion
    4.7.1 Consistency in the Present Study
    4.7.2 Future Direction
  4.8 Conclusion
5 Study 2: Emotional Instant Messaging with EmoChat
  5.1 Introduction
  5.2 EmoChat System Development
    5.2.1 Overview
    5.2.2 Traditional Environment
    5.2.3 Application Architecture
  5.3 Experimental Design
    5.3.1 Measures
  5.4 Experimental Setup
  5.5 Experimental Procedures
  5.6 Results and Analysis
    5.6.1 Emotional Transfer Accuracy
    5.6.2 Richness of Experience
    5.6.3 Chat Transcripts
    5.6.4 System Usability
    5.6.5 Informal Interviews
  5.7 Discussion
    5.7.1 Summary of Results
    5.7.2 Consistency with Related Work
    5.7.3 Future Direction
  5.8 Conclusion
6 Conclusion
  6.1 Limitations
  6.2 Summary of Contributions
  6.3 EmoChat Design Considerations
  6.4 EmoChat Compared with Existing Methods of Emotional Instant Messaging
    6.4.1 Input Technique
    6.4.2 Output Technique
  6.5 Future Direction
  6.6 Closing
Appendix A: Validation Study Demographics Questionnaire
Appendix B: Validation Study Experiment Questionnaire
Appendix C: Validation Study Post-Experiment Questionnaire
Appendix D: EmoChat Demographics Questionnaire
Appendix E: EmoChat Emotional Transfer Questionnaire
Appendix F: EmoChat Richness Questionnaire
Bibliography
List of Tables

Table 3.1 Common emoticons in Western and Eastern cultures
Table 3.2 Example of hapticons
Table 4.1 Facial expression features measured by the Epoc headset
Table 4.2 Headset and self-reported levels of affect per subject, per trial
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
Table 4.7 Major themes identified in subjective affect survey results
Table 5.1 Facial movements and affective information used by EmoChat
Table 5.2 EmoChat experimental groups
Table 5.3 ETAQ scores for each subject-pair and both experimental conditions
Table 5.4 Spearman correlation matrix between avatar features and perceived frequency of emotional states (N=5)
Table 5.5 Wilcoxon's signed rank test for significant difference between score means (N=10)
Table 5.6 Linguistic categories with significant differences between experimental conditions
Table 5.7 LIWC affective processes hierarchy
Table 5.8 LIWC relativity hierarchy
List of Figures

Figure 3.1 Examples of expressive avatars
Figure 4.1 Initialization screen for the TetrisClone application
Figure 4.2 TetrisClone application during trials
Figure 4.3 Example output from the TetrisClone application
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement
Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement
Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
Figure 5.1 The EmoChat client application
Figure 5.2 EmoChat server application
Figure 5.3 Mean scores from the richness questionnaire, questions 1-5
Figure 5.4 Mean scores from the richness questionnaire, questions 5-10
Figure 5.5 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q1-5 (N=5)
Figure 5.6 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q5-10 (N=5)
1 Introduction
This chapter introduces the importance of emotion in interpersonal communication,
and presents some of the challenges with including emotion in instant messages. The
purpose of this thesis is then stated, followed by an overview of the document
structure.
1.1 The Importance of Emotion
Consider the following statement:
“The Yankees won again.”
Does the person who makes this remark intend to be perceived as pleased or
disappointed? Enthusiastic or resentful? The remark is purposely emotionally
ambiguous to illustrate just how powerful the inclusion or absence of emotion can be.
If the same remark were said with a big grin on the face, or with the sound of
excitement in the voice, we would certainly understand that this person was quite
pleased that his team was victorious.
If the speaker displayed slumped shoulders and a head tilted downward we would
assume that he was certainly less than jubilant.
It is clear that emotions play a very important role in interpersonal communication,
and without them, communication would be significantly less efficient. A statement
that contains emotion implies context, without the necessity of explicit clarification.
In some cases, how something is said may be just as important as what is said.
1.2 Instant Messaging
Real-time text-based communication is still on the rise. Instant messaging, in one
form or another, has infiltrated nearly all aspects of our digital lives, and shows no
sign of retreat. From work, to school, to play, it’s becoming more and more difficult
to shield ourselves from that popup, or that minimized window blinking in the task
bar, or that characteristic sound our phones make when somebody wants to chat with
us. We are stuck with this mode of communication for the foreseeable future.
1.3 The Emotional Problem with Instant Messaging
As convenient as it is, this text-based communication has inherent difficulties
conveying emotional information. It generally lacks intonation and the subtle non-
verbal cues that make face-to-face communication the rich medium that it is. Facial
expression, posture, and tone of voice are among the highest bandwidth vehicles of
emotional information transfer (Pantic, Sebe, Cohn, & Huang, 2005), but are
noticeably absent from typical text-based communication. Kiesler and colleagues
describe computer-mediated communication (CMC) as "observably poor" at
facilitating the exchange of affective information, and note that CMC
participants perceive the interaction as more impersonal, resulting in less favorable
evaluations of partners (Kiesler, Zubrow, Moses, & Geller, 1985).
The humble emoticon has done its best to remedy the situation by allowing text
statements to be qualified with the ASCII equivalent of a smile or frown. While this
successfully aids in conveying positive and negative affect (Rivera, Cooke, & Bauhs,
1996), emoticons may have trouble communicating more subtle emotions. Other
solutions that have been proposed to address this problem are reviewed in chapter 3.
Each solution is successful in its own right, and may be applicable in different
situations. This work examines a novel method for conveying emotion in CMC,
which is offered to the community as another potential solution.
1.4 Purpose of this Work
The main goal of this body of work is to investigate how instant message
communication is enriched by augmenting messages with emotional content, and
whether this can be achieved through the use of brain-computer interface (BCI)
technology. The Emotiv Epoc headset is a relatively new BCI peripheral intended
for use by consumers and is marketed as being capable of inferring levels of basic
affective states including excitement, engagement, frustration, and meditation. A
study presented in this work attempts to validate those claims by comparing data
reported by the headset with self-reported measures of affect during game play at
varied difficulty levels. The novel EmoChat application is then introduced, which
integrates the Epoc headset into an instant messaging environment to control the
facial expressions of a basic animated avatar, and to report levels of basic affective
states. A second study investigates differences between communication with
EmoChat and a traditional instant messaging environment. It is posited that the
EmoChat application, when integrated with the Epoc headset, facilitates
communication that contains more emotional information, that can be described as
richer, and that conveys emotional information more accurately than traditional
IM environments.
In the end, this work intends to provide, first, a starting point for other
researchers interested in investigating applications that implement the Epoc headset,
and second, results which may support the decision to apply the Epoc in computer-
mediated communication settings.
1.5 Structure of this Document
The remaining chapters of this work are structured as follows:
Chapter 2 provides an overview of emotion, including historical perspectives, and
how emotions are related to affective computing. Chapter 3 reviews existing
techniques for conveying emotion in instant messaging environments. Chapter 4
details a study to determine the accuracy of the Epoc headset. Chapter 5 introduces
EmoChat, a novel instant messaging environment for exchanging emotional
information. A study compares EmoChat with a traditional instant messaging
environment. Chapter 6 summarizes the contributions this work makes, and
compares the techniques for conveying emotion used in EmoChat with techniques
described in the literature.
2 Emotion
This chapter describes some of the historical perspectives on emotion, and introduces
its role in affective computing. It intends to provide a background helpful for the
study of emotional instant messaging.
2.1 What is Emotion?
The problem of defining what constitutes human emotion has plagued psychologists
and philosophers for centuries, and there is still no generally accepted description
among researchers or laypersons. A complicating factor in attempting to define
emotion is our incomplete understanding of the complexities of the human brain.
Some theorists have argued that our perceptions enter the limbic system of the brain
and trigger immediate action without consultation with the more developed cortex.
Others argue that the cortex plays a very important role in assessing how we relate to
any given emotionally relevant situation, and subsequently provides guidance about
how to feel.
2.1.1 The Jamesian Perspective
In 1884, psychologist William James hypothesized that, for any emotion associated
with physiological changes, those same changes must be expressed before the
emotion itself is experienced (James,
1884). In essence, James believed that humans feel afraid because we run from a
bear, and not that we run from a bear because we feel afraid. James emphasized the
physical aspect of emotional experience causation over the cognitive aspect. This
physical action before the subjective experience of an emotion has subsequently been
labeled a "Jamesian" response. For historical accuracy, note that at about the same
time James was developing his theory, Carl Lange independently developed a
very similar theory. Collectively, their school of thought is referred to as the
James-Lange theory (Picard, 1997).
2.1.2 The Cognitive-Appraisal Approach
In contrast to James’ physical theory of emotion, a string of psychologists later
developed several different cognitive-based theories. Notable among them is the
cognitive-appraisal theory, developed by Magda Arnold, and later extended by
Richard Lazarus, which holds that emotional experience starts not with a physical
response, but with a cognitive interpretation (appraisal) of an emotionally-inspiring
situation (Reisenzein, 2006). Continuing with the bear example, Arnold and
Lazarus would have us believe that we hold in our minds certain evaluations of the
bear-object (it is dangerous and bad for us), we see that the bear is running toward us
(is, or will soon be present), we anticipate trouble if the bear reaches us (poor coping
potential), and so we experience fear and run away.
Certainly, there are valid examples of situations that seem to trigger Jamesian
responses. Consider the fear-based startle response when we catch a large object
quickly approaching from the periphery. It is natural that we sometimes react to
startling stimuli before the experience of the fear emotion, jumping out of the way
before we even consciously know what is happening to us. Conversely, consider joy-
based pride after a significant accomplishment. It seems as though pride could only
be elicited after a cognitive appraisal determines that (a) the accomplishment-object is
positive, (b) it has been achieved despite numerous challenges, and (c) it will not be
stripped away. If examples can be found that validate both the James-Lange
approach and the cognitive-appraisal approach, is one theory more correct than the
other?
2.1.3 Component-Process Theory
It is now suggested that emotional experience may result from very complex
interaction between the limbic system and cortex of the brain, and that emotions can
be described as having both physical and cognitive aspects (Picard, 1997).
Encompassing this point of view, that a comprehensive theory of emotion should
consider both cognitive and physical aspects, is the component-process model
supported by Klaus Scherer (Scherer, 2005). This model describes emotion as
consisting of synchronized changes in several neurologically based subsystems,
including cognitive (appraisal), neurophysiologic (bodily symptoms), motivational
(action tendencies), motor expression (facial and vocal expression), and subjective
feeling (emotional experience) components. Note that Scherer regards “subjective
feeling” as a single element among many in what constitutes an “emotion.”
2.2 Emotion and Related Affective States
2.2.1 Primary versus Complex
Some classes of emotion are more basic than others. These are the emotions
that seem most Jamesian in nature: hard-coded, almost reflex-like responses that,
from an evolutionary perspective, contribute the most to our survival. Fear and anger
are among these basic, primary, emotions. Picard labels these types of emotions as
“fast primary,” and suggests that they originate in the limbic system. This is in
contrast to the “slow secondary,” or cognitive-based emotions that require time for
introspection and appraisal, and therefore require some cortical processing. Scherer
calls this slow type of emotion "utilitarian," in contrast with the fast type, which he
terms "aesthetic." An important distinction should be made between emotions and other
related affective states such as moods, preferences, attitudes, and sentiments. A
distinguishing factor of emotions is their comparatively short duration relative to
these other affective states.
2.3 Expressing Emotion
2.3.1 Sentic Modulation
If emotion has both physical and cognitive aspects, it seems natural that some
emotions can be experienced without much, if any, outward expression. Interpersonal
communication may benefit from those overt expressions of emotion that can be
perceived by others. Picard discusses what she calls "sentic modulation": overt or
covert changes in physiological features that, although they do not constitute emotion
on their own, act as symptoms of emotional experience (Picard, 1997). The easiest
of these sentic responses to recognize are arguably facial expression, tone of voice,
and posture or body language. Of these three, research suggests that facial
expression is the highest-bandwidth channel for conveying emotional
state (Pantic, et al., 2005). There are other, more covert, symptoms of emotional
experience, including heart rate, blood pressure, skin conductance, pupil dilation,
perspiration, respiration rate, and temperature (Picard, 1997). Recent research has
also demonstrated that some degree of emotionality may be inferred from
neurologic response as measured by electroencephalogram (Khalili & Moradi, 2009;
Sloten, Verdonck, Nyssen, & Haueisen, 2008). Facial expression deserves additional
attention, being one of the most widely studied forms of sentic modulation. Ekman
and others have identified six basic emotion/facial expression combinations that
appear to be universal across cultures: fear, anger, happiness, sadness,
disgust, and surprise (Ekman & Oster, 1979). Because these facial
expressions are so widely understood, it should be comparatively easy
to infer emotion from them.
2.4 Measuring Emotion
2.4.1 Self-Report Methods
Perhaps the most widely used method for determining emotional state is through self-
report. This technique asks a subject to describe an emotion he or she is
experiencing, or to select one from a pre-made list. Scherer discusses some of the
problems associated with relying on self-reported emotional experience. On one
hand, limiting self-report of emotion to a single list of words from which the subject
must choose the most appropriate response may lead to emotional “priming,” and/or a
misrepresentation of true experience. On the other hand, allowing a subject to
generate a freeform emotional word to describe experience adds significant difficulty
to any type of analysis (Scherer, 2005). Another method for self-report measurement
is to have a subject identify emotional state within the space of some dimension.
Emotional response is often described as falling somewhere in the two-dimensional
valence/arousal space proposed by Lang. This dimensional model of affect
deconstructs specific emotions into some level of valence (positive feelings versus
negative feelings), and some level of arousal (high intensity versus low intensity)
(Lang, 1995). As an example, joyful exuberance is characterized by high valence and
high arousal, while sadness is characterized by low valence and low arousal.
Lang posits that all emotion falls somewhere in this two-dimensional space. Problems
may arise when emotion is represented in this space without knowing the triggering
event, since several distinct emotions may occupy similar locations in valence-arousal
space, e.g., intense anger versus intense fear, both located in high-arousal, low-
valence space (Scherer, 2005).
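To make the dimensional model concrete, the short Python sketch below places the example emotions above as points in valence/arousal space. The `AffectPoint` type is invented for this illustration, and the coordinate values are assumptions chosen to mirror the examples in the text, not empirical measurements; the sketch shows why anger and fear are difficult to separate in this space.

```python
from dataclasses import dataclass

@dataclass
class AffectPoint:
    """A point in Lang's two-dimensional valence/arousal space.

    Both axes are normalized to [-1.0, 1.0]: valence runs from negative
    to positive feeling, arousal from low to high intensity.
    """
    valence: float
    arousal: float

# Illustrative placements only -- coordinates are assumptions made for
# this example, chosen to reflect the descriptions in the text.
emotions = {
    "joyful exuberance": AffectPoint(valence=0.9, arousal=0.9),
    "sadness":           AffectPoint(valence=-0.7, arousal=-0.6),
    "intense anger":     AffectPoint(valence=-0.8, arousal=0.9),
    "intense fear":      AffectPoint(valence=-0.8, arousal=0.85),
}

def distance(a: AffectPoint, b: AffectPoint) -> float:
    """Euclidean distance between two points in valence/arousal space."""
    return ((a.valence - b.valence) ** 2 + (a.arousal - b.arousal) ** 2) ** 0.5

# Anger and fear occupy nearly the same high-arousal, low-valence
# region, so the dimensional model alone cannot distinguish them.
print(distance(emotions["intense anger"], emotions["intense fear"]))      # small
print(distance(emotions["intense anger"], emotions["joyful exuberance"])) # large
```

Any classifier operating purely on these two dimensions would need additional context, such as the triggering event, to separate emotions that share a region.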
2.4.2 Concurrent Expression Methods
An objective way to measure emotional state is to infer user affect by monitoring
sentic modulation. This method requires algorithms and sensors to perceive the
symptoms of emotion and infer state, and is significantly more
challenging than simply asking a person how he or she feels (Tetteroo, 2008). Still, this is
an active area of research within the affective computing domain because user
intervention is not required to measure the emotion, which may be beneficial in some
cases. Techniques include using camera or video input along with classification
algorithms to automatically detect emotion from facial expression (Kaliouby &
Robinson, 2004), monitoring galvanic skin response to estimate level of arousal
(Wang, Prendinger, & Igarashi, 2004), and using AI learning techniques to infer
emotional state from electroencephalograph signals (Khalili & Moradi, 2009; Sloten,
et al., 2008).
2.5 Conclusion
Picard defines affective computing as "computing that relates to, arises from, or
influences emotion.” (Picard, 1997) According to Picard, some research in this
domain focuses on developing methods of inferring emotional state from sentic user
characteristics (facial expression, physiologic arousal level, etc.), while other research
focuses on methods that computers could use to convey emotional information
(avatars, sound, color, etc.) (Picard, 1997). A study of emotional instant messaging
is necessarily a study of affective computing. In the context of instant messaging,
Tetteroo separates these two research areas of affective computing into the study of
input techniques, and output techniques (Tetteroo, 2008). The next chapter reviews
how these techniques are used during instant message communication to convey
emotional information.
3 Existing Techniques for Conveying Emotion during Instant Messaging
3.1 Introduction
A review of the current literature on emotional communication through instant
message applications has identified several techniques for enriching text-based
communication with affective content. These techniques can be broadly classified as
either input techniques, inferring or otherwise reading in the emotion of the user, or
output techniques, displaying or otherwise conveying the emotion to the partner
(Tetteroo, 2008). These categories are reviewed in turn.
3.2 Input Techniques
Research concerning how emotions can be read into an instant messaging system
generally implements one of several methods: inference from textual cues, inference
through automated facial expression recognition, inference from physiologic data, or
manual selection.
3.2.1 Textual Cues
Input methods that use text cues to infer the emotional content of a message generally
implement algorithms that parse the text of a message and compare its contents with a
database of phrases or keywords for which emotional content is known. Yeo
implements a basic dictionary of emotional terms, such as “disappointed,” or
“happy,” that incoming messages are checked against. When a match is found, the
general affective nature of the message can be inferred (Yeo, 2008). Others have
used more complicated natural language processing algorithms to account for the
subtleties of communicative language (Neviarouskaya, Prendinger, & Ishizuka,
2007).
Another example of using text cues to infer the emotion of a message involves simply
parsing text for occurrences of standard emoticons. The presence of the smiley
emoticon could indicate a positively valenced message, while a frowning emoticon
could indicate negative valence, as implemented by Rovers & Essen (2004).
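To make the approach concrete, the following sketch infers coarse message valence from a small keyword dictionary combined with standard emoticon matching. This is illustrative only, and is not the actual implementation used by Yeo or by Rovers and Essen; the keyword and emoticon lists are assumptions.

```python
# Illustrative sketch (not the actual Yeo or Rovers & Essen
# implementation): infer coarse message valence from a small
# dictionary of emotional keywords plus standard emoticons.
import re

AFFECT_KEYWORDS = {"happy": "positive", "glad": "positive",
                   "disappointed": "negative", "sad": "negative"}
EMOTICONS = {":-)": "positive", ":)": "positive",
             ":-(": "negative", ":(": "negative"}

def infer_valence(message: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a message."""
    score = 0
    # Keyword matches: +1 for positive terms, -1 for negative terms.
    for word in re.findall(r"[a-z']+", message.lower()):
        if word in AFFECT_KEYWORDS:
            score += 1 if AFFECT_KEYWORDS[word] == "positive" else -1
    # Emoticon matches contribute in the same way.
    for emoticon, valence in EMOTICONS.items():
        if emoticon in message:
            score += 1 if valence == "positive" else -1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A real system would need a much larger lexicon and handling for negation ("not happy"), which is what the natural language processing approaches cited above attempt to address.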
3.2.2 Automated Expression Recognition
Automated expression recognition uses classification algorithms to infer emotion
from camera or video images of a subject's face.
Kaliouby and Robinson use an automated “facial affect analyzer” in this manner to
infer happy, surprised, agreeing, disagreeing, confused, indecisive, and neutral states
of affect (Kaliouby & Robinson, 2004). The classifier makes an evaluation about
affective state based on information about the shape of the mouth, the presence of
teeth, and head gestures such as nods.
3.2.3 Physiologic Data
Some physiologic data is known to encode levels of affect, including galvanic skin
response, skin temperature, heart beat and breathing rate, pupil dilation, and electrical
activity measured from the surface of the scalp (Picard, 1997). Wang and colleagues
used GSR data to estimate levels of arousal in an instant messaging application
(Wang, et al., 2004). Specifically, spikes in the GSR data were used to infer high
levels of arousal, and the return to lower amplitudes signaled decreased level of
arousal.
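The spike-based inference can be sketched as follows. This is an illustrative simplification: Wang and colleagues do not specify their exact signal processing here, and the threshold and baseline-tracking parameters are assumptions.

```python
# Illustrative sketch of GSR-based arousal labeling (a simplification
# of the idea behind Wang et al.'s approach, not their actual code):
# flag high arousal when the signal rises sharply above a slowly
# adapting baseline, and low arousal when it returns.
def arousal_from_gsr(samples, threshold=0.2):
    """Label each GSR sample 'high' or 'low' arousal relative to a
    running baseline (simple exponential moving average)."""
    labels = []
    baseline = samples[0]
    for value in samples:
        labels.append("high" if value - baseline > threshold else "low")
        baseline = 0.9 * baseline + 0.1 * value  # slowly track the signal
    return labels
```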
Output from electroencephalograph (EEG) has also been used to classify emotional
state into distinct categories within arousal/valence space by several researchers
(Khalili & Moradi, 2009; Sloten, et al., 2008). These studies use AI learning
techniques to classify affective state into a small number of categories with moderate
success.
3.2.4 Manual Selection
The most basic method of adding emotional content to an instant message is by
simple manual selection or insertion. This method can take the form of a user
selecting from a list of predefined emotions or emotional icons with a mouse click, or
by explicitly inserting a marker, e.g., an emoticon, directly into the body of the message
text. This type of input technique is widely used (Fabri & Moore, 2005; Sanchez,
Hernandez, Penagos, & Ostrovskaya, 2006; Wang, et al., 2004).
3.3 Output Techniques
Output techniques describe methods that can be used to display emotional content to
a chat participant after it has been input into the system. These techniques generally
involve using emoticons, expressive avatars, haptic devices, and kinetic typography.
3.3.1 Emoticons
Emoticons are typically understood as small text-based or graphical representations of
faces that characterize different affective states, and have been ever evolving in an
attempt to remedy the lack of non-verbal cues during text chat (Lo, 2008). Examples
of commonly used emoticons are presented in the table below.
Meaning     Western Emoticon   Eastern Emoticon
Happy       :-)                (^_^)
Sad         :-(                (T_T)
Surprised   :-o                O_o
Angry       >:-(               (>_<)
Wink        ;-)                (~_^)
Annoyed     :-/                (>_>)
Table 3.1 Common emoticons in Western and Eastern cultures
Emoticons are perhaps the most widely used method for augmenting textual
communication with affective information. A survey of 40,000 Yahoo Messenger
users reported that 82% of respondents used emoticons to convey emotional
information during chat (Yahoo, 2010). A separate study by Kayan and colleagues
explored differences in IM behavior between Asian and North American users and
reported that of 34 total respondents, 100% of Asian subjects used emoticons while
72% of North Americans (with an aggregate of 85% of respondents) used emoticons
on a regular basis (Kayan, Fussell, & Setlock, 2006). These usage statistics
underscore the prevalence of emoticons in IM communication.
Sanchez and colleagues introduced an IM application with a unique twist on the
standard emoticon. Typical emoticons scroll with the text they are embedded in, and
so convey only brief glimpses of emotion.
This novel application has a persistent area for emoticons that can be updated as often
as the user sees fit, and does not leave the screen as messages accumulate (Sanchez,
et al., 2006). Building on Russell's model of affect (Russell, 1980), the team
developed 18 different emoticons, each with three levels of intensity to represent a
significant portion of valence/arousal emotional space.
3.3.2 Expressive Avatars
Using expressive avatars to convey emotion during IM communication may be
considered a close analog to the way emotion is encoded by facial expression during
face-to-face conversation, considering that facial expression is among the highest
bandwidth channels of sentic modulation (Pantic, et al., 2005).
A study by Kaliouby and Robinson presents an instant messaging application called
FAIM which uses automated facial expression recognition to infer affect, and
displays an expressive avatar reflecting that affect to the chat partner (Kaliouby &
Robinson, 2004). Affective states currently supported by FAIM include happy,
surprised, agreeing, disagreeing, confused, indecisive, and neutral.

Figure 3.1 Examples of expressive avatars

Fabri and Moore investigated the use of animated avatars capable of emotional facial
expressions in an instant messaging environment (Fabri & Moore, 2005). They
compared this with a condition in which the avatar was not animated and did not
change facial expressions, except for minor random eyebrow movement. The
hypothesis was that the condition in question would result in a higher level of
"richness," comprised of high levels of task involvement, enjoyment, sense of
presence, and sense of copresence. Participants interacted through the IM application
during a classical survival exercise, in which both subjects were tasked with
collectively ordering a list of survival items in terms of importance. An avatar
representing each chat partner could be made to display one of Ekman's six universal
facial expressions including happiness, surprise, anger, fear, sadness, and disgust
(Ekman & Oster, 1979), by clicking on a corresponding icon in the interface.
Significant results from the study indicated higher levels of task involvement and
copresence in the expressive avatar condition, equally high levels of presence in both
conditions, and a higher level of enjoyment in the non-expressive avatar condition.
The AffectIM application developed by Neviarouskaya also uses expressive avatars
to convey emotion during instant message communication (Neviarouskaya, et al.,
2007). Rather than requiring a user to select an expression from a predefined set,
AffectIM infers the emotional content of a message by analyzing the text of the
message itself, and automatically updates an avatar with the inferred emotion. A
comparison study identified differences between separate configurations of the
AffectIM application: one in which emotions were automatically inferred, one that
required manual selection of a desired emotion, and one that selected an emotion
automatically in a pseudo-random fashion (Neviarouskaya, 2008). The study
compared “richness” between conditions, comprised of interactivity, involvement,
sense of copresence, enjoyment, affective intelligence, and overall satisfaction.
Significant differences indicated a higher sense of copresence in the automatic
condition than in the random condition, and higher levels of emotional intelligence in
both the automatic and manual conditions than in the random condition.
3.3.3 Haptic Devices
Haptic instant messaging is described as instant messaging that employs waveforms
of varying frequencies, amplitudes, and durations, transmitted and received by
purpose-built haptic devices (force-feedback joysticks, haptic touchpads, etc.), to
which special emotional meaning can be attached (Rovers & Essen, 2004). Rovers
and Essen introduce their idea of "hapticons," which are described as "small
programmed force patterns that can be used to communicate a basic notion in a
similar manner as ordinary icons are used in graphical user interfaces." Their
preliminary application, HIM, parses instant message text for occurrences of
traditional emoticons, e.g., :), etc., and sends a predefined waveform to any connected
haptic devices. For example, a smiley face sends a waveform with moderate
frequency that slowly grows in amplitude, while a frowny face is represented by
several abrupt pulses with high frequency and amplitude.

Emoticon   Meaning   Haptic Waveform
:-)        Happy     (waveform illustration)
:-(        Sad       (waveform illustration)

Table 3.2 Example of hapticons

The ContactIM application developed by Oakley and O'Modhrain takes a different
approach to integrating haptic information with an instant messaging environment. A
plugin for the Miranda IM environment was created that mimics the effects of tossing
a ball between partners by using a force-enabled haptic device such as the Phantom or
a standard force-feedback joystick (Oakley, 2003). The application is designed to
allow each user to impart a specific velocity and trajectory to the ball during a throw.
The generated momentum of the ball is persistent until the chat partner picks it up. In
this way, the act of tossing the ball may convey some degree of emotionality, e.g., a
lightly thrown ball as a playful flirtatious gesture, or a fast throw to indicate
disagreement or anger. Emphasis is placed on the asynchronous nature of typical
instant message use, and the application has been designed to suit this mode of
interaction by keeping the general characteristics of the tossed ball persistent until
interaction by the receiver changes it.
3.3.4 Kinetic Typography
Kinetic typography is described as the real-time modification of typographic
characteristics such as animation, color, font, and size, and may be used to
convey affective information (Yeo, 2008). Yeo developed an IM client that inferred
affective meaning through keyword pattern matching, and used kinetic typography to
update the text of messages in real time (Yeo, 2008).
An instant messaging client developed by Wang and colleagues represents emotion in
arousal/valence space by combining kinetic typography with galvanic skin response.
Manually selected text animations are meant to represent valence, while GSR that is
recorded and displayed to the chat partner represents level of arousal. Users were
asked when they felt the most involved during the online communication, and
answers typically corresponded to peaks in GSR level (Wang, et al., 2004). The
study participants reported that the inclusion of arousal/valence information made the
communication feel more engaging and that it was preferred over traditional text-only
chat, although some users indicated that they would not always want their partner to
be aware of their arousal level (Wang, et al., 2004).
22
3.4 Conclusion
This chapter has separated the major components of emotional instant messaging into
two categories: input techniques and output techniques. Among input techniques,
inference from textual cues, inference through automated facial expression
recognition, inference from physiologic data, and manual selection have been
reviewed. Output techniques that were discussed include emoticons, expressive
avatars, haptic devices, and kinetic typography.
The next chapter introduces the Epoc headset and describes some of its capabilities.
This headset is used as the emotional input device for the EmoChat system discussed
in a subsequent chapter, and can be thought of as using automated facial expression
recognition in combination with physiologic data to infer and convey emotion. The
next chapter also presents a study that investigates the validity of the Epoc affect
classifier.
4 Study 1: Validating the Emotiv Epoc Headset
4.1 Introduction
This study investigates the validity of the Epoc headset in terms of how accurately it
measures levels of excitement, engagement, and frustration. Self-reported measures
of excitement, engagement, and frustration are collected after games of Tetris are
played at varied difficulty levels. The self-reported measures are compared with data
from the headset to look for relationships.
4.2 Overview of the Epoc Headset
The EmoChat application makes use of the Epoc headset for measuring affective state
and facial expression information. This headset, developed by Emotiv, was one of
the first consumer-targeted BCI devices to become commercially available.
Alternative BCI devices that were considered include the Neurosky Mindset and the
OCZ Neural Impulse Actuator. The Epoc was selected because of the comparatively
large number of electrodes (14) that it uses to sense electroencephalography (EEG)
and electromyography (EMG) signals, and the resulting capabilities. Additionally,
the Epoc has a growing community of active developers who form a support network
for other people using the software development kit to integrate headset capabilities
with custom applications.
Traditional EEG devices require the use of a conductive paste in order to reduce
electrical impedance and improve conductivity between the electrode and the scalp.
The Epoc device, however, replaces this conductive paste with saline-moistened felt
pads, which reduces set up time and makes clean up much easier.
A software development kit provides an application programming interface to allow
integration with homegrown applications, and a utility called EmoKey can be used to
associate any detection with any series of keystrokes for integration with legacy
applications. The developers have implemented three separate detection “suites”
which monitor physiologic signals in different ways, and are reviewed below.
4.2.1 Expressiv Suite
This suite monitors EMG activity to detect facial expressions including left/right
winks, blinks, brow furrowing/raising, left/right eye movement, jaw clenching,
left/right smirks, smiles, and laughter. The detection sensitivity can be modified
independently for each feature and for different users. Universal detection signatures
are included for each feature, but signatures can also be trained to increase accuracy.
Lower Face Movements   Upper Face Movements   Eye Movements
Smirk Right            Brow Raise             Look Left
Smirk Left             Brow Furrow            Look Right
Smile                  Wink Right
Laugh                  Wink Left
Jaw Clench             Blink

Table 4.1 Facial expression features measured by the Epoc headset
4.2.2 Affectiv Suite
The Affectiv suite monitors levels of basic affective states including instantaneous
excitement, average excitement, engagement, frustration, and meditation. Detection
algorithms for each state are proprietary and have not been released to the public,
therefore the given labels may be somewhat arbitrary, and may or may not accurately
reflect affective state. The goal of the present study is to determine the accuracy of
these detections with a longer-term goal of investigating whether this information can
be used to augment the instant messaging experience through the presentation of
emotional content.
4.2.3 Cognitiv Suite
This suite allows a user to train the software to recognize an arbitrary pattern of
electrical activity measured by EEG/EMG that is associated with a specific,
repeatable thought or visualization. The user may then reproduce this specific pattern
to act as the trigger for a binary switch. Skilled users may train and be monitored for
up to 4 different thought patterns at once.
4.3 The Need for Validation
The Epoc affectiv suite purports to measure levels of excitement, engagement,
frustration, and meditation; however, the algorithms used to infer these states are
proprietary and closed-source. There has been little research that references the Epoc
headset, perhaps because it is still new and relatively unknown. No studies thus far
have evaluated the accuracy of its affective inference algorithms. Campbell and
colleagues used raw EEG data from the headset as input to a P300-based selection
engine (Campbell, et al., 2010), and several others have reviewed the device (Andrei,
2010; Sherstyuk, Vincent, & Treskunov, 2009). The API provides methods to retrieve
each affectiv suite score at any given moment, but does not reveal how any score is
calculated. It is understandable that Emotiv has chosen to keep this part of
their intellectual property out of the public domain, but if these affectiv measurements
are to be used in any serious capacity by researchers or developers, evidence should
be provided to support the claim that reported affectiv suite excitement levels are
reasonable estimates of actual subject excitement levels, that affectiv suite
engagement levels are reasonable estimates of actual subject engagement levels, and
so on.
4.4 Experimental Design
A study was designed to determine the accuracy of the Epoc affectiv suite by
presenting study participants with stimuli intended to elicit different levels of
affective and physiologic responses (in the form of game play at varied levels), and
measuring for correlation between output from the headset and self-reported affective
experience. Since the overall goal of this thesis work is to investigate how the
inclusion of affective information enriches instant message communication, the
excitement, engagement, and frustration headset detections are validated here. It is
thought that they are the most applicable to a study of affective communication.
The study design used in the present study is adapted from similar work by Chanel
and colleagues, during which physiological metrics were monitored as participants
played a Tetris game (Chanel, Rebetez, Betrancourt, & Pun, 2008). The difficulty
level of the game was manipulated in order to elicit differing affective states. Self-
report methods were also used to collect data about participants’ subjective
experience of affect. The goal of the study was to use these physiological metrics
with machine learning techniques to classify affective experience into three
categories, including, anxiety, engagement, and boredom. It is thought that the
anxiety category of the Chanel study may be a close analog to the frustration
component of the Epoc affectiv suite.
4.4.1 TetrisClone System Development

A small Tetris application (TetrisClone) was developed to automate the experiment
and to aid with data collection. The application was written in C# using Microsoft
Visual Studio 2008 and interfaces with the Emotiv Epoc headset through the supplied
API.
The initialization screen for TetrisClone can be seen in fig. 4.1. This screen collects
the test subject’s name and is used to start logging data coming from the Epoc
headset. After logging begins, the right panel can be hidden so that it does not
distract the subject during the experiment. A screenshot of the TetrisClone application
during one of the trials is presented in fig. 4.2.
Figure 4.1 Initialization screen for the TetrisClone application

Figure 4.2 TetrisClone application during trials

4.4.2 Measures

4.4.2.1 Questionnaires

All participants completed a total of 3 surveys to collect basic demographic
information, self-reported levels of affect during the experiment, and open-ended
opinions about the causes of affect during game play. These questionnaires are
provided in appendices A-C. Demographics questions asked about age, gender,
familiarity with Tetris, and skill at computer/console games. Self-reported levels of
affect questions asked subjects to rate their experiences of excitement, engagement,
and frustration on a 5 point likert-scale between trial games. Open ended questions
asked subjects to describe what game events made them feel excited, engaged, and
frustrated.
4 T
h
in
an
In
ar
T
ex
m
en
.4.2.2 Hea
The TetrisClo
eadset at app
n the left pan
nd moves on
n this way, it
re associated
The output C
xcitement (s
meditation, h
ngagement,
adset Data
one applicati
proximately
ne of fig. 4.1
n to the next
t becomes ea
d with which
Figure 4
SV files them
short term), e
owever the p
and frustrati
ion receives
2 Hz. and w
). As a part
, the content
asier to deter
h points in th
4.3 Example ou
mselves con
excitement (
present study
ion compone
29
affective sta
writes these t
ticipant comp
ts of the text
rmine which
he experimen
utput from the T
ntain time-sta
long term), e
y is only con
ents.
ate informati
to a hidden t
pletes one p
t box control
h groupings o
nt.
TetrisClone app
amped, head
engagement
ncerned with
ion from the
text box cont
ortion of the
l are output t
of headset d
plication
dset-reported
t, frustration
h excitement
e Epoc
trol (visible
e experiment
to a CSV file
data records
d levels of
, and
t (short term
t
e.
m),
30
4.5 Experimental Procedures
A total of seven subjects participated in the experiment; however, data from one
subject was incomplete due to problems with maintaining a consistently good signal
quality from the headset. This incomplete data is not included in the analysis. The
remaining subjects (N=6) were all male, aged 25-30 (mean=28.5). They reported
being pretty to very familiar with the Tetris game (mode=very familiar), playing
computer or console games occasionally to very frequently (mode=very frequently),
but playing Tetris itself never to rarely (mode=rarely). Participants rated themselves
as being of average to far above average skill (mode=above average) when it came to
playing computer or console games.
Subjects arrived, were seated in front of a laptop, asked to review and sign consent
forms, and then completed the Demographics Questionnaire (Appendix A). The
subjects were then fitted with the Epoc headset. Care was taken to ensure that each
headset sensor reported strong contact quality in the Control Panel software. The
self-scaling feature of the Epoc headset required 15-20 minutes prior to data
collection. During this time, the subjects were asked to play several leisurely games
of Tetris as a warm-up activity.
Once the headset adjusted to the subjects, the Tetris game/Headset data recorder was
launched. Subjects played through a series of three Tetris games to determine “skill
level,” as calculated by averaging the highest levels reached during each game. The
TetrisClone application has no maximum level cap, although levels above 10 are so
difficult that progressing further is not practical. After each game the subjects rested
for 45 seconds to allow any heightened emotional states to return to baseline.
Once skill level was determined, three experimental conditions were calculated
automatically as follows:
High Difficulty Level = Skill Level (+ 2)
Med Difficulty Level = Skill Level
Low Difficulty Level = Skill Level (– 2)
The subjects then played through a randomly ordered set of 6 trials consisting of 2
games at each difficulty level, e.g., [8,6,6,4,8,4]. Trials were randomized to account
for order effects. During each trial, games lasted for 3 minutes at a constant
difficulty/speed. If the subjects reached the typical game over scenario, the playing
field was immediately reset and the subjects continued playing until the 3 minutes
were over. At the end of each round, the subjects completed a portion of the
Experiment Questionnaire (Appendix B). The subjects rested for 45 seconds after
completing the questionnaire, but before beginning the next round, to allow emotional
state to return to baseline. Headset logging stopped after all 6 rounds had been
played.
After the subjects finished all game play tasks the facilitator removed the headset, and
subjects completed the final Post-Experiment Questionnaire (Appendix C). The
subject was paid for his time, signed a receipt, and was free to leave.
4.6 Results and Analysis
4.6.1 Headset-Reported versus Self-Reported Levels of Affect
The main goal for this study was to determine whether or not the Epoc headset
reports data that is congruent with self-reported data of the same features. This was
done in order to establish the validity of the Epoc affectiv suite. Headset and self-
report data were compared a number of different ways. Each trial yielded 3 minutes
worth of headset data sampled at 2 Hz. (approximately 360 samples x 3 affective
features per sample = 1080 individual data elements), which were to be compared
with a single self-reported level of affect for each of the 3 affective features in
question (excitement, engagement, and frustration). Headset data from each trial was
reduced to 3 individual data elements by taking the mean of all sampled values for
each of the three affective features. Headset means from each trial were then paired
with the corresponding self-reported levels of affect for that trial. These data are
reproduced in the table below.
Subject Trial Condition H_exc H_eng H_fru S_exc S_eng S_fru
Subject_1 1 low 0.283975 0.488978 0.347653 2 2 1
Subject_1 2 high 0.315356 0.559145 0.400994 4 5 3
Subject_1 3 med 0.316626 0.52268 0.430862 5 4 4
Subject_1 4 low 0.317267 0.573343 0.396643 4 5 1
Subject_1 5 med 0.389669 0.543603 0.371616 5 5 2
Subject_1 6 high 0.305706 0.596423 0.454396 5 5 3
Subject_2 1 high 0.301968 0.654293 0.580068 3 3 3
Subject_2 2 med 0.384194 0.660933 0.785552 3 3 3
Subject_2 3 high 0.323292 0.559863 0.368594 3 4 4
Subject_2 4 med 0.271679 0.505589 0.336272 2 3 2
Subject_2 5 low 0.289168 0.604071 0.505809 2 2 2
Subject_2 6 low 0.304198 0.530313 0.37589 3 2 2
Subject_3 1 high 0.302396 0.406588 0.34533 3 3 3
Subject_3 2 med 0.279739 0.409497 0.356766 3 3 2
Subject_3 3 low 0.353323 0.402969 0.413634 2 2 1
Subject_3 4 low 0.311233 0.425248 0.360592 2 3 1
Subject_3 5 med 0.365323 0.427358 0.429031 3 4 2
Subject_3 6 high 0.361752 0.543518 0.380738 4 4 4
Subject_4 1 low 0.233371 0.557115 0.391817 2 3 1
Subject_4 2 high 0.334173 0.485377 0.448109 3 4 4
Subject_4 3 med 0.265334 0.532051 0.463684 2 2 3
Subject_4 4 med 0.416719 0.519389 0.522566 2 2 1
Subject_4 5 low 0.281803 0.445416 0.424542 2 3 1
Subject_4 6 high 0.305135 0.508839 0.474708 3 3 3
Subject_5 1 low 0.27187 0.597617 0.362354 3 4 2
Subject_5 2 med 0.256403 0.744225 0.361584 4 4 2
Subject_5 3 high 0.292769 0.659934 0.377663 4 5 3
Subject_5 4 med 0.307077 0.650888 0.533751 4 5 1
Subject_5 5 high 0.256848 0.710124 0.460227 2 2 2
Subject_5 6 low 0.24097 0.594766 0.555613 2 4 1
Subject_6 1 high 0.250623 0.563784 0.455011 3 4 3
Subject_6 2 low 0.271444 0.59069 0.452902 5 5 1
Subject_6 3 high 0.282375 0.55344 0.458738 2 3 4
Subject_6 4 med 0.257751 0.573329 0.482535 4 4 1
Subject_6 5 med 0.235875 0.558884 0.460652 3 4 2
Subject_6 6 low 0.305082 0.622437 0.490638 4 4 1

Table 4.2 Headset and self-reported levels of affect per subject, per trial
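The reduction from raw headset samples to the per-trial means shown in table 4.2 can be sketched as follows. This is illustrative only; the actual reduction was performed on the TetrisClone logs, with subsequent analysis in SPSS.

```python
# Illustrative sketch of the per-trial data reduction: a trial's ~360
# headset samples (2 Hz over 3 minutes) are collapsed to one mean per
# affective feature, ready to pair with that trial's self-report.
from statistics import mean

def reduce_trial(samples):
    """samples: list of (excitement, engagement, frustration) tuples
    logged over one trial. Returns the per-feature means."""
    exc, eng, fru = zip(*samples)
    return {"H_exc": mean(exc), "H_eng": mean(eng), "H_fru": mean(fru)}
```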
The non-parametric Spearman’s rho was selected as the correlation metric for
determining statistical dependence between headset and self-reported levels of affect
because of the mixed ordinal-interval data. The resulting correlation coefficients and
significances were calculated with SPSS and are presented in the table below (N=36).
Correlation Pair   Coefficient   Sig. (2-tailed)
Excitement         0.261         0.125
Engagement         0.361         0.030
Frustration        -0.033        0.849
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
This analysis suggests that of the three features of affect that were examined,
engagement appears to significantly correlate (p=.03) between what is reported by the
headset and what is experienced by the subject. No significant correlation of
excitement or frustration between headset and self-reported affect levels was found.
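For reference, Spearman's rho is simply Pearson correlation computed on ranks (with ties given their average rank), which is what makes it appropriate for the mixed ordinal-interval data here. The minimal sketch below is illustrative; the thesis analysis itself was performed in SPSS.

```python
# Minimal Spearman's rho: rank-transform both variables (ties get
# their average rank), then compute Pearson correlation on the ranks.
# Illustrative only; the thesis analysis was done in SPSS.
def _ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation of two equal-length sequences (assumes
    neither sequence is constant)."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```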
The levels of headset reported excitement, engagement, and frustration presented in
table 4.2 are average levels over entire 3 minute trials. It could be possible that self-
reported affect levels collected after the trials might better relate to headset levels
averaged over smaller subdivisions of the trial time. For example, a subject who
experienced high levels of frustration during the last 15 seconds of game play, but
low levels of frustration at all other times may have self-reported a high level of
frustration after the trial, even though the subject generally experienced low levels.
To investigate whether headset data averaged over smaller subdivisions of trial time
better correlated with self-reported data, headset data was averaged from time slices
of the last 15, 30, 60, and first 60 seconds of trial data. Spearman’s rho was
calculated for each new dataset by comparing with the same self-report data. These
results are provided in the table below (N=36), along with original correlation results
from table 4.3.
Correlation Pair   Time Division   Coefficient   Sig. (2-tailed)
Excitement         all 3 min       0.261         0.125
                   last 60 s       0.133         0.439
                   last 30 s       0.210         0.219
                   last 15 s       0.174         0.310
                   first 60 s      0.285         0.092
Engagement         all 3 min       0.361*        0.030
                   last 60 s       0.340*        0.042
                   last 30 s       0.291         0.085
                   last 15 s       0.316         0.061
                   first 60 s      0.229         0.179
Frustration        all 3 min       -0.033        0.849
                   last 60 s       -0.102        0.554
                   last 30 s       -0.049        0.775
                   last 15 s       0.005         0.977
                   first 60 s      0.176         0.305
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
This analysis suggests that no new significant relationships between headset and self-
report data are found when analyzing headset data from specific subdivisions of time
(last 60, 30, 15 seconds, and first 60 seconds), however, it does appear that self-
reported excitement and frustration levels correlate best with averaged headset data
from the first 60s of each trial.
The data were further analyzed by calculating grand means for each difficulty
condition, and for both headset and self-reported levels of affect. Grand means are
presented in the table below.
Condition H_exc H_eng H_fru S_exc S_eng S_fru
low    0.29   0.54   0.42   2.75   3.25   1.25
med    0.31   0.55   0.46   3.33   3.58   2.08
high   0.30   0.57   0.43   3.25   3.75   3.25
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
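Computing these grand means amounts to grouping per-trial values by difficulty condition and averaging within each group. A minimal sketch follows; the (condition, value) pair layout is an assumption for illustration:

```python
def grand_means(rows):
    # rows: iterable of (condition, value) pairs, e.g. ("low", 0.29).
    # Returns {condition: mean of all values recorded under that condition}.
    totals = {}
    for condition, value in rows:
        s, n = totals.get(condition, (0.0, 0))
        totals[condition] = (s + value, n + 1)
    return {c: s / n for c, (s, n) in totals.items()}
```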
Spearman’s rho was again used as the metric for determining statistical dependence
between headset and self-reported affective levels of the grand mean data. Resulting
correlation coefficients and significances were calculated with SPSS and are
presented in the table below (N=3).
Correlation Pair Coefficient Sig. (2-tailed)
Excitement    1.00   0.01
Engagement    1.00   0.01
Frustration   0.50   0.667
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
This analysis suggests a very high correlation of excitement and engagement between
headset and self-reported levels of affect (p=.01); however, these results should be
interpreted with caution, considering that this is a correlation between means and the
number of values being compared is small (N=3). No significant correlation of
frustration between headset and self-reported levels of affect was found.
To visualize the relationship between grand means of headset and self-reported affect
features, line charts are presented below.
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement

Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement

Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
The significant correlations (p=.01) between headset and self-report levels of
excitement and engagement are apparent in fig. 4.4 and fig. 4.5. Frustration is seen to
correlate moderately well from low to medium difficulty conditions; however the
degree and direction of change from medium to high difficulty conditions are clearly
in disagreement. As noted above, cautious interpretation of figures 4.4-4.6 is prudent
considering the use of grand means and the small number of data points, but the
emergence of a relationship between headset and self-reported levels of excitement
and engagement is suggested.
4.6.2 Subjective Causes of Affect during Gameplay
After the game play portion of the experiment concluded, subjects were asked to
complete a brief survey to collect their subjective opinions about what caused their
experiences of excitement, engagement, and frustration during game play. Participant
responses were analyzed to identify general themes. The data were coded for these
themes, which have been aggregated and are presented in the table below.
Q1. What types of events caused you to get excited during game play?
Extracted Themes Participant Responses
Competent game performance
(S3) “Doing well” (S1) “Clearing many lines at once” (S1) “Seeing a particularly good move open up”
Game speed/difficulty (S5) “Speed increase of block decent” (S2) “When blocks got faster” (S4) “Game speed”
Poor game performance
(S5) “Block failing to land where desired” (S3) “A poor move” (S4) “A mistake during otherwise good game play” (S6) “End of game when doing bad” (S2) “When the blocks would get too high”
Positively perceived game variables
(S3) “A good sequence of pieces” (S1) “Getting pieces I was hoping for” (S6) “Seeing blocks I needed to make 4 rows”
Q2. What made the game engaging?
Extracted Themes Participant Responses
Game speed/difficulty (S2) “The intensity” (S1) “Speed increases”
Cognitive load
(S4) “You have to think fast” (S6) “Have to think about what’s happening on board” (S1) “Anticipating future moves” (S1) “Seeing game board get more full”
Game simplicity (S3) “Small learning curve in general” (S1) “Simplicity” (S4) “Few input options, enough to stay interesting”
Q3. What happened during the game that made you feel frustrated?
Extracted Themes Participant Responses
Negatively perceived game variables
(S5) “Too many of same block in a row” (S3) “Bad sequence of blocks” (S1) “Getting the same piece over and over” (S6) “Not getting the blocks I needed” (S5) “Block did not fit pattern I was constructing” (S1) “Getting undesired pieces”
Poor game performance (S5) “When a block would land out of place” (S3) “Poor move”
Game speed/difficulty (S3) “Game moving too quickly” (S4) “Game speed increase”
Table 4.7 Major themes identified in subjective affect survey results
4.7 Discussion
4.7.1 Consistency in the Present Study
The main goal of this study was to determine how accurately the Epoc headset
measures levels of excitement, engagement, and frustration in order to establish the
validity of the Epoc affectiv suite. To this end, the TetrisClone application was
developed and used to log headset output while subjects played games of Tetris at
varied difficulty levels. During the study, subjects were asked to self-report how
excited, engaged, and frustrated they felt for each game that they played.
The responses to the self-report excitement, engagement, and frustration questions
were then statistically compared with the output from the headset. This analysis
suggested that self-reported levels of engagement correlated well with levels reported
by the headset. To a lesser degree, the analysis suggested that excitement levels
measured by the headset correlated fairly well with self-reported levels. Frustration
levels measured by the headset, however, did not appear to correlate with self-
reported levels.
Subjective responses about what made the game engaging seem to corroborate the
self-report and headset data. General trends in the data described engagement as
increasing over low, medium, and high difficulty levels. The two main themes
identified in responses to “what makes the game engaging,” were game
speed/difficulty and cognitive load. As level increases, game speed inherently also
increases. It makes sense that increased difficulty of the game should demand greater
concentration, more planning, and more efficient decision making—all suggestive of
increased cognitive load. With respect to existing literature, the Chanel study, on
which the experimental design of the present study was based, found a similar upward
linear trend in participant arousal, however, the relationship between engagement and
arousal has not been established.
Excitement trends in self-report and headset data generally showed an increase from
low to medium difficulty, then a slight decrease in the high difficulty condition.
Responses to the question, “what types of events caused you to get excited during
game play,” support this trend. General themes extracted from responses to the
question include competent game performance, game speed/difficulty, poor game
performance, and positively perceived game variables (such as getting a block type
you were hoping for). Game speed increases as difficulty condition increases, so its
contribution to overall excitement level is always present in quantities that increase
with difficulty level. It might be assumed that competent game performance and poor
game performance have a balancing effect on one another, i.e., when one increases
the other decreases, thereby creating a single contributing factor to excitement that is
always present, and arguably stable. The decisive contributor to excitement level
may be the positively perceived game variables. It seems feasible that game variables
that happen to be in the player’s favor should occur at a similar frequency, regardless
of difficulty level. It may also be that these occurrences are less noticed at
higher difficulty levels due to increased cognitive load, as the mind is occupied by
game tasks of greater importance. This lack of recognition of positive game variables may
be the reason that excitement increases from low to medium difficulty conditions, but
then decreases in the high condition. A similar trend reported by the Chanel study
occurs in the valence dimension. Valence is shown to increase from low to medium
difficulty conditions, then decrease in the high condition, although the relationship
between valence and excitement has not been established.
4.7.2 Future Direction
It might be beneficial to take a more granular approach to validating output from the
Epoc headset to determine whether specific game events, e.g., a mistake or an optimal
placement, influence affect levels reported by the headset. The present study only
looked at average headset output over large spans of time, but there is a great deal of
variability in the data, some of which might have a relationship with game events.
This more granular approach would require the ability to record specific game events,
e.g., clearing a line, and cross referencing with data from the headset. This could be
accomplished by recording video of the game play. It might also yield interesting
results if headset data were tested for any correlation with other known physiological
measures of affect such as GSR, or skin temperature.
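The proposed event-level analysis could be prototyped by aligning time-stamped game events with the headset's sample stream. The sketch below is only an outline of that alignment step; the sampling rate and post-event window length are illustrative assumptions, not values from the study:

```python
def mean_after_events(event_times, samples, fs=4.0, window_s=2.0):
    # For each game-event timestamp (seconds from trial start), average
    # the headset samples in the window_s seconds that follow the event.
    # samples: one affect channel, recorded at fs Hz for the whole trial.
    results = []
    for t in event_times:
        start = int(t * fs)
        stop = int((t + window_s) * fs)
        segment = samples[start:stop]
        if segment:  # skip events whose window falls outside the recording
            results.append(sum(segment) / len(segment))
    return results
```

Event-window averages computed this way (e.g., after every cleared line) could then be compared against a trial baseline, or against simultaneous GSR or skin-temperature recordings.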
4.8 Conclusion

With the accuracy of at least some of the headset-reported levels of affect established,
an instant message application was developed that uses output from the headset to
control a simple animated avatar and display data from the affectiv suite during
messaging sessions. This application is called EmoChat. In the next chapter, a study
investigates how EmoChat can be used to enrich emotional content during IM
sessions.
5 Study 2: Emotional Instant Messaging with EmoChat
5.1 Introduction
Existing instant messaging environments generally fail to capture non-verbal cues
that greatly increase information transfer bandwidth during face-to-face
communication. The present study is a step toward investigating the use of a low-cost
EEG device as a means to reintroduce non-verbal information that is otherwise lost
during computer-mediated communication.
An Instant Messaging application (EmoChat) has been developed that integrates with
the Emotiv Epoc headset to capture facial movements that are used to animate an
expressive avatar. Output from the headset is also used to convey basic affective
states of the user (levels of excitement, engagement, and frustration).
The present study examines how emotional information is transferred differently
between users of the EmoChat application and a “traditional” instant messaging
environment in terms of emotionality. This study addresses the following research
questions:
1. Does the system facilitate communication that generally contains more
emotional information?
2. Does the system provide a greater degree of richness (as defined in section
5.3.1)?
3. Is the emotional state of participants more accurately conveyed/interpreted?
4. How usable is a system that implements this technology?
5.2 EmoChat System Development

5.2.1 Overview

EmoChat is a client/server application that facilitates the exchange of emotional
information during instant message communication. The application was developed
with C# in Microsoft Visual Studio 2008.

Traditional instant messaging environments typically rely on manually generated
emoticons in order to shape the emotional meaning of a message. EmoChat
introduces a novel way to capture and convey emotional meaning by integrating with
the Emotiv Epoc headset, a low cost, commercially available EEG device that is
capable of inferring facial expression and basic affective information from raw EEG
data.

Figure 5.1 The EmoChat client application
Facial expression information that is captured by the Epoc headset is passed to
EmoChat and used to animate a simple avatar with brow, eye, and mouth movements.
Affective information captured by the headset is used to modify the value of a series
of progress bar style widgets. The previous validation study suggested that
excitement and engagement are reasonably estimated by the headset; however, it was
decided that other affective measures from the headset (frustration and meditation)
would also be presented in the EmoChat application to give the users a chance to
decide for themselves whether or not these measures are of any value.
Although the application has been specifically designed to integrate with the Epoc
headset, a headset is not required. All facial movements and affective levels may be
manually manipulated by the user at a very granular level, i.e., users may override
brow control, but leave eye, mouth, and affect control determined by the headset.
Manual override of facial and affect control is permitted whether or not a headset is
being used. A summary of the facial movements and affective information conveyed
by EmoChat is presented below:
Eyebrow         Eyes         Mouth            Affect
Strong raise    Blink        Laugh            Excitement
Weak raise      Left wink    Strong smile     Average excitement
Neutral         Right wink   Weak smile       Engagement
Weak furrow     Neutral      Neutral          Frustration
Strong furrow   Left look    Clench (frown)   Meditation
                Right look

Table 5.1 Facial movements and affective information used by EmoChat
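The per-channel override behavior described above can be modeled as a simple merge: any channel the user has overridden wins, while the remaining channels continue to track the headset. This is only an illustrative model in Python, not the application's actual C# code, and the channel names are taken from the table for convenience:

```python
def resolve_avatar_state(headset_state, manual_overrides):
    # headset_state: channel -> value inferred by the headset, e.g.
    #   {"brow": "neutral", "eyes": "blink", "mouth": "weak smile"}
    # manual_overrides: the subset of channels the user controls by hand.
    # Overridden channels win; all others follow the headset.
    return {channel: manual_overrides.get(channel, value)
            for channel, value in headset_state.items()}
```

With an empty override map the avatar is fully headset-driven; overriding only "brow" leaves eye, mouth, and affect control with the headset, as described above.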
5.2.2 Traditional Environment

A “traditional” instant messaging application was approximated by removing avatar
and affect meters from the original EmoChat application, so that subjects would be
required to convey emotional information strictly through text. This trimmed-down
IM environment presents subjects with standard chat input and output panes.

5.2.3 Application Architecture

The EmoChat application architecture follows the client/server model. A server
application listens for connections from networked client applications. After a
connection is established, the server monitors for data transmissions. Once data is
received, the server retransmits the data to all other connected clients. The clients and
server may reside on one or more computers networked across a LAN or WAN. This
configuration allows the server to log all communication events for offline analysis
without requiring any extra effort by client users.

Figure 5.2 EmoChat server application
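The relay behavior of the server (receive from one client, retransmit to all others, log everything) can be modeled compactly. The sketch below is a behavioral model in Python, not the actual C# server; the class and method names are invented for illustration:

```python
import time

class RelayServer:
    """Behavioral model of the EmoChat server: each received message is
    logged with a timestamp and retransmitted to every other client."""

    def __init__(self):
        self.inboxes = {}  # client_id -> messages delivered to that client
        self.log = []      # (timestamp, sender_id, payload) tuples

    def connect(self, client_id):
        # Register a new client with an empty delivery queue.
        self.inboxes[client_id] = []

    def receive(self, sender_id, payload):
        # Log the event, then fan the payload out to all *other* clients.
        self.log.append((time.time(), sender_id, payload))
        for client_id, inbox in self.inboxes.items():
            if client_id != sender_id:
                inbox.append(payload)
```

Because every message, facial-expression update, and affect update passes through `receive`, the time-stamped log falls out of the design for free, which is what enables the offline analysis described above.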
EmoChat introduces a novel way to share emotional information during instant
message communication by integrating with one of the first commercially available
brain-computer interface devices. This application not only demonstrates a new and
interesting way to enrich computer-mediated communication, but also serves as an
evaluation tool for the Epoc EEG headset and its inference algorithms. The results
from studies performed with EmoChat may contribute toward a foundation for future
research into BCI applications.
5.3 Experimental Design
The present study follows a crossover-repeated measures design (within-subjects).
Paired subjects spend time chatting with one another using both the EmoChat
application (HE condition), and a “traditional” instant messaging environment (TE
condition). The order in which subjects are presented with these environments is
determined by random assignment to one of two groups.
Group   Condition Order
I       TE then HE
II      HE then TE
Table 5.2 EmoChat experimental groups
Additionally, roles of expresser (EX) and perceiver (PX) are divided between each
subject pair. These roles are used to determine which version of the Emotional
Transfer Accuracy Questionnaire (ETAQ) is completed (discussed below). EX and
PX roles are swapped by paired subjects between each experimental condition.
During each chat session, participants are asked to chat for 15 minutes about any
topic(s) of interest. Discussion guides for eliciting emotional responses were
considered but decided against. It was thought that since this was a repeated
measures design, two separate topic lists would need to be generated; one for each
experimental condition. These two topic lists would need to elicit similar emotional
responses in order to generate data that were comparable. The challenges posed by
creating two equivalent topic lists led to the decision to eliminate them entirely and to
allow subjects to guide their own conversations. It was also thought that unguided
conversation would lead to a more natural exchange.
5.3.1 Measures
5.3.1.1 Questionnaires
Subjects complete a number of surveys during the experiment. Basic demographic
information and responses to the International Positive Affect Negative Affect
Schedule Short Form (I-PANAS-SF) are collected prior to the experiment. After
participating in each condition, subjects complete another battery of questionnaires,
including the Emotional Transfer Accuracy Questionnaire (ETAQ), and Richness of
Experience Questionnaire (REQ). After the HE condition subjects complete the
System Usability Scale (SUS). Descriptions of each survey, and sources, when
applicable, are presented below. Questionnaires are provided in appendices D-F,
where permitted by copyright.
I-PANAS-SF
Prior to the experiment, each subject is asked to complete Thompson’s International
Positive Affect Negative Affect Schedule Short Form (I-PANAS-SF) (Thompson,
2007), a shortened, but psychometrically sound version of the original PANAS
developed by Watson and colleagues (Watson, Clark, & Tellegen, 1988), in order to
assess prevailing emotional state at the time of the experiment.
ETAQ
Subjects from both experimental conditions are asked to complete a novel
questionnaire designed to assess how accurately emotional information was conveyed
and perceived during the condition (Appendix E). Wording of the questions is
slightly altered, depending on subject role. Subjects are asked to indicate, “how often
during this chat session [EX role: you / PX role: your partner] experienced the
following emotions,” and are presented with a list of 16 emotional adjectives; 4
representative adjectives from each quadrant of the arousal/valence space defined by
Russell’s circumplex model of affect (Russell, 1980). Subjects are asked to rate
frequency of experience on a five-point likert-scale from never/very seldom to very
often (see Appendix D for the complete version of the emotional transfer accuracy
questionnaire).
REQ
The concept of “richness” has been applied in related studies of instant message
communication involving expressive avatars, and has been used to compare IM
applications having expressive avatar capabilities with those that do not (Fabri, 2006;
Neviarouskaya, 2008). Fabri defines a high degree of “richness” as manifesting
through greater task involvement, greater enjoyment, a higher sense of presence, and
a higher sense of copresence (Fabri, 2006), while Neviarouskaya defines “richness”
in terms of interactivity, involvement, copresence, enjoyment, affective intelligence,
and overall satisfaction (Neviarouskaya, 2008). For the purposes of the present study,
a combination of richness components from both Fabri and Neviarouskaya are
adapted and measured, and include: task enjoyment, copresence, affective
intelligence, and overall satisfaction. Questions related to each richness component
are presented and responses are collected on a 5-point symmetric likert-scale ranging
from strongly disagree to strongly agree (See Appendix F for complete version of
richness questionnaire). Subjects from both experimental conditions are asked to
complete this richness questionnaire.
SUS
The usability of any new system is an important measure that should be used to
inform future design revisions. Although the usability of the prototype EmoChat
system is expected to be lower than that of an established traditional instant
messaging environment, it is nonetheless a very worthwhile measure. The present
study implements Brooke’s “Quick and Dirty” system usability scale (SUS) to obtain
a subjective account of usability for the HE condition only (Brooke, 1996) (see
Appendix F for a complete version of the SUS questionnaire).
5.3.1.2 Chat Transcripts
Logs of chat sessions are collected by the EmoChat server for offline analysis. These
logs include time-stamped messages that are exchanged between participants and in
the HE condition, time-stamped facial expression and affect information that is either
automatically generated by the Epoc headset, or manually generated by user
manipulation of the EmoChat override controls.
Of particular interest to the present study is the number of affect and affect-related
terms that are used during each experimental condition. To facilitate the recognition
and frequency analysis of these specific terms, the Linguistic Inquiry and Word
Count tool developed by Pennebaker and colleagues is used (Pennebaker, Booth, &
Francis, 2007). This tool was created to aid in the measurement of specific types of
words representing 74 different linguistic dimensions, and will be used to analyze
chat logs from both experimental conditions.
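As a rough illustration of this kind of word-count analysis, the rate of affect terms in a transcript can be computed against a category word list. LIWC itself uses large, validated category dictionaries; the tiny lexicon in the sketch below is a placeholder, and the function is an assumption for illustration rather than LIWC's actual algorithm:

```python
def affect_term_rate(transcript, affect_lexicon):
    # Fraction of words in the transcript that appear in the affect
    # lexicon (a toy stand-in for one LIWC category dictionary).
    words = [w.strip(".,!?;:\"'()").lower() for w in transcript.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in affect_lexicon)
    return hits / len(words)
```

Comparing this rate between the TE and HE chat logs is the kind of per-condition frequency comparison the LIWC analysis supports.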
5.3.1.3 Unstructured Interviews
After subjects have completed the experiment, they join the facilitator as a pair for a
short informal interview to collect opinions about each experimental condition.
Discussion topics and subjective experiences are documented for each pair of
subjects. The data collected during the interviews is used to supplement other
measures taken during the study.
5.4 Experimental Setup
Two laptop computers (Apple MacBook and IBM ThinkPad X41) were located in lab
areas separated by a closed door. The computers were both running Windows XP
and connected to a private network via Ethernet cable and a Linksys WRT160N
router. An external standard-sized mouse was provided so that subjects would not
have to interact with the computers by using the built-in track pad (MacBook) or
isometric joystick (ThinkPad). A wireless dongle for communicating with the Epoc
headset was connected to the MacBook.
5.5 Experimental Procedures
Subjects recruited for the study (N=10) were aged 22 to 32 years (mean=28.1).
Gender of subjects was divided with 7 males and 3 females participating. All
subjects reported having an intermediate or higher level of computer experience
(mode=intermediate) and between basic and expert typing skills
(mode=intermediate). Subjects’ highest level of education ranged from high school
to bachelor’s degree (mode=bachelor’s degree). Reported use of instant message
applications varied from never to every day (mode=every day).
All subjects scored extremely low on the negative affect component of the I-PANAS-SF
(mean=5.6 out of 25) and scored between 10 and 22 on the positive affect
component (mean=16.4 out of 25).
Subjects who had previously participated in the Epoc validation study (N=5) were
paired with subjects who had no prior experience with the headset. Paired subjects
were randomly assigned to one of the two experimental groups described in section
(5.3). Six subjects were assigned to experimental group I and four subjects were assigned
to experimental group II.
Subject pairs arrived and were asked to review and sign consent forms. One subject in
each pair had participated in the previous Epoc validation study. Subjects were
briefed on the tasks they were asked to complete, including participating in two 15
minute chat sessions, and filling out several questionnaires. The pre-experiment
questionnaire was administered, which included basic demographic information and
the I-PANAS-SF (International Positive Affect Negative Affect Schedule Short
Form). Prior to the experiment, subject pairs were randomly assigned to one of the
experimental groups described above.
Subjects were separated in their respective areas of the lab, and the facilitator
initialized the chat application corresponding to the first experimental condition. In
the (HE) condition, the facilitator fitted the (EX) subject with the Epoc headset, and
then briefly explained the functions of the EmoChat application independently to both
subjects. The facilitator then asked the subjects to begin chatting, and left the room.
After 15 minutes had elapsed, the facilitator informed the subjects that the first chat
session had concluded, and administered a battery of questionnaires containing the
Emotional Transfer Accuracy Questionnaire (ETAQ) and the Richness of Experience
Questionnaire (REQ). In the (HE) condition, a third System Usability Scale (SUS)
questionnaire was also administered. While the questionnaires were being completed,
the facilitator initialized the chat application corresponding to the second
experimental condition. Once both subjects completed their respective
questionnaires, the facilitator asked the subjects to begin the second 15 minute chat
session and left the room. At the conclusion of the second chat session, subjects were
asked to complete another battery of questionnaires including the ETAQ, the
Richness of Experience Questionnaire, and in the (HE) condition, the SUS.
After both subjects completed this final battery of questionnaires they joined the
facilitator for a short informal interview about their experiences with each chat
application. Questions were asked, such as, “What did you think of each chat
program,” “how often did you manually change the avatar,” and “which program did
you enjoy more,” however the interview was open ended and frequently led to
participants discussing a very wide range of topics.
5.6 Results and Analysis
5.6.1 Emotional Transfer Accuracy
The results from the ETAQ were used to identify differences in emotional transfer
accuracy between conditions. Subject pairs chatted with each other during both
experimental conditions: traditional environment (TE) and the EmoChat environment
(HE). During each condition, one subject was assigned the expresser role (EX), and
one subject was assigned the perceiver role (PX). After each chat session concluded,
the subject with the EX role answered questions on the ETAQ about him/herself, e.g.,
“How often did YOU experience the following emotions…,” while the subject with
the PX role answered questions about his/her partner, e.g., “How often did YOUR
PARTNER experience the following emotions.” This procedure yielded matched sets
of responses for each experimental condition.
The absolute value of the difference between the rank of each matched response was
calculated for each item on the ETAQ and then summed to give a total score
indicating how accurate emotional transfer was for that condition. A score of 0
indicates perfect transfer accuracy, i.e., EX and PX roles selected exactly the same
responses for every item on the questionnaire. The ETAQ scores for each subject
pair are presented below.
                 ETAQ Score (TE)   ETAQ Score (HE)
Subject-Pair 1          8                 5
Subject-Pair 2         20                11
Subject-Pair 3          7                10
Subject-Pair 4          9                11
Subject-Pair 5          5                 8
Mean                  9.8                 9

Table 5.3 ETAQ scores for each subject-pair and both experimental conditions
To determine whether the mean scores for each condition were statistically different,
the Wilcoxon signed rank test was used to calculate a significance value. Results
from the test indicate that emotional transfer accuracy was not significantly different
between conditions (p=.891).
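The Wilcoxon signed-rank statistic itself is straightforward to compute; the sketch below produces only the test statistic W, whereas the significance values reported in this thesis were obtained from SPSS. Zero differences are dropped and tied absolute differences share an average rank:

```python
def wilcoxon_w(a, b):
    # Wilcoxon signed-rank statistic for paired samples a and b: rank the
    # nonzero |differences| (ties get average ranks), then return the
    # smaller of the positive- and negative-difference rank sums.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j + 1 < len(diffs) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_pos, w_neg)
```

The statistic would then be referred to the Wilcoxon distribution (or a normal approximation) for a p-value, which is the step SPSS performed here.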
5.6.1.1 Effects of Avatar Features on Perception of Affect
Additional analysis was performed in the HE condition by testing for correlation
between avatar features of EX role with perception of affect in the PX role (measured
by the ETAQ). This was a way to see how changes in the EmoChat affectiv meters
and avatar expressions influenced the perception of emotions by the chat partner.
Headset data recorded during the chat session were used to calculate average levels of
excitement, frustration, engagement, and meditation displayed by the affect meters.
Headset data were also used to calculate the proportion of time that avatars displayed
mouth and brow facial expression features, e.g., smile, neutral mouth, raised brow,
etc. Data calculated from the headset were tested for correlation with the non-headset
partner’s responses on the ETAQ, e.g., “how often was your partner annoyed,” scored
on a 5 point likert-scale from seldom to very often. A summary of salient
relationships is provided below, and an abridged correlation matrix supporting these
relationships follows in table 5.4.
Strong Correlation (r² ≥ .95)
1. Participants with avatars displaying a lower ratio of smiles were perceived as
annoyed more often.
2. Participants with avatars displaying higher average engagement levels were
perceived as annoyed more often.
Moderate Correlation (r² ≥ .80)
1. Participants with avatars displaying higher average excitement levels were
perceived as pleased more often.
2. Participants with avatars displaying higher average frustration levels were
perceived as satisfied more often.
Table 5.4 Spearman correlation matrix between avatar features and perceived frequency of emotional states (N=5)

             |-------- Affect Meters --------|  |------- Lower Face (Mouth) ------|  |-- Upper Face (Brow) --|
             eng    excLT  excST  med    frus    clench  neutral  smile   laugh      furrow  neutral  raise
angry        .866   .866   .577   -.866  .000    .289    .866     -.866   -.289      .866    -.289    -.577
  r-squared  .750   .750   .333   .750   .000    .083    .750     .750    .083       .750    .083     .333
annoyed      .975   .564   .205   -.718  .359    .462    .821     -.975   -.154      .718    -.103    -.872
  r-squared  .950   .318   .042   .516   .129    .213    .674     .950    .024       .516    .011     .761
frustrated   -.051  .154   -.103  -.667  .103    .667    -.154    .051    -.462      -.154   .667     -.205
  r-squared  .003   .024   .011   .445   .011    .445    .024     .003    .213       .024    .445     .042
bored        .289   -.289  -.289  .577   .577    .000    .289     -.289   .000       .000    -.289    -.289
  r-squared  .083   .083   .083   .333   .333    .000    .083     .083    .000       .000    .083     .083
tired        .289   .289   .289   .289   .289    .000    .577     -.289   -.577      .289    -.577    .000
  r-squared  .083   .083   .083   .083   .083    .000    .333     .083    .333       .083    .333     .000
interested   .289   .577   .289   -.866  .000    .577    .289     -.289   -.577      .289    .289     -.289
  r-squared  .083   .333   .083   .750   .000    .333    .083     .083    .333       .083    .083     .083
astonished   -.866  -.289  .000   .577   -.289   -.289   -.577    .866    -.289      -.577   .000     .866
  r-squared  .750   .083   .000   .333   .083    .083    .333     .750    .083       .333    .000     .750
excited      -.211  .527   .527   -.580  -.580   .000    -.053    .211    -.316      .158    .053     .369
  r-squared  .044   .278   .278   .336   .336    .000    .003     .044    .100       .025    .003     .136
happy        .000   .707   .707   -.354  -.354   .000    .354     .000    -.707      .354    -.354    .354
  r-squared  .000   .500   .500   .125   .125    .000    .125     .000    .500       .125    .125     .125
pleased      .000   .738   .949   -.158  -.791   -.632   .369     .000    -.105      .580    -.791    .580
  r-squared  .000   .544   .900   .025   .625    .400    .136     .000    .011       .336    .625     .336
content      -.369  .000   .158   .527   .053    -.105   .000     .369    -.632      -.211   -.316    .527
  r-squared  .136   .000   .025   .278   .003    .011    .000     .136    .400       .044    .100     .278
relaxed      .205   -.051  .103   .616   .154    -.410   .359     -.205   .103       .205    -.667    .051
  r-squared  .042   .003   .011   .379   .024    .168    .129     .042    .011       .042    .445     .003
satisfied    .224   -.224  -.447  .224   .894    .671    .224     -.224   -.671      -.224   .224     -.447
  r-squared  .050   .050   .200   .050   .800    .450    .050     .050    .450       .050    .050     .200
5.6.2 Richness of Experience

Subject responses to the Richness of Experience questionnaire were used to
determine whether “richness” was higher in the headset condition than in the
traditional condition. Responses were analyzed a number of different ways. First,
mean scores were calculated per condition for each question. A summary of these
scores compared between conditions is presented in the figures below.

Figure 5.3 Mean scores from the richness questionnaire, questions 1-5

Figure 5.4 Mean scores from the richness questionnaire, questions 5-10

It is apparent from the visualization that mean scores for all but Q5 are higher in the
headset environment than in the traditional environment. To determine if the
difference in mean scores for each question is significant, the Wilcoxon signed-rank
test was performed on the results. This particular test for significance was selected
instead of the paired t-test or ANOVA because the data do not have normal
distributions or equal variances, and only two conditions are being compared.
Question                                                                        TE-Mean  HE-Mean  Wilcoxon Sig.
Q1   I felt it was important to respond after each of my partner’s statements     4.1      4.2        -
Q2   I was interested in my partner’s responses                                   4.3      4.5        -
Q3   I enjoyed communicating with my partner                                      4.4      4.5        -
Q4   I had the sensation that my partner was aware of me                          3.9      4.3        -
Q5   I felt as though I was sharing the same space with my partner                3.3      3.3        -
Q6   I had the sensation that my partner was responding to me                     4.1      4.2        -
Q7   I was able to successfully convey my feelings with this application          3.3      3.9        -
Q8   My partner was able to successfully convey feelings with this application    3.0      3.7       .05
Q9   I understood the emotions of my partner                                      2.8      3.6       .05
Q10  I am satisfied with the experience of communication with this system         3.7      4.1        -
Table 5.5 Wilcoxon's signed rank test for significant difference between score means (N=10)
This analysis suggests that responses to Q8 and Q9 are significantly different (p=.05)
between experimental conditions. Means of the remaining questions from the survey,
although typically higher in the HE condition, do not differ significantly between
conditions.
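For illustration, the paired comparison described above can be sketched in pure Python. The function below computes the standard signed-rank statistic W; the score vectors are hypothetical stand-ins, not the study's actual per-subject data.

```python
def wilcoxon_signed_rank(a, b):
    """Wilcoxon signed-rank statistic W for paired samples a, b:
    rank the nonzero |differences| (ties receive the average rank)
    and return the smaller of the positive- and negative-rank sums."""
    diffs = [y - x for x, y in zip(a, b) if y != x]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover the whole block of tied |differences|
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_pos, w_neg)

# Hypothetical paired scores for one question: traditional environment
# (TE) versus headset environment (HE) for the same ten subjects.
te = [3, 3, 4, 3, 2, 3, 4, 3, 3, 2]
he = [4, 4, 4, 5, 3, 4, 4, 4, 3, 3]
print(wilcoxon_signed_rank(te, he))
```

The resulting W would then be compared against critical-value tables (or a normal approximation) for the effective sample size; in practice a statistics package such as SPSS, as used in this study, handles that step.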
5.6.2.1 Differences between Headset and Non-Headset Users in EmoChat
During the HE condition (when subjects were chatting with the EmoChat application)
only one subject from each pair was wearing an Epoc headset. This provided an
opportunity to identify differences between richness of experience when using a
headset with the application, which allows the automatic update of avatar expression
and affective meters, and when not using a headset, which requires manual updating.
Mean scores of REQ responses can be visualized in the figures below.

Figure 5.5 Comparison of mean responses to REQ between subjects with headsets
versus without headsets, in the EmoChat condition, Q1-5 (N=5)

Figure 5.6 Comparison of mean responses to REQ between subjects with headsets
versus without headsets, in the EmoChat condition, Q5-10 (N=5)
The Wilcoxon signed-rank test for significant differences did not produce any p-
values at or below .05 for this dataset, largely due to the small sample size (N=5);
however, responses with the largest mean differences between user groups (Q1, Q8,
and Q9) may still suggest meaningful relationships. Additional studies are needed to
determine any significance.
Compared with the headset users, non-headset users typically felt that it was more
important to respond to each of their chat partners’ statements, and believed that their
partners were more able to successfully convey feelings with the application. Non-
headset users also typically indicated that they were better able to understand the
emotions of their headset-wearing chat counterparts.
5.6.3 Chat Transcripts
Chat logs were analyzed by using the Linguistic Inquiry and Word Count (LIWC)
tool developed by Pennebaker and colleagues (Pennebaker, et al., 2007). This
application reports on the frequency of words belonging to specific linguistic
categories within the text that it processes, and has been used to analyze both
transcribed spoken passages, and written text. Output from the application is entirely
context-independent, meaning that the sentence “Quick brown foxes jump over
dogs,” produces the exact same output as a nonsensical sentence using the same
words, like, “Brown over jump foxes dogs quick.” The present study was primarily
interested in comparing the difference in number of affect-related terms that appear in
the transcripts from each experimental condition; however, additional linguistic
categories are considered in the analysis.
Transcripts from each trial were analyzed with LIWC. Output from the tool consisted
of frequencies of words from over 70 different categories. Output was analyzed with
SPSS for significant differences between experimental conditions using Wilcoxon’s
signed-rank test. Results are provided below.
Category      HE-Mean   TE-Mean   Mean Dif.   Wilcoxon sig.
word count    1376.8    1566.4    -189.6      0.05
affect (%)    4.66      3.44      1.22        0.05
you (%)       1.608     1.072     0.536       0.05
relative (%)  5.988     7.086     -1.098      0.05

Table 5.6 Linguistic categories with significant differences between experimental conditions
This analysis suggests that the HE condition corresponds with significantly lower
word count and fewer relativity-related words, and significantly more affect-related
and you-related words. These metrics are described below in more detail.
Word Count
This metric represents the total number of words in the processed text and includes
both words that are found in the program’s dictionary file, e.g., “together,” and
nonsense words like, “orrrrly.” Most other LIWC generated metrics are calculated as
percentage of total word count. For all subject pairs that participated in the
experiment, total word count was smaller in the HE condition than in the TE
condition.
Affect Processes
The affect word category consists of 916 distinct emotion-related words, and is
subdivided into positive and negative emotion words. The negative emotion category
is further subdivided into anxiety, anger, and sadness categories. Example words for
each category within this hierarchy are presented below.
Word Category           Abbrev    Examples
Affective processes     affect    Happy, cried, abandon
---Positive emotion     posemo    Love, nice, sweet
---Negative emotion     negemo    Hurt, ugly, nasty
------Anxiety           anx       Worried, fearful, nervous
------Anger             anger     Hate, kill, annoyed
------Sadness           sad       Crying, grief, sad
Table 5.7 LIWC affective processes hierarchy
Words that are encountered by the LIWC processing engine are counted as part of all
parent categories, thus an anger-related word would also be counted as a negative
emotion word, and an affective process word.
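LIWC itself is a proprietary tool, but the hierarchical counting scheme just described — a matched word is credited to its own category and to every parent category, with metrics reported as a percentage of total word count — can be sketched with a toy dictionary. The words and categories below are illustrative stand-ins, not LIWC's actual dictionary:

```python
# Toy sketch of LIWC-style counting: each matched word increments its
# category and all ancestor categories; metrics are percentages of the
# total word count. Dictionary contents are illustrative only.
CATEGORY_PARENTS = {
    "anger": "negemo", "negemo": "affect", "posemo": "affect", "affect": None,
}
WORD_CATEGORIES = {"happy": "posemo", "nice": "posemo",
                   "hate": "anger", "ugly": "negemo"}

def liwc_counts(text):
    words = text.lower().split()
    counts = {c: 0 for c in CATEGORY_PARENTS}
    for w in words:
        cat = WORD_CATEGORIES.get(w.strip(".,!?"))
        while cat is not None:          # walk up the category hierarchy
            counts[cat] += 1
            cat = CATEGORY_PARENTS[cat]
    total = len(words)
    return total, {c: 100.0 * n / total for c, n in counts.items()}

total, pct = liwc_counts("I hate rain but I am happy today")
# "hate" counts toward anger, negemo, and affect; "happy" toward posemo
# and affect, so affect = 2 of 8 words = 25%.
```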
Individually, the ratios of positive and negative emotion words to total word count do
not differ significantly between HE and TE conditions, but collectively as affect
words, ratios were higher in the HE condition for all subject pairs.
You Words
This category consists of 20 different second-person pronoun words, such as you,
your, and thou. It is a child of the personal pronouns category, which itself is a child
of the total pronouns category. You-word ratios were higher in the HE condition than
in the TE condition for all subject pairs.
Relativity Words
This category consists of 638 relativity-related words. It is a top-level category with
motion, space, and time categories as children. The hierarchy is presented below with
examples.
Word Category   Abbrev    Examples
Relativity      relative  Area, bend, exit, stop
---Motion       motion    Arrive, car, go
---Space        space     Down, in, thin
---Time         time      End, until, season
Table 5.8 LIWC relativity hierarchy
Ratios of child categories motion, space, and time were not significantly different
between HE and TE conditions, but ratios of the collective relativity-words category
were smaller in the HE condition for all subject pairs.
5.6.4 System Usability
After subjects finished the HE test condition they were asked to rate the EmoChat
application in terms of usability by completing the SUS questionnaire. The SUS
produces output on a scale from 0 to 100, with higher scores considered to indicate
greater usability. Results for the EmoChat application were typically high (N=10),
ranging from 72.5 to 87.5 (mean=76.25).
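For reference, Brooke's SUS is conventionally scored from the ten 1-5 item responses as sketched below: odd-numbered items contribute (score − 1), even-numbered items contribute (5 − score), and the sum is scaled to 0-100. The example responses are illustrative, not the study's data.

```python
def sus_score(responses):
    """Standard SUS scoring: odd items contribute (score - 1),
    even items (5 - score); the sum is scaled to the 0-100 range."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # i is 0-based, so even i = odd item
                for i, r in enumerate(responses))
    return total * 2.5

# A uniformly neutral respondent (all 3s) scores exactly 50.
print(sus_score([3] * 10))  # -> 50.0
```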
5.6.5 Informal Interviews
Subject pairs jointly participated in informal interviews at the end of the experiment.
They were asked to discuss differences between experimental conditions. The
facilitator led the discussion and manually recorded interesting conversation topics.
Notes from the interviews are presented below.
Subject1 / Subject2
Experimental Group I
Relationship: same sex long-time friends
Both subjects indicated a general distrust of the avatar. Subject2, who was not
wearing the headset, told me that he thought Subject1 may have been manually
manipulating the movement of the avatar and affective bars. After briefly paying
attention to the avatar early in the session, both subjects eventually disregarded the avatar
and affective information presented in the application.
The subjects mentioned that the lack of a “partner is entering a message…” indicator
made it more difficult to have a conversation. Subject2 is a particularly fast typist
and felt as though he was dominating the conversation, since he couldn’t see when his
partner was entering text.
Subject1 mentioned that the eye blink avatar behavior was very accurate. He also
said that he did not manually manipulate any of the Emo controls. Subject2 felt
obligated to manipulate some of the avatar and affective meters after seeing the
movements in Subject1’s Emo controls, however he said his manipulation choices
were random and did not really consciously reflect a desire to convey specific
emotional information.
Subject1 mentioned that the location of the avatar and meters drew his focus away
from the chat window, and was part of the reason for his general disregard.
Subject3 / Subject4
Experimental Group I
Relationship: mixed sex romantic partners
Subjects legitimately enjoyed communicating with the headset system. At one point,
Subject4 made a positive comment about Subject3’s appearance which induced a
very large smile on Subject3’s face. This smile was reflected in the avatar. When
Subject4 noticed the avatar with the large smile she seemed very pleased and giggled.
Subject3 mentioned that the frustration meter may not have been entirely accurate,
but that he did not immediately distrust it. He thought that it was possible that the
frustration meter was not necessarily reflecting frustration caused by the
conversation, but instead caused by the recurrence of pain from a hand burn received
earlier in the day.
Subject3 mentioned that the excitement meter seemed to be the most dynamic and
accurate of the affective meters. At one point during the HE conversation, Subject4
was telling Subject3 a joke. During waiting periods between lines of the joke,
Subject3 noticed that excitement levels and frustration levels were increasing, which
he thought was appropriate. When asked which condition the subjects preferred they
strongly indicated the headset condition. Subject4 noted that she was much more
interested in communicating with Subject3 with the avatar and affective meters
present, than not present. She said the experiment was actually fun.
Subject5 / Subject6
Experimental Group II
Relationship: mixed sex short-term friends
Subject5 reported not manually manipulating the EmoControls, even though he was
informed that they were available for use as he saw fit. Subject6 said she wished she
had a headset; she would have enjoyed the communication more and would have
been able to focus more on the conversation and less on manually making the avatar
face and affective meters accurately reflect her emotional state.
Subject5 stated that he felt the avatar and affective meters were reasonably accurate,
and that more than increased accuracy, he wished the presentation was more
aesthetically pleasing. As it stands, the avatar is very simple, and animations between
states are abrupt.
Subject6 purposely tried influencing the affective levels reported by Subject5 by
telling jokes. Both Subject6 and Subject5 noticed that excitement and engagement
levels increased when Subject5 was told an entertaining joke. Subject5’s avatar also
reflected his smile reaction to the jokes.
Both Subject5 and Subject6 agreed that the (TE) condition was less fun than the (HE)
condition.
Subject7 / Subject8
Experimental Group II
Relationship: brothers
Subject7 mentioned that several times during the conversation he switched to manual
control of the frustration meter, because he didn’t think it was returning to base line
quickly enough.
Subject8 noticed that every time he mentioned his new girlfriend, of whom his
brother Subject7 does not entirely approve, Subject7’s frustration meter spiked.
Subject8 mentioned that he felt that manually manipulating the avatar (since he did
not have a headset) caused him to pay less attention to the content of the dialog, and
more attention to making his avatar reflect his current emotional state. He said that
he wished he also had a headset.
The issue of distrusting the avatar was brought up by Subject8 since participants were
permitted to manually manipulate facial expression and affective states. He
mentioned he would have been happier if the partner was notified whenever manual
EmoControls were invoked.
Subject9 / Subject10
Experimental Group I
Relationship: Mixed sex long-time friends
Subject10 mentioned that Subject9 seemed to be quite frustrated, but Subject9 said he
wasn’t very frustrated. Subject10 also noticed increases in excitement and
engagement when she asked Subject9 about a recent party they both attended.
Subject9 said that he was unsure whether or not Subject10 was wearing a headset for
the experiment. Subject10 mentioned that she didn’t think she was supposed to
divulge that information.
Subject10 characterized the HE condition as “funny”. She appreciated the facial
expressions of her partner’s avatar.
5.6.5.1 Trends Identified During Interviews

Some subjects reported a general distrust of the avatar expressions and affective
meters since subjects were permitted to manually manipulate these features without
their partner being aware. In one case, this distrust caused the chat partners to
completely disregard changes in avatar features, but this was isolated. Most headset
users indicated that they did not manually manipulate anything, opting to let the
output from the headset control any changes to avatar states.
Subject pairs in the EmoChat condition consisted of one partner with a headset and
one partner without. The majority of non-headset users indicated that they wished
they had also been using a headset. It was noted several times that having to
manually manipulate avatar features was time-consuming and took away from the
main conversation. During manual manipulation, focus was shifted from the main
chat window to the override control pane of the application, and substantial effort was
required to alter avatar features to reflect present emotional state.
Participants generally preferred using EmoChat over the traditional environment.
Most agreed that EmoChat provided an environment that was interesting and fun to
use. Several participants spontaneously started telling jokes to their headset-wearing
chat partners because they wanted to see how it affected the avatar features. There
seemed to be a genuine interest in the emotional information presented in changes to
avatar facial expression and movement in the affective meters.
Frustration was perceived as inaccurate by several participants. One participant in
particular switched to manual manipulation of the frustration meter because he
thought it was not returning to baseline quickly enough, and presumably did not want
his partner to conclude that he was frustrated. Other participants noted that their
frustration meter was spiking when they did not perceive themselves as being
particularly frustrated. The meditation meter was mentioned only once during chat
sessions and informal interviews. It was used for comedic effect when one
participant maxed out the level and stated that he was “meditating like a Jedi.” On
the whole, participants felt that excitement and engagement meters were substantially
more accurate than frustration and meditation meters.
5.7 Discussion
5.7.1 Summary of Results
This study has introduced a novel instant messaging application called EmoChat that
integrates with the Emotiv Epoc headset to detect and convey brow, eye, and mouth
movements, and basic affective states of excitement, engagement, frustration, and
meditation.
The main goals of the study were to compare EmoChat with a traditional instant
messaging environment in terms of the amount of emotional information contained in
messages, richness of experience, accuracy of emotional transfer, and usability, and
to evaluate the effectiveness of using a brain-computer interface device like the Epoc
to facilitate emotional transfer in an IM environment.
Results suggest that communication with EmoChat contains a significantly higher
percentage of affect-related and you-related words, than communication with a
traditional environment.
In terms of richness, the data suggests that the level of affective intelligence was
higher with EmoChat than with a traditional environment. Subjects rated their chat
partners as significantly better able to convey emotional information with EmoChat.
Subjects also felt that using EmoChat helped them to understand the emotions of their
partners significantly more than with a traditional environment. The data also
suggests that other measures of richness, including involvement, enjoyment, sense of
copresence, and overall satisfaction were marginally higher with EmoChat.
Accuracy of emotional transfer was not determined to be significantly different
between chat environments; however, additional analysis was able to identify
significant correlation between behavior of the avatar and emotion perceived by the
chat partner.
Usability ratings of the application on the SUS scale ranged from 72.5 to 87.5 with a
mean of 76.25. Subjects who used the headset with EmoChat reported slightly higher
usability ratings (72.5 to 87.5, mean=79) than subjects without the headset (72.5 to
75, mean=74).
5.7.2 Consistency with related work
The results of the present study are partially consistent with findings in related
research, as described in the following subsections.
5.7.2.1 Involvement
Involvement in the present study was measured by responses to Q1-2 on the richness
questionnaire. Responses to the questions were marginally, but not significantly
higher in the EmoChat condition. This is consistent with results from Neviarouskaya
in that no significant differences were found; however, the present study does identify
higher mean differences between conditions than the Neviarouskaya study.
Involvement can also be estimated by characters per minute, total word count, or
number of messages sent. The present study found that total word counts in the
EmoChat condition were typically lower than in the traditional environment. This
finding is inconsistent with work by Fabri that reports considerably higher message
length in an expressive avatar condition. A contributing factor to the lower total word
count in the EmoChat condition might be that only one participant in the subject pair
was using a headset. The other participant was required to manually change avatar
and affective features (if desired). Post-experiment interviews confirm that many
subjects felt like the avatar and affect meters were taking their focus away from the
chat window, and might explain the lower total word count.
5.7.2.2 Enjoyment
Enjoyment was measured by Q3 on the richness questionnaire, and was not
significantly different between conditions. This result is consistent with findings by
Fabri and Neviarouskaya. Although results from the questionnaire did not suggest a
significant difference between the enjoyment of test conditions, during informal
interviews, users of the EmoChat application almost unanimously reported having
more “fun” than with the traditional environment.
5.7.2.3 Copresence
The copresence dimension was measured by Q4-6 on the richness questionnaire.
Analysis did not identify any significant difference between conditions, although
mean differences were typically higher in the EmoChat condition. Fabri also notes
higher levels of copresence in his expressive avatar condition, though in his study the
difference was significant. Copresence in the Neviarouskaya study is arguably higher in her
expressive avatar condition, but not significantly.
5.7.2.4 Affective Intelligence
Affective intelligence in the present study was measured by Q7-9 on the richness
questionnaire. This indicated how well the system was able to facilitate emotional
expression. Responses were generally higher in the EmoChat condition than in the
traditional condition, significantly so for Q8 and Q9. This is inconsistent with the Neviarouskaya study,
which was not able to identify significant differences between comparable conditions.
5.7.2.5 Overall Satisfaction
Overall satisfaction was measured by Q10 on the richness questionnaire. Comparison
of responses between conditions did not identify significant differences; however
mean difference was marginally higher in the EmoChat condition. This is consistent
with the Neviarouskaya study in that no significant differences were found, however
mean differences in the present study are larger.
5.7.2.6 Usability
Participants in the EmoChat condition were asked to rate the usability of the system
with Brooke’s SUS. Responses ranged from 72.5 to 87.5 with a mean of 76.25. This
rating is comparable to the usability of the expressive avatar system in the Fabri study
(mean=78.4).
5.7.3 Future Direction
Although the results of the present study are promising in terms of their support for
the hypothesis that EmoChat facilitates emotional communication more readily than
traditional IM environments, the relatively small sample size (N=10) creates
challenges with respect to determining statistical differences between environments.
The present study lays the foundation for related future studies that include a larger
number of participants.
This study only investigated the use of EmoChat when one participant was using an
Epoc headset. The other participant was required to manually manipulate avatar
features and affective state meters (if desired). It became clear during the study that
the experience of participants who did not have the benefit of the automatic facial
expression and affect meter updates permitted by the headset was very different from
their counterparts’. A second study where both participants use headsets during
EmoChat sessions is likely to generate substantially different data.
5.8 Conclusion

This chapter introduced the EmoChat application and compared it with a traditional
instant messaging environment. The results of the comparison have suggested that
EmoChat facilitates communication that contains more affective information, and
provides a “richer” experience. The data has not demonstrated any significant
difference in the accuracy of the emotional information that is transferred;
however, some correlation was found between the frequency of perceived affective
states and incidence of avatar features including facial expressions and average levels
of affect meters.
6 Conclusion
6.1 Limitations
The studies presented in this work generally produced favorable results; however,
there are some limitations that should be addressed. These limitations are presented
below.
Since conversation topics were not assigned or suggested during study 2, participants
were free to discuss anything that came to mind during the session. Perhaps because
of this freedom, in the EmoChat condition many subjects talked about their avatars,
e.g., “I like your avatar,” or, “you look frustrated.” This may have contributed to the
greater frequency of affect and you-related words measured by the LIWC tool in the
EmoChat condition.
The novel questionnaire used to measure the accuracy of emotional transfer (the
ETAQ), was designed specifically for this experiment, and had not been established
as reliable or valid prior to the study. Results from the survey were difficult to
interpret and compare between experimental conditions. Neither experimental
condition seemed to produce consistent ETAQ results.
The participant sample consisted entirely of young adults, aged 22-32, who all had
substantial computer and typing experience, most of whom used instant messaging
applications very frequently. This sample may not be representative of the general
population, so conclusions that are suggested by this study may not generalize beyond
the sample.
The affectiv suite components of frustration and meditation were included as features
in the EmoChat application, although they had not been shown to accurately reflect
self-reported levels in the previous validation study. As noted in section 5.2.1, it was
a conscious decision to include these additional features to allow participants to
decide for themselves whether or not headset-reported levels of frustration and
meditation were of any value. During post-experiment interviews, several subjects
mentioned that frustration levels reported by the headset may not have been accurate.
This is in agreement with the previous validation study in which no strong correlation
was found between headset and self-reported levels of frustration. References to
meditation in chat transcripts and interviews indicate that the feature was generally
disregarded and not considered an important measure during interpersonal
communication. The decision to include these non-validated features of the affectiv
suite in the EmoChat application may have skewed the study results, particularly with
respect to affect-related word frequency in chat transcripts. Some participants
mentioned frustration during chat sessions, e.g., “are you frustrated?” This may have
caused affect-related word frequency to appear artificially high in the EmoChat
condition. Future studies may omit these non-validated affectiv suite features from
the application entirely.
The EmoChat application used headset output from both the Emotiv affectiv and
expressive suites; however, the previous study was only concerned with validating a
subset of the affectiv suite. The affectiv suite inferences were expected to use
complex classification algorithms to transform raw electroencephalogram (EEG) data
into levels of basic affective states. It was decided that the challenges involved with
this transformation warranted testing to determine accuracy. In contrast, it was
assumed that the facial expression recognition capabilities of the expressive suite
were based on existing research in the comparatively mature facial electromyography
(EMG) field, and did not necessitate validation. The assumptions that were made
during the previous validation study design may have been inappropriate, and it is
noted that a more thorough study should also investigate the accuracy of the Emotiv
expressive suite if these detections are to be used in future work.
The chat log analysis performed with the LIWC tool to identify significant
differences in word-type frequency was primarily concerned with differences in
affect-related words between experimental conditions. For the sake of completeness,
and to identify potential word frequency differences beyond the scope of the study, all
72 word categories were tested between conditions. It is understood that applying
such a large number of tests may have produced results that incorrectly identify
significant differences between conditions, when no true difference exists. It should
be expected that significant differences (p ≤ .05) are incorrectly determined to exist in
5 out of every 100 tests performed. Applied to the present study, this means that of
72 tests performed, 3.6 of those tests are expected to report differences that are falsely
identified as significant. The present study notes that 4 word categories were found
to be significantly different between conditions, but the large number of tests
performed merits careful interpretation of these results. The Bonferroni correction
could have been applied in this situation as a safeguard. This would have
dictated that the standard significance level of (p=.05) be divided by the number of
tests performed (72) to produce a corrected significance level of (p=.000694). Due to
the small initial sample size, testing with this corrected significance level would be
likely to identify false-negatives, and may have failed to detect real significant
differences. Ideally, the study should have tested only in-scope hypotheses, i.e.,
differences in affect-related words only. It might be prudent, however, for any future
related studies that use this methodology to also test for differences in you and
relativity-related word frequency, and total word count between experimental
conditions. If significant differences are also found in future studies, then it becomes
more likely that these differences actually exist and are not appearing purely by
chance. It may also be informative to test for frequency differences in the child word-
categories of positive and negative emotion in addition to the parent affect-related
category.
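The multiple-comparisons arithmetic above can be checked directly:

```python
# Expected false positives at alpha = .05 across 72 category tests, and
# the Bonferroni-corrected per-test threshold discussed above.
n_tests, alpha = 72, 0.05

expected_false_positives = n_tests * alpha   # = 3.6
bonferroni_alpha = alpha / n_tests           # ~ .000694

print(round(expected_false_positives, 1), round(bonferroni_alpha, 6))
```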
6.2 Summary of Contributions
This thesis work has presented two related studies. The overall contributions of each
study are discussed below.
The first study investigated the accuracy of the Emotiv Epoc headset in terms of its
ability to correctly identify levels of excitement, engagement, and frustration. Results
from the study suggest that the Epoc is reasonably capable of inferring levels of
engagement, and to a lesser degree, excitement. This information should be
applicable to any future research that involves the use of the Epoc headset affectiv
suite. Additionally, the techniques used during study 1 may be applicable to
validation studies of alternative systems that infer affective states from physiologic
metrics. Future research may wish to adapt the methods used in the present study to
elicit differing affective responses through controlled variability of game play
difficulty.
The second study introduced the EmoChat application, an instant messaging
environment that integrates with the Epoc headset to convey facial expression
information, and levels of basic affective states during IM communication. EmoChat
was compared with a traditional instant messaging environment, and was found to be
more capable of communicating emotional information. The findings from this study
add to existing bodies of research concerning affective computing, instant messaging,
and brain-computer interfacing. The methods used here may be applicable to future
studies that seek to evaluate the “affective intelligence” capabilities of an instant
messaging environment, regardless of how emotional information may be input to the
system. Researchers are free to adapt the experimental framework presented in this
study, including procedures followed, measures used, and analyses performed. In a
more abstract sense, another contribution of this work is that it establishes the
potential utility of commercial BCI devices in interaction design. This work may
generate additional interest within the research community to assess how available
BCI-based tools can be applied to the field of human-computer interaction, outside of
the traditional role of assistive technology.
6.3 EmoChat Design Considerations

Although EmoChat was generally well received by users, modifications to the
application may improve the overall experience of messaging. The majority of users
did not feel the need to manually override any of the EmoChat controls (such as
locking a particular facial feature, or manually manipulating the affect meters), but
the possibility of doing so without disclosure of manual manipulation to the partner
led some participants to question whether or not they were being deceived, i.e., “is
my partner really smiling, or has he locked that control?” Future iterations of
EmoChat should either inform the partner when override controls are invoked, or
prevent overriding the controls altogether.
The general appearance of the avatar is very basic. One participant mentioned that he
would have had a more enjoyable experience if the aesthetic quality of the avatar had
been improved. Future versions of EmoChat should present more realistic avatars
that perhaps smoothly animate between facial expressions, instead of the current
abrupt transitions.
Avatar placement in the current version of EmoChat is off to the side of the main chat
window. One participant mentioned that having to divert his gaze from the chat window
to see changes in the avatar made doing so less appealing. Subsequent revisions of
the application may investigate placement of the avatar within the main chat window,
or slightly above, to make monitoring changes easier for the user.
6.4 EmoChat Compared with Existing Methods of Emotional Instant Messaging
Existing methods of conveying emotion in instant messaging environments were
reviewed in chapter 3. This section provides a comparison between EmoChat and
applications with similar input and output techniques.
6.4.1 Input Technique
EmoChat uses a form of automated facial expression recognition that is thought to
rely on interpretation by the Epoc headset of electromyographic signals produced by
facial muscle movement. These electrical signals are detected by the Epoc and used
to infer changes in states of brow, mouth, and eye facial features. Existing facial
expression recognition algorithms typically rely on visual input from a camera or
video stream and require that the user’s face remain situated in front of the camera
with no more than a minor angle of incidence. This means that if a user looks down
at the keyboard to type a response, or looks away from the camera to attend to
something else, the system may fail. By contrast, the Epoc headset transmits data
wirelessly to a USB dongle connected to the computer running EmoChat at ranges up
to 15 feet. This provides the user with much greater freedom of movement, at the
expense of cost, setup time, and comfort associated with using an EEG headset.
In addition to facial expression recognition, the Epoc is thought to use
electroencephalographic signals to infer levels of basic affective states including
excitement, engagement, frustration, and meditation. To date, no other instant
messaging application has used EEG input in this manner. This may potentially
provide emotional information that is not associated with overt sentic modulation, and
may be capable of detecting very subtle emotional changes. This is in contrast to
most existing research that uses only overt symptoms of emotional experience to infer
affect, e.g., solely facial expression, haptic techniques, and explicit text or phrase
matching. The addition of covert symptoms of emotional experience in the EmoChat
system permits the communication of emotional information that might be considered
more intimate.
6.4.2 Output Technique
EmoChat presents users with a very simple animated avatar that might be perceived
as far less realistic than some expressive avatars used in other instant message
applications. Additionally, the technology used in EmoChat does not permit
simultaneous automatic changes to both mouth and brow features. This may prevent
the avatar from expressing certain emotions that rely on simultaneous feature
changes, such as fear or sadness. However, the application does provide override
controls to prevent facial features from changing based on headset input, and allows
users to manually select a preferred feature. In this way, a user could manually
override the mouth feature to force a persistent smile while allowing input from the
headset to modulate brow movement.
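The override mechanism described above can be sketched as a small state model: each facial feature may be pinned to a manually chosen value, in which case headset-inferred updates for that feature are ignored. This is a minimal illustration, not the actual EmoChat implementation; the class, feature names, and state labels are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class AvatarFace:
    """Hypothetical model of EmoChat-style feature overrides: each feature
    (brow, mouth, eyes) can be locked to a fixed value, in which case
    headset-driven updates for that feature are ignored."""
    states: Dict[str, str] = field(
        default_factory=lambda: {"brow": "neutral", "mouth": "neutral", "eyes": "open"})
    locks: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"brow": None, "mouth": None, "eyes": None})

    def lock(self, feature: str, value: str) -> None:
        """Pin a feature to a manually selected value."""
        self.locks[feature] = value
        self.states[feature] = value

    def unlock(self, feature: str) -> None:
        """Resume headset control of the feature."""
        self.locks[feature] = None

    def apply_headset_update(self, feature: str, value: str) -> None:
        """Apply a headset-inferred state only if the feature is not locked."""
        if self.locks[feature] is None:
            self.states[feature] = value

face = AvatarFace()
face.lock("mouth", "smile")                    # persistent smile, as in the example
face.apply_headset_update("mouth", "neutral")  # ignored: mouth is locked
face.apply_headset_update("brow", "raised")    # applied: brow is unlocked
print(face.states)  # → {'brow': 'raised', 'mouth': 'smile', 'eyes': 'open'}
```

A design like this also makes the deception concern from section 6.3 easy to address: the `locks` dictionary records exactly which features are manually pinned, so that information could be disclosed to the chat partner.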
This implementation of an animated avatar is in contrast with techniques presented in
section 3.3.2. While existing implementations modulate multiple facial features to
convey a specific emotion, e.g., brow lowered, lips pursed, to indicate anger,
EmoChat takes a more granular approach. Users are free to change facial features
independent of one another. The facial movement in the EmoChat avatar is not
determined by which emotion is inferred or manually selected, but instead attempts to
directly mimic the expression of the user. The ability to combine multiple,
independent facial features lets the EmoChat avatar display a larger number of
distinct expressions than existing methods. By manipulating only brow and mouth
movements, for example, EmoChat can display up to 25 distinct facial expressions.
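The figure of 25 expressions follows from simple combinatorics: independently combinable features multiply, so five brow states crossed with five mouth states yield 5 × 5 = 25 combinations. The specific state labels below are assumptions for illustration; only the count of five states per feature is implied by the thesis's figure.

```python
from itertools import product

# Hypothetical state labels; the thesis reports 25 brow-mouth combinations,
# which is consistent with five independent states per feature.
BROW = ["neutral", "raised", "furrowed", "half-raised", "half-furrowed"]
MOUTH = ["neutral", "smile", "clench", "smirk-left", "smirk-right"]

# Independent features combine multiplicatively.
expressions = list(product(BROW, MOUTH))
print(len(expressions))  # → 25
```

Adding a third independent feature (e.g., eye state) would multiply the count again, which is why this granular approach scales past emotion-template systems that hard-code one whole-face configuration per emotion.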
A drawback to the avatar implementation in EmoChat is that it may not be able to
faithfully reproduce some facial expressions present in existing systems, including
disgust, fear, or surprise, because the individual component images associated with
changes in brow and mouth state are limited.
6.5 Future Directions

Beyond the future directions previously described in sections 4.7.2 and 5.7.3, the
present work may provide impetus for additional related studies.
The existing EmoChat application is designed for a maximum of two users. It may be
interesting to study how similar emotion input and output techniques translate to a
larger group chat environment, and how the group dynamic is changed by adding this
type of emotional information channel.
Although EmoChat was well received, and users seemed genuinely interested in the
emotional information it was providing, the real-world feasibility of such a system
has not been established. A longitudinal study might suggest how, when, and why a
system like this might be used, and by whom.
Presently, the EmoChat application does not make inferences as to where a particular
user’s experience of emotion lies in arousal-valence space. It may be possible to use
data recorded by the headset to classify emotional experience into some set of
discrete categories, e.g., positive/negative valence, or high/low arousal.
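Such a classification could be as simple as thresholding the two axes of Russell's circumplex model. The sketch below is a hypothetical illustration of the proposed direction, not part of EmoChat; it assumes valence and arousal estimates have already been normalized to [-1, 1].

```python
def classify_affect(valence: float, arousal: float) -> str:
    """Map a point in arousal-valence space (each axis normalized to
    [-1, 1]) to one of four discrete quadrants, following Russell's
    circumplex model of affect."""
    v = "positive" if valence >= 0 else "negative"
    a = "high" if arousal >= 0 else "low"
    return f"{v} valence / {a} arousal"

# Excited-type states sit in the positive-valence, high-arousal quadrant;
# bored-type states in the negative-valence, low-arousal quadrant.
print(classify_affect(0.7, 0.6))    # → positive valence / high arousal
print(classify_affect(-0.5, -0.4))  # → negative valence / low arousal
```

A real system would need a principled mapping from headset-derived measures (e.g., excitement, frustration) onto these axes, which is itself an open research question.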
6.6 Closing
The work presented in this thesis suggests that computer-mediated communication
may benefit from the integration of brain-computer interface devices like the Emotiv
Epoc for facilitating communication that contains more emotional information than
traditional environments. These results may generalize to the larger affective
computing domain if states like engagement and excitement can be accurately
inferred. One might envision using this technology in a virtual environment setting
like Second Life, to endow an avatar with a greater degree of humanity, or in a
distance-learning scenario, to monitor the interest level of students. It is up to the
scholars and proponents of human-centered computing and its related disciplines to
investigate the best uses for this technology, perhaps, in applications that have not yet
been conceived. The author believes that future iterations of BCI technology will
incorporate greatly increased capabilities, opening up a vast number of new
possibilities for human-machine interaction, and he is proud to present the modest
findings of this thesis to the community.
Appendix A: Validation Study Demographics Questionnaire

Name: _________________________________ Age: ________ Gender: ________

1. How often do you play computer or console games?
   Never | Rarely | Occasionally | Frequently | Very Frequently

2. How familiar are you with the game of Tetris?
   Not at All Familiar | Somewhat Familiar | Moderately Familiar | Pretty Familiar | Very Familiar

3. How often do you play Tetris, or a similar game?
   Never | Rarely | Occasionally | Frequently | Very Frequently

4. What is your typical skill level when it comes to playing computer or console games?
   Far Below Average | Below Average | Average | Above Average | Far Above Average

5. What are some of your favorite computer or console games, if any?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________
Appendix B: Validation Study Experiment Questionnaire

Round [n]

Scale: 1 = Not at All, 2 = Slightly, 3 = Moderately, 4 = Very, 5 = Extremely

How Difficult was this round?                               1 2 3 4 5
How Fun was this round?                                     1 2 3 4 5
On average, how Engaged did you feel during this round?     1 2 3 4 5
On average, how Excited did you feel during this round?     1 2 3 4 5
On average, how Frustrated did you feel during this round?  1 2 3 4 5

(repeated for each round of play)
Appendix C: Validation Study Post-Experiment Questionnaire

1. What types of events caused you to get excited during game play?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

2. What makes a game like Tetris engaging?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

3. What happened during the game that made you feel frustrated?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

4. What makes Tetris fun?
   _____________________________________________________________________
   _____________________________________________________________________
   _____________________________________________________________________

5. Do you have any additional comments?
   _____________________________________________________________________
   _____________________________________________________________________
Appendix D: EmoChat Demographics Questionnaire

Name: _________________________________ Age: ________ Gender: ________

1. Rate your level of computer experience.
   None | Basic | Intermediate | Expert

2. Rate your keyboard typing skills.
   None | Basic | Intermediate | Expert

3. Please indicate your highest level of education completed.
   None | High School | Some College | Bachelor’s Degree | Master’s Degree | PhD or Higher

4. How often do you use Instant Message tools, such as MSN, AIM, ICQ, etc.?
   Never | Occasionally | Regularly | Every Day
Appendix E: EmoChat Emotional Transfer Questionnaire

During this chat session, how often did [EX role: you | PX role: your chat partner] experience the following emotions:

Scale: 1 = Very Seldom or Not at All, 2 = Seldom, 3 = Sometimes, 4 = Often, 5 = Very Often
Afraid 1 2 3 4 5
Angry 1 2 3 4 5
Annoyed 1 2 3 4 5
Astonished 1 2 3 4 5
Bored 1 2 3 4 5
Content 1 2 3 4 5
Excited 1 2 3 4 5
Frustrated 1 2 3 4 5
Happy 1 2 3 4 5
Interested 1 2 3 4 5
Miserable 1 2 3 4 5
Pleased 1 2 3 4 5
Relaxed 1 2 3 4 5
Sad 1 2 3 4 5
Satisfied 1 2 3 4 5
Tired 1 2 3 4 5
Appendix F: EmoChat Richness Questionnaire

Scale: Strongly Disagree | Disagree | Neither | Agree | Strongly Agree

1. I felt it was important to respond after each of my partner’s statements.
2. I was interested in my partner’s responses.
3. I enjoyed communicating with my chat partner.
4. I had the sensation that my partner was aware of me.
5. I felt as though I was sharing the same space with my partner.
6. I had the sensation that my partner was responding to me.
7. I was able to successfully convey my feelings with this application.
8. My partner was able to successfully convey his or her feelings with this application.
9. The emotional behavior of the avatars was appropriate.
10. I understood the emotions of my partner.
11. I am satisfied with the experience of communicating with this system.
Bibliography
Andrei, S. (2010). Toward Natural Selection in Virtual Reality. IEEE Computer Graphics and Applications, 30, 93-96.
Brooke, J. (1996). A "Quick and Dirty" Usability Scale. In Jordan, McLelland, Thomas & Weerdmeester (Eds.), Usability Evaluation in Industry (pp. 189-194): Taylor & Francis.
Campbell, A., Choudhury, T., Hu, S., Lu, H., Mukerjee, M. K., Rabbi, M., et al. (2010). NeuroPhone: brain-mobile phone interface using a wireless EEG headset. Paper presented at the Proceedings of the second ACM SIGCOMM workshop on Networking, systems, and applications on mobile handhelds.
Chanel, G., Rebetez, C., Betrancourt, M., & Pun, T. (2008). Boredom, engagement and anxiety as indicators for adaptation to difficulty in games. Paper presented at the Proceedings of the 12th international conference on Entertainment and media in the ubiquitous era.
Ekman, P., & Oster, H. (1979). Facial Expressions of Emotion. Annual Review of Psychology, 30(1), 527-554.
Fabri, M. (2006). Emotionally Expressive Avatars for Collaborative Virtual Environments. Leeds Metropolitan University.
Fabri, M., & Moore, D. (2005, October). Is Empathy the Key? Effective Communication via Instant Messaging. Paper presented at the Proceedings of the 11th EATA International Conference on Networking Entities, St. Poelten, Austria.
James, W. (1884). What is an Emotion? Mind, 9(34), 188-205.
Kaliouby, R. E., & Robinson, P. (2004). FAIM: integrating automated facial affect analysis in instant messaging. Paper presented at the Proceedings of the 9th international conference on Intelligent user interfaces.
Kayan, S., Fussell, S. R., & Setlock, L. D. (2006). Cultural differences in the use of instant messaging in Asia and North America. Paper presented at the Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work.
Khalili, Z., & Moradi, M. H. (2009). Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG. Paper presented at the Proceedings of the 2009 international joint conference on Neural Networks.
Kiesler, S., Zubrow, D., Moses, A. M., & Geller, V. (1985). Affect in computer-mediated communication: an experiment in synchronous terminal-to-terminal discussion. Hum.-Comput. Interact., 1(1), 77-104.
Lang, P. J. (1995). The Emotion Probe: Studies of Motivation and Attention. American Psychologist, 50(5), 372-385.
Lo, S.-K. (2008). The Nonverbal Communication Functions of Emoticons in Computer-Mediated Communication. CyberPsychology & Behavior, 11(5), 595-597.
Neviarouskaya, A. (2008). AffectIM: An Avatar-based Instant Messaging System Employing Rule-based Sensing. Unpublished MS, University of Tokyo, Tokyo.
Neviarouskaya, A., Prendinger, H., & Ishizuka, M. (2007). Recognition of Affect Conveyed by Text Messaging in Online Communication. In Online Communities and Social Computing (Vol. 4564, pp. 141-150). Heidelberg: Springer Berlin.
Oakley, I., & O’Modhrain, S. (2003). Contact IM: Exploring asynchronous touch over distance. Paper presented at CSCW '02.
Pantic, M., Sebe, N., Cohn, J. F., & Huang, T. (2005). Affective multimodal human-computer interaction. Paper presented at the Proceedings of the 13th annual ACM international conference on Multimedia.
Pennebaker, J. W., Booth, R., & Francis, M. (2007). Linguistic Inquiry and Word Count: LIWC2007.
Picard, R. W. (1997). Affective computing: MIT Press.
Reisenzein, R. (2006). Arnold's theory of emotion in historical perspective. Cognition and Emotion, 20(7), 32.
Rivera, K., Cooke, N. J., & Bauhs, J. A. (1996). The effects of emotional icons on remote communication. Paper presented at the Conference companion on Human factors in computing systems: common ground.
Rovers, A. F., & Essen, H. A. v. (2004). HIM: a framework for haptic instant messaging. Paper presented at the CHI '04 extended abstracts on Human factors in computing systems.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161-1178.
Sanchez, J. A., Hernandez, N. P., Penagos, J. C., & Ostrovskaya, Y. (2006). Conveying mood and emotion in instant messaging by using a two-dimensional model for affective states. Paper presented at the Proceedings of VII Brazilian symposium on Human factors in computing systems.
Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695-729.
Sherstyuk, A., Vincent, D., & Treskunov, A. (2009). Towards Virtual Reality games. Paper presented at the Proceedings of the 8th International Conference on Virtual Reality Continuum and its Applications in Industry.
Sloten, J. V., Verdonck, P., Nyssen, M., & Haueisen, J. (2008, November 23-27). On modeling user's EEG response during a human-computer interaction: a mirror neuron system-based approach. Paper presented at the 4th European Conference of the International Federation for Medical and Biological Engineering, Antwerp, Belgium.
Tetteroo, D. (2008). Communicating emotions in instant messaging, an overview. University of Twente, Enschede.
Thompson, E. R. (2007). Development and Validation of an Internationally Reliable Short-Form of the Positive and Negative Affect Schedule (PANAS). Journal of Cross-Cultural Psychology, 38(2), 227-242.
Wang, H., Prendinger, H., & Igarashi, T. (2004). Communicating emotions in online chat using physiological sensors and animated text. Paper presented at the CHI '04 extended abstracts on Human factors in computing systems.
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063-1070.
Yahoo. (2010). Emoticon survey results. Retrieved October 4, 2010, from http://www.ymessengerblog.com/blog/2007/07/10/emoticon-survey-results/
Yeo, Z. (2008). Emotional instant messaging with KIM. Paper presented at the CHI '08 extended abstracts on Human factors in computing systems.