
Page 1: Enhancing Communication and Dramatic Impact of Online Live …takuro/ubicomp2012takuro.pdf · 2012-06-22 · we discuss its implications for designing future interactive broadcasting

Enhancing Communication and Dramatic Impact of Online Live Performance with Cooperative Audience Control

Takuro Yonezawa
Keio University

Delta S213, Endou 5322, Fujisawa, Kanagawa, Japan

[email protected]

Hideyuki Tokuda
Keio University

Delta S213, Endou 5322, Fujisawa, Kanagawa, Japan

[email protected]

ABSTRACT
Recent progress in information technology enables people to easily broadcast events live on the Internet. Although the advantage of the Internet is live communication between a performer and listeners, the current mode of communication is writing comments using Twitter, Facebook, or some similar messaging network. In one type of live broadcast, musical performances, it is difficult for a musician, when playing an instrument, to communicate with listeners by writing comments. We propose a new communication mode between performers who play musical instruments and their listeners by enabling listeners to control the performer’s camera or illumination remotely. The results of four weeks of experiments confirm the emergence of nonverbal communication between a performer and listeners, and among listeners, which increases camaraderie among listeners and performers. Additionally, the dramatic impact of a performance is increased by enabling listeners to control various camera actions, such as zoom-in or pan, in real time. The results also provide implications for the design of future interactive live broadcasting services.

Author Keywords
Interactive live broadcasting, collaborative camera control, musical performance, nonverbal communication, amateur

ACM Classification Keywords
H.5.2 Information interfaces and presentation (e.g., HCI): Miscellaneous.

General Terms
Human Factors, Communication, Measurement, Experience

INTRODUCTION
Continued progress in the expansion of the Internet and information technologies has enabled us to provide information easily worldwide. In particular, amateur users can broadcast various contents live on the Internet today, which was

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
UbiComp ’12, Sep 5–Sep 8, 2012, Pittsburgh, USA.
Copyright 2012 ACM 978-1-4503-1224-0/12/09...$10.00.

heretofore possible only for expert users with professional-level hardware and software. Due to the proliferation of services such as Ustream [3], YouTube Live [1], and NicoNico Live [2], performers and listeners can increasingly enjoy live broadcasting. One advantage of internet live broadcasting is the two-way real-time communication that can occur between performers and listeners. For example, Ustream enables listeners to send messages to a performer or other listeners through social network services such as Facebook or Twitter. Moreover, in NicoNico Live, users’ comments are shown over the live streaming video. Thereby, performers and listeners can communicate more directly.

A major category of content in live broadcasting is musical instrument performance. Traditional live musical performances took place only in concert halls, in the street, or on TV; however, live broadcasting services enable musicians to perform live in their homes. Because communication between performers and listeners is limited to sending messages, it is difficult for musicians to communicate with listeners while they are playing their instruments. Therefore, interactivity—a great advantage of live internet broadcasts—is lost in live musical performances.

To address that problem, we present the Listener-Cooperative Live Production System (LCLPS), which enables listeners (1) to control the performer’s environment (e.g., camera or illumination) remotely in real time, and consequently (2) to enhance communication between the performer and the listeners. Although professional productions can afford a large staff to support a musician’s performance by controlling cameras or lighting during live performances, such support is too costly for amateur musicians. By enabling listeners to control cameras or other devices remotely, the LCLPS provides a dramatic impact similar to that of a professional live performance but without the cost. Additionally, through the listeners’ control, we aim to introduce nonverbal communication, which has an important impact on musical performances. The performer and listeners can send and share their intent for the musical performance through control of the performer’s devices. This paper reports our study with four weeks’ experimentation using the proposed system with thousands of listeners, and presents an evaluation of the LCLPS’s effectiveness and usability. Our results show that the LCLPS enhances the dramatic impact of a musical performance, and communication between the performer and listeners.


In summary, our work offers three main contributions:

• Creating a live broadcasting support system called LCLPS, which enhances the dramatic impact of performances.

• Verifying the emergence of nonverbal communication between the performer and listeners, and also among the listeners, through extensive experimentation using the proposed system.

• Presenting implications for the design of future interactive live broadcasting services.

The remainder of the paper is organized as follows. First, we present the motivation of our research with a summary of the current status of live broadcasting services and their strengths and weaknesses. Then, we describe the concept and implementation of the LCLPS. In the following section, we evaluate the LCLPS in an actual live broadcasting situation, in terms of its effectiveness and usability. Subsequently, we discuss its implications for designing future interactive broadcasting systems. Finally, we describe related work and our conclusions.

MOTIVATION
As evidenced by the spread of Internet-based live broadcasting services [3, 1, 2], many users enjoy not only watching live streams of performances, but also broadcasting themselves. Additionally, the ubiquity of personal computers, smartphones, and internet access enables users to watch and broadcast live streams anytime, anywhere. The contents of these broadcasts have also become increasingly diverse. In particular, musical performances are a popular type of internet live performance; many amateur musicians enjoy broadcasting their musical performances from their own homes.

Amateur musicians frequently devise new ideas to enhance the entertainment features of their performances. For example, some musicians play instruments in concert with existing music contents found on the Internet, such as music videos on YouTube, as a “mashup” live session (see Figure 1). Although music contents have faced copyright problems in the past, recent live internet services allow users to use such music in their programs by signing deals with copyright owners such as record companies. Consequently, such mashups are expected to become increasingly popular in the future.

One advantage of live internet broadcasting is the high potential for interactivity between performers and their listeners. Through social streams or chat spaces on live internet services, they can communicate with each other in real time. Consequently, although the skills of amateur musicians are often inferior to those of professional musicians, both performers and listeners can increase their enjoyment of musical performances by communicating with each other (e.g., listeners can request songs to be performed). Nevertheless, two main problems arise during amateur musicians’ live broadcasts. First is the intermittent communication problem. When amateur musicians play instruments, the communication between the musician and

Figure 1. Mashup with existing music contents on the Internet: the performance of an amateur musician is combined with a music clip from the Internet (e.g., YouTube) in a live collaboration with existing video contents.

Figure 2. Types of camera work: pan right/left, pan up/down, zoom in/out, and rotate left/right.

listeners is greatly decreased because the amateur musician cannot read and reply to messages from listeners while concentrating on playing instruments. This shortcoming in turn reduces the great advantage of internet live performance: interactivity. Second is the lack of dramatic impact. In live broadcasts of professional musicians, dramatic impact is enhanced through the cooperation of a large support staff, including camera operators and lighting staff. For example, professional camera operators can enhance a performance by manipulating a camera’s zoom or rotation (see Figure 2). However, because amateur musicians are not supported by such a professional staff, their broadcasts are often static in nature.

To solve these two problems simultaneously, we specifically devote attention to remote control of a musician’s broadcasting equipment by listeners. The purpose of our research is to show that such control is useful for enhancing both communication and dramatic impact. We especially expect the emergence of nonverbal communication between amateur musicians and their listeners. For example, we assume not only that listeners control a performer’s camera in response to the musician’s performance, but also that the musician’s performance can change in response to the listeners’ camera control. Network-controlled pan-tilt-zoom (PTZ) cameras are commercially available, so no technical difficulty exists in controlling the equipment remotely. However, our motivation arises from human factors: recognizing what will occur when remote control of those devices is used within a live streaming context. The main purposes of such commercially available PTZ cameras are security and traffic monitoring; no research has been done into giving camera control to many users during live streaming. Another important aspect of our research is manual control. Ubicomp technologies enable us to control devices automatically in a context-aware manner, and there have been many efforts to control cameras automatically in lecture rooms [4, 18] and meeting rooms [17, 9, 21]. Our research is conducted to explore the advantages of manual control: the emergence of a


Figure 3. Concept of Listener-Cooperative Live Production System.

new kind of communication.

LISTENER-COOPERATIVE LIVE PRODUCTION SYSTEM
This section presents a description of the concepts, design, and implementation of the LCLPS.

Concept
Figure 3 portrays the LCLPS concept. In the performer’s environment, equipment such as cameras and lighting is used for live broadcasts. Listeners can control such devices remotely while they are watching the live broadcast. The performer sends their live performance through a streaming service (e.g., Ustream or NicoNico Live). The listeners can manipulate the performer’s devices by sending control messages through a chat space on the live broadcast websites or through Twitter. Consequently, the listeners need not install additional applications on their PC or smartphone to control the performer’s devices. Our implementation targets the control of cameras because cameras have the most dramatic impact on a broadcast.

Hardware Components
Figure 4 shows the performer-side hardware components in our implementation: three web cameras (Cameras 1–3), two keyboards with different sounds (Keyboards 1 and 2), and a PC which connects to those devices. The cameras are placed as follows: Camera 1 captures the performer’s upper body, face, and Keyboard 1; Camera 2 captures Keyboard 2 and the performer’s hands; and Camera 3 captures the performer, Keyboard 1, and Keyboard 2 from the ceiling.

Software Components
Figure 5 shows the software architecture of our implementation. To enable many listeners to participate in the control of the performer’s devices, we claim that the system should be simple. Therefore, we designed our system to obviate installation of software on the listeners’ computers. The Comment Analyzer in Figure 5 collects and analyzes listeners’ comments on live broadcast websites using XMLSocket, and listeners’ posts on Twitter through the Twitter API. When listeners’ comments are identified as control

Figure 4. Hardware components of the LCLPS: three cameras, two keyboards, and a PC for broadcasting and controlling cameras.

commands, the Comment Analyzer invokes the Camera Controller to operate the camera. Table 1 presents each command and its operation. Listeners can control the performer’s camera by posting these commands. We implemented basic camera work using digital image processing, as illustrated in Figure 2. Additionally, we prepared an operator “:” that executes multiple commands in one message. For example, “cam3:zoomin” means selecting camera 3, then performing a zoom-in effect. Figure 6 shows the sequence of camera work according to each command. We implemented the system with Java and OpenCV.
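To make the parsing step concrete, the Comment Analyzer’s handling of the “:” operator might be sketched as follows. The paper does not publish its source code, so the class and method names here are hypothetical; only the command vocabulary (Table 1) and the “:” chaining behavior come from the paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the Comment Analyzer's command parsing.
// Class and method names are illustrative, not from the actual LCLPS source.
public class CommentAnalyzer {
    // Commands recognized by the system (Table 1).
    private static final List<String> COMMANDS = Arrays.asList(
        "cam1", "cam2", "cam3",
        "zoomin", "zoomout", "zoomstop",
        "panright", "panleft", "panup", "pandown", "panstop",
        "rotateright", "rotateleft", "rotatestop");

    // Split a comment on the ":" operator and keep it only if every token
    // is a valid command; ordinary comments yield an empty list.
    public static List<String> parse(String comment) {
        List<String> result = new ArrayList<>();
        for (String token : comment.trim().toLowerCase().split(":")) {
            if (COMMANDS.contains(token)) {
                result.add(token);
            } else {
                return new ArrayList<>(); // not a control comment
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("cam3:zoomin")); // [cam3, zoomin]
        System.out.println(parse("nice song!"));  // []
    }
}
```

A comment such as “cam2:zoomin:pandown” would thus expand to three commands dispatched to the Camera Controller in order.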

Conflict resolution of multiple commands sent by listeners simultaneously is not the focus of this study. We designed a simple priority policy as follows. The commands “cam1”, “cam2”, and “cam3” are executed exclusively, and the latest command always takes effect. Conflicting effects of the same type (zoomin/out, panright/left, panup/down, and rotateright/left) are executed serially, whereas effects of different types operate simultaneously. For example, when “zoomin” and “rotateright” are requested simultaneously, the effects overlap. These effects are canceled when a subsequent camera-selection command is processed, because we consider that such effects should be associated with each camera.
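The priority policy above can be captured in a small state machine. This is an illustrative reconstruction of the described rules (the names are hypothetical): camera selection is exclusive and cancels running effects, the latest command of each effect type replaces the previous one, and different effect types overlap.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch of the conflict-resolution policy described in the text.
public class EffectState {
    public enum Type { ZOOM, PAN, ROTATE }

    private int camera = 1; // cam1 by default
    private final Map<Type, String> active = new EnumMap<>(Type.class);

    public void apply(String command) {
        if (command.startsWith("cam")) {
            camera = Integer.parseInt(command.substring(3));
            active.clear(); // selecting a camera cancels all running effects
        } else if (command.startsWith("zoom")) {
            setOrStop(Type.ZOOM, command);
        } else if (command.startsWith("pan")) {
            setOrStop(Type.PAN, command);
        } else if (command.startsWith("rotate")) {
            setOrStop(Type.ROTATE, command);
        }
    }

    // The latest command of a type replaces the previous one ("serial"
    // execution); a "...stop" command removes that type's effect.
    private void setOrStop(Type type, String command) {
        if (command.endsWith("stop")) {
            active.remove(type);
        } else {
            active.put(type, command);
        }
    }

    public int camera() { return camera; }
    public Map<Type, String> activeEffects() { return active; }

    public static void main(String[] args) {
        EffectState s = new EffectState();
        s.apply("zoomin");
        s.apply("rotateright");
        System.out.println(s.activeEffects()); // zoom and rotate overlap
        s.apply("cam2");
        System.out.println(s.camera() + " " + s.activeEffects()); // effects canceled
    }
}
```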

EVALUATION
This section presents a field experiment and two surveys to evaluate the LCLPS approach.

Experiment
To evaluate how the system influences dramatic impact and communication among the performer and listeners, we conducted broadcast experiments during one month (Mar. 25 – Apr. 24). The performer was an author of this paper, who has a background in playing piano. Table 3 shows the date, time, and length of each live broadcast in the experiment. The number of participants, all comments, and comments for control of the camera are also presented in Table 3.


Figure 5. Software components of the LCLPS.

Figure 6. Sequence of captured video according to the camera control command.

Table 1. List of camera control commands.

Command Name              Operation
cam1                      select camera 1
cam2                      select camera 2
cam3                      select camera 3
zoomin / zoomout          perform zoom-in (zoom-out) effect
zoomstop                  stop zooming effect
panright / panleft        perform pan-right (pan-left) effect
panup / pandown           perform pan-up (pan-down) effect
panstop                   stop pan effect
rotateright / rotateleft  rotate camera clockwise (counterclockwise)
rotatestop                stop rotate effect
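Because the camera effects are realized with digital image processing rather than physical camera motion, a zoom-in amounts to cropping a centered sub-rectangle and scaling it back to full frame size (the 150% magnification mentioned in the listeners’ instructions corresponds to a zoom factor of 1.5). The crop geometry below is an illustrative sketch, not the actual LCLPS code:

```java
// Illustrative digital-zoom geometry: a zoom-in effect crops a centered
// sub-rectangle and scales it back up to the full frame size.
public class DigitalZoom {
    // Returns {x, y, width, height} of the centered crop rectangle for a
    // given zoom factor (e.g., 1.5 for a 150% zoom-in effect).
    public static int[] cropRect(int frameWidth, int frameHeight, double zoom) {
        int w = (int) Math.round(frameWidth / zoom);
        int h = (int) Math.round(frameHeight / zoom);
        int x = (frameWidth - w) / 2;
        int y = (frameHeight - h) / 2;
        return new int[] { x, y, w, h };
    }

    public static void main(String[] args) {
        // A 640x480 frame at 150% zoom crops a 427x320 rectangle at (106, 80).
        int[] r = cropRect(640, 480, 1.5);
        System.out.println(r[0] + "," + r[1] + "," + r[2] + "," + r[3]);
    }
}
```

A digital pan works the same way, sliding the crop rectangle horizontally or vertically instead of shrinking it.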

For the live broadcasting platform, we chose NicoNico Live, one of the most popular live broadcasting services in Japan. A key feature of NicoNico Live is that listeners’ comments are overlaid directly onto the streaming video. This feature motivates listeners to express their feelings spontaneously. During the experiments, we performed on the keyboard mostly in the mashup live session style described in the previous section (see Figure 1). Throughout the experiments, all comments were recorded. We also administered a questionnaire survey to elicit detailed comments and a usability evaluation.

We didn’t provide any special training for listeners using the LCLPS. Instead, we presented simple instructions (see Table 2) on top of the streaming video frame to show how to use the LCLPS. This means that all of the participants learned how to control the cameras on the fly during the actual broadcast.

Table 2. Instructions to control cameras for listeners.
We have a camera control function. You can control cameras by typing the following commands.
Camera selection: cam1 / cam2 / cam3
Zoom in or out (magnification of the zoom-in effect is 150%): zoomin / zoomout / zoomstop
Rotation: rotateright (clockwise) / rotateleft (anticlockwise) / rotatestop
Pan: panup / pandown / panright / panleft
You can also use the “:” operator for multiple effects in one message. (Ex: zoomin:panup / cam2:zoomin:pandown)

Emergence of Communications between the Performer and Listeners
In this section, we present what kind of communication emerged between the performer and listeners using our system. We use a live streaming program that we broadcast on Apr. 22 as an example. We acquired the largest number of listeners on this day because our program was featured as an official program by NicoNico Live. At that time, more than half of the listeners were seeing the performer’s broadcast and the LCLPS for the first time; therefore, we were able to obtain


Table 3. List of experimental dates and the number of listeners, comments, and control comments.

Dates      Time length (min)  # of L  # of AC  # of CC
Mar. 25    60                 222     596      156
Mar. 26    60                 141     336      79
Mar. 30    90                 266     651      177
Apr. 1     90                 210     403      138
Apr. 3     30                 95      110      49
Apr. 4     90                 499     1210     152
Apr. 5     60                 256     674      263
Apr. 6     90                 343     763      216
Apr. 9     90                 328     836      181
Apr. 13    90                 327     1226     194
Apr. 14    60                 196     672      119
Apr. 16    60                 192     346      152
Apr. 18    90                 193     494      148
Apr. 20    60                 181     234      89
Apr. 21    90                 250     1148     212
Apr. 22    90                 658     6323     560
Apr. 24    90                 352     1436     311

# of L = number of listeners, # of AC = number of all comments, # of CC = number of comments for camera control

unbiased feedback from them, rather than just friendly feedback from fans.

The live performance was broadcast for 90 min. We had 658 listeners and 6323 comments, which included control commands. All listeners were ordinary people with whom we were not acquainted. Of the 658 total users, 61 (≈10%, which seems consistent with the 90-9-1 rule [15]) sent a total of 560 control commands. The breakdown of camera controls was: camera selection (168), zoom effects (374), pan effects (58), and rotation (118). The total number of camera controls (718) is greater than the number of control comments (560) because some comments use the “:” operator to execute multiple commands in one comment.

Figure 7 presents a timeline of two different scenes (scenes A and B) during the Apr. 22 live broadcast. The timeline shows video frames captured at five-second intervals. We assigned a number to each frame (from A-1 to A-22, and from B-1 to B-29) and display comments for control commands with balloons on the timeline. Looking at the timeline, we can see that listeners often select a camera in response to the performance. For example, when the performer starts to play Keyboard 2 in A-7, a listener sent a “cam2” command to capture the performance clearly. As another example, when the performer finishes playing in B-24, a listener sent a “cam1” command to capture the performer’s facial expression after the performance. We observed that listeners often try to understand the performer’s intention and control the cameras according to the performance. Therefore, the performer and listeners mutually communicate through the performance.

We also observed other nonverbal communication between the performer and listeners. When listeners want the performer to change his playing style, they can control the cameras to communicate this (e.g., switching the cameras to encourage the performer to move). In such a case, the performer can detect the listeners’ initiative and change his playing style in response to the listeners’ camera manipulations. Scene C in Figure 8 demonstrates this: after a listener changed the camera source from Camera 1 to Camera 2, the performer changed from Keyboard 1 to Keyboard 2. There was another type of camera control in a mashup session: some listeners controlled the camera by synchronizing its movements with the camera work of the existing music videos playing next to the performance.

We received many positive comments about the LCLPS while we broadcasted, such as “New way to enjoy music!”, “I can feel togetherness.”, “This program makes both the performer and listeners happy.” and “Listeners can participate directly in the performance!” Some listeners described the emergence of communication between the performer and listeners, and also among listeners themselves, such as “I enjoy communication by switching cameras according to the performance, and also by switching cameras to change the performance.”, “I enjoy collaboration with other listeners. When I feel how other listeners want to enhance the dramatic impact, I help them by performing additional effective camera control.”, “It is very interesting to make a better program with the performer and other listeners.” and “Thanks to the system, I can feel closer to the performer and other listeners.” These clearly positive comments show that nonverbal communication emerged. Moreover, we found that this communication enhanced camaraderie among the many participants in the session.

Enhancement of Dramatic Impact
This section presents a description of how the dramatic impact of the program was enhanced using the LCLPS. We observed many positive comments about dramatic impact in the program, such as “There is definitely a professional camera operator.”, “What wonderful camera work.”, “Good camera angle.” and “This is just like a professional program.” We also recognized that the LCLPS reduces the gap in the quality of dramatic impact between professional and amateur programs during mashup live sessions. Scene D in Figure 9 portrays an example of listeners’ control based on a professional music clip. In scene D-1, the professional music clip shows camera shots of an overhead view. The screen capture shows that a listener changes the camera to the same overhead-view perspective. Because this effect merges the professional broadcast and ours, we received many comments such as “The performer appears on the same stage with professional musicians.” and “I feel like I’m in a real concert.”

We also conducted an online survey to ascertain how listeners felt about the dramatic impact of the LCLPS. We recruited participants for the survey by announcing it on our broadcasting web page on NicoNico Live. The statements and results of the online survey are presented in Table 4. Users indicated their ratings of agreement with the statement on a


Figure 7. Timeline of the live broadcast on April 22nd (scenes A and B).

Figure 8. Timeline of the live broadcast showing nonverbal communi-cation between the performer and listeners.

1–5 scale (1=strongly disagree, 2=partly disagree, 3=neither agree nor disagree, 4=partly agree, 5=strongly agree). We also asked users to explain why they selected their rating in comments. The participants comprised 44 users who saw the broadcast, including 10–19 year olds (4), 20–29 year olds (20), 30–39 year olds (6), 40–49 year olds (12),

Figure 9. Timeline of the live broadcast showing camera control according to a professional music clip.

and 50–59 year olds (2). Results show that the average score was 4.47.

Commenters who scored positively (4 or 5) included comments such as “We can see the performer from multiple angles.”, “When the mood of the song changed, the camera effects also changed. This is similar to the effects of a professional program.” and “We can feel the existence of multiple camera operators. This makes me feel like I have joined a real concert which is created by multiple staff.” We also had some comments which described negative effects on dramatic impact. For example, some users reported conflicts among camera control commands, such as “When many comments for camera control are sent from many listeners, the camera effects sometimes change very quickly. I guess this case is not good for the visual effect itself.” In contrast, we confirmed that some users enjoy such conflicts: “Conflicts are a very interesting phenomenon. It’s like a spice of the handmade live streaming. I can feel that there are many listeners who want to make the broadcast better through camera control.” Through our experiments, overall we confirmed that the LCLPS can enhance the dramatic impact of live streaming.

Usability Evaluation
This section evaluates the LCLPS’s usability through the online survey, using five indicators [11]: Learnability, Efficiency, Memorability, Error Handling, and User Satisfaction. We believed that some users might encounter difficulty in controlling cameras with text commands. The survey was administered on the Internet by recruiting participants through an announcement on our broadcasting web page. We had 37 participants who had experience with controlling the camera in our program. The participants included 10–19 year olds (8), 20–29 year olds (15), 30–39 year olds (7), 40–49 year olds (5), and 50–59 year olds (2). Table 5 presents statements and results obtained from the questionnaire. The respective ratings of agreement are 1=strongly disagree, 2=partly disagree, 3=neither agree nor disagree, 4=partly agree, and 5=strongly agree. Note that all questionnaire items are positive statements, which may cause positive bias. Thus, we focus more on the participants’ remarks.

Learnability
The average rating of the statement “It was very easy to learn how to use the system.” was 3.78. We confirmed that a learning curve was present for some users learning how to use the system. We obtained both positive and negative comments on learnability from the questionnaire, such as “I can use the system after some experience.”, “Commands are easy to understand. I can control the camera intuitively.”, “It was difficult for me to understand the pan effects. I sometimes confused the pan right and left commands because it’s difficult to understand which direction of camera movement right or left means.”, “It was very easy for me to learn how to use the system because I’m experienced with CUIs.”, and “I was able to use the system after the first trial.” Overall, text-based control is apparently tolerated by most users.

Efficiency

The average rating of the statement "The system is very pleasant to work with." was 3.65. In the comments, some users mentioned the ":" operator that connects two or more camera control commands, such as "Efficiency was enhanced by multi-commands using the ':' operator." One user indicated a problem of text-based control: "I made an effort to imagine how my command affects the camera shot. I think it would be easier to select a camera graphically, like a camera switcher in a TV station." We confirmed that text-based camera control worked well for users, but more efficient methods should also be investigated. We discuss this point in the next section.
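As a rough illustration, chained control comments can be handled by splitting on the ":" operator and filtering against a known command set. This is a minimal sketch: the command names and vocabulary below are assumptions for illustration, not the LCLPS's actual syntax.

```python
# Minimal sketch of parsing a listener comment that chains camera
# commands with the ":" operator. The command vocabulary below is an
# assumption for illustration, not the LCLPS's actual syntax.

VALID_COMMANDS = {
    "cam1", "cam2", "cam3",            # camera selection (hypothetical names)
    "zoomin", "zoomout", "zoomstop",   # zoom effects
    "panright", "panleft",             # pan effects
    "rotateright", "rotateleft",       # rotation effects
}

def parse_comment(comment: str) -> list:
    """Split a comment on ':' and keep only recognized commands, in order."""
    parts = (p.strip().lower() for p in comment.split(":"))
    return [p for p in parts if p in VALID_COMMANDS]
```

With this scheme, `parse_comment("cam2:zoomin")` yields both commands in order, while an ordinary chat comment yields an empty list and can be treated as plain chat rather than a control request.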

Memorability

The average rating of the statement "It was very memorable how to use the system." was 3.46. In the comments section, there were opinions such as "I could remember the commands if there were not only English commands but also Japanese commands." In contrast, a user who seemed to have much experience controlling cameras in our program stated, "I will not forget how to use the system even after one year." The memorability ratings varied more widely than those of the other statements. This also implies the necessity of intuitive interfaces.

Error Handling

We asked users to report their experiences of control failure in the comments. As described in the Learnability section, some users remarked, "I often confused the direction of pan right and pan left, or rotate right and rotate left." We also often encountered this error when we broadcasted. Another user reported errors resulting from conflicts with other users: "Though I thought my command was wrong, in fact another user had sent another command shortly after mine." We also observed comments describing the latency of the live broadcast. Depending on the load of NicoNico Live's streaming server, the video is broadcast with a latency of 3 s minimum to 10 s maximum. Consequently, users who want to control a camera effectively must predict the performance 3–10 s ahead. Some users seemed to find this prediction difficult, while others enjoyed it like a game: "Predicting the performer's action is both difficult and interesting. When my command succeeds in making an effective shot, I'm very excited."

User Satisfaction

The average rating of the statement "It was very fun to use the system." was 4.27. We therefore conclude that our system satisfies many users. In the comments, we observed many positive remarks such as "interesting system", "I enjoyed it", "exciting", "I'm very impressed when my control produces a very good effect. I'm also very happy with other listeners' feedback.", and "I enjoy communication with the performer and other listeners." These responses indicate that communication with other listeners through camera control contributes to listeners' high satisfaction.

Some users expressed concerns about conflicting control requests or mischievous users (trolls), such as "I worry about conflicts. When more and more users want to control the cameras, will bad conflicts occur?" and "I wonder if trolls will appear." Although we did not encounter any trolls in our experiments, some function to prevent trolls might be necessary in general. As for conflicts, we observed


Table 4. Results of a questionnaire related to enhancement of dramatic impact.

                                                                     Scale
Question                                                       1   2   3   4   5   Average
Dramatic impact of the live broadcasting was enhanced
by the system.                                                 0   0   4  15  25   4.47

Table 5. Results of questionnaires related to usability of the LCLPS.

                                                                     Scale
Question                                                       1   2   3   4   5   Average
It was very easy to learn how to use the system.               0   2  11  17   7   3.78
The system is very pleasant to work with.                      0   3   9  19   3   3.65
It was very memorable how to use the system.                   0   4  16  13   4   3.46
It was very fun to use the system.                             0   1   3  10  20   4.27

an interesting phenomenon: listeners tried to avoid conflicts or collaborated to create good effects by themselves. In other domains employing shared camera control, there was evidence that users wanted to fulfill their own needs rather than those of others, and the designers of those systems attempted to resolve conflicts in camera control [20, 12]. In our system, however, each user wanted to show movies with high dramatic impact to all listeners. Consequently, the users collaborated naturally to resolve conflicts and create good broadcast programs. Moreover, the users seemed to enjoy resolving the conflicts; for example, "I sometimes feel that the other users wait for my or others' control to avoid conflicts or to keep equal opportunities for controlling. So, I sometimes behaved in the same way. In such cases, I can recognize human kindness."

Since the performer in the experiment was one of the authors of this paper, it is hard to describe the performer's experience without bias. However, we should still report on the performer's experience to provide a more complete understanding of the impact of the LCLPS. Overall, we were satisfied with our performances with the LCLPS because we felt that both dramatic impact and communication with listeners were enhanced, as listeners remarked. As a performer, we were especially excited when our performance was directed effectively by the listeners. We also felt comfortable with performance changes initiated by listeners' camera control. It is hard to keep devising new ideas for a performance by ourselves throughout an entire broadcast; entrusting part of the direction to the listeners made it easier to organize the broadcast program. The listeners' camera control also encouraged new ideas for our performance. For example, we were sometimes prompted to play two keyboards simultaneously (right hand on keyboard 1, left hand on keyboard 2) when our performance was captured by the overview camera. This performance style had not appeared when we performed without the LCLPS. On the other hand, we consider that other musicians might feel uncomfortable with performances directed by listeners if the musicians' intents differ greatly from the listeners'. We have two approaches for such cases. One is to keep conveying our intents to the listeners; in our experience, the listeners then changed their intents in response to our performance. The second is a more defensive approach: we prepared a control interface that terminates the listeners' ability to control the cameras. Although we never encountered a situation in which we wanted to use this controller, it would also be useful against trolls.

DISCUSSION AND IMPLICATIONS FOR DESIGN

The experimental analysis of our system indicates possible areas of improvement in the design of current live broadcasting services, as well as implications for the design of future services.

More Effective Performances

We focused only on camera control in the current implementation, but the LCLPS can control other devices, such as illumination, and special effects, such as smoke generation. Our experiments prompted some comments regarding future development, such as "If I could control other devices such as a mirror ball, it would be very exciting for me." This would enable us to create more professional live performances with other users. Great potential also remains in the camera devices themselves: if listeners were able to control cameras on a movable crane or flying helicopter cameras, it would provide more dynamic and attractive movies.

We also anticipate more effective interfaces for controlling devices. For example, we consider it worthwhile to design a multi-touch, gesture-based camera control interface to enhance intuitiveness (see Figure 10). Solving the latency problem is also an important challenge; doing so would enable many users to control devices remotely with appropriate timing. From the viewpoint of enhancing communication and a strong sense of camaraderie, we consider it important to identify the contributors who enhance broadcast programs. In our experiments, we observed many comments applauding the listeners who participated in the program by controlling cameras. We consider maintaining this kind of positive communication an important priority for a collaboration system.

Adapting the Concept to Other Areas

Though we focused on live streaming of musical performances in this paper, our system can be adapted to other categories. Art, dance, magic, juggling, board gaming, and various sports are examples of live performances that can be supported by our system. In the ubicomp research area,


Figure 10. Example of a multi-touch and gesture-based camera control interface (pinch/spread for zoom-in/out, swipe for camera selection, rotate for camera rotation).
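To make the figure concrete, the gestures of Figure 10 could be translated into the same kind of camera commands that listeners currently type. The gesture and command names in this sketch are assumptions for illustration, not the actual LCLPS vocabulary.

```python
# Hypothetical translation of the Figure 10 gestures into text-style
# camera commands; gesture and command names are assumptions, not the
# actual LCLPS vocabulary.

def gesture_to_command(gesture: str, delta: float = 0.0):
    """Map one recognized multi-touch gesture to a camera command.

    `delta` encodes gesture direction: positive for right/outward,
    negative for left/inward. Returns None for unrecognized gestures.
    """
    if gesture == "spread":   # fingers moving apart -> zoom in
        return "zoomin"
    if gesture == "pinch":    # fingers moving together -> zoom out
        return "zoomout"
    if gesture == "swipe":    # horizontal swipe -> select next/previous camera
        return "camnext" if delta > 0 else "camprev"
    if gesture == "rotate":   # two-finger rotation -> rotate the camera
        return "rotateright" if delta > 0 else "rotateleft"
    return None
```

A mapping of this kind might also reduce the right/left confusion reported in the Error Handling section, since the direction of the gesture itself indicates the direction of the camera movement.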

Table 6. Levels of automation of decision and action selection [16].

(High)
10. The computer decides everything, acts autonomously, ignoring the human.
9. informs the human only if it, the computer, decides to,
8. informs the human only if asked, or
7. executes automatically, then necessarily informs the human, and
6. allows the human a restricted time to veto before automatic execution, or
5. executes that suggestion if the human approves, or
4. suggests one alternative,
3. narrows the selection down to a few, or
2. The computer offers a complete set of decision/action alternatives, or
1. The computer offers no assistance: human must take all decisions and actions.
(Low)

many smart rooms, which have various networked appliances and sensors, have been developed to realize a more efficient lifestyle [14]. Such smart rooms are well suited to deploying our system, and would gain another advantage: not only efficiency, but also communication with others. Adopting context recognition techniques would also enhance our system; recognizing the performer's context and notifying listeners can help listeners control devices according to the performer's intents. We also consider our system appropriate not only for home environments but also for outdoor or semi-professional environments. For example, live venues equipped with our system might reach new audiences. We also expect that relationships between amateur musicians and their fans will become stronger.

Balancing between Manual and Automatic Control

Through our experiments, we learned the importance of balancing manual and automatic control. If we took a fully automated approach, no new communication would emerge between performers and listeners, or among listeners. At the same time, we implemented several automatic limits on camera control. For example, the magnification of the zoom-in effect was limited to 150%. Even if listeners do not send a "zoomstop" command, the zoom-in effect stops automatically to maintain an appropriate presentation.
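The automatic zoom limit described above can be sketched as a simple clamp. Only the 150% cap and the automatic stop come from the system description; the step size and per-tick update model are assumptions.

```python
# Sketch of the automatic zoom-in limit: zoom advances per update tick
# and halts at 150 % magnification even without a "zoomstop" command.
# Step size and tick model are assumptions for illustration.

ZOOM_MAX = 1.5    # 150 % magnification cap (from the system description)
ZOOM_STEP = 0.05  # per-tick zoom increment (assumed)

def step_zoom(current: float, zooming_in: bool):
    """Advance the zoom by one tick; return (new_zoom, still_zooming)."""
    if not zooming_in:
        return current, False
    new_zoom = min(current + ZOOM_STEP, ZOOM_MAX)
    # Stop automatically once the cap is reached, mirroring the LCLPS.
    return new_zoom, new_zoom < ZOOM_MAX
```

The clamp keeps any single listener's command from degrading the presentation, while leaving every decision below the cap to manual control, which is the balance this section argues for.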

Parasuraman et al. [16] proposed an automation framework with ten levels (see Table 6). Although we can design and implement a system that enhances dramatic impact at any level, it is important to examine the tradeoff between efficiency and the possibility of communication among users. Another important implication for collaborative system design relates to simplicity. Our system was very simple to use, which in turn allowed listeners to devise various means of control. By the end of the experiment, some users had developed great skill at controlling cameras: they combined several commands effectively and produced expressive camera work. Other listeners praised such skilled users as "craftsmen". In ordinary live broadcasting programs, only the performers are in the spotlight. In contrast, our system enabled some listeners to be spotlighted as important participants in the program. In such a system, simplicity plays an important role in emphasizing the existence, participation, and ability of humans.

RELATED WORK

Some research has been conducted on interactive broadcasting. Interactive television [10] is a research area that provides audiences with interactive functions for sharing experiences. However, most such systems use only verbal communication, such as text chat or voice chat functions [8, 23, 5]. In contrast, our system focuses on the nonverbal communication that emerges through device control. Moreover, whereas text or voice chat only shares experiences, users of our system influence the broadcast content more directly; consequently, they can also feel togetherness in becoming part of the program. There are also studies of camera control in live broadcasting. Saito et al. [19] propose an interface that enables listeners to ask a performer to change camera effects such as zoom; in that case, however, the camera control is executed by the performer. Our experiments are the first reported case of simultaneous camera control by many listeners in actual live broadcasting.

Great interest has arisen in semi-automated and fully automated camera control for creating effective videos in Human-Computer Interaction (HCI) research. As a semi-automated example, cameras are controlled automatically using prepared scenarios written in TVML [13], a markup language for expressing TV programs. FlySPEC [12] uses a PTZ camera mounted on top of a panoramic camera to support camera control by users: users survey the entire room through the panoramic camera and select regions to view in more detail, and the PTZ cameras are then controlled automatically to fulfill the participants' requests to the greatest degree possible. For fully automated cameras, prior studies have examined lecture rooms [4, 18] and meeting rooms [17, 9]. Ranjan et al. [17] provide a system for automatic multi-camera control in meetings; it uses a motion-tracking system and microphones, and controls cameras automatically based on television production principles. Although these automatic camera control systems can provide benefits for efficient video recording and broadcasting, we consider that no communication arises among participants. We instead specifically examine the advantages of manual control. In our experience, manual control plays


an important role in letting listeners share experiences, produce new communication and togetherness, and share accomplishments.

Our work is also related to research in Computer Supported Cooperative Work (CSCW). Recent studies of collaborative video production [6, 7, 22] present findings about human factors when multiple users collaborate in video creation, and also provide design implications for collaborative systems. Engstrom et al. [6] analyzed VJ work in dance clubs and presented the SwarmCam prototype, which enables VJs to create videos collaboratively with audiences. Engstrom et al. [7] also studied how live video can be produced as a group activity by amateurs using a tool called the Instant Broadcasting System. From our paper's point of view, these studies are interesting because they provide ethnographic insights into the interaction between participants in a collaborative content-production setting. The difference between our experiments and these works is that our collaborative production enhances not only dramatic impact but also nonverbal communication, and that we provide experimental results from a public broadcasting service with numerous simultaneous participants.

CONCLUSION

This paper presented the Listener-Cooperative Live Production System (LCLPS), which enhances communication and dramatic impact during live broadcasts. The LCLPS enables listeners to control cameras, for example by switching among cameras or applying various effects (zoom, pan, and rotation). Through one month of experiments with thousands of users, we confirmed the emergence of nonverbal communication and the enhancement of dramatic impact. Notably, nonverbal communication occurred not only between the performer and the listeners, but also among the listeners themselves. From these experiments, we anticipate the great potential of manual control for human communication, along with its implications for the future design of interactive broadcasting systems.

ACKNOWLEDGEMENTS

We would like to thank the anonymous reviewers, and especially our shepherd William G. Griswold, for their insightful comments. We also thank all participants in our live broadcasts. This research was partly supported by the National Institute of Information and Communications Technology (NICT).

REFERENCES

1. YouTube Live. http://www.youtube.com/live.
2. NicoNico Live. http://live.niconico.com/.
3. Ustream. http://www.ustream.tv/.
4. M. Bianchi. Automatic video production of lectures using an intelligent and aware environment. In Proceedings of the 3rd International Conference on Mobile and Ubiquitous Multimedia, MUM '04, pages 117–123, New York, NY, USA, 2004. ACM.
5. T. Coppens, L. Trappeniers, M. Godon, and F. Wellesplein. AmigoTV: towards a social TV experience. In Proceedings of the Second European Conference on Interactive Television (EuroITV 2004), University of Brighton, pages 1–4, 2004.
6. A. Engstrom, M. Esbjornsson, and O. Juhlin. Mobile collaborative live video mixing. In Proceedings of the 10th International Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '08, pages 157–166, New York, NY, USA, 2008. ACM.
7. A. Engstrom, M. Perry, and O. Juhlin. Amateur vision and recreational orientation: creating live video together. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW '12, pages 651–660, New York, NY, USA, 2012. ACM.
8. D. Geerts. Comparing voice chat and text chat in a communication tool for interactive television. In Proceedings of the 4th Nordic Conference on Human-Computer Interaction: Changing Roles, NordiCHI '06, pages 461–464, New York, NY, USA, 2006. ACM.
9. T. Inoue, K.-i. Okada, and Y. Matsushita. Learning from TV programs: application of TV presentation to a videoconferencing system. In Proceedings of the 8th Annual ACM Symposium on User Interface and Software Technology, UIST '95, pages 147–154, New York, NY, USA, 1995. ACM.
10. J. F. Jensen. Interactive television: a brief media history. In M. Tscheligi, M. Obrist, and A. Lugmayr, editors, EuroITV, volume 5066 of Lecture Notes in Computer Science, pages 1–10. Springer, 2008.
11. M. J. LaLomia and J. B. Sidowski. Measurements of computer satisfaction, literacy, and aptitudes: a review. International Journal of Human-Computer Interaction, 2, 1990.
12. Q. Liu, D. Kimber, J. Foote, L. Wilcox, and J. Boreczky. FlySPEC: a multi-user video camera system with hybrid human and automatic control. In Proceedings of ACM Multimedia 2002, pages 484–492. ACM Press, 2002.
13. M. Hayashi, H. Ueda, and T. Kurihara. TVML (TV program Making Language): automatic TV program generation from text-based script.
14. S. Meyer and A. Rakotonirainy. A survey of research on context-aware homes. In Proceedings of the Australasian Information Security Workshop Conference on ACSW Frontiers 2003 - Volume 21, ACSW Frontiers '03, pages 159–168, Darlinghurst, Australia, 2003. Australian Computer Society, Inc.
15. J. Nielsen. Participation inequality: encouraging more users to contribute, 2006. http://www.useit.com/alertbox/participation_inequality.html.
16. R. Parasuraman, T. B. Sheridan, and C. D. Wickens. A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 30(3):286–297, 2000.
17. A. Ranjan, J. Birnholtz, and R. Balakrishnan. Improving meeting capture by applying television production principles with audio and motion detection. In Proceedings of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, CHI '08, pages 227–236, New York, NY, USA, 2008. ACM.
18. Y. Rui, A. Gupta, J. Grudin, and L. He. Automating lecture capture and broadcast: technology and videography. ACM Multimedia Systems Journal, 10:3–15, 2004.
19. Y. Saito and Y. Murayama. An empirical study of audience-driven interactive live television on the internet. EuroITV '09, 2009.
20. D. Song and K. Y. Goldberg. Approximate algorithms for a collaboratively controlled robotic camera. IEEE Transactions on Robotics, 23:2007, 2007.
21. Y. Takemae, K. Otsuka, and J. Yamato. Automatic video editing system using stereo-based head tracking for multiparty conversation. In CHI '05 Extended Abstracts on Human Factors in Computing Systems, CHI EA '05, pages 1817–1820, New York, NY, USA, 2005. ACM.
22. S. Vihavainen, S. Mate, L. Seppala, F. Cricri, and I. D. Curcio. We want more: human-computer collaboration in mobile social video remixing of music concerts. In Proceedings of the 2011 Annual Conference on Human Factors in Computing Systems, CHI '11, pages 287–296, New York, NY, USA, 2011. ACM.
23. J. D. Weisz, S. Kiesler, H. Zhang, Y. Ren, R. E. Kraut, and J. A. Konstan. Watching together: integrating text chat with video. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '07, pages 877–886, New York, NY, USA, 2007. ACM.