
SIMG-503 Senior Research

How People Take Pictures: Understanding Consumer Behavior through Eye Tracking

Before, During, and After Image Capture

Final Report

Marianne Lipps Visual Perception Laboratory Center for Imaging Science

Rochester Institute of Technology [email protected]


Table of Contents

Abstract
Copyright
Acknowledgement
1. Introduction
2. Background
2.1 Eye Movements and Visual Perception
2.2 Eye Movements and Picture Viewing
3. Methods
3.1 Eye Tracking Instrumentation
3.1.1 Wearable Eye Tracking System
3.1.2 Integrated Eye and Head Tracking System
3.2 The Tasks
3.2.1 Task 1: Image Capture
3.2.2 Task II: Image Edit
3.2.3 Subjects
4. Results
4.1 Task 1: Image Capture
4.1.1 Gaze Duration
4.2 Task 2: Image Editing
4.2.1 Gaze Duration
4.2.2 Fixation Densities
4.2.3 Cropping and zooming
5. Discussion
5.1 Image Capture Summary
5.2 Image Editing Summary
6. Conclusion
References


Abstract

In the digital photography business, the number of prints of photographs taken by consumers is a critical factor in profitability. Consumers often report that the printed photographs they receive do not adequately reproduce their memory of the original scene; they are dissatisfied with the composition of the photograph, not necessarily its physical image quality. This dissatisfaction results in a lower percentage of images being printed or reprinted.

Very little is known about how people compose photographs. Simply asking photographers about their methods does not always give an accurate picture of consumer behavior during image capture; amateur photographers may not be conscious of the strategies they actually use when preparing to take a photograph. Also, asking photographers to pay attention to these strategies may change their behavior.

The human visual system has very high resolution only in the fovea, making it necessary to move the eyes to gather visual information about the world. By studying these eye movements, the process of visual perception can be investigated. Eye movements reveal where in a scene a person is attending; with this information, it is possible to gain insight into a person’s cognitive processes.

A portable eye tracker was used to record photographers’ eye movements while they took digital photographs of a number of scenes. Eye movements were also recorded as the participants selected and cropped their images on a computer. Analysis revealed that during image capture, the participants’ behavior was affected by the subject matter of the photograph; the time spent looking at either the primary object or the surround differed across scenes. However, results from the editing phase show that the spread of fixations, edit time, and number of crop windows did not differ significantly across scenes. This suggests that unlike image capture, the cropping task is not influenced by the content of the image.

Copyright

Copyright © 2002
Center for Imaging Science
Rochester Institute of Technology
Rochester, NY 14623-5604

This work is copyrighted and may not be reproduced in whole or part without permission of the Center for Imaging Science at the Rochester Institute of Technology.

This report is accepted in partial fulfillment of the requirements of the course SIMG-503 Senior Research.

Title: How People Take Pictures: Understanding Consumer Behavior through Eye Tracking Before, During, and After Image Capture
Author: Marianne Lipps
Project Advisor: Jeff B. Pelz
SIMG 503 Instructor: Anthony Vodacek


Acknowledgement

Jeff B. Pelz: For being a great advisor, and for making me present at Industrial Associates
Jason Babcock: For a lot of stuff
Subjects of the experiment: For volunteering their time and patience
Visual Perception Lab members: For helping me figure everything out

1. Introduction

This research project utilizes the study of eye movements to gain insight into how humans perform a common task: taking a photograph. With eye tracking equipment, it is possible to literally see what the photographer sees, and what the photographer pays attention to. By studying these eye movements, it is possible to learn how a photographer composes an image. It is difficult to attain a better understanding of consumer behavior during image capture simply by asking the photographer about his or her methods; amateur photographers may not be conscious of the strategies they actually use when preparing to take a photograph. Also, telling a photographer to pay attention to his or her methods may influence the task, making it less natural. An experiment was designed using the eye tracking equipment in the Visual Perception Lab at RIT. Subjects were asked to perform tasks that included taking and editing photographs of various scenes. With eye tracking equipment, it is possible to record spatial and temporal information about eye movements. These eye movements can then be analyzed, yielding information about how a person gathers visual information from the environment. Eye movements while taking photographs were compared over three image classes: person, sculpture, and interior. This information was also compared to eye movements as subjects cropped the photographs they took.


2. Background

2.1 Eye Movements and Visual Perception

The human visual system is highly sophisticated, and allows for the perception of a full field of high resolution vision. This perception is created by sampling the environment through a complex process that occurs below consciousness (Pelz, Canosa, Babcock, 2000). In effect, the environment is sampled spatially and temporally by a pool of sensors located on the retina. There are two types of photosensitive receptors used. The first is the cone, which is used for color vision during normal levels of illumination. The rods, on the other hand, are highly sensitive and are useful in low levels of illumination. These sensors are not evenly distributed, however. The cones are clustered on the back of the retina near the optical axis, while a greater number of rods compose the periphery, as illustrated in Figure 1.

Figure 1: Distribution of rods and cones on the retina of the human eye (Falk, et al, 1986, page 153)

Although it seems as if we have high resolution vision everywhere, it is only in the fovea that we have high resolution capabilities. In order to create the effect of high resolution everywhere with our resolution-limited system, it is necessary to move the detector across the visual field rapidly. The oculomotor system allows humans to move their eyes at speeds up to 600 degrees per second (Canosa, 2000). These eye movements are used to stabilize an image on the retina, follow an object that is moving, or reorient the eye to gather new information about a scene. On average, a person makes over 150,000 eye movements every day. Saccades are rapid, ballistic eye movements that reorient the fovea to new targets that require high acuity. Fixations occur when the eye pauses at a particular spatial location. By studying these eye movements, it is possible to understand how visual attention is deployed in the environment (Pelz, et al, 2000).

The mechanics of the oculomotor system have been studied in depth through experiments in controlled laboratory settings. Typically, eye movements are tracked as observers are asked to perform simple tasks while keeping the head stationary. These tasks involve looking at a static image or searching for a specific shape in a field of similar shapes. While these studies have revealed much about the visual system, the findings cannot be applied to visual perception during complex, natural tasks. Several studies have shown significant differences between eye movements when the head is fixed and eye movements when the head is allowed to move freely. For example, it was found that retinal image stabilization decreased when the subject’s head was not supported (Skavenski, 1979). Other research showed that saccades are faster and more accurate when the head is free to move (Collewijn, 1992). Also under natural conditions, vergence eye movements have been seen to be carried out at a higher velocity than previously thought (Steinman, 1990).

[Figure 1 axes: visual angle (degrees from fovea) vs. number of photoreceptors per mm²; curves: rods, cones]


2.2 Eye Movements and Picture Viewing

The first thorough investigation into how people look at pictures was published in 1935 by Guy T. Buswell. In his experiments, Buswell recorded eye movements of over 200 participants as they viewed 55 photographs of various types of fine art. He compared eye movements of trained and untrained artists, but found no significant differences. However, he concluded that although no two subjects exhibited the same viewing behavior, two general classes of viewing behavior could be formed. The first is represented by a global survey of the image, where subjects made brief fixations over the main features of the image. The second behavior is characterized by long fixations over smaller sections of the image. In general, the global fixations were made early, followed by longer fixations as viewing time increased (Buswell, 1935). When fixation patterns were plotted collectively over a specific image, areas of high fixation density often corresponded to information-rich regions in the image. This suggests that observers fixated on many of the same spatial locations in the image, but not in the same order over time. Generally, people did not randomly explore the images. Instead, they focused on foreground elements including faces and people, and rarely focused on background elements.

In 1967, Yarbus reported that as subjects viewed I.E. Repin’s An Unexpected Visitor, eye movement patterns changed when different instructions were given. For example, when observers were asked to remember the clothes the people in the painting are wearing, or to estimate the age of the people, the most informative regions received the most fixations (Yarbus, 1967). Research presented by Molnar in 1981 also shows that eye movement patterns changed depending upon the task given to observers. A group of fine-art students viewed eight classical paintings. Half of the group was told that they would later be questioned about what they saw. The other half was told that they would be asked about aesthetic qualities of the painting. Molnar found that fixations were much longer for the group making aesthetic judgments (Molnar, 1981).

In other research done by Nodine, Locher, and Krupinski in 1991, it was found that composition of images affected eye movement patterns of trained artists. Trained viewers made long fixations and tended to focus on spatial relationships between foreground objects and background. Untrained viewers made shorter fixations, and focused on semantically important regions of the image (Nodine, et al, 1991).


3. Methods

3.1 Eye Tracking Instrumentation

3.1.1 Wearable Eye Tracking System

In order to monitor eye movements during different phases of the photographic process, two types of eye trackers were used. During the image capture phase, a wearable eye tracker, developed in the Visual Perception Laboratory at Rochester Institute of Technology, was used. The wearable eye tracker was specifically designed to record eye movements as observers perform tasks in a natural environment, outside the laboratory. Based on a pair of racquetball goggles and a backpack, the wearable eye tracker does not interfere with natural eye, head, and whole-body movements. The custom headgear, as shown in Figure 2, uses an infrared illuminator, a miniature, IR-sensitive CMOS camera, and beam splitter to capture an image of the eye. The eye is illuminated by the IR along an optical path directed by a first surface mirror and hot mirror. The bright-pupil image is reflected back to the CMOS camera along the same path. A second miniature camera is located just above the right eye and is used to capture the scene from the subject’s perspective. Slightly above the scene camera is a small laser to be used for calibration. Finally, a battery is mounted on the left side of the goggles to power the cameras, and also to balance the weight of the optical components.

Figure 2: Custom-built headgear for the wearable eye tracker

Figure 3: Wearable eye tracking backpack containing digital video camcorder, picture-in-picture, ASL control unit, and batteries

Figure 4: Video image of the scene with eye image superimposed in the upper right corner. The crosshairs indicate point of gaze.

The headgear is wired to a customized Applied Science Laboratories (ASL) Model 501 controller unit contained in a backpack, shown in Figure 3. The line of gaze is computed in real time, and is based on a vector difference between the center of the pupil and the first corneal reflection. The ASL unit is calibrated for each subject by recording vectors as the person fixates on nine points of a calibration target. A crosshair representing the line of gaze is superimposed on the scene video. The video signals from the eye camera and ASL unit are passed through a picture-in-picture unit which superimposes the image of the eye in the corner of the scene, as seen in Figure 4. The combined image is then recorded onto a Sony digital video camcorder.
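The calibration idea described above can be illustrated with a minimal sketch. The report does not say what mapping the ASL unit fits to the nine calibration vectors, so the affine model, the function names (`pupil_cr_vector`, `calibrate`), and the synthetic numbers below are all assumptions chosen for illustration, not the ASL method.

```python
# Minimal sketch: map pupil-minus-corneal-reflection vectors to scene
# coordinates with an affine model fitted by least squares.
# ASSUMPTION: the affine model and all data here are illustrative only.

def pupil_cr_vector(pupil, cr):
    """Vector difference between pupil center and corneal reflection."""
    return (pupil[0] - cr[0], pupil[1] - cr[1])

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_axis(vectors, targets):
    """Least-squares fit of target = c0 + c1*vx + c2*vy (normal equations)."""
    rows = [(1.0, vx, vy) for vx, vy in vectors]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve3(ata, atb)

def calibrate(vectors, points):
    """Return a function mapping an eye vector to a scene position."""
    cx = fit_axis(vectors, [p[0] for p in points])
    cy = fit_axis(vectors, [p[1] for p in points])
    def to_gaze(v):
        return (cx[0] + cx[1] * v[0] + cx[2] * v[1],
                cy[0] + cy[1] * v[0] + cy[2] * v[1])
    return to_gaze

# Synthetic nine-point calibration target (3 x 3 grid, scene pixels).
points = [(x, y) for y in (0, 240, 480) for x in (0, 320, 640)]
# Invented eye vectors related to the targets by a known affine map.
vectors = [((x - 320) / 40.0, (y - 240) / 40.0) for x, y in points]

to_gaze = calibrate(vectors, points)
# Approximately (360, 200) for this synthetic data.
print(to_gaze(pupil_cr_vector((101.0, 49.0), (100.0, 50.0))))
```

With real recordings the nine vectors would be noisy, which is why a least-squares fit is used here rather than solving exactly from three points.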


3.1.2 Integrated Eye and Head Tracking System

During the editing task, the subject cropped photographs on a computer, as shown in Figure 5. For this portion of the experiment, an ASL Model 501 eye tracker and Polhemus 3-Space magnetic head tracker were used. By tracking head movements as well as eye movements, the gaze position relative to the computer monitor can be determined. The eye is tracked in the same fashion as in the wearable system. However, the head-mounted eye tracker can only be used in the laboratory because it is tethered to the control unit and video recorder. The Polhemus system uses a fixed transmitter located behind the subject, and a receiver attached to the eye tracker to determine the position and orientation of the head, shown in Figure 6. With the location of the transmitter defined as the origin, the plane of the LCD monitor was determined and entered into the ASL control unit. After averaging eight video fields to reduce noise, the ASL system calculates gaze position at an effective temporal resolution of 133 msec. The intersection of the subject’s line of gaze with the LCD monitor is computed in real time, and a video record with a cursor overlay is created, shown in Figure 7. Information about the gaze position can also be logged to a data file for off-line analysis.
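Two of the quantities in this section lend themselves to a short worked sketch. The first function reproduces the 133 msec figure under the assumption of a 60 fields-per-second video signal, which the report does not state explicitly. The second shows a generic ray-plane intersection of the kind needed to find where a line of gaze meets the monitor plane. The function names, coordinate frame, and example numbers are hypothetical and are not taken from the ASL or Polhemus software.

```python
# Sketch only: names, coordinate frame, and numbers are illustrative.

FIELD_RATE_HZ = 60.0  # ASSUMPTION: 60 video fields per second

def effective_resolution_ms(fields_averaged, field_rate_hz=FIELD_RATE_HZ):
    """Time span covered by averaging a number of video fields."""
    return 1000.0 * fields_averaged / field_rate_hz

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intersect_gaze_with_plane(eye_pos, gaze_dir, plane_point, plane_normal):
    """Point where the gaze ray meets the plane, or None if it never does.

    All vectors are 3-tuples in one coordinate frame (for example, with
    the head tracker's transmitter as the origin).
    """
    denom = dot(gaze_dir, plane_normal)
    if abs(denom) < 1e-12:
        return None  # gaze is parallel to the plane
    diff = tuple(p - e for p, e in zip(plane_point, eye_pos))
    t = dot(diff, plane_normal) / denom
    if t < 0:
        return None  # plane is behind the eye
    return tuple(e + t * d for e, d in zip(eye_pos, gaze_dir))

print(round(effective_resolution_ms(8), 1))  # about 133.3 msec

# Hypothetical geometry: monitor plane 60 units in front of the eye.
hit = intersect_gaze_with_plane(
    eye_pos=(0.0, 0.0, 0.0),
    gaze_dir=(0.1, 0.0, 1.0),
    plane_point=(0.0, 0.0, 60.0),
    plane_normal=(0.0, 0.0, 1.0),
)
print(hit)  # close to (6, 0, 60)
```

In practice the eye position and gaze direction would come from combining the head tracker's pose with the eye-in-head direction, a step this sketch leaves out.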

Figure 5: Setup for head mounted eye tracker with magnetic head tracker

Figure 6: Close-up of head mounted eye tracker with head position sensor attached

Figure 7: Video image of scene with eye image superimposed in upper left corner. Crosshairs indicate point of gaze.


3.2 The Tasks

The experiment consisted of two sessions. In the first, the subject was asked to take photographs. In the second, the subject returned to the laboratory and was allowed to crop the photographs. After the experiment, each subject was asked to complete a questionnaire about his or her familiarity with cameras and any previous training in the visual arts. On average, each task took 30 minutes, including setup and calibration of the equipment.

3.2.1 Task 1: Image Capture

A Kodak DC-210 digital camera was used for image capture. A portion of the LCD viewfinder was covered with a material that closely resembles the body of the camera, as shown in Figure 8. This was done so that the camera would capture more of the scene than was originally visible through the LCD viewfinder, illustrated in Figure 9.

Figure 8: Kodak DC-210 digital camera with partially covered LCD viewfinder

Figure 9: Sample picture taken with the Kodak DC-210. The shaded region indicates the portion of the image that was masked off in the LCD viewfinder.

After setup and calibration of the wearable eye tracking system, subjects were given instructions for the first task. They were shown a 5.5 x 8.5 inch, double-sided mock brochure about the Center for Imaging Science. The inside of the brochure contained rectangles with cartoon icons of objects including 1) a person, 2) a large sculpture, and 3) the main stairwell in the building atrium. The primary objects in these scenes define their ‘image class,’ and will be referred to as “person,” “sculpture,” and “interior” scenes. Subjects were instructed to take photographs to replace the icons. They were asked to take three photographs of a large sculpture, three of the main stairwell, and three of any person in the building. After being given a quick tutorial on how to use the camera, including the zoom feature, they were each asked to keep the camera oriented horizontally, and to use only the LCD viewfinder when taking photographs. Subjects were not informed of the alteration to the LCD viewfinder, and were not given any information about the second session of the experiment.

Figure 10: Sample pictures of the three image classes: person, sculpture and interior.


3.2.2 Task II: Image Edit

After the first task was completed, the wearable eye tracker was taken off the observer. He or she was given a five minute break before beginning the next task. During this time, the nine images from the digital camera were downloaded to a Macintosh computer, and a contact sheet of thumbnails was created in Adobe Photoshop 6. After setup and calibration of the integrated eye and head tracking system, subjects were given instructions for the second task. They were told that they were now going to edit (crop) the photographs for publication in the brochure. At this time, the subjects were asked to close their eyes, and a sample image of the Color Cube was put on the screen. Using Photoshop’s crop tool, a selection window was placed over the image, as illustrated in Figure 11. The area and position of the window corresponded to the area of the scene that was visible through the modified LCD viewfinder. The resolution of the full image was 640 x 480 pixels, and the inside of the crop window was 427 x 300 pixels. Subjects were asked to look at the Apple icon in the top left corner of the screen upon opening their eyes. After three seconds, they looked at the center of the screen. At this time, subjects were told that photographs would be presented to them with a selection window over the image, as in the example currently on the screen. Again, they were not informed that the area inside the selection window was what they saw through the LCD. Subjects were told that if they liked the image as it was, they could double click it, and what was inside the window would be kept. If not, they were allowed to move the window or change its size, but not its aspect ratio. Subjects were given time to practice using the mouse, moving the window, and changing the window size. Next, subjects were presented with thumbnails of the photographs they took. Beginning with the sculpture scene, they were asked to choose one of the three photographs to use for publication. Again they closed their eyes and the image was prepared. Subjects were then allowed to crop the image as they pleased. This process was then repeated for the interior and person scenes. Information about the size and position of the final crop window was recorded by Adobe Photoshop during the experiment.
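As a small worked example of the numbers above, the following sketch computes what fraction of the full image area the initial crop window covers, and how a window can be resized while its aspect ratio stays fixed. The function names are hypothetical and the code does not reproduce Photoshop's crop tool; only the pixel dimensions come from the text.

```python
# Sketch only: helper names are illustrative, not Photoshop's behavior.

FULL_W, FULL_H = 640, 480   # full image size from the text (pixels)
CROP_W, CROP_H = 427, 300   # initial crop window size from the text

def area_fraction(w, h, full_w=FULL_W, full_h=FULL_H):
    """Fraction of the full image area covered by a w x h window."""
    return (w * h) / (full_w * full_h)

def resize_locked(w, h, new_w):
    """Resize a window to a new width while keeping its aspect ratio."""
    return new_w, round(new_w * h / w)

print(round(area_fraction(CROP_W, CROP_H), 3))   # 0.417
print(resize_locked(CROP_W, CROP_H, FULL_W))     # (640, 450)
```

By this arithmetic, the masked viewfinder showed roughly 42% of the captured area, and a window stretched to the full image width would still fit within the 480-pixel image height.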

Figure 11: Experimental setup in Adobe Photoshop during the image editing task. The shaded region outside the crop window indicates the portion of the image covered in the LCD viewfinder during the image capture task.


3.2.3 Subjects

Sixteen subjects performed this experiment. However, only ten were successfully eye tracked in both tasks. All ten of these subjects considered themselves amateur photographers. Three of the subjects were female, and seven were male. Their ages ranged from 18 to 46. Some had taken art classes in the past, but all participants either studied or worked in a technological field. Each subject read and signed an informed-consent form before the experiment began.

4. Results

Video records of each task were carefully analyzed, yielding the results found below. Additionally, log files of gaze position coordinates from the second task were used.

4.1 Task 1: Image Capture

4.1.1 Gaze Duration

One measure of a photographer’s behavior is the amount of time he or she spends taking a photograph. As shown in Figure 12, subjects spent an average of 17 seconds taking one photograph of a person. For the sculpture and interior scenes, they spent 25 and 29 seconds, respectively. These results show a relationship between the amount of time spent and the content of the scene being photographed; the amount of time needed to frame a photograph seemed to increase with the physical size of the object being photographed. Using eye movement data, we performed a finer-grained analysis by breaking up the total time spent completing the task. Figure 13 shows the amount of time subjects spent looking at the primary object, the surround, and the camera.

[Figure 12 axes: image class (person, sculpture, interior) vs. average time (sec)]

Figure 12: Average time spent taking one photograph for the person, sculpture, and interior scenes. Error bars represent one standard error of the mean for ten subjects.
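For readers unfamiliar with the error bars, a minimal sketch of the mean and standard error of the mean follows. The ten values are invented purely to make the example run; they are not the study's measurements, and the function name is hypothetical.

```python
import math
import statistics

def mean_and_sem(values):
    """Return the mean and the standard error of the mean (SEM).

    SEM = sample standard deviation / sqrt(number of subjects).
    """
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem

# INVENTED capture times (sec) for ten hypothetical subjects.
times = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

mean, sem = mean_and_sem(times)
print(mean, round(sem, 3))
```

The SEM shrinks with the square root of the number of subjects, so with only ten subjects the bars can remain wide even when individual measurements are precise.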

The primary object is the object that the observer was instructed to take a picture of, i.e., the person, Color Cube, or stairwell. Distinguishing between the primary object and surround is straightforward for the person and sculpture classes, but less so for the interior scene. When extracting data from the video tape, fixations were coded as ‘primary object’ when subjects looked at the stairwell in a manner that indicated they were not using the fixations to help them navigate the stairs. For example, fixations along the railings, ceilings, walls and structure of the stairwell were coded as ‘primary object.’ Video records from previous experiments in which subjects had to use the stairwell only to move between floors show that it is characteristic for a person to look at individual steps as they walk up or down the stairs. These fixations were coded as gaze durations in the surround. This category also included fixations made on people walking by, or other objects not relevant to the task. It should also be noted that due to resolution and parallax limitations, the point of gaze on the LCD viewfinder of the camera could not be accurately determined.


On average, subjects spent about 3 seconds looking at the primary object when taking a photograph of a person and sculpture. For the interior scene, subjects fixated on the primary object for more than twice that amount of time. Next we consider the amount of time spent looking at the surround. For the person and interior scenes, there is no significant difference between the amount of time spent looking at the object and surround. However, an increase in the amount of time spent looking at the surround is evident for the sculpture scene. Video records show that it is not uncommon for the subject to walk around the sculpture before taking photographs. Another way to compare the photographer’s behavior between the three classes is to analyze the fraction of the total time spent looking at the object, surround, and camera. The percentages were calculated for each subject, and then averaged across all ten subjects. Figure 14 shows an illustration of the distribution of time.
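The percentage calculation described above (per subject first, then averaged across subjects) can be sketched as follows. The category labels mirror the text, but the function names and the two subjects' coded fixations are invented for illustration and are not data from the experiment.

```python
# Sketch only: data and names are hypothetical.

CATEGORIES = ("object", "surround", "camera")

def gaze_durations(coded_fixations):
    """Sum gaze duration (sec) per category from (category, sec) pairs."""
    totals = {c: 0.0 for c in CATEGORIES}
    for category, duration in coded_fixations:
        totals[category] += duration
    return totals

def percentages(totals):
    """Convert one subject's totals to percentages of that subject's time."""
    total = sum(totals.values())
    return {c: 100.0 * totals[c] / total for c in CATEGORIES}

def average_percentages(per_subject):
    """Average the per-subject percentages across subjects."""
    n = len(per_subject)
    return {c: sum(p[c] for p in per_subject) / n for c in CATEGORIES}

# INVENTED coded fixations for two hypothetical subjects.
subject_a = [("object", 2.0), ("camera", 6.0), ("surround", 2.0)]
subject_b = [("object", 2.0), ("surround", 6.0), ("camera", 12.0)]

per_subject = [percentages(gaze_durations(s)) for s in (subject_a, subject_b)]
avg = average_percentages(per_subject)
print(avg)
```

Computing percentages per subject before averaging gives every subject equal weight. Pooling all durations first would instead weight slower subjects more heavily: in this invented data the object share is 15% when averaged per subject but about 13% when pooled.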

[Figure 13 axes: image class (person, sculpture, interior) vs. gaze duration (sec); series: object, surround, camera]

Figure 13: Amount of time spent looking at the primary object, surround, and camera for each image class

Figure 14: Average percentage of time spent looking at the primary object, surround, and camera for all three image classes.

Image class   Object   Surround   Camera
Person        19%      21%        60%
Sculpture     15%      31%        54%
Interior      26%      28%        46%


Figure 14 shows that for the person scene, subjects spent about the same percentage of time looking at the primary object as they did looking at the surround. However, the majority of the time was spent looking at the camera. For the sculpture scene, subjects spent twice as much time looking at the surround scene as they did looking at the sculpture itself. When taking a photograph of the interior scene, subjects spent about the same amount of time looking at the object as they did the surround. Compared to the person scene, they spent a higher percentage of their time looking at the object and surround, and a lower fraction of their time looking at the camera. Again, these differences suggest that the task of taking a photograph is influenced by the content of the scene. 4.2 Task 2: Image Editing 4.2.1 Gaze Duration Video tape records from the second task were analyzed to determine the total amount of time spent cropping the photographs. Figure 15 shows the average amount of time spent for each of the three image classes. Unlike the image capture task, there is no significant difference between the person, sculpture, and interior scenes. Subjects spent a little over 30 seconds cropping each photograph. This suggests that, unlike the image capture task, the task of cropping photographs is not influenced by the content of the image. In comparison to taking a photograph of a person, subjects spent almost twice as much time cropping the photograph, as seen in Figure 16. The sculpture and interior scenes also mark an increase in the time spent cropping the photograph in comparison to capturing the photograph. An analysis similar to that of Figures 13 and 14 was performed on eye movement data from the image editing task. The amounts of time subjects spent looking at the primary object and surround were compared and are illustrated in Figures 17 and 18. This analysis could not be performed on the interior scene, as the primary object comprised the entire image. 
Figure 17 shows that on average, subjects spent slightly more time looking at the surround when cropping the photograph of a person. However, because of the large standard deviation and the small number of subjects, this difference is not statistically significant. For the sculpture scene, there is no significant difference between the amount of time spent looking at the object and surround. Recall from Figure 13 that during the image capture task, subjects spent nearly twice as much time looking at the surround for the sculpture scene; this shows that eye movements are task dependent.
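The object versus surround comparison described above can be sketched in a few lines of Python. This is only an illustration: the report does not describe its analysis software, so the input format (a list of hand-labeled fixations with durations), the function name `gaze_share`, and the sample values are all assumptions, not data or code from the study.

```python
from collections import defaultdict


def gaze_share(fixations):
    """Sum gaze time per region and express it as a percentage of the total.

    `fixations` is an iterable of (region, duration_sec) pairs. The region
    labels ("object", "surround") are assumed to have been assigned by hand
    from the video record; this format is hypothetical.
    """
    totals = defaultdict(float)
    for region, duration in fixations:
        totals[region] += duration

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    return {
        region: {"seconds": seconds, "percent": 100.0 * seconds / grand_total}
        for region, seconds in totals.items()
    }


# Made-up fixation log for a single cropping trial (not data from the study).
example = [("object", 2.0), ("surround", 1.5), ("object", 1.0), ("surround", 0.5)]
shares = gaze_share(example)
```

Averaging the per-subject results from such a function would give the kind of per-class totals and percentages plotted in Figures 17 and 18.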

Figure 15: Average time spent cropping one photograph. [Bar chart; x-axis: Image Class (Person, Sculpture, Interior); y-axis: Average Time (sec), 0 to 45.]

Figure 16: Comparison between the amount of time spent completing the image capture and image editing tasks. [Bar chart; x-axis: Image Class (Person, Sculpture, Interior); y-axis: Average Time (sec), 0 to 45; series: Capture, Edit.]


Figure 17: Amount of time spent looking at the object and surround for the person and sculpture image classes. [Bar chart; x-axis: Image Class (Person, Sculpture); y-axis: Gaze Duration (sec), 0 to 30; series: Object, Surround.]

Figure 18: Average percentage of time spent looking at the primary object and surround

Figure 18 also shows no significant difference in the percentage of time subjects spent looking at the object and surround for the person and sculpture scenes. The large standard error may be a result of the variability between images. In some extreme cases, the primary object comprised the entire image, as illustrated in Figure 19.

Figure 19: Sample image of sculpture scene in which the primary object takes up the entire field of view

[Figure 18 pie charts: Sculpture and Person panels, each divided into Object and Surround; extracted values 43% / 57% and 55% / 45%.]


4.2.2 Fixation Densities

Using the integrated eye and head data logged during the experiment, it is possible to study the spatial and temporal distribution of fixations on the image during the editing task. Figure 20 a) shows fixation positions for subject CS for the entire task of image editing. Figures 20 b) through e) present a timeline of fixations for portions of the cropping task. The black border indicates the current position of the crop window, beginning with the default window in b). Figure 20 e) shows fixations as the crop window was being moved.

Figure 20: a) Fixation densities on an image across the entire editing session for subject CS. Figures 20 b) through e) show fixations during sequential portions of the editing session. The black border represents the current position of the crop window. Figure 20 c) shows fixations as the crop window was being expanded down and to the right.

Figure 20 b) shows subject CS’s initial fixations on the primary object and its relation to other important objects in the image. In the next figure, CS expands the crop window to include the person’s arm and computer, suggesting that these are important features to include in this composition. In Figure 20 d), fixations appear to correlate with the subject’s decision to exclude the people in the background. Figure 20 e) shows the final crop window, which now excludes the people in the background, but includes the computer and arm. Note that these portions of the image were not seen through the LCD viewfinder during image capture. Subject CS decided to include more of the image than she had originally captured.

In general, subjects’ eye movement patterns correlated with their decisions to include or exclude areas of the image. Also, initial fixations were almost always made on the primary object. Fixations then moved to the surround and crop window borders. Careful analysis of the video records shows that just before the final window was chosen, subjects generally looked back and forth between the primary object, the surround, and the window border to check the spatial relationships between them.


To determine differences in the spatial distribution of fixations between the three image classes, the mean horizontal and vertical eye position in pixels was computed for each image. Also, the radial distance of each fixation from the mean fixation was calculated. This analysis is illustrated in Figure 21.
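The spread measure described above can be sketched as follows. This is a minimal illustration, assuming fixations are available as (x, y) pixel coordinates. The function name `fixation_spread` and the sample points are hypothetical, and the choice of the population standard deviation is an assumption, since the report does not say which estimator was used.

```python
import math
import statistics


def fixation_spread(fixations):
    """Return the mean fixation, radial distances from it, and their spread.

    `fixations` is a list of (x, y) positions in pixels (assumed format).
    The spread is the population standard deviation of the radial
    distances; the report does not specify the estimator, so this is
    an assumption.
    """
    n = len(fixations)
    if n == 0:
        raise ValueError("at least one fixation is required")

    # Mean horizontal and vertical eye position.
    mean_x = sum(x for x, _ in fixations) / n
    mean_y = sum(y for _, y in fixations) / n

    # Radial (Euclidean) distance of each fixation from the mean fixation.
    distances = [math.hypot(x - mean_x, y - mean_y) for x, y in fixations]

    return (mean_x, mean_y), distances, statistics.pstdev(distances)


# Made-up fixation positions in pixels (not data from the study).
example = [(0, 0), (4, 0), (2, 0)]
mean_fixation, distances, spread = fixation_spread(example)
```

Computing `spread` once per image and averaging within each image class would give values comparable across the person, sculpture, and interior classes.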

Figure 21: a) Fixation densities during editing session for subject SM. b) Example of the spread from the mean fixation across all fixations.

The standard deviation of the radial distances across all fixations was compared across the three image classes for eight subjects. The mean standard deviations for the person, sculpture, and interior classes all fell between 70 and 80 pixels (~1 degree), with a standard error of ~5 pixels. This result shows that there are no significant differences between the spread of eye movements (relative to the mean fixation) for the three image classes in this task.

4.2.3 Cropping and Zooming

Subjects' behavior when cropping photographs was compared to their use of zoom when taking photographs. Because this analysis can be done without the use of the eye tracker, the following results were computed using data from sixteen subjects. When given the option to crop their photographs, all subjects decided to do so. Every subject moved the position of the crop window, and all but one subject changed the size of the window. However, when taking photographs, only 12 out of the 16 subjects, or 75%, used the zoom feature on the camera.

The sizes of the final crop windows were compared to the size of the image as originally seen through the LCD viewfinder. Figure 22 shows the average percentage of the original photograph kept after image cropping for the person, sculpture, and interior classes. It is interesting that on average, subjects decided to keep more of the photograph than originally captured for the sculpture and interior cases.
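One way to express the crop-size comparison above is as a ratio of areas. The sketch below assumes that the "percentage of the original photograph kept" is the final crop window area divided by the area seen through the LCD viewfinder. The report does not state the exact computation, so the formula, the function name `percent_kept`, and the pixel dimensions are assumptions for illustration only.

```python
def percent_kept(crop_w, crop_h, view_w, view_h):
    """Area of the final crop window as a percentage of the viewfinder area.

    All arguments are in pixels. Values above 100 mean the subject kept
    more of the scene than was seen through the viewfinder. The area-ratio
    definition is an assumption, not taken from the report.
    """
    if view_w <= 0 or view_h <= 0:
        raise ValueError("viewfinder dimensions must be positive")
    return 100.0 * (crop_w * crop_h) / (view_w * view_h)


# Hypothetical dimensions (not data from the study).
tighter = percent_kept(320, 240, 640, 480)  # crop smaller than viewfinder view
wider = percent_kept(800, 600, 640, 480)    # crop larger than viewfinder view
```

Under this definition, a class average above 100 percent corresponds to the sculpture and interior result, where subjects kept more than they had originally framed.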

Figure 22: Average percentage of the original photograph kept after editing for all three image classes. [Bar chart; x-axis: Person, Sculpture, Interior; y-axis: Percentage, 0 to 140.]


5. Discussion

Results from the image capture and image edit tasks are not symmetrical. While differences between image classes were found during the image capture task, no significant differences were found in the image edit task.

5.1 Image Capture Summary

Video records from the image capture task reveal differences in oculomotor behavior across the image classes of person, sculpture and interior. First, the amount of time that subjects spent completing the task increased with the extent of the object being photographed. Subjects spent the same amount of time looking at the primary object in the person and sculpture scenes, but spent almost twice as much time looking at the surround when photographing the sculpture. Also, the proportion of time spent looking at the primary object, the surround, and camera differed across image classes. Behavior while taking a photograph, as revealed through eye movements, is influenced by the type of the scene being photographed.

However, it is important to keep in mind that only one specific example of each image class was used during this experiment, making the classifications of person, sculpture and interior somewhat arbitrary. No similar research involving eye tracking during image capture has been done before, so there is no context for comparison.

5.2 Image Editing Summary

Unlike the image capture task, no significant differences were found in the amount of time subjects spent cropping photographs across the three image classes. Between the person and sculpture scenes, there was no significant difference in the amount of time spent looking at the primary object and surround. Fixation densities and eye movement patterns also showed no discernible differences between image classes. The standard deviations of the spread of fixations, relative to the mean, were nearly the same for the person, sculpture and interior scenes.

In comparing the amount of the original image kept after cropping, subjects decided to keep more of the original image than originally seen through the LCD viewfinder for the sculpture and interior scenes. This result is interesting because it does not agree with the result of a related experiment performed by Miller and Muszak in 1999. In their study, consumer photographers were intercepted when picking up a set of developed photographs and were asked if they would like to crop digital copies of their photos using a computer. While looking at an image, the fourteen participants could zoom in or out using the “I” and “O” keys, and move the crop window by pressing the directional arrow keys. Each participant selected seven images to crop. On average, only 43 percent of the total area of the original was kept after cropping. It is important to note the difference between the experiments. In Miller and Muszak’s experiment, participants were not able to consider keeping more of the photograph than they had originally captured.

5.3 Future Research

The next step in studying how people take pictures is to eye track them through a traditional viewfinder. The largest limitation of this experiment is the inability to accurately determine the point of gaze on the LCD viewfinder itself, due to resolution and parallax limitations of the equipment. Further study may lead to new designs in consumer photographic equipment, or new technologies including adaptive printing technologies. For example, intelligent software may someday be able to perform “assisted composition” by suggesting cropping options for consumers before photographs are printed or reprinted.


6. Conclusion

This research project utilizes the study of eye movements to gain insight into how humans perform a common task: taking a photograph. During a two-part experiment using eye tracking equipment, oculomotor behavior was recorded. Analysis revealed differences in the way participants photographed three types of scenes: person, sculpture, and interior. Most notably, there was an increase in the amount of time spent completing the task between the person and sculpture scenes, and again between the sculpture and interior scenes. Also, the amount of time subjects spent looking at the primary object, the surround, and the camera differed across image classes.

When taking a photograph of a person, subjects spent about 20 percent of their time looking at the primary object, and 20 percent looking at the surround. For the sculpture scene, subjects spent 15 percent of their time looking at the object and 30 percent looking at the surrounding scene. Finally, when photographing an interior scene, about 30 percent was spent looking at the primary object, and another 30 percent looking at the surround.

When subjects cropped their photographs, no significant differences between the amount of time spent looking at the primary object or surround were found for the person and sculpture classes. Also, the spread of fixations relative to the mean fixation was nearly constant across all three image classes.

The results of this experiment suggest that oculomotor behavior during the task of taking a photograph is influenced by the type of scene being photographed. However, the content of the photographs did not affect the task of image cropping. This supports previous research showing that oculomotor behavior is task dependent (Yarbus, 1967; Pelz et al., 2000; Land et al., 1999).


References

Buswell, G. T. How People Look at Pictures: A Study of The Psychology of Perception in Art, The University of Chicago Press, Chicago, 1935.

Canosa, R. “Eye Movements and Natural Tasks in an Extended Environment.” Master of Science Thesis, Rochester Institute of Technology. 2000.

Collewijn, H., et al., “Effects of freeing the head on eye movement characteristics during three dimensional shifts of gaze and tracking.” The Head-Neck Sensory Motor System. Oxford University Press: New York. 1992.

Falk, D., Brill, D., and Stork, D. Seeing the Light: Optics in Nature, Photography, Color, Vision, and Holography. John Wiley and Sons, Inc. New York. 1986.

Land, M., Mennie, N., & Rusted, J., “The roles of vision and eye movements in the control of activities of daily living.” Perception Magazine, Vol. 28., 1999.

Miller, Michael E. and Muszak, Jerry. "Consumer Behavior when Zooming and Cropping Personal Photographs and its Implications for Digital Image Resolution", In Proceedings of the 52nd Annual Conference of the Society for Imaging Science and Technology (pp. 137-142). Savannah, GA.: The Society for Imaging Science and Technology. 1999.

Molnar, F. “About the role of visual exploration in aesthetics.” Advances in Intrinsic Motivation and Aesthetics. H. I. Day, pp. 385-413. Plenum Press. New York. 1981.

Nodine, C.F., Locher, P.J., and Krupinski, E.A., “The role of formal art training on perception and aesthetic judgment of art composition,” Leonardo, 26, pp. 219-227, 1991.

Pelz, J.B. & Canosa, R., “Oculomotor Behavior and Perceptual Strategies in Complex Tasks,” Accepted for publication in Vision Research. 2001.

Pelz, J.B., Canosa, R., Babcock, J., Kucharczyk, D., Silver, A., and Konno, D., “Portable Eyetracking: A Study of Natural Eye Movements,” Proceedings of the SPIE, Human Vision and Electronic Imaging, San Jose, CA: SPIE 2000.

Skavenski, A. et al., “Quality of retinal image stabilization during small natural and artificial body rotations in man.” Vision Research, Vol 19. 1979.

Steinman, R.M., et al., “New directions for oculomotor research.” Vision Research, Vol 30. 1990.

Yarbus, A. L., Eye Movements and Vision. Plenum Press, New York. 1967.

Portions of this research have been presented in "How people look at pictures before, during, and after image capture: Buswell revisited," Proceedings of SPIE, Human Vision and Electronic Imaging. San Jose, 2002.