
WWW and Internet

CS 7450 - Information Visualization
March 4, 2004
John Stasko

Spring 2004 CS 7450 2

Internet and WWW

• By nature, abstract, so good target for visualization

• Often described in terms of metaphors
  − “Information Superhighway”


Agenda

• Two main topics
  − Presentations of the Internet and WWW
    Focus on topology and navigation, similar to the graph visualization work
  − Visual aids for browsing and using the WWW and the Internet
    Assistive visualizations not focusing on presenting net structure and connectivity


1. Internet and WWW Topology

• Fundamentally, the Internet is a graph with some existing physical topology, though that is often not how we want to conceptualize it
  − Might think of it as having a structure

• Our discussions from graph visualization are germane here


Mukherjea & Foley WWW ‘95

The Problem


The Problem

• Websites simply are too big
• Huge graphs
• Layout is challenging


Step Back

• Why would someone want to visualize the WWW?


Some Reasons

• Aid authors and webmasters with production and organization of content

• Assist Web surfers making sense of the information

• Help researchers understand the Web


Depictions of the Web

• Great web site that presents many different conceptualizations of cyberspace
  − Atlas of Cyberspace
    http://www.cybergeography.org/atlas/

• Let’s take a few minutes to browse...


Mapping the Internet

• Bill Cheswick at ATT
• Interesting visualizations plus the data sets are available
• www.cs.bell-labs.com/who/ches/map/index.html


Internet Traffic Paths

www.caida.org/tools/measurement/skitter/


MboneMap

www.cs.berkeley.edu/~elan/mbone/map.html


Immersive Systems

www.pnl.gov/remote/projects/starlight/


View of Web Site’s Pages

www.dynamicdiagrams.com/


Web Site

www.mos.ics.keio.ac.jp/NattoView


Web Site Visitations

www.inventix.com


Task Analysis

• Potential web-related tasks
  − How and when has info been accessed?
  − Where do people enter and spend time?
  − How do they move about?
  − What paths aren’t traversed?
  − Where are they coming from?
  − What has been added, changed, deleted?
  − Do changes affect navigation patterns?
  − Do we need to do a redesign?


Data Set

• Each server request is a data case
• Example variables
  − IP Address/Client host
  − Timestamp
  − URL requested
  − HTTP status (success, not found, …)
  − Bytes delivered
  − Referencing URL (HTTP Referer header)
  − User agent (browser and OS info)
  − ...
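The variables listed above map closely onto the fields of a web server access log. As a minimal sketch, assuming an Apache-style Combined Log Format (the slides do not say which log format was used), one request line could be turned into a data case like this. The sample line and the `parse_request` name are invented for illustration.

```python
import re
from datetime import datetime

# Assumption: Apache-style Combined Log Format (not stated in the slides).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_request(line):
    """Turn one log line into a data case (dict), or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    case = m.groupdict()
    case["time"] = datetime.strptime(case["time"], "%d/%b/%Y:%H:%M:%S %z")
    case["status"] = int(case["status"])
    # A "-" in the size field means no body was sent; store it as 0 bytes.
    case["bytes"] = 0 if case["bytes"] == "-" else int(case["bytes"])
    return case

# Hypothetical sample line, invented for this example.
sample = ('192.0.2.7 - - [04/Mar/2004:10:15:32 -0500] '
          '"GET /products/index.html HTTP/1.0" 200 5120 '
          '"http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"')
case = parse_request(sample)
print(case["host"], case["url"], case["status"], case["bytes"])
```

Each resulting dict is one row of the data set; a list of such rows is the kind of table a general-purpose InfoVis tool could load.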


One Approach

• Use existing InfoVis tool (Eureka, Spotfire, InfoZoom, etc.), load the data set, and analyze it

• Get all the strengths and weaknesses of the InfoVis tool for supporting particular analysis tasks


Web Ecology

• Problem: Most visualizations of the web fail to present the dynamically changing ecology of users and documents on the web

• What do we mean by ecology metaphor?

Chi, et al CHI ‘98


Web Ecology

• By understanding set of relationships (ecology) among users and their information environment, and its change through time (evolution), individuals can better understand
  − Web Content
  − Layout of physical and topological space
  − Usage through time


Existing Visualizations

• Despite useful functions, problems
  − Difficulty visualizing large number of documents
  − Considerable amount of screen real-estate used
  − Only permits the visualization of a site at a particular point in time, very difficult to make comparisons across times
  − No mechanisms provided that allow differences in usage to be identified


Techniques

• Disk Tree
  − Center-rooted tree that represents the hyperlink structure of a web site
• Time Tube
  − Set of disk trees that organizes and visualizes the evolution of web sites


Task Application

• Visualizations designed to be useful for
  − Local - Finding specific content
  − Comparison - Comparing info at two places
  − Global - Discovering a trend or pattern in the site


Analysis Domain

• www.xerox.com, April ‘97
  − 7,588 items across a 30-day period
  − 889 new items
  − Daily log kept of additions, modifications, and deletions of content
  − Base data comes from link info, usage log from web servers
  − Topological info from custom hyperlink database


Disk Trees

• Interested in shortest number of hops from one document to another

• Breadth-first traversal transforms the web graph into a tree by placing each node as close to the root node as possible

• After obtaining this tree we then visualize the structure using the Disk Tree technique
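The graph-to-tree step described above can be sketched as a standard breadth-first search that records, for each page, the first page from which it was reached. This is an illustrative sketch and not the authors' code; the `links` dictionary and page names are hypothetical.

```python
from collections import deque

def bfs_tree(links, root):
    """Return {page: parent} for a breadth-first spanning tree of the link graph.

    links maps each page to the list of pages it links to. Pages are visited
    in order of distance from the root, so each page joins the tree at its
    minimum number of hops from the root.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:      # first visit = fewest hops
                parent[target] = page
                queue.append(target)
    return parent

def depth(parent, page):
    """Number of tree links between page and the root."""
    d = 0
    while parent[page] is not None:
        page = parent[page]
        d += 1
    return d

# Hypothetical site: "contact" is reachable via two different paths.
links = {
    "home": ["products", "about"],
    "products": ["printers", "home"],
    "about": ["printers", "contact"],
    "printers": ["contact"],
}
tree = bfs_tree(links, "home")
print(tree)
```

Links that are not chosen as tree edges (such as the second path to "contact") are dropped from the tree, which is the price of turning a general graph into a hierarchy.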


Disk Tree

Lines - tree links
Line size & brightness - page access frequency
Color - page lifecycle stage
  new: red
  continued: green
  deleted: yellow
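A center-rooted layout with this color legend could be sketched as follows. The slides do not give the layout rule, so this sketch assumes that each tree level sits on its own concentric ring and that angular space is divided among subtrees in proportion to their leaf counts; the `children` dictionary, `ring_spacing` parameter, and function names are hypothetical.

```python
import math

# Color legend taken from the slide.
LIFECYCLE_COLOR = {"new": "red", "continued": "green", "deleted": "yellow"}

def count_leaves(children, node):
    """Number of leaf pages in the subtree rooted at node."""
    kids = children.get(node, [])
    if not kids:
        return 1
    return sum(count_leaves(children, k) for k in kids)

def disk_layout(children, root, ring_spacing=1.0):
    """Return {node: (x, y)}: root at the center, depth d on radius d * ring_spacing."""
    positions = {}

    def place(node, depth, start, end):
        # Put the node in the middle of its angular wedge.
        angle = (start + end) / 2.0
        r = depth * ring_spacing
        positions[node] = (r * math.cos(angle), r * math.sin(angle))
        kids = children.get(node, [])
        total = sum(count_leaves(children, k) for k in kids)
        cursor = start
        for k in kids:
            # Assumed rule: wedge width proportional to the child's leaf count.
            span = (end - start) * count_leaves(children, k) / total
            place(k, depth + 1, cursor, cursor + span)
            cursor += span

    place(root, 0, 0.0, 2.0 * math.pi)
    return positions

# Hypothetical tree, e.g. the output of a breadth-first traversal.
children = {
    "home": ["products", "about"],
    "products": ["printers"],
    "about": ["contact", "jobs"],
}
pos = disk_layout(children, "home")
for page, (x, y) in pos.items():
    print(f"{page}: ({x:.2f}, {y:.2f})")
```

Because every node lies in the same plane, the third dimension stays free for another variable such as time.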


Advantages

• Structure is compact, with pattern easily recognizable

• When viewed straight on or at slight angles, no occlusion problems, since entire layout is on a 2-D plane

• Unlike cone trees, this 2-D representation can utilize a third dimension for other information, such as time

• Circularity pleasing to the eye


Time Tubes

• Time Tubes are multiple disk trees laid out along a spatial axis
• Advantages
  − By using a spatial axis to represent time, we see information space-time in a single visualization
  − Focus and Context
  − Possibility for Animation


Time Tubes


Key Point

• Pages that exist at any time during the studied period are shown in all disk trees for the period, even if they didn’t exist yet
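One way to read this point: every disk tree in the tube is built over the same union of pages, so a page keeps its position from slice to slice and only its status changes. A minimal sketch under that reading follows; the status names reuse the Disk Tree color legend, while the "absent" label, the treatment of the first slice, and the sample data are assumptions made for illustration.

```python
def time_tube_slices(snapshots):
    """snapshots: list of sets of page names, one set per time slice.

    Returns (all_pages, statuses), where statuses[i][page] is the status of
    page in slice i. Every slice covers the same union of pages.
    """
    all_pages = sorted(set().union(*snapshots))
    statuses = []
    previous = None
    for current in snapshots:
        status = {}
        for page in all_pages:
            if page in current:
                if previous is not None and page not in previous:
                    status[page] = "new"
                else:
                    # Assumption: pages in the first slice count as continued.
                    status[page] = "continued"
            elif previous is not None and page in previous:
                status[page] = "deleted"
            else:
                # Assumption: placeholder for pages not present in this slice.
                status[page] = "absent"
        statuses.append(status)
        previous = current
    return all_pages, statuses

# Hypothetical weekly snapshots of a small site.
weeks = [
    {"home", "factbook"},
    {"home", "factbook", "press"},
    {"home", "press"},
]
all_pages, statuses = time_tube_slices(weeks)
for i, status in enumerate(statuses):
    print(i, status)
```

Keeping the page set fixed is what makes slice-to-slice comparison possible: the same position always refers to the same page.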


Real Use

• Time Tube answers following questions:
  − What devolved into dead wood? When did it? Was there a correlation with the restructuring of the Web site?
    Product safety pages got darker and darker, indicating lower usage
    Doesn’t tell why page is less popular, just raises a flag to explore page further


Real Use

• What evolved into a popular page? When did it? Was there a correlation with the restructuring of the Web site?
  − Redesign of site called attention to Fact Book page
  − Became more popular and the corresponding Disk Trees become greener and greener in successive weeks


Real Use

• How was usage affected by items added over time?
  − Press release issued for new family of products, shown as red links
  − Usage in the third week jumped from 1 access to 871 accesses; this example helps us understand that this was probably a well received product line


Real Use

• How was usage affected by items deleted over time?
  − Removing the direct link from the home page to the main driver page did not negatively affect the overall use of driver information
  − Info stayed green indicating usage, but link from home page was black, showing not much traffic


E-Commerce Applications

• What if your focus is on understanding user access patterns for web sites selling products to consumers?

• What tasks are important?


One Approach

• Blue Martini Software
• Aggregate web data and visualize simplified graph of user movements through web site
• Highlight places where people leave before purchasing
• ...
Brainerd & Becker InfoVis ‘01
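The aggregation idea can be illustrated with a small sketch: count page-to-page moves over all sessions, keep only the frequent ones to simplify the graph, and tally the last page of every session that never reached a purchase. This is not Blue Martini's method; the session data, the "purchase" page name, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter

def aggregate_paths(sessions, purchase_page="purchase", min_count=1):
    """Count page-to-page moves and where non-purchasing sessions ended.

    sessions is a list of page sequences, one per visitor session.
    Returns (frequent_edges, exits).
    """
    edges = Counter()
    exits = Counter()
    for pages in sessions:
        # Each consecutive pair of pages is one move through the site.
        for src, dst in zip(pages, pages[1:]):
            edges[(src, dst)] += 1
        # A session without the purchase page ended at its last page.
        if pages and purchase_page not in pages:
            exits[pages[-1]] += 1
    # Simplify the graph by dropping rarely used moves.
    frequent = {edge: n for edge, n in edges.items() if n >= min_count}
    return frequent, exits

# Hypothetical sessions, invented for this example.
sessions = [
    ["home", "catalog", "cart", "purchase"],
    ["home", "catalog", "cart"],
    ["home", "catalog"],
    ["home", "search", "catalog", "cart"],
]
edges, exits = aggregate_paths(sessions, min_count=2)
print(edges)
print(exits.most_common())
```

In this toy data the cart page is the most common place to leave without buying, which is the kind of spot such a view is meant to highlight.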


Different icons represent different kinds of pages

Only show most-used pages


E-Commerce mimics mall shopping :^)

Gender differences in purchase paths at websites


2. Aiding WWW Browsing

• Can we utilize information visualization techniques to help people interact with the WWW and the Internet?

• Battle “lost in hyperspace” problem
• Help us know what’s there
• Help us find things


WebBook and Web Forager

• Personal computers viewed as knowledge processors before
  − Spreadsheets and calculators
• Now viewed as knowledge sources, portals to vast information worlds
  − Networking and WWW

Card, Robertson and York CHI ‘96


WWW Problems

• Pages are hard to find
• Users get lost, can’t relocate pages
• Difficulty organizing things once found
• Difficulty doing knowledge processing on found things
• Interacting with web is too slow to incorporate gracefully into other activities


Information Foraging Theory

• From Ecological Biology
• Idea: user stalks certain types of information
• Users have tendency to interact repeatedly with small clusters of information (locality of reference)
• Information encountered at certain rate
  − Users evolve to increase finding rate
  − Sources evolve to be more attractive


Mechanisms Evolved

• 3 mechanisms in the evolution of the web on the server side
  − Indexes - Lycos search
  − Table of contents - Yahoo
  − Home pages provided by users with big lists of related links


Assisting People

• To provide insight
  − must support sensemaking
  − restructuring
  − recoding

• Hotlists are one mechanism in this direction


Improvements

• WebBook and Web Forager try to do two things to foster information sensemaking
  − Move away from a single web page, and group and manipulate related pages
  − Move from a work environment containing a single element to a workspace in which the page is contained with multiple other entities, including Web Books


WebBook


Features

• WebBook allows for rapid interaction with objects at a higher level of aggregation than pages
• 3D book representation, uses animation
• Can ruffle through pages, leave bookmarks


Applications

• Hot List books
• Topic books
• Search reports
• Book books
• ...


Web Forager


Web Forager

• Application that embeds the WebBook and other objects in a hierarchical 3D information workspace

• Workspace is intended to create patches from the web where high density of relevant pages (grouped together in Web Books) can be combined with rapid access


Constituents

• Hierarchical Workspace - 3 levels
  − Focus Place - full page shown, direct interaction
  − Intermediate memory space - books or pages placed when they are in use but not immediate focus
  − Tertiary space - Storage (bookcase)

Video


Discussion

• Strengths/Weaknesses


Data Mountain

• 3D document management system
• Prototype is an alternative to web browser “bookmarks” or “favorites”
• Could be used for any kind of document management

Robertson, et al UIST ‘98


Make-Up

• 3D inclined plane in which thumbnails of web pages are placed to serve as favorites
• User is responsible for organization
• Uses smooth animation and audio to assist interaction

Video


User Study

• Data Mountain versus IE4 “Favorites”
• Experienced IE4 users
• Stored 100 pages, then retrieved them
• DM fared about-as-well with “title” cue
• DM fared better for all other cues


Leveraging Human Capabilities

• Spatial memory: analogy with paper placed on a pile on your desk
  − User is responsible for personal organization

• 3D perception: minimal cognitive load, good utilization of screen space


Interaction Techniques

• Placing pages: confinement to inclined plane makes normal 2D drag-and-drop sufficient; no unfamiliar 3D navigation needed

• Continuous feedback: both audio and visual feedback are natural; minimized unexpected interactions/surprises


Limitations/Future

• Limits number of pages stored
• No explicit support for grouping

• Landmarks/contours as helpers


Discussion

• Strengths/Weaknesses

• Could it be used elsewhere?


Upcoming

• Spring Break
  − Woo-woo
• Text & documents (2 days)
  − Reading
    Chapter 10
    Salton et al

• Mid-project reports due March 25


References

• Spence and CMS texts
• All referred to papers and websites
• McNamara & Defnet and Craighill, Robeson & Sheridan F ‘99 slides