Surviving the Information Explosion
Jaime Teevan, MIT
with Christine Alvarado, Mark Ackerman and David Karger
Let Me Interview You!
Web:
– What’s the last Web page you visited? How did you get there?
– Have you looked for anything on the Web?
Email:
– What’s the last email you read? What did you do with it?
– Have you gone back to an email you’ve read before?
Files:
– What’s the last file you looked at? How did you get to it?
– Have you looked for a file?
Overview
Introduction
Related Work
Study Methodology
Results: Search
Discussion
Intro
RW
Study
Res
Disc
The Information Explosion
You must extract information from:
– 3 billion Web pages (Google)
– Dozens of incoming emails daily
– Hundreds of files on your personal computer
Haystack: Personal Information Storage
Email, Web pages, Files, Calendar, Contacts
Haystack
Haystack: Personal Information Storage
What was that paper I read last week about Information Retrieval?
Haystack
Haystack: Personal Information Storage
Ah yes! Thank you.
Haystack
Supporting Information Interaction
Treat different corpora the same?
Provide access to meta-data?
– Keyword search (XP, advanced search)
– Browse (Hearst)
We don’t really know …
Understand access in the wild!
Overview
Introduction
Related Work
– Interaction by corpus
– How people search
Study Methodology
Results: Search
Discussion
Interaction By Corpus
Paper documents
– [Malone, 1983], [Whittaker & Hirshberg, 2001]
Files
– [Barreau & Nardi, 1995]
Web
– [Abrams, et al. 1998], [Byrne, et al. 1999]
Email/Calendar
– [Whittaker & Snider, 1996], [Bellotti & Smith, 2000]
How People Look for Information
Focus: Web
Log analysis
– [Catledge & Pitkow, 1995], [Tauscher & Greenberg, 1997]
Controlled tasks/environment
– [Baldonado & Winograd, 1997], [Spool, 1998]
Situated navigation
– Micronesian islanders [Suchman, 1987]
– Electronic [Marchionini, 1995], [Hearst, 2000]
– Information scent [Chi, Pirolli, Chen & Pitkow, 2001]
Overview
Introduction
Related Work
Study Methodology
Results: Search
Discussion
Method
Subjects
– 15 MIT CS graduate students (5 women, 10 men)
Setup
– 10 short interviews (~5 min.)
– 1 long interview (~45 min.)
Topics
– Web, Email, Files
Short Interviews
Modified diary study [Palen, 2002]
Randomly interrupted participant
Two question types
– Last email/file/Web page looked at
– Last email/file/Web page looked for
Goal: Discover patterns in searching and browsing
Long Interviews
“Guided tour” of subject’s Web space, email, and file system
Goals:
– Discover organizational patterns
– Discover problems in organizational structure
– Relate organization to search/browse behavior
Overview
Introduction
Related Work
Study Methodology
Results: Search
– What and how
– Relating what and how
– Individual strategies
Discussion
Complex Information Spaces
People had complex spaces
Felt in control
“That’s an interesting question. I think my email is the worst, because I have so much of it. And there are people on the other end who expect me to reply to it. My file system is pretty well organized. I have to go through it every once in a while, every couple of months and just kind of push things into the right folders and delete the old stuff. The Web just works, usually.”
What People Look For
Specific Information
– A small fact
– E.g., URL, phone number, appointment time
General Information
– A broad set of information
– E.g., good sneakers to buy, info on cancer
Specific Document
– The actual document
– E.g., a file to print, an email to reply to
How People Look For Information
The last thing you looked for on the Web
– Did you use a search engine?
Search is more than just keyword search
Browse, use bookmarks, type URLs
“I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Strategies Looking for Information
Teleporting
– Traditional search
– Jump directly to target
– Specify everything up front
Orienteering
– Use local navigation
– [O’Day and Jeffries, 1993]
– Could include keyword search
Example: Orienteering
Interviewer: Have you looked for anything on the Web today?
Jim: I had to look for the office number of the Harvard professor.
I: So how did you go about doing that?
J: I went to the homepage of the Math department at Harvard
[…]
J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.”
[…]
I: So you went to the Math department, and then what did you do over there?
J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Example: Teleporting
What if Jim had teleported instead?
Could have typed into a search engine: “Connie Monroe, office number”
“Keyword Search” and “Browse”
Teleporting
– Traditional search
– Jump directly to target
– Specify everything up front
Orienteering
– Use local navigation
– [O’Day and Jeffries, 1993]
– Could include keyword search
Relating How and What
People orienteer a lot
What people look for related to how they look

            Specific   General   Document
Orienteer      47         19        41
Teleport       34         23        17

Surprise: Orienteer to specific information
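The table above can be read as a share of orienteering per information type. A minimal Python sketch, assuming the cells are counts of reported searches (the slide does not say what the numbers are); the names `counts` and `orienteer_share` are illustrative, not from the study.

```python
# Numbers copied from the slide's table (rows: strategy, columns: what was sought).
# Assumption: each cell is a count of reported searches.
counts = {
    "Orienteer": {"Specific": 47, "General": 19, "Document": 41},
    "Teleport": {"Specific": 34, "General": 23, "Document": 17},
}


def orienteer_share(table):
    """Return, per information type, the fraction of searches that were orienteering."""
    shares = {}
    for kind in table["Orienteer"]:
        total = table["Orienteer"][kind] + table["Teleport"][kind]
        shares[kind] = table["Orienteer"][kind] / total
    return shares


if __name__ == "__main__":
    for kind, share in orienteer_share(counts).items():
        print(f"{kind}: {share:.0%} orienteering")
```

Under that assumption, orienteering is the majority strategy for specific information and for documents, and the minority strategy only for general information.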
Why So Much Orienteering?
Your last email search
– What were you looking for?
– Did you know what email contained that information?
People look for the information source
– Specific information searches
– Document searches
Looking for the Source: Example
“I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Looking for the Source: Example
Interviewer: Have you looked for anything on the Web today?
Jim: I had to look for the office number of the Harvard professor.
I: So how did you go about doing that?
J: I went to the homepage of the Math department at Harvard
[…]
J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.”
[…]
I: So you went to the Math department, and then what did you do over there?
J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Individual Strategies
Search strategies varied by individual
Pilers: Pile information
Filers: File information
File or Pile Email
Where was the last email you found?
– Inbox?
– Elsewhere?
[Figure: number of searches (axis 0–8) against % found in Inbox (axis 0–100), with regions labeled Filer and Piler]
How Individuals Search For Files
[Figure: bar chart of file searches (axis 0–9) for each participant A–M, split into Keyword Search and Orienteering; participants grouped as Filers and Pilers, with Teleport and Orienteer labels]
Overview
Introduction
Related Work
Study Methodology
Results
Discussion
– Understanding and applying what we learn
– Future work
Understanding Teleporting v. Orienteering
Why was orienteering chosen over teleporting?
Teleporting doesn’t work
Teleporting requires too much cognitive effort
Risk of over-specifying target
Orienteering gives knowledge of the source
Teleporting a failure mode
– Can’t associate information with source
– Can’t find the information source
Understanding Filers v. Pilers
Why do filers teleport more than pilers?
Irony: Those with good organization don’t take advantage of it
Filers have strictly organized information
– Are used to defining meta-data for their information
Pilers loosely organize their information
– Are used to associative navigating
Haystack: Applying What We Learn
Using meta-data: Support orienteering
– Not about having the perfect search interface
– Need ability to prompt
Individualized support
– Pilers/filers
– Learning individual behaviors
Future Work: Search
Previously viewed information
Causes of failure
Searches across corpus
Getting help from others
Future Work: Organization
Consistency of organization across corpus
Corpora boundaries
Context used in organization
Organization’s effect on search
Conclusion
Look at search in the wild
Strategies: Teleport/Orienteer
Individual strategies
Future systems should:
– Support orienteering
– Provide individualized support
Questions?
To learn more about Haystack:
http://haystack.lcs.mit.edu
Contact us with comments:
Relating How and Corpus
Email and files: Almost always orienteered
– Easy to associate information with document
Web: Teleported much more often

            Email   Files   Web
Orienteer     59      42     19
Teleport       6      10     64
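The slides do not report a statistical test for this table. As an illustration only, here is a minimal Python sketch of a chi-square statistic for independence between strategy and corpus. It assumes each cell is a count of independent searches, which may not hold, since the same 15 participants each contributed several searches. The names `observed` and `chi_square` are illustrative.

```python
# Numbers copied from the slide's table (rows: strategy, columns: corpus).
# Assumption: each cell is a count of independent searches.
observed = {
    "Orienteer": {"Email": 59, "Files": 42, "Web": 19},
    "Teleport": {"Email": 6, "Files": 10, "Web": 64},
}


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way count table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for r in rows:
        for c in cols:
            # Expected count if strategy and corpus were unrelated.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof


if __name__ == "__main__":
    stat, dof = chi_square(observed)
    print(f"chi-square = {stat:.1f} with {dof} degrees of freedom")
```

The statistic can be compared against the critical value for 2 degrees of freedom (about 5.99 at the 5% level). A value well above that would be consistent with the claim that strategy depends on corpus, subject to the independence caveat above.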
Relating What and Corpus
            Email   Files   Web
Specific      39       7     33
General       10       7     30
Document       8      35     14

Email searches were primarily for specific information
File searches were primarily for documents
Web searches were more evenly distributed
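The three statements above can be checked against the table by normalizing each corpus column. A minimal Python sketch, assuming the cells are counts of reported searches; `counts`, `distribution_by_corpus`, and `dominant_type` are illustrative names.

```python
# Numbers copied from the slide's table (rows: what was sought, columns: corpus).
# Assumption: each cell is a count of reported searches.
counts = {
    "Specific": {"Email": 39, "Files": 7, "Web": 33},
    "General": {"Email": 10, "Files": 7, "Web": 30},
    "Document": {"Email": 8, "Files": 35, "Web": 14},
}


def distribution_by_corpus(table):
    """For each corpus, return the share of searches per information type."""
    corpora = list(next(iter(table.values())))
    result = {}
    for corpus in corpora:
        total = sum(row[corpus] for row in table.values())
        result[corpus] = {kind: row[corpus] / total for kind, row in table.items()}
    return result


def dominant_type(distribution):
    """For each corpus, return the information type with the largest share."""
    return {corpus: max(shares, key=shares.get) for corpus, shares in distribution.items()}


if __name__ == "__main__":
    dist = distribution_by_corpus(counts)
    for corpus, kind in dominant_type(dist).items():
        print(f"{corpus}: mostly {kind} ({dist[corpus][kind]:.0%})")
```

In this reading, the largest share for email and for files is well over half, while no single type reaches half of the Web searches.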