Inferring Searcher Intent


Transcript of Inferring Searcher Intent

Eugene Agichtein, Emory University 11 July 2010

AAAI 2010 Tutorial: Inferring Searcher Intent 1


Inferring Searcher Intent

Eugene Agichtein

Emory University

Tutorial Website (for expanded and updated bibliography):

http://ir.mathcs.emory.edu/intent_tutorial/

Instructor contact information:

Email: [email protected]

Web: http://www.mathcs.emory.edu/~eugene/


Tutorial Overview

• Part 1: Search Intent Modeling
– Motivation: how intent inference could help search

– Search intent & information seeking behavior in traditional IR

– Searcher models: from eye tracking to clickthrough mining

• Part 2: Inferring Web Searcher Intent
– Inferring result relevance: clicks

– Richer interaction models: clicks + browsing

• Part 3: Applications and Extensions
– Implicit feedback for ranking

– Contextualized prediction: session modeling

– Personalization, query suggestion, active learning


About the Instructor

• Eugene Agichtein (Ah-ghi-sh-tein)
http://www.mathcs.emory.edu/~eugene/

• Research: Information retrieval and data mining
– Mining search behavior and interactions in web search
– Text mining, information extraction, and question answering

• Relevant experience:
– 2006 – present: Assistant Professor, Emory University
– Summer '07: Visiting Researcher, Yahoo! Research
– 2004-06: Postdoc, Microsoft Research
– 1998-2004: PhD student, Columbia (Databases/IR)


Outline: Search Intent and Behavior

✓ Motivation: how intent inference could help search

• Search intent and information seeking behavior
– Classical models of information seeking

• Web searcher intent

• Web searcher behavior
– Levels of modeling: micro-, meso-, and macro- levels

– Variations in web searcher behavior

– Click models


Some Key Challenges for Web Search

• Query interpretation (infer intent)

• Ranking (high dimensionality)

• Evaluation (system improvement)

• Result presentation (information visualization)


Example: Task-Goal-Search Model


car safety ratings consumer reports


Information Retrieval Process Overview


[Diagram: the information retrieval process. Source Selection → Query Formulation (query: "car safety ratings") → Search → Search Engine Result Page (SERP) → Selection from Ranked List → Examination of Documents → Delivery of Documents. Feedback loops: query reformulation, vocabulary learning, relevance feedback, source reselection.]

Credit: Jimmy Lin, Doug Oard, …


Explicit Intentions in Query Logs

• Match known goals (from ConceptNet) to query logs


Strohmaier et al., K-Cap 2009


Unfortunately, most queries are not so explicit…


Outline: Search Intent and Behavior

✓ Motivation: how intent inference could help search

✓ Search intent and information seeking behavior
✓ Classical models of information seeking

• Web Searcher Intent
– Broder

– Rose

– More recent?

• Web Searcher Behavior
– Levels of modeling: micro-, meso-, and macro- levels

– Variations in web searcher behavior

– Click models

• Challenges and open questions


Information Seeking Funnel

• Wandering: the user does not have an information-seeking goal in mind. May have a meta-goal (e.g., "find a topic for my final paper.")

• Exploring: the user has a general goal (e.g. “learn about the history of communication technology”) but not a plan for how to achieve it.

• Seeking: the user has started to identify information needs that must be satisfied (e.g. “find out about the role of the telegraph in communication.”), but the needs are open-ended.

• Asking: the user has a very specific information need that corresponds to a closed-class question (“when was the telegraph invented?”).


D. Rose, 2008


Models of Information Seeking

• "Information-seeking … includes recognizing … the information problem, establishing a plan of search, conducting the search, evaluating the results, and … iterating through the process." – Marchionini, 1989

– Query formulation

– Action (query)

– Review results

– Refine query


Adapted from: M. Hearst, SUI, 2009


Reviewing Results: Relevance Clues

• What makes information or information objects relevant? What do people look for in order to infer relevance?

– Topicality (subject relevance)

– Extrinsic (task-, goal- specific)

• Information Science "clues research":
– uncover and classify attributes or criteria used for making relevance inferences


Information Scent for Navigation

• Examine clues where to find useful information


Search result listings must provide the user with clues about which results to click.


Dynamic “Berry Picking” Model

• Information needs change during interactions


[Bates, 1989] M.J. Bates. The design of browsing and berrypicking techniques for the online search interface. Online Review, 13(5):407–431, 1989.


Information Foraging Theory

• Goal: maximize rate of information gain. Patches of information → websites
• Basic problem: should I continue in the current patch or look for another patch?
• Expected gain from continuing in the current patch determines how long to continue searching in that patch

Pirolli and Card, CHI 1995


Diminishing returns: 80% of users scan only first 3 pages of search results

– Charnov's Marginal Value Theorem
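The patch-leaving rule can be sketched numerically. A minimal illustration of Charnov's marginal value theorem, assuming a diminishing-returns gain curve g(t) = G(1 - e^(-rt)); the curve shape and all parameter values are invented for this sketch, not from the tutorial:

```python
import math

def optimal_leave_time(total_gain, rate, travel_time, dt=0.001, horizon=50.0):
    """Find when to leave a patch under the marginal value theorem,
    for the gain curve g(t) = total_gain * (1 - exp(-rate * t)).

    Leave when the instantaneous gain rate g'(t) drops below the
    overall rate g(t) / (travel_time + t)."""
    t = dt
    while t < horizon:
        gain = total_gain * (1 - math.exp(-rate * t))
        marginal = total_gain * rate * math.exp(-rate * t)  # g'(t)
        average = gain / (travel_time + t)
        if marginal <= average:
            return t
        t += dt
    return horizon
```

As expected under the theorem, a longer travel time between patches (e.g., websites that are harder to reach) makes it optimal to stay longer in the current patch.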


Hotel Search


Goal: Find the cheapest 4-star hotel in Paris.
Step 1: pick a hotel search site
Step 2: scan the list
Step 3: go to Step 1


Example: Hotel Search (cont’d)


Orienteering vs. Teleporting

• Orienteering:

– Searcher issues a quick, imprecise query to get to approximately the right region of the information space

– Searchers follow known paths that require small steps that move them closer to their goal

– Easy (does not require generating a "perfect" query)

• Teleporting:

– Issue (longer) query to jump directly to the target

– Expert searchers issue longer queries

– Requires more effort and experience.

– Until recently, was the dominant IR model


Teevan et al., CHI 2004


Serendipity


Andre et al., CHI 2009


Summary of Models

• Static, berry-picking, information foraging, orienteering, serendipity

• Classical IR Systems research mainly uses the simplest form of relevance (topicality)

• Open questions:

– How people recognize other kinds of relevance

– How to incorporate other forms of relevance (e.g., user goals/needs/tasks) into IR systems


Part 1: Search Intent and Behavior

✓ Motivation: how intent inference could help search

✓ Search intent and information seeking behavior
✓ Classical models of information seeking

• Web Searcher Intent
– Broder

– Rose

– More recent?

• Web Searcher Behavior
– Levels of modeling: micro-, meso-, and macro- levels

– Variations in web searcher behavior

– Click models


Intent Classes (top level only)

User intent taxonomy (Broder 2002)

– Informational – want to learn about something (~40% / 65%)

– Navigational – want to go to that page (~25% / 15%)

– Transactional – want to do something (web-mediated) (~35% / 20%)

• Access a service
• Downloads

• Shop

– Gray areas

• Find a good hub

• Exploratory search “see what’s there”


Example queries:

History nonya food

Singapore Airlines

Jakarta weather

Kalimantan satellite images

Nikon Finepix

Car rental Kuala Lumpur

[from SIGIR 2008 Tutorial, Baeza-Yates and Jones]
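For illustration, the top-level classes above can be approximated with crude surface heuristics. A toy sketch: the keyword lists and rules below are invented assumptions, not from the tutorial; real systems learn such classifiers from click and session data:

```python
def classify_intent(query):
    """Toy heuristic for Broder's (2002) top-level intent classes.
    Keyword lists and rules are illustrative assumptions only."""
    words = query.lower().split()
    # transactional: the user wants to do something (buy, download, ...)
    if any(w in {"buy", "download", "order", "rental", "booking"} for w in words):
        return "transactional"
    # navigational: the query names a site or URL directly
    if any(w.startswith("www.") or w.endswith(".com") for w in words):
        return "navigational"
    # default: informational
    return "informational"
```

Note how the gray areas in the taxonomy show up immediately: a brand query like [Singapore Airlines] carries no surface cue, which is one reason intent inference needs behavioral evidence rather than the query string alone.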



Extended User Goal Taxonomy


Rose et al., 2004


Complex Search

A complex search task refers to cases where:

• searcher needs to conduct multiple searches to locate the information sources needed,

• completing the search task spans over multiple sessions (task is interrupted by other things),

• searcher needs to consult multiple sources of information (all the information is not available from one source, e.g., a book, a webpage, a friend),

• requires a combination of exploration and more directed information finding activities,

• often requires note-taking (cannot hold all the information that is needed to satisfy the final goal in memory), and

• specificity of the goal tends to vary during the search process (often starts with exploration).


Aula and Russell, 2008


Complex Search (Cont’d)


Aula and Russell, 2008


Web Search Queries

• Cultural and educational diversity

• Short queries and impatient interaction

– Few queries posed and few answers seen (first page)

– Reformulation common

• Smaller and different vocabulary

– Not “expert” searchers!

– “Which box do I type in?”


[from SIGIR 2008 Tutorial, Baeza-Yates and Jones]

Intent Distribution by Topic


Query Distribution by Demographics

• Education:

• Ethnicity:

• Gender:


[Weber & Castillo, SIGIR 2010]


Query Demographics 2: Demo

Keywords have demographic signatures

– Microsoft adCenter Demographics Prediction:

http://adlab.msn.com/DPUI/DPUI.aspx

adCenter [posters]

Quantcast


Domain-Specific Intents: Named Entities

• Named Entities (Persons, Orgs, Places) are often searched
– "Brittany Spears"?
• Popular phrases by entity type:


[Yin & Shah, WWW 2010]


Example Intent Taxonomy for Musicians

• Musicians (most popular phrases)


[Yin & Shah, WWW 2010]


Analyzing Searches: Funneling

• What is the intent of customers that type such queries?

• Hint: What they searched before/after?

– Search Funnels: http://adlab.msn.com/searchfunnel/

– How can you catch customers earlier?
– What do customers do when they leave?


Part 1: Search Intent and Behavior

✓ Motivation: how intent inference could help search

✓ Search intent and information seeking behavior
✓ Classical models of information seeking

✓ Web Searcher Intent
✓ Broder

✓ Rose

✓ Demographics

• Web Searcher Behavior
– Levels of modeling: micro-, meso-, and macro- levels

– Variations in web searcher behavior

– Click models


Web Searcher Behavior

• Meso-level: query, intent, and session characteristics
• Micro-level: how searchers interact with result pages
• Macro-level: patterns, trends, and interests


Levels of Understanding Searcher Behavior

• Micro (eye tracking): lowest level of detail, milliseconds

• Meso (field studies): mid-level, minutes to days

• Macro (session analysis): millions of observations, days to months


[Daniel M. Russell, 2007]


Search Behavior: Scales


from: Pirolli, 2008


Information Retrieval Process (User view)


[Diagram: the information retrieval process from the user's view. Source Selection → Query Formulation → Search → Selection from Ranked List → Examination of Documents → Delivery of Documents. Feedback loops: query reformulation, vocabulary learning, relevance feedback, source reselection.]


People Look at Only a Few Results

(Source: iprospect.com WhitePaper_2006_SearchEngineUserBehavior.pdf)


Snippet Views Depend on Rank

Mean: 3.07 Median: 2.00

[Daniel M. Russell, 2007]


Snippet Views and Clicks Depend on Rank

[from Joachims et al., SIGIR 2005]


“Eyes are a Window to the Soul”

• Eye tracking gives information about search interests:
– Eye position
– Pupil diameter
– Saccades and fixations


[Figure: eye-tracking camera setup; gaze patterns during reading vs. visual search]


Micro-level: Examining Results

• Users rapidly scan the search result page

• What they see in lower summaries may influence judgment of a higher result

• Spend most time scrutinizing top results 1 and 2

– Trust the ranking


[Daniel M. Russell, 2007]


POM: Partially Observable Model


Wang et al., WSDM 2010


Result Examination (cont’d)

• Searchers might use the mouse to focus reading attention, bookmark promising results, or not at all.
• Behavior varies with task difficulty and user expertise


[K. Rodden, X. Fu, A. Aula, and I. Spiro, Eye-mouse coordination patterns on web search results pages, Extended Abstracts of ACM CHI 2008]


Result Examination (cont'd): Predicting Eye-Mouse Coordination


Guo & Agichtein, CHI 2010

[Figure: eye-mouse Euclidean distance over time for three example searches, with the prediction threshold marked on each]

Actual vs. predicted eye-mouse coordination patterns: No Coordination (30%), Bookmarking (25%), Eye follows mouse (25%)
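The distance-thresholding view of eye-mouse coordination can be sketched directly: compute the Euclidean distance between gaze and cursor at each time step and flag the steps where the two track each other. The threshold value and input layout here are illustrative assumptions, not parameters from the cited paper:

```python
import math

def coordination_segments(eye_path, mouse_path, threshold=150.0):
    """Label each time step as coordinated (eye-mouse distance at or
    below a pixel threshold) or not. `eye_path` and `mouse_path` are
    parallel lists of (x, y) screen coordinates; both the threshold
    and this input format are assumptions for the sketch."""
    labels = []
    for (ex, ey), (mx, my) in zip(eye_path, mouse_path):
        dist = math.hypot(ex - mx, ey - my)
        labels.append(dist <= threshold)
    return labels
```

Long runs of coordinated steps would correspond to the "eye follows mouse" pattern; a mouse parked on one result while the gaze roams would show up as uncoordinated steps (the "bookmarking" pattern).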


Macro-Level (Session) Analysis

• Can examine theoretical user models in light of empirical data:
– Orienteering?

– Foraging?

– Multi-tasking?

• Search is often a multi-step process:
– Find or navigate to a good site ("orienteering")

– Browse for the answer there: [actor most oscars] vs. [oscars]

• Teleporting – “I wouldn’t use Google for this, I would just go to…”

• Triangulation
– Draw information from multiple sources and interpolate

– Example: “how long can you last without food?”


Users (sometimes) Multi-task


[Daniel M. Russell, 2007]


Kinds of Search+Browsing Behavior


[Daniel M. Russell, 2007]


Parallel Browsing Behavior [Huang & White, HT 2010]

• 57% of all tabbed sessions are parallel browsing

• Can mean multi-tasking

• Common scenario: “branching” in exploring search results


Search Engine Switching Behavior


White et al., CIKM 2009

• 4% of all search sessions contained a switching event

• Switching events:

– 58.6 million switching events in 6-month period

• 1.4% of all Google / Yahoo! / Live queries followed by switch

– 12.6% of all switching events involved same query

– Two-thirds of switching events from browser search box

• Users:

– 72.6% of users used multiple engines in 6-month period

– 50% of users switched search engine within a session


Overview of Search Engine Switching

• Switching is more frequent in longer sessions

White et al., CIKM 2009


Overview of Switching - Survey

• 70.5% of survey respondents reported having switched
– Remarkably similar to the 72.6% observed in logs

• Those who did not switch:
– Were satisfied with current engine (57.8%)

– Believed no other engine would perform better (24.0%)

– Felt that it was too much effort to switch (6.8%)

– Other reasons included brand loyalty, trust, privacy

• Within-session switching:
– 24.4% of switching users did so "Often" or "Always"

– 66.8% of switching users did so “Sometimes”

White et al., CIKM 2009


Reasons for Engine Switching

• Three types of reasons:
– Dissatisfaction with original engine

– Desire to verify or find additional information

– User preference

Other reasons included:

- Loyalty to dest. engine

- Multi-engine apps.

- Hope (!)

White et al., CIKM 2009


Pre-switch Behavior

• Most common are queries and non-SERP clicks

• This is the action immediately before the switch

• What about pre-switch activity across the session?

White et al., CIKM 2009


Pre-switch Behavior (Survey)

"Is there anything about your search behavior immediately preceding a switch that may indicate to an observer that you are about to switch engines?"

• Common answers:

– Try several small query changes in pretty quick succession

– Go to more than the first page of results, again often in quick succession and often without clicks

– Go back and forth from SERP to individual results, without spending much time on any

– Click on lots of links, then switch engine for additional info

– Do not immediately click on something

White et al., CIKM 2009


Post-switch Behavior

• Extending the analysis beyond next action:

– 20% of switches eventually lead to return to origin engine

– 6% of switches eventually lead to use of third engine

• > 50% led to a result click. Are users satisfied?

White et al., CIKM 2009


Post-Switch Satisfaction

• Measures of user effort / activity (# Queries, # Actions)

• Measure of the quality of the interaction

– % queries with No Clicks, # Actions to SAT (>30sec dwell)

• Users issue more queries/actions; seem less satisfied (higher %NoClicks and more actions to SAT)

• Switching queries may be challenging for search engines

Activity        # Queries              # Actions
                Origin   Destination   Origin   Destination
All Queries     3.14     3.70          9.85     11.62
Same Queries    3.08     3.73          9.03     10.25

Success         % NoClicks             # Actions to Sat
Action          Origin   Destination   Origin   Destination
All Queries     49.7     52.7          3.81     4.71
Same Queries    54.5     59.7          3.67     4.61

White et al., CIKM 2009


Search Behavior: Expertise

• Some people are more expert at searching than others

– Search expertise, not domain expertise

– Alternative explanation: Orienteering vs. Teleporting

• Find characteristics of these “advanced search engine users” in an effort to better understand how these users search

• Understanding what advanced searchers are doing could improve the search experience for everyone


[White & Morris, WWW 2007]


Findings – Post-query browsing

Advanced users:
– Traverse trails faster
– Spend less time viewing each Web page
– Follow query trails with fewer steps
– Revisit pages less often
– "Branch" less often

Feature         p_advanced: (Non-advanced → More advanced → Advanced)
                0%       > 0%     ≥ 25%    ≥ 50%    ≥ 75%
Session Secs    701.10   706.21   792.65   903.01   1114.71
Trail Secs      205.39   159.56   156.45   147.91   136.79
Display Secs    36.95    32.94    34.91    33.11    30.67
Num. Steps      4.88     4.72     4.40     4.40     4.39
Num. Revisits   1.20     1.02     1.03     1.03     1.02
Num. Branches   1.55     1.51     1.50     1.47     1.44
%Trails         72.14%   27.86%   .83%     .23%     .05%
%Users          79.90%   20.10%   .79%     .18%     .04%

[White & Morris, 2007]


Search Behavior: Demographics

• Gender differences:

– Query “wagner”

• Women: http://en.wikipedia.org/wiki/Richard_Wagner

• Men: http://www.wagnerspraytech.com/

• Education differences:


[Weber & Castillo, SIGIR 2010]


Re-Finding Behavior

• 40% of the queries led to a click on a result that the same user had clicked on in a past search session.
– Teevan et al., 2007
• What's the URL for this year's SIGIR 2010?
– Does not really matter; it is faster to re-find it


[From Teevan et al, 2007]


What Is Known About Re-Finding

• Re-finding recent topic of interest

• Web re-visitation common [Tauscher & Greenberg]

• People follow known paths for re-finding

– Search engines likely to be used for re-finding

• Query log analysis of re-finding

– Query sessions [Jones & Fain]

– Temporal aspects [Sanderson & Dumais]


[From Teevan et al, 2007]


[Figure: decision tree of repeat search behavior. Branch questions: same query issued before? new query? click on previously clicked results? click on different results? click same and different? Node counts: 7503 (57%), 3100 (24%), 660 (5%), 637 (5%), 635 (5%), 485 (4%), 36 (<1%), 4 (<1%); additional labels: "1 click", "> 1 click", "39%", "Navigational", "Re-finding with different query"]

[From Teevan et al, 2007]



Rank Change Degrades Re-Finding

• Results change rank

• Change in result rank reduces probability of re-click

– No rank change: 88% chance

– Rank change: 53% chance

• Rank change → slower repeat click

– Compared with initial search to click

– No rank change: Re-click is faster

– Rank change: Re-click is slower

[From Teevan et al, 2007]


Aside: Mobile Search…

• Not topic of today’s tutorial

• Some references:

– M. Jones, Mobile Search Tutorial, Mobile HCI, 2009

– K. Church, B. Smyth, K. Bradley, and P. Cotter. A large scale study of European mobile search behaviour. Mobile HCI, 2008

– Kamvar, M., Kellar, M., Patel, R., and Xu, Y. Computers and iphones and mobile phones, oh my!: a logs-based comparison of search users on different devices. WWW 2009

– Kamvar, M. and Baluja, S. 2008. Query suggestions for mobile search: understanding usage patterns, CHI 2008


Part 1: Summary

✓ Understanding user behavior at micro-, meso-, and macro- levels

✓ Theoretical models of information seeking

✓ Web search behavior:

✓ Levels of detail

✓ Search Intent

✓ Variations in web searcher behavior

✓ Keeping found things found


Part 2: Inferring Web Searcher Intent


Tutorial Overview

✓ Part 1: Search Intent Modeling

✓ Motivation: how intent inference could help search

✓ Web search intent & information seeking models

✓ Web searcher behavior models

✓ Part 2: Inferring Web Searcher Intent

– Inferring result relevance: clicks

– Richer interaction models: clicks + browsing

– Contextualizing intent models: personalization


Part 2: Inferring Searcher Intent

• Inferring result relevance: clicks

• Richer behavior models:

– SERP presentation info

– Post-search behavior

– Rich interaction models for SERPs

• Contextualizing intent inference:

– Session-level models

– Personalization


Implicit Feedback

• Users often reluctant to provide relevance judgments

– Some searches are precision-oriented (no “more like this”)

– They’re lazy or annoyed:

– “Was this document helpful?”

• Can we gather relevance feedback without requiring the user to do anything?

• Goal: estimate relevance from behavior



Observable Behavior

Behavior        Minimum Scope
Category        Segment               Object                             Class
Examine         View, Listen          Select (click)
Retain          Print                 Bookmark, Save, Purchase, Delete   Subscribe
Reference       Copy / paste, Quote   Forward, Reply, Link, Cite
Annotate        Mark up               Rate, Publish                      Organize


Clicks as Relevance Feedback

• Limitations:

– Hard to determine the meaning of a click. If the best

result is not displayed, users will click on something

– Presentation bias

– Click duration may be misleading

• People leave machines unattended

• Opening multiple tabs quickly, then reading them all slowly

• Multitasking

• Compare above to limitations of explicit feedback:

– Sparse, inconsistent ratings


“Strawman” Click model: No Bias

• Naive Baseline

– c_di is P( Click=True | Document=d, Position=i )

– r_d is P( Click=True | Document=d )

• Why this baseline?

– We know that r_d is part of the explanation

– Perhaps, for ranks 9 vs. 10, it is the main explanation

– But it is a bad explanation at rank 1 (e.g., from eye tracking)

Attractiveness of summary ~= Relevance of result

[Craswell et al., 2008]
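As a sketch of this naive baseline (the toy log and names are hypothetical), both quantities can be estimated as raw click rates from impression logs, ignoring any position bias:

```python
from collections import defaultdict

def estimate_click_rates(log):
    """Naive baseline: estimate c_di = P(click | doc d, position i) and
    r_d = P(click | doc d) as raw click rates over impressions."""
    shown_di, clicked_di = defaultdict(int), defaultdict(int)
    shown_d, clicked_d = defaultdict(int), defaultdict(int)
    for ranking, clicks in log:                 # one search impression
        for i, d in enumerate(ranking, start=1):
            shown_di[(d, i)] += 1
            shown_d[d] += 1
            if d in clicks:
                clicked_di[(d, i)] += 1
                clicked_d[d] += 1
    c = {k: clicked_di[k] / shown_di[k] for k in shown_di}
    r = {d: clicked_d[d] / shown_d[d] for d in shown_d}
    return c, r

# toy log: (ranking, set of clicked docs)
log = [(["A", "B"], {"A"}), (["A", "B"], set()), (["B", "A"], {"A"})]
c, r = estimate_click_rates(log)   # r["A"] = 2/3; c[("A", 1)] = 0.5
```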


Realistic Click models

• Clickthrough and subsequent browsing behavior of

individual users influenced by many factors

– Relevance of a result to a query

– Visual appearance and layout

– Result presentation order

– Context, history, etc.


De-biasing position (first attempt)

Relative clickthrough for queries with known relevant results

in position 3 (results in positions 1 and 2 are not relevant)

[Figure: relative click frequency at result positions 1, 2, 3, 5, and 10, for all queries, PTR=1, and PTR=3.]

Higher clickthrough at the top non-relevant result than at the top relevant document.

[Agichtein et al., 2006]


Simple Model: Deviation from Expected

• Relevance component: deviation from "expected" click behavior:

Relevance(q, d) = observed click frequency − expected click frequency at position p

[Figure: click-frequency deviation from expected at result positions 1, 2, 3, 5, and 10, for PTR=1 and PTR=3. The deviation is strongly positive at the known relevant position and near zero or negative elsewhere.]

[Agichtein et al., 2006]
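The deviation computation can be sketched directly from the formula above; the background position prior and the observed frequencies below are illustrative, not from the paper:

```python
def click_deviation(observed, expected_by_pos, positions):
    """Relevance estimate: observed click frequency for a query-result
    pair minus the 'expected' (background) click frequency at the
    position where the result was shown."""
    return {d: observed[d] - expected_by_pos[positions[d]] for d in observed}

# hypothetical background click distribution by rank (position prior)
expected = {1: 0.45, 2: 0.20, 3: 0.12}
observed = {"d1": 0.30, "d2": 0.22, "d3": 0.26}   # for one query
positions = {"d1": 1, "d2": 2, "d3": 3}
dev = click_deviation(observed, expected, positions)
# d3 deviates most positively despite fewer raw clicks than d1
```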


Simple Model: Example

• CD: distributional model, extends SA+N

– A clickthrough is counted only if its observed frequency exceeds the expected frequency by more than ε

• The click on result 2 is likely "by chance"

• Generated preferences: 4 > (1, 2, 3, 5), but not 2 > (1, 3)

[Figure: clickthrough frequency deviation at positions 1-5 for an eight-result ranking with clicks marked; only sufficiently large positive deviations generate preferences.]


Simple Model Results

Improves precision

by discarding

“chance” clicks


Another Formulation

• There are two types of user/interaction

– Click based on relevance

– Click based on rank (blindly)

• A.k.a. the OR model:

– Clicks arise from

relevance OR position

– Estimate with logistic regression

[Figure: estimated position-bias term b_i as a function of rank i = 1…10.]

[Craswell et al., 2008]


Linear Examination Hypothesis

• Users are less likely to look at lower ranks, therefore

less likely to click

• This is the AND model

– Clicks arise from

relevance AND examination

– Probability of examination does not depend on what

else is in the list

[Figure: examination probability x_i as a function of rank i = 1…10.]

[Craswell et al., 2008]
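Under the examination (AND) hypothesis, P(click on d at rank i) = x_i · r_d, so given an assumed examination curve x_i, relevance can be estimated as clicks divided by expected examinations. A minimal sketch with a hypothetical curve and toy log:

```python
from collections import defaultdict

def estimate_relevance_examination(log, x):
    """Examination (AND) hypothesis: P(click on d at rank i) = x_i * r_d,
    with x_i a fixed, list-independent probability of examining rank i.
    Estimate r_d = clicks on d / expected examinations of d."""
    clicks, exams = defaultdict(int), defaultdict(float)
    for ranking, clicked in log:
        for i, d in enumerate(ranking, start=1):
            exams[d] += x[i]
            if d in clicked:
                clicks[d] += 1
    return {d: clicks[d] / exams[d] for d in exams}

x = {1: 1.0, 2: 0.5}   # assumed examination curve over ranks
log = [(["A", "B"], {"A"}), (["A", "B"], set()),
       (["B", "A"], {"A"}), (["B", "A"], set())]
r = estimate_relevance_examination(log, x)   # r["A"] = 2/3
```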


Cascade Model

• Users examine the results in rank order

• At each document d:

– Click with probability r_d

– Or continue with probability (1 − r_d)

[Taylor et al., 2008]


Cascade Model (2)

[Diagram: a query returns URL1…URL4 with relevances r_1…r_4; at each position the user clicks (C_i) with probability r_d, or skips and continues to the next result with probability (1 − r_d).]


Cascade Model Example

• 500 users typed a query

• 0 click on result A in rank 1

• 100 click on result B in rank 2

• 100 click on result C in rank 3

• Cascade (with no smoothing) says:

• 0 of 500 clicked A → r_A = 0

• 100 of 500 clicked B → r_B = 0.2

• 100 of remaining 400 clicked C → r_C = 0.25

[Craswell et al., 2008]
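The cascade arithmetic above can be sketched directly: sessions that click stop examining, so the denominator shrinks at each rank. Function and variable names are illustrative:

```python
def cascade_estimates(n_sessions, clicks_by_rank):
    """Cascade model (no smoothing): users scan top-down, and a session
    that clicks stops examining lower ranks, so r at each rank is
    clicks / (sessions still examining that rank)."""
    remaining = n_sessions
    r = []
    for c in clicks_by_rank:
        r.append(c / remaining if remaining else 0.0)
        remaining -= c          # clicking sessions stop the scan
    return r

# the slide's example: 500 sessions; 0 click A (rank 1),
# 100 click B (rank 2), 100 click C (rank 3)
rA, rB, rC = cascade_estimates(500, [0, 100, 100])
# -> rA = 0.0, rB = 0.2, rC = 0.25
```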


Cascade Model Seems Closest to Reality

Best possible: Given the true click counts for ordering BA

[Craswell et al., 2008]


Dynamic Bayesian Net


O. Chapelle & Y. Zhang, A Dynamic Bayesian Network Click Model for Web Search Ranking, WWW 2009

[Diagram: per-result latent variables: did the user examine the URL? was the user attracted to the URL? was the user satisfied by the landing page?]



Dynamic Bayesian Net (results)

predicted relevance

agrees 80% with

human relevance

O. Chapelle, & Y Zhang, A Dynamic Bayesian Network Click

Model for Web Search Ranking, WWW 2009

Use an EM algorithm (similar to forward-backward) to learn the model parameters; the persistence parameter is set manually.
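The generative side of the model can be sketched as a simulation (parameters here are assumed known, whereas the paper learns attractiveness and satisfaction with EM; all values below are illustrative):

```python
import random

def simulate_dbn_session(attract, satisfy, gamma, rng):
    """Generative sketch of the DBN click model: the user scans top-down,
    clicks when attracted to a snippet, stops if the landing page
    satisfies, and otherwise keeps examining with persistence gamma."""
    clicks = []
    for i in range(len(attract)):
        if rng.random() < attract[i]:       # attracted to the snippet -> click
            clicks.append(i)
            if rng.random() < satisfy[i]:   # satisfied by the landing page
                break
        if rng.random() >= gamma:           # abandon the scan
            break
    return clicks

rng = random.Random(0)
sessions = [simulate_dbn_session([0.6, 0.3, 0.2], [0.5, 0.5, 0.5], 0.9, rng)
            for _ in range(1000)]
ctr_rank1 = sum(1 for s in sessions if 0 in s) / 1000
# rank 1 is always examined, so its CTR approaches attract[0] = 0.6
```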


Clicks: Summary So Far

• Simple model accounts for position bias

• Bayes Net model: extension of Cascade model

shown to work well in practice

– Limitations?

• Questions?


Capturing a Click in its Context


[Piwowarski et al., 2009]

Building query chains

• Simple model based on time deltas & query similarities

Analysing the chains

• Layered Bayesian Network (BN) model

Validation of the model

• Relevance of clicked documents

• Boosted trees with features from the BN


Overall process

Time threshold

Similarity threshold

Grouping atomic sessions

[Piwowarski et al., 2009]


Layered Bayesian Network [Piwowarski et al., 2009]


The BN gives the context of a click


P(Chain state = … | observations) = (0.2, 0.4, 0.01, 0.39, 0)

P(Search state = … | observations) = (0.1, 0.42, …)

P(Page state = … | observations) = (0.25, 0.2, …)

P(Click state = … | observations) = (0.02, 0.5, …)

P([not] Relevant | observations) = (0.4, 0.5)

[Diagram: BN layers: Chain, Search, Page, Click, Relevance]

[Piwowarski et al., 2009]


Features for one click

• For each clicked document, compute features:

– (BN) Chain/Page/Action/Relevance state distribution

– (BN) Maximum likelihood configuration, likelihood

– Word confidence values (averaged for the query)

– Time and position related features

• This is associated with a relevance judgment from

an editor and used for learning

[Piwowarski et al., 2009]


Learning with Gradient Boosted Trees

• Use gradient boosted trees (Friedman, 2001),

with a tree depth of 4 (8 for the non-BN-based model)

• Used disjoint train (BN + GBT training) and test sets

– Two sets of sessions S1 and S2 (20 million chains) and

two sets of queries + relevance judgments J1 and J2

(about 1000 queries with behavior data)

– Process (repeated 4 times):

• learn the BN parameters on S1+J1,

• extract the BN features and learn the GBT with S1+J1

• Extract the BN features and predict relevance assessments of

J2 with sessions of S2

[Piwowarski et al., 2009]


Results: Predicting Relevance of Clicked Docs [Piwowarski et al., 2009]


Problem: Users click based on result “Snippets”

• Effect of Caption Features on Clickthrough

Inversions, C. Clarke, E. Agichtein, S. Dumais, R.

White, SIGIR 2007

[Clarke et al., 2007]


Clickthrough Inversions [Clarke et al., 2007]


Relevance is Not the Dominant Factor! [Clarke et al., 2007]


Snippet Features Studied [Clarke et al., 2007]


Feature Importance [Clarke et al., 2007]


Important Words in Snippet [Clarke et al., SIGIR 2007]


Extension: Use Fair Pairs Randomization

[Figures: click data and an example result under the pair randomization; the bars within each randomized pair should be equal if clicks are unbiased.]

[Yue et al., WWW 2010]
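A FairPairs-style randomization can be sketched as follows: adjacent result pairs are flipped at random, so each order within a pair is shown about equally often and clicks on the two results can be compared without position bias (the ranking and counts here are illustrative):

```python
import random

def fairpairs_present(ranking, rng):
    """FairPairs-style randomization: flip each adjacent pair (1,2),
    (3,4), ... with probability 0.5, so within a pair both orders are
    shown equally often and within-pair clicks compare fairly."""
    out = list(ranking)
    for i in range(0, len(out) - 1, 2):
        if rng.random() < 0.5:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

rng = random.Random(1)
shown = [fairpairs_present(["A", "B", "C", "D"], rng) for _ in range(2000)]
frac_A_first = sum(1 for s in shown if s[0] == "A") / 2000
# within the (A, B) pair, each order appears about half the time
```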


Viewing Organic Results vs. Ads

• Ads and Organic results

compete for user

attention

[Figure panels: Navigational vs. Other; Diversity vs. Similarity]


Danescu-Niculescu-Mizil et al., WWW 2010


Part 2: Inferring Searcher Intent

• Inferring result relevance: clicks

• Richer behavior models:

– SERP presentation info

– Richer interaction models: +presentation, +behavior


Richer Behavior Models

• Behavior measures of Interest

– Browsing, scrolling, dwell time

– How to estimate relevance?

• Heuristics

• Learning-based

– General model: Curious Browser [Fox et al., TOIS 2005]

– Query + Browsing [Agichtein et al., SIGIR 2006]

– Active Prediction: [Yun et al., WWW 2010]


Curious Browser [Fox et al., 2003]


Data Analysis

• Bayesian modeling at result and session level

• Trained on 80% and tested on 20%

• Three levels of satisfaction (SAT): very satisfied (VSAT), partially satisfied (PSAT), and dissatisfied (DSAT)

• Implicit measures:

Result-level: Diff Secs, Duration Secs; Scrolled, ScrollCnt, AvgSecsBetweenScroll, TotalScrollTime, MaxScroll; TimeToFirstClick, TimeToFirstScroll; Page, Page Position, Absolute Position; Visits; Exit Type; ImageCnt, PageSize, ScriptCnt; Added to Favorites, Printed

Session-level: Averages of result-level measures (Dwell Time and Position); Query count; Results set count; Results visited; End action

[Fox et al., 2003]


Data Analysis, cont’d [Fox et al., 2003]


Result-Level Findings

1. Dwell time, clickthrough and exit type

strongest predictors of SAT

2. Printing and Adding to Favorites highly

predictive of SAT when present

3. Combined measures predict SAT better

than clickthrough

[Fox et al., 2003]


Result Level Findings, cont’d

Only clickthrough

Combined measures

Combined measures with

confidence of > 0.5 (80-20

train/test split)

[Fox et al., 2003]


Learning Result Preferences in Rich User Interaction Space

• Observed and Distributional features

– Observed features: aggregated values over all user interactions for

each query and result pair

– Distributional features: deviations from the “expected” behavior

for the query

• Represent user interactions as vectors in “Behavior Space”

– Presentation: what a user sees before click

– Clickthrough: frequency and timing of clicks

– Browsing: what users do after the click

[Agichtein et al., 2006]


Features for Behavior Representation [Agichtein et al., SIGIR 2006]

Sample behavior features:

Presentation
ResultPosition – Position of the URL in current ranking
QueryTitleOverlap – Fraction of query terms in result title

Clickthrough
DeliberationTime – Seconds between query and first click
ClickFrequency – Fraction of all clicks landing on page
ClickDeviation – Deviation from expected click frequency

Browsing
DwellTime – Result page dwell time
DwellTimeDeviation – Deviation from expected dwell time for query


Predicting Result Preferences

• Task: predict pairwise preferences

– A judge will prefer Result A > Result B

• Models for preference prediction

– Current search engine ranking

– Clickthrough

– Full user behavior model

[Agichtein et al., SIGIR2006]


User Behavior Model

• Full set of interaction features

– Presentation, clickthrough, browsing

• Train the model with explicit judgments

– Input: behavior feature vectors for each query-page pair in rated results

– Use RankNet [Burges et al., ICML 2005] to discover model weights

– Output: a neural net that can assign a “relevance” score to a behavior feature vector

[Agichtein et al., SIGIR2006]
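The end use of such a model can be sketched as follows: a learned scorer maps each query-result behavior vector to a relevance score, and A > B is predicted when score(A) > score(B). A linear scorer stands in here for the RankNet neural net, and all feature values and weights are illustrative, not learned:

```python
def preference(features_a, features_b, weights):
    """Pairwise preference from behavior features: score each behavior
    vector with a learned model and predict A > B when
    score(A) > score(B).  (Linear scorer as a stand-in for RankNet.)"""
    def score(f):
        return sum(weights[k] * v for k, v in f.items())
    return "A>B" if score(features_a) > score(features_b) else "B>A"

# illustrative weights and behavior vectors (not learned values)
weights = {"ClickFrequency": 2.0, "DwellTime": 0.01, "ResultPosition": -0.1}
a = {"ClickFrequency": 0.4, "DwellTime": 120.0, "ResultPosition": 3}
b = {"ClickFrequency": 0.3, "DwellTime": 15.0, "ResultPosition": 1}
pref = preference(a, b, weights)   # -> "A>B"
```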


Results: Predicting User Preferences

[Figure: precision vs. recall for the Baseline, SA+N, CD, and UserBehavior preference-prediction strategies.]

• Baseline < SA+N < CD << UserBehavior

• Rich user behavior features result in dramatic improvement

[Agichtein et al., SIGIR2006]


Predicting Queries from Browsing Behavior

• Identify “Search Trigger” browse-search patterns

• Distribution of “Search-Browse” patterns:

URLs: movies.about.com/ nationalpriorities.org pds.jpl.nasa.gov/planets


[Cheng et al., WWW 2010]


Summary of Part 2

• Click data contains important information about the

distribution of intents for a query

• For accurate interpretation, must model the (many)

biases present:

– Presentation, demographics, types of intent


Part 3: Applications and Extensions

• Improving search ranking

– Implicit feedback

• Predicting Intent and Behavior

– Query suggestion, ads

• Search personalization


Observable Behavior

Behavior categories (rows), with minimum scope ranging over Segment, Object, and Class:

• Examine: View, Listen
• Retain: Print, Bookmark, Save, Purchase, Delete, Subscribe
• Reference: Copy / paste, Quote, Forward, Reply, Link, Cite
• Annotate: Mark up, Rate, Publish, Organize


Eye Tracking

• Unobtrusive

• Relatively precise (accuracy: ~1° of visual angle)

• Expensive

• Mostly used as a "passive" tool for behavior analysis, e.g., visualized by heatmaps

• We use eye tracking for immediate implicit feedback, taking into account temporal fixation patterns


Using Eye Tracking for Relevance Feedback

1. Starting point: noisy gaze data from the eye tracker

2. Fixation detection and saccade classification

3. Reading (red) and skimming (yellow) detection, line by line

See G. Buscher, A. Dengel, L. van Elst: “Eye Movements as Implicit Relevance Feedback”, in CHI '08

[Buscher et al., 2008]


Three Feedback Methods Compared

Input: viewed documents

Baseline – TF×IDF, based on opened entire documents
Gaze-Filter – TF×IDF, based on read or skimmed passages
Gaze-Length-Filter – Interest(t) × TF×IDF, based on length of coherently read text
Reading Speed – ReadingScore(t) × TF×IDF, based on read vs. skimmed passages containing term t

[Buscher et al., 2008]


Eye Tracking-based RF Results

[Buscher et al., 2008]


Instrumenting SERP Interactions: EMU


• EMU: Firefox + LibX plugin instrumentation → HTTP log

• Track whitelisted sites, e.g., Emory, Google, Yahoo search…

• All SERP events logged (asynchronous HTTP requests)

• 150 public-use machines, 5,000+ opted-in users

[Diagram: browser instrumentation → HTTP server → HTTP log → usage data → data mining & management → train prediction models]


Gui, Agichtein, et al., JCDL 2009


Classifying Research vs. Purchase Intent

• 12 subjects (grad students and staff) asked to

1. Research a product they want to purchase eventually

(Research intent)

2. Search for a best deal on an item they want to

purchase immediately (Purchase intent)

• Eye tracking and browser instrumentation

performed in parallel for some of the subjects


Guo & Agichtein, SIGIR 2010


Research Intent


Guo & Agichtein, SIGIR 2010


Purchase Intent


Guo & Agichtein, SIGIR 2010


Contextualized Intent Inference


Guo & Agichtein, SIGIR 2010


Implementation: Conditional Random Field (CRF) Model


Guo & Agichtein, SIGIR 2010


Results: Ad Click Prediction

• 200%+ precision improvement (within task)


Guo & Agichtein, SIGIR 2010


Application: Learning to Rank from Click Data


[ Joachims 2002 ]


Results


[ Joachims 2002 ]

Summary:

Learned ranking outperforms all base methods in the experiment

→ Learning from clickthrough data is possible

→ Relative preferences are useful training data.


Extension: Query Chains


[Radlinski & Joachims, KDD 2005]


Query Chains (Cont’d)


[Radlinski & Joachims, KDD 2005]


Query Chains (Results)

• Query Chains add slight improvement over clicks


[Radlinski & Joachims, KDD 2005]


Richer Behavior for Dynamic Ranking


[Agichtein et al., SIGIR2006]

Sample Behavior Features (from Lecture 2):

Presentation
ResultPosition – Position of the URL in current ranking
QueryTitleOverlap – Fraction of query terms in result title

Clickthrough
DeliberationTime – Seconds between query and first click
ClickFrequency – Fraction of all clicks landing on page
ClickDeviation – Deviation from expected click frequency

Browsing
DwellTime – Result page dwell time
DwellTimeDeviation – Deviation from expected dwell time for query


Feature Merging: Details

• Value scaling:

– Binning vs. log-linear vs. linear (e.g., μ=0, σ=1)

• Missing Values:

– 0? (meaning for normalized feature values s.t. μ=0?)

• “real-time”: significant architecture/system problems

Query: SIGIR (fake results with fake feature values)

Result URL           BM25  PageRank  …  Clicks  DwellTime  …
sigir2007.org        2.4   0.5       …  ?       ?          …
sigir2006.org        1.4   1.1       …  150     145.2      …
acm.org/sigs/sigir/  1.2   2         …  60      23.5       …

[Agichtein et al., SIGIR2006]


Review: NDCG

• Normalized Discounted Cumulative Gain

• Multiple Levels of Relevance

• DCG:

– contribution of the ith rank position: (2^{y_i} − 1) / log(1 + i)

– Example: a ranking with gains y = (1, 2, 1, 0, 1) has DCG score

1/log(2) + 3/log(3) + 1/log(4) + 0/log(5) + 1/log(6) ≈ 5.45 (natural log)

• NDCG is normalized DCG

– the best possible ranking has score NDCG = 1


Human Judgments

http://jobs.monsterindia.com/details/7902838.html


Results for Incorporating Behavior into Ranking

          MAP    Gain
RN        0.270
RN+ALL    0.321  0.052 (19.13%)
BM25      0.236
BM25+ALL  0.292  0.056 (23.71%)

[Figure: NDCG at K = 1…10 for RN, Rerank-All, and RN+All.]

[Agichtein et al., SIGIR2006]


Which Queries Benefit Most

[Figure: query frequency and average NDCG gain, bucketed by original ranking quality.]

Most gains are for queries with poor original ranking

[Agichtein et al., SIGIR2006]


Extension to Unseen Queries/Documents: Search Trails


[Bilenko and White, WWW 2008]

• Trails start with a search engine query

• Continue until a terminating event

– Another search

– Visit to an unrelated site (social networks, webmail)

– Timeout, browser homepage, browser closing

Probabilistic Model

• IR via language modeling [Zhai-Lafferty, Lavrenko]

• Query-term distribution gives more mass to rare

terms:

• Term-website weights combine dwell time and counts

[Bilenko and White, WWW 2008]
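The idea of combining dwell time and counts into term-website weights can be sketched as follows; the log-dwell weighting, IDF values, and trail data below are illustrative stand-ins, not the paper's exact estimator:

```python
import math
from collections import defaultdict

def trail_relevance(query_terms, trails, idf):
    """Search-trails relevance sketch: accumulate a weight for each
    (term, site) pair from dwell-time-weighted visits on trails whose
    originating query contained the term; score a site for a query as
    the IDF-weighted sum over query terms (rare terms get more mass)."""
    w = defaultdict(float)
    for q_terms, site, dwell_secs in trails:
        for t in q_terms:
            w[(t, site)] += math.log(1 + dwell_secs)
    def rel(site):
        return sum(idf.get(t, 0.0) * w[(t, site)] for t in query_terms)
    return rel

# illustrative data: (query terms, visited site, dwell seconds)
idf = {"mars": 0.5, "rover": 2.0}            # rarer term weighted higher
trails = [({"mars", "rover"}, "nasa.gov", 300),
          ({"mars"}, "candy.example.com", 20)]
rel = trail_relevance({"mars", "rover"}, trails, idf)
# nasa.gov scores higher than candy.example.com for this query
```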


Results: Learning to Rank

Add Rel(q, d_i) as a feature to RankNet

[Figure: NDCG@1, NDCG@3, and NDCG@10 for Baseline, Baseline+Heuristic, Baseline+Probabilistic, and Baseline+Probabilistic+RW.]

[Bilenko and White, WWW 2008]


Personalization


Which Queries to Personalize?

• Personalization benefits ambiguous queries

• Inter-rater reliability (Fleiss’ kappa)

– Observed agreement (Pa) exceeds expected (Pe)

– κ = (Pa-Pe) / (1-Pe)

• Relevance entropy

– Variability in the probability that a result is relevant (Pr)

– S = -Σ Pr log Pr

• Potential for personalization

– Ideal group ranking differs from ideal personal ranking

– P4P = 1 − nDCG_group
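The entropy measure above can be sketched directly; a query whose clicks concentrate on one result has low entropy (little intent variation), while a query with spread-out clicks has high entropy and more potential for personalization (click counts below are illustrative):

```python
import math

def click_entropy(click_counts):
    """Click entropy S = -sum p log p over a query's click distribution
    (base 2); higher entropy suggests more variation in user intent."""
    total = sum(click_counts)
    ps = [c / total for c in click_counts if c]
    return -sum(p * math.log(p, 2) for p in ps)

navigational = click_entropy([98, 1, 1])     # clicks concentrate: low entropy
ambiguous = click_entropy([30, 25, 25, 20])  # clicks spread out: high entropy
```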


Teevan, J, S. T. Dumais, and D. J. Liebling. To personalize or not to

personalize: modeling queries with variation in user intent., SIGIR 2008

[Teevan et al., 2008]


Predicting Ambiguous Queries

Information features, grouped by source (Query vs. Results) and whether search history is required:

Query, no history: Query length; Contains URL; Contains advanced operator; Time of day issued; Number of results (df); Number of query suggests; Reformulation probability

Query, with history: # of times query issued; # of users who issued query; Avg. time of day issued; Avg. number of results; Avg. number of query suggests

Results, no history: Query clarity; ODP category entropy; Number of ODP categories; Portion of non-HTML results; Portion of results from .com/.edu; Number of distinct domains

Results, with history: Result entropy; Avg. click position; Avg. seconds to click; Avg. clicks per user; Click entropy; Potential for personalization

Teevan, J, S. T. Dumais, and D. J. Liebling. To personalize or not to

personalize: modeling queries with variation in user intent., SIGIR 2008

[Teevan et al., 2008]


Clustering Query Refinements by User Intent

• Approach:

– Intent = set of visited documents

– Cluster refinements using document visit distribution vectors

[Example: Mars (candy) vs. Mars (planet)]


[Sadikov et al., WWW 2010]


Approaches to Personalization

1. Pitkow et al., 2002

2. Qiu et al., 2006

3. Jeh et al., 2003

4. Teevan et al., 2005

5. Das et al., 2007


Figure adapted from: Personalized search on the world wide web, by

Micarelli, A. and Gasparetti, F. and Sciarrone, F. and Gauch, S., LNCS 2007


Personalization Research

• Ask the searcher

– Is this relevant?

• Look at searcher’s clicks

• Similarity to content the searcher has seen before

Teevan et al., TOCHI 2010


Ask the Searcher

• Explicit indicator of relevance

• Benefits

– Direct insight

• Drawbacks

– Amount of data limited

– Hard to get answers for the same query

– Unlikely to be available in a real system

Teevan et al., TOCHI 2010


Searcher’s Clicks

• Implicit, behavior-based indicator of relevance

• Benefits

– Possible to collect from all users

• Drawbacks

– People click by mistake or get sidetracked

– Biased towards what is presented

Teevan et al., TOCHI 2010


Similarity to Seen Content

• Implicit content-based indicator of relevance

• Benefits

– Can collect from all users

– Can collect for all queries

• Drawbacks

– Privacy considerations

– Measures of textual similarity noisy

Teevan et al., TOCHI 2010


Evaluating Personalized Search

• Explicit judgments (offline and in situ)

– Evaluate components before system

– NOTE: judgments capture what’s relevant for you, not for users in general

• Deploy system

– Verbatim feedback, Questionnaires, etc.

– Measure behavioral interactions (e.g., click, reformulation, abandonment, etc.)

– Click biases: order, presentation, etc.

– Interleaving for unbiased clicks

• Link implicit and explicit (Curious Browser toolbar)

• From single query to search sessions and beyond
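Interleaving, mentioned above as a way to get unbiased clicks, can be sketched with the standard team-draft scheme (a simplified illustration; document names are made up):

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Team-draft interleaving: merge two rankings so that a click on the
    combined list credits the ranker that contributed the clicked result,
    removing position bias from the comparison."""
    rng = rng or random.Random(0)
    all_docs = set(ranking_a) | set(ranking_b)
    interleaved, team = [], {}
    count = {"A": 0, "B": 0}
    while len(interleaved) < len(all_docs):
        # the team with fewer contributions picks next; ties: coin flip
        if count["A"] < count["B"] or (count["A"] == count["B"] and rng.random() < 0.5):
            side, ranking = "A", ranking_a
        else:
            side, ranking = "B", ranking_b
        doc = next((d for d in ranking if d not in team), None)
        if doc is None:  # this ranker has no unused documents left
            side = "B" if side == "A" else "A"
            ranking = ranking_a if side == "A" else ranking_b
            doc = next(d for d in ranking if d not in team)
        team[doc] = side
        count[side] += 1
        interleaved.append(doc)
    return interleaved, team

def credit(team, clicked_docs):
    """Count clicks per ranker; the ranker with more clicks wins the query."""
    wins = {"A": 0, "B": 0}
    for doc in clicked_docs:
        wins[team[doc]] += 1
    return wins

ranked, team = team_draft_interleave(["d1", "d2", "d3"], ["d2", "d4", "d1"])
print(ranked, credit(team, ["d2"]))
```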


User Control in Personalization (Relevance Feedback)


Ahn, J.-S., Brusilovsky, P., He, D., and Syn, S. Y. Open User Profiles for Adaptive News Systems: Help or Harm? WWW 2007


Personalization Summary

• Lots of relevant content ranked low

• Potential for personalization high

• Implicit measures capture explicit variation

– Behavior-based: Highly accurate

– Content-based: Lots of variation

• Example: Personalized Search

– Behavior + content work best together

– Improves search result clickthrough


New Direction: Active Learning

• Goal: Learn the relevances with as little training data as possible.

• Search involves a three-step process:

1. Given relevance estimates, pick a ranking to display to users.

2. Given a ranking, users provide feedback: user clicks provide pairwise relevance judgments.

3. Given feedback, update the relevance estimates.


[Radlinski & Joachims, KDD 2007]
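The three-step loop can be sketched as follows, with a deliberately simple additive update standing in for the paper's Bayesian machinery (document names and the learning rate are illustrative):

```python
def rank(scores):
    """Step 1: display documents in descending order of estimated relevance."""
    return sorted(scores, key=scores.get, reverse=True)

def update_from_click(scores, clicked, skipped, lr=0.1):
    """Steps 2-3: a click on `clicked` while `skipped` was ranked above it is
    read as the pairwise preference clicked > skipped (Joachims et al.);
    nudge the relevance estimates accordingly."""
    scores[clicked] += lr
    scores[skipped] -= lr

# Hypothetical session: the user repeatedly skips d1 and clicks d2.
scores = {"d1": 0.5, "d2": 0.4, "d3": 0.1}
for _ in range(3):
    shown = rank(scores)                   # step 1: show current best ranking
    update_from_click(scores, "d2", "d1")  # steps 2-3: observed preference
print(rank(scores))  # d2 now ranked above d1
```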


Overview of Approach

• Available information:

1. Have an estimate of the relevance of each result.

2. Can obtain pairwise comparisons of the top few results.

3. Do not have absolute relevance information.

• Goal: Learn the document relevance quickly.

• Addresses four questions:

1. How to represent knowledge about doc relevance.

2. How to maintain this knowledge as we collect data.

3. Given our knowledge, what is the best ranking?

4. What rankings do we show users to get useful data?


[Radlinski & Joachims, KDD 2007]


AAAI 2010 Tutorial: Inferring Searcher Intent 7/11/2010

• Given a fixed query, maintain knowledge about relevance as clicks are observed.

– This tells us which documents we are sure about, and which ones need more data.


1: Representing Document Relevance [Radlinski & Joachims, KDD 2007]
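One simple way to represent this knowledge, consistent with the slide's "sure vs. needs more data" distinction, is to keep pairwise win/loss counts per document and read off a Beta-posterior mean and variance. This is an illustrative stand-in, not the paper's actual model:

```python
def update_estimate(stats, winner, loser):
    """Record one observed pairwise click preference (winner > loser)."""
    stats.setdefault(winner, [0, 0])[0] += 1  # winner's win count
    stats.setdefault(loser, [0, 0])[1] += 1   # loser's loss count

def mean_and_variance(stats, doc):
    """Mean and variance of a Beta(wins+1, losses+1) posterior over the
    document's relevance: low variance = sure, high variance = needs data."""
    wins, losses = stats.get(doc, [0, 0])
    a, b = wins + 1, losses + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

stats = {}
for _ in range(8):
    update_estimate(stats, "d1", "d2")  # d1 consistently preferred over d2
print(mean_and_variance(stats, "d1"))  # high mean, shrinking variance
print(mean_and_variance(stats, "d3"))  # unseen doc: mean 0.5, max variance
```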


• Problem: we could present the ranking based on the current best estimate of relevance.

– Then the data we get would always be about the documents already ranked highly.

• Instead, optimize the ranking shown to users:

1. Pick the top two documents to minimize future loss.

2. Append the current best-estimate ranking.


4: Getting Useful Data [Radlinski & Joachims, KDD 2007]
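A simplified version of this strategy: place the two documents whose relevance is most uncertain in the top two positions, where a click comparison is most informative, and rank the rest by estimated relevance. This is a sketch; the paper picks the pair that minimizes expected future loss:

```python
def exploratory_ranking(estimates):
    """estimates: doc -> (mean relevance, variance).
    Put the two most uncertain documents in slots 1-2 (exploration),
    then append the remaining documents ranked by mean (exploitation)."""
    by_uncertainty = sorted(estimates, key=lambda d: estimates[d][1], reverse=True)
    top_two = by_uncertainty[:2]
    rest = sorted((d for d in estimates if d not in top_two),
                  key=lambda d: estimates[d][0], reverse=True)
    return top_two + rest

# Hypothetical (mean, variance) relevance estimates:
est = {"d1": (0.9, 0.01), "d2": (0.5, 0.30), "d3": (0.4, 0.25), "d4": (0.8, 0.02)}
print(exploratory_ranking(est))  # ['d2', 'd3', 'd1', 'd4']
```

Clicks on the top two positions then yield a pairwise comparison precisely between the documents about which the system knows least.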


4: Exploration Strategies [Radlinski & Joachims, KDD 2007]


Results: TREC Data [Radlinski & Joachims, KDD 2007]

Optimizing for relevance estimates works better than optimizing for ordering


Tutorial Summary

• Understanding user behavior at micro-, meso-, and macro-levels

• Theoretical models of information seeking

• Web search behavior:

– Levels of detail

– Search intent

– Variations in web searcher behavior

– Keeping found things found

– Click models


Tutorial Summary (2)

• Inferring result relevance: clicks

• Richer behavior models:

– SERP presentation info

– Post-search behavior

– Rich interaction models for SERPs

• Contextualizing intent inference:

– Session-level models

– Personalization


Inferring Searcher Intent (Information)

Eugene Agichtein

Emory University

• Tutorial page: http://ir.mathcs.emory.edu/intent_tutorial/

– See the online version for an expanded and updated bibliography

• Contact information for the instructor:

– Eugene Agichtein

– Email: [email protected]

– Homepage: http://www.mathcs.emory.edu/~eugene/


References and Further Reading (1)

• Hearst, M. Search User Interfaces, 2009, Chapter 3, “Models of the Information Seeking Process”: http://searchuserinterfaces.com/

• Teevan, J., Adar, E., Jones, R., and Potts, M. Information Re-Retrieval: Repeat Queries in Yahoo's Logs, SIGIR 2007

• Clarke, C., Agichtein, E., Dumais, S., and White, R. W. The Influence of Caption Features on Clickthrough Patterns in Web Search, SIGIR 2007

• Craswell, N., Zoeter, O., Taylor, M., and Ramsey, B. An Experimental Comparison of Click Position-Bias Models, WSDM 2008

• Dupret, G. and Piwowarski, B. A User Browsing Model to Predict Search Engine Click Data from Past Observations, SIGIR 2008

• White, R. and Morris, D. Investigating the Querying and Browsing Behavior of Advanced Search Engine Users, SIGIR 2007



References and Further Reading (3)

• Kelly, D. and Teevan, J. Implicit Feedback for Inferring User Preference: A Bibliography, SIGIR Forum 37(2), September 2003

• Joachims, T., Granka, L., Pan, B., Hembrooke, H., and Gay, G. Accurately Interpreting Clickthrough Data as Implicit Feedback, SIGIR 2005

• Agichtein, E., Brill, E., Dumais, S., and Ragno, R. Learning User Interaction Models for Predicting Web Search Result Preferences, SIGIR 2006

• Buscher, G., Dengel, A., and van Elst, L. Query Expansion Using Gaze-Based Feedback on the Subdocument Level, SIGIR 2008

• Chapelle, O. and Zhang, Y. A Dynamic Bayesian Network Click Model for Web Search Ranking, WWW 2009

• Piwowarski, B., Dupret, G., and Jones, R. Mining User Web Search Activity with Layered Bayesian Networks or How to Capture a Click in Its Context, WSDM 2009

• Guo, Q. and Agichtein, E. Ready to Buy or Just Browsing? Detecting Web Searcher Goals from Interaction Data, to appear, SIGIR 2010
