Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.


Page 1: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Social Tagging

Uichin Lee

KSE652 Social Computing Systems Design and Analysis

Page 2: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Survey on Social Tagging Techniques

Manish Gupta, Rui Li, Zhijun Yin, Jiawei Han, SIGKDD Explorations, 2010

Page 3: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

What is “tag”?

A tag assignment links a user, a tag, and a resource.

tag assignment   user   tag         resource
y1:              u1     jazzmusic   r1
y2:              u1     trumpet     r3
y3:              u2     trumpet     r3

http://wis.ewi.tudelft.nl/icwe2011/tutorial/tutorial-slides.pptx
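The tag assignments on this slide can be written as (user, tag, resource) triples. The following is a minimal Python sketch, assuming that reading of the figure and that "jazzmusic" is a single tag. The names `TagAssignment`, `tags_of`, and `users_of` are illustrative and do not come from the tutorial.

```python
from collections import namedtuple

# One tag assignment: a user attaches a tag to a resource.
TagAssignment = namedtuple("TagAssignment", ["user", "tag", "resource"])

# The three assignments y1..y3 from the slide (column order is an assumption).
assignments = [
    TagAssignment("u1", "jazzmusic", "r1"),  # y1
    TagAssignment("u1", "trumpet", "r3"),    # y2
    TagAssignment("u2", "trumpet", "r3"),    # y3
]

def tags_of(resource, assignments):
    """Return the set of distinct tags attached to a resource by any user."""
    return {a.tag for a in assignments if a.resource == resource}

def users_of(tag, assignments):
    """Return the set of users who applied a given tag."""
    return {a.user for a in assignments if a.tag == tag}
```

Keeping the user in every triple is what later allows both per-user and per-resource views of the same data.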

Page 4: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag photos on Flickr

Page 5: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag photos on Flickr

Page 6: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag URLs on Delicious

http://bierdoctor.com/papers/cscw08/ejrader-rwash-tagging-cscw.pdf

Page 7: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Bloggers: WordPress, LiveJournal

Page 8: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Hash tags in Twitter

Page 9: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

CiteULike

Page 10: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Design Space

• Tag sharing
• Tag selection/suggestion: how to select/display a set of tags?
• Item ownership:
– Apply tags to items users created (e.g., photos in Flickr)
– Apply tags to items others created (e.g., product pages in Amazon)
• Tag scope
– Broad: <user, item, tag> (personal tag to an item; Delicious)
– Narrow: <item, tag> (single shared tags to an item; Flickr)
• Other dimensions: tag delimiter (one or multiple words), how to normalize tags across factors like letter case, white space, etc.

tagging, communities, vocabulary, evolution, Sen et al., CSCW 2006
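Two of the design dimensions above, tag normalization and tag scope, can be illustrated in a few lines. This is a minimal Python sketch under stated assumptions: the normalization rule (trim, collapse whitespace, lower-case) is only one possible choice, and the function names and sample data are hypothetical.

```python
import re
from collections import defaultdict

def normalize_tag(raw):
    """Trim a tag, collapse internal whitespace runs, and lower-case it."""
    return re.sub(r"\s+", " ", raw.strip()).lower()

def broad_view(assignments):
    """Broad scope: keep <user, item, tag>, so each user's tags stay separate."""
    return {(user, item, normalize_tag(tag)) for user, item, tag in assignments}

def narrow_view(assignments):
    """Narrow scope: keep <item, tag>, one shared tag set per item."""
    shared = defaultdict(set)
    for _user, item, tag in assignments:
        shared[item].add(normalize_tag(tag))
    return dict(shared)

# Hypothetical raw input: two users spell the same tag differently.
raw = [
    ("alice", "photo1", "New  York"),
    ("bob", "photo1", "new york "),
    ("bob", "photo1", "Skyline"),
]
```

In the broad view the two "new york" assignments remain distinct records because they belong to different users; in the narrow view they collapse into one shared tag on the item.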

Page 11: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Applications

• Indexing: faster/deeper indexing (e.g., Delicious)
• Search: social and semantic expansions for web search; personalized search; enterprise search; searching library catalogues
• Enhanced browsing: tag clouds; popularity-driven browsing, filtering
• Taxonomy generation (e.g., folksonomy)
• Clustering/classification: clustering/classifying web objects (or blog entries) [tag + text if any]
• Social interest discovery: user interest profiling, discovering current popular places/events (e.g., Flickr)
• Recommendation/personalization

Page 12: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Contents

• Taxonomy? Folksonomy?
• Tagging Motivations
• Tag Types
• Linguistic Classification
• Tag Generation Models
• Tag Distributions
• Tag Semantics
• Tag Visualization

Page 13: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Taxonomy? Folksonomy?

• Problems with metadata generation and fixed taxonomies
– Manual, expensive, different vocabulary
– Fixed static taxonomies are rigid, conservative, and centralized
– “Post activation analysis paralysis” (Sinha 2005)
   • A state of fear that you will make the wrong decision. And the item will be lost forever - it will land in some deep well, some hard to access branch of the tree and disappear from your view and attention.
• Folksonomies as a solution
– Folksonomy: folk (people) + taxis (classification) + nomos (management)
– Emergent and iterative system

Page 14: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging Motivations

• (Easing) Future Retrieval (e.g., toread)
• Contribution and Sharing
• Attract Attention (if popular)
• Play and Competition (e.g., ESP games)
• Self-Referential Tags (mystuff, myLaptop)
• Opinion Expression
• Task Organization (e.g., gtd, jobsearch)
• Social Signaling (contextual info about an object)
• Money (e.g., tagging tasks in M-Turk)
• Technological Ease (e.g., Phonetags)

Page 15: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging Motivations in Flickr

Why we tag: motivations for annotation in mobile and online media, M. Ames, and M. Naaman, CHI 2007

ZoneTag

Flickr

Page 16: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging Motivations in Flickr

Why we tag: motivations for annotation in mobile and online media, M. Ames, and M. Naaman, CHI 2007

Page 17: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging Motivations in Flickr

What Drives Content Tagging: The Case of Photos on Flickr, Oded Nov, Mor Naaman, Chen Ye, CHI 2008

Page 18: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging Motivations in Flickr

What Drives Content Tagging: The Case of Photos on Flickr, Oded Nov, Mor Naaman, Chen Ye, CHI 2008

Number of Tags (R² = .571)

(from survey)

(from usage data; Flickr API)

Page 19: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Types

• Content-based tags (autos, Honda, batman, Lucene)
• Context-based tags (location, time)
• Attribute tags (Jeremy’s Blog): qualities or characteristics
• Ownership tags: identifying who owns the resource
• Subjective tags (opinion, emotion)
• Organizational tags (mywork, mypaper)
• Purpose tags (related to info seeking, e.g., “learn_LATEX”)
• Factual tags (people, place, concepts)
• Personal tags
• Self-referential tags
• Tag bundles (tagging tags)

Page 20: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Linguistic Classification

• Functional (describing functions; e.g., weapon)
• Functional collocation (function + place/time; e.g., furniture, tableware)
• Origin collocation (why things are together; e.g., dirty dishes)
• Function or origin (e.g., “Michelangelo”, “medieval”)
• Taxonomic (classifying objects)
• Adjective (e.g., red, great, funny, beautiful)
• Verb (action; e.g., “explore”, “todo”, “jumping”)
• Proper name (e.g., “New Zealand”)

Page 21: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Generation Models

• Factors
– Users’ background knowledge
– Previous tags suggested by others
– Content of the resources
– Community influences
– Tag selection algorithm
– And others

Page 22: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Generation Models

• Basic Polya Urn Model
– Captures popularity of assigned tags but does not consider new tags
• Yule-Simon Model
– A new word is added with probability p; otherwise (probability 1-p) an existing word is reused, each word chosen with probability proportional to its frequency (leading to a power-law distribution)
• Information value based model
– Previous tag assignments vs. information value of a tag
• More parameters
– User background knowledge, number of previous tags the user has accessed, most popular tags
• Language model
– Content affects tag generation (tagging ~ language model)
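The Yule-Simon process described on this slide is easy to simulate. The sketch below is a minimal Python illustration, not the survey's formulation: the function name, the tag naming scheme ("tag0", "tag1", ...), and the parameter values are assumptions chosen for the example.

```python
import random
from collections import Counter

def yule_simon_stream(n, p, seed=0):
    """Generate n tag tokens under a Yule-Simon process.

    With probability p a brand-new tag is created; otherwise an existing
    tag is reused with probability proportional to its current frequency.
    """
    rng = random.Random(seed)
    stream = []
    next_id = 0
    for _ in range(n):
        if not stream or rng.random() < p:
            stream.append(f"tag{next_id}")
            next_id += 1
        else:
            # Picking a uniformly random earlier token selects each tag
            # with probability proportional to its frequency so far.
            stream.append(rng.choice(stream))
    return stream

# Illustrative run; n and p are arbitrary example values.
stream = yule_simon_stream(n=5000, p=0.1)
counts = Counter(stream)
top = counts.most_common(5)
```

With p = 0 no new tag ever enters after the first one, which mirrors the limitation noted for the basic Polya urn model; with p close to 1 almost every token is a new tag and no popular tags emerge.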

Page 23: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Distributions

• Vocabulary growth over time follows a power law (at both the system and the resource level)
– $N(t) \sim t^{r}$, $r < 1$
– $dN(t)/dt \sim t^{r-1}$; new tags appear less and less frequently as time passes
• A user’s set of distinct tags grows linearly as new resources are added. But user vocabulary growth tends to decline over time
• Vocabulary rank-frequency follows a power law

Figure: Tags, rank-frequency plot

Figure: Temporal evolution, total # of distinct tags

Vocabulary growth in collaborative tagging systems, Cattuto et al., 2007
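The sub-linear vocabulary growth can be checked on any tag stream by counting distinct tags over time and fitting the slope on a log-log scale. This is a minimal Python sketch, assuming a plain least-squares fit is good enough for illustration; it is not the estimation procedure used by Cattuto et al., and the function names are hypothetical.

```python
import math

def distinct_tag_growth(stream):
    """Return N(t): the number of distinct tags seen after t assignments."""
    seen, growth = set(), []
    for tag in stream:
        seen.add(tag)
        growth.append(len(seen))
    return growth

def fit_growth_exponent(growth):
    """Least-squares slope of log N(t) against log t, an estimate of r."""
    xs = [math.log(t) for t in range(1, len(growth) + 1)]
    ys = [math.log(n) for n in growth]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```

An estimated slope below 1 means that each additional assignment is less and less likely to introduce a new tag.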

Page 24: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Semantics

• Analysis of pairwise relation between tags (inter-tag relation graphs)
• Semantic tag classification (ClassTag)
– Mapping tags onto WordNet semantic categories
– Additionally using Wikipedia articles

ClassTag: Classifying Tags using Open Content Resources, Overell et al., WSDM 2009

Page 25: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Semantics

• Tags vs. keywords
– Most important words (e.g., tf or tf*idf) of the document are generally covered by the tags
– Missing keywords are often misspelled

Tag-based Social Interest Discovery, Li et al., WWW 2008

An example of the tf and tf×idf keywords and user-generated tags of a user-saved URL

(all tags attached to this URL by all users)

Page 26: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Semantics

• Tags vs. keywords
– Most important words (e.g., tf or tf*idf) of the document are generally covered by the tags
– Missing keywords are often misspelled

Tag-based Social Interest Discovery, Li et al., WWW 2008

Tag coverage for tf keywords

Tag coverage for tf×idf keywords
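Tag coverage can be read as the fraction of a document's top keywords that also appear among its tags. The sketch below is a minimal Python illustration for tf keywords only, assuming naive whitespace tokenization and no stop-word removal; tf×idf would additionally need document frequencies from a corpus. The function names are hypothetical and this is not the procedure of Li et al.

```python
from collections import Counter

def top_tf_keywords(text, k):
    """Return the k most frequent lower-cased words in the text."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    counts = Counter(w for w in words if w)
    return [w for w, _ in counts.most_common(k)]

def tag_coverage(keywords, tags):
    """Fraction of keywords that also appear among the tags."""
    if not keywords:
        return 0.0
    tag_set = {t.lower() for t in tags}
    return sum(1 for w in keywords if w in tag_set) / len(keywords)
```

A coverage near 1 for small k would correspond to the slide's claim that the most important words of a document are generally covered by its tags.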

Page 27: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Visualization

• Tag clouds for browsing/searching
– Useful for broad search (less cognitive load); but less useful for specific search
– Disadvantages: skewness towards popular items; multiple clicks; low recall
• Tag selection for tag clouds
– Due to limited screen space, select tags with higher resource coverage (representativeness, volume)
– When displaying tags, we can cluster them based on semantic relationship
• Tag evolution visualization
– Temporal evolution of tags; merging data from multiple time intervals (e.g., tagline)
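Selecting tags by resource coverage can be sketched as ranking tags by the number of distinct resources they annotate. This is a minimal Python illustration under that assumption; the function name and the alphabetical tie-breaking rule are choices made for the example, not taken from the survey.

```python
from collections import defaultdict

def select_cloud_tags(assignments, k):
    """Pick the k tags that cover the most distinct resources.

    assignments: iterable of (resource, tag) pairs.
    Ties are broken alphabetically so the result is deterministic.
    """
    resources_by_tag = defaultdict(set)
    for resource, tag in assignments:
        resources_by_tag[tag].add(resource)
    ranked = sorted(resources_by_tag, key=lambda t: (-len(resources_by_tag[t]), t))
    return ranked[:k]
```

Ranking by raw coverage favors popular tags, which is exactly the skew listed among the disadvantages of tag clouds.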

Page 28: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

tagging, communities, vocabulary, evolution

Shilad Sen, Shyong K. (Tony) Lam, Al Mamunur Rashid, Dan Cosley, Dan Frankowski, Jeremy Osterhouse, F. Maxwell Harper, John Riedl

CSCW 2006

Page 29: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Relationship between community influence and user tendency (preference, knowledge)

Page 30: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tagging in MovieLens

MovieLens movie list with tags

“Movie details page” tag display

Adding tags with auto-complete

Page 31: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Research Questions

• How strongly do investment and habit affect personal tagging behavior?

• How does the tagging community influence personal vocabulary?

• How does the tag selection algorithm affect users’ satisfaction with the system?

Page 32: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Experiment Setup

• Randomly assigned users who logged in to MovieLens during the experiment to one of four experimental groups
– Unshared
– Shared (randomly selected tags)
– Shared-pop (most popular tags)
– Shared-rec (recommend tags most commonly applied to both the target movie and to the most similar movies to the target movie)

Page 33: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Overall Tag Usage

• Overall tag usage statistics by experimental group

The tags column overall total is smaller than the sum of the groups, because two groups might independently use the same tag

Page 34: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Classification

• Factual tags identify “facts” about a movie such as people, places, or concepts. They help to describe movies and find related movies
• Subjective tags express user opinions related to a movie
• Personal tags are most often used to organize a user’s movies (item ownership, self-reference, task organization)

63% factual, 29% subjective, 3% personal

Page 35: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

How strongly do investment and habit affect personal tagging behavior?

• Similarity of the tag class of the nth tag applied by a user to
– the tag class distribution of the tags applied by the user before the nth tag (applied)
– the tag class distribution of the tags viewed by the user (viewed)
– the uniform tag class distribution (uniform)
• Example (x*y denotes cosine similarity): x(nth) = [0, 1, 0] (fact, sub, per)
– y(1~n-1, applied or viewed) = [0.62, 0.25, 0.13] => x*y = 0.37
– y~uniform = [1/3, 1/3, 1/3] => x*y = 0.58
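The similarity values in this example are consistent with x*y being cosine similarity rather than a raw dot product (a raw dot product with the uniform vector would give 1/3, not 0.58). The sketch below is a minimal Python check under that assumption. It uses y = [0.62, 0.25, 0.13], the values for which the distribution sums to 1 and the similarity comes out at 0.37; the variable names are illustrative.

```python
import math

def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

x = [0, 1, 0]                      # nth tag is subjective: (fact, sub, per)
y_history = [0.62, 0.25, 0.13]     # assumed class distribution summing to 1
y_uniform = [1 / 3, 1 / 3, 1 / 3]  # uniform baseline

sim_history = round(cosine(x, y_history), 2)  # 0.37
sim_uniform = round(cosine(x, y_uniform), 2)  # 0.58
```

A similarity to the user's own history or to viewed tags that exceeds the uniform baseline would indicate that the corresponding factor helps predict the class of the next tag.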

Page 36: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

How strongly do investment and habit affect personal tagging behavior?

• Both habit/investment and tags viewed appear to influence the class of applied tags.

Page 37: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

How strongly do investment and habit affect personal tagging behavior?

• Probability that a user’s nth applied tag is a new tag decreases over time

Page 38: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

How does the tag selection algorithm affect users’ satisfaction with the system?

• Final tag application class distribution by experimental group

The dominant tag class for each group is bolded. (Each row sums to 100%.)

Page 39: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

How does the tag selection algorithm affect users’ satisfaction with the system?

[Figure: fraction of tag applications by tag class (factual, subjective, personal) plotted against group tag application number, one panel per group: Unshared, Shared, Shared Popular, Shared Recommendation]

Page 40: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Tag Suggestion: User Satisfaction

• Participants didn’t like intrusive tag suggestion (e.g., popup after movie rating)

• Participants didn’t like the inference algorithm either
– Wrong inferences confuse users
– e.g., the suggested tag “small town” for the movie “Swiss Family Robinson”: “I’m confused – I thought it was about people on a deserted island??”

• Yet, the suggestion algorithm worked well in terms of displaying a higher number of tags
– Pervasiveness may lead users to tag more in general

Page 41: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Influences on Tag Choices in del.icio.us

Emilee Rader and Rick Wash

School of Information, University of Michigan

CSCW 2008

http://bierdoctor.com/papers/cscw08/

Page 42: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Understanding Tagging Process

• Social Hypothesis: Users’ tag choices are influenced by the tag choices of others

• Organizing Hypothesis: Users’ tag choices are personal and idiosyncratic, not influenced by others’ tag choices

Page 43: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Wash and Rader (2007)

• Respondents generally used one or more heuristics for choosing tags:
– Reuse tags they have applied before to other web pages
– Create and adhere to mental rules or definitions for specific tags
– Choose terms they imagine using to re-find bookmarks in the future

Page 44: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Research Question

• Look for a connection between the small scale (individual tag choices) and the large scale (aggregate patterns) for tags on del.icio.us.

• Hypotheses
– Imitation: Users imitate tags that previous users have applied to a web page
– Organizing: Users re-use tags that they have applied to other web pages
– Recommended: Users choose tags that are suggested via the del.icio.us posting interface

Page 45: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Regression

• Logistic regression model: tag chosen = f(used.onSite, used.byUser, interaction, tag dummies, random effect(user))
– Used.onSite: imitation
– Used.byUser: organizing
– Interaction: onSite=1, byUser=1 (i.e., Recommendation)

• Data set:
– Randomly chose 30 web pages from the sample that had been bookmarked by at least 100 users.
– In June 2007, the complete public bookmark and tag histories for all of the approximately 12,000 users who had ever bookmarked any of these 30 web pages were downloaded
– Complete tag histories for 30 web pages bookmarked in del.icio.us, as well as tag histories for all users who ever bookmarked any of those 30 web pages as of June 2007.
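The predictors of the regression can be illustrated by coding, for one bookmark, whether each candidate tag was already used on the page and whether the user had used it before. The sketch below is a minimal Python illustration of that coding step only; it does not fit the model (a logistic regression with a per-user random effect would need a statistics package), and the function name, field names, and sample tags are hypothetical rather than taken from Rader and Wash.

```python
def build_features(page_history, user_history, candidate_tags):
    """Build one predictor row per candidate tag for a single bookmark.

    page_history: tags earlier users applied to this web page.
    user_history: tags this user applied to other web pages before.
    """
    on_site, by_user = set(page_history), set(user_history)
    rows = []
    for tag in candidate_tags:
        used_on_site = int(tag in on_site)   # imitation predictor
        used_by_user = int(tag in by_user)   # organizing predictor
        rows.append({
            "tag": tag,
            "used_on_site": used_on_site,
            "used_by_user": used_by_user,
            "interaction": used_on_site * used_by_user,  # 1 only if both are 1
        })
    return rows

# Hypothetical example data for one bookmark.
rows = build_features(
    page_history=["python", "tutorial"],
    user_history=["python", "toread"],
    candidate_tags=["python", "tutorial", "toread", "web"],
)
```

Each row would then be paired with a 0/1 outcome (whether the user actually chose the tag) before fitting.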

Page 46: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Regression Results

• Organizing hypothesis is strongly supported (less influence by social and recommendation mechanisms in Delicious)

Page 47: Social Tagging Uichin Lee KSE652 Social Computing Systems Design and Analysis.

Summary

• Applications; Motivations
• Tag Types; Linguistic Classification
• Tag Generation Models; Tag Distributions
• Tag Semantics; Tag Visualization; Tag Design Space
• Relationship between community influence and personal tendency
– Influenced by personal tagging behavior and tag selection algorithm (community input)
– Tag class distribution differs widely across different groups
– Quality of tag recommendation matters
• Tagging process is mainly driven by information organizing behavior (i.e., personal tendency) on the Delicious web site.