Golder and Huberman, 2006 Journal of Information Science Usage Patterns of Collaborative Tagging...

Usage Patterns of Collaborative Tagging Systems

Golder and Huberman, 2006

Journal of Information Science

Introduction

Collaborative tagging: the process by which many users add keywords (descriptive terms) to shared content.

The paper analyzes the structure of collaborative tagging systems as well as their dynamic aspects.

It finds regularities in user activity, tag frequencies, kinds of tags used, and bursts of popularity in bookmarking, along with a remarkable stability in the relative proportions of tags within a given URL.

Categorizing, indexing

Most useful when there is nobody in the librarian role, or when there is simply too much content for a single authority to classify.

Both personal and public

Tagging and taxonomy

Taxonomies are hierarchical and exclusive

Linnaean system of classifying living things, Dewey Decimal classification for libraries, computer file systems for organizing electronic files.

Each animal is in one unambiguous category which is in turn within a yet more general one.

Tags are non-hierarchical and inclusive.

May have advantages over hierarchical taxonomies:

E.g. a researcher downloads an article about cat species native to Africa. A hierarchy forces the article into a single folder, whereas a non-exclusive system could identify it as being about a great variety of things simultaneously, including Africa and cats, as well as animals more generally and cheetahs more specifically.

May have disadvantages:

The seeker cannot be sure that a query has returned all relevant items, whereas a folder hierarchy assures the seeker that all the files it contains are in one stable place.

Tagging is like filtering: out of all the possible documents, a tag returns only those items tagged with that tag.
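The filtering view can be sketched in a few lines of Python. This is only an illustration: the URLs, the tags, and the `filter_by_tags` helper are hypothetical and are not taken from the paper or from Delicious.

```python
from typing import Dict, List, Set

# Hypothetical bookmarks: each URL maps to the set of tags attached to it.
bookmarks: Dict[str, Set[str]] = {
    "example.org/cheetahs": {"africa", "cats", "animals", "cheetahs"},
    "example.org/lions": {"africa", "cats", "animals"},
    "example.org/python-tips": {"programming", "python"},
}


def filter_by_tags(items: Dict[str, Set[str]], *tags: str) -> List[str]:
    """Return the URLs that carry every one of the requested tags."""
    wanted = set(tags)
    return sorted(url for url, item_tags in items.items() if wanted <= item_tags)


# One article can be reached through several tags at once (non-exclusive).
africa_cats = filter_by_tags(bookmarks, "africa", "cats")
cheetahs_only = filter_by_tags(bookmarks, "cheetahs")

# A synonym that nobody used returns nothing, which is the recall problem
# described above: the seeker cannot be sure all relevant items came back.
felines = filter_by_tags(bookmarks, "felines")
```

Each tag acts as an independent filter, so adding a tag to the query can only narrow the result set.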

Semantic and cognitive aspects of classification

Polysemy, synonymy, basic level variation

Polysemy: a word that has many related senses. E.g. "window" can mean the hole in the wall or the pane of glass in it.

Homonymy: a single word that has multiple unrelated meanings. E.g. searching for employment at Apple can run into conflicts with the surname of CEO Steve Jobs.

Synonymy: multiple words having the same or closely related meanings. E.g. television and tv.

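Basic level variation: people differ in the level of specificity they treat as basic, so the same item may get a general tag from one user and a specific tag from another.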

Delicious (del.icio.us)

Delicious Dynamics

Social bookmarks manager

How does it work?

Data: two sets of data, sampled from June 23 to June 27, 2005.

First set: all the URLs that appeared on the Delicious popular page during this time frame, with all bookmarks ever posted to each of those URLs regardless of time. Total: 212 URLs, 19,422 bookmarks.

Second set: a random sample of 229 users who posted during the same time frame, with all bookmarks ever posted by those users regardless of time. Total: 68,668 bookmarks.

User activity and tag quantity

Users vary greatly in the frequency and nature of their Delicious use.

There is no strong relationship between the age of a user's account and the number of days on which they created at least one bookmark (n = 229, R² = 0.52).

I.e. some users use Delicious very frequently, others less so.

There is no strong relationship between the number of bookmarks a user has created and the number of tags they used in those bookmarks (n = 229, R² = 0.33).

Some users have comparatively large sets of tags and others have comparatively small.

Users' tag lists grow over time but can exhibit very different growth rates, reflecting how users' interests develop and change.
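For readers unfamiliar with the R² values quoted in this section, the sketch below shows how such a figure can be computed for a simple linear fit, where R² equals the squared Pearson correlation. The function name and the sample numbers are made up for illustration; they are not the paper's data, and the paper's exact fitting procedure is not given in these slides.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical users: bookmarks created versus distinct tags used.
bookmark_counts = [10, 50, 120, 300, 800]
tag_counts = [8, 60, 30, 150, 90]
fit_quality = r_squared(bookmark_counts, tag_counts)
```

A value near 1 means one quantity is almost a straight-line function of the other; values such as 0.52 or 0.33 leave much of the variation between users unexplained.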

Kinds of tags

Several functions tags perform for bookmarks:

Identifying what or who it is about

Identifying what it is

Identifying who owns it

Refining categories

Identifying qualities or characteristics

Self reference

Task organizing

Users have a strong bias toward using general tags first.

As a tag's position in a bookmark increases, its rank (by frequency) in the list of tags decreases: the tags entered first tend to be the most frequently used ones.

Bookmarks

Trends in bookmarking

URLs often receive most of their bookmarks very quickly, the rate of new bookmarks decreasing over time.

Of the 212 popular URLs in the dataset, 142 (67%) reached their peak popularity in their first 10 days in Delicious, and 37 (17%) on the first day. However, 37 were in the system for six months before they reached their peak popularity.
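A minimal sketch of how a "days to peak" figure could be derived from bookmark dates follows. It assumes that peak popularity means the calendar day on which a URL received the most new bookmarks, which these slides do not define; the `days_to_peak` helper and the sample dates are hypothetical. The last two lines only re-check the percentages quoted above.

```python
from collections import Counter
from datetime import date
from typing import Sequence


def days_to_peak(bookmark_dates: Sequence[date]) -> int:
    """Days between a URL's first bookmark and its busiest day.

    Assumption: "peak" is the day with the most new bookmarks; ties are
    resolved in favour of the earliest such day.
    """
    if not bookmark_dates:
        raise ValueError("need at least one bookmark date")
    first_day = min(bookmark_dates)
    per_day = Counter(bookmark_dates)
    peak_count = max(per_day.values())
    peak_day = min(day for day, count in per_day.items() if count == peak_count)
    return (peak_day - first_day).days


# Hypothetical URL: one bookmark on day 0, three on day 2, two on day 3.
sample = [date(2005, 6, 1)] + [date(2005, 6, 3)] * 3 + [date(2005, 6, 4)] * 2
peak_offset = days_to_peak(sample)
peaked_in_first_ten_days = peak_offset < 10

# Sanity check of the shares quoted in the text.
share_first_ten_days = round(100 * 142 / 212)
share_first_day = round(100 * 37 / 212)
```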

Effect of Popular page.

Stable patterns in tag proportions

The combined tags of many users' bookmarks give rise to a stable pattern in which the proportions of each tag are nearly fixed.

After the first 100 or so bookmarks, each tag's frequency is a nearly fixed proportion of the total frequency of all tags used.
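The quantity being tracked here, each tag's share of all tags applied to a URL so far, can be computed as below. This is a sketch under assumptions: the function name and the three sample bookmarks are invented, and real data would need hundreds of bookmarks per URL before the stabilization described above could be observed.

```python
from collections import Counter
from typing import Dict, List, Sequence


def tag_proportions_over_time(
    bookmarks: Sequence[Sequence[str]],
) -> List[Dict[str, float]]:
    """After each bookmark, give every tag's share of all tags used so far."""
    counts: Counter = Counter()
    total = 0
    history: List[Dict[str, float]] = []
    for tags in bookmarks:
        counts.update(tags)
        total += len(tags)
        if total == 0:
            history.append({})  # no tags applied yet
        else:
            history.append({tag: n / total for tag, n in counts.items()})
    return history


# Hypothetical bookmarks of a single URL, in the order they were posted.
url_bookmarks = [["web", "design"], ["web"], ["web", "css"]]
history = tag_proportions_over_time(url_bookmarks)
```

Plotting one line per tag from `history` gives the kind of curve in which the proportions flatten out as bookmarks accumulate.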

This resembles the dynamics of a stochastic urn model.

Even if some tags are initially more likely than others, one nevertheless observes the same kind of convergence to a fixed, but random limit.
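The convergence can be explored with a small simulation. The sketch below assumes the urn model meant here is the classic Pólya urn (draw a ball at random, return it together with one more ball of the same colour); the function name, the starting counts, and the seeds are illustrative choices, not values from the paper.

```python
import random
from typing import Dict


def polya_urn(initial: Dict[str, int], draws: int, seed: int) -> Dict[str, float]:
    """Simulate a Polya urn and return the final proportion of each colour.

    Each step draws a colour with probability proportional to its current
    count and then adds one more ball of that colour (reinforcement).
    """
    rng = random.Random(seed)
    counts = dict(initial)
    colours = list(counts)
    for _ in range(draws):
        weights = [counts[c] for c in colours]
        drawn = rng.choices(colours, weights=weights, k=1)[0]
        counts[drawn] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


# Hypothetical start: tag "a" is initially twice as likely as tag "b".
start = {"a": 2, "b": 1}
run_one = polya_urn(start, draws=5000, seed=1)
run_two = polya_urn(start, draws=5000, seed=2)
```

Within a single run the proportions settle down as draws accumulate, while different seeds can settle at different values. That is the sense in which the limit is fixed but random, and reinforcement of earlier choices is a simple stand-in for imitation.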

Why? Imitation and shared knowledge.

Conclusion

Information tagged for personal use can benefit other users.

Users exhibit a great variety in their sets of tags.

Stable patterns emerge.

Stable choices that emerge may be used on a large scale to describe and organize how web documents interact with one another.