BP303 Taxonomy versus Folksonomy: Document Management in a Social Age
Semantic Technology 2009: Hybrid Approaches to Taxonomy and Folksonomy
Transcript of Semantic Technology 2009: Hybrid Approaches to Taxonomy and Folksonomy
1. Hybrid Approaches to Taxonomy & Folksonomy
Semantic Technology 2009, San Jose, CA, June 17, 2009
Richard Beatch, Paul Wlodarczyk
Earley & Associates, www.earley.com
2. Agenda
- The taxonomy/folksonomy debate
- Tagging pitfalls
- Social tagging & the enterprise
- Hybrid approaches to taxonomy/folksonomy
  - Co-existence
  - Tag-influenced taxonomy
  - Taxonomy-influenced tags
  - Tag hierarchies/ontologies
- Conclusion
Copyright 2009 Earley & Associates Inc. All Rights Reserved
3. About Earley & Associates
- Founded in 1994, Earley & Associates is an information management (IM) consulting company specializing in
  - Taxonomy development and management
  - Content management strategy
  - Search integration
  - Usability & Information Architecture
- Some of our recent clients include:
  - American Greetings, Hasbro, Ford Foundation, Astra Zeneca, Motorola, The Hartford Insurance Group, Urban Land Institute
- Give us your business card
  - For a free pass to one of our Community of Practice conference calls
4. About us
- Richard Beatch
  - Senior Consultant at Earley & Associates, Inc.
  - Ph.D. in Ontology
  - Specialized in Taxonomy, Search, Metadata, and content architecture.
  - Extensive industry experience leading the implementation and design of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T.
  - Blog: http://sethearley.wordpress.com/
5. About us
- Paul Wlodarczyk
  - Director, Solutions Consulting at Earley & Associates, Inc.
  - MBA with BA in Psychology / Cognitive Science
  - Specialized in unstructured content technologies with over 20 years experience in XML / structured authoring, content reuse, ECM, KM, localization, semantic analysis and content enrichment
  - Blogs at http://sethearley.wordpress.com/ and http://thecontentguy.net
6. The tired debate
Taxonomy vs. Folksonomy:
- Control vs. Democracy
- Top-down vs. Bottom-up
- Arduous process vs. Just do it
- Accurate vs. Good enough
- Restrictive vs. Flexible
- Static vs. Evolving
- Expensive to maintain vs. Low cost, crowdsourced
7. The relevance problem
- Search results should be relevant to what a searcher wants, but technology can only determine if it is relevant to a search term*
- Taxonomies and folksonomies are two approaches to the problem of relevance; they share the goal of describing content, and each has particular gaps
*Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management, http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf
8. Taxonomy
- Added by a small number of individuals: author/originators or authorized persons (e.g. librarian)
- Describes meaning or purpose of content based on a set view point for a specific audience using a controlled vocabulary
- Relationships between terms defined
  - Hierarchical (e.g. Computer hardware > Keyboard)
  - Associative (e.g. Computer hardware <-> Software)
  - Equivalent (e.g. Laptop = Notebook Computer)
9. Tags
- Added by authors and consumers (individual motivation)
- Can connote any type of meaning or purpose
- No compression around a single viewpoint, no control of vocabulary
- Self-correcting through volume
10. Why tagging is so interesting
- Adding individual value to the act of classification: user control over findability
- Reducing the cognitive burden (i.e. it's easy)
- Reduced technological investment (i.e. it's cheap)
- Can leverage emergent structure (folksonomy)
11. The downside
- Neither tags nor taggers are perfect
- No language control
  - Misspellings: Library vs. libary, Plam pilot
  - Compound words: TimBernersLee
  - Case & number: Folksonomy, Folksonomies
  - Personal tags: To read, My dog, @work
  - Single-use tags: Billybobsdog
Study: 40% of flickr tags and 28% of del.icio.us tags were flawed in these ways (Guy & Tonkin, 2006, http://www.dlib.org/dlib/january06/guy/01guy.html)
12. The downside
- Varying levels of granularity
- Same tag, different meanings
- Lack of relationships between tags: which is broader? Narrower?
- Lack of consistency/approach to change: even a single user can change language and hamper their own personal retrieval
E.g. Robin, Bird, Turdus migratorius
Known as tag noise
13. The downside
- Most tag search does not account for stemming, plurals, etc.
E.g. Search on Delicious: Folksonomy: 16049; Folksonomies: 4404; Both: 2642
14. The tagging hype cycle
http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html
15. The web vs. the enterprise
- Shirky: there is no shelf
  - Traditional organization schemes are built to deal with physical collections and constraints.
  - They don't work well on the web
    - large corpus
    - no clear edges
    - no formal categories
    - no authority
- The enterprise is much more defined
  - smaller corpuses
  - formal entities
  - coordinated users, clear tasks
  - need for reliable retrieval
Web (e.g. Flickr, Delicious): social tagging works well in this context
Enterprise: social tagging is more of a challenge, needs clear arena
16. Role of folksonomy in the enterprise?
- Tagging external links
  - Seeing what colleagues are interested in
  - Sharing links with a specific team
  - Subscribing to link feeds
  - Monitoring news/blog coverage of the company
  - Consumer/competitor research
  - Tracking industry trends
- Tagging internal links
  - Finding/facilitating access to most popular pages on the intranet
  - Seeing what intranet pages mean to staff
17. Role of folksonomy in the enterprise?
- Social aspects
  - Identifying subject matter experts
  - Connecting people who share interests
  - Encouraging collaboration & resource sharing
- Improve your taxonomy, information retrieval
  - User tagging to refine the corporate taxonomy
    - New concepts
    - New terminology
  - Seeing what employees find interesting
  - Distributing tagging tasks
18. The downside
- Potential issues of security, inappropriateness
  - Can implement some level of vetting
- Privacy concerns
  - Can be anonymous tagging, although this removes some social value
  - Can create role or team-based collections
- Need higher ratio of active participants due to population size
19. The content continuum
Key concept: Not all content is created equally
[Figure: the content continuum. Content runs from unfiltered (lower value, lower cost of tagging/organizing processes) to reviewed/vetted/approved (higher value, higher cost). Labels shown: Message text, External News Reports, Discussion postings, Links, Engineering document repositories, Success Stories, Policies, Approved Methods, Best Practices]
20. What if we blended the two?
- Folksonomy: low cost, flexible, user terminology, social sharing
- Taxonomy: findability, structured relationships, oversight, consistency
21. Hybrid approaches
- Co-existence
- Tag-influenced taxonomy
- Taxonomy-influenced tagging
- Tag hierarchies/ontologies
22. Co-existence
- Taxonomy and folksonomy are used side by side
- Strengths of each approach preserved, philosophy of each kept pure
Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/
23. Co-existence
Ann Arbor District Library
24. Raytheon corporate example
- Used in Raytheon employee portal - website lists (Suggested sites feature box)
- How does it work:
  - inserted Suggested Sites in a "feature" box to the right of the regularly ranked results
  - website suggestions (URLs) submitted along with recommended tags/keywords which are subsequently verified and approved by librarians
http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation
25. Variation: Tag mediation
- Vetting & editing tags
- Pros:
  - Weeds out potentially inappropriate tags
  - Eliminates misspellings, plural issues, etc.
  - Some can be done automatically (spell-checker, e.g.)
  - Enhances findability
- Cons:
  - Higher effort/cost
  - Perceived lack of trust
  - Who knows better?
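The automatic part of tag mediation could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names (`normalize_tag`, `singularize`, `mediate`), the naive plural rule, and the 0.85 similarity cutoff are all hypothetical choices, not something described in the slides. Exact matches against known terms pass through, near-misses are queued for a human reviewer, and everything else is kept as a new tag.

```python
import difflib
import re


def normalize_tag(raw: str) -> str:
    """Lowercase, trim, and collapse whitespace, underscores, and hyphens to single spaces."""
    return re.sub(r"[\s_\-]+", " ", raw.strip().lower())


def singularize(tag: str) -> str:
    """Very naive English plural folding (an assumption; a real system would use a stemmer)."""
    if tag.endswith("ies") and len(tag) > 4:
        return tag[:-3] + "y"
    if tag.endswith("s") and not tag.endswith("ss") and len(tag) > 3:
        return tag[:-1]
    return tag


def mediate(raw_tags, known_terms, cutoff=0.85):
    """Split raw tags into (clean_tags, review_queue).

    review_queue holds (raw_tag, suggested_term) pairs that look like
    misspellings of a known term and should be confirmed by a person.
    """
    known = {singularize(normalize_tag(t)) for t in known_terms}
    clean, review = [], []
    for raw in raw_tags:
        tag = singularize(normalize_tag(raw))
        if tag in known:
            clean.append(tag)
            continue
        close = difflib.get_close_matches(tag, sorted(known), n=1, cutoff=cutoff)
        if close:
            review.append((raw, close[0]))  # probable misspelling; a human decides
        else:
            clean.append(tag)  # unknown but plausible new tag, kept as entered
    return clean, review


clean, review = mediate(
    ["Folksonomies", "libary", "Library", "zebra"],
    ["folksonomy", "library"],
)
print(clean)   # normalized tags accepted automatically
print(review)  # suspected misspellings with a suggested correction
```

Routing near-misses to a review queue instead of correcting them silently is one way to soften the "who knows better?" concern: the tagger's wording is only changed after someone has looked at it.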
26. Tag-influenced taxonomy
- Taxonomy & tagging co-exist, tags serve as pool of candidate terms to enrich taxonomy, keep it current
  - Find new terminology (synonyms, popular language)
  - Find new concepts
- Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in a single interface
27. Tag-influenced taxonomy
- Requires formal vetting process
- Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)
- Evaluate candidates based on
  - Frequency (literary warrant)
  - Salience within context
- Look at tags used in conjunction with taxonomy
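The "candidate tags pulled & filtered with script" step might be sketched as follows. This is a minimal illustration, assuming tags arrive as a flat list of strings (one per tagging action); the function name `candidate_terms`, the stop-word list, and the `min_count` threshold are hypothetical. Frequency stands in for literary warrant; salience within context would still need human judgment.

```python
from collections import Counter

# Illustrative stop list (an assumption): function words plus personal/organizational tags.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "toread", "misc"}


def candidate_terms(tag_events, taxonomy_terms, min_count=2):
    """Return (tag, count) pairs that are frequent, not stop words,
    and not already present in the taxonomy, most frequent first."""
    taxonomy = {term.lower() for term in taxonomy_terms}
    counts = Counter(tag.strip().lower() for tag in tag_events)
    return [
        (tag, n)
        for tag, n in counts.most_common()
        if tag and n >= min_count and tag not in taxonomy and tag not in STOP_WORDS
    ]


events = ["netbook", "Netbook", "laptop", "netbook", "the", "the", "cloud", "cloud", "zune"]
print(candidate_terms(events, ["Laptop", "Keyboard"]))
```

The output is a shortlist for the formal vetting process, not an automatic update to the taxonomy.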
28. Taxonomy-influenced tagging
- Presenting choices/suggestions to user from controlled set of terms/tags
  - Sometimes users prefer easy choice
    - Drop-down menus
    - Check boxes
    - Type ahead
    - Tree view
  - Option to enter own tag? Good source of new terms
  - Enforces consistency
  - Offers structure
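A type-ahead control backed by a controlled vocabulary can be sketched in a few lines. The example below is a simplified illustration: `build_index` and `suggest` are hypothetical names, and the vocabulary is a plain dictionary of preferred terms and synonyms (reusing the Laptop = Notebook Computer equivalence from the taxonomy slide). Typing a synonym leads the user to the preferred term; an empty result is where an interface could offer free-text entry.

```python
def build_index(vocabulary):
    """Map every lowercase label (preferred term or synonym) to its preferred term.

    vocabulary: dict of preferred term -> list of synonyms.
    """
    index = {}
    for preferred, synonyms in vocabulary.items():
        for label in [preferred, *synonyms]:
            index[label.lower()] = preferred
    return index


def suggest(prefix, index, limit=5):
    """Return up to `limit` preferred terms whose label or synonym starts with prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    seen, suggestions = set(), []
    for label in sorted(index):
        if label.startswith(prefix) and index[label] not in seen:
            seen.add(index[label])
            suggestions.append(index[label])
            if len(suggestions) == limit:
                break
    return suggestions


index = build_index({"Laptop": ["Notebook computer"], "Keyboard": [], "Network": []})
print(suggest("no", index))   # a synonym prefix resolves to the preferred term
print(suggest("xyz", index))  # no match: the UI could offer "enter your own tag"
```

A linear scan over sorted labels is fine for a small vocabulary; a vocabulary with millions of concepts would need a prefix tree or a search index instead.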
29. WWW example: ZigTag Defined Tagging
- Definitions from Wikipedia & WordNet
- Tagging with type-ahead against database of 3M unique concepts & 8M synonyms
30. Zigtag
- Type ahead & synonyms encourage consistency
- Users can enter new tags
- Synonyms based on Wikipedia, so can be dirty data
- No hierarchy, only equivalent relationships so far
31. Zigtag search
- Still get problems with uncontrolled tags & recall
- Interesting relationships from Wikipedia
- Browse-able tag cloud
32. Example: myedna (Education.au)
http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt
Fully taxonomy-directed tagging
33. TextWise Semantic Cloud
- Document (URL or text) is submitted to web service for semantic analysis
- Category tags from subset of the ODP taxonomy
- Concept tags are derived from document, persisted, related to ODP categories
34. Buzzillions.com
- Review site: tags are controlled not against a taxonomy, but against other tags, which reduces redundancy
- Only popular tags exposed as faceted navigation
35. SharePoint?
- Plug-ins make taxonomy easy
- Present the taxonomy like tags
- E.g. KWizCom: plug-in manages taxonomy and tags in an easy interface; can opt out of letting users create own tags
36. Taxonomy-influenced tagging
- Pros:
  - More consistency
  - Better support for findability
  - Relationships, definitions leveraged, adding meaning to the tags
  - Realistic for the enterprise
- Cons:
  - Not really folksonomy anymore...
  - Can be forcing terminology on user
  - Need to develop reference list of concepts manually through taxonomy, or need large corpus to derive automatically
37. Tag hierarchies
- Tag hierarchies come in two flavors:
- User-powered
- Automatic derivation
38. User-powered tag hierarchies
- User-powered
  - Social approach
  - Bogus hierarchies possible
  - Small population will contribute
- RawSugar tried it
  - (no longer around)
  - Taggers could specify hierarchy in own account, tags clustered based on common groups
39. User-powered tag hierarchies
- E.g. LibraryThing
LibraryThing allows any user to combine (or uncombine) 2 tags that are semantically equivalent. www.librarything.com
40. User-powered tag hierarchies: Intelligent tags
- Move toward more semantic tagging with machine-readable tags, e.g. Flickr machine tags in triple format: [namespace]:[key]=[value]
  - geo:neighborhood=SoHo, geo:lat=58.41618, etc.
  - flickr:user=mortimer
  - taxonomy:common=grevyszebra
  - lastfm:event=34640
    - makes your photo appear on a lastfm event page
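To show why the [namespace]:[key]=[value] format is machine-readable, here is a small parsing sketch. The regular expression is an assumption chosen for illustration (letters, digits, and underscores for namespace and key, anything for the value); it is not Flickr's official grammar, and `MachineTag` and `parse_machine_tag` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    key: str
    value: str


# Assumed pattern for illustration only: namespace:key=value
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the parsed triple, or None if the tag is an ordinary free-text tag."""
    match = _PATTERN.match(tag.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


for raw in ["geo:lat=58.41618", "lastfm:event=34640", "sunset"]:
    print(raw, "->", parse_machine_tag(raw))
```

Once a tag parses into a triple, an application can act on the namespace and key, for instance routing every `lastfm:event` value to an event page, while ordinary tags continue to behave as plain keywords.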
41. User-powered tag hierarchies: Intelligent tags
- MOAT (Meaning Of A Tag): part of the linked data movement, mapping tags to the semantic web
  - http://moat-project.org/
- Adding to the triplet
  - (User, resource, tag) + meaning
  - Meaning = URI to a resource containing meaning (e.g. DBPedia)
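The idea of extending the tagging triple with a meaning can be illustrated with a plain data structure. This is not MOAT's actual RDF vocabulary; the `Tagging` class, the `resources_by_meaning` helper, and the sample users, resources, and DBpedia-style URIs are all assumptions made for the example. It shows how the same tag string separates cleanly once each use carries a meaning URI.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    user: str
    resource: str
    tag: str
    meaning: str  # URI of a resource that defines the intended sense


def resources_by_meaning(taggings):
    """Group tagged resources by meaning URI rather than by tag string."""
    grouped = defaultdict(set)
    for tagging in taggings:
        grouped[tagging.meaning].add(tagging.resource)
    return dict(grouped)


# Hypothetical sample data: one tag string, two intended senses.
taggings = [
    Tagging("alice", "http://example.org/pie-recipe", "apple", "http://dbpedia.org/resource/Apple"),
    Tagging("bob", "http://example.org/laptop-review", "apple", "http://dbpedia.org/resource/Apple_Inc."),
    Tagging("carol", "http://example.org/orchard-photo", "apple", "http://dbpedia.org/resource/Apple"),
]
for meaning, resources in resources_by_meaning(taggings).items():
    print(meaning, sorted(resources))
```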
42. Automatically derived tag hierarchies
- Tag hierarchies, facets, ontologies, or folksontology
- Done through statistical/clustering algorithms
http://www.pui.ch/phred/automated_tag_clustering/
43. Delicious & citeulike hierarchy
http://heymann.stanford.edu/taghierarchy.html
44. Clustering at flickr
45. Auto clustering/facets
- Still not very mature
- Time-sensitive
- Community-sensitive
- Ambiguous tags
- Improve with volume (self-correcting)
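One simple statistical way to derive broader/narrower candidates from tag data is a co-occurrence subsumption heuristic: tag A is treated as broader than tag B if most resources tagged B also carry A, but not the other way round. The sketch below uses that heuristic purely as an illustration; it is not the algorithm behind the linked clustering work, and the function name `broader_pairs` and the 0.8 threshold are assumptions.

```python
from collections import defaultdict


def broader_pairs(tagged_resources, threshold=0.8):
    """Return (broader, narrower) tag pairs inferred from co-occurrence.

    tagged_resources: dict of resource id -> set of tags on that resource.
    """
    resources_for = defaultdict(set)
    for resource, tags in tagged_resources.items():
        for tag in tags:
            resources_for[tag].add(resource)

    pairs = []
    for a in sorted(resources_for):
        for b in sorted(resources_for):
            if a == b:
                continue
            both = len(resources_for[a] & resources_for[b])
            share_of_b_with_a = both / len(resources_for[b])
            share_of_a_with_b = both / len(resources_for[a])
            # a subsumes b: nearly every use of b comes with a, but not vice versa
            if share_of_b_with_a >= threshold and share_of_a_with_b < threshold:
                pairs.append((a, b))
    return pairs


photos = {
    "p1": {"bird", "robin"},
    "p2": {"bird", "robin"},
    "p3": {"bird", "eagle"},
    "p4": {"bird"},
    "p5": {"car"},
}
print(broader_pairs(photos))
```

With only a handful of resources the result is fragile, which matches the caveats on this slide: a single odd tagging can create or remove a relationship, and the inferred structure only stabilizes as volume grows.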
http://www.pui.ch/phred/automated_tag_clustering/
46. Tag hierarchy pros and cons
- Pros:
  - Relationships, definitions leveraged, adding meaning to the tags
  - Provides a basis for application behavior in the absence of taxonomy (e.g. Flickr maps, clusters)
  - Self-correcting with volume
- Cons:
  - Automatically derived relationships (clusters) can be bogus or time-sensitive
  - Folksonomic relationships can be esoteric (just like tags)
  - Small population of contributors
47. Conclusion
- Not all content is created equal: tags and taxonomies have their sweet spots
- Hybrid approaches are emerging
  - taxonomy-influenced tagging leading the pack in popularity on the web
  - co-existence in the enterprise
- Look for more developments on the semantic web/linked data front for making tags more intelligent
48. Questions?
Richard Beatch [email_address]
Paul Wlodarczyk [email_address]
Web: www.earley.com
Blog: sethearley.wordpress.com
Twitter: earleytaxonomy
Give us your business card for a free pass to one of our Community of Practice conference calls (a $50 value).
49. Appendix: Corporate social tagging tools
50. Corporate social tagging software
http://www.connectbeam.com/
51. Corporate social tagging software
http://www.cogenz.com/
52. Corporate social tagging software
http://www-306.ibm.com/software/lotus/products/connections/dogear.html
53. Corporate social tagging software
- BEA AquaLogic Pathways
  - http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/
54. Corporate social tagging software
- http://www.newsgator.com/business/socialsites/default.aspx