Semantic Technology 2009: Hybrid Approaches to Taxonomy and Folksonomy


description

Tagging isn’t new: it’s been around for a dog’s age in internet years. But in the past few years some fresh ideas and tools have reinvigorated the social tagging world. These new approaches include an attempt to improve findability through a bit of structure and control. While adding control to folksonomy seems to go against the whole selling point of social tagging (flexibility, openness), it is bringing tagging to a new level, making it more viable for practical use in enterprises. This session will present hybrid approaches to formal taxonomies and social tagging. How can they be used in the corporate environment? What type of content is appropriate for social tagging? What kind of software is available for the enterprise? Learn how social tagging is not necessarily anathema to corporate taxonomy programs and how this hybrid approach can bring the best of both worlds: a fresh, up-to-date taxonomy with the structure needed to improve information findability.

Key Takeaways:

  • Folksonomy and taxonomy defined
  • Drawbacks of pure social tagging
  • Social tagging in the enterprise
  • Hybrid taxonomy & folksonomy approaches: Four models

Transcript of Semantic Technology 2009: Hybrid Approaches to Taxonomy and Folksonomy

1. Hybrid Approaches to Taxonomy & Folksonomy

Semantic Technology 2009, San Jose, CA, June 17, 2009
Richard Beatch, Paul Wlodarczyk
Earley & Associates, www.earley.com

2. Agenda

  • The taxonomy/folksonomy debate
  • Tagging pitfalls
  • Social tagging & the enterprise
  • Hybrid approaches to taxonomy/folksonomy
    • Co-existence
    • Tag-influenced taxonomy
    • Taxonomy-influenced tags
    • Tag hierarchies/ontologies
  • Conclusion

Copyright 2009 Earley & Associates Inc. All Rights Reserved

3. About Earley & Associates

  • Founded in 1994, Earley & Associates is an information management (IM) consulting company specializing in
    • Taxonomy development and management
    • Content management strategy
    • Search integration
    • Usability & Information Architecture
  • Some of our recent clients include:
    • American Greetings, Hasbro, Ford Foundation, Astra Zeneca, Motorola, The Hartford Insurance Group, Urban Land Institute
  • Give us your business card
    • For a free pass to one of our Community of Practice conference calls

4. About us

  • Richard Beatch
    • Senior Consultant at Earley & Associates, Inc.
    • Ph.D. in Ontology
    • Specializes in taxonomy, search, metadata, and content architecture
    • Extensive industry experience leading the implementation and design of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T.
    • Blog: http://sethearley.wordpress.com/

5. About us

  • Paul Wlodarczyk
    • Director, Solutions Consulting at Earley & Associates, Inc.
    • MBA with BA in Psychology / Cognitive Science
    • Specializes in unstructured content technologies, with over 20 years' experience in XML / structured authoring, content reuse, ECM, KM, localization, semantic analysis and content enrichment
    • Blogs at http://sethearley.wordpress.com/ and http://thecontentguy.net

6. The tired debate

  • Taxonomy / Folksonomy
    • Control / Democracy
    • Top-down / Bottom-up
    • Arduous process / Just do it
    • Accurate / Good enough
    • Restrictive / Flexible
    • Static / Evolving
    • Expensive to maintain / Low cost, crowdsourced

7. The relevance problem

  • Search results should be relevant to what a searcher wants, but technology can only determine whether a result is relevant to a search term*
  • Taxonomies and folksonomies are two approaches to the problem of relevance, with the common goal of describing content and each with particular gaps

*Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf

8. Taxonomy

  • Added by a small number of individuals: authors/originators or authorized persons (e.g. librarian)
  • Describes meaning or purpose of content based on a set viewpoint for a specific audience using a controlled vocabulary
  • Relationships between terms defined
    • Hierarchical (e.g. Computer hardware > Keyboard)
    • Associative (e.g. Computer hardware ↔ Software)
    • Equivalent (e.g. Laptop = Notebook Computer)
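
The three relationship types on this slide can be pictured as a small data structure. This is only a sketch under assumed names (`Term`, `Taxonomy`, `resolve` are hypothetical and not from the presentation); it stores hierarchical, associative and equivalent links and resolves a synonym to its preferred term.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One controlled-vocabulary term and its three kinds of relationships."""
    label: str
    broader: set[str] = field(default_factory=set)   # hierarchical parents
    related: set[str] = field(default_factory=set)   # associative links
    synonyms: set[str] = field(default_factory=set)  # equivalent labels


class Taxonomy:
    def __init__(self) -> None:
        self.terms: dict[str, Term] = {}

    def term(self, label: str) -> Term:
        # Create the term on first use so links can be added in any order.
        return self.terms.setdefault(label, Term(label))

    def add_hierarchy(self, parent: str, child: str) -> None:
        self.term(parent)
        self.term(child).broader.add(parent)

    def add_association(self, a: str, b: str) -> None:
        # Associative links are stored symmetrically.
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_equivalence(self, preferred: str, variant: str) -> None:
        self.term(preferred).synonyms.add(variant)

    def resolve(self, label: str) -> Optional[str]:
        """Map a label or synonym to its preferred term, ignoring case."""
        wanted = label.casefold()
        for term in self.terms.values():
            labels = {term.label.casefold()} | {s.casefold() for s in term.synonyms}
            if wanted in labels:
                return term.label
        return None


# The slide's own examples.
tax = Taxonomy()
tax.add_hierarchy("Computer hardware", "Keyboard")
tax.add_association("Computer hardware", "Software")
tax.add_equivalence("Laptop", "Notebook Computer")
```

With this in place, `tax.resolve("notebook computer")` returns the preferred term `"Laptop"`, which is the kind of lookup an uncontrolled tag set cannot offer.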

9. Tags

  • Added by authors and consumers (individual motivation)
  • Can connote any type of meaning or purpose
  • No compression around a single viewpoint, no control of vocabulary
  • Self-correcting through volume

10. Why tagging is so interesting

  • Adding individual value to the act of classification: user control over findability
  • Reducing the cognitive burden (i.e. it's easy)
  • Reduced technological investment (i.e. it's cheap)
  • Can leverage emergent structure (folksonomy)

11. The downside

  • Neither tags nor taggers are perfect
  • No language control
    • Guy & Tonkin, 2006.
    • http://www.dlib.org/dlib/january06/guy/01guy.html

  • Misspellings: Library vs. libary, Plam pilot
  • Compound words: TimBernersLee
  • Case & number: Folksonomy, Folksonomies
  • Personal tags: To read, My dog, @work
  • Single-use tags: Billybobsdog

Study: 40% of Flickr tags and 28% of del.icio.us tags were flawed in these ways

12. The downside

  • Varying levels of granularity
  • Same tag, different meanings
  • Lack of relationships between tags: which is broader? Narrower?
  • Lack of consistency/approach to change: even a single user can change language and hamper their own retrieval

E.g. Robin, Bird, Turdus migratorius

Known as tag noise

13. The downside

  • Most tag search does not account for stemming, plurals, etc.
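
A rough sketch of what accounting for case and plurals could look like before tags are indexed or searched. The plural rule here is deliberately naive (an assumption for illustration, not a real stemmer), and the function names and counts are hypothetical.

```python
from collections import Counter


def normalize_tag(tag):
    """Fold case and strip a few common English plural endings (naive)."""
    t = tag.strip().casefold()
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"      # folksonomies -> folksonomy
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]            # tags -> tag
    return t


def merge_counts(raw_counts):
    """Sum usage counts of tags that normalize to the same form."""
    merged = Counter()
    for tag, count in raw_counts.items():
        merged[normalize_tag(tag)] += count
    return merged


# Hypothetical per-tag usage counts, for illustration only.
counts = merge_counts({"Folksonomy": 5, "folksonomies": 3, "Tags": 2, "tag": 4})
```

A rule this simple will also mangle words such as "news", which is one reason production search engines rely on proper stemmers or lemmatizers instead.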

E.g. Search on Delicious:

  • Folksonomy: 16049
  • Folksonomies: 4404
  • Both: 2642

14. The tagging hype cycle

http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html

15. The web vs. the enterprise

  • Shirky: there is no shelf
    • Traditional organization schemes are built to deal with physical collections and constraints.
    • They don't work well on the web
      • large corpus
      • no clear edges
      • no formal categories
      • no authority
  • The enterprise is much more defined
      • smaller corpuses
      • formal entities
      • coordinated users, clear tasks
      • need for reliable retrieval

The web (e.g. Flickr, Delicious): social tagging works well in this context

The enterprise: social tagging is more of a challenge, needs clear arena

16. Role of folksonomy in the enterprise?

  • Tagging external links
    • Seeing what colleagues are interested in
    • Sharing links with a specific team
    • Subscribing to link feeds
    • Monitoring news/blog coverage of the company
    • Consumer/competitor research
    • Tracking industry trends
  • Tagging internal links
    • Finding/facilitating access to most popular pages on the intranet
    • Seeing what intranet pages mean to staff

17. Role of folksonomy in the enterprise?

  • Social aspects
    • Identifying subject matter experts
    • Connecting people who share interests
    • Encouraging collaboration & resource sharing
  • Improve your taxonomy, information retrieval
    • User tagging to refine the corporate taxonomy
      • New concepts
      • New terminology
    • Seeing what employees find interesting
    • Distributing tagging tasks

18. The downside

  • Potential issues of security, inappropriateness
    • Can implement some level of vetting
  • Privacy concerns
    • Can be anonymous tagging, although this removes some social value
    • Can create role or team-based collections
  • Need a higher ratio of active participants because the enterprise population is small

19. The content continuum

Key concept: Not all content is created equally

  • Content along the continuum: Message text, External News Reports, Discussion postings, Links, Engineering document repositories, Success Stories, Policies, Approved Methods, Best Practices
  • Tagging/Organizing Processes: Lower Cost to Higher Cost
  • Unfiltered to Reviewed/Vetted/Approved
  • Lower Value to Higher Value

20. What if we blended the two?

  • Folksonomy / Taxonomy

    • Low cost / Findability
    • Flexible / Structured relationships
    • User terminology / Oversight
    • Social sharing / Consistency

21. Hybrid approaches

  • Co-existence
  • Tag-influenced taxonomy
  • Taxonomy-influenced tagging
  • Tag hierarchies/ontologies

22. Co-existence

  • Taxonomy and folksonomy are used side by side
  • Strengths of each approach preserved, philosophy of each kept pure

Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/

23. Co-existence

Ann Arbor District Library

24. Raytheon corporate example

  • Used in the Raytheon employee portal for website lists ("Suggested Sites" feature box)
  • How it works:
    • Suggested Sites are inserted in a "feature" box to the right of the regularly ranked results
    • Website suggestions (URLs) are submitted along with recommended tags/keywords, which are subsequently verified and approved by librarians

http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation

25. Variation: Tag mediation

  • Vetting & editing tags
  • Pros:
    • Weeds out potentially inappropriate tags
    • Eliminates misspellings, plural issues, etc.
    • Some can be done automatically (spell-checker, e.g.)
    • Enhances findability
  • Cons:
    • Higher effort/cost
    • Perceived lack of trust
    • Who knows better?
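
As a sketch of the automatic part of tag mediation, a misspelled tag can be matched against tags that are already approved. This uses Python's standard `difflib.get_close_matches`; the function name `suggest_correction`, the 0.8 cutoff and the sample vocabulary are assumptions for illustration.

```python
import difflib


def suggest_correction(tag, vocabulary, cutoff=0.8):
    """Return the closest approved tag, or None if the tag is known or nothing is close."""
    by_folded = {v.casefold(): v for v in vocabulary}
    folded = tag.strip().casefold()
    if folded in by_folded:
        return None  # already an approved tag; nothing to correct
    matches = difflib.get_close_matches(folded, list(by_folded), n=1, cutoff=cutoff)
    return by_folded[matches[0]] if matches else None


# Hypothetical approved vocabulary, for illustration only.
approved = ["library", "palm pilot", "folksonomy"]
```

Under these assumptions, "libary" and "Plam pilot" are mapped to their approved spellings, while a single-use tag such as "Billybobsdog" gets no suggestion and would be left for a human mediator, which is where the "Who knows better?" question comes back in.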

26. Tag-influenced taxonomy

  • Taxonomy & tagging co-exist, tags serve as pool of candidate terms to enrich taxonomy, keep it current
    • Find new terminology (synonyms, popular language)
    • Find new concepts
  • Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in a single interface

27. Tag-influenced taxonomy

  • Requires formal vetting process
  • Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)
  • Evaluate candidates based on
    • Frequency (literary warrant)
    • Salience within context
  • Look at tags used in conjunction with taxonomy

28. Taxonomy-influenced tagging

  • Presenting choices/suggestions to user from controlled set of terms/tags
    • Sometimes users prefer easy choice
      • Drop-down menus
      • Check boxes
      • Type ahead
      • Tree view
    • Include option to enter own tag? Good source of new terms
    • Enforces consistency
    • Offers structure

29. WWW example: ZigTag

  • Defined tagging
  • Definitions from Wikipedia & WordNet
  • Tagging with type-ahead against database of 3M unique concepts & 8M synonyms

30. Zigtag

  • Type ahead & synonyms encourage consistency
  • Users can enter new tags
  • Synonyms based on Wikipedia, so can be dirty data
  • No hierarchy, only equivalent relationships so far

31. Zigtag search

  • Still get problems with uncontrolled tags & recall
  • Interesting relationships from Wikipedia
  • Browse-able tag cloud

32. Example: myedna (Education.au)

http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt

Fully taxonomy-directed tagging

33. TextWise Semantic Cloud

  • Document (URL or text) is submitted to web service for semantic analysis
  • Category tags from subset of the ODP taxonomy
  • Concept tags are derived from document, persisted, related to ODP categories

34. Buzzillions.com

  • Review site: tags are controlled not against a taxonomy, but against other tags, which reduces redundancy
  • Only popular tags exposed as faceted navigation

35. SharePoint?

  • Plug-ins make taxonomy easy
  • Present the taxonomy like tags
  • E.g. KWizCom: plug-in manages taxonomy and tags in an easy interface; can opt out of letting users create their own tags

36. Taxonomy-influenced tagging

  • Pros:
    • More consistency
    • Better support for findability
    • Relationships, definitions leveraged, adding meaning to the tags
    • Realistic for the enterprise
  • Cons:
    • Not really folksonomy anymore
    • Can be forcing terminology on user
    • Need to develop reference list of concepts manually through taxonomy, or need large corpus to derive it automatically

37. Tag hierarchies

  • Tag hierarchies come in two flavors:
    • User-powered
    • Automatic derivation

38. User-powered tag hierarchies

  • User-powered
    • Social approach
    • Bogus hierarchies possible
    • Small population will contribute
  • RawSugar tried it
    • (no longer around)
    • Taggers could specify hierarchy in own account, tags clustered based on common groups

39. User-powered tag hierarchies

  • E.g. LibraryThing

LibraryThing allows any user to combine (or uncombine) two tags that are semantically equivalent. www.librarything.com

40. User-powered tag hierarchies: Intelligent tags

  • Move toward more semantic tagging with machine-readable tags, e.g. Flickr machine tags in triple format: [namespace]:[key]=[value]
    • geo:neighborhood=SoHo, geo:lat=58.41618, etc.
    • flickr:user=mortimer
    • taxonomy:common=grevyszebra
    • lastfm:event=34640
      • makes your photo appear on a lastfm event page

41. User-powered tag hierarchies: Intelligent tags

  • MOAT (Meaning Of A Tag): part of the linked data movement, mapping tags to the semantic web
    • http://moat-project.org/
  • Adding to the triplet
    • (User, resource, tag) + meaning
    • Meaning = URI to a resource containing meaning (e.g. DBPedia)

42. Automatically derived tag hierarchies

  • Tag hierarchies, facets, ontologies, or folksontology
  • Done through statistical/clustering algorithms
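
To give a feel for the statistical approach, here is a simple co-occurrence subsumption heuristic: tag A is treated as broader than tag B when most items tagged B are also tagged A, but not the other way round. It is a swapped-in illustration, not the algorithm used by any of the projects linked in this deck; the threshold and sample data are assumptions.

```python
from collections import Counter
from itertools import permutations


def derive_broader(tagged_items, threshold=0.8):
    """Return {narrower_tag: broader_tag} inferred from tag co-occurrence."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items:
        unique = {t.casefold() for t in tags}
        tag_count.update(unique)
        pair_count.update(permutations(sorted(unique), 2))  # ordered pairs (a, b)

    broader = {}
    for (a, b), both in pair_count.items():
        p_a_given_b = both / tag_count[b]  # share of b-items that also carry a
        p_b_given_a = both / tag_count[a]  # share of a-items that also carry b
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            current = broader.get(b)
            # Prefer the least frequent qualifying parent, i.e. the most specific one.
            if current is None or tag_count[a] < tag_count[current]:
                broader[b] = a
    return broader


# Hypothetical tagged items, for illustration only.
items = [
    {"animal", "bird", "robin"},
    {"animal", "bird", "sparrow"},
    {"animal", "bird"},
    {"animal", "dog"},
    {"animal", "dog"},
    {"animal"},
]
hierarchy = derive_broader(items)
```

On this toy data the result places "robin" and "sparrow" under "bird", and "bird" and "dog" under "animal". With single-use tags the conditional shares are trivially 1.0, so a heuristic like this only becomes trustworthy with volume.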

http://www.pui.ch/phred/automated_tag_clustering/

43. Delicious & CiteULike hierarchy

http://heymann.stanford.edu/taghierarchy.html

44. Clustering at Flickr

45. Auto clustering/facets

  • Still not very mature
  • Time-sensitive
  • Community-sensitive
  • Ambiguous tags
  • Improve with volume (self-correcting)

http://www.pui.ch/phred/automated_tag_clustering/

46. Tag hierarchy pros and cons

  • Pros:
    • Relationships, definitions leveraged, adding meaning to the tags
    • Provides a basis for application behavior in the absence of taxonomy (e.g. Flickr maps, clusters)
    • Self-correcting with volume
  • Cons:
    • Automatically derived relationships (clusters) can be bogus or time-sensitive
    • Folksonomic relationships can be esoteric (just like tags)
    • Small population of contributors

47. Conclusion

  • Not all content is created equal: tags and taxonomies have their sweet spots
  • Hybrid approaches are emerging
    • taxonomy-influenced tagging leading the pack in popularity on the web
    • co-existence in the enterprise
  • Look for more developments on the semantic web/linked data front for making tags more intelligent

48. Questions?

  • Richard Beatch [email_address]
  • Paul Wlodarczyk [email_address]
  • Web: www.earley.com
  • Blog: sethearley.wordpress.com
  • Twitter: earleytaxonomy

Give us your business card for a free pass to one of our Community of Practice conference calls (a $50 value).

49. Appendix: Corporate social tagging tools

50. Corporate social tagging software

http://www.connectbeam.com/

51. Corporate social tagging software

http://www.cogenz.com/

52. Corporate social tagging software

http://www-306.ibm.com/software/lotus/products/connections/dogear.html

53. Corporate social tagging software

  • BEA AquaLogic Pathways
      • http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/

54. Corporate social tagging software

      • http://www.newsgator.com/business/socialsites/default.aspx
