8/6/2019 Busch Slides
1/126
Taxonomy Strategies LLC
6-15 June 2007 Copyright 2007 Taxonomy Strategies LLC. All rights reserved.
Taxonomy & metadata strategies for effective content management
Melbourne, Sydney, Canberra
Masterclass
Taxonomy Strategies LLC The business of organized
Today's agenda
9:00-9:10 10 min Introduction
9:10-9:15 5 min Warm-up exercise
9:15-9:45 30 min Taxonomy fundamentals: Building taxonomies
9:45-10:00 15 min Taxonomy exercise
10:00-10:30 30 min Taxonomy fundamentals: Taxonomy business case
10:30-11:00 30 min Tea Break
11:00-12:00 60 min Taxonomy governance
12:00-12:30 30 min Capabilities self-assessment
12:30-13:30 60 min Lunch
13:30-14:30 60 min Taxonomy benchmarking
14:30-14:45 15 min Benchmarking exercise
14:45-15:15 30 min Tea Break
15:15-16:15 60 min Content tagging
16:15-16:30 15 min Tagging exercise
16:30-17:00 30 min Q&A
What we do
Organize Stuff
For us, taxonomy work includes:
• Metadata specification defines the properties needed to describe content so that it can be found & used.
• Vocabularies are collections of terms that are used to specify some of the metadata properties.
  - Some vocabularies are big and hierarchical, some are small and flat.
• An application profile specifies what metadata & vocabularies are required, and then represents them formally.
Recent & current projects:
http://www.taxonomystrategies.com/html/clients.htm
Government, Commercial, Not-for-Profit
Who are you? What sectors do you work in?
Your Role
• Administrator
• Records Manager
• Content Manager
• Communications
• Editor
• Information Architect
• Usability Expert
• Librarian
• Knowledge Engineer
• Ontologist
• Chief Information Officer
Industrial Sector
• Agriculture & Processing: Food, Lumber, Pulp & Paper
• Financial Services: Banking & Insurance
• Government: Public administration, Public safety
• High Tech: Computers, Software & Telecommunications
• Heavy Manufacturing: Steel, Automobiles & Aircraft
• Manufacturing: Consumer Products
• Medical & Health Care
• Mining & Refining: Petrochemicals, Oil & Gas
• Pharmaceuticals
Why are you here?
• What are the key questions that you want answered in today's workshop?
• Please rank the questions from the most important (5) to the least important (1).
• Please provide your job title, organization and department; your name is optional.
Priority (1-5) | Questions
Your title or role:
Your org or industry:
Your dept:
Your name: (optional)
The Taxonomy problem: How to pick from > 5,000 faucets?
By:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?
The main issue: What goes here?
• When do the things in the list change?
• How do we maintain the list?
• What rules do we follow?
Seven phases of taxonomy development
Timeline: weeks 1-12
1 Identify Objectives: Conduct interviews
2 Inventory Resources: Identify, gather & review resources
3 Specify Metadata: Define fields & purpose
4 Model Content: Define content chunks & XML DTDs
5 Specify Vocabularies: Compile controlled vocabularies
6 Specify Procedures: Develop workflow, rules & procedures
7 Test & Train: Manually tag small sample
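Taxonomy design phases need to be iterated
Phases: 1 Identify Objectives, 2 Inventory Resources, 3 Specify Metadata, 4 Model Content, 5 Specify Vocabularies, 6 Specify Procedures, 7 Test & Train
Plan & Prototype
  1 Identify Objectives: Interview core team and stakeholders
  2 Inventory Resources: Identify, gather & review resources
  3 Specify Metadata: Define fields & purpose
  4 Model Content: Define content chunks & XML DTDs
  5 Specify Vocabularies: Compile controlled vocabularies
  6 Specify Procedures: Develop workflow rules & procedures
  7 Test & Train: Manually tag small sample
Alpha Dev & Test
  1 Identify Objectives: Review tagged samples, default procedures
  2 Inventory Resources: Gather additional resources, if any
  3 Specify Metadata: Revise if needed, bake into alpha CMS
  4 Model Content: Revise if needed, bake into alpha CMS
  5 Specify Vocabularies: Revise, use in alpha CMS
  6 Specify Procedures: alpha workflows in CMS
  7 Test & Train: Use alpha CMS to tag larger sample
Beta D&T
  1 Identify Objectives: Interview alpha users
  2 Inventory Resources: Gather additional sources, if any
  3 Specify Metadata: Modify CMS for beta
  4 Model Content: Modify CMS for beta
  5 Specify Vocabularies: Revise, use in beta CMS
  6 Specify Procedures: Modify & extend workflows
  7 Test & Train: Use beta CMS to tag larger sample
Final D&T
  1 Identify Objectives: Interview beta users
  3 Specify Metadata: Modify for 1.0
  4 Model Content: Modify for 1.0
  5 Specify Vocabularies: Revise using team procedure
  6 Specify Procedures: Finalize procedure materials
  7 Test & Train: Finalize training materials & train staff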
Licensing an existing taxonomy
See Factiva's taxonomy www.taxonomywarehouse.com
• There are usually license fees, but these will be less than the effort to develop an equivalent taxonomy.
• But pre-existing taxonomies rarely fit an organization's needs and may require extensive customization.
Recommendation
• Adopt a faceted approach.
• Reuse existing (especially internal) vocabularies for as many of the facets as possible.
• Plan on doing full-custom Content Type and Topic taxonomies.
Free sources for 8 common taxonomies
Taxonomy | Definition | Potential Sources
Organization | Organizational structure. | SP 800-87, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used. | Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes. | SIC, NAICS, Your market segments, etc.
Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.
Business Activity | Business activities or functions performed to accomplish mission and goals. | Federal Enterprise Architecture Business Reference Model, Enterprise ontology, Your business functions, etc.
Topic | Business topics relevant to your mission & goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, Your research areas, etc.
Audience | Subset of constituents to whom a piece of content is directed or is intended to be used by. | GEM, ERIC Thesaurus, IEEE LOM, Your psycho-graphics or personas, etc.
Products & Services | Names of products/programs and services. | ERP system, Your products and services, etc.
Typical product catalog:
A-Z, then idiosyncratic categories
How to analyze existing product catalog categories: Principles and priorities
Preparing a product catalog for facet browsing (aka Guided Navigation) requires a category hierarchy and additional attributes.
Principles
1. Categories and subcategories that could be swapped are candidates for conversion to attributes.
2. Repeated lists of subcategories signal a possible need for an attribute.
3. The number of attributes should not exceed six or seven, so not all attribute candidates should be used.
   - Avoid selecting strongly correlated attributes, such as Weight and Shipping Weight.
Priorities
1. Choose Categories that apply to many products over those with few products.
2. Choose Attributes that apply to many Categories over those that apply only to very few categories.
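Principle 2 lends itself to a quick mechanical check. The sketch below is only an illustration, not a method taken from the slides: it assumes the catalog is available as a mapping from each category to its list of subcategories, and both the sample catalog and the function name `repeated_subcategory_lists` are hypothetical.

```python
from collections import defaultdict


def repeated_subcategory_lists(hierarchy):
    """Map each subcategory list that occurs under 2+ categories to those categories."""
    parents_by_list = defaultdict(list)
    for category, subcategories in hierarchy.items():
        if subcategories:  # leaf categories have nothing to compare
            parents_by_list[frozenset(subcategories)].append(category)
    return {
        subs: sorted(parents)
        for subs, parents in parents_by_list.items()
        if len(parents) > 1
    }


# Hypothetical catalog fragment, for illustration only.
catalog = {
    "Kitchen Faucets": ["Brass", "Chrome", "Nickel"],
    "Bath Faucets": ["Brass", "Chrome", "Nickel"],
    "Shower Heads": ["Fixed", "Handheld"],
}

candidates = repeated_subcategory_lists(catalog)
for subs, parents in candidates.items():
    print(sorted(subs), "repeats under", parents)
```

In this made-up fragment the finish list repeats under two categories, so "Finish" would be a candidate attribute. Whether to adopt it still depends on the priorities above.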
Product categories example: Wireless carrier
Products
• Accessories: Batteries, Cases, Chargers, Data, Hands-Free, Headsets, Miscellaneous
• Content: Purchased, Subscription
• Phones: Versatile Phones, Smart Devices, Basic Phones, Prepaid Phones, International Only Phones, Mobile Broadband Cards
• Services: Conferencing, Internet / Data, Landline Phone, Network & Roaming, Relay Services, Solutions, Wireless Data
Product attributes example: Digital cameras in an electronics catalog
• Types of attributes
  - Generic attributes: Brand/Product Family/Model; Price Range; Usually Ships
  - Merchandising attributes: Usage (E-mail, Internet Browsing, Programming, ...); Segment (Home, Business, Education, Government, ...); Region & Country; Most Popular; New; Related Products
  - Specialized attributes: Capacity (Battery; Memory; MB; GB; BPS, ...); Resolution (DPI; Megapixels; XGA, XGA, UXGA, ...); Size (Display; Screen; ...); Standard (a, b, g, n, ...; scsi, ata, sata, eide, ...; dimm, simm, ...); Type (Camera; Battery; Display; Printer; Server; Storage; Switch; ...)
Facet browsing example (number of products in parentheses):
Resolution: 3 Megapixels (4), 4 Megapixels (5), 5 Megapixels (27), 6-8 Megapixels (21)
Brand: Canon (15), Fuji (10), Kodak (17), Nikon (8), Olympus (9)
Type: Point & Shoot (25), Digital SLR (10), Packages (5)
Price Range: $100-250 (5), $250-500 (16), $500-1000 (19), More than $1000 (3)
Faceted taxonomy theory & practice
• How many terms are needed to provide sufficient granularity? Not as many as you think!
• Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.
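A minimal sketch of post-coordinate indexing may help here: each item is tagged once per facet, and the combination happens at query time instead of being built into one large vocabulary. The camera records and the helper names `select` and `facet_counts` are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical items; each is tagged with one term per facet.
cameras = [
    {"Brand": "Canon", "Type": "Digital SLR", "Resolution": "6-8 Megapixels"},
    {"Brand": "Canon", "Type": "Point & Shoot", "Resolution": "5 Megapixels"},
    {"Brand": "Nikon", "Type": "Digital SLR", "Resolution": "6-8 Megapixels"},
    {"Brand": "Kodak", "Type": "Point & Shoot", "Resolution": "4 Megapixels"},
]


def select(items, chosen):
    """Keep items matching every chosen facet value (facets combined at query time)."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in chosen.items())
    ]


def facet_counts(items, facet):
    """Count the remaining items per term of one facet, as shown next to each link."""
    return Counter(item[facet] for item in items)


slrs = select(cameras, {"Type": "Digital SLR"})
brand_counts = facet_counts(slrs, "Brand")
print(dict(brand_counts))
```

Each facet stays a short flat list, yet users can narrow by any combination of them in any order.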
The power of faceted taxonomy
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  - Easier to maintain
  - Easier to tag by content authors
  - Can be easier to navigate
• It's more effective to increase the number of facets than to increase the number of terms per facet.
Example facets:
Audience: Advocacy, Contractors & Grantees, Environmental Professionals, Federal Facilities, General Public, Industry, Kids, Researchers & Scientists, Small Business, Students
Health: Advisory, Exposure, Food Safety, Health Assessment, Health Effect, Health Risk, Occupational Health, Pesticide Effects, Sun Protection, Toxicity
Substance: Allergen, Biological Contaminant, Carcinogen, Chemical, Explosive, Liquid Waste, Microorganism, Ozone, Pesticide, Radioactive Waste
Industry: Agriculture & Cattle, Automobile Repair, Chemical, Dry Cleaning, Electronics & Computer, Energy, Extractive Industries, Food Processing, Leather Tanning & Finishing, Metal Finishing
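The arithmetic behind the 10^4 claim can be sketched as follows. The four facet sizes come from the slide; the fifth facet ("Location") and the widened facet sizes are hypothetical and only illustrate why adding a facet goes further than adding terms.

```python
from math import prod

# Facet sizes from the slide: four facets of ten terms each.
facet_sizes = {"Audience": 10, "Health": 10, "Substance": 10, "Industry": 10}


def combinations(sizes):
    """Distinct term combinations when one term is chosen from every facet."""
    return prod(sizes.values())


def terms_to_maintain(sizes):
    """Total number of terms someone actually has to manage."""
    return sum(sizes.values())


# Two hypothetical ways to spend ten extra terms (40 -> 50 terms in total).
extra_facet = {**facet_sizes, "Location": 10}
wider_facets = {"Audience": 13, "Health": 13, "Substance": 12, "Industry": 12}

for name, sizes in [("as on slide", facet_sizes),
                    ("extra facet", extra_facet),
                    ("wider facets", wider_facets)]:
    print(name, terms_to_maintain(sizes), "terms,", combinations(sizes), "combinations")
```

With the same 50 terms, a fifth facet yields 100,000 combinations while widening the existing four yields 24,336.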
Automatically created taxonomies
• Documents can be clustered based on similarities and differences.
• Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate
What does North mean on this map?
Automatic taxonomy construction software
• Software can scan large quantities of content and extract statistically significant words and phrases.
• Example: Archive of 10 publications analyzed for topics related to copyright.
• Software does a poor job of:
  - De-duplication.
  - Turning significant words and phrases into a larger structure.
  - Discriminating between gold and garbage.
• Software is good for:
  - Getting an understanding of the key noun phrases in a large collection.
  - Providing test cases for evaluating a taxonomy.
Source: Sample data courtesy of nStein.
Most popular flickr tags on 20 Feb 2007
http://www.flickr.com/photos/tags/
Sort flickr categories into 5 or fewer groups. Then label each group.
Taxonomy exercise: Facet grouping
• Universal taxonomy facets
  - By location (spatially)
  - By time (chronologically)
  - By type (genre)
  - By physical properties (size, color, shape, etc.)
  - By subject (topic)
Richard Saul Wurman. Information Architects (1996)
Taxonomy exercise: Facet grouping
Sort flickr categories into 5 or fewer groups. Then label each group.
Business case and motivations for taxonomies
• How are we going to use content, metadata, and taxonomies in applications to obtain business benefits?
What technology analysts have said:
Add metadata to search on!
• Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.
• Enriching content with structured metadata is critical for supporting search and personalized content delivery.
• Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.
• Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.
Fundamentals of taxonomy ROI
• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.
Product utilization: Taxonomy compared to search
• Conversion rate increases.
  - HomeDepot.com: Double digit increase.
  - 1-800-Flowers.com: More than a 10% increase.
  - Otto Group (Kaleidoscope, Freemans, Grattan, and lookagain catalogs): 130% increase.
• Lift in average order size.
Product catalog: Taxonomy compared to search
Benefit: Increased conversion rate & revenue lift
Web sales net income | $80,000,000
Increased conversion rate | 30% | $24,000,000
Order size lift | 10% | $8,000,000
Potential revenue increase per year | $32,000,000
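The figures in this table can be reproduced with a few lines of arithmetic. This is a sketch of the calculation as the slide appears to perform it (both percentages applied to the same base and then summed); the function and variable names are illustrative.

```python
def revenue_increase(base, conversion_increase_pct, order_size_lift_pct):
    """Apply both percentage lifts to the same base and add them, as on the slide."""
    conversion_gain = base * conversion_increase_pct // 100
    order_size_gain = base * order_size_lift_pct // 100
    return conversion_gain, order_size_gain, conversion_gain + order_size_gain


conversion_gain, order_size_gain, total_gain = revenue_increase(
    base=80_000_000, conversion_increase_pct=30, order_size_lift_pct=10
)
print(f"${conversion_gain:,} + ${order_size_gain:,} = ${total_gain:,}")
```

Substituting your own web sales figure and more conservative percentages gives a quick sensitivity check on the business case.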
Usability research: Taxonomy compared to search
• We found that users preferred a browsing oriented interface for a browsing task, and a direct search interface when they knew precisely what they wanted.
  Marti Hearst (and others)
• The category interface is superior to the list interface in both subjective and objective measures.
  Hao Chen & Susan Dumais
Usability research: Taxonomy compared to search
[Bar chart: median search time in seconds (axis 0-140) for the Category and List interfaces, shown for targets in the top 20 results and not in the top 20 results. Callouts: "Category is 36% faster" and "Category is 48% faster".]
Source: Chen & Dumais
Time saved: Taxonomy compared to search
1 hour per day searching x 36% faster = 22 minutes each day
22 minutes x 250 working days per year = 5,500 minutes, or 92 hours per year
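A short sketch of this estimate, following the slide's own rounding (the daily saving is rounded to whole minutes before it is multiplied out). The function name and parameter names are illustrative.

```python
def time_saved(search_minutes_per_day=60, faster_pct=36, working_days=250):
    """Return (minutes saved per day, minutes saved per year, hours saved per year)."""
    # The slide rounds 21.6 minutes up to 22 before annualizing.
    minutes_per_day = round(search_minutes_per_day * faster_pct / 100)
    minutes_per_year = minutes_per_day * working_days
    hours_per_year = round(minutes_per_year / 60)
    return minutes_per_day, minutes_per_year, hours_per_year


minutes_per_day, minutes_per_year, hours_per_year = time_saved()
print(minutes_per_day, "min/day,", minutes_per_year, "min/year,", hours_per_year, "hours/year")
```

Without the intermediate rounding the result is 5,400 minutes, or 90 hours, so the headline figure is accurate to within a couple of hours.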
Time saved: Taxonomy compared to search
Benefit: Increase service efficiency
Number of call center calls per month | 50,000
Average cost per call | $20
Call response costs per month | $1,000,000
Total call response costs per year | $12,000,000
Percentage of self-serviced calls due to improved information browsing | 30%
Service costs savings per year | $3,600,000
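The call center example reduces to the following calculation. This is a sketch using the slide's inputs; the function name and the assumption that every self-serviced call saves its full cost are illustrative simplifications.

```python
def call_center_savings(calls_per_month, cost_per_call, self_service_pct):
    """Return (monthly cost, yearly cost, yearly savings) in whole dollars."""
    monthly_cost = calls_per_month * cost_per_call
    yearly_cost = monthly_cost * 12
    # Assumes each deflected call saves the full average cost of a call.
    yearly_savings = yearly_cost * self_service_pct // 100
    return monthly_cost, yearly_cost, yearly_savings


monthly_cost, yearly_cost, yearly_savings = call_center_savings(
    calls_per_month=50_000, cost_per_call=20, self_service_pct=30
)
print(f"${monthly_cost:,} per month, ${yearly_cost:,} per year, ${yearly_savings:,} saved")
```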
Trusted advisers: Taxonomy avoids costs
• The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs
  Sue Feldman,
• Sun's usability experts calculated that 21,000 employees were wasting an average of six minutes per day due to inconsistent intranet navigation structures. When lost time was multiplied by staff salaries, the estimated productivity loss exceeded $10M per year, about $500 per employee per year.
  Jakob Nielsen, useit.com
[Pie chart segments: Searching, Creating, Communicating]
Knowledge workers spend up to 2.5 hours each day looking for information.
But find what they are looking for only 40% of the time.
Source: Kit Sims Taylor
[Pie chart segments: Creating new content, Recreating existing content, Searching, Communicating; labeled values: 25% and 8%]
Knowledge workers spend more time re-creating existing content than creating new content.
Source: Kit Sims Taylor (cited by Sue Feldman in her original article)
Cost saved by not recreating content
Benefit: Increase in productivity
Number of employees | 100
Average employee salary | $80,000
Employee costs per year | $8,000,000
Increase in productivity from not re-creating content | 25%
Employee cost savings per year | $2,000,000
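The same table as a small calculation, useful for trying other head counts or a more cautious productivity figure. The function and parameter names are illustrative; the inputs are the slide's.

```python
def recreation_savings(employees, average_salary, productivity_gain_pct):
    """Return (yearly employee costs, yearly savings from not re-creating content)."""
    yearly_costs = employees * average_salary
    yearly_savings = yearly_costs * productivity_gain_pct // 100
    return yearly_costs, yearly_savings


employee_costs, cost_savings = recreation_savings(
    employees=100, average_salary=80_000, productivity_gain_pct=25
)
print(f"${employee_costs:,} in costs, ${cost_savings:,} saved per year")
```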
Business case summary
1. Classifications and classification-like schemes are being used to facilitate information seeking in the workplace, and on the web.
2. Users take advantage of (and prefer) this type of scheme (faceted navigation) when it is made available in the user interface.
3. Hierarchical or facet navigation can be guided by the User Interface.
4. Facet navigation is best combined with keyword searching. E.g., keyword search followed by faceted navigation of results.
Taxonomy requires a business process
• Taxonomies must change, gradually, over time if they are to remain relevant.
• Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions.
Taxonomy governance can be viewed as a standards process
• Taxonomy must evolve, but in a predictable way.
• Team structure, with an appeals process
  - Taxonomy stewardship is a part-time role at most organizations.
  - Team needs to make decisions based on costs and benefits.
• Documentation and educational materials.
• Comment-handling responsibilities (part of error-correction process)
• Issue Logs.
• Release Schedule.
Taxonomy governance: Change process overview
[Diagram: CV Sources feed the Taxonomy Governance Environment, which publishes to CV Consumers]
CV Sources
• Internally Created: Subject Codes, Expertise, Other Internal
• External Standard
Taxonomy Governance Environment (Taxonomy Tool, Taxonomy Facets)
1: External controlled vocabularies (CVs) change on their own schedule
2: Taxonomy Team decides when to update CV snapshots
3: Team adds value via definitions, synonyms, classification rules, training materials, etc.
4: Updated versions of CVs published to consumers
Working copies of CVs are maintained in the Taxonomy Tool.
CV Consumers
• Site Search Tool
• Portal
• Working Papers
• Web CMS
• DAM
• Tagging Tool
• Search UI
NASA version of the same diagram: sources are Internally Created CVs (Codes, NASA Competencies), CVs from other NASA Sources, and External Standard Vocabularies; the NASA Taxonomy Team works in the NASA Taxonomy Governance Environment; consumers are Site Search Tool, Portal, Project Archives, DMS, Metatagging Tool, and Search UI.
CV = Controlled Vocabulary
Who should build the taxonomy?
• The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
• The team should plan on maintaining the taxonomy as well as building it.
  - Maintenance will not (usually) be anyone's full-time job.
  - Exact mix of people on team will change.
• It should be built in an iterative fashion, with more content and broader review for each iteration.
Taxonomy governance: Generic team charter
• Taxonomy Team is responsible for maintaining:
  - The Taxonomy, a multi-faceted classification scheme.
  - Associated taxonomy materials, such as: Editorial Style Guides, Taxonomy Training Materials, Metadata Standard.
  - Team rules and procedures for change management.
• Taxonomy Team will consider costs and benefits of suggested changes.
• Taxonomy Team will:
  - Manage relationship between providers of source vocabularies and consumers of the Taxonomy.
  - Identify new opportunities for use of the Taxonomy across the enterprise to improve information management practices.
  - Promote awareness and use of the Taxonomy.
Taxonomy governance team: Generic roles
• Business Lead
  - Keeps committee on track with larger business objectives.
  - Balances cost/benefit issues to decide appropriate levels of effort.
  - Obtains needed resources if those on committee can't accomplish a particular task.
• Technical Specialist
  - Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
  - Helps obtain data from various systems.
• Taxonomy Specialist
  - Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.
  - Makes edits to taxonomy, installs into system with aid of IT specialist.
• Content Specialist
  - Committee's liaison to content creators.
  - Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.
• Content Owners
  - Reality check on process change suggestions.
Where taxonomy changes come from
[Diagram components: End User, Firewall, Application UI, Application Logic, Content, Taxonomy, Tagging Logic, Tagging UI, Tagging Staff, Taxonomy Editor, Taxonomy Team]
Inputs to the Taxonomy Editor
• End User experience
• Tagging Staff notes on missing concepts
• Query log analysis
• Requests from other parts of the organization
Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New best bets content.
Team Considerations
1. Business goals.
2. Changes in user experience.
3. Retagging cost.
Taxonomy maintenance processes
• Different organizations will need to consider their own change processes.
  - Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.
  - Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.
  - Organization 3: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.
• Change process MUST also consider cost of implementing the change
  - Retagging data.
  - Reconfiguring auto-classifier.
  - Retraining staff.
  - Changes in user expectations.
Taxonomy maintenance workflow
[Flowchart with swimlanes: Analyst, Editor, Copywriter, Sys Admin]
1. Analyst: Suggest new name/category.
2. Editor: Review new name. Problem? (Yes / No)
3. Copywriter: Copy edit new name. Problem? (Yes / No)
4. Sys Admin: Add to enterprise Taxonomy (Taxonomy Tool).
Sample taxonomy editor: Data Harmony
[Screenshot with labeled panes: Hierarchy Browser; Standard Term Info]
Taxonomy editing tools vendors
[Quadrant chart: Ability to Execute (low to high) vs. Completeness of Vision; labeled quadrants include Niche Players and Visionaries]
• An immature area: No vendors are in upper-right quadrant!
• Most popular taxonomy editor is MS Excel
• MultiTes is widely used, cheap with
• High functionality/high cost products ($100K+)
Taxonomy maturity model
• Taxonomy governance processes must fit the organization.
• As consultants, we notice different levels of maturity in the business processes around content management, taxonomy, and metadata.
• Honestly assess your organization's metadata maturity in order to design appropriate governance processes.
• We are starting to define a maturity model, similar to the Software Capability Maturity Model (CMM)
  - Initial: Ad hoc, each project begins from scratch.
  - Repeatable: Procedures defined and used, but not standardized across organization or are misapplied to projects.
  - Defined: Standard processes are tailored for project needs. Strategic training for long-range goals is in place.
  - Managed: Projects managed using quantitative quality measures. Process itself is measured and controlled.
  - Optimizing: Continual process improvement. Extremely accurate project estimation.
Purpose of maturity model
• Estimating the maturity of an organization's information management processes tells us:
  - How involved the taxonomy development and maintenance process should be. Overly sophisticated processes will fail.
  - What to recommend as first steps.
• Maturity is not a goal, it is a characterization of an organization's methods for achieving particular goals.
• Mature processes have expenses which must be justified by consequent cost savings or revenue gains.
• IT Maturity may not be core to your business.
Taxonomy maturity scorecard
Columns: Initial | Repeatable | Defined | Managed | Optimizing
Organizational Structure
  Executive Sponsorship *
  Budgeting *
  Hiring & Training *
Quality Assurance
  Manual Processes * (note 1)
  Automated Processes *
Project Management
  Estimating & Scheduling *
  Cost Control *
  Project Methodology * (note 2)
Design and Execution
  Planning *
  Design Excellence *
  Development Maturity *
Note 1: X is starting to examine search query logs, which is an important first step in improving search. But this is only an isolated example.
Note 2: IT has a project methodology they are trying to use across all projects. But not all business units have project methodologies.
2005 Maturity survey: Search practices
n=87 | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Search Box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined. | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top. (Best Bets) | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)
2005 Maturity survey: Metadata practices
n=87 | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development. | 37% (22) | 37% (22) | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard. | 48% (29) | 20% (12) | 20% (12) | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources. | 57% (34) | 17% (10) | 17% (10) | 0% (0) | 10% (6)
Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)
2005 Maturity survey: Taxonomy practices
n=87 | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Org Chart Taxonomy - One based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
Products Taxonomy - One based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
Content Types Taxonomy - One based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
Topical Taxonomy - One based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
Faceted Taxonomy - One which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)
Today's agenda
9:00-9:10 10 min Introduction
9:10-9:15 5 min Warm-up exercise
9:15-9:45 30 min Taxonomy fundamentals: Building taxonomies
9:45-10:00 15 min Taxonomy exercise
10:00-10:30 30 min Taxonomy fundamentals: Taxonomy business case
10:30-11:00 30 min Tea Break
11:00-12:00 60 min Taxonomy governance
12:00-12:30 30 min Capabilities self-assessment
12:30-13:30 60 min Lunch
13:30-14:30 60 min Taxonomy benchmarking
14:30-14:45 15 min Benchmarking exercise
14:45-15:15 30 min Tea Break
15:15-16:15 60 min Content tagging
16:15-16:30 15 min Tagging exercise
16:30-17:00 30 min Q&A
Taxonomy testing methods
Method | Process | Who | Requires | Validation
Walk-thru | Show & explain | Taxonomist; SME Team | Rough taxonomy | Approach; Appropriateness to task
Walk-thru | Check conformance to editorial rules | Taxonomist | Draft taxonomy; Editorial Rules | Consistent look and feel
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Users | Rough taxonomy; Tasks & Answers | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Users | Rough Taxonomy; UI Mockup; Search prototype | Reaction to taxonomy; Reaction to new interface; Reaction to search results
Tagging Samples | Tag sample content with taxonomy | Taxonomist; Team; Indexers | Sample content; Rough taxonomy (or better) | Content fit; Fills out content inventory; Training materials for people & algorithms
Walk-through method
Show & explain
ABC Computers.com
Audience: All; Business; Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper (First Time, Experienced, Advanced); Supplier
Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business
Region-Country: All; Asia-Pacific; Canada; EMEA; Japan; Latin America & Caribbean; United States
Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Other Brands
Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Types
Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency
Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petrochemicals; Retail / Wholesale; Technology; Transportation; Other Industries
Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training
Walk-through method
Editorial rules consistency check
• Abbreviations
• Ampersands
• Capitalization
• General, More, Other
• Languages & character sets
• Length limits
• Multiple parents
• Plural vs. singular form
• Scope notes
• Serial comma
• Sources of terms
• Spaces
• Synonyms & acronyms
• Term order (Alphabetic or )
• Term label order (Direct vs. inverted)
Rule Name | Editorial Rule
Abbreviations | Abbreviations, other than colloquial terms and acronyms, shall not be used in term labels. Example: Public Information. NOT: Public Info.
Ampersands | The ampersand [&] character shall be used instead of the word "and". Example: Licensing & Compliance. NOT: Licensing and Compliance
Capitalization | Title case capitalization shall be used. Example: Customer Service. NOT: CUSTOMER SERVICE. NOT: Customer service. NOT: customer service
General, More, Other | The term labels "General", "More", and "Other" shall be used for categories which contain content items that are not further classifiable. Example: Other Property; Other Services; General Information; General Audience
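Editorial rules like these lend themselves to an automated first pass before the manual walk-through. The following is a minimal sketch, assuming a hypothetical function `check_label` and a hypothetical list of small connecting words; it covers only the Abbreviations, Ampersands and Capitalization rules and is not a tool described in the slides.

```python
import re

# Hypothetical list of small words that title case leaves in lower case.
SMALL_WORDS = {"a", "an", "and", "of", "or", "the", "to"}

def check_label(label: str) -> list[str]:
    """Return the names of the editorial rules that a term label violates."""
    problems = []
    # Abbreviations: a word ending in a period suggests a shortened form such as "Info."
    # (colloquial terms and acronyms would need an allow-list, which is omitted here).
    if re.search(r"\b[A-Za-z]+\.(\s|$)", label):
        problems.append("Abbreviations")
    # Ampersands: the word "and" should be written as "&".
    if re.search(r"\band\b", label, flags=re.IGNORECASE):
        problems.append("Ampersands")
    # Capitalization: every significant word starts with a capital letter,
    # and the label is not written entirely in capitals.
    words = [w for w in re.findall(r"[A-Za-z]+", label) if w.lower() not in SMALL_WORDS]
    if any(not w[0].isupper() for w in words) or (label.isupper() and len(label) > 3):
        problems.append("Capitalization")
    return problems

for label in ["Public Info.", "Licensing and Compliance", "Customer service"]:
    print(label, "->", check_label(label))
```

A check like this only narrows down the labels a taxonomist needs to look at; rules such as scope notes or sources of terms still need human review.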
Task-based testing*
• 15 representative questions were selected
  - Perspective of various organizational units
  - Most frequent website searches
  - Most frequently accessed website content
  - Correct answers to the questions were agreed in advance by team.
• 15 users were tested
  - Did not work for the organization
  - Represented target audiences
• Testers were asked where they would look for the answer: under which facet (Topic, Commodity, or Geography)?
  - Then, under which category?
  - Then, under which sub-category?
  - Tester choices were recorded
• Testers were asked to think aloud
  - Notes were taken on what they said
• Pre- and post questions were asked
  - Tester answers were recorded
* Based on Donna Maurer's usability work with the Australian government
Task-based testing
Representative questions
1. How much cotton is imported from China?
2. What are the impacts of "mad cow" disease on U.S. meat production, sales?
3. What is the average farm income level in your state?
4. How much of our diet comes from fast food?
5. How many people receive WIC benefits (Special Supplemental Nutrition Program for Women, Infants, and Children)?
6. How much acreage is planted to genetically engineered corn?
7. What is the cost of foodborne illness in the United States?
8. What part of food costs go to farmers, retailers?
9. Which States produce the most tobacco?
10. What percentage of farms in the United States are small farms?
11. What are the costs and benefits associated with providing more traceability in the U.S. food supply?
12. How many people in America don't get enough to eat?
13. What is behind the trade balance (surplus or deficit) in agricultural goods?
14. What is the extent of conservation compliance? How does that impact farmers' decisions?
15. What are the impacts of foreign trade restrictions on U.S. farmers, U.S. food prices?
Task-based testing
Closed card sorting
3. What is the average farm income level in your state?

1. Topics
2. Commodities
3. Geographic Coverage

1. Topics
1.1 Agricultural Economy
1.2 Agriculture-Related Policy
1.3 Diet, Health & Safety
1.4 Farm Financial Conditions
1.5 Farm Practices & Management
1.6 Food & Agricultural Industries
1.7 Food & Nutrition Assistance
1.8 Natural Resources & Environment
1.9 Rural Economy
1.10 Trade & International Markets

1.4 Farm Financial Conditions
1.4.1 Costs of Production
1.4.2 Commodity Outlook
1.4.3 Farm Financial Management & Performance
1.4.4 Farm Income
1.4.5 Farm Household Financial Well-being
1.4.6 Lenders & Financial Markets
1.4.7 Taxes
Task-based testing
Card sort results
• In 80% of the trials users looked for information under the categories that we expected them to look for it.
• Breaking up topics into facets makes it easier to find information, especially information related to commodities.
Task-based testing
Card sort results
Test Questions % Correct % Agree
1. Cotton 91% 82%
2. Mad cow 73% 64%
3. Farm income 100% 55%
4. Fast food 91% 73%
5. WIC 100% 100%
6. GE corn 100% 100%
7. Foodborne illness 82% 82%
8. Food costs 55% 27%
9. Tobacco 100% 100%
10. Small farms 91% 91%
11. Traceability 36% 18%
12. Hunger 100% 73%
13. Trade balance 36% 64%
14. Conservation 91% 91%
15. Trade restrictions 55% 36%
Possible change required.
Change required.
Possible error in categorization of this question because 64% thought the answer should be Commodity Trade.
On these trials, only 50% looked in the right category, & only 27-36% agreed on the category.
Policy of Traceability needs to be clarified. Use quasi-synonyms.
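The two columns in the results table can be tallied from the recorded tester choices. The sketch below is one possible reading, not the method documented in the slides: it assumes "% Correct" is the share of testers who chose the expected category, and "% Agree" is the share who chose the single most common category and sub-category path. The function name, the category labels and the tallies are hypothetical.

```python
from collections import Counter

def score_question(choices, expected_category):
    """Return (% correct, % agree) for one test question.

    choices is a list of (category, sub_category) pairs, one per tester.
    Assumed definitions: correct = chose the expected category;
    agree = chose the most common (category, sub_category) path.
    """
    n = len(choices)
    correct = sum(1 for category, _ in choices if category == expected_category)
    modal_count = Counter(choices).most_common(1)[0][1]
    return round(100 * correct / n), round(100 * modal_count / n)

# Hypothetical tallies for 11 testers (not the recorded data).
choices = (
    [("Trade", "Imports")] * 9
    + [("Trade", "Exports")]
    + [("Agricultural Economy", "Outlook")]
)
result = score_question(choices, "Trade")
print(result)
```

Questions where either figure falls below a chosen threshold would then be candidates for the "Possible change required" or "Change required" flags.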
Task-based testing
User satisfaction survey
• Was it easy, medium or difficult to choose the appropriate Topic?
  - Easy
  - Medium
  - Difficult
• Was it easy, medium or difficult to choose the appropriate Commodity?
  - Easy
  - Medium
  - Difficult
• Was it easy, medium or difficult to choose the appropriate Geographic Coverage?
  - Easy
  - Medium
  - Difficult
User satisfaction survey
Results
[Bar chart: average rating by Facet (Topic, Commodity, Geography) on a scale from 0 to 2.00, running from Easy to Difficult. Annotations: "Easier", "More Difficult".]
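A per-facet average like the one charted can be computed by coding each answer numerically. This is a minimal sketch, assuming a coding of Easy = 0, Medium = 1, Difficult = 2 (chosen only because it matches the 0 to 2 axis) and using made-up responses; the names `SCORES` and `facet_difficulty` are hypothetical.

```python
from statistics import mean

# Assumed coding of the three answers; lower means easier.
SCORES = {"Easy": 0, "Medium": 1, "Difficult": 2}

def facet_difficulty(responses):
    """Average the coded answers for each facet."""
    return {
        facet: round(mean(SCORES[answer] for answer in answers), 2)
        for facet, answers in responses.items()
    }

# Hypothetical answers from four testers (not the survey data).
responses = {
    "Topic": ["Easy", "Medium", "Medium", "Difficult"],
    "Commodity": ["Easy", "Easy", "Medium", "Easy"],
    "Geography": ["Easy", "Easy", "Easy", "Medium"],
}
averages = facet_difficulty(responses)
print(averages)
```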
User interface survey
http://flamenco.berkeley.edu/index.html
Which search UI is better?
• Criteria
  - User satisfaction
  - Success completing tasks
  - Confidence in results
  - Fewer dead ends
• Methodology
  - Design tasks from specific to general
  - Time performance
  - Calculate success rates
  - Survey subjective criteria
  - Pay attention to survey hygiene: participant selection, counterbalancing, t-scores
Source: Yee, Swearingen, Li, & Hearst
User interface survey
Results (1)
Which interface would you rather use for these tasks? | Google-like Baseline | Faceted Category
Find images of roses | 15 | 16
Find all works from a certain period | 2 | 30
Find pictures by 2 artists in the same media | 1 | 29

Overall assessment: | Google-like Baseline | Faceted Category
More useful for your usual tasks | 4 | 28
Easiest to use | 8 | 23
Most flexible | 6 | 24
More likely to result in dead-ends | 28 | 3
Helped you learn more | 1 | 31
Overall preference | 2 | 29
Source: Yee, Swearingen, Li, & Hearst
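One quick way to judge whether a preference split in the table is more lopsided than chance is an exact sign test. This is a swapped-in illustration, not the analysis the cited authors ran (the methodology slide mentions t-scores); the function name `sign_test_p` is hypothetical, and only the counts are taken from the table.

```python
from math import comb

def sign_test_p(a: int, b: int) -> float:
    """Two-sided exact sign test.

    Returns the probability of a split at least this lopsided if each
    participant were equally likely to prefer either interface.
    """
    n, k = a + b, min(a, b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_overall = sign_test_p(2, 29)   # "Overall preference" row
p_roses = sign_test_p(15, 16)    # "Find images of roses" row
print(p_overall, p_roses)
```

The near-even split on the simple rose search and the strong split on the broader tasks are consistent with the advice to design tasks from specific to general.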
User interface survey
Results (2)
[Bar chart: ratings on a 0 to 9 scale for two interfaces (Faceted Category and Google-like Baseline) across eight attributes: Easy to Use, Simple, Flexible, Tedious, Interesting, Easy to Browse, Enjoyable, Overwhelming.]
Source: Yee, Swearingen, Li, & Hearst
Tagging samples
How many items?
Goal | Number of Items | Criteria
Illustrate metadata schema | 1-3 | Random (excluding junk)
Develop training documentation | 10-20 | Show typical & unusual cases
Qualitative test of small vocabulary (
Tagging samples
Spreadsheet for tagging 10s-100s of items
1) Clickable URLs for sample content
2) Review small sample and describe
3) Drop-down for tagging (including "Other" entry for the unexpected)
4) Flag questions
Rough bulk tagging
Facet demo (1)
• Collections: 4 content sources
  - NTRS, SIRTF, Webb, Lessons Learned
• Taxonomy
  - Converted MultiTes format into RDF for Seamark
• Metadata
  - Converted from existing metadata on web pages, or
  - Created using simple automatic classifier (string matching with terms & synonyms)
• 250k items, ~12 metadata fields, 1.5 weeks effort
• OOTB Seamark user interface, plus logo
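A simple automatic classifier of the kind mentioned above (string matching with terms & synonyms) can be sketched in a few lines. This is an illustration only, not the classifier used for the demo: the vocabulary, the function name `classify` and the sample text are hypothetical.

```python
import re

# Hypothetical vocabulary: preferred term -> synonyms. Not the demo taxonomy.
VOCABULARY = {
    "Space Telescopes": ["space telescope", "SIRTF", "Webb"],
    "Lessons Learned": ["lesson learned", "lessons learned"],
}

def classify(text: str, vocabulary=VOCABULARY) -> list[str]:
    """Return every preferred term whose label or synonym occurs in the text."""
    tags = []
    for term, synonyms in vocabulary.items():
        for phrase in [term, *synonyms]:
            # Whole-word, case-insensitive match of the phrase.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE):
                tags.append(term)
                break  # one hit per term is enough
    return tags

tags = classify("Mirror testing for the Webb observatory: lessons learned")
print(tags)
```

Matching on synonyms as well as preferred labels is what lets a rough classifier like this tag a large collection quickly, at the cost of false hits on ambiguous words.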
Rough bulk tagging
Facet demo (2)
http://demo.siderean.com/NASADemoV4/NASA-demoquery1.jsp
Document distribution
How evenly does it divide the content?
• Documents do not distribute uniformly across categories
• Zipf (1/x) distribution is expected behavior
• 80/20 rule in action (actually 70/20 rule)
[Chart: Measured v Expected Distribution of Top 10 Content Types in Library of Congress Database. X axis: Top 10 Content Types (Congresses, Biography, Periodicals, Maps, Fiction, Exhibitions, Juvenile literature, Bibliography, Statistics). Y axis: Number of Records, 0 to 350,000. Annotations: "Leading candidate for splitting"; "Leading candidates for merging".]
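The measured-versus-expected comparison can be reproduced for any set of category counts. The sketch below assumes a pure 1/rank (Zipf) curve scaled to the same total, and an arbitrary factor of 1.5 for deciding what counts as far from the curve; the function names and the counts are hypothetical, not the Library of Congress figures.

```python
def zipf_expected(total: float, n: int) -> list[float]:
    """Expected count at each rank 1..n if counts fall off as 1/rank."""
    harmonic = sum(1 / rank for rank in range(1, n + 1))
    return [total / (rank * harmonic) for rank in range(1, n + 1)]

def compare(measured: dict[str, int], factor: float = 1.5) -> dict[str, str]:
    """Label categories far above the curve 'split' and far below it 'merge'.

    The factor of 1.5 is an assumed threshold for illustration.
    """
    ranked = sorted(measured.items(), key=lambda item: item[1], reverse=True)
    expected = zipf_expected(sum(measured.values()), len(ranked))
    flags = {}
    for (name, count), exp in zip(ranked, expected):
        if count > factor * exp:
            flags[name] = "split"
        elif count < exp / factor:
            flags[name] = "merge"
    return flags

# Made-up counts for four categories.
flags = compare({"A": 900, "B": 150, "C": 100, "D": 50})
print(flags)
```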
Document distribution
How evenly does it divide the content?
• Methodology: 115 randomly selected URLs from corporate intranet search index were manually categorized. Inaccessible files and junk were removed.
• Results: Slightly more uniform than Zipf distribution. Above the curve is better than expected.
[Chart: Measured v Expected Intranet Content Type Distribution. X axis: Content Type (People, Groups & Places; News & Events; Manuals & Learning Materials; Operations & Internal Communications; Marketing & Sales; Regulations, Policies, Procedures & Templates; Papers & Presentations; Other & Unclassified; Programs, Proposals, Plans & Schedules). Y axis: # Documents, 0 to 25.]
Document distribution
How does taxonomy shape match that of content?
Background:
• Hierarchical taxonomies allow comparison of fit between content and taxonomy areas
Methodology:
• 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)
• Counts of terms and documents summed within taxonomy hierarchy
Results:
• Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)
• Mismatches between term% and document% flagged
Term Group | % Terms | % Docs
Administrators | 7.8 | 15.8
Community Groups | 2.8 | 1.8
Counselors | 3.4 | 1.4
Federal Funds Recipients and Applicants | 9.5 | 34.4
Librarians | 2.8 | 1.1
News Media | 0.6 | 3.1
Other | 7.3 | 2.0
Parents and Families | 2.8 | 6.0
Policymakers | 4.5 | 11.5
Researchers | 2.2 | 3.6
School Support Staff | 2.2 | 0.2
Student Financial Aid Providers | 1.7 | 0.7
Students | 27.4 | 7.0
Teachers | 25.1 | 11.4
Source: Courtesy Keith Stubbs, U.S. Dept. of Ed.
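Flagging mismatches between term% and document% amounts to comparing the two columns row by row. The sketch below uses the figures from the table and an assumed threshold: a group is flagged when one share is at least twice the other. The threshold and the names `ROWS` and `mismatches` are illustrative, not taken from the source.

```python
# (term group, % terms, % docs), copied from the table above.
ROWS = [
    ("Administrators", 7.8, 15.8),
    ("Community Groups", 2.8, 1.8),
    ("Counselors", 3.4, 1.4),
    ("Federal Funds Recipients and Applicants", 9.5, 34.4),
    ("Librarians", 2.8, 1.1),
    ("News Media", 0.6, 3.1),
    ("Other", 7.3, 2.0),
    ("Parents and Families", 2.8, 6.0),
    ("Policymakers", 4.5, 11.5),
    ("Researchers", 2.2, 3.6),
    ("School Support Staff", 2.2, 0.2),
    ("Student Financial Aid Providers", 1.7, 0.7),
    ("Students", 27.4, 7.0),
    ("Teachers", 25.1, 11.4),
]

def mismatches(rows, factor=2.0):
    """Flag groups whose document share and term share differ by the factor or more."""
    flagged = {}
    for group, pct_terms, pct_docs in rows:
        ratio = pct_docs / pct_terms
        if ratio >= factor:
            flagged[group] = "more content than terms"
        elif ratio <= 1 / factor:
            flagged[group] = "more terms than content"
    return flagged

flagged = mismatches(ROWS)
for group, note in flagged.items():
    print(f"{group}: {note}")
```

Groups with far more content than terms may need finer subdivision, while groups with many terms and little content may be over-built for the collection.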
Usability testing
How intuitive (repeatable) are the categorizations (1)?
• Methodology: Closed Card Sort
  - For alpha test of a grocery site
  - 15 testers put each of 71 best-selling product types into one of 10 pre-defined categories
  - Categories where fewer than 14 of 15 testers put product into same category were flagged
Usability testing
How intuitive (repeatable) are the categorizations (2)?
Usability testing
% of Testers | Cumulative % of Products
15/15 | 54%
14/15 | 70%
13/15 | 77%
12/15 | 83%
11/15 | 85%
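A cumulative agreement table of this shape can be derived from raw closed card sort placements. This is a minimal sketch with made-up data for 3 testers and 4 products; the function name `agreement_table` and the product and category names are hypothetical.

```python
from collections import Counter

def agreement_table(placements: dict[str, list[str]], testers: int):
    """For each level k/testers, give the cumulative % of products whose
    most popular category was chosen by at least k testers."""
    top_votes = [Counter(cats).most_common(1)[0][1] for cats in placements.values()]
    table = []
    for k in range(testers, 0, -1):
        share = sum(1 for votes in top_votes if votes >= k) / len(top_votes)
        table.append((f"{k}/{testers}", round(100 * share)))
    return table

# Hypothetical placements: product -> category chosen by each tester.
placements = {
    "Milk": ["Dairy", "Dairy", "Dairy"],
    "Eggs": ["Dairy", "Dairy", "Meat"],
    "Salsa": ["Snacks", "Condiments", "Condiments"],
    "Tofu": ["Produce", "Dairy", "Meat"],
}
table = agreement_table(placements, testers=3)
for level, pct in table:
    print(level, f"{pct}%")
```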
The #1 underused source of quantitative information on how to improve your taxonomy?
Query Logs & Click Trails
Query log & click trail examination
Who are the users & what are they looking for?
• Only 30-40% of organizations regularly examine their logs*.
• Sophisticated software is available, but don't wait.
• 80% of value comes from basic reports
Query log & click trail examination
Query log
UltraSeek Reporting
• Top queries
• Queries with no results
• Queries with no click-through
• Most requested documents
• Query trend analysis
• Complete server usage summary
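Several of the basic reports listed above can be produced from a raw query log without specialized software. The sketch below is not UltraSeek's reporting interface; it assumes a hypothetical log of (query, result count, click count) records and a hypothetical function `basic_reports`.

```python
from collections import Counter

# Hypothetical log records: (query, number of results, number of clicks).
LOG = [
    ("farm income", 120, 3),
    ("farm income", 120, 1),
    ("mad cow", 45, 0),
    ("tracability", 0, 0),
    ("wic benefits", 12, 2),
]

def basic_reports(log, top_n=3):
    """Build three basic reports: top queries, no results, no click-through."""
    queries = Counter(query for query, _, _ in log)
    return {
        "top_queries": queries.most_common(top_n),
        "no_results": sorted({q for q, results, _ in log if results == 0}),
        "no_click_through": sorted(
            {q for q, results, clicks in log if results > 0 and clicks == 0}
        ),
    }

reports = basic_reports(LOG)
print(reports)
```

Queries with no results are a direct source of candidate synonyms and misspellings to add to the vocabulary.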
Query log & click trail examination
Click trail packages
• iWebTrack
• NetTracker
• OptimalIQ
• SiteCatalyst
• Visitorville
• WebTrends
Benchmarking exercise
• What are 5 representative questions that your users ask or tasks that your users do when using your application?
• Is it currently easy, medium or difficult to answer these questions or accomplish these tasks?
Rating (Easy/Medium/Difficult)
Questions or Tasks
Conclusion
What is a good taxonomy?
• Incremental, extensible process that identifies and enables owners, and engages stakeholders.
• Quick implementation that provides measurable results as quickly as possible.
• A means to an end, and not the end in itself.
• Not perfect, but it does the job it is supposed to do, such as improving search and navigation.
• Improved over time, and maintained.
Today's agenda
9:00-9:10 10 min Introduction
9:10-9:15 5 min Warm-up exercise
9:15-9:45 30 min Taxonomy fundamentals: Building taxonomies
9:45-10:00 15 min Taxonomy exercise
10:00-10:30 30 min Taxonomy fundamentals: Taxonomy business case
10:30-11:00 30 min Tea Break
11:00-12:00 60 min Taxonomy governance
12:00-12:30 30 min Capabilities self-assessment
12:30-13:30 60 min Lunch
13:30-14:30 60 min Taxonomy benchmarking
14:30-14:45 15 min Benchmarking exercise
14:45-15:15 30 min Tea Break
15:15-16:15 60 min Content tagging
16:15-16:30 15 min Tagging exercise
16:30-17:00 30 min Q&A
Tagging Overview
• Tagging is better than the words that happen to occur in a piece of content.
• All tagging is useful
  - End user tagging
  - Tagging by librarians
  - Automated tagging by OS and algorithms
• Content should be tagged throughout its lifecycle, each time the content is handled and used so that it accrues value or its significance is diminished.
MS Office: File Properties
How many people fill this in?
Organize
How many people click on this?
What is social tagging?
• End user tagging
• Easy, intuitive tagging interfaces
• Almost instantaneous feedback
  - Enables people to tag & re-tag content in response to seeing their tags in context with other tags.
• Emergent categories
  - Resembles open card sort process in which patterns emerge rather than validating categories using closed card sorts.
Social tagging innovators
• flickr founders
  - Caterina Fake
  - Stewart Butterfield
• del.icio.us founder
  - Joshua Schachter
• del.icio.us & flickr are now both part of Yahoo!
• As of April 2006 flickr had 130 million photos posted by 3 million registered users.
Four tagging rules for end users
Rule | Description
Use specific terms | Apply the most specific terms when tagging content. But do not tag every possible topic, just the ones that are most important or best characterize the content as a whole.
Use multiple terms | Use as many terms as necessary to describe overall What the content is about & Why it is important. Do not over-tag.
Use appropriate terms | Only fill in the facets & values that make sense. Not all facets apply to all content.
Consider how content will be used | Anticipate how the content will be searched for in the future, & how to make it easy to find it. Remember that search engines can only operate on explicit information.
Agenda
• Content Tagging
• Tagging Interface
Requirements for a tagging interface
• Automated form fill-in (automatically fills in known data)
• Tagging precedents (see tags already assigned by others)
• Controlled vocabularies, e.g., with pull-down list
• Multi-valued tags
• Geo-tagging
• Group tagging
• Clean-up tag tools, e.g., alpha list
• Batch editing
• Share/Don't share (Public/Private)
• Identified owner (who can be emailed)
• Almost immediate feedback, e.g., tag cloud
Form fill-in: Automatically filled-in known data
Form fill-in: Automatically filled-in known data
Manual form fill-in w/ checkboxes, pull-down lists, etc.
Auto keyword & summarization
Form fill-in: Automatically filled-in known data
Auto-categorization
Parse & lookup (recognize names)
Rules & pattern matching
Tagging precedents: See tags assigned by others
Multi-valued group tagging
Group geo-tagging
Group geo-tagging
Clean up tag tools: Alpha list
Batch edit
Share or don't share tagging
Bulk tagging
• Identify a collection of related content items by pattern or context
• Then, apply the same attributes to all content items
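The two steps above can be sketched with a path pattern standing in for "pattern or context". The function name `bulk_tag`, the paths and the attribute values are hypothetical; a real content management system would expose its own way of selecting items.

```python
import fnmatch

def bulk_tag(items: dict[str, dict], pattern: str, attributes: dict) -> int:
    """Apply the same attributes to every item whose path matches the pattern.

    Returns the number of items tagged.
    """
    matched = fnmatch.filter(items, pattern)
    for path in matched:
        items[path].update(attributes)
    return len(matched)

# Hypothetical content items keyed by path, each with a metadata dict.
items = {
    "/hr/policies/leave.doc": {},
    "/hr/policies/travel.doc": {},
    "/sales/q1-report.xls": {},
}
count = bulk_tag(
    items, "/hr/policies/*", {"Content Type": "Policy", "Audience": "Employee"}
)
print(count, items)
```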
Tag a folder
• Drag & drop content items into folder
• Then, content items inherit properties of folder
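Folder inheritance can be modelled as default properties that each dropped item picks up. In this sketch the `Folder` class and its `drop` method are hypothetical, and letting an item's own values override the folder's is an assumption that the slide does not specify.

```python
class Folder:
    """A folder whose properties are inherited by the items dropped into it."""

    def __init__(self, name: str, **properties):
        self.name = name
        self.properties = properties
        self.items = {}

    def drop(self, item_name: str, **own_properties) -> dict:
        # Folder properties act as defaults; the item's own values win
        # (an assumed precedence rule).
        tags = {**self.properties, **own_properties}
        self.items[item_name] = tags
        return tags

folder = Folder("Press Releases", content_type="Press Release", audience="Media")
tags = folder.drop("2007-06-launch.doc", region="Asia-Pacific")
print(tags)
```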
Workflow
• Approve & improve mindset
[Workflow diagram: Create Content; Add Metadata; Review & Improve; Review & Improve; Publish]
Interactive rewards
• Almost instantaneous exposure of tags in simple user interfaces on the web provides positive reinforcement for user tagging that simply did not exist before.
• For example,
  - Most popular
  - Tag clouds
  - Alerts
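A tag cloud gives this kind of immediate feedback by sizing each tag according to how often it has been applied. The sketch below uses logarithmic scaling, one common choice rather than anything prescribed in the slides; the function name `tag_cloud`, the font-size range and the tag data are hypothetical.

```python
import math
from collections import Counter

def tag_cloud(tags, min_size=10, max_size=32):
    """Map each tag to a font size that grows with the log of its frequency."""
    counts = Counter(tags)
    low = math.log(min(counts.values()))
    high = math.log(max(counts.values()))
    span = (high - low) or 1.0  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_size + (max_size - min_size) * (math.log(n) - low) / span)
        for tag, n in counts.items()
    }

# Hypothetical stream of user-applied tags.
sizes = tag_cloud(["taxonomy"] * 8 + ["metadata"] * 4 + ["search"] * 2 + ["rdf"])
print(sizes)
```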
Most popular
Another example is "most emailed" from, e.g., the NY Times.
Tag cloud
Alerts
• New (content selected by date)
• Subscriptions (content selected by tags)
• Interest (content selected by other people)
• Individual (content selected for you by other people)
Is faceted indexing the future of social tagging?
Tagging exercise: Blog tagging (a)
ALA Tech Source. http://www.techsource.ala.org/blog/2007/04/google-buys-oclc-announces-new-products.html
Tagging exercise: Blog tagging (b)
HBSP. http://discussionleader.hbsp.com/davenport/2007/04/cause_and_effect_reporting_raw.html#comments
Tagging exercise: Taxonomy facets - definitions
Taxonomy Facets | Descriptions
Business activity | Use for common business function or activity such as finance, marketing and sales.
Industry / Product | Use for content that is about or related to an industrial sector or product such as construction equipment.
Geography | Use for content that is about a region, country or city.
Organization | Use for named organizations, brands and business entities.
Person / Role | Use for named people and the roles people have in organizations.
Content Type | Use for content genres such as letters, memos and reports.
Audience | Use to indicate the intended audience.
Topic | Use for other business and associated topics that the content is about or related to.
Tagging exercise: Taxonomy facets - values
Geography: Africa; Americas; Antarctica; Asia; Europe; Oceania; Global; Historical geography; Oceans & seas; Regions
Industry / Product: Agriculture; Mining; Utilities; Construction; Manufacturing; Wholesale trade; Retail trade; Transportation & warehousing; Information; Finance & insurance; Real estate; Professional; Management; Administrative support; Education; Health care; Arts, entertainment & recreation; Accommodation & food; Other services; Public administration
People / Role: Business Leaders; Thought Leaders; Political Leaders; Roles
Organization / Entity: Business entities; Companies & brands; Government agencies; International NGOs; Organization types
Content Type: Basic facts & information; Blog; Brochure; Database; E-mail; Letter; Memo; Multimedia; Report; Newsletter; Podcast; Press Release; Research & Analysis; RSS Feed
Business activity: Accounting; Auditing; Finance; HR management; IT; Marketing; Operations management; Sales
Audience: Consumer; Employee; Manager; Executive

Taxonomy Facets | Tags
Business activity |
Industry / Product |
Geography |
Organization |
Person / Role |
Content Type |
Audience |
Topic |
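When tags from an exercise like this are captured in a system, each one can be checked against the controlled values for its facet. The sketch below copies a few values per facet from the lists above and is illustrative only: the names `VOCABULARY` and `validate` are hypothetical, and free-text facets such as Topic are left out.

```python
# A subset of the facet values listed above; free-text facets are omitted.
VOCABULARY = {
    "Geography": {"Africa", "Americas", "Antarctica", "Asia", "Europe", "Oceania", "Global"},
    "Content Type": {"Blog", "Brochure", "Letter", "Memo", "Report", "Newsletter", "Podcast"},
    "Audience": {"Consumer", "Employee", "Manager", "Executive"},
}

def validate(tags: dict[str, list[str]]) -> list[str]:
    """Return a message for every facet or value that is not in the vocabulary."""
    errors = []
    for facet, values in tags.items():
        if facet not in VOCABULARY:
            errors.append(f"unknown facet: {facet}")
            continue
        errors += [
            f"{facet}: '{value}' is not a controlled value"
            for value in values
            if value not in VOCABULARY[facet]
        ]
    return errors

# Hypothetical tags for a blog post; "Librarian" is deliberately invalid.
errors = validate({"Content Type": ["Blog"], "Audience": ["Manager", "Librarian"]})
print(errors)
```

Because each facet accepts a list, the same structure supports multi-valued tags while still rejecting values outside the vocabulary.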
Summary
• There are lessons to be learned from web tagging about how to get good metadata in document and content management applications.
• Document and content management system tagging must be simple, and it must make it almost instantaneously easier to find relevant work products.
Questions?
Joseph A. Busch
+ 415-377-7912
http://www.taxonomystrategies.com