Taxonomy 101: Classifying DITA Tasks

easyDITA How-To Series: Taxonomy 101: Classifying DITA Tasks Paul Wlodarczyk CEO, Jorsek LLC June 28, 2012

description

Listen to the webcast: http://bit.ly/MdAzXd

DITA Tasks are often the most valuable content we create, especially when we present them in Support portals. But if end users can’t find them, they have no value. Avoiding that requires classifying them with metadata and labels from a standard taxonomy. Taxonomy and metadata can seem like scary or complex turf to the uninitiated, but they don’t have to be. In this 40-minute webinar, Paul Wlodarczyk will walk you through a simple process to begin to assemble a basic taxonomy of controlled vocabularies for tagging your DITA Tasks.

You will learn:
• The most critical metadata for classifying tasks, regardless of your industry
• How to use tools that you already own to build your taxonomy
• Simple rules for keeping your terms consistent
• How to use existing lists of terms so you don’t have to build a taxonomy from scratch

Transcript of Taxonomy 101: Classifying DITA Tasks

Page 1: Taxonomy 101: Classifying DITA Tasks

easyDITA How-To Series: Taxonomy 101: Classifying DITA Tasks

Paul Wlodarczyk

CEO, Jorsek LLC

June 28, 2012

Page 2: Taxonomy 101: Classifying DITA Tasks

Poll: Please complete while folks arrive

How are you delivering DITA Tasks or other procedural / how-to content?
• Portal with advanced / faceted search
• Static web pages or web help
• Print / PDF
• Windows Help
• Other

6/28/2012 2© Jorsek, LLC. All Rights Reserved.

Page 3: Taxonomy 101: Classifying DITA Tasks

Why talk about task oriented content?

Task-oriented content is valuable:
• It is versatile and can be reused in more deliverables than conceptual content
  – Product user guides
  – Context-sensitive help
  – Knowledge base
  – Support
  – Training
• It’s what most users are searching for in a knowledge base or help

A DITA Task published to MindTouch

Page 4: Taxonomy 101: Classifying DITA Tasks

Benefits of using DITA for authoring tasks

• Tasks authored in DITA are
  – Concise
  – Consistent
  – Modular
  – Semantic

• DITA Tasks make good templates for content contributed by SMEs (like product engineers)

• For software user assistance (UA) in particular, task-oriented content is perfect for QA. The task becomes the Test Case.

The XML Source for a DITA Task

Page 5: Taxonomy 101: Classifying DITA Tasks

Anatomy of a DITA Task

• Title
• Short description
• Context
• Prerequisite
• Step section
  – Step
    • Command
    • Sub Step
    • Step Info
    • Step Result
    • Step Example
    • Choice and Choice Table
• Example
• Post-requisite
• Result

A DITA Task in easyDITA
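The anatomy above maps onto DITA element names such as `shortdesc`, `prereq`, `steps`, `cmd`, `stepresult`, and `postreq`. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module, of how such a task could be assembled programmatically. The task content, the `id` value, and the `build_task` helper are hypothetical illustrations, not taken from the webcast or from easyDITA.

```python
import xml.etree.ElementTree as ET


def build_task() -> ET.Element:
    """Assemble a small DITA task; all text values are made-up sample content."""
    task = ET.Element("task", id="clear_paper_jam")  # hypothetical id
    ET.SubElement(task, "title").text = "Clearing a paper jam"
    ET.SubElement(task, "shortdesc").text = "Remove jammed paper from the copier."

    body = ET.SubElement(task, "taskbody")
    ET.SubElement(body, "prereq").text = "Turn the copier off."
    ET.SubElement(body, "context").text = "Paper can jam near the envelope tray."

    # The step section holds one or more steps, each with a command.
    steps = ET.SubElement(body, "steps")
    step = ET.SubElement(steps, "step")
    ET.SubElement(step, "cmd").text = "Open the front door."
    ET.SubElement(step, "info").text = "The door is on the left side."
    ET.SubElement(step, "stepresult").text = "The paper path is visible."

    ET.SubElement(body, "result").text = "The jam indicator turns off."
    ET.SubElement(body, "postreq").text = "Turn the copier back on."
    return task


task = build_task()
print(ET.tostring(task, encoding="unicode"))
```

Because every part of the task has its own element, downstream tools can address the command, the step result, or the prerequisite individually rather than parsing free-form paragraphs.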

Page 6: Taxonomy 101: Classifying DITA Tasks

DITA Tasks are semantic

• DITA tasks are inherently semantic
  – Not simple ordered lists
  – Not simple paragraphs
• This is useful for
  – Dynamic rendition, e.g.
    • Expand / collapse steps
    • Interactive UI controls
  – Semantic Search in the context of the structure, e.g.
    • Find STEPS that contain MENU CASCADES
    • Find STEP INFORMATION that contains IMAGES tagged with [text]
    • Find PREREQUISITES that contain [text]
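To make the semantic search idea concrete, here is a small sketch, assuming Python's standard `xml.etree.ElementTree` module, that searches within the task structure instead of across plain text. The sample task and the two helper functions (`steps_with_menu_cascades`, `prereqs_containing`) are hypothetical; a CMS or search engine would implement this differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample task used only for illustration.
SAMPLE_TASK = """<task id="publish_map">
  <title>Publishing a map</title>
  <taskbody>
    <prereq>Install the DITA Open Toolkit.</prereq>
    <steps>
      <step><cmd>Choose <menucascade><uicontrol>File</uicontrol><uicontrol>Publish</uicontrol></menucascade>.</cmd></step>
      <step><cmd>Select PDF as the output format.</cmd></step>
    </steps>
  </taskbody>
</task>"""


def steps_with_menu_cascades(root):
    """Find STEPS that contain MENU CASCADES."""
    return [s for s in root.iter("step") if s.find(".//menucascade") is not None]


def prereqs_containing(root, text):
    """Find PREREQUISITES whose text contains the given string (case-insensitive)."""
    needle = text.lower()
    return [p for p in root.iter("prereq") if needle in "".join(p.itertext()).lower()]


root = ET.fromstring(SAMPLE_TASK)
print(len(steps_with_menu_cascades(root)), "step(s) contain a menu cascade")
print(len(prereqs_containing(root, "open toolkit")), "prerequisite(s) mention the toolkit")
```

Neither query would be possible if the task were stored as a simple ordered list, because the search relies on the element names to know which part of the task it is looking at.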

Page 7: Taxonomy 101: Classifying DITA Tasks

Making tasks more findable with metadata

• Q: How can we make content even more findable
  – For authors and content managers?
  – For end users in a dynamic delivery system?
• A: Tag tasks with semantic metadata
  – Semantic = “meaning”
  – Metadata can be set with terms from controlled vocabularies defined and managed in a taxonomy

Page 8: Taxonomy 101: Classifying DITA Tasks

What is Metadata?

• Literally “Data about the data”
• Also known as “tags”
  – Not to be confused with the content itself (e.g. XML structure)
  – Can be embedded in a file (e.g. the DITA Prolog or attributes; JPEG image data) or associated in a CMS
• Two main flavors:
  – Administrative metadata
    • e.g. Content Type, Author, Date Modified, Version, Title, etc.
    • Usually system-generated
    • What the content is
  – Descriptive metadata
    • Subject classification, keywords, etc.
    • Usually manually authored
    • What the content is about

Page 9: Taxonomy 101: Classifying DITA Tasks

Key Concept: Taxonomy

taxonomy n. A categorization scheme for concepts, often hierarchical
• Most often, taxonomies show “is a” relationships, e.g. a mammal is a vertebrate, a rodent is a mammal, etc.
• Navigation up and down the tree yields broader term (BT) and narrower term (NT) classifications
  – Can be used to adjust search scope
• Can also show related terms (RT)
  – Can be used to suggest related searches / “see also”
• Can manage synonyms (UF, “Used For”)
  – Can be used to find content when search terms are not the preferred terms
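The relationships on this slide can be sketched as a small data structure. The following is an illustrative sketch only, assuming plain Python with no taxonomy library: the `Term` class, the `TAXONOMY` dictionary, and the helper functions are hypothetical names, and the "Pest control" and "Rodentia" entries are invented sample data that extend the mammal example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    name: str                                           # preferred term
    broader: Optional[str] = None                       # BT: parent in the hierarchy
    related: List[str] = field(default_factory=list)    # RT: "see also"
    used_for: List[str] = field(default_factory=list)   # UF: non-preferred synonyms


TAXONOMY: Dict[str, Term] = {
    t.name: t
    for t in [
        Term("Vertebrates"),
        Term("Mammals", broader="Vertebrates"),
        Term("Rodents", broader="Mammals", related=["Pest control"], used_for=["Rodentia"]),
        Term("Mice", broader="Rodents", used_for=["Mouse"]),
        Term("Pest control"),
    ]
}


def preferred(query: str) -> Optional[str]:
    """Map a search term (preferred or synonym) to its preferred term."""
    q = query.casefold()
    for term in TAXONOMY.values():
        if q == term.name.casefold() or q in (s.casefold() for s in term.used_for):
            return term.name
    return None


def narrower(name: str) -> List[str]:
    """Collect all narrower terms (NT) below the given preferred term."""
    found: List[str] = []
    for term in TAXONOMY.values():
        if term.broader == name:
            found.append(term.name)
            found.extend(narrower(term.name))
    return found


def expand_search(query: str) -> List[str]:
    """Widen a search to the preferred term plus everything narrower."""
    name = preferred(query)
    return [] if name is None else [name] + narrower(name)


print(expand_search("Rodentia"))        # synonym resolved, then scope widened
print(TAXONOMY["Rodents"].related)      # candidates for a "see also" list
```

A search for the non-preferred term "Rodentia" resolves to "Rodents" and then picks up content tagged with the narrower term "Mice", which is the behavior the slide describes for adjusting search scope and handling synonyms.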

Page 10: Taxonomy 101: Classifying DITA Tasks

Using Taxonomy for controlled vocabularies

• A taxonomy is the “source of truth” for what terms to use for various concepts, so terms are consistent.
• Taxonomy terms can be used as controlled vocabularies (“pick-lists”) for metadata, so authors simply select preferred terms
  – Avoids typos, duplicates, word form variations, use of non-preferred terms
• Some content management systems enable controlled vocabularies from taxonomies to be used for setting attribute values in DITA (e.g. select-atts such as audience, product, platform, etc.).
• Relationships between terms in a taxonomy can improve search
  – CMS search and site search indexing tools can use equivalent and related terms to find content that does not contain the search term
  – Relationships between terms can be expressed as RDF in HTML content for improving web search indexing

Page 11: Taxonomy 101: Classifying DITA Tasks

Simple framework for tagging tasks

• In any industry, we’re all trying to help people do something to something in a context:
  – Who is doing what to what (+ other important context or condition)
• Examples
  – Junior Service Technician doing preventive maintenance on Acme Jetpack XR7 that uses nitrous oxide injection technology
  – Casual User clearing paper jam on MFD100 Copier with envelope tray option
  – Case Worker performing an intake interview for a recently unemployed person in New York State
  – Intermediate User publishing a DITA Map using DITA OT to PDF format
  – Financial analyst calculating a WACC for a publicly traded company located in a country using GAAP accounting
  – Registered Nurse administering medication to patient in the ICU and drug is a controlled substance
  – Contract Service Technician doing diagnosis on P1000 Printer showing missing sections of the printed image

Page 12: Taxonomy 101: Classifying DITA Tasks

What metadata do you need?

Information about the Performer, Activity, Object, and Context will help narrow search results for a user or author (see our blog post on Metadata 101: A Search First Approach)
• Performer metadata:
  – Types of users (roles, experience, education level, etc.)
  – Types of employees (title, training, certifications, clearance, department, skill level, etc.)
  – Types of customers
• Activity metadata:
  – Broad Task Types (e.g. for service: maintenance, diagnosis, repair, calibration, startup, etc.)
  – High Level Task names from a performance analysis / instructional design
  – Competencies from a model
  – Commercial Services listing

Page 13: Taxonomy 101: Classifying DITA Tasks

What metadata do you need?

Information about the Performer, Activity, Object, and Context will help narrow search results for a user or author (see our blog post on Metadata 101: A Search First Approach)
• Object (i.e. “To what / to whom”) metadata:
  – Things: Product, product components, product subsystems
  – People: Types of customers or clients
• Context metadata:
  – Market / locale
  – Product options
  – Technologies
  – Special situations
  – Tools required
  – Security classification
  – Symptoms / Fault codes
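As a rough illustration of tagging with these four facets, the sketch below checks a task's tags against controlled vocabularies ("pick-lists"). It is a hedged example, not the webcast's method or any CMS API: the `VOCABULARIES` contents are drawn loosely from the examples on the earlier slide, and `invalid_terms` is a hypothetical helper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical pick-lists, one per facet of the Performer / Activity / Object / Context framework.
VOCABULARIES: Dict[str, Set[str]] = {
    "performer": {"Junior Service Technician", "Casual User", "Registered Nurse"},
    "activity": {"Preventive maintenance", "Clearing paper jam", "Administering medication"},
    "object": {"Acme Jetpack XR7", "MFD100 Copier", "Patient"},
    "context": {"Nitrous oxide injection", "Envelope tray option", "ICU", "Controlled substance"},
}


def invalid_terms(tags: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (facet, value) pairs that are not in the controlled vocabulary."""
    problems: List[Tuple[str, str]] = []
    for facet, values in tags.items():
        allowed = VOCABULARIES.get(facet, set())
        problems.extend((facet, value) for value in values if value not in allowed)
    return problems


good_tags = {
    "performer": ["Casual User"],
    "activity": ["Clearing paper jam"],
    "object": ["MFD100 Copier"],
    "context": ["Envelope tray option"],
}
bad_tags = {
    "performer": ["Jr. Service Tech"],  # word-form variation of a preferred term
    "activity": ["Preventive maintenance"],
    "object": ["Acme Jetpack XR7"],
    "context": ["Nitrous oxide injection"],
}

print(invalid_terms(good_tags))
print(invalid_terms(bad_tags))
```

The second task is flagged because "Jr. Service Tech" is a variation of the preferred term, which is exactly the kind of inconsistency a pick-list is meant to prevent.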

Page 14: Taxonomy 101: Classifying DITA Tasks

Do we have to create these terms from scratch?

No! You are surrounded by free sources for term lists, many of which are governed and authoritative. Don’t reinvent; borrow! Here are some common sources of terms:
• Corporate ECM or Web taxonomy (from IT or marketing)
• Industry-specific taxonomies (e.g. MeSH for life sciences, DSM for mental health)
• Government taxonomies (e.g. UK IPSV - Integrated Public Sector Vocabulary)
• Generic public domain taxonomies (e.g. People, Places, and Cultures; AP News)
• Other corporate sources:
  – Training group (competency models, task analyses)
  – HR (Job codes and Job Titles)
  – Support / field service systems (Parts, fault classifications, failure modes, tools used)
  – CRM data (Customer names, Customer categories, SKUs, Products & Services)
  – Product data (Product BOMs, platforms, parts, subsystems, options)
  – Organization Charts (Divisions, departments, locations, budget centers)
  – Business Process Analysis (process names and steps, inputs and outputs)

Page 15: Taxonomy 101: Classifying DITA Tasks

Taxonomy Tools

• You can build and manage a simple taxonomy in Microsoft Excel
• Even if authors manually tag metadata, the Excel taxonomy can be a useful guide and source of terms to copy/paste
• Each row is a term and each column is a level in the hierarchy
• Put other data required for related and equivalent terms in columns to the right of the preferred term hierarchy
• Add a column for scope notes
• Use Grouping to help expand / collapse sections of a long taxonomy
• If you have a CMS or other tool that consumes taxonomy, you can export a CSV file from Excel and import it to the CMS (see Mary Garcia’s excellent blog posts at TaxoDiary.com to learn how)
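The spreadsheet layout described above (one term per row, one column per hierarchy level, extra columns to the right) can be read back from an exported CSV with a few lines of code. This is a sketch under stated assumptions: the column headings ("Level 1", "Equivalent Terms", "Scope Note"), the semicolon separator for equivalent terms, and the sample rows are all hypothetical, and a real CMS import format may differ.

```python
import csv
import io

LEVEL_COLUMNS = ["Level 1", "Level 2", "Level 3"]  # assumed column headings

# Hypothetical CSV export: each term sits in the column for its hierarchy level.
SAMPLE_CSV = """\
Level 1,Level 2,Level 3,Equivalent Terms,Scope Note
Products,,,,
,Copiers,,Photocopiers,
,,MFD100 Copier,,Multifunction device
,Printers,,,
,,P1000 Printer,,
"""


def parse_taxonomy(csv_text):
    """Turn level-per-column rows into term records with a broader-term link."""
    terms = []
    path = []  # preferred terms from the top level down to the current row
    for row in csv.DictReader(io.StringIO(csv_text)):
        for depth, column in enumerate(LEVEL_COLUMNS):
            name = (row.get(column) or "").strip()
            if name:
                break
        else:
            continue  # row has no term in any level column
        path = path[:depth] + [name]
        equivalents = (row.get("Equivalent Terms") or "").split(";")
        terms.append({
            "term": name,
            "broader": path[-2] if len(path) > 1 else None,
            "equivalents": [e.strip() for e in equivalents if e.strip()],
            "scope_note": (row.get("Scope Note") or "").strip(),
        })
    return terms


terms = parse_taxonomy(SAMPLE_CSV)
for t in terms:
    print(t["term"], "<-", t["broader"])
```

Keeping the hierarchy in position-based columns makes the sheet easy to scan and group in Excel, while a parser like this recovers the parent of each term from the rows above it.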

Page 16: Taxonomy 101: Classifying DITA Tasks

Taxonomy Tools

• Consider using a Taxonomy Management System if:
  – You have a large taxonomy (over 500 terms)
  – The taxonomy changes often
  – You have a complex governance process for approving new terms
  – The taxonomy needs to be consumed by more than one system
  – You are using term relationships to improve search indexing

Page 17: Taxonomy 101: Classifying DITA Tasks

Guidelines for taxonomy quality

• The hierarchy should reflect any of three relationships:
  – Generic (e.g. Vehicle → Car)
  – Instance (e.g. Mountain regions → Rockies)
  – Whole-Part (e.g. House → Roof)
• Terms should be nouns or noun phrases.
• Activities should be nouns or gerunds.
• Avoid adjectives and prepositions unless integral to the term.
• When in doubt about singular vs. plural, choose plural; these are categories. Singular is OK for instances at the narrow end.
• Named entities should be proper nouns.
• Avoid punctuation and ampersands. Eliminate hyphens except where the term is confusing or unclear without them.
• Make the most commonly used term the preferred term, even if it is an acronym (e.g. NASA). Make other forms Equivalent Terms.
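Some of these guidelines are mechanical enough to check automatically. The sketch below is one possible approach, assuming plain Python: `lint_term` is a hypothetical helper that covers only the punctuation, ampersand, hyphen, and whitespace rules. Rules about nouns, gerunds, and plurals need human judgment and are not attempted here.

```python
import string


def lint_term(term):
    """Return a list of warnings for one candidate taxonomy term."""
    warnings = []
    if "&" in term:
        warnings.append("contains an ampersand")
    # Hyphens and ampersands get their own messages, so exclude them here.
    other_punctuation = set(string.punctuation) - {"&", "-"}
    if any(ch in other_punctuation for ch in term):
        warnings.append("contains punctuation")
    if "-" in term:
        warnings.append("contains a hyphen; keep only if the term is unclear without it")
    if term != " ".join(term.split()):
        warnings.append("has irregular whitespace")
    return warnings


for candidate in ["Copiers", "Parts & Tools", "Printers (laser)", "Whole-Part"]:
    print(candidate, "->", lint_term(candidate))
```

A check like this is best used as a reviewer's aid before new terms are approved, since a flagged hyphen or punctuation mark may still be integral to the term.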

Page 18: Taxonomy 101: Classifying DITA Tasks

Poll:

Are you currently using controlled vocabularies for any of the following?
• CMS Metadata
• DITA Attributes
• Prolog Metadata and Keywords
• Other
• Not using controlled vocabularies

Page 19: Taxonomy 101: Classifying DITA Tasks

Resources

• LinkedIn Taxonomy Community of Practice
• ANSI/NISO Z39.19-2005 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
• IBM Presentation: Writing Effective DITA Task Topics
  – http://svdig.ditamap.com/DITATaskTopics_090310SR.ppt
• TaxoDiary blog posts by Mary Garcia: Maintaining a Thesaurus in an Excel Workbook (two parts)
  – http://taxodiary.com/2012/04/maintaining-a-thesaurus-in-an-excel-workbook/
  – http://taxodiary.com/2012/05/maintaining-a-thesaurus-in-an-excel-workbook-part-2/
• easyDITA blog posts and Twitter
  – easyDITA.com/blog and @easydita

Page 20: Taxonomy 101: Classifying DITA Tasks

Thank you!

• Questions?
• Recorded webcast will be available soon through our website; you will get an email with the link
• Anyone can register after the event to view the recording
• Slides will be available on SlideShare
  – www.slideshare.net/easydita

• Next webcast July 25, featuring Amber Swope of DITA Strategies discussing Using Taxonomy for DITA Content. Please join us!
