Terminology Curation with the Semantic MediaWiki
04/18/2007 Terminology and the Semantic MediaWiki
Harold Solbrig, Informatics Architect
Apelon, Inc.
http://www.framleyexaminer.com/
The Original Task
Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to:
• Upper Level Ontological Principles
• ISO TC37 & Related principles
Approach
1. Gather appropriate upper level ontologies (BFO, Dolce Lite, Top Bio, UMLS Semantic Net and OBO Relations Ontology) into a single, readily referenced format
2. Load NCI Thesaurus into same format
3. Multiple parties review, annotate, recommend and categorize
4. Publish, analyze and evaluate results
Solution
By using the Semantic MediaWiki (SMW), we were able to accomplish all of the goals in a (very) reasonable period of time.
Internal Hyperlink to Semantic Net
External Hyperlink to OMIM
Text Rendering of DL “Some” Statement
Curation Status Tracking
Observations, Recommendations
Internal Hyperlink to BFO
Subcategory
Dump in RDF
Discussion
We also discovered that, with some extensions, the SMW could be useful for publishing, annotating and cross-referencing other terminological (and other) resources.
The Question
Could this approach be more generally applied to the task of terminology authoring and curation?
Clinical Terminology
• Centrally curated
• Central to the practice of medicine
– Insurance and reporting
– Regulatory
– Research
– Clinical Practice
– Information Sharing
• ICD-9, CPT-4, SNOMED, …
Clinical Terminology
• Quality and content are important
• Needs central vetting, integration, QA
– Central model doesn’t scale
– Need input from (many) experts
– Need visible, active feedback loop
Terminology Workflow 1995
[Diagram: Controlled Terminology; Curation; Distribution; Books, PDF; Lists and Tables. Numbered steps (1) to (4).]
Terminology Workflow 1995
[Diagram: Controlled Terminology; ‘B’; Curation; Distribution; Books, PDF; Lists and Tables. Numbered steps (1) to (3).]
Terminology Workflow 2008
[Diagram: Controlled Terminology; Curation; Distribution; Common Distribution Model; Online Services. Numbered steps (1) to (5).]
Terminology Workflow 2008
[Diagram: Controlled Terminology; Controlled Terminology B; Curation; Distribution; Common Distribution Model; Online Services. Numbered steps (1) to (5).]
Common Distribution Model
• LexGrid
• (a little bit of…) OWL
– NCI Thesaurus & SNOMED CT
– Still requires LexGrid-like additions
– “Pushing the envelope”
• UMLS RRF
– Although underspecified as a ‘model’
Online Services
• OMG Terminology Query Services
– Not heavily used
– Perceived (incorrectly) as CORBA specific
– Perceived as too complex
– Object oriented and stateful
• ANSI Common Terminology Services
– Being adopted
– Necessary but not sufficient
– Stateless
• CTS-2
– Co-development beginning w/ HL7 & OMG
Online Services
• LexBIG
– LexGrid for the Bio Informatics Grid
– Robust query specification
– Meets many end-user (developers) requirements
• Not simple to implement – it actually adds value
• Not a standard - but will be used to guide CTS-2
Workflow and Feedback
[Diagram: Controlled Terminology; Curation; Distribution; Common Distribution Model; Online Services. Numbered steps (1) to (5).]
The Feedback Component
[Diagram: Curation; Semantic MediaWiki (++); Annotations and Change Requests; Community Review; Distribution; Common Distribution Model; Online Services; Version Staging; Controlled Terminology.]
Wikis
• Community developed
• Collaborative
• “Organic” – to the very core…
• Primary focus (to date) is human consumption
• Traceable, provenance automatically recorded, differences, undo and redo.
MediaWiki
• http://en.wikipedia.org/wiki/Wiki
• Base for Wikipedia and many others…
• Key characteristics
– Web based editing
– Page links
– Categories
– Templates
MediaWiki
• Fully documented using (surprise!) MediaWiki
• Rich mechanisms for discussion, curation, export, etc.
Templates
Sample Template
Parameter
Extension call
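The sample template itself did not survive in this transcript. As a rough illustration of what the "Parameter" callout refers to, the sketch below mimics how MediaWiki fills `{{{name}}}` and `{{{name|default}}}` placeholders in a template body. The `expand` function and the sample template text are hypothetical and written for this note; this is not MediaWiki's parser, and it ignores nested templates and extension calls.

```python
import re

# Matches {{{name}}} and {{{name|default}}} placeholders (no nesting).
PARAM = re.compile(r"\{\{\{([^{}|]+)(?:\|([^{}]*))?\}\}\}")

def expand(template_body, args):
    """Substitute template parameters; hypothetical helper, not MediaWiki code."""
    def substitute(match):
        name, default = match.group(1).strip(), match.group(2)
        if name in args:
            return args[name]
        if default is not None:
            return default
        # An unsupplied parameter without a default is left as written.
        return match.group(0)
    return PARAM.sub(substitute, template_body)

# Illustrative template body (an assumption, not taken from the slides).
body = "'''{{{name}}}''' is a [[Category:{{{kind|Concept}}}]]"
rendered = expand(body, {"name": "Melanoma"})
```

Here `kind` is not supplied, so its default `Concept` is used.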
Semantic MediaWiki
3 Key extensions to MediaWiki
1. Categories == Class
– PageA … [[Category:X]] → PageA rdf:type Category:X
– Category:Y … [[Category:X]] → Category:Y rdfs:subClassOf Category:X
2. Links == Role
– PageA … [[PageB]] → PageA … [[hasPart::PageB]]
3. Attributes == DataProperty
– [[population:=32,154,773]]
– Includes datatypes
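The three mappings on this slide can be sketched as a small extractor that turns annotated wikitext into subject/predicate/object triples. This is only an illustration of the mapping, assuming the 2007-era syntax shown on the slide (`::` for relations, `:=` for attributes). The function name and regular expressions are hypothetical, not part of Semantic MediaWiki, and attribute values are kept as plain strings rather than typed.

```python
import re

# Patterns for the three annotation styles shown on the slide.
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)\]\]")
RELATION = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)\]\]")
ATTRIBUTE = re.compile(r"\[\[([^\]:|]+):=([^\]|]+)\]\]")

def annotations_to_triples(page_title, wikitext):
    """Return (subject, predicate, object) tuples; hypothetical helper."""
    # A category page tagged with a category is a subclass, not an instance.
    is_category = page_title.startswith("Category:")
    triples = []
    for name in CATEGORY.findall(wikitext):
        predicate = "rdfs:subClassOf" if is_category else "rdf:type"
        triples.append((page_title, predicate, "Category:" + name.strip()))
    for relation, target in RELATION.findall(wikitext):
        triples.append((page_title, relation.strip(), target.strip()))
    for attribute, value in ATTRIBUTE.findall(wikitext):
        triples.append((page_title, attribute.strip(), value.strip()))
    return triples

page_triples = annotations_to_triples(
    "PageA",
    "Text [[Category:X]] with [[hasPart::PageB]] and [[population:=32,154,773]]",
)
category_triples = annotations_to_triples("Category:Y", "[[Category:X]]")
```

Plain links such as `[[PageB]]` produce no triple here, which mirrors the slide's point that a link only becomes a role once it is typed.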
Categories and Relations
Attributes
Semantic Rendering
Type (or superClass)
Attribute Value
Relation RDF (!)
Services
• We don’t want to change the terminology content on the fly
Thesaurus Content
Services
[Diagram: Web Service; Transform; XML; CTDE Templates; Display.]
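The stages named on this slide (web service, XML, transform, templates, display) can be sketched as one transform step that turns a service's XML response into a wiki template call, leaving the display to the template. Everything concrete below is an assumption made for illustration: the XML shape, the concept code, the `Concept` template name and the function name are hypothetical and do not come from the actual CTDE templates or any terminology service.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, standing in for a terminology web service response.
SAMPLE_XML = """<concept code="C0001">
  <name>Example Concept</name>
  <property name="definition">An illustrative definition.</property>
  <property name="status">draft</property>
</concept>"""

def xml_to_template_call(xml_text, template="Concept"):
    """Transform service XML into a template call (illustrative only)."""
    root = ET.fromstring(xml_text)
    params = [
        ("code", root.get("code", "")),
        ("name", root.findtext("name", default="")),
    ]
    for prop in root.findall("property"):
        params.append((prop.get("name"), (prop.text or "").strip()))
    # One parameter per line keeps the generated wikitext easy to review.
    lines = ["{{" + template]
    lines += ["|{}={}".format(key, value) for key, value in params]
    lines.append("}}")
    return "\n".join(lines)

wikitext = xml_to_template_call(SAMPLE_XML)
```

Because the wiki only renders what the service returns, the terminology content itself is not edited in place, which is consistent with the earlier "Services" slide. A real transform would also need to escape `|` and `}}` inside values.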
Links to Service
Service Call
Service XML
CTDE Templates
Display
Services
[Diagram: Web Service; Transform; XML; CTDE Templates; Display; DTS, LexBIG, Protégé, …; DCR (!); Application Specific.]
The Next Step
[Diagram: Web Service; Transform; XML; CTDE Templates; Display; DCR (!); Application Specific; ???]
Preliminary Work
How is it Working?
Very well!
Status
• Collaboration on TCDE
– Apelon w/ DTS Server / LexBIG
– Mayo with Protégé Server
• Workflow
– NCI / Apelon
• Identifier resolution / hyperlinks
• Tools
– Semantic Templates
– Ajax search and hyperlinks
– Hover over
– WYSIWYG editing
– …
Questions?
• [email protected]
• http://wiktolog.org
• http://wiki.ontoworld.org/index.php/Semantic_MediaWiki