Low-Cost & No-Cost Taxonomy Tools Presented November 2, 2006 in San Jose, CA by Mark Goldstein,...


Page 1:

Low-Cost & No-Cost Taxonomy Tools

Presented November 2, 2006 in San Jose, CA by

Mark Goldstein, International Research Center

PO Box 825, Tempe, AZ 85280-0825, Voice & Fax: 602-470-0389,

[email protected], URL: http://www.researchedge.com/

© 2006 - International Research Center

Page 2:

Low-Cost & No-Cost Taxonomy Tools

Thursday, 11/2/06 from 2:30-3:00 PM

Presented by Mark Goldstein, International Research Center

Taxonomy development and its integration are often seen as part of expensive, complex enterprise toolsets and suites. There are, however, a number of free open source and low-cost commercial tools that enable full taxonomy development and maintenance on more modest budgets. This session covers the availability of existing open source taxonomies, a variety of taxonomy tools for modest budgets, comparisons of their capabilities, and an analysis of their applicability for integration with portal and search applications.

Page 3:

Taxonomy

• From Greek verb τασσεῖν or tassein = "to classify" and νόμος or nomos = law, science, cf. "economy"

• Taxonomy was once only the science of classifying living organisms (alpha taxonomy)

• Later the word was applied in a wider sense, and may also refer to either a classification of things, or the principles underlying the classification

• Almost anything, animate objects, inanimate objects, places, and events, may be classified according to some taxonomic scheme

• Taxonomies, which are comprised of taxonomic units known as taxa (singular taxon), are frequently hierarchical in structure, commonly displaying parent-child relationships

Source: Wikipedia (http://en.wikipedia.org/wiki/Taxonomy)
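
The parent-child structure described in the bullets above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Taxon` class, its method names, and the `build_example` helper are assumptions made for this example and do not come from any tool covered in the presentation. The sample categories are the ones named on the slide.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Taxon:
    """One taxonomic unit (taxon) linked to its parent and its children."""
    name: str
    parent: Optional["Taxon"] = field(default=None, repr=False)
    children: List["Taxon"] = field(default_factory=list, repr=False)

    def add_child(self, name: str) -> "Taxon":
        """Create a child taxon under this one and return it."""
        child = Taxon(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Names from the root down to this taxon."""
        names = []
        node: Optional[Taxon] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Taxon"]]:
        """Depth-first traversal yielding (depth, taxon) pairs."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def build_example() -> Taxon:
    """Hypothetical taxonomy using the categories named on the slide."""
    root = Taxon("Things")
    animate = root.add_child("Animate objects")
    animate.add_child("Living organisms")
    root.add_child("Inanimate objects")
    root.add_child("Places")
    root.add_child("Events")
    return root


if __name__ == "__main__":
    # Print the hierarchy as an indented outline.
    for depth, taxon in build_example().walk():
        print("  " * depth + taxon.name)
```

Keeping a reference to the parent as well as the children makes it cheap to build a breadcrumb-style path for any term, which is the form most portal and search integrations display.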

Page 5:

Interviews/Surveys
Manual Online Submittals
Selective Web Site Indexing
User Feedback & Suggestions
Research Area Knowledge Map
Database Queries, News Feeds & Blogs
Communities of Practice (CoP) Indexing
Research Area Ontology/Taxonomy
Autonomic Semantic Analysis & Meta Tagging
Research Area XML Variant
Manual Review & Meta Tagging
Ontology, Taxonomy & XML Authoring Tools
Data Sources
Information Architecture
Information Architecture Management
Information Processing
Information Warehouse
Research Area Meta Data Repository
Knowledge Visualization, GUI & Navigation

Page 6:

Ontology & Taxonomy Authoring & Maintenance
Text Mining/Semantic Analysis
Knowledge Visualization & Navigation
Meta Tagging & Database Repository
Portal Creation & Maintenance
Communities of Practice (CoP) & Collaboration
Autonomy Corporation (with Portal Overlay)

Page 8:

http://www.w3.org/2004/OWL/

Page 11:

IBM's UIMA-enabled products and services help customers build applications in a variety of solution areas, including:

• Financial
• Government
• Life Sciences
• Aerospace
• Chemical
• Clinical
• Insurance
• Medical
• Healthcare

http://www.research.ibm.com/UIMA/

Page 14:

http://uima.lti.cs.cmu.edu:8080/UCR/Welcome.do

Page 15:

Nstein Launches 12 UIMA Annotators

Nstein Technologies, a leader in text mining and multilingual information access solutions, today announced the launch of 12 annotators, compliant with the open-source Unstructured Information Management Architecture (UIMA) standard developed by IBM. The UIMA standard provides the foundation for new search-related applications that extract hidden meaning from unstructured information. Nstein’s annotators will allow organizations adopting the UIMA standard to considerably expand their content discovery capabilities associated with market intelligence, customer intelligence and early warning applications.

Nstein’s topic-based categorization annotators are now available for the automated tagging of concepts, names of people and organizations, geographic locations, dates and currencies in unstructured documents. Sentiment-based categorization annotators are also available for the tagging of objective and subjective statements in documents, as well as overall negative or positive statements. Other annotators launched by Nstein include fact finding, annotating facts related to human resources movements (hiring, firing, promotion) as well as financial information (mergers and acquisitions, investments, etc.).

Source: Nstein Technologies Inc. 8/21/06 (http://www.nstein.com/)
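
To make the idea of an annotator concrete, the sketch below tags dates and currency amounts in plain text and records where each one occurs. It is a toy illustration only: it does not use the UIMA framework or any Nstein product, the `Annotation` class and `annotate` function are hypothetical names, and the two regular expressions are simplistic assumptions that cover only formats like "8/21/06" and "$1,500.50".

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Annotation:
    """A labeled span of text, identified by character offsets."""
    label: str
    start: int
    end: int
    text: str


# Deliberately simple patterns (assumptions for this sketch, not a standard).
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "currency": re.compile(r"\$\d[\d,]*(?:\.\d+)?"),
}


def annotate(document: str) -> List[Annotation]:
    """Return all date and currency annotations, ordered by position."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(document):
            found.append(
                Annotation(label, match.start(), match.end(), match.group())
            )
    return sorted(found, key=lambda a: a.start)


if __name__ == "__main__":
    sample = "On 8/21/06 the firm invested $1,500.50 in search tools."
    for annotation in annotate(sample):
        print(annotation)
```

Storing offsets rather than modifying the text is what lets several independent annotators run over the same document without interfering with each other.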

Page 20:

To benefit fully from the three core technologies in Semantically Enabled Knowledge Technologies (SEKT), namely Ontology-based Metadata, Human Language Technology and Knowledge Discovery, they must be used together. This convergence is now timely because of the maturing of the three separate disciplines, particularly ontology technology, which has received much attention over the last 2-3 years.

Next Generation Knowledge Management solutions will be built upon ontology-based metadata (OMT) and thus the creation and management of machine-interpretable information and the consequent use of ontologies. The integrated management of ontologies and metadata, especially their generation, mediation and evolution, is fundamental to this development and relies in part on innovative Human Language Technology (HLT) and Knowledge Discovery (KD) methods.

Advanced reasoning capabilities will strongly support the evolution of ontologies and metadata and greatly reduce the overhead for maintenance. Work in advanced reasoning will include the development of techniques for robust reasoning, i.e. reasoning in the presence of inconsistencies, so as to give meaningful results even when the overall ontology has conflicts. It will also include flexible reasoning, which can cope with changes and conflicts in a given model and can fall back to old versions or change the scope of reasoning to a consistent set of statements.

http://www.sekt-project.org/project/

Page 21:

http://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture

Page 22:

http://docs.oasis-open.org/dita/v1.0/archspec/ditaspec.html

Page 23:

http://www.termtree.com.au/

Export to CSV, Metabrowser and XML. API available for custom application development. Records management support (optional) for Document Workbench, TRIM Captura, TRIM Context, Objective, PCDocs/Hummingbird DM, Recfind, Seraph/Vignette and Dataworks.
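
As a rough illustration of what exporting a taxonomy to CSV and XML involves, the sketch below writes the same small term list in both forms using only the Python standard library. It does not reproduce TermTree's actual export formats or API, which are not described here; the column names, the XML element names, and the sample terms are all assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Tuple

# Each entry is (term, broader term); None marks a top-level term.
# Broader terms must be listed before the terms that refer to them.
Terms = List[Tuple[str, Optional[str]]]


def to_csv(terms: Terms) -> str:
    """Flat export: one row per term with its broader term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["term", "broader_term"])
    for term, broader in terms:
        writer.writerow([term, broader or ""])
    return buffer.getvalue()


def to_xml(terms: Terms) -> str:
    """Nested export: narrower terms become child elements."""
    root = ET.Element("taxonomy")
    elements: Dict[str, ET.Element] = {}
    for term, broader in terms:
        parent = root if broader is None else elements[broader]
        elements[term] = ET.SubElement(parent, "term", name=term)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical records-management terms.
    sample: Terms = [
        ("Records", None),
        ("Finance", "Records"),
        ("Invoices", "Finance"),
    ]
    print(to_csv(sample))
    print(to_xml(sample))
```

The CSV form is convenient for spreadsheets and bulk loading, while the nested XML form preserves the hierarchy directly, which is usually what a portal or search application needs for navigation.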

Page 24:

The Cyc Knowledge Server is a very large, multi-contextual knowledge base and inference engine developed to break the "software brittleness bottleneck" by constructing a foundation of basic "common sense" knowledge--a semantic substratum of terms, rules, and relations--that will enable a variety of knowledge-intensive products and services. (http://www.cyc.com/)

The OpenCyc KB Browser

http://www.opencyc.org/

Page 26:

Protégé-Frames Classes Tab

Protégé-OWL OWLViz Extension

http://protege.stanford.edu/

Page 27:

http://ontoworld.org/wiki/Main_Page

Ontology Evaluation Subsite - http://ontoworld.org/wiki/Ontology_evaluation

If ontologies are the foundation of the semantic web, they better be stable.

Ontoworld.org

Page 32:

http://www.factiva.com/content/intindexing.asp

Page 33:

http://www.taxonomywarehouse.com/

Page 37:

http://www.ncbi.nlm.nih.gov/Taxonomy/

U.S. National Library of Medicine (NLM)

National Center for Biotechnology Information (NCBI)

Page 38:

http://www.searchtools.com/info/classifiers.html

http://www.searchtools.com/info/classifiers-tools.html

http://www.searchtools.com/info/visualization.html

Page 40:

Low-Cost & No-Cost Taxonomy Resources Summary

Taxonomy in Wikipedia - http://en.wikipedia.org/wiki/Taxonomy
Taxonomy CoP TaxoTools - http://taxocop.wikispaces.com/TaxoTools
Taxonomy CoP Topic Maps - http://taxocop.wikispaces.com/Topic+Maps
Dublin Core Metadata Initiative (DCMI) - http://dublincore.org/
W3C Web Ontology Language (OWL) - http://www.w3.org/2004/OWL/
SchemaLogic - http://www.schemalogic.com/
IBM Integrated Ontology Development Toolkit - http://www.alphaworks.ibm.com/tech/semanticstk
IBM Unstructured Information Management Architecture (UIMA) - http://www.research.ibm.com/UIMA/
UIMA Component Repository - http://uima.lti.cs.cmu.edu:8080/UCR/Welcome.do
Nstein Technologies - http://www.nstein.com/
FAST Search - http://www.fastsearch.com/
Data Harmony - http://www.dataharmony.com/
Eclipse Modeling Framework (EMF) - http://www.eclipse.org/emf/
Semantically Enabled Knowledge Technologies (SEKT) Project - http://www.sekt-project.org/project/
Darwin Information Typing Architecture (DITA) - http://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture and http://docs.oasis-open.org/dita/v1.0/archspec/ditaspec.html
TermTree - http://www.termtree.com.au/
OpenCyc - http://www.opencyc.org/
MindServer Categorization - http://www.recommind.com/
Protégé Platform - http://protege.stanford.edu/
Ontoworld.org - http://ontoworld.org/wiki/Main_Page
OCLC Terminology Services - http://www.oclc.org/terminologies/
XBRL Financial Reporting Taxonomies - http://www.xbrl.org/FRTaxonomies/
UBmatrix XBRL Designer - http://www.ubmatrix.com/products/products_taxonomy_designer.asp
Autonomy Taxonomies - http://www.autonomy.com/
Wikispecies - http://species.wikimedia.org/
NCBI Taxonomy - http://www.ncbi.nlm.nih.gov/Taxonomy/
Taxonomies Search Tools - http://www.searchtools.com/info/classifiers.html
Factiva Taxonomy Warehouse - http://www.factiva.com/content/intindexing.asp and http://www.taxonomywarehouse.com/