Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics...

42
Pragmatics of knowledge Pragmatics of knowledge engineering on the Web engineering on the Web Guus Schreiber Guus Schreiber Free University Amsterdam Free University Amsterdam Co Co - - chair W3C Semantic Web Deployment WG chair W3C Semantic Web Deployment WG

Transcript of Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics...

Page 1: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Pragmatics of knowledge Pragmatics of knowledge engineering on the Webengineering on the Web

Guus SchreiberGuus SchreiberFree University AmsterdamFree University Amsterdam

CoCo--chair W3C Semantic Web Deployment WGchair W3C Semantic Web Deployment WG

Page 2: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Overview

• Principles for ontology engineering on Web scale– Some remarks about web standards

• RDF/OWL conversion issues• SKOS: pragmatics of publishing Web

vocabularies– Context: W3C SWD Working Group

Page 3: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Principles for ontology engineering in a distributed world

Page 4: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

1. Modesty principle

• Ontology engineers should refrain from developing their own idiosyncratic ontologies

• Instead, they should make the available rich vocabularies, thesauri and databases available in web format

• Initially, only add the originally intended semantics

Page 5: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)
Page 6: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

2. Scale principle: “Think large!”

"Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing."

Doug Lenat

Page 7: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Applications require many ontologies

Page 8: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

3. Pattern principle: don’t try to be too creative!

• Ontology engineering should not be an art but a discipline

• Patterns play a key role in methodology for ontology engineering

• See for example patterns developed by the W3C Semantic Web Best Practices group

http://www.w3.org/2001/sw/BestPractices/

Page 9: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

SKOS: pattern for thesaurus modeling

• Based on ISO standard• RDF representation• Documentation:

http://www.w3.org/TR/swbp-skos-core-guide/

• Base class: SKOS Concept

Page 10: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

4. Enrichment principle

• Don’t modify, but add!• Techniques:

– Learning ontology relations/mappings– Semantic analysis, e.g. OntoClean– Processing of scope notes in thesauri

Page 11: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Example enrichment• Learning relations between art styles in AAT

and artists in ULAN through NLP of art0historic texts

• But don’t learn things that already exist!

Page 12: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

DERAIN, AndreThe Turning Road

MATISSE, HenriLe Bonheur de vivre

Page 13: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Extracting additional knowledge from scope notes

Page 14: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Thesauri / vocabularies

• Large bodies of domain-specific knowledge that represent consensus in particular domains

• Typically weak semantic structure• Often lots of implicit semantics available• Representation is typically relational

database and/or XML• Semantic Web Challenge showed that

thesauri are important resources for SW applications

Page 15: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)
Page 16: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

WordNet: internal representation

s(108644031,1,'bed',n,3,2).s(108644031,2,'bottom',n,5,1).

s(102719813,1,'bed',n,1,51).s(102720436,1,'bed',n,2,3).

g(108644031,'(a depression forming the ground under a body of water; "he searched for treasure on the ocean bed")').

g(102719813,'(a piece of furniture that provides a place to sleep; "he sat on the edge of the bed"; "the room had only a bed andchair")').

g(102720436,'(a plot of ground in which plants are growing; "thegardener planted a bed of roses")').

SynsetID Order LexForm Type SenseNum

Page 17: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

synset3rd sense ofBed (noun)

5th sense ofBottom (noun)

Synset108644031

a depression forming the ground under a body of water; "he searched for treasure on the ocean bed

Page 18: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

WordNet URI s

• What URIs should be chosen?– SynSet, WordSense, Word

• URI name: – ID? => difficult for human interpretation– Concatenated unique, human readable

wn:synset-bank-noun-2 First sense in synset denoted by second sense of “bank”

wn:wordsense-bank-noun-1 wn:word-bank

Page 19: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)
Page 20: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

XML fragment of ULAN

<Associative_Relationships><Associative_Relationship><Historic_Flag>NA</Historic_Flag><Relationship_Type>1102/student of

</Relationship_Type><Related_Subject_ID><VP_Subject_ID>500011051</VP_Subject_ID>

</Related_Subject_ID></Associative_Relationship>

</Associative_Relationship>

Page 21: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Conversion issues

• XML and RDF/OWL are inherently different– XML = thesaurus document structure– RDF = thesaurus document content

• Redundant information in XML file<Associative_Relationships><Historic_Flag>NA</Historic_Flag>

• How to represent “student of”?– Subproperty of Associative_Relationship is

probably preferred– Needs to be derived from the data; not part of

schema

Page 22: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

XML fragment of ULAN (2)

<Non-Preferred_Term><Term_Text>Koning, Philips Aertsz. de</Term_Text><Term_ID>1500207734</Term_ID><Display_Order>34</Display_Order><Vernacular>Vernacular</Vernacular>

</Non-Preferred_Term>

Page 23: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Conversion issues

• Do we include all information in the conversion?– Display-order example– Source and revisions information

• Should each term have a URI?• Making language explicit

– “vernacular” means the string is written in the original language

– Multi-linguality is an important issue for thesauri

Page 24: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

SWD goals

• Schema for interoperable RDF/OWL representation of vocabularies – SKOS

• Publication guidelines: – URI management, representation of versions

• Embedding RDF in (X)HTML pages– RDFa

Page 25: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

ISO standard for representing thesauri

• Term– Preferred term (USE)– Non-preferred term (USED FOR)

• Hierarchical relation between terms– Broader/narrower term (BT/NT)

• Generic• Partitive

• Association between terms (RT)

Page 26: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)
Page 27: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Multi-lingual labels for concepts

Page 28: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Semantic relation:broader and narrower

• No subclass semantics assumed!

Page 29: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Semantic relations:related

• Symmetry is issue (OWL use)

Page 30: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Indexing a resource with a SKOS concept

• primarySubject is defined as subproperty

Page 31: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Collections:role-type trees

Page 32: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Adding semantics

• Adding OWL statements• Interpretations of thesaurus relations such as

narrower as subclass-of are often imprecise (but can still be useful)

• Learning relations between thesauri is important form of additional semantics– Example: AAT contains styles; ULAN contains

artists, but there is no link– Availability of this kind of alignment knowledge is

extremely useful

Page 33: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

SKOS semanticsinference rules

• Collection membership rule(?i skos:subject ?x) (?x skos:broader ?y)

-> (?i skos:subject ?y)

• If a painting of Van Gogh has as subjectSunFlowers and if Flowers is a broaderterm of SunFlowers, then Flowers is also the subject of the painting.

Page 34: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

W3C standardization process

• Input: draft specification• Collect use cases• Derive requirements• Create issues list: requirements that cannot be

handled by the draft spec• Propose resolutions for issues• Continuously: ask for public feedback/comments• Get consensus on amended spec• Find two independent implementation for each

feature in the spec

Page 35: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Example issue: relationships between lexical labels

• In draft SKOS spec lexical labels of concepts are represented as datatype properties

• Use cases require relations between labels, e.g. “AAT” is an acronym of “Art & Architecture Thesaurus”

• This is a problem because literals have no URI (so cannot be subject of an RDF property)

• Possible resolutions:– Labels/terms as classes– Relaxing constraints on label property– …..

Page 36: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Recipes for vocabulary URIs

• Simplified rule:– Use “hash" variant” for vocabularies that are

relatively small and require frequent accesshttp://www.w3.org/2004/02/skos/core#Concept

– Use “slash” variant for large vocabularies, where you do not want always the whole vocabulary to be retrieved

http://xmlns.com/foaf/0.1/Person• For more information and other recipes, see:

http://www.w3.org/TR/swbp-vocab-pub/

Page 37: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

More information

Page 38: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Query for WordNet URI returns “concept-bounded description”

Page 39: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

A RDFa sample

Regular HTML

Resulting RDF statements

HTML with RDFa

Page 40: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Adding datatypes and informal representation

Page 41: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Linking to other resources

Regular HTML

HTML with embedded RDF

Page 42: Pragmatics of knowledge engineering on the Webguus/talks/07-stanford.pdf · SKOS semantics inference rules • Collection membership rule (?i skos:subject ?x) (?x skos:broader ?y)

Statements about other resources:photo example