04/19/23 Inf 722 Fall 2007 (Gangolly) 1
Markup Languages and the Semantic Web
Lecture Notes Prepared by
Jagdish S. Gangolly
Interdisciplinary Ph.D. Program in Information Science
State University of New York at Albany
Markup Languages
• Knowledge assumed:
– HTML
• DTD (Document Type Definition)
• Tags
– Format (confusion between format and other tags)
– Structure (too flexible, and so almost useless)
– Content (virtually none)
• Very poor in semantics
• Inability to exploit latent semantics
• Users at the mercy of browsers
• Inflexibility in adding new tags unless blessed by browsers
XML I
• SGML, the forerunner of HTML
– Too complex (the annotated SGML standard runs over 1,000 pages)
– Too flexible
– Little browser support
• XML
– Less complex and yet extensible
– Flexible in expressing semantics
– Browser support
XML II
• Separation of format, content, and structure tags
– Content: Schema
• Rich set of data types
• Easy to understand and implement
– Format: XSL (Extensible Stylesheet Language)
• Complex and no universal browser support
• Such support may not be crucial because of XSLT (XSL Transformations), which can convert XML into HTML
– Structure: Subsumed in content and format
– Representing richer semantics than HTML allowed
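The separation of content from format can be sketched in a few lines of Python. This is only an illustration: it uses the standard library's xml.etree.ElementTree as a stand-in for an XSLT processor (which the Python standard library does not include), and the catalog/product vocabulary and the function name to_html are assumptions, not part of the lecture.

```python
import xml.etree.ElementTree as ET

# Hypothetical content-only XML: the tags say what the data is, not how it looks.
CATALOG_XML = """
<catalog>
  <product number="38267"><name>Water Bottle</name></product>
  <product number="40112"><name>Bike Helmet</name></product>
</catalog>
"""

def to_html(xml_text):
    """Formatting step: turn the content markup into an HTML list."""
    catalog = ET.fromstring(xml_text)
    ul = ET.Element("ul")
    for product in catalog.findall("product"):
        li = ET.SubElement(ul, "li")
        li.text = f"{product.findtext('name')} (no. {product.get('number')})"
    return ET.tostring(ul, encoding="unicode")

html = to_html(CATALOG_XML)
print(html)
```

The XML carries no presentation at all; a different to_html (or a real XSLT stylesheet) could render the same content as a table without touching the data.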
XML III
• Discipline enforced
• The Document Type Definition, required to specify the grammar of HTML and SGML, required programmers to be familiar with one more language (EBNF, Extended Backus-Naur Form) in which DTDs are represented.
• Good browser support
• DOM (Document Object Model), SAX (Simple API for XML), and Namespaces help machines communicate and, to an extent, "understand" each other's data
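A small sketch of how DOM, SAX, and namespaces fit together, using Python's standard xml.dom.minidom and xml.sax modules. The order document, the namespace URI http://example.org/product, and the class name ItemCounter are assumptions made up for this example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document; the "p" prefix is bound to an illustrative namespace.
PRODUCT_NS = "http://example.org/product"
DOC = (
    f'<order xmlns:p="{PRODUCT_NS}">'
    "<p:item>Water Bottle</p:item><p:item>Helmet</p:item>"
    "</order>"
)

# DOM: the whole document is loaded into memory as a tree that can be queried.
dom = minidom.parseString(DOC)
dom_items = [
    node.firstChild.data
    for node in dom.getElementsByTagNameNS(PRODUCT_NS, "item")
]

# SAX: the parser streams events to a handler; no tree is built.
class ItemCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # With namespace processing off (the default), name is the raw "p:item".
        if name == "p:item":
            self.count += 1

handler = ItemCounter()
xml.sax.parseString(DOC.encode("utf-8"), handler)

print(dom_items, handler.count)
```

DOM suits documents small enough to hold in memory and query repeatedly; SAX suits large streams where a single pass is enough.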
Semantic Web
• The Semantic Web "... is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale." (http://infomesh.net/2001/swintro/)
Motivation
• Need for interchangeability of information (information sharing)
• Need for interchangeability, translatability, uniformity of ontologies
• Need for improving precision in retrieval
• Need for web services based on understanding of data as well as metadata
Semantic Web Components
– Data
• Structure
• Content
• Format
• Ontology
– Metadata
• Representation languages
• Facility for metadata interchange
Data
• Data (Semi-structured as well as structured)
• Structure Tags: XML-Schema
• Content Tags: XML-Schema
• Ontology: Ontology representation languages
Metadata I
• Representation languages based on First Order Logic
• KIF-based Ontolingua (http://www.ksl.stanford.edu/software/ontolingua/)
• Loom (http://www.isi.edu/isd/LOOM/LOOM-HOME.html)
• Frame-Logic (http://www.cs.sunysb.edu/~kifer/dood/papers.html)
Metadata II
• Languages using standardised syntax
– Simple HTML Ontology Extensions (SHOE) (http://www.cs.umd.edu/projects/plus/SHOE/)
– XOL Ontology Exchange Language (XOL) (http://www.ai.sri.com/pkarp/xol/)
– Ontology Markup Language (OML and CKML)
– Resource Description Framework Schema Language (RDFS) (http://www.w3.org/TR/rdf-schema/)
– RiboWEB (http://www-smi.stanford.edu/projects/helix/riboweb/kb-pub.html)
Metadata III
– OIL (Ontology Interchange Language) (http://www.ontoknowledge.org/oil/)
– DAML+OIL (http://www.daml.org)
– XFML+CAMEL (eXchangeable Faceted Metadata Language + Compound term composition Algebraically-Motivated Expression Language) (http://www.csi.forth.gr/~tzitzik/XFML+CAMEL/)
• Good sources of information:
– http://www.cs.umd.edu/users/hendler/sciam/walkthru.html
– http://www.w3.org/2001/sw/
Dublin Core
• Metadata Elements (ISO 15836:2003)
– Title
– Creator
– Subject
– Description
– Publisher
– Contributor
– Date
– Type
– Format
– Identifier
– Source
– Language
– Relation
– Coverage
– Rights
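As a rough sketch, a Dublin Core record can be written as XML elements in the Dublin Core element namespace (http://purl.org/dc/elements/1.1/). The wrapper element name metadata, the function name dublin_core_record, and the type and language values below are assumptions chosen for illustration; the title and creator come from these notes.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen element names, in the lowercase form used in XML.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def dublin_core_record(fields):
    """Build a <metadata> element holding one dc:* child per field."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    record = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record

record = dublin_core_record({
    "title": "Markup Languages and the Semantic Web",
    "creator": "Jagdish S. Gangolly",
    "type": "Text",      # assumed value
    "language": "en",    # assumed value
})
print(ET.tostring(record, encoding="unicode"))
```

Because every element name is fixed by the standard, two systems that have never exchanged data before can still agree on what dc:creator means.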
RDF (http://www.xml.com/pub/a/2002/01/30/daml1.html)
• XML-based language that allows you to define classes and properties
<rdfs:Class rdf:ID="Product">
  <rdfs:label>Product</rdfs:label>
  <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
</rdfs:Class>
<rdf:Property rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</rdf:Property>
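To show that such a schema is machine-readable, here is a minimal sketch that reads the class and property definitions with Python's standard xml.etree.ElementTree. It is not a full RDF parser. The fragment is wrapped in an rdf:RDF element with namespace declarations so that it parses on its own, and the property is written as rdf:Property, the spelling used by RDF Schema.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

SCHEMA = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Product">
    <rdfs:label>Product</rdfs:label>
  </rdfs:Class>
  <rdf:Property rdf:ID="productNumber">
    <rdfs:label>Product Number</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>"""

root = ET.fromstring(SCHEMA)

# Names of all classes declared in the schema.
classes = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{RDFS}}}Class")]

# For each property: which class it applies to (domain) and what it yields (range).
properties = {}
for prop in root.findall(f"{{{RDF}}}Property"):
    name = prop.get(f"{{{RDF}}}ID")
    domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
    range_ = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
    properties[name] = (domain, range_)

print(classes, properties)
```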
RDF
• "there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is em@w3.org, and whose title is Dr."
RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
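The RDF/XML above encodes a set of subject-predicate-object statements (triples). The following sketch recovers them with Python's standard xml.etree.ElementTree. It handles only the simple shape used here (one typed node with literal or rdf:resource properties) and is not a general RDF/XML parser; the function name triples is an assumption, and the mailbox address is written out as in the W3C RDF Primer example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>"""

def uri(tag):
    """ElementTree writes tags as '{namespace}local'; join them into one URI."""
    return tag[1:].replace("}", "", 1)

def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for each described node."""
    result = []
    for node in ET.fromstring(rdf_xml):
        subject = node.get(f"{{{RDF}}}about")
        # The element name of the node states its class (rdf:type).
        result.append((subject, RDF + "type", uri(node.tag)))
        for prop in node:
            # Object is either a resource reference or the literal text.
            obj = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            result.append((subject, uri(prop.tag), obj))
    return result

for triple in triples(DOC):
    print(triple)
```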
DAML+OIL I (http://www.xml.com/pub/a/2002/01/30/daml1.html)
• DAML+OIL also allows you to define instances of classes and specify their properties
<Product rdf:ID="WaterBottle">
  <rdfs:label>Water Bottle</rdfs:label>
  <productNumber>38267</productNumber>
</Product>
• DAML+OIL allows datatyping
<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
</daml:DatatypeProperty>
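Datatyping lets a machine reject values that do not fit the declared range. The sketch below checks a productNumber value against a simplified reading of nonNegativeInteger (an optional plus sign followed by digits). Namespaces are dropped from the instance for brevity, and the function names is_non_negative_integer and check_product are assumptions, not part of DAML+OIL.

```python
import re
import xml.etree.ElementTree as ET

# Instance from the slide, with namespace prefixes removed for brevity.
INSTANCE = (
    "<Product><label>Water Bottle</label>"
    "<productNumber>38267</productNumber></Product>"
)

def is_non_negative_integer(text):
    """Simplified lexical check: optional '+', then one or more ASCII digits."""
    return re.fullmatch(r"\+?[0-9]+", text.strip()) is not None

def check_product(xml_text):
    """True if the product's productNumber fits the declared datatype."""
    product = ET.fromstring(xml_text)
    return is_non_negative_integer(product.findtext("productNumber", default=""))

print(check_product(INSTANCE))
print(is_non_negative_integer("-5"))
```

With a plain Literal range, as in RDF Schema, a value such as "abc" could not be rejected; the datatype is what makes the check possible.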
DAML+OIL II
• Provides for uniqueness, equivalence, enumerations, disjoint classes, disjoint unions of classes, non-exclusive Boolean combinations of classes, intersection of classes, sub-classing, property restrictions
• Rich enough to model ontologies
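Two of these constructs, sub-classing and disjoint classes, can be illustrated with a toy consistency check in Python. This is a sketch of the idea only, not a DAML+OIL reasoner: the class names, the single-parent hierarchy, and the functions ancestors and consistent are all assumptions, and the hierarchy is assumed to contain no cycles.

```python
# Hypothetical single-parent class hierarchy: child -> parent.
SUBCLASS_OF = {
    "Bottle": "Product",
    "Helmet": "Product",
    "Product": "Thing",
    "Customer": "Thing",
}

# Pairs of classes declared disjoint: nothing may belong to both.
DISJOINT = {frozenset({"Product", "Customer"})}

def ancestors(cls):
    """The class itself plus every class reached by following subclass links."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain

def consistent(types):
    """False if the asserted types imply membership in two disjoint classes."""
    implied = {a for t in types for a in ancestors(t)}
    return not any(pair <= implied for pair in DISJOINT)

print(ancestors("Bottle"))
print(consistent({"Bottle", "Customer"}))
```

An individual typed as both Bottle and Customer is flagged because Bottle implies Product, and Product is disjoint from Customer; inferences of this kind are what an ontology language adds beyond plain RDF.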
Semantic Web Stack of Expressive Power (Berners-Lee)
• URI (Uniform Resource Identifier)
– http://www.ietf.org/rfc/rfc2396.txt
• Unicode
– unicode.org
• XML
– http://www.w3.org/XML/
• RDF
– http://www.w3.org/RDF/
• RDF-S (RDF Schema)
– www.w3.org/TR/2000/CR-rdf-schema-20000327/
• SPARQL
– www.w3.org/TR/rdf-sparql-query/
• OWL (Web Ontology Language)
– http://www.w3.org/2004/OWL/
• RIF (Rule Interchange Format)
– http://www.w3.org/TR/rif-core/
• Unifying Logic
• Proof
• Crypto
• Trust
Web Ontology Language (OWL) I
• OWL Lite supports those users primarily needing a classification hierarchy and simple constraints.
• OWL DL supports those users who want the maximum expressiveness while retaining computational completeness (all conclusions are guaranteed to be computed) and decidability (all computations will finish in finite time).
• OWL Full is meant for users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees.
Source: http://www.w3.org/TR/owl-features/
Semantic Web: Readings
• “The Semantic Web In Breadth”, by Aaron Swartz
– http://logicerror.com/semanticWeb-long
• The Semantic Web: An Introduction
– http://infomesh.net/2001/swintro/