[DSBW Spring 2010] Unit 10: XML and Web And beyond


dsbw 2009/2010 q2

Unit 10: XML and Web and Beyond

- XML
  - DTD, XML Schema
  - XSL, XQuery
- Web Services
  - SOAP, WSDL
  - RESTful Web Services
- Semantic Web
  - Introduction
  - RDF, RDF Schema, OWL, SPARQL

eXtensible Markup Language

“... is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.”
W3 Consortium

XML …
- is not a solution but a tool to build solutions
- is not a language but a meta-language that requires the interoperating applications that use it to adopt clear conventions on how to use it
- is a standardized text format that is used to represent structured information

SGML, XML and their applications

[Figure: SGML and XML are meta-markup languages. HyTime and HTML are markup languages defined as SGML applications; XHTML, SMIL, SOAP and WML are defined as XML applications.]

Well-Formed XML Documents

- The document has exactly one root element
- The root element can be preceded by an optional XML declaration
- Non-empty elements are delimited by both a start-tag and an end-tag
- Empty elements are marked with an empty-element (self-closing) tag
- Tags may be nested but must not overlap
- All attribute values are quoted with either single (') or double (") quotes

<?xml version="1.0" encoding="UTF-8"?>
<address>
  <street>
    <line>123 Pine Rd.</line>
  </street>
  <city name="Lexington"/>
  <state abbrev="SC"/>
  <zip base="19072" plus4=""/>
</address>
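The well-formedness rules above can be checked mechanically with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the three broken documents are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The address document from the slide (bytes, so the encoding declaration is honoured).
ADDRESS = b"""<?xml version="1.0" encoding="UTF-8"?>
<address>
  <street>
    <line>123 Pine Rd.</line>
  </street>
  <city name="Lexington"/>
  <state abbrev="SC"/>
  <zip base="19072" plus4=""/>
</address>"""

# Hypothetical counter-examples, each breaking one rule.
OVERLAPPING_TAGS = b"<address><street><line>123 Pine Rd.</street></line></address>"
UNQUOTED_ATTRIBUTE = b"<city name=Lexington/>"
TWO_ROOTS = b"<city/><state/>"


def is_well_formed(document):
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


for doc in (ADDRESS, OVERLAPPING_TAGS, UNQUOTED_ATTRIBUTE, TWO_ROOTS):
    print(is_well_formed(doc))
```

The parser only reports whether the document is well-formed; it says nothing about whether the elements are the ones an application expects.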

Valid XML Documents

- Are well-formed XML documents
- Are documents that conform to the rules defined by certain schemas
- Schema: defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements.
- Two ways to define a schema:
  - DTD: Document Type Definition
  - XML Schema

DTD Example: Embedded and External Definitions

Embedded definition:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE address [
  <!ELEMENT address (street, city, state, zip)>
  <!ELEMENT street (line+)>
  <!ELEMENT line (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
]>
<address> ... </address>

External definition:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE address SYSTEM "http://dtd.mycompany.com/address.dtd">
<address> ... </address>
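The parsers in Python's standard library are non-validating, so they read a DOCTYPE but do not enforce it. As a rough illustration of what the content model `address (street, city, state, zip)` with `street (line+)` demands, the sketch below hand-checks it with `xml.etree.ElementTree`. The function `conforms_to_address_model` and both sample documents are hypothetical; this is not a DTD validator.

```python
import xml.etree.ElementTree as ET


def conforms_to_address_model(xml_text):
    """Hand-written check of: address (street, city, state, zip), street (line+)."""
    root = ET.fromstring(xml_text)
    if root.tag != "address":
        return False
    # The children must appear exactly once each, in this order.
    if [child.tag for child in root] != ["street", "city", "state", "zip"]:
        return False
    # street must contain one or more line elements and nothing else.
    street_children = [child.tag for child in root.find("street")]
    return len(street_children) >= 1 and all(tag == "line" for tag in street_children)


# Hypothetical instance that follows the content model.
VALID = ("<address><street><line>123 Pine Rd.</line></street>"
         "<city>Lexington</city><state>SC</state><zip>19072</zip></address>")
# Well-formed, but zip is missing and street has no line.
INVALID = "<address><street/><city>Lexington</city><state>SC</state></address>"

print(conforms_to_address_model(VALID))
print(conforms_to_address_model(INVALID))
```

Both samples are well-formed; only the first one is valid with respect to the content model.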

DTD Limitations

- DTD is not integrated with Namespace technology, so users cannot import and reuse code
- DTD does not support data types other than character data
- DTD syntax is not XML compliant
- DTD language constructs are not extensible

XML Schema: Example

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
  <xsd:element name="address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="street">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="line" type="xsd:string" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="city" type="xsd:string"/>
        <xsd:element name="state" type="xsd:string"/>
        <xsd:element name="zip" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Processing XML Documents

- Using a programming language and the SAX API
  - SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as "callbacks" to various methods on a handler object of the user's design
- Using a programming language and the DOM API
  - DOM allows for navigation of the entire document as if it were a tree of "Node" objects representing the document's contents
- Using a transformation engine and a filter
  - XSLT, XQuery, etc.
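To make the difference between the two APIs concrete, here is a small sketch with Python's standard `xml.sax` and `xml.dom.minidom` modules. The handler class `ElementLogger` and the shortened address document are assumptions made for illustration.

```python
import xml.sax
from xml.dom import minidom

# Shortened version of the address document (illustrative).
DOC = b"<address><street><line>123 Pine Rd.</line></street><city name='Lexington'/></address>"


class ElementLogger(xml.sax.ContentHandler):
    """SAX: the parser calls these methods as it reads the document serially."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


handler = ElementLogger()
xml.sax.parseString(DOC, handler)
print(handler.events)

# DOM: the whole document is loaded as a tree of Node objects that can be navigated.
dom = minidom.parseString(DOC)
city_name = dom.getElementsByTagName("city")[0].getAttribute("name")
line_text = dom.getElementsByTagName("line")[0].firstChild.data
print(city_name, line_text)
```

SAX keeps no tree in memory, which suits large documents read once; DOM trades memory for random access to any node.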

XML Uses

- Alternative/complement to HTML
  - XML + CSS, XML + XSL, XHTML
- Declarative application programming/configuration
  - Configuration files, descriptors, etc.
- Data exchange among heterogeneous systems
  - B2B, e-commerce: ebXML
- Data integration from heterogeneous sources
  - Schema mediation
- Data storage and processing
  - XML databases, XQuery (XPath)
- Protocol definition
  - SOAP, WAP, WML, etc.

eXtensible Stylesheet Language: XSL

- XSL serves the dual purpose of
  - transforming XML documents
  - exhibiting control over document rendering
- XSL consists of two parts:
  - XSL Transformations (XSLT):
    - An XML language for transforming XML documents
    - It uses the XML Path Language (XPath) to search and traverse the element hierarchy of XML documents
  - XSL Formatting Objects (XSL-FO):
    - An XML language for specifying the visual formatting of an XML document
    - It is a superset of the CSS functionality designed to support print layouts
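XPath, which XSLT relies on to select nodes, can be tried on its own. Python's standard library has no XSLT engine, but `xml.etree.ElementTree` supports a limited subset of XPath, which is enough for a sketch. The two-book document below is an assumed, shortened sample.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative bibliography.
BIB = ET.fromstring("""
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book year="2000"><title>Data on the Web</title><price>39.95</price></book>
</bib>""")

# Path with an attribute predicate: titles of books published in 1994.
titles_1994 = [t.text for t in BIB.findall("./book[@year='1994']/title")]

# Descendant axis: every title anywhere below the root.
all_titles = [t.text for t in BIB.findall(".//title")]

print(titles_1994)
print(all_titles)
```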

XQuery (XML Query): Example (source)

<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
</bib>

XQuery (XML Query): Example (query)

For each author, retrieve the last and first names as well as the titles of the author's books, ordered by last name, then first name.

<results>
  {
    let $a := doc("http://bstore1.example.com/bib/bib.xml")//author
    for $last in distinct-values($a/last),
        $first in distinct-values($a[last=$last]/first)
    order by $last, $first
    return
      <author>
        <name>
          <last>{ $last }</last>
          <first>{ $first }</first>
        </name>
        {
          for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
          where some $ba in $b/author
                satisfies ($ba/last = $last and $ba/first=$first)
          return $b/title
        }
      </author>
  }
</results>

XQuery (XML Query): Example (result)

<results>
  <author>
    <name>
      <last>Abiteboul</last>
      <first>Serge</first>
    </name>
    <title>Data on the Web</title>
  </author>
  <author>
    <name>
      <last>Stevens</last>
      <first>W.</first>
    </name>
    <title>TCP/IP Illustrated</title>
    <title>Advanced Programming in the Unix environment</title>
  </author>
  <author>
    <name>
      <last>Suciu</last>
      <first>Dan</first>
    </name>
    <title>Data on the Web</title>
  </author>
</results>
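For readers without an XQuery engine at hand, the logic of the query can be imitated in plain Python. The sketch below is not XQuery: it re-implements the same steps (distinct authors, ordering, titles per author) with `xml.etree.ElementTree`, and the function name `titles_by_author` is an assumption. Publisher and price are omitted from the sample because the query does not use them.

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
</bib>"""


def titles_by_author(bib_xml):
    bib = ET.fromstring(bib_xml)
    # Distinct (last, first) pairs, playing the role of the distinct-values() calls.
    authors = {(a.findtext("last"), a.findtext("first")) for a in bib.iter("author")}
    result = []
    for last, first in sorted(authors):  # order by $last, $first
        titles = [
            book.findtext("title")
            for book in bib.findall("book")  # /bib/book, in document order
            if any((a.findtext("last"), a.findtext("first")) == (last, first)
                   for a in book.findall("author"))  # some $ba in $b/author satisfies ...
        ]
        result.append((last, first, titles))
    return result


RESULT = titles_by_author(BIB_XML)
for last, first, titles in RESULT:
    print(last, first, titles)
```

The three tuples correspond one-to-one to the three `<author>` elements in the result slide.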

A Smarter Web Is Possible

- People and communities have data stores and applications to share
- Vision:
  - Expand the Web to include more machine-understandable resources
  - Enable global interoperability between resources you know should be interoperable as well as those you don't yet know should be interoperable
- Key Web technologies:
  - Web Services: Web of Programs
    - Standards for interactions between programs, linked on the Web
    - Easier to expose and use services (and the data they provide)
  - Semantic Web: Web of Data
    - Standards for things, relationships and descriptions, linked on the Web
    - Easier to understand, search for, share, re-use, aggregate and extend information

Web Services

“A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.”
Web Services Glossary, W3C, http://www.w3.org/TR/ws-gloss/

UDDI: Universal Description, Discovery and Integration

Simple Object Access Protocol (SOAP)

- SOAP is a simple XML-based protocol to let applications exchange information over HTTP.
- A SOAP message is an XML document containing the following elements:
  - A required Envelope element that identifies the XML document as a SOAP message
  - An optional Header element that contains header information
  - A required Body element that contains call and response information
  - An optional Fault element that provides information about errors that occurred while processing the message

SOAP Request: Example

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPrice>
      <m:StockName>IBM</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>

SOAP Response: Example

HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse>
      <m:Price>34.5</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>
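Since a SOAP message is just a namespaced XML document, a client can build the request and read the response with an ordinary XML library. The sketch below uses Python's `xml.etree.ElementTree` and performs no network call; the helper names `build_request` and `parse_price` are assumptions, and the namespaces are the ones shown in the example messages.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
STOCK_NS = "http://www.stock.org/stock"
NS = {"soap": SOAP_NS, "m": STOCK_NS}


def build_request(stock_name):
    """Build the Envelope/Body/GetStockPrice structure of the request example."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{STOCK_NS}}}GetStockPrice")
    ET.SubElement(call, f"{{{STOCK_NS}}}StockName").text = stock_name
    return ET.tostring(envelope, encoding="unicode")


def parse_price(response_xml):
    """Extract the Price value from a GetStockPriceResponse message."""
    root = ET.fromstring(response_xml)
    price = root.find("soap:Body/m:GetStockPriceResponse/m:Price", NS)
    return float(price.text)


# Body of the response example, as it would arrive after the HTTP headers.
RESPONSE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse><m:Price>34.5</m:Price></m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""

request_xml = build_request("IBM")
print(request_xml)
print(parse_price(RESPONSE))
```

In a real exchange the request string would be sent as the body of the HTTP POST; toolkits usually generate this plumbing from the WSDL description.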

Web Services Description Language (WSDL)

A WSDL document describes a web service using these major elements:
- <portType>: The operations performed by the web service
- <message>: The messages used by the web service
- <types>: The data types used by the web service
- <binding>: The communication protocols used by the web service

<definitions>
  <types>
    type definition ...
  </types>
  <message>
    message definition ...
  </message>
  <portType>
    port definition ...
  </portType>
  <binding>
    binding definition ...
  </binding>
</definitions>

WSDL Document: Example (fragment)

<message name="getStockPriceRequest">
  <part name="StockName" type="xs:string"/>
</message>

<message name="getStockPriceResponse">
  <part name="Price" type="xs:float"/>
</message>

<portType name="StockMarket">
  <operation name="getStockPrice">
    <input message="getStockPriceRequest"/>
    <output message="getStockPriceResponse"/>
  </operation>
</portType>

RESTful Web Services

- The overhead associated with SOAP makes it impractical in high-traffic scenarios
- Representational State Transfer (REST): architectural style for networked systems based on the following principles:
  - Application state and functionality are abstracted into resources
  - Every resource is uniquely addressable by a URI
  - Client-Server: clients pull resource representations
  - Stateless: each request from client to server must contain all needed information
  - Uniform interface: all resources are accessed with a generic interface (HTTP-based)
  - Interconnected resource representations
  - Layered components: intermediaries, such as proxy servers and cache servers, to improve performance and security

RESTful Web Services (cont.)

- A RESTful web service is a simple web service implemented using HTTP and the principles of REST.
- A RESTful web service is a collection of resources. Its definition comprises:
  - The URI for the web service as a whole (<baseURI>)
  - A URI scheme to address individual resources, e.g. <baseURI>/<ID>
  - The MIME type of the data supported by the web service (JSON, XML)
  - The set of operations supported by the web service using HTTP methods:
    - POST: To create a resource on the server
    - GET: To retrieve the current state of the resource
    - PUT: To change the state of a resource or to update it
    - DELETE: To remove or delete a resource
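The mapping of HTTP methods onto resource operations can be sketched without a web server. The class below is an in-memory stand-in for a collection addressed as `<baseURI>/<ID>`; the class name `ResourceCollection`, the `handle` signature and the returned status codes are assumptions chosen for illustration, not a prescribed design.

```python
import itertools


class ResourceCollection:
    """In-memory stand-in for a RESTful collection of resources."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def handle(self, method, resource_id=None, representation=None):
        """Dispatch one request; returns (status code, payload)."""
        if method == "POST":  # create a resource on the server
            new_id = next(self._ids)
            self._items[new_id] = representation
            return 201, new_id
        if resource_id not in self._items:
            return 404, None
        if method == "GET":  # retrieve the current state
            return 200, self._items[resource_id]
        if method == "PUT":  # change or update the state
            self._items[resource_id] = representation
            return 200, representation
        if method == "DELETE":  # remove the resource
            del self._items[resource_id]
            return 204, None
        return 405, None  # method not supported


stocks = ResourceCollection()
status, stock_id = stocks.handle("POST", representation={"name": "IBM", "price": 34.5})
print(status, stock_id)
print(stocks.handle("GET", stock_id))
print(stocks.handle("PUT", stock_id, {"name": "IBM", "price": 35.0}))
print(stocks.handle("DELETE", stock_id))
print(stocks.handle("GET", stock_id))
```

Each call carries everything needed to process it, which mirrors the stateless principle: the collection keeps resource state, not conversation state.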

RESTful WS: Example (adapted from Wikipedia)

Semantic Web = The Web of Data

“The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help. One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.”

“If HTML and the Web made all the online documents look like one huge book, RDF, schema, and inference languages will make all the data in the world look like one huge database.”

Tim Berners-Lee

The Current Web (1/2)

- Resources:
  - Identified by URIs
  - Untyped
- Links:
  - href, src, ...
  - Limited, non-descriptive
- Users:
  - A lot of information, but its meaning must be interpreted and deduced from the content, as has been done for millennia
- Machines:
  - They don’t understand.

The Current Web (2/2)

- The Public Web
  - The web found when searching and browsing
  - At least 21 billion pages indexed by standard search engines
- The Deep Web
  - Large data repositories that require their own internal searches
  - About 6 trillion documents not indexed by standard search engines
- The Private Web
  - Password-protected sites and data: corporate intranets, private networks, subscription-based services, etc.
  - About 3 trillion documents not indexed by standard search engines

The Semantic Web

- Resources:
  - Globally identified by URIs or locally (blank nodes)
  - Extensible
  - Relational
- Links:
  - Identified by URIs
  - Extensible
  - Relational
- Users:
  - More and better information
- Machines:
  - More processable information (Data Web)

Semantic Web: How?

- Make web resources more accessible to automated processes
- Extend existing rendering markup with semantic markup
  - Metadata (data about data) annotations that describe the content/function of web-accessible resources
- Use ontologies to provide vocabulary for annotations
  - “Formal specification” accessible to machines
- A prerequisite is a standard web ontology language
  - Need to agree on a common syntax before we can share semantics
  - Syntactic web based on standards such as HTTP and HTML

Metadata annotations

The Semantic Web “Acid Test” (by D. Siegel)

- Is it semantic?
  - Are the terms unambiguous and tagged in a royalty-free format, governed by a nonprofit organization, that all software programs can understand?
- Is it on the web?
  - Is it online using a common name space that makes it easily findable?
  - Is it shared among collaborators or companies?
  - Does it use the information already online to get smarter as more people use the system?

Semantic Web: W3C Standards and Tools

- RDF (Resource Description Framework): a simple data model to describe resources and their relationships
- RDF Schema: a language for declaring the basic classes and types used to describe the terms in RDF; it allows defining class hierarchies
- SPARQL: SPARQL Protocol and RDF Query Language
- OWL: Web Ontology Language. Allows enriching the description of properties and classes, including, among others, class disjunction, association cardinality, richer data types, property features (e.g. symmetry), etc.

Resource Description Framework (RDF)

- RDF is a graphical formalism (+ XML syntax + semantics)
  - for representing metadata
  - for describing the semantics of information in a machine-accessible way
- RDF statements are <subject, predicate, object> triples that describe properties of resources:

<Carles,hasColleague,Ernest>

XML representation:

<Description about="some.uri/person/carles_farre">
  <hasColleague resource="some.uri/person/ernest_teniente"/>
</Description>
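The triple model is simple enough to mimic with plain tuples. The sketch below stores statements as `(subject, predicate, object)` and looks them up by pattern, where `None` acts as a wildcard. The `match` helper and the second statement are hypothetical additions for illustration; real RDF stores identify resources by URI.

```python
# Statements as (subject, predicate, object) tuples.
TRIPLES = {
    ("Carles", "hasColleague", "Ernest"),    # the statement from the slide
    ("Ernest", "hasColleague", "Cristina"),  # hypothetical extra statement
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


print(match(TRIPLES, s="Carles"))        # everything stated about Carles
print(match(TRIPLES, p="hasColleague"))  # every hasColleague statement
```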

RDF Schema

- RDF Schema allows you to define vocabulary terms and the relations between those terms
  - it gives “extra meaning” to particular RDF predicates and resources
  - this “extra meaning”, or semantics, specifies how a term should be interpreted

Examples:

<Person,type,Class>
<hasColleague,type,Property>
<Professor,subClassOf,Person>
<Cristina,type,Professor>
<hasColleague,range,Person>
<hasColleague,domain,Person>
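The “extra meaning” shows up as statements that can be derived rather than stated. The following sketch applies three simplified rules to the example triples: subclass membership, domain and range. It is an assumption-laden toy, not a complete RDFS entailment procedure; the function name `rdfs_closure` is hypothetical and the `Carles hasColleague Ernest` statement is borrowed from the RDF slide.

```python
TRIPLES = {
    ("Person", "type", "Class"),
    ("hasColleague", "type", "Property"),
    ("Professor", "subClassOf", "Person"),
    ("Cristina", "type", "Professor"),
    ("hasColleague", "range", "Person"),
    ("hasColleague", "domain", "Person"),
    ("Carles", "hasColleague", "Ernest"),  # statement from the RDF slide
}


def rdfs_closure(triples):
    """Repeat the three rules until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == "type":
                # An instance of a subclass is also an instance of the superclass.
                new |= {(s, "type", sup) for c, q, sup in inferred
                        if q == "subClassOf" and c == o}
            # The domain of a property types its subject, the range types its object.
            new |= {(s, "type", cls) for prop, q, cls in inferred
                    if q == "domain" and prop == p}
            new |= {(o, "type", cls) for prop, q, cls in inferred
                    if q == "range" and prop == p}
        if new <= inferred:
            return inferred
        inferred |= new


CLOSURE = rdfs_closure(TRIPLES)
print(sorted(CLOSURE - TRIPLES))  # only the derived statements
```

None of the derived typings were asserted explicitly; they follow from the schema triples alone.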

Problems with RDFS

- RDFS is too weak to describe resources in sufficient detail
  - No localized range and domain constraints
    - Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
  - No existence/cardinality constraints
    - Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
  - No transitive, inverse or symmetrical properties
    - Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
- Difficult to provide reasoning support
  - No “native” reasoners for non-standard semantics
  - May be possible to reason via FO axiomatization

Web Ontology Language (OWL)

- OWL is RDF(S), adding vocabulary to specify:
  - Relations between classes
  - Cardinality
  - Equality
  - More typing of and characteristics of properties
  - Enumerated classes
- Three species of OWL
  - OWL Full is the union of OWL syntax and RDF
  - OWL DL is restricted to a FOL fragment (≅ SHIQ Description Logic)
  - OWL Lite is an “easier to implement” subset of OWL DL
- OWL DL benefits from many years of DL research
  - Well-defined semantics
  - Formal properties well understood (complexity, decidability)
  - Known reasoning algorithms
  - Implemented systems (highly optimised)

OWL in RDF(S) notation: Example

Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor)

<owl:Class>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Person"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasChild"/>
      <owl:allValuesFrom>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Doctor"/>
            <owl:Restriction>
              <owl:onProperty rdf:resource="#hasChild"/>
              <owl:someValuesFrom rdf:resource="#Doctor"/>
            </owl:Restriction>
          </owl:unionOf>
        </owl:Class>
      </owl:allValuesFrom>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>

SPARQL Protocol And RDF Query Language

- Designed to query collections of triples…
- …and to easily traverse relationships
- Vaguely SQL-like syntax (SELECT, WHERE)
- “Matches graph patterns”

SELECT ?sal
WHERE { emps:e13954 HR:salary ?sal }

SQL vs SPARQL

SQL, over a table:

EMP_ID  NAME  HIRE_DATE   SALARY
13954   Joe   2000-04-14  48000
10335   Mary  1998-11-23  52000
…       …     …           …
04182   Bob   2005-02-10  21750

SELECT hire_date
FROM employees
WHERE salary >= 21750

SPARQL, over triples:

emps:e13954 HR:name 'Joe'
emps:e13954 HR:hire-date 2000-04-14
emps:e13954 HR:salary 48000
emps:e10335 HR:name 'Mary'
emps:e10335 HR:hire-date 1998-11-23
emps:e10335 HR:salary 52000
…

SELECT ?hdate
WHERE
{ ?id HR:salary ?sal .
  ?id HR:hire-date ?hdate .
  FILTER (?sal >= 21750) }
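The correspondence between the two queries can be checked on a toy scale. The sketch below loads the three rows shown on the slide into an in-memory SQLite table and, separately, into a list of triples, then evaluates the SPARQL pattern by joining the two triple patterns on `?id` by hand. It emulates the query logic in Python rather than running a SPARQL engine; the variable names are assumptions and the elided rows are left out.

```python
import sqlite3

# Only the three rows visible on the slide.
ROWS = [
    ("13954", "Joe", "2000-04-14", 48000),
    ("10335", "Mary", "1998-11-23", 52000),
    ("04182", "Bob", "2005-02-10", 21750),
]

# Relational side: the SQL query from the slide.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees (emp_id TEXT, name TEXT, hire_date TEXT, salary INTEGER)")
db.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", ROWS)
sql_result = sorted(
    row[0] for row in db.execute("SELECT hire_date FROM employees WHERE salary >= 21750")
)

# Graph side: one triple per cell, keyed by the employee resource.
TRIPLES = []
for emp_id, name, hire_date, salary in ROWS:
    subject = f"emps:e{emp_id}"
    TRIPLES += [
        (subject, "HR:name", name),
        (subject, "HR:hire-date", hire_date),
        (subject, "HR:salary", salary),
    ]

# { ?id HR:salary ?sal . ?id HR:hire-date ?hdate . FILTER (?sal >= 21750) }
sparql_like_result = sorted(
    hdate
    for id1, p1, sal in TRIPLES if p1 == "HR:salary" and sal >= 21750
    for id2, p2, hdate in TRIPLES if p2 == "HR:hire-date" and id2 == id1
)

print(sql_result)
print(sparql_like_result)
```

What SQL gets for free from the row structure, the graph query expresses as a join on the shared variable `?id`.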

Semantic Web Services

The main aim is to enable highly flexible Web services architectures, where new services can be quickly discovered, orchestrated and composed into workflows, by
- creating a semantic markup of Web services that makes them machine-understandable and use-apparent
- developing an agent technology that exploits this semantic markup to support automated Web service composition and interoperability

[Figure:
Static: WWW (URI, HTML, HTTP) and Semantic Web (RDF, RDF(S), OWL)
Dynamic: Web Services (UDDI, WSDL, SOAP) and Semantic Web Services]

References

- KAPPEL, Gerti et al. Web Engineering. John Wiley & Sons, 2006. Chapter 14.
- SHKLAR, Leon and ROSEN, Rich. Web Application Architecture: Principles, Protocols and Practices, 2nd Edition. John Wiley & Sons, 2009. Chapters 5 and 13.
- SIEGEL, David. Pull: The Power of the Semantic Web to Transform Your Business. Portfolio (Penguin Group), 2009.
- RAY, Kate. Web 3.0 (video). http://vimeo.com/11529540
- www.w3.org
- www.w3schools.com