
Data Services (I): XML Transformation and Query Techniques

Helen Paik

School of Computer Science and Engineering
University of New South Wales

Week 8

H. Paik (CSE, UNSW) XML Week 8 1 / 59


Web services so far

1. WS-* services:

Logical/functional view: a logical view of actual programs, defined in terms of what it does, typically carrying out a business-level operation

Message orientation: a service is formally defined in terms of the messages exchanged between provider agents and requester agents

A service is described by machine-processable metadata.

2. RESTful services:

resources, uniform (HTTP) operations and hyperlinks

multiple representations of a resource

The Web is the largest distributed application ever created; services should be designed to respect the Web architecture itself

VS.


Data as Services

Data is everywhere, but much of it is still locked behind applications

Especially in an enterprise environment, data is stored in multiple systems, and any potential data consumer client needs to handle multiple interfaces or mechanisms to interact with them (... again, the heterogeneity issue).

Data services focus on providing “uniform access” to data for their clients. Data Services == Data Access as a Service

Let us expose data so that it is easily accessed over simple access interfaces (bypassing the application logic layer)

A new way of ’thinking’ about data integration and interoperability across a broad range of data consumers


XML and Data Services

Just like Web services are built around ’agreed standards’, data services should be built around standards ... However, there is no consensus yet

Data services can be implemented either as a WS-* service or a RESTful service (the focus is on exposing data to data consumers)

There are no widely used “standards” for data services yet, but the core implementation relies on the following XML technologies: XML Schema (Web feeds), XPath, XQuery (XML querying), XSLT (XML transformation)

For data services, we will learn XPath, XSLT and XQuery as “core data services enabling technologies”. Then we discuss some of the data service design/implementation options


Part I

XPath


XPath is a query language for XML documents. It appears as the value of attributes in other XML languages (e.g., XSLT, Schema, BPEL).

Example 1: Consider this bunch of lonely office dwellers.


XPath Node and Axis

To understand XPath, we must understand the concepts of an XPath axis and the context node. An axis is a particular direction through an XML document. The direction is determined from the context node (== “current” node, abbreviated CTX below). Based on this, XPath defines 13 axes.

1. self – the context node

2. child – the children of the CTX

3. parent – the parent of the CTX

4. ancestor – the ancestors of CTX

5. ancestor-or-self – the ancestors + self

6. descendant – the descendants of the CTX

7. descendant-or-self – the descendants + self

8. following-sibling – the siblings that come after the CTX

9. preceding-sibling – the siblings that come before the CTX

10. following – all nodes that come after the CTX (minus any descendants, attribute and namespace nodes)

11. preceding – all nodes that come before the CTX (minus any ancestors, attribute and namespace nodes)

12. attribute – the attribute nodes of the CTX

13. namespace – the namespace nodes of the CTX
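A few of the axes listed above can be sketched by hand in Python. This is only an illustration: the office document is a hypothetical stand-in (the document of Example 1 is not reproduced here), the helper names are invented, and ElementTree models element nodes only, so the document node, text nodes and namespace nodes are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical office document, used only for illustration.
ROOT = ET.fromstring(
    '<office>'
    '<person name="Sue"><age>29</age><phone>56789</phone></person>'
    '<person name="Tom"><age>41</age></person>'
    '</office>'
)

# ElementTree has no parent pointers, so build a child -> parent map once.
PARENT = {child: parent for parent in ROOT.iter() for child in parent}


def ancestors(node):
    """ancestor axis (elements only): parent, grandparent, ... up to the top element."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def descendants(node):
    """descendant axis (elements only), in document order."""
    return [n for n in node.iter() if n is not node]


def following_siblings(node):
    """following-sibling axis: siblings that come after the context node."""
    parent = PARENT.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


# Take Sue's <age> element as the context node (CTX).
SUE_AGE = ROOT.find("person/age")
print([n.tag for n in ancestors(SUE_AGE)])
print([n.tag for n in following_siblings(SUE_AGE)])
```

Each function takes the context node as its argument, which mirrors how an axis is always evaluated relative to the CTX.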


XPath Node Tests

XPath defines several node tests. A name test is true if the name of a node matches the name specified in the test.

e.g., child::person (i.e., is the child node a person?)

e.g., attribute::name (i.e., is the attribute node a name?)

text(): selects all the text-node children of the context node.

comment(): selects all the comment-node children of the context node.

node(): is true for all nodes, regardless of type. ’*’: is true for every node of the axis's principal node type, i.e., all element nodes on most axes and all attribute nodes on the attribute axis.


Building an XPath: Location Steps

A location Step ::= AxisSpecifier NodeTest Predicate*

Think of each step as a pipeline of “three filters”.

First filter: Choose an axis

AxisSpecifier: Axis name followed by ’::’

Eg., child::, descendant-or-self::, parent::

Second filter: Choose nodes

NodeTest: a node name, a node type test such as text() or node(), or ’*’

E.g., child::office, descendant-or-self::person, attribute::title, child::*

Third filter: Refine the choice

Predicate: contains a boolean expression (which could be a function call), written in square brackets

E.g., [child::room="B501"], [attribute::born<1976]
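The “three filters” view of a location step can be sketched in Python as three small functions applied in turn. This is a minimal analogue under stated assumptions, not how an XPath engine is implemented: the office document and the helper names (child_axis, name_test, location_step) are hypothetical, only the child axis is modelled, and positional predicates such as [1] or [last()] are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the office document (not the original Example 1).
OFFICE = ET.fromstring(
    '<office>'
    '<person name="Sue" born="1975"><room>B501</room><age>29</age></person>'
    '<person name="Tom" born="1980"><room>B502</room><age>24</age></person>'
    '<desk id="d1"/>'
    '</office>'
)


def child_axis(context):
    """First filter: the child axis yields the element children of the context node."""
    return list(context)


def name_test(name):
    """Second filter: keep nodes whose name matches; '*' keeps every element."""
    return lambda node: name == "*" or node.tag == name


def location_step(context, axis, node_test, predicates=()):
    """Apply axis, node test and predicates in turn, like one location step."""
    nodes = [n for n in axis(context) if node_test(n)]
    for predicate in predicates:  # third filter, applied zero or more times
        nodes = [n for n in nodes if predicate(n)]
    return nodes


# child::person[child::room="B501"]
in_b501 = location_step(OFFICE, child_axis, name_test("person"),
                        [lambda p: p.findtext("room") == "B501"])

# child::person[attribute::born<1976]
born_before_1976 = location_step(OFFICE, child_axis, name_test("person"),
                                 [lambda p: float(p.get("born")) < 1976])

# child::*
all_children = location_step(OFFICE, child_axis, name_test("*"))

print([p.get("name") for p in in_b501])
```

Because every step returns a plain list of nodes, a location path can be modelled by feeding each resulting node back in as the context of the next step.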


Location Steps (Cont.)

Here are some complete location steps:

parent::person[attribute::name='Sue']

child::person[last()]

child::person[1]

child::name[child::phone='56789']


Building an XPath: Location Path

A Location ’Path’ consists of location steps (separated by ’/’)

Relative location paths – start from context node

Step1 / Step2 / Step3 / ... / StepN

child::office/child::person

child::office/child::person/child::grade

child::person[attribute::name='Sue']/child::age

Absolute location paths – start from doc root ’/’

/ Step1 / Step2 / Step3 / ... / StepN

/child::office

/child::office/child::person

/child::office/child::person/child::grade

/descendant::person[attribute::name='Sue']/child::age


XPath Examples

Example 2: Identify all person elements.

Example 3: Identify all person elements with grade 2.

Example 4: Identify Sue’s age.

Example 5: Identify all phone elements.

Example 6: Identify person’s names with age 29.

http://chris.photobooks.com/xml/default.htm (visualiser)


XPath Abbreviated Syntax

Shorthands

// is short for /descendant-or-self::node()/

. is short for self::node()

.. is short for parent::node()

@ is short for attribute::

’nothing’ (i.e., empty) is short for child::

Long XPath vs. Short XPath

child::office/child::person[attribute::name='Sue']

shorthand – office/person[@name='Sue']

/descendant-or-self::node()/child::person[attribute::name='Sue']

shorthand – //person[@name='Sue']

self::node()/descendant-or-self::node()/child::para

shorthand – .//para
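The abbreviated syntax can be tried out with Python's standard xml.etree.ElementTree module, which accepts a limited subset of abbreviated XPath (it does not accept the long axis:: forms or absolute paths on an element). The sketch below makes some assumptions: the office document is hypothetical, and a wrapper element named doc stands in for the document root so that office/person can be evaluated from above the office element.

```python
import xml.etree.ElementTree as ET

# Hypothetical office document, used only for illustration.
office = ET.fromstring(
    '<office>'
    '<person name="Sue"><age>29</age><grade>2</grade><phone>56789</phone></person>'
    '<person name="Tom"><age>41</age><grade>3</grade><phone>12345</phone></person>'
    '</office>'
)

# ElementTree has no document node, so a wrapper element plays that role here.
doc = ET.Element("doc")
doc.append(office)

# office/person[@name='Sue']: child steps plus an attribute predicate
sue_by_path = doc.findall("office/person[@name='Sue']")

# .//person[@name='Sue']: search all descendants of the context node
sue_anywhere = doc.findall(".//person[@name='Sue']")

# .//phone: every phone element below the context node
phones = office.findall(".//phone")

# Positional predicates: person[1] and person[last()]
first_person = office.find("person[1]")
last_person = office.find("person[last()]")

print([p.get("name") for p in sue_by_path])
print([p.text for p in phones])
print(first_person.get("name"), last_person.get("name"))
```

Both ways of locating Sue return the same element, which is the point of the shorthand: the path is shorter, the selected nodes are unchanged.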


XPath Examples – revisited

This time, use abbreviated syntax.

Exercise 1: Identify all person elements (cf. how about the first person element?)

Exercise 2: Identify all person elements with grade 2.

Exercise 3: Identify Sue’s age.

Exercise 4: Identify all phone elements.

Exercise 5: Identify person’s names with age 29.

http://chris.photobooks.com/xml/default.htm (visualiser!)


XPath in action

Example 7: A small XSLT program that formats phonebook.xml into an HTML table:

<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.1">

  <xsl:template match="/Phonebook">
    <HTML><BODY><TABLE>
      <xsl:apply-templates select="Entry"/>
    </TABLE></BODY></HTML>
  </xsl:template>

  <xsl:template match="Entry">
    <TR>
      <TD><xsl:value-of select="LastName"/></TD>
      <TD><xsl:apply-templates select="LastName"/></TD>
      <TD><xsl:value-of select="FirstName"/></TD>
      <TD><xsl:value-of select="School"/></TD>
      <TD><xsl:value-of select="Campus"/></TD>
      <TD><xsl:value-of select="Room"/></TD>
      <TD><xsl:value-of select="Extension"/></TD>
    </TR>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of select="@Title"/>
  </xsl:template>

</xsl:stylesheet>


Part II

XSLT


eXtensible Stylesheet Language Transformation

XSLT specification: http://www.w3.org/TR/xslt

A language for transforming the structure of an XML document. It could be used for:

document data conversion (xml → xml)

add/remove elements, change xml tree structures

rearrange and sort elements, perform calculations, hide/display certain elements, perform tests

different document formats for publishing the same data

from a single XML document to many different document formats (e.g., HTML, mobile device formats, PDF, plain text, and more ...)

separation of data (i.e., content) and presentation

transmitting data between applications

app A data format ↔ xml ↔ app B data format

a cheap and effective way of integrating applications


A simple example

From data-oriented XML:

<person>
  <name>
    <given>David</given>
    <family>Edmond</family>
  </name>
  <age>57</age>
  <pets>
    <dog>Winnie</dog>
    <cat>Misty</cat>
  </pets>
</person>

file: edmond.xml

You want to produce:

That is:

extract name, age and # of pets

display the info in HTML


Here is the XSLT program for it:

Note XSLT itself is an XML document ...

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.1">
  <xsl:output method="html" indent="no"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="person/name/given"/></title>
      </head>
      <body>
        <h2><xsl:value-of select="person/name/given"/> &#160;<xsl:value-of select="person/name/family"/></h2>
        <table border="1">
          <tr>
            <td>How old?</td><td><xsl:value-of select="person/age"/></td>
          </tr>
          <tr>
            <td>Nr of pets</td><td><xsl:value-of select="count(person/pets/child::node())"/></td>
          </tr>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

file: edmond.xsl

XSLT is a declarative language

XSLT employs a high-level declarative language approach.

The required transformation is expressed as a set of ’rules’. Each rule describes the transformation you want, rather than specifying a sequence of steps for how it should be done.

Think SQL or Prolog ...

Before XSLT

You could only do ’transformation’ by writing custom applications with XML parsers. You would work with the parser’s API and a programming language to define a specific sequence of steps to be followed in order to produce the desired output.


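[Figure: the XSLT processing model. A Source Document is parsed into a Source Tree; the Transformation Process applies the Style Sheet to it to build a Result Tree; the Output Process serialises the Result Tree as XML, Text or HTML.]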

documents are represented as ’Trees’

XSLT relies on the XML parser to get the trees

two steps: structure transformation, formatting and serialisation


When to run XSLT?

XML document(s) + XSL document → Transformed document

When do you perform the transformation?

Server-side:

either on demand or in advance

by means of a program such as Saxon.

Client-side:

on demand using XSLT supported browsers

most modern browsers have built-in XSLT processors


XSLT Processors

Xalan-Java: The Apache Software Foundation, xml.apache.org (http://xml.apache.org/xalan-j/index.html)

Saxon: by Michael Kay, http://saxon.sourceforge.net/

Built-in support in browsers: IE 5.5+, Netscape 6+ and Mozilla

Binding XML and XSLT:

outside of XML

% java -jar saxon.jar source.xml source.xsl

within XML: use Processing Instruction xml-stylesheet

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xml" href="source.xsl"?>

<catalog>

<!-- catalog content here -->

</catalog>


XSLT Tree Model

Consider the following Green-eyed Monster (GEM) document:

<GEM>
  <JealousyRecord>
    <Person>Sue</Person>
    <Job Earnings="peanuts">lecturer</Job>
    <Holiday>
      <Year>1996</Year><City>Nairobi</City><Country>Kenya</Country>
    </Holiday>
    <Holiday>
      <Year>1994</Year><City>Paris</City><Country>France</Country>
    </Holiday>
    <Holiday>
      <Year>1995</Year><City>Acapulco</City><Country>Mexico</Country>
    </Holiday>
  </JealousyRecord>
  <JealousyRecord>
    <Person>Bill</Person>
    <Job Earnings="heaps">plumber</Job>
    ...
  </JealousyRecord>
  <JealousyRecord>
    <Person>Doug</Person>
    ...
  </JealousyRecord>
</GEM>

file: gem.xml


XSLT Tree Representation


The Basic Transformation Process

Rule-based (Push) Processing:

The dominant feature of a typical XSLT is that it consists of a sequence of “template rules”.

Rules are not arranged in any particular order → XSLT is declarative, not procedural

In each template rule, you specify what output should be produced when particular patterns occur in the input


The Basic Transformation Process

1 XSLT first reads and parses the source document and the stylesheet

2 Then, finds a template rule that matches the root node.

3 Then, the processor instantiates the content of the template rule.

A very simple XSLT with one template rule ...

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Trans ...>
  <xsl:template match="/">
    <html>
      <head>
        <title>Green Eyed Monster</title>
      </head>
      <body>
        I found the Green Eyed Monster!
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

file: gem one.xsl


The Basic Transformation Process

1 If the template contains another rule (in the form of ’apply-templates’), XSLT finds the matching template rule and instantiates the content

2 This basic process is repeated until there is no more template to apply.

A template rule with another template rule

<xsl:template match="/">
  <html>
    <head>
      <title>Green Monster</title>
    </head>
    <body>
      <xsl:apply-templates select="GEM/JealousyRecord"/>
    </body>
  </html>
</xsl:template>

<xsl:template match="JealousyRecord">
  <xsl:apply-templates/>
</xsl:template>

file: gem holiday.xsl

<xsl:apply-templates/> means “process all children”.

Rule-based processing in XSLT

Take this example from M. Kay’s book pp.39-41

<poem>
  <author>Rupert Brooke</author>
  <date>1912</date>
  <title>Song</title>
  <stanza>
    <line>And suddenly the wind comes soft,</line>
    <line>And Spring is here again;</line>
    <line>And the hawthorn quickens with buds of green</line>
    <line>And my heart with buds of pain.</line>
  </stanza>
  <stanza>
    <line>My heart all Winter lay so numb,</line>
    <line>The earth so dead and frore,</line>
    <line>That I never thought the Spring would come again</line>
    <line>Or my heart wake any more.</line>
  </stanza>
  <stanza>
    <line>But Winter’s broken and earth has woken,</line>
    <line>And the small birds cry again;</line>
    <line>And the hawthorn hedge puts forth its buds,</line>
    <line>And my heart puts forth its pain.</line>
  </stanza>
</poem>

file: poem.xml


Rule-based processing in XSLT: template rules

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="poem">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <xsl:apply-templates select="author"/>
        <xsl:apply-templates select="stanza"/>
        <xsl:apply-templates select="date"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="author">
    <div align="center"><h2>By <xsl:value-of select="."/></h2></div>
  </xsl:template>

  <xsl:template match="date">
    <p><i><xsl:value-of select="."/></i></p>
  </xsl:template>

  <xsl:template match="stanza">
    <p><xsl:apply-templates select="line"/></p>
  </xsl:template>

  <xsl:template match="line">
    <xsl:if test="position() mod 2 = 0">&#160;&#160;</xsl:if>
    <xsl:value-of select="."/><br/>
  </xsl:template>

</xsl:stylesheet>

file: poem.xsl


Rule-based processing in XSLT: template rules

<Phonebook>
  <Entry>
    <LastName Title="Miss">Edgar</LastName>
    <FirstName>Pam</FirstName>
    <School>Optometry</School>
    <Campus>GP</Campus>
    <Room>B501</Room>
    <Extension>5695</Extension>
  </Entry>
  <Entry>
    <LastName Title="Dr">Edmond</LastName>
    <FirstName>David</FirstName>
    <School>Information Systems</School>
    <Campus>GP</Campus>
    <Room>S842</Room>
    <Extension>2240</Extension>
  </Entry>
  <Entry>
    <LastName Title="Dr">Edmonds</LastName>
    <FirstName>Ian</FirstName>
    <School>Physical Sciences</School>
    <Campus>GP</Campus>
    <Room>M206</Room>
    <Extension>2584</Extension>
  </Entry>
</Phonebook>

file: Phonebook.xml

Rule-based processing in XSLT: template rules

<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.1">

  <xsl:template match="/Phonebook">
    <HTML><BODY><TABLE>
      <xsl:apply-templates select="Entry"/>
    </TABLE></BODY></HTML>
  </xsl:template>

  <xsl:template match="Entry">
    <TR>
      <TD><xsl:value-of select="LastName"/></TD>
      <TD><xsl:apply-templates select="LastName"/></TD>
      <TD><xsl:value-of select="FirstName"/></TD>
      <TD><xsl:value-of select="School"/></TD>
      <TD><xsl:value-of select="Campus"/></TD>
      <TD><xsl:value-of select="Room"/></TD>
      <TD><xsl:value-of select="Extension"/></TD>
    </TR>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of select="@Title"/>
  </xsl:template>

</xsl:stylesheet>

file: Phonebook.xsl


Defining a rule using <xsl:template>

a template is a fragment of output to be generated when a suitable match is found.

the attribute match is used to indicate the nodes of the source document to which the template applies.

<xsl:template match="/"> will match the root node of the source

<xsl:template match="Report/Intro"> will match any Intro element whose parent is a Report element

Note: you cannot nest <xsl:template> tags; you should use <xsl:apply-templates> (or <xsl:for-each>) instead.


Selecting text value using xsl:value-of

<xsl:value-of select="person/age"/> retrieves the value of the age node that has a person as its parent node.

<xsl:value-of select="."/> retrieves the value of the current node

<xsl:value-of select="count(person/pets/*)"/> returns a count of all the child elements under the pets node.

It is also used to retrieve the value of XSLT parameters and variables.

e.g., declare variable myVar using:

<xsl:variable name="myVar" select="20"/>

<xsl:value-of select="$myVar"/> returns the value of myVar

file: sdb.xml, q3.xsl


Built-in Template Rules in XSLT

What happens when ’apply-templates’ is invoked to process a node, but no template rule is defined for that node?

Node Type    Built-in template rule
root         Call <xsl:apply-templates/> to process the children of the root node
element      Call <xsl:apply-templates/> to process the children of the node
attribute    Copy the attribute value to the result tree as text
text         Copy the text to the result tree
comment      Do nothing
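The effect of the built-in rules can be imitated with a small Python walker. This is a rough analogue under stated assumptions, not an XSLT processor: the function names and the rules dictionary (element name to handler) are invented for illustration, and the sample document is a cut-down, whitespace-free version of gem.xml. Attributes are never visited because processing all children does not select attribute nodes, and ElementTree's default parser already discards comments.

```python
import xml.etree.ElementTree as ET

GEM = ET.fromstring(
    '<GEM><JealousyRecord>'
    '<Person>Sue</Person><Job Earnings="peanuts">lecturer</Job>'
    '</JealousyRecord></GEM>'
)


def apply_templates(node, rules):
    """Use the rule registered for this element name, else the built-in rule."""
    rule = rules.get(node.tag)
    if rule is not None:
        return rule(node, rules)
    return process_children(node, rules)  # built-in rule for elements


def process_children(node, rules):
    """Built-in behaviour: copy text nodes and recurse into child elements."""
    out = [node.text or ""]
    for child in node:
        out.append(apply_templates(child, rules))
        out.append(child.tail or "")  # text that follows the child element
    return "".join(out)


# No user-defined rules at all: only the text content survives.
text_only = apply_templates(GEM, {})

# One user-defined rule for Job; every other node still falls back to the built-ins.
with_job_rule = apply_templates(
    GEM, {"Job": lambda node, rules: "[" + node.get("Earnings") + "]"}
)

print(text_only)
print(with_job_rule)
```

With an empty rule set the walker reduces the document to its concatenated text, which is a useful thing to keep in mind when a stylesheet unexpectedly emits stray text.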


Built-in Template Rules

What will be the result of the following code?

<xsl:stylesheet xmlns:xsl="http://www.w3.org/...>

<xsl:template match="/">

<xsl:apply-templates/>

</xsl:template>

</xsl:stylesheet>

file: builtin.xsl

Can you trace the sequence of processing by XSLT?


XSLT: Basic Constructs

Stylesheet structuring

<xsl:stylesheet> <xsl:include> <xsl:import>

Template structuring

<xsl:template> <xsl:apply-templates> <xsl:call-template>

Generating output

<xsl:value-of> <xsl:element> <xsl:attribute> <xsl:comment> <xsl:processing-instruction> <xsl:text>

Conditional processing

<xsl:if> <xsl:choose> <xsl:when> <xsl:otherwise> <xsl:for-each>

Variables and parameters

<xsl:variable> <xsl:param> <xsl:with-param>

Sorting and numbering

<xsl:sort> <xsl:number>


Basic Transformation Process - Revisited

The simplest way to process the source tree is to write a template rule for each kind of node that will be encountered: Push processing

Consider the following input:

<?xml version="1.0"?>
<books>
  <book category="reference">
    <author>North, Ken</author>
    <title>Database magic with Ken North</title>
    <price>8.95</price>
  </book>
  <book category="fiction">
    <author>Evelyn Waugh</author>
    <title>Sword of Honour</title>
    <price>12.99</price>
  </book>
  <book category="fiction">
    <author>Herman Melville</author>
    <title>Moby Dick</title>
    <price>8.99</price>
  </book>
  ...
</books>

file: books.xml (example from M. Kay book pp.77-78)


Push Processing

Elements to handle = books, book, author, title and price

An example of push processing

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" ...>

  <xsl:template match="books">
    <html><body>
      <h1>A list of Books</h1>
      <table width="640"><xsl:apply-templates/></table>
    </body></html>
  </xsl:template>

  <xsl:template match="book">
    <tr>
      <td><xsl:number/></td>
      <xsl:apply-templates/>
    </tr>
  </xsl:template>

  <xsl:template match="author | title | price">
    <td><xsl:value-of select="."/></td>
  </xsl:template>

</xsl:stylesheet>

file: books push.xsl

<xsl:number/>: gives the sequence number of the current node in the source document


Push Processing

The output:

<html>
  <body>
    <h1>A list of Books</h1>
    <table width="640">
      <tr>
        <td>1</td>
        <td>North, Ken</td>
        <td>Database magic with Ken North</td>
        <td>8.95</td>
      </tr>
      <tr>
        <td>2</td>
        <td>Evelyn Waugh</td>
        <td>Sword of Honour</td>
        <td>12.99</td>
      </tr>
      ...
    </table>
  </body>
</html>


Controlling Which Node to Process

In push style processing, the processor pushes every node out of the door, and your template rule catches the node for processing.

This will not work if the properties of each book were less predictable (e.g., some books have no price, title and author may come in a different order, etc.)

Irregular Input 1:

<book category="reference">

<title>Database magic</title>

<author>North, Ken</author>

</book>

Irregular Input 2:

<book category="fiction">

<price>12.99</price>

<title>Sword of Honour</title>

<author>Nathan Waugh</author>

</book>

What is the output of books push.xsl now?


Controlling the sequence of processing

Take Irregular Input 2: a possible fix is to specify which nodes to process, instead of telling the processor to process all children

template match="book", revisited (I)

<xsl:template match="book">

<tr>

<td><xsl:number/></td>

<xsl:apply-templates select="author"/>

<xsl:apply-templates select="title"/>

<xsl:apply-templates select="price"/>

</tr>

</xsl:template>

file: books pull 1.xsl

This is more robust, but it produces a ragged table for the ’Irregular Input 1’ case (i.e., a missing element)


Selecting nodes explicitly (Pull Processing)

template match="book", revisited (II)

<xsl:template match="book">

<tr>

<td><xsl:number/></td>

<td><xsl:value-of select="author"/></td>

<td><xsl:value-of select="title"/></td>

<td><xsl:value-of select="price"/></td>

</tr>

</xsl:template>

file: books pull 2.xsl

There are other ways to handle the issue of unpredictable structures:

Using <xsl:for-each> to perform explicit processing of each of the nodes
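The pull style can be mirrored in Python by asking each book for the fields wanted, in the order wanted. This is a sketch by analogy, not XSLT: book_row is a hypothetical helper, and findtext with an empty default plays the role of <xsl:value-of>, which yields an empty string when nothing is selected. The two books are the irregular inputs shown earlier.

```python
import xml.etree.ElementTree as ET

BOOKS = ET.fromstring(
    '<books>'
    '<book category="reference">'
    '<title>Database magic</title><author>North, Ken</author>'
    '</book>'
    '<book category="fiction">'
    '<price>12.99</price><title>Sword of Honour</title><author>Nathan Waugh</author>'
    '</book>'
    '</books>'
)


def book_row(position, book):
    """Pull author, title and price explicitly; a missing field gives an empty cell."""
    cells = [str(position)]
    for tag in ("author", "title", "price"):
        cells.append(book.findtext(tag, default=""))
    return "<tr>" + "".join(f"<td>{cell}</td>" for cell in cells) + "</tr>"


rows = [book_row(i, book) for i, book in enumerate(BOOKS.findall("book"), start=1)]
for row in rows:
    print(row)
```

Every row now has four cells in a fixed order, regardless of element order or missing elements in the source, which is exactly what the raggedness problem calls for.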


Selecting nodes explicitly - Using For-Each

Irregular Input 3:

<book category="fiction">
  <author>Nathan Waugh</author>
  <author>Another Author Here</author>
  <title>Sword of Honour</title>
  <price>12.99</price>
</book>

template match="book", revisited (III)

<xsl:template match="book">
  <tr>
    <td><xsl:number/></td>
    <td>
      <xsl:for-each select="author">
        <xsl:value-of select="."/>
      </xsl:for-each>
    </td>
    <td><xsl:value-of select="title"/></td>
    <td><xsl:value-of select="price"/></td>
  </tr>
</xsl:template>

file: books pull 3.xsl


Sorting Nodes

By default, the nodes are processed in document order: the order in which they appear in the source document.

The order can be changed in the output document via <xsl:sort>

Books

<?xml version="1.0"?>
<books>
  <book category="fiction">
    <author>Evelyn Waugh</author>
    <title>Sword of Honour</title>
    <price>12.99</price>
  </book>
  <book category="fiction">
    <author>Herman Melville</author>
    <title>Moby Dick</title>
    <price>8.99</price>
  </book>
</books>

Sort book by price

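<xsl:apply-templates select="book">
  <!-- data-type="number" compares prices numerically; the default is a text comparison -->
  <xsl:sort select="price" data-type="number" order="descending"/>
  <xsl:sort select="author" order="descending"/>
</xsl:apply-templates>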

<xsl:template match="book">

<xsl:value-of select="title"/>

</xsl:template>

file: books sort.xsl
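For reference, a complete stylesheet built around a single sort key might look like the sketch below. It assumes the Books document shown above as input and plain-text output; the ascending order and the newline after each title are choices made for this illustration only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="books">
    <xsl:apply-templates select="book">
      <!-- Cheapest first; data-type="number" compares 8.99 and 12.99 as numbers -->
      <xsl:sort select="price" data-type="number" order="ascending"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="book">
    <xsl:value-of select="title"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the two books above, this should list Moby Dick before Sword of Honour. Without data-type="number" the prices are compared as text, which can misorder values such as 8.99 and 12.99.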


Using <xsl:if>

<xsl:if test="XPath expression"> element body </xsl:if>

The Poem again ...

<stanza>
  <line>My heart all Winter lay so numb,</line>
  <line>The earth so dead and frore,</line>
  <line>That I never thought the Spring would come again</line>
  <line>Or my heart wake any more</line>
</stanza>

Indenting every second line ...

<xsl:template match="stanza">
  <p><xsl:apply-templates select="line"/></p>
</xsl:template>

<xsl:template match="line">
  <xsl:if test="position() mod 2 = 0">&#160;&#160;</xsl:if>
  <xsl:value-of select="."/><br/>
</xsl:template>


Using <xsl:choose>

XSLT has no else or else-if construct to go with 'xsl:if'.

'xsl:choose' lets you test multiple conditions.

Stock price

<stock id="IAG" open="31.20" high="32.61" low="30.15" close="30.51" />

Display different icons based on the price

<xsl:choose>
  <xsl:when test="@close &lt; @open">
    <img src="down.gif" />
  </xsl:when>
  <xsl:when test="@close &gt; @open">
    <img src="up.gif" />
  </xsl:when>
  <xsl:otherwise>
    <img src="same.gif" />
  </xsl:otherwise>
</xsl:choose>

file: books choose.xsl


Generating XML documents

Transforming one XML document into another XML document

Input

<Phonebook>

<Entry>

<LastName Title="Miss">Edgar</LastName>

<FirstName>Pam</FirstName>

<School>Computer Science</School>

<Room>B501</Room>

<Extension>5097</Extension>

</Entry>

</Phonebook>

Output

<Phonebook>

<Entry Extension="5097">

<Name Title="Miss"
      LastName="Edgar"
      FirstName="Pam"/>

<Room Building="B">501</Room>

</Entry>

</Phonebook>

Entry will have a new attribute ’Extension’

Create a new element called 'Name' with 'Title', 'LastName' and 'FirstName' as attributes

Ignore the 'School' element

The 'Room' element gets a new 'Building' attribute, taken from the first character of the room value


Here is the XSLT

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="Phonebook">
    <xsl:element name="Phonebook">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="Entry">
    <xsl:element name="Entry">
      <xsl:attribute name="Extension">
        <xsl:value-of select="Extension"/>
      </xsl:attribute>
      <xsl:element name="Name">
        <xsl:attribute name="Title"><xsl:value-of select="LastName/@Title"/></xsl:attribute>
        <xsl:attribute name="LastName"><xsl:value-of select="LastName"/></xsl:attribute>
        <xsl:attribute name="FirstName"><xsl:value-of select="FirstName"/></xsl:attribute>
      </xsl:element>
      <xsl:element name="Room">
        <xsl:attribute name="Building"><xsl:value-of select="substring(Room,1,1)"/></xsl:attribute>
        <xsl:value-of select="substring(Room,2)"/>
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

file: xml xml.xsl, p2p.xsl


xsl:element

Creating elements:

Use xsl:element

<xsl:template match="Phonebook">

<xsl:element name="Phonebook">

<xsl:apply-templates/>

</xsl:element>

</xsl:template>

Literally write the element name

<xsl:template match="Phonebook">

<Phonebook>

<xsl:apply-templates/>

</Phonebook>

</xsl:template>
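The two forms above are interchangeable when the element name is fixed. xsl:element becomes necessary when the name is only known at run time. A small sketch, assuming a hypothetical input element such as <field name="Room">B501</field> (this input is not part of the phonebook example):

```xml
<!-- Hypothetical input: <field name="Room">B501</field> -->
<xsl:template match="field">
  <!-- The braces are evaluated at run time, so the new element is named after @name -->
  <xsl:element name="{@name}">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
```

For the hypothetical input this would be expected to produce <Room>B501</Room>. A literal result element cannot do this, because its name is written directly in the stylesheet.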


xsl:attribute

Consider this:

<person id="hamidm">
  <homepage>http://www.cse.unsw.edu.au/~hamidm</homepage>
</person>

Say you want to create ...

<a href="http://www.cse.unsw.edu.au/~hamidm">hamidm</a>

What is wrong with the following?

<xsl:template match="person">
  <a href="<xsl:value-of select="homepage"/>">
    <xsl:value-of select="@id"/>
  </a>
</xsl:template>


xsl:attribute

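An element cannot appear inside an attribute value, so a stylesheet that nests xsl:value-of inside href="..." is not well-formed XML.

The correct way ...

<xsl:template match="person">
  <a>
    <xsl:attribute name="href"><xsl:value-of select="homepage"/></xsl:attribute>
    <xsl:value-of select="@id"/>
  </a>
</xsl:template>

A shortcut (attribute value template)

<xsl:template match="person">
  <a href="{homepage}">
    <xsl:value-of select="@id"/>
  </a>
</xsl:template>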
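The braces in the shortcut can be mixed with fixed text, and one attribute may contain several of them. A brief sketch using the same person input; the title attribute and its wording are assumptions added for illustration:

```xml
<xsl:template match="person">
  <!-- Each {...} is replaced by the string value of the XPath expression inside it -->
  <a href="{homepage}" title="Home page of {@id}">
    <xsl:value-of select="@id"/>
  </a>
</xsl:template>
```

To output a literal brace inside such an attribute, double it ({{ or }}).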


Text nodes and White space

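What happens to adjacent text nodes?

<xsl:template match="Entry">
  <xsl:value-of select="LastName"/>
  <xsl:value-of select="FirstName"/>
</xsl:template>

In the output document, adjacent text nodes are merged into one node. The whitespace between the two instructions is not copied to the output, so the two values run together.

The XSLT processor strips whitespace-only text nodes from the stylesheet (much as Web browsers collapse whitespace).

XSLT does not support '&nbsp;', because it is an HTML entity that XML does not predefine (a non-breaking space can be written as &#160;). You should use one of the following instead: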

space (&#x20;)

tab (&#x9;)

new line (&#xA;)

carriage return (&#xD;)


Text nodes and White space

Creating extra spaces

<xsl:template match="Entry">
  <xsl:value-of select="LastName"/>&#x20;&#x20;<xsl:value-of select="FirstName"/>
</xsl:template>

Or use ... <xsl:text>

Creating extra spaces

<xsl:template match="Entry">
  <xsl:value-of select="LastName"/>
  <xsl:text> </xsl:text>
  <xsl:value-of select="FirstName"/>
</xsl:template>

Note that only text can come between <xsl:text> and </xsl:text>
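A slightly fuller sketch of xsl:text, assuming the phonebook Entry input and a "LastName, FirstName" layout with one entry per line (the layout is an assumption for illustration):

```xml
<xsl:template match="Entry">
  <xsl:value-of select="LastName"/>
  <!-- Text inside xsl:text is copied as written, including the space -->
  <xsl:text>, </xsl:text>
  <xsl:value-of select="FirstName"/>
  <!-- A line break after each entry -->
  <xsl:text>&#xA;</xsl:text>
</xsl:template>
```

xsl:text is the more dependable choice for whitespace. Character references are resolved by the XML parser before the stylesheet is processed, so a run of &#x20; between two instructions is generally treated as whitespace-only text and stripped unless it sits inside xsl:text.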


Managing White Spaces

Insignificant whitespace inside nodes:

<description> This ... is a test. </description>

normalize-space(description)

Whitespace-only text nodes: <item/>

<xsl:strip-space elements="*"/>

Use it as a top-level element to get rid of all whitespace-only text nodes.

Use <xsl:preserve-space elements="..."/> to keep whitespace-only text nodes in the named elements.

Use xml:space="preserve" in the XML document to signal that whitespace is to be kept.
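The pieces above can be combined in one stylesheet. The sketch below is illustrative only: the poem element is a hypothetical name chosen to show xsl:preserve-space, and text output is assumed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Drop whitespace-only text nodes from every source element ... -->
  <xsl:strip-space elements="*"/>
  <!-- ... except inside the hypothetical poem element -->
  <xsl:preserve-space elements="poem"/>

  <xsl:template match="description">
    <!-- Trims both ends and collapses inner runs of whitespace to a single space -->
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>
</xsl:stylesheet>
```

xsl:strip-space only affects text nodes that consist entirely of whitespace. Leading and trailing whitespace inside a node that also has other text needs normalize-space().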


XSLT Variables and Parameters

Variable Declaration

<xsl:variable name="var name" select="var value"/>

<xsl:variable name="city" select="'Sydney'"/>

Global variables: available throughout the whole stylesheet

Local variables: only within a particular template body

Parameter Declaration

<xsl:param name="param name" />

Global parameters: values are set outside the stylesheet (e.g., on the command line)

Local parameters: defined for a template, with values set using the 'xsl:with-param' element when the template is called


XSLT Variables

Global Variable

<xsl:stylesheet .... >
  <xsl:variable name="width" select="50"/>
  <xsl:template match="someNode">
    width is: <xsl:value-of select="$width"/>
  </xsl:template>
  <xsl:template match="someOtherNode">
    width still is: <xsl:value-of select="$width"/>
  </xsl:template>
</xsl:stylesheet>

Local Variable

<xsl:stylesheet .... >
  <xsl:variable name="gwidth" select="50"/>
  <xsl:template match="someNode">
    <xsl:variable name="lwidth" select="$gwidth * 2"/>
    width: <xsl:value-of select="$lwidth"/>
  </xsl:template>
</xsl:stylesheet>


XSLT Parameters

Global Parameter (inside XSLT)

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:param name="short"/>
  <xsl:template match="Australia">
    <xsl:apply-templates select="states/name[@abbr=$short]"/>
  </xsl:template>
</xsl:stylesheet>

Using global parameter: outside XSLT

% java -jar saxon.jar aus.xml aus.xsl short=QLD

Local Parameter

<xsl:apply-templates select="Customers/Customer">
  <!-- The inner quotes make 'C101' a string; select="C101" would select a child element named C101 -->
  <xsl:with-param name="Filter" select="'C101'"/>
</xsl:apply-templates>

<xsl:template match="Customer">
  <xsl:param name="Filter"/>
  ...
</xsl:template>
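To show how a local parameter might be used once it has been received, here is a minimal complete sketch. The id attribute and name child of Customer, the 'ALL' default and the text output are all assumptions for illustration; the original leaves the template body elided.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="Customers/Customer">
      <xsl:with-param name="Filter" select="'C101'"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="Customer">
    <!-- 'ALL' is the default used when the caller passes no Filter -->
    <xsl:param name="Filter" select="'ALL'"/>
    <!-- @id and name are hypothetical parts of the input -->
    <xsl:if test="$Filter = 'ALL' or @id = $Filter">
      <xsl:value-of select="name"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

A parameter that is declared with a default keeps the template usable when it is invoked without xsl:with-param.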


Constructing a temporary tree using a variable

A Variable Declaration

<xsl:variable name="tree">
  AAA
  <xsl:element name="X">
    <xsl:attribute name="att">att-value</xsl:attribute>
    BBB
  </xsl:element>
  <xsl:element name="Y">CCC</xsl:element>
</xsl:variable>

The resulting tree:

root
  text: AAA
  element: X
    attribute: att = att-value
    text: BBB
  element: Y
    text: CCC

You could do:

<xsl:value-of select="$tree/X/@att" />

count($tree//*) returns the number of elements
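Path expressions such as $tree/X/@att work directly in XSLT 2.0. In XSLT 1.0 a variable with content is a result tree fragment, and the /, // and [] operators are not permitted on it. A common workaround is the EXSLT node-set() extension function, sketched below on the assumption that the processor supports EXSLT (many do, but it is not part of XSLT 1.0 itself). Literal result elements are used here for brevity and the AAA text is left out.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl"
                version="1.0">
  <xsl:output method="text"/>

  <xsl:variable name="tree">
    <X att="att-value">BBB</X>
    <Y>CCC</Y>
  </xsl:variable>

  <xsl:template match="/">
    <!-- exsl:node-set() converts the result tree fragment into a navigable node-set -->
    <xsl:value-of select="exsl:node-set($tree)/X/@att"/>
    <xsl:text>&#xA;</xsl:text>
    <xsl:value-of select="count(exsl:node-set($tree)//*)"/>
  </xsl:template>
</xsl:stylesheet>
```

With an EXSLT-capable processor this would be expected to print att-value followed by 2, the number of elements in the temporary tree.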
