XPath from a Logical Point of View Tadeusz Litak (joint work with Balder ten Cate, Maarten Marx and Gaëlle Fontaine) Department of Computer Science University of Leicester 15 December 2010 MGS Christmas Dinner Apéritif

Advertisement

Expressive power:
Marx and De Rijke. Semantic characterizations of navigational XPath. SIGMOD Record 34(2), 2005.

Ten Cate, Fontaine and Litak. Some modal aspects of XPath. M4M’07. Journal version to appear in a special "20th birthday" issue of JANCL.

Ten Cate and Segoufin. XPath, transitive closure logic, and nested tree walking automata. Journal of the ACM, 2010. Extended abstract appeared in PODS 2008.

Place/Segoufin in LICS 2010, Fontaine/Place in MFCS 2010 . . .

Axiomatization:
Ten Cate, Fontaine and Litak. Some modal aspects of XPath. M4M’07. Journal version to appear in a special "20th birthday" issue of JANCL.

Ten Cate, Litak and Marx. Complete axiomatizations of XPath fragments. JAL 2010. Extended abstract presented at LiD 2008.

Ten Cate and Marx. Axiomatizing the logical core of XPath 2.0. ICDT’07.

Complexity:
Gottlob, Koch and Pichler. Efficient algorithms for processing XPath queries. TODS 30(2), 2005.

Ten Cate and Lutz. Query containment in very expressive XPath dialects. PODS’07.

What This Talk Is About

Abstract of Abstract
I will sketch how to use an intimate relationship between
• subsets and extensions of XPath 1.0 and 2.0 (in particular for their "navigational core") and
• well-understood logical and algebraic formalisms
to derive results on
• complete axiomatizations
• computational complexity and
• expressive power
for XPath fragments

XML and Semi-structured Data

XML: eXtensible Markup Language

• began as a subset of SGML: Standard Generalized Markup Language
  (HTML: simplified and corrupted subset of SGML)
• developed into
  • RSS
  • Atom
  • SOAP
  • XHTML . . .

Example Document

No XML talk can do without its own example document:

<?xml version='1.0' encoding='UTF-8'?>
<talk date='15-Dec-2010'>
  <speaker uni='Leicester'>T. Litak</speaker>
  <title>
    <i>XPath</i> from a Logical Point of View
  </title>
  <location>
    <i>ATT LT3</i>
    <b>Leicester</b>
  </location>
</talk>

(no DTD given, but you can easily come up with one)
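
To make the example concrete, here is a minimal sketch that loads the document with Python's standard `xml.etree.ElementTree` module and prints its element structure. The `outline` helper is a hypothetical name introduced only for this illustration, and the XML declaration is left out for brevity.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, typed with plain ASCII quotes.
DOC = """<talk date='15-Dec-2010'>
  <speaker uni='Leicester'>T. Litak</speaker>
  <title><i>XPath</i> from a Logical Point of View</title>
  <location><i>ATT LT3</i><b>Leicester</b></location>
</talk>"""

root = ET.fromstring(DOC)

def outline(elem, depth=0):
    """Return one line per element: indentation, name, then attributes if any."""
    attrs = " " + str(elem.attrib) if elem.attrib else ""
    lines = ["  " * depth + elem.tag + attrs]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
```

The printed outline lists `talk` with its `date` attribute, then `speaker`, `title` (with child `i`) and `location` (with children `i` and `b`).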

What we’ll see through our dim glasses

Either this . . .

talk
  speaker
  title
    i
  location
    i
    b

(we cannot even see attributes; each node is labelled with a single label: its name)

At any rate, we are too blind to see actual text content
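
This first "dim glasses" view can be sketched in Python as a projection that forgets attributes and text and keeps only element names. The nested `(name, children)` tuples and the helper name `labels_only` are assumptions made for this illustration, not part of the talk.

```python
import xml.etree.ElementTree as ET

# The example document, typed with plain ASCII quotes.
DOC = (
    "<talk date='15-Dec-2010'>"
    "<speaker uni='Leicester'>T. Litak</speaker>"
    "<title><i>XPath</i> from a Logical Point of View</title>"
    "<location><i>ATT LT3</i><b>Leicester</b></location>"
    "</talk>"
)

def labels_only(elem):
    """Keep the element name and the ordered list of children, nothing else."""
    return (elem.tag, [labels_only(child) for child in elem])

tree = labels_only(ET.fromstring(DOC))
print(tree)
```

Child order is preserved, so the result is still a sibling-ordered tree; only attribute values and text content are lost.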

What we’ll see through our dim glasses

or that . . .

talk, @date='15-Dec-2010'
  speaker, @uni='Leicester'
  title
    i
  location
    i
    b

(attribute-value pairs are additional labels)

At any rate, we are too blind to see actual text content

What we’ll see through our dim glasses

or perhaps . . .

talk
  @date='15-Dec-2010'
  speaker
    @uni='Leicester'
  title
    i
  location
    i
    b

(back to the unique labelling idea: attribute-value pairs are a special kind of children)
See Ranko Lazic’s talk last week for a more sophisticated approach to trees with values

At any rate, we are too blind to see actual text content
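
The two attribute-aware views can be sketched in the same style. In the code below, `extra_labels` treats attribute-value pairs as additional labels on a node, and `attrs_as_children` turns them into leaf children. Both function names, the tuple encoding, and the choice to list attribute children before element children are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# The example document, typed with plain ASCII quotes.
DOC = (
    "<talk date='15-Dec-2010'>"
    "<speaker uni='Leicester'>T. Litak</speaker>"
    "<title><i>XPath</i> from a Logical Point of View</title>"
    "<location><i>ATT LT3</i><b>Leicester</b></location>"
    "</talk>"
)

def attr_labels(elem):
    """Attribute-value pairs written the way the slides write them."""
    return [f"@{key}='{value}'" for key, value in elem.attrib.items()]

def extra_labels(elem):
    """A node carries several labels: its name plus its attribute-value pairs."""
    return ([elem.tag] + attr_labels(elem), [extra_labels(c) for c in elem])

def attrs_as_children(elem):
    """One label per node: attribute-value pairs become leaf children."""
    kids = [(label, []) for label in attr_labels(elem)]
    kids += [attrs_as_children(c) for c in elem]
    return (elem.tag, kids)

root = ET.fromstring(DOC)
print(extra_labels(root)[0])
print(attrs_as_children(root))
```

In both encodings the text content ("T. Litak", "ATT LT3", and so on) is still dropped.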

XPath 1.0: W3C Specification

• Provides a common syntax and semantics for functionality shared between [XQuery], XSL Transformations and XPointer

• Primary purpose: to address parts of an XML document

• In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans

• Uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values

• Operates on the abstract, logical structure of an XML document, rather than its surface syntax

Samples of XPath Expressions

• Unions. For example: /note/from | /note/to

• Counting. For example: /note/to[1]

• Descendant and ancestor steps. For example: /note//i

• Filters. For example: /note[from]/to

• Attributes. For example: /note[@date="10-nov-2006"]

• String functions. For example: /note[substring(body,1,4)="It's"]

• Arithmetical functions. . . .

• . . .

• XPath 1.0 specification (W3C, Nov ’99): 37 pages.

• XPath 2.0 specification (W3C, Jan ’07): 122 pages.

• XPath 3.0: . . . ?
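
The sample expressions can be tried out with a rough sketch based on Python's standard `xml.etree.ElementTree`. That module implements only a limited subset of XPath: it has no union operator and no string functions, so those two samples are imitated in plain Python. The note document below is hypothetical and made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical note document, invented for this illustration.
NOTE = """<note date='10-nov-2006'>
  <to>Tove</to><to>Ann</to>
  <from>Jani</from>
  <body>It's <i>only</i> a reminder</body>
</note>"""

# ElementTree paths are relative to an element, so an artificial parent
# stands in for the document node and "/note/..." becomes "note/...".
doc = ET.Element("doc")
doc.append(ET.fromstring(NOTE))

def texts(elems):
    """Text directly inside each selected element."""
    return [e.text for e in elems]

# Union: no "|" in ElementTree, so two result lists are concatenated.
union = doc.findall("note/from") + doc.findall("note/to")
first_to = doc.findall("note/to[1]")              # counting
italics = doc.findall("note//i")                  # descendant step
filtered = doc.findall("note[from]/to")           # filter
dated = doc.findall("note[@date='10-nov-2006']")  # attribute test

# substring(body,1,4) = "It's", imitated on the string value of <body>.
starts_with_its = [
    n for n in doc.findall("note")
    if "".join(n.find("body").itertext())[:4] == "It's"
]

print(texts(union), texts(first_to), texts(italics), texts(filtered))
```

Note that the concatenated list for the union is not in document order, unlike the node set a full XPath processor would return.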

Michael Kay, “XPath 2.0 Programmer’s Reference”

One of the pleasures of XPath 1.0 (like XML itself) was the brevity of the specification: a mere 30 pages. I haven’t tried to print the XPath 2.0 specification, but it is certainly vastly longer. What’s more, it is split between multiple documents. The main language specification at http://www.w3.org/TR/xpath20 points to subsidiary documents describing the data model, the function library, and the formal semantics, all at considerable length. As with a comparison between the US Constitution and the proposed EU Constitution, the length of the document tells us more about the number of people involved in defining it than about the benefits it offers. I would estimate that in reality the 2.0 language is about twice the size of XPath 1.0. (...) But whether the increased word count in the spec adds precision and clarity, or merely creates opportunities for errors and inconsistencies to creep in, is anyone’s guess.

Core XPath 1.0

We focus on the basic navigational functionality of XPath:
(no arithmetic, no strings, no counting . . . ; recall these features are secondary!)

Core XPath 1.0

• Isolated by Gottlob, Koch and Pichler in 2002
• Explicit motivation: allows linear time model checking on XML trees
• An additional advantage of such a simple language: data model discrepancies between XPath 1.0 and 2.0 are no longer relevant

Core XPath

Core XPath has two types of expressions:
• Path expressions define binary relations
• Node expressions define sets of nodes

Syntax of Core XPath:

s ::= ⇓ | ⇑ | ⇐ | ⇒
a ::= s | s+
pexpr ::= a | · | pexpr/pexpr | pexpr ∪ pexpr | pexpr[nexpr]
nexpr ::= p | 〈pexpr〉 | ¬nexpr | nexpr ∨ nexpr (p ∈ Σ)

(Our notation is a bit different from the official XPath notation)

We also consider single and restricted axis fragments of CoreXPath: notation CoreXPath(a) for a single axis and CoreXPath(A) for a set of axes.
Restricting the set of axes is quite common: recall James Cheney’s or Ranko Lazic’s talks last week for fresh examples
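
One way to make the grammar concrete is to encode it as a small abstract syntax tree. The sketch below uses Python dataclasses and includes a printer that renders expressions with official XPath axis names. All class and function names are hypothetical. The parenthesization is deliberately naive, so the output is meant for reading and is not guaranteed to be accepted by every XPath 1.0 processor.

```python
from dataclasses import dataclass

# Path expressions: pexpr ::= a | · | pexpr/pexpr | pexpr ∪ pexpr | pexpr[nexpr]
@dataclass(frozen=True)
class Axis:
    step: str           # "down", "up", "left" or "right", standing for ⇓ ⇑ ⇐ ⇒
    plus: bool = False  # True for s+

@dataclass(frozen=True)
class Self:
    pass

@dataclass(frozen=True)
class Seq:
    left: object
    right: object

@dataclass(frozen=True)
class Union:
    left: object
    right: object

@dataclass(frozen=True)
class Filter:
    path: object
    test: object

# Node expressions: nexpr ::= p | 〈pexpr〉 | ¬nexpr | nexpr ∨ nexpr
@dataclass(frozen=True)
class Label:
    name: str

@dataclass(frozen=True)
class Some:
    path: object

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

STEP = {"down": "child::*", "up": "parent::*",
        "left": "preceding-sibling::*[1]", "right": "following-sibling::*[1]"}
CLOSURE = {"down": "descendant::*", "up": "ancestor::*",
           "left": "preceding-sibling::*", "right": "following-sibling::*"}

def show_path(e):
    """Render a path expression with official XPath axis names."""
    if isinstance(e, Axis):
        return (CLOSURE if e.plus else STEP)[e.step]
    if isinstance(e, Self):
        return "self::*"
    if isinstance(e, Seq):
        return f"{show_path(e.left)}/{show_path(e.right)}"
    if isinstance(e, Union):
        return f"({show_path(e.left)} | {show_path(e.right)})"
    if isinstance(e, Filter):
        return f"{show_path(e.path)}[{show_node(e.test)}]"
    raise TypeError(f"not a path expression: {e!r}")

def show_node(e):
    """Render a node expression as it would appear inside a predicate."""
    if isinstance(e, Label):
        return f"self::{e.name}"
    if isinstance(e, Some):
        return show_path(e.path)
    if isinstance(e, Not):
        return f"not({show_node(e.arg)})"
    if isinstance(e, Or):
        return f"{show_node(e.left)} or {show_node(e.right)}"
    raise TypeError(f"not a node expression: {e!r}")

# ⇓+[〈⇓[i]〉]: descendants that have a child labelled i.
example = Filter(Axis("down", plus=True), Some(Filter(Axis("down"), Label("i"))))
print(show_path(example))
```

Keeping path and node expressions as separate families of classes mirrors the two-sorted grammar: a `Filter` takes a path and a node test, and `Some` is the only way back from paths to node expressions.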

Core XPath

Syntax of Core XPath in the official XPath notation:

s: child::*, parent::*, preceding-sibling::*[1], following-sibling::*[1]
s+: descendant::*, ancestor::*, preceding-sibling::*, following-sibling::*
pexpr: self::*, pexpr/pexpr, pexpr | pexpr, pexpr[nexpr]
nexpr: self::p, pexpr, not(nexpr), nexpr or nexpr

Page 48: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Core XPath

Core XPath has two types of expressions:
• Path expressions define binary relations
• Node expressions define sets of nodes

Syntax of Core XPath:
s ::= ⇓ | ⇑ | ⇐ | ⇒
a ::= s | s+
pexpr ::= a | · | pexpr/pexpr | pexpr ∪ pexpr | pexpr[nexpr]
nexpr ::= p | 〈pexpr〉 | ¬nexpr | nexpr ∨ nexpr   (p ∈ Σ)

(Our notation is a bit different from the official XPath notation)

We also consider single-axis and restricted-axis fragments of Core XPath: notation CoreXPath(a) for a single axis and CoreXPath(A) for a set of axes.
Restricting the set of axes is quite common: recall James Cheney’s or Ranko Lazic’s talks last week for fresh examples.

Page 50: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Semantics of Core XPath

As said above, we see XML documents as finite sibling-ordered node-labelled trees: an ideal abstraction for such a simple syntax.

XML document
A tuple T = (N, R⇓, R⇒, V) where
- N is the set of nodes,
- R⇓ and R⇒ are the ‘child’ and ‘next sibling’ relations of a finite tree, and
- V : Σ → 2^N (or just V : N → Σ if unique labelling is assumed)
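The tuple T = (N, R⇓, R⇒, V) can be made concrete with a minimal sketch. The names `Document`, `build_document`, `R_down` and `R_right`, the nested (label, children) input format and the preorder numbering of nodes are all assumptions made for illustration; they are not part of the talk.

```python
from collections import namedtuple

# Hypothetical field names: R_down is the child relation, R_right the next-sibling relation.
Document = namedtuple("Document", ["N", "R_down", "R_right", "V"])

def build_document(spec):
    """Build T = (N, R_down, R_right, V) from a nested (label, [children]) spec.

    Nodes are numbered 0, 1, 2, ... in document (preorder) order.
    """
    N, R_down, R_right, V = set(), set(), set(), {}

    def visit(node):
        label, children = node
        n = len(N)                          # next unused node id
        N.add(n)
        V.setdefault(label, set()).add(n)   # V maps each label to its set of nodes
        previous = None
        for child in children:
            c = visit(child)
            R_down.add((n, c))              # c is a child of n
            if previous is not None:
                R_right.add((previous, c))  # c is the next sibling of previous
            previous = c
        return n

    visit(spec)
    return Document(N, R_down, R_right, V)

# The document <a><b/><c><b/></c><d/></a>
doc = build_document(("a", [("b", []), ("c", [("b", [])]), ("d", [])]))
print(sorted(doc.R_down))   # [(0, 1), (0, 2), (0, 4), (2, 3)]
print(sorted(doc.R_right))  # [(1, 2), (2, 4)]
```

Here V has the type Σ → 2^N; under the unique-labelling assumption one could equally store a map from nodes to labels.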

Page 55: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Semantics of Core XPath (ct’d)

pexpr: sets of pairs (context node, reachable node), i.e. subsets of N²
[[s]]T = Rs
[[s+]]T = the transitive closure of Rs
[[·]]T = the identity relation on N
[[A/B]]T = composition of [[A]]T and [[B]]T
[[A ∪ B]]T = union of [[A]]T and [[B]]T
[[A[φ]]]T = {(n,m) ∈ [[A]]T | m ∈ [[φ]]T}

nexpr: subsets of N
[[p]]T = {n ∈ N | n ∈ V(p)}
[[φ ∧ ψ]]T = [[φ]]T ∩ [[ψ]]T
[[φ ∨ ψ]]T = [[φ]]T ∪ [[ψ]]T
[[¬φ]]T = N \ [[φ]]T
[[〈A〉]]T = domain of [[A]]T = {n | ∃m. (n,m) ∈ [[A]]T}
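The clauses above translate almost literally into a small evaluator. The following is a sketch under stated assumptions: expressions are encoded as nested tuples with made-up tags (`"step"`, `"plus"`, `"self"`, `"seq"`, `"union"`, `"filter"`, `"label"`, `"some"`, `"not"`, `"and"`, `"or"`), and R⇑ and R⇐ are taken to be the converses of R⇓ and R⇒.

```python
def compose(R, S):
    """Relational composition R/S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def closure(R):
    """Transitive closure of R, by naive fixpoint iteration."""
    result = set(R)
    while True:
        bigger = result | compose(result, R)
        if bigger == result:
            return result
        result = bigger

def axis(T, s):
    """R_s for a step s; 'up' and 'left' are assumed to be converses of 'down' and 'right'."""
    N, down, right, V = T
    inverse = lambda R: {(b, a) for (a, b) in R}
    return {"down": down, "up": inverse(down),
            "right": right, "left": inverse(right)}[s]

def eval_path(T, e):
    """[[e]]T for a path expression: a set of (context node, reachable node) pairs."""
    op = e[0]
    if op == "step":
        return axis(T, e[1])
    if op == "plus":
        return closure(axis(T, e[1]))
    if op == "self":
        return {(n, n) for n in T[0]}
    if op == "seq":
        return compose(eval_path(T, e[1]), eval_path(T, e[2]))
    if op == "union":
        return eval_path(T, e[1]) | eval_path(T, e[2])
    if op == "filter":                       # A[phi]
        good = eval_node(T, e[2])
        return {(n, m) for (n, m) in eval_path(T, e[1]) if m in good}
    raise ValueError(op)

def eval_node(T, f):
    """[[f]]T for a node expression: a set of nodes."""
    N, _, _, V = T
    op = f[0]
    if op == "label":
        return set(V.get(f[1], set()))
    if op == "some":                         # <A>: domain of [[A]]T
        return {n for (n, _m) in eval_path(T, f[1])}
    if op == "not":
        return N - eval_node(T, f[1])
    if op == "and":
        return eval_node(T, f[1]) & eval_node(T, f[2])
    if op == "or":
        return eval_node(T, f[1]) | eval_node(T, f[2])
    raise ValueError(op)

# The document <a><b/><c><b/></c><d/></a>, nodes numbered in preorder.
T = ({0, 1, 2, 3, 4},
     {(0, 1), (0, 2), (0, 4), (2, 3)},
     {(1, 2), (2, 4)},
     {"a": {0}, "b": {1, 3}, "c": {2}, "d": {4}})

b_descendants = eval_path(T, ("filter", ("plus", "down"), ("label", "b")))
leaves = eval_node(T, ("not", ("some", ("step", "down"))))
print(sorted(b_descendants))  # [(0, 1), (0, 3), (2, 3)]
print(sorted(leaves))         # [1, 3, 4]
```

The two-sorted design shows up directly as two mutually recursive functions, one returning relations and one returning sets.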

Page 56: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

A (slightly modified) diagram of Johan van Benthem

        W                                              W²
unary properties of states     → modes →         binary relations between states
                               ← projections ←
propositional operators                          program operators
ML                                               DRA/TRA

Page 57: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Examples of modes:

?X := {〈x, x〉 | x ∈ X}   (testing)
!X := {〈w, x〉 | w ∈ W, x ∈ X}   (realizing)

Examples of projections:

〈R〉 := {w ∈ W | ∃v ∈ W. w R^W v}   (domain)
π⁻¹(R) := {w ∈ W | ∃v ∈ W. v R^W w}   (codomain)
∼R := {w ∈ W | ∀v ∈ W. ¬(w R^W v)}   (antidomain)
∆(R) := {w ∈ W | w R^W w}   (diagonal)

NOTE THAT (with R^ denoting the converse of R):

〈R〉 = ∼∼R = R/R^ ∩ ·
∆(R) = R ∩ ·
π⁻¹(R) = 〈R^〉

Page 59: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Comments for logicians

• Note that we do not allow transitive closure of arbitrary path expressions (allowed in non-standard extensions like Regular XPath)
• Note also that path expressions, as opposed to node expressions, are not closed under boolean connectives other than sum, i.e. union (changed in XPath 2.0)
• Therefore, we are not exactly in the world of Tarski’s relation algebras
• The right algebraic two-sorted setting would be boolean modules over idempotent semirings
• It is possible to move the discussion to a one-sorted setting, though: Dynamic Relation Algebras (DRAs), studied in the 1990s by a Dutch group in Utrecht (A. Visser, M. Hollenberg), that is, idempotent semirings with an antidomain operation ∼
  The Utrecht group used the name dynamic negation
  Antidomain: term introduced recently by Desharnais, Jipsen and Struth

Page 64: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Short Core XPath

Enter the Short Core XPath (SCX) of de Rijke and Marx: a one-sorted notational variant

Syntax of Short Core XPath:
s ::= ⇓ | ⇑ | ⇐ | ⇒
a ::= s | s+
exp ::= · | a | exp/exp | exp ∪ exp | ?p | ∼exp   (p ∈ Σ)

The definition of a single/restricted axis fragment remains the same

Page 69: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Semantics of Short Core XPath

exp: sets of pairs (context node, reachable node), i.e. subsets of N²
[[s]]T = Rs
[[s+]]T = the transitive closure of Rs
[[·]]T = the identity relation on N
[[A/B]]T = composition of [[A]]T and [[B]]T
[[A ∪ B]]T = union of [[A]]T and [[B]]T
[[?p]]T = {(n,n) ∈ N² | n ∈ V(p)}
[[∼A]]T = {(n,n) ∈ N² | ∀m. (n,m) ∉ [[A]]T}
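Because SCX is one-sorted, a single recursive function suffices to evaluate it. This is a sketch, not the authors' implementation: the tuple encoding (`"self"`, `"step"`, `"plus"`, `"seq"`, `"union"`, `"test"`, `"anti"`) is invented here, and R⇑ and R⇐ are again assumed to be the converses of R⇓ and R⇒.

```python
def compose(R, S):
    """Relational composition R/S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def closure(R):
    """Transitive closure of R, by naive fixpoint iteration."""
    result = set(R)
    while True:
        bigger = result | compose(result, R)
        if bigger == result:
            return result
        result = bigger

def eval_scx(T, e):
    """[[e]]T for an SCX expression: always a binary relation on N."""
    N, down, right, V = T
    inverse = lambda R: {(b, a) for (a, b) in R}
    axes = {"down": down, "up": inverse(down),      # converses: an assumption
            "right": right, "left": inverse(right)}
    op = e[0]
    if op == "self":
        return {(n, n) for n in N}
    if op == "step":
        return axes[e[1]]
    if op == "plus":
        return closure(axes[e[1]])
    if op == "seq":
        return compose(eval_scx(T, e[1]), eval_scx(T, e[2]))
    if op == "union":
        return eval_scx(T, e[1]) | eval_scx(T, e[2])
    if op == "test":                                # ?p
        return {(n, n) for n in V.get(e[1], set())}
    if op == "anti":                                # ~A
        has_successor = {n for (n, _m) in eval_scx(T, e[1])}
        return {(n, n) for n in N - has_successor}
    raise ValueError(op)

# The document <a><b/><c><b/></c><d/></a>, nodes numbered in preorder.
T = ({0, 1, 2, 3, 4},
     {(0, 1), (0, 2), (0, 4), (2, 3)},
     {(1, 2), (2, 4)},
     {"a": {0}, "b": {1, 3}, "c": {2}, "d": {4}})

leaves = eval_scx(T, ("anti", ("step", "down")))                    # ~⇓
b_below = eval_scx(T, ("seq", ("plus", "down"), ("test", "b")))     # ⇓+/?b
print(sorted(leaves))   # [(1, 1), (3, 3), (4, 4)]
print(sorted(b_below))  # [(0, 1), (0, 3), (2, 3)]
```

Node sets are represented by subsets of the identity relation, which is what lets tests and antidomain replace the separate sort of node expressions.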

Page 70: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Back-and-forth Between Core XPath and SCX

One direction is easy:
[[∼A]]T = [[·[¬〈A〉]]]T

But there is also a polynomial translation t in the reverse direction:
t(p) = ?p
t(〈A〉) = ∼∼t(A)
t(φ ∧ ψ) = ∼(∼t(φ) ∪ ∼t(ψ))
t(A[φ]) = t(A)/t(φ)

other connectives being straightforward. Clearly
[[A]]T = [[t(A)]]T for all A ∈ pexpr
[[·[φ]]]T = [[t(φ)]]T for all φ ∈ nexpr
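The translation t is a plain structural recursion, which the following sketch spells out on a tuple encoding of expressions (the tags are invented for illustration). The clauses for ¬ and ∨ are not given on the slide; t(¬φ) = ∼t(φ) and t(φ ∨ ψ) = t(φ) ∪ t(ψ) are filled in here as assumptions about what "straightforward" means.

```python
def t_path(e):
    """Translate a Core XPath path expression into SCX."""
    op = e[0]
    if op in ("step", "plus", "self"):           # axes and · are shared by both languages
        return e
    if op in ("seq", "union"):
        return (op, t_path(e[1]), t_path(e[2]))
    if op == "filter":                           # t(A[phi]) = t(A)/t(phi)
        return ("seq", t_path(e[1]), t_node(e[2]))
    raise ValueError(op)

def t_node(f):
    """Translate a Core XPath node expression into an SCX test-like expression."""
    op = f[0]
    if op == "label":                            # t(p) = ?p
        return ("test", f[1])
    if op == "some":                             # t(<A>) = ~~t(A)
        return ("anti", ("anti", t_path(f[1])))
    if op == "and":                              # t(phi ∧ psi) = ~(~t(phi) ∪ ~t(psi))
        return ("anti", ("union", ("anti", t_node(f[1])), ("anti", t_node(f[2]))))
    if op == "not":                              # assumption: t(¬phi) = ~t(phi)
        return ("anti", t_node(f[1]))
    if op == "or":                               # assumption: t(phi ∨ psi) = t(phi) ∪ t(psi)
        return ("union", t_node(f[1]), t_node(f[2]))
    raise ValueError(op)

# ⇓+[b ∧ ¬<⇓>]: descendants labelled b that are leaves
query = ("filter", ("plus", "down"),
         ("and", ("label", "b"), ("not", ("some", ("step", "down")))))
translated = t_path(query)
print(translated)
```

Each clause produces output of size linear in its input, so this particular recursion stays well within the polynomial bound mentioned above.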

Page 73: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

When Are Two Queries Equivalent?

Definition
Let P and Q be either
• both path expressions or
• both node expressions

We say P and Q are equivalent (P ≡ Q) if [[P]]T = [[Q]]T for every document T.

Page 74: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Problems Equivalent to Equivalence

• (because of the presence of ∨ and ∪):
  • containment for node expressions
  • containment for path expressions
• (because of the presence of ∧ and ¬):
  • satisfiability for node expressions
  • satisfiability for path expressions . . .
    . . . this is a bit nontrivial
    • reduction of satisfiability to equivalence: straightforward
    • reduction of equivalence to satisfiability: requires The Nasty Trick
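The easy reductions in this list can be written out explicitly. The block below is a sketch under two assumptions: containment P ⊆ Q is read as [[P]]T ⊆ [[Q]]T for every document T, and ⊥ is an abbreviation introduced here for the node expression ¬〈·〉, which holds nowhere. The Nasty Trick for path expressions is not reconstructed.

```latex
\begin{align*}
P \subseteq Q &\iff P \cup Q \equiv Q
  && \text{(containment of path expressions)}\\
\varphi \subseteq \psi &\iff \varphi \vee \psi \equiv \psi
  && \text{(containment of node expressions)}\\
\varphi \text{ unsatisfiable} &\iff \varphi \equiv \bot,
  \qquad \bot := \neg\langle\,\cdot\,\rangle\\
\varphi \equiv \psi &\iff (\varphi \wedge \neg\psi) \vee (\psi \wedge \neg\varphi) \text{ unsatisfiable}\\
A \text{ unsatisfiable} &\iff A \equiv \cdot\,[\bot]
  && \text{(satisfiability to equivalence, path expressions)}
\end{align*}
```

The fourth line has no direct analogue for path expressions, since they are not closed under complement or intersection; that is where the nontrivial direction arises.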

Page 85: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Which expressions are equivalent?

Let’s give it a try:

• is it true that
    · ≡ ⇑/⇓ ?
• fine, how about
    · ≡ ⇓/⇑
  and
    ⇑/⇓ ≡ ⇐+ ∪ · ∪ ⇒+ ?
• One last try: how about
    ·[〈⇓〉] ≡ ⇓/⇑
  and
    ⇑/⇓ ≡ ⇐+ ∪ ·[〈⇑〉] ∪ ⇒+ ?

Page 86: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Which expressions are equivalent?

The last try again, in Short Core XPath notation: how about
    ∼∼⇓ ≡ ⇓/⇑
and
    ⇑/⇓ ≡ ⇐+ ∪ ∼∼⇑ ∪ ⇒+ ?
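Candidate equivalences like these can be tested by brute force on all small trees. The sketch below enumerates every ordered tree with at most five nodes and compares both sides under the SCX semantics; the tuple encoding, the helper names and the bound of five are assumptions made for illustration. Finding no counterexample is only evidence, not a proof of validity.

```python
def forests(k):
    """All ordered forests with exactly k nodes; a tree is the tuple of its child trees."""
    if k == 0:
        yield ()
        return
    for size in range(1, k + 1):                 # size of the first tree in the forest
        for children in forests(size - 1):
            for rest in forests(k - size):
                yield (children,) + rest

def relations(tree):
    """Return (N, R_down, R_right) for a tree, numbering nodes in preorder."""
    down, right, count = set(), set(), 0
    def visit(t):
        nonlocal count
        n = count
        count += 1
        previous = None
        for child in t:
            c = visit(child)
            down.add((n, c))
            if previous is not None:
                right.add((previous, c))
            previous = c
        return n
    visit(tree)
    return set(range(count)), down, right

def compose(R, S):
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def closure(R):
    result = set(R)
    while True:
        bigger = result | compose(result, R)
        if bigger == result:
            return result
        result = bigger

def evaluate(T, e):
    """SCX semantics without labels; 'up' and 'left' are assumed to be converses."""
    N, down, right = T
    inverse = lambda R: {(b, a) for (a, b) in R}
    axes = {"down": down, "up": inverse(down), "right": right, "left": inverse(right)}
    op = e[0]
    if op == "self":
        return {(n, n) for n in N}
    if op == "step":
        return axes[e[1]]
    if op == "plus":
        return closure(axes[e[1]])
    if op == "seq":
        return compose(evaluate(T, e[1]), evaluate(T, e[2]))
    if op == "union":
        return evaluate(T, e[1]) | evaluate(T, e[2])
    if op == "anti":
        blocked = {n for (n, _m) in evaluate(T, e[1])}
        return {(n, n) for n in N - blocked}
    raise ValueError(op)

def counterexample(lhs, rhs, max_nodes=5):
    """Return the first small tree on which lhs and rhs differ, or None."""
    for k in range(1, max_nodes + 1):
        for tree in forests(k - 1):              # the children of the root
            T = relations(tree)
            if evaluate(T, lhs) != evaluate(T, rhs):
                return tree
    return None

SELF, DOWN, UP = ("self",), ("step", "down"), ("step", "up")
LEFT_PLUS, RIGHT_PLUS = ("plus", "left"), ("plus", "right")
seq = lambda a, b: ("seq", a, b)
union = lambda a, b: ("union", a, b)
anti = lambda a: ("anti", a)

print(counterexample(SELF, seq(UP, DOWN)))                                     # ()
print(counterexample(SELF, seq(DOWN, UP)))                                     # ()
print(counterexample(seq(UP, DOWN), union(LEFT_PLUS, union(SELF, RIGHT_PLUS))))  # ()
print(counterexample(anti(anti(DOWN)), seq(DOWN, UP)))                         # None
print(counterexample(seq(UP, DOWN),
                     union(LEFT_PLUS, union(anti(anti(UP)), RIGHT_PLUS))))     # None
```

The tree `()` is the single-node document: its only node has neither a parent nor a child, which already separates the first three candidates. The last two survive on every tree tried.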

Page 88: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

A non-trivial problem for query rewrite and optimization:

Evaluation times of two equivalent queries may differ by up to several orders of magnitude!

When implementing an optimizer, you may need thousands of those equivalences.
Now how do you know . . .

(soundness problem) . . . all of your equivalences are valid?
  some fake equivalences are not so easy to spot, especially in a hurry

(completeness problem) . . . you took care of all (possibly) relevant ones?
  there might be classes of equivalences you never thought of!

Page 92: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Definition (Complete Axiomatization)

A complete axiomatization of a given XPath fragment:

A set of
• finitely many valid equivalence schemes
• finitely many validity-preserving inference rules
from which every other valid equivalence is derivable.

For logicians again: of course, we are interested only in finite axiomatizations. As intended models are finite, finite axiomatization implies decidability!

One of the reasons why we consider Core XPath only: the whole of XPath would be too big to allow an axiomatization

Page 95: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Logic - Algebra - Query Languages

Logicians and algebraists have long studied similar problems in a different disguise:

logic:             algebras:      databases:
formulas           terms          query plans
tautologies        equations      query equivalences
inference rules                   rewrite rules

In particular, they standardized a beautifully simple set of validity-preserving rules:

Birkhoff’s Calculus For Equational Logic

Page 101: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Birkhoff’s Calculus For Equational Logic

Definition

• Let Γ be a set of equivalences.
  An equivalence P ≡ Q is derivable from Γ if it can be obtained from the equivalences in Γ by the following rules:
    P ≡ P
    P ≡ Q =⇒ Q ≡ P
    P ≡ Q & Q ≡ R =⇒ P ≡ R
    P ≡ Q =⇒ R ≡ R′   (R′ is obtained from R by replacing occurrences of P by Q)

• An axiomatization using Birkhoff’s rules only is orthodox.

Clearly, these rules preserve validity.
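The four rules are simple enough to sketch as functions on pairs of terms. In this illustration an equivalence P ≡ Q is a Python pair (P, Q), terms are nested tuples with invented tags, and the replacement rule is implemented only in the special case where every occurrence of P is replaced; all function names are hypothetical. The equivalence ⇓/⇑ ≡ ∼∼⇓ is used as an assumed member of Γ.

```python
def replace_all(term, p, q):
    """Replace every occurrence of the subterm p in term by q."""
    if term == p:
        return q
    if isinstance(term, tuple):
        return tuple(replace_all(sub, p, q) for sub in term)
    return term                       # operator names and labels stay untouched

def reflexivity(p):
    """P ≡ P"""
    return (p, p)

def symmetry(eq):
    """P ≡ Q  ==>  Q ≡ P"""
    p, q = eq
    return (q, p)

def transitivity(eq1, eq2):
    """P ≡ Q  &  Q ≡ R  ==>  P ≡ R"""
    (p, q), (q2, r) = eq1, eq2
    if q != q2:
        raise ValueError("middle terms differ")
    return (p, r)

def replacement(eq, r):
    """P ≡ Q  ==>  R ≡ R', where R' is R with all occurrences of P replaced by Q."""
    p, q = eq
    return (r, replace_all(r, p, q))

DOWN, UP, RIGHT = ("step", "down"), ("step", "up"), ("step", "right")

# Assumed member of Gamma:  ⇓/⇑ ≡ ~~⇓
axiom = (("seq", DOWN, UP), ("anti", ("anti", DOWN)))

# Derive  (⇓/⇑)/⇒ ≡ (~~⇓)/⇒  by replacement, then flip it by symmetry.
step1 = replacement(axiom, ("seq", ("seq", DOWN, UP), RIGHT))
step2 = symmetry(step1)
print(step1)
print(step2)
```

A query optimizer would apply such steps in one direction only, as rewrite rules; the calculus itself is symmetric.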

Page 106: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Birkhoff’s Calculus For Equational Logic

Definition

• Let Γ be a set of equivalences.Equivalence P ≡ Q is derivable from Γ if it can be obtainedby the following rules:P ≡ PP ≡ Q =⇒ Q ≡ PP ≡ Q & Q ≡ R =⇒ P ≡ RP ≡ Q =⇒ R ≡ R′

(R′ is obtained from R by replacing occurrences of P by Q)

• An axiomatization using Birkhoff’s rules only is orthodox.

Clearly, these rules preserve validity.

Page 107: XPath from a Logical Point of View - Le · shared between [XQuery], XSL Transformations and XPointer Primary purpose: to address parts of an XML document In support of this primary

Q1: Why Birkhoff Calculus?

Before we proceed, you may have two questions:

Definition
What is so great about this derivation system? Is it . . .

• . . . the definition itself?
  Should feel straightforward and natural, not surprising and counterintuitive

• . . . the avalanche of results it triggered off?
  Theory of varieties developed since the 1930’s: semigroups and groups, semirings, semilattices, lattices and residuated lattices, boolean algebras, abstract relation and cylindric algebras . . .

Q1 (ct’d): But What Use Are They For Us?

An orthodox axiomatization
≡
An elegant, self-contained equational rewrite system
(no need to break equational reasoning with intermediate lemmas)

Almost all axiomatizations presented today will be orthodox
(you’re going to see one exception at the end of the talk and dislike it)

Q2: Anything Special about XPath?

Question
How about complete axiomatizations for SQL-like languages?
After all, there has been nothing XML-specific in what I said . . .

Answer
Even with no more than three attributes, you soon run into unaxiomatizability results! (© by logicians and algebraists)
Some database theorists got into problems not knowing about it . . .
It does not mean you cannot find interesting axiomatizable fragments; they are rather small, though

Q2 (ct’d): Is XPath Querying Any Better Off, Then?

Short Answer
Yes.

Long Answer
Yes, precisely because

• we can isolate the navigational core . . .
  (would not make much sense in the relational context)

• . . . and this core is related to well-behaved, axiomatizable formalisms:

  • Node expressions: to modal logic
  • Path expressions: to idempotent (antidomain) semirings
  • The duality of path and node expressions: resembles (fragments of) the logic of programs (PDL)

Basic Axioms I: Idempotent Semirings

ISAx1  (A ∪ B) ∪ C ≡ A ∪ (B ∪ C)
ISAx2  A ∪ B ≡ B ∪ A
ISAx3  A ∪ A ≡ A
ISAx4  A/(B/C) ≡ (A/B)/C
ISAx5  ·/A ≡ A
       A/· ≡ A
ISAx6  A/(B ∪ C) ≡ A/B ∪ A/C
       (A ∪ B)/C ≡ A/C ∪ B/C
ISAx7  ⊥ ⊆ A

Distributive lattices, Kleene algebras, Tarski’s relation algebras: they all have idempotent semiring reducts.

Idempotency is the axiom ISAx3.
⊥ abbreviates ·[¬〈·〉]
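
These axioms can be sanity-checked on the intended semantics, where path expressions denote binary relations over nodes. The following is a minimal sketch, assuming ∪ is set union, / is relational composition, · is the identity relation and ⊥ is the empty relation. The four-node universe and the helper names are illustrative assumptions.

```python
import random

NODES = range(4)
IDENTITY = frozenset((x, x) for x in NODES)   # interprets "·"
BOTTOM = frozenset()                          # interprets "⊥"

def comp(a, b):
    """Relational composition, interpreting A/B."""
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def random_relation(rng):
    return frozenset((x, y) for x in NODES for y in NODES if rng.random() < 0.3)

def check_semiring_axioms(a, b, c):
    return all([
        (a | b) | c == a | (b | c),                         # ISAx1
        a | b == b | a,                                     # ISAx2
        a | a == a,                                         # ISAx3
        comp(a, comp(b, c)) == comp(comp(a, b), c),         # ISAx4
        comp(IDENTITY, a) == a and comp(a, IDENTITY) == a,  # ISAx5
        comp(a, b | c) == comp(a, b) | comp(a, c),          # ISAx6
        comp(a | b, c) == comp(a, c) | comp(b, c),          # ISAx6
        BOTTOM <= a,                                        # ISAx7
    ])

rng = random.Random(0)
samples = [tuple(random_relation(rng) for _ in range(3)) for _ in range(200)]
all_hold = all(check_semiring_axioms(a, b, c) for a, b, c in samples)
print(all_hold)
```

Such a check only illustrates soundness on sample relations; it says nothing about completeness.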

Basic Axioms II: Predicate Axioms

PrAx1  A[¬〈B〉]/B ≡ ⊥
PrAx2  A[φ ∨ ψ] ≡ A[φ] ∪ A[ψ]
PrAx3  (A/B)[φ] ≡ A/B[φ]
PrAx4  ·[〈·〉] ≡ ·

In Tarski’s relation algebras and XPath 2.0, predicates can be defined away

Note that PrAx3 would not be valid if we allowed unrestricted positional predicates
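
The predicate axioms can be tested in the same relational reading. This sketch assumes A[φ] keeps the pairs of A whose target node satisfies φ, and 〈A〉 is the set of nodes from which A leads somewhere. The function names `filt` and `dom` are hypothetical.

```python
import random

NODES = frozenset(range(4))
IDENTITY = frozenset((x, x) for x in NODES)   # "·"
BOTTOM = frozenset()                          # "⊥"

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    """A[φ]: the pairs of A whose target satisfies φ."""
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    """〈A〉: the nodes that have at least one outgoing A-pair."""
    return frozenset(x for (x, _) in a)

def neg(phi):
    return NODES - phi

def check_predicate_axioms(a, b, phi, psi):
    return all([
        comp(filt(a, neg(dom(b))), b) == BOTTOM,             # PrAx1
        filt(a, phi | psi) == filt(a, phi) | filt(a, psi),   # PrAx2
        filt(comp(a, b), phi) == comp(a, filt(b, phi)),      # PrAx3
        filt(IDENTITY, dom(IDENTITY)) == IDENTITY,           # PrAx4
    ])

def random_relation(rng):
    return frozenset((x, y) for x in NODES for y in NODES if rng.random() < 0.3)

def random_node_set(rng):
    return frozenset(x for x in NODES if rng.random() < 0.5)

rng = random.Random(1)
all_hold = all(
    check_predicate_axioms(random_relation(rng), random_relation(rng),
                           random_node_set(rng), random_node_set(rng))
    for _ in range(200)
)
print(all_hold)
```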

Basic Axioms III: Node Axioms

NdAx1  φ ≡ ¬(¬φ ∨ ψ) ∨ ¬(¬φ ∨ ¬ψ)
NdAx2  〈A ∪ B〉 ≡ 〈A〉 ∨ 〈B〉
NdAx3  〈A/B〉 ≡ 〈A[〈B〉]〉
NdAx4  〈·[φ]〉 ≡ φ

Note how little was needed to ensure booleanity!
(by Huntington’s result from the 1930’s)

And NdAx2–NdAx4 just mimic PrAx2–PrAx4
(redundancy: price to pay for two-sorted signature)
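
The node axioms admit the same kind of check, now comparing sets of nodes rather than relations. This is again a sketch under the assumed relational semantics, with hypothetical helper names.

```python
import random

NODES = frozenset(range(4))
IDENTITY = frozenset((x, x) for x in NODES)   # "·"

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    return frozenset(x for (x, _) in a)

def neg(phi):
    return NODES - phi

def check_node_axioms(a, b, phi, psi):
    return all([
        phi == neg(neg(phi) | psi) | neg(neg(phi) | neg(psi)),   # NdAx1
        dom(a | b) == dom(a) | dom(b),                           # NdAx2
        dom(comp(a, b)) == dom(filt(a, dom(b))),                 # NdAx3
        dom(filt(IDENTITY, phi)) == phi,                         # NdAx4
    ])

def random_relation(rng):
    return frozenset((x, y) for x in NODES for y in NODES if rng.random() < 0.3)

def random_node_set(rng):
    return frozenset(x for x in NODES if rng.random() < 0.5)

rng = random.Random(2)
all_hold = all(
    check_node_axioms(random_relation(rng), random_relation(rng),
                      random_node_set(rng), random_node_set(rng))
    for _ in range(200)
)
print(all_hold)
```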

Axioms in one-sorted signature

Recall all the two-sorted axioms for predicates and expressions:

PrAx1  A[¬〈B〉]/B ≡ ⊥
PrAx2  A[φ ∨ ψ] ≡ A[φ] ∪ A[ψ]
PrAx3  (A/B)[φ] ≡ A/B[φ]
PrAx4  ·[〈·〉] ≡ ·

NdAx1  φ ≡ ¬(¬φ ∨ ψ) ∨ ¬(¬φ ∨ ¬ψ)
NdAx2  〈A ∪ B〉 ≡ 〈A〉 ∨ 〈B〉
NdAx3  〈A/B〉 ≡ 〈A[〈B〉]〉
NdAx4  〈·[φ]〉 ≡ φ

Axioms in one-sorted signature

Here is a one-sorted axiomatization for ∼ over idempotent semiring axioms found by Hollenberg:

∼A/A ≡ ⊥
∼∼A/A ≡ A
∼(A/B)/A ≡ (∼(A/B)/A)/∼B
∼(A ∪ B) ≡ ∼A/∼B
∼A ∪ ∼B ≡ ∼∼(∼A ∪ ∼B)

We need to add one more axiom for tests:

?p ≡ ∼∼?p
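
A sketch of how these one-sorted axioms behave on relations, assuming ∼ is read as the antidomain operation: ∼A is the test that succeeds exactly at nodes with no outgoing A-pair. This reading is suggested by the earlier mention of antidomain semirings but is an assumption here, as are the helper names.

```python
import random

NODES = range(4)
BOTTOM = frozenset()   # "⊥"

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def antidomain(a):
    """∼A: identity pairs at nodes that have no A-successor."""
    has_successor = {x for (x, _) in a}
    return frozenset((x, x) for x in NODES if x not in has_successor)

def label_test(p):
    """?p: identity pairs at the nodes carrying label p (p given as a node set)."""
    return frozenset((x, x) for x in p)

def check_one_sorted_axioms(a, b, p):
    n = antidomain
    return all([
        comp(n(a), a) == BOTTOM,
        comp(n(n(a)), a) == a,
        comp(n(comp(a, b)), a) == comp(comp(n(comp(a, b)), a), n(b)),
        n(a | b) == comp(n(a), n(b)),
        n(a) | n(b) == n(n(n(a) | n(b))),
        label_test(p) == n(n(label_test(p))),
    ])

def random_relation(rng):
    return frozenset((x, y) for x in NODES for y in NODES if rng.random() < 0.3)

def random_node_set(rng):
    return frozenset(x for x in NODES if rng.random() < 0.5)

rng = random.Random(3)
all_hold = all(
    check_one_sorted_axioms(random_relation(rng), random_relation(rng),
                            random_node_set(rng))
    for _ in range(200)
)
print(all_hold)
```

Under this reading ∼∼A plays the role of the test ·[〈A〉], which is how a single sort can stand in for both paths and node expressions.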

Now, you may have the feeling that there was nothing XPath-specific yet

But in fact there is a fragment for which it is all there is:

Core XPath(⇓), the child-axis-only fragment!

Theorem
The axioms presented so far are complete for all valid equivalences of Core XPath(⇓).

In order to find more interesting equivalences, we have to move to other fragments

Axioms for Linear Axes

The non-transitive case:

LinAx1  s[¬φ] ≡ ·[¬〈s[φ]〉]/s   for s ∈ {⇒, ⇐, ⇑}

This forces functionality of the corresponding axis
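
To see why LinAx1 singles out functional axes, one can evaluate both sides on a small tree for every choice of φ. This is a sketch on a hand-picked six-node ordered tree; the tree and the helper names are illustrative assumptions. The child axis is included as a contrast because it is not functional.

```python
from itertools import combinations

# Ordered tree: 0 is the root with children 1, 2, 3; node 1 has children 4, 5.
CHILDREN = {0: [1, 2, 3], 1: [4, 5], 2: [], 3: [], 4: [], 5: []}
NODES = frozenset(CHILDREN)
IDENTITY = frozenset((x, x) for x in NODES)

CHILD = frozenset((p, c) for p, cs in CHILDREN.items() for c in cs)             # ⇓
PARENT = frozenset((c, p) for (p, c) in CHILD)                                 # ⇑
RIGHT = frozenset(pair for cs in CHILDREN.values() for pair in zip(cs, cs[1:]))  # ⇒
LEFT = frozenset((b, a) for (a, b) in RIGHT)                                   # ⇐

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    return frozenset(x for (x, _) in a)

def node_sets():
    nodes = sorted(NODES)
    return [frozenset(c) for k in range(len(nodes) + 1)
            for c in combinations(nodes, k)]

def lin_ax1_holds(s):
    """Check s[¬φ] ≡ ·[¬〈s[φ]〉]/s for every node set φ."""
    for phi in node_sets():
        lhs = filt(s, NODES - phi)
        rhs = comp(filt(IDENTITY, NODES - dom(filt(s, phi))), s)
        if lhs != rhs:
            return False
    return True

results = {name: lin_ax1_holds(axis) for name, axis in
           [("right", RIGHT), ("left", LEFT), ("parent", PARENT), ("child", CHILD)]}
print(results)
```

On this tree the equivalence holds for the three functional axes and fails for the child axis: with φ = {1}, the left-hand side still contains the step from 0 to 2, while the right-hand side is empty.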

Axioms for Transitive Axes

One for node expressions, one for path expressions:

TransAx1  〈s+[φ]〉 ≡ 〈s+[φ ∧ ¬〈s+[φ]〉]〉
TransAx2  s+ ≡ s+ ∪ s+/s+

The first one is called the Löb axiom and forces well-foundedness
Don’t get modal logicians started on it: people wrote books about this formula

In particular, all the consequences of TransAx2 for node expressions can be already derived from TransAx1

I can neither prove nor disprove that for path expressions TransAx2 is (ir-)redundant
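
The link between the Löb axiom and well-foundedness can be observed on two tiny structures: the descendant relation of a finite tree, and the transitive closure of a two-node cycle. This is a sketch with hypothetical helper names; both structures are made up for illustration.

```python
from itertools import combinations

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    return frozenset(x for (x, _) in a)

def transitive_closure(s):
    closure = frozenset(s)
    while True:
        bigger = closure | comp(closure, s)
        if bigger == closure:
            return closure
        closure = bigger

def node_sets(nodes):
    nodes = sorted(nodes)
    return [frozenset(c) for k in range(len(nodes) + 1)
            for c in combinations(nodes, k)]

def loeb_holds(s_plus, nodes):
    """TransAx1: 〈s+[φ]〉 ≡ 〈s+[φ ∧ ¬〈s+[φ]〉]〉 for every node set φ."""
    for phi in node_sets(nodes):
        lhs = dom(filt(s_plus, phi))
        rhs = dom(filt(s_plus, phi - lhs))   # φ ∧ ¬〈s+[φ]〉 is phi minus lhs
        if lhs != rhs:
            return False
    return True

def transitivity_holds(s_plus):
    """TransAx2: s+ ≡ s+ ∪ s+/s+."""
    return s_plus == s_plus | comp(s_plus, s_plus)

TREE_CHILD = frozenset({(0, 1), (0, 2), (1, 3), (1, 4)})
CYCLE = frozenset({(0, 1), (1, 0)})

descendant = transitive_closure(TREE_CHILD)
around = transitive_closure(CYCLE)

tree_ok = loeb_holds(descendant, range(5)) and transitivity_holds(descendant)
cycle_loeb = loeb_holds(around, range(2))
cycle_trans = transitivity_holds(around)
print(tree_ok, cycle_loeb, cycle_trans)
```

The cycle satisfies TransAx2 but not TransAx1: every node can reach a φ-node, yet no φ-node is a last one.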

Finally, Axes which Are Both Transitive and Linear

LinAx2  ·[〈s+[φ]〉]/s+ ≡ s+[φ] ∪ s+[φ]/s+ ∪ s+[〈s+[φ]〉]   for s ∈ {⇒, ⇐, ⇑}

together with transitivity axioms

This forces the corresponding axis to be a linear order
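
LinAx2 can be read as a case split: if a φ-node lies ahead of the start node, then any node ahead is either that φ-node, lies beyond it, or still has it ahead. A sketch that evaluates both sides on a small ordered tree (the tree and helper names are illustrative assumptions) shows the equivalence holding for sibling and ancestor axes and failing for the branching descendant axis.

```python
from itertools import combinations

# Ordered tree: 0 is the root with children 1, 2, 3; node 1 has children 4, 5.
CHILDREN = {0: [1, 2, 3], 1: [4, 5], 2: [], 3: [], 4: [], 5: []}
NODES = frozenset(CHILDREN)
IDENTITY = frozenset((x, x) for x in NODES)

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    return frozenset(x for (x, _) in a)

def plus(s):
    closure = frozenset(s)
    while True:
        bigger = closure | comp(closure, s)
        if bigger == closure:
            return closure
        closure = bigger

DOWN = frozenset((p, c) for p, cs in CHILDREN.items() for c in cs)
UP = frozenset((c, p) for (p, c) in DOWN)
RIGHT = frozenset(pair for cs in CHILDREN.values() for pair in zip(cs, cs[1:]))
LEFT = frozenset((b, a) for (a, b) in RIGHT)

def node_sets():
    nodes = sorted(NODES)
    return [frozenset(c) for k in range(len(nodes) + 1)
            for c in combinations(nodes, k)]

def lin_ax2_holds(s):
    sp = plus(s)
    for phi in node_sets():
        sees_phi = dom(filt(sp, phi))                  # 〈s+[φ]〉
        lhs = comp(filt(IDENTITY, sees_phi), sp)
        rhs = filt(sp, phi) | comp(filt(sp, phi), sp) | filt(sp, sees_phi)
        if lhs != rhs:
            return False
    return True

results = {name: lin_ax2_holds(axis) for name, axis in
           [("right", RIGHT), ("left", LEFT), ("up", UP), ("down", DOWN)]}
print(results)
```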

Single Axis Completeness Result

Theorem

• Base axioms are complete for Core XPath(⇓)

• Base axioms with LinAx1 are complete for other intransitive single axis fragments

• Base axioms with TransAx1 and TransAx2 are complete for Core XPath(⇓+)

• Base axioms with TransAx1, TransAx2 and LinAx2 are complete for other transitive single axis fragments

A Few Words About Proofs

• First, rewrite node expressions to simple node expressions:

siNode ::= 〈·〉 | p | 〈a[siNode]〉 | ¬siNode | siNode ∨ siNode

They are isomorphic variants of modal formulas

• Using normal form theorems for modal logic, we provide a completeness proof for node expressions
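
One natural reading of "isomorphic variants" is a constructor-by-constructor translation: 〈·〉 becomes ⊤, a label stays a proposition letter, and 〈a[φ]〉 becomes a diamond indexed by the axis a. The sketch below assumes this reading. The tuple encoding, the function name `to_modal` and the output syntax are illustrative choices, not taken from the talk.

```python
# Simple node expressions as nested tuples (an illustrative encoding):
#   ("top",)             for 〈·〉
#   ("label", "p")       for a label test p
#   ("dia", "a", phi)    for 〈a[phi]〉
#   ("not", phi)         for ¬phi
#   ("or", phi, psi)     for phi ∨ psi

def to_modal(expr):
    """Translate a simple node expression into a modal formula string."""
    kind = expr[0]
    if kind == "top":
        return "⊤"
    if kind == "label":
        return expr[1]
    if kind == "dia":
        return f"<{expr[1]}>{to_modal(expr[2])}"
    if kind == "not":
        return f"¬{to_modal(expr[1])}"
    if kind == "or":
        return f"({to_modal(expr[1])} ∨ {to_modal(expr[2])})"
    raise ValueError(f"unknown constructor: {kind!r}")


example = ("dia", "child", ("or", ("label", "p"), ("not", ("top",))))
print(to_modal(example))
```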

A Few Words About Proofs cntd.

• Then we rewrite all path expressions as sums of sum-free expressions of the form

S = ·[β1]/a[β2]/ . . . /a[βℓ],

(all βi are normal forms of

  • the same nesting degree in case of transitive axes
  • strictly decreasing degree for intransitive axes)

In case of linear axes, we can even guarantee that every formula is witnessed further down the chain

• We prove that for every two such expressions either

  • one is a subsequence of the other (provably contained), or
  • there is a countermodel for containment

Aside: the issue of labels

There is a fact about XML trees we did not take into account
(unless we opt to render attribute-value pairs as additional labels)

The labels are disjoint!

However, this is easy to fix: add node axiom

p ∧ q ≡ ⊥

for distinct p and q

This axiom itself is not substitution-invariant; this is why we do not like it

But as our proofs used only Birkhoff’s rules, they are quite flexible and adding this axiom does not hurt

Axiom For Axes Dependencies

TreeAx1  s+/s ∪ s ≡ s+
         s/s+ ∪ s ≡ s+
TreeAx2  s[φ]/s^ ≡ ·[〈s[φ]〉]   (for s distinct from ⇑)
TreeAx3  ⇑[φ]/⇓ ≡ (⇐+ ∪ ⇒+ ∪ ·)[〈⇑[φ]〉]
TreeAx4  ⇐+ ≡ ⇐+[〈⇑〉]
         ⇒+ ≡ ⇒+[〈⇑〉]

TreeAx1 says: s+ is a transitive closure of s
TreeAx2 says non-child axes are functional and describes their converse
TreeAx3 forces ⇑ to be the converse of (non-functional) ⇓
with TreeAx4, it also describes how horizontal and vertical axes interplay
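
The tree axioms can be evaluated directly on a concrete ordered tree. The sketch below assumes s^ denotes the converse of s and s+ the transitive closure of s. The six-node tree and the helper names are illustrative assumptions.

```python
from itertools import combinations

# Ordered tree: 0 is the root with children 1, 2, 3; node 1 has children 4, 5.
CHILDREN = {0: [1, 2, 3], 1: [4, 5], 2: [], 3: [], 4: [], 5: []}
NODES = frozenset(CHILDREN)
IDENTITY = frozenset((x, x) for x in NODES)                      # "·"

def comp(a, b):
    return frozenset((x, z) for (x, y) in a for (y2, z) in b if y == y2)

def filt(a, phi):
    return frozenset((x, y) for (x, y) in a if y in phi)

def dom(a):
    return frozenset(x for (x, _) in a)

def converse(a):
    return frozenset((y, x) for (x, y) in a)

def plus(s):
    closure = frozenset(s)
    while True:
        bigger = closure | comp(closure, s)
        if bigger == closure:
            return closure
        closure = bigger

DOWN = frozenset((p, c) for p, cs in CHILDREN.items() for c in cs)               # ⇓
UP = converse(DOWN)                                                              # ⇑
RIGHT = frozenset(pair for cs in CHILDREN.values() for pair in zip(cs, cs[1:]))  # ⇒
LEFT = converse(RIGHT)                                                           # ⇐

def node_sets():
    nodes = sorted(NODES)
    return [frozenset(c) for k in range(len(nodes) + 1)
            for c in combinations(nodes, k)]

def tree_ax1(s):
    sp = plus(s)
    return (comp(sp, s) | s) == sp and (comp(s, sp) | s) == sp

def tree_ax2(s):
    return all(comp(filt(s, phi), converse(s)) == filt(IDENTITY, dom(filt(s, phi)))
               for phi in node_sets())

def tree_ax3():
    siblings_or_self = plus(LEFT) | plus(RIGHT) | IDENTITY
    return all(comp(filt(UP, phi), DOWN) == filt(siblings_or_self, dom(filt(UP, phi)))
               for phi in node_sets())

def tree_ax4():
    has_parent = dom(UP)                                         # 〈⇑〉
    return all(plus(s) == filt(plus(s), has_parent) for s in (LEFT, RIGHT))

results = {
    "TreeAx1": all(tree_ax1(s) for s in (DOWN, UP, LEFT, RIGHT)),
    "TreeAx2": all(tree_ax2(s) for s in (DOWN, LEFT, RIGHT)),
    "TreeAx2 for UP": tree_ax2(UP),
    "TreeAx3": tree_ax3(),
    "TreeAx4": tree_ax4(),
}
print(results)
```

The failing entry for ⇑ illustrates why TreeAx2 carries its side condition: going up and then down again may land on a sibling, which is exactly the case TreeAx3 describes.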

Theorem

The axioms presented so far are complete for Core XPath node expressions

Proof.
By reduction to simple node expressions and derivation of all axioms of modal logic of finite trees by Blackburn, Meyer-Viol, de Rijke

(boolean axioms)
〈s[¬〈·〉]〉 ≡ ¬〈·〉
〈s[φ ∨ ψ]〉 ≡ 〈s[φ]〉 ∨ 〈s[ψ]〉
φ ≤ ¬〈s[¬〈s^[φ]〉]〉
〈s[¬φ]〉 ∧ 〈s[φ]〉 ≡ ¬〈·〉   (for s distinct from ⇓)
〈s[φ]〉 ∨ 〈s[〈s+[φ]〉]〉 ≡ 〈s+[φ]〉
¬〈s[φ]〉 ∧ 〈s+[φ]〉 ≤ 〈s+[¬φ ∧ 〈s[φ]〉]〉
〈s[〈·〉]〉 ≤ 〈s+[¬〈s[〈·〉]〉]〉
TransAx1 for ⇓+ and ⇒+
〈⇓[¬〈⇐〉 ∧ ¬〈⇒*[φ]〉]〉 ≤ ¬〈⇓[φ]〉
〈⇓[φ]〉 ≤ 〈⇓[¬〈⇐〉]〉 ∧ 〈⇓[¬〈⇒〉]〉
¬〈⇑〉 ≤ ¬〈⇐〉 ∧ ¬〈⇒〉

The Nasty Trick

We can use this to provide an axiomatization for path expressions . . .

. . . of a sort: a non-orthodox one!
Add the separability rule:

(Sep) IF 〈A[p]〉 ≡ 〈B[p]〉 for p not occurring in A, B THEN A ≡ B.

Except for spoiling the whole equational story, it does not sit too well with the labelling axiom . . .

The Nasty Trick Does Its Job

. . . but it is perfect for obtaining complexity results for the query equivalence problem, by using reductions to the corresponding modal logics.

Complexity Theorem

Theorem

• Query equivalence of Core XPath(⇒+,⇐+), Core XPath(⇑+), and Core XPath(s) (for s ∈ {⇑,⇐,⇒}) is coNP-complete.

• Query equivalence of Core XPath(⇐+,⇐,⇒+,⇒,⇑+,⇑) is PSPACE-complete. Thus, the PSPACE upper bound applies to all its sublanguages.

• Query equivalence of Core XPath(⇓) and Core XPath(⇓+) is PSPACE-complete. Thus, all extensions of these fragments are PSPACE-hard.

• Query equivalence of Core XPath(⇓,⇓+) is EXPTIME-complete. Thus, all extensions of this fragment are EXPTIME-hard.

Proofs

. . . by reductions to complexity results for modal logics like K, GL, Alt.1 and fragments of tense/temporal logic on linear and branching orders.

The most interesting one is for the second clause: a somewhat tricky embedding into a logic of Sistla and Clarke.

What we have seen so far . . .

We have seen:

• equational axiomatizations for path equivalences of all eight single-axis fragments of Core XPath

• equational axiomatizations for node equivalences of full Core XPath 1.0

• a non-orthodox axiomatization for path equivalences of full Core XPath 1.0

• computational complexity results for path equivalences in most meaningful sublanguages of Core XPath 1.0

What we have not seen so far . . .

• Definability and expressivity results
• Results for stronger fragments of XPath

Expressive power

Possible yardsticks for expressive power on trees:
• First-order logic (FO) (cf. Codd completeness of SQL/RA)
• Monadic second-order logic (MSO)
• . . . (e.g., in between FO and MSO lies FO(TC))

What kind of queries do we want to characterize?
• Binary relations definable by path expressions?
• Node sets definable by node expressions?
• Properties of trees definable by node expressions evaluated at the root?

Possible types of characterizations:
• Syntactic (e.g., “L is equivalent to the two variable . . . ”) versus semantic (e.g., “bisimulation invariant fragment . . . ”)

Decidable characterizations?

Expressive power (ct’d)

Descendant-only fragment

CoreXPath(⇓+) node expressions have the same expressive power as MSO formulas ϕ(x) for which

(i) truth of ϕ(x) at a node depends only on the subtree
(ii) ϕ(x) does not distinguish children from descendants, i.e., the following operation preserves truth/falsity at the root:

Easy proof from the De Jongh-Sambin fixed point theorem for GL and the Janin-Walukiewicz theorem for the µ-calculus.

Moreover, the proof is effective: it yields a decision procedure.

A variant for path expressions is obtained by replacing
• µ-formulas by µ-programs
• bisimulation invariance by bisimulation safety

Expressive power (ct’d)

What about the full Core XPath language?

No decidable characterization in terms of MSO is known (but see a LICS 2010 result of Place/Segoufin for a decidable characterization of CoreXPath(⇓+,⇑+,⇐+,⇒+) in terms of forest algebras).

Syntactic characterization of full CXP (Marx-De Rijke 05)

Core XPath node expressions have the same expressive power as formulas φ(x) in the two-variable fragment of FO[R⇓, R⇓+, R⇒, R⇒+]. There is a similar characterization for path expressions.

Brief recap

In summary. . .

• Core XPath is reasonably expressive yet computationally attractive.

• In the remainder of this talk, we consider two extensions:

1 Regular XPath: the extension of Core XPath with full transitive closure.

2 Core XPath 2.0: the navigational core of XPath 2.0, featuring path intersection and complementation and more.

Motivation for Regular XPath

Extending XPath with transitive closure has been proposed for various reasons.

• Practical reasons: some applications require the use of transitive closure.
(an example: Nentwich et al. 2002, xlinkit: a consistency checking and smart link generation service. ACM Trans. on Int. Tech. 2002)

• Core XPath extended with transitive closure
  • has full first-order expressive power
  • is rich enough to express DTDs and
  • admits view-based query rewriting with recursive views

• very natural from the perspective of PDL

1 Part of community standard EXSLT
2 Implemented (naively) in Saxon and Xalan
3 . . . but left out by W3C from XPath 2.0

Syntax of Regular XPath

• Regular XPath has two types of expressions:

• path expressions
α ::= ⇑ | ⇓ | ⇐ | ⇒ | . | α/β | α ∪ β | α∗ | α[φ]

• node expressions
φ ::= p | ¬φ | φ ∧ ψ | 〈α〉

An example

“Go to the next book that has at least two authors.”

In Regular XPath:

(⇒[¬twoauthorbook])∗/⇒[twoauthorbook]

where twoauthorbook stands for book ∧ 〈⇓[author]/⇒+[author]〉 (here ⇒+ abbreviates ⇒/⇒∗).
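
The grammar of Regular XPath and the query above can be made concrete with a small evaluator. What follows is only a minimal sketch under assumptions of our own: trees are given as label and child lists, expressions are nested tuples, and the axes ⇑, ⇓, ⇐, ⇒ are spelled "up", "down", "left", "right". The names Tree, path, node, compose and star are hypothetical helpers, not part of any XPath implementation, and the naive fixpoint for α∗ makes no claim to efficiency.

```python
class Tree:
    """Ordered labelled tree; nodes are 0..n-1 and node 0 is the root."""

    def __init__(self, labels, children):
        self.labels = labels        # labels[n]: tag of node n
        self.children = children    # children[n]: ordered child list of node n
        self.nodes = range(len(labels))

    def step(self, axis):
        """Set of pairs for one basic axis: up, down, left, right."""
        rel = set()
        for n in self.nodes:
            kids = self.children[n]
            for i, c in enumerate(kids):
                if axis == "down":
                    rel.add((n, c))
                elif axis == "up":
                    rel.add((c, n))
                elif axis == "right" and i + 1 < len(kids):
                    rel.add((c, kids[i + 1]))
                elif axis == "left" and i > 0:
                    rel.add((c, kids[i - 1]))
        return rel


def compose(r, s):
    """Relational composition, the meaning of alpha/beta."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def star(r, nodes):
    """Reflexive transitive closure by naive fixpoint iteration."""
    result = {(n, n) for n in nodes}
    while True:
        bigger = result | compose(result, r)
        if bigger == result:
            return result
        result = bigger


def path(t, a):
    """Binary relation defined by path expression a on tree t."""
    op = a[0]
    if op in ("up", "down", "left", "right"):
        return t.step(op)
    if op == "self":
        return {(n, n) for n in t.nodes}
    if op == "seq":
        return compose(path(t, a[1]), path(t, a[2]))
    if op == "union":
        return path(t, a[1]) | path(t, a[2])
    if op == "star":
        return star(path(t, a[1]), t.nodes)
    if op == "filter":                      # alpha[phi]
        ok = node(t, a[2])
        return {(x, y) for (x, y) in path(t, a[1]) if y in ok}
    raise ValueError(op)


def node(t, f):
    """Set of nodes satisfying node expression f on tree t."""
    op = f[0]
    if op == "label":
        return {n for n in t.nodes if t.labels[n] == f[1]}
    if op == "not":
        return set(t.nodes) - node(t, f[1])
    if op == "and":
        return node(t, f[1]) & node(t, f[2])
    if op == "some":                        # <alpha>
        return {x for (x, _) in path(t, f[1])}
    raise ValueError(op)


RIGHT = ("right",)
right_plus = ("seq", RIGHT, ("star", RIGHT))
author = ("label", "author")
two_author_book = ("and", ("label", "book"),
                   ("some", ("seq", ("filter", ("down",), author),
                             ("filter", right_plus, author))))
next_two_author_book = ("seq",
                        ("star", ("filter", RIGHT, ("not", two_author_book))),
                        ("filter", RIGHT, two_author_book))

# Made-up document: a library (node 0) with books 1, 2, 3; only book 3 has two authors.
labels = ["library", "book", "book", "book", "author", "author", "author", "author"]
children = [[1, 2, 3], [4], [5], [6, 7], [], [], [], []]
t = Tree(labels, children)
result = {y for (x, y) in path(t, next_two_author_book) if x == 1}
```

Starting from book 1, result collects the nodes the query can reach; representing every path expression as a full set of pairs keeps the sketch short at the cost of computing far more than a single starting node needs.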

Another example

The following can be expressed in Regular XPath:

“The tree has an even number of nodes”

To see this:

• Let (α while φ) be shorthand for (.[φ]/α)∗.

• Let root be short for ¬〈⇑〉.
Let leaf be short for ¬〈⇓〉.
Let first be short for ¬〈⇐〉.
Let last be short for ¬〈⇒〉.

• Let suc be shorthand for ⇓[first] ∪ .[leaf]/(⇑ while last)/⇒
(the successor in depth first left-to-right ordering).

• Then 〈(suc/suc)∗[leaf]/(⇑ while last)[root]〉 is true at the root iff the number of nodes is odd: an even number of suc steps must lead from the root, the first node in this ordering, to the last one. Its negation therefore expresses that the number of nodes is even.
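
The construction can be checked by simulating it directly. The sketch below is an illustration under our own assumptions: a tree is a list of ordered child lists with node 0 as root, and the helper names make_tree, up_while_last, suc and expression_holds_at_root are hypothetical. Tracing the definitions, an even number of suc steps from the root ends at the last node in depth-first order exactly when the node count is odd, so the sketch negates the expression to test for an even count.

```python
def make_tree(children):
    """children[n] lists the children of node n in order; node 0 is the root."""
    parent, right = {0: None}, {0: None}
    for n, kids in enumerate(children):
        for i, c in enumerate(kids):
            parent[c] = n
            right[c] = kids[i + 1] if i + 1 < len(kids) else None
    return {"children": children, "parent": parent, "right": right}


def is_root(t, n):
    return t["parent"][n] is None       # root = not <up>


def is_leaf(t, n):
    return not t["children"][n]         # leaf = not <down>


def is_last(t, n):
    return t["right"][n] is None        # last = not <right>


def up_while_last(t, n):
    """Nodes reachable from n via (up while last) = (.[last]/up)*."""
    reached = [n]
    while is_last(t, n) and not is_root(t, n):
        n = t["parent"][n]
        reached.append(n)
    return reached


def suc(t, n):
    """suc = down[first] UNION .[leaf]/(up while last)/right; a node or None."""
    if t["children"][n]:
        return t["children"][n][0]
    for m in up_while_last(t, n):
        if t["right"][m] is not None:
            return t["right"][m]
    return None


def expression_holds_at_root(t):
    """<(suc/suc)*[leaf]/(up while last)[root]> evaluated at the root."""
    n = 0
    while n is not None:
        if is_leaf(t, n) and any(is_root(t, m) for m in up_while_last(t, n)):
            return True
        step = suc(t, n)
        n = suc(t, step) if step is not None else None
    return False


def has_even_number_of_nodes(t):
    """Negation of the expression above, following the traced parity."""
    return not expression_holds_at_root(t)
```

Because suc is a partial function that follows depth-first order, the loop visits each node at most once and stops when the ordering runs out.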

One more example

• Consider game trees:
  • leaves are labeled by Anne-wins or Bob-wins
  • inner nodes are labeled by Anne’s-move or Bob’s-move

• Puzzle: Show that “Anne has a winning strategy” is expressible.
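
The property in the puzzle is the usual backward-induction condition on finite game trees. The sketch below states that property directly in Python rather than as a Regular XPath expression, so it pins down what has to be expressed without solving the puzzle. The nested-tuple encoding, the ASCII spelling of the labels and the function name anne_wins are assumptions made for this illustration.

```python
ANNE_MOVE, BOB_MOVE = "Anne's-move", "Bob's-move"


def anne_wins(tree):
    """tree is a leaf label, or a pair (mover_label, list_of_subtrees)."""
    if tree == "Anne-wins":
        return True
    if tree == "Bob-wins":
        return False
    mover, subtrees = tree
    results = [anne_wins(s) for s in subtrees]
    # Anne needs one good move; when Bob moves, every option must still win for Anne.
    return any(results) if mover == ANNE_MOVE else all(results)


game = (ANNE_MOVE, [
    (BOB_MOVE, ["Anne-wins", "Bob-wins"]),
    (BOB_MOVE, ["Anne-wins", "Anne-wins"]),
])
anne_has_strategy = anne_wins(game)
```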

Expressive power of Regular XPath

• What is the expressive power of Regular XPath?

• We know that

FO ⊊ Regular XPath ⊆ FO(TC)

(The first inclusion follows from results by Marx 2004).

• A natural conjecture:

Regular XPath ≡ FO(TC)

(after all, Regular XPath has a transitive closure operator!)

• Ten Cate and Segoufin managed to prove this only after extending Regular XPath with a “within” operator W:

T, n ⊨ Wφ iff T_n, n ⊨ φ

(cf. temporal logics with forgettable past)
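
The effect of W can be seen on the simplest possible test, root = ¬〈⇑〉. The sketch below assumes that T_n denotes the subtree of T rooted at n, which the slide does not spell out, and uses hypothetical helper names and a made-up four-node tree. Evaluated in the whole tree, root holds only at node 0; evaluated "within" the subtree of each node, it holds everywhere, because the part of the tree above n has been forgotten.

```python
def subtree_nodes(children, n):
    """Nodes of T_n, assumed here to be the subtree of T rooted at n."""
    nodes, stack = set(), [n]
    while stack:
        m = stack.pop()
        nodes.add(m)
        stack.extend(children[m])
    return nodes


def parent_in(children, scope):
    """Parent map restricted to the nodes in scope."""
    return {c: p for p in scope for c in children[p] if c in scope}


def holds_root(children, scope, n):
    """root = not <up>, evaluated at n in the tree induced by scope."""
    return n not in parent_in(children, scope)


def holds_W_root(children, n):
    """T, n |= W root  iff  T_n, n |= root."""
    return holds_root(children, subtree_nodes(children, n), n)


children = [[1, 2], [3], [], []]        # node 0 is the root of T
whole = subtree_nodes(children, 0)
plain = {n for n in whole if holds_root(children, whole, n)}
within = {n for n in whole if holds_W_root(children, n)}
```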

Expressive power of Regular XPath (ct’d)

• FO(TC) is the extension of first-order logic with a transitive closure operator for binary relations.

Theorem: (Ten Cate and Segoufin, PODS 2008, JACM 2010)

Regular XPath(W) path expressions define the same binary relations as FO(TC) formulas with two free variables. Similarly for node expressions.

• Corollary: Regular XPath(W) is closed under path intersection and complementation.
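
As a reminder of what a transitive closure operator adds, the sketch below computes the closure of a finite binary relation by fixpoint iteration and applies it to a made-up child relation to obtain the descendant relation. Whether the closure is taken to be reflexive varies between presentations; this sketch computes the non-reflexive one. The function name and the example data are assumptions for illustration only.

```python
def transitive_closure(relation):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(relation)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


child = {(0, 1), (0, 2), (1, 3)}        # hypothetical child relation of a small tree
descendant = transitive_closure(child)
```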

Axiomatizations and complexity

No axiomatizations are known yet for Regular XPath and Regular XPath(W).

As for complexity,

• Query evaluation can still be performed in PTime even for Regular XPath(W).

• Query containment is still ExpTime-complete for Regular XPath, but it is 2ExpTime-complete for Regular XPath(W).


Core XPath 2.0


XPath 2.0 extends XPath 1.0 with many features, including the following new navigational operations:

• Intersection and complementation of path expressions: α intersect β and α except β

Example: ⇓∗[p] except ⇓∗[q]/⇓∗[p]

• Variables and for loops: for $x in α return β and α[. is $x]

Example: for $x in . return ⇓∗[p ∧ ¬⟨⇑∗[q]/⇑∗[. is $x]⟩]

• Core XPath 2.0 is the extension of Core XPath with these features.


• The path intersection and complementation turn Core XPath 2.0 into a version of Tarski’s relation algebra (interpreted on finite ordered trees).

• The variables and for-loops make it possible to give a linear translation from first-order logic to Core XPath 2.0:

TR(φ(x, y)) = for $x in ., $y in ⊤ return $y[TR′(φ)]

TR′(x = y) = ⟨⊤[. is $x ∧ . is $y]⟩
TR′(R⇓xy) = ⟨⊤[. is $x ∧ ⟨⇓[. is $y]⟩]⟩
TR′(R⇓∗xy) = ⟨⊤[. is $x ∧ ⟨⇓∗[. is $y]⟩]⟩
TR′(R⇒xy) = ⟨⊤[. is $x ∧ ⟨⇒[. is $y]⟩]⟩
TR′(R⇒∗xy) = ⟨⊤[. is $x ∧ ⟨⇒∗[. is $y]⟩]⟩
TR′(φ ∧ ψ) = TR′(φ) ∧ TR′(ψ)
TR′(¬φ) = ¬TR′(φ)
TR′(∃x.φ) = for $x in ⊤ return TR′(φ)

where ⊤ is shorthand for ⇑∗/⇓∗ (the universal relation)


Expressivity, Complexity and Axiomatization

• Core XPath 2.0 has the same expressive power as first-order logic, both with and without variables (in the case with variables there is a linear translation).

• The complexity of the query equivalence problem is non-elementary, both with and without variables.

(Even adding only path intersection to Core XPath makes it 2ExpTime-complete.)

• We have two complete axiomatizations of path equivalence in Core XPath 2.0: one with and one without variables.


The case without variables

• Recall that, without variables, Core XPath 2.0 is essentially a version of Relation Algebra interpreted on finite sibling-ordered trees.

• One apparent problem: Relation Algebra has no node tests. However, these can easily be translated away:

Pred1. α[φ ∧ ψ] ≡ α[φ][ψ]
Pred2. α[φ ∨ ψ] ≡ α[φ] ∪ α[ψ]
Pred3. α[¬φ] ≡ α − α[φ]
Pred4. α[⟨β⟩] ≡ α/((β/⊤) ∩ .)

• Besides these axioms, our axiomatization for variable-free Core XPath 2.0 contains two groups of axioms:

• General axioms of Boolean Algebra and Relation Algebra
• Axioms describing (first-order) properties of trees.
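The four Pred equivalences hold for arbitrary binary relations and node sets, not only on trees, so they can be sanity-checked by brute force. The following sketch does this over a two-element universe; the universe, the helper names and the set-based reading of node tests are assumptions made for the illustration, not part of the axiomatization itself.

```python
from itertools import chain, combinations

NODES = (0, 1)                                   # tiny hypothetical universe
TOP = {(x, y) for x in NODES for y in NODES}     # the universal relation
DOT = {(x, x) for x in NODES}                    # the identity relation "."

def subsets(items):
    """All subsets of a finite collection, as Python sets."""
    items = list(items)
    return [set(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def compose(r, s):
    """Relational composition r/s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def filt(alpha, phi):
    """alpha[phi]: keep the pairs whose target satisfies the node test phi."""
    return {(x, y) for (x, y) in alpha if y in phi}

def diamond(beta):
    """<beta>: the nodes from which beta leads somewhere."""
    return {x for (x, _) in beta}

def check_pred_axioms():
    relations = subsets(TOP)
    node_sets = subsets(NODES)
    everything = set(NODES)
    for alpha in relations:
        for phi in node_sets:
            # Pred3: negation of a test becomes relative complement
            assert filt(alpha, everything - phi) == alpha - filt(alpha, phi)
            for psi in node_sets:
                # Pred1: conjunction becomes iterated filtering
                assert filt(alpha, phi & psi) == filt(filt(alpha, phi), psi)
                # Pred2: disjunction becomes union
                assert filt(alpha, phi | psi) == filt(alpha, phi) | filt(alpha, psi)
        for beta in relations:
            # Pred4: the test <beta> becomes composition with a partial identity
            assert filt(alpha, diamond(beta)) == compose(alpha, compose(beta, TOP) & DOT)
    return len(relations)

print(check_pred_axioms())  # number of relations examined
```

A brute-force check on a tiny universe is of course no proof; it only confirms that the equivalences read sensibly in the relational semantics.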


Axioms of Boolean algebra

• BA1. α ∪ (β ∪ γ) ≡ (α ∪ β) ∪ γ
• BA2. α ∩ (β ∩ γ) ≡ (α ∩ β) ∩ γ
• BA3. α ∪ β ≡ β ∪ α
• BA4. α ∩ β ≡ β ∩ α
• BA5. α ∪ (β ∩ γ) ≡ (α ∪ β) ∩ (α ∪ γ)
• BA6. α ∩ (β ∪ γ) ≡ (α ∩ β) ∪ (α ∩ γ)
• BA7. α ∪ (α ∩ β) ≡ α
• BA8. α ∩ (α ∪ β) ≡ α
• BA9. α ∪ (⊤ − α) ≡ ⊤
• BA10. α ∩ (⊤ − α) ≡ ⊥
• BA11. α ∩ (⊤ − β) ≡ α − β


The axioms of Relation algebra

• RA1. α/(β/γ) ≡ (α/β)/γ
• RA2. α/. ≡ α
• RA3. (α ∪ β)/γ ≡ α/γ ∪ β/γ
• RA4. (α ∪ β)^ ≡ α^ ∪ β^
• RA5. (α/β)^ ≡ β^/α^
• RA6. (α^)^ ≡ α
• RA7. α/(⊤ − (α^/β)) ⊆ ⊤ except β

• To completely axiomatize relation algebra, one normally also needs to add Venema’s Rule:

If X is a relation variable not occurring in α and X − (((⊤ − .)/X/⊤) ∪ (⊤/X/(⊤ − .))) ⊆ α, then α ≡ ⊤.

Fortunately, this rule turns out to be derivable in our case.
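RA1 to RA7 are valid for arbitrary binary relations, reading α^ as the converse of α and ⊤ − β (written "⊤ except β" in RA7) as complement. The sketch below checks them exhaustively over a two-element universe. The universe and the function names are assumptions for the sake of the example, and Venema's Rule is not covered.

```python
from itertools import chain, combinations

NODES = (0, 1)                                         # tiny hypothetical universe
TOP = frozenset((x, y) for x in NODES for y in NODES)  # the universal relation
DOT = frozenset((x, x) for x in NODES)                 # the identity relation "."

def all_relations():
    """Every binary relation over NODES."""
    pairs = sorted(TOP)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1))]

def compose(r, s):
    """Relational composition r/s."""
    return frozenset((x, z) for (x, y) in r for (y2, z) in s if y == y2)

def conv(r):
    """Converse of r, written r^ in the axioms."""
    return frozenset((y, x) for (x, y) in r)

def check_relation_algebra_axioms():
    rels = all_relations()
    for a in rels:
        assert compose(a, DOT) == a                                   # RA2
        assert conv(conv(a)) == a                                     # RA6
        for b in rels:
            assert conv(a | b) == conv(a) | conv(b)                   # RA4
            assert conv(compose(a, b)) == compose(conv(b), conv(a))   # RA5
            assert compose(a, TOP - compose(conv(a), b)) <= TOP - b   # RA7
            for c in rels:
                assert compose(a, compose(b, c)) == compose(compose(a, b), c)  # RA1
                assert compose(a | b, c) == compose(a, c) | compose(b, c)      # RA3
    return len(rels)

print(check_relation_algebra_axioms())  # number of relations examined
```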


The axioms for finite sibling ordered trees

Tr1. ⇓+/⇓+ ⊆ ⇓+
Tr2. ⇓+ ∩ ⇑+ ≡ ⊥
Tr3a. ⇓+ ≡ ⇓ ∪ (⇓/⇓+)
Tr3b. ⇓ ≡ ⇓+ − (⇓+/⇓+)
Tr4. .[⟨⇑⟩] ≡ .[⟨⇑+[¬⟨⇑⟩]⟩]
Tr5. ⇓+/⇑+ ≡ ⇓+ ∪ .[⟨⇓⟩] ∪ .[⟨⇓⟩]/⇑+
Tr6. ⇒+/⇒+ ⊆ ⇒+
Tr7. ⇒+ ∩ ⇐+ ≡ ⊥
Tr8a. ⇒+ ≡ ⇒ ∪ (⇒/⇒+)
Tr8b. ⇒ ≡ ⇒+ − (⇒+/⇒+)
Tr9. .[⟨⇐⟩] ≡ .[⟨⇐+[¬⟨⇐⟩]⟩]
Tr10. ⇒+ ∪ ⇐+ ≡ (⇑/⇓) − .
Tr11. . ∪ ⇑+ ∪ ⇓+ ∪ (⇑∗/⇒+/⇓∗) ∪ (⇑∗/⇐+/⇓∗) ≡ ⊤

Ind. ⊤[⟨α⟩] ≡ ⊤[⟨α − (α/<<)⟩]
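Several of these axioms can be checked directly on a concrete sibling-ordered tree. The sketch below builds the child and next-sibling relations of a small hypothetical tree and evaluates Tr1 to Tr3b, Tr6 to Tr8b, Tr10 and Tr11 as set (in)equalities. The tree and all names are assumptions, and the remaining axioms (Tr4, Tr5, Tr9, Ind) are deliberately left out of this check.

```python
# Hypothetical sibling-ordered tree: each node maps to its ordered list of children.
CHILDREN = {"r": ["a", "b", "c"], "a": ["d", "e"], "b": [], "c": ["f"],
            "d": [], "e": [], "f": []}
NODES = set(CHILDREN)
TOP = {(x, y) for x in NODES for y in NODES}   # the universal relation
DOT = {(x, x) for x in NODES}                  # the identity relation "."

def compose(r, s):
    """Relational composition r/s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def conv(r):
    """Converse relation."""
    return {(y, x) for (x, y) in r}

def plus(r):
    """Transitive closure r+, computed as a fixpoint."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger

DOWN = {(p, c) for p, cs in CHILDREN.items() for c in cs}                        # child
RIGHT = {(cs[i], cs[i + 1]) for cs in CHILDREN.values() for i in range(len(cs) - 1)}  # next sibling
UP, LEFT = conv(DOWN), conv(RIGHT)
DOWN_P, UP_P, RIGHT_P, LEFT_P = plus(DOWN), plus(UP), plus(RIGHT), plus(LEFT)
DOWN_S, UP_S = DOWN_P | DOT, UP_P | DOT                                          # reflexive closures

checks = {
    "Tr1": compose(DOWN_P, DOWN_P) <= DOWN_P,
    "Tr2": not (DOWN_P & UP_P),
    "Tr3a": DOWN_P == (DOWN | compose(DOWN, DOWN_P)),
    "Tr3b": DOWN == (DOWN_P - compose(DOWN_P, DOWN_P)),
    "Tr6": compose(RIGHT_P, RIGHT_P) <= RIGHT_P,
    "Tr7": not (RIGHT_P & LEFT_P),
    "Tr8a": RIGHT_P == (RIGHT | compose(RIGHT, RIGHT_P)),
    "Tr8b": RIGHT == (RIGHT_P - compose(RIGHT_P, RIGHT_P)),
    "Tr10": (RIGHT_P | LEFT_P) == (compose(UP, DOWN) - DOT),
    "Tr11": (DOT | UP_P | DOWN_P
             | compose(compose(UP_S, RIGHT_P), DOWN_S)
             | compose(compose(UP_S, LEFT_P), DOWN_S)) == TOP,
}

for name, holds in checks.items():
    print(name, holds)
```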


Rounding up

Three languages:

• Core XPath: the navigational core of XPath 1.0
  Expressivity: FO²
  Query evaluation: PTime
  Query equivalence: ExpTime-complete

• Regular XPath(W): the extension with ∗ and W
  Expressivity: same as FO(TC)
  Query evaluation: PTime
  Query equivalence: 2ExpTime-complete

• Core XPath 2.0: the navigational core of XPath 2.0
  Expressivity: same as FO
  Query evaluation: PSpace-complete
  Query equivalence: non-elementary hard


Some references

Expressive power:

Marx and De Rijke. Semantic characterizations of navigational XPath. SIGMOD Record 34(2), 2005.

Ten Cate, Fontaine and Litak. Some modal aspects of XPath. M4M’07. Journal version to appear in a special “20th birthday” issue of JANCL.

Ten Cate and Segoufin. XPath, transitive closure logic, and nested tree walking automata. Journal of the ACM, 2010. Extended abstract appeared in PODS 2008.

Place/Segoufin in LICS 2010, Fontaine/Place in MFCS 2010 . . .

Axiomatization:

Ten Cate, Fontaine and Litak. Some modal aspects of XPath. M4M’07. Journal version to appear in a special “20th birthday” issue of JANCL.

Ten Cate, Litak and Marx. Complete axiomatizations of XPath fragments. JAL 2010. Extended abstract presented at LiD 2008.

Ten Cate and Marx. Axiomatizing the logical core of XPath 2.0. ICDT’07.

Complexity:

Gottlob, Koch and Pichler. Efficient algorithms for processing XPath queries. TODS 30(2), 2005.

Ten Cate and Lutz. Query containment in very expressive XPath dialects. PODS’07.