Keeping Chess Alive – Do we need 1-unambiguous content models?
Murali Mani, UCLA/CSD
Extreme Markup Languages 2001
Montreal, Canada
Outline of the talk
Why is 1-unambiguity important?
Formalize a few concepts and learn –
– There exist regular languages that are inherently not 1-unambiguous.
We do not need 1-unambiguity
– No additional benefit
– Difficulty for document processing (type inference)
Why is 1-unambiguity important?
XML 1.0 specification
[3.2.1] – “It is an error if an element in the document can match more than one occurrence of an element type in the content model”.
[App E] – “The content model (b, c) | (b, d) is in error and may be reported as an error.”
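A minimal sketch of how such an error can be detected, assuming the usual characterization that an expression is 1-unambiguous exactly when its position (Glushkov) automaton is deterministic. The tuple encoding of content models and the function names are assumptions of this sketch, not something taken from the specification.

```python
from itertools import count

# Content models as nested tuples (an encoding assumed for this sketch):
#   ("sym", name), ("seq", e1, e2), ("alt", e1, e2), ("star", e), ("opt", e)

def positions(expr):
    """Return (first, follow, names) of the position automaton of expr."""
    counter = count(1)
    names = {}    # position -> element name at that position
    follow = {}   # position -> positions that may match the next element

    def walk(e):
        # Returns (nullable, first positions, last positions) for e.
        kind = e[0]
        if kind == "sym":
            p = next(counter)
            names[p] = e[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind in ("star", "opt"):
            _, first, last = walk(e[1])
            if kind == "star":
                for p in last:          # a repetition may start over
                    follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(e[1])
        n2, f2, l2 = walk(e[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                    # kind == "seq": e2 may follow e1
            follow[p] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last

    _, first, _ = walk(expr)
    return first, follow, names

def is_one_unambiguous(expr):
    """True if no set of competing positions repeats an element name."""
    first, follow, names = positions(expr)
    for competing in [first, *follow.values()]:
        seen = [names[p] for p in competing]
        if len(seen) != len(set(seen)):
            return False
    return True

b, c, d = ("sym", "b"), ("sym", "c"), ("sym", "d")
APP_E_MODEL = ("alt", ("seq", b, c), ("seq", b, d))   # (b, c) | (b, d)
FACTORED = ("seq", b, ("alt", c, d))                  # b, (c | d)

print(is_one_unambiguous(APP_E_MODEL))  # False: an initial b matches two positions
print(is_one_unambiguous(FACTORED))     # True: same language, one b position
```

The factored form accepts the same two sequences, which is why this particular error is easy to repair. The talk is concerned with languages where no such rewriting exists.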
Why is 1-unambiguity important? (contd…)
XML Schema [3.8.6] – Schema Component Constraint: Unique Particle Attribution
“A content model must be formed such that during validation of an element information item sequence, the particle with which we attempt to validate each item in the sequence can be uniquely determined without examining the content or attributes of that element, and without any information about items in the remainder of the sequence.”
Concepts
Regular expression – ‘,’, ‘|’, ‘*’
– (a | b)*, c
Model group – other operators also – ‘+’, ‘?’, ‘&’
– a?, (b | c)* = (a, (b | c)*) | (b | c)*
Every regular expression is a model group.
Every model group can be expressed as a regular expression.
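The rewriting of a?, (b | c)* into (a, (b | c)*) | (b | c)* can be checked mechanically. The sketch below assumes each element name is abbreviated to one letter so that Python's re module can stand in for a content-model matcher; it compares the two expressions on every word up to a small length, which is evidence of equivalence rather than a proof.

```python
import re
from itertools import product

# One letter per element name (an abbreviation assumed for this sketch).
MODEL_GROUP = re.compile(r"a?(b|c)*")              # a?, (b | c)*
PLAIN_REGEX = re.compile(r"(a(b|c)*)|((b|c)*)")    # (a, (b | c)*) | (b | c)*

def accepted(pattern, max_len=5, alphabet="abc"):
    """All words over alphabet, up to max_len letters, that pattern accepts."""
    words = set()
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if pattern.fullmatch(word):
                words.add(word)
    return words

same = accepted(MODEL_GROUP) == accepted(PLAIN_REGEX)
print(same)
```

The ‘?’ operator is eliminated by listing the two cases, with and without the optional a, as alternatives.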
1-unambiguous content models
Ambiguity in Graphs and Expressions – Book, Even, Greibach, Ott, 1971
– Given a regular expression, E, is E ambiguous?
– For example, (a | (a, b*)) is ambiguous
Deterministic Regular Languages – Anne Brüggemann-Klein, 1991
– Studied 1-unambiguity in SGML content models
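To make the ambiguity of (a | (a, b*)) concrete, the sketch below counts the distinct ways an expression can derive a word; an expression is ambiguous when some word has more than one derivation. The tuple encoding and the function name are assumptions of this sketch, and a starred sub-expression is only allowed to consume non-empty pieces so that the count stays finite.

```python
from functools import lru_cache

# Expressions as nested tuples (an encoding assumed for this sketch):
#   ("sym", name), ("seq", e1, e2), ("alt", e1, e2), ("star", e)

def count_parses(expr, word):
    """Number of distinct ways expr derives word (a tuple of symbols)."""

    @lru_cache(maxsize=None)
    def ways(e, i, j):
        # Ways in which e derives the slice word[i:j].
        kind = e[0]
        if kind == "sym":
            return 1 if j - i == 1 and word[i] == e[1] else 0
        if kind == "alt":
            return ways(e[1], i, j) + ways(e[2], i, j)
        if kind == "seq":
            return sum(ways(e[1], i, k) * ways(e[2], k, j)
                       for k in range(i, j + 1))
        if kind == "star":
            if i == j:
                return 1            # zero iterations
            # The first iteration takes a non-empty piece word[i:k].
            return sum(ways(e[1], i, k) * ways(e, k, j)
                       for k in range(i + 1, j + 1))
        raise ValueError(f"unknown node kind: {kind}")

    return ways(expr, 0, len(word))

a, b, c, d = (("sym", s) for s in "abcd")
AMBIGUOUS = ("alt", a, ("seq", a, ("star", b)))      # (a | (a, b*))
APP_E = ("alt", ("seq", b, c), ("seq", b, d))        # (b, c) | (b, d)

print(count_parses(AMBIGUOUS, ("a",)))      # 2: via the left a, or via a, b* with no b
print(count_parses(APP_E, ("b", "c")))      # 1
```

The second call shows that the two notions differ: (b, c) | (b, d) derives every word in exactly one way, so it is unambiguous in the 1971 sense, yet it is not 1-unambiguous because the choice between the two b's cannot be made without looking at the next element.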
1-unambiguous content models (contd…)
Reasoning about XML Schema Languages using Formal Language Theory – Dongwon Lee, Murali Mani, Makoto Murata, 2000
– Content models without the 1-unambiguous constraint
http://www.oasis-open.org/cover/topics.html#ambiguity
Example content model -- (whitemove, blackmove)*, whitemove?
Type assignment
Type assignment (contd…)
Assumption – If the type of an element can be determined by a SAX parser on seeing the start element tag, it is sufficient.
DTDs and XML-Schema have the above property even without the 1-unambiguity constraint.
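A minimal sketch of the assumption, using Python's xml.sax. The lookup tables ROOT_TYPE and CHILD_TYPE and the type names are hypothetical; they stand for a schema in which the type of an element depends only on its tag name and the type of its parent, which is how the sketch interprets the property stated above.

```python
import xml.sax

# Hypothetical schema: the type of an element is a function of
# (type of its parent, its tag name). All names here are made up.
ROOT_TYPE = {"game": "Game"}
CHILD_TYPE = {
    ("Game", "whitemove"): "WhiteMove",
    ("Game", "blackmove"): "BlackMove",
}

class TypeAssigner(xml.sax.ContentHandler):
    """Assigns a type to each element as soon as its start tag is seen."""

    def __init__(self):
        super().__init__()
        self.stack = []      # types of the currently open elements
        self.assigned = []   # (tag name, type) in document order

    def startElement(self, name, attrs):
        if self.stack:
            element_type = CHILD_TYPE[(self.stack[-1], name)]
        else:
            element_type = ROOT_TYPE[name]
        self.stack.append(element_type)
        self.assigned.append((name, element_type))

    def endElement(self, name):
        self.stack.pop()

def assign_types(xml_text):
    handler = TypeAssigner()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.assigned

for tag, element_type in assign_types(
        "<game><whitemove/><blackmove/><whitemove/></game>"):
    print(tag, element_type)
```

The handler never asks which occurrence of whitemove in the content model was matched, so the type assignment does not rely on the content model being 1-unambiguous.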
Disadvantages of having the 1-unambiguity constraint
Significant loss in ability to describe constraints – the game of chess might be described as (whitemove | blackmove)*
We lost the following constraints
– whitemove and blackmove alternate
– We start with a whitemove
Shall we stick to the chess rules?
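The loss can be illustrated by running both content models over a few move sequences. The sketch assumes whitemove and blackmove are abbreviated to the letters w and b so that Python's re module can play the role of the validator; the helper names are made up for the example.

```python
import re

# w = whitemove, b = blackmove (abbreviations assumed for this sketch).
STRICT = re.compile(r"(wb)*w?")   # (whitemove, blackmove)*, whitemove?
LOOSE = re.compile(r"(w|b)*")     # (whitemove | blackmove)*

LETTER = {"whitemove": "w", "blackmove": "b"}

def valid(pattern, moves):
    """True if the sequence of move elements matches the content model."""
    word = "".join(LETTER[m] for m in moves)
    return pattern.fullmatch(word) is not None

GAMES = [
    ["whitemove", "blackmove", "whitemove"],   # legal
    ["blackmove", "whitemove"],                # black moves first
    ["whitemove", "whitemove"],                # white moves twice in a row
]

for game in GAMES:
    print(valid(STRICT, game), valid(LOOSE, game))
```

The strict model accepts only the first game, while the loose model accepts all three, including the two that break the rules listed above.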
Disadvantages of having the 1-unambiguity constraint (contd…)
Difficult for document processing, and type inference
– No characterization of 1-unambiguous model groups
– Fewer constraints => less algebraic optimization is possible.
Conclusions
One class of schema languages identified by the property – the type of an element can be determined by a depth first traversal (SAX parser) on seeing the start element tag
Such schema languages do not need the 1-unambiguity constraint.
1-unambiguity constraint is difficult to work with for type inference, and for playing chess.
Acknowledgements
XML-DEV mailing list – the discussions in this list largely motivated this talk.
Additional material at this conference
Taxonomy of XML Schema languages using Formal Language Theory – Aug 15, 4:00 pm
RELAX NG: Unification of RELAX Core and TREX – Aug 17, 9:00 am