Keeping Chess Alive – Do we need 1-unambiguous content models? Murali Mani, UCLA/CSD Extreme...


Page 1: Keeping Chess Alive – Do we need 1-unambiguous content models? Murali Mani, UCLA/CSD Extreme Markup Languages 2001 Montreal, Canada.

Keeping Chess Alive – Do we need 1-unambiguous content models?

Murali Mani, UCLA/CSD

Extreme Markup Languages 2001

Montreal, Canada

Page 2:

Outline of the talk

Why is 1-unambiguity important?

Formalize a few concepts and learn:
– There exist regular languages that are inherently not 1-unambiguous.

We do not need 1-unambiguity:
– No additional benefit
– Difficulty for document processing (type inference)

Page 3:

Why is 1-unambiguity important?

XML 1.0 specification

[3.2.1] – “It is an error if an element in the document can match more than one occurrence of an element type in the content model”.

[App E] – “The content model (b, c) | (b, d) is in error and may be reported as an error.”
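The rule quoted above can be illustrated with a small determinism check. What follows is a minimal sketch, assuming content models are written as a tiny tuple-based AST with only the ',', '|' and '*' operators; the helper names (`sym`, `seq`, `alt`, `star`, `analyse`, `is_deterministic`) are hypothetical and do not come from the talk or from any XML library. It computes the usual first/follow position sets of the expression and reports a conflict when two different occurrences of the same element name compete for the next element.

```python
from itertools import count

# Tiny AST constructors (hypothetical helpers, not from the talk).
def sym(name): return ("sym", name)
def seq(left, right): return ("seq", left, right)
def alt(left, right): return ("alt", left, right)
def star(inner): return ("star", inner)

def analyse(expr):
    """Return (first, follow, labels) over the symbol positions of expr."""
    labels, follow = {}, {}
    ids = count()

    def walk(e):
        # Returns (nullable, first positions, last positions) for e.
        kind = e[0]
        if kind == "sym":
            p = next(ids)
            labels[p] = e[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "star":
            _, first, last = walk(e[1])
            for p in last:          # the loop can restart after any last position
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(e[1])
        n2, f2, l2 = walk(e[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        # Sequence: the right operand may follow any last position of the left.
        for p in l1:
            follow[p] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last

    _, first, _ = walk(expr)
    return first, follow, labels

def is_deterministic(expr):
    """True if no two competing positions carry the same element name."""
    first, follow, labels = analyse(expr)
    for candidates in [first, *follow.values()]:
        names = [labels[p] for p in candidates]
        if len(names) != len(set(names)):
            return False
    return True

# (b, c) | (b, d): an initial b could match either occurrence of b.
ambiguous = alt(seq(sym("b"), sym("c")), seq(sym("b"), sym("d")))
# b, (c | d): same language, but only one occurrence of b.
factored = seq(sym("b"), alt(sym("c"), sym("d")))
print(is_deterministic(ambiguous), is_deterministic(factored))
```

Under these assumptions the first model is rejected and the factored form is accepted, which matches the Appendix E example.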

Page 4:

Why is 1-unambiguity important? (contd…)

XML Schema [3.8.6] – Schema Component Constraint: Unique Particle Attribution

“A content model must be formed such that during validation of an element information item sequence, the particle with which we attempt to validate each item in the sequence can be uniquely determined without examining the content or attributes of that element, and without any information about items in the remainder of the sequence”

Page 5:

Concepts

Regular expression – operators ‘,’, ‘|’, ‘*’
– Example: (a | b)*, c

Model group – other operators also: ‘+’, ‘?’, ‘&’
– Example: a?, (b | c)* = (a, (b | c)*) | (b | c)*

Every regular expression is a model group.

Every model group can be expressed as a regular expression.
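The identity a?, (b | c)* = (a, (b | c)*) | (b | c)* on this slide can be sanity-checked mechanically. The sketch below assumes each element name is encoded as a single letter so that Python's `re` module can stand in for a content-model matcher; `words` and `agree` are hypothetical helper names. It compares the two forms on every word up to a bounded length, which is a spot check and not a proof.

```python
import re
from itertools import product

# Model group with '?' and its rewriting using only ',', '|' and '*'.
optional_form = re.compile(r"a?(b|c)*")
expanded_form = re.compile(r"(a(b|c)*)|(b|c)*")

def words(alphabet, max_len):
    """Yield every word over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agree(max_len=6):
    """True if both forms accept exactly the same words up to max_len."""
    return all(
        bool(optional_form.fullmatch(w)) == bool(expanded_form.fullmatch(w))
        for w in words("abc", max_len)
    )

print(agree())
```

The same approach would extend to '+' (rewritten as e, e*), while '&' would need an enumeration of the permutations of its operands.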

Page 6:

1-unambiguous content models

Ambiguity in Graphs and Expressions – Book, Even, Greibach, Ott, 1971
– Given a regular expression, E, is E ambiguous?
– For example, (a | (a, b*)) is ambiguous

Deterministic Regular Languages – Anne Brüggemann-Klein, 1991
– Studied 1-unambiguity in SGML content models

Page 7:

1-unambiguous content models (contd…)

Reasoning about XML Schema Languages using Formal Language Theory – Dongwon Lee, Murali Mani, Makoto Murata, 2000
– Content models without the 1-unambiguous constraint

http://www.oasis-open.org/cover/topics.html#ambiguity

Example content model – (whitemove, blackmove)*, whitemove?
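To make the example content model concrete, here is a minimal validation sketch. It assumes the children of a game element are available as a list of element names, and it encodes whitemove as "w" and blackmove as "b" so that Python's `re` module can play the role of the content-model matcher; `CODES`, `MODEL` and `is_valid_game` are hypothetical names introduced only for this illustration.

```python
import re

# One letter per element name; any other name maps to "?" and never matches.
CODES = {"whitemove": "w", "blackmove": "b"}

# (whitemove, blackmove)*, whitemove?
MODEL = re.compile(r"(wb)*w?")

def is_valid_game(children):
    """Check a sequence of child element names against the content model."""
    encoded = "".join(CODES.get(name, "?") for name in children)
    return MODEL.fullmatch(encoded) is not None

print(is_valid_game(["whitemove", "blackmove", "whitemove"]))  # True
print(is_valid_game(["blackmove", "whitemove"]))               # False
```

A backtracking matcher like this one has no difficulty with the model, even though a validator that commits to a particle on each start tag cannot tell whether a given whitemove belongs to the repeated pair or is the optional final move.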

Page 8:

Type assignment

Page 9:

Type assignment (contd…)

Assumption – If the type of an element can be determined by a SAX parser on seeing the start element tag, it is sufficient.

DTDs and XML-Schema have the above property even without the 1-unambiguity constraint.
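The assumption above can be sketched with Python's standard `xml.sax` module. The schema table `TYPE_OF` is hypothetical: it assumes that an element's type depends only on its parent's type and its own tag name, which is one way to read the property claimed for DTDs and XML Schema. The class and function names are likewise illustrative and not taken from the talk.

```python
import xml.sax

# Hypothetical schema: (parent type, tag name) -> type of the element.
TYPE_OF = {
    (None, "game"): "Game",
    ("Game", "whitemove"): "Move",
    ("Game", "blackmove"): "Move",
}

class TypeAssigner(xml.sax.ContentHandler):
    """Assigns a type to each element as soon as its start tag is seen."""

    def __init__(self):
        super().__init__()
        self.stack = []      # types of the currently open elements
        self.assigned = []   # (tag name, type) in document order

    def startElement(self, name, attrs):
        parent = self.stack[-1] if self.stack else None
        elem_type = TYPE_OF.get((parent, name), "UNKNOWN")
        self.assigned.append((name, elem_type))
        self.stack.append(elem_type)

    def endElement(self, name):
        self.stack.pop()

def assign_types(xml_text):
    handler = TypeAssigner()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.assigned

print(assign_types("<game><whitemove/><blackmove/></game>"))
```

Note that the lookup never consults the position of the element within its parent's content model, so in this sketch the type is fixed on the start tag whether or not the content model is 1-unambiguous.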

Page 10:

Disadvantages of having the 1-unambiguity constraint

Significant loss in ability to describe constraints – the game of chess might be described as (whitemove | blackmove)*

We lost the following constraints:
– whitemove and blackmove alternate
– We start with a whitemove

Shall we stick to the chess rules?
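The loss of constraints can be made visible by listing move sequences that the relaxed model accepts but the alternating model (whitemove, blackmove)*, whitemove? rejects. This is a small sketch, assuming whitemove is encoded as "w" and blackmove as "b" and using Python's `re` module as the matcher; `STRICT`, `LOOSE` and `wrongly_accepted` are hypothetical names.

```python
import re
from itertools import product

STRICT = re.compile(r"(wb)*w?")   # (whitemove, blackmove)*, whitemove?
LOOSE = re.compile(r"(w|b)*")     # (whitemove | blackmove)*

def wrongly_accepted(max_len=2):
    """Sequences up to max_len accepted by LOOSE but rejected by STRICT."""
    result = []
    for n in range(max_len + 1):
        for letters in product("wb", repeat=n):
            word = "".join(letters)
            if LOOSE.fullmatch(word) and not STRICT.fullmatch(word):
                result.append(word)
    return result

print(wrongly_accepted())  # ['b', 'ww', 'bw', 'bb']
```

Each listed sequence either starts with a black move or repeats a colour, which are exactly the two constraints named on the slide.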

Page 11:

Disadvantages of having the 1-unambiguity constraint (contd…)

Difficult for document processing and type inference:
– No characterization of 1-unambiguous model groups
– Fewer constraints => less algebraic optimization is possible.

Page 12:

Conclusions

One class of schema languages identified by the property – the type of an element can be determined by a depth first traversal (SAX parser) on seeing the start element tag

Such schema languages do not need the 1-unambiguity constraint.

1-unambiguity constraint is difficult to work with for type inference, and for playing chess.

Page 13:

Acknowledgements

XML-DEV mailing list – the discussions in this list largely motivated this talk.

Page 14:

Additional material at this conference

Taxonomy of XML Schema languages using Formal Language Theory – Aug 15, 4:00 pm

RELAX NG: Unification of RELAX Core and TREX – Aug 17, 9:00 am