XML's validation - DTD

Validation - DTDs Nguyễn Đăng Khoa

Content:
- Element declarations
- Attribute declarations
- Entity declarations

References: Beginning XML, 5th Edition, Joe Fawcett, Liam R. E. Quin, Danny Ayers


Page 1: XML's validation - DTD

Validation - DTDs

Nguyễn Đăng Khoa

Page 2: XML's validation - DTD

Content

• Document Type Definitions (DTDs)

• XML Schemas

Page 3: XML's validation - DTD

DTD – What’s a DTD?

• A DTD is a set of rules that defines the elements and attributes allowed in an XML document

• DTD defines the “grammar” for an XML document

• DTDs were created as part of SGML

Page 4: XML's validation - DTD

DTD’s goals

• Check whether an XML document is valid

Page 5: XML's validation - DTD

When to use a DTD

• To create and manage large sets of documents for your company

• To define clearly what markup may be used in certain documents and how markup should be sequenced

• To provide a common frame of reference for documents that many users can share

Page 6: XML's validation - DTD

When NOT to use a DTD

• You’re working with only one or a few small documents

• You’re using a nonvalidating processor to handle your XML documents


Page 9: XML's validation - DTD

DTD – Internal subset declarations

<!DOCTYPE name_of_root [ …. declarations …]>

• Declarations appear between the [ and ]
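As a minimal sketch of an internal subset, the document below carries its own declarations between the [ and ]. The name, first, middle, and last elements follow the declarations used elsewhere in these slides; the sample values are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Internal subset: the declarations live inside the document itself -->
<!DOCTYPE name [
  <!ELEMENT name (first, middle, last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT middle (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<name>
  <first>John</first>
  <middle>Fitzgerald</middle>
  <last>Doe</last>
</name>
```

The name after <!DOCTYPE has to match the root element, here name.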

Page 10: XML's validation - DTD

DTD – External subset declarations

• System Identifiers

<!DOCTYPE name_of_root SYSTEM "URI to DTD file" [ …. declarations …]>

– Example:
• <!DOCTYPE name SYSTEM "file:///c:/name.dtd" [ ]>
• <!DOCTYPE name SYSTEM "http://wiley.com/hr/name.dtd" [ ]>
• <!DOCTYPE name SYSTEM "name.dtd">
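A rough sketch of how the two files could fit together when a relative system identifier is used. The file name name.dtd comes from the slide; the file contents and the name.xml file name are assumptions for illustration. First the external subset, which holds only declarations:

```xml
<!-- name.dtd: the external subset contains declarations only -->
<!ELEMENT name (first, middle, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT middle (#PCDATA)>
<!ELEMENT last (#PCDATA)>
```

Then the document that points at it:

```xml
<?xml version="1.0"?>
<!-- name.xml: the relative system identifier resolves to name.dtd next to this file -->
<!DOCTYPE name SYSTEM "name.dtd">
<name>
  <first>John</first>
  <middle>Fitzgerald</middle>
  <last>Doe</last>
</name>
```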

Page 11: XML's validation - DTD

DTD – External subset declarations

• Public Identifiers

<!DOCTYPE name_of_root PUBLIC "entry in a catalog" optional_system_identifier>

– Example:
• <!DOCTYPE name PUBLIC "-//Beginning XML//DTD Name Example//EN">
• <!DOCTYPE name PUBLIC "-//Beginning XML//DTD Name Example//EN" "name.dtd">

– Common format is Formal Public Identifiers (FPIs): -//Owner//Class Description//Language//Version

Page 12: XML's validation - DTD

DTD – External subset declarations

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
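A public identifier is only useful if the processor can look it up. One common way to provide that lookup is an OASIS XML Catalog file; the sketch below assumes a hypothetical catalog.xml and a hypothetical local path dtds/name.dtd, and whether a given parser consults it depends on its configuration. Note also that the XML 1.0 grammar expects a system identifier after the public identifier in a DOCTYPE declaration, so the two-literal form is the safer one to write.

```xml
<?xml version="1.0"?>
<!-- catalog.xml (hypothetical): maps a public identifier to a local DTD copy -->
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Beginning XML//DTD Name Example//EN"
          uri="dtds/name.dtd"/>
</catalog>
```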

Page 13: XML's validation - DTD

Anatomy of a DTD

• Element declarations

• Attribute declarations

• Entity declarations

Page 14: XML's validation - DTD

DTD – Element declarations

• declare each element that appears within the document

• can include declarations for optional elements

<!ELEMENT name (first, middle, last)>

• ELEMENT declaration: begins with <!ELEMENT
• element name: name
• element content model: (first, middle, last)

Page 15: XML's validation - DTD

DTD – Element content models

• Element

• Mixed

• Empty

• Any

Page 16: XML's validation - DTD

DTD – Element Content

• Include the allowable elements within parentheses

<!ELEMENT contact (name)>

<!ELEMENT contact (name, location, phone, knows, description)>

• Child elements can be combined as:
– Sequences
– Choices

Page 17: XML's validation - DTD

DTD – Element Content – Sequences

• Elements within these documents must appear in a distinct order

<!ELEMENT name (first, middle, last)>

<!ELEMENT contact (name, location, phone, knows, description)>

Page 18: XML's validation - DTD

DTD – Element Content – Sequences

Error when the parent element:
• is missing one of the elements
• contains more elements
• contains the elements in a different order
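To make the three error cases concrete, here are a few fragments checked against <!ELEMENT name (first, middle, last)>. They are separate fragments, not one document; the sample values and the suffix element are hypothetical.

```xml
<!-- Valid: first, middle, last, each exactly once and in that order -->
<name><first>John</first><middle>Fitzgerald</middle><last>Doe</last></name>

<!-- Invalid: middle is missing -->
<name><first>John</first><last>Doe</last></name>

<!-- Invalid: an extra child (hypothetical suffix) is not in the content model -->
<name><first>John</first><middle>Fitzgerald</middle><last>Doe</last><suffix>Jr.</suffix></name>

<!-- Invalid: the children appear in a different order -->
<name><last>Doe</last><first>John</first><middle>Fitzgerald</middle></name>
```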

Page 19: XML's validation - DTD

DTD – Element Content – Choices

• Allow one element or another, but not both

<!ELEMENT location (address | GPS)>

Page 20: XML's validation - DTD

DTD – Element Content – Choices

Error when the parent element:
• is empty
• contains more than one of these elements
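The same kind of check for <!ELEMENT location (address | GPS)>. These are separate fragments; the sketch assumes address and GPS hold plain text, which the slides do not state, and the values are made up.

```xml
<!-- Valid: exactly one of the two alternatives -->
<location><address>123 Main Street</address></location>
<location><GPS>51.5, 0.1</GPS></location>

<!-- Invalid: the parent is empty -->
<location/>

<!-- Invalid: both alternatives appear -->
<location><address>123 Main Street</address><GPS>51.5, 0.1</GPS></location>
```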

Page 21: XML's validation - DTD

DTD – Element Content – Combining Sequences and Choices

• Many XML documents need to leverage much more complex rules

<!ELEMENT location (address | (latitude, longitude))>
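A short sketch of what this combined model accepts and rejects. The fragments are independent of each other; the text values are hypothetical and latitude and longitude are assumed to hold plain text.

```xml
<!-- Valid: the address alternative -->
<location><address>123 Main Street</address></location>

<!-- Valid: the (latitude, longitude) sequence, in that order -->
<location><latitude>51.5</latitude><longitude>0.1</longitude></location>

<!-- Invalid: latitude without longitude breaks the inner sequence -->
<location><latitude>51.5</latitude></location>

<!-- Invalid: mixes both alternatives -->
<location><address>123 Main Street</address><latitude>51.5</latitude><longitude>0.1</longitude></location>
```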

Page 22: XML's validation - DTD

DTD – Mixed Content

• Any element with text in its content
– text can appear by itself or it can be interspersed between elements

• Case 1: simplest mixed content model, text-only:

<!ELEMENT first (#PCDATA)>

<first>John</first>

Page 23: XML's validation - DTD

DTD – Mixed Content

• Case 2: Mixed content models can also contain elements interspersed within the text

<description>Joe is a developer and author for <title>Beginning XML</title>, now in its <detail>5th Edition</detail></description>

<!ELEMENT description (#PCDATA | title | detail)*>

Page 24: XML's validation - DTD

DTD – Mixed Content

4 rules:
• They must use the choice mechanism (|) to separate elements
• The #PCDATA keyword must appear first
• There must be no inner content models
• If there are child elements, the * cardinality indicator must appear at the end of the model
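Each of the four rules can be illustrated by a declaration that breaks it. The lines below are alternatives shown side by side for comparison, not a single DTD (declaring the same element more than once would itself be an error).

```xml
<!-- Valid: #PCDATA first, names separated by |, trailing * -->
<!ELEMENT description (#PCDATA | title | detail)*>

<!-- Invalid: uses the sequence separator instead of the choice separator -->
<!ELEMENT description (#PCDATA, title, detail)*>

<!-- Invalid: #PCDATA is not first -->
<!ELEMENT description (title | #PCDATA | detail)*>

<!-- Invalid: contains an inner content model -->
<!ELEMENT description (#PCDATA | (title, detail))*>

<!-- Invalid: child elements are listed but the trailing * is missing -->
<!ELEMENT description (#PCDATA | title | detail)>
```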

Page 25: XML's validation - DTD

DTD – Empty Content

• Some elements never need to contain content; declare them with the EMPTY keyword

<!ELEMENT br EMPTY>
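A brief sketch of how an element declared EMPTY may be written. These are separate fragments, checked against the br declaration above.

```xml
<!-- Valid: empty-element tag -->
<br/>

<!-- Valid: start-tag immediately followed by end-tag -->
<br></br>

<!-- Invalid: even white space counts as content -->
<br> </br>

<!-- Invalid: text content -->
<br>line break</br>
```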

Page 26: XML's validation - DTD

DTD – Any Content

• The ANY keyword indicates that the element may contain:
– text (PCDATA)
– any elements, provided they are declared within the DTD
– in any order, any number of times

<!ELEMENT description ANY>


Page 29: XML's validation - DTD

DTD – Cardinality

• An element’s cardinality defines how many times it will appear within a content model
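The cardinality indicators themselves are not listed on this slide, so here is a short summary with a sketch. The indicators are standard DTD syntax; the particular content models below are hypothetical variations on the contact declarations used in these slides.

```xml
<!-- (no indicator)  exactly once
     ?               zero or one time
     +               one or more times
     *               zero or more times -->

<!-- contacts may hold any number of contact elements, including none -->
<!ELEMENT contacts (contact*)>

<!-- at least one phone; knows is optional -->
<!ELEMENT contact (name, location, phone+, knows?, description)>

<!-- middle may be left out -->
<!ELEMENT name (first, middle?, last)>
```

An indicator can also follow a parenthesized group, as in the mixed content model (#PCDATA | title | detail)*.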


Page 34: XML's validation - DTD

DTD – Attribute Declarations

• declare a list of allowable attributes for each element

<!ATTLIST contacts source CDATA #IMPLIED>

• ATTLIST declaration: begins with <!ATTLIST
• associated element’s name: contacts
• list of declared attributes: source CDATA #IMPLIED

Page 35: XML's validation - DTD

DTD – Attribute Declarations

<!ATTLIST contacts source CDATA #IMPLIED>

• attribute name: source
• attribute type: CDATA
• attribute value declaration: #IMPLIED

Page 36: XML's validation - DTD

DTD – Attribute Types

• When declaring attributes, you can specify how the processor should handle the data that appears in the value


Page 38: XML's validation - DTD

DTD – Attribute Types – CDATA

<!ATTLIST website description CDATA #IMPLIED>

Page 39: XML's validation - DTD

DTD – Attribute Types – ID

<!ATTLIST website url ID #IMPLIED>

Page 40: XML's validation - DTD

DTD – Attribute Types – IDREF

<!ATTLIST website link IDREF #IMPLIED>

Page 41: XML's validation - DTD

DTD – Attribute Types – IDREFS

<!ATTLIST website links IDREFS #IMPLIED>

Page 42: XML's validation - DTD

DTD – Attribute Types – NMTOKEN

<!ATTLIST website category NMTOKEN #IMPLIED>

Page 43: XML's validation - DTD

DTD – Attribute Types – NMTOKENS

<!ATTLIST website category NMTOKENS #IMPLIED>

Page 44: XML's validation - DTD

DTD – Attribute Types – Enumerated list

<!ATTLIST website like (YES|NO) #IMPLIED>
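To show how the attribute types differ in practice, here is a small self-contained sketch. The websites root element and the id and tags attribute names are hypothetical; a separate id attribute is used because ID values must be XML names, which rules out most URLs.

```xml
<?xml version="1.0"?>
<!DOCTYPE websites [
  <!ELEMENT websites (website+)>
  <!ELEMENT website EMPTY>
  <!ATTLIST website
    id          ID       #REQUIRED
    description CDATA    #IMPLIED
    link        IDREF    #IMPLIED
    links       IDREFS   #IMPLIED
    category    NMTOKEN  #IMPLIED
    tags        NMTOKENS #IMPLIED
    like        (YES|NO) #IMPLIED>
]>
<websites>
  <!-- CDATA: any text; NMTOKEN: one token without spaces; enumerated: a listed value -->
  <website id="w1" description="Any text, spaces allowed" category="news" like="YES"/>
  <!-- IDREF: must match an ID in this document; NMTOKENS: tokens separated by spaces -->
  <website id="w2" link="w1" tags="xml dtd validation" like="NO"/>
  <!-- IDREFS: a list of IDs separated by spaces -->
  <website id="w3" links="w1 w2"/>
</websites>
```

A validating parser would reject a duplicate id value, a link pointing at an id that does not exist, or like="MAYBE".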

Page 45: XML's validation - DTD

DTD – Attribute Value Declarations

• Within each attribute declaration you must specify how the value will appear in the document
– Has a default value
– Has a fixed value
– Is required
– Is implied (or is optional)

Page 46: XML's validation - DTD

DTD – Attribute Value Declarations – Default values

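• A default value guarantees that the attribute is included in the final output, even when the document omits it

<!ATTLIST phone kind (Home | Work | Cell | Fax) "Home">

• If the document specifies kind="Work", that value is used

• If the document omits kind, the processor supplies kind="Home"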
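A minimal sketch of the default in action. The phones root element and the phone numbers are hypothetical; the ATTLIST is the one from the slide, written with straight quotes.

```xml
<?xml version="1.0"?>
<!DOCTYPE phones [
  <!ELEMENT phones (phone+)>
  <!ELEMENT phone (#PCDATA)>
  <!ATTLIST phone kind (Home | Work | Cell | Fax) "Home">
]>
<phones>
  <!-- kind is specified, so the parser reports kind="Work" -->
  <phone kind="Work">001-555-0100</phone>
  <!-- kind is omitted, so the parser supplies the default kind="Home" -->
  <phone>001-555-0101</phone>
</phones>
```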

Page 47: XML's validation - DTD

DTD – Attribute Value Declarations – Fixed Values

• When an attribute’s value can never change, you use the #FIXED keyword followed by the fixed value

• Fixed values operate much like default values

<!ATTLIST contacts version CDATA #FIXED "1.0">
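A short sketch of how a fixed value behaves, checked against the version declaration above. The fragments are independent; the empty contacts element is a simplification for illustration.

```xml
<!-- Valid: attribute omitted, the parser supplies version="1.0" -->
<contacts/>

<!-- Valid: attribute present with exactly the fixed value -->
<contacts version="1.0"/>

<!-- Invalid: any other value conflicts with #FIXED -->
<contacts version="2.0"/>
```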

Page 48: XML's validation - DTD

DTD – Attribute Value Declarations – Required Values

• A required attribute must be included within the XML document
– you are not permitted to specify a default value

<!ATTLIST phone kind (Home | Work | Cell | Fax) #REQUIRED>

Page 49: XML's validation - DTD

DTD – Attribute Value Declarations – Implied Values

• Attribute has no default value, has no fixed value, and is not required

<!ATTLIST knows contacts IDREFS #IMPLIED>

Page 50: XML's validation - DTD

DTD – Specifying Multiple Attributes

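• Several attributes can be declared in a single ATTLIST:

<!ATTLIST contacts version CDATA #FIXED "1.0" source CDATA #IMPLIED>

• or, equivalently, in separate ATTLIST declarations:

<!ATTLIST contacts version CDATA #FIXED "1.0">
<!ATTLIST contacts source CDATA #IMPLIED>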


Page 55: XML's validation - DTD

DTD – Entity Declarations

• escape characters

• include special characters

• refer to sections of replacement text, other XML markup, and even external files

Page 56: XML's validation - DTD

DTD – Entity Declarations

• 4 primary types
– Built-in entities
– Character entities
– General entities
– Parameter entities

Page 57: XML's validation - DTD

DTD – Entity Declarations – Built-in entities

• Start with an ampersand (&) and finish with a semicolon (;)

• There are five built-in entity references in XML
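For reference, the five built-in entity references are listed below, followed by a fragment that uses each of them. The message element and its text are hypothetical.

```xml
<!-- &amp;   ampersand (&)
     &lt;    less-than sign (<)
     &gt;    greater-than sign (>)
     &apos;  apostrophe (')
     &quot;  quotation mark (") -->
<message>Fish &amp; chips: 3 &lt; 5 and 5 &gt; 3, it&apos;s &quot;quoted&quot;</message>
```

These five need no declaration in the DTD.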

Page 58: XML's validation - DTD

DTD – Entity Declarations – Character entities

• Begin with &# and end with a semicolon (;)

• Example: a reference to the Greek letter omega (Ω) is &#x03A9; in hexadecimal or &#937; in decimal

Page 59: XML's validation - DTD

DTD – Entity Declarations – General entities

• create reusable sections of replacement text

• must be declared within the DTD before they can be used

• There are 2 ways to declare:
– Internal entity declaration
– External entity declaration

Page 60: XML's validation - DTD

DTD – Entity Declarations – General entities

• Internal Entity Declaration

&source-text;&address-unknow;&empty-gps;
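The declarations behind these references are not shown on the slide, so the following is only a sketch of what internal entity declarations look like. The entity names come from the slide; the replacement text and the entities-demo, source, address, and GPS elements are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE entities-demo [
  <!ELEMENT entities-demo (source, address, GPS)>
  <!ELEMENT source (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
  <!ELEMENT GPS (#PCDATA)>
  <!-- Internal entities: the replacement text is given inline (values are hypothetical) -->
  <!ENTITY source-text "Contact list sample">
  <!ENTITY address-unknow "The address of this location is unknown.">
  <!ENTITY empty-gps "0.0, 0.0">
]>
<entities-demo>
  <source>&source-text;</source>
  <address>&address-unknow;</address>
  <GPS>&empty-gps;</GPS>
</entities-demo>
```

When the document is parsed, each reference is replaced by the text in quotes from its declaration.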

Page 61: XML's validation - DTD

DTD – Entity Declarations – General entities

• External Entity Declaration
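The slide's example is not preserved, so here is a hedged sketch of the general shape. The entity name, the file name, and the description element are hypothetical. The first line belongs in the DTD:

```xml
<!-- External entity: the replacement text is read from a separate file -->
<!ENTITY long-description SYSTEM "long-description.txt">
```

The reference is then used in the document content like any other general entity:

```xml
<description>&long-description;</description>
```

Whether the file is actually fetched depends on the parser; nonvalidating processors are not required to read external entities.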

Page 62: XML's validation - DTD

DTD – Entity Declarations – Parameter entities

• much like general entities, enable you to create reusable sections of replacement text

• cannot be used in general content

• can be referenced only within the DTD, using % instead of &:

%NameDeclarations;
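A sketch of how a parameter entity such as NameDeclarations might be declared and referenced. The entity name comes from the slide; the file name name-declarations.dtd is an assumption, and the fragment shows only the DOCTYPE declaration, not a complete document.

```xml
<!DOCTYPE name [
  <!-- Declare the parameter entity: note the % between ENTITY and the name -->
  <!ENTITY % NameDeclarations SYSTEM "name-declarations.dtd">
  <!-- Reference it between declarations; the contents of the file are pulled in here -->
  %NameDeclarations;
]>
```

This lets several DTDs share one set of declarations instead of repeating them.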


Page 64: XML's validation - DTD

DTD Limitations

• Poor support for XML namespaces

• Poor data typing

• Limited content model descriptions