SNU OOPSLA Lab. Logical structure © copyright 2001 SNU OOPSLA Lab.

Contents
• Concepts
• DTD Structure
• Element Declaration
• Attribute Declarations
• Parameter Entities
• Conditional Sections
• Notation Declarations
• DTD Processing Issues

Concepts of DTD(1)
• DTD (Document Type Definition)
– An optional but powerful feature of XML
– Comprises a set of declarations that define a document structure tree
– Some XML processors read the DTD and use it to build the document model in memory
– Establishes formal document structure rules
• It defines the elements and dictates where they may be applied in relation to each other

Concepts of DTD(2)
• Declare vs. Define
– Declare: “This document is a concert poster”
– Define: “A concert poster must have the following features”
• A DTD defines
– Element types + Attributes + Entities
• Valid vs. Invalid
– Valid: conforms to the DTD
– Invalid: fails to conform to the DTD
(Figure: Well-formed XML Document, Valid XML Document; every valid document is also well formed)
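
The declare/define and valid/invalid distinctions can be illustrated with a small sketch. The POSTER, BAND, and DATE names and the sample values are hypothetical, chosen only to echo the concert-poster wording; they do not come from the slides.

```xml
<?xml version="1.0"?>
<!-- The DOCTYPE declares: "this document is a POSTER". -->
<!DOCTYPE POSTER [
  <!-- The DTD defines: "a POSTER must contain a BAND, then a DATE". -->
  <!ELEMENT POSTER (BAND, DATE)>
  <!ELEMENT BAND (#PCDATA)>
  <!ELEMENT DATE (#PCDATA)>
]>
<!-- Valid: well formed, and conforms to the DTD. -->
<POSTER>
  <BAND>The Example Band</BAND>
  <DATE>2001-05-01</DATE>
</POSTER>
<!-- Removing the DATE element would leave the document well formed
     (tags still nest and close properly) but invalid. -->
```

A document whose tags do not nest or close properly is not well formed at all, so the question of validity never arises for it.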

Valid & Invalid Documents
(for a GREETING element declared as (#PCDATA))
• Valid:
<GREETING>various random text but no markup</GREETING>
• Invalid: anything else, including
<GREETING>
 <sometag>various random text</sometag>
 <someEmptyTag/>
</GREETING>
– or
<GREETING>
 <GREETING>various random text</GREETING>
</GREETING>

DTD structure
• A DTD is composed of a number of declarations
– ELEMENT (tag definition)
– ATTLIST (attribute definitions)
– ENTITY (entity definition)
– NOTATION (data type notation definition)
• A DTD can be stored in an external subset or an internal subset

Internal and External Subset(1)
• Internal subset
– Form: <!DOCTYPE … [ <!-- Internal Subset --> … ]>
– Pros
• Easy to write XML
• Editing without moving between two files
– Cons
• Other documents can’t reuse the declarations without copying the internal subset

Internal and External Subset(2)
• External subset
– Better to use external DTDs
– Reason why?
• Many benefits
– document management
– updating
– editing
• A few reasons
– If you use an external DTD, you can use public DTDs (capability)
– External DTDs provide for better document management
– External DTDs make it easier to validate your document
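
A minimal sketch of an external subset follows. The file name greeting.dtd is the one used in a later example in these slides; the content shown for that file is an assumption, since the slides never list it.

```xml
<!-- greeting.dtd: the external subset, kept in its own file -->
<!ELEMENT GREETING (#PCDATA)>
```

```xml
<?xml version="1.0" standalone="no"?>
<!-- The document points at the DTD file with a SYSTEM identifier. -->
<!DOCTYPE GREETING SYSTEM "greeting.dtd">
<GREETING>Hello XML!</GREETING>
```

Any number of documents can reference the same greeting.dtd, so a change to the DTD is made in one place. A public DTD is referenced with the PUBLIC keyword, a public identifier, and a system identifier, as in the XHTML 1.0 Strict declaration <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">.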

Element Declarations
• Used to define a new element: gives the element's name and specifies its allowed content (the content model)
• Each element used in a valid document must be declared in an <!ELEMENT> declaration.
• The content model uses a simple regular-expression-like grammar to specify precisely what is and isn't allowed in an element
Element type declaration: ‘<!ELEMENT’ S Name S contentspec S? ‘>’

Content Specifications
• ANY
• #PCDATA
• Sequences
• Choices
• Mixed Content
• Modifiers
• Empty

ANY
<!ELEMENT SEASON ANY>
• A SEASON can contain any declared child element and/or raw text (parsed character data)
• Rarely used in practice, because it places no constraint on structure.

#PCDATA
• Parsed Character Data, i.e. raw text, no markup
• Represents normal data; it is preceded by the hash symbol, ‘#’, to avoid confusion with an element of the same name when used within a model group (for example, ‘(#PCDATA | PCDATA)*’)
<!ELEMENT YEAR (#PCDATA)>

Use of #PCDATA in XML
• Valid:
<YEAR>1999</YEAR>
<YEAR>99</YEAR>
<YEAR>1999 C.E.</YEAR>
<YEAR> The year of our Lord one thousand, nine hundred, and ninety-nine</YEAR>
• Invalid:
<YEAR>
<MONTH>January</MONTH>
<MONTH>February</MONTH>
<MONTH>March</MONTH>
<MONTH>April</MONTH>
<MONTH>May</MONTH>
<MONTH>June</MONTH>
<MONTH>July</MONTH>
<MONTH>August</MONTH>
<MONTH>September</MONTH>
<MONTH>October</MONTH>
<MONTH>November</MONTH>
<MONTH>December</MONTH>
</YEAR>

Child Elements
• To declare that a LEAGUE element must have a LEAGUE_NAME child:
<!ELEMENT LEAGUE (LEAGUE_NAME)>
<!ELEMENT LEAGUE_NAME (#PCDATA)>

Sequences(1)
• Separate multiple required child elements with commas; e.g.
<!ELEMENT SEASON (YEAR, LEAGUE, LEAGUE)>
<!ELEMENT LEAGUE (LEAGUE_NAME, DIVISION, DIVISION, DIVISION)>
• One or More Children +
<!ELEMENT DIVISION_NAME (#PCDATA)>
<!ELEMENT DIVISION (DIVISION_NAME, TEAM+)>

Sequences(2)
• Zero or More Children *
<!ELEMENT TEAM (TEAM_CITY, TEAM_NAME, PLAYER*)>
<!ELEMENT TEAM_CITY (#PCDATA)>
<!ELEMENT TEAM_NAME (#PCDATA)>
• Choices
<!ELEMENT PAYMENT (CASH | CREDIT_CARD)>
– or, with a third alternative:
<!ELEMENT PAYMENT (CASH | CREDIT_CARD | CHECK)>
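
The sequence, + and * rules can be seen together in one small document. This is a sketch only: the PLAYER declaration and all text values are assumptions, because the slides do not declare PLAYER or give sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE DIVISION [
  <!ELEMENT DIVISION (DIVISION_NAME, TEAM+)>
  <!ELEMENT DIVISION_NAME (#PCDATA)>
  <!ELEMENT TEAM (TEAM_CITY, TEAM_NAME, PLAYER*)>
  <!ELEMENT TEAM_CITY (#PCDATA)>
  <!ELEMENT TEAM_NAME (#PCDATA)>
  <!-- Assumption: PLAYER holds plain text; the slides do not declare it. -->
  <!ELEMENT PLAYER (#PCDATA)>
]>
<DIVISION>
  <DIVISION_NAME>East</DIVISION_NAME>
  <!-- TEAM+ : at least one TEAM is required; two are given here. -->
  <TEAM>
    <TEAM_CITY>First City</TEAM_CITY>
    <TEAM_NAME>First Team</TEAM_NAME>
    <!-- PLAYER* : a TEAM with no PLAYER children is still valid. -->
  </TEAM>
  <TEAM>
    <TEAM_CITY>Second City</TEAM_CITY>
    <TEAM_NAME>Second Team</TEAM_NAME>
    <PLAYER>First Player</PLAYER>
    <PLAYER>Second Player</PLAYER>
  </TEAM>
</DIVISION>
```

Because a comma-separated sequence fixes order, placing TEAM_NAME before TEAM_CITY, or DIVISION_NAME after a TEAM, would make the document invalid.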

Grouping With Parentheses
• Parentheses combine several elements into a single unit.
• A parenthesized group can be nested inside other parentheses in place of a single element.
• A parenthesized group can be suffixed with a plus sign, an asterisk, or a question mark.
<!ELEMENT dl (dt, dd)*>
<!ELEMENT ARTICLE (TITLE, (P | PHOTO | GRAPH | SIDEBAR | PULLQUOTE | SUBHEAD)*, BYLINE?)>
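
A short instance of the dl declaration shows what the starred group permits. The dt and dd declarations and the sample text are assumptions added to make the sketch complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE dl [
  <!ELEMENT dl (dt, dd)*>
  <!-- Assumption: dt and dd hold plain text in this sketch. -->
  <!ELEMENT dt (#PCDATA)>
  <!ELEMENT dd (#PCDATA)>
]>
<dl>
  <!-- The group (dt, dd) repeats as a unit: each dt is followed by one dd. -->
  <dt>DTD</dt>
  <dd>Document Type Definition</dd>
  <dt>PCDATA</dt>
  <dd>Parsed character data</dd>
</dl>
```

An empty dl is also valid, since * allows zero repetitions; a dt with no following dd, or two dd elements in a row, is not.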

Mixed Content
• Both #PCDATA and child elements in a choice
• #PCDATA must come first
• #PCDATA cannot be used in a sequence
<!ELEMENT TEAM (#PCDATA | TEAM_CITY | TEAM_NAME | PLAYER)*>
Empty elements
<!ELEMENT BR EMPTY>
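
As a sketch of what the mixed-content TEAM declaration accepts, the document below interleaves text with child elements. The PLAYER declaration and all text values are assumptions, not part of the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE TEAM [
  <!ELEMENT TEAM (#PCDATA | TEAM_CITY | TEAM_NAME | PLAYER)*>
  <!ELEMENT TEAM_CITY (#PCDATA)>
  <!ELEMENT TEAM_NAME (#PCDATA)>
  <!-- Assumption: PLAYER holds plain text; the slides do not declare it. -->
  <!ELEMENT PLAYER (#PCDATA)>
]>
<TEAM>The <TEAM_NAME>Example Team</TEAM_NAME> of
<TEAM_CITY>Example City</TEAM_CITY> fields
<PLAYER>First Player</PLAYER> and <PLAYER>Second Player</PLAYER>.</TEAM>
```

A mixed-content model cannot constrain the order or the number of the child elements: a TEAM with three TEAM_NAME children and no TEAM_CITY would also be valid under this declaration.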

Attribute Declarations
• Consider this element:
<GREETING LANGUAGE="Spanish"> Hola!</GREETING>
• It is declared like this:
<!ELEMENT GREETING (#PCDATA)>
<!ATTLIST GREETING LANGUAGE CDATA "English">
• General form:
<!ATTLIST Element_name Attribute_name Type Default_value>
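
The effect of the default value "English" can be sketched with two GREETING elements, one with the attribute and one without. The GREETINGS wrapper element is hypothetical, added only so that both cases fit in a single document.

```xml
<?xml version="1.0"?>
<!DOCTYPE GREETINGS [
  <!-- Assumption: GREETINGS is a hypothetical wrapper element. -->
  <!ELEMENT GREETINGS (GREETING+)>
  <!ELEMENT GREETING (#PCDATA)>
  <!ATTLIST GREETING LANGUAGE CDATA "English">
]>
<GREETINGS>
  <!-- LANGUAGE is given explicitly. -->
  <GREETING LANGUAGE="Spanish">Hola!</GREETING>
  <!-- LANGUAGE is omitted: a parser that reads the DTD reports "English". -->
  <GREETING>Hello!</GREETING>
</GREETINGS>
```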

Multiple Attribute Declarations
• Consider this element
<RECTANGLE LENGTH="70px" WIDTH="85px"/>
• With two attribute declarations:
<!ELEMENT RECTANGLE EMPTY>
<!ATTLIST RECTANGLE LENGTH CDATA "0px">
<!ATTLIST RECTANGLE WIDTH CDATA "0px">
• With one attribute declaration
<!ATTLIST RECTANGLE LENGTH CDATA "0px"
                    WIDTH  CDATA "0px">
• Indentation is a convention, not a requirement

Attribute Types
• CDATA
• ID
• IDREF
• IDREFS
• ENTITY
• ENTITIES
• NOTATION
• NMTOKEN
• NMTOKENS
• Enumerated

CDATA
• Most general attribute type
• Value can be any string of text; a literal less-than sign (<), an ampersand (&), or the quotation mark that delimits the value must be written as an entity reference (&lt;, &amp;, &quot;)

ID
• Value must be an XML name
– May include letters, digits, underscores, hyphens, and periods
– Must not begin with a digit, hyphen, or period
– May not include whitespace
– May contain colons only if used for namespaces
• Value must be unique within ID type attributes in the document
• Generally the default value is #REQUIRED (it must be either #REQUIRED or #IMPLIED)

IDREF
• Value matches the ID of an element in the same document
• Used for links and the like
IDREFS
• A list of ID values in the same document
• Separated by white space
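
The ID, IDREF, and IDREFS types work together, as the following sketch suggests. The element and attribute names (TEAM, PLAYER, CAPTAIN, LINEUP, PID, WHO, ORDER) and the values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE TEAM [
  <!ELEMENT TEAM (PLAYER+, CAPTAIN, LINEUP)>
  <!ELEMENT PLAYER (#PCDATA)>
  <!ATTLIST PLAYER PID ID #REQUIRED>
  <!ELEMENT CAPTAIN EMPTY>
  <!ATTLIST CAPTAIN WHO IDREF #REQUIRED>
  <!ELEMENT LINEUP EMPTY>
  <!ATTLIST LINEUP ORDER IDREFS #REQUIRED>
]>
<TEAM>
  <!-- Each PID is an XML name and is unique in the document. -->
  <PLAYER PID="p1">First Player</PLAYER>
  <PLAYER PID="p2">Second Player</PLAYER>
  <!-- IDREF: must match some ID value in this document. -->
  <CAPTAIN WHO="p2"/>
  <!-- IDREFS: a white-space-separated list of ID values. -->
  <LINEUP ORDER="p2 p1"/>
</TEAM>
```

A validating parser would reject WHO="p3" (no such ID), a second PLAYER with PID="p1" (duplicate ID), and PID="1" (not an XML name, because it starts with a digit).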

ENTITY
• Value is the name of an unparsed general entity declared in the DTD
ENTITIES
• Value is a list of names of unparsed general entities declared in the DTD
• Separated by white space

NOTATION
• Value is the name of a notation declared in the DTD
<!NOTATION Tex SYSTEM "..\TEXVIEW.EXE">
<!ENTITY Logo SYSTEM "LOGO.TEX" NDATA Tex>
(Figure: TEXVIEW.EXE and LOGO.TEX, with labels numbered 1 to 4)
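
A sketch of how the ENTITY and NOTATION attribute types use the Tex notation and the Logo entity declared above. The DOC, IMAGE, and FORMULA elements and their attribute names are hypothetical. The NOTATION attribute is placed on a non-empty element because the XML 1.0 specification does not allow NOTATION attributes on elements declared EMPTY.

```xml
<?xml version="1.0"?>
<!DOCTYPE DOC [
  <!NOTATION Tex SYSTEM "..\TEXVIEW.EXE">
  <!ENTITY Logo SYSTEM "LOGO.TEX" NDATA Tex>
  <!ELEMENT DOC (IMAGE, FORMULA)>
  <!ELEMENT IMAGE EMPTY>
  <!-- ENTITY type: the value names an unparsed entity. -->
  <!ATTLIST IMAGE SOURCE ENTITY #REQUIRED>
  <!ELEMENT FORMULA (#PCDATA)>
  <!-- NOTATION type: the value must be one of the listed notation names. -->
  <!ATTLIST FORMULA FORMAT NOTATION (Tex) #REQUIRED>
]>
<DOC>
  <IMAGE SOURCE="Logo"/>
  <FORMULA FORMAT="Tex">E = mc^2</FORMULA>
</DOC>
```

The parser does not open LOGO.TEX or run TEXVIEW.EXE itself; it only passes the entity and notation identifiers on to the application.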

NMTOKEN
• Value is any legal XML name token: like an XML name, except that any name character (for example a digit) may come first
NMTOKENS
• Value is a list of XML name tokens
• Separated by white space
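
A brief sketch of the two name-token types. The PLAYER element, its NUMBER and POSITIONS attributes, and the values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE PLAYER [
  <!ELEMENT PLAYER (#PCDATA)>
  <!ATTLIST PLAYER NUMBER    NMTOKEN  #REQUIRED>
  <!ATTLIST PLAYER POSITIONS NMTOKENS #IMPLIED>
]>
<!-- "42" and "1B" start with digits: fine for name tokens,
     although they would not be legal values for an ID attribute. -->
<PLAYER NUMBER="42" POSITIONS="1B 3B LF">Example Player</PLAYER>
```

A value with embedded white space, such as NUMBER="4 2", would be invalid for NMTOKEN, because white space separates tokens rather than forming part of one.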

Enumerated
• Not a keyword
• Refers to a list of possible values from which one must be chosen
• Default value is generally provided explicitly
<!ATTLIST P VISIBLE (TRUE | FALSE) "TRUE">

Attribute Default Values
• A literal string value
• One of these three keywords
– #REQUIRED
– #IMPLIED
– #FIXED

#REQUIRED
• No default value is provided in the DTD
• Document authors must provide the attribute value for each element
<!ELEMENT IMG EMPTY>
<!ATTLIST IMG ALT CDATA #REQUIRED>
<!ATTLIST IMG WIDTH CDATA #REQUIRED>
<!ATTLIST IMG HEIGHT CDATA #REQUIRED>

#IMPLIED
• No default value in the DTD
• Author may (but does not have to) provide a value with each element

#FIXED
• Value is the same for all elements
• Default value must be provided in DTD
• Document author may not change default value
<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
<!ATTLIST AUTHOR COMPANY CDATA #FIXED "TIC">
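
The three keywords can be compared on a single AUTHOR element, using the declarations from the slide. The attribute values in the instance are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE AUTHOR [
  <!ELEMENT AUTHOR EMPTY>
  <!ATTLIST AUTHOR NAME      CDATA #REQUIRED>
  <!ATTLIST AUTHOR EMAIL     CDATA #REQUIRED>
  <!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
  <!ATTLIST AUTHOR COMPANY   CDATA #FIXED "TIC">
]>
<!-- NAME and EMAIL are required and present; EXTENSION is omitted (allowed);
     COMPANY is omitted and takes the fixed value "TIC". -->
<AUTHOR NAME="Example Author" EMAIL="author@example.com"/>
```

Leaving out NAME or EMAIL would make the document invalid, as would writing COMPANY with any value other than "TIC"; writing COMPANY="TIC" explicitly is allowed.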

Example of Internal DTDs
<?xml version="1.0"?>
<!DOCTYPE GREETING [
 <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>Hello XML!</GREETING>
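
Internal DTD Subsets
• Internal declarations override external declarations: the internal subset is read first, and for entity and attribute declarations the first declaration wins (an element type, by contrast, may be declared only once)
<?xml version="1.0"?>
<!DOCTYPE GREETING SYSTEM "greeting.dtd" [
 <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>Hello XML!</GREETING>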
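
A sketch of the override rule for attribute declarations. The contents of greeting.dtd shown here are an assumption, since the slides do not list that file.

```xml
<!-- greeting.dtd (assumed contents of the external subset) -->
<!ELEMENT GREETING (#PCDATA)>
<!ATTLIST GREETING LANGUAGE CDATA "English">
```

```xml
<?xml version="1.0"?>
<!DOCTYPE GREETING SYSTEM "greeting.dtd" [
  <!-- Read before the external subset, so this default is the binding one. -->
  <!ATTLIST GREETING LANGUAGE CDATA "Spanish">
]>
<GREETING>Hola!</GREETING>
```

With both subsets read, the GREETING element is reported with LANGUAGE="Spanish", not "English". The internal subset here deliberately does not repeat the ELEMENT declaration, because declaring the same element type in both subsets is a validity error.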