XML Documents
Chao-Hsien Chu, Ph.D.
School of Information Sciences and Technology
The Pennsylvania State University
[Figure labels: Elements, Attributes, Comments, PI, Document Type]
Components of XML Systems
[Diagram labels: XML Document (Contents); XML DTD (Rule); XML Parser (Processor), with the checks Well-Formed (Syntax) and Validate (Structure); XML Application, with Further Processing (optional)]
How a Parser Interprets XML - Validate
[Flowchart: the inputs are the XML document and the document type definition (DTD). Three decision nodes, each with a yes and a no branch: Well formed? DTD? Valid? Two boxes read "Issue warning / stop processing".]
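The first two decisions in the flowchart can be sketched with Python's standard library. This is only an illustration under stated assumptions: the function name `interpret` and the file name `book.dtd` are made up here, and because the standard-library parser does not validate against a DTD, the sketch stops before the "Valid?" step.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def interpret(xml_text):
    """Follow the first two decisions of the flowchart for one document."""
    try:
        doc = minidom.parseString(xml_text)   # Well formed?
    except ExpatError as err:
        return f"stop: not well formed ({err})"
    if doc.doctype is None:                   # DTD?
        return "well formed, no DTD declared"
    # Valid? The standard-library parser does not validate, so stop here.
    return f"well formed, DTD declared: {doc.doctype.systemId}"


print(interpret("<Book><Title>XML</Title></Book>"))
print(interpret('<!DOCTYPE Book SYSTEM "book.dtd"><Book/>'))
print(interpret("<Book><Title>XML</Book></Title>"))
```

A validating parser would be needed to answer the third question in the flowchart.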
XML Document Syntax
Processing Instructions (PI)
Document Type Declarations (optional)
Comments (optional)
Element Start and End Tags
Attributes
Entity References
Character Data Sections (CDATA)
The Panoramic Perspective of XML
[Diagram: an XML document broken down into its parts. Labels, in the order they appear: Prolog; Doc. Type Declaration; Root Element; Comments; Processing Instructions; Entity References; CDATA Sections; Elements; PCDATA; Attributes; CDATA, Entities, ID, ..; Doc Type Definitions; Element Declaration; Attribute Declaration; Entity Declaration; Notations Declaration. A legend marks some parts as optional.]
An Example of XML Document
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE Address_Book SYSTEM "fclml.dtd">
<?xml-stylesheet type="text/xsl" href="mystyle.xsl" ?>
<Address_Book>
  <Contact>
    <Name>Alley Gator</Name>
    <ID>001</ID>
    <EMAIL>[email protected]</EMAIL>
    <Phone>(010)62345678</Phone>
    <Address>
      <Street>112 Main Street</Street>
      <City>Muddy Waters</City>
      <State>FL</State>
      <ZIP>55544</ZIP>
    </Address>
  </Contact>
</Address_Book>
[Callouts on the example: Processing Instruction, Document Type Declaration, Root Element, Elements]
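To show how an application might read the example document, here is a minimal sketch using Python's `xml.etree.ElementTree`. The EMAIL element is left out because its value is not recoverable from this transcript, and the variable names are illustrative only. The standard-library parser does not fetch the external DTD named in the DOCTYPE.

```python
import xml.etree.ElementTree as ET

# The slide's document, minus the EMAIL element (its value is not
# recoverable from the transcript).
ADDRESS_BOOK = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE Address_Book SYSTEM "fclml.dtd">
<?xml-stylesheet type="text/xsl" href="mystyle.xsl"?>
<Address_Book>
  <Contact>
    <Name>Alley Gator</Name>
    <ID>001</ID>
    <Phone>(010)62345678</Phone>
    <Address>
      <Street>112 Main Street</Street>
      <City>Muddy Waters</City>
      <State>FL</State>
      <ZIP>55544</ZIP>
    </Address>
  </Contact>
</Address_Book>"""

root = ET.fromstring(ADDRESS_BOOK)       # the external DTD is not fetched
contact = root.find("Contact")
print(root.tag)                          # name of the root element
print(contact.findtext("Name"))          # text of a child element
print(contact.findtext("Address/City"))  # text of a nested element
```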
Processing Instructions (PI)
A PI passes information to the software that processes the document, such as the name of the target processor and the settings it should use.
Syntax: <?Processor Attribute = "Value of Attribute" ?>
Examples:
<?xml version = "1.0" ?>
<?xml version="1.0" encoding="Big5" ?>
<?xml version = "1.0" standalone = "no"?>
<?rtf \page ?>
Strictly speaking, the <?xml ... ?> line is the XML declaration; it uses the same <? ... ?> syntax as a PI and must come first in the document.
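As a rough illustration of how a program sees processing instructions, the sketch below lists them with Python's `xml.dom.minidom`. The document string is assembled from the examples in these slides. Note that this parser does not report the `<?xml ... ?>` line as a PI node, so only the stylesheet instruction is listed.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="mystyle.xsl"?>'
    '<Address_Book/>'
)

# Collect (target, data) for every processing instruction at document level.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

The target is the processor name from the syntax line, and the data is everything after it up to the closing `?>`.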
Document Type Declaration
A statement embedded in an XML document that points to the existence and location of a document type definition (DTD).
The DTD is optional.
Syntax: <!DOCTYPE RootElement SYSTEM "xxx.dtd">
Example:
<?xml version = "1.0" standalone = "no"?>
<!DOCTYPE Address_Book SYSTEM "fclml.dtd">
[Callout: DTD File]
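The two pieces of information in the declaration, the root element name and the DTD location, can be read back with `xml.dom.minidom`. This is a small sketch using the slide's own example; the parser only records the declaration and does not open `fclml.dtd`.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0" standalone="no"?>'
    '<!DOCTYPE Address_Book SYSTEM "fclml.dtd">'
    '<Address_Book/>'
)

doctype = doc.doctype
print(doctype.name)       # root element named in the declaration
print(doctype.systemId)   # location of the DTD file
# The declared name is expected to match the actual root element.
print(doctype.name == doc.documentElement.tagName)
```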
Comments
A comment is a place for reminders, simple documentation, or code that is temporarily commented out for debugging. End users do not see it.
<!-- This is a comment area -->
A comment may contain any characters except the sequence "--".
There is no limit on the length of a comment.
Comments may not come before the XML declaration. Comments may not be placed inside a tag.
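The comment rules above can be tried out against a parser. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `parses` and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses("<Book><!-- a note --></Book>"))               # ordinary comment
print(parses("<Book><!-- a -- note --></Book>"))            # "--" inside a comment
print(parses('<!-- note --><?xml version="1.0"?><Book/>'))  # comment before the declaration
print(parses("<Book <!-- note --> ></Book>"))               # comment inside a tag
```

Only the first string should be accepted; the other three break one rule each.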
Guideline for Elements
Elements are the building blocks of XML documents.
Every document must have one and only one root element.
An element must start with a start tag and end with a corresponding end tag.
Element names are case sensitive: the start tag and end tag must use identical case.
Spaces are not allowed between the forward slash and the element name; for example, </ Books> is illegal.
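Example of Element
<Item optional = "1">
</Item>
[Diagram labels: the start tag <Item optional = "1"> holds the tag name (Item) and an attribute, made up of the attribute name (optional) and the attribute value ("1"); </Item> is the end tag; the element contents (texts, elements) sit between the two tags; start tag, contents, and end tag together form the element.]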
Guideline for Elements
Elements can be used both to contain information and to define structure.
The structure of information is encoded by the nesting of tags.
Empty elements, which have no contents, are used as placeholders or to signal that something exists, e.g., <BR />.
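As a small illustration of what "no contents" means to a parser, the sketch below inspects an empty element with `xml.etree.ElementTree`. The surrounding `<P>` element and its text are made up for the example.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<P>line one<BR />line two</P>")
br = para.find("BR")

print(br.text)      # an empty element has no text content
print(len(br))      # and no child elements
print(br.tail)      # text that follows the empty element

# <BR /> and <BR></BR> are parsed to the same empty element.
same = ET.fromstring("<P><BR></BR></P>").find("BR")
print(same.text, len(same))
```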
Tree Diagram of Address Document
Address_Book (root element)
  Contact
    ID
    Name
    E-mail
    Phone
    Address
      Zip
      State
      City
      Street
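The same tree can be recovered from a document by walking the nested elements. This is a minimal sketch with `xml.etree.ElementTree`; the element names follow the earlier example document (EMAIL, ZIP), the elements are left empty for brevity, and the function name `outline` is illustrative.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<Address_Book><Contact>"
    "<Name/><ID/><EMAIL/><Phone/>"
    "<Address><Street/><City/><State/><ZIP/></Address>"
    "</Contact></Address_Book>"
)


def outline(element, depth=0):
    """Yield (depth, tag) pairs in document order."""
    yield depth, element.tag
    for child in element:
        yield from outline(child, depth + 1)


# Print one element per line, indented by its depth in the tree.
for depth, tag in outline(doc):
    print("  " * depth + tag)
```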
Element Name
Element names must begin with a letter or an underscore (_). Subsequent characters may include letters, digits, underscores, hyphens, and periods.
Element names cannot begin with a number. Element names cannot include spaces.
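The naming rule can be written as a regular expression. The sketch below is a simplified, ASCII-only reading of the rule as stated on this slide; the full XML specification also allows many non-ASCII letters, which this pattern ignores. The function name and the sample names are made up for illustration.

```python
import re

# Simplified, ASCII-only version of the naming rule on the slide:
# first a letter or underscore, then letters, digits, "_", "." or "-".
NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_legal_name(name):
    """True if the whole name matches the simplified rule."""
    return NAME_RULE.fullmatch(name) is not None


for candidate in ["price", "_id", "unit-price", "2nd", "first name", "cost$"]:
    print(candidate, is_legal_name(candidate))
```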
Instant Quiz
Which of the following are "legal" or "illegal" element names?
<Help> _______
<Book%7> _______
<Volume Control> _______
<Volume> _______
<_8ball> _______
<1heading> _______
<heading1> _______
<Mary Smith> _______
<section.paragraph> _______
Attributes
Attributes are small, descriptive pieces of information that describe an element.
Attributes appear inside the start tag of an element, after the element name. Each attribute name is followed by an "=" sign and then the value of the attribute.
An attribute's value must be enclosed in a pair of single or double quotes.
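A quick way to see these rules in action is to parse a start tag and read its attributes back. The sketch below uses `xml.etree.ElementTree` with the `<Item optional = "1">` element from the earlier slide; the `note` attribute is a made-up addition to show single quotes.

```python
import xml.etree.ElementTree as ET

# "note" is a made-up attribute added to show single quotes.
item = ET.fromstring("<Item optional=\"1\" note='checked'></Item>")
print(item.attrib)            # all attributes as a dictionary
print(item.get("optional"))   # one attribute value by name

# A value without quotes, or a name without a value, is rejected.
for bad in ["<Item optional=1></Item>", "<Item optional></Item>"]:
    try:
        ET.fromstring(bad)
    except ET.ParseError as err:
        print("rejected:", bad, "->", err)
```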
Instant Quiz
Which of the following are legal attributes?
1. <marble color="red"> _____
2. <marble color="red" size="big"> _____
3. <marble color="red" /> _____
4. <marble color=red> _____
5. <marble color> _____
CDATA Section
A CDATA section is used when you want all of its text to be interpreted as pure character data rather than as markup. This is useful when the text contains many <, >, & or " characters.
Example:
<Height><![CDATA[Faraz < Alex]]></Height>
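The effect of the CDATA section in the example can be checked with `xml.etree.ElementTree`. This is a small sketch: the first parse uses the slide's element, and the second shows what happens to the same text without the CDATA wrapper.

```python
import xml.etree.ElementTree as ET

# Inside CDATA the "<" is plain character data.
with_cdata = ET.fromstring("<Height><![CDATA[Faraz < Alex]]></Height>")
print(with_cdata.text)

# Without CDATA the parser treats "<" as the start of markup and fails.
try:
    ET.fromstring("<Height>Faraz < Alex</Height>")
except ET.ParseError as err:
    print("without CDATA:", err)
```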
Entity References
Entity references are markup that is replaced with character data when the document is parsed.
XML predefines five entity references:
&amp; for &
&lt; for <
&gt; for >
&quot; for "
&apos; for '
Entities declared in the DTD can also stand for external content, such as a text file or a picture.
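The replacement of the five predefined references can be observed directly. The sketch below parses a made-up `<Note>` element with `xml.etree.ElementTree`, then goes the other way with `xml.sax.saxutils.escape`, which produces references for &, < and > in text that is about to be placed into a document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing replaces each reference with the character it stands for.
element = ET.fromstring(
    "<Note>&lt;b&gt; &amp; &quot;q&quot; &apos;a&apos;</Note>"
)
print(element.text)

# Going the other way: escape text before placing it in a document.
print(escape("Faraz < Alex & Co."))
```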
Illustration of Entity Reference
[Figure, before parsing: an XML document containing entity references that point to a text file.]
[Figure, after parsing: the XML document with the text contents in place of the entity reference.]
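The before and after states in the illustration can be imitated in a few lines. One assumption to flag: the figure shows an external text file, but Python's standard-library parser does not load external entities, so this sketch declares the entity internally in the DOCTYPE instead. The entity name `school` and the `<Memo>` element are made up for the example.

```python
import xml.etree.ElementTree as ET

# Before parsing: the text contains the reference &school;.
before = (
    '<!DOCTYPE Memo [<!ENTITY school "The Pennsylvania State University">]>'
    "<Memo>Sent from &school;.</Memo>"
)
print(before)

# After parsing: the reference has been replaced by the entity's text.
after = ET.fromstring(before)
print(after.text)
```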
Well Formed Document
Here are some general guidelines:
The document contains one and only one root element.
Every element has both a start tag and an end tag, or is written as an empty element.
Tags are case sensitive.
Tags do not overlap; elements must nest properly inside each other.
Attribute values are enclosed in quotes.
An empty element ends with "/>".
In text, the characters < and & are always written as entity references (&lt; and &amp;); > and " may be escaped as well, and " must be escaped inside a double-quoted attribute value.
Well-formed XML documents are those that are syntactically correct.
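Each guideline can be tested by handing a parser a document that breaks it. The sketch below does so with `xml.etree.ElementTree`; the helper name `is_well_formed` and all sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# One deliberately broken document per guideline.
VIOLATIONS = {
    "two root elements": "<A></A><B></B>",
    "missing end tag": "<A><B></A>",
    "mismatched case": "<Book></book>",
    "overlapping tags": "<A><B></A></B>",
    "unquoted attribute": "<A size=big></A>",
    "bare < in text": "<A>1 < 2</A>",
}


def is_well_formed(xml_text):
    """Return True if the parser accepts the text without error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A document that follows every guideline, then the broken ones.
print(is_well_formed("<A size='big'><B>1 &lt; 2</B><C /></A>"))
for label, text in VIOLATIONS.items():
    print(label, is_well_formed(text))
```

A parser that only checks well-formedness says nothing about validity; that requires a DTD and a validating parser.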
Thank You!
Any Questions?