EXTENSIBLE MARKUP LANGUAGE (XML)
Uploaded by naomi-strickland
What is XML?
XML stands for EXtensible Markup Language.
XML is designed mainly to carry (or transmit) data, not to display it.
XML is a language for defining markup languages that are used to create structured documents.
XML tags are not predefined. You must define your own tags.
Difference between HTML and XML
• HTML is used to mark up text so it can be displayed to users on the screen.
• HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>).
• HTML uses a fixed, unchangeable set of tags.
• HTML is not case-sensitive.
• HTML is not strict about syntax rules.
• Browsers ignore and/or correct as many HTML errors as they can.
• XML is used to carry data so it can be processed by computers.
• XML describes only content, or "meaning".
• In XML, you can create your own tags.
• XML is case-sensitive.
• XML follows strict syntax rules.
• Browsers process XML documents only if they are syntactically correct.
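The strictness described in the list above can be observed with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for this illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Matching start and end tags: accepted.
print(is_well_formed("<note><to>John</to></note>"))
# Tag names are case-sensitive, so <Note> is not closed by </note>: rejected.
print(is_well_formed("<Note><to>John</to></note>"))
# An unclosed <to> element is an error; the parser does not try to repair it.
print(is_well_formed("<note><to>John</note>"))
```

An HTML browser would typically render all three inputs; an XML parser rejects the last two.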
HTML and XML (similarity)
HTML and XML look similar because both are derived from SGML (Standard Generalized Markup Language).
Both HTML and XML use elements enclosed in tags
(Ex: <body>This is an element</body>)
Both use tag attributes
<font face="Verdana" size="10" color="red">
Both use entities (&lt;, &gt;, &amp;, &quot;, &apos;)
XML document structure (Basic points to remember)
Each XML document should start with the XML declaration, which states the version of XML:
<?xml version="1.0" encoding="utf-8"?>
The second line of your document should be the document type declaration, which gives the name of your DTD and its URI or location:
<!DOCTYPE mydocument SYSTEM "mydocument.dtd">
Ex:
<!DOCTYPE book SYSTEM "Book.dtd">
<book>
<title>Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
Here <book> is the root element of the document.
Format of the root element of the document:
<root> <child> <subchild>.....</subchild> </child> </root>
A simple example of an XML document:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE ad SYSTEM "ad.dtd">
<ad>
<year>1960</year>
<make>Cessna</make>
<model>Centurian</model>
<color>Yellow with white trim</color>
<location>
<city>Bangalore</city>
<state>Karnataka</state>
</location>
</ad>
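To see how a program reads this structure, the sketch below parses the same ad document with Python's standard xml.etree.ElementTree module. The function name summarize is an assumption for illustration. The parser used here only checks well-formedness; it does not fetch ad.dtd or validate against it.

```python
import xml.etree.ElementTree as ET

# The ad document from the slide, as bytes so the encoding declaration applies.
AD_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE ad SYSTEM "ad.dtd">
<ad>
  <year>1960</year>
  <make>Cessna</make>
  <model>Centurian</model>
  <color>Yellow with white trim</color>
  <location>
    <city>Bangalore</city>
    <state>Karnataka</state>
  </location>
</ad>"""


def summarize(ad_root):
    """Collect a few values from the parsed <ad> element."""
    return {
        "root": ad_root.tag,                              # name of the root element
        "children": [child.tag for child in ad_root],     # direct children, in order
        "make": ad_root.findtext("make"),                 # text of a child element
        "city": ad_root.findtext("location/city"),        # text of a nested element
    }


root = ET.fromstring(AD_XML)
print(summarize(root))
```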
XML Document
An XML document is often accompanied by two auxiliary files:
- One specifies the tag set and structural rules (DTD/XML schema)
- The second specifies how the content should be displayed (CSS/XSLT)
An XML document consists of one or more entities:
- Document entity (physically within the document)
- Reference entity (stored in a separate file). It should have a name and a reference. A reference to an entity has the form: &entity_name;
- Binary entity (binary data, e.g. images, sound files etc.)
DTD - XML Building Blocks
The Building Blocks of XML Documents
Elements, Attributes, Entities, PCDATA, CDATA
What is an XML Element?
An XML element is everything from the element's start tag to the element's end tag.
An element can contain:
- other elements (element content)
- text (text content)
- attributes
DOCUMENT TYPE DEFINITIONS
Document Type Definitions (DTDs)
A DTD is a set of structural rules called declarations, which specify the elements that can appear in a document as well as how and where these elements may appear.
Purpose: to provide a standard form for a collection of XML documents.
Not all XML documents have or need a DTD.
Two types of DTDs:
- Internal DTD (appears within an XML document)
- External DTD (appears as an external file and can be used with more than one document)
Declaring Elements within DTD
A DTD contains declarations that define elements, attributes, etc.
Syntax:
<!ELEMENT element-name (element-content)>
Ex:
<!ELEMENT person (parent, age, spouse, sibling)>
Empty Elements
Empty elements are declared with the keyword EMPTY:
<!ELEMENT element-name EMPTY>
Example:
<!ELEMENT br EMPTY>
Declaring attributes:
In a DTD, attributes are declared with an ATTLIST declaration.
An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
Attribute types: there are many possible, but we will consider only CDATA
DTD example:
<!ATTLIST payment type CDATA "cheque">
XML example:
<payment type="cheque" />
Declaring entities
Entities are variables used to define shortcuts to standard text or special characters.
Entities are normally used to specify large blocks of data that need to be repeated throughout the document.
Entities can be declared in two ways: internal or external.
An Internal Entity Declaration
Syntax:
<!ENTITY entity-name "entity-value">
DTD Example:
<!ENTITY writer "Donald Duck.">
XML example:
<author>&writer;</author>
Note: An entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;)
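Internal entity expansion can be tried out directly. The sketch below places the writer entity from the slide in an internal DTD subset and parses the document with Python's standard xml.etree.ElementTree module; wrapping the declaration in a DOCTYPE for the author element is an assumption made so that the example is a complete document.

```python
import xml.etree.ElementTree as ET

# The entity declaration sits in the internal DTD subset of the document.
DOC = """<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
]>
<author>&writer;</author>"""

author = ET.fromstring(DOC)

# The parser replaces &writer; with the declared entity value.
print(author.text)
```

The program sees only the replacement text; the reference itself does not survive parsing.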
An External Entity Declaration
Syntax:
<!ENTITY entity-name SYSTEM "URI/URL">
The keyword SYSTEM specifies that the definition of the entity is in a different file.
DTD Example:
<!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd">
XML example:
<author>&writer;</author>
Internal and External DTDs
Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [
element-declarations
]>
Example:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>John</to>
<from>Robert</from>
<heading>Reminder</heading>
<body>Don't forget me</body>
</note>
!DOCTYPE note defines that the root element of this document is note.
!ELEMENT note defines that the note element contains four elements: "to,from,heading,body".
#PCDATA (Parsed Character Data) indicates that the parser should parse the text content of the element.
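Python's standard library parses documents that carry a DTD but does not validate against it, so the sketch below checks by hand the one rule that <!ELEMENT note (to,from,heading,body)> expresses. The names EXPECTED_CHILDREN and matches_note_model are hypothetical, and this is an illustration of what the declaration means, not a DTD validator.

```python
import xml.etree.ElementTree as ET

# Mirrors only the content model (to,from,heading,body) of the note element.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_model(xml_text):
    """True if the root is <note> and its children are exactly
    to, from, heading, body, in that order."""
    root = ET.fromstring(xml_text)
    return root.tag == "note" and [c.tag for c in root] == EXPECTED_CHILDREN


GOOD = ("<note><to>John</to><from>Robert</from>"
        "<heading>Reminder</heading><body>Don't forget me</body></note>")
# Same elements, but <from> comes before <to>, which the DTD does not allow.
BAD = ("<note><from>Robert</from><to>John</to>"
       "<heading>Reminder</heading><body>Don't forget me</body></note>")

print(matches_note_model(GOOD))
print(matches_note_model(BAD))
```

Both inputs are well-formed XML; only the first would also be valid against the DTD above.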
External DTD Declaration
If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
Example:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>John</to>
<from>Robert</from>
<heading>Reminder</heading>
<body>Don't forget me</body>
</note>
The file "note.dtd" contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
XML Namespaces
XML Namespaces provide a method to avoid element name conflicts.
To use XML Namespaces, elements are given qualified names.
Name Conflicts
In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.
For example, this file carries HTML table information:
<table>
<tr> <td>Apples</td> <td>Bananas</td> </tr>
</table>
Whereas this XML file carries user-defined tags:
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these two files were added together, there would be a name conflict.
Both contain a <table> element, but the elements have different content and meaning.
An XML parser will not know how to handle these differences.
XML Namespaces (Contd.)
Solving the Name Conflict Using a Prefix:
• Name conflicts in XML can easily be avoided using a name prefix.
Example:
<h:table>
<h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length>
</f:table>
The xmlns Attribute
When using prefixes in XML, a so-called namespace for the prefix must be defined. The namespace is defined by the xmlns attribute in the start tag of an element:
xmlns:prefix="URI"
• Namespaces can be declared in the elements where they are used or in the XML root element.
Declared in the elements where they are used:
<root>
<h:table xmlns:h="http://www.w3.org/1999/xhtml/">
<h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
</h:table>
<f:table xmlns:f="http://www.myschools.com/furniture">
<f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length>
</f:table>
</root>
Declared in the XML root element:
<root xmlns:h="http://www.w3.org/TR/html4/" xmlns:f="http://www.myschools.com/furniture">
<h:table>
<h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length>
</f:table>
</root>
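A namespace-aware parser tells the two table elements apart by their namespace URIs, not by the prefixes. The sketch below, using Python's standard xml.etree.ElementTree module, parses a document with both namespaces declared on the root element; the NS dictionary and the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.myschools.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name><f:width>80</f:width><f:length>120</f:length></f:table>
</root>"""

# Prefix-to-URI map used only for the lookups below; the prefixes chosen here
# do not have to match the ones in the document, only the URIs do.
NS = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.myschools.com/furniture",
}

root = ET.fromstring(DOC)
html_table = root.find("h:table", NS)
furniture = root.find("f:table", NS)

# ElementTree stores the qualified name as "{namespace-URI}local-name".
print(html_table.tag)
print(furniture.tag)

cells = [td.text for td in html_table.findall("h:tr/h:td", NS)]
print(cells)
print(furniture.findtext("f:name", namespaces=NS))
```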
XML schemas
XML Schema is an XML-based alternative to DTD.
An XML schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
What is an XML Schema?
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
Advantage of XML Schema: One of the greatest strengths of XML Schemas is the support for data types.
An XML Schema:
- defines elements that can appear in a document
- defines attributes that can appear in a document
- defines which elements are child elements
- defines the order of child elements
- defines the number of child elements
- defines whether an element is empty or can include text
- defines data types for elements and attributes
- defines default and fixed values for elements and attributes
XML schemas (Contd.)
Advantages of using data types:
With support for data types:
- It is easier to describe allowable document content
- It is easier to validate the correctness of data
- It is easier to work with data from a database
- It is easier to define data facets (restrictions on data)
- It is easier to define data patterns (data formats)
- It is easier to convert data between different data types
The <schema> element may contain some attributes. A schema declaration often looks something like this:
Ex: note.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
......
</xs:schema>
Defining a schema
Meaning of the attributes of <xs:schema>:
xmlns:xs="http://www.w3.org/2001/XMLSchema"
The document element of an XML Schema is xs:schema. It takes the attribute xmlns:xs with the value http://www.w3.org/2001/XMLSchema, indicating that the elements and data types prefixed with xs come from the XML Schema namespace and that the document should follow the rules of XML Schema.
targetNamespace="http://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading, body) come from the "http://www.w3schools.com" namespace.
xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".
elementFormDefault="qualified"
indicates that any elements declared in this schema must be namespace qualified when they are used in an instance document.
Defining a schema instance
An XML Schema is like a class, and an XML document that adheres to the schema is an instance of that schema.
This XML document has a reference to an XML Schema:
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
xmlns="http://www.w3schools.com"
specifies the default namespace, which is the one defined in the schema.
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
indicates that this XML document is an instance of an XML Schema.
xsi:schemaLocation="http://www.w3schools.com note.xsd"
indicates where the schema for the default namespace can be found. This attribute has two values, separated by a space.
The first value is the namespace to use. The second value is the file name of the schema.
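The two-value form of xsi:schemaLocation can be read programmatically. The sketch below assumes the attribute holds a namespace followed by a schema file name, as described above, and uses Python's standard xml.etree.ElementTree module; the function name schema_locations is hypothetical. It only reads the attribute and does not load or apply note.xsd.

```python
import xml.etree.ElementTree as ET

# Namespace of the xsi: prefix used on the schemaLocation attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

DOC = """<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
</note>"""


def schema_locations(root):
    """Split xsi:schemaLocation into {namespace: schema file} pairs."""
    raw = root.get("{%s}schemaLocation" % XSI, "")
    parts = raw.split()
    # Values alternate: namespace, location, namespace, location, ...
    return dict(zip(parts[0::2], parts[1::2]))


root = ET.fromstring(DOC)
print(root.tag)                 # the default namespace applies to <note>
print(schema_locations(root))
```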
Overview of Data Types
There are two categories of user-defined XML schema data types:
- Simple data type: cannot have attributes or nested elements.
- Complex data type: can have attributes and nested elements.
XML Schema defines 44 built-in data types.
Simple Types (Simple Data Types)
- Built-in, primitive: 19 in total (string, boolean, decimal, time, date, ...)
- Built-in, derived: 25 in total (token, NMTOKEN, language, Name, ...)
- User-defined: built by restricting an existing type with facets such as minLength, maxLength, pattern, ...
The following is an example of a string declaration in a schema:
<xsd:element name="customer" type="xsd:string"/>
An element in your document might look like this:
<customer>John Smith</customer>
The following is an example of a date declaration in a schema:
<xsd:element name="start" type="xsd:date"/>
An element in your document might look like this:
<start>2002-09-24</start>
The decimal data type is used to specify a numeric value. The following is an example of a decimal declaration in a schema:
<xsd:element name="prize" type="xsd:decimal"/>
An element in your document might look like this:
<prize>999.50</prize>
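One practical benefit of typed declarations is that a program knows how to interpret an element's text. The sketch below pairs the three schema types used above with Python converters; the CONVERTERS table and the convert function are hypothetical, and the Python types only approximate the XML Schema types (for example, xsd:date also permits a time zone suffix, which date.fromisoformat does not handle here).

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical mapping from schema type names to Python converters.
CONVERTERS = {
    "xsd:string": str,
    "xsd:date": date.fromisoformat,   # expects YYYY-MM-DD
    "xsd:decimal": Decimal,           # exact decimal arithmetic, no float rounding
}


def convert(element_xml, xsd_type):
    """Parse one element and convert its text according to the declared type."""
    text = ET.fromstring(element_xml).text
    return CONVERTERS[xsd_type](text)


print(convert("<customer>John Smith</customer>", "xsd:string"))
print(convert("<start>2002-09-24</start>", "xsd:date"))
print(convert("<prize>999.50</prize>", "xsd:decimal"))
```

A value that does not fit the declared type, such as a start date of "yesterday", raises ValueError in this sketch; a schema validator would report the same document as invalid.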
Complex types
What is a Complex Element?
A complex element is an XML element that contains other elements (child elements) and/or attributes.
There are four kinds of complex elements:
empty elements
elements that contain only other elements
elements that contain only text
elements that contain both other elements and text
Note: It is not necessary to explicitly declare that a simple-type element is a simple type, but it is necessary to specify that a complex-type element is a complex type.
A complex XML element, "product", which is empty:
<product pid="1345"/>
A complex XML element, "employee", which contains only child elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
A complex XML element, "food", which contains only text:
<food type="dessert">Ice cream</food>
A complex XML element, "description", which contains both elements and text:
<description>It happened on
<date lang="norwegian">03.03.99</date> ....
</description>
How to Define a Complex Element
Look at this complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
We can define a complex element in an XML Schema in two different ways:
1. The "employee" element can be declared directly by naming the element, like this:
<xsd:element name="employee">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="firstname" type="xsd:string"/>
<xsd:element name="lastname" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
Here <xsd:sequence> indicates that the child elements must appear in the order specified.
2. The "employee" element can have a type attribute that refers to the name of the complex type to use:
<xsd:element name="employee" type="personinfo"/>
<xsd:complexType name="personinfo">
<xsd:sequence>
<xsd:element name="firstname" type="xsd:string"/>
<xsd:element name="lastname" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
If you use the second method, several elements can refer to the same complex type, like this:
<xsd:element name="employee" type="personinfo"/>
<xsd:element name="student" type="personinfo"/>
<xsd:element name="member" type="personinfo"/>
<xsd:complexType name="personinfo">
<xsd:sequence>
<xsd:element name="firstname" type="xsd:string"/>
<xsd:element name="lastname" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
Content Models:
They indicate the structure and order in which child elements can appear within their parent element.
- sequence: elements must appear in the order specified.
- all: all the elements must appear, but order is not important.
- choice: only one of the elements can appear.
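The difference between the three content models can be made concrete with a small hand-written check. The sketch below is an illustration only: check_children is a hypothetical helper, it assumes each listed element occurs exactly once, and it ignores occurrence constraints such as minOccurs and maxOccurs that a real schema validator would apply.

```python
import xml.etree.ElementTree as ET


def check_children(parent_xml, model, names):
    """Check the direct children of the parent against a simplified content model."""
    children = [child.tag for child in ET.fromstring(parent_xml)]
    if model == "sequence":
        # Every element must appear, in the declared order.
        return children == names
    if model == "all":
        # Every element must appear, in any order.
        return sorted(children) == sorted(names)
    if model == "choice":
        # Exactly one of the listed elements may appear.
        return len(children) == 1 and children[0] in names
    raise ValueError("unknown content model: %s" % model)


NAMES = ["firstname", "lastname"]
IN_ORDER = "<employee><firstname>John</firstname><lastname>Smith</lastname></employee>"
SWAPPED = "<employee><lastname>Smith</lastname><firstname>John</firstname></employee>"
ONLY_ONE = "<employee><lastname>Smith</lastname></employee>"

print(check_children(IN_ORDER, "sequence", NAMES))
print(check_children(SWAPPED, "sequence", NAMES))
print(check_children(SWAPPED, "all", NAMES))
print(check_children(ONLY_ONE, "choice", NAMES))
```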
Validating Instances of Schemas
Several XML schema validation tools are available. One of them is named xsv (XML Schema Validator). If the schema and the instance document are available on the Web, xsv can be used online.
This tool can also be downloaded and run on any computer. The Web site for xsv is:
http://www.w3.org/XML/Schema#XSV
DISPLAYING XML DOCUMENTS WITH STYLES
We can use CSS and XSLT for styles.
Displaying XML documents with CSS:
We need two files:
- the XML file (with .xml extension) and
- a CSS file (with .css extension)
In the XML document, we need to add one line of code to include the external style sheet:
<?xml-stylesheet type="text/css" href="note.css"?>
Displaying XML documents with XSLT style sheets
XSL (eXtensible Stylesheet Language) is the preferred style sheet language of XML and is developed by W3C.
XSLT is more powerful than CSS for XML.
XSL consists of three parts:
- XSLT (XSL Transformations): a language for transforming XML documents
- XPath (XML Path Language): a language for navigating in XML documents
- XSL-FO (XSL Formatting Objects): a language for formatting XML documents
Difference between CSS and XSL
CSS Style Sheets for HTML
HTML uses predefined tags, and the meaning of each tag is well understood.
For example: the <table> tag in HTML defines a table, and a browser knows how to display it.
Adding styles to HTML elements is simple. Telling a browser to display an element in a special font or color is easy with CSS.
XSL Style Sheets for XML
XML does not use predefined tags (we can use any tag names we like), and therefore the meaning of each tag is not well understood.
For example: a <table> tag could mean an HTML table, a piece of furniture, or something else (since it is user defined), and a browser does not know how to display it.
XSL describes how the XML document should be displayed, and hence we should follow its syntactical rules.
XSLT (XSL Transformations)
XSLT is the most important part of XSL.
XSLT is used to transform an XML document into:
- another XML document, or
- another type of document that is recognized by a browser, like HTML or XHTML.
Normally XSLT does this by transforming each XML element into an HTML/XHTML element.
XSLT Uses XPath
XSLT uses XPath to find information in an XML document. XPath is used to navigate through elements and attributes in XML documents.
How Does it Work?
In the transformation process, XSLT uses XPath to define parts of the source document that should match one or more predefined templates. When a match is found, XSLT will transform the matching part of the source document into the result document.
XSLT - Transformation
An XML document must inform the XSLT processor that the style sheet is used by including the line:
<?xml-stylesheet type="text/xsl" href="filename.xsl"?>
XSLT uses .xsl as the file extension for its style sheets.
A style sheet must include at least one template element along with the following lines:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
<xsl:template match="/">
The <xsl:template> Element
Normally a template is included to match the root node of the XML document.
This can be done in two ways:
- One way is to use the XPath expression "/"
Ex: <xsl:template match="/">
- Another way is to use the name of the actual root element of the document.
Ex: <xsl:template match="note">
XSLT <xsl:value-of> Element
The <xsl:value-of> element is used to extract the value of an XML element and copy it to the output document.
It uses a select attribute to specify the element whose contents are to be copied.
Ex: <xsl:value-of select="author" />
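Python's standard library has no XSLT processor, so the sketch below is only a rough analogue of what <xsl:value-of select="author"/> does: it selects the first matching element relative to a context node and returns its text. The function name value_of and the sample book document are assumptions for illustration, and ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

BOOK = """<book>
  <title>Harry Potter</title>
  <author>J. K. Rowling</author>
</book>"""


def value_of(context, select):
    """Return the text of the first element matching `select`,
    or an empty string when nothing matches."""
    node = context.find(select)
    if node is None:
        return ""
    # Join all text inside the node, including text of nested elements.
    return "".join(node.itertext())


book = ET.fromstring(BOOK)
print(value_of(book, "author"))
print(value_of(book, "title"))
```

In a real style sheet the XSLT processor performs this lookup and writes the result into the output document.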