XML, DTD & XSD Overview
XML, DTD & SCHEMA
Pradeep Rapolu
MODULE 1: XML OVERVIEW
Agenda
Introduction to XML
XML Tree
XML Syntax Rules
XML Elements
XML Attributes
XML Namespaces
XML Encoding
XML with CSS
Introduction to XML
What is XML?
• XML is a markup language, much like HTML.
• XML was designed to describe data.
• XML tags are not predefined; authors define their own tags.
• XML is a W3C Recommendation.
XML is not a replacement for HTML
• XML specifies what data is.
• HTML specifies how data looks.
XML does not do anything by itself. It only carries data; other code makes use of it.
Advantages of XML:
• XML separates data from HTML.
• XML simplifies data sharing.
• XML simplifies data transport.
• XML simplifies platform changes.
• Several Internet languages are written in XML: XHTML, XML Schema, SVG, WSDL, RSS.
XML Tree
• XML documents form a tree structure.
• XML documents are made up of elements, attributes, and text.
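As a small illustration of the tree structure, the sketch below shows one root element with nested child elements, an attribute, and text. The element names and values are invented for this example and are not taken from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element: every other element is nested inside it -->
<bookstore>
  <!-- Child element carrying an attribute (category) -->
  <book category="web">
    <!-- Leaf elements holding text -->
    <title>XML Basics</title>
    <author>A. Author</author>
    <year>2010</year>
  </book>
  <book category="reference">
    <title>Schema Handbook</title>
    <author>B. Writer</author>
    <year>2012</year>
  </book>
</bookstore>
```

Read as a tree, bookstore is the root, each book is a branch, and title, author, and year are the leaves.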
XML Syntax Rules
• XML Elements Must Have a Closing Tag
• XML Tags are Case Sensitive
• XML Elements Must be Properly Nested
• XML Documents Must Have a Root Element
• Entity References
• Comments in XML
• XML must be well formed
Valid XML:
<color id="2">green</color> <!-- The color is green -->
Invalid XML (unquoted attribute value; end tag differs in case and is never closed):
<color id=2>green</Color
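The syntax rules listed above can be seen together in one well-formed document. This is only a sketch; the palette, shade, and note elements are hypothetical names chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Exactly one root element wraps the whole document -->
<palette>
  <!-- Start and end tags match in case; the attribute value is quoted -->
  <color id="1">red</color>
  <!-- Properly nested: <shade> is closed before <color> is closed -->
  <color id="2"><shade>dark</shade> green</color>
  <!-- Entity references stand in for characters with special meaning -->
  <note>Use &lt;color&gt; for hues &amp; tints.</note>
</palette>
```

A parser reads the note text as "Use <color> for hues & tints." because the entity references are replaced during parsing.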
XML Elements
• An XML element is everything from its start tag to its end tag, inclusive.
• An element can contain other elements, text, attributes, or a mix of these.
• XML element names must follow the XML naming rules.
E.g.:
<country type="subcontinent">India</country>
XML Attributes
• Attributes provide additional information about an element.
• XML attribute values must be quoted.
• Avoid attributes for data; use them only to store metadata.
E.g.:
<file type="gif">computer.gif</file>
XML Namespaces
• Namespaces are used to avoid element name conflicts.
Syntax: xmlns:prefix="URI"
Default Namespace:
• Declaring a default namespace saves using prefixes on all the child elements.
Syntax: xmlns="namespaceURI"
Example files: Default_Namespace.xml, Tables_together.xml, HTML_Table.xml, Furniture_Table.xml
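A minimal sketch of both namespace forms, assuming two vocabularies that each define a table element. The example.com URIs and the h and f prefixes are placeholders invented for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:h="http://www.example.com/html"
      xmlns:f="http://www.example.com/furniture">
  <!-- h:table is an HTML-style table -->
  <h:table>
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <!-- f:table is a piece of furniture; the prefix keeps the two names apart -->
  <f:table>
    <f:name>Coffee Table</f:name>
    <f:width>80</f:width>
  </f:table>
  <!-- Default namespace: the child elements need no prefix -->
  <table xmlns="http://www.example.com/furniture">
    <name>Side Table</name>
  </table>
</root>
```

The last table element and f:table belong to the same namespace, because a namespace is identified by its URI and not by the prefix.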
XML Encoding
• XML documents can contain international characters.
Syntax:
<?xml version="1.0" encoding="UTF-8"?>
Unicode:
• Unicode is an industry standard for the character encoding of text documents.
• The two most common Unicode encodings are UTF-8 and UTF-16.
• UTF = Universal Character Set Transformation Format.
• UTF-8 uses 1 byte (8 bits) to represent characters in the ASCII set, and two to four bytes for the rest.
• UTF-16 uses 2 bytes (16 bits) for most characters, and four bytes for the rest.
• UTF-8 is the default for XML documents without encoding information.
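To make the byte counts concrete, the sketch below declares UTF-8 and contains a few sample characters. The samples and char element names are hypothetical; the byte counts in the comments follow from the UTF-8 and UTF-16 encoding rules.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The file must actually be saved in the declared encoding -->
<samples>
  <!-- U+0041 (ASCII letter): 1 byte in UTF-8, 2 bytes in UTF-16 -->
  <char>A</char>
  <!-- U+00E9: 2 bytes in UTF-8, 2 bytes in UTF-16 -->
  <char>é</char>
  <!-- U+20AC: 3 bytes in UTF-8, 2 bytes in UTF-16 -->
  <char>€</char>
  <!-- The same euro sign written as a numeric character reference -->
  <char>&#x20AC;</char>
</samples>
```

A numeric character reference such as the last one works in any encoding, which is useful when the editor or transport cannot be trusted to preserve non-ASCII bytes.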
XML with CSS
• XML documents can be formatted with CSS (Cascading Style Sheets).
• Formatting XML with CSS is not the most common method.
• W3C recommends using XSLT instead.
Example files: CD_Catalog.xml, CD_Catalog_CSS.css
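A minimal sketch of how an XML document can be linked to a stylesheet through the xml-stylesheet processing instruction. The stylesheet name is borrowed from the file list above; the catalog content and the CSS rules shown in the trailing comment are assumptions made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Links the document to a CSS file; a browser applies it when rendering -->
<?xml-stylesheet type="text/css" href="CD_Catalog_CSS.css"?>
<catalog>
  <cd>
    <title>Sample Album</title>
    <artist>Sample Artist</artist>
  </cd>
</catalog>
<!-- The stylesheet could then contain rules such as:
     cd     { display: block; margin-bottom: 10pt; }
     title  { color: red; font-size: 20pt; }
     artist { color: blue; } -->
```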
Module 2: DTD Overview
Agenda
Introduction to DTD
DTD Building Blocks
DTD Elements
DTD Attributes
DTD Entities
Introduction to DTD
• DTD defines the document structure with a list of legal elements and attributes.
• An XML document that is well formed and also conforms to its DTD is valid.
Why DTD?
• With a DTD, each XML file can carry a description of its own format.
• An application can use a DTD to verify that XML received from the outside world is valid.
• Independent groups can agree on a standard DTD for interchanging data.
DTD Declaration Types:
1. Internal DTD Declaration
2. External DTD Declaration
1. Internal DTD Declaration:
• The DTD is declared inside the XML file.
Syntax: <!DOCTYPE root-element [element-declarations]>
2. External DTD Declaration:
• The DTD is declared in an external file.
• The XML document refers to the DTD file.
Syntax: <!DOCTYPE root-element SYSTEM "filename">
Example files: note.xml (internal declaration); note.dtd, note.xml (external declaration)
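A minimal sketch of an internal DTD declaration, assuming a simple note document. The element names and text are illustrative and are not the contents of the note.xml file listed above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Asha</to>
  <from>Ravi</from>
  <heading>Reminder</heading>
  <body>Review the schema before Friday.</body>
</note>
<!-- External form: move the five ELEMENT declarations into note.dtd and
     replace the DOCTYPE above with: <!DOCTYPE note SYSTEM "note.dtd"> -->
```

The name after DOCTYPE must match the root element, which is why both say note.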
DTD Building Blocks
• From a DTD point of view, all XML documents are made up of the following building blocks:
Elements
Attributes
Entities
PCDATA (parsed character data)
CDATA (character data that is not parsed)
DTD Elements
• In a DTD, elements are declared with an ELEMENT declaration.
Syntax:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
Element Types:
• Empty element: <!ELEMENT element-name EMPTY>
• Text only: <!ELEMENT element-name (#PCDATA)>
• Any content: <!ELEMENT element-name ANY>
• Children in sequence: <!ELEMENT element-name (child1, child2, ...)>
• Exactly one occurrence: <!ELEMENT element-name (child-name)>
• One or more occurrences: <!ELEMENT element-name (child-name+)>
• Zero or more occurrences: <!ELEMENT element-name (child-name*)>
• Zero or one occurrence: <!ELEMENT element-name (child-name?)>
• Either/or content: <!ELEMENT element-name (child1, child2, (child3|child4))>
• Mixed content: <!ELEMENT element-name (#PCDATA|child1|child2|child3|child4)*>
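The element types above can be combined in one internal DTD. The sketch below is an invented order document; all element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE order [
  <!-- order: one customer, one or more items, an optional note,
       then either cash or card -->
  <!ELEMENT order (customer, item+, note?, (cash | card))>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!-- Mixed content: text interleaved with zero or more b elements -->
  <!ELEMENT note (#PCDATA | b)*>
  <!ELEMENT b (#PCDATA)>
  <!-- EMPTY elements carry no content -->
  <!ELEMENT cash EMPTY>
  <!ELEMENT card EMPTY>
]>
<order>
  <customer>Asha</customer>
  <item>Keyboard</item>
  <item>Mouse</item>
  <note>Deliver <b>before noon</b>.</note>
  <cash/>
</order>
```

Removing both item elements, or adding a card element next to cash, would make the document invalid against this DTD while leaving it well formed.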
DTD Attributes
• In a DTD, attributes are declared with an ATTLIST declaration.
Syntax:
<!ATTLIST element-name attribute-name attribute-type attribute-value>
Attribute Values:
• Default value: <!ATTLIST element-name attribute-name attribute-type default-value>
• Required: <!ATTLIST element-name attribute-name attribute-type #REQUIRED>
• Optional: <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
• Fixed: <!ATTLIST element-name attribute-name attribute-type #FIXED "value">
• Enumerated: <!ATTLIST element-name attribute-name (en1|en2|...) default-value>
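A minimal sketch that uses each kind of attribute value in a single ATTLIST declaration. The payment element and its attribute names are assumptions made for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE payment [
  <!ELEMENT payment (#PCDATA)>
  <!ATTLIST payment
    id       ID            #REQUIRED
    method   (cash | card) "cash"
    note     CDATA         #IMPLIED
    currency CDATA         #FIXED "INR">
]>
<!-- id is mandatory; method falls back to "cash" when omitted;
     note may be left out; currency, if written, must be "INR" -->
<payment id="p1001" method="card">2500</payment>
```

Default values are written in quotes inside the declaration, and an ID value must be a valid XML name, so it cannot start with a digit.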
DTD Entities
• Entities are like variables: they define shortcuts to standard text or special characters.
• Entities can be declared internally or externally.
1. Internal Entity Declaration:
Syntax:
<!ENTITY entity-name "entity-value">
2. External Entity Declaration:
Syntax:
<!ENTITY entity-name SYSTEM "URI/URL">
Entity reference in an XML document:
<element-name>&entity-name;</element-name>
Example files: internal_entity_xml.xml, DTD_Allover_Example.xml
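A minimal sketch of internal entities in use. The entity names, their values, and the placeholder URI in the commented-out external entity are all invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE footer [
  <!ELEMENT footer (#PCDATA)>
  <!-- Internal entities: the replacement text is given inline -->
  <!ENTITY writer "Jane Writer">
  <!ENTITY copyright "Copyright Example Corp.">
  <!-- External entity, left commented out because the URI is a placeholder:
       <!ENTITY disclaimer SYSTEM "http://www.example.com/disclaimer.txt"> -->
]>
<footer>&writer; &copyright;</footer>
```

After parsing, the footer text reads "Jane Writer Copyright Example Corp."; the references are replaced by the declared values.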
Module 3: XML Schema Overview
Agenda
XML Schema
XSD Simple Types
XSD Complex Types
XSD Complex Types - Indicators
XSD Complex Types - any & anyAttribute
XSD Complex Types - Element Substitution
Writing XML Schema
XSD Data types
XML Schema
• An XML Schema describes the structure of an XML document.
• XSD (XML Schema Definition) is the XML Schema language.
What is an XML Schema?
• An XML Schema defines the legal building blocks of an XML document.
An XML Schema defines:
• the elements that can appear in a document
• the attributes that can appear in a document
• which elements are child elements
• the order of child elements
• the number of child elements
• whether an element is empty or can include text
• data types for elements and attributes
• default and fixed values for elements and attributes
Advantages of XML Schema over DTD
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
XML Schema Syntax:
• The <schema> element is the root element of every XML Schema.
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...
</xs:schema>
XML With XSD:
• An XML document refers to its XML Schema (XSD document).
Example files: Sample_XSD.xsd, Sample_XML.xml
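A minimal sketch of a schema and of an XML document that refers to it. The file names note.xsd and note.xml and the element names are assumptions for this example; the xs and xsi namespace URIs are the standard ones.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- note.xsd (hypothetical file name) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- note.xml: the xsi attribute tells a validator where to find note.xsd -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <to>Asha</to>
  <from>Ravi</from>
  <body>Review the schema before Friday.</body>
</note>
```

xsi:noNamespaceSchemaLocation fits here because the schema declares no target namespace; a schema with a target namespace would be referenced through xsi:schemaLocation instead.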
XSD Simple Types
• The simple types in XSD are:
Simple Element
Attribute
1. Simple Element:
• A simple element contains only text; it cannot contain other elements or attributes.
Syntax:
<xs:element name="element-name" type="element-type"/>
• Simple elements can have default and fixed values.
• XML Schema has many built-in data types. The most common are:
xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time
Example files: Simple_Element_XML.xml, Simple_Element_XSD.xsd
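A minimal sketch of simple element declarations, including a default and a fixed value. The element names and values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="lastname" type="xs:string"/>
  <xs:element name="age" type="xs:integer"/>
  <xs:element name="dateborn" type="xs:date"/>
  <!-- default: the value assumed when the element is present but empty -->
  <xs:element name="color" type="xs:string" default="red"/>
  <!-- fixed: no other value is allowed -->
  <xs:element name="country" type="xs:string" fixed="India"/>
</xs:schema>
<!-- Conforming instances:
     <lastname>Smith</lastname>
     <age>36</age>
     <dateborn>1970-03-27</dateborn> -->
```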
2. Attribute:
• Simple elements cannot have attributes; an element with attributes is a complex type.
• The attribute itself is always declared as a simple type.
Syntax:
<xs:attribute name="attribute-name" type="attribute-type"/>
E.g.:
<lastname lang="EN">Smith</lastname> <!-- Element with attribute -->
<xs:attribute name="lang" type="xs:string"/> <!-- Attribute definition -->
XSD Restrictions/Facets:
• Restrictions define acceptable values for XML elements or attributes.
• Restrictions on XML elements are called facets.
Different Restrictions:
• Restrictions on values
• Restrictions on a set of values
• Restrictions on a series of values
• Restrictions on whitespace characters
• Restrictions on length
Example files: Restriction_Value_XSD.xsd, Restriction_Value_XML.xml
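The five kinds of restriction can be sketched with one facet group each. The element names, limits, and enumeration values below are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Restriction on values: an integer from 0 to 120 -->
  <xs:element name="age">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="120"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- Restriction on a set of values: only the listed strings -->
  <xs:element name="size">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="small"/>
        <xs:enumeration value="medium"/>
        <xs:enumeration value="large"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- Restriction on a series of values: exactly three uppercase letters -->
  <xs:element name="initials">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z][A-Z][A-Z]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- Restriction on whitespace: runs of whitespace collapse to one space -->
  <xs:element name="address">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="collapse"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- Restriction on length: between 5 and 8 characters -->
  <xs:element name="password">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="5"/>
        <xs:maxLength value="8"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```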
XSD Complex Types
• A complex type element contains other elements and/or attributes.
• There are four kinds of complex elements:
empty elements
elements that contain only other elements
elements that contain only text
elements that contain other elements, attributes, and text
** Complex types can be extended or restricted.
Empty elements:
• An empty complex element cannot have content; it can only have attributes.
E.g.: <product prodid="1345" />
Example files: Complex_Empty_XSD.xsd, Complex_Empty_XML.xml
** If a complexType element is given a name, several elements can refer to the same complex type through a type attribute that names it.
Example files: ComplexType_Extension_XSD.xsd, ComplexType_Extension_XML.xml
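A minimal sketch of an empty complex type, a named type shared by two elements, and a type built by extension. Apart from product and prodid, which appear in the example above, every name here is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Named complex type with no child elements: attributes only -->
  <xs:complexType name="prodtype">
    <xs:attribute name="prodid" type="xs:positiveInteger"/>
  </xs:complexType>
  <!-- Two elements reuse the same named type through the type attribute -->
  <xs:element name="product" type="prodtype"/>
  <xs:element name="sparepart" type="prodtype"/>

  <!-- Base type -->
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Extension: everything in personinfo, followed by city -->
  <xs:complexType name="fullpersoninfo">
    <xs:complexContent>
      <xs:extension base="personinfo">
        <xs:sequence>
          <xs:element name="city" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="employee" type="fullpersoninfo"/>
</xs:schema>
```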
Elements that contain only other elements:
• An "elements-only" complex type describes an element that contains only other elements.
E.g.:
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
Example files: Complex_ElementsOnly_XML.xml, Complex_Empty_XSD.xsd
Elements that contain only text:
• A complex text-only element can contain text and attributes.
E.g.: <shoesize country="france">35</shoesize>
• This type contains only simple content (text and attributes).
• A simpleContent element is therefore added around the content.
Example files: Complex_TextOnly_XML.xml, Complex_TextOnly_XSD.xsd
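One possible pair of schema declarations for the person and shoesize examples above. The choice of xs:string and xs:integer as data types is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Elements-only: person contains firstname and lastname, in that order -->
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Text-only: simpleContent extends a simple type with an attribute -->
  <xs:element name="shoesize">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="country" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```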
Elements that contain other elements, attributes and text (Mixed):
• A mixed complex type element can contain attributes, elements, and text.
E.g.:
<letter id="123">
Dear Mr. <name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
Example files: Complex_Mixed_XML.xml, Complex_Mixed_XSD.xsd
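A sketch of a schema that would accept the letter example above. The data types chosen for the child elements and the id attribute are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="letter">
    <!-- mixed="true" allows text between the child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="orderid" type="xs:positiveInteger"/>
        <xs:element name="shipdate" type="xs:date"/>
      </xs:sequence>
      <!-- Attribute declarations come after the content model -->
      <xs:attribute name="id" type="xs:positiveInteger"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```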
XSD Complex Types - Indicators
• Indicators control how elements are to be used in documents.
• There are seven indicators, classified into three types.
a) Order indicators:
• Order indicators define the order of the elements.
All: The child elements can appear in any order, and each child element can occur at most once.
Choice: Either one child element or another can occur, but not both.
Sequence: The child elements must appear in a specific order.
Example files: OrderIndicator_All_XSD.xsd, OrderIndicator_All_XML.xml
b) Occurrence indicators:
• Occurrence indicators define the number of times an element can appear.
maxOccurs: Maximum number of times an element can occur.
minOccurs: Minimum number of times an element can occur.
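The order and occurrence indicators can be sketched together as follows. The person and address elements and their children are hypothetical names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <!-- sequence: children must appear in this order -->
      <xs:sequence>
        <xs:element name="fullname" type="xs:string"/>
        <!-- occurrence: childname may appear 0 to 10 times -->
        <xs:element name="childname" type="xs:string"
                    minOccurs="0" maxOccurs="10"/>
        <!-- choice: exactly one of employee or member -->
        <xs:choice>
          <xs:element name="employee" type="xs:string"/>
          <xs:element name="member" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="address">
    <xs:complexType>
      <!-- all: street and city may appear in either order, once each -->
      <xs:all>
        <xs:element name="street" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

When minOccurs and maxOccurs are omitted, both default to 1, so fullname above must appear exactly once.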
c) Group indicators:
• Group indicators define related sets of elements or attributes.
Element Groups:
• Define related sets of elements.
• Element groups are defined with the group declaration.
Syntax:
<xs:group name="groupname">...</xs:group>
Attribute Groups:
• Define related sets of attributes.
• Attribute groups are defined with the attributeGroup declaration.
Syntax:
<xs:attributeGroup name="groupname">...</xs:attributeGroup>
Example files: GroupIndicator_Element_XML.xml, GroupIndicator_Element_XSD.xsd, GroupIndicator_Attribute_XML.xml, GroupIndicator_Attribute_XSD.xsd
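A minimal sketch of defining an element group and an attribute group and then referencing both from one complex type. All group, element, and attribute names are invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Element group: a reusable block of child elements -->
  <xs:group name="persongroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:group>
  <!-- Attribute group: a reusable block of attributes -->
  <xs:attributeGroup name="personattrgroup">
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="lang" type="xs:language"/>
  </xs:attributeGroup>
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- The groups are pulled in with ref -->
        <xs:group ref="persongroup"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
      <xs:attributeGroup ref="personattrgroup"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```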
XSD Complex Types - any & anyAttribute
any Element:
• The <any> element enables us to extend the XML document with elements not specified by the schema.
anyAttribute Element:
• The <anyAttribute> element enables us to extend the XML document with attributes not specified by the schema.
Example files: Any_XML.xml, Any_XSD.xsd, Any_Children_XSD.xsd, anyAttribute_XML.xml, anyAttribute_XSD.xsd, attribute_XSD.xsd
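A minimal sketch of a type left open for extension with any and anyAttribute. The element names are hypothetical, and processContents="lax" is a choice made for this example so that undeclared extensions are accepted without validation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <!-- any: one optional extra element not named in this schema -->
        <xs:any minOccurs="0" processContents="lax"/>
      </xs:sequence>
      <!-- anyAttribute: extra attributes not named in this schema -->
      <xs:anyAttribute processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
<!-- Accepted instance:
     <person eyecolor="green">
       <firstname>Asha</firstname>
       <lastname>Rao</lastname>
       <nickname>Ash</nickname>
     </person> -->
```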
XSD Complex Types - Element Substitution
• With element substitution, one element can stand in for another in different instance documents.
• The substitutionGroup attribute is used to apply substitution.
• Substitution can be blocked with the attribute block="substitution".
Example files: First_XML.xml, Second_XML.xml, Element_Substitution_XSD.xsd
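A minimal sketch of element substitution, assuming an English element name and a Norwegian alternative. All element names are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Head element; both elements must be declared globally -->
  <xs:element name="name" type="xs:string"/>
  <!-- navn may appear wherever name is referenced -->
  <xs:element name="navn" substitutionGroup="name"/>
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- block="substitution" prevents other elements from standing in for code -->
  <xs:element name="code" type="xs:string" block="substitution"/>
</xs:schema>
<!-- Both instances are accepted:
     <customer><name>John Smith</name></customer>
     <customer><navn>John Smith</navn></customer> -->
```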
Writing XML Schema
• Schemas for XML can be written in the following ways:
Hierarchical manner
Divide the schema
Using named types
Example files: XML.xml, Schema_XSD.xsd, Divide_Schema_XSD.xsd, NamedTypes_Schema_XSD.xsd
XSD Data types
• XSD has the following categories of data types:
String
Date
Numeric
Miscellaneous (Boolean, Binary, AnyURI)
Reference:
http://www.w3schools.com