XML DotNet Lecture Notes
Department of Computer Science and Engineering
University of South Carolina
Columbia, SC 29208
CSCE 547: Windows Programming
XML Support
CSCE 547, Fall 2002, Ch 13

Why XML?
XML stands for eXtensible Markup Language.
XML is not an extension of HTML; both derive from SGML. XML is designed to express the structure of data, and leaves information about how to render the data to stylesheets.
Some organizations are defining standards that use XML to express the semantics of their domains (healthcare, automotive, security, and the military).
Why XML? Because:
1. It is just text, readable by humans and by any OS (Linux, Mac OS, WinTel, etc.).
2. It has become the de facto standard for anyone who wants to communicate data over the WWW.
This chapter discusses .NET support for XML.

Why XML? (continued)
XML encourages the separation of interface from structured data, allowing the seamless integration of data from diverse sources and providing the infrastructure to create N-tier architectures.

XML Documents
XML documents can be described in terms of their logical and physical structure.
The logical structure is a function of the XML elements and attributes contained in the document.
The physical structure is the set of storage units in which the document actually exists. These units, called entities, could be a stream of characters or one or more files.
XML documents contain two parts, called the header (the "prolog" in the XML specification) and the content.
Typically, the header contains declarations or processing instructions (commands for the XML processor).

XML Documents Can Contain
• Processing instructions (aka PIs), delimited by <? . . . ?>
• Declarations, in the form <! aDeclaration >
• Elements
• Attributes
• Entities
• Comments
Typically, the header holds the declarations and/or processing instructions.

Processing Instructions and Declarations
Processing instructions (<?xml ... ?> and similar):
    <?xml version="1.0"?>
    <?xml-stylesheet href="XSL\DotNet.html.xsl" type="text/xsl"?>
    <?xml-stylesheet href="XSL\DotNet.wml.xsl" type="text/xsl" media="wap"?>
    <?cocoon-process type="xslt"?>

    <?xml-stylesheet type="text/xsl" href="Guitars.xsl"?>
    <?xml version="1.0" encoding="UTF-16"?>
Declarations:
    <!DOCTYPE DotNetXML:Book SYSTEM "DTD\DotNetXML.dtd">
    <!NOTATION PNG SYSTEM "program.exe">
    <!ATTLIST . . . >
    <!ENTITY AGRAPH SYSTEM "file.png" NDATA PNG>
    <!ENTITY memoText "blablabla">
    <memo> &memoText; </memo>
& is the reference notation: &memoText; is replaced by the value of the memoText entity.

XML Elements
XML elements are made up of a start tag, an end tag, and data in between. The start and end tags describe the data or value of the elements:
    <Student> Anita Donut </Student>
    <CarDriver> Anita Donut </CarDriver>
    <BloodDonor> Anita Donut </BloodDonor>
Elements can be empty, e.g.,
    <memo> </memo>
But this only makes sense when the element carries attributes. The preferred way is:
    <memo />
Attributes define properties for an element. XML elements can contain one or more attributes.

XML Elements (continued)
The XML tree in Figure 13-1 was produced by the code below:
    <?xml version="1.0"?>
    <Guitars>
      <Guitar>
        <Make>Gibson</Make>
        <Model>SG</Model>
        <Year>1977</Year>
        <Color>Tobacco Sunburst</Color>
        <Neck>Rosewood</Neck>
      </Guitar>
      <Guitar>
        <Make>Fender</Make>
        <Model>Stratocaster</Model>
        <Year></Year>
        <Color>Black</Color>
        <Neck>Maple</Neck>
      </Guitar>
    </Guitars>
Using attributes:
    <Guitar Year="1977">
      <Make>Gibson</Make>
      <Model>SG</Model>
      <Color>Tobacco Sunburst</Color>
      <Neck>Rosewood</Neck>
    </Guitar>

    <Guitar Image="MySG.jpeg">
      <Make>Gibson</Make>
      <Model>SG</Model>
      <Year>1977</Year>
      <Color>Tobacco Sunburst</Color>
      <Neck>Rosewood</Neck>
    </Guitar>

Namespaces
XML uses namespaces to avoid name collisions, so that, e.g., gibson:Color and fender:Color may refer to different elements.
    <?xml version="1.0"?>
    <win:Guitars
      xmlns:win="http://www.wintellect.com/classic-guitars"
      xmlns:gibson="http://www.gibson.com/finishes"
      xmlns:fender="http://www.fender.com/finishes">
      <win:Guitar>
        <win:Make>Gibson</win:Make>
        <win:Model>SG</win:Model>
        <win:Year>1977</win:Year>
        <gibson:Color>Tobacco Sunburst</gibson:Color>
        <win:Neck>Rosewood</win:Neck>
      </win:Guitar>
      <win:Guitar>
        <win:Make>Fender</win:Make>
        <win:Model>Stratocaster</win:Model>
        <win:Year>1990</win:Year>
        <fender:Color>Black</fender:Color>
        <win:Neck>Maple</win:Neck>
      </win:Guitar>
    </win:Guitars>

Default Namespaces
A default namespace is declared with no prefix. The XML in the previous slide has the same content as this one.
    <?xml version="1.0"?>
    <Guitars
      xmlns="http://www.wintellect.com/classic-guitars"
      xmlns:gibson="http://www.gibson.com/finishes"
      xmlns:fender="http://www.fender.com/finishes">
      <Guitar>
        <Make>Gibson</Make>
        <Model>SG</Model>
        <Year>1977</Year>
        <gibson:Color>Tobacco Sunburst</gibson:Color>
        <Neck>Rosewood</Neck>
      </Guitar>
      <Guitar>
        <Make>Fender</Make>
        <Model>Stratocaster</Model>
        <Year>1990</Year>
        <fender:Color>Black</fender:Color>
        <Neck>Maple</Neck>
      </Guitar>
    </Guitars>
The xmlns attribute with no prefix declares the default namespace.
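One way to check that a prefixed element and a default-namespace element really name the same thing is to compare their namespace URI and local name after parsing. The sketch below is only an illustration: the class NamespaceSketch and the helper QualifiedMake are hypothetical names, and the two documents are cut-down versions of the guitar examples.

```csharp
using System;
using System.Xml;

static class NamespaceSketch
{
    // Hypothetical helper: returns "{namespace-uri}local-name" for the first
    // Make element that belongs to namespace ns, or null if there is none.
    public static string QualifiedMake(string xml, string ns)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNodeList makes = doc.GetElementsByTagName("Make", ns);
        if (makes.Count == 0) return null;
        return "{" + makes[0].NamespaceURI + "}" + makes[0].LocalName;
    }

    static void Main()
    {
        const string ns = "http://www.wintellect.com/classic-guitars";
        // Same content, once with an explicit prefix and once with a default namespace.
        string prefixed =
            "<win:Guitars xmlns:win='" + ns + "'><win:Guitar>" +
            "<win:Make>Gibson</win:Make></win:Guitar></win:Guitars>";
        string defaulted =
            "<Guitars xmlns='" + ns + "'><Guitar>" +
            "<Make>Gibson</Make></Guitar></Guitars>";

        // Both calls should print the same qualified name.
        Console.WriteLine(QualifiedMake(prefixed, ns));
        Console.WriteLine(QualifiedMake(defaulted, ns));
    }
}
```

The prefix itself (win) is only shorthand; a namespace-aware parser compares the URI it maps to.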

Document Validation
"Well-formed" documents satisfy XML's syntactic rules. Well-formed documents may additionally be validated against schema documents, which define in great detail how elements in the document must be written.
    <?xml version="1.0"?>
    <xsd:schema id="Guitars" xmlns=""
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Guitars">
        <xsd:complexType>
          <xsd:choice maxOccurs="unbounded">
            <xsd:element name="Guitar">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="Make" type="xsd:string" />
                  <xsd:element name="Model" type="xsd:string" />
                  <xsd:element name="Year" type="xsd:gYear" minOccurs="0" />
                  <xsd:element name="Color" type="xsd:string" minOccurs="0" />
                  <xsd:element name="Neck" type="xsd:string" minOccurs="0" />
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:choice>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
• xsd:schema: the document is a schema.
• http://www.w3.org/2001/XMLSchema: as of 2001, this was the mother of all schemas.
• The xsd: definitions (shown in red on the original slide) come from the XMLSchema document.

Parsing XML
There are two main APIs for XML parsers: DOM and SAX. The differences are significant. DOM parsers assume that the entire document resides in memory, while SAX parsers do their work under an event-driven model.
DOM offers the advantage of random access, while SAX offers advantages derived from the event-driven style of processing.
Microsoft offers a DOM-based parser, MSXML.dll, as part of IE in Windows.
The DOM tree of Figure 13-2 can be produced by:
    <?xml version="1.0"?>
    <Guitars>
      <Guitar Image="MySG.jpeg">
        <Make>Gibson</Make>
        <Model>SG</Model>
        <Year>1977</Year>
        <Color>Tobacco Sunburst</Color>
        <Neck>Rosewood</Neck>
      </Guitar>
    </Guitars>

    <?xml version="1.0"?>
    <Guitars>
      <Guitar Image="MySG.jpeg">
        <Make>Gibson</Make>
        <Model>SG</Model>
        <Year>1977</Year>
        <Color>Tobacco Sunburst</Color>
        <Neck>Rosewood</Neck>
      </Guitar>
      <Guitar Image="MyStrat.jpeg" PreviousOwner="Eric Clapton">
        <Make>Fender</Make>
        <Model>Stratocaster</Model>
        <Year>1990</Year>
        <Color>Black</Color>
        <Neck>Maple</Neck>
      </Guitar>
    </Guitars>

ReadXML.CPP
This sample code reads XML using MSXML.dll.
Although the code is great fun to decipher, not everyone enjoys doing so.
The crucial code is:
    // Create a COM object to host the parser in the memory of this process
    hr = CoCreateInstance (CLSID_DOMDocument, NULL, CLSCTX_INPROC_SERVER,
        IID_IXMLDOMDocument, (void**) &pDoc);
    // Use the parser to load the XML document from file
    hr = pDoc->load (var, &success);
    // Get the elements with the given tag into pNodeList
    hr = pDoc->getElementsByTagName (tag, &pNodeList);

ReadXML.CS
The code below also reads the Guitars.xml file and writes to the console the make and model of each "Guitar" element.
The entire code is:
    using System;
    using System.Xml;

    class MyApp
    {
        static void Main ()
        {
            XmlDocument doc = new XmlDocument ();
            doc.Load ("Guitars.xml");
            XmlNodeList nodes = doc.GetElementsByTagName ("Guitar");
            foreach (XmlNode node in nodes) {
                Console.WriteLine ("{0} {1}", node["Make"].InnerText,
                    node["Model"].InnerText);
            }
        }
    }

XmlDocument Class
This class is compatible with DOM Level 2. Using the class is straightforward, even for discovering the contents of the nodes in the document:
    XmlDocument doc = new XmlDocument ();
    doc.Load ("Guitars.xml");
    OutputNode (doc.DocumentElement);
    ...

    void OutputNode (XmlNode node)
    {
        Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
            node.NodeType, node.Name, node.Value);
        if (node.HasChildNodes) {
            XmlNodeList children = node.ChildNodes;
            foreach (XmlNode child in children)
                OutputNode (child);
        }
    }
• XmlNode is a class that contains type, name, and value information.
• XmlDocument, XmlNode, and XmlNodeList (shown in red on the original slide) are defined in the System.Xml namespace.
• doc.DocumentElement points to the root element once the document is loaded.

Inspecting Attributes
A node may have a collection named Attributes, which may contain XmlAttribute items, which in turn may contain type, name, and value.
    void OutputNode (XmlNode node)
    {
        Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
            node.NodeType, node.Name, node.Value);
        // Attributes and XmlAttribute
        if (node.Attributes != null) {
            foreach (XmlAttribute attr in node.Attributes)
                Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
                    attr.NodeType, attr.Name, attr.Value);
        }
        // HasChildNodes and ChildNodes
        if (node.HasChildNodes) {
            foreach (XmlNode child in node.ChildNodes)
                OutputNode (child);
        }
    }

XmlTextReader
This class is a forward-only reader which, like the ADO.NET DataReader class, provides a fast mechanism for traversing an XML document.
    XmlTextReader reader = null;
    try {
        reader = new XmlTextReader ("Guitars.xml");
        reader.WhitespaceHandling = WhitespaceHandling.None;
        while (reader.Read ()) {
            if (reader.NodeType == XmlNodeType.Element &&
                reader.Name == "Guitar" &&
                reader.AttributeCount > 0) {
                while (reader.MoveToNextAttribute ()) {
                    if (reader.Name == "Image") {
                        Console.WriteLine (reader.Value);
                        break;
                    }
                }
            }
        }
    }
    finally {
        if (reader != null)
            reader.Close ();
    }
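The same forward-only traversal can be tried without a Guitars.xml file on disk. The sketch below is an assumption-laden variant, not the slide's code: it uses the XmlReader.Create factory from later .NET versions instead of the XmlTextReader constructor, reads from an in-memory string, and the names ReaderSketch and ImageAttributes are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ReaderSketch
{
    // Hypothetical helper: collects the Image attribute of every Guitar element
    // while moving forward through the document exactly once.
    public static List<string> ImageAttributes(string xml)
    {
        var images = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Guitar")
                {
                    // GetAttribute returns null when the attribute is absent.
                    string image = reader.GetAttribute("Image");
                    if (image != null)
                        images.Add(image);
                }
            }
        }
        return images;
    }

    static void Main()
    {
        string xml =
            "<Guitars>" +
            "<Guitar Image='MySG.jpeg'><Make>Gibson</Make></Guitar>" +
            "<Guitar><Make>Fender</Make></Guitar>" +
            "</Guitars>";
        foreach (string image in ImageAttributes(xml))
            Console.WriteLine(image);
    }
}
```

Because the reader never builds a tree, memory use stays flat regardless of document size; the price is that nothing already passed can be revisited.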

XmlValidatingReader
Hopefully you guessed it: this class performs validation while reading. Validation can be against DTD, XSD, or XDR schemas.
    using System;
    using System.Xml;
    using System.Xml.Schema;

    class MyApp
    {
        static void Main (string[] args)
        {
            if (args.Length < 2) {
                Console.WriteLine ("Syntax: VALIDATE xmldoc schemadoc");
                return;
            }
            XmlValidatingReader reader = null;
            try {
                XmlTextReader nvr = new XmlTextReader (args[0]);
                nvr.WhitespaceHandling = WhitespaceHandling.None;
                reader = new XmlValidatingReader (nvr);
                reader.Schemas.Add (GetTargetNamespace (args[1]), args[1]);
                reader.ValidationEventHandler +=
                    new ValidationEventHandler (OnValidationError);
                while (reader.Read ());
            }
            catch (Exception ex) {
                Console.WriteLine (ex.Message);
            }
            finally {
                if (reader != null)
                    reader.Close ();
            }
        }
        // GetTargetNamespace and OnValidationError are not shown on the slide.
    }
Throws an exception if invalid elements are found.
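For experimenting with schema validation outside the sample, a minimal sketch follows. It does not reproduce the slide's program: it assumes a later .NET version, where XmlValidatingReader is marked obsolete and validation is configured through XmlReaderSettings and XmlSchemaSet, and it validates in-memory strings. The names ValidationSketch and Validate and the trimmed-down schema are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class ValidationSketch
{
    // Hypothetical helper: validates xml against the XSD text and returns the
    // validation messages (an empty list means the document is valid).
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // null means: use the targetNamespace declared in the schema itself.
            settings.Schemas.Add(null, schemaReader);
        }
        // Collect problems instead of letting the reader throw.
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }   // validation happens as a side effect of reading
        }
        return errors;
    }

    static void Main()
    {
        // Cut-down schema: every Guitar needs a Make followed by a Model.
        string xsd = @"<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <xsd:element name='Guitars'>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name='Guitar' maxOccurs='unbounded'>
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name='Make' type='xsd:string' />
              <xsd:element name='Model' type='xsd:string' />
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>";
        string good = "<Guitars><Guitar><Make>Gibson</Make><Model>SG</Model></Guitar></Guitars>";
        string bad = "<Guitars><Guitar><Make>Gibson</Make></Guitar></Guitars>";

        Console.WriteLine("good: {0} problem(s)", Validate(good, xsd).Count);
        foreach (string message in Validate(bad, xsd))
            Console.WriteLine("bad: " + message);
    }
}
```

Registering a ValidationEventHandler is a design choice: without one, the reader throws an XmlSchemaException at the first invalid node, whereas the handler lets the whole document be checked in one pass.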

XmlTextWriter
This class has methods for writing elements, attributes, comments, etc., to an XML document.
    XmlTextWriter writer = null;
    try {
        writer = new XmlTextWriter ("Guitars.xml", System.Text.Encoding.Unicode);
        writer.Formatting = Formatting.Indented;
        writer.WriteStartDocument ();
        writer.WriteStartElement ("Guitars");
        writer.WriteStartElement ("Guitar");
        writer.WriteAttributeString ("Image", "MySG.jpeg");
        writer.WriteElementString ("Make", "Gibson");
        writer.WriteElementString ("Model", "SG");
        writer.WriteElementString ("Year", "1977");
        writer.WriteElementString ("Color", "Tobacco Sunburst");
        writer.WriteElementString ("Neck", "Rosewood");
        writer.WriteEndElement ();
        writer.WriteEndElement ();
    }
    finally {
        if (writer != null)
            writer.Close ();
    }
The resulting Guitars.xml:
    <?xml version="1.0" encoding="utf-16"?>
    <Guitars>
      <Guitar Image="MySG.jpeg">
        <Make>Gibson</Make>
        <Model>SG</Model>
        <Year>1977</Year>
        <Color>Tobacco Sunburst</Color>
        <Neck>Rosewood</Neck>
      </Guitar>
    </Guitars>
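The writer calls can also be exercised without touching the file system. The sketch below assumes a later .NET version and uses the XmlWriter.Create factory with a StringWriter rather than the XmlTextWriter constructor; WriterSketch and BuildGuitars are hypothetical names, and only a few of the guitar's elements are written.

```csharp
using System;
using System.IO;
using System.Xml;

static class WriterSketch
{
    // Hypothetical helper: builds a small Guitars document in memory
    // and returns it as text.
    public static string BuildGuitars()
    {
        var text = new StringWriter();
        var settings = new XmlWriterSettings();
        settings.Indent = true;   // same intent as Formatting.Indented

        using (XmlWriter writer = XmlWriter.Create(text, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("Guitars");
            writer.WriteStartElement("Guitar");
            writer.WriteAttributeString("Image", "MySG.jpeg");
            writer.WriteElementString("Make", "Gibson");
            writer.WriteElementString("Model", "SG");
            writer.WriteEndElement();    // closes Guitar
            writer.WriteEndElement();    // closes Guitars
            writer.WriteEndDocument();
        }   // disposing the writer flushes everything into the StringWriter
        return text.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildGuitars());
    }
}
```

Each WriteStartElement must be balanced by a WriteEndElement; the writer tracks the open elements and emits the matching end tags itself.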

XPath
XPath is a query language that can be used to get elements or attributes from an XML document, using "path expressions." Since these expressions are a bit arcane, the World Wide Web Consortium (W3C) is working on a SQL-like query language aimed at replacing XPath.
In the meantime, .NET offers XPath support via a class named XPathNavigator, which contains a number of features (methods, events, etc.) that make querying a document quite simple, as seen in XPathDemo.cs:
    using System;
    using System.Xml.XPath;

    class MyApp
    {
        static void Main ()
        {
            XPathDocument doc = new XPathDocument ("Guitars.xml");
            XPathNavigator nav = doc.CreateNavigator ();
            // "/Guitars/Guitar" is the query expression
            XPathNodeIterator iterator = nav.Select ("/Guitars/Guitar");
            while (iterator.MoveNext ()) {
                XPathNodeIterator it = iterator.Current.Select ("Make");
                it.MoveNext ();
                string make = it.Current.Value;
                it = iterator.Current.Select ("Model");
                it.MoveNext ();
                string model = it.Current.Value;
                Console.WriteLine ("{0} {1}", make, model);
            }
        }
    }
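Path expressions can also filter with a predicate in square brackets. The sketch below is an illustration under assumptions: it queries an in-memory string, the names XPathSketch and ModelsByMake are hypothetical, and the make value is assumed to contain no quote characters, since it is pasted directly into the expression.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

static class XPathSketch
{
    // Hypothetical helper: returns the Model of every Guitar whose Make
    // equals the given value, joined with commas.
    // Assumption: make contains no single quotes.
    public static string ModelsByMake(string xml, string make)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // [Make='...'] is a predicate: keep only the Guitar elements that match.
        XPathNodeIterator it =
            nav.Select("/Guitars/Guitar[Make='" + make + "']/Model");

        var models = new List<string>();
        while (it.MoveNext())
            models.Add(it.Current.Value);
        return string.Join(",", models.ToArray());
    }

    static void Main()
    {
        string xml =
            "<Guitars>" +
            "<Guitar><Make>Gibson</Make><Model>SG</Model></Guitar>" +
            "<Guitar><Make>Fender</Make><Model>Stratocaster</Model></Guitar>" +
            "<Guitar><Make>Gibson</Make><Model>Les Paul</Model></Guitar>" +
            "</Guitars>";
        Console.WriteLine(ModelsByMake(xml, "Gibson"));
    }
}
```

Putting the filter in the expression avoids the nested Select calls and the manual comparison that a plain "/Guitars/Guitar" query would need.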

Expressalyzer.cs
This application, shown in Figure 13-12, illustrates the power of XPath.
You can load a document and make queries dynamically (provided that you are familiar with XPath expressions).
The crucial methods in this application are OnExecuteExpression, where a navigator is built, and AddNoteAndChildren, where, depending on the type of item found, nodes are added to the TreeView.

XSL Transformations
XSL is a language that can be used to transform a document from one format into another. XSL stands for eXtensible Stylesheet Language, and was probably the main reason XML became so popular, as it was a crucial factor in the early success of EDI (Electronic Data Interchange).
Organizations use XSL to exchange documents with other organizations, e.g., just in the healthcare sector:
    Humana <-> Kaiser Permanente
    BlueCross BlueShield <-> HCA
XSLT is at the heart of MS BizTalk Server, a set of B2B tools that facilitate converting all kinds of business forms (invoices, paychecks, purchase orders, etc.) from one format to another.
Figure 13-13 illustrates this concept.

XML --> HTML
1. Copy Figure 13-16's Guitars.xml and Guitars.xsl into a directory.
2. Comment out the following statement in Guitars.xml:
    <?xml-stylesheet type="text/xsl" href="Guitars.xsl"?>
3. Open Guitars.xml in IE (Figure 13-14).
4. Uncomment the statement.
5. Open Guitars.xml again in IE (Figure 13-15).
The stylesheet named in the <?xml-stylesheet ...?> statement, Guitars.xsl, contains instructions to transform the XML file into an HTML table at the client side.

Guitars.XSL
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
      <xsl:template match="/">
        <html>
          <body>
            <h1>My Guitars</h1>
            <hr />
            <table width="100%" border="1">
              <tr bgcolor="gainsboro">
                <td><b>Make</b></td>
                <td><b>Model</b></td>
                <td><b>Year</b></td>
                <td><b>Color</b></td>
                <td><b>Neck</b></td>
              </tr>
              <xsl:for-each select="Guitars/Guitar">
                <tr>
                  <td><xsl:value-of select="Make" /></td>
                  <td><xsl:value-of select="Model" /></td>
                  <td><xsl:value-of select="Year" /></td>
                  <td><xsl:value-of select="Color" /></td>
                  <td><xsl:value-of select="Neck" /></td>
                </tr>
              </xsl:for-each>
            </table>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

XSLT at the Server
.NET provides a class named XslTransform that can convert a document from one format to another at the server side, using ASP.NET.
The chapter illustrates how this can be done in three files:
    Quotes.aspx
    Quotes.xml
    Quotes.xsl
The result is shown in Figure 13-17.
Note that the key to getting this done is a good understanding of XSL specifics.

XslTransform in CS
The code below shows how easy it is to work with XslTransform.
Again, as long as you know the details of XSL, transforming a document to another format is quite easy.
    using System;
    using System.Xml.XPath;
    using System.Xml.Xsl;

    class MyApp
    {
        static void Main (string[] args)
        {
            if (args.Length < 2) {
                Console.WriteLine ("Syntax: TRANSFORM xmldoc xsldoc");
                return;
            }
            try {
                XPathDocument doc = new XPathDocument (args[0]);
                XslTransform xsl = new XslTransform ();
                xsl.Load (args[1]);
                xsl.Transform (doc, null, Console.Out);
            }
            catch (Exception ex) {
                Console.WriteLine (ex.Message);
            }
        }
    }
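A transformation can be tried end to end with both the document and the stylesheet held in strings. The sketch below is not the slide's program: it assumes a later .NET version, where XslTransform is marked obsolete in favor of XslCompiledTransform, and the names TransformSketch and Apply, along with the small text-output stylesheet, are invented for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

static class TransformSketch
{
    // Hypothetical helper: applies the stylesheet text to the XML text
    // and returns the result as a string.
    public static string Apply(string xml, string xslt)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(styleReader);
        }
        var doc = new XPathDocument(new StringReader(xml));
        var output = new StringWriter();
        transform.Transform(doc, null, output);   // null: no stylesheet parameters
        return output.ToString();
    }

    static void Main()
    {
        // Emits "Make Model;" for each guitar as plain text.
        string xslt = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text' />
  <xsl:template match='/'>
    <xsl:for-each select='Guitars/Guitar'>
      <xsl:value-of select='Make' />
      <xsl:text> </xsl:text>
      <xsl:value-of select='Model' />
      <xsl:text>;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";
        string xml =
            "<Guitars>" +
            "<Guitar><Make>Gibson</Make><Model>SG</Model></Guitar>" +
            "<Guitar><Make>Fender</Make><Model>Stratocaster</Model></Guitar>" +
            "</Guitars>";
        Console.WriteLine(Apply(xml, xslt));
    }
}
```

Swapping in a stylesheet that emits HTML, such as Guitars.xsl, changes only the xslt string; the C# side stays the same.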