Xml writers

Transcript of Xml writers

Page 1: Xml writers

XML Writers

Page 2: Xml writers

XML documents are text-based files

The XML Writer Programming Interface:

An XML writer represents a component that provides a fast, forward-only way of outputting XML data to streams or files.

Page 3: Xml writers

void CreateXmlFile(String[] theArray, string filename)
{
    StringBuilder sb = new StringBuilder("");

    // Loop through the array and build the file
    sb.Append("<array>");
    foreach(string s in theArray)
    {
        sb.Append("<element value=\"");
        sb.Append(s);
        sb.Append("\"/>");
    }
    sb.Append("</array>");

    // Create the file
    StreamWriter sw = new StreamWriter(filename);
    sw.Write(sb.ToString());
    sw.Close();
}

Page 5: Xml writers
Page 6: Xml writers

Let's rewrite our sample file using .NET XML writers, as shown in the following code. A .NET XML writer features ad hoc write methods for each possible XML node type and makes the creation of XML output more logical and much less dependent on the intricacies, and even the quirkiness, of the markup language.

Page 7: Xml writers

void CreateXmlFileUsingWriters(String[] theArray, string filename)
{
    // Open the XML writer (default encoding charset)
    XmlTextWriter xmlw = new XmlTextWriter(filename, null);
    xmlw.Formatting = Formatting.Indented;

    xmlw.WriteStartDocument();
    xmlw.WriteStartElement("array");
    foreach(string s in theArray)
    {
        xmlw.WriteStartElement("element");
        xmlw.WriteAttributeString("value", s);
        xmlw.WriteEndElement();
    }
    xmlw.WriteEndDocument();

    // Close the writer
    xmlw.Close();
}

Page 9: Xml writers

<?xml version="1.0"?>
<array>
  <element value="Rome" />
  <element value="New York" />
  <element value="Sydney" />
  <element value="Stockholm" />
  <element value="Paris" />
</array>

Page 10: Xml writers

An XML writer is a specialized class that knows only how to write XML data to a variety

of storage media. It features ad hoc methods to write any special item that

characterizes XML documents—from character entities to processing instructions, from

comments to attributes, and from element nodes to plain text. In addition, and more

important, an XML writer guarantees well-formed XML 1.0–compliant output. And you

don't have to worry about a single angle bracket or the last element node that you left

open.

Page 11: Xml writers

The XmlWriter Base Class

XML writers are based on the XmlWriter abstract class that defines the .NET

Framework interface for writing XML. The XmlWriter class is not directly creatable from

user applications, but it can be used as a reference type for objects that are instances

of classes derived from XmlWriter. Actually, the .NET Framework provides just one

class that gives a concrete implementation of the XmlWriter interface—the

XmlTextWriter class.

Page 12: Xml writers

Writing Well-Formed XML Text

The XmlTextWriter class takes a number of precautions to ensure that the final XML

code is perfectly compliant with the XML 1.0 standard of well-formedness. In particular,

the class verifies that any special character found in the passed text is automatically

escaped and that no elements are written in the wrong order (such as attributes outside

nodes, or CDATA sections within attributes). Finally, the Close method performs a full

check of well-formedness immediately prior to return. If the verification is successful,

the method ends gracefully; otherwise, an exception is thrown.
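The two precautions described above, automatic escaping and rejection of out-of-order items, can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and a StringWriter is used in place of a file so the output can be inspected in memory.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class WellFormedSketch
{
    // Writes one <element> whose attribute value may contain markup characters.
    public static string WriteEscaped(string value)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartElement("element");
        writer.WriteAttributeString("value", value); // the writer escapes & and <
        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }

    // Returns true if the writer refuses an attribute written outside any element.
    public static bool AttributeOutsideElementIsRejected()
    {
        XmlTextWriter writer = new XmlTextWriter(new StringWriter());
        try
        {
            writer.WriteAttributeString("value", "orphan");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(WriteEscaped("Tom & Jerry <cats>"));
        Console.WriteLine(AttributeOutsideElementIsRejected());
    }
}
```

Passing the same string to the StringBuilder version of CreateXmlFile would copy the ampersand and angle brackets into the file unchanged, which is exactly the kind of mistake the writer removes.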

Page 13: Xml writers

Other controls that the XmlTextWriter class performs on the generated XML output

ensure that each document starts with the standard XML prolog, shown in the following

code, and that any DOCTYPE node always precedes the document root node:

<?xml version="1.0" ?>

Page 14: Xml writers

The following code demonstrates how to write two identical attributes for a specified

node:

xmlw.WriteStartElement("element");

xmlw.WriteAttributeString("value", s);

xmlw.WriteAttributeString("value", s);

xmlw.WriteEndElement();

In the check made just before dumping data out, the writer neither verifies the names

and semantics of the attributes nor validates the schema of the resultant document,

thus authorizing this code to generate bad XML.

Page 15: Xml writers

Building an XML Document

Initialize the document

Write data

Close the document
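The three steps listed above map onto a handful of XmlTextWriter calls. The following is a minimal sketch, assuming an in-memory StringWriter as the target; the class name, method name, and element names are made up for the example.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class ThreeStepSketch
{
    public static string BuildDocument(string[] cities)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        // 1. Initialize the document: prolog plus root element
        writer.WriteStartDocument();
        writer.WriteStartElement("cities");

        // 2. Write data: one <city> element per array entry
        foreach (string city in cities)
            writer.WriteElementString("city", city);

        // 3. Close the document: closes the root, then releases the writer
        writer.WriteEndDocument();
        writer.Close();

        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildDocument(new string[] { "Rome", "Paris" }));
    }
}
```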

Page 16: Xml writers

Writing the XML Prolog

Once you have a living and functional instance of the XmlTextWriter class, the first XML

element you add to it is the official XML 1.0 signature. You obtain this signature in a

very natural and transparent way simply by calling the WriteStartDocument method.

This method starts a new document and marks the XML declaration with the version

attribute set to "1.0", as shown in the following code:

// produces: <?xml version="1.0"?>

writer.WriteStartDocument();
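WriteStartDocument also has an overload that takes a Boolean and adds the standalone attribute to the declaration. A small sketch, assuming an in-memory target (which is why the declaration also carries an encoding attribute); the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class PrologSketch
{
    // Returns a tiny document whose declaration carries standalone="yes" or "no".
    public static string WriteProlog(bool standalone)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartDocument(standalone);
        writer.WriteStartElement("root");
        writer.WriteEndDocument();
        writer.Close();
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WriteProlog(true));
    }
}
```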

Page 17: Xml writers

Decoding Base64 and BinHex Data

Reading encoded data is a bit trickier, but not because the ReadBase64 and

ReadBinHex methods feature a more complex interface. The difficulty lies in the fact

that you have to allocate a buffer to hold the data and make some decision about its

size. If the buffer is too large, you can easily waste memory; if the buffer is too small,

you must set up a potentially lengthy loop to read all the data. In addition, if you can't

process data as you read it, you need another buffer or stream in which you can

accumulate incoming data.
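One way to handle the buffer-size trade-off is to read with a small fixed buffer in a loop and accumulate the decoded bytes in a MemoryStream. The sketch below assumes the encoded element is the root of the document, so the loop simply runs until ReadBase64 returns 0; the class name, method names, and chunk size are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class Base64ChunkSketch
{
    // Produces <element>...base64...</element> for the given bytes.
    public static string Encode(byte[] data)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartElement("element");
        writer.WriteBase64(data, 0, data.Length);
        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }

    // Decodes the root element's content chunk by chunk.
    public static byte[] Decode(string xml, int chunkSize)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        MemoryStream decoded = new MemoryStream();
        reader.MoveToContent(); // position the reader on the element

        byte[] buffer = new byte[chunkSize];
        int n;
        // Each call returns the number of decoded bytes placed in the buffer
        while ((n = reader.ReadBase64(buffer, 0, buffer.Length)) > 0)
            decoded.Write(buffer, 0, n);

        reader.Close();
        return decoded.ToArray();
    }

    static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("Hello, XML writers!");
        byte[] roundTrip = Decode(Encode(original), 8);
        Console.WriteLine(Encoding.UTF8.GetString(roundTrip));
    }
}
```

The MemoryStream plays the role of the second buffer mentioned above: the read buffer stays small, and the accumulated data grows only as needed.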

Page 18: Xml writers

Encoding-derived classes also provide a method, GetString, to transform an array of bytes into a string, as shown here:

XmlTextReader reader = new XmlTextReader(filename);
while(reader.Read())
{
    if (reader.LocalName == "element")
    {
        byte[] bytes = new byte[1000];
        // n is the number of decoded bytes actually placed in the buffer
        int n = reader.ReadBase64(bytes, 0, 1000);
        // Convert only those n bytes; n counts bytes, not characters
        string buf = Encoding.Unicode.GetString(bytes, 0, n);

        // Output the decoded data
        Console.WriteLine(buf);
    }
}
reader.Close();

Page 20: Xml writers

Embedding Images in XML Documents

The structure of the sample XML document is extremely simple. It will consist of a

single <jpeg> node holding the BinHex data plus an attribute containing the original

name, as shown here:

writer.WriteStartDocument();
writer.WriteComment("Contains a BinHex JPEG image");
writer.WriteStartElement("jpeg");
writer.WriteAttributeString("FileName", filename);

// Get the size of the file
FileInfo fi = new FileInfo(jpegFileName);
int size = (int) fi.Length;

// Read the JPEG file
FileStream fs = new FileStream(jpegFileName, FileMode.Open);
BinaryReader f = new BinaryReader(fs);
byte[] img = f.ReadBytes(size);
f.Close();

// Write the JPEG data (img.Length is the number of bytes actually read)
writer.WriteBinHex(img, 0, img.Length);

// Close the document
writer.WriteEndElement();
writer.WriteEndDocument();

Page 23: Xml writers

public void WriteContent(DataTable dt)
{
    // Write data
    Writer.WriteStartElement("rs", "data", null);
    foreach(DataRow row in dt.Rows)
    {
        Writer.WriteStartElement("z", "row", null);
        foreach(DataColumn dc in dt.Columns)
            Writer.WriteAttributeString(dc.ColumnName,
                row[dc.ColumnName].ToString());
        Writer.WriteEndElement();
    }
    Writer.WriteEndElement();
}

Page 24: Xml writers

ADO Recordset objects do not support embedding multiple result sets in a single XML file. For this reason, you must either develop a new XML format or use separate files, one for each result set.

Page 25: Xml writers

Testing the XmlRecordsetWriter Class

For .NET Framework applications, using the XmlRecordsetWriter class is no big deal.

You simply instantiate the class and call its methods, as shown here:

void ButtonLoad_Click(object sender, System.EventArgs e)
{
    // Create and display the XML document
    CreateDocument("adors.xml");
    UpdateUI("adors.xml");
}

void CreateDocument(string filename)
{
    DataSet ds = LoadDataFromDatabase();
    XmlRecordsetWriter writer = new XmlRecordsetWriter(filename);
    writer.WriteRecordset(ds);
}

Page 26: Xml writers

A Read/Write XML Streaming Parser

XML readers and writers work in separate compartments and in an extremely

specialized way. Readers just read, and writers just write. There is no way to force

things to go differently, and in fact, the underlying streams are read-only or write-only

as required. Suppose that your application manages lengthy XML documents that

contain rather volatile data. Readers provide a powerful and effective way to read those contents.

Page 27: Xml writers

Designing a Writer on Top of a Reader

In the .NET Framework, the XML DOM classes make intensive use of streaming

readers and writers to build the in-memory tree and to flush it out to disk. Thus, readers

and writers are definitely the only XML primitives available in the .NET Framework.

Consequently, to build up a sort of lightweight XML DOM parser, we can only rely, once

more, on readers and writers.

Page 28: Xml writers

The inspiration for designing such a read/write streaming parser is database server

cursors. With database server cursors, you visit records one after the next and, if

needed, can apply changes on the fly. Database changes are immediately effective,

and actually the canvas on which your code operates is simply the database table. The

same model can be arranged to work with XML documents.

Page 29: Xml writers

You will use a normal XML (validating) reader to visit the nodes in sequence. While

reading, however, you are given the opportunity to change attribute values and node

contents. Unlike the XML DOM, changes will have immediate effect. How can you

obtain these results? The idea is to use an XML writer on top of the reader.
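The idea of pairing a writer with a reader can be sketched in a few lines: every node the reader visits is passed to the writer, and an attribute value is swapped on the way through. This is a simplified illustration rather than the component discussed in these slides. The class and method names are hypothetical, only elements, attributes, and text are handled, and whitespace, comments, and namespace declarations are ignored.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class ReadWriteSketch
{
    // Copies the document node by node, replacing one attribute value in flight.
    public static string ReplaceAttributeValue(string xml, string attrName,
                                               string oldValue, string newValue)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    // Remember this before moving onto the attributes
                    bool isEmpty = reader.IsEmptyElement;
                    writer.WriteStartElement(reader.LocalName);
                    while (reader.MoveToNextAttribute())
                    {
                        string value = reader.Value;
                        if (reader.LocalName == attrName && value == oldValue)
                            value = newValue; // the change applied on the fly
                        writer.WriteAttributeString(reader.LocalName, value);
                    }
                    if (isEmpty)
                        writer.WriteEndElement();
                    break;
                case XmlNodeType.Text:
                    writer.WriteString(reader.Value);
                    break;
                case XmlNodeType.EndElement:
                    writer.WriteFullEndElement();
                    break;
            }
        }

        writer.Close();
        reader.Close();
        return output.ToString();
    }

    static void Main()
    {
        string xml = "<employees><row Title=\"Sales Representative\" /></employees>";
        Console.WriteLine(ReplaceAttributeValue(xml, "Title",
            "Sales Representative", "Sales Force"));
    }
}
```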

Page 30: Xml writers

Built-In Support for Read/Write Operations

When I first began thinking about this lightweight XML DOM component, one of the key points I identified was an efficient way to copy (in bulk) blocks of nodes from the read-only stream to the write stream. Luckily enough, two somewhat underappreciated XmlTextWriter methods just happen to cover this tricky but boring aspect of two-way streaming: WriteAttributes and WriteNode.

Page 31: Xml writers

The WriteAttributes method reads all the attributes available on the currently selected

node in the specified reader. It then copies them as a single string to the current output

stream. Likewise, the WriteNode method does the same for any other type of node.

Note that WriteNode does nothing if the node type is XmlNodeType.Attribute.
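A short sketch of the two methods working together: the root element is re-created under a different name, its attributes are copied with WriteAttributes, and each child is copied in bulk with WriteNode. The class name, method name, and sample element names are hypothetical, and the input is assumed to use no namespaces.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, introduced only for this illustration.
static class CopySketch
{
    // Copies a document, renaming its root element but keeping everything else.
    public static string CopyWithNewRoot(string xml, string newRootName)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        reader.MoveToContent();                 // position on the root element
        bool isEmpty = reader.IsEmptyElement;
        writer.WriteStartElement(newRootName);
        writer.WriteAttributes(reader, false);  // copies all attributes of the root

        if (!isEmpty)
        {
            reader.Read();                      // move to the first child
            // WriteNode copies the current node with its subtree and then
            // leaves the reader on the node that follows it
            while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
                writer.WriteNode(reader, false);
        }

        writer.WriteEndElement();
        writer.Close();
        reader.Close();
        return output.ToString();
    }

    static void Main()
    {
        string xml = "<root id=\"1\" lang=\"en\"><a>x</a><b /></root>";
        Console.WriteLine(CopyWithNewRoot(xml, "copy"));
    }
}
```

Because WriteNode advances the reader itself, the loop does not call Read between children; calling both would skip nodes.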

Page 32: Xml writers

The following code shows how to use these methods to create a copy of the original

XML file, modified to skip some nodes. The XML tree is visited in the usual node-first

approach using an XML reader. Each node is then processed and written out to the

associated XML writer according to the index. This code scans a document and writes

out every other node.

Page 33: Xml writers

XmlTextReader reader = new XmlTextReader(inputFile);
// XmlTextWriter has no single-string constructor; null selects the default encoding
XmlTextWriter writer = new XmlTextWriter(outputFile, null);

// Configure reader and writer
writer.Formatting = Formatting.Indented;
reader.MoveToContent();

// Write the root
writer.WriteStartElement(reader.LocalName);

// Read and output every other node
int i=0;
while(reader.Read())
{
    // C# requires a Boolean condition, so compare the remainder explicitly
    if (i % 2 != 0)
        writer.WriteNode(reader, false);
    i++;
}

// Close the root
writer.WriteEndElement();

// Close reader and writer
writer.Close();
reader.Close();

Page 35: Xml writers

The CSV Reader/Writer in Action

Let's take a sample CSV file, read it, and apply some changes to the contents so that

they will automatically be persisted when the reader is closed. Here is the source CSV

file:

LastName,FirstName,Title,Country

Davolio,Nancy,Sales Representative,USA

Fuller,Andrew,Sales Manager,USA

Leverling,Janet,Sales Representative,UK

Suyama,Michael,Sales Representative,UK

Page 36: Xml writers

// Instantiate the reader on a CSV file
XmlCsvReadWriter reader;
reader = new XmlCsvReadWriter("employees.csv", hasHeader.Checked);
reader.EnableOutput = true;
reader.Read();

// Define the schema of the table to bind to the grid
DataTable dt = new DataTable();
for(int i=0; i<reader.AttributeCount; i++)
{
    reader.MoveToAttribute(i);
    DataColumn col = new DataColumn(reader.Name, typeof(string));
    dt.Columns.Add(col);
}
reader.MoveToElement();

// Loop through the CSV rows and populate the DataTable
do
{
    DataRow row = dt.NewRow();
    for(int i=0; i<reader.AttributeCount; i++)
    {
        if (reader[i] == "Sales Representative")
            reader[i] = "Sales Force";
        row[i] = reader[i].ToString();
    }
    dt.Rows.Add(row);
}
while (reader.Read());

// Flushes the changes to disk
reader.Close();

// Bind the table to the grid
dataGrid1.DataSource = dt;

Page 41: Xml writers

Readers and writers are at the foundation of every I/O operation in the .NET

Framework. You find them at work when you operate on disk and on network files,

when you serialize and deserialize, while you perform data access, even when you

read and write configuration settings.

Page 42: Xml writers

XML writers are ad hoc tools for creating XML documents using a higher-level metaphor and putting more abstraction between your code and the markup. By using XML writers, you go far beyond markup to reach a node-oriented dimension in which, instead of just accumulating bytes in a block of contiguous memory, you assemble nodes and entities to create the desired schema and infoset.

Page 43: Xml writers

.NET XML writers only ensure the well-formedness of each individual XML element

being generated. Writers can in no way guarantee the well-formedness of the entire

document and can do even less to validate a document against a DTD or a schema.

Although badly formed XML documents can only result from actual gross programming

errors, the need for an extra step of validation is often felt in production environments,

especially when the creation of the document depends on a number of variable factors

and run-time conditions. For this reason, we've also examined the key points involved

in the design and implementation of a validating XML writer.