XML Parsers
Overview
– Types of parsers
– Using XML parsers
– SAX
– DOM
– DOM versus SAX
– Products
– Conclusion
Types of Parsers
There are several different ways to categorise parsers:
– Validating versus non-validating parsers
– Parsers that support the Document Object Model (DOM)
– Parsers that support the Simple API for XML (SAX)
– Parsers written in a particular language (Java, C++, Perl, etc.)
Non-validating Parsers
• Speed and efficiency: it takes a significant amount of effort for an XML parser to process a DTD and make sure that every element in an XML document follows the rules of the DTD.
• If you only want to find tags and extract information, use a non-validating parser.
Using XML Parsers
• Three basic steps to use an XML parser:
– Create a parser object
– Pass your XML document to the parser
– Process the results
• Generally, writing out XML is outside the scope of parsers (though some may implement proprietary mechanisms)
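The three steps can be sketched in a few lines. This is only an illustration, assuming Python's standard `xml.etree.ElementTree` module as the parser; the `SAMPLE` document and the `parse_titles` helper are hypothetical names that do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this sketch.
SAMPLE = "<catalog><book id='1'>XML Basics</book><book id='2'>Parsing</book></catalog>"


def parse_titles(xml_text):
    """Return the text of every <book> element in xml_text."""
    parser = ET.XMLParser()      # step 1: create a parser object
    parser.feed(xml_text)        # step 2: pass the XML document to the parser
    root = parser.close()        # finish parsing; returns the root element
    # step 3: process the results
    return [book.text for book in root.iter("book")]


titles = parse_titles(SAMPLE)
print(titles)
```

Other parsers name these calls differently, but the create / feed / process sequence is usually recognisable.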
Parsing XML
Two established APIs:
– SAX (Simple API for XML)
• Defines handlers whose methods are called as the XML is parsed
– DOM (Document Object Model)
• Defines a logical tree representing the parsed XML
Parsing XML: DOM
• Document Object Model
• standard API for accessing and creating XML data
• tree-based
• programming language independent
• developed by W3C
• whole document is read into memory
• read and write
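The "read and write" point can be illustrated with a small sketch. It assumes Python's standard `xml.dom.minidom` as the DOM implementation; the sample document is hypothetical.

```python
from xml.dom import minidom

# Hypothetical sample document; the whole tree is held in memory.
doc = minidom.parseString("<catalog><book>XML Basics</book></catalog>")

# Write: build a new <book> element with a text child and attach it to the root.
new_book = doc.createElement("book")
new_book.appendChild(doc.createTextNode("Parsing"))
doc.documentElement.appendChild(new_book)

# Read: serialise the modified tree (root element only, no XML declaration).
serialized = doc.documentElement.toxml()
print(serialized)
```

Because the tree lives in memory, the change is visible to any later traversal of `doc`.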
Creating a DOM Tree
• A DOM implementation provides a factory object: you pass it an XML file and it returns a Document object that represents the whole document and gives access to its root element
• After this, you may use the standard DOM interfaces to interact with the XML structure
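A minimal sketch of this step, assuming Python's `xml.dom.minidom`, where the module-level `parseString` function plays the role of the factory. The sample document is hypothetical.

```python
from xml.dom import minidom, Node

# Hypothetical sample document for this sketch.
SAMPLE = "<catalog><book id='1'>XML Basics</book></catalog>"

# The factory call returns a Document object for the whole document.
doc = minidom.parseString(SAMPLE)
is_document = doc.nodeType == Node.DOCUMENT_NODE

# The root element is reached through the Document.
root = doc.documentElement
root_name = root.tagName

# From here on, only standard DOM calls are used.
first_book = root.getElementsByTagName("book")[0]
book_id = first_book.getAttribute("id")
print(root_name, book_id)
```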
Parsing XML: DOM
[Figure: an XML file and the corresponding DOM tree]
DOM Interfaces
• The DOM defines several interfaces
– Node: the base data type of the DOM
– Element: represents an element
– Attr: represents an attribute of an element
– Text: the content of an element or attribute
– Document: represents the entire XML document.
A Document object is often referred to as a DOM tree
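To see these interfaces side by side, the sketch below inspects one node of each kind. It assumes Python's `xml.dom.minidom`; the one-line sample document is hypothetical.

```python
from xml.dom import minidom, Node

# Hypothetical one-element document with an attribute and text content.
doc = minidom.parseString('<note lang="en">hello</note>')

element = doc.documentElement            # Element
attr = element.getAttributeNode("lang")  # Attr
text = element.firstChild                # Text

# Every object above is a Node; nodeType tells the kinds apart.
node_kinds = {
    "document": doc.nodeType == Node.DOCUMENT_NODE,
    "element": element.nodeType == Node.ELEMENT_NODE,
    "attr": attr.nodeType == Node.ATTRIBUTE_NODE,
    "text": text.nodeType == Node.TEXT_NODE,
}
print(node_kinds, attr.value, text.data)
```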
DOM Levels
• DOM Level 1: basic functionality for document navigation and manipulation.
• DOM Level 2: includes a style sheet object model, defines an event model and provides support for XML namespaces.
• DOM Level 3: still under development at the time of writing; addresses document loading and saving, and a content model (DTDs and schemas) with document validation support.
Parsing XML: SAX
• Simple API for XML
• API for accessing XML data
• event based
• programming language independent
• the application itself has to store any fragments it needs in memory
• read only
Parsing XML: SAX
• SAX is an interface to the XML parser based on streaming and call-backs
• You extend the HandlerBase class (SAX 1; SAX 2 replaces it with DefaultHandler) and override its call-back methods:
• startDocument, endDocument
• startElement, endElement
• characters
• warning, error, fatalError
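The call-back style can be sketched as follows. The slides refer to the Java API; this sketch instead assumes Python's standard `xml.sax` module, whose `ContentHandler` offers call-backs with the same names. The `EventLogger` class and the sample input are hypothetical.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records each call-back in the order the parser fires it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement " + name)

    def endElement(self, name):
        self.events.append("endElement " + name)

    def characters(self, content):
        # Ignore whitespace-only chunks to keep the log short.
        if content.strip():
            self.events.append("characters " + content.strip())


handler = EventLogger()
xml.sax.parseString(b"<a><b>hi</b></a>", handler)
print(handler.events)
```

In `xml.sax` the `warning`, `error` and `fatalError` call-backs belong to a separate `ErrorHandler` class rather than to the content handler.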
Parsing XML: SAX
[Figure: an XML file and the corresponding sequence of SAX calls]
SAX versus DOM
DOM:
• read and write
• need to move back and forth in data
• document is human created
SAX:
• read only
• huge data or streams
• data is machine generated
DOM pro and contra
PRO
• The file is parsed only once.
• Strong navigation abilities: this is the aim of the DOM design.
CONTRA
• More memory needed since the XML tree is in memory.
SAX pro and contra
PRO
• Low memory needs since the XML file is never entirely in memory
• Can deal with XML streams
CONTRA
• The file has to be parsed entirely to access any node. Fetching the 10 nodes of a catalog one at a time therefore means parsing the same file 10 times, unless the application stores them during a single pass.
• Poor navigation abilities: there is no easy way to get the children of a given node or the list of all "B" nodes
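The navigation limitation means the application has to keep its own state. The sketch below gathers the text of every "B" node in one pass, assuming Python's `xml.sax`; the `BCollector` class and the sample input are hypothetical.

```python
import xml.sax


class BCollector(xml.sax.ContentHandler):
    """Collects the text content of every <B> element in a single pass."""

    def __init__(self):
        super().__init__()
        self.inside_b = False   # state SAX does not keep for us
        self.buffer = []        # text chunks of the current <B>
        self.values = []        # finished <B> contents

    def startElement(self, name, attrs):
        if name == "B":
            self.inside_b = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.inside_b:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "B":
            self.values.append("".join(self.buffer))
            self.inside_b = False


collector = BCollector()
xml.sax.parseString(b"<A><B>one</B><C>skip</C><B>two</B></A>", collector)
b_values = collector.values
print(b_values)
```

With DOM the same list is a single `getElementsByTagName("B")` call, at the cost of holding the whole tree in memory.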
SAX versus DOM
• If your document is very large and you only need a few elements, use SAX
• If you need to process many elements and perform operations on the XML, use DOM
• If you need to access the XML many times, use DOM
Parser Products
• Xerces4J / Xerces4C++ (Apache)
• James Clark’s XP (Java)
• IBM XML4J / XML4C++
• Java Project X (Sun)
• Oracle’s XML Parser for Java
• MSXML (Microsoft)
• Dan Connolly’s XML Parser (Python)
• …
Conclusion
• The parser is a key building block for every XML application.
• When building XML applications, you have to think about how you will handle large chunks of data
• Choosing between SAX and DOM is not always trivial
The End
Questions?
Thank you!