Xml processing in scala

Basic XML Processing In Scala Neelkanth Sachdeva Consultant / Software Engineer Knoldus Software LLP , New Delhi neelkanthsachdeva.wordpress.com [email protected]


Page 1: Xml processing in scala

Basic XML Processing In Scala

Neelkanth Sachdeva

Consultant / Software Engineer

Knoldus Software LLP , New Delhi

neelkanthsachdeva.wordpress.com

[email protected]

Page 2: Xml processing in scala

What is XML?

→ XML is a form of semi-structured data.

→ It is more structured than plain strings, because it organizes the contents of the data into a tree.

→ There are many forms of semi-structured data,

but XML is the most widely used.

Page 3: Xml processing in scala

XML overview

→ XML is built out of two basic elements:

1. Text

2. Tags

Text: As usual, any sequence of characters.

Tags: Consist of a less-than sign, an alphanumeric label, and a greater-than sign.

Page 4: Xml processing in scala

Writing XML Tags

● There is a shorthand notation for a start tag followed immediately by its matching end tag.

● Simply write one tag with a slash put after the tag’s label. Such a tag comprises an empty element.

e.g. <pod>Three <peas/> in the </pod>

● Start tags can have attributes attached to them.

e.g. <pod peas="3" strings="true"/>

Page 5: Xml processing in scala

XML literals

Scala lets you type in XML as a literal anywhere that an

expression is valid. Simply type a start tag and then continue

writing XML content. The compiler will go into an XML-input mode

and will read content as XML until it sees the end tag matching

the start tag you began with.
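
As a minimal sketch of the idea, the pod example from the earlier slide can be written directly as a Scala expression. The object name XmlLiteralSketch is hypothetical, and the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
import scala.xml.Elem

object XmlLiteralSketch {
  // An XML literal is an ordinary expression; its static type is scala.xml.Elem.
  val pod: Elem = <pod>Three <peas/> in the </pod>

  def main(args: Array[String]): Unit = {
    println(pod.label) // the tag name of the root element
    println(pod)       // the element serialized back to XML
  }
}
```

The literal ends at the closing </pod> tag, after which the compiler returns to reading ordinary Scala code.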

Page 6: Xml processing in scala
Page 7: Xml processing in scala

Important XML Classes

Class Node is the abstract superclass of all

XML node classes.

Class Text is a node holding just text. For

example, the “Here” part of

<a>Here</a> is of class Text.

Class NodeSeq holds a sequence of nodes.
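
The relationship between these classes can be checked with a short sketch. The object name XmlClassesSketch and its value names are hypothetical; the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
import scala.xml.{Elem, Node, NodeSeq, Text}

object XmlClassesSketch {
  val a: Elem = <a>Here</a>

  // Elem is a subclass of Node, so an element can be used wherever a Node is expected.
  val asNode: Node = a

  // The only child of <a>Here</a> is the text node holding "Here".
  val firstChild: Node = a.child.head
  val isText: Boolean = firstChild.isInstanceOf[Text]

  // Node extends NodeSeq: a single node is a sequence of length one.
  val asSeq: NodeSeq = a

  def main(args: Array[String]): Unit = {
    println(isText)       // prints true
    println(asSeq.length) // prints 1
  }
}
```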

Page 8: Xml processing in scala

Evaluating Scala Code

Page 9: Xml processing in scala

Example of XML

Page 10: Xml processing in scala

Taking XML apart

Extracting text:

By calling the text method on

any XML node you retrieve all of the text within

that node, minus any element tags.
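
A small sketch of the text method follows. The element and the object name XmlTextSketch are made up for illustration, and the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
object XmlTextSketch {
  val paragraph = <p>Hello <b>XML</b> world</p>

  // text concatenates the text of all descendants and drops the tags.
  val plain: String = paragraph.text

  def main(args: Array[String]): Unit = {
    println(plain) // prints Hello XML world
  }
}
```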

Page 11: Xml processing in scala
Page 12: Xml processing in scala

Extracting sub-elements:

If you want to find a sub-element by tag name,

simply call \ with the name of the tag:

You can do a “deep search” and look through

sub-sub-elements, etc., by using \\ instead of

the \ operator.
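
The difference between the two operators can be sketched on a small nested document. The document and the object name XmlSearchSketch are illustrative only; the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
import scala.xml.NodeSeq

object XmlSearchSketch {
  val doc = <a><b><c>hello</c></b></a>

  // \ looks only at the direct children of the node.
  val direct: NodeSeq = doc \ "b"  // <b><c>hello</c></b>
  val missed: NodeSeq = doc \ "c"  // empty: <c> is not a direct child of <a>

  // \\ searches the node and all of its descendants.
  val deep: NodeSeq = doc \\ "c"   // <c>hello</c>

  def main(args: Array[String]): Unit = {
    println(direct)
    println(missed.isEmpty) // prints true
    println(deep)
  }
}
```

Both operators return a NodeSeq, so an unsuccessful search yields an empty sequence rather than an error.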

Page 13: Xml processing in scala
Page 14: Xml processing in scala
Page 15: Xml processing in scala

Extracting attributes:

You can extract tag attributes using the same \

and \\ methods. Simply put an at sign (@) before

the attribute name:
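
A possible sketch of attribute extraction is shown here. The employee data and the object name XmlAttributeSketch are invented for illustration; the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
object XmlAttributeSketch {
  val joe = <employee name="Joe" rank="code monkey" serial="123"/>

  // \ "@name" selects the value of the name attribute on this element.
  val name: String = (joe \ "@name").text

  // An attribute that is not present yields an empty NodeSeq.
  val hasSalary: Boolean = (joe \ "@salary").nonEmpty

  val staff = <staff><employee serial="1"/><employee serial="2"/></staff>

  // \\ "@serial" collects the attribute from the element and all of its descendants.
  val serials: Seq[String] = (staff \\ "@serial").map(_.text)

  def main(args: Array[String]): Unit = {
    println(name)      // prints Joe
    println(hasSalary) // prints false
    println(serials.mkString(","))
  }
}
```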

Page 16: Xml processing in scala
Page 17: Xml processing in scala
Page 18: Xml processing in scala

Runtime Representation

XML data is represented as labeled trees.

You can conveniently create such labeled nodes

using standard XML syntax.

Consider the following XML document:

Page 19: Xml processing in scala

<html>
  <head>
    <title>Hello XHTML world</title>
  </head>
  <body>
    <h1>Hello world</h1>
    <p><a href="http://scala-lang.org/">Scala</a> talks XHTML</p>
  </body>
</html>

This document can be created by the following Scala program:

Page 20: Xml processing in scala

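// App is used here because the Application trait was deprecated and later removed.
object XMLTest1 extends App {
  // The XML literal is an ordinary expression of type scala.xml.Elem.
  val page =
    <html>
      <head>
        <title>Hello XHTML world</title>
      </head>
      <body>
        <h1>Hello world</h1>
        <p><a href="http://scala-lang.org/">Scala</a> talks XHTML</p>
      </body>
    </html>
  println(page.toString())
}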

Page 21: Xml processing in scala

It is possible to mix Scala expressions and XML:

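// App is used here because the Application trait was deprecated and later removed.
object XMLTest2 extends App {
  import scala.xml._
  val df = java.text.DateFormat.getDateInstance()
  val dateString = df.format(new java.util.Date())
  // Braces escape from XML back into Scala, both in attribute values and in element content.
  def theDate(name: String) =
    <dateMsg addressedTo={ name }>
      Hello, { name }! Today is { dateString }
    </dateMsg>
  println(theDate("Neelkanth Sachdeva").toString)
}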

Page 22: Xml processing in scala

Pattern matching on XML

Sometimes the data contains multiple kinds of

records. In such cases, pattern matching on XML

lets you handle each kind of record in its own

case.

Page 23: Xml processing in scala

object XMLTest3 {

  def search(node: scala.xml.Node): String = node match {
    // { contents } binds the single child node of the matched element.
    case <a>{ contents }</a> => "It's an a Category Item & The Item Is : " + contents
    case <b>{ contents }</b> => "It's a b Category Item & The Item Is : " + contents
    case _ => "It's something else."
  }

  def main(args: Array[String]): Unit = {
    println(search(<a>Apple</a>))
    println(search(<b>Mango</b>))
  }
}
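
One detail of the search method above is that a pattern such as { contents } matches an element with exactly one child node. As a hedged sketch, the sequence pattern _* can be used when an element may hold any number of children. The helper describe and the object name XmlSeqPatternSketch are hypothetical and not part of the original slides; the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
import scala.xml.Node

object XmlSeqPatternSketch {
  // Hypothetical helper: reports how many child nodes an <a> element holds.
  def describe(node: Node): String = node match {
    // contents @ _* binds the whole sequence of child nodes, whatever its length.
    case <a>{ contents @ _* }</a> => "a with " + contents.length + " child node(s)"
    case _ => "something else"
  }

  val single: String = describe(<a>Apple</a>)
  val mixed: String = describe(<a>Red <em>Apple</em></a>)
  val other: String = describe(<b>Mango</b>)

  def main(args: Array[String]): Unit = {
    println(single) // prints a with 1 child node(s)
    println(mixed)  // prints a with 2 child node(s)
    println(other)  // prints something else
  }
}
```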

Page 24: Xml processing in scala

Cheers