DIG 3134 – Internet Software Design

DIG 3134 – Internet Software Design. Lecture 17 - XML: eXtensible Markup Language. J. Michael Moshell, University of Central Florida. Original image* by Moshell et al. Imagery is from Wikimedia except where marked with *. Licensing is listed.


Page 1: DIG 3134 – Internet Software Design

1

DIG 3134 – Internet Software Design

Lecture 17 - XML: eXtensible Markup Language

J. Michael Moshell

University of Central Florida

Original image* by Moshell et al.

Imagery is from Wikimedia except where marked with *. Licensing is listed.

Page 2: DIG 3134 – Internet Software Design

-2 -

Purposes of XML:

• Make data more easily used

• Make data last longer (across generations of technology)

Strategy of XML:

• Provide a basis for creating 'dialects' for special purposes

  - Thus, XML is a meta-language

• Provide tools you can use, rather than re-invent

Structure of XML:

• Inject <tags> into text files

Page 3: DIG 3134 – Internet Software Design

-3 -

XML Syntax:

Declaration:

<?xml version="1.0" encoding="UTF-8"?>

Nested elements:

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>
  <transcript>
    <course semester="Fall 06">DIG 4921c</course>
    <course semester="Fall 06">DIG 4526</course>
  </transcript>
</student>

Page 4: DIG 3134 – Internet Software Design

-4 -

XML Syntax: content

The same <student> document as on the previous slide, with a callout labeled "content": the text between a start tag and its end tag, such as Wilson in <last-name>Wilson</last-name>.

Page 5: DIG 3134 – Internet Software Design

-5 -

XML Syntax: attribute

The same <student> document again, with a callout labeled "attribute": semester="Fall 06" inside the start tag <course semester="Fall 06">.

Page 6: DIG 3134 – Internet Software Design

-6 -

XML Syntax: attribute name and value

The same <student> document again, with callouts labeled "name" and "value": in semester="Fall 06", semester is the attribute name and "Fall 06" is its value.


Page 8: DIG 3134 – Internet Software Design

-8 -

Real World Example:

E-commerce (Euro processing) in a PHP application

function sendResponse($status, $statusmessage, $neworderid, $batchid)
{
    echo '<?xml version="1.0" encoding="utf-8"?>';
    echo "<responsemessage>";
    echo "<status>".$status."</status>";
    echo "<statusmessage>".$statusmessage."</statusmessage>";
    echo "<neworderid>".$neworderid."</neworderid>";
    echo "<batchid>".$batchid."</batchid>";
    echo "</responsemessage>";
}
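One thing the function above does not handle: if a value such as $statusmessage contains & or <, the echoed text is no longer well-formed XML. Here is a minimal sketch of the same response with escaping added. The helper names xmlElement() and buildResponse() and the sample values are made up for illustration; they are not part of the original application.

```php
<?php
// Hypothetical helper: wrap one value in an element, escaping XML special characters.
function xmlElement(string $name, string $value): string
{
    return "<$name>" . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</$name>";
}

// Same fields as sendResponse() on the slide, but returned as a string
// so the caller decides when to echo it.
function buildResponse(string $status, string $statusmessage, string $neworderid, string $batchid): string
{
    return '<?xml version="1.0" encoding="utf-8"?>'
        . '<responsemessage>'
        . xmlElement('status', $status)
        . xmlElement('statusmessage', $statusmessage)
        . xmlElement('neworderid', $neworderid)
        . xmlElement('batchid', $batchid)
        . '</responsemessage>';
}

// Sample values (invented) that would break the unescaped version.
$response = buildResponse('OK', 'Paid & shipped <today>', '1001', '7');
echo $response, "\n";
```

Returning a string instead of echoing also makes the function easier to check with print_r or a parser before it is sent.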

Page 9: DIG 3134 – Internet Software Design

-9 -

This raises a question:

Nested elements:

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>
  <transcript>
    <course semester="Fall 06">DIG 4921c</course>
    <course semester="Fall 06">DIG 4526</course>
  </transcript>
</student>

How does one represent the 'grammar' of an element? E.g., a transcript will consist of zero or more courses.

Page 10: DIG 3134 – Internet Software Design

-10 -

This raises a question:

Nested elements:

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>
  <transcript>
    <course semester="Fall 06">DIG 4921c</course>
    <course semester="Fall 06">DIG 4526</course>
    <gradepoint>3.62</gradepoint>
  </transcript>
</student>

How does one represent the 'grammar' of an element? E.g., a transcript will consist of zero or more courses.

This will be done via a SCHEMA.
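The lecture does not say which schema language it has in mind. As one possible illustration, the sketch below uses a DTD (the oldest schema mechanism, built into XML itself) to state "a transcript consists of zero or more courses" and checks documents against it with PHP's DOMDocument. The function name validatesAgainstInternalDtd() and the cut-down <transcript> documents are assumptions made for this example.

```php
<?php
// Returns true when the document is valid against the DTD in its own DOCTYPE.
function validatesAgainstInternalDtd(string $xml): bool
{
    libxml_use_internal_errors(true);   // collect errors instead of printing warnings
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return false;                   // not even well-formed
    }
    $ok = $doc->validate();             // checks the document against its DTD
    libxml_clear_errors();
    return $ok;
}

// The grammar: a transcript is zero or more courses; a course holds text
// and must carry a semester attribute.
$dtd = <<<DTD
<!DOCTYPE transcript [
  <!ELEMENT transcript (course*)>
  <!ELEMENT course (#PCDATA)>
  <!ATTLIST course semester CDATA #REQUIRED>
]>
DTD;

$good  = $dtd . '<transcript><course semester="Fall 06">DIG 4921c</course></transcript>';
$empty = $dtd . '<transcript></transcript>';
$bad   = $dtd . '<transcript><gradepoint>3.62</gradepoint></transcript>';

var_dump(validatesAgainstInternalDtd($good));   // one course: allowed
var_dump(validatesAgainstInternalDtd($empty));  // zero courses: allowed
var_dump(validatesAgainstInternalDtd($bad));    // <gradepoint> is not in the grammar
```

Under this particular grammar the <gradepoint> element from the slide would be rejected; a schema that wanted to allow it would have to say so.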

Page 11: DIG 3134 – Internet Software Design

-11 -

Two kinds of "grammaticality"

1. Well-formedness (standard XML)

2. Validity (based on a schema)

Well-formed:

• One ROOT ELEMENT per document, e.g. <student> ... </student>

• All non-empty elements are delimited with start & end tags.

• Empty elements are delimited properly

  - intentionally empty placemarkers: <thisway />

  - temporarily empty placemarkers: <likethis></likethis>

• All attribute values are quoted.

• Tags do not overlap.

• Document complies with its character set definition.
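The well-formedness rules above can be tried out with a small PHP check. This is a rough sketch, assuming that "the parser accepts it" is a good enough test of well-formedness; the function name isWellFormed() and the sample strings are invented for illustration.

```php
<?php
// True when the XML parser accepts the string, i.e. it is well-formed.
function isWellFormed(string $xml): bool
{
    $previous = libxml_use_internal_errors(true);  // keep parse warnings off the page
    $result = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $result !== false;                      // false means the parse failed
}

$samples = [
    'one root'           => '<student><major>Digital Media</major></student>',
    'empty element'      => '<student><thisway /></student>',
    'overlapping tags'   => '<student><b><i>text</b></i></student>',
    'unquoted attribute' => '<course semester=Fall>DIG 4526</course>',
    'two roots'          => '<student></student><student></student>',
];

$results = [];
foreach ($samples as $label => $xml) {
    $results[$label] = isWellFormed($xml);
    print $label . ': ' . ($results[$label] ? 'well-formed' : 'NOT well-formed') . "\n";
}
```

Note the comparison with false: a SimpleXMLElement for an empty element can itself look "empty" to PHP, so testing the return value with a plain if would be misleading.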

Page 12: DIG 3134 – Internet Software Design

-12 -

Schemas represent a particular "language" subset of XML.

We're not going to explore Schemas in this lecture.

But we do have ANOTHER issue to mention:

NAMESPACES ... consider this piece of XML:

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>

Page 13: DIG 3134 – Internet Software Design

-13 -

Namespaces

What's a 'major'? What are the legal values? What is its relationship to a particular university? Does it have a relationship to any national or world standards?

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>


Page 15: DIG 3134 – Internet Software Design

-15 -

Namespaces

At the listed "URI" (similar to a URL) one would expect to find a description of what kinds of things can be put into the 'major' field. (XML software treats the URI purely as a unique name and does not fetch anything from it.)

This allows people to establish and share standards. (example.org is a domain reserved for use in examples; here it's a placeholder.)

<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major xmlns="http://example.org/academicmajor">Digital Media</major>

Page 16: DIG 3134 – Internet Software Design

-16 -

Namespaces: Another example

<root>

<h:table xmlns:h="http://www.w3.org/TR/html4/">

  <h:tr>

    <h:td>Apples</h:td>

    <h:td>Bananas</h:td>

  </h:tr>

</h:table>

<f:table xmlns:f="http://www.w3schools.com/furniture">

  <f:name>African Coffee Table</f:name>

  <f:width>80</f:width>

  <f:length>120</f:length>

</f:table>

</root>
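To connect the two-table example above with the PHP that follows, here is a sketch of how SimpleXML can tell the two kinds of table apart. It relies on the children() method, which takes a namespace URI; the variable names are illustrative only.

```php
<?php
$source = <<<XML
<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
XML;

$root = simplexml_load_string($source);

// Ask for the children of <root> that live in each namespace.
$html      = $root->children('http://www.w3.org/TR/html4/');
$furniture = $root->children('http://www.w3schools.com/furniture');

// The HTML-style table: collect the text of its cells.
$cells = [];
foreach ($html->table->tr->td as $td) {
    $cells[] = (string) $td;
}

// The furniture table: read its name.
$name = (string) $furniture->table->name;

// Without a namespace, plain ->table matches neither element.
$plainCount = count($root->table);

print implode(', ', $cells) . "\n";
print "$name\n";
print "tables found without a namespace: $plainCount\n";
```

The prefix letters h and f do not matter to the parser; what identifies each table is the URI the prefix is bound to.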

Page 17: DIG 3134 – Internet Software Design

-17 -

XML into PHP: Start simple...

<person>

<lastname>Wilson</lastname>

<firstname>Henry</firstname>

<address>122 Smith Road</address>

</person>

Then I ran this code to suck it into PHP and look at it:

$xml = simplexml_load_file($filename);
print "Raw xml:";
print_r($xml);
print "<br /><br />";

Page 18: DIG 3134 – Internet Software Design

-18 -

And the results:

<person>
  <lastname>Wilson</lastname>
  <firstname>Henry</firstname>
  <address>122 Smith Road</address>
</person>

The resulting object looked like this (via print_r):

Raw xml:SimpleXMLElement Object (
  [lastname] => Wilson
  [firstname] => Henry
  [address] => 122 Smith Road )

Page 19: DIG 3134 – Internet Software Design

-19 -

To access some piece (e.g. lastname):

<person>
  <lastname>Wilson</lastname>
  <firstname>Henry</firstname>
  <address>122 Smith Road</address>
</person>

$lastname = $xml->lastname; // $xml is an object
print "ln=$lastname <br />";
print "<br /><br />";

And the output was

ln=Wilson
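For anyone who wants to try this without a separate file, here is a self-contained sketch of the same access. It swaps simplexml_load_file() for simplexml_load_string() so the XML can sit in the script itself; the variable $fullname is added for illustration.

```php
<?php
$source = '<person>'
    . '<lastname>Wilson</lastname>'
    . '<firstname>Henry</firstname>'
    . '<address>122 Smith Road</address>'
    . '</person>';

$xml = simplexml_load_string($source);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

// Each child element is reached as a property of the object.
$lastname = (string) $xml->lastname;                // cast gives a plain PHP string
$fullname = $xml->firstname . ' ' . $xml->lastname; // concatenation also converts to text

print "ln=$lastname\n";
print "name=$fullname\n";
```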

Page 20: DIG 3134 – Internet Software Design

-20 -

A more complex record:

<student>

<person>

<lastname>Wilson</lastname>

<firstname>Henry</firstname>

<address>122 Smith Road</address>

</person>

<transcript>

<course level='undergrad'>

<title>DIG 3134</title>

<semester>Fall 2011</semester>

<grade>A</grade>

</course>

<course>

<title>DIG 3353</title>

<semester>Spring 2011</semester>

<grade>C</grade>

</course>

</transcript>

</student>

see xample.php (as .txt)
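A sketch of how the nested record above might be walked in PHP: a repeated element such as <course> can be looped over with foreach, and an attribute such as level can be read with array-style brackets. The 'unspecified' fallback and the output format are choices made for this example, not something from the lecture.

```php
<?php
$source = <<<XML
<student>
  <person>
    <lastname>Wilson</lastname>
    <firstname>Henry</firstname>
    <address>122 Smith Road</address>
  </person>
  <transcript>
    <course level='undergrad'>
      <title>DIG 3134</title>
      <semester>Fall 2011</semester>
      <grade>A</grade>
    </course>
    <course>
      <title>DIG 3353</title>
      <semester>Spring 2011</semester>
      <grade>C</grade>
    </course>
  </transcript>
</student>
XML;

$student = simplexml_load_string($source);

$rows = [];
foreach ($student->transcript->course as $course) {
    // The second <course> has no level attribute, so supply a fallback.
    $level = isset($course['level']) ? (string) $course['level'] : 'unspecified';
    $rows[] = sprintf('%s (%s, %s): %s', $course->title, $course->semester, $level, $course->grade);
}

print implode("\n", $rows) . "\n";
```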

Page 21: DIG 3134 – Internet Software Design

The Battleship XML

This example represents two ships for the 'black' team.

<ocean>

<ship number='1'>

<x>A</x> <y>1</y>

<orientation>horizontal</orientation> <type>black</type>

</ship>

<ship number='2'>

<x>H</x> <y>1</y>

<orientation>vertical</orientation> <type>black</type>

</ship>

</ocean>

Page 22: DIG 3134 – Internet Software Design

The Battleship XML: attribute-value pair

The same <ocean> document as on the previous slide, with a callout labeled "Attribute-Value pair": in <ship number='1'>, number is the attribute and '1' is its value.

Page 23: DIG 3134 – Internet Software Design

How do we read XML?

$xml=simplexml_load_file($shipfilename);

But what is now in the variable '$xml' ?

To find out, we use the print_r function.

print_r ($xml); -- what do we get?

Page 24: DIG 3134 – Internet Software Design

SimpleXMLElement Object ( [ship] => Array ( [0] =>

SimpleXMLElement Object ( [@attributes] => Array

( [number] => 1 ) [x] => A [y] => 1 [orientation] => horizontal [type]

=> black ) [1] => SimpleXMLElement Object

( [@attributes] => Array ( [number] => 2 ) [x] => H [y] => 1

[orientation] => vertical [type] => black ) ) )

A mess!

Page 25: DIG 3134 – Internet Software Design

Prettyprint it:

SimpleXMLElement Object (
  [ship] => Array (
    [0] => SimpleXMLElement Object (
      [@attributes] => Array ( [number] => 1 )
      [x] => A
      [y] => 1
      [orientation] => horizontal
      [type] => black )
    [1] => SimpleXMLElement Object (
      [@attributes] => Array ( [number] => 2 )
      [x] => H
      [y] => 1
      [orientation] => vertical
      [type] => black ) ) )

Page 26: DIG 3134 – Internet Software Design

Prettyprint it:

If you "View Source" on a print_r output, you will see the prettyprinted version.

This is easy with Firefox (Command-U).
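An alternative to "View Source", offered here as a suggestion rather than as the lecture's method: print_r can return its text instead of printing it (pass true as the second argument), and wrapping that text in <pre> makes the browser keep the line breaks. The helper name dumpForBrowser() is made up.

```php
<?php
// Hypothetical helper: print_r output that stays readable in a browser.
function dumpForBrowser($value): string
{
    // print_r(..., true) returns the text; htmlspecialchars keeps < and > visible.
    return '<pre>' . htmlspecialchars(print_r($value, true)) . '</pre>';
}

$xml = simplexml_load_string(
    "<ocean>"
    . "<ship number='1'><x>A</x><y>1</y></ship>"
    . "<ship number='2'><x>H</x><y>1</y></ship>"
    . "</ocean>"
);

$html = dumpForBrowser($xml);
echo $html, "\n";
```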

Page 27: DIG 3134 – Internet Software Design

Getting at the data

$ships=$xml->ship;

foreach ($ships as $ship)

{

$xc=$ship->x; // x-character (like 'A')

$xlo=(ord(strtoupper($xc))-64); // get it?

$ylo=$ship->y; // y-smallest number (like 1)

$orientation=$ship->orientation; // like 'vertical'

// more to come
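The "get it?" line deserves a worked example. ord() gives a character's code, and 'A' has code 65, so subtracting 64 turns A into 1, B into 2, and so on. A small sketch (the function name columnNumber() is invented here, and it writes 64 as ord('A') - 1 to make the reasoning visible):

```php
<?php
// Convert a column letter ('A', 'B', ...) to a 1-based column number.
function columnNumber(string $letter): int
{
    // ord('A') is 65, so this is the same as ord(strtoupper($letter)) - 64.
    return ord(strtoupper($letter)) - ord('A') + 1;
}

foreach (['A', 'h', 'J'] as $letter) {
    print "$letter -> " . columnNumber($letter) . "\n";
}
// prints: A -> 1, h -> 8, J -> 10 (one per line)
```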

Page 28: DIG 3134 – Internet Software Design

Checking the data

$xc=$ship->x; // x-character (like 'A')

$xlo=(ord(strtoupper($xc))-64); // get it?

$ylo=$ship->y; // y-smallest number (like 1)

$orientation=$ship->orientation; // like 'vertical'

logprint("xc=$xc,ylo=$ylo,or=$orientation",5);

output: xc=A, ylo=1, or=horizontal

Page 29: DIG 3134 – Internet Software Design

Ominous storm clouds

$xc=$ship->x; // x-character (like 'A')

$xlo=(ord(strtoupper($xc))-64); // get it?

$ylo=$ship->y; // y-smallest number (like 1)

$orientation=$ship->orientation; // like 'vertical'

logprint("xc=$xc,ylo=$ylo,or=$orientation",5);

output: xc=A, ylo=1, or=horizontal

everything looks normal and reasonable .. BUT ...

(cue the scary organ music….) TROUBLE ahead!

Page 30: DIG 3134 – Internet Software Design

Meanwhile … getting at the data

$shipattributes=$ship->attributes( ); // eh?

// The attributes of the <ship> element are

// returned by a special built-in method. The result

// can be indexed like an array. We saw:

[ship] => Array (

[0] => SimpleXMLElement Object (

[@attributes] => Array ( [number] => 1 )

// so, to 'peel' the info, we access an array element.

$shipnumber=$shipattributes['number'];
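A runnable sketch of the attribute access just described, using a cut-down version of the battleship file. It also shows a shortcut that SimpleXML allows, indexing the element itself with the attribute name, and casts the result to an integer. The variable names other than $shipattributes are illustrative.

```php
<?php
$xml = simplexml_load_string(
    "<ocean>"
    . "<ship number='1'><x>A</x><y>1</y></ship>"
    . "<ship number='2'><x>H</x><y>1</y></ship>"
    . "</ocean>"
);

$numbers = [];
foreach ($xml->ship as $ship) {
    // The way shown on the slide: ask for the attributes, then index them.
    $shipattributes = $ship->attributes();
    $viaMethod = (int) $shipattributes['number'];

    // Shortcut: index the element directly with the attribute name.
    $viaShortcut = (int) $ship['number'];

    $numbers[] = $viaMethod;
    print "ship $viaMethod (shortcut says $viaShortcut)\n";
}
```

Without the (int) cast, both expressions would give SimpleXMLElement objects rather than plain numbers.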

Page 31: DIG 3134 – Internet Software Design

Continuing the story …

$ships=$xml->ship;

foreach ($ships as $ship)

{ // … further down the foreach loop:

$type=$ship->type;

if ($type=='black') $fillcolor=BLACK;

else $fillcolor=GOLD;

if ($orientation=='horizontal')

{ // and we get ready to draw a ship into $Grid

Page 32: DIG 3134 – Internet Software Design

Then something WEIRD happened

for ($x=$xlo; $x<=$xlo+4; $x++)

{

$Grid[$x][$y]=$fillcolor;

And this is what happened:

Warning: Illegal offset type in /Applications/MAMP/htdocs/DIG3134/battleship/battleship12.php on line 428

What's that? An 'offset' is an index, like [$x] or [$y]

What is wrong with $x or $y?

So – I whip out my trusty print_r:

Page 33: DIG 3134 – Internet Software Design

Continuing the story …

print "xc is "; print_r($xc);

print "ylo is "; print_r($ylo);

xc is SimpleXMLElement Object ( [0] => A )

ylo is SimpleXMLElement Object ( [0] => 1 )

What?? When I printed xc, it just looked like A

So: the moral is, you get OBJECTS, all the way. Printing one inside a string converts it to its text, which is why xc just looked like A.

Page 34: DIG 3134 – Internet Software Design

PHP =

-- You always gotta watch it, and be ready to jump --

Page 35: DIG 3134 – Internet Software Design

Fixing the problem

$y=$ylo+0; // looks weird. Add zero? Why?

print "y is ";

print_r($y);

y is 1 // problem solved.

How did this work?

PHP automatically assigns a data type to new

variables like $y, based on the result of the expression.

$y = $ylo + 0;

The + operator needs numbers, so PHP converts the SimpleXMLElement $ylo to the number 1 first: (object + number) results in a number.
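Adding zero works, but an explicit cast says the same thing more directly. The sketch below is one way to write it, not the code from battleship12.php: it converts the values as soon as they are read, so that only plain strings and integers reach the $Grid indexes. The string 'black' stands in for the BLACK constant, whose definition is not shown in the lecture.

```php
<?php
$ship = simplexml_load_string(
    "<ship number='1'><x>A</x><y>1</y><orientation>horizontal</orientation></ship>"
);

// Convert at the point of reading, so no objects leak into later code.
$xc  = (string) $ship->x;          // 'A'
$ylo = (int) $ship->y;             // 1, a real integer
$xlo = ord(strtoupper($xc)) - 64;  // column number for the letter

$fillcolor = 'black';              // stand-in for the BLACK constant
$Grid = [];
for ($x = $xlo; $x <= $xlo + 4; $x++) {
    $Grid[$x][$ylo] = $fillcolor;  // both indexes are plain integers now
}

print "y is "; print_r($ylo); print "\n";
print "filled columns: " . implode(',', array_keys($Grid)) . "\n";
```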

Page 36: DIG 3134 – Internet Software Design

Creating XML from Objects

<?php // example.php -- here's one way to create a complex string.
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<movie>
  <title>PHP: Behind the Parser</title>
  <characters>
    <character>
      <name>Ms. Coder</name>
      <actor>Onlivia Actora</actor>
    </character>
  </characters>
  <plot> To save space, nothing here. </plot>
</movie>
XML;
?>

Page 37: DIG 3134 – Internet Software Design

Creating XML from Objects

<?php
include 'example.php';
$movie = new SimpleXMLElement($xmlstr);

$character = $movie->characters->addChild('character');
$character->addChild('name', 'Mr. Parser');
$character->addChild('actor', 'John Doe');

$rating = $movie->addChild('rating', 'PG');
$rating->addAttribute('type', 'mpaa');

$stringout = $movie->asXML(); // then write out text file.
?>
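The slide stops at "then write out text file". Here is one way that last step might look, as a sketch: it builds a smaller movie record, writes it with file_put_contents(), and reads it back to confirm the round trip. The temporary file and the names $path and $reloaded are assumptions made for this example.

```php
<?php
// Start from a small document instead of including example.php.
$movie = new SimpleXMLElement(
    '<movie><title>PHP: Behind the Parser</title><characters/></movie>'
);

$character = $movie->characters->addChild('character');
$character->addChild('name', 'Mr. Parser');
$character->addChild('actor', 'John Doe');

$rating = $movie->addChild('rating', 'PG');
$rating->addAttribute('type', 'mpaa');

$stringout = $movie->asXML();              // the whole document as one string

// Write the text file, then read it back in.
$path = tempnam(sys_get_temp_dir(), 'movie');
file_put_contents($path, $stringout);
$reloaded = simplexml_load_file($path);
unlink($path);                             // tidy up the temporary file

print $stringout . "\n";
print "reloaded actor: " . $reloaded->characters->character[0]->actor . "\n";
```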

Page 38: DIG 3134 – Internet Software Design

Take-away:

1. print_r is your friend, in times of confusion.

It can print (and "explain") any PHP variable.

2. simplexml is an easy-to-use tool,

but you gotta understand objects to use it.

3. attribute-value pairs are accessed via a special

method, not simply as object variables.

Page 39: DIG 3134 – Internet Software Design

Looking forward:

* Creating useful outputs (PDF, XLS)

* Reading XLS files directly

* Communicating with other websites (CURL)

* Advanced topics: recursion, JSON