LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Information School at the University of Washington

Page 1: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

XML Basics

Page 2: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

LIS 549 U/TU: Intro to Content Management * Fall 2004 * Bob Boiko * MSIM Associate Chair

A First Look at XML

1. You invent a set of names for the stuff you want to manage and put <> around them

– <MyInfo>

– <Title>

– <Author>

– <Body>

– <Para>

2. You figure out which ones go inside which others

<MyInfo>
  <Title>
  <Author>
  <Body>
    <Para>
  </Body>
</MyInfo>

3. You add additional information to the names

<MyInfo Date="">
  <Title>
  <Author eMail="">
  <Body Revision="">
    <Para Style="">
  </Body>
</MyInfo>

4. You fill in the blanks

<MyInfo Date="2004-03-03">
  <Title>My Title</Title>
  <Author eMail="[email protected]">Bob</Author>
  <Body Revision="2">
    <Para Style="ListHead">Some Head</Para>
    <Para Style="List">Point 1</Para>
    <Para Style="List">Point 2</Para>
  </Body>
</MyInfo>
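A minimal sketch of what the four steps buy you: once the blanks are filled in and every tag is closed, a standard parser can read the names back out. This uses Python's built-in xml.etree.ElementTree; the e-mail address is a hypothetical placeholder, not the one from the slide.

```python
import xml.etree.ElementTree as ET

# The step-4 document, with straight quotes and every element closed.
# The eMail value is a made-up placeholder for illustration only.
document = """<MyInfo Date="2004-03-03">
  <Title>My Title</Title>
  <Author eMail="author@example.com">Bob</Author>
  <Body Revision="2">
    <Para Style="ListHead">Some Head</Para>
    <Para Style="List">Point 1</Para>
    <Para Style="List">Point 2</Para>
  </Body>
</MyInfo>"""

root = ET.fromstring(document)

# Names (elements) hold the content; the additional information lives in attributes.
title = root.findtext("Title")
date = root.get("Date")
list_items = [p.text for p in root.findall("Body/Para") if p.get("Style") == "List"]

print(title, date, list_items)
```

The invented names (MyInfo, Title, Para) are the only vocabulary the code needs to know.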

Page 3: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Three Faces of XML

• Data transfer

• Resource identification and discovery (Dublin Core, RDF, etc.)

• Content modeling

Page 4: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

Why XML is Cool

• Has an X in its name!

• Is as strong as its “also ran” parent SGML

• More flexible than its superficial cousin HTML

• It’s accepted

Page 5: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

What is XML Good For?

You can treat an XML file like:

• A word processing file
– Type it, edit it, display it

• An HTML file
– Tag it, display it with a style sheet

• A database
– Open it, search it, add, update, delete
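As a rough illustration of the "database" view, the sketch below searches, adds, updates, and deletes records in a small XML document using Python's built-in xml.etree.ElementTree. The Library and Book names and their values are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: a tiny "table" of Book records keyed by an id attribute.
root = ET.fromstring(
    "<Library>"
    "<Book id='b1'><Title>First</Title></Book>"
    "<Book id='b2'><Title>Second</Title></Book>"
    "</Library>"
)

# Search: find one record by its key.
found = root.find("Book[@id='b2']")

# Add: append a new record.
new_book = ET.SubElement(root, "Book", {"id": "b3"})
ET.SubElement(new_book, "Title").text = "Third"

# Update: change a field of the record we found.
found.find("Title").text = "Second, revised"

# Delete: remove a record.
root.remove(root.find("Book[@id='b1']"))

remaining = [(book.get("id"), book.findtext("Title")) for book in root.findall("Book")]
print(remaining)
```

Writing the changed tree back to disk would be the "save" step; it is left out here to keep the sketch short.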

Page 6: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

XML vs. Other Markup Languages

                                HTML      XML        WP Markup
ASCII vs. Binary                ASCII     ASCII      Binary
Format vs. Structure            Format    Structure  Format and Structure
Extendable vs. Non-Extendable   Non-Ext   Ext        Non-Ext
Range of coverage               Low       High       Medium

Page 7: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

How do you Write XML?

Page 8: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

Three Views of XML

Schematic View

Tag View

Browser View

Page 9: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

XML Instance Documents

No Tagging
Dodge Durango
Sport Utility
4
32000 miles
$18000
Yes
Yes
Radio/Cassette/CD
Yes
Yes
Full/Partial
Very clean

Minimal Tagging
Name: Dodge Durango
Type: Sport Utility
Doors: 4
Miles: 32000
Price: 18000
Power_Locks: Yes
Power_Windows: Yes
Stereo: Radio/Cassette/CD
Air-Conditioning: Yes
Automatic: Yes
FWD: Full/Partial
Note: Very clean

XML Tagging
<VEHICLES>
   <VEHICLE inventory_number="1">
      <MAKE>Dodge</MAKE>
      <MODEL model_code="USA23">Durango</MODEL>
      <YEAR>1998</YEAR>
      <STYLE>Sport Utility</STYLE>
      <DOORS>4</DOORS>
      <PRICE>18000</PRICE>
      <MILES>32000</MILES>
      <OPTIONS>
         <POWER_LOCKS>Yes</POWER_LOCKS>
         <POWER_WINDOWS>Yes</POWER_WINDOWS>
         <STEREO>Radio/Cassette/CD</STEREO>
         <AIR_CONDITIONING>Yes</AIR_CONDITIONING>
         <AUTOMATIC>Yes</AUTOMATIC>
         <FWD>Full/Partial</FWD>
      </OPTIONS>
      <NOTE>Very clean</NOTE>
   </VEHICLE>
</VEHICLES>
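To see why full tagging matters, here is a small sketch (Python, standard library only) that reads a shortened copy of the vehicle record and pulls out fields by name. It also shows that a parser rejects a closing tag that does not match its opening tag exactly. The helper names summarize and is_well_formed are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the vehicle instance from the slide.
vehicles_xml = """<VEHICLES>
   <VEHICLE inventory_number="1">
      <MAKE>Dodge</MAKE>
      <MODEL model_code="USA23">Durango</MODEL>
      <PRICE>18000</PRICE>
      <MILES>32000</MILES>
      <OPTIONS>
         <POWER_LOCKS>Yes</POWER_LOCKS>
         <AIR_CONDITIONING>Yes</AIR_CONDITIONING>
      </OPTIONS>
   </VEHICLE>
</VEHICLES>"""


def summarize(xml_text):
    """Return one dictionary per VEHICLE, looked up by element name."""
    root = ET.fromstring(xml_text)
    return [
        {
            "inventory_number": vehicle.get("inventory_number"),
            "make": vehicle.findtext("MAKE"),
            "model": vehicle.findtext("MODEL"),
            "price": int(vehicle.findtext("PRICE")),
            "options": [opt.tag for opt in vehicle.find("OPTIONS") if opt.text == "Yes"],
        }
        for vehicle in root.findall("VEHICLE")
    ]


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


summary = summarize(vehicles_xml)

# A closing tag that differs from its opening tag breaks the whole document.
broken = vehicles_xml.replace("</AIR_CONDITIONING>", "</AIR-_CONDITIONING>")
print(summary[0]["make"], is_well_formed(vehicles_xml), is_well_formed(broken))
```

With the untagged or minimally tagged versions, the same lookup would need guesswork about line order or label spelling.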

Page 10: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Gross Anatomy of a Tag

Page 11: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Micro Anatomy of a Tag

• Shorthand tag names stand for real words.

• Every tag "inside" another tag is contained by that tag.

• Parameters tell you what the tag has.

White Space is for Your Eyes Only

<TABLE><TR><TD COLSPAN="2">Here is the picture<IMG SRC="ngo.jpg" BORDER

="1"> </TD> </TR></TABLE>   
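A hedged sketch of the white-space point, in Python with the standard library: the same table is written once on a single line and once spread out with indentation and a line break inside a tag, and both parse to the same structure. Two assumptions: the IMG tag is self-closed so the fragment is well-formed XML (the slide's version is HTML), and the comparison ignores white-space-only text, which an XML parser does hand back to the application. The shape helper is invented for this example.

```python
import xml.etree.ElementTree as ET

compact = (
    '<TABLE><TR><TD COLSPAN="2">Here is the picture'
    '<IMG SRC="ngo.jpg" BORDER="1"/></TD></TR></TABLE>'
)

# Same content, laid out for human eyes, including a line break before ="1".
spaced = """<TABLE>
  <TR>
    <TD COLSPAN = "2">Here is the picture<IMG SRC="ngo.jpg" BORDER
="1"/></TD>
  </TR>
</TABLE>"""


def shape(element):
    """Describe an element as (tag, attributes, trimmed text, children)."""
    return (
        element.tag,
        dict(element.attrib),
        (element.text or "").strip(),
        [shape(child) for child in element],
    )


same_structure = shape(ET.fromstring(compact)) == shape(ET.fromstring(spaced))
print(same_structure)
```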

Page 12: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Element

• The thing in brackets
– <CapitalizationMatters/>
– <NoSpaces/>
– <StartWithALetter/>
– <BeDescriptive/>
– <Nest>
    <Nest>
      <Nest/>
    </Nest>
  </Nest>
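The naming rules above can be checked against a real parser. The sketch below (Python, standard library) feeds a few small fragments to xml.etree.ElementTree and records which ones it accepts; the fragments and the is_well_formed helper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


results = {
    # Properly nested elements, each closed in reverse order of opening.
    "<Nest><Nest><Nest/></Nest></Nest>": None,
    # Capitalization matters: </title> does not close <Title>.
    "<Title>My Title</title>": None,
    # No spaces: the parser reads "No" as the name and "Spaces" as a broken attribute.
    "<No Spaces/>": None,
    # Start with a letter: a name cannot begin with a digit.
    "<1stElement/>": None,
}
for fragment in results:
    results[fragment] = is_well_formed(fragment)

for fragment, ok in results.items():
    print(fragment, "->", ok)
```

"Be descriptive" is the one rule a parser cannot enforce; that one is up to the person inventing the names.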

Page 13: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

The Attribute

• Really just another form of an element

• Always quote attribute values

• A variety of data types

• Can be linked to a list of values

• Cannot nest
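A short sketch of how these attribute rules look from code, using Python's standard library. The Para element is borrowed from the earlier slides; the specific values are illustrative. Note one assumption: without a DTD or schema to declare data types, this parser hands every attribute value back as a plain string.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring('<Para Style="List" Revision="2">Point 1</Para>')

# Attributes arrive as a flat name-to-string mapping: no nesting is possible.
attributes = dict(element.attrib)
revision = element.get("Revision")  # the string "2", not the number 2

# Always quote them: an unquoted value is rejected by the parser.
try:
    ET.fromstring("<Para Style=List>Point 1</Para>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(attributes, repr(revision), unquoted_accepted)
```

If the same information needed structure of its own (say, several revisions with dates), it would have to become child elements instead.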

Page 14: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

How to Approach an Instance

• Understand it is an instance of a model
– Model of what?
– From what perspective?

• What are the names?
– What are the biggies?
– Which ones to ignore for now
– How is capitalization handled?

• Strip it to know it
– Get rid of the bulk.
– Lay out the major structure
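One possible way to "strip it to know it" is to throw away the text and the repeated records and print only the tag skeleton. The sketch below does that in Python with the standard library; the outline function and the second VEHICLE record are hypothetical, added only to show repeated siblings being collapsed.

```python
import xml.etree.ElementTree as ET

# Illustrative instance: the second VEHICLE is invented to show repetition.
sample = """<VEHICLES>
  <VEHICLE inventory_number="1">
    <MAKE>Dodge</MAKE>
    <OPTIONS><STEREO>Radio/Cassette/CD</STEREO><FWD>Full/Partial</FWD></OPTIONS>
  </VEHICLE>
  <VEHICLE inventory_number="2">
    <MAKE>Example</MAKE>
    <OPTIONS><STEREO>Radio</STEREO></OPTIONS>
  </VEHICLE>
</VEHICLES>"""


def outline(element, depth=0):
    """List tag names by nesting depth, keeping only the first of repeated siblings."""
    lines = ["  " * depth + element.tag]
    seen = set()
    for child in element:
        if child.tag in seen:
            continue  # get rid of the bulk: one example of each name is enough
        seen.add(child.tag)
        lines.extend(outline(child, depth + 1))
    return lines


skeleton = outline(ET.fromstring(sample))
print("\n".join(skeleton))
```

The printed skeleton is a quick answer to "what are the names" and "what are the biggies" before reading any of the content.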

Page 15: LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair

Finding Stuff in an XML File - XPath

• Directories are hierarchies
– Each file has a path

• XML files are hierarchies
– Each element has an XPath

• /Subject
• //Subject
• //Subject[@id='s0']
• //*[@id='s0']
• //*[@id='s0']/title
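The path expressions above can be tried out with Python's standard library, with two caveats. xml.etree.ElementTree supports only a subset of XPath, and its paths are written relative to the element being searched, so //Subject becomes .//Subject. The Catalog and Group wrapper elements and the titles are hypothetical, invented so the paths have something to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: Subject elements at two different depths.
doc = """<Catalog>
  <Subject id="s0"><title>XML</title></Subject>
  <Group>
    <Subject id="s1"><title>HTML</title></Subject>
  </Group>
</Catalog>"""
root = ET.fromstring(doc)

# Like a directory path: only Subject elements directly under the root.
direct = root.findall("Subject")

# //Subject : Subject elements at any depth.
anywhere = root.findall(".//Subject")

# //Subject[@id='s0'] : Subject elements whose id attribute is s0.
by_id = root.findall(".//Subject[@id='s0']")

# //*[@id='s0'] : any element, whatever its name, with id s0.
any_name = root.findall(".//*[@id='s0']")

# //*[@id='s0']/title : the title child of that element.
titles = [t.text for t in root.findall(".//*[@id='s0']/title")]

print(len(direct), len(anywhere), len(by_id), len(any_name), titles)
```

A full XPath engine would accept the slide's expressions exactly as written; the relative forms here are an accommodation to this particular library.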