LIS 549 U/TU: Intro to Content Management Fall 2003 * Bob Boiko * MSIM Associate Chair
The Information School at the University of Washington
XML Basics
LIS 549 U/TU: Intro to Content Management * Fall 2004 * Bob Boiko * MSIM Associate Chair
A First Look at XML
1. You invent a set of names for the stuff you want to manage and put <> around them
   – <MyInfo>
   – <Title>
   – <Author>
   – <Body>
   – <Para>
2. You figure out which ones go inside which others
   <MyInfo>
     <Title>
     <Author>
     <Body>
       <Para>
     </Body>
   </MyInfo>
3. You add additional information to the names
   <MyInfo Date="">
     <Title>
     <Author eMail="">
     <Body Revision="">
       <Para Style="">
     </Body>
   </MyInfo>
4. You fill in the blanks
   <MyInfo Date="2004-03-03">
     <Title>My Title</Title>
     <Author eMail="[email protected]">Bob</Author>
     <Body Revision="2">
       <Para Style="ListHead">Some Head</Para>
       <Para Style="List">Point 1</Para>
       <Para Style="List">Point 2</Para>
     </Body>
   </MyInfo>
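The four steps can also be carried out in code. The sketch below is one possible way to do it, assuming Python and its standard xml.etree.ElementTree module (neither is part of the slides); the element and attribute names are the ones from the slide, and the eMail value is kept exactly as it appears in the transcript.

```python
import xml.etree.ElementTree as ET

# Steps 1 and 2: invent the names and decide which go inside which.
my_info = ET.Element("MyInfo")
title = ET.SubElement(my_info, "Title")
author = ET.SubElement(my_info, "Author")
body = ET.SubElement(my_info, "Body")

# Step 3: add additional information (attributes) to the names.
my_info.set("Date", "2004-03-03")
author.set("eMail", "[email protected]")  # address is redacted in the transcript
body.set("Revision", "2")

# Step 4: fill in the blanks.
title.text = "My Title"
author.text = "Bob"
for style, text in [("ListHead", "Some Head"), ("List", "Point 1"), ("List", "Point 2")]:
    para = ET.SubElement(body, "Para", Style=style)
    para.text = text

# Serialize the tree back to tagged text.
xml_text = ET.tostring(my_info, encoding="unicode")
print(xml_text)
```

Building the tree through an API rather than typing tags by hand means every element is closed automatically, which avoids the most common well-formedness mistakes.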
The Three Faces of XML
• Data transfer
• Resource identification and discovery (Dublin Core, RDF, etc.)
• Content modeling
Why XML is Cool
• Has an X in its name!
• Is as strong as its “also ran” parent SGML
• More flexible than its superficial cousin HTML
• It’s accepted
What is XML Good For?
You can treat an XML file like:
• A word processing file
  – Type it, edit it, display it
• An HTML file
  – Tag it, display it with a style sheet
• A database
  – Open it, search it, add, update, delete
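The "database" view can be made concrete with a short sketch. It assumes Python's standard xml.etree.ElementTree module, and the Library/Book/Title names and id values are hypothetical sample data, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
SOURCE = """
<Library>
  <Book id="b1"><Title>First</Title></Book>
  <Book id="b2"><Title>Second</Title></Book>
</Library>
"""

library = ET.fromstring(SOURCE)                      # open it

found = library.find("Book[@id='b2']")               # search it
found_title = found.find("Title").text

new_book = ET.SubElement(library, "Book", id="b3")   # add
ET.SubElement(new_book, "Title").text = "Third"

found.find("Title").text = "Second (revised)"        # update

library.remove(library.find("Book[@id='b1']"))       # delete

remaining_ids = [book.get("id") for book in library.findall("Book")]
print(remaining_ids)
```

This works well for small files held in memory; it is a sketch of the idea, not a substitute for a real database with indexing and concurrent access.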
XML vs. Other Markup Languages
                                HTML      XML         WP Markup
ASCII vs. Binary                ASCII     ASCII       Binary
Format vs. Structure            Format    Structure   Format and Structure
Extendable vs. Non-Extendable   Non-Ext   Ext         Non-Ext
Range of coverage               Low       High        Medium
How do you Write XML?
Three Views of XML
• Schematic View
• Tag View
• Browser View
XML Instance Documents

No Tagging
Dodge Durango
Sport Utility
4
32000 miles
$18000
Yes
Yes
Radio/Cassette/CD
Yes
Yes
Full/Partial
Very clean

Minimal Tagging
Name: Dodge Durango
Type: Sport Utility
Doors: 4
Miles: 32000
Price: 18000
Power_Locks: Yes
Power_Windows: Yes
Stereo: Radio/Cassette/CD
Air-Conditioning: Yes
Automatic: Yes
FWD: Full/Partial
Note: Very clean

XML Tagging
<VEHICLES>
  <VEHICLE inventory_number="1">
    <MAKE>Dodge</MAKE>
    <MODEL model_code="USA23">Durango</MODEL>
    <YEAR>1998</YEAR>
    <STYLE>Sport Utility</STYLE>
    <DOORS>4</DOORS>
    <PRICE>18000</PRICE>
    <MILES>32000</MILES>
    <OPTIONS>
      <POWER_LOCKS>Yes</POWER_LOCKS>
      <POWER_WINDOWS>Yes</POWER_WINDOWS>
      <STEREO>Radio/Cassette/CD</STEREO>
      <AIR_CONDITIONING>Yes</AIR_CONDITIONING>
      <AUTOMATIC>Yes</AUTOMATIC>
      <FWD>Full/Partial</FWD>
    </OPTIONS>
    <NOTE>Very clean</NOTE>
  </VEHICLE>
</VEHICLES>
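The payoff of full tagging is that a program can pick out each value by name. As a rough sketch, assuming Python's standard xml.etree.ElementTree module and a trimmed copy of the vehicle record (a few elements are left out to keep it short):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the vehicle record from the slide.
VEHICLES_XML = """
<VEHICLES>
  <VEHICLE inventory_number="1">
    <MAKE>Dodge</MAKE>
    <MODEL model_code="USA23">Durango</MODEL>
    <YEAR>1998</YEAR>
    <PRICE>18000</PRICE>
    <MILES>32000</MILES>
    <OPTIONS>
      <STEREO>Radio/Cassette/CD</STEREO>
      <AUTOMATIC>Yes</AUTOMATIC>
    </OPTIONS>
  </VEHICLE>
</VEHICLES>
"""

root = ET.fromstring(VEHICLES_XML)
vehicle = root.find("VEHICLE")

make = vehicle.findtext("MAKE")                        # element content
model_code = vehicle.find("MODEL").get("model_code")   # attribute value
price = int(vehicle.findtext("PRICE"))                 # text converted to a number
# Collect every child of OPTIONS as a name -> value mapping.
options = {option.tag: option.text for option in vehicle.find("OPTIONS")}

print(make, model_code, price, options)
```

With the untagged or minimally tagged versions, the same program would have to guess which line holds the price; here the tag names carry that knowledge.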
The Gross Anatomy of a Tag
The Micro Anatomy of a Tag
• Shorthand tag names stand for real words.
• Every tag nested "inside" another tag is contained by that tag.
• Parameters (attributes) tell you what the tag has.
White Space is for Your Eyes Only
<TABLE><TR><TD COLSPAN="2">Here is the picture<IMG SRC="ngo.jpg" BORDER="1"> </TD> </TR></TABLE>
The Element
• The thing in brackets
  – <CapitalizationMatters/>
  – <NoSpaces/>
  – <StartWithALetter/>
  – <BeDescriptive/>
  – <Nest>
      <Nest>
        <Nest/>
      </Nest>
    </Nest>
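These naming rules are enforced by any XML parser, so they are easy to try out. The following is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the test strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

checks = {
    # Opening and closing names match exactly.
    "matching case": is_well_formed("<Title>Hi</Title>"),
    # Capitalization matters: <Title> is not closed by </title>.
    "mismatched case": is_well_formed("<Title>Hi</title>"),
    # No spaces inside an element name.
    "space in name": is_well_formed("<My Title>Hi</My Title>"),
    # Names may not start with a digit.
    "starts with digit": is_well_formed("<1Title>Hi</1Title>"),
    # Elements may nest as long as each one closes inside its parent.
    "nested": is_well_formed("<Nest><Nest><Nest/></Nest></Nest>"),
}

for rule, accepted in checks.items():
    print(rule, "->", "accepted" if accepted else "rejected")
```

Only the first and last strings are accepted. "Be descriptive" is the one rule a parser cannot check; that one is up to the author.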
The Attribute
• Really just another form of an element
• Always quote them
• A variety of data types
• Can be linked to a list of values
• Cannot nest
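A quick way to see the quoting rule in action is to hand a parser both forms. This sketch assumes Python's standard xml.etree.ElementTree module; the Para element and its Style attribute come from the earlier slide, while the Revision attribute on Para is an assumption added to show a numeric value.

```python
import xml.etree.ElementTree as ET

# Quoted attributes parse cleanly.
para = ET.fromstring('<Para Style="List" Revision="2">Point 1</Para>')
style = para.get("Style")

# This parser hands every attribute value back as a string; any data typing
# has to come from a DTD or schema, or from the program itself.
revision_raw = para.get("Revision")
revision = int(revision_raw)

# An unquoted attribute value is not well-formed XML.
try:
    ET.fromstring("<Para Style=List>Point 1</Para>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(style, revision, unquoted_accepted)
```

Because an attribute value is a flat string, it cannot hold child elements; anything that needs internal structure belongs in an element instead.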
How to Approach an Instance
• Understand it is an instance of a model
  – Model of what?
  – From what perspective?
• What are the names?
  – What are the biggies?
  – Which ones to ignore for now
  – How is capitalization handled?
• Strip it to know it
  – Get rid of the bulk.
  – Lay out the major structure
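"Strip it to know it" can be partly automated. The sketch below is one possible approach, assuming Python's standard xml.etree.ElementTree module: it throws away text, attributes, and repeated siblings and prints only the element names as an indented outline. The function name outline and the max_depth parameter are hypothetical, and the sample reuses the MyInfo document from earlier in the deck.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0, max_depth=2):
    """Return the element names as indented lines, one per distinct child name."""
    lines = ["  " * depth + element.tag]
    if depth < max_depth:
        seen = []
        for child in element:
            if child.tag not in seen:   # get rid of the bulk: skip repeated siblings
                seen.append(child.tag)
                lines.extend(outline(child, depth + 1, max_depth))
    return lines

SAMPLE = """
<MyInfo Date="2004-03-03">
  <Title>My Title</Title>
  <Author>Bob</Author>
  <Body Revision="2">
    <Para Style="ListHead">Some Head</Para>
    <Para Style="List">Point 1</Para>
    <Para Style="List">Point 2</Para>
  </Body>
</MyInfo>
"""

skeleton = outline(ET.fromstring(SAMPLE))
print("\n".join(skeleton))
```

The three Para elements collapse to a single line, which leaves the major structure (MyInfo, Title, Author, Body, Para) visible at a glance. Only the first sibling of each name is descended into, so this is a rough first look rather than a complete model.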
Finding Stuff in an XML File - XPath
• Directories are hierarchies
  – Each file has a path
• XML files are hierarchies
  – Each element has an XPath
• /Subject
• //Subject
• //Subject[@id='s0']
• //*[@id='s0']
• //*[@id='s0']/title
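The expressions above can be tried against a small document. This sketch assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath and expects paths relative to an element, so each expression is written with a leading "." here. The Catalog and Group elements and the second Subject are hypothetical; only Subject, title, and id='s0' come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document built around the names used on the slide.
DOC = """
<Catalog>
  <Subject id="s0"><title>XML</title></Subject>
  <Group>
    <Subject id="s1"><title>HTML</title></Subject>
  </Group>
</Catalog>
"""
root = ET.fromstring(DOC)

# One step down from the starting element: only direct Subject children.
direct = root.findall("./Subject")
# Any depth: every Subject element in the tree.
anywhere = root.findall(".//Subject")
# Any Subject whose id attribute equals 's0'.
by_id = root.findall(".//Subject[@id='s0']")
# Any element at all whose id attribute equals 's0'.
any_by_id = root.findall(".//*[@id='s0']")
# The title child of whichever element has id 's0'.
titles = [t.text for t in root.findall(".//*[@id='s0']/title")]

print(len(direct), len(anywhere), by_id[0].get("id"), titles)
```

Note that in full XPath, /Subject would match only a root element named Subject; "./Subject" in this sketch is the closest relative-path equivalent, not an exact translation.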