NewsML™, NITF & NewsCodes: The winning triple
Michael Steidl, IPTC Managing Director
ANSA/FIEG meeting, 19 April 2006, Rome
© 2006 IPTC All rights reserved
Who is what
• NewsML 1: News Markup Language for managing and packaging news
  – allows versioning of news items: easy tracking of breaking news = evolving stories
  – rich set of management metadata: publishing status (“usable”, “embargoed”, “canceled”, …), why an item was updated, links to other news items (like “see also”)
  – packaging of news items of different media types (text, photo, …)
• NITF: News Industry Text Format for marking up text news
  – inline markup of text
  – structure for semi-layout (e.g. tables)
• NewsCodes: for proper categorisation
  – Subject NewsCodes with about 1300 terms, in three levels
NewsML™ version 1
How a NewsML instance is built:
• Structured content: story package
Top Content Container = the NewsItem, with one ContentComponent per piece of content:
  – text / role = interview
  – text / role = background
  – photo / role = pic of person
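The story package above can be sketched as a small data model. This is a minimal illustration only: the class and field names (`NewsItem`, `Component`, `media_type`, `role`) are assumptions made for this sketch and are not element names from the NewsML schema.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One piece of content inside a package (names are illustrative)."""
    media_type: str   # e.g. "text", "photo"
    role: str         # why the content is in the package


@dataclass
class NewsItem:
    """Top content container holding the packaged components."""
    item_id: str
    version: int
    components: list = field(default_factory=list)

    def by_role(self, role):
        """Return all components that play the given role."""
        return [c for c in self.components if c.role == role]

    def media_types(self):
        """Return the distinct media types in the package, sorted."""
        return sorted({c.media_type for c in self.components})


# The story package from the slides: two texts and one photo.
story = NewsItem(
    item_id="abc123",
    version=1,
    components=[
        Component("text", "interview"),
        Component("text", "background"),
        Component("photo", "pic of person"),
    ],
)

background = story.by_role("background")
```

A receiver could use the roles to decide what to render first, for example the interview as the main text and the background as a sidebar.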
NewsML™ version 1
How a NewsML instance is built:
• Structured content: web page package
Top Content Container
  – text / role = main story
  – text / role = tickerline1
  – text / role = tickerline2
  – text / role = tickerline3
  – text / role = tickerline4
  – text / role = sidebar
  – photo / role = pic
  – text / role = news
  – audio / role = sound
NewsML™ version 1
Versioning: The original version is circulated at 11:32
ANSA
Italy wins world championship (ID: abc123, Version: 1)
Another news 5
Another news 4
Another news 3
Another news 2
Another news 1
Italy wins world championship (ID: abc123, Version: 1)
NewsML™ version 1
Versioning: An updated version is circulated at 13:43
ANSA
Italy wins world championship (ID: abc123, Version: 2)
Another news 9
Another news 8
Another news 7
Another news 6
Italy wins world championship (ID: abc123, Version: 1)
Italy wins world championship (ID: abc123, Version: 2): This is an update
Italy wins world championship (ID: abc123, Version: 1)
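On the receiving side, the versioning shown above boils down to "same ID, higher version wins". The following is a rough sketch of that rule, assuming each incoming item is a plain dictionary with `id`, `version`, `headline` and `sent` keys; these names are hypothetical and chosen only for this illustration.

```python
def latest_versions(feed):
    """Keep only the highest version seen for each item ID."""
    latest = {}
    for item in feed:
        known = latest.get(item["id"])
        # Accept the item if it is new or newer than what we already hold.
        if known is None or item["version"] > known["version"]:
            latest[item["id"]] = item
    return latest


# The two deliveries from the slides, plus a late repeat of version 1.
feed = [
    {"id": "abc123", "version": 1,
     "headline": "Italy wins world championship", "sent": "11:32"},
    {"id": "abc123", "version": 2,
     "headline": "Italy wins world championship", "sent": "13:43"},
    {"id": "abc123", "version": 1,
     "headline": "Italy wins world championship", "sent": "11:32"},
]

current = latest_versions(feed)
```

Because the comparison uses the version number rather than the arrival order, a late repeat of version 1 does not overwrite version 2.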
NewsML™ version 1
Summary
• NewsML provides a rich, well-designed and extensible set of metadata to enhance routing and selecting.
• NewsML allows items to be managed:
  – each item has a unique identifier
  – each item has a distinct version
• NewsML allows several pieces of content to be packaged into one item
  – content of various media types
• NewsML adds value to packaging:
  – “roles” identify why the content is there
  – groups of packages enhance the structure
NITF
• Feature “inline markup”: one can add metadata to portions of the news text:
<p xml:lang="en">The weather was superb today in Norfolk, Virginia. Made me want to take my boat, manufactured by the <org value="IT123498312" idsrc="ISIN">Acme Boat Company</org>.</p>
This inline markup can be used to add linked information to the final rendition, such as identifying information about entities (“what company is that exactly?”) or a link to a background story, and to add layout “recommendations” (e.g. emphasised).
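To show how a receiving system might pick up the inline metadata, here is a small sketch that reads the paragraph above with Python's standard `xml.etree.ElementTree` module. The function name `find_organisations` is an assumption for this example; only the `org` element and its `value` and `idsrc` attributes come from the slide.

```python
import xml.etree.ElementTree as ET

# The NITF paragraph from the slide, with plain ASCII quotes.
NITF_PARAGRAPH = (
    '<p xml:lang="en">The weather was superb today in Norfolk, Virginia. '
    'Made me want to take my boat, manufactured by the '
    '<org value="IT123498312" idsrc="ISIN">Acme Boat Company</org>.</p>'
)


def find_organisations(xml_text):
    """Return (name, identifier, identifier source) for each <org> element."""
    root = ET.fromstring(xml_text)
    return [
        (org.text, org.get("value"), org.get("idsrc"))
        for org in root.iter("org")
    ]


organisations = find_organisations(NITF_PARAGRAPH)

# The readable text is still available without the markup.
plain_text = "".join(ET.fromstring(NITF_PARAGRAPH).itertext())
```

With the identifier and its source in hand, a renderer could turn the company name into a link to reference data.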
NITF
• Feature: “structure/layout markup”
<table border="1">
<tr>
<!-- beach --><th></th>
<!-- day high and low --><th colspan="2">today</th>
<!-- tide times --><th colspan="2">tide</th>
<!-- forecast tomorrow --><th colspan="2">tomorrow</th>
<!-- forecast the next day --><th colspan="2">next day</th>
<!-- forecast the day after that --><th colspan="2">third day</th>
</tr>
<tr>
<!-- beach --><th>beach</th>
<!-- day high and low --><th>high</th><th>low</th>
….
This sequence of strange-looking code translates into a decent table and into an even more fashionable version on a layout system for newspapers.
NITF
Summary
• NITF is a kind of “HTML for all kinds of media”: it brings the ease of web publishing to print layout as well.
• Inline markup allows linking to reference information and to background information.
• Structure markup conveys layout information from the maker of the news to its users.
IPTC metadata codes
• The challenge: “The most effective communication occurs when all parties involved agree on the meaning of the terms being used.” (Fast, Leise & Steckel, “Boxes and Arrows”)
• The solution: IPTC’s controlled vocabularies =
  – managed lists of codes (= abstract notations)
  – with names (in different languages)
  – with explicit explanations (≈ encyclopaedia) (in different languages)
  – each of the 28 for a specific scope
  – to navigate content
IPTC NewsCodes
The common name for ALL controlled vocabularies maintained by the IPTC is IPTC NewsCodes.
(More info at www.newscodes.org)
IPTC NewsCodes
Currently the IPTC maintains 28 sets of NewsCodes
IPTC NewsCodes break out into groups:
IPTC NewsCodes
What the content is about
– Subject-NewsCodes: ~ 1300 terms at 3 levels
– SubjectQualifier-NewsCodes: men, women, age groups, sports-specific qualifiers, …
IPTC NewsCodes
Formal attributes of the content
– Genre-NewsCodes like current, update, wrap-up, background, feature, interview, review …
– Scene-NewsCodes for photos like head-/half-/full-shot, interior/exterior, single/two/group …
– Importance-NewsCodes identifying 6 levels
– Location-NewsCodes are location qualifiers from “WorldRegion” to “Sublocation”
IPTC NewsCodes
Formal attributes of the media data
– Format (mimetype, mediatype)
– Encoding
– Encoders
– Physical Characteristics
– Colourspace
IPTC NewsCodes
Codes to manage news exchange
– (news) Provider-NewsCodes: already registered with the IPTC?
– Status-NewsCodes (usable, embargoed …)
– Priority-NewsCodes (9 levels)
– Urgency-NewsCodes (9 levels)
– Of interest to-NewsCodes identifying groups of the audience the content is aimed at
– Relevance-NewsCodes identifying journalistic relevance
– Role-NewsCodes to provide semantics to news package components (NewsML!)
IPTC NewsCodes
In depth …
IPTC’s huge taxonomy to describe content
The Subject NewsCodes
IPTC NewsCodes
The Subject NewsCodes
• Three-level tree structure
• ~ 1300 terms in total
• 17 top-level Subjects (broadest terms) for art, crime/law, disaster, economy/business, education, environment, health, human interest, labour, lifestyle, politics, religion, science/technology, social issues, sports, unrest/war, weather
• ~ 350 intermediate-level terms (narrower terms, NT)
• ~ 900 third (= lowest) level terms (most NT)
IPTC NewsCodes
The Subject NewsCodes
Term structure: each term has …
• a Code: 8 digits (e.g. 17001000)
• a Name: language-specific string (e.g. weather/forecast or Meteorología/Pronósticos)
• an Explanation: short text describing the concept of this Subject-NewsCode
• term management data (versioning)
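The term structure can be sketched as a small record type. The sketch assumes the 8-digit code is laid out as 2 digits for the top-level Subject, 3 digits for the intermediate level and 3 digits for the lowest level, with unused levels filled with zeros; the class name `SubjectTerm`, its fields and the language tags "en" and "es" are hypothetical choices for this illustration.

```python
from dataclasses import dataclass


@dataclass
class SubjectTerm:
    """A Subject NewsCode term: code, names per language, explanation."""
    code: str
    names: dict            # language tag -> name, e.g. {"en": "..."}
    explanation: str = ""  # left empty here; the real text comes from the IPTC

    def __post_init__(self):
        if len(self.code) != 8 or not (self.code.isascii() and self.code.isdigit()):
            raise ValueError(f"expected an 8-digit code, got {self.code!r}")

    @property
    def level(self):
        """1 = top level, 2 = intermediate, 3 = lowest (assumed 2+3+3 layout)."""
        if self.code[5:] != "000":
            return 3
        if self.code[2:5] != "000":
            return 2
        return 1

    def parent_code(self):
        """Code of the next broader term, or None for a top-level term."""
        if self.level == 3:
            return self.code[:5] + "000"
        if self.level == 2:
            return self.code[:2] + "000000"
        return None

    def name(self, lang, fallback="en"):
        """Name in the requested language, falling back to another one."""
        return self.names.get(lang, self.names.get(fallback))


forecast = SubjectTerm(
    code="17001000",
    names={"en": "weather/forecast", "es": "Meteorología/Pronósticos"},
)
```

Because the code is an abstract notation, the same term can be shown with an English or a Spanish name without changing what is stored with the news item.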
IPTC NewsCodes
The Subject NewsCodes
Where to apply …
Explicit tags are provided by:
• NITF
• NewsML
• IIM (aka “IPTC Headers” for images)
• “IPTC Core” Schema for XMP (for Adobe CS products)
IPTC NewsCodes
The Subject NewsCodes
How to apply …
• manually by editors (pick lists)
• automatically by categorization engines
• “mixed mode”: suggested by categorizer, changed/approved by editor
IPTC NewsCodes
A Subject NewsCodes example:
“IPTC gave a presentation about their news technology at an ANSA/FIEG meeting in Rome” would e.g. resolve to:
– 13022000 (Technology/IT)
– 04003000 (Economy/Computing and IT)
– 04010004 (Economy/Media/News agency)
IPTC NewsCodes
The Subject NewsCodes
You are in control: you can make your own subset
• select the Subject Codes you want to use for your agency
• select sets of Subject Codes for the various desks in your agency (e.g. economy, sports …)
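Building a per-desk subset could look roughly like this. The sketch assumes that the first two digits of a code identify its top-level Subject; the desk names and the `DESK_TOP_LEVELS` mapping are hypothetical, and the three codes are the ones used in the example earlier in this presentation.

```python
# Codes and names taken from the example slide.
SUBJECT_NAMES = {
    "13022000": "Technology/IT",
    "04003000": "Economy/Computing and IT",
    "04010004": "Economy/Media/News agency",
}

# Hypothetical desks, each mapped to the top-level Subjects it covers.
DESK_TOP_LEVELS = {
    "economy": {"04"},
    "science and technology": {"13"},
}


def subset_for_desk(desk):
    """Return the sorted Subject Codes whose top level belongs to the desk."""
    prefixes = DESK_TOP_LEVELS[desk]
    return sorted(code for code in SUBJECT_NAMES if code[:2] in prefixes)


economy_codes = subset_for_desk("economy")
```

An editor's pick list for the economy desk would then offer only the codes in `economy_codes`.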
IPTC NewsCodes
The Subject NewsCodes
Additional refinement: Qualifiers
– primarily used for sports
– add facets to the content like men/women, individual/team, indoor/outdoor …