NewsML™, NITF & NewsCodes The winning triple Michael Steidl IPTC Managing Director ANSA/FIEG meeting 19 April 2006, Rome

NewsML™, NITF & NewsCodes: The winning triple

Michael Steidl, IPTC Managing Director

ANSA/FIEG meeting, 19 April 2006, Rome

© 2006 IPTC All rights reserved 2

Who is what

• NewsML 1: News Markup Language for managing and packaging of news
  – allows versioning of news items: easy tracking of breaking news = evolving stories
  – rich set of management metadata: publishing status (“usable”, “embargoed”, “canceled”, …), why updated, links to other news items (like “see also”)
  – packaging of news items of different media types (text, photo, …)

• NITF: News Industry Text Format for marking up text news
  – inline markup of text
  – structure for semi-layout (e.g. tables)

• NewsCodes: for proper categorisation
  – Subject NewsCodes with about 1300 terms, in three levels


NewsML™ version 1

How a NewsML instance is built:
• Structured content: story package

Top Content Container = the NewsItem

ContentComponent:
– text / role = interview
– text / role = background
– photo / role = pic of person
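The story package can be sketched as a small XML tree. This is only a minimal illustration in Python: the element names NewsItem, TopContentContainer and ContentComponent are borrowed from the slide labels, while the attribute names (id, version, mediaType, role) and the helper build_package are hypothetical and do not reproduce the actual NewsML 1 schema.

```python
import xml.etree.ElementTree as ET


def build_package(item_id, version, parts):
    """Build a simplified news item with one content component per part.

    `parts` is a list of (media_type, role) pairs, e.g. ("text", "interview").
    Element and attribute names are illustrative, not the NewsML 1 schema.
    """
    item = ET.Element("NewsItem", {"id": item_id, "version": str(version)})
    container = ET.SubElement(item, "TopContentContainer")
    for media_type, role in parts:
        ET.SubElement(
            container,
            "ContentComponent",
            {"mediaType": media_type, "role": role},
        )
    return item


# The story package from the slide: two texts and one photo, each with a role.
story = build_package(
    "abc123",
    1,
    [
        ("text", "interview"),
        ("text", "background"),
        ("photo", "pic of person"),
    ],
)

# Roles tell a receiver why each piece of content is in the package.
roles = [c.get("role") for c in story.iter("ContentComponent")]
print(ET.tostring(story, encoding="unicode"))
```

A receiving system could use the role values to decide, for example, that the "background" text goes into a sidebar while the "interview" text becomes the main story.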


NewsML™ version 1

How a NewsML instance is built:
• Structured content: web page package

Top Content Container

– text / role = main story
– text / role = tickerline1
– text / role = tickerline2
– text / role = tickerline3
– text / role = tickerline4
– text / role = sidebar
– photo / role = pic
– text / role = news
– audio / role = sound


NewsML™ version 1

Versioning: The original version is circulated at 11:32

ANSA

Italy wins world championship (ID: abc123, Version: 1)

Another news 5

Another news 4

Another news 3

Another news 2

Another news 1

Italy wins world championship (ID: abc123, Version: 1)


NewsML™ version 1

Versioning: An updated version is circulated at 13:43

ANSA

Italy wins world championship (ID: abc123, Version: 2)

Another news 9

Another news 8

Another news 7

Another news 6

Italy wins world championship (ID: abc123, Version: 1)

Italy wins world championship (ID: abc123, Version: 2)

This is an update

Italy wins world championship (ID: abc123, Version: 1)
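How a receiver might use the ID and the version number can be sketched as follows. This is a minimal illustration under stated assumptions: the function receive and the in-memory dictionary are hypothetical, and a real system would read the identifier and version from the NewsML item itself.

```python
def receive(store, item_id, version, headline):
    """Store an incoming item and report how it relates to what is known.

    `store` maps an item ID to the newest (version, headline) seen so far.
    Returns "new", "update" or "ignored".
    """
    current = store.get(item_id)
    if current is None:
        store[item_id] = (version, headline)
        return "new"
    if version > current[0]:
        # Same ID, higher version: this is an update of a known story.
        store[item_id] = (version, headline)
        return "update"
    # Same or older version: nothing to do.
    return "ignored"


store = {}
first = receive(store, "abc123", 1, "Italy wins world championship")   # 11:32
second = receive(store, "abc123", 2, "Italy wins world championship")  # 13:43
stale = receive(store, "abc123", 1, "Italy wins world championship")   # late copy
print(first, second, stale)
```

Because the ID stays the same and only the version changes, the receiver can recognise the 13:43 item as an update of the 11:32 item rather than as an unrelated story.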


NewsML™ version 1

Summary
• NewsML provides a rich, well-designed and extensible set of metadata to enhance routing and selecting.
• NewsML allows items to be managed:
  – each item has a unique identifier
  – each item has a distinct version
• NewsML allows several pieces of content to be packaged into one item
  – content of various media types
• NewsML adds value to packaging:
  – “roles” identify why the content is there
  – groups of packages enhance the structure


NITF

• Feature “inline mark up”: one can add metadata to portions of the news text:

<p xml:lang="en">The weather was superb today in Norfolk, Virginia. Made me want to take my boat, manufactured by the <org value="IT123498312" idsrc="ISIN">Acme Boat Company</org>.</p>

This inline mark up may be used
– to add linked information to the final rendition, such as identifying information about entities (“what company is that exactly?”) or a link to a background story
– to add layout “recommendations” (e.g. emphasised)
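A receiving system can pick the inline metadata out of the paragraph with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on the example paragraph; the function name extract_orgs and the keys of the returned dictionaries are hypothetical choices for illustration.

```python
import xml.etree.ElementTree as ET

NITF_PARAGRAPH = (
    '<p xml:lang="en">The weather was superb today in Norfolk, Virginia. '
    "Made me want to take my boat, manufactured by the "
    '<org value="IT123498312" idsrc="ISIN">Acme Boat Company</org>.</p>'
)


def extract_orgs(fragment):
    """Return every <org> in the fragment with its identifier and ID scheme."""
    root = ET.fromstring(fragment)
    return [
        {"name": org.text, "id": org.get("value"), "scheme": org.get("idsrc")}
        for org in root.iter("org")
    ]


orgs = extract_orgs(NITF_PARAGRAPH)

# The readable text is still available by dropping the markup again.
plain_text = "".join(ET.fromstring(NITF_PARAGRAPH).itertext())
print(orgs)
print(plain_text)
```

The identifier and its scheme are what allow a final rendition to answer “what company is that exactly?”, for instance by linking the company name to a profile page.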


NITF

• Feature: “structure/layout mark up”

<table border="1">
<tr>
<!-- beach --><th></th>
<!-- day high and low --><th colspan="2">today</th>
<!-- tide times --><th colspan="2">tide</th>
<!-- forecast tomorrow --><th colspan="2">tomorrow</th>
<!-- forecast the next day --><th colspan="2">next day</th>
<!-- forecast the day after that --><th colspan="2">third day</th>
</tr>
<tr>
<!-- beach --><th>beach</th>
<!-- day high and low --><th>high</th><th>low</th>

….

This sequence of strange-looking code translates into a decent table (▼) and into an even more fashionable version on a layout system for newspapers.


NITF

Summary
• NITF is a kind of “HTML for all kinds of media”: it brings the ease of web publishing to print layout as well.
• Inline mark up makes it possible to link to reference information and to background information.
• Structure mark up conveys layout information from the maker of the news to its users.



IPTC metadata codes

• The challenge:
“The most effective communication occurs when all parties involved agree on the meaning of the terms being used.” (Fast, Leise & Steckel, “Boxes and Arrows”)

• The solution: IPTC’s controlled vocabularies =
  – managed lists of codes (= abstract notations)
  – with names (in different languages)
  – with explicit explanations (≈ encyclopaedia) (in different languages)
  – each of the 28 for a specific scope
  – to navigate content


IPTC NewsCodes

The common name for ALL controlled vocabularies maintained by the IPTC is

IPTC NewsCodes

(More info at www.newscodes.org)


IPTC NewsCodes

Currently the IPTC maintains 28 sets of NewsCodes

IPTC NewsCodes break out into groups:


IPTC NewsCodes

What the content is about

– Subject-NewsCodes: ~ 1300 terms at 3 levels

– SubjectQualifier-NewsCodes:men, women, age groups, sports specific qualifiers, …


IPTC NewsCodes

Formal attributes of the content

– Genre-NewsCodes like current, update, wrap-up, background, feature, interview, review …

– Scene-NewsCodes for photos like head-/half-/full-shot, interior/exterior, single/two/group …

– Importance-NewsCodes identifying 6 levels

– Location-NewsCodes are location qualifiers from “WorldRegion” to “Sublocation”


IPTC NewsCodes

Formal attributes of the media data

– Format (mimetype, mediatype)
– Encoding
– Encoders
– Physical Characteristics
– Colourspace


IPTC NewsCodes

Codes to manage news exchange

– (news) Provider-NewsCodes – already registered with the IPTC?
– Status-NewsCodes (usable, embargoed …)
– Priority-NewsCodes (9 levels)
– Urgency-NewsCodes (9 levels)
– Of interest to-NewsCodes identifying groups of the audience the content is aimed at
– Relevance-NewsCodes identifying journalistic relevance
– Role-NewsCodes to provide semantics to news package components (NewsML!)


IPTC NewsCodes

In depth …

IPTC’s huge taxonomy to describe content

The Subject NewsCodes


IPTC NewsCodes

The Subject NewsCodes

• Three level tree structure
• ~ 1300 terms in total
• 17 top level Subjects (Broadest term) for art, crime/law, disaster, economy/business, education, environment, health, human interest, labour, lifestyle, politics, religion, science/technology, social issues, sports, unrest/war, weather
• ~ 350 intermediate level terms (Narrow term, NT)
• ~ 900 third (= lowest) level terms (most NT)


IPTC NewsCodes

The Subject NewsCodes

Term structure: each term has …
• a Code: 8 digits (e.g. 17001000)
• a Name: language-specific string (e.g. weather/forecast or Meteorología/Pronósticos)
• an Explanation: short text describing the concept of this Subject-NewsCode
• term management data (versioning)
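The three-level tree can be read directly off the 8-digit code. The sketch below assumes a 2 + 3 + 3 digit split (top level, second level, third level, with unused levels filled with zeros); that split is an assumption added here for illustration and is not stated on the slide, and the function name subject_levels is hypothetical.

```python
def subject_levels(code):
    """Return the chain of codes from the top level down to `code`.

    Assumes digits 1-2 identify the top-level subject, digits 3-5 the
    second level and digits 6-8 the third level.
    """
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected 8 digits, got {code!r}")
    chain = [code[:2] + "000000"]          # top-level subject
    if code[2:5] != "000":
        chain.append(code[:5] + "000")     # second-level term
    if code[5:] != "000":
        chain.append(code)                 # third-level term
    return chain


forecast = subject_levels("17001000")     # weather/forecast
news_agency = subject_levels("04010004")  # Economy/Media/News agency
print(forecast)
print(news_agency)
```

Keeping codes as strings preserves leading zeros such as the one in 04010004, which an integer representation would lose.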


IPTC NewsCodes

The Subject NewsCodes

Where to apply …
Explicit tags are provided by:

• NITF
• NewsML
• IIM (aka “IPTC Headers” for images)
• “IPTC Core” Scheme for XMP (for Adobe CS products)


IPTC NewsCodes

The Subject NewsCodes

How to apply …
• manually by editors (pick lists)
• automatically by categorization engines
• “mixed mode”: suggested by categorizer, changed/approved by editor
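The “mixed mode” can be pictured as a small review step between categorizer and publication. The sketch below is purely illustrative: the function review, its parameters and the list of suggestions are hypothetical, and no particular categorization engine is implied.

```python
def review(suggested, rejected=(), added=()):
    """Apply an editor's decisions to a categorizer's suggested codes.

    Keeps the suggested order, drops rejected codes and appends codes the
    editor added by hand (without duplicates).
    """
    rejected = set(rejected)
    approved = [code for code in suggested if code not in rejected]
    for code in added:
        if code not in approved:
            approved.append(code)
    return approved


# Hypothetical categorizer output; the last code is a spurious suggestion.
suggested = ["13022000", "04003000", "15000000"]

# The editor removes the spurious code and adds one the engine missed.
final = review(suggested, rejected=["15000000"], added=["04010004"])
print(final)
```

The engine does the bulk of the work, while the editor keeps the last word on what is actually published with the item.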


IPTC NewsCodes

A Subject NewsCodes example:

“IPTC gave a presentation about their news technology at an ANSA/FIEG meeting in Rome”
would e.g. resolve to:
– 13022000 (Technology/IT)
– 04003000 (Economy/Computing and IT)
– 04010004 (Economy/Media/News agency)


IPTC NewsCodes

The Subject NewsCodes

You are in control: you can make your own subset

• select the Subject Codes you want to use for your agency
• select sets of Subject Codes for the various desks in your agency (e.g. economy, sports …)
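Building a subset for a desk can be as simple as filtering the agency's chosen codes. The sketch below selects by the leading two digits, using the prefixes that appear in the examples of this presentation (04 for economy, 13 for technology); the desk names, the mapping DESK_PREFIXES and the function desk_subset are hypothetical.

```python
# Hypothetical mapping of desks to the first two digits of their subjects.
DESK_PREFIXES = {"economy": "04", "technology": "13"}


def desk_subset(codes, desk):
    """Return the codes from the agency's selection that belong to one desk."""
    prefix = DESK_PREFIXES[desk]
    return [code for code in codes if code.startswith(prefix)]


# Codes the agency has decided to use (all taken from this presentation).
agency_codes = ["04003000", "04010004", "13022000", "17001000"]

economy = desk_subset(agency_codes, "economy")
technology = desk_subset(agency_codes, "technology")
print(economy)
print(technology)
```

A desk's pick list then only shows the terms its editors actually need, which keeps manual tagging quick.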


IPTC NewsCodes

The Subject NewsCodes

Additional refinement: Qualifiers
– primarily used for sports
– adds facets to the content like men/women, individual/team, indoor/outdoor …


Thank you for your time

www.iptc.org