Audiovisual Content Exploitation at FIA 15042010 NISV

Audiovisual content exploitation in the networked information society. Roeland Ordelman, Research & Development, Netherlands Institute for Sound and Vision.


Transcript of Audiovisual Content Exploitation at FIA 15042010 NISV

Page 1: Audiovisual Content Exploitation at FIA 15042010 NISV

Audiovisual content exploitation in the networked information society

Roeland Ordelman

Research&Development

Netherlands Institute for Sound and Vision


Page 2: Audiovisual Content Exploitation at FIA 15042010 NISV

contents

• NISV context: Images for the Future

• “Access”, an important keyword in the business models ...

• ... but what about access in practice

• Technology and user interaction: from a ‘laboratory view’ on users to drawing them into the development chain

Page 3: Audiovisual Content Exploitation at FIA 15042010 NISV

NISV context

• More than 700,000 hours of radio, television, documentaries, films and music, over 2 million photographs, and 20,000 objects such as cameras, televisions, radios, costumes and pieces of scenery.

• still growing:
  • digitally born television and radio programs made by the Dutch public broadcasting companies (video: 15K hours/year)
  • PROARCHIVE: archiving service
  • selection of (Dutch) user generated content

Page 4: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 5: Audiovisual Content Exploitation at FIA 15042010 NISV

Images for the Future

• Selection, restoration, digitization, encoding and storage of 137,000 hours of video, 20,000 hours of film, 124,000 hours of audio and more than three million photographs.

• One of the largest digitisation efforts in Europe

• Three goals:
  • Safeguarding heritage for future generations
  • Creating socio-economic value (“unlock the social and economic potential of the collections”)
  • Innovation: new infrastructure for strengthening knowledge economy

• To achieve these objectives, the cultural heritage sector is challenged to re-evaluate its business models

Page 6: Audiovisual Content Exploitation at FIA 15042010 NISV

Business model

• The total investment of this initiative amounts to 173 million euros

• A strong business model is necessary to support this kind of investment and prove that such an investment will result in long-term socio-economic returns

• The outcome of a Cost-Benefit analysis was positive: “The total balance of costs and returns of restoring, preserving and digitising audio-visual material (excluding costs of tax payments) will be between: 20+ and 60+ million.”

• Economic benefits:
  • Direct effects of the investment are revenues from sales, access for specific user groups, the repartition of copyright for the use of the material and so on.
  • The indirect effects concern the product markets and labour market.

• Social benefits:
  • conservation of culture, reinforcement of cultural awareness, reinforcement of democracy through the accessibility of information, increase in multimedia literacy and contribution to the Lisbon goals set by the EU

http://www.prestoprime.org/project/public.en.html

Page 7: Audiovisual Content Exploitation at FIA 15042010 NISV

Content exploitation: from content is king ...

Page 8: Audiovisual Content Exploitation at FIA 15042010 NISV

... to metadata rules

Page 9: Audiovisual Content Exploitation at FIA 15042010 NISV

MANUAL ANNOTATION

costly & limited

Page 10: Audiovisual Content Exploitation at FIA 15042010 NISV

Research on automatic annotation

• automatic information extraction based on:
  • visual features
  • information from audio

• crowdsourcing

• deploying collateral data sources:
  • subtitles, production scripts, meeting minutes, slides

Page 11: Audiovisual Content Exploitation at FIA 15042010 NISV

PROGRESS? YES!

Various (laboratory) showcases

Commercial systems (e.g., blinkx, Google)

Page 12: Audiovisual Content Exploitation at FIA 15042010 NISV

work in progress

• institutional: reorganisation of traditional archival workflows

• national: development of common services
  • OAI, Persistent Identifiers, ASR service, Vocabulary Repositories

• commercial: uptake by MNCs (Google and Microsoft) and SMEs

• individual: bring about a shift in the defensive attitude of content owners towards opening up their funded and protected archives (trust/reliability)

Page 13: Audiovisual Content Exploitation at FIA 15042010 NISV

Automatic annotation

• Participation in international research projects
  • VideoActive, MultiMATCH, VIDI-video, LiWA, P2P-Fusion, Sterna, EUScreen, PrestoPrime

• Collaboration agreement with Dutch research institutes
  • Researchers stationed at Sound and Vision
  • Provide data (TRECVID, VideoCLEF)

• Research environment: exact copy of iMMix production environment for testing new technology
  • speech recognition
  • video analysis
  • fingerprinting
  • linking of context data (web, program guide, production data)

Page 14: Audiovisual Content Exploitation at FIA 15042010 NISV

Annotation strategies

• crowdsourcing: video labeling game

• deploying collateral data sources: incorporation of subtitles

• automatic information extraction: speech recognition for radio, pilots with visual

• technology aided manual annotation: documentalist support

• linking to other information sources

Page 15: Audiovisual Content Exploitation at FIA 15042010 NISV

DISPARITY BETWEEN TECHNOLOGY AND USER NEEDS

media professionals

journalists

researchers

educators

general public

Page 16: Audiovisual Content Exploitation at FIA 15042010 NISV

User perspective

• Rapidly evolving networked information society

• Opening up

• Focus on community specific requirements
  • search needs
  • presentation/interaction needs

• Draw communities into libraries

Page 17: Audiovisual Content Exploitation at FIA 15042010 NISV

[Diagram of the Digital Archive. Labels: metadata (import, conversions); content (import, encoding); asset management. Sources: Digital Born (15,000 hours of video, 40,000 hours of radio); Digitising Legacy Material, Images for the Future (>250,000 hours of audio and video); user generated content and metadata. Access: exhibitions, public web access, broadcast professional, education.]

Page 18: Audiovisual Content Exploitation at FIA 15042010 NISV

"if it doesn't spread, it is dead" (Jenkins, 2009)

Page 19: Audiovisual Content Exploitation at FIA 15042010 NISV

Open Images

• Open media platform for online access to audiovisual archive material, available for free (creative) reuse

• Built by Sound and Vision & Knowledgeland

• Contributors include:

Page 20: Audiovisual Content Exploitation at FIA 15042010 NISV

Open, open, open

• Open source media platform (MMBase)

• Use of an open video codec (Ogg Theora)

• Use of the HTML5 <video> tag

• Use of an open API (OAI-PMH, Atom feeds)
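
As an illustration of what an open API based on OAI-PMH offers to a harvester, here is a minimal Python sketch. The verb `ListRecords` and the `metadataPrefix=oai_dc` argument come from the OAI-PMH 2.0 protocol; the endpoint `https://example.org/oai` and the sample response are assumptions made up for this example, since the slides do not give the actual address or any records.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the real OAI-PMH address is not given in the slides.
BASE_URL = "https://example.org/oai"

# Dublin Core namespace used by the oai_dc metadata format.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL (verb and argument names from OAI-PMH 2.0)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def titles(response_xml):
    """Return the text of every dc:title element in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [t.text for t in root.iterfind(".//dc:title", NS)]


# Invented sample response, trimmed to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example newsreel item</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL))
print(titles(SAMPLE))
```

A real harvester would fetch the URL over HTTP and follow resumption tokens for large result sets; both are left out to keep the sketch short.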

Page 21: Audiovisual Content Exploitation at FIA 15042010 NISV

Licence

• CC-BY-SA as preferred license

• 3,000 items from our ‘own’ collection

• ‘Internet quality’

Page 22: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 23: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 24: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 25: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 26: Audiovisual Content Exploitation at FIA 15042010 NISV

COMMUNITY SPECIFIC REQUIREMENTS

From document level search to fragment level search
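
The step from document level to fragment level search can be sketched roughly as follows, assuming time-coded text (for example from subtitles or speech recognition) is available per segment. The `Segment` type, the identifiers and the sample data are hypothetical and only serve the illustration; the point is that a hit returns a time span inside a programme rather than the programme as a whole.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    programme_id: str  # hypothetical identifier of the archived programme
    start: float       # segment start, in seconds from the programme start
    end: float         # segment end, in seconds
    text: str          # time-coded text, e.g. from subtitles or speech recognition


def search_fragments(segments, query):
    """Return (programme_id, start, end) for every segment containing all query words."""
    words = query.lower().split()
    hits = []
    for seg in segments:
        tokens = set(seg.text.lower().split())
        if all(w in tokens for w in words):
            hits.append((seg.programme_id, seg.start, seg.end))
    return hits


# Invented data for illustration.
SEGMENTS = [
    Segment("prog-1", 0.0, 12.5, "opening of the festival"),
    Segment("prog-1", 12.5, 30.0, "interview with the band"),
    Segment("prog-2", 0.0, 20.0, "news about the festival weather"),
]

print(search_fragments(SEGMENTS, "festival"))
```

A document level search would return only `prog-1` and `prog-2` here; the fragment level result also says where in each programme to start playback.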

Page 27: Audiovisual Content Exploitation at FIA 15042010 NISV


Broadcast professionals

In: Huurnink, Hollink, van Den Heuvel 2009 (submitted)

Page 28: Audiovisual Content Exploitation at FIA 15042010 NISV

User survey (broadcast professionals)

Page 29: Audiovisual Content Exploitation at FIA 15042010 NISV

Sound and Vision: Education

• Government and ‘Images for the Future’

• Earlier Initiatives

• ED*IT: latest development, completed with tools for teacher and student

• ED*IT has been tested and developed in cooperation with many schools

Page 30: Audiovisual Content Exploitation at FIA 15042010 NISV

ED*IT: Proposition

• One environment provides access to different controlled content databases (video, audio, photographs, articles, etc.)

• Editorial Staff contextualizes and enriches content for educational use

• Enriched with tools for student and teacher to edit content in an easy way

• For primary, secondary and vocational education

Page 31: Audiovisual Content Exploitation at FIA 15042010 NISV

ED*IT: Functionalities

Video & Content Editor

Dossier Maker

Cut Videoclips

E-Lesson Maker

Teacher Forum

Upload Files

Presentation Maker

Edit Photographs

Digital Paper Maker

Timeline Maker

Page 32: Audiovisual Content Exploitation at FIA 15042010 NISV

ED*IT: Facts & Figures

• Test Accounts: 2500

• Licence: 50 schools

• Licence: 50 educational departments

• Objective: same market share as Teleblik, which is 78%

Page 33: Audiovisual Content Exploitation at FIA 15042010 NISV

Researchers

• Verteld Verleden aims at establishing a shared information space on distributed Dutch Oral History collections:
  • distributed collections (harvested via OAI)
  • search & interlink collections via centralized search

• project goals:

1. provide demonstrator portal to show how technology could help researchers

2. acquire information on specific user requirements
  • search
  • collaboration
  • linking
  • privacy
  • dedicated work space

http://www.verteldverleden.org
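
The idea of searching distributed collections through one centralized index can be sketched as below. This is only an illustration under simple assumptions: records are taken to be already harvested into a dictionary per collection, and the collection names, record identifiers and descriptions are invented. It does not describe how the Verteld Verleden portal is actually built.

```python
from collections import defaultdict


def build_index(collections):
    """Map each lower-cased word to the set of (collection, record_id) pairs containing it."""
    index = defaultdict(set)
    for collection, records in collections.items():
        for record_id, description in records.items():
            for word in description.lower().split():
                index[word].add((collection, record_id))
    return index


def search(index, query):
    """Return the records, across all collections, that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Invented harvested records from two hypothetical collections.
HARVESTED = {
    "collection-a": {
        "a1": "interview about the harbour strike",
        "a2": "memories of school",
    },
    "collection-b": {
        "b1": "harbour workers interview",
    },
}

index = build_index(HARVESTED)
print(sorted(search(index, "harbour interview")))
```

Because every hit keeps its collection name, results from different institutions can be shown together and still link back to their source.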

Page 34: Audiovisual Content Exploitation at FIA 15042010 NISV

example VPRO radio interviews

QUOTE

“VISUAL RADIO”

Page 35: Audiovisual Content Exploitation at FIA 15042010 NISV

INTERACTION REQUIREMENTS

people expect easy interaction as in the 'every-day tools' they use on the web ...

• The Sound and Vision Experience: a crossover between a museum and an amusement park with various archive material, to make audiovisual heritage more accessible to the general public

Page 36: Audiovisual Content Exploitation at FIA 15042010 NISV

DRAW COMMUNITIES INTO LIBRARIES

Page 37: Audiovisual Content Exploitation at FIA 15042010 NISV

goals

• exploiting community tagging (tagging games, etc)

• exploring the wisdom of crowds by hooking up with user communities (e.g., everyone-as-commentator, unexpected experts)

• capturing relevant information from the internet and aligning this with archived items.

• finding new ways for communities to interact with the data.

Page 38: Audiovisual Content Exploitation at FIA 15042010 NISV

Technology perspective

Technology:

• provide anchor points for linking up with the ‘cloud’ (entity detection, segmentation, cross-collection SID, etc): people, places, events, topics, quotes, etc.
  • keywords: reliability, speed

• synchronization of UGC with AV documents

• users in the loop: UGC for adapting/training analysis tools

• early fusion of multiple modalities (vision, speech)

• technology aided annotation: Documentalist Support System

Page 39: Audiovisual Content Exploitation at FIA 15042010 NISV
Page 40: Audiovisual Content Exploitation at FIA 15042010 NISV

Crowdsourcing

14 minutes left for annotation

Play!

you score when somebody else uses the same term

fill in words that describe what you see or hear
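
The scoring rule on this slide (you score when somebody else uses the same term) can be sketched as follows. The 10-second matching window, the one point per match and the sample entries are assumptions for the illustration; the slide itself only states that agreement between players is rewarded.

```python
def score_tags(entries, window=10.0, points=1):
    """Score tags entered by players while watching the same video.

    entries: list of (player, tag, time_in_seconds).
    A tag earns `points` when a different player entered the same tag
    (case-insensitive) within `window` seconds. Both values are assumptions.
    """
    scores = {}
    for player, tag, t in entries:
        scores.setdefault(player, 0)
        matched = any(
            other != player
            and other_tag.lower() == tag.lower()
            and abs(other_t - t) <= window
            for other, other_tag, other_t in entries
        )
        if matched:
            scores[player] += points
    return scores


# Invented game log for illustration.
ENTRIES = [
    ("ann", "guitar", 12.0),
    ("bob", "Guitar", 15.5),
    ("ann", "stage", 40.0),
    ("bob", "crowd", 41.0),
]

print(score_tags(ENTRIES))
```

Tags that two independent players agree on are more likely to describe what is actually seen or heard, which is what makes the matched tags useful as annotations.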

Page 41: Audiovisual Content Exploitation at FIA 15042010 NISV

TAGGING GAME EXAMPLE

www.waisda.nl

Page 42: Audiovisual Content Exploitation at FIA 15042010 NISV

Play against ASR

Page 43: Audiovisual Content Exploitation at FIA 15042010 NISV

www.beeldengeluidwiki.nl

Page 44: Audiovisual Content Exploitation at FIA 15042010 NISV

Hollands Glorie op Pinkpop

Page 45: Audiovisual Content Exploitation at FIA 15042010 NISV

Wrap up

• value of archive is strongly related to access opportunities

• access is to a large extent technology driven

• but next to technology development we need to make a shift:
  • from a ‘laboratory view’ on users to drawing users and communities into the loop

• NISV is aiming towards this two-way strategy:
  • incorporate advanced access technology
  • discuss access requirements with the stakeholders