Case Study: Report from the Front Lines of Digital Asset Management at CNN
Kathy Christensen
CNN News Archives
August 2001
2
CNN Background
• Multiple products: CNN, Headline News, CNN International, CNN.com et al., CNN/SI, CNNfn, CNN en Español, Airport Network, Inflight
• CNN Library as central resource
– Information research
– Archive
– Footage licensing
3
What’s in the CNN archive?
• Type of material
– 10%: programs (Larry King, Crossfire, etc.)
– 90%: raw footage and edited cut items (packages, SOTs, VOs)
• Volume
– 150,000+ hours of footage in Atlanta plus additional footage in bureaus
– 1,000,000+ items in Atlanta central catalog, plus 600,000 across bureau catalogs
• Growth
– 2,000 items archived per week in Atlanta, culled from many times more incoming items
• 1/3 of items per day are cut (3 hrs)
• 2/3 of items per day are raw (90 hrs)
– 30,000 hours archived in 2000
4
Who are the archive clients?
• CNN
– daily news - TV and Interactive
– documentary - TV and Interactive
– other (Sales, Marketing, PR, Legal, etc)
• AOL-TW companies (TNT, TBS, Warner Bros)
• External customers (Imagesource clients)
5
The “Archive Project” (aka core of CNN’s digital future)
• Purpose
– Preserve assets
– Extend usage of assets
– Create efficiencies
– Facilitate new business opportunities
– Create media management framework for the digital CNN
6
Pre-Digital Scenario
[Diagram. Labels: Tape Library; Acquisition; Contribution; Receiving (x3); Field Production; Studio Production; Production (x3); Distribution (x3)]
7
Digital Scenario
[Diagram. Labels: Global Content Management and Storage System; Acquisition; Editing; Contribution; Field Production; Studio Production; Production (x5); Distribution; HDTV Distribution; Enhanced Distribution; Interactive Distribution]
8
• System goals and challenges
– Multiple resolutions captured simultaneously - to serve broadcast, edit and Internet
– Automatically generate as much meaningful cataloging data as possible - the technology continues to improve
– Support the necessary human cataloging with powerful tools
– Support retrieval needs of diverse user communities
9
• Our Approach
– Assemble a diverse internal team with multidisciplinary expertise
• R&D, Engineering, IT, Library Science, Users
– Co-developers with Sony and IBM
• Key Principles
– Custom solution not desired
– Focus on interoperability and standards
– Phased development
• get started and build on it
10
Users drive cataloging & search requirements
• Production usually demands video “of” versus stories “about”
– Automatically captured narrative track excellent for finding “about” but often misses the “of”-- what do we see in the footage?
– Special challenge of raw video -- b-roll often has no track to capture
• High-pressure, fast turn-around, 24-hour environment requires highly precise results, extremely quickly
• Long-term documentary production can tolerate more browsing but still requires reliably comprehensive retrieval
• News domain requires reliance on accuracy of editorial metadata - bad data and inadequate search systems equal journalistic problems
11
Enablers of accuracy, precision, speed, thoroughness
• Controlled vs Free-form Data Entry - build data entry aids which support consistent entry
• Adequate size for keyword and video description fields
• Controlled classification terms with a mechanism for dynamically updating the classifications
• Fielded Tags for
– “best of” video
– about but not seen
– natural sound
• Flexibility in search approaches - free-text, controlled vocabulary, field-specific, user control over precision vs fuzziness, user control over tracks to include, user control over weighting and display of results
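The enablers above can be illustrated with a minimal sketch of a catalog record. It assumes a tiny in-memory catalog; the class `CatalogItem`, the `CONTROLLED_TERMS` set, and the `search` function are hypothetical names invented for this illustration and do not describe CNN's actual system. The sketch shows controlled terms being validated on entry, fielded tags stored separately from the free-form description, and a search that combines free-text, controlled-vocabulary, and field-specific filters.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary. It is a plain set, so catalogers
# could extend it at runtime (the "dynamic updating" mechanism).
CONTROLLED_TERMS = {"ELECTIONS", "WEATHER", "SPORTS"}


@dataclass
class CatalogItem:
    item_id: str
    description: str                                  # free-form video description
    terms: set = field(default_factory=set)           # controlled classification terms
    best_of: bool = False                             # fielded tag: "best of" video
    about_not_seen: set = field(default_factory=set)  # subjects discussed but not shown
    natural_sound: bool = False                       # fielded tag: natural sound

    def add_term(self, term: str) -> None:
        """Accept only terms that exist in the controlled vocabulary."""
        normalized = term.strip().upper()
        if normalized not in CONTROLLED_TERMS:
            raise ValueError(f"not a controlled term: {term!r}")
        self.terms.add(normalized)


def search(items, text=None, term=None, best_of_only=False):
    """Combine free-text, controlled-vocabulary, and field-specific filters."""
    results = []
    for item in items:
        if text and text.lower() not in item.description.lower():
            continue
        if term and term.upper() not in item.terms:
            continue
        if best_of_only and not item.best_of:
            continue
        results.append(item.item_id)
    return results


# Illustrative records (invented for this example).
a = CatalogItem("A1", "Hurricane makes landfall, wide shots of surf", best_of=True)
a.add_term("weather")
b = CatalogItem("B2", "Anchor discusses hurricane forecast",
                about_not_seen={"hurricane"})
b.add_term("weather")

print(search([a, b], text="hurricane"))                     # ['A1', 'B2']
print(search([a, b], text="hurricane", best_of_only=True))  # ['A1']
```

Keeping "about but not seen" in its own field is what would let a producer who needs video "of" a subject exclude items that only discuss it.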
12
Technology strengths supplement human weaknesses
• Automatic capture of closed-caption text improves retrieval of small, specific portions of programming about something -- a viewer need which is not easily met now.
• Voice-to-text transcription, even at 60% accuracy, fills a not easily met need to find specific soundbites in raw speeches, interviews, hearings, etc.
• Video to video matching supports identification of permutations of the same video piece across the catalog
13
Technology strengths supplement human strengths
• Making sense of images, putting them into editorial context, and attaching words so they may be retrieved
– Automatic scene change detection facilitates speedy review of item by human cataloger
– Face recognition software may not know who a particular face is, but can know that the video contains a face which a human can then identify
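As a rough illustration of the automatic scene change detection mentioned above, the sketch below flags a cut wherever the brightness histogram of a frame differs sharply from that of the previous frame. This is one common textbook approach, not necessarily the method used in the system described here. Frames are assumed to be flat lists of grayscale pixel values (0-255), and the function names, bin count, and threshold are all assumptions chosen for the example.

```python
def histogram(frame, bins=8):
    """Count pixels per brightness bin; frame is a flat list of 0-255 values."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return counts


def scene_changes(frames, threshold=0.5):
    """Return the indices of frames that appear to start a new scene.

    The difference between consecutive histograms is normalized to 0..1:
    0 means identical brightness distributions, 1 means no overlap at all.
    """
    changes = []
    for i in range(1, len(frames)):
        prev = histogram(frames[i - 1])
        curr = histogram(frames[i])
        total = len(frames[i - 1]) + len(frames[i])
        diff = sum(abs(p - c) for p, c in zip(prev, curr)) / total
        if diff > threshold:
            changes.append(i)
    return changes


# Synthetic 16-pixel frames: two dark, two bright, one dark.
dark = [10] * 16
bright = [240] * 16
print(scene_changes([dark, dark, bright, bright, dark]))  # [2, 4]
```

A cataloger could then jump straight to the flagged frames instead of watching the item from start to finish.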
14
Technology strengths also supplement technology weaknesses
• Speech-to-text weakness - some of the data most likely to be searched on: names of people, companies, places
– Phonetic-based search strengths can cover speech-to-text search weakness
• Phonetic track useful for searching but doesn’t provide textual cataloging data
– Speech-to-text transcription useful as representation of the content of the asset
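To make the complementarity above concrete, here is a minimal sketch of how a phonetic match can recover a name that speech-to-text has misspelled. It uses the classic Soundex code applied to transcript text as a stand-in; the phonetic search described on the slide presumably operates on the audio itself, and the actual algorithm is not stated. The function `phonetic_search` and the sample transcript are invented for the example.

```python
# Soundex digit for each consonant; vowels, H, W, and Y have no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (result + "000")[:4]


def phonetic_search(query, transcript_words):
    """Return positions of transcript words that sound like the query."""
    target = soundex(query)
    return [i for i, w in enumerate(transcript_words) if soundex(w) == target]


# Hypothetical transcript in which the name "Smith" came out as "smyth".
words = "senator smyth spoke at the hearing".split()
print(phonetic_search("Smith", words))  # [1]
```

An exact text search for "Smith" would miss this transcript, while the phonetic match finds it. The transcript text still serves as a readable representation of what the asset contains.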
15
Food for thought …
• Responsibilities
– to the parent company
– to the user communities
– to the rightsholders
– to posterity???
• This means thinking about
– Physical integrity of the content (quality, lossless conversions, standards, migration)
– Intellectual integrity of content…ethics