Audiovisual collections, the spoken word and user needs of scholars in the Humanities
![Page 1: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/1.jpg)
Audiovisual collections, the spoken word and user needs of scholars in the Humanities
Observations based on related work in The Netherlands, 2005-2012
![Page 2: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/2.jpg)
E-research
• New and/or rapid ways to gain knowledge
• Digital resources and information technology
• Big data & data mining (social sciences)
• Digital Humanities / E-Humanities
• Digitization, Infra, Tools, Standards
• CLARIN.eu / DARIAH.eu
![Page 3: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/3.jpg)
Emerging focus on audiovisual
• Multi-modal, multi-semiotic:
  – multiple layers of meaning / interpretation
  – E.g., “quote + intonation + images + discourse”
• New dimensions for scholarly research
• Large investments in digitization:
  – Images for the Future: 200k hours of film, video and audio
  – Various digitization projects for scientific collections
![Page 4: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/4.jpg)
METADATA RULES?
![Page 5: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/5.jpg)
Metadata & Annotations
• Annotations:
  – General (document level)
  – Specific (segment level)
• Metadata: typically sparse / document level
• Requirements dependent on research field
• Annotation generation:
  – Manual (Individual, Teams, Crowd)
  – Automatic: (un/lightly) supervised
![Page 6: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/6.jpg)
Monitoring radio transcriptions
INGEST SUPERVISION // ARCHIVIST SUPPORT:
Quickly assess quality of ASR
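A common way to put a number on ASR quality is the word error rate (WER) against a reference transcript. The slide does not say which measure the monitoring tool uses, so the following is only a minimal sketch under that assumption; the function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of five reference words is missing.
wer = word_error_rate("the queen addressed the people",
                      "the queen addressed people")
print(f"WER: {wer:.2f}")
```

A single number like this hides where the errors fall, which is why a quick visual check of the transcript can still be useful alongside it.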
![Page 7: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/7.jpg)
Spoken word search 2005-2012
• Wide range of projects in various domains
  – Radio
    • Daily ingest: selection of programs
    • Woord.nl: public access to radio content
  – Historical video collections with sparse data
  – “Oral History”
• Development of an ASR service for cultural heritage institutions
![Page 8: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/8.jpg)
First experiment with ASR for the humanities: access to the personal recordings of Dutch novelist W.F. Hermans
![Page 9: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/9.jpg)
Access to an interview collection with World War II camp survivors
![Page 11: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/11.jpg)
INTERVIEWS ON THE BOMBARDMENT OF ROTTERDAM
Alignment of transcripts for indexing
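The slide does not say how the alignment was done. As a rough illustration of the idea, the sketch below aligns an existing manual transcript with time-stamped ASR output at the text level and copies the ASR start times onto the matching transcript words, so that the transcript can be indexed by time. This is an assumption-laden simplification (production systems typically use acoustic forced alignment); the function name and the sample data are hypothetical.

```python
from difflib import SequenceMatcher


def align_transcript(transcript_words, asr_words):
    """Return (word, start_time or None) for every transcript word.

    asr_words is a list of (word, start_time_in_seconds) pairs.
    Words without a matching ASR token keep None as their time.
    """
    ref_tokens = [w.lower() for w in transcript_words]
    asr_tokens = [w.lower() for w, _ in asr_words]
    times = [None] * len(transcript_words)
    matcher = SequenceMatcher(a=ref_tokens, b=asr_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Each block is a run of identical tokens in both sequences.
        for k in range(block.size):
            times[block.a + k] = asr_words[block.b + k][1]
    return list(zip(transcript_words, times))


# Hypothetical data: the ASR output misrecognises "was" as "is".
transcript = ["The", "city", "was", "bombed", "in", "May"]
asr = [("the", 0.0), ("city", 0.4), ("is", 0.8),
       ("bombed", 1.1), ("in", 1.6), ("may", 1.8)]
for word, start in align_transcript(transcript, asr):
    print(word, start)
```

Unmatched words could be given interpolated times from their neighbours; that step is left out here to keep the sketch short.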
![Page 12: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/12.jpg)
Access to Radio interviews
Experiments with various types of access and result presentation: speaker changes, speaking rate, search strategies, word clouds
![Page 13: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/13.jpg)
Access to Historical Speeches: Alignment & Linking
![Page 14: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/14.jpg)
ACCESS TO DISTRIBUTED ORAL HISTORY COLLECTIONS
• Infrastructure for searching collections at various institutes in The Netherlands
• Harvesting of Metadata (OAI-PMH)
• ASR as a service
• Evaluated with Oral Historians
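To give an impression of what metadata harvesting over OAI-PMH involves, here is a minimal sketch that builds a `ListRecords` request and parses a response in the standard `oai_dc` format. The base URL, the record identifier and the sample response are invented for illustration and do not describe the actual infrastructure; no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hypothetical response from a hypothetical repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:interview-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Interview 1</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
print(records, token)
print(list_records_url("https://example.org/oai", token))
```

A harvester would repeat the request with each returned resumption token until the repository stops returning one.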
![Page 15: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/15.jpg)
Observations on speech search
• Large variation in ASR performance
• Performance (and decisions on use) should be assessed in context of application: audiovisual search
• Usefulness in audiovisual search should be assessed in context of use scenarios
• Use scenarios come with specific presentation/visualization requirements
![Page 16: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/16.jpg)
Usefulness of results
• Perception of usefulness
  – Usefulness in context of search/data exploration
  – Educate / Expectation management
  – Guide searching
  – Show why (errors, confidence, trust-levels, cut-offs)
  – Focus on research needs
• Improve on ASR quality
  – Educate: how to record an interview (Oral History)
  – Use available textual resources (alignment, vocab optimization)
• Improve on search application
  – Visualization
  – Result presentation
    • documents versus segments
    • combination of information sources
    • cross/within-collection linking
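One simple way to "show why" in a result display is to mark words whose ASR confidence falls below a cut-off, so that users can see which parts of a transcript to distrust. The sketch below assumes the recognizer provides a per-word confidence between 0 and 1; the cut-off value, the bracket notation and the sample words are arbitrary choices for illustration.

```python
def mark_low_confidence(words, cutoff=0.6):
    """Render (token, confidence) pairs, flagging tokens below the cut-off."""
    return " ".join(
        token if confidence >= cutoff else f"[{token}?]"
        for token, confidence in words
    )


# Hypothetical ASR output with per-word confidence scores.
asr_words = [("queen", 0.93), ("wilhelmina", 0.41), ("london", 0.88)]
print(mark_low_confidence(asr_words))
```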
![Page 17: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/17.jpg)
Methodology (1)
• E-research is an intervention in current practices!
• Promise:
  – increased efficiency, relevance, novelty
• Interest of scholars:
  – tools that facilitate or simplify existing practice (RIN report, 2011)
• Co-development ICT-researchers & scholars to adjust expectations. Examples:
  – Finding more in less time may not be a goal in itself for humanities researchers
  – Deep engagement with primary texts versus results on the segment level
![Page 18: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/18.jpg)
Methodology (2)
• 4 stages:
  1. Preliminary archival search
    • Browsing as a general interest
    • Purpose driven (checking details, complementary resources)
    • Item-oriented (finding the first mention of something)
    • Collection-oriented (thematic, source, person, event)
  2. Content analysis
    • Visualization, compression, aggregation
    • (optionally) go back to (1)
  3. Presentation and dissemination
    • Enhanced publications (persistent identifiers on segment level)
  4. Curation
    • Trusted digital repository
• (spoken) search scenarios: facilitate these stages
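Stage 3 mentions identifiers on the segment level without saying how they are formed. One possible approach, shown here purely as an assumption, is to append a temporal fragment in the style of the W3C Media Fragments URI syntax (`#t=start,end`, in seconds) to the identifier of the whole recording. The base identifier below is hypothetical.

```python
def _seconds(value: float) -> str:
    """Format seconds with up to millisecond precision, no trailing zeros."""
    return f"{value:.3f}".rstrip("0").rstrip(".")


def segment_uri(base_uri: str, start: float, end: float) -> str:
    """Append a temporal media fragment (#t=start,end) to a document URI."""
    if start < 0 or end <= start:
        raise ValueError("need 0 <= start < end")
    return f"{base_uri}#t={_seconds(start)},{_seconds(end)}"


# Hypothetical identifier for an interview recording.
print(segment_uri("https://example.org/id/interview-1", 12.5, 90))
```

Whether such a fragment survives resolution of a persistent identifier depends on the resolver and the player, so this would need checking in any real deployment.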
![Page 19: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/19.jpg)
ASR for research
• Triple-A: Accessible, Affordable, Accurate
• Individual researchers sending files to ASR?
• Embedded in suite of research tools?
• What about integration in search applications?
  – Stagnation due to inadequate local infrastructures
• Variation across collections requires ‘tailor-made’ approaches: e.g., speaker adaptation, vocabulary adaptation, alignment, collection of related resources (information trail)
![Page 20: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/20.jpg)
ASR service
Model of use:
• Free test bundle (10h)
• Various small/medium/large bundles
• Reduced costs (only hardware and maintenance)
• Management by CH body
• Maintenance by industry partner
Upload: via http, ftp, api
Account information
![Page 21: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/21.jpg)
Dutch Queen Wilhelmina addressing the Dutch people from London during WWII
![Page 22: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/22.jpg)
![Page 23: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/23.jpg)
![Page 24: Audiovisual collections, the spoken word and user needs of scholars in the Humanities](https://reader031.fdocuments.net/reader031/viewer/2022030303/587bc3891a28ab6c3c8b4e61/html5/thumbnails/24.jpg)