Transcribe NLS: Crowdsourcing at the National Library of Scotland
Scotland's National Collections and the Digital Humanities
Workshop 3: Research and/as Engagement
12 September 2014
Ines Byrne, Digital Collections Specialist, National Library of Scotland
Transcribe NLS
Question:
What is crowdsourcing?
Answer: More than just free labour!
Actively engaging the public in our collection development
Work we could never resource ourselves
Transcribe NLS
Plans at NLS:
manuscript transcriptions - Transcribe NLS
printed text (OCR) corrections
indexing/tagging
crowdsourcing transcriptions around the world
National Archives USA http://www.archives.gov/citizen-archivist/transcribe/
University of Iowa http://diyhistory.lib.uiowa.edu/
National Archives of Australia http://transcribe.naa.gov.au/
The Smithsonian https://transcription.si.edu/
Transcribe Bentham http://www.transcribe-bentham.da.ulcc.ac.uk
ScotlandsPlaces http://www.scotlandsplaces.gov.uk/transcribe
What can we learn from them? Crowd activity level
only 21% of registered transcribers produce transcriptions
70% of transcriptions come from 3% of active transcribers
on average, one transcriber contributes 2 working hours per week
on average, one transcriber works on 6 pages per hour
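As a back-of-envelope illustration, the averages above can be combined into a rough weekly throughput estimate. This is only a sketch: it assumes the 2 hours per week applies to each active transcriber, and the registration count and page total in the usage note are hypothetical, not NLS figures. Because most output comes from a small core of transcribers, real throughput may differ considerably from this average.

```python
# Averages quoted on the slide (taken from other crowdsourcing projects).
ACTIVE_SHARE = 0.21   # share of registered transcribers who produce transcriptions
HOURS_PER_WEEK = 2    # average hours one transcriber contributes per week
PAGES_PER_HOUR = 6    # average pages one transcriber works on per hour


def weekly_pages(registered: int) -> float:
    """Rough pages per week from a given number of registered users.

    Assumption: the hours-per-week average applies to active transcribers only.
    """
    active = registered * ACTIVE_SHARE
    return active * HOURS_PER_WEEK * PAGES_PER_HOUR


def weeks_to_finish(total_pages: int, registered: int) -> float:
    """Rough number of weeks needed to work through total_pages."""
    rate = weekly_pages(registered)
    if rate <= 0:
        raise ValueError("need at least one active transcriber")
    return total_pages / rate
```

For example, with a hypothetical 100 registered users, about 21 would be active, each covering roughly 12 pages a week, so `weekly_pages(100)` comes to about 252 pages per week; a hypothetical 2,520-page collection would then take around 10 weeks.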
What can we learn from them? What the transcriber wants
guidelines on how to use the tool
be able to flag up issues with more experienced transcribers
personal activity logs to keep track of their activity and history
What can we learn from them? What is off-putting?
extensive and complicated instructions
hard-to-read handwriting
technical issues with the tool
complexity of mark-up encoding
What can we learn from them? How to keep the crowd motivated
• feeling trusted and respected
• recognition – “show us how we fit into the big picture”
• the outcome – “tell us what your aim is – we want to help achieve it”
• clear instructions – “we will work better if we know what you want from us”
• ability to communicate with other transcribers
• constantly adding more material to be transcribed – “keep us busy”
What can we learn from them? Lessons learned
• reply to enquiries promptly or else you lose your transcribers
• the public have more spare time than you can imagine
• moderation raises quality but is time-consuming (Bentham’s 2 moderators could have produced 2.5 times more transcriptions than the crowd did)
• one system doesn’t fit all content: adapt your tool to fit your material
• the majority of transcribers are not very experienced
• invest in training videos
• go live quietly to be able to deal with all the issues arising
• no transcriber felt exploited
What is it that we want to build?
Easy to use: no complex mark-up encoding
No resources for moderation: self-moderation (loss of quality control)
High-level control: registration required
Transcribe NLS
Materials for transcription:
Marjory Fleming diary
mountaineering diaries and notebooks
recipe books
genealogy materials
It will be great!
Well, let’s define “great”…