Post on 15-Jul-2015
British Library Labs and Lessons for the Library
Mahendra Mahey, Manager of British Library Labs
Ben O’Steen, Technical Lead, British Library Labs
21st Century Curatorship Talks
Thursday 18 September 2014, 15:00 - 16:00
Meeting Room K, British Library, St Pancras, London
http://labs.bl.uk #bl_labs labs@bl.uk
Funded by the Andrew Mellon Foundation
Stakeholders involved in Labs
[Diagram labels: BL Labs; British Library (Digital Scholarship, Digital Research, Access & Reuse Group, Curators / Researchers, Developers / Technical Staff, Digital Content, ©); Universities & wider, e.g. companies, start-ups, independent scholars etc. (Researchers, Developers); United Kingdom; The World]
How Labs works…
[Diagram labels: Researchers and developers with a question / idea; engagement with BL Labs through Competition, Contact, Events, Meetings and visits; Experimenting with our digital collections (BL Digital Collections / Data, Other Digital Collection / Data); Data Driven Projects; Outputs from engagement: Data, Open Software, Publications, Tools & services to support Digital Scholarship, Case Studies, Audience Research]
http://labs.bl.uk/Digital+Collections
Example Digital research methods
http://labs.bl.uk/Launch+Event (has some examples from researchers)
Corpus analysis tools
Text Mining
Visualisations
Location based searching
Geotagging
Annotation
Natural Language Processing
Using Application Programming Interfaces for datasets e.g. Metadata, Images
Transcribing
Crowdsourcing / Human Computation
The winners of the Labs 2013 competition
Two entries were chosen in June 2013
Both winners worked in residence with Labs from July to October 2013 to complete their projects
Pieter Francois and Dan Norton each received a cheque for £2000 in November 2013 as winners of the first British Library Labs Competition
Photo: Pieter Francois (left) and Dan Norton (right) with Adam Farquhar (middle), Head of Digital Scholarship
Mixing the Library: The Disc Jockey & the Digital Collection
http://www.tompro.co.uk
http://www.ablab.org/shetland
http://www.ablab.org/pd/di/
Prototype design
Annotation
Preview ‘item’
Selected ‘right’ channel ‘item’
Selected ‘left’ channel ‘item’
Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels
‘Play back’ of ‘items’ (Blue) and annotations (Yellow)
Basic functioning prototype: http://212.71.253.54:8000/a
Living Lab: Library of the Future, see: http://alturl.com/284zw
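The data model described on this slide (collection ‘stalks’ made of ‘items’, each ‘item’ a URL, which can be ‘shuffled’ and sent to a ‘left’ or ‘right’ channel) could be sketched roughly as follows. This is only an illustration under assumptions: the names `Stalk`, `Mixer` and `send`, and the example URLs, are hypothetical and are not taken from the prototype's code.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Stalk:
    """A collection 'stalk': an ordered list of 'items', each one a URL."""
    items: list = field(default_factory=list)

    def shuffle(self, seed=None):
        """Reorder the items in place; a seed makes the order repeatable."""
        random.Random(seed).shuffle(self.items)


@dataclass
class Mixer:
    """Two channels, 'left' and 'right', each holding one selected item."""
    channels: dict = field(default_factory=lambda: {"left": None, "right": None})

    def send(self, stalk, index, channel):
        """Send the item at `index` in `stalk` to the named channel."""
        if channel not in self.channels:
            raise ValueError(f"unknown channel: {channel!r}")
        self.channels[channel] = stalk.items[index]
        return self.channels[channel]


# Hypothetical item URLs, for illustration only.
stalk = Stalk([
    "http://example.org/item/1",
    "http://example.org/item/2",
    "http://example.org/item/3",
])
stalk.shuffle(seed=1)

mixer = Mixer()
mixer.send(stalk, 0, "left")
mixer.send(stalk, 1, "right")
print(mixer.channels)
```

Keeping each item as a plain URL string means a stalk can mix content from any collection that exposes stable links.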
Pieter Francois
https://www.youtube.com/watch?v=xK80Jy0ijkA
Winners of 2014 Competition
Victorian Meme Machine: Bob Nicholson of Edge Hill University
Text to Image Linking Tool (TILT): Anna Gerber and Desmond Schmidt from Queensland University
Blog posting: http://goo.gl/iJy0aT
YouTube: http://goo.gl/mBTlk2
Blog: http://goo.gl/ofpNosl
YouTube: http://goo.gl/iseHTE
Bob Nicholson
https://www.youtube.com/watch?v=zK95lzaPNp0
Anna Gerber and Desmond Schmidt
https://www.youtube.com/watch?v=Bl4bjZSJ4cY&feature=youtu.be
Text to Image Linking Tool (TILT)
The story of one digital collection…
The story of 68,000 books and 1 million images
and Flickr
Image: Artwork by Alicia Martin
http://mechanicalcurator.tumblr.com
http://www.flickr.com/photos/britishlibrary/
Extracting Images from OCR
Digitisation
Optical Character Recognition
<?xml version="1.0" encoding="UTF-8" ?>
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mets="http://www.loc.gov/METS/" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version18/mets.xsd info:lc/xmlns/premis-v2
Image snipped out algorithmically from ALTO XML
Image taken from page 207 of 'London and its Environs. A picturesque survey of the metropolis and the suburbs ... Translated by Henry Frith. With ... illustrations'
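The slides do not show the extraction code itself, so the following is only a minimal sketch of the general idea: ALTO XML records layout blocks with position attributes, and an illustration block's coordinates give a crop box for the page image. The sample document below is hand-written and hypothetical, not a real British Library file, and it assumes the coordinates are in pixels.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written ALTO-like sample (hypothetical, for illustration only).
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P207" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="200" VPOS="150" WIDTH="1600" HEIGHT="400"/>
        <Illustration ID="IL1" HPOS="300" VPOS="700" WIDTH="1400" HEIGHT="900"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
""".strip()


def illustration_boxes(alto_xml):
    """Return (id, left, top, right, bottom) for each Illustration block."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        # Compare the local tag name so any ALTO namespace version matches.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "Illustration":
            continue
        left = int(float(element.get("HPOS")))
        top = int(float(element.get("VPOS")))
        width = int(float(element.get("WIDTH")))
        height = int(float(element.get("HEIGHT")))
        boxes.append((element.get("ID"), left, top, left + width, top + height))
    return boxes


print(illustration_boxes(SAMPLE_ALTO))
```

A box in this (left, top, right, bottom) form could then be handed to an image library's crop function, for example Pillow's `Image.crop`, to snip the illustration out of the page scan.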
Face Recognition of 19th Century Faces
The face-recognition algorithm worked better on women’s faces than on men’s
The Mechanical Curator
http://mechanicalcurator.tumblr.com
• #similar_to_77576796197_published_date
• #similar_to_77576796197_slantyness
• #similar_to_77576796197_bubblyness_x
• #similar_to_77576796197_bubblyness_y
• #new_train_of_thought
Image from ‘A Lost Estate’ by Mary E. Mann, Volume: 02, Page: 91, 1889, London, Bentley & Son
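The tags listed on this slide follow a regular shape, so they can be split into parts by a program. The sketch below assumes, from that shape alone, that the number identifies an earlier post and the suffix names the feature on which two images were judged similar; the slides do not confirm that reading, and the function name `parse_tag` is hypothetical.

```python
import re

# Matches tags such as "#similar_to_77576796197_bubblyness_x".
SIMILAR_TAG = re.compile(r"^#?similar_to_(?P<ref>\d+)_(?P<feature>\w+)$")


def parse_tag(tag):
    """Return (reference_id, feature) for a similarity tag, else None."""
    match = SIMILAR_TAG.match(tag.strip())
    if match is None:
        return None
    return match.group("ref"), match.group("feature")


tags = [
    "#similar_to_77576796197_published_date",
    "#similar_to_77576796197_slantyness",
    "#similar_to_77576796197_bubblyness_x",
    "#similar_to_77576796197_bubblyness_y",
    "#new_train_of_thought",
]
for tag in tags:
    print(tag, "->", parse_tag(tag))
```

Tags that do not fit the pattern, such as `#new_train_of_thought`, come back as `None`, which makes it easy to separate the start of a new sequence from the similarity links.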
Flickr Commons – 1,020,418 images!
http://www.flickr.com/photos/britishlibrary/
Each image has a URL
Some metadata, but you can add tags!
Flickr has an API, so researchers and developers can build apps and query the data
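As a rough illustration of querying the data through the Flickr API, the sketch below builds a request URL for the `flickr.photos.search` method of Flickr's REST endpoint. It only constructs the URL and does not call the network. The API key, the account identifier and the example tags are placeholders that you would replace with real values.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, user_id, tags, per_page=10):
    """Build a flickr.photos.search URL for one account, filtered by tags."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "user_id": user_id,          # the account whose photos are searched
        "tags": ",".join(tags),      # comma-separated list of tags
        "tag_mode": "all",           # require every tag to be present
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,         # ask for plain JSON, not JSONP
    }
    return FLICKR_REST + "?" + urlencode(params)


# Placeholder values, for illustration only.
url = build_search_url("YOUR_API_KEY", "ACCOUNT_NSID", ["map", "london"])
print(url)
```

Fetching this URL with any HTTP client returns a JSON document describing the matching photos, which an app can then page through or combine with the Library's own metadata.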
Flickr in numbers
>190,000,000 image views!!! (from launch on 13 December 2013 to 18 September 2014)
553 images seen fewer than 10 times
103,000 tags added
Labs involved a number of funded research projects & 4 grassroots crowdsourcing efforts.
Tagging a million images - Metadata games and other projects
http://www.metadatagames.org/
Games developed using Flickr sets
http://goo.gl/j6fxac
Cardiff University’s Lost Visions Project
Flickr coverage in the media!
Opportunities – increasing traffic to Library services
You can purchase a ‘High Res’ copy
View in the Library Item Viewer
Download .pdf
All illustrations in book
Other illustrations in books published in same year
View the item in the Library Catalogue
Tags auto generated
User generated tag
Grouping for image
Creative Uses
http://goo.gl/qPPgxX
http://goo.gl/OH6FSn
Jura’s Sound Skateboard
Lessons learned…in getting digital content
• A filter was needed to choose content, given the amount of content and the size and duration of the project
• Getting the story behind the collection was crucial, usually from the curator
• Sometimes access and reuse requests are needed for content
• Getting the curators on board (engaging them with the competition, asking them to be judges) and rewarding them afterwards is important (e.g. technical quick wins from working with the Labs technical lead)
Lessons learned…metadata
• Metadata cleansing is needed: there are duplicate records, and records are not always linked when updated
• Lots of digital content doesn’t have metadata; initiate crowdsourcing perhaps?
• There is limited subject classification in the 19th century metadata for books
Lessons learned…technical
• Some content is only available on site due to licensing restrictions
• Labs highlights where systems don’t join up, and this can be flagged internally
• Some restrictions mean that workarounds have to be developed for researchers to work with the content
Lessons learned…human
• Working on site brings internal system and process challenges; the issues are not insurmountable and workarounds are possible
• Starting a dialogue with the right person is the most important lesson I learned about the Library (obvious but true)
Poster given at Digital Humanities 2014
http://figshare.com/articles/Interoperable_Infrastructures_for_Digital_Research_A_proposed_pathway_for_enabling_transformation/1092550
Adam Farquhar, James Baker
What do digital researchers want?
• Scalable access to large quantities of digital content
• To work with all types of content - text, image, audio, video
• To work the way they want to, use any workflow, address any sort of problem
• To work across collections irrespective of content owner or licence terms
What do researchers get?
• Restrictive, prospective and incompatible infrastructures
• Assets distributed unevenly across organisations and systems
Proposed pathway
• Use off-the-shelf technologies and services
• Bring computational capacity to data
• Provide researchers with something they know and use - a file system and a desktop
• Offer research libraries a cost-effective model that scales with use
Five Principles
• Keep it simple
• Lower the bar
• Bring your own tools
• Be creative
• Enable users to start small and grow big
Thanks
• Mahendra Mahey
• mahendra.mahey@bl.uk
• labs@bl.uk