British Library Labs 21st Century Curatorship Talk

British Library Labs and Lessons for the Library. Mahendra Mahey, Manager of British Library Labs; Ben O’Steen, Technical Lead, British Library Labs. 21st Century Curatorship Talks, Thursday 18 September, 2014, 1500 - 1600, Meeting Room K, British Library, St Pancras, London.

Transcript of British Library Labs 21st Century Curatorship Talk

Page 1: British Library Labs 21st Century Curatorship Talk

British Library Labs and Lessons for the Library

Mahendra Mahey
Manager of British Library Labs

Ben O’Steen
Technical Lead, British Library Labs

21st Century Curatorship Talks
Thursday 18 September, 2014, 1500 - 1600
Meeting Room K, British Library, St Pancras, London

Page 2: British Library Labs 21st Century Curatorship Talk

http://labs.bl.uk 2#bl_labs [email protected] by the Andrew Mellon Foundation

Page 3: British Library Labs 21st Century Curatorship Talk

Stakeholders involved in Labs

[Diagram. Labels: BL Labs; British Library (Digital Scholarship, Digital Research, Access & Reuse Group, Curators / Researchers, Developers / Technical Staff, Digital Content); Universities & wider, e.g. companies, start-ups, independent scholars etc. (Researchers, Developers); United Kingdom; The World]

Page 4: British Library Labs 21st Century Curatorship Talk

How Labs works…

[Diagram, steps 1 to 5. Labels: Researchers and Developers with a question / idea; Engagement: Competition, Contact, Events, Meetings and visits; Experimenting with our digital collections: BL Digital Collections / Data (http://labs.bl.uk/Digital+Collections), Other Digital Collection / Data, Data Driven Projects; Outputs from engagement: Data, Open Software, Publications, Tools & services to support Digital Scholarship, Case Studies, Audience Research]

Page 5: British Library Labs 21st Century Curatorship Talk

Example Digital research methods

http://labs.bl.uk/Launch+Event (has some examples from researchers)

Corpus analysis tools

Text Mining

Visualisations

Location based searching

Geotagging

Annotation

Natural Language Processing

Using Application Programming Interfaces for datasets e.g. Metadata, Images

Transcribing

Crowdsourcing / Human Computation

Page 6: British Library Labs 21st Century Curatorship Talk

The winners of the Labs 2013 competition

Two entries were chosen in June 2013

Both winners worked in residence with Labs from July to October 2013 to complete their projects

Pieter Francois and Dan Norton each received a cheque for £2000 in November 2013 as winners of the first British Library Labs Competition

Photo: Pieter Francois (left) and Dan Norton (right) with Adam Farquhar (middle), Head of Digital Scholarship

Page 7: British Library Labs 21st Century Curatorship Talk

Mixing the Library: The Disc Jockey & the Digital Collection

http://www.tompro.co.uk

http://www.ablab.org/shetland

http://www.ablab.org/pd/di/

Prototype design

Annotation

Preview ‘item’

Selected ‘right’ channel ‘item’

Selected ‘left’ channel ‘item’

Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels

‘Play back’ of ‘items’ (Blue) and annotations (Yellow)

Basic functioning prototype: http://212.71.253.54:8000/a

Living Lab: Library of the Future, see: http://alturl.com/284zw
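
The ‘stalk’ and channel idea on this slide can be pictured as a small data structure. This is only a sketch of the description (items are URLs, their order can be shuffled, and an item can be sent to the ‘left’ or ‘right’ channel); the class names `Stalk` and `Mixer`, the `send` method and the example.org URLs are hypothetical and are not taken from the prototype's code.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stalk:
    """A collection 'stalk': an ordered list of item URLs (hypothetical model)."""
    name: str
    items: List[str] = field(default_factory=list)

    def shuffle(self, seed: Optional[int] = None) -> None:
        # Reorder the items in place; passing a seed makes the order repeatable.
        random.Random(seed).shuffle(self.items)


@dataclass
class Mixer:
    """Remembers which item is currently selected on each channel."""
    channels: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"left": None, "right": None}
    )

    def send(self, stalk: Stalk, index: int, channel: str) -> str:
        # Select one item of the stalk for the 'left' or 'right' channel.
        if channel not in self.channels:
            raise ValueError(f"unknown channel: {channel!r}")
        item = stalk.items[index]
        self.channels[channel] = item
        return item


# Placeholder URLs, invented for the example.
stalk = Stalk("example", [
    "http://example.org/item/1",
    "http://example.org/item/2",
    "http://example.org/item/3",
])
stalk.shuffle(seed=1)
mixer = Mixer()
mixer.send(stalk, 0, "left")
mixer.send(stalk, 1, "right")
```

Keeping each item as a plain URL string matches the slide's statement that each ‘item’ is a URL; annotations and play back are left out of the sketch.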

Page 8: British Library Labs 21st Century Curatorship Talk

Pieter Francois

https://www.youtube.com/watch?v=xK80Jy0ijkA

Page 9: British Library Labs 21st Century Curatorship Talk

Winners of 2014 Competition

Victorian Meme Machine

Bob Nicholson of Edge Hill University

Anna Gerber and Desmond Schmidt from Queensland University

Blog posting: http://goo.gl/iJy0aT
YouTube: http://goo.gl/mBTlk2

Blog: http://goo.gl/ofpNosl
YouTube: http://goo.gl/iseHTE

Text to Image Linking Tool (TILT)

Page 10: British Library Labs 21st Century Curatorship Talk

Bob Nicholson

https://www.youtube.com/watch?v=zK95lzaPNp0

Page 11: British Library Labs 21st Century Curatorship Talk

Anna Gerber and Desmond Schmidt

https://www.youtube.com/watch?v=Bl4bjZSJ4cY&feature=youtu.be

Text to Image Linking Tool (TILT)

Page 12: British Library Labs 21st Century Curatorship Talk

The story of one digital collection…

The story of 68,000 books, 1 million images and Flickr

Image: Artwork by Alicia Martin

http://mechanicalcurator.tumblr.com

http://www.flickr.com/photos/britishlibrary/

Page 13: British Library Labs 21st Century Curatorship Talk

Extracting Images from OCR

[Diagram: Digitisation, then Optical Character Recognition (ALTO XML), then image snipped out algorithmically from ALTO XML]

XML excerpt shown on the slide (truncated):

<?xml version="1.0" encoding="UTF-8" ?>
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mets="http://www.loc.gov/METS/" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version18/mets.xsd info:lc/xmlns/premis-v2

Image snipped out: image taken from page 207 of 'London and its Environs. A picturesque survey of the metropolis and the suburbs ... Translated by Henry Frith. With ... illustrations'
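
One way to picture "snipping out" an image from ALTO XML is to read the bounding box of each `Illustration` block. The sketch below is not the Labs pipeline; it assumes the OCR output marks pictures as ALTO `Illustration` elements with `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes, and it uses a tiny hand-written sample rather than a real British Library file. The function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hand-written ALTO-like sample (invented values). Real files declare a
# version-specific namespace, so tags are matched on their local name only.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P207" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="100" WIDTH="1800" HEIGHT="400"/>
        <Illustration ID="IL1" HPOS="250" VPOS="600" WIDTH="1500" HEIGHT="1100"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def illustration_boxes(alto_xml: str) -> List[Tuple[int, int, int, int]]:
    """Return a (left, top, right, bottom) box for every Illustration element."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        if local_name(element.tag) != "Illustration":
            continue
        left = int(float(element.attrib["HPOS"]))
        top = int(float(element.attrib["VPOS"]))
        width = int(float(element.attrib["WIDTH"]))
        height = int(float(element.attrib["HEIGHT"]))
        boxes.append((left, top, left + width, top + height))
    return boxes


boxes = illustration_boxes(SAMPLE_ALTO)
```

Each box could then be used to crop the page scan with an imaging library. ALTO coordinates are expressed in the unit declared in the file's `MeasurementUnit`, so a conversion to pixels may be needed before cropping.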

Page 14: British Library Labs 21st Century Curatorship Talk

Face Recognition of 19th Century Faces

The face-recognition algorithm worked better on women’s faces than on men’s

Page 15: British Library Labs 21st Century Curatorship Talk

The Mechanical Curator

http://mechanicalcurator.tumblr.com

• #similar_to_77576796197_published_date

• #similar_to_77576796197_slantyness

• #similar_to_77576796197_bubblyness_x

• #similar_to_77576796197_bubblyness_y

• #new_train_of_thought

Image from ‘A Lost Estate’ by Mary E. Mann, Volume: 02, Page: 91, 1889, London, Bentley & Son
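
The tags above suggest that each post is chosen because it resembles an earlier one on a single measure, with `#new_train_of_thought` marking a fresh start. The following is a guess at that logic for illustration only, not the Mechanical Curator's actual code: the function `next_image`, the `max_distance` threshold, the identifiers `a` and `b` and all feature values are invented.

```python
from typing import Dict, Optional, Tuple

Features = Dict[str, float]


def next_image(
    current_id: str,
    feature: str,
    catalogue: Dict[str, Features],
    max_distance: float,
) -> Tuple[Optional[str], str]:
    """Pick the image closest to current_id on one feature, plus a tag."""
    target = catalogue[current_id][feature]
    best_id: Optional[str] = None
    best_distance: Optional[float] = None
    for image_id, features in catalogue.items():
        if image_id == current_id or feature not in features:
            continue
        distance = abs(features[feature] - target)
        if best_distance is None or distance < best_distance:
            best_id, best_distance = image_id, distance
    # Nothing close enough: start again rather than follow a weak link.
    if best_id is None or best_distance is None or best_distance > max_distance:
        return None, "#new_train_of_thought"
    return best_id, f"#similar_to_{current_id}_{feature}"


# Invented feature values, keyed by made-up identifiers.
catalogue = {
    "77576796197": {"published_date": 1889, "slantyness": 0.12},
    "a": {"published_date": 1889, "slantyness": 0.80},
    "b": {"published_date": 1850, "slantyness": 0.15},
}

by_slant = next_image("77576796197", "slantyness", catalogue, max_distance=0.1)
by_date = next_image("77576796197", "published_date", catalogue, max_distance=5)
restart = next_image("77576796197", "slantyness", catalogue, max_distance=0.01)
```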

Page 16: British Library Labs 21st Century Curatorship Talk

Flickr Commons – 1,020,418 images!

http://www.flickr.com/photos/britishlibrary/

Each image has a URL

Some metadata, but you can add tags!

Flickr has an API so researchers and developers can build apps and query the data
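
As a rough illustration of querying the data through the Flickr API, the sketch below builds a request URL for Flickr's `flickr.people.getPublicPhotos` method, asking for tags and an original-size URL with each photo. It only constructs the URL and makes no network call. The API key and user ID are placeholders you would need to supply, and the helper name `public_photos_url` is invented for the example.

```python
from urllib.parse import parse_qs, urlencode, urlparse

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def public_photos_url(api_key: str, user_id: str, page: int = 1, per_page: int = 100) -> str:
    """Build a REST URL listing one page of an account's public photos."""
    params = {
        "method": "flickr.people.getPublicPhotos",
        "api_key": api_key,
        "user_id": user_id,          # the account's Flickr NSID
        "extras": "tags,url_o",      # also return tags and the original image URL
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,         # plain JSON rather than a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials: replace with a real API key and the account's NSID.
url = public_photos_url("YOUR_API_KEY", "YOUR_USER_NSID", page=2)
query = parse_qs(urlparse(url).query)
```

Fetching the URL with any HTTP client returns a page of photo records; paging through all pages would walk the whole collection.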

Page 17: British Library Labs 21st Century Curatorship Talk

Flickr in numbers

>190,000,000 image views from launch on 13 December 2013 to 18 September 2014

553 images seen fewer than 10 times

103,000 tags added

Labs is involved in a number of funded research projects and 4 grassroots crowdsourcing efforts.
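
To put the figures on this slide in proportion, a short calculation using only the numbers quoted in the talk (1,020,418 images, more than 190,000,000 views, 553 rarely seen images, 103,000 tags). Because the view count is a lower bound, the averages are lower bounds too; the variable names are arbitrary.

```python
from datetime import date

IMAGES = 1_020_418
VIEWS_LOWER_BOUND = 190_000_000
RARELY_SEEN = 553
TAGS_ADDED = 103_000
LAUNCH = date(2013, 12, 13)
TALK = date(2014, 9, 18)

days_online = (TALK - LAUNCH).days                # days between launch and the talk
views_per_image = VIEWS_LOWER_BOUND / IMAGES      # average views per image
views_per_day = VIEWS_LOWER_BOUND / days_online   # average views per day
rarely_seen_share = RARELY_SEEN / IMAGES          # fraction seen fewer than 10 times
tags_per_image = TAGS_ADDED / IMAGES              # average tags added per image

print(f"{days_online} days, {views_per_image:.1f} views per image, "
      f"{views_per_day:,.0f} views per day")
```

Over the 279 days since launch this comes to at least about 186 views per image on average and roughly 681,000 views per day, while only around 0.05% of images had been seen fewer than 10 times.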

Page 18: British Library Labs 21st Century Curatorship Talk

Tagging a million images - Metadata games and other projects

http://www.metadatagames.org/

Games developed using Flickr sets

http://goo.gl/j6fxac

Cardiff University’s Lost Visions Project

Page 19: British Library Labs 21st Century Curatorship Talk

Flickr coverage in the media!

Page 20: British Library Labs 21st Century Curatorship Talk

Opportunities – increasing traffic to Library services

You can purchase a ‘High Res’ copy

View in the Library Item Viewer

Download .pdf

All illustrations in book

Other illustrations in books published in same year

View the item in the Library Catalogue

Tags auto generated

User generated tag

Grouping for image

Page 21: British Library Labs 21st Century Curatorship Talk

Creative Uses

http://goo.gl/qPPgxX

http://goo.gl/OH6FSn

Jura’s Sound Skateboard

Page 22: British Library Labs 21st Century Curatorship Talk

Lessons learned… in getting digital content

• Filtering was necessary to choose content, because of the amount and size of the content and the time period of the project

• Getting the story behind the collection was crucial, usually from the curator

• Sometimes access and reuse requests are needed for content

• Getting the curators on board (engaging with the competition, getting them to be judges) and rewarding them afterwards is important (e.g. technical quick wins by working with the Labs technical lead)

Page 23: British Library Labs 21st Century Curatorship Talk

Lessons learned…metadata

• Metadata cleansing is needed: there are duplicate records, and records are not always linked when updated

• Lots of digital content doesn’t have metadata; initiate crowdsourcing perhaps?

• There is limited subject classification in the 19th century metadata for books
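
As an illustration of the kind of cleansing the first point refers to, here is a minimal sketch that groups records whose title and year match after light normalisation. It is not the Library's process; the record fields (`id`, `title`, `year`), the matching rule and the sample records are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalise(text: str) -> str:
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def duplicate_groups(records: Iterable[Dict[str, str]]) -> List[List[str]]:
    """Return lists of record ids that share a normalised title and year."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for record in records:
        key = (normalise(record["title"]), record["year"].strip())
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented sample records; the ids are made up for the example.
records = [
    {"id": "r1", "title": "A Lost Estate", "year": "1889"},
    {"id": "r2", "title": "a lost estate.", "year": "1889 "},
    {"id": "r3", "title": "Another Book", "year": "1850"},
]

duplicates = duplicate_groups(records)
```

Exact matching on a normalised key only catches the simplest duplicates; candidate groups would still need a curator's check before records are merged or linked.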

Page 24: British Library Labs 21st Century Curatorship Talk

Lessons learned…technical

• Some content is only available on site due to licensing restrictions

• Labs highlights where systems don’t join up, and this can be flagged internally

• Some restrictions mean that workarounds have to be developed for researchers to work with the content

Page 25: British Library Labs 21st Century Curatorship Talk

Lessons learned…human

• Working on site brings internal system and process challenges; the issues are not insurmountable and workarounds are possible

• Starting a dialogue with the right person is the most important lesson I learned about the Library (obvious but true)

Page 26: British Library Labs 21st Century Curatorship Talk

Poster given at Digital Humanities 2014

http://figshare.com/articles/Interoperable_Infrastructures_for_Digital_Research_A_proposed_pathway_for_enabling_transformation/1092550

Adam Farquhar, James Baker

Page 27: British Library Labs 21st Century Curatorship Talk

What do digital researchers want?

• Scalable access to large quantities of digital content

• To work with all types of content - text, image, audio, video

• To work the way they want to, use any work flow, address any sort of problem

• To work across collections irrespective of content owner or licence terms

Page 28: British Library Labs 21st Century Curatorship Talk

What do researchers get?

• Restrictive, prescriptive and incompatible infrastructures

• Assets distributed unevenly across organisations and systems

Page 29: British Library Labs 21st Century Curatorship Talk

Proposed pathway

• Use off-the-shelf technologies and services

• Bring computational capacity to data

• Provide researchers with something they know and use - a file system and a desktop

• Offer research libraries a cost-effective model that scales with use

Page 30: British Library Labs 21st Century Curatorship Talk

Five Principles

• Keep it simple

• Lower the bar

• Bring your own tools

• Be creative

• Enable users to start small and grow big

Page 31: British Library Labs 21st Century Curatorship Talk

Thanks

• Mahendra Mahey

[email protected]

[email protected]