British Library Labs 21st Century Curatorship Talk

Post on 15-Jul-2015

British Library Labs and Lessons for the Library

Mahendra Mahey, Manager of British Library Labs

Ben O’Steen, Technical Lead, British Library Labs

21st Century Curatorship Talks

Thursday 18 September 2014, 15:00 - 16:00

Meeting Room K, British Library, St Pancras, London

http://labs.bl.uk  #bl_labs  labs@bl.uk

Funded by the Andrew Mellon Foundation

Stakeholders involved in Labs

[Diagram labels. British Library: Digital Scholarship; Digital Research; Access & Reuse Group (©); Developers / Technical Staff; Curators / Researchers; Digital Content. BL Labs. Universities & wider (e.g. companies, start-ups, independent scholars etc.): Researchers; Developers. United Kingdom; The World.]

How Labs works…

[Flow diagram with five numbered stages. Labels: Researchers and Developers; question / idea; Competition, Contact, Events, Meetings and visits; BL Labs; Experimenting with our digital collections (BL Digital Collections / Data, Other Digital Collection / Data); Data Driven Projects; Outputs from engagement: Data, Open Software, Publications, Case Studies, Audience Research, Tools & services to support Digital Scholarship.]

http://labs.bl.uk/Digital+Collections

Example Digital research methods

http://labs.bl.uk/Launch+Event (has some examples from researchers)

Corpus analysis tools

Text Mining

Visualisations

Location based searching

Geotagging

Annotation

Natural Language Processing

Using Application Programming Interfaces for datasets e.g. Metadata, Images

Transcribing

Crowdsourcing / Human Computation

The winners of the Labs 2013 competition

Pieter Francois (left) and Dan Norton (right) each received a cheque for £2,000 in November 2013 as winners of the first British Library Labs Competition 2013

Two entries chosen in June 2013

They both worked in residence with Labs from July to October 2013 to complete their projects

Pieter Francois (left) and Dan Norton (right) with Adam Farquhar (middle), Head of Digital Scholarship

Mixing the Library: The Disc Jockey & the Digital Collection

http://www.tompro.co.uk

http://www.ablab.org/shetland

http://www.ablab.org/pd/di/

Prototype design

Annotation

Preview ‘item’

Selected ‘right’ channel ‘item’

Selected ‘left’ channel ‘item’

Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels

‘Play back’ of ‘items’ (Blue) and annotations (Yellow)
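The ‘stalk’ idea described above can be sketched as a small data model. This is only an illustrative guess at the structure, not the prototype's actual code: the class names `Stalk` and `Mixer`, the method names, and the example.org URLs are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stalk:
    """A collection 'stalk': an ordered list of 'items', each one a URL."""
    items: List[str] = field(default_factory=list)

    def shuffle(self, seed: Optional[int] = None) -> None:
        """Reorder the items in place; a seed makes the order repeatable."""
        random.Random(seed).shuffle(self.items)


@dataclass
class Mixer:
    """Holds the item currently selected on each channel (hypothetical model)."""
    channels: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"left": None, "right": None}
    )

    def send(self, stalk: Stalk, index: int, channel: str) -> None:
        """Send one item from a stalk to the 'left' or 'right' channel."""
        if channel not in self.channels:
            raise ValueError(f"unknown channel: {channel!r}")
        self.channels[channel] = stalk.items[index]


# Placeholder URLs, not real collection items.
stalk = Stalk([
    "https://example.org/item/1",
    "https://example.org/item/2",
    "https://example.org/item/3",
])
stalk.shuffle(seed=42)

mixer = Mixer()
mixer.send(stalk, 0, "left")
mixer.send(stalk, 1, "right")
print(mixer.channels)
```

Because every item is just a URL, a stalk can mix text, image, audio or video items without the model needing to know which is which.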

Basic functioning prototype: http://212.71.253.54:8000/a

Living Lab: Library of the Future, see: http://alturl.com/284zw

Pieter Francois

https://www.youtube.com/watch?v=xK80Jy0ijkA

Winners of 2014 Competition

Victorian Meme Machine

Bob Nicholson of Edge Hill University

Anna Gerber and Desmond Schmidt from Queensland University

Blog posting: http://goo.gl/iJy0aT

YouTube: http://goo.gl/mBTlk2

Blog: http://goo.gl/ofpNosl

YouTube: http://goo.gl/iseHTE

Text to Image Linking Tool (TILT)

Bob Nicholson

https://www.youtube.com/watch?v=zK95lzaPNp0

Anna Gerber and Desmond Schmidt

https://www.youtube.com/watch?v=Bl4bjZSJ4cY&feature=youtu.be

Text to Image Linking Tool (TILT)

The story of one digital collection…

The story of 68,000 books and 1 million images

and Flickr

Image: Artwork by Alicia Martin

http://mechanicalcurator.tumblr.com

http://www.flickr.com/photos/britishlibrary/

Extracting Images from OCR

Digitisation

<?xml version="1.0" encoding="UTF-8" ?>
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mets="http://www.loc.gov/METS/" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version18/mets.xsd info:lc/xmlns/premis-v2

Optical Character Recognition

ALTO XML

Image snipped out algorithmically from ALTO XML

Image taken from page 207 of 'London and its Environs. A picturesque survey of the metropolis and the suburbs ... Translated by Henry Frith. With ... illustrations'
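As a rough illustration of how an image can be located from ALTO XML: the ALTO schema marks non-text regions with `Illustration` elements carrying `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes. The sketch below only computes crop boxes from those attributes. It is not the Labs pipeline; the function name and the sample document are hypothetical, and it assumes the coordinates are already in pixels (ALTO files can declare other measurement units).

```python
import xml.etree.ElementTree as ET


def illustration_boxes(alto_xml: str):
    """Return (left, top, right, bottom) for each Illustration block."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for elem in root.iter():
        # Drop any namespace prefix so the sketch works across ALTO versions.
        if elem.tag.rsplit("}", 1)[-1] != "Illustration":
            continue
        left = float(elem.get("HPOS", 0))
        top = float(elem.get("VPOS", 0))
        width = float(elem.get("WIDTH", 0))
        height = float(elem.get("HEIGHT", 0))
        boxes.append((left, top, left + width, top + height))
    return boxes


# Hypothetical, heavily simplified ALTO page for demonstration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="t1" HPOS="100" VPOS="50" WIDTH="800" HEIGHT="200"/>
    <Illustration ID="i1" HPOS="120" VPOS="300" WIDTH="640" HEIGHT="480"/>
  </PrintSpace></Page></Layout>
</alto>"""

print(illustration_boxes(SAMPLE))
```

Each box could then be passed to an image library's crop function to snip the illustration out of the page scan.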

Face Recognition of 19th Century Faces

The face-recognition algorithm worked better for female faces than for male faces

The Mechanical Curator

http://mechanicalcurator.tumblr.com

• #similar_to_77576796197_published_date

• #similar_to_77576796197_slantyness

• #similar_to_77576796197_bubblyness_x

• #similar_to_77576796197_bubblyness_y

• #new_train_of_thought

Image from ‘A Lost Estate’ by Mary E. Mann, Volume: 02, Page: 91, 1889, London, Bentley & Son

1,020,418 images!

http://www.flickr.com/photos/britishlibrary/

Each image has a URL

Some metadata, but you can add tags!

Flickr has an API, so researchers and developers can build apps and query the data

Flickr Commons – 1,020,418 images!
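A minimal sketch of querying the collection through the Flickr API, assuming the `flickr.photos.search` method on Flickr's REST endpoint. The API key and account identifier are placeholders you would need to supply, and the helper names `build_search_url` and `search_photos` are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, user_id, tags, per_page=10):
    """Build a flickr.photos.search request for one account's tagged photos."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "user_id": user_id,          # the account's Flickr NSID
        "tags": ",".join(tags),      # comma-separated list of tags
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,         # plain JSON rather than JSONP
    }
    return FLICKR_REST + "?" + urlencode(params)


def search_photos(api_key, user_id, tags):
    """Call the API and decode the JSON reply (needs network and a real key)."""
    with urlopen(build_search_url(api_key, user_id, tags)) as response:
        return json.load(response)


# Placeholders only: substitute a real key and the account's NSID.
url = build_search_url("YOUR_API_KEY", "ACCOUNT_NSID", ["map", "london"])
print(url)
```

Only `build_search_url` runs here; `search_photos` is left uncalled so the sketch works offline.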

Flickr in numbers

More than 190,000,000 image views from launch on 13 December 2013 to 18 September 2014

553 images seen fewer than 10 times

103,000 tags added

Labs is involved in a number of funded research projects and 4 grassroots crowdsourcing efforts.

Tagging a million images - Metadata games and other projects

http://www.metadatagames.org/

Games developed using Flickr sets

http://goo.gl/j6fxac

Cardiff University’s Lost Visions Project

Flickr coverage in the media!

Opportunities – increasing traffic to Library services

You can purchase a ‘High Res’ Copy

View in the Library Item Viewer

Download .pdf

All illustrations in book

Other illustrations in books published in same year

View the item in the Library Catalogue

Tags auto generated

User generated Tag

Grouping for image

Creative Uses

http://goo.gl/qPPgxX

http://goo.gl/OH6FSn

Jura’s Sound Skateboard

Lessons learned… in getting digital content

• A filter was needed to choose content, given the amount of content and the size and duration of the project

• Getting the story behind the collection was crucial, usually from the curator

• Sometimes access and reuse requests are needed for content

• Getting the curators on board (engaging with the competition, getting them to be judges) and rewarding them afterwards is important (e.g. technical quick wins by working with the Labs technical lead)

Lessons learned…metadata

• Metadata cleansing is needed: there are duplicate records, and records are not always linked when updated

• Lots of digital content has no metadata; perhaps initiate crowdsourcing?

• There is limited subject classification in the metadata for 19th century books

Lessons learned…technical

• Some content is only available on site due to licensing restrictions

• Labs highlights where systems don’t join up, and this can be flagged internally

• Some restrictions mean that workarounds have to be developed for researchers to work with the content

Lessons learned…human

• Working on site brings internal system and process challenges; the issues are not insurmountable and workarounds are possible

• Starting a dialogue with the right person is the most important lesson I learned about the Library (obvious but true)

Poster given at Digital Humanities 2014

http://figshare.com/articles/Interoperable_Infrastructures_for_Digital_Research_A_proposed_pathway_for_enabling_transformation/1092550

Adam Farquhar, James Baker

What do digital researchers want?

• Scalable access to large quantities of digital content

• To work with all types of content - text, image, audio, video

• To work the way they want to, use any workflow, address any sort of problem

• To work across collections irrespective of content owner or licence terms

What do researchers get?

• Restrictive, prospective and incompatible infrastructures

• Assets distributed unevenly across organisations and systems

Proposed pathway

• Use off-the-shelf technologies and services

• Bring computational capacity to data

• Provide researchers with something they know and use - a file system and a desktop

• Offer research libraries a cost-effective model that scales with use

Five Principles

• Keep it simple

• Lower the bar

• Bring your own tools

• Be creative

• Enable users to start small and grow big

Thanks

• Mahendra Mahey

• mahendra.mahey@bl.uk

• labs@bl.uk