IMPACT Final Conference - Richard Boulderstone
IMPACT Conference 2011
Richard Boulderstone
Director, eStrategy & Programmes
October 2011
Fantastic Project!
Highly collaborative
Addressing a common set of issues across Europe
Will have multi-year benefits for organisations that do digitisation
Will result in much richer and more value-added applications
Will benefit the citizens of Europe for many years to come
Could finish here… However, would like to talk about:
My views on print, digitisation, OCR, apps and the future…
The British Library
Exists for everyone who wants to do research – for academic, personal, and commercial purposes.
Covers all subject areas – sciences, technology, medicine, arts, humanities, social sciences…
Receives a copy of every item published in the UK.
Holds over 150 million items, with 3 million items added each year.
Used by over 16,000 people each day (on site and online).
2020 Mission & Vision
Digitisation provides long-lasting digital copy
Digital content can support advanced analysis
Digital is easier to access
Digital content has much greater reach
We can only accomplish these objectives with partners
Physical Collections
Physical Item
British Library has 150M items in collection
Estimated number of pages: 5,000M
Therefore average number of pages per item = 33
CENL (Conference of European National Librarians) Survey 2006: 400M items in national libraries
Estimate: 13,200M pages (33 * 400M)
Lots to digitise!
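The page estimate on this slide is simple arithmetic, and a short sketch makes the steps explicit. The figures are the ones quoted on the slide; the function names are illustrative only, and applying the British Library average to all national libraries is the slide's own simplifying assumption rather than a measured value.

```python
# Figures quoted on the slide, in millions.
BL_ITEMS_M = 150      # items held by the British Library
BL_PAGES_M = 5_000    # estimated pages in the British Library collection
CENL_ITEMS_M = 400    # items in national libraries (CENL Survey 2006)


def average_pages_per_item(pages_m: float, items_m: float) -> float:
    """Average pages per item; the 'millions' unit cancels out."""
    return pages_m / items_m


def estimate_pages_m(items_m: float, pages_per_item: float) -> float:
    """Estimated pages (in millions) for a collection of items_m million items."""
    return items_m * pages_per_item


avg = average_pages_per_item(BL_PAGES_M, BL_ITEMS_M)
# The slide truncates the average to a whole number of pages per item.
avg_whole = int(avg)
# Assumption (the slide's own): the BL average holds for all national libraries.
cenl_pages_m = estimate_pages_m(CENL_ITEMS_M, avg_whole)

print(f"Average pages per item: {avg:.1f} (slide uses {avg_whole})")
print(f"Estimated pages in national libraries: {cenl_pages_m:,}M")
```

The estimate is only as good as the average: a collection dominated by newspapers or single-sheet items would shift it considerably.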
Digital not Digitalis
Born-Digital: Normally contemporary material that we acquire in digital form (eJournals, eBooks, web sites, etc.)
Digitised: Digital image of physical collection item (newspapers, books, manuscripts, journals, audio, etc.)
Not… Digitalization: The administration of digitalis (foxglove) or one of its active constituents to a patient or an animal so that the required physiological changes occur in the body; also, the state of the body resulting from this. (Oxford English Dictionary)
Digitisation – Create Images
Diagram: Physical Item -> Digitisation -> Digitised Item
BL has digitised 57M objects, around 1% of physical collection
However, partnership with Brightsolid, digitising newspaper collection (fee service): up to an additional 40M pages
Google to digitise 250,000 books (80M pages)
Cost to digitise: initially much more than £1 per page, more recently less than £1 per page
For entire BL collection, estimated storage required @ 10 Mbytes / page is 50 Petabytes (5 * 10^16 bytes)
CENL Survey 2006: 4.8M items; 2012 projection: 17M items (~4%)
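The storage and coverage figures on this slide can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: decimal units (1 Mbyte = 10^6 bytes, 1 Petabyte = 10^15 bytes), the 5,000M page estimate and 400M item total from the Physical Collections slide, and £1 per page taken as a ceiling for recent costs. The function and variable names are illustrative.

```python
# Inputs taken from the slides.
BL_PAGES = 5_000_000_000        # estimated pages in the BL collection
BYTES_PER_PAGE = 10 * 10**6     # 10 Mbytes per page (decimal units assumed)
CENL_TOTAL_ITEMS_M = 400        # items in national libraries, in millions

BYTES_PER_PETABYTE = 10**15     # decimal petabyte (assumption)


def share_percent(digitised_m: float, total_m: float) -> float:
    """Digitised items as a percentage of total holdings."""
    return digitised_m / total_m * 100


total_bytes = BL_PAGES * BYTES_PER_PAGE
petabytes = total_bytes / BYTES_PER_PETABYTE

share_2006 = share_percent(4.8, CENL_TOTAL_ITEMS_M)   # CENL Survey 2006
share_2012 = share_percent(17, CENL_TOTAL_ITEMS_M)    # 2012 projection

# Assumption: recent cost is "less than £1 per page", so £1 is an upper bound.
cost_ceiling_gbp = BL_PAGES * 1

print(f"Storage: {total_bytes:.0e} bytes = {petabytes:.0f} Petabytes")
print(f"Digitised share: {share_2006:.1f}% (2006), {share_2012:.2f}% (2012)")
print(f"Cost ceiling for the whole collection: under £{cost_ceiling_gbp:,}")
```

The 2012 projection works out a little above 4%, which the slide rounds to "~4%".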
OCR – Gateway to Advanced Digital Functionality
Diagram: Physical Item -> Digitisation -> Digitised Item -> Optical Character Recognition -> Digital Item (shown as a METS XML record with PREMIS v2 metadata)
OCR works very well for modern collections, with high accuracy rates
However, some way to go for older material (Going Grey? Comparing the OCR Accuracy Levels of Bitonal and Greyscale Images, Tracy Powell & Gordon Paynter, NLNZ)
Vital for advanced digital functionality
IMPACT has made significant progress in this area
How good can it get? (Rose Holley, NLA)
Require high accuracy for researchers to trust
Good: 98-99%
Poor: below 90%
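To make the accuracy bands concrete, here is a minimal sketch of one common way to score OCR output against a hand-corrected transcription. The slide does not say how accuracy is measured, so character accuracy based on edit distance is an assumption here, as are the function names and the sample strings. The thresholds are the slide's; treating everything at or above 98% as "good" and labelling the unnamed middle range "in between" are choices made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(ground_truth: str, ocr_text: str) -> float:
    """1 - (edit distance / ground-truth length), floored at 0."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    errors = levenshtein(ground_truth, ocr_text)
    return max(0.0, 1 - errors / len(ground_truth))


def accuracy_band(accuracy: float) -> str:
    """Bands from the slide: good is 98-99%, poor is below 90%."""
    if accuracy >= 0.98:
        return "good"
    if accuracy < 0.90:
        return "poor"
    return "in between"  # the slide does not name this range


# Hypothetical sample: two misread characters in a 43-character line.
truth = "The quick brown fox jumps over the lazy dog"
ocr = "The qu1ck brown fox jumps ovcr the lazy dog"
acc = character_accuracy(truth, ocr)
print(f"{acc:.1%} -> {accuracy_band(acc)}")
```

Two wrong characters in a single line are enough to drop below the "good" band, which shows how demanding a 98-99% target is for older material.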
Adding Value To Collection Items
Diagram: Physical Item -> Digitisation -> Digitised Item -> Optical Character Recognition -> Digital Item (shown as a METS XML record with PREMIS v2 metadata)
Indexing
Basic Search & Discovery
Text Analysis
Text Mining
Image Comparison
Specialist Applications
Application Programming Interface (API)
Social Networking
Collect & Store: Comments, Annotations, Additions
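Indexing and basic search are the first layers that OCR text makes possible. The sketch below shows the idea with a tiny inverted index over OCR output; it is an illustration only, not a description of any British Library or IMPACT system, and the page identifiers and sample text are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the page ids that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so intersection does not mutate the index.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR output keyed by page id.
pages = {
    "page-1": "Digitised newspaper collection from the nineteenth century",
    "page-2": "Manuscript catalogue of the physical collection",
    "page-3": "Newspaper advertisements and notices",
}
index = build_index(pages)
print(sorted(search(index, "newspaper")))             # pages mentioning the word
print(sorted(search(index, "digitised newspaper")))   # pages with both words
```

Every OCR error becomes a missing or wrong index entry, which is why the accuracy of the OCR step limits everything built on top of it.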
Do we need all these applications?
Are they value for money?
Commercial Break…..
Value of Digitisation
Splashes and Ripples: Synthesizing the Evidence on the Impact of Digital Resources, 2011, Eric T. Meyer, Oxford Internet Institute
JISC-funded review of the value of digitisation projects
Examined 12 JISC-funded digitisation projects
Various types of benefits analysed:
Quantitative: Analytics, Income, Log Files, Scientometrics, Surveys, Webometrics
Qualitative: Content Analysis, Feedback, Focus Groups, Interviews, Referrer
Webometrics for 12 JISC-funded Digitisation Projects
Monthly statistics for 12 JISC-funded Digitisation Projects
Does this tell us whether we should do these projects?....
Print vs Digital
Factor | Print | Digital | Winner
Durability | Good (some not so good: newspapers) but eventually destroyed through use | Requires specialist system to retain for ever, but possible | Tie
Look & Feel | Original item | Good simulations possible; also multi-layer digitisation; electronic comparisons provide additional utility | Tie
Search | Only catalogue | With good OCR, full text | Digital
Distribution | Slow, expensive & cumbersome | Fast, cheap, entire internet | Digital
Linking, text mining, social networking | Not possible | Potentially | Digital
Revenue | Very limited opportunities | Already have a number of revenue-generating apps | Digital
Digital Wins!!!
CENL Survey Digitised Items: Potential
Enormous Potential for digitisation
“If we match the total physical holdings [of] national libraries against digital holdings (objects) of a library it becomes clear that content digitisation [is] still in its infancy and how enormous the potential for digitisation of content in National Libraries is.”
My Vision
Cost reductions in storage technologies, mass digitisation processes and application development make it possible for the first time to imagine digitising the entire holdings of major Libraries.
This creates the opportunity to allow all citizens to experience, enjoy, learn from and build on the World’s Knowledge.
Concluding Comments
Digitisation projects have created a fantastic resource for scholars, researchers and the public
European National Libraries, including the British Library, will have digitised around 4% of their collections by 2012
Funding, standards, copyright, technology and interoperability will remain major issues
However these programmes have the potential to radically improve the access to collections across Europe and beyond
We will need to work together to unleash the potential of these resources…
IMPACT is a great example of this collaboration