
A centre of expertise in digital information management

www.ukoln.ac.uk

UKOLN is supported by:

Digital libraries and digital scholarship: changing roles and responsibilities?

Dr Liz Lyon

Director, UKOLN

SCONUL Conference, Newcastle, June 2006.

This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0

                                                             


Overview

1. Some images of scholarship
• Perceptions from the past
• 23rd June: today
• “Native digital scholar” beyond 2010?

2. Digital libraries and e-Research infrastructure
• Data creation and capture
• Data curation and preservation
• Data citation, discovery and use
• Adding value and knowledge extraction

3. A Case Study

4. Roles & responsibilities: new challenges?

                                                             


The scholar in AD 731

Folio 3v Codex Beda Petersburgiensis

Scholarship today? OA landscape

                                                             


http://www.flickr.com/photos/97797311@N00/61648107/

23 June 2006

Architecture of Participation?

                                                             

e-Scientist desktop?

Slide: Carole Goble

Data-centric 2020 vision

Reference datasets as infrastructure?

Human discourse: supporting persistent conversations?

MEMETIC Project

JISC VRE Programme

Compendium software + Access Grid

http://www.anotherlanguage.org/interplay/packetcreek/

Performing Arts & Access Grid

                                                             


New forms of publication: integration of data and journals

                                                             


Digital libraries & e-Research Infrastructure

(Very simple) e-Research Cycle

Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture

Adding value: Data linking, annotation, visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit, self-archiving, preservation, certification

Data processing



Understanding the research process

• Core business process? Workflows?
• Project StORe: Source-to-Output Repositories (Edinburgh)
– Primary data: research publications
– Survey questionnaire
• RepoMMan: Repository Metadata and Management (Hull)
– Survey questionnaire and interviews
– Activity diagram and workflow

• How is primary research data captured in faculty and academic departments?

• Where and how is primary research data stored in your institution?

• What data is curated by data centres?

Learning & Teaching workflows

Research & e-Science workflows

Aggregator services: national, commercial

Repositories : institutional, e-prints, subject, data, learning objects

Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

Harvesting metadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Resource discovery, linking, embedding

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Resource discovery, linking, embedding

Deposit / self-archiving

Learning object creation, re-use

Searching, harvesting, embedding

Quality assurance bodies

Validation

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

The scholarly knowledge cycle.

Liz Lyon, Ariadne, July 2003.


© Liz Lyon (UKOLN, University of Bath), 2005

“JISC Vision”: a global landscape of federated repositories

fusion layer ‘repository federator’

repository repository repository repository repository

portal portal portal portal portal

heterogeneous - metadata formats, content formats, identifiers, packaging standards

homogeneous - metadata formats, content formats, identifiers, packaging standards

From Andy Powell: http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/presentations/jiie-jcs-2005/

• Multi-disciplinary, cross-sectoral

• National, institutional

• Different platforms

• Many format types: data, eprints, images, geospatial

• e-Framework and Information Environment context

• Define common + domain-specific + repository “services”

• Interoperability based on open standards, software tools
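
As a rough illustration of what a "fusion layer" might do, the sketch below maps records from repositories with different metadata formats onto one common shape that portals could consume. The two source formats, the field names, and the `normalise` and `federate` functions are all hypothetical; they are not taken from the JISC architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Homogeneous record shape exposed to portals (hypothetical)."""
    identifier: str
    title: str
    resource_type: str


# Hypothetical per-format mappings: source field name -> common field name.
FIELD_MAPS = {
    "simple_dc": {
        "dc:identifier": "identifier",
        "dc:title": "title",
        "dc:type": "resource_type",
    },
    "local_lab": {
        "id": "identifier",
        "name": "title",
        "kind": "resource_type",
    },
}


def normalise(record: dict, source_format: str) -> CommonRecord:
    """Translate one heterogeneous record into the common shape."""
    mapping = FIELD_MAPS[source_format]
    fields = {common: record[source] for source, common in mapping.items()}
    return CommonRecord(**fields)


def federate(batches: dict) -> list:
    """Merge records from several repositories, keyed by source format."""
    merged = []
    for source_format, records in batches.items():
        merged.extend(normalise(r, source_format) for r in records)
    return merged
```

A real federator would also have to reconcile content formats, identifiers and packaging standards, which this sketch leaves out.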

Digital repositories, OA & preservation
• Long-term access: trust, responsibility, policy
• Trusted DR Audit Checklist for Certification (draft), Research Libraries Group-NARA Taskforce, 2005
• Defined criteria under 4 categories
– Organisation
– Functions, processes & procedures
– Designated community & usability
– Technologies & technical infrastructure
• UK Digital Curation Centre: advice, tools & services
• RepInfo Registry
• EU CASPAR Integrated Project
• Task Force on the Permanent Access to the Records of Science

http://www.dcc.ac.uk/

http://www.casparpreserves.info/pages/1/index.htm

http://tfpa.kb.nl/

Data, metadata and discovery
• Validation, publication & discovery of data models & schema
• Metadata packaging standards
– METS, MPEG-21 DIDL
– Complex object model?
• Semantic descriptions
– Formal high-level and domain ontologies
– Inter-disciplinary discovery
• ePrints DC Application Profile
• UK Intute IR search service (eprints)
• Informal social network approaches: “folksonomies”
• What data models and metadata schema are in place?
• Have librarians been involved in their development?

Persistent identifiers for data citation
• How will they be used? We need use cases: depositor, author, service provider, researcher, publisher?
• Schemes: DOI, Handle, ARK, PURL
• Publication & citation of scientific primary data project: National Library for Science & Technology (TIB), University of Hanover, Germany. STD-DOI Project: DOI registry for datasets http://www.std-doi.de
• What persistent identifiers have been assigned to your data?
• Is there a data citation policy?
• Was the Library involved?
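
To make the DOI scheme concrete, here is a minimal sketch of how a dataset DOI splits into a registrant prefix and a suffix, and how it becomes a resolvable link for a citation. The function names and the citation layout are illustrative assumptions, and the creator and title in the usage note are placeholders. The sketch does not percent-encode special characters, which a DOI suffix may require.

```python
def split_doi(doi: str) -> tuple:
    """Split a DOI into its registrant prefix and its suffix."""
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str, resolver: str = "https://doi.org/") -> str:
    """Build the URL that a DOI resolver redirects to the current location."""
    split_doi(doi)  # raises ValueError if the string is not DOI-shaped
    return resolver + doi


def cite(creators: list, year: int, title: str, doi: str) -> str:
    """Format a plain-text data citation (this layout is an assumption)."""
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {resolver_url(doi)}"
```

For example, `cite(["Placeholder, A."], 2006, "Placeholder dataset title", "10.1594/ecrystals.chem.soton.ac.uk/145")` produces a citation that ends in a link through the resolver, so the reference keeps working if the repository moves the data.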

Adding value: repository services
• Tools: for deposit, normalisation, manipulation, transformation…

• Linking, annotation, visualisation

• Aggregators: generic, (sub-) disciplinary

Knowledge extraction:
• Mining (data, text, structures)

• Modelling (economic, climate, mathematical, biological…)

• Analysis (statistical, lexical, gene….)

Is your data OA?

How is your data being used and re-used?

Avian flu outbreaks mashup - Nature January 2006

Data from FAO, WHO…

+Google Earth

Nature 23 March 2006 OTMI: Open Text Mining Interface

NaCTeM http://www.nactem.ac.uk/

Emerging tools: TerMine, GENIA, Cafetiere

                                                             


A Case Study in Crystallography

                                                             


Data capture

Deposit scenario (…part of….)

1. Produce strategy for synthesis (=idea)

2. Submit plan to SmartTea system (incl. identifiers)

3. Retrieve and follow instructions (sub-workflow?)

4. Experimental synthesis metadata automatically recorded on instruments (Smart Lab)

5. Create record for synthesised sample (+ proposed chemical identifier) in R4L laboratory data management system

6. Run spectral analyses on sample capturing further analysis metadata (incl. time-stamp, analysis software version, researcher details etc.)

7. Save spectrum in native and common formats

8. Invoke R4L data capture service and deposit files + metadata in laboratory repository…

RAW DATA DERIVED DATA RESULTS DATA
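
Steps 6 to 8 of the deposit scenario can be sketched as building a small manifest that travels with the files into the laboratory repository. This is not the R4L data capture service: the `build_deposit_manifest` function, the field names and the JSON layout are assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def build_deposit_manifest(files: dict, sample_id: str, researcher: str,
                           software_version: str,
                           timestamp: Optional[datetime] = None) -> str:
    """Bundle analysis metadata with a checksum for each file to deposit.

    `files` maps a file name to its raw bytes, e.g. the same spectrum
    saved in a native and a common format.
    """
    timestamp = timestamp or datetime.now(timezone.utc)
    manifest = {
        "sample_id": sample_id,
        "researcher": researcher,
        "analysis_software_version": software_version,
        "timestamp": timestamp.isoformat(),
        # Checksums let the repository verify the files after transfer.
        "files": [
            {
                "name": name,
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
            for name, data in sorted(files.items())
        ],
    }
    return json.dumps(manifest, indent=2)
```

Recording the time-stamp, software version and researcher at capture time means the metadata does not have to be reconstructed later, when the data is curated or cited.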

                                                             


eBank UK Project
• Promote open access crystallography data
• Aggregator service harvests OAI metadata from institutional data repository (e-Crystals archive)
• Service linking from data to derived research publication
• Embedding eBank service in learning workflows: pedagogy
• Future federation plans for crystallography data repositories

UKOLN (lead), University of Southampton, University of Manchester

http://www.ukoln.ac.uk/projects/ebank-uk/
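
The harvesting step can be illustrated with a minimal sketch that parses an OAI-PMH `ListRecords` response (as returned for a request such as `?verb=ListRecords&metadataPrefix=oai_dc`) and pulls out the Dublin Core fields. The sample response is hypothetical and heavily trimmed, with placeholder values; it is not output from the e-Crystals archive, and the `harvest` function is not the eBank aggregator's code.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, trimmed ListRecords response with placeholder values.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Placeholder crystal structure title</dc:title>
          <dc:creator>Placeholder, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def harvest(xml_text: str) -> list:
    """Return one dict per record: OAI identifier plus Dublin Core fields."""
    root = ET.fromstring(xml_text)
    harvested = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        dc_root = record.find("oai:metadata/oai_dc:dc", NS)
        if dc_root is not None:  # deleted records carry no metadata
            for element in dc_root:
                name = element.tag.split("}", 1)[1]  # drop the namespace
                fields.setdefault(name, []).append((element.text or "").strip())
        harvested.append({"identifier": identifier, "dc": fields})
    return harvested
```

A production harvester would also follow resumption tokens and request incremental updates by date; both are omitted here.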

                                                             


A data repository entry ecrystals.chem.soton.ac.uk

Access to the underlying data: complex objects

eBank Metadata Publication

• Using simple Dublin Core
– Crystal structure
– Title (Systematic IUPAC Name)
– Authors
– Affiliation
– Creation Date
• Additional chemical information through Qualified Dublin Core
– Empirical formula
– International Chemical Identifier (InChI)
– Compound Class & Keywords
• Specifies which ‘datasets’ are present in an entry
• Application Profile
• DOIs from TIB: http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145

• Data citation policy http://ecrystals.chem.soton.ac.uk/rights.html

http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
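
As a hedged illustration of the metadata fields listed above, the sketch below holds one entry in a small data structure and serialises the simple Dublin Core part as an `oai_dc` record. The `CrystalEntry` class and the choice of elements are assumptions; the actual eBank application profile is defined in the schemas linked above and is not reproduced here. The chemical fields are stored but not serialised, because the qualified element names are not given in the talk.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"


@dataclass
class CrystalEntry:
    """One data repository entry (hypothetical structure)."""
    title: str            # systematic IUPAC name
    authors: list
    affiliation: str      # kept here; the element that carries it is not assumed
    creation_date: str    # ISO 8601 date
    doi: str
    # Chemical details that the talk places in Qualified Dublin Core.
    empirical_formula: str = ""
    inchi: str = ""
    keywords: list = field(default_factory=list)


def to_simple_dc(entry: CrystalEntry) -> str:
    """Serialise the simple Dublin Core fields as an oai_dc XML string."""
    ET.register_namespace("dc", DC)
    ET.register_namespace("oai_dc", OAI_DC)
    root = ET.Element(f"{{{OAI_DC}}}dc")

    def add(name: str, value: str) -> None:
        ET.SubElement(root, f"{{{DC}}}{name}").text = value

    add("title", entry.title)
    for author in entry.authors:
        add("creator", author)
    add("date", entry.creation_date)
    add("identifier", entry.doi)
    return ET.tostring(root, encoding="unicode")
```

Keeping a domain identifier such as the InChI alongside the generic fields is what allows a structure to be discovered by chemists as well as by general-purpose aggregators.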

Discovering data:

Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10),1832-1834. DOI: 10.1039/b502828k

• Domain identifier: International Chemical Identifier (InChI) code
• Google molecule using InChI

Slide from Simon Coles

Adding value: eBank linking data to publications

Linking research to learning - embedding eBank aggregator service in a science portal for student learners

Integration into the curriculum and e-Learning workflows

• MChem course
• Assess role in Undergraduate Chemical Informatics courses
• Pedagogic evaluation
• April – June 2006
• Report to follow.

e-Research workflows

Aggregator services

Institutional data repositories

Data curation & preservation: databases & databanks

Validation

Harvest

Data creation & capture in “Smart lab”

Deposit

Publishers: peer-review journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Search, harvest

Presentation services: portals

Data discovery, linking, citation

Linking, citation

Laboratory repository

Deposit

(Chemistry Central)

e-Crystals Federation model


                                                             


Roles & responsibilities: new challenges?

Workforce development and capacity building

• NSF Draft Report 2005 “Data scientist” - hybrid skills

• Facilitate collaboration
– Multidisciplinary teams: computer scientists, domain scientists, digital library experts, statisticians/modellers, e.g. eBank project
– Lessons learnt: e-Science Human Factors Audit Report (to be published 2006), Roy Kalawsky, Loughborough
• CURL/SCONUL e-Research Taskforce

Has your (digital) library engaged with the e-Research agenda?

Supporting the “native digital scholar”
• Develop leadership & vision for eResearch engagement and infrastructure development
• Provide (e-)Services for data
– We “do” eLearning so why not eResearch?
– Include in institutional digital asset management plans
• Review organisational structures
– Extend & re-profile the Faculty/Subject/Reference Librarian role
– Collaborate closely with Computing Services and Depts
• Promote professional development of staff
– Raise awareness, acquire new skills
– Build multidisciplinary teams, explore emergent roles

Respond to the challenge... The Future is NOW

                                                             


Thank you.

UKOLN receives core funding from the Joint Information Systems Committee (JISC) and the Museums, Libraries & Archives Council (MLA) and is based at the University of Bath, UK.