UCL LIBRARY SERVICES Science 2.0: Research Data Management Dr Paul Ayris Director of UCL Library...

UCL LIBRARY SERVICES. Science 2.0: Research Data Management. Dr Paul Ayris: Director of UCL Library Services and UCL Copyright Officer; Chief Executive, UCL Press; President of LIBER (Association of European Research Libraries); Chair, LERU Chief Information Officer Community (League of European Research Universities). e-mail: [email protected]

Page 1

Science 2.0: Research Data Management

Dr Paul Ayris

Director of UCL Library Services and UCL Copyright Officer

Chief Executive, UCL Press

President of LIBER (Association of European Research Libraries)

Chair, LERU Chief Information Officer Community (League of European Research Universities)

e-mail: [email protected]

Page 2

Contents

The importance of Research Data

LERU Research Data Roadmap

Roles and Opportunities

Text and Data Mining

Next Steps

Page 3

Bibliography

Science as an Open Enterprise (2012) http://royalsociety.org/policy/projects/science-public-enterprise/report/

Susan Reilly, Opportunities for Data Exchange: optimising the conditions for data sharing (2012). LERU Doctoral Summer School, 9th July, 2012 http://www.ub.edu/lerudss2012/en/material.html

Opportunities for Data Exchange project website (2012) http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/

UCL Research Data Management Policy (2013) http://www.ucl.ac.uk/isd/staff/research_services/research-data/

Page 4

Bibliography

The Perfect Swell: defining the ideal conditions for the growth of text and data mining in Europe. Report from a workshop on Friday, September 27th 2013, organised by LIBER Europe and held at the British Library (2013) http://www.libereurope.eu/sites/default/files/TDM%20Workshop%20Report%5B1%5D_0.pdf

LERU Roadmap for Research Data (2014) http://www.leru.org/index.php/public/news/press-release-leru-roadmap-for-research-data/

A Scientist’s Take on the new Elsevier TDM Policy (2014) http://www.libereurope.eu/blog/a-scientists-take-on-the-new-elsevier-tdm-policy

Page 6

See Science as an open enterprise http://royalsociety.org/policy/projects/science-public-enterprise/report/

Page 7

Technological change

Modern computers permit massive datasets to be assembled and explored in ways that reveal inherent but unsuspected relationships. This data-led science is a promising new source of knowledge (p. 7)

The emergence of linked data technologies creates new information through deeper integration of data across different datasets with the potential to greatly enhance automated approaches to data analysis (p. 7)
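As a rough illustration of the linked-data point quoted above, the sketch below merges two tiny datasets that share an identifier and then follows links across them to derive a statement that neither dataset holds on its own. The identifiers, predicate names and function names are all hypothetical and chosen only for illustration; real linked data would use full URIs and an RDF toolkit rather than plain Python sets.

```python
# Two illustrative datasets expressed as (subject, predicate, object) triples.
# The shared identifier "gene:BRCA1" is what allows them to be linked.
genomics = {
    ("gene:BRCA1", "locatedOn", "chromosome:17"),
}
clinical = {
    ("disease:breast_cancer", "associatedWith", "gene:BRCA1"),
}


def link(*datasets):
    """Merge several triple sets into one graph (a plain set union)."""
    merged = set()
    for triples in datasets:
        merged |= triples
    return merged


def diseases_by_chromosome(graph):
    """Follow disease -> gene -> chromosome links within a graph."""
    gene_to_chromosome = {s: o for s, p, o in graph if p == "locatedOn"}
    return sorted(
        (disease, gene_to_chromosome[gene])
        for disease, p, gene in graph
        if p == "associatedWith" and gene in gene_to_chromosome
    )


graph = link(genomics, clinical)
print(diseases_by_chromosome(graph))
```

Queried separately, each dataset yields no disease-to-chromosome pairs; the pair appears only once the two are merged, which is the sense in which integration "creates new information".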

Page 8

Map of Interlinked Data

W3C (2012). Available at: http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

Page 9

Open Data

Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control

http://en.wikipedia.org/wiki/File:DNA_orbit_animated.gif

Auer, S. R.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; Ives, Z. (2007). "DBpedia: A Nucleus for a Web of Open Data". The Semantic Web. Lecture Notes in Computer Science 4825. p. 722. doi:10.1007/978-3-540-76298-0_52. ISBN 978-3-540-76297-3.

Page 10

Human Genome Project

Aim: To determine the sequence of chemical base pairs which make up human DNA, and to identify and map the total genes of the human genome

Benefits – felt from molecular medicine to human evolution

Better understanding of disease

Design of medication and prediction of their effects

Commercial development of genomics research

See http://en.wikipedia.org/wiki/DNA

Page 11

Richard III

Page 14

LERU Roadmap for Research Data

Overseen by Research Data Working Group

Pablo Achard (University of Geneva)

Paul Ayris (UCL, University College London)

Serge Fdida (UPMC, Paris)

Stefan Gradmann (University of Leuven)

Wolfram Horstmann (University of Oxford)

Ignasi Labastida (University of Barcelona)

Liz Lyon (University of Bath)

Katrien Maes (LERU)

Susan Reilly (LIBER)

Anja Smit (University of Utrecht)

Page 15

LERU Roadmap for Research Data

1. Policy and Leadership

2. Advocacy

3. Selection and Collection, Curation, Description, Citation, Legal Issues

4. Research Data Infrastructure

5. Costs

6. Roles, Responsibilities and Skills

7. Recommendations to different stakeholder groups

CERN, Geneva

Page 16

See http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/150.pdf

Page 17

Key Messages

Each LERU university needs a Research Data Management Strategy

Researchers should have Research Data Management Plans

LERU universities need to bring stakeholders together

Benefits of ‘open data’ for sharing and re-use should be advocated and explored

New role of Data Scientist is emerging

King’s Cross, London

Page 18

Policy Development

Case Study on Policy development from UCL

Drivers

External funders

Need to inform researchers

Raise awareness of issues facing UCL researchers

Identifies roles and responsibilities

Data to be made open in the most open manner appropriate

Researchers should have Data Management Plans

LERU slams lack of data policies – Research Europe

Page 19

Open Data

Open Data allows research data to be shared and re-used

Avoids costly duplication of research activity

Provides greater transparency in research activity

Potential to speed discovery of solutions to societal Grand Challenges, such as health care & environmental science

Can all research data be open? Certain categories probably cannot

National security

Data protection

Commercial

Funder requirements

http://en.wikipedia.org/wiki/File:Open_Data_stickers.jpg

Page 20

Data management

Which of these layers of research data need to be

curated for a fixed term?

preserved for the long term?

thrown away?

LERU Roadmap identifies this as an area for future study

The ODE Data Publication Pyramid at http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf

Page 24

Collaboration: a way forward

LERU Rectors see this as an area for study

Collaboration between Dutch institutions

Focus is on research data which lies behind publications

Each university and faculty has its own Dataverse installation

Support services offered by libraries in Dutch universities

Utrecht, Tilburg, Erasmus University Rotterdam, Maastricht, Groningen, 3TU Datacentrum and Netherlands Institute of Ecology

Page 25

Collaboration – a UCL Case Study

UCL Research Data Service

Will curate outputs of ‘Big Science’ funded by projects

Centrally funded by UCL

Some cost recovery

Add a preservation service

Advocacy for research data management

UCL Library Services

Will curate the outputs of ‘Small Science’

Funded via the Library

No cost recovery planned

Oversee UCL policy development

Advocacy for research data management

Plaster Relief by John Flaxman, Flaxman Gallery, UCL

Page 29

7 areas of opportunity

Availability

Findability

Interpretability

Reusability

Citability

Curation

Preservation

http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/

Page 34

Text and Data Mining – What is it?

Page 35

European discussions on TDM

Licences4Europe

LIBER and research stakeholder organisations withdrew from process

LIBER’s TDM Workshop in September 2013

Commission now holding a copyright consultation – until 5 March 2014

LIBER wants Exception in the European Copyright and Database Directives

Page 36

The view from a researcher

Dr Peter Murray-Rust (Cambridge)

Elsevier is the sole author and controller of the policy – there has been no Open discussion or agreement with scholarly bodies

Libraries have to – individually – sign agreements with Elsevier. (Libraries have universally and unilaterally given away all these rights over the last decade and support publishers to forbid machine access to content)

Researchers have to register as a developer (I think) and ask permission of Elsevier for every project they wish to do

Page 37

And …

Researchers can only mine text. Images are specifically prohibited. This is useless for me – as I and colleagues are mining chemical structure diagrams

There is no indication of how current the material will be. I shall be mining the literature an hour after it appears. Will the API provide that?

The amount that can be republished is often useless (“200 characters”). I want to build corpora (impossible); vocabularies (essential to record precise words – impossible); chemical names (often > 200 characters so impossible). Figure captions (impossible)

Page 38

And …

The researchers must commit to a CC-NC licence. This effectively kills downstream use (I shall use CC0). It also trains them into thinking CC-NC is a “good thing”. It isn’t

If a researcher has a LEGITIMATE collection of papers that they wish to mine (say on their hard disk) they are forbidden. They have to go to each publisher (if this awful protocol is promoted elsewhere) and find the API and mine the individual papers. Absurd

Page 41

Next Steps in London

Policy creation

UCL has a Research Data Management policy at http://www.ucl.ac.uk/isd/staff/research_services/research-data/ Do you?

Advocacy

Meetings with Research Committees at all 10 UCL Faculties to raise awareness; Communications Strategy to follow

Training

UCL Library Services establishing a training programme for library staff to be provided by the Library School, University of Sheffield

Page 42

Finally

Breakout Groups Discussion

Page 43

Questions for Breakout Groups

What are the main points your Research Data Management Policy should make?

What are the drivers to help engage with researchers?