Repositories Update (UK)

Repositories Update (UK). Peter Burnhill, Director, EDINA National Data Centre, University of Edinburgh, Scotland UK. JISC/CNI Conference, Edinburgh, 1 & 2 July 2010. Managing Data in Difficult Times

description

Presentation by Peter Burnhill at the JISC/CNI Conference, Edinburgh, 1 & 2 July 2010.

Transcript of Repositories Update (UK)

Page 1: Repositories Update (UK)


Repositories Update (UK)

Peter Burnhill

Director, EDINA National Data Centre, University of Edinburgh, Scotland UK

JISC/CNI Conference, Edinburgh, 1 & 2 July 2010

Managing Data in Difficult Times

Page 2: Repositories Update (UK)

Overview
policies/strategies/technologies/infrastructure to manage research/teaching

• Scope
  – Digital repositories at the level of the institution (for itself), at a level above the campus: for institutions, for UK, for much much more
    * within the European and wider international context
    * in support of research, learning & teaching … and management
• Having voice as …
  – a provider of common services and national infrastructure [EDINA]
  – a user of repository software [Eprints, DSpace, IntraLibrary]
  – a member of SONEX and indirectly of COAR and UK-CORR
• and focus on repository-related progress in the UK since last JISC/CNI; where is the value, how is this assessed/expressed?
  – Size of investment in recent times
  – Cost-effectiveness and ‘impact’ of provision
    * Effort at institutional & inter/national level and the ‘shared services’ agenda?
• Wondering what Dorothea said next …

Page 3: Repositories Update (UK)

Managing Data in Difficult Times

Nostalgia for interesting but not difficult times?

• JISC Repositories & Preservation Programme: April 2006 to March 2009
  “£14m investment in H.E. repository and digital content infrastructure”
• This included the JISC RepositoryNet, as four ‘support services’:
  ① Repository Support Project
  ② Repository Research Project
  ③ Intute Repository Search
  ④ ‘interim repository’ | Prospero | the Depot | OpenDepot
• Checking the JISC website today
  – under the heading of ‘key digital repository activities’ are 21 funding programmes and 216 funded projects.

Including some that are just being awarded … & then there is:

• OR10: Open Repositories Conference, 6/9 July 2010, Madrid
• RepoFringe2010: Repository Fringe 2/3 September, Edinburgh
• and several others

Page 4: Repositories Update (UK)

R is for Repository

• What are Repositories?
  – Facility/technology to support at least three basic types of service:
    PUT: a service interface that allows one or more use communities to deposit/issue digital content (+ metadata on that content)
    KEEP: a service that ensures the integrity of that content, for the life of the repository
    GET: a service interface that allows one or more use communities to search/extract that content
    * Use community: persons or machines/software; appropriate interface
• Digital Repositories Review (R. Heery and S. Anderson, 2005)
  – Digital repository differs from other digital collections in that:
    * "content is deposited, whether by content creator, owner or third party
    * architecture manages content as well as metadata;
    * repository offers a minimum set of basic services [put, get, search, access control]
    * must be sustainable & trusted, well-supported & well-managed."
• "a university-based institutional repository is a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access ..." (C. Lynch, 2003)
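The PUT, KEEP and GET services named on this slide can be illustrated with a minimal sketch. It is not taken from the talk or from any repository platform (Eprints, DSpace and IntraLibrary each have their own interfaces); the class name `Repository`, its method names and the use of a SHA-256 checksum as the integrity check are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Item:
    content: bytes
    metadata: dict
    checksum: str  # SHA-256 of the content, recorded at deposit time


class Repository:
    """Toy in-memory repository (hypothetical) illustrating PUT, KEEP and GET."""

    def __init__(self):
        self._items = {}

    def put(self, item_id, content, metadata):
        # PUT: deposit content plus metadata, recording a fixity checksum.
        checksum = hashlib.sha256(content).hexdigest()
        self._items[item_id] = Item(content, dict(metadata), checksum)
        return checksum

    def keep(self):
        # KEEP: re-verify stored content; return ids that fail the fixity check.
        return [
            item_id
            for item_id, item in self._items.items()
            if hashlib.sha256(item.content).hexdigest() != item.checksum
        ]

    def search(self, **criteria):
        # GET (search): ids whose metadata matches every given field exactly.
        return [
            item_id
            for item_id, item in self._items.items()
            if all(item.metadata.get(k) == v for k, v in criteria.items())
        ]

    def get(self, item_id):
        # GET (extract): return the content and a copy of its metadata.
        item = self._items[item_id]
        return item.content, dict(item.metadata)
```

A depositor, whether a person or a machine, would call `put`, a scheduled job would call `keep`, and a reader or harvester would call `search` followed by `get`. A real service would add the access control and sustainability commitments listed by Heery and Anderson, which this sketch leaves out.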

Page 5: Repositories Update (UK)

R is for Repository

• Who has Repositories and why?

Page 8: Repositories Update (UK)

R is for Repository

• What are Repositories and what are they for?
  – Allowing deposit of and holding all sorts of digital things/stuff
    * Metadata + Objects; Metadata + pointers; Metadata only
    * All sorts of objects: images, datasets, theses, articles, etc etc
• Special interest in serving our central task:
  – ease & continuity of access to scholarly resources

Page 9: Repositories Update (UK)

[Diagram: Ensuring researchers, students and their teachers have ease and continuing access to online scholarly resources (P. Burnhill, Edinburgh 2009)]

Use case: article-length work published in e-journals, but other use cases apply

Diagram labels: ‘ease’; ‘continuing’; anytime/place convenience; authorisation; licence to use; usability; open; preservation; post-cancellation; restricted; access to content & services; reliability; well-seamed; interoperability; functionality; who/WAYF; authentication; projects

Page 10: Repositories Update (UK)

UK funding councils

JISC Sub-Committees

JISC Collections

acting as platform for network-level services & helping to build the JISC Integrated Information Environment

research, learning & teaching in UK universities & colleges

Research Councils

UK

National Data Centres

Page 11: Repositories Update (UK)

1 & 2: provider of services & user of software

• EDINA-run repositories, with and without JISC

– DataShare: for research data (institutional, U of Ed)

* Open Data; using DSpace

– Jorum: for learning materials [with Mimas]

* OER and turnstile (UK); using DSpace & IntraLibrary

– OpenDepot (the Depot): for research papers

* OA (world); using Eprints

– ShareGeo: for geo-spatial data

* Open Data and turnstile (UK); using DSpace

– OA Repository Junction as shared service tool

* using own code and Eprints as an 'escrow' repository during the transfer process.

– & maybe others … depending on definition of repository

Page 12: Repositories Update (UK)

for learning materials [with Mimas]

OER and turnstile (UK);

using DSpace & IntraLibrary

Page 13: Repositories Update (UK)

for research papers

OA (world); using Eprints

Page 14: Repositories Update (UK)

ShareGeo: for geo-spatial data

Turnstile (UK) Data & Open Data; using DSpace

Page 15: Repositories Update (UK)

3. SONEX

• four individuals in JISC-sponsored mini think-tank
  – from Denmark, Spain & UK
  – Mogens Sandfaer, Pablo de Castro (Chair) & Jim Downing (Richard Jones) and Peter Burnhill
• came out of international workshop Amsterdam, March 2009
  – charged with looking at how repositories should inter-operate
  – the focus group given name of ‘repository handshake’
  – 3 other focus groups on citation, identifiers and ‘organisation’
    * the latter an exit strategy for EU-funded DRIVER project?
• focus switched to ‘deposit opportunities’
  – semi-automatic issue/deposit, under terms of Open Access
    * concern about risk of ‘hollow ring of repositories’
    * avoid diktat about standards and techno babble
  – looking to interoperability via SWORD

Page 16: Repositories Update (UK)

3. SONEX

• focus switched to ‘deposit opportunities’
  – Initial categorisation of repositories into which authors deposit
  – Looking to onward interworking/interoperability (SWORD)
    * Not just technical interoperability but workflow
  – Role of repository managers
• But also recognition of other network-attached ‘systems’:
  – Authoring tools
    * Desktop software
  – Bibliography tools
  – Non-Author-based workflows
    * CRIS
    * REF

Page 17: Repositories Update (UK)

SONEX: Scholarly Output Notification & EXchange

• Re-branded ourselves as SONEX, to signal …
  – ‘scholarly output’, not just research publications
  – ‘notification’ using metadata only
  – ‘exchange’ as two-way interoperability/negotiation
    * push metadata; pull content; exploit always-on Internet
• SONEX use case: multi-person & multi-institutional
• SONEX activities:
  – Identify/analyse deposit opportunities (use cases) for ingest into the repository space.
  – Identify/promote projects tackling deposit use cases
  – Gap analysis
• machine (third party systems) as user (PUT & GET)

http://sonexworkgroup.blogspot.com

Page 18: Repositories Update (UK)

SONEX Use Case Actors

• Use case Actor 1: Individual author/researcher [person]
  – author of multi-authored article, other author(s) at other institution(s)
  – sole author with entire career at a single institution [exception]
  – Variant: author making deposit is the PI of funded research project (compliance with mandate from funder to deposit)
  – Variant: author making deposit is not the PI of funded research project but work is associated with one or more funded research projects (PI)
• Use case Actors 2 & 3: Depositor is not author (Mediated deposit)
  – Variant: support staff in research group
  – Variant: Library’s own resources and document collections
  – Variant: Institutional Research Support Systems (CRIS systems) [machine]
• Use case Actor 4: Repository Manager (RM) of an IR
  – wishing to be notified & obtain copy from a subject (SR) or another IR
• Use case Actor 5: Publisher (which work is published) [machine]
  – deposit under OA of the author's final copy (OA-RJ & PEER projects)
  – OA of published copy
  – Pointer supply to published copy
• Other Actors: Vendor of authoring or repository software

Page 19: Repositories Update (UK)

SONEX Use Case Scenarios

Given opportunity, and motivation, to deposit content into the ‘repository space’, for onward notification and exchange:

1. PI(s) as co-author
   * with felt obligation to notify grant funders of OA deposit
   * via web-based or desktop environment
2. Publisher(s)
   * assisting their author(s) in supply of full-text into appropriate repositories
3. CRIS, a campus research information system
   * managed support for researchers, including note of publications for the Project/Grant
4. ‘Bibliography’
   * web-based publications lists
   * as maintained by individual researchers, Research Groups, Departments, etc.
   * including RAE/REF driven institutional actions

Page 20: Repositories Update (UK)

OA Repository Junction Project

• m2m broker supports:
  – Discovery of user & content type
  – Get / ingest package of data (metadata + digital object)
  – Deduce / parse data object & deduce target repository(s)
  – Pass / deposit package into repository targets
  – Notify / send alert to appropriate 3rd party(s), e.g. repository managers
• Working with ‘Publisher’ and ‘Subject Repository’ via Broker Service
• Theo Andrew & Ian Stuart (EDINA)
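The Get, Deduce, Pass and Notify steps listed for the broker can be sketched roughly as follows. This is not the OA Repository Junction code: the names `Package`, `TargetRepository`, `deduce_targets` and `broker` are hypothetical, the rule of matching author affiliations to institutions is an assumption, the deposit is an in-memory stand-in rather than a SWORD call, and the Discovery step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    metadata: dict   # e.g. {"title": ..., "affiliations": [...]}
    payload: bytes   # the digital object itself


@dataclass
class TargetRepository:
    name: str
    institution: str
    manager_email: str
    holdings: list = field(default_factory=list)

    def deposit(self, package):
        # Stand-in for a real deposit call (for example over SWORD).
        self.holdings.append(package)


def deduce_targets(package, repositories):
    # Deduce: match each author affiliation against the known repositories.
    affiliations = set(package.metadata.get("affiliations", []))
    return [r for r in repositories if r.institution in affiliations]


def broker(package, repositories, notify):
    # Get -> Deduce -> Pass -> Notify, in the order given on the slide.
    targets = deduce_targets(package, repositories)
    for repo in targets:
        repo.deposit(package)  # Pass: hand the package to the target
        notify(repo.manager_email, package.metadata.get("title", "(untitled)"))
    return [r.name for r in targets]
```

Passing `notify` in as a callable keeps the alert channel (email, a feed, a log) outside the broker, so the same routing logic could serve a publisher feed or a subject repository.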

Page 21: Repositories Update (UK)


Part 2: Showcase

Page 22: Repositories Update (UK)

O is for Open

• OA (for publications) not the only ‘open’ policy:
  – OER: Open Educational Resources
    * UKOER: Jorum and other subject/institutional repositories
    * Open CourseWare – as open webpages
  – Open Data
    * Both repository and open databases; Linked Open Data
  – Open Source Software
• Open Access
  – the regime used for Subject Repositories
  – seemed to be motive for creation of Institutional Repositories
    * ‘Green OA’ self-archiving by authors: Creative Commons
• Is this how we should judge success of Repositories?
  – OA now becoming mainstream, including uptake by publishers
  – "One fifth of 2008's research papers now open access" (The Great Beyond, Nature blog, June 25, 2010)
• Are Repositories the only way to support OA?
  – Repositories to align themselves with, and support funder-mandates for open access if they are to be successful

Page 24: Repositories Update (UK)

Informal discussion with JISC programme managers

“Dealing with institutional processes now, rather than repository technology. Depending on type of content, the projects would fit much more closely in:

• managing research data programme
• research information programme
• open educational resources programme

as they have much more in common with those projects than they do with each other.”

“repositories have found their core business proposition via the REF and making sure Universities list research outputs to obtain research ratings

- have not succeeded in making the business case that IRs should be doing the job of archiving, a core library platform, or the job of an institutional demonstrator/poster space.

Repositories fit in the ‘University Enterprise Stack’ by virtue of being a system that delivers a business solution to a real financial problem.”

Page 26: Repositories Update (UK)

UK-CORR: UK Council of Research Repositories

individual rather than institutional, [email protected]
UK has ‘rich heterogeneous repository landscape’ (C. Awre)
lurk following comment from Dorothea Salo: US mainly about OA full texts; UK mainly about … serving research assessment!

– Is there more to IRs than the REF: lots of bibliographic records & little full text?
– Should IRs only accept full text, not metadata only?
– in absence of a CRIS, our IR had to do REF (Lancaster & Northampton)
– was OA but then RAE2008, but should aim to include all (OU)
– motive for IR was digital preservation, with different REF system; funder mandate compliance for OA; visibility via OA (Oxford/Bodleian)
– RAE/REF is opportunity to engage institution-wide (Warwick)
– Advent of CRIS (which don’t manage outputs well) may be opportunity for IRs to have role, including use of ‘metadata only’ as lever to obtain full text (Hull)
– REF & research management information allows IRs to be embedded as platform for OA (Southampton)
– RAE/REF has different goals to OA and IRs with low % of full text may undermine OA movement (Nottingham)

Page 27: Repositories Update (UK)

COAR: Confederation of Open Access Repositories

• New: 1st General Assembly in Madrid in March 2010
• 48 members drawn largely from Europe, but including both JISC & CNI, and also EDINA (University of Edinburgh)
• Work Plan for 2010/12, including
  1. Advocacy on behalf of OA and repositories (Rs) [both together?]
  2. Populating (OA) Rs
  3. Best practice documents
  4. Facilitate and ensure data interoperability of (across?) Rs
  5. Interoperability with other systems (such as CRIS systems)
  6. Support national helpdesks
  7. Guidance on how Rs will form essential elements for global e-infrastructure
  8. Promote R manager profession
  9. Provide advice & guidance on suitable R infrastructure technologies
  10. Global (meta)data store
  11. Strategic partner for other infrastructure-related initiatives worldwide

Page 28: Repositories Update (UK)

Managing Data in Difficult [Interesting] Times

End of an era? End of the R word? Embedded in domain-specific processes?

1. Moving from technology to policy & practice: some domain-specific, some common to repositories
   a) Collection management: active curation & Linked relationships
      * versions, data|article|learning material
      * Collections, ‘see also’
   b) First point of public issue (availability); Take-down regimes
2. Institutional stewardship responsibility for its born-digital [and digitised] content
   – "a university-based institutional repository [supports] a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access ..." (C. Lynch, 2003)
3. What of the (new) shared services imperative?
   – Who does what, at what level/scale?

Page 29: Repositories Update (UK)


Theoretical basis for digital library?

• Mix of document tradition & computation tradition

“considerable simplification, … helpful to think … of two traditions, or mentalities, even cultures, co-exist in area of Information Science

1. “Approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like

2. “approaches based on uses for formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical (as in algorithmic procedures).”

Michael Buckland, UC Berkeley, 1998

http://people.ischool.berkeley.edu/~buckland/asis62.html

Page 30: Repositories Update (UK)


Time for me to stop

Hoping that I have left some space/place for questions

Thank you

Acknowledgements

Theo Andrew, Pablo de Castro & Robin Rice, Dave Flanders & Andy McGregor

Page 31: Repositories Update (UK)

Multimedia resources: candidate for repository?

• platform for search and download of film, video and audio
  – wide range of subject coverage, including documentary film
  – Licensed for use in learning, teaching and research
• Being re-worked as the Digital Media Hub, combining
  – Film & Sound Online
    * initial 600 hours of film, digitised for downloading
  – NewsFilm Online
    * 3000 hours of material from ITN & Reuters
    * Over 4 TB of clips to download
  – Release of product from JISC Digitisation programmes
    * Plus Education Image Gallery of still photography
  – Visual and Sound Materials Portal project
    * Discovering all sorts of audio-visual material
• Special interest for social science as a non-print record of the 20th Century: the first A-V century
  – With new forms of research material to use and to master