Programgruppen for Informationsforsyning · 7/7/2010 2 CRIS OAR INTER OPERA + Knowledge Exchange is...
7/7/2010
CRISOAR
INTEROPERA
Knowledge Exchange CRIS-OAR interoperability project
publication metadata
Knowledge Exchange is an international co-operative effort that supports the use and development of e-infrastructures for higher education and research.
Partners are:
• Denmark’s Electronic Research Library (DEFF)
• German Research Foundation (DFG)
• Joint Information Systems Committee (JISC) in the UK
• SURF foundation in the Netherlands
Motivation: Enable broad collaboration in the information management of research publications
• Current Research Information Systems (CRIS)
  • a label for research management systems of various types, dealing with many aspects of research activities
  • contain metadata on research publications
• Open Access Repositories (OAR)
  • a label for open research output archives aiming at preservation and dissemination of publications etc.
  • contain metadata on research publications
• They share the challenge of achieving full metadata coverage for the publications within their scope
Motivation: Enable broad collaboration in the information management of research publications
• If CRIS and OAR could easily exchange metadata about publications, they could support each other
• But CRIS and OAR have grown out of different communities and have developed rather different approaches to publication metadata
• If a university has a CRIS and an OAR, a publication must generally be registered twice to comply with both systems’ requirements
• Both CRIS and OAR strive to be complete in their coverage of publications; both would benefit from collaboration, as would the authors/researchers
Motivation: Enable broad collaboration in the information management of research publications
• CRIS use a variety of formats: some use CERIF (or variants thereof) and some use various local or national formats
• In many disciplines, publications are of global interest and are often results of international collaboration → they are often of interest to more than one CRIS
• CRIS with different formats would benefit from an easy and precise mechanism to exchange publication metadata
Motivation: Enable broad collaboration in the information management of research publications
• OAR use a variety of formats: some use Dublin Core (or variants thereof), some use library formats such as MARC and MODS, and some use various local or national formats
• In many disciplines, publications are of global interest and are often results of international collaboration → they are often of interest to more than one OAR
• OAR with different formats would benefit from an easy and precise mechanism to exchange publication metadata
Aim and purpose
• To increase the metadata interoperability
  • between CRIS and OAR systems
• and thus also
  • between CRIS and CRIS with different formats
  • between OAR and OAR with different formats
• by defining and proposing
  1. a metadata exchange format for publications
  2. a set of common vocabularies for key elements
Project participants
UK - JISC
• Rosemary Russell, UKOLN
• Michael Day, UKOLN
• Simon Lambert, Rutherford Appleton
DE - DFG
• Wolfram Horstmann, Bielefeld University
• Najko Jahn, Bielefeld University
• Friedrich Summann, Bielefeld University
• Max Stempfhuber, Aachen University
NL - SURF
• Marga van Meel, KNAW
• Arnoud Jippes, KNAW
• Ed Simmons, Nijmegen Univ.
DK - DEFF
• Adrian Price, Copenhagen Univ.
• Mikael Elbaek, Technical Univ. DK
• Mogens Sandfaer, Technical Univ. DK
Project manager
Project director
Building new bridges in the old world
Not designing new (and better) worlds
Each metadata island knows well what it is doing. Good reasons govern its choice of format and vocabulary.
We (simply) build a bridge that will enable these islands to communicate without changing their language and lifestyle.
That will allow them to exchange publication metadata without studying and understanding the particularities of the other party.
Challenges stemming from different missions of formats
• The different nature (and tasks) of
  • CRIS formats
  • Repository formats
• The granularity challenge
The different nature of CRIS and repository formats
Typical CRIS main entities and their relations (many triples & many detailed fields)
The different nature of CRIS and repository formats
Simple Dublin Core
15 fields in a single flat structure
Aimed at the description of some sort of “document”
May be enhanced to provide more granularity and specificity
But mostly isn’t
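As a minimal sketch of what "a single flat structure" means in practice, the Python below models a Simple Dublin Core record as 15 optional, repeatable text fields. The helper name `make_dc_record` and the sample values are illustrative assumptions, not part of the project material.

```python
# The 15 elements of Simple Dublin Core, all optional and repeatable.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_dc_record(**fields):
    """Build a flat record: each element maps to a list of plain strings.

    Hypothetical helper for illustration; it rejects names outside the 15 elements.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Simple Dublin Core elements: {sorted(unknown)}")
    return {name: list(fields.get(name, [])) for name in DC_ELEMENTS}


# Invented sample values. The creator is one undivided string: the flat
# structure has no slot for affiliation, role or name parts.
record = make_dc_record(
    title=["An example article"],
    creator=["Doe, Jane"],
    date=["2010"],
    type=["Text"],
)
```

Anything a CRIS knows beyond these 15 slots (for example which organisation the creator belonged to at the time) has nowhere to go in such a record unless the format is enhanced.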
Bridging publications metadata
• CRIS formats are characterized by their
  • broader view on research information, depicting research results as well as the actors and various environmental factors in their own right
  • (often) high level of detail and specificity in describing the various entities (very granular and precise)
  • ability to handle the dynamics of time, since everything except research publications changes over time, as do the interrelations
Bridging publications metadata
• OAR (DC) formats are characterized by their
  • narrow view on depicting research results, generally publications
  • (mostly) low level of detail and specificity in describing the various aspects (less granular)
  • absence of need to handle the dynamics of time, as they deal with research publications tied to a specific point in time
Bridging publications metadata
→ Implode the relational/network nature of the CRIS formats to a single structure, adequate for describing publications
→ Design the field/element hierarchy so that highly granular as well as less granular metadata may be represented without loss of information
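A minimal sketch of the two ideas above, assuming a CRIS that keeps persons, publications and their links in separate tables. All names here (`PersonName`, `implode`, the table layout) are hypothetical and only illustrate the principle; they are not the project's format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    """A name that works at two levels of granularity."""
    full: str                      # always present: enough for a less granular system
    family: Optional[str] = None   # optional parts: kept when the source has them
    given: Optional[str] = None


def implode(pub_id, publications, persons, links):
    """Collapse the link rows for one publication into a single nested record."""
    record = {"title": publications[pub_id]["title"], "persons": []}
    for link in links:
        if link["publication"] == pub_id:
            record["persons"].append(
                {"name": persons[link["person"]], "role": link["role"]}
            )
    return record


# Invented sample data in a relational, CRIS-like layout.
persons = {
    "p1": PersonName(full="Jane Doe", family="Doe", given="Jane"),
    "p2": PersonName(full="John Roe"),  # source only had the undivided name
}
publications = {"pub1": {"title": "An example article"}}
links = [
    {"publication": "pub1", "person": "p1", "role": "Author"},
    {"publication": "pub1", "person": "p2", "role": "Editor"},
]

record = implode("pub1", publications, persons, links)
```

A granular consumer can read `family` and `given`; a less granular one reads only `full`, so neither side loses what it had.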
Project approach
[Figure: formats DRIVER DC, CERIF, NARCIS MODS, DDF-MXD, METIS and ePrints default; further labels: DRIVER (four times), "Metadata exchange format and vocabulary"]
Project approach
1. Analyze metadata practices of CRIS and OAR
  • Looking at formats in actual use at KE partners
  • Charting entities and granularities, similarities, differences
Project approach
2. Define entities/elements/attributes to be exchanged
  • Respecting differences in granularity
  • So that metadata may be exported without loss of information
  • So that the format may be used by very granular environments as well as less granular ones
3. Define/propose common exchange vocabulary
  • For the identified key concepts/entities
4. Define/propose common exchange syntax
  • Handle differences in granularity
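One way to read steps 2 to 4 is as a pivot design: each local format needs only one converter to and one from the exchange format. The sketch below assumes two made-up local layouts (`local_a`, `local_b`) and a dict-based exchange record; none of these names or field layouts come from the project.

```python
# Each local format registers one converter to and one from the exchange record.
# Both layouts are invented for illustration.
TO_EXCHANGE = {
    "local_a": lambda rec: {"title": rec["main_title"], "type": rec["pub_type"]},
    "local_b": lambda rec: {"title": rec["Title"], "type": rec["Kind"]},
}
FROM_EXCHANGE = {
    "local_a": lambda ex: {"main_title": ex["title"], "pub_type": ex["type"]},
    "local_b": lambda ex: {"Title": ex["title"], "Kind": ex["type"]},
}


def convert(record, source, target):
    """Translate a record between two local formats via the exchange record."""
    return FROM_EXCHANGE[target](TO_EXCHANGE[source](record))


def converters_needed(n_formats):
    """Return (pairwise, via_exchange) one-way converter counts for n formats."""
    return n_formats * (n_formats - 1), 2 * n_formats
```

For seven local formats, `converters_needed(7)` gives 42 one-way converters for direct pairwise exchange against 14 with a shared exchange format, and neither side of `convert` has to know the other's layout.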
Some potential use cases
• CRIS → OAR
• OAR → CRIS
• CRIS → CRIS
• OAR → OAR
• CRIS/OAR → OpenAIRE (EU Open Access pilot)
• Publisher → CRIS/OAR
• Subject repository → CRIS/OAR (institutional)
Over to Mikael
Based on ideal examples – “use cases”
Ideal example of a publication
The DC elements are used as a baseline.
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
Main entities of interest
• The publication is in focus and other entities are in relation to the publication
Person
Organisation
Event
Project
Publication
Person in more detail
Vocabularies
• Person
  • Role
    • Description: the role is the person’s role in relation to the publication.
Terms:
• Author
• Primary Author
• Corresponding Author
• Editor
• Publisher
• Translator
• Illustrator
• Inventor
• Supervisor
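A controlled vocabulary like this can be held as an enumeration, so that a record can only carry one of the agreed terms. The sketch below is an illustration under that assumption; the names `PersonRole` and `parse_role` are hypothetical.

```python
from enum import Enum


class PersonRole(Enum):
    """Role terms as listed on the slide."""
    AUTHOR = "Author"
    PRIMARY_AUTHOR = "Primary Author"
    CORRESPONDING_AUTHOR = "Corresponding Author"
    EDITOR = "Editor"
    PUBLISHER = "Publisher"
    TRANSLATOR = "Translator"
    ILLUSTRATOR = "Illustrator"
    INVENTOR = "Inventor"
    SUPERVISOR = "Supervisor"


def parse_role(label):
    """Look up a role by label, ignoring case and extra whitespace."""
    wanted = " ".join(label.split()).casefold()
    for role in PersonRole:
        if role.value.casefold() == wanted:
            return role
    raise ValueError(f"not in the role vocabulary: {label!r}")
```

Because the role describes the person in relation to a publication, it belongs on the link between the two rather than on the person record itself.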
Publication in detail – type, review and
Publication types
• Publication
  • Type
    • Description: the format provides a broad list of publication types, based on an analysis of the formats examined in the project. A mapping between the different systems and formats in the analysis can be found on a web page.
    • Mapping between common vocabularies can be found at: http://weekschild.uci.ru.nl/KE/?select=all
    • The formats analysed: CERIF2008, MODS/DIDL, DRIVER_DC, DDF-MXD, EPrints, METIS, PURE
Publication types (terms)
• Journal Letter
• Journal comment
• Journal review article
• Journal book review
• Book
• Book chapter
• Book preface
• Conference paper
• Conference abstract
• Conference poster
• Conference talk
• Thesis Doctoral
• Thesis PhD
• Thesis Master
• Working paper, preprint
• Report
• Report chapter
• Lecture Notes
• Lecture
• Memorandum
• Net publication
• Patent
• Software
• Data set
• Newspaper article
• Radio/TV broadcast
• Exhibition catalogue
• Student report
• Other
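Mapping a local system's type labels onto the common terms could look roughly like the sketch below. The local labels ("chapter", "phd", "poster") are invented for illustration and are not taken from the project's mapping page; only the target terms come from the list above.

```python
# A few terms from the common list above; "Other" is the catch-all.
COMMON_TYPES = {"Book chapter", "Thesis PhD", "Conference poster", "Other"}

# Hypothetical labels of one local system mapped onto the common terms.
LOCAL_TO_COMMON = {
    "chapter": "Book chapter",
    "phd": "Thesis PhD",
    "poster": "Conference poster",
}


def to_common_type(local_label):
    """Map a local type label to a common term, falling back to "Other"."""
    return LOCAL_TO_COMMON.get(local_label.strip().lower(), "Other")
```

Falling back to "Other" keeps the export valid when a local type has no counterpart, at the cost of losing the finer distinction.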
Vocabularies - Versions
• Version
  • Description: This element and vocabulary express the version of the document, i.e. draft or published version. The terms are based on the VERSIONS toolkit excluding the term “updated”.
  • Important! Different versions should be self-contained and constitute individual records. This mirrors best practice for repositories but is not always the case for CRIS.
Terms:
• Draft, i.e. working paper
• Submitted, i.e. pre-print
• Accepted, i.e. post-print
• Published, i.e. publisher edition
• Updated, i.e. reprint
• VERSIONS project: http://www2.lse.ac.uk/library/versions/
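The version terms and the "one record per version" rule can be sketched as follows. This assumes the five terms exactly as listed on the slide (including "Updated"); the names `Version`, `ALIASES`, `version_of` and the `work` key linking the records are hypothetical.

```python
from enum import Enum


class Version(Enum):
    """Version terms as listed on the slide."""
    DRAFT = "Draft"
    SUBMITTED = "Submitted"
    ACCEPTED = "Accepted"
    PUBLISHED = "Published"
    UPDATED = "Updated"


# Everyday labels from the term list, mapped to the vocabulary terms.
ALIASES = {
    "working paper": Version.DRAFT,
    "pre print": Version.SUBMITTED,
    "post print": Version.ACCEPTED,
    "publisher edition": Version.PUBLISHED,
    "reprint": Version.UPDATED,
}


def version_of(label):
    """Resolve a vocabulary term or everyday label to a Version."""
    key = label.strip().lower().replace("-", " ")
    if key in ALIASES:
        return ALIASES[key]
    return Version(key.capitalize())  # raises ValueError for unknown labels


# Each version is its own self-contained record; the shared "work" id is an
# assumed way of keeping the versions associated.
records = [
    {"work": "w1", "title": "An example article", "version": version_of("pre-print")},
    {"work": "w1", "title": "An example article", "version": version_of("Published")},
]
```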
Let’s test it!
The challenges for interoperability
• Discussion!