www.uni-stuttgart.de
Networking institutional repositories in Germany – DINI / DFG projects (… and DRIVER)
Frank Scholze
Stuttgart University Library
KUB Seminar on Open Access, Copenhagen, 29.11.2007
29.11.2007 Frank Scholze
Overview
General
DINI certificate
DINI / DFG projects
- OA network
- OA statistics
- OA citation
DRIVER
Open Access strategies
Self-archiving ("green")
- Disciplinary strategy: Disciplinary Repositories
- Institutional strategy: Institutional Repositories
OA publishing ("gold")
- Disciplinary strategy: BMC, PLoS, ACP …
- Institutional strategy: University Presses
The current situation for digital repositories
More than 1000 institutional repositories worldwide, about 120 in Germany
Many others: disciplinary, national, …
Many types: Primary data, textual documents, learning materials, multimedia objects, code …
Documents: incl. pre-prints, postprints, technical papers, dissertations, theses …
Various repository software
What is known about repositories?
Many have implemented OAI-PMH, with small but relevant local peculiarities
Some international registries exist: OpenDOAR, ROAR …
Some national registries exist: DINI list …
Some search engines exist: BASE, OAIster, Google Scholar …
Collaboration in repositories
Very few mature national repository organizations/collaborations: SURF, DINI …
No trans-national repository organization/collaboration
Lack of data harmonization and orchestration of services
From the user's point of view (talking about researchers)
Fragmented, obscure information landscape
Content can be (partly) searched and found
Quality and re-use differ from repository to repository
Research process and repositories
[Figure: Repository Infrastructure spanning the outputs of the research process: pre-research documents, grey literature (?), pre-prints, patent documents, research documents, processed data, raw data, learning materials, secondary publications, published reports, theses, books, reviews, etc.]
From: e-SciDR Lisbon workshop, 4th September 2007
“Make it workable”
Focus on existing repositories and services
Focus on Institutional Repositories
- Rapid progress over the last years
- Inherent sustainability (e.g. libraries)
- Adequate technical homogeneity (OAI-PMH)
Focus on textual materials
DINI
Deutsche Initiative für Netzwerkinformation (German Initiative for Networked Information)
Coalition of German higher-education infrastructure and service institutions
- Libraries
- Computing Centres
- Media Centres
- Scientists
8 Working Groups
DINI Certificate
Launched in 2003 by DINI Electronic Publishing working group
Quality control for Document and Publication Repositories
- Organizational, technical, personal and policy aspects
Defines a set of minimum standards (requirements) that a repository and its operator(s) must meet for modern scholarly communication
Recommends foreseeable developments that might turn into future requirements
DINI Certificate 2007 released September 2006
DINI Certificate - Content
Visibility of the Service
Policy, Guidelines
Author Support
Legal Aspects
Security, Authenticity and Data Integrity
Indexing
- Subject indexing
- Metadata Export
- Interfaces
Logs and Statistics
Long-term Availability
Certification in practice
Certificate 2004: 19 Services certified
Certificate 2007: 2 services certified, 4 in progress
Common issues during the certification process
- Policy
- Persistent identifiers
- Documentation
Results of certification
- Certification as development of the service: common experience
- Certification as marketing action: experiences range from very good results to no effect at all
DINI / DFG projects
Cluster of proposals to the DFG (coordinated by DINI)
- Network of certified open access repositories (OA network), 2 years: national input to the EU repository infrastructure project DRIVER
- Usage statistics (OA statistics): demonstrator proposal under review
- Distributed open access reference citation service (OA citation): demonstrator proposal under review
Related DINI projects
- OA information (open-access.net), 18 months
- CARPET - Community for Academic Reviewing, Publishing and Editorial Technology: proposal under review
OA network
Building a networked infrastructure for German repositories
- Project just started
- Builds on DINI certified services
Relationship to DRIVER
- German node for DRIVER
- DINI certificate more comprehensive than DRIVER guidelines (except for harvesting recommendations)
Add-on services beyond DRIVER
- OA statistics
- OA citation
OA network - architecture
Harvesting
Aggregation
Enrichment
Processing
OA statistics
Local and aggregated usage data
Transparent and standardized data (e.g. COUNTER, IFABC, LogEc)
Calculation of data is comprehensible: click spans, robot elimination etc.
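The two calculation rules named on this slide, click spans and robot elimination, can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the 30-second span, the robot markers, and the event tuple layout are hypothetical and are not taken from COUNTER, IFABC or LogEc.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration only; real standards define their own rules.
ROBOT_MARKERS = ("bot", "crawler", "spider")
CLICK_SPAN = timedelta(seconds=30)


def is_robot(user_agent):
    """Return True if the user agent string contains a known robot marker."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def count_usage(events):
    """Count document requests after robot and double-click elimination.

    `events` is an iterable of (timestamp, client, user_agent, document) tuples.
    A request by the same client for the same document within CLICK_SPAN of
    the previously counted request is treated as a double click and skipped.
    """
    counts = {}
    last_counted = {}
    for timestamp, client, user_agent, document in sorted(events):
        if is_robot(user_agent):
            continue
        key = (client, document)
        previous = last_counted.get(key)
        if previous is not None and timestamp - previous < CLICK_SPAN:
            continue
        last_counted[key] = timestamp
        counts[document] = counts.get(document, 0) + 1
    return counts
```

Publishing the span and the robot list alongside the numbers is what makes the calculation comprehensible to a third party.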
Infrastructure for Collecting Usage Data
[Figure: usage data flows from repositories (web server log, rewrite module, optional normalisation: robots, pseudonymization) and link resolvers (log DB) through log repositories, exposed as OpenURL ContextObjects (CO) or SUSHI, to a log harvester (service provider), which normalises them into aggregated logs / aggregated usage data for data mining, filtering, metrics and services.]
Based on: Bollen, Johan and Van de Sompel, Herbert, OAI4, Geneva
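The optional normalisation step at the repository includes pseudonymization of client addresses before logs are exposed for harvesting. A minimal sketch of one way to do this, assuming a keyed hash (HMAC-SHA256) and a hypothetical log entry layout with a `client` field; the slides do not specify the actual method.

```python
import hashlib
import hmac


def pseudonymize(ip_address, secret_key):
    """Replace an IP address with a keyed hash, truncated to 16 hex digits."""
    digest = hmac.new(secret_key, ip_address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def normalise(entry, secret_key):
    """Return a copy of a log entry (a dict) with the client address pseudonymized."""
    cleaned = dict(entry)
    cleaned["client"] = pseudonymize(entry["client"], secret_key)
    return cleaned
```

The same address always maps to the same pseudonym under one key, so sessions can still be reconstructed after aggregation, while the address itself does not leave the repository.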
Usage - indicators
Indicators can be calculated quantitatively or structurally
Example of a quantitative indicator: Usage Factor, the mean value of aggregated usage over a defined period of time
Example of a structural indicator: Usage PageRank, based on reciprocal voting of nodes in a network
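The difference between the two kinds of indicator can be sketched in a few lines. This is an illustration under assumptions, not the method of any project named in these slides: the usage network is given as a plain dict of links (how links are derived from usage logs is left open), and the damping factor 0.85 is the commonly used default.

```python
def usage_factor(usage_counts):
    """Quantitative indicator: mean of the aggregated usage counts for a period."""
    return sum(usage_counts) / len(usage_counts)


def page_rank(links, damping=0.85, iterations=50):
    """Structural indicator: power-iteration PageRank.

    `links` maps each node to the list of nodes it links (votes) to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / len(nodes)
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

A node's rank depends on who votes for it, not only on how many votes it receives, which is what distinguishes the structural indicator from a simple mean.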
bX project - Comparison of Journal Usage PageRank and Journal Impact Factor
[Figure: labelled examples where Usage > IF: Journal of Molecular Graphics and Modelling; Dr. Dobb's Journal]
OA citation
Builds on work done in Citebase, CiteSeer, Google Scholar, CDSware, EPrints
- Extraction of references
- Citation indexing (CI)
- Expansion of the traditional document space for CI (competing with WCI)
- Calculating alternative indicators (Citation PageRank)
Cf. projects MESUR (LANL), Eigenfactor (U of Washington)
http://www.mesur.org/ http://www.eigenfactor.org/
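Once references have been extracted, citation indexing amounts to inverting the "cites" relation into a "cited by" relation. A minimal sketch, assuming references have already been resolved to document identifiers (the hard part, which is not shown); the identifiers and function names are hypothetical.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert document -> cited documents into cited document -> citing documents."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            if cited != citing:  # ignore self-references
                cited_by[cited].add(citing)
    return {doc: sorted(citing) for doc, citing in cited_by.items()}


def citation_counts(index):
    """Number of distinct citing documents per cited document."""
    return {doc: len(citing) for doc, citing in index.items()}
```

The resulting index is also the link structure over which a PageRank-style citation indicator would be computed.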
Open Access and Metrics
BASE integration demonstrator
DRIVER
Digital Repository Infrastructure Vision for European Research
Environment and tools for building service-based Repository Systems
Sets of services running at different network sites, possibly in multiple instances, interacting, dynamic, sharable, open
DRIVER I: BE, FR, GE, NL, UK
DRIVER II: IT, PL, EL, DK, SL, PT
European Information Space
Includes the DRIVER Repository System
- Providing users with advanced functionalities over a uniform European Information Space formed by aggregating multiple repositories
Repositories
- Can join or leave the infrastructure at any time
- Are dynamically/automatically aggregated to populate the DRIVER Information Space and keep it updated
DRIVER and standards
Service Resources are implemented as Web Services and accessed through the corresponding Web Service Interface
- Parameter calls are wrapped in SOAP messages
- The Enabling Services are also compatible with REST
XML is the lingua franca for the whole system
- Resource internal status, i.e. Resource profiles
- Profiles in Information Service use eXist XML engine
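To show what wrapping a call in a SOAP message looks like, here is a small sketch that builds a SOAP 1.1 envelope with the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the operation name, parameter and service namespace in the usage line are hypothetical and do not describe the actual DRIVER service interfaces.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def soap_envelope(operation, parameters, service_ns):
    """Wrap a call (operation name plus parameters) in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and namespace, for illustration only.
message = soap_envelope("search", {"query": "open access"}, "urn:example:service")
```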
DRIVER and standards II
DRIVER Aggregation
- Harvesting according to OAI-PMH
- Adopting OAI-Provenance best practice (OAI-about DRIVER Guidelines)
- To be extended to other object models and harvesting protocols
Queries to Search Service and Index Service conform to the SRW/CQL standard
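As an illustration of a CQL query, the sketch below builds a request URL in the style of SRU, the URL-based sibling of SRW. The base URL and the index names (`dc.title`, `dc.creator`) are assumptions; which indexes the DRIVER Search Service actually supports is not stated in these slides.

```python
from urllib.parse import urlencode


def cql_and(clauses):
    """Join index = "term" clauses with the CQL boolean operator `and`."""
    parts = []
    for index, term in clauses.items():
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{index} = "{escaped}"')
    return " and ".join(parts)


def search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
query = cql_and({"dc.title": "open access", "dc.creator": "Scholze"})
url = search_url("http://search.example.org/sru", query)
```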
DRIVER Guidelines
Unambiguous identification of OA content (using sets if necessary)
Direct link to the digital object (dc:identifier)
Transient or persistent information on deleted objects
ISO 639-3 language format
Well defined batch size (100-200 datasets)
Adequate lifespan of the resumption token (24h)
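The batch size and resumption token guidelines matter because a harvester fetches a large repository in several requests. A minimal sketch of that loop, assuming the caller supplies a `fetch` function (hypothetical) that sends the OAI-PMH request parameters to the repository and returns the response body; only the namespace, verb and argument names come from the OAI-PMH protocol itself.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = None
    if token_element is not None and token_element.text:
        token = token_element.text.strip()
    return identifiers, token


def harvest(fetch, metadata_prefix="oai_dc"):
    """Collect all identifiers, following resumption tokens until none is returned.

    `fetch` takes a dict of OAI-PMH request parameters and returns the
    response body as text.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        batch, token = parse_list_records(fetch(params))
        identifiers.extend(batch)
        if not token:
            return identifiers
        # A resumption token replaces all other arguments in the follow-up request.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Each follow-up request has to arrive before the token expires, which is why the guidelines ask for an adequate lifespan and a well-defined batch size.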
Conclusion
Bring standardization of interfaces, protocols and formats to a wider community
Service-based repository infrastructure
- Identification of repositories
- Harvesting, searching, browsing, re-use
Integrating repository infrastructure into the research process
- Other outputs: primary data, learning objects, patents …
- Linking outputs from a research process perspective
[Figure: Repository Infrastructure, as on the slide "Research process and repositories"]
Information
DINI certificate: http://nbn-resolving.de/urn:nbn:de:kobv:11-10075687
DINI repository list: http://www.dini.de/wisspub/repositories/german/index.php
OA network: http://www.dini.de/oa-netzwerk/
OA citation (DOARC - Demonstrator): http://doarc.projects.isn-oldenburg.de/
DRIVER: http://www.driver-repository.eu/