Please - Islandora Camp Europe
Digibess: thanks Islandora!
Arcidosso, Italy
March 20-22, 2013
Giancarlo Birello, Anna Perin
IT Office and Library, CNR-Ceris
BESS: a group of 18 socioeconomic libraries in Piemonte (Italy)
The libraries of the project share a common specialization but are different in terms of size, institution, purpose, as well as collections. Such differences, which could have been deemed as a weakness, have eventually turned out to be an asset.
The coming together of such diverse agencies as private foundations, research institutes, and university libraries has freed and disseminated a capital of resources, know-how, and initiatives.
General info 1 of 4
The most important initiative of BESS is the creation of a digital repository of sources for the study of Piedmontese society and economy (digibess). The project is supported by Compagnia di San Paolo of Turin.
The resulting repository serves as a stable and fundamental source of regional and economic information for the whole community.
The repository also contains some interesting collections from partners such as the FIAT Historical Archive, the Lavazza Archive, and the Gramsci Foundation.
The laboratory is composed of:
• Automatic scanner Qidenus: high-speed book scanning, format max. A4
• Planetary scanner Bookeye for large formats
• 6 workstations to carry out all post-production steps
• A 15-terabyte NAS to store files
The entire workflow involves the following steps:
• Scanning
• Native file conversion to high-resolution tiff
• Text file creation by Optical Character Recognition
• Metadata file preparation
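The post-scan steps can be pictured as a per-page checklist of derivative files. The sketch below is only an illustration under stated assumptions: the file suffixes, the function names `plan_page_outputs` and `missing_outputs`, and the directory layout are hypothetical and are not taken from the Digibess scripts.

```python
from pathlib import Path

# Hypothetical derivative suffixes, one per post-scan step in the workflow.
DERIVATIVES = {
    "tiff": ".tif",      # high-resolution image converted from the native scan
    "ocr_text": ".txt",  # plain text produced by Optical Character Recognition
    "metadata": ".xml",  # metadata file prepared for the page
}


def plan_page_outputs(scan_file, out_dir):
    """Return the derivative paths expected for one scanned page."""
    stem = Path(scan_file).stem
    return {
        step: Path(out_dir) / f"{stem}{suffix}"
        for step, suffix in DERIVATIVES.items()
    }


def missing_outputs(scan_file, out_dir, existing_names):
    """List the steps whose output file name is not in `existing_names`."""
    planned = plan_page_outputs(scan_file, out_dir)
    return [step for step, path in planned.items()
            if path.name not in existing_names]
```

For example, `missing_outputs("raw/page_0001.raw", "out", {"page_0001.tif"})` reports that the OCR text and metadata steps are still pending for that page.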
Supported by Compagnia di San Paolo
General info 2 of 4
CNR-Ceris IT Office and CNR-Ceris Library are commissioned to handle all the post-scan steps of the digitization.
CNR-Ceris had to manage large volumes of data and provide storage space for the digitized works that is stable, versatile and dynamic. CNR-Ceris has deployed the software and server platforms of the repository in a virtualized and redundant infrastructure. CNR-Ceris also takes care of the design, development and management of the web portal (front-end) for presenting, searching and consulting the digitized items.
Some characteristics:
• Files: high-resolution tiff, OCR txt file, pdf/a, metadata
• 14 TB disk base space at disposal
• 2-node active/passive open-source cluster
• High Availability Hypervisor using cluster storage
• Fedora Commons repository
• Harvesting OAI-PMH
• Scripting for ingesting
• Islandora model and modules
• Drupal front-end
• Solr - search platform from the Apache Lucene project
• Full-text search
General info 3 of 4
Total budget:
• 90% for digitalization (digital laboratory, staff for digitalization)
• 10% for repository
Budget for repository:
• 90% for hardware
Thank you to:
• open source solutions
• many nights of work (for the system manager..)
But YES HE DID IT!
General info 4 of 4
Architecture
Two-server solution: front-end/back-end, multiple front-ends per back-end
Architecture 2 of 2
Software
Open-source: Linux, (DRBD, Corosync, Pacemaker), KVM, Fedora Commons and …
CLUSTER (DRBD, COROSYNC, PACEMAKER, UBUNTU)
REPOSITORY (FEDORA, SOLR, APACHE, UBUNTU, TOMCAT)
HYPERVISOR (KVM, UBUNTU)
FRONT-END (APACHE, DRUPAL, UBUNTU) … and ?
Software 1 of 6
Book solution pack
Islandora book solution pack (it is our case: collections, books and pages)
views
search
models
Software 4 of 6
IIV
Islandora Image Viewer
Viewer and read online
With text
Open-source
NO flash player
NO external services
Software 5 of 6
Islandora has more useful features (for our project we use only the presentation layer)
Software 6 of 6
ingesting
object and collection management
roles manager
workflow
The project was a close collaboration between librarian and system manager:
the librarian asking for features and the system administrator customizing code
MODELS, SEARCH, VIEW, QUERY, …
YES! Yes! Yes! Yes!
Digibess 1 of 1
Please: We need full-text search results with highlighting
Search result view modified
Done!
Digibess Search 1 of 4
Full-text search results
DC search returns books; full-text search returns text, pages and books
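A full-text search with highlighting maps onto Solr's standard highlighting parameters (`hl`, `hl.fl`, `hl.snippets`). The sketch below only builds such a request URL. The endpoint `SOLR_SELECT`, the field name `ocr_text` and the function name are assumptions for illustration, not the actual Digibess configuration.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; the real Digibess URL is not given in the slides.
SOLR_SELECT = "http://localhost:8080/solr/select"


def fulltext_query_url(terms, text_field="ocr_text", rows=10):
    """Build a Solr select URL that searches `text_field` and asks for highlights.

    `text_field` is an assumed name for the field holding the OCR text.
    """
    params = {
        "q": f"{text_field}:({terms})",  # search only the full-text field
        "hl": "true",                    # turn highlighting on
        "hl.fl": text_field,             # field to take snippets from
        "hl.snippets": 3,                # up to three snippets per document
        "rows": rows,
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"
```

Calling `fulltext_query_url("fiat torino")` yields a URL whose response would carry a `highlighting` section next to the matched documents, which a modified search result view can then display.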
Digibess Search 2 of 4
Please: We need to enable download of pdf, tiff and txt
Book and page view modified
Done!
Digibess View 1 of 5
Please: Reverse order by dc.description for some collections (Periodicals)
Custom collection datastream QUERY added
Done!
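The slides do not show the custom query itself, so the following is only a sketch of the intended ordering, written in Python rather than in the repository's query language. The item structure (`pid`, `description`) and the function name are hypothetical.

```python
def order_collection(items, reverse_by_description=False):
    """Sort collection members by their dc.description value.

    `items` is a list of dicts with hypothetical keys 'pid' and 'description'.
    Periodical collections would pass reverse_by_description=True.
    """
    return sorted(items,
                  key=lambda item: item["description"],
                  reverse=reverse_by_description)
```

With descriptions such as issue years, the reversed order lists the most recent issues of a periodical first.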
Digibess View 3 of 5
Please: We need repository statistics
New object with custom QUERY and QUERY COLLECTION_VIEW
Done!
Digibess View 4 of 5
OAIPROVIDER
oai_dc and pico metadata with virtual datastream
OAI-PMH Data Provider
The repository exposes an OAI-PMH interface for external metadata harvesting. Metadata can be disseminated in two formats: OAI_DC and PICO. OAI_DC records are extracted from the object's DC datastream. PICO records are generated on the fly by an object service.
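A harvester talks to such an interface with plain HTTP requests using the OAI-PMH verbs. The sketch below builds a `ListRecords` request for either metadata format. The endpoint `OAI_ENDPOINT` and the function name are assumptions, since the slides do not give the provider URL.

```python
from urllib.parse import urlencode

# Hypothetical provider URL; the actual Digibess endpoint is not in the slides.
OAI_ENDPOINT = "http://localhost:8080/oaiprovider/"

# The two formats the repository is said to disseminate.
SUPPORTED_PREFIXES = ("oai_dc", "pico")


def list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    if metadata_prefix not in SUPPORTED_PREFIXES:
        raise ValueError(f"unsupported metadataPrefix: {metadata_prefix}")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional selective harvesting by date
    if set_spec:
        params["set"] = set_spec    # optional selective harvesting by set
    return f"{OAI_ENDPOINT}?{urlencode(params)}"
```

A harvester would request `list_records_url("pico")` to receive PICO records and `list_records_url("oai_dc", from_date="2013-03-20")` to fetch only Dublin Core records changed since that date.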
Digibess Harvesting 1 of 1
• Development site: http://dev.digibess.it
• Step by step installation guide: http://www.digibess.it/fedora/repository/openbess:TO094-00163/PDF/openbess_TO094-00163.pdf
Mark told me I did a great job with Islandora and invited me to join Islandora Camp Europe. Done!
Please: I need holidays...
Thanks ISLANDORA!
Anna: [email protected] Giancarlo: [email protected]