Please - Islandora Camp Europe

Digibess: thanks Islandora! Arcidosso, Italy, March 20-22, 2013. Giancarlo Birello, Anna Perin, IT Office and Library, CNR-Ceris


BESS: a group of 18 socioeconomic libraries in Piemonte (Italy)

The libraries of the project share a common specialization but are different in terms of size, institution, purpose, as well as collections. Such differences, which could have been deemed as a weakness, have eventually turned out to be an asset.

The coming together of such diverse agencies as private foundations, research institutes, and university libraries has freed and disseminated a capital of resources, know-how, and initiatives.

General info 1 of 4

The most important initiative of BESS is the creation of a digital repository of sources for the study of Piedmontese society and economy (Digibess). The project is supported by Compagnia di San Paolo of Turin.

The resulting repository serves as a stable and fundamental source of regional and economic information for the whole community.

The repository also contains some interesting collections from partners such as the FIAT Historical Archive, the Lavazza Archive and the Gramsci Foundation.

The laboratory is composed of:
• Automatic scanner Qidenus: high-speed book scanning, format max. A4
• Planetary scanner Bookeye for large format
• 6 workstations to carry out all post-production steps
• A 15 terabytes NAS to store files

The entire workflow involves the following steps:
• scanning
• Native file conversion to high resolution tiff
• Text file creation by Optical Character Recognition
• Metadata files preparation
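The workflow steps above produce several files per scanned page, so a small completeness check can be useful before ingesting. The following is only a minimal sketch: the file naming scheme (`book1_0001.tif`, `book1_0001.txt`) and the set of required extensions are assumptions for illustration, not the actual Digibess conventions.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical set of derivative files expected for every scanned page.
REQUIRED = {".tif", ".txt"}


def missing_derivatives(filenames):
    """Map each page stem to the required extensions it still lacks."""
    found = defaultdict(set)
    for name in filenames:
        path = PurePosixPath(name)
        found[path.stem].add(path.suffix.lower())
    return {
        stem: sorted(REQUIRED - extensions)
        for stem, extensions in sorted(found.items())
        if REQUIRED - extensions
    }


# Hypothetical file listing: page 2 has no OCR text file yet.
files = ["book1_0001.tif", "book1_0001.txt", "book1_0002.tif"]
print(missing_derivatives(files))
```

Pages that already have every required derivative are left out of the result, so an empty dictionary means the batch is complete under these assumptions.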

Supported by Compagnia di San Paolo

General info 2 of 4

The CNR-Ceris IT Office and the CNR-Ceris Library are commissioned to handle all post-scan steps of the digitization.

CNR-Ceris had to manage large volumes of data and provide storage space for the digitized works that is stable, versatile and dynamic. CNR-Ceris has deployed the software and server platforms of the repository in a virtualized and redundant infrastructure. CNR-Ceris also takes care of the design, development and management of the web portal (front-end) for presenting, searching and consulting the digitized items.

Some characteristics:

• files: high resolution tiff, OCR txt file, pdf/a, metadata
• 14 TB disk base space at disposal
• 2-node active/passive open-source cluster
• High Availability Hypervisor using cluster storage
• Fedora Commons repository
• harvesting OAI-PMH
• Scripting for ingesting
• Islandora model and modules
• Drupal front-end
• Solr - search platform from the Apache Lucene project
• Full-text search

General info 3 of 4

Total budget:
• 90% for digitization (digital laboratory, staff for digitization)
• 10% for repository

Budget for repository: 90% for hardware
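Combining the two figures above shows how small the repository spending is relative to the whole project. The sketch below works only from the percentages stated on the slide; the "repository remainder" entry is simply whatever is left of the repository share, since the slide does not itemize it.

```python
from fractions import Fraction


def budget_shares():
    """Return each budget item as an exact fraction of the total budget."""
    digitization = Fraction(90, 100)           # stated on the slide
    repository = Fraction(10, 100)             # stated on the slide
    hardware = repository * Fraction(90, 100)  # 90% of the repository budget
    remainder = repository - hardware          # not itemized on the slide
    return {
        "digitization": digitization,
        "repository hardware": hardware,
        "repository remainder": remainder,
    }


for item, share in budget_shares().items():
    print(f"{item}: {float(share):.0%}")
```

Under these figures, repository hardware accounts for 9% of the total budget and everything else on the repository side for 1%.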

Thank you to:
• open source solutions
• many nights of work (for the system manager...)

But YES HE DID IT!

General info 4 of 4

Architecture: cluster, hypervisor, virtual machines, network (VLANs, IPv6)

Architecture 1 of 2

Architecture: two-server solution (front-end/back-end), multiple front-ends per back-end

Architecture 2 of 2

Software: open-source, Linux (DRBD, Corosync, Pacemaker), KVM, Fedora Commons and …

CLUSTER (DRBD, COROSYNC, PACEMAKER, UBUNTU)

REPOSITORY (FEDORA, SOLR, APACHE, UBUNTU, TOMCAT)

HYPERVISOR (KVM, UBUNTU)

FRONT-END (APACHE, DRUPAL, UBUNTU) … and ?

Software 1 of 6

Islandora as front-end

flexible

open

clear code

high activity

Software 2 of 6

Islandora: a good guide for Fedora Commons models

Software 3 of 6

Book solution pack

Islandora book solution pack (it matches our case: collections, books and pages)

views

search

models

Software 4 of 6

IIV

Islandora Image Viewer

Viewer and read online

With T text

Open-source

NO flash player

NO external services

Software 5 of 6

Islandora has more useful features (for our project we use only the presentation layer)

Software 6 of 6

ingesting

object and collection management

roles manager

workflow

The project was a close collaboration between librarian and system manager:

the librarian asking for features and the system administrator customizing code

MODELS, SEARCH, VIEW, QUERY, …

YES! Yes! Yes! Yes!

Digibess 1 of 1

Please: METADATA

OK, basic DC, no MODS

MODS removed

Done!

Digibess Model 1 of 6

Dublin Core metadata
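For readers unfamiliar with a basic Dublin Core record, here is a minimal sketch of building one with Python's standard library. The two namespace URIs are the standard `oai_dc` and `dc` ones; the field values are invented placeholders and are not taken from the Digibess repository.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"


def build_dc(fields):
    """Serialize (element name, value) pairs as an oai_dc XML record."""
    ET.register_namespace("oai_dc", OAI_DC)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, value in fields:
        ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values, for illustration only.
record = build_dc([
    ("title", "Example book"),
    ("description", "1950"),
    ("type", "Text"),
])
print(record)
```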

Digibess Model 2 of 6

Please: We need a book INDEX

Datastream INDEX added to bookCModel

Done!

Digibess Model 3 of 6

TAB index to book view added

Digibess Model 4 of 6

Please: We need collection info

Datastream INFO added to collectionCModel

Done!

Digibess Model 5 of 6

TAB info to collection and book view added

Digibess Model 6 of 6


Please: We need full-text search results with highlighting

search result view modified

Done!

Digibess Search 1 of 4

Full-text search results: DC search returns books; full-text search returns text, pages and books
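Highlighted full-text results of this kind are typically obtained by turning on Solr's highlighting parameters in the query. The sketch below only builds such a request URL; `hl`, `hl.fl`, `hl.snippets` and `wt` are standard Solr parameters, while the base URL and the `ocr_text` field name are assumptions, not the actual Digibess configuration.

```python
from urllib.parse import urlencode


def highlight_query(base_url, text, field="ocr_text"):
    """Build a Solr select URL that asks for highlighted snippets."""
    params = {
        "q": f'{field}:"{text}"',  # phrase query on the (hypothetical) OCR field
        "hl": "true",              # turn highlighting on
        "hl.fl": field,            # field to take snippets from
        "hl.snippets": "3",        # up to three snippets per document
        "wt": "json",              # response format
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr endpoint.
url = highlight_query("http://localhost:8983/solr", "fiat")
print(url)
```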

Digibess Search 2 of 4


Please: We need multiple-word search

search form and syntax modified

Done!

Digibess Search 3 of 4

Multiple-word search example
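One common way to support multiple-word search is to rewrite the user's input into a single fielded query before sending it to Solr. The slides do not say which syntax Digibess adopted, so the following is only a sketch under assumptions: it requires all words to match (AND), and the `ocr_text` field name is hypothetical.

```python
import re


def multi_word_query(user_input, field="ocr_text"):
    """Turn free-text input into a Solr query requiring every word."""
    words = re.findall(r"\w+", user_input)
    if not words:
        return "*:*"  # match-all query when nothing usable was typed
    return f"{field}:(" + " AND ".join(words) + ")"


print(multi_word_query("torino  industria"))
```

Joining with `OR` instead would return pages containing any of the words, which gives broader but less precise results.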

Digibess Search 4 of 4


Please: We need to enable download of pdf, tiff and txt

book and page view modified

Done!

Digibess View 1 of 5

Book and page views with file links

Digibess View 2 of 5


Please: Reverse order by dc.description for some collections (Periodicals)

custom collection datastream QUERY added

Done!
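In the repository this ordering is handled by the collection's custom QUERY datastream, whose content is not shown in the slides. As a plain illustration of the intended effect, the sketch below sorts hypothetical periodical issues in reverse order of their `dc.description` value; the PIDs and descriptions are invented.

```python
def order_issues(items, reverse=True):
    """Sort collection members by their dc.description value."""
    return sorted(items, key=lambda item: item["description"], reverse=reverse)


# Invented members of a periodicals collection.
issues = [
    {"pid": "demo:1", "description": "1951"},
    {"pid": "demo:2", "description": "1953"},
    {"pid": "demo:3", "description": "1952"},
]
for issue in order_issues(issues):
    print(issue["pid"], issue["description"])
```

Note that this compares descriptions as strings, so it orders correctly only when the values sort lexicographically, for example four-digit years.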

Digibess View 3 of 5

Please: We need repository statistics

new object with custom QUERY and COLLECTION_VIEW

Done!

Digibess View 4 of 5

Statistics

Digibess View 5 of 5

Digibess Ingesting 1 of 1

Ingesting by scripts (from Word, native pdf, jpeg, tiff)

scripts

OAIPROVIDER: oai_dc and pico metadata with virtual datastream

OAI-PMH Data Provider

The repository exposes an OAI-PMH interface for external metadata harvesting. Metadata can be disseminated in two formats: OAI_DC and PICO. OAI_DC records are extracted from the object's DC datastream. PICO records are generated on the fly by an object service.
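A harvester selects between the two formats through the `metadataPrefix` argument of an OAI-PMH request. The sketch below builds a `ListRecords` request URL; `verb` and `metadataPrefix` are standard OAI-PMH arguments and `oai_dc` is the standard Dublin Core prefix, while the base URL and the exact prefix string used for PICO (`pico` here) are assumptions.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix):
    """Build an OAI-PMH ListRecords request for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical endpoint; the real provider address is not given in the slides.
BASE_URL = "http://example.org/oai"
print(list_records_url(BASE_URL, "oai_dc"))
print(list_records_url(BASE_URL, "pico"))  # prefix string assumed
```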

Digibess Harvesting 1 of 1

• Development site: http://dev.digibess.it

• Step by step installation guide: http://www.digibess.it/fedora/repository/openbess:TO094-00163/PDF/openbess_TO094-00163.pdf

Mark told me I did a great job with Islandora and invited me to join Islandora Camp Europe. Done!

Please: I need holidays...

Thanks ISLANDORA!

Anna: [email protected] Giancarlo: [email protected]