Islandora Overview: PASIG May 2013
Islandora Overview
Mark Leggott, University of PEI/DiscoveryGarden
PASIG - Washington DC May, 2013
Note: Red text indicates a link.
Open Source
Islandora 101
Project Foundations
• Developed at University of PEI (2007)
• UPEI has FT staff and project staff (AIF)
• DiscoveryGarden is commercial services/support company - sustainability
• 25+ staff at DGI, 6 at UPEI
• Both teams maintain/contribute to code
Conceptualizing
Initializing
Creating/Analyzing
Reporting
Formalizing
Popularizing
Research Institutes
Libraries & Archives
Museums
Media Organizations
Health Centres
Government Agencies
Private Companies
Universities & Colleges
NGOs & Non-Profits
Other
Access Collaboration Preservation
E-Mail, Letters, Published Research, Requirements
Meeting Minutes, Grants, Data Collection, Acquisitions
Forms, Data, Cataloguing, Findings, Discussion
Reports, Theses, Datasets, Visualizations
Articles, Curricular Content, Policies, Exhibits
Blogs, Twitter, Newspapers, iTunesU, Flickr
Information Life Cycle
Object Space
User Space
Individual
Group
Department
Museum
University
External
Private Shared Open
Collaborate
Publish
Re-Use
Create
Preservation, Migration, Transformation
Basics
• Drupal+Fedora+Other OS = ecosystem
• Flexible UI on top of Fedora + other apps
• Support for 180+ languages via Drupal
• Focus on robust preservation features and services + flexibility in data models and UI
• VM/code, documentation, lists, Camps
Key Components
• Core - Islandora, Tuque, Solr, XML Forms, FITS, Workflow, Solution Packs (SPs)
• SPs - add specific+tested functionality
• Image, Large Image, PDF, Audio, Video, Book/Paged Image Document
• Newspapers, Digital Humanities, IR, Chem
• DuraCloud integration via Vault Module
Drupal UI
HTML
CSS/Themes
User Roles/Permissions
Editorial WorkFlow
Modules (LDAP/BibUtils/etc.)
Hooks
SPARQL
LDAP
FCK
BibUtils
FormsAPI
ImageAPI
Tabs
XSLTs PHP/Python Snippets
Micro Services Engine
Tika
Kakadu
SWFTools
OOffice
Djatoka
Tesseract
R
Wowza
Islandora
JMS JMS
Code Snippets/Applications
Process
Any Metadata & Any Data
Solr/Lucene
GSearch
Mulgara
MySQL
Content Models XACML Policies
Fedora
SPARQL
REST
SOAP
• Fedora Object Model
• Flexibility supports any data model
• Atomistic and compound objects
• Support for RDF allows integration of specific ontologies
imagined:208361 (PID)
Object Properties
Relations (RELS-EXT)
Dublin Core (DC)
Audit Trail (AUDIT)
JP2K Web (JP2)
JP2K Archival (LOSSLES_JP2)
Low Res JPEG (JPG)
Thumbnail (TN)
Descriptive Metadata (MODS)
Object Model - IslandImagined/Large Image
Digital Object Identifier
System Properties: Manage & Track Object
Reserved Datastreams: Key Object Metadata
Datastreams: Aggregates Content Items
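The object model on this slide can be pictured as a PID plus a set of named datastreams. The following is a minimal Python sketch of that idea only, not Fedora's or Islandora's actual API: the class names, the `add` and `content_dsids` helpers, and the choice to flag RELS-EXT, DC and AUDIT as the reserved datastreams are assumptions made for illustration. The PID and datastream IDs are copied from the slide as written.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    dsid: str               # datastream ID as listed on the slide, e.g. "MODS"
    label: str              # human-readable label from the slide
    reserved: bool = False  # assumption: True for the key object metadata streams


@dataclass
class DigitalObject:
    pid: str  # digital object identifier, e.g. "imagined:208361"
    datastreams: dict = field(default_factory=dict)

    def add(self, dsid, label, reserved=False):
        """Attach a datastream to the object (hypothetical helper)."""
        self.datastreams[dsid] = Datastream(dsid, label, reserved)

    def content_dsids(self):
        """IDs of the non-reserved datastreams, in the order they were added."""
        return [d.dsid for d in self.datastreams.values() if not d.reserved]


def build_large_image_example():
    """Rebuild the IslandImagined/Large Image object shown on the slide."""
    obj = DigitalObject(pid="imagined:208361")
    obj.add("RELS-EXT", "Relations", reserved=True)
    obj.add("DC", "Dublin Core", reserved=True)
    obj.add("AUDIT", "Audit Trail", reserved=True)
    obj.add("JP2", "JP2K Web")
    obj.add("LOSSLES_JP2", "JP2K Archival")  # ID spelled as on the slide
    obj.add("JPG", "Low Res JPEG")
    obj.add("TN", "Thumbnail")
    obj.add("MODS", "Descriptive Metadata")
    return obj
```

Calling `build_large_image_example().content_dsids()` lists the content datastreams that the object aggregates, separate from the reserved metadata streams.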
• MicroServices
• PHP/Python/Java
• Drives integration of external services for data transformation +
• Log via Fedora audit
• Taverna integration
• Simple Workflow Module
• Simple approach to Editorial Workflow
• Provides “human” nodes in the services framework
• Upcoming version supports more granular controls and workflow states/actions
• XML Form Builder
• Create a rich form for any XML schema
• Multiple forms for specific schemas
• Control access via security policies
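The MicroServices bullets above describe external code that reacts to repository events, transforms data and logs what it did. A rough Python sketch of that pattern follows. It is an illustration under stated assumptions, not the Islandora microservices code: the `microservice` decorator, `handle_message`, the message layout and the "OBJ" source datastream ID are all hypothetical, and a plain list stands in for the Fedora audit trail.

```python
from datetime import datetime, timezone

# Registry of handlers keyed by the datastream ID they react to (hypothetical).
HANDLERS = {}


def microservice(dsid):
    """Register a function to run when a datastream with this ID changes."""
    def register(func):
        HANDLERS.setdefault(dsid, []).append(func)
        return func
    return register


@microservice("OBJ")  # "OBJ" is an assumed ID for the source file
def make_thumbnail(pid, data):
    # Placeholder transformation: a real service would call an external tool.
    return ("TN", data[:16])


def handle_message(message, audit_log):
    """Run every handler registered for the message's datastream ID.

    Returns the derived datastreams and appends one audit entry per handler.
    """
    derived = {}
    for func in HANDLERS.get(message["dsid"], []):
        new_dsid, new_data = func(message["pid"], message["data"])
        derived[new_dsid] = new_data
        audit_log.append({
            "pid": message["pid"],
            "action": func.__name__,
            "when": datetime.now(timezone.utc).isoformat(),
        })
    return derived
```

A message for a datastream ID with no registered handler returns an empty result and writes nothing to the log.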
Administration
• Flexible admin options
• Standard Drupal admin functions
• + ability to maintain aspects of Fedora and other apps via Admin interfaces
• Solution Packs increasingly adding greater configuration options
Preservation Services
• Fedora provides robust service framework
• TechDS+DescDS+RightsDS+AuditDSs transformed to a Dynamic PREMIS record
• Adding DuraCloud support via “Vault”
• Adding Archivematica integration as an optional preservation component
Islandora Community
Community
• Estimate 150+ Islandora sites worldwide in production or development
• 500+ people on Google Groups List
• Some projects starting to contribute back
• Libraries bulk of use now, but includes museums, archives, private companies
discoverygarden
• Commercial UPEI spin-off - full service
• Installation, Configuration, Customization
• Support, System Audit, Consulting
• Hosting, Platforms, Vendor partnerships
• Primary codebase contributor
DGI and Oracle
• discoverygarden working with Oracle to test/certify Islandora on Oracle systems
• SAM/QFS optimization for HFS
• Non-profit membership organization
• Provides members with a range of services, including Islandora hosting/setup
• Shared/Individual/Group repositories
• Working with discoverygarden to provide customization services when desired
Code
Releases
• General goal is 4 releases per year, or 7-8 now that 2 versions are maintained
• Latest “Islandora 6” for March
• First full “Islandora 7” for March
• Goal is to release bug fixes for 6, focus on new developments in 7
Robust Dev Approach
• Islandora code is managed using best-practice open source tools
• GitHub code repository
• Jenkins and Travis CI continuous integration
• Jira ticket management
• Confluence Wiki for documentation
Islandora 6
• March 2013 Release
• Improved documentation, print book
• XACML Editor, Workflows
• Forms Autocomplete, FITS integration
• Smoother SP Installation
Islandora 7
• 1st full release for Drupal 7
• New admin interface/functions
• All new SPs, SeaDragon, IAV
• Complete integration of Tuque API
• Clip tool for SeaDragon
Contribs
• WARC SP (Nick Ruest, York)
• Administrative Dashboard (Peter MacDonald, Hamilton)
• Relationship Editor/Ontology Management (Giancarlo Birello and Rosie Le Faive)
• Batch Ingester (Colorado Alliance)
• Black Thumbnail Bug (Aaron Collie)
Standard SPs
• Image, Large Image, Audio, Video, Book, PDF, Newspaper
• Includes MODS form, DC mapping, sample data, viewer(s), TechMD extraction
• Solution Pack module makes it easier to create new ones, modify existing
Book SP
• Code simplified and made more modular
• Can enable IA viewer for books, Open Seadragon for page images
• Tesseract OCR support standard
• Page manipulation, PDF creation
Image/Large Image SP
• GIF/PNG/JPG + TIFF/J2K support
• Conversion of TIFF to J2K
• DC + MODS
• Option to use OpenSeadragon viewer
Tools Modules
• FITS Extractor, creates technical metadata
• Batch Import (RIS, EndNote, PubMed, DOI)
• OCR, Tesseract with OCR/HOCR
• MARCXML, ingest and view MARC data
• XACML Editor, rebuilt XML FormBuilder
Bridge
• Upcoming module which will allow tighter integration with Drupal, using Ver 7 Nodes or Entities
• Create content via Drupal - sync’d to Fedora and vice versa
• Future goal: facilitate re-creation of the entire repo, including the interface
DropBox
• Alpha module provides sync between DropBox and Islandora
• Creates Collection objects for each folder and a separate file object for each contained file with all relationships
• Provides basic DC record for metadata
• Upcoming for Google Drive, DataFlow ++
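The folder-to-object mapping described above can be sketched in Python as a planning step over a local folder. This is an assumption-laden illustration, not the alpha module's code: `plan_sync` and the dictionary layout are hypothetical, a local directory stands in for a DropBox folder, and the "basic DC record" is reduced to a title.

```python
import os


def plan_sync(root):
    """Walk a folder and plan one collection per folder and one object per file.

    Each planned entry records its parent collection in "member_of" and a
    minimal Dublin Core record in "dc" (title only, as an assumption).
    """
    plan = []
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel_dir = os.path.relpath(dirpath, root)
        if rel_dir == ".":
            parent = None  # the top-level folder has no parent collection
        else:
            parent = os.path.dirname(rel_dir) or "."
        plan.append({
            "kind": "collection",
            "path": rel_dir,
            "member_of": parent,
            "dc": {"title": os.path.basename(dirpath)},
        })
        for name in sorted(filenames):
            plan.append({
                "kind": "file",
                "path": os.path.normpath(os.path.join(rel_dir, name)),
                "member_of": rel_dir,
                "dc": {"title": name},
            })
    return plan
```

Keeping the plan separate from any repository call makes the relationships easy to inspect before anything is ingested.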
Fedora 4
• Fedora Futures project to review & rebuild Fedora for next major release
• Looking to provide better support for large files, large collections and optimized ingest
• Pilot project is using ModeShape as the core repository
• Islandora team already has pilot integration
Roadmap
• More SPs: Research Data, Digital Humanities, Chemistry, Conferences
• Image Annotation tool (Shared Canvas from Stanford - OAC compliant)
• Full Bridge development
• Integration of Microservices + Taverna
Trying Islandora
• Try production sites (list on last slide)
• Play in sandbox.islandora.ca (cleaned daily)
• Download VM from islandora.ca
• Install code referring to documentation
• iCamps: PEI, Europe, Australia, US east+west
• Documentation: Jira, videos, GitHub, Jenkins
Islandora in the Cloud
DuraCloud
• UPEI and DGI committed to supporting DuraCloud in the Islandora interface
• Works with CloudSync as the bridge between Fedora and DuraCloud
• Can be used with or without Islandora managed collections
DGI Examples
• DGI has 5 Islandora clients using DuraCloud (DC) + backup
• Largest has 2 TB of mostly image J2Ks
• Full site (objects/MySQL/Drupal) with DC and DCS tool using Continuous mode
• + backed up using Zmanda/S3: D/W/M/Y
• + experimental backup to Glacier
• 3 sites using DC/DCS for full backup of IR
Islandora Vault Module
• New module for managing DC+CS services
• “Vault” component on Manage Tab
• Manage CS sets/tasks for Collections
• View Health Check at Object level (e.g. check for matching checksums)
• Defining default actions for mismatches
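The object-level health check mentioned above compares checksums of local content against the stored copy and applies a default action on mismatch. A minimal Python sketch of that comparison follows. It is not the Vault module's implementation: `checksum`, `health_check`, the SHA-256 choice and the action names "report" and "restore" are assumptions for illustration.

```python
import hashlib


def checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Hex digest of the given bytes (algorithm choice is an assumption)."""
    return hashlib.new(algorithm, data).hexdigest()


def health_check(local: dict, remote_checksums: dict,
                 on_mismatch: str = "report") -> dict:
    """Compare each local datastream with the checksum held for its remote copy.

    Returns a status per datastream ID: "ok", "missing" (no remote copy),
    or the configured default action when the checksums differ.
    """
    results = {}
    for dsid, data in local.items():
        expected = remote_checksums.get(dsid)
        if expected is None:
            results[dsid] = "missing"
        elif checksum(data) == expected:
            results[dsid] = "ok"
        else:
            results[dsid] = on_mismatch
    return results
```

Passing `on_mismatch="restore"` models a collection whose default action is to recover from the stored copy rather than only report the problem.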
Vault Setup
Collection Restore
Object Health
Reports
Next Steps
• Tighter integration and more UI functions
• Automated recovery (Auto vs Manual)
• Full Fedora/Collection restore
• Support for private DuraCloud instances
• Add integration with Glacier+
DGI Platforms
• Islandora Platform solutions from discoverygarden released at OR in July
• Initial offerings IR and Digital Collections
• 1-button setup/payment/management
• Additional platforms before end of 2013
• Research Data and Digital Humanities
Sustainability
• Non-profit Islandora Foundation will help maintain code, documentation, training, community participation and more
• Membership model
• Partner - $10K, Board, Resources, Camps
• Collaborator - $4K, Roadmap
• Member - $2K, links
Progress
• Non-profit registered
• UPEI and discoverygarden Partners
• Commitment from other members in 1st month sufficient to hire 1 staff person
• Goal is to have 2 FT staff by Fall 2013
Research Data
Physical Data Model
• UPEI/DGI developing a generic data tool to work with systems researchers use now
• Provide a range of filesystem sync tools
• Minimal service - store data in repository
• Enhance with metadata, transform services
• Project metadata CASRAI/VIVO/CERIF +
Fedora Repository
DescMD
TechMD
AdminMD
Assets
Local File System
DropBox
Box.net
DataStage
Google Drive
Private Cloud
Storage
Generic Research Data SP
(+ Standard SPs, Viewers)
Sync
Extract
Transform
Enrich
Check
Mint
Taverna
DataCite
FITS + Authority
Islandora Generic Research Data Architecture
Islandora Framework
Islandora VRE (Virtual Research Environment)
Islandora IR (Research Articles)
Backups
Regional & National TDRs
Intellectual Data Model
• Smithsonian/DGI developing Sidora system to respond to specific research data needs
• Custom interface, Content Models and Forms, adding Taverna/R integration
• Camera trap images, archaeological data, carbon sequestration data
• File browse interface for all operations
Fedora Repository
DescMD
TechMD
AdminMD
Assets
Image SP + FGDC, DwC
Numeric Data SP + FGDC, DDI
Panama Dig Data + LIDO
Research Articles
Sidora Application
Taverna R
FITS + Authority
The Smithsonian Data Architecture
Islandora Framework
Sidora
Intellectual Data Model
Physical Data Model
Additional
• Domain-specific Solution Packs for 2013
• Digital Humanities
• Chemistry
• Biodiversity
• Taverna+R++ integration
Examples
Institutional Repository
Digital Collections
Research Data
UPEI VRE
• Rich implementation of Islandora
• Used for digital stewardship of research, administrative and learning assets of UPEI
• Over 150 VREs with wide range of features
• VRE Management Team with 4 librarians
• Standard no cost, extra features charged
Consortia
Admin Collections
Links
• General: islandora.ca, discoverygarden.ca, islandora.ca/if, sandbox.islandora.ca, wiki.duraspace.org/display/FF/Fedora+Futures+Home, duracloud.org
• Code: github.com/Islandora, jenkins.discoverygarden.ca, travis-ci.org/Islandora/islandora/pull_requests, wiki.duraspace.org/display/ISLANDORA/Islandora, jira.duraspace.org/browse/ISLANDORA
• Institutional Repositories: islandscholar.ca, digital.march.es/ceacs-ir, digital.grinnell.edu/drupal/, digitalunc.coalliance.org/
• Digital Library Collections: peildo.ca, digital.march.es/clamor, digital.march.es/merce, newspapers.vre.upei.ca, mirc.sc.edu, islandimagined.ca, vre2.upei.ca/pwc/, atmintis.mb.vu.lt/en, unbound.williams.edu
• Research Data: library.upei.ca/vre, www.taverna.org.uk/, vdp.vre3.upei.ca/, modernistcommons.ca, vre2.upei.ca/herbarium/, discoveryspace.upei.ca/parca, discoveryspace.upei.ca/quantumchem/, upeikerrlab.ca
• Consortia: cairnrepo.ca, adrresources.coalliance.org
Note: some of these sites require authentication access - contact Mark for more information.
Questions?
Mark Leggott - University of PEI/discoverygarden
Kathleen Van Ekris - [email protected]