EPrints and the Cloud
Uploaded by leslie-carr
EPrints Cloud Visions
What is EPrints For?
EPrints offers a safe, open and useful place to store, share and manage material in the pursuit of research and educational agendas.
Administrative reporting, collaboration, data sharing, digital profile enhancement, e-learning, e-publishing, e-research, marketing, open access, preservation, publicity, research assessment, research management, scholarly collections
Research Curation, Researcher Support
Researchers’ environment supported by repository
Research data managed by repository
Research community assisted by repository
What is a Repository?
Safe, secure, persistent, managed storage for files
Safe, secure, persistent management of shareable FRBR works
Safe, secure, persistent, management of scholarly & scientific working
Leading to…
Science 2.0 / The Fourth Paradigm / Data Intensive Science
The challenge is not cloud computing but cloud thinking
Bio-Diversity
Current EPrints Cloud Capabilities
Amazon Elastic Compute Cloud Machine Images (AMIs)
• Small (single core / 1.7 GB)
• Large (64-bit / quad core / 7.5 GB)
• Extra Large (64-bit / 8 core / 15 GB)
EPrints 3.2 is 64 Bit Enabled
Persistent Database & Storage
Really Excited - Super Fast / Cheap / Easy!
Cloud to Desktop Storage
Data can be stored on multiple storage services
Local disk, SAN, NAS, Honeycomb, Cloud
Researchers can mount repository objects as a networked filesystem
Service usage and preservation risks can be monitored and analysed.
Hybrid Storage In EPrints
A single storage solution has drawbacks.
Cost vs. Speed vs. Reliability
Repositories need to be agile: able to use new platforms and to migrate to them
Leverage the benefits of each solution without losing control of your digital objects.
Local Disk Storage
• No local bandwidth costs
• Hard to expand
• Locally managed
• High overhead costs
• Requires space and cooling
• Tied closely to the software
[Diagram: Storage Ecosystem]
Local Archival Storage
• Specialist
• Expensive to purchase
• Locally managed
• Space and running costs
• Expandable
Cloud Storage
• Scalable
• Externally controlled
• Known costs
• Unclear retention policy
• Reusable (using simple APIs)
• Global scale
But Clouds Blow Away
Recently: Yahoo Briefcase, XDrive, AOL Pictures, HP Upline, Sony Image Station
Source: Tom Spring - PCWorld
Why use Hybrid Storage?
Use the best features of each storage type
• Performance: scaling up bandwidth
• Optimisation: large-file handling, multimedia streaming
• Localised delivery: local delivery from the cloud
EPrints Storage Controller
• The storage controller decides where to put a file.
• Rule-based policy defined by XML configuration file
• Large binary files of scientific data (raw machine output) can be stored on a large, slower-access disk system and sent to a tape company for long-term storage.
• Processed results can be stored locally and in the cloud ready for rapid delivery to end points.
Architecture Diagram
Controller Ruleset
<choose>
  <when test="datasetid = 'document'">
    <choose>
      <when test="$parent{relation_type} = 'isVolatileVersionOf'">
        <plugin name="Local"/>
      </when>
      <otherwise>
        <plugin name="AmazonS3"/>
      </otherwise>
    </choose>
  </when>
  <otherwise>
    <plugin name="Local"/>
  </otherwise>
</choose>
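As a rough illustration of how a ruleset like the one above might be evaluated, the Python sketch below walks the choose/when/otherwise tree and returns the plug-in names from the first matching branch. It is not the EPrints implementation. The function names, the context dictionary and the tiny test evaluator (which only understands name = 'literal') are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# The controller ruleset from the slide, reproduced as a string.
RULESET = """
<choose>
  <when test="datasetid = 'document'">
    <choose>
      <when test="$parent{relation_type} = 'isVolatileVersionOf'">
        <plugin name="Local"/>
      </when>
      <otherwise>
        <plugin name="AmazonS3"/>
      </otherwise>
    </choose>
  </when>
  <otherwise>
    <plugin name="Local"/>
  </otherwise>
</choose>
"""

# Assumption: only tests of the form  name = 'literal'  are supported here.
_TEST = re.compile(r"^\s*(\S+)\s*=\s*'([^']*)'\s*$")


def condition_holds(expr, context):
    """Return True if the named context value equals the quoted literal."""
    match = _TEST.match(expr)
    if match is None:
        raise ValueError(f"unsupported test expression: {expr!r}")
    name, literal = match.groups()
    return context.get(name) == literal


def select_plugins(node, context):
    """Collect plug-in names from the first matching branch of each <choose>."""
    if node.tag == "plugin":
        return [node.get("name")]
    if node.tag == "choose":
        for branch in node:
            is_default = branch.tag == "otherwise"
            if is_default or condition_holds(branch.get("test"), context):
                return select_plugins(branch, context)
        return []
    # <when> / <otherwise>: gather whatever their children select.
    plugins = []
    for child in node:
        plugins.extend(select_plugins(child, context))
    return plugins


def plugins_for(context):
    """Hypothetical entry point: context maps rule variable names to values."""
    return select_plugins(ET.fromstring(RULESET), context)


if __name__ == "__main__":
    print(plugins_for({"datasetid": "document"}))
```

Under these assumptions, a volatile version of a document resolves to Local, any other document to AmazonS3, and objects in every other dataset to Local.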
EPrints Storage Manager
Amazon S3 Localisation (1)
Amazon S3 Localisation (2)
Preservation Services
Object Classification
Risk Analysis
Mitigation and Migration
EPrints Forthcoming Development
EPrints Cloud Services
Web-based repository setup
• Much like getting started with a blog.
• Fill in a form and obtain a repository.
• Coming to EPrints core in the next major release.
Enterprise Support for Cloud Solutions
• Full setup & configuration
• Global distribution
• Auto upgrade & patching
• Trusted backup
EPrints 3.2
Plug-ins / Modules
• Everything builds on the core layer
• A major part of v3.2 is strengthening the core and adding more abstraction layers
• Improved data model
• Enhanced data facilities
• Enhanced metadata facilities
• Improved programming & API
EPrints 3.2 Structure
Community Driven Development
There are many abstraction layers.
• Display Manipulation
• Upload Handlers
• Custom Datasets
• Import / Export Plug-ins
• Transcoding Plug-ins
• Database Plug-ins
• Storage Plug-ins
One API
Storage Plug-ins
Local, NFS, Amazon S3, Sun Cloud Storage Service, Microsoft Azure
Any others based on the S3 API… (the last 3 all are)
5-call API (about 30 minutes to write a plug-in)
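To give a feel for how small a storage plug-in can be, here is a minimal Python sketch of a five-method storage interface with an in-memory backend. The slides do not name the five calls, so the method names (store, retrieve, delete, exists, size) and both class names are placeholders chosen for illustration, not the EPrints API.

```python
from abc import ABC, abstractmethod


class StoragePlugin(ABC):
    """Hypothetical five-call storage interface (names are illustrative only)."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def retrieve(self, key: str) -> bytes: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...

    @abstractmethod
    def size(self, key: str) -> int: ...


class InMemoryStorage(StoragePlugin):
    """Toy backend that keeps objects in a dict; stands in for disk or cloud."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def retrieve(self, key: str) -> bytes:
        return self._objects[key]  # raises KeyError if the object is missing

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)  # deleting a missing object is a no-op

    def exists(self, key: str) -> bool:
        return key in self._objects

    def size(self, key: str) -> int:
        return len(self._objects[key])


if __name__ == "__main__":
    backend = InMemoryStorage()
    backend.store("doc/1/paper.pdf", b"%PDF-")
    print(backend.exists("doc/1/paper.pdf"), backend.size("doc/1/paper.pdf"))
```

A backend for another service would implement the same five methods, which is the property that lets a storage controller treat local and cloud storage interchangeably.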
Our Development Vision
Empower the community with a simple API
API in 3.2
Give the community a platform to test their code
Use the Cloud!
Give the community a distribution mechanism
The EPrints Bazaar (beta)
EPrints Bazaar
Similar in concept to Apple’s App Store
Every install of EPrints will have access to the Bazaar
Single click install/uninstall of plug-ins
EPrints Services approved plug-ins
Enterprise support for limited 3rd-party plug-ins
Summary
EPrints provides a professional, enterprise-level application for resource management
Including cloud support at many levels
• Repository-in-the-cloud
• Storage-in-the-cloud
• Services-in-the-cloud