
Transcript of PPT

Page 1: PPT

Rule-Based Distributed Data Management

iRODS 1.0 - Jan 23, 2008
http://irods.sdsc.edu

Reagan W. Moore
Mike Wan
Arcot Rajasekar
Wayne Schroeder

San Diego Supercomputer Center
{moore, mwan, sekar, schroede}@sdsc.edu

http://irods.sdsc.edu

http://www.sdsc.edu/srb/

Page 2: PPT

Data Management Goals

• Support for data life cycle
  • Shared collections -> data publication -> reference collections
• Support for socialization of collections
  • Process that governs life cycle transitions
  • Consensus building for collection properties
• Generic infrastructure
  • Common underlying distributed data management technology
  • iRODS - integrated Rule-Oriented Data System

Page 3: PPT

NSF Software Development for CyberInfrastructure Data Improvement Project

• #0721400: Data Grids for Community Driven Applications
• Three major components:
  • Maintain the highly successful Storage Resource Broker (SRB) data grid technology for use by the NSF research community
  • Create an open source production version of the integrated Rule-Oriented Data System (iRODS)
  • Support migration of collections from the current SRB data grid to the iRODS data grid

Page 4: PPT

Why Data Grids (SRB)?

• Organize distributed data into shared collections
• Improve the ability for researchers to collaborate on national and international scales
• Provide generic distributed data management mechanisms
  • Logical name spaces (files, users, storage systems)
  • Collection metadata
  • Replicas, versions, backups
  • Optimized data transport
  • Authentication and authorization across domains
  • Support for community specific clients
  • Support for vendor specific storage protocols
  • Support for remote processing on data, aggregation in containers
  • Management of all phases of the data life cycle

Page 5: PPT

Using a SRB Data Grid - Details

Components: SRB servers; metadata catalog (database).

• User asks for data
• Data request goes to SRB server
• Server looks up information in the catalog
• Catalog tells which SRB server has the data
• 1st server asks 2nd for the data
• The 2nd SRB server supplies the data

Page 6: PPT

Extremely Successful

• Storage Resource Broker (SRB) manages 2 PBs of data in internationally shared collections
• Data collections for NSF, NARA, NASA, DOE, DOD, NIH, LC, NHPRC, IMLS; APAC, UK e-Science, IN2P3, WUNgrid
  • Astronomy: Data grid
  • Bio-informatics: Digital library
  • Earth sciences: Data grid
  • Ecology: Collection
  • Education: Persistent archive
  • Engineering: Digital library
  • Environmental science: Data grid
  • High energy physics: Data grid
  • Humanities: Data grid
  • Medical community: Digital library
  • Oceanography: Real-time sensor data, persistent archive
  • Seismology: Digital library, real-time sensor data
• Goal has been generic infrastructure for distributed data

Page 7: PPT

Collection growth at three dates (5/17/02, 6/30/04, 11/29/07); columns per date are GBs of data stored, 1000's of files, and users with ACLs (no ACL counts for 5/17/02).

| Project | GB 5/17/02 | Kfiles 5/17/02 | GB 6/30/04 | Kfiles 6/30/04 | Users 6/30/04 | GB 11/29/07 | Kfiles 11/29/07 | Users 11/29/07 |
|---|---|---|---|---|---|---|---|---|
| Data Grid | | | | | | | | |
| NSF / NVO | 17,800 | 5,139 | 51,380 | 8,690 | 80 | 88,216 | 14,550 | 100 |
| NSF / NPACI | 1,972 | 1,083 | 17,578 | 4,694 | 380 | 39,697 | 7,590 | 380 |
| Hayden | 6,800 | 41 | 7,201 | 113 | 178 | 8,013 | 161 | 227 |
| Pzone | 438 | 31 | 812 | 47 | 49 | 28,799 | 17,640 | 68 |
| NSF / LDAS-SALK | 239 | 1 | 4,562 | 16 | 66 | 207,018 | 169 | 67 |
| NSF / SLAC-JCSG | 514 | 77 | 4,317 | 563 | 47 | 23,854 | 2,493 | 55 |
| NSF / TeraGrid | - | - | 80,354 | 685 | 2,962 | 282,536 | 7,257 | 3,267 |
| NIH / BIRN | - | - | 5,416 | 3,366 | 148 | 20,400 | 40,747 | 445 |
| NCAR | - | - | - | - | - | 70,334 | 325 | 2 |
| LCA | - | - | - | - | - | 3,787 | 77 | 2 |
| Digital Library | | | | | | | | |
| NSF / LTER | 158 | 3 | 233 | 6 | 35 | 260 | 42 | 36 |
| NSF / Portal | 33 | 5 | 1,745 | 48 | 384 | 2,620 | 53 | 460 |
| NIH / AfCS | 27 | 4 | 462 | 49 | 21 | 733 | 94 | 21 |
| NSF / SIO Explorer | 19 | 1 | 1,734 | 601 | 27 | 2,750 | 1,202 | 27 |
| NSF / SCEC | - | - | 15,246 | 1,737 | 52 | 168,931 | 3,545 | 73 |
| LLNL | - | - | - | - | - | 18,934 | 2,338 | 5 |
| CHRON | - | - | - | - | - | 12,863 | 6,443 | 5 |
| Persistent Archive | | | | | | | | |
| NARA | 7 | 2 | 63 | 81 | 58 | 5,023 | 6,430 | 58 |
| NSF / NSDL | - | - | 2,785 | 20,054 | 119 | 7,499 | 84,984 | 136 |
| UCSD Libraries | - | - | 127 | 202 | 29 | 5,205 | 1,328 | 29 |
| NHPRC / PAT | - | - | - | - | - | 2,576 | 966 | 28 |
| RoadNet | - | - | - | - | - | 3,557 | 1,569 | 30 |
| UCTV | - | - | - | - | - | 7,140 | 2 | 5 |
| LOC | - | - | - | - | - | 6,644 | 192 | 8 |
| Earth Sci | - | - | - | - | - | 6,136 | 652 | 5 |
| TOTAL | 28 TB | 6 mil | 194 TB | 40 mil | 4,635 | 1,023 TB | 200 mil | 5,539 |

Page 8: PPT

Generic Infrastructure

• Data grids manage data distributed across multiple types of storage systems
  • File systems, tape archives, object ring buffers
• Data grids manage collection attributes
  • Provenance, descriptive, system metadata
• Data grids manage technology evolution
  • At the point in time when new technology is available, both the old and new systems can be integrated

Page 9: PPT

SRB Maintenance

• Release of SRB 3.5.0 was made on December 3, 2007
  • Eliminated a security vulnerability through use of bind variables for interacting with databases
• Implemented additional community requirements
  • Provided bulk replication and backup
  • Automated reconnection on network read-write failures
  • Incorporated bug fixes

Page 10: PPT

Why iRODS?

• Need to verify assertions about the purpose of a collection
  • Socialization of data collections: map from creator assertions to community expectations
• Need to manage exponential growth in collection size
• Improve support for all phases of the data life cycle, from shared data within a project, to published data in a digital library, to reference collections within an archive
  • The data life cycle is a way to prune collections and identify what is valuable
• Need to minimize labor by automating enforcement of management policies

Page 11: PPT

Starting Requirements

• Base capabilities upon features required by scientific research communities
  • Started with features in the SRB data grid, but needed to understand the impact of management policies and procedures
• Incorporate trustworthiness assessment criteria from the preservation community
  • Other criteria include human subject approval, patient confidentiality, time-dependent access controls
• Promote international support for iRODS development to ensure international use

Page 12: PPT

Data Management Applications

• Data grids
  • Share data - organize distributed data as a collection
• Digital libraries
  • Publish data - support browsing and discovery
• Persistent archives
  • Preserve data - manage technology evolution
• Real-time sensor systems
  • Federate sensor data - integrate across sensor streams
• Workflow systems
  • Analyze data - integrate client- & server-side workflows

Page 13: PPT

Approach

• To meet the diverse requirements, the architecture must:
  • Be highly modular
  • Be highly extensible
  • Provide infrastructure independence
  • Enforce management policies
  • Provide scalability mechanisms
  • Manipulate structured information
  • Enable community standards

Page 14: PPT

Observations of Production Data Grids

• Each community implements different management policies
  • Community specific preservation objectives
  • Community specific assertions about properties of the shared collection
  • Community specific management policies
• Need a mechanism to support the socialization of shared collections
  • Map from assertions made by collection creators to expectations of the users

Page 15: PPT

Tension between Common and Unique Components

• Synergism - common infrastructure
  • Distributed data
    • Sources, users, performance, reliability, analysis
  • Technology management
    • Incorporate new technology
• Unique components - extensibility
  • Information management
    • Semantics, formats, services
  • Management policies
    • Integrity, authenticity, availability, authorization

Page 16: PPT

Data Grid Evolution

• Implement essential components needed for synergism
  • Storage Resource Broker - SRB
  • Infrastructure independence
  • Data and trust virtualization
• Implement components needed for specific management policies and processes
  • integrated Rule Oriented Data System - iRODS
  • Policy management virtualization
  • Map processes to standard micro-services
  • Structured information management and transmission

Page 17: PPT

Initial iRODS Design: Next-Generation Data Grid Technology

• Open source software - BSD license
• Unique capability - virtualization of management policies
  • Map management policies to rules
  • Enforce rules at each remote storage location
• Highly extensible modular design
  • Management procedures are mapped to micro-services that encapsulate operations performed at the remote storage location
  • Can add rules, micro-services, and state information
• Layered architecture
  • Separation of client protocols from storage protocols

Page 18: PPT

Using an iRODS Data Grid - Details

Components: iRODS servers, each with a rule engine; metadata catalog and rule base (database).

• User asks for data
• Data request goes to iRODS server
• Server looks up information in the catalog
• Catalog tells which iRODS server has the data
• 1st server asks 2nd for the data
• The 2nd iRODS server applies rules
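The routing flow above can be sketched in a few lines. This is an illustrative simulation, not the iRODS API: the catalog layout, server names, and the single `acPreProcForGet`-style rule hook are all invented for clarity.

```python
# Illustrative sketch of the slide's request flow (names are hypothetical).
CATALOG = {"/zone/home/alice/results.dat": "serverB"}  # logical path -> holding server
RULES = {"preGet": lambda path: path.endswith(".dat")}  # rule applied by the holder

def get_data(requested_path, entry_server="serverA"):
    """Client asks one server; the catalog locates the replica; the holding
    server applies its rules before the data is returned."""
    holder = CATALOG.get(requested_path)           # server looks up the catalog
    if holder is None:
        raise FileNotFoundError(requested_path)
    if not RULES["preGet"](requested_path):        # 2nd server applies rules
        raise PermissionError("rule denied access")
    return f"bytes of {requested_path} served by {holder} via {entry_server}"
```

The point of the sketch is that the client never needs to know which server holds the data; the catalog and the rule engine at the holding server mediate every access.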

Page 19: PPT

Data Virtualization

Traditional approach: the client (e.g. Microsoft Word) talks directly to the storage system using Unix I/O.

Layers:
• Access Interface
• Storage Protocol
• Storage System

Page 20: PPT

Data Virtualization (Digital Library)

The client talks to the Digital Library, which then interacts with the storage system using Unix I/O.

Layers:
• Access Interface
• Digital Library
• Storage Protocol
• Storage System

Page 21: PPT

Data Virtualization (iRODS)

Layers:
• Access Interface
• Data Grid
• Standard Micro-services
• Standard Operations
• Storage Protocol
• Storage System

• Map from the actions requested by the access method to a standard set of micro-services.
• The standard micro-services use standard operations.
• Separate protocol drivers are written for each storage system.
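The layering above can be sketched as follows. This is a hypothetical illustration of the pattern, not iRODS code: the micro-service and driver names are invented, and the "storage protocols" are simulated with in-memory dictionaries.

```python
# Sketch of data virtualization: standard micro-services call only standard
# operations (read/write); each storage system supplies its own driver.
class UnixDriver:
    """Driver for a plain file system (simulated in memory)."""
    def __init__(self):
        self.blobs = {}
    def write(self, path, data):
        self.blobs[path] = data
    def read(self, path):
        return self.blobs[path]

class ArchiveDriver(UnixDriver):
    """Same standard operations, different storage protocol behind them."""
    def write(self, path, data):
        super().write(path, b"compressed:" + data)  # simulated archive behavior

def msi_put(driver, logical_path, data):   # "standard micro-service"
    driver.write(logical_path, data)       # uses only standard operations

def msi_get(driver, logical_path):
    return driver.read(logical_path)
```

Because the micro-services touch only the standard operations, adding a new storage system means writing one new driver, not changing any client or micro-service.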

Page 22: PPT

iRODS Release 1.0

• Open source software available at wiki:
  • http://irods.sdsc.edu
• Since January 23, 2008, more than 590 downloads by projects in 18 countries:
  • Australia, Austria, Belgium, Brazil, China, France, Germany, Hungary, India, Italy, Norway, Poland, Portugal, Russia, Spain, Taiwan, UK, and the US

Page 23: PPT

Core Components

• Framework
  • Infrastructure that ties together the layered environment
• Drivers
  • Infrastructure that interacts with commercial protocols (database, storage, information resource)
• Clients
  • Community specific access protocols
• Rules
  • Management policies specific to a community
• Micro-services
  • Management procedures specific to a community
• Quality assurance
  • Testing routines for code validation
• Maintenance
  • Bug fixes, help desk, chat, bugzilla, wiki

Page 24: PPT

Rule Specification

• Rule - Event : Condition : Action set : Recovery procedure
  • Event - atomic, deferred, periodic
  • Condition - test on any state information attribute
  • Action set - chained micro-services and rules
  • Recovery procedure - ensure transaction semantics in a distributed world
• Rule types
  • System level, administrative level, user level
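In the style of the iRODS rule base format, where each rule is written as event | condition | chained action set | chained recovery, such rules might look as follows. The resource and path names here are invented for illustration; treat the exact micro-service arguments as a sketch rather than a verified configuration.

```
# Hypothetical rules in the style of the iRODS rule base (core.irb).
# Format: event | condition | action set (## chains) | recovery (## chains)

# After any ingest into the project collection, checksum and replicate;
# the recovery column mirrors the action chain for transaction semantics.
acPostProcForPut|$objPath like /tempZone/home/project/*|msiSysChksumDataObj##msiSysReplDataObj(backupResc,null)|nop##nop

# Deny deletion by the anonymous user anywhere in the zone.
acDataDeletePolicy|$userNameClient == anonymous|msiDeleteDisallowed|nop
```

The first rule is event-driven (fires on ingest); periodic and deferred rules use the same four-part shape with a different trigger.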

Page 25: PPT

Distributed Management System

Components:
• Rule Engine
• Data Transport
• Metadata Catalog
• Execution Control
• Messaging System
• Execution Engine
• Virtualization
• Server-Side Workflow
• Persistent State Information
• Scheduling
• Policy Management

Page 26: PPT

integrated Rule-Oriented Data System

Architecture components:
• Client Interface and Admin Interface
• Current State
• Rule Invoker
• Micro-Service Modules (metadata-based services)
• Micro-Service Modules (resource-based services)
• Resources
• Service Manager
• Rule Engine, Rule Base, and Rule Confs
• Rule Modifier, Config Modifier, and Metadata Modifier Modules
• Consistency Check Modules
• Metadata Persistent Repository

Page 27: PPT

iRODS Data Grid Capabilities

• Remote procedures
  • Atomic / deferred / periodic
  • Procedure execution / chaining
• Structured information
  • Metadata catalog interactions / 205 queries
  • Information transmission
  • Template parsing
  • Memory structures
  • Report generation / audit trail parsing

Page 28: PPT

iRODS Data Grid Capabilities

• Rules
  • User / administrative / internal
  • Remote web service invocation
  • Rule & micro-service creation
  • Standards / XAM, SNIA
• Installation
  • CVS / modules
  • System dependencies
  • Automation

Page 29: PPT

iRODS Data Grid

• Administration
  • User creation
  • Resource creation
  • Token management
  • Listing
• Collaborations
  • Development plans
  • International collaborators
  • Federations

Page 30: PPT

Three Major Innovations

1. Management virtualization
  • Expression of management policies as rules
  • Expression of management procedures as remote micro-services
  • Expression of assertions as queries on persistent state information
  • Required the addition of three more logical name spaces: for rules, micro-services, and state information

Page 31: PPT

Second Major Innovation

• Recognition of the need to support structured information
• Manage exchange of structured information between micro-services
  • Argument passing
  • Memory white board
• Manage transmission of structured information between servers and clients
  • C-based protocol for efficiency
  • XML-based protocol to simplify client porting (Java)
  • High performance message system

Page 32: PPT

Third Major Innovation

• Development of the Mounted Collection interface
  • Standard set of operations (20) for extracting information from a remote information resource
  • Allows the data grid to interact with autonomous resources which manage information independently of iRODS
  • Structured information drivers implement the information exchange protocol used by a particular information repository
• Examples
  • Mounted Unix directory
  • Tar file

Page 33: PPT

SDCI Project at SDSC

• Implement using spiral development. Iterate across development phases:
  • Requirements - driven by application communities
  • Prototypes - community-specific implementation of a new feature / capability
  • Design - creation of a generic mechanism that is suitable for all communities
  • Implementation - robust, reliable, high-performing code
  • Maintenance - documentation, quality assurance, bug fixes, help desk, testing environment
• We find communities eager to participate in all phases of spiral development

Page 34: PPT

Example External Projects

• International Virtual Observatory Alliance (federation of astronomy researchers)
  • Observatoire de Strasbourg ported the IVOA VOSpace interface on top of iRODS
  • This means the astronomy community can use their web service access interface to retrieve data from iRODS data grids
• Parrot grid interface
  • Ported by Douglas Thain (University of Notre Dame) on top of iRODS; builds a user-level file system across GridFTP, iRODS, HTTP, and other protocols

Page 35: PPT

Collaborators - Partial List

• Aerospace Corporation
• Academia Sinica, Taiwan
• UK security project, King's College London
• BaBar High Energy Physics
• Biomedical Informatics Research Network
• California State Archive
• Chinese Academy of Sciences
• CASPAR - Cultural, Artistic, and Scientific knowledge for Preservation Access and Retrieval
• CineGrid - Media cyberinfrastructure
• DARIAH - Infrastructure for arts and humanities in Europe
• D-Grid - TextGrid project, Germany
• DSpace Foundation digital library
• Fedora Commons digital library
• Institut national de physique nucleaire et de physique des particules
• IVOA - International Virtual Observatory Alliance
• James Cook University
• KEK - High Energy Accelerator Research Organization, Japan
• LOCKSS - Lots of Copies Keep Stuff Safe
• Lstore - REDDnet Research and Education Data Depot network
• NASA Planetary Data System
• National Optical Astronomy Observatory
• Ocean Observatory Initiative
• SHAMAN - Sustaining Heritage through Multivalent Archiving
• SNIA - Storage Networking Industry Association
• Temporal Dynamics of Learning Center

Page 36: PPT

Project Coordination

• Define international collaborators
  • Technology developers for a specific development phase for a specific component
• Collaborators span:
  • Scientific disciplines
  • Communities of practice (digital library, archive, grid)
  • Technology developers
  • Resource providers
  • Institutions and user communities
• Federations within each community are essential for managing the scientific data life cycle

Page 37: PPT

Scientific Data Life Cycle

• Shared collection
  • Used by a project to promote collaboration between distributed researchers
  • Project members agree on semantics, data formats, and manipulation services
• Data publication
  • Requires defining context for the data
  • Provenance, conformance to community format standards
• Reference collections
  • Community standard against which future research results are compared

Page 38: PPT

Scientific Data Life Cycle

• Each phase of the life cycle requires consensus by a broader community
• Need mechanisms for expressing the new purpose for the data collection
• Need mechanisms that verify
  • Authoritative source
  • Completeness
  • Integrity
  • Authenticity

Page 39: PPT

Why iRODS?

• Collections are assembled for a purpose
  • Map purpose to assessment criteria
  • Use management policies to meet assertions
  • Use management procedures to enforce policies
  • Track persistent state information generated by every procedure
  • Validate criteria by queries on state information and on audit trails

Page 40: PPT

Data Management

• Data Management Environment: assessment criteria, management policies, capabilities, management functions; conserved properties, control mechanisms, remote operations
• Data Management Infrastructure (data grid - management virtualization): rules, micro-services, persistent state
• Physical Infrastructure (data grid - data and trust virtualization): database, rule engine, storage system

iRODS - integrated Rule-Oriented Data System

Page 41: PPT

Why iRODS?

• Can create a theory of data management
  • Prove compliance of the data management system with specified assertions
• Three components:
  1. Define the purpose for the collection, expressed as assessment criteria, management policies, and management procedures
  2. Analyze completeness of the system
    • For each criterion, persistent state is generated that can be audited
    • Persistent state attributes are generated by specific procedure versions
    • For each procedure version there are specific management policy versions
    • For each criterion, there are governing policies
  3. Audit properties of the system
    • Periodic rules validate assessment criteria
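The audit step can be sketched as queries on persistent state. This is a hypothetical simulation of the idea, not iRODS code: the state attributes (`checksum_verified`, `replicas`) and criterion names are invented for illustration.

```python
# Sketch of a periodic audit rule: assessment criteria expressed as
# queries over persistent state, reporting the objects that fail.
STATE = [  # persistent state rows a real system would keep in its catalog
    {"object": "/zone/a.dat", "checksum_verified": True,  "replicas": 2},
    {"object": "/zone/b.dat", "checksum_verified": False, "replicas": 1},
]

CRITERIA = {  # assessment criteria as predicates on a state row
    "integrity":    lambda row: row["checksum_verified"],
    "availability": lambda row: row["replicas"] >= 2,
}

def audit(state, criteria):
    """Return, per criterion, the objects that currently violate it."""
    return {name: [r["object"] for r in state if not test(r)]
            for name, test in criteria.items()}
```

A periodic rule would run such an audit on a schedule and trigger repair procedures (e.g. re-replication) for any objects reported.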

Page 42: PPT

Major iRODS Research Question

• Do we federate data grids as was done in the SRB, by explicitly cross-registering information?
• Or do we take advantage of the Mounted Collection interface and access each data grid as an autonomous information resource?
• Or do we use a rule-based database access interface for interactions between iCAT catalogs?

Page 43: PPT

Federation Between iRODS Data Grids

Two data grids (Data Collection A and Data Collection B), each with its own:
• Logical resource name space
• Logical user name space
• Logical file name space
• Logical rule name space
• Logical micro-service name space
• Logical persistent state

Data access methods: Web browser, DSpace, OAI-PMH

Page 44: PPT

Mounted Collections

• Minimizes dependencies between the autonomous systems
• Supports retrieval from the remote information resource, but not pushing of information
  • Pull environment
• Can be controlled by rules that automate interactions
  • Chained data grids
  • Central archive (archive pulls from other data grids)
  • Master-slave data grids (slaves pull from master)
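The pull-only pattern above can be sketched as a single periodic pass: a central archive polls each autonomous grid through a mounted-collection-style interface and copies only what it has not yet seen. All names are illustrative; the grids are simulated as dictionaries.

```python
# Sketch of pull-based federation: the archive retrieves from remote
# grids and never pushes information back to them.
def pull_sync(archive, remote_grids):
    """One periodic-rule pass over every remote grid's collection."""
    for grid_name, collection in remote_grids.items():
        for name, data in collection.items():
            key = f"{grid_name}:{name}"
            if key not in archive:   # retrieval only; existing copies kept
                archive[key] = data
    return archive
```

The same loop, run from each slave against a single master, gives the master-slave configuration; chaining archives gives the chained-data-grid configuration.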

Page 45: PPT

Rule-based Database Access Interface

• Support interactions by querying the remote iCAT catalog's database
• Expect to support publication of schemata
• Ontology-based reasoning on semantics
• Can be used for both deposition and retrieval of information
• Simplifies exchange of rules and possibly of micro-services

Page 46: PPT

iRODS Development

• NSF - SDCI grant "Adaptive Middleware for Community Shared Collections"
  • iRODS development, SRB maintenance
• NARA - Transcontinental Persistent Archive Prototype
  • Trusted repository assessment criteria
• NSF - Ocean Research Interactive Observatory Network (ORION)
  • Real-time sensor data stream management
• NSF - Temporal Dynamics of Learning Center data grid
  • Management of IRB approval

Page 47: PPT

iRODS Development Status

• Production release is version 1.0
  • January 24, 2008
• International collaborations
  • SHAMAN - University of Liverpool
    • Sustaining Heritage Access through Multivalent ArchiviNg
  • UK e-Science data grid
  • IN2P3 in Lyon, France
  • DSpace policy management
  • Shibboleth - collaboration with ASPIS

Page 48: PPT

Planned Development

• GSI support (1)
• Time-limited sessions via a one-way hash authentication
• Python client library
• GUI browser (AJAX, in development)
• Driver for HPSS (in development)
• Driver for SAM-QFS
• Porting to additional versions of Unix/Linux
• Porting to Windows
• Support for MySQL as the metadata catalog
• API support packages based on the existing mounted collection driver
• MCAT to iCAT migration tools (2)
• Extensible metadata, including a Database Access Interface (6)
• Zones/Federation (4)
• Auditing - mechanisms to record and track iRODS metadata changes

Page 49: PPT

For More Information

Reagan W. Moore
San Diego Supercomputer Center
[email protected]

http://www.sdsc.edu/srb/
http://irods.sdsc.edu/

Page 50: PPT

Collaborations

| Framework capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| Core server | TPAP, OOI, TDLC, BIRN, … | SDCI | SDCI | SDCI | SDCI | Wan (SDSC) |
| Rule Engine | TPAP, BIRN, TDLC | SDCI | SDCI | SDCI | SDCI | Rajasekar (SDSC) |
| Rule Base | TPAP | SDCI | SDCI | SDCI | SDCI | Rajasekar (SDSC) |
| iCAT catalog | TPAP, TDLC, BIRN, OOI | TPAP, SDCI | SDCI | SDCI | SDCI | Schroeder (SDSC) |
| Extensible iCAT | TPAP | TPAP, SDCI | SDCI | SDCI | SDCI | Schroeder (SDSC) |
| ExecServer (Scheduler) | Teragrid, TPAP, OOI | SDCI | SDCI | SDCI | SDCI | Wan (SDSC) |
| Xmessaging Server | OOI, TPAP, LSST, Kepler | TPAP, OOI | SDCI | SDCI | SDCI | Wan, Rajasekar (SDSC) |
| Transport Layer | DLI2, OOI, Cinegrid, LSST, TPAP, PDS | SDCI | SDCI | SDCI | SDCI | Wan (SDSC) |
| Transport Layer (XML) | TPAP, DLI2, OOI | TPAP, SDCI | SDCI | SDCI | SDCI | Wan (SDSC) |
| Monitoring System | IN2P3, BIRN | IN2P3 | IN2P3 | IN2P3 | IN2P3 | Nief (IN2P3) |
| Mounted Collection | TPAP, OOI, Teragrid, NSDL | TPAP, SDCI | SDCI | SDCI | SDCI | Wan (SDSC) |
| Federation | TPAP, OOI, TDLC, NSDL, BIRN | TPAP, SDCI | SDCI | SDCI | SDCI | Wan, Rajasekar, Schroeder (SDSC) |
| Master-slave catalogs | BIRN | BIRN, SDCI | SDCI | SDCI | SDCI | Schroeder, Wan (SDSC) |
| Authentication | BIRN, TPAP, Teragrid, OSG, NeSC | BIRN, SDCI | SDCI | SDCI | SDCI | Schroeder (SDSC) |

Page 51: PPT

Collaborations

| Driver capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| iCAT database drivers | TPAP, SPAWAR, NSDL | SDCI, TPAP | SDCI | SDCI | SDCI | Schroeder (SDSC) |
| Database access drivers | TDLC, TPAP, ROADnet, OOI | ROADnet, TPAP | SDCI | SDCI | SDCI | Schroeder (SDSC) |
| Storage drivers | TPAP, Teragrid, LSST, SNIA, ZIH | SDCI | SDCI | SDCI | SDCI | Wan (SDSC), Pasquinelli (Sun) |
| Structured Information Drivers | Teragrid, TPAP | Teragrid, TPAP, SDCI, NeSC | SDCI | SDCI | SDCI | Wan (SDSC), Hasan (SHAMAN), Bady (RMBC) |
| SRB-iRODS migration | ZIH, JCU, BIRN, TPAP | TPAP, SDCI | SDCI | SDCI | SDCI | Schroeder, Rajasekar, Wan (SDSC) |
| GSI authentication | SPAWAR, OSG, TeraGrid, NeSC, AS, BMBF | SPAWAR | SDCI | SDCI | SDCI | Schroeder (SDSC) |
| Shibboleth authentication | ASPIS, SHAMAN | NeSC | ASPIS | ASPIS | ASPIS | Hasan (SHAMAN), Beitz (MU) |

Page 52: PPT

Collaborations

| Client capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| Jargon - Java I/O library | BIRN, TPAP, OOI, Teragrid, TDLC | BIRN, SDCI, TPAP | BIRN, SDCI | BIRN, SDCI | BIRN, SDCI | Gilbert (SDSC) |
| Web client | BIRN, Teragrid, OSG, TPAP, LUScid | SDCI, TPAP, LUScid | SDCI | SDCI | SDCI | Lu, Antoine (SDSC) |
| inQ - Windows browser | TPAP, DLI2, OOI, Teragrid, NSDL | SDCI, TPAP, NSDL | SDCI | SDCI | SDCI | Cowart, Zhu (SDSC) |
| iCommands - shell commands | Teragrid, OOI, TPAP | SDCI | SDCI | SDCI | SDCI | Schroeder, Wan, Rajasekar (SDSC) |
| Parrot | PetaShare, Condor | ND | ND | ND | ND | Thain (ND) |
| PHP client API | TPAP, TDLC | TPAP | SDCI | SDCI | SDCI | Lu (SDSC), Atkinson (JCU) |
| PERL load library | TPAP, JCU | JCU | JCU | JCU | JCU | Atkinson (JCU) |
| Python load library | TPAP, NCRIS | TPAP | SDCI | SDCI | SDCI | DeTordy (SDSC), Atkinson (JCU) |
| Fedora | NSDL, KCL FNL, RMBC, D-Grid, DARIAH | NSDL, KCL | NSDL | NSDL | NSDL | Zhu (SDSC), Hodges (KCL), Aschenbrenner (UG) |
| DSpace | MIT, TPAP, AS | MIT | TPAP | MIT | MIT | Smith (MIT) |
| LOCKSS | LOCKSS | LOCKSS | LOCKSS | LOCKSS | LOCKSS | Rosenthal (Stanford) |
| Lstore | REDDnet | REDDnet | REDDnet | REDDnet | REDDnet | Sheldon (ACCRE) |
| Archivist Toolkit | UCSD Libraries | UCSD Libraries | UCSD Libraries | UCSD Libraries | UCSD Libraries | Westbrook (UCSD) |
| Internet Archive | NDIIPP, NSDL, TPAP, CDL | NDIIPP | CDL | CDL | CDL | Zhu (SDSC) |
| SRM | AS, OGF, EGEE, DEISA | AS | AS | AS | AS | (AS) |
| GridFTP | Archer, DEISA | Archer | Archer | Archer | Archer | Sim (JCU) |
| VOSpace | IVOA | NVO | NVO | NVO | NVO | Wagner (UCSD), Lu (SDSC) |
| Sensor Stream Interface | OOI | OOI | OOI | OOI | OOI | Rajasekar, Lu (SDSC), Arrott (UCSD) |
| Video Streaming Interface | Cinegrid | Cinegrid | Cinegrid | Cinegrid | Cinegrid | Wan (SDSC) |

Page 53: PPT

Collaborations

| Rules capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| Administrative | Teragrid, BIRN, TPAP, NOAO, BaBar, NeSC | Teragrid, TPAP | SDCI | SDCI | SDCI | Rajasekar, Schroeder, Chen (SDSC) |
| Trustworthiness | TPAP, SHAMAN | TPAP, SHAMAN | TPAP | TPAP | TPAP | DeTorcy, Marciano (SDSC), Giaretta (CASPAR), Bady (RMBC), Thaller (Planets) |
| IRB | TDLC | TDLC | TDLC | TDLC | TDLC | Hou (SDSC) |
| HIPAA | BIRN | BIRN | BIRN | BIRN | BIRN | Rajasekar (SDSC) |

Page 54: PPT

Collaborations

| Micro-services capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| Administrative | Teragrid, BIRN, TPAP, NOAO, BaBar, NeSC | Teragrid, TPAP | SDCI | SDCI | SDCI | Rajasekar, Schroeder (SDSC) |
| Audit trails | TPAP, RMBC | TPAP | SDCI | SDCI | SDCI | DeTorcy, Schroeder (SDSC), Bady (UMBC) |
| ERA | TPAP, MU | TPAP | TPAP, SDCI | TPAP, SDCI | TPAP, SDCI | Moore, DeTorcy (SDSC), Treloar (MU) |
| Trustworthiness | TPAP, SHAMAN | TPAP, SHAMAN | TPAP | TPAP | TPAP | DeTorcy, Marciano, Kremenek, Moore (SDSC) |
| IRB | TDLC | TDLC | TDLC | TDLC | TDLC | Hou, DeTorcy, Moore (SDSC) |
| HIPAA | BIRN | BIRN | BIRN | BIRN | BIRN | Rajasekar (SDSC) |

Page 55: PPT

Collaborations

Page 56: PPT

Collaborations

| Documentation capability | Requirements | Prototypes | Design | Generic version | Maintenance | Technology Lead |
|---|---|---|---|---|---|---|
| Micro-service creation | SHAMAN, IN2P3, TPAP | SHAMAN, IN2P3, TPAP | SDCI, TPAP | SDCI, TPAP | SDCI, TPAP | Rajasekar, Tooby (SDSC) |
| Micro-services | SHAMAN, IN2P3, TPAP, AS | SHAMAN, IN2P3, TPAP | SDCI, TPAP | SDCI, TPAP | SDCI, TPAP | Rajasekar, Tooby, Schroeder (SDSC), Hasan (SHAMAN), Nief (IN2P3) |
| Clients | SHAMAN, IN2P3, TPAP, AS | SHAMAN, IN2P3, TPAP | SHAMAN, IN2P3, TPAP | SHAMAN, IN2P3, TPAP | SHAMAN, IN2P3, TPAP | Rajasekar, Tooby, Schroeder (SDSC), Hasan (SHAMAN), Nief (IN2P3) |
| SAA demonstration | TPAP | TPAP | TPAP, SDCI | TPAP, SDCI | TPAP, SDCI | Marciano, DeTorcy, Moore (SDSC) |
| Tutorials | AS, NeSC, Teragrid, TPAP | TPAP, SDCI | TPAP, SDCI | TPAP, SDCI | TPAP, SDCI | Moore, Schroeder, Rajasekar, Wan (SDSC), Berry (NeSC), Hasan (SHAMAN), Li (CAS) |
| Concepts | TPAP, OGF | TPAP, SDCI | SDCI, TPAP | SDCI, TPAP | SDCI, TPAP | Moore (SDSC) |
| Community Sustainability | DSpace, Fedora, IA | SDCI | SDCI | SDCI | SDCI | Tooby (SDSC) |
| Standards | OGF, SNIA, Sun, SHAMAN | SDCI, TPAP | SDCI, TPAP | SDCI, TPAP | SDCI, TPAP | Moore (SDSC), Lee (AC) |