Developing a 100G TestBed for Life Science Collaborations Taking advantage of existing UM/SURA dark...

The Next Big Thing COLLABORATIONS IN THE LIFE SCIENCES Gordon K Springer University of Missouri Internet2 Spring Member Meeting April 24, 2012


The Next Big Thing
COLLABORATIONS IN THE LIFE SCIENCES

Gordon K Springer
University of Missouri

Internet2 Spring Member Meeting

April 24, 2012

Developing a 100G TestBed for Life Science Collaborations

Taking advantage of existing UM/SURA dark fiber to create a research 100G pathway from St Louis to Kansas City via Columbia along Interstate 70

Using InCommon Federated Identities for authentication via Shibboleth; authorizations occur via an autonomous Entitlement Server to provide fine-grained authorizations to/from service providers

Developing distributed resource sharing by mapping needs to resources eligible for assignment depending on geography and resource availability

At 100G some distributed computing latency issues can likely be overcome
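As a rough illustration of the 100G point above, the sketch below compares bulk transfer times at different link rates. The 1 TB dataset size and the 10 ms round-trip time are assumptions chosen for the arithmetic, not figures from the slides, and the function names are hypothetical. Note that a faster link shortens the time spent moving data but does not shorten propagation delay, so only the transfer-bound part of a distributed workflow benefits.

```python
def transfer_time_seconds(size_bytes, link_gbps, efficiency=1.0):
    """Idealized time to move size_bytes over a link of link_gbps gigabits/s.

    efficiency is the assumed fraction of the line rate actually achieved
    (1.0 means a perfectly utilized link, which real transfers rarely reach).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    return (size_bytes * 8) / (link_gbps * 1e9 * efficiency)


def bandwidth_delay_product_bytes(link_gbps, rtt_ms):
    """Bytes that must be in flight to keep the link full at a given RTT."""
    return link_gbps * 1e9 * (rtt_ms / 1000.0) / 8


if __name__ == "__main__":
    dataset_bytes = 1e12  # assumed 1 TB sequencing run (illustrative only)
    for gbps in (1, 10, 100):
        seconds = transfer_time_seconds(dataset_bytes, gbps)
        print(f"{gbps:>3} Gb/s: {seconds:8.0f} s")
    # Assumed 10 ms round trip between sites (illustrative only)
    bdp = bandwidth_delay_product_bytes(100, 10)
    print(f"In-flight data needed at 100 Gb/s, 10 ms RTT: {bdp / 1e6:.0f} MB")
```

The bandwidth-delay product is included because a single stream can only fill a 100G path if its buffers hold that much unacknowledged data; otherwise the round-trip time, not the link rate, limits throughput.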

Problem Set

High-throughput sequencing is producing enormous quantities of data and needs a storage infrastructure organized as a private cloud

Need to provision resources for collecting, analyzing and using data according to demand, with processing applications that are network-aware

Develop security, measurement and analysis tools to efficiently run at 100G across a regional multi-cluster environment using OpenFlow and other specialized protocols
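The provisioning item above, together with the earlier goal of mapping needs to resources by geography and availability, can be sketched as a small matching routine. This is a minimal sketch under stated assumptions, not the project's scheduler: the Resource and Request types, the capacity units, and the "hops along I-70" distance measure are all hypothetical. Only the three site names come from the slides.

```python
from dataclasses import dataclass

# Sites in order along the I-70 pathway named in the slides.
SITE_ORDER = ["Kansas City", "Columbia", "St Louis"]


@dataclass
class Resource:
    name: str
    site: str
    kind: str        # e.g. "storage" or "compute" (hypothetical categories)
    free_units: int  # available capacity in arbitrary units


@dataclass
class Request:
    kind: str
    units: int
    site: str        # where the requester is located


def site_distance(a, b):
    """Assumed distance measure: number of hops between sites along I-70."""
    return abs(SITE_ORDER.index(a) - SITE_ORDER.index(b))


def assign(request, resources):
    """Pick an eligible resource: nearest site first, then most free capacity.

    Returns the chosen Resource (with capacity reserved) or None if no
    resource of the right kind has enough free capacity.
    """
    eligible = [r for r in resources
                if r.kind == request.kind and r.free_units >= request.units]
    if not eligible:
        return None
    best = min(eligible,
               key=lambda r: (site_distance(request.site, r.site), -r.free_units))
    best.free_units -= request.units
    return best
```

A request from Columbia for more storage than the local pool holds would fall through to whichever neighboring site has the most free capacity, which is the kind of demand-driven assignment the bullet describes.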

High throughput DNA sequencing at MU DNA Core Facility.

Sequence data analyzed at MU on high-speed systems.

Patterns of gene expression analyzed with microarrays to reveal mechanisms that contribute to reproduction efficiency.

Improved efficiency, quality and profitability is the goal.

Gene annotations quickly obtained by access to other databases through the MU Internet2 high-speed network.

Swine female reproductive tissues and embryos are removed at various times of gestation.

Iterate microarray & other experiments to focus on gene discovery.

Where Does the Data Come From?
A High-Volume EST Pipeline For Discovery

[Map labels: KC, Col, SL, I70]

Big Data/Big Science Collaboratory

The GPN Network

Grant Writers Professional Development Sessions

University of Missouri System

UM InterCampus Network with 100G Pathway along I70

[Map labels: 100G (two segments), HtSeqLSC, Internet2, UM Portion of MOREnet]

Multi-Site Sharing

[Diagram: Site A and Site B, linked by OpenFlow and other protocols. Both sites show the same stack on IBM-labeled hardware, except that one site lists SMP Servers where the other lists Next-Gen.]

Each site shows:

Login and Cloud access

SMP Servers (or Next-Gen), Linux Cluster, GPGPUs, HPC Storage, Visualization & Display

HPC InfiniBand Network, Machine Room GbE Network, Campus GbE Network

Research Data Store
Protocols: CIFS, NFS, HTTP, FTP, SCP
Management: Central Administration, Monitoring, File Mgmt
Availability: Data Migration, Replication, Backup

Instruments (Medical), Instruments (Core Service), Instrument (Research), Lab (Research & Clinical)

Mgmt

Adapted from: The Stanford Clean Slate Program http://cleanslate.stanford.edu

OpenFlow at the packet level

[Diagram labels: Controller, PC; OpenFlow-enabled Commercial Switch with Flow Table, Secure Channel, Normal Software and Normal Datapath; Net Storage; Storage Cloud; Net Processors; Analysis Engines; User]
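The switch/controller split in the diagram above can be illustrated with a toy flow table. This is a simplified sketch of the general OpenFlow idea (match a packet against flow entries, and on a miss ask the controller for a rule), not the OpenFlow wire protocol or any real switch API. The field names, the action strings, and the controller callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FlowEntry:
    match: dict        # header field -> required value; missing fields are wildcards
    action: str        # e.g. "forward:port2" or "drop" (illustrative strings)
    priority: int = 0
    packets: int = 0   # per-entry counter, useful for measurement


class FlowTable:
    def __init__(self):
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)
        # Highest priority first, so lookup returns the best match.
        self.entries.sort(key=lambda e: -e.priority)

    def lookup(self, packet) -> Optional[FlowEntry]:
        for entry in self.entries:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                entry.packets += 1
                return entry
        return None


def handle_packet(table: FlowTable, packet: dict,
                  controller: Callable[[dict], FlowEntry]) -> str:
    """Datapath step: use a cached rule, or consult the controller on a miss."""
    entry = table.lookup(packet)
    if entry is None:
        # Table miss: in a real deployment this request would travel over
        # the secure channel to the controller.
        entry = controller(packet)
        table.add(entry)
        entry.packets += 1
    return entry.action
```

Only the first packet of a flow reaches the controller; later packets hit the installed entry in the switch, which is why per-flow policy of this kind is compatible with running the datapath at line rate.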

Collaborative Framework

Bridging the Gaps (Some are very Large)

• Data
• Processors
• Provider
• Authenticate (InCommon)

[Diagram labels: Analysis, Tools, Sharing, Apps, Nets]

Using Middleware Tools for VO Collaboration

[Diagram: Shibboleth flow with numbered steps 1 to 14; the step order is not recoverable from this transcript. Components: WAYF; Identity Provider (Identity Directory, Handle Service, Attribute Authority); Service Provider (SHIRE, SHAR, Resource Manager, Resource); Entitlement Server with ES DB; Entitlement Client App; Administrative User. Messages shown: Credentials, Handle, Attributes, Command, VO Entitlement Command, YES/NO.]

Simplified Design

[Diagram: User, Service Provider, Identity Provider and Entitlement Server, connected by numbered steps 1 to 6. Step 1: request by URL or command. Result: page or computational results. Uses public key encryption for authentication and privacy.]
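The division of labor in this design, where the identity provider authenticates the user and a separate Entitlement Server answers YES or NO for a given command, might look like the sketch below. It is an illustration under stated assumptions: the class and function names, the grant model, and the example identifiers are hypothetical, and the Shibboleth exchange and public key encryption are left out entirely.

```python
class EntitlementServer:
    """Toy stand-in for the Entitlement Server: stores grants, answers YES/NO."""

    def __init__(self):
        self._grants = set()  # (user_id, vo, command) tuples

    def grant(self, user_id, vo, command):
        # In the slides this would be done by an administrative user.
        self._grants.add((user_id, vo, command))

    def decide(self, user_id, vo, command):
        return "YES" if (user_id, vo, command) in self._grants else "NO"


def handle_request(user_id, vo, command, entitlement_server, run):
    """Service-provider side of a request.

    user_id is the identity asserted by the identity provider, or None if
    authentication failed. run is a callable that executes the command.
    Returns a (status, result) pair.
    """
    if user_id is None:
        return ("DENIED", None)  # not authenticated
    if entitlement_server.decide(user_id, vo, command) != "YES":
        return ("DENIED", None)  # authenticated but not entitled
    return ("OK", run(command))  # page or computational results
```

Keeping the decision in a separate server means a virtual organization can change who may run what without touching either the identity provider or the service provider.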

Getting Authenticated

If you belong to a GPN member organization, but do not see your institution in the list, please contact your local GPN representative to request help in authenticating in this environment.


Entering the VO Environment

And the story continues … More Data, Resources, People & Knowledge


UMBC

http://umbc.rnet.missouri.edu