Grid-related High Performance Middleware and Laboratories Dr. Carl Kesselman Director Center for Grid Technologies
Posted: 18-Dec-2015

Page 1

Grid-related High Performance Middleware and Laboratories

Dr. Carl Kesselman, Director, Center for Grid Technologies

Page 2

EO Grid Middleware

How do we solve problems?

Communities committed to common goals
- Virtual organizations

Teams with heterogeneous members & capabilities

Distributed geographically and politically
- No location/organization possesses all required skills and resources

Adapt as a function of the situation
- Adjust membership, reallocate responsibilities, renegotiate resources

Page 3

The Grid Vision

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
- On-demand, ubiquitous access to computing, data, and services

- New capabilities constructed dynamically and transparently from distributed services

“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances”

(George Gilder)

Page 4

A Little History (U.S. Perspective)

Early 90s
- Gigabit testbeds, metacomputing

Mid to late 90s
- Early experiments (e.g., I-WAY), software projects (e.g., Globus), application experiments

2001
- Major application communities emerging
- Major infrastructure deployments are underway
- Rich technology base has been constructed
- Global Grid Forum: >1000 people on mailing lists, 192 orgs at last meeting, 28 countries

Page 5

Selected Major Grid Projects

Name | URL & Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid; DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom; DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org; DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org; DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org; European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics

Page 6

Selected Major Grid Projects

Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org; DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education

Page 7

Selected Major Grid Projects

Name | URL/Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads; NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org; NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov; NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org; NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org; NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net; DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments

Page 8

Selected Major Grid Projects

Name | URL/Sponsor | Focus
TeraGrid | teragrid.org; NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK Grid Support Center | grid-support.ac.uk; U.K. eScience | Support center for Grid projects within the U.K.
Unicore | BMBFT | Technologies for remote access to supercomputers
SCEC | www.scec.org; NSF | Integrated geophysics modeling

Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS

See also www.gridforum.org

Page 9

The Grid World: Current Status

Dozens of major Grid projects in scientific & technical computing/research & education

Considerable consensus on key concepts and technologies
- Open source Globus Toolkit™ a de facto standard for major protocols & services
- Far from complete or perfect, but out there, evolving rapidly, and large tool/user base

Industrial interest emerging rapidly

Opportunity: convergence of eScience and eBusiness requirements & technologies

Page 10

Layered Grid Architecture

Grid architecture layers (top to bottom):
- Application
- Collective, “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
- Resource, “Sharing single resources”: negotiating access, controlling use
- Connectivity, “Talking to things”: communication (Internet protocols) & security
- Fabric, “Controlling things locally”: access to, & control of, resources

Internet Protocol Architecture, shown alongside for comparison (top to bottom):
- Application
- Transport
- Internet
- Link

Page 11

Globus Toolkit

Globus Toolkit is the source of many of the protocols described in “Grid architecture”

Adopted by almost all major Grid projects worldwide as a source of infrastructure

Open source, open architecture framework encourages community development

Active R&D program continues to move technology forward

Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions

www.globus.org

Page 12

Globus Toolkit Components Include …

Core protocols and services
- Grid Security Infrastructure (GSI)
- Grid Resource Allocation & Management (GRAM)
- MDS information & monitoring
- GridFTP data access & transfer

Other services
- Community Authorization Service
- DUROC co-allocation service

Other Data Grid technologies
- Replica catalog, replica management service

Page 13

The Globus Toolkit in One Slide

Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation (= Globus Toolkit services)

Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, …

[Figure: toolkit components and their interactions]
- GSI (Grid Security Infrastructure): the User authenticates & creates a proxy credential; User process #1 carries the Proxy
- GRAM (Grid Resource Allocation & Management): reliable remote invocation of the Gatekeeper (factory), which creates a process (User process #2 with Proxy #2) and registers it with the Reporter (registry + discovery)
- MDS-2 (Meta Directory Service): soft state registration and enquiry between the Reporter and the GIIS: Grid Information Index Server (discovery)
- Other service (e.g. GridFTP): other GSI-authenticated remote service requests
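The “soft state registration; enquiry” interaction in the figure can be illustrated with a small sketch. This is not MDS-2 code and does not follow its protocol; the class `SoftStateRegistry`, its methods, the 30-second lifetime, and the service names are all hypothetical. It only shows the general soft-state idea: a registration expires unless it is refreshed, so an index forgets services that stop announcing themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SoftStateRegistry:
    """Toy index (hypothetical): entries vanish unless refreshed within `ttl` seconds."""
    ttl: float
    _expiry: Dict[str, float] = field(default_factory=dict)

    def register(self, service: str, now: float) -> None:
        # Registering again simply pushes the expiry time forward (a refresh).
        self._expiry[service] = now + self.ttl

    def enquire(self, now: float) -> List[str]:
        # Drop every entry whose lifetime has run out, then list the rest.
        self._expiry = {s: t for s, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


registry = SoftStateRegistry(ttl=30.0)
registry.register("gram://hostA", now=0.0)      # hypothetical service names
registry.register("gridftp://hostB", now=10.0)
print(registry.enquire(now=20.0))  # both registrations are still live
print(registry.enquire(now=35.0))  # hostA was never refreshed and has expired
```

Time is passed in explicitly rather than read from a clock so that the behaviour is easy to follow; a real service would use wall-clock time and periodic refresh messages.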

Page 14

Globus Toolkit Structure

[Figure: three service stacks, each built on GSI]
- Compute Resource: GRAM and MDS over GSI, with job managers
- Data Resource: GridFTP and MDS over GSI
- Other Service or Application: ??? over GSI

Mechanisms labeled in the figure: service naming, reliable invocation, soft state management, notification

Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems

Page 15

NSF Middleware Initiative

NSF Funded Project to build national middleware infrastructure
- USC/ISI, SDSC, U. Wisc., ANL, NCSA, I2

Software Integration (NMI Software Releases)
- Interoperability
- Testing
- Install, Configure, Manage

University Campus Infrastructure Integration
- Campus Authentication / GSI
- Enterprise Directories / GSI and MDS

Use NMI as Teragrid Baseline
- Specialize for Teragrid unique aspects (e.g. Viz resources)

Page 16

NMI-R1 Software Components

- Globus Toolkit
- Condor-G
- Network Weather Service
- KX.509 / KCA
- Certificate Profile Maker
- Pubcookie
- Grid Packaging Tools

Page 17

U.S. GRIDS Center

GRIDS = Grid Research, Integration, Deployment, & Support

NSF-funded center to provide
- State-of-the-art middleware infrastructure to support national-scale collaborative science and engineering

- Integration platform for experimental middleware technologies

ISI, NCSA, SDSC, UC, UW + commercial partners

www.grids-center.org

Page 18

Network for Earthquake Eng. Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

www.neesgrid.org: Argonne, Michigan, NCSA, UIUC, USC

Page 19

SCEC Modeling Environment

[Figure: architecture of the SCEC modeling environment]

Knowledge Base
- Ontologies: curated taxonomies, relations & constraints
- Pathway Models: pathway templates, models of simulation codes

KNOWLEDGE REPRESENTATION & REASONING
- Knowledge Server: knowledge base access, inference
- Translation Services: syntactic & semantic translation

KNOWLEDGE ACQUISITION
- Acquisition Interfaces: dialog planning, pathway construction strategies
- Pathway Assembly: template instantiation, resource selection, constraint checking

DIGITAL LIBRARIES
- Navigation & Queries: versioning, topic maps
- Mediated Collections: federated access

GRID
- Pathway Execution: policy, data ingest, repository access
- Grid Services: compute & storage management, security

Other labels in the figure: Code Repositories; Data & Simulation Products; Data Collections; FSM, RDM, AWM, SRM; Storage; Computing; Pathway Instantiations; Users

Page 20

Data Intensive Physical Sciences

High energy & nuclear physics
- Including new experiments at CERN

Gravity wave searches
- LIGO, GEO, VIRGO

Time-dependent 3-D systems (simulation, data)
- Earth Observation, climate modeling
- Geophysics, earthquake modeling
- Fluids, aerodynamic design
- Pollutant dispersal scenarios

Astronomy: Digital sky surveys

Page 21

National Virtual Observatory

[Figure: Virtual Sky views of the Coma cluster in the X-ray (ROSAT) and optical (DPOSS) themes, with “Change scale” and “Change theme” controls]

http://virtualsky.org/ from Caltech CACR, Caltech Astronomy, Microsoft Research

Virtual Sky has 140,000,000 tiles, 140 Gbyte

Page 22

Grid Physics Network (GriPhyN)

Enabling R&D for advanced data grid systems, focusing in particular on Virtual Data concept

[Figure: GriPhyN architecture. Components shown:]
- Users: Production Team, Individual Investigator, Other Users
- Interactive User Tools
- Virtual Data Tools; Request Planning and Scheduling Tools; Request Execution Management Tools
- Resource Management Services; Security and Policy Services; Other Grid Services
- Transforms
- Distributed resources (code, storage, computers, and network)
- Raw data source

Experiments: ATLAS, CMS, LIGO, SDSS

www.griphyn.org; see also www.ppdg.net, www.eu-datagrid.org
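The Virtual Data concept can be illustrated with a minimal sketch, assuming that a data product is described by a transformation plus its inputs and is only materialized when someone requests it. The class `VirtualDataCatalog`, its methods, and the product names are hypothetical; this is not the GriPhyN Virtual Data Toolkit or any Globus API.

```python
from typing import Any, Callable, Dict, List, Tuple


class VirtualDataCatalog:
    """Toy catalog (hypothetical): products are either stored or derivable on demand."""

    def __init__(self) -> None:
        self.replicas: Dict[str, Any] = {}  # products that already exist
        self.recipes: Dict[str, Tuple[Callable[..., Any], List[str]]] = {}
        self.derivations = 0                # how many transformations were run

    def add_raw(self, name: str, value: Any) -> None:
        self.replicas[name] = value

    def add_recipe(self, name: str, transform: Callable[..., Any], inputs: List[str]) -> None:
        self.recipes[name] = (transform, list(inputs))

    def request(self, name: str) -> Any:
        # Reuse an existing copy if there is one.
        if name in self.replicas:
            return self.replicas[name]
        # Otherwise derive it from its inputs, recursively, and keep the result.
        transform, inputs = self.recipes[name]
        result = transform(*(self.request(i) for i in inputs))
        self.derivations += 1
        self.replicas[name] = result
        return result


catalog = VirtualDataCatalog()
catalog.add_raw("raw_events", [3, 1, 2])
catalog.add_recipe("sorted_events", sorted, ["raw_events"])
catalog.add_recipe("max_event", lambda events: events[-1], ["sorted_events"])

print(catalog.request("max_event"), catalog.derivations)  # derives two products
print(catalog.request("max_event"), catalog.derivations)  # second request reuses the stored copy
```

The point of the sketch is the decision inside `request`: whether a product is fetched or recomputed is hidden from the caller, which is the kind of choice the request planning and scheduling tools in the figure would make across distributed resources.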

Page 23

Data Grids for High Energy Physics

[Figure: tiered data grid for a high energy physics experiment]

Sites and tiers shown:
- Online System
- Tier 0: Offline Processor Farm (~20 TIPS), CERN Computer Centre
- Tier 1: FermiLab (~4 TIPS), France Regional Centre, Italy Regional Centre, Germany Regional Centre
- Tier 2: Caltech (~1 TIPS) and four Tier2 Centres (~1 TIPS each)
- Institutes (~0.25 TIPS), each with a physics data cache
- Tier 4: Physicist workstations

Link rates shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec

Notes on the figure:
- There is a “bunch crossing” every 25 nsecs.
- There are 100 “triggers” per second.
- Each triggered event is ~1 MByte in size.
- Physicists work on analysis “channels”.
- Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
- 1 TIPS is approximately 25,000 SpecInt95 equivalents.

Image courtesy Harvey Newman, Caltech
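The rates quoted on this slide are consistent with one another, as a short calculation shows. This is only a worked check of the slide's own numbers; it assumes decimal units (1 MByte = 10**6 bytes, 1 Mbit = 10**6 bits), which the slide does not state.

```python
# Figures taken from the slide.
BUNCH_SPACING_NS = 25            # one bunch crossing every 25 ns
TRIGGERS_PER_S = 100             # triggers per second
EVENT_BYTES = 1_000_000          # ~1 MByte per triggered event (decimal units assumed)
LINK_BITS_PER_S = 622_000_000    # ~622 Mbits/sec link (decimal units assumed)

crossings_per_s = 1_000_000_000 // BUNCH_SPACING_NS       # bunch crossings per second
crossings_per_trigger = crossings_per_s // TRIGGERS_PER_S  # crossings for each kept event
trigger_bytes_per_s = TRIGGERS_PER_S * EVENT_BYTES         # data rate after triggering
link_bytes_per_s = LINK_BITS_PER_S / 8                     # link capacity in bytes/sec

print(f"Bunch crossings per second: {crossings_per_s:,}")
print(f"Crossings per recorded trigger: {crossings_per_trigger:,}")
print(f"Triggered data rate: {trigger_bytes_per_s / 1e6:.0f} MBytes/sec")
print(f"622 Mbits/sec link: {link_bytes_per_s / 1e6:.2f} MBytes/sec")
```

With these assumptions, 100 triggers per second at ~1 MByte each gives the ~100 MBytes/sec shown in the figure, while a single ~622 Mbits/sec link carries a little under 78 MBytes/sec.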

Page 24

Laser Interferometric Gravitational wave Observatory

Listening to Collisions of Black Holes and Neutron Stars

Page 25

LIGO Hardware

Page 26

Grid LIGO Architecture

[Figure: Grid LIGO architecture. Components shown:]
- Clients (e.g. Web, Script, Agent)
- Text request
- Request Manager
- GriPhyN LDAS
- Gatekeeper (GRAM)
- Science Algorithms Software
- Collaboratory
- Parallel Computing
- GridFTP
- Local Disk, Data, HPSS
- Replica Catalog, Replica Management, Transformation Catalog, Virtual Data Catalog
- Other LDAS; Condor jobs

Connection types in the legend: Virtual Data Request, Data Movement, Globus RPC

Page 27

iVDGL: A Global Grid Laboratory

International Virtual-Data Grid Laboratory
- A global Grid laboratory (US, Europe, Asia, South America, …)
- A place to conduct Data Grid tests “at scale”
- A mechanism to create common Grid infrastructure
- A laboratory for other disciplines to perform Data Grid tests
- A focus of outreach efforts to small institutions

U.S. part funded by NSF (2001-2006)
- $13.7M (NSF) + $2M (matching)

“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.”

From NSF proposal, 2001

Page 28

iVDGL Components

Computing resources
- 2 Tier1 laboratory sites (funded elsewhere)
- 7 Tier2 university sites: software integration
- 3 Tier3 university sites: outreach effort

Networks
- USA (TeraGrid, Internet2, ESNET), Europe (Géant, …)
- Transatlantic (DataTAG), Transpacific, AMPATH?, …

Grid Operations Center (GOC)
- Joint work with TeraGrid on GOC development

Computer Science support teams
- Support, test, upgrade GriPhyN Virtual Data Toolkit

Education and Outreach

Coordination, management

Page 29

iVDGL Components (cont.)

High level of coordination with DataTAG
- Transatlantic research network (2.5 Gb/s) connecting EU & US

Current partners
- TeraGrid, EU DataGrid, EU projects, Japan, Australia

Experiments/labs requesting participation
- ALICE, CMS-HI, D0, BaBar, BTEV, PDC (Sweden)

Page 30

Initial US-iVDGL Data Grid

[Figure: map of the initial U.S. iVDGL sites]

Legend: Tier1 (FNAL), Proto-Tier2, Tier3 university

Sites shown: UCSD, Florida, Wisconsin, Fermilab, BNL, Indiana, BU, SKC, Brownsville, Hampton, PSU, JHU, Caltech

Other sites to be added in 2002

Page 31

iVDGL Map (2002-2003)

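[Figure: world map of iVDGL facilities and network links]

Legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link

Networks labeled: DataTAG, Surfnet

Later: Brazil, Chile?, Pakistan, Russia, China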
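To give a feel for the three link classes in the legend, the sketch below computes an idealized transfer time for a dataset over each of them. The 1 TB dataset size is a hypothetical example, not a figure from the slides, and the calculation ignores protocol overhead, sharing, and loss, so real transfers would take longer.

```python
# Link rates from the map legend, in bits per second (decimal units assumed).
LINKS_BPS = {
    "622 Mbps link": 622e6,
    "2.5 Gbps link": 2.5e9,
    "10 Gbps link": 10e9,
}

DATASET_BYTES = 1e12  # hypothetical 1 TB dataset (10**12 bytes)


def transfer_seconds(size_bytes: float, link_bps: float) -> float:
    """Ideal transfer time: payload bits divided by the raw link rate."""
    return size_bytes * 8 / link_bps


for name, bps in LINKS_BPS.items():
    hours = transfer_seconds(DATASET_BYTES, bps) / 3600
    print(f"{name}: {hours:.2f} hours for 1 TB")
```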

Page 32

The TeraGrid

[Figure: TeraGrid site and network diagram]

Sites: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne

Each site is shown with its Site Resources and External Networks; storage systems labeled in the figure include HPSS and UniTree

Page 33

Summary

Grid infrastructure is becoming widespread
- Major deployment based on common technology
- Significant new deployment activities

Consensus building mechanisms in place
- Global Grid Forum (www.gridforum.org)

Industrial buy-in starting
- IBM, Entropia, more to come

Page 34

For More Information

Book (Morgan Kaufmann)
- www.mkp.com/grids

Globus
- www.globus.org
- “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”

GRIDS Center
- www.grids-center.org

Grid Forum
- www.gridforum.org