CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May,...

Page 1: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

CERN

TERENA Lisbon

The Grid Project

Fabrizio Gagliardi

CERN

Information Technology Division

May, 2000

F.Gagliardi@cern.ch

Page 2: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

F. Gagliardi - CERN/IT-May-2000

Summary

High Energy Physics and CERN computing problem

An excellent computing model: the GRID

The Data Grid Initiative

(http://www.cern.ch/grid/)

Page 3: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

CERN organization

Largest Particle Physics lab in the world

European International Center for Particle Physics Research

Budget: 1020 M CHF

2700 staff

7000 physicist users

Page 4: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

The LHC Detectors

CMS

ATLAS

LHCb

3.5 PetaBytes / year

~10^8 events/year

Page 5: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

The HEP Problem - Part I

The scale...

Page 6: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Estimated CPU Capacity at CERN

Chart, 1998 to 2006, in K SI95 (axis range 0 to 2,500); series: LHC and Non-LHC.

Annotation: ~10K SI95, 1200 processors

Technology-price curve (40% annual price improvement): capacity that can be purchased for the value of the equipment present in 2000
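
The technology-price curve can be sketched as simple compounding. The slide does not say how the 40% figure is applied, so this sketch assumes that the unit price of capacity falls by 40% each year, which means a fixed budget buys 1/0.6 times as much capacity every year. The 10,000 SI95 starting point for 2000 is taken from the chart annotation and is also an assumption about which year it refers to; the function name is hypothetical.

```python
def purchasable_capacity(base_capacity, base_year, year, annual_price_drop=0.40):
    """Capacity a fixed budget buys in `year`.

    Assumption (not stated on the slide): the unit price falls by
    `annual_price_drop` every year, compounding from `base_year`.
    """
    years = year - base_year
    return base_capacity / (1.0 - annual_price_drop) ** years


if __name__ == "__main__":
    # Assumed starting point: ~10K SI95 worth of equipment in 2000.
    for year in range(2000, 2007):
        print(year, round(purchasable_capacity(10_000, 2000, year)))
```

If the 40% is instead read as a 40% yearly gain in capacity per unit of money, replace the divisor with a multiplication by 1.4 per year; the curve is then flatter.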

Page 7: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Estimated DISK Capacity at CERN

Chart, 1998 to 2006, in TeraBytes (axis range 0 to 1800); series: LHC and Non-LHC.

Technology-price curve (40% annual price improvement)

Page 8: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Long Term Tape Storage Estimates

Chart, 1995 to 2006, in TeraBytes (axis range 0 to 14,000); series: Current Experiments, COMPASS, LHC.

Page 9: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

HPC or HTC

High Throughput Computing
• mass of modest, independent problems
• computing in parallel – not parallel computing
• throughput rather than single-program performance
• resilience rather than total system reliability

Have learned to exploit inexpensive mass market components

But we need to marry these with inexpensive, highly scalable management tools

Much in common with other sciences (see EU-US Annapolis Workshop at www.cacr.caltech.edu/euus): Astronomy, Earth Observation, Bioinformatics, and commercial/industrial: data mining, Internet computing, e-commerce facilities, ...

Contrast with supercomputing
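
The high-throughput idea, many independent jobs where one failure does not stop the rest, can be illustrated with a small sketch. Everything here is hypothetical: `reconstruct` stands in for a per-event job, the failure rule is invented to simulate an unreliable node, and local threads stand in for the PCs of a farm.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def reconstruct(event_id):
    """Hypothetical stand-in for one independent per-event job."""
    if event_id % 7 == 0:
        # Invented failure rule, simulating an occasional bad node.
        raise RuntimeError(f"job for event {event_id} failed")
    return event_id * event_id


def run_farm(event_ids, workers=4):
    """Run every job independently; collect results and failures separately."""
    done, failed = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(reconstruct, e): e for e in event_ids}
        for future in as_completed(futures):
            event_id = futures[future]
            try:
                done[event_id] = future.result()
            except RuntimeError:
                # Resilience: record the failure and carry on with the rest.
                failed.append(event_id)
    return done, sorted(failed)


if __name__ == "__main__":
    results, failures = run_farm(range(1, 15))
    print(len(results), "jobs finished;", "to resubmit:", failures)
```

No job talks to any other job, so throughput scales with the number of workers and a failed job only needs to be resubmitted, which is the contrast with a tightly coupled parallel program.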

Page 10: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

network servers

tape servers

disk servers

application servers

Generic component model of a computing farm

Page 11: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

The HEP Problem - Part II

Geography, Sociology, Funding and Politics...

Page 12: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

CMS: 1800 physicists, 150 institutes, 32 countries

World Wide Collaboration distributed computing & storage capacity

Page 13: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Regional Centres - a Multi-Tier Model

MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html

Diagram of the tier hierarchy:
• CERN – Tier 0
• Tier 1: FNAL, RAL, IN2P3
• Tier 2: Lab a, Uni b, Lab c, Uni n
• Department
• Desktop

Link speeds shown in the diagram: 2.5 Gbps, 622 Mbps, 155 Mbps
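
To give a feel for the link speeds in the tier diagram, the following sketch estimates how long one terabyte takes to cross each of them. The slide does not say which link connects which pair of tiers, so the rates are listed on their own. The 1 TB payload, the decimal definition of a terabyte, and the `efficiency` parameter are all assumptions for illustration.

```python
# Link rates named on the slide, in megabits per second.
LINK_RATES_MBPS = {"2.5 Gbps": 2500, "622 Mbps": 622, "155 Mbps": 155}


def transfer_hours(terabytes, rate_mbps, efficiency=1.0):
    """Hours needed to move `terabytes` (1 TB = 10**12 bytes) over a link.

    `efficiency` is the assumed usable fraction of the nominal rate.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (rate_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for name, rate in LINK_RATES_MBPS.items():
        print(f"1 TB over {name}: {transfer_hours(1, rate):.1f} h")
```

Real transfers would see lower efficiency than 1.0 because of protocol overhead and shared links, so these figures are lower bounds.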

Page 14: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Are Grids a solution?

Change of orientation of US Meta-computing activity
• From inter-connected super-computers ... towards a more general concept of a computational Grid (The Grid – Ian Foster, Carl Kesselman)

Has initiated a flurry of activity in HEP
• US – Particle Physics Data Grid (PPDG)
• GriPhyN – data grid proposal submitted to NSF
• Grid technology evaluation project in INFN
• UK proposal for funding for a prototype grid
• NASA Information Power Grid

Page 15: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

The Grid

“Dependable, consistent, pervasive access to [high-end] resources”

• Dependable:

• provides performance and functionality guarantees

• Consistent:

• uniform interfaces to a wide variety of resources

• Pervasive:

• ability to “plug in” from anywhere

Page 16: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

R&D required

Local fabric
• Management of giant computing fabrics: auto-installation, configuration management, resilience, self-healing
• Mass storage management: multi-PetaByte data storage, “real-time” data recording requirement, active tape layer – 1,000s of users

Wide-area - building on an existing framework & RN (e.g. Globus, Geant)
• workload management: no central status, local access policies
• data management: caching, replication, synchronisation, object database model
• application monitoring
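
The caching and replication item can be pictured with a toy replica catalogue. This is a sketch only and does not represent Globus or any Data Grid component: the class, the function, the file name, and the rule for picking a source site are all hypothetical.

```python
class ReplicaCatalogue:
    """Toy catalogue mapping a logical file name to the sites holding a copy."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, site):
        self._replicas.setdefault(logical_name, set()).add(site)

    def locate(self, logical_name):
        # Return a copy so callers cannot modify the catalogue by accident.
        return set(self._replicas.get(logical_name, ()))


def fetch(catalogue, logical_name, local_site):
    """Return the site the file is read from, caching a local replica."""
    sites = catalogue.locate(logical_name)
    if not sites:
        raise KeyError(logical_name)
    if local_site in sites:
        return local_site
    # Placeholder choice: a real system would rank sites by network cost.
    source = min(sites)
    # Replication on first access: later reads are served locally.
    catalogue.register(logical_name, local_site)
    return source


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    catalogue.register("run1.dat", "CERN")
    print(fetch(catalogue, "run1.dat", "RAL"))  # first read is remote
    print(fetch(catalogue, "run1.dat", "RAL"))  # second read is local
```

Synchronisation, the third item on the slide, is left out: the sketch treats files as read-only, which sidesteps the question of keeping replicas consistent.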

Page 17: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

HEP Data Grid Initiative

European level coordination of national initiatives & projects

Principal goals:
• Middleware for fabric & Grid management
• Large scale testbed - major fraction of one LHC experiment
• Production quality HEP demonstrations: “mock data”, simulation analysis, current experiments
• Other science demonstrations

Three year phased developments & demos

Complementary to other GRID projects
• EuroGrid: Uniform access to parallel supercomputing resources

Synergy to be developed (GRID Forum, Industry and Research Forum)

Page 18: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Participants

Main partners: CERN, INFN(I), CNRS(F), PPARC(UK), NIKHEF(NL), ESA-Earth Observation

Other sciences: KNMI(NL), Biology, Medicine

Industrial participation: CS SI/F, DataMat/I, IBM/UK

Associated partners: Czech Republic, Finland, Germany, Hungary, Spain, Sweden (mostly computer scientists)

Formal collaboration with USA

Industry and Research Project Forum with representatives from: Denmark, Greece, Israel, Japan, Norway, Poland, Portugal, Russia

Page 19: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Status

Prototype work already started at CERN and in most of the collaborating institutes

Proposal to RN2 submitted

Network requirements discussed with Dante/Geant

Page 20: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

WAN Requirements

• High bandwidth from CERN to Tier 1 centres (5-6)
• VPN, Quality of Service
• Guaranteed performance during limited test periods and at the end of the project for production quality services
• Target requirements (2003): 2.5 Gb/s + 622 Mb/s + 155 Mb/s
• Could saturate 2.5 Gb/s for a limited amount of test time (100 MB/s out from a 100 PC farm; we plan for farms of 1000s of PCs)
• Reliability is an important factor: from WEB client-server model to GRID peer distributed computing model
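
The saturation remark can be checked with a few lines of arithmetic. The slide gives 100 MB/s out of a 100 PC farm; the sketch assumes this scales linearly at 1 MB/s per PC, which is an extrapolation and not a figure from the slide. The function names are hypothetical.

```python
import math


def farm_output_gbps(n_pcs, mb_per_s_per_pc=1.0):
    """Aggregate farm output in Gb/s (1 MB = 8 Mb, 1 Gb = 1000 Mb)."""
    return n_pcs * mb_per_s_per_pc * 8 / 1000


def pcs_to_saturate(link_gbps, mb_per_s_per_pc=1.0):
    """Smallest number of PCs whose combined output fills the link."""
    return math.ceil(link_gbps * 1000 / 8 / mb_per_s_per_pc)


if __name__ == "__main__":
    print("100 PC farm:", farm_output_gbps(100), "Gb/s")
    print("1000 PC farm:", farm_output_gbps(1000), "Gb/s")
    print("PCs needed to fill 2.5 Gb/s:", pcs_to_saturate(2.5))
```

Under the linear assumption, a farm of a few hundred PCs is already enough to fill the fastest link, so a farm of thousands would exceed it several times over.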

Page 21: CERN TERENA Lisbon The Grid Project Fabrizio Gagliardi CERN Information Technology Division May, 2000 F.Gagliardi@cern.ch.

Conclusions

This project, motivated by HEP and other sciences with high data and computing demands, will contribute to developing and implementing a new world-wide distributed computing model: The GRID

An ideal computing model for the next generation Internet

An excellent test case for the next generation of high-performance research networks