
A Grid For Particle Physics

From testbed to production

Jeremy Coles (J.Coles@rl.ac.uk)

3rd September 2004, All Hands Meeting – Nottingham, UK

Contents

• Review of GridPP1 and the European Data Grid Project

• The middleware components of the testbed

• Lessons learnt from the project

• Status of the current operational Grid

• Future plans and challenges

• Summary

The four LHC experiments: CMS, LHCb, ATLAS, ALICE

1 Megabyte (1 MB): a digital photo

1 Gigabyte (1 GB) = 1000 MB: a DVD movie

1 Terabyte (1 TB) = 1000 GB: world annual book production

1 Petabyte (1 PB) = 1000 TB: annual production of one LHC experiment

1 Exabyte (1 EB) = 1000 PB: world annual information production

[Slide credit: les.robertson@cern.ch]

The physics driver

• 40 million collisions per second

• After filtering, 100-200 collisions of interest per second

• 1-10 Megabytes of data digitised for each collision = recording rate of 0.1-1 Gigabytes/sec

• 10^10 collisions recorded each year = ~10 Petabytes/year of data
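A quick back-of-envelope check of the rates above (a sketch; the event rate and sizes are the figures quoted on this slide):

```python
# Figures quoted above: ~100 interesting collisions/s, 1-10 MB each.
events_per_sec = 100
mb_low, mb_high = 1, 10

# Recording rate in GB/s (1 GB = 1000 MB).
rate_low = events_per_sec * mb_low / 1000.0    # 0.1 GB/s
rate_high = events_per_sec * mb_high / 1000.0  # 1.0 GB/s

# Yearly volume: 10^10 recorded collisions at ~1 MB each (1 PB = 10^9 MB).
pb_per_year = 1e10 * mb_low / 1e9              # ~10 PB/year
print(rate_low, rate_high, pb_per_year)
```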

The LHC

[Diagram: LHC data flow. The detector feeds an event filter (selection & reconstruction), producing raw data; reconstruction yields event summary data; batch and interactive physics analysis work on analysis objects (extracted by physics topic); event simulation and event reprocessing feed processed data back into the chain.]


CERN

Data handling

The UK response

GridPP – A UK Computing Grid for Particle Physics

19 UK Universities, CCLRC (RAL & Daresbury) and CERN

Funded by the Particle Physics and Astronomy Research Council (PPARC)

GridPP1 – Sept. 2001–2004, £17m: "From Web to Grid"

GridPP2 – Sept. 2004–2007, £16(+1)m: "From Prototype to Production"

GridPP1 project structure

[Diagram: GridPP1 project map, status date 30-Jun-04. Numbered work areas – CERN, DataGrid (WP1–WP8, covering fabric, technology, deployment and testbed work), Applications (ATLAS, ATLAS/LHCb, CMS, BaBar, CDF/D0, UKQCD, other), Infrastructure (Tier-1, Tier-A, LHCb Tier-2), LCG Creation, Data Challenges and Rollout, and Interoperability/Dissemination/Resources (international standards, open source, worldwide and UK integration, monitoring, developing engagement, participation) – each broken into numbered tasks with status flags: metric OK / metric not OK, task complete / task overdue / due within 63 days / not due soon / not active. GridPP Goal: "To develop and deploy a large scale science Grid in the UK for the use of the Particle Physics community."]

Software

> 65 use cases

7 major software releases (> 60 in total)

> 1,000,000 lines of code

People

500 registered users

12 Virtual Organisations

21 Certificate Authorities

>600 people trained

456 person-years of effort

Application Testbed

~20 regular sites

> 60,000 jobs submitted (since 09/03, release 2.0)

Peak >1000 CPUs

6 Mass Storage Systems

Scientific Applications: 5 Earth Observation institutes, 10 bio-medical apps, 6 HEP experiments

The project

http://eu-datagrid.web.cern.ch/eu-datagrid/

Contents

• The middleware components of the testbed

• Lessons learnt from the project

The infrastructure developed

Job submission: Python (default), Java GUI, APIs (C++, Java, Python)

Batch workers

Storage Element

Gatekeeper (Perl script) + Scheduler

gridFTP

NFS, Tape, Castor

User Interface

Computing Element

Resource Broker (C++ Condor MM libraries, Condor-G for submission)

Replica catalogue per VO (or equiv.)

Berkeley Database Information Index (BDII)

AA server (VOMS)

UI JDL

Logging & Bookkeeping: MySQL DB stores job state info
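As a sketch of what flows between these pieces, the fragment below composes a minimal EDG-style JDL job description of the kind the User Interface hands to the Resource Broker. The attribute names follow common JDL usage; the helper function itself is illustrative, not part of any EDG tool:

```python
# Illustrative sketch: build ClassAd-style JDL text for a simple batch job.
def make_jdl(executable, arguments="", sandbox=()):
    """Return JDL text describing a job for submission via the UI."""
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        'OutputSandbox = {"std.out", "std.err"};',
    ]
    if sandbox:
        # Files shipped with the job to the worker node.
        files = ", ".join(f'"{name}"' for name in sandbox)
        lines.append("InputSandbox = {" + files + "};")
    return "\n".join(lines)

jdl = make_jdl("/bin/hostname", sandbox=("run.sh",))
print(jdl)
```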

Integration

Much time spent on:

– Controlling the direct and indirect interplay of the various integrated components

– Addressing stability issues (often configuration linked) and bottlenecks in a non-linear system

– Predicting (or failing to predict) where the next bottleneck will appear in the job processing network
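The moving-bottleneck problem can be illustrated with a toy model (all stage names and rates below are invented): a serial job pipeline is capped by its slowest stage, so speeding one stage up simply shifts the constraint elsewhere.

```python
# Toy illustration: throughput of a serial pipeline is bounded by
# its slowest stage, so fixing one bottleneck exposes the next.
def pipeline_throughput(stage_rates):
    """Jobs/min a serial pipeline can sustain = min of its stage rates."""
    return min(stage_rates)

stages = {"RB matchmaking": 120, "CE gatekeeper": 40, "batch scheduler": 90}
before = pipeline_throughput(stages.values())   # capped by the gatekeeper
stages["CE gatekeeper"] = 150                   # tune the gatekeeper...
after = pipeline_throughput(stages.values())    # ...bottleneck moves to the scheduler
print(before, after)
```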

(MDS +) BDII or R-GMA

Data services: RLS, RC

The Grid

Storage Element interfaces

"Handlers"

Tape storage (or disk)

Access Control

File Metadata

• Manages storage and provides common interfaces to Grid clients.

• Higher-level data management tools use replica catalogues and metadata about files to locate replicas and optimise which one to use.

• Since EDG, work has provided the SE with an SRM 1 interface. SRM 2.1, with added functionality, will be available soon.

• The SRM interface is a file control interface; there is also an interface for publishing information. Internally, "handlers" ensure modularity and flexibility.

The storage element

Lessons learnt

• Separating file control (e.g. staging, pinning) from data transfer is useful (different nodes give better performance)

– Can be used for load balancing, redirection, etc

– Easy to add new data transfer protocols

– However, files in cache must be released by the client or time out
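The release-or-time-out rule for cached files can be sketched as follows (a toy cache manager; the class and method names are illustrative, not the EDG SE API):

```python
# Toy pin manager: a staged file stays pinned until the client
# releases it or its pin times out.
class CachePins:
    def __init__(self, timeout=600):
        self.timeout = timeout
        self.pins = {}              # filename -> time the pin was taken

    def stage(self, name, now):
        self.pins[name] = now       # file staged into cache and pinned

    def release(self, name):
        self.pins.pop(name, None)   # explicit release by the client

    def expire(self, now):
        """Drop pins older than the timeout; return the freed files."""
        stale = [f for f, t in self.pins.items() if now - t > self.timeout]
        for f in stale:
            del self.pins[f]
        return stale

cache = CachePins(timeout=600)
cache.stage("run123.raw", now=0)
cache.stage("run124.raw", now=100)
cache.release("run123.raw")         # well-behaved client releases its file
freed = cache.expire(now=800)       # the other pin has timed out
print(freed)
```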

• Based on the (simple model of the) Grid Monitoring Architecture (GMA) from the GGF

• For the Relational Grid Monitoring Architecture (R-GMA): hide the Registry mechanism from the user

– Producer registers on behalf of the user

– Mediator (in the Consumer) transparently selects the correct Producer(s) to answer a query

• Use the relational model (the R of R-GMA) to facilitate expression of queries over all the published information

[Diagram: Producer and Consumer both talk to the Registry/Schema; users just think in terms of Producers and Consumers.]

Information & monitoring
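The Producer/Consumer idea can be sketched with an ordinary relational database standing in for the distributed registry and mediator machinery (the table and column names below are invented):

```python
import sqlite3

# Producers publish tuples into a relational schema; consumers just
# pose SQL queries. sqlite stands in for the distributed machinery.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ServiceStatus (site TEXT, service TEXT, up INTEGER)")

# Producer side: publish monitoring tuples.
db.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?)",
               [("RAL", "CE", 1), ("RAL", "SE", 0), ("Glasgow", "CE", 1)])

# Consumer side: one relational query over everything published.
down = db.execute(
    "SELECT site, service FROM ServiceStatus WHERE up = 0").fetchall()
print(down)
```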

Lessons learnt

• Release working code early

• Distributed software system testing is hard – a private WP3 testbed was very useful

• Automate as much as possible (CruiseControl always runs all tests!)

The security model

[Diagram: a user and a service perform mutual authentication and exchange authorization info. The user holds a long-lived user cert and runs voms-proxy-init to obtain a short-lived proxy cert and a short-lived authz cert from the VO-VOMS server; the service side holds a long-lived host cert plus short-lived service and authz certs. Registration with the CAs and with VO-VOMS happens at low frequency; proxy creation and CRL updates happen at high frequency. LCAS (Local Centre Authorisation Service) applies local site policy.]
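The lifetime split in this model can be sketched as below. The durations are illustrative (a voms-proxy-init proxy is typically valid for hours, a CA-issued certificate for about a year); the helper functions are invented:

```python
from datetime import datetime, timedelta

# Long-lived user/host certificates versus short-lived proxy and
# authz credentials. Durations here are illustrative, not mandated.
def make_credential(kind, issued, lifetime):
    return {"kind": kind, "expires": issued + lifetime}

def still_valid(cred, at):
    return at < cred["expires"]

now = datetime(2004, 9, 3, 9, 0)
user_cert = make_credential("user cert", now, timedelta(days=365))  # long life
proxy = make_credential("proxy cert", now, timedelta(hours=12))     # short life

# A day later the user cert is still good, but the proxy has expired.
later = now + timedelta(hours=24)
print(still_valid(user_cert, later), still_valid(proxy, later))
```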

The security model (2)

Lessons learned

• Be careful collecting requirements (integration is difficult)

• Security must be an integral part of all development (from the start)

• Building and maintaining "trust" between projects and continents takes time

• Integration of security into existing systems is complex

• There must be a dedicated activity dealing with security

• EGEE benefited greatly – it now has a separate security activity

• Authentication – GridPP led the EDG/LCG CA infrastructure (trust)

• Authorisation

– VOMS for global policy

– LCAS for local site policy

– GACL (fine-grained access control) and GridSite for http

• LCG/EGEE security policy led by GridPP

Networking

• A network transfer "cost" estimation service to provide applications and middleware with the costs of data transport

– Used by RBs for optimized matchmaking (getAccessCost), and also directly by applications (getBestFile)

• GEANT network test campaign

– Network Quality of Service

– High-throughput transfers

• Close collaboration with DANTE

– Set-up of the testbed

– Analysis of results

– Access granted to all internal GEANT monitoring tools
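What a getBestFile-style call does can be sketched as ranking the replicas of a file by estimated transfer cost and picking the cheapest. The size/bandwidth cost model, function name and site figures below are stand-ins, not the EDG service's actual interface:

```python
# Illustrative replica selection: estimate transfer cost for each
# replica site and return the cheapest one.
def get_best_replica(size_mb, replicas):
    """replicas: {site: estimated bandwidth in MB/s}. Return cheapest site."""
    costs = {site: size_mb / bw for site, bw in replicas.items()}
    return min(costs, key=costs.get)

# A 500 MB file with replicas at three sites (bandwidths invented).
best = get_best_replica(500, {"RAL": 40.0, "CERN": 25.0, "Lyon": 10.0})
print(best)
```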

• Network monitoring is a key activity, both for provisioning and to provide accurate aggregate cost functions for global Grid schedulers.

• The network QoS investigations carried out have led to a much greater understanding of how to utilise the network to benefit Grid operations

• Benefits resulted from close contact with DANTE and DataTAG, both at technical and management level

Project lessons learnt

• Formation of Task Forces (applications + middleware) was a very important step midway through the project. Applications should have played a larger role in architecture discussions from the start

• The Loose Cannons (a team of 5) were crucial to all developments, working across experiments and work packages

• Site certification needs to be improved, and validation needs to be automated and run regularly. Misconfigured sites can cause many failures

• It is important to provide a stable environment to attract users, but at the start get working code out to known users as quickly as possible

• Quality should start at the beginning of the project for all activities, with defined procedures, standards and metrics

• Security needs to be an integrated part from the very beginning

Contents

• Status of the current operational Grid

Our grid is working …

NorthGrid ****: Daresbury, Lancaster, Liverpool, Manchester, Sheffield

SouthGrid *: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick

ScotGrid *: Durham, Edinburgh, Glasgow

LondonGrid ***: Brunel, Imperial, QMUL, RHUL, UCL

… and is part of LCG

• Rutherford Laboratory, together with a site in Taipei, currently provides the Grid Operations Centre. It will also run the UK/I EGEE Regional Operations Centre and Core Infrastructure Centre

• Resources are being used for data challenges

• Within the UK we have some VO/experiment Memoranda of Understanding in place

• Tier-2 structure is working well

Scale

GridPP prototype Grid:

• > 1,000 CPUs

– 500 CPUs at the Tier-1 at RAL

– > 500 CPUs at 11 sites across the UK, organised in 4 Regional Tier-2s

• > 500 TB of storage

• > 800 simultaneous jobs

Integrated with the international LHC Computing Grid (LCG):

• > 5,000 CPUs

• > 4,000 TB of storage

• > 70 sites around the world

• > 4,000 simultaneous jobs

• monitored via the Grid Operations Centre (RAL)

        CPUs   Free CPUs   Run Jobs   Wait Jobs   Avail TB   Used TB   Max CPU   Ave. CPU
Total:  7710   1439        5852       8733        6558.47    3273.86   9148      6198

http://goc.grid.sinica.edu.tw/gstat/
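Reading the Total row above (and taking "Avail TB" as total capacity, which is an assumption about the gstat column), a quick utilisation check:

```python
# Totals from the gstat table above.
cpus, free_cpus = 7710, 1439
avail_tb, used_tb = 6558.47, 3273.86

# Fraction of CPUs busy and of storage filled.
busy_fraction = (cpus - free_cpus) / cpus
storage_fraction = used_tb / avail_tb
print(f"CPUs busy: {busy_fraction:.0%}, storage used: {storage_fraction:.0%}")
```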

Picture yesterday (hyperthreading enabled on some sites)

Past upgrade experience at RAL

[Chart: CSF Linux CPU Use 2001–02 – monthly CPU usage at RAL from Jan-01 to Aug-02, y-axis 0 to 140,000.]

Previously, utilisation of new resources grew steadily over weeks or months.

Tier-1 update 27-28th July 2004

Hardware Upgrade

With the Grid we see a much more rapid utilisation of newly deployed resources.

Contents

• Future plans and challenges

Current context of GridPP

[Diagram: GridPP within the UK Core e-Science Programme, linking Institutes, Tier-2 Centres, the GridPP Tier-1/A, CERN (LCG), EGEE, the Experiments, applications development and integration, Middleware/Security/Networking, and the Grid Support Centre. Not to scale!]


GridPP2 management

Collaboration Board

Project Management Board

Project Leader

Project Manager

Deployment Board

User Board

Production Manager

Dissemination Officer

GGF, LCG, EGEE, UK e-Science liaison

Project Map

Risk Register

There are still challenges

• Middleware validation

• Improving Grid “efficiency”

• Meeting experiment requirements with the Grid

• Provision of work group computing

• Distributed file (and sub-file) management

• Experiment software distribution

• Provision of distributed analysis functionality

• Production accounting

• Encouraging an open sharing of resources

• Security

Middleware validation

This is starting to be addressed through a Certification and Testing testbed…

[Diagram: the release pipeline. Development & integration (JRA1), with unit and functional testing, produces a dev tag; certification testing integrates it, runs basic functionality tests, the C&T and site test suites and the certification matrix, turning a release candidate tag into a certified release tag; application integration follows (HEP experiments, bio-med, others TBD, apps SW); deployment preparation (installation, DEPLOY, SA1) yields a deployment release tag; services then pass from pre-production to production, giving the production tag.]

Work Group Computing

1. AliEn (ALICE Grid) provided a pre-Grid implementation [Perl scripts]

2. ARDA provides a framework for PP application middleware

Distributed analysis

• ATLAS Data Challenge to validate the world-wide computing model

• Packaging, distribution and installation at scale: one release build takes 10 hours and produces 2.5 GB of files

• Complexity: 500 packages, Mloc, 100s of developers and 1000s of users

– the ATLAS collaboration is widely distributed: 140 institutes, all wanting to use the software

– needs 'push-button' easy installation…
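One ingredient of 'push-button' installation can be sketched as verifying a downloaded release against a checksum manifest before installing it (the file names and helper below are invented, not ATLAS tooling):

```python
import hashlib

# Verify downloaded release files against a manifest of sha256 checksums.
def verify(files, manifest):
    """files: {name: bytes}; manifest: {name: sha256 hex}. Return bad names."""
    bad = []
    for name, expected in manifest.items():
        data = files.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            bad.append(name)
    return bad

release = {"atlas-reco.tar.gz": b"payload"}
manifest = {"atlas-reco.tar.gz": hashlib.sha256(b"payload").hexdigest()}
print(verify(release, manifest))   # empty list means the release is intact
```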

[Diagram: Step 1, Monte Carlo data challenges: Physics Models → Monte Carlo Truth Data → Detector Simulation → MC Raw Data → Reconstruction → MC Event Summary Data and MC Event Tags. Step 2, Real Data: Trigger System and Data Acquisition (Level 3 trigger, Trigger Tags) → Raw Data → Reconstruction → Event Summary Data (ESD) and Event Tags, with Calibration Data and Run Conditions feeding reconstruction.]

Software distribution

GOC aggregates data across all sites.

Production accounting

http://goc.grid-support.ac.uk/ROC/docs/accounting/accounting.php
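The aggregation step can be sketched as rolling per-site usage records up into per-VO totals (the record fields below are illustrative, not the GOC accounting schema):

```python
from collections import defaultdict

# Per-site usage records of the kind sites report to the GOC
# (fields invented for illustration).
records = [
    {"site": "RAL", "vo": "atlas", "cpu_hours": 1200},
    {"site": "Glasgow", "vo": "atlas", "cpu_hours": 300},
    {"site": "RAL", "vo": "lhcb", "cpu_hours": 450},
]

# Aggregate across all sites into per-VO totals.
totals = defaultdict(int)
for r in records:
    totals[r["vo"]] += r["cpu_hours"]
print(dict(totals))
```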

Deployment

[Diagram: deployment rests on security, a stable fabric and middleware, supported by procedures, documentation, metrics, accounting and monitoring, support, and porting to new platforms…]

Current status

[Diagram: "Grevolution", 2001 → 2004 → 2007. In 2001, separate experiments, resources and multiple accounts: BaBar (BaBarGrid), D0/CDF (SAMGrid), ATLAS, CMS, LHCb and ALICE (EDG, GANGA), 19 UK institutes, the RAL Computer Centre and the CERN Computer Centre. By 2004, prototype Grids: LCG, EGEE and ARDA, with the UK prototype Tier-1/A Centre, 4 UK prototype Tier-2 Centres and the CERN prototype Tier-0 Centre. By 2007, 'one' production Grid: LCG, with the UK Tier-1/A Centre, 4 UK Tier-2 Centres and the CERN Tier-0 Centre.]

Contents

• Summary

Summary

• The Large Hadron Collider data volumes make Grid computing a necessity

• GridPP1 with EDG developed a successful Grid prototype

• GridPP members have played a critical role in most areas – security, work load management, monitoring & operations

• GridPP involvement continues with the Enabling Grids for e-Science in Europe (EGEE) project – driving the federation of Grids

• As we move towards a full production service we face many challenges in areas such as deployment, accounting and true open sharing of resources

Or, to see a possible analogy of developing a Grid, follow this link: http://www.fallon.com/site_layout/work/clientview.aspx?clientid=12&projectid=85&workid=25784

Useful links

GRIDPP and LCG:

• GridPP collaboration: http://www.gridpp.ac.uk/

• Grid Operations Centre (inc. maps): http://goc.grid-support.ac.uk/

• The LHC Computing Grid: http://lcg.web.cern.ch/LCG/

Others

• PPARC: http://www.pparc.ac.uk/Rs/Fs/Es/intro.asp

• The EGEE project: http://egee-intranet.web.cern.ch/egee-intranet/index.html

• The European Data Grid final review: http://eu-datagrid.web.cern.ch/eu-datagrid/