
Transcript of The LHC Computing Grid

Page 1: The LHC Computing Grid


The LHC Computing Grid

Dr Ian Bird

LCG Project Leader

28 March 2008

Visit of US House of Representatives

Committee on Appropriations

Page 2: The LHC Computing Grid


The LHC Data Challenge

• The accelerator will be completed in 2008 and run for 10-15 years

• Experiments will produce about 15 million gigabytes (15 petabytes) of data each year – about 20 million CDs!

• LHC data analysis requires a computing power equivalent to ~100,000 of today's fastest PC processors

• Requires many cooperating computer centres, as CERN can only provide ~20% of the capacity
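These figures can be sanity-checked with a couple of lines of Python – a back-of-envelope sketch, where the ~700 MB CD capacity is my assumption:

# Back-of-envelope check of the data-volume figures above.
data_per_year_gb = 15e6    # ~15 million gigabytes per year (slide figure)
cd_capacity_gb = 0.7       # assumed ~700 MB per CD

print(data_per_year_gb / 1e6, "PB/year")                        # -> 15.0 PB/year
print(round(data_per_year_gb / cd_capacity_gb / 1e6), "M CDs")  # -> ~21 M CDs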


Page 3: The LHC Computing Grid

Summary of Computing Resource Requirements

All experiments – 2008 (from the LCG TDR, June 2005)

                       CERN   All Tier-1s   All Tier-2s   Total
CPU (MSPECint2000s)      25            56            61     142
Disk (petabytes)          7            31            19      57
Tape (petabytes)         18            35             –      53

Shares of the totals – CPU: CERN 18%, all Tier-1s 39%, all Tier-2s 43%. Disk: CERN 12%, all Tier-1s 55%, all Tier-2s 33%. Tape: CERN 34%, all Tier-1s 66%.
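The percentage shares follow directly from the table; a short Python sketch (illustrative only) reproduces them:

# Recompute the CERN / Tier-1 / Tier-2 shares from the table above.
requirements = {
    # resource: (CERN, all Tier-1s, all Tier-2s)
    "CPU":  (25, 56, 61),
    "Disk": (7, 31, 19),
    "Tape": (18, 35, 0),   # Tier-2s hold no custodial tape
}
for resource, parts in requirements.items():
    total = sum(parts)
    print(resource, [round(100 * p / total) for p in parts])
# CPU [18, 39, 43]; Disk [12, 54, 33]; Tape [34, 66, 0]
# (matching the shares above, up to rounding of the 55% disk figure)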


Page 4: The LHC Computing Grid


Solution: the Grid

• Use the Grid to unite computing resources of particle physics institutions around the world

The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations

The Grid is an infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe


Page 5: The LHC Computing Grid


How does the Grid work?

• It makes multiple computer centres look like a single system to the end-user

• Advanced software, called middleware, automatically finds the data the scientist needs, and the computing power to analyse it.

• Middleware balances the load on different resources. It also handles security, accounting, monitoring and much more.
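As a concrete illustration of the matchmaking idea, here is a toy Python sketch – not gLite or any real middleware, and all names in it are invented:

# Toy matchmaking sketch: pick the least-loaded site that holds the data.
sites = [
    {"name": "CERN",   "free_cpus": 120, "datasets": {"run1234"}},
    {"name": "FNAL",   "free_cpus": 800, "datasets": {"run1234", "run5678"}},
    {"name": "GridKa", "free_cpus": 450, "datasets": {"run5678"}},
]

def match_site(dataset, sites):
    """Return the site with the most free CPUs among those holding the data."""
    holders = [s for s in sites if dataset in s["datasets"]]
    if not holders:
        raise LookupError(f"no site holds {dataset!r}")
    return max(holders, key=lambda s: s["free_cpus"])

print(match_site("run1234", sites)["name"])   # -> FNAL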


Area of strong EU-US collaboration: the Virtual Data Toolkit and ETICS-NMI; both have NSF funding

Page 6: The LHC Computing Grid

View of the ATLAS detector (under construction)

150 million sensors deliver data …

… 40 million times per second
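Multiplying the two numbers on this slide gives a feel for the raw rate. A Python sketch, assuming (purely for illustration) one byte per sensor per reading:

# Illustrative raw-rate arithmetic from the slide's figures.
sensors = 150e6          # 150 million sensors
rate_hz = 40e6           # read out 40 million times per second
bytes_per_reading = 1    # assumption, for illustration only

print(sensors * rate_hz * bytes_per_reading / 1e15, "PB/s")   # -> 6.0 PB/s
# Trigger systems discard almost all of this before anything is stored.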



Page 10: The LHC Computing Grid


LHC Computing Grid project (LCG)

• More than 140 computing centres

• 12 large centres for primary data management: CERN (Tier-0) and eleven Tier-1s

• 38 federations of smaller Tier-2 centres

– 12 Tier-2 federations in the US: 16 universities and 1 national lab

• 35 countries involved


[Map: the two US Tier-1 centres, BNL and FNAL]

Page 11: The LHC Computing Grid

LCG Service Hierarchy

Tier-0: the accelerator centre

• Data acquisition & initial processing

• Long-term data curation

• Distribution of data to the Tier-1 centres

Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois) and Brookhaven (NY)

Tier-1: "online" to the data acquisition process; high availability

• Managed mass storage – grid-enabled data service

• Data-heavy analysis

• National, regional support

Tier-2: ~140 centres in ~35 countries

• Simulation

• End-user analysis – batch and interactive
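The division of labour above can be captured in a small data structure – an illustrative Python sketch, not project code:

# Illustrative summary of the tier roles described above.
tiers = {
    "Tier-0 (1 site)":     ["data acquisition & initial processing",
                            "long-term data curation",
                            "distribution of data to Tier-1s"],
    "Tier-1 (11 sites)":   ["managed mass storage (grid-enabled)",
                            "data-heavy analysis",
                            "national and regional support"],
    "Tier-2 (~140 sites)": ["simulation",
                            "end-user analysis (batch & interactive)"],
}
for tier, roles in tiers.items():
    print(tier, "->", "; ".join(roles))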


Page 12: The LHC Computing Grid


WLCG Collaboration

• The Collaboration

– 4 LHC experiments

– ~140 computing centres

– 12 large centres (Tier-0, Tier-1)

– 38 federations of smaller "Tier-2" centres

– ~35 countries

• Memorandum of Understanding

– Agreed in October 2005, now being signed

• Resources

– Focuses on the needs of the four LHC experiments

– Commits resources each October for the coming year, with a 5-year forward look

– Agrees on standards and procedures

• Relies on EGEE and OSG (and other regional efforts)


Page 13: The LHC Computing Grid

Data Transfer

• Data distribution from CERN to Tier-1 sites

– The target rate was achieved in 2006 under test conditions

– In autumn 2007 and during CCRC'08, under more realistic experiment testing, the target rate was reached and sustained with ATLAS and CMS active

• Tier-2s and Tier-1s are interconnected by the general-purpose research networks

• Any Tier-2 may access data at any Tier-1

[Diagram: CERN (Tier-0) linked to the eleven Tier-1 centres – ASCC, BNL, CNAF, FNAL, GridKa, IN2P3, Nordic, PIC, RAL, SARA, TRIUMF – each serving multiple Tier-2 centres over the research networks. Label: Target 2008.]
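For scale, a rough Python estimate of the average export rate implied if the ~15 PB/year from slide 2 leaves CERN once, spread evenly over the year (an assumption; real targets must sit well above this average to allow catch-up after downtime):

# Rough average CERN -> Tier-1 export rate implied by 15 PB/year.
data_per_year_pb = 15
seconds_per_year = 365 * 24 * 3600

avg_gb_per_s = data_per_year_pb * 1e6 / seconds_per_year   # PB -> GB
print(round(avg_gb_per_s, 2), "GB/s average")              # -> ~0.48 GB/s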

Page 14: The LHC Computing Grid

Grid activity

• WLCG ran ~44 million jobs in 2007 – the workload has continued to increase and is now at ~165k jobs/day

• The distribution of work across Tier-0/Tier-1/Tier-2 really illustrates the importance of the grid system

– The Tier-2 contribution is around 50%; >85% of the work is external to CERN

[Chart: LCG jobs run per month in 2007, stacked by experiment (ALICE, ATLAS, CMS, LHCb), Jan-07 to Dec-07, y-axis 0–6 million jobs/month; Tier-2 sites highlighted; end-of-year rate ~165k jobs/day.]
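A quick consistency check of the two headline numbers (illustrative arithmetic only):

# 44 million jobs over 2007 versus the quoted ~165k jobs/day.
jobs_2007 = 44e6
print(round(jobs_2007 / 365 / 1e3), "k jobs/day average")   # -> ~121k/day
# The ~165k/day figure is the end-of-2007 rate, above the yearly average,
# consistent with the rising monthly totals in the chart above.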

Page 15: The LHC Computing Grid

Impact of the LHC Computing Grid in Europe

• LCG has been the driving force for the European multi-science Grid EGEE (Enabling Grids for E-sciencE)

• EGEE is now a global effort, and the largest Grid infrastructure worldwide

• Co-funded by the European Commission (cost ~€130M over 4 years, of which ~€70M is EU funding)

• EGEE is already used for >100 applications, including…

[Images: medical imaging; education and training; bio-informatics]


Page 16: The LHC Computing Grid

Impact of the LCG in the US

• LCG is a major stakeholder in the US Open Science Grid national infrastructure.

• OSG is funded by the National Science Foundation and the Department of Energy's SciDAC program at $6M/year for 5 years, starting in 2006.

• More than 20 US LHC universities (Tier-2s and Tier-3s) are members.

• OSG supports other sciences and provides training and common software.

• NSF further contributes through Grid middleware projects – Globus, Condor – and other collaborative projects such as DISUN.

[Images: weather forecasting (B. Etherton, RENCI); education and training]

Page 17: The LHC Computing Grid

The EGEE project

• EGEE

– Started in April 2004, now in its second phase with 91 partners in 32 countries

– 3rd phase (2008-2010) in preparation

• Objectives

– Large-scale, production-quality grid infrastructure for e-Science

– Attracting new resources and users from industry as well as science

– Maintain and further improve the "gLite" Grid middleware


EGEE-II partners in the USA: the Universities of Chicago, Southern California and Wisconsin, and the Renaissance Computing Institute (RENCI)

Page 18: The LHC Computing Grid


Registered Collaborating Projects

• Applications – improved services for academia, industry and the public

• Support Actions – key complementary functions

• Infrastructures – geographical or thematic coverage

25 projects have registered as of September 2007.


Page 19: The LHC Computing Grid


Collaborating infrastructures


Page 20: The LHC Computing Grid


Application domains: archeology, astronomy, astrophysics, civil protection, computational chemistry, earth sciences, finance, fusion, geophysics, high-energy physics, life sciences, multimedia, materials sciences, …

Scale: >250 sites, 48 countries, >50,000 CPUs, >20 petabytes, >10,000 users, >150 VOs, >150,000 jobs/day

Page 21: The LHC Computing Grid


In silico drug discovery

• Diseases such as HIV/AIDS, SARS, bird flu, etc. are a threat to public health due to the worldwide movement and interaction of people

• Grids open new perspectives for in silico drug discovery

– Reduced cost, and an accelerating factor in the search for new drugs

• Avian influenza: [chart of bird casualties]

• International collaboration is required for: early detection, epidemiological watch, prevention, the search for new drugs, and the search for vaccines

Page 22: The LHC Computing Grid


WISDOM

http://wisdom.healthgrid.org/

Page 23: The LHC Computing Grid

Example: Geocluster industrial application

• The first industrial application successfully running on EGEE

• Developed by the Compagnie Générale de Géophysique (CGG) in France, doing geophysical simulations for oil, gas, mining and environmental industries

• EGEE technology helps CGG to federate its computing resources around the globe


Page 24: The LHC Computing Grid

OSG and the LCG

• The 17 US Tier-2 centers are funded by the NSF and participate in the LCG through the OSG.

• Common software through the Virtual Data Toolkit is used by many different Grids.

US ATLAS Tier-2s:

• Northeast Tier-2: Boston University, Harvard University

• Great Lakes Tier-2: University of Michigan, Michigan State

• Midwest Tier-2: Indiana University, University of Chicago

• Southwest Tier-2: Oklahoma University, University of Texas Arlington, Langston University, University of New Mexico

• Stanford Linear Accelerator Center

US CMS Tier-2s: Caltech, Florida, MIT, Nebraska, Purdue, U. of Wisconsin, U. of California San Diego

[Chart: WLCG ATLAS and CMS processing usage for the past year (WLCG APEL accounting reports).]

Page 25: The LHC Computing Grid

Sustainability

• Need to prepare for a permanent Grid infrastructure in Europe and the world

• Ensure a high quality of service for all user communities

• Independent of short project funding cycles

• Infrastructure managed in collaboration with National Grid Initiatives (NGIs)

• European Grid Initiative (EGI)

• Future of projects like OSG, NorduGrid, ... ?


Page 26: The LHC Computing Grid


For more information:

Thank you for your kind attention!

www.cern.ch/lcg

www.eu-egee.org

www.eu-egi.org/

www.gridcafe.org


www.opensciencegrid.org