EGEE-III Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Andreas Gellrich,...

19
EGEE-III Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Andreas Gellrich, DESY The Grid – The Future of Scientific Computing Andreas Gellrich DESY IT Training DESY, Hamburg 28.10.2008
  • date post

    19-Dec-2015
  • Category

    Documents

  • view

    218
  • download

    0

Transcript of EGEE-III Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Andreas Gellrich,...

EGEE-III

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks Andreas Gellrich, DESY

The Grid –The Future of Scientific Computing

Andreas GellrichDESY

IT TrainingDESY, Hamburg28.10.2008

Introduction to Grids and the EGEE project 2

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Collaborating e-Infrastructures

Potential for linking ~80 countries by 2008

Introduction to Grids and the EGEE project 3

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

The need for Grid in HEP

• CERN: the world's largest particle physics laboratory

• Particle physics requires special tools to create and study new particles: accelerators and detectors

Mont Blanc(4810 m)

Downtown Geneva

• Large Hadron Collider (LHC):– One of the most powerful

instruments ever built to investigate matter

– 4 experiments: ALICE, ATLAS, CMS, LHCb

– 27 km circumference tunnel– Due to start up mid 2008

Introduction to Grids and the EGEE project 4

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Highlights of EGEE-II

• >200 VOs from several scientific domains– Astronomy & Astrophysics– Civil Protection– Computational Chemistry– Comp. Fluid Dynamics– Computer Science/Tools– Condensed Matter Physics– Earth Sciences– Fusion– High Energy Physics– Life Sciences

• Further applications under evaluation

98k jobs/day

Applications have moved from testing to routine and daily usage

~80-90% efficiency

Introduction to Grids and the EGEE project 5

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Biomedical applications

• Biomedicine is also a pilot application area

• More than 20 applications deployed and being ported

• Three sub domains – Medical image processing– Biomedicine– Drug discovery

• Use Grid as platform for collaboration (don’t need same massive processing power or storage as HEP)

Introduction to Grids and the EGEE project 6

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Applications Example

• Grid-enabled drug discovery process for neglected diseases– In silico docking

compute probability that potential drugs dock with target protein

– To speed up and reduce cost to develop new drugs

• WISDOM (World-wide In Silico Docking On Malaria)– First biomedical data challenge – 46 million ligands docked in 6 weeks– 1TB of data produced – 1000 computers in 15 countries

Equivalent to 80 CPU years

• Second data challenge on Avian flu in April 2006– 300,000 possible drug components tested– 8 different targets– 2000 computers used for 4 weeks

Introduction to Grids and the EGEE project 7

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Astroparticle physics

• Major Atmospheric Gamma Imaging Cherenkov telescope (MAGIC) – Origin of VHE -rays (30 GeV – TeV)

Active Galactic Nuclei (AGN) Supernova Remnants Unidentified EGRET sources Gamma Ray Bursts

– Huge hadronic background MC simulations to simulate the background of one night, 70 CPUs (P4 2GHz) need to run

for 19200 days– Observation data are big too!

• MAGIC Grid– Use three national Grid centres as backbone– All are members of EGEE

• Work to build a second telescope is currently in progress

Towards a virtual observatory for VHE -rays

Introduction to Grids and the EGEE project 8

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Astroparticle physics

• PLANCK satellite mission– Measure cosmic microwave background (CMB)

At even higher resolution than previous missions

– Launch in 2008; duration >1 year

• Application– N simulations of the whole Planck/LFI mission

different cosmological and instrumental parameters

– Full sky map for frequencies 30 up to 850 GHz (two complete sky surveys)

– 22 channels for LFI, 48 for HFI > 12 times fa

ster!!!

but ~5% fa

ilure ra

te

Introduction to Grids and the EGEE project 9

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Computational Chemistry

• GEMS (Grid Enabled Molecular Simulator) application– Calculation and fitting of electronic energies of atomic and molecular

aggregates (using high level ab initio methods)– The use of statistical kinetics and dynamics to study chemical

processes

• Virtual Monitors– Angular distributions– Vibrational distributions– Rotational distributions– Many body systems

• End-User applications– Nanotubes– Life sciences– Statistical Thermodynamics– Molecular Virtual Reality

Angular distribution

Rotational distribution

Vibrational distribution

Many body systemAngular distribution

Introduction to Grids and the EGEE project 10

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Fusion

• Large Nuclear Fusion installations– E.g. International Thermonuclear Experimental Reactor (ITER)

– Distributed data storage and handling needed

– Computing power needed for Making decisions in real time Solving kinetic transport

particle orbits Stellarator optimization

magnetic field to contain the plasma

Introduction to Grids and the EGEE project 11

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Earth Science Applications

• Community – Many small groups that aggregate for projects (and separate

afterwards)

• The Earth – Complex system– Independent domains with interfaces

Solid Earth – Ocean – Atmosphere

– Physics, chemistry and/or biology

• Applications– Earth observation by satellite– Seismology– Hydrology– Climate– Geosciences– Pollution

– Meteorology, Space Weather– Mars Atmosphere– Database Collection

Introduction to Grids and the EGEE project 12

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Earthquake analysis

• Seismic software application determines:Epicentre, magnitude, mechanism May make it possible to predict future

earthquakes Assess potential impact on specific regions

• Analysis of Indonesian earthquake (28 March 2005)– Data from French seismic sensor network

GEOSCOPE transmitted to IPGP within 12 hours after the earthquake

– Solution found within 30 hours after earthquake occurred 10 times faster on the Grid than on local computers

– Results Not an aftershock of December 2004 earthquake Different location (different part of fault line further south) Different mechanism

• Rapid analysis of earthquakes is important for relief efforts

Introduction to Grids and the EGEE project 13

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Industrial applications

• EGEODE– Industrial application from Compagnie Générale de Géophysique

running on EGEE infrastructure Seismic processing platform Based on industrial application Geocluster© used at CGG Being ported to EGEE for Industry and Academia

• OpenPlast project– French R&D programme to develop and deploy Grid platform for

plastic industry (SMEs)– Based on experience from EGEE (supported by CS)– Next: Interoperability with other Grids

Introduction to Grids and the EGEE project 14

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

HEP

• High Energy Physics is a pilot application domain for EGEE– Large datasets– Large computing requirements Major need for Grid technology to support distributed communities

• Support for LHC experiments through LHC Computing Grid (LCG)– ATLAS, CMS, LHCb, ALICE

• Also support for other major international HEP experiments– BaBar (US)– CDF (US)– DØ (US)– H1 and ZEUS (Germany)

Introduction to Grids and the EGEE project 15

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

LCG Tier Model

Tier2 Centre ~1 TIPS

Online System

Offline Farm~20 TIPS

CERN Computer Centre >20 TIPS

GridKa Regional Centre

US Regional Centre

French Regional Centre

Italian Regional Centre

InstituteInstituteInstituteInstitute ~0.25TIPS

Workstations

~100 MBytes/sec

~100 MBytes/sec

100 - 1000 Mbits/sec

•One bunch crossing per 25 ns

•100 triggers per second

•Each event is ~1 Mbyte

Physicists work on analysis “channels”

Each institute has ~10 physicists working on one or more channels

Data for these channels should be cached by the institute server

Physics data cache

~PBytes/sec

~ Gbits/sec or Air Freight

Tier2 Centre ~1 TIPS

Tier2 Centre ~1 TIPS

~Gbits/sec

Tier Tier 00

Tier Tier 11

Tier Tier 33

1 TIPS = 25,000 SpecInt95

PC (1999) = ~15 SpecInt95

DESY ~1 TIPSTier Tier 22

Introduction to Grids and the EGEE project 16

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

LHC Data

• 40 million collisions per second• After filtering, 100 collisions of interest

per second• A Megabyte of data for each collision

= recording rate of 0.1 Gigabytes/sec

• 1010 collisions recorded each year When LHC starts operation:

will generate ~ 15 Petabytes/year of data*

*corresponding to more than 20 million CDs!

Concorde(15 Km)

Balloon(30 Km)

CD stack with1 year LHC data!(~ 20 Km)

Mt. Blanc(4.8 Km)

Introduction to Grids and the EGEE project 17

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

LHC Computing Grid

• Aim: to develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments

Ensure the computing service … and common application libraries and tools

• “Tier” infrastructure with Tier-0 at CERN, 11 Tier-1 centres and more than 100 Tier-2, and Tier-3 centres

• Phase I – 2002-05 – Development & planning

• Phase II – 2006-2008 – Deployment & commissioning of the initial services

LCG is not a development project – it relies on EGEE (and other Grid projects) for Grid middleware development, application support, Grid operation and deployment

Introduction to Grids and the EGEE project 18

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

HEP success stories

• Fundamental activity in preparation of LHC start up– Physics– Computing systems

• Examples:– LHCb: ~700 CPU/years in 2005

on the EGEE infrastructure– ATLAS: over 20,000 jobs per

day Comprehensive analysis: see

S.Campana et al., “Analysis of the ATLAS Rome Production experience on the EGEE Computing Grid“, e-Science 2005, Melbourne, Australia

– A lot of activity in all involved applications (including as usual a lot of activity within non-LHC experiments like BaBar, CDF and D0)

CPU used: 6,389,638 hData Output: 77 TB

DIRAC.Barcelona.es 0.214% DIRAC.Bologna-T2.it 0.696%DIRAC.CERN.ch 0.571% DIRAC.Cambridge.uk 0.001%DIRAC.CracowAgu.pl 0.001% DIRAC.IF-UFRJ.br 0.175%DIRAC.LHCBONLINE.ch 0.779% DIRAC.Lyon.fr 2.552%DIRAC.PNPI.ru 0.000% DIRAC.Santiago.es 0.148%DIRAC.ScotGrid.uk 3.068% DIRAC.Zurich-spz.ch 0.003%DIRAC.Zurich.ch 0.756% LCG.ACAD.bg 0.106%LCG.BHAM-HEP.uk 0.705% LCG.Barcelona.es 0.281%LCG.Bari.it 1.357% LCG.Bologna.it 0.032%LCG.CERN.ch 10.960% LCG.CESGA.es 0.528%LCG.CGG.fr 0.676% LCG.CNAF-GRIDIT.it 0.012%LCG.CNAF.it 13.196% LCG.CNB.es 0.385%LCG.CPPM.fr 0.242% LCG.CSCS.ch 0.282%LCG.CY01.cy 0.103% LCG.Cagliari.it 0.515%LCG.Cambridge.uk 0.010% LCG.Catania.it 0.551%LCG.Durham.uk 0.476% LCG.Edinburgh.uk 0.031%LCG.FZK.de 1.708% LCG.Ferrara.it 0.073%LCG.Firenze.it 1.047% LCG.GR-01.gr 0.349%LCG.GR-02.gr 0.226% LCG.GR-03.gr 0.171%LCG.GR-04.gr 0.056% LCG.GRNET.gr 1.170%LCG.HPC2N.se 0.001% LCG.ICI.ro 0.088%LCG.IFCA.es 0.022% LCG.IHEP.su 1.245%LCG.IN2P3.fr 4.143% LCG.INTA.es 0.076%LCG.IPP.bg 0.033% LCG.ITEP.ru 0.792%LCG.Imperial.uk 0.891% LCG.Iowa.us 0.287%LCG.JINR.ru 0.472% LCG.KFKI.hu 1.436%LCG.Lancashire.uk 6.796% LCG.Legnaro.it 1.569%LCG.Manchester.uk 0.285% LCG.Milano.it 0.770%LCG.Montreal.ca 0.069% LCG.NIKHEF.nl 5.140%LCG.NSC.se 0.465% LCG.Napoli.it 0.175%LCG.Oxford.uk 1.214% LCG.PIC.es 2.366%LCG.PNPI.ru 0.278% LCG.Padova.it 2.041%LCG.Pisa.it 0.121% LCG.QMUL.uk 6.407%LCG.RAL-HEP.uk 0.938% LCG.RAL.uk 9.518%LCG.RHUL.uk 2.168% LCG.SARA.nl 0.675%LCG.Sheffield.uk 0.094% LCG.Torino.it 1.455%LCG.Toronto.ca 0.343% LCG.Triumf.ca 0.105%LCG.UCL-CCC.uk 1.455% LCG.USC.es 1.853%

20-6-2005, P. Jenni 4LCG POB: ATLAS on the LCG/EGEE Grid

This is the first successful use of the grid by a largeuser community, which has however also revealed several shortcomings which need now to be fixed as LHC turn-on is onlytwo years ahead!

Very instructive comments from the user feedback have been presented at the Workshop (obviously this was one of the main themes and purposes of the meeting)

All this is available on the Web

ATLAS

LHCb

Introduction to Grids and the EGEE project 19

Enabling Grids for E-sciencE

EGEE-III Andreas Gellrich, DESY

Monitoring