EGEE-III Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Andreas Gellrich,...
-
date post
19-Dec-2015 -
Category
Documents
-
view
218 -
download
0
Transcript of EGEE-III Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Andreas Gellrich,...
EGEE-III
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks Andreas Gellrich, DESY
The Grid –The Future of Scientific Computing
Andreas GellrichDESY
IT TrainingDESY, Hamburg28.10.2008
Introduction to Grids and the EGEE project 2
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Collaborating e-Infrastructures
Potential for linking ~80 countries by 2008
Introduction to Grids and the EGEE project 3
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
The need for Grid in HEP
• CERN: the world's largest particle physics laboratory
• Particle physics requires special tools to create and study new particles: accelerators and detectors
Mont Blanc(4810 m)
Downtown Geneva
• Large Hadron Collider (LHC):– One of the most powerful
instruments ever built to investigate matter
– 4 experiments: ALICE, ATLAS, CMS, LHCb
– 27 km circumference tunnel– Due to start up mid 2008
Introduction to Grids and the EGEE project 4
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Highlights of EGEE-II
• >200 VOs from several scientific domains– Astronomy & Astrophysics– Civil Protection– Computational Chemistry– Comp. Fluid Dynamics– Computer Science/Tools– Condensed Matter Physics– Earth Sciences– Fusion– High Energy Physics– Life Sciences
• Further applications under evaluation
98k jobs/day
Applications have moved from testing to routine and daily usage
~80-90% efficiency
Introduction to Grids and the EGEE project 5
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Biomedical applications
• Biomedicine is also a pilot application area
• More than 20 applications deployed and being ported
• Three sub domains – Medical image processing– Biomedicine– Drug discovery
• Use Grid as platform for collaboration (don’t need same massive processing power or storage as HEP)
Introduction to Grids and the EGEE project 6
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Applications Example
• Grid-enabled drug discovery process for neglected diseases– In silico docking
compute probability that potential drugs dock with target protein
– To speed up and reduce cost to develop new drugs
• WISDOM (World-wide In Silico Docking On Malaria)– First biomedical data challenge – 46 million ligands docked in 6 weeks– 1TB of data produced – 1000 computers in 15 countries
Equivalent to 80 CPU years
• Second data challenge on Avian flu in April 2006– 300,000 possible drug components tested– 8 different targets– 2000 computers used for 4 weeks
Introduction to Grids and the EGEE project 7
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Astroparticle physics
• Major Atmospheric Gamma Imaging Cherenkov telescope (MAGIC) – Origin of VHE -rays (30 GeV – TeV)
Active Galactic Nuclei (AGN) Supernova Remnants Unidentified EGRET sources Gamma Ray Bursts
– Huge hadronic background MC simulations to simulate the background of one night, 70 CPUs (P4 2GHz) need to run
for 19200 days– Observation data are big too!
• MAGIC Grid– Use three national Grid centres as backbone– All are members of EGEE
• Work to build a second telescope is currently in progress
Towards a virtual observatory for VHE -rays
Introduction to Grids and the EGEE project 8
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Astroparticle physics
• PLANCK satellite mission– Measure cosmic microwave background (CMB)
At even higher resolution than previous missions
– Launch in 2008; duration >1 year
• Application– N simulations of the whole Planck/LFI mission
different cosmological and instrumental parameters
– Full sky map for frequencies 30 up to 850 GHz (two complete sky surveys)
– 22 channels for LFI, 48 for HFI > 12 times fa
ster!!!
but ~5% fa
ilure ra
te
Introduction to Grids and the EGEE project 9
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Computational Chemistry
• GEMS (Grid Enabled Molecular Simulator) application– Calculation and fitting of electronic energies of atomic and molecular
aggregates (using high level ab initio methods)– The use of statistical kinetics and dynamics to study chemical
processes
• Virtual Monitors– Angular distributions– Vibrational distributions– Rotational distributions– Many body systems
• End-User applications– Nanotubes– Life sciences– Statistical Thermodynamics– Molecular Virtual Reality
Angular distribution
Rotational distribution
Vibrational distribution
Many body systemAngular distribution
Introduction to Grids and the EGEE project 10
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Fusion
• Large Nuclear Fusion installations– E.g. International Thermonuclear Experimental Reactor (ITER)
– Distributed data storage and handling needed
– Computing power needed for Making decisions in real time Solving kinetic transport
particle orbits Stellarator optimization
magnetic field to contain the plasma
Introduction to Grids and the EGEE project 11
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Earth Science Applications
• Community – Many small groups that aggregate for projects (and separate
afterwards)
• The Earth – Complex system– Independent domains with interfaces
Solid Earth – Ocean – Atmosphere
– Physics, chemistry and/or biology
• Applications– Earth observation by satellite– Seismology– Hydrology– Climate– Geosciences– Pollution
– Meteorology, Space Weather– Mars Atmosphere– Database Collection
Introduction to Grids and the EGEE project 12
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Earthquake analysis
• Seismic software application determines:Epicentre, magnitude, mechanism May make it possible to predict future
earthquakes Assess potential impact on specific regions
• Analysis of Indonesian earthquake (28 March 2005)– Data from French seismic sensor network
GEOSCOPE transmitted to IPGP within 12 hours after the earthquake
– Solution found within 30 hours after earthquake occurred 10 times faster on the Grid than on local computers
– Results Not an aftershock of December 2004 earthquake Different location (different part of fault line further south) Different mechanism
• Rapid analysis of earthquakes is important for relief efforts
Introduction to Grids and the EGEE project 13
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
Industrial applications
• EGEODE– Industrial application from Compagnie Générale de Géophysique
running on EGEE infrastructure Seismic processing platform Based on industrial application Geocluster© used at CGG Being ported to EGEE for Industry and Academia
• OpenPlast project– French R&D programme to develop and deploy Grid platform for
plastic industry (SMEs)– Based on experience from EGEE (supported by CS)– Next: Interoperability with other Grids
Introduction to Grids and the EGEE project 14
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
HEP
• High Energy Physics is a pilot application domain for EGEE– Large datasets– Large computing requirements Major need for Grid technology to support distributed communities
• Support for LHC experiments through LHC Computing Grid (LCG)– ATLAS, CMS, LHCb, ALICE
• Also support for other major international HEP experiments– BaBar (US)– CDF (US)– DØ (US)– H1 and ZEUS (Germany)
Introduction to Grids and the EGEE project 15
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
LCG Tier Model
Tier2 Centre ~1 TIPS
Online System
Offline Farm~20 TIPS
CERN Computer Centre >20 TIPS
GridKa Regional Centre
US Regional Centre
French Regional Centre
Italian Regional Centre
InstituteInstituteInstituteInstitute ~0.25TIPS
Workstations
~100 MBytes/sec
~100 MBytes/sec
100 - 1000 Mbits/sec
•One bunch crossing per 25 ns
•100 triggers per second
•Each event is ~1 Mbyte
Physicists work on analysis “channels”
Each institute has ~10 physicists working on one or more channels
Data for these channels should be cached by the institute server
Physics data cache
~PBytes/sec
~ Gbits/sec or Air Freight
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
~Gbits/sec
Tier Tier 00
Tier Tier 11
Tier Tier 33
1 TIPS = 25,000 SpecInt95
PC (1999) = ~15 SpecInt95
DESY ~1 TIPSTier Tier 22
Introduction to Grids and the EGEE project 16
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
LHC Data
• 40 million collisions per second• After filtering, 100 collisions of interest
per second• A Megabyte of data for each collision
= recording rate of 0.1 Gigabytes/sec
• 1010 collisions recorded each year When LHC starts operation:
will generate ~ 15 Petabytes/year of data*
*corresponding to more than 20 million CDs!
Concorde(15 Km)
Balloon(30 Km)
CD stack with1 year LHC data!(~ 20 Km)
Mt. Blanc(4.8 Km)
Introduction to Grids and the EGEE project 17
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
LHC Computing Grid
• Aim: to develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments
Ensure the computing service … and common application libraries and tools
• “Tier” infrastructure with Tier-0 at CERN, 11 Tier-1 centres and more than 100 Tier-2, and Tier-3 centres
• Phase I – 2002-05 – Development & planning
• Phase II – 2006-2008 – Deployment & commissioning of the initial services
LCG is not a development project – it relies on EGEE (and other Grid projects) for Grid middleware development, application support, Grid operation and deployment
Introduction to Grids and the EGEE project 18
Enabling Grids for E-sciencE
EGEE-III Andreas Gellrich, DESY
HEP success stories
• Fundamental activity in preparation of LHC start up– Physics– Computing systems
• Examples:– LHCb: ~700 CPU/years in 2005
on the EGEE infrastructure– ATLAS: over 20,000 jobs per
day Comprehensive analysis: see
S.Campana et al., “Analysis of the ATLAS Rome Production experience on the EGEE Computing Grid“, e-Science 2005, Melbourne, Australia
– A lot of activity in all involved applications (including as usual a lot of activity within non-LHC experiments like BaBar, CDF and D0)
CPU used: 6,389,638 hData Output: 77 TB
DIRAC.Barcelona.es 0.214% DIRAC.Bologna-T2.it 0.696%DIRAC.CERN.ch 0.571% DIRAC.Cambridge.uk 0.001%DIRAC.CracowAgu.pl 0.001% DIRAC.IF-UFRJ.br 0.175%DIRAC.LHCBONLINE.ch 0.779% DIRAC.Lyon.fr 2.552%DIRAC.PNPI.ru 0.000% DIRAC.Santiago.es 0.148%DIRAC.ScotGrid.uk 3.068% DIRAC.Zurich-spz.ch 0.003%DIRAC.Zurich.ch 0.756% LCG.ACAD.bg 0.106%LCG.BHAM-HEP.uk 0.705% LCG.Barcelona.es 0.281%LCG.Bari.it 1.357% LCG.Bologna.it 0.032%LCG.CERN.ch 10.960% LCG.CESGA.es 0.528%LCG.CGG.fr 0.676% LCG.CNAF-GRIDIT.it 0.012%LCG.CNAF.it 13.196% LCG.CNB.es 0.385%LCG.CPPM.fr 0.242% LCG.CSCS.ch 0.282%LCG.CY01.cy 0.103% LCG.Cagliari.it 0.515%LCG.Cambridge.uk 0.010% LCG.Catania.it 0.551%LCG.Durham.uk 0.476% LCG.Edinburgh.uk 0.031%LCG.FZK.de 1.708% LCG.Ferrara.it 0.073%LCG.Firenze.it 1.047% LCG.GR-01.gr 0.349%LCG.GR-02.gr 0.226% LCG.GR-03.gr 0.171%LCG.GR-04.gr 0.056% LCG.GRNET.gr 1.170%LCG.HPC2N.se 0.001% LCG.ICI.ro 0.088%LCG.IFCA.es 0.022% LCG.IHEP.su 1.245%LCG.IN2P3.fr 4.143% LCG.INTA.es 0.076%LCG.IPP.bg 0.033% LCG.ITEP.ru 0.792%LCG.Imperial.uk 0.891% LCG.Iowa.us 0.287%LCG.JINR.ru 0.472% LCG.KFKI.hu 1.436%LCG.Lancashire.uk 6.796% LCG.Legnaro.it 1.569%LCG.Manchester.uk 0.285% LCG.Milano.it 0.770%LCG.Montreal.ca 0.069% LCG.NIKHEF.nl 5.140%LCG.NSC.se 0.465% LCG.Napoli.it 0.175%LCG.Oxford.uk 1.214% LCG.PIC.es 2.366%LCG.PNPI.ru 0.278% LCG.Padova.it 2.041%LCG.Pisa.it 0.121% LCG.QMUL.uk 6.407%LCG.RAL-HEP.uk 0.938% LCG.RAL.uk 9.518%LCG.RHUL.uk 2.168% LCG.SARA.nl 0.675%LCG.Sheffield.uk 0.094% LCG.Torino.it 1.455%LCG.Toronto.ca 0.343% LCG.Triumf.ca 0.105%LCG.UCL-CCC.uk 1.455% LCG.USC.es 1.853%
20-6-2005, P. Jenni 4LCG POB: ATLAS on the LCG/EGEE Grid
This is the first successful use of the grid by a largeuser community, which has however also revealed several shortcomings which need now to be fixed as LHC turn-on is onlytwo years ahead!
Very instructive comments from the user feedback have been presented at the Workshop (obviously this was one of the main themes and purposes of the meeting)
All this is available on the Web
ATLAS
LHCb