EGEE 3 Project

EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE-III A presentation for EU officials Status: May 2008


EGEE 3 Project Presentation

Transcript of EGEE 3 Project

Page 1: EGEE 3 Project

EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE-III A presentation for EU officials Status: May 2008

Page 2: EGEE 3 Project

EGEE

Flagship Grid infrastructure project co-funded by the European Commission

Main Objectives
• Operate a large-scale, production quality Grid infrastructure for e-Science
• Attract new resources and users from sciences as well as business

Activity shares (pie chart):
• Service & Networking support: 50%
• Application support: 20%
• Integration and testing: 9%
• Training: 8%
• Dissemination & International Cooperation: 6%
• Middleware: 5%
• Management: 2%

Page 3: EGEE 3 Project

EGEE-III
• EGEE-III
– Third phase of the EGEE programme:
    - EGEE: April 2004 – March 2006
    - EGEE-II: April 2006 – April 2008
– Co-funded by the European Commission under call INFRA-2007-1.2.3
– 9010 person months/375 FTEs
– 2 year period: 1 May 2008 to 30 April 2010
– EC Requested Contribution: €32M, which represents less than 1/3 of total project costs
• Key objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
• Consortium
– Structured on a national basis (National Grid Initiatives/Joint Research Units)
– 42 beneficiaries (+ 100 JRU members)
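The effort figures on this slide can be cross-checked with a short calculation. The sketch below assumes that one FTE corresponds to one person-month of effort per calendar month over the 24-month project period; the function and variable names are illustrative and are not taken from the slides.

```python
# Convert the quoted effort (person-months) into full-time equivalents (FTEs).
PERSON_MONTHS = 9010   # effort quoted on the slide
PROJECT_MONTHS = 24    # 1 May 2008 to 30 April 2010 (2 year period)


def full_time_equivalents(person_months: float, duration_months: float) -> float:
    """Average number of people working full time over the whole period."""
    return person_months / duration_months


fte = full_time_equivalents(PERSON_MONTHS, PROJECT_MONTHS)
print(f"{fte:.1f} FTEs")  # compare with the 375 FTEs quoted on the slide
```

Under that assumption the result rounds to the 375 FTEs quoted above.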

Page 4: EGEE 3 Project

EGEE-III activities and leaders

• NA1: Management of the project (Bob Jones, CERN)
• NA2: Dissemination, Communication and Outreach (Catherine Gater, CERN)
• NA3: User Training and support (Robin McConnell, UEDIN)
• NA4: User Community Support and Expansion (Cal Loomis, CNRS)
• NA5: International Cooperation & Policy (Panos Louridas, GRNET)
• SA1: Grid Operations (Maite Barroso Lopez, CERN)
• SA2: Networking Support (Xavier Jeannin, CNRS)
• SA3: Integration, testing & certification (Oliver Keeble, CERN)
• JRA1: Middleware engineering (Francesco Giacomini, INFN)

Page 5: EGEE 3 Project

EGEE – What do we deliver?
• Infrastructure operation
– Sites distributed across many countries
    - Large quantity of CPUs and storage
    - Continuous monitoring of Grid services & automated site configuration/management
    - Support multiple Virtual Organisations from diverse research disciplines
• Middleware
– Production quality middleware distributed under business friendly open source licence
    - Implements a service-oriented architecture that virtualises resources
    - Adheres to recommendations on web service inter-operability and evolving towards emerging standards
• User Support - Managed process from first contact through to production usage
– Training
– Expertise in Grid-enabling applications
– Online helpdesk
– Networking events (User Forum, Conferences etc.)

Page 6: EGEE 3 Project

EGEE – Infrastructure

• >250 sites
• 48 countries
• >50,000 CPUs
• >20 PetaBytes
• >10,000 users
• >150 VOs
• >150,000 jobs/day

Application areas include: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …

Page 7: EGEE 3 Project

Users and resources distribution

February 2008 figures

Page 8: EGEE 3 Project

gLite Grid Middleware Services

Service groups: Access, Security, Information & Monitoring, Workload Management, Data Management

Services shown in the diagram: API, CLI, Authorization, Authentication, Auditing, Information & Monitoring, Application Monitoring, Computing Element, Workload Management, Job Provenance, Package Manager, Accounting, Site Proxy, Metadata Catalog, Storage Element, Data Movement, File & Replica Catalog

Overview paper http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf

Page 9: EGEE 3 Project

Disciplines and user communities

Disciplines: Astrophysics and astroparticle physics; Biomedical and bioinformatics information; Computational chemistry; Earth sciences; Finance; Fusion; Geophysics; High Energy Physics; Infrastructure; Others (digital libraries, disaster recovery, computational sciences, etc.)

Registered VOs: argo, libi, enmr.eu, aegis, inaf, bio, trgrida, apesci, pamela, biomed, compchem, astron, astro.vo.eu-egee.org, embrace, gaussian, cesga, planck, enea, virgo, grid-it, magic, calice, edteam, gridmosi.ici.ro, auger, hone, euindia, lights.infn.it, ific, ops, ncf, ildg, pvier, vo.agata.org, trgridc, minos.vo.gridpp.ac.uk, rdteam, vo.ipno.in2p3.fr, esr, pheno, rgstest, vo.northgrid.ac.uk, supernemo.vo.eu-egee.org, swetest, webcom, vo.lal.in2p3.fr, vo.deploymenttest.cea.fr, geant4, egeode, vo.llr.in2p3.fr, vo.e-ca.es, imath.cesga.es, vo.lpnhe.in2p3.fr, vo.grif.fr, proactive, vo.sbg.in2p3.fr, infngrid, cosmo, egrid, hermes, eela, crypto.swing-grid.ch, vo.dapnia.cea.fr, eumed, diligent, alice, dteam, cyclops, fusion, atlas, vo.plgrid.pl, geclipse, babar, balticgrid, gridcc, belle, dech, cdf, see, cms, seegrid, dzero, twgrid, gridpp, trgrida/b/c/d/e, ilc, voce, lhcb, na48, zeus, ghep, desy

~9000 users listed in registered VOs: http://cic.gridops.org/index.php?section=home&page=volist

All user communities are required to contribute resources to the infrastructure

Page 10: EGEE 3 Project

Why users choose the EGEE Grid

• Share more than information
• Efficient use of resources at many institutes
• Leverage other sources of funding
• Data, computing power, applications
• Join local communities

Challenges:
• share data between thousands of scientists with multiple interests
• link major and minor computer centres
• ensure all data accessible anywhere, anytime
• grow rapidly, yet remain reliable for more than a decade
• cope with different management policies of different centres
• ensure data security
• continuous, production service

Page 11: EGEE 3 Project

Why do particle physicists need the Grid? 1/2

CERN Large Hadron Collider: the world’s most powerful particle accelerator

4 Large Experiments

Figure label: ATLAS

Page 12: EGEE 3 Project

Why do particle physicists need the Grid? 2/2

Example from LHC: starting from this event, we are looking for this “signature”

Selectivity: 1 in 10^13. Like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!

• ~100,000,000 electronic channels
• 0.0002 Higgs per second
• 15 PBytes of data a year (10 Million GBytes = 14 Million CDs)

One year’s data from LHC would fill a stack of CDs 20 km high

Height comparison (figure labels): Concorde (15 km), Mt. Blanc (4.8 km)
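The CD comparison on this slide can be reproduced approximately. The sketch below assumes 700 MB per CD and 1.2 mm per disc; neither value is stated in the slides. Because the slide quotes both 15 PBytes and 10 Million GBytes, the calculation is run for both volumes.

```python
# Rough check of the "stack of CDs" comparison.
CD_CAPACITY_BYTES = 700e6   # assumption: 700 MB per CD (not stated in the slides)
CD_THICKNESS_M = 1.2e-3     # assumption: 1.2 mm per disc (not stated in the slides)


def cd_stack(data_bytes: float) -> tuple[float, float]:
    """Return (number of CDs, stack height in km) for a given data volume."""
    cds = data_bytes / CD_CAPACITY_BYTES
    height_km = cds * CD_THICKNESS_M / 1000.0
    return cds, height_km


for petabytes in (10, 15):
    cds, km = cd_stack(petabytes * 1e15)   # decimal petabytes: 1 PB = 1e15 bytes
    print(f"{petabytes} PB -> {cds / 1e6:.1f} million CDs, stack of {km:.1f} km")
```

With these assumptions, 10 PB gives roughly the 14 million CDs quoted on the slide, and the 20 km stack height lies between the results for the two volumes.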

Page 13: EGEE 3 Project

A question of scale

Page 14: EGEE 3 Project

Recent Grid activity

• In 2007, the Worldwide LHC Computing Grid (WLCG) ran ~44 M jobs on different infrastructures (EGEE, NDGF, OSG), with the large majority of them served by EGEE; the workload has continued to increase
• 29 M jobs in the 1st quarter of 2008; now above ~300k jobs/day
• These workloads (reported across all WLCG centres) are at the level anticipated for 2008 data taking
• Distribution of work across Tier0/Tier1/Tier2 illustrates the importance of the Grid system
• Tier 2 contribution is around 50%; > 85% is external to CERN

Chart labels: 230k/day, 300k/day
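The job totals on this slide can be converted into average daily rates for comparison with the per-day chart labels. This is a minimal sketch that assumes the totals cover the full calendar periods named on the slide; the function name is illustrative.

```python
from datetime import date


def jobs_per_day(total_jobs: float, start: date, end: date) -> float:
    """Average daily job rate over the half-open period [start, end)."""
    days = (end - start).days
    return total_jobs / days


# ~44 M jobs in calendar year 2007 (365 days)
rate_2007 = jobs_per_day(44e6, date(2007, 1, 1), date(2008, 1, 1))

# 29 M jobs in the 1st quarter of 2008 (91 days, 2008 being a leap year)
rate_q1_2008 = jobs_per_day(29e6, date(2008, 1, 1), date(2008, 4, 1))

print(f"2007 average:    {rate_2007 / 1e3:.0f}k jobs/day")
print(f"Q1 2008 average: {rate_q1_2008 / 1e3:.0f}k jobs/day")
```

The first-quarter average comes out above 300k jobs/day, which agrees with the rate quoted on the slide.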

Page 15: EGEE 3 Project

In silico drug discovery

• Diseases such as HIV/AIDS, SARS, Bird Flu etc. are a threat to public health due to worldwide exchanges and circulation of persons
• Grids open new perspectives to in silico drug discovery
– Reduced cost, adding an accelerating factor in the search for new drugs
• International collaboration is required for:
– Early detection
– Epidemiological watch
– Prevention
– Search for new drugs
– Search for vaccines

Figure: Avian influenza, bird casualties

Page 16: EGEE 3 Project

WISDOM: http://wisdom.healthgrid.org/

Page 17: EGEE 3 Project

Computational Chemistry

• Researchers from more than 30 universities across Europe use EGEE for their work

• Chemical software ported includes commercial packages (Gaussian03, Turbomole, Wien2k) and several freely available packages (GAMESS, DL_POLY, CPMD, DALTON, Columbus etc.)
• Virtual Organisations:
– CompChem (http://compchem.unipg.it)
– Gaussian (http://egee.grid.cyfronet.pl/gaussian)
– Turbomole (http://egee.grid.cyfronet.pl/turbomole-vo)
• ~3 million jobs executed during 2007
• 90+ users actively using EGEE infrastructure

Page 18: EGEE 3 Project

Computational chemistry example

• Cytochrome c Oxidase (CcO) consists of approximately 10,000 atoms, and the dynamics calculations are infeasible on ordinary clusters (2.4 years needed for a simulation of 5.2 ns).
• Grid computations
– Three structures studied
– Total time: 93 days
– Nearly 6000 jobs
– 3043 days of CPU time
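The Grid figures above imply an average degree of parallelism and an average job length. The sketch below assumes that the 3043 days of CPU time is the sum over all jobs, that the 93 days is elapsed wall-clock time, and it rounds "nearly 6000 jobs" up to 6000; the variable names are illustrative.

```python
# Derived quantities for the Cytochrome c Oxidase runs.
CPU_DAYS = 3043    # total CPU time, assumed to be summed over all jobs
WALL_DAYS = 93     # total elapsed time
JOBS = 6000        # "nearly 6000 jobs", rounded up (assumption)

# Average number of CPUs kept busy over the whole campaign
avg_concurrency = CPU_DAYS / WALL_DAYS

# Average CPU time per job, in hours
avg_job_hours = CPU_DAYS * 24 / JOBS

print(f"average concurrency: {avg_concurrency:.1f} CPUs")
print(f"average job length:  {avg_job_hours:.1f} CPU-hours")
```

Under these assumptions the campaign kept about 33 CPUs busy on average, with jobs of roughly half a day each.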

Page 19: EGEE 3 Project

Grid added value

• Grid can help satisfy computational chemistry demands:
– both CPU power and intermediate data storage for future restarts
– easy management for large numbers of jobs (e.g. GANGA)
– automation of common tasks during job execution via workflows
– possibility of direct cooperation between computational chemistry and other scientific disciplines
    - some ligand properties such as geometry, charges etc. can be stored on the Grid
    - these data can be accessed by others to study interaction between ligand and protein, for example
– possibility to execute many parallel jobs at the same time
– for some commercial software packages, Grid is the only way to allow users access to these programs

Page 20: EGEE 3 Project

Expanding Geosciences-On-Demand (EGEODE) services to SMEs

• Modern seismic data processing and geophysical simulations require greater amounts of computing power, data storage and sophisticated software
• Difficult for oil & gas small & medium size enterprises (SMEs) to exploit innovative algorithms
• SME Market: small O&G structures
– 1035 O&G companies in EU
– 93% are SMEs; 63% < 10 employees

Diagram labels: Research labs; Very small projects of large firms; Conventional; High Tech.; SMEs market; CGGVeritas market

Page 21: EGEE 3 Project

EGEE workload in 2007

CPU: 114 Million hours

Data: 25 PB stored, 11 PB transferred

Chart labels: Xfer, CPU, Storage

Estimated cost if performed with Amazon’s EC2 and S3: $58,690,000 = €37M

http://gridview.cern.ch/GRIDVIEW/same_index.php
http://calculator.s3.amazonaws.com/calc5.html? (17/05/08)

Paper on Clouds and Grids, May 2008: https://edms.cern.ch/file/925013/4/EGEE-Grid-Cloud-v1_2.pdf

Page 22: EGEE 3 Project

gLite Business Use Cases
• Adopted gLite on own infrastructure
– BEinGRID: Earth Sciences; Finance
– EU-IndiaGrid: Financial Stock Analysis application using gLite
– Health-e-Child: Biomedical information platform for Paediatrics
– Imense Ltd: gLite-based Grid computing for large scale image indexing and retrieval
– Philips Research: Using gLite for medical imaging, bio-informatics and simulation
• Proof of Concept
– GridVideo: gLite-based multimedia application
– TOTAL, UK: Application to assess the usefulness of External Grids using GILDA testbed
• Application and Development
– CERN Openlab: CERN and industrial partners to develop data-intensive Grid solutions
– WISDOM: Using EGEE infrastructure for drug discovery

Page 23: EGEE 3 Project

Business and EGEE-III
• Technology Transfer and potential commercial exploitation
– Further develop the Business Forum as a means of dialog with business actors
– More attention to SMEs
    - start-ups (innovative applications and portals)
    - collaborative projects (partner grids)
– Develop a network of companies to prepare the future commercial exploitation of EGEE technology
    - EGEE Business Associates; ISVs; Software integrators and IT Services providers
• Provide solutions to challenges for Business adoption
– MoUs signed with related projects and interested partners to develop identified higher-level services and solutions (e.g. SLA; Windows porting, ...)
– Further develop EGEE technology to simplify the interaction between grids and commercial cloud services
– Explain the advantages and limitations of grids & cloud computing to businesses

Page 24: EGEE 3 Project

Collaborating e-Infrastructures

Page 26: EGEE 3 Project

European Grid Initiative
• Need to prepare permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
• Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a wide range of user communities

Must be no gap in the support of the production Grid

Page 27: EGEE 3 Project

Summary
• EGEE operates the world’s largest multi-disciplinary Grid infrastructure for scientific research
– In constant and significant production use
– Constantly growing in scale of resources and breadth of user communities supported
• A third phase of EGEE has now started
– EGEE-III 2008-2010
• Need to prepare the long-term
– EGEE, collaborating projects, National Grid Initiatives and user communities are working to define a model for a sustainable Grid infrastructure that is independent of short project cycles

www.eu-egee.org