EGEE 3 Project

EGEE 3 Project Presentation

Transcript of EGEE 3 Project

1. Enabling Grids for E-sciencE: EGEE-III
  • A presentation for EU officials
  • Status: May 2008
  • EGEE-III INFSO-RI-222667

2. EGEE: Enabling Grids for E-sciencE
  • Flagship Grid infrastructure project co-funded by the European Commission
  • Main objectives:
      • Operate a large-scale, production quality Grid infrastructure for e-Science
      • Attract new resources and users from sciences as well as business
  • [Pie chart of project effort by activity. Labels: Service & Networking support; Application support; Training; Integration and testing; Middleware; Dissemination & International Cooperation; Management. Values shown: 50%, 20%, 9%, 8%, 6%, 5%, 2%.]

3. EGEE-III
  • Third phase of the EGEE programme:
      • EGEE: April 2004 to March 2006
      • EGEE-II: April 2006 to April 2008
  • Co-funded by the European Commission under call INFRA-2007-1.2.3
  • 9010 person months / 375 FTEs
  • 2 year period: 1 May 2008 to 30 April 2010
  • EC requested contribution: €32M, which represents less than 1/3 of total project costs
  • Key objectives:
      • Expand and optimise the existing EGEE infrastructure; include more resources and user communities
      • Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
  • Consortium:
      • Structured on a national basis (National Grid Initiatives / Joint Research Units)
      • 42 beneficiaries (+ 100 JRU members)

4. EGEE-III activities and leaders
  • NA1: Management of the project (Bob Jones, CERN)
  • NA2: Dissemination, Communication and Outreach (Catherine Gater, CERN)
  • NA3: User Training and support (Robin McConnell, UEDIN)
  • NA4: User Community Support and Expansion (Cal Loomis, CNRS)
  • NA5: International Cooperation & Policy (Panos Louridas, GRNET)
  • SA1: Grid Operations (Maite Barroso Lopez, CERN)
  • SA2: Networking Support (Xavier Jeannin, CNRS)
  • SA3: Integration, testing & certification (Oliver Keeble, CERN)
  • JRA1: Middleware engineering (Francesco Giacomini, INFN)

5. EGEE: What do we deliver?
  • Infrastructure operation
      • Sites distributed across many countries
      • Large quantity of CPUs and storage
      • Continuous monitoring of Grid services and automated site configuration/management
      • Support for multiple Virtual Organisations from diverse research disciplines
  • Middleware
      • Production quality middleware distributed under a business-friendly open source licence
      • Implements a service-oriented architecture that virtualises resources
      • Adheres to recommendations on web service interoperability and is evolving towards emerging standards
  • User support: a managed process from first contact through to production usage
      • Training
      • Expertise in Grid-enabling applications
      • Online helpdesk
      • Networking events (User Forum, conferences etc.)

6. EGEE Infrastructure
  • Application areas include: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
  • >250 sites
  • 48 countries
  • >50,000 CPUs
  • >20 PetaBytes
  • >10,000 users
  • >150 VOs
  • >150,000 jobs/day

7. Users and resources distribution
  • February 2008 figures

8. gLite Grid Middleware Services
  • [Architecture diagram. Service groups and components:]
  • Access: CLI, API
  • Security: Authentication, Authorization, Auditing
  • Information & Monitoring: Information & Monitoring, Application Monitoring
  • Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
  • Workload Management: Accounting, Job Provenance, Package Manager, Site Proxy, Computing Element, Workload Management
  • Overview paper

9. Disciplines and user communities
  • [Diagram of registered VO names grouped by discipline. Groups: Astrophysics and astroparticle physics; Biomedical and bioinformatics; Computational chemistry; High Energy Physics; Infrastructure; Earth sciences; Geophysics; Finance; Fusion; Others. Legible VO names include alice, atlas, babar, belle, cdf, cms, dzero, lhcb, zeus, geant4, biomed, embrace, compchem, gaussian, planck, virgo, magic, auger, pamela, fusion, egeode, esr, egrid, infngrid, dteam, seegrid, balticgrid, eela, eumed, diligent, cyclops.]
  • All user communities are required to [...] resources to the infrastructure
  • ~9000 users listed in registered VOs
  • Others: digital libraries, disaster recovery, computational sciences, etc.

10. Why users choose the EGEE Grid
  • Share more than information: data, computing power, applications
  • Efficient use of resources at many institutes
  • Leverage other sources of funding
  • Join local communities
  • Challenges:
      • share data between thousands of scientists with multiple interests
      • link major and minor computer centres
      • ensure all data is accessible anywhere, anytime
      • grow rapidly, yet remain reliable for more than a decade
      • cope with the different management policies of different centres
      • ensure data security
      • provide a continuous, production service

11. Why do particle physicists need the Grid? (1/2)
  • CERN Large Hadron Collider: the world's most powerful particle accelerator
  • 4 large experiments
  • [Image: ATLAS]

12. Why do particle physicists need the Grid? (2/2)
  • Example from LHC: starting from this event, we are looking for this signature [event display images]
  • ~100,000,000 electronic channels
  • 0.0002 Higgs per second
  • Selectivity: 1 in 10^13. Like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!
  • 15 PBytes of data a year (10 Million GBytes = 14 Million CDs)
  • One year's data from LHC would fill a stack of CDs 20 km high [height comparison shown: Concorde (15 km), Mt. Blanc (4.8 km)]

13. A question of scale

14. Recent Grid activity
  • In 2007, the Worldwide LHC Computing Grid ran ~44 M jobs on different infrastructures (EGEE, NDGF, OSG), with the large majority of them served by EGEE
  • Workload has continued to increase: 29M jobs in the 1st quarter of 2008, now at >300k jobs/day [chart labels: 230k/day, 300k/day]
  • The distribution of work across Tier0/Tier1/Tier2 illustrates the importance of the Grid system
      • Tier 2 contribution is around 50%; >85% is external to CERN
  • These workloads (reported across all WLCG centres) are at the level anticipated for 2008 data taking

15. In silico drug discovery
  • Diseases such as HIV/AIDS, SARS, Bird Flu etc. are a threat to public health due to worldwide exchanges and the circulation of persons
  • Grids open new perspectives for in silico drug discovery
      • Reduced cost, adding an accelerating factor in the search for new drugs
  • International collaboration is required for:
      • Early detection
      • Epidemiological watch
      • Prevention
      • Search for new drugs
      • Search for vaccines
  • [Image: Avian influenza: bird casualties]

16. WISDOM

17. Computational Chemistry
  • Researchers from more than 30 universities across Europe use EGEE for their work
  • Chemical software ported includes commercial packages (Gaussian03, Turbomole, Wien2k) and several freely available packages (GAMESS, DL_POLY, CPMD, DALTON, Columbus etc.)
  • Virtual Organisations: CompChem, Gaussian, Turbomole
  • ~3 million jobs executed during 2007
  • 90+ users actively using the EGEE infrastructure

18. Computational chemistry example
  • Cytochrome c Oxidase (CcO) consists of approximately 10,000 atoms, and the dynamics calculations are infeasible on ordinary clusters (2.4 years needed for a simulation of 5.2 ns)
  • Grid computations:
      • Three structures studied
      • Total time: 93 days
      • Nearly 6000 jobs
      • 3043 days of CPU time

19. Grid added value
  • The Grid can help satisfy computational chemistry demands:
      • both CPU power and intermediate data storage for future restarts
      • easy management of large numbers of jobs (e.g. GANGA)
      • automation of common tasks during job execution via workflows
      • possibility of direct cooperation between computational chemistry and other scientific disciplines
          • some ligand properties such as geometry, charges etc. can be stored on the Grid
          • these data can be accessed by others, for example to study the interaction between ligand and protein
      • possibility to execute many parallel jobs at the same time
      • for some commercial software packages, the Grid is the only way to allow users access to these programs

20. Expanding Geosciences-On-Demand (EGEODE) services to SMEs
  • Modern seismic data processing and geophysical simulations require greater amounts of computing power, data storage and sophisticated software
  • It is difficult for oil & gas small and medium size enterprises (SMEs) to exploit innovative algorithms
  • SME market:
      • small O&G structures
      • 1035 O&G companies in the EU: 93% are SMEs; 63% have < 10 employees
      • Research labs
      • Very small projects of large firms
  • [Diagram labels: CGGVeritas market; SMEs market; High Tech.; Conventional]

21. EGEE workload in 2007
  • Data: 25 PB stored, 11 PB transferred
  • CPU: 114 Million hours
  • Chart labels: CPU, Xfer, Storage, E