EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE VO Management in EGEE Maurice Bouwhuis SARA...

EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org VO Management in EGEE Maurice Bouwhuis SARA computing and networking services Joint EGEE and OSG Workshop on VO Management in Production Grids HPDC 2008 Boston, USA


Services provided by EGEE

VO Management in EGEE
Maurice Bouwhuis, SARA computing and networking services
Joint EGEE and OSG Workshop on VO Management in Production Grids, HPDC 2008, Boston, USA

eScience
- Science is becoming increasingly digital and needs to deal with increasing amounts of data and computational needs
- Simulations get ever more detailed
  - Nanotechnology: design of new materials from the molecular scale
  - Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
  - Decoding the human genome
- Experimental science uses ever more sophisticated sensors to make precise measurements
  - Need high statistics
  - Huge amounts of complex data
- Serves user communities around the world
- EGEE = Enabling Grids for E-sciencE

Science is getting more digital world-wide: the LHC as an example

Accelerating and colliding particles

Large Hadron Collider
[Figure labels: Mont Blanc (4810 m), Downtown Geneva]
- 27 km circumference tunnel
- Due to start up in 2008
- 40 million particle collisions per second
- Online filter reduces this to a few hundred good events per second
- Recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec
- ~15 PetaBytes per year for all four experiments
- Data analyzed by hundreds of research groups worldwide
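
As a rough cross-check of the figures above, the yearly volume can be converted into an average sustained rate. This is only a sketch: it assumes decimal units (1 PB = 10^15 bytes) and data recorded evenly over a full year, neither of which is stated on the slide.

```python
# Back-of-the-envelope check of the LHC data volume quoted on the slide.
# Assumptions (not from the slides): decimal units, recording spread evenly over a year.

PETABYTE = 10**15                    # bytes, decimal definition (assumption)
MEGABYTE = 10**6                     # bytes, decimal definition (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600   # non-leap year


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average sustained rate in MB/s needed to accumulate the given yearly volume."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / MEGABYTE


rate = average_rate_mb_per_s(15)
print(f"~15 PB/year corresponds to about {rate:.0f} MB/s on average")
```

Under these assumptions the average comes out at a few hundred MB/s, which sits inside the 100-1,000 MegaBytes/sec recording range quoted on the slide.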

Data Distribution on the Grid

Challenges for high throughput virtual docking
- 300,000 chemical compounds: ZINC and a chemical combinatorial library
- Target (PDB): neuraminidase (8 structures)
- Millions of chemical compounds available in laboratories
- High throughput screening: 2 $/compound, nearly impossible
- Molecular docking (Autodock): 100s of CPU years, TBs of data
- Data challenge on EGEE, Auvergrid, TWGrid: ~6 weeks on ~2000 computers
- In vitro screening of 100 hits
- Hits sorting and refining
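
The "100s of CPU years" figure can be checked against the "~6 weeks on ~2000 computers" figure with a little arithmetic. The sketch below assumes one fully used CPU per computer, and that every compound is docked against every target structure; both are illustrative assumptions, not statements from the slide.

```python
# Rough consistency check of the docking data challenge numbers.
# Assumptions (not from the slides): one busy CPU per computer,
# and every compound docked against every target structure.

DAYS_PER_WEEK = 7
DAYS_PER_YEAR = 365.25


def cpu_years(weeks: float, computers: int) -> float:
    """CPU-years delivered if every computer is busy for the whole period."""
    return weeks * DAYS_PER_WEEK * computers / DAYS_PER_YEAR


def docking_runs(compounds: int, target_structures: int) -> int:
    """Number of compound/structure pairs in a full cross product."""
    return compounds * target_structures


print(f"{cpu_years(6, 2000):.0f} CPU-years")
print(f"{docking_runs(300_000, 8):,} docking runs")
```

Six weeks on 2000 computers gives roughly 230 CPU-years, in line with the "100s of CPU years" on the slide.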

Example: Pharmacokinetics
- A lesion is detected in an MRI study of a patient; start with a virtual biopsy.
- The process requires obtaining a sequence of MRI volumetric images. Different images are obtained in different breath-holds.
- Before analyzing the variation of each voxel, images must be co-registered to minimize deformation due to different breath-holds.
- The total computational cost of a clinical trial of 20 patients is around 100 CPU days.

Earth Science Applications in EGEE
- ESA, UTV (IT), KNMI (NL), IPSL (FR): production and validation of 7 years of ozone profiles from GOME
- Rapid earthquake analysis (mechanism and epicenter), 50-100 CPUs: IPGP (FR)
- Modelling seawater intrusion in coastal aquifers (SWIMED): CRS4 (IT), INAT (TU), Univ. Neuchâtel (CH)
- Geocluster for academia and industry: CGG (FR)
- Flood of a Danube river, cascade of models (meteorology, hydraulic, hydrodynamic): UISAV (SK)
- Specfem3D: seismic application, benchmark for MPI (2 to 2000 CPUs): IPGP (FR)
- DKRZ (DE): data access studies, climate impacts on agriculture
- Data mining for meteorology and space weather: GCRAS (RU)
- Air pollution model: BAS (BG)
- Mars atmosphere: CETP (FR)

Manpower
- Total of 375 FTEs in EGEE-III
- 9010 person months (vs. 11165 PMs in EGEE-II; ~20% less)
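
The "~20% less" figure and the relation between FTEs and person months can be verified directly from the numbers on the slide. The helper names below are made up for this sketch.

```python
# Check the manpower figures quoted on the slide.
# Function names are illustrative, not taken from any EGEE tool.


def relative_reduction(old: float, new: float) -> float:
    """Fractional decrease going from `old` to `new`."""
    return (old - new) / old


def months_per_fte(person_months: float, ftes: float) -> float:
    """Person months available per full-time equivalent."""
    return person_months / ftes


print(f"Reduction vs. EGEE-II: {relative_reduction(11165, 9010):.1%}")
print(f"Person months per FTE: {months_per_fte(9010, 375):.1f}")
```

The reduction works out to a little over 19%, which the slide rounds to ~20%. Dividing the person months by the FTE count gives about 24 months of effort per FTE.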

Registered Collaborating Projects
- Applications: improved services for academia, industry and the public
- Support Actions: key complementary functions
- Infrastructures: geographical or thematic coverage
- 25 projects have registered as of September 2007: web page

- 250 sites
- 48 countries
- 50,000 CPUs
- 13 PetaBytes
- >5000 users
- >200 VOs
- >140,000 jobs/day

Disciplines: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences

[Figure label: 32%]

EGEE working with related infrastructure projects
[Figure label: GIN]

European Grid Initiative
- Need to prepare a permanent, common Grid infrastructure
- Ensure the long-term sustainability of the European e-Infrastructure, independent of short project funding cycles
- Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
- Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines

Virtual Organisations
What is a Virtual Organisation (EGEE take)?

A set of individuals or organisations, not under single hierarchical control, (temporarily) joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.

(graphic from: Anatomy of the Grid, Foster, Kesselman and Tuecke)

Virtual Organisations
- Within LCG/EGEE, VOs are essentially authorization domains: access rights to resources and datasets owned by a group of people
- Authentication with X509 certificates
- Trust provided by IGTF
- Authorization with VOMS
- VO membership, group and role determine which resources (storage, compute) one has access to
- (talk by Erwin Laure)

Trusted third parties
- All research grid infrastructures share the same base set of trusted third parties (CAs)
- There is typically one in each country
- The credentials they issue are comparable in quality

Virtual Organization
Who becomes a member?
- Commonality: they want something they do not own themselves
- Separation: they are not together physically/organizationally
- Within EGEE: adherence to a similar community

Examples
- High Energy Physics people from Atlas storing on wLCG
- Two theoretical chemists simulating on a regional grid

VO affiliation
- If X509 is your passport, then VO membership is the visa
- Per-VO authorisations (visa)
  - granted to a person or service by a virtual organisation
  - based on the passport name
  - acknowledged by the resource owners
  - providers can still ban individual users, and decide which privileges are granted to which VO attributes
- In your case, these visas are called VOMS credentials
  - a cryptographically protected statement by the VO
  - which is bound (by the VO) to your subject name
  - roles and VO-groups are there
[Figure labels: OK; C=IT/O=INFN /L=CNAF/CN=Pinco Palla/CN=proxy; Pinco's VO attributes; VOMS registry]
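
The passport/visa picture can be sketched as a site-side decision. The provider looks at the subject name and the VO attributes, may ban an individual, and maps attributes to privileges. This is a minimal illustration and not the VOMS API: the class names, the privilege strings, the VO name "examplevo" and the attribute strings are all hypothetical, the subject name is adapted from the slide's example, and the cryptographic verification of the credential is assumed to have happened already.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """Hypothetical, already-verified credential: the 'passport' plus its 'visa'."""
    subject_dn: str                 # X509 subject name (the passport name)
    vo_attributes: tuple[str, ...]  # attributes asserted by the VO (the visa)


@dataclass
class SitePolicy:
    """Hypothetical site-local policy, controlled by the resource provider."""
    banned_dns: set[str] = field(default_factory=set)
    # Maps a VO attribute to the privileges this site grants for it.
    privileges_by_attribute: dict[str, set[str]] = field(default_factory=dict)

    def privileges_for(self, cred: Credential) -> set[str]:
        # A provider can ban an individual regardless of VO membership.
        if cred.subject_dn in self.banned_dns:
            return set()
        granted: set[str] = set()
        for attribute in cred.vo_attributes:
            granted |= self.privileges_by_attribute.get(attribute, set())
        return granted


# Illustrative values only; "examplevo" and the privilege names are made up.
pinco = Credential(
    subject_dn="/C=IT/O=INFN/L=CNAF/CN=Pinco Palla",
    vo_attributes=("/examplevo", "/examplevo/Role=SoftwareAdmin"),
)
policy = SitePolicy(
    privileges_by_attribute={
        "/examplevo": {"submit_jobs"},
        "/examplevo/Role=SoftwareAdmin": {"write_software_area"},
    }
)
print(sorted(policy.privileges_for(pinco)))  # privileges from both attributes

policy.banned_dns.add(pinco.subject_dn)      # the site bans this individual
print(policy.privileges_for(pinco))          # now empty
```

Keeping the ban list and the attribute-to-privilege mapping on the site side mirrors the point on the slide: the VO issues the visa, but each provider decides what it is worth locally.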

VOs
Typical VOs that we manage

State:
- Registered VO: adherence to EGEE policies
- Selected/Supported VO: support from the EGEE project
- External VO: funded/supported by another project

Geographic distribution:
- Local VO: at a single Resource Center
- Regional VO: within a federation or country
- Global VO: EGEE-wide

Active VOs
- Number of active VOs growing steadily!
- Turnover: different VOs in the last 6 / 12 / 24 months = 83 / 92 / 102
- Total VOs: 104 registered, 258 visible

Setting up new VOs

New Virtual Organisation (if no existing one fits)
- Set up a VOMS instance
- Be accepted by at least one Resource Center
- Access to a Resource Broker
- Access to a File Catalogue
- Register with the EGEE portal
- EGEE-compliant security policy (standard by EGEE) and acceptable use policy
- Usually one of the Regional Centers will provide these services
- Not a really dynamic process, but it fits the e-Science requirements

More resources
- For EGEE-wide VOs: negotiate through Regional Operations Centers
- Regional VOs can usually join regional grid infrastructures
- Always needs an action and a decision from each resource provider
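
The setup requirements above amount to a checklist. The sketch below tracks them for a prospective VO. The class and field names are hypothetical and do not correspond to any EGEE registration tool.

```python
from dataclasses import dataclass


@dataclass
class VOSetup:
    """Hypothetical record of how far a new VO is through the setup steps."""
    has_voms_instance: bool = False
    accepting_resource_centers: int = 0
    has_resource_broker_access: bool = False
    has_file_catalogue_access: bool = False
    registered_with_portal: bool = False
    accepts_security_and_use_policies: bool = False

    def missing_steps(self) -> list[str]:
        """Return the setup steps that are still open, in slide order."""
        checks = [
            (self.has_voms_instance, "set up a VOMS instance"),
            (self.accepting_resource_centers >= 1,
             "be accepted by at least one Resource Center"),
            (self.has_resource_broker_access, "get access to a Resource Broker"),
            (self.has_file_catalogue_access, "get access to a File Catalogue"),
            (self.registered_with_portal, "register with the EGEE portal"),
            (self.accepts_security_and_use_policies,
             "adopt the security policy and acceptable use policy"),
        ]
        return [step for done, step in checks if not done]


new_vo = VOSetup(has_voms_instance=True, accepting_resource_centers=1)
for step in new_vo.missing_steps():
    print("still to do:", step)
```

Every item needs a human action or decision, which is why the slide calls the process not really dynamic.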

VOs and sites
- Sites in EGEE usually support many VOs
- Ties between each VO and sites are loose
  - Sites must be more generic in their setup
  - VOs must program with more discipline
- Each VO has a Software Admin role, and each site provides a software area
- In LCG, VO-boxes exist for site-related, VO-specific services

Summary
- EGEE provides a dependable, production-quality Grid infrastructure to a wide variety of scientific disciplines.
- Grids are increasingly becoming an essential part of the scientific computing infrastructure; sustainability needs to be ensured.
- VO setup is static
- VO membership (incl. group and role) determines access
