The EGEE project: An overview Frédéric Hemmer EGEE Middleware Manager

EGEE is a project funded by the European Union under contract IST-2003-508833 The EGEE project: An overview Frédéric Hemmer EGEE Middleware Manager Paradyn/Condor Week, April 16, 2004


Page 1: The EGEE project: An overview Frédéric Hemmer EGEE Middleware Manager

EGEE is a project funded by the European Union under contract IST-2003-508833

The EGEE project: An overview

Frédéric Hemmer

EGEE Middleware Manager

Paradyn/Condor Week, April 16, 2004

Page 2

Contents

• EGEE - what is it and why is it needed?

• Grid operations – providing a stable service

• Grid middleware – current and future

• Networking activity

• Summary

The material of this talk is the work of many people in EGEE and LCG

Despite its name, EGEE is an international project, involving in particular Israel, Russia and the US

Page 3

Background

• Networking, commodity computing and distributed software tools had matured enough for Grid technology to start becoming available at the end of the 1990s

• Many publicly funded projects (in the US and in the EU) have been launched since

• Grid computing is a key activity of the EU programmes

• Industrial and commercial Grids have been following (see a good sample on the www.cern.ch/gridcafe portal and also www.gridstart.org)

• Major IT vendors involved in Grid activity

Page 4

EGEE: Why?

• Access to a production quality grid will change the way science and business are done

• Current Grid R&D projects run to completion within the next few months or next year

• The EGEE partners have already made major progress in aligning national and regional Grid R&D efforts, in preparation for EGEE

• EGEE will preserve the current strong momentum of the European Grid community and the enthusiasm of the hundreds of young European researchers already involved in EU Grid projects (>150 in EU DataGrid alone)

Page 5

EGEE Manifesto: Enabling Grids for E-science in Europe

[Figure: layered diagram with the labels Applications, Geant network, Grid infrastructure]

• Goal
  • Create a wide European Grid production quality infrastructure on top of present and future EU RN infrastructure

• Build On:
  • EU and EU member states major investments in Grid Technology
  • International connections (US and AP)
  • Several pioneering prototype results
  • Large Grid development teams in EU require major EU funding effort

• Approach
  • Leverage current and planned national and regional Grid programmes
  • Work closely with relevant industrial Grid developers, NRENs and US-AP projects

Page 6

EGEE Partners

• Leverage national resources in a more effective way for broader European benefit

• 70 leading institutions in 27 countries, federated in regional Grids

Page 7

EGEE Project Structure

JRA1: Middleware Engineering and Integration

JRA2: Quality Assurance

JRA3: Security

JRA4: Network Services Development

SA1: Grid Operations, Support and Management

SA2: Network Resource Provision

NA1: Management

NA2: Dissemination and Outreach

NA3: User Training and Education

NA4: Application Identification and Support

NA5: Policy and International Cooperation

24% Joint Research, 28% Networking, 48% Services

Emphasis in EGEE is on operating a production grid and supporting the end-users

Page 8

EGEE Applications

• EGEE scope: all-inclusive for academic applications (open to the industrial and socio-economic world as well)

• The major success criterion of EGEE: how many satisfied users from how many different domains?

• 5000 users (3000 after year 2) from at least 5 disciplines

• Two pilot applications selected to guide the implementation and certify the performance and functionality of the evolving infrastructure: Physics & Bioinformatics

Application domains and timelines are for illustration only

Page 9

The pilot applications

• High Energy Physics with the LHC Computing Grid (www.cern.ch/lcg) relies on a Grid infrastructure to store and analyse petabytes (10^15 bytes) of real and simulated data. LCG is a major source of resources, requirements and hard deadlines, with no conventional solution available

• In biomedicine, several communities are facing equally daunting challenges in coping with the flood of bioinformatics and healthcare data. They need access to large, distributed, non-homogeneous data and have significant on-demand computing requirements

Page 10

EGEE Implementation

• From day 1 (1st April 2004)
  - Production grid service based on the LCG infrastructure running LCG-2 grid middleware
  - LCG-2 will be maintained until the new generation has proven itself (fallback solution)
  - VDT support for Condor/GT2 based code is needed at least through 1H05

• In parallel, develop a “next generation” grid facility
  - Produce a new set of grid services according to evolving standards (Web Services)
  - Run a development service providing early access for evaluation purposes
  - Will replace LCG-2 on the production facility in 2005

[Figure: timeline from Globus 2 based releases (LCG-1, LCG-2) to Web services based releases (EGEE-1, EGEE-2); other labels: EDG, VDT, ..., LCG, EGEE, ..., AliEn]

Page 11

EGEE and LCG

EGEE builds on the work of LCG to establish a grid operations service

• LCG: a worldwide collaboration of
  • The LHC experiments
  • The Regional Computing Centres
  • Physics institutes

• Mission:
  • Prepare and deploy the computing environment that will be used by the experiments to analyse the LHC data

• Strategy:
  • Integrate thousands of computers at dozens of participating institutes worldwide into a global computing resource
  • Rely on software being developed in advanced grid technology projects, both in Europe and in the USA

Page 12

Grid operations

• Create, operate, support and manage a production quality infrastructure

• Offered services:
  • Middleware deployment and installation
  • Software and documentation repository
  • Grid monitoring and problem tracking
  • Bug reporting and knowledge database
  • VO services
  • Grid management services

Page 13

Operations Structure

• Implement the objectives to provide
  • Access to resources
  • Operation of EGEE as a reliable service
  • Deploy new middleware and resources
  • Support resource providers and users

• With a clear layered structure
  • Operations Management Centre (CERN)
    - Overall grid operations coordination
  • Core Infrastructure Centres
    - CERN, France, Italy, UK, Russia (from M12)
    - Operate core grid services
  • Regional Operations Centres
    - One in each federation, in some cases these are distributed centres
    - Provide front-line support to users and resource centres
    - Support new resource centres joining EGEE in the regions
    - Support deployment to the resource centres
  • Resource Centres
    - Many in each federation of varying sizes and levels of service
    - Not funded by EGEE directly

[Figure: number of instances per layer of the structure: 50+, ~11, 5, 1]
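The layered structure on this slide can be written down as a small data model. The following is only a sketch: the pairing of the instance counts (1, 5, ~11, 50+) with the four layers is a reading of the slide's diagram, and the `escalation_path` helper is a hypothetical illustration of how a problem might move up the layers, not something the talk specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    instances: str  # count as quoted on the slide (assumed pairing with layers)
    role: str


# Top of the hierarchy first, following the order of the slide.
LAYERS = [
    Layer("Operations Management Centre", "1",
          "overall grid operations coordination"),
    Layer("Core Infrastructure Centres", "5",
          "operate core grid services"),
    Layer("Regional Operations Centres", "~11",
          "front-line support for users and resource centres"),
    Layer("Resource Centres", "50+",
          "provide resources; not funded by EGEE directly"),
]


def escalation_path(start: str) -> list[str]:
    """Hypothetical helper: layers from `start` up to the top of the hierarchy."""
    names = [layer.name for layer in LAYERS]
    idx = names.index(start)
    return names[idx::-1]  # walk from `start` back to index 0


for layer in LAYERS:
    print(f"{layer.name:30s} {layer.instances:>4s}  {layer.role}")
print(" -> ".join(escalation_path("Resource Centres")))
```

Keeping the counts as strings preserves the slide's approximate values ("~11", "50+") instead of inventing exact numbers.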

Page 14

EGEE Computing Resources

• Resource Centers foreseen in the project

Region                  CPU nodes  Disk (TB)  CPU nodes  Disk (TB)
CERN                          900        140       1800        310
UK + Ireland                  100         25       2200        300
France                        400         15        895         50
Italy                         553       60.6        679       67.2
North                         200         20       2000         50
South West                    250         10        250         10
Germany + Switzerland         100          2        400         67
South East                    146          7        322         14
Central Europe                385         15        730         32
Russia                         50          7        152         36
Totals                       3084        302       8768        936

April 2004: 10 sites; July 2005: 20 sites

Page 15

Deployment Status

Core Sites already integrated

With the other sites (currently running LCG-1), the expected capacity will exceed the forecasts for 2004: around 4000 CPUs at about 30 sites

Site     CPU
CERN     324
FZK      144
PIC      160
FNAL       4
CNAF     715
Nikhef   250
Taipei    98
RAL      146
Total   1841
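As a quick check of the figures on this slide, the per-site CPU counts can be summed and compared with the roughly 4000 CPUs expected once the remaining sites are integrated. This is only an arithmetic sketch; the variable names are illustrative and the 4000 figure is the approximate value quoted above.

```python
# CPU counts of the core sites already integrated, as listed on the slide.
core_sites = {
    "CERN": 324, "FZK": 144, "PIC": 160, "FNAL": 4,
    "CNAF": 715, "Nikhef": 250, "Taipei": 98, "RAL": 146,
}

total_cpus = sum(core_sites.values())
expected_cpus = 4000  # "around 4000 CPUs at about 30 sites" (approximate)

share = total_cpus / expected_cpus
print(f"core sites: {total_cpus} CPUs, about {share:.0%} of the expected capacity")
```

The sum reproduces the slide's total of 1841 CPUs, a little under half of the expected capacity.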

Page 16

Deployment Issues

• Need to expand on existing LCG service while maintaining stability
  • Add more sites/resources (some have no previous experience with grids)
    - Experience has shown that this can be effort consuming
    - Problematic sites have been causing problems for the whole system
  • Introduce applications and VOs from non-HEP (Bio-medical)
    - Need to clarify processes and information flow

• Portability
  • Support for further platforms (currently just RedHat 7.3)
  • Middleware dependencies and packaging

• Middleware Support
  • Deterministic Support Model has been formalized
  • Essential to have (so far excellent) VDT support for Condor/Globus

• “24x7” operational support
  • Currently have GOC at RAL http://goc.grid-support.ac.uk/
  • Being replicated at Taipei (and maybe Canada?)
  • Prototype accounting system (based on R-GMA) ready for the release in April 2004 (testing, documentation and packaging done)

Page 17

Expected Developments in 2004

• General:
  • LCG-2 will be the service run in 2004 – aim to evolve incrementally
  • Goal is to run a stable service

• Some functional improvements:
  • Extend access to MSS – tape systems, and managed disk pools
  • Distributed vs replicated replica catalogs
    - To avoid reliance on single service instances

• Operational improvements:
  • Monitoring systems – move towards proactive problem finding, ability to take sites on/offline; experiment monitoring
  • Continual effort to improve reliability and robustness
  • Develop accounting and reporting

• Address integration issues:
  • With large clusters, with storage systems
  • Ensure that large clusters can be accessed via grid
  • Issue of integrating with other applications and non-LHC experiments

New release foreseen end April 2004
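The point about avoiding reliance on single service instances can be illustrated with a toy replicated catalog. This is a minimal sketch under stated assumptions: `ReplicaCatalog`, `CatalogUnavailable` and `lookup_with_failover` are hypothetical names invented for the example, the host names are placeholders, and nothing here reflects the actual LCG-2 catalog interfaces.

```python
class CatalogUnavailable(Exception):
    """Raised when a catalog instance cannot be reached (hypothetical)."""


class ReplicaCatalog:
    """Toy in-memory catalog mapping logical file names to replica locations."""

    def __init__(self, name, entries, up=True):
        self.name = name
        self.entries = dict(entries)
        self.up = up  # False simulates an outage of this instance

    def lookup(self, lfn):
        if not self.up:
            raise CatalogUnavailable(self.name)
        return list(self.entries.get(lfn, []))


def lookup_with_failover(catalogs, lfn):
    """Ask each replicated catalog instance in turn; return the first answer."""
    for catalog in catalogs:
        try:
            return catalog.lookup(lfn)
        except CatalogUnavailable:
            continue  # this instance is down, try the next copy
    raise CatalogUnavailable("no catalog instance reachable")


entries = {"lfn:/demo/run1.dat": ["se1.example.org", "se2.example.org"]}
primary = ReplicaCatalog("primary", entries, up=False)  # simulated outage
mirror = ReplicaCatalog("mirror", entries)

locations = lookup_with_failover([primary, mirror], "lfn:/demo/run1.dat")
print(locations)
```

With a single catalog the simulated outage would make every lookup fail; with a second copy the client still gets an answer. A distributed catalog, the other option named on the slide, would instead partition the entries across instances, which this sketch does not cover.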

Page 18

EGEE Middleware Activity

• Hardening and re-engineering of existing middleware functionality, leveraging the experience of partners

• Activity concentrated in a few major centers and organized in “Software clusters”

• Key services:
  • Data Management (CERN)
  • Information Collection (UK)
  • Resource Brokering, Accounting (Italy-Czech Republic)
  • Quality Assurance (France)
  • Grid Security (Northern Europe)
  • Middleware Integration (CERN)
  • Middleware Testing (CERN)

Page 19

Characteristics of the new middleware

• Develop a lightweight stack of generic middleware, useful to LHC experiments and biomedical applications, based upon existing components

• Biomedical applications have important security requirements (e.g. confidentiality) that need to be addressed.

• Focus is on re-engineering and hardening

• Early prototype and fast feedback turnaround envisaged

• Use a service oriented approach

A note on OGSI/WSRF/WS/….

• Still discussing – nothing has settled yet

• Need to take a step back
  • Focus on the service decomposition, semantics, interplay rather than the envelope

• WS seems to provide a useful abstraction
  • Widely used in industry, Grid projects, Internet computing (Google, Amazon)
  • Need to follow standardization efforts to be able to adopt them once settled

Page 20

Middleware approach

• Formed a design team with members from the US and Europe (all members will be part of EGEE as of April 1st)
  • AliEn
  • VDT
  • EDG

• Started intense technical discussion to
  • Break down the proposed architecture to real components
  • Identify critical components (and what existing software to use for the first instance of a prototype)
  • Define semantics and interfaces of these components

• Focus on key services discussed; exploit existing components
  • Initially an ad-hoc installation at CERN and Wisconsin
  • Aim to have first instance ready by end of April
    - Open only to a small user community
    - Expect frequent changes (also API changes) based on user feedback and integration of further services

• Enter a rapid feedback cycle
  • Continue with the design of remaining services
  • Enrich/harden existing services based on early user feedback

Page 21

Initial Services

• Data management
  • Storage Element
    - SRM based; allow POSIX-like access

• Workload management
  • Computing Element
    - Allow pull and push mode

• More discussions needed
  • Information and monitoring
  • Security

• Guiding principles:
  • Lightweight services
    - Easily and quickly deployable
  • Interoperability
    - Allow for multiple implementations (medium/long term)
    - Being based on WS should help
  • Co-existence with deployed infrastructure
    - Run as an application

• Security:
  • Need to integrate components with quite different security models
  • Start with a minimalist approach based on VOMS and myProxy
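The difference between the push and pull modes mentioned for the Computing Element can be sketched with a toy model. Everything here is an assumption made for illustration: `TaskQueue`, `ComputingElement`, `accept` and `pull_from` are hypothetical names, not part of any EGEE, LCG or Condor interface.

```python
from collections import deque


class TaskQueue:
    """Central queue holding jobs that have not yet been placed."""

    def __init__(self):
        self._jobs = deque()

    def submit(self, job):
        self._jobs.append(job)

    def next_job(self):
        # Return the oldest waiting job, or None when the queue is empty.
        return self._jobs.popleft() if self._jobs else None


class ComputingElement:
    """Toy computing element with a fixed number of job slots."""

    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots
        self.running = []

    def accept(self, job):
        """Push mode: a broker picks this element and sends it the job."""
        if self.free_slots == 0:
            return False  # element is full; the broker must look elsewhere
        self.free_slots -= 1
        self.running.append(job)
        return True

    def pull_from(self, queue):
        """Pull mode: the element asks for work while it has free slots."""
        pulled = 0
        while self.free_slots > 0:
            job = queue.next_job()
            if job is None:
                break
            self.accept(job)
            pulled += 1
        return pulled


ce = ComputingElement("ce1", slots=2)
pushed = ce.accept("job-a")   # push: the decision is made outside the element

queue = TaskQueue()
queue.submit("job-b")
queue.submit("job-c")
pulled = ce.pull_from(queue)  # pull: the element fills its remaining slot
print(pushed, pulled, ce.running)
```

In push mode the sender needs an up-to-date view of which elements have capacity; in pull mode the element only asks when it actually has a free slot, so `job-c` simply stays in the queue until capacity appears.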

Page 22

Condor Team contributions to EGEE Middleware

• Many years of experience in designing and running real world distributed systems
  • Essential for relatively new Grid Middleware technologies
  • Many of the problems we see today are related to robustness, deployment, scalability

• Proven scheduling technologies
  • Condor/Condor-G

• Leadership in the new Middleware Design Group
  • Monthly face-to-face meetings covering all essential parts of Middleware
  • Miron Livny’s influence and contributions are essential

• Support of Middleware Components for the existing LCG-2 code base (VDT)
  • Condor, Globus GT2 leveraging the NSF Middleware Initiative
  • Proactive, bilateral problem resolution and enhancements

• US contribution to EGEE project
  • Essential, as many applications are/will be worldwide, not only European

Page 23

EGEE Networking Activity

• Dissemination and outreach
  • Led by TERENA

• User training and induction
  • Led by the University of Edinburgh (NeSC)
  • The success of EGEE is measured by the impact it has on collaborative European science
  • The goal is to support communities of users
  • Therefore induction and training have a high priority from the outset

• Application identification and support
  • Two pilot application centers (for high energy physics and biomedical grids)
  • One more generic component dealing with longer term recruitment and support of other communities

• Policy and International cooperation
  • Establish Grid policy forum
  • Coordinate relations with other projects (EU and beyond)

• Training courses (based on EDG tutorials) will be available from July 2004

• Grid school near Naples, Italy, 18-30 July 2004: http://www.dma.unina.it/~murli/GridSummerSchool2004/

Page 24

Summary

• EGEE is expected to deliver a production Grid infrastructure for scientific applications

• The project started just weeks ago
  • We have a running grid service based on LCG-2
  • All EGEE activities are well advanced and ready to go

• Biomedical and physics are the pilot application domains that will lead the exploitation of the EGEE Grid infrastructure

• US contribution essential through support of existing middleware and design of new generation middleware

• The first project conference will be held in Cork (Ireland), 18-22 April: http://public.eu-egee.org/kickoff/index.html

Page 25

To know more:

EGEE – www.eu-egee.org

Further Information