EGEE-II TCD 22nd-25th May 2007 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Multi-Platform Support Presenters: Eamonn Kenny, Rafal Lichwala Date: 15th May 2008 Time: 5-6pm Location: 40-S2-D01, CERN


EGEE-II TCD 22nd-25th May 2007

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Multi-Platform Support

Presenters: Eamonn Kenny, Rafal Lichwala
Date: 15th May 2008
Time: 5-6pm
Location: 40-S2-D01, CERN


EGEE-III CERN 14-16th May 2008 2

Overview

• History:
  – Porting Multi-Platform Support Activity
  – Timeline 2004-2008
  – Lessons to learn from Timeline
• TCG/TMB Recommendations
• Current Multi-Platform Activities:
  – Debian 4 (x86_64/x86)
  – SL/CentOS 5.1 (x86)
• NMI Support


Timeline for Porting 2004

[Timeline chart, Jan-Dec]

• IRIX 6.5, VDT 1.2.0: build started
• LCG-2_1_0 release
• LCG-2_2_0 release
• LCG-2_3_0 release
• IRIX 6.5, VDT 1.2.0: build finished
• IRIX 6.5, VDT 1.2.0: edg-job-submit
• AIX 5.2L PPC setup
• Build system: edg-build build system


Timeline for Porting 2005

[Timeline chart, Jan-Dec]

• LCG-2_4_0 SLC3 release
• LCG-2_4_0 on SuSE 9.3/SL3/RH7.3/RH9/CentOS 4.1: SFT-1 passed
• Mac OS X G5 machine delivered
• LCG-2_6_0 SLC3 release
• Build systems: edg-build build system; gLite build system/edg-build mixture


Timeline for Porting 2006

[Timeline chart, Jan-Dec]

• LCG-2_7_0 SLC3 release
• Platforms: RH9, Mac OS X 10.3, CentOS 4.2, FC4, SuSE9, CentOS 4.3
• Build system: gLite build system/edg-build mixture


Timeline for Porting 2007

[Timeline chart, Jan-Dec]

• gLite 3.1.0-1 SLC4 release
• SL3 and CentOS 4.5: build to 100%
• ETICS 0.7.0-1 client released
• SuSE 9.3, CentOS 4.5: glite-3.1.0-5
• gLite 3.1.0-5 SLC4 release
• Build system: ETICS client with ETICS/gLite packager


Timeline for Porting 2008

[Timeline chart, Jan-Dec]

• Debian x86: OS installed
• gLite-WN release 3.1.11-1 (SLC4)
• OpenSuSE 10.3 x86: installed
• CentOS/SL 5.1
• Debian 4 x86
• OpenSuSE 10.3 x86
• gLite-WN release, SL 5.1 (x86_64/x86)
• Build system: ETICS client with ETICS/gLite packager


What do we glean from the Timeline?

• It is easy to fall behind in OS support as platforms are discontinued, but every attempt to support a different OS builds experience. We should attempt builds on the latest stable OS versions.

• Middleware has changed a little in structure but the basics remain the same.

• The build system has changed three times. It now includes better platform support.

• VDT is no longer difficult to build under Unix OSes, whereas it was very time consuming to port in the past.


TMB EGEE-III Recommendations

Platform Type                                Architecture   Node Types   Services
SL 5                                         x86_64         All          Servers/Clients
SL 5                                         x86            All          Servers/Clients
Debian 4                                     x86            WN/UI        Clients
Debian 4                                     x86_64         WN/UI        Clients
Mac OS X 10.5 (Leopard)                      x86_64         WN/UI        Clients
Windows (Cygwin/Interix)                     x86_64         WN/UI        Clients
Fat Tarball (Fedora 8, Ubuntu 8, Debian 4)   x86_64         WN/UI        Clients


PSNC Issues (Debian x86_64)

1. dpkg-gencontrol: error: current build architecture AMD64 does not appear in package's list (x86_64).
   – PSNC resolved this problem by changing the /usr/share/dpkg/cputable file. It is now possible to create debs with AMD64.

2. Case sensitivity in deb naming was fixed using the TCD one-line patch (ETICS client 1.3.5-1 does not include it).

3. Simple configuration issues have been pinpointed and fixed.

4. Most TCD patches have been introduced.

5. Other issues resolved over Skype.
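
Issues 1 and 2 both come down to Debian naming conventions: Debian calls the 64-bit x86 architecture amd64 rather than x86_64, and Debian package names are lowercase with no underscores. A minimal Python sketch of both normalisations follows. The helper names and the example package name are hypothetical; this is not the PSNC cputable change or the TCD one-line patch, only an illustration of the rules involved.

```python
# Sketch only: hypothetical helpers, not ETICS client or dpkg code.

# Kernel machine names (as reported by `uname -m`) mapped to Debian
# architecture names. Only the x86 family discussed here is listed.
DEBIAN_ARCH = {
    "x86_64": "amd64",
    "i386": "i386",
    "i486": "i386",
    "i586": "i386",
    "i686": "i386",
}


def debian_arch(machine: str) -> str:
    """Return the Debian architecture name for a kernel machine name."""
    try:
        return DEBIAN_ARCH[machine.lower()]
    except KeyError:
        raise ValueError(f"no Debian architecture known for {machine!r}")


def debian_package_name(name: str) -> str:
    """Lowercase a package name and replace underscores with hyphens."""
    return name.lower().replace("_", "-")


def deb_filename(name: str, version: str, machine: str) -> str:
    """Build a conventional <name>_<version>_<arch>.deb file name."""
    return f"{debian_package_name(name)}_{version}_{debian_arch(machine)}.deb"
```

For example, deb_filename("glite-WMS_common", "3.1.0-1", "x86_64") yields a lowercase name with the amd64 suffix; the package name and version here are made up for illustration.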


TCD ETICS Build Status


Debian/SL5 Outstanding Bugs

Affected Package(s)   Bug Report Numbers        Number of    Work-Package/   Affected Packages (Debian, SL5)
or sub-system                                   Patches      Responsible     CERN      TCD       TCD
dcache-client         #35493                    None         SA3             1         1         1
WMS                   #35836, #35878            2            JRA1            3         1         -
R-GMA                 #27471, #32296, #32304    2, 2 PW*     JRA1            3         -         -
SAM                   #35499                    2            JRA1            3         3         -
LCG-DM                #35550                    None         JRA1            3         3         -
gridsite              #35502                    None         JRA1            5         5         5
edg-gridftp-client    #36472                    None         SA3/ETICS       1         -         -
VDT Essentials        None                      None         SA3             -         -         -
Packager              #34130                    1 (2 PW*)    ETICS           11 debs   0 RPMS    -

* PW – Project Wide SPEC File Patch Required


Debian Development Strategy

• org.glite branch for multi-platform support:
  – glite_branch_3_1_0_dev (previously glite_branch_3_1_0_for_porting)
• Change subsystems to go ahead of glite_branch_3_1_0:
  – glite-wms_branch_3_1_0 (fixes to WMS code on cvs HEAD)
  – glite-legacy_branch_3_1_0_dev (for correct edg-gridftp-client)
  – gridsite-core_R_1_1_18_1_dev
• Current default configurations to go ahead of glite_branch_3_1_0:
  – globus v. 4.0.3-VDT-1.6.1-5
  – vdt_globus_essentials v. 4.0.3-VDT-1.6.1-5
  – vdt_globus_rm_essentials v. 4.0.3-VDT-1.6.1-5


SL/CentOS 5.1 Differences

• org.glite branch for multi-platform support:
  – glite_branch_3_1_0_dev (previously glite_branch_3_1_0_for_porting)
• Change subsystems to go ahead of glite_branch_3_1_0:
  – glite-wms_branch_3_1_0 (fixes to WMS code on cvs HEAD)
  – glite-legacy_branch_3_1_0_dev (for correct edg-gridftp-client)
  – gridsite-core_R_1_1_18_1_dev
• Current default configurations to go ahead of glite_branch_3_1_0:
  – globus v. 4.0.5-VDT-1.8.1-1_dev
  – vdt_globus_essentials v. 4.0.5-VDT-1.8.1-2_dev
  – vdt_globus_rm_essentials v. 4.0.5-VDT-1.8.1-2_dev
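
The SL/CentOS 5.1 setup shares its branches with the Debian Development Strategy slide and differs only in the default configurations. A short Python sketch makes that comparison explicit. The dictionary layout and the function name are assumptions made for illustration; the branch names and version strings are copied from the two slides.

```python
# Sketch only: the data layout is hypothetical; values come from the slides.

SHARED_BRANCHES = {
    "org.glite": "glite_branch_3_1_0_dev",
    "glite-wms": "glite-wms_branch_3_1_0",
    "glite-legacy": "glite-legacy_branch_3_1_0_dev",
    "gridsite-core": "gridsite-core_R_1_1_18_1_dev",
}

DEBIAN = {
    **SHARED_BRANCHES,
    "globus": "4.0.3-VDT-1.6.1-5",
    "vdt_globus_essentials": "4.0.3-VDT-1.6.1-5",
    "vdt_globus_rm_essentials": "4.0.3-VDT-1.6.1-5",
}

SL5 = {
    **SHARED_BRANCHES,
    "globus": "4.0.5-VDT-1.8.1-1_dev",
    "vdt_globus_essentials": "4.0.5-VDT-1.8.1-2_dev",
    "vdt_globus_rm_essentials": "4.0.5-VDT-1.8.1-2_dev",
}


def config_differences(a: dict, b: dict) -> dict:
    """Return {component: (value_in_a, value_in_b)} for entries that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for component, (deb, sl5) in config_differences(DEBIAN, SL5).items():
        print(f"{component}: Debian {deb} / SL5 {sl5}")
```

Run as a script, this lists only the three globus/VDT components, since every branch entry is identical on both platforms.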


Outstanding ETICS Issues: Patching

Component locking per platform and Patching System

Alternative:

1. Clone the configuration.
2. Upload a patch to AFS (or point to a local patch for wget).
3. Introduce the patch into the new configuration.
4. Create the configuration at the org.glite.node.WN or org.glite branch to point to the new configuration.
5. Wait for the fix by JRA1/SA3/ETICS to arrive in a developer's branch.
6. Remove the cloned configuration.

• Procedure: Create, Upload, Patch, Metadata, Integrate, Delete
• Caveat: No partner such as Catania, PSNC or HPNC can do this!


Outstanding ETICS Issues: Global Replace

Project Wide RPM SPEC File Fixes (new: rpmbuild issue)
– %build issue and Copyright issue

Agreed Procedure to Implementation:
– Report each bug in Savannah to each subsystem developer.
– We can either provide the patch as a Linux source diff, describe the changes in words, or provide a sed-type script to perform the fix.
– This is time consuming but necessary, since no one person can edit and commit all subsystem fixes.
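
The slide does not spell out the Copyright issue. Assuming it refers to the legacy Copyright: preamble tag, which RPM superseded with License:, a sed-type fix could be sketched in Python as below. The function name is hypothetical and the sample spec text in the usage note is invented; the %build issue is not covered because the slide gives no detail about it.

```python
import re

# Sketch only: assumes the "Copyright issue" means the legacy Copyright: tag.
# Matches a preamble line such as "Copyright: Apache 2.0" in any letter case.
# Comment lines ("# Copyright (c) ...") are left alone because they do not
# start with the tag itself followed by a colon.
_COPYRIGHT_TAG = re.compile(r"^([ \t]*)copyright([ \t]*:)", re.IGNORECASE | re.MULTILINE)


def fix_copyright_tag(spec_text: str) -> str:
    """Rename the Copyright: tag to License:, leaving its value untouched."""
    return _COPYRIGHT_TAG.sub(r"\1License\2", spec_text)


if __name__ == "__main__":
    import sys

    # Usage: python fix_spec.py < old.spec > new.spec
    sys.stdout.write(fix_copyright_tag(sys.stdin.read()))
```

A script of this kind could be attached to each Savannah report so that every subsystem developer applies the same change. Free text inside %description that happens to start a line with "Copyright:" would also be rewritten, so the output should be reviewed before committing.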


NMI Support

• Working on an automated procedure for NMI/Condor Submit & NMI/Condor Execute.
• Will include some firewall traversal.
• Condor group are interested in NMI Build & Test. Currently NMI building is all that is used in practice.
• TCD would like to put together some scripts to perform the Condor/NMI installation automatically.
• A time-separated build using nmi.<local.network> or nmi.cern.ch as a submitter should be provided for.
• This will allow all partners to register packages in AFS in CERN, creating a faster certification cycle.


Ideal Distributed Building Infrastructure

[Diagram. Labelled components:
– ETICS client (ANYWHERE): ETICS checkout, ETICS build, ETICS test; connects via https & DN
– CERN: Web Server etics.cern.ch; NMI Server (NMI Webserver, NMI archiver, NMI/ETICS dB host); NMI Submit; Condor CM
– Another Site: NMI Server (NMI Webserver, NMI archiver, NMI/ETICS dB host); NMI Submit; Condor CM
– 3rd PARTY / 4th PARTY: Execute node (Condor execute, NMI Hawkeye)]


Proposed TCD NMI Attempt (i.e. point execute nodes to different masters on a 12 hour cycle)

[Diagram. Labelled components:
– Web Server etics.cern.ch with NMI Server; Condor configuration: condor_config, condor_config.local
– nmi.cs.tcd.ie NMI Server; Condor configuration: condor_config, condor_config.local
– Execute node, reached via https & DN; Condor configuration: condor_config, condor_config.local, condor_config.cern
– Time windows: submit times 6am to 6pm daily; cronjob run at 6pm to 6am
– Build types: Experimental, Local Builds, Developer Builds
– Firewalls: Local Firewall (full control), TestGrid Firewall (full control), CAG Firewall (partial control), Dept Firewall (external control), TCD Firewall (external control)]
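
The 12 hour cycle amounts to a cron job that rewrites which Condor master an execute node reports to. A minimal Python sketch of that selection logic follows. It rests on several assumptions not stated in the slides: that the 6am to 6pm window belongs to one master and the 6pm to 6am window to the other, that the switch is made through the CONDOR_HOST setting in condor_config.local, and that the function names are purely illustrative.

```python
from datetime import datetime, time

# Sketch only: the assignment of masters to windows is an assumption.
DAY_START = time(6, 0)   # 6am
DAY_END = time(18, 0)    # 6pm


def select_master(now: datetime, day_master: str, night_master: str) -> str:
    """Return the master an execute node should point to at time `now`."""
    if DAY_START <= now.time() < DAY_END:
        return day_master
    return night_master


def condor_local_override(master: str) -> str:
    """Render the line a cron job could write into condor_config.local."""
    return f"CONDOR_HOST = {master}\n"
```

A cron job running at each window boundary could write the rendered line into condor_config.local and then ask the Condor daemons to re-read their configuration, for example with condor_reconfig. Which master owns which window, and the host name of the CERN master, would need to be confirmed against the actual setup.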