EGEE-II TCD 22nd-25th May 2007
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks
Multi-Platform Support
Presenters: Eamonn Kenny, Rafal Lichwala
Date: 15th May 2008
Time: 5-6pm
Location: 40-S2-D01, CERN
EGEE-III CERN 14-16th May 2008
Overview
• History:
  – Porting Multi-Platform Support Activity
  – Timeline 2004-2008
  – Lessons to learn from Timeline
• TCG/TMB Recommendations
• Current Multi-Platform Activities:
  – Debian 4 (x86_64/x86)
  – SL/CentOS 5.1 (x86)
• NMI Support
Timeline for Porting 2004 (January to December)
• IRIX 6.5, VDT 1.2.0: build started
• LCG-2_1_0 release
• LCG-2_2_0 release
• LCG-2_3_0 release
• IRIX 6.5, VDT 1.2.0: build finished
• IRIX 6.5, VDT 1.2.0: edg-job-submit
• AIX 5.2L, PPC: setup
• Build system: edg-build
Timeline for Porting 2005 (January to December)
• LCG-2_4_0: SLC3 release
• LCG-2_4_0 on SuSE 9.3/SL3/RH7.3/RH9/CentOS 4.1: SFT-1 passed
• Mac OS X: G5 machine delivered
• LCG-2_6_0: SLC3 release
• Build systems: edg-build build system; gLite build system / edg-build mixture
Timeline for Porting 2006 (January to December)
• LCG-2_7_0: SLC3 release
• Platforms on the timeline: RH9, Mac OS X 10.3, CentOS 4.2, FC4, SuSE 9, CentOS 4.3
• Build system: gLite build system / edg-build mixture
Timeline for Porting 2007 (January to December)
• gLite 3.1.0-1: SLC4 release
• SL3 and CentOS 4.5: build to 100%
• ETICS 0.7.0-1: client released
• SuSE 9.3, CentOS 4.5: glite-3.1.0-5
• gLite 3.1.0-5: SLC4 release
• Build system: ETICS client with ETICS/gLite packager
Timeline for Porting 2008 (January to December)
• Debian x86: OS installed
• gLite-WN release 3.1.11-1 (SLC4)
• OpenSuSE 10.3 x86: installed
• CentOS/SL 5.1, Debian 4 x86, OpenSuSE 10.3 x86
• gLite-WN release: SL 5.1 (x86_64/x86)
• Build system: ETICS client with ETICS/gLite packager
What do we glean from the Timeline?
• It is easy to fall behind in OS support because platforms get discontinued, but each attempt to support a different OS builds experience. We should attempt builds on the latest stable OS versions.
• The middleware has changed a little in structure, but the basics remain the same.
• The build system has changed 3 times. It now includes better platform support.
• VDT is no longer difficult to build under Unix OSes, whereas porting it was very time consuming in the past.
TMB EGEE-III Recommendations
Platform Type | Architecture | Node Types | Services
SL 5 | x86_64 | All | Servers/Clients
SL 5 | x86 | All | Servers/Clients
Debian 4 | x86 | WN/UI | Clients
Debian 4 | x86_64 | WN/UI | Clients
Mac OS X 10.5 (Leopard) | x86_64 | WN/UI | Clients
Windows (Cygwin/Interix) | x86_64 | WN/UI | Clients
Fat Tarball (Fedora 8, Ubuntu 8, Debian 4) | x86_64 | WN/UI | Clients
PSNC Issues (Debian x86_64)
1. dpkg-gencontrol: error: current build architecture AMD64 does not appear in package's list (x86_64).
   – PSNC resolved this problem by changing the /usr/share/dpkg/cputable file. It is now possible to create debs with AMD64.
2. Case sensitivity in deb naming was fixed using the TCD one-line patch (ETICS client 1.3.5-1 doesn't include it).
3. Simple configuration issues have been pinpointed and fixed.
4. Most TCD patches have been introduced.
5. Other issues resolved over Skype.
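The first two issues both come down to naming conventions: Debian calls the 64-bit x86 architecture amd64 rather than x86_64, and Debian package names are lower case. A minimal sketch of the kind of normalisation a packager could apply is given below. The function names, the mapping table and the underscore-to-hyphen replacement are assumptions made for illustration; this is not the TCD patch or the ETICS packager code.

```python
import re

# Hypothetical mapping from `uname -m` style machine names to Debian
# architecture names (Debian uses "amd64" for x86_64 and "i386" for x86).
DEB_ARCH = {
    "x86_64": "amd64",
    "i386": "i386",
    "i486": "i386",
    "i586": "i386",
    "i686": "i386",
}


def deb_arch(machine):
    """Return the Debian architecture name for a machine name."""
    key = machine.lower()
    return DEB_ARCH.get(key, key)


def deb_package_name(name):
    """Lower-case a component name and keep only characters Debian allows.

    Assumption: underscores are replaced by hyphens; any other character
    outside [a-z0-9+.-] is dropped.
    """
    lowered = name.lower().replace("_", "-")
    cleaned = re.sub(r"[^a-z0-9+.-]", "", lowered)
    if len(cleaned) < 2 or not cleaned[0].isalnum():
        raise ValueError("not a usable Debian package name: %r" % name)
    return cleaned
```

For example, `deb_arch("x86_64")` gives `amd64`, the name that dpkg expects in a package's architecture list.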
TCD ETICS Build Status
Debian/SL5 Outstanding Bugs
Affected Package(s) or sub-system | Bug Report Numbers | Number of Patches | Work-Package/Responsible | Affected Packages (Debian, SL5): CERN | TCD | TCD
dcache-client | #35493 | None | SA3 | 1 | 1 | 1
WMS | #35836, #35878 | 2 | JRA1 | 3 | 1 | -
R-GMA | #27471, #32296, #32304 | 2, 2 PW* | JRA1 | 3 | - | -
SAM | #35499 | 2 | JRA1 | 3 | 3 | -
LCG-DM | #35550 | None | JRA1 | 3 | 3 | -
gridsite | #35502 | None | JRA1 | 5 | 5 | 5
edg-gridftp-client | #36472 | None | SA3/ETICS | 1 | - | -
VDT Essentials | None | None | SA3 | - | - | -
Packager | #34130 | 1 (2 PW*) | ETICS | 11 debs | 0 RPMS | -
* PW – Project Wide SPEC File Patch Required
Debian Development Strategy
• org.glite branch for multi-platform support:
  – glite_branch_3_1_0_dev (previously glite_branch_3_1_0_for_porting)
• Change subsystems to go ahead of glite_branch_3_1_0:
  – glite-wms_branch_3_1_0 (fixes to WMS code on cvs HEAD)
  – glite-legacy_branch_3_1_0_dev (for correct edg-gridftp-client)
  – gridsite-core_R_1_1_18_1_dev
• Current default configurations to go ahead of glite_branch_3_1_0:
  – globus v. 4.0.3-VDT-1.6.1-5
  – vdt_globus_essentials v. 4.0.3-VDT-1.6.1-5
  – vdt_globus_rm_essentials v. 4.0.3-VDT-1.6.1-5
SL/CentOS 5.1 Differences
• org.glite branch for multi-platform support:
  – glite_branch_3_1_0_dev (previously glite_branch_3_1_0_for_porting)
• Change subsystems to go ahead of glite_branch_3_1_0:
  – glite-wms_branch_3_1_0 (fixes to WMS code on cvs HEAD)
  – glite-legacy_branch_3_1_0_dev (for correct edg-gridftp-client)
  – gridsite-core_R_1_1_18_1_dev
• Current default configurations to go ahead of glite_branch_3_1_0:
  – globus v. 4.0.5-VDT-1.8.1-1_dev
  – vdt_globus_essentials v. 4.0.5-VDT-1.8.1-2_dev
  – vdt_globus_rm_essentials v. 4.0.5-VDT-1.8.1-2_dev
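The Debian and SL/CentOS 5.1 strategies share the same branches and differ only in the default Globus/VDT configurations. A small sketch of that comparison follows; the version strings are copied from the two slides, while the dictionary layout and the helper name are assumptions for illustration.

```python
# Default configurations per platform, as listed on the two slides.
DEBIAN_4 = {
    "globus": "4.0.3-VDT-1.6.1-5",
    "vdt_globus_essentials": "4.0.3-VDT-1.6.1-5",
    "vdt_globus_rm_essentials": "4.0.3-VDT-1.6.1-5",
}

SL_5_1 = {
    "globus": "4.0.5-VDT-1.8.1-1_dev",
    "vdt_globus_essentials": "4.0.5-VDT-1.8.1-2_dev",
    "vdt_globus_rm_essentials": "4.0.5-VDT-1.8.1-2_dev",
}


def differing_versions(a, b):
    """Return {component: (version_in_a, version_in_b)} where they differ."""
    shared = sorted(a.keys() & b.keys())
    return {name: (a[name], b[name]) for name in shared if a[name] != b[name]}


if __name__ == "__main__":
    for name, (deb, sl) in differing_versions(DEBIAN_4, SL_5_1).items():
        print(f"{name}: Debian 4 = {deb}, SL 5.1 = {sl}")
```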
Outstanding ETICS Issues: Patching
Component locking per platform and Patching System
Alternative:
1. Clone the configuration.
2. Upload a patch to AFS (or point to a local patch for wget).
3. Introduce the patch into the new configuration.
4. Create the configuration at the org.glite.node.WN or org.glite branch to point to the new configuration.
5. Wait for the fix by JRA1/SA3/ETICS to arrive in a developer's branch.
6. Remove the cloned configuration.
• Procedure: Create, Upload, Patch, Metadata, Integrate, Delete
• Caveat: No partner such as Catania, PSNC or HPNC can do this!
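Since the six steps must happen in order for every patched component, the procedure can be tracked as a simple ordered checklist. The sketch below is illustrative only: the class, the step identifiers and the component name are hypothetical, and it does not call any ETICS command.

```python
# Step names follow the "Procedure" line on the slide, in order.
STEPS = ["create", "upload", "patch", "metadata", "integrate", "delete"]


class PatchWorkflow:
    """Track one cloned configuration through the six-step procedure."""

    def __init__(self, component):
        self.component = component  # hypothetical component name
        self.done = []

    def next_step(self):
        """Return the next step to perform, or None when finished."""
        if len(self.done) < len(STEPS):
            return STEPS[len(self.done)]
        return None

    def complete(self, step):
        """Mark a step as done; refuse steps that are out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(
                f"{self.component}: expected {expected!r}, got {step!r}"
            )
        self.done.append(step)


if __name__ == "__main__":
    wf = PatchWorkflow("example-component")
    wf.complete("create")
    wf.complete("upload")
    print(wf.next_step())
```

A tracker like this makes it visible which cloned configurations are still waiting at the "integrate" step and therefore cannot yet be deleted.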
Outstanding ETICS Issues: Global Replace
Project Wide RPM SPEC File Fixes (new: rpmbuild issue)
– %build issue and Copyright issue
Agreed Procedure to Implementation:
– Report each bug in Savannah to each subsystem developer.
– We can either provide a source diff patch, describe the changes in words, or provide a sed-type script to perform the fix.
– This is time consuming but necessary, since no one person can edit and commit all subsystem fixes.
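A sed-type script for a project-wide SPEC file fix could look like the sketch below. The slide does not say what the Copyright fix actually is; the sketch assumes, purely for illustration, that it means renaming the `Copyright:` tag to `License:`. The function names and the directory layout are also assumptions.

```python
import re
from pathlib import Path

# Assumed fix: rename the "Copyright:" preamble tag to "License:".
# The real project-wide change may be different.
_COPYRIGHT_TAG = re.compile(r"^Copyright(\s*):", re.MULTILINE | re.IGNORECASE)


def fix_spec_text(text):
    """Return the SPEC file text with the Copyright tag renamed."""
    return _COPYRIGHT_TAG.sub(r"License\1:", text)


def fix_tree(root):
    """Apply the fix to every *.spec file under root; return changed paths."""
    changed = []
    for spec in sorted(Path(root).rglob("*.spec")):
        old = spec.read_text()
        new = fix_spec_text(old)
        if new != old:
            spec.write_text(new)
            changed.append(spec)
    return changed
```

Because only the subsystem developers can commit, the output of such a script would be attached to the Savannah bug as a diff rather than committed directly.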
NMI Support
• Working on an automated procedure for NMI/Condor Submit & NMI/Condor Execute.
• Will include some firewall traversal.
• Condor group are interested in NMI Build & Test. Currently NMI building is all that is used in practice.
• TCD would like to put together some scripts to perform the Condor/NMI installation automatically.
• A time-separated build using nmi.<local.network> or nmi.cern.ch as a submitter should be provided for.
• This will allow all partners to register packages in AFS in CERN, creating a faster certification cycle.
Ideal Distributed Building Infrastructure
(Diagram. Labels: ETICS checkout, ETICS build, ETICS test; Web Server etics.cern.ch; NMI Server (NMI Webserver, NMI archiver, NMI/ETICS dB host) at CERN and at another site; Condor CM; Condor execute and execute node; NMI Hawkeye; NMI Submit; ETICS client connecting via https & DN from anywhere; sites: CERN, 3rd party, 4th party.)
Proposed TCD NMI Attempt (i.e. point execute nodes to different masters on a 12-hour cycle)
(Diagram. Labels: Web Server etics.cern.ch; NMI Servers, including nmi.cs.tcd.ie; Execute node 1 (https & DN); Condor configuration files condor_config, condor_config.local and condor_config.cern; submit times 6am to 6pm daily; cronjob run at 6pm to 6am; experimental local builds; developer builds; firewalls: Local Firewall (full control), TestGrid Firewall (full control), CAG Firewall (partial control), Dept Firewall (external control), TCD Firewall (external control).)
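The 12-hour cycle amounts to choosing a Condor master from the time of day and writing it into the execute node's local configuration. The sketch below shows only that selection logic. Which master serves which window is an assumption (the local server during 6am to 6pm, nmi.cern.ch overnight), as are the function names; `CONDOR_HOST` is the standard Condor setting that names the central manager.

```python
from datetime import time

# Window boundaries taken from the slide: 6am to 6pm, and 6pm to 6am.
DAY_START = time(6, 0)
DAY_END = time(18, 0)


def master_for(now, day_master="nmi.cs.tcd.ie", night_master="nmi.cern.ch"):
    """Return the master an execute node should point to at time `now`.

    Assumption: the local master is used in the day window and the
    CERN master overnight; the slide does not state the assignment.
    """
    if DAY_START <= now < DAY_END:
        return day_master
    return night_master


def condor_host_line(now):
    """Return the line a cron job could write into condor_config.local."""
    return f"CONDOR_HOST = {master_for(now)}"


if __name__ == "__main__":
    print(condor_host_line(time(9, 30)))
    print(condor_host_line(time(22, 0)))
```

A cron job at each boundary would rewrite the local configuration file with this line and then ask the Condor daemons to re-read their configuration.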