Worker Node Software Management: the VO perspective

Mark Santcroos Dennis van Dok

Introduction

•  e-BioScience group
   –  Bioinformatics Laboratory
   –  Clinical Epidemiology, Biostatistics and Bioinformatics
   –  Academic Medical Centre, Amsterdam

•  Intermediary between medical researchers and the Dutch NGI

•  Support a wide range of applications in Next Generation Sequencing and Medical Imaging

Worker Node Software

•  Running on 15 sites in the Netherlands

•  Base worker node installation (glite-WN)

•  Proof of Concept (PoC) software installation, heritage of Virtual Laboratory for e-Science (ended 2009)

Perspective

•  Dennis van Dok is part of the team that developed and managed the PoC environment at BiG Grid

•  Mark is a VO manager for the vlemed VO

Job / Application Scenarios

•  Use installed software

•  Application in Job Sandbox

•  Fetch Application using wrapper

•  Upgrade versions in PoC distribution

•  Lobby for new versions with Site admins

Limitations

•  Sandbox solution has size limits

•  Sandbox and wrapper have network overhead

•  Installed version may be out of date or too new

•  Making end users responsible for maintaining applications is not always desirable

•  Site admins have to be in the loop

High Level Goal

•  Have a flexible solution to make software available on the grid for end users that is also manageable from a VO admin perspective.

Packaging Requirements

•  Automatic dependency resolution

•  Supported on Linux

•  Tools for install/update/remove/status

•  Runs entirely in userspace, unprivileged

•  Multiple installed versions of the same software

Unsuitable candidates

•  rpm/yum
•  deb/apt
•  portage
•  Arch User Repository
•  pacman
•  …

•  Reasons: too OS-specific, difficult to manage as an unprivileged user

Pkgsrc

•  Originating in NetBSD
•  Supported on Linux
•  Self-contained
•  Actively maintained
•  Can be used as a non-privileged user
•  Large collection of applications already packaged
•  Can make use of system-provided dependencies
•  Allows maintaining a local set of packages
•  Could add packages to the main distribution
•  Supports binary and source packages

Creating a package

DISTNAME=       vlet-1.3.2
CATEGORIES=     local
MASTER_SITES=   http://orange.ebioscience.amc.nl/pkgsrc/distfiles/
EXTRACT_SUFX=   .zip

MAINTAINER=     [email protected]
HOMEPAGE=       http://orange.ebioscience.amc.nl/pkgsrc/distfiles/
COMMENT=        This is the VL-e Toolkit
LICENSE=        apache-2.0

NO_CONFIGURE=   yes
NO_BUILD=       yes
PKG_DESTDIR_SUPPORT=    user-destdir
INSTALLATION_DIRS=      bin lib

post-extract:
	${CP} ${FILESDIR}/Makefile ${WRKSRC}/Makefile

.include "../../mk/bsd.pkg.mk"

Package Tree Management

•  update-tree.sh
   –  Pull upstream pkgsrc changes
   –  Create tarball
   –  Put on website

Implementation Principles

•  $VO_[VONAME]_SW_DIR is a directory shared between all worker nodes on a site

•  Run with a Software (VO) Manager proxy

•  Install packages per site / cluster / CE
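As an illustration of the first principle, the following minimal sketch resolves the shared software directory for a VO from the environment. The function names are hypothetical, and the name-mangling rule (upper-casing the VO name and replacing characters such as '.' and '-' with '_') is an assumption about how sites export the variable, not something stated on the slides.

```python
import os
import re


def vo_sw_dir_variable(vo_name: str) -> str:
    """Return the name of the environment variable for a VO's software area.

    Assumption: the VO name is upper-cased and every character other than
    a letter or digit (for example '.' or '-') is replaced by '_'.
    """
    return "VO_" + re.sub(r"[^A-Za-z0-9]", "_", vo_name).upper() + "_SW_DIR"


def vo_sw_dir(vo_name: str, environ=None) -> str:
    """Look up the shared software directory for a VO, or raise if unset."""
    env = os.environ if environ is None else environ
    variable = vo_sw_dir_variable(vo_name)
    try:
        return env[variable]
    except KeyError:
        raise RuntimeError(f"{variable} is not set on this node") from None
```

A management job would call `vo_sw_dir("vlemed")` on the worker node and use the result as the installation prefix; failing loudly when the variable is missing avoids installing into an unintended location.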

Architecture

[Diagram. Components: Server (UI), Shared Storage Area, Worker Nodes. Connections: Management Jobs, Mount.]

Managing packages

•  site-pkgtool.sh
   –  Program to manage packages centrally
   –  Initiates grid jobs

•  Install, Remove, Update

•  Init, Reinit, Check, Dump, Info, Version

Script on the worker node

•  pkgsrc-cmd.sh
   –  Wrapper program that runs on the worker node

•  Running as a grid job

Information Management

•  list-installed-packages.sh
   –  Display information about installed packages for sites

•  get-site-status.sh
   –  Gather information from all supported sites

•  verify-package.sh
   –  Check if a certain package is available on a site

•  get-tags.sh
   –  Get all the package tags for the configured sites

Installing a package

•  Check if distribution is fresh

•  Extract tree in scratch space

•  Build package and dependencies

•  Install package in shared software area

•  Install modulefile
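The first two steps above can be sketched as follows. The slides do not say how freshness is determined; comparing a SHA-256 checksum of the local tarball against a published value is an assumption made here for illustration, and all function names are hypothetical. The build and install steps are left out because they depend on the pkgsrc tooling on the node.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_fresh(local_tarball: Path, published_sha256: str) -> bool:
    """Assumed freshness test: the local copy matches the published checksum."""
    return local_tarball.exists() and sha256_of(local_tarball) == published_sha256


def extract_tree(tarball: Path, scratch_root: Path) -> Path:
    """Unpack the package tree into a new directory under the scratch area."""
    workdir = Path(tempfile.mkdtemp(prefix="pkgsrc-", dir=scratch_root))
    with tarfile.open(tarball) as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safe extraction filters.
            archive.extractall(workdir, filter="data")
        else:
            archive.extractall(workdir)
    return workdir
```

Extracting into a fresh scratch directory keeps the build away from the shared software area, so only the finished package touches the file system that all worker nodes mount.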

Environment Modules

•  “The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles.”

•  Select versions
•  Setup environment
•  Integrates with system-provided setup
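To make the role of a modulefile concrete, here is a minimal sketch that renders one for an installed package. The function name and the example prefix are hypothetical, and the choice of `bin` and `lib` subdirectories is an assumption based on the INSTALLATION_DIRS of the example package; the actual modulefiles used by the project may differ.

```python
def render_modulefile(package: str, version: str, prefix: str) -> str:
    """Render a small modulefile that exposes one installed package version.

    Assumption: executables live in <prefix>/bin and libraries in <prefix>/lib.
    """
    lines = [
        "#%Module1.0",
        f'module-whatis "{package} {version}"',
        f"prepend-path PATH {prefix}/bin",
        f"prepend-path LD_LIBRARY_PATH {prefix}/lib",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical installation prefix, for illustration only.
    print(render_modulefile("vlet", "1.3.2", "/shared/vlemed/vlet-1.3.2"), end="")
```

If the result is stored as `vlet/1.3.2` in a directory on the module search path, a job can select that version with `module load vlet/1.3.2`, which is how several versions of the same software can coexist.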

Tags

•  Software Tags in Information System (BDII)

•  Publish installed software versions per CE

•  Used for resource selection by adding them to the “Requirements” expression of a JDL file

•  Use the lcg-ManageVOTag tool to publish a tag

•  Structure of tags is VO-${vo}_SW_${package}
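The tag structure and its use in a JDL can be sketched as below. The helper names are hypothetical. The sketch assumes that the package part of the tag is a single string such as `vlet-1.3.2` and that sites publish tags in the GLUE attribute `GlueHostApplicationSoftwareRunTimeEnvironment`; neither detail is spelled out on the slides.

```python
def software_tag(vo: str, package: str) -> str:
    """Build a tag following the VO-${vo}_SW_${package} structure."""
    return f"VO-{vo}_SW_{package}"


def jdl_requirements(vo: str, package: str) -> str:
    """Build a JDL Requirements line that selects CEs publishing the tag.

    Assumption: tags appear in GlueHostApplicationSoftwareRunTimeEnvironment.
    """
    tag = software_tag(vo, package)
    return (
        f'Requirements = Member("{tag}", '
        "other.GlueHostApplicationSoftwareRunTimeEnvironment);"
    )


if __name__ == "__main__":
    print(jdl_requirements("vlemed", "vlet-1.3.2"))
```

Generating the expression from the same function that names the tag keeps the published tag and the job requirement in step, which matters because matching is done on the exact string.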

Practical issues

•  Tags are not omnipresent

•  Shared area can become a bottleneck

•  No intelligent matching on tags
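The last point can be illustrated with a small sketch. Because tags are compared as exact strings, a job cannot ask for "version 1.3 or newer"; one possible workaround is to do the version comparison on the client side before choosing sites. Everything here is hypothetical: the function names, the site names, and the assumption that the package part of a tag has the form `name-version` with a dotted numeric version.

```python
def parse_tag(tag: str, vo: str):
    """Split a tag into (package name, version tuple), or None if it does not fit.

    Assumption: the package part looks like "name-1.2.3".
    """
    prefix = f"VO-{vo}_SW_"
    if not tag.startswith(prefix):
        return None
    name, separator, version = tag[len(prefix):].rpartition("-")
    if not separator:
        return None
    try:
        return name, tuple(int(part) for part in version.split("."))
    except ValueError:
        return None


def sites_with_at_least(site_tags, vo, package, minimum):
    """Return the sites that publish `package` at version `minimum` or newer."""
    matches = []
    for site, tags in site_tags.items():
        for tag in tags:
            parsed = parse_tag(tag, vo)
            if parsed and parsed[0] == package and parsed[1] >= minimum:
                matches.append(site)
                break
    return sorted(matches)
```

The resulting site list could then be turned into explicit requirements for the job, at the cost of moving part of the matchmaking out of the grid middleware.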

Conclusions

•  Flexible software management system

•  Relieves burden from user

•  Creating packages is still labor-intensive

Discussion

•  One size fits all? (Did we reinvent the wheel?)

•  Connect to EGI AppDB?

•  EMI Community Repositories?

•  Usable for data distribution?

•  Other mechanism for matching?

Links

•  pkgsrc
   –  http://www.netbsd.org/docs/software/packages.html

•  Modules
   –  http://modules.sourceforge.net/

•  BiG Grid
   –  http://www.biggrid.nl/

•  Bioinformatics Laboratory
   –  http://www.bioinformaticslaboratory.nl/

•  Project Code
   –  http://dvandok.github.com/userspace-package-management/

Acknowledgements

•  AMC Bioinformatics Laboratory
   –  Prof. dr. Antoine van Kampen
   –  Dr. Silvia Delgado Olabarriaga
   –  Barbera van Schaik

•  BiG Grid / Nikhef
   –  Jan Just Keijser

Thanks!