INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July...

35
INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@c ern.ch

Transcript of INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July...

Page 1: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

ETICS

The Software Engineering Infrastructure

Physics Services MeetingGeneva, July 3rd 2008

Alberto Di Meglio

CERN - ETICS

[email protected]

Page 2: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Contents

• General overview• Architecture• What ETICS is and what it is not• System features highlights• Guidelines and ‘constraints’• Benefits of using ETICS• Problems and Issues• Examples of usage• Conclusions

2ETICS@PSM - 3 July 2008

Page 3: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

General Overview

• ETICS stands for

E-infrastructure for Testing, Integration and Configuration of Software

• ETICS started in January 2006 and ended in December 2007. It had a very successful review exceeding all stated objectives

• ETICS 2 started in March 2008 (ranked first in the proposal selection process) and it’s supposed to run until February 2010 (mostly with EC funding) and then on its own

• ETICS is not ‘just’ a build system, it’s a complete infrastructure for building, testing, configuring and managing software projects

3ETICS@PSM - 3 July 2008

Page 4: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Partners

4ETICS@PSM - 3 July 2008

Page 5: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

ArchitectureETICS is not ‘just’ a build system

1

Build/TestArtefacts

ConfigurationWeb Service

Report/MetricsDB

ConfigurationDB

Execution Engines(Metronome, gLite, UNICORE, etc)

Command Line User Interface

RepositoryWeb Service

ETICS Infrastructure

Physical WorkerNodes

Web Portal

Virtual OS Images

5ETICS@PSM - 3 July 2008

Page 6: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

What ETICS is

• It’s a software engineering management system• It’s a build and test infrastructure• It provides tools and resources to configure, manage and

analyse build and test runs• It provides a common interface to diverse projects to

facilitate knowledge sharing and operations management• It has an open repository of configuration metadata,

packages, reports. The goal is to share information, but also to reliably store and preserve information

• It has a plugin-based architecture and APIs to allow integrating ETICS into existing processes and extending it with custom actions

• It’s multi-platform and independent from any specific build or test tool

6ETICS@PSM - 3 July 2008

Page 7: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

What ETICS is not

• It’s not a replacement for source code management systems like CVS or Subversion. ETICS uses such systems and can be easily extended with support for more

• It’s not a replacement for build tools like make, ant, etc. ETICS uses whatever native tool a specific project decides to use and doesn’t force the usage of any particular tool

• It’s not a replacement for QA tools like checkstyle, junit, cppunit, coverage tools, etc. ETICS provides a rich library of such tools that projects can activate as they wish when running builds and tests

7ETICS@PSM - 3 July 2008

Page 8: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Typical ETICS Execution Sequence

etics-get-project etics-checkout etics-build/test

Extract metadata information about a project from the ETICS DB

Extract configuration information from the ETICS DB and source/binary packages from different repositories

cvs svn http

Execute the build or test operations

make

ant

Unit tests, coverage, service mgmt, packaging, reporting

8ETICS@PSM - 3 July 2008

Page 9: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Plugins and Metrics Collectors

• The ETICS system is plugin-based (like for example ECLIPSE)

• It provides a core set of tools and a published specification for developing additional plugins

• Plugins are essentially thin wrappers around existing or custom tools providing very specific functionality like packaging, static or dynamic testing, standards compliance testing, service installation and management, reporting, etc

• Plugins can publish information in the ETICS system in the form of metrics, which can then be used to do data mining or trend analysis using the available ETICS reporting tools

9ETICS@PSM - 3 July 2008

Page 10: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Examples of metrics collectorsMetrics Type Programming

languages/ technologies

Tool ETICS Plugin

Complexity static Java

Python

Javancss JCcnPlugin

PyComplexityPlugin.py

Design quality static Java Jdepend JDependPlugin

N of potential bugs static C/C++

Python

Perl

PHP

Java

Flawfinder,

RATS

PMD

Findbugs

CFlawfinderPlugin

CPyPhpRatsPlugin

JPmdPlugin

JFindbugsPlugin

N of potential bugs dynamic C/C++ Valgrind CValgrindPlugin

Lines of code static All SLOCCount SLOCCountPlugin

Coverage dynamic Java Emma

Cobertura

JUnitemmaPlugin

JCoberturaPlugin

Unit tests success rate

dynamic Java

Python

JUnit

PyUnit

JUnitPlugin

JUnitreportsPlugin.py PyUnitPlugin.py

Compliance with standards

static IPv6

WSI

IPv6Plugin WSInteroperabilityPlugin

Profiling dynamic C/C++

Java

Jrat

Valgrind

JRatPlugin

CValgrindPlugin 10ETICS@PSM - 3 July 2008

Page 11: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Examples of metrics collectors

11ETICS@PSM - 3 July 2008

Page 12: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

The ‘Distributed Testing’ Feature

• One of the last features to be added, still in experimental mode• It allows designing complex tests involving several services and

test applications to be deployed on separate nodes• ETICS analyses the definition and deploys the services on the

necessary nodes• A synchronization mechanism orchestrate the start/stop of

services and the execution of the test applications• At the end the results are collected and reported as a single

report• It is not yet usable by ‘any user’, it requires some deep

knowledge of the system to be tested• It has been successfully used in a number of cases by the

DILIGENT project• It has the strong prerequisite that the services to be deployed

have to be managed without user intervention, which is not always the case

12ETICS@PSM - 3 July 2008

Page 13: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

The ETICS Infrastructure

• The ETICS Infrastructure is currently based on three resource pools managed by Condor, at CERN, UoW and INFN

• There are about 450 cores available in the three sites with more than 40 different types of platforms

• At CERN in particular there are 120 cores and 12 different platforms (a platform is a combination of operating system, CPU architecture and compiler, ex: slc4_ia32_gcc36, deb4_x86_64_gcc412, etc)

• Additionally various nodes are attached to the CERN pool from third-party projects or organizations for specific use (4D Soft Ltd for DILIGENT in Hungary, GARR for EUChinaGrid and EGEE/SA2, IN2P3 for EGEE/SA2)

13ETICS@PSM - 3 July 2008

Page 14: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

The CERN Resource Pool

14ETICS@PSM - 3 July 2008

Page 15: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Guidelines and Constraints

• The ETICS System is quite flexible in terms of usage policies

• However, there are recommendations and constraints• The most important are:

• Organize the software in well manageable units (components) and functional sets (subsystems). This is not mandatory, projects are free to organize the software as they wish

• Version the software when changes occur to be able to reproduce past configurations and generate reliable trend analysis data. Not mandatory, but there is a lot to lose in not doing it

15ETICS@PSM - 3 July 2008

Page 16: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Guidelines and Constraints (2)

• Lock the software when it’s ready to be released, so it cannot be modified anymore, and give it a new name/version/date. This is not mandatory, however it is strongly recommended and related to the next point

• Only locked configurations can be published in the one and only public permanent Repository. This is a hard constraint. However, users can create as many volatile repositories as they wish during the development and test phases

16ETICS@PSM - 3 July 2008

Page 17: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Benefits of Using ETICS

• Common set of interfaces and tools to different projects• Simplifies working with different projects and transferring

knowledge among technical people• Configuration information is stored in a single place with a

common schema and preserved while people or projects change

• Multi-platform support allow building and testing the same code-base on as many platforms as needed with negligible additional effort

• The resource infrastructure allows running several builds and tests without having to set up dedicated machines

• All machines are configured in the same way or use virtual images in order to guarantee reproducibility of results

• Standard build and test tools are available out-of-the-box and can be applied in the same way to different build/test runs and different projects over time

17ETICS@PSM - 3 July 2008

Page 18: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Benefits of Using ETICS (2)

• The built-in packaging system abstracts the package information and automatically builds the appropriate package format for the different platforms (tarballs, RPMS, debs, MSIs, etc). No need to maintain separate scripts

• Reporting tools are available out-of-the-box to control the software evolution, pinpoint issues and quickly solve problems

• APIs and plugins allow to integrate the tools into existing development processes and extend them with custom tests or reports as needed

• New plugins and tools developed by a project can be of benefit to all other projects (no duplication of effort)

• ETICS provides a large set of external libraries in source format or already precompiled for many platforms (the ‘externals’ project). Components can set dependencies on these libraries and automatically configure their workspace.

18ETICS@PSM - 3 July 2008

Page 19: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

The ‘externals’ project

19ETICS@PSM - 3 July 2008

Page 20: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Problems and Issues:Technical problems

PERFORMANCE• The system was designed for integrators and managers

and the speed of execution of individual commands was not a priority compared to support for multiple platforms, reporting, common interfaces

• Over time it’s been used more and more by individual developers, whose primary concern is performance of single builds or tests, rather than quality evolution over time

• The original XML-based implementation did not scale, new implementation is based on sqlite, the de-facto standard in multiplatform embedded database engines

• New requirements have been analysed and a new version of the tools is being deployed that improves performance from 200% to 900% depending on the task to be executed and the available hardware

20ETICS@PSM - 3 July 2008

Page 21: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Performance ComparisonsCheckout and workspace configuration

Old client New client Speed-up Modules

gLite ~35h ~4h 875% 384

WMS 1h 43m 41s 14m 16s 735% 110

Data Management

1h 12m 18s 10m 34s 720% 104

Security 29m 38s 5m 45s 483% 65

LB 14m 32s 2m 51s 460% 42

The checkout and workspace configuration (etics-checkout) is the command with the highest overhead. The table above compares the execution time for the entire gLite project and some individual subsystemsThe commands are executed on a standard CERN desktop with SLC4 32 bit. It may take longer on a laptop or older machines (it requires at least 1GB RAM to handle the entire gLite project) and from outside CERN (due to CVS)

21ETICS@PSM - 3 July 2008

Page 22: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Problems and Issues:Technical problems

USER-FRIENDLYNESS• The system has some learning curve, although it depends on

the user background and experience• In part this is due to the fact that the scope of the system is

quite broad and there are many different commands and options to be used

• But it is true that the interfaces (both web and CLI) could benefit from a number of techniques to make their usage friendlier (wizards, templates, etc)

• User documentation is extensive, but not in a format that developers like to use. We are moving now from a single Word document to lightweight web-based help pages and tutorials

• It must be said that once any person learns how to use the system, the same knowledge can be applied to any of the registered projects, making sharing of effort within teams very easy

22ETICS@PSM - 3 July 2008

Page 23: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Problems and Issues:Socio-technical problems

• Projects that started using ETICS since the very early phases had to endure the unavoidable initial problems and bugs. In many cases this had created a certain mistrust in the system that we have now difficulties in overcoming

• Many developers are very reluctant to use new procedures and tools, especially when they have developed their own. We still have several users wrapping the ETICS tools within layers of scripts. These scripts are supposed to improve the tools and/or shield the developers against their misbehaviour, but often prevent them from working correctly

• In other cases, projects use their own tools or store some information outside the system in parallel to ETICS. This prevents ETICS from providing some functionality for lack of complete information. Although this in itself is not a problem, it becomes a problems if it is perceived as an issue on the ETICS side

23ETICS@PSM - 3 July 2008

Page 24: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Problems and Issues:Socio-technical problems

• Developers independence is important to keep up motivation and enthusiasm. However for the managers is good to have some uniformity, global reporting and trend analysis. These two aspects are very difficult to reconcile

• ETICS is configurable to allow a sort of intermediate “managed freedom”, but some rules have to be established

24ETICS@PSM - 3 July 2008

Page 25: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Example of Usage:EGEE

• EGEE uses ETICS within four Activities: NA4, JRA1, SA2, SA3

• JRA1 and SA3 are the main ETICS users (more in the following slide)

• SA2 has implemented with ETICS the IPv6 compliance analysis of gLite code and the distributed IPv6 experimental testbed where the IPv6 version of BDII has been tested

• NA4 is using ETICS for experimenting with a few projects like Gridway, mpi, SAGA and the panc quattor compiler

25ETICS@PSM - 3 July 2008

Page 26: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Example of Usage:EGEE – gLite Building

• org.glite: 289 Modules, 5211 Configurations, 149 Main Reports, 1583 Metrics

• The project has two main configurations, one for production (glite_branch_3_1_0) and one for development (glite_branch_3_1_0_dev)

• The project configuration is not versioned, although recently some backups have been created when a change occurred

• All other configurations are versions of the 288 modules inside the org.glite project

• It is built four times per day on 5 different platforms (slc4 32/64, sl5 32, deb4 32, rhel4 32). SLC3 has been dropped recently

• Many on-demand builds are performed by individual developers• The porting of several services is managed with ETICS. At the

moment development gLite builds are run on several different platforms

26ETICS@PSM - 3 July 2008

Page 27: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Examples of UsageEGEE – gLite Porting

27ETICS@PSM - 3 July 2008

Page 28: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Example of Usage:EGEE – gLite Testing

• Very little testing is done with ETICS at the moment• There is ongoing work to do deployment and patch testing

of the gLite metapackages (services)• However, this is made difficult by the fact that the definitions

of the metapackages and patches are not in ETICS• We are working on a Savannah plugin that would allow

extracting information from Savannah (bugs, patches, etc)• More complex tests are planned but only after this is solved

28ETICS@PSM - 3 July 2008

Page 29: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Example of Usage:DILIGENT/D4Science

• DILIGENT has used ETICS for building, testing, releasing and reporting since the beginning

• D4Science, its follow-up project, has fully standardized its software engineering process on ETICS

• The are two main projects in D4Science:• org.gcore: 14 Modules, 619 Configurations, 60 Main

Reports, 160 Metrics• org.gcube: 145 Modules, 1386 Configurations, 131 Main

Reports, 272 Metrics• A letter has recently been received from the D4Science

Technical Coordinator describing how they use ETICS and quantifying the savings it brought in terms of manpower and efficiency of the process

29ETICS@PSM - 3 July 2008

Page 30: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Example of Usage:OMII-Europe

• OMII-Europe has used ETICS as software engineering platform since the beginning

• It mainly includes builds of the BES compliant modules for CREAM CE and UNICORE

• BES compliance tests for CREAM-BES have also been implemented in ETICS, although problems in deploying CREAM have prevented running them in a completely automated way

• OMII-Europe: 13 Modules, 10 Configurations, 10 Main Reports, 86 Metrics

30ETICS@PSM - 3 July 2008

Page 31: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Experimental Projects:ROOT

• ROOT has been added to ETICS by ETICS developers, not by ROOT developers. It has been used as a proof of concept to see how difficult it is to add existing projects to the system

• It was added and configured in one afternoon• It can be easily built with ETICS on various platforms (slc3,

slc4 32/64, rhel4, debian)• root-project: 1 Module, 6 Configurations, 6 Main Reports,

12 Metrics

31ETICS@PSM - 3 July 2008

Page 32: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Experimental Projects:CASTOR 2

• CASTOR 2 has been added to ETICS by ETICS developers as a proof of concept to see how easy it is to add existing projects to the system

• The CASTOR developers have been involved in this case• The exercise started nominally more than one year ago with

very low priority on both sides and a bit more effort only in the past two weeks (also in preparation of this presentation)

• However, it still cannot be fully built due to some compilation errors and a rather different approach to building software

• The errors can be reproduced with and without ETICS and have been shown to the developers. Work is ongoing.

32ETICS@PSM - 3 July 2008

Page 33: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Quick Demos

33ETICS@PSM - 3 July 2008

Page 34: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Next Steps

• The major goals of ETICS 2 are:• To make the system even more reliable and user-friendly• To include it as a general service in the major European

infrastructures (EGEE, DEISA, SeeGrid, etc). ETICS support is already a standard unit within GGUS

• To propose it also to non-grid related projects inside and outside CERN

• To add direct support for major job management systems (CREAM, UNICORE, LSF). Currently only Condor is supported

• To add more advanced release management and patch management tools with proper responsibility transitions (like in EDH for example)

• To add dedicated project dashboards where information about projects is clearly summarized (like in SourceForge)

34ETICS@PSM - 3 July 2008

Page 35: INFSO-RI-223782 ETICS The Software Engineering Infrastructure Physics Services Meeting Geneva, July 3 rd 2008 Alberto Di Meglio CERN - ETICS alberto.di.meglio@cern.ch.

INFSO-RI-223782

Conclusions

• ETICS is not ‘a build system’• ETICS is a complete infrastructure for configuring, building,

testing and managing the software development process• The current system is the result of two years of

requirements analysis, design, implementations and deployment with many projects and developers

• It is currently effectively used in production tasks• Problems do exist, but we are actively and proactively

addressing them together with the users• Where we go from here depends on how we can evolve the

system and how many more projects decide to adopt ETICS

35ETICS@PSM - 3 July 2008