F. Boulahya, I. Dubus, F. Dupros, P. Lombard. EGEE User Forum, May 9th-11th, Manchester

EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

FOOTPRINT@work, a computing framework for large-scale parametric simulations: application to pesticide risk assessment and management

F. Boulahya, I. Dubus, F. Dupros, P. Lombard

EGEE User Forum, May 9th-11th, Manchester

Outline

• FOOTPRINT presentation

• FOOTPRINT@work architecture

• PRZM and MACRO Models

• FOOTPRINT@work and EGEE

FOOTPRINT presentation

• 3-year EU-funded project
– Part of the 6th Framework Programme (FP6)
– Started in January 2006

• Objective:
– Develop functional tools to identify pathways and sources of pesticide contamination in the agricultural landscape
– Help decision makers choose strategies to reduce pesticide contamination of water resources at different scales: local/farm, catchment, national/EU

• How:
– Based on meta-modelling, i.e. on a large number of precomputed scenarios

• http://www.eu-footprint.org

• Huge requirement for computing power

Outline

• FOOTPRINT presentation

• FOOTPRINT@work architecture
– What is FOOTPRINT@work?
– ComputeMode
– OAR/CIGRI

• PRZM and MACRO Models

• FOOTPRINT@work and EGEE

FOOTPRINT@work architecture

• What is FOOTPRINT@work?
– Relies on IGGI (Infrastructure for Grids, Cluster and Intranet), a French national Grid initiative
   – Partners: BRGM, INRIA, Mandriva
   – Gives access to, and gathers, all the computing resources spread over a company intranet

• The ability to use desktop PCs belonging to BRGM administrative staff and researchers to perform large-scale pesticide simulations

• ComputeMode
– Provides a seamless Linux cluster infrastructure within an intranet
– Aggregates idle user machines into a virtual computing cluster
– Is based on:
   – PXE boot with DHCP and Wake-on-LAN
   – NFS access
   – A PostgreSQL database linked to an Apache server for an easy-to-use web interface
– http://www.computemode.org/

• ComputeMode advantages
– An almost homogeneous environment
   – Every node boots the same diskless Linux flavor
   – The only sources of heterogeneity are differences in CPU power and available RAM
– A safe environment
   – Thanks to the diskless operating system
   – Logging on to a node is only allowed through ssh from the ComputeMode server or through an interactive job reservation

• ComputeMode: how does it work?
– Each cluster node has to be registered with:
   – A hostname attached to its MAC address
   – A booting schedule, which describes when the machine works "locally" and when it takes part in the cluster
   – Properties (location, type, …), which are exported to the job manager
– Standard schedule: from 6pm until 8am, Monday to Friday, plus the whole of Saturday and Sunday
– During a computing period, the PC can be halted by pressing 'Ctrl-Alt-Del'
– To increase availability, users can tell the system through a simplified web page that they will be out of the office for the next few days

• Including vacations, nights and weekends, a single PC can be used for grid computations about 4.5 days per week
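The standard schedule can be expressed as a small predicate. The sketch below is an illustration only, not ComputeMode's code: the function names are hypothetical, and it assumes that "6pm until 8am" means weekday hours before 08:00 and from 18:00 onward. It ignores vacations, declared absences and manual halts, so it is not expected to reproduce the 4.5-day figure quoted above.

```python
from datetime import datetime, timedelta

# Assumed reading of the standard schedule: weekdays before 08:00 and from
# 18:00 onward, plus all of Saturday and Sunday.
CLUSTER_START_HOUR = 18  # 6pm
CLUSTER_END_HOUR = 8     # 8am


def in_cluster_mode(moment: datetime) -> bool:
    """Return True if a PC on the standard schedule serves the cluster."""
    if moment.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return True
    return moment.hour >= CLUSTER_START_HOUR or moment.hour < CLUSTER_END_HOUR


def weekly_cluster_hours() -> int:
    """Count the cluster-mode hours in one full week, hour by hour."""
    monday = datetime(2007, 5, 7)  # a Monday; any full week gives the same total
    return sum(
        in_cluster_mode(monday + timedelta(hours=h)) for h in range(7 * 24)
    )


if __name__ == "__main__":
    hours = weekly_cluster_hours()
    print(f"{hours} h per week, i.e. {hours / 24:.1f} days")
```

Under this reading the schedule alone gives 118 hours per week; the real availability also depends on absences and on users halting their machines.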

• ComputeMode: schedule

• OAR
– Batch scheduler
– Developed by the ID-IMAG laboratory
– Written in Perl and based on a MySQL relational database
– One example user: Grid5000, the French national Grid, http://www.grid5000.fr
– Available free of charge under an Open Source license: http://oar.imag.fr/

• CIGRI
– A campaign manager for parametric jobs
– Based on a MySQL database and a web interface
– Written in Perl
– Acts as a meta-scheduler and uses OAR and ssh to handle the campaigns (task and job scheduling, multi-cluster support)
– Several thousand parameters can be handled
– Shows where and when jobs are executed
– Shows users what is happening on the Grid platform and which errors relate to their campaigns
– Available free of charge under an Open Source license: http://cigri.imag.fr/

• CIGRI web interface to manage campaigns

Outline

• FOOTPRINT presentation

• FOOTPRINT@work architecture

• PRZM and MACRO Models
– Model descriptions
– Emulation
– Management of parametric campaigns

• FOOTPRINT@work and EGEE

PRZM and MACRO Models

• MACRO
– A physically-based, one-dimensional, numerical model of water flow and reactive solute transport in field soils (Jarvis, 1994)
– Calculates coupled unsaturated-saturated water flow in cropped soil and can also deal with saturated flow to field drainage systems
– Richards' equation and the convection-dispersion equation are used to model soil water flow and solute transport in the soil micropores, while a simplified capacitance-type approach is used to calculate fluxes in the macropores
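For reference, the two equations named above can be written in their textbook one-dimensional forms. This is a generic sketch, not MACRO's exact formulation, which adds further terms (for example micropore-macropore exchange, sorption and degradation). The symbols are assumptions of this sketch: $\theta$ is the volumetric water content, $h$ the soil water pressure head, $K(h)$ the unsaturated hydraulic conductivity, $z$ the vertical coordinate (positive upward), $S$ a sink term such as root water uptake, $c$ the solute concentration in the liquid phase, $D$ the dispersion coefficient and $q$ the water flux.

```latex
\begin{align}
\frac{\partial \theta}{\partial t}
  &= \frac{\partial}{\partial z}\left[ K(h)\left( \frac{\partial h}{\partial z} + 1 \right) \right] - S \\
\frac{\partial (\theta c)}{\partial t}
  &= \frac{\partial}{\partial z}\left( \theta D \frac{\partial c}{\partial z} - q c \right)
\end{align}
```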

• PRZM (Pesticide Root Zone Model)
– Based on a one-dimensional finite-difference code
– Consists of hydrologic (flow) and chemical transport components to simulate runoff, erosion, plant uptake, leaching, decay, foliar washoff, and volatilisation
– Pesticide transport and fate processes include advection, dispersion, molecular diffusion, and soil sorption
– Includes soil temperature effects, volatilisation and vapour phase transport in soils, irrigation simulation and a method of characteristics algorithm to eliminate numerical dispersion
– Predictions can be made for daily, monthly or annual output

• MACRO and PRZM run on the Windows operating system, whereas the computing nodes run Linux
– Homologation requires the Windows versions of these models
– The FOOTPRINT@work software includes emulation facilities to run these two codes on Linux platforms

• The tools used
– For PRZM: Wine (emulation) and Xvfb (redirection of output events)
– For MACRO: a patched version of DOSEMU

• NB: Because the MACRO source code is available within the FOOTPRINT project, this emulation layer will be dropped as soon as the Linux version is validated
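The slides do not say how Wine and Xvfb are invoked. One common way to run a Windows program on a node without a display is the `xvfb-run` wrapper shipped with Xvfb on many Linux distributions. The sketch below only builds such a command line; the helper name, the executable name and the input file name are placeholders, not the project's actual files.

```python
import shlex
from typing import List


def headless_wine_command(exe_path: str, *args: str) -> List[str]:
    """Build a command that runs a Windows executable under Wine inside a
    temporary virtual X display provided by xvfb-run.

    The -a option asks xvfb-run to pick a free display number automatically.
    """
    return ["xvfb-run", "-a", "wine", exe_path, *args]


if __name__ == "__main__":
    # "przm.exe" and "run_0001.inp" are placeholder names for illustration.
    cmd = headless_wine_command("przm.exe", "run_0001.inp")
    print(shlex.join(cmd))
    # To actually launch it (requires wine and xvfb-run to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```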

• Several shell and Perl tools have been written to mimic the functionality of the SENSAN package

• What is SENSAN?
– A model-independent sensitivity analyser
– Communicates with the model through its input and output (no need to adapt the simulations)
– Example 1:
   – Identifies parameters in the input files
   – The user provides different sets of parameter values
   – SENSAN runs the model for each set of parameter values
   – Records input and output values in a spreadsheet format
– Example 2:
   – Offers the possibility to add a system command (renaming output files, moving them, …)

• Input data: two modes are allowed:
– One file per individual run
– A template file together with a variable file in which each line corresponds to a set of parameters (this mimics SENSAN package functionality)

• A job number is associated with each simulation

• Output data:
– Renamed according to the simulation number
– Because of the amount of data that can be generated, two functionalities have been added:
   – Specify a Perl script that filters output files as they are generated, so that only the "sections" of interest are kept
   – Specify a filter script, using the same syntax as SENSAN, to extract several values (a merger script then aggregates this information into a table of results)
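The template-plus-variable-file mode can be illustrated with a short sketch. It is not the FOOTPRINT@work code (which the slides describe as shell and Perl tools) and it does not use SENSAN's own template syntax. The assumptions are: Python's `$name` placeholders, a variable file whose first line names the parameters, and hypothetical parameter names (`koc`, `dt50`) and file names.

```python
from string import Template
from typing import Dict, List


def read_parameter_sets(variable_text: str) -> List[Dict[str, str]]:
    """Parse a variable file: a header line of names, then one set per line."""
    lines = [ln.split() for ln in variable_text.splitlines() if ln.strip()]
    names, rows = lines[0], lines[1:]
    return [dict(zip(names, row)) for row in rows]


def build_inputs(template_text: str, variable_text: str) -> Dict[str, str]:
    """Return {input file name: content}, one entry per parameter set."""
    template = Template(template_text)
    inputs = {}
    for number, params in enumerate(read_parameter_sets(variable_text), start=1):
        # Each generated file is named after its simulation number.
        inputs[f"run_{number:05d}.inp"] = template.substitute(params)
    return inputs


if __name__ == "__main__":
    template_text = "KOC $koc\nDT50 $dt50\n"
    variable_text = "koc dt50\n100 30\n250 60\n"
    for name, content in build_inputs(template_text, variable_text).items():
        print(name, repr(content))
```

Numbering the generated inputs makes it straightforward to rename outputs by simulation number, as described above.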

FOOTPRINT@work and EGEE

• Challenge
– Around 12 thousand agro-environmental scenarios (crop x soil x climate)
– 100 different pesticides, each with 10 possible application dates
– Several million runs of MACRO and PRZM
– Several terabytes of data

• Internal BRGM Grid capabilities
– 500 PCs that could be used during idle periods (4.5 days per week)
– Storage means under discussion

EGEE User Forum, May 9th-11th, Manchester 26

Enabling Grids for E-sciencE

EGEE-II INFSO-RI-031688

FOOTPRINT@work and EGEE

• EGEE
– Because of the computing time (more than 2700 years on a single PC) and the amount of data to store (around 20 TB), FOOTPRINT clearly calls for Grid capabilities such as the EGEE infrastructure
– Virtual Organisation: Earth Science Research
– Needed by FOOTPRINT and available in the EGEE infrastructure:
   – Huge number of available CPUs
   – Possibility to manage parametric campaigns (a gLite functionality)
   – Management of files
– But MACRO first needs to be ported to Linux (in progress)
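The figures quoted in this section can be combined in a back-of-the-envelope estimate. The sketch below makes two assumptions that the slides do not state: that the 2700 PC-years cover every scenario x pesticide x application date combination, and that the work scales perfectly across the 500 internal PCs. The function names are illustrative.

```python
SCENARIOS = 12_000            # crop x soil x climate combinations
PESTICIDES = 100
APPLICATION_DATES = 10
SINGLE_PC_YEARS = 2700        # quoted lower bound for one PC
INTERNAL_PCS = 500
AVAILABLE_DAYS_PER_WEEK = 4.5
HOURS_PER_YEAR = 365.25 * 24


def combinations() -> int:
    """Number of scenario x pesticide x application date combinations."""
    return SCENARIOS * PESTICIDES * APPLICATION_DATES


def mean_hours_per_combination() -> float:
    """Average computing time per combination (assumption: the 2700
    PC-years cover all combinations)."""
    return SINGLE_PC_YEARS * HOURS_PER_YEAR / combinations()


def years_on_internal_grid() -> float:
    """Elapsed years on the internal grid (assumption: perfect scaling)."""
    equivalent_full_time_pcs = INTERNAL_PCS * AVAILABLE_DAYS_PER_WEEK / 7
    return SINGLE_PC_YEARS / equivalent_full_time_pcs


if __name__ == "__main__":
    print(f"{combinations():,} combinations")
    print(f"{mean_hours_per_combination():.2f} h per combination on average")
    print(f"{years_on_internal_grid():.1f} years on the internal grid")
```

Under these assumptions there are 12 million combinations, and the internal grid alone would need at least about 8.4 years, well beyond the 3-year project duration.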

Acknowledgements

The funding of the FOOTPRINT project by the European Commission through its Sixth Framework Programme is gratefully acknowledged.
