EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ab...

61
EGEE-III INFSO-RI- 031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Ab initio electronic structure calculations on the Grid M. Sterzel – ACC “Cyfronet” AGH COST Training School on Molecular and Material Science Grid Applications Trieste, 30 March 2010

Transcript of EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ab...

Page 1: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

EGEE-III INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Ab initio electronic structure calculations on the Grid

M. Sterzel – ACC “Cyfronet” AGH

COST Training School on Molecular and Material Science Grid Applications

Trieste, 30 March 2010

Page 2: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 2

Outlook

• Aim of this talk– To demonstrate benefits of Grid computing in chemistry

• Parts of the talk topics– Work on the Grid vs. cluster – comparison– Chemical Software packages available on the Grid

Parallel execution

– Typical Chemical Computations on the Grid Geometry optimizations Numerical Frequencies Chemical reactions

– In silico Lab– Final Remarks

Page 3: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

What is a/the Grid ?

• A Grid is NOT– The Next generation Internet– A new Operating System– Just

(a) a way to exploit unused cycles (b) a new model of parallel computing (c) a new model of P2P networking

• Definition (Vaidy Sunderam)– A paradigm/infrastructure that enables the sharing, selection, & aggregation of

geographically distributed resources (computers, software, data(bases), people) [share (virtualized) resources]

– . . . depending on availability, capability, cost– . . . for solving large-scale problems/applications– . . . within virtual organizations [multiple administrative domains]

• Grid Checklist (Ian Foster): A Grid is a system that– Coordinates resources that are not subject to centralized control– Uses standard, open, general purpose protocols and interfaces– Delivers nontrivial qualities of service

3

Page 4: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 4

Authorization

• On a cluster - passwordokrucyusz:~ sterzel$ ssh ymsterze@uiLast login: Wed Mar 5 18:40:24 2008 from ha2rtr.agh.edu.pl

[ymsterze@ui ymsterze]$

• On the grid - certificate (Grid identity card)– User has to be a member of Virtual Organization (at least one)– Virtual Organization determines the resources user can use– To access grid resources user needs to obtain proxy:

[ymsterze@ui ymsterze]$ voms-proxy-init -voms gaussianYour identity: /C=PL/O=GRID/O=Cyfronet/CN=Mariusz SterzelEnter GRID pass phrase:Creating temporary proxy .............................. DoneContacting voms.cyf-kr.edu.pl:15001[/C=PL/O=GRID/O=Cyfronet/

CN=voms.cyf-kr.edu.pl] "gaussian" ...................DoneCreating proxy ........................................ DoneYour proxy is valid until Thu Mar 6 07:20:50 2008

Page 5: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

EGEE

5Mariusz Sterzel CGW'08 Kraków, 13 October 2008 5

EGEE

ArcheologyAstronomyAstrophysicsCivil ProtectionComp. ChemistryEarth SciencesFinanceFusionGeophysicsHigh Energy PhysicsLife SciencesMultimediaMaterial Sciences…

>250 sites48 countries>150,000 CPUs>50 PetaBytes>15,000 users>150 VOs>150,000 jobs/day

Page 6: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 6

Job file

• Grid JDL file

Executable = "/bin/bash";

Arguments = $VO_GAUSSIAN_SW_DIR/g03/gaussian.x water.com";JobType = "MPICH";NodeNumber = 4;

StdOutput = "water.out";

StdError = "water.err";

InputSandbox = {"water.com"};

OutputSandbox= {"water.out",

"water.log", "water.err"

};

• PBS file

#!/bin/bash#PBS -l ncpus=4#PBS -q long#PBS -o water.out#PBS -e water.err#PBS -N my_job_name#PBS -M my@email#PBS -m eexport g03root=/somewhere. $g03root/g03/bsd/g03.profile$g03root/g03 water.com

Page 7: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 7

Job management

• PBS– qsub – submit job to a queue– qdel – delete job from a queue– qstat – show job status in a queue

• EGEE Grid– glite-wms-job-submit – submit job to the Grid– glite-wms-job-delete – remove job from the Grid– glite-wms-job-status – show status of the job on the Grid– glite-wms-job-output – retrieve job files from the Grid

… just a few new commands…

Page 8: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 8

• Freely available packages on EGEE Grid:

• Commercial packages on EGEE Grid:– Gaussian– Turbomole– Wien2k

Chemical software

– GAMESS

– DALTON

– CPMD

– Newton X

– DL_POLY

– NAMD

– RWAVEP

– GROMACS

– Autodock

– Tinker

– Solvate

– PIC-DMSC

– MCGBgrid

– QMC

– ABCtraj

– VENUS

– CRBS

– LM

– COLUMBUS

– DINX

– Abinit

Page 9: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 9

Gaussian VO

• Why Gausian?– Large number of computational methods implemented– One of the first ab initio codes– The most popular among communities– User friendly– Available for many platforms along with GUI

• Gaussian VO– Invented and operated by ACC CYFRONET– All license issues confirmed with Gaussian Inc,– Open for every EGEE user– Any computing centre with site Gaussian license may support it

(4 supporting centres, another 3 in the line)– 30+ users since the start in September 2006– VO manager – Mariusz Sterzel ([email protected])– Enabled for parallel execution up to 8 processors

Page 10: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 10

Turbomole – an alternative

Advantages:– Probably the fastest B3LYP implementation– Analytical gradients for excited states at DFT and CC2 levels– Variety of fitting approaches speeding up calculations– Very well scalability during parallel execution– Extremely fast and very well parallelised (ri)CC2 and (ri)MP2

Disadvantages:– Limited number of DFT functionals (only “good” ones available)– Lack of parallel version of analytical second derivatives– Lack of parallel version of TDDFT– Only NMR chemical shifts implemented, no spin-spin couplings

Page 11: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 11

Gaussian VO – participation

As a user:• Register at:

https://voms.cyf-kr.edu.pl:8443/voms/gaussian• Wait for VOMS admin acceptance• voms-proxy-init --voms gaussian and you are ready to use the

program...

As a participating centre:• Just sent an e-mail concerning participation to VO manager• After confirmation of the license status at your centre with Gaussian Inc,

detailed information concerning set-up will be sent back to you

More details at:http://egee.grid.cyfronet.pl/Applications/gaussian-vo/

Page 12: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 12

Grid and parallel execution

Past:– Serial jobs only– Job of MPICH type always enforced execution of mpirun

Present– mpirun no longer enforced– Instead a wrapper script mpistart can be executed which will

automatically set up environment for required MPI flavour– No possibility to request desired # of processors on a WN

… Unfortunately not all sites are set up…

Current work– Support for OpenMP jobs

Page 13: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 13

Chemical software

• “Old codes” – mostly written in FORTRAN• Serial – parallel execution added later

(with exceptions)

• Different parallelization models used• Low scalability in many cases• Only selected computational methods parallelized

… all that makes parallel grid ports of chemical software even more complicated

Page 14: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 14

Selected cases

• Gaussian– Parallelization via OpenMP or Linda– OpenMP – SMP machines or multiprocessor/core clusters up to # of

processors/cores on WN– Linda – allows the parallel execution between nodes. Requires equal #

of processors for each WN– For Linda additional expenses required (commercial package),

available only for specific platforms• Turbomole

– Uses MPI – currently HPMPI– No specific requirements

• GAMESS– Uses sockets but MPI execution possible (a little slower but more

convenient for execution on a grid)• ADF

– Uses MPI (MPICH, OpenMPI, HPMPI, …)– One of the best parallelized QC codes // to my knowledge ;-)

Page 15: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 15

Grid solution

• Gaussian– Parallel execution via OpenMP on a single WN up to # of

processors/cores available on that worker node– Necessarily queue system set-up requires a Site admin help– Torque set-up:

Modification of /var/spool/pbs/torque.cfg

to: SUBMITFILTER /var/spool/pbs/submit_filter.pl

– Other settings -- typical Job has to be of MPICH type # of processors controlled via NodeNumber variable Gaussian %Nproc route is automatically set-up by script executing

Gaussian

– Execution with 8 processors per job possible.

Page 16: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 16

Sample script

Executable = “/bin/sh”;

Arguments = “$GAUSSIAN_SW_DIR/gaussian.run myfile.com”;

JobType = “MPICH”;

NodeNumber = 8;

InputSandbox = {“myfile.com”};

StdOut = “myfile.out”;

StdErr = “myfile.err”;

OutputSandbox = {“myfile.log”, “myfile.chk”,

“myfile.out”, “myfile.err”};

Requirements = other.GlueCEUniqueID==“ce.cyf-kr.edu.pl”

Page 17: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 17

Grid Solution cont.

• Turbomole– No special set up except shared directory needed, # of

processors automatically discovered by Turbomole scripts

• NAMD– Similar to Turbomole. If the NAMD executing script was set-up

properly during installation the necessarily “node file” is created every time program is executed

• GAMESS– Depends on Grid port– In case of MPI no additional input needed– DDI case – may require WN reconfiguration especially if large

DDI memory is requested by a job

Page 18: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

PL-Grid – Polish Grid Infradtructure

• In addition to above:– CPMD– Cfour– ACES III– Amber– OOMMF– ADF– Amber– Cadence– Fluent– ... any aplication needed by users...

18

Page 19: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 19

Execution benchmarks

Scheduling time– MPI jobs

4 proc./job -- usually less than hour 8 proc./job -- waiting time even 3-4 hour

– OpenMP jobs Job waiting time much longer, heavily depends on site overload

• 4 proc./job -- from less than hour up to 6 hours

• 8 proc./job -- in some cases job waiting time exceeds 12 hours

Parallel job execution can be inefficient in case of short (less than 24 h) jobs

Page 20: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 20

Grid application in chemistry

• Tasks to which Grid can be applied directly:– Conformational analysis– Numerical frequency computations– Zero Point Vibrational Averaging– Property computations for series of geometries from Molecular

Dynamics simulation– Determination of chemical reaction paths– Determination of potential energy surfaces (PES)

… all kind of “brute force” tasks, or tasks which operate on huge data sets

• Other tasks– Computations need to to be planned in order to maximize

benefits from the grid computing

Page 21: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 21

An example

Geometry optimization:– Steep potential

Few steps needed

– Flat potential Many steps needed Energy and gradient

convergence have to be increased to high values

Start

Stop

Page 22: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 22

PCP complex

Page 23: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 23

PCP cont.

Page 24: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 24

PCP cont.

Page 25: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 25

PCP cont.

Page 26: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 26

PCP cont.

Page 27: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 27

PCP cont.

Page 28: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 28

PCP cont.

Page 29: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 29

Plan

• Computations of the whole molecule are not possible• System needs to be modeled

– For this we need: Chlorophyll A and peridinin ground and excited states geometries

and normal modes The geometry of whole complex A little programming (tetramer model, energy transfer model)

Page 30: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 30

Plan

• Computations of the whole molecule are not possible• System needs to be modeled

– For this we need: Chlorophyll A and peridinin ground and excited states geometries

and normal modes The geometry of whole complex A little programming (tetramer model, energy transfer model)

… a lot of luck

Page 31: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 31

First step – Peridinin

Page 32: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 32

First step – Peridinin

Page 33: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 33

Peridinin geometry

• A long peridinin chain makes usual gradient based optimization inefficient

• Instead we propose following scheme:– MD for peridinin (force field level)– Geometry preoptimization for series of MD snapshots (semi

empirical or ~ RHF/STO-3G level)– “Final” geometry optimization for few lowest energy MD

snapshots (CASSCF/PT2 level)– Verification of the minima by frequency calculations– Excited state geometry optimization with ground state as a

starting point– Again, minima verification via vibrational analysis– Verification of obtained data by comparison with experiment

Page 34: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 34

H—CN CN—H

Chemical reactions

... a process that results interconversion of molecules

Page 35: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 35

Chemical reactions

Points of interest:– Structure of substrates and products– Structure of active complex at TS– Reaction path

Page 36: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 36

Chemical reactions

Points of interest:– Structure of substrates and products– Structure of active complex at TS– Reaction path(s)

Reactionpath

substrate(s) product(s)

Transition State (TS)

Energy

Page 37: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 37

Chemical reactions

Energy

substrate(s)product(s)

A+B C

DIntermediateproduct

TS1

TS2

E(TS1) > E(TS2)

Reactionpath

Page 38: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 38

Chemical reactions

Energy

Possible reaction paths;More intermediate products

Reaction path

Page 39: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 39

Chemical reactions

Main problem – TS determination

Page 40: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 40

Chemical reactions

Main problem – TS determination

Artur MichalakArtur Michalak

Page 41: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 41

Chemical reactions

Main problem – TS determination

Page 42: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 42

Chemical reactions

Main problem – TS determination

E

‘reaction coordinate’

Page 43: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 43

Chemical reactions

Main problem – TS determination

‘reaction coordinate’

E

Page 44: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 44

Chemical reactions

TS verification – vibrational analysis – one imaginary frequency!

i1203 cm-1

Page 45: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 45

Sample calculations

N2O braking on oxide surfaces – possible mechanisms

• Electron transfer

6. (O-)--X+--(O-) X + O2

1. N2O + X 2. N2O(s) + X X+--N2O-

4. X+--N2O- X+--(O-) + N25. (O-)--X+--(O-)

e- transfer

Filip ZasadaFilip Zasada

Page 46: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 46

Sample calculations

N2O braking on oxide surfaces – possible mechanisms

• An Oxygen transfer

1. N2O(g) N2O(s) 2. N2O(s) + X X--O--N2

4. O-X-O 5. O--O-X 6. X + O2

4. X-O

Page 47: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 47

Sample calculations

Vac

uu

m

Oxi

de

slab

Adsorbed N2O

Slab construction

Oxi

de

slab

Matrix generation

Page 48: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 48

Sample calculations

• Reaction paths:

Page 49: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 49

Sample calculations

• Reaction paths:

Page 50: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 50

Sample calculations

Computational details:– Gaussian 03 D.01– BP86 functional– Basis set of double-ζ quality

Timings:– CPU time for SP calculation – approx 7 hours– 15 paths – 10-50 energy points on each path– In total about 250 energy points calculated

Page 51: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 51

Conformational searchers

• To determine lowest energy structure

(1R,2R)-1,2-bis-(N’-cyklohexy-1’,4’,5’,8’-naphthalenetetracarboxydiimide)cyklohexane

• Possible issues– For short time jobs

it is convenient to

group several geometries

in to one job to minimize

average job scheduling time

Page 52: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 52

Conformational searches

Page 53: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 53

Harmonic frequencies

Numerical frequency computations for lycopene

– 96 atoms, 2·3·96+1=577 independent computation steps– Software: GAMESS version June 2005– VOCE VO resources used– Methodology: B3LYP, cc-pVDZ basis set– Computations done on approx. 100 processors (ia64 and i386)– CPU time for single computation:

Intel Xeon 2.8Gh - 14h 39’ Intel Itanium 2 1.3Gh - 12h 8’

– Total time: single CPU - 330 days (estimated) EGEE Grid - 3 days

Page 54: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

In silico Lab• Computational chemists use first

principles chemistry packages available on the Grid (Gaussian, GAMESS, TURBOMOLE)

• Grid is still mainly available through command line interfaces (voms-proxy-init, glite-wms-job-submit, glite-wms-job-status)

• Adoption to the Grid environment by non-experts is difficult

• Lets build a web-based problem solving environment facilitating the use of Grid … but not only that

Page 55: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

State-of-the-art

• WebMO

– Desktop and Web access

– Supports main computational chemistry applications

– Does not support Grid or local queues infrastructures

• ECCE – Extensible Computational Chemistry Environment

– Only desktop access

– Supports main computational chemistry applications

– Supports many queue management systems (PBS, LSF, NQE, etc.)

– Does not use Grid infrastructures

• CCG – Computational Chemistry Grid

– A virtual organization which is a part of Teragrid

– GridChem – Java client which can be run as a WebStart application

• P-GRADE – Framework for Development and Execution of Parallel Applications

– A Grid-oriented solution

55

Page 56: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

Description of the solution - requirements

• Supports main chemistry packages

– Gaussian

– GAMESS

– TURBOMOLE

– ...

• Is accessible through a web interface

– All Grid-related operations should be embedded (proxy generation, job submission and status monitoring, LFC catalog operation)

– Persists information about executed jobs between web sessions

• Enables user-centric processing rather than grid-centric

– It should not be yet another Grid job submission tool

– Supports inter-application geometry passing

– Provides automated report summaries

– Covers Grid complexity

• Supports annotations through user free-text tagging

56

Page 57: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688

Description of the solution – architecture (1/2)

• Uses gLite for Grid job management

– A set of Grid APIs is used (LFC, WMS, VOMS)

• Job submission and monitoring implemented as a separate layer for better error handling

– GridSpace platform used

– Each application backed by a separate script

• Application model realized by using SINT (Semantic Integration Tool)

– Basic concepts such as geometry, basic and detailed reports, input parameters, annotations are modeled

• Interactive graphical user interfaces are used

– GWT (Google Web Toolkit) is used as the user front-end technology

57

Page 58: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 58

Description of the solution – architecture (2/2)

Page 59: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 59

Demo

Switching to demo ...

Page 60: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 60

References

T. Gubala, M. Bubak, P.M.A. Sloot: Semantic Integration of Collaborative Research Environments, In: M. Cannataro (Ed.) Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, Information Science Reference, 2009, IGI Global

Marian Bubak et al.,: Virtual Laboratory for Collaborative Applications, In: M. Cannataro (Ed.) Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, Information Science Reference, 2009, IGI Global

Page 61: EGEE-III INFSO-RI-031688 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Ab initio electronic structure calculations.

Enabling Grids for E-sciencE

EGEE-III INFSO-RI-031688 61

Summary

• Access to the Grid is easy, does not differ to much from queue system usage

• EGEE Grid offers variety of software packages for chemical computations. A parallel execution can be made as simple for the user as a serial execution is now

• It is always possible to find solution for parallel execution even if computational platform does not directly support certain parallelization model

• With a little of planning every computational chemistry task may benefit from the Grid platform