EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org WISDOM in EGEE-2, biomed meeting, 2006/04/28 WISDOM: Grid-enabled Virtual High Throughput Screening N. Jacq LPC of Clermont-Ferrand (CNRS/IN2P3) Biomed meeting, Lyon, 2006/04/28


EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

WISDOM in EGEE-2, biomed meeting, 2006/04/28

WISDOM : Grid-enabled Virtual High Throughput Screening

N. Jacq

LPC of Clermont-Ferrand (CNRS/IN2P3)

Biomed meeting, Lyon, 2006/04/28


WISDOM: Wide In Silico Docking On Malaria

• Partners
– Fraunhofer SCAI, Germany (Project PI: Martin Hofmann)
– LPC Clermont-Ferrand, France (CNRS/IN2P3)
– CMBA, France (Center for Bio-Active Molecules screening)
– BioSolveIT
– HealthGrid

• Representing different projects:
– EGEE (EU FP6)
– Simdat (EU FP6)
– AuverGrid (French Regional Grid)
– Accamba project (French ACI project)


High Throughput Virtual Docking

Chemical compounds (ZINC): Chembridge – 500,000; Drug like – 500,000

Targets (PDB): Plasmepsin II (1lee, 1lf2, 1lf3); Plasmepsin IV (1ls5)

Millions of chemical compounds available in laboratories

High Throughput Screening: 1-10 $/compound, nearly impossible

Molecular docking (FlexX, Autodock): ~80 CPU years, 1 TB data

Data challenge on EGEE: ~6 weeks on ~1700 computers

Hits screening using assays performed on living cells

Hits refining using Molecular Dynamics
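As a rough illustration of why wet-lab high throughput screening is called nearly impossible at this scale, the sketch below multiplies the quoted per-compound cost range by the size of the ZINC selection (500,000 + 500,000 compounds). Applying that cost range to this particular library, and counting it once per target, is an assumption made for illustration; the slide does not state it.

```python
# Compound counts and the per-compound cost range are the figures quoted on the slide.
chembridge = 500_000
drug_like = 500_000
cost_per_compound_usd = (1, 10)  # lower and upper bound, in US dollars

# Assumption: every compound of the selection would be screened once per target.
library_size = chembridge + drug_like
low_cost, high_cost = (library_size * c for c in cost_per_compound_usd)

print(f"{library_size:,} compounds: ${low_cost:,} to ${high_cost:,} per target")
```

Under these assumptions the experimental cost is in the range of one to ten million dollars per target, before any of the listed targets is multiplied in.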


Significant numbers

• Total of 46 million ligands docked in 6 weeks

• 1 TB of data produced

• Up to 1700 computers in 15 countries used simultaneously

• Corresponding to about 72 000 jobs and 80 CPU years

• Average crunching factor ~660
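The figures above can be cross-checked with a little arithmetic. The sketch below assumes that the crunching factor means total CPU time consumed divided by elapsed wall-clock time; that definition is not given on the slide. The variable and function names are illustrative only.

```python
# Figures quoted on the slide.
ligands_docked = 46_000_000
jobs = 72_000
cpu_years = 80
duration_weeks = 6

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_WEEK = 7 * 24 * 3600


def crunching_factor(cpu_years, weeks):
    """CPU time consumed divided by elapsed wall-clock time (assumed definition)."""
    return cpu_years * SECONDS_PER_YEAR / (weeks * SECONDS_PER_WEEK)


cpu_seconds = cpu_years * SECONDS_PER_YEAR
print(f"dockings per job:        {ligands_docked / jobs:.0f}")
print(f"CPU seconds per docking: {cpu_seconds / ligands_docked:.0f}")
print(f"CPU hours per job:       {cpu_seconds / jobs / 3600:.1f}")
print(f"crunching factor:        {crunching_factor(cpu_years, duration_weeks):.0f}")
```

With a nominal 6 weeks this gives a factor close to 700, the same order as the ~660 quoted; the gap is plausibly due to rounding of the quoted duration and CPU time.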

Number of running and waiting jobs vs time



Deployment on EGEE infrastructure, wisdom.eu-egee.fr

Total amount of CPU provided by EGEE federation:

• UKI, 29%
• France, 18%
• Italy, 16%
• South Western Europe, 12%
• South Eastern Europe, 10%
• Northern Europe, 7%
• Central Europe, 4%
• Asia Pacific, 2%
• Germany/Switzerland, 1%
• Russia, 1%

Countries with nodes contributing to the data challenge WISDOM:

country       sites
Bulgaria      3
Croatia       1
Cyprus        1
France        9
Germany       1
Greece        3
Israel        1
Italy         13
Netherlands   2
Poland        1
Romania       1
Russia        2
Spain         7
Taiwan        1
UK            10
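A quick consistency check on the deployment figures, written as a sketch: it re-enters the federation shares and the per-country site counts from this slide and confirms that the shares add up to 100% and that the number of countries matches the 15 quoted earlier. The dictionary names are illustrative.

```python
# CPU share per EGEE federation, in percent, as listed on the slide.
cpu_share_percent = {
    "UKI": 29, "France": 18, "Italy": 16, "South Western Europe": 12,
    "South Eastern Europe": 10, "Northern Europe": 7, "Central Europe": 4,
    "Asia Pacific": 2, "Germany/Switzerland": 1, "Russia": 1,
}

# Number of contributing sites per country, as listed on the slide.
sites_per_country = {
    "Bulgaria": 3, "Croatia": 1, "Cyprus": 1, "France": 9, "Germany": 1,
    "Greece": 3, "Israel": 1, "Italy": 13, "Netherlands": 2, "Poland": 1,
    "Romania": 1, "Russia": 2, "Spain": 7, "Taiwan": 1, "UK": 10,
}

total_share = sum(cpu_share_percent.values())
n_countries = len(sites_per_country)
n_sites = sum(sites_per_country.values())

print(f"federation shares sum to {total_share}%")
print(f"{n_sites} sites in {n_countries} countries")
```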


Data challenge on avian flu

• A collaboration of 5 grid projects: Auvergrid, BioinfoGrid, EGEE-II, Embrace, TWGrid

• Partner institutes: Academia Sinica (Computing Center, Genomics Research Center), CNRS-LPC, CNR-ITB

• Timescale:
– First contacts: March 1st 2006
– Kick-off: April 1st 2006
– Duration: ~4 weeks


Data challenge on avian flu: biological goals

• The bird flu virus is named H5N1. H5 and N1 correspond to the names of proteins (hemagglutinins and neuraminidases) on the virus surface.

• Neuraminidases play a major role in virus multiplication

• Present drugs such as Tamiflu inhibit the action of neuraminidases and stop virus proliferation

• The virus keeps mutating and drug-resistant N1 variants can appear

• The goal of the data challenge is to study in silico the impact of selected point mutations on the efficiency of existing drugs and to find new potential drugs

[Figure labels: N1, H5]

Credit: Y-T Wu


Data challenge on avian flu: grid facts

• Data challenge parameters:
– One docking software: Autodock
– 8 conformations of the target (N1)
– 300,000 selected compounds
– 100 CPU years to dock all configurations on all compounds
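The parameters above imply a workload that can be estimated directly. The sketch below assumes one docking run per (conformation, compound) pair and perfect grid efficiency, neither of which is stated on the slide; the ~4 week duration is taken from the timescale given earlier. Names are illustrative.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60

# Parameters quoted on the slide.
conformations = 8
compounds = 300_000
cpu_years = 100
duration_weeks = 4  # "~4 weeks" from the timescale slide

# Assumption: one docking run per (conformation, compound) pair.
dockings = conformations * compounds
minutes_per_docking = cpu_years * MINUTES_PER_YEAR / dockings


def required_cpus(cpu_years, weeks):
    """Average number of busy CPUs needed to finish in `weeks`, assuming no overhead."""
    return cpu_years * 365.25 / (weeks * 7)


print(f"docking runs:           {dockings:,}")
print(f"CPU minutes per run:    {minutes_per_docking:.1f}")
print(f"average CPUs for {duration_weeks} weeks: {required_cpus(cpu_years, duration_weeks):.0f}")
```

Under these assumptions each docking run takes about 22 CPU minutes, and finishing in roughly four weeks needs on the order of 1,300 CPUs busy on average.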

Credit: Y-T Wu


Perspectives

• Future work on the hits: reranking of WISDOM hits by Molecular Dynamics simulations
– Approximately 100 CPU years needed
– Supported by the EGEE-II and BioinfoGrid European projects
– Need for resources on supercomputers (contact with DEISA)
– Finally, in vitro testing and structure-activity relationships

• Second large scale docking on EGEE in fall 2006
– Several new foreseen targets on malaria, dengue and other neglected diseases
– Resources needed: ~80 CPU years per target
– Supported by the EGEE-II and EELA European projects and the Swiss BioGrid initiative