Andrey Voronkov* and John Shultz *av@drugdiscoveryathome
Drugdiscovery@home - distributed volunteer computing project in the fields of cancer, aging
and stem cells
Andrey Voronkov* and John Shultz
What is VCSC (Virtual Campus Supercomputing Center) and distributed computing?
Computational tasks go from the server to locally or globally distributed computers, and the computed results go back to the server.
Internet – volunteer computing
Projects
Volunteers
Helps science
Involves public in science
http://boinc.berkeley.edu/trac/wiki/BoincPapers
Local network – VCSC
BOINC server
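The server-to-client round trip described above can be sketched in a few lines. This is only a toy illustration of the idea, not BOINC code: the function `run_project`, the host names and the squaring "computation" are all hypothetical, and real clients run in parallel on separate machines rather than in one loop.

```python
from collections import deque


def run_project(work_units, clients, compute):
    """Hand each work unit to a client and collect the results on the server.

    work_units: list of task inputs held by the server
    clients:    list of client names (hypothetical hosts)
    compute:    function a client applies to one task input
    """
    queue = deque(enumerate(work_units))  # server-side task queue
    results = {}                          # task id -> (client, result)
    turn = 0
    while queue:
        task_id, payload = queue.popleft()
        client = clients[turn % len(clients)]          # task goes out to a client
        results[task_id] = (client, compute(payload))  # result comes back
        turn += 1
    return results


if __name__ == "__main__":
    done = run_project([1, 2, 3, 4], ["host-a", "host-b"], lambda x: x * x)
    for task_id, (client, value) in sorted(done.items()):
        print(task_id, client, value)
```

The same loop describes both settings on this slide: in volunteer computing the clients are computers anywhere on the Internet, in VCSC they are machines on the local network.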
DRUGDISCOVERY@HOME PROJECT WORKFLOW
METHODS OF THE PROJECT:
• Distributed computing, GPU computing
• Virtual screening with flexible amino acids
• Relaxed complex scheme for docking
• Molecular dynamics with explicit solvent models for evaluating the stability of protein-ligand complexes
• Interactive pathway mapping with modeling of dynamic changes
FIELDS OF THE RESEARCH:
• Biotargets involved in stem cell niche signaling pathways, which are related but not limited to cancer and neurodegenerative disease pathways. Biotargets that fit cancer/aging regulation according to the hypothesis in Pic. A.
• Examples of biotargets: proteins involved in the Wnt, Shh and Notch signaling pathways.
• Other biological targets related to cancer, degenerative diseases and stem cell biology can be considered in collaboration with experimental biology groups.
The working hypothesis of cancer/degeneration and symmetric/asymmetric division of stem cells
ACCOMPLISHMENTS:
– Initial integration of the project website with Drupal
– High-throughput molecular docking on CPU
• Distributes Python
• Distributes some MGLTools packages
• Managed by BOINC Wrapper
– GROMACS integration with BOINC Wrapper for CPU
• Simulates 100 ps in 2.5 hrs
• Trajectory files range from 10 to 40 MB
• Results compressed in 7zip format
– Autodock 4.0 integration with BOINC Wrapper for CPU
– Protein-ligand docking->MD workflow setup (acpypi)
– Major platform integrations
• Windows
• Mac PPC & Intel
• Linux
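The GROMACS figures above (100 ps in 2.5 hrs, trajectory files of 10-40 MB) can be turned into a per-host throughput estimate. The sketch below is plain arithmetic on those numbers; it assumes, hypothetically, that a host crunches around the clock and that each 100 ps run produces one trajectory file, neither of which is stated on the slide.

```python
def md_throughput_ns_per_day(simulated_ps, wall_hours):
    """Simulated nanoseconds per day of wall-clock time on one CPU."""
    simulated_ns = simulated_ps / 1000.0
    return simulated_ns / wall_hours * 24.0


def daily_trajectory_mb(wall_hours, traj_mb):
    """Uncompressed trajectory data (MB) one CPU produces per day.

    Assumes back-to-back runs, one trajectory file per run.
    """
    runs_per_day = 24.0 / wall_hours
    return runs_per_day * traj_mb


if __name__ == "__main__":
    rate = md_throughput_ns_per_day(simulated_ps=100, wall_hours=2.5)
    low = daily_trajectory_mb(wall_hours=2.5, traj_mb=10)
    high = daily_trajectory_mb(wall_hours=2.5, traj_mb=40)
    print(f"{rate:.2f} ns/day per CPU")
    print(f"{low:.0f}-{high:.0f} MB of trajectories per CPU per day")
```

Under these assumptions one CPU delivers a little under 1 ns of simulation per day and roughly 100-400 MB of uncompressed trajectories, which is why compressing results before upload matters.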
Team:
• Andrey Voronkov, PhD, Moscow State University, Department of Chemistry – project leader, molecular modeling, drug design, BOINC server setup
• John Shultz, National Academy of Sciences, Washington D.C. – IT, coding, BOINC server setup
• Jorden van der Elst – main software tester
• We also collaborate with several people from industry who handle the systems biology part and wish to remain undisclosed for now.
COLLABORATION
OPTION 1: Collaboration with the experimental biologists
OPTION 2: Virtual Campus Super Computing for universities and organizations
Advantages over cluster supercomputing:
• New pool of computing power at very low cost
• Enhanced stability compared to clusters & supercomputers
• Applications not built for the cluster architecture
• Positive PR for the university
Advantages over distributed volunteer computing:
• Purely VCSC, no volunteers outside the network
o No credits, no cheaters, only one result needed per workunit (better performance per CPU), better security, more flexibility regarding software licenses
• Volunteer project
o Need to prevent cheating and validate results; more limitations on redistributing licensed software
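The difference between "one result per workunit" and "validate results" can be illustrated with a small quorum check. This is a simplified sketch, not BOINC's validator API: the function `validate`, the exact-match comparison and the sample docking scores are assumptions made for illustration (real numerical results usually need a tolerance-based comparison).

```python
from collections import Counter


def validate(results, quorum):
    """Return the agreed result if at least `quorum` replicas match, else None.

    results: results returned by different hosts for the same workunit
    quorum:  number of matching results required to accept one
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


if __name__ == "__main__":
    # Volunteer project: hosts are untrusted, so the workunit is replicated
    # and a result is accepted only when two hosts agree.
    print(validate(["-7.2", "-7.2", "-9.9"], quorum=2))
    # VCSC: hosts are trusted campus machines, one result is enough.
    print(validate(["-7.2"], quorum=1))
```

With a quorum of 2 every workunit costs at least two CPU runs, so dropping to a quorum of 1 inside a trusted network roughly doubles the useful work per CPU.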
Examples of applications for drug design
VCSC increases computing resources by several orders of magnitude and makes it possible to apply existing software applications to many more objects.
Example 1. Virtual screening by docking of organic compounds to biotargets.
• 1 average CPU: ~1 000 000 compounds screened by rigid protein model docking with Autodock 4.0 in 100 days
• VCSC with 200 CPUs: ~1 000 000 compounds docked to a rigid protein model with Autodock 4.0, or ~50 000 compounds docked with a flexible protein model, in 1 day
Example 2. Molecular dynamics of protein-ligand complexes with explicit water molecule models
• 1 average CPU: 100 picoseconds of molecular dynamics for a 100-amino-acid complex of a protein with a small-molecule ligand, with explicit water and explicit salts, in 2 days
• VCSC with 200 CPUs: 100 trajectories of 100 ps each for one complex, or one 100 ps trajectory for each of 100 different protein-ligand complexes
GPU usage can increase computing resources ~10 to 50 times compared with CPUs
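The docking figures in the examples above can be checked with simple scaling arithmetic. The sketch below assumes ideal linear scaling across CPUs unless an efficiency factor is given; the function names and the efficiency parameter are illustrative, and the resulting "implied efficiency" is a reading of the slide's round numbers, not a figure the authors state.

```python
def wall_days(single_cpu_days, n_cpus, efficiency=1.0):
    """Wall-clock days for a job that takes `single_cpu_days` on one CPU."""
    return single_cpu_days / (n_cpus * efficiency)


def implied_efficiency(single_cpu_days, n_cpus, observed_days):
    """Fraction of ideal linear speedup implied by an observed run time."""
    return single_cpu_days / (n_cpus * observed_days)


def relative_cost(rigid_compounds, flexible_compounds):
    """How many times costlier flexible docking is per compound,
    given compound counts processed in the same time on the same pool."""
    return rigid_compounds / flexible_compounds


if __name__ == "__main__":
    print(wall_days(100, 200))               # ideal time for 1 000 000 compounds
    print(implied_efficiency(100, 200, 1))   # slide quotes about 1 day
    print(relative_cost(1_000_000, 50_000))  # flexible vs rigid docking
```

Ideal scaling would finish the rigid screen in half a day, so the quoted 1 day corresponds to about half of the pool being available on average, and the compound counts imply that flexible docking costs about 20 times more per compound than rigid docking.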
Virtual Campus Supercomputing Center creation process
I. Campus virtual supercomputing center BOINC server setup
• I.1 Evaluation of potential computing resources and server requirements
• I.2 BOINC server setup
II. Communication with computer owners and system administrators
III. Communication with computational scientists
• Identification of scientists with computationally intensive applications that map well to volunteer computing
• Porting of applications to BOINC
• Compilation of applications for CPU (Windows/Linux)
• Compilation of applications for GPU (Nvidia, ATI/AMD)
• BOINC options setup (priority system, task limits)
IV. VCSC maintenance
TOTAL TIME for VCSC: 2-3 person-months
PLANS (2 years):
1) GPU coding for applications and the BOINC client – a significant increase in computational power for virtual screening and molecular dynamics with explicit solvent models.
2) Implementation of several protein flexibility methods, such as the relaxed complex scheme and protein Monte Carlo dynamics.
3) Dynamic modeling of the signaling pathway network, which should result in interactive mapping and prediction of the most promising biotargets for the diseases of interest.
4) Drug design and biological trials of compounds for promising biotargets of the Wnt signaling pathway in the 1st year (~8-10 biotargets), and of Shh, Notch and other stem cell niche regulating proteins in the 2nd year (10-15 biotargets).
Funding required: $150 000/year
- Full-time salary for 4 persons, hosting, some software licenses
Funding alternatives currently under consideration:
- Grants for small entities - these require the project to be non-commercial (in collaboration with universities)
- Sales and services (profit sharing with volunteers; initial general business plan available upon request); an office is required, preferably in Maryland, US
Thank you for your attention!