The Performance Analysis of Molecular Dynamics of RAD GTPase with the AMBER Application in a Cluster Computing Environment
Universitas Indonesia
Heru Suhartanto, Arry Yanuar,
Toni Dermawan

Acknowledgments:
• Fang Pang Lin: for the invitation to SEAP 2010, Taichung, Taiwan, and for the introduction to Peter Arzberger
• Peter Arzberger: for the invitation to PRAGMA20 and the introduction to the audience

InGRID: INHERENT/INDONESIA GRID
Idea
• RI-GRID: National Grid Computing infrastructure development proposal, May 2006, by the Faculty of Computer Science, UI
• Part of the UI competitive grants (PHK INHERENT K1 UI): "Menuju Kampus Dijital: Implementasi Virtual Library, Grid Computing, Remote-Laboratory, Computer Mediated Learning, dan Sistem Manajemen Akademik dalam INHERENT" (Toward a Digital Campus: Implementation of Virtual Library, Grid Computing, Remote Laboratory, Computer Mediated Learning, and Academic Management System in INHERENT), Sep '06 to May '07
Objective:
• Developing a grid computing infrastructure with an initial computation capacity of 32 processors (~Intel Pentium IV) and 1 TB of storage.
• Hopes: the capacity will grow as other organizations join InGRID.
• Developing an e-Science community in Indonesia

Grid computing challenges: still developing, minimal human resources, dependence on grants.
Research challenges: reliable resource integration; management of rich natural resources; a wide area made up of thousands of islands; natural disasters (earthquakes, tsunamis, landslides, floods, forest fires, etc.).

The InGRID Architecture
[Figure: InGRID architecture diagram. Labels: users, inGRID portal, custom portal, INHERENT network, Globus head nodes, Linux/SPARC cluster, Linux/x86 cluster, Windows/x86 cluster, Solaris/x86 cluster, sites UI, I*, U*.]

Hastinapura Cluster
Node name      Head Node                         Worker Nodes                      Storage Node
Architecture   Sun Fire X2100                    Sun Fire X2100                    -
Processor      AMD Opteron 2.2 GHz (Dual Core)   AMD Opteron 2.2 GHz (Dual Core)   Dual Intel Xeon 2.8 GHz (HT)
RAM            2 GB                              1 GB                              2 GB
Hard disk      80 GB                             80 GB                             3 x 320 GB
Faculty of Computer Science (Fakultas Ilmu Komputer), Universitas Indonesia

Software on the Hastinapura Cluster
No.  Function              Application (version)
1    Compilers             gcc (3.3.5); g++ (3.3.5, GCC); g77 (3.3.5, GNU Fortran); g95 (0.91, GCC 4.0.3)
2    MPI 1 application     MPICH (1.2.7p1, release date: 2005/11/04 11:54:51)
3    Operating system      Debian/Linux OS (3.1 "Sarge")
4    Resource management   Globus Toolkit [2] (4.0.3)
5    Job scheduler         Sun Grid Engine (SGE) (6.1u2)

Molecular Dynamics Simulation
[Figure: MD simulation on the H5N1 virus [3]; diagram placing molecular dynamics simulation among computer simulation techniques.]

"MD simulation: computational tools used to describe the position, speed and orientation of molecules at a certain time" Ashlie Martini [4]

MD simulation purposes/benefits:
Image sources: [5], [6], [7]

Challenges in MD simulation
• O(N^2) time complexity
• Timesteps (simulation time)
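To make the two cost drivers above concrete, the following sketch counts the work in a naive all-pairs force evaluation. It is an illustration under assumptions, not AMBER's implementation: the helper names `pair_count` and `total_pair_evaluations` are hypothetical, and any timestep value passed in is the caller's choice.

```python
def pair_count(n_atoms: int) -> int:
    """Number of distinct atom pairs visited by a naive all-pairs force loop."""
    return n_atoms * (n_atoms - 1) // 2


def total_pair_evaluations(n_atoms: int, sim_time_ps: float, timestep_fs: float) -> int:
    """Pair evaluations for a whole run: pairs per step times number of steps."""
    n_steps = round(sim_time_ps * 1000.0 / timestep_fs)  # 1 ps = 1000 fs
    return pair_count(n_atoms) * n_steps
```

For the 2529-atom RAD GTPase system reported in leap.log, `pair_count(2529)` gives 3,196,656 pairs per step. Doubling the atom count roughly quadruples that number, while doubling the simulation time only doubles the number of steps.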

Focus of the experiment
• Study the effect of the MD simulation timestep on the execution time;
• Study the effect of the in vacuum and implicit solvent techniques with the generalized Born (GB) model on the execution time;
• Study scalability: how the number of processors improves the execution time;
• Study how the output file grows as the number of timesteps increases.
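The scalability question in the list above is commonly quantified with speedup and parallel efficiency. The sketch below is a minimal example, assuming wall-clock execution times have been measured for each processor count; the function names are chosen here for illustration and do not come from the experiment.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T(1) / T(p)."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Parallel efficiency E = S / p; 1.0 means ideal scaling."""
    return speedup(t_serial, t_parallel) / n_procs


def scalability_table(times_by_procs: dict[int, float]) -> list[tuple[int, float, float]]:
    """Return (processors, speedup, efficiency) rows, using the 1-processor run as baseline."""
    t1 = times_by_procs[1]
    return [
        (p, speedup(t1, t), efficiency(t1, t, p))
        for p, t in sorted(times_by_procs.items())
    ]
```

Calling `scalability_table` with a dictionary that maps each processor count (including 1) to its measured execution time yields one row per configuration, which makes diminishing returns easy to spot.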

Scope of the experiments
• Preparation and simulation with the AMBER packages
• Performance is based on the execution time of the MD simulation
• No parameter optimization for the MD simulation

Molecular Dynamics basic process [4]
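A typical MD step computes forces from the current positions, advances positions and velocities by one timestep, and records the state. The sketch below shows such a loop with the velocity Verlet integrator for a single particle in one dimension. The choice of integrator, the function name `velocity_verlet`, and the one-particle setup are assumptions for illustration; they are not taken from [4] or from AMBER.

```python
from typing import Callable


def velocity_verlet(
    x: float,
    v: float,
    force: Callable[[float], float],
    mass: float,
    dt: float,
    n_steps: int,
) -> list[tuple[float, float]]:
    """Integrate one particle in 1D and return the (position, velocity) trajectory."""
    trajectory = [(x, v)]
    f = force(x)  # forces from the initial positions
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory
```

As a quick check, a harmonic force such as `lambda x: -x` with unit mass should keep the total energy `0.5 * v**2 + 0.5 * x**2` nearly constant over many small steps.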

Flow of data in AMBER [8]

Flows in AMBER [8]
Preparatory programs
• LEaP is the primary program to create a new system in Amber, or to modify old systems. It combines the functionality of prep, link, edit, and parm from earlier versions.
• ANTECHAMBER is the main program from the Antechamber suite. If your system contains more than just standard nucleic acids or proteins, this may help you prepare the input for LEaP.

Flows in AMBER [8]
Simulation
• SANDER is the basic energy minimizer and molecular dynamics program. This program relaxes the structure by iteratively moving the atoms down the energy gradient until a sufficiently low average gradient is obtained.
• PMEMD is a version of sander that is optimized for speed and for parallel scaling. The name stands for "Particle Mesh Ewald Molecular Dynamics," but this code can now also carry out generalized Born simulations.

Flows in AMBER [8]
Analysis
• PTRAJ is a general purpose utility for analyzing and processing trajectory or coordinate files created from MD simulations.
• MM-PBSA is a script that automates energy analysis of snapshots from a molecular dynamics simulation using ideas generated from continuum solvent models.

The RAD GTPase Protein
RAD (Ras Associated with Diabetes) is a member of the RGK family of small GTPases, found in humans with type 2 diabetes. The crystal structure of RAD GTPase has a resolution of 1.8 angstrom.
The crystal structure of RAD GTPase is stored in a Protein Data Bank (PDB) file.
Ref: A. Yanuar, S. Sakurai, K. Kitano, and T. Hakoshima, "Crystal structure of human Rad GTPase of the RGK-family," Genes to Cells, vol. 11, no. 8, pp. 961-968, August 2006

RAD GTPase Protein
Reading from PDB with NOC:
The leap.log reading: number of atoms 2529

Parallel approach in MD simulation
Algorithms for the force function:
• Data replication
• Data distribution
• Data decomposition
  • Particle decomposition
  • Force decomposition
  • Domain decomposition
  • Interaction decomposition

Parallel implementation in AMBER
• Atoms are distributed among the available processors (Np)
• Each execution node / processor computes the force function
• Updating positions, computing partial forces, etc.
• Write to output files
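As a rough sketch of the first step above, the atoms can be split into near-equal contiguous blocks, one per processor. This is an assumption for illustration only: the slides do not say how AMBER assigns atoms to processors, and `partition_atoms` is a hypothetical helper.

```python
def partition_atoms(n_atoms: int, n_procs: int) -> list[range]:
    """Split atom indices 0..n_atoms-1 into n_procs contiguous, near-equal blocks."""
    base, extra = divmod(n_atoms, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional atom each.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks
```

With the 2529 atoms of the RAD GTPase system and Np = 8, this gives one block of 317 atoms and seven blocks of 316, so no processor holds more than one atom above the average.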

Experiment results

Execution time with In Vacuum

Execution time for In Vacuum

Execution time for Implicit Solvent with GB Model

Execution time for Implicit Solvent with GB Model

Execution time comparison between In Vacuum and Implicit Solvent with GB model

The effect of the number of processors on MD simulation with In Vacuum

The effect of the number of processors on MD simulation with Implicit Solvent with GB Model

Output file sizes as the simulation time grows: in vacuum
Simulation time (ps)   Output file size (bytes) by number of processors      Size in MB (Megabytes)
                       1            2            4            8
100                    6,148,096    6,148,096    6,148,096    6,148,096      5.86
200                    12,292,096   12,292,096   12,292,096   12,292,096     11.72
300                    18,440,192   18,440,192   18,440,192   18,440,192     17.59
400                    24,584,192   24,584,192   24,584,192   24,584,192     23.45
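The MB column is consistent with dividing the byte count by 1024 × 1024. The minimal sketch below reproduces that conversion from the table values; `bytes_to_mb` and `SIZES_BYTES` are names chosen here for illustration.

```python
BYTES_PER_MB = 1024 * 1024

# Output file size in bytes per simulation time in ps (identical for 1, 2, 4 and 8 processors).
SIZES_BYTES = {100: 6_148_096, 200: 12_292_096, 300: 18_440_192, 400: 24_584_192}


def bytes_to_mb(n_bytes: int) -> float:
    """Convert bytes to megabytes (1 MB = 1,048,576 bytes), rounded to two decimals."""
    return round(n_bytes / BYTES_PER_MB, 2)


sizes_mb = {t: bytes_to_mb(b) for t, b in SIZES_BYTES.items()}
```

The resulting values match the table (5.86, 11.72, 17.59 and 23.45 MB), that is, the output grows by roughly 5.86 MB per 100 ps regardless of the number of processors.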

Output file sizes as the simulation time grows: implicit solvent with GB model
Simulation time (ps)   Output file size (bytes) by number of processors      Converted to MB (Megabytes)
                       1            2            4            8
100                    6,148,096    6,148,096    6,148,096    6,148,096      5.86
200                    12,292,096   12,292,096   12,292,096   12,292,096     11.72
300                    18,440,192   18,440,192   18,440,192   18,440,192     17.59
400                    24,584,192   24,584,192   24,584,192   24,584,192     23.45

Gromacs on the Pharmacy Cluster
This cluster was built to back up the Hastinapura Cluster, which has storage problems.

Network Structure of the Pharmacy Cluster
[Figure: network diagram. Labels: nodes grid01, grid03, grid04, grid05, grid06; Router Farmasi (Pharmacy router); Gigabit Ethernet switch; web server; database server; JUITA (Jaringan Universitas Indonesia Terpadu, the Universitas Indonesia integrated network).]

Software
• MPICH2 1.2.1
• Gromacs 4.0.5 (installed)

Installation Steps
• Installing all nodes from the Ubuntu CD
• Configuring NFS (Network File System)
• Installing MPI
• Installing the Gromacs application

Problems
• Everything worked fine for the first few months, but after the nodes had been in use for 5 months, they often crashed while running simulations.
• "Crashed" means, for example, that when we run a Gromacs simulation on 32 nodes (the cluster now consists of 6 four-core PCs), the execution nodes collapse one by one after a while.
• Unreliable electrical supplies

Sources of problems?
• Network configuration, or
• NFS configuration, or
• Hardware problem (NIC, switch), or
• Processor overheating

Problems: Error Log
Fatal error in MPI_Alltoallv: Other MPI error, error stack:
MPI_Alltoallv(459)................: MPI_Alltoallv(sbuf=0xc81680, scnts=0xc60be0, sdispls=0xc60ba0, MPI_FLOAT, rbuf=0x7f7821774de0, rcnts=0xc60c60, rdispls=0xc60c20, MPI_FLOAT, comm=0xc4000006) failed
MPI_Waitall(261)..................: MPI_Waitall(count=8, req_array=0xc7ad40, status_array=0xc6a020) failed
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1709).......: Communication error
Fatal error in MPI_Alltoallv: Other MPI error, error stack:
MPI_Alltoallv(459)................: MPI_Alltoallv(sbuf=0x14110e0, scnts=0x13f0920, sdispls=0x13f08e0, MPI_FLOAT, rbuf=0x7f403eb4c460, rcnts=0x13f09a0, rdispls=0x13f0960, MPI_FLOAT, comm=0xc4000000) failed
MPI_Waitall(261)..................: MPI_Waitall(count=8, req_array=0x140c7b0, status_array=0x1408c90) failed
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
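When many such traces accumulate, it can help to tally which MPI call failed and which innermost cause was reported. The sketch below uses only the Python standard library. The function name `summarize_mpi_errors` and the assumption that the log holds one stack frame per line are illustrative; this is not a tool shipped with MPICH2 or Gromacs.

```python
import re
from collections import Counter

# "Fatal error in <MPI call>:" opens each trace.
FATAL_RE = re.compile(r"Fatal error in (\w+):")
# A stack frame looks like "<function>(<line>)......: <message>".
FRAME_RE = re.compile(r"^(\w+)\(\d+\)\.*:\s*(.*)$")


def summarize_mpi_errors(log_text: str) -> tuple[Counter, Counter]:
    """Count failed MPI calls and reported causes in an MPICH-style error log."""
    failed_calls: Counter = Counter()
    causes: Counter = Counter()
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        fatal = FATAL_RE.search(line)
        if fatal:
            failed_calls[fatal.group(1)] += 1
            continue
        frame = FRAME_RE.match(line)
        if frame:
            message = frame.group(2)
            # Empty frames and "<call>(...) failed" frames carry no cause text.
            if message and not message.endswith("failed"):
                causes[message] += 1
    return failed_calls, causes
```

Applied to the log above, the only cause text such a tally would find is "Communication error" from the TCP connection polling frame, which points toward the network or a dying peer node rather than the simulation input.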

Next targets
• We are currently running experiments on GPUs as well; the results will be available soon.
• Solving the cluster problems (considering Rocks).
• Clustering the PCs in 2 student labs (60 and 140 nodes) and running experiments during nights and holidays.
• Rebuilding the grid.
• Sharing some resources with PRAGMA.
Your advice is very important and useful. Thank you!

References
[1] http://www.cfdnorway.no/images/PRO4_2.jpg
[2] http://sanders.eng.uci.edu/brezo.html
[3] http://www.atg21.com/FigH5N1jcim.png
[4] A. Martini, "Lecture 2: Potential Energy Functions", 2010, [Online]. Available: http://nanohub.org/resources/8117. [Accessed: 18 June 2010].
[5] http://www.dsimb.inserm.fr/images/Binding-sites_small.png
[6] http://thunder.biosci.umbc.edu/classes/biol414/spring2007/files/protein_folding(1).jpg
[7] http://www3.interscience.wiley.com/tmp/graphtoc/72514732/118902856/118639600/ncontent
[8] D. A. Case et al., "AMBER 10", University of California, San Francisco, 2008, [Online]. Available: http://www.lulu.com/content/paperback-book/amber-10-users-manual/2369585. [Accessed: 11 June 2010].