GPU Accelerated Molecular Dynamics Simulation ...

University of Illinois at Urbana-ChampaignBeckman Institute for Advanced Science and TechnologyNIH Resource for Macromolecular Modeling and BioinformaticsTheoretical and Computational Biophysics Group

GPU Accelerated Molecular Dynamics Simulation,Visualization, and Analysis

Authors:Ivan Teo

Juan PerillaRezvan ShahoeiRyan McGreevyChris Harrison

May 19, 2014

Please visit www.ks.uiuc.edu/Training/Tutorials/ to get the latest version of this tutorial, to obtainmore tutorials like this one, or to join [email protected] mailing list for additionalhelp.

GPU accelerated molecular dynamics simulations, visualization, and analysis 2

Contents

1. Introduction 31.1. Introduction to GPU Computing. . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2. GPU Computing in NAMD and VMD. . . . . . . . . . . . . . . . . . . . . . . . 3

2. Introduction to Simulations using GPUs 42.1. How to run NAMD using GPUs.. . . . . . . . . . . . . . . . . . . . . . . . . . . 42.2. Looking at the System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42.3. Basic benchmarking of NAMD performance.. . . . . . . . . . . . . . . . . . . . 52.4. Simulating 2.3 million atoms on CPUs and GPUs.. . . . . . . . . . . . . . . . . . 62.5. Comparison of CPU and GPU performance.. . . . . . . . . . . . . . . . . . . . . 7

3. GPU Enhanced Visualization 83.1. Rendering surfaces the “old” way.. . . . . . . . . . . . . . . . . . . . . . . . . . 83.2. Introducing GPU-acceleratedQuickSurf . . . . . . . . . . . . . . . . . . . . . . . 93.3. Usefulness of Surface Representations. . . . . . . . . . . . . . . . . . . . . . . . 9

4. GPU Accelerated Molecular Dynamics (aMD) 124.1. “Accelerated” Molecular Dynamics: Theory.. . . . . . . . . . . . . . . . . . . . . 12

4.1.1. Theoretical background. . . . . . . . . . . . . . . . . . . . . . . . . . . 124.1.2. Compute the aMD parameters from cMD simulations.. . . . . . . . . . . 14

4.2. Using aMD & GPUs for long-timescale molecular dynamics.. . . . . . . . . . . . 15

5. GPU Augmented Analysis 175.0.1. Analysis of GGBP trajectories. . . . . . . . . . . . . . . . . . . . . . . . 17

5.1. Calculating g(r) using GPUs.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18


1. Introduction

This tutorial will demonstrate how to use features in NAMD and VMD to harness the computationalpower of graphics processing units (GPUs) to accelerate simulation, visualization and analysis. Youwill learn how to drastically improve the efficiency of your computational work, achieving largespeedups over CPU only methods. You will explore ways of investigating large multimillion atomsystems or long simulation timescales through easy to use features of NAMD and VMD on readilyavailable hardware. Please note that completeing the tutorial examples will require a computerwith a CUDA-capable NVIDIA GPU. Please see Section2.1. for more information.

1.1. Introduction to GPU Computing

Over the past decade, physical and engineering practicalities involved in microprocessor designhave resulted in flat performance growth for traditional single-core microprocessors. Continuedmicroprocessor performance growth is now achieved primarily through multi-core designs andthrough greater use of data parallelism, with vector processing machine instructions. Currently,the year-to-year growth in the number of processor cores roughly follows the growth in transistordensity predicted by Moore’s Law, doubling every two years. At the forefront of this parallelcomputation revolution are graphics processing units (GPUs), traditionally used for visualization.

Graphics workloads contain tremendous amounts of inherent parallelism. As a result, GPUhardware is designed to accelerate data-parallel calculations using hundreds of arithmetic units.The individual GPU processing units support all standard data types and arithmetic operations,including 32-bit and 64-bit IEEE floating point arithmetic. State-of-the-art GPUs can achievepeak single-precision floating point arithmetic performance of 2.0 trillion floating point opera-tions per second (TFLOPS), with double-precision floating point rates reaching approximately halfthat speed. GPUs also contain large high-bandwidth memory systems that achieve bandwidths ofover 200 GB/sec in recent devices. The general purpose computational capabilities of GPUs areexposed to application programs through the two leading GPU programming toolkits: CUDA [6],and OpenCL [5].

1.2. GPU Computing in NAMD and VMD

NAMD and VMD utilize GPUs to accelerate an increasing number of their most computation-ally demanding functions, resulting in significant speed increases. Many algorithms involved inmolecular modeling and computational biology applications can be adapted to GPU acceleration,commonly increasing performance by factors ranging from 10× to 30× faster, and occasionallyas much as 100× faster, relative to contemporary multi-core CPUs [10, 11, 9]. GPU-accelerateddesktop workstations can now provide performance levels that used to require a cluster, but withoutthe complexity involved in managing multiple machines or high-performance networking. Users ofNAMD and VMD can now perform many computations on laptops and modest desktop machineswhich would have been nearly impossible without GPU acceleration. For example, NAMD userscan easily perform simulations on large systems containing hundreds of thousands and even mil-lions of atoms thanks to GPUs. The time-consuming non-bonded calculations on so many atomscan now be performed on a GPU at 20 times the speed of a single CPU core. VMD users cansmoothly and interactively animate trajectories using visualization techniques such as the displayof molecular orbitals or QuickSurf for surface representations. In the case of visualizing molecularorbitals, VMD’s GPU algorithm obtains a 125× speedup over the CPU.


2. Introduction to Simulations using GPUs

The performance benefit NAMD’s GPU acceleration feature is most clearly demonstrated by sim-ulation of large systems, e.g. with> 105 atoms, with sufficient work to keep the GPU busy [7].

This section will guide you through the simulation of such a large system with and without aGPU, for the purpose of comparison between the two cases. For this section, please use as yourworking directorygpu-tutorial/gpu-tutorial data/1-largeSims/ .

2.1. How to run NAMD using GPUs.

To benefit from GPU acceleration you will need a CUDA build of NAMD [10, 11, 9] and a recenthigh-end NVIDIA video card. CUDA builds will not function without a CUDA-capable GPU. Youwill also need to be running the NVIDIA Linux driver version 270.41.19 or newer (released Linuxbinaries are built with CUDA 4.0, but can be built with newer versions as well).

Finally, the libcudart.so.4 included with the binary (the one copied from the version of CUDAit was built with) must be in a directory in your LDLIBRARY PATH before any other libcudart.solibraries. For example, when running a multicore binary (recommended for a single machine):

setenv LD_LIBRARY_PATH ".:$LD_LIBRARY_PATH"(or LD_LIBRARY_PATH=".:$LD_LIBRARY_PATH"; export LD_LIBRARY_PATH)./namd2 +idlepoll +p4 <configfile>

For more information on running NAMD on the GPU, please see theNAMD User’s Guide

2.2. Looking at the System

You will now proceed to examine the example system for this section. The system is comprisedof a mechanosensitive channel of small conductance (MscS) embedded in a lipid bilayer andsolvated in a water box of dimensions 324A× 324A× 230A. The MscS allows outflow of ionswhen the cell experiences osmotic shock, while maintaining charge balance across the membraneand selectively retaining crucial ions such as glutamate. The diffusive behavior of ions around andthrough the MscS is hence a subject of considerable scientific interest.

1 Open VMD. Go to ‘TkConsole’ from ‘Extensions’

2 In the TkConsole, navigate to the folder containing the files for section 2.

3 Next, open the PDB file of the system by typing in the TkConsole:

mol load pdb mscs.pdb

4 Take some time to inspect the system. Observe that some useful information about the systemhas been loaded in the command terminal window. In particular, there are approximately 2.3million atoms.

5 Close VMD.

http://www.ks.uiuc.edu/Research/namd/2.9/ug/node88.html


2.3. Basic benchmarking of NAMD performance.

Before starting actual runs, it is advisable to take stock of your simulation requirements andestimate how much running time it would take to finish running the simulation given the computingresources at your disposal. Benchmarking serves as a straightforward way of doing so. In themidst of any simulation run, NAMD measures the average rate of calculation over the elapsedsimulation time. The rate of calculation depends on many factors, among which are the systemsize, configuration parameters, and the computational resources allocated to the simulation. Thusit is more sensible to empirically measure the rate over elapsed timesteps for each simulation thanto perform an extremely complicateda priori calculation of the rate. Here, you will perform shortequilibration runs of the MscS system and subsequently extract benchmark information from thegenerated logfiles.

1 Let us begin by taking a look at the NAMD configuration file for the benchmark run. In thefolder for this section, use your favorite text editor and openbenchmark cpu.conf .

2 Notice the small number of timesteps near the end of the file:run 1000 . NAMD performsbenchmark measurements after 400 timesteps. However, averaging over several benchmarksgives a more reliable estimate.

3 Now close the editor. Perform the benchmark run on just CPUs by typing in the commandprompt:

namd2 benchmark_cpu.conf > benchmark_cpu.log

4 Create the configuration file for the GPU benchmark by openingbenchmark cpu.confwith a text editor and settingoutputName to benchmark gpu . Exit and save the fileasbenchmark gpu.conf . Note the superficial difference between the CPU and GPUconfiguration files; the key procedural difference between running with and without GPUs isinstead in how NAMD is called on the command prompt.

5 Perform the benchmark run on CPUs together with a GPU by typing in the command prompt:

namd2 +idlepoll benchmark_gpu.conf > benchmark_gpu.log

6 Examples of benchmark cpu.conf and benchmark gpu.conf , as wellas benchmark cpu.log and benchmark gpu.log have been saved ingpu-tutorial/gpu-tutorial data/1-largeSims/examples/ . In caseof time constraints or failure in a previous step, please transfer the example files to yourworking directory and use them as you proceed.

7 After each run has finished, the benchmark information can be extracted from the respectivelogfiles. On a Linux or Mac, this can be easily done by typing into the command prompt:

grep Benchmark benchmark_cpu.log

orgrep Benchmark benchmark_gpu.log


You should see a line(s) of text that looks like:Info: Benchmark time: 12 CPUs 5.99984 s/step 34.7213 days/ns10434.7 MB memory

8 Based on thes/step anddays/ns numbers, approximately how long would it take, withand without the GPU, to run, say,106 timesteps? What about 10 ns?

2.4. Simulating 2.3 million atoms on CPUs and GPUs.

You are now ready to perform actual equilibration runs on the MscS system. Due to timeconstraints, you will perform 2-hour (clock time) runs.(Feel free to perform longer runs if timeallows.) Judging from the benchmarks obtained from the previous system, how many ns do youthink you would be able simulate, both with and without the GPU? Record your estimate forcomparison with the actual results later.

1 The benchmark configuration is virtually identical to that of the actual run. Hence, you canprepare the configuration file for the actual run simply by editing the benchmark configura-tion file. Use a text editor to openbenchmark cpu.conf .

2 SetoutputName to equil cpu , then scroll down to the bottom of the file and change thenumber of timesteps:

run 1000000

Of course,106 should exceed the number of timesteps in your estimate. However, the simula-tion can be halted in 2 hours for you to view the results. In actual runs, you should set the numberof timesteps according to your benchmark estimates.

3 Save the edited configuration file asequil cpu.conf .

4 Create also, frombenchmark gpu.conf , the GPU configuration fileequil gpu.confusing the same procedure in the preceding steps.

5 Run the simulation with and without the GPU by typing in the command prompt:

namd2 equil_cpu.conf > equil_cpu.log &namd2 +idlepoll equil_gpu.conf > equil_gpu.log &

After each of these commands, a process id should have beenprinted to the terminal. If you are running NAMD on a computerrunning linux or OSX, you can now use the linux ”at” commandto kill these two processes in 2 hours. To do this type at "now +2 hours" . This command will give you a prompt at >, at whichyou should enter at >kill pid , where pid is the process idprinted after starting the namd run. You can now exit the promptwith ”ctrl-d”. You should do this process for both the cpu and gpusimulations. This will set up jobs to kill the namd simulations 2hours from the time you entered the command..

6 Examples of the .conf and .log files in this section have been saved ingpu-tutorial/gpu-tutorial data/1-largeSims/examples/ . In addition,you will also find the trajectory filesequil cpu.dcd andequil gpu.dcd in the samelocation should you wish to visualize them in VMD.


2.5. Comparison of CPU and GPU performance.

1 Use a text editor to open the logfilesequil cpu.log andequil gpu.log . How manytimesteps were run in each case?

2 Next, usegrep to inspect the benchmarks in each logfile as you did for the benchmark runs.How do they compare to your previous benchmark results?

3 Based on your observations, how much faster did the GPU simulation run as compared to theCPU simulation? Do you think the same performance boost would be observed for a smallsystem of, say, 5000 atoms?


3. GPU Enhanced Visualization

In addition to being computationally demanding to simulate, large biomolecular structures can bedifficult to visualize as well. Not only do large systems push the abilities of the GPU to displaythe structures, but displaying structures such that interesting details can be easily discerned is alsoa challenge.

Molecular surface visualization allows researchers to see where structures are exposed to sol-vent or contact each other, and to view the overall architecture of large biomolecular complexessuch as trans-membrane channels and virus capsids. VMD is capable of calculating surfacesquickly via the GPU-accelerated QuickSurf representation, which achieves performance ordersof magnitude faster than the conventional Surf and MSMS representations. Hence, users can eas-ily set up interactive displays of molecular surfaces for multi-million atom complexes, e.g. largevirus capsids. Furthermore, QuickSurf enables smooth interactive animation of moderate-sizedbiomolecular complexes consisting of a few hundred thousand to a million atoms.

3.1. Rendering surfaces the “old” way.

In this section, you will be acquainted with surface representations using non-GPU methods.

1 Open VMD. Go to ‘TkConsole’ from the ‘Extensions’ tab on the top of VMD Main menu.

2 Ensure your working directory is the same as in Section 2.gpu-tutorial/gpu-tutorial data/1-largeSims/ In the TkConsole type:

mol load psf mscs.psf pdb mscs.pdb

3 In the selected atoms field, type “segname PA PB PC PD PE PF PG”.

4 For the Drawing Method, choose ‘Surf’ from the drop-down menu. Notice how long it takesto calculate the surface and apply it to the structure. This surface is rather slow in bothgeneration and display for systems over several hundred atoms. The Surf calculation is quiteexact and will show complete detail even when it isn’t needed. The use of disk space as aninterprocess communications medium takes up about half of the run time. In addition, theuser’s options are limited to changing the radius of the probe used in calculating the surfaceand the ability to render a wireframe representation of the surface.

5 If displaying one frame using Surf is slow, playing a trajectory with Surf will be imprac-tical. For later comparison, add the GPU equilibration trajectory from Section2.. Ifyou did not generate this file, one has been provided in: gpu-tutorial/gpu-tutorialdata/1-largeSims/examples/

mol addfile equil_gpu.dcd

6 Now attempt to play the trajectory or even just skip one frame forward from the VMD Mainwindow.

7 There is another surface representation, MSMS which is faster than Surf and gives the userslightly more options. You can try using MSMS by selecting it from the Drawing Methodmenu. If the representation fails to load, try selecting fewer segments, e.g. ‘segname PA’.MSMS may fail because while it can be faster than Surf, it is still quite limited by the size ofthe system it can work on.


8 Alternatively we could use space-filling models to represent our structure such as CPK orVDW, the latter also giving us an idea of the volume and surface of the protein. Try ap-plying these Drawing Methods and subsequently rotating the structure. Notice how theserepresentations are still slower than we would like.

3.2. Introducing GPU-accelerated QuickSurf .

Figure 1: MscS in membrane with QuickSurf representation. Ions are represented using VdW.

1 Now select the QuickSurf representation from the Drawing Method menu. Surprised by howfast the representation loaded? As you can see, QuickSurf lives up to its name by using thecomputational power of GPUs to calculate quickly the surface representation.

2 In addition to being fast, QuickSurf gives the user many useful options for controlling therepresentation. We can change the Radius Scale, Density Isovalue or Grid Spacing indi-vidually, or use the Resolution slider which will change them in tandem to give the desiredresolution. Try adjusting the resolution and see how quickly the representation responds.This can be quite useful for changing on the fly from a high resolution, when you want tosee detail, to a low resolution when you want the detail obscured.

3 Aside from faster rendering of surfaces than the traditional Surf and MSMS methods, Quick-Surf also allows us to view an entire trajectory with a surface representation. Try playing thetrajectory as you attempted before. Note that you can also adjust the resolution even whilethe trajectory is playing.

3.3. Usefulness of Surface Representations

Having a fast surface representation is great, but why might we want to visualize surfaces in the firstplace? As an example, look at the structure of the MscS in the QuickSurf. It consists of a seven-fold


Figure 2: MscS with transparent QuickSurf membrane.

symmetric heptamer forming a balloon-like cytoplasmic domain attached to the transmembranedomain. There are several openings into the protein interior - the transmembrane channel, sevenidentical windows lining the balloon structure, and one at the C-terminus on the bottom of theballoon structure. Using a QuickSurf representation, you can quickly get a rough impression ofhow the sizes of these openings compare with one another. In particular, you can immediately tellthat the C-terminus window is much smaller than the others. In fact, it is the only window whichis impermeable to ions. The capability of running trajectories in QuickSurf makes it easy to see ifwindows are becoming wider or narrower, making it more apparent if, for example, a channel isopening or closing.Next, we will modify the representation of different parts of the system to investigate how we canreduce the detail of certain components without removing them entirely.

1 Go to the Graphical Representation menu and create a selection “segname PA PB PC PD PEPF PG” in NewCartoon Drawing Method.

2 Create another representation of only “lipids”. by clicking the ‘Create Rep’ button and typing‘lipids’ into the Selected Atoms field.

3 Select QuickSurf for the lipids representation.

4 Under the “Draw Style” tab you can find an option named “Material”. Select “Transparent”for the lipids representation.


5 Tune the Resolution of QuickSurf to somewhere between 1.5 and 2. Notice how the lipiddetail is reduced while still giving you a clear representation of the volume and overall shapeof the membrane. Zoom into the lipid bilayer and notice how you can clearly see the trans-membrane portion of the protein without losing the visualization of the lipids. An exampleof this view can be seen in Fig.2

While studying ion dynamics in the MscS system, you probably would not want to be encum-bered by the fine details of lipid conformations. Using QuickSurf, you can reduce the detail levelof the lipid bilayer. In general, the QuickSurf representation helps you to reduce the complexity ofparts of the system without removing them entirely.

Figure 3: Front and side views of the Satellite Tobacco Mosaic Virus (STMV) using QuickSurfrepresentation for the viral capsid. Note the hole in the capsid now visible due to surface represen-tation.

Another important aspect of visualization is discovery of features and phenomena which wouldotherwise have been hidden. The right representation can drastically change how you perceive thestructure being modeled. For example, the capsid of the Satellite Tobacco Mosaic Virus (STMV)can be represented using QuickSurf to visualize the volume occupied by the proteins without un-ecessary detail.

In addition to being visually appealing, when using QuickSurf we now notice that the capsidcontains small holes in Fig.3 which would have been much more difficult or even impossible tosee with other representations.


4. GPU Accelerated Molecular Dynamics (aMD)

Accelerated molecular dynamics (aMD) is an all-atom enhanced sampling technique that enablesthe study of large conformational changes in proteins which normally occur on millisecond orlonger time scales. For example, using the aMD method on GPUs the bovine pancreatic trypsininhibitor was simulated for 500 ns.The 500 ns aMD simulation sampled the same amountofconformational space as an unbiased 1-millisecond MD simulation.[8].

Our model system for this section is the D-Glucose/D-Galactose binding protein (GGBP)which exhibits a “Venus Flytrap” architecture present in several other proteins like the Lac re-pressor, G-protein-coupled receptors, ion channels, and GABA receptors. GGBP, as several otherPerisplasmic binding proteins (PBP) can exist in open (ligand-free) and closed (ligand-bound)states. Using the aMD and Generalized Born implicit solvent (GBIS) methods with GPU accel-eration, you will perform in this section long-timescale simulations of D-Glucose/D-Galactosebinding protein in order to sample the conformational change of its “Venus Flytrap” hinge bendingmotion.

4.1. “Accelerated” Molecular Dynamics: Theory.

Figure 4: D-Glucose/D-Galactose binding pro-tein (GGBP) mediates chemotaxis toward and ac-tive transport of glucose and galactose in somebacterial species [1]. The apo state (blue) showsa 31◦ angle difference between theα andβ do-mains of GGBP when compared with the holostate (red).

Accelerated molecular dynamics (aMD) [4] isan enhanced-sampling method thatacceleratesthe sampling of conformational space by ef-fectively reducing the height of energy bar-riers that separate different states of a sys-tem. The method modifies the potential en-ergy landscape by applying a boost potentialto raise energy wells below a certain thresh-old level, while leaving those above this levelunaffected. As a result, barriers separating ad-jacent energy basins are effectively reduced,allowing the system to sample conformationalspace that would otherwise be inaccessible in aclassical MD simulation, or require extensivelylong simulations at least an order of magni-tude longer than the corresponding aMD simu-lations. The threshold level energy, referred toasE, will be an important input parameter foryour aMD simulations.

4.1.1. Theoretical background

In Hamelberg and McCammon’s aMD method [4], a system’s potential energy is monitored until itfalls below a threshold energy,E. A boost potential is then added, such that the modified potential,V ′(r), is related to the original potential,V (r), via

V ′(r) = V (r) + ∆V (r), (1)


where∆V (r) is the boost potential,

∆V (r) =

{0 V (r) ≥ E(E−V (r))2

α+E−V (r) V (r) < E.(2)

As shown in the following figure, the portion of the potential surface below the threshold en-ergyE is elevated by the boost potential. The acceleration factor,α, modulates the shape of theresulting elevated potential, permitting smooth continuous splining-together of the new elevatedpotential and the old potential. This approach prevents discontinuities at points where the poten-tials are joined. Such discontinuities lead to improper, even imposssible, energies and dynamics.Additionally, theα parameter modulates theflexibility of the boosted potential. By splining thepotentials and modulating the curvature of the boosted potential, discontinuities are effectivelyeliminated and a smooth differentiable potential energy surface containing the boosted potential isgenerated. Note thatα cannot be set to zero, otherwise the derivative of the modified potential isdiscontinuous.

Figure 5: Schematics of the aMD method. When the original potential (thick line) falls below athreshold energyE (dashed line), a boost potential is added. The modified energy profiles (thinlines) have smaller barriers separating adjacent energy basins. Two parameters,E andα, controlthe portion of the affected potential landscape and the shape of the modified potential, respectively.

From an aMD simulation, the ensemble average,〈A〉, of an observable,A(r), can be calculatedusing the following reweighting procedure:

〈A〉 =〈A(r) exp(β∆V (r))〉∗

〈exp(β∆V (r))〉∗, (3)

in whichβ=1/kBT , and〈· · · 〉 and〈...〉∗ represent the ensemble average in the original and theaMD ensembles, respectively.

Currently, aMD can be applied in three modes in NAMD: dihedral boost (accelMDdihe ),total energy boost, and dual boost (accelMDdual ) modes [12]. In dihedral boost mode(accelMDdihe ) the boost energy is applied only to the dihedral potentials. Intotal energy boostmode, the default mode in NAMD, the boost potential is applied to the total potential of the simu-lation. In thedual boost mode(accelMDdual ) [4] two independent boost energies are applied,one to the dihedral potential and a second to the potential resulting from the difference of the totalpotential and the dihedral potential, ie (VT − Vdihe).


Most long-timescale motions of proteins and large biomolecules correspond to large motions,and these in-turn are often thought linked to the slower vibrational modes of the system. Of thebonded potentials in MD forcefields that correspond to internal degrees-of-freedom of the system(bonds, angles, dihedrals and impropers), dihedrals are typically the lowest energy and slowestmoving of the internal degrees-of-freedom. By accelerating, through aMD, the dihedral poten-tials of a system, the key internal degrees-of-freedom responsible for long-timescale motions areaccelerated to produce these large, slow motions in a shorter simulation timescale. With suchan understanding in mind, thedihedral boost modeis the most frequently used mode of aMD inNAMD.

However, boosting only the dihedrals in protein systems that also exhibit large diffusive mo-tions of a domain may not be sufficient to accelerate that protein’s motions. To overcome thischallenge, thedual boost modemay be used in which the internal dihedrals are boosted as well asthe diffusive degrees of freedom of the system, but with two seperate boost potentials, to acceleratelarge, diffusive, long-timescale motions of protein domains.

Citing accelerated MD in NAMD. Please include the following referencein publications that include usage of NAMD’s GPU and aMD implementa-tions:

• Accelerating Molecular Modeling Applications with Graphics Pro-cessors, John E. Stone, James C. Phillips, Peter L. Freddolino,David J. Hardy, Leonardo Trabuco and Klaus Schulten. J. Comp.Chem., 28:2618-2640, 2007.,

• Accelerated Molecular Dynamics: A Promising and Efficient Simu-lation Method for Biomolecules, D. Hamelberg, J. Mongan, and J. A.McCammon. J. Chem. Phys., 120:11919-11929, 2004.

• Implementation of Accelerated Molecular Dynamics in NAMD,Y. Wang, C. Harrison, K. Schulten, and J. A. McCammon,Comp. Sci. Discov., 4:015002, 2011.

4.1.2. Compute the aMD parameters from cMD simulations.

Here you will perform a dihedral boost aMD simulation and need to obtain the following input pa-rameters: the dihedral threshold energy (accelMDE ) and dihedral alpha value (accelMDalpha ).To determine these values for your aMD simulation, you must first perform a short classical MDsimulation of your system. Typically 500 ps to a few ns are sufficient.

After completion of the classical MD simulation, calculate the mean of the dihedral energies.This mean energy value is needed to calculate the aMD input parameters. Determining the meandihedral energy can be done by hand, for example by extracting the values into a plain text fileand evaluating in xmgrace or your favorite spreadsheet software, or by using theNAMD Plotplugin in VMD in conjunction with xmgrace. In our example, the average dihedral energy froma classical MD simulation for GGBP was found to be 871 kcal mol−1. Now calculate the meandihedral energy for your system.

1 Navigate togpu-tutorial/gpu-tutorial data/3-aMD/ , where the necessaryfiles for the simulation are stored.

2 Start the simulation by entering.


namd2 +p8 +idlepoll 2fw0.namd > 2fw0_autopsf.log

3 At the end of the simulation, start a VMD session and open the TkConsole. Ensure that youare in the directory stated above.

4 Calculate the average energies. Open Extensions -> Analysis -> NAMD Plot. Click on File-> Select NAMD Log File, then select the logfile from your simulation. Click on DIHED.Click on File -> Plot Selected Data. On the plot window, pull down the ‘File’ menu andexport the file to a format that you can work with, such as Xmgrace. You can find exampledatasets in the.txt files in theexamples directory. Open the exported format using itscorresponding software and calculate the mean of the dataset.

5 Calculate the dihedralE andα parameter input values by plugging the mean dihedral en-ergy values into the below equations [3, 2]. These equations provide the values to use foraccelMDE andaccelMDalpha :

Vdihe = 871 kcal mol−1+(4 kcal mol−1 residue−1×306 solute residues) = 2095 kcal mol−1,(4)

αdih =15×

(4 kcal mol−1 residue−1 × 306 solute residues

)= 244.8 kcal mol−1. (5)

4.2. Using aMD & GPUs for long-timescale molecular dynamics.

1 In this exercise you will perform an aMD simulation with GPUs. Using the parameters youpreviously calculated, add a new section to your NAMD configuration containing the belowaMD keywords with your corresponding values. Our example values are shown below forcomparison.

You can find an example of the modified configuration file atexamples/2fw0 aMD.namd.

accelMD onaccelMDdihe onaccelMDE 2095accelMDalpha 244.8accelMDOutFreq 1000

2 Generalized Born implicit solvent (GBIS) is a fast but approximate method for calculatingmolecular electrostatics in solvent as described by the Poisson Boltzmann equation whichmodels water as a dielectric continuum. GBIS enables the simulation of atomic structureswithout including explicit solvent water. In aMD simulations involving large motions, elim-inating the water can sometimes play a critical role in accelerating those large motions as theresistance and viscous drag of explicit water is absent when using the GBIS model.

In addition, the elimination of explicit solvent actually accelerates simulations by eliminat-ing the cost of calculating bulk water; however this speedup is somewhat hampered by the


increased computational complexity of the implicit solvent electrostatic calculation and alonger interaction cutoff.

We will set the ion concentration to 300mM using theionConcentration keyword.

In our example, we turn on GBIS and set the salt concentration to 300mM with the followingcommands:

# GBIS ParametersGBIS onionConcentration 0.3

3 When using GBIS, you should not usePMEbecause it is not compatible with GBIS. Peri-odic boundary conditions are supported but are optional. You will also need to increase thecutoff ; 16-18A is a good starting value but some systems may vary.

# Nonbonded Parametersswitching onswitchdist 15cutoff 16pairlistdist 17.5Pme off

4 Select anexclude value appropriate for the forcefield you will be using; in this case,CHARMM. When using GBIS, interactions are not excluded regardless of the type of forcefield used, thus any value forexclude will work without affecting GBIS. You shouldchooseexclude values appropriate for the force field you will be using. If you are un-sure of the value to use, examine the input config for your earlier classical MD simulation.Notice that theexclude values will therefore be the same both with and without explicitsolvent.

5 Select multiple timestepping values compatible with GBIS. When using GBIS, multipletimestepping behaves as follows: the dEdr force is calculated everynonbondedFreq steps(as with explicit solvent, 2 is a reasonable frequency) and the dEda force is calculated everyfullElectFrequency steps (because dEda varies more slowly than dEdr, 4 is a reason-able frequency).

# Multistep Parameterstimestep 1nonbondedFreq 2fullElectFrequency 4

6 There are many ways to tune NAMD to obtain the best possible speeds for a particular sys-tem running on specific hardware. One of the easiest methods is making sure the numberof patches (sections of your system split by NAMD which can be determined by the line”PATCH GRID IS nx BY ny BY nz” in NAMD output) are equal to or greater than the num-ber of processors. If this is not the case, you can force NAMD to almost double the numberof patches in each dimension by putting the parameters shown below in your configurationfile. This has doubled performance for this system on several test machines. You can ex-plore the effects of turning this on or off in different combinations of dimensions to see ifyou can obtain improved performance on your own hardware. For further information aboutperformance tuning, please seethe NAMD wiki.

http://www.ks.uiuc.edu/Research/namd/wiki/?NamdPerformanceTuning


twoAwayX yestwoAwayY yestwoAwayz yes

7 You are now ready to run your simulation using the aMD method on GPUs. Pleaseproceed and begin your simulation. Your simulation will run between 12 and 24 hoursdepending on your GPU hardware. You can find an example trajectory of this run atexamples/ggbp-aMD.dcd .

8 While your aMD simulation is running, prepare a second, exact copy of the simulation ina new directory. This simulation will be identical to the aMD simulation in every respectexcept that you will remove from the config file all aMD commands ( i.e., any commandprefixed withaccel ). Once the aMD simulation that you started in the previous step iscomplete, run this new classical version of the system. Possessing all the same settings asthe aMD simulation, this classical simulation will be used to evaluate differences in samplingbetween simulations that do and do not use aMD. Examples of the configuration file andoutput trajectory have been included in theexamples directory, with filename2fw0 cMD.

5. GPU Augmented Analysis

5.0.1. Analysis of GGBP trajectories

After the GPU and aMD simulations are finished something you will notice is that the proteindiffuses in the simulation box. Hence, we first need to remove the rotational and translationaldegrees of freedom of the protein by aligning the entire trajectories to the first frame.

1 Load both cMD and aMD simulations in VMD.

2 In VMD go to Extensions->Analysis->RMSD Trajectory Tool, a new windw opens up. Forthe alignment we are going to use residues from 1 to 109. Thus, in the text box where it saysprotein , addresid 1 to 109 then click onALIGN.

3 Use theQuickSurfrepresentation to look at the trajectories. Color the protein by residuetype. Can you identify the binding site?

4 Now that your trajectory is aligned, make sure the text box contains the entireprotein asselection. Check thePlot box and click onRMSD. You will immediatly notice a differencebetween the RMSD curves for the cMD simulation and aMD simulation.

Analysis of protein dynamics in VMD. VMD contains a pow-erful interface to study the modes of a protein using ProDy(http://www.csb.pitt.edu/ProDy/plugins/index.html). You can follow the in-structions on the website to install ProDy. Once you have it installedand running, in VMD go to Extensions->Analysis->Normal Mode Wiz-ard. Then select the ProDy interface. Try different calculations, ANM,ENM and PCA. Are the modes obtained from the cMD and aMD differ-ent? Look at different modes and animate them to get a better insight ofthe motions of the protein.


5.1. Calculating g(r) using GPUs.

In VMD, GPUs have extended not only visualization capabilities, but also boosted the efficiencyof calculations for analysis of trajectories, such as that of the pair correlation function,g(r). In thissection, you will performg(r) calculations using a pre-generated MscS trajectory.

1 In VMD, open the TkConsole and navigate togpu-tutorial/gpu-tutorial data/4-analysis/ .

2 Due to time constraints, you are not expected to have generated a long enough trajectoryto yield meaningful data for analysis. Hence, you have been provided with a pre-generatedlong trajectory for this section. The following optional procedure describes how this pre-generated trajectory was built. If you have had enough time to run a long simulation, thenyou are encouraged to go through the following optional steps.

Downsizing your DCDs (Optional). In some cases, you may have gen-erated very large trajectory files. Large trajectories slow down loadingtimes, depending on your hardware, and inconvenience transfer of files tocollaborators. To get around this problem, you can extract from your origi-nal file the trajectories of just the atoms you wish to study, using CatDCD.CatDCD is a procedure in VMD that manipulates .dcd files, appendingthem to one another or extracting from them trajectories of selected atomsinto a separate file. In the current exercise, we will be calculating g(r) forcertain MscS residues and nearby ions in solution, hence you will extractthe trajectories of just these atoms into a separate file. The followingsteps, labeled Optional , will guide you through the process.

3 (Optional)

Copy the MscS PSF, PDB and trajectory files from section 1:cp ../1-largeSims/mscs.psf .cp ../1-largeSims/mscs.pdb .cp ../1-largeSims/equil_gpu.dcd .

4 (Optional) Load the system onto VMD by typing in the TkConsole:

mol load psf mscs.psf pdb mscs.pdb

5 (Optional) You will now create a file listing the indices of atoms whose trajectories you willbe extracting. Source the provided scriptMake ind.tcl or enter the following lines intothe TkConsole:

set seltext1 "resname GLU and not segname PA PB PC PD PE PF PG and name CA"set seltext2 "name POT"set seltext3 "segname PA PB PC PD PE PF PG and resid 158"set sel [atomselect top "($seltext1) or ($seltext2) or ($seltext3)"]set dat [$sel get index]set filename "atoms.ind"set fileID [open $filename "w"]puts -nonewline $fileID $datclose $fileID$sel writepsf mscs_cat.psfmol delete top


6 (Optional) Call CatDCD, using as input the file you have just created:catdcd -o equil cat.dcd -i atoms.ind equil gpu.dcd

7 (Optional) Compare the filesize of the new trajectory file you have created with the originalone. How much of a difference did CatDCD make?

8 Load the MscS trajectory using the command:

mol load psf mscs_cat.psf dcd equil_cat.dcd

9 On the menubar, click on Extensions, then under Analysis, select Radial Pair DistributionFunction g(r). A user interface should pop up.

10 Make sure that the field ‘Use Molecule:’ is set tomscs cat.psf .

11 Notice that you can set the beginning and end frames of the time period that you wish tocalculate g(r) over. The defaults are respectively the first and final frames of the trajectoryyou have loaded.

12 You can also alter the histogram parameters to suit your own purposes. For this exercise, justleave them as the defaults.

13 Check ‘Use PBC’, ‘Display g(r)’ and ‘Use GPU code’.

14 Now, we will calculateg(r) for the positively-charged arginine residue 158, or more specif-ically, the alpha carbon atom of the residue, and positively-charged potassium ions. In ‘Se-lection 1’, entername CA and segname PA PB PC PD PE PF PG and resid158 and in ‘Selection 2’, entername POT.

15 Click on ‘Compute g(r)’. The resultingg(r) plot will be loaded.

16 Alternatively, if you want to run this analysis from the command line, type the following intoTkConsole:

set sel1 [atomselect top "name CA and segname PA PB PC PD PE PF PG and resid 158"]set sel2 [atomselect top "name POT"]measure rdf $sel1 $sel2 usepbc true

17 Perform the same calculation for negatively-charged glutamate ions instead of potassium byentering in ‘Selection 2’,name CA and protein and not segname PA PB PCPD PE PF PG, before clicking on ‘Compute g(r)’.

18 Alternatively, if you want to run this analysis from the command line, type the following intoTkConsole:

set sel1 [atomselect top "name CA and segname PA PB PC PD PE PF PG and resid 158"]set sel2 [atomselect top "name CA and protein and not segname PA PB PC PD PE PF PG"]measure rdf $sel1 $sel2 usepbc true


Figure 6: Pair distribution function of residue 158 and potassium ions. Note the almost uniformdistribution across all lengths

Figure 7: Pair distribution function of residue 158 and the alpha carbon of glutamate ions. Incontrast to the case of potassium ions, notice the spikes at 5A and 7.5A. Do you think this resultmakes sense?


References

[1] M. Jack Borrok, Laura L. Kiessling, and Katrina T. Forest. Conformational changes ofglucose/galactose-binding protein illuminated by open, unliganded, and ultra-high-resolutionligand-bound structures.Protein Science, 16(6):1032–1041, 2007.

[2] Csar Augusto F. de Oliveira, Barry J. Grant, Michelle Zhou, and J. Andrew McCam-mon. Large-scale conformational changes of ¡italic¿trypanosoma cruzi¡/italic¿ prolineracemase predicted by accelerated molecular dynamics simulation.PLoS Comput Biol,7(10):e1002178, 10 2011.

[3] Barry J. Grant, Alemayehu A. Gorfe, and J. Andrew McCammon. Ras conformational switch-ing: Simulating nucleotide-dependent conformational transitions with accelerated moleculardynamics.PLoS Comput Biol, 5(3):e1000325, 03 2009.

[4] Donald Hamelberg, Cesar Augusto F. de Oliveira, and J. Andrew McCammon. Sampling ofslow diffusive conformational transitions with accelerated molecular dynamics.The Journalof Chemical Physics, 127(15):155102, 2007.

[5] Aaftab Munshi. OpenCL Specification Version 1.0, December 2008.http://www.khronos.org/registry/cl/.

[6] John Nickolls, Ian Buck, Michael Garland, and Kevin Skadron. Scalable parallel program-ming with CUDA. ACM Queue, 6(2):40–53, 2008.

[7] James C. Phillips, John E. Stone, and Klaus Schulten. Adapting a message-driven parallelapplication to GPU-accelerated clusters. InSC ’08: Proceedings of the 2008 ACM/IEEEConference on Supercomputing, Piscataway, NJ, USA, 2008. IEEE Press.

[8] Levi C.T. Pierce, Romelia Salomon-Ferrer, Cesar Augusto F. de Oliveira, J. Andrew McCam-mon, and Ross C. Walker. Routine access to millisecond time scale events with acceleratedmolecular dynamics.Journal of Chemical Theory and Computation, 8(9):2997–3002, 2012.

[9] John E. Stone, David J. Hardy, Ivan S. Ufimtsev, and Klaus Schulten. GPU-acceleratedmolecular modeling coming of age.J. Mol. Graph. Model., 29:116–125, 2010.

[10] John E. Stone, James C. Phillips, Peter L. Freddolino, David J. Hardy, Leonardo G. Trabuco,and Klaus Schulten. Accelerating molecular modeling applications with graphics processors.J. Comp. Chem., 28:2618–2640, 2007.

[11] John E. Stone, Jan Saam, David J. Hardy, Kirby L. Vandivort, Wen-mei W. Hwu, and KlausSchulten. High performance computation and interactive display of molecular orbitals onGPUs and multi-core CPUs. InProceedings of the 2nd Workshop on General-Purpose Pro-cessing on Graphics Processing Units, ACM International Conference Proceeding Series,volume 383, pages 9–18, New York, NY, USA, 2009. ACM.

[12] Yi Wang, Chris B. Harrison, Klaus Schulten, and J. Andrew McCammon. Implementationof accelerated molecular dynamics in NAMD.Comput. Sci. Discov., 4:015002, 2011. (11pages).

GPU Accelerated Molecular Dynamics Simulation ...

Documents

Transcript of GPU Accelerated Molecular Dynamics Simulation ...