Algorithms and Infrastructure for Molecular Dynamics Simulations Ananth Grama Purdue University ...
-
Upload
amie-gardner -
Category
Documents
-
view
220 -
download
0
Transcript of Algorithms and Infrastructure for Molecular Dynamics Simulations Ananth Grama Purdue University ...
Algorithms and Infrastructure forMolecular Dynamics Simulations
Ananth GramaPurdue University
http://www.cs.purdue.edu/people/[email protected]
Various parts of this work are in collaboration with Eric Jakobsson,Vipin Kumar, Sagar Pandit, Ahmed Sameh, Vivek Sarin, H. Larry Scott, Shankar Subramaniam, Priya Vashishtha.
Supported by NSF, NIH, and DoE.
Statistical and continuum methods
ps ns s ms
nm
m
mm
Ab-initio methodsAtomistic methods
The Regime of Simulations
Multiscale Modeling Techniques Ab-initio methods (few approximations, but slow)
DFT, CPMD Electron and nuclei treated explicitly.
Classical atomistic methods (more approximations)Classical molecular dynamicsMonte CarloBrownian dynamicsNo electronic degrees of freedom. Electrons are approximated through fixed partial charges on atoms.
Continuum methods (No atomistic details)
• Initialize positions, velocities and potential parameters
• At each time step• Compute energy and forces
• Non bonded forces, bonded forces, external forces, pressure tensor
• Update configuration • Compute new configuration, apply constraints and
correct velocities, scale simulation cell
Molecular Dynamics
V = Vbond + Vangle + Vdihedral + VLJ + VElecrostatics
F V
VLJ Cij
(12)
rij12
Cij
(6)
rij6
i j
VElectrostatics fqiq j
rriji j
Vbond 1
2kij (rij bij )
2
Vangle 1
2kijk
(ijk ijk0 )2
Vdihedral kijkl (1 cos(nijkl ijkl
n0 ))n
Simplified Interactions Used in Classical Simulations
Force Parameters
• Determination of partial charges on the atoms.• Get bond and angle parameters from spectroscopic data.• Work on smaller fragments of the molecule and determine LJ parameters by fitting the density and heat of vaporization. For example, linear chain alkanes are used to determine the LJ parameters of hydrocarbon chains in lipids.
• Determination of dihedral parameters.
• Tune parameters to match known experimental observables.
-0.84
0.42
0.42
• Lennard-Jones (LJ) potential• short range potential• can be truncated with a cutoff rc (errors, dynamic
neighbor list)
• Electrostatics• long range potential• cutoffs usually produce erroneous results• Particle Mesh Ewald (PME) technique with appropriate
boundary corrections (poor scalability)• Fast Multipole Methods.
Non Bonded Forces
Handling Long-Range Potential
● Multipole Methods (FMM/Barnes-Hut)● Error Control in Multipole Methods● Parallelism in Multipole Methods● Applications in Dense System Solvers● Applications in Sparse Solvers● Application Studies: Lipid Bilayers● Ongoing Work: Reactive Simulations
Multipole Methods
● Represent the domain hierarchically using a spatial tree.
● Accurate estimation of potentials requires O(n2) force computations.
● Hierarchical methods reduce this complexity to O(n) or O(n log n).
● Since particle distributions can be very irregular, the tree can be highly unbalanced.
Multipole Methods
Error Control in Multipole Methods
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Error Control in Multipole Methods
Grama et al., SIAM SISC.
Parallelism in Multipole Methods
● Each particle can be processed independently● Nearby particles interact with nearly identical set
of sub-domains● Spatially proximate ordering minimizes
communication● Load can be balanced by assigning each task
identical number of interactions
Grama et al., Parallel Computing.
Parallelism in Multipole Methods
Grama et al., Parallel Computing.
Parallelism in Multipole Methods
Grama et al., Parallel Computing.
Solving Dense Linear Systems
Solving Dense Linear Systems
Solving Dense Linear Systems
Solving Dense Linear Systems
● Over a discretized boundary, these lead to a dense linear system of the form Ax = b.
● Iterative methods (GMRES) are typically used to solve the system.
● The matrix-vector product can be evaluated as a single iteration of the multipole kernel – near field is replaced using Gaussian quadratures and far-field is evaluated using multipoles.
Solving Dense Linear Systems
Multipole Based Dense Solvers
All times in seconds on a 64 processor Cray T3E (24,192 unknowns)
Multipole Based Solver Accuracy
Preconditioning the Solver
Preconditioning the Solver
Solving Sparse Linear Systems
Generalized Stokes Problem
● Navier-Stokes equation
● Incompressibility Condition 0 u
Ωon
Ωin
ΩinΔR1
pt
gu
0u
uuuu
Generalized Stokes Problem
● Linear system
● BT is the discrete divergence operator and A collects all the velocity terms.
● Navier-Stokes:
● Generalized Stokes Problem:
0p
u
0
f
B
BAT
LR
CMt
A11
LR
Mt
A11
Solenoidal Basis Methods
● Class of projection based methods● Basis for divergence-free velocity● Solenoidal basis matrix P: ● Divergence free velocity:
● Modified system
0PBT
xu P
0xu PBB TT
0p
u
0
f
B
BAT
fBAP px=>
Solenoidal Basis Method
● Reduced system:
● Reduced system solved by suitable iterative methods such as CG or GMRES.
● Velocity obtained by :
● Pressure obtained by:
0 BPPB TT
fPAPP TT xfBAP px ;
xu P
APxfBp
Preconditioning• Observations
• Vorticity-Velocity function formulation:
ψξ
ψu
uξ
PwPy
Pwu
uPy
T
T
ψξξξ
Δ,R
1
t
Preconditioning• Reduced system approximation:
• Preconditioner:
• Low accuracy preconditioning is sufficient
PPPPR1
MΔt1
APP TTT
ss LLR1
MΔt1
G
Preconditioning
● Preconditioning can be affected by an approximate Laplace solve followed by a Helmholtz solve.
● The Laplace solve can be performed effectively using a Multipole method.
Preconditioning Laplace/Poisson Problems● Given a domain with known internal charges and boundary
potential (Dirichlet conditions), estimate potential across the domain. Compute boundary potential due to internal charges
(single multipole application) Compute residual boundary potential due to charges on
the boundary (vector operation). Compute boundary charges corresponding to residual
boundary potential (multipole-based GMRES solve). Compute potential through entire domain due to
boundary and internal charges (single multipole).
Numerical Experiments
● 3D driven cavity problem in a cubic domain.● Marker-and-cell scheme in which domain is
discretized uniformly.● Pressure unknowns are defined at node points and
velocities at mid points of edges.● x, y, and z components of velocities are defined on
edges along respective directions.
Sample Problem Sizes
92,25695,23232,76832x32x32
10,80011,5204,09616x16x16
1,1761,3445128x8x8
Solenoidal
Functions
VelocityPressureMesh
Preconditioning Performance
Preconditioning Performance – Poisson Problem
Preconditioning Performance– Poisson Problem
Preconditioning Performance– Poisson Problem
Parallel Performance
Comparison to ILU
Application Validation: Simulating Lipid Bilayers.
Membrane as a barrier
Thanks to David Bostick
Fluid Mosaic Model
What is a lipid bilayer ?
Hydrophilic
Hydrophobic
lipid bilayer
Depending on the cross sectional area and the chain length the molecules can form vesicles, micelles, hexagonal phase
Fatty acid
Fatty acidPhosphate
Glycerophospholipid
Key Physical Properties of Lipid Bilayers
• Thickness.
• increases with the length of the hydrocarbon chains • decreases with number of unsaturated bonds in chains • decreases with temperature
• Area per molecule
• increases with the length of the hydrocarbon chains• increases with the number of unsaturated bond in chains• increases with temperature • depends on the properties of polar region
Key Physical Properties of Lipid Bilayers
Order parameter
b
SCD P2(cos() 1
23cos2() 1
Key Physical Properties of Lipid BilayersPhase Transition
• Effect of phase transitiono order parameterso thicknesso area per lipido bending and compressibility
modulus.o lateral and transverse
diffusion coefficients.
How real are the simulations ?
Area per lipid (Å2)
experimentssimulationsLipid
~ 38*~ 27Cholesterol
~ 53~ 53Sphingomyelin (18:0)
~ 54~ 53.6Dipalmitoylphosphatidylserine
(DPPS with Na+ counter ions)
~ 72~ 71Dioleylphosphatidylcholine
(DOPC)
~ 64~ 62Dipalmitoylphosphatidylcholine
(DPPC)
How real are the simulations ?
Pandit, et al., Biophys. J.
Simulation of 128 DPPC bilayer in SPC water
Predictability of atomistic simulations
Pandit, et al., J. Chem. Phys.
Water shells are observed using the surface to point correlation function.
Heterogeneous Mixures in bilayers
• Lipids with other membrane components.• Lipid mixtures with different headgroups and polar region.• Lipid mixtures with different chain lengths and saturation.
Mixture with cholesterola - faceb - face
Hydrophilic
• Cholesterol increases mechanical strength of the bilayer• Ordering of chains and phase behavior.
Phase Diagram with Cholesterol
Reproduced from Vist and Davis (1990), Biochemistry, 29:451
• Lo or b phase
Liquid ordered phase where chain ordering is like gel phase but the lateral diffusion coefficients are like fluid phase.
Rafts
Rafts are liquid ordered domains in cell plasma membranes rich in sphingomyelin (or other sphingolipids), cholesterol and certain anchor proteins.
Rafts are important membrane structural components in:• signal transduction• protein transport • sorting membrane components• process of fusion of bacteria and viruses into host cells
How do rafts function and maintain their integrity?
How does Cholesterol control the fluidity of lipid membranes?
A Model for Rafts
Simulation Studies of Rafts
Study of liquid ordered phase Study of ternary mixtures that form raft like domains.
• Mixture of cholesterol and saturated lipid (DPPC and DLPC).• Mixture of cholesterol and sphingomyelin.
Preformed domain
Spontaneous
domain formation
Ternary mixture of DOPC, SM, Chol with a preformed domain.
Ternary mixture of DOPC, SM, Chol with random initial distribution
simulated systems:120 PC molecules (DPPC/DLPC)80 cholesterol molecules (~40 mol %)5000 water moleculesduration: 35 ns / 18 nsconditions: NPT ensemble (P = 1 atm, T = 323 / 279 => )
Study of Lo phase
Pandit, et al., Biophys. J. (2004), 86, 1345-1356.
Treduced T TmTm
0.029
DPPC+Cholesterol DLPC+Cholesterol
Pandit, et al., J. (2004), 86, 1345-1356.
Pandit, et al., Biophys. J.
Conclusions from Simulations
Cholesterol increases order parameter of the lipids. The order parameters are close to liquid ordered phase order parameters.
Cholesterol forms complexes with two of its neighboring lipids. Life time of such a complex is ~10ns.
Complex formation process is dependent on the chain length of the lipids. Shorter chain lipids do not form large complexes.
Simulation of Domain Formation
•1424 DOPC, 266 SM, 122 cholesterols and 62561 waters (284,633 atoms)• Duration: 12 ns (with electrostatic cutoff) and 8 ns (with PME), 3 fs steps• Ensemble: NPT (P = 1 atm (isotropic), T = 293 K)• Configuration Biased Monte Carlo (CBMC) is used to thermalise chains.
Pandit et al., Biophys. J.
Binary mixture: 100 DOPC, 100 18:0 SM, 7000 water (31600 atoms)
0 ns 250 ns
Ternary mixture: 100 DOPC, 100 18:0 SM, 100 CHOL, 10000 water (43500 atoms)
0 ns 250 ns
Some Ongoing Work
Reactive systems
● Chemical reactions are association and dissociation of chemical bonds
● Classical simulations cannot simulate reactions● ab-initio methods calculate overlap of electron orbitals to investigate
chemical reactions
● ReaX force field postulates a classical bond order interaction to mimic the association and dissociation of chemical bonds1
1 van Duin et al , J. Phys. Chem. A, 105, 9396 (2001)
Bond Order Interaction
1 van Duin et al , J. Phys. Chem. A, 105, 9396 (2001)
Bond order for C-C bond
BOij '(rij ) exp a
rijr0
b
• Uncorrected bond order:
Where is for andbonds• The total uncorrected bond order is sum of three types of bonds• Bond order requires correction to account for the correct valency
• After correction, the bond order between a pair of atoms depends on the uncorrected bond orders of the neighbors of each atoms
• The bond orders rapidly decay to zero as a function of distance, so one can easily construct a neighbor list for efficient computation of bond orders
Bond Order Interaction
Neighbor Lists for Bond Order
• Efficient implementation critical for performance
• Implementation based on an oct-tree decomposition of the domain
• For each particle, we traverse down to neighboring octs and collect neighboring atoms
• Has implications for parallelism (issues identical to parallelizing multipole methods)
Bond Order : Choline
Bond Order : Benzene
Other Local Energy Terms
● Other interaction terms common to classical simulations, e.g., bond energy, valence angle and torsion, are appropriately modified and contribute to non-zero bond order pairs of atoms
● These terms also become many body interactions as bond order itself depends on the neighbors and neighbor’s neighbors
● Due to variable bond structure there are other interaction terms, such as over/under coordination energy, lone pair interaction, 3 and 4 body conjugation, and three body penalty energy
Non Bonded van der Waals Interaction
• van der Waals interactions are modeled using distance corrected Morse potential
Where R(rij) is the shielded distance given by
VvdW (rij ) Dij exp ij 1 R(rij )
rvdW
2Dij exp
1
2ij 1
R(rij )
rvdW
R(rij ) rij 1
ij
1
Electrostatics
• Shielded electrostatic interaction is used to account for orbital overlap of electrons at closer distances
• Long range electrostatics interactions are handled using the Fast Multipole Method (FMM).
VEle(rij ) fqiq j
rij3 ij
3 1
3
Charge Equilibration (QEq) Method
● The fixed partial charge model used in classical simulations is inadequate for reacting systems.
● One must compute the partial charges on atoms at each time step using an ab-initio method.
● We compute the partial charges on atoms at each time step using a simplified approach call the Qeq method.
Charge Equilibration (QEq) Method
• Expand electrostatic energy as a Taylor series in charge around neutral charge.
• Identify the term linear in charge as electronegativity of the atom and the quadratic term as electrostatic potential and self energy.
• Using these, solve for self-term of partial derivative of electrostatic energy.
Qeq Method
We need to minimize:
subject to:
jii
ijii
iele qqHqXE 2
1
0i
iq
H ij Jiij 1 ij
rij3 ij
3 1 3
where
Qeq Method
i
iieleiu quqEqE })({})({
0
jj
ijiui
qHuXEq
uXqH ijj
ij
uXqH ~~~
Qeq Method
)1(1kk
k
iki uXHq
i k
ik
ik
k
ik
ii uHXHq 011
From charge neutrality, we get:
i kkik
ik
k
ik
H
XHu
11
1
Qeq Method
ii
ii
t
su
Let
wherek
k
iki XHs 1
kk
iki Ht 11
or ii
ikk sHX
ii
iktH 1
Qeq Method
● Substituting back, we get:
i
ii
ii
iiii tt
ssutsq
We need to solve 2n equations with kernel H for si and ti.
Qeq Method
● Observations:– H is dense. The diagonal term is Ji The shielding term is short-range Long range behavior of the kernel is 1/r
Hierarchical methods to the rescue! Multipole-accelerated matrix-vector products combined with GMRES and a preconditioner.
Qeq: Parallel Implementation
● Key element is the parallel matrix-vector (multipole) operation
● Spatial decomposition using space-filling curves● Domain is generally regular since domains are
typically dense● Data addressing handled by function shipping● Preconditioning via truncated kernel● GMRES never got to restart of 10!
Qeq Parallel Performance
27 (26.5)187 (3.8)7189255,391
13 (24.3)87 (3.6)3169108,092
4.95 (21.0)32 (3.3)104743,051
P=32P=4P=1IterationsSize
Size corresponds to number of atoms; all times in seconds, speedups in parentheses. All runtimes on a cluster of Pentium Xeons connected over a Gb Ethernet.
Qeq Parallel Performance
19.1 (26.6)132 (3.8)508255,391
9.2 (24.8)61 (3.7)228108,092
3.2 (22.8)21 (3.5)7343,051
P=16P=4P=1Size
Size corresponds to number of atoms; all times in seconds, speedups in parentheses. All runtimes on an IBM P590.
Parallel ReaX Performance
● ReaX potentials are near-field.● Primary parallel overhead is in multipole
operations.● Excellent performance obtained over rest of
the code.● Comprehensive integration and resulting
(integrated) speedups being evaluated.
... and yet to come...
● Comprehensive validation of parallel ReaX ● System validation of code – from simple
systems (small hydrocarbons) to complex molecules (larger proteins)
● Parametrization and tuning force fields.
Parting Thoughts
● There is a great deal computer scientists can contribute to engineering and sciences.
● In our group, we work on projects ranging from: structural control (model reduction, control, sensing, macroprogramming -- NSF/ITR), materials modeling (multiscale simulations – NSF, DoE), systems biology (interaction networks – NIH), and finite element solvers (sparse solvers – NSF).
Parting Thoughts
● There is a great deal we can learn from scientific and engineering disciplines as well.
● Our experiences in these applications motivate our work in core areas of computing, including wide-area storage systems (NSF/STI), speculative execution (Intel), relaxed synchronization (NSF), and scaling to the extreme (DARPA/HPCS).
Thank you very much!
References and papers at:
http://www.cs.purdue.edu/pdsl/