E. Darve, ICME, 2/5/2007

Computational models of bio-molecules

Eric Darve, Mechanical Engineering

Stanford

Protein models

Life begins with cells

Single 200 micrometer cell (the human egg), with sperm. From the union of an egg and sperm will arise the 10 trillion cells of a human body.

Cells are the building blocks of the body.

What is the cell filled with?

A cell is filled with molecules: from ions and small molecules to macromolecules

Proteins give cells structure and perform most cellular tasks.

Water-accessible surface of proteins; notice the complex three-dimensional shape.

Enzyme, hormone, antibody, blood’s oxygen carrier.

Figure labels: enzyme, hormone, oxygen carrier, antibody, cell membrane.

The code for this machinery is in the DNA

The DNA stores a code in the form of a succession of four letters A, G, T, C.

A section is copied into a ribonucleic acid (RNA).

The ribosome performs the translation: amino acids get linked together to form a protein.

The order is specified by the RNA; a universal genetic code is followed.
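The DNA-to-RNA-to-protein pipeline above can be sketched in a few lines of Python. This is a toy illustration and not part of the original slides: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand (so transcription reduces to replacing T with U), and only a handful of entries of the standard genetic code are listed.

```python
# Toy sketch: DNA coding strand -> mRNA -> peptide.
# Assumption: only a small subset of the standard genetic code is included.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(dna_coding_strand):
    """Copy a DNA coding strand into mRNA (T becomes U)."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("ATGTTTGGCAAATGGTAA")
peptide = translate(mrna)
print(mrna, peptide)
```

The ribosome does the equivalent of `translate`: it reads codons in order and appends one amino acid per codon until it reaches a stop codon.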

Biology is a multiscale problem

DNA double helix: 2 nm

Eight cells in an embryo: 200 micrometers

Wolf spider: 15 mm

Emperor penguin: 1 m

Atomistic computer models

Structure and function of proteins are tightly coupled

Proteins are defined by a unique sequence of amino acids.

There are 20 amino acids.

A hierarchy of folding processes gives rise to large complexes or assemblies.

Modeling becomes harder as the size increases.

Proteins are polypeptides formed by chaining amino acids

Tripeptide: peptide bonds (yellow) link the amide nitrogen atom (blue) of one amino acid with the carbonyl carbon atom (gray) of an adjacent one in the linear polymers known as polypeptides.

Proteins are polypeptides (100s to 1000s of amino acids) that have folded into a defined 3D shape.

The side chain (R group, green) of each amino acid determines its properties.

The simplest structure is the alpha helix

The alpha helix: the most basic secondary structure.

The backbone (red) is folded into a spiral that is held in place by hydrogen bonds between backbone oxygen and hydrogen atoms.

Side-chain R groups cover the outside of the helix.

The helix has a directionality because all the hydrogen-bond donors have the same orientation.

Some diseases are caused by proteins which misfold

Alzheimer’s: caused by the formation of insoluble plaques composed of amyloid protein.

Conformation changes from alpha-helix to beta-sheet.

This leads to an aggregation into filaments (amyloid) found in plaques.

Ion channels allow molecules to come in and out of the cell

Protein sits in the membrane of the cell.

Two conformations: open and closed.

Hydrophilic groups face the inside of the channel, while hydrophobic groups face the lipid bilayer.

The selectivity filter determines the ion selectivity of the channel.

Ion channel: Bacterial K+ channel

Figure (top view and side view), labels: potassium ion, selectivity loop, pore helix, vestibule, inner helix, outer helix.

Color code: acidic=red, basic=blue, polar=green, non-polar=white

The Art of Water Transport in Aquaporins (UIUC Theoretical and Computational Biophysics Group)

Aquaporins are membrane water channels that play critical roles in controlling the water contents of cells. These channels are widely distributed in all kingdoms of life, including bacteria, plants, and mammals.

They form tetramers in the cell membrane, and facilitate the transport of water and, in some cases, other small solutes across the membrane.

Ion channels are gated by different mechanisms

Channels are often gated, i.e., they don’t stay open or closed but open briefly and close again.

Gating mechanisms: voltage-gated, binding of a ligand, mechanically gated.

Example: gated ion channels mediate most forms of electrical signaling in the nervous system.

Project with the School of Medicine: sense of touch

Free energy is used to understand these changes of conformation

Free energy: used to describe systems at constant temperature and pressure.

All systems evolve such that the free energy is minimized.

On the right: a typical free energy profile for a reaction.

A reaction proceeds spontaneously if the free energy of the products is lower than that of the reactants.

High-energy transition state must be crossed: activation energy.

In mechanically gated channels, a force applied to the channel lowers the barrier, enabling the channel to open.

The adaptive biasing force is a numerical technique to efficiently calculate such profiles.
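The statement that an applied force lowers the barrier can be illustrated with a toy profile. The sketch below is not the adaptive biasing force method and is not taken from the slides: it assumes a hypothetical double-well free energy A(x) = height * (x^2 - 1)^2 in arbitrary units, with the closed state near x = -1 and the open state near x = +1, and it models the mechanical force as a linear tilt -force * x.

```python
import numpy as np


def profile(x, force=0.0, height=5.0):
    """Hypothetical double-well free energy, tilted by a constant force."""
    return height * (x**2 - 1.0) ** 2 - force * x


def barrier_closed_to_open(force, height=5.0):
    """Barrier seen when going from the left (closed) well to the right (open) well."""
    x = np.linspace(-2.0, 2.0, 4001)
    a = profile(x, force, height)
    i_closed = np.argmin(np.where(x < 0, a, np.inf))  # minimum of the left well
    i_open = np.argmin(np.where(x > 0, a, np.inf))    # minimum of the right well
    # Highest point between the two wells, measured from the closed state.
    return a[i_closed:i_open + 1].max() - a[i_closed]


barrier_no_force = barrier_closed_to_open(0.0)
barrier_with_force = barrier_closed_to_open(2.0)
print(barrier_no_force, barrier_with_force)
```

With these assumed parameters the tilted profile has a noticeably lower closed-to-open barrier than the untilted one, which is the qualitative picture behind mechanical gating.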

Symplectic time integrators

Symplectic integrators are a special class of geometric integrators

They conserve area in phase space.

Importantly, they nearly conserve energy (bounded error, no drift) over long integration times.
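A small experiment makes the absence of drift concrete. The sketch below compares explicit Euler (not symplectic) with velocity Verlet (symplectic) on a unit harmonic oscillator; the test problem, step size, and function names are illustrative assumptions, not taken from the slides.

```python
def energy(q, p, k=1.0, m=1.0):
    return 0.5 * p * p / m + 0.5 * k * q * q


def explicit_euler(q, p, h, k=1.0, m=1.0):
    """Non-symplectic: both updates use the old state."""
    return q + h * p / m, p - h * k * q


def velocity_verlet(q, p, h, k=1.0, m=1.0):
    """Symplectic: half kick, drift, half kick."""
    p_half = p - 0.5 * h * k * q
    q_new = q + h * p_half / m
    p_new = p_half - 0.5 * h * k * q_new
    return q_new, p_new


def max_relative_energy_error(step, h=0.05, n_steps=20000):
    q, p = 1.0, 0.0
    e0 = energy(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, h)
        worst = max(worst, abs(energy(q, p) - e0) / e0)
    return worst


euler_error = max_relative_energy_error(explicit_euler)
verlet_error = max_relative_energy_error(velocity_verlet)
print(euler_error, verlet_error)
```

For this oscillator the Euler energy grows by a factor (1 + h^2) at every step, while the Verlet energy only oscillates within a small band around its initial value.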

The discrete Hamilton’s principle allows constructing symplectic integrators

Hamilton’s principle: a trajectory is an extremum of the action integral:

Discrete principle: extremum of the discrete action integral
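The equations on this slide did not survive extraction. For reference, the standard textbook forms of both principles are given below; the notation (L, L_d, q_k, t_k) is an assumption and may differ from the original slide.

```latex
% Continuous Hamilton's principle: the trajectory makes the action stationary
\delta S = 0, \qquad S[q] = \int_{0}^{T} L\bigl(q(t), \dot q(t)\bigr)\, dt

% Discrete Hamilton's principle: the discrete trajectory q_0, ..., q_N
% makes the discrete action stationary
\delta S_d = 0, \qquad S_d = \sum_{k=0}^{N-1} L_d(q_k, q_{k+1}),
\qquad L_d(q_k, q_{k+1}) \approx \int_{t_k}^{t_{k+1}} L(q, \dot q)\, dt

% Resulting discrete Euler-Lagrange equations (D_i = derivative in the i-th argument)
D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) = 0, \qquad k = 1, \dots, N-1
```

The discrete Euler-Lagrange equations define the update from (q_{k-1}, q_k) to q_{k+1}, and integrators obtained this way are symplectic.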

This class of integrator can be extended to asynchronous integrators

Independent choice of time steps for each potential:

This variational principle leads necessarily to a symplectic integrator

The second order method can be implemented very easily

Figure (axis label): time.
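A minimal sketch of an asynchronous scheme, written in the spirit of asynchronous variational integrators, is shown below. It is an illustration under stated assumptions rather than the implementation from the talk: a single particle is attached to several springs, each spring has its own time step, a priority queue orders the updates, and the first kick of spring i is scheduled at time `steps[i]`. The function names are hypothetical.

```python
import heapq


def total_energy(q, p, stiffness, m=1.0):
    return 0.5 * p * p / m + 0.5 * sum(stiffness) * q * q


def avi_oscillator(stiffness, steps, t_end, q=1.0, p=0.0, m=1.0):
    """Spring i (stiffness[i]) is kicked every steps[i] time units."""
    queue = [(h, i) for i, h in enumerate(steps)]  # (next event time, spring index)
    heapq.heapify(queue)
    t_q = 0.0  # time at which the position q is currently known
    energies = []
    while queue[0][0] <= t_end:
        t, i = heapq.heappop(queue)
        q += (t - t_q) * p / m            # drift the position to the event time
        t_q = t
        p -= steps[i] * stiffness[i] * q  # kick from spring i only
        heapq.heappush(queue, (t + steps[i], i))
        energies.append(total_energy(q, p, stiffness, m))
    return q, p, energies


# Assumed example: a stiff spring with a small step, a soft spring with a larger one.
stiffness = [100.0, 1.0]
steps = [0.005, 0.02]
q_end, p_end, energies = avi_oscillator(stiffness, steps, t_end=20.0)
e0 = total_energy(1.0, 0.0, stiffness)
max_rel_dev = max(abs(e - e0) / e0 for e in energies)
print(max_rel_dev)
```

Each event touches only one potential, so an expensive but slowly varying force can be evaluated far less often than a cheap, stiff one. With these small steps the energy stays within a narrow band around its initial value.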

For molecular dynamics, a time step is chosen for each type of potential

Chemical bonds

Bond angle, torsion angle, dihedral angle

Lennard-Jones

Short-range and long-range electrostatics

A model problem allows studying the stability of synchronous integrators

Model problem:

r-RESPA corresponds to the choice:

Stability condition:

The integrator is unstable when:
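The formulas on this slide were lost in extraction, so the sketch below does not reproduce them. It illustrates the kind of analysis described, under explicit assumptions: the model problem is taken to be a particle on a fast and a slow spring, m x'' = -(k_fast + k_slow) x; the slow force is applied as half kicks at the start and end of each outer step; and the fast motion is propagated exactly, which corresponds to the limit of many inner substeps. Because the one-step map is a 2x2 matrix with determinant 1, it is unstable exactly when the absolute value of its trace exceeds 2.

```python
import numpy as np


def respa_step_matrix(h, k_fast=100.0, k_slow=10.0, m=1.0):
    """One outer step acting on (q, p): half kick, exact fast flow, half kick."""
    omega = np.sqrt(k_fast / m)
    c, s = np.cos(omega * h), np.sin(omega * h)
    half_kick = np.array([[1.0, 0.0], [-0.5 * h * k_slow, 1.0]])
    fast_flow = np.array([[c, s / (m * omega)], [-m * omega * s, c]])
    return half_kick @ fast_flow @ half_kick


def is_unstable(h, **params):
    return abs(np.trace(respa_step_matrix(h, **params))) > 2.0


half_period = np.pi / np.sqrt(100.0)  # half-period of the fast spring, about 0.314
for h in (0.05, 0.30, 0.33):
    print(h, is_unstable(h))
```

With these assumed parameters the outer step h = 0.30, just below the fast half-period, is unstable, while h = 0.05 and h = 0.33 are stable. This matches the resonance picture in which instability appears near multiples of the half-period.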

This analysis can be extended to the asynchronous case

Rational ratio:

We define the following matrix:

The integrator is unstable if one of the eigenvalues has a modulus larger than 1.

This allows a numerical investigation of unstable time steps.
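The slide's matrix is not recoverable from the transcript, so the sketch below only illustrates the idea under assumptions: two springs act on one particle, the time steps have a rational ratio (spring 1 is kicked `n1` times and spring 2 `n2` times per synchronization period `T`), and each event is a drift followed by a kick. The product of these linear maps over one period plays the role of the stability matrix; all names are hypothetical.

```python
from math import gcd

import numpy as np


def avi_period_matrix(T, n1, n2, k1=100.0, k2=1.0, m=1.0):
    """Propagation matrix for (q, p) over one synchronization period T."""
    ticks = n1 * n2 // gcd(n1, n2)  # integer time grid (least common multiple)
    events = sorted(
        [(j * ticks // n1, 0) for j in range(1, n1 + 1)]
        + [(j * ticks // n2, 1) for j in range(1, n2 + 1)]
    )
    stiffness, step = (k1, k2), (T / n1, T / n2)
    M = np.eye(2)
    tick_prev = 0
    for tick, i in events:
        dt = (tick - tick_prev) * T / ticks
        drift = np.array([[1.0, dt / m], [0.0, 1.0]])
        kick = np.array([[1.0, 0.0], [-step[i] * stiffness[i], 1.0]])
        M = kick @ drift @ M
        tick_prev = tick
    return M


def is_unstable(M, tol=1e-9):
    """Unstable if some eigenvalue has modulus larger than 1."""
    return np.max(np.abs(np.linalg.eigvals(M))) > 1.0 + tol


stable_case = is_unstable(avi_period_matrix(0.02, 4, 1))   # small steps
unstable_case = is_unstable(avi_period_matrix(0.3, 1, 1))  # one large shared step
print(stable_case, unstable_case)
```

Scanning `T`, `n1`, and `n2` with `is_unstable` is one way to map out unstable time-step combinations numerically.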

The stability diagram shows many structures

Instability if the synchronization time is a multiple of the half-period

Proved:

These equations lead to a finite set of points.

Conjecture:

There exists a family of curves composed of unstable points only

Four curves are clearly visible on this plot

Red curves:

Green curves:

Magenta curves:

Cyan curves:

This integrator can be stabilized using a Langevin dynamics equation

Langevin dynamics is used to model a system at constant temperature.

It’s a stochastic equation given by:

The previous study can be used to determine the smallest value of γ which guarantees a stable integrator.
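The Langevin equation itself is missing from the transcript. A standard form is given below for reference; it assumes that the friction coefficient γ has units of inverse time and that η is Gaussian white noise, which may differ from the convention on the original slide.

```latex
m\,\ddot q = -\nabla V(q) \;-\; \gamma\, m\, \dot q \;+\; \sqrt{2\,\gamma\, m\, k_B T}\;\eta(t),
\qquad \langle \eta(t) \rangle = 0, \quad
\langle \eta(t)\,\eta(t') \rangle = \delta(t - t')
```

The friction term removes energy and the random term injects it, so the two balance at temperature T. A sufficiently large γ damps the resonances that destabilize the integrator.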

AVI is faster than r-RESPA

AVI is even faster when the time scales are close to one another

The gap in performance between conventional processors and graphics cards increases

The computing performance is incredible

                 Nvidia 8800 GTX          3 GHz Intel Core 2 Duo
Performance      330 Gflops (measured)    48 Gflops (maximum)
Price            $599                     $874
Annual growth    1.7x                     1.4x

A speed-up of 70x is obtained on atomistic simulations

Results on ATI X1900XTX

This will enable simulations of larger systems over realistic time scales, i.e., time scales relevant to biologists.

High-performance computing is not just for gamers anymore!

Students and collaborators

Free energy:
– Andrew Pohorille, NASA Ames
– David Rodriguez-Gomez, NASA Ames

Symplectic integrators:
– Adrian Lew, ME Department, Stanford
– William Fong, ICME program, Stanford

GPU:
– Vijay Pande, Chemistry Department, Stanford
– Pat Hanrahan, Computer Science Department, Stanford
– Erich Elsen, ME Department, Stanford

Classes

Spring 2007: ME 436, Computational Molecular Modeling and Parallel Computing

Summer 2007: ME 438, Computational Molecular Modeling Project
