Page 1: Bio-CS Exploration of Molecular Conformational Spaces Jean-Claude Latombe Computer Science Department Robotics Laboratory & Bio-X Clark Center.

Bio-CS
Exploration of Molecular Conformational Spaces

Jean-Claude Latombe
Computer Science Department
Robotics Laboratory & Bio-X Clark Center

Page 2:

Range of Bio-CS Research

Gene
Molecules
Cells
Tissue/Organs
Body system

Molecular structures, similarities and motions
Simulation of cell interaction
Soft-tissue simulation and surgical training
Robotic surgery

Page 3:

Range of Bio-CS Research

Gene
Molecules
Cells
Tissue/Organs
Body system

Molecular structures, similarities and motions
Simulation of cell interaction
Soft-tissue simulation and surgical training
Robotic surgery

Accuray

Page 5:

Motion, Structure

[Figure: four numbered panels, 1 to 4]

Page 6:

Motion, Structure, Function

Develop efficient algorithms and data structures to explore protein conformational spaces:
• Sampling
• Similarities
• Pathways

Page 7:

Vision for the Future

In-silico experiments

Drugs on demand

“Interactive” Biology

Page 8:

Analogy with Robotics

free space

[Kavraki, Svestka, Latombe, Overmars, 95]

Page 9:

But Biology ≠ Robotics …

• Energy field, instead of joint control
• Continuous energy field, instead of binary free and in-collision spaces
• Multiple pathways, instead of single collision-free path
• Potentially many more degrees of freedom
• Relation to real world is more complex

Page 10:

Overview

Part I
Probabilistic Roadmaps: A Tool for Computing Ensemble Properties of Molecular Motions
M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, J.C. Latombe, and C. Varma. Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion. J. Computational Biology, 10(3-4):257-281, 2003.

Part II
ChainTree: A Data Structure for Efficient Monte Carlo Simulation of Proteins
I. Lotan, F. Schwarzer, J.C. Latombe. Efficient Energy Computation for Monte Carlo Simulation of Proteins. 3rd Workshop on Algorithms in Bioinformatics (WABI), Budapest, Hungary, Sept., 2003.

Page 11:

Part I
Probabilistic Roadmaps: A Tool for Computing Ensemble Properties of Molecular Motions

Serkan Apaydin, Doug Brutlag (1), Carlos Guestrin, David Hsu (2), Jean-Claude Latombe, Chris Varma
Computer Science Department, Stanford University
(1) Department of Biochemistry, Stanford University
(2) Computer Science Department, Nat. Univ. of Singapore

Page 12:

Initial Work [Singh, Latombe, Brutlag, 99]

• Study of ligand-protein binding
• Probabilistic roadmaps with edges weighted by energetic plausibility

[Figure: roadmap edge between nodes v_i and v_j, with weight w_ij]

Page 13:

Initial Work [Singh, Latombe, Brutlag, 99]

• Study of ligand-protein binding
• Probabilistic roadmaps with edges weighted by energetic plausibility
• Search of most plausible path

[Figure: roadmap edge between nodes v_i and v_j, with weight w_ij]
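
The search for a most plausible path on a weighted roadmap can be illustrated with a standard shortest-path search. The following is a minimal sketch, not the method of [Singh, Latombe, Brutlag, 99]: it assumes each weight w_ij is non-negative and that a lower weight means a more plausible move, and it uses Dijkstra's algorithm. The function name `most_plausible_path` and the toy roadmap are hypothetical.

```python
import heapq


def most_plausible_path(edges, start, goal):
    """Dijkstra search. `edges` maps node -> list of (neighbor, weight >= 0).

    Assumption: lower total weight means a more plausible path.
    """
    best = {start: 0.0}          # lowest known cost to reach each node
    parent = {}                  # back-pointers used to rebuild the path
    frontier = [(0.0, start)]    # priority queue ordered by cost
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue             # stale queue entry, a cheaper one was found
        for neighbor, weight in edges.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]


# Toy roadmap with made-up weights (illustration only).
roadmap = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 1.5), ("d", 5.0)],
    "c": [("d", 1.0)],
    "d": [],
}
path, cost = most_plausible_path(roadmap, "a", "d")
```

A search like this returns a single pathway per query, which is the limitation the later slides raise.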

Page 14:

Initial Work [Singh, Latombe, Brutlag, 99]

• Study of energy profiles along most plausible paths
• Extensions to protein folding [Song and Amato, 01] [Apaydin et al., 01]

But: Molecules fold/bind along a myriad of pathways. Any single pathway is of limited interest.

[Figure labels: Catalytic Site, energy]

Page 15:

New Idea: Capture the stochastic nature of molecular motion by assigning probabilities to edges

[Figure: roadmap edge between nodes v_i and v_j, with probability P_ij]

Page 16:

Edge probabilities

Follow the Metropolis criterion:

$$P_{ij} = \begin{cases} \dfrac{1}{N_i}\exp\left(-\dfrac{\Delta E_{ij}}{k_B T}\right) & \text{if } \Delta E_{ij} > 0, \\ \dfrac{1}{N_i} & \text{otherwise.} \end{cases}$$

Self-transition probability:

$$P_{ii} = 1 - \sum_{j \neq i} P_{ij}$$

[Figure: nodes v_i and v_j, with edge probability P_ij and self-transition probability P_ii]
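
The edge probabilities can be computed directly from node energies. The sketch below is an illustration under stated assumptions: N_i is taken to be the number of roadmap neighbors of v_i, ΔE_ij is taken to be E_j − E_i, and a node is never listed as its own neighbor. The function name `transition_matrix` and the toy energies are hypothetical.

```python
import math


def transition_matrix(energies, neighbors, kT=1.0):
    """Build the roadmap transition matrix from the Metropolis criterion.

    energies[i] is the energy of node i; neighbors[i] lists the nodes
    adjacent to i (assumed not to contain i itself).
    """
    n = len(energies)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        n_i = len(neighbors[i])
        for j in neighbors[i]:
            delta_e = energies[j] - energies[i]
            # Uphill moves are damped by the Boltzmann factor.
            factor = math.exp(-delta_e / kT) if delta_e > 0 else 1.0
            P[i][j] = factor / n_i
        # Remaining probability mass stays on the node (self-transition).
        P[i][i] = 1.0 - sum(P[i][j] for j in neighbors[i])
    return P


# Toy three-node roadmap with made-up energies (illustration only).
energies = [0.0, 1.0, 0.5]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
P = transition_matrix(energies, neighbors)
```

Each row of `P` sums to one by construction, so the roadmap can be treated as a Markov chain.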

Page 17:

Stochastic Roadmap Simulation

Stochastic simulation on roadmap and Monte Carlo simulation converge to same Boltzmann distribution

[Figure: roadmap with edge probabilities P_ij]

Page 18:

Problems with Monte Carlo Simulation

• Much time is wasted escaping local minima
• Each run generates a single pathway

Page 19:

Proposed Solution

Treat the roadmap as a Markov chain and apply First-Step Analysis

[Figure: roadmap with edge probabilities P_ij]

Page 20:

Example #1: Probability of Folding pfold

[Figure: from a given conformation, the folded state is reached first with probability pfold and the unfolded state with probability 1 - pfold]

“We stress that we do not suggest using pfold as a transition coordinate for practical purposes as it is very computationally intensive.” Du, Pande, Grosberg, Tanaka, and Shakhnovich, “On the Transition Coordinate for Protein Folding,” Journal of Chemical Physics (1998).

HIV integrase [Du et al. ’98]

Page 21:

First-Step Analysis

F: Folded set
U: Unfolded set

[Figure: node i with self-transition probability Pii and transition probabilities Pij, Pik, Pil, Pim to its neighbors j, k, l, m]

Let f_i = pfold(i). After one step:

$$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$$

On the slide, two of the neighbor terms are annotated "= 1": f equals 1 for nodes in F (and 0 for nodes in U).

• One linear equation per node
• Solution gives pfold for all nodes
• No explicit simulation run
• All pathways are taken into account
• Sparse linear system
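
The first-step equations can be solved without running any simulation. The sketch below is one possible way to do it, not the solver used by the authors: it applies Gauss-Seidel sweeps to the per-node equations, with f fixed at 1 on F and 0 on U. The function name `pfold_first_step` and the five-node chain are hypothetical.

```python
def pfold_first_step(P, folded, unfolded, tol=1e-12, max_sweeps=100000):
    """Solve f_i = sum_j P_ij f_j with f = 1 on `folded`, f = 0 on `unfolded`.

    P maps node -> {neighbor: probability}; a node may list itself.
    Gauss-Seidel iteration is an assumption made for this sketch.
    """
    f = {i: (1.0 if i in folded else 0.0) for i in P}
    free = [i for i in P if i not in folded and i not in unfolded]
    for _ in range(max_sweeps):
        change = 0.0
        for i in free:
            stay = P[i].get(i, 0.0)
            # Move the self-transition term to the left-hand side.
            new = sum(p * f[j] for j, p in P[i].items() if j != i) / (1.0 - stay)
            change = max(change, abs(new - f[i]))
            f[i] = new
        if change < tol:
            break
    return f


# Toy chain 0-1-2-3-4: node 0 is unfolded, node 4 is folded.
chain = {0: {0: 1.0}, 4: {4: 1.0}}
for i in (1, 2, 3):
    chain[i] = {i - 1: 0.25, i: 0.5, i + 1: 0.25}
pfold = pfold_first_step(chain, folded={4}, unfolded={0})
```

One solve yields pfold for every node of the roadmap at once, which is the contrast drawn with per-conformation simulation.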

Page 22:

In Contrast …

Computing pfold with MC simulation requires:

For every conformation c of interest

Perform many MC simulation runs from c

Count number of times F is attained first
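
For comparison, the simulation-based estimate can be sketched as repeated random walks from one starting conformation. This is an illustration on a hypothetical five-node chain, not the Monte Carlo protocol of the cited work; the function name `pfold_monte_carlo`, the seed, and the run count are arbitrary choices.

```python
import random


def pfold_monte_carlo(P, start, folded, unfolded, runs, rng):
    """Estimate pfold(start) as the fraction of walks that reach F before U."""
    hits = 0
    for _ in range(runs):
        node = start
        while node not in folded and node not in unfolded:
            targets = list(P[node])
            weights = [P[node][t] for t in targets]
            node = rng.choices(targets, weights=weights)[0]
        hits += node in folded
    return hits / runs


# Toy chain 0-1-2-3-4: node 0 is unfolded, node 4 is folded.
chain = {0: {0: 1.0}, 4: {4: 1.0}}
for i in (1, 2, 3):
    chain[i] = {i - 1: 0.25, i: 0.5, i + 1: 0.25}
estimate = pfold_monte_carlo(chain, 2, {4}, {0}, runs=4000, rng=random.Random(0))
```

Every additional conformation of interest needs its own batch of runs, and the estimate carries sampling noise that shrinks only with more runs.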

Page 23:

Computational Tests

• 1ROP (repressor of primer)
  – 2 helices
  – 6 DOF
• 1HDD (Engrailed homeodomain)
  – 3 helices
  – 12 DOF

H-P energy model with steric clash exclusion [Sun et al., 95]

Page 24:

Correlation with MC Approach

[Figure: 1ROP]

Page 25:

Computation Times (1ROP)

Monte Carlo:
• 49 conformations
• Over 11 days of computer time
• Over 10^6 energy computations

Roadmap:
• 5000 conformations
• 1.5 hours of computer time
• ~15,000 energy computations

~4 orders of magnitude speedup!
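
The speedup figure can be checked with a short calculation. The sketch below assumes the comparison is made per conformation, which the slide does not state explicitly, and it treats "over 11 days" as exactly 11 days, so the result is a lower bound.

```python
import math

# Monte Carlo: at least 11 days of computer time for 49 conformations.
mc_seconds_per_conf = 11 * 24 * 3600 / 49

# Roadmap: 1.5 hours of computer time for 5000 conformations.
roadmap_seconds_per_conf = 1.5 * 3600 / 5000

speedup = mc_seconds_per_conf / roadmap_seconds_per_conf
orders_of_magnitude = math.log10(speedup)
```

Under these assumptions `orders_of_magnitude` falls between 4 and 5, consistent with the slide.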

Page 26:

Example #2: Ligand-Protein Interaction

Computation of escape time from funnels of attraction around potential binding sites

funnel = ball of 10Å rmsd [Camacho, Vajda, 01]

Page 27:

Similar Computation Through Simulation [Sept, Elcock and McCammon ’99]

10K to 30K independent simulations

Page 28:

Computing Escape Time with Roadmap

[Figure: funnel of attraction containing node i, with self-transition probability Pii and transition probabilities Pij, Pik, Pil, Pim to its neighbors j, k, l, m]

$$\tau_i = 1 + P_{ii}\tau_i + P_{ij}\tau_j + P_{ik}\tau_k + P_{il}\tau_l + P_{im}\tau_m$$

(escape time is measured as number of steps of stochastic simulation)

On the slide, one neighbor term is annotated "= 0": τ equals 0 for a node outside the funnel.
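
The escape-time equations have the same structure as the pfold equations, with an extra "+ 1" per step. The sketch below solves them with Gauss-Seidel sweeps, which is an assumption of this illustration rather than the authors' solver. The function name `escape_times` and the toy chain are hypothetical.

```python
def escape_times(P, funnel, tol=1e-12, max_sweeps=100000):
    """Solve tau_i = 1 + sum_j P_ij tau_j for nodes in `funnel`.

    tau is fixed at 0 for every node outside the funnel.
    P maps node -> {neighbor: probability}; a node may list itself.
    """
    tau = {i: 0.0 for i in P}
    for _ in range(max_sweeps):
        change = 0.0
        for i in funnel:
            stay = P[i].get(i, 0.0)
            # Move the self-transition term to the left-hand side.
            rest = sum(p * tau[j] for j, p in P[i].items() if j != i)
            new = (1.0 + rest) / (1.0 - stay)
            change = max(change, abs(new - tau[i]))
            tau[i] = new
        if change < tol:
            break
    return tau


# Toy chain 0-1-2-3-4: nodes 1, 2, 3 form the funnel, nodes 0 and 4 lie outside.
chain = {0: {0: 1.0}, 4: {4: 1.0}}
for i in (1, 2, 3):
    chain[i] = {i - 1: 0.5, i + 1: 0.5}
tau = escape_times(chain, funnel=[1, 2, 3])
```

For this symmetric walk the expected number of steps to leave the funnel is 3 from nodes 1 and 3, and 4 from node 2.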

Page 29:

Distinguishing Catalytic Site

Given several potential binding sites, which one is the catalytic site?

Energy: electrostatic + van der Waals + solvation free energy terms

Page 30:

Complexes Studied

Ligand          Protein   # random nodes   # DOFs
oxamate         1ldm      8000             7
Streptavidin    1stp      8000             11
Hydroxylamine   4ts1      8000             9
COT             1cjw      8000             21
THK             1aid      8000             14
IPM             1ao5      8000             10
PTI             3tpi      8000             13

Page 31:

Distinction Based on Energy

Protein   Bound state (kcal/mol)   Best potential binding site (kcal/mol)
1stp      -15.1                    -14.6
4ts1      -19.4                    -14.6
3tpi      -25.2                    -16.0
1ldm      -11.8                    -13.6
1cjw      -11.7                    -18.0
1aid      -11.2                    -22.2
1ao5      -7.5                     -13.1

Slide annotations on groups of rows: "Able to distinguish catalytic site" and "Not able".

Page 32:

Distinction Based on Escape Time

Protein   Bound state (# steps)   Best potential binding site (# steps)
1stp      3.4E+9                  1.1E+7
4ts1      3.8E+10                 1.8E+6
3tpi      1.3E+11                 5.9E+5
1ldm      8.1E+5                  3.4E+6
1cjw      5.4E+8                  4.2E+6
1aid      9.7E+5                  1.6E+8
1ao5      6.6E+7                  5.7E+6

Slide annotations on groups of rows: "Able to distinguish catalytic site" and "Not able".
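
The two tables can be compared row by row. The sketch below assumes a criterion that the slides do not spell out: the catalytic site counts as distinguished when the bound state has the lower energy, or the longer escape time, relative to the best other potential binding site. The values are copied from the two tables; the variable names are hypothetical.

```python
# protein: (bound state, best potential binding site), in kcal/mol
energy_kcal = {
    "1stp": (-15.1, -14.6), "4ts1": (-19.4, -14.6), "3tpi": (-25.2, -16.0),
    "1ldm": (-11.8, -13.6), "1cjw": (-11.7, -18.0), "1aid": (-11.2, -22.2),
    "1ao5": (-7.5, -13.1),
}

# protein: (bound state, best potential binding site), in simulation steps
escape_steps = {
    "1stp": (3.4e9, 1.1e7), "4ts1": (3.8e10, 1.8e6), "3tpi": (1.3e11, 5.9e5),
    "1ldm": (8.1e5, 3.4e6), "1cjw": (5.4e8, 4.2e6), "1aid": (9.7e5, 1.6e8),
    "1ao5": (6.6e7, 5.7e6),
}

# Assumed criterion: the bound state must beat the best alternative site.
by_energy = [p for p, (bound, other) in energy_kcal.items() if bound < other]
by_escape = [p for p, (bound, other) in escape_steps.items() if bound > other]
```

Under this assumed criterion, `by_energy` holds three of the seven complexes and `by_escape` holds five.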

Page 33:

Conclusion

Probabilistic roadmaps are a promising tool for computing ensemble properties of molecular pathways

Current work:
• Non-uniform sampling strategies to handle more complex molecules
• More realistic energetic models
• Extension to molecular dynamic simulation
• Connection to in-vitro experiments (interaction of two proteins)

Page 34:

Part II
ChainTree: A Data Structure for Efficient Monte Carlo Simulation of Proteins

Itay Lotan, Fabian Schwarzer, Dan Halperin (1), Jean-Claude Latombe
Computer Science Department, Stanford University
(1) Computer Science Department, Tel Aviv University

Page 35:

Monte Carlo Simulation (MCS)

Used to study thermodynamic and kinetic properties of proteins

Random walk through conformation space
At each attempted step:
– Perturb current conformation at random
– Accept step with probability:

$$P(\text{accept}) = \min\left\{1,\; e^{-\Delta E / k_B T}\right\}$$

Problem: How to maintain energy efficiently?
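
One attempted step of such a simulation can be sketched as follows. This is a generic Metropolis step under stated assumptions, not the authors' implementation: the conformation is a single number, the energy is a toy quadratic well, and the names `acceptance_probability`, `metropolis_step`, `perturb`, and `energy` are hypothetical.

```python
import math
import random


def acceptance_probability(delta_e, kT):
    """min(1, exp(-delta_e / kT)), written to avoid overflow for downhill moves."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / kT)


def metropolis_step(conformation, energy, perturb, kT, rng):
    """Perturb at random, then accept or reject the candidate conformation."""
    candidate = perturb(conformation, rng)
    delta_e = energy(candidate) - energy(conformation)
    if rng.random() < acceptance_probability(delta_e, kT):
        return candidate
    return conformation


# Toy example: one degree of freedom in a quadratic energy well.
def energy(x):
    return x * x


def perturb(x, rng):
    return x + rng.uniform(-0.5, 0.5)


rng = random.Random(0)
x = 3.0
for _ in range(1000):
    x = metropolis_step(x, energy, perturb, kT=1.0, rng=rng)
```

In this sketch the energy is recomputed from scratch twice per step, which is exactly the cost the following slides aim to reduce for proteins.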

Page 36:

Energy Function

E = bonded terms + non-bonded terms

• Bonded terms, e.g. bond length: easy to compute
• Non-bonded terms, e.g. Van der Waals, depend on distances between pairs of atoms: expensive to compute, O(n^2)

Page 37:

Energy Function

Non-bonded terms

• Use cutoff distance (6 - 12Å)
• Only O(n) interacting pairs [Halperin & Overmars ’98]

Problem: How to find interacting pairs without enumerating all atom pairs?

Page 38:

Grid Method

• Subdivide space into cubic cells
• Compute cell that contains each atom center
• Store results in hash table

[Figure: grid cells with side length d_cutoff]

• Θ(n) time to update grid
• O(1) time to find interactions for each atom
• Θ(n) to find all interactions

Asymptotically optimal in worst-case!
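
The grid method can be sketched with a dictionary used as the hash table. This is a minimal illustration, assuming the cell side equals the cutoff distance so that interacting atoms always lie in the same or an adjacent cell. The function name `interacting_pairs` and the sample coordinates are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def interacting_pairs(centers, cutoff):
    """Return the set of index pairs (i, j), i < j, within `cutoff` of each other."""
    # Hash each atom center into a cubic cell of side `cutoff`.
    grid = defaultdict(list)
    for index, (x, y, z) in enumerate(centers):
        cell = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        grid[cell].append(index)

    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Only the 27 cells around (and including) this one can hold partners.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(centers[i], centers[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs


# Made-up atom centers (illustration only).
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0),
           (5.5, 5.0, 5.0), (20.0, 0.0, 0.0)]
pairs = interacting_pairs(centers, cutoff=2.0)
```

The whole grid is rebuilt on every call here, which matches the Θ(n) update cost on the slide even when only a few atoms have moved.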

Page 39:

Can We Do Better on Average?

Proteins are long kinematic chains

Page 40:

Protein’s Kinematic Structure

Torsional DOFs: angles for the backbone and for the side-chains

Conformational space

Page 41:

Can We Do Better on Average?

Proteins are long kinematic chains

Few DOFs are perturbed at each MC step

• Long sub-chains stay rigid at each step
• Many partial energy sums remain constant

How to retrieve unchanged partial sums?

Page 42:

Two New Data Structures

1. ChainTree: fast detection of interacting atom pairs

2. EnergyTree: reuse of unchanged partial energy sums

Page 43:

ChainTree

Combination of two hierarchies:
• Transform hierarchy
• Bounding volume hierarchy

Page 44:

ChainTree

Combination of two hierarchies:
• Transform hierarchy: approximate kinematics of protein backbone at successive resolutions

Page 45:

ChainTree

Combination of two hierarchies:
• Bounding volume hierarchy: approximate geometry of protein at successive resolutions (Larsen et al., ’00)

Page 46:

ChainTree

Page 47:

Updating the ChainTree

Update path to root:
– Recompute transforms that shortcut change
– Recompute bounding volumes that contain change
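
The cost of an update depends on how many tree nodes lie on a path from a changed element to the root. The sketch below only counts those nodes; it does not recompute any transform or bounding volume. It assumes a complete binary tree stored in heap layout with a power-of-two number of leaves, which is a simplification of the ChainTree, and the function name `marked_nodes` is hypothetical.

```python
def marked_nodes(num_leaves, changed_leaves):
    """Return the tree nodes on the paths from changed leaves up to the root.

    Heap layout: node 0 is the root, the parent of node v is (v - 1) // 2,
    and leaf number t is stored at index num_leaves - 1 + t.
    Assumption: num_leaves is a power of two.
    """
    first_leaf = num_leaves - 1
    marked = set()
    for leaf in changed_leaves:
        node = first_leaf + leaf
        # Stop climbing once a path already marked by another change is reached.
        while node not in marked:
            marked.add(node)
            if node == 0:
                break
            node = (node - 1) // 2
    return marked


one_change = marked_nodes(1024, [0])        # one path of length log2(n) + 1
two_adjacent = marked_nodes(8, [2, 3])      # adjacent changes share most of a path
```

Changes that are close together along the chain share ancestors, so the number of marked nodes grows more slowly than k times the tree height.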

Page 48:

Finding Interacting Pairs

vs.

• Do not search inside rigid sub-chains (unmarked nodes)

• Do not test two nodes with no marked node between them

Page 50:

Computational Complexity

• n : total number of DOFs in protein backbone

• k : number of simultaneous DOF changes at each step of MCS

• Updating complexity: $O\left(k \log \frac{n}{k}\right)$

• Worst-case complexity of finding all interacting pairs: $\Theta(n^{4/3})$, but performs much better in practice!

Page 51:

EnergyTree

[Figure: EnergyTree nodes labeled E(N,N), E(N,O), E(P,P), E(O,O)]

Page 52:

EnergyTree

$$E(\alpha, \beta) = E(\alpha_l, \beta_l) + E(\alpha_r, \beta_r) + E(\alpha_l, \beta_r) + E(\alpha_r, \beta_l)$$

$$E(\alpha, \alpha) = E(\alpha_l, \alpha_l) + E(\alpha_r, \alpha_r) + E(\alpha_l, \alpha_r)$$

[Figure: EnergyTree nodes labeled E(N,N), E(N,O), E(P,P), E(O,O)]
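
The recursion on partial energy sums can be sketched over index ranges of a chain. This is an illustration under stated assumptions, not the EnergyTree itself: tree nodes are half-open index ranges obtained by balanced splitting, the pair term is a toy inverse-distance energy, and memoization merely stands in for the stored partial sums (there is no update or invalidation step). The names `pair_energy` and `build_energy` are hypothetical.

```python
import math
from functools import lru_cache


def pair_energy(p, q):
    """Hypothetical toy pair term: inverse distance between two atom centers."""
    return 1.0 / math.dist(p, q)


def build_energy(points):
    """Return E(a, b), where a and b are index ranges (lo, hi) of a binary split."""

    def halves(node):
        lo, hi = node
        mid = (lo + hi) // 2
        return (lo, mid), (mid, hi)

    def is_leaf(node):
        return node[1] - node[0] == 1

    @lru_cache(maxsize=None)  # cached values play the role of stored partial sums
    def energy(a, b):
        if a == b:
            if is_leaf(a):
                return 0.0  # a single atom has no pair term with itself
            left, right = halves(a)
            return energy(left, left) + energy(right, right) + energy(left, right)
        if is_leaf(a) and is_leaf(b):
            return pair_energy(points[a[0]], points[b[0]])
        if is_leaf(a):
            return sum(energy(a, child) for child in halves(b))
        if is_leaf(b):
            return sum(energy(child, b) for child in halves(a))
        # Four cross terms between the children of two distinct nodes.
        return sum(energy(x, y) for x in halves(a) for y in halves(b))

    return energy


# Made-up atom centers (illustration only).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
          (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
energy = build_energy(points)
root = (0, len(points))
total = energy(root, root)
brute_force = sum(pair_energy(points[i], points[j])
                  for i in range(len(points)) for j in range(i + 1, len(points)))
```

The recursive total matches the direct sum over all atom pairs; in the actual data structure, sums for pairs of unchanged sub-chains would be reused across Monte Carlo steps instead of recomputed.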

Page 53:

Experimental Setup

Energy function:
– Van der Waals
– Electrostatic
– Attraction between native contacts
– Cutoff at 12Å

300,000 steps MCS
Early rejection for large vdW terms

Page 54:

Results: 1-DOF change

[Figure: plot with axis labels (68), (144), (374), (755)]

Page 55:

Results: 5-DOF change

[Figure: plot with axis labels (68), (144), (374), (755)]

Page 56:

Two-Pass ChainTree

[Figure: plot with axis labels (68), (144), (374), (755)]

Page 57:

Conclusion

• Chain/EnergyTree reduces average time per step in MCS of proteins (vs. grid)

• Exploit chain kinematics of protein

• Larger speed-up for bigger proteins and for smaller number of simultaneous DOF changes

Page 58:

What is Computational Biology?

• Using computers in Biology?
• Designing efficient algorithms for analyzing biological data and simulating biological processes?
• Using Biology to design new algorithms and computing hardware?

Cultural clash:
• Biology: classification
• Computer Science: abstraction

In any case, Computational Biology will be a critical domain for the next 20 years, probably the next “big thing” after the Internet