Page 1: Understanding molecular mechanisms of protein tyrosine kinases …umu.diva-portal.org/smash/get/diva2:1095434/FULLTEXT01.pdf · 2017. 5. 14. · 2.1.2 Alchemical free energy calculations

Understanding molecular mechanisms of protein tyrosine kinases by molecular dynamics and free energy calculations

Yaozong Li

Doctoral Thesis, 2017

Department of Chemistry

Umeå University, Sweden

This work is protected by the Swedish Copyright Legislation (Act 1960:729)

ISBN: 978-91-7601-727-2

Electronic version available at http://umu.diva-portal.org/

Tryck/Printed by: VMC, KBC, Umeå university

Umeå, Sweden, 2017

Everything should be made as simple as possible,

but no simpler.

-- Albert Einstein

To my wife

Table of Contents

List of Publications for the Thesis
List of Abbreviations
1. Introduction
2. Theory and Method
2.1 Theory
2.1.1 Molecular dynamics (MD)
2.1.1.1 Classical mechanics
2.1.1.2 Potential energy functions
2.1.1.3 Non-bonded interaction evaluation
2.1.1.4 Thermostat
2.1.1.5 Barostat
2.1.1.6 Accelerated MD (aMD)
2.1.2 Alchemical free energy calculations
2.1.3 Characterization of protein dynamics and motions
2.2 Background of Protein Kinases
2.2.1 Insulin/insulin-like growth factor 1 receptor signaling pathways
2.2.2 Protein kinase allostery and phosphorylation
2.3 Methods
2.3.1 Preparation of kinase systems
2.3.2 MD simulation protocols
2.3.3 Dynamics analysis
2.3.4 Free energy calculation protocols
3. Phosphoryl Transfer by Kinases
4. Kinase Dynamics and the Free Energy Landscape
4.1 Kinase dynamics and cooperativity affected by phosphorylation
4.1.1 Protein Motion
4.1.2 Structural Correlation
4.1.3 Structural Cooperation
4.1.4 αC-helix Orientation
4.2 General mechanisms regulating protein kinase function
4.2.1 Attraction-to-Repulsion Switch
4.2.2 A-loop and αC-helix Coupling
4.3 Free energy landscape and kinase allostery
4.3.1 Free Energy Landscape
4.3.2 Kinase Allostery
5. Gaussian Soft-core Potentials (GSC)
5.1 Reliability of new soft-core potentials
5.2 Origin of the Efficiency from GSC Protocol
5.3 Optimize location of λ simulations
6. Conclusions and Future Work
7. Acknowledgements
8. References

List of Publications for the Thesis

I. Pedro Ojeda-May†, Yaozong Li†, Victor Ovchinnikov, and Kwangho Nam. Role of Protein Dynamics in Allosteric Control of the Catalytic Phosphoryl Transfer of Insulin Receptor Kinase. J. Am. Chem. Soc., 2015, 137 (39), 12454–12457. (contribution: model building and classical MD simulations)

II. Yaozong Li and Kwangho Nam. Dynamic, structural and thermodynamic basis of insulin-like growth factor 1 kinase allostery mediated by activation loop phosphorylation. Chem. Sci. 2017, 8 (5), 3453-3464.

III. Yaozong Li and Kwangho Nam. Physical end-point extrapolatable soft-core potentials for efficient and accurate alchemical free energy calculations. Manuscript.

All papers have been reprinted with kind permission from the publishers.

†Joint first authors

Publications not included in this thesis

i. Xianqiang Sun, Lei Chen, Yaozong Li, Weihua Li, Guixia Liu, Yaoquan Tu and Yun Tang. Structure-based ensemble-QSAR model: a novel approach to the study of the EGFR tyrosine kinase and its inhibitors. Acta Pharmacol. Sin. (2014) 35: 301–310.

ii. Jin Zhang†, Yaozong Li†, Arun A. Gupta, Kwangho Nam, and Patrik L. Andersson. Identification and Molecular Interaction Studies of Thyroid Hormone Receptor Disruptors among Household Dust Contaminants. Chem. Res. Toxicol., 2016, 29 (8), 1345–1354.

iii. Ranjeet Kumar, Candan Ariöz, Yaozong Li, Niklas Bosaeus, Sandra Rocha, Pernilla Wittung-Stafshede. Disease-causing point-mutations in metal-binding domains of Wilson disease protein decrease stability and increase structural dynamics. BioMetals, 2017, 30(1), 27-35.

†Joint first authors

Abstract

Background Insulin receptor kinase (IRK) and insulin-like growth factor 1 receptor kinase (IGF-1RK) are two important members of the large class of tyrosine kinase receptors. They play pivotal roles in the regulation of glucose homeostasis, cell proliferation, differentiation, motility, and transformation. Their dysfunction is linked to diabetes, rheumatoid arthritis and many cancers. Although their regulatory mechanisms have been widely studied experimentally, the atomistic details are still poorly understood, especially the influence of activation loop (A-loop) phosphorylation.

Methods Molecular dynamics (MD) and alchemical free energy simulations are carried out to understand the mechanisms underlying kinase regulation and their thermodynamic basis. To capture a full picture of the entire kinase catalytic cycle, different functional steps are considered, i.e., conformational transition, substrate binding, phosphoryl transfer and product release. The effects of A-loop phosphorylation on the protein’s dynamics, structure, stability, and free energy landscape are examined by various analysis methods, including principal component analysis (PCA), motion projection, dynamical network analysis and free energy perturbation.

Results The main findings are that A-loop phosphorylation 1) shifts the kinase conformational population toward the active state by changing the electrostatic environment in the kinase apo form, 2) allosterically fine-tunes the orientation of the catalytic residues, mediated by the αC-helix, in the reactant- and product-bound states, and 3) thermodynamically favors kinase catalysis, as shown by a free energy landscape that mimics the catalytic cycle. An integrated view of the roles of A-loop phosphorylation in kinase allostery is developed by incorporating the kinase’s dynamics, structural interactions, thermodynamics and free energy landscape. In addition, new soft-core potentials (Gaussian soft-core) and protocols are developed to conduct accurate and efficient alchemical free energy calculations.

Conclusions The entire catalytic cycle is examined by MD and free energy calculations, and comprehensive analyses are conducted. The findings from the studied kinases are general and can be extended to the other members of the IRK family, or even to non-homologous families, because of the conservation of the characteristic residues between their A-loop and αC-helix. In addition, the Gaussian soft-core potentials provide a new tool to perform alchemical free energy calculations efficiently.

List of Abbreviations

ADP Adenosine diphosphate

A-loop Activation loop

aMD Accelerated molecular dynamics

ATP Adenosine triphosphate

BAR Bennett acceptance ratio

CSC Classical soft-core

DFG Asp-Phe-Gly

EGFR Epidermal growth factor receptor

FEP Free energy perturbation

GSC Gaussian soft-core

IGF-1RK Insulin-like growth factor 1 receptor kinase

IRK Insulin receptor kinase

L-J Lennard-Jones

MD Molecular dynamics

ms Millisecond

NMR Nuclear magnetic resonance

NPT Constant number of particles, pressure and temperature

ns Nanosecond

NVT Constant number of particles, volume and temperature

OIOFEP Optimizing infinite-order free energy perturbation

PBC Periodic boundary condition

PCA Principal component analysis

PME Particle-mesh Ewald

PMF Potential of mean force

pTyr Phosphorylated tyrosine

QM Quantum mechanics

QM/MM Hybrid quantum mechanics and molecular mechanics

RAF Rapidly accelerated fibrosarcoma

RMSD Root-mean-square deviation

RMSF Root-mean-square fluctuation

TI Thermodynamic integration

vdW van der Waals

µs Microsecond

1. Introduction

Insulin-like growth factor 1 receptor (IGF-1R) and insulin receptor (IR) are two key members of the large family of tyrosine kinase receptors. The IGF-1R and IR signal transduction pathways control cell proliferation, differentiation, motility, transformation and glucose homeostasis [1]. Because of these important functions, dysfunction of the two signaling pathways is associated with a number of human diseases, including cancers [2-6], Alzheimer’s disease [7], rheumatoid arthritis [8] and diabetes [9]. Because IGF-1R and IR share a similar structural framework, they utilize a common signaling mechanism to control their catalytic activities. Specifically, the signaling is first initiated by binding of an extracellular hormone, e.g., insulin, IGF-1 or IGF-2, to the receptor’s ectodomain, leading to dimerization of the intracellular kinase domains [10-11]. Then, trans-autophosphorylation takes place at three tyrosine residues (e.g., Tyr1131, Tyr1135, and Tyr1136 for IGF-1RK) located in the activation loop (A-loop) of the kinase, yielding the fully activated kinase [12].

The catalytic efficiency of the two kinases is enhanced approximately 200-fold after full activation via A-loop phosphorylation [12-13], of which ~20-fold is contributed by the increase in the catalytic rate (i.e., kcat) and the rest by the increase in substrate binding affinity. These thermodynamic and kinetic aspects have two implications for kinase activation and catalysis. First, the enzymes have a basal level of catalytic ability even in their unphosphorylated forms. Second, the unphosphorylated kinase can adopt the catalytically competent (active) conformation to exhibit the basal level of catalytic activity. These imply that the kinases interconvert between inactive and active conformations, and that the relative population of the two changes as the A-loop phosphorylation level varies (i.e., the population shift). Our computational studies on IRK and IGF-1RK [14] and others’ investigations on Src kinases [15] support this model. Given that the phosphorylatable sites are distant from the catalytic pocket, A-loop phosphorylation would function as an allosteric effector. Studying such allosteric effects is still challenging because the kinases fulfill their catalytic function through multiple conformational and functional states.

Nuclear magnetic resonance (NMR) techniques and molecular dynamics (MD) simulations have increased our understanding of the relationship between kinase protein dynamics and function [16-20], and of phosphorylation effects on tyrosine kinases (e.g., Src, EGFR and RAF) [21-24]. Nevertheless, we still lack an integrated picture of the allostery caused by A-loop phosphorylation at each step of the kinase catalytic cycle. In this thesis, I have exploited MD and alchemical free energy simulations to gain insights into how A-loop phosphorylation influences each conformational state in the catalytic cycle of IGF-1RK and IRK. Specifically, the effects of A-loop phosphorylation on protein dynamics, structure, stability and free energy landscape were analyzed, and an integrated view of A-loop phosphorylation effects was thereby obtained. In parallel, to facilitate further studies by free energy calculations, we developed new soft-core potentials, called Gaussian soft-core (GSC), and tested their applicability under different circumstances. In paper I [14], I focus on the impact of A-loop phosphorylation on the phosphoryl transfer step and examine whether protein dynamics plays a role in IRK catalysis. In paper II [25], I expand the scope to the entire catalytic cycle of IGF-1RK to draw a comprehensive picture of kinase allostery mediated by A-loop phosphorylation. In paper III, the GSC potentials and protocols are developed and tested on model systems and the kinase systems.

2. Theory and Method

Recent decades have witnessed the advancement of computer simulations for understanding biomolecular behavior and function [26]. Among the computational methods, molecular dynamics (MD) [27] plays a distinguished role because of its advantages, e.g., accurate estimation of experimental data, reasonable computational cost and applicability to different systems. Its theoretical cornerstone is classical mechanics, i.e., Newtonian mechanics. By implementing force fields, i.e., empirical approximations to “real” potential energy surfaces, we can simulate fairly large biomolecules for reasonably long times with decent computational power. In addition, with the advancement of reliable thermostat and barostat algorithms and their implementation, results from microscopic simulations have become comparable to macroscopic measurements. Many physical properties of the objects of interest can be obtained accurately without a priori knowledge, which leads to cost-efficient science. Among these physical properties, the free energy is an extremely useful quantity in real-life chemistry and biology because it governs the fate of biomolecules, but it is hard to determine accurately. In this chapter, I will present the basic theory and methodologies of MD simulations. I will also present how we use MD simulations to estimate the free energy difference between two related systems and derive useful information from it.

2.1 Theory

2.1.1 Molecular dynamics (MD)

2.1.1.1 Classical mechanics

If the electronic behavior of a studied molecule is not the focus, a classical-mechanics-based MD simulation can be conducted rather than quantum mechanics (QM) based calculations. The classical mechanics is formulated by Newton's equation of motion,

$$\mathbf{a}_i(t) = \frac{\mathbf{F}_i(t)}{m_i} = -\frac{1}{m_i}\nabla_i U(\mathbf{r}_i), \tag{1}$$

where $\mathbf{a}_i$, $\mathbf{F}_i$, $m_i$, $\nabla_i U$ and $\mathbf{r}_i$ are the acceleration, force, atomic mass, gradient of the potential energy and atomic position, respectively. To obtain the time evolution of the molecule in silico, a numerical integration algorithm has to be used. Since the original Verlet algorithm [28] was proposed, many variations of the algorithm have been developed. Among them, the leap-frog algorithm [29] is often used. Starting from a Taylor expansion of the atomic position, the velocity and position are propagated with the time step $\Delta t$ by the following equations:

$$\mathbf{v}_i\left(t + \tfrac{\Delta t}{2}\right) = \mathbf{v}_i\left(t - \tfrac{\Delta t}{2}\right) + \mathbf{a}_i(t)\,\Delta t + O(\Delta t^3), \tag{2}$$

$$\mathbf{r}_i(t + \Delta t) = \mathbf{r}_i(t) + \mathbf{v}_i\left(t + \tfrac{\Delta t}{2}\right)\Delta t + O(\Delta t^3), \tag{3}$$

$$\mathbf{v}_i(t) = \frac{1}{2}\left[\mathbf{v}_i\left(t + \tfrac{\Delta t}{2}\right) + \mathbf{v}_i\left(t - \tfrac{\Delta t}{2}\right)\right]. \tag{4}$$

From equations (2) and (3), it is apparent that the obtained velocity and position are not synchronized but are offset by a time lag of $\Delta t/2$, which is vividly likened to “frog leaping”. The velocity at time $t$ can be obtained through eq. (4).
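As a hedged illustration of eqs. (2)-(4), the following minimal Python sketch applies the leap-frog scheme to a one-dimensional harmonic oscillator with unit mass and unit force constant. The function name `leapfrog`, the test system, the way the first half-step velocity is bootstrapped and all parameter values are assumptions made for this example; they are not taken from the thesis.

```python
import math

def leapfrog(x0, v0, accel, dt, n_steps):
    """Propagate one particle with the leap-frog scheme of eqs. (2)-(4).

    Returns a list of (t, x(t), v(t)); v(t) is the on-step velocity of eq. (4).
    """
    x = x0
    # v(t - dt/2) is bootstrapped from v(0) with a half step backwards
    # (an assumption of this sketch; other start-up choices are possible).
    v_half = v0 - 0.5 * dt * accel(x0)
    traj = []
    for step in range(n_steps):
        v_new = v_half + accel(x) * dt   # eq. (2): v(t + dt/2)
        v_on = 0.5 * (v_new + v_half)    # eq. (4): v(t)
        traj.append((step * dt, x, v_on))
        x = x + v_new * dt               # eq. (3): x(t + dt)
        v_half = v_new
    return traj

# Harmonic oscillator a(x) = -x, whose exact solution is x(t) = cos(t)
trajectory = leapfrog(x0=1.0, v0=0.0, accel=lambda x: -x, dt=0.01, n_steps=1000)
t_end, x_end, v_end = trajectory[-1]
energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
print(f"t={t_end:.2f} x={x_end:.5f} exact={math.cos(t_end):.5f} E={energy:.6f}")
```

In this toy case the total energy stays close to its initial value of 0.5, which reflects the good long-time stability that makes leap-frog-type integrators popular in MD.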

2.1.1.2 Potential energy functions

Most MD simulations are carried out using force fields as an approximation of the “real” potential energy surface, which in principle can be described more accurately by QM methods at the expense of a large computational cost. Among the many different forms of force fields, additive force fields are most often used in simulations of large biomacromolecules, such as proteins and nucleic acids, solvated by a large number of solvent molecules. Typical additive force fields include two classes of energy terms: bonded and non-bonded terms. Here, the CHARMM force field [30-32] is taken as an example and presented in eq. (5). The first six terms are bonded terms, successively describing bond stretching, angle bending, Urey-Bradley interactions, dihedral torsion, improper (out-of-plane) bending and the protein backbone correction. The energy evaluation of these terms is easy and efficient for two reasons. First, the necessary parameters (i.e., force constants and equilibrium geometric values) and the geometric values (e.g., bond distances, angles and dihedrals) are readily obtained from the force field and the bonded list, respectively, both of which are set up at the beginning of each simulation. Second, the total number of bonded terms grows linearly with the number of particles. The last two terms are the Lennard-Jones (L-J) and Coulomb potentials, representing vdW and electrostatic interactions, respectively.

$$
\begin{aligned}
U(\mathbf{R}) ={}& \sum_{\text{bonds}} K_b (b - b_0)^2 + \sum_{\text{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\text{Urey-Bradley}} K_{UB} (S - S_0)^2 \\
&+ \sum_{\text{dihedrals}} K_\varphi \left[1 + \cos(n\varphi - \delta)\right] + \sum_{\text{impropers}} K_\omega (\omega - \omega_0)^2 + \sum_{\text{residues}} U_{\text{CMAP}}(\varphi, \psi) \\
&+ \sum_{\text{non-bonded pairs}} \left\{ \varepsilon_{ij}^{\min} \left[ \left(\frac{R_{ij}^{\min}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{ij}^{\min}}{r_{ij}}\right)^{6} \right] + \frac{q_i q_j}{4\pi\varepsilon_0 \varepsilon\, r_{ij}} \right\}.
\end{aligned} \tag{5}
$$

Based on the empirical potential energy function, the forces acting on each atom can be derived. Each atom in the system is then propagated according to eqs. (2)-(4).
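To make the two non-bonded terms of eq. (5) concrete, the sketch below evaluates the 12-6 L-J term in the Rmin form and the Coulomb term for a single atom pair. The function names, the example parameter values and the unit convention (kcal/mol, Å, elementary charges, with a Coulomb prefactor of about 332.07 kcal·Å/(mol·e²)) are assumptions chosen for illustration and are not taken from the thesis.

```python
# 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2); approximate value, assumed for this sketch
COULOMB_CONSTANT = 332.0716

def lj_energy(r, eps_min, r_min):
    """12-6 L-J term of eq. (5): eps * [(Rmin/r)^12 - 2 (Rmin/r)^6].

    eps_min is taken as a positive well depth here, so the minimum is -eps_min at r = r_min.
    """
    x6 = (r_min / r) ** 6
    return eps_min * (x6 * x6 - 2.0 * x6)

def coulomb_energy(r, q_i, q_j, dielectric=1.0):
    """Coulomb term of eq. (5) for two point charges separated by r."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)

def nonbonded_pair_energy(r, eps_min, r_min, q_i, q_j):
    """Sum of the vdW and electrostatic contributions for one atom pair."""
    return lj_energy(r, eps_min, r_min) + coulomb_energy(r, q_i, q_j)

# Hypothetical pair: well depth 0.1 kcal/mol, Rmin 3.5 Angstrom, charges +0.4 e and -0.4 e
for distance in (3.0, 3.5, 5.0, 10.0):
    e_pair = nonbonded_pair_energy(distance, 0.1, 3.5, 0.4, -0.4)
    print(f"r = {distance:5.1f} A   E = {e_pair:9.4f} kcal/mol")
```

At r = Rmin the L-J term equals minus the well depth, and at large separation the pair energy is dominated by the slowly decaying Coulomb term.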

2.1.1.3 Non-bonded interaction evaluation

Non-bonded energy evaluations consume most of the simulation time because the number of interaction pairs increases with $N^2$, where $N$ is the number of atoms in the simulated system. To reduce the computational burden, many approaches have been implemented in modern MD simulation programs. The vdW interactions, which are mostly approximated by the 12-6 Lennard-Jones (L-J) potential, can be evaluated with cut-off strategies. This is justified by the fact that vdW interactions decay rapidly with increasing distance between two atoms (proportional to $r^{-6}$). Furthermore, to turn off the L-J potential smoothly at the cut-off distance, different cut-off methods can be used. One such example is the switching function [33], adopted in CHARMM as

$$
S(r) =
\begin{cases}
1, & r \le r_{\text{on}} \\[4pt]
\dfrac{\left(r_{\text{off}}^2 - r^2\right)^2 \left(r_{\text{off}}^2 + 2r^2 - 3r_{\text{on}}^2\right)}{\left(r_{\text{off}}^2 - r_{\text{on}}^2\right)^3}, & r_{\text{on}} < r \le r_{\text{off}} \\[4pt]
0, & r > r_{\text{off}}
\end{cases} \tag{6}
$$

where $S(r)$ is the weighting function that scales the L-J interaction given in eq. (5). When the distance $r$ is less than $r_{\text{on}}$, the weight is 1 and the energy is evaluated with the full L-J potential. When $r$ is in the range $r_{\text{on}} < r \le r_{\text{off}}$, the L-J potential is scaled by the second expression in eq. (6). Beyond the cut-off distance $r_{\text{off}}$, the weighting factor is 0, and thus there is no need to evaluate the L-J potential.
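The behavior of the switching function in eq. (6) can be checked with a few lines of Python. This is a minimal sketch; the function name `charmm_switch` and the switching window of 10 to 12 Å are assumptions made for the example.

```python
def charmm_switch(r, r_on, r_off):
    """Weighting function S(r) of eq. (6)."""
    if r <= r_on:
        return 1.0   # full L-J interaction
    if r > r_off:
        return 0.0   # beyond the cut-off: nothing to evaluate
    numerator = (r_off ** 2 - r ** 2) ** 2 * (r_off ** 2 + 2.0 * r ** 2 - 3.0 * r_on ** 2)
    denominator = (r_off ** 2 - r_on ** 2) ** 3
    return numerator / denominator

# Assumed switching window: 10 to 12 Angstrom
for distance in (9.0, 10.0, 10.5, 11.0, 11.5, 12.0, 13.0):
    print(f"r = {distance:4.1f}   S(r) = {charmm_switch(distance, 10.0, 12.0):.4f}")
```

The weight falls monotonically from 1 at r_on to 0 at r_off, so the scaled L-J energy goes to zero continuously instead of jumping at the cut-off.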

Unlike the L-J potential, the long-range Coulomb interaction decays very slowly with distance. As a result, the Coulomb sum converges only conditionally when it is evaluated under periodic boundary conditions (PBC). Typical cut-off approaches therefore substantially underestimate the total Coulomb interaction and often cause simulation artifacts. Under PBC, there exists a rigorous algorithm for the long-range Coulomb interactions, namely, the Ewald summation method. The Ewald method works by separating the original summation in real space into four terms, denoted as the real-space, reciprocal-space, self-correction and surface terms (eq. 7), which follow the notation of Bogusz et al. [34]:

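$$
E_{\text{Coulomb}} = \frac{1}{2}\sum_{\mathbf{n}}{}^{'} \sum_{i,j}^{N} \frac{q_i q_j}{|\mathbf{r}_{ij} + \mathbf{n}|} \approx E_{\text{Ewald}} = E_{\text{real}} + E_{\text{reciprocal}} + E_{\text{self}} + E_{\text{surface}}
$$

$$
\begin{aligned}
E_{\text{Ewald}} ={}& \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{|\mathbf{n}|=0}^{\infty}{}^{'}\, q_i q_j \frac{\operatorname{erfc}\left(\kappa|\mathbf{r}_{ij}+\mathbf{n}|\right)}{|\mathbf{r}_{ij}+\mathbf{n}|} \\
&+ \frac{1}{2\pi L^3}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{\mathbf{k}\neq 0} q_i q_j \frac{4\pi^2}{k^2}\exp\left(-\frac{k^2}{4\kappa^2}\right)\cos(\mathbf{k}\cdot\mathbf{r}_{ij}) \\
&- \frac{\kappa}{\sqrt{\pi}}\sum_{i=1}^{N} q_i^2 + \frac{2\pi}{3L^3}\left|\sum_{i=1}^{N} q_i \mathbf{r}_i\right|^2 .
\end{aligned} \tag{7}
$$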

In the $E_{\text{real}}$ term, the point charges are screened by neutralizing Gaussian charge distributions (of equal magnitude but opposite sign), so the interaction converges rapidly and is evaluated only up to the cut-off distance. The remaining interactions, i.e., those due to the cancelling Gaussian distributions (equal but opposite to the neutralizing charges), are evaluated in reciprocal space (second term in eq. 7, where $\mathbf{k}$ is the reciprocal-space lattice vector). The summation in reciprocal space can be further sped up by the particle-mesh Ewald (PME) method [35-36], in which the fast Fourier transform is used, so that the total computational cost scales as $N \log N$.

Since the reciprocal term includes artificial self-interactions of the point charges, the third term in eq. 7 cancels this contribution (i.e., the self-correction term). The last term is the surface dipole term; it vanishes if the so-called “tinfoil” boundary (i.e., infinite dielectric value) is assumed and the total charge of the system is zero.
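A direct (non-PME) evaluation of the first three terms of eq. (7) can be sketched in plain Python and checked against a textbook result, the Madelung constant of the rock-salt lattice (about 1.7476). The function name `ewald_energy`, the reduced (Gaussian) units, the value of the screening parameter and the truncation limits are all assumptions of this example; the surface term is dropped, which corresponds to the tinfoil boundary condition.

```python
import itertools
import math

def ewald_energy(positions, charges, box, kappa, n_real=2, n_recip=5):
    """Ewald energy of a neutral cubic cell of edge `box` (tinfoil boundary, Gaussian units)."""
    n_atoms = len(charges)

    # Real-space term: erfc-screened pair sum over the central cell and its images
    e_real = 0.0
    for shift in itertools.product(range(-n_real, n_real + 1), repeat=3):
        for i in range(n_atoms):
            for j in range(n_atoms):
                if i == j and shift == (0, 0, 0):
                    continue  # the primed sum skips i == j in the central cell
                d = [positions[i][a] - positions[j][a] + shift[a] * box for a in range(3)]
                r = math.sqrt(sum(c * c for c in d))
                e_real += 0.5 * charges[i] * charges[j] * math.erfc(kappa * r) / r

    # Reciprocal-space term; the double sum over i, j of q_i q_j cos(k.r_ij) equals |S(k)|^2
    e_recip = 0.0
    for m in itertools.product(range(-n_recip, n_recip + 1), repeat=3):
        if m == (0, 0, 0):
            continue
        k = [2.0 * math.pi * m_a / box for m_a in m]
        k2 = sum(k_a * k_a for k_a in k)
        phases = [sum(k[a] * p[a] for a in range(3)) for p in positions]
        s_re = sum(q * math.cos(ph) for q, ph in zip(charges, phases))
        s_im = sum(q * math.sin(ph) for q, ph in zip(charges, phases))
        e_recip += (2.0 * math.pi / box ** 3) * math.exp(-k2 / (4.0 * kappa ** 2)) / k2 \
            * (s_re ** 2 + s_im ** 2)

    # Self-correction term
    e_self = -kappa / math.sqrt(math.pi) * sum(q * q for q in charges)
    return e_real + e_recip + e_self

# Rock-salt unit cell: 8 ions on a cubic grid with nearest-neighbour distance 1
sites = list(itertools.product((0, 1), repeat=3))
positions = [tuple(float(c) for c in site) for site in sites]
charges = [(-1.0) ** sum(site) for site in sites]

energy = ewald_energy(positions, charges, box=2.0, kappa=1.5)
madelung = -energy / 4.0  # the cell holds four ion pairs
print(f"Madelung constant estimate: {madelung:.5f}")
```

With these settings both sums are converged well beyond the displayed digits, which illustrates how the Gaussian screening turns a conditionally convergent sum into two rapidly convergent ones.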

2.1.1.4 Thermostat

The Nosé thermostat [37] is a pioneering method that allows control of the temperature of the simulated system during an MD simulation by coupling the system to an external thermal bath. More importantly, this thermostat can produce canonical ensemble distributions. The coupling works by extending the system Hamiltonian as

$$H_{\text{Nosé}} = \sum_i \frac{\mathbf{p}_i^2}{2 m_i s^2} + U(\mathbf{r}) + \frac{p_s^2}{2Q} + g_{\text{Nosé}}\, k_B T \ln s, \tag{8}$$

where $H_{\text{Nosé}}$, $s$, $Q$, $p_s$, $g_{\text{Nosé}}$ and $T$ are the extended Hamiltonian, the extended variable (generalized coordinate), its associated mass, its conjugate (generalized) momentum, the number of degrees of freedom of the system ($g_{\text{Nosé}} = 3N + 1$) and the thermal-bath temperature, respectively. The first two terms of eq. (8) represent the Hamiltonian of the simulated system. The next two terms are the generalized kinetic and potential energies of the extended system. By coupling the system to the thermal bath, thermal energy can be transferred from the thermal bath to the system and vice versa, so that the system temperature fluctuates around the target temperature. The magnitude of the temperature fluctuations can be controlled by selecting a proper mass $Q$. When performing an MD simulation with the Nosé thermostat, however, the simulated time step (virtual time) is inconsistent with the real time. In fact, the two time steps are related by $\Delta t = \Delta t'/s$, where $\Delta t$ and $\Delta t'$ are the real and virtual time steps, respectively, and $s$ can be viewed as a scaling factor of the time step. Because of the inconsistency between the two time steps, many physical properties cannot be conveniently extracted from the trajectory produced by Nosé MD.

This time step inconsistency can be avoided by using the modification proposed by Hoover. In the Hoover method, the time-scaling factor $s$ is eliminated from the equations of motion, which are given below,

$$
\begin{cases}
\dot{\mathbf{r}}_i = \dfrac{\mathbf{p}_i}{m_i} \\[6pt]
\dot{\mathbf{p}}_i = \mathbf{F}_i(\mathbf{r}) - \zeta_H \mathbf{p}_i \\[6pt]
\dot{\zeta}_H = \dfrac{1}{Q}\left(\sum_i \dfrac{\mathbf{p}_i^2}{m_i} - g_H k_B T\right) \\[6pt]
\zeta_H = \dfrac{d \ln s}{dt}
\end{cases} \tag{9}
$$

These equations of motion produce canonical ensemble distributions by adjusting the particle velocities through the friction factor $\zeta_H$.

Here, $g_H$, the number of degrees of freedom, is $3N$ and equals that of the studied system. Similar to the Nosé thermostat, the mass $Q$ acts as the inertia for the heat flow into and out of the system. This modified thermostat inherits the spirit of the extended Nosé Hamiltonian and is thus named the Nosé-Hoover thermostat [38]. The Hoover-modified extended Hamiltonian is

$$H_{\text{Hoover}} = \sum_i \frac{\mathbf{p}_i^2}{2 m_i} + U(\mathbf{r}) + \frac{Q \zeta_H^2}{2} + g_H k_B T \ln s. \tag{10}$$
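The feedback mechanism of the Nosé-Hoover thermostat can be made explicit by writing out the right-hand sides of the equations of motion in eq. (9) in their standard textbook form: positions follow the momenta, momenta feel the force minus a friction proportional to the thermostat variable, and the thermostat variable grows when the kinetic energy exceeds its target. The sketch below does this for a set of one-dimensional particles; the function name, the one-dimensional setting (one degree of freedom per particle) and the numerical values are assumptions of this example.

```python
def nose_hoover_derivatives(positions, momenta, masses, zeta, force, q_mass, kT):
    """Time derivatives (r_dot, p_dot, zeta_dot) of the Nosé-Hoover equations of motion."""
    g = len(positions)  # degrees of freedom; one per particle in this 1D sketch
    forces = force(positions)
    r_dot = [p / m for p, m in zip(momenta, masses)]
    p_dot = [f - zeta * p for f, p in zip(forces, momenta)]
    # Twice the kinetic energy minus its canonical target value g*kT drives zeta
    zeta_dot = (sum(p * p / m for p, m in zip(momenta, masses)) - g * kT) / q_mass
    return r_dot, p_dot, zeta_dot

def harmonic_force(positions):
    """Independent harmonic wells with unit force constant (hypothetical test system)."""
    return [-x for x in positions]

# A system that is "too hot" (sum p^2/m = 8 > g*kT = 2) makes the friction grow
_, _, zeta_dot_hot = nose_hoover_derivatives(
    [0.0, 0.0], [2.0, 2.0], [1.0, 1.0], 0.0, harmonic_force, q_mass=10.0, kT=1.0)
# A system that is "too cold" makes the friction decrease and eventually turn negative
_, _, zeta_dot_cold = nose_hoover_derivatives(
    [0.0, 0.0], [0.5, 0.5], [1.0, 1.0], 0.0, harmonic_force, q_mass=10.0, kT=1.0)
print(zeta_dot_hot, zeta_dot_cold)
```

A larger thermostat mass makes the friction variable respond more slowly, which is the sense in which Q acts as an inertia for the heat flow.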

Besides the above-mentioned thermostats, the Langevin thermostat [39] is an alternative approach to control the system temperature, and it also produces correct canonical ensemble distributions. The Langevin equation of motion for a particle is given by

$$m_i \mathbf{a}_i(t) = \mathbf{F}_i(t) - \zeta_{\text{Lan}} \mathbf{v}_i(t) + \mathbf{R}(t), \tag{11}$$

where $\zeta_{\text{Lan}}$ is the friction constant. The motion of a particle is thus driven by the intrinsic force $\mathbf{F}_i(t)$, determined from the force field, the collision/damping force $\zeta_{\text{Lan}} \mathbf{v}_i(t)$ and the random, Gaussian-distributed force $\mathbf{R}(t)$. The random force is zero on average and its variance follows the fluctuation-dissipation relation,

$$\langle R(t) R(t') \rangle = 2 m_i \gamma k_B T\, \delta(t - t'). \tag{12}$$

In the fluctuation-dissipation relation, $\gamma$ is the collision frequency, equivalent to $\zeta_{\text{Lan}}/m_i$, and $\delta$ is the Dirac delta function. Under the action of the damping and random forces, the system is assured to evolve in the correct canonical ensemble.
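A hedged numerical illustration of eqs. (11) and (12): the sketch below integrates the Langevin equation for a one-dimensional harmonic oscillator with a simple first-order (Euler-type) scheme and checks that the mean squared velocity approaches k_BT/m. The integrator, the function name `langevin_mean_v2`, the reduced units and all parameter values are assumptions of this example; production MD codes use more accurate integrators.

```python
import math
import random

def langevin_mean_v2(force, mass, gamma, kT, dt, n_steps, seed=0):
    """Integrate eq. (11) in 1D and return the time average of v^2."""
    rng = random.Random(seed)
    # Discrete-time version of eq. (12): delta(t - t') becomes 1/dt, so the random
    # force drawn at each step has standard deviation sqrt(2 m gamma kT / dt).
    sigma = math.sqrt(2.0 * mass * gamma * kT / dt)
    x, v = 0.0, 0.0
    v2_sum = 0.0
    for _ in range(n_steps):
        random_force = rng.gauss(0.0, sigma)
        # Right-hand side of eq. (11), with zeta_Lan = mass * gamma
        total_force = force(x) - mass * gamma * v + random_force
        v += dt * total_force / mass
        x += dt * v
        v2_sum += v * v
    return v2_sum / n_steps

mean_v2 = langevin_mean_v2(force=lambda x: -x, mass=1.0, gamma=1.0, kT=1.0,
                           dt=0.01, n_steps=400_000)
print(f"<v^2> = {mean_v2:.3f}  (canonical value kT/m = 1.0)")
```

The balance between the damping force, which removes kinetic energy, and the random force, which injects it, is what fixes the temperature; changing the collision frequency alters how fast the target temperature is reached, not its value.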

2.1.1.5 Barostat

Since most biological macroscopic measurements are done under NPT (isothermal-isobaric) conditions, the MD simulations need to be conducted under the same conditions to make the experimental and simulation results comparable. This can be accomplished by applying a barostat together with a thermostat in an MD simulation. The Langevin piston method [40] is one of the most commonly used methods to this end. In this method, the motions of the particles and the volume of the simulation box are governed by the following equations of motion,

$$
\begin{cases}
\dot{\mathbf{r}}_i = \dfrac{\mathbf{p}_i}{m_i} + \dfrac{1}{3}\dfrac{\dot{V}}{V}\mathbf{r}_i \\[6pt]
\dot{\mathbf{p}}_i = \mathbf{F}_i - \dfrac{1}{3}\dfrac{\dot{V}}{V}\mathbf{p}_i \\[6pt]
\ddot{V} = \dfrac{1}{W}\left[P(t) - P_{\text{ext}}\right] - \gamma \dot{V} + R_{\text{piston}}(t) \\[6pt]
\langle R_{\text{piston}}(t)\, R_{\text{piston}}(t') \rangle = \dfrac{2\gamma k_B T\, \delta(t - t')}{W}
\end{cases} \tag{13}
$$

where $V$ is the volume, $P$ and $P_{\text{ext}}$ are the instantaneous and target pressures, and $R_{\text{piston}}$ and $W$ are the random force acting on the piston and the piston mass, respectively. The other notations are similar to those in the Langevin thermostat. By applying the Langevin equation of motion, the movement of the piston and its fluctuation can be controlled effectively. The obvious merit of this method is that it avoids the oversampled motion of the piston compared with the weak coupling method. In fact, the equation of the piston motion given in eq. 13 can be modified such that it is described by other methods, such as the Nosé-Hoover method for constant pressure [41], while its fluctuation is still controlled by the fluctuation-dissipation relation. Such a combination leads to the Nosé-Hoover Langevin piston method, which is used in the NAMD software.

2.1.1.6 Accelerated MD (aMD)

Sometimes, we are interested in molecular behaviors on long time scales, e.g., significant conformational changes of a protein. Such phenomena may not be easily sampled by classical MD because in most cases they take place on the µs to ms time scale. To tackle the sampling issue, many enhanced sampling methods have been developed [42]. Among them, accelerated MD (aMD) [43] has attracted increasing attention because of its ease of use, continual development and generality. aMD enhances sampling by altering the potential energy surface. When the potential energy is lower than a certain value (i.e., the reference potential energy), the MD simulation runs on an enhanced potential energy surface; otherwise, it evolves on the original potential energy surface. The idea is expressed mathematically as

$$
U^*(\mathbf{R}) =
\begin{cases}
U(\mathbf{R}), & U(\mathbf{R}) \ge E \\
U(\mathbf{R}) + \Delta U(\mathbf{R}), & U(\mathbf{R}) < E
\end{cases} \tag{14}
$$

where $U^*(\mathbf{R})$, $U(\mathbf{R})$ and $\Delta U(\mathbf{R})$ are the enhanced, original and boost potential energies, respectively. The reference energy $E$ is a specific energy value determined by the user. The boost potential function $\Delta U(\mathbf{R})$ is given by

$$\Delta U(\mathbf{R}) = \frac{\left[E - U(\mathbf{R})\right]^2}{\alpha + \left[E - U(\mathbf{R})\right]}, \tag{15}$$

where the parameter $\alpha$ is a tuning factor used to control the depth of the modified potential basin. The original aMD method performs the enhanced sampling by altering the dihedral terms and 1-4 non-bonded interactions. The later version of aMD also includes the total potential energy in the boost term to further accelerate conformational transitions of biomolecules [44].
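The effect of the two aMD parameters can be seen by evaluating eqs. (14) and (15) directly. The following is a minimal sketch; the function names and the numerical values (in arbitrary energy units) are assumptions made for illustration.

```python
def amd_boost(potential, reference_energy, alpha):
    """Boost energy of eq. (15); zero when the potential is at or above the reference energy."""
    if potential >= reference_energy:
        return 0.0
    gap = reference_energy - potential
    return gap * gap / (alpha + gap)

def amd_modified_potential(potential, reference_energy, alpha):
    """Modified potential of eq. (14)."""
    return potential + amd_boost(potential, reference_energy, alpha)

# A basin 4 units below the reference energy E = 10, for several values of alpha
for alpha in (0.0, 4.0, 40.0):
    print(alpha, amd_modified_potential(6.0, 10.0, alpha))
```

With alpha = 0 the basin is filled completely up to the reference energy (a flat surface), whereas a large alpha leaves the basin almost unchanged, which is why alpha controls the depth of the modified basin.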

Although aMD is a useful method to achieve enhanced sampling of the systems studied in the present dissertation, fine-tuning of its parameters ($E$ and $\alpha$) is not straightforward. Excessive sampling enhancement may cause unreasonable distortions of protein secondary structures, whilst insufficient enhancement does not help to observe significant conformational changes of the studied proteins. Therefore, considerable trial and error is unavoidable to determine optimal parameters specific to each protein system. Reweighting back to the canonical ensemble is another issue. Because the boost potential energy is relatively large for a protein system and changes over the course of the simulation, it produces very noisy reweighting results. Therefore, in my studies, aMD was only exploited to observe protein conformational changes phenomenologically, not to obtain theoretically correct thermodynamic properties such as conformational free energy changes.

2.1.2 Alchemical free energy calculations

Alchemical calculation [45] is a common strategy to determine a reliable free energy difference between two states described by different Hamiltonians. It has been widely used to calculate free energy differences for ligand binding, residue mutation and post-translational modification [46-52]. Because it is very hard to achieve sufficient configurational overlap between simulations of the two physical end-points, alchemical calculation employs a hybrid Hamiltonian to bridge the gap with multiple unphysical intermediate states. The hybrid Hamiltonian is given by

$$\mathcal{H}_\lambda = (1 - \lambda)\mathcal{H}_0 + \lambda \mathcal{H}_1, \tag{16}$$

where $\mathcal{H}_\lambda$, $\mathcal{H}_0$ and $\mathcal{H}_1$ are the hybrid Hamiltonian and the Hamiltonians of the physical end-points 0 and 1, respectively. The coupling parameter $\lambda$, which ranges between 0 and 1, constructs the hybrid Hamiltonian by linearly interpolating between the two end-point Hamiltonians. By simulating many hybrid states (i.e., the unphysical intermediate states), the two physical end-points are smoothly connected, so that a reliable free energy difference can be obtained with an alchemical free energy calculation method.
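The role of the intermediate λ states can be illustrated on a toy problem for which everything is known analytically: two one-dimensional harmonic oscillators with force constants k0 and k1. For this case the hybrid Hamiltonian of eq. (16) is again harmonic, the canonical average of the end-point energy difference in each hybrid state is known exactly, and integrating that average over λ (the thermodynamic integration estimator formalized in the next subsection) should recover the exact result (k_BT/2) ln(k1/k0). The function names, the toy system and the choice of a trapezoidal rule are assumptions of this sketch, not the protocol used in the thesis.

```python
import math

def hybrid_force_constant(lam, k0, k1):
    """Eq. (16) for H_0 = k0*x^2/2 and H_1 = k1*x^2/2: the hybrid state is harmonic."""
    return (1.0 - lam) * k0 + lam * k1

def mean_energy_gap(lam, k0, k1, kT):
    """Exact canonical average <H_1 - H_0> in hybrid state lam, using <x^2> = kT / k_lam."""
    return 0.5 * (k1 - k0) * kT / hybrid_force_constant(lam, k0, k1)

def free_energy_difference(k0, k1, kT, n_windows):
    """Trapezoidal integration of <H_1 - H_0> over equally spaced lambda windows."""
    h = 1.0 / (n_windows - 1)
    gaps = [mean_energy_gap(i * h, k0, k1, kT) for i in range(n_windows)]
    return h * (0.5 * gaps[0] + sum(gaps[1:-1]) + 0.5 * gaps[-1])

k0, k1, kT = 1.0, 4.0, 1.0
exact = 0.5 * kT * math.log(k1 / k0)
for n_windows in (3, 11, 101):
    estimate = free_energy_difference(k0, k1, kT, n_windows)
    print(f"{n_windows:4d} windows: dF = {estimate:.5f}   (exact {exact:.5f})")
```

In a real alchemical calculation each average comes from a separate MD simulation and carries statistical noise, so the number and placement of the λ windows trade accuracy against computational cost.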

2.1.2.1 Alchemical free energy calculation theory

Thermodynamic integration (TI) and free energy perturbation (FEP, also called the exponential formula) [53] are the two most commonly used approaches for determining the free energy difference from a series of alchemical λ simulations. In the TI method, the free energy difference $\Delta F$ between two physical end-points is calculated by

$$\Delta F = \int_0^1 F'(\lambda)\,\mathrm{d}\lambda = \int_0^1 \left\langle \partial H/\partial\lambda \right\rangle_\lambda \mathrm{d}\lambda , \qquad (17)$$

where $F'(\lambda)$ is the derivative of the free energy with respect to λ, and $\langle \partial H/\partial\lambda \rangle_\lambda$ is the ensemble average of $\partial H/\partial\lambda$ at a specific λ point. Using eq. (16), $\partial H/\partial\lambda$ is straightforward to evaluate and the free energy difference can be formulated as

$$\Delta F = \int_0^1 \langle H_1 - H_0 \rangle_\lambda\,\mathrm{d}\lambda \equiv \int_0^1 \langle U \rangle_\lambda\,\mathrm{d}\lambda , \qquad (18)$$

where $H_0$ and $H_1$ denote the two end-point Hamiltonians and $U$ denotes the Hamiltonian difference ($H_1 - H_0$). The TI formalism can also be approximated by a numerical integration given by

$$\Delta F(0 \to 1) \approx \sum_{i=1}^{N} \sum_{j=1}^{M_i} w_i^{(j)} F^{(j)}(\lambda_i) , \qquad (19)$$

where $w_i^{(j)}$ is the weight factor of the jth-order derivative of F at $\lambda_i$, denoted by $F^{(j)}(\lambda_i)$. In Zwanzig’s relationship [53], $\Delta F$ is given by an exponential formula (i.e., free energy perturbation, FEP):

$$\Delta F = -\beta^{-1} \ln \left\langle e^{-\beta (H_{\lambda_1} - H_{\lambda_0})} \right\rangle_{\lambda_0} = -\beta^{-1} \ln \left\langle e^{-\beta V} \right\rangle_{\lambda_0} , \qquad (20)$$

where $\lambda_1$ and $\lambda_0$ are specific values of the coupling parameter, which represent two arbitrary end-points, and $V$ abbreviates the energy difference $H_{\lambda_1} - H_{\lambda_0}$. Through a Taylor series expansion of $F(\lambda)$ at $\lambda_0$, $\Delta F$ can be approximated by

$$\Delta F(\lambda_0 \to \lambda) = \sum_{n=1}^{\infty} \frac{(\lambda - \lambda_0)^n}{n!} F^{(n)}(\lambda_0) \qquad (21a)$$

$$= (\lambda - \lambda_0) \langle U \rangle_{\lambda_0} - \frac{\beta (\lambda - \lambda_0)^2}{2} \left\langle \left( U - \langle U \rangle_{\lambda_0} \right)^2 \right\rangle_{\lambda_0} + \cdots \qquad (21b)$$

where $F^{(n)}$ is the nth derivative of the free energy $F(\lambda)$ and $U = H_1 - H_0$ is the Hamiltonian difference of eq. (18). Interestingly, eq. (19) and eq. (21) relate the TI and FEP formulas. Their relationship indicates that $\Delta F$ can be approximated to a high order of the FEP expansion by applying appropriate integration methods, without explicitly calculating the high-order derivatives. To this end, eq. (19) can be transformed into different expressions by applying different integration protocols. For instance, with N=2 and j=1 in eq. (19), Hummer and Szabo [54] derived

$$\Delta F(0 \to 1) \approx \frac{1}{2} \left( \langle U \rangle_{\lambda_1} + \langle U \rangle_{\lambda_2} \right) , \qquad (22)$$

where $\lambda_1 = 1/2 - 1/\sqrt{12}$ and $\lambda_2 = 1/2 + 1/\sqrt{12}$. By optimizing the infinite-order free energy perturbation (OIOFEP) theory, they derived a three-point perturbation,

$$\Delta F(0 \to 1) = \Delta F\left(0 \to \tfrac{1}{4}\right) - \Delta F\left(\tfrac{1}{2} \to \tfrac{1}{4}\right) + \Delta F\left(\tfrac{1}{2} \to \tfrac{3}{4}\right) - \Delta F\left(1 \to \tfrac{3}{4}\right) . \qquad (23)$$

Eq. (22) and eq. (23) provide efficient estimates of $\Delta F$ because they need only 2 and 3 λ simulations, respectively. In fact, eq. (23) includes the features of the forward, backward, and double-wide (DW) perturbation sampling of the FEP method.
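As a minimal numerical illustration of the two-point formula in eq. (22), the following Python sketch uses a one-dimensional harmonic toy model, for which both the ensemble average ⟨U⟩ at any λ and the exact free energy difference are analytic. The model, the force constants and the function names are assumptions made for illustration only; they are not part of the protocols used in this work.

```python
import math

BETA = 1.0          # 1/(k_B T) in reduced units (assumed)
K0, K1 = 1.0, 2.0   # hypothetical force constants of end-points 0 and 1


def mean_u(lam):
    """<U>_lambda with U = H1 - H0 = 0.5*(K1 - K0)*x^2.

    The hybrid state of eq. (16) is harmonic with force constant
    k(lam) = (1 - lam)*K0 + lam*K1, so <x^2> = 1/(BETA*k(lam)).
    """
    k_lam = (1.0 - lam) * K0 + lam * K1
    return 0.5 * (K1 - K0) / (BETA * k_lam)


def ti_two_point():
    """Two-point estimate of eq. (22)."""
    lam1 = 0.5 - 1.0 / math.sqrt(12.0)
    lam2 = 0.5 + 1.0 / math.sqrt(12.0)
    return 0.5 * (mean_u(lam1) + mean_u(lam2))


def ti_trapezoid(n_windows):
    """Trapezoidal integration of eq. (18) over equally spaced lambda windows."""
    step = 1.0 / (n_windows - 1)
    values = [mean_u(i * step) for i in range(n_windows)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


# Exact result for two harmonic oscillators: (1 / (2*BETA)) * ln(K1 / K0).
exact = 0.5 / BETA * math.log(K1 / K0)
two_point = ti_two_point()
three_window = ti_trapezoid(3)
```

In this toy case the two-point estimate lies closer to the exact value than a three-window trapezoidal integration, which is consistent with the efficiency argument above; for a real molecular system the averages would of course come from sampled configurations and carry statistical noise.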

If configurations have been sampled at two different λ values, $\Delta F$ between the two λ states can also be estimated by the Bennett acceptance ratio (BAR) method [55]. The method is formulated as

$$\Delta F(0 \to 1) = \beta^{-1} \ln \frac{\langle f(H_0 - H_1 + C) \rangle_1}{\langle f(H_1 - H_0 - C) \rangle_0} + C , \qquad (24)$$

where $f(x)$ is the Fermi function, i.e., $f(x) = 1/[1 + \exp(\beta x)]$, and $C = \beta^{-1} \ln[Q_0 N_1/(Q_1 N_0)]$. Here, Q and N indicate the partition function and the number of sampled configurations, respectively. Starting from an initial guess of C, the BAR method iteratively optimizes C until the first term in eq. (24) falls below a certain small threshold. Under this condition, the variance of the estimated free energy value is minimal.
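The self-consistent iteration described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the harmonic toy model, the sample size, the convergence threshold and all function names are hypothetical, and the update rule simply re-evaluates C from the current free energy estimate, which is one common way to iterate eq. (24).

```python
import numpy as np

BETA = 1.0            # 1/(k_B T) in reduced units (assumed)
K0, K1 = 1.0, 2.0     # hypothetical force constants of states 0 and 1


def fermi(x):
    """Fermi function f(x) = 1 / (1 + exp(beta * x))."""
    return 1.0 / (1.0 + np.exp(BETA * x))


def bar_estimate(du_0, du_1, tol=1e-10, max_iter=1000):
    """Iterate eq. (24); du_k holds H1 - H0 for configurations sampled in state k."""
    n0, n1 = len(du_0), len(du_1)
    c = 0.0
    delta_f = 0.0
    for _ in range(max_iter):
        numerator = np.mean(fermi(-du_1 + c))    # <f(H0 - H1 + C)>_1
        denominator = np.mean(fermi(du_0 - c))   # <f(H1 - H0 - C)>_0
        delta_f = np.log(numerator / denominator) / BETA + c
        c_new = delta_f + np.log(n1 / n0) / BETA  # C = dF + ln(N1/N0)/beta
        if abs(c_new - c) < tol:
            break
        c = c_new
    return delta_f


# Toy data: exact Boltzmann samples of two 1D harmonic oscillators.
rng = np.random.default_rng(0)
n_samples = 20000
x0 = rng.normal(0.0, 1.0 / np.sqrt(BETA * K0), n_samples)
x1 = rng.normal(0.0, 1.0 / np.sqrt(BETA * K1), n_samples)
du_0 = 0.5 * (K1 - K0) * x0 ** 2
du_1 = 0.5 * (K1 - K0) * x1 ** 2

estimate = bar_estimate(du_0, du_1)
exact = 0.5 / BETA * np.log(K1 / K0)
```

With equal sample sizes the converged C coincides with the free energy estimate, so the logarithmic term of eq. (24) vanishes at convergence, in line with the stopping criterion described above.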

2.1.2.2 End-point catastrophe and use of soft-core potentials

When an alchemical transformation involves atom annihilation or creation, $\langle \partial H/\partial\lambda \rangle$ may experience a huge jump and large variation at certain λ points. This phenomenon is called the “integrand catastrophe” or “end-point catastrophe” [56], and such a jump may result in an unreliable free energy estimate. The problem is mainly caused by steric clashes between the transformed atoms and their surroundings (solute and solvent atoms). This issue can be avoided by softening the repulsive vdW interactions at short distances, for example by applying soft-core potentials [57-58].

If vdW and electrostatic terms are transformed simultaneously in the absence of soft-core potentials, the transformed system may suffer from unstable dynamics due to huge charge-charge interactions. This is because, in certain λ simulations, the shrinking hard-core vdW radius no longer sufficiently shields the residual partial charges, which crashes the simulations or results in unreasonable estimates of $\langle \partial H/\partial\lambda \rangle$. On the other hand, even under the protection of soft-core potentials, simultaneously transforming vdW and electrostatic terms is still problematic. The insufficient separation may still bring about unreasonable attractive electrostatic interactions, resulting in a rugged transformation integrand.

The original version of soft-core potentials (classical soft-core, CSC) [57] is formulated as a linear scaling formula according to

$$H_{\mathrm{LJ,CSC}}(\lambda) = (1-\lambda)\,\varepsilon_{0,ij}^{\min} \left[ \frac{(R_{0,ij}^{\min})^{12}}{(r_{ij}^2 + \delta\lambda)^6} - 2\,\frac{(R_{0,ij}^{\min})^{6}}{(r_{ij}^2 + \delta\lambda)^3} \right] + \lambda\,\varepsilon_{1,ij}^{\min} \left[ \frac{(R_{1,ij}^{\min})^{12}}{[r_{ij}^2 + \delta(1-\lambda)]^6} - 2\,\frac{(R_{1,ij}^{\min})^{6}}{[r_{ij}^2 + \delta(1-\lambda)]^3} \right] , \qquad (25)$$

where 0 and 1 represent the initial and final states of the transformed system, respectively. $\varepsilon_{ij}^{\min}$ and $R_{ij}^{\min}$ denote the well depth of the L-J potential and the distance at which the potential minimum is located, respectively. δ is a shift radius of the soft-core potential. In addition, the soft-core potential can in principle be applied to partial charge transformations and is formulated as

$$H_{\mathrm{Coulomb}}(\lambda) = (1-\lambda)\,\frac{q_{0,i}\,q_{0,j}}{4\pi\varepsilon_0\varepsilon\,(r_{ij}^2 + \alpha\lambda)^{1/2}} + \lambda\,\frac{q_{1,i}\,q_{1,j}}{4\pi\varepsilon_0\varepsilon\,[r_{ij}^2 + \alpha(1-\lambda)]^{1/2}} , \qquad (26)$$

where $q_i$ and $q_j$ are the partial charges of an interacting atomic pair, α is again a shift parameter, and $\varepsilon_0$ and $\varepsilon$ are the vacuum permittivity and the relative dielectric constant, respectively. With the electrostatic soft-core potential, although the electrostatic collapse problem can be greatly alleviated, the rugged transformation may still exist.
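To make the effect of the shift radius concrete, the following Python sketch evaluates the state-0 branch of eq. (25) for an annihilation, assuming that state 1 is non-interacting so that the second bracket vanishes. The parameter values (well depth, minimum distance and δ) and the function names are hypothetical and chosen only for illustration.

```python
import math


def lj_plain(r, eps, rmin):
    """Unmodified L-J energy in the (eps, Rmin) form: eps*[(Rmin/r)^12 - 2*(Rmin/r)^6]."""
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 ** 2 - 2.0 * ratio6)


def lj_softcore_state0(r, lam, eps0, rmin0, delta):
    """State-0 branch of eq. (25): r^2 is replaced by r^2 + delta*lambda."""
    shifted = r * r + delta * lam
    return (1.0 - lam) * eps0 * (rmin0 ** 12 / shifted ** 6
                                 - 2.0 * rmin0 ** 6 / shifted ** 3)


# Hypothetical parameters: eps in kcal/mol, Rmin in Angstrom, delta in Angstrom^2.
EPS0, RMIN0, DELTA = 0.1, 3.5, 5.0

at_minimum = lj_softcore_state0(RMIN0, 0.0, EPS0, RMIN0, DELTA)  # plain L-J at lambda = 0
overlap = lj_softcore_state0(0.0, 0.5, EPS0, RMIN0, DELTA)       # finite even at r = 0
decoupled = lj_softcore_state0(2.0, 1.0, EPS0, RMIN0, DELTA)     # vanishes at lambda = 1
```

At λ = 0 the expression reduces to the plain L-J potential, while for λ > 0 the energy stays finite even for fully overlapping atoms, which is what suppresses the end-point catastrophe.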

CSC was designed by directly modifying the short-separation regions of the original potentials, shifting large repulsive and attractive potentials to small values (eq. 25 and 26). This direct interference with the original potentials makes extrapolation to other, non-simulated λ values difficult, because the extrapolation requires extra energy evaluations. Therefore, TI is the only straightforward method to estimate the alchemical free energy difference with CSC. This means that other, potentially more efficient free energy calculation methods, e.g., perturbation theories and BAR, cannot easily be applied to the collected data. In addition, in the protocol of CSC with TI, the λ simulations have to be densely spaced in the highly curved λ regions to obtain an accurate estimate.

To overcome the inefficiency of the CSC potentials, we designed a new class of soft-core potentials, named Gaussian soft-core (GSC) potentials. They have two main advantages over CSC. The most important one is that GSC potentials allow $\langle \partial H/\partial\lambda \rangle$ to be easily extrapolated to non-simulated λ values from data simulated at certain λ values. This feature frees us to use other free energy calculation methods without extra computational cost. Second, the new GSC potentials produce stable dynamics and potential derivatives, i.e., they generate a smooth transition free from additional minima at any λ. The GSC potentials are formulated as

$$\begin{cases} H_0 = H_0^{\mathrm{orig}} + \sum_{i,j} \alpha_0 \exp\left(-\beta_0 r_{ij}^{X}\right) \\ H_1 = H_1^{\mathrm{orig}} + \sum_{i,j} \alpha_1 \exp\left(-\beta_1 r_{ij}^{X}\right) \end{cases} , \qquad (27)$$

where $H_0^{\mathrm{orig}}$ and $H_1^{\mathrm{orig}}$ refer to the original Hamiltonians given in eq. (16) for the physical states 0 and 1, α and β are the pre-factor and exponential factor of the exponential term (both always ≥ 0), and X determines the power of the exponential term, being either 2 or 4 in the present development. When an annihilation transformation is performed, the final state 1 does not interact with the surroundings, and thus the corresponding terms are zero. We use a two-step strategy to annihilate the interactions of a target molecule in aqueous solution. First, the vdW and electrostatic terms of the molecule are annihilated by scaling λ from 1 down to 0, while the GSC potentials are scaled up to push away water molecules and intramolecular atoms. In this way, the annihilation avoids the integrand catastrophe. Because the GSC does not alter the original potential functions, the hybrid Hamiltonian $H_\lambda$ can be extrapolated to any other non-simulated λ value. In the second step, the GSC potentials introduced in the first step are eliminated gradually by scaling them down. The GSC decoupling process follows

$$H_{\mathrm{GSC}}(\lambda) = (1-\lambda) \sum_{i,j} \alpha_0 \exp\left(-\beta_0 r_{ij}^{X}\right) . \qquad (28)$$

Besides alchemical annihilation, GSC potentials are also valid for 0-to-1 mutation simulations. The basic ideas are presented in Figure 1.


Figure 1. Alchemical simulation strategies with different soft-core potentials. (a) Conventional soft-core (CSC) protocol for an annihilation process. (b) Gaussian soft-core (GSC) protocol for an annihilation process. (c) CSC protocol for a mutation process. (d) GSC protocol for a mutation process.

2.1.3 Characterization of protein dynamics and motions

Based on the atomic coordinates saved during MD simulations, protein dynamics and motions can be analyzed at atomistic resolution. Commonly applied analyses include the structural root-mean-square deviation (RMSD), root-mean-square fluctuation of atomic positions (RMSF), quasi-harmonic analysis [59], motion projection [60], cross-correlation matrix, contact map and dynamical network analysis [61]. The choice of analysis method depends on the question at hand. In this section, I focus on quasi-harmonic analysis and its derivatives.

Quasi-harmonic analysis, also called principal component analysis (PCA), is an effective method to transform atomic motions in Cartesian space into motions in frequency space [59]. As a first step, a covariance matrix $\mathbf{C}$ is constructed by averaging the displacement correlations between atomic positions saved during an MD simulation; thus it is also commonly called the displacement (or fluctuation) matrix. The covariance matrix is formulated as

$$\mathbf{C} = \langle \mathbf{x}\mathbf{x}^{T} \rangle , \qquad (29)$$

where $\mathbf{x} = \mathbf{r} - \langle \mathbf{r} \rangle$ denotes the atomic displacement vector in the 3n-dimensional configuration space, n is the number of atoms, $\mathbf{r}$ is the atomic coordinate vector, and ⟨…⟩ denotes the average over sampled coordinates. $\mathbf{x}^{T}$ denotes the transpose of $\mathbf{x}$. Therefore $\mathbf{C}$ is a 3n × 3n matrix.

In the quasi-harmonic model of atomic fluctuation, the equations of motion can be approximated as

$$\left[ \mathbf{C} - (k_B T/\omega^2)\,\mathbf{M}^{-1} \right] \mathbf{v} = 0 , \qquad (30)$$

where $k_B$ is the Boltzmann constant, T is the absolute temperature, ω is the quasi-harmonic frequency and $\mathbf{M}$ is the atomic mass matrix. By matrix algebra, the corresponding secular equation can be derived. Then, the mass-weighted matrix $\mathbf{C}$ is diagonalized to obtain the eigenvalues λ (given by $k_B T/\omega^2$) and the associated eigenvectors $\mathbf{v}$. λ measures the contribution of a specific mode or a set of collective modes to the overall atomic displacement, i.e.,

$$\Omega = \sum_{i=1}^{m} \lambda_i \Big/ \sum_{j=1}^{3n} \lambda_j , \qquad (31)$$

where $\sum_{i=1}^{m} \lambda_i$ denotes the eigenvalue summation over m modes, and $\sum_{j=1}^{3n} \lambda_j$ denotes the summation over all 3n modes. By inspecting the individual eigenvectors $\mathbf{v}_i$, the functionally relevant modes can be identified, especially among the low-frequency modes.
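The construction and diagonalization of the mass-weighted covariance matrix, together with the cumulative contribution of eq. (31), can be sketched with NumPy as follows. This is a minimal sketch that assumes the trajectory frames have already been superposed onto a reference structure; the array shapes, the random toy trajectory and the function names are hypothetical.

```python
import numpy as np


def quasi_harmonic_modes(traj, masses):
    """Diagonalize the mass-weighted covariance matrix of a superposed trajectory.

    traj   : array of shape (n_frames, n_atoms, 3)
    masses : array of shape (n_atoms,)
    Returns eigenvalues sorted from largest to smallest (i.e., from the lowest
    to the highest quasi-harmonic frequency) and the matching eigenvectors
    as columns.
    """
    n_frames, n_atoms, _ = traj.shape
    coords = traj.reshape(n_frames, 3 * n_atoms)
    disp = coords - coords.mean(axis=0)             # x = r - <r>
    disp = disp * np.sqrt(np.repeat(masses, 3))     # mass weighting
    cov = disp.T @ disp / n_frames                  # <x x^T>, eq. (29)
    evals, evecs = np.linalg.eigh(cov)              # symmetric eigenproblem
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]


def cumulative_contribution(evals, m):
    """Omega of eq. (31): share of the total fluctuation carried by the first m modes."""
    return evals[:m].sum() / evals.sum()


# Hypothetical toy trajectory: 200 frames of 5 atoms.
rng = np.random.default_rng(0)
traj = rng.normal(size=(200, 5, 3))
masses = np.array([12.0, 14.0, 16.0, 12.0, 1.0])
evals, evecs = quasi_harmonic_modes(traj, masses)
omega_first_three = cumulative_contribution(evals, 3)
```

Because the eigenvalue is inversely proportional to the squared frequency, sorting by decreasing eigenvalue places the low-frequency, large-amplitude modes first.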

If there is a clear-cut motion to be considered, motion projection analysis [60] can be useful. For instance, if there are two functionally interesting protein conformations (A and B), a target displacement vector $\mathbf{d}$ can be defined as the coordinate difference between them. Since the eigenvectors $\mathbf{v}_i$ are constructed in the same 3n-dimensional configuration space, their projection onto the target vector $\mathbf{d}$, also called superposition (or overlap), can be formulated as

$$p_i = \frac{\mathbf{d} \cdot \mathbf{v}_i}{|\mathbf{d}|} , \qquad (32)$$

where $\mathbf{v}_i$ denotes the ith eigenvector obtained from the quasi-harmonic analysis. The projection fulfills the relation

$$\sum_{i=1}^{3n} p_i^2 = 1 , \qquad (33)$$

where $p_i^2$ can be interpreted as the contribution of the mode $\mathbf{v}_i$ to the target motion $\mathbf{d}$. When $p_i^2$ is summed cumulatively from the lowest- to the highest-frequency mode, the cumulative contribution to the target motion is obtained. Here, we define the target motion as $\mathbf{d}_{A \to B}$ (i.e., $\mathbf{r}_B - \mathbf{r}_A$), where $\mathbf{r}_A$ and $\mathbf{r}_B$ are the coordinate vectors of conformations A and B, respectively.
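The overlap of eq. (32) and the normalization of eq. (33) can be checked numerically with a short NumPy sketch. The random orthonormal basis below merely stands in for a complete set of quasi-harmonic eigenvectors, and the two random vectors stand in for conformations A and B; all names and dimensions are hypothetical.

```python
import numpy as np


def mode_overlaps(target, modes):
    """p_i = (d . v_i) / |d| for orthonormal modes stored as columns, eq. (32)."""
    return modes.T @ target / np.linalg.norm(target)


rng = np.random.default_rng(1)
dim = 12                                                 # 3n for a toy system with n = 4

# A complete orthonormal basis standing in for the eigenvectors v_i.
modes, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

conf_a = rng.normal(size=dim)                            # stand-in for r_A
conf_b = rng.normal(size=dim)                            # stand-in for r_B
target = conf_b - conf_a                                 # d_{A->B} = r_B - r_A

overlaps = mode_overlaps(target, modes)
cumulative = np.cumsum(overlaps ** 2)                    # running sum of p_i^2
```

The last element of the running sum equals 1 only when the full set of 3n orthonormal modes is used; with a truncated set of modes the sum stays below 1.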

2.2 Background of Protein Kinases

2.2.1 Insulin/insulin-like growth factor 1 receptor signaling pathways

Insulin receptor (IR) and insulin-like growth factor 1 receptor (IGF-1R) signaling pathways play essential roles in the regulation of cell proliferation, differentiation, motility, transformation and glucose homeostasis. Improper activation of these pathways has been associated with many diseases, e.g., diabetes, Alzheimer’s disease and cancers. Because IRK and IGF-1RK are highly homologous in sequence and structure, they can form various hybrid dimer receptors on the cell membrane and together give rise to complex cell regulation [62].

IR and IGF-1R share a similar activation mechanism. The activation of the signaling begins with the binding of a hormone molecule, e.g., insulin, IGF-1 or IGF-2, to the ectodomain of the receptor. The hormone binding induces dimerization of two monomers, which then leads to activation of the kinase domain inside the cell. The full catalytic activity of the kinase domain is achieved primarily by trans-autophosphorylation of three tyrosine residues in its activation loop (A-loop). Once full catalytic activity is obtained, the kinase domain begins phosphorylating downstream proteins in the signal transduction pathways. The sequential events of the autophosphorylation process are illustrated in Figure 2a.

X-ray crystallography has established that IGF-1RK (and IRK) exists in two main conformations, one “inactive” and the other “active”. Here, I use IGF-1RK as an example to illustrate the structural features. Without A-loop phosphorylation, the kinase often adopts the inactive conformation [63]. This conformation is characterized by the inhibitory orientation of the A-loop, which sterically blocks the binding pocket of adenosine triphosphate (ATP) and occludes the docking site of downstream substrate proteins (Figure 2b). In addition, the conserved Asp-Phe-Gly (DFG) motif flips out of the active site, and the αC-helix rotates outward, away from the ATP binding site. Due to the obstructed substrate and ATP binding sites, this conformation is also called the auto-inhibited conformation. In the active conformation [12], which is often found with the fully phosphorylated A-loop, the A-loop adopts an extended conformation, thus allowing the binding of ATP and the substrate (Figure 2c). In addition, the DFG motif flips in and the αC-helix takes an inward orientation, thereby constructing a catalytically competent active site of the kinase.

Figure 2. Autophosphorylation mechanism and IGF-1RK structures: (a) autophosphorylation mechanism, (b) inactive and (c) active conformations. Tyrosines and phosphorylated tyrosines (pTyrs) on the A-loop are shown with the ball and stick model. The A-loop, αC-helix and substrate peptide are colored differently.

Despite the importance of protein dynamics for kinase functions, e.g., catalysis, substrate binding and product release, how dynamics affects these functions at the atomic level remains poorly characterized. For instance, the catalytic activity of the kinases is strictly regulated by A-loop phosphorylation, yet how A-loop phosphorylation affects the functions by altering the protein dynamics is not fully understood. Structural and NMR studies have demonstrated certain dynamic responses of the kinase structural elements to A-loop phosphorylation. Yet, the available experimental evidence does not provide a molecular-level understanding of the roles of A-loop phosphorylation from the perspective of kinase structure, dynamics and thermodynamics.

2.2.2 Protein kinase allostery and phosphorylation

Allostery is a ubiquitous mechanism of functional regulation among various biomolecules. In general, allosteric regulation occurs by the binding of an effector molecule to the target molecule (protein, DNA or RNA) at a site other than the natural ligand binding site (allosteric site), which is typically remote from the active site (orthosteric site). The effector binding induces changes in the structure or dynamics of the target protein, which then propagate to the active site and yield enhanced or diminished activity of the enzyme. The initial studies on enzyme allostery focused on allosteric ligand binding and its functional influences. In fact, allosteric effectors may include diverse stimuli, such as ligand binding, protein-protein interactions, residue mutations, post-translational modifications and even pH and temperature changes. Based on their effects, allosteric modulations can be classified as positive or negative, i.e., the former enhance activity and the latter reduce it. Furthermore, given a target macroscopic quantity such as binding affinity, the allosteric effect caused by the effector can be quantified by a “bi-stable” model [64].

Switching between phosphorylation and de-phosphorylation is an important mechanism for regulating protein behavior. It can affect protein function in many aspects, including the protein’s activity, conformational stability, dynamics and ability to signal downstream. In the protein tyrosine kinases, because the phosphorylation sites are usually distant from the orthosteric site (ATP binding pocket), phosphorylation can be viewed as an allosteric effector. Although the effects of A-loop phosphorylation have been widely studied in past decades, their structural and dynamic origins are still poorly established. As demonstrated by my dissertation projects, A-loop phosphorylation regulates the kinase in all the steps of the catalytic cycle, including conformational transition, substrate binding, phosphoryl transfer and product release.

2.3 Methods

2.3.1 Preparation of kinase systems


Here, I describe the preparation of the simulation systems based on the IGF-1RK protein; the IRK systems were prepared following a similar procedure. Several different simulation systems were prepared on the basis of the functional and conformational states of IGF-1RK (Figure 3). The systems (residues 958 to 1255 of IGF-1RK) were prepared based on either the inactive structure 1P4O [63] or the active structure 1K3A [12], depending on the simulated conformational state. For the inactive conformation systems, monomer chain A of the 1P4O structure, including its crystal water molecules, was extracted. Any residues in the structure that differed from the wild-type sequence were restored to their wild-type counterparts. Protonation states of all residues were adjusted according to their estimated pKa values and the surrounding hydrogen bonding networks in the protein at pH 7. In the phosphorylated system, the three A-loop tyrosines, i.e., Tyr1131, Tyr1135 and Tyr1136, were modified to their phosphorylated forms. All missing atoms were generated according to the corresponding IC tables and the CHARMM27 force field parameters [30-32]. Each system was solvated in a rhombic dodecahedron water box of approximately 80 Å. To neutralize the system and reach an ionic concentration of 150 mM, Na+ and Cl- ions were randomly placed by a Monte Carlo sampling procedure.

All active conformation systems were prepared based on the 1K3A structure. The structure 3NW5 [65] was also used to model the missing insert part of 1K3A. In the reactant state systems, the ATP analogue (AMP-PCP) was replaced with ATP, and the substrate peptide in the original structure was used without any modification. For the product systems, the ATP and substrate tyrosine were modified to ADP and phospho-tyrosine (pTyr), respectively. In both systems, two Mg2+ ions and the coordinated waters were modeled based on the IRK structure 1IR3 [66]. To build their unphosphorylated counterparts, the three phosphorylated A-loop tyrosines were modified to tyrosines. The remaining preparation steps followed those of the inactive conformation systems.


Figure 3. Schematic presentation of the built models for mimicking the entire catalytic cycle of the kinase. The orange curve represents the kinase A-loop. The circled ‘P’ symbol represents a phosphoryl group in ATP, ADP and phospho-Tyr of the substrate peptide (green) and A-loop (orange). In ATP and ADP, the ‘star’ symbol represents the adenine nucleoside.

2.3.2 MD simulation protocols

Classical MD (cMD) A series of constrained and restrained energy minimizations was first performed to remove severe atomic clashes in the built models. The minimizations were conducted on missing and mutated residues, bulk water molecules and ions sequentially, while the other parts of the system were constrained or restrained. Then, the whole system was relaxed with gradual release of the restraints acting on protein heavy atoms, followed by full minimization of the entire system without any restraint. This minimization procedure was accomplished in ~10,000 steps, with slight modifications between different systems. For minimization of the holo systems, I used a mass-weighted harmonic potential (~5 kcal/mol) to restrain ATP/ADP, the two Mg2+ ions and their coordinated water molecules. These restraints were applied to keep the catalytic core structures close to their original positions.

The minimized system was heated to 300 K and equilibrated with restrained protein heavy atoms. To fully equilibrate bulk water molecules and ions, the system was further heated to 600 K with the same restraint potentials and equilibrated at 600 K. The system was then cooled back down to 300 K. After a restrained equilibration at 300 K, the harmonic potential was gradually released, and the free equilibration was continued under the NVT condition. Finally, the simulation was carried out under the NPT condition. The whole equilibration process took approximately 1 nanosecond (ns), with slight deviations between systems. The equilibration MD simulation was performed using the leap-frog integration algorithm with a 2 femtosecond (fs) time step. The SHAKE algorithm [67] was used to constrain all bond lengths involving hydrogen atoms. For the NVT simulations, the Langevin thermostat [39] was used to maintain the temperature, with a friction constant of 5 ps^-1 for heavy atoms. For the NPT simulations, the temperature was kept at 300 K using the Nosé-Hoover thermostat [38], and the pressure was maintained at 1 atm using the extended Langevin piston algorithm [68]. All the modeling and equilibration simulations were executed with the CHARMM program.

Production MD simulations were performed with the NAMD software. Each system was simulated for 300 ns under the NPT condition (isobaric-isothermal ensemble at 1 atm and 300 K). The temperature was maintained using the Langevin thermostat, and the pressure was controlled by the Nosé-Hoover Langevin piston method [68-69]. The CHARMM27 force field with the CMAP correction for protein backbone dihedrals [30-31] was used to represent all the system interactions. The particle mesh Ewald (PME) summation method [70] was used to evaluate electrostatic interactions. vdW interactions were calculated with a switching function [33]. The trajectory was saved at 2 ps intervals for later analyses.

Accelerated MD (aMD) The dual boost approach of aMD [71] was employed to enhance the configuration sampling. The reference potential E and tuning parameter α were determined by the following empirical formulas [72]:

$$\begin{cases} E_{\mathrm{dihed}} = V_{\mathrm{dihed\_avg}} + 0.3 \times V_{\mathrm{dihed\_avg}} \\ \alpha_{\mathrm{dihed}} = 0.3 \times V_{\mathrm{dihed\_avg}} / 5 \\ E_{\mathrm{total}} = V_{\mathrm{total\_avg}} + 0.2 \times n_{\mathrm{atoms}} \\ \alpha_{\mathrm{total}} = 0.2 \times n_{\mathrm{atoms}} \end{cases} , \qquad (34)$$

where $E_{\mathrm{dihed}}$ is the reference energy for biasing the dihedral torsion terms, $V_{\mathrm{dihed\_avg}}$ is the average dihedral potential energy extracted from the conventional MD (cMD) simulation, and $\alpha_{\mathrm{dihed}}$ is the tuning factor for the dihedral part. For the total energy part, $E_{\mathrm{total}}$ is the reference total energy of the system, $V_{\mathrm{total\_avg}}$ is the average total potential energy from the 300 ns cMD, $n_{\mathrm{atoms}}$ is the total number of atoms in the system, and $\alpha_{\mathrm{total}}$ is the tuning factor controlling the total boost potential. Before performing the aMD simulation, each system was first simulated for 50 ns using cMD. The aMD simulation was then performed for 200 ns, starting from the restart file saved at the 50 ns time point and under the same simulation conditions as the equilibrium MD. The 2D potential of mean force (PMF) plot of each simulated system was obtained directly from the biased potential surface, in order to avoid the reweighting noise in the relatively large system [73].
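The empirical parameter selection of eq. (34) amounts to a few lines of arithmetic, sketched below in Python. The function names and the numerical inputs are hypothetical (energies are assumed to be in kcal/mol), and the boost function shown is the commonly used aMD form, included here only to indicate where E and α enter.

```python
def amd_parameters(v_dihed_avg, v_total_avg, n_atoms):
    """Dual-boost aMD parameters following the empirical formulas of eq. (34)."""
    return {
        "E_dihed": v_dihed_avg + 0.3 * v_dihed_avg,
        "alpha_dihed": 0.3 * v_dihed_avg / 5.0,
        "E_total": v_total_avg + 0.2 * n_atoms,
        "alpha_total": 0.2 * n_atoms,
    }


def boost_energy(v, e_ref, alpha):
    """Commonly used aMD boost: (E - V)^2 / (alpha + E - V) if V < E, else 0."""
    if v >= e_ref:
        return 0.0
    gap = e_ref - v
    return gap * gap / (alpha + gap)


# Hypothetical averages from a cMD run of a 50,000-atom system.
params = amd_parameters(v_dihed_avg=1000.0, v_total_avg=-150000.0, n_atoms=50000)
example_boost = boost_energy(-150000.0, params["E_total"], params["alpha_total"])
```

A larger α flattens the boost and keeps the modified surface closer to the original one, whereas a small α combined with a high reference energy produces aggressive boosting, which is the trade-off behind the parameter tuning discussed in the theory section.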

2.3.3 Dynamics analysis

PCA and Motion Projection PCA was performed on each system using the coordinates saved between the 50 ns and 300 ns time points. Each set of coordinates was first superposed onto the corresponding starting structure to remove the global translation and rotation of the protein. The average structure was then determined from the superposed trajectory. Using the average structure as a reference, I constructed the mass-weighted covariance matrix of atomic fluctuations and diagonalized it. From the diagonalization, I obtained a collection of protein motions (i.e., PCA modes) ranked by their vibrational frequencies. In each PCA, only backbone heavy atoms of the rigid structural parts were considered, to avoid interference from flexible regions of the protein, e.g., the N- and C-termini.

Once the PCA modes were obtained, they were projected onto a target motion, which is defined by the difference between the inactive and active kinase conformations as described in Section 2.1.3. Such a projection helps to quantify the contribution of a collection of PCA modes to the presumed global conformational change of the kinase along the target motion. I want to note that the direction defined by the target vector may not represent the physical transition path of the conformational change. However, the projection provides a way to quantify the influence of certain effectors, e.g., A-loop phosphorylation and ligand binding, on the conformational change of the kinase.

Dynamical Network Analysis To conduct a dynamical network analysis, a physical contact network must first be constructed. In the network analysis, the network is represented by nodes connected by edges. For a protein, a node can be a residue, which is often assigned to its backbone Cα atom to simplify the network representation. The edge is defined by a cut-off distance that judges whether two nodes are physically connected. Typically, an edge is said to exist between two residues if any pair of heavy atoms from the two residues is within 4.5 Å for at least 75% of the analyzed MD trajectory [61]. For this, I measured all heavy-atom pair distances between two residues rather than only the distance between their Cα atoms. To simplify the network, the edges between residues neighboring in sequence were ignored. The network was built from the frames between 50 ns and 300 ns, sampled at 50 ps intervals. To determine the probability of information transfer across the edges [61], the correlations between residues were calculated. Then, the edges were weighted by factors $w_{ij}$ calculated from the residue-pairwise correlation map, given by $w_{ij} = -\log C_{ij}$, where $C_{ij}$ is the correlation between nodes i and j [74].
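The edge definition and weighting described above can be sketched with NumPy as follows. This is only an illustrative sketch, not the NetworkView implementation: the input array of per-frame minimum heavy-atom distances, the toy data and the function names are hypothetical, and taking the absolute value of the correlation before the logarithm is an assumption made here so that negative correlations yield a finite weight.

```python
import numpy as np


def contact_edges(min_dist, cutoff=4.5, persistence=0.75):
    """Boolean adjacency matrix from per-frame residue-residue distances.

    min_dist has shape (n_frames, n_res, n_res) and holds, for each frame,
    the minimum heavy-atom distance (Angstrom) between every residue pair.
    """
    contact_fraction = (min_dist <= cutoff).mean(axis=0)
    adj = contact_fraction >= persistence
    idx = np.arange(adj.shape[0])
    adj[idx, idx] = False                  # no self edges
    adj[idx[:-1], idx[1:]] = False         # ignore sequence neighbours (i, i+1)
    adj[idx[1:], idx[:-1]] = False
    return adj


def edge_weights(adj, corr):
    """w_ij = -log|C_ij| on edges, infinity where no edge exists."""
    weights = np.full(adj.shape, np.inf)
    weights[adj] = -np.log(np.abs(corr[adj]))
    return weights


# Hypothetical toy input: 4 residues observed over 4 frames.
n_frames, n_res = 4, 4
min_dist = np.full((n_frames, n_res, n_res), 10.0)
for i, j, frames_in_contact in [(0, 1, 4), (0, 2, 3), (0, 3, 2)]:
    min_dist[:frames_in_contact, i, j] = 4.0
    min_dist[:frames_in_contact, j, i] = 4.0
corr = np.full((n_res, n_res), 0.5)

adj = contact_edges(min_dist)
weights = edge_weights(adj, corr)
```

In this toy input only the pair (0, 2) forms an edge: the pair (0, 1) is always in contact but is excluded as a sequence neighbour, and the pair (0, 3) is in contact for only half of the frames.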

The aim of the dynamical network analysis is to partition the whole protein into dynamically independent substructures, i.e., “communities”, and to detect the extent of their coupling. To do so, several other indicators are needed. The first one is the path length between two nodes i and j ($D_{ij}$), which is determined by summing the weights of the edges along the path connecting them. The smallest value of $D_{ij}$ defines the shortest path, which was determined by the Floyd-Warshall algorithm [75-76]. Many shortest paths may pass through a specific edge; the number of such shortest paths defines the indicator “betweenness” [77]. The Girvan-Newman algorithm [78] was then employed to partition the entire dynamical network into communities by integrating all the indicator information. The algorithm works as follows. First, the edge with the highest “betweenness” is removed from the original network. The “betweenness” of the remaining edges is then re-calculated, and the edge with the highest value is again removed from the new network. This procedure continues until disconnected communities are generated. During this iteration, an optimized community partition can be found by maximizing the indicator “modularity” [79], which measures the difference in probability between intra- and inter-community edges. In my analysis, the optimal modularity value was ~0.65, which falls in the reasonable range of 0.4 to 0.7 for biological systems [80].
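The shortest-path step can be illustrated with a compact NumPy version of the Floyd-Warshall algorithm. The weight matrix below is a hypothetical four-node example, with infinity marking pairs that are not connected by an edge; the betweenness and Girvan-Newman steps are not reproduced here.

```python
import numpy as np


def floyd_warshall(weights):
    """All-pairs shortest path lengths D_ij for a weighted, undirected network.

    weights is an (n, n) array holding the edge weight w_ij where an edge
    exists and np.inf elsewhere.
    """
    dist = weights.astype(float).copy()
    np.fill_diagonal(dist, 0.0)
    for k in range(dist.shape[0]):
        # Allow node k as an intermediate stop on every i -> j path.
        dist = np.minimum(dist, dist[:, k, None] + dist[None, k, :])
    return dist


INF = np.inf
# Hypothetical network: edges 0-1 (0.2), 1-2 (0.3), 0-2 (1.0) and 2-3 (0.1).
weights = np.array([
    [INF, 0.2, 1.0, INF],
    [0.2, INF, 0.3, INF],
    [1.0, 0.3, INF, 0.1],
    [INF, INF, 0.1, INF],
])
dist = floyd_warshall(weights)
```

In this example the direct edge between nodes 0 and 2 (weight 1.0) is longer than the detour through node 1 (0.2 + 0.3), so the shortest path between nodes 0 and 3 runs through nodes 1 and 2. Strongly correlated residue pairs have small weights and therefore attract the shortest paths.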

In the analysis, the correlation map was generated by Carma [81], the structural network was constructed with the NetworkView plugin [61] in VMD [82] (version 1.9.1), and the communities were partitioned by the routine ‘gncommunities’ [61]. The resultant protein communities were mapped onto the corresponding kinase structures by an in-house PyMOL script. The basic operations mostly followed the instruction manual “dynamical network analysis”, which was produced by Luthey-Schulten’s group.

2.3.4 Free energy calculation protocols

A-loop Tyrosine Phosphorylation. The free energy difference between the unphosphorylated and phosphorylated kinase systems was obtained from TI free energy simulations. The alchemical simulations were performed on four different functional and conformational states of IGF-1RK (i.e., the inactive and active conformations in their apo forms, and the reactant and product states). Before the alchemical simulation, each system was equilibrated for 10 ns in its phosphorylated form. The equilibrated structure was then used to build a hybrid system represented with the dual topology strategy [27], i.e., the tyrosine and pTyr side chains co-exist on a common backbone. To determine a reference free energy value, the same alchemical transformation, namely the conversion of a tyrosine to its phosphorylated counterpart, was carried out in water. All equilibration MD simulations were carried out under NPT conditions, and all subsequent alchemical transformations under NVT conditions.

Each alchemical transformation was carried out in three separate steps. First, the charges on the tyrosine residue were scaled down to zero (the uncharging step). Second, the uncharged tyrosine was transformed to the uncharged pTyr (the vdW transformation step). In this step, the soft-core potential [57] was employed to avoid end-point convergence problems. Third, the charges on the pTyr were restored (the charging step). This last step necessarily involves a nonzero net charge, because the total charge of the system changes when the charges on pTyr are restored. However, because the PME method was used to evaluate the electrostatic interactions, a nonzero total charge may produce artifacts [83]. To reduce these artifacts, I employed a net charge counterbalance protocol [84]. The basic idea is as follows. For each pTyr, the λ=0 state was prepared with a net charge of -1.0. Consequently, when the transformation reached the λ=1 state, the net charge of the system was +1.0, equal in magnitude but opposite in sign. With this choice of net charge, the majority of the net charge artifacts cancel in the final free energy values.

Each complete transformation was carried out in 11 λ steps, i.e., from 0 to 1 with ∆λ=0.1. In each λ simulation, an equilibration-relaxation-collection protocol was adopted. The system was first equilibrated at λ=0.5 for 0.5 ns. Then, a series of short relaxation simulations was carried out until λ reached the target value. For example, if the target λ value was 1, λ was increased by 0.1 every 20 ps until it reached λ=1. After this step, a 1 ns simulation was carried out at the target λ value, and the data collected from the last 800 ps were used for the free energy calculation by the TI method. To reduce the computational cost, I alchemically transformed all three A-loop tyrosines simultaneously in the protein systems. To avoid bias in the error estimation, uncorrelated data were extracted by calculating the correlation time of ∂O/∂λ in each simulation. A bootstrapping protocol [85] with 2000 repetitions was then applied to estimate the free energy error of each transformation. All simulations were carried out using the PERT module of the CHARMM program. Further details of the simulation procedures are described in paper II.
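The TI integration over the λ windows and the bootstrap error estimate can be sketched as follows. This is a schematic illustration, not the CHARMM/PERT workflow: the trapezoidal quadrature, the function names and the data layout (one list of already decorrelated derivative samples per λ window) are assumptions of the sketch.

```python
import random

def ti_free_energy(lambdas, mean_derivative):
    """Trapezoidal integral of the mean lambda-derivative over the windows."""
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * width * (mean_derivative[k] + mean_derivative[k + 1])
    return total

def bootstrap_error(lambdas, samples, n_boot=2000, seed=0):
    """Standard deviation of the TI estimate over bootstrap resamples.

    samples[k] holds the decorrelated derivative values of window k.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        means = []
        for window in samples:
            resampled = [rng.choice(window) for _ in window]
            means.append(sum(resampled) / len(resampled))
        estimates.append(ti_free_energy(lambdas, means))
    avg = sum(estimates) / n_boot
    var = sum((e - avg) ** 2 for e in estimates) / (n_boot - 1)
    return var ** 0.5
```

With 11 windows, `lambdas` would be `[0.0, 0.1, ..., 1.0]`, and `n_boot=2000` matches the number of repetitions quoted in the text.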

Gaussian Soft-core Potential. In paper III, annihilation free energy simulations were carried out for each of 12 model systems with and without the newly developed soft-core potentials. In the conventional soft-core (CSC) scheme, the electrostatic interactions of a solute were first removed by scaling its charges to zero, followed by removal of the vdW interactions with the distance-based CSC potential. In the Gaussian soft-core (GSC) scheme, I first removed both the electrostatic and vdW interactions of the solute from the system while slowly introducing repulsive interactions between the solute and solvent atoms described by the GSC potential. The added GSC potentials were then removed during the subsequent transformation simulation. The three parameters in eq. (27) and (28) that control the strength of the GSC repulsion, i.e., αR, βR and the third parameter defined there, were set to 5.0, 5.0 and 4, respectively. For each transformation, 19 λ simulations were carried out, i.e., at λ=0.0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.211325 (= 1/2 − 1/√12), 0.3, 0.4, 0.5, 0.6, 0.7, 0.788675 (= 1/2 + 1/√12), 0.8, 0.9, 0.95, 0.98, 0.99 and 1.0. In each λ simulation, the equilibration and data collection stages lasted 0.5 ns and 1.5 ns, respectively. The final free energy differences were calculated by several different methods, including the TI, FEP, double-wide FEP and BAR methods.
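The two non-round λ values, 1/2 ± 1/√12, coincide with the nodes of the two-point Gauss-Legendre quadrature mapped onto the interval [0, 1]. The short sketch below reproduces the 19-point λ list and shows the corresponding two-point quadrature. That the values were chosen for this purpose is an inference from the numbers, and the variable and function names are illustrative.

```python
import math

# Two-point Gauss-Legendre nodes mapped onto the interval [0, 1].
gauss_low = 0.5 - 1.0 / math.sqrt(12.0)
gauss_high = 0.5 + 1.0 / math.sqrt(12.0)

# The 19 lambda values listed in the text, with the two nodes inserted.
lambdas = sorted(
    [0.0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5,
     0.6, 0.7, 0.8, 0.9, 0.95, 0.98, 0.99, 1.0]
    + [gauss_low, gauss_high]
)

def two_point_gauss(f_low, f_high):
    """Two-point Gauss-Legendre estimate of an integral over [0, 1].

    f_low and f_high are the integrand values at gauss_low and gauss_high.
    """
    return 0.5 * (f_low + f_high)
```

The two-point rule integrates polynomials up to third order exactly, which makes it a cheap cross-check on a TI estimate obtained from the full set of windows.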

The Tyr-to-pTyr transformations in water and in IGF-1RK were carried out with the 3-step protocol described above in the “A-loop Tyrosine Phosphorylation” section, using both the CSC and GSC schemes (Figure 1). Unlike the alchemical transformations described above, in the present test only one A-loop tyrosine in IGF-1RK, i.e., Tyr1136, was mutated in silico to pTyr. In addition, the single topology strategy [27] was used to represent the hybrid residue, in which atoms present only before or only after the mutation were represented by “dummy” atoms. All other procedures followed the description in the previous section.


3. Phosphoryl Transfer by Kinases

In paper I [14], my coworker and I investigated the mechanism of kinase phosphoryl transfer and the effects of A-loop phosphorylation on it. To study the chemical reaction catalyzed by IRK, a combined quantum mechanical and molecular mechanical (QM/MM) method was employed. First, we compared two alternative reaction mechanisms, i.e., the general base and substrate-assisted mechanisms, using the semiempirical AM1/d-PhoT QM/MM method. The comparison revealed that the catalytic reaction prefers the general base mechanism (Figure 4a). Specifically, we showed that the reaction proceeds in two sequential steps, starting with proton transfer from the substrate tyrosine to the catalytic Asp1132 (IRK numbering; Asp1105 in IGF-1RK), followed by γ-phosphoryl transfer from ATP to the substrate tyrosine. The quality of the AM1/d-PhoT method was also examined by comparing its results with those of ab initio QM and QM/MM simulations.

The effects of A-loop phosphorylation on these two chemical steps were then investigated with AM1/d-PhoT QM/MM free energy simulations. The simulations revealed that A-loop phosphorylation enhances the catalytic reaction rate by making the proton transfer free energy more favorable (by 2.6 kcal/mol), thereby shifting the entire free energy landscape of the catalytic reaction. In contrast, A-loop phosphorylation has less effect on the phosphoryl transfer step (Figure 4b). Overall, our calculated reaction free energy profiles show that A-loop phosphorylation lowers the overall catalytic barrier by 2.0 kcal/mol, which agrees well with the experimental estimate of 2.2 kcal/mol [13].
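To give a feeling for what these barrier changes mean kinetically, the sketch below converts a barrier lowering into a rate ratio using the transition-state-theory relation k_new/k_old = exp(ΔΔG‡/RT). The temperature of 298.15 K and the function name are assumptions of this example and are not taken from the simulations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def rate_enhancement(barrier_drop_kcal, temperature=298.15):
    """Ratio k_new / k_old when the free energy barrier drops by the given amount."""
    return math.exp(barrier_drop_kcal / (R_KCAL * temperature))

computed = rate_enhancement(2.0)      # barrier lowering from the simulations
experimental = rate_enhancement(2.2)  # experimental estimate cited in the text
```

Under these assumptions, both values correspond to a rate enhancement of a few tens, so a difference of 0.2 kcal/mol between calculation and experiment changes the predicted ratio by well under a factor of two.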


Figure 4. Phosphoryl transfer mechanism and protein dynamics of IRK. (a) Proposed two-step phosphoryl transfer mechanism assisted by the general base Asp1132. (b) Free energy landscapes for the proton transfer step (left) and the phosphoryl transfer step (right). Black and red lines denote the unphosphorylated (0P) and fully phosphorylated (3P) kinases, respectively. The reaction coordinate ζ is defined as the difference between the OTyr-HTyr and HTyr-OδAsp distances for the proton transfer step, and as the difference between the Oβγ-Pγ and Pγ-OTyr distances for the phosphoryl transfer step. (c) Cross correlation maps for the systems before the proton transfer (left) and after the proton transfer (right).

To trace the origin of the effects of A-loop phosphorylation on the free energy profiles, I performed 300 ns classical MD (cMD) simulations on each of four systems, i.e., the kinase states before and after the proton transfer, with and without A-loop phosphorylation. To show the differences in dynamics between these systems, the residue-residue cross correlation maps are presented in Figure 4c. The left panel shows that A-loop phosphorylation increased the correlation of protein motions in the active conformation kinase before the proton transfer. This enhanced correlation suggests that the motions of the enzyme substructures become more cooperative upon A-loop phosphorylation. In contrast, after the proton transfer, the two correlation maps show negligible differences (right panel in Figure 4c). This change of correlated protein motions is consistent with the free energy landscapes (left in Figure 4b), which clearly show that A-loop phosphorylation favors the proton transfer reaction. In the phosphorylated kinase, the correlation is thus reduced on going from the pre- to the post-proton-transfer state, which reflects increased intra-structural motion and hence increased conformational entropy of the kinase. The suggested increase of conformational entropy in turn lowers the free energy of proton transfer in the fully phosphorylated kinase. This finding exemplifies how protein dynamics participates in the allosteric regulation of kinase catalytic activity mediated by A-loop phosphorylation.
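For readers unfamiliar with cross correlation maps, one element of such a map can be computed as in the sketch below. It uses the standard normalized covariance of atomic displacement vectors, which is assumed here to correspond to the definition used in this thesis. The maps in this work were generated with Carma, not with this code, and the function name is illustrative.

```python
def cross_correlation(traj_i, traj_j):
    """Normalized cross-correlation C_ij of two residues' displacement vectors.

    traj_i, traj_j: lists of (x, y, z) positions, one per frame, taken from a
    trajectory that has already been superimposed on a reference structure.
    """
    n = len(traj_i)
    mean_i = [sum(p[d] for p in traj_i) / n for d in range(3)]
    mean_j = [sum(p[d] for p in traj_j) / n for d in range(3)]
    dot_ij = dot_ii = dot_jj = 0.0
    for p, q in zip(traj_i, traj_j):
        di = [p[d] - mean_i[d] for d in range(3)]
        dj = [q[d] - mean_j[d] for d in range(3)]
        dot_ij += sum(a * b for a, b in zip(di, dj))
        dot_ii += sum(a * a for a in di)
        dot_jj += sum(b * b for b in dj)
    return dot_ij / (dot_ii * dot_jj) ** 0.5
```

Values near +1 indicate residues that move together, values near -1 indicate residues that move in opposite directions, and values near 0 indicate uncorrelated motion.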


4. Kinase Dynamics and the Free Energy Landscape

Protein dynamics plays essential roles in the functions of biomolecules and is therefore attracting increasing attention [86]. It has been well characterized by experimental approaches such as NMR [87-88]. However, tracing the structural origins of the dynamics is not easily achieved by experimental techniques. For this purpose, MD has become a powerful tool to study protein dynamics at the atomic level [86]. A-loop phosphorylation is an important factor in regulating kinase catalytic activity. Over an entire catalytic cycle, as demonstrated below, A-loop phosphorylation affects many properties of kinases, such as local protein dynamics, interactions, structural cooperation, conformational changes, binding affinity and thermodynamics. To dissect the functional relevance of protein dynamics, I have used MD and alchemical free energy simulations for the multiple catalytic steps of the kinase IGF-1RK.

4.1 Kinase dynamics and cooperativity affected by phosphorylation

Protein dynamics can be examined on different time and length scales. With the increase of computing power and the development of advanced algorithms, MD simulations can now investigate protein dynamics on relatively long time scales, e.g., from microseconds (µs) to milliseconds (ms), during which the protein may exhibit large-scale conformational changes [26]. Despite these advances, MD simulations of hundreds of ns on a protein of common size (such as a kinase domain, ~300 residues) remain the mainstream. A crucial question is therefore how such relatively short MD simulations can be used to deduce functionally relevant information that may correspond to phenomena on longer time scales. In this section, I employed several different analysis methods to understand how protein dynamics responds to stimuli, e.g., phosphorylation and ligand binding. Further details and results of the simulation analyses are provided in paper II.

4.1.1 Protein Motion

To mimic the entire catalytic cycle of IGF-1RK, I built 8 simulation systems, each representing a different functional and conformational state. These states include the inactive and active conformations in their apo forms with and without A-loop phosphorylation, and the reactant and product states with different A-loop phosphorylation levels (Figure 3). The 300 ns MD simulation of each system suggests that A-loop phosphorylation induces only local changes in the protein structure. PCA was performed for each system to gain insight into how such local changes propagate to the rest of the kinase to induce large conformational changes, such as the inactive ↔ active conformational transition. The analysis revealed variations in the motions of the A-loop and αC-helix after A-loop phosphorylation in each system pair.

Figure 5. Cumulative contribution of PCA modes to the target motion. The contribution was quantified by the cumulative sum of the squared projection (eq. (32)) for (a) the apo and (b) the holo systems. Along the abscissa, the modes are ranked in ascending order of frequency. The ordinate is the cumulative sum of the squared projection.

The PCA results imply that only localized changes of protein motions occur after A-loop phosphorylation. Can such localized changes induce a global change of the kinase? To quantify the contribution of protein motions to a specific global conformational change, I projected the PCA modes onto a unit displacement vector defined by the difference between the active and inactive conformations, according to eq. (32) and (33). The results are presented in Figure 5. The inactive conformation systems showed the most pronounced directed motion among all systems. Specifically, the cumulative sum rose early in $I_{apo}^{3P}$, such that the first 100 modes account for ~70% of the total projection. In all other systems, the same number of modes contributes only ~40%. Because the difference in A-loop orientation between the two kinase conformations dominates the displacement vector, the results suggest that A-loop phosphorylation of the kinase in the inhibited conformation enhances motions toward the activated conformation.
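The cumulative squared projection can be sketched as follows. The code assumes that the PCA modes are orthonormal vectors and that the displacement vector is normalized before projection. The exact forms of eq. (32) and (33) are not reproduced here, so the function should be read as an illustration of the idea rather than as the analysis script used in this work.

```python
def cumulative_overlap(modes, displacement):
    """Cumulative sum of squared projections of PCA modes on a displacement.

    modes: list of orthonormal mode vectors, in the order they are summed.
    displacement: difference vector between two conformations (any length).
    """
    norm = sum(x * x for x in displacement) ** 0.5
    unit = [x / norm for x in displacement]
    running, curve = 0.0, []
    for mode in modes:
        projection = sum(m * u for m, u in zip(mode, unit))
        running += projection ** 2
        curve.append(running)
    return curve
```

If the modes form a complete orthonormal basis, the curve ends at 1. A curve that rises early therefore means that a small number of modes already captures most of the target displacement.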

4.1.2 Structural Correlation

The residue-residue correlation map was generated to characterize correlated protein motions in the different systems. For visual inspection, each correlation map was mapped onto the corresponding protein structure (Figure 6). The mapping was carried out by averaging the absolute pair correlation values of each residue along the corresponding column (or row) of a given map. The resulting value can be regarded as the average correlation of each residue with the rest of the kinase. Two main trends emerged from the comparison. First, A-loop phosphorylation weakened the correlation in the apo form of the kinase, whereas it enhanced the correlation in the holo form. Second, the overall correlations of the protein motions were diminished after substrate/product binding in both the phosphorylated and unphosphorylated systems. Consequently, the unphosphorylated kinase underwent a substantial decrease in correlation upon substrate/product binding (i.e., between $A_{apo}^{0P}$ and $A_{RS}^{0P}$), while the correlation of the fully phosphorylated kinase remained relatively unchanged.
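The per-residue averaging used for the structural mapping amounts to taking, for each residue i, the mean of |C_ij| over all residues j. A minimal sketch is given below. The function names are illustrative, and the color-scale helper simply clamps values to the 0.07-0.27 range quoted in the caption of Figure 6, which is an assumption about how the coloring could be reproduced.

```python
def mean_abs_correlation(corr):
    """Average |C_ij| over j for every residue i of a square correlation map."""
    n = len(corr)
    return [sum(abs(c) for c in row) / n for row in corr]

def to_color_scale(value, low=0.07, high=0.27):
    """Map a per-residue value to [0, 1] (0 = cyan end, 1 = red end)."""
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))
```

The resulting per-residue values could then be written, for example, into the B-factor column of a structure file for coloring in a molecular viewer.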

Figure 6. Structural mapping of cross correlation maps. The average value of the correlation along each column (or row) of the correlation map was determined by the formula $\langle C_i \rangle = \frac{1}{N} \sum_{j=1}^{N} |C_{ij}|$, where ⟨…⟩ denotes the average, i is the index of a protein residue, N is the total number of residues, $C_{ij}$ is the correlation value of an individual residue pair, and |…| denotes the absolute value. The color scale varies from cyan (the smallest correlation values, < 0.07) to red (the largest correlation values, > 0.27).

In addition, I identified the regions of the protein whose correlation varied significantly. In the inactive conformation systems, phosphorylation decreased the correlations of the A-loop, whereas it enhanced the correlations of the αC-helix (Figure 6a and e). In contrast, the correlations of both the A-loop and αC-helix were decreased in the active apo system after phosphorylation (i.e., $A_{apo}^{3P}$). In the holo systems, the opposite was observed, namely the correlations of both the A-loop and αC-helix were enhanced after A-loop phosphorylation (Figure 6c, d, g and h). The results show that the A-loop and αC-helix are the regions that are predominantly affected. The trends identified in Figure 6c and g are consistent with the findings from the IRK system simulations (left in Figure 4c). This suggests that the conformational entropy argument made for IRK is also valid for IGF-1RK.

Figure 7. Dynamical network analysis of IGF-1RK. Different colors on each average structure indicate different communities from the network analysis; the colors are assigned as consistently as possible to allow comparison between systems. Red pillars denote the indicator “betweenness”, and a thicker pillar indicates a greater betweenness value. The highlighted regions, shown with green carbon and red oxygen atoms, mark the locations of isolated communities.

4.1.3 Structural Cooperation


Through the dynamical network analysis (also called community network analysis), I examined how different protein regions, e.g., the αC-helix and A-loop, are dynamically coupled. The analysis revealed that the entire kinase is partitioned into roughly 9 communities when isolated communities formed by three or fewer residues are ignored (Figure 7). This is similar to the partition observed in protein kinase A [89]. In addition, my analysis captured the variations in structural coupling between different systems caused by A-loop phosphorylation. Two conformational states, i.e., the inactive apo and reactant states, showed distinct differences compared with the other states. In the inactive system pair, a substantial fraction of the $I_{apo}^{0P}$ A-loop merged with the αD-helix to form a large community (Figure 7a). A-loop phosphorylation detached the A-loop from the αD-helix, transforming the large community into several small isolated communities (compare Figure 7a and 7e). This community variation suggests that A-loop phosphorylation decouples the A-loop from its contacts with the rest of the protein, thereby allowing the A-loop to move more independently. In the reactant state pair, A-loop phosphorylation enhanced the connectivity between the A-loop and αC-helix (compare Figure 7c and g). The pronounced coupling, together with the correlation mapping results, suggests that A-loop phosphorylation increases both the structural and dynamical coupling of the A-loop with the αC-helix in the reactant state.

4.1.4 αC-helix Orientation

X-ray crystallography has established that the αC-helix adopts an in-orientation in the fully phosphorylated active conformation and an out-orientation in the unphosphorylated inactive conformation. The present analysis revealed that the detailed motion of the αC-helix is more complex than this simple two-conformation model. In particular, the αC-helix orientation and dynamics are altered by A-loop phosphorylation in most conformational states (Figure 8). For instance, the αC-helix of $I_{apo}^{3P}$ adopted a more outward orientation than that of $I_{apo}^{0P}$ (Figure 8a). My analysis of the DFG motif and the αC-helix orientations suggests that this outward orientation of the αC-helix facilitates the conformational transition of the A-loop to the active kinase conformation (see paper II for details). Intriguingly, the αC-helix orientation of $I_{apo}^{0P}$ was rather different from that observed in the original X-ray structure. The discrepancy probably arises from crystal contacts between the two kinase monomers in the asymmetric unit of the X-ray structure, which stabilize the out-orientation of their αC-helices. By contrast, the relatively inward αC-helix orientation in the simulation was stabilized by the interaction between Glu1020 and Arg1104, which was not observed in the X-ray structure, partly because of the crystal contacts.

Figure 8. Comparison of the αC-helix orientation (a) between the apo systems and (b) between the holo systems. The αC-helix is shown as a cylinder. For simplicity, only two structures (one in the inactive and the other in the active conformation) are shown in gray cartoon. The inhibitory and activated conformations of the A-loop are colored in blue and red, respectively. The average structure of the αC-helix was obtained from the last 250 ns of the 300 ns trajectory.

In the active conformation, the αC-helix responded differently to A-loop phosphorylation in the apo and holo systems. While the $A_{apo}^{0P}$ and $A_{apo}^{3P}$ systems adopted a similar αC-helix orientation, that of the holo systems differed consistently before and after A-loop phosphorylation (Figure 8). Specifically, the αC-helix of the unphosphorylated holo systems adopted a more outward orientation than that of their phosphorylated counterparts (Figure 8b). In addition, the distribution of the αC-helix orientation was broader in the unphosphorylated holo systems than in the phosphorylated holo systems. These differences suggest that A-loop phosphorylation aids the formation of the catalytically competent conformation of the holo enzyme by modulating its αC-helix dynamics.

In this section, I elucidated how protein dynamics responds to A-loop phosphorylation by comparing protein motions, structural correlation, dynamic structural cooperation and structural fluctuation. Although relatively short MD simulations were used for these analyses, the comparison of protein dynamics between different functional and conformational states bridged the time-scale gap between local protein dynamics and functional relevance, and yielded valuable insights into the large conformational change of the kinase triggered allosterically by A-loop phosphorylation. This section illustrates that comprehensive analyses of different aspects of protein dynamics can yield reasonable deductions about protein function from information on local structural fluctuations.

4.2 General mechanisms regulating protein kinase function

4.2.1 Attraction-to-Repulsion Switch

Above, I have demonstrated that A-loop phosphorylation imposes several structural and dynamical changes and induces a large-scale conformational change of IGF-1RK. One change of particular importance is the attraction-to-repulsion switch of the interactions between the A-loop tyrosines and their neighboring residues in the inactive conformation (Figure 9a and 9b). This interaction switch provides a rationale for the observed enhanced flexibility and decreased correlation of the A-loop (Figure 6e) and for the detachment of the A-loop from the main body of the kinase (Figure 7e).


Figure 9. Representative interactions regulating the stability of the inhibitory kinase conformation: the interactions between the A-loop tyrosines and their neighboring residues for (a) the unphosphorylated (0P) and (b) the phosphorylated (3P) states, respectively, and (c) the interaction between Glu1020 and Arg1104. In (d), the changes of the Glu1020-Arg1104 interaction and the αC-helix orientation during the MD simulation are shown for the $I_{apo}^{0P}$ system.

More importantly, this interaction switch is conserved in many protein tyrosine kinases and may control the stability of their inhibitory conformations (Figure 10), which suggests a common mechanism for controlling the conformational transition of the A-loop. In addition, A-loop phosphorylation produced long-range effects on the structure and dynamics of IGF-1RK. One example is the disruption, by A-loop phosphorylation, of the coupling between the inward motion of the αC-helix and the Glu1020-Arg1104 salt bridge (Figure 9c and d). This decoupling likely occurs because the strain energy of the phosphorylated A-loop propagates to trigger the flipping of the DFG motif and the outward movement of the αC-helix. This can further facilitate the global conformational change of the kinase from the inactive to the active conformation. On the other hand, the unphosphorylated inhibitory conformation can be stabilized by the Glu1020-Arg1104 salt bridge, consistent with the inward orientation of the αC-helix (Figure 8a). All of these changes in side chain interactions and αC-helix dynamics caused by A-loop phosphorylation are reflected in the 12.7 kcal/mol destabilization of the inactive conformation after A-loop phosphorylation (see paper II and the “4.3 Free energy landscape and kinase allostery” section). Because several similar interactions are found in other kinases (Figure 10), the proposed mechanism may also be applicable to these kinases.


Figure 10. Conserved interactions shared by different kinases in their inactive conformations. In all structures, the A-loop is shown in cyan cartoon. The A-loop tyrosines and their neighboring negatively charged residues are shown with the ball and stick model. The kinase name and corresponding PDB ID of each structure are shown at the lower left corner of each panel.

4.2.2 A-loop and αC-helix Coupling

A-loop phosphorylation significantly alters the local dynamics and motions of the reactant states (Figures 4 and 6), suggesting the importance of these local changes in kinase allostery. This view is further supported by the analyses of side chain interactions and their changes. In $A_{RS}^{3P}$, the interactions between pTyr1136 and Arg1012 and between Glu1016 and the cofactor Mg2+ ion (MgI) held the αC-helix and A-loop together to encapsulate the nucleotide-binding site (Figure 11a). In this way, the kinase can properly assemble the catalytic residues (e.g., Glu1020, Asp1016 and Asp1105) for efficient catalysis (Figure 11b and c). In contrast, these interactions were loosened in $A_{RS}^{0P}$, which resulted in a less restricted αC-helix that then moved outward (Figure 8b and aMD results). These changes can rationalize why the catalytic rate is lower in the unphosphorylated kinase.


Figure 11. (a) Interaction details along the A-loop, αC-helix and active site in the phosphorylated reactant system ($A_{RS}^{3P}$). The activated A-loop and αC-helix are colored in red, and the substrate peptide in pink. Bar plots of the interaction distances are shown for $A_{RS}^{3P}$ and $A_{RS}^{0P}$ in (b) and (c), respectively. Each bar represents an average distance, and the error bar denotes its standard deviation during the 300 ns MD simulation. Interaction indices from 1 to 6 indicate the following interaction pairs in sequence: (p)Tyr1136-Arg1012, Glu1016-MgI, β phosphate of ATP-MgI, Asp1123-MgI, Glu1020-MgI, Asp1105-MgI.

In the product states, A-loop phosphorylation produced αC-helix orientation changes similar to those observed in the reactant states. However, unlike in the reactant states, pTyr1136 of the product state did not form strong interactions with the αC-helix. Instead, the phosphorylated tyrosines only stabilized the active conformation of the A-loop. Such weakened interactions in the product state may provide a mechanism that alleviates the damping effect of A-loop phosphorylation after the phosphoryl transfer. In fact, similar interactions mediated by A-loop phosphorylation are prevalent in many other kinases (Figure 12), suggesting a ubiquitous mechanism for the regulation of kinase catalysis.


Figure 12. Conserved interactions shared by different kinases in their active conformations. These interactions potentially regulate the catalytic activity and dynamics of the kinases by coupling the αC-helix and A-loop. The protein structures are shown in cartoon representation, and the important residues are shown in ball and stick. The kinase names and their corresponding PDB IDs are shown at the bottom of each panel.

4.3 Free energy landscape and kinase allostery

4.3.1 Free Energy Landscape

The free energy landscapes were constructed for the first time for this kinase, based on the published data on IRK [13] and my simulation results. The landscapes presented in Figure 13 mimic the entire catalytic cycle of the kinase, including the kinase conformational change, substrate binding, phosphoryl transfer and product release. The reaction coordinate of the free energy landscape is therefore no longer a geometric descriptor (e.g., a distance, angle or dihedral) but a sequence of necessary catalytic events. These events were constructed by bridging 4 different kinase states with and without A-loop phosphorylation (Figure 3). Below, I describe the results and their implications for each catalytic step, starting from the conformational equilibrium between the inactive and active conformational states.


Figure 13. Free energy landscapes mimicking the IGF-1RK catalytic cycle. The red and blue lines denote the phosphorylated (3P) and unphosphorylated (0P) states, respectively. The solid parts of the profiles denote free energy values from the alchemical calculations, and the dashed parts denote free energy values deduced from experimental data. The biological events involved in the catalytic cycle are shown at the bottom of the figure.

Conformational equilibrium. Figure 13 shows that A-loop phosphorylation drives the conformational transition to the active conformation by destabilizing the inactive conformational state and stabilizing the active one. In the absence of A-loop phosphorylation, the two conformational states are in equilibrium, with the free energy of the active conformation estimated to be 4 kcal/mol higher than that of the inactive conformation [90-92]. In contrast, the corresponding free energy difference dropped to -9.3 kcal/mol after A-loop phosphorylation (inactive ⇔ active equilibrium of the phosphorylated kinase, Figure 13), for which the destabilization of the inactive conformation by 12.7 kcal/mol provides the major driving force. The present results are in line with the studies on Src kinase [15], which suggest that A-loop phosphorylation is a primary effector inducing the conformational change of Src. These consistent observations show that A-loop phosphorylation plays an important role in controlling the kinase conformational equilibrium.
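As a rough illustration of what these free energy differences mean for the conformational populations, the two-state Boltzmann relation can be evaluated directly. The sketch below is not part of the thesis calculations: it assumes a simple two-state model, a temperature of 298.15 K (an assumed value), and a hypothetical helper named `active_fraction`.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def active_fraction(dg_active_minus_inactive, temperature=298.15):
    """Fraction of kinase in the active state for a two-state equilibrium.

    dg_active_minus_inactive: G(active) - G(inactive) in kcal/mol.
    temperature: assumed value in K (not taken from the thesis).
    """
    rt = R_KCAL * temperature
    k_eq = math.exp(-dg_active_minus_inactive / rt)  # [active]/[inactive]
    return k_eq / (1.0 + k_eq)

# Values quoted in the text: +4 kcal/mol (0P) and -9.3 kcal/mol (3P).
frac_0p = active_fraction(4.0)
frac_3p = active_fraction(-9.3)
print(f"0P active fraction:   {frac_0p:.4f}")
print(f"3P inactive fraction: {1.0 - frac_3p:.1e}")
```

Under these assumptions, the unphosphorylated kinase populates the active conformation only about 0.1% of the time, whereas the phosphorylated kinase is almost exclusively active.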

Substrate binding and product release. The reactant state was favored more by A-loop phosphorylation (-1.6 kcal/mol) than the other two active conformation states (i.e., the apo active and product states). The encapsulation of the ATP-binding site by the αC-helix and A-loop (Figure 7g) rationalizes this stabilization. In particular, the interactions mediated by pTyr1136 play an essential role here, which is further supported by my calculation (i.e., -6.9 kcal/mol favored by Tyr1136 phosphorylation, paper III) and by mutagenesis experiments [93]. The analyses of the protein dynamics of the reactant states also showed that A-loop phosphorylation strengthened the structural coupling between the αC-helix and A-loop through the pTyr1136-Arg1012 salt bridge (Figure 11). Taken together, the fully phosphorylated IGF-1RK accelerates the chemical step mainly through the favorable interactions with pTyr1136, which affect the αC-helix orientation and the assembly of the active site.

In contrast, the product state was destabilized by A-loop phosphorylation (+1.7 kcal/mol). To trace the structural origin, I compared the interactions in ASú�S and A�ú�S. The analysis revealed that ASú�S forms a weaker coupling between the αC-helix and A-loop than A�ú�S, which results from the loss of the pTyr1136-Arg1012 salt bridge. The different interaction modes are probably caused by the change in the electrostatic environment after the catalytic reaction, i.e., one more negative charge is introduced into the nucleotide-binding site after the reaction, and the transfer of the γ-phosphoryl group of ATP to the substrate tyrosine induces a local rearrangement of the kinase active site. As a consequence, the αC-helix is decoupled from the A-loop. The reduced coupling between the αC-helix and A-loop may provide a mechanism to alleviate the damping effect of A-loop phosphorylation on ADP release after the γ-phosphate transfer.

For substrate and product binding, A-loop phosphorylation contributed little to the substrate binding (-1.1 kcal/mol) and weakened the product binding (+2.2 kcal/mol) when the affinities were compared only in the active conformation of the kinase. This result agrees with my argument that the extra charge makes the nucleotide-binding site more electrostatically negative after the γ-phosphate transfer. Notably, the weakened product binding was evaluated based only on the active conformation of the kinase. When I factored into the binding affinity the 4 kcal/mol free energy cost of the interconversion between the inactive and active conformations, A-loop phosphorylation favored the product binding by 1.7 kcal/mol (close to the experimental value of 2.4 kcal/mol). The same correction can be made for the substrate binding, yielding -5.1 kcal/mol as the change in binding affinity estimated from the present simulation results. Again, this value is very close to the experimental estimate of -4.2 kcal/mol.

Another important finding is that A-loop phosphorylation smoothed out the free energy landscape by restricting the kinase to its active conformation (later phase in Figure 13). The free energy landscape of the unphosphorylated kinase fluctuated markedly because of the conformational interconversion of the kinase, reflecting large conformational dynamics. Consequently, the unphosphorylated kinase has to surmount more free energy barriers than its phosphorylated counterpart to complete an entire catalytic cycle along the reactive direction (from left to right in Figure 13). Thus, the free energy landscape provides new insight into the question of how protein dynamics participates in the kinase catalytic cycle.

4.3.2 Kinase Allostery

By integrating all the results presented so far, I can propose a mechanism for the regulation of IGF-1RK catalytic activity by A-loop phosphorylation. The mechanism can be decomposed into 3 allosteric modulation steps. In the first step, the thermodynamically favored inhibitory conformation of the kinase is changed to the active conformation after A-loop phosphorylation through a population shift mechanism. This conformational change is mainly driven by the destabilization of the inactive conformation. In particular, the electrostatic repulsion between the kinase and the phosphorylated A-loop tyrosines destabilizes the local structure. This local perturbation propagates along the A-loop to induce the flipping of the DFG motif and a change of the αC-helix orientation, which in turn triggers the global conformational change of the A-loop. This step is a positive allosteric transition with a net population shift of 13.2 kcal/mol (the top panel in Figure 14). This allosteric transition benefits the second step (the middle panel in Figure 14), i.e., substrate binding, thus producing a positive allosteric modulation. For this step, our free energy calculations give binding data consistent with the experimental measurements, which supports the reliability of our simulations.

Figure 14. Allosteric modulations of A-loop phosphorylation on IGF-1RK conformational change (top), substrate binding (middle) and phosphoryl transfer reaction (bottom).

In the third step, A-loop phosphorylation alters the interactions between the A-loop and αC-helix, thereby fine-tuning the organization of the active site residues and the cooperative protein motions in the active kinase conformation. The altered interactions and dynamics eventually modulate the catalytic barrier and the reaction free energy (Figure 14, bottom panel). In particular, after substrate binding, the strengthened coupling between the A-loop and αC-helix in the fully phosphorylated kinase tightens the active site and thereby accelerates the catalytic reaction. On the other hand, when the catalytic reaction is completed, the coupling between the A-loop and αC-helix is weakened, which reduces the stability of the product state relative to that of the unphosphorylated kinase. This change in the product state stability leads to an increased reaction free energy. This is a negative allosteric regulation of the reaction free energy, by 3.3 kcal/mol.

In summary, most previous studies focused only on a single conformational and functional state of the kinase or on a few relevant properties, e.g., protein dynamics or conformational changes. Analyses of these limited states and properties may not deliver a complete picture of how the kinase allostery works. In contrast, by utilizing several different simulation methods and extensive analyses, complemented by experimental data, I have demonstrated here that the overall allosteric effect of A-loop phosphorylation can be dissected into several individual steps of the kinase catalytic cycle. In particular, I have explored the changes in protein dynamics, structural interactions and thermodynamics to deliver a comprehensive understanding of the kinase allostery.

5. Gaussian Soft-core Potentials (GSC)

5.1 Reliability of new soft-core potentials

Herein, I introduce repulsive soft-core potentials, named Gaussian soft-core (GSC) potentials. To test their validity, I carried out a series of alchemical annihilation transformations, mainly of ions and small molecules in aqueous solution. In all 12 test cases, the GSC potentials reproduced the results from the CSC potentials (Figure 15a) with relatively small deviations (< 0.4 kcal/mol). Because the tested chemical species are diverse, the GSC potentials are expected to work well for other common systems.

The optimized infinite-order formula of free energy perturbation theory (OIOFEP, eq. 23) has the potential to reduce the number of λ simulations. In this theory, only 3 λ simulations are needed, i.e., λ=0.0, 0.5 and 1.0. Because the λ simulations based on the CSC potentials cannot be used with this free energy formalism, the TI method is used for comparison to obtain the corresponding free energies from the same 3 λ points. As shown in Figure 15b and c, GSC with the OIOFEP method gives reasonable accuracy for the 12 test systems, whilst CSC with the TI method shows relatively large errors. This comparison shows that applying the GSC potentials to the annihilations yields reliable free energies more efficiently than the CSC potentials.

Figure 15. Comparison between the CSC and GSC protocols on annihilation transformations of all 12 small molecule cases: (a) GSC vs. CSC with the 19-point TI method (see Tables 1 and 2), (b) GSC with OIOFEP vs. CSC with the 19-point TI method (reference), and (c) CSC with the 3-point TI method vs. GSC with the 19-point TI method (reference). The 19-point TI method refers to the two-step annihilation, in which each step is achieved over 19 λ state simulations.

Under the GSC and CSC strategies, different protocols are used to obtain the final annihilation free energies (see the Method section). With the GSC protocol, the OIOFEP method produces accurate estimates for both transformation steps (Figure 16a and b). By comparison, the CSC protocol yields a reasonable estimate only for the first transformation step with the 3-point OIOFEP method (Figure 16c), and poor results for the second transformation step with the 3-point TI method, resulting in an error of about 2 kcal/mol on average (Figure 16d).

Figure 16. Comparison of the free energies estimated by different methods for the first and second annihilation transformation steps. In (a) and (b), the free energies estimated for the first and second steps by the 3-point OIOFEP method are compared for the GSC protocol. In (c) and (d), the corresponding free energies from the CSC protocol are shown, respectively. In the second step of the CSC protocol, the 3-point TI method was used instead of the OIOFEP method to estimate the free energy change, and thus a larger error is expected. In all comparisons, the reference free energies are determined by the TI method using all 19 λ state simulations.

5.2 Origin of the Efficiency from GSC Protocol

What makes the GSC protocol more efficient than the CSC protocol at estimating annihilation free energies? To answer this question, I selected the glycine system to illustrate the different behaviors of the transition integrands of the GSC and CSC protocols. Figure 17a presents the first-step transition integrands from the two protocols. The two integrands are linear enough for the 3-point TI estimation to be accurate for both protocols. However, for the second transition step, both soft-core potentials produce highly curved integrands near λ=g2 (Figure 17b). Such large curvatures cannot be captured well by the 3-point TI method. Interestingly, when I extrapolated <∂H/∂λ> to non-simulated λ points based on the simulations at λ=0.0, 0.5 and 1.0 in the context of the GSC strategy, the overall curvature of the integrand from the 19 simulated points was well reproduced (Figure 17c). On the other hand, the data collected with the CSC potentials cannot be straightforwardly used to produce the same type of extrapolation. In fact, to extrapolate <∂H/∂λ> values based on the CSC-generated configurations, the quantity has to be re-evaluated at each target λ value. Such energy evaluations can be very time-consuming and thus inconvenient.
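The sensitivity of TI to integrand curvature can be illustrated with a toy calculation. The following sketch is only a generic illustration, not the thesis protocol: the integrand `toy_integrand` is a hypothetical curve that is sharply curved near one end point, the λ points are assumed to be equally spaced, and trapezoidal quadrature is assumed as the integration rule.

```python
import math

def trapezoid_ti(integrand, n_points):
    """Trapezoidal TI estimate using n_points equally spaced lambda values in [0, 1]."""
    h = 1.0 / (n_points - 1)
    values = [integrand(i * h) for i in range(n_points)]
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])

def toy_integrand(lam):
    """Hypothetical <dH/dlambda> curve (kcal/mol), sharply curved near one end point."""
    return 10.0 * math.exp(-8.0 * lam)

exact = 10.0 / 8.0 * (1.0 - math.exp(-8.0))  # analytic integral of the toy curve
ti_3 = trapezoid_ti(toy_integrand, 3)    # lambda = 0.0, 0.5, 1.0
ti_19 = trapezoid_ti(toy_integrand, 19)  # 19 equally spaced lambda values

print(f"exact   : {exact:.3f}")
print(f"3-point : {ti_3:.3f} (error {ti_3 - exact:+.3f})")
print(f"19-point: {ti_19:.3f} (error {ti_19 - exact:+.3f})")
```

For this toy curve the 3-point estimate is off by more than 1 kcal/mol, while the 19-point estimate is accurate to a few hundredths of a kcal/mol, which mirrors the qualitative behavior described for the curved second-step integrands.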

Figure 17. TI integrands of the (a) first and (b) second annihilation steps from the GSC and CSC protocols for glycine. In both panels, the results from the GSC protocol are shown in red and those from the CSC protocol in blue. (c) Extrapolated TI integrands (eq. (7)) for the second transformation step with the GSC potentials for glycine. The extrapolations are based on the simulations at 3 different λ states, i.e., λ=0 (gray), 0.5 (cyan) and 1 (green).

5.3 Optimize location of λ simulations

To further investigate the transformation behavior with the GSC potentials, the ∆U distribution is plotted for each of the 19 λ simulations for glycine in Figure 18. For the first transformation step, the ∆U distributions are all Gaussian-like with similar variances (Figure 18a). Further, 3 λ simulations, i.e., λ=0.0, 0.5 and 1.0, can cover the entire range of the energy distribution. For the second step, only 2 λ simulations, i.e., λ=0.1 and 0.95, are enough to cover the entire configuration space. This is because near the annihilation end, the ∆U distributions become extremely broad and thus cover the majority of the configuration space (Figure 18b). Based on these 2 λ simulations, the double-wide FEP method can be applied to estimate the annihilation free energy of the second step for all 12 cases. The results are quite accurate, with errors smaller than 0.2 kcal/mol (paper III).
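The idea behind double-wide FEP can be sketched with synthetic data: a single reference simulation is perturbed toward both a lower and an upper λ state, and the two exponential averages are combined. The example below is a minimal, generic sketch rather than the implementation used in this work; the helper names `zwanzig` and `double_wide`, the temperature of 298.15 K and the Gaussian ∆U samples are all assumptions made for illustration.

```python
import math
import random

KT = 0.0019872041 * 298.15  # kcal/mol at an assumed 298.15 K

def zwanzig(delta_u, kt=KT):
    """Free energy change from exponential averaging of energy differences.

    delta_u: samples of U_target - U_reference (kcal/mol), collected in the
    reference ensemble. The minimum is factored out for numerical stability.
    """
    u_min = min(delta_u)
    mean_exp = sum(math.exp(-(u - u_min) / kt) for u in delta_u) / len(delta_u)
    return u_min - kt * math.log(mean_exp)

def double_wide(delta_u_to_lower, delta_u_to_upper, kt=KT):
    """Free energy from the lower to the upper lambda state, both perturbed
    from one reference simulation placed between them."""
    return zwanzig(delta_u_to_upper, kt) - zwanzig(delta_u_to_lower, kt)

# Synthetic Gaussian energy differences (hypothetical numbers, kcal/mol).
rng = random.Random(0)
to_upper = [rng.gauss(1.0, 0.5) for _ in range(100_000)]
to_lower = [rng.gauss(-0.8, 0.4) for _ in range(100_000)]

estimate = double_wide(to_lower, to_upper)
# For Gaussian delta_u, the exact result is mean - variance / (2 kT).
analytic = (1.0 - 0.5**2 / (2 * KT)) - (-0.8 - 0.4**2 / (2 * KT))
print(f"double-wide estimate: {estimate:.3f} kcal/mol, analytic: {analytic:.3f} kcal/mol")
```

The exponential average converges well here because the synthetic distributions are narrow relative to kT; for broad ∆U distributions, the estimate depends on how well the reference simulation samples the configurations important for the target states.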

Figure 18. ∆U distributions of all λ simulations for glycine annihilation by the GSC protocol: (a) the first-step transformation and (b) the second-step transformation.

In this section, I have introduced the GSC potentials, which not only avoid the end-point problems but also enable efficient estimation of annihilation free energies. Under the GSC implementation, the double-wide FEP and BAR methods are much more efficient than the TI method. The GSC protocol outperforms the CSC protocol mainly because of its extrapolability in the second transformation step, i.e., during the annihilation of the Gaussian repulsive potentials. Because the simulations near the end-point cover the majority of the configuration space, the GSC potentials can give accurate free energy estimates with very few λ points. Moreover, the GSC potentials are valid for a challenging case, i.e., the Tyr-to-pTyr mutation in aqueous solution and in protein, suggesting their general applicability to other protein systems. Overall, GSC is expected to play a positive role in future molecular and protein design.

6. Conclusions and Future Work

In this thesis, I present how molecular simulations are applied to detect the effects of A-loop phosphorylation on the different functional and conformational states in the kinase (IGF-1RK and IRK) catalytic cycle, i.e., conformational equilibrium, substrate binding, phosphoryl transfer and product release. By integrating the knowledge gained on kinase dynamics, structure, thermodynamics and free energy landscapes, the mechanism of the allosteric regulation of the entire kinase catalytic cycle by A-loop phosphorylation is understood in great detail, and a new model is presented for experimental validation. In the inactive kinase conformation, A-loop phosphorylation enhances the protein dynamics by introducing unfavorable electrostatic interactions of the A-loop tyrosines. The enhanced dynamics produces cumulative motions that are directed toward kinase activation. By contrast, in the active apo conformation, A-loop phosphorylation introduces multiple salt bridges, and thereby stabilizes the active conformation and restrains the overall motions of the kinase. My thermodynamic analysis shows that A-loop phosphorylation, as an allosteric effector, shifts the kinase population toward the active conformation by -13.2 kcal/mol, and thus significantly favors substrate binding.

In the reactant conformational state, A-loop phosphorylation fine-tunes the positions of the catalytic residues by coupling the αC-helix and A-loop dynamically and structurally, which is achieved through the pTyr1136-Arg1012 (IGF-1RK residue numbering) interaction. By comparison, A-loop phosphorylation does not produce obvious interactions coupling the αC-helix and A-loop in the product state. The reaction free energy is also shifted by 3.3 kcal/mol by A-loop phosphorylation, which thermodynamically facilitates product release. For the phosphoryl transfer step, which was studied by QM/MM simulations on IRK, the activation barrier is lowered by 2.0 kcal/mol, in agreement with the experimental data (2.2 kcal/mol). The kinetic gain from A-loop phosphorylation arises mainly from the proton transfer step rather than the γ-phosphate transfer step. The dynamics analysis suggests that the increase in protein conformational entropy contributes to the proton transfer in the fully phosphorylated kinase. Because the conserved interactions are shared by many human kinases, the mechanisms found in this thesis may be general and applicable to them.

The mechanisms proposed in this thesis are testable by experiments. All the proposals, e.g., the three allosteric modulations of IGF-1RK, are associated with key residues, which play important roles in regulating the structure or dynamics of the kinases. Mutagenesis studies on those residues offer a straightforward way to test the proposed ideas. Nevertheless, more effort (experimental or computational) is still needed to broaden our insight into the kinase (IGF-1RK or IRK) catalytic mechanisms. For instance, how does the kinase transform from its inactive conformation to the active one, i.e., what is the transition pathway? How does A-loop phosphorylation alter the transition pathways? How do kinase inhibitors regulate the conformational equilibrium at different phosphorylation levels? Studies on these questions would broaden our horizons on kinase catalytic and regulatory mechanisms and accelerate the development of new therapeutic approaches to treat related diseases.

7. Acknowledgements

Before coming here, I never imagined that there was such a nice country in the world as Sweden. The country, the people living here and the things I encountered totally changed my outlook. After these 5 years of study, I think I have come a bit closer to the person I would like to be, who is kind, brave, honest, persistent, independent, responsible, wise, humorous, optimistic and principled. I have met many people who have such qualities. They are excellent examples guiding the direction of my life.

First of all, I would like to thank my main supervisor, Kwangho Nam. You spent a great deal of time and effort guiding me to become a real scientist. You taught me how to think independently, how to correct myself and how to present my ideas clearly. After the past 5 years under your supervision, I think I am becoming an independent researcher. I firmly believe that continually improving myself along the lines you taught me will let me be successful in academia. I am your first PhD student and sometimes a willful one. Please don’t let me cast a shadow on your teaching career. There must be many excellent candidates waiting for you.

I’d like to thank my associate supervisor, Magnus Wolf-Watz. I admire your presentation skills. You can always explain a complex scientific topic in a logical and clear way. I remember you once asked me to summarize my whole seminar in three sentences. That time I failed, but you gave an excellent example, and I was inspired by it. Many thanks also for your help in the last phase of my PhD study. Thanks for the encouragement of your cartoon figure on my thesis manuscript.

Big thanks to the committee for my annual evaluation. Patrik Andersson, Per-Olof Westlund and Pernilla Wittung Stafshede spent time evaluating my performance in the first 3 years of my study. Your suggestions kept my study on the right track.

A special appreciation goes to Pernilla. Thanks for your encouragement when I felt hopeless about my academic performance. Our first collaborative paper made me feel that I was not that poor. That feeling saved me from giving up my academic career.

I give many thanks to my friend Jin Zhang. We not only drank together but also published together.

I thank Uwe Sauer and Elisabeth Sauer-Eriksson for providing the crystallization platform for my experiments in the THR project. I thank Uwe for teaching me basic crystallographic knowledge and operations. I also thank Roland Bergdahl and Per-Gustaf Norman. You guys really did me a favor when I knew nothing about crystallization. I am thankful to Pedro Ojeda May and Tatiana Shutova. You raised many valuable questions that improved my thinking in our internal group meetings. Many thanks also for reading my manuscripts.

I thank the GMPK (later GMRK) members, including Gerhard Gröbner, Ronnie Berntsson, Tobias Sparrman, Artur Dingeldein, Ilona Dudka, Jörgen Ådén, Johan Vestergren, Marcus Wallgren, Martin Lidman, Ngoc Oanh and Per Rogne. I learned a lot from you during the regular group meetings. Thank you for the excellent seminars.

I would like to express my appreciation to Gerhard and Tobias. I thank Gerhard for accepting my invitation to be the chairman of my dissertation defense, and Tobias for accepting my invitation to be the toastmaster of my dissertation party. I also thank Tobias for inviting me into your “innebandy” team. I like this Nordic sport very much.

I thank Maria Jansson for your kind support with the documentation. Your timely help greatly facilitated my graduation process.

I also thank my previous supervisor, Prof. Yun Tang. You guided me through my master’s study in China. Without that experience, I don’t think I would have qualified for the chance to pursue PhD study.

I thank my previous colleagues Jian Li, Zhongguo Chen, Zhenting Gao, Zhen Gong and Qiang Li at WuXi AppTec. With your help, I grew into a qualified computational chemist in industry. That experience always reminds me to think about how to transform a scientific finding into a benefit for mankind.

A special appreciation goes to Zhenting. At that time, you taught me new modeling techniques almost hand in hand, and made me realize that a real scientist should have a strong willingness to share knowledge.

A huge thanks to my good brothers in Sweden: Xianqiang Sun, You Xu and Guanglin Kuang. You shared lots of expertise with me. Your suggestions inspired the design of my projects. In our collaborative study, you contributed many good ideas. We also had a lot of fun together, and shared happiness and grief. Thanks a lot for your firm support in the past 5 years.

I also thank Docent Yaoquan Tu. When I visited your lab, you made such nice arrangements for us young people to communicate.

A big toast to Zhao Wang and Duanting Zhai. Thanks for sharing your experiences of studying, living, seeing movies and drinking. Your enlightening words at our weekend parties helped me a lot.

I thank Dan Adolfsson and Wenjia Wang for teaching me hiking and Swedish board games. I had a lot of fun with you. I deeply thank Bo Li for teaching me how to snowboard. I think it is the favorite sport of my life. I also thank you and Shiyu for inviting me to your weekend parties many times. Your company did me a favor in my hard times. Thanks for listening to my complaints about life and study.

I thank the members of the Chinese ski team: Hanqing Zhang, Nongfei Sheng and Da Wang. You guys freed me from the heavy work pressure. I remember every happy moment when we, the “Four kings of the Chinese ski”, skied (I snowboarded) together.

I would like to give a big hug to Qiuju Gao. Every time I fell into a low point in my life, you were always the first person to enlighten me. Your care for me and Chaojun was so meticulous that it even made me doubt whether I was an only child. If I had a lost elder sister, that must be you.

I also want to thank many of my friends: Tao Jiang, Wei Zhu, Yingying Liu, Mingquan Liu, Liangyu Qian, Weixing Qian, Juan Jiang, Xueen Jia, Xian Li, Chao Wang, Ziqing Kong, Genqiang Chen, Hilde Polo, Beatriz Galindo-Prieto and Yang Huang. You guys accompanied me through a beautiful period of my life in Sweden.

I’d like to take this chance to thank my dear mother and father. Thank you for giving me the life to enjoy many beautiful things. Thanks for all you did for me, bearing bitter hardships, in my first 25 years.

The last, but the most important, appreciation goes to my dear Chaojun. In fact, you were eligible to be an author on every paper I published. You took on almost all the household duties in the past 5 years, just to free me to focus on my studies. You even care about the publications more than I do. I recall that when I got a manuscript rejection from JACS, you cried over it for several days. I tried many ways to calm you down. Yes, you always have stronger confidence in me than I have myself. No magnificent words will ever be enough for you. It is time for me to fulfill some of your dreams.

8. References

1. Siddle, K., Signalling by insulin and IGF receptors: supporting acts and new players. J. Mol. Endocrinol. 2011, 47 (1), R1-10.

2. Li, R. S.; Pourpak, A.; Morris, S. W., Inhibition of the Insulin-like Growth Factor-1 Receptor (IGF1R) Tyrosine Kinase as a Novel Cancer Therapy Approach. J. Med. Chem. 2009, 52 (16), 4981-5004.

3. Wu, J. H.; Li, W.; Craddock, B. P.; Foreman, K. W.; Mulvihill, M. J.; Ji, Q. S.; Miller, W. T.; Hubbard, S. R., Small-molecule inhibition and activation-loop trans-phosphorylation of the IGF1 receptor. EMBO J. 2008, 27 (14), 1985-1994.

4. Weroha, S. J.; Haluska, P., IGF-1 receptor inhibitors in clinical trials--early lessons. J. Mammary Gland Biol. Neoplasia 2008, 13 (4), 471-83.

5. Mayer, S. C.; Banker, A. L.; Boschelli, F.; Di, L.; Johnson, M.; Kenny, C. H.; Krishnamurthy, G.; Kutterer, K.; Moy, F.; Petusky, S.; Ravi, M.; Tkach, D.; Tsou, H. R.; Xu, W., Lead identification to generate isoquinolinedione inhibitors of insulin-like growth factor receptor (IGF-1R) for potential use in cancer treatment. Bioorg. Med. Chem. Lett. 2008, 18 (12), 3641-5.

6. Bahr, C.; Groner, B., The insulin like growth factor-1 receptor (IGF-1R) as a drug target: novel approaches to cancer therapy. Growth Horm. IGF Res. 2004, 14 (4), 287-295.

7. O'Neill, C.; Kiely, A. P.; Coakley, M. F.; Manning, S.; Long-Smith, C. M., Insulin and IGF-1 signalling: longevity, protein homoeostasis and Alzheimer's disease. Biochem. Soc. Trans. 2012, 40, 721-727.

8. Bostrom, E. A.; Svensson, M.; Andersson, S.; Jonsson, I. M.; Ekwall, A. K. H.; Eisler, T.; Dahlberg, L. E.; Smith, U.; Bokarewa, M. I., Resistin and Insulin/Insulin-like Growth Factor Signaling in Rheumatoid Arthritis. Arthritis Rheum. 2011, 63 (10), 2894-2904.

9. Clemmons, D. R., The relative roles of growth hormone and IGF-1 in controlling insulin sensitivity. J. Clin. Invest. 2004, 113 (1), 25-7.

10. Hubbard, S. R., Juxtamembrane autoinhibition in receptor tyrosine kinases. Nat. Rev. Mol. Cell Biol. 2004, 5 (6), 464-71.

11. Hubbard, S. R.; Till, J. H., Protein tyrosine kinase structure and function. Annu. Rev. Biochem. 2000, 69, 373-98.

12. Favelyukis, S.; Till, J. H.; Hubbard, S. R.; Miller, W. T., Structure and autoregulation of the insulin-like growth factor 1 receptor kinase. Nat. Struct. Biol. 2001, 8 (12), 1058-1063.

13. Ablooglu, A. J.; Kohanski, R. A., Activation of the insulin receptor's kinase domain changes the rate-determining step of substrate phosphorylation. Biochemistry 2001, 40 (2), 504-513.

14. Ojeda-May, P.; Li, Y.; Ovchinnikov, V.; Nam, K., Role of Protein Dynamics in Allosteric Control of the Catalytic Phosphoryl Transfer of Insulin Receptor Kinase. J. Am. Chem. Soc. 2015, 137 (39), 12454-7.

15. Meng, Y.; Roux, B., Locking the active conformation of c-Src kinase through the phosphorylation of the activation loop. J. Mol. Biol. 2014, 426 (2), 423-35.

16. Foda, Z. H.; Shan, Y. B.; Kim, E. T.; Shaw, D. E.; Seeliger, M. A., A dynamically coupled allosteric network underlies binding cooperativity in Src kinase. Nat. Commun. 2015, 6.

17. Tokunaga, Y.; Takeuchi, K.; Takahashi, H.; Shimada, I., Allosteric enhancement of MAP kinase p38 alpha's activity and substrate selectivity by docking interactions. Nat. Struct. Mol. Biol. 2014, 21 (8), 704-711.

18. Masterson, L. R.; Cheng, C.; Yu, T.; Tonelli, M.; Kornev, A.; Taylor, S. S.; Veglia, G., Dynamics connect substrate recognition to catalysis in protein kinase A. Nat. Chem. Biol. 2010, 6 (11), 821-8.

19. Hyeon, C.; Jennings, P. A.; Adams, J. A.; Onuchic, J. N., Ligand-induced global transitions in the catalytic domain of protein kinase A. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (9), 3023-8.

20. Young, M. A.; Gonfloni, S.; Superti-Furga, G.; Roux, B.; Kuriyan, J., Dynamic coupling between the SH2 and SH3 domains of c-Src and hck underlies their inactivation by C-terminal tyrosine phosphorylation. Cell 2001, 105 (1), 115-126.

21. Jambrina, P. G.; Rauch, N.; Pilkington, R.; Rybakova, K.; Nguyen, L. K.; Kholodenko, B. N.; Buchete, N. V.; Kolch, W.; Rosta, E., Phosphorylation of RAF Kinase Dimers Drives Conformational Changes that Facilitate Transactivation. Angew. Chem. Int. Ed. Engl. 2016, 55 (3), 983-6.

22. Damle, N. P.; Mohanty, D., Mechanism of autophosphorylation of mycobacterial PknB explored by molecular dynamics simulations. Biochemistry 2014, 53 (28), 4715-26.

23. Shan, Y.; Eastwood, M. P.; Zhang, X.; Kim, E. T.; Arkhipov, A.; Dror, R. O.; Jumper, J.; Kuriyan, J.; Shaw, D. E., Oncogenic mutations counteract intrinsic disorder in the EGFR kinase and promote receptor dimerization. Cell 2012, 149 (4), 860-70.

24. Groban, E. S.; Narayanan, A.; Jacobson, M. P., Conformational changes in protein loops and helices induced by post-translational phosphorylation. Plos Comput. Biol. 2006, 2 (4), 238-250.

25. Li, Y.; Nam, K., Dynamic, structural and thermodynamic basis of insulin-like growth factor 1 kinase allostery mediated by activation loop phosphorylation. Chem. Sci. 2017, 8 (5), 3453-3464.

26. Dror, R. O.; Dirks, R. M.; Grossman, J. P.; Xu, H. F.; Shaw, D. E., Biomolecular Simulation: A Computational Microscope for Molecular Biology. Annu. Rev. Biophys. 2012, 41, 429-452.

27. Brooks, B. R.; Brooks, C. L.; Mackerell, A. D.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; Caflisch, A.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, M.; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M., CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30 (10), 1545-1614.

28. Verlet, L., Computer Experiments on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. Phys. Rev. 1967, 159 (1), 98-103.

29. Snyman, J. A., A New and Dynamic Method for Unconstrained Minimization. Appl. Math. Model. 1982, 6 (6), 449-462.

30. Mackerell, A. D., Jr.; Feig, M.; Brooks, C. L., 3rd, Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J. Comput. Chem. 2004, 25 (11), 1400-15.

31. MacKerell, A. D.; Feig, M.; Brooks, C. L., Improved treatment of the protein backbone in empirical force fields. J. Am. Chem. Soc. 2004, 126 (3), 698-699.

32. Foloppe, N.; MacKerell, A. D., All-atom empirical force field for nucleic acids: I. Parameter optimization based on small molecule and condensed phase macromolecular target data. J. Comput. Chem. 2000, 21 (2), 86-104.

33. Steinbach, P. J.; Brooks, B. R., New Spherical-Cutoff Methods for Long-Range Forces in Macromolecular Simulation. J. Comput. Chem. 1994, 15 (7), 667-683.

34. Bogusz, S.; Cheatham, T. E.; Brooks, B. R., Removal of pressure and free energy artifacts in charged periodic systems via net charge corrections to the Ewald potential. J. Chem. Phys. 1998, 108 (17), 7070-7084.

35. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G., A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103 (19), 8577-8593.

36. Darden, T.; York, D.; Pedersen, L., Particle Mesh Ewald - an N·log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98 (12), 10089-10092.

37. Nose, S., A Molecular-Dynamics Method for Simulations in the Canonical Ensemble. Mol. Phys. 1984, 52 (2), 255-268.

38. Hoover, W. G., Canonical Dynamics - Equilibrium Phase-Space Distributions. Phys. Rev. A 1985, 31 (3), 1695-1697.

39. Grest, G. S.; Kremer, K., Molecular-Dynamics Simulation for Polymers in the Presence of a Heat Bath. Phys. Rev. A 1986, 33 (5), 3628-3631.

40. Feller, S. E.; Zhang, Y. H.; Pastor, R. W.; Brooks, B. R., Constant-Pressure Molecular-Dynamics Simulation - the Langevin Piston Method. J. Chem. Phys. 1995, 103 (11), 4613-4621.

41. Martyna, G. J.; Tobias, D. J.; Klein, M. L., Constant-Pressure Molecular-Dynamics Algorithms. J. Chem. Phys. 1994, 101 (5), 4177-4189.

42. Bernardi, R. C.; Melo, M. C. R.; Schulten, K., Enhanced sampling techniques in molecular dynamics simulations of biological systems. Biochim. Biophys. Acta, Gen. Subj. 2015, 1850 (5), 872-877.

43. Hamelberg, D.; Mongan, J.; McCammon, J. A., Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120 (24), 11919-11929.

44. Hamelberg, D.; de Oliveira, C. A. F.; McCammon, J. A., Sampling of slow diffusive conformational transitions with accelerated molecular dynamics. J. Chem. Phys. 2007, 127 (15), 155102.

45. Kollman, P., Free-Energy Calculations - Applications to Chemical and Biochemical Phenomena. Chem. Rev. 1993, 93 (7), 2395-2417.

46. Aldeghi, M.; Heifetz, A.; Bodkin, M. J.; Knapp, S.; Biggin, P. C., Accurate calculation of the absolute free energy of binding for drug molecules. Chem. Sci. 2016, 7 (1), 207-218.

47. Klimovich, P. V.; Shirts, M. R.; Mobley, D. L., Guidelines for the analysis of free energy calculations. J. Comput.-Aided Mol. Des. 2015, 29 (5), 397-411.

48. Hansen, N.; van Gunsteren, W. F., Practical Aspects of Free-Energy Calculations: A Review. J. Chem. Theory Comput. 2014, 10 (7), 2632-2647.

49. Mobley, D. L.; Klimovich, P. V., Perspective: Alchemical free energy calculations for drug discovery. J. Chem. Phys. 2012, 137 (23).

50. Seeliger, D.; de Groot, B. L., Protein Thermostability Calculations Using Alchemical Free Energy Simulations. Biophys. J. 2010, 98 (10), 2309-2316.

51. Pohorille, A.; Jarzynski, C.; Chipot, C., Good Practices in Free-Energy Calculations. J. Phys. Chem. B 2010, 114 (32), 10235-10253.

52. Christ, C. D.; Mark, A. E.; van Gunsteren, W. F., Basic Ingredients of Free Energy Calculations: A Review. J. Comput. Chem. 2010, 31 (8), 1569-1582.

53. Zwanzig, R. W., High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22 (8), 1420-1426.

54. Hummer, G.; Szabo, A., Calculation of free-energy differences from computer simulations of initial and final states. J. Chem. Phys. 1996, 105 (5), 2004-2010.

55. Bennett, C. H., Efficient Estimation of Free-Energy Differences from Monte-Carlo Data. J. Comput. Phys. 1976, 22 (2), 245-268.

56. Beveridge, D. L.; DiCapua, F. M., Free-Energy Via Molecular Simulation - Applications to Chemical and Biomolecular Systems. Annu. Rev. Biophys. Biophys. Chem. 1989, 18, 431-492.

57. Zacharias, M.; Straatsma, T. P.; McCammon, J. A., Separation-Shifted Scaling, a New Scaling Method for Lennard-Jones Interactions in Thermodynamic Integration. J. Chem. Phys. 1994, 100 (12), 9025-9031.

58. Beutler, T. C.; Mark, A. E.; van Schaik, R. C.; Gerber, P. R.; van Gunsteren, W. F., Avoiding Singularities and Numerical Instabilities in Free-Energy Calculations Based on Molecular Simulations. Chem. Phys. Lett. 1994, 222 (6), 529-539.

59. Brooks, B. R.; Janezic, D.; Karplus, M., Harmonic-Analysis of Large Systems. I. Methodology. J. Comput. Chem. 1995, 16 (12), 1522-1542.

60. Marques, O.; Sanejouand, Y. H., Hinge-bending motion in citrate synthase arising from normal mode calculations. Proteins 1995, 23 (4), 557-560.

61. Sethi, A.; Eargle, J.; Black, A. A.; Luthey-Schulten, Z., Dynamical networks in tRNA: protein complexes. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (16), 6620-6625.

62. Singh, P.; Alex, J. M.; Bast, F., Insulin receptor (IR) and insulin-like growth factor receptor 1 (IGF-1R) signaling systems: novel treatment strategies for cancer. Med. Oncol. 2014, 31 (1).

63. Munshi, S.; Hall, D. L.; Kornienko, M.; Darke, P. L.; Kuo, L. C., Structure of apo, unactivated insulin-like growth factor-1 receptor kinase at 1.5 angstrom resolution. Acta Crystallogr. D Biol. Crystallogr. 2003, 59, 1725-1730.

64. Tsai, C. J.; Nussinov, R., A Unified View of "How Allostery Works". Plos Comput. Biol. 2014, 10 (2).

65. Sampognaro, A. J.; Wittman, M. D.; Carboni, J. M.; Chang, C.; Greer, A. F.; Hurlburt, W. W.; Sack, J. S.; Vyas, D. M., Proline isosteres in a series of 2,4-disubstituted pyrrolo[1,2-f][1,2,4]triazine inhibitors of IGF-1R kinase and IR kinase. Bioorg. Med. Chem. Lett. 2010, 20 (17), 5027-30.

66. Hubbard, S. R., Crystal structure of the activated insulin receptor tyrosine kinase in complex with peptide substrate and ATP analog. EMBO J. 1997, 16 (18), 5572-5581.

67. Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C., Numerical-Integration of Cartesian Equations of Motion of a System with Constraints - Molecular-Dynamics of N-Alkanes. J. Comput. Phys. 1977, 23 (3), 327-341.

68. Feller, S. E.; Zhang, Y. H.; Pastor, R. W.; Brooks, B. R., Constant-Pressure Molecular-Dynamics Simulation - the Langevin Piston Method. J. Chem. Phys. 1995, 103 (11), 4613-4621.

69. Martyna, G. J.; Tobias, D. J.; Klein, M. L., Constant-Pressure Molecular-Dynamics Algorithms. J. Chem. Phys. 1994, 101 (5), 4177-4189.

70. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G., A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103 (19), 8577-8593.

71. Hamelberg, D.; de Oliveira, C. A.; McCammon, J. A., Sampling of slow diffusive conformational transitions with accelerated molecular dynamics. J. Chem. Phys. 2007, 127 (15), 155102.

72. Miao, Y. L.; Nichols, S. E.; Gasper, P. M.; Metzger, V. T.; McCammon, J. A., Activation and dynamic network of the M2 muscarinic receptor. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (27), 10982-10987.

73. Miao, Y. L.; Nichols, S. E.; McCammon, J. A., Free energy landscape of G-protein coupled receptors, explored by accelerated molecular dynamics. Phys. Chem. Chem. Phys. 2014, 16 (14), 6398-6406.

74. Pyrkosz, A. B.; Eargle, J.; Sethi, A.; Luthey-Schulten, Z., Exit Strategies for Charged tRNA from GluRS. J. Mol. Biol. 2010, 397 (5), 1350-1371.

75. Cormen, T. H., Introduction to algorithms. 2nd ed.; MIT Press: Cambridge, Mass., 2001; p xxi, 1180.

76. Floyd, R. W., Algorithm 97: Shortest Path. Commun. ACM 1962, 5 (6), 345.

77. Newman, M. E. J., Analysis of weighted networks. Phys. Rev. E 2004, 70 (5), 056131.

78. Girvan, M.; Newman, M. E. J., Community structure in social and biological networks. Proc. Natl. Acad. Sci. U. S. A. 2002, 99 (12), 7821-7826.

79. Newman, M. E. J.; Girvan, M., Finding and evaluating community structure in networks. Phys. Rev. E 2004, 69 (2), 026113.

80. Lee, J.; Hendrickson, T. L., Divergent anticodon recognition in contrasting glutamyl-tRNA synthetases. J. Mol. Biol. 2004, 344 (5), 1167-1174.

81. Glykos, N. M., Software news and updates - Carma: A molecular dynamics analysis program. J. Comput. Chem. 2006, 27 (14), 1765-1768.

82. Humphrey, W.; Dalke, A.; Schulten, K., VMD: Visual molecular dynamics. J. Mol. Graph. Model. 1996, 14 (1), 33-38.

83. Lin, Y. L.; Aleksandrov, A.; Simonson, T.; Roux, B., An Overview of Electrostatic Free Energy Computations for Solutions and Proteins. J. Chem. Theory Comput. 2014, 10 (7), 2690-2709.

84. Morgan, B. R.; Massi, F., Accurate Estimates of Free Energy Changes in Charge Mutations. J. Chem. Theory Comput. 2010, 6 (6), 1884-1893.

85. Boyce, S. E.; Mobley, D. L.; Rocklin, G. J.; Graves, A. P.; Dill, K. A.; Shoichet, B. K., Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 2009, 394 (4), 747-63.

86. Karplus, M.; Kuriyan, J., Molecular dynamics and protein function. Proc. Natl. Acad. Sci. U. S. A. 2005, 102 (19), 6679-6685.

87. Kovermann, M.; Aden, J.; Grundstrom, C.; Sauer-Eriksson, A. E.; Sauer, U. H.; Wolf-Watz, M., Structural basis for catalytically restrictive dynamics of a high-energy enzyme state. Nat. Commun. 2015, 6.

88. Henzler-Wildman, K. A.; Thai, V.; Lei, M.; Ott, M.; Wolf-Watz, M.; Fenn, T.; Pozharski, E.; Wilson, M. A.; Petsko, G. A.; Karplus, M.; Hubner, C. G.; Kern, D., Intrinsic motions along an enzymatic reaction trajectory. Nature 2007, 450 (7171), 838-844.

89. McClendon, C. L.; Kornev, A. P.; Gilson, M. K.; Taylor, S. S., Dynamic architecture of a protein kinase (vol 111, pg E4623, 2014). Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (47), 16973-16973.

90. Park, J.; McDonald, J. J.; Petter, R. C.; Houk, K. N., Molecular Dynamics Analysis of Binding of Kinase Inhibitors to WT EGFR and the T790M Mutant. J. Chem. Theory Comput. 2016, 12 (4), 2066-78.

91. Shukla, D.; Meng, Y. L.; Roux, B.; Pande, V. S., Activation pathway of Src kinase reveals intermediate states as targets for drug design. Nat. Commun. 2014, 5.

92. Berteotti, A.; Cavalli, A.; Branduardi, D.; Gervasio, F. L.; Recanatini, M.; Parrinello, M., Protein Conformational Transitions: The Closure Mechanism of a Kinase Explored by Atomistic Simulations. J. Am. Chem. Soc. 2009, 131 (1), 244-250.

93. Li, W. Q.; Miller, W. T., Role of the activation loop tyrosines in regulation of the insulin-like growth factor I receptor-tyrosine kinase. J. Biol. Chem. 2006, 281 (33), 23785-23791.