Computational Microscopy: Revealing Molecular Mechanisms in Plants Using Molecular Dynamics Simulations - Teaching Guide
Jiangyan Feng+, Jiming Chen+, Balaji Selvam+ and Diwakar Shukla*
Department of Chemical & Biomolecular Engineering, Plant Biology, University of Illinois at Urbana-Champaign.
+These authors contributed equally to this work.


Overview
Molecular dynamics (MD) simulations provide a detailed view of biological processes at the atomic level. They have been successfully applied to predict protein structures, dynamics and functions, and to design drugs. Despite the increasing amount of sequence and structural information, very little is known about protein dynamics, which govern protein function. This teaching tool introduces MD simulations to the plant community. It contains six sections that aim to answer the following questions about molecular simulations: (i) Why use MD? (ii) What is the future of MD in plant biology? (iii) What are the basic principles of MD? (iv) What are some key achievements of MD? (v) How has MD been applied to plant proteins? (vi) How can MD simulations and experiments be integrated? This teaching tool also includes two tutorials on how to run MD simulations using the QwikMD software and how to analyze MD simulations using the Visual Molecular Dynamics (VMD) software.

Use of this Material
This teaching tool is appropriate for use with a plant molecular biology course for upper-level undergraduates. Prior knowledge of basic protein structure and function is helpful but not required, as this teaching tool includes material on the basics of protein structure. Basic knowledge of physics (i.e., a high school or freshman-level physics class) is useful for a full grasp of the section on the inner workings of MD. The remaining sections assume an early undergraduate-level biology background. The lecture slides should take approximately 4-6 hours of lecture time to go through.

Tutorial Use: In addition to the lecture material, we have included hands-on tutorials to guide students through setting up, running, and analyzing MD data. “Tutorial 0: Installing Required Software” contains instructions on how to download the necessary software on Windows, Mac and Linux platforms. “Tutorial 1: Preparing and Running a Simulation” shows students how to prepare and run a molecular dynamics simulation using the QwikMD software. Finally, “Tutorial 2: Watching a Protein in Action” shows students how to analyze MD data that we provide. Together, Tutorials 1 and 2 are intended to demonstrate the process of performing an MD study, from system preparation to analysis of results. Tutorials 1 and 2 are independent and thus can be performed in any order. For students with prior knowledge of molecular biology and protein structure, we recommend working through Tutorial 1 before Tutorial 2 to mimic the actual workflow of an MD study. For students with less prior knowledge of molecular biology and protein structure, we recommend working through Tutorial 2 before Tutorial 1. This will allow students to gain some familiarity with looking at protein structures prior to setting up their own simulations. The estimated time requirements for the tutorials are ∼15 minutes for Tutorial 0, 1 hour of setup plus 3-4 hours of run time for Tutorial 1, and 1.5 hours for Tutorial 2. Due to the amount of time required for each tutorial, we recommend that students work on them individually or in small groups outside of class time.

Learning Objectives
By the end of this lesson, the student should be able to answer the following questions:

1. Why use molecular dynamics (MD) simulations?

2. What is all-atom MD?

3. How does MD work and what is the protocol to set up an MD simulation?

4. How can MD complement the study of plant proteins?

5. What is the advantage of integrating MD and experiments?

6. How can we engineer plant proteins using MD?

7. How can MD make an impact in the future of plant biology?

8. How do we run MD simulations using QwikMD?

9. How do we analyze MD output using VMD?


Study/exam Questions (understanding and comprehension)
Quiz 1: Why use molecular dynamics (MD) simulations?

1. What are amino acids and how do they form proteins?

2. What is the central dogma of life?

3. Why do proteins function like machines?

4. What are the limitations of experiments?

5. Why do we need MD to complement experiments?

6. How are proteins made, and how do you classify levels of protein structure?

7. Why do we study plant proteins?

Quiz 2: Future of MD in plant biology

1. How can MD simulations help solve global issues?

2. List two projects that are aimed at engineering high yield crops.

Quiz 3: All-atom MD: The computational microscope

1. What is the principle behind MD and why do we need MD?

2. How does MD calculate forces and how is the data stored?

3. What kind of information do we get from MD?

4. How do we set up an MD simulation and why is the starting structure important for MD?

5. How do we solve a protein structure and where can you find solved protein structures?

6. List some classical MD programs.

7. How can you gain biological insights of proteins using MD data?

8. What are advantages and limitations of MD?

9. Why do we use QwikMD and what is the best feature of QwikMD?

10. How do you visualize proteins and MD trajectory data?

Quiz 4: Scientific successes of MD

1. Discuss the evolution of MD.

2. Name three scientists involved in the discovery and development of MD methods and their contributions.

3. What are the timescales of biologically relevant protein dynamics?

4. How can MD scaling be increased using supercomputers?

5. List some of the protein folding simulations performed by different research groups. How well do they match with experiments?

6. Why do proteins adopt different shapes and structures?

7. What is a ligand and what is the importance of ligands in terms of protein function?

Quiz 5: Applications of MD to plant proteins

1. How will MD complement the experimental study of plant proteins?

2. Which plant proteins have been studied using MD?

3. Describe the functions of the above proteins.

Quiz 6: Complement MD and experiments

1. What is the relationship between simulations and experiments?

2. What is the advantage of combining experiments and simulations in terms of understanding complex functional dynamics of plant proteins?

3. List one example of simulation-guided experimental design.

4. List one example of experiment-guided MD simulation.


Discussion Questions (engagement and connections)
Part 1: Why use molecular dynamics (MD) simulations?

1. How are amino acids classified based on their properties?

2. What are essential and nonessential amino acids?

3. How are amino acids linked together to form a long chain?

4. Why are MD studies of plant proteins largely underdeveloped compared to those of human protein targets?

5. Explain how MD is a successful complement to experimental studies.

Part 2: All-atom MD: The computational microscope

1. What are the common experimental techniques to solve protein structures?

2. What are the inputs needed for MD?

3. How can we obtain biological insights from MD results?

4. How do we set up an MD simulation using QwikMD?

5. How do we analyze MD output using VMD?

Part 3: Scientific successes of MD

1. What are some example applications of MD?

2. Does protein size limit MD simulations?

Part 4: Applications of MD to plant proteins

1. Describe the advantages of using MD to study plant proteins.

2. List the recent MD studies on plant proteins.

Part 5: Complement MD and experiments

1. List the recent publications that combine MD and experiments.

2. What are remaining challenges in combining simulation and experiments?

Part 6: Future of MD in plant biology

How can MD help solve the future challenges in plant biology?


Lecture Synopsis
Why use molecular dynamics (MD) simulations? (3-19)
For more details, see Lecture Notes, sections 1 and 3
Proteins are essential biological molecules that perform various physiological functions in animals and plants. They are synthesized by transcribing DNA to RNA and translating RNA to protein. Amino acids are bonded together to form a long polypeptide chain, which then folds to perform its function. Proteins are small, and their three-dimensional structures are obtained using experimental techniques such as X-ray crystallography, Cryo-electron microscopy (Cryo-EM) and Nuclear Magnetic Resonance (NMR). Proteins are dynamic entities that frequently change shape to perform various functions. Therefore, it is extremely difficult to understand the molecular mechanism of protein function from a single static structure.

The need for Molecular Dynamics simulations in plant biology (20-27)
For more details, see Lecture Notes, section 2
Climate change and global warming pose a major threat to current agricultural practices. The projected global population and future food demand show that we need to double crop productivity by 2050. Drought threatens agricultural production and severely reduces crop yield. Additionally, extensive use of fertilizers to increase crop production results in environmental pollution. Multiscale modeling might help us engineer plant proteins to increase crop yield and improve food security.

All-atom MD: The computational microscope (28-55)
For more details, see Lecture Notes, section 4
All-atom MD provides molecular-level details of the time-dependent motions of a protein. The MD algorithm is based on Newton’s second law of motion. At each time step, MD numerically integrates Newton’s second law to update the positions and velocities of all the atoms. This allows MD to produce a continuous series of snapshots of biological processes. The energy terms in simulations are calculated using a force field that typically includes covalent bond stretching, bond angle bending, dihedral angle twisting, Coulombic interactions and van der Waals interactions. This Teaching Tool provides two tutorials, which explain the use of QwikMD to set up and run MD simulations and the analysis of MD output using VMD, respectively.
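The time-stepping idea in this section can be illustrated with a minimal Python sketch. It is not the algorithm of any particular MD program. It assumes the widely used velocity Verlet integration scheme, a single harmonic bond term standing in for a full force field, two particles moving along a line, and arbitrary reduced units. All function names and parameter values (k, r0, dt, the masses) are hypothetical choices made for illustration.

```python
# Toy MD: two particles on a line joined by one harmonic bond.
# Assumptions (not from the teaching guide): velocity Verlet integrator,
# reduced units, and illustrative values for k, r0, dt and the masses.

def bond_force(x, k=100.0, r0=1.0):
    """Forces from the bond energy U = 0.5 * k * (r - r0)**2."""
    r = x[1] - x[0]
    f = -k * (r - r0)      # force on particle 1
    return [-f, f]         # particle 0 feels the equal and opposite force

def bond_energy(x, k=100.0, r0=1.0):
    """Potential energy stored in the stretched or compressed bond."""
    r = x[1] - x[0]
    return 0.5 * k * (r - r0) ** 2

def kinetic_energy(v, m):
    """Sum of 0.5 * m * v**2 over all particles."""
    return sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))

def velocity_verlet(x, v, m, dt, n_steps):
    """Integrate Newton's second law (a = F / m) for n_steps time steps."""
    x, v = list(x), list(v)
    traj = [list(x)]                   # stored snapshots of the positions
    f = bond_force(x)
    for _ in range(n_steps):
        # half-step velocity update using the current forces
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
        # full-step position update
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # recompute forces at the new positions
        f = bond_force(x)
        # second half-step velocity update using the new forces
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
        traj.append(list(x))
    return traj, v

x0 = [0.0, 1.2]            # bond starts stretched by 0.2 beyond r0
v0 = [0.0, 0.0]            # particles start at rest
m = [1.0, 1.0]
traj, v_end = velocity_verlet(x0, v0, m, dt=0.001, n_steps=2000)

e_start = bond_energy(x0) + kinetic_energy(v0, m)
e_end = bond_energy(traj[-1]) + kinetic_energy(v_end, m)
```

Because the only force here is an internal bond, the total energy (e_start versus e_end) should stay nearly constant and the midpoint of the two particles should not drift. Checking both is a common sanity test for an integrator. A real force field adds the angle, dihedral, Coulombic and van der Waals terms to the force calculation, but the loop keeps the same shape.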

Scientific successes of MD (56-70)
For more details, see Lecture Notes, section 5
MD allows us to examine, in an approximation to reality, biological processes that involve slow molecular motions, which are difficult to capture with traditional experimental procedures. Hardware developments (GPUs) and parallel computational resources such as Blue Waters allow simulations to be run for long timescales to capture the molecular mechanisms of complex processes, for example protein folding, conformational changes such as activation and signaling mechanisms, substrate transport in biological systems, and molecular recognition and drug design.

Applications of MD to plant proteins (71-77)
For more details, see Lecture Notes, section 6
Compared with their mammalian counterparts, very little structural information is available for plant proteins. MD can be used as an alternative approach to understand the structure and biologically relevant dynamics of plant proteins. In this Teaching Tool, we discuss three important classes of plant proteins and show how dynamics play a crucial role in understanding the molecular mechanisms of these proteins. By capturing the essential dynamics, one could engineer these proteins to improve their functions, which could result in increased plant growth and development.

Complement MD and experiments (78-85)
For more details, see Lecture Notes, section 7
Combining biochemical experiments and MD could lead to a better understanding of experimental observables, or to enhanced MD sampling through the choice of optimal collective variables. We also developed a computational platform that predicts the optimal probe placement for residue-labelling techniques such as DEER spectroscopy, FRET, LRET and TTET. This computational method selects the best choices from the MD dataset and guides experimentalists toward experiments that yield a better understanding of biological processes.

Suggested reading and online resources (86-87)
We introduce classic MD textbooks and online resources for further reading.


Slide Concepts: Lecture Slides

Slides Table of contents/concepts
1 Title
2 Overview and outline
3-19 Part 1: Why use molecular dynamics (MD) simulations?
4-7 What are proteins?
5 “Levels” of protein structure
6 The central dogma of biology
7 Proteins: “Workhorse” molecules of plant life
8-11 Why is atomic-level protein structure important?
9 Proteins are tiny
10 The sequence-to-structure-to-function paradigm
11 Structure determines function
12-16 How can MD complement experiments?
13 Structural information is missing
14 Proteins are dynamic
15 Experiments only provide static snapshots
16 How MD helps?
17 Part 1: Recap
18 Part 1: Discussion
20-27 Part 2: The need for Molecular Dynamics in plant biology
21-23 How can we feed a hot and hungry world?
24 What drives crop demand?
25 What threatens food security?
26 Crops in silico
27 RIPE: Engineering photosynthesis
28 Part 2: Recap
29 Part 2: Discussion
28-55 Part 3: All-atom MD: The computational microscope
29-31 What is all-atom MD?
30 What is all-atom?
31 What is MD?
32-44 How does MD work?
32 The basic algorithm
33 MD setup, not that different from real experiment
34-37 Input 1: Initial protein coordinates
35 Solving structures by X-ray crystallography
36 Solving structures by Cryo-EM
37 X-ray crystallography vs. Cryo-EM
38-43 Input 2: Force field
39 Bond length stretching
40 Bond angle bending
41 Dihedral angle twisting
42 Electrostatic interactions
43 Van der Waals interactions
44 Periodic boundary conditions (PBC)
45-51 Workflow for a typical MD simulation
46 Setting up a simulation is like cooking
47 Some classical MD programs
48 Sample applications of MD
49 Advantage: Spatial and temporal resolutions
50 Limitation 1: Timescale
51 Limitation 2: Force field inaccuracy
52 QwikMD: Gateway to easy simulation
Appendix: QwikMD tutorial
53 VMD: “Visual Molecular Dynamics”
Appendix: VMD tutorial
54 Part 3: Recap
55 Part 3: Discussion
56-70 Part 4: Scientific successes of MD
57-61 History and evolution: The race for longer and larger simulations
58 Short history of MD
59-61 All-atom MD today
59 Larger systems
60 Longer simulations
61 Massive parallel computer Blue Waters
61 Growth of MD studies
62-64 Protein folding
62 What are protein folding questions?
63 2011, 12 fast-folding proteins
64 MD predicted vs. experimental folding
65-66 Ligand binding
65 What is ligand binding and why is it relevant?
66 Case study: How does a drug molecule find its target binding site?
67-68 Protein conformational change
67 What is conformational change and why is it relevant?
68 Case study: MD reveals the whole transport cycle of PepTso
69 Part 4: Recap
70 Part 4: Discussion
71-77 Part 5: Applications of MD to plant proteins
72-73 Case study 1: Investigate the conformational dynamics of plant protein kinases (BRI1/BAK1)
74 Case study 2: MD describes the complete ABA recognition pathways
75 Case study 3: How does a SWEET transporter transport glucose?
76 Part 5: Recap
77 Part 5: Discussion
78-84 Part 6: Complement MD and experiments
79 Simulations and experiments are complementary
80-81 Simulations + experiments: A clearer picture
81 Case study 1: Augment simulations with experiments
82 Case study 2: Guide mutagenesis with simulations
83 Part 6: Recap
84 Part 6: Discussion
85 Suggested reading
86 Online resources
86 Acknowledgments


Slide Concepts: Tutorial Slides

Slides Table of contents/concepts
1 Title
2-20 Hands-on activity 1: Running the first MD simulation using QwikMD
2 Learning objectives
3 Example system: Stomagen
Step 1. Open QwikMD
Step 2. Load PDB file
Step 3. Manipulate structure
Step 4. Solvate your system
Step 4a. Prepare an implicit solvent system
Step 4b. Prepare an explicit solvent system
Step 5. Set simulation protocol
Step 6. Prepare simulation files
Step 7a. Run your simulation inside of QwikMD
Step 7b. Run your simulation outside of QwikMD
Step 8. Check your output files
Step 9a. Analyze your data: Basic analysis
Step 9b. Analyze your data: Advanced analysis
Step 10. Tackling scientific problems: Performing mutational studies in QwikMD
20 Hands-on activity 1: Wrap up
21-39 Hands-on activity 2: Analyzing MD output using Visual Molecular Dynamics (VMD)
21 Learning objectives
22 Example system: PYL5
23 Getting started
Load your trajectories
3D molecular visualization
RMSD Trajectory Tool
Label atoms
Label the bond between atoms
Edit label appearance
Plot distance change along trajectory
Visualize protein conformational changes
Visualize ABA binding pathway
39 Hands-on activity 2: Wrap up
