Computers in Chemistry Dr John Mitchell & Rosanna Alderson University of St Andrews.

Transcript of Computers in Chemistry Dr John Mitchell & Rosanna Alderson University of St Andrews.

  • Slide 1

Computers in Chemistry. Dr John Mitchell & Rosanna Alderson, University of St Andrews.

  • Slide 2

1. Why? Working with experiment to test our theories. The computer uses theory to calculate an answer that can be compared with experiment. If prediction and experiment don't agree, something has to give.

  • Slide 3

To Test Our Theories. The theory that lies beneath chemistry is ultimately quantum physics. Turning this into a prediction of the rate of a chemical reaction or the frequency of a transition in an IR spectrum needs lots of computation. Example: quantum chemistry predicts that atoms in molecules are not spherical.

  • Slide 4

Atoms in molecules are not spherical.

  • Slide 5

To Make Testable Predictions. Computation's ability to make accurate predictions of experimental measurements is a good test of the validity of a theory. We only understand if we can predict.

  • Slide 6

Crystal Structure Prediction. Given the structural diagram of an organic molecule, predict the 3D crystal structure.

  • Slide 7

Calculate Energy of Infinite Crystal. Calculate molecular energies and interactions. Allow unit cell to change. Optimise size, shape, packing. Find energy of infinite lattice. Find lattice with best energy. Predicted crystal structure.

  • Slide 8

To Analyse Experimental Results. Modern experimental techniques (NMR, mass-spec, X-ray crystallography etc.) are complex and work best if analysis of the results is done by computer. This both speeds up the process and lessens the risk of human bias in analysis of data.

  • Slide 9

Looking at Molecules: Experimental Analysis.

Spectroscopic Data Analysis: the measurement of radiation intensity as a function of wavelength. Gives an indication of synthesis success and overall structure.

X-Ray Crystallography: the science that examines the arrangement of atoms in solids. Gives a 3D structure. Allows confirmation of molecular arrangement and indicates interactions within a crystal.

No indication of the underlying theoretical physics!
  • Slide 10

To Access Data that Experiment Can't. Computational chemistry provides a means to obtain data that are very difficult, expensive or time-consuming to get experimentally. Behaviour at high temperature or pressure. Structure of liquids at atomic scale. Dynamics of proteins.

  • Slide 11

Phase Changes of Iron in the Earth's Core et al.,

  • Slide 12

Structure of Liquid Water and Water Clusters. Computer simulations are an important source of evidence, since atomic scale details of an irregular structure are hard to obtain by experiment.

  • Slide 13

2. The Power to Compute

  • Slide 14

Development of Computer Power. University of Manchester SSEM, 1948.

  • Slide 15

Development of Computer Power. IBM Roadrunner, 2008.

  • Slide 16

Computer Power: Moore's Law. Computer power doubles every two years: exponential growth.

  • Slide 17

Computer Power: Moore's Law. Logarithmic scale.

  • Slide 18

Computer Power: Moore's Law. This growth will, eventually, slow down as components reach atomic scale, we think!

  • Slide 19

The Size of the Problem

  • Slide 20

Scaling of the Expense of Computation. Typical scaling is ~$N^4$, the fourth power of molecular size. For the foreseeable future, there will be chemical problems at the limit of our computing capacity.

  • Slide 21

3. Philosophies of Computational Chemistry

  • Slide 22

A: Philosophy of Theoretical Chemistry. The problem is difficult, but by making suitable approximations we can solve it at reasonable cost based on our understanding of physics and chemistry.

  • Slide 23

Theoretical Chemistry. Calculations and simulations based on real physics. Calculations are either quantum mechanical or use numbers derived from quantum mechanics. Attempt to model or simulate reality. Usually Low Throughput.

  • Slide 24

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry (Prof. Eitan Geva)

  • Slide 25

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry. Using quantum mechanics to solve the structures and energetics of molecules; everything depends on the distribution of electrons.
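The scaling argument from Slides 16 and 20 can be made concrete with a little arithmetic. The sketch below is only a rough illustration: it assumes the cost is exactly proportional to $N^4$ and that usable computer power keeps doubling every two years, and the function names are made up for this example.

```python
import math

DOUBLING_PERIOD_YEARS = 2.0  # Slide 16: computer power doubles every two years
SCALING_EXPONENT = 4         # Slide 20: cost grows roughly as N**4 (assumed exact here)


def cost_factor(size_factor: float, exponent: int = SCALING_EXPONENT) -> float:
    """How many times more computation a molecule `size_factor` times larger needs."""
    return size_factor ** exponent


def years_until_affordable(size_factor: float) -> float:
    """Years of doubling needed before the larger calculation costs what the small one does now."""
    doublings = math.log2(cost_factor(size_factor))
    return DOUBLING_PERIOD_YEARS * doublings


def affordable_size_factor(years: float) -> float:
    """How much larger a molecule can become, at fixed cost, after `years` of doubling."""
    power_gain = 2.0 ** (years / DOUBLING_PERIOD_YEARS)
    return power_gain ** (1.0 / SCALING_EXPONENT)


if __name__ == "__main__":
    for factor in (2, 10):
        print(f"{factor}x larger molecule: {cost_factor(factor):g}x the cost, "
              f"about {years_until_affordable(factor):.1f} years of doubling")
```

Under these assumptions a molecule twice as large costs 16 times as much, which is four doublings or eight years of hardware progress. This is one way to read the remark that some chemical problems will stay at the limit of computing capacity.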
  • Slide 26

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry. In 1926 Erwin Schrödinger proposed the Schrödinger equation. The time-independent Schrödinger equation is $\hat{H}\Psi = E\Psi$, where $\hat{H}$ is the Hamiltonian (an operator), $E$ is the energy and $\Psi$ is the wavefunction.

  • Slide 27

$\hat{H}\Psi = E\Psi$

The Hamiltonian: mathematical operator embodying the underlying physics
- Kinetic energy of electrons
- Attraction between electrons and nuclei of atoms
- Repulsion between electrons

The Wavefunction: describes the distribution of electrons in space that gives the lowest energy
- A function of all electron positions within the molecule
- The square of the wavefunction gives the electron density
- Any molecular property can be calculated from the wavefunction

The Energy
- There is always one energy associated with each wavefunction

Although quantum chemistry involves solving Schrödinger's equation, it is not fully exact. There are some approximations involved.

  • Slide 28

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry. There are two main kinds of quantum chemistry: ab initio and Density Functional Theory.

  • Slide 29

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry. Ab initio: "from first principles". Solve the Schrödinger equation to get the wavefunction. In principle rigorous: we know what we calculate. But the standard Hartree-Fock method contains significant approximations. Expensive to adjust for these and get more accuracy.

  • Slide 30

What Kinds of Theoretical Chemistry can be Done? (1) Quantum Chemistry. Density Functional Theory makes use of the theorem that all properties of interest can be determined directly from the electron density. True in principle, but the correct functional is unknown. Less rigorous than ab initio, but usually more accurate for an equivalent cost (or cheaper for similar accuracy).

  • Slide 31

What Kinds of Theoretical Chemistry can be Done? (2) Molecular Simulation

  • Slide 32

What Kinds of Theoretical Chemistry can be Done? (2) Molecular Simulation. There are various techniques for simulating molecules; the most significant is probably Molecular Dynamics. Molecular Dynamics makes a balls-and-springs model of the molecule in the computer, and follows its behaviour over time.

  • Slide 33

What Kinds of Theoretical Chemistry can be Done? (2) Molecular Simulation. Light-harvesting protein subunit.

  • Slide 34

What Kinds of Theoretical Chemistry can be Done? (2) Molecular Simulation. Time steps need to be very, very short (~$10^{-15}$ seconds), so it takes a million steps to simulate one nanosecond of real time and a billion steps to simulate a microsecond. So it is hard to directly simulate relatively slow or rare events, such as protein folding.

  • Slide 35

What Kinds of Theoretical Chemistry can be Done? (2) Molecular Simulation. Also, a balls-and-springs model lacks the quantum mechanics needed to simulate a chemical reaction. Nonetheless, molecular dynamics is very important for understanding shape changes, interactions and energetics of large molecules.

  • Slide 36

B: Philosophy of Informatics. The problem is too difficult to solve at reasonable cost based on real physics and chemistry, so instead we will build a purely empirical model to predict the required molecular properties from chemical structure, using the available data.

  • Slide 37

Informatics. In general, informatics methods represent phenomena mathematically, but not in a physics-based way. Inputs and outputs are related by an empirically parameterised equation or a more elaborate mathematical model. Do not attempt to simulate reality. Usually High Throughput.

  • Slide 38

Informatics. Bioinformatics = informatics applied to biology (genes and proteins). Cheminformatics or chemoinformatics = informatics applied to chemistry; cheminformatics techniques are often used in drug discovery and pharmaceutical research. Medical informatics = application of informatics to medicine or medical data.
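Looking back at the molecular dynamics time-step argument on Slide 34, the step counts quoted there follow from a single division. The sketch below reproduces them. The wall-clock helper is an added illustration, and any steps-per-second rate passed to it is a hypothetical figure, not a measurement from the talk.

```python
TIME_STEP_S = 1e-15  # Slide 34: time steps of about 10^-15 seconds
SECONDS_PER_DAY = 86_400


def steps_needed(simulated_time_s: float, time_step_s: float = TIME_STEP_S) -> int:
    """Number of time steps needed to cover `simulated_time_s` of real time."""
    return round(simulated_time_s / time_step_s)


def wall_clock_days(steps: int, steps_per_second: float) -> float:
    """Days of computing for `steps` steps at an assumed (hypothetical) rate."""
    return steps / steps_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    print("one nanosecond: ", steps_needed(1e-9), "steps")
    print("one microsecond:", steps_needed(1e-6), "steps")
    # 1000 steps per second is a made-up rate, used only to show the scale.
    print("microsecond at 1000 steps/s:",
          round(wall_clock_days(steps_needed(1e-6), 1000.0), 1), "days")
```

The first two results are the million and billion steps quoted on the slide, which is why slow or rare events such as protein folding are hard to simulate directly.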
  • Slide 39

Modelling in Chemistry. LOW THROUGHPUT to HIGH THROUGHPUT.

  • Slide 40

Modelling in Chemistry. LOW THROUGHPUT to HIGH THROUGHPUT. Theoretical Chemistry.

  • Slide 41

Modelling in Chemistry. LOW THROUGHPUT to HIGH THROUGHPUT.

  • Slide 42

Modelling in Chemistry. LOW THROUGHPUT to HIGH THROUGHPUT. Informatics.

  • Slide 43

4. How Best to Compute Solubility?

  • Slide 44

Which would you prefer... or ?

  • Slide 45

Which would you prefer... or ? Solubility in water (and other biological fluids) is highly desirable for pharmaceuticals!

  • Slide 46

Solubility is an important issue in drug discovery and a major cause of failure of drug development projects. Expensive for the pharma industry. Patients suffer from a lack of available treatments. A good computational model for predicting the solubility of druglike molecules would be very valuable.

  • Slide 47

Our Methods (A): Thermodynamic Cycle (theoretical chemistry)

  • Slide 48

Drug Disc. Today, 10 (4), 289 (2005)

  • Slide 49

We can use theoretical chemistry to calculate solubility via a thermodynamic cycle. Crystalline to Gaseous: $\Delta G_{sub}$. Gaseous to Solution: $\Delta G_{hyd}$. Crystalline to Solution: $\Delta G_{solu}$. Sub = sublimation, Hyd = hydration, Solu = solution.

  • Slide 50

We can use theoretical chemistry to calculate solubility via a thermodynamic cycle. Crystalline to Gaseous: $\Delta G_{sub}$. Gaseous to Solution: $\Delta G_{hyd}$. Crystalline to Solution: $\Delta G_{solu}$. Sub = sublimation, Hyd = hydration, Solu = solution.

  • Slide 51

Calculate Energy of Infinite Crystal. Take one molecule. Solve its Schrödinger equation. Calculate its interactions. Allow unit cell to change. Find best size, shape, packing. Find energy of infinite lattice. This is the same methodology as used in crystal structure prediction.
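The thermodynamic cycle on Slides 49 and 50 closes because free energy is a state function, so the crystal-to-solution step equals sublimation followed by hydration: $\Delta G_{solu} = \Delta G_{sub} + \Delta G_{hyd}$. The sketch below adds the two legs and converts the result to a solubility. The conversion is an assumption added for illustration, not something stated on the slides: it takes a 1 mol/L standard state and ideal dilute behaviour, so that $\log_{10} S = -\Delta G_{solu} / (RT \ln 10)$. The numerical inputs are invented.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def solution_free_energy(dg_sub: float, dg_hyd: float) -> float:
    """Close the cycle: crystal -> gas (sublimation) plus gas -> solution (hydration)."""
    return dg_sub + dg_hyd


def log10_solubility(dg_solu: float, temperature_k: float = 298.15) -> float:
    """log10 of solubility in mol/L, ASSUMING a 1 mol/L standard state and ideal solution."""
    return -dg_solu / (R_KJ_PER_MOL_K * temperature_k * math.log(10))


if __name__ == "__main__":
    # Hypothetical free energies in kJ/mol, chosen only to show the arithmetic.
    dg_sub, dg_hyd = 80.0, -60.0
    dg_solu = solution_free_energy(dg_sub, dg_hyd)
    print(f"dG_solu = {dg_solu:.1f} kJ/mol, log10(S) = {log10_solubility(dg_solu):.2f}")
```

Because the two legs are large numbers of opposite sign, small errors in either one carry straight through to the predicted solubility.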
  • Slide 52

We can use theoretical chemistry to calculate solubility via a thermodynamic cycle. Crystalline to Gaseous: $\Delta G_{sub}$. Gaseous to Solution: $\Delta G_{hyd}$. Crystalline to Solution: $\Delta G_{solu}$. Sub = sublimation, Hyd = hydration, Solu = solution.

  • Slide 53

Model of Solvent-Solute Interaction. Calculate energy of interaction between solute and solvent. The model is called RISM.

  • Slide 54

We can use theoretical chemistry to calculate solubility via a thermodynamic cycle. Crystalline to Gaseous: $\Delta G_{sub}$. Gaseous to Solution: $\Delta G_{hyd}$. Crystalline to Solution: $\Delta G_{solu}$. Sub = sublimation, Hyd = hydration, Solu = solution.

  • Slide 55

Our Methods (B): Random Forest (informatics)

  • Slide 56

Random Forest. A decision tree is like a flow chart.

  • Slide 57

A Machine Learning Method. This is a decision tree. We use lots of them to make a forest!

  • Slide 58

Random Forest. "Looks soluble to me!" "Looks sort of soluble." "As soluble as can be!" "I guess it's insoluble." "This guy is soluble!" "Soluble? No way!" "I know it's soluble."

  • Slide 59

Fits into drug discovery pipeline here. Could take 15 years and $1 billion!

  • Slide 60

Application to Proteins. Funnel-shaped energy landscape.

  • Slide 61

Protein Folding

  • Slide 62

  • Slide 63

  • Slide 64

  • Slide 65

  • Slide 66

Using computers to study the world of proteins. Rosanna Alderson.

  • Slide 67

  • Slide 68

  • Slide 69

  • Slide 70

  • Slide 71

  • Slide 72

These two proteins share only 21% of the amino acid sequence in their polypeptide chains but fold into similar structures!

  • Slide 73

Are there any proteins with a similar structure to this one? Lots of proteins have a similar structure! We need to look at a deeper level, to see if we can find amino acids we know are important for a particular function.

  • Slide 74

MGSSHHHHHHENLYFQGMMFKKKMLAAT. What if we don't have the 3D structure of a protein but only know its amino acids? We know what the amino acids are but we don't know how they fold together.
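A figure such as the 21% on Slide 72 is a percent identity: the share of positions at which two sequences carry the same amino acid. The sketch below computes it for the sequence on Slide 74 and the near-match that appears on Slide 75. It is a simplification that assumes the two sequences are already the same length and need no gaps; real sequence searches align the sequences first. The function name is made up for this example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions where two equal-length sequences match exactly."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch assumes pre-aligned sequences of equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


QUERY = "MGSSHHHHHHENLYFQGMMFKKKMLAAT"  # sequence shown on Slide 74
HIT = "MGSSHHHHHHDNLPFQGMMFKKNMLAAT"    # similar sequence shown on Slide 75

if __name__ == "__main__":
    # Report each differing position (1-based) alongside the overall identity.
    differences = [(i + 1, a, b)
                   for i, (a, b) in enumerate(zip(QUERY, HIT)) if a != b]
    print(f"identity: {percent_identity(QUERY, HIT):.1f}%")
    print("differences:", differences)
```

The two sequences differ at three of their 28 positions, so they are about 89% identical, far above the 21% that was already enough for a similar fold on Slide 72.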
  • Slide 75

MGSSHHHHHHENLYFQGMMFKKKMLAAT. Look for similar sequences. Input: MGSSHHHHHHENLYFQGMMFKKKMLAAT. Similar to input, from PDB: MGSSHHHHHHDNLPFQGMMFKKNMLAAT. 3D structure.

  • Slide 76

  • Slide 77

Thank you for listening. Want to know more?