MODELING MATTER AT NANOSCALES
3. Empirical classical PES and typical procedures of optimization
3.04. Geometries from energy derivatives

Slides 2-6: General algorithm: the line search strategy

An unavoidable task in any iterative minimization, which generates a sequence of geometries {R_k} such that E(Z, R_{k+1}) ≤ E(Z, R_k) for every k, is to find optimal step sizes on the way to a local or global minimum of the hypersurface along a given direction. This is performed by a search strategy known as line search, meaning:

- Computing E(Z, R_k) and g(R_k), given R_k, at a certain point k of the search.
- Choosing a search direction F(R_k).
- Obtaining a step size α_k > 0 that minimizes E[Z, R_k + α_k F(R_k)], i.e. locating the hypersurface at a better following position by appropriate mathematical procedures (see the sketch below).
- Selecting the new geometry at the following local step k + 1, i.e. R_{k+1} = R_k + α_k F(R_k), and repeating the cycle from k + 1 until a given local convergence threshold ε is satisfied; that step will be the final one of the optimization.
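A minimal Python sketch of such an iterative, line-search-driven minimization loop. The names `energy` and `gradient` are hypothetical callables standing in for E(Z, R) and g(R), and the backtracking (Armijo) rule is only one of the "appropriate mathematical procedures" that can supply the step size α_k:

```python
import numpy as np

def backtracking_line_search(energy, R, F, g, alpha=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until the Armijo sufficient-decrease condition holds."""
    E0 = energy(R)
    while energy(R + alpha * F) > E0 + c * alpha * np.dot(g, F):
        alpha *= rho
    return alpha

def minimize_line_search(energy, gradient, R0, eps=1e-5, max_steps=500):
    """Generic iterative minimization: E(R_{k+1}) <= E(R_k) at every step."""
    R = np.asarray(R0, dtype=float)
    for k in range(max_steps):
        g = gradient(R)                 # g(R_k)
        if np.linalg.norm(g) < eps:     # local convergence threshold
            break
        F = -g                          # search direction (steepest descent here)
        alpha = backtracking_line_search(energy, R, F, g)
        R = R + alpha * F               # R_{k+1} = R_k + alpha_k F(R_k)
    return R
```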
Slide 7: First order treatments

Slides 8-10: Simple gradient methods (first order): steepest descent

The method of steepest descent only makes use of gradient information on the potential energy surface to determine the search direction F:

  F(R_k) = -g(R_k)

Then, each optimization step k + 1 depends on the previous one according to:

  R_{k+1} = R_k + α_k F(R_k) = R_k - α_k g(R_k)

The inconvenience is that the gradient at a given point is always orthogonal to the gradient at the following point of the minimization. It means a zig-zag approach to the minimum that wastes much computing time.

Slides 11-18: Simple gradient methods (first order): non-linear conjugate gradient

The non-linear conjugate gradient method can be considered a correction to the method of steepest descent. It still makes use only of gradient information on the potential energy surface, starting as:

  F(R_0) = -g(R_0)

but subsequent steps behave as:

  F(R_{k+1}) = -g(R_{k+1}) + β_{k+1} F(R_k)

the β_{k+1} being real coefficients. As in steepest descent, each optimization step k + 1 depends on the previous one through R_{k+1} = R_k + α_k F(R_k).

Several formulas, built from the gradients themselves, are used to calculate β_{k+1}:

- Fletcher-Reeves:  β_{k+1} = [g(R_{k+1}) · g(R_{k+1})] / [g(R_k) · g(R_k)]
- Polak-Ribière:    β_{k+1} = [g(R_{k+1}) · (g(R_{k+1}) - g(R_k))] / [g(R_k) · g(R_k)]

or even others.

The big advantage of the non-linear conjugate gradient method is that, while g(R_{k+1}) · g(R_k) = 0 in the steepest descent method, here successive gradients become non-orthogonal, g(R_{k+1}) · g(R_k) ≠ 0. It means a more straightforward pathway to minima, avoiding zig-zags (see the sketch below).
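A minimal sketch of the non-linear conjugate gradient update described above. The `energy` and `gradient` callables are again hypothetical, the line search helper is a compact version of the one in the previous sketch, and the `variant` switch selects the Fletcher-Reeves or Polak-Ribière formula for β:

```python
import numpy as np

def _line_search(energy, R, F, g, alpha=1.0, rho=0.5, c=1e-4):
    """Backtracking (Armijo) line search, as in the previous sketch."""
    while energy(R + alpha * F) > energy(R) + c * alpha * np.dot(g, F):
        alpha *= rho
    return alpha

def conjugate_gradient_minimize(energy, gradient, R0, eps=1e-5,
                                max_steps=500, variant="FR"):
    """Nonlinear CG: F_0 = -g_0, then F_{k+1} = -g_{k+1} + beta_{k+1} F_k."""
    R = np.asarray(R0, dtype=float)
    g = gradient(R)
    F = -g                                   # first step equals steepest descent
    for k in range(max_steps):
        if np.linalg.norm(g) < eps:
            break
        alpha = _line_search(energy, R, F, g)
        R = R + alpha * F                    # R_{k+1} = R_k + alpha_k F(R_k)
        g_new = gradient(R)
        if variant == "FR":                  # Fletcher-Reeves coefficient
            beta = g_new.dot(g_new) / g.dot(g)
        else:                                # Polak-Ribiere coefficient
            beta = g_new.dot(g_new - g) / g.dot(g)
        F = -g_new + beta * F                # conjugate direction update
        g = g_new
    return R
```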
Slides 19-22: Simple gradient methods (first order): non-linear conjugate gradient (continued)

It must be observed that non-linear conjugate gradient methods use the history of the gradient development in a certain way. It means that curvatures (Hessians) are, to some extent, implicitly taken into account. Advantages are:

- rather simple formulas for updating the direction vector;
- although slightly more complicated than steepest descent, it converges faster;
- it is more reliable than simpler methods.

A good optimization strategy is considered to be performing initially a few steepest descent steps, to reach a conformation near the energy minimum, and then finishing with conjugate gradient steps until convergence.

Slide 23: Second order treatments

Slides 24-28: Gradient and Hessian path to optimize geometries

The Taylor series can be truncated if R_f is any stationary point of the system, or even in its nearest surroundings. Moreover, gradients must vanish at the equilibrium or any other stationary geometry. Then, with q = R - R_f (the gradient term vanishing when R_f is a stationary point):

  E(Z, R) ≈ E(Z, R_f) + g(R_f) · q + (1/2) q^T H(R_f) q

As the derivative of the energy with respect to the position of any center a at geometry R_f is the gradient of the energy at that coordinate, it can also be expanded as a Taylor series; in matrix notation, the full gradient at a geometry R near R_f is:

  g(R) ≈ g(R_f) + H(R_f) (R - R_f)

When R_f is the equilibrium geometry, g(R_f) = g(R_eq) = 0 by definition, and therefore gradients and curvatures become related through the displacements q = R_eq - R as:

  g(R) ≈ -H q

The Hessian matrix is considered non-singular when H H⁻¹ = H⁻¹ H = I_n, where I_n is the identity matrix of order n. Therefore, the displacement matrix can be expressed as the product of the inverse Hessian and the gradient matrix whenever the Hessian is non-singular (see the sketch below):

  q = -H⁻¹ g(R)
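A minimal numerical illustration of the displacement relation q = -H⁻¹ g. The quadratic model surface, its curvature matrix and the equilibrium geometry below are invented for the example; on such a surface a single displacement recovers R_eq exactly:

```python
import numpy as np

# Model quadratic surface E(R) = 1/2 (R - R_eq)^T A (R - R_eq), with an assumed
# positive-definite curvature matrix A and equilibrium geometry R_eq.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
R_eq = np.array([1.0, -2.0])

def gradient(R):
    return A @ (R - R_eq)      # g(R) = A (R - R_eq)

def hessian(R):
    return A                   # constant Hessian for a quadratic model

R = np.array([5.0, 5.0])       # arbitrary starting geometry
q = -np.linalg.solve(hessian(R), gradient(R))   # q = -H^{-1} g(R)
print(R + q)                   # equals R_eq on a purely quadratic surface
```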
Slides 29-32: Gradient and Hessian path to optimize geometries (continued)

Therefore, being g(R) ≈ -H q, whereas q = R_eq - R, then:

  R_eq ≈ R - H⁻¹(R) g(R)

Remembering the steps of the line search procedure, R_{k+1} = R_k + α_k F(R_k), it arises that a good prescription for the step is the one provided by the inverse Hessian, when it is non-singular:

  α_k F(R_k) = -H⁻¹(R_k) g(R_k)

That means a way to find the equilibrium geometry of any system by a process in which an arbitrary geometry is progressively moved downwards on the hypersurface, until g(R) becomes 0, by means of the curvature provided by the Hessian. This is a general principle for finding optimized geometries, guided by the gradient path and by the curvature of the hypersurface at each point.

Slides 33-36: Gradient and Hessian methods (second order): Newton-Raphson

The Newton-Raphson scheme is based on the principle of using first derivatives of a function to find its extrema and second derivatives to decide whether they are maxima or minima. Therefore, the general recipe for the iterative step is:

  R_{k+1} = R_k - H⁻¹(R_k) g(R_k)

Observe that, like the steepest descent method, the Newton-Raphson step is built from the negative gradient direction. Newton-Raphson methods provide good output properties and fast convergence if started near the equilibrium structure (either local or global); however, they need modifications if started far away from the solution. Another inconvenience is that the inverse Hessian is expensive to calculate.

Slides 37-39: Gradient and Hessian methods (second order): pseudo-Newton-Raphson

As it is not always possible (as fast as desired) to compute H(R_k) at each iteration step, there are pseudo-Newton-Raphson methods that use approximate Hessian matrices B_k ≈ H(R_k), the step becoming:

  R_{k+1} = R_k - B_k⁻¹ g(R_k)

making sure that B_k is always kept positive definite. One case of a pseudo-Newton-Raphson procedure is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) procedure, used by default in important program packages, in which B_k is updated at every step from the changes in the geometry and in the gradient. The evaluation of the initial matrix B(R_1) is performed by particular formulas or by another method, such as those previously described (see the sketch below).
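A compact sketch of a pseudo-Newton-Raphson loop using the standard BFGS update of the approximate Hessian B. The `gradient` callable is hypothetical, the initial guess B(R_1) is simply taken as the identity, and no line search or trust radius is included, so this is only an illustration of the update itself:

```python
import numpy as np

def bfgs_minimize(gradient, R0, eps=1e-6, max_steps=200):
    """Pseudo-Newton-Raphson: R_{k+1} = R_k - B_k^{-1} g_k with BFGS updates of B."""
    R = np.asarray(R0, dtype=float)
    B = np.eye(R.size)                    # crude initial Hessian guess B(R_1) = I
    g = gradient(R)
    for k in range(max_steps):
        if np.linalg.norm(g) < eps:
            break
        step = -np.linalg.solve(B, g)     # quasi-Newton displacement
        R_new = R + step
        g_new = gradient(R_new)
        s = R_new - R                     # geometry change s_k
        y = g_new - g                     # gradient change y_k
        if y.dot(s) > 1e-12:              # skip the update if it would spoil positive definiteness
            Bs = B @ s
            B = B + np.outer(y, y) / y.dot(s) - np.outer(Bs, Bs) / s.dot(Bs)
        R, g = R_new, g_new
    return R
```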
Slides 40-41: Gradient and Hessian methods (second order): pseudo-Newton-Raphson (continued)

Pseudo-Newton-Raphson methods serve to speed up convergence without requiring the high computational resources needed to obtain the Hessian matrix at each optimization step. In certain cases, such as large systems with too many degrees of freedom, it could be better to perform steepest descent or conjugate gradient procedures.

Slides 42-44: Convergences

There are several convergence criteria, which are not mutually exclusive. If ε is a convenient threshold:

- Mathematical: ‖g(R_k)‖ ≤ ε and the Hessian H_k is positive definite (all its eigenvalues are positive).
- Gradient only: ‖g(R_k)‖ ≤ ε, although reaching the minimum is not guaranteed.
- Both gradient and displacements: ‖g(R_k)‖ ≤ ε and ‖R_{k+1} - R_k‖ ≤ ε.

A simple implementation of such tests is sketched below.
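A minimal sketch of the non-exclusive convergence tests listed above. The threshold values are only illustrative and do not correspond to any particular program package:

```python
import numpy as np

def converged(g, step, H=None, eps_g=3e-4, eps_x=1.2e-3):
    """Convergence tests on gradient, displacement and (optionally) curvature."""
    grad_ok = np.linalg.norm(g, ord=np.inf) < eps_g        # gradient-only test
    step_ok = np.linalg.norm(step, ord=np.inf) < eps_x     # displacement test
    if H is not None:                                       # "mathematical" test
        curvature_ok = np.all(np.linalg.eigvalsh(H) > 0.0)  # positive definite Hessian
        return grad_ok and step_ok and curvature_ok
    return grad_ok and step_ok
```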
Slides 45-49: Coordinates

In practice, all matrices are treated in the 3N-dimensional space of the x, y and z (or x_1, x_2, x_3) coordinates of each nucleus or center of reference. It means that the Taylor series for the energy can be expressed in terms of the X matrix in place of the R matrix:

  E(Z, X) ≈ E(Z, X_f) + g(X_f) · (X - X_f) + (1/2) (X - X_f)^T H(X_f) (X - X_f)

It must be observed that a molecule with 2 atoms has 3N = 6 coordinates, although only one dimension is significant for its geometry:

- 3 molecular dimensions (x_1, x_2, x_3) refer to the translation of the system with respect to any external point;
- 2 molecular dimensions (x_4, x_5) refer to its rotational motion;
- molecules with more than 2 atoms (non-cylindrical symmetry) require an additional coordinate to deal with rotations.

Slides 50-55: Coordinates (continued)

For the expression in terms of Cartesian coordinates to be valid, the Hessian must be non-singular, i.e. H H⁻¹ = H⁻¹ H = I. The problem is that a 3N Hessian matrix in terms of Cartesian coordinates is singular, because translation (involving x_1, x_2, x_3) and rotation (involving x_4, x_5, x_6) of the complete system leave the potential energy unchanged: they only contribute an overall kinetic term belonging to the whole nanoscopic system. The system's geometry deals with the potential energy among the components, each of which involves all the other coordinates. Therefore, there is no potential energy gradient, nor Hessian, for the translational or rotational components of the whole system.

The solution is transforming the full Cartesian coordinate matrix X into an internal, or reduced Cartesian, coordinate matrix X* containing only the geometry-significant 3N - 6 terms (or 3N - 5 in the case of cylindrical systems). The H(X*) Hessian matrix is non-singular, and then all the previous considerations hold for geometry optimizations.

Slides 56-62: Finding gradients and Hessians

There are two main ways to compute gradients and Hessians:

- Analytically, meaning the evaluation of formulas for the derivatives of the total energy obtained from those used for the hypersurface function.
- Numerically, using series to evaluate points and the corresponding inter- and extrapolations (see the finite-difference sketch below).

A good algorithm to be used when only the energy is known and analytical gradients cannot be obtained is that of Fletcher-Powell (FP). It builds up an internal list of gradients by keeping track of the energy changes from one step to the next, and it is usually the method of choice when energy gradients cannot be computed.

1. Fletcher, R., A new approach to variable metric algorithms. The Computer Journal 1970, 13 (3), 317-322.
2. Fletcher, R.; Powell, M. J. D., A Rapidly Convergent Descent Method for Minimization. The Computer Journal 1963, 6 (2), 163-168.

Some of the most efficient algorithms are the so-called quasi-Newton algorithms, which assume a quadratic potential surface near the minima. The Berny algorithm internally builds up a second-derivative (Hessian) matrix.
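Where analytical derivatives are not available, gradients and Hessians can be estimated numerically. A minimal central-difference sketch, assuming only a hypothetical scalar `energy(X)` callable over the coordinate vector X:

```python
import numpy as np

def numerical_gradient(energy, X, h=1e-4):
    """Central-difference gradient g_i = [E(X + h e_i) - E(X - h e_i)] / (2h)."""
    X = np.asarray(X, dtype=float)
    g = np.zeros_like(X)
    for i in range(X.size):
        e = np.zeros_like(X); e[i] = h
        g[i] = (energy(X + e) - energy(X - e)) / (2.0 * h)
    return g

def numerical_hessian(energy, X, h=1e-3):
    """Hessian columns from central differences of the numerical gradient."""
    X = np.asarray(X, dtype=float)
    n = X.size
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        H[:, j] = (numerical_gradient(energy, X + e) -
                   numerical_gradient(energy, X - e)) / (2.0 * h)
    return 0.5 * (H + H.T)      # symmetrize to remove round-off asymmetry
```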
Slides 63-65: Finding gradients and Hessians: numeric and analytic approaches

Another good algorithm is the geometry optimization by direct inversion in the iterative subspace (GDIIS) algorithm. Molecular mechanics programs often use:

The general DIIS procedure used in GAMESS, which includes both numerical and analytical derivatives, is described in the series:

- Hamilton, T. P.; Pulay, P., Direct inversion in the iterative subspace (DIIS) optimization of open-shell, excited-state, and small multiconfiguration SCF wave functions. J. Chem. Phys. 1986, 84 (10), 5728-5734.
- Pulay, P., Improved SCF convergence acceleration. J. Comput. Chem. 1982, 3 (4), 556-560.
- Pulay, P., Convergence acceleration of iterative sequences. The case of SCF iteration. Chem. Phys. Lett. 1980, 73 (2), 393-398.

Slides 66-76: Finding gradients and Hessians: refinements near minima and transition states

The Newton-Raphson approach can be expressed as before in terms of an internal or reduced Cartesian coordinate matrix near a stationary point; for the sake of simplicity it can be written as:

  q = -H⁻¹ g

A linear transformation is useful to treat the variables: the Hessian is diagonalized, so that its eigenvalues h_i and eigenvectors u_i define a new set of axes. The corresponding displacements and gradients are then the projections:

  q_i = u_i^T q        g_i = u_i^T g

Such h_i, q_i and g_i variables are called the local principal modes or axes, and the Newton-Raphson step along each principal axis is then:

  q_i = -g_i / h_i

Then, for stationary points:

- all h_i > 0 for minima;
- all h_i < 0 for maxima;
- h_i < 0 along exactly n principal modes (and h_i > 0 along the rest) for an n-th order saddle point.

It leads to stepping procedures of the form (see the sketch below):

  q_i = -g_i / (h_i - λ)

where λ is an appropriately chosen shift parameter. Depending upon the value of λ, the sign of each (h_i - λ) will be positive or negative, and hence the direction of the step q_i will be opposite to, or toward, the direction of the gradient.
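A minimal sketch of a shifted Newton-Raphson step built from the principal modes above. The Hessian `H` and gradient `g` are assumed to be given already in the reduced coordinates, and the simple recipe for choosing the shift λ in a minimum search is only illustrative, not the prescription of any specific eigenvector-following implementation:

```python
import numpy as np

def shifted_newton_step(H, g, shift=0.0):
    """Step q_i = -g_i / (h_i - shift) along each local principal mode."""
    h, U = np.linalg.eigh(H)           # eigenvalues h_i and eigenvectors (principal axes)
    g_modes = U.T @ g                  # gradient projected onto the principal modes
    q_modes = -g_modes / (h - shift)   # shifted Newton-Raphson step per mode
    return U @ q_modes                 # back-transform to the reduced coordinates

def minimum_search_step(H, g, margin=0.1):
    """For a minimum search, put the shift below the lowest eigenvalue so that
    every (h_i - shift) is positive and all steps go against the gradient."""
    h_min = np.linalg.eigvalsh(H).min()
    shift = min(0.0, h_min - margin)   # only shift when negative curvature is present
    return shifted_newton_step(H, g, shift)
```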
Slides 77-79: Finding gradients and Hessians: refinements near minima and transition states (continued)

All these expressions facilitate the use of algebra to determine the best λ. These approaches are known as eigenvector following procedures. They turn out to be excellent for finding accurate minima and saddle points.

- Banerjee, A.; Adams, N.; Simons, J.; Shepard, R., Search for stationary points on surfaces. J. Phys. Chem. 1985, 89 (1), 52-57.
- Simons, J.; Joergensen, P.; Taylor, H.; Ozment, J., Walking on potential energy surfaces. J. Phys. Chem. 1983, 87 (15), 2745-2753.
- Cerjan, C. J.; Miller, W. H., On finding transition states. J. Chem. Phys. 1981, 75 (6), 2800-2806.

Slide 80: Summary

An interesting paper describing how all this matter works in recent programs is:

Slides 81-82: Comments

The QM/MM challenge is also being addressed:

This is still an active field of research:

Slide 83: References

A comprehensive illustration: Cramer, C. J., Essentials of Computational Chemistry: Theories and Models. 2nd ed.; John Wiley & Sons Ltd: Chichester, 2004; p 596.

An appendix in: Szabo, A.; Ostlund, N. S., Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. First edition, revised; McGraw-Hill: New York, 1989; p 466.