QuanteMM



2       Theory

QuanteMM is provided to investigate large molecular or crystalline systems by means of a combination of quantum mechanical and molecular mechanical tools.

This combination is based on the concept of a potential energy surface of a molecule.


The Potential Energy Surface

The complete mathematical description of a molecule, including both quantum mechanical and relativistic effects, is a formidable problem, due to the small scales and large velocities. However, for this discussion, these intricacies are ignored and the focus is on general concepts, because molecular mechanics and dynamics are based on empirical data that implicitly incorporate all the relativistic and quantum effects. Since no complete relativistic quantum mechanical theory is suitable for the description of molecules, this discussion starts with the nonrelativistic Schrödinger equation:

The Schrödinger equation

Eq. 1

where H is the Hamiltonian for the system, is the wavefunction, and E is the energy. In general, is a function of the coordinates of the nuclei (R) and of the electrons (r).

The Born-Oppenheimer approximation

Although it is quite general, this equation is too complex for any practical use, so approximations are made. Noting that the electrons are several thousand times lighter than the nuclei and therefore move much faster, Born and Oppenheimer (1927) proposed what is known as the Born-Oppenheimer approximation: the motion of the electrons can be decoupled from that of the nuclei, giving two separate equations. The first equation describes the electronic motion:

Equation for electronic motion - The potential energy surface

Eq. 2

and depends only parametrically on the positions of the nuclei. Note that this equation defines an energy E(R), which is a function of only the coordinates of the nuclei. This energy is usually called the potential energy surface.

Equation for nuclear motion on the potential energy surface

The second equation then describes the motion of the nuclei on this potential energy surface E(R):

Eq. 3

The direct solution of Eq. 2 is the province of ab initio quantum chemical codes such as Gaussian, Cadpac, Hondo, GAMESS, DMol, and Turbomole. Semi-empirical codes such as Zindo, MNDO, MINDO, MOPAC, and AMPAC also solve Eq. 2, but they approximate many of the integrals needed with empirically fit functions. The common feature of these programs, though, is that they solve for the electronic wavefunction and energy as a function of nuclear coordinates.


Empirical Fit of the Surface

Solving Eq. 3 is important if you are interested in the structure or time evolution of a molecule. As written, Eq. 3 is the Schrödinger equation for the motion of the nuclei on the potential energy surface. In principle, Eq. 2 could be solved for the potential energy E, and then Eq. 3 could be solved. However, the effort required to solve Eq. 2 is extremely large, so usually an empirical fit to the potential energy surface is used. In any case, the solution of the quantum mechanical form of Eq. 3 is called quantum dynamics, but since the nuclei are relatively heavy objects, the quantum mechanical effects are often insignificant, in which case Eq. 3 can be replaced by Newton's equation of motion:

Eq. 4

Molecular dynamics and mechanics

The solution of Eq. 4 using an empirical fit to the potential energy surface E(R) is called molecular dynamics. Molecular mechanics ignores the time evolution of the system and instead focuses on finding particular geometries and their associated energies or other static properties. This includes finding equilibrium structures, transition states, relative energies, and harmonic vibrational frequencies.


The Forcefield

The empirical fit to the potential energy surface is the forcefield. The forcefield defines the coordinates used, the mathematical form of the equations involving the coordinates, and the parameters adjusted in the empirical fit of the potential energy surface. Forcefields commonly used for describing molecules employ a combination of internal coordinates (bond distances, bond angles, torsions), to describe the bond part of the potential energy surface, and interatomic distances to describe the van der Waals and electrostatic interactions between atoms. The functional forms range from simple quadratic forms to Morse functions, Fourier expansions, Lennard-Jones potentials, etc. The goal of a forcefield is to describe entire classes of molecules with reasonable accuracy. In a sense, the forcefield interpolates and extrapolates from the empirical data of the small set of molecules used to parameterize the forcefield to a larger set of related molecules and structures.

Classical forcefields

The physical significance of most of the types of interactions in a classical forcefield is easily understood. Describing a molecule's internal degrees of freedom in terms of bonds, angles, and torsions is natural. The analogy of vibrating balls connected by springs to describe molecular motion is equally familiar. However, it must be remembered that such classical models have limitations. Consider for example the difference between a classical and a quantum mechanical "bond".

Quantum and classical descriptions of bonds

Covalent bonds can, to a first approximation, be described by a harmonic oscillator, both in quantum and classical theory. Consider the classic oscillator in Figure 1. A ball poised at the intersection of the pale horizontal line with the parabolic energy surface (thick line) would begin to roll away, converting its potential energy to kinetic energy and achieving a maximum velocity as it passes the minimum. Its velocity (kinetic energy) is then converted back into potential energy until, at the exact same height as it had started, it would pause momentarily before rolling back. The interconversion of kinetic and potential energy in such a classical system is familiar and intuitive. The probability that the ball is at any point along its trajectory is inversely proportional to its velocity at that point. This probability is plotted above the parabolic curve (thin line). The probability is greatest near the high-energy limits of its trajectory (where it is moving slowly) and lowest at the energy minimum (where it is moving quickly). Because the total energy cannot exceed the initial potential energy defined by the starting point, the probability drops to zero outside the limit defined by the intersection of the total energy (pale horizontal line) with the parabola.

Figure 1 . Energy and Probability of a Classical and Quantum Particle in a Harmonic Energy Well

The energy is indicated by the heavy lines and probability by the thin lines. The total energy of the system is indicated by the pale horizontal line. The classical probability is highest when the particle reaches it maximum potential energy (zero velocity) and drops to zero between these points. The quantum mechanical probability is highest where the potential energy is lowest, and there is a finite probability that the particle can be found outside the classical limits (pale vertical lines).  

Describing a quantum mechanical "trajectory" is impossible, because the uncertainty principle prevents an exact, simultaneous specification of both position and momentum. However, the probability that the quantum mechanical ball will be at a given point on the parabola can be quantified. The quantum mechanical probability function plotted in the right panel of Figure 1 is very different from the classical system. First, the highest probability is at the energy minimum, which is the opposite of the classical case. Second, the quantum mechanical ball can actually be found beyond the classical limits imposed by the total energy of the system (tunneling). Both these properties can be attributed to the uncertainty principle.

Utility of the classical approach

With such a different qualitative picture of fundamental physical principles, is it reasonable to use a classical approach for obviously quantum mechanical entities like bonds? In practice, many experimental properties such as vibrational frequencies, sublimation energies, and crystal structures can be reproduced with a classical forcefield, not because the systems behave classically, but because the forcefield is fit to reproduce relevant observables and therefore includes most of the quantum effects empirically. Nevertheless, it is important to appreciate the fundamental limitations of a classical approach.

Limitations of the classical approach

Applications beyond the capability of most classical methods include:


The Classical Energy Expression

The actual coordinates of a molecule combined with the forcefield data create the energy expression or target function for the molecule. This energy expression is the equation that describes the potential energy surface of a particular molecule as a function of its atomic coordinates. As a simple example, consider the following equation, which might be used to describe the potential energy surface of a water molecule:

Example energy expression for water

Eq. 5

where Koh, b0oh, Khoh, and 0hoh are parameters of the forcefield, b is the current bond length of one OH bond, b¢ is the length of the other OH bond, and is the HOH angle.

In this example, the forcefield defines:

A more realistic example

Eq. 5 is an example for a simple case. Eq. 6 is a more realistic energy expression:

Eq. 6

The first four terms in this equation are sums that reflect the energy needed to stretch bonds (b), bend angles () away from their reference values, rotate torsion angles () by twisting atoms about the bond axis that determines the torsion angle, and distort planar atoms out of the plane formed by the atoms they are bonded to (). The next five terms are cross terms that account for interactions between the four types of internal coordinates. The final term represents the nonbond interactions as a sum of repulsive and attractive Lennard-Jones terms as well as Coulombic terms, all of which are a function of the distance between atom pairs rij. The forcefield defines the functional form of each term in this equation as well as the parameters such as Db, , and b0. The forcefield also defines the internal coordinates such as b, , , and as a function of the Cartesian atomic coordinates, although this is not explicitly obvious in Eq. 6. Finally, the energy expression in Eq. 6 is cast in a general form. The true energy expression for a specific molecule includes information about the coordinates that are included in each sum. For example, it is common to exclude interactions between bonded and 1-3 atoms in the summation representing the nonbond interactions. Thus, a true energy expression might actually use a list of allowed interactions rather than the full summation implied in Eq. 6.

The Melding of Quantum Mechanical and Molecular Mechanical Surfaces

A combination of quantum mechanics and molecular mechanics to describe the potential energy surface of an extended system is promising and feasible if the following conditions are met:

Quantum mechanical methods are, in general, computationally more expensive than molecular mechanical methods which limits their range to smaller systems.

The system of interest contains a region where a molecular mechanical description is known to be less accurate.

This partitioning results in a molecular model, described by a quantum mechanical method, embedded into an environment described by a forcefield.

In the following, the term QM system is used for the embedded molecular model while the term MM system is used for the environment including the atoms that form part of the QM system.

Points 3 and 4 can be in conflict if one wants to study a system where the quantum mechanical region contains atoms for which no forcefield parameters exist at all. The lack of those parameters prevents describing interactions between this region and its environment, while the interactions between atoms within this region are covered by the quantum mechanical description.

The "nu" potential type (which was originally designed to calculate the difference in free energies between molecules that do not contain the same number of atoms) could be used as a substitute for an unknown potential type; however, this entails neglecting all molecular mechanics contributions from this atom and the necessity to remove all bonds from the atom for which the "nu" potential type is employed. It is sufficient, however, for single point calculations where the electrostatic field of the environment is included into the quantum mechanical description (see below). For geometry optimizations, the lack of energy contributions that balance the electrostatic interactions between QM system and environment will have detrimental effects to the geometry.

The molecular mechanics approach describes the energy hypersurface by means of a so-called forcefield (the molecular mechanics Hamiltonian) that, in its present implementation, depends only on the nuclear positions. Accordingly, the interaction between the molecular model and its environment can be represented exclusively by molecular mechanics (if the necessary parameters exist, compare above). This is referred to as "mechanical embedding" and means that the environment of the QM system has no direct effect on the electron distribution within the QM system. In other words, there is no polarization of the electron density. There is an indirect (trivial) effect as far as changes in the electron distribution with geometry are concerned. However, if the molecular mechanics Hamiltonian contains a description of Coulomb interactions (generally through the use of atom-centered point charges), then it is possible to transfer these interactions into the quantum mechanical Hamiltonian. In this way, one can account for the polarization of the electron distribution within the QM system in the presence of the electrostatic potential of its environment. In this case, referred to as "electronic embedding", the forcefield parameters (such as point charges) would have to provide a reasonably accurate representation of the electrostatic potential which is by no means guaranteed since forcefields are only required to yield an accurate representation of the total potential energy without requiring the non-bond contributions by themselves to be accurate. In addition, it should be kept in mind that the quantum mechanical calculation is still based on a finite molecule approach. This means that the use of periodicity is restricted to the molecular mechanical part of the calculation, and we do not perform a fully periodical quantum mechanical calculation.


Link/Capping Atoms

The main approximation then consists of how interactions between the two regions-- QM system and environment--are treated. Two issues must be considered. First, the quantum mechanical region may have to be modified since the definition of such a region might require the breaking of chemical bonds (for example, in a zeolite, where bonds between silicon and oxygen are affected). This problem is approached by introducing capping atoms, subsequently called link or capping atoms. The bonds between atoms of the quantum mechanical region and atoms of the environment are now replaced by bonds between atoms of the quantum mechanical region and link atoms, resulting in a finite molecular model that may even be accessible to experiment. For example, defining the quantum mechanical region to be a single T site of a zeolite and using hydrogen atoms as link atoms, the existing Si(OH)4 molecule is obtained. The link atom should be chosen in a way that conserves bond order and such that there are no open valences left on the link atom. In practice, hydrogen atoms seem to be the most reasonable choice for the "capping" of dangling single bonds.

In applications to proteins, it is advisable to include amide groups (HN-CO) completely into the definition of the quantum mechanical region. In this case, the QM system would be terminated with either an amino (NH2) or an aldehyde (CHO) group, where the capping hydrogen atom replaces the alpha carbon atom.

For study of solute-solvent interactions the partitioning between molecular model and environment is almost trivial if the solute constitutes the quantum mechanical region while the solvent is modeled by a forcefield. In some cases, however, charge transfer between solute and solvent may be important and then solvent molecules would have to be included in the molecular model.

In geometry optimizations, the atoms of the original structure - before the quantum mechanical system was embedded into it - determine the structure of the embedded molecule. The default procedure therefore is that the position of the capping atoms is determined by the atoms forming the bond that is being capped, and not vice versa (Maseras 1995). As a result, the forces on the capping atoms can be transferred as increments to the forces on the atoms forming the capped bond.


Coupling Between Quantum and Molecular Mechanics

The second of the two issues that need to be addressed relates to the approximations that need to be made for computing interactions between the QM system and its environment. There are three ways to do this. In the following, it shall be assumed that the forcefield contains at most point charges, but no higher multipole moments (i.e., dipoles, quadrupoles, etc.) to describe electrostatic interactions.

The coupling can be achieved in two ways:

The total energy is the sum of the quantum mechanical energy and the energy due to the forcefield.

Denoting all atoms of the embedded QM system by lowercase indices and all atoms of the surrounding structure by uppercase indices, the following is obtained:

Eq. 7

where Q and M denote contributions from the quantum and the molecular mechanical parts, respectively. The forcefield accounts for all the interactions between the embedded QM system and its environment (i.e., the last term in the previous equation).

The coupling term itself may contain electronic as well as nuclear contributions, depending on the terms that are part of the classical Hamiltonian.

For the forcefields available to use with Discover, the electronic contribution is limited to the polarization of the QM system's electron density by point charges that were originally designed to model the Coulomb contribution to non-bond interactions between atoms in the environment. The nuclear contribution - that is, the interaction between the nuclei of the QM system and the environment - then contains the remaining non-bond interactions and the bond terms, if there are bonds between the QM system and its environment.

The following equation highlights the separation of the Coulomb interaction between electrons ("el") of the QM system and the electrostatic field generated by the environment from the total coupling term in Eq. 1.

Eq. 8

The electrostatic interaction between molecular mechanics atoms, carrying point charges q, and the electrons of the quantum mechanical system can then be included into the quantum mechanical Hamiltonian.

The total energy is the sum of the quantum mechanical energy, the forcefield energy excluding Coulomb interactions between QM system and environment and a quantum mechanical/forcefield energy (polarization energy) for those excluded Coulomb interactions:

Eq. 9

A third strategy can be adopted that accounts for the fact that the point charges employed in the forcefield may not reproduce the electrostatic potential accurately enough. Then, method (1) can be used for structure optimization and method (2) for the computation of the total energy. Of course, this method suffers from the fact that energy and forces are no longer consistent, but is still very useful in assessing the effect of an external electrostatic potential on relative energies.

With either method (1) or (2), the actual geometry optimization procedure is straightforward. In addition, one can try to correct the artificial contributions to either the total energy and/or the forces on the atoms which are due to the introduction of link atoms. Since quantum mechanics does not provide any means to decompose the total energy of a molecule into atomic contributions, this may only be accomplished at the level of a forcefield and may therefore involve approximations too crude to make it a reliable procedure. This remains a topic of further research.

A final complication relates to the fact that the symmetry of the original system may be different from that of the (isolated) embedded QM system. This means that the forces (i.e., the first derivatives of the total energy with respect to the atomic positions) from the quantum mechanical calculation will break the symmetry. In this case, one can consider a symmetrization of those forces to maintain the symmetry of the original system. It leads, however, to an inconsistency between the energy and its derivatives and, therefore, has not been adopted.


"Surface" Effects

Due to the presence of link atoms, there will be additional, "spurious" forces on the atoms of the QM system. They can be corrected, at least partially, by obtaining those forces from a separate molecular mechanics calculation on the isolated QM system, (i.e., including the link atoms). If the forcefield employed is accurate enough, you can expect that the forces due to the link/capping atoms can be balanced.

Furthermore, the presence of link/capping atoms inevitably leads to additional Coulomb interactions between embedded QM system and environment if method (2) is followed (i.e., if the electron density of the molecular model is allowed to become polarized by the environment). Then, contributions to the Coulomb interaction that stem from those atoms in the environment which are close to the link atoms of the embedded QM system (i.e., those atoms that, in the original structure, are forming bonds to atoms of the embedded QM system) can be excluded. This measure, in turn, affects the total charge of the environment and therefore the molecular orbital energies, and calls for an adjustment of the remaining point charges.

This adjustment can be accomplished by shifting the point charges from the atoms under consideration towards those atoms of the environment that form bonds with them. The charge increments eventually have to be modified if you want to account for the effective partial charge already carried by the link atom. For obvious reasons, such a procedure (an example of which is given below) is somewhat arbitrary.


Example (Zeolites): Shifting of Point Charges

Assume that the Si-O bond in a Si-O-Si linkage within a zeolite is replaced by an
O-H bond, yielding an Si-O-H group within the embedded QM system. The charge on the silicon atom originally bonded to the oxygen atom would have to be shifted towards the remaining three oxygen atoms bonded to it, yielding a charge increment of q(Si)/3 for each of them. If it is assumed that the hydrogen atom already carries a quarter of the silicon charge q(Si), however, the charge increment would have to be reduced to q(Si)/4.




Last updated October 06, 1997 at 10:26PM PDT.
Copyright © 1997, Molecular Simulations, Inc. All rights reserved.