Analytical Calculation of Geometrical Derivatives in Molecular Electronic Structure Theory
TRYGVE HELGAKER and POUL JORGENSEN
Department of Chemistry Aarhus University DK-8000 Aarhus C , Denmark
I. Introduction
A. Perspective
The development of and computational implementation of analytical expressions for the derivatives of ab initio electronic energy surfaces and molecular properties have undergone rapid growth in recent years. This growth reflects the central role these derivatives play in understanding chemical reactions and in interpreting many spectroscopic experiments. Low- order derivatives are now calculated analytically in a routine fashion for a variety of wave functions, greatly increasing the amount of information that can be conveniently extracted from approximate electronic wave func- tions. The purpose of this article is to review the techniques currently available for the analytical calculation of molecular energy and property derivatives.
The analytical calculation of energy derivatives has recently been reviewed by Pulay (1987) and Gaw and Handy (1984), and the analytical calculation of property derivatives by Amos (1987). The calculation of derivatives has traditionally been expressed in a generalized Fock-operator formalism, and this approach is also taken by’the above authors. This makes it easier to look at developments in a historical perspective. However, the Fock-operator technique suffers from some disadvantages. One problem is the use of redundant variational parameters, requiring a complicated handling of Lagrange multipliers. Another disadvantage is the lack of separation between the effects of geometry-dependent basis sets (variable metrics) and the physical relaxation of the wave function in the presence of a geometrical perturbation. Because of this the underlying simplicity of the derivative
I83
Copyright 0 1988 by Academic Press, Inc All nghts of reproduction in any form reserved ADVANCES I N QUANTUM CHEMISTRY, VOLUME 19
I84 Trygve Helgaker and P o d Jargensen
expressions become obscured, and generalizations to time-dependent geo- metrical derivatives (e.g., geometrical derivatives of frequency-dependent polarizabilities) become difficult if not impossible.
The response function approach (Jergensen and Simons, 1981) to the calculation of geometrical derivatives does not suffer from these disadvan- tages. In this formalism, wave functions are parametrized by exponential transformations of a reference state. This ensures that only independent variations are considered, eliminating the need for Lagrange multipliers.
Furthermore, by expressing the Hamiltonian in the language of second quantization, the geometry dependence of the basis set can be isolated in the Hamiltonian and incorporated in the perturbation operator describing the nuclear distortion. In this way the geometry dependence of the basis set is treated in the same way for all wave functions and can be considered before the choice of wave function is made. Also, since the geometry dependence is incorporated in the Hamiltonian, a geometrical distortion may be treated as an ordinary external perturbation in the same way as, for example, electric field perturbations. The extension of the analytical calculation of geometrical derivatives to time-dependent molecular properties therefore becomes straightforward. However, this topic is outside the scope of the present review.
[For details about time-dependent properties, see Olsen and Jqrgensen (1985).] The use of independent variables and the isolation of the geometry dependence of the basis set in the Hamiltonian also make it easy to see structural similarities and differences between various wave functions, and hence to recognize their advantages and disadvantages. Also, the similarities between wave function optimizations and derivative calculations become apparent when the response function approach is used.
The response function approach has been used for many years to describe molecular properties where the perturbations leave the metric unchanged (the basis set remains fixed in space). The use of response functions to calculate geometrical derivatives was initiated by Jqrgensen and Simons (1983) and independently described by Helgaker and Almlof (1984). Since then many developments have been made using this approach. However, this progress has received little attention in the recent reviews by Gaw and Handy (1984), Pulay (1987), and Amos (1 987). Since the response function approach offers a very direct insight into the calculation of geometrical derivatives, we feel that a review which focuses on this approach is appropriate and may help to clarify some of the confusion in this field, and may also provide the unfamiliar reader with a simple introduction to the calculation of geometrical derivatives in quantum chemistry.
The emphasis in this review is on theoretical and technical aspects. We set up a mathematical framework within which the expressions for the various derivatives can be conveniently derived, and discuss efficient computer
Geometrical Derivatives in Molecular Electronic Theory I85
implementations of these expressions. There is no discussion of the relation- ship between the calculated numbers and observed quantities, or of the accuracy of the various wave function approximations. For this we refer to the reviews by Fogarasi and Pulay (1984), Hess et al. (1986), Amos (1987), and Schlegel(l987). Also, little attention is given to the important field of integral evaluation, which is well covered by Saunders (1983) and Gaw and Handy (1984).
B. Outline
with total energy We consider a molecular system at a reference geometry X, in state IX,)
WXO) =
(xol~(xo)lx,>
(1)where H(X,) is the Hamiltonian. The total energy at a displaced geometry X = X ,
+
p may be expanded around X, (see Appendix A for notation)WXO
+
PI = (XO+
PIH(X0+
P)lXO+
P )- (2)
- W'O' + pW'" + tppW'2' + 1 &ppW'3)
+ ...
where W'O) is the total energy at X,, W ( ' ) is the molecular gradient, W(') is the molecular Hessian, and W ( 3 ) is the first molecular anharmonicity. In this review we discuss the analytical evaluation of these derivatives when IX,) is approximated by multiconfiguration self-consistent field (MCSCF), con- figuration interaction (CI), coupled-cluster (CC), and MZller-Plesset per- turbation theory (MPPT) wave functions. Most attention is given to the lower derivatives, as efficient computer implementations exist only in those cases.
The total energy has an explicit geometry dependence in the nuclear- electron and nuclear-nuclear interaction terms, and an implicit geometry dependence in the wave function. In approximate calculations where finite nuclear-fixed basis sets are used, the total energy has an explicit dependence aIso in the basis set. Using the technique of second quantization, the ge- ometry dependence of the basis set may be transferred to the Hamiltonian.
In Section I1 we describe how the Hamiltonian at X ,
+
p may be expanded around X,H(X,
+
p) = H(0)+
pH'"+
1 2 PP H'2'+
ipppH'3'+ ...
(3) where the coefficients H(") contain all the geometry dependence originating from the use of the finite basis set. In some approaches (e.g., limited CI and CC) the molecular orbitals are determined in H F or MCSCF calculations prior to the calculation of the wave function. This orbital preoptimization creates an additional geometry dependence in the basis set, which is also transferred to the Hamiltonian.I86 Trygve Helgaker and PouI Jsrgensen
In Sections III-VI the nuclear displacement operators H(") are treated as perturbation operators and response function theory is used to determine the implicit geometry dependence of the wave function to each order in the nuclear displacement:
Ix, +
p ) = IXo)(o)+
p1X0)(')+
+ppp'o)(2)+
+pppI~,)(~)+ ...
(4)By combining the order-by-order expansions of the Hamiltonian and the wave function we identify the geometrical derivatives and discuss computer implementations of these expressions.
In Section VII we describe how expressions for geometrical derivatives of molecular properties may be derived using the formalism developed for energy derivatives. We also discuss alternative definitions that may be used to determine geometrical derivatives of molecular properties for wave functions which do not satisfy the Hellmann-Feynman relationship for the property in question. Finally, in Section VIII we describe how translational and rota- tional symmetries may be used to reduce the cost of derivative calculations.
11. Hamiltonian Expansion
A. Hamiltonian at the Reference Geometry
The basic elements of the second-quantization formalism are the anni- hilation and creation operators (Linderberg and o h m , 1973). The annihi- lation operator up annihilates an electron in orbital
4p
(which we assume real), while the creation operator uf (the conjugate of up) creates an electron in orbital $ p . These operators satisfy the anticommutation relationsif the orbitals q$, are orthonormal. Omitting the nuclear-nuclear repulsion part the Hamiltonian is
where
are generators of the unitary group and
Geometrical Derivatives in Molecular Electronic Theory I87
In Eq. (9) the summation is over spin. In atomic units the one- and two- electron Hamiltonian integrals are obtained by integration over electronic coordinates according to
hp, =
<4p,ct1)1h(r1)l 4,(tl))
(11)(12)
Y p q r s =
<4p,ct, )4,,ctl )I ti:
I4r(t2)4,(52)>where
t1
denotes the coordinates of electron 1 andtlZ
=It1
-t21.
The one-electron operator h ( l ) is given by
1 Z A
h(5) = - - V: -
1
2 A - xAl
where ZA denotes the charge and XA the position of nucleus A. In the linear combination of atomic orbitals (LCAO) approach the orbitals
dP
areexpanded in a set of nonorthogonal atomic orbitals
xP
fixed on the atomic nuclei:4 P
=c
C P P X PThe integrals are calculated in terms of the atomic orbitals (AOs) and are subsequently transformed to the orthonormal basis. In some cases it may be more efficient to evaluate the expressions in the nonorthogonal A 0 basis. We return to this problem when we consider the calculation of the individual geometry derivatives. For the time being we assume that the Hamiltonian is expressed in the orthonormal molecular orbital (MO) basis. The second- quantized Hamiltonian [Eq. (S)] is a projection of the full Hamiltonian onto the space spanned by the molecular orbitals
&,
i.e., the space in which calculations are carried out.B. Orbital Connections
So far we have considered the Hamiltonian at one geometry, as appro- priate for single-point calculations. However, if we wish to calculate the derivatives of the energy with respect to variations in the geometry, we must also consider the geometry dependence of the Hamiltonian. This introduces certain complications, which are treated in the remainder of this section.
The Hamiltonian integrals depend on the molecular geometry in two ways.
The first is trivial and arises because the Coulomb interactions between the electrons and the nuclei depend on the geometry. The second is more complicated and arises because the orbitals are themselves functions of the geometry. The reason for this is that the MOs are expanded in a finite set of AOs fixed on the nuclear centers. A consequence of using a finite set of AOs is that we are presented with a different basis set at each geometry.
I88 Trygve Helgaker and Pod Jargensen
To study the geometry dependence of the Hamiltonian in any detail, we must first find a way to relate or “connect” Orbitals at neighboring geometries X . In other words we must establish a one-to-one correspondence between orbitals at neighboring points. Rules that accomplish this are called orbital connections (Helgaker, 1986). In the LCAO approach the situation may be pictured as follows. At each geometry we have a set of AOs from which an infinite set of orthogonally equivalent orbital bases can be constructed. As the geometry changes we must pick out exactly one of these orbital bases at each geometry X . In this way an orthogonal orbital connection is established.
(A connection is called orthogonal if it preserves orthonormality between the orbitals.) We further require that the connection is continuous and differentiable.
One may also wish to impose an additional requirement on the connec- tion, namely that it is translationally and rotationally invariant. This may seem to be a trivial requirement. However, a connection is conveniently de- fined in terms of atomic Cartesian displacements rather than in terms of a set of nonredundant internal coordinates. This implies that each molecular geometry may be described in an infinite number of translationally and rotationally equivalent ways. The corresponding connections may be differ- ent and therefore not translationally and rotationally invariant. In other words, the orbital basis is not necessarily uniquely determined by the internal coordinates when the connections are defined in terms of Cartesian coordi- nates. Conversely, a rotationally invariant connection picks up the same basis set regardless of how the rotation is carried out and so the basis is uniquely defined by the internal coordinates. [For a discussion of translationally and rotationally invariant connections, see Carlacci and McIver (1 986).]
In cases where fully orbital-optimized wave functions are used [e.g., MCSCF or AGP (antisymmetrized geminal power)] it does not matter what particular orbital basis is chosen at a given geometry as long as the connec- tion is orthogonal. For example, at any geometry an MCSCF wave function can be constructed from an arbitrary orthonormal basis by parametrizing the MCSCF state in terms of exponential orthogonal configuration and or- bital transformations of a nonoptimized reference state [see Eq. (49)]. The parameters of the orthogonal transformations are determined from the vari- ational principle. Usually we change the orbital basis in each iteration of MCSCF wave function optimizations, and the final MCSCF state is expressed in a basis for which the exponential orthogonal transformation corresponds to the identity transformation. In principle, however, we can carry out an MCSCF calculation at any geometry using an arbitrary set of orthonormal orbitals (Olsen et al., 1983). From this it follows that we are free to choose whatever orthogonal connection we find most convenient.
Geometrical Derivatives in Molecular Electronic Theory I89
In certain cases (e.g., limited CI and CC) the wave function at each geometry is expressed in terms of orbitals that are defined by some procedure independent of the actual wave function. This puts an extra constraint on our connection, namely that in addition to being orthonormal it must also connect these predefined orbitals.
C. Geometry Dependence of the Hamiltonian MOs are determined as linear combinations of AOs:
First
X
= X o is chosen as the reference geometry and a set of orthonormalwhere integration is over the electronic coordinates
5.
When the geometry changes the unmodified molecular orbitals4p(X)
(UMOs) are no longer orthonormal:Provided the overlap matrix S ( X ) is nonsingular we can define a set of orthonormal molecular orbitals (OMOs)
$ ( X ) = u ( X ) s - ’ ’ 2 ( x ) 4 ( X ) (18) where Sp’i2(X) ensures that the connection is orthogonal (Helgaker and Almlof, 1984) and the orthogonal matrix U ( X ) makes the connection general (Helgaker, 1986) so that we can pick up any set of orthonormal orbitals, for example, a set of orbitals determined in an optimization prior to the wave function calculation. In the MCSCF case we may set U ( X ) equal to the identity matrix, as any orthogonal connection can be used, and the OMOs
$ ( X ) are then defined completely in terms of the unmodified molecular orbitals through the S-’i2 matrix. A parametrization equivalent to Eq. (18) to be used in a Fock-operator formalism has been given by King et al. (1984).
Using the OMOs we may construct a Hamiltonian which is defined at all geometries (J$rgensen and Simons, 1983; Helgaker and Almlof, 1984; Simons et al., 1984)
where the geometry dependence of E J X ) and e,,,(X> will be discussed later.
I90 Trygve Helgaker and Poul brgensen
The integrals which appear in this expression are given in terms of the UMOs as
$ ( X ) p q r s =
1
gallyS(X)Ypa(X)Yq/l(X)Y,Y(X) G ( X ) (21)a/lyd
where we have introduced the connection matrix Y ( X )
Y ( X ) =
u(x)s-"'(X)
(22)(We use indices pqrs and a/?$ for general molecular orbitals and p v k ~ for atomic orbitals.) The UMO integrals in Eqs. (20) and (21) are given as [compare the expression for the overlap matrix, Eq. (17)]
Note that in these integrals the geometry dependence is isolated in the atomic integrals.
We are now ready to discuss the geometry expansion of the Hamiltonian, Eq. (19). Both the integrals and the operators in this Hamiltonian are geometry dependent. In particular the geometry dependence of the operators Epq(X) and epqrs(X) arises due to the geometry dependence of the orbitals. However, for our purposes we may treat these operators as geometry independent (Jgrgen- sen and Simons, 1983; Helgaker and Almlof, 1984). This can be seen in the following way.
The creation and annihilation operators always appear in expectation and transition densities such as ( p l a ~ a ~ a , a , l v ) where 111) and Iv) are electronic configurations, i.e., space- and/or spin-symmetrized ordered products of creation operators working on the vacuum state. In the simplest case we have
where the vacuum state is characterized by
a,lvac) = 0 (26)
The element (plup+aq+a,a,Jv) may be written as a vacuum expectation value of a string of creation and annihilation operators such as (vaclap,ap2...ap,ap+aq+asa,a:, ... a:2a:,lvac). According to Wick's theorem such vacuum expectation values can be expressed as a sum over all totally contracted terms, each of which depends only on the overlap between the orbitals. Since these are Spq for orthonormal orbitals, we see that the vacuum
Geometrical Derivatives in Molecular Electronic Theory 191
expectation values are geometry independent, provided the orbitals are orthonormal at all geometries. Since we use orthogonal connections this condition is fulfilled, and we may neglect the geometry dependence of the creation and annihilation operators in our derivations.
We may now expand the Hamiltonian in geometrical distortions about the reference geometry
where, following the discussion above, each term has the form
H ( X ,
+
p ) = H'O'+
pH'"+
3ppH'Z'+
4pppH'3'+ ...
(27)The geometry dependence of the Hamiltonian is isolated in the integrals. It only remains to determine the explicit form of
h"!'
andg"!jrs.
If in Eq. (22) we introduce the matrices
Q = I n U (29)
R = In ,'-'I2 (30)
where Q is antisymmetric and R symmetric, the Hamiltonian integrals can be written in an expansion in Q and R reminiscent of the Baker-Campbell- Hausdorff (BCH) commutator expansion for operators (Simons et al., 1984).
If we let I denote either the one- or the two-electron integrals, the O M 0 integrals may be written as
I"
= I+
{Q+
R, I }+
t{Q, Q, I }+
{Q, { R , I } }+
t { R , R, I }+
b{Q, Q,Q,
11+
t{Q, Q, { R , I > >+ 3{Q7
{ R , R,11)
+
+ { R , R, R, I }+ ...
(31)where we have used the notation for one-index transformations given in Appendix B, for example
From the BCH expansion one may arrive at the expressions for the derivatives of the integrals
I"
r"
= r"(0' + pr"'l) + +ppI"'2' + $pppI"'3' +.
. . (33)(34) (35) (36) by expanding
I = 1'0) + p l ( l ) + L 2 PP 1'2) +
& p p p p )
+. . .
Q = pQ(')
+
+ppQ("+
dpppQ'3'+ . . .
R = pR'"
+
9ppR"'+
4 ~ p p R ' ~ '+ .
. .I92 Trygve Helgaker and P o d Jargensen
where Q(') and R(O) are zero, as the O M 0 integrals are identical to the UMO integrals at the reference geometry. Identifying terms to same order we obtain (37)
} (38)
+ {Q(l), Q(1), 1 ( 0 ) } + {R(1), R('), I ( 0 ) }
+
2{Q('),{W, Po) 11
(39)+
{Q(')+
R(3), Z(')}+
3{Q('), Q('), 1"))+
6{Q('), {R('), 1(')1) +
3{R"', z(1) }+
3{Q('), Q(2), Z(')}+
3{Q('), {R(2), 1")>>
+
3{Q(2), {R"),Po)
}}+
3{R"',P), Po) 1
+
3 { Q(l), Q('), {R('), 1") }}+
3{Q('), {R('), R('), I(')I>
I(0) I = I(0)
I 1 ( 1 ) = 1 ( 1 ) + {Q") + R(1), 1 ' 0 )
1 ( 2 ) = I ( 2 ) + 2{Q(1) + R(1), 1 ( 1 ) } + { Q ( 2 )
+
R(2), Z(O)}I
1'3)
-
= 1'3) + 3 { ~ ( 1 ) + ~ ( 1 )p ) }
+ 3 { ~ ( 2 ) + ~ ( 2 ) , 1'1))+ {Q('), Q('), Q(1), 1 ( 0 ) } + {R(1), R"), R"), I(0) }
(40) For (MC) SCF wave functions it is not necessary to include Q in the above expressions. The correct expressions are then obtained simply by omitting all terms containing derivatives of Q.
To calculate the above integrals one must know Q(") and R("). The derivatives of the rotation matrix Q are obtained by solving the appropriate response equations, as discussed in later sections. In some cases the explicit evaluation of Q(") can be avoided by using the technique of Handy and Schaefer (1984). The expressions for R(") in terms of the derivatives of the overlap matrix may be obtained by expanding R as
R = -$A + $A2 -&A3 +
...
(41)where A is defined as
S = l + A (42)
By expanding A in powers of p and identifying terms to same order, we obtain (Helgaker and Almlof, 1984)
(43) (44) (45)
R(1) = - L s ( 1 ) 2
R'2' = -LS(2) + +S"'S"'
R(3) = - tS'3' + 3 p S ' 2 ' + 3S(Z)S(l) 4 - S(')S(')S(1)
2
where S(") can be calculated using standard techniques for calculation of atomic orbital integrals.
Geometrical Derivatives in Molecular Electronic Theory I93
D. Calculation of Atomic Orbital Integrals
It is not our purpose to discuss in detail methods for evaluation of molecular integrals. At present two techniques are in general use-the Rys quadrature method introduced by Dupuis et al. (1976), and the incomplete gamma function method as formulated by McMurchie and Davidson (1978).
Efficient computer codes for calculating derivative integrals exist using both techniques. For example, the GAUSSIAN 86 (Frisch et al., 1984), HONDO (Dupuis et al., 1976), GRADSCF (Komornicki, 1980), and CADPAC (Amos, 1984a) programs all use the Rys quadrature method, while ABACUS (Helgaker et d., 1986a) uses the McMurchie-Davidson scheme. Neither method appears to be superior. [For reviews and comparisons of the two approaches, see Saunders (1983) and Hegarty and van der Velde (1983).] A good discussion of integral evaluation with emphasis on derivatives has been given by Gaw and Handy (1984). Recently, an interesting new recursive scheme for calculation of integrals has been presented by Obara and Saika (1986). However, too little experience is available to judge the merits of the Obara- Saika scheme.
111. Derivatives from Multiconfiguration Self-Consistent-Field Wave Functions
A. MCSCF Electronic Energy
We assume that an MCSCF calculation has been carried out at the reference geometry X , and denote the MCSCF state at this geometry by
I
MC). The MCSCF state may be expressed as IMC) = CCYcIv)V
where CMC are the expansion coefficients of the determinants
where { u : } refer to the orthonormal set of MCSCF orbitals {I)~}. The orthogonal complement set of states { I k ) } to IMC) may be written as
where the coefficients C = {CMC, C k } form an orthogonal matrix. The MCSCF state IMC(p)) at a displaced geometry X ,
+
p may be obtained by an orthogonal transformation of the state IMC) at the reference geometry IMC(p)) = ~ x P ( - K ) exp(-P)IMC) (49)I94 Trygve Helgaker and Poul Jmrgensen
r
R k ) ]
describes the orthogonal transformation in (Jqjrgensen and Linderberg, 1970; Levy, 1970)
r
the configuration space and
(51)
1
~xP(-K) = ~ X P
1
- r > s1
Krs(Ers - E s r )describes the orthogonal transformation in the orbital space. Only nonre- dundant orbital rotations (Dalgaard and JBrgensen, 1978; Hoffmann et al., 1984) are included in K . With the ansatz of Eq. (49) we may express an arbitrary state spanned by the orbital and configuration space in terms of a particular set of nonredundant orbital {qS} and state {pk} parameters. One of these sets represents the optimized state at the displaced geometry (Jqjrgensen and Simons, 1983; Helgaker and Almiof, 1984).
The total energy of the MCSCF state at the displaced geometry is
WXO
+
K K > P ) = (MC(dIH(X0+
P)IMC(P))= (MCI exp(P) exp(lc)H(Xo
+
p) exp(-ic) exp(-P)IMC) (52) where the orbital connection of the Hamiltonian contains only finite basis reorthonormalization contributions. The variational parameters { K , , ~ ; r > s>and {pk; k # MC}, and the operators E + = {&; r > s} and R + = { R ; ; k # MC} may be arranged as column vectors
A
=(a)
/2 =
("'
R + - R -")
(53) (54) Inserting the Hamiltonian expansion [Eq. (27)] in Eq. (52) and using the BCH expansion, we obtain
W(X0
+
p, 3") = E'O'+
pE'"+
ippE'2)+
dpppE'3'Geometrical Derivatives in Molecular Electronic Theory I95
where
f ( ' ) = (MCICA, H("]IMC) G(') = (MCICA, A, H(')](MC) are the electronic energy, gradient, and Hessian, and where
K(') = (MC( [A, A, A, H"']
1
MC)L(') = (MCI[A, A, A, A, H"']IMC)
(59) (60) are the third and fourth electronic derivatives. In Eqs. (58)-(60) we have used the symmetric commutators (Olsen et al., 1983). The gradient f " ) is zero since the MCSCF energy is stationary at X o .
B. MCSCF Response Equations
The geometry dependence of the variational parameters 1 is determined from the requirement that the total energy in Eq. (55) must be stationary in the perturbed field:
dW(Xo
+
p, I)/d1 = 0 (61)We obtain
- G'O),?. = pf(l) + + ipppf(3) + pG(')A + $ppG(2)1
+
*K(O'11+
+pK(l'IA+
iL(0)1lb1+ . .
' (62) The variational parameters may be expanded in a power series in the nuclear displacement1" = p1'"
+
+ p p p+
+ppp1(3)+ . . .
(63) where the zeroth-order term vanishes as the MCSCF state IMC) is opti- mized at X o . Solving Eq. (62) to each order in p gives(64) (65)
- G ( O ) A ( l ) =
f
(1)- ~ ( 0 ) ) ~ ( 2 ) = f ( 2 ) + 2 ~ ( 1 ) ~ ( 1 ) + K ( O ) A ( ~ ) ) * ( ~ ) - ~ ( 0 ) 1 ( 3 ) = f ( 3 ) + 3~(1)1(2) + 3~(2);1(1)
+ 3 ~ ( 0 ) ~ ( 2 ) 1 ( 1 ) + 3 ~ ( 1 ) ~ ( 1 ) ~ ( 1 )
+ L(o))*(1)]"(1)1(1)
These equations describe how the MCSCF state responds through first, second, and third orders to a nuclear displacement.
The first-order MCSCF response equations were first derived by Dalgaard and Jprrgensen (1 978). For geometrical perturbations these equations were derived by Osamura et al. (1982a) using a Fock-operator approach.
(66)
I96 Trygve Helgaker and Poul Jsrgensen
C. MCSCF Energy Derivatives
Inserting the power-series expansion of
A
[Eq. (63)] in the total energy expression in Eq. (55) allows us to identify the geometrical derivatives for the MCSCF wave function as(67
1
w(1) = E(1)
In the identification of W ( 3 ) we have used the first-order response equations [Eq. (64)] to eliminate A(2). From Eqs. (68) and (69) we see that the first-order correction to the wave function determines the energy through third order. In general the nth-order response of the wave function determines the energy through order 2n
-+
1.The MCSCF gradient expression was first given by Pulay (1977). The MCSCF Hessian and first anharmonicity expressions were derived by Pulay (1983) using a Fock-operator approach, and by Jgrgensen and Simons (1983) and Simons and Jgrgensen (1983) using a response function approach.
D. Implementation of MCSCF Derivative Expressions 1. MCSCF Gradients
From Eq. (67) we see that the expression for the molecular gradient is
= (MCIH'l)IMC) (70)
where H") contains the first derivatives of the O M 0 integrals
f(').
An explicit expression for these derivatives is obtained by combining Eqs. (38) and (43) of Section 11. This gives us1
(71)p -
= f(1) - '(S'", ff0)2
since the rotational part of the connection may be omitted for MCSCF wave functions. The Hamiltonian may therefore be written
(72) H(1) = HI11 - '(S'", H[OI)
2
where we have used the square bracket superscript notation as explained in Appendix A, and the notation for operators containing one-index transformed integrals as introduced in Appendix B. Inserting this expression in Eq. (70), we
Geometrical Derivatives in Molecular Electronic Theory I97
obtain
(73) W(1) = Tr Dh(') + 9Tr dg(1) - Tr S(')F[OI
where we have used Eq. (D.4) and the notation for trace operations and density elements as given in Appendix D. Also we have used the identity [Eq. (F.l)]
(MCI(S('), HIO1)IMC) = 2Tr S(')FIO1 (74) Here Frnl is the Fock matrix, which is discussed in detail in Appendix E.
The gradient expression given above is not particularly useful since it appears in the MO basis. Following the discussion in Appendix C about covariant and contravariant representations, we may rewrite the gradient as (75) W ( ' ) = Tr D,,hyA
+
5Tr daogi& - Tr S,, (1) F[o1where ao denotes contravariant and A 0 covariant components. Since all derivative integrals now appear in the covariant A 0 representation in which they are calculated, no integral transformation is required. The MCSCF gradient was first implemented by Goddard et al. (1979) and Kato and Morokuma (1979).
2. MCSCF Hessians contribution
The MCSCF Hessian as given by Eq. (68) contains two terms. The first
E(') = (MCI H(')1 MC) (76)
may be treated in the same way as the molecular gradient (MCIH'''I MC).
The integrals
>>
(77)1 S(Z) -
s'""", p }
+ ${S('', {$I), I'O'I 1 0 ) = I(') - {S(l), I(')
1 -z{
give rise to the operator
H(Z) = H[zl - (S('), H"1) - l(S(2' 2 - S(')S(') H[o1) +
$(st'),
(S('), H[OI)) (78)Taking the expectation value of this operator we obtain
(79)
E ( 2 ) = Tr Dh(2) + "Tr 2 dg(2) - 2Tr S(')F['I
- Tr[(S(2) - S(l)S('))F[oI] + $Tr[S(l)(S(l), F[O1)]
where ($I), F[O1) is the Fock matrix constructed from one-index transformed integrals (see Appendix E). As for the molecular gradient some of the terms are best calculated in the A 0 basis. A convenient working expression is
(80) E(') = Tr D,,hk2:,
+
9Tr d,,ga':, - Tr S,, (2) F[ol- 2Tr S(')F[11 + Tr(S(')S(')F[O]) + +Tr[S(')(S('), F[o])]
I98 Trygve Helgaker and P o d Jsrgensen
in which all second-derivative integrals appear in the covariant A 0 representation.
The evaluation of the second contribution to the MCSCF Hessian requires the construction of f ( l ) and the solution of the linear set of equations [Eq. (6411:
(81) G(O)]*(') + f ( 1 ) = 0
Before discussing the solution of these equations, we consider the evaluation of p .
The orbital components of f(') may be written as [see Eq. (E.2)]
fl."
= 2(F&) - F g ) (82)where F"' is the Fock matrix calculated from the Hamiltonian H ( ' ) of Eq. (72) [see Eq. (E.l)]. Then F"' may be decomposed as
(83) F(" = p 1- '(S"', p o l )
2
where F"] and (S('), FIol) are the Fock matrices constructed from the UMO integrals 1'') and the one-index transformed integrals {$'I,
Po)],
respectively.The calculation of these matrices is described in Appendix E.
The configuration part of f ( ' )
fh"
= (MCJCR; - R,, H"']IMC) = -2(MCIH"'lk) (84) refers to the orthogonal complement set of states { Jk)) and may, for small configuration expansions, be calculated straightforwardly using, for example, the states obtained by diagonalizing the Hamiltonian matrix. For larger con- figuration spaces it is convenient to calculatef r )
from the configuration space equivalent of Eq. (84) [(MClH("lv)] (Lengsfield and Liu, 1981; Lengsfield, 1982). This requires no further consideration if all vectors multiplied on f(') are orthogonal to (MC). If this is not the case the IMC) component of Eq. (84) must be projected outf;'' = -2(vIH"'IMC)
+
2C~C(MCIH"'IMC) ( 8 5 ) Note that (MC1H")IMC) is the molecular gradient. The configuration partof
f""
may thus be evaluated in the configuration basis requiring and{ S('), Z(')} in the M O basis with all four indices active.
For small configuration expansions the linear equations [Eq. (Sl)] can be solved by constructing the MCSCF electronic Hessian G(O) in an explicit representation of the orthogonal complement. For larger expansions G'O) cannot be constructed explicitly. The solution must then be determined by an iterative method. In the algorithm described below new trial vectors are obtained using a modification of the conjugate-gradient algorithm (Hestenes, 1980).
Geometrical Derivatives in Molecular Electronic Theory ' 199
In an iterative scheme linear transformations are carried out on trial vec- tors using Go) as transformation matrix:
a, = G(0)bi (86)
Assume that in the nth iteration we have generated n orthonormal trial vectors orthogonal to
I
MC){bi, b,,
.
. ., b,)@I, a,,
. .
., a"}GRIT
+
f " = 0(87) and calculated the transformed vectors
(88) The reduced linear equations
(89) where
determine the best solution I" in the basis of the trial vectors. To obtain a new trial vector we divide the residual
8'') = (;(o)(CbiAy)
+
f ( l ) = CaiAF i+
f ( l ) (921 by the diagonal Hessian elements, orthonormalize the resulting vector againstI
MC) and the previous trial vectors, and add it to the basis of trial vectors. The iterative procedure is continued until the residual is smaller than a preset threshold.For large configuration expansions the linear transformation Eq. (86) is the time-consuming step in the iterative scheme. For convenience we write the linear transformation as
where superscripts c and o denote the configuration and orbital blocks, respectively. After some simple algebra the working equations are seen to be (Olsen et al., 1983)
~ G ( , 9 ? b V = 2((pIH"'IB) - b,E'O') (94) (95)
Y
CGg!vbv = -2((MCI[E,,, H'O']lB)
+
(BI[E,,, H'O'IIMC))V
200 Trygve Helgaker and Poul Jnrgensen
where
A linear transformation of a configuration vector '6 thus requires the construction of a configuration gradient with IB) as the reference state [Eq. (94)], and the construction of an orbital gradient with a symmetric tran- sition density matrix [Eq. (95)]. A linear transformation on an orbital vector
"6 requires the construction of a configuration gradient [Eq. (96)] and an orbital gradient [Eq, (97)] from the one-index transformed Hamiltonian K . A straightforward implementation of the above algorithm converges slowly mainly because of large off-diagonal elements in the orbital-orbital block OOG(of of the Hessian. The following three techniques can be used to speed up convergence (Helgaker et ul., 1986a).
1. Solve for several or all Cartesian displacements simultaneously. Sharing trial vectors reduces the number of vectors that must be generated.
2. Split the trial vectors into two sets, one which spans the orbital part of the solution only and one which spans the configuration part only. Usually a large number of orbital trial vectors is needed while the configuration vectors converge much faster.
3. For the current approximate configuration part of the solution vector, find the exact solution of the orbital part. This reduces the number of configuration gradients (p1 KIMC) [Eq. (96)] that must be calculated.
The relaxation contribution to the MCSCF molecular Hessian
( 100) -j(l)(G(o))-lj(l) = f ( l ) A ( l )
may be evaluated straightforwardly as written if the exact solution of the response equation is known. If the solution vectors are determined iteratively to a preset residual norm then errors in Eq. (100) will be linear in the residual error. However, Eq. (100) may be written in a different form, which (at essentially no extra cost) gives errors quadratic in the residual error (Sellers, 1986). We refer to Appendix G for a detailed discussion of linear and quadratic errors.
The first implementations of general MCSCF Hessians were reported by Hoffmann et ul. (1984) and shortly after by Page et al. (1984a). A program
Geometrical Derivatives in Molecular Electronic Theory 20 I
capable of handling large configuration spaces using the techniques described in this section has been presented by Helgaker et al. (1986a).
3. MCSCF First Anharmonicities first term may be written as
The first anharmonicity [Eq. (69)] contains four terms. Using Eq. (40) the
= (MCIH[31
+
3(R'", HC2I)+
3(R'", H"])+
3(R'", R'", H"])+
(R'3', H[O')+ 3 ( ~ ( 1 ) , RW, ~ 1 0 1 ) + ( ~ ( i ) , ~ ( i ) , R"),H[OI )IMQ (101) where I?('), R(2), and R(3) [see Eqs. (43)-(45)] are easily calculated. Using the same techniques as for the gradient and the Hessian we obtain
E ( 3 ) = Tr D,,ha":,
+
$Tr d,,ga":,+
6Tr RLi'Fa':,+
6Tr R(2)F[11+
6Tr[R(')(R('), F['])]+
2Tr R'3)F[01+
3Tr[R(')(R(2), Fro])]+
3Tr[R(2)(R('), F["])]+
2Tr[R"'(R(", (R('), FIO1))] ( 102) The terms containing second and third derivative integrals may be evaluated in the A 0 basis, while those containing first derivative integrals require two general and two active indices in the MO basis.The second term in Eq. (69) is f(2)21), where the electronic gradient f(') is constructed from the Hamiltonian
H'2' = H[21 - ($I), Jpl) -
yp',
f p O 1 )+ $(S'", (S"' 2 HIOI)) + '(S'"S'1) 2 9H[OI )
2
(103) The contributions to f " ) from H['I and HLo1 may be easily evaluated if Z(') are available in the MO basis with two general and two active indices, and Z'O)
with three general and one active indices. The contribution from H121 may be calculated in the A 0 basis, as will now be discussed. Using Eqs. (D.7) and (F.5), we obtain
f121$1) = Cf[21C;l'l) + O f [ 2 1 0 ~ ( 1 )
(104)
= -2Tr Dtha":, - Tr d,,g,, P (2)
+
2Tr r c ~ ' , ) F ~ ~ wherek
The evaluation of f ( 2 ) A ( 1 ) thus requires the same differentiated integrals in the MO basis as does the evaluation of E(3).
202 Trygve Helgaker and Poul Jsrgensen
The third term in Eq. (69) should be calculated carrying out the linear transformation G(')I.('). [The reason this and similar terms discussed later should be calculated using linear transformations is that we only need to know the projection of G(') onto the solution vectors
A('),
and not Gi')itself. Constructing G(') explicitly constitutes a much heavier task than carrying out linear transformations in a direct fashion.] The transforma- tion G(')A(') is identical to G'O'b described in Eqs. (94)-(97), except that the 1") integrals are replaced by ?(I). The linear transformation requires and { S ( ' ) , I(')) with two general and two active indices. The requirements on the integrals are therefore the same as in the calculation of E ( 3 ) .
The evaluation of the last term in Eq. (69) requires linear transformations K(o)%(1),4i1) to be carried out. Such transformations were described by Olsen et al. (1982) for small configuration spaces with an explicit representation of the orthogonal complement space in connection with cubically convergent MCSCF optimization techniques. For larger configuration spaces it is better to work in the configuration-based representation given by Olsen and Jergensen (1 985).
To summarize, the first anharmonicity may be evaluated for an MCSCF wave function with second and third derivative integrals in the A 0 basis, first derivative integrals in the MO basis with two general and two active indices, and undifferentiated integrals with three general and one active indices.
E. Hartree-Fock Derivatives
Some simplifications occur in the MCSCF derivative expressions for single-configuration self-consistent field wave functions. The most important is that the calculations can be carried out in the A 0 basis as discussed below.
After omitting all contributions to the MCSCF derivatives which arise from variations in the configuration space, the remaining terms either already appear in the A 0 basis (e.g., all terms (MCIH["IIMC)), or they are expressed in terms of Fock matrices. In the MCSCF case these matrices are partly calculated in the MO basis. In contrast, SCF Fock matrices (which may be calculated from the inactive Fock matrices) are straightforwardly calculated in the A 0 basis, as described in Appendix E. (This is also true for SCF Fock matrices constructed from multiply one-index transformed integrals.) This implies that SCF derivatives can be calculated completely in the A 0 basis. We give some examples to make this point clear.
Consider first the SCF response equations [Eq. (64)]. The construction of f ( ' ) [see Eqs. (82) and (83)] requires the calculation of iF['l and ($'I, iF[ol) 9 i.e .7
the calculation of inactive Fock matrices from differentiated integrals and from one-index transformed integrals. Both matrices may be calculated in the A 0 basis (see Appendix E). Furthermore, the solution of the response equations requires linear transformations of orbital trial vectors. Equation
Geometrical Derivatives in Molecular Electronic Theory 203
(97) shows how the transformed vectors may be obtained from one-index transformed Fock matrices. We conclude that the SCF response equations may be solved with no reference to MO integrals (Bacskay, 1981).
The evaluation of E(") requires the construction of Fock matrices from multiply one-index transformed integrals. For example, contains the term (HFI(R('), R('), R('), H[O1)(HF) = 2Tr[R(')(R('), (R('), 'F["]))] (106) which requires the calculation of inactive Fock matrices from integrals which have been one-index transformed twice. These matrices may be calculated in the A 0 basis as described in Eq. (E.17).
All contributions to the molecular derivatives involving the higher electronic derivatives [Eqs. (59) and (60)] may be treated as direct linear transformations and calculated in terms of inactive Fock matrices containing multiply one-index transformed integrals. For example, the first anharmonic- ity contains the term
K(o)A(l)A(l)A(l) = (HFI [I\, I\, I\, H(O)] IHF)A(')~(')~(')
= (HF1[dl), K ( ' ) ,
d'),
H'O'IIHF)= (HFI(d",
d'), dl),
H'O')IHF)= 2Tr[d')(~c('),
(d'),
'FIO1))] (107) which is calculated in the same way as Eq. (106), i.e., from inactive Fock matrices containing integrals which have been one-index transformed twice.The expression for H F gradients was first derived by Bratoi (1958), and the first practical implementation was reported by Pulay (1969, 1970). H F Hessians were first implemented by Thomsen and Swanstr%m (1973), based on the work by Bratoi (1958) and Gerratt and Mills (1968). The first practical implementation was reported by Pople et al. (1979). Third derivative ex- pressions were derived by Moccia (1970). Further developments and the first implementation of third derivatives were reported by Gaw et al. (1984). Very recently analytical fourth derivatives have been programmed by Gaw and Handy (1 987).
IV. Derivatives from Multireference Configuration Interaction Wave Functions
A. MRCI Electronic Energy
We consider a multireference CI (MRCI) wave function calculated from a set of MCSCF orbitals. The CI reference state is denoted by ICI) at X , and by ICI(,u)) at the displaced geometry X ,
+
1.1. In addition to the reorthonormali- zation part, the MRCI orbital connection contains the MCSCF orbital204 Trygve Helgaker and Pod Jsrgenren
rotation parameters IC. The MCSCF responses K(") through third order are given as the orbital components of the solution vectors
A(")
in Eqs. (64)-(66).The CI state at the displaced geometry may be expressed in terms of the CI state at the undisplaced geometry
I C W > = exp(-P)ICI) ( 108) where
The set of states { I k ) } constitutes an orthonormal basis for the orthogonal complement to ICI).
The total energy of the CI state at the displaced geometry is WXO
+
P? P ) = (CI(PL)IH(Xo+
P)ICI(P))= (CII exp(P)H(X,
+
p) exp(-P)ICI) (1 10) Inserting the Hamiltonian expansion [Eq. (27)] and using the BCH expansion, we obtainW ( X ,
+
p, P ) =Po' +
pE'"+
tppLE'2'+
$pppE'3' (1 11)+
pf'l)P+
$ppf(')P+
SG'o)PP+
tpG"'PP+ . .
. The matrices f @ ) and G") are defined as in Eqs. (57) and (58) but with average values taken with respect to ICI) and with no orbital excitation operators in A [Eq. (109)l. Note that both f ' ' ) and K'O) are zero as ICI) is optimized at X,, it., (CI(H(Xo)Jk) = 0.B. MRCI Response Equations
The geometry dependence of the P parameters is obtained from the requirement that the total energy in Eq. (111) must be stationary in the perturbed field
dW(X*
+
p, P)/dP = pf'l)+
+ppf'2'+
G'O'P+
pG'1'P+
... = 0 (1 12) The parameters P may be expanded in a power series in the nuclear displacement(113) p = pp"' + 1 2 PP p'2) +
. . .
where the zeroth-order term vanishes as the CI state is optimized at X,.
Solving Eq. (1 12) to first and second orders we obtain
(114) (1 15)
- G'O"'' = f ( 1 )
- G'O'P'2' = f"' + 2G'"''
Geometrical Derivatives in Molecular Electronic Theory 205
These equations describe how the CI state responds through first and second orders to a nuclear displacement.
The first-order CI response equations were derived by Jprgensen and Simons (1 983) using a response function approach and reformulated by Fox et al. (1 983) using a Fock-operator approach.
C. MRCI Energy Derivatives
The geometrical derivatives may be identified by inserting the power series expansion of P [Eq. (113)] into the CI energy expression [Eq. (lll)]. We obtain
(116) (1 17) (1 18) W'" = E")
W(2) = E'2' + f(l)P(1)
~ ( 3 ) = ~ ( 3 ) + 3f(2)p(i) + 3 ~ ( i ) p t i ) p i )
The energy through third order is thus determined by the first-order correction to the wave function. In general the nth-order correction to the wave function determines the energy through order (2n
+
1). Comparing these equations with their MCSCF counterparts [Eqs. (67)-(69)], we note their structural similarity. However, the elements entering the CI and MCSCF derivative expressions are defined differently. The orbital connections (and therefore H'")) in both cases contain reorthonormalization contributions, but in the CI case H'") also contains the MCSCF orbital responses. Differences also occur in the definitions of f ( i ) , G(i), and K ( i ) , reflecting the different variational parameters of the two wave functions. We finally note that the K(O) contribution vanishes in the CI third derivative expression, as opposed to the MCSCF case where the many-body orbital operators make this term nonvanishing.The CI gradient expression was derived and implemented by Krishnan et al. (1980) and Brooks et al. (1980). The generalization to MRCI is due to Osamura et al. (1981, 1982a,b). The Hessian expression was derived by J@rgensen and Simons (1 983) and implemented by Fox et al. (1983). Recently, a more efficient implementation has been reported by Lee et al. (1986). MRCI derivative expressions up to fourth order have been derived by Simons et al.
(1984). The introduction of the Handy-Schaefer technique (Handy and Schaefer, 1984) greatly improved the efficiency of CI derivative calculations.
The calculation of CI derivatives within the Fock-operator formalism has recently been reviewed by Osamura et al. (1987).
D. Implementation of MRCI Derivative Expressions 1. MRCI Gradients
Using Eq. (38), the CI gradient [Eq. (1 16)] may be written as
W'l) = (CIIH"] -
+(s('),
H'O')+
( K ' l ) , H'O')ICI) (1 19)206 Trygve Helgaker and Poul Jsrgensen
where K ( ' ) is the first-order MCSCF orbital response [the orbital part of Eq. (64)]. The first two terms may be calculated as in the MCSCF case, i.e., (CIIH"' - +(S'", H'")(CI) = Tr DaohyA
+
+Tr da0&A - Tr SyAFE' (120)The evaluation of the third term may be simplified as described by Handy and Schaefer (1984):
(CI((K(~), H'")ICI) = 2Tr d1)P0)
1121)
- K ( l ) o (0) (0) ( 1 ) - f C 1
= c
f M Cwhere
7:)
is the CI orbital gradient calculated from Z('), and fgL
is the full MCSCF electronic gradient calculated fromI(').
Here ['O' is the solution of the MCSCF response equations associated with the CI orbital gradient "f):The orbital part of [(')f'fC& may be rewritten as [see Eqs. (F.3) and (F.l)] (Rice and Amos, 1985)
0
i
(0)Of
(1) MC - -(Mcl(oc'o', H'")lMC)
-
- -Tr{O[(0), D M C } ~ ( ~ ) - +Tr{Oc(O), d M C } g ( I ) (123) where DMC and dMC are the MCSCF density elements. The configuration part of [(')f'fCA becomes [see Eq. (D.7)] (Shepard, 1987)
c [ ( O N fMc ( 1 ) = -2(MClH(1)l[(o))
(124)
- - - 233 DMCC&(l) - Tr d M C C i ( 1 )
where
=
1 y p l k )
(125)k
and DMCi and dMCC are the appropriately defined transition density matrices.
Combining Eqs. (1 20), (1 23), and (1 24) we find
(126)
} (127)
eff ( 1 ) T~ ~ ( 1 ) ~ e f f A 0 a o
W") = Tr D:Fh'A:,
+
t T r d,, gAO -where we have introduced the effective density elements
D e f f = D - 2 D M C r - { O [ ( o ) , DMC
and similarly for the two-electron densities. The Fock matrix in Eq. (126) is constructed from the effective densities. Hence the CI gradient can be cal- culated very efficiently, requiring no transformation of derivative integrals and the solution of only a single set of MCSCF response equations.
Geometrical Derivatives in Molecular Electronic Theory 207
2. M R C l Hessiuns obtain for the first term
The MRCI Hessian [Eq. (117)] contains two terms. Using Eq. (39) we
E'2' = (CIIH[21 + (2K'1' - SW, f p l )
+ ( ~ ( 2 ) , H(0)) + (-1S(2) + LS(l)s(l) H ( 0 )
2 2 2 )
+ (d'), d'),
H'O))+
$(S('), ($I), H"))) - ( K ( ' ) , ($I), H'o)))lCI) (128) Using a similar technique as for the MCSCF Hessian we write this equation asE'2' = Tr D,,ha":,
+
*Tr daogf&+
Tr[(4~'" - 2S'")F"']+ 2Tr .(2)F[o] - Tr S(2)F[O1 + Tr(S(l)S(')FIOl)
A 0 ao
+
2 T r [ ~ ( ~ ) ( d " , FIO1)] - Tr[(2rc'" - *S('))(S('), FIO1)] (129) Except for the term containingd2)
all terms in this expression may be evaluated as in the MCSCF case with little extra cost. The evaluation of the d2) term may be simplified as described by Handy and Schaefer (1984). We obtain (Lee et ul., 1986)( 130) 2Tr K(2)F[01 = [(o)('ck + 2G(1)],(1) ( 0 ) " ( 1 ) * ( I )
MC MC
+
KMSAMCAtvlS)where
5'''
is identical to the Handy-Schaefer vector which appears in the CI gradient calculation. The[('Yc;
term in Eq. (130) is treated in the same way as[('!fi;
for gradients, resulting in the expressionwhich takes care of all contributions from second derivative integrals. Hence no transformation of second derivative integrals is required. The remain- ing two terms in Eq. (130) are best calculated by computing Gc;['O' and
K:;[(O) first.
The evaluation of the second term fcl)Pc'), of the MRCI Hessian [Eq. (117)] requires the construction of f ( l ) and the solution of the linear equations
(132) G(o)p(') + f ( l ) = 0
The electronic gradient f ( l ) (which has no orbital part) has the same structure as the configuration part of the MCSCF electronic gradient [Eq. (84)] and may be constructed in the configuration basis, requiring I('), {St'), I(*)}, and (dl), I(')) in the MO basis. (The {IC('), I(')} integrals are needed since the orbital connection includes the MCSCF orbital reoptimization effects.)
The linear equations may be solved using the same iterative technique as in the MCSCF case. The direct transformations are simplified since only the