Mathematical Physics, Analysis and Geometry - Volume 7



Mathematical Physics, Analysis and Geometry 7: 1–8, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Relating Thomas–Whitehead Projective Connections by a Gauge Transformation

CRAIG ROBERTS
Department of Mathematics, Southeast Missouri State University, Cape Girardeau, MO 63701-4799, U.S.A. e-mail: [email protected]

(Received: 13 August 2002)

Abstract. Thomas–Whitehead projective connections, or TW-connections, are torsionfree linear connections, satisfying certain properties, on a naturally defined principal R-bundle over a manifold. The name credits T. Y. Thomas and J. H. C. Whitehead, who originally studied these connections in the 1920’s and 1930’s. Three equivalence classes of TW-connections will be considered. This leads to a necessary and sufficient condition for TW-connections to be related by a gauge transformation; namely, they induce the same projective structure on the base manifold, have identical Ricci tensor, and induce the identity element in the one-dimensional de Rham cohomology vector space of the base manifold.

Mathematics Subject Classifications (2000): 53C05, 53C22, 53C80.

Key words: bundle of volume elements, gauge transformation, projective structure, Ricci equivalence, structural equivalence, Thomas–Whitehead projective connection.

1. Introduction

Thomas–Whitehead projective connections, or TW-connections, have their origin in the work of T. Y. Thomas (1925, 1926) and J. H. C. Whitehead (1931). Each represented a projective connection on a manifold by means of a torsionfree linear connection defined on another manifold of one more dimension. With this in mind, a TW-connection is a torsionfree linear connection, satisfying certain properties, on a naturally defined principal R-bundle over a manifold. It is shown in (Roberts, 1992) and (Roberts, 1995) that a TW-connection induces a projective structure on the base manifold and that an equivalence relation on the set of TW-connections may be defined by calling TW-connections equivalent whenever they induce the same projective structure on the base manifold. In this paper two refinements of this equivalence relation are studied. The first refinement defines TW-connections to be Ricci equivalent if they induce the same projective structure on the base manifold and have the same Ricci tensor. The second refinement defines TW-connections to be gauge equivalent if they are related by a gauge transformation of the principal R-bundle. It is shown that such a gauge transformation exists if and only if the TW-connections are Ricci equivalent and induce the identity element in the one-dimensional de Rham cohomology vector space of the base manifold.

2. Structurally Equivalent TW-connections

The relevant definitions and results for TW-connections will be reviewed in this section. Further details may be obtained by referring to (Roberts, 1992) and (Roberts, 1995).

The construction of the principal R-bundle begins with a real n-dimensional vector space V. If v is a nonzero element in the nth exterior product of V, then the set ε = {±v} will be called a volume element. Defining a volume element only up to sign allows for the consideration of both orientable and nonorientable manifolds. The set of all volume elements will be denoted by E(V). If M is a smooth n-dimensional manifold and p an element of M, replacing V with $T_pM$, the tangent space of M at p, allows the bundle of volume elements over M, denoted by E(M), to be defined as

$$E(M) = \bigcup_{p \in M} E(T_pM).$$

E(M) is the total space of a principal R-bundle over M with the projection map π : E(M) → M defined by π(ε) = p. The structure group is the reals R under addition, and the right action on E(M) is given by $\varepsilon \cdot a = e^{a}\varepsilon$. The fundamental vector field on E(M) corresponding to the element d/dt in the Lie algebra of R will be called the canonical fundamental vector field on E(M) and denoted by ξ.
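As a quick consistency check (a sketch added for illustration, not part of the original argument), the formula for the action can be verified to define a right action of the additive group R, and differentiating it along t gives the canonical fundamental vector field. The evaluation point ε and the parameters a, b, t are the only notation introduced here.

```latex
\begin{align*}
(\varepsilon \cdot a)\cdot b &= e^{b}\,(e^{a}\varepsilon) = e^{a+b}\,\varepsilon = \varepsilon\cdot(a+b),
\qquad \varepsilon\cdot 0 = \varepsilon,\\
\xi_{\varepsilon} &= \left.\frac{d}{dt}\right|_{t=0}(\varepsilon\cdot t)
 = \left.\frac{d}{dt}\right|_{t=0}\big(e^{t}\varepsilon\big).
\end{align*}
```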

DEFINITION 1. A Thomas–Whitehead projective connection, or TW-connection, is a torsionfree linear connection ∇ on E(M) that is invariant with respect to the right action of R on E(M) and for which

$$\nabla \xi = -\frac{1}{n+1}\,\mathrm{id},$$

where id is the identity (1, 1)-tensor and ξ the canonical fundamental vector field on E(M).
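Written out on vector fields, the normalization in Definition 1 reads as follows. This is only an unpacking of the definition, assuming the usual convention that (∇ξ)(X) means $\nabla_X\xi$ for a smooth vector field X on E(M).

```latex
\[
\nabla_X \xi = -\frac{1}{n+1}\,X \quad\text{for all } X,
\qquad\text{in particular}\qquad
\nabla_\xi \xi = -\frac{1}{n+1}\,\xi .
\]
```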

A TW-connection ∇ induces a projective structure on the base manifold M in the following way. If ω is a connection 1-form on E(M) and $\bar X$ and $\bar Y$ are the respective ω-horizontal lifts of smooth vector fields X and Y on M, then

$$\nabla^{\omega}_X Y = \pi_*(\nabla_{\bar X}\bar Y)$$

defines a torsionfree linear connection $\nabla^{\omega}$ on M. For the same TW-connection ∇, but a different connection 1-form $\hat\omega$, the torsionfree linear connection $\nabla^{\hat\omega}$ belongs to the same projective structure as $\nabla^{\omega}$. Furthermore, the mapping $\omega \mapsto \nabla^{\omega}$ is a one-to-one correspondence between the connection 1-forms on E(M) and the torsionfree linear connections belonging to the same projective structure as $\nabla^{\omega}$. Consequently, the TW-connection ∇ induces a projective structure on the base manifold M.

DEFINITION 2. TW-connections that induce the same projective structure on the base manifold M are structurally equivalent.

Definition 2 is a change in terminology from (Roberts, 1992) and (Roberts, 1995) designed to distinguish this equivalence class of TW-connections from those that follow.

If the base manifold M has dimension one, then all TW-connections are structurally equivalent in this special case. This follows since the induced connections on M all have the same geodesics up to a reparameterization, which implies the induced connections belong to the same projective structure on M. See (Spivak, 1979) and (Kobayashi, 1972).

Structurally equivalent TW-connections are more generally characterized by the existence of a symmetric (0, 2)-tensor on E(M).

THEOREM 1. The TW-connections ∇ and $\tilde\nabla$ are structurally equivalent if and only if there is a unique symmetric (0, 2)-tensor β on E(M) such that the Lie derivative $L_\xi\beta = 0$, and

$$\tilde\nabla = \nabla + (\iota_\xi\beta)\otimes\mathrm{id} + \mathrm{id}\otimes(\iota_\xi\beta) - \beta\otimes\xi,$$

where $\iota_\xi\beta$ denotes the 1-form on E(M) defined by $(\iota_\xi\beta)(X) = \beta(\xi, X)$, for any smooth vector field X on E(M), and satisfies $(\iota_\xi\beta)(\xi) = 0$.

3. Ricci Equivalent TW-connections

DEFINITION 3. The Ricci tensor of a TW-connection ∇ is a (0, 2)-tensor on E(M) defined by

$$\mathrm{Ricci}(X, Y) = \operatorname{trace}\{V \mapsto R(V, X)Y\},$$

where R is the curvature tensor of ∇ and X, Y, and V are smooth vector fields on E(M).

The Ricci tensor of a TW-connection satisfies two special properties. First, since a TW-connection is invariant with respect to the right action of R on E(M), its Ricci tensor will be invariant with respect to this right action. Also, the Ricci tensor is 0 if one of the arguments is the canonical fundamental vector field ξ. These two properties imply that $\mathrm{Ricci} = \pi^*\alpha$, for a (0, 2)-tensor α on M.

For structurally equivalent TW-connections, a straightforward, but rather lengthy, calculation shows the difference of their Ricci tensors may be expressed in terms of the symmetric (0, 2)-tensor β from Theorem 1.

THEOREM 2. If ∇ and $\tilde\nabla$ are structurally equivalent TW-connections and their respective Ricci tensors are denoted by Ricci and $\widetilde{\mathrm{Ricci}}$, then

$$\widetilde{\mathrm{Ricci}} - \mathrm{Ricci} = -(n+1)\,d(\iota_\xi\beta) + (n-1)\Big((\iota_\xi\beta)\otimes(\iota_\xi\beta) - \operatorname{sym}\nabla(\iota_\xi\beta) + \frac{1}{n+1}\,\beta\Big),$$

where β and $\iota_\xi\beta$ are defined as in Theorem 1 and $\operatorname{sym}\nabla(\iota_\xi\beta)$ denotes the symmetric part of $\nabla(\iota_\xi\beta)$.

DEFINITION 4. The TW-connections ∇ and $\tilde\nabla$ are Ricci equivalent if they are structurally equivalent and have identically equal Ricci tensors.

As was mentioned in the previous section, for the special case in which the base manifold M has dimension one, all TW-connections are structurally equivalent. In addition, whenever M has dimension one, E(M) has dimension two and the Ricci tensor of any TW-connection is identically 0. Consequently, all TW-connections are Ricci equivalent for the special case in which the base manifold has dimension one.

More generally, Ricci equivalent TW-connections are characterized by sharpening Theorem 1 to the following.

THEOREM 3. The TW-connections ∇ and $\tilde\nabla$ are Ricci equivalent if and only if there is a unique closed 1-form φ on E(M) such that φ(ξ) = 0 and

$$\tilde\nabla = \nabla + \phi\otimes\mathrm{id} + \mathrm{id}\otimes\phi - (n+1)(\nabla\phi - \phi\otimes\phi)\otimes\xi.$$

Proof. If ∇ and $\tilde\nabla$ are Ricci equivalent TW-connections, they are structurally equivalent and Theorem 1 implies there exists a unique symmetric (0, 2)-tensor β on E(M) such that $L_\xi\beta = 0$, $(\iota_\xi\beta)(\xi) = 0$, and

$$\tilde\nabla = \nabla + (\iota_\xi\beta)\otimes\mathrm{id} + \mathrm{id}\otimes(\iota_\xi\beta) - \beta\otimes\xi.$$

Since ∇ and $\tilde\nabla$ also have identically equal Ricci tensors, the equation for the difference of their Ricci tensors given by Theorem 2 becomes

$$-(n+1)\,d(\iota_\xi\beta) + (n-1)\Big((\iota_\xi\beta)\otimes(\iota_\xi\beta) - \operatorname{sym}\nabla(\iota_\xi\beta) + \frac{1}{n+1}\,\beta\Big) = 0.$$

Considering the skew-symmetric and symmetric parts, respectively, of this equation yields $d(\iota_\xi\beta) = 0$ and

$$(\iota_\xi\beta)\otimes(\iota_\xi\beta) - \operatorname{sym}\nabla(\iota_\xi\beta) + \frac{1}{n+1}\,\beta = 0.$$

From these equations, as well as $\nabla(\iota_\xi\beta) = \operatorname{sym}\nabla(\iota_\xi\beta) - d(\iota_\xi\beta)$, it follows that $\beta = (n+1)(\nabla(\iota_\xi\beta) - (\iota_\xi\beta)\otimes(\iota_\xi\beta))$. Thus, setting $\phi = \iota_\xi\beta$ yields a unique closed 1-form such that φ(ξ) = 0 and

$$\tilde\nabla = \nabla + \phi\otimes\mathrm{id} + \mathrm{id}\otimes\phi - (n+1)(\nabla\phi - \phi\otimes\phi)\otimes\xi.$$

Conversely, if φ is a closed 1-form on E(M) and φ(ξ) = 0, then $\nabla\phi = \operatorname{sym}\nabla\phi - d\phi = \operatorname{sym}\nabla\phi$ and $(L_\xi\phi)(X) = d(\phi(\xi))(X) + 2\,d\phi(\xi, X) = 0$, for all smooth vector fields X on E(M). Furthermore, if φ is the unique closed 1-form on E(M) satisfying

$$\tilde\nabla = \nabla + \phi\otimes\mathrm{id} + \mathrm{id}\otimes\phi - (n+1)(\nabla\phi - \phi\otimes\phi)\otimes\xi,$$

then $\beta = (n+1)(\nabla\phi - \phi\otimes\phi)$ defines a unique symmetric (0, 2)-tensor on E(M) for which $L_\xi\beta = 0$, $\iota_\xi\beta = \phi$, and

$$\tilde\nabla = \nabla + (\iota_\xi\beta)\otimes\mathrm{id} + \mathrm{id}\otimes(\iota_\xi\beta) - \beta\otimes\xi.$$

Hence, ∇ and $\tilde\nabla$ are structurally equivalent by Theorem 1. Substituting the expressions for β and $\iota_\xi\beta$ into the equation for the difference in the Ricci tensors of ∇ and $\tilde\nabla$ given by Theorem 2 and simplifying shows ∇ and $\tilde\nabla$ have identical Ricci tensors. Therefore, ∇ and $\tilde\nabla$ are Ricci equivalent. ✷
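The algebra behind the proof of Theorem 3 can be laid out step by step. The following is a sketch, writing φ for $\iota_\xi\beta$. The division by n − 1 in the second line is an assumption of this sketch (it requires n ≥ 2); the case n = 1 is the one treated separately in the text.

```latex
\begin{align*}
\text{skew part:}\quad
 & -(n+1)\,d\phi = 0 \;\Longrightarrow\; d\phi = 0,\\
\text{symmetric part:}\quad
 & (n-1)\Big(\phi\otimes\phi - \operatorname{sym}\nabla\phi + \tfrac{1}{n+1}\,\beta\Big) = 0
   \;\Longrightarrow\;
   \beta = (n+1)\big(\operatorname{sym}\nabla\phi - \phi\otimes\phi\big),\\
\text{closedness:}\quad
 & \nabla\phi = \operatorname{sym}\nabla\phi - d\phi = \operatorname{sym}\nabla\phi
   \;\Longrightarrow\;
   \beta = (n+1)\big(\nabla\phi - \phi\otimes\phi\big),\\
\text{invariance:}\quad
 & (L_\xi\phi)(X) = d(\phi(\xi))(X) + 2\,d\phi(\xi, X) = 0 + 0 = 0 .
\end{align*}
```

The skew/symmetric split works because φ ⊗ φ, sym ∇φ and β are all symmetric, so the only skew-symmetric term in the Ricci difference is the one containing dφ.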

4. Gauge Equivalent TW-connections

DEFINITION 5. A diffeomorphism g: E(M) → E(M) such that g(ε · a) = g(ε) · a, for all ε in E(M) and real numbers a, and which induces the identity map on the base manifold M is a gauge transformation of E(M).

For a TW-connection ∇, a gauge transformation g of E(M), and smooth vector fields X and Y on E(M), the expression

$$g_*^{-1}\big(\nabla_{g_*(X)}\, g_*(Y)\big)$$

defines a TW-connection since g commutes with the right action of R on E(M). Also, a one-to-one correspondence between the group of gauge transformations of E(M) and the set of smooth maps from M to R can be established by setting g(ε) = ε · (f ◦ π)(ε), for all ε in E(M), where f is a smooth map from M to R and π is the projection map from E(M) to M (Bleecker, 1981).

DEFINITION 6. The TW-connections ∇ and $\tilde\nabla$ are gauge equivalent if there is a gauge transformation g of E(M) such that

$$\tilde\nabla_X Y = g_*^{-1}\big(\nabla_{g_*(X)}\, g_*(Y)\big),$$

for all smooth vector fields X and Y on E(M).

The next theorem shows gauge equivalence is a finer equivalence relation than Ricci equivalence and structural equivalence.

THEOREM 4. Gauge equivalent TW-connections are Ricci equivalent.

Proof. If ∇ and $\tilde\nabla$ are gauge equivalent TW-connections, there exists a gauge transformation g of E(M) such that $\tilde\nabla_X Y = g_*^{-1}(\nabla_{g_*(X)}\, g_*(Y))$, for all smooth vector fields X and Y on E(M). If R and $\tilde R$ denote the respective curvature tensors of ∇ and $\tilde\nabla$, then

$$\tilde R(X, Y)Z = g_*^{-1}\big(R(g_*(X), g_*(Y))\, g_*(Z)\big),$$

where Z is a smooth vector field on E(M), and it follows that $\widetilde{\mathrm{Ricci}} = g^*\mathrm{Ricci}$. In the previous section, it was noted that there is a (0, 2)-tensor α on M such that $\mathrm{Ricci} = \pi^*\alpha$. Therefore,

$$\widetilde{\mathrm{Ricci}} = g^*\mathrm{Ricci} = g^*(\pi^*\alpha) = (\pi\circ g)^*\alpha = \pi^*\alpha = \mathrm{Ricci},$$

and the Ricci tensors of gauge equivalent TW-connections are identical.

It remains to show that the gauge equivalent TW-connections ∇ and $\tilde\nabla$ are structurally equivalent. Let ω be a connection form on E(M). For the gauge transformation g relating ∇ and $\tilde\nabla$, $g^*\omega$ is also a connection form on E(M). Taking X and Y to be smooth vector fields on M, and denoting their respective $g^*\omega$-horizontal lifts by $\bar X$ and $\bar Y$, we have $0 = g^*\omega(\bar X) = \omega(g_*(\bar X))$ and $0 = g^*\omega(\bar Y) = \omega(g_*(\bar Y))$. Thus, the vector fields $g_*(\bar X)$ and $g_*(\bar Y)$ on E(M) are the ω-horizontal lifts of X and Y, respectively. For the induced connections $\nabla^{\omega}$ and $\tilde\nabla^{g^*\omega}$ on M, this implies

$$\nabla^{\omega}_X Y = \pi_*\big(\nabla_{g_*(\bar X)}\, g_*(\bar Y)\big) = (\pi\circ g^{-1})_*\big(\nabla_{g_*(\bar X)}\, g_*(\bar Y)\big) = \pi_*\big(\tilde\nabla_{\bar X}\bar Y\big) = \tilde\nabla^{g^*\omega}_X Y$$

since $\pi\circ g^{-1} = \pi$. Therefore, ∇ and $\tilde\nabla$ induce the same projective structure on M so they are structurally equivalent. ✷

Theorems 3 and 4 suggest the possibility of further refining Theorem 1 for gauge equivalent TW-connections.

THEOREM 5. The TW-connections ∇ and $\tilde\nabla$ are gauge equivalent if and only if there is a unique exact 1-form φ on E(M) such that φ(ξ) = 0 and

$$\tilde\nabla = \nabla + \phi\otimes\mathrm{id} + \mathrm{id}\otimes\phi - (n+1)(\nabla\phi - \phi\otimes\phi)\otimes\xi.$$

Proof. If ∇ and $\tilde\nabla$ are gauge equivalent TW-connections, then Theorem 4 implies they are Ricci equivalent. It must be shown that the unique closed 1-form φ given by Theorem 3 is exact for the case of gauge equivalent TW-connections. Recall from the proof of Theorem 3 that the property φ(ξ) = 0 for a closed 1-form φ implies $L_\xi\phi = 0$. Hence, there exists a closed 1-form ρ on M such that $\pi^*\rho = \phi$, and the equation in Theorem 3 may be written as

$$\tilde\nabla = \nabla + \pi^*\rho\otimes\mathrm{id} + \mathrm{id}\otimes\pi^*\rho - (n+1)(\nabla\pi^*\rho - \pi^*\rho\otimes\pi^*\rho)\otimes\xi.$$

If ω is a connection form on E(M), X and Y smooth vector fields on M, and $\bar X$ and $\bar Y$ their respective ω-horizontal lifts, then this equation becomes

$$\tilde\nabla_{\bar X}\bar Y = \nabla_{\bar X}\bar Y + \pi^*\rho(\bar X)\bar Y + \pi^*\rho(\bar Y)\bar X - (n+1)\big((\nabla\pi^*\rho)(\bar X; \bar Y) - \pi^*\rho(\bar X)\,\pi^*\rho(\bar Y)\big)\xi.$$

Projecting by $\pi_*$ to M gives $\tilde\nabla^{\omega}_X Y = \nabla^{\omega}_X Y + \rho(X)Y + \rho(Y)X$. H. Weyl showed this equation implies $\nabla^{\omega}$ and $\tilde\nabla^{\omega}$ are projectively equivalent; in other words, they have the same geodesics up to a reparameterization (Spivak, 1979). This is equivalent to $\nabla^{\omega}$ and $\tilde\nabla^{\omega}$ belonging to the same projective structure on M (Kobayashi, 1972).

The gauge transformation g of E(M) relating ∇ and $\tilde\nabla$ provides another means of obtaining the equation showing $\nabla^{\omega}$ and $\tilde\nabla^{\omega}$ are projectively equivalent. Since g may be expressed as g(ε) = ε · (f ◦ π)(ε), for all ε in E(M) and some smooth map f from M to R, it follows that $g_*(\bar X) = \bar X + \pi^*df(\bar X)\,\xi$ and $g_*(\bar Y) = \bar Y + \pi^*df(\bar Y)\,\xi$. Hence, $\tilde\nabla_{\bar X}\bar Y = g_*^{-1}(\nabla_{g_*(\bar X)}\, g_*(\bar Y))$ becomes

$$\tilde\nabla_{\bar X}\bar Y = \nabla_{\bar X}\bar Y - \frac{1}{n+1}\,\pi^*df(\bar X)\,\bar Y - \frac{1}{n+1}\,\pi^*df(\bar Y)\,\bar X + \Big(\bar X(\pi^*df(\bar Y)) - \pi^*df(\nabla_{\bar X}\bar Y) + \frac{1}{n+1}\,\pi^*df(\bar X)\,\pi^*df(\bar Y)\Big)\xi.$$

Projecting by $\pi_*$ to M gives

$$\tilde\nabla^{\omega}_X Y = \nabla^{\omega}_X Y - \frac{1}{n+1}\,df(X)\,Y - \frac{1}{n+1}\,df(Y)\,X,$$

which again shows $\nabla^{\omega}$ and $\tilde\nabla^{\omega}$ are projectively equivalent.

Setting $\rho = -(n+1)^{-1}\,df$ shows ρ is exact, which in turn implies $\phi = \pi^*\rho$ is exact.

Conversely, assume there exists an exact 1-form φ satisfying the hypotheses; then ∇ and $\tilde\nabla$ are Ricci equivalent by Theorem 3 since an exact 1-form is closed. From the properties dφ = 0 and φ(ξ) = 0, it follows that $L_\xi\phi = 0$, which implies there is a 1-form ρ on M such that $\phi = \pi^*\rho$. Furthermore, ρ must be exact since φ is an exact 1-form satisfying φ(ξ) = 0. Let f be the smooth map from M to R such that df = ρ. Hence, $\phi = \pi^*df$. Define a gauge transformation g of E(M) by g(ε) = ε · (−(n + 1)(f ◦ π)(ε)), for all ε in E(M). Thus, for smooth vector fields X and Y on E(M),

$$g_*^{-1}\big(\nabla_{g_*(X)}\, g_*(Y)\big) = \nabla_X Y + \pi^*df(X)\,Y + \pi^*df(Y)\,X - (n+1)\big((\nabla\pi^*df)(X; Y) - \pi^*df(X)\,\pi^*df(Y)\big)\xi.$$

By assumption, the right-hand side of this equation is merely $\tilde\nabla_X Y$ since $\phi = \pi^*df$. Therefore, ∇ and $\tilde\nabla$ are gauge equivalent TW-connections. ✷
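The identification ρ = −(n + 1)⁻¹ df used in the proof can be checked by substituting it into the formula of Theorem 3 and comparing coefficients with the gauge-transformed connection. The following is a sketch of that comparison; nothing beyond the symbols already defined in the proof is assumed.

```latex
\begin{align*}
\phi = \pi^*\rho,\quad \rho = -\tfrac{1}{n+1}\,df
 \;\Longrightarrow\;\;
 \phi\otimes\mathrm{id} + \mathrm{id}\otimes\phi
 &= -\tfrac{1}{n+1}\big(\pi^*df\otimes\mathrm{id} + \mathrm{id}\otimes\pi^*df\big),\\
 -(n+1)\big(\nabla\phi - \phi\otimes\phi\big)
 &= \nabla(\pi^*df) + \tfrac{1}{n+1}\,\pi^*df\otimes\pi^*df .
\end{align*}
```

The first line reproduces the two terms with coefficient −1/(n + 1), and the second reproduces the coefficient of ξ, since the covariant derivative of π*df evaluated on a pair of vector fields is the derivative of π*df(Y) along X minus π*df applied to the covariant derivative of Y along X.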

Since gauge equivalence is characterized by an exact 1-form and Ricci equivalence is characterized by a closed 1-form, a relationship to the one-dimensional de Rham cohomology vector space is suggested.

DEFINITION 7. The vector space $H^1(M, R)$ = {closed 1-forms on M} / {exact 1-forms on M} is the one-dimensional de Rham cohomology vector space of M.

A pair of Ricci equivalent TW-connections ∇ and $\tilde\nabla$ induce a de Rham cohomology class on M in a natural way. If φ is the unique closed 1-form on E(M) given by Theorem 3, then recalling the proof of Theorem 5 there is a closed 1-form ρ on M such that $\phi = \pi^*\rho$. The induced de Rham cohomology class on M is [ρ].

THEOREM 6. The TW-connections ∇ and $\tilde\nabla$ are gauge equivalent if and only if they are Ricci equivalent and the induced de Rham cohomology class on M is 0.

Proof. The result follows from Theorems 3 and 5 since the unique closed 1-form φ characterizing a pair of Ricci equivalent TW-connections may be expressed as $\phi = \pi^*\rho$, for a closed 1-form ρ on M, and φ is exact if and only if ρ is exact. ✷

A consideration of the stronger condition that the vector space $H^1(M, R) = 0$ yields the following corollary.

COROLLARY 7. Ricci equivalence and gauge equivalence are identical if and only if $H^1(M, R) = 0$.

Since the Poincaré Lemma shows that the one-dimensional de Rham cohomology vector space of a contractible manifold is 0 (Conlon, 1993), Corollary 7 implies Ricci equivalence and gauge equivalence are identical when the base manifold is contractible. In particular, it was noted in the previous section that all TW-connections are Ricci equivalent when the base manifold has dimension one. Therefore, if the base manifold has dimension one and is contractible, Corollary 7 shows that all TW-connections are gauge equivalent.
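The role of the cohomology class can be illustrated numerically on the circle, which is not contractible. A closed 1-form is exact precisely when all of its periods (integrals over closed loops) vanish, so a nonzero period certifies a nonzero class. The sketch below is an illustration only, not part of the paper: the function names `period`, `d_theta` and `d_of_xy` are hypothetical, the circle is taken to be the unit circle in the plane, and the integral is approximated by a midpoint rule.

```python
import math

def period(form, n=10_000):
    """Approximate the integral of a 1-form a dx + b dy around the unit circle.

    `form(x, y)` returns the coefficient pair (a, b) at the point (x, y).
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        x, y = math.cos(theta), math.sin(theta)
        a, b = form(x, y)
        # Pull back to the circle: dx = -sin(theta) dtheta, dy = cos(theta) dtheta.
        total += (a * (-y) + b * x) * h
    return total

def d_theta(x, y):
    """The angular form (-y dx + x dy) / (x^2 + y^2): closed, but not exact."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def d_of_xy(x, y):
    """The exact form d(xy) = y dx + x dy."""
    return (y, x)

period_closed = period(d_theta)   # close to 2*pi, so the class is nonzero
period_exact = period(d_of_xy)    # close to 0, as for any exact form
print(period_closed, period_exact)
```

In the language of Theorem 6, a closed 1-form ρ on the circle with nonzero period represents a nonzero class [ρ], which is the situation in which Ricci equivalence does not force gauge equivalence.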

References

Bleecker, D. (1981) Gauge Theory and Variational Principles, Addison-Wesley, Reading, MA.
Conlon, L. (1993) Differentiable Manifolds: A First Course, Birkhäuser, Boston.
Kobayashi, S. (1972) Transformation Groups in Differential Geometry, Springer-Verlag, New York.
Roberts, C. W. (1992) The projective connections of T. Y. Thomas and J. H. C. Whitehead on the principal R-bundle of volume elements, PhD Thesis, Saint Louis University, St. Louis, MO.
Roberts, C. W. (1995) The projective connections of T. Y. Thomas and J. H. C. Whitehead applied to invariant connections, Differential Geom. Appl. 5, 237–255.
Spivak, M. (1979) A Comprehensive Introduction to Differential Geometry II, 2nd edn, Publish or Perish, Wilmington.
Thomas, T. Y. (1925) On the projective and equi-projective geometries of paths, Proc. Nat. Acad. Sci. 11, 199–203.
Thomas, T. Y. (1926) A projective theory of affinely connected manifolds, Math. Z. 25, 723–733.
Whitehead, J. H. C. (1931) The representation of projective spaces, Ann. of Math. 32, 327–360.

Mathematical Physics, Analysis and Geometry 7: 9–46, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Heat Kernel Asymptotics of Zaremba Boundary Value Problem

IVAN G. AVRAMIDI
Department of Mathematics, New Mexico Institute of Mining and Technology, Socorro, NM 87801, U.S.A. e-mail: iavramid@nmt.edu

(Received: 30 October 2001; in final form: 18 July 2002)

Abstract. The Zaremba boundary-value problem is a boundary value problem for Laplace-type second-order partial differential operators acting on smooth sections of a vector bundle over a smooth compact Riemannian manifold with smooth boundary but with discontinuous boundary conditions, which include Dirichlet boundary conditions on one part of the boundary and Neumann boundary conditions on another part of the boundary. We study the heat kernel asymptotics of the Zaremba boundary value problem. The construction of the asymptotic solution of the heat equation is described in detail and the heat kernel is computed explicitly in the leading approximation. Some of the first nontrivial coefficients of the heat kernel asymptotic expansion are computed explicitly.

Mathematics Subject Classifications (2000): 58J35, 58J37, 58J50, 58J32, 35P20, 35K20.

Key words: boundary value problem, heat kernel, spectral asymptotics, spectral geometry.

1. Introduction

The heat kernel of elliptic partial differential operators acting on sections of vector bundles over compact manifolds has proved to be of great importance in mathematical physics. In particular, the main objects of interest in quantum field theory and statistical physics, such as the effective action, the partition function, Green functions, and correlation functions, are described by the functional determinants and the resolvents of differential operators, which can be expressed in terms of the heat kernel. The most important operators appearing in physics and geometry are the second order partial differential operators of Laplace type; such operators are characterized by a scalar leading symbol (even if acting on sections of vector bundles). Within the smooth category this problem has been studied extensively in recent years (see, for example, [30, 10]; for reviews see [9, 4, 3] and references therein). In the case of smooth compact manifolds without boundary the problem of calculation of heat kernel asymptotics reduces to a purely computational (algebraic) one for which various powerful algorithms have been developed [1, 43]; this problem is now well understood. In the case of smooth compact manifolds with a smooth boundary and smooth boundary conditions the complexity of the problem depends significantly on the type of the boundary conditions. The classical smooth boundary problems (Dirichlet, Neumann, or a mixed combination of those on vector bundles) are the most extensively studied ones (see [12, 13, 34, 2] and the references therein). A more general scheme, the so-called oblique (or Grubb–Gilkey–Smith) boundary value problem [31, 29, 28], which includes tangential (oblique) derivatives along the boundary, has been studied in [7, 8, 6, 22–24]. In this case the problem is not automatically elliptic; there is a certain strong ellipticity condition on the leading symbol of the boundary operator. This problem is much more difficult to handle, the main reason being that the heat kernel asymptotics are no longer polynomial in the jets of the symbols of the differential operator and the boundary operator. Another class of boundary value problems is characterized by essentially nonlocal boundary conditions, for example, the spectral or Atiyah–Patodi–Singer boundary conditions [30, 32, 11, 38].

All the boundary value problems described above are smooth. A more general (and much more complicated) setting, the so-called singular boundary value problem, arises when either the symbol of the differential operator or the symbol of the boundary operator (or the boundary itself) is not smooth. In this paper we study a singular boundary value problem for a second order partial differential operator of Laplace type when the operator itself has smooth coefficients but the boundary operator is not smooth. The case when the manifold as well as the boundary are smooth, but the boundary operator jumps from Dirichlet type to Neumann type along the boundary, is known in the literature as the Zaremba problem. Such problems often arise in applied mathematics and engineering and there are some exact results available for special cases (two or three dimensions, specific geometry, etc.) [42, 25]. The Zaremba problem belongs to a much wider class of singular boundary value problems, i.e. manifolds with singularities (corners, edges, cones, etc.). There is a large body of literature on this subject where the problem is studied from an abstract function-analytical point of view [26, 14–18, 41, 37, 35, 33, 27]. However, the study of heat kernel asymptotics of Zaremba type problems is quite new, and there are only some preliminary results in this area [5, 40, 21, 20]. Moreover, compared to the smooth category the needed machinery is still underdeveloped. We would like to stress that we are interested not only in the asymptotics of the trace of the heat kernel, i.e. the integrated heat kernel diagonal, but also in the local asymptotic expansion of the off-diagonal heat kernel.

In this paper we study the Zaremba boundary value problem for second-order partial differential operators F of Laplace type acting on sections of a vector bundle V over a smooth compact manifold M of dimension m with the boundary ∂M. The boundary is decomposed as the disjoint union $\partial M = \Sigma_1 \cup \Sigma_2 \cup \Sigma_0$, so that $\bar\Sigma_1 = \Sigma_1 \cup \Sigma_0$ and $\bar\Sigma_2 = \Sigma_2 \cup \Sigma_0$ are smooth compact co-dimension one submanifolds with the boundary $\Sigma_0 = \partial\Sigma_1 = \partial\Sigma_2$, which is a smooth compact co-dimension two submanifold without boundary, $\partial\Sigma_0 = \emptyset$. Both the manifold M and its boundary ∂M are assumed to be smooth and the differential operator F to have smooth coefficients. However, the boundary operator B is discontinuous on the boundary: it jumps from the Dirichlet type operator on $\Sigma_1$ to the Neumann type boundary operator on $\Sigma_2$. We will see that to fix the problem completely one has to impose an additional boundary condition along $\Sigma_0$ as well (see Section 4 below). Since this problem is not smooth there could be additional logarithmic terms in the asymptotic expansion of the trace of the heat kernel $\mathrm{Tr}_{L^2}\exp(-tF_B)$ as t → 0. However, Seeley [40] has shown recently (confirming the previous conjecture of [5]) that the logarithmic terms do not appear:

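THEOREM 1. There is an asymptotic expansion as t → 0+ in half-integer powers of t only,

$$\mathrm{Tr}_{L^2}\exp(-tF_B) \sim \sum_{k=0}^{\infty} t^{(k-m)/2}\, B_k, \tag{1}$$

with the coefficients $B_k$ given by the integral of local invariants over the manifolds M, $\Sigma_1$, $\Sigma_2$ and $\Sigma_0$,

$$B_k = \int_M b^{(0)}_k + \int_{\Sigma_1} b^{(1),1}_k + \int_{\Sigma_2} b^{(1),2}_k + \int_{\Sigma_0} b^{(2)}_k. \tag{2}$$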
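To see how the index k in (1) lines up with powers of t, the first coefficients can be read off by comparing (1) with the explicit expansion (6) of Theorem 2 stated later in this paper. This is a sketch of that bookkeeping only; Σ₁ and Σ₂ denote the Dirichlet and Neumann parts of the boundary, and the identity used is simply $(4\pi t)^{-m/2}\, t^{k/2} = (4\pi)^{-m/2}\, t^{(k-m)/2}$.

```latex
\begin{align*}
B_0 &= (4\pi)^{-m/2}\,\dim V\,\operatorname{vol}(M),\\
B_1 &= (4\pi)^{-m/2}\,\frac{\sqrt{\pi}}{2}\,\dim V\,
       \big[\operatorname{vol}(\Sigma_2) - \operatorname{vol}(\Sigma_1)\big].
\end{align*}
```

Here B₀ is a pure interior term and B₁ a pure co-dimension one term; the co-dimension two contribution first enters at k = 2.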

This seems to contradict the conclusions of [21], where it has been shown that such an expansion with locally computable coefficients does not exist. The term ‘locally computable’ is confusing though. As we show below, the calculation of the coefficients of the asymptotic expansion of the trace of the heat kernel involves the knowledge of some global information, in particular the spectral data of a certain differential operator with mixed boundary conditions on a unit semicircle in the normal bundle to $\Sigma_0$ (see Section 8). So, one could say that these coefficients are locally computable in local coordinates on $\Sigma_0$ and the normal distance to $\Sigma_0$ but are global in the angular coordinate around $\Sigma_0$ (see Section 8). Thus the standard asymptotic expansion (1) in powers of t (without logarithmic terms) still exists with coefficients (2) given by integrals over M, $\Sigma_1$, $\Sigma_2$ and $\Sigma_0$. The interior coefficients, $b^{(0)}_k$, and the co-dimension one coefficients, $b^{(1),1}_k$ and $b^{(1),2}_k$, are ‘locally computable’, but the co-dimension two coefficients, $b^{(2)}_k$, are ‘global’ in the angular coordinate (or pseudo-local) and require new methods of calculation (e.g., like the approach of this paper or [40]). They are constructed from the local invariants on $\Sigma_0$. It is the numerical coefficients that are global.

Let us stress here that our goal in this paper is not to provide a rigorous construction of the parametrix of the heat equation with all the definitions and estimates, which, for a singular boundary-value problem, is a task that would require a separate paper. For such a treatment the reader is referred to the papers [38, 39, 29, 28, 11, 32] for the smooth case and to [26, 14–18, 41, 37, 35, 33, 27] for the singular case.

Here we shall adhere instead to a pragmatic approach and will describe the construction of an asymptotic solution of the heat equation that can be used to calculate explicitly the heat kernel asymptotics. The main object of our investigation is not the parametrix but the exact explicit formulas for the coefficients of the asymptotic expansion of the heat kernel.

Let us summarize briefly our main results. First of all, we provide the correct formulation of the Zaremba type boundary value problem. We find that the boundary conditions on the open sets $\Sigma_1$ and $\Sigma_2$ are not enough to fix the problem, and an additional boundary condition along the singular set $\Sigma_0$ is needed. This additional boundary condition can be considered formally as an ‘extension’ of Dirichlet conditions from $\Sigma_1$ to $\Sigma_0$, or an ‘extension’ of Neumann conditions from $\Sigma_2$ to $\Sigma_0$. However, strictly speaking the boundary condition on $\Sigma_0$ does not follow from the boundary conditions on $\Sigma_1$ and $\Sigma_2$ and can be chosen rather arbitrarily. In fact, one needs some supplementary ‘physical’ criteria to fix this boundary condition. Second, we describe the geometry of the problem, which now involves some nontrivial geometrical quantities (normal bundle and extrinsic curvatures) that properly characterize the imbedding of a co-dimension two submanifold $\Sigma_0$ in M. The higher order coefficients $b^{(2)}_k$ are invariants constructed from those geometric quantities. Next, we describe the construction of the asymptotic solution of the heat equation in the interior of the manifold M, in a thin shell close to $\Sigma_1$ and $\Sigma_2$, and finally, in a thin strip close to $\Sigma_0$. We use the standard scaling device; the difference is just what coordinates are involved in the scaling. Finally, we find an explicit formula for the off-diagonal heat kernel in $M^{\mathrm{bnd}}_0$, the thin strip close to $\Sigma_0$, in the leading approximation, and use it to compute the first nontrivial ‘global’ coefficient, $b^{(2)}_2$, of the heat kernel asymptotic expansion. We consider two types of the additional boundary condition along $\Sigma_0$, one being the ‘extension’ of Dirichlet boundary conditions (that we call the regular boundary condition), and another being the ‘extension’ of the Neumann (or rather Robin) boundary conditions. We show that the result, i.e. the coefficient $b^{(2)}_2$, does depend on the type of the boundary condition at $\Sigma_0$, i.e. Dirichlet vs Neumann, but does not depend on the parameter of the Robin boundary condition (it will however contribute to the higher order coefficients).

Our main result can be summarized in the following

THEOREM 2. Let $(M, g)$ be a smooth compact Riemannian manifold of dimension $m$ with the Riemannian metric $g$ and with the boundary $\partial M = \Sigma_1 \cup \Sigma_2 \cup \Sigma_0$, where $\Sigma_1$ and $\Sigma_2$ are disjoint smooth codimension one submanifolds with compact closures $\bar\Sigma_1$ and $\bar\Sigma_2$ with the common boundary $\Sigma_0 = \partial\Sigma_1 = \partial\Sigma_2$; $\Sigma_0$ being a smooth compact codimension two submanifold without boundary. Let $V$ be a smooth Hermitian vector bundle over the manifold $M$ and $\nabla$ be the natural extension of the connection on the vector bundle $V$ by using the Levi-Civita connection. Let $Q$ be a smooth Hermitian endomorphism of the vector bundle $V$, $\Lambda$ be a smooth Hermitian endomorphism of the vector bundle $V$ restricted to the boundary, $r$ be the normal geodesic distance to the boundary, $\rho$ be the normal geodesic distance to $\Sigma_0$ and $s$ be a real parameter. Let $F_B$ be the Laplace type operator $F = -g^{\mu\nu}\nabla_\mu\nabla_\nu + Q$ subject to the Zaremba boundary conditions

\[ \varphi|_{\Sigma_1} = 0, \qquad (\partial_r + \Lambda)\varphi|_{\Sigma_2} = 0 \tag{3} \]


HEAT KERNEL ASYMPTOTICS OF ZAREMBA BOUNDARY VALUE PROBLEM 13

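and one of the following

\[ (\sqrt{\rho}\,\varphi)|_{\Sigma_0} = 0 \tag{4} \]

or

\[ (\partial_\rho - s)(\sqrt{\rho}\,\varphi)|_{\Sigma_0} = 0. \tag{5} \]

Then the trace of the heat kernel of the Zaremba boundary value problem has the following asymptotic expansion as $t \to 0^+$

\[
\begin{aligned}
\mathrm{Tr}_{L^2}\exp(-tF_B) = (4\pi t)^{-m/2}\Big\{ & \dim V\,\mathrm{vol}(M) + t^{1/2}\,\frac{\sqrt{\pi}}{2}\,\dim V\,[\mathrm{vol}(\Sigma_2) - \mathrm{vol}(\Sigma_1)] \\
& + t\Big[\frac{1}{6}\dim V\int_M R - \int_M \mathrm{tr}_V Q + \frac{1}{3}\dim V\int_{\partial M} K \\
& \qquad + 2\int_{\Sigma_2}\mathrm{tr}_V\Lambda + \frac{\alpha\pi}{4}\dim V\,\mathrm{vol}(\Sigma_0)\Big] + O(t^{3/2})\Big\},
\end{aligned}
\tag{6}
\]

where $R$ is the scalar curvature of the metric $g$, $K$ is the trace of the extrinsic curvature of the boundary, and $\alpha$ is a numerical constant, $\alpha = -1$ for the boundary condition (4) and $\alpha = 7$ for the boundary condition (5).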

This is in agreement with [40, 20].

This paper is organized as follows. In Section 2 the formal description of the Zaremba type boundary value problem is given. In Section 3 the general form of the heat kernel asymptotic expansion is described. In Section 4 we describe the relevant geometrical framework. Section 5 is devoted to the construction of the asymptotic solution of the heat equation in the interior of the manifold. In Section 6 we describe the asymptotic solution of the heat equation in a thin strip near $\Sigma_1$ with the Dirichlet boundary conditions. Similarly, Section 7 deals with the asymptotic solution of the heat equation in a thin strip near $\Sigma_2$ with Neumann boundary conditions. Section 8 is the central section of the paper. Here we construct the local asymptotic solution of the Zaremba boundary value problem in the neighborhood of the singular set $\Sigma_0$ in the leading approximation and explicitly calculate the first non-trivial singular heat kernel coefficients.
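The structure of the expansion (6) can be made concrete with a small numerical sketch. The Python function below is an illustration only and not part of the paper: it evaluates the three leading coefficients of (6) from geometric data supplied by the user. The function name `zaremba_coefficients` and its argument names are hypothetical. The example assumes the unit disk ($m = 2$) with a scalar operator ($\dim V = 1$, $Q = 0$, vanishing Robin endomorphism), Dirichlet conditions on one half circle and Neumann conditions on the other, $K = 1$ on the unit circle under the inward-normal sign convention, and it assumes that the volume of the zero-dimensional set $\Sigma_0$ (two points) is its cardinality.

```python
import math

def zaremba_coefficients(dim_v, vol_m, vol_s1, vol_s2, vol_s0,
                         int_r, int_tr_q, int_k, int_tr_robin, alpha):
    """Return (c0, c1, c2) such that, following expansion (6),
    Tr exp(-t F_B) ~ (4 pi t)^(-m/2) * (c0 + c1 * t**0.5 + c2 * t)."""
    c0 = dim_v * vol_m
    c1 = 0.5 * math.sqrt(math.pi) * dim_v * (vol_s2 - vol_s1)
    c2 = (dim_v * int_r / 6.0 - int_tr_q + dim_v * int_k / 3.0
          + 2.0 * int_tr_robin + alpha * math.pi / 4.0 * dim_v * vol_s0)
    return c0, c1, c2

# Hypothetical example: unit disk, Dirichlet on one half circle, Neumann on the other,
# regular condition (4) at the two junction points (alpha = -1).
c0, c1, c2 = zaremba_coefficients(
    dim_v=1, vol_m=math.pi, vol_s1=math.pi, vol_s2=math.pi, vol_s0=2,
    int_r=0.0, int_tr_q=0.0, int_k=2 * math.pi, int_tr_robin=0.0, alpha=-1)

# For m = 2 the prefactor (4 pi t)^(-1) turns c2 * t into the t^0 term of the trace.
constant_term = c2 / (4 * math.pi)
```

With these assumed inputs the $t^{1/2}$ term cancels because the two arcs have equal length, and the constant term evaluates to $1/6 - 1/8 = 1/24$.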

2. General Setup

2.1. LAPLACE TYPE OPERATORS

Let $(M, g)$ be a smooth compact Riemannian manifold of dimension $m$ with a boundary $\partial M$, equipped with a positive definite Riemannian metric $g$. Let $V$ be a vector bundle over $M$, $V^*$ be its dual, and $\mathrm{End}(V) \cong V \otimes V^*$ be the corresponding bundle of endomorphisms. Given any vector bundle $V$, we denote by $C^\infty(V)$ its


14 IVAN G. AVRAMIDI

space of smooth sections. We assume that the vector bundle $V$ is equipped with a Hermitian metric. This naturally identifies the dual vector bundle $V^*$ with $V$, and defines a natural $L^2$ inner product and the $L^2$-trace using the invariant Riemannian measure $d\,\mathrm{vol}_g$ on the manifold $M$. The completion of $C^\infty(V)$ in this norm defines the Hilbert space $L^2(V)$ of square integrable sections.

We denote by $TM$ and $T^*M$ the tangent and cotangent bundles of $M$. Let a connection, $\nabla^V: C^\infty(V) \to C^\infty(T^*M \otimes V)$, on the vector bundle $V$ be given, which we assume to be compatible with the Hermitian metric on the vector bundle $V$. The connection is given its unique natural extension to bundles in the tensor algebra over $V$ and $V^*$. In fact, using the Levi-Civita connection $\nabla^{LC}$ of the metric $g$ together with $\nabla^V$, we naturally obtain connections on all bundles in the tensor algebra over $V$, $V^*$, $TM$ and $T^*M$; the resulting connection will usually be denoted just by $\nabla$. It is usually clear which bundle’s connection is being referred to, from the nature of the section being acted upon. We also adopt the Einstein convention and sum over repeated indices. With our notation, Greek indices, $\mu, \nu, \ldots$, label the local coordinates $x = (x^\mu)$ on $M$ and range from 1 through $m$; lower case Latin indices from the middle of the alphabet, $i, j, k, l, \ldots$, label the local coordinates $x = (x^i)$ on $\partial M$ (a codimension one submanifold) and range from 2 through $m$; and lower case Latin indices from the beginning of the alphabet, $a, b, c, d, \ldots$, label the local coordinates $x = (x^a)$ on a codimension two submanifold $\Sigma_0 \subset \partial M$ that will be described later and range over $3, \ldots, m$. Further, we will denote by $g$ and $\nabla$ the induced metric and the corresponding Levi-Civita connection on the codimension one submanifolds $\Sigma_1$ and $\Sigma_2$ and by $g$ and $\nabla$ the induced metric and the corresponding Levi-Civita connection on the codimension two submanifold $\Sigma_0$.

Let $\nabla^*$ be the formal adjoint of the covariant derivative defined using the Riemannian metric and the Hermitian structure on $V$ and let $Q \in C^\infty(\mathrm{End}(V))$ be a smooth Hermitian section of the endomorphism bundle $\mathrm{End}(V)$.

DEFINITION 1. A partial differential operator $F: C^\infty(V) \to C^\infty(V)$ of the form

\[ F = \nabla^*\nabla + Q = -g^{\mu\nu}\nabla_\mu\nabla_\nu + Q \tag{7} \]

is called a Laplace type operator.

Alternatively, the Laplace type operators are second-order partial differential operators with a positive definite scalar leading symbol of the form

\[ \sigma_L(F; x, \xi) = I|\xi|^2 = I g^{\mu\nu}(x)\xi_\mu\xi_\nu. \tag{8} \]

Hereafter $I$ denotes the identity endomorphism of the vector bundle $V$. We will often omit it whenever it does not cause any misunderstanding. Any second-order operator with a scalar leading symbol can be put in the form (7) by choosing the Riemannian metric $g$, the connection $\nabla^V$ on the vector bundle $V$ and the endomorphism $Q$.


2.2. BOUNDARY CONDITIONS

In the case of manifolds with boundary, one has to impose some boundary conditions in order to make a (formally self-adjoint) differential operator self-adjoint (at least symmetric) and elliptic.

As usual, by using the inward geodesic flow, we identify a narrow neighbourhood of the boundary $\partial M$ with a part of $\mathbb{R}_+ \times \partial M$ and define a split of the cotangent bundle $T^*M = \mathbb{R} \oplus T^*\partial M$. Let $r$ be the normal geodesic distance to the boundary, so that $N = \partial_r$ is the inward unit normal on $\partial M$, and let $x = (x^i)$ be the local coordinates on $\partial M$. Near $\partial M$ we choose the local coordinates $x = (x^\mu) = (r, x)$. Let $W = V|_{\partial M}$ be the restriction of the vector bundle $V$ to the boundary $\partial M$. We define the boundary data map $\psi: C^\infty(V) \to L^2(W \oplus W)$ by

\[ \psi(\varphi) = \begin{pmatrix} \varphi|_{\partial M} \\ \nabla_N\varphi|_{\partial M} \end{pmatrix}. \tag{9} \]

The boundary conditions then read

\[ B\psi(\varphi) = 0, \tag{10} \]

where $B: L^2(W \oplus W) \to L^2(W \oplus W)$ is the boundary operator, which will be specified later. If the operator $B$ is a tangential differential operator (possibly of order zero), then the boundary conditions are local. Otherwise, for example, when $B$ is a pseudo-differential operator, the boundary conditions are nonlocal.

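To define the boundary operator one needs a self-adjoint orthogonal projector $\Pi$ that splits the space $L^2(W)$ in two orthogonal subspaces

\[ L^2(W) = L^2_\parallel(W) \oplus L^2_\perp(W), \tag{11} \]

where

\[ L^2_\parallel(W) = \Pi L^2(W) \quad\text{and}\quad L^2_\perp = (\mathrm{Id} - \Pi)L^2(W), \tag{12} \]

and a self-adjoint operator $\Lambda: L^2(W) \to L^2(W)$, such that $\Lambda L^2_\parallel(W) = \{0\}$, i.e. $\Pi\Lambda = \Lambda\Pi = 0$. Hereafter $\mathrm{Id}$ denotes the identity operator. The boundary operator is then defined by

\[ B = \begin{pmatrix} \Pi & 0 \\ \Lambda & \mathrm{Id} - \Pi \end{pmatrix}, \tag{13} \]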

which is equivalent to the following boundary conditions

\[ \Pi(\varphi|_{\partial M}) = 0, \tag{14} \]

\[ (\mathrm{Id} - \Pi)(\nabla_N\varphi|_{\partial M}) + \Lambda(\varphi|_{\partial M}) = 0. \tag{15} \]

It is easy to see that the boundary operator $B$ and the operator

\[ K = \mathrm{Id} - B = \begin{pmatrix} \mathrm{Id} - \Pi & 0 \\ -\Lambda & \Pi \end{pmatrix}, \tag{16} \]

are complementary projectors on $L^2(W \oplus W)$, i.e.

\[ B^2 = B, \qquad K^2 = K, \qquad BK = KB = 0. \tag{17} \]
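The projector identities (17) can be checked numerically in a finite-dimensional model. The sketch below is an illustration under assumptions: the fibre is replaced by $\mathbb{R}^5$, `Pi` is a randomly chosen rank-two orthogonal projector standing in for the projector in (13), and `Lam` is a symmetric matrix compressed to the complement of the range of `Pi`, so that both products of `Pi` and `Lam` vanish. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2                      # fibre dimension and projector rank (arbitrary choices)

# Orthogonal projector onto a random k-dimensional subspace
q, _ = np.linalg.qr(rng.standard_normal((n, k)))
Pi = q @ q.T
Id = np.eye(n)

# Self-adjoint operator annihilated by Pi on both sides:
# compress a symmetric matrix to the orthogonal complement of range(Pi)
s = rng.standard_normal((n, n))
Lam = (Id - Pi) @ (s + s.T) @ (Id - Pi)

Z = np.zeros((n, n))
B = np.block([[Pi, Z], [Lam, Id - Pi]])   # boundary operator of the form (13)
K = np.eye(2 * n) - B                     # operator of the form (16)

def is_zero(a, tol=1e-10):
    """True if every entry of the array is below the tolerance in absolute value."""
    return bool(np.max(np.abs(a)) < tol)

checks = {
    "B^2 = B": is_zero(B @ B - B),
    "K^2 = K": is_zero(K @ K - K),
    "BK = 0": is_zero(B @ K),
    "KB = 0": is_zero(K @ B),
}
```

The check relies only on $\Pi^2 = \Pi$ and on the vanishing of both products of $\Pi$ and $\Lambda$; the off-diagonal block of $B^2$ is $\Lambda\Pi + (\mathrm{Id} - \Pi)\Lambda = \Lambda$.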

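Hence, a section that satisfies the boundary conditions can be parametrized by $\chi(\varphi) = u(\varphi) \oplus v(\varphi) \in L^2(W \oplus W)$, $u(\varphi) \in L^2_\perp$, $v(\varphi) \in L^2_\parallel$, so that

\[ \psi(\varphi) = K\chi(\varphi) = \begin{pmatrix} u(\varphi) \\ -\Lambda u(\varphi) + v(\varphi) \end{pmatrix}. \tag{18} \]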

It is mainly the projector $\Pi$ that specifies the boundary conditions. It is not difficult to see that the boundary operator (13) incorporates all standard types of boundary conditions. Indeed, by choosing $\Pi = I$ and $\Lambda = 0$ one gets the Dirichlet boundary conditions, and by choosing $\Pi = 0$, $\Lambda = 0$ one gets the Neumann boundary conditions (if $\Pi = 0$ and $\Lambda$ is a smooth endomorphism, these are called Robin boundary conditions). More generally, the case when $\Pi$ and $\Lambda$ are smooth endomorphisms of the bundle $W$ corresponds to the mixed boundary conditions. If $\Lambda$ is a first order tangential differential operator then we have oblique boundary conditions.

Remark. The boundary $\partial M$ could be, in general, a disconnected manifold consisting of a finite number of disjoint connected parts, $\partial M = \bigcup_{i=1}^n \Sigma_i$, with each $\Sigma_i$ being a compact connected manifold without boundary, $\partial\Sigma_i = \emptyset$, and $\Sigma_i \cap \Sigma_j = \emptyset$ if $i \ne j$. Thus one can impose different boundary conditions on different connected parts $\Sigma_i$ of the boundary. This means that the full boundary operator decomposes as $B = B_1 \oplus \cdots \oplus B_n$, with $B_i$ being different boundary operators acting on different bundles.

2.2.1. Nonsmooth Boundary Conditions

We always assume the manifold $M$ itself and the coefficients of the operator $F$ to be smooth in the interior of $M$. If, in addition, the boundary $\partial M$ is smooth, and the boundary operator $B$ is a differential operator with smooth coefficients, then $(F, B)$ is called a smooth local boundary value problem.

In this paper we are interested in a different class of boundary conditions. Namely, we do not assume the boundary operator to be smooth. Instead, we will study the case when it has discontinuous coefficients. Such problems are often called mixed boundary conditions; to avoid misunderstanding we will not use this terminology. We impose different boundary conditions on connected parts of the boundary, which makes the boundary value problem singular. Roughly speaking, one has a decomposition of a smooth boundary in some parts where different types of the boundary conditions are imposed, i.e. Dirichlet or Neumann. The boundary operator is then discontinuous at the intersection of these parts. The boundary value problems of this type are called the Zaremba problem in the literature [14, 15] (see also [42, 25, 5, 21, 20]).


In this paper we consider the simplest case when there are just two components. We assume that the boundary of the manifold $\partial M$ is decomposed as the disjoint union

\[ \partial M = \Sigma_1 \cup \Sigma_2 \cup \Sigma_0, \tag{19} \]

so that the closures $\bar\Sigma_1 = \Sigma_1 \cup \Sigma_0$ and $\bar\Sigma_2 = \Sigma_2 \cup \Sigma_0$ are smooth compact submanifolds of dimension $(m-1)$ (codimension one submanifolds), with the same boundary $\Sigma_0 = \partial\Sigma_1 = \partial\Sigma_2$, that is a smooth compact submanifold of dimension $(m-2)$ (codimension two submanifold) without boundary, i.e. $\partial\Sigma_0 = \emptyset$. Let us stress here that when viewed as sets both $\Sigma_1$ and $\Sigma_2$ are considered to be disjoint open sets, i.e. $\Sigma_1 \cap \Sigma_2 = \emptyset$.

Let $\chi_i: \partial M \to \mathbb{R}$, $(i = 0, 1, 2)$, be the characteristic functions of the sets $\Sigma_i$; $\chi_i(x) = 1$ if $x \in \Sigma_i$ and $\chi_i(x) = 0$ if $x \notin \Sigma_i$. Obviously, $\chi_1(x) + \chi_2(x) + \chi_0(x) = 1$ for any $x \in \partial M$. Let $\pi_i: L^2(W) \to L^2(W)$, $(i = 0, 1, 2)$, be the trivial projections of sections, $\psi$, of a vector bundle $W$ to $\Sigma_i$ defined by $(\pi_i\psi)(x) = \chi_i(x)\psi(x)$, i.e. $(\pi_i\psi)(x) = \psi(x)$ if $x \in \Sigma_i$ and $(\pi_i\psi)(x) = 0$ if $x \notin \Sigma_i$. In other words $\pi_1$ maps smooth sections of the bundle $W$ to their restriction to $\Sigma_1$, extending them by zero on $\Sigma_2$, and similarly for $\pi_2$. Obviously, $\pi_1 + \pi_2 + \pi_0 = \mathrm{Id}$, $\pi_i^2 = \pi_i$, $(i = 0, 1, 2)$, and $\pi_i\pi_j = 0$ for $i \ne j$. In principle, these projections can be used to define the boundary conditions. However, we will not use them in this paper.

We will see later that to specify the solution uniquely, i.e. to completely determine the domain of the operator $F_B$, we also need an additional condition which specifies the type of the singularity on $\Sigma_0$. Therefore, the boundary data are

\[ \psi(\varphi) = \psi_1(\varphi) \oplus \psi_2(\varphi) \oplus \psi_0(\varphi), \tag{20} \]

where $\psi_1$ and $\psi_2$ are defined as above

\[ \psi_i(\varphi) = \begin{pmatrix} \varphi|_{\Sigma_i} \\ \nabla_N\varphi|_{\Sigma_i} \end{pmatrix}, \qquad (i = 1, 2). \tag{21} \]

The boundary data map $\psi_0$ is more involved since $\Sigma_0$ is a codimension two submanifold. It turns out that the solutions of the boundary value problem could be singular at $\Sigma_0$ and still be in $L^2(V)$ (and smooth in the interior of $M$). Thus, the restriction map to $\Sigma_0$ is singular, in general, i.e. the data $\varphi|_{\Sigma_0}$ and $\nabla_N\varphi|_{\Sigma_0}$ are not well defined. Instead we shall define the boundary data as follows. Let $\rho$ be the normal geodesic distance to $\Sigma_0$ and $N = \partial_\rho$ be the unit inward normal vector field to $\Sigma_0$. Then

\[ \psi_0(\varphi) = \begin{pmatrix} (\sqrt{\rho}\,\varphi)|_{\Sigma_0} \\ \nabla_N(\sqrt{\rho}\,\varphi)|_{\Sigma_0} \end{pmatrix}. \tag{22} \]

The boundary operator decomposes accordingly

\[ B = B_1 \oplus B_2 \oplus B_0, \tag{23} \]

where $B_i$ are the boundary operators of mixed type (13). We choose $B_1$ and $B_2$ to be Dirichlet and Neumann (Robin) boundary operators

\[ B_1 = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \qquad B_2 = \begin{pmatrix} 0 & 0 \\ \Lambda & I \end{pmatrix}, \tag{24} \]

where $\Lambda$ is a smooth Hermitian endomorphism of the vector bundle $W$. As far as the boundary operator $B_0$ is concerned we will study both cases

\[ B_0 = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \tag{25} \]

and

\[ B_0 = \begin{pmatrix} 0 & 0 \\ -s & I \end{pmatrix}, \tag{26} \]

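with $s$ a real parameter.

In other words, we have Dirichlet boundary conditions on $\Sigma_1$ and Neumann (Robin) boundary conditions on $\Sigma_2$

\[ \varphi|_{\Sigma_1} = 0, \tag{27} \]

\[ (\nabla_N + \Lambda)\varphi|_{\Sigma_2} = 0 \tag{28} \]

as well as an additional condition on $\Sigma_0$:

\[ (\sqrt{\rho}\,\varphi)|_{\Sigma_0} = 0 \tag{29} \]

or

\[ (\nabla_N - s)(\sqrt{\rho}\,\varphi)|_{\Sigma_0} = 0. \tag{30} \]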

DEFINITION 2. The boundary value problem for a Laplace type operator $F$ (7) with the boundary conditions (27)–(30) is called the Zaremba problem.

Roughly speaking, the boundary condition (29) corresponds to the extension of the Dirichlet boundary conditions from $\Sigma_1$ to $\Sigma_0$, while the condition (30) corresponds to the extension of the Neumann boundary conditions from $\Sigma_2$ to $\Sigma_0$ (however, this should not be taken literally). We will discuss below the origin and the meaning of these boundary conditions and show that the case (29) leads to regular solutions at $\Sigma_0$, which is why it will be called regular, whereas the case (30) leads to singular solutions and will be called singular.

2.3. SYMMETRY

Let us define the antisymmetric bilinear form

\[ I(\varphi_1, \varphi_2) \equiv (F\varphi_1, \varphi_2)_{L^2(V)} - (\varphi_1, F\varphi_2)_{L^2(V)}, \tag{31} \]

for any two smooth sections $\varphi_1, \varphi_2 \in C^\infty(V)$ of the vector bundle $V$. By integrating by parts on $M$ one can easily see that this bilinear form depends only on the boundary data

\[ I(\varphi_1, \varphi_2) = (\psi_1(\varphi_1), J\psi_1(\varphi_2))_{L^2(W\oplus W,\Sigma_1)} + (\psi_2(\varphi_1), J\psi_2(\varphi_2))_{L^2(W\oplus W,\Sigma_2)}, \tag{32} \]

where

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \tag{33} \]

Therefore, it vanishes on sections of the bundle $V$ with compact support disjoint from the boundary $\partial M$ when the boundary data vanish, $\psi(\varphi_1) = \psi(\varphi_2) = 0$. This is a simple consequence of the fact that the operator $F$ is formally self-adjoint. A formally self-adjoint operator is essentially self-adjoint if its closure is self-adjoint. This means that the operator is such that: (i) it is symmetric on smooth sections satisfying the boundary conditions, and (ii) there exists a unique self-adjoint extension of it. To prove the latter property one has to study the deficiency indices; however, this will not be the subject of primary interest in the present paper. We check only the first property, i.e. that the operator $F$ is symmetric.

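For any two sections $\varphi_1, \varphi_2 \in C^\infty(V)$ satisfying the boundary conditions (27)–(30) we easily obtain

\[ I(\varphi_1, \varphi_2) = (\Lambda\varphi_1, \varphi_2)_{L^2(W,\Sigma_2)} - (\varphi_1, \Lambda\varphi_2)_{L^2(W,\Sigma_2)}. \tag{34} \]

That is, for any Hermitian endomorphism $\Lambda \in C^\infty(\mathrm{End}(W))$ the form $I(\varphi_1, \varphi_2)$ vanishes. Therefore, we immediately obtain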

PROPOSITION 1. Zaremba boundary value problem is symmetric.

Note that this property does not depend on the boundary conditions (29), (30) at the singular codimension two submanifold $\Sigma_0$.

2.4. ELLIPTICITY

Roughly speaking, ellipticity means invertibility up to a compact operator in appropriate functional spaces (see [39, 41, 28, 32, 30]). This is basically a condition that implies invertibility “locally”. That is why it has three components: (i) in the interior of the manifold $M$, (ii) at the boundary parts $\Sigma_1$ and $\Sigma_2$ (codimension one submanifolds) and (iii) at the singular set $\Sigma_0$ (codimension two submanifold). It can be formulated as follows.

2.4.1. Interior Ellipticity

DEFINITION 3. The operator $F$ is called elliptic if its leading symbol $\sigma_L(F; x, \xi)$ is nonsingular for any $\xi \ne 0$ and any interior point $x$ in $M$.


For a Laplace type operator (7) defined with a positive-definite nonsingular Riemannian metric we obviously have

PROPOSITION 2. Laplace type operator (8) is elliptic.

2.4.2. Codimension One Ellipticity

At the boundary we use the coordinates $x = (r, x)$ and define a split of the cotangent bundle $T^*M = \mathbb{R} \oplus T^*\partial M$ so that $\xi = (\xi_\mu) = (\omega, \zeta) \in T^*M$, where $\zeta = (\zeta_j) \in T^*\partial M$ and $\omega \in \mathbb{R}$. Let further $\lambda$ be a complex number which does not lie on the positive real axis, $\lambda \in \mathbb{C} - \mathbb{R}_+$. We consider the leading symbol $\sigma_L(F; r, x, \omega, \zeta)$ of the operator $F$, substitute $r = 0$ and $\omega \mapsto -i\partial_r$ and consider the following ordinary differential equation on a half-line ($r \in \mathbb{R}_+$)

\[ [\sigma_L(F; 0, x, -i\partial_r, \zeta) - \lambda]\varphi = 0, \tag{35} \]

with an asymptotic condition

\[ \lim_{r\to\infty}\varphi = 0. \tag{36} \]

Consider now the general boundary operator of mixed type (13) (with $\Pi$ and $\Lambda$ being some endomorphisms). Its graded leading symbol is defined by [28, 30]

\[ \sigma_{gL}(B; x, \zeta) = \begin{pmatrix} \Pi & 0 \\ 0 & I - \Pi \end{pmatrix}. \tag{37} \]

DEFINITION 4. The boundary operator $B$ is called elliptic with respect to an elliptic operator $F$ if for each boundary point $x$ in $\partial M$, each $\zeta \in T^*_x\partial M$, each $\lambda \in \mathbb{C} - \mathbb{R}_+$, $(\zeta, \lambda) \ne (0, 0)$, and each $f \in C^\infty(W \oplus W)$ there is a unique solution $\varphi$ to the equation (35) subject to the condition (36) and satisfying

\[ \sigma_{gL}(B; x, \zeta)[\psi(\varphi) - f] = 0, \tag{38} \]

where $\psi(\varphi)$ are the boundary data.

It is not difficult to check

PROPOSITION 3. The Dirichlet and Neumann boundary operators are elliptic with respect to the Laplace type operator $F$ in $\partial M \setminus \Sigma_0$.

2.4.3. Codimension Two Ellipticity

The question of ellipticity of the Zaremba boundary value problem is a subtle one. At the singular codimension two submanifold $\Sigma_0$ we use the local coordinates $x = (r, y, x)$ and define a split of the cotangent bundle $T^*M = \mathbb{R}^2 \oplus T^*\Sigma_0$ so that $\xi = (\xi_\mu) = (\omega, \nu, \eta) \in T^*M$, where $\eta = (\eta_a) \in T^*\Sigma_0$, and $(\omega, \nu) \in \mathbb{R}^2$. Let further $\lambda$ be a complex number which does not lie on the positive real axis, $\lambda \in \mathbb{C} - \mathbb{R}_+$. We consider the leading symbol $\sigma_L(F; r, y, x, \omega, \nu, \eta)$ of the operator $F$, substitute $r = y = 0$, $\omega \mapsto -i\partial_r$, $\nu \mapsto -i\partial_y$, and consider the following partial differential equation on a half-plane ($(r, y) \in \mathbb{R}_+ \times \mathbb{R}$)

\[ [\sigma_L(F; 0, 0, x, -i\partial_r, -i\partial_y, \eta) - \lambda]\varphi = 0, \tag{39} \]

with the asymptotic conditions

\[ \lim_{r\to\infty}\varphi = \lim_{|y|\to\infty}\varphi = 0. \tag{40} \]

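Define the graded leading symbol of the boundary operator $B_0$, (25) and (26), by

\[ \sigma_{gL}(B_0; x, \eta) = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad\text{or}\quad \sigma_{gL}(B_0; x, \eta) = \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix}. \tag{41} \]

DEFINITION 5. The boundary operator $B_0$ is called elliptic with respect to an elliptic operator $F$ and the elliptic boundary operators $B_1$ and $B_2$ if for each point $x$ in $\Sigma_0$, each $\eta \in T^*_x\Sigma_0$, each $\lambda \in \mathbb{C} - \mathbb{R}_+$, $(\eta, \lambda) \ne (0, 0)$, and each $f = f_1 \oplus f_2 \oplus f_0$, where $f_i \in C^\infty(W \oplus W, \Sigma_i)$, $(i = 0, 1, 2)$, there is a unique solution $\varphi$ to Equation (39) subject to the asymptotic conditions (40) and boundary conditions

\[ \sigma_{gL}(B; x, \eta)[\psi(\varphi) - f] = 0, \tag{42} \]

where $B = B_1 \oplus B_2 \oplus B_0$ is the total boundary operator and $\psi(\varphi) = \psi_1(\varphi) \oplus \psi_2(\varphi) \oplus \psi_0(\varphi)$ are the boundary data defined by (21), (22).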

We will construct the unique solution to an equivalent problem in Section 8, establishing:

PROPOSITION 4. The boundary operator $B_0$ is elliptic with respect to the Laplace type operator $F$ and the Dirichlet and Neumann boundary operators $B_1$ and $B_2$.

This leads finally to:

PROPOSITION 5. The Zaremba boundary value problem is elliptic.

3. Heat Kernel Asymptotics

For $t > 0$ the heat semi-group operator $U(t) = \exp(-tF): L^2(V) \to L^2(V)$ is well defined. The integral kernel of this operator, called the heat kernel, is defined by the equation

\[ (\partial_t + F)U(t|x, x') = 0 \tag{43} \]

with the initial condition

\[ U(0|x, x') = \delta(x, x'), \tag{44} \]

where $\delta(x, x')$ is the covariant Dirac distribution, the boundary condition

\[ B\psi[U(t|x, x')] = 0 \tag{45} \]

and the self-adjointness condition

\[ U(t|x, x') = U^*(t|x', x). \tag{46} \]

Hereafter all differential operators as well as the boundary data map act on the first argument of the heat kernel, unless otherwise stated.

Let $\lambda$ be a complex number with a sufficiently large negative real part, $\mathrm{Re}\,\lambda \ll 0$. The resolvent can then be defined by the Laplace transform

\[ G(\lambda) = \int_0^\infty dt\, e^{t\lambda}U(t), \tag{47} \]

and by analytical continuation elsewhere. The heat kernel can be expressed, in turn, in terms of the resolvent by the inverse Laplace transform

\[ U(t) = \frac{1}{2\pi i}\int_{w-i\infty}^{w+i\infty} d\lambda\, e^{-t\lambda}G(\lambda), \tag{48} \]

where $w$ is a sufficiently large negative real number, $w \ll 0$. As has been done here, we will sometimes omit the space arguments if it does not cause any confusion.
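Formula (47) can be illustrated on a single eigenvalue. Assuming that $F$ acts on an eigenspace as multiplication by a number $\mu$ (a hypothetical value below), the heat semigroup there is $e^{-t\mu}$ and the Laplace transform (47) should reproduce $(\mu - \lambda)^{-1}$ whenever $\mathrm{Re}\,\lambda < \mu$. The sketch approximates the integral with a plain trapezoidal rule; the function name, the cutoff `t_max` and the grid size are arbitrary choices made for illustration.

```python
import numpy as np

def resolvent_from_heat(mu, lam, t_max=60.0, n=600_001):
    """Approximate G(lam) = int_0^infty exp(t*lam) * exp(-t*mu) dt
    by the trapezoidal rule on [0, t_max]."""
    t = np.linspace(0.0, t_max, n)
    f = np.exp(t * (lam - mu))
    dt = t[1] - t[0]
    return dt * (f.sum() - 0.5 * (f[0] + f[-1]))

mu, lam = 2.0, -3.0              # hypothetical eigenvalue and spectral parameter, lam < mu
numeric = resolvent_from_heat(mu, lam)
exact = 1.0 / (mu - lam)         # value of (F - lam)^(-1) on the eigenspace
```

The integrand decays like $e^{-(\mu - \lambda)t}$, which is why the transform converges only for $\lambda$ to the left of the spectrum and has to be continued analytically elsewhere.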

It is well known [30] that the heat kernel $U(t|x, x')$ is a smooth function near the diagonal of $M \times M$, i.e. for $x$ close to $x'$, and has a well defined diagonal value

\[ U^{\mathrm{diag}}(t|x) = U(t|x, x), \tag{49} \]

and the functional trace

\[ \mathrm{Tr}_{L^2}\exp(-tF) = \int_M \mathrm{tr}_V U^{\mathrm{diag}}(t), \tag{50} \]

where $\mathrm{tr}_V$ is the fiber trace and the integration is defined with the help of the usual Riemannian volume element $d\,\mathrm{vol}_g$.

It is also well known that

THEOREM 3. In the smooth category the trace of the heat kernel has an asymptotic expansion as $t \to 0^+$ of the form [30]

\[ \mathrm{Tr}_{L^2}\exp(-tF) \sim \sum_{k=0}^\infty t^{(k-m)/2}A_k. \tag{51} \]


Here $A_k$ are the so-called (global) heat-kernel coefficients (sometimes also called Minakshisundaram–Pleijel coefficients). They have the following general form [30]:

\[ A_{2k} = \int_M a^{(0)}_{2k} + \int_{\partial M} a^{(1)}_{2k}, \tag{52} \]

\[ A_{2k+1} = \int_{\partial M} a^{(1)}_{2k+1}, \tag{53} \]

where $a^{(0)}_k$ and $a^{(1)}_k$ are the (local) interior and boundary heat-kernel coefficients. The local interior coefficients $a^{(0)}_k$ are also called HMDS (Hadamard–Minakshisundaram–De Witt–Seeley) coefficients in the literature. Hereafter the integration over the boundary is defined with the help of the usual Riemannian volume element $d\,\mathrm{vol}_g$ on $\partial M$ determined by the induced metric $g$.

The interior coefficients $a^{(0)}_k$ do not depend on the boundary conditions $B$. The even order coefficients $a^{(0)}_{2k}$ are calculated for Laplace-type operators up to $a^{(0)}_8$ [1, 43]. The boundary coefficients $a^{(1)}_k$ do depend on both the operator $F$ and the boundary operator $B$. They are far more complicated because in addition to the geometry of the manifold $M$ they depend essentially on the geometry of the boundary $\partial M$. For Laplace-type operators they are known for the usual boundary conditions (Dirichlet, Neumann, or a mixed version of them) up to $a^{(1)}_5$ [12, 13, 34]. For oblique boundary conditions including tangential derivatives some coefficients were recently computed in [36, 7, 8, 6, 22, 23].
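A one-dimensional example shows how the boundary coefficients depend on the boundary conditions. The sketch below assumes the scalar operator $-d^2/dx^2$ on the interval $[0, L]$, for which the spectrum is explicit: $(n\pi/L)^2$, $n \ge 1$, for Dirichlet conditions at both ends, and $((n + 1/2)\pi/L)^2$, $n \ge 0$, for Dirichlet at one end and Neumann at the other. It is an illustration only; the variable names and the values of `L`, `t` and `n_max` are arbitrary.

```python
import math

def heat_trace(eigenvalues, t):
    """Sum of exp(-t * lambda) over the supplied eigenvalues."""
    return sum(math.exp(-t * lam) for lam in eigenvalues)

L, t, n_max = 1.0, 1e-3, 2000

# Explicit spectra of -d^2/dx^2 on [0, L]
dd = [(n * math.pi / L) ** 2 for n in range(1, n_max)]            # Dirichlet at both ends
dn = [((n + 0.5) * math.pi / L) ** 2 for n in range(0, n_max)]    # Dirichlet and Neumann

# Leading term t^(-1/2) * A_0 with A_0 = (4 pi)^(-1/2) * L, i.e. m = 1 in (51)
bulk = L / math.sqrt(4 * math.pi * t)

dd_boundary = heat_trace(dd, t) - bulk    # t^0 term for two Dirichlet endpoints
dn_boundary = heat_trace(dn, t) - bulk    # t^0 term for one Dirichlet and one Neumann endpoint
```

In both cases the leading term is $L/\sqrt{4\pi t}$. The $t^0$ term is close to $-1/2$ for two Dirichlet ends and close to $0$ when one end is Neumann, since the contributions of a Dirichlet and a Neumann endpoint have opposite signs.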

However, the Zaremba boundary value problem considered in the present paper is essentially singular. Even if the manifold $M$, its boundary $\partial M$ and the operator $F$ are all in the smooth category, the coefficients of the boundary operator $B$ are discontinuous on $\Sigma_0$, which makes it a singular problem. For such problems

THEOREM 4. The asymptotic expansion of the trace of the heat kernel has additional nontrivial logarithmic terms [27, 15],

\[ \mathrm{Tr}_{L^2}\exp(-tF_B) \sim \sum_{k=0}^\infty t^{(k-m)/2}B_k + \log t\sum_{k=0}^\infty t^{k/2}H_k. \tag{54} \]

Whereas there are some results concerning the coefficients $B_k$, until recently almost nothing was known about the coefficients $H_k$. Since the Zaremba problem is local, or better to say ‘pseudo-local’, all these coefficients have the form

\[ B_{2k} = \int_M b^{(0)}_{2k} + \int_{\Sigma_1} b^{(1),1}_{2k} + \int_{\Sigma_2} b^{(1),2}_{2k} + \int_{\Sigma_0} b^{(2)}_{2k}, \tag{55} \]

\[ B_{2k+1} = \int_{\Sigma_1} b^{(1),1}_{2k+1} + \int_{\Sigma_2} b^{(1),2}_{2k+1} + \int_{\Sigma_0} b^{(2)}_{2k+1}, \tag{56} \]

\[ H_k = \int_{\Sigma_0} h_k. \tag{57} \]


Here the new feature is the appearance of the integrals over $\Sigma_0$, which complicates the problem even more, since the coefficients now depend on the geometry of the imbedding of the codimension two submanifold $\Sigma_0$ in $M$, which can be quite complicated even if it is smooth.

The asymptotic expansion of the trace of the heat kernel has been studied recently in [40]. It has been shown there that for the Zaremba type problem considered in the present paper, the logarithmic terms do not appear, i.e. $H_k = 0$ for any $k$ (see Theorem 1), which confirmed the conjecture of [5].

4. Geometrical Framework

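First of all, we need to describe properly the geometry of the problem. Let us fix two small positive numbers $\varepsilon_1, \varepsilon_2 > 0$. We split the whole manifold in a disjoint union of four different parts:

\[ M = M^{\mathrm{int}} \cup M^{\mathrm{bnd}} = M^{\mathrm{int}} \cup M^{\mathrm{bnd}}_1 \cup M^{\mathrm{bnd}}_2 \cup M^{\mathrm{bnd}}_0. \tag{58} \]

Here $M^{\mathrm{bnd}}_0$ is defined as the set of points in the narrow strip $M^{\mathrm{bnd}}$ of the manifold $M$ near the boundary $\partial M$ of the width $\varepsilon_1$ that are at the same time in a narrow strip of the width $\varepsilon_2$ near $\Sigma_0$

\[ M^{\mathrm{bnd}}_0 = \{x \in M \mid \mathrm{dist}(x, \partial M) < \varepsilon_1,\ \mathrm{dist}(x, \Sigma_0) < \varepsilon_2\}. \tag{59} \]

Further, $M^{\mathrm{bnd}}_1$ is the part of the thin strip $M^{\mathrm{bnd}}$ of the manifold $M$ (of the width $\varepsilon_1$) near the boundary $\partial M$ that is near $\Sigma_1$ but at a finite distance from $\Sigma_0$, i.e.

\[ M^{\mathrm{bnd}}_1 = \{x \in M \mid \mathrm{dist}(x, \Sigma_1) < \varepsilon_1,\ \mathrm{dist}(x, \Sigma_0) > \varepsilon_2\}. \tag{60} \]

Similarly,

\[ M^{\mathrm{bnd}}_2 = \{x \in M \mid \mathrm{dist}(x, \Sigma_2) < \varepsilon_1,\ \mathrm{dist}(x, \Sigma_0) > \varepsilon_2\}. \tag{61} \]

Finally, $M^{\mathrm{int}}$ is the interior of the manifold $M$ without a thin strip at the boundary $\partial M$, i.e.

\[ M^{\mathrm{int}} = M \setminus (M^{\mathrm{bnd}}_1 \cup M^{\mathrm{bnd}}_2 \cup M^{\mathrm{bnd}}_0) = \{x \in M \mid \mathrm{dist}(x, \partial M) > \varepsilon_1\}. \tag{62} \]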

We will construct the asymptotic solution of the heat equation on $M$ by using different approximations in different domains. Strictly speaking, to glue them together in a smooth way one should use ‘smooth characteristic functions’ of different domains (partition of unity) and carry out all necessary estimates. What one has to control is the order of the remainder terms in the limit $t \to 0$ and their dependence on $\varepsilon_1$ and $\varepsilon_2$. Since our task here is not to prove the form of the asymptotic expansion (54), which is known, but rather to compute explicitly the coefficients of the asymptotic expansion, we will not worry about such subtle details. We will compute the asymptotic expansion as $t \to 0$ in each domain and then take the limit $\varepsilon_1, \varepsilon_2 \to 0$. For a rigorous treatment see [14, 15, 27] and the references therein.

We will use different local coordinates in different domains. In $M^{\mathrm{int}}$ we do not fix the local coordinates; our treatment will be manifestly covariant.

In $M^{\mathrm{bnd}}_1$ we choose the local coordinates as follows. Let $\{e_i\}$, $(i = 2, \ldots, m)$, be the local frame for the tangent bundle $T\Sigma_1$ and $x = (x^i) = (x^2, \ldots, x^m)$, $(i = 2, \ldots, m)$, be the local coordinates on $\Sigma_1$. Let $r = \mathrm{dist}(x, \Sigma_1)$ be the normal distance to $\Sigma_1$ ($r = 0$ being the defining equation of $\Sigma_1$), and $N = \partial_r|_{\Sigma_1}$ be the inward pointing unit normal to $\Sigma_1$. Then by using the geodesic flow we get the local frame $\{N(r, x), e_i(r, x)\}$ for the tangent bundle $TM$ and the local coordinates $x = (r, x)$ on $M^{\mathrm{bnd}}_1$. The geometry of $\Sigma_1$ is described by the extrinsic curvature $K$ (second fundamental form)

\[ \nabla_i e_j = K_{ij}N, \qquad \nabla_i N = -K^j{}_i e_j. \tag{63} \]

The coordinate $r$ ranges from 0 to $\varepsilon_1$, $0 \le r \le \varepsilon_1$. The local coordinates in $M^{\mathrm{bnd}}_2$ are chosen similarly.

Finally, in $M^{\mathrm{bnd}}_0$ we choose the local coordinates as follows. Let $\{e_a(x)\}$, $(a = 3, \ldots, m)$, be a local frame for the tangent bundle $T\Sigma_0$ and let $x = (x^a) = (x^3, \ldots, x^m)$ be the local coordinates on $\Sigma_0$. Let $\mathrm{dist}_{\partial M}(x, \Sigma_0)$ be the distance from a point $x$ on $\partial M$ to $\Sigma_0$ along the boundary $\partial M$. Then define $y = +\mathrm{dist}_{\partial M}(x, \Sigma_0) > 0$ if $x \in \Sigma_1$ and $y = -\mathrm{dist}_{\partial M}(x, \Sigma_0) < 0$ if $x \in \Sigma_2$. In other words, $y = 0$ on $\Sigma_0$ ($r = y = 0$ being the defining equations of $\Sigma_0$), $y > 0$ on $\Sigma_1$ and $y < 0$ on $\Sigma_2$. Let $n(x) = \partial_y|_{\Sigma_0}$ be the unit normal to $\Sigma_0$ pointing inside $\Sigma_1$. Then by using the tangential geodesic flow along the boundary (that is normal to $\Sigma_0$) we first get the local orthonormal frame $\{n(y, x), e_a(y, x)\}$ for the tangent bundle $T\partial M$. Further, let the unit normal vector field to the boundary $N(y, x)$ be defined as above. Then by using the normal geodesic flow to the boundary we get the local frame $\{N(r, y, x), n(r, y, x), e_a(r, y, x)\}$ for the tangent bundle $TM$ and local coordinates $(r, y, x)$ on $M^{\mathrm{bnd}}_0$. The geometry of $\Sigma_0$ (a codimension two manifold) is described by two extrinsic curvatures $K$ and $L$ and an additional vector $T$:

\[ \nabla_a e_b = K_{ab}n + L_{ab}N, \tag{64} \]

\[ \nabla_a n = -K^b{}_a e_b + T_a N, \qquad \nabla_a N = -L^b{}_a e_b - T_a n. \tag{65} \]

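The ranges of the coordinates $r$ and $y$ are: $0 \le r \le \varepsilon_1$ and $-\varepsilon_2 < y \le \varepsilon_2$. Finally, we introduce the polar coordinates

\[ r = \rho\cos\theta, \qquad y = \rho\sin\theta. \tag{66} \]

The angle $\theta$ ranges from $-\pi/2$ to $\pi/2$ with $\theta = -\pi/2$ on $\Sigma_1$ and $\theta = \pi/2$ on $\Sigma_2$. To cover the whole $M^{\mathrm{bnd}}_0$, $\rho$ should range from 0 to some $\varepsilon_3$ (depending on $\varepsilon_1$ and $\varepsilon_2$), $0 \le \rho \le \varepsilon_3$.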


5. Interior Heat Kernel

This is the easiest case. The construction of the heat kernel goes along the same lines as for manifolds without boundary (see, e.g., [19, 30, 1, 9, 4]). The basic case (when the coefficients of the operator $F$ are frozen at a point $x_0$) is, in fact, zero-dimensional, i.e. algebraic. By using the normal coordinates at $x_0$ and the Fourier transform one easily obtains

PROPOSITION 6. The leading order interior heat kernel is

\[ U^{\mathrm{int}}_0(t|x, x') = (4\pi t)^{-m/2}\exp\Big(-\frac{|x - x'|^2}{4t}\Big). \tag{67} \]
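The kernel (67) can be checked numerically in the simplest setting. The sketch below assumes $m = 1$, flat space and $Q = 0$, so that the heat equation (43) reduces to $\partial_t U - \partial_x^2 U = 0$; it verifies that the kernel has unit total mass and that a finite-difference residual of the heat equation is small at a sample point. The function name `u0`, the grid and the sample values are hypothetical choices.

```python
import numpy as np

def u0(t, x, xp=0.0, m=1):
    """Leading-order interior heat kernel (67) in flat coordinates."""
    return (4 * np.pi * t) ** (-m / 2) * np.exp(-np.abs(x - xp) ** 2 / (4 * t))

t, h = 0.3, 1e-3

# Total mass: Riemann sum of the kernel over a wide interval
x = np.linspace(-20.0, 20.0, 40_001)
total_mass = np.sum(u0(t, x)) * (x[1] - x[0])

# Residual of (d/dt - d^2/dx^2) u0 at a sample point, by central differences
x0 = 0.7
dt_u = (u0(t + h, x0) - u0(t - h, x0)) / (2 * h)
dxx_u = (u0(t, x0 + h) - 2 * u0(t, x0) + u0(t, x0 - h)) / h ** 2
residual = dt_u - dxx_u
```

Unit mass reflects the initial condition (44): as $t \to 0^+$ the Gaussian concentrates at $x = x'$ while its integral stays equal to one.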

We try to find the fundamental solution of the heat equation near the diagonal for small $t$, i.e. $x \to x'$ and $t \to 0^+$, that, instead of the boundary conditions, satisfies an asymptotic condition at infinity. This means that effectively one introduces a small expansion parameter $\varepsilon$ reflecting the fact that the points $x$ and $x'$ are close to each other and the parameter $t$ is small. This can be done by fixing a point $x_0 = x'$ in $M^{\mathrm{int}}$, choosing the normal coordinates at this point (with $g_{\mu\nu}(x') = \delta_{\mu\nu}$), scaling

\[ x \to x' + \varepsilon(x - x'), \qquad y \to x' + \varepsilon(y - x'), \qquad t \to \varepsilon^2 t, \tag{68} \]

and expanding in a power series in $\varepsilon$. We will label the scaled objects by $\varepsilon$, e.g., $U_\varepsilon$. The scaling parameter $\varepsilon$ will be considered as a small parameter in the theory and we will use it to expand everything in power (asymptotic) series in $\varepsilon$. At the very end of calculations we set $\varepsilon = 1$. The nonscaled objects, i.e. those with $\varepsilon = 1$, will not have the label $\varepsilon$. Another way of doing this is by saying that we will expand all quantities in the homogeneous functions of $(x - x')$, $(y - y')$ and $\sqrt{t}$. This construction is standard and we do not repeat it here.

One can also use instead a manifestly covariant method [19, 9, 1, 3, 4, 43], which gives a convenient formula for the asymptotics as $t \to 0^+$

\[ U^{\mathrm{int}}(t) \sim \exp\Big(-\frac{\sigma}{2t}\Big)E^{1/2}\sum_{k=0}^\infty t^{(k-m)/2}a_k, \tag{69} \]

where $\sigma = \sigma(x, x') = (1/2)[\mathrm{dist}(x, x')]^2$ is one half of the square of the geodesic distance between $x$ and $x'$, $E = E(x, x') = g^{-1/2}(x)\,g^{-1/2}(x')\det[-\partial^x_\mu\partial^{x'}_\nu\sigma(x, x')]$ is the corresponding Van Vleck–Morette determinant, $g = \det g_{\mu\nu}$, and $a_k = a_k(x, x')$ are the off-diagonal heat-kernel coefficients (note that odd order coefficients vanish identically, i.e. $a_{2k+1} = 0$). These coefficients satisfy certain differential recursion relations which can be solved in the form of a covariant Taylor series near the diagonal [1].

PROPOSITION 7. The asymptotic expansion of the heat kernel on the diagonal reads

\[ U^{\mathrm{int}}_{\mathrm{diag}}(t) \sim \sum_{k=0}^\infty t^{(k-m)/2}a^{\mathrm{diag}}_k, \tag{70} \]

where

\[ a^{\mathrm{diag}}_k(x) = a_k(x, x). \tag{71} \]

This asymptotic expansion can be integrated over the interior of the manifold $M^{\mathrm{int}}$. Since both the local interior coefficients $a_k$ and the volume element $d\,\mathrm{vol}_g$ are regular at the boundary, these integrals have well defined limits as $\varepsilon_1 \to 0$

\[ \lim_{\varepsilon_1\to 0^+}\int_{M^{\mathrm{int}}}\mathrm{tr}_V\,a^{\mathrm{diag}}_k = \int_M \mathrm{tr}_V\,a^{\mathrm{diag}}_k. \tag{72} \]

Thus we obtain

PROPOSITION 8. The local interior contribution to the global heat kernel coefficients $B_k$ is given by

$$b^{(0)}_{2k} = \mathrm{tr}_V\, a^{\mathrm{diag}}_{2k}. \tag{73}$$

As we already noted above all odd order coefficients vanish, $b^{(0)}_{2k+1} = 0$. The explicit formulas for even order coefficients $b^{(0)}_{2k}$ are known up to $b^{(0)}_{8}$ [1, 43]. The first two coefficients have the well known form:

THEOREM 5.

$$b^{(0)}_0 = (4\pi)^{-m/2} \dim V, \tag{74}$$

$$b^{(0)}_2 = (4\pi)^{-m/2}\, \mathrm{tr}_V \left(\frac{1}{6} R - Q\right), \tag{75}$$

where $R$ is the scalar curvature.

6. Dirichlet Heat Kernel

In this section we will follow closely the ideas of the paper [2]. For an elliptic boundary-value problem the diagonal of the heat kernel $U^{\mathrm{diag}}_{\mathrm{bnd}}(t)$ in $M^{\mathrm{bnd}}_1$ has exponentially small terms, i.e. of order $\sim \exp(-r^2/t)$ (recall that $r$ is the normal geodesic distance to the boundary), as $t \to 0^+$ and $r > 0$. These terms do not contribute to the asymptotic expansion of the heat-kernel diagonal outside the boundary as $t \to 0^+$. However, they behave like distributions near the boundary, and, therefore, the integrals over $M^{\mathrm{bnd}}_1$, more precisely, the integrals $\lim_{\varepsilon_1 \to 0} \int_{\Sigma_1} \int_0^{\varepsilon_1} dr\,(\ldots)$, do contribute to the asymptotic expansion with coefficients being the integrals over $\Sigma_1$. It is this phenomenon that leads to the boundary terms in the heat kernel coefficients. Thus, such terms determine the local boundary contributions $b^{(1)}_k$ to the global heat-kernel coefficients $B_k$. The same applies to the Neumann heat kernel and $\Sigma_2$.

The Dirichlet heat kernel $U^{\mathrm{bnd},1}(t|x, x')$ in $M^{\mathrm{bnd}}_1$ is constructed as follows. Now we want to find the fundamental solution of the heat equation near the diagonal, i.e. for $x \to x'$ and for small $t \to 0$ in the region $M^{\mathrm{bnd}}_1$ close to the boundary, i.e. for small $r$ and $r'$, that satisfies Dirichlet boundary conditions on $\Sigma_1$ and an asymptotic condition at infinity. We fix a point on the boundary, $x_0 \in \Sigma_1$, and choose normal coordinates on $\Sigma_1$ at this point (with $g_{ij}(0, x_0) = \delta_{ij}$).

The basic case here (when the coefficients of the operator $F$ are frozen at the point $x_0$) is one-dimensional. The zeroth-order term $U^{\mathrm{bnd},(1)}_0$ is defined by the heat equation

$$(\partial_t + F_0)\, U^{\mathrm{bnd},(1)}_0 = 0, \tag{76}$$

where

$$F_0 = -\partial_r^2 - \partial^2, \tag{77}$$

the initial condition

$$U^{\mathrm{bnd},(1)}_0(0|r, x; r', x') = \delta(r - r')\,\delta(x, x'), \tag{78}$$

the boundary conditions,

$$U^{\mathrm{bnd},(1)}_0 \big|_{\Sigma_1} = 0, \tag{79}$$

and the asymptotic condition

$$\lim_{r \to \infty} U^{\mathrm{bnd},(1)}_0(t|r, x; r', x') = \lim_{r' \to \infty} U^{\mathrm{bnd},(1)}_0(t|r, x; r', x') = 0. \tag{80}$$

Note that the restriction to the boundary $(\ldots)|_{\Sigma_1}$ applies only to the first argument, i.e. $r \to 0$. The operator $F_0$ is a partial differential operator with constant coefficients. By using the Fourier transform in the boundary coordinates $(x - x')$ it reduces to an ordinary differential operator of second order. Clearly, the $\Sigma_0$ part factorizes and the solution to the remaining one-dimensional problem can be easily obtained by using the Laplace transform, for example.

PROPOSITION 9. The leading order Dirichlet heat kernel has the form

$$U^{\mathrm{bnd},(1)}_0(t|r, x; r', x') = K(t|r, x; r', x') - K(t|r, x; -r', x'), \tag{81}$$

where

$$K(t|r, x; r', x') = (4\pi t)^{-m/2} \exp\left(-\frac{|x - x'|^2 + (r - r')^2}{4t}\right). \tag{82}$$
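The image construction in (81) and (82) can be checked numerically. The following is a minimal sketch, assuming flat tangential coordinates and a scalar bundle; the function names `K` and `U_dirichlet` and the sample values are illustrative choices, not part of the paper.

```python
import math

def K(t, r, x, rp, xp, m=2):
    """Free Gaussian kernel (82); x and xp are tuples of the m-1 tangential coordinates."""
    dx2 = sum((a - b) ** 2 for a, b in zip(x, xp))
    return (4 * math.pi * t) ** (-m / 2) * math.exp(-(dx2 + (r - rp) ** 2) / (4 * t))

def U_dirichlet(t, r, x, rp, xp, m=2):
    """Leading-order Dirichlet kernel (81): source term minus its mirror image."""
    return K(t, r, x, rp, xp, m) - K(t, r, x, -rp, xp, m)

# Illustrative sample point (assumed values).
t, x, xp = 0.3, (0.1,), (-0.2,)
boundary_value = U_dirichlet(t, 0.0, x, 0.7, xp)           # should vanish at r = 0
odd_sum = U_dirichlet(t, 0.5, x, 0.7, xp) + U_dirichlet(t, -0.5, x, 0.7, xp)
print(boundary_value, odd_sum)
```

Both printed numbers should be zero up to rounding: the first is the boundary condition (79), the second expresses that the kernel is odd under $r \to -r$.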

Note that in addition to the usual symmetry of the heat kernel, the Dirichlet heat kernel possesses the following ‘mirror symmetry’

$$U^{\mathrm{bnd},(1)}_0(t|r, x; r', x') = -U^{\mathrm{bnd},(1)}_0(t|-r, x; r', x') = -U^{\mathrm{bnd},(1)}_0(t|r, x; -r', x'), \tag{83}$$

i.e. it is an odd function of the coordinates $r$ and $r'$ separately.

To construct the whole heat kernel, we again scale the coordinates. But now we include the coordinates $r$ and $r'$ in the scaling

$$x \to x_0 + \varepsilon(x - x_0), \qquad x' \to x_0 + \varepsilon(x' - x_0), \tag{84}$$

$$r \to \varepsilon r, \qquad r' \to \varepsilon r', \qquad t \to \varepsilon^2 t. \tag{85}$$

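The corresponding differential operators are scaled by

$$\partial \to \frac{1}{\varepsilon}\,\partial, \qquad \partial_r \to \frac{1}{\varepsilon}\,\partial_r, \qquad \partial_t \to \frac{1}{\varepsilon^2}\,\partial_t. \tag{86}$$

Then, we expand the scaled operator $F_\varepsilon$ in the power series in $\varepsilon$, i.e.

$$F \to F_\varepsilon \sim \sum_{n=0}^{\infty} \varepsilon^{n-2} F_n, \tag{87}$$

where $F_n$ are second-order differential operators with homogeneous symbols. Since the Dirichlet boundary operator does not contain any derivatives and has constant coefficients on $\Sigma_1$ it does not scale at all.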

The subsequent strategy is rather simple. We expand the scaled heat kernel in $\varepsilon$

$$U^{\mathrm{bnd},(1)}_\varepsilon \sim \sum_{n=0}^{\infty} \varepsilon^{2-m+n}\, U^{\mathrm{bnd},(1)}_n, \tag{88}$$

and substitute into the scaled version of the heat equation and the Dirichlet boundary condition on $\Sigma_1$. Then, by equating the like powers in $\varepsilon$ one gets an infinite set of recursive differential equations

$$(\partial_t + F_0)\, U^{\mathrm{bnd},(1)}_k = -\sum_{n=1}^{k} F_n\, U^{\mathrm{bnd},(1)}_{k-n}, \qquad k = 1, 2, \ldots, \tag{89}$$

with the boundary conditions

$$U^{\mathrm{bnd},(1)}_k(t|0, x; r', x') = U^{\mathrm{bnd},(1)}_k(t|r, x; 0, x') = 0, \tag{90}$$

and the asymptotic conditions

$$\lim_{r \to \infty} U^{\mathrm{bnd},(1)}_k(t|r, x; r', x') = \lim_{r' \to \infty} U^{\mathrm{bnd},(1)}_k(t|r, x; r', x') = 0. \tag{91}$$

In other words, we decompose the heat kernel into the homogeneous parts with respect to $(x - x_0)$, $(x' - x_0)$, $r$, $r'$ and $\sqrt{t}$, i.e.

$$U^{\mathrm{bnd},(1)}_k(t|r, x; r', x') = t^{(k-m)/2}\, U^{\mathrm{bnd},(1)}_k\left(1\,\big|\,t^{-1/2} r,\; x' + t^{-1/2}(x - x');\; t^{-1/2} r',\; x'\right), \tag{92}$$

in particular, on the diagonal we have

$$U^{\mathrm{bnd},(1)}_k(t|r, x; r, x) = t^{(k-m)/2}\, U^{\mathrm{bnd},(1)}_k\left(1\,\big|\,t^{-1/2} r, x;\; t^{-1/2} r, x\right), \tag{93}$$

and, therefore:

PROPOSITION 10. As $t \to 0$

$$U^{\mathrm{bnd},(1)}_{\mathrm{diag}}(t|r, x) \sim \sum_{k=0}^{\infty} t^{(k-m)/2}\, U^{\mathrm{bnd},(1)}_k\left(1\,\big|\,t^{-1/2} r, x;\; t^{-1/2} r, x\right). \tag{94}$$

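To compute the contribution to the asymptotic expansion of the trace of the heat kernel, we will need to compute the integral of $U^{\mathrm{bnd},(1)}_{\mathrm{diag}}(t)$ over $M^{\mathrm{bnd}}_1$. One should stress that the volume element should also be scaled

$$d\,\mathrm{vol}(r, x) \to d\,\mathrm{vol}(\varepsilon r, x) = d\,\mathrm{vol}(0, x) \cdot \sum_{k=0}^{\infty} \frac{\varepsilon^k r^k}{k!}\, g_k(x), \tag{95}$$

where

$$g_k(x) = \frac{\partial^k}{\partial r^k} \left[\frac{d\,\mathrm{vol}(r, x)}{d\,\mathrm{vol}(0, x)}\right]\Bigg|_{r=0}. \tag{96}$$

Combining the above equations and changing the variable $r = \sqrt{t}\,\xi$ we obtain

$$\begin{aligned}
\int_{M^{\mathrm{bnd}}_1} U^{\mathrm{bnd},(1)}_{\mathrm{diag}}(t)
&= \int_{\Sigma_1} \int_0^{\varepsilon_1} dr\, \frac{d\,\mathrm{vol}(r, x)}{d\,\mathrm{vol}(0, x)}\, U^{\mathrm{bnd},(1)}_{\mathrm{diag}}(t|r, x; r, x) \\
&\sim \sum_{k=0}^{\infty} t^{(k-m)/2} \int_{\Sigma_1} \sum_{n=0}^{k-1} \frac{1}{n!}\, g_n(x) \int_0^{\varepsilon_1/\sqrt{t}} d\xi\, \xi^n\, U^{\mathrm{bnd},(1)}_{k-n-1}(1|\xi, x; \xi, x).
\end{aligned} \tag{97}$$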

We note that even if the coefficients $U^{\mathrm{bnd},(1)}_k$ satisfy the asymptotic regularity condition (91) at $r \to \infty$ off the diagonal, their diagonal values do not fall off at infinity. They have the following general form

$$U^{\mathrm{bnd},(1)}_k(1|\xi, x; \xi, x) = P_k(\xi, x) + Y^{(1)}_k(\xi, x), \tag{98}$$

where $P_k(\xi, x)$ are polynomials in $\xi$ and $Y^{(1)}_k(\xi, x)$ are exponentially small, more precisely $\sim \xi^\alpha \exp(-\xi^2)$ with some $\alpha$, as $\xi \to \infty$ (which corresponds to $t \to 0$). Obviously, the integrals of the polynomial part over $M^{\mathrm{bnd}}_1$ vanish after taking the asymptotic expansion as $t \to 0$ and the limit $\varepsilon_1, \varepsilon_2 \to 0$. The coefficients $P_k$ constitute simply the ‘interior part’ of the heat kernel and are not essential in computing the boundary contribution. The coefficients $Y^{(1)}_k$, on the contrary, behave like distributions near $\Sigma_1$. They give the $\Sigma_1$ contributions to the boundary heat kernel coefficients $b^{(1)}_k$. In the limit $t \to 0$ the integral $\int_0^{\varepsilon_1/\sqrt{t}} d\xi\,(\ldots)$ becomes $\int_0^{\infty} d\xi\,(\ldots)$ plus an exponentially small remainder term. Then in the limit $\varepsilon_1 \to 0$ we obtain integrals over $\Sigma_1$ up to an exponentially small function that we are not interested in.

As the result we get


PROPOSITION 11. The coefficients $b^{(1)}_k$ are given by

$$b^{(1),1}_k = \sum_{n=0}^{k-1} \frac{1}{n!}\, g_n \int_0^{\infty} d\xi\, \xi^n\, \mathrm{tr}_V\, Y^{(1)}_{k-n-1}(\xi, x). \tag{99}$$

These are the standard boundary heat kernel coefficients for Dirichlet boundary conditions. They are listed, for example, in [12, 13] up to $k = 4$. The first two have the form:

THEOREM 6.

$$\begin{aligned}
b^{(1),1}_0 &= 0, \\
b^{(1),1}_1 &= -(4\pi)^{-(m-1)/2} \dim V\, \frac{1}{4}, \\
b^{(1),1}_2 &= (4\pi)^{-m/2} \dim V\, \frac{1}{3} K,
\end{aligned} \tag{100}$$

where $K$ is the trace of the extrinsic curvature (second fundamental form) of the boundary.
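The value of $b^{(1),1}_1$ can be reproduced from (99) with $k = 1$ and the leading kernel (81). On the diagonal at $t = 1$, (81) gives $(4\pi)^{-m/2}[1 - \exp(-\xi^2)]$, so the exponentially small part is $Y^{(1)}_0 = -(4\pi)^{-m/2} \exp(-\xi^2)$ and $g_0 = 1$. The sketch below integrates this numerically; the scalar case $\dim V = 1$, the value $m = 3$, the cut-off at $\xi = 10$ and the helper `integrate` are assumptions made for illustration.

```python
import math

def Y0(xi, m):
    """Exponentially small part of the diagonal of (81) at t = 1 (scalar case)."""
    return -(4 * math.pi) ** (-m / 2) * math.exp(-xi ** 2)

def integrate(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

m, dim_V = 3, 1                                   # illustrative values
b1_numeric = dim_V * integrate(lambda xi: Y0(xi, m), 0.0, 10.0)
b1_theorem = -(4 * math.pi) ** (-(m - 1) / 2) * dim_V / 4
print(b1_numeric, b1_theorem)
```

The two numbers should agree closely, since $\int_0^\infty e^{-\xi^2} d\xi = \sqrt{\pi}/2$ converts the factor $(4\pi)^{-m/2}$ into $(4\pi)^{-(m-1)/2}/4$.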

7. Neumann Heat Kernel

The construction of the Neumann heat kernel in $M^{\mathrm{bnd}}_2$ goes essentially along the same lines except that now the boundary operator, in fact the endomorphism $\Lambda$, is not constant and should be also scaled, so that the scaled boundary conditions are [7, 8]

$$\left(\frac{1}{\varepsilon}\,\partial_r + \Lambda_\varepsilon\right)\Bigg|_{\Sigma_2} = 0, \tag{101}$$

where

$$\Lambda_\varepsilon \sim \sum_{k=0}^{\infty} \varepsilon^k \Lambda_k. \tag{102}$$

The zeroth-order operator $F_0$ is given by the same formula (77) and the zero-order boundary operator is just the standard Neumann one. The basic zero-order problem can again be easily solved giving

PROPOSITION 12. The leading order Neumann heat kernel has the form

$$U^{\mathrm{bnd},(2)}_0(t|r, x; r', x') = K(t|r, x; r', x') + K(t|r, x; -r', x') \tag{103}$$

with the same kernel $K$ (82).
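As a quick numerical illustration of (103), the sketch below evaluates the normal derivative of the image sum at $r = 0$ by a central difference. The reduction to the normal variable only (tangential separation passed as `dx2`), the step size and the sample values are assumptions for illustration.

```python
import math

def K(t, r, rp, dx2=0.0, m=2):
    """Free kernel (82); dx2 stands for |x - x'|^2 (assumed flat tangential part)."""
    return (4 * math.pi * t) ** (-m / 2) * math.exp(-(dx2 + (r - rp) ** 2) / (4 * t))

def U_neumann(t, r, rp, dx2=0.0, m=2):
    """Leading-order Neumann kernel (103): source term plus its mirror image."""
    return K(t, r, rp, dx2, m) + K(t, r, -rp, dx2, m)

def d_r(f, r, h=1e-5):
    """Central-difference approximation of the derivative of f at r."""
    return (f(r + h) - f(r - h)) / (2 * h)

slope_at_boundary = d_r(lambda r: U_neumann(0.3, r, 0.7), 0.0)
print(slope_at_boundary)
```

Because the image sum is even in $r$, the printed slope should vanish up to rounding, which is the zero-order Neumann condition.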


Note that the Neumann heat kernel has another mirror symmetry

$$U^{\mathrm{bnd},(2)}_0(t|r, x; r', x') = U^{\mathrm{bnd},(2)}_0(t|-r, x; r', x') = U^{\mathrm{bnd},(2)}_0(t|r, x; -r', x'), \tag{104}$$

i.e. it is an even function of the coordinates $r$ and $r'$ separately.

The construction of the heat kernel goes along the same lines as in the Dirichlet case. We have the recursive differential equations

$$(\partial_t + F_0)\, U^{\mathrm{bnd},(2)}_k = -\sum_{n=1}^{k} F_n\, U^{\mathrm{bnd},(2)}_{k-n}, \qquad k = 1, 2, \ldots, \tag{105}$$

with the boundary conditions

$$\partial_r U^{\mathrm{bnd},(2)}_k \Big|_{\Sigma_2} = -\sum_{n=1}^{k-1} \Lambda_n\, U^{\mathrm{bnd},(2)}_{k-n-1} \Big|_{\Sigma_2}, \tag{106}$$

and the asymptotic conditions

$$\lim_{r \to \infty} U^{\mathrm{bnd},(2)}_k(t|r, x; r', x') = \lim_{r' \to \infty} U^{\mathrm{bnd},(2)}_k(t|r, x; r', x') = 0. \tag{107}$$

As we already noted above the restriction to the boundary applies only to the first argument $r$. One can repeat here everything said at the end of the previous subsection about the Dirichlet heat kernel. We have again the homogeneity property

$$U^{\mathrm{bnd},(2)}_k(t|r, x; r', x') = t^{(k-m)/2}\, U^{\mathrm{bnd},(2)}_k\left(1\,\big|\,t^{-1/2} r,\; x' + t^{-1/2}(x - x');\; t^{-1/2} r',\; x'\right) \tag{108}$$

and the following expansion for the diagonal:

PROPOSITION 13. As $t \to 0$

$$U^{\mathrm{bnd},(2)}_{\mathrm{diag}}(t|r, x) \sim \sum_{k=0}^{\infty} t^{(k-m)/2}\, U^{\mathrm{bnd},(2)}_k\left(1\,\big|\,t^{-1/2} r, x;\; t^{-1/2} r, x\right). \tag{109}$$

By separating the polynomial and exponentially small parts,

$$U^{\mathrm{bnd},(2)}_k(1|\xi, x; \xi, x) = P_k(\xi, x) + Y^{(2)}_k(\xi, x), \tag{110}$$

and repeating the arguments at the end of the previous subsection we obtain the $\Sigma_2$ contributions to the boundary heat kernel coefficients $b^{(1),2}_k$:

PROPOSITION 14.

$$b^{(1),2}_k = \sum_{n=0}^{k-1} \frac{1}{n!}\, g_n \int_0^{\infty} d\xi\, \xi^n\, \mathrm{tr}_V\, Y^{(2)}_{k-n-1}(\xi, x). \tag{111}$$


These are the standard boundary heat kernel coefficients for Neumann boundary conditions. They are listed, for example, in [12, 13] up to $k = 4$. The first two have the form

THEOREM 7.

$$\begin{aligned}
b^{(1),2}_0 &= 0, \\
b^{(1),2}_1 &= (4\pi)^{-(m-1)/2} \dim V\, \frac{1}{4}, \\
b^{(1),2}_2 &= (4\pi)^{-m/2} \dim V \left(\frac{1}{3} K + 2\Lambda\right).
\end{aligned} \tag{112}$$

8. Zaremba Heat Kernel

This is the most complicated (and the most interesting) case, since here the basic problem with frozen coefficients at a point $x_0$ on $\Sigma_0$ is two-dimensional. We will limit ourselves in this paper to the leading order and will be actually working in the tangent space $\mathbb{R}_+ \times \mathbb{R} \times T_{x_0}\partial\Sigma_0$, so that the basic problem in $M^{\mathrm{bnd}}_0$ will be reduced to the problem on the half-plane.

As above we denote by $r$ the normal geodesic distance to the boundary $\partial M$ and by $y$ the signed normal geodesic distance to $\Sigma_0$ along the boundary and we choose normal coordinates on $\Sigma_0$ at the point $x_0$ (with $g_{ab}(0, 0, x_0) = \delta_{ab}$). Then the operator $F_0$ has the form

$$F_0 = -\partial_r^2 - \partial_y^2 - \partial^2, \tag{113}$$

where $\partial^2 = \delta^{ab}\partial_a\partial_b$. This operator acts on the square integrable sections of the vector bundle $V$ in a neighbourhood of the point $x_0$. We extend the operator appropriately to $L^2(V, \mathbb{R}_+, \mathbb{R}, \mathbb{R}^{m-2}; dr\, dy\, dx)$, so that it coincides with the initial operator in the neighbourhood of the point $x_0$. By choosing the polar coordinates in the normal bundle described above we obtain the operator

$$F_0 = -\partial_\rho^2 - \frac{1}{\rho}\,\partial_\rho - \frac{1}{\rho^2}\,\partial_\theta^2 - \partial^2 \tag{114}$$

acting on

$$L^2\left(V, \mathbb{R}_+, \left[-\frac{\pi}{2}, \frac{\pi}{2}\right], \mathbb{R}^{m-2};\; \rho\, d\rho\, d\theta\, dx\right). \tag{115}$$

Note that in the polar coordinates the set $\Sigma_1$ corresponds to $\theta = \pi/2$, the set $\Sigma_2$ corresponds to $\theta = -\pi/2$ and the singular set $\Sigma_0$ corresponds to $\rho = 0$. The zero order inward pointing normal $N$ to the boundary in polar coordinates has the form

$$N_0\big|_{\Sigma_1} = \partial_r\big|_{y>0} = -\frac{1}{\rho}\,\partial_\theta\Big|_{\rho>0,\ \theta=\pi/2}, \tag{116}$$

$$N_0\big|_{\Sigma_2} = \partial_r\big|_{y<0} = \frac{1}{\rho}\,\partial_\theta\Big|_{\rho>0,\ \theta=-\pi/2}, \tag{117}$$

$$N_0\big|_{\Sigma_0} = \partial_r\big|_{y=0} = \partial_\theta\big|_{\rho=0,\ \theta=0}. \tag{118}$$


Also, in the leading order the endomorphism $\Lambda$ in the Neumann boundary operator does not contribute. Therefore, the Zaremba boundary conditions on $\Sigma_1$ and $\Sigma_2$ read

$$\varphi\big|_{\theta=\pi/2} = 0, \qquad \partial_\theta\varphi\big|_{\theta=-\pi/2} = 0, \qquad (\rho > 0). \tag{119}$$

Hence the boundary operator is discontinuous, and there is a singularity at the origin $\rho = 0$. As we already discussed above these boundary conditions do not completely determine the domain of the operator $F_0$: we need an additional condition along the singular set $\Sigma_0$, i.e. we need to specify the behavior of the solution as $\rho \to 0^+$.

Since $\varphi$ must be square integrable,

$$\int_{\mathbb{R}^{m-2}} dx \int_{-\pi/2}^{\pi/2} d\theta \int_0^{\infty} d\rho\, \rho\, |\varphi|^2 < \infty, \tag{120}$$

the section $\varphi$ must decrease at infinity faster than $\rho^{-1/2}$ as $\rho \to \infty$. We require that

$$\lim_{\rho \to \infty} \sqrt{\rho}\,\varphi = \lim_{\rho \to \infty} \partial_\rho(\sqrt{\rho}\,\varphi) = 0. \tag{121}$$

On the other hand, since the volume element has an extra power of $\rho$, the square integrable section can be singular as $\rho \to 0$. However, this must be an integrable singularity, i.e. $\varphi$ cannot be more singular than $\rho^{-1/2}$ as $\rho \to 0^+$. The type of the singularity should be specified by an additional boundary condition at $\rho \to 0^+$. Since the point $\rho = 0$ is singular, this boundary condition cannot be imposed arbitrarily. Also it does not follow from the boundary conditions on $\Sigma_1$ and $\Sigma_2$. We impose it in one of the following forms

$$\sqrt{\rho}\,\varphi\big|_{\rho=0^+} = 0, \tag{122}$$

or

$$(\partial_\rho - s)(\sqrt{\rho}\,\varphi)\big|_{\rho=0^+} = 0, \tag{123}$$

where $s$ is a real parameter. We call the boundary condition (122) the regular boundary condition and the boundary condition (123) the singular boundary condition. The regular boundary condition corresponds formally to the limit $s \to +\infty$ in the singular one. We will see that the heat kernel asymptotics do depend on this boundary condition as well.

8.1. SEPARATION OF VARIABLES

Thus, the Zaremba heat kernel in the leading approximation is determined by the fundamental solution of the heat equation for the operator $F_0$ (114) with the boundary conditions (119), (121) and (122) or (123). The part due to $\Sigma_0$ easily factorizes

$$U^{\mathrm{bnd},(0)}_0(t|\rho, \theta, x; \rho', \theta', x') = (4\pi t)^{-(m-2)/2} \exp\left(-\frac{|x - x'|^2}{4t}\right) I(t|\rho, \theta; \rho', \theta'), \tag{124}$$

and for $I(t|\rho, \theta; \rho', \theta')$ we obtain a two-dimensional boundary value problem: the heat equation

$$\left(\partial_t - \partial_\rho^2 - \frac{1}{\rho}\,\partial_\rho - \frac{1}{\rho^2}\,\partial_\theta^2\right) I(t|\rho, \theta; \rho', \theta') = 0, \tag{125}$$

the initial condition

$$I(0^+|\rho, \theta; \rho', \theta') = \frac{1}{\sqrt{\rho\rho'}}\, \delta(\rho - \rho')\, \delta(\theta - \theta'), \tag{126}$$

the symmetry condition

$$I(t|\rho, \theta; \rho', \theta') = I^*(t|\rho', \theta'; \rho, \theta), \tag{127}$$

and the boundary conditions:

$$I(t|\rho, \theta; \rho', \theta')\big|_{\theta=\pi/2} = 0, \tag{128}$$

$$\partial_\theta I(t|\rho, \theta; \rho', \theta')\big|_{\theta=-\pi/2} = 0, \tag{129}$$

$$\lim_{\rho \to \infty} \sqrt{\rho\rho'}\, I(t|\rho, \theta; \rho', \theta') = \lim_{\rho \to \infty} \partial_\rho\left[\sqrt{\rho\rho'}\, I(t|\rho, \theta; \rho', \theta')\right] = 0 \tag{130}$$

and

$$\left[\sqrt{\rho\rho'}\, I(t|\rho, \theta; \rho', \theta')\right]\Big|_{\rho=0^+} = 0, \tag{131}$$

or

$$(\partial_\rho - s)\left[\sqrt{\rho\rho'}\, I(t|\rho, \theta; \rho', \theta')\right]\Big|_{\rho=0^+} = 0. \tag{132}$$

The existence of a unique solution to this two-dimensional problem is actually equivalent to the ellipticity of the Zaremba boundary value problem.

To construct the heat kernel we study first the operator

$$L = -\partial_\theta^2 \tag{133}$$

on $L^2(V, [-\pi/2, \pi/2])$ with the boundary conditions

$$\varphi(\theta)\big|_{\theta=\pi/2} = 0, \qquad \partial_\theta\varphi(\theta)\big|_{\theta=-\pi/2} = 0. \tag{134}$$

It is not difficult to find the spectral resolution of this operator.


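PROPOSITION 15. The orthonormal eigenfunctions and eigenvalues of the operator $L$ are

$$\varphi_n(\theta) = \sqrt{\frac{2}{\pi}}\, \cos\left[\left(n + \frac{1}{2}\right)\left(\theta + \frac{\pi}{2}\right)\right], \tag{135}$$

$$\lambda_n = \left(n + \frac{1}{2}\right)^2, \tag{136}$$

where $n = 0, 1, 2, \ldots$.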
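The statement of Proposition 15 is easy to test numerically. The sketch below checks, for the first few modes, that the functions (135) are orthonormal on $[-\pi/2, \pi/2]$ and vanish at $\theta = \pi/2$; the Simpson quadrature helper `inner` and the number of modes tested are illustrative assumptions.

```python
import math

def phi(n, theta):
    """Eigenfunction (135) of L = -d^2/d(theta)^2 with mixed boundary conditions (134)."""
    return math.sqrt(2 / math.pi) * math.cos((n + 0.5) * (theta + math.pi / 2))

def inner(n, k, steps=4000):
    """Simpson approximation of the L2 inner product of phi_n and phi_k on [-pi/2, pi/2]."""
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / steps
    s = phi(n, a) * phi(k, a) + phi(n, b) * phi(k, b)
    for i in range(1, steps):
        th = a + i * h
        s += (4 if i % 2 else 2) * phi(n, th) * phi(k, th)
    return s * h / 3

gram = [[inner(n, k) for k in range(4)] for n in range(4)]
dirichlet_values = [phi(n, math.pi / 2) for n in range(4)]
for row in gram:
    print([round(v, 8) for v in row])
```

The printed Gram matrix should be the identity up to quadrature error, and the Neumann condition at $\theta = -\pi/2$ holds because the cosine has zero slope where its argument vanishes.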

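By separating the variables

$$I(t|\rho, \theta; \rho', \theta') = \sum_{n=0}^{\infty} \varphi_n(\theta)\, \varphi_n(\theta')\, u_n(t|\rho; \rho') \tag{137}$$

we obtain the equation

$$\left[\partial_t - \partial_\rho^2 - \frac{1}{\rho}\,\partial_\rho + \frac{1}{\rho^2}\left(n + \frac{1}{2}\right)^2\right] u_n(t|\rho; \rho') = 0, \tag{138}$$

with the initial condition

$$u_n(0^+|\rho; \rho') = \frac{1}{\sqrt{\rho\rho'}}\, \delta(\rho - \rho'), \tag{139}$$

the symmetry condition

$$u_n(t|\rho; \rho') = u_n(t|\rho'; \rho) \tag{140}$$

and the boundary conditions

$$\lim_{\rho \to \infty} \sqrt{\rho\rho'}\, u_n(t|\rho; \rho') = \lim_{\rho \to \infty} \partial_\rho\left[\sqrt{\rho\rho'}\, u_n(t|\rho; \rho')\right] = 0 \tag{141}$$

and

$$\left[\sqrt{\rho\rho'}\, u_n(t|\rho; \rho')\right]\Big|_{\rho=0^+} = 0, \tag{142}$$

or

$$(\partial_\rho - s)\left[\sqrt{\rho\rho'}\, u_n(t|\rho; \rho')\right]\Big|_{\rho=0^+} = 0. \tag{143}$$

Let us consider the operator

$$D_n = -\partial_\rho^2 - \frac{1}{\rho}\,\partial_\rho + \frac{1}{\rho^2}\left(n + \frac{1}{2}\right)^2. \tag{144}$$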

PROPOSITION 16. The operators $D_n$, $(n = 0, 1, 2, \ldots)$, have the “generalized eigenfunctions” $J_\nu(\mu\rho)$:

$$D_n J_\nu(\mu\rho) = \mu^2 J_\nu(\mu\rho), \tag{145}$$

where $\mu$ is a positive real parameter, $J_\nu(z)$ are Bessel functions of the first kind of order $\nu$, and $\nu$ can take one of two values, either $\nu = (n + (1/2))$ or $\nu = -(n + (1/2))$.


Let us look at the behavior of the generalized eigenfunctions at $\rho \to 0$. In the case $n \geq 1$ the Bessel functions $J_{-(n+(1/2))}(\mu\rho)$ behave like $\sim \rho^{-(n+(1/2))}$ at $\rho \to 0$, which is too singular and violates the integrability condition near the boundary. This means that for $n \geq 1$ we have to choose $\nu = (n + (1/2))$.

Note that these are not “true” eigenfunctions, since they are nonnormalizable. Instead, there holds

PROPOSITION 17. The generalized eigenfunctions, $J_{n+(1/2)}(\mu\rho)$, $(n = 0, 1, 2, \ldots)$, satisfy the “generalized orthogonality” condition

$$\int_0^{\infty} d\mu\, \mu\, J_{n+(1/2)}(\mu\rho)\, J_{n+(1/2)}(\mu\rho') = \frac{1}{\sqrt{\rho\rho'}}\, \delta(\rho - \rho'). \tag{146}$$

On the contrary, in the case $n = 0$ both choices, $\nu = +1/2$ or $\nu = -1/2$, are possible, which makes the analysis of the problem more complicated. Therefore, we will treat the cases $n \geq 1$ and $n = 0$ separately.

8.2. REGULAR GENERALIZED EIGENFUNCTIONS

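We consider first the case $n \geq 1$. We will solve this problem by employing the Hankel transform which is well defined in the class of functions satisfying the conditions imposed above. We define

$$v_n(t|\mu, \rho') = \int_0^{\infty} d\rho\, \rho\, J_{n+(1/2)}(\mu\rho)\, u_n(t|\rho, \rho'). \tag{147}$$

Then

$$u_n(t|\rho, \rho') = \int_0^{\infty} d\mu\, \mu\, J_{n+(1/2)}(\mu\rho)\, v_n(t|\mu, \rho'). \tag{148}$$

Next, by integrating by parts and using Equation (145), we compute the Hankel transform

$$\begin{aligned}
\int_0^{\infty} d\rho\, \rho\, J_{n+(1/2)}(\mu\rho)\, D_n u_n(t|\rho, \rho')
&= \mu^2 \int_0^{\infty} d\rho\, \rho\, J_{n+(1/2)}(\mu\rho)\, u_n(t|\rho, \rho') \\
&\quad + \rho\left\{\left[\partial_\rho J_{n+(1/2)}(\mu\rho)\right] u_n(t|\rho, \rho') - J_{n+(1/2)}(\mu\rho)\, \partial_\rho u_n(t|\rho, \rho')\right\}\Big|_0^{\infty}.
\end{aligned} \tag{149}$$

Finally, by taking into account the boundary conditions (141) and (122), (123) and the asymptotic form of the Bessel functions, we obtain

$$\int_0^{\infty} d\rho\, \rho\, J_{n+(1/2)}(\mu\rho)\, D_n u_n(t|\rho, \rho') = \mu^2 v_n(t|\mu, \rho'). \tag{150}$$

Thus, the Hankel transform of the heat Equation (138) is

$$(\partial_t + \mu^2)\, v_n(t|\mu, \rho') = 0. \tag{151}$$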


From (139) we also obtain the initial condition

$$v_n(0^+|\mu, \rho') = J_{n+(1/2)}(\mu\rho'). \tag{152}$$

It immediately follows that

$$v_n(t|\mu, \rho') = e^{-t\mu^2} J_{n+(1/2)}(\mu\rho'), \tag{153}$$

and, therefore,

$$u_n(t|\rho, \rho') = \int_0^{\infty} d\mu\, \mu\, e^{-t\mu^2} J_{n+1/2}(\mu\rho)\, J_{n+1/2}(\mu\rho'). \tag{154}$$

This integral can be computed by using the properties of the Bessel functions. We obtain finally

PROPOSITION 18. The boundary value problem (138–143) for $n = 1, 2, \ldots$ has a unique solution

$$u_n(t|\rho, \rho') = \frac{1}{2t} \exp\left(-\frac{\rho^2 + \rho'^2}{4t}\right) I_{n+1/2}\left(\frac{\rho\rho'}{2t}\right), \tag{155}$$

where $I_{n+1/2}(z)$ is the modified Bessel function of the first kind.

Note that although this solution was obtained without making use of the boundary conditions (142) or (143), it satisfies both of them since it is regular at $\rho \to 0$.

8.3. SINGULAR GENERALIZED EIGENFUNCTION

Now let us consider the case $n = 0$. As we have seen the condition of integrability near the boundary does not fix the solution uniquely, since there are two linearly independent integrable solutions, which correspond to the choices $\nu = -1/2$ and $\nu = +1/2$. The Hankel transform in this case reduces to the standard cosine and sine Fourier transforms. However, we will not use them, but will solve the heat equation directly.

Let us single out the allowed singular factor

$$u_0(t|\rho, \rho') = \frac{1}{\sqrt{\rho\rho'}}\, w(t|\rho, \rho'). \tag{156}$$

Then, the heat equation (138), the initial condition (139), and the boundary conditions (142) and (143) take the form

$$(\partial_t - \partial_\rho^2)\, w(t|\rho, \rho') = 0, \tag{157}$$

$$w(0^+|\rho, \rho') = \delta(\rho - \rho'), \tag{158}$$

$$w(t|\rho, \rho')\big|_{\rho=0} = 0, \tag{159}$$

or

$$(\partial_\rho - s)\, w(t|\rho, \rho')\big|_{\rho=0} = 0. \tag{160}$$

There is also the usual regularity condition at infinity $\rho \to \infty$.

As we see, $w$ is just the standard one-dimensional heat kernel on the half-axis. By using the Laplace transform we easily obtain the solution of this problem

$$w(t|\rho, \rho') = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} d\lambda\, e^{-t\lambda}\, \frac{1}{2\sqrt{-\lambda}} \left\{\exp\left[-\sqrt{-\lambda}\,|\rho - \rho'|\right] + \frac{\sqrt{-\lambda} - s}{\sqrt{-\lambda} + s}\, \exp\left[-\sqrt{-\lambda}\,(\rho + \rho')\right]\right\}, \tag{161}$$

where $c$ is a sufficiently large negative real constant, i.e. $\sqrt{-c} > -s$, and $\sqrt{-\lambda}$ is defined in the complex plane of $\lambda$ with a cut along the real positive half-axis, so that $\mathrm{Re}\,\sqrt{-\lambda} > 0$. Notice that the boundary conditions (159) correspond to the limit $s \to +\infty$. The limit $s \to -\infty$ is not well defined since the constant $c$ depends on $s$ and would have to go to $-\infty$ as well.

Next, let us change the variable $\lambda$ according to

$$\lambda = \mu^2, \qquad \mu = i\sqrt{-\lambda}, \tag{162}$$

where $\mathrm{Im}\,\mu > 0$. In the upper half-plane, $\mathrm{Im}\,\mu > 0$, this change of variables is single-valued and well defined. Under this change the complex $\lambda$-plane is mapped onto the upper half $\mu$-plane, and the cut in the complex $\lambda$-plane along the positive real axis from $0$ to $\infty$ is mapped onto the whole real axis in the $\mu$-plane.

The contour of integration in the complex $\mu$-plane is a hyperbola going from $(e^{i3\pi/4})\infty$ through the point $\sqrt{-c}$ to $(e^{i\pi/4})\infty$. It can be deformed to a contour $C$ that is above all poles of the integrand. It comes from $-\infty$ along the real axis, encircles possible poles on the imaginary axis in the clockwise direction, and goes to $+\infty$ along the real axis.

After such a transformation we obtain

$$w(t|\rho, \rho') = \int_C \frac{d\mu}{2\pi} \left\{\exp\left[-t\mu^2 + i\mu|\rho - \rho'|\right] + \frac{\mu - is}{\mu + is}\, \exp\left[-t\mu^2 + i\mu(\rho + \rho')\right]\right\}. \tag{163}$$

This function is an analytic function of $s$ since the contour $C$ is above the pole at $-is$. Therefore, we can compute it, for example, for $s > 0$, and then make an analytical continuation on the whole complex $s$-plane. So, let $s > 0$. Then the pole $-is$ is in the lower half-plane. Therefore, the contour $C$ can be deformed to just the real axis, i.e. $-\infty < \mu < \infty$. Next, we use the following trick

$$\frac{\mu - is}{\mu + is} = 1 - 2is\, \frac{1}{\mu + is} = 1 - 2s \int_0^{\infty} dp\, e^{ip(\mu + is)}. \tag{164}$$


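This integral converges since $s > 0$. Substituting this equation in (163) and evaluating the Gaussian integral over $\mu$, we obtain

$$w(t|\rho, \rho') = (4\pi t)^{-1/2} \left\{\exp\left[-\frac{(\rho - \rho')^2}{4t}\right] + \exp\left[-\frac{(\rho + \rho')^2}{4t}\right] - 2s \int_0^{\infty} dp\, \exp\left[-\frac{(\rho + \rho' + p)^2}{4t} - ps\right]\right\}, \tag{165}$$

which can be expressed in terms of the complementary error function

$$w(t|\rho, \rho') = (4\pi t)^{-1/2} \left\{\exp\left[-\frac{(\rho - \rho')^2}{4t}\right] + \exp\left[-\frac{(\rho + \rho')^2}{4t}\right] - 2\sqrt{\pi}\, s\sqrt{t}\, \exp\left[ts^2 + (\rho + \rho')s\right] \mathrm{erfc}\left[\frac{\rho + \rho'}{2\sqrt{t}} + s\sqrt{t}\right]\right\}. \tag{166}$$

Here $\mathrm{erfc}(z)$ is defined by

$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} du\, e^{-u^2}. \tag{167}$$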
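The passage from (165) to (166) amounts to completing the square in the exponent, and it can be confirmed numerically together with the boundary condition (160). The sketch below is illustrative only: the function names, the Simpson quadrature with cut-off `upper`, the finite-difference step and the sample parameters are assumptions.

```python
import math

def w_closed(t, rho, rhop, s):
    """Closed form (166) for the half-line kernel with parameter s."""
    a = rho + rhop
    rt = math.sqrt(t)
    images = math.exp(-(rho - rhop) ** 2 / (4 * t)) + math.exp(-a * a / (4 * t))
    robin = (2 * math.sqrt(math.pi) * s * rt * math.exp(t * s * s + a * s)
             * math.erfc(a / (2 * rt) + s * rt))
    return (images - robin) / math.sqrt(4 * math.pi * t)

def w_quadrature(t, rho, rhop, s, upper=40.0, n=40000):
    """Form (165): the p-integral is evaluated with Simpson's rule (assumes s > 0)."""
    a = rho + rhop
    f = lambda p: math.exp(-(a + p) ** 2 / (4 * t) - p * s)
    h = upper / n
    acc = f(0.0) + f(upper)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * f(i * h)
    images = math.exp(-(rho - rhop) ** 2 / (4 * t)) + math.exp(-a * a / (4 * t))
    return (images - 2 * s * acc * h / 3) / math.sqrt(4 * math.pi * t)

def robin_residual(t, rhop, s, h=1e-4):
    """Central-difference estimate of (d/d rho - s) w at rho = 0, cf. (160)."""
    dw = (w_closed(t, h, rhop, s) - w_closed(t, -h, rhop, s)) / (2 * h)
    return dw - s * w_closed(t, 0.0, rhop, s)

t, rho, rhop, s = 0.3, 0.4, 0.7, 0.8              # illustrative values
print(w_closed(t, rho, rhop, s) - w_quadrature(t, rho, rhop, s))
print(robin_residual(t, rhop, s))
```

Both printed numbers should be close to zero. The residual can also be evaluated for negative $s$, where (166) is used through analytic continuation.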

The case $s < 0$ can be analyzed either directly or by the analytical continuation. The direct computation is different since now the pole $-is$ is in the upper half-plane and one has to take into account the residue at this pole. However, the integral along the real axis is also different, so that the sum is the same. In other words, the result for $s < 0$ has the same analytical form (166).

Finally, we obtain

PROPOSITION 19. The heat kernel component $u_0$ is

$$u_0(t|\rho, \rho') = (4\pi t)^{-1/2}\, \frac{1}{\sqrt{\rho\rho'}} \left\{\exp\left[-\frac{(\rho - \rho')^2}{4t}\right] + \exp\left[-\frac{(\rho + \rho')^2}{4t}\right] - 2\sqrt{\pi}\, s\sqrt{t}\, \exp\left[ts^2 + (\rho + \rho')s\right] \mathrm{erfc}\left(\frac{\rho + \rho'}{2\sqrt{t}} + s\sqrt{t}\right)\right\}. \tag{168}$$

In the particular case $s = 0$ we get

$$u_0(t|\rho, \rho') = (4\pi t)^{-1/2}\, \frac{1}{\sqrt{\rho\rho'}} \left\{\exp\left[-\frac{(\rho - \rho')^2}{4t}\right] + \exp\left[-\frac{(\rho + \rho')^2}{4t}\right]\right\}. \tag{169}$$

Note that this eigenfunction is singular at $\rho \to 0$ for any finite $s$.


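The case $s \to +\infty$ corresponds to the regular boundary conditions (142). In this case the solution reads

$$\begin{aligned}
u_0(t|\rho, \rho') &= (4\pi t)^{-1/2}\, \frac{1}{\sqrt{\rho\rho'}} \left\{\exp\left[-\frac{(\rho - \rho')^2}{4t}\right] - \exp\left[-\frac{(\rho + \rho')^2}{4t}\right]\right\} \\
&= \frac{1}{2t} \exp\left(-\frac{\rho^2 + \rho'^2}{4t}\right) I_{1/2}\left(\frac{\rho\rho'}{2t}\right).
\end{aligned} \tag{170}$$

This solution coincides with the solution (155) for $n = 0$ obtained by the Hankel transform and is obviously regular.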
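The equality of the two forms in (170) follows from the elementary closed form $I_{1/2}(z) = \sqrt{2/(\pi z)}\,\sinh z$ of the half-integer modified Bessel function. A minimal numerical sketch (function names and sample values are illustrative assumptions):

```python
import math

def I_half(z):
    """Modified Bessel function of order 1/2 via its closed form sqrt(2/(pi z)) sinh z."""
    return math.sqrt(2 / (math.pi * z)) * math.sinh(z)

def u0_images(t, rho, rhop):
    """First form of (170): difference of two Gaussians divided by sqrt(rho rho')."""
    pref = (4 * math.pi * t) ** -0.5 / math.sqrt(rho * rhop)
    return pref * (math.exp(-(rho - rhop) ** 2 / (4 * t))
                   - math.exp(-(rho + rhop) ** 2 / (4 * t)))

def u0_bessel(t, rho, rhop):
    """Second form of (170), i.e. (155) evaluated at n = 0."""
    return math.exp(-(rho ** 2 + rhop ** 2) / (4 * t)) * I_half(rho * rhop / (2 * t)) / (2 * t)

t, rho, rhop = 0.3, 0.4, 0.9                      # illustrative values
print(u0_images(t, rho, rhop), u0_bessel(t, rho, rhop))
```

The two printed values should coincide up to rounding for any positive $t$, $\rho$, $\rho'$.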

8.4. OFF-DIAGONAL ZAREMBA HEAT KERNEL

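Combining our results and using the explicit form of the eigenfunctions $\varphi_n$, we obtain

LEMMA 1. The boundary value problem (125–132) has the unique solution

$$\begin{aligned}
I(t|\rho, \theta; \rho', \theta') &= (4\pi t)^{-1} L(t|\rho, \rho') \left\{\cos\left(\frac{\theta - \theta'}{2}\right) + \cos\left(\frac{\theta + \theta' + \pi}{2}\right)\right\} \\
&\quad + (4\pi t)^{-1} \exp\left(-\frac{\rho^2 + \rho'^2}{4t}\right) \left\{M\left(\frac{\rho\rho'}{2t},\, \theta - \theta'\right) + M\left(\frac{\rho\rho'}{2t},\, \theta + \theta' + \pi\right)\right\},
\end{aligned} \tag{171}$$

where

$$L(t|\rho, \rho') = \frac{4}{\sqrt{\pi}} \left(\frac{t}{\rho\rho'}\right)^{1/2} \left\{\exp\left[-\frac{(\rho + \rho')^2}{4t}\right] - \sqrt{\pi}\, s\sqrt{t}\, \exp\left[ts^2 + (\rho + \rho')s\right] \mathrm{erfc}\left(\frac{\rho + \rho'}{2\sqrt{t}} + s\sqrt{t}\right)\right\}, \tag{172}$$

and

$$M(z, \gamma) = 2 \sum_{n=0}^{\infty} I_{n+1/2}(z)\, \cos\left[\left(n + \tfrac{1}{2}\right)\gamma\right]. \tag{173}$$

Notice that for the “regular” boundary conditions (131), which correspond to the limit $s \to +\infty$, the function $L(t|\rho, \rho')$ vanishes.

This series can be evaluated by using the following integral representation of the Bessel function

$$I_{n+1/2}(z) = \frac{1}{\sqrt{\pi}\, n!} \left(\frac{z}{2}\right)^{n+1/2} \int_{-1}^{1} dp\, e^{-pz} (1 - p^2)^n. \tag{174}$$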


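Substituting this integral in the series and summing over $n$ we obtain

$$M(z, \gamma) = \sqrt{\frac{z}{2\pi}} \int_{-1}^{1} dp\, e^{-pz} \left\{\exp\left[\tfrac{1}{2}(1 - p^2) z e^{i\gamma} + \tfrac{1}{2} i\gamma\right] + \exp\left[\tfrac{1}{2}(1 - p^2) z e^{-i\gamma} - \tfrac{1}{2} i\gamma\right]\right\}. \tag{175}$$

The remaining integral can be expressed in terms of the error function, so that finally we get

$$M(z, \gamma) = e^{z\cos\gamma}\, \mathrm{erf}\left[\sqrt{2z}\, \cos\left(\frac{\gamma}{2}\right)\right], \tag{176}$$

where the error function is defined by

$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^{z} dp\, e^{-p^2}. \tag{177}$$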
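The resummation of the series (173) into the closed form (176) can be spot-checked numerically. The sketch below evaluates $I_{n+1/2}(z)$ from its standard power series $\sum_k (z/2)^{2k+\nu} / (k!\,\Gamma(k+\nu+1))$; the truncation orders and the sample values of $z$ and $\gamma$ are assumptions chosen for illustration and are adequate only for moderate $z$.

```python
import math

def bessel_i(nu, z, terms=60):
    """Modified Bessel function I_nu(z) from its power series (moderate z > 0)."""
    return sum((z / 2) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1))
               for k in range(terms))

def M_series(z, gamma, nmax=60):
    """Truncated series (173)."""
    return 2 * sum(bessel_i(n + 0.5, z) * math.cos((n + 0.5) * gamma)
                   for n in range(nmax))

def M_closed(z, gamma):
    """Closed form (176)."""
    return math.exp(z * math.cos(gamma)) * math.erf(math.sqrt(2 * z) * math.cos(gamma / 2))

for z, gamma in [(1.5, 0.0), (1.5, 0.7), (0.4, 2.0)]:
    print(z, gamma, M_series(z, gamma), M_closed(z, gamma))
```

For each pair the two columns should agree to high accuracy. At $\gamma = \pi$ both sides vanish, since every cosine in (173) is zero there.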

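By adding the $\Sigma_0$ factor we obtain the final result:

THEOREM 8. The leading order off-diagonal Zaremba heat kernel has the form

$$U^{\mathrm{bnd},(0)}_0(t|\rho, \theta, x; \rho', \theta', x') = \mathcal{L}(t|\rho, \theta, x; \rho', \theta', x') + \mathcal{L}(t|\rho, \theta, x; \rho', -\theta' - \pi, x'), \tag{178}$$

where

$$\begin{aligned}
\mathcal{L}(t|\rho, \theta, x; \rho', \theta', x') &= (4\pi t)^{-m/2} \exp\left(-\frac{|x - x'|^2}{4t}\right) L(t|\rho, \rho')\, \cos\left(\frac{\theta - \theta'}{2}\right) \\
&\quad + (4\pi t)^{-m/2} \exp\left\{-\frac{1}{4t}\left[|x - x'|^2 + \rho^2 + \rho'^2 - 2\rho\rho'\cos(\theta - \theta')\right]\right\} \\
&\qquad \times \mathrm{erf}\left(\sqrt{\frac{\rho\rho'}{t}}\, \cos\left(\frac{\theta - \theta'}{2}\right)\right).
\end{aligned} \tag{179}$$

An important corollary from this formula are the symmetries of the heat kernel. First of all, we have the usual ‘self-adjointness’ symmetry

$$\theta \to \theta', \qquad \rho \to \rho'. \tag{180}$$

Second, we have the ‘periodicity’ symmetries

$$\theta \to \theta + 4\pi n, \qquad \theta' \to \theta' + 4\pi m, \qquad n, m \in \mathbb{Z}. \tag{181}$$

Finally, there is an additional ‘mirror’ symmetry

$$\theta \to \theta, \qquad \theta' \to -\theta' - \pi, \tag{182}$$

$$\theta \to -\theta - \pi, \qquad \theta' \to \theta'. \tag{183}$$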


Note the essential difference of the symmetries of the Zaremba heat kernel versus those of the Dirichlet and Neumann parametrices. The mixed heat kernel is a periodic function of the angles (expected), but not with the period $2\pi$ but with the period $4\pi$ (not expected). That is why there are two different mirror images, $(\rho, -\theta - \pi, x)$ and $(\rho, -\theta + \pi, x)$, of a point with the coordinates $(\rho, \theta, x)$. In other words the double reflection of a point does not bring it back: the double image is not identical with the original point. Denoting by $T$ the transformation $\theta \to -\theta - \pi$ we have

$$T^4 = \mathrm{Id}, \quad \text{but} \quad T^2 \neq \mathrm{Id}. \tag{184}$$

The operator $T$ has four eigenvalues $1$, $-1$, $i$ and $-i$. Whereas the first two, $1$ and $-1$, are the standard ones, the latter two, $i$ and $-i$, correspond to some new images. This might have some interesting applications.

8.5. ZAREMBA HEAT TRACE ASYMPTOTICS

Now it is easily found:

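PROPOSITION 20. The diagonal of the leading Zaremba heat kernel is

$$\begin{aligned}
U^{\mathrm{bnd},(0)}_{\mathrm{diag}}(t|\rho, \theta, x) = (4\pi t)^{-m/2} \Bigg\{ &1 - \mathrm{erfc}\left(\frac{\rho}{\sqrt{t}}\right) - \exp\left(-\frac{\rho^2\cos^2\theta}{t}\right) \mathrm{erf}\left(\frac{\rho\sin\theta}{\sqrt{t}}\right) \\
&+ (1 - \sin\theta)\, \frac{4}{\sqrt{\pi}}\, \frac{\sqrt{t}}{\rho} \left[\exp\left(-\frac{\rho^2}{t}\right) - \sqrt{\pi}\, s\sqrt{t}\, \exp\left(ts^2 + 2\rho s\right) \mathrm{erfc}\left(\frac{\rho}{\sqrt{t}} + s\sqrt{t}\right)\right]\Bigg\}.
\end{aligned} \tag{185}$$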

Next, we compute the integral of the diagonal of the heat kernel over $M^{\mathrm{bnd}}_0$

$$\int_{M^{\mathrm{bnd}}_0} \mathrm{tr}_V\, U^{\mathrm{bnd},(0)}_{\mathrm{diag},0}(t) = \int_0^{\varepsilon_3} d\rho\, \rho \int_{-\pi/2}^{\pi/2} d\theta\, \mathrm{tr}_V\, U^{\mathrm{bnd},(0)}_{\mathrm{diag},0}(t) \tag{186}$$

for some finite $\varepsilon_3 > \sqrt{\varepsilon_1^2 + \varepsilon_2^2} > 0$.

First of all, obviously the integrals over $\theta$ of the odd functions in $\theta$ vanish identically. So, we only need to consider the even part. Second, since in the limit $\varepsilon_3 \to 0$ the volume of $M^{\mathrm{bnd}}_0$ vanishes, the regular part of the heat kernel diagonal does not contribute to the trace either. It is only the singular part of the heat kernel diagonal, which behaves like a distribution near $\Sigma_0$, that contributes to the integral in the limit $\varepsilon_3 \to 0$.


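The integral over $\rho$ can be computed exactly. It reads

$$\int_{M^{\mathrm{bnd}}_0} \mathrm{tr}_V\, U^{\mathrm{bnd},(0)}_{\mathrm{diag},0}(t) = \int_{\Sigma_0} (4\pi t)^{-m/2} \dim V \left\{\frac{\pi\varepsilon_3^2}{2} + t\left[-\frac{\pi}{4} + 2\pi\, O(\sqrt{t}\, s)\right] + X(t)\right\}, \tag{187}$$

where

$$O(z) = e^{z^2}\, \mathrm{erfc}(z), \tag{188}$$

and

$$X(t) = 2\sqrt{\pi t}\, \varepsilon_3 \exp\left(-\frac{\varepsilon_3^2}{t}\right) + \left(\frac{\pi t}{4} - \frac{\pi\varepsilon_3^2}{2}\right) \mathrm{erfc}\left(\frac{\varepsilon_3}{\sqrt{t}}\right) - 2\pi t\, \exp\left(ts^2 + 2s\varepsilon_3\right) \mathrm{erfc}\left(\frac{\varepsilon_3}{\sqrt{t}} + \sqrt{t}\, s\right). \tag{189}$$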

Notice that $\pi\varepsilon_3^2/2$ is nothing but the area of the semi-circle of radius $\varepsilon_3$, so that $\mathrm{vol}(\Sigma_0)\,\pi\varepsilon_3^2/2 = \mathrm{vol}(M^{\mathrm{bnd}}_0)$. In the limit $\varepsilon_3 \to 0$ this term does not contribute to the asymptotics.

By using the asymptotic behavior of the error function as $z \to \infty$

$$\mathrm{erfc}(z) \sim \frac{1}{\sqrt{\pi}\, z}\, e^{-z^2} \tag{190}$$

we find that the function $X(t)$ is exponentially small, i.e. it is suppressed by the factor $\sim \exp(-\varepsilon_3^2/t)$, as $t \to 0$, and, therefore, does not contribute to the asymptotic expansion of the heat kernel in powers of $t$ (54) either.

The behavior of the function $O(\sqrt{t}\, s)$ depends on the parameter $s$. For a finite $s$ in the limit $t \to 0$ we have

$$O(\sqrt{t}\, s) = 1 + \mathcal{O}(t^{1/2}). \tag{191}$$

On the other hand, by using (190) we see that for a finite $t$ the function $O(\sqrt{t}\, s)$ vanishes in the limit $s \to \infty$:

$$O(\sqrt{t}\, s)\big|_{s \to \infty} = 0. \tag{192}$$

It immediately follows

THEOREM 9. The singular heat kernel coefficients $b^{(2)}_k$ depend on the type of the additional boundary conditions, i.e. (131) vs. (132), at the singular set $\Sigma_0$. The leading singular heat kernel coefficient $b^{(2)}_2$ has the form:

$$b^{(2)}_2 = (4\pi)^{-(m-2)/2}\, \frac{7}{16} \dim V \tag{193}$$

for the singular boundary conditions (finite $s$) (132), and is equal to

$$b^{(2)}_2 = -(4\pi)^{-(m-2)/2}\, \frac{1}{16} \dim V \tag{194}$$

for the regular boundary conditions (131) ($s \to +\infty$).


Notice that the leading singular coefficient does not depend on s explicitly.
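The two values in Theorem 9 follow from the $t$-linear bracket in (187) by simple arithmetic: $(4\pi t)^{-m/2}\, t\, \pi\,(-1/4 + 2\,O) = (4\pi)^{-(m-2)/2}\, t^{(2-m)/2}\, (-1/4 + 2\,O)/4$, with $O \to 1$ for finite $s$ and $O \to 0$ for $s \to +\infty$. The sketch below reproduces this with exact fractions and also evaluates the function (188) near its two limits; the function names and sample arguments are illustrative assumptions.

```python
from fractions import Fraction
import math

def O_fn(z):
    """The function (188): O(z) = exp(z^2) erfc(z) (moderate z only, to avoid overflow)."""
    return math.exp(z * z) * math.erfc(z)

def b2_factor(O_limit):
    """Numerical factor multiplying (4 pi)^{-(m-2)/2} dim V, read off from (187)."""
    return (Fraction(-1, 4) + 2 * O_limit) / 4

singular = b2_factor(Fraction(1))   # finite s: O(sqrt(t) s) -> 1 as t -> 0
regular = b2_factor(Fraction(0))    # s -> +infinity: O -> 0
print(singular, regular)
print(O_fn(1e-6), O_fn(20.0))       # close to 1 and close to 0, cf. (191), (192)
```

The exact fractions obtained this way are the factors appearing in (193) and (194).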

Acknowledgements

I would like to thank Jochen Brüning, Stuart Dowker, Giampiero Esposito, Stephen Fulling, Peter Gilkey, Gerd Grubb, and Werner Müller for stimulating and fruitful discussions. I am also very grateful to Robert Seeley for clarifying discussions of the boundary conditions and sharing the preliminary results. The support by the NSF Block Travel Grant DMS-9988119, by the MSRI, and by the Istituto Italiano per gli Studi Filosofici and the Azienda Autonoma Soggiorno e Turismo, Napoli, is gratefully acknowledged.

References

1. Avramidi, I. G.: A covariant technique for the calculation of the one-loop effective action, Nuclear Phys. B 355 (1991), 712–754. Erratum: Nuclear Phys. B 509 (1998), 557–558.

2. Avramidi, I. G.: A method for calculating the heat kernel for manifolds with boundary, Yadernaya Fiz. 56 (1993), 245–252 [Russian]; Phys. Atomic Nuclei 56 (1993), 138–142 [English].

3. Avramidi, I. G.: Green functions of higher-order differential operators, J. Math. Phys. 39 (1998), 2889–2909.

4. Avramidi, I. G.: Covariant techniques for computation of the heat kernel, Rev. Math. Phys. 11 (1999), 947–980.

5. Avramidi, I. G.: Heat kernel asymptotics of non-smooth boundary value problem, New Mexico Tech., 1999; In: M. van den Berg and V. Liskevich (eds), Workshop on Spectral Geometry, Abstracts of Int. Conf., University of Bristol, Bristol, U.K., July 10–15, 2000.

6. Avramidi, I. G. and Esposito, G.: Lack of strong ellipticity in Euclidean quantum gravity, Classical Quantum Gravity 15 (1998), 1141–1152.

7. Avramidi, I. G. and Esposito, G.: Gauge theories on manifolds with boundary, Comm. Math. Phys. 200 (1999), 495–543.

8. Avramidi, I. G. and Esposito, G.: Heat kernel asymptotics of the Gilkey-Smith boundary value problem, In: V. Alexiades and G. Siopsis (eds), Trends in Mathematical Physics, Stud. Adv. Math. 13, Amer. Math. Soc. and International Press, 1999, pp. 15–34.

9. Avramidi, I. G. and Schimming, R.: Algorithms for the calculation of the heat kernel coefficients, In: M. Bordag (ed.), Quantum Field Theory under the Influence of External Conditions, Teubner-Texte Phys. 30, Teubner, Stuttgart, 1996, pp. 150–162.

10. Berline, N., Getzler, E. and Vergne, M.: Heat Kernels and Dirac Operators, Springer-Verlag, Berlin, 1992.

11. Booss-Bavnbek, B. and Wojciechowski, K. P.: Elliptic Boundary Problems for Dirac Operators, Birkhäuser, Boston, 1993.

12. Branson, T. and Gilkey, P. B.: The asymptotics of the Laplacian on a manifold with boundary, Comm. Partial Differential Equations 15 (1990), 245–272.

13. Branson, T. P., Gilkey, P. B., Kirsten, K. and Vassilevich, D. V.: Heat kernel asymptotics with mixed boundary conditions, Nuclear Phys. B 563 (1999), 603–626.

14. Brüning, J. and Seeley, R.: Regular singular asymptotics, Adv. in Math. 58 (1985), 133–148.

15. Brüning, J. and Seeley, R. T.: The expansion of the resolvent near a singular stratum of conical type, J. Funct. Anal. 95 (1991), 255–290.

16. Callias, C.: The heat equation with singular coefficients I, Comm. Math. Phys. 88 (1983), 357–385.

17. Cheeger, J.: On the spectral geometry of spaces with cone-like singularities, Proc. Nat. Acad. Sci. U.S.A. 76 (1979), 2103–2106.

18. Cheeger, J.: Spectral geometry of singular Riemannian spaces, J. Differential Geom. 18 (1983), 575–657.

19. De Witt, B. S.: The Spacetime Approach to Quantum Field Theory, In: B. S. De Witt and R. Stora (eds), Relativity, Groups and Topology II, North-Holland, Amsterdam, 1984, pp. 383–738.

20. Dowker, J. S.: The N ∪ D problem, University of Manchester, 2000, hep-th/0007127.

21. Dowker, J. S., Gilkey, P. B. and Kirsten, K.: On properties of the asymptotic expansion of the heat trace for the N/D problem, Internat. J. Math. 12 (2001), 505–517.

22. Dowker, J. S. and Kirsten, K.: Heat-kernel coefficients for oblique boundary conditions, Classical Quantum Gravity 14 (1997), L169–L175.

23. Dowker, J. S. and Kirsten, K.: The $a_{3/2}$ heat-kernel coefficient for oblique boundary conditions, Classical Quantum Gravity 16 (1999), 1917–1936.

24. Elizalde, E. and Vassilevich, D. V.: Heat Kernel Coefficients for Chern–Simons Boundary Conditions in QED, Classical Quantum Gravity 16 (1999), 813–823.

25. Fabrikant, V. I.: Mixed Boundary Value Problems of Potential Theory and Their Applications in Engineering, Kluwer, Dordrecht, 1991.

26. Fedosov, B. V.: Asymptotic formulas for the eigenvalues of the Laplace operator in the case of a polyhedron, Soviet Math. Dokl. 5 (1964), 988–990.

27. Gil, J. B.: Full asymptotic expansion of the heat trace for non-self-adjoint elliptic cone operators, Temple University (2001), math.AP/0004161.

28. Gilkey, P. B. and Smith, L.: The eta invariant for a class of elliptic boundary value problems, Comm. Pure Appl. Math. 36 (1983), 85–132.

29. Gilkey, P. B. and Smith, L.: The twisted index theorem for manifolds with boundary, J. Differential Geom. 18 (1983), 393–344.

30. Gilkey, P. B.: Invariance Theory, the Heat Equation and the Atiyah-Singer Index Theorem, Chemical Rubber Company, Boca Raton, 1995.

31. Grubb, G.: Properties of normal boundary value problems for elliptic even-order systems, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 1 (1974), 1–61.

32. Grubb, G.: Functional Calculus of Pseudodifferential Boundary Problems, Progr. Math. 65, Birkhäuser, Boston, 1996.

33. Karol’, A. L.: Asymptotics of the parabolic Green function for an elliptic operator on a manifold with conical points, Math. Notes 63(1–2) (1998), 25–32.

34. Kirsten, K.: The $a_5$ heat kernel coefficient on a manifold with boundary, Classical Quantum Gravity 15 (1998), L5–L12.

35. Lesch, M.: Operators of Fuchs Type, Conical Singularities, and Asymptotic Methods, Teubner-Texte Math. 136, Teubner, Stuttgart–Leipzig, 1997.

36. McAvity, D. M. and Osborn, H.: Asymptotic expansion of the heat kernel for generalized boundary conditions, Classical Quantum Gravity 8 (1991), 1445–1454.

37. Mooers, E. A.: Heat kernel asymptotics on manifolds with conic singularities, J. Anal. Math. 78 (1999), 1–36.

38. Seeley, R. T.: Topics in pseudo-differential operators, In: CIME Conference on Pseudo-Differential Operators 1968, Edizioni Cremonese, Roma (1969), pp. 169–305.

39. Seeley, R. T.: The resolvent of an elliptic boundary value problem, Amer. J. Math. 91 (1969), 963–983.

40. Seeley, R. T.: Trace Expansions for the Zaremba Problem, Comm. Partial Differential Equations 27 (2002), 2403–2421.

41. Simanca, S. R.: Mixed elliptic boundary value problems, Comm. Partial Differential Equations 12 (1987), 123–200.

42. Sneddon, I. N.: Mixed Boundary Value Problems in Potential Theory, Wiley, New York, 1966.

43. van de Ven, A. E. M.: Index free heat kernel coefficients, Classical Quantum Gravity 15 (1998), 2311–2344.

Mathematical Physics, Analysis and Geometry 7: 47–96, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Tau-functions on Hurwitz Spaces

A. KOKOTOV and D. KOROTKIN
Department of Mathematics and Statistics, Concordia University, 7141 Sherbrook West, Montreal H4B 1R6, Quebec, Canada. e-mail: {alexey, korotkin}@mathstat.concordia.ca

(Received: 22 February 2002; in final form: 18 February 2003)

Abstract. We construct a flat holomorphic line bundle over a connected component of the Hurwitz space of branched coverings of the Riemann sphere $\mathbb{P}^1$. A flat holomorphic connection defining the bundle is described in terms of the invariant Wirtinger projective connection on the branched covering corresponding to a given meromorphic function on a Riemann surface of genus $g$. In genera 0 and 1 we construct a nowhere vanishing holomorphic horizontal section of this bundle (the ‘Wirtinger tau-function’). In higher genus we compute the modulus square of the Wirtinger tau-function. In particular one gets formulas for the isomonodromic tau-functions of semisimple Frobenius manifolds connected with the Hurwitz spaces $H_{g,N}(1,\dots,1)$.

Mathematics Subject Classification (2000): 32G99.

Key words: the Wirtinger projective connection, Hurwitz spaces, the Bergmann kernel.

1. Introduction

Holomorphic line bundles over moduli spaces of Riemann surfaces were studied by many researchers during the last 20 years (see, e.g., Fay’s survey [3]). In the present paper we consider (flat) holomorphic line bundles over Hurwitz spaces (the spaces of meromorphic functions on Riemann surfaces or, what is the same, the spaces of branched coverings of the Riemann sphere $\mathbb{P}^1$) and over coverings of Hurwitz spaces. The covariant constant sections (we call them tau-functions) of these bundles are the main object of our consideration.

Our work was inspired by a coincidence of the isomonodromic tau-function of a class of $2 \times 2$ Riemann–Hilbert problems solved in [7] with the heuristic expression which appeared in the context of string theory and was interpreted as the determinant of the Cauchy–Riemann operator acting in a spinor line bundle over a hyperelliptic Riemann surface (see the survey [8]).

To illustrate our results consider, for example, the Hurwitz space $H_{g,N}(1,\dots,1)$ consisting of $N$-fold coverings of genus $g$ with only simple branch points, none of which coincides with infinity. (In the main text we work with coverings having branch points of arbitrary order.)

Let $\mathcal{L}$ be a covering from $H_{g,N}(1,\dots,1)$. We use the branch points $\lambda_1,\dots,\lambda_M$ (i.e. the projections of the ramification points $P_1,\dots,P_M$ of the covering $\mathcal{L}$) as local coordinates on the space $H_{g,N}(1,\dots,1)$; according to the Riemann–Hurwitz formula, $M = 2g + 2N - 2$.

Let $\lambda$ be the coordinate of the projection of a point $P \in \mathcal{L}$ to $\mathbb{P}^1$. In a neighborhood of a ramification point $P_m$ we introduce the local coordinate $x_m = \sqrt{\lambda - \lambda_m}$.

Besides the Hurwitz space $H_{g,N}(1,\dots,1)$, we shall use the ‘punctured’ Hurwitz space $H'_{g,N}(1,\dots,1)$, which is obtained from $H_{g,N}(1,\dots,1)$ by excluding all branched coverings which have at least one vanishing theta-constant.

In the trivial bundle $H'_{g,N}(1,\dots,1) \times \mathbb{C}$ we introduce the connection

$$d_W = d - \sum_{m=1}^{M} A_m\, d\lambda_m, \tag{1.1}$$

where $d$ is the external differentiation operator including both holomorphic and antiholomorphic parts; the connection coefficients are expressed in terms of the invariant Wirtinger projective connection $S_W$ on the covering $\mathcal{L}$ as follows:

$$A_m = -\frac{1}{12}\, S_W(x_m)\Big|_{x_m=0}, \qquad m = 1,\dots,M. \tag{1.2}$$

The connection coefficients $A_m$ are holomorphic with respect to $\lambda_m$ and well-defined for all coverings $\mathcal{L}$ from the ‘punctured’ Hurwitz space $H'_{g,N}(1,\dots,1)$.

Connection (1.1) turns out to be flat; therefore, it determines a character of the fundamental group of $H'_{g,N}(1,\dots,1)$; this character defines a flat holomorphic line bundle $T_W$ over $H'_{g,N}(1,\dots,1)$. We call this bundle the ‘Wirtinger line bundle’ over Hurwitz space; its horizontal holomorphic section we call the Wirtinger tau-function of the covering $\mathcal{L}$.

In a trivial bundle $U(\mathcal{L}_0) \times \mathbb{C}$, where $U(\mathcal{L}_0)$ is a small neighborhood of a given covering $\mathcal{L}_0$ in $H_{g,N}(1,\dots,1)$, we can also define the flat connection $d_B = d - \sum_{m=1}^{M} B_m\, d\lambda_m$, where the coefficients $B_m$ are built from the Bergmann projective connection $S_B$ in a way similar to (1.2):

$$B_m = -\frac{1}{12}\, S_B(x_m)\Big|_{x_m=0}.$$

The covariant constant section of this line bundle in the case of hyperelliptic coverings ($N = 2$, $g > 1$) turns out to coincide (see [7] for explicit calculation) with the heuristic expression for the determinant of the Cauchy–Riemann operator acting in the trivial line bundle over a hyperelliptic Riemann surface, which was proposed in [8]. This section also appears as a part of the isomonodromic tau-function associated to matrix Riemann–Hilbert problems with quasi-permutation monodromies [9]. Its $(-1/2)$-power coincides with the isomonodromic tau-function of a Frobenius manifold corresponding to the Hurwitz space $H_{g,N}(1,\dots,1)$ (see [1]). However, since the Bergmann projective connection, in contrast to the Wirtinger projective connection, does depend on the choice of canonical basis of cycles on the covering, the connection $d_B$ cannot be globally continued to the whole Hurwitz space, but only to its appropriate covering. We call the corresponding line bundle over this covering the Bergmann line bundle and its covariant constant section the Bergmann tau-function.

We obtain explicit formulas for the modulus square of the Wirtinger and Bergmann tau-functions in genus greater than 1; in genera 0 and 1 we perform the ‘holomorphic factorization’ and derive explicit formulas for the tau-functions themselves.

In genera 1 and 2 (as well as in genus 0) there are no vanishing theta-constants, i.e. $H_{g,N}(1,\dots,1) = H'_{g,N}(1,\dots,1)$; therefore, the holomorphic bundle $T_W$ is a bundle over the whole Hurwitz space $H_{g,N}(1,\dots,1)$.

To write down an explicit formula for the tau-function over the Hurwitz space $H_{1,N}(1,\dots,1)$, consider a holomorphic (not necessarily normalized) differential $v(P)$ on an elliptic covering $\mathcal{L} \in H_{1,N}(1,\dots,1)$. Introduce the notation $f_m \equiv f_m(0)$, $h_k \equiv h_k(0)$, where $v(P) = f_m(x_m)\, dx_m$ near the branch point $P_m$ and $v(P) = h_k(\zeta)\, d\zeta$ near the infinity of the $k$th sheet; $\zeta = 1/\lambda$, where $\lambda$ is the coordinate of the projection of a point $P \in \mathcal{L}$ to $\mathbb{P}^1$. Then the Wirtinger tau-function on $H_{1,N}(1,\dots,1)$ is given by the formula

$$\tau_W = \frac{\big\{\prod_{k=1}^{N} h_k\big\}^{1/6}}{\big\{\prod_{m=1}^{M} f_m\big\}^{1/12}}. \tag{1.3}$$

The analogous explicit formula can be written for coverings of genus 0.

The results in genera 0, 1 follow from the study of the properly regularized Dirichlet integral $S = \frac{1}{2\pi}\int_{\mathcal{L}} |\phi_\lambda|^2$, where $e^{\phi}|d\lambda|^2$ is the flat metric on $\mathcal{L}$ obtained by projecting down the standard metric $|dz|^2$ on the universal covering $\tilde{\mathcal{L}}$. The derivatives of $S$ with respect to the branch points can be expressed through the values of the Schwarzian connection at the branch points; this reveals a close link of $S$ with the modulus of the tau-function. On the other hand, the integral $S$ admits an explicit calculation via the asymptotics of the flat metric near the branch points and the infinities of the sheets of the covering. Moreover, it admits a ‘holomorphic factorization’, i.e. it can be explicitly represented as the modulus square of some holomorphic function, which allows one to compute the tau-function itself.

The same tools (except the explicit holomorphic factorization) also work in the case of higher genus, when two equivalent approaches are possible.

First, one can exploit the Schottky uniformization and introduce the Dirichlet integral corresponding to the flat metric on $\mathcal{L}$ obtained by projecting the flat metric $|d\omega|^2$ on a fundamental domain of the Schottky group. This approach leads to the expression of the modulus square of the tau-function through the holomorphic function $F$ on the Schottky space, which was introduced in [16] and can be interpreted as the holomorphic determinant of the Cauchy–Riemann operator acting in the trivial line bundle over $\mathcal{L}$. (In the main text we denote this function directly by $\det \bar\partial$.)

The second approach uses the Fuchsian uniformization and the Liouville action corresponding to the metric of constant curvature $-1$ on $\mathcal{L}$. It gives the following expression for the modulus square of the tau-function:

$$|\tau_W|^2 = e^{-S_{\mathrm{Fuchs}}/6}\, \frac{\det \Delta}{\det \mathrm{Im}\, \mathbb{B}} \prod_{\beta\ \mathrm{even}} \big|\Theta[\beta](0 \,|\, \mathbb{B})\big|^{-8/(4^g+2^g)}, \tag{1.4}$$

where $\det \Delta$ is the determinant of the Laplacian on $\mathcal{L}$; $S_{\mathrm{Fuchs}}$ is an appropriately regularized Liouville action which is a real-valued function of the branch points; $\mathbb{B}$ is the matrix of $b$-periods of the branched covering.

The existence of an explicit holomorphic factorization of our expressions for $|\tau_W|^2$ in genera $g = 0, 1$ suggests that explicit formulas for $\tau_W$ similar to (1.3) also exist in higher genera.

In this paper we use the technical tools developed in [17, 18]. We strongly suspect that in our context it should be possible to avoid the extrinsic formalism of the Dirichlet integrals and Liouville action and that, at the least, there should exist a direct way to prove the genus 1 formula (1.3).

The paper is organized as follows. In Section 2, after some preliminaries, we prove the flatness of the connections $d_W$ and $d_B$ and introduce the flat line bundles over Hurwitz spaces and their coverings. In Section 3 we find explicitly the tau-functions for genera 0 and 1. In Section 4, using the Schottky and Fuchsian uniformizations, we give the expressions for the modulus square of tau-functions in genus greater than 1.

2. Tau-Functions of Branched Coverings

2.1. THE HURWITZ SPACES

Let $\mathcal{L}$ be a compact Riemann surface of genus $g$ represented as an $N$-fold branched covering

$$p:\ \mathcal{L} \longrightarrow \mathbb{P}^1 \tag{2.1}$$

of the Riemann sphere $\mathbb{P}^1$. Let the holomorphic map $p$ be ramified at the points $P_1, P_2, \dots, P_M \in \mathcal{L}$ of ramification indices $r_1, r_2, \dots, r_M$ respectively (the ramification index is equal to the number of sheets glued at a given ramification point). Let also $\lambda_m = p(P_m)$, $m = 1, 2, \dots, M$, be the branch points. (Following [4], we reserve the name ‘ramification points’ for the points $P_m$ of the surface $\mathcal{L}$ and the name ‘branch points’ for the points $\lambda_m$ of the base $\mathbb{P}^1$.)

We assume that none of the branch points $\lambda_m$ coincides with infinity and $\lambda_m \neq \lambda_n$ for $m \neq n$.

Recall that two branched coverings $p_1: \mathcal{L}_1 \to \mathbb{P}^1$ and $p_2: \mathcal{L}_2 \to \mathbb{P}^1$ are called equivalent if there exists a biholomorphic map $f: \mathcal{L}_1 \to \mathcal{L}_2$ such that $p_2 f = p_1$. Let $H(N, M, \mathbb{P}^1)$ be the Hurwitz space of the equivalence classes of $N$-fold branched coverings of $\mathbb{P}^1$ with $M$ branch points none of which coincides with infinity. This space can be equipped with a natural topology (see [4]) and is a (generally disconnected) complex manifold. Denote by $U(\mathcal{L})$ the connected component of $H(N, M, \mathbb{P}^1)$ containing the equivalence class of the covering $\mathcal{L}$. According to the Riemann–Hurwitz formula, we have

$$g = \sum_{m=1}^{M} \frac{r_m - 1}{2} - N + 1,$$

where $g$ is the genus of the surface $\mathcal{L}$.

If all the branch points of the covering $\mathcal{L}$ are simple (i.e. all the $r_m$ are equal to 2) then $U(\mathcal{L})$ coincides with the space $H_{g,N}(1,\dots,1)$ of meromorphic functions of degree $N$ on Riemann surfaces of genus $g = M/2 - N + 1$ with $N$ simple poles and $M$ simple critical values (see [11]). The space $H_{g,N}(1,\dots,1)$ is also called the Hurwitz space ([11]).

Following [1], introduce the set $\widehat{U}(\mathcal{L})$ of pairs

$$\big\{ \mathcal{L}_1 \in U(\mathcal{L}) \ \big|\ \text{a canonical basis } \{a_i, b_i\}_{i=1}^{g} \text{ of cycles on } \mathcal{L}_1 \big\}. \tag{2.2}$$

The space $\widehat{U}(\mathcal{L})$ is a covering of $U(\mathcal{L})$.

The branch points $\lambda_1, \dots, \lambda_M$ of a covering $\mathcal{L}_1 \in U(\mathcal{L})$ can serve as local coordinates on the space $U(\mathcal{L})$ as well as on its covering $\widehat{U}(\mathcal{L})$.

A branched covering $\mathcal{L}$ is completely determined by its branch points if in addition one fixes a representation $\sigma$ of the fundamental group $\pi_1(\mathbb{P}^1 \setminus \{\lambda_1, \dots, \lambda_M\})$ in the symmetric group $S_N$. The element $\sigma_\gamma \in S_N$ corresponding to an element $\gamma \in \pi_1(\mathbb{P}^1 \setminus \{\lambda_1, \dots, \lambda_M\})$ describes the permutation of the sheets of the covering $\mathcal{L}$ if the point $\lambda \in \mathbb{P}^1$ encircles the loop $\gamma$. One gets a small neighborhood of a given branched covering $\mathcal{L}$ by moving the branch points in small neighborhoods of their initial positions without changing the representation $\sigma$.
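The role of the representation $\sigma$ can be illustrated computationally. The following is a minimal sketch and not part of the original construction: it assumes the monodromy is supplied as one permutation of the sheets per branch point (sheets numbered from 0, with `perm[i]` the image of sheet `i`), reads off the ramification indices as cycle lengths, and evaluates the Riemann–Hurwitz formula stated above. The names `cycle_lengths`, `compose` and `genus_from_monodromy` are hypothetical, and the sketch does not check that the covering is connected.

```python
from functools import reduce


def cycle_lengths(perm):
    """Return the cycle lengths of a permutation given as a tuple of images."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths


def compose(p, q):
    """Composition of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def genus_from_monodromy(perms):
    """Genus from g = sum (r - 1)/2 - N + 1, with r the cycle lengths.

    Assumes no branching at infinity, so the product of the local
    monodromies around all branch points must be the identity.
    """
    n_sheets = len(perms[0])
    identity = tuple(range(n_sheets))
    if reduce(compose, perms, identity) != identity:
        raise ValueError("monodromies do not multiply to the identity")
    # Trivial cycles (length 1) contribute nothing to the sum.
    total = sum(r - 1 for perm in perms for r in cycle_lengths(perm))
    # The product being the identity forces `total` to be even.
    return total // 2 - n_sheets + 1


swap = (1, 0)
print(genus_from_monodromy([swap] * 4))  # two sheets, four simple branch points: 1
print(genus_from_monodromy([(1, 0, 2), (1, 0, 2), (0, 2, 1), (0, 2, 1)]))  # 0
```

In the setting of the text each branch point carries a single ramification point, so each permutation has exactly one nontrivial cycle, of length $r_m$; the sketch accepts more general cycle types only for convenience.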

2.2. THE BERGMANN AND WIRTINGER PROJECTIVE CONNECTIONS

Choose on $\mathcal{L}$ a canonical basis of cycles $\{a_i, b_i\}_{i=1}^{g}$ and the corresponding basis of holomorphic differentials $v_i$ normalized by the conditions $\oint_{a_i} v_j = \delta_{ij}$. Let

$$B(P,Q) = d_P d_Q \ln E(P,Q), \tag{2.3}$$

where $E(P,Q)$ is the prime form (see [10] or [2]), be the Bergmann kernel on the surface $\mathcal{L}$.

The invariant Wirtinger bidifferential $W(P,Q)$ on $\mathcal{L}$ is defined by the equality

$$W(P,Q) = B(P,Q) + \frac{2}{4^g + 2^g} \sum_{i,j=1}^{g} v_i(P)\, v_j(Q)\, \frac{\partial^2}{\partial z_i \partial z_j} \ln \prod_{\beta\ \mathrm{even}} \Theta[\beta](z \,|\, \mathbb{B})\Big|_{z=0}, \tag{2.4}$$

where $\mathbb{B} = \|B_{ij}\|_{i,j=1}^{g}$ is the matrix of $b$-periods of $\mathcal{L}$; $\beta$ runs through the set of all even characteristics (see [3, 15]).
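The constant $2/(4^g+2^g)$ in (2.4) is the reciprocal of the number of even characteristics, which is $2^{g-1}(2^g+1) = (4^g+2^g)/2$. The brute-force count below is a sketch for illustration only; encoding a half-integer characteristic $[a/2, b/2]$ by a pair of 0/1 vectors $(a, b)$ and the name `count_even_characteristics` are assumptions of the example. A characteristic is even exactly when the dot product $a \cdot b$ is even.

```python
from itertools import product


def count_even_characteristics(g):
    """Count pairs (a, b) of 0/1 vectors of length g with a.b even."""
    return sum(
        1
        for a in product((0, 1), repeat=g)
        for b in product((0, 1), repeat=g)
        if sum(x * y for x, y in zip(a, b)) % 2 == 0
    )


for g in range(1, 5):
    # Brute-force count next to the closed form (4^g + 2^g) / 2.
    print(g, count_even_characteristics(g), (4**g + 2**g) // 2)
```

Read this way, the second term in (2.4) is an average over the even characteristics of the second logarithmic derivatives of the theta-functions at $z = 0$.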

In contrast to the Bergmann kernel, the invariant Wirtinger bidifferential does not depend on the choice of canonical basic cycles $\{a_i, b_i\}$.

The invariant Wirtinger bidifferential is not defined if the surface $\mathcal{L}$ has at least one vanishing theta-constant. Thus, we introduce the ‘punctured’ space $U'(\mathcal{L}) \subset U(\mathcal{L})$ consisting of equivalence classes of branched coverings with all nonvanishing theta-constants. Unless $g \le 2$, or $g > 2$ and $N = 2$, the ‘theta-divisor’ $Z = U(\mathcal{L}) \setminus U'(\mathcal{L})$ forms a subspace of codimension 1 in $U(\mathcal{L})$. If $g \le 2$ then the set $Z$ is empty and $U'(\mathcal{L}) = U(\mathcal{L})$; for hyperelliptic ($N = 2$) coverings of genus $g > 2$ a vanishing theta-constant always exists and, therefore, for such coverings $U'(\mathcal{L})$ is empty.

The Wirtinger bidifferential has the following asymptotics near the diagonal:

$$W(P,Q) = \left\{ \frac{1}{(x(P) - x(Q))^2} + \frac{1}{6} S_W(x(P)) + o(1) \right\} dx(P)\, dx(Q) \tag{2.5}$$

as $P \to Q$, where $x(P)$ is a local coordinate on $\mathcal{L}$. The quantity $S_W$ is a projective connection on $\mathcal{L}$; it is called the invariant Wirtinger projective connection. For the Bergmann kernel we have similar asymptotics

$$B(P,Q) = \left\{ \frac{1}{(x(P) - x(Q))^2} + \frac{1}{6} S_B(x(P)) + o(1) \right\} dx(P)\, dx(Q), \tag{2.6}$$

where $S_B$ is the Bergmann projective connection. The Bergmann and the invariant Wirtinger projective connections are related as follows:

$$S_W = S_B + \frac{12}{4^g + 2^g} \sum_{i,j=1}^{g} \left\{ \frac{\partial^2}{\partial z_i \partial z_j} \ln \prod_{\beta\ \mathrm{even}} \Theta[\beta](z \,|\, \mathbb{B})\Big|_{z=0} \right\} v_i v_j. \tag{2.7}$$

Like the Wirtinger bidifferential itself, the Wirtinger projective connection does not depend on the choice of basic cycles on $\mathcal{L}$, while the Bergmann projective connection does.

We recall that any projective connection $S$ behaves as follows under the coordinate change $x = x(z)$:

$$S(z) = S(x) \left( \frac{dx}{dz} \right)^2 + R^{x,z}, \tag{2.8}$$

where

$$R^{x,z} \equiv \{x, z\} = \frac{x'''(z)}{x'(z)} - \frac{3}{2} \left( \frac{x''(z)}{x'(z)} \right)^2 \tag{2.9}$$

is the Schwarzian derivative.

The following formula for the Bergmann projective connection at an arbitrary point $P \in \mathcal{L}$ on the Riemann surface of genus $g \ge 1$ is a simple corollary of expression (2.3) for the Bergmann kernel [2]:

$$S_B(x(P)) = -2\,\frac{T}{H} + \left\{ \int^{P} H,\ x(P) \right\}, \tag{2.10}$$

where

$$H = \sum_{i} \Theta^*_{z_i}(0)\, f_i; \qquad T = \sum_{i,j,k} \Theta^*_{z_i z_j z_k}(0)\, f_i f_j f_k;$$

$\Theta^*$ is the theta-function with an arbitrary nonsingular odd half-integer characteristic; $f_i \equiv v_i(P)/dx(P)$.
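As a worked illustration of the Schwarzian derivative (2.9), which also enters (2.10), two standard computations may help; they are a sketch added for orientation and are not taken from the original text. For a Möbius map the Schwarzian vanishes, and for the coordinate $x = (\lambda - \lambda_m)^{1/r}$ used near a ramification point of index $r$ it has a double pole in $\lambda$:

```latex
% Moebius map x(z) = (az + b)/(cz + d), with ad - bc = 1:
x' = \frac{1}{(cz+d)^2}, \qquad
x'' = -\frac{2c}{(cz+d)^3}, \qquad
x''' = \frac{6c^2}{(cz+d)^4},
\qquad\Longrightarrow\qquad
\{x, z\} = \frac{6c^2}{(cz+d)^2} - \frac{3}{2}\cdot\frac{4c^2}{(cz+d)^2} = 0 .

% Power map x(\lambda) = u^{1/r}, with u = \lambda - \lambda_m:
\frac{x'''}{x'} = \frac{(\tfrac{1}{r}-1)(\tfrac{1}{r}-2)}{u^2}, \qquad
\frac{x''}{x'} = \frac{\tfrac{1}{r}-1}{u},
\qquad\Longrightarrow\qquad
\{x, \lambda\} = \frac{r^2 - 1}{2 r^2 (\lambda - \lambda_m)^2} .
```

For a simple ramification point ($r = 2$) the second formula gives $\{x, \lambda\} = \tfrac{3}{8}(\lambda - \lambda_m)^{-2}$. By (2.8), this is the inhomogeneous term picked up when a projective connection is rewritten from the local coordinate $x$ to the coordinate $\lambda$ on the base.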

2.3. VARIATIONAL FORMULAS

Denote by $x_m = (\lambda - \lambda_m)^{1/r_m}$ the natural coordinate of a point $P$ in a neighborhood of the ramification point $P_m$, where $\lambda = p(P)$.

Recall the Rauch formula (see, e.g., [3], formula (3.21) or the classical paper [13]), which describes the variation of the matrix $\mathbb{B} = \|b_{ij}\|$ of $b$-periods under the variation of conformal structure corresponding to a Beltrami differential $\mu \in L^\infty$:

$$\delta_\mu b_{ij} = \int_{\mathcal{L}} \mu\, v_i v_j. \tag{2.11}$$

We shall also need the analogous formula for the variation of the Bergmann kernel

$$\delta_\mu B(P,Q) = \frac{1}{2\pi i} \int_{\mathcal{L}} \mu(\cdot)\, B(\cdot, P)\, B(\cdot, Q) \tag{2.12}$$

(see [3], p. 57).

Introduce the following Beltrami differential

$$\mu_m = -\frac{1}{2\varepsilon^{r_m}} \left( \frac{|x_m|}{x_m} \right)^{r_m - 2} \mathbf{1}_{\{|x_m| \le \varepsilon\}}\, \frac{d\bar{x}_m}{dx_m} \tag{2.13}$$

with sufficiently small $\varepsilon > 0$, where $\mathbf{1}_{\{|x_m| \le \varepsilon\}}$ is the function equal to 1 inside the disc of radius $\varepsilon$ centered at $P_m$ and vanishing outside the disc. (If $r_m = 2$ this Beltrami differential corresponds to the so-called Schiffer variation.)

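Setting $\mu = \mu_m$ in (2.11) and using the Cauchy formula, we get

$$\delta_{\mu_m} b_{ij} = \frac{2\pi i}{r_m\, (r_m - 2)!} \left( \frac{d}{dx_m} \right)^{r_m - 2} \left\{ \frac{v_i(x_m)\, v_j(x_m)}{(dx_m)^2} \right\} \Bigg|_{x_m = 0}. \tag{2.14}$$

Observe now that the r.h.s. of formula (2.14) coincides with the known expression for the derivative of the $b$-period with respect to the branch point $\lambda_m$:

$$\frac{\partial b_{ij}}{\partial \lambda_m} = 2\pi i\ \mathrm{res}\Big|_{\lambda = \lambda_m} \sum_{k=1}^{N} \frac{1}{d\lambda}\, v_i(\lambda^{(k)})\, v_j(\lambda^{(k)}), \tag{2.15}$$

where $\lambda^{(k)}$ denotes the point on the $k$th sheet of the covering $\mathcal{L}$ which projects to the point $\lambda \in \mathbb{P}^1$. (Only those sheets which are glued together at the point $P_m$ give a nontrivial contribution to the summation on the right-hand side of (2.15).) Thus, we have the following relation for variations of $b$-periods:

$$\partial_{\lambda_m} b_{ij} = \delta_{\mu_m} b_{ij}. \tag{2.16}$$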

This relation can be generalized to an arbitrary function of moduli. Let $Z: T_g \to H_g$ be the standard holomorphic map from the Teichmüller space $T_g$ to Siegel’s generalized upper half-plane. (The map $Z$ sends the conformal equivalence class of a marked Riemann surface to the set of $b$-periods of normalized holomorphic differentials on this surface.) It is well-known that the rank of the map $Z$ is $3g - 3$ at any point of $T_g \setminus T'_g$, where $T'_g$ is the $(2g - 1)$-dimensional subvariety of $T_g$ corresponding to hyperelliptic surfaces. Thus, one can always choose some $3g - 3$ $b$-periods as local coordinates in a small neighborhood of any point of $T_g \setminus T'_g$. Using these coordinates, we get

$$\frac{\delta f}{\delta \mu_m} = \sum_{i,j} \frac{\partial f}{\partial b_{ij}}\, \delta_{\mu_m} b_{ij} = \frac{\partial f}{\partial \lambda_m}, \tag{2.17}$$

for any differentiable function $f$ on $T_g$ under the condition that the variation in the l.h.s. of (2.17) is taken at a point of $T_g \setminus T'_g$ (i.e. at a nonhyperelliptic surface).

Formula (2.15) is well-known in the case of a simple branch point $\lambda_m$ (i.e. for $r_m = 2$, see, e.g., [12]). Since we did not find an appropriate reference for the general case, in what follows we briefly outline the proof.

Writing the basic differential $v_i$ in a neighborhood of the ramification point $P_m$ as

$$v_i(x_m) = \big( C_0 + C_1 x_m + \cdots + C_{r_m - 1} x_m^{r_m - 1} + O(|x_m|^{r_m}) \big)\, dx_m$$

and differentiating this expression with respect to $\lambda_m$, we get the asymptotics

$$\partial_{\lambda_m} v_i(x_m) = \left\{ C_0 \left( 1 - \frac{1}{r_m} \right) \frac{1}{x_m^{r_m}} + C_1 \left( 1 - \frac{2}{r_m} \right) \frac{1}{x_m^{r_m - 1}} + \cdots + C_{r_m - 2} \left( 1 - \frac{r_m - 1}{r_m} \right) \frac{1}{x_m^{2}} + O(1) \right\} dx_m. \tag{2.18}$$

If $n \neq m$ then in a neighborhood of the ramification point $P_n$ we have the asymptotics

$$\partial_{\lambda_m} v_i(x_n) = O(1)\, dx_n.$$

Therefore, the meromorphic differential $\partial_{\lambda_m} v_i$ has its only pole at the point $P_m$ and its principal part at $P_m$ is given by (2.18). Observe that all the $a$-periods of $\partial_{\lambda_m} v_i$ are equal to zero. Thus we can reconstruct $\partial_{\lambda_m} v_i$ via the first $r_m - 2$ derivatives of the Bergmann kernel:

$$\partial_{\lambda_m} v_i(P) = \frac{1}{r_m\, (r_m - 2)!} \left( \frac{d}{dx_m} \right)^{r_m - 2} \left\{ \frac{B(P, x_m)\, v_i(x_m)}{(dx_m)^2} \right\} \Bigg|_{x_m = 0}. \tag{2.19}$$

To get (2.15) it is enough to integrate (2.19) over the $b$-cycle $b_j$ (whose projection on $\mathbb{P}^1$ is independent of the branch points) and use the formula

$$\int_{b_j} B(\cdot, x_m) = 2\pi i\, v_j(x_m).$$

One may apply the same arguments to get the following formula for the derivative of the Bergmann kernel with respect to the branch point $\lambda_m$:

$$\partial_{\lambda_m} B(P,Q) = -\mathrm{res}\Big|_{\lambda = \lambda_m} \frac{1}{d\lambda} \sum_{k=1}^{N} B(P, \lambda^{(k)})\, B(Q, \lambda^{(k)}). \tag{2.20}$$

This formula also follows from (2.12) and (2.17).

We shall also need another expression for the derivative of the Bergmann kernel:

$$\partial_{\lambda_m} B(P,Q) = \mathrm{res}\Big|_{\lambda = \lambda_m} \left\{ \frac{1}{d\lambda} \sum_{j \neq k} B(P, \lambda^{(j)})\, B(Q, \lambda^{(k)}) \right\}. \tag{2.21}$$

To prove it we note that the sum $\sum_j B(P, \lambda^{(j)})$ over all the sheets of the covering $\mathcal{L}$ gives the Bergmann kernel on the sphere $\mathbb{P}^1$,

$$\frac{d\lambda\, d\mu(P)}{(\lambda - \mu(P))^2}$$

(here $\mu(P) = p(P)$); therefore, we have

$$\frac{(d\lambda)^2\, d\mu(P)\, d\mu(Q)}{(\lambda - \mu(P))^2 (\lambda - \mu(Q))^2} = \sum_j B(P, \lambda^{(j)}) \sum_k B(Q, \lambda^{(k)}) = \sum_j B(P, \lambda^{(j)})\, B(Q, \lambda^{(j)}) + \sum_{j \neq k} B(P, \lambda^{(j)})\, B(Q, \lambda^{(k)}).$$

Now taking the residue at $\lambda = \lambda_m$ and using (2.20), we get (2.21).
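The residue step can be spelled out as follows. This is a sketch which assumes, as in (2.15), that the residues in (2.20) and (2.21) are taken after division by $d\lambda$, and that $P$ and $Q$ are in generic position, so that $\mu(P), \mu(Q) \neq \lambda_m$:

```latex
% The left-hand side, divided by d\lambda, is holomorphic in \lambda near \lambda_m:
\mathrm{res}\Big|_{\lambda=\lambda_m}
  \frac{1}{d\lambda}\cdot
  \frac{(d\lambda)^2\, d\mu(P)\, d\mu(Q)}{(\lambda-\mu(P))^2(\lambda-\mu(Q))^2} = 0 ,

% hence the diagonal and off-diagonal parts of the double sum have opposite residues:
\mathrm{res}\Big|_{\lambda=\lambda_m}
  \frac{1}{d\lambda}\sum_{j\neq k} B(P,\lambda^{(j)})\,B(Q,\lambda^{(k)})
  = -\,\mathrm{res}\Big|_{\lambda=\lambda_m}
  \frac{1}{d\lambda}\sum_{j} B(P,\lambda^{(j)})\,B(Q,\lambda^{(j)})
  = \partial_{\lambda_m} B(P,Q),
```

where the last equality is (2.20).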

2.4. THE BERGMANN AND WIRTINGER PROJECTIVE CONNECTIONS AT THE BRANCH POINTS

Here we prove a property of the Bergmann projective connection on a branched covering which plays a crucial role in all our forthcoming constructions.

Introduce the following notation:

$$B_m = -\frac{1}{6\,(r_m - 2)!\, r_m} \left( \frac{d}{dx_m} \right)^{r_m - 2} S_B(x_m)\Big|_{x_m = 0}, \qquad m = 1, 2, \dots, M, \tag{2.22}$$

where $S_B(x_m)$ is the Bergmann projective connection corresponding to the local parameter $x_m = (\lambda - \lambda_m)^{1/r_m}$ near the ramification point $P_m$. (The factor $-1/6$ in (2.22) seems to be of no importance; its appearance will be explained later on.)

If we deform covering (2.1) by moving the branch points in small neighborhoods of their initial positions and preserving the permutations corresponding to the branch points, then the quantity $B_m$ becomes a function of $(\lambda_1, \dots, \lambda_M)$.

THEOREM 1. For any $m, n = 1, \dots, M$ the following equations hold:

$$\frac{\partial B_m}{\partial \lambda_n} = \frac{\partial B_n}{\partial \lambda_m}. \tag{2.23}$$

Proof. We start with the following lemma.

LEMMA 1. The function $B_m$ can be expressed via the Bergmann kernel as

$$B_m = 2\, \mathrm{res}\Big|_{\lambda = \lambda_m} \left\{ \frac{1}{d\lambda} \sum_{k,j=1;\ j \neq k}^{N} B(\lambda^{(j)}, \lambda^{(k)}) \right\}, \tag{2.24}$$

where $\lambda^{(j)}$ is the point of the $j$th sheet of covering (2.1) such that $p(\lambda^{(j)}) = \lambda$.

Let $H(\cdot, \cdot)$ be the nonsingular part of the Bergmann kernel, i.e.

$$B(P,Q) = \left( \frac{1}{(x(P) - x(Q))^2} + H(x(P), x(Q)) \right) dx(P)\, dx(Q),$$

as $P \to Q$.

To prove the lemma we observe that only those sheets which are glued together at the point $P_m$ give a nontrivial contribution to the summation in (2.24). Now we may rewrite the right-hand side of (2.24) as

$$\frac{1}{3}\, \mathrm{res}\Big|_{\lambda = \lambda_m} \sum_{j,k=1;\ j \neq k}^{r_m} H(\gamma^j x_m, \gamma^k x_m)\, \gamma^{j+k} \left( \frac{dx_m}{d\lambda} \right)^2 d\lambda,$$

where $\gamma = e^{2\pi i / r_m}$ is the root of unity. In terms of the coefficients of the Taylor series of $H(x_m, y_m)$ at the point $P_m$,

$$H(x_m, y_m) = \sum_{s=0}^{\infty} \sum_{p=0}^{s} \frac{H^{(p,\, s-p)}(0,0)}{p!\, (s-p)!}\, x_m^{p}\, y_m^{s-p},$$

this expression looks as follows:

$$\frac{1}{3 r_m^2} \sum_{p=0}^{r_m - 2} \frac{H^{(p,\, r_m - 2 - p)}(0,0)}{p!\, (r_m - 2 - p)!} \sum_{j,k=1;\ j<k}^{r_m} \gamma^{(p+1)k + (r_m - p - 1)j}.$$
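The geometric progression invoked in the next step can be checked numerically. The sketch below is an illustration only, and the function name `root_sums` is hypothetical. For $\gamma = e^{2\pi i / r}$ and $0 \le p \le r - 2$ the exponent $(p+1)k + (r-p-1)j$ is congruent to $(p+1)(k-j)$ modulo $r$, so the sum over all pairs $j \neq k$ equals $-r$, while the sum over $j < k$ has real part $-r/2$. The check concerns this elementary identity only, not the bookkeeping of the numerical prefactors.

```python
import cmath


def root_sums(r, p):
    """Return the sums of gamma**((p+1)*k + (r-p-1)*j) over j != k and over j < k."""
    gamma = cmath.exp(2j * cmath.pi / r)
    off_diagonal = 0j
    upper = 0j
    for j in range(1, r + 1):
        for k in range(1, r + 1):
            if j == k:
                continue
            term = gamma ** ((p + 1) * k + (r - p - 1) * j)
            off_diagonal += term
            if j < k:
                upper += term
    return off_diagonal, upper


for r in range(2, 7):
    for p in range(r - 1):
        off_diagonal, upper = root_sums(r, p)
        # Full off-diagonal sum equals -r; the j < k half has real part -r/2.
        assert abs(off_diagonal + r) < 1e-9
        assert abs(upper.real + r / 2) < 1e-9
```

The sum over $j > k$ is the complex conjugate of the sum over $j < k$, which is why only the real part of the latter is fixed by the identity.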

Summing up the geometrical progression, we get (2.24).

Using (2.24) and (2.21) we conclude that

$$\begin{aligned} \frac{\partial B_m}{\partial \lambda_n} &= 2 \left\{ \frac{\partial}{\partial \lambda_n}\, \mathrm{res}\Big|_{\lambda = \lambda_m} \frac{1}{d\lambda} \sum_{j \neq k} B(\lambda^{(j)}, \lambda^{(k)}) \right\} \\ &= 2\, \mathrm{res}\Big|_{\lambda = \lambda_m} \mathrm{res}\Big|_{\mu = \lambda_n} \left\{ \frac{1}{d\lambda}\, \frac{1}{d\mu} \sum_{j \neq k} \sum_{j' \neq k'} B(\mu^{(j')}, \lambda^{(j)})\, B(\mu^{(k')}, \lambda^{(k)}) \right\}. \end{aligned} \tag{2.25}$$

To finish the proof we note that the last expression is symmetric with respect to $m$ and $n$. □

The analogous statement is also true for the derivatives of the Wirtinger projective connection. Namely, set

$$A_m = -\frac{1}{6\,(r_m - 2)!\, r_m} \left( \frac{d}{dx_m} \right)^{r_m - 2} S_W(x_m)\Big|_{x_m = 0}, \qquad m = 1, 2, \dots, M, \tag{2.26}$$

where $S_W(x_m)$ is the Wirtinger projective connection corresponding to the local parameter $x_m$ near the ramification point $P_m$. The following statement is an easy corollary of Theorem 1.

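THEOREM 2. For any $m, n = 1, \dots, M$ the following equations hold:

$$\frac{\partial A_m}{\partial \lambda_n} = \frac{\partial A_n}{\partial \lambda_m}. \tag{2.27}$$

Proof. A simple calculation shows that the one-form

$$V = \sum_{m=1}^{M} (A_m - B_m)\, d\lambda_m$$

is a total differential:

$$V = -\frac{4}{4^g + 2^g}\, d \ln \prod_{\beta\ \mathrm{even}} \Theta[\beta](0 \,|\, \mathbb{B}). \tag{2.28}$$

To prove (2.28) it is sufficient to use the heat equation for the theta-function

$$\frac{\partial \Theta[\beta](z \,|\, \mathbb{B})}{\partial b_{jk}} = \frac{1}{4\pi i}\, \frac{\partial^2 \Theta[\beta](z \,|\, \mathbb{B})}{\partial z_j\, \partial z_k}, \tag{2.29}$$

the formula (2.14) for the derivative of the $b$-period with respect to the branch point and the link (2.7) between the Wirtinger and Bergmann projective connections. □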

2.5. THE WIRTINGER AND BERGMANN TAU-FUNCTIONS OF BRANCHED COVERINGS

2.5.1. The Wirtinger Tau-function

We recall that $U'(\mathcal{L})$ denotes the set of branched coverings from the connected component $U(\mathcal{L}) \ni \mathcal{L}$ of the Hurwitz space $H(N, M, \mathbb{P}^1)$ for which none of the theta-constants vanishes. Introduce the connection

$$d_W = d - \sum_{m=1}^{M} A_m\, d\lambda_m, \tag{2.30}$$

acting in the trivial bundle $U'(\mathcal{L}) \times \mathbb{C}$, where $d$ is the external differentiation (having both ‘holomorphic’ and ‘antiholomorphic’ components); the connection coefficients $A_m$ are defined by (2.26).

Remark 1. If we choose another global holomorphic coordinate $\tilde\lambda$ on $\mathbb{P}^1$, $\tilde\lambda = (a\lambda + b)/(c\lambda + d)$, where $ad - bc = 1$, then the connection $d_W$ turns into a gauge equivalent connection. Consider, for example, the case of branched coverings with simple branch points (all the $r_m$ are equal to 2). Let $\tilde\lambda_m$ be the new coordinates of the branch points,

$$\tilde\lambda_m = \frac{a\lambda_m + b}{c\lambda_m + d}; \tag{2.31}$$

then the gauge transformation of connection $d_W$ in local coordinates looks as follows:

$$d_W \longmapsto G^{-1}\, d_W\, G, \tag{2.32}$$

where

$$G = \prod_{m=1}^{M} (c\lambda_m + d)^{-1/4}. \tag{2.33}$$

Theorem 2 implies the following statement.

THEOREM 3. The connection $d_W$, defined in the trivial line bundle over $U'(\mathcal{L})$ in terms of the Wirtinger projective connection by formulas (2.30), (2.26), is flat.
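The step from Theorem 2 to flatness is short. As a sketch, write $A = \sum_m A_m\, d\lambda_m$, so that $d_W = d - A$; the curvature is then

```latex
d_W^2 = -\,dA + A\wedge A = -\,dA
      = -\sum_{m<n}\left(\frac{\partial A_n}{\partial\lambda_m}
                        -\frac{\partial A_m}{\partial\lambda_n}\right)
          d\lambda_m\wedge d\lambda_n
        \;-\;\sum_{m,n}\frac{\partial A_m}{\partial\bar\lambda_n}\,
          d\bar\lambda_n\wedge d\lambda_m ,
```

where $A \wedge A = 0$ because $A$ is a scalar-valued one-form. The first sum vanishes by (2.27), and the second vanishes because the coefficients $A_m$ depend holomorphically on the branch points. Locally, a horizontal section $\tau_W$ of $d_W$ therefore satisfies $\partial \ln \tau_W / \partial \lambda_m = A_m$.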

The flat connection $d_W$ determines a character of the fundamental group of $U'(\mathcal{L})$, i.e. the representation

$$\rho:\ \pi_1\big(U'(\mathcal{L})\big) \to \mathbb{C}^*. \tag{2.34}$$

Denote by $E$ the universal covering of $U'(\mathcal{L})$; then the group $\pi_1(U'(\mathcal{L}))$ acts on the direct product $E \times \mathbb{C}$ as follows:

$$g(e, z) = (ge, \rho(g) z),$$

where $e \in E$, $z \in \mathbb{C}$, $g \in \pi_1(U'(\mathcal{L}))$. The factor manifold $E \times \mathbb{C} / \pi_1(U'(\mathcal{L}))$ has the structure of a holomorphic line bundle over $U'(\mathcal{L})$; we denote this bundle by $T_W$.

DEFINITION 1. The flat holomorphic line bundle $T_W$ equipped with the flat connection $d_W$ is called the Wirtinger line bundle over the punctured Hurwitz space $U'(\mathcal{L})$. The (unique up to a multiplicative constant) horizontal holomorphic section of the bundle $T_W$ is called the Wirtinger $\tau$-function of the covering $\mathcal{L}$ and denoted by $\tau_W$.

Taking into account the form (2.32), (2.33) of the gauge transformation of connection $d_W$ under conformal transformations on the base $\lambda$-plane, we see that the Wirtinger tau-function $\tau_W$ of a branched covering with simple branch points transforms as follows under conformal transformation (2.31):

$$\tau_W \longmapsto \prod_{m=1}^{M} (c\lambda_m + d)^{-1/4}\, \tau_W. \tag{2.35}$$

One can easily derive the analogous formula in the general case of an arbitrary covering.

We notice that

• In genera 0, 1 and 2 the ‘theta-divisor’ $Z = U(\mathcal{L}) \setminus U'(\mathcal{L})$ is empty. Therefore, in this case the bundle $T_W$ is a bundle over the whole connected component $U(\mathcal{L})$ of the Hurwitz space $H(N, M, \mathbb{P}^1)$.

• Hyperelliptic coverings ($N = 2$) fall within this framework only in genera $g = 0, 1, 2$, since for genus $g > 2$ one of the theta-constants always vanishes for hyperelliptic curves [10].

• In the case of simple branch points the space $U(\mathcal{L})$ is nothing but the Hurwitz space $H_{g,N}(1,\dots,1)$ from ([1, 11]).

2.5.2. The Bergmann Tau-function

Consider now the covering U(L) (the set of pairs (2.2)) of the space U(L). Re-peating the construction of the previous subsection for the flat connection

dB = d−M∑m=1

Bm dλm, (2.36)

in the trivial line bundle U(L) × C, we get a flat holomorphic line bundle TBover U(L).

(Here the coefficients Bm are defined by formula (2.22), the flatness of connec-tion (2.36) follows from Theorem 1.)

DEFINITION 2. The flat holomorphic line bundle TB equipped with the flatconnection dB is called the Bergmann line bundle over the covering U(L) of theconnected component U(L) of the Hurwitz space H(N,M,P1). The (unique upto a multiplicative constant) horizontal holomorphic section of the bundle TB iscalled the Bergmann τ -function of the covering L and denoted by τB .

According to the link (2.7) between the Wirtinger and Bergmann projective connections, the corresponding tau-functions are related as follows:

$$\tau_W = \tau_B \Big\{ \prod_{\beta\ \mathrm{even}} \Theta[\beta](0\,|\,B) \Big\}^{-1/(4^{g-1}+2^{g-2})}. \tag{2.37}$$

In contrast to the Wirtinger tau-function, the Bergmann tau-function does depend upon the choice of canonical basis of cycles on $L$.

Consider the case of hyperelliptic ($N = 2$) coverings. As a by-product of the computation of isomonodromic tau-functions for Riemann–Hilbert problems with quasi-permutation monodromies (see [7]), the following expression was found for the Bergmann tau-function $\tau_B$ on the spaces $H_{g,2}(1, 1)$:

$$\tau_B = \det A \prod_{m,n=1;\ m<n}^{2g+2} (\lambda_m - \lambda_n)^{1/4}, \tag{2.38}$$

where $A$ is the matrix of $a$-periods of nonnormalized holomorphic differentials on $L$: $A_{\alpha\beta} = \oint_{a_\alpha} \lambda^{\beta-1}\, d\lambda/\nu$, with $\nu^2 = \prod_{m=1}^{2g+2} (\lambda - \lambda_m)$.

Expression (2.38) coincides with the empirical formula for the determinant of the $\bar\partial$-operator, acting in the trivial line bundle over $L$, derived in [8]. Due to the term $\det A$, the expression (2.38) depends explicitly on the choice of canonical basis of cycles on $L$.

On the other hand, the Wirtinger tau-function, which is independent of the choice of canonical basis of cycles, is defined on hyperelliptic curves only if $g \le 2$. Consider the case $g = 2$ (postponing the cases $g = 0, 1$ to the next section).

Recall the classical Thomae formulas, which express the theta-constants of hyperelliptic curves in terms of branch points. Namely, consider an arbitrary partition of the set of branch points $\{\lambda_1, \dots, \lambda_{2g+2}\}$ into two subsets, $T$ and $\bar T$, each containing $g + 1$ branch points. To each such partition we can associate an even vector of half-integer characteristics $[\eta'_T, \eta''_T]$ such that

$$B\eta'_T + \eta''_T = \sum_{\lambda_m \in T} U(\lambda_m) - K, \tag{2.39}$$

where $U(P)$ is the Abel map and $K$ is the vector of Riemann constants. The number of even characteristics obtained in this way is $\frac12 C_{2g+2}^{g+1}$. If we denote the theta-function with characteristics $[\eta'_T, \eta''_T]$ by $\Theta[\beta_T]$, the Thomae formula (see [10]) states that the related theta-constant can be computed as follows:

$$\Theta^4[\beta_T](0) = \pm (\det A)^2 \prod_{\lambda_m,\lambda_n\in T} (\lambda_m-\lambda_n) \prod_{\lambda_m,\lambda_n\in \bar T}(\lambda_m-\lambda_n). \tag{2.40}$$

In genus 2 we have $\frac12(4^2+2^2) = 10$ even characteristics in total; this number coincides with the number $\frac12 C_6^3$ of nonvanishing even characteristics for which the Thomae formulas hold. Substitution of the Thomae formulas (2.40) and expression (2.38) for $\tau_B$ into (2.37) gives the following formula for the Wirtinger tau-function of a hyperelliptic covering of genus 2:

$$\tau_W = \prod_{m,n=1,\ m<n}^{6} (\lambda_m-\lambda_n)^{1/20}. \tag{2.41}$$

The independence of the Wirtinger tau-function of the choice of canonical basis of cycles on $L$ is manifest here.
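The counting behind the restriction to $g \le 2$ can be checked directly. The following Python sketch (an illustration only; the function names are ours and not from the paper) compares the total number of even characteristics, $\frac12(4^g + 2^g)$, with the number $\frac12 C_{2g+2}^{g+1}$ of characteristics produced by partitions of the branch points. The two counts agree for $g \le 2$ and differ from $g = 3$ on, which is consistent with the vanishing of some theta-constants for hyperelliptic curves of higher genus.

```python
from math import comb


def even_characteristics(g):
    """Total number of even half-integer characteristics in genus g."""
    return (4 ** g + 2 ** g) // 2


def partition_characteristics(g):
    """Number of splittings of 2g+2 branch points into two sets of g+1 points."""
    return comb(2 * g + 2, g + 1) // 2


# Tabulate both counts for small genus (illustrative helper, not from the paper).
counts = {g: (even_characteristics(g), partition_characteristics(g)) for g in range(0, 5)}
for g, (total, from_partitions) in counts.items():
    print(g, total, from_partitions)
```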

Remark 2. For two-fold coverings of higher genus ($g > 2$) our definition of the Wirtinger tau-function does not work, since some of the theta-constants always vanish. However, we can slightly modify formula (2.37), averaging only over the set of nonsingular even characteristics. This leads to the following definition:

$$\tau^*_W = \tau_B \Big\{\prod_T \Theta[\beta_T](0\,|\,B)\Big\}^{-4/C_{2g+2}^{g+1}}. \tag{2.42}$$

Since the set of all characteristics $\beta_T$ is invariant with respect to any change of canonical basis of cycles, the function $\tau^*_W$ does not depend on the choice of this basis. Substitution of expression (2.38) and the Thomae formulas (2.40) into (2.42) leads to the following result, which reduces to (2.41) for $g = 2$:

$$\tau^*_W = \prod_{m,n=1,\ m<n}^{2g+2}(\lambda_m-\lambda_n)^{1/(4(2g+1))}. \tag{2.43}$$
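The exponent bookkeeping in this substitution can be sketched in Python. This is a check under stated assumptions, not the authors' computation: we assume each product in (2.40) runs over unordered pairs, we count exponents per unordered pair $m < n$, and the helper name `wirtinger_exponents` is ours. Each $\Theta[\beta_T](0)$ then carries $(\det A)^{1/2}$ and the power $1/4$ of every difference whose two points lie in the same subset.

```python
from fractions import Fraction
from itertools import combinations


def wirtinger_exponents(g):
    """Exponents of det A and of one difference (lambda_m - lambda_n), m < n,
    after substituting (2.38) and (2.40) into (2.42)."""
    # Unordered partitions {T, complement}: fix point 0 in T to avoid double counting.
    partitions = [set((0,) + rest) for rest in combinations(range(1, 2 * g + 2), g)]
    n = len(partitions)  # equals C(2g+2, g+1) / 2
    # Partitions in which the pair (0, 1) lies in one subset; by symmetry
    # the same count holds for every other pair of branch points.
    same = sum(1 for T in partitions if 1 in T)
    power = Fraction(-4, 2 * n)  # the exponent -4 / C(2g+2, g+1) in (2.42)
    det_exponent = 1 + Fraction(n, 2) * power
    pair_exponent = Fraction(1, 4) + Fraction(same, 4) * power
    return det_exponent, pair_exponent
```

Under these assumptions $\det A$ drops out for every $g$ and the remaining exponent is $1/(4(2g+1))$, which is $1/20$ for $g = 2$, in agreement with (2.41).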

The main goal of the present paper is the calculation of the Wirtinger and Bergmann tau-functions of an arbitrary covering $L$. In Section 3 we explicitly calculate them for coverings of genera 0 and 1. For arbitrary coverings of higher genus we are able to calculate only the modulus square of the tau-function (see Section 4).

Remark 3. The Bergmann tau-function is closely related to some classes of Frobenius manifolds (see [1]). Let $\phi$ be a primary differential (see [1], Theorem 5.1) defining the structure of a Frobenius manifold $M_\phi$ on the covering $H_{g,N}(1, \dots, 1)$. The rotation coefficients $\beta_{mn}$ of the corresponding Darboux–Egoroff metric are independent of $\phi$ and can be expressed through the Bergmann kernel on the covering $L$:

$$\beta_{mn} = \frac12 \left.\frac{B(P,Q)}{dx_m(P)\,dx_n(Q)}\right|_{P=P_m,\ Q=P_n}.$$

A simple calculation shows that

$$H_n = \frac12 \sum_{m\ne n} \beta_{mn}^2(\lambda_n-\lambda_m) = -\frac12 B_n, \tag{2.44}$$

where $H_n$ is the isomonodromic quadratic Hamiltonian from [1]. Relation (2.44) follows from Equation (2.25) and the properties of the vector fields $\sum_m \partial_{\lambda_m}$ and $\sum_m \lambda_m\partial_{\lambda_m}$ on the Frobenius manifold $M_\phi$.

Thus the Bergmann tau-function is related as follows to the isomonodromic tau-function from [1]: $\tau_B = \tau_I^{-2}$, where $\tau_I$ is the isomonodromic tau-function of the Frobenius manifold $M_\phi$. This enables us to answer the question from [14] concerning the relations between our formulas for the Bergmann tau-function and the G-functions of Frobenius manifolds considered in [14]. The details will appear elsewhere.


3. Rational and Elliptic Cases

If $g = 0$ the branched covering $L$ can be biholomorphically mapped to the Riemann sphere $\mathbb{P}^1$. Let $z$ be the natural coordinate on $\mathbb{P}^1 \setminus \infty$. The projective connection $S_B(x_m)$ reduces to the Schwarzian derivative

$$S_B(x_m) = R_{z,x_m} = \{z(x_m), x_m\}.$$

Therefore

$$B_m = -\frac{1}{6 r_m (r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2} R_{z,x_m}\Big|_{x_m=0}. \tag{3.1}$$

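If $g = 1$ the branched covering $L$ can be biholomorphically mapped to the torus with periods 1 and $\mu$; in genus 1 there is only one theta-function with odd characteristic, namely the odd Jacobi theta-function $\theta_1(z|\mu) = \theta\big[{}^{1/2}_{1/2}\big](z|\mu)$. Using (2.10) and the heat equation $\partial_z^2\theta_1 = 4\pi i\,\partial_\mu\theta_1$, we get

$$S_B(x_m) = -8\pi i\, \frac{\partial \ln\theta_1'}{\partial\mu}\, v^2(x_m) + R_{z,x_m},$$

where $\theta_1' \equiv \partial\theta_1/\partial z|_{z=0}$, $v = v(x_m)\,dx_m$ and $z = \int^P v$. Now the variational formula (2.14) implies that

$$B_m = \frac23\, \frac{\partial\ln\theta_1'}{\partial\lambda_m} - \frac{1}{6r_m(r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2}R_{z,x_m}\Big|_{x_m=0}. \tag{3.2}$$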

Our way of calculating the tau-functions $\tau_W$ and $\tau_B$ is rather indirect. Namely, we shall first compute the modulus of the tau-function. Since the first term in (3.2) can be immediately integrated, in both cases $g = 0$ and $g = 1$ one needs to find a real-valued potential $S(\lambda_1, \dots, \lambda_M)$ satisfying

$$\frac{\partial S}{\partial\lambda_m} = \frac{1}{(r_m-2)!\, r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R_{z,x_m}\Big|_{x_m=0}, \tag{3.3}$$

where $z$ is the natural coordinate on the universal covering of $L$ (i.e. on the complex plane for $g = 1$ and the Riemann sphere for $g = 0$).

The solution of Equations (3.3) is given by Theorem 4 below. The function $S$ turns out to coincide with the properly regularized Dirichlet integral

$$\frac{1}{2\pi}\int_L |\phi_\lambda|^2, \tag{3.4}$$

where $e^{\phi}|d\lambda|^2$ is the flat metric on $L$ obtained by projecting the standard metric $|dz|^2$ from the universal covering. (In case $g = 0$, when the universal covering is the Riemann sphere, the metric $|dz|^2$ is singular.)

The Dirichlet integral (3.4) can be explicitly represented as the modulus square of a holomorphic function of the variables $\lambda_1, \dots, \lambda_M$. The procedure of holomorphic factorization gives us the value of the tau-function itself.

The next two subsections are devoted to the calculation of the function S.

3.1. THE FLAT METRIC ON RIEMANN SURFACES OF GENUS 0 AND 1

The asymptotics of the flat metric near the branch points. Compact Riemann surfaces $L$ of genus 1 and 0 have the universal coverings $\tilde L = \mathbb{C}$ and $\tilde L = \mathbb{P}^1$ respectively. Projecting the metric $|dz|^2$ from the universal covering onto $L$, we obtain a metric of Gaussian curvature 0 on $L$. (In case $g = 0$ the obtained metric has a singularity at the image of the infinity of $\mathbb{P}^1$.) Let $J: \tilde L \to L$ be the uniformization map; denote its inverse by $U = J^{-1}$. Denote by $x$ a local parameter on $L$. The projection of the metric $|dz|^2$ on $L$ looks as follows:

$$e^{\phi(x,\bar x)}|dx|^2 = |U_x(x)|^2|dx|^2, \tag{3.5}$$

where the function $\phi$ satisfies the Laplace equation

$$\phi_{x\bar x} = 0. \tag{3.6}$$

In the case $g = 1$ the map $P \mapsto U(P)$ may be defined by

$$U(P) = \int^P v$$

with any holomorphic differential $v$ on $L$ (not necessarily normalized).

In the case $g = 0$ we choose one sheet of the covering $L$ (we shall call this sheet the first one) and require that $U(\infty^{(1)}) = \infty$, where $\infty^{(1)}$ is the infinity of the first sheet.

Choose any sheet of the covering $L$ (this will be a copy of the Riemann sphere $\mathbb{P}^1$ with appropriate cuts between the branch points; we recall that it is assumed that the infinities of all the sheets are not ramification points) and cut out small neighborhoods of all the branch points and a neighborhood of the infinity. In the remaining domain we can use $\lambda$ as a global coordinate. Let $\phi^{ext}(\lambda, \bar\lambda)$ be the function from (3.5) corresponding to the coordinate $x = \lambda$ and $\phi^{int}(x_m, \bar x_m)$ be the function from (3.5) corresponding to the coordinate $x = x_m$.

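LEMMA 2. The derivative of the function $\phi^{ext}$ has the following asymptotics near the branch points and the infinities of the sheets:

(1) $|\phi^{ext}_\lambda(\lambda, \bar\lambda)|^2 = \big((1/r_m) - 1\big)^2|\lambda - \lambda_m|^{-2} + O(|\lambda - \lambda_m|^{-2+1/r_m})$ as $\lambda \to \lambda_m$,

(2) $|\phi^{ext}_\lambda(\lambda, \bar\lambda)|^2 = 4|\lambda|^{-2} + O(|\lambda|^{-3})$ as $\lambda \to \infty$.

(3) In the case $g = 0$ on the first sheet the last asymptotics is replaced by $|\phi^{ext}_\lambda(\lambda, \bar\lambda)|^2 = O(|\lambda|^{-6})$ as $\lambda \to \infty$.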

Proof. In a small punctured neighborhood of $P_m$ on the chosen sheet we have

$$e^{\phi^{int}(x_m,\bar x_m)}|dx_m|^2 = e^{\phi^{ext}(\lambda,\bar\lambda)}|d\lambda|^2. \tag{3.7}$$

This gives the equality

$$e^{\phi^{ext}(\lambda,\bar\lambda)} = \frac{1}{r_m^2}\, e^{\phi^{int}(x_m,\bar x_m)}\, |\lambda-\lambda_m|^{2/r_m - 2},$$

which implies the first asymptotics.

In a neighborhood of the infinity of the chosen sheet we may introduce the coordinate $\zeta = 1/\lambda$. Denote by $\phi^{\infty}(\zeta, \bar\zeta)$ the function $\phi$ from (3.5) corresponding to the coordinate $x = \zeta$. Now the second asymptotics follows from the equality

$$e^{\phi^{ext}(\lambda,\bar\lambda)} = e^{\phi^{\infty}(\zeta,\bar\zeta)}|\lambda|^{-4}. \tag{3.8}$$

In the case $g = 0$ near the infinity of the first sheet we have

$$U(\lambda) = c_1\lambda + c_0 + c_{-1}\frac{1}{\lambda} + \cdots$$

with $c_1 \ne 0$. So at the infinity of the first sheet there is the asymptotics

$$\phi^{ext}_\lambda(\lambda, \bar\lambda) = \frac{U_{\lambda\lambda}}{U_\lambda} = O(|\lambda|^{-3}). \qquad \Box$$

The Schwarzian connection in terms of the flat metric. Let $x$ be some local coordinate on $L$. Set $z = U(x)$; here $z$ is a point of the universal covering ($\mathbb{C}$ or $\mathbb{P}^1$). The system of Schwarzian derivatives $R_{z,x}$ (each derivative corresponds to its own local chart) forms a projective connection on the surface $L$. In accordance with [5], we call it the Schwarzian connection.

LEMMA 3. (1) The Schwarzian connection can be expressed as follows in terms of the function $\phi$ from (3.5):

$$R_{z,x} = \phi_{xx} - \tfrac12\phi_x^2. \tag{3.9}$$

(2) In a neighborhood of a branch point $P_m$ there is the following relation between the values of the Schwarzian connection computed with respect to the coordinates $\lambda$ and $x_m$:

$$R_{z,\lambda} = \frac{1}{r_m^2}(\lambda-\lambda_m)^{2/r_m-2}R_{z,x_m} + \Big(\frac12 - \frac{1}{2r_m^2}\Big)(\lambda-\lambda_m)^{-2}. \tag{3.10}$$

(3) Let $\zeta$ be the coordinate in a neighborhood of the infinity of any sheet of covering (2.1) (except the first one in the case $g = 0$), $\zeta = 1/\lambda$. Then

$$R_{z,\lambda} = \frac{R_{z,\zeta}}{\lambda^4} = O(|\lambda|^{-4}). \tag{3.11}$$

Proof. The second and the third statements are just the rule of transformation of the Schwarzian derivative under a coordinate change. The formula (3.9) is well known and can be verified by a straightforward calculation. $\Box$
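The straightforward calculation behind (3.9) can be sketched as follows. Since $\phi = \ln U_x + \ln\overline{U_x}$, only the holomorphic part contributes to $\phi_x = U_{xx}/U_x$, so that $\phi_{xx} - \frac12\phi_x^2 = U_{xxx}/U_x - \frac32 (U_{xx}/U_x)^2$. The Python check below evaluates both sides at $x = 0$ from the Taylor coefficients of $U$; the function names and the sample coefficients are illustrative assumptions.

```python
from fractions import Fraction


def schwarzian_at_zero(a1, a2, a3):
    """{U, x} at x = 0 for U = a0 + a1*x + a2*x**2 + a3*x**3 + ... (a1 != 0)."""
    d1, d2, d3 = a1, 2 * a2, 6 * a3  # U', U'', U''' at x = 0
    return d3 / d1 - Fraction(3, 2) * (d2 / d1) ** 2


def flat_metric_side(a1, a2, a3):
    """phi_xx - phi_x**2 / 2 at x = 0, with phi_x = U''/U'."""
    d1, d2, d3 = a1, 2 * a2, 6 * a3
    phi_x = d2 / d1
    phi_xx = d3 / d1 - (d2 / d1) ** 2  # derivative of U''/U'
    return phi_xx - phi_x ** 2 / 2


# Arbitrary sample Taylor coefficients (hypothetical values).
sample = (Fraction(3), Fraction(-2, 5), Fraction(7, 4))
print(schwarzian_at_zero(*sample) == flat_metric_side(*sample))
```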

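The derivative of the metric with respect to a branch point. In this item we set $\phi(\lambda, \bar\lambda) = \phi^{ext}(\lambda, \bar\lambda)$. The following lemma describes the dependence of the function $\phi$ on the positions of the branch points of the covering $L$.

LEMMA 4. Let $g = 0, 1$. The derivative of the function $\phi$ with respect to $\lambda$ is related to its derivative with respect to a branch point $\lambda_m$ as follows:

$$\frac{\partial\phi}{\partial\lambda_m} + F_m\frac{\partial\phi}{\partial\lambda} + \frac{\partial F_m}{\partial\lambda} = 0, \tag{3.12}$$

where

$$F_m = -\frac{U_{\lambda_m}}{U_\lambda}. \tag{3.13}$$

Proof. We have $\phi = \ln U_\lambda + \ln\overline{U_\lambda}$; $\phi_\lambda = U_{\lambda\lambda}/U_\lambda$, $\phi_{\lambda_m} = U_{\lambda\lambda_m}/U_\lambda$ and

$$\frac{U_{\lambda\lambda_m}}{U_\lambda} = \frac{U_{\lambda_m}}{U_\lambda}\,\frac{U_{\lambda\lambda}}{U_\lambda} + \Big(\frac{U_{\lambda_m}}{U_\lambda}\Big)_\lambda.$$

(We used the fact that the map $U$ depends on the branch points holomorphically.) $\Box$

LEMMA 5. Let $g = 0$ or $g = 1$ and let $J$ be the uniformization map $J: \mathbb{CP}^1 \to L$ or $J: \mathbb{C} \to L$ respectively. Denote the composition $p \circ J$ by $R$. Then

(1) The following relation holds:

$$F_m = \frac{\partial R}{\partial\lambda_m}. \tag{3.14}$$

(2) In a neighborhood of the branch point $\lambda_l$ the following asymptotics holds:

$$F_m = \delta_{lm} + o(1), \tag{3.15}$$

where $\delta_{lm}$ is the Kronecker symbol.

(3) At the infinity of each sheet (except the first sheet for $g = 0$) the following asymptotics holds:

$$F_m(\lambda) = O(|\lambda|^2). \tag{3.16}$$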

Proof. Writing the dependence on the branch points explicitly we have

$$U(\lambda_1, \dots, \lambda_M; R(\lambda_1, \dots, \lambda_M; z)) = z \tag{3.17}$$

for any $z$ from the universal covering ($\mathbb{P}^1$ for $g = 0$ or $\mathbb{C}$ for $g = 1$). Differentiating (3.17) with respect to $\lambda_m$ we get (3.14).

Let $z_0 = z_0(\lambda_1, \dots, \lambda_M)$ be a point from the universal covering such that $J(z_0) = P_m$. The map $R$ is holomorphic and in a neighborhood of $z_0$ there is the representation

$$R(z) = \lambda_m + (z - z_0)^{r_m} f(z, \lambda_1, \dots, \lambda_M) \tag{3.18}$$

with some holomorphic function $f(\cdot, \lambda_1, \dots, \lambda_M)$. This together with the first statement of the lemma gives (3.15).

Let now $z_\infty = z_\infty(\lambda_1, \dots, \lambda_M)$ be a point from the universal covering such that $J(z_\infty) = \infty$, where $\infty$ is the infinity of the chosen sheet. Then in a neighborhood of $z_\infty$ we have

$$\lambda = R(z) = g(z; \lambda_1, \dots, \lambda_M)(z - z_\infty)^{-1}$$

with holomorphic $g(\cdot, \lambda_1, \dots, \lambda_M)$. Using the first statement of the lemma, we get (3.16). $\Box$

COROLLARY 1. Keep $m$ fixed and define $F_n(x_n) \equiv F_m(\lambda_n + x_n^{r_n})$. Then

$$F_n(0) = \delta_{nm}; \qquad \Big(\frac{d}{dx_n}\Big)^k F_n(0) = 0, \quad k = 1, \dots, r_n - 2.$$

This immediately follows from formulas (3.14) and (3.18).

Formulas (3.12) and (3.15) are analogous to the Ahlfors lemma as it was formulated in [17]. However, they are more elementary, since their proof does not use Teichmüller's theory.

3.2. THE REGULARIZED DIRICHLET INTEGRAL

We recall that the covering $L$ has $N$ sheets and $N = \sum_{m=1}^{M}(r_m - 1)/2 - g + 1$ due to the Riemann–Hurwitz formula. To the $k$th sheet $L_k$ of the covering $L$ there corresponds the function $\phi^{ext}_k: L_k \to \mathbb{R}$, which is smooth in any domain of the form $G^k_\rho = \{\lambda \in L_k : \forall m\ |\lambda - \lambda_m| > \rho \text{ and } |\lambda| < 1/\rho\}$, where $\rho > 0$. Here $\lambda_m$ are all the branch points which belong to the $k$th sheet $L_k$ of $L$. In the case of genus zero the above definition of the domain $G^k_\rho$ is valid for $k = 2, \dots, N$. The domain $G^1_\rho$ in this case should be defined separately:

$$G^1_\rho = \{\lambda \in L_1 \setminus \infty^{(1)} : \forall m\ |\lambda - \lambda_m| > \rho\}.$$

(Here, again, $\lambda_m$ are all the branch points from the first sheet.) We recall that in the case $g = 0$ we have singled out one sheet of the covering (the first sheet in our enumeration). The function $\phi^{ext}_k$ has finite limits at the cuts (except the endpoints, which are the ramification points); at the ramification points and at infinity it possesses the asymptotics listed in Lemma 2.
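For orientation, the Riemann–Hurwitz count $N = \sum_{m}(r_m - 1)/2 - g + 1$ quoted above can be evaluated for a few ramification profiles. The helper below is a small Python sketch with a hypothetical name; it assumes, as in the text, that the infinities of the sheets are not ramification points.

```python
from fractions import Fraction


def number_of_sheets(ramification_indices, genus):
    """N from the Riemann-Hurwitz formula N = sum((r_m - 1)/2) - g + 1."""
    n = Fraction(sum(r - 1 for r in ramification_indices), 2) - genus + 1
    if n.denominator != 1 or n < 1:
        raise ValueError("ramification data inconsistent with the given genus")
    return int(n)


# Two-fold coverings with 2g+2 simple branch points (r_m = 2) in genus g.
two_fold = [number_of_sheets([2] * (2 * g + 2), g) for g in range(0, 4)]
print(two_fold)
```

The two examples of Section 3.3 correspond to the ramification data `[2, 2]` with genus 0 and `[2, 2, 2, 2]` with genus 1.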

Let us introduce the regularized Dirichlet integral

$$\frac{1}{2\pi}\int_L |\phi_\lambda|^2\, dS.$$

Namely, set

$$Q_\rho = \sum_{k=1}^{N}\int_{G^k_\rho} |\partial_\lambda\phi^{ext}_k|^2\, dS, \tag{3.19}$$

where $dS$ is the area element on $\mathbb{C}^1$: $dS = |d\lambda \wedge d\bar\lambda|/2$.

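According to Lemma 2 there exist the finite limits

$$S_{ell}(\lambda_1, \dots, \lambda_M) = \frac{1}{2\pi}\lim_{\rho\to 0}\Big(Q_\rho + \Big\{4N + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Big\}\,2\pi\ln\rho\Big) + \sum_{m=1}^{M}(1 - r_m)\ln r_m \tag{3.20}$$

in the case $g = 1$ and

$$S_{rat}(\lambda_1, \dots, \lambda_M) = \frac{1}{2\pi}\lim_{\rho\to 0}\Big(Q_\rho + \Big\{4(N-1) + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Big\}\,2\pi\ln\rho\Big) + \sum_{m=1}^{M}(1 - r_m)\ln r_m \tag{3.21}$$

in the case $g = 0$; the last constant term $\sum_{m=1}^{M}(1 - r_m)\ln r_m$ is included for convenience.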

THEOREM 4. Let $S = S_{rat}$ for $g = 0$, $S = S_{ell}$ for $g = 1$. Then for any $m = 1, \dots, M$

$$\frac{\partial S(\lambda_1, \dots, \lambda_M)}{\partial\lambda_m} = \frac{1}{(r_m-2)!\, r_m}\left(\frac{d}{dx_m}\right)^{r_m-2}R_{z,x_m}\Big|_{x_m=0}, \tag{3.22}$$

where $z$ is the natural coordinate on the universal covering of $L$ ($\mathbb{P}^1$ for $g = 0$ and $\mathbb{C}$ for $g = 1$).

Proof. We shall restrict ourselves to the case $g = 1$. The proofs for $g = 0$ and $g = 1$ differ only in details concerning the infinity of the first sheet.

Let $Q_\rho$ be defined by formula (3.19). We have

$$\partial_{\lambda_m}Q_\rho = \frac{i}{2}\sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m^{(l)}|=\rho}|\partial_\lambda\phi|^2\, d\bar\lambda + \sum_{k=1}^{N}\iint_{G^{(k)}_\rho}\partial_{\lambda_m}|\partial_\lambda\phi|^2\, dS. \tag{3.23}$$

Here the first sum corresponds to those sheets of the covering (2.1) which are glued together at the point $P_m$; the upper index $(l)$ signifies that the integration is over a contour lying on the $l$th sheet.

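LEMMA 6. There is an equality

$$\frac{2}{(r_m-2)!\, r_m}\left(\frac{d}{dx_m}\right)^{r_m-2}R_{z,x_m}\Big|_{x_m=0} = -\sum_{n=1}^{M}\Big(1 - \frac{1}{r_n^2}\Big)\frac{1}{(r_n-1)!}\left(\frac{d}{dx_n}\right)^{r_n}F_m(\lambda_n + x_n^{r_n})\Big|_{x_n=0}. \tag{3.24}$$

Here $x_n$, $x_m$ are the local parameters near $P_n$ and $P_m$. The summation on the right is over all the branch points of the covering $L$.

Proof. Using (3.9) and the holomorphy of $R_{z,\lambda}$ with respect to $\lambda$, we have

$$\begin{aligned}
0 &= \sum_{k=1}^{N}\oint_{\partial G^k_\rho}F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2)\, d\lambda\\
&= 2\sum_{k=1}^{N}\oint_{|\lambda|=1/\rho}F_m R_{z,\lambda}\, d\lambda + \sum_{k=1}^{N}\sum_{\lambda_n\in L_k}\oint_{|\lambda-\lambda_n|=\rho}F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2)\, d\lambda.
\end{aligned} \tag{3.25}$$

The asymptotics (3.11) and (3.16) imply that the first sum in (3.25) is $o(1)$ as $\rho \to 0$. The second sum coincides with

$$\sum_{n=1}^{M}\oint_{|x_n|=\rho^{1/r_n}}F_n(x_n)\Big[\frac{2R_{z,x_n}}{r_n^2 x_n^{2r_n-2}} + \frac{1}{x_n^{2r_n}}\Big(1 - \frac{1}{r_n^2}\Big)\Big]r_n x_n^{r_n-1}\, dx_n. \tag{3.26}$$

Here we have used (3.10); the function $F_n$ is from Corollary 1. Now using Corollary 1 together with the Cauchy formula and taking the limit $\rho \to 0$ we get (3.24). $\Box$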

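The rest of the proof relies on the method proposed in [17]. Denote by $H_2$ the second term in (3.23). Using (3.12) and the equality $F_{m\bar\lambda} = 0$, we get the relation

$$\begin{aligned}
\partial_{\lambda_m}|\phi_\lambda|^2 &= -(F_m|\phi_\lambda|^2)_\lambda - (F_{m\lambda}\phi_{\bar\lambda})_\lambda\\
&= -(F_m|\phi_\lambda|^2)_\lambda - (F_{m\lambda}\phi_\lambda)_{\bar\lambda} - (F_{m\lambda}\phi_{\bar\lambda})_\lambda.
\end{aligned} \tag{3.27}$$

This gives

$$\begin{aligned}
H_2 &= -\frac{i}{2}\sum_{k=1}^{N}\Big(\oint_{\partial G^{(k)}_\rho}F_m|\phi_\lambda|^2\, d\bar\lambda - \oint_{\partial G^{(k)}_\rho}F_{m\lambda}\phi_\lambda\, d\lambda + \oint_{\partial G^{(k)}_\rho}F_{m\lambda}\phi_{\bar\lambda}\, d\bar\lambda\Big)\\
&= -\frac{i}{2}\sum_{\lambda_j}\sum_{p=1}^{r_j}\Big(\oint_{|\lambda^{(p)}-\lambda_j^{(p)}|=\rho}F_m|\phi_\lambda|^2\, d\bar\lambda - \oint_{|\lambda^{(p)}-\lambda_j^{(p)}|=\rho}F_{m\lambda}\phi_\lambda\, d\lambda + \oint_{|\lambda^{(p)}-\lambda_j^{(p)}|=\rho}F_{m\lambda}\phi_{\bar\lambda}\, d\bar\lambda\Big)\\
&\quad - \frac{i}{2}\sum_{k=1}^{N}\Big(\oint_{|\lambda^{(k)}|=1/\rho}F_m|\phi_\lambda|^2\, d\bar\lambda - \oint_{|\lambda^{(k)}|=1/\rho}F_{m\lambda}\phi_\lambda\, d\lambda + \oint_{|\lambda^{(k)}|=1/\rho}F_{m\lambda}\phi_{\bar\lambda}\, d\bar\lambda\Big).
\end{aligned} \tag{3.28}$$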

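Let

$$I_1^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}F_m|\phi_\lambda|^2\, d\bar\lambda; \qquad I_2^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}F_{m\lambda}\phi_\lambda\, d\lambda;$$

$$I_3^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}F_{m\lambda}\phi_{\bar\lambda}\, d\bar\lambda.$$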

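We have

$$\begin{aligned}
I_1^n(\rho) &= \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}|\phi_\lambda|^2\, d\bar\lambda + \oint_{|x_n|=\rho^{1/r_n}}\Big[\frac{1}{(r_n-1)!}F_n^{(r_n-1)}(0)\,x_n^{r_n-1} + \frac{1}{r_n!}F_n^{(r_n)}(0)\,x_n^{r_n} + O(|x_n|^{r_n+1})\Big]\\
&\quad\times\Big(\frac{|\phi^{int}_{x_n}|^2}{r_n^2\, x_n^{r_n-1}\bar x_n^{r_n-1}} + \frac{1-r_n}{r_n^2}\,\frac{\phi^{int}_{\bar x_n}}{x_n^{r_n}\bar x_n^{r_n-1}} + \frac{1-r_n}{r_n^2}\,\frac{\phi^{int}_{x_n}}{x_n^{r_n-1}\bar x_n^{r_n}} + \Big(\frac{1}{r_n}-1\Big)^2\frac{1}{x_n^{r_n}\bar x_n^{r_n}}\Big)\, r_n\bar x_n^{r_n-1}\, d\bar x_n\\
&= \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}|\phi_\lambda|^2\, d\bar\lambda + 2\pi i\,\frac{(1/r_n-1)^2}{(r_n-1)!}F_n^{(r_n)}(0) + 2\pi i\,\frac{1-r_n}{r_n(r_n-1)!}F_n^{(r_n-1)}(0)\,\phi^{int}_{x_n}(0) + o(1)
\end{aligned}$$

as $\rho \to 0$.

We get also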

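$$\begin{aligned}
I_2^n(\rho) &= \oint_{|x_n|=\rho^{1/r_n}}\Big(\frac{1}{r_n x_n^{r_n-1}}\phi^{int}_{x_n} + \Big(\frac{1}{r_n}-1\Big)\frac{1}{x_n^{r_n}}\Big)\\
&\quad\times\Big(\frac{1}{(r_n-2)!}F_n^{(r_n-1)}(0)\,x_n^{r_n-2} + \frac{1}{(r_n-1)!}F_n^{(r_n)}(0)\,x_n^{r_n-1} + O(|x_n|^{r_n})\Big)\, dx_n\\
&= -2\pi i\Big(\frac{1}{r_n}-1\Big)\frac{1}{(r_n-1)!}F_n^{(r_n)}(0) - 2\pi i\,\frac{1}{r_n(r_n-2)!}\,\phi^{int}_{x_n}(0)\,F_n^{(r_n-1)}(0) + o(1)
\end{aligned}$$

and

$$\begin{aligned}
I_3^n(\rho) &= \oint_{|x_n|=\rho^{1/r_n}}\Big(\frac{1}{(r_n-2)!}F_n^{(r_n-1)}(0)\,x_n^{r_n-2} + \frac{1}{(r_n-1)!}F_n^{(r_n)}(0)\,x_n^{r_n-1} + O(|x_n|^{r_n})\Big)\\
&\quad\times\Big(\frac{1}{r_n\bar x_n^{r_n-1}}\phi^{int}_{\bar x_n} + \Big(\frac{1}{r_n}-1\Big)\frac{1}{\bar x_n^{r_n}}\Big)\Big(\frac{\bar x_n}{x_n}\Big)^{r_n-1}\, d\bar x_n\\
&= 2\pi i\,\frac{(1/r_n-1)}{(r_n-1)!}F_n^{(r_n)}(0) + o(1).
\end{aligned}$$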

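We note that

$$\begin{aligned}
I_1^n - I_2^n + I_3^n &= \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}|\phi_\lambda|^2\, d\bar\lambda + \frac{2\pi i}{(r_n-1)!}F_n^{(r_n)}(0)\Big[\Big(\frac{1}{r_n}-1\Big)^2 + 2\Big(\frac{1}{r_n}-1\Big)\Big] + o(1)\\
&= \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n^{(p)}|=\rho}|\phi_\lambda|^2\, d\bar\lambda - \frac{2\pi i}{(r_n-1)!}\Big(1 - \frac{1}{r_n^2}\Big)F_n^{(r_n)}(0) + o(1).
\end{aligned}$$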
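The combination of the three residue computations is pure rational arithmetic in $r_n$ and can be double-checked mechanically. The Python sketch below (function name and layout are ours) collects the coefficients of $2\pi i\, F_n^{(r_n)}(0)/(r_n-1)!$ and of $2\pi i\, F_n^{(r_n-1)}(0)\,\phi^{int}_{x_n}(0)$ quoted in the expansions of $I_1^n$, $I_2^n$ and $I_3^n$, and forms $I_1^n - I_2^n + I_3^n$.

```python
from fractions import Fraction
from math import factorial


def combined_coefficients(r):
    """Coefficients in I1 - I2 + I3 for a branch point of order r >= 2.

    Returns (coefficient of 2*pi*i*F^(r)(0)/(r-1)!,
             coefficient of 2*pi*i*F^(r-1)(0)*phi_x(0))."""
    a = Fraction(1, r) - 1
    top = a ** 2 - (-a) + a  # contributions of I1, I2, I3 respectively
    mixed_i1 = Fraction(1 - r, r * factorial(r - 1))
    mixed_i2 = -Fraction(1, r * factorial(r - 2))
    return top, mixed_i1 - mixed_i2  # I3 has no mixed term


for r in range(2, 7):
    print(r, combined_coefficients(r))
```

For every $r \ge 2$ the first coefficient equals $-(1 - 1/r^2)$ and the mixed term cancels, as stated.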

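It is easy to verify that

$$\sum_{k=1}^{N}\Big(\oint_{|\lambda^{(k)}|=1/\rho}F_m|\phi_\lambda|^2\, d\bar\lambda - \oint_{|\lambda^{(k)}|=1/\rho}F_{m\lambda}\phi_\lambda\, d\lambda + \oint_{|\lambda^{(k)}|=1/\rho}F_{m\lambda}\phi_{\bar\lambda}\, d\bar\lambda\Big) = o(1),$$

so we get

$$H_2 = -\frac{i}{2}\Big(\sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m^{(l)}|=\rho}|\phi_\lambda|^2\, d\bar\lambda - 2\pi i\sum_{n=1}^{M}\frac{1}{(r_n-1)!}\Big(1 - \frac{1}{r_n^2}\Big)F_n^{(r_n)}(0)\Big) + o(1). \tag{3.29}$$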

Now Lemma 6, (3.23) and (3.29) imply that

$$\partial_{\lambda_m}Q_\rho = \frac{2\pi}{(r_m-2)!\, r_m}\left(\frac{d}{dx_m}\right)^{r_m-2}R_{z,x_m}\Big|_{x_m=0} + o(1). \tag{3.30}$$

To prove Theorem 4 it is sufficient to observe that the term $o(1)$ in (3.30) is uniform with respect to the parameters $(\lambda_1, \dots, \lambda_M)$ belonging to a compact neighborhood of the initial point $(\lambda_1^0, \dots, \lambda_M^0)$. $\Box$

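COROLLARY 2. The formulas for the functions $S_{ell}$ and $S_{rat}$ can be rewritten as follows:

$$S_{ell}(\lambda_1, \dots, \lambda_M) = \sum_{m=1}^{M}\frac{r_m-1}{2}\,\phi^{int}(x_m, \bar x_m)\Big|_{x_m=0} - \sum_{k=1}^{N}\phi^{\infty}(\infty^{(k)}), \tag{3.31}$$

$$S_{rat}(\lambda_1, \dots, \lambda_M) = \sum_{m=1}^{M}\frac{r_m-1}{2}\,\phi^{int}(x_m, \bar x_m)\Big|_{x_m=0} - \sum_{k=2}^{N}\phi^{\infty}(\infty^{(k)}). \tag{3.32}$$

Here $\infty^{(k)}$ is the infinity of the $k$th sheet of covering (2.1); $\phi^{\infty}(\infty^{(k)}) = \phi^{\infty}(\zeta, \bar\zeta)|_{\zeta=0}$; $\zeta = 1/\lambda$ is the local parameter near $\infty^{(k)}$.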

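Proof. Using the Laplace equation (3.6), the Stokes theorem and the asymptotics from Lemma 2, we get in the case $g = 1$:

$$\begin{aligned}
Q_\rho &= \sum_{k=1}^{N}\iint_{G^k_\rho}\big[(\phi_\lambda\phi)_{\bar\lambda} - \phi_{\lambda\bar\lambda}\phi\big]\, dS = \frac{1}{2i}\sum_{k=1}^{N}\int_{\partial G^k_\rho}\phi_\lambda\phi\, d\lambda\\
&= \frac{1}{2i}\Big(\sum_{m=1}^{M}\oint_{|x_m|=\rho^{1/r_m}}\Big\{\frac{1}{r_m}\phi^{int}_{x_m}x_m^{1-r_m} + \Big(\frac{1}{r_m}-1\Big)x_m^{-r_m}\Big\}\big\{\phi^{int} + 2(1-r_m)\ln|x_m| - 2\ln r_m\big\}\, r_m x_m^{r_m-1}\, dx_m\\
&\qquad\quad + \sum_{k=1}^{N}\oint_{|\lambda|=1/\rho}\Big\{-\phi^{\infty}_\zeta\lambda^{-2} - \frac{2}{\lambda}\Big\}\big\{\phi^{\infty} - 4\ln|\lambda|\big\}\, d\lambda\Big)\\
&= -\pi\sum_{m=1}^{M}(1-r_m)\,\phi^{int}(x_m)\big|_{x_m=0} - 2\pi\sum_{k=1}^{N}\phi^{\infty}(\infty^{(k)})\\
&\quad - \Big(4N + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Big)2\pi\ln\rho - 2\pi\sum_{m=1}^{M}(1-r_m)\ln r_m + o(1),
\end{aligned}$$

as $\rho \to 0$. This implies (3.31).

In case $g = 0$ we repeat the same calculation, omitting the integrals around the infinity of the first sheet. $\Box$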


3.3. FACTORIZATION OF THE DIRICHLET INTEGRAL AND THE TAU-FUNCTIONS OF RATIONAL AND ELLIPTIC COVERINGS

Now we are in a position to calculate the Bergmann tau-function itself. For rational coverings the Wirtinger and Bergmann tau-functions trivially coincide; in the elliptic case the expression for the Wirtinger tau-function follows from that for the Bergmann one.

We start with the tau-functions of elliptic coverings.

THEOREM 5. In case $g = 1$ the Bergmann tau-function of the covering $L$ is given by the following expression:

$$\tau_B = [\theta_1'(0\,|\,\mu)]^{2/3}\, \frac{\prod_{k=1}^{N}h_k^{1/6}}{\prod_{m=1}^{M}f_m^{(r_m-1)/12}}, \tag{3.33}$$

where $v(P)$ is the normalized Abelian differential on the torus $L$; $v(P) = f_m(x_m)\, dx_m$ as $P \to P_m$ and $f_m \equiv f_m(0)$; $v(P) = h_k(\zeta)\, d\zeta$ as $P \to \infty^{(k)}$ and $h_k \equiv h_k(0)$; $\mu$ is the $b$-period of the differential $v(P)$.

Proof. It is sufficient to observe that

$$\phi^{int}(x_m, \bar x_m) = \ln U'(x_m) + \ln\overline{U'(x_m)} = \ln|f_m(x_m)|^2$$

in a neighborhood of $P_m$ and

$$\phi^{\infty}(\zeta, \bar\zeta) = \ln|h_k(\zeta)|^2$$

in a neighborhood of $\infty^{(k)}$, and to make use of (3.31) and (3.2). $\Box$

Now Theorem 5, the link (2.37) between the Bergmann and Wirtinger tau-functions, and the Jacobi formula $\theta_1' = \pi\theta_2\theta_3\theta_4$ imply the following corollary.

COROLLARY 3. The Wirtinger tau-function of the elliptic covering $L$ is given by the formula

$$\tau_W = \frac{\prod_{k=1}^{N}h_k^{1/6}}{\prod_{m=1}^{M}f_m^{(r_m-1)/12}}. \tag{3.34}$$

We notice that the result (3.34) does not depend on the normalization of the holomorphic differential $v(P)$: if one makes a transformation $v(P) \to Cv(P)$ with an arbitrary constant $C$, this constant cancels out in (3.34) due to the Riemann–Hurwitz formula.
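The cancellation can be made explicit. Under $v \to Cv$ every $f_m$ and every $h_k$ is multiplied by $C$, so (3.34) acquires the factor $C^{N/6 - \sum_m (r_m-1)/12}$; by the Riemann–Hurwitz formula $\sum_m (r_m - 1) = 2(N + g - 1)$ this exponent equals $(1-g)/6$, which vanishes for $g = 1$. A minimal Python sketch of this count (the function name is an illustrative choice) follows.

```python
from fractions import Fraction


def scaling_exponent(ramification_indices, n_sheets):
    """Power of C acquired by (3.34) under v -> C*v."""
    return Fraction(n_sheets, 6) - sum(Fraction(r - 1, 12) for r in ramification_indices)


# Elliptic examples: ramification data chosen so that Riemann-Hurwitz gives g = 1.
examples = [([2, 2, 2, 2], 2), ([2] * 6, 3), ([3, 3, 3], 3)]
print([scaling_exponent(rs, n) for rs, n in examples])
```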

For the rational case the Bergmann and Wirtinger tau-functions coincide.

THEOREM 6. In case $g = 0$ the tau-functions of the covering $L$ can be calculated by the formula

$$\tau_W \equiv \tau_B = \frac{\prod_{k=2}^{N}\Big(\dfrac{dU}{d\zeta_k}\Big|_{\zeta_k=0}\Big)^{1/6}}{\prod_{m=1}^{M}\Big(\dfrac{dU}{dx_m}\Big|_{x_m=0}\Big)^{(r_m-1)/12}}, \tag{3.35}$$

where $x_m$ is the local parameter near the branch point $P_m$ and $\zeta_k$ is the local parameter near the infinity of the $k$th sheet. (We recall that the map $U$ is chosen in such a way that $U(\infty^{(1)}) = \infty$.)

The proof is essentially the same.

Remark 4. The fractional powers on the right-hand sides of formulas (3.35) and (3.34) are understood in the sense of analytic continuation. The arising monodromies are just the monodromies generated by the flat connection $d_W$. It should be noted that the 12th powers of the tau-functions (3.35) and (3.34) are single-valued global holomorphic functions on the Hurwitz space $U(L)$.

It is instructive to illustrate the formulas (3.35) and (3.33) for the simplest two-fold coverings with two (g = 0) and four (g = 1) branch points.

3.3.1. Tau-function of a Two-fold Rational Covering

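Consider the covering of $\mathbb{P}^1$ with two sheets and two branch points $\lambda_1$ and $\lambda_2$. Then $g = 0$ and

$$U(\lambda) = \frac12\Big(\lambda + \frac{\lambda_1+\lambda_2}{2} + \sqrt{(\lambda-\lambda_1)(\lambda-\lambda_2)}\Big). \tag{3.36}$$

We get

$$\begin{aligned}
\{U(x_1), x_1\}\big|_{x_1=0} &= \Big\{x_1^2 + x_1\sqrt{\lambda_1-\lambda_2+x_1^2},\ x_1\Big\}\Big|_{x_1=0}\\
&= \Big\{\sqrt{\lambda_1-\lambda_2}\,x_1 + x_1^2 + \frac{x_1^3}{2\sqrt{\lambda_1-\lambda_2}},\ x_1\Big\}\Big|_{x_1=0} = \frac{3}{\lambda_2-\lambda_1}
\end{aligned}$$

and

$$\{U(x_2), x_2\}\big|_{x_2=0} = \frac{3}{\lambda_1-\lambda_2}. \tag{3.37}$$

Now direct integration of Equations (3.37) gives the following result:

$$\tau_W = \tau_B = (\lambda_1-\lambda_2)^{1/4} \tag{3.38}$$

(up to a multiplicative constant). On the other hand, to apply the general formula (3.35), we find

$$U_{x_1}(0) = \tfrac12\sqrt{\lambda_1-\lambda_2}; \qquad U_{x_2}(0) = \tfrac12\sqrt{\lambda_2-\lambda_1},$$

$$U(\zeta_2) = \frac12\Big(\frac{1}{\zeta_2} + \frac{\lambda_1+\lambda_2}{2} - \frac{1}{\zeta_2}\sqrt{(1-\zeta_2\lambda_1)(1-\zeta_2\lambda_2)}\Big) = \frac{\lambda_1+\lambda_2}{2} + \frac{(\lambda_1-\lambda_2)^2}{16}\,\zeta_2 + \cdots.$$

Therefore, our formula (3.35) in this case also gives rise to (3.38).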
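Both routes to (3.38) can be checked numerically. The Python sketch below is an illustration under our own choices: sample branch points, finite-difference derivatives with step `h`, and hypothetical function names. It evaluates the Schwarzian derivative of (3.36) in the local parameter $x$, with $\lambda = \lambda_{here} + x^2$, compares $B_m = -\frac{1}{12}\{U, x_m\}|_{x_m=0}$ from (3.1) with the logarithmic derivative of $(\lambda_1 - \lambda_2)^{1/4}$, and repeats the exponent count of (3.35).

```python
import cmath
from fractions import Fraction


def U_local(x, lam_here, lam_other):
    """U from (3.36) near a branch point, with lambda = lam_here + x**2."""
    lam = lam_here + x * x
    root = x * cmath.sqrt(lam_here - lam_other + x * x)  # sqrt((lam-lam1)(lam-lam2))
    return 0.5 * (lam + (lam_here + lam_other) / 2 + root)


def schwarzian_at_branch_point(lam_here, lam_other, h=1e-2):
    """Finite-difference estimate of {U(x), x} at x = 0."""
    f = lambda x: U_local(x, lam_here, lam_other)
    d1 = (f(h) - f(-h)) / (2 * h)
    d2 = (f(h) - 2 * f(0) + f(-h)) / h ** 2
    d3 = (f(2 * h) - 2 * f(h) + 2 * f(-h) - f(-2 * h)) / (2 * h ** 3)
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


lam1, lam2 = 2 + 1j, -1 + 0.5j  # arbitrary sample branch points
s1 = schwarzian_at_branch_point(lam1, lam2)
s2 = schwarzian_at_branch_point(lam2, lam1)
b1, b2 = -s1 / 12, -s2 / 12  # (3.1) with r_m = 2

# Exponent of (lam1 - lam2) in (3.35): dU/dzeta_2 ~ (lam1 - lam2)**2 to the power 1/6,
# and two factors U_{x_m}(0) ~ (lam1 - lam2)**(1/2), each to the power 1/12.
exponent = 2 * Fraction(1, 6) - 2 * Fraction(1, 2) * Fraction(1, 12)
```

With the sample values above, `s1` and `s2` agree with $3/(\lambda_2-\lambda_1)$ and $3/(\lambda_1-\lambda_2)$ up to the discretization error, and `exponent` reproduces the power $1/4$ in (3.38).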

3.3.2. Tau-functions of Two-fold Elliptic Coverings

Consider the two-fold covering $L$ with four branch points:

$$\mu^2 = (\lambda-\lambda_1)(\lambda-\lambda_2)(\lambda-\lambda_3)(\lambda-\lambda_4). \tag{3.39}$$

There are two ways to compute the tau-function on the space of such coverings. On one hand, since the elliptic curve $L$ belongs to the hyperelliptic class, we can apply the known formula (2.38), which gives:

$$\tau_B(\lambda_1, \dots, \lambda_4) = A\prod_{m,n=1,\dots,4;\ m<n}(\lambda_m-\lambda_n)^{1/4}, \tag{3.40}$$

where $A = \oint_a d\lambda/\mu$ is the $a$-period of the nonnormalized holomorphic differential.

On the other hand, to apply the formula (3.33) to this case, we notice that the normalized holomorphic differential on $L$ is equal to

$$v(P) = \frac{1}{A}\,\frac{d\lambda}{\mu};$$

the local parameters near $P_n$ are $x_n = \sqrt{\lambda-\lambda_n}$. Therefore,

$$f_m = 2A^{-1}\prod_{n\ne m}(\lambda_m-\lambda_n)^{-1/2}, \qquad h_k = (-1)^k A^{-1}, \quad k = 1, 2.$$

According to the Jacobi formula, $\theta_1' = \pi\theta_2\theta_3\theta_4$; moreover, the genus 1 version of the Thomae formulas for theta-constants gives

$$\theta_k^4 = \pm\frac{A^2}{(2\pi i)^2}(\lambda_{j_1}-\lambda_{j_2})(\lambda_{j_3}-\lambda_{j_4}),$$

where $k = 2, 3, 4$ and $(j_1, \dots, j_4)$ are appropriate permutations of $(1, \dots, 4)$. Computing $\theta_1'$ according to these expressions, we again get (3.40).
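The agreement with (3.40) amounts to adding up exponents of $A$ and of the differences $\lambda_m - \lambda_n$ ($m < n$), ignoring multiplicative constants. The Python sketch below tracks these exponents through (3.33); it assumes that each of the six differences occurs in exactly one of the three Thomae formulas, and the dictionary layout and names are illustrative.

```python
from fractions import Fraction

# Each theta_k (k = 2, 3, 4) ~ A**(1/2) times two differences to the power 1/4;
# together the three partitions use each of the six differences once.
theta1_prime = {"A": 3 * Fraction(1, 2), "pair": Fraction(1, 4)}

# prod_k h_k ~ A**(-2); prod_m f_m ~ A**(-4) * prod_{m<n} (lam_m - lam_n)**(-1),
# since every unordered pair occurs in two of the four factors f_m.
h_product = {"A": Fraction(-2), "pair": Fraction(0)}
f_product = {"A": Fraction(-4), "pair": Fraction(-1)}


def tau_B_exponents():
    """Exponents of A and of one difference in (3.33) for the curve (3.39)."""
    return {
        key: Fraction(2, 3) * theta1_prime[key]
        + Fraction(1, 6) * h_product[key]
        - Fraction(1, 12) * f_product[key]  # (r_m - 1)/12 with r_m = 2
        for key in ("A", "pair")
    }
```

The result is the first power of $A$ and the power $1/4$ of each difference, in agreement with (3.40).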

3.4. THE WIRTINGER TAU-FUNCTION AND ISOMONODROMIC DEFORMATIONS

In [9] a solution was given to a class of Riemann–Hilbert problems with quasi-permutation monodromies in terms of Szegö kernels on branched coverings of $\mathbb{P}^1$. The isomonodromic tau-function of Jimbo and Miwa associated to these Riemann–Hilbert problems is closely related to the tau-functions of the branched coverings considered in this paper.

Here we briefly outline this link for the genus zero coverings $L$. So, let $L$ be biholomorphically equivalent to the Riemann sphere $\mathbb{P}^1$ with global coordinate $z$. Introduce the 'prime-forms' on the $z$-sphere and the $\lambda$-sphere:

$$E(z, z_0) = \frac{z - z_0}{\sqrt{dz}\sqrt{dz_0}}, \qquad E_0(\lambda, \lambda_0) = \frac{\lambda - \lambda_0}{\sqrt{d\lambda}\sqrt{d\lambda_0}}. \tag{3.41}$$

Define an $N \times N$ matrix-valued function $\Psi(\lambda, \lambda_0)$ for $\lambda$ belonging to a small neighborhood of $\lambda_0$:

$$\Psi_{jk}(\lambda, \lambda_0) = \frac{E_0(\lambda, \lambda_0)}{E(\lambda^{(k)}, \lambda_0^{(j)})} = (\lambda-\lambda_0)\,\frac{\sqrt{z'(\lambda^{(k)})}\sqrt{z'(\lambda_0^{(j)})}}{z(\lambda^{(k)}) - z(\lambda_0^{(j)})}, \tag{3.42}$$

where $z' = dz/d\lambda$. To compute the determinant of the matrix $\Psi$ we use the following identity for two arbitrary sets of complex numbers $z_1, \dots, z_N$, $\mu_1, \dots, \mu_N$:

$$\det_{N\times N}\Big\{\frac{1}{z_j - \mu_k}\Big\} = \frac{\prod_{j<k}(z_j - z_k)(\mu_k - \mu_j)}{\prod_{j,k}(z_j - \mu_k)}. \tag{3.43}$$

Using this relation, we find that

$$\det\Psi = (\lambda-\lambda_0)^N\prod_{k=1}^{N}\big\{z_\lambda(\lambda^{(k)})\,z_\lambda(\lambda_0^{(k)})\big\}^{1/2} \times \frac{\prod_{j<k}\{z(\lambda^{(k)}) - z(\lambda^{(j)})\}\{z(\lambda_0^{(j)}) - z(\lambda_0^{(k)})\}}{\prod_{j,k}\{z(\lambda^{(k)}) - z(\lambda_0^{(j)})\}}.$$

This expression is symmetric with respect to the interchange of any two sheets; therefore, it is a single-valued function of $\lambda$ and $\lambda_0$. Moreover, it is nonsingular (and equal to 1) at $\lambda = \lambda_0$, and nonsingular as $\lambda \to \infty$. Therefore, it is globally nonsingular, and thus identically equal to 1.
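The determinant identity (3.43) is the classical Cauchy determinant formula; it can be confirmed in exact arithmetic for small $N$. The following Python sketch (our own helper names, with a permutation-sum determinant that is only practical for small matrices) compares both sides for sample integer data.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def determinant(matrix):
    """Determinant via the Leibniz permutation sum (fine for small N)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        sign = -1 if inversions % 2 else 1
        total += sign * prod(matrix[i][perm[i]] for i in range(n))
    return total


def cauchy_right_hand_side(z, mu):
    """Right-hand side of (3.43)."""
    n = len(z)
    numerator = prod((z[j] - z[k]) * (mu[k] - mu[j]) for j in range(n) for k in range(j + 1, n))
    denominator = prod(z[j] - mu[k] for j in range(n) for k in range(n))
    return Fraction(numerator, denominator)


z, mu = [1, 2, 4, 8], [3, 5, 7, 11]  # sample data with z_j != mu_k
lhs = determinant([[Fraction(1, zj - mk) for mk in mu] for zj in z])
rhs = cauchy_right_hand_side(z, mu)
```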

The function $\Psi$ obviously equals the unit matrix as $\lambda \to \lambda_0$. The only singularities of the function $\Psi$ in the $\lambda$-plane are the branch points $\lambda_m$. These are regular singularities with quasi-permutation monodromy matrices with nonvanishing entries equal to $\pm 1$.

Therefore, the function $\Psi(\lambda)$, being analytically continued from a small neighborhood of the point $\lambda_0$ to the universal covering of $\mathbb{P}^1 \setminus \{\lambda_1, \dots, \lambda_M\}$, gives a solution to the Riemann–Hilbert problem with regular singularities at the points $\lambda_m$ and quasi-permutation monodromy matrices. It is nondegenerate outside of $\{\lambda_m\}$, equals $I$ at $\lambda = \lambda_0$, and satisfies the equations

$$\frac{\partial\Psi}{\partial\lambda} = \sum_{m=1}^{M}\frac{A_m}{\lambda-\lambda_m}\Psi, \qquad \frac{\partial\Psi}{\partial\lambda_m} = -\frac{A_m}{\lambda-\lambda_m}\Psi \tag{3.44}$$

for some $N \times N$ matrices $\{A_m\}$ depending on $\{\lambda_m\}$. Compatibility of Equations (3.44) implies the Schlesinger system for the functions $A_m(\{\lambda_n\})$. The corresponding Jimbo–Miwa tau-function $\tau_{JM}(\{\lambda_m\})$ is defined by the equations

$$\frac{\partial\ln\tau_{JM}}{\partial\lambda_m} = \tfrac12\,\mathrm{res}\big|_{\lambda=\lambda_m}\,\mathrm{tr}(\Psi_\lambda\Psi^{-1})^2. \tag{3.45}$$

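The tau-function, as well as the expression $\mathrm{tr}(\Psi_\lambda\Psi^{-1})^2$, is independent of the normalization point $\lambda_0$; taking the limit $\lambda_0 \to \lambda$ in this expression, we get

$$\Psi_{jk} = \frac{\sqrt{z_\lambda(\lambda^{(j)})\,z_\lambda(\lambda^{(k)})}}{z(\lambda^{(j)}) - z(\lambda^{(k)})}(\lambda_0-\lambda) + O((\lambda-\lambda_0)^2), \qquad \Psi_{jj} = 1 + o(1) \quad \text{as } \lambda_0 \to \lambda \tag{3.46}$$

and

$$\tfrac12\,\mathrm{tr}\big(\Psi_\lambda\Psi^{-1}(\lambda)\big)^2 = -\frac{1}{(d\lambda)^2}\sum_{j\ne k}B\big(z(\lambda^{(j)}), z(\lambda^{(k)})\big), \tag{3.47}$$

where

$$B(z, \tilde z) = \frac{dz\, d\tilde z}{(z-\tilde z)^2}$$

is the Bergmann kernel on $\mathbb{P}^1$. Consider the behavior of expression (3.47) as $\lambda \to \lambda_m$; suppose that the sheets glued at the ramification point $P_m$ have numbers $s$ and $t$. Then, since $d\lambda = 2x_m\, dx_m$, we have as $\lambda \to \lambda_m$,

$$\begin{aligned}
\tfrac12\,\mathrm{tr}\big(\Psi_\lambda\Psi^{-1}(\lambda)\big)^2 &= -\frac{1}{4(\lambda-\lambda_m)}\,\frac{z_{x_m}(\lambda^{(s)})\,z_{x_m}(\lambda^{(t)})}{[z(\lambda^{(s)}) - z(\lambda^{(t)})]^2} + O(1)\\
&= -\frac{1}{4(\lambda-\lambda_m)}\Big(\frac{1}{[x_m(\lambda^{(s)}) - x_m(\lambda^{(t)})]^2} + \tfrac16\{z, x_m\}\big|_{x_m=0}\Big) + O(1)\\
&= -\frac{1}{4(\lambda-\lambda_m)}\Big(\frac{1}{4(\lambda-\lambda_m)} + \tfrac16\{z, x_m\}\big|_{x_m=0}\Big) + O(1).
\end{aligned}$$

Therefore, the definition of the isomonodromic tau-function (3.45) gives rise to

$$\frac{\partial\ln\tau_{JM}}{\partial\lambda_m} = -\tfrac{1}{24}\{z, x_m\}\big|_{x_m=0}; \tag{3.48}$$

thus, in genus zero we get the following relation between the isomonodromic and Wirtinger tau-functions: $\tau_{JM} = \{\tau_W\}^{-1/2}$, where $\tau_W$ is given by (3.35).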

4. The Case of Higher Genus

In this section we calculate the modulus square of the Bergmann and Wirtinger tau-functions for an arbitrary covering of genus $g > 1$.

Let $L_0$ be a point of $U(L)$. In a small neighborhood of $L_0$ we may consider the branch points $\lambda_1, \dots, \lambda_M$ as local coordinates on $U(L)$.

The tau-function $\tau_B$ (a section of the Bergmann line bundle) can be considered as a holomorphic function in this small neighborhood of $L_0$. Its modulus square, $|\tau_B|^2$, is the restriction of a section of the 'real' line bundle $T_B \otimes \overline{T_B}$.

To compute $|\tau_B|^2$ we are to find a real-valued potential $\ln|\tau_B|^2$ such that

$$\frac{\partial\ln|\tau_B|^2}{\partial\lambda_m} = B_m; \qquad m = 1, \dots, M. \tag{4.1}$$

If the covering $L$ has genus $g > 1$ then it is biholomorphically equivalent to the quotient space $\mathbb{H}/\Gamma$, where $\mathbb{H} = \{z \in \mathbb{C} : \Im z > 0\}$ and $\Gamma$ is a strictly hyperbolic Fuchsian group. Denote by $\pi_F: \mathbb{H} \to L$ the natural projection. The Fuchsian projective connection on $L$ is given by the Schwarzian derivative $\{z, x\}$, where $x$ is a local coordinate of a point $P \in L$, $z \in \mathbb{H}$, $\pi_F(z) = P$.

We recall the variational formula ([19], see also [3]) for the determinant of the Laplacian on the Riemann surface $L$:

$$\delta_\mu\ln\Big(\frac{\det\Delta}{\det\Im B}\Big) = -\frac{1}{12\pi i}\int_L(S_B - S_F)\mu,$$

where $B$ is the matrix of $b$-periods, $S_B$ is the Bergmann projective connection, $S_F$ is the Fuchsian projective connection and $\mu$ is a Beltrami differential. Since, as we discussed above, the derivation with respect to $\lambda_m$ corresponds to the Beltrami differential $\mu_m$ from (2.13), we conclude that

$$-\frac{1}{6r_m(r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2}\big(S_B(x_m) - \{z, x_m\}\big)\Big|_{x_m=0} = \frac{\partial}{\partial\lambda_m}\ln\Big(\frac{\det\Delta}{\det\Im B}\Big). \tag{4.2}$$

Remark 5. This formula explains the appearance of the factor $-1/6$ in Definition (2.22) of the connection coefficient $B_m$.

Therefore, the calculation of the modulus of the Bergmann tau-function of the covering $L$ reduces to the problem of finding a real-valued function $S_{Fuchs}(\lambda_1, \dots, \lambda_M)$ such that

$$\frac{\partial S_{Fuchs}}{\partial\lambda_m} = \frac{1}{r_m(r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2}\{z, x_m\}\Big|_{x_m=0}, \qquad m = 1, \dots, M. \tag{4.3}$$

Another link of $|\tau_B|^2$ with known objects can be established if we introduce the Schottky uniformization of the covering $L$. Namely, the covering $L$ (of genus $g > 1$) is biholomorphically equivalent to the quotient space

$$L = D/\Sigma,$$

where $\Sigma$ is a (normalized) Schottky group and $D \subset \mathbb{P}^1$ is its region of discontinuity. Denote by $\pi_\Sigma: D \to L$ the natural projection.

Page 78: Mathematical Physics, Analysis and Geometry - Volume 7

78 A. KOKOTOV AND D. KOROTKIN

Introduce the Schottky projective connection on L given by the Schwarzian derivative {ω, x}, where x is a local coordinate of a point P ∈ L; ω ∈ D; πH(ω) = P.

Due to the formula (2.17) and the results of [16] (namely, see Remark 3.5 in [16]), we have

\[
-\frac{1}{6\,r_m(r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2}\bigl(S_B(x_m) - \{\omega, x_m\}\bigr)\Big|_{x_m=0}
= \frac{\partial}{\partial\lambda_m}\ln|\det\bar\partial|^2. \tag{4.4}
\]

Here $\det\bar\partial$ is the holomorphic determinant of the family of $\bar\partial$-operators (this holomorphic determinant can be considered as a nowhere vanishing holomorphic function on the Schottky space; see Theorem 3.4 [16] for precise definitions and an explicit formula for $|\det\bar\partial|^2$).

Therefore, the calculation of the modulus square of the Bergmann tau-function of the covering L reduces to the integration of the following system of equations for a real-valued function $S_{\mathrm{Schottky}}$:

\[
\frac{\partial S_{\mathrm{Schottky}}}{\partial\lambda_m} = \frac{1}{r_m(r_m-2)!}\left(\frac{d}{dx_m}\right)^{r_m-2}\{\omega, x_m\}\Big|_{x_m=0}, \qquad m = 1, \dots, M. \tag{4.5}
\]

In the following two subsections we solve, first, system (4.5) and, second, system (4.3).

4.1. THE DIRICHLET INTEGRAL AND THE SCHOTTKY UNIFORMIZATION

4.1.1. The Schottky Uniformization and the Flat Metric on Dissected Riemann Surface

The Schottky uniformization. We refer the reader to [18] for a brief review of Schottky groups and the Schottky uniformization theorem.

Fix some marking of the Riemann surface L (i.e. a point $x_0$ in L and some system of generators $\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g$ of the fundamental group $\pi_1(L, x_0)$ such that $\prod_{i=1}^{g}\alpha_i^{-1}\beta_i^{-1}\alpha_i\beta_i = 1$).

The marked surface L is biholomorphically equivalent to the quotient space D/H, where H is a normalized marked Schottky group, D ⊂ P¹ is its region of discontinuity. (A Schottky group is said to be marked if a relation-free system of generators $L_1, \dots, L_g$ is chosen in it. For the normalized Schottky group, $L_1(\omega) = k_1\omega$ with $0 < |k_1| < 1$ and the attracting fixed point of the transformation $L_2$ is 1.)

Choose a fundamental region $D_0$ for H in D. This is a region in P¹ bounded by 2g disjoint Jordan curves $c_1, \dots, c_g, c'_1, \dots, c'_g$ with $c'_i = -L_i(c_i)$, $i = 1, \dots, g$; the curves $c_i$ and $c'_i$ are oriented as the components of $\partial D_0$, the minus sign means the reverse orientation.

Let πH: D → L be the natural projection. Set $C_i = \pi_H(c_i)$. Denote by $L_{\mathrm{dissected}}$ the dissected surface $L \setminus \bigcup_{i=1}^{g} C_i$. The map $\pi_H: D_0 \to L_{\mathrm{dissected}}$ is invertible; denote the inverse map by $G_0$.


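4.1.2. The Flat Metric on $L_{\mathrm{dissected}}$

Let x be a local parameter on $L_{\mathrm{dissected}}$. Define a flat metric $e^{\phi(x,\bar x)}|dx|^2$ on $L_{\mathrm{dissected}}$ by

\[
e^{\phi(x,\bar x)}|dx|^2 = |d\omega|^2. \tag{4.6}
\]

Here $\omega \in D_0$, $\pi_H(\omega) = x$. Thus, to each local chart with local parameter x there corresponds a function $\phi(x,\bar x)$. We specify the function $\phi^{\mathrm{ext}}(\lambda,\bar\lambda)$ of the local parameter λ by

\[
e^{\phi^{\mathrm{ext}}(\lambda,\bar\lambda)}|d\lambda|^2 = |d\omega|^2 = |G_0'(\lambda)|^2|d\lambda|^2. \tag{4.7}
\]

Here ω ∈ D, πH(ω) = P ∈ L and p(P) = λ. Introduce also the functions $\phi^{\mathrm{int}}(x_m,\bar x_m)$, $m = 1, \dots, M$, and $\phi^{\infty}(\zeta_k,\bar\zeta_k)$, $k = 1, \dots, N$, corresponding to the local parameters $x_m$ near the ramification points $P_m$ and the local parameters $\zeta_k = 1/\lambda$ near the infinity of the kth sheet. In the intersections of the local charts we have

\[
e^{\phi^{\mathrm{int}}(x_m,\bar x_m)}|dx_m|^2 = e^{\phi^{\mathrm{ext}}(\lambda,\bar\lambda)}|d\lambda|^2 \tag{4.8}
\]

and

\[
e^{\phi^{\infty}(\zeta_k,\bar\zeta_k)}|d\zeta_k|^2 = e^{\phi^{\mathrm{ext}}(\lambda,\bar\lambda)}|d\lambda|^2. \tag{4.9}
\]

Choose an element L ∈ H and consider the fundamental region $D_1 = L(D_0)$. Introduce the map $G_1: L_{\mathrm{dissected}} \to D_1$ and the metric $e^{\phi_1(x,\bar x)}|dx|^2$ on $L_{\mathrm{dissected}}$ corresponding to this new choice of fundamental region. Since $G_1(x) = L(G_0(x))$, we have

\[
\phi_1(x,\bar x) = \phi(x,\bar x) + \ln|L'(G_0(x))|^2, \tag{4.10}
\]

\[
[\phi_1(x,\bar x)]_x = \phi_x(x,\bar x) + \frac{L''(G_0(x))}{L'(G_0(x))}\,G_0'(x) \tag{4.11}
\]

and

\[
[\phi_1(x,\bar x)]_{\bar x} = \phi_{\bar x}(x,\bar x) + \overline{\frac{L''(G_0(x))}{L'(G_0(x))}\,G_0'(x)}. \tag{4.12}
\]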
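Relations (4.10)–(4.12) follow from the chain rule applied to $G_1 = L \circ G_0$. The short derivation below is a sketch that writes $\bar x$ for the conjugate variable and uses only the definition (4.6) of the flat metric.

```latex
\begin{align*}
e^{\phi_1}|dx|^2 &= |dG_1|^2 = |L'(G_0(x))|^2\,|G_0'(x)|^2\,|dx|^2
                  = |L'(G_0(x))|^2\, e^{\phi}|dx|^2, \\
\phi_1 &= \phi + \ln L'(G_0(x)) + \ln\overline{L'(G_0(x))}, \\
% the conjugated logarithm is antiholomorphic in x, so only the first one contributes
\partial_x \phi_1 &= \phi_x + \frac{L''(G_0(x))}{L'(G_0(x))}\,G_0'(x), \\
\partial_{\bar x} \phi_1 &= \phi_{\bar x} + \overline{\frac{L''(G_0(x))}{L'(G_0(x))}\,G_0'(x)}.
\end{align*}
```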

The following statements are complete analogs of those from Section 3.1. Lemmas 7 and 8 are evident; to get Lemmas 9, 10 and Corollary 4 one only needs to change the map U: L ∋ x ↦ z ∈ L to the map G0: Ldissected ∋ x ↦ ω ∈ D0 in the proofs of the corresponding statements from Section 3.1. Since the map G0, similarly to the map U, depends on the branch points λ1, . . . , λM holomorphically, all the arguments from Section 3.1 can be applied in the present context.

LEMMA 7. The derivative of the function $\phi^{\mathrm{ext}}$ has the following asymptotics near the branch points and the infinities of the sheets:

(1) $|\phi^{\mathrm{ext}}_\lambda(\lambda,\bar\lambda)|^2 = ((1/r_m) - 1)^2\,|\lambda-\lambda_m|^{-2} + O(|\lambda-\lambda_m|^{-2+1/r_m})$ as $\lambda \to \lambda_m$,

(2) $|\phi^{\mathrm{ext}}_\lambda(\lambda,\bar\lambda)|^2 = 4|\lambda|^{-2} + O(|\lambda|^{-3})$ as $\lambda \to \infty$.
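The leading terms in Lemma 7 can be traced to the change of local parameter. The sketch below assumes the local parameter $x_m = (\lambda-\lambda_m)^{1/r_m}$ near a ramification point (consistent with the substitution $\lambda = \lambda_n + x_n^{r_n}$ used in Corollary 4) and relies only on the chart relations (4.8) and (4.9); it is a consistency check, not a replacement for the argument of Section 3.1.

```latex
% Near a branch point: (4.8) gives \phi^{ext} = \phi^{int} + \ln|dx_m/d\lambda|^2
\begin{align*}
\frac{dx_m}{d\lambda} &= \frac{1}{r_m}(\lambda-\lambda_m)^{1/r_m-1}, \\
\phi^{\mathrm{ext}}_\lambda &= \phi^{\mathrm{int}}_{x_m}\,\frac{dx_m}{d\lambda}
      + \Bigl(\frac{1}{r_m}-1\Bigr)\frac{1}{\lambda-\lambda_m}, \\
|\phi^{\mathrm{ext}}_\lambda|^2 &= \Bigl(\frac{1}{r_m}-1\Bigr)^2|\lambda-\lambda_m|^{-2}
      + O\bigl(|\lambda-\lambda_m|^{-2+1/r_m}\bigr).
\end{align*}
% Near infinity: \zeta = 1/\lambda, d\zeta/d\lambda = -\lambda^{-2}, and (4.9) gives
% \phi^{ext} = \phi^{\infty} + \ln|\lambda|^{-4}
\begin{align*}
\phi^{\mathrm{ext}}_\lambda &= -\lambda^{-2}\,\phi^{\infty}_{\zeta} - \frac{2}{\lambda}, &
|\phi^{\mathrm{ext}}_\lambda|^2 &= 4|\lambda|^{-2} + O(|\lambda|^{-3}).
\end{align*}
```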

Let x be a local coordinate on L. Set $R^{\omega,x} = \{\omega, x\}$, where ω ∈ D, πH(ω) = x.

LEMMA 8. (1) The Schwarzian derivative can be expressed as follows in terms of the function φ from (4.6):

\[
R^{\omega,x} = \phi_{xx} - \tfrac{1}{2}\phi_x^2. \tag{4.13}
\]

(2) In a neighborhood of a branch point $P_m$ there is the following relation between Schwarzian derivatives computed with respect to coordinates λ and $x_m$:

\[
R^{\omega,\lambda} = \frac{1}{r_m^2}(\lambda-\lambda_m)^{2/r_m-2}\,R^{\omega,x_m}
 + \Bigl(\frac{1}{2} - \frac{1}{2r_m^2}\Bigr)(\lambda-\lambda_m)^{-2}. \tag{4.14}
\]

(3) Let ζ be the coordinate in a neighborhood of the infinity of any sheet of the covering L, ζ = 1/λ. Then

\[
R^{\omega,\lambda} = \frac{R^{\omega,\zeta}}{\lambda^4} = O(|\lambda|^{-4}). \tag{4.15}
\]
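Both (4.13) and (4.14) can be checked directly. The following sketch uses $\phi = \ln|\omega'(x)|^2$ from (4.6) and the composition rule for the Schwarzian derivative; the choice $x_m = (\lambda-\lambda_m)^{1/r_m}$ is an assumption matching the substitution in Corollary 4.

```latex
% (4.13): \phi = \ln\omega'(x) + \ln\overline{\omega'(x)}
\begin{align*}
\phi_x &= \frac{\omega''}{\omega'}, \qquad
\phi_{xx} = \frac{\omega'''}{\omega'} - \Bigl(\frac{\omega''}{\omega'}\Bigr)^2, \\
\phi_{xx} - \tfrac12\phi_x^2 &= \frac{\omega'''}{\omega'} - \frac32\Bigl(\frac{\omega''}{\omega'}\Bigr)^2 = \{\omega, x\}.
\end{align*}
% (4.14): composition rule, then the Schwarzian of a power map
\begin{align*}
\{\omega,\lambda\} &= \{\omega,x_m\}\Bigl(\frac{dx_m}{d\lambda}\Bigr)^2 + \{x_m,\lambda\}, \\
\Bigl(\frac{dx_m}{d\lambda}\Bigr)^2 &= \frac{1}{r_m^2}(\lambda-\lambda_m)^{2/r_m-2}, \qquad
\{x_m,\lambda\} = \frac{1-1/r_m^2}{2}\,(\lambda-\lambda_m)^{-2}.
\end{align*}
% (4.15): \zeta = 1/\lambda is a Moebius map, so \{\zeta,\lambda\} = 0 and
% \{\omega,\lambda\} = \{\omega,\zeta\}(d\zeta/d\lambda)^2 = \{\omega,\zeta\}\lambda^{-4}.
```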

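LEMMA 9. The derivatives of the function φ with respect to λ are related to its derivatives with respect to the branch points as follows:

\[
\frac{\partial\phi}{\partial\lambda_m} + F_m\,\frac{\partial\phi}{\partial\lambda} + \frac{\partial F_m}{\partial\lambda} = 0, \tag{4.16}
\]

where

\[
F_m = -\frac{[G_0]_{\lambda_m}}{[G_0]_{\lambda}}. \tag{4.17}
\]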
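Identity (4.16) can be verified in a few lines from $\phi = \ln|G_0'(\lambda)|^2$ and the holomorphic dependence of $G_0$ on $\lambda_m$. In the sketch below a prime denotes $\partial/\partial\lambda$ and the subscript $;m$ denotes $\partial/\partial\lambda_m$, as in the notation $G_{;m}$ used later in the proof of Theorem 7.

```latex
\begin{align*}
\phi_{\lambda_m} &= \frac{G_{0;m}'}{G_0'}, \qquad
\phi_\lambda = \frac{G_0''}{G_0'}, \qquad
F_m = -\frac{G_{0;m}}{G_0'}, \\
F_m\,\phi_\lambda &= -\frac{G_{0;m}\,G_0''}{(G_0')^2}, \qquad
\frac{\partial F_m}{\partial\lambda} = -\frac{G_{0;m}'}{G_0'} + \frac{G_{0;m}\,G_0''}{(G_0')^2}, \\
\phi_{\lambda_m} + F_m\,\phi_\lambda + \frac{\partial F_m}{\partial\lambda} &= 0 .
\end{align*}
```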

LEMMA 10. Denote the composition p ∘ πH by R. Then

(1) The following relation holds:

\[
F_m = \frac{\partial R}{\partial\lambda_m}. \tag{4.18}
\]

(2) In a neighborhood of the point $\lambda_l$ the following asymptotics holds:

\[
F_m = \delta_{lm} + o(1), \tag{4.19}
\]

where $\delta_{lm}$ is the Kronecker symbol.

(3) At the infinity of each sheet the following asymptotics holds:

\[
F_m(\lambda) = O(|\lambda|^2). \tag{4.20}
\]
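Relation (4.18) is an instance of implicit differentiation. In the sketch below, $R(\omega;\lambda_1,\dots,\lambda_M) = p(\pi_H(\omega))$ is regarded as a function of a fixed point $\omega \in D_0$ and of the branch points; making the parameter dependence explicit in this way is an assumption of the sketch.

```latex
% G_0 inverts \pi_H on D_0, hence G_0(R(\omega;\lambda_1,\dots,\lambda_M);\lambda_1,\dots,\lambda_M) = \omega.
% Differentiate with respect to \lambda_m at fixed \omega:
\begin{align*}
[G_0]_{\lambda_m} + [G_0]_{\lambda}\,\frac{\partial R}{\partial\lambda_m} = 0
\quad\Longrightarrow\quad
\frac{\partial R}{\partial\lambda_m} = -\frac{[G_0]_{\lambda_m}}{[G_0]_{\lambda}} = F_m .
\end{align*}
```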


COROLLARY 4. Keep m fixed and define $F_n(x_n) \equiv F_m(\lambda_n + x_n^{r_n})$. Then

\[
F_n(0) = \delta_{nm}; \qquad \left(\frac{d}{dx_n}\right)^{k} F_n(0) = 0, \quad k = 1, \dots, r_n - 2.
\]

4.1.3. The Regularized Dirichlet Integral

Assume that the ramification points and the infinities of sheets do not belong to the cuts $C_i$.

To the kth sheet $L^{(k)}_{\mathrm{dissected}}$ of the dissected surface L (we should add some cuts connecting the branch points) there corresponds the function $\phi^{\mathrm{ext}}_k: L^{(k)}_{\mathrm{dissected}} \to \mathbb{R}$ which is smooth in any domain $\Omega^k_\rho$ of the form

\[
\Omega^k_\rho = \{\lambda \in L^{(k)}_{\mathrm{dissected}} : \forall m\ |\lambda-\lambda_m| > \rho \ \text{and}\ |\lambda| < 1/\rho\},
\]

where ρ > 0 and $\lambda_m$ are all the branch points from the kth sheet $L^{(k)}_{\mathrm{dissected}}$ of $L_{\mathrm{dissected}}$.

The function $\phi^{\mathrm{ext}}_k$ has finite limits at the cuts (except the endpoints which are the ramification points); at the ramification points and at the infinity it possesses the asymptotics listed in Lemma 7.

Introduce the regularized Dirichlet integral

\[
\int_{L_{\mathrm{dissected}}} |\phi_\lambda|^2\,dS.
\]

Namely, set

\[
Q_\rho = \sum_{k=1}^{N}\int_{\Omega^k_\rho} |\partial_\lambda\phi^{\mathrm{ext}}_k|^2\,dS, \tag{4.21}
\]

where dS is the area element on $\mathbb{C}^1$: $dS = |d\lambda\wedge d\bar\lambda|/2$.

According to Lemma 3 there exists the finite limit

\[
\mathrm{reg}\int_{L_{\mathrm{dissected}}} |\phi_\lambda|^2\,dS
 = \lim_{\rho\to 0}\Biggl(Q_\rho + \Bigl(4N + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Bigr)2\pi\ln\rho\Biggr). \tag{4.22}
\]
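The coefficient of $\ln\rho$ in (4.22) can be read off from the asymptotics of Lemma 7. The following count is a sketch: the auxiliary radii $\delta$ and $R_0$ are arbitrary fixed constants introduced here for illustration, and the remainder terms of Lemma 7 are integrable, so they do not contribute to the divergence.

```latex
% One ramification point P_m, seen on r_m sheets, annulus \rho < |\lambda-\lambda_m| < \delta:
\begin{align*}
r_m \int_{\rho<|\lambda-\lambda_m|<\delta}\Bigl(\frac{1}{r_m}-1\Bigr)^2\frac{dS}{|\lambda-\lambda_m|^{2}}
  &= r_m\Bigl(\frac{1}{r_m}-1\Bigr)^2 2\pi\ln\frac{\delta}{\rho}
   = \frac{(r_m-1)^2}{r_m}\,2\pi\ln\frac{\delta}{\rho}. \\
\intertext{Infinity of one sheet, annulus $R_0 < |\lambda| < 1/\rho$:}
\int_{R_0<|\lambda|<1/\rho}\frac{4\,dS}{|\lambda|^{2}} &= 8\pi\ln\frac{1}{\rho R_0}. \\
\intertext{Summing over the $M$ ramification points and the $N$ sheets:}
Q_\rho &= -\Bigl(4N + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Bigr)2\pi\ln\rho + O(1), \qquad \rho\to 0 .
\end{align*}
```

The added term in (4.22) is exactly the one that cancels this logarithmic divergence.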

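Now set

\[
\begin{aligned}
S_{\mathrm{Schottky}}(\lambda_1, \dots, \lambda_M) ={}& \frac{1}{2\pi}\,\mathrm{reg}\int_{L_{\mathrm{dissected}}} |\phi_\lambda|^2\,dS \\
&+ \frac{i}{4\pi}\sum_{k=2}^{g}\Biggl\{\int_{C_k}\phi(\lambda,\bar\lambda)\,\overline{\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)}\,d\bar\lambda
 - \int_{C_k}\phi(\lambda,\bar\lambda)\,\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)\,d\lambda \\
&\qquad + \int_{C_k}\ln|L_k'(G_0(\lambda))|^2\,\overline{\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)}\,d\bar\lambda\Biggr\}
 + 2\sum_{k=2}^{g}\ln|l_k|^2 .
\end{aligned}
\tag{4.23}
\]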

Here $L_k$ are generators of the Schottky group H, the orientation of the contours $C_k$ is defined by the orientation of the contours $c_k$ and the relations $C_k = \pi_H(c_k)$; the value of the function $\phi(\lambda,\bar\lambda)$ at the point $\lambda \in C_k$ is defined as the limit $\lim_{\mu\to\lambda}\phi(\mu,\bar\mu)$, where $\mu = \pi_H(\omega)$ and ω tends to the contour $c_k$ from the interior of the region $D_0$; $l_k$ is the left-hand lower element in the matrix representation of the transformation $L_k \in PSL(2,\mathbb{C})$. The summations at the right-hand side of (4.23) start from k = 2 due to the normalization condition for the group H (the terms with k = 1 are equal to zero).

Observe that the expression at the right-hand side of (4.23) is real and does not depend on small deformations of the cuts $C_k$ (i.e. on a specific choice of the fundamental region $D_0$). In particular, we can assume that the contours $C_k$ are $\{\lambda_1, \dots, \lambda_M\}$-independent. (To see this one should make a simple calculation based on (4.11), (4.12) and the Stokes theorem.) Thus all terms in this expression except the last one are rather natural. The role of the last term will become clear later.

The main result of this section is the following theorem.

THEOREM 7. For any m = 1, . . . , M the following equality holds:

\[
\frac{\partial S_{\mathrm{Schottky}}(\lambda_1, \dots, \lambda_M)}{\partial\lambda_m}
 = \frac{1}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{\omega,x_m}\Big|_{x_m=0}. \tag{4.24}
\]

Remark 6. This result seems to be very similar to Theorem 1 from [18]. However, we would like to emphasize that, in contrast to [18], we deal here with the Dirichlet integral corresponding to a flat metric. Thus, the following proof does not explicitly use the Teichmüller theory and, therefore, is more elementary than the proof of an analogous result in [18].

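Proof. Set

\[
\begin{aligned}
S_\rho = Q_\rho + \frac{i}{2}\sum_{k=2}^{g}\Biggl\{&\int_{C_k}\phi(\lambda,\bar\lambda)\,\overline{\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)}\,d\bar\lambda
 - \int_{C_k}\phi(\lambda,\bar\lambda)\,\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)\,d\lambda \\
&+ \int_{C_k}\ln|L_k'(G_0(\lambda))|^2\,\overline{\frac{L_k''(G_0(\lambda))}{L_k'(G_0(\lambda))}\,G_0'(\lambda)}\,d\bar\lambda\Biggr\}.
\end{aligned}
\tag{4.25}
\]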


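We recall that the contours $C_k$ are assumed to be $\{\lambda_1, \dots, \lambda_M\}$-independent. From now on we write G(λ) and φ instead of $G_0(\lambda)$ and $\phi^{\mathrm{ext}}$. Since $\phi_{\lambda\bar\lambda} = 0$, we have $|\phi_\lambda|^2 = (\phi_\lambda\phi)_{\bar\lambda}$. The Stokes theorem and the formulas (4.10), (4.11) give

\[
\begin{aligned}
Q_\rho ={}& -\frac{i}{2}\Biggl[\sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}\phi_\lambda\phi\,d\lambda
 + \sum_{k=1}^{N}\oint_{|\lambda^{(k)}|=1/\rho}\phi_\lambda\phi\,d\lambda\Biggr] \\
&- \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\Bigl\{\phi_\lambda\phi - \Bigl[\phi_\lambda + \frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\Bigr]\bigl[\phi + \ln|L_k'(G(\lambda))|^2\bigr]\Bigr\}\,d\lambda .
\end{aligned}
\tag{4.26}
\]

Here $\lambda^{(k)}$ denotes the point on the kth sheet of the covering L whose projection to P¹ is λ.

Denote the first term in (4.26) by $-\frac{i}{2}[T_\rho]$. Substituting (4.26) into (4.25) and using the equalities $\int_{C_k} d[\phi(\lambda,\bar\lambda)\ln|L_k'(G(\lambda))|^2] = 0$ and $\int_{C_k} d[\ln^2|L_k'(G(\lambda))|^2] = 0$, we get

\[
S_\rho = -\frac{i}{2}[T_\rho]
 - \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\phi_{\bar\lambda}(\lambda,\bar\lambda)\,\ln|L_k'(G(\lambda))|^2\,d\bar\lambda
 - \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\phi(\lambda,\bar\lambda)\,\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\,d\lambda . \tag{4.27}
\]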

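LEMMA 11. For the first term in (4.27) we have the asymptotics

\[
\begin{aligned}
-\frac{i}{2}\,\partial_{\lambda_m}[T_\rho] ={}& \frac{2\pi}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{\omega,x_m}\Big|_{x_m=0} \\
&+ \frac{i}{2}\sum_{k=1}^{g}\int_{C_k\cup C_k^-}\bigl\{F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2) + [F_m]_\lambda\phi_\lambda\bigr\}\,d\lambda + o(1),
\end{aligned}
\tag{4.28}
\]

as ρ → 0. Here $C_k^-$ is the contour $C_k$ provided with the reverse orientation; the value of the integrand at a point $\lambda \in C_k^-$ is understood as the limit as µ → λ, where $\mu = \pi_H(\omega)$ and ω tends to $c'_k$ from the interior of the region $D_0$; the function $F_m$ is from Lemma 9.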

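Proof. Using Lemma 9, we get

\[
\begin{aligned}
\partial_{\lambda_m}\sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}\phi_\lambda\phi\,d\lambda
={}& \sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m|=\rho}(\phi_\lambda^2 + \phi\phi_{\lambda\lambda})\,d\lambda \\
&- \sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}\bigl[(F_m\phi_\lambda + [F_m]_\lambda)\phi_\lambda + \phi\,([F_m]_\lambda\phi_\lambda + F_m\phi_{\lambda\lambda} + [F_m]_{\lambda\lambda})\bigr]\,d\lambda \\
={}& -\sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m|=\rho}|\phi_\lambda|^2\,d\bar\lambda
 + \sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}\bigl(F_m|\phi_\lambda|^2\,d\bar\lambda + \phi_{\bar\lambda}[F_m]_\lambda\,d\bar\lambda\bigr).
\end{aligned}
\tag{4.29}
\]

For the integrals around the infinities we have the equality

\[
\partial_{\lambda_m}\sum_{k=1}^{N}\oint_{|\lambda^{(k)}|=1/\rho}\phi_\lambda\phi\,d\lambda
 = \sum_{k=1}^{N}\oint_{|\lambda^{(k)}|=1/\rho}\bigl(F_m|\phi_\lambda|^2\,d\bar\lambda + \phi_{\bar\lambda}[F_m]_\lambda\,d\bar\lambda\bigr). \tag{4.30}
\]

Applying the Cauchy theorem to the (holomorphic) function $[F_m]_\lambda\phi_\lambda$, we get

\[
\sum_{k=1}^{g}\int_{C_k\cup C_k^-}[F_m]_\lambda\phi_\lambda\,d\lambda
 = -\Biggl(\sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho} + \sum_{k=1}^{N}\oint_{|\lambda^{(k)}|=1/\rho}\Biggr)[F_m]_\lambda\phi_\lambda\,d\lambda. \tag{4.31}
\]

By (4.29), (4.30) and (4.31)

\[
\begin{aligned}
-\frac{i}{2}\,\partial_{\lambda_m}[T_\rho] ={}& \frac{i}{2}\sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m|=\rho}|\phi_\lambda|^2\,d\bar\lambda \\
&- \frac{i}{2}\Biggl\{\Biggl(\sum_{n=1}^{M}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho} + \sum_{k=1}^{N}\oint_{|\lambda^{(k)}|=1/\rho}\Biggr)
 \bigl(F_m|\phi_\lambda|^2\,d\bar\lambda - [F_m]_\lambda\phi_\lambda\,d\lambda + [F_m]_\lambda\phi_{\bar\lambda}\,d\bar\lambda\bigr)\Biggr\} \\
&+ \frac{i}{2}\sum_{k=1}^{g}\int_{C_k\cup C_k^-}[F_m]_\lambda\phi_\lambda\,d\lambda .
\end{aligned}
\tag{4.32}
\]

Denote the expression in the large braces by $H_2$. We claim that

\[
-\frac{i}{2}H_2 = -\frac{i}{2}\Biggl(\sum_{l=1}^{r_m}\oint_{|\lambda^{(l)}-\lambda_m|=\rho}|\phi_\lambda|^2\,d\bar\lambda
 - 2\pi i\sum_{n=1}^{M}\frac{1}{(r_n-1)!}\Bigl(1 - \frac{1}{r_n^2}\Bigr)F_n^{(r_n)}(0)\Biggr) + o(1), \tag{4.33}
\]

where the function $F_n$ is from Corollary 4.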


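To prove this we set

\[
I_1^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}F_m|\phi_\lambda|^2\,d\bar\lambda; \qquad
I_2^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}[F_m]_\lambda\phi_\lambda\,d\lambda; \qquad
I_3^n(\rho) = \sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}[F_m]_\lambda\phi_{\bar\lambda}\,d\bar\lambda .
\]

By Corollary 4 we have

\[
\begin{aligned}
I_1^n(\rho) ={}& \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}|\phi_\lambda|^2\,d\bar\lambda \\
&+ \oint_{|x_n|=\rho^{1/r_n}}\Bigl[\frac{1}{(r_n-1)!}F_n^{(r_n-1)}(0)\,x_n^{r_n-1} + \frac{1}{r_n!}F_n^{(r_n)}(0)\,x_n^{r_n} + O(|x_n|^{r_n+1})\Bigr] \\
&\quad\times\Biggl(\frac{|\phi^{\mathrm{int}}_{x_n}|^2}{r_n^2\,x_n^{r_n-1}\bar x_n^{r_n-1}}
 + \frac{1-r_n}{r_n^2}\,\frac{\phi^{\mathrm{int}}_{x_n}}{x_n^{r_n-1}\bar x_n^{r_n}}
 + \frac{1-r_n}{r_n^2}\,\frac{\phi^{\mathrm{int}}_{\bar x_n}}{x_n^{r_n}\bar x_n^{r_n-1}}
 + \Bigl(\frac{1}{r_n}-1\Bigr)^2\frac{1}{x_n^{r_n}\bar x_n^{r_n}}\Biggr)\,r_n\bar x_n^{r_n-1}\,d\bar x_n \\
={}& \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}|\phi_\lambda|^2\,d\bar\lambda
 + 2\pi i\,\frac{(1/r_n-1)^2}{(r_n-1)!}\,F_n^{(r_n)}(0)
 + 2\pi i\,\frac{1-r_n}{r_n(r_n-1)!}\,F_n^{(r_n-1)}(0)\,\phi^{\mathrm{int}}_{x_n}(0) + o(1)
\end{aligned}
\]

as ρ → 0. We get also

\[
I_2^n(\rho) = -2\pi i\Bigl(\frac{1}{r_n}-1\Bigr)\frac{1}{(r_n-1)!}\,F_n^{(r_n)}(0)
 - 2\pi i\,\frac{1}{r_n(r_n-2)!}\,\phi^{\mathrm{int}}_{x_n}(0)\,F_n^{(r_n-1)}(0) + o(1)
\]

and

\[
I_3^n(\rho) = 2\pi i\,\frac{(1/r_n-1)}{(r_n-1)!}\,F_n^{(r_n)}(0) + o(1).
\]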


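We note that

\[
\begin{aligned}
I_1^n - I_2^n + I_3^n ={}& \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}|\phi_\lambda|^2\,d\bar\lambda
 + \frac{2\pi i}{(r_n-1)!}\,F_n^{(r_n)}(0)\Bigl[\Bigl(\frac{1}{r_n}-1\Bigr)^2 + 2\Bigl(\frac{1}{r_n}-1\Bigr)\Bigr] + o(1) \\
={}& \delta_{nm}\sum_{p=1}^{r_n}\oint_{|\lambda^{(p)}-\lambda_n|=\rho}|\phi_\lambda|^2\,d\bar\lambda
 - \frac{2\pi i}{(r_n-1)!}\Bigl(1 - \frac{1}{r_n^2}\Bigr)F_n^{(r_n)}(0) + o(1).
\end{aligned}
\]

It is easy to verify that

\[
\sum_{k=1}^{N}\Biggl(\oint_{|\lambda^{(k)}|=1/\rho}F_m|\phi_\lambda|^2\,d\bar\lambda
 - \oint_{|\lambda^{(k)}|=1/\rho}[F_m]_\lambda\phi_\lambda\,d\lambda
 + \oint_{|\lambda^{(k)}|=1/\rho}[F_m]_\lambda\phi_{\bar\lambda}\,d\bar\lambda\Biggr) = o(1),
\]

so we get (4.33).

The function $F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2)$ is holomorphic outside of the ramification points, the infinities and the cuts. Applying to it the Cauchy theorem and making use of Lemma 8 and the asymptotics from Lemma 10, we get the equality

\[
\begin{aligned}
2\pi i\sum_{n=1}^{M}\frac{1}{(r_n-1)!}\Bigl(1 - \frac{1}{r_n^2}\Bigr)F_n^{(r_n)}(0)
={}& -\frac{4\pi i}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{\omega,x_m}(x_m)\Big|_{x_m=0} \\
&+ \sum_{k=1}^{g}\int_{C_k\cup C_k^-}\bigl\{F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2)\bigr\}\,d\lambda .
\end{aligned}
\tag{4.34}
\]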

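Summarizing (4.32), (4.33) and (4.34), we get (4.28). ✷

Now we shall differentiate with respect to $\lambda_m$ the remaining terms in (4.27). Denote by $L_{k;m}$, $G_{;m}$ the derivatives $\partial L_k/\partial\lambda_m$, $\partial G/\partial\lambda_m$. Since $\phi_{\bar\lambda}$ is antiholomorphic with respect to $\lambda_m$, we have $[\phi_{\bar\lambda}]_{\lambda_m} = 0$. Thus,

\[
\begin{aligned}
\partial_{\lambda_m}\Biggl[&-\frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\phi_{\bar\lambda}(\lambda,\bar\lambda)\,\ln|L_k'(G(\lambda))|^2\,d\bar\lambda
 - \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\phi(\lambda,\bar\lambda)\,\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\,d\lambda\Biggr] \\
={}& \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\phi_\lambda\,\frac{L'_{k;m}(G(\lambda)) + L_k''(G(\lambda))\,G_{;m}(\lambda)}{L_k'(G(\lambda))}\,d\lambda \\
&+ \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\bigl(F_m\phi_\lambda + [F_m]_\lambda\bigr)\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\,d\lambda .
\end{aligned}
\tag{4.35}
\]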

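(We have used the equality

\[
\phi_{\bar\lambda}\,\frac{\partial}{\partial\lambda_m}\ln|L_k'(G(\lambda))|^2\,d\bar\lambda
 + \phi\,\frac{\partial^2}{\partial\lambda\,\partial\lambda_m}\ln|L_k'(G(\lambda))|^2\,d\lambda
 = d\Bigl(\phi\,\frac{\partial}{\partial\lambda_m}\ln|L_k'(G(\lambda))|^2\Bigr)
 - \phi_\lambda\,\frac{\partial}{\partial\lambda_m}\ln|L_k'(G(\lambda))|^2\,d\lambda
\]

and Lemma 9.)

To finish the proof we have to rewrite the last term at the right-hand side of (4.28) as follows:

\[
\begin{aligned}
\frac{i}{2}\int_{C_k\cup C_k^-}&\bigl\{F_m(2\phi_{\lambda\lambda} - \phi_\lambda^2) + [F_m]_\lambda\phi_\lambda\bigr\}\,d\lambda
 = \frac{i}{2}\int_{C_k\cup C_k^-}\phi_\lambda\phi_{\lambda_m}\,d\lambda \\
={}& \frac{i}{2}\int_{C_k}\Bigl\{\phi_\lambda\phi_{\lambda_m}
 - \Bigl(\phi_\lambda + \frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\Bigr)
 \Bigl(\phi_{\lambda_m} + \frac{L'_{k;m}(G(\lambda)) + L_k''(G(\lambda))\,G_{;m}(\lambda)}{L_k'(G(\lambda))}\Bigr)\Bigr\}\,d\lambda \\
={}& -\frac{i}{2}\int_{C_k}\Bigl[\phi_\lambda\,\frac{L'_{k;m}(G(\lambda)) + L_k''(G(\lambda))\,G_{;m}(\lambda)}{L_k'(G(\lambda))}
 + \phi_{\lambda_m}\,\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda) \\
&\qquad + \frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'(\lambda)\,
 \frac{L'_{k;m}(G(\lambda)) + L_k''(G(\lambda))\,G_{;m}(\lambda)}{L_k'(G(\lambda))}\Bigr]\,d\lambda .
\end{aligned}
\tag{4.36}
\]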

Collecting (4.27), (4.28), (4.35) and (4.36) and using the equality

\[
\phi_{\lambda_m} = \frac{G'_{;m}(\lambda)}{G'(\lambda)},
\]

we get

\[
\begin{aligned}
\frac{\partial S_\rho}{\partial\lambda_m} + o(1) ={}& \frac{2\pi}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{\omega,x_m}\Big|_{x_m=0}
 - \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\frac{L_k''(G(\lambda))\,L'_{k;m}(G(\lambda))}{[L_k'(G(\lambda))]^2}\,G'(\lambda)\,d\lambda \\
&- \frac{i}{2}\sum_{k=2}^{g}\int_{C_k}\Bigl[\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\Bigr]^2 G'(\lambda)\,G_{;m}(\lambda)\,d\lambda
 - i\sum_{k=2}^{g}\int_{C_k}\frac{L_k''(G(\lambda))}{L_k'(G(\lambda))}\,G'_{;m}(\lambda)\,d\lambda .
\end{aligned}
\tag{4.37}
\]

Since $\{L_k(\omega), \omega\} \equiv 0$, the last two terms in (4.37) cancel (one should first integrate the last term by parts). For the second term we have the equality ([18]):

\[
-\frac{i}{2}\int_{C_k}\frac{L_k''(G(\lambda))\,L'_{k;m}(G(\lambda))}{[L_k'(G(\lambda))]^2}\,G'(\lambda)\,d\lambda = -4\pi\,\frac{l_{k;m}}{l_k}.
\]

To prove Theorem 7 it is sufficient to observe that the term o(1) in (4.37) is uniform with respect to parameters $(\lambda_1, \dots, \lambda_M)$ belonging to a compact neighborhood of the initial point $(\lambda_1^0, \dots, \lambda_M^0)$. ✷
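The cancellation of the last two terms in (4.37) can be made explicit. The sketch below writes $G$ for $G_0$, assumes that the integrated term vanishes because $C_k$ is a closed contour, and uses the fact that a vanishing Schwarzian derivative of the Möbius map $L_k$ is equivalent to $(L_k''/L_k')' = \tfrac12 (L_k''/L_k')^2$.

```latex
\begin{align*}
-i\int_{C_k}\frac{L_k''(G)}{L_k'(G)}\,G'_{;m}\,d\lambda
 &= i\int_{C_k} G_{;m}\,\frac{d}{d\lambda}\Bigl[\frac{L_k''(G)}{L_k'(G)}\Bigr]d\lambda
   && \text{(integration by parts)} \\
 &= i\int_{C_k} G_{;m}\,\Bigl(\frac{L_k''}{L_k'}\Bigr)'(G)\,G'\,d\lambda \\
 &= \frac{i}{2}\int_{C_k}\Bigl[\frac{L_k''(G)}{L_k'(G)}\Bigr]^2 G'\,G_{;m}\,d\lambda
   && \text{(since } \{L_k(\omega),\omega\}\equiv 0\text{)} .
\end{align*}
% This is the negative of the third term on the right-hand side of (4.37),
% so the two contributions cancel for every k.
```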

4.2. THE LIOUVILLE ACTION AND THE FUCHSIAN UNIFORMIZATION

4.2.1. The Metric of Constant Curvature −1 on L and its Dependence upon the Branch Points

The covering L is biholomorphically equivalent to the quotient space ℍ/Γ, where ℍ = {z ∈ C : ℑz > 0} and Γ is a strictly hyperbolic Fuchsian group. Denote by $\pi_\Gamma$: ℍ → L the natural projection. Let x be a local parameter on L; introduce the metric $e^{\chi(x,\bar x)}|dx|^2$ of constant curvature −1 on L by the equality

\[
e^{\chi(x,\bar x)}|dx|^2 = \frac{|dz|^2}{|\Im z|^2}, \tag{4.38}
\]

where z ∈ ℍ, $\pi_\Gamma(z) = x$. As usual, we specify the functions $\chi^{\mathrm{ext}}(\lambda,\bar\lambda)$, $\chi^{\mathrm{int}}(x_m,\bar x_m)$, $m = 1, \dots, M$, and $\chi^{\infty}(\zeta_k,\bar\zeta_k)$, $k = 1, \dots, N$, setting x = λ, x = $x_m$ and x = $\zeta_k$ in (4.38).

Set $R^{z,x} = \{z, x\}$, where z ∈ ℍ, $\pi_\Gamma(z) = x$. Clearly, Lemmas 7 and 8 still stand with $\chi^{\mathrm{ext}}$, $R^{z,x}$ instead of $\phi^{\mathrm{ext}}$ and $R^{\omega,x}$, whereas Lemma 9 should be reconsidered, since the Fuchsian uniformization map depends upon the branch points nonholomorphically.

Introduce the metric $e^{\psi(\omega,\bar\omega)}|d\omega|^2$ of constant curvature −1 on $D_0$ (see the previous section) by the equation

\[
e^{\psi(\omega,\bar\omega)}|d\omega|^2 = \frac{|dz|^2}{|\Im z|^2},
\]

where $\pi_H(\omega) = \pi_\Gamma(z)$. Then there is the following relation between the derivatives of the function ψ:

\[
\psi_{\lambda_m}(\omega,\bar\omega) + \psi_\omega(\omega,\bar\omega)\,\mathcal{F}_m(\omega,\bar\omega) + [\mathcal{F}_m]_\omega(\omega,\bar\omega) = 0, \tag{4.39}
\]

where $\mathcal{F}_m$ is a continuously differentiable function on $D_0$ (the proof of (4.39) is parallel to the one in [18]).

We shall now prove the analog of (4.39) and Lemma 9 for the function $\chi = \chi^{\mathrm{ext}}$.


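LEMMA 12. There is the following relation between the derivatives of the function χ:

\[
\frac{\partial\chi(\lambda,\bar\lambda)}{\partial\lambda_m} + \widetilde{F}_m(\lambda,\bar\lambda)\,\frac{\partial\chi(\lambda,\bar\lambda)}{\partial\lambda} + \frac{\partial\widetilde{F}_m(\lambda,\bar\lambda)}{\partial\lambda} = 0, \tag{4.40}
\]

where

\[
\widetilde{F}_m(\lambda,\bar\lambda) = \mathcal{F}_m\bigl(G_0(\lambda),\overline{G_0(\lambda)}\bigr)\,\frac{1}{G_0'(\lambda)} + F_m(\lambda). \tag{4.41}
\]

Here $F_m = -[G_0]_{\lambda_m}/[G_0]_\lambda$ is the function from Lemma 9, and $\mathcal{F}_m$ is the function from (4.39).

Proof. Since

\[
e^{\chi(\lambda,\bar\lambda)}|d\lambda|^2 = e^{\psi(G_0(\lambda),\overline{G_0(\lambda)})}\,|G_0'(\lambda)|^2\,|d\lambda|^2,
\]

we have the equality

\[
\chi(\lambda,\bar\lambda) = \psi\bigl(G_0(\lambda),\overline{G_0(\lambda)}\bigr) + \phi(\lambda,\bar\lambda), \tag{4.42}
\]

where $\phi(\lambda,\bar\lambda) = \ln|G_0'(\lambda)|^2$ is the function from (4.7). Differentiating (4.42) with respect to $\lambda_m$ via formulas (4.39) and (4.16), after some easy calculations we get (4.40). ✷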
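The "easy calculations" in the proof of Lemma 12 can be spelled out as follows. The labels are chosen here for readability and are an assumption of this sketch: $F_m$ is the function of Lemma 9, $\mathcal{F}_m$ is the function entering (4.39), $\widetilde{F}_m = \mathcal{F}_m/G' + F_m$ is the combination (4.41), $G = G_0$, a prime is $\partial/\partial\lambda$, and $\psi$, $\mathcal{F}_m$ are evaluated at $\omega = G(\lambda)$.

```latex
\begin{align*}
\chi_{\lambda_m} &= \psi_{\lambda_m} + \psi_\omega\,G_{;m} + \phi_{\lambda_m}
   = -\psi_\omega\mathcal{F}_m - [\mathcal{F}_m]_\omega + \psi_\omega\,G_{;m} - F_m\phi_\lambda - [F_m]_\lambda, \\
\chi_\lambda &= \psi_\omega\,G' + \phi_\lambda, \qquad \phi_\lambda = \frac{G''}{G'}, \qquad F_m\,G' = -G_{;m}, \\
\widetilde{F}_m\,\chi_\lambda &= \psi_\omega\mathcal{F}_m + \frac{\mathcal{F}_m\,G''}{(G')^2} - \psi_\omega\,G_{;m} + F_m\phi_\lambda, \\
\partial_\lambda\widetilde{F}_m &= [\mathcal{F}_m]_\omega - \frac{\mathcal{F}_m\,G''}{(G')^2} + [F_m]_\lambda, \\
\chi_{\lambda_m} + \widetilde{F}_m\,\chi_\lambda + \partial_\lambda\widetilde{F}_m &= 0 .
\end{align*}
```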

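Remark 7. Observe that the function $\widetilde{F}_m$ defined by (4.41) does not have jumps at the cycles $C_k$, whereas both terms at the right-hand side of (4.41) do. This immediately follows from the formulas

\[
F_m^-(\lambda) = F_m^+(\lambda) - \frac{L_{k;m}(G_0^+(\lambda))}{L_k'(G_0^+(\lambda))\,[G_0^+]_\lambda(\lambda)}, \qquad
[G_0^-]_\lambda(\lambda) = L_k'(G_0^+(\lambda))\,[G_0^+]_\lambda(\lambda)
\]

and the formula from [18] for the function $\mathcal{F}_m$ of (4.39):

\[
\mathcal{F}_m \circ L_k = \mathcal{F}_m\,L_k' + L_{k;m}.
\]

Here the indices + and − denote the limit values of the corresponding functions at the '$c_k$' and the '$c'_k$' sides of the cycle $C_k$.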
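The statement of Remark 7 can be checked in two lines. The sketch below uses labels chosen here for readability: $F_m$ for the function of Lemma 9, $\mathcal{F}_m$ for the function of (4.39), and $G^\pm = G_0^\pm$ for the boundary values on the two sides of $C_k$, which are related by $G^- = L_k \circ G^+$.

```latex
\begin{align*}
G^-_{;m} &= L_{k;m}(G^+) + L_k'(G^+)\,G^+_{;m}, \qquad [G^-]_\lambda = L_k'(G^+)\,[G^+]_\lambda, \\
F_m^- &= -\frac{G^-_{;m}}{[G^-]_\lambda}
       = F_m^+ - \frac{L_{k;m}(G^+)}{L_k'(G^+)\,[G^+]_\lambda}, \\
\frac{\mathcal{F}_m(G^-)}{[G^-]_\lambda}
      &= \frac{\mathcal{F}_m(G^+)\,L_k'(G^+) + L_{k;m}(G^+)}{L_k'(G^+)\,[G^+]_\lambda}
       = \frac{\mathcal{F}_m(G^+)}{[G^+]_\lambda} + \frac{L_{k;m}(G^+)}{L_k'(G^+)\,[G^+]_\lambda}.
\end{align*}
% The jumps of the two terms in (4.41) are opposite, so their sum is continuous across C_k.
```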

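LEMMA 13. Fix a number m = 1, . . . , M. Then for any n = 1, . . . , M the following asymptotics holds for the function $\widetilde{F}_m$ from (4.41):

\[
\widetilde{F}_m(\lambda_n + x_n^{r_n},\ \bar\lambda_n + \bar x_n^{r_n})
 = \delta_{mn} + a_n x_n^{r_n-1} + b_n \bar x_n x_n^{r_n-1} + c_n x_n^{r_n} + O(|x_n|^{r_n+1}) \tag{4.43}
\]

as $x_n \to 0$; here $a_n$, $b_n$, $c_n$ are some complex constants.

At the infinity of the kth sheet of the covering L there is the asymptotics

\[
\widetilde{F}_m(\lambda,\bar\lambda) = A_k\lambda^2 + B_k\lambda + C_k\lambda^2\bar\lambda^{-1} + O(1) \tag{4.44}
\]

as $\lambda \to \infty^{(k)}$; here $\infty^{(k)}$ is the point at infinity of the kth sheet of the covering L; $A_k$, $B_k$, $C_k$ are some complex constants.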


Proof. This follows from Corollary 4, asymptotics (4.20) and formula (4.41). ✷

4.2.2. The Regularized Liouville Action

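Here we define the regularized integral

\[
\mathrm{reg}\int_{L}\bigl(|\chi_\lambda|^2 + e^{\chi}\bigr)\,dS
\]

and calculate its derivatives with respect to the branch points $\lambda_m$.

Set $Q^k_\rho = \{\lambda \in L^{(k)} : \forall m\ |\lambda-\lambda_m| > \rho \ \text{and}\ |\lambda| < 1/\rho\}$, where $P_m$ are all the ramification points which belong to the kth sheet $L^{(k)}$ of the covering L. To the sheet $L^{(k)}$ there corresponds the function $\chi^{\mathrm{ext}}_k: L^{(k)} \to \mathbb{R}$ which is smooth in any domain $Q^k_\rho$, ρ > 0.

The function $\chi^{\mathrm{ext}}_k$ has finite limits at the cuts (except the endpoints which are the ramification points); at the ramification points and at the infinity it possesses the same asymptotics as the function $\phi^{\mathrm{ext}}_k$ from the previous section.

Observe also that the function $e^{\chi^{\mathrm{ext}}_k}$ is integrable on $L^{(k)}$. Set

\[
T_\rho = \sum_{k=1}^{N}\int_{Q^k_\rho}|\partial_\lambda\chi^{\mathrm{ext}}_k|^2\,dS. \tag{4.45}
\]

Then there exists the finite limit

\[
\mathrm{reg}\int_{L}\bigl(|\chi_\lambda|^2 + e^{\chi}\bigr)\,dS
 = \lim_{\rho\to 0}\Biggl(T_\rho + \sum_{k=1}^{N}\int_{L^{(k)}} e^{\chi^{\mathrm{ext}}_k}\,dS
 + \Bigl(4N + \sum_{m=1}^{M}\frac{(r_m-1)^2}{r_m}\Bigr)2\pi\ln\rho\Biggr). \tag{4.46}
\]

Set

\[
S_{\mathrm{Fuchs}}(\lambda_1, \dots, \lambda_M)
 = \frac{1}{2\pi}\,\mathrm{reg}\int_{L}\bigl(|\chi_\lambda|^2 + e^{\chi}\bigr)\,dS
 + \sum_{n=1}^{M}(r_n-1)\,\chi^{\mathrm{int}}(x_n)\big|_{x_n=0}
 - 2\sum_{k=1}^{N}\chi^{\infty}(\zeta_k)\big|_{\zeta_k=0}. \tag{4.47}
\]

Now we state the main result of this section.

THEOREM 8. For any m = 1, . . . , M the following equality holds:

\[
\frac{\partial S_{\mathrm{Fuchs}}(\lambda_1, \dots, \lambda_M)}{\partial\lambda_m}
 = \frac{1}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{z,x_m}\Big|_{x_m=0}. \tag{4.48}
\]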


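Proof. Set $Q_\rho = \bigcup_{k=1}^{N} Q^k_\rho$ and write $\widetilde{F}_m$ for the function (4.41). Then

\[
\partial_{\lambda_m}T_\rho = \frac{i}{2}\sum_{k=1}^{r_m}\oint_{|\lambda^{(k)}-\lambda_m|=\rho}|\partial_\lambda\chi|^2\,d\bar\lambda
 + \int_{Q_\rho}\partial_{\lambda_m}|\partial_\lambda\chi|^2\,dS. \tag{4.49}
\]

By (4.40) the last term in (4.49) can be rewritten as

\[
\begin{aligned}
\int_{Q_\rho}\partial_{\lambda_m}|\partial_\lambda\chi|^2\,dS
={}& \int_{Q_\rho}\Bigl(\bigl((2\chi_{\lambda\lambda} - \chi_\lambda^2)\widetilde{F}_m\bigr)_{\bar\lambda}
 - 2\bigl(\chi_\lambda[\widetilde{F}_m]_{\bar\lambda}\bigr)_\lambda
 + \bigl(\chi_\lambda[\widetilde{F}_m]_\lambda\bigr)_{\bar\lambda}
 - \bigl(\chi_{\bar\lambda}[\widetilde{F}_m]_\lambda\bigr)_\lambda
 - \bigl(|\chi_\lambda|^2\widetilde{F}_m\bigr)_\lambda\Bigr)\,dS \\
={}& -\frac{i}{2}\int_{\partial Q_\rho}(2\chi_{\lambda\lambda} - \chi_\lambda^2)\widetilde{F}_m\,d\lambda
 + 2\chi_\lambda[\widetilde{F}_m]_{\bar\lambda}\,d\bar\lambda
 + \chi_\lambda[\widetilde{F}_m]_\lambda\,d\lambda
 + \chi_{\bar\lambda}[\widetilde{F}_m]_\lambda\,d\bar\lambda
 + |\chi_\lambda|^2\widetilde{F}_m\,d\bar\lambda \\
={}& -\frac{i}{2}\sum_{n=1}^{M}\bigl[I_1^n + 2I_2^n + I_3^n + I_4^n + I_5^n\bigr]
 - \frac{i}{2}\sum_{k=1}^{N}\bigl[J_1^{\infty,k} + J_2^{\infty,k} + J_3^{\infty,k} + J_4^{\infty,k} + J_5^{\infty,k}\bigr],
\end{aligned}
\tag{4.50}
\]

where

\[
I_1^n = \sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}(2\chi_{\lambda\lambda} - \chi_\lambda^2)\widetilde{F}_m\,d\lambda, \qquad
J_1^{\infty,k} = \oint_{|\lambda^{(k)}|=1/\rho}(2\chi_{\lambda\lambda} - \chi_\lambda^2)\widetilde{F}_m\,d\lambda
\]

and the terms $I_p^n$ and $J_p^{\infty,k}$, p = 2, 3, 4, 5, are the similar sums of integrals and integrals with integrands $\chi_\lambda[\widetilde{F}_m]_{\bar\lambda}\,d\bar\lambda$, $\chi_\lambda[\widetilde{F}_m]_\lambda\,d\lambda$, $\chi_{\bar\lambda}[\widetilde{F}_m]_\lambda\,d\bar\lambda$ and $|\chi_\lambda|^2\widetilde{F}_m\,d\bar\lambda$ respectively. It should be noted that the circles $|\lambda-\lambda_n| = \rho$ are clockwise oriented whereas the circles $|\lambda| = 1/\rho$ are counter-clockwise oriented. Using (4.43), we get

\[
\begin{aligned}
I_1^n ={}& \oint_{|x_n|=\rho^{1/r_n}}\Bigl[\frac{2R^{z,x_n}(x_n)}{r_n^2\,x_n^{2r_n-2}} + \Bigl(1 - \frac{1}{r_n^2}\Bigr)\frac{1}{x_n^{2r_n}}\Bigr] \\
&\quad\times\bigl(\delta_{mn} + a_n x_n^{r_n-1} + b_n\bar x_n x_n^{r_n-1} + c_n x_n^{r_n} + O(|x_n|^{r_n+1})\bigr)\,r_n x_n^{r_n-1}\,dx_n \\
={}& -\delta_{nm}\,\frac{4\pi i}{(r_n-2)!\,r_n}\left(\frac{d}{dx_n}\right)^{r_n-2} R^{z,x_m}(0)
 - 2\pi i\,r_n\Bigl(1 - \frac{1}{r_n^2}\Bigr)c_n + o(1).
\end{aligned}
\tag{4.51}
\]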


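In the same manner we get

\[
I_2^n = o(1), \qquad
I_3^n = -2\pi i\Bigl(\frac{r_n-1}{r_n}\,a_n\chi^{\mathrm{int}}_{x_n}(0) + r_n\Bigl(\frac{1}{r_n}-1\Bigr)c_n\Bigr) + o(1), \tag{4.52}
\]

and

\[
\begin{aligned}
I_4^n &= 2\pi i\Bigl(\frac{1}{r_n}-1\Bigr)r_n c_n + o(1), \\
I_5^n &= \delta_{mn}\sum_{l=1}^{r_n}\oint_{|\lambda^{(l)}-\lambda_n|=\rho}|\chi_\lambda|^2\,d\bar\lambda
 + 2\pi i\,\chi^{\mathrm{int}}_{x_n}(0)\,\frac{1-r_n}{r_n}\,a_n
 + 2\pi i\Bigl(\frac{1}{r_n}-1\Bigr)^2 r_n c_n + o(1).
\end{aligned}
\tag{4.53}
\]

Using (4.44), we get also

\[
J_1^{\infty,k} = o(1), \qquad J_2^{\infty,k} = o(1), \qquad
J_3^{\infty,k} = -4\pi i\bigl(A_k\chi^{\infty}_{\zeta_k}(0) + B_k\bigr) + o(1), \tag{4.54}
\]

and

\[
J_4^{\infty,k} = 4\pi i B_k + o(1), \qquad
J_5^{\infty,k} = -4\pi i\bigl(A_k\chi^{\infty}_{\zeta_k}(0) + 2B_k\bigr) + o(1). \tag{4.55}
\]

Summarizing (4.49)–(4.55), we have

\[
\begin{aligned}
\partial_{\lambda_m}T_\rho ={}& \frac{2\pi}{(r_m-2)!\,r_m}\left(\frac{d}{dx_m}\right)^{r_m-2} R^{z,x_m}(0)
 + 2\pi\sum_{n=1}^{M}\frac{1-r_n}{r_n}\bigl(a_n\chi^{\mathrm{int}}_{x_n}(0) + c_n\bigr) \\
&- 4\pi\sum_{k=1}^{N}\bigl(A_k\chi^{\infty}_{\zeta_k}(0) + B_k\bigr) + o(1).
\end{aligned}
\tag{4.56}
\]

To finish the proof we need the following lemma.

LEMMA 14. The equalities hold

\[
\partial_{\lambda_m}\chi^{\mathrm{int}}(x_n)\big|_{x_n=0} = -\frac{1}{r_n}\bigl(a_n\chi^{\mathrm{int}}_{x_n}(0) + c_n\bigr) \tag{4.57}
\]

and

\[
\partial_{\lambda_m}\chi^{\infty}(\zeta_k)\big|_{\zeta_k=0} = A_k\chi^{\infty}_{\zeta_k}(0) + B_k. \tag{4.58}
\]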


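Proof. We shall prove (4.57); (4.58) can be proved analogously. Since

\[
e^{\chi^{\mathrm{int}}(x_n,\bar x_n)}|dx_n|^2 = e^{\chi^{\mathrm{ext}}(\lambda,\bar\lambda)}|d\lambda|^2,
\]

we get

\[
\chi^{\mathrm{int}}(x_n,\bar x_n) = \chi^{\mathrm{ext}}(\lambda,\bar\lambda) - \Bigl(\frac{1}{r_n}-1\Bigr)\ln|\lambda-\lambda_n|^2 - \ln\frac{1}{r_n^2} \tag{4.59}
\]

and

\[
\chi^{\mathrm{ext}}_{\lambda_m}(\lambda,\bar\lambda) = \chi^{\mathrm{int}}_{\lambda_m}(x_n,\bar x_n) + \mathrm{const}\;\delta_{mn}\,\frac{1}{x_n^{r_n}}. \tag{4.60}
\]

By (4.43) and (4.40) we have

\[
\begin{aligned}
\chi^{\mathrm{ext}}_{\lambda_m}(\lambda,\bar\lambda)
={}& -\bigl(\delta_{mn} + a_n x_n^{r_n-1} + b_n\bar x_n x_n^{r_n-1} + c_n x_n^{r_n} + O(|x_n|^{r_n+1})\bigr) \\
&\quad\times\Bigl[\frac{1}{r_n x_n^{r_n-1}}\,\chi^{\mathrm{int}}_{x_n}(x_n,\bar x_n) + \Bigl(\frac{1}{r_n}-1\Bigr)\frac{1}{x_n^{r_n}}\Bigr] \\
&- \frac{r_n-1}{r_n}\,a_n\,\frac{1}{x_n} - \frac{r_n-1}{r_n}\,b_n\,\frac{\bar x_n}{x_n} - c_n + O(|x_n|).
\end{aligned}
\tag{4.61}
\]

Now substituting (4.61) in (4.60) and comparing the coefficients of the zero power of $x_n$, we get (4.57). ✷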

Observe that

\[
\partial_{\lambda_m}\int_{L} e^{\chi}\,dS = 0
\]

due to the Gauss–Bonnet theorem and the term o(1) in (4.56) is uniform with respect to $(\lambda_1, \dots, \lambda_M)$ belonging to a compact neighborhood of the initial point $(\lambda_1^0, \dots, \lambda_M^0)$. This together with (4.56) and Lemma 14 proves Theorem 8. ✷

Remark 8. Consider the functional defined by the right-hand side of (4.47). If we introduce variations δχ which are smooth functions on L vanishing in neighborhoods of the branch points and the infinities, then the Euler–Lagrange equation for an extremal of this functional coincides with the Liouville equation

\[
\chi_{\lambda\bar\lambda} = \frac{1}{2}\,e^{\chi}.
\]

The last equation is equivalent to the condition that the metric $e^{\chi}|d\lambda|^2$ has constant curvature −1.
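The equivalence stated at the end of Remark 8 follows from the standard formula for the Gaussian curvature of a conformal metric. The short check below writes $\lambda = u + iv$ (a convention assumed for this sketch) and then tests the formula on the Poincaré metric entering (4.38).

```latex
% Curvature of e^{\chi}|d\lambda|^2 = e^{\chi}(du^2 + dv^2):
\begin{align*}
K &= -\tfrac12\,e^{-\chi}\,(\partial_u^2 + \partial_v^2)\chi, \qquad
\partial_u^2 + \partial_v^2 = 4\,\partial_\lambda\partial_{\bar\lambda}, \\
K &= -2\,e^{-\chi}\,\chi_{\lambda\bar\lambda}, \qquad\text{so}\qquad
K = -1 \iff \chi_{\lambda\bar\lambda} = \tfrac12\,e^{\chi}.
\end{align*}
% Check on the upper half-plane: e^{\chi} = (\Im z)^{-2}, \Im z = (z - \bar z)/(2i)
\begin{align*}
\chi &= -2\ln\frac{z-\bar z}{2i}, \qquad
\chi_z = -\frac{2}{z-\bar z}, \qquad
\chi_{z\bar z} = -\frac{2}{(z-\bar z)^2} = \frac{1}{2(\Im z)^2} = \tfrac12\,e^{\chi}.
\end{align*}
```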


4.3. THE MODULUS SQUARE OF BERGMANN AND WIRTINGER TAU-FUNCTIONS IN HIGHER GENUS

Now we are in a position to calculate the modulus square of the Bergmann (and, therefore, the Wirtinger) tau-function. Actually, we shall give two equivalent answers: one is given in terms of the Fuchsian uniformization of the surface L and the determinant of the Laplacian, another one uses the Schottky uniformization and the holomorphic determinant of the Cauchy–Riemann operator in the trivial line bundle over L.

Indeed, formula (4.2) and Theorem 8 imply the following statement.

THEOREM 9. Let the regularized Liouville action $S_{\mathrm{Fuchs}}$ be given by formula (4.47). Then we have the following expression for the modulus square $|\tau_B|^2$ of the Bergmann tau-function of the covering L:

\[
|\tau_B|^2 = e^{-S_{\mathrm{Fuchs}}/6}\,\frac{\det\Delta}{\det\Im B}. \tag{4.62}
\]

For the modulus square $|\tau_W|^2$ of the Wirtinger tau-function we have the expression:

\[
|\tau_W|^2 = e^{-S_{\mathrm{Fuchs}}/6}\,\frac{\det\Delta}{\det\Im B}\prod_{\beta\ \mathrm{even}}\bigl|\Theta[\beta](0|B)\bigr|^{-2/(4^{g-1}+2^{g-2})}. \tag{4.63}
\]

On the other hand, using formula (4.4) and Theorem 7, we get the following alternative answer.

THEOREM 10. Let the regularized Dirichlet integral $S_{\mathrm{Schottky}}$ be given by formula (4.23). Then the modulus square of the Bergmann and Wirtinger tau-functions of the covering L can be expressed as follows:

\[
|\tau_B|^2 = e^{-S_{\mathrm{Schottky}}/6}\,|\det\bar\partial|^2, \tag{4.64}
\]

\[
|\tau_W|^2 = e^{-S_{\mathrm{Schottky}}/6}\,|\det\bar\partial|^2\prod_{\beta\ \mathrm{even}}\bigl|\Theta[\beta](0|B)\bigr|^{-2/(4^{g-1}+2^{g-2})}. \tag{4.65}
\]

Remark 9. Comparing (4.64), (4.62) and formula (3.3) for $|\det\bar\partial|^2$ from [16], we get the equality

\[
S_{\mathrm{Schottky}} - S_{\mathrm{Fuchs}} = \frac{1}{2\pi}\,S,
\]

where S is the Liouville action from [18]. Whether it is possible to prove this relation directly is an open question.

Remark 10. Looking at the formulas for the tau-functions in genera 0 and 1 (and for genus 2 two-fold coverings), one may believe that the expressions for the tau-functions in higher genus can also be given in purely holomorphic terms, without any use of the Dirichlet integrals and, especially, the Fuchsian uniformization. At the least, the Dirichlet integral should be eliminated from the proofs in genus 0 and 1.


Remark 11. The number of sheets of the covering

\[
H_{g,N}(1, \dots, 1) \longrightarrow \mathbb{C}^{(M)} \setminus \Delta
\]

(or, equivalently, the degree of the Lyashko–Looijenga map) is finite and equals (up to the factor N!) the Hurwitz number $h_{g,N}$. Here M = 2g + 2N − 2, $\mathbb{C}^{(M)}$ is the Mth symmetric power of $\mathbb{C}$, and $\Delta = \bigcup_{i,j}\{\lambda_i = \lambda_j\}$. Due to Remark 4, in case g = 0, 1 the 12th power $\tau_W^{12}$ of the Wirtinger tau-function gives a global holomorphic function on $H_{g,N}(1, \dots, 1)$. It would be very interesting to connect the Wirtinger tau-function with the Hurwitz numbers $h_{g,N}$.

Acknowledgements

Our work on this paper was greatly influenced by Andrej Nikolaevich Tyurin; in particular, he attracted our attention to the Wirtinger bidifferential.

The authors are also greatly indebted to the anonymous referee; a lot of his proposals and remarks were used here.

This work was partially supported by the grant of Fonds pour la Formation de Chercheurs et l’Aide a la Recherche de Quebec, the grant of Natural Sciences and Engineering Research Council of Canada and Faculty Research Development Program of Concordia University.

References

1. Dubrovin, B.: Geometry of 2D topological field theories, In: Integrable Systems and Quantum Groups. Proceedings, Montecatini Terme, 1993, Lecture Notes in Math. 1620, Springer, Berlin, 1996, pp. 120–348.

2. Fay, J. D.: Theta-functions on Riemann Surfaces, Lecture Notes in Math. 352, Springer, 1973.

3. Fay, J. D.: Kernel functions, analytic torsion, and moduli spaces, Mem. Amer. Math. Soc. 96(464) (1992).

4. Fulton, W.: Hurwitz schemes and irreducibility of moduli of algebraic curves, Ann. of Math. 90 (1969), 542–575.

5. Hawley, N. S. and Schiffer, M.: Half-order differentials on Riemann surfaces, Acta Math. 115 (1966), 199–236.

6. Jimbo, M., Miwa, M. and Ueno, K.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients, I, Phys. D 2 (1981), 306–352.

7. Kitaev, A. and Korotkin, D.: On solutions of Schlesinger equations in terms of theta-functions, Internat. Math. Res. Notices 17 (1998), 877–905.

8. Knizhnik, V. G.: Multiloop amplitudes in the theory of quantum strings and complex geometry, Sov. Phys. Usp. 32(11) (1989), 945–971.

9. Korotkin, D.: Matrix Riemann–Hilbert problems related to branched coverings of CP1, archive math-ph/0106009, In: I. Gohberg, A. F. dos Santos and N. Manojlovic (eds), Operator Theory: Advances and Application, Proceedings of the Summer School on Factorization and Integrable Systems, Algarve, September 6–9, 2000, Birkhäuser, Boston, 2002, to appear.

10. Mumford, D.: Tata Lectures on Theta, Birkhäuser, 1984.

11. Natanzon, S. M.: Topology of 2-dimensional coverings and meromorphic functions on real and complex algebraic curves, Selecta Math. Soviet. 12(3) (1993), 251–291.

12. Rauch, H. E.: Weierstrass points, branch points, and moduli of Riemann surfaces, Comm. Pure Appl. Math. 12 (1959), 543–560.

13. Rauch, H. E.: A transcendental view of the space of algebraic Riemann surfaces, Bull. Amer. Math. Soc. 71 (1965), 1–39.

14. Strachan, I. A. B.: Symmetries and solutions of Getzler’s equation for Coxeter and extended affine Weyl Frobenius manifolds, math-ph/0205012.

15. Tyurin, A. N.: Periods of quadratic differentials (Russian), Uspekhi Mat. Nauk 33(6(204)) (1978), 149–195.

16. Zograf, P. G.: Liouville action on moduli spaces and uniformization of degenerate Riemann surfaces, Leningrad. Math. J. 1(4) (1990), 941–965.

17. Zograf, P. G. and Takhtajan, L. A.: On the Liouville equation, accessory parameters and the geometry of Teichmüller space for Riemann surfaces of genus 0, Math. USSR-Sb. 60(1) (1988), 143–161.

18. Zograf, P. G. and Takhtajan, L. A.: On the uniformization of Riemann surfaces and on the Weil–Petersson metric on the Teichmüller and Schottky spaces, Math. USSR-Sb. 60(2) (1988), 297–313.

19. Zograf, P. G. and Takhtajan, L. A.: Potential of the Weil–Peterson metric on Torelli space, J. Soviet. Math. 52 (1990), 3077–3085.

Mathematical Physics, Analysis and Geometry 7: 97–117, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

On Equilibria of the Two-fluid Model in Magnetohydrodynamics

DIMITRI J. FRANTZESKAKIS¹, IOANNIS G. STRATIS² and ATHANASIOS N. YANNACOPOULOS³

¹Department of Physics, University of Athens, Panepistimiopolis, GR 15784 Zografou, Athens, Greece.
²Department of Mathematics, University of Athens, Panepistimiopolis, GR 15784 Zografou, Athens, Greece.
³Department of Statistics and Actuarial Science, University of the Aegean, GR 82300 Karlovassi, Samos, Greece.

(Received: 11 April 2002; in final form: 13 July 2003)

Abstract. We show how the equilibria of the two-fluid model in magnetohydrodynamics can be described by the double curl equation and through the study of this equation we study some properties of these equilibria.

Mathematics Subject Classifications (2000): 76W05, 35Q35.

Key words: Beltrami fields, double curl equation, equilibrium solutions, magnetohydrodynamics, two-fluid model.

1. Introduction

The study of ideal magnetohydrodynamics (MHD) is a problem that has occu-pied the scientific community for nearly four decades as this theory finds manyapplications ranging from fusion studies to astrophysics (see, e.g., [3, 6, 12]).The equations of ideal MHD present interesting dynamical behaviour and supportimportant and diverse classes of solutions, such as soliton solutions or solutionsdisplaying turbulent behaviour, etc. However, a very helpful insight to the under-standing of the dynamics of such systems is provided by the thorough study of theequilibrium solutions (time stationary solutions).

In this paper, we undertake a necessary and important first step in the study of the properties of the equilibria of the two-fluid model in ideal MHD: the proof of existence of such states. The two-fluid model of a plasma describes the strong coupling between the magnetic properties of a plasma and its properties as a fluid, in a macroscopic limit [4]. In a recent paper, Yoshida and Mahajan [20], using the two-fluid description and neglecting dissipative effects, reduced the determination of certain types of equilibrium solutions to a double curl type equation with constant coefficients. This equation was reduced to appropriate single curl equations and the equilibria were expressed in terms of Beltrami fields (see, e.g., [18]). In this paper, we study a more general type of such equilibrium solutions, which may be obtained by solving a double curl equation with spatially varying coefficients. There is some related work on the single curl equation with spatially varying coefficients; see, e.g., [7, 13]. This set of equilibrium solutions may prove useful in the modelling of plasmas, especially in the context of laboratory plasmas.

This paper is organized as follows: in Section 2 we derive the double curl equation in the context of the two-fluid model. In Section 3 we study the existence of equilibrium solutions using a variational formulation and a fixed point scheme. In Section 4 we show that the considered problem can, under suitable constraints, be reduced to a single curl equation. Finally, in Section 5 we consider some interesting special cases (regarding the relation between the spatial scales of the magnetic and the velocity fields) in which the considered system reduces to simpler forms.

2. Derivation of the Double Curl Equation

In this section we present the derivation of the double curl equation in the context of the two-fluid model in magnetohydrodynamics.

We assume that the plasma is composed of two different fluids, a fluid of electrons (mass m and charge −e) and a fluid of ions (mass M and charge e). The electron fluid has velocity ve and pressure pe and the ion fluid has velocity vi and pressure pi. We assume a neutral plasma with number density of the electrons and ions n. The momentum equation for each of these two fluids takes the form (see, e.g., [12])

$$\frac{\partial v_e}{\partial t} + (v_e\cdot\nabla)v_e = -\frac{e}{m}(E + v_e\times B) - \frac{1}{mn}\nabla p_e, \tag{1a}$$

$$\frac{\partial v_i}{\partial t} + (v_i\cdot\nabla)v_i = \frac{e}{M}(E + v_i\times B) - \frac{1}{Mn}\nabla p_i. \tag{1b}$$

We shall use a scaled version of these equations. The following scaled variables are to be employed

$$x = \lambda_i\hat{x}, \quad B = B_0\hat{B}, \quad t = \frac{\lambda_i}{V_A}\hat{t}, \quad p = \left(\frac{B_0^2}{\mu_0}\right)\hat{p}, \quad v = V_A\hat{v},$$

where $V_A = B_0/\sqrt{\mu_0 M n}$ is the Alfvén velocity and $\lambda_i^2 = M/(\mu_0 n)$ is a relevant length scale. In the scaled variables the equations become

$$\mu\left(\frac{\partial v_e}{\partial t} + (v_e\cdot\nabla)v_e\right) = -E - v_e\times B - \nabla p_e,$$

$$\frac{\partial v_i}{\partial t} + (v_i\cdot\nabla)v_i = E + v_i\times B - \nabla p_i,$$

where the hats have been dropped. In the above equations µ = m/M is a small parameter. In the limit µ → 0 (i.e. in the limit where the electron mass can be neglected with respect to the ion mass) the equations become

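$$\frac{\partial A}{\partial t} + \nabla\phi - v_e\times B - \nabla p_e = 0,$$

$$\frac{\partial v_i}{\partial t} - v_i\times(\nabla\times v_i) + \nabla\left(\frac{v_i^2}{2}\right) = -\frac{\partial A}{\partial t} - \nabla\phi + v_i\times B - \nabla p_i,$$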

where we used the divergence free property of the velocity fields and we have expressed the electric field with the help of a vector potential A and a scalar potential φ in the form

$$E = -\frac{\partial A}{\partial t} - \nabla\phi. \tag{2}$$

The next step is to eliminate the electron fluid velocity ve from these equations. The electric current in the plasma is given (in dimensionless variables) by j = vi − ve. By the Maxwell equations the current is related to the magnetic field by j = ∇ × B. So, the electron velocity field is related to the magnetic field by ve = vi − ∇ × B. Finally, we define the mean velocity of the two-fluid system (in dimensionless variables) by

$$V = \frac{v_i + \mu v_e}{1 + \mu} \simeq v_i.$$

The two fluid system in the limit µ → 0 thus becomes

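$$0 = -\frac{\partial A}{\partial t} - \nabla\phi + (V - \nabla\times B)\times B + \nabla p_e,$$

$$\frac{\partial V}{\partial t} - V\times(\nabla\times V) + \nabla\left(\frac{V^2}{2}\right) = -\frac{\partial A}{\partial t} - \nabla\phi + V\times B - \nabla p_i.$$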

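We may now eliminate the pressure terms by taking the curl of these equations and using B = ∇ × A:

$$-\frac{\partial B}{\partial t} + \nabla\times((V - \nabla\times B)\times B) = 0,$$

$$\frac{\partial}{\partial t}(\nabla\times V) - \nabla\times(V\times(\nabla\times V)) = -\frac{\partial B}{\partial t} + \nabla\times(V\times B).$$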

These equations may be written in symmetric form as

$$\frac{\partial\omega_j}{\partial t} - \nabla\times(U_j\times\omega_j) = 0, \quad j = 1, 2$$

in terms of the generalized vorticities

ω1 = B, ω2 = B + ∇ × V

and the effective flows

U1 = V − ∇ × B, U2 = V.


We may now look for equilibrium solutions of these equations. Such equilibrium solutions are solutions of the equations

∇ × (Uj × ωj) = 0, j = 1, 2.

A special class of solutions is in terms of generalized Beltrami fields (or nonlinear force free fields) in the form

Uj = θj (x)ωj

or in terms of the original fields

B = a(x)(V − ∇ × B),

B = −∇ × V + b(x)V,

where

$$a(x) = \frac{1}{\theta_1(x)}, \quad b(x) = \frac{1}{\theta_2(x)}.$$

We may now eliminate the velocity field V and obtain a single equation in terms of the magnetic field only. This is the following double curl equation with nonconstant coefficients

$$\nabla\times(\nabla\times B) + \left(\frac{1}{a(x)} - b(x)\right)\nabla\times B + \left(1 - \frac{b(x)}{a(x)}\right)B + \nabla\left(\frac{1}{a(x)}\right)\times B = 0. \tag{3}$$

In the special case of constant coefficients the above equation assumes the form

$$\nabla\times(\nabla\times B) + \left(\frac{1}{a} - b\right)\nabla\times B + \left(1 - \frac{b}{a}\right)B = 0.$$

Equations of the same type appear in the modelling of chiral media in electromagnetic theory (see, e.g., [2], where a similar equation has been treated in the case where a(x) and b(x) are known functions). In our case, these functions are to be determined.
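The elimination of V that leads to equation (3) can be written out explicitly. The following is a short worked sketch, starting from the two Beltrami conditions B = a(x)(V − ∇ × B) and B = −∇ × V + b(x)V and using only the product rule ∇ × (fB) = f ∇ × B + ∇f × B; it is included as a check and is not reproduced from the original text.

```latex
\begin{align*}
B = a(x)\,(V - \nabla\times B)
  &\;\Longrightarrow\;
  V = \frac{1}{a(x)}\,B + \nabla\times B, \\
\nabla\times V
  &= \frac{1}{a(x)}\,\nabla\times B
   + \nabla\!\left(\frac{1}{a(x)}\right)\times B
   + \nabla\times(\nabla\times B), \\
0 = B + \nabla\times V - b(x)\,V
  &= \nabla\times(\nabla\times B)
   + \left(\frac{1}{a(x)} - b(x)\right)\nabla\times B
   + \left(1 - \frac{b(x)}{a(x)}\right) B
   + \nabla\!\left(\frac{1}{a(x)}\right)\times B .
\end{align*}
```

The last line is the second Beltrami condition with V and ∇ × V substituted from the first two lines; for constant a and b the gradient term drops out.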

3. Existence of Equilibrium Solutions

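In this section we study the existence of equilibrium solutions with spatially varying coefficients. We rewrite the problem in the following form

$$
\begin{aligned}
\nabla\times U_1 &= -U_2 + \lambda_1 U_1 && \text{in } \Omega,\\
\nabla\times U_2 &= U_1 + \lambda_2 U_2 && \text{in } \Omega,\\
\nabla\cdot U_1 &= \nabla\cdot U_2 = 0 && \text{in } \Omega,\\
U_1\cdot n &= g_1 && \text{on } \partial\Omega,\\
U_2\cdot n &= g_2 && \text{on } \partial\Omega,\\
(\nabla\times U_1)\cdot n &= -g_2 + \lambda_{10}\,g_1 && \text{on } \Gamma_1^-,\\
(\nabla\times U_2)\cdot n &= g_1 + \lambda_{20}\,g_2 && \text{on } \Gamma_2^-,\\
\int_{\Sigma_i} U_1\cdot n\,d\sigma &= a_i^{(1)}, &&\\
\int_{\Sigma_i} U_2\cdot n\,d\sigma &= a_i^{(2)}, &&
\end{aligned}
\tag{4}
$$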

where (U1, U2) = (B, U) and λ1, λ2 are considered as unknown functions to be determined. Ω is a bounded domain of R³ which is not necessarily simply connected. The domain Ω can be turned into a simply connected one by 'cutting' along m surfaces Σi. By n we denote the outer unit normal on the boundary ∂Ω of Ω. The surfaces $\Gamma_i^-$ are the parts of ∂Ω on which the vector fields gi are incoming.

Remark 1. One set of boundary conditions refers to the determination of the normal components of the magnetic field and the velocity on the boundary of the domain Ω. Another set of boundary conditions is needed for the determination of the unknown eigenfunctions λi, i = 1, 2. To obtain this set of boundary conditions we work in complete analogy with [7]: we take the divergence of the equations, which leads to ∇ · (λiUi) = 0. For smooth enough λi this set of equations can be interpreted with the help of the ordinary differential equations ẋ = Ui(x), along the orbits of which λi is constant. From this observation a proper boundary condition may be obtained by specifying the value of λi, say λi0, on the sets $\Gamma_i^-$, which are the subsets of ∂Ω on which the fields gi are incoming. Since λi and Ui may lack the smoothness needed for this interpretation to be valid, we choose instead to reformulate these boundary conditions as written above, where we have used the equations themselves to obtain an equivalent form which is more suitable for our purpose. Finally, the last set of boundary conditions is necessary to complete the formulation of the problem in the case where the domain Ω is not simply connected; Σi are the surfaces along which we need to 'cut' the domain in order to render it simply connected. The surfaces Σi have the following properties: (a) $\Omega_0 = \Omega\setminus\bigcup_{i=1}^{m}\Sigma_i$ is simply connected, (b) Σi ∩ Σj = ∅ if i ≠ j and (c) the boundary of Σi, i = 1, . . . , m, is contained in ∂Ω.

Remark 2. As seen in the above remark we may formally write ∇ · (λiUi) = 0. For smooth enough λi this set of equations can be interpreted with the help of the ordinary differential equations ẋ = Ui(x), along the orbits of which λi is constant. In the case of λi ∈ C1 this would imply that the equilibrium solutions of the MHD equations satisfying the Beltrami condition with λi ≠ constant will be two-dimensional solutions. Even though 2D solutions are of great interest in plasma physics (see, e.g., [17, 16, 14, 5, 1]), some care has to be taken in writing and interpreting the equation ∇ · (λiUi) = 0. Since, as shown in this paper (see also [7]), λi ∈ L∞ and is not necessarily in C1, the above condition may only be valid in the almost everywhere sense. Thus, the solution will be two-dimensional but there might be sets of measure zero where it is not necessarily two-dimensional; in other words, it may be considered as a 'quasi-two-dimensional' solution.

Before dealing with the above system, we must make the necessary changes in order to turn it into one with homogeneous boundary conditions. In order to achieve that, we must subtract a properly chosen potential field which is responsible for the nonhomogeneous boundary conditions. We write

Ui = ui + ∇φi, i = 1, 2

and choose φi to be the solution of the following Neumann problem

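$$\Delta\phi_i = 0 \ \text{ in } \Omega, \qquad \frac{\partial\phi_i}{\partial n} = g_i \ \text{ on } \partial\Omega.$$

The new variables ui, i = 1, 2, will solve the homogeneous system

$$
\begin{aligned}
\nabla\times u_1 &= -u_2 + \lambda_1 u_1 + J_1 && \text{in } \Omega,\\
\nabla\times u_2 &= u_1 + \lambda_2 u_2 + J_2 && \text{in } \Omega,\\
\nabla\cdot u_1 &= \nabla\cdot u_2 = 0 && \text{in } \Omega,\\
u_1\cdot n &= 0 && \text{on } \partial\Omega,\\
u_2\cdot n &= 0 && \text{on } \partial\Omega,\\
(\nabla\times u_1)\cdot n &= -g_2 + \lambda_{10}\,g_1 && \text{on } \partial\Omega,\\
(\nabla\times u_2)\cdot n &= g_1 + \lambda_{20}\,g_2 && \text{on } \partial\Omega,\\
\int_{\Sigma_i} u_1\cdot n\,d\sigma &= \beta_i^{(1)}, &&\\
\int_{\Sigma_i} u_2\cdot n\,d\sigma &= \beta_i^{(2)}, &&
\end{aligned}
$$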

where

J1 = −∇φ2 + λ1∇φ1, J2 = ∇φ1 + λ2∇φ2

and

$$\beta_i^{(j)} = a_i^{(j)} - \int_{\Sigma_i}\nabla\phi_j\cdot n\,d\sigma, \quad i = 1, \dots, m, \ j = 1, 2.$$


3.1. FUNCTIONAL SET-UP OF THE PROBLEM

We now introduce the functional set-up of the problem. Let V be the subspace of L2(Ω)³ defined by

$$V = \{v \in L^2(\Omega)^3,\ \nabla\cdot v \in L^2(\Omega),\ \nabla\times v \in L^2(\Omega)^3 \text{ and } v\cdot n = 0 \text{ on } \partial\Omega\}$$

equipped with the norm

$$\|v\| = \left(\|v\|_{0,\Omega}^2 + \|\nabla\cdot v\|_{0,\Omega}^2 + \|\nabla\times v\|_{0,\Omega}^2\right)^{1/2}.$$

It is useful to note that this space is a Hilbert space [11] which can be identified with

$$V = \{v \in H^1(\Omega)^3,\ v\cdot n = 0 \text{ on } \partial\Omega\}.$$

On the space V it is useful to define the equivalent norm

$$\|v\|_V = \left(\|\nabla\cdot v\|_{0,\Omega}^2 + \|\nabla\times v\|_{0,\Omega}^2 + \|P_H v\|_{0,\Omega}^2\right)^{1/2},$$

where PH is the orthogonal projection of V on the finite-dimensional (of dimension m) subspace H defined by H = {v ∈ V, ∇ · v = 0, ∇ × v = 0}. Furthermore, we have the inequality

$$\|v\|_{0,\Omega} \le \frac{1}{c_0}\|v\|_V, \quad \forall v \in V$$

for a constant c0. For more details on the above, see [7–9, 19] and references therein. For the problem we study, the proper functional set-up for (u1, u2) is the product space V × V equipped with the norm

$$\|(u_1, u_2)\|_{V\times V} = \left(\|u_1\|_V^2 + \|u_2\|_V^2\right)^{1/2}.$$

3.2. A FIXED POINT SCHEME FOR THE SOLUTION OF THE PROBLEM

The solution of the problem will be considered with the use of the following fixed point scheme:

• We first assume that (λ1, λ2) are known and we solve the following problem

$$
\begin{aligned}
\nabla\times u_1 &= -u_2 + \lambda_1 u_1 + J_1 && \text{in } \Omega,\\
\nabla\times u_2 &= u_1 + \lambda_2 u_2 + J_2 && \text{in } \Omega,\\
\nabla\cdot u_1 &= \nabla\cdot u_2 = 0 && \text{in } \Omega,\\
u_1\cdot n &= 0 && \text{on } \partial\Omega,\\
u_2\cdot n &= 0 && \text{on } \partial\Omega,\\
\int_{\Sigma_i} u_1\cdot n\,d\sigma &= \beta_i^{(1)}, &&\\
\int_{\Sigma_i} u_2\cdot n\,d\sigma &= \beta_i^{(2)}, &&
\end{aligned}
\tag{PROBLEM A}
$$

where J1 = −∇φ2 + λ1∇φ1 + ∇p1, J2 = ∇φ1 + λ2∇φ2 + ∇p2 and the terms ∇pi, i = 1, 2, have been added so as to ensure the divergence free property of the fields, since for general λi that may be used in Problem A it will not be true that ∇ · (λiui) = 0. Of course, this property will always be true for the λi that are the eigenfunctions of the problem.

• We then solve the problem for λi, i = 1, 2, assuming ui are known. We solve the following problem

$$
\begin{aligned}
-\varepsilon\Delta\lambda_{i\varepsilon} + u_i\cdot\nabla\lambda_{i\varepsilon} &= 0 && \text{in } \Omega,\\
\lambda_{i\varepsilon} &= \sigma_i && \text{on } \partial\Omega.
\end{aligned}
\tag{PROBLEM B}
$$

The terms $-\varepsilon\Delta\lambda_{i\varepsilon}$ are elliptic terms which are added to regularize the hyperbolic system that the eigenfunctions λiε satisfy.

The solution of the problem we wish to solve is now the fixed point of the successive solution of Problems A and B. In the next subsections we treat these problems separately and prove the existence of a fixed point in the above scheme.

3.2.1. Solution of Problem A

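Let us temporarily assume that λ1,2 ∈ L∞(Ω) and that they satisfy the bounds ‖λi‖∞ ≤ λ∞.

We will first introduce the variational form of Problem A. We may prove the following lemma.

LEMMA 3.1. The quadruple $(u_1, u_2, p_1, p_2) \in V\times V\times H_0^1(\Omega)\times H_0^1(\Omega)$ is a solution of Problem A if and only if it is a solution of the following variational problem: for all $(v_1, v_2, w_1, w_2) \in V\times V\times H_0^1(\Omega)\times H_0^1(\Omega)$,

$$
\begin{aligned}
(\nabla\times u_1 + u_2 - \lambda_1 u_1, \nabla\times v_1) + (\nabla\cdot u_1, \nabla\cdot v_1) + (P_H u_1, P_H v_1)
 &= (J_1, \nabla\times v_1) + \sum_{i=1}^{m}\beta_i^{(1)}(q_i, P_H v_1),\\
(\nabla\times u_2 - u_1 - \lambda_2 u_2, \nabla\times v_2) + (\nabla\cdot u_2, \nabla\cdot v_2) + (P_H u_2, P_H v_2)
 &= (J_2, \nabla\times v_2) + \sum_{i=1}^{m}\beta_i^{(2)}(q_i, P_H v_2),\\
(\nabla p_1, \nabla w_1) &= -(J_1 + \lambda_1 u_1, \nabla w_1),\\
(\nabla p_2, \nabla w_2) &= -(J_2 + \lambda_2 u_2, \nabla w_2).
\end{aligned}
$$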

Proof. The proof of this lemma is a straightforward generalization of the proof of Lemma 8 in [7] and is omitted here for the sake of brevity. □

We may now use the above variational characterization of Problem A, along with the Lax–Milgram lemma, to guarantee its solution and obtain estimates on the solution. We have the following

LEMMA 3.2. Problem A admits a unique solution satisfying the estimates

$$\|u_i\|_V \le C, \qquad \|\nabla p_i\|_{0,\Omega} \le C_{1,i}.$$

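Proof. We will use the linear forms

$$l_i(v_i) = (J_i, \nabla\times v_i) + \sum_{j=1}^{m}\beta_j^{(i)}(q_j, P_H v_i)$$

and the bilinear forms

$$
\begin{aligned}
a_1(u_1, v_1) &= (\nabla\times u_1 + u_2 - \lambda_1 u_1, \nabla\times v_1) + (\nabla\cdot u_1, \nabla\cdot v_1) + (P_H u_1, P_H v_1),\\
a_2(u_2, v_2) &= (\nabla\times u_2 - u_1 - \lambda_2 u_2, \nabla\times v_2) + (\nabla\cdot u_2, \nabla\cdot v_2) + (P_H u_2, P_H v_2).
\end{aligned}
$$

The variational problem may then be written in the form

$$a_1(u_1, v_1) + a_2(u_2, v_2) = l_1(v_1) + l_2(v_2).$$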

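In order to apply the Lax–Milgram lemma we need to check the boundedness of the form a1 + a2 and its coercivity.

The boundedness is straightforward to check using the following estimates

$$
\begin{aligned}
|a_1(u_1, v_1)| \le{}& \|\nabla\times u_1\|_{0,\Omega}\|\nabla\times v_1\|_{0,\Omega} + \|u_2\|_{0,\Omega}\|\nabla\times v_1\|_{0,\Omega} + \|\lambda_1\|_\infty\|u_1\|_{0,\Omega}\|\nabla\times v_1\|_{0,\Omega}\\
&+ \|\nabla\cdot u_1\|_{0,\Omega}\|\nabla\cdot v_1\|_{0,\Omega} + \|P_H u_1\|_{0,\Omega}\|P_H v_1\|_{0,\Omega},\\
|a_2(u_2, v_2)| \le{}& \|\nabla\times u_2\|_{0,\Omega}\|\nabla\times v_2\|_{0,\Omega} + \|u_1\|_{0,\Omega}\|\nabla\times v_2\|_{0,\Omega} + \|\lambda_2\|_\infty\|u_2\|_{0,\Omega}\|\nabla\times v_2\|_{0,\Omega}\\
&+ \|\nabla\cdot u_2\|_{0,\Omega}\|\nabla\cdot v_2\|_{0,\Omega} + \|P_H u_2\|_{0,\Omega}\|P_H v_2\|_{0,\Omega}
\end{aligned}
$$

from which we may conclude that

$$|a_1(u_1, v_1) + a_2(u_2, v_2)| \le C(c_0, \|\lambda_i\|_\infty)\,\|u\|_{V\times V}\|v\|_{V\times V},$$

which guarantees the boundedness of the bilinear forms in the space V × V.

We now deal with the coercivity property of the forms. We have that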

$$
a_1(u_1, u_1) + a_2(u_2, u_2) = \|u_1\|_V^2 + \|u_2\|_V^2 + (u_2, \nabla\times u_1) - (\lambda_1 u_1, \nabla\times u_1) - (u_1, \nabla\times u_2) - (\lambda_2 u_2, \nabla\times u_2)
$$

from which we may conclude that

$$
\begin{aligned}
|a_1(u_1, u_1) + a_2(u_2, u_2)| \ge{}& \|u_1\|_V^2 + \|u_2\|_V^2 - \|u_2\|_{0,\Omega}\|\nabla\times u_1\|_{0,\Omega} - \|u_1\|_{0,\Omega}\|\nabla\times u_2\|_{0,\Omega}\\
&- \|\lambda_1\|_\infty\|u_1\|_{0,\Omega}\|\nabla\times u_1\|_{0,\Omega} - \|\lambda_2\|_\infty\|u_2\|_{0,\Omega}\|\nabla\times u_2\|_{0,\Omega}.
\end{aligned}
$$

Using standard inequalities we may find that

$$|a_1(u_1, u_1) + a_2(u_2, u_2)| \ge \|u_1\|_V^2 + \|u_2\|_V^2 - \frac{2}{c_0}\|u_1\|_V\|u_2\|_V - \frac{\lambda_\infty}{c_0}\left(\|u_1\|_V^2 + \|u_2\|_V^2\right)$$

from which we may conclude that

$$|a_1(u_1, u_1) + a_2(u_2, u_2)| \ge \left(1 - \frac{1}{c_0} - \frac{\lambda_\infty}{c_0}\right)\|u\|_{V\times V}^2,$$

where λ∞ is the bound for the L∞(Ω) norm of λi (i = 1, 2) and u = (u1, u2). The above result guarantees the coercivity of the form a1 + a2.

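We may furthermore show that the linear forms li are bounded as follows

$$|l_i(v_i)| \le \|J_i\|_{0,\Omega}\|\nabla\times v_i\|_{0,\Omega} + \left(\sum_{j=1}^{m}|\beta_j^{(i)}|\right)\|P_H v_i\|_{0,\Omega}$$

from which we may conclude that

$$|l_1(v_1) + l_2(v_2)| \le \left(\|J_1\|^2 + \|J_2\|^2 + \Big(\sum_{j=1}^{m}\beta_j^{(1)}\Big)^2 + \Big(\sum_{j=1}^{m}\beta_j^{(2)}\Big)^2\right)^{1/2}\|v\|_{V\times V}.$$

From a straightforward application of the Lax–Milgram lemma we may thus see that the above problem has a unique solution that satisfies the bound

$$\|u\|_{V\times V} \le \frac{c_0\left(\|J_1\|^2 + \|J_2\|^2 + \big(\sum_{j=1}^{m}\beta_j^{(1)}\big)^2 + \big(\sum_{j=1}^{m}\beta_j^{(2)}\big)^2\right)^{1/2}}{c_0 - 1 - \lambda_\infty} \equiv C.$$

The proof for the solvability of the equation for p follows similarly, and a possible choice for C1,i, i = 1, 2, is $C_{1,1} = \|\nabla\phi_2\|_{0,\Omega} + \lambda_\infty(\|\nabla\phi_1\|_{0,\Omega} + \|u_1\|_{0,\Omega})$, $C_{1,2} = \|\nabla\phi_1\|_{0,\Omega} + \lambda_\infty(\|\nabla\phi_2\|_{0,\Omega} + \|u_2\|_{0,\Omega})$. This concludes the proof of the lemma. □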

Let us note that the solvability of this problem requires c0 − 1 − λ∞ > 0. This implies that c0 ≥ 1 and ‖λi‖∞ ≤ c0, which is a constraint both on the initial data for the 'eigenvalues' λi and on the domain, since the constant c0 depends on the choice of domain (see Lemma 5 in [7]).
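The solvability condition and the bound C of Lemma 3.2 are simple enough to evaluate directly. The following minimal Python sketch does so for given numbers; the function names and the sample values are hypothetical and chosen only for illustration, and the norms of J1, J2 and the coefficients β are assumed to be supplied by the caller.

```python
from math import sqrt


def coercivity_constant(c0, lam_inf):
    """Coercivity constant 1 - 1/c0 - lam_inf/c0 of the form a1 + a2."""
    return 1.0 - 1.0 / c0 - lam_inf / c0


def solution_bound(c0, lam_inf, norm_J1, norm_J2, beta1, beta2):
    """Bound C on the V x V norm of (u1, u2) as stated in Lemma 3.2.

    beta1 and beta2 are sequences holding beta_j^(1) and beta_j^(2).
    Raises ValueError when the condition c0 - 1 - lam_inf > 0 fails.
    """
    if c0 - 1.0 - lam_inf <= 0.0:
        raise ValueError("need c0 - 1 - lam_inf > 0 for coercivity")
    data = sqrt(norm_J1**2 + norm_J2**2 + sum(beta1)**2 + sum(beta2)**2)
    return c0 * data / (c0 - 1.0 - lam_inf)
```

For instance, `solution_bound(4.0, 1.0, 3.0, 4.0, [0.0], [0.0])` divides the data norm by `coercivity_constant(4.0, 1.0)`, which is the usual Lax–Milgram estimate.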


3.2.2. The Solution of Problem B

In this section we treat the solution of Problem B, i.e. the problem of determining λi with ui given. This problem is decoupled, so the solvability and the bounds for this problem can be obtained straight away from the relevant results of Boulmezaoud and Amari [7] on the single curl equation. In this direction we have

LEMMA 3.3. Assume that $u_i \in H^1(\Omega)^3$, ∇ · ui = 0, $\sigma_i \in H^{3/2}(\partial\Omega)$, i = 1, 2. Then Problem B has a unique solution $\lambda_{i\varepsilon} \in H^1(\Omega)$. Moreover $\lambda_{i\varepsilon} \in H^2(\Omega)$ and satisfies the following estimates

$$
\begin{aligned}
\|\lambda_{i\varepsilon}\|_\infty &\le \|\sigma_i\|_{\infty,\partial\Omega},\\
\|\nabla\lambda_{i\varepsilon}\|_{0,\Omega} &\le C_2\,\varepsilon^{-1/2},\\
\|\lambda_{i\varepsilon}\|_{2,\Omega} &\le C_3\,\varepsilon^{-7/4}.
\end{aligned}
$$

Proof. In complete analogy with the proof of Lemma 10 of [7]. □

3.2.3. The Coupled Problem

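We are now in a position to address the coupled problem that will give us the solution of the double curl problem

$$
\begin{aligned}
\nabla\times u_{1\varepsilon} &= -u_{2\varepsilon} + \lambda_{1\varepsilon}u_{1\varepsilon} + J_1 && \text{in } \Omega,\\
\nabla\times u_{2\varepsilon} &= u_{1\varepsilon} + \lambda_{2\varepsilon}u_{2\varepsilon} + J_2 && \text{in } \Omega,\\
\nabla\cdot u_{1\varepsilon} &= \nabla\cdot u_{2\varepsilon} = 0 && \text{in } \Omega,\\
\int_{\Sigma_i} u_{j\varepsilon}\cdot n\,d\sigma &= \beta_i^{(j)}, && i = 1, \dots, m,\ j = 1, 2,\\
-\varepsilon\Delta\lambda_{i\varepsilon} + (u_{i\varepsilon} + \nabla\phi_i)\cdot\nabla\lambda_{i\varepsilon} &= 0 && \text{in } \Omega,\\
\lambda_{i\varepsilon} &= \sigma_i && \text{on } \partial\Omega
\end{aligned}
\tag{5}
$$

with (u1ε, u2ε) ∈ V × V.

We define the closed subspace of V

$$V_1 = \{v \in V : \nabla\cdot v = 0\}.$$

For the coupled system we will work with the closed subspace V1 × V1 of the functional space V × V.

We now consider the operator T : (j1, j2) ∈ V1 × V1 → (u1, u2) ∈ V1 × V1, where (u1, u2) is a solution of Problem A with (λ1, λ2) being a solution of the problem

$$
\begin{aligned}
-\varepsilon\Delta\lambda_i + (j_i + \nabla\phi_i)\cdot\nabla\lambda_i &= 0 && \text{in } \Omega,\\
\lambda_i &= \sigma_i && \text{on } \partial\Omega
\end{aligned}
\tag{PROBLEM C}
$$

for i = 1, 2.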


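A fixed point of the operator T is a solution of the coupled system. To prove the existence of a fixed point for the operator T we will employ Schauder's fixed point theorem. We need the following lemma.

LEMMA 3.4. The operator T is compact from V1 × V1 into V1 × V1.

Proof. We will first check the continuity of the operator T. Let $(j_1, j_2), (j_1^*, j_2^*) \in V_1\times V_1$. By $(\lambda_{1\varepsilon}, \lambda_{2\varepsilon})$, $(\lambda_1^*, \lambda_2^*)$ we denote the solutions of Problem C corresponding to these choices of ji. Set $(u_1, u_2) = T(j_1, j_2)$, $(u_1^*, u_2^*) = T(j_1^*, j_2^*)$ and let us denote by an overbar the difference between the nonstarred and the starred quantities, e.g., $\bar{u}_1 = u_1 - u_1^*$, etc. We now consider the equations that the differences will satisfy. We begin with the equations for λi. Subtracting the equations for λi we obtain

$$-\varepsilon\Delta\bar{\lambda}_i + (j_i + \nabla\phi_i)\cdot\nabla\bar{\lambda}_i + \bar{j}_i\cdot\nabla\lambda_i^* = 0$$

with homogeneous boundary conditions. Multiplying by $\bar{\lambda}_i$ and integrating over the whole domain Ω we obtain

$$\varepsilon\|\bar{\lambda}_i\|_{1,\Omega}^2 = \int_\Omega \lambda_i^*\,\bar{j}_i\cdot\nabla\bar{\lambda}_i\,d\Omega \le |\lambda_i^*|_\infty\|\bar{j}_i\|_{0,\Omega}\|\bar{\lambda}_i\|_{1,\Omega} \le C_0'\|\bar{j}_i\|_{0,\Omega}\|\bar{\lambda}_i\|_{1,\Omega}$$

(where we have used the estimates of Lemma 3.3). From the above inequality we obtain the estimate

$$\varepsilon\|\bar{\lambda}_i\|_{1,\Omega} \le C_0\|\bar{j}_i\|_{0,\Omega}.$$

On the other hand we have that

$$
\begin{aligned}
\nabla\times\bar{u}_1 &= -\bar{u}_2 + \lambda_1\bar{u}_1 + \bar{\lambda}_1 u_1^* + \bar{\lambda}_1\nabla\phi_1 + \nabla\bar{p}_1, & \nabla\cdot\bar{u}_1 &= 0, \quad P_H\bar{u}_1 = 0,\\
\nabla\times\bar{u}_2 &= \bar{u}_1 + \lambda_2\bar{u}_2 + \bar{\lambda}_2 u_2^* + \bar{\lambda}_2\nabla\phi_2 + \nabla\bar{p}_2, & \nabla\cdot\bar{u}_2 &= 0, \quad P_H\bar{u}_2 = 0.
\end{aligned}
$$

We may now use the estimate for the solution of Problem A to estimate these differences. We have that

$$\|\bar{u}\|_{V\times V} \le C_1'\left\{\|\bar{\lambda}_1 u_1^* + \bar{\lambda}_1\nabla\phi_1\|_{0,\Omega}^2 + \|\bar{\lambda}_2 u_2^* + \bar{\lambda}_2\nabla\phi_2\|_{0,\Omega}^2\right\}^{1/2}.$$

Using Hölder's inequality with p = 3, q = 3/2 we have that

$$\|\bar{\lambda}_i(\nabla\phi_i + u_i^*)\|_{0,\Omega} \le \|\bar{\lambda}_i\|_{L^6}\|\nabla\phi_i + u_i^*\|_{L^3} \le C\|\bar{\lambda}_i\|_{1,\Omega},$$

where we have used the embeddings $H^1(\Omega) \hookrightarrow L^6(\Omega) \hookrightarrow L^3(\Omega)$ and the boundedness of $\|u_i^*\|_V$.

By the above we have that

$$\|\bar{u}\|_V \le C\left(\|\bar{\lambda}_1\|_{1,\Omega}^2 + \|\bar{\lambda}_2\|_{1,\Omega}^2\right)^{1/2} \le C'\left(\|\bar{j}_1\|_{0,\Omega}^2 + \|\bar{j}_2\|_{0,\Omega}^2\right)^{1/2}$$

which guarantees the continuity of the operator T.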


We will now use arguments related to the continuity of T to prove the compactness property. Recall that an operator T : H1 → H2 is called compact if it maps bounded sequences in H1 into sequences in H2 that contain convergent subsequences. Let (j1n, j2n) be a bounded sequence of V1 × V1. By the definition of V1 we may extract a subsequence (j1k, j2k) converging strongly in L2(Ω)³ × L2(Ω)³ and weakly in H1(Ω)³ × H1(Ω)³. Consider now the sequence (u1k, u2k) = T(j1k, j2k). By the continuity inequality we have proved above, whose right-hand side involves only the L2 norms of the differences, we see that for any two members $j_{k_1}$, $j_{k_2}$ of the subsequence

$$\|T(j_{k_1}) - T(j_{k_2})\|_{V\times V} \le C''\|j_{k_1} - j_{k_2}\|_{0,\Omega},$$

which guarantees the convergence of the subsequence. This concludes the proof of compactness of the operator T. □

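Using the above lemma we may now show the existence of a solution of the coupled system.

THEOREM 3.1. The coupled system admits a solution satisfying the bounds

$$
\begin{aligned}
\|u_{i\varepsilon}\|_{V\times V} &\le C',\\
\|\lambda_{i\varepsilon}\|_\infty &\le C_1',\\
\|\nabla\lambda_{i\varepsilon}\|_{0,\Omega} &\le C_2\,\varepsilon^{-1/2},\\
\|\lambda_{i\varepsilon}\|_{2,\Omega} &\le C_3\,\varepsilon^{-7/4},\\
\|\nabla p_{i\varepsilon}\|_{0,\Omega} &\le C_2\,\varepsilon^{1/2}.
\end{aligned}
$$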

Proof. We will use Schauder's fixed point theorem. Consider the ball B of V × V defined by

$$B = \{v \in V_1\times V_1 : \|v\|_{V\times V} \le R\},$$

where R is chosen in such a way that it is larger than the bound for (u1, u2) provided by Problem A. Then T(B) ⊂ B and, since T is a compact operator, by Schauder's fixed point theorem we may conclude that there exists a fixed point for the operator T. This fixed point is a solution of the coupled system. A possible choice for the bounds of the solution is

$$C' = \frac{c_0}{c_0 - 1 - \sigma_\infty}\left(\|J_1\|_{0,\Omega} + \|J_2\|_{0,\Omega} + \Big\|\sum_{j=1}^{m}\beta_j^{(1)} q_j\Big\|_{0,\Omega} + \Big\|\sum_{j=1}^{m}\beta_j^{(2)} q_j\Big\|_{0,\Omega}\right)$$

and $C_1' = \sigma_\infty$ where $\sigma_\infty = \max_i(\|\sigma_i\|_\infty)$.

For the estimate of pi we proceed as follows: by taking the divergence of the equations for ui, using the fact that ∇ · ui = 0 and the equation for λiε, we find that

$$\Delta p_{i\varepsilon} = -\nabla\cdot(\lambda_i u_i + J_i) = -\varepsilon\Delta\lambda_{i\varepsilon}.$$

Multiplying by piε and integrating over the whole of Ω we obtain

$$\|\nabla p_i\|_{0,\Omega} \le \varepsilon\|\nabla\lambda_{i\varepsilon}\|_{0,\Omega} \le C_2\,\varepsilon^{1/2}. \tag{6}$$

This concludes the proof of the theorem. □


3.2.4. The Limit as ε → 0

In the treatment of the problem we have regularized Problem B by the use of a small elliptic part in the operator. The final technical part of the proof consists of studying the limit as this term goes to zero, that is the limit as ε → 0. We prove the following lemma.

LEMMA 3.5. The solution of the coupled system (5) converges weakly to a solution of the system (4) as ε → 0.

Proof. We will consider the coupled problem (5) using the boundary condition

$$\sigma_i = \begin{cases} \lambda_{i0} & \text{on } \Gamma_i^-,\\ 0 & \text{on } \Gamma_i^+. \end{cases}$$

Since $\|p_{i\varepsilon}\|_{1,\Omega} \le C\varepsilon^{1/2}$ we may deduce that piε → 0 strongly in $H_0^1(\Omega)$ as ε → 0. Furthermore, since ‖uiε‖V and ‖λiε‖∞ are bounded independently of ε we may conclude the existence of a sequence (λ1ε, λ2ε, u1ε, u2ε) in L∞(Ω) × L∞(Ω) × V × V such that λiε → λi weakly star in L∞(Ω) and uiε → ui weakly in H1(Ω)³ and strongly in L2(Ω)³. Furthermore, by the uniform estimates on ‖λiε‖∞ we conclude that $\|\lambda_i\|_\infty \le \|\lambda_{0i}\|_{\Gamma_i^-}$. From the above considerations we find that the products λiεuiε converge weakly to λiui in L2(Ω)³. We may further deduce the convergence of ∇piε to 0 strongly in L2(Ω)³ (by the bound on ‖piε‖1,Ω). We thus conclude that

$$
\begin{aligned}
\nabla\times u_{1\varepsilon} = \lambda_{1\varepsilon}u_{1\varepsilon} - u_{2\varepsilon} + \nabla p_{1\varepsilon} &\;\to\; \nabla\times u_1 = \lambda_1 u_1 - u_2,\\
\nabla\times u_{2\varepsilon} = \lambda_{2\varepsilon}u_{2\varepsilon} + u_{1\varepsilon} + \nabla p_{2\varepsilon} &\;\to\; \nabla\times u_2 = \lambda_2 u_2 + u_1
\end{aligned}
$$

weakly in L2(Ω)³ × L2(Ω)³. Using the above we may conclude that uiε converge weakly to ui in $V \hookrightarrow H^1(\Omega)^3$. The limiting fields Ui = ui + ∇φi solve the problem

$$
\begin{aligned}
\nabla\times U_1 &= -U_2 + \lambda_1 U_1 && \text{in } \Omega,\\
\nabla\times U_2 &= U_1 + \lambda_2 U_2 && \text{in } \Omega,\\
\nabla\cdot U_1 &= \nabla\cdot U_2 = 0 && \text{in } \Omega,\\
U_1\cdot n &= g_1 && \text{on } \partial\Omega,\\
U_2\cdot n &= g_2 && \text{on } \partial\Omega,\\
\int_{\Sigma_i} U_1\cdot n\,d\sigma &= a_i^{(1)}, &&\\
\int_{\Sigma_i} U_2\cdot n\,d\sigma &= a_i^{(2)}. &&
\end{aligned}
$$

We only need to check now the boundary conditions on $\Gamma_i^-$, i = 1, 2. In direct analogy with the method used in the single curl equations we define two functions $w_i \in C^2(\Omega)$ such that

$$w_i = \begin{cases} w_i & \text{on } \Gamma_i^-,\\ 0 & \text{on } \partial\Omega\setminus\Gamma_i^-. \end{cases}$$


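We now multiply

$$
\begin{aligned}
\nabla\times U_{1\varepsilon} &= \lambda_{1\varepsilon}U_{1\varepsilon} - U_{2\varepsilon} + \nabla p_{1\varepsilon},\\
\nabla\times U_{2\varepsilon} &= \lambda_{2\varepsilon}U_{2\varepsilon} + U_{1\varepsilon} + \nabla p_{2\varepsilon}
\end{aligned}
$$

by ∇w1 and ∇w2 respectively and integrate over Ω. Using integration by parts we conclude that

$$
\begin{aligned}
\langle\nabla\times U_{1\varepsilon}\cdot n, w_1\rangle &\equiv \int_{\partial\Omega}(\nabla\times U_{1\varepsilon})\cdot n\,w_1\,d\sigma\\
&= \int_\Omega \lambda_{1\varepsilon}U_{1\varepsilon}\cdot\nabla w_1\,d\Omega - \int_{\partial\Omega} U_{2\varepsilon}\cdot n\,w_1\,d\sigma + \int_\Omega \nabla p_{1\varepsilon}\cdot\nabla w_1\,d\Omega,\\
\langle\nabla\times U_{2\varepsilon}\cdot n, w_2\rangle &\equiv \int_{\partial\Omega}(\nabla\times U_{2\varepsilon})\cdot n\,w_2\,d\sigma\\
&= \int_\Omega \lambda_{2\varepsilon}U_{2\varepsilon}\cdot\nabla w_2\,d\Omega + \int_{\partial\Omega} U_{1\varepsilon}\cdot n\,w_2\,d\sigma + \int_\Omega \nabla p_{2\varepsilon}\cdot\nabla w_2\,d\Omega,
\end{aligned}
$$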

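where we have used the fact that ∇ · (∇ × Uiε) = 0 and ∇ · Uiε = 0. We must now estimate the integrals $\int_\Omega \lambda_{i\varepsilon}U_{i\varepsilon}\cdot\nabla w_i\,d\Omega$, i = 1, 2. To this end, we set Uiε = uiε + ∇φi and $\tilde{\lambda}_{i\varepsilon} = \lambda_{i\varepsilon} - u_{i0}$ where ui0 satisfies the Dirichlet problem

$$\Delta u_{i0} = 0 \ \text{ in } \Omega, \qquad u_{i0} = \sigma_i \ \text{ on } \partial\Omega.$$

The functions λiε satisfy the equation

$$-\varepsilon\Delta\lambda_{i\varepsilon} + U_{i\varepsilon}\cdot\nabla\lambda_{i\varepsilon} = 0$$

with homogeneous Dirichlet boundary conditions, which upon multiplication by wi and integration over the whole domain yields

$$\int_\Omega \lambda_{i\varepsilon}U_{i\varepsilon}\cdot\nabla w_i\,d\Omega = \varepsilon\int_\Omega \nabla\lambda_{i\varepsilon}\cdot\nabla w_i\,d\Omega - \varepsilon\int_{\Gamma_i^-} w_i\frac{\partial\lambda_{i\varepsilon}}{\partial n}\,d\sigma + \int_{\Gamma_i^-}\lambda_{0i}\,g_i\,w_i\,d\sigma,$$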

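where we have used the definition of the functions wi. We thus conclude that

$$
\begin{aligned}
\langle\nabla\times U_{1\varepsilon}\cdot n, w_1\rangle ={}& \varepsilon\int_\Omega \nabla\lambda_{1\varepsilon}\cdot\nabla w_1\,d\Omega - \varepsilon\int_{\Gamma_1^-} w_1\frac{\partial\lambda_{1\varepsilon}}{\partial n}\,d\sigma\\
&+ \int_{\Gamma_1^-}\lambda_{01}\,g_1\,w_1\,d\sigma - \int_{\Gamma_1^-} g_2\,w_1\,d\sigma + \int_\Omega \nabla p_{1\varepsilon}\cdot\nabla w_1\,d\Omega,\\
\langle\nabla\times U_{2\varepsilon}\cdot n, w_2\rangle ={}& \varepsilon\int_\Omega \nabla\lambda_{2\varepsilon}\cdot\nabla w_2\,d\Omega - \varepsilon\int_{\Gamma_2^-} w_2\frac{\partial\lambda_{2\varepsilon}}{\partial n}\,d\sigma\\
&+ \int_{\Gamma_2^-}\lambda_{02}\,g_2\,w_2\,d\sigma + \int_{\Gamma_2^-} g_1\,w_2\,d\sigma + \int_\Omega \nabla p_{2\varepsilon}\cdot\nabla w_2\,d\Omega.
\end{aligned}
$$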


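By the weak convergence of ∇ × Uiε to ∇ × Ui in L2(Ω)³ and by the vanishing of the divergence of the curl for all ε we have that

$$\langle\nabla\times U_{i\varepsilon}\cdot n, w_i\rangle \longrightarrow \langle\nabla\times U_i\cdot n, w_i\rangle$$

for i = 1, 2. We have also that

$$
\begin{aligned}
\varepsilon\left|\int_\Omega \nabla\lambda_{i\varepsilon}\cdot\nabla w_i\,d\Omega\right| &\le \varepsilon\|\lambda_i\|_{1,\Omega}\|w_i\|_{1,\Omega} \longrightarrow 0,\\
\left|\int_\Omega \nabla p_{i\varepsilon}\cdot\nabla w_i\,d\Omega\right| &\le \|p_i\|_{1,\Omega}\|w_i\|_{1,\Omega} \longrightarrow 0,\\
\varepsilon\left|\int_{\Gamma_i^-}\frac{\partial\lambda_{i\varepsilon}}{\partial n}\,w_i\,d\sigma\right| &\longrightarrow 0.
\end{aligned}
$$

The last limit can be proved in the same way as in [7].

So we may conclude that in the limit as ε → 0 we recover the boundary conditions

$$
\begin{aligned}
(\nabla\times U_1)\cdot n &= \lambda_{01}\,g_1 - g_2 && \text{on } \Gamma_1^-,\\
(\nabla\times U_2)\cdot n &= \lambda_{02}\,g_2 + g_1 && \text{on } \Gamma_2^-.
\end{aligned}
$$

This concludes the proof of the lemma. □

The steps in Sections 3.2.1, 3.2.2, 3.2.3 and 3.2.4 guarantee the existence of a solution to Problem A.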

4. Reduction to the Single Curl Equation

In [20] the existence and uniqueness of equilibrium solutions for the two-fluid equations are obtained in the case of constant coefficients. This is done using a factorization of the double curl equation into two single curl equations. This technique reveals some interesting properties of the equilibria in terms of eigenfunctions of the curl operator (Beltrami fields) [15]. As we will see, in the case of nonconstant coefficients this factorization is no longer possible (except in the case of a special choice of coefficients).

THEOREM 4.1. If the coefficients a(x) and b(x) satisfy the constraint

$$\frac{1}{a(x)} + b(x) = C$$

with C = constant, the double curl equation (3) can be factorized into two equations

$$(\mathrm{curl} - \Lambda_+(x))(\mathrm{curl} - \Lambda_-(x))B = 0,$$

where

$$\Lambda_+(x) = b(x) - \phi, \qquad \Lambda_-(x) = -\frac{1}{a(x)} + \phi$$

with φ = constant and the constants should satisfy the equation

$$\phi^2 - C\phi + 1 = 0.$$

Under the same conditions the solution can be written as a linear combination of two 'nonlinear' Beltrami fields.

Proof. By a simple algebraic manipulation of

$$(\mathrm{curl} - \Lambda_+(x))(\mathrm{curl} - \Lambda_-(x))B = 0,$$

we find that the factorized equation will be equivalent to the double curl equation (3) as long as

$$
\begin{aligned}
\Lambda_+ + \Lambda_- &= -\frac{1}{a(x)} + b(x),\\
\Lambda_+\Lambda_- &= 1 - \frac{b(x)}{a(x)},\\
\nabla\Lambda_- &= -\nabla\left(\frac{1}{a(x)}\right).
\end{aligned}
$$

The first and the third equation imply that

$$\Lambda_- = -\frac{1}{a(x)} + \phi, \qquad \Lambda_+ = b(x) - \phi,$$

where φ is a constant. Substituting these into the second equation we obtain the consistency condition

$$\phi^2 - \left(\frac{1}{a(x)} + b(x)\right)\phi + 1 = 0.$$

This may be true as long as

$$\frac{1}{a(x)} + b(x) = C,$$

where C is a constant. In order to be able to write the general solution of this equation in terms of a linear combination of nonlinear Beltrami fields, i.e. in terms of solutions of the equations

$$L_\pm G_\pm = (\mathrm{curl} - \Lambda_\pm(x))G_\pm = 0,$$

the operators L+ and L− will have to commute. This implies that

$$\nabla\Lambda_+ = \nabla\Lambda_-,$$

which in turn is equivalent to the condition 1/a(x) + b(x) = c, which is consistent with the previous condition. □

In the special case that this consistency condition holds, the results of Boulmezaoud and Amari [7] may be used for the treatment of the existence of equilibria.
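The algebra behind the factorization can be checked numerically at a single point. The sketch below is illustrative only: the function names and sample values are hypothetical. It takes pointwise values of a and b, solves φ² − Cφ + 1 = 0 with C = 1/a + b, forms the two factorization coefficients b − φ and −1/a + φ, and verifies the sum and product conditions used in the proof. It also checks that each coefficient is a root of k² + (1/a − b)k + (1 − b/a) = 0, which is the relation a constant-coefficient Beltrami field with curl G = kG must satisfy in order to solve the double curl equation.

```python
import cmath


def factorization_constants(a, b):
    """Return a list of (phi, lam_plus, lam_minus), one entry per root phi
    of phi**2 - C*phi + 1 = 0, where C = 1/a + b."""
    C = 1.0 / a + b
    disc = cmath.sqrt(C * C - 4.0)  # complex square root: roots are complex when |C| < 2
    results = []
    for phi in ((C + disc) / 2.0, (C - disc) / 2.0):
        lam_plus = b - phi
        lam_minus = -1.0 / a + phi
        results.append((phi, lam_plus, lam_minus))
    return results


def dispersion_residual(k, a, b):
    """Residual of k**2 + (1/a - b)*k + (1 - b/a) for a trial eigenvalue k."""
    return k * k + (1.0 / a - b) * k + (1.0 - b / a)


def check_factorization(a, b, tol=1e-9):
    """Verify the sum and product conditions and the dispersion relation."""
    for _phi, lam_plus, lam_minus in factorization_constants(a, b):
        if abs(lam_plus + lam_minus - (-1.0 / a + b)) > tol:
            return False
        if abs(lam_plus * lam_minus - (1.0 - b / a)) > tol:
            return False
        if abs(dispersion_residual(lam_plus, a, b)) > tol:
            return False
        if abs(dispersion_residual(lam_minus, a, b)) > tol:
            return False
    return True
```

When |C| is smaller than 2 the constant φ, and hence both coefficients, come out complex; `cmath` is used so that this case is handled without special treatment.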

5. Some Interesting Limiting Situations

In this section we present some interesting limiting situations of the double curl equation.

5.1. THE CASE WHERE 1/a(x) = b(x)

In this case the equation takes the simpler form

$$\nabla\times(\nabla\times B) + \left(1 - \frac{b(x)}{a(x)}\right)B + \nabla\left(\frac{1}{a(x)}\right)\times B = 0$$

with boundary conditions

$$B\cdot n = 0 \ \text{ on } \partial\Omega,$$

where n is the normal vector to the boundary ∂Ω.

In the case of constant coefficients this equation becomes equivalent to the vector Helmholtz equation

$$-\Delta B + \left(1 - \frac{b}{a}\right)B = 0.$$

5.2. THE CASE OF EQUAL LENGTH SCALES a(x) = b(x)

In this case the equation becomes

$$\nabla\times(\nabla\times B) + \left(\frac{1}{a(x)} - b(x)\right)\nabla\times B + \nabla\left(\frac{1}{a(x)}\right)\times B = 0.$$

In the case of constant coefficients this equation can be written in terms of the curl of the magnetic field u = ∇ × B as

$$\nabla\times u + \left(\frac{1}{a} - b\right)u = 0$$

so that the magnetic field is a Beltrami field.

This case corresponds physically to the case where the magnetic field and the velocity field of the fluid have the same spatial scales.
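A simple constant-coefficient illustration: the field B = (sin kz, cos kz, 0) is divergence free and satisfies ∇ × B = kB, so u = ∇ × B = kB solves ∇ × u + (1/a − b)u = 0 when k = b − 1/a. The Python sketch below checks this numerically with central differences. The particular field, the function names and the sample values are choices made for this example and are not taken from the paper; u is built from the closed form kB, and only its curl is approximated numerically.

```python
from math import sin, cos


def beltrami_field(k):
    """Return B(x, y, z) = (sin kz, cos kz, 0), which satisfies curl B = k B."""
    def B(x, y, z):
        return (sin(k * z), cos(k * z), 0.0)
    return B


def curl(F, point, h=1e-5):
    """Central-difference approximation of curl F at point = (x, y, z)."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        return (F(*plus)[i] - F(*minus)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def residual(a, b, point):
    """Largest component of curl u + (1/a - b) u for u = k B, k = b - 1/a."""
    k = b - 1.0 / a
    B = beltrami_field(k)

    def u(x, y, z):
        return tuple(k * c for c in B(x, y, z))

    curl_u = curl(u, point)
    u_val = u(*point)
    return max(abs(c + (1.0 / a - b) * v) for c, v in zip(curl_u, u_val))
```

With a = b = 2 the eigenvalue is k = 1.5, and `residual(2.0, 2.0, (0.3, -0.2, 0.7))` is zero up to the finite-difference error.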

5.3. REDUCTION OF THE PROBLEM TO AN ITERATIVE SEQUENCE OF POISSON’S EQUATIONS

In this section we propose a reduction of the problem in the case that the magnetic field and the velocity field are of the same spatial scales. Consider the constant coefficients problem

$$-\Delta B + \omega^2 B = 0, \qquad B\cdot n = 0,$$

where we have set ω² = 1 − b/a. We will assume that ω is small. It is well known that the solution of Helmholtz's equation depends analytically on ω. We may thus assume the power series expansion

$$B(x) = \sum_{m=0}^{\infty}\frac{\omega^m}{m!}\,\Phi_m(x),$$

where Φm are independent of ω. This is in analogy with the low frequency expansion for electromagnetic fields [10]. Substituting this expansion into the Helmholtz equation and equating the coefficients of equal powers of ω we end up with

$$
\begin{aligned}
-\Delta\Phi_m &= 0, && m = 0, 1,\\
-\Delta\Phi_m &= -m(m-1)\,\Phi_{m-2}, && m \ge 2,\\
\Phi_m\cdot n &= 0, && m = 0, 1, 2, \dots
\end{aligned}
$$

The potentials Φ0, Φ1 are harmonic functions and Φm, m ≥ 2, solve Poisson's equation. Using the above expansion we conclude that we may obtain the magnetic fields by iteratively solving a sequence of simple potential problems. This approach to the problem may prove useful for obtaining approximate solutions of the double curl equation, in terms of special functions for special geometries (e.g., cylindrical geometry, spherical geometry, etc.). Such solutions may be used to check numerical algorithms or for other computational purposes.
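The recursion can be illustrated on a one-dimensional toy analogue, which is an assumption made here for illustration and not the three-dimensional boundary value problem of the paper. The function cosh(ωx) solves −B'' + ω²B = 0, and its Taylor expansion in ω has the form Σ ω^m/m! Φm(x) with Φm(x) = x^m for even m and Φm = 0 for odd m. The Python sketch below represents each Φm as a list of polynomial coefficients, checks −Φm'' = −m(m − 1)Φm−2 at a sample point, and compares the truncated series with cosh(ωx). All function names are hypothetical.

```python
from math import cosh, factorial


def phi(m):
    """Coefficients c[k] of x**k for Phi_m in the 1-D toy problem:
    Phi_m(x) = x**m for even m and Phi_m = 0 for odd m."""
    return [0.0] * m + [1.0] if m % 2 == 0 else [0.0]


def second_derivative(coeffs):
    """Coefficients of the second derivative of a polynomial."""
    return [k * (k - 1) * c for k, c in enumerate(coeffs)][2:] or [0.0]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def recursion_holds(m, x=0.7, tol=1e-12):
    """Check -Phi_m'' = 0 for m < 2 and -Phi_m'' = -m(m-1) Phi_{m-2} for m >= 2 at x."""
    lhs = -evaluate(second_derivative(phi(m)), x)
    rhs = 0.0 if m < 2 else -m * (m - 1) * evaluate(phi(m - 2), x)
    return abs(lhs - rhs) < tol


def series_error(omega, x, terms=20):
    """Absolute difference between the truncated series and cosh(omega * x)."""
    partial = sum(omega**m / factorial(m) * evaluate(phi(m), x) for m in range(terms))
    return abs(partial - cosh(omega * x))
```

In this toy setting every Φm is known in closed form, so the check only confirms the bookkeeping of the recursion; in the actual problem each step requires solving a Poisson problem with the boundary condition Φm · n = 0.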

5.4. EXTENSION TO NON-BELTRAMI FIELDS

The equilibrium solutions given by the generalized Beltrami fields in this paper have the property that ∇pe = 0 and pi = −V²/2. The property that the electron pressure is a constant everywhere may be rather restrictive. To remedy this situation we propose a way to obtain more general equilibrium solutions to the ideal MHD equations using as starting point the Beltrami equilibrium solutions. To this end we write

$$V = V_b + V_1, \qquad B = B_b + B_1,$$

where Vb and Bb are the Beltrami solutions obtained here and V1, B1 are (small) perturbations to the Beltrami solutions. We look for equilibrium solutions.

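We substitute this ansatz into the MHD equations and linearize with respect to the fields V1 and B1. Then the equations become

$$
\begin{aligned}
0 &= \nabla\times\{[V_1 - \nabla\times B_1 - \theta_1 B_1]\times B_b\},\\
0 &= \nabla\times\left\{\left[B_1 + \nabla\times V_1 - \frac{1}{\theta_2}V_1\right]\times V_b\right\},
\end{aligned}
$$

where θ1 and θ2 are the Beltrami proportionality factors which are specified by the solution of the original problem. In the above we used the definition of the fields Bb and Vb.

The general solution of the above equations is of the form

$$
\begin{aligned}
(V_1 - \nabla\times B_1 - \theta_1 B_1)\times B_b &= \nabla\psi_1,\\
\left(B_1 + \nabla\times V_1 - \frac{1}{\theta_2}V_1\right)\times V_b &= \nabla\psi_2,
\end{aligned}
$$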

where now ψ1 and ψ2 are ‘arbitrary’ functions of x. It can be seen that for theabove fields, the pressure will satisfy pe = ψ1 and pi = −(V 2/2) + ψ2. Thus,the solution of the above equations for Bb, Vb and ψ1, ψ2 given, will provide uswith an approximate equilibrium solution of the two-fluid model with prescribedpressure characteristics. However, a quick inspection of the above set of equationsshows that the fields ∇ψ1 and ∇ψ2 are constrained by the conditions

Bb · ∇ψ1 = Vb · ∇ψ2 = 0,

that is the pressure gradients should be orthogonal to the (unperturbed) Beltramifields. In the regions where the unperturbed fields are two-dimensional (see Re-mark 2 in Section 3) the surfaces of constant pressure define tori. Other config-urations are also possible. The above constraints result from the linearization ofthe MHD equations. The full nonlinear model is expected to be free from suchlimitations. We will return to the matter of the fully nonlinear model in separatecommunications.

The above equations are formally of the same type as the ones considered in the main part of this paper, and the ones considered in [2] in a different setting.

Acknowledgements

A.N.Y. wishes to acknowledge partial financial support from the Postdoctoral programme of the Hellenic State Scholarship Foundation. I.G.S. and A.N.Y. acknowledge partial financial support from the Special Research Account of the University of Athens (grant no. 70/4/5643).


References

1. Aly, J. J. and Amari, T.: Current sheets in two-dimensional magnetic fields I, II, III, Astronom. Astrophys. 221 (1989), 287–294; 227 (1990), 628–633; 319 (1997), 699–719.

2. Ammari, H. and Nédélec, J.-C.: Small chirality behaviour of solutions to electromagnetic scattering problems in chiral media, Math. Methods Appl. Sci. 21 (1998), 327–359.

3. Biskamp, D.: Nonlinear Magnetohydrodynamics, Cambridge Univ. Press, Cambridge, 1993.

4. Braginskii, S. I.: Transport processes in a plasma, In: M. A. Leontovich (ed.), Reviews of Plasma Physics, Consultants Bureau, New York, 1965, 205–311.

5. Bogoyavlenskij, O. I.: Exact axially symmetric MHD equilibria, C.R. Acad. Sci. Paris Sér. I Math. 331 (2000), 569–574.

6. Boulmezaoud, T. Z.: Etude des champs de Beltrami dans des domaines de R^3 bornés et non bornés et applications en astrophysique, PhD Thesis, Univ. P. M. Curie, Paris, 1998.

7. Boulmezaoud, T. Z. and Amari, T.: On the existence of non-linear force-free fields in three-dimensional domains, Z. Angew. Math. Phys. 51 (2000), 942–967.

8. Boulmezaoud, T. Z. and Amari, T.: Approximation of linear force-free fields in bounded 3-D domains, Math. Comput. Modelling 31 (2000), 109–129.

9. Boulmezaoud, T. Z., Maday, Y. and Amari, T.: On linear force-free fields in bounded and unbounded three-dimensional domains, Math. Modelling Numer. Anal. (M2AN) 33 (1999), 359–393.

10. Dassios, G. and Kleinman, R.: Low Frequency Scattering, Clarendon Press, Oxford, 2000.

11. Dautray, R. and Lions, J.-L.: Mathematical Analysis and Numerical Methods for Science and Technology, Vol. 3, Spectral Theory and Applications, Springer, Berlin, 1991.

12. Davidson, P. A.: An Introduction to Magnetohydrodynamics, Cambridge Univ. Press, Cambridge, 2001.

13. Kravchenko, V. V.: On Beltrami fields with nonconstant proportionality factor, J. Phys. A 36 (2003), 1515–1522.

14. Laurence P. and Stredulinsky, E. W.: Two-dimensional magnetohydrodynamic equilibria with prescribed topology, Comm. Pure Appl. Math. 53 (2000), 1177–1200.

15. Moses, H. E.: Eigenfunctions of the curl operator, rotationally invariant Helmholtz theorem and applications to electromagnetics and fluid dynamics, SIAM J. Appl. Math. 21 (1971), 114–144.

16. Neukirch, T.: Quasiequilibria: a special class of time dependent solutions of the two-dimensional magnetohydrodynamic equations, Phys. Plasmas 2 (1995), 4389–4399.

17. Rappaz, J. and Touzani, R.: On a two-dimensional magnetohydrodynamic problem I: Modelling and analysis, Math. Modelling Numer. Anal. (M2AN) 26 (1991), 347–364.

18. Yoshida, Z.: Application of Beltrami functions in plasma physics, Nonlinear Anal. 30 (1997), 3617–3627.

19. Yoshida, Z. and Giga, Y.: Remarks on spectra of operator rot, Math. Z. 204 (1990), 235–245.

20. Yoshida, Z. and Mahajan, S.: Simultaneous Beltrami conditions in coupled vortex dynamics, J. Math. Phys. 40 (1999), 5080–5091.


Mathematical Physics, Analysis and Geometry 7: 119–149, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Transformation Operators for Sturm–Liouville Operators with Singular Potentials *

Dedicated to Professor V. A. Marchenko on the occasion of his 80th birthday

ROSTYSLAV O. HRYNIV1 and YAROSLAV V. MYKYTYUK2

1 Institute for Applied Problems of Mechanics and Mathematics, 3b Naukova st., 79601 Lviv, Ukraine and Lviv National University, 1 Universytetska st., 79602 Lviv, Ukraine. e-mail: [email protected]

2 Lviv National University, 1 Universytetska st., 79602 Lviv, Ukraine. e-mail: [email protected]

(Received: 26 September 2002; in final form: 4 April 2003)

Abstract. We construct transformation operators for Sturm–Liouville operators with singular potentials from the space $W_2^{-1}(0,1)$ and show that these transformation operators naturally appear during factorisation of Fredholm operators of a special form. Some applications to the spectral analysis of Sturm–Liouville operators with singular potentials under consideration are also given.

Mathematics Subject Classifications (2000): Primary: 34C20; secondary: 34B24, 34L05, 47A68.

Key words: transformation operators, Sturm–Liouville operators, singular potentials.

1. Introduction

In the present work we shall study transformation operators (TOs) for Sturm–Liouville (SL) operators generated in a Hilbert space $H = L_2(0,1)$ by the differential expressions

$$\ell(f) := -f'' + qf \tag{1.1}$$

with complex-valued distributions $q$ from the space $W_2^{-1}(0,1)$. (The precise definitions of the SL operators considered and the TOs are given in the next section.)

Starting from the works by Povzner [25], Marchenko [21], Gelfand and Levitan [9], TOs have been successfully used in the spectral analysis of SL operators and other classical operators of mathematical physics. In particular, TOs have proved to be an important tool for solution of inverse spectral problems for SL operators (see the original papers [9, 21] and the monographs [20, 22, 26] for extended reference lists) and have been thoroughly studied for the case of regular, i.e., locally integrable, potentials $q$. Recent development of the theory of Sturm–Liouville and Schrödinger operators with singular (i.e., not locally integrable) potentials [4, 14–16, 24, 28–30] (see also the books [1] and [2] for general theory and detailed bibliography) has made feasible a thorough spectral analysis for SL operators with potentials from $W_2^{-1}(0,1)$ and, in turn, led to inverse spectral problems for such class of operators. That posed the problem of existence and properties of TOs for SL operators with singular potentials, which we study in detail in the present article. In the subsequent papers [17, 18] we use the TOs constructed to solve the inverse spectral problems for SL operators with potentials from the space $W_2^{-1}(0,1)$. Note that inverse spectral problems for SL operators with nonsmooth coefficients (in particular, for SL operators in the impedance form) were treated in a different manner in, e.g., [3, 5, 7, 13, 27, 31, 32].

* The work was partially supported by the Ukrainian Foundation for Basic Research DFFD under grant no. 01.07/00172.

The main aim of this article is two-fold. Firstly, we shall construct the TOs for SL operators in $H$ with singular potentials $q \in W_2^{-1}(0,1)$ and study their properties. Secondly, we shall point out connection between the TOs constructed and a factorisation problem for Fredholm operators of a special form. Although this connection is known in the regular case, in the singular case under consideration it takes a very explicit form and allows a complete description.

The organisation of the paper is the following. In the next section we briefly introduce the related concepts and give formulation of the main results. Section 3 is devoted to construction of some special TOs, which are then used in Section 4 to construct TOs for SL operators from the class considered and then to study their dependence on the potential. In Section 5 we establish results on connection of TOs with factorisation of some Fredholm operators, and in Section 6 we give some applications of the TOs to the spectral analysis of singular SL operators.

2. Preliminaries and Formulation of Main Results

Throughout the paper we denote by $\operatorname{dom} T$ and $\ker T$ respectively the domain and kernel of an operator $T$ in a Banach space $X$, while $W_p^s([0,1],X)$ and $L_p((0,1),X)$ will stand for the Sobolev and Lebesgue spaces of $X$-valued strongly measurable functions on $[0,1]$. We shall write $W_p^s[0,1]$ and $L_p(0,1)$ instead of $W_p^s([0,1],\mathbb{R})$ and $L_p((0,1),\mathbb{R})$ respectively; in particular, $W_2^{-1}(0,1)$ is the dual space of $W_2^1[0,1]$ with respect to $L_2(0,1)$.

Suppose that $q \in W_2^{-1}(0,1)$; then there exists a function $\sigma \in H$ such that $q = \sigma'$ in the distributional sense, and differential expression (1.1) can be recast in terms of $\sigma$ as

$$\ell_\sigma(f) := -(f' - \sigma f)' - \sigma f' = -\Big(\frac{d}{dx} + \sigma\Big)\Big(\frac{d}{dx} - \sigma\Big) f - \sigma^2 f,$$

where again the derivatives are understood in the sense of distributions. We denote by $T_\sigma$ a differential operator in $H$ given by $T_\sigma f = \ell_\sigma(f)$ on the domain

$$\operatorname{dom} T_\sigma := \{f \in W_2^1[0,1] \mid f^{[1]} \in W_1^1[0,1],\ \ell_\sigma(f) \in H\},$$


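where $f^{[1]} := f' - \sigma f$ is the quasi-derivative of a function $f$. It is shown in [29] that the operator $T_\sigma$ is closed. We consider a family $T_{\sigma,h}$ of restrictions of $T_\sigma$ to the linear manifolds

$$\operatorname{dom} T_{\sigma,h} := \{f \in \operatorname{dom} T_\sigma \mid f^{[1]}(0) = h f(0)\}.$$

Here $h$ is an arbitrary number from the extended complex plane $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, and for $h = \infty$ the above relation is interpreted as

$$\operatorname{dom} T_{\sigma,\infty} := \{f \in \operatorname{dom} T_\sigma \mid f(0) = 0\}.$$

It is easily seen that $T_{\sigma,h} = T_{\sigma+h,0}$ for any $h \in \mathbb{C}$, so that

$$\{T_{\sigma,h} \mid \sigma \in H,\ h \in \overline{\mathbb{C}}\} = \{T_{\sigma,0} \mid \sigma \in H\} \cup \{T_{\sigma,\infty} \mid \sigma \in H\},$$

and it suffices to consider only the cases $h = 0$ and $h = \infty$. We also observe that $T_{\sigma,\infty} = T_{\sigma+h,\infty}$ for any $h \in \mathbb{C}$, so that the operator $T_{\sigma,\infty}$ depends on the equivalence class $\hat\sigma := \sigma + \mathbb{C}$ of $H/\mathbb{C}$ rather than on $\sigma$ itself. It turns out that within each of the orbits $\{T_{\sigma,0} \mid \sigma \in H\}$ and $\{T_{\sigma,\infty} \mid \sigma \in H\}$ all the operators are similar to each other. We recall first some definitions.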
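The factorised form of the differential expression and the identity $T_{\sigma,h} = T_{\sigma+h,0}$ can both be verified by direct expansion. The following is a formal sketch for a constant $h \in \mathbb{C}$, with $f^{[1]} = f' - \sigma f$ and all derivatives taken in the distributional sense as in the text:

```latex
\begin{align*}
-\Big(\frac{d}{dx}+\sigma\Big)\Big(\frac{d}{dx}-\sigma\Big)f - \sigma^2 f
 &= -(f'-\sigma f)' - \sigma(f'-\sigma f) - \sigma^2 f
  = -(f'-\sigma f)' - \sigma f',\\
\ell_{\sigma+h}(f) &= -\big(f'-(\sigma+h)f\big)' - (\sigma+h)f'
  = -(f'-\sigma f)' + hf' - \sigma f' - hf' = \ell_\sigma(f),\\
f' - (\sigma+h)f &= f^{[1]} - hf .
\end{align*}
```

The last line shows that the quasi-derivative taken with respect to $\sigma + h$ vanishes at $0$ exactly when $f^{[1]}(0) = h f(0)$, so the two operators act in the same way on the same domain.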

DEFINITION 2.1. We say that closed and densely defined operators $A$ and $B$ in a Banach space $X$ are similar and write $A \sim B$ if there exists a bounded and boundedly invertible operator $U = U_{A,B}$ (called the transformation operator (TO) for the pair $(A,B)$) such that $AU = UB$.

Remark 2.2. Our definition of TO slightly differs from the one given in [20];despite some shortcomings, it is more convenient for our purposes.

The set $\mathcal{U}(A,B)$ of all TOs for a pair $(A,B)$ has the form $\mathcal{U}(A,B) = \{UV \mid V \in M_B\}$, where $U$ is a fixed TO and $M_B$ denotes the group of all bounded and boundedly invertible operators that commute with $B$. A TO $U$ for a pair $(A,B)$ is said to be unique up to a scalar factor if $\mathcal{U}(A,B) = \{\lambda U \mid \lambda \in \mathbb{C} \setminus \{0\}\}$; according to the above remark this is equivalent to the equality $M_B = \{\lambda I \mid \lambda \in \mathbb{C} \setminus \{0\}\}$.

THEOREM 2.3. Suppose that $\sigma \in H$; then the following statements are true:

(i) $T_{\sigma,0} \sim T_{0,0}$ and $T_{\sigma,\infty} \sim T_{0,\infty}$;

(ii) a transformation operator $U_{\sigma,h}$ for the pair $(T_{\sigma,h}, T_{0,h})$, $h = 0, \infty$, is unique up to a scalar factor and can be chosen as $U_{\sigma,h} = I + K_{\sigma,h}$, where $K_{\sigma,h}$ is an integral Volterra operator of Hilbert–Schmidt class of the form

$$K_{\sigma,h} f(x) = \int_0^x k_{\sigma,h}(x,t) f(t)\,dt; \tag{2.1}$$

(iii) the operators $T_{0,0}$ and $T_{0,\infty}$ are not similar.


Remark 2.4. For the case of Sturm–Liouville operators on semiaxis with $\sigma \in C^1(\mathbb{R}_+)$ claim (i) of the theorem is proved in [20, Theorem 1.2.1].

The operators $K_{\sigma,h}$ and $L_{\sigma,h} := (I + K_{\sigma,h})^{-1} - I$, $h = 0, \infty$, have some other nice properties, to formulate which we have to introduce some functional spaces.

Suppose that $X$ is a Banach space with norm $|\cdot|$; we shall also use the notation $|\cdot|$ for the operator norm in the algebra $B(X)$ of all bounded operators in $X$. We denote by $G_p(X)$, $p \ge 1$, the set of all strongly measurable (classes of equivalent) functions $k$ on $[0,1]^2$ with values in $B(X)$ having the property that $|k| \in L_p((0,1)^2)$ and that the mappings

$$x \mapsto k(x,\cdot) \in L_p((0,1),B(X)), \qquad x \mapsto k(\cdot,x) \in L_p((0,1),B(X))$$

are continuous on the interval $[0,1]$ (i.e., coincide a.e. with some continuous mappings of $[0,1]$ into $L_p((0,1),B(X))$). Also $\mathcal{G}_p(X)$ will denote the set of all integral operators with kernels from $G_p(X)$.

The set $G_p(X)$ becomes a Banach space upon introducing the norm

$$\|k\|_{G_p(X)} := \max\Big\{\max_{x\in[0,1]} \|k(x,\cdot)\|_{L_p((0,1),B(X))},\ \max_{x\in[0,1]} \|k(\cdot,x)\|_{L_p((0,1),B(X))}\Big\},$$

while $\mathcal{G}_p(X)$ turns into a Banach algebra under the norm $\|K\|_{\mathcal{G}_p(X)} := \|k\|_{G_p(X)}$ with $k$ being the kernel of $K$. It is easily seen that $\mathcal{G}_p(X) \subset \mathcal{G}_1(X)$ for all $p \ge 1$ and that the norm $\|\cdot\|_{\mathcal{G}_1(X)}$ coincides with the so-called Holmgren norm [8]; thus every operator $K \in \mathcal{G}_p(X)$, $p \ge 1$, is continuous in all spaces $L_q((0,1),X)$, $q \ge 1$, see [6, Lemma XX.2.5]. Also every $K \in \mathcal{G}_2(X)$ acts continuously from $L_2((0,1),X)$ into the space $C([0,1],X)$, and

$$\|K\|_{L_2((0,1),X)\to C([0,1],X)} \le \|K\|_{\mathcal{G}_2(X)}.$$
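The last bound is an instance of the Cauchy–Schwarz inequality. As a sketch, for an integral operator $K$ with kernel $k$ in the kernel class with $p = 2$, a function $f \in L_2((0,1),X)$ and points $x, y \in [0,1]$:

```latex
\begin{align*}
|Kf(x)| &\le \int_0^1 |k(x,t)|\,|f(t)|\,dt
  \le \|k(x,\cdot)\|_{L_2((0,1),B(X))}\,\|f\|_{L_2((0,1),X)},\\
|Kf(x)-Kf(y)| &\le \|k(x,\cdot)-k(y,\cdot)\|_{L_2((0,1),B(X))}\,\|f\|_{L_2((0,1),X)} .
\end{align*}
```

Taking the maximum over $x$ in the first line gives the norm estimate, and the second line, together with the assumed continuity of $x \mapsto k(x,\cdot)$, shows that $Kf$ is continuous on $[0,1]$.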

Put $\Omega^+ := \{(x,t) \in (0,1)^2 \mid x > t\}$, $\Omega^- := \{(x,t) \in (0,1)^2 \mid x < t\}$ and denote by $\mathcal{G}_p^+(X)$ and $\mathcal{G}_p^-(X)$ the subspaces of $\mathcal{G}_p(X)$ consisting of all operators $K$ whose kernels $k$ satisfy the condition $k(x,t) = 0$ a.e. in $\Omega^-$ and $k(x,t) = 0$ a.e. in $\Omega^+$, respectively. $\mathcal{G}_p^+(X)$ and $\mathcal{G}_p^-(X)$ form closed subalgebras of $\mathcal{G}_p(X)$ and, moreover, $\mathcal{G}_p(X) = \mathcal{G}_p^+(X) \dotplus \mathcal{G}_p^-(X)$. Finally we observe that every element of $\mathcal{G}_p^\pm(X)$ is a Volterra operator in the space $L_q((0,1),X)$ with any $q \ge 1$.

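THEOREM 2.5. For any $\sigma \in H$ and $h = 0$ or $h = \infty$ the operators $K_{\sigma,h}$ and $L_{\sigma,h} = (I + K_{\sigma,h})^{-1} - I$ belong to $\mathcal{G}_2^+ := \mathcal{G}_2^+(\mathbb{C})$, and the mappings

$$H \ni \sigma \mapsto K_{\sigma,h} \in \mathcal{G}_2^+, \qquad H \ni \sigma \mapsto L_{\sigma,h} \in \mathcal{G}_2^+$$

are continuous from $H$ into $\mathcal{G}_2^+$.

Consider now the sets of operators

$$\mathcal{K}_0 := \{K_{\sigma,0} \mid \sigma \in H\}, \qquad \mathcal{K}_\infty := \{K_{\sigma,\infty} \mid \sigma \in H\}.$$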


It turns out that these sets can be completely described in terms of the factorisation theory of Fredholm operators. We shall briefly recall its main notions, referring the reader to the books [10, 12] for further details.

Let $S_2$ denote the ideal of all Hilbert–Schmidt operators in $H$. Recall that every operator in $S_2$ is an integral operator; we denote by $S_2^+$ ($S_2^-$) the set of those $K \in S_2$ whose kernels $k$ vanish on $\Omega^-$ (vanish on $\Omega^+$, respectively). It is obvious that $S_2 = S_2^+ \oplus S_2^-$ and that the operators from $S_2^+$ and $S_2^-$ are Volterra ones. Also, the inclusions $\mathcal{G}_2(\mathbb{C}) \subset S_2$, $\mathcal{G}_2^\pm(\mathbb{C}) \subset S_2^\pm$ hold and, moreover, $\|K\|_{S_2} \le \|K\|_{\mathcal{G}_2(\mathbb{C})}$.

DEFINITION 2.6. We say that an operator $I + Q$ with $Q \in S_2$ admits factorisation (or is factorisable) if there exist operators $K_+ \in S_2^+$ and $K_- \in S_2^-$ such that

$$I + Q = (I + K_+)^{-1}(I + K_-)^{-1}.$$

Note that an operator $I + Q$ can admit at most one factorisation, so that the operators $K_\pm = K_\pm(Q)$ are determined uniquely by $Q$. We denote by $\mathcal{F}_2$ the set of those $Q \in S_2$ for which $I + Q$ is factorisable. The results of [23] imply the following statement.

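PROPOSITION 2.7. Put $\mathcal{G}_2 := \mathcal{G}_2(\mathbb{C})$; then

(i) the set $\mathcal{G}_2 \cap \mathcal{F}_2$ is open and everywhere dense in $\mathcal{G}_2$;

(ii) for every $Q \in \mathcal{G}_2 \cap \mathcal{F}_2$ the operators $K_\pm(Q)$ belong to $\mathcal{G}_2^\pm$ and the operator-valued mappings

$$\mathcal{G}_2 \cap \mathcal{F}_2 \ni Q \mapsto K_\pm(Q) \in \mathcal{G}_2^\pm$$

are locally uniformly continuous.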

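It turns out that the sets $\mathcal{K}_0$ and $\mathcal{K}_\infty$ can be described as ranges of $K_+(\cdot)$ when the argument $Q$ runs through some special sets of operators in $\mathcal{F}_2$. Namely, for $\phi \in L_2(0,2)$ we denote by $F_{\phi,0}$ and $F_{\phi,\infty}$ integral operators in $\mathcal{G}_2$ with kernels $\phi(x+t) + \phi(|x-t|)$ and $\phi(x+t) - \phi(|x-t|)$ respectively, i.e.,

$$F_{\phi,0} f(x) := \int_0^1 \big(\phi(x+t) + \phi(|x-t|)\big) f(t)\,dt,$$

$$F_{\phi,\infty} f(x) := \int_0^1 \big(\phi(x+t) - \phi(|x-t|)\big) f(t)\,dt.$$

We observe that, just like for $T_{\sigma,\infty}$, for any $h \in \mathbb{C}$ we have $F_{\phi+h,\infty} = F_{\phi,\infty}$, so that $F_{\phi,\infty}$ depends on the equivalence class $\hat\phi := \phi + \mathbb{C}$ in $L_2(0,2)/\mathbb{C}$ rather than on $\phi$ itself. Put for $h = 0, \infty$

$$\mathfrak{F}_h := \{\phi \in L_2(0,2) \mid F_{\phi,h} \in \mathcal{F}_2\};$$

then by Proposition 2.7 the sets $\mathfrak{F}_0$ and $\mathfrak{F}_\infty$ are open in $L_2(0,2)$.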
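The invariance of the operator with the difference kernel under constant shifts of $\phi$ is immediate from its kernel. As a short check for a constant $h \in \mathbb{C}$, written here for both kernels for comparison:

```latex
\begin{align*}
(\phi+h)(x+t) - (\phi+h)(|x-t|) &= \phi(x+t) - \phi(|x-t|),\\
(\phi+h)(x+t) + (\phi+h)(|x-t|) &= \phi(x+t) + \phi(|x-t|) + 2h .
\end{align*}
```

The constant cancels only in the difference kernel, which is why the quotient by constants appears for $h = \infty$ and not for $h = 0$.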


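THEOREM 2.8. The following equalities hold:

$$\mathcal{K}_0 = \{K_+(F_{\phi,0}) \mid \phi \in \mathfrak{F}_0\}, \qquad \mathcal{K}_\infty = \{K_+(F_{\phi,\infty}) \mid \phi \in \mathfrak{F}_\infty\}.$$

In other words, for every $\sigma \in H$ the operator $K_{\sigma,h}$ can be obtained as a result of factorisation of $I + F_{\phi,h}$ for some $\phi \in L_2(0,2)$ and, conversely, for every $\phi \in L_2(0,2)$ such that $I + F_{\phi,h}$ is factorisable, the operator $I + K_+(F_{\phi,h})$ is a TO for some SL operator $T_{\sigma,h}$. Thus Theorem 2.8 states that there exist bijections $\Phi_0\colon H \to \mathfrak{F}_0$ and $\hat\Phi_\infty\colon H/\mathbb{C} \to \mathfrak{F}_\infty/\mathbb{C}$, given by

$$H \ni \sigma \mapsto \phi_0 =: \Phi_0(\sigma) \in \mathfrak{F}_0 \quad\text{and}\quad H/\mathbb{C} \ni \hat\sigma \mapsto \hat\phi_\infty =: \hat\Phi_\infty(\hat\sigma) \in \mathfrak{F}_\infty/\mathbb{C}$$

respectively. In fact, $\hat\Phi_\infty$ can be lifted to a mapping $\Phi_\infty$ between $H$ and $\mathfrak{F}_\infty$ and, moreover, the maps $\Phi_0$ and $\Phi_\infty$ can be made more explicit.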

THEOREM 2.9. The mappings $\Phi_h$, $h = 0, \infty$, are homeomorphic, and, moreover, with $\phi_{\sigma,h} := \Phi_h(\sigma)$ it holds

$$\phi_{\sigma,h}(2x) = -\frac{1}{2}\sigma(x) + \int_0^x l_{\sigma,h}(x,t)^2\,dt,$$

where $l_{\sigma,h}$ is the kernel of the integral operator $L_{\sigma,h} = (I + K_{\sigma,h})^{-1} - I$.

3. Some Special Transformation Operators

The aim of this section is to construct some special TOs for first order systems of differential equations on an interval. In the next section these TOs will be used to find TOs for SL operators with singular potentials from the space $W_2^{-1}(0,1)$.

The systems we shall consider here arise in a very natural way. Namely, according to our definition of $\ell_\sigma$ the equality $\ell_\sigma(u) = v$ is to be interpreted as $-(u^{[1]})' - \sigma u^{[1]} - \sigma^2 u = v$ with $u^{[1]} = u' - \sigma u$, or, in other words, as the first order system

$$\frac{d}{dx}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + V(x)\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} 0 \\ -v \end{pmatrix}, \qquad V(x) := \begin{pmatrix} -\sigma(x) & -1 \\ \sigma^2(x) & \sigma(x) \end{pmatrix}.$$

Observe that for $\sigma \in H$ the entries of $V(x)$ are integrable functions so that any solution to the above system is absolutely continuous and enjoys the standard uniqueness properties. It is reasonable to expect that TOs for the SL operators with singular potentials could be constructed through the analogous TOs for the operator $\frac{d}{dx} + V$ in the space $H \times H$. More precisely, we shall seek bounded operators $A^\pm$ such that

$$\Big(\frac{d}{dx} \pm V\Big) A^\pm \mathbf{f} = A^\mp \mathbf{f}'$$


for any function $\mathbf{f} \in W_2^1[0,1] \times W_2^1[0,1]$ with $\mathbf{f}(0) = 0$. Since $V^2 = 0$, the upper-left components of $\big(\frac{d}{dx} \mp V\big)\big(\frac{d}{dx} \pm V\big)$ coincide with $-\ell_{\pm\sigma}$, and hence the upper-left components of $A^\pm$ are strongly connected with the TOs for $T_{\pm\sigma}$.

In fact, without any technical complication we can consider a more abstract setting. Suppose that $X$ is a Hilbert space with norm $|\cdot|$ and that $v(\cdot)$ is a function from $L_1((0,1),B(X))$. We denote by $V$ an operator from $L_\infty((0,1),X)$ into $L_1((0,1),X)$ given by 'pointwise multiplication', $(Vf)(x) = v(x)f(x)$, and by $\|\cdot\|_p$ the norm in the space $L_p((0,1),B(X))$, $p \ge 1$.
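The two algebraic facts used for the matrix potential of the Sturm–Liouville case, namely $V^2 = 0$ and the form of the upper-left component of the product of the two first order operators, can be checked by hand. The computation below is a formal sketch that assumes $\sigma$ is smooth, so that $V'$ exists in the classical sense; for $\sigma \in H$ the same identities are to be read in the distributional sense.

```latex
\begin{align*}
V^2 &= \begin{pmatrix} -\sigma & -1\\ \sigma^2 & \sigma\end{pmatrix}
       \begin{pmatrix} -\sigma & -1\\ \sigma^2 & \sigma\end{pmatrix}
    = \begin{pmatrix} \sigma^2-\sigma^2 & \sigma-\sigma\\
                      -\sigma^3+\sigma^3 & -\sigma^2+\sigma^2\end{pmatrix} = 0,\\
\Big(\frac{d}{dx}\mp V\Big)\Big(\frac{d}{dx}\pm V\Big)\mathbf{f}
   &= \mathbf{f}'' \pm (V\mathbf{f})' \mp V\mathbf{f}' - V^2\mathbf{f}
    = \mathbf{f}'' \pm V'\mathbf{f},\\
V' &= \begin{pmatrix} -\sigma' & 0\\ 2\sigma\sigma' & \sigma'\end{pmatrix},
\qquad
f'' \mp \sigma' f = -\ell_{\pm\sigma}(f).
\end{align*}
```

Since the upper-right entry of $V'$ vanishes, the upper-left component of the product acts on $f$ as $f'' \mp \sigma' f$, which is $-\ell_{\pm\sigma}(f)$ with $q = \pm\sigma'$.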

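THEOREM 3.1. There exist operators $A^\pm \in B(L_2((0,1),X))$ such that the equalities

$$\begin{aligned}
\Big(\frac{d}{dx} + V\Big) A^+ f &= A^- f',\\
\Big(\frac{d}{dx} - V\Big) A^- f &= A^+ f'
\end{aligned} \tag{3.1}$$

hold for all $f \in W_2^1([0,1],X)$ with $f(0) = 0$. The operators $A^\pm$ have the form

$$A^\pm f(x) = f(x) + \int_0^x a^\pm(x,t) f(t)\,dt, \qquad f \in L_2((0,1),X), \tag{3.2}$$

and $(A^\pm - I) \in \mathcal{G}_1^+(X)$.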

The proof of the theorem will rely on several lemmata. Consider first the case where $v$ is smooth, i.e., $v \in C^\infty([0,1],B(X))$. Putting

$$B^+ := \tfrac{1}{2}(A^+ + A^-), \qquad B^- := \tfrac{1}{2}(A^+ - A^-), \tag{3.3}$$

we rewrite system (3.1) as

$$\begin{aligned}
\frac{d}{dx}(B^+ f) + V B^- f &= B^+ f',\\
\frac{d}{dx}(B^- f) + V B^+ f &= -B^- f'.
\end{aligned} \tag{3.4}$$
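System (3.4) is obtained from (3.1) by taking half the sum and half the difference of its two equations. As a brief sketch, using $A^+ = B^+ + B^-$ and $A^- = B^+ - B^-$:

```latex
\begin{align*}
\tfrac12\big[(A^+f)' + (A^-f)'\big] + \tfrac12 V(A^+ - A^-)f
   &= \tfrac12 (A^- + A^+)f'
   &&\Longrightarrow& (B^+f)' + VB^-f &= B^+f',\\
\tfrac12\big[(A^+f)' - (A^-f)'\big] + \tfrac12 V(A^+ + A^-)f
   &= \tfrac12 (A^- - A^+)f'
   &&\Longrightarrow& (B^-f)' + VB^+f &= -B^-f'.
\end{align*}
```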

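In terms of the kernels $b^\pm := (a^+ \pm a^-)/2$ of the operators $B^\pm$ (assuming that they are continuously differentiable in $\Omega^+$) these equations read (recall that $f(0) = 0$)

$$\begin{aligned}
\int_0^x \Big\{\Big(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\Big) b^+(x,t) + v(x) b^-(x,t)\Big\} f(t)\,dt &= 0,\\
\int_0^x \Big\{\Big(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\Big) b^-(x,t) + v(x) b^+(x,t)\Big\} f(t)\,dt &= -\big(v(x) + 2b^-(x,x)\big) f(x).
\end{aligned} \tag{3.5}$$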


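It is easily seen that relations (3.5) hold for all $f \in W_2^1([0,1],X)$ with $f(0) = 0$ if the kernels $b^\pm$ satisfy the system

$$\begin{aligned}
b^+(x,t) &= -\int_{x-t}^{x} v(\xi)\, b^-(\xi, \xi - x + t)\,d\xi,\\
b^-(x,t) &= -\int_{\frac{x+t}{2}}^{x} v(\eta)\, b^+(\eta, x + t - \eta)\,d\eta - \frac{1}{2} v\Big(\frac{x+t}{2}\Big).
\end{aligned} \tag{3.6}$$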
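To see why system (3.6) is sufficient for (3.5), one can differentiate the integral equations along the characteristic directions. The following sketch assumes, as in the text, that $v$ and $b^\pm$ are smooth enough for the Leibniz rule to apply. The integrands depend on $x$ and $t$ only through $\xi - x + t$ and $x + t - \eta$, which are annihilated by $\partial_x + \partial_t$ and $\partial_x - \partial_t$ respectively, and so are the lower limits; only the upper limit $x$ contributes:

```latex
\begin{align*}
\Big(\frac{\partial}{\partial x}+\frac{\partial}{\partial t}\Big) b^+(x,t)
   &= -v(x)\,b^-(x,t),\\
\Big(\frac{\partial}{\partial x}-\frac{\partial}{\partial t}\Big) b^-(x,t)
   &= -v(x)\,b^+(x,t),\\
b^-(x,x) &= -\tfrac12\, v(x).
\end{align*}
```

The first two lines make both integrands in (3.5) vanish, and the third makes the right-hand side of the second relation in (3.5) vanish.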

For convenience we assume the kernels $a^\pm$ and $b^\pm$ to be extended by zero outside the domain $\Omega^+$. We find a solution of system (3.6) by the method of successive approximations in the form

$$b^\pm = \sum_{n=1}^{\infty} b_n^\pm, \tag{3.7}$$

where the kernels $b_n^\pm$, $n \in \mathbb{N}$, satisfy the following recurrence relations

$$\begin{aligned}
b_1^-(x,t) &= -\frac{1}{2} v\Big(\frac{x+t}{2}\Big),\\
b_n^+(x,t) &= -\int_{x-t}^{x} v(\xi)\, b_n^-(\xi, \xi - x + t)\,d\xi,\\
b_{n+1}^-(x,t) &= -\int_{\frac{x+t}{2}}^{x} v(\eta)\, b_n^+(\eta, x + t - \eta)\,d\eta
\end{aligned} \tag{3.8}$$

for $(x,t) \in \Omega^+$ and equal zero otherwise.

Remark 3.2. Suppose that $v \in C^\infty([0,1],B(X))$ and denote $C := \max_{x\in(0,1)} |v(x)| + \max_{x\in(0,1)} |v'(x)|$. The standard induction arguments applied to (3.8) show that the functions $b_n^\pm$ are continuously differentiable in $\Omega^+$ and also yield the following inequalities for all $n \in \mathbb{N}$ and all $(x,t) \in \Omega^+$:

$$\begin{aligned}
|b_n^-(x,t)| &\le \frac{1}{2}\,\frac{C^{2n-1}}{(2n-2)!}\, x^{2n-2},\\
|b_n^+(x,t)| &\le \frac{1}{2}\,\frac{C^{2n}}{(2n-1)!}\, x^{2n-1},\\
\Big|\frac{\partial}{\partial x} b_n^-(x,t)\Big|,\ \Big|\frac{\partial}{\partial t} b_n^-(x,t)\Big| &\le \frac{(2n-1)\,C^{2n-1}}{(2n-2)!}\,(1+x)^{2n-2},\\
\Big|\frac{\partial}{\partial x} b_n^+(x,t)\Big|,\ \Big|\frac{\partial}{\partial t} b_n^+(x,t)\Big| &\le \frac{(2n)\,C^{2n}}{(2n-1)!}\,(1+x)^{2n-1}.
\end{aligned}$$

This implies that series (3.7) as well as the series obtained after term-by-term differentiation of (3.7) in $x$ or $t$ converge uniformly in the domain $\Omega^+$. Hence the functions $b^\pm$ and $a^\pm$ are continuously differentiable in $\Omega^+$ and the operators $A^\pm$ given by formulae (3.2) satisfy equalities (3.1).


Our ultimate goal is to construct the operators $A^\pm$ satisfying equalities (3.1) for an arbitrary $v \in L_1((0,1),B(X))$, and to this end we shall study convergence of (3.7) in a more suitable norm.

LEMMA 3.3. Suppose that $v \in C^\infty([0,1],B(X))$. Then the functions $b_n^\pm$, $n \in \mathbb{N}$, verify the inequalities

$$\begin{aligned}
\|b_n^-(x,\cdot)\|_1 &\le \frac{1}{(2n-1)!}\Big(\int_0^x |v(\xi)|\,d\xi\Big)^{2n-1},\\
\|b_n^+(x,\cdot)\|_1 &\le \frac{1}{(2n)!}\Big(\int_0^x |v(\xi)|\,d\xi\Big)^{2n}.
\end{aligned} \tag{3.9}$$

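Proof. Using recurrence formulae (3.8), we find that

$$\begin{aligned}
\|b_1^-(x,\cdot)\|_1 &= \frac{1}{2}\int_0^x |v(x/2 + t/2)|\,dt \le \int_0^x |v(\xi)|\,d\xi,\\
\|b_n^+(x,\cdot)\|_1 &\le \int_0^x dt \int_{x-t}^{x} |v(\xi)|\,|b_n^-(\xi, \xi - x + t)|\,d\xi\\
&\le \int_0^x d\xi\,|v(\xi)| \int_{x-\xi}^{x} |b_n^-(\xi, \xi - x + t)|\,dt\\
&\le \int_0^x |v(\xi)|\,\|b_n^-(\xi,\cdot)\|_1\,d\xi,\\
\|b_{n+1}^-(x,\cdot)\|_1 &\le \int_0^x dt \int_{\frac{x+t}{2}}^{x} |v(\eta)|\,|b_n^+(\eta, x + t - \eta)|\,d\eta\\
&\le \int_{\frac{x}{2}}^{x} d\eta\,|v(\eta)| \int_0^{2\eta - x} |b_n^+(\eta, x + t - \eta)|\,dt\\
&\le \int_0^x |v(\eta)|\,\|b_n^+(\eta,\cdot)\|_1\,d\eta.
\end{aligned}$$

Inequalities (3.9) follow now by induction if we use the identity

$$\frac{1}{n!}\int_0^x |v(\xi)| \Big(\int_0^\xi |v(\tau)|\,d\tau\Big)^{n} d\xi = \frac{1}{(n+1)!}\Big(\int_0^x |v(\tau)|\,d\tau\Big)^{n+1}. \tag{3.10}$$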
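Identity (3.10) and the induction step can be spelled out as follows. This is only a sketch; the abbreviation $w(x) := \int_0^x |v(\tau)|\,d\tau$ is introduced here for convenience and is not notation from the paper. Since $w' = |v|$ almost everywhere and $w(0) = 0$, the identity is the fundamental theorem of calculus applied to $w^{n+1}$:

```latex
\begin{align*}
\frac{d}{dx}\,\frac{w(x)^{n+1}}{(n+1)!} &= \frac{|v(x)|\,w(x)^{n}}{n!}
 \quad\Longrightarrow\quad
 \frac{1}{n!}\int_0^x |v(\xi)|\,w(\xi)^n\,d\xi = \frac{w(x)^{n+1}}{(n+1)!},\\
\|b_n^-(\xi,\cdot)\|_1 \le \frac{w(\xi)^{2n-1}}{(2n-1)!}
 &\quad\Longrightarrow\quad
 \|b_n^+(x,\cdot)\|_1 \le \int_0^x |v(\xi)|\,\frac{w(\xi)^{2n-1}}{(2n-1)!}\,d\xi
   = \frac{w(x)^{2n}}{(2n)!},\\
\|b_n^+(\eta,\cdot)\|_1 \le \frac{w(\eta)^{2n}}{(2n)!}
 &\quad\Longrightarrow\quad
 \|b_{n+1}^-(x,\cdot)\|_1 \le \int_0^x |v(\eta)|\,\frac{w(\eta)^{2n}}{(2n)!}\,d\eta
   = \frac{w(x)^{2n+1}}{(2n+1)!}.
\end{align*}
```

The base case is the bound $\|b_1^-(x,\cdot)\|_1 \le w(x)$ obtained in the first line of the proof.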

The lemma is proved. □

LEMMA 3.4. Suppose that $v \in C^\infty([0,1],B(X))$; then the functions $b^+$, $b^-$ belong to $G_1(X)$ and

$$\|b^-\|_{G_1(X)} \le \sinh(\|v\|_1), \qquad \|b^+\|_{G_1(X)} \le \cosh(\|v\|_1) - 1. \tag{3.11}$$

Remark 3.5. Observe that since the functions $b^+$ and $b^-$ are continuous, they belong to the spaces $G_p(X)$ for all $p \ge 1$; however, inequalities analogous to (3.11) with $p > 1$ in general do not hold.


Proof of Lemma 3.4. Integrating relations (3.8) in $x$ and using Fubini's theorem, we get

$$\begin{aligned}
\|b_1^-(\cdot,t)\|_1 &\le \int_0^1 |v(\xi)|\,d\xi,\\
\|b_n^+(\cdot,t)\|_1 &\le \int_t^1 dx \int_{x-t}^{x} |v(\xi)|\,|b_n^-(\xi, \xi - x + t)|\,d\xi\\
&\le \int_0^1 d\xi\,|v(\xi)| \int_{\xi}^{\xi+t} |b_n^-(\xi, \xi - x + t)|\,dx\\
&\le \int_0^1 |v(\xi)|\,\|b_n^-(\xi,\cdot)\|_1\,d\xi,\\
\|b_{n+1}^-(\cdot,t)\|_1 &\le \int_t^1 dx \int_{\frac{x+t}{2}}^{x} |v(\eta)|\,|b_n^+(\eta, x + t - \eta)|\,d\eta\\
&\le \int_t^1 d\eta\,|v(\eta)| \int_{\eta}^{2\eta - t} |b_n^+(\eta, x + t - \eta)|\,dx\\
&\le \int_0^1 |v(\eta)|\,\|b_n^+(\eta,\cdot)\|_1\,d\eta.
\end{aligned}$$

Recalling (3.9) and using (3.10), we easily establish the inequalities

$$\|b_n^+\|_{G_1(X)} \le \frac{1}{(2n)!}\|v\|_1^{2n}, \qquad \|b_n^-\|_{G_1(X)} \le \frac{1}{(2n-1)!}\|v\|_1^{2n-1},$$

and relations (3.11) follow. □

Now we show that the functions $b^\pm \in G_1(X)$ depend continuously on the function $v$ in the $L_1((0,1),B(X))$-norm.
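For the record, the passage from the termwise bounds in the proof of Lemma 3.4 to the estimates (3.11) is just the summation of the odd and even parts of the exponential series. A short sketch, using the triangle inequality in the kernel norm:

```latex
\begin{align*}
\|b^-\|_{G_1(X)} &\le \sum_{n=1}^{\infty} \|b_n^-\|_{G_1(X)}
   \le \sum_{n=1}^{\infty} \frac{\|v\|_1^{2n-1}}{(2n-1)!} = \sinh(\|v\|_1),\\
\|b^+\|_{G_1(X)} &\le \sum_{n=1}^{\infty} \|b_n^+\|_{G_1(X)}
   \le \sum_{n=1}^{\infty} \frac{\|v\|_1^{2n}}{(2n)!} = \cosh(\|v\|_1) - 1 .
\end{align*}
```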

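LEMMA 3.6. Assume that $v, \tilde v \in C^\infty([0,1],B(X))$ and let $b_n^\pm = b_{n,v}^\pm$ and $\tilde b_n^\pm = b_{n,\tilde v}^\pm$ be the corresponding kernels constructed as above. Then for every $n \in \mathbb{N}$ and all $x \in [0,1]$ the following inequalities are satisfied:

$$\begin{aligned}
\|b_n^-(x,\cdot) - \tilde b_n^-(x,\cdot)\|_1 &\le \frac{\|v - \tilde v\|_1}{(2n-2)!}\Big(\int_0^x \big(|v(\xi)| + |\tilde v(\xi)|\big)\,d\xi\Big)^{2n-2},\\
\|b_n^+(x,\cdot) - \tilde b_n^+(x,\cdot)\|_1 &\le \frac{\|v - \tilde v\|_1}{(2n-1)!}\Big(\int_0^x \big(|v(\xi)| + |\tilde v(\xi)|\big)\,d\xi\Big)^{2n-1}.
\end{aligned} \tag{3.12}$$

Proof. Applying the arguments of the proof of Lemma 3.3 to $b_n^\pm$ and $\tilde b_n^\pm$ results in

$$\|b_1^-(x,\cdot) - \tilde b_1^-(x,\cdot)\|_1 \le \int_0^x |v(\xi) - \tilde v(\xi)|\,d\xi,$$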


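$$\begin{aligned}
\|b_n^+(x,\cdot) - \tilde b_n^+(x,\cdot)\|_1 &\le \int_0^x |v(\xi) - \tilde v(\xi)|\,\|b_n^-(\xi,\cdot)\|_1\,d\xi + \int_0^x |\tilde v(\xi)|\,\|b_n^-(\xi,\cdot) - \tilde b_n^-(\xi,\cdot)\|_1\,d\xi,\\
\|b_{n+1}^-(x,\cdot) - \tilde b_{n+1}^-(x,\cdot)\|_1 &\le \int_0^x |v(\eta) - \tilde v(\eta)|\,\|b_n^+(\eta,\cdot)\|_1\,d\eta + \int_0^x |\tilde v(\eta)|\,\|b_n^+(\eta,\cdot) - \tilde b_n^+(\eta,\cdot)\|_1\,d\eta.
\end{aligned}$$

Using Lemma 3.3 and the induction assumption for $\|b_n^-(x,\cdot) - \tilde b_n^-(x,\cdot)\|_1$, we get

$$\begin{aligned}
\|b_n^+(x,\cdot) - \tilde b_n^+(x,\cdot)\|_1
&\le \frac{\|v - \tilde v\|_1}{(2n-1)!}\Big(\int_0^x |v(\xi)|\,d\xi\Big)^{2n-1} + \frac{\|v - \tilde v\|_1}{(2n-2)!}\int_0^x |\tilde v(\xi)| \Big(\int_0^\xi \big(|v(u)| + |\tilde v(u)|\big)\,du\Big)^{2n-2} d\xi\\
&\le \frac{\|v - \tilde v\|_1}{(2n-2)!}\int_0^x \big(|v(\xi)| + |\tilde v(\xi)|\big)\Big(\int_0^\xi \big(|v(u)| + |\tilde v(u)|\big)\,du\Big)^{2n-2} d\xi\\
&= \frac{\|v - \tilde v\|_1}{(2n-1)!}\Big(\int_0^x \big(|v(u)| + |\tilde v(u)|\big)\,du\Big)^{2n-1}.
\end{aligned}$$

In the same manner we obtain the required estimate for $\|b_{n+1}^-(x,\cdot) - \tilde b_{n+1}^-(x,\cdot)\|_1$ based on that for $\|b_n^+(x,\cdot) - \tilde b_n^+(x,\cdot)\|_1$, and the proof by induction is complete. □

Modifying similarly the arguments of Lemma 3.4 and using Lemma 3.6, we arrive at the following conclusion.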

LEMMA 3.7. Suppose that $v, \tilde v \in C^\infty([0,1],B(X))$ and that $b^\pm$, $\tilde b^\pm$ are the corresponding kernels. Then

$$\|b^+ - \tilde b^+\|_{G_1(X)} \le \|v - \tilde v\|_1 \sinh(\|v\|_1 + \|\tilde v\|_1),$$

$$\|b^- - \tilde b^-\|_{G_1(X)} \le \|v - \tilde v\|_1 \cosh(\|v\|_1 + \|\tilde v\|_1).$$

We now return to the kernels $a^\pm$ and the corresponding operators $A^\pm$. Recall that by definition

$$a^+ = b^+ + b^-, \qquad a^- = b^+ - b^-.$$

Denote by $a_v^\pm$ (respectively $A_v^\pm$) the kernels (respectively operators) that correspond to $v \in C^\infty([0,1],B(X))$. Since the space $G_1(X)$ is complete, Lemma 3.7 and extension by continuity yield the following statement.


COROLLARY 3.8. The mappings $v \mapsto a_v^\pm$ extend uniquely to continuous functions

$$L_1((0,1),B(X)) \ni v \mapsto a_v^\pm \in G_1(X),$$

and for arbitrary $v, \tilde v \in L_1((0,1),B(X))$ it holds

$$\|a_v^\pm - a_{\tilde v}^\pm\|_{G_1(X)} \le \|v - \tilde v\|_1 \exp(\|v\|_1 + \|\tilde v\|_1).$$
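The exponential in this estimate comes from adding the two bounds of Lemma 3.7. As a sketch for smooth $v$ and $\tilde v$, with the shorthand $y := \|v\|_1 + \|\tilde v\|_1$ (introduced here only for brevity) and with $\tilde a^\pm$, $\tilde b^\pm$ denoting the kernels built from $\tilde v$:

```latex
\begin{align*}
\|a^\pm - \tilde a^\pm\|_{G_1(X)}
  &\le \|b^+ - \tilde b^+\|_{G_1(X)} + \|b^- - \tilde b^-\|_{G_1(X)}\\
  &\le \|v-\tilde v\|_1\,(\sinh y + \cosh y)
   = \|v-\tilde v\|_1\, e^{y}.
\end{align*}
```

The general case then follows by density of smooth functions and completeness of the kernel space, as stated in the text.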

It turns out that the functions $a_v^\pm$ even for an arbitrary $v \in L_1((0,1),B(X))$ still are the kernels of the corresponding transformation operators as stated in Theorem 3.1.

Proof of Theorem 3.1. Suppose that $v$ is an arbitrary function from $L_1((0,1),B(X))$. We fix a sequence $(v_n)_{n=1}^\infty \subset C^\infty([0,1],B(X))$ that converges to $v$ in the $L_1$-norm and denote by $V_n$ and $A_n^\pm$ the corresponding operators of 'pointwise multiplication' by $v_n$ and transformation operators, respectively. It follows from Corollary 3.8 that there exist operators $A^\pm = A_v^\pm$ that are continuous in the space $L_p((0,1),X)$ for every $p \in [1,\infty]$ and such that $A_n^\pm f \to A^\pm f$ in $L_p((0,1),X)$ for any $f \in L_p((0,1),X)$.

Take an arbitrary $f \in W_2^1([0,1],X)$ with $f(0) = 0$; then (recall Remark 3.2) we have

$$\Big(\frac{d}{dx} \pm V_n\Big) A_n^\pm f = A_n^\mp f'. \tag{3.13}$$

Since $f \in L_\infty((0,1),X) \cap L_1((0,1),X)$ and $f' \in L_2((0,1),X) \subset L_1((0,1),X)$, we have that $A_n^\pm f \to A^\pm f$ in $L_\infty((0,1),X)$ and $L_1((0,1),X)$ and that $A_n^\pm f' \to A^\pm f'$ in $L_1((0,1),X)$ as $n \to \infty$. Convergence of $v_n$ to $v$ in the $L_1$-norm now implies that $V_n A_n^\pm f \to V A^\pm f$ in $L_1((0,1),X)$, so that by (3.13) we get

$$(A_n^\pm f)' \longrightarrow A^\mp f' \mp V A^\pm f$$

as $n \to \infty$ in the norm of the space $L_1((0,1),X)$. It follows that $A^\pm f \in W_1^1([0,1],X)$ and

$$(A^\pm f)' = A^\mp f' \mp V A^\pm f.$$

Thus operators $A^\pm$ verify equalities (3.1) and the theorem is proved. □

The next theorem shows how equalities (3.1) should be modified for an arbitrary function $f \in W_2^1([0,1],X)$.

THEOREM 3.9. Suppose that $v \in L_1((0,1),B(X))$ and $f \in W_2^1([0,1],X)$. Then $A^\pm f \in W_1^1([0,1],X)$ and

$$\Big(\frac{d}{dx} \pm V\Big) A^\pm f = A^\mp f' + a^\mp(x,0) f(0).$$


Proof. Consider the sequence $f_n = \phi_n f$, where $\phi_n(x) = \min\{nx, 1\}$ for $x \in [0,1]$ and $n \in \mathbb{N}$. Then $f_n \in W_2^1([0,1],X)$ and $f_n(0) = 0$, so that by virtue of equalities (3.1) we get $(A^\pm f_n)' \pm V A^\pm f_n = A^\mp f_n'$, or

$$\big((A^\pm - I) f_n\big)' = (A^\mp - I)(\phi_n f') + (A^\mp - I)(\phi_n' f) \mp V A^\pm f_n, \qquad n \in \mathbb{N}. \tag{3.14}$$

Observe that $f_n \to f$ in $L_p((0,1),X)$ as $n \to \infty$ for all $p \in [1,\infty]$ and thus $A^\pm f_n \to A^\pm f$ in $L_1((0,1),X) \cap L_\infty((0,1),X)$ and $V A^\pm f_n \to V A^\pm f$ in $L_1((0,1),X)$. Convergence $\phi_n f' \to f'$ in $L_1((0,1),X)$ as $n \to \infty$ implies that $A^\mp(\phi_n f') \to A^\mp f'$ in $L_1((0,1),X)$. Since $f$ is continuous and $a^\mp \in G_1(X)$, we have that

$$(A^\mp - I)(\phi_n' f)(x) = n \int_0^{1/n} a^\mp(x,t) f(t)\,dt \longrightarrow a^\mp(x,0) f(0)$$

in $L_1((0,1),X)$. The above reasonings show that the right-hand side of equality (3.14) converges in $L_1((0,1),X)$ to $(A^\mp - I) f' + a^\mp(x,0) f(0) \mp V A^\pm f$ as $n \to \infty$. Since the space $W_1^1([0,1],X)$ is complete, the statements of the theorem follow. □

Now we establish some additional property of the kernels $a^\pm$ constructed that will be essentially used for Sturm–Liouville operators.

LEMMA 3.10. Suppose that $v, \tilde v \in C^\infty([0,1],B(X))$ and $a^\pm = a_v^\pm$, $\tilde a^\pm = a_{\tilde v}^\pm$ denote the corresponding kernels constructed for $v$ and $\tilde v$ respectively. Let $P$ be an arbitrary bounded operator in $X$; then with $g(x) = (1 + x)\exp(2x)$ the following inequalities hold:

$$\|P a^\pm - P \tilde a^\pm\|_{G_2(X)}^2 \le 12\big(\|Pv - P\tilde v\|_2^2 + \|v - \tilde v\|_1^2 \|P\tilde v\|_2^2\big)\, g(\|v\|_1 + \|\tilde v\|_1). \tag{3.15}$$

Proof. It suffices to prove analogous inequalities for $P b^\pm - P \tilde b^\pm$, where $b^\pm := (a^+ \pm a^-)/2$ and $\tilde b^\pm := (\tilde a^+ \pm \tilde a^-)/2$. It follows from (3.6) that

$$\begin{aligned}
|P b^+(x,t) - P \tilde b^+(x,t)| &\le \int_{x-t}^{x} |Pv(\xi) - P\tilde v(\xi)|\,|b^-(\xi, \xi - x + t)|\,d\xi\\
&\quad + \int_{x-t}^{x} |P\tilde v(\xi)|\,|b^-(\xi, \xi - x + t) - \tilde b^-(\xi, \xi - x + t)|\,d\xi.
\end{aligned}$$


The Cauchy–Schwarz inequality yields the estimate

$$\begin{aligned}
|P b^+(x,t) - P \tilde b^+(x,t)|^2 &\le 2C_1 \int_{x-t}^{x} |Pv(\xi) - P\tilde v(\xi)|^2\,|b^-(\xi, \xi - x + t)|\,d\xi\\
&\quad + 2C_2 \int_{x-t}^{x} |P\tilde v(\xi)|^2\,|b^-(\xi, \xi - x + t) - \tilde b^-(\xi, \xi - x + t)|\,d\xi,
\end{aligned}$$

where

$$C_1 := \max_{y\in[0,1]} \int_y^1 |b^-(\xi, \xi - y)|\,d\xi, \qquad C_2 := \max_{y\in[0,1]} \int_y^1 |b^-(\xi, \xi - y) - \tilde b^-(\xi, \xi - y)|\,d\xi.$$

Integration in $x$ and $t$ now produces the inequality

$$\|P b^+ - P \tilde b^+\|_{G_2(X)}^2 \le 2C_1 \|P(v - \tilde v)\|_2^2\, \|b^-\|_{G_1(X)} + 2C_2 \|P\tilde v\|_2^2\, \|b^- - \tilde b^-\|_{G_1(X)},$$

and it remains to estimate $C_1$ and $C_2$ in a suitable way. Using again (3.6) and Fubini's theorem, we find that

$$\begin{aligned}
\int_y^1 |b^-(\xi, \xi - y)|\,d\xi &\le \int_y^1 \int_{\xi - \frac{y}{2}}^{\xi} |v(\eta)|\,|b^+(\eta, 2\xi - y - \eta)|\,d\eta\,d\xi + \frac{1}{2}\int_y^1 \Big|v\Big(\xi - \frac{y}{2}\Big)\Big|\,d\xi\\
&\le \tfrac{1}{2}\|v\|_1 \|b^+\|_{G_1(X)} + \tfrac{1}{2}\|v\|_1,
\end{aligned}$$

so that $2C_1 \le \|v\|_1 \|b^+\|_{G_1(X)} + \|v\|_1$. Analogously we estimate $2C_2$ as follows:

$$2C_2 \le \|v - \tilde v\|_1 \|b^+\|_{G_1(X)} + \|\tilde v\|_1 \|b^+ - \tilde b^+\|_{G_1(X)} + \|v - \tilde v\|_1.$$

Combining these inequalities with the estimates of Lemmata 3.4 and 3.7 we arrive at (3.15) for $\|P b^+ - P \tilde b^+\|_{G_2(X)}^2$ with the constant 6 instead of 12. The estimate for $\|P b^- - P \tilde b^-\|_{G_2(X)}^2$ is derived analogously, and the result follows. □

Passing to the limit and completeness of the corresponding spaces justify now the following statement.

COROLLARY 3.11. Suppose that $v, \tilde v \in L_1((0,1),B(X))$ and $P \in B(X)$ are such that $Pv$, $P\tilde v$ belong to $L_2((0,1),B(X))$. Then for the corresponding kernels $a^\pm = a_v^\pm$, $\tilde a^\pm = a_{\tilde v}^\pm$ the inclusions $P a^\pm, P \tilde a^\pm \in G_2(X)$ take place, and, moreover, inequality (3.15) holds.


4. Transformation Operators for Sturm–Liouville Operators

In this section we use the results of the previous section to construct the TOs for the pairs of SL operators $(T_{\sigma,h}, T_{0,h})$, where $\sigma \in H$ and $h = 0, \infty$.

Denote by $M_2 := B(\mathbb{C}^2)$ the Banach space of all $2 \times 2$ matrices with complex entries and put $\mathbf{H} := L_2((0,1),\mathbb{C}^2)$. For an arbitrary but fixed $\sigma \in H$ the function

$$v(x) := \begin{pmatrix} -\sigma(x) & -1 \\ \sigma^2(x) & \sigma(x) \end{pmatrix}$$

belongs to $L_1((0,1),M_2)$. Hence by Theorems 3.1 and 3.9 there exist operators $A^\pm \in B(\mathbf{H})$ of the form

$$A^\pm \mathbf{f}(x) = \mathbf{f}(x) + \int_0^x a^\pm(x,t)\,\mathbf{f}(t)\,dt,$$

such that $a^\pm \in G_1(\mathbb{C}^2)$ and that for any $\mathbf{f} \in W_2^1([0,1],\mathbb{C}^2)$ the following relation holds:

$$\Big(\frac{d}{dx} \pm V\Big) A^\pm \mathbf{f} = A^\mp \mathbf{f}' + a^\mp(\cdot,0)\,\mathbf{f}(0). \tag{4.1}$$

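With respect to the natural decomposition $\mathbf{H} = L_2(0,1) \times L_2(0,1)$ the operators $A^\pm$ and the kernels $a^\pm$ can be represented in the matrix form

$$A^\pm = \begin{pmatrix} A_{11}^\pm & A_{12}^\pm \\ A_{21}^\pm & A_{22}^\pm \end{pmatrix}, \qquad a^\pm = \begin{pmatrix} a_{11}^\pm & a_{12}^\pm \\ a_{21}^\pm & a_{22}^\pm \end{pmatrix}. \tag{4.2}$$

It follows that $a_{ij}^\pm \in G_1(\mathbb{C})$; also with

$$P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

we have $Pv(\cdot) \in L_2((0,1),M_2)$, which implies that $a_{11}^\pm \in G_2(\mathbb{C})$ depends continuously on $\sigma \in H$ in view of Corollary 3.11.

Taking $\mathbf{f} = (f, 0)^{\mathrm{T}}$ with $f \in W_2^1[0,1]$ in (4.1), we get the following equalities:

$$\begin{aligned}
\Big(\frac{d}{dx} - \sigma\Big)(A_{11}^+ f) &= A_{11}^- f' + A_{21}^+ f + a_{11}^-(\cdot,0) f(0),\\
\Big(\frac{d}{dx} + \sigma\Big)(A_{11}^- f) &= A_{11}^+ f' - A_{21}^- f + a_{11}^+(\cdot,0) f(0),\\
\Big(\frac{d}{dx} + \sigma\Big)(A_{21}^+ f) &= A_{21}^- f' - \sigma^2 A_{11}^+ f + a_{21}^-(\cdot,0) f(0).
\end{aligned} \tag{4.3}$$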

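In particular, for $f \in W_2^2[0,1]$ with $f(0) = f'(0) = 0$ this gives

\[ \Big(\frac{d}{dx} + \sigma\Big)\Big(\frac{d}{dx} - \sigma\Big) A^+_{11} f = \Big(\frac{d}{dx} + \sigma\Big)\big(A^-_{11} f' + A^+_{21} f\big) = A^+_{11} f'' - \sigma^2 A^+_{11} f, \]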


i.e., $\ell_\sigma(A^+_{11} f) = A^+_{11}(\ell_0(f))$. To have this equality satisfied for all $f \in \operatorname{dom} T_{\sigma,h}$ with $h = 0$ or $h = \infty$, we should modify the operator $A^+_{11}$ in a suitable way.

For the sake of brevity we put

\[ \alpha_1 = a^-_{11}(\cdot, 0), \qquad \alpha_2 = a^+_{11}(\cdot, 0), \qquad \alpha_3 = a^-_{21}(\cdot, 0). \]

Take $m = m_h \in H$, denote by $M = M_h$ an integral operator in $H$ given by

\[ Mf(x) = \int_0^x f(x-t)\,m(t)\,dt, \]

and put

\[ \tilde A^\pm_{ij} = A^\pm_{ij}(I + M). \]

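Since

\[ (Mf)' = Mf' + f(0)\,m, \qquad (Mf)(0) = 0 \]

for all $f \in W_2^1[0,1]$, equalities (4.3) result in

\[ \begin{aligned}
\Big(\frac{d}{dx} - \sigma\Big)(\tilde A^+_{11} f) &= \tilde A^-_{11} f' + \tilde A^+_{21} f + (A^-_{11} m + \alpha_1) f(0), \\
\Big(\frac{d}{dx} + \sigma\Big)(\tilde A^-_{11} f) &= \tilde A^+_{11} f' - \tilde A^-_{21} f + (A^+_{11} m + \alpha_2) f(0), \\
\Big(\frac{d}{dx} + \sigma\Big)(\tilde A^+_{21} f) &= \tilde A^-_{21} f' - \sigma^2 \tilde A^+_{11} f + (A^-_{21} m + \alpha_3) f(0).
\end{aligned} \tag{4.4} \]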
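The convolution identity $(Mf)' = Mf' + f(0)m$, which drives the passage from (4.3) to (4.4), can be checked numerically. The following is a minimal sketch under stated assumptions: the test functions $f(x) = \cos x + x$ and $m(t) = e^{-t}$ are illustrative choices not taken from the paper, the integrals are approximated by a composite Simpson rule, and all helper names are hypothetical.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def conv(f, m):
    """Return the function x -> (Mf)(x) = int_0^x f(x - t) m(t) dt."""
    return lambda x: simpson(lambda t: f(x - t) * m(t), 0.0, x)

# Illustrative (assumed) data: f(0) = 1, so the boundary term f(0) m is visible.
f = lambda x: math.cos(x) + x
df = lambda x: -math.sin(x) + 1.0
m = lambda t: math.exp(-t)

Mf = conv(f, m)     # M applied to f
Mdf = conv(df, m)   # M applied to f'

def identity_gap(x, h=1e-4):
    """|(Mf)'(x) - (Mf'(x) + f(0) m(x))|, with (Mf)' by a central difference."""
    lhs = (Mf(x + h) - Mf(x - h)) / (2 * h)
    rhs = Mdf(x) + f(0.0) * m(x)
    return abs(lhs - rhs)
```

Evaluating `identity_gap` at a few interior points should return values at the level of the finite-difference error, while `Mf(0.0)` vanishes, matching $(Mf)(0) = 0$.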

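We are now in a position to prove similarity of the operators $T_{\sigma,h}$ and $T_{0,h}$, $h = 0, \infty$. We start with the simpler case $h = \infty$.

LEMMA 4.1. The operators $T_{\sigma,\infty}$ and $T_{0,\infty}$ are similar.

Proof. Take $m_\infty := -(A^+_{11})^{-1}\alpha_2$ in equalities (4.4); then for all $f \in \operatorname{dom} T_{0,\infty}$ we get (recall that $f(0) = 0$):

\[ \Big(\frac{d}{dx} + \sigma\Big)\Big(\frac{d}{dx} - \sigma\Big) \tilde A^+_{11} f = \Big(\frac{d}{dx} + \sigma\Big)\big(\tilde A^-_{11} f' + \tilde A^+_{21} f\big) = \tilde A^+_{11} f'' - \sigma^2 \tilde A^+_{11} f. \]

It follows that $\tilde A^+_{11} f \in \operatorname{dom} T_{\sigma,\infty}$ and $T_{\sigma,\infty} \tilde A^+_{11} f = \tilde A^+_{11} T_{0,\infty} f$, i.e., that

\[ \tilde A^+_{11} T_{0,\infty} \subset T_{\sigma,\infty} \tilde A^+_{11}. \]

Since $\tilde A^+_{11}$ is a bounded and boundedly invertible operator, for $S := \tilde A^+_{11} T_{0,\infty} (\tilde A^+_{11})^{-1}$ we get

\[ S \subset T_{\sigma,\infty}. \tag{4.5} \]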


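Fix an arbitrary nonzero $\lambda \in \mathbb{C}$; it is clear that

\[ \operatorname{ran}(T_{0,\infty} - \lambda I) = H, \qquad \dim\ker(T_{0,\infty} - \lambda I) = 1, \]

so that also

\[ \operatorname{ran}(S - \lambda I) = H, \qquad \dim\ker(S - \lambda I) = 1. \tag{4.6} \]

Assuming that $S \ne T_{\sigma,\infty}$, we deduce from (4.5) and (4.6) that

\[ \dim\ker(T_{\sigma,\infty} - \lambda I) > 1. \]

It follows then that there exists a nonzero function $f \in W_2^1[0,1]$ such that $f^{[1]} \in W_1^1[0,1]$, $f(0) = f^{[1]}(0) = 0$, and

\[ \frac{d}{dx}\begin{pmatrix} f \\ f^{[1]} \end{pmatrix} + V \begin{pmatrix} f \\ f^{[1]} \end{pmatrix} = -\lambda \begin{pmatrix} 0 \\ f \end{pmatrix}. \]

This is impossible by uniqueness arguments, so that $S = T_{\sigma,\infty}$ and the lemma is proved. □

LEMMA 4.2. The operators $T_{\sigma,0}$ and $T_{0,0}$ are similar.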

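Proof. Since any $f \in \operatorname{dom} T_{0,0}$ satisfies $f'(0) = 0$, we find that

\[ \Big(\frac{d}{dx} - \sigma\Big) \tilde A^+_{11} f = \tilde A^-_{11} f' + \tilde A^+_{21} f + p f(0) \]

with $p := A^-_{11} m + \alpha_1$ and

\[ \Big(\frac{d}{dx} + \sigma\Big)\Big(\frac{d}{dx} - \sigma\Big) \tilde A^+_{11} f = \Big(\frac{d}{dx} + \sigma\Big)\big(\tilde A^-_{11} f' + \tilde A^+_{21} f + p f(0)\big) = \tilde A^+_{11} f'' - \sigma^2 \tilde A^+_{11} f + q f(0) \]

with

\[ q := \Big(\frac{d}{dx} + \sigma\Big) p + A^-_{21} m + \alpha_3. \]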

We shall prove below that for a suitable choice of $m \in W_2^1[0,1]$ we get $p \in W_2^1[0,1]$, $p(0) = 0$, and $q \equiv 0$. Then $\tilde A^+_{11} T_{0,0} \subset T_{\sigma,0} \tilde A^+_{11}$ and the arguments similar to those used in the proof of Lemma 4.1 do the rest.

To produce the required $m$, we denote by $J$ the integration operator,

\[ Jf(x) := \int_0^x f(t)\,dt, \]

and take $m$ in the form $m = (A^-_{11})^{-1}(Jr - \alpha_1)$ for some $r \in H$. Then $p = Jr \in W_2^1[0,1]$, $p(0) = 0$, and the equation $q = 0$ can be recast as

\[ r + \sigma Jr + A^-_{21}(A^-_{11})^{-1} Jr - A^-_{21}(A^-_{11})^{-1}\alpha_1 + \alpha_3 = 0 \tag{4.7} \]


in terms of $r$. It is clear that

\[ R := \sigma J + A^-_{21}(A^-_{11})^{-1} J \]

is a Hilbert–Schmidt operator with lower triangular kernel; therefore $R$ is a Volterra operator in $H$ and (4.7) has a unique solution

\[ r = (I + R)^{-1}\big(A^-_{21}(A^-_{11})^{-1}\alpha_1 - \alpha_3\big) \in H. \]

The proof is complete. □

The previous lemmata also show that the TO for the pair $T_{\sigma,h}$ and $T_{0,h}$ can be chosen as $\tilde A^+_{11} = A^+_{11} + A^+_{11} M$, which is of the form (2.1) with

\[ k_{\sigma,h}(x,t) = a^+_{11}(x,t) + m_h(x-t) + \int_t^x a^+_{11}(x,s)\,m_h(s-t)\,ds. \]

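Remark 4.3. Observe that although the kernel $k_{\sigma,h}$ need not be smooth enough for the function $K_{\sigma,h} f$ to belong to $W_2^1[0,1]$ even if $f \in C^\infty[0,1]$, the function $g := (I + K_{\sigma,h}) f$ has quasi-derivative $g^{[1]} \in W_1^1[0,1]$ for any $f \in \operatorname{dom} T_{0,h}$ and

\[ g^{[1]}(x) = f'(x) + \int_0^x \tilde a^-_{11}(x,t)\,f'(t)\,dt + \int_0^x \tilde a^+_{21}(x,t)\,f(t)\,dt + p_h(x) f(0), \]

where $\tilde a^-_{11}$ and $\tilde a^+_{21}$ are the kernels of $\tilde A^-_{11} - I$ and $\tilde A^+_{21}$, with $p_\infty \equiv 0$ and $p_0 \in W_2^1[0,1]$ constructed in the proof of Lemma 4.2.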

We show next that the TO $I + K_{\sigma,h}$ is unique up to a scalar factor. To this end it suffices to prove that the set of bounded and boundedly invertible operators commuting with $T_{0,h}$ in $H$ consists of nonzero multiples of the identity operator.

Denote

\[ s_\lambda(x) := \frac{\sin \lambda x}{\lambda}, \qquad c_\lambda(x) := \cos \lambda x, \]

with $\lambda$ an arbitrary complex number and $s_0(x) \equiv x$. Observe that the mappings $\lambda \mapsto s_\lambda$ and $\lambda \mapsto c_\lambda$ are analytic $H$-valued functions of the argument $\lambda$.

LEMMA 4.4. Suppose that a bounded and boundedly invertible operator $U$ in $H$ satisfies one of the following conditions:

(a) for all $\lambda \in \mathbb{C}$ the function $s_\lambda$ is an eigenvector of $U$;
(b) for all $\lambda \in \mathbb{C}$ the function $c_\lambda$ is an eigenvector of $U$.

Then $U = cI$ for some nonzero $c \in \mathbb{C}$.

Proof. Assume first that condition (a) is satisfied. Then there exists a function $g$ such that

\[ U s_\lambda = g(\lambda)\,s_\lambda \tag{4.8} \]


for all $\lambda \in \mathbb{C}$. For a given $\lambda_0 \in \mathbb{C}$ the relation

\[ g(\lambda) = \frac{(U s_\lambda, s_{\lambda_0})}{(s_\lambda, s_{\lambda_0})} \]

shows that $g$ is analytic in a neighbourhood of $\lambda_0$ consisting of those $\lambda$ for which $(s_\lambda, s_{\lambda_0}) \ne 0$; since $\lambda_0$ is arbitrary, $g$ is entire. By (4.8) the function $g$ satisfies the inequality

\[ \|U^{-1}\|^{-1} \le |g(\lambda)| \le \|U\| \]

for all $\lambda \in \mathbb{C}$; the Liouville theorem now proves that $g \equiv c$ for some nonzero $c$. Since the system $\{s_\lambda \mid \lambda \in \mathbb{C}\}$ is complete in $H$ (e.g., the sequence $(s_{\pi n})_{n \in \mathbb{N}}$ forms a basis of $H$), this implies that $U = cI$ as claimed.

The case (b) is considered analogously. □

LEMMA 4.5. The operators $T_{0,0}$ and $T_{0,\infty}$ are not similar.

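Proof. Assume that the claim of the lemma is false. Then there exists a bounded and boundedly invertible operator $U$ such that

\[ T_{0,0} U = U T_{0,\infty}. \]

Since for an arbitrary $\lambda \in \mathbb{C}$ we have

\[ \ker(T_{0,\infty} - \lambda^2 I) = \operatorname{lin}\{s_\lambda\}, \qquad \ker(T_{0,0} - \lambda^2 I) = \operatorname{lin}\{c_\lambda\}, \]

there exists a function $f: \mathbb{C} \to \mathbb{C}$ such that

\[ U s_\lambda = f(\lambda)\,c_\lambda \tag{4.9} \]

for all $\lambda \in \mathbb{C}$. The relation

\[ f(\lambda) = \frac{(U s_\lambda, c_{\lambda_0})}{(c_\lambda, c_{\lambda_0})} \]

shows that $f$ is analytic in a neighbourhood of any point $\lambda_0 \in \mathbb{C}$ and hence is entire. Equality (4.9) implies that

\[ |f(\lambda)| \le \frac{\|s_\lambda\|}{\|c_\lambda\|}\,\|U\|, \qquad \lambda \in \mathbb{C}. \]

Writing $\lambda = \xi + i\eta$ with $\xi, \eta \in \mathbb{R}$, we find that

\[ \int_0^1 |\sin \lambda x|^2\,dx = \frac12\Big(\frac{\sinh 2\eta}{2\eta} - \frac{\sin 2\xi}{2\xi}\Big), \qquad \int_0^1 |\cos \lambda x|^2\,dx = \frac12\Big(\frac{\sinh 2\eta}{2\eta} + \frac{\sin 2\xi}{2\xi}\Big). \]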
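The two closed-form integrals, and the decay of $\|s_\lambda\|/\|c_\lambda\|$ that they imply, can be sanity-checked numerically. The sketch below is only an illustration: the quadrature rule, the sample values of $\lambda$, and the helper names are assumptions introduced here, and the quotients $\sinh u/u$ and $\sin u/u$ are given their limiting value 1 at $u = 0$.

```python
import cmath
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def quotient(u, fn):
    """fn(u)/u with the limiting value 1 at u = 0 (fn is sin or sinh)."""
    return fn(u) / u if u != 0 else 1.0

def closed_form(lam, sign):
    """(1/2)(sinh(2 eta)/(2 eta) + sign * sin(2 xi)/(2 xi)) for lam = xi + i eta."""
    xi, eta = lam.real, lam.imag
    return 0.5 * (quotient(2 * eta, math.sinh) + sign * quotient(2 * xi, math.sin))

def norm_sq_sin(lam):
    """Numerical value of int_0^1 |sin(lam x)|^2 dx."""
    return simpson(lambda x: abs(cmath.sin(lam * x)) ** 2, 0.0, 1.0)

def norm_sq_cos(lam):
    """Numerical value of int_0^1 |cos(lam x)|^2 dx."""
    return simpson(lambda x: abs(cmath.cos(lam * x)) ** 2, 0.0, 1.0)

def ratio(lam):
    """||s_lam|| / ||c_lam|| in L2(0,1), where s_lam = sin(lam x)/lam."""
    return math.sqrt(closed_form(lam, -1) / closed_form(lam, +1)) / abs(lam)
```

For instance, comparing `norm_sq_sin(complex(3.0, 0.7))` with `closed_form(complex(3.0, 0.7), -1)` tests the first formula, and evaluating `ratio` along the real and imaginary axes illustrates the decay of order $1/|\lambda|$ used in the next step.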


Therefore

\[ \lim_{\lambda \to \infty} \frac{\|s_\lambda\|}{\|c_\lambda\|} = 0 \]

and hence

\[ f(\lambda) = o(1), \qquad \lambda \to \infty. \]

The Liouville theorem now implies $f \equiv 0$, which contradicts invertibility of $U$. The contradiction derived shows that $T_{0,0}$ and $T_{0,\infty}$ are not similar. □

With all these results in hand, we can prove Theorems 2.3 and 2.5.

Proof of Theorem 2.3. Lemmata 4.1 and 4.2 establish similarity of claim (i) and existence of the TO of required form of claim (ii). Uniqueness of claim (ii) follows from Lemma 4.4, and part (iii) is proved in Lemma 4.5. □

Proof of Theorem 2.5. We consider first the operators $K_{\sigma,h}$, $h = 0, \infty$. It follows from the proof of Lemmata 4.1 and 4.2 that

\[ I + K_{\sigma,h} = A^+_{11,\sigma} + A^+_{11,\sigma} M_{\sigma,h}, \tag{4.10} \]

where the operator $M_{\sigma,h}$ acts according to

\[ M_{\sigma,h} f(x) := \int_0^x m_{\sigma,h}(x-t)\,f(t)\,dt, \]

with $m_{\sigma,0}$ and $m_{\sigma,\infty}$ defined in the proofs of Lemmata 4.2 and 4.1 respectively. It is easily seen that

\[ \|M_{\sigma,h} - M_{\tilde\sigma,h}\|_{G_2^+} \le \|m_{\sigma,h} - m_{\tilde\sigma,h}\|_{L_2(0,1)}, \]

so that the mapping

\[ L_2(0,1) \ni \sigma \mapsto M_{\sigma,h} \in G_2^+ \]

is continuous as soon as such is the mapping

\[ L_2(0,1) \ni \sigma \mapsto m_{\sigma,h} \in L_2(0,1). \]

Assuming this already established and recalling that $G_2^+$ is a Banach algebra and that $A^+_{11,\sigma}$ depends continuously on $\sigma \in H$ in $G_2^+$, we conclude from the above arguments and representation (4.10) that the mapping

\[ L_2(0,1) \ni \sigma \mapsto K_{\sigma,h} \in G_2^+ \]

is continuous. Thus it remains to show that $m_{\sigma,0}$ and $m_{\sigma,\infty}$ depend continuously on $\sigma$.


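For $h = \infty$ we have

\[ m_{\sigma,\infty} := -(A^+_{11,\sigma})^{-1} a^+_{11,\sigma}(\cdot, 0), \]

and the required continuity follows from Corollary 3.8. For $h = 0$ the function $m_{\sigma,0}$ is given by

\[ m_{\sigma,0} := (A^-_{11,\sigma})^{-1}\big(J r_\sigma - a^-_{11,\sigma}(\cdot, 0)\big), \]

where

\[ r_\sigma := (I + R_\sigma)^{-1} p_\sigma, \qquad p_\sigma := A^-_{21,\sigma}(A^-_{11,\sigma})^{-1} a^-_{11,\sigma}(\cdot, 0) - a^-_{21,\sigma}(\cdot, 0) \]

and

\[ R_\sigma := \sigma J + A^-_{21,\sigma}(A^-_{11,\sigma})^{-1} J \]

with $J$ being the integration operator. It follows from Corollary 3.8 that the mappings

\[ L_2(0,1) \ni \sigma \mapsto p_\sigma \in L_1(0,1), \qquad L_2(0,1) \ni \sigma \mapsto R_\sigma \in B\big(L_1(0,1), L_2(0,1)\big) \]

are continuous. This implies continuity of the mapping

\[ L_2(0,1) \ni \sigma \mapsto r_\sigma \in L_2(0,1) \]

as well as the one

\[ L_2(0,1) \ni \sigma \mapsto m_{\sigma,0} \in L_2(0,1), \]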

and the proof for the operators $K_{\sigma,h}$, $h = 0, \infty$, is complete.

To treat the operators $L_{\sigma,h}$, we observe first that the mapping $\mathcal{L}: G_2^+ \to G_2^+$ given by

\[ \mathcal{L}(K) = (I + K)^{-1} - I = \sum_{n=1}^\infty (-K)^n \tag{4.11} \]

is continuous. In fact, for the kernel $k$ of an operator $K \in G_2^+$ we find that

\[ \Big|\int_0^1 k(x,s)\,k(s,t)\,ds\Big| \le \|k(x,\cdot)\|_{L_2(0,1)} \|k(\cdot,t)\|_{L_2(0,1)} \le \|k\|^2_{G_2} \]

if $x > t$, so that by induction $\|K^{2n}\|_{G_2} \le \|K\|^{2n}_{G_2}/(n-1)!$ and thus series (4.11) converges locally uniformly in $K \in G_2$. It follows now that $L_{\sigma,h} = (I + K_{\sigma,h})^{-1} - I = \mathcal{L}(K_{\sigma,h}) \in G_2^+$ depends continuously on $\sigma \in H$, and the theorem is proved. □
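A finite-dimensional analogue may help to see why the series (4.11) poses no convergence problem for Volterra-type operators. In the sketch below a strictly lower-triangular matrix stands in for an operator with lower-triangular kernel; this discretisation, the sample matrix, and the helper names are assumptions made for illustration and are not part of the paper's argument. Such a matrix is nilpotent, so the Neumann series terminates after finitely many terms.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two square matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_sum(K):
    """Sum_{n>=1} (-K)^n for a strictly lower-triangular (nilpotent) K.

    For an N x N strictly lower-triangular matrix K^N = 0, so powers
    1, ..., N-1 are all that is needed.
    """
    n = len(K)
    neg_k = [[-v for v in row] for row in K]
    term = neg_k
    total = [row[:] for row in neg_k]
    for _ in range(2, n):
        term = mat_mul(term, neg_k)
        total = mat_add(total, term)
    return total

# Assumed sample "kernel": K[i][j] = 0.1 * (i - j) below the diagonal, 0 elsewhere.
N = 6
K = [[0.1 * (i - j) if j < i else 0.0 for j in range(N)] for i in range(N)]
L = neumann_sum(K)
```

Here `L` plays the role of $\mathcal{L}(K)$: multiplying `I + K` by `I + L` reproduces the identity matrix up to rounding, and `L` is again strictly lower triangular.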


5. Proof of Theorems 2.8 and 2.9

In this section, we shall study the question how the TOs $K_{\sigma,h}$ enter factorisation of some special operators. Note that the relations we establish below are known for regular potentials (e.g., formula (5.2) can be found in [22]); we derive them here not only for the sake of completeness but also to give a precise characterisation of the TOs for the SL operators with singular potentials from $W_2^{-1}(0,1)$.

Recall that for an integral operator $K$ in $H$ with kernel $k$ its associated operator $K^\top$ is defined as the integral operator with kernel $k^\top(x,t) = k(t,x)$. Claims of Theorems 2.8 and 2.9 are basically contained in the following two lemmata.

LEMMA 5.1. Suppose that $\sigma \in H$. Then for $h = 0, \infty$ the following equality holds:

\[ (I + K_{\sigma,h})^{-1}(I + K^\top_{\sigma,h})^{-1} = I + F_{\phi,h}, \tag{5.1} \]

in which the function $\phi = \phi_{\sigma,h} \in L_2(0,2)$ is given by

\[ \phi_{\sigma,h}(2x) = -\frac12\sigma(x) + \int_0^x l_{\sigma,h}(x,t)^2\,dt, \qquad x \in [0,1], \tag{5.2} \]

and $l_{\sigma,h}$ is the kernel of the operator $L_{\sigma,h} = (I + K_{\sigma,h})^{-1} - I$.

Proof. First we shall prove equality (5.1) for the case $h = 0$ and under the assumption that $\sigma \in C^2[0,1]$. Put

\[ y(\lambda, x) := \cos \lambda x + \int_0^x k_{\sigma,0}(x,t) \cos \lambda t\,dt, \qquad \lambda \in \mathbb{C},\ x \in [0,1]. \tag{5.3} \]

By the definition of the TO $I + K_{\sigma,0}$ the function $y$ satisfies the equation

\[ -y''(\lambda, x) + q(x)\,y(\lambda, x) = \lambda^2 y(\lambda, x) \]

and the initial conditions

\[ y(\lambda, 0) = 1, \qquad y'(\lambda, 0) = a\,y(\lambda, 0), \]

where $q = \sigma'$ and $a := \sigma(0)$. Since $I + L_{\sigma,0}$ is the inverse of $I + K_{\sigma,0}$, relation (5.3) can be recast as

\[ \cos \lambda x = y(\lambda, x) + \int_0^x l_{\sigma,0}(x,t)\,y(\lambda, t)\,dt. \]

It is shown in [22, Ch. I.2] that the kernel $l_{\sigma,0}$ of $L_{\sigma,0}$ is twice continuously differentiable in the closure of the domain $\Omega^+ = \{(x,t) \in (0,1)^2 \mid x > t\}$ and is a unique solution of the partial differential equation

\[ -l''_{xx} = -l''_{tt} + q(t)\,l, \qquad (x,t) \in \Omega^+, \tag{5.4} \]

subject to the boundary conditions

\[ l(x,x) = -a - \frac12\int_0^x q(t)\,dt, \qquad l'_t(x,t)|_{t=0} - a\,l(x,0) = 0. \tag{5.5} \]


Denote by $I + F$ the operator of the left-hand side of (5.1). Then

\[ I + F := (I + L_{\sigma,0})(I + L^\top_{\sigma,0}), \]

so that $F$ is an integral operator with kernel

\[ f(x,t) = l_{\sigma,0}(x,t) + l_{\sigma,0}(t,x) + \int_0^1 l_{\sigma,0}(x,s)\,l_{\sigma,0}(t,s)\,ds. \]

It follows that the function $f$ is twice continuously differentiable in $[0,1]^2$ and is symmetric with respect to $x$ and $t$. We shall use (5.4) and (5.5) to show that $f$ satisfies in $\Omega^+$ the wave equation $-f''_{xx} = -f''_{tt}$, so that $f(x,t) = \phi_1(x+t) + \phi_2(x-t)$ for some $\phi_1, \phi_2 \in C^2[0,1]$, and then derive from (5.5) that $\phi_1 = \phi_2 = \phi_{\sigma,0}$ with $\phi_{\sigma,0}$ of (5.2).

The details are as follows. We write $l$ instead of $l_{\sigma,0}$ for brevity and observe that for $(x,t) \in \Omega^+$ it holds

\[ f(x,t) = l(x,t) + \int_0^t l(x,s)\,l(t,s)\,ds. \tag{5.6} \]

Differentiating (5.6) in $x$ twice and using (5.4)–(5.5) at various points, we obtain

\[ \begin{aligned}
f''_{xx}(x,t) &= l''_{xx}(x,t) + \int_0^t l''_{xx}(x,s)\,l(t,s)\,ds \\
&= l''_{xx}(x,t) + \int_0^t l''_{ss}(x,s)\,l(t,s)\,ds - \int_0^t l(x,s)\,q(s)\,l(t,s)\,ds \\
&= l''_{xx}(x,t) + \int_0^t l''_{ss}(x,s)\,l(t,s)\,ds + \int_0^t l(x,s)\big(l''_{tt}(t,s) - l''_{ss}(t,s)\big)\,ds \\
&= l''_{xx}(x,t) + \int_0^t l(x,s)\,l''_{tt}(t,s)\,ds + \big[l'_s(x,s)\,l(t,s) - l(x,s)\,l'_s(t,s)\big]\Big|_{s=0}^{s=t} \\
&= l''_{xx}(x,t) + \int_0^t l(x,s)\,l''_{tt}(t,s)\,ds + l'_t(x,t)\,l(t,t) - l(x,t)\,l'_s(t,s)|_{s=t}.
\end{aligned} \]

Differentiation in $t$ gives

\[ f''_{tt}(x,t) = l''_{tt}(x,t) + \int_0^t l(x,s)\,l''_{tt}(t,s)\,ds + [l(x,t)\,l(t,t)]'_t + l(x,t)\,l'_t(t,s)|_{s=t}, \]

so that in view of (5.4) and (5.5)

\[ f''_{xx}(x,t) - f''_{tt}(x,t) = -q(t)\,l(x,t) - 2[l(t,t)]'_t\,l(x,t) = 0 \]


as stated. Therefore $f(x,t) = \phi_1(x+t) + \phi_2(x-t)$ for some $\phi_1, \phi_2 \in C^2[0,1]$. It follows that $f'_t(x,t)|_{t=0} = \phi_1'(x) - \phi_2'(x)$, while (5.6) and (5.5) give

\[ f'_t(x,t)|_{t=0} = l'_t(x,t)|_{t=0} + l(x,0)\,l(0,0) = l'_t(x,t)|_{t=0} - a\,l(x,0) = 0. \]

Thus $\phi_1 - \phi_2 \equiv c$ for some constant $c$, and with $\phi_0 := \phi_1 - c/2 = \phi_2 + c/2$ we conclude that $f(x,t) = \phi_0(x+t) + \phi_0(x-t)$, i.e., that $F = F_{\phi_0,0}$. To find $\phi_0$, we observe that (5.6) and (5.5) for $x = t$ give

\[ \phi_0(2x) + \phi_0(0) = -a - \frac12\int_0^x q(s)\,ds + \int_0^x l(x,s)^2\,ds, \]

so that $\phi_0(0) = -a/2$. Recalling that $\sigma$ is a primitive of $q$ with $\sigma(0) = a$, we arrive at the equality

\[ \phi_0(2x) = -\frac12\sigma(x) + \int_0^x l(x,s)^2\,ds. \]

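Summarizing, we have shown that $F = F_{\phi_0,0}$ with $\phi_0 = \phi_{\sigma,0}$ given by (5.2) (i.e., proved the lemma) under the additional assumption that $\sigma \in C^2[0,1]$.

Suppose now that $\sigma$ is an arbitrary function in $H$ and take a sequence $(\sigma_n) \subset C^2[0,1]$ that converges to $\sigma$ in $H$. Denoting by $\phi_n \in L_2(0,2)$ the functions as in (5.2) but corresponding to $\sigma_n$, we conclude from Theorem 2.5 that $\phi_n$ converge in $L_2(0,2)$ to $\phi_{\sigma,0}$ of (5.2). Therefore

\[ \lim_{n \to \infty} \|F_{\phi_n,0} - F_{\phi_{\sigma,0},0}\|_{G_2} = 0, \]

and passing to the limit in the equality

\[ L_{\sigma_n,0} + L^\top_{\sigma_n,0} + L_{\sigma_n,0} L^\top_{\sigma_n,0} = F_{\phi_n,0} \]

results in (5.1) for $h = 0$ and an arbitrary $\sigma \in H$.

The case $h = \infty$ is treated analogously; the only reservations are that the function $\cos \lambda x$ should be replaced with $(\sin \lambda x)/\lambda$ and boundary conditions (5.5) with the ones

\[ l_{\sigma,\infty}(x,x) = -\frac12\int_0^x q(t)\,dt, \qquad l_{\sigma,\infty}(x,0) = 0. \]

The lemma is proved. □

LEMMA 5.2. Suppose that $\phi \in \mathcal{F}_0$. Then there exists a unique $\sigma_0 \in H$ such that

\[ (I + K_{\sigma_0,0})^{-1}(I + K^\top_{\sigma_0,0})^{-1} = I + F_{\phi,0}; \tag{5.7} \]

moreover, $\sigma_0$ depends continuously on $\phi \in \mathcal{F}_0$. Similarly, for $\phi \in \mathcal{F}_\infty$ there exists a unique $\hat\sigma_\infty \in H/\mathbb{C}$ such that, for $\sigma \in \hat\sigma_\infty$,

\[ (I + K_{\sigma,\infty})^{-1}(I + K^\top_{\sigma,\infty})^{-1} = I + F_{\phi,\infty}; \tag{5.8} \]

moreover, $\hat\sigma_\infty$ depends continuously on $\hat\phi := \phi + \mathbb{C} \in \mathcal{F}_\infty/\mathbb{C}$.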


Proof. We shall consider only the case $h = 0$ as the other one is completely analogous.

Denote by $F = F_{\phi,0}$ an integral operator with kernel $f(x,t) = \phi(x+t) + \phi(|x-t|)$; then by assumption the operator $I + F$ is factorisable as explained in Section 2. Put $K := K^+(F)$; since the integral operator $F$ has a symmetric kernel, it is easily seen that $K^-(F) = K^\top$, i.e., that

\[ I + F_{\phi,0} = (I + K)^{-1}(I + K^\top)^{-1}. \]

Multiplying both sides of the above relation by $I + K$ and equating the corresponding kernels, we arrive at the so-called Gelfand–Levitan–Marchenko (GLM) equation

\[ k(x,t) + f(x,t) + \int_0^x k(x,s)\,f(s,t)\,ds = 0, \qquad (x,t) \in \Omega^+, \tag{5.9} \]

in which $k$ is the kernel of $K$. Recall that the operator $I + F$ is factorisable if and only if the Gelfand–Levitan–Marchenko equation is soluble for $k$.

Suppose now that the function $\phi$ is smooth (e.g., infinitely differentiable). Then it is shown in [22, Ch. II.3] and [20, Ch. II.4] that the solution $k$ of the GLM equation (5.9) is smooth as well and satisfies the partial differential equation

\[ -k''_{tt}(x,t) = -k''_{xx}(x,t) + q(x)\,k(x,t), \qquad q(x) := 2[k(x,x)]'_x \]

and the boundary condition

\[ k'_t(x,t)|_{t=0} = 0. \]

Moreover, $k$ is the kernel of the TO $I + K_{\sigma,0}$ with $\sigma(x) := 2[k(x,x) - k(0,0)]$.

Denote by $l$ the kernel of the integral operator $L := (I + K)^{-1} - I$. Then $l = l_{\sigma,0}$, so that by equality (5.2)

\[ \sigma(x) = -2\phi(2x) + 2\int_0^x l(x,s)^2\,ds, \qquad x \in [0,1]. \tag{5.10} \]

In view of Proposition 2.7 the operator $K \in G_2(\mathbb{C})$ (and thus $L \in G_2(\mathbb{C})$) depends continuously on $\phi \in \mathcal{F}_0$, whence the function $\sigma \in H$ of (5.10) is continuous in $\phi \in \mathcal{F}_0 \cap C^\infty[0,2]$ in the $L_2(0,2)$-norm.

Take now an arbitrary $\phi \in \mathcal{F}_0$ and put $c_k(x) := \cos(\frac{\pi}{2} kx)$; since the system $(c_k)_{k \ge 0}$ is an orthonormal basis of $L_2(0,2)$, we have $\phi = \sum_{k=0}^\infty \alpha_k c_k$ with $\alpha_k := (\phi, c_k)$. Put $\phi_m := \sum_{k=0}^m \alpha_k c_k$; then the sequence $(\phi_m)_{m=1}^\infty$ converges to $\phi$ in $L_2(0,2)$ and therefore by Proposition 2.7 the infinitely differentiable functions $\phi_m$ belong to $\mathcal{F}_0$ for all $m$ large enough, say for $m \ge m_0$. Let $(\sigma_m)_{m=m_0}^\infty$ be the sequence of functions constructed for $\phi_m$ through formula (5.10); in particular,

\[ I + F_{\phi_m,0} = (I + K_{\sigma_m,0})^{-1}(I + K^\top_{\sigma_m,0})^{-1}. \tag{5.11} \]


From the above said it follows that $(\sigma_m)_{m=m_0}^\infty$ is a Cauchy sequence in $H$ and hence converges to some function $\sigma \in H$. By virtue of Theorem 2.5 we can pass to the limit in equation (5.11) to obtain

\[ I + F_{\phi,0} = (I + K_{\sigma,0})^{-1}(I + K^\top_{\sigma,0})^{-1}. \]

The above arguments also imply that the function $\sigma$ so constructed depends continuously on $\phi \in \mathcal{F}_0$. The proof (for the case $h = 0$) is complete. □

Proof of Theorem 2.8. The statements of the theorem are corollaries of Lemmata 5.1 and 5.2. □

Proof of Theorem 2.9. Continuity of the mappings $\Phi_h: \sigma \mapsto \phi_{\sigma,h}$, $h = 0, \infty$, follows from Theorem 2.5 and formula (5.2), and that of the inverse mappings from Lemma 5.2. □

6. Some Applications

In this section we demonstrate usefulness of the TOs constructed for the spectral analysis of Sturm–Liouville operators with singular potentials $q$ from the space $W_2^{-1}(0,1)$. Namely, we shall establish eigenvalue asymptotics, completeness properties of eigenfunctions, and similarity of related operators. Note that some of these results were established by different methods in [28–30].

Assume that $q = \sigma'$ for some complex-valued function $\sigma \in H$ and denote by $T_{\sigma,DD}$ the restriction of the operator $T_{\sigma,\infty}$ defined in Section 2 by the Dirichlet boundary condition at the point $x = 1$. Then $T_{\sigma,DD}$ has a discrete spectrum [29] that accumulates at $+\infty$; we denote by $\lambda_k^2$, $k \in \mathbb{N}$, the eigenvalues of $T_{\sigma,DD}$ counted with multiplicities and ordered so that $\operatorname{Re} \lambda_{k+1} \ge \operatorname{Re} \lambda_k$ and $|\operatorname{Im} \lambda_{k+1}| \ge |\operatorname{Im} \lambda_k|$ in the case of equality above. Here and in the following, $\lambda_k$ are taken from the closed right half-plane.

THEOREM 6.1. For the eigenvalues $\lambda_k^2$ ordered as explained above we have $\lambda_k = \pi k + \tilde\lambda_k$, where the sequence $(\tilde\lambda_k)$ belongs to $\ell_2$.

Proof. Observe first that for any $\lambda \in \mathbb{C}$ the solution to the equation $\ell_\sigma y = \lambda^2 y$ satisfying the boundary condition $y(0) = 0$ equals

\[ y(x, \lambda) = s_\lambda(x) + \int_0^x k_{\sigma,\infty}(x,t)\,s_\lambda(t)\,dt; \]

here $s_\lambda(x) = (\sin \lambda x)/\lambda$ for $\lambda \ne 0$ and $s_0(x) \equiv x$, and $k_{\sigma,\infty}$ is the kernel of the TO $I + K_{\sigma,\infty}$. Therefore $\lambda^2$ is an eigenvalue of the operator $T_{\sigma,DD}$ if and only if $y(1, \lambda) = 0$, and in that case $y(x, \lambda)$ is a corresponding eigenfunction. In other words, the spectrum of $T_{\sigma,DD}$ coincides with the squared zeros of the entire function

\[ \Delta(\lambda) := \frac{\sin \lambda}{\lambda} + \int_0^1 k_{\sigma,\infty}(1,t)\,\frac{\sin \lambda t}{\lambda}\,dt. \]

Observe also that $\lambda$ is a zero of the function $\Delta$ of multiplicity $k \ge 1$ if and only if the functions $y(x,\lambda), \frac{\partial}{\partial\lambda} y(x,\lambda), \dots, (\frac{\partial}{\partial\lambda})^{k-1} y(x,\lambda)$ form a chain of eigen- and associated functions of the operator $T_{\sigma,DD}$ corresponding to the eigenvalue $\lambda^2$, i.e., the multiplicity of a zero $\lambda$ of $\Delta$ coincides with the algebraic multiplicity of the eigenvalue $\lambda^2$ of $T_{\sigma,DD}$. Thus it suffices to study the asymptotics of zeros of the function $\Delta(\lambda)$ in the half-plane $\operatorname{Re} \lambda \ge 0$. Since $\Delta$ is an even function, its zeros are symmetric with respect to the origin and we can study the zeros in the whole plane $\mathbb{C}$.

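Put

\[ Q(\mu) := \int_0^1 k_{\sigma,\infty}(1,t) \sin \mu t\,dt; \]

then $Q$ is an entire function of exponential type and

\[ \lim_{|\mu| \to \infty} e^{-|\operatorname{Im} \mu|} Q(\mu) = 0 \tag{6.1} \]

by [22, Lemma 1.3.1]. For any $n \in \mathbb{Z}$, denote by $\Gamma_n$ the boundary of the rectangle

\[ R_n := \{\mu = \nu + i\tau \mid \nu, \tau \in \mathbb{R},\ |\nu - \pi n| < \pi/2,\ |\tau| < 1\}. \]

Then

\[ \inf_{\mu \in \Gamma_n} |\sin \mu| = c > 0 \]

is independent of $n$, while

\[ \sup_{\mu \in \Gamma_n} |Q(\mu)| \to 0 \]

as $|n| \to \infty$ by virtue of (6.1). By Rouché's theorem each $R_n$ contains exactly one zero of the function $\Delta$ for all $n$ large enough. Representing this solution as $\mu_n = \pi n + \tilde\mu_n$, we see that

\[ \sin \tilde\mu_n = (-1)^{n+1} Q(\pi n + \tilde\mu_n) \to 0 \]

and whence $\tilde\mu_n \to 0$ as $n \to \infty$. It follows now that

\[ \tilde\mu_n = (-1)^{n+1} \int_0^1 k_{\sigma,\infty}(1,t) \sin \pi n t\,dt + o(|\tilde\mu_n|); \]

since $\{\sqrt2 \sin \pi n t\}_{n=1}^\infty$ is an orthonormal basis of $H$ and $k_{\sigma,\infty}(1,t) \in H$, this implies that $(\tilde\mu_n) \in \ell_2$.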
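The claim that the infimum of $|\sin \mu|$ over the boundary of $R_n$ does not depend on $n$ follows from the identity $|\sin(\nu + i\tau)|^2 = \sin^2\nu + \sinh^2\tau$, which gives the value $c = 1$ (attained on the vertical sides at $\tau = 0$). The sketch below samples the boundary for a few values of $n$ to illustrate this; the sampling density and the helper names are assumptions introduced for the illustration.

```python
import cmath
import math

def boundary_points(n, samples=201):
    """Sample points on the boundary of R_n = {|Re mu - pi n| < pi/2, |Im mu| < 1}."""
    centre = math.pi * n
    half_width = math.pi / 2
    points = []
    for i in range(samples):
        s = 2.0 * i / (samples - 1) - 1.0                        # s runs over [-1, 1]
        points.append(complex(centre - half_width, s))           # left side
        points.append(complex(centre + half_width, s))           # right side
        points.append(complex(centre + s * half_width, 1.0))     # top side
        points.append(complex(centre + s * half_width, -1.0))    # bottom side
    return points

def min_abs_sin(n):
    """Minimum of |sin mu| over the sampled boundary of R_n."""
    return min(abs(cmath.sin(mu)) for mu in boundary_points(n))
```

Calling `min_abs_sin(n)` for several integers `n` returns values that agree up to rounding, which is the uniformity needed to apply Rouché's theorem on every $R_n$ with $|n|$ large.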


Finally we observe that for all $n$ large enough $\sin \lambda/\lambda$ and $\Delta$ have the same number of zeros (counting multiplicities) inside the regions

\[ P_n := \{\mu \in \mathbb{C} \mid |\operatorname{Re} \mu| \le \pi n + \pi/2,\ |\operatorname{Im} \mu| < n\}; \]

this easily follows by Rouché's theorem (see details in [22, Ch. 1.3]). Therefore $\lambda_n = \mu_n$ for all $n$ large enough; in particular, we may write $\lambda_n = \pi n + \tilde\lambda_n$ with $(\tilde\lambda_n) \in \ell_2$. The theorem is proved. □

It follows from Theorem 6.1 that all eigenvalues $\lambda_n^2$ but maybe finitely many are simple. We denote by $\phi_n$, $n \in \mathbb{N}$, the system of eigen- and associated functions of the operator $T_{\sigma,DD}$ corresponding to the eigenvalues $\lambda_n^2$. For all $n$ large enough (e.g., such that $\lambda_n^2$ is a simple eigenvalue of $T_{\sigma,DD}$) we normalize the eigenfunctions $\phi_n$ by the condition $\phi_n^{[1]}(0) = \sqrt2 \lambda_n$; observe that for such $n$ we have $\phi_n(t) = (I + K_{\sigma,\infty}) \sqrt2 \sin \lambda_n t$.

THEOREM 6.2. The system $(\phi_n)_{n=1}^\infty$ forms a Bari basis of $H$ (i.e., a basis that is quadratically close to an orthonormal one).

Proof. Put $\psi_n := \sqrt2 \sin \lambda_n x$ if $n \in \mathbb{N}$ is such that $\lambda_n^2$ is a simple eigenvalue of $T_{\sigma,DD}$ and put $\psi_n := \sqrt2 \sin \lambda_n x$, $\psi_{n+1} = \sqrt2\,x \sin \lambda_n x$, ..., $\psi_{n+k} = \sqrt2\,x^k \sin \lambda_n x$ if $\lambda_n^2 = \lambda_{n+1}^2 = \dots = \lambda_{n+k}^2$ is an eigenvalue of $T_{\sigma,DD}$ of algebraic multiplicity $k + 1$. Then due to the asymptotics of $\lambda_n$ established in the previous theorem the system $(\psi_n)_{n=1}^\infty$ is complete in $H$, see [19, App. III]. For all $n$ large enough we have $\phi_n = (I + K_{\sigma,\infty})\psi_n$; we can also choose the remaining functions $\phi_n$ accordingly so that the above equality will hold for all $n \in \mathbb{N}$. Since $I + K_{\sigma,\infty}$ is a homeomorphism, we conclude that the system $(\phi_n)$ is complete in $H$.

Put now $\psi_{n,0} := \sqrt2 \sin \pi n x$; then $(\psi_{n,0})$ is an orthonormal basis of $H$ and

\[ \phi_k - \psi_{k,0} = (I + K_{\sigma,\infty})(\psi_k - \psi_{k,0}) + K_{\sigma,\infty}\psi_{k,0}. \]

Observe that

\[ \psi_k(x) - \psi_{k,0}(x) = 2\sqrt2 \sin\frac{\tilde\lambda_k x}{2} \cos\Big(\big(\pi k + \tilde\lambda_k/2\big)x\Big) = O(|\tilde\lambda_k|) \]

for all $k$ large enough, so that

\[ \sum \|\psi_k - \psi_{k,0}\|^2 < \infty \]

on account of the inclusion $(\tilde\lambda_k) \in \ell_2$. Since $K_{\sigma,\infty}$ is a Hilbert–Schmidt operator and $(\psi_{k,0})$ is an orthonormal basis of $H$, we also have

\[ \sum \|K_{\sigma,\infty}\psi_{k,0}\|^2 < \infty. \]

Therefore

\[ \sum \|\phi_k - \psi_{k,0}\|^2 < \infty, \]

and thus $(\phi_n)_{n=1}^\infty$ is a Bari basis of $H$ (see [11, Ch. VI]). □
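The estimate $\psi_k - \psi_{k,0} = O(|\tilde\lambda_k|)$ used in the proof rests on the sum-to-product identity $\sin a - \sin b = 2\sin\frac{a-b}{2}\cos\frac{a+b}{2}$ with $a = (\pi k + \tilde\lambda_k)x$ and $b = \pi k x$. The following sketch checks the identity pointwise and, for a real remainder, the resulting bound $\sqrt2\,|\tilde\lambda_k|\,x$; the sample values of $k$ and $\tilde\lambda_k$ and the function names are assumptions chosen only for illustration.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)

def psi(k, lam_tilde, x):
    """psi_k(x) = sqrt(2) sin((pi k + lam_tilde) x) for a simple eigenvalue."""
    return SQRT2 * cmath.sin((math.pi * k + lam_tilde) * x)

def psi0(k, x):
    """Unperturbed function psi_{k,0}(x) = sqrt(2) sin(pi k x)."""
    return SQRT2 * math.sin(math.pi * k * x)

def product_form(k, lam_tilde, x):
    """2 sqrt(2) sin(lam_tilde x / 2) cos((pi k + lam_tilde / 2) x)."""
    return (2 * SQRT2 * cmath.sin(lam_tilde * x / 2)
            * cmath.cos((math.pi * k + lam_tilde / 2) * x))

def max_identity_gap(k, lam_tilde, points=51):
    """Largest pointwise deviation between psi_k - psi_{k,0} and the product form."""
    xs = [i / (points - 1) for i in range(points)]
    return max(abs(psi(k, lam_tilde, x) - psi0(k, x) - product_form(k, lam_tilde, x))
               for x in xs)
```

With, say, `max_identity_gap(7, complex(0.05, 0.02))` the deviation is at rounding level, and for a real remainder the difference $|\psi_k(x) - \psi_{k,0}(x)|$ stays below $\sqrt2\,|\tilde\lambda_k|\,x$ because $|\sin u| \le |u|$.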

Our final result shows that existence of TOs implies that the operators $T_{\sigma,DD}$ with $\sigma \in H$ are similar to rank one perturbations of the potential-free operator $T_{0,DD}$.

THEOREM 6.3. Suppose that $\sigma \in H$. Then there exists a function $p = p_\sigma \in H$ such that $T_{\sigma,DD}$ is similar to the operator $S$ given by

\[ (Sf)(x) = -f'' \]

on the domain

\[ \operatorname{dom} S = \Big\{ f \in W_2^2[0,1] \ \Big|\ f(0) = f(1) + \int_0^1 p(t)\,f(t)\,dt = 0 \Big\}. \]

The similarity is performed by the operator $I + K_{\sigma,\infty}$.

Proof. Since $T_{\sigma,\infty}(I + K_{\sigma,\infty}) = (I + K_{\sigma,\infty}) T_{0,\infty}$ and $T_{\sigma,DD}$ is a one-dimensional restriction of $T_{\sigma,\infty}$, the operator $T_{\sigma,DD}$ is similar to a one-dimensional restriction $S$ of the operator $T_{0,\infty}$. The domain of $S$ is found from the requirement that $(I + K_{\sigma,\infty}) \operatorname{dom} S = \operatorname{dom} T_{\sigma,DD}$. Thus every $f \in \operatorname{dom} S$ belongs to $W_2^2[0,1]$ and satisfies the conditions

\[ f(0) = f(1) + \int_0^1 k_{\sigma,\infty}(1,t)\,f(t)\,dt = 0, \]

and it remains to observe that $p(t) := k_{\sigma,\infty}(1,t)$ belongs to $H$. □

Similar results also hold for the restriction $T_{\sigma,DN}$ of the operator $T_{\sigma,\infty}$ by the Neumann boundary condition $f^{[1]}(1) = h f(1)$, $h \in \mathbb{C}$, at the point $x = 1$ and for the restrictions $T_{\sigma,ND}$ and $T_{\sigma,NN}$ of the operator $T_{\sigma,0}$ by the Dirichlet and Neumann boundary conditions respectively at the point $x = 1$.

References

1. Albeverio, S., Gesztesy, F., Høegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics, Springer, New York, 1988.

2. Albeverio, S. and Kurasov, P.: Singular Perturbations of Differential Operators. Solvable Schrödinger Type Operators, Cambridge University Press, Cambridge, 2000.

3. Andersson, L.: Inverse eigenvalue problems for a Sturm–Liouville equation in impedance form, Inverse Probl. 4 (1988), 929–971.

4. Berezanskii, Yu. M. and Brasche, J.: Generalized selfadjoint operators and their singular perturbations, Methods Funct. Anal. Topol. 8(4) (2002), 1–14.

5. Coleman, C. F. and McLaughlin, J. R.: Solution of the inverse spectral problem for an impedance with integrable derivative, I, Comm. Pure Appl. Math. 46 (1993), 145–184; II, Comm. Pure Appl. Math. 46 (1993), 185–212.

6. Dunford, N. and Schwartz, J. T.: Linear Operators. Part III: Spectral Operators, Wiley–Interscience, New York, 1988.


7. Freiling, G. and Yurko, V.: On the determination of differential equations with singularities and turning points, Results Math. 41 (2002), 275–290.

8. Friedrichs, K. O.: Perturbation of Spectra in Hilbert Space, Lectures in Appl. Math. 3, Amer. Math. Soc., Providence, RI, 1965.

9. Gelfand, I. M. and Levitan, B. M.: On determination of a differential equation by its spectral function, Izv. Akad. Nauk SSSR Ser. Mat. 15(4) (1951), 309–360 (in Russian).

10. Gohberg, I., Goldberg, S. and Kaashoek, M.: Classes of Linear Operators, Birkhäuser, Basel, 1987.

11. Gohberg, I. and Krein, M.: Introduction to the Theory of Linear Non-selfadjoint Operators in Hilbert Space, Nauka Publ., Moscow, 1965 (in Russian); Engl. transl.: Amer. Math. Soc. Transl. Math. Monogr. 18, Amer. Math. Soc., Providence, RI, 1969.

12. Gohberg, I. and Krein, M.: Theory of Volterra Operators in Hilbert Space and its Applications, Nauka Publ., Moscow, 1967 (in Russian); Engl. transl.: Amer. Math. Soc. Transl. Math. Monogr. 24, Amer. Math. Soc., Providence, RI, 1970.

13. Hald, O.: Discontinuous inverse eigenvalue problems, Comm. Pure Appl. Math. 37 (1984), 539–577.

14. Herczynski, J.: On Schrödinger operators with distributional potentials, J. Oper. Theory 21(2) (1989), 273–295.

15. Hryniv, R. O. and Mykytyuk, Ya. V.: 1D Schrödinger operators with singular periodic potentials, Meth. Funct. Anal. Topol. 7(4) (2001), 31–42.

16. Hryniv, R. O. and Mykytyuk, Ya. V.: 1D Schrödinger operators with singular Gordon potentials, Meth. Funct. Anal. Topol. 8(1) (2002), 36–48.

17. Hryniv, R. O. and Mykytyuk, Ya. V.: Inverse spectral problems for Sturm–Liouville operators with singular potentials, Inverse Probl. 19 (2003), 665–684.

18. Hryniv, R. O. and Mykytyuk, Ya. V.: Inverse spectral problem for Sturm–Liouville operators with singular potentials, II. Reconstruction by two spectra, In: V. Kadets and W. Zelazko (eds), Proceedings of the Conference on Functional Analysis and Its Applications Dedicated to the 110th Anniversary of Stefan Banach, Lviv, May 28–31, 2002, North-Holland Math. Studies, Elsevier, 2004 (to appear).

19. Levin, B. Ya.: Distribution of Zeros of Entire Functions, Gostekhizdat, Moscow, 1956 (in Russian); Engl. transl.: Amer. Math. Soc., Providence, RI, 1964.

20. Levitan, B. M.: Inverse Sturm–Liouville Problems, Nauka Publ., Moscow, 1984 (in Russian); Engl. transl.: VNU Science Press, Utrecht, 1987.

21. Marchenko, V. A.: Some questions of the theory of second order differential operators, Dokl. Akad. Nauk SSSR 72(3) (1950), 457–460 (in Russian).

22. Marchenko, V. A.: Sturm–Liouville Operators and Their Applications, Naukova Dumka Publ., Kyiv, 1977 (in Russian); Engl. transl.: Birkhäuser, Basel, 1986.

23. Mykytyuk, Ya. V.: Factorisation of Fredholm operators, 2001, preprint.

24. Neiman-zade, M. I. and Shkalikov, A. A.: Schrödinger operators with singular potentials from the space of multipliers, Mat. Zametki (Math. Notes) 66(5) (1999), 723–733.

25. Povzner, A. Ya.: On differential Sturm–Liouville operators on semiaxis, Math. USSR-Sb. 23(65) (1948), 3–52.

26. Pöschel, J. and Trubowitz, E.: Inverse Spectral Theory, Pure Appl. Math. 130, Academic Press, Orlando, Florida, 1987.

27. Rofe-Beketov, F. S. and Khristov, E. H.: Some analytical questions and the inverse Sturm–Liouville problem for an equation with highly singular potential, Dokl. Akad. Nauk SSSR 185(4) (1969), 768–771 (in Russian); Engl. transl.: Soviet Math. Dokl. 10(1) (1969), 188–192.

28. Savchuk, A. M.: On eigenvalues and eigenfunctions of Sturm–Liouville operators with singular potentials, Mat. Zametki (Math. Notes) 69(2) (2001), 277–285 (in Russian).

29. Savchuk, A. M. and Shkalikov, A. A.: Sturm–Liouville operators with singular potentials, Mat. Zametki (Math. Notes) 66(6) (1999), 897–912 (in Russian).


30. Savchuk, A. M. and Shkalikov, A. A.: Sturm–Liouville operators with distributional potentials, Trudy Moskov. Mat. Obshch. (Trans. Moscow Math. Soc.), 64 (2003), to appear (in Russian).

31. Yurko, V. A.: Inverse problems for differential equations with singularities lying inside the interval, J. Inverse Ill-Posed Probl. 8(1) (2000), 89–103.

32. Zhikov, V. V.: On inverse Sturm–Liouville problems on a finite segment, Izv. Akad. Nauk SSSR 35(5) (1967), 965–976 (in Russian).

Mathematical Physics, Analysis and Geometry 7: 151–185, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

A General Framework for Localization of Classical Waves: II. Random Media *

ABEL KLEIN1 and ANDREW KOINES2

1University of California, Irvine, Department of Mathematics, Irvine, CA 92697-3875, U.S.A. e-mail: [email protected] Coast College, Department of Mathematics, Costa Mesa, CA 92626, U.S.A.

(Received: 26 November 2002)

Abstract. We study localization of classical waves in random media in the general framework introduced in Part I of this work. This framework allows for two random coefficients, encompasses acoustic waves with random position dependent compressibility and mass density, elastic waves with random position dependent Lamé moduli and mass density, electromagnetic waves with random position dependent magnetic permeability and dielectric constant, and allows for anisotropy. We show exponential localization (Anderson localization) and strong Hilbert–Schmidt dynamical localization for random perturbations of periodic media with a spectral gap.

Mathematics Subject Classifications (2000): 35Q60, 35Q99, 78A99, 78A48, 74J99, 35P99, 47F05.

Key words: wave localization, random media, Anderson localization, dynamical localization, Wegner estimate.

1. Introduction

In this series of articles we provide a general framework for studying localization of acoustic waves, elastic waves, and electromagnetic waves in inhomogeneous and random media, i.e., the existence of acoustic, elastic, and electromagnetic waves such that almost all of the wave's energy remains in a fixed bounded region uniformly over time. Our general framework encompasses acoustic waves with position dependent compressibility and mass density, elastic waves with position dependent Lamé moduli and mass density, and electromagnetic waves with position dependent magnetic permeability and dielectric constant. We also allow for anisotropy.

In the first article [KK] we developed mathematical methods to study wave localization in inhomogeneous media; as an application we proved localization for local perturbations (defects) of media with a gap in the spectrum and studied midgap eigenmodes. In this second article these methods are applied to prove existence of exponential localization (Anderson localization) and strong Hilbert–Schmidt dynamical localization for classical waves in random media. This phenomenon has been experimentally observed for light waves [WBLR].

* This work was partially supported by NSF Grants DMS-9800883 and DMS-0200710.

Previous results on localization of classical waves in random media [FK3, FK4, FK7, CHT] considered only the case of one random coefficient. Acoustic and electromagnetic waves were treated separately. Elastic waves were not discussed.

Our results extend the work of Figotin and Klein [FK3, FK4, FK7] in several ways: (1) We study a general class of classical waves which includes acoustic, electromagnetic and elastic waves as special cases. (2) We allow for two random coefficients (e.g., electromagnetic waves in media where both the magnetic permeability and the dielectric constant are random). (3) We allow for anisotropy in our wave equations. (4) We prove strong Hilbert–Schmidt dynamical localization in random media, using the bootstrap multiscale analysis of Germinet and Klein [GK1] and the generalized eigenfunction expansion of Klein, Koines and Seifert [KKS] for classical wave operators.

Our approach to the mathematical study of localization of classical waves is operator theoretic and reminiscent of quantum mechanics. It is based on the fact that many wave propagation phenomena in classical physics are governed by equations that can be recast in abstract Schrödinger form [Wi, SW, FK4, Kle, KKS, KK]. The corresponding self-adjoint operator, which governs the dynamics, is a first-order partial differential operator, but its spectral theory may be studied through an auxiliary self-adjoint, second-order partial differential operator. These second-order classical wave operators are analogous to Schrödinger operators in quantum mechanics. The method is particularly suitable for the study of phenomena historically associated with quantum mechanical electron waves, especially Anderson localization in random media [FK3, FK4, FK7, Kle] and midgap defect eigenmodes [FK5, FK6, KK].

Physically interesting inhomogeneous and random media give rise to nonsmooth coefficients in the classical wave equations and, hence, in their classical wave operators. Thus we make no assumptions about the smoothness of the coefficients of classical wave operators.

Classical waves do not localize in a homogeneous medium; to obtain wave localization an appropriate medium must be fabricated. We start with an underlying periodic medium (a 'photonic crystal' in the case of light waves) with a spectral gap. As randomness is added to the medium, we prove that the gap in the spectrum shrinks (possibly closing), and localization occurs in the spectrum at the edges of the gap. A crucial technical result is a Wegner-type estimate for random second-order classical wave operators with two random coefficients.

This paper is organized as follows: In Section 2 we review our framework for studying classical waves. In Section 3 we discuss localization of classical waves in random media. We introduce a model for random media, and consider the corresponding random classical wave operators. Exponential localization and strong Hilbert–Schmidt dynamical localization are defined. The connection between localization of a random first-order classical wave operator and localization of the two associated random second-order classical wave operators is described in Remarks 3.7 and 3.8. We study the effect of randomness on a spectral gap of an underlying periodic medium in Theorem 3.10. The results on localization are stated in Theorems 3.11–3.14. In Section 4 we show that random second-order partially elliptic classical wave operators satisfy the requirements for the bootstrap multiscale analysis in Theorem 4.1; the Wegner estimate for random second-order classical wave operators is given in Theorem 4.4. The results on localization are proven using the bootstrap multiscale analysis.

A GENERAL FRAMEWORK FOR LOCALIZATION OF CLASSICAL WAVES: II 153

2. Classical Wave Operators

We start by reviewing the mathematical framework for classical waves introduced in the prequel [KK], to which we refer for discussion and examples.

Many classical wave equations in a linear, lossless, inhomogeneous medium can be written as first order equations of the form:

$$K(x)^{-1}\frac{\partial}{\partial t}\psi_t(x) = D^*\phi_t(x), \qquad R(x)^{-1}\frac{\partial}{\partial t}\phi_t(x) = -D\psi_t(x), \tag{2.1}$$

where $x \in \mathbb{R}^d$ (space), $t \in \mathbb{R}$ (time), $\psi_t(x) \in \mathbb{C}^n$ and $\phi_t(x) \in \mathbb{C}^m$ are physical quantities that describe the state of the medium at position $x$ and time $t$, $D$ is an $m \times n$ matrix whose entries are first-order partial differential operators with constant coefficients (see Definition 2.1), $D^*$ is the formal adjoint of $D$, and $K(x)$ and $R(x)$ are $n \times n$ and $m \times m$ positive, invertible matrices, uniformly bounded from above and away from 0, that describe the medium at position $x$ (see Definition 2.3). In addition, $D$ satisfies a partial ellipticity property (see Definition 2.2), and there may be auxiliary conditions to be satisfied by the quantities $\psi_t(x)$ and $\phi_t(x)$.

The physical quantities $\psi_t(x)$ and $\phi_t(x)$ then satisfy second-order wave equations, with the same auxiliary conditions:

$$\frac{\partial^2}{\partial t^2}\psi_t(x) = -K(x)D^*R(x)D\psi_t(x), \tag{2.2}$$

$$\frac{\partial^2}{\partial t^2}\phi_t(x) = -R(x)DK(x)D^*\phi_t(x). \tag{2.3}$$

Conversely, given (2.2) (or (2.3)), we may write this equation in the form (2.1) by introducing an appropriate quantity $\phi_t(x)$ (or $\psi_t(x)$), which will then satisfy Equation (2.3) (or (2.2)).
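The passage from the first-order system (2.1) to the second-order equation (2.2) can be checked on the simplest possible case. The following minimal Python sketch assumes a hypothetical one-dimensional medium with constant scalar coefficients and takes D = -i d/dx, so that D acts on a plane wave exp(i(kx - ωt)) as multiplication by k. The numerical values of K, R and k are arbitrary illustrations, not taken from the text.

```python
import cmath
import math

# Hypothetical 1D constant-coefficient medium (n = m = d = 1); values are arbitrary.
K = 2.0   # plays the role of K(x)
R = 0.5   # plays the role of R(x)
k = 3.0   # wave number
omega = math.sqrt(K * R) * k  # dispersion relation omega^2 = K R k^2

# With D = -i d/dx (so D* = D formally), D multiplies exp(i k x) by k.
psi0 = 1.0 + 0.0j
phi0 = -1j * omega / (K * k) * psi0  # amplitude fixed by the first equation in (2.1)


def residuals(x, t):
    """Absolute residuals of both equations in (2.1) and of (2.2) for the plane wave."""
    phase = cmath.exp(1j * (k * x - omega * t))
    psi, phi = psi0 * phase, phi0 * phase
    dt_psi, dt_phi = -1j * omega * psi, -1j * omega * phi  # exact time derivatives
    r1 = dt_psi / K - k * phi                       # K^{-1} d/dt psi - D* phi
    r2 = dt_phi / R + k * psi                       # R^{-1} d/dt phi + D psi
    r3 = (-omega ** 2) * psi + K * k * R * k * psi  # d^2/dt^2 psi + K D* R D psi
    return abs(r1), abs(r2), abs(r3)
```

All three residuals vanish up to rounding exactly when ω² = K R k², which is the dispersion relation of this toy medium.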

The wave equation (2.1) may be rewritten in abstract Schrödinger form:

$$-i\frac{d}{dt}\Psi_t = \mathbb{W}\Psi_t, \tag{2.4}$$

where

$$\Psi_t = \begin{pmatrix} \psi_t \\ \phi_t \end{pmatrix} \quad\text{and}\quad \mathbb{W} = \begin{pmatrix} 0 & -iK(x)D^* \\ iR(x)D & 0 \end{pmatrix}. \tag{2.5}$$

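The (first order) classical wave operator $\mathbb{W}$ is formally (and can be defined as) a self-adjoint operator on the Hilbert space

$$\mathcal{H} = L^2(\mathbb{R}^d, K(x)^{-1}dx; \mathbb{C}^n) \oplus L^2(\mathbb{R}^d, R(x)^{-1}dx; \mathbb{C}^m), \tag{2.6}$$

where, for a $k \times k$ positive invertible matrix-valued measurable function $S(x)$, we set

$$L^2(\mathbb{R}^d, S(x)^{-1}dx; \mathbb{C}^k) = \{ f\colon \mathbb{R}^d \to \mathbb{C}^k;\ \langle f, S(x)^{-1}f \rangle_{L^2(\mathbb{R}^d, dx; \mathbb{C}^k)} < \infty \}.$$

The auxiliary conditions to the wave equation are imposed by requiring the solutions to Equation (2.4) to also satisfy

$$\Psi_t = P^\perp_{\mathbb{W}}\Psi_t, \tag{2.7}$$

where $P^\perp_{\mathbb{W}}$ denotes the orthogonal projection onto the orthogonal complement of the kernel of $\mathbb{W}$. The solutions to Equations (2.4) and (2.7) are of the form

$$\Psi_t = e^{it\mathbb{W}}P^\perp_{\mathbb{W}}\Psi_0, \quad \Psi_0 \in \mathcal{H}. \tag{2.8}$$

The energy density at time $t$ of a solution $\Psi \equiv \Psi_t(x) = (\psi_t(x), \phi_t(x))$ of the wave equation (2.1) is given by

$$\mathcal{E}_\Psi(t, x) = \tfrac12 \{ \langle \psi_t(x), K(x)^{-1}\psi_t(x) \rangle_{\mathbb{C}^n} + \langle \phi_t(x), R(x)^{-1}\phi_t(x) \rangle_{\mathbb{C}^m} \}. \tag{2.9}$$

The wave energy, a conserved quantity, is thus given by

$$\mathcal{E}_\Psi = \tfrac12 \|\Psi_t\|^2_{\mathcal{H}} \quad\text{for any } t. \tag{2.10}$$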

Note that (2.8) gives the finite energy solutions to the wave equation (2.1).

It is convenient to work on $L^2(\mathbb{R}^d, dx; \mathbb{C}^k)$ instead of the weighted space $L^2(\mathbb{R}^d, S(x)^{-1}dx; \mathbb{C}^k)$. To do so, note that the operator $V_S$, given by multiplication by the matrix $S(x)^{-1/2}$, is a unitary map from the Hilbert space $L^2(\mathbb{R}^d, S(x)^{-1}dx; \mathbb{C}^k)$ to $L^2(\mathbb{R}^d, dx; \mathbb{C}^k)$, and if we set $\tilde{\mathbb{W}} = (V_K \oplus V_R)\mathbb{W}(V_K^* \oplus V_R^*)$, we have

$$\tilde{\mathbb{W}} = \begin{pmatrix} 0 & -i\sqrt{K(x)}\,D^*\sqrt{R(x)} \\ i\sqrt{R(x)}\,D\sqrt{K(x)} & 0 \end{pmatrix}, \tag{2.11}$$

a formally self-adjoint operator on $L^2(\mathbb{R}^d, dx; \mathbb{C}^n) \oplus L^2(\mathbb{R}^d, dx; \mathbb{C}^m)$.

In addition, if $S_-I \le S(x) \le S_+I$ with $0 < S_- \le S_+ < \infty$, as will be the case in this article, it turns out that if $\tilde\varphi = V_S\varphi$, then the functions $\varphi(x)$ and $\tilde\varphi(x)$ share the same decay and growth properties (e.g., exponential or polynomial decay).

Thus it will suffice for us to work on $L^2(\mathbb{R}^d, dx; \mathbb{C}^k)$, and we will do so in the remainder of this article. We set

$$\mathcal{H}^{(k)} = L^2(\mathbb{R}^d, dx; \mathbb{C}^k). \tag{2.12}$$
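The change of Hilbert space described above can be illustrated numerically. The sketch below assumes a hypothetical scalar coefficient S(x) = 2 + sin(2πx) in one dimension (so S₋ = 1 and S₊ = 3) and an arbitrary test function f. It checks on a grid that multiplication by S(x)^{-1/2} preserves the norm, and that |V_S f| is squeezed between S₊^{-1/2}|f| and S₋^{-1/2}|f|, which is why decay properties carry over.

```python
import math

# Hypothetical scalar coefficient (k = 1) with S_- = 1 <= S(x) <= S_+ = 3.
def S(x):
    return 2.0 + math.sin(2.0 * math.pi * x)

def f(x):
    return math.exp(-x * x)  # arbitrary test function

N = 2000
h = 10.0 / N
grid = [-5.0 + (j + 0.5) * h for j in range(N)]

# Squared norm of f in the weighted space L^2(S(x)^{-1} dx), midpoint Riemann sum.
weighted_norm_sq = sum(f(x) ** 2 / S(x) for x in grid) * h

# V_S f = S^{-1/2} f, measured in the unweighted space L^2(dx).
def V_S_f(x):
    return f(x) / math.sqrt(S(x))

plain_norm_sq = sum(V_S_f(x) ** 2 for x in grid) * h

# Pointwise comparison: S_+^{-1/2} |f| <= |V_S f| <= S_-^{-1/2} |f|.
S_minus, S_plus = 1.0, 3.0
pointwise_ok = all(
    abs(f(x)) / math.sqrt(S_plus) - 1e-15
    <= abs(V_S_f(x))
    <= abs(f(x)) / math.sqrt(S_minus) + 1e-15
    for x in grid
)
```

The two discrete norms agree up to rounding because the weight is absorbed pointwise; no property of the particular S or f is used beyond the two-sided bound.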

Given a closed densely defined operator $T$ on a Hilbert space $\mathcal{H}$, we will denote its kernel by $\ker T$ and its range by $\operatorname{ran} T$; note $\ker T^*T = \ker T$. If $T$ is self-adjoint, it leaves invariant the orthogonal complement of its kernel; the restriction of $T$ to $(\ker T)^\perp$ will be denoted by $T_\perp$. Note that $T_\perp$ is a self-adjoint operator on the Hilbert space $(\ker T)^\perp = P_T^\perp\mathcal{H}$, where $P_T^\perp$ denotes the orthogonal projection onto $(\ker T)^\perp$.

DEFINITION 2.1. A constant coefficient, first order, partial differential operator $D$ from $\mathcal{H}^{(n)}$ to $\mathcal{H}^{(m)}$ ($\mathrm{CPDO}^{(1)}_{n,m}$) is of the form $D = D(-i\nabla)$, where, for a $d$-component vector $k$, $D(k)$ is the $m \times n$ matrix

$$D(k) = [D(k)_{r,s}]_{\substack{r=1,\dots,m \\ s=1,\dots,n}}; \quad D(k)_{r,s} = a_{r,s} \cdot k, \quad a_{r,s} \in \mathbb{C}^d. \tag{2.13}$$

We set

$$D_+ = \sup\{\|D(k)\|;\ k \in \mathbb{C}^d,\ |k| = 1\}, \tag{2.14}$$

so $\|D(k)\| \le D_+|k|$ for all $k \in \mathbb{C}^d$. Note that $D_+$ is bounded by the norm of the matrix $[|a_{r,s}|]_{\substack{r=1,\dots,m \\ s=1,\dots,n}}$.

Defined on

$$\mathcal{D}(D) = \{\psi \in \mathcal{H}^{(n)} : D\psi \in \mathcal{H}^{(m)} \text{ in distributional sense}\}, \tag{2.15}$$

a $\mathrm{CPDO}^{(1)}_{n,m}$ $D$ is a closed, densely defined operator, and $C_0^\infty(\mathbb{R}^d; \mathbb{C}^n)$ (the space of infinitely differentiable functions with compact support) is an operator core for $D$. We will denote by $D^*$ the $\mathrm{CPDO}^{(1)}_{m,n}$ given by the formal adjoint of the matrix in (2.13).

DEFINITION 2.2. A $\mathrm{CPDO}^{(1)}_{n,m}$ $D$ is said to be partially elliptic if there exists a $\mathrm{CPDO}^{(1)}_{n,q}$ $D^\perp$ (for some $q$), satisfying the following two properties:

$$D^\perp D^* = 0, \tag{2.16}$$

$$D^*D + (D^\perp)^*D^\perp \ge \Theta[(-\Delta) \otimes I_n], \tag{2.17}$$

with $\Theta > 0$ being a constant. ($\Delta = \nabla \cdot \nabla$ is the Laplacian on $L^2(\mathbb{R}^d, dx)$; $I_n$ denotes the $n \times n$ identity matrix.)

If $D$ is partially elliptic, we have

$$\mathcal{H}^{(n)} = \ker D^\perp \oplus \ker D, \tag{2.18}$$

and

$$D^*D + (D^\perp)^*D^\perp = (D^*D)_\perp \oplus ((D^\perp)^*D^\perp)_\perp. \tag{2.19}$$

Note that $D$ is elliptic if and only if it is partially elliptic with $D^\perp = 0$. Note also that a $\mathrm{CPDO}^{(1)}_{n,m}$ $D$ may be partially elliptic with $D^*$ not being partially elliptic [KKS, Remark 1.1].
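A standard illustration of partial ellipticity, offered here as an assumption rather than as an example from the text, is the curl in three dimensions with the divergence playing the role of the auxiliary operator. On the Fourier side the curl has symbol i[k]ₓ (the cross-product matrix) and the divergence has symbol i kᵀ. The sketch below checks, for one real vector k, that the symbol of (2.16) vanishes and that the symbol of the left-hand side of (2.17) equals |k|² times the identity, i.e., that (2.17) holds with constant 1. The helper names are hypothetical.

```python
def cross_matrix(k):
    """Real 3x3 matrix [k]_x with [k]_x v = k x v."""
    k1, k2, k3 = k
    return [[0.0, -k3, k2], [k3, 0.0, -k1], [-k2, k1, 0.0]]


def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [
        [sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def symbol_sum(k):
    """D(k)^* D(k) + D_perp(k)^* D_perp(k) for D(k) = i [k]_x and D_perp(k) = i k^T."""
    C = cross_matrix(k)
    CC = matmul(C, C)
    # (i C)^* (i C) = C^T C = -C C because C is real antisymmetric;
    # (i k^T)^* (i k^T) = k k^T.
    return [[-CC[i][j] + k[i] * k[j] for j in range(3)] for i in range(3)]


def div_curl_symbol(k):
    """D_perp(k) D(k)^* = (i k^T)(i [k]_x) = -k^T [k]_x, which should vanish."""
    C = cross_matrix(k)
    return [-sum(k[i] * C[i][j] for i in range(3)) for j in range(3)]
```

For k = (1, 2, 3) the first function returns 14 times the identity and the second returns the zero row, reflecting the identities curl curl = grad div - Δ and div curl = 0.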

DEFINITION 2.3. A coefficient operator $S$ on $\mathcal{H}^{(n)}$ ($\mathrm{CO}_n$) is a bounded, invertible operator given by multiplication by a coefficient matrix: an $n \times n$ matrix-valued measurable function $S(x)$ on $\mathbb{R}^d$, satisfying

$$S_-I_n \le S(x) \le S_+I_n, \quad\text{with } 0 < S_- \le S_+ < \infty. \tag{2.20}$$

DEFINITION 2.4. A multiplicative coefficient, first order, partial differential operator from $\mathcal{H}^{(n)}$ to $\mathcal{H}^{(m)}$ ($\mathrm{MPDO}^{(1)}_{n,m}$) is of the form

$$A = \sqrt{R}\,D\sqrt{K} \quad\text{on } \mathcal{D}(A) = K^{-1/2}\mathcal{D}(D), \tag{2.21}$$

where $D$ is a $\mathrm{CPDO}^{(1)}_{n,m}$, $K$ is a $\mathrm{CO}_n$, and $R$ is a $\mathrm{CO}_m$. (We will write $A_{K,R}$ for $A$ whenever it is necessary to make explicit the dependence on the medium, i.e., on the coefficient operators. $D$ does not depend on the medium, so it will be omitted in the notation.)

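An $\mathrm{MPDO}^{(1)}_{n,m}$ $A$ is a closed, densely defined operator with $A^* = \sqrt{K}\,D^*\sqrt{R}$ an $\mathrm{MPDO}^{(1)}_{m,n}$. Note that $K^{-1/2}C_0^\infty(\mathbb{R}^d; \mathbb{C}^n)$ is an operator core for $A$. The following quantity will appear often in estimates:

$$\Xi_A \equiv D_+\sqrt{R_+K_+}. \tag{2.22}$$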

DEFINITION 2.5. A first-order classical wave operator ($\mathrm{CWO}^{(1)}_{n,m}$) is an operator of the form

$$\mathbb{W}_A = \begin{bmatrix} 0 & -iA^* \\ iA & 0 \end{bmatrix} \quad\text{on } \mathcal{H}^{(n+m)} \cong \mathcal{H}^{(n)} \oplus \mathcal{H}^{(m)}, \tag{2.23}$$

where $A$ is an $\mathrm{MPDO}^{(1)}_{n,m}$. If either $D$ or $D^*$ is partially elliptic, $\mathbb{W}_A$ will also be called partially elliptic. If both $D$ and $D^*$ are partially elliptic, $\mathbb{W}_A$ will be called doubly partially elliptic.

Note that our definition of a first-order classical wave operator is more restrictive than the one used in [KKS]. Our definition of partial ellipticity is also different from [KKS], where partially elliptic corresponds to our doubly partially elliptic; see [KKS, Remark 1.2].

Remark 2.6. The usual first-order classical wave operators are doubly partially elliptic, including the operators corresponding to electromagnetic waves (Maxwell equations), acoustic waves, and elastic waves (see [KK, p. 100]). But there are examples of first-order classical wave operators which are partially elliptic but not doubly partially elliptic (see [KKS, Remark 1.1]).

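The Schrödinger-like equation (2.4) for classical waves with the auxiliary condition (2.7) may be written in the form:

$$-i\frac{\partial}{\partial t}\Psi_t = (\mathbb{W}_A)_\perp\Psi_t, \quad \Psi_t \in (\ker \mathbb{W}_A)^\perp = (\ker A)^\perp \oplus (\ker A^*)^\perp, \tag{2.24}$$

with $\mathbb{W}_A$ a $\mathrm{CWO}^{(1)}_{n+m}$ as in (2.23). Its solutions are of the form

$$\Psi_t = e^{it(\mathbb{W}_A)_\perp}\Psi_0, \quad \Psi_0 \in (\ker \mathbb{W}_A)^\perp, \tag{2.25}$$

which is just another way of writing (2.8).

Since

$$(\mathbb{W}_A)^2 = \begin{bmatrix} A^*A & 0 \\ 0 & AA^* \end{bmatrix}, \tag{2.26}$$

if $\Psi_t = (\psi_t, \phi_t) \in \mathcal{H}^{(n)} \oplus \mathcal{H}^{(m)}$ is a solution of (2.24), then its components satisfy the second-order wave equations (2.2) and (2.3), plus the auxiliary conditions, which may be all written in the form

$$\frac{\partial^2}{\partial t^2}\psi_t = -(A^*A)_\perp\psi_t, \quad\text{with } \psi_t \in (\ker A)^\perp, \tag{2.27}$$

$$\frac{\partial^2}{\partial t^2}\phi_t = -(AA^*)_\perp\phi_t, \quad\text{with } \phi_t \in (\ker A^*)^\perp. \tag{2.28}$$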

The solutions to (2.27) and (2.28) may be written as

$$\psi_t = \cos\bigl(t(A^*A)_\perp^{1/2}\bigr)\psi_0 + \sin\bigl(t(A^*A)_\perp^{1/2}\bigr)\eta_0, \quad \psi_0, \eta_0 \in (\ker A)^\perp, \tag{2.29}$$

$$\phi_t = \cos\bigl(t(AA^*)_\perp^{1/2}\bigr)\phi_0 + \sin\bigl(t(AA^*)_\perp^{1/2}\bigr)\zeta_0, \quad \phi_0, \zeta_0 \in (\ker A^*)^\perp. \tag{2.30}$$

The operators $(A^*A)_\perp$ and $(AA^*)_\perp$ are unitarily equivalent (see [KK, Lemma A.1]): the operator $U$ defined by

$$U\psi = A(A^*A)_\perp^{-1/2}\psi \quad\text{for } \psi \in \operatorname{ran}(A^*A)_\perp^{1/2}, \tag{2.31}$$

extends to a unitary operator from $(\ker A)^\perp$ to $(\ker A^*)^\perp$, and

$$(AA^*)_\perp = U(A^*A)_\perp U^*. \tag{2.32}$$

In particular, $\Psi_t = (\psi_t, \phi_t)$ is the solution of (2.24) given in (2.25) if and only if $\psi_t$ and $\phi_t$ are the solutions (2.29) and (2.30) of (2.27) and (2.28) with $\eta_0 = U^*\phi_0$ and $\zeta_0 = -U\psi_0$.

In addition, if

$$\mathcal{U} = \frac{1}{\sqrt2}\begin{bmatrix} I_A & I_A \\ iU & -iU \end{bmatrix}, \quad\text{with } I_A \text{ the identity on } (\ker A)^\perp, \tag{2.33}$$

$\mathcal{U}$ is a unitary operator from $(\ker A)^\perp \oplus (\ker A)^\perp$ to $(\ker A)^\perp \oplus (\ker A^*)^\perp$, and we have the unitary equivalence:

$$\mathcal{U}^*(\mathbb{W}_A)_\perp\,\mathcal{U} = (A^*A)_\perp^{1/2} \oplus \bigl[-(A^*A)_\perp^{1/2}\bigr]. \tag{2.34}$$

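Thus the operator $(A^*A)_\perp$ contains full information about the spectral theory of the operator $(\mathbb{W}_A)_\perp$. In particular

$$\sigma((\mathbb{W}_A)_\perp) = \sigma\bigl((A^*A)_\perp^{1/2}\bigr) \cup \bigl(-\sigma\bigl((A^*A)_\perp^{1/2}\bigr)\bigr), \tag{2.35}$$

and to find all eigenvalues and eigenfunctions for $(\mathbb{W}_A)_\perp$, it is necessary and sufficient to find all eigenvalues and eigenfunctions for $(A^*A)_\perp$. For if $(A^*A)_\perp\psi_{\omega^2} = \omega^2\psi_{\omega^2}$, with $\omega \ne 0$, $\psi_{\omega^2} \ne 0$, we have

$$(\mathbb{W}_A)_\perp\Bigl(\psi_{\omega^2}, \pm\frac{i}{\omega}A\psi_{\omega^2}\Bigr) = \pm\omega\Bigl(\psi_{\omega^2}, \pm\frac{i}{\omega}A\psi_{\omega^2}\Bigr). \tag{2.36}$$

Conversely, if $(\mathbb{W}_A)_\perp(\psi_{\pm\omega}, \phi_{\pm\omega}) = \pm\omega(\psi_{\pm\omega}, \phi_{\pm\omega})$, with $\omega \ne 0$, it follows that (see [KKS, Proposition 5.2])

$$(A^*A)_\perp\psi_{\pm\omega} = \omega^2\psi_{\pm\omega} \quad\text{and}\quad \phi_{\pm\omega} = \pm\frac{i}{\omega}A\psi_{\pm\omega}. \tag{2.37}$$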
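The eigenvector correspondence (2.36) can be tested in finite dimensions. The sketch below replaces A by a hypothetical real 2×2 matrix (an assumption made purely for illustration; the operators in the text are unbounded differential operators), builds the block operator of (2.23), and checks that an eigenvector of A*A with eigenvalue ω² lifts to eigenvectors of the block operator with eigenvalues +ω and -ω.

```python
import math

A = [[1.0, 1.0], [0.0, 1.0]]  # hypothetical real 2x2 stand-in for A
At = [[A[j][i] for j in range(2)] for i in range(2)]  # A^* (transpose, since A is real)


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def apply_W(psi, phi):
    """Block operator of (2.23): (psi, phi) -> (-i A^* phi, i A psi)."""
    top = [-1j * c for c in matvec(At, phi)]
    bottom = [1j * c for c in matvec(A, psi)]
    return top, bottom


# Eigenpair of A^*A = [[1, 1], [1, 2]]: omega^2 solves lambda^2 - 3 lambda + 1 = 0.
omega_sq = (3.0 + math.sqrt(5.0)) / 2.0
omega = math.sqrt(omega_sq)
psi = [1.0, omega_sq - 1.0]


def eigen_residual(sign):
    """Norm of W(psi, phi) - sign*omega*(psi, phi), with phi = sign*(i/omega) A psi."""
    phi = [sign * 1j / omega * c for c in matvec(A, psi)]
    top, bottom = apply_W(psi, phi)
    diff = [t - sign * omega * p for t, p in zip(top, psi)]
    diff += [b - sign * omega * p for b, p in zip(bottom, phi)]
    return math.sqrt(sum(abs(c) ** 2 for c in diff))
```

Calling `eigen_residual(+1)` and `eigen_residual(-1)` should both return values at rounding level, matching the two signs in (2.36).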

DEFINITION 2.7. A second-order classical wave operator on $\mathcal{H}^{(n)}$ ($\mathrm{CWO}^{(2)}_n$) is an operator $W = A^*A$, with $A$ an $\mathrm{MPDO}^{(1)}_{n,m}$ for some $m$. (We write $W_{K,R} = A^*_{K,R}A_{K,R}$.) If $D$ in (2.21) is partially elliptic, the $\mathrm{CWO}^{(2)}_n$ will also be called partially elliptic.

Note that a first-order classical wave operator $\mathbb{W}_A$ is partially elliptic if and only if one of the two second-order classical wave operators $A^*A$ and $AA^*$ is partially elliptic. It is doubly partially elliptic if both $A^*A$ and $AA^*$ are partially elliptic.

DEFINITION 2.8. A classical wave operator (CWO) is either a $\mathrm{CWO}^{(1)}_n$ or a $\mathrm{CWO}^{(2)}_n$. If the operator $\mathcal{W}$ is a CWO, we call $\mathcal{W}_\perp$ a proper CWO.

Remark 2.9. A proper classical wave operator $\mathcal{W}_\perp$ has a trivial kernel by construction, so 0 is not an eigenvalue. But 0 is in the spectrum of $\mathcal{W}_\perp$ [KKS, Theorem A.1], so $\mathcal{W}_\perp$ and $\mathcal{W}$ have the same spectrum and essential spectrum.

3. Wave Localization in Random Media

The form of the wave equation (2.1) is given by a constant coefficient, first order, partial differential operator $D$ from $\mathcal{H}^{(n)}$ to $\mathcal{H}^{(m)}$; the properties of the medium are encoded in the coefficient matrices $K(x)$ and $R(x)$. Random media are modeled by random coefficient matrices.

In this article we study random perturbations of an underlying periodic medium, i.e., a medium specified by periodic coefficient matrices (recall Definition 2.3) $K_0(x)$ and $R_0(x)$ with the same period $q$ (i.e., $K_0(x) = K_0(x + qj)$ and $R_0(x) = R_0(x + qj)$ for all $j \in \mathbb{Z}^d$; we take $q \in \mathbb{N}$ without loss of generality). We use the following model for random media:

ASSUMPTION 3.1 (Random medium). The random medium is modeled by random matrix-valued functions $K_g(x) = K_{g,\omega}(x)$ and $R_g(x) = R_{g,\omega}(x)$ of the form

$$K_{g,\omega}(x) = \gamma_{g,\omega}(x)K_0(x), \quad\text{with } \gamma_{g,\omega}(x) = 1 + g\sum_{i \in \mathbb{Z}^d}\omega_iu_i(x), \tag{3.1}$$

$$R_{g,\omega}(x) = \zeta_{g,\omega}(x)R_0(x), \quad\text{with } \zeta_{g,\omega}(x) = 1 + g\sum_{i \in \mathbb{Z}^d}\omega_iv_i(x), \tag{3.2}$$

where

(i) $K_0(x)$ and $R_0(x)$ are $n \times n$ and $m \times m$ periodic coefficient matrices with period $q \in \mathbb{N}$.

(ii) $u_i(x) = u(x - i)$ and $v_i(x) = v(x - i)$ for $i \in \mathbb{Z}^d$, where $u$ and $v$ are real valued measurable functions on $\mathbb{R}^d$ with support in the cube centered at the origin with side $r < \infty$, with

$$0 \le U_- \le U(x) = \sum_{i \in \mathbb{Z}^d}u_i(x) \le U_+, \tag{3.3}$$

$$0 \le V_- \le V(x) = \sum_{i \in \mathbb{Z}^d}v_i(x) \le V_+ \tag{3.4}$$

for a.e. $x \in \mathbb{R}^d$, where $U_\pm$ and $V_\pm$ are constants such that

$$0 < Z_- \le Z_+ < \infty, \quad\text{with } Z_\pm = \max\{U_\pm, V_\pm\}. \tag{3.5}$$

(iii) $\omega = \{\omega_i;\ i \in \mathbb{Z}^d\}$ is a family of independent, identically distributed random variables taking values in the interval $[-1, 1]$, whose common probability distribution $\mu$ has a bounded density $\rho > 0$ a.e. in $[-1, 1]$.

(iv) $g$, the disorder parameter, satisfies

$$0 \le g < \frac{1}{Z_+}. \tag{3.6}$$
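To make the model concrete, the following Python sketch builds one realization of the random factor in (3.1) in dimension d = 1. It assumes a hypothetical single-site profile u equal to the indicator of [-1/2, 1/2), so that U(x) = 1 and U₋ = U₊ = 1, and it draws the ωᵢ from the uniform distribution on [-1, 1]. Both choices are illustrations only. The check at the end is the elementary bound 1 - gU₊ ≤ γ(x) ≤ 1 + gU₊, which is what keeps the coefficient positive when g < 1/Z₊.

```python
import math
import random


def u(x):
    """Hypothetical single-site bump: indicator of [-1/2, 1/2), so U(x) = 1."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def gamma(x, g, omega):
    """gamma_{g,omega}(x) = 1 + g * sum_i omega_i u(x - i) in d = 1, as in (3.1).

    omega maps integer sites to values in [-1, 1]; only sites within distance 1
    of x can contribute because u is supported in [-1/2, 1/2).
    """
    i0 = math.floor(x)
    return 1.0 + g * sum(omega[i] * u(x - i) for i in (i0 - 1, i0, i0 + 1))


rng = random.Random(0)
sites = range(-12, 13)
omega = {i: rng.uniform(-1.0, 1.0) for i in sites}  # i.i.d. with bounded density

g = 0.8        # disorder parameter; here 1/Z_+ = 1, so 0 <= g < 1 is required
U_plus = 1.0
xs = [-10.0 + 0.01 * j for j in range(2001)]
values = [gamma(x, g, omega) for x in xs]
```

With these choices every sampled value of γ lies in [0.2, 1.8], so the perturbed coefficient γ(x)K₀(x) still satisfies a two-sided bound of the form (2.20).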

Remark 3.2. The use of the same random variables in (3.1) and (3.2) models the fact that the medium itself is what is random. This randomness in the medium is modeled by random coefficient matrices, which are not independent since a change in the medium leads to changes in both coefficient matrices.

Remark 3.3. The results in this article are also valid for random coefficient matrices $K_{g,\omega}(x)$ and $R_{g,\omega}(x)$ of the form

$$K_{g,\omega}(x) = \gamma_{g,\omega}^{-1}K_0(x), \quad R_{g,\omega}(x) = \zeta_{g,\omega}^{-1}R_0(x). \tag{3.7}$$

The modifications in the proofs are obvious. This is the form used in [FK3, FK4] for acoustic and electromagnetic waves.

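It follows from Assumption 3.1 that for a.e. $\omega$ the coefficient matrices $K_{g,\omega}(x)$ and $R_{g,\omega}(x)$ satisfy (2.20) with

$$K_{g,\omega,\pm} = K_{g,\pm} \equiv K_{0,\pm}(1 \pm gU_+), \tag{3.8}$$

$$R_{g,\omega,\pm} = R_{g,\pm} \equiv R_{0,\pm}(1 \pm gV_+). \tag{3.9}$$

Thus multiplication by $K_{g,\omega}(x)$ and $R_{g,\omega}(x)$ yield coefficient operators $K_g = K_{g,\omega}$ and $R_g = R_{g,\omega}$ as in Definition 2.3, for a.e. $\omega$. For later use, we set

$$\Xi_g = D_+\sqrt{R_{g,+}K_{g,+}}, \tag{3.10}$$

$$\delta_\pm(g) = \frac{U_\pm}{1 \mp gU_+}, \tag{3.11}$$

$$\eta_\pm(g) = \frac{V_\pm}{1 \mp gV_+}. \tag{3.12}$$

The periodic operators associated with the coefficient matrices $K_0(x)$ and $R_0(x)$ will carry the subscript 0, i.e.,

$$A_0 = \sqrt{R_0}\,D\sqrt{K_0}, \quad \mathbb{W}_0 = \mathbb{W}_{A_0}, \quad W_0 = A_0^*A_0. \tag{3.13}$$

Similarly, we write (for a.e. $\omega$)

$$A_{g,\omega} = \sqrt{R_{g,\omega}}\,D\sqrt{K_{g,\omega}}, \quad \mathbb{W}_{g,\omega} = \mathbb{W}_{A_{g,\omega}}, \quad W_{g,\omega} = A^*_{g,\omega}A_{g,\omega}. \tag{3.14}$$

We also set

$$W_{g,\omega,*} = A_{g,\omega}A^*_{g,\omega}, \tag{3.15}$$

and recall (2.32).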

DEFINITION 3.4. By a random classical wave operator we will always mean either $\mathbb{W}_{g,\omega}$ (first order) or $W_{g,\omega}$ (second order) as in (3.14), with the random coefficient matrices satisfying Assumption 3.1.

Note that $W_{g,\omega,*}$ is also a random second-order classical wave operator.

Random classical wave operators are random operators (see Appendix; a random operator is a mapping $\omega \to H_\omega$ from a probability space to self-adjoint operators on a Hilbert space such that the mappings $\omega \to f(H_\omega)$ are strongly measurable for all bounded measurable functions $f$ on $\mathbb{R}$). In addition, they are $q\mathbb{Z}^d$-ergodic.

It is a consequence of ergodicity that there exist nonrandom sets $\boldsymbol{\Sigma}_g$ and $\Sigma_g$, such that $\sigma(\mathbb{W}_{g,\omega}) = \boldsymbol{\Sigma}_g$ and $\sigma(W_{g,\omega}) = \Sigma_g$ with probability one. In addition, the decompositions of $\sigma(\mathbb{W}_{g,\omega})$ and $\sigma(W_{g,\omega})$ into pure point spectrum, absolutely continuous spectrum and singular continuous spectrum are also independent of the choice of $\omega$ with probability one [KM1, CL]. (These sets are related by (2.34).)

We will use $\mathcal{W}_{g,\omega}$ to denote a random classical wave operator of either first or second order; its almost sure spectrum will be denoted by $\varSigma_g$.

Random classical wave operators may exhibit the phenomenon of localization. We give two definitions: the first, spectral localization, in its stronger form, exponential localization, is sometimes called Anderson localization; the second is a stronger form of dynamical localization introduced in [GK1].

DEFINITION 3.5 (Exponential localization). The random classical wave operator $\mathcal{W}_{g,\omega}$ exhibits spectral localization in an interval $I$ if $I \cap \varSigma_g \ne \emptyset$ and $\mathcal{W}_{g,\omega}$ has only pure point spectrum in $I \cap \varSigma_g$ with probability one. It exhibits exponential localization in $I$ if it exhibits spectral localization in $I$ and, with probability one, all the eigenfunctions corresponding to eigenvalues in $I$ are exponentially decaying (in the sense of having exponentially decaying local $L^2$-norms).

DEFINITION 3.6. The random classical wave operator $\mathcal{W}_{g,\omega}$ exhibits strong HS-dynamical localization in an interval $I$ if $I \cap \varSigma_g \ne \emptyset$ and for any bounded region $\Lambda$ and all $p \ge 0$ we have

$$\mathbb{E}\Bigl(\sup_{|||f||| \le 1}\bigl\| |X|^{p/2}f(\mathcal{W}_{g,\omega})E_{\mathcal{W}_{g,\omega}}(I)\chi_\Lambda \bigr\|_2^2\Bigr) < \infty. \tag{3.16}$$

(The supremum is taken over Borel functions $f$ of a real variable with $|||f||| = \sup_{t \in \mathbb{R}}|f(t)|$; $E_H(\cdot)$ denotes the spectral projection of the self-adjoint operator $H$; $\|B\|_2$ denotes the Hilbert–Schmidt norm of the operator $B$.)

Remark 3.7. In view of (2.32) and (2.34), spectral localization of a random second-order classical wave operator $W_{g,\omega}$ in a compact interval $I \subset (0, \infty)$ is equivalent to spectral localization of $W_{g,\omega,*}$ in $I$, and also equivalent to spectral localization of the random first-order classical wave operator $\mathbb{W}_{g,\omega}$ in one (and then both) of the compact intervals $\pm\sqrt{I}$. The same is true for exponential localization, in view of (2.36), (2.37), and the interior estimate of [KK, Lemma 3.4].

Remark 3.8. In view of (2.26), strong HS-dynamical localization of both random second-order classical wave operators $W_{g,\omega}$ and $W_{g,\omega,*}$ in a compact interval $I \subset (0, \infty)$ is equivalent to strong HS-dynamical localization of the random first-order classical wave operator $\mathbb{W}_{g,\omega}$ in both compact intervals $\pm\sqrt{I}$.

It follows from Remarks 3.7 and 3.8 that it suffices to prove localization for random second-order classical wave operators.

To create an environment which favors localization, we follow the strategy first introduced in [FK1] and subsequently used in [FK2, FK3, FK4, KSS, CHT]: We start with an underlying periodic medium. The spectrum associated with a periodic medium has band gap structure and may have a gap in the spectrum. We assume the existence of a spectral gap for the underlying periodic medium. We randomize this periodic medium with a gap in the spectrum, prove that the gap shrinks but does not close if the disorder is not too large, and show that exponential localization and strong HS-dynamical localization occur in a vicinity of the edges of the gap.

ASSUMPTION 3.9 (Gap in the spectrum). There is a gap in the spectrum of the periodic second-order classical wave operator $W_0$. More precisely, there exist $a, b \in \sigma(W_0)$, $0 < a < b$, such that

$$(a, b) \cap \sigma(W_0) = \emptyset. \tag{3.17}$$

When randomness is added to the medium, the spectrum of the corresponding classical wave operator changes. The following theorem gives information on what happens to a spectral gap.

THEOREM 3.10 (Location of the spectral gap). Let $W_{g,\omega}$ be a random second-order classical wave operator satisfying Assumption 3.9. There exists $g_0$, with

$$\max\Bigl\{\frac{1}{U_+}, \frac{1}{V_+}\Bigr\}\Bigl(1 - \Bigl(\frac{a}{b}\Bigr)^{1/4}\Bigr) \le g_0 \le \min\Bigl\{\frac{1}{Z_+},\ \frac{1}{U_+}\Bigl(\Bigl(\frac{b}{a}\Bigr)^{U_+/4U_-} - 1\Bigr),\ \frac{1}{V_+}\Bigl(\Bigl(\frac{b}{a}\Bigr)^{V_+/4V_-} - 1\Bigr)\Bigr\}, \tag{3.18}$$

and increasing, Lipschitz continuous real valued functions $a(g)$ and $-b(g)$ on the interval $[0, 1/Z_+)$, with $a(0) = a$, $b(0) = b$, and $a(g) \le b(g)$, such that:

(i) We have

$$\Sigma_g \cap [a, b] = [a, a(g)] \cup [b(g), b]. \tag{3.19}$$

(ii) If $g < g_0$ we have $a(g) < b(g)$ and $(a(g), b(g))$ is a gap in the spectrum $\Sigma_g$ of the random operator $W_{g,\omega}$. Moreover, we have

$$a(1 + gU_+)^{U_-/U_+}(1 + gV_+)^{V_-/V_+} \le a(g) \le \frac{a}{(1 - gU_+)(1 - gV_+)} \tag{3.20}$$

and

$$b(1 - gU_+)(1 - gV_+) \le b(g) \le \frac{b}{(1 + gU_+)^{U_-/U_+}(1 + gV_+)^{V_-/V_+}}. \tag{3.21}$$

In addition, if $0 \le g_1 < g_2 < g_0$ we have

$$\tfrac12(\delta_-(g_2) + \eta_-(g_2))(a(g_1) + a(g_2)) \le \frac{a(g_2) - a(g_1)}{g_2 - g_1} \le \tfrac12(\delta_+(g_2) + \eta_+(g_2))(a(g_1) + a(g_2)), \tag{3.22}$$

$$\tfrac12(\delta_-(g_2) + \eta_-(g_2))(b(g_1) + b(g_2)) \le \frac{b(g_1) - b(g_2)}{g_2 - g_1} \le \tfrac12(\delta_+(g_2) + \eta_+(g_2))(b(g_1) + b(g_2)). \tag{3.23}$$

(iii) If $g_0 < 1/Z_+$ we have $a(g) = b(g)$ for all $g \in [g_0, 1/Z_+)$, and the random classical wave operator $W_{g,\omega}$ has no spectral gap inside the gap $(a, b)$ of the periodic classical wave operator $W_0$, i.e., we have $[a, b] \subset \Sigma_g$.
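The two-sided bounds (3.20) and (3.21) on the gap edges are explicit and easy to evaluate. The following Python sketch computes them for given constants; the function names and the sample numbers are hypothetical. It also reports when the bounds alone certify that the gap is still open, namely when the upper bound on a(g) lies strictly below the lower bound on b(g). A negative answer from this test does not mean the gap is closed, only that these bounds do not decide the question.

```python
def gap_edge_bounds(g, a, b, U_minus, U_plus, V_minus, V_plus):
    """Bounds (3.20)-(3.21) on a(g), b(g); returns (a_lo, a_hi, b_lo, b_hi)."""
    Z_plus = max(U_plus, V_plus)
    if not (0.0 <= g < 1.0 / Z_plus):
        raise ValueError("g must satisfy 0 <= g < 1/Z_+")
    grow = (1.0 + g * U_plus) ** (U_minus / U_plus) * (1.0 + g * V_plus) ** (V_minus / V_plus)
    shrink = (1.0 - g * U_plus) * (1.0 - g * V_plus)
    return a * grow, a / shrink, b * shrink, b / grow


def gap_certainly_open(g, a, b, U_minus, U_plus, V_minus, V_plus):
    """True when the upper bound on a(g) is below the lower bound on b(g)."""
    _, a_hi, b_lo, _ = gap_edge_bounds(g, a, b, U_minus, U_plus, V_minus, V_plus)
    return a_hi < b_lo
```

For instance, with a = 1, b = 4 and all four constants U₋, U₊, V₋, V₊ equal to 1, the call `gap_edge_bounds(0.1, 1.0, 4.0, 1.0, 1.0, 1.0, 1.0)` places a(0.1) between 1.21 and 1/0.81 and b(0.1) between 3.24 and 4/1.21, so the gap is certainly open at that disorder.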

Theorem 3.10 is proven in Section 5.

Localization for continuous random operators is usually proved by a multiscale analysis, e.g., [HM, CH, Klo1, FK3, FK4, KSS, GD, CHT, Kle, DS, GK1, GK4, GK5]. (But note that the fractional moment method [AM, ASFH] has just been extended to the continuum [AENSS].) In this article we use the most recent and powerful version, the bootstrap multiscale analysis introduced in [GK1]. It can be applied in all cases where a multiscale analysis has been used, and it yields both exponential localization and strong HS-dynamical localization. (It gives a lot more, see [GK3, GK6].)

A random second-order partially elliptic classical wave operator $W_{g,\omega}$ will be shown (Theorem 4.1) to satisfy all the requirements of the bootstrap multiscale analysis in each compact interval $I \subset (0, \infty)$. Thus, to prove exponential localization and strong HS-dynamical localization for $W_{g,\omega}$ in some interval centered at $E \in \Sigma_g \setminus \{0\}$, it suffices to verify the initial length-scale estimate of the bootstrap multiscale analysis [GK1, Equation (3.3)] at $E$.

We will show that if the random second-order partially elliptic classical wave operator $W_{g,\omega}$ has a gap in the spectrum, the random perturbation creates localization near the edges of the gap for $g < g_0$, where $g_0$ is given in Theorem 3.10. To prove the initial length-scale estimate for the multiscale analysis (as originally done in [FK1]), we need low probability to have spectrum near an edge of the gap in finite but large volume. This can be achieved either by hypotheses on the probability distribution $\mu$ in Assumption 3.1(iii), which produce classical tails at the edge of the gap, or by postulating the existence of Lifshitz tails at the edges of the gap.

Lifshitz tails were originally proved for random Schrödinger operators at the bottom of the spectrum (e.g., [PF, Section 10], [CL, Section VI.2]). Holden and Martinelli [HM] used the Lifshitz tails estimate to obtain the initial length-scale estimate for the Fröhlich–Spencer multiscale analysis at the bottom of the spectrum for random Schrödinger operators. The best estimates on the size of the interval of localization at the bottom of the spectrum at low disorder have been obtained from Lifshitz tails by Klopp [Klo3].

Klopp [Klo2] proved that for a random perturbation of a periodic Schrödinger operator, there are Lifshitz tails at an edge of a spectral gap if and only if the density of states of the periodic operator is nondegenerate at the same edge of the spectral gap. (This nondegeneracy has not been established for arbitrary edges of spectral gaps.) Najar [Na] extended Klopp's results to random acoustic operators with constant compressibility and smooth mass density. If Lifshitz tails are present at an edge of a spectral gap, the Holden–Martinelli argument can be used to obtain the initial length-scale estimate for the bootstrap multiscale analysis. But the existence of Lifshitz tails at the edges of spectral gaps has not been established for the random classical wave operators studied in this article.

We state our results with hypotheses on the probability distribution $\mu$, as in [FK3, FK4]. The following two theorems achieve low probability of extremal values for the random variables in different ways. The results are formulated for the left edge of the gap, with similar results holding at the right edge. We use the notation of Theorem 3.10.

THEOREM 3.11 (Localization at the edge). Let $W_{g,\omega}$ be a random second-order partially elliptic classical wave operator satisfying Assumption 3.9. Suppose the probability distribution $\mu$ in Assumption 3.1(iii) satisfies

$$\mu\{(1 - \gamma, 1]\} \le K\gamma^\eta \quad\text{for all } 0 \le \gamma \le 1, \tag{3.24}$$

where $K < \infty$ and $\eta > d/2$. Then, for any $g < g_0$ there exists $\delta(g) > 0$, depending only on the constants $d, g, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, \|\rho\|_\infty, K, \eta, a, b$, such that the random classical wave operator $W_{g,\omega}$ exhibits exponential localization and strong HS-dynamical localization in the interval $[a(g) - \delta(g), a(g)]$.

THEOREM 3.12 (Localization in a specified interval). Let $W_{g,\omega}$ be a random second-order partially elliptic classical wave operator satisfying Assumption 3.9. Let $g < g_0$, and fix $a_1$ and $a_2$ such that $a < a_1 < a_2 < a(g)$ and $a(g) - a_1 \le b(g) - a(g)$. Then there exists $p_1 > 0$, depending only on the constants $d, g, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, K, \eta, a, b$, on the fixed $a_1, a_2$, and on a fixed upper bound on $\|\rho\|_\infty$, such that if

$$\mu\Bigl(\Bigl(\frac{g_1}{g}, 1\Bigr]\Bigr) \le p_1, \tag{3.25}$$

where $g_1$ is defined by $a(g_1) = a_1$, the random classical wave operator $W_{g,\omega}$ exhibits exponential localization and strong HS-dynamical localization in the interval $[a_2, a(g)]$.

Theorems 3.11 and 3.12 can be extended to the situation when the gap is totally filled by the spectrum of the random classical wave operator, establishing the existence of a subinterval of the original gap where the random classical wave operator exhibits localization. Note that the extension of Theorem 3.12 says that we can arrange for localization in as much of the gap as we want.

Recall that if $g_0 < 1/Z_+$ (see (3.18) for a necessary condition), then $(a, b) \subset \Sigma_g$ for $g \in [g_0, 1/Z_+)$, i.e., the gap closes.

THEOREM 3.13 (Localization at the meeting of the edges). Let $W_{g,\omega}$ be a random second-order partially elliptic classical wave operator satisfying Assumption 3.9. Suppose the probability distribution $\mu$ in Assumption 3.1(iii) satisfies

$$\mu\{(1 - \gamma, 1]\},\ \mu\{[-1, -1 + \gamma)\} \le K\gamma^\eta \quad\text{for all } 0 \le \gamma \le 1, \tag{3.26}$$

where $K < \infty$ and $\eta > d$. Suppose also that $g_0 < 1/Z_+$. Then there exist $0 < \varepsilon < (1/Z_+) - g_0$ and $\delta > 0$, depending only on the constants $d, g, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, \|\rho\|_\infty, K, \eta, a, b$, such that the random classical wave operator $W_{g,\omega}$ exhibits exponential localization and strong HS-dynamical localization in the interval $[a(g_0) - \delta, a(g_0) + \delta]$ for all $g \in [g_0, g_0 + \varepsilon)$.

THEOREM 3.14 (Localization in a specified interval in the closed gap). Let $W_{g,\omega}$ be a random second-order partially elliptic classical wave operator satisfying Assumption 3.9. Suppose $g_0 < 1/Z_+$, and fix $a_1, a_2, b_1$ and $b_2$ such that $a < a_1 < a_2 < a(g_0) = b(g_0) < b_2 < b_1 < b$. For any $g \in [g_0, 1/Z_+)$ there exist $p_1, p_2 > 0$, depending only on the constants $d, g, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, K, \eta, a, b$, on the given $a_1, a_2, b_1, b_2$, and on a fixed upper bound on $\|\rho\|_\infty$, such that if

$$\mu\Bigl(\Bigl(\frac{g_1}{g}, 1\Bigr]\Bigr) \le p_1 \quad\text{and}\quad \mu\Bigl(\Bigl[-1, -\frac{g_2}{g}\Bigr)\Bigr) \le p_2, \tag{3.27}$$

where $g_1$ and $g_2$ are defined by $a(g_1) = a_1$ and $b(g_2) = b_1$ (notice $0 < g_1, g_2 < g_0 \le g$), the random classical wave operator $W_{g,\omega}$ exhibits exponential localization and strong HS-dynamical localization in the interval $[a_2, b_2]$.

Theorems 3.13 and 3.14 are proved similarly to Theorems 3.11 and 3.12, respectively, taking into account both edges of the gap.

4. The Multiscale Analysis and Localization

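The analysis requires finite volume random classical wave operators.

Throughout this paper we use two norms in $\mathbb{R}^d$ and $\mathbb{C}^d$:

$$|x| = \Bigl(\sum_{i=1}^d |x_i|^2\Bigr)^{1/2}, \tag{4.1}$$

$$\|x\| = \max\{|x_i|,\ i = 1, \dots, d\}. \tag{4.2}$$

By $\Lambda_L(x)$ we denote the open cube in $\mathbb{R}^d$, centered at $x$ with side $L > 0$:

$$\Lambda_L(x) = \Bigl\{y \in \mathbb{R}^d;\ \|y - x\| < \frac{L}{2}\Bigr\}, \tag{4.3}$$

by $\overline{\Lambda}_L(x)$ the closed cube, and by $\breve{\Lambda}_L(x)$ the half-open/half-closed cube, i.e.,

$$\breve{\Lambda}_L(x) = \Bigl\{y \in \mathbb{R}^d;\ -\frac{L}{2} \le y_i - x_i < \frac{L}{2},\ i = 1, 2, \dots, d\Bigr\}. \tag{4.4}$$

We will identify a closed cube $\overline{\Lambda}_L(x)$ with a torus in the usual way. We set

$$\chi_{x,L} = \chi_{\Lambda_L(x)}, \tag{4.5}$$

where $\chi_\Lambda$ denotes the characteristic function of a set $\Lambda \subset \mathbb{R}^d$.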
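The two norms and the cube membership conditions translate directly into code. The following Python helpers are a minimal sketch with hypothetical function names. They implement the Euclidean norm (4.1), the maximum norm (4.2), the open cube (4.3) and the half-open/half-closed cube (4.4) for points given as tuples.

```python
import math


def euclid_norm(x):
    """|x| from (4.1)."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))


def max_norm(x):
    """||x|| from (4.2)."""
    return max(abs(c) for c in x)


def in_open_cube(y, x, L):
    """True if y lies in the open cube of side L centered at x, as in (4.3)."""
    return max_norm([yi - xi for yi, xi in zip(y, x)]) < L / 2


def in_half_open_cube(y, x, L):
    """True if -L/2 <= y_i - x_i < L/2 for every coordinate, as in (4.4)."""
    return all(-L / 2 <= yi - xi < L / 2 for yi, xi in zip(y, x))
```

The only difference between the two cube tests is the treatment of the faces: the half-open cube keeps the lower face of each coordinate and drops the upper one, so translates of it by L times integer vectors tile the whole space without overlap.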

Page 164: Mathematical Physics, Analysis and Geometry - Volume 7

166 ABEL KLEIN AND ANDREW KOINES

Since we will work with an underlying periodic medium with period q ∈ N, werestrict ourselves to cubes �L(x) with x ∈ Zd and L ∈ 2qN. We set

H (n)x,L = H (n)

�L(x) = L2(�L(x), dx; Cn). (4.6)

A CPDO$^{(1)}_{n,m}$ $D$ defines a closed densely defined operator $D_{x,L}$ from $\mathcal{H}^{(n)}_{x,L}$ to $\mathcal{H}^{(m)}_{x,L}$ with periodic boundary condition; an operator core is given by $C^\infty_{\mathrm{per}}(\Lambda_L(x), \mathbb{C}^n)$, the infinitely differentiable, periodic $\mathbb{C}^n$-valued functions on $\Lambda_L(x)$.

If the CPDO$^{(1)}_{n,m}$ $D$ is partially elliptic, then the restriction $D_{x,L}$ is also partially elliptic, in the sense that Equations (2.16) and (2.17) hold for $D_{x,L}$, $(D^\perp)_{x,L}$, and $\Delta_{x,L}$. ($\Delta_{x,L}$ is the Laplacian on $L^2(\Lambda_L(x), dx)$ with periodic boundary condition.) This can be easily seen by using the Fourier transform; here the use of periodic boundary condition plays a crucial role. We also have (2.18) and (2.19) with $\mathcal{H}^{(n)}_{x,L}$.

We fix a random second order classical wave operator $W_{g,\omega}$ as in (3.14). Given $\omega \in \mathbb{R}^{\mathbb{Z}^d}$, we define $\omega_{x,L} = \omega_{\Lambda_L(x)} \in \mathbb{R}^{\mathbb{Z}^d}$ by

$$\begin{aligned}
\omega_{x,L,i} &= \omega_i \quad \text{for each } i \in \breve{\Lambda}_L(x) \cap \mathbb{Z}^d,\\
\omega_{x,L,i} &= \omega_{x,L,i+Lj} \quad \text{for all } i, j \in \mathbb{Z}^d.
\end{aligned} \tag{4.7}$$
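The periodization in (4.7) can be sketched in a few lines of Python. This is an illustration only: the name `periodize` is hypothetical, the configuration is modeled as a function on integer tuples, and the fundamental cell is assumed to be the half-open cube with integer center and even side L.

```python
from typing import Callable, Tuple

Site = Tuple[int, ...]


def periodize(omega: Callable[[Site], float], x: Site, L: int) -> Callable[[Site], float]:
    """Return the L-periodic configuration that agrees with omega on the
    lattice points of the half-open cube of side L centered at x (L even)."""
    half = L // 2

    def omega_xL(i: Site) -> float:
        # Reduce each coordinate to its representative in [x_k - L/2, x_k + L/2).
        rep = tuple(xk - half + ((ik - xk + half) % L) for ik, xk in zip(i, x))
        return omega(rep)

    return omega_xL
```

For example, with `w = periodize(omega, (0, 0), 4)` the values of `w` on the 16 sites with coordinates in {-2, -1, 0, 1} coincide with those of `omega`, and `w` is invariant under shifts by multiples of 4 in each coordinate.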

We set

$$A_{g,\omega,x,L} = A_{g,\omega,\Lambda_L(x)} = \sqrt{R_{g,\omega_{x,L}}}\; D_{x,L}\, \sqrt{K_{g,\omega_{x,L}}} \tag{4.8}$$

on $\mathcal{D}(A_{g,\omega,x,L}) = K^{-1/2}_{g,\omega_{x,L}} \mathcal{D}(D_{x,L})$, a closed, densely defined operator on $\mathcal{H}^{(n)}_{x,L}$. The finite volume random classical wave operator $W_{g,\omega,x,L}$ on $\mathcal{H}^{(n)}_{x,L}$ is now defined by

$$W_{g,\omega,x,L} = W_{g,\omega,\Lambda_L(x)} = A^*_{g,\omega,x,L} A_{g,\omega,x,L}. \tag{4.9}$$

($W_{g,\omega,x,L}$ is a “periodic restriction” of $W_{g,\omega}$ to $\Lambda_L(x)$ with periodic boundary condition.) We have the equivalent of (2.31), (2.32), etc. We write

$$R_{g,\omega,x,L}(z) = (W_{g,\omega,x,L} - z)^{-1} \tag{4.10}$$

for the finite volume resolvent.

The multiscale analysis works with the decay of the finite volume resolvent from the center of a cube to its boundary, or more precisely, to its boundary belt. We set

$$\tilde{q} = \min\Bigl\{q' \in q\mathbb{N};\ q' \ge q + \frac{r}{2}\Bigr\}, \tag{4.11}$$

where $r$ is given in Assumption 3.1(ii). Given a cube $\Lambda_L(x)$, we set

$$\Upsilon_L(x) = \Bigl\{y \in \mathbb{Z}^d;\ \|y - x\| = \frac{L}{2} - \tilde{q}\Bigr\}, \tag{4.12}$$

and define its (boundary) belt by

$$\tilde{\Upsilon}_L(x) = \bigcup_{y \in \Upsilon_L(x)} \Lambda_{\tilde{q}}(y); \tag{4.13}$$

it has the characteristic function

$$\Gamma_{x,L} = \chi_{\tilde{\Upsilon}_L(x)} = \sum_{y \in \Upsilon_L(x)} \chi_{y,\tilde{q}} \quad \text{a.e.} \tag{4.14}$$

Note that

$$|\Upsilon_L(x)| \le d \Bigl(\frac{L}{q}\Bigr)^{d-1}. \tag{4.15}$$

The following theorem shows that random second-order partially elliptic classical wave operators satisfy the requirements for the bootstrap multiscale analysis of [GK1]. Note that we use the finite volume operators defined in (4.9), and the boundary belt defined in (4.13), i.e., with $\Gamma_{x,L}$ as in (4.14).

THEOREM 4.1. A random second-order partially elliptic classical wave operator is a $q\mathbb{Z}^d$-ergodic random operator satisfying the requirements for the bootstrap multiscale analysis in any compact interval $I_0 \subset (0,\infty)$, i.e., it satisfies Assumptions SLI (Simon–Lieb inequality), EDI (eigenfunction decay inequality), IAD (independence at a distance), NE (number of eigenvalues), SGEE (strong generalized eigenfunction expansion), and W (Wegner’s estimate) of [GK1] in $I_0$. The constants $\gamma_{I_0}$ in Assumption SLI and $\tilde{\gamma}_{I_0}$ in Assumption EDI are given by $\gamma_{I_0} = \tilde{\gamma}_{I_0} = \sup_{E \in I_0} \gamma_E$, with

γE = 6√

d

qg

(2E + 100d

q22

g

)1/2

, (4.16)

where g is given in (3.10). In addition, it satisfies the kernel polynomial decayestimate of [GK2, Theorem 2] with � = (3

√7)/(32g) for a.e. ω (note that

�2 = 0).

Remark 4.2. Partial ellipticity is required for Assumption SGEE. Assumptions NE and W require that either $W_{g,\omega}$ or $W_{g,\omega,*}$ is partially elliptic. The other assumptions and the kernel polynomial decay estimate do not require partial ellipticity.

Remark 4.3. It follows from Theorem 4.1 that the results of [GK3] on the Anderson metal-insulator transport transition apply to random second-order partially elliptic classical wave operators.

We have already proven most of Theorem 4.1. Taking into account (3.8) and (3.9), Assumptions SLI, EDI, and NE follow from Lemmas 3.8, 3.9, and 3.3 in [KK], respectively. (We used slightly different finite volume operators in [KK], where we used a boundary belt with $\tilde{q} = q$. But the proofs of Lemmas 3.8 and 3.9 in [KK] still apply with the definitions used in this article due to our choice of $\tilde{q}$ in (4.11).) IAD is true by Assumption 3.1 and the definition of finite volume operators in (4.9); note that $\varrho = 0$. Assumption SGEE is proven in [KKS], in the stronger form of the trace estimate given in [GK1, Equation (2.36)]. The kernel polynomial estimate is just a special case of [GK2, Theorem 2].

Assumption W follows from the following theorem. The constants $r$, $Z_\pm$, $K_{0,-}$, $R_{0,-}$, and the probability density $\rho$ are as in Assumption 3.1; $\Theta$ is the constant in (2.17).

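THEOREM 4.4 (Wegner estimate). Let $W_{g,\omega}$ be a random second-order classical wave operator, with either $W_{g,\omega}$ or $W_{g,\omega,*}$ partially elliptic. Then for all $E > 0$, cubes $\Lambda = \Lambda_L(x)$ with $x \in \mathbb{Z}^d$ and $L \in 2q\mathbb{N}$, and $0 \le \eta \le E$, we have

$$\mathbb{P}\{\operatorname{dist}(\sigma(W_{g,\omega,x,L}), E) \le \eta\} \le Q_g \|\rho\|_\infty E^{(d/2)-1} \eta L^{2d}, \tag{4.17}$$

where

$$Q_g = \frac{n C_d (2 + r)^d}{g Z_- (1 - gZ_+)^{d+1}}\, (K_{0,-} R_{0,-} \Theta)^{-d/2}, \tag{4.18}$$

with $C_d$ a constant depending only on the dimension $d$.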

Theorem 4.4 is proven in Section 6.

In view of Theorems 4.1 and 4.4, to prove exponential localization and strong HS-dynamical localization for $W_{g,\omega}$ in some interval centered at $E \in \Sigma_g \setminus \{0\}$, it suffices to verify the initial length scale estimate for the bootstrap multiscale analysis [GK1, Equation (3.3)] at $E$. To state this estimate, we need a definition, which we state in the context of this article.

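DEFINITION 4.5. Let $W_{g,\omega}$ be a random second-order classical wave operator. Given $\theta > 0$, $E > 0$, $x \in \mathbb{Z}^d$, and $L \in 6q\mathbb{N}$, we say that the cube $\Lambda_L(x)$ is $(\theta, E)$-suitable for $W_{g,\omega}$ if $E \notin \sigma(W_{g,\omega,x,L})$ and

$$\|\Gamma_{x,L} R_{g,\omega,x,L}(E) \chi_{x,L/3}\|_{x,L} \le \frac{1}{L^{\theta}}. \tag{4.19}$$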

The following theorem summarizes the results of [GK1] that will be used to prove Theorems 3.11–3.14.

THEOREM 4.6. Let $W_{g,\omega}$ be a random second-order partially elliptic classical wave operator. Let $E_0 \in \Sigma_g \setminus \{0\}$. Given $\theta > 2d$, there exists a finite scale $\mathcal{L} = \mathcal{L}(d, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, \|\rho\|_\infty, g, E_0, \theta)$, bounded for $E_0$ in compact subintervals of $(0,\infty)$, such that, if we can verify at some finite scale $L \ge \mathcal{L}$ that

$$\mathbb{P}\{\Lambda_L(0) \text{ is } (\theta, E_0)\text{-suitable for } W_{g,\omega}\} > 1 - \frac{1}{841^d}, \tag{4.20}$$

then there exists $\delta_0 = \delta_0(d, q, K_{0,\pm}, R_{0,\pm}, \Theta, U_\pm, V_\pm, r, \|\rho\|_\infty, g, E_0, \theta, L) > 0$, such that the random classical wave operator $W_{g,\omega}$ exhibits exponential localization and strong HS-dynamical localization in the interval $[E_0 - \delta_0, E_0 + \delta_0]$. In addition, we have the conclusions of [GK1, Theorems 3.4, 3.8 and 3.10, Corollaries 3.10 and 3.12].

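Proof of Theorem 3.11. In view of Theorem 4.6 it suffices to prove that for all $g < g_0$ and $\theta > 2d$ we have

$$\lim_{L \to \infty} \mathbb{P}\{\Lambda_L(0) \text{ is } (\theta, a(g))\text{-suitable for } W_{g,\omega}\} = 1. \tag{4.21}$$

We fix $g < g_0$ and $\theta > 2d$. Given $L \in 6q\mathbb{N}$ we use Theorem 3.10 and, for large $L$, define $g_L \in (0, g)$ by

$$a(g_L) = a(g) - \Bigl(\kappa \frac{\log L}{L}\Bigr)^2, \tag{4.22}$$

where $\kappa > 0$ will be specified later. We define the event

$$\mathcal{E}_L = \Bigl\{\omega \in \mathbb{R}^{\mathbb{Z}^d};\ \omega_i \le \frac{g_L}{g} \text{ for all } i \in \Lambda_L(0) \cap \mathbb{Z}^d\Bigr\}. \tag{4.23}$$

It follows from Theorem 3.10, Lemma 5.1 and [KK, Theorem 4.3] that

$$(a(g_L), b(g)) \subset \mathbb{R} \setminus \sigma(W_{g,\omega,0,L}) \quad \text{for all } \omega \in \mathcal{E}_L. \tag{4.24}$$

Hence it follows from [KK, Theorem 3.6] that for large $L$ we have

$$\|\Gamma_{0,L} R_{g,\omega,0,L}(a(g)) \chi_{0,L/3}\|_{0,L} \le C_1 L^{2d-1} \Bigl(\frac{\kappa \log L}{L}\Bigr)^{-2} e^{-C_2 ((\kappa \log L)/L) L} = \frac{C_1 L^{2d+1}}{(\kappa \log L)^2 L^{\kappa C_2}} \tag{4.25}$$

for all $\omega \in \mathcal{E}_L$, where $C_1$ and $C_2$ are finite, strictly positive constants depending only on $d, q, K_{0,\pm}, R_{0,\pm}, U_\pm, V_\pm, g, a, b$. It follows that

$$\mathcal{E}_L \subset \{\omega \in \mathbb{R}^{\mathbb{Z}^d};\ \Lambda_L(0) \text{ is } (\theta, a(g))\text{-suitable for } W_{g,\omega}\} \tag{4.26}$$

for all $L$ sufficiently large if $\kappa > (\theta + 2d + 1)/C_2$.

Fixing $\kappa$, denoting by $\mathcal{E}_L^{c}$ the complementary event to $\mathcal{E}_L$, and using (3.24), (3.22), and (4.22), we have that

$$\mathbb{P}(\mathcal{E}_L^{c}) \le L^d \mu\Bigl(\Bigl(\frac{g_L}{g}, 1\Bigr]\Bigr) \le K L^d \Bigl(\frac{g - g_L}{g}\Bigr)^{\eta} \tag{4.27}$$

$$\le K L^d \Bigl(\frac{a(g) - a(g_L)}{a g (\delta_-(g) + \eta_-(g))}\Bigr)^{\eta} \tag{4.28}$$

$$\le K L^d \Bigl(\frac{\bigl(\kappa \frac{\log L}{L}\bigr)^2}{a g (\delta_-(g) + \eta_-(g))}\Bigr)^{\eta} \longrightarrow 0 \tag{4.29}$$

as $L \to \infty$ since $\eta > d/2$. □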
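The exponent bookkeeping in (4.25), (4.26) and (4.29) can be checked numerically. The following Python sketch is an illustration under assumed sample values of the constants; the function names are hypothetical, and the parameter `c` stands in for the combination a g (δ−(g) + η−(g)).

```python
import math


def resolvent_bound(L: float, d: int, kappa: float, C1: float, C2: float) -> float:
    """Simplified right-hand side of (4.25)."""
    return C1 * L ** (2 * d + 1) / ((kappa * math.log(L)) ** 2 * L ** (kappa * C2))


def resolvent_bound_unsimplified(L: float, d: int, kappa: float, C1: float, C2: float) -> float:
    """Middle expression of (4.25), before simplification."""
    s = kappa * math.log(L) / L
    return C1 * L ** (2 * d - 1) * s ** (-2) * math.exp(-C2 * s * L)


def complement_probability_bound(L: float, d: int, kappa: float, eta: float,
                                 K: float, c: float) -> float:
    """Right-hand side of (4.29), with c standing for a*g*(delta_- + eta_-)."""
    return K * L ** d * ((kappa * math.log(L) / L) ** 2 / c) ** eta
```

With κ C2 > θ + 2d + 1 the first bound is eventually below L^(−θ), which is the suitability requirement (4.19); with η > d/2 the second bound decays like L^(d − 2η) up to logarithmic factors.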

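Proof of Theorem 3.12. Let $g < g_0$, fix $a < a_1 < a_2 < a(g)$, with $a(g) - a_1 \le b(g) - a(g)$, and define $g_1 \in (0, g)$ by $a(g_1) = a_1$ using Theorem 3.10.

We use the notation of the previous proof with $a_L = a_1$ and $g_L = g_1$ for all $L$. As before, it follows from [KK, Theorem 3.6] that for sufficiently large $L$ we have

$$\|\Gamma_{0,L} R_{g,\omega,0,L}(E) \chi_{0,L/3}\|_{0,L} \le C_1 L^{2d-1} (a_2 - a_1)^{-1} e^{-C_2 \sqrt{a_2 - a_1}\, L}$$

for all $E \in [a_2, a(g)]$ and $\omega \in \mathcal{E}_L$, where $C_1$ and $C_2$ are the same constants as in (4.25). Thus, given $\theta > 0$, we have

$$\mathcal{E}_L \subset \bigcap_{E \in [a_2, a(g)]} \{\omega \in \mathbb{R}^{\mathbb{Z}^d};\ \Lambda_L(0) \text{ is } (\theta, E)\text{-suitable for } W_{g,\omega}\} \tag{4.30}$$

for sufficiently large $L$. We also have, using (3.25), that

$$\mathbb{P}(\mathcal{E}_L^{c}) \le L^d \mu\Bigl(\Bigl(\frac{g_1}{g}, 1\Bigr]\Bigr) \le L^d p_1, \tag{4.31}$$

where $\mathcal{E}_L^{c}$ denotes the complementary event to $\mathcal{E}_L$. We now fix $\theta > 2d$, and pick $L_0 \in 6q\mathbb{N}$, sufficiently large so (4.30) holds for this $\theta$ and $L_0 \ge \mathcal{L}$, where $\mathcal{L}$ is the finite scale given in Theorem 4.6, and take

$$p_1 < \frac{1}{841^d L_0^d}. \tag{4.32}$$

We then have (4.20) with $L = L_0$ for all $E \in [a_2, a(g)]$, so Theorem 3.12 follows from Theorem 4.6. □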
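The arithmetic behind (4.31) and (4.32) is simple enough to verify exactly. The Python sketch below uses rational arithmetic; the function names and the sample values of d and L0 are assumptions made for illustration only.

```python
from fractions import Fraction


def p1_threshold(d: int, L0: int) -> Fraction:
    """The strict upper limit for p1 in (4.32): 1 / (841^d * L0^d)."""
    return Fraction(1, (841 * L0) ** d)


def satisfies_initial_estimate(p1, d: int, L0: int) -> bool:
    """Check L0^d * p1 < 841^(-d), which by (4.31) gives the bound (4.20)
    on the probability of the complementary event at scale L0."""
    return L0 ** d * Fraction(p1) < Fraction(1, 841 ** d)
```

For instance, with d = 1 and L0 = 6 any p1 strictly below 1/5046 passes the check, while p1 equal to the threshold does not, reflecting the strict inequality in (4.20).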

Proof of Theorem 3.13. It proceeds in the same way as the proof of Theorem 3.11, but taking into account both edges of the gap. Since we will use [KK, Theorem 3.6] for an energy in the middle of a gap, we will need $\eta > d$ instead of $\eta > d/2$ as in Theorem 3.11. We verify (4.20) instead of (4.21).

We fix $g \in [g_0, 1/Z_+)$ and $\theta > 2d$. Recall $a(g) = a(g_0) = b(g_0) = b(g)$. Given $L \in 6q\mathbb{N}$ we use Theorem 3.10 and, for large $L$, define $g_L^{\pm} \in (0, g_0)$ by

$$a(g_L^-) = a(g_0) - \kappa \frac{\log L}{L}, \tag{4.33}$$

$$b(g_L^+) = a(g_0) + \kappa \frac{\log L}{L}, \tag{4.34}$$

where $\kappa > 0$ will be specified later. We define the event

$$\mathcal{F}_L = \Bigl\{\omega \in \mathbb{R}^{\mathbb{Z}^d};\ -\frac{g_L^+}{g} \le \omega_i \le \frac{g_L^-}{g} \text{ for all } i \in \Lambda_L(0) \cap \mathbb{Z}^d\Bigr\}. \tag{4.35}$$


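It follows from Theorem 3.10, Lemma 5.1 and [KK, Theorem 4.3] that

$$(a(g_L^-), b(g_L^+)) \subset \mathbb{R} \setminus \sigma(W_{g,\omega,0,L}) \quad \text{for all } \omega \in \mathcal{F}_L. \tag{4.36}$$

Hence it follows from [KK, Theorem 3.6] that for large $L$ we have

$$\|\Gamma_{0,L} R_{g,\omega,0,L}(a(g_0)) \chi_{0,L/3}\|_{0,L} \le C_1' L^{2d-1} \Bigl(\frac{\kappa \log L}{L}\Bigr)^{-1} e^{-C_2' ((\kappa \log L)/L) L} = \frac{C_1' L^{2d}}{\kappa (\log L) L^{\kappa C_2'}} \tag{4.37}$$

for all $\omega \in \mathcal{F}_L$, where $C_1'$ and $C_2'$ are finite, strictly positive constants depending only on $d, q, K_{0,\pm}, R_{0,\pm}, U_\pm, V_\pm, g, a, b$. It follows that

$$\mathcal{F}_L \subset \{\omega \in \mathbb{R}^{\mathbb{Z}^d};\ \Lambda_L(0) \text{ is } (\theta, a(g_0))\text{-suitable for } W_{g,\omega}\} \tag{4.38}$$

for all $L$ sufficiently large if $\kappa > (\theta + 2d)/C_2'$.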

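Fixing $\kappa$, denoting by $\mathcal{F}_L^{c}$ the complementary event to $\mathcal{F}_L$, and using (3.26), (3.22), (3.23), (4.33), and (4.34), we have that

$$\mathbb{P}(\mathcal{F}_L^{c}) \le L^d \Bigl\{\mu\Bigl(\Bigl(\frac{g_L^-}{g}, 1\Bigr]\Bigr) + \mu\Bigl(\Bigl[-1, -\frac{g_L^+}{g}\Bigr)\Bigr)\Bigr\} \tag{4.39}$$

$$\le K L^d \Bigl\{\Bigl(\frac{g - g_L^-}{g}\Bigr)^{\eta} + \Bigl(\frac{g - g_L^+}{g}\Bigr)^{\eta}\Bigr\} \tag{4.40}$$

$$\le \frac{K L^d}{g_0^{\eta}} \Bigl\{\Bigl(g - g_0 + \frac{a(g_0) - a(g_L^-)}{a(\delta_-(g_0) + \eta_-(g_0))}\Bigr)^{\eta} + \Bigl(g - g_0 + \frac{b(g_L^+) - a(g_0)}{a(\delta_-(g_0) + \eta_-(g_0))}\Bigr)^{\eta}\Bigr\} \tag{4.41}$$

$$\le \frac{2 K L^d}{g_0^{\eta}} \Bigl(g - g_0 + \frac{\kappa \frac{\log L}{L}}{a(\delta_-(g_0) + \eta_-(g_0))}\Bigr)^{\eta}. \tag{4.42}$$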

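We now fix $\theta > 2d$ and $\kappa > (\theta + 2d)/C_2'$, and pick $L_0 \in 6q\mathbb{N}$, sufficiently large so (4.38) holds for this $\theta$ and $L_0 \ge \mathcal{L}$, where $\mathcal{L}$ is the finite scale given in Theorem 4.6, and

$$\frac{2 K L_0^d}{g_0^{\eta}} \Bigl(\frac{2 \kappa \frac{\log L_0}{L_0}}{a(\delta_-(g_0) + \eta_-(g_0))}\Bigr)^{\eta} < \frac{1}{841^d}, \tag{4.43}$$

which can be done since $\eta > d$. If we now set

$$\varepsilon = \min\Bigl\{\frac{\kappa \frac{\log L_0}{L_0}}{a(\delta_-(g_0) + \eta_-(g_0))},\ \frac{1}{Z_+} - g_0\Bigr\}, \tag{4.44}$$

we have (4.20) with $L = L_0$ and $E = a(g_0)$ for all $g \in [g_0, g_0 + \varepsilon)$, so Theorem 3.13 follows from Theorem 4.6. □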


Proof of Theorem 3.14. The proof is similar to the proof of Theorem 3.12, but taking into account both edges of the gap, as in the proof of Theorem 3.13. □

5. The Location of the Spectral Gap

In this section we prove Theorem 3.10. We proceed as in [FK3, Theorem 3], but we must take into consideration two random coefficients. To do so, we make use of the unitary equivalence between the operators $(W_{g,\omega,\Lambda})_\perp$ and $(W_{g,\omega,*,\Lambda})_\perp$, and use [KK, Theorem 4.3].

We start by approximating the spectrum of the random operator by spectra of periodic operators. If $k, n \in \mathbb{N}$, we say that $k \preceq n$ if $n \in k\mathbb{N}$ and that $k \prec n$ if $k \preceq n$ and $k \ne n$.

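Let us fix $g$ as in (3.6) and set

$$\mathcal{T}_g = \{\tau = \{\tau_i,\ i \in \mathbb{Z}^d\};\ -g \le \tau_i \le g\} = [-g, g]^{\mathbb{Z}^d}, \tag{5.1}$$

$$\mathcal{T}^{(n)}_g = \{\tau \in \mathcal{T}_g;\ \tau_{i+nj} = \tau_i \text{ for all } i, j \in \mathbb{Z}^d\}, \quad n \in \mathbb{N}, \tag{5.2}$$

and

$$\mathcal{T}^{(\infty)}_g = \bigcup_{n \succeq q} \mathcal{T}^{(n)}_g. \tag{5.3}$$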

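For $\tau \in \mathcal{T}_g$ we let

$$K_\tau(x) = \gamma_\tau(x) K_0(x), \quad \text{with } \gamma_\tau(x) = 1 + \sum_{i \in \mathbb{Z}^d} \tau_i u_i(x), \tag{5.4}$$

$$R_\tau(x) = \zeta_\tau(x) R_0(x), \quad \text{with } \zeta_\tau(x) = 1 + \sum_{i \in \mathbb{Z}^d} \tau_i v_i(x), \tag{5.5}$$

$$A_\tau = \sqrt{R_\tau}\, D \sqrt{K_\tau}, \qquad W_\tau = A_\tau^* A_\tau. \tag{5.6}$$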

The following lemma shows that the (nonrandom) spectrum of the random classical wave operator $W_{g,\omega}$ is determined by the spectra of the periodic classical wave operators $W_\tau$, $\tau \in \mathcal{T}^{(\infty)}_g$. The analogous result for random Schrödinger operators was proven in [KM2, Theorem 4]. It was extended to certain random classical wave operators in [FK3, Lemma 19] and [FK4, Lemma 27].

LEMMA 5.1. Let $W_{g,\omega}$ be a random second-order classical wave operator. Its spectrum $\Sigma_g$ is given by

$$\Sigma_g = \bigcup_{\tau \in \mathcal{T}^{(\infty)}_g} \sigma(W_\tau). \tag{5.7}$$


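Proof. Let $\Sigma'_g$ denote the right-hand side of (5.7). We start by showing that

$$\sigma(W_\tau) \subset \Sigma'_g \quad \text{for all } \tau \in \mathcal{T}_g, \tag{5.8}$$

which implies that

$$\Sigma_g \subset \Sigma'_g. \tag{5.9}$$

Let $\{\ell_n;\ n = 0, 1, 2, \ldots\}$ be a sequence in $2\mathbb{N}$ such that $\ell_0 = 2q$ and $\ell_n \prec \ell_{n+1}$ for each $n = 0, 1, 2, \ldots$. Given $\tau \in \mathcal{T}_g$, we specify $\tau_n \in \mathcal{T}^{(\ell_n)}_g$ by requiring $(\tau_n)_i = \tau_i$ for $i \in [-\ell_n/2, \ell_n/2)^d \cap \mathbb{Z}^d$. We set $R_n = (W_{\tau_n} + 1)^{-1}$, $R = (W_\tau + 1)^{-1}$. We will show that $R_n \to R$ strongly, which implies (5.8), as in [FK3, Lemma 45]. To do so, note that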

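$$R_n = \bigl(\sqrt{K_{\tau_n}}\, D^* R_{\tau_n} D \sqrt{K_{\tau_n}} + 1\bigr)^{-1} = K^{-1/2}_{\tau_n} \bigl(D^* R_{\tau_n} D + K^{-1}_{\tau_n}\bigr)^{-1} K^{-1/2}_{\tau_n}, \tag{5.10}$$

and similarly for $R$. Note that we have uniform (in $n$) bounds on the operator norms of $K^{-1}_{\tau_n}$, $R_{\tau_n}$, and $(D^* R_{\tau_n} D + K^{-1}_{\tau_n})^{-1}$. In addition, it is easy to see that $K^{-1/2}_{\tau_n} \to K^{-1/2}_\tau$, $K^{-1}_{\tau_n} \to K^{-1}_\tau$, and $R_{\tau_n} \to R_\tau$, the convergence being in the strong operator topology. Thus it suffices to show that $(D^* R_{\tau_n} D + K^{-1}_{\tau_n})^{-1} \to (D^* R_\tau D + K^{-1}_\tau)^{-1}$ strongly. But this follows from the preceding remarks, the relation

$$\begin{aligned}
&(D^* R_{\tau_n} D + K^{-1}_{\tau_n})^{-1} - (D^* R_\tau D + K^{-1}_\tau)^{-1}\\
&\quad = (D^* R_{\tau_n} D + K^{-1}_{\tau_n})^{-1} \bigl(D^* (R_\tau - R_{\tau_n}) D + (K^{-1}_\tau - K^{-1}_{\tau_n})\bigr) (D^* R_\tau D + K^{-1}_\tau)^{-1},
\end{aligned} \tag{5.11}$$

and the fact that the operators $D (D^* R_{\tau_n} D + K^{-1}_{\tau_n})^{-1}$ are bounded with norms uniformly bounded in $n$ and, hence, also their adjoints.

To prove the opposite inclusion to (5.9), we introduce the countable sets

$$\mathcal{T}^{(N)}_{g,\mathbb{Q}} = \mathcal{T}^{(N)}_g \cap \mathbb{Q}^{\mathbb{Z}^d}, \quad N = 0, 1, 2, \ldots, \infty. \tag{5.12}$$

Since any $\tau \in \mathcal{T}^{(\infty)}_g$ can be approximated uniformly by a sequence $\tau_n \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}}$, the previous argument shows that

$$\sigma(W_\tau) \subset \bigcup_{\tau' \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}}} \sigma(W_{\tau'}) \quad \text{for all } \tau \in \mathcal{T}^{(\infty)}_g, \tag{5.13}$$

which implies that

$$\Sigma'_g = \bigcup_{\tau \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}}} \sigma(W_\tau). \tag{5.14}$$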


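Thus (5.7) follows if we prove that

$$\sigma(W_\tau) \subset \Sigma_g \quad \text{for all } \tau \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}}. \tag{5.15}$$

Note that a.e. $\omega \in \Omega \equiv [-1, 1]^{\mathbb{Z}^d}$. Let $\{\ell_n \in \mathbb{N};\ n = 0, 1, 2, \ldots\}$ be such that $\ell_0 = 2q$ and $\ell_n \prec \ell_{n+1}$ for each $n = 0, 1, 2, \ldots$. For each $n$, $q' \succeq q$, and $\tau \in \mathcal{T}^{(q')}_{g,\mathbb{Q}}$ we consider the event

$$\Omega_{n,q',\tau} = \Bigl\{\omega \in \Omega;\ \text{there is } x_\omega = x_{n,q',\tau,\omega} \in q'\mathbb{Z}^d \text{ such that } \max_{i \in (x_\omega + [-\ell_n/2, \ell_n/2)^d) \cap \mathbb{Z}^d} |g\omega_i - \tau_i| \le \frac{1}{\ell_n^{d+1}}\Bigr\}; \tag{5.16}$$

notice $\mathbb{P}(\Omega_{n,q',\tau}) = 1$. We take the countable intersection

$$\tilde{\Omega} = \bigcap_{n=0}^{\infty}\ \bigcap_{q' \succeq q}\ \bigcap_{\tau \in \mathcal{T}^{(q')}_{g,\mathbb{Q}}} \Omega_{n,q',\tau}, \tag{5.17}$$

so we have $\mathbb{P}(\tilde{\Omega}) = 1$. We will show that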

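$$\sigma(W_\tau) \subset \sigma(W_{g,\omega}) \quad \text{for all } \tau \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}} \text{ and } \omega \in \tilde{\Omega}, \tag{5.18}$$

so (5.15) follows.

So let $\tau \in \mathcal{T}^{(\infty)}_{g,\mathbb{Q}}$, say $\tau \in \mathcal{T}^{(q')}_{g,\mathbb{Q}}$ for some $q' \succeq q$. Let $\omega \in \tilde{\Omega}$, $n \in \mathbb{N}$, and let $x_\omega = x_{n,q',\tau,\omega}$ be as in (5.16). We set $\omega^{(n)} = \{\omega^{(n)}_i = \omega_{i - x_\omega};\ i \in \mathbb{Z}^d\}$, and notice that $\sigma(W_{g,\omega^{(n)}}) = \sigma(W_{g,\omega})$. We have the following inequalities for the matrix norms:

$$\|(R_{g,\omega^{(n)}}(x) - R_\tau(x)) \chi_{0,\ell_n - r}(x)\| \le \frac{R_{0,+} V_+}{\ell_n}\, \chi_{0,\ell_n - r}(x), \tag{5.19}$$

$$\|(K_{g,\omega^{(n)}}(x)^{-1} - K_\tau(x)^{-1}) \chi_{0,\ell_n - r}(x)\| \le \frac{U_+}{\ell_n K_{0,-} (1 - gU_+)^2}\, \chi_{0,\ell_n - r}(x), \tag{5.20}$$

$$\|(K_{g,\omega^{(n)}}(x)^{-1/2} - K_\tau(x)^{-1/2}) \chi_{0,\ell_n - r}(x)\| \le \frac{K_{0,+}^{1/2} U_+ (1 + gU_+)}{\ell_n K_{0,-} (1 - gU_+)^{5/2}}\, \chi_{0,\ell_n - r}(x). \tag{5.21}$$

Using these inequalities we can proceed as before to show that

$$\lim_{n \to \infty} (W_{g,\omega^{(n)}} + I)^{-1} = (W_\tau + I)^{-1} \tag{5.22}$$

in the strong operator topology, and hence that

$$\sigma(W_\tau) \subset \bigcup_{n=0}^{\infty} \sigma(W_{g,\omega^{(n)}}) = \sigma(W_{g,\omega}). \qquad \square \tag{5.23}$$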

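Given real numbers $k, h$, with $|k|, |h| < 1/Z_+$, we set

$$\begin{gathered}
K_k(x) = K_0(x)(1 + kU(x)) \quad \text{and} \quad R_h(x) = R_0(x)(1 + hV(x)),\\
A(k, h) = \sqrt{R_h}\, D \sqrt{K_k},\\
W(k, h) = A(k, h)^* A(k, h), \qquad W_*(k, h) = A(k, h) A(k, h)^*.
\end{gathered} \tag{5.24}$$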

LEMMA 5.2. Let $W(k, h)$ be as in (5.24), and let $\Lambda = \Lambda_\ell(x_0)$ for some $x_0 \in \mathbb{R}^d$ and $\ell \succeq q$. The positive self-adjoint operator $(W(k, h)_\Lambda)_\perp$ has compact resolvent, so let $\mu_1(k, h) \le \mu_2(k, h) \le \ldots$ be its eigenvalues, repeated according to their (finite) multiplicity. Then each $\mu_j(h) \equiv \mu_j(h, h)$, $j = 1, 2, \ldots$, is a Lipschitz continuous, strictly increasing function of $h$, with

$$\tfrac{1}{2}(\delta_-(g) + \eta_-(g))(\mu_j(h_1) + \mu_j(h_2)) \le \frac{\mu_j(h_2) - \mu_j(h_1)}{h_2 - h_1} \le \tfrac{1}{2}(\delta_+(g) + \eta_+(g))(\mu_j(h_1) + \mu_j(h_2)) \tag{5.25}$$

for any $h_1, h_2 \in (-g, g)$, $0 < g < 1/Z_+$, where $\delta_\pm(g)$ and $\eta_\pm(g)$ are given in (3.11) and (3.12).

Proof. Let $0 < g < 1/Z_+$, $-g \le h_1 < h_2 \le g$. We have

$$R_{h_2}(x) - R_{h_1}(x) = (h_2 - h_1) V(x) R_0(x) \ge 0, \tag{5.26}$$

so $W(k, h_2)_\Lambda \ge W(k, h_1)_\Lambda$ and, hence, each $\mu_j(k, h)$ is an increasing function of $h$ for fixed $k$. It also follows from (5.26) that

$$R_{h_2}(x) = R_{h_1}(x) \Bigl(1 + \frac{(h_2 - h_1) V(x)}{1 + h_1 V(x)}\Bigr) \tag{5.27}$$

and

$$R_{h_1}(x) = R_{h_2}(x) \Bigl(1 - \frac{(h_2 - h_1) V(x)}{1 + h_2 V(x)}\Bigr), \tag{5.28}$$

which gives us

$$R_{h_1}(x)(1 + \eta_-(g)(h_2 - h_1)) \le R_{h_2}(x) \le R_{h_1}(x)(1 + \eta_+(g)(h_2 - h_1)) \tag{5.29}$$

and

$$R_{h_2}(x)(1 - \eta_+(g)(h_2 - h_1)) \le R_{h_1}(x) \le R_{h_2}(x)(1 - \eta_-(g)(h_2 - h_1)) \tag{5.30}$$


with $\eta_\pm(g)$ as in (3.12).

From (5.29) we get that for each $k$ we have

$$(1 + \eta_-(g)(h_2 - h_1)) W(k, h_1)_\Lambda \le W(k, h_2)_\Lambda \le (1 + \eta_+(g)(h_2 - h_1)) W(k, h_1)_\Lambda, \tag{5.31}$$

so it follows from the min-max principle that for all $j = 1, 2, \ldots$,

$$(1 + \eta_-(g)(h_2 - h_1)) \mu_j(k, h_1) \le \mu_j(k, h_2) \le (1 + \eta_+(g)(h_2 - h_1)) \mu_j(k, h_1), \tag{5.32}$$

i.e.,

$$\eta_-(g) \mu_j(k, h_1) \le \frac{\mu_j(k, h_2) - \mu_j(k, h_1)}{h_2 - h_1} \le \eta_+(g) \mu_j(k, h_1). \tag{5.33}$$

Similarly, using (5.30) we get

$$\eta_-(g) \mu_j(k, h_2) \le \frac{\mu_j(k, h_2) - \mu_j(k, h_1)}{h_2 - h_1} \le \eta_+(g) \mu_j(k, h_2). \tag{5.34}$$

Thus

$$\eta_-(g) \mu_j(k, h_2) \le \frac{\mu_j(k, h_2) - \mu_j(k, h_1)}{h_2 - h_1} \le \eta_+(g) \mu_j(k, h_1). \tag{5.35}$$

Since the operators $(W(k, h)_\Lambda)_\perp$ and $(W_*(k, h)_\Lambda)_\perp$ are unitarily equivalent, the $\mu_j(k, h)$ are also the eigenvalues of $(W_*(k, h)_\Lambda)_\perp$, so the above argument gives

$$\delta_-(g) \mu_j(k_2, h) \le \frac{\mu_j(k_2, h) - \mu_j(k_1, h)}{k_2 - k_1} \le \delta_+(g) \mu_j(k_1, h), \tag{5.36}$$

where $-g \le k_1 < k_2 \le g$.

Since

$$\mu_j(h_2, h_2) - \mu_j(h_1, h_1) = (\mu_j(h_2, h_2) - \mu_j(h_2, h_1)) + (\mu_j(h_2, h_1) - \mu_j(h_1, h_1)) \tag{5.37}$$

$$= (\mu_j(h_2, h_2) - \mu_j(h_1, h_2)) + (\mu_j(h_1, h_2) - \mu_j(h_1, h_1)), \tag{5.38}$$

we may use (5.35) and (5.36) with (5.37), repeat the procedure with (5.38) instead of (5.37), and take the average of the bounds to obtain (5.25). The properties of the functions $\mu_j(h)$ follow. □
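The averaging step can be spelled out as follows. This is a sketch under two assumptions not restated in the proof: that $\delta_\pm(g)$ and $\eta_\pm(g)$ are nonnegative, and that $\mu_j(k, h)$ is nondecreasing in each argument (as follows from (5.35) and (5.36) in that case). We abbreviate $\mu = \mu_j$ and $\Delta = h_2 - h_1 > 0$, and suppress the argument $g$ of $\delta_\pm$ and $\eta_\pm$.

```latex
% Upper bound from (5.37): use (5.35) with k = h_2 and (5.36) with h = h_1.
\mu(h_2,h_2) - \mu(h_1,h_1)
  \le \Delta\,\bigl[\eta_+\,\mu(h_2,h_1) + \delta_+\,\mu(h_1,h_1)\bigr]
  \le \Delta\,\bigl[\eta_+\,\mu(h_2,h_2) + \delta_+\,\mu(h_1,h_1)\bigr].
% Upper bound from (5.38): use (5.36) with h = h_2 and (5.35) with k = h_1.
\mu(h_2,h_2) - \mu(h_1,h_1)
  \le \Delta\,\bigl[\delta_+\,\mu(h_1,h_2) + \eta_+\,\mu(h_1,h_1)\bigr]
  \le \Delta\,\bigl[\delta_+\,\mu(h_2,h_2) + \eta_+\,\mu(h_1,h_1)\bigr].
% Averaging the two upper bounds:
\mu(h_2,h_2) - \mu(h_1,h_1)
  \le \tfrac{1}{2}\,\Delta\,(\delta_+ + \eta_+)\,\bigl(\mu(h_1,h_1) + \mu(h_2,h_2)\bigr).
% Lower bounds, in the same way:
\mu(h_2,h_2) - \mu(h_1,h_1)
  \ge \Delta\,\bigl[\eta_-\,\mu(h_2,h_2) + \delta_-\,\mu(h_2,h_1)\bigr]
  \ge \Delta\,\bigl[\eta_-\,\mu(h_2,h_2) + \delta_-\,\mu(h_1,h_1)\bigr],
\mu(h_2,h_2) - \mu(h_1,h_1)
  \ge \Delta\,\bigl[\delta_-\,\mu(h_2,h_2) + \eta_-\,\mu(h_1,h_2)\bigr]
  \ge \Delta\,\bigl[\delta_-\,\mu(h_2,h_2) + \eta_-\,\mu(h_1,h_1)\bigr],
% and averaging:
\mu(h_2,h_2) - \mu(h_1,h_1)
  \ge \tfrac{1}{2}\,\Delta\,(\delta_- + \eta_-)\,\bigl(\mu(h_1,h_1) + \mu(h_2,h_2)\bigr).
```

Dividing both averaged bounds by $\Delta$ gives the two-sided estimate (5.25).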

The following lemma follows immediately from [KK, Theorem 4.3], Lemmas 5.1 and 5.2, and the min-max principle. We write $W(h)$ for $W(h, h)$ as in (5.24).


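LEMMA 5.3. Let $W_{g,\omega}$ be a random second-order classical wave operator. For all sequences $\{\ell_n \in \mathbb{N};\ n = 0, 1, 2, \ldots\}$, with $\ell_0 = 2q$ and $\ell_n \prec \ell_{n+1}$ for each $n = 0, 1, 2, \ldots$, we have

$$\Sigma_g = \bigcup_{h \in [-g, g]} \sigma(W(h)) = \bigcup_{h \in [-g, g]}\ \bigcup_{n=0}^{\infty} \sigma\bigl(W(h)_{\Lambda_{\ell_n}(0)}\bigr). \tag{5.39}$$

In particular, $\Sigma_g$ is increasing in $g$.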

We are now ready to prove Theorem 3.10. As $\Sigma_g$ is increasing in $g$, we expect the gap to shrink as we increase $g$ until it either disappears at some $g_0$, or it remains open for all allowed $g$. Thus we define

$$g_0 = \sup\Bigl\{g \in \Bigl[0, \frac{1}{Z_+}\Bigr);\ \Sigma_g \cap (a, b) \ne (a, b)\Bigr\}. \tag{5.40}$$

Let $\{\ell_n;\ n = 0, 1, 2, \ldots\}$ be as in Lemma 5.3, $h \in [-g, g]$, and let $\mu^{(n)}_1(h) \le \mu^{(n)}_2(h) \le \ldots$ be the nonzero eigenvalues of $W(h)_{\Lambda_n}$, where $\Lambda_n = \Lambda_{\ell_n}(0)$, repeated according to their (finite) multiplicity; notice $\lim_{j \to \infty} \mu^{(n)}_j(h) = \infty$. By Lemma 5.2 each $\mu^{(n)}_j(h)$ is a strictly increasing continuous function of $h$, hence it follows from Lemma 5.3 that

$$\Sigma_g = \bigcup_{n=0}^{\infty}\ \bigcup_{h \in [-g, g]} \sigma(W(h)_{\Lambda_n}) = \bigcup_{n=0}^{\infty}\ \bigcup_{j=1}^{\infty} \bigl[\mu^{(n)}_j(-g), \mu^{(n)}_j(g)\bigr]. \tag{5.41}$$

In particular, $\Sigma_g$ is a countable union of disjoint closed intervals, so for $g < g_0$ we can define $a(g)$ and $b(g)$ by (3.19). Since $\Sigma_g$ is increasing in $g \in [0, 1/Z_+)$ by Lemma 5.3, it follows that $a(g)$ and $-b(g)$ are increasing functions in $[0, g_0)$.

For each $n$ let

$$j_n = \max\{j;\ \mu^{(n)}_j(0) \le a\}, \tag{5.42}$$

so using Assumption 3.9 and [KK, Equation (4.1) in Theorem 4.3], we have

$$j_n + 1 = \min\{j;\ \mu^{(n)}_j(0) \ge b\}. \tag{5.43}$$

If $g < g_0$, it follows from the definition of $j_n$, Assumption 3.9 and [KK, Theorem 4.3], that $\mu_{j_n}(-g)$ and $-\mu_{j_n+1}(g)$ are both increasing in $n$, and

$$a(g) = \lim_{n \to \infty} \mu_{j_n}(g), \tag{5.44}$$

$$b(g) = \lim_{n \to \infty} \mu_{j_n+1}(-g). \tag{5.45}$$

Thus, given $0 \le g_1 < g_2 < g_0$, we can conclude from (5.25) that

$$\tfrac{1}{2}(\delta_-(g_2) + \eta_-(g_2))(a(g_1) + a(g_2)) \le \frac{a(g_2) - a(g_1)}{g_2 - g_1} \le \tfrac{1}{2}(\delta_+(g_2) + \eta_+(g_2))(a(g_1) + a(g_2)), \tag{5.46}$$

$$\tfrac{1}{2}(\delta_-(g_2) + \eta_-(g_2))(b(g_1) + b(g_2)) \le \frac{b(g_1) - b(g_2)}{g_2 - g_1} \le \tfrac{1}{2}(\delta_+(g_2) + \eta_+(g_2))(b(g_1) + b(g_2)), \tag{5.47}$$

which are exactly (3.22) and (3.23).

The Lipschitz continuity of $a(g)$ and $b(g)$ follows and, hence, they are absolutely continuous functions. Their a.e. derivatives can be estimated from (5.46) and (5.47):

$$\delta_-(h) + \eta_-(h) \le \frac{a'(h)}{a(h)} \le \delta_+(h) + \eta_+(h), \tag{5.48}$$

$$\delta_-(h) + \eta_-(h) \le \frac{-b'(h)}{b(h)} \le \delta_+(h) + \eta_+(h). \tag{5.49}$$

Using the absolute continuity, we may integrate over $h$ obtaining

$$\int_{g_1}^{g_2} (\delta_-(h) + \eta_-(h))\, dh \le \log\Bigl(\frac{a(g_2)}{a(g_1)}\Bigr) \le \int_{g_1}^{g_2} (\delta_+(h) + \eta_+(h))\, dh \tag{5.50}$$

and

$$\int_{g_1}^{g_2} (\delta_-(h) + \eta_-(h))\, dh \le \log\Bigl(\frac{b(g_1)}{b(g_2)}\Bigr) \le \int_{g_1}^{g_2} (\delta_+(h) + \eta_+(h))\, dh. \tag{5.51}$$

Performing the integrations, we obtain (3.20) and (3.21), from which (3.18) follows.

If $g_0 < 1/Z_+$, we must have $\lim_{g \uparrow g_0} a(g) = \lim_{g \uparrow g_0} b(g)$. This follows from (5.41), (5.44) and (5.45), since by (5.25) each $\mu^{(n)}_j(h)$ is a locally Lipschitz continuous function of $h \in (-1/Z_+, 1/Z_+)$, uniformly in $n$. Thus, if $g \in [g_0, 1/Z_+)$ it follows that $[a, b] \subset \Sigma_g$; we set $a(g) = b(g) = \lim_{g \uparrow g_0} a(g)$.

Theorem 3.10 is proven.

6. The Wegner Estimate

In this section we prove Theorem 4.4. We proceed as in [FK3, Theorem 23], but we must take into consideration two random coefficients. To do so, we make use of the unitary equivalence between the operators $(W_{g,\omega,\Lambda})_\perp$ and $(W_{g,\omega,*,\Lambda})_\perp$. We assume that $W_{g,\omega}$ is the partially elliptic operator without loss of generality.

We start by picking $\kappa \in (1, (1/g)(1/Z_+))$, say

$$\kappa = \frac{1 + gZ_+}{2gZ_+}. \tag{6.1}$$
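That this choice of κ lies in the required interval whenever 0 < gZ+ < 1 can be checked with exact rational arithmetic. The sketch below is illustrative; the function names are hypothetical and the sample values of g and Z+ are arbitrary.

```python
from fractions import Fraction


def kappa(g, Z_plus) -> Fraction:
    """kappa = (1 + g Z+) / (2 g Z+), as in (6.1)."""
    gz = Fraction(g) * Fraction(Z_plus)
    return (1 + gz) / (2 * gz)


def kappa_in_admissible_interval(g, Z_plus) -> bool:
    """Check 1 < kappa < 1 / (g Z+), which holds whenever 0 < g Z+ < 1."""
    gz = Fraction(g) * Fraction(Z_plus)
    return 1 < kappa(g, Z_plus) < 1 / gz
```

With this κ one also has the identity (κ − 1) g = (1 − gZ+)/(2Z+), which is the quantity that appears later in (6.15) and (6.16).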

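We rewrite $\gamma_{g,\omega}$ and $\zeta_{g,\omega}$ in the form

$$\gamma_{g,\omega} = \tilde{\gamma} + g \sum_{i \in \mathbb{Z}^d} \tilde{\omega}_i u_i, \tag{6.2}$$

$$\zeta_{g,\omega} = \tilde{\zeta} + g \sum_{i \in \mathbb{Z}^d} \tilde{\omega}_i v_i, \tag{6.3}$$

where

$$\tilde{\gamma} = 1 - \kappa g \sum_{i \in \mathbb{Z}^d} u_i \ge \frac{1 - gU_+}{2} > 0, \tag{6.4}$$

$$\tilde{\zeta} = 1 - \kappa g \sum_{i \in \mathbb{Z}^d} v_i \ge \frac{1 - gV_+}{2} > 0, \tag{6.5}$$

and $\tilde{\omega}_i = \omega_i + \kappa \in [\kappa - 1, \kappa + 1]$ for each $i \in \mathbb{Z}^d$.

We fix $\Lambda = \Lambda_L(x)$ with $x \in \mathbb{Z}^d$ and $L \in 2q\mathbb{N}$. The finite volume operators $(W_{g,\omega,\Lambda})_\perp$ and $(W_{g,\omega,*,\Lambda})_\perp$ are unitarily equivalent by [KK, Lemma A.1]. Since $(W_{g,\omega,\Lambda})_\perp$ has compact resolvent by [KK, Proposition 3.2], so does $(W_{g,\omega,*,\Lambda})_\perp$, and they have the same eigenvalues, say $\{\lambda_{g,\omega,n}\}_{n \in \mathbb{N}}$. We will denote by $\{\psi_{g,\omega,n}\}_{n \in \mathbb{N}}$ and $\{\varphi_{g,\omega,n}\}_{n \in \mathbb{N}}$ the corresponding orthonormal eigenfunctions for $(W_{g,\omega,\Lambda})_\perp$ and $(W_{g,\omega,*,\Lambda})_\perp$, respectively. Note that they can be chosen so they are measurable functions of $\omega$.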

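Given $j \in \mathbb{Z}^d$, we define $\varepsilon^{(j)} \in \mathbb{R}^{\mathbb{Z}^d}$ by $\varepsilon^{(j)}_i = \delta_{j,i}$. Note that $K_{g,\omega + s\varepsilon^{(j)}}(x)$ and $R_{g,\omega + t\varepsilon^{(j)}}(x)$ are coefficient matrices for $|s|, |t|$ sufficiently small; the corresponding classical wave operators will be denoted by $W_{g,\omega}(s, t; j)$, etc. The remarks of the previous paragraph still apply. Note that we can choose each $\lambda_{g,\omega,n}(s, t; j)$ jointly analytic in $s$ and $t$. If $j \in \Lambda \cap \mathbb{Z}^d$, we have

$$\begin{aligned}
\frac{\partial}{\partial \omega_j} \lambda_{g,\omega,n}
&= \frac{\partial}{\partial s} \lambda_{g,\omega,n}(s, t; j)\Big|_{(s,t)=(0,0)} + \frac{\partial}{\partial t} \lambda_{g,\omega,n}(s, t; j)\Big|_{(s,t)=(0,0)}\\
&= \Bigl\langle \varphi_{g,\omega,n}, \Bigl(\frac{\partial}{\partial s} W_{g,\omega,*,\Lambda}(s, t; j)\Big|_{(s,t)=(0,0)}\Bigr) \varphi_{g,\omega,n} \Bigr\rangle + \Bigl\langle \psi_{g,\omega,n}, \Bigl(\frac{\partial}{\partial t} W_{g,\omega,\Lambda}(s, t; j)\Big|_{(s,t)=(0,0)}\Bigr) \psi_{g,\omega,n} \Bigr\rangle\\
&= \langle \varphi_{g,\omega,n}, W_{g,\omega,*,\Lambda}(g u^\Lambda_j K_0)\, \varphi_{g,\omega,n} \rangle + \langle \psi_{g,\omega,n}, W_{g,\omega,\Lambda}(g v^\Lambda_j R_0)\, \psi_{g,\omega,n} \rangle,
\end{aligned} \tag{6.6}$$

where we used (4.7), with

$$u^\Lambda_j = \sum_{i \in \mathbb{Z}^d} u_{j+Li}, \qquad v^\Lambda_j = \sum_{i \in \mathbb{Z}^d} v_{j+Li}, \tag{6.7}$$

and $W_{g,\omega,*,\Lambda}(K_1)$ and $W_{g,\omega,\Lambda}(R_1)$ the finite volume operators defined by

$$W_{g,\omega,*,\Lambda}(K_1) = \sqrt{R_{g,\omega_\Lambda}}\, D_\Lambda K_1 D^*_\Lambda \sqrt{R_{g,\omega_\Lambda}}, \tag{6.8}$$

$$W_{g,\omega,\Lambda}(R_1) = \sqrt{K_{g,\omega_\Lambda}}\, D^*_\Lambda R_1 D_\Lambda \sqrt{K_{g,\omega_\Lambda}}. \tag{6.9}$$

Since $(W_{g,\omega,\Lambda})_\perp \ge 0$ has compact resolvent, we may define

$$N_{g,\omega,\Lambda}(E) = \operatorname{tr} \chi_{(-\infty, E]}((W_{g,\omega,\Lambda})_\perp), \tag{6.10}$$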


the number of eigenvalues of $(W_\Lambda)_\perp$ that are less than or equal to $E$. If $E \le 0$, we have $N_{W_\Lambda}(E) = 0$, and if $E > 0$, $N_{W_\Lambda}(E)$ is the number of eigenvalues of $W_{g,\omega,\Lambda}$ (or $(W_{g,\omega,\Lambda})_\perp$) in the interval $(0, E]$. Notice that $N_{g,\omega,\Lambda}(E)$ is the distribution function of the measure $n_{g,\omega,\Lambda}(dE)$ given by

$$\int h(E)\, n_{g,\omega,\Lambda}(dE) = \operatorname{tr}(h((W_{g,\omega,\Lambda})_\perp)), \tag{6.11}$$

for positive continuous functions $h$ of a real variable. Note also that

$$N_{g,\omega,\Lambda}(E) = \sum_{n \in \mathbb{N}} Y(E - \lambda_{g,\omega,n}), \tag{6.12}$$

where $Y(x)$ is the Heaviside function.

Let $f$ be a positive, continuous function on the real line with $f(0) = 0$, and $j \in \Lambda \cap \mathbb{Z}^d$. We have, using (6.6), that

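$$\begin{aligned}
-\frac{\partial}{\partial \omega_j} \int N_{g,\omega,\Lambda}(E) f(E)\, dE
&= -\frac{\partial}{\partial \omega_j} \sum_{n=1}^{\infty} \int Y(E - \lambda_{g,\omega,n}) f(E)\, dE\\
&= \sum_{n=1}^{\infty} \int \Bigl(\frac{\partial}{\partial \omega_j} \lambda_{g,\omega,n}\Bigr) \delta(E - \lambda_{g,\omega,n}) f(E)\, dE\\
&= \sum_{n=1}^{\infty} f(\lambda_{g,\omega,n}) \frac{\partial}{\partial \omega_j} \lambda_{g,\omega,n}\\
&= \sum_{n=1}^{\infty} \bigl\{\langle f(W_{g,\omega,*,\Lambda}) \varphi_{g,\omega,n}, W_{g,\omega,*,\Lambda}(g u^\Lambda_j K_0)\, \varphi_{g,\omega,n} \rangle + \langle f(W_{g,\omega,\Lambda}) \psi_{g,\omega,n}, W_{g,\omega,\Lambda}(g v^\Lambda_j R_0)\, \psi_{g,\omega,n} \rangle\bigr\}\\
&= \operatorname{tr}\{W_{g,\omega,*,\Lambda}(g u^\Lambda_j K_0) f(W_{g,\omega,*,\Lambda})\} + \operatorname{tr}\{W_{g,\omega,\Lambda}(g v^\Lambda_j R_0) f(W_{g,\omega,\Lambda})\}.
\end{aligned} \tag{6.13}$$

The last step used the fact that $f(0) = 0$. Thus

$$\begin{aligned}
-\sum_{i \in \Lambda \cap \mathbb{Z}^d} \tilde{\omega}_i \frac{\partial}{\partial \omega_i} \int N_{g,\omega,\Lambda}(E) f(E)\, dE
&= \operatorname{tr}\{W_{g,\omega,*,\Lambda}((\gamma_{g,\omega_\Lambda} - \tilde{\gamma}) K_0) f(W_{g,\omega,*,\Lambda})\} + \operatorname{tr}\{W_{g,\omega,\Lambda}((\zeta_{g,\omega_\Lambda} - \tilde{\zeta}) R_0) f(W_{g,\omega,\Lambda})\}\\
&= \operatorname{tr}\{W_{g,\omega,*,\Lambda} f(W_{g,\omega,*,\Lambda})\} - \operatorname{tr}\{W_{g,\omega,*,\Lambda}(\tilde{\gamma} K_0) f(W_{g,\omega,*,\Lambda})\}\\
&\quad + \operatorname{tr}\{W_{g,\omega,\Lambda} f(W_{g,\omega,\Lambda})\} - \operatorname{tr}\{W_{g,\omega,\Lambda}(\tilde{\zeta} R_0) f(W_{g,\omega,\Lambda})\}.
\end{aligned} \tag{6.14}$$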


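We have, for any $\omega \in [-1, 1]^{\mathbb{Z}^d}$ (and hence also for $\omega_\Lambda$), that

$$\tilde{\gamma}^{-1} \gamma_{g,\omega} = \tilde{\gamma}^{-1} \Bigl(\tilde{\gamma} + g \sum_{i \in \mathbb{Z}^d} \tilde{\omega}_i u_i\Bigr) \ge 1 + \frac{(\kappa - 1) g U_-}{1 - \kappa g U_-} \ge 1 + \frac{(1 - gZ_+) U_-}{2Z_+}, \tag{6.15}$$

and similarly,

$$\tilde{\zeta}^{-1} \zeta_{g,\omega} \ge 1 + \frac{(1 - gZ_+) V_-}{2Z_+}. \tag{6.16}$$

Since $f \ge 0$, we obtain

$$\operatorname{tr}\{W_{g,\omega,*,\Lambda}(\tilde{\gamma} K_0) f(W_{g,\omega,*,\Lambda})\} \le \Bigl(1 + \frac{(1 - gZ_+) U_-}{2Z_+}\Bigr)^{-1} \operatorname{tr}\{W_{g,\omega,*,\Lambda} f(W_{g,\omega,*,\Lambda})\}, \tag{6.17}$$

$$\operatorname{tr}\{W_{g,\omega,\Lambda}(\tilde{\zeta} R_0) f(W_{g,\omega,\Lambda})\} \le \Bigl(1 + \frac{(1 - gZ_+) V_-}{2Z_+}\Bigr)^{-1} \operatorname{tr}\{W_{g,\omega,\Lambda} f(W_{g,\omega,\Lambda})\}. \tag{6.18}$$

In addition, using the unitary equivalence between $(W_{g,\omega,\Lambda})_\perp$ and $(W_{g,\omega,*,\Lambda})_\perp$, we get

$$\operatorname{tr}\{W_{g,\omega,*,\Lambda} f(W_{g,\omega,*,\Lambda})\} = \operatorname{tr}\{W_{g,\omega,\Lambda} f(W_{g,\omega,\Lambda})\}. \tag{6.19}$$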

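It follows from (6.14)–(6.19) that

$$\operatorname{tr}\{W_{g,\omega,\Lambda} f(W_{g,\omega,\Lambda})\} \le \frac{(2Z_+ + (1 - gZ_+) Z_-)^2}{2(1 - gZ_+) Z_+ Z_-} \Bigl\{-\sum_{i \in \Lambda \cap \mathbb{Z}^d} \tilde{\omega}_i \frac{\partial}{\partial \omega_i} \int N_{g,\omega,\Lambda}(E) f(E)\, dE\Bigr\}, \tag{6.20}$$

where we used

$$\begin{aligned}
&\Bigl(1 - \Bigl(1 + \frac{(1 - gZ_+) U_-}{2Z_+}\Bigr)^{-1}\Bigr) + \Bigl(1 - \Bigl(1 + \frac{(1 - gZ_+) V_-}{2Z_+}\Bigr)^{-1}\Bigr)\\
&\quad = \frac{\frac{1 - gZ_+}{2Z_+} U_-}{1 + \frac{1 - gZ_+}{2Z_+} U_-} + \frac{\frac{1 - gZ_+}{2Z_+} V_-}{1 + \frac{1 - gZ_+}{2Z_+} V_-}
\ \ge\ \frac{\frac{1 - gZ_+}{2Z_+} (U_- + V_-)}{\bigl(1 + \frac{1 - gZ_+}{2Z_+} U_-\bigr)\bigl(1 + \frac{1 - gZ_+}{2Z_+} V_-\bigr)}\\
&\quad \ge \frac{\frac{1 - gZ_+}{2Z_+} Z_-}{\bigl(1 + \frac{1 - gZ_+}{2Z_+} Z_-\bigr)^2} = \frac{2(1 - gZ_+) Z_+ Z_-}{(2Z_+ + (1 - gZ_+) Z_-)^2}. \tag{6.21}
\end{aligned}$$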
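The purely algebraic steps in (6.21), namely the first inequality and the final equality, can be verified exactly for sample values. The Python sketch below is an illustration; the function names are hypothetical, and the second inequality in (6.21) is not checked because it depends on how Z− is related to U− and V−, which is fixed earlier in the paper.

```python
from fractions import Fraction


def scaled(g, Z_plus, s):
    """The quantity (1 - g Z+) s / (2 Z+) appearing throughout (6.21)."""
    return (1 - g * Z_plus) * s / (2 * Z_plus)


def lhs_sum(g, Z_plus, U_minus, V_minus):
    """First line of (6.21)."""
    xu, xv = scaled(g, Z_plus, U_minus), scaled(g, Z_plus, V_minus)
    return (1 - 1 / (1 + xu)) + (1 - 1 / (1 + xv))


def middle_term(g, Z_plus, U_minus, V_minus):
    """Lower bound after combining the two fractions over a common denominator."""
    xu, xv = scaled(g, Z_plus, U_minus), scaled(g, Z_plus, V_minus)
    return (xu + xv) / ((1 + xu) * (1 + xv))


def ratio_form(g, Z_plus, Z_minus):
    """x / (1 + x)^2 with x = (1 - g Z+) Z- / (2 Z+)."""
    x = scaled(g, Z_plus, Z_minus)
    return x / (1 + x) ** 2


def closed_form(g, Z_plus, Z_minus):
    """Right-hand side of (6.21)."""
    c = 1 - g * Z_plus
    return 2 * c * Z_plus * Z_minus / (2 * Z_plus + c * Z_minus) ** 2
```

The reciprocal of `closed_form` is exactly the prefactor in (6.20).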


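For given $j \in \mathbb{Z}^d$ we set $\omega^{(j)} = \{\omega_i;\ i \in \mathbb{Z}^d \setminus \{j\}\}$, and denote the corresponding expectation by $\mathbb{E}^{(j)}$. We have, for $j \in \Lambda \cap \mathbb{Z}^d$,

$$\begin{aligned}
\mathbb{E}\Bigl(-\frac{\partial}{\partial \omega_j} \int N_{g,\omega,\Lambda}(E) f(E)\, dE\Bigr)
&= \mathbb{E}^{(j)}\Bigl(\int_{\kappa - 1}^{\kappa + 1} \Bigl[-\frac{\partial}{\partial \tilde{\omega}_j} \int N_{g,\omega,\Lambda}(E) f(E)\, dE\Bigr] \rho(\tilde{\omega}_j - \kappa)\, d\tilde{\omega}_j\Bigr)\\
&\le \|\rho\|_\infty\, \mathbb{E}^{(j)}\Bigl(\int \bigl|N_{g,\{\omega^{(j)}, \omega_j = -1\},\Lambda}(E) - N_{g,\{\omega^{(j)}, \omega_j = 1\},\Lambda}(E)\bigr| f(E)\, dE\Bigr)\\
&\le 2n C_d' (K_{g,-} R_{g,-} \Theta)^{-d/2} \|\rho\|_\infty L^d \int E^{d/2} f(E)\, dE,
\end{aligned} \tag{6.22}$$

where we used [KK, Lemma 3.3] in the last step. $C_d'$ is a constant depending only on $d$, and $\Theta$ is the constant in (2.17).

Now let $n_{g,\Lambda}(dE) = \mathbb{E}(n_{g,\omega,\Lambda}(dE))$. For functions $f$ as above, it now follows from (6.11), (6.20), and (6.22), that

$$\int E f(E)\, n_{g,\Lambda}(dE) = \mathbb{E}\{\operatorname{tr}\{W_{g,\omega,\Lambda} f(W_{g,\omega,\Lambda})\}\} \le C \|\rho\|_\infty L^{2d} \int E^{d/2} f(E)\, dE, \tag{6.23}$$

where

$$C = 2n C_d' (2 + r)^d (\kappa + 1) \frac{(2Z_+ + (1 - gZ_+) Z_-)^2}{2(1 - gZ_+) Z_+ Z_-} (K_{g,-} R_{g,-} \Theta)^{-d/2} \le \frac{36 n C_d' (2 + r)^d}{g Z_- (1 - gZ_+)^{d+1}} (K_{0,-} R_{0,-} \Theta)^{-d/2}. \tag{6.24}$$

We can now conclude that $n_{g,\Lambda}(dE)$ is absolutely continuous with

$$\frac{n_{g,\Lambda}(dE)}{dE} \le C \|\rho\|_\infty E^{(d/2)-1} L^{2d} \quad \text{for } E \ge 0. \tag{6.25}$$

The estimate (4.17) now follows by a standard argument:

$$\mathbb{P}\{\operatorname{dist}(\sigma(W_{g,\omega,\Lambda}), E) < \eta\} \le \mathbb{P}\Bigl\{\int_{(E - \eta, E + \eta)} n_{g,\omega,\Lambda}(dE) \ge 1\Bigr\} \le \int_{(E - \eta, E + \eta)} n_{g,\Lambda}(dE) \le 2^{d/2} C \|\rho\|_\infty E^{(d/2)-1} \eta L^{2d}, \tag{6.26}$$

for all $E > 0$ and $0 \le \eta \le E$.

Theorem 4.4 is proven.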


Appendix: Measurability of Random Classical Wave Operators

In this appendix we prove measurability for the random classical wave operators $\mathbb{W}_{g,\omega}$ and $W_{g,\omega}$. We also prove measurability for $W_{g,\omega,*}$.

We recall that a random operator is a mapping $\omega \to H_\omega$ from a probability space to self-adjoint operators on a Hilbert space, such that the mappings $\omega \to f(H_\omega)$ are strongly measurable for all bounded measurable functions $f$ on $\mathbb{R}$. It suffices to require weak measurability. (See [KM1, CL].)

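PROPOSITION A.1. If the random medium satisfies Assumption 3.1, then $\mathbb{W}_{g,\omega}$, $W_{g,\omega}$, and $W_{g,\omega,*}$ are random operators.

Proof. We start by showing that $\mathbb{W}_{g,\omega}$ is a random operator. To do so, we prove that $(\mathbb{W}_{g,\omega} \mp i)^{-1}$ is strongly measurable. It then follows from the resolvent identity, continuity of the resolvent, and a connectedness argument that $(\mathbb{W}_{g,\omega} - z)^{-1}$ is strongly measurable for all nonreal $z$, and hence $\mathbb{W}_{g,\omega}$ is a random operator by [KM1, Theorem 3].

Note that we may write

$$\mathbb{W}_{g,\omega} = \sqrt{S_{g,\omega}}\; \mathbb{W}_D \sqrt{S_{g,\omega}}, \tag{A.1}$$

where $S_{g,\omega} = K_{g,\omega} \oplus R_{g,\omega}$ and $\mathbb{W}_D$ is given by (2.23) with $A = D$. Thus

$$(\mathbb{W}_{g,\omega} \mp i)^{-1} = S^{-1/2}_{g,\omega} (\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1} S^{-1/2}_{g,\omega}, \tag{A.2}$$

so it suffices to show that $(\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1}$ is strongly measurable.

Let $\lambda > 0$; using the resolvent identity we get

$$(\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1} = (\mathbb{W}_D \mp i\lambda)^{-1} \mp i (\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1} (\lambda - S^{-1}_{g,\omega}) (\mathbb{W}_D \mp i\lambda)^{-1}, \tag{A.3}$$

hence

$$(\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1} \bigl(1 \pm i (\lambda - S^{-1}_{g,\omega}) (\mathbb{W}_D \mp i\lambda)^{-1}\bigr) = (\mathbb{W}_D \mp i\lambda)^{-1}. \tag{A.4}$$

If $\lambda > (\min\{K_-, R_-\})^{-1}$, we have

$$\|(\lambda - S^{-1}_{g,\omega}) (\mathbb{W}_D \mp i\lambda)^{-1}\| \le \|1 - \lambda^{-1} S^{-1}_{g,\omega}\| \le 1 - \lambda^{-1} (\min\{K_-, R_-\})^{-1} < 1, \tag{A.5}$$

and hence

$$(\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1} = (\mathbb{W}_D \mp i\lambda)^{-1} \bigl(1 \pm i (\lambda - S^{-1}_{g,\omega}) (\mathbb{W}_D \mp i\lambda)^{-1}\bigr)^{-1}. \tag{A.6}$$

The strong measurability of $(\mathbb{W}_D \mp i S^{-1}_{g,\omega})^{-1}$ follows.

We have proved that $\mathbb{W}_{g,\omega}$ is a random operator. It follows that $\mathbb{W}^2_{g,\omega}$ is also a random operator since $(\mathbb{W}^2_{g,\omega} - z)^{-1}$ is strongly measurable if $z \notin [0, \infty)$. Thus $W_{g,\omega}$ and $W_{g,\omega,*}$ are random operators in view of (2.26). □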


Acknowledgements

The authors thank Maximilian Seifert for many discussions and suggestions. A. Klein also thanks Alex Figotin, François Germinet, and Svetlana Jitomirskaya for enjoyable discussions.

References

[AM] Aizenman, M. and Molchanov, S.: Localization at large disorder and extreme energies: an elementary derivation, Comm. Math. Phys. 157 (1993), 245–278.

[ASFH] Aizenman, M., Schenker, J., Friedrich, R. and Hundertmark, D.: Finite-volume criteria for Anderson localization, Comm. Math. Phys. 224 (2001), 219–253.

[AENSS] Aizenman, M., Elgart, A., Naboko, S., Schenker, J. and Stolz, G.: In preparation.

[CL] Carmona, R. and Lacroix, J.: Spectral Theory of Random Schrödinger Operators, Birkhäuser, Boston, 1990.

[CH] Combes, J. M. and Hislop, P. D.: Localization for some continuous, random Hamiltonian in d-dimension, J. Funct. Anal. 124 (1994), 149–180.

[CHT] Combes, J. M., Hislop, P. D. and Tip, A.: Band edge localization and the density of states for acoustic and electromagnetic waves in random media, Ann. Inst. H. Poincaré Phys. Théor. 70 (1999), 381–428.

[DS] Damanik, D. and Stollmann, P.: Multi-scale analysis implies strong dynamical localization, Geom. Funct. Anal. 11 (2001), 11–29.

[FK1] Figotin, A. and Klein, A.: Localization phenomenon in gaps of the spectrum of random lattice operators, J. Statist. Phys. 75 (1994), 997–1021.

[FK2] Figotin, A. and Klein, A.: Localization of electromagnetic and acoustic waves in random media. Lattice Model, J. Statist. Phys. 76 (1994), 985–1003.

[FK3] Figotin, A. and Klein, A.: Localization of classical waves I: Acoustic waves, Comm. Math. Phys. 180 (1996), 439–482.

[FK4] Figotin, A. and Klein, A.: Localization of classical waves II: Electromagnetic waves, Comm. Math. Phys. 184 (1997), 411–441.

[FK5] Figotin, A. and Klein, A.: Localized classical waves created by defects, J. Statist. Phys. 86 (1997), 165–177.

[FK6] Figotin, A. and Klein, A.: Midgap defect modes in dielectric and acoustic media, SIAM J. Appl. Math. 58 (1998), 1748–1773.

[FK7] Figotin, A. and Klein, A.: Localization of light in lossless inhomogeneous dielectrics, J. Opt. Soc. Amer. A 15 (1998), 1423–1435.

[GD] Germinet, F. and De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators, Comm. Math. Phys. 194 (1998), 323–341.

[GK1] Germinet, F. and Klein, A.: Bootstrap multiscale analysis and localization in random media, Comm. Math. Phys. 222 (2001), 415–448.

[GK2] Germinet, F. and Klein, A.: Decay of operator-valued kernels of functions of Schrödinger and other operators, Proc. Amer. Math. Soc. 131 (2003), 911–920.

[GK3] Germinet, F. and Klein, A.: A characterization of the Anderson metal-insulator transport transition, Duke Math. J., to appear.

[GK4] Germinet, F. and Klein, A.: Explicit finite volume criteria for localization in continuous random media and applications, Geom. Funct. Anal., to appear.

[GK5] Germinet, F. and Klein, A.: High disorder localization for random Schrödinger operators through explicit finite volume criteria, Markov Process. Related Fields, to appear.


[GK6] Germinet, F. and Klein, A.: The Anderson metal-insulator transport transition, Contemp. Math., to appear.

[HM] Holden, H. and Martinelli, F.: On absence of diffusion near the bottom of the spectrum for a random Schrödinger operator, Comm. Math. Phys. 93 (1984), 197–217.

[Kle] Klein, A.: Localization of light in randomized periodic media, In: J.-P. Fouque (ed.), Diffuse Waves in Complex Media, Kluwer, Dordrecht, 1999, pp. 73–92.

[KK] Klein, A. and Koines, A.: A general framework for localization of classical waves: I. Inhomogeneous media and defect eigenmodes, Math. Phys. Anal. Geom. 4 (2001), 97–130.

[KKS] Klein, A., Koines, A. and Seifert, M.: Generalized eigenfunctions for waves in inhomogeneous media, J. Funct. Anal. 190 (2002), 255–291.

[Klo1] Klopp, F.: Localization for continuous random Schrödinger operators, Comm. Math. Phys. 167 (1995), 553–569.

[Klo2] Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators, Duke Math. J. 98 (1999), 335–396.

[Klo3] Klopp, F.: Weak disorder localization and Lifshitz tails: continuous Hamiltonians, Ann. Inst. H. Poincaré 3 (2002), 711–737.

[KM1] Kirsch, W. and Martinelli, F.: On the ergodic properties of the spectrum of general random operators, J. Reine Angew. Math. 334 (1982), 141–156.

[KM2] Kirsch, W. and Martinelli, F.: On the spectrum of Schrödinger operators with a random potential, Comm. Math. Phys. 85 (1982), 329–350.

[KSS] Kirsch, W., Stolz, G. and Stollman, P.: Localization for random perturbations of periodic Schrödinger operators, Random Oper. Stochastic Equations 6 (1998), 241–268.

[Na] Najar, H.: Asymptotic of the integrated density of states of random acoustic operators, C.R. Acad. Sci. Paris Sér. I Math. 333 (2001), 191–194.

[PF] Pastur, L. and Figotin, A.: Spectra of Random and Almost-Periodic Operators, Springer-Verlag, Heidelberg, 1992.

[SW] Schulenberger, J. and Wilcox, C.: Coerciveness inequalities for nonelliptic systems of partial differential equations, Arch. Rational Mech. Anal. 88 (1971), 229–305.

[WBLR] Wiersma, D., Bartolini, P., Lagendijk, A. and Righini, R.: Localization of light in a disordered medium, Nature 390 (1997), 671–673.

[Wi] Wilcox, C.: Wave operators and asymptotic solutions of wave propagation problems of classical physics, Arch. Rational Mech. Anal. 22 (1966), 37–78.


Mathematical Physics, Analysis and Geometry 7: 187–192, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Forces along Equidistant Particle Paths

P. COULTON and G. GALPERIN
Department of Mathematics, Eastern Illinois University, 600 Lincoln Avenue, Charleston, IL 61920, U.S.A. e-mail: {cfprc, cfgg}@eiu.edu

(Received: 28 October 2002; in final form: 7 May 2003)

Abstract. Two particles on the sphere leave the equator moving due south and travel at a constant and equal speed along geodesics, colliding at the south pole. An observer who is unaware of the curvature of the space will conclude that there is an attractive force acting between the particles. On the other hand, if particles travel at the same speed (initially parallel) along geodesics in the hyperbolic plane, then the particle paths diverge. Imagine two particles in the hyperbolic plane that are bound together at a constant distance with their center of mass traveling along a geodesic path at a constant velocity; then the force due to the curvature of the space acts to break the bond and increases as a quadratic function of the velocity. We consider this problem for the sphere and the hyperbolic plane and we give the exact formula for the apparent force between the particles.

Mathematics Subject Classifications (2000): 53Axx, 70Exx, 85-XX.

Key words: geodesic, curvature, relativity.

1. Introduction

In this paper we wish to study the apparent force on particles traveling in a two-dimensional space with constant sectional curvature. It is well known that geodesics in positively curved space will tend to converge and that particles traveling along geodesic paths in negatively curved space will tend to diverge.

Let M denote a 2-dimensional surface of constant nonzero curvature with Riemannian metric g( , ). We will assume that the mass m is constant unless stated otherwise. Let σ(s) denote a geodesic path in the manifold M such that $\sigma(0) = x_0$ and $\frac{d}{ds}\sigma(s)\big|_{s=0} = u_0$, where the magnitude of $u_0$ is 1. Recall that a geodesic path parametrized by arc length always has unit speed. We define an inertial path, µ(t), as a path along a geodesic such that the speed is constant and equal to the initial speed $v_0$. In other words:

$$\frac{d}{dt}\mu(t) = \frac{d}{dt}\sigma(v_0 t),$$

where σ(t) is a geodesic. We say that a path is a constant speed path if it has no tangential acceleration component. Let λ(t) denote a constant speed path. The external force required for the particle to follow this path is given by

$$F = m\,\frac{d^2\lambda(t)}{dt^2}.$$


Let $p_1$ and $p_2$ denote particles, each of mass m, traveling along constant speed paths $\lambda_1(t)$ and $\lambda_2(t)$ respectively, such that the two particles move at a constant speed v at a distance d/2 from the central inertial path. We will say that such a pair of particles along with their respective paths are coupled. The force required to keep each on its path is the coupling force. This is precisely the tension in some imaginary connecting rod. The coupling force is positive when paths are convergent and negative when the paths are divergent. Assume that the direction from the midpoint to the path denoted by $\lambda_1(t)$ is the positive direction; then the coupling force is defined by

$$F_c = m\left(\frac{d^2\lambda_1(t)}{dt^2} - \frac{d^2\lambda_2(t)}{dt^2}\right),$$

where the force may be assumed to be defined by the tension of the connecting rod at the midpoint. We observe that if the force on the particles is only in the direction of the connecting rod and the forces at the midpoint due to the motion are equal and in opposite directions, then the midpoint will continue to move along the geodesic at a constant speed. This is expressed by the Lagrangian equation applied at the midpoint:

$$m\left(\frac{d^2\lambda_1(t)}{dt^2} + \frac{d^2\lambda_2(t)}{dt^2}\right) = 0.$$

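We also observe that the energy is given by

$$E = \frac{m}{2}\left(\frac{d\lambda_1}{dt}\cdot\frac{d\lambda_1}{dt}\right) + \frac{m}{2}\left(\frac{d\lambda_2}{dt}\cdot\frac{d\lambda_2}{dt}\right),$$

and that the energy is minimized whenever

$$\frac{dE}{dt} = m\left(\frac{d\lambda_1}{dt}\cdot\frac{d^2\lambda_1}{dt^2}\right) + m\left(\frac{d\lambda_2}{dt}\cdot\frac{d^2\lambda_2}{dt^2}\right) = 0.$$

We conclude that the motion is possible provided that the velocity and the acceleration are perpendicular at each point. This fact follows from the definition of motion in this problem.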

We will prove the following:

THEOREM. Let M be a manifold with constant nonzero sectional curvature K. Then the coupling force between two coupled particles of mass m moving at speed v at a distance d/2 from a central inertial path is given by

$$F_c = 2m\,\kappa_{\lambda(t)}\,v^2,$$

where $\kappa_{\lambda(t)}$ is the geodesic curvature for the path λ(t) and where

$$\kappa_{\lambda(t)} = \sqrt{K}\,\tan\!\left(\frac{\sqrt{K}\,d}{2}\right)$$

in the case of positive curvature, and

$$\kappa_{\lambda(t)} = -\sqrt{-K}\,\tanh\!\left(\frac{\sqrt{-K}\,d}{2}\right)$$

in the case of negative curvature.

Note that this force is essentially a function of the geodesic curvature and thus of the geometry of the given curve and the manifold.

2. Proof of the Theorem

The geodesic curvature is the signed magnitude of the arc-length parameter derivative of a unit tangent vector along the given curve, where the sign is determined by the orientation of the curve with respect to the manifold. The case of positive sectional curvature is straightforward. Recall that the geodesic curvature $\kappa_\rho$ for a circle of radius ρ on the 2-sphere is given by

$$\kappa_\rho = \sqrt{K}\,\cot(\rho\sqrt{K}).$$

Observe that if the particles in question move at unit speed, then the second derivative with respect to the arc-length is the geodesic curvature. If

$$\frac{d^2\lambda(t)}{dt^2} = \kappa_\rho, \quad\text{then}\quad \frac{d^2\lambda(vt)}{dt^2} = v^2\kappa_\rho.$$

The radius of our curve with respect to the north pole of the sphere is ρ = (rπ − d)/2. This yields

$$\kappa_\rho = \sqrt{K}\,\cot\!\left(\frac{r\pi\sqrt{K}}{2} - \frac{d\sqrt{K}}{2}\right) = \sqrt{K}\,\tan\!\left(\frac{d\sqrt{K}}{2}\right),$$

where $\sqrt{K} = 1/r$, so that $r\pi\sqrt{K}/2 = \pi/2$. We conclude that the magnitude of the force on each particle is given by

$$F = m v^2 \kappa_{\lambda(t)},$$

where λ(t) represents a curve of constant speed at a constant distance d/2 from some geodesic. This proves the theorem in the case of positive curvature.
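The positive-curvature formula can be cross-checked against elementary geometry. The following is a minimal sketch, not part of the original argument: it assumes the sphere of radius r is embedded in Euclidean 3-space and that the curve at distance d/2 from the equator is a latitude circle; the function names are illustrative only.

```python
import math

def geodesic_curvature_sphere(K, d):
    """Closed form from the text for K > 0: sqrt(K) * tan(sqrt(K) * d / 2)."""
    root = math.sqrt(K)
    return root * math.tan(root * d / 2)

def latitude_curvature(r, d):
    """Geodesic curvature of the latitude circle at distance d/2 from the
    equator of a sphere of radius r, computed from Euclidean geometry."""
    phi = d / (2 * r)                   # latitude angle of the circle
    euclid_radius = r * math.cos(phi)   # radius of the circle in R^3
    # The curvature vector in R^3 has length 1/euclid_radius; the part
    # tangent to the sphere carries the factor sin(phi).
    return math.sin(phi) / euclid_radius

r, d = 2.0, 0.7
K = 1 / r**2
print(geodesic_curvature_sphere(K, d), latitude_curvature(r, d))
```

Both numbers agree to rounding error for any d with 0 < d < πr, which is consistent with the identity cot(π/2 − x) = tan x used above.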

We next consider the case of a space form M with sectional curvature $K = -1/k^2$. The Minkowski space is defined by

$$\mathbb{R}^{2,1} = \{(x, y, z) \mid x, y, z \in \mathbb{R}\},$$

such that the distance between points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ is given by

$$\rho(P,Q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 - (z_1 - z_2)^2}.$$


Figure 1. The hyperbolic case.

Then the hyperbolic plane $H^2(-k)$ is defined by the locus

$$x^2 + y^2 + \frac{1}{k^2} = z^2,$$

subject to the induced metric. We obtain a coordinate system for the hyperbolic plane in the xy-plane under the projection (see Figure 1)

$$(x, y, z) \longmapsto (x, y).$$

A parametric equation for a curve of constant distance from the y-axis is given by

$$\sigma(t) = \begin{pmatrix} \alpha \\ t \\ \sqrt{\alpha^2 + t^2 + 1/k^2} \end{pmatrix}.$$

Two curves of this type, i.e. symmetric about the y-axis, will give a pair of inertial paths that are equidistant for all t. However, it is convenient to do our calculations at the origin of the coordinate system. It suffices to translate the curve σ(t) back to the origin using an appropriate isometry in hyperbolic space:

$$\lambda(t) = \begin{pmatrix} \sqrt{k^2\alpha^2 + 1} & 0 & -\alpha k \\ 0 & 1 & 0 \\ -\alpha k & 0 & \sqrt{k^2\alpha^2 + 1} \end{pmatrix} \begin{pmatrix} \alpha \\ t \\ \sqrt{\alpha^2 + t^2 + 1/k^2} \end{pmatrix}.$$


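At t = 0, we obtain

$$\frac{d^2\lambda}{dt^2}(0) = \begin{pmatrix} -\alpha k^2/\sqrt{k^2\alpha^2 + 1} \\ 0 \\ k/\sqrt{k^2\alpha^2 + 1} \end{pmatrix},$$

and

$$\left\| \frac{d^2\lambda(0)}{dt^2} \right\| = \frac{\alpha k^2}{\sqrt{k^2\alpha^2 + 1}}.$$

The velocity for this curve is not constant but it does take an extremum at t = 0 and the curve has unit speed at t = 0. Thus the second derivative at t = 0 gives the geodesic curvature. We conclude that

$$\kappa_{\lambda(t)} = k\tanh(kd/2), \quad\text{where}\quad \alpha = \frac{1}{k}\sinh(dk/2).$$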
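As a numerical sanity check of the last step, one can differentiate the translated curve by finite differences. The sketch below is illustrative only: it assumes the curve λ(t) is the matrix product written above, compares only the x-component of the second derivative (the component tangent to the hyperboloid at the origin), and uses hypothetical function names and an arbitrary step size h.

```python
import math

def lam(t, k, alpha):
    """Translated curve lambda(t): the isometry applied to sigma(t)."""
    c = math.sqrt(k * k * alpha * alpha + 1)
    x, y, z = alpha, t, math.sqrt(alpha**2 + t**2 + 1 / k**2)
    return (c * x - alpha * k * z, y, -alpha * k * x + c * z)

def second_derivative_x(k, alpha, h=1e-4):
    """Central-difference estimate of the x-component of lambda''(0)."""
    f = lambda t: lam(t, k, alpha)[0]
    return (f(h) - 2 * f(0.0) + f(-h)) / h**2

k, d = 1.5, 0.8
alpha = math.sinh(d * k / 2) / k        # relation between alpha and d from the text
approx = abs(second_derivative_x(k, alpha))
exact = k * math.tanh(k * d / 2)        # closed form from the text
print(approx, exact)
```

The two values agree to roughly the accuracy of the finite-difference scheme, which supports the substitution α = (1/k) sinh(dk/2).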

Therefore, the coupling force is

$$F_c = -2mv^2\sqrt{-K}\,\tanh\!\left(d\sqrt{-K}/2\right).$$

If $\sqrt{|K|}\,d$ is sufficiently small, we may approximate the coupling force by $|F_c| \approx 2W|K|d$, where W is the kinetic energy. This formula holds in both cases.
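The two closed forms and the small-distance estimate can be compared numerically. This is a minimal sketch under one stated assumption: W is taken to be the kinetic energy mv²/2 of a single particle, which is the reading under which the estimate matches the first-order expansion of tan and tanh. The function names are hypothetical.

```python
import math

def coupling_force(m, v, K, d):
    """Coupling force from the theorem for constant nonzero curvature K."""
    if K > 0:
        root = math.sqrt(K)
        return 2 * m * v * v * root * math.tan(root * d / 2)
    if K < 0:
        root = math.sqrt(-K)
        return -2 * m * v * v * root * math.tanh(root * d / 2)
    raise ValueError("the theorem assumes nonzero curvature")

def small_distance_estimate(m, v, K, d):
    """|F_c| ~ 2 W |K| d, with W = m v^2 / 2 (assumed: one particle's energy)."""
    W = 0.5 * m * v * v
    return 2 * W * abs(K) * d

for K in (0.01, -0.01):
    exact = abs(coupling_force(1.0, 3.0, K, 0.1))
    estimate = small_distance_estimate(1.0, 3.0, K, 0.1)
    print(K, exact, estimate)
```

For these sample values the relative difference is of order |K|d², as expected from the next term in the series for tan and tanh.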

3. Conclusion

Particles that are bound together by chemical or nuclear forces may be studied as though they were connected by a massless rod. Force fields in three space, i.e. gravitation, electric and magnetic fields, have characteristics of both positive and negative curvature, depending on the charge of the particles and the motion relative to the field.

Observe that the coupling force in each case is related to the kinetic energy and is nearly a linear function of the distance for small distances. From a relativistic point of view, we see that even when the velocity is bounded, the kinetic energy is not. As the kinetic energy increases so does the coupling force.

In the case of a negatively curved manifold, the coupling force necessary to keep the particles together must be applied inward, but if the kinetic energy is large enough and the absolute value of the curvature is increasing, then the typical chemical or nuclear binding forces between particles may be broken.

High energy particles are sometimes injected into relatively large electric fields or gravitational fields to separate elementary particles. The coupling force relation in this paper should give a reasonable estimate of the force of separation (or combination) in such cases.

Finally, note that the force required to keep a particle moving along a path at constant speed is proportional to the geodesic curvature. One must compute the geodesic curvature to solve the problem on a general surface, assuming no rotation about the center of mass.

Acknowledgement

The authors would like to thank the referee for comments which helped to simplify the proof of the theorem.

Reference

1. Kobayashi, S. and Nomizu, K., Foundations of Differential Geometry, Vol. II, Interscience, New York, 1969.


Mathematical Physics, Analysis and Geometry 7: 193–221, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Three Term Recursion Relation for Spherical Functions Associated to the Complex Projective Plane

INÉS PACHARONI and JUAN A. TIRAO*

CIEM-FaMAF, Universidad Nacional de Córdoba, Córdoba 5000, Argentina. {pacharon, tirao}@mate.uncor.edu

(Received: 27 February 2003)

Abstract. The aim of this paper is to prove a three term recursion relation for a sequence of matrix valued functions Φ(g,w) on G = SU(3) built up of ℓ + 1 spherical functions of a given type $\pi_{n,\ell}$, associated to the complex projective plane G/K, K = S(U(2) × U(1)). The three term recursion relation that constitutes our main result, Theorem 5.2, together with the fact that the functions Φ(g,w) are eigenfunctions of all differential operators on G which are left invariant under G and right invariant under K, provides for each $\ell \in \mathbb{N}_0$ a solution of a matrix valued extension of Bochner's problem to G. In fact by restriction to an Abelian Iwasawa subgroup of G, for each $\ell \in \mathbb{N}_0$, we obtain a sequence H(t,w) of matrix valued polynomial functions of t which satisfies a three term recursion relation and such that they are eigenfunctions of a second order differential operator on 0 < t < 1. Thus each sequence H(t,w) satisfies both conditions explicitly asked for by Bochner.

Mathematics Subject Classifications (2000):

Key words:

1. Introduction

The general theory of scalar valued spherical functions of arbitrary type, associated to a pair (G,K) with G a locally compact group and K a compact subgroup, goes back to Godement and Harish-Chandra. In [9], attention is focused on the underlying matrix valued spherical functions defined as a solution of an integral identity, see Definition 2.1. These two notions are related by the operation of taking traces.

When G is a Lie group the general theory, see [9, 3], gives for a fixed irreducible representation (π, V) of K a family of matrix valued functions that are eigenfunctions of a system of left invariant differential operators defined on the Lie group G. These spherical functions in fact take values in the set of linear maps from V into itself.

* Partially supported by CONICET grant PIP655-98.


In [4] one finds a detailed elaboration of this theory when the symmetric space G/K is the complex projective plane. In this case we have G = SU(3) and K = S(U(2) × U(1)). In particular one constructs, out of several spherical functions of a given type $\pi_{n,\ell}$, $n \in \mathbb{Z}$ and $\ell \in \mathbb{N}_0$, a sequence of (ℓ + 1) × (ℓ + 1)-matrix valued polynomial functions H(t,w), 0 < t < 1, w ≥ max{0, −n}, such that as functions of the spectral parameter w they satisfy a three term recursion relation of the form

$$tH(t,w) = A_w H(t,w-1) + B_w H(t,w) + C_w H(t,w+1), \tag{1}$$

where $A_w, B_w, C_w$ are matrices independent of t. On the other hand it is also proved that the functions H(t,w) satisfy a differential equation of the form

$$DH(t,w)^T = H(t,w)^T \Lambda, \tag{2}$$

where D is a second order differential operator in the variable t whose coefficients depend on t (and not on w). Here Λ is a diagonal matrix with entries that depend on w but not on t.

In 1929, S. Bochner [1] solved the problem of determining all families of scalar valued orthogonal polynomials that are eigenfunctions of some arbitrary but fixed second order differential operator. From this point of view one can interpret the results (1) and (2) as an instance of a matrix valued solution to Bochner’s problem.

In [4] the three term recursion relation (1) was conjectured for all spherical functions of type $\pi_{n,\ell}$ for n ≥ 0 and any ℓ ≥ 0. In [5] explicit formulae for the coefficient matrices $A_w$, $B_w$ and $C_w$ were given as well as a sketch of the way in which the tensor product of certain representations of G could be used to get this result.

The aim of this paper is to give a proof of (1). Our strategy is to work with a sequence of $(\ell+1)^2 \times (\ell+1)$-matrix valued functions Φ(g,w) on G built up of ℓ + 1 spherical functions of a given type $\pi_{n,\ell}$, which parallels the construction of H(t,w). The three term recursion relation that constitutes our main result, Theorem 5.2, as given by

$$\phi(g)\psi(g)\Phi(g,w) = A_w\Phi(g,w-1) + B_w\Phi(g,w) + C_w\Phi(g,w+1) \tag{3}$$

generalizes (1) for spherical functions on G of arbitrary type. It is important to stress that this relation is valid on G, and not just on a one-dimensional submanifold of G. On the other hand from Proposition 2.3(iii) it follows that for each w, Φ is an eigenfunction of all differential operators on G which are left invariant under G and right invariant under K. Thus the sequence Φ(g,w) provides an extension to G of the matrix valued version of Bochner’s problem. Moreover, (1) follows easily from (3) by restriction, see Proposition 5.3.

The functions H(t,w) are polynomials in t and are seen in [4, Section 7] to satisfy orthogonality relations. Nevertheless it is important to notice that the classical argument that gives from here a three term recursion relation cannot be applied directly. It may also be important to repeat a remark from [4]: the polynomial matrix valued functions considered in Section 12 do not satisfy all the conditions of the theory in [2]. Besides it is not at all clear how to go from the left-hand side of (1) to the left-hand side of (3).

An example that was very useful in the development of our work is the following, which corresponds to the spherical functions of one-dimensional types. For each one-dimensional representation $\pi = \pi_{n,0}$ of K the functions H(t,w), with w ≥ max{0, −n}, associated in [4] to the irreducible spherical functions Φ(g,w) of type $\pi_{n,0}$ are given by

$$H(t,w) = \begin{cases} \dfrac{(-1)^w}{w+1}\,P^{(n,1)}_w(1-2t), & \text{if } n \ge 0,\ w \ge 0, \\[2ex] \dfrac{(-1)^{w+n}\,t^{-n}}{w+n+1}\,P^{(-n,1)}_{w+n}(1-2t), & \text{if } n < 0,\ w \ge -n, \end{cases}$$

here $P^{(\alpha,\beta)}_j(z)$ are the Jacobi polynomials in the interval [−1, 1]. These polynomials satisfy the following three term recursion relation

$$\begin{aligned} (2j+\alpha+\beta+1)&(2j+\alpha+\beta)(2j+\alpha+\beta+2)\,z\,P^{(\alpha,\beta)}_j(z) \\ &= 2(j+\alpha)(j+\beta)(2j+\alpha+\beta+2)\,P^{(\alpha,\beta)}_{j-1}(z) \\ &\quad - (2j+\alpha+\beta+1)(\alpha^2-\beta^2)\,P^{(\alpha,\beta)}_j(z) \\ &\quad + 2(j+1)(j+\alpha+\beta+1)(2j+\alpha+\beta)\,P^{(\alpha,\beta)}_{j+1}(z). \end{aligned}$$

tH (t, w) = AwH(t,w − 1) + BwH(t,w) + CwH(t,w + 1), (4)

where Aw,Bw and Cw are suitable constants. Moreover, appealing to property (ii)in Proposition 2.3, for all g ∈ G we get

φ(g)ψ(g)�(g,w) = Aw�(g,w − 1) + Bw�(g,w) + Cw�(g,w + 1), (5)

where φ(g) is the spherical function �(g, 1) of type π−1,0, and ψ(g) is the spher-ical function �(g, 0) of type π1,0, which is an instance of the three term recursionrelation (1).

2. Background

The aim of this section is to collect the necessary material to obtain a three term recursion relation for the spherical functions associated to the pair (G,K) = (SU(3), S(U(2) × U(1))).


2.1. THE LIE ALGEBRA OF SU(3)

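The Lie algebra of G is $\mathfrak{g} = \{X \in \mathfrak{gl}(3,\mathbb{C}) : X = -\bar{X}^t,\ \operatorname{tr} X = 0\}$. Its complexification is $\mathfrak{g}_{\mathbb{C}} = \mathfrak{sl}(3,\mathbb{C})$. The Lie algebra $\mathfrak{k}$ of K can be identified with $\mathfrak{u}(2)$ and its complexification $\mathfrak{k}_{\mathbb{C}}$ with $\mathfrak{gl}(2,\mathbb{C})$. The following matrices form a basis of $\mathfrak{g}$.

$$H_1 = \begin{bmatrix} i & 0 & 0 \\ 0 & -i & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad H_2 = \begin{bmatrix} i & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -2i \end{bmatrix},$$

$$Y_1 = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad Y_2 = \begin{bmatrix} 0 & i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$

$$Y_3 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \quad Y_4 = \begin{bmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{bmatrix},$$

$$Y_5 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}, \quad Y_6 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & i \\ 0 & i & 0 \end{bmatrix}.$$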

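Let $\mathfrak{h}$ be the Cartan subalgebra of $\mathfrak{g}_{\mathbb{C}}$ of all diagonal matrices. The corresponding root space structure is given by

$$X_\alpha = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad X_{-\alpha} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad H_\alpha = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$

$$X_\beta = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad X_{-\beta} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad H_\beta = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix},$$

$$X_\gamma = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad X_{-\gamma} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad H_\gamma = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix},$$

where

$$\alpha(x_1E_{11} + x_2E_{22} + x_3E_{33}) = x_1 - x_2,$$

$$\beta(x_1E_{11} + x_2E_{22} + x_3E_{33}) = x_2 - x_3,$$

$$\gamma(x_1E_{11} + x_2E_{22} + x_3E_{33}) = x_1 - x_3.$$

We have

$$X_\alpha = \tfrac12(Y_1 - iY_2), \quad X_\beta = \tfrac12(Y_5 - iY_6), \quad X_\gamma = \tfrac12(Y_3 - iY_4),$$

$$X_{-\alpha} = -\tfrac12(Y_1 + iY_2), \quad X_{-\beta} = -\tfrac12(Y_5 + iY_6), \quad X_{-\gamma} = -\tfrac12(Y_3 + iY_4).$$

We take {α, β, γ} as the set of positive roots of $(\mathfrak{g}_{\mathbb{C}}, \mathfrak{h})$. Then $\lambda_\alpha = \tfrac13(2\alpha + \beta)$ and $\lambda_\beta = \tfrac13(\alpha + 2\beta)$ are the corresponding fundamental weights.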


2.2. SPHERICAL FUNCTIONS

Let G be a locally compact unimodular group and let K be a compact subgroup of G. Let $\hat{K}$ denote the set of all equivalence classes of complex finite-dimensional irreducible representations of K; for each $\delta \in \hat{K}$, let $\xi_\delta$ denote the character of δ, d(δ) the degree of δ, i.e. the dimension of any representation in the class δ, and $\chi_\delta = d(\delta)\xi_\delta$. We shall choose once and for all the Haar measure dk on K normalized by $\int_K dk = 1$.

We shall denote by V a finite-dimensional vector space over the field $\mathbb{C}$ of complex numbers and by End(V) the space of all linear transformations of V into V.

DEFINITION 2.1 ([9, 3]). A spherical function Φ on G of type $\delta \in \hat{K}$ is a continuous function on G with values in End(V) such that

(i) Φ(e) = I (I = identity transformation).

(ii) $\Phi(x)\Phi(y) = \int_K \chi_\delta(k^{-1})\,\Phi(xky)\,dk$, for all x, y ∈ G.

PROPOSITION 2.2 ([9, 3]). If Φ: G → End(V) is a spherical function of type δ then:

(i) Φ(kgk′) = Φ(k)Φ(g)Φ(k′), for all k, k′ ∈ K, g ∈ G.

(ii) k ↦ Φ(k) is a representation of K such that any irreducible subrepresentation belongs to δ.

Spherical functions of type δ arise in a natural way upon considering representations of G. If g ↦ U(g) is a continuous representation of G, say on a finite-dimensional vector space E, then

$$P(\delta) = \int_K \chi_\delta(k^{-1})\,U(k)\,dk$$

is a projection of E onto P(δ)E = E(δ); E(δ) consists of those vectors in E, the linear span of whose K-orbit splits into irreducible K-subrepresentations of type δ. The function Φ: G → End(E(δ)) defined by

$$\Phi(g)a = P(\delta)U(g)a, \quad g \in G,\ a \in E(\delta)$$

is a spherical function of type δ. In fact, if a ∈ E(δ) we have

$$\begin{aligned} \Phi(x)\Phi(y)a &= P(\delta)U(x)P(\delta)U(y)a \\ &= \int_K \chi_\delta(k^{-1})\,P(\delta)U(x)U(k)U(y)a\,dk \\ &= \left(\int_K \chi_\delta(k^{-1})\,\Phi(xky)\,dk\right)a. \end{aligned}$$

If the representation g ↦ U(g) is irreducible then the associated spherical function Φ is also irreducible. Conversely, any irreducible spherical function on a compact group arises in this way.


If G is a connected Lie group it is not difficult to prove that any spherical function Φ: G → End(V) is differentiable ($C^\infty$), and moreover that it is analytic. Let D(G) denote the algebra of all left invariant differential operators on G and let $D(G)^K$ denote the subalgebra of all operators in D(G) which are invariant under all right translations by elements in K.

In the following proposition (V, π) will be a finite-dimensional representation of K such that any irreducible subrepresentation belongs to the same class $\delta \in \hat{K}$.

PROPOSITION 2.3 ([9, 3]). A function Φ: G → End(V) is a spherical function of type δ if and only if

(i) Φ is analytic.

(ii) Φ(kgk′) = π(k)Φ(g)π(k′), for all k, k′ ∈ K, g ∈ G, and Φ(e) = I.

(iii) [DΦ](g) = Φ(g)[DΦ](e), for all $D \in D(G)^K$, g ∈ G.

2.3. IRREDUCIBLE REPRESENTATIONS OF GL(3, C)

We recall here some basic facts about the representation theory of GL(n, C), which can be found, for example, in [10, §67].

The equivalence classes of finite-dimensional irreducible holomorphic representations of GL(n, C) are parameterized by the n-tuples of integers

$$m = (m_1, \dots, m_n) \quad\text{such that}\quad m_1 \ge \dots \ge m_n.$$

We denote by $V_m$ the space of a representation in the class m.

A highest weight vector in $V_m$ is a vector $0 \ne v \in V_m$ invariant under the upper triangular subgroup N of GL(n, C). Since the subgroup of all diagonal matrices $\operatorname{diag}(e^{x_1}, \dots, e^{x_n})$ normalizes N and the subspace $V_m^N$ of all N-invariant vectors in $V_m$ is one-dimensional, it follows that the diagonal subgroup acts on $V_m^N$ by a character, namely for $v \in V_m^N$ we have

$$\operatorname{diag}(e^{x_1}, \dots, e^{x_n})\,v = e^{m_1x_1 + \dots + m_nx_n}\,v.$$

We identify GL(n − 1, C) with the subgroup of GL(n, C) in the following way

$$GL(n-1,\mathbb{C}) \simeq \begin{pmatrix} GL(n-1,\mathbb{C}) & 0 \\ 0 & 1 \end{pmatrix}.$$

When we restrict the representation m of GL(n, C) to GL(n − 1, C) it decomposes as the direct sum of representations of GL(n − 1, C) in the classes $k = (k_1, \dots, k_{n-1})$ such that

$$m_1 \ge k_1 \ge m_2 \ge k_2 \ge \dots \ge k_{n-1} \ge m_n.$$

Each of the representations of GL(n − 1, C) is contained in this decomposition exactly once.


In particular when n = 3 the space $V_m$ of a representation of GL(3, C) in the class $m = (m_1, m_2, m_3)$ decomposes as the direct sum $V_m = \bigoplus V_k$ of irreducible representations of GL(2, C), where the sum is over all classes k such that

$$m_1 \ge k_1 \ge m_2 \ge k_2 \ge m_3.$$

In turn each one of these representations $V_k$ decomposes as the direct sum $V_k = \bigoplus V_s$ of one-dimensional irreducible representations of GL(1, C) in the classes s = (s), such that $k_1 \ge s \ge k_2$.

Taking one nonzero vector from each one of these subspaces $V_s$, we get a basis for the entire space $V_m$. It is clear that each one of these vectors is uniquely determined, up to a scalar, by the following triangle of integers

$$\mu = \begin{matrix} m_1 \quad m_2 \quad m_3 \\ k_1 \quad k_2 \\ s \end{matrix},$$

where

$$m_1 \ge k_1 \ge m_2 \ge k_2 \ge m_3 \quad\text{and}\quad k_1 \ge s \ge k_2.$$

We denote by $v_\mu$ a basis vector corresponding to the triangle µ. It is easy to see that each basis vector $v_\mu$ is a weight vector of the Cartan subalgebra of gl(3, C) of all diagonal matrices $\operatorname{diag}(x_1, x_2, x_3)$, which act on it by

$$\operatorname{diag}(x_1, x_2, x_3)\,v_\mu = \bigl(x_1 s + x_2(k_1 + k_2 - s) + x_3(m_1 + m_2 + m_3 - k_1 - k_2)\bigr)\,v_\mu.$$

The basis $\{v_\mu\}_\mu$ taken above is known as a Gelfand–Cetlin basis of the module $V_m$.
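The interlacing conditions are easy to enumerate by machine. The following sketch lists the Gelfand–Cetlin triangles for a given m and compares their number with the Weyl dimension formula for GL(3, C); that formula is a standard fact supplied here as an assumption, not something stated in the text, and the function names are hypothetical.

```python
def gelfand_cetlin_patterns(m):
    """All triangles (m, k, s) with m1 >= k1 >= m2 >= k2 >= m3 and k1 >= s >= k2."""
    m1, m2, m3 = m
    return [
        ((m1, m2, m3), (k1, k2), s)
        for k1 in range(m2, m1 + 1)
        for k2 in range(m3, m2 + 1)
        for s in range(k2, k1 + 1)
    ]

def weyl_dimension(m):
    """Weyl dimension formula for GL(3, C) (standard fact, assumed here)."""
    m1, m2, m3 = m
    return (m1 - m2 + 1) * (m2 - m3 + 1) * (m1 - m3 + 2) // 2

def weight(pattern):
    """Weight of the basis vector attached to a triangle, as in the text."""
    (m1, m2, m3), (k1, k2), s = pattern
    return (s, k1 + k2 - s, m1 + m2 + m3 - k1 - k2)

patterns = gelfand_cetlin_patterns((2, 1, 0))
print(len(patterns), weyl_dimension((2, 1, 0)))
print(sorted(weight(p) for p in gelfand_cetlin_patterns((1, 0, 0))))
```

For m = (1, 0, 0) the three weights are the standard basis vectors, matching the canonical module on C³.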

Let $W = \mathbb{C}^3$ denote the canonical irreducible GL(3, C) module, and let $\{e_1, e_2, e_3\}$ be the canonical basis of $\mathbb{C}^3$. Then $e_1$ is a dominant vector of weight (1, 0, 0). Let $V = V_m$ be any irreducible GL(3, C) module with $m = (m_1, m_2, m_3)$. We are interested in the decomposition of the tensor product V ⊗ W as a direct sum of GL(3, C) irreducible submodules. The following proposition is a special case of the so called Pieri’s formula, see [10, §77].

PROPOSITION 2.4. For $V = V_m$, $m = (m_1, m_2, m_3)$ we have

$$V \otimes W \simeq V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3},$$

where $V^{\sigma_1}, V^{\sigma_2}, V^{\sigma_3}$ are irreducible GL(3, C) modules of parameters

$$\sigma_1 = (m_1 + 1, m_2, m_3), \quad \sigma_2 = (m_1, m_2 + 1, m_3), \quad \sigma_3 = (m_1, m_2, m_3 + 1).$$

Remark. The irreducible modules on the right-hand side whose parameters $(m_1', m_2', m_3')$ do not satisfy the conditions $m_1' \ge m_2' \ge m_3'$ have to be omitted. In other words the module $V^{\sigma_2}$ appears in the decomposition of V ⊗ W if and only if $m_1 \ne m_2$, and $V^{\sigma_3}$ appears if and only if $m_2 \ne m_3$.
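A quick dimension count illustrates Proposition 2.4 together with the omission rule in the remark. This is only a consistency check under the assumption that dimensions are given by the Weyl dimension formula for GL(3, C) (a standard fact not stated in the text); it does not prove the decomposition. The function names are hypothetical.

```python
def weyl_dimension(m):
    """Weyl dimension formula for GL(3, C) (standard fact, assumed here)."""
    m1, m2, m3 = m
    return (m1 - m2 + 1) * (m2 - m3 + 1) * (m1 - m3 + 2) // 2

def pieri_summands(m):
    """Parameters sigma_j = m + e_j, keeping only the dominant ones."""
    summands = []
    for j in range(3):
        sigma = list(m)
        sigma[j] += 1
        if sigma[0] >= sigma[1] >= sigma[2]:
            summands.append(tuple(sigma))
    return summands

m = (1, 1, 0)
print(pieri_summands(m))
print(3 * weyl_dimension(m), sum(weyl_dimension(s) for s in pieri_summands(m)))
```

For m = (1, 1, 0) the candidate σ₂ = (1, 2, 0) is dropped because m₁ = m₂, and the remaining dimensions still add up to dim V · dim W.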

Our Lie group K = S(U(2) × U(1)) is isomorphic to U(2) under the map

$$\begin{pmatrix} A & 0 \\ 0 & a \end{pmatrix} \mapsto A.$$

Let us recall that the identity representation $\pi_1$ of U(2) in $\mathbb{C}^2$, as well as its ℓ-th symmetric power $\pi_\ell\colon A \mapsto A^\ell$, A ∈ U(2), of dimension ℓ + 1, are irreducible. Moreover the representations $\pi_{n,\ell}$ of U(2) defined by

$$\pi_{n,\ell}(A) = (\det A)^n A^\ell, \quad n \in \mathbb{Z},\ \ell \in \mathbb{Z}_{\ge 0}$$

give a complete set of representatives of elements in $\widehat{U(2)}$. Then by composing $\pi_{n,\ell}$ with the above isomorphism we get an irreducible representation of K which we shall still call $\pi_{n,\ell}$. We shall refer to (n, ℓ) as the type of $\pi_{n,\ell}$.

On the other hand the reader can easily see that if $V_k = V_{k_1,k_2}$ is an irreducible GL(2, C) submodule of the GL(3, C) module $V_m$, $m = (m_1, m_2, m_3)$, then $V_k$ is also a K submodule of type (n, ℓ) given by

$$\ell = k_1 - k_2, \quad n = k_1 + 2k_2 - m_1 - m_2 - m_3. \tag{6}$$

In particular let $W_1$ be the irreducible GL(2, C) submodule of $W = V_{(1,0,0)}$ of dimension 1, i.e. $W_1 = \mathbb{C}e_3$. Then $W_1 = V_{0,0}$ as a K-module is of type (−1, 0).
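Formula (6) is simple enough to tabulate. The sketch below, with a hypothetical helper name, evaluates the type (n, ℓ) and reproduces the example of W₁; it also illustrates that raising one entry of m by 1 while keeping k fixed lowers n by 1 and leaves ℓ unchanged.

```python
def k_type(m, k):
    """Type (n, l) of the K-module V_k inside V_m, following formula (6)."""
    m1, m2, m3 = m
    k1, k2 = k
    return (k1 + 2 * k2 - m1 - m2 - m3, k1 - k2)

print(k_type((1, 0, 0), (0, 0)))                         # the module W_1 = C e_3
print(k_type((2, 1, 0), (2, 0)), k_type((3, 1, 0), (2, 0)))
```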

3. Multiplication Formulas

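The tensor product $V_k \otimes W_1$ is an irreducible GL(2, C) module of parameters $(k_1, k_2) + (0, 0) = (k_1, k_2)$. The GL(3, C) projection

$$P_j\colon V_m \otimes W \to V^{\sigma_j} \quad (\text{for } j = 1, 2, 3)$$

maps $V_k \otimes W_1$ onto the trivial module or onto the GL(2, C) submodule $V^{\sigma_j}_{k_1,k_2}$ of $V^{\sigma_j}$.

For any $v_\mu$ in the Gelfand–Cetlin basis of $V_m$ corresponding to the triangle

$$\mu = \begin{matrix} m_1 \quad m_2 \quad m_3 \\ k_1 \quad k_2 \\ k \end{matrix}$$

we have

$$v_\mu \otimes e_3 = v_1 + v_2 + v_3 \in V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3},$$

where the vectors $v_j$ are weight vectors in $V^{\sigma_j}$ and belong to the GL(2, C) submodules $V^{\sigma_j}_{k_1,k_2}$. Thus the corresponding triangles of $v_1, v_2, v_3$ are respectively

$$\begin{matrix} m_1+1 \quad m_2 \quad m_3 \\ k_1 \quad k_2 \\ k \end{matrix}, \qquad \begin{matrix} m_1 \quad m_2+1 \quad m_3 \\ k_1 \quad k_2 \\ k \end{matrix}, \qquad \begin{matrix} m_1 \quad m_2 \quad m_3+1 \\ k_1 \quad k_2 \\ k \end{matrix}.$$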


We note that the vector $v_\mu \otimes e_3$ is of weight

$$(k,\ k_1 + k_2 - k,\ m_1 + m_2 + m_3 + 1 - k_1 - k_2)$$

and each $V^{\sigma_j}_{k_1,k_2}$ is an irreducible K-module of type $(k_1 + 2k_2 - m_1 - m_2 - m_3 - 1, \ell) = (n - 1, \ell)$.

It is well known (see [7, p. 32]) that there exists a basis $\{v_i\}_{i=0}^{\ell}$ of $V_k$ such that

$$\begin{aligned} \pi(H_\alpha)v_i &= (\ell - 2i)\,v_i, \\ \pi(X_\alpha)v_i &= (\ell - i + 1)\,v_{i-1}, \quad (v_{-1} = 0), \\ \pi(X_{-\alpha})v_i &= (i + 1)\,v_{i+1}, \quad (v_{\ell+1} = 0), \end{aligned} \tag{7}$$

where n and ℓ are given by (6).

Therefore we can normalize the basis of $V_k$, taken from the Gelfand–Cetlin basis of $V_m$, in such a way that (7) holds.
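The relations (7) can be realised by explicit (ℓ + 1) × (ℓ + 1) matrices and checked against the commutation relations of the matrices $H_\alpha$, $X_\alpha$, $X_{-\alpha}$ of Section 2.1. The sketch below is illustrative: it assumes the convention that column i of a matrix holds the image of $v_i$, uses plain nested lists, and all helper names are hypothetical.

```python
def sl2_matrices(l):
    """Matrices of H_alpha, X_alpha, X_{-alpha} in the basis v_0, ..., v_l of (7).
    Column i of each matrix is the image of v_i."""
    n = l + 1
    H = [[0] * n for _ in range(n)]
    X = [[0] * n for _ in range(n)]
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = l - 2 * i
        if i >= 1:
            X[i - 1][i] = l - i + 1   # X_alpha v_i = (l - i + 1) v_{i-1}
        if i + 1 < n:
            Y[i + 1][i] = i + 1       # X_{-alpha} v_i = (i + 1) v_{i+1}
    return H, X, Y

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][s] * B[s][c] for s in range(n)) for c in range(n)] for r in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(len(A))] for r in range(len(A))]

H, X, Y = sl2_matrices(4)
print(commutator(X, Y) == H)
```

The check confirms that the coefficients in (7) are compatible with $[X_\alpha, X_{-\alpha}] = H_\alpha$, which holds for the 3 × 3 matrices listed earlier.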

LEMMA 3.1. Let us consider a U(2) invariant inner product on $V_k$. Then the basis $\{v_i\}_{i=0}^{\ell}$ described above is an orthogonal basis such that

$$\|v_i\|^2 = \binom{\ell}{i}\|v_0\|^2.$$

Proof. Let ∗ denote the adjoint operator in $\operatorname{End}(V_k)$ corresponding to an invariant inner product on $V_k$. Since $\pi(Y)^* = -\pi(Y)$ for all $Y \in \mathfrak{g}$ we have

$$\pi(H_\alpha)^* = i\,\pi(H_1)^* = -i\,\pi(H_1) = \pi(H_\alpha),$$

$$\pi(X_{-\alpha})^* = -\tfrac12\bigl(\pi(Y_1) + i\,\pi(Y_2)\bigr)^* = -\tfrac12\bigl(-\pi(Y_1) + i\,\pi(Y_2)\bigr) = \pi(X_\alpha).$$

Since $\pi(H_\alpha)^* = \pi(H_\alpha)$ and the $v_i$’s are eigenvectors corresponding to different eigenvalues of $\pi(H_\alpha)$ they are orthogonal to each other.

Now the proof will be completed by induction on 0 ≤ i ≤ ℓ. The statement is clearly true for i = 0. Let us assume that the assertion is true for some 0 ≤ i ≤ ℓ − 1. Then

$$(i + 1)\langle v_{i+1}, v_{i+1}\rangle = \langle \pi(X_{-\alpha})v_i, v_{i+1}\rangle = \langle v_i, \pi(X_\alpha)v_{i+1}\rangle = (\ell - i)\langle v_i, v_i\rangle.$$

Thus

$$\langle v_{i+1}, v_{i+1}\rangle = \frac{\ell - i}{i + 1}\binom{\ell}{i}\langle v_0, v_0\rangle = \binom{\ell}{i+1}\langle v_0, v_0\rangle. \qquad\Box$$
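The induction step amounts to the recurrence ‖v_{i+1}‖² = ((ℓ − i)/(i + 1)) ‖v_i‖². A short sketch, assuming the normalization ‖v₀‖ = 1 and using a hypothetical function name, confirms that this recurrence produces the binomial coefficients.

```python
from fractions import Fraction
from math import comb

def squared_norms(l):
    """Squared norms of v_0, ..., v_l from the recurrence in the proof,
    assuming the normalization ||v_0|| = 1."""
    norms = [Fraction(1)]
    for i in range(l):
        norms.append(norms[i] * Fraction(l - i, i + 1))
    return norms

print(squared_norms(5))
print([comb(5, i) for i in range(6)])
```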

PROPOSITION 3.2. Let $\{v_i\}_{i=0}^{\ell}$ be a basis of $V_k$, $k = (k_1, k_2)$, such that (7) holds, and equip $V_m$ with a G-invariant inner product such that $\|v_0\| = 1$. Similarly take on W the G-invariant inner product such that $\|e_3\| = 1$. Let $a_i = a_i(m, k)$ be defined by

$$v_0 \otimes e_3 = a_1 v_0^{\sigma_1} + a_2 v_0^{\sigma_2} + a_3 v_0^{\sigma_3} \in V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3}, \tag{8}$$

with $a_j > 0$ and $\|v_0^{\sigma_j}\| = 1$. Let $v_i^{\sigma_j} \in V^{\sigma_j}$ be defined by

$$v_i \otimes e_3 = a_1 v_i^{\sigma_1} + a_2 v_i^{\sigma_2} + a_3 v_i^{\sigma_3}.$$

Then $\{v_i^{\sigma_j}\}_{i=0}^{\ell}$ (j = 1, 2, 3) is a basis of an irreducible GL(2, C) module $V_k^{\sigma_j}$ contained in $V^{\sigma_j}$ such that (7) holds. Hence

$$\|v_i^{\sigma_j}\|^2 = \binom{\ell}{i}.$$

Remark. If $P_j(v_0 \otimes e_3) = 0$ we take $a_j = 0$ and we do not define $v_i^{\sigma_j}$.

Proof. Since $P_j$ is in particular a GL(2, C) morphism and $e_3$ is GL(2, C) invariant, from (8) it follows that each $v_0^{\sigma_j}$ is a GL(2, C) dominant vector of weight $(k_1, k_2)$.

On the other hand we have

$$\begin{aligned} a_1 X_{-\alpha}^i(v_0^{\sigma_1}) + a_2 X_{-\alpha}^i(v_0^{\sigma_2}) + a_3 X_{-\alpha}^i(v_0^{\sigma_3}) &= X_{-\alpha}^i(v_0 \otimes e_3) = i!\,v_i \otimes e_3 \\ &= i!\,(a_1 v_i^{\sigma_1} + a_2 v_i^{\sigma_2} + a_3 v_i^{\sigma_3}). \end{aligned}$$

Therefore $X_{-\alpha}^i(v_0^{\sigma_j}) = i!\,v_i^{\sigma_j}$ for j = 1, 2, 3. This completes the proof of the proposition. □

THEOREM 3.3. Let Φ be the irreducible spherical function of type $(n, \ell) = (k_1 + 2k_2 - m_1 - m_2 - m_3,\ k_1 - k_2)$ associated to the G module $V_m$ and the K submodule $V_k = V_{k_1,k_2}$. Let φ be the spherical function of type (−1, 0) associated to the G module W. Let $\Phi^{\sigma_j}$ be the spherical functions of type (n − 1, ℓ) associated to the G modules $V^{\sigma_j}$ (j = 1, 2, 3). Then

$$\phi(g)\Phi(g) = a_1^2\,\Phi^{\sigma_1}(g) + a_2^2\,\Phi^{\sigma_2}(g) + a_3^2\,\Phi^{\sigma_3}(g). \tag{9}$$

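Proof. Let $u_i = \binom{\ell}{i}^{-1/2} v_i$ and let $u_i^{\sigma_j} = \binom{\ell}{i}^{-1/2} v_i^{\sigma_j}$. Then $\{u_i\}_0^{\ell}$ and $\{u_i^{\sigma_j}\}_0^{\ell}$ are, respectively, orthonormal bases of $V_{\mathbf{k}}$ and $V_{\mathbf{k}}^{\sigma_j}$ for j = 1, 2, 3.

We recall now that if Φ is the spherical function associated to the G module $V_{\mathbf{m}}$ and the K submodule $V_{\mathbf{k}}$ then by definition Φ(g)a = P(ga) for all g ∈ G and all a ∈ $V_{\mathbf{k}}$, where P is the K-projection of $V_{\mathbf{m}}$ onto $V_{\mathbf{k}}$. Therefore if $\{u_i\}_0^{\ell}$ is an orthonormal basis of $V_{\mathbf{k}}$ we have

\[
\Phi_{ij}(g) = \langle \Phi(g)u_j, u_i\rangle = \langle P(gu_j), u_i\rangle = \langle gu_j, u_i\rangle .
\]

Now on the one hand we have

\[
\langle g(u_j \otimes e_3), u_i \otimes e_3\rangle = \langle gu_j, u_i\rangle\langle ge_3, e_3\rangle,
\]

and on the other hand we get

\[
\begin{aligned}
\langle g(u_j \otimes e_3), u_i \otimes e_3\rangle
&= \langle a_1 gu_j^{\sigma_1} + a_2 gu_j^{\sigma_2} + a_3 gu_j^{\sigma_3},\; a_1 u_i^{\sigma_1} + a_2 u_i^{\sigma_2} + a_3 u_i^{\sigma_3}\rangle \\
&= a_1^2\langle gu_j^{\sigma_1}, u_i^{\sigma_1}\rangle + a_2^2\langle gu_j^{\sigma_2}, u_i^{\sigma_2}\rangle + a_3^2\langle gu_j^{\sigma_3}, u_i^{\sigma_3}\rangle .
\end{aligned}
\]

Therefore

\[
\varphi(g)\Phi_{ij}(g) = a_1^2\,\Phi_{ij}^{\sigma_1}(g) + a_2^2\,\Phi_{ij}^{\sigma_2}(g) + a_3^2\,\Phi_{ij}^{\sigma_3}(g).
\]

This completes the proof of the theorem. □

We rewrite (9) making explicit the dependence on the parameters $\mathbf{m} = (m_1, m_2, m_3)$ and $\mathbf{k} = (k_1, k_2)$. Then, up to equivalences of spherical functions, we have

\[
\varphi(g)\Phi_{\mathbf{m},\mathbf{k}}(g) = a_1^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}+e_1,\mathbf{k}}(g) + a_2^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}+e_2,\mathbf{k}}(g) + a_3^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}+e_3,\mathbf{k}}(g), \tag{10}
\]

where $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$.

In the following section we shall prove the following theorem.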

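THEOREM 3.4. The constants $a_i(\mathbf{m}, \mathbf{k})$ defined in Proposition 3.2 are given by

\[
a_1^2(\mathbf{m},\mathbf{k}) = \frac{(m_1 - k_1 + 1)(m_1 - k_2 + 2)}{(m_1 - m_2 + 1)(m_1 - m_3 + 2)},
\]

\[
a_2^2(\mathbf{m},\mathbf{k}) = \frac{(k_1 - m_2)(m_2 - k_2 + 1)}{(m_1 - m_2 + 1)(m_2 - m_3 + 1)},
\]

\[
a_3^2(\mathbf{m},\mathbf{k}) = \frac{(k_1 - m_3 + 1)(k_2 - m_3)}{(m_1 - m_3 + 2)(m_2 - m_3 + 1)}.
\]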
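As a consistency check on these closed forms: since $\|v_0 \otimes e_3\| = 1$ and the unit vectors $v_0^{\sigma_j}$ lie in different irreducible summands (assumed here to be mutually orthogonal for the invariant inner product), one expects $a_1^2 + a_2^2 + a_3^2 = 1$. The sketch below, with helper names of our own choosing, verifies this with exact fractions over all interlacing parameters $m_1 \ge k_1 \ge m_2 \ge k_2 \ge m_3$ in a small range.

```python
from fractions import Fraction


def a_squared(m, k):
    """The three constants a_1^2, a_2^2, a_3^2 of Theorem 3.4."""
    m1, m2, m3 = m
    k1, k2 = k
    a1 = Fraction((m1 - k1 + 1) * (m1 - k2 + 2), (m1 - m2 + 1) * (m1 - m3 + 2))
    a2 = Fraction((k1 - m2) * (m2 - k2 + 1), (m1 - m2 + 1) * (m2 - m3 + 1))
    a3 = Fraction((k1 - m3 + 1) * (k2 - m3), (m1 - m3 + 2) * (m2 - m3 + 1))
    return a1, a2, a3


def interlacing_pairs(bound):
    """All (m, k) with bound >= m1 >= k1 >= m2 >= k2 >= m3 >= 0."""
    for m1 in range(bound + 1):
        for m2 in range(m1 + 1):
            for m3 in range(m2 + 1):
                for k1 in range(m2, m1 + 1):
                    for k2 in range(m3, m2 + 1):
                        yield (m1, m2, m3), (k1, k2)


for m, k in interlacing_pairs(5):
    a1, a2, a3 = a_squared(m, k)
    assert min(a1, a2, a3) >= 0
    assert a1 + a2 + a3 == 1
```

The identity is an instance of Lagrange interpolation in the variables $m_1 + 2$, $m_2 + 1$, $m_3$, so it does not depend on the chosen range.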

3.1. THE DUAL PICTURE

Let $V = V_{\mathbf{m}}$ be an irreducible GL(n, C) module associated to $\mathbf{m} = (m_1, \dots, m_n)$. Then it is easy to see that $V^*$ is associated to the parameters $(-m_n, \dots, -m_1)$. In particular let $W^*$ denote the GL(3, C) module dual to W. Thus $W^*$ has parameters (0, 0, −1). Then from Proposition 2.4 we obtain the following proposition.

PROPOSITION 3.5. If $V = V_{\mathbf{m}}$ then

\[
V \otimes W^* \simeq V^{\tau_1} \oplus V^{\tau_2} \oplus V^{\tau_3},
\]

where $V^{\tau_1}, V^{\tau_2}, V^{\tau_3}$ are irreducible GL(3, C) modules of parameters

\[
\tau_1 = (m_1 - 1, m_2, m_3), \quad \tau_2 = (m_1, m_2 - 1, m_3), \quad \tau_3 = (m_1, m_2, m_3 - 1).
\]


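THEOREM 3.6. Let $\Phi_{\mathbf{m},\mathbf{k}}$ be the irreducible spherical function associated to the G module $V_{\mathbf{m}}$ and the K submodule $V_{k_1,k_2}$. Let ψ be the spherical function of type (1, 0) associated to the G module $W^*$. Then up to equivalences of spherical functions we have

\[
\psi(g)\Phi_{\mathbf{m},\mathbf{k}}(g) = c_1^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}-e_1,\mathbf{k}}(g) + c_2^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}-e_2,\mathbf{k}}(g) + c_3^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}-e_3,\mathbf{k}}(g), \tag{11}
\]

where

\[
c_j(\mathbf{m},\mathbf{k}) = a_{4-j}(-m_3, -m_2, -m_1, -k_2, -k_1), \qquad j = 1, 2, 3.
\]

More explicitly

\[
c_1^2(\mathbf{m},\mathbf{k}) = \frac{(m_1 - k_2 + 1)(m_1 - k_1)}{(m_1 - m_2 + 1)(m_1 - m_3 + 2)},
\]

\[
c_2^2(\mathbf{m},\mathbf{k}) = \frac{(m_2 - k_2)(k_1 - m_2 + 1)}{(m_1 - m_2 + 1)(m_2 - m_3 + 1)},
\]

\[
c_3^2(\mathbf{m},\mathbf{k}) = \frac{(k_2 - m_3 + 1)(k_1 - m_3 + 2)}{(m_1 - m_3 + 2)(m_2 - m_3 + 1)}.
\]

To prove this theorem we can repeat all the arguments given in Section 4, but we give here a short proof which depends on the notion of the dual of a spherical function.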
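The explicit expressions for $c_j^2$ can be cross-checked against the substitution rule $c_j(\mathbf{m},\mathbf{k}) = a_{4-j}(-m_3,-m_2,-m_1,-k_2,-k_1)$. The following sketch (function names are ours) evaluates both sides with exact fractions, taking the formulas for $a_j^2$ from Theorem 3.4 with the denominator $(m_1 - m_2 + 1)(m_2 - m_3 + 1)$ for $a_2^2$, as in Section 4.2.

```python
from fractions import Fraction


def a_squared(m, k):
    """a_1^2, a_2^2, a_3^2 as in Theorem 3.4."""
    m1, m2, m3 = m
    k1, k2 = k
    return (
        Fraction((m1 - k1 + 1) * (m1 - k2 + 2), (m1 - m2 + 1) * (m1 - m3 + 2)),
        Fraction((k1 - m2) * (m2 - k2 + 1), (m1 - m2 + 1) * (m2 - m3 + 1)),
        Fraction((k1 - m3 + 1) * (k2 - m3), (m1 - m3 + 2) * (m2 - m3 + 1)),
    )


def c_squared_via_duality(m, k):
    """c_j^2 = a_{4-j}^2 evaluated at the dual parameters."""
    m1, m2, m3 = m
    k1, k2 = k
    dual = a_squared((-m3, -m2, -m1), (-k2, -k1))
    return dual[2], dual[1], dual[0]


def c_squared_explicit(m, k):
    """The explicit expressions stated in Theorem 3.6."""
    m1, m2, m3 = m
    k1, k2 = k
    return (
        Fraction((m1 - k2 + 1) * (m1 - k1), (m1 - m2 + 1) * (m1 - m3 + 2)),
        Fraction((m2 - k2) * (k1 - m2 + 1), (m1 - m2 + 1) * (m2 - m3 + 1)),
        Fraction((k2 - m3 + 1) * (k1 - m3 + 2), (m1 - m3 + 2) * (m2 - m3 + 1)),
    )


samples = [((3, 1, 0), (2, 1)), ((4, 2, -1), (3, 0)), ((2, 2, 2), (2, 2))]
for m, k in samples:
    assert c_squared_via_duality(m, k) == c_squared_explicit(m, k)
```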

If $\Phi: G \to \mathrm{End}(V)$ is a spherical function of type $\pi \in \hat K$ then the function $\Phi^*: G \to \mathrm{End}(V^*)$ defined by $\Phi^*(g) = \Phi(g^{-1})^t$ is a spherical function of type $\pi^*$, where $\pi^*$ denotes the contragredient representation associated to π (see [4]). In particular $\psi(g) = \varphi^*(g)$, because both spherical functions are associated to the same G module and are of the same K type.

Now given a parameter $\mathbf{m} = (m_1, m_2, m_3)$ let $\tilde{\mathbf{m}} = (-m_3, -m_2, -m_1)$; similarly, given a parameter $\mathbf{k} = (k_1, k_2)$ we put $\tilde{\mathbf{k}} = (-k_2, -k_1)$.

Therefore we can see that if $\Phi_{\mathbf{m},\mathbf{k}}$ is the spherical function associated to the G module $V_{\mathbf{m}}$ and the K submodule $V_{\mathbf{k}}$ then $\Phi^*_{\mathbf{m},\mathbf{k}} = \Phi_{\tilde{\mathbf{m}},\tilde{\mathbf{k}}}$ is the spherical function associated to the G module $V_{\tilde{\mathbf{m}}}$ and the K submodule $V_{\tilde{\mathbf{k}}}$.

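Proof of Theorem 3.6. We write $\tilde{\mathbf{m}} = (-m_3, -m_2, -m_1)$ and $\tilde{\mathbf{k}} = (-k_2, -k_1)$ for the dual parameters. We start from the following identity established in Theorem 3.3

\[
\varphi(g)\Phi_{\mathbf{m},\mathbf{k}}(g) = a_1^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}_1,\mathbf{k}}(g) + a_2^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}_2,\mathbf{k}}(g) + a_3^2(\mathbf{m},\mathbf{k})\,\Phi_{\mathbf{m}_3,\mathbf{k}}(g), \tag{12}
\]

where

\[
\mathbf{m}_1 = (m_1 + 1, m_2, m_3), \quad \mathbf{m}_2 = (m_1, m_2 + 1, m_3), \quad \mathbf{m}_3 = (m_1, m_2, m_3 + 1).
\]

By taking ∗ on both sides of (12) we obtain

\[
\psi(g)\Phi_{\tilde{\mathbf{m}},\tilde{\mathbf{k}}}(g) = a_1^2(\mathbf{m},\mathbf{k})\,\Phi_{\tilde{\mathbf{m}}_1,\tilde{\mathbf{k}}}(g) + a_2^2(\mathbf{m},\mathbf{k})\,\Phi_{\tilde{\mathbf{m}}_2,\tilde{\mathbf{k}}}(g) + a_3^2(\mathbf{m},\mathbf{k})\,\Phi_{\tilde{\mathbf{m}}_3,\tilde{\mathbf{k}}}(g),
\]

where $\tilde{\mathbf{m}}_j$ denotes the dual of $\mathbf{m}_j$. Notice that $\tilde{\mathbf{m}}_1 = \tilde{\mathbf{m}} - e_3$, $\tilde{\mathbf{m}}_2 = \tilde{\mathbf{m}} - e_2$ and $\tilde{\mathbf{m}}_3 = \tilde{\mathbf{m}} - e_1$. Now if we replace $\mathbf{m}$ by $\tilde{\mathbf{m}}$ and $\mathbf{k}$ by $\tilde{\mathbf{k}}$ we obtain

\[
\psi(g)\Phi_{\mathbf{m},\mathbf{k}}(g) = a_1^2(\tilde{\mathbf{m}},\tilde{\mathbf{k}})\,\Phi_{\mathbf{m}-e_3,\mathbf{k}}(g) + a_2^2(\tilde{\mathbf{m}},\tilde{\mathbf{k}})\,\Phi_{\mathbf{m}-e_2,\mathbf{k}}(g) + a_3^2(\tilde{\mathbf{m}},\tilde{\mathbf{k}})\,\Phi_{\mathbf{m}-e_1,\mathbf{k}}(g).
\]

Then (11) follows with $c_j(\mathbf{m},\mathbf{k}) = a_{4-j}(\tilde{\mathbf{m}},\tilde{\mathbf{k}})$. Now the explicit expressions for $c_j(\mathbf{m},\mathbf{k})$ follow from Theorem 3.4. □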

4. The Constants of the Multiplication Formulas

The goal of this section is to give explicit expressions for the constants $a_1^2, a_2^2, a_3^2$ appearing in Theorem 3.4 in terms of the parameters $\mathbf{k} = (k_1, k_2)$ and $\mathbf{m} = (m_1, m_2, m_3)$.

Let $V = V_{\mathbf{m}}$ be an irreducible GL(3, C) module and equip V with a G-invariant inner product. Let v ∈ V be a highest weight vector such that $\|v\| = 1$. The vector v is of weight $(m_1, m_2, m_3)$ and it is associated to the triangle

\[
\begin{matrix}
m_1 & & m_2 & & m_3 \\
 & m_1 & & m_2 & \\
 & & m_1 & &
\end{matrix}
\]

As we mentioned before, as a GL(2, C) module V decomposes as the direct sum of the submodules $V_{\mathbf{k}}$ such that $m_1 \ge k_1 \ge m_2 \ge k_2 \ge m_3$, all of these with multiplicity one. A highest weight vector $v_\mu$ in $V_{\mathbf{k}}$ is of weight

\[
(k_1,\; k_2,\; m_1 + m_2 + m_3 - k_1 - k_2)
\]

and corresponds to the triangle

\[
\mu = \begin{matrix}
m_1 & & m_2 & & m_3 \\
 & k_1 & & k_2 & \\
 & & k_1 & &
\end{matrix}
\]

Let $\|v_\mu\| = 1$. The constants $a_1, a_2, a_3$ were defined in such a way that

\[
v_\mu \otimes e_3 = a_1 v_\mu^{\sigma_1} + a_2 v_\mu^{\sigma_2} + a_3 v_\mu^{\sigma_3} \in V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3},
\]

with $a_j > 0$ and $\|v_\mu^{\sigma_j}\| = 1$.


PROPOSITION 4.1. Let v be a GL(3, C) highest weight vector in V and let $v^{\sigma_j} \in V \otimes W$ be the vectors defined by

\[
\begin{aligned}
v^{\sigma_1} &= v \otimes e_1, \\
v^{\sigma_2} &= E_{21}v \otimes e_1 - (m_1 - m_2)\,v \otimes e_2, \quad \text{if } m_1 \neq m_2, \\
v^{\sigma_3} &= (E_{21}E_{32} - (m_2 - m_3)E_{31})v \otimes e_1 - (m_1 + 1 - m_3)E_{32}v \otimes e_2 \\
&\quad + (m_2 - m_3)(m_1 + 1 - m_3)\,v \otimes e_3, \quad \text{if } m_2 \neq m_3.
\end{aligned}
\]

Then $v^{\sigma_j}$ (j = 1, 2, 3) are dominant vectors in V ⊗ W of weights $(m_1 + 1, m_2, m_3)$, $(m_1, m_2 + 1, m_3)$, $(m_1, m_2, m_3 + 1)$, respectively.

Proof. From the definition it is clear that $v^{\sigma_j} \neq 0$ for j = 1, 2, 3. That $v^{\sigma_1}$ is a vector of the specified weight follows from the fact that v is a vector of weight $(m_1, m_2, m_3)$ and that $e_1$ is a vector of weight (1, 0, 0). Similarly $v^{\sigma_2}$ is a vector of weight $(m_1, m_2 + 1, m_3)$ because $E_{21}v$ is a vector of weight $(m_1 - 1, m_2 + 1, m_3)$ and $e_2$ is a vector of weight (0, 1, 0). In the same way one verifies that $v^{\sigma_3}$ is a vector of weight $(m_1, m_2, m_3 + 1)$.

To prove that the $v^{\sigma_j}$ are dominant it is enough to verify that they are killed by $E_{12}$ and $E_{23}$. This can be checked by a straightforward computation. □

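We introduce now the following elements of the complex universal enveloping algebra U(g) of g, which can be found in [10, §68]:

\[
\nabla_{32} = E_{32}, \qquad \nabla_{31} = (E_{11} - E_{22} + 2)E_{31} + E_{21}E_{32},
\]

\[
\nabla_{23} = E_{23}, \qquad \nabla_{13} = (E_{11} - E_{22} + 1)E_{13} + E_{23}E_{12}.
\]

We start by observing some elementary properties of these elements.

LEMMA 4.2. We have

(i) $\nabla_{31}\nabla_{32} = \nabla_{32}\nabla_{31}$,
(ii) $\nabla_{13}\nabla_{23} = \nabla_{23}\nabla_{13}$,
(iii) $\nabla_{31}^* = \nabla_{13}$ and $\nabla_{32}^* = \nabla_{23}$,
(iv) $E_{12}\nabla_{32} = \nabla_{32}E_{12}$,
(v) $E_{12}\nabla_{31} = (\nabla_{31} - 2E_{31})E_{12}$.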

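PROPOSITION 4.3. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector, and let $v^{\sigma_j} \in V \otimes W$ (j = 1, 2, 3) be the vectors defined in Proposition 4.1. Then

\[
\begin{aligned}
v \otimes e_3 ={}& \frac{1}{(m_1 + 1 - m_2)(m_1 + 2 - m_3)}\,\nabla_{31}(v^{\sigma_1})
- \frac{1}{(m_1 + 1 - m_2)(m_2 + 1 - m_3)}\,\nabla_{32}(v^{\sigma_2}) \\
&+ \frac{1}{(m_1 + 2 - m_3)(m_2 + 1 - m_3)}\,v^{\sigma_3}.
\end{aligned}
\]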


Proof. The vector v belongs to the GL(2, C) submodule $V_{m_1,m_2}$. Then we have that

\[
v \otimes e_3 = w_1 + w_2 + w_3 \in V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3},
\]

where the vectors $w_j$ are weight vectors in $V^{\sigma_j}$ of weight $(m_1, m_2, m_3 + 1)$, which belong to the GL(2, C) submodules $V^{\sigma_j}_{m_1,m_2}$. Moreover, since v and $e_3$ are GL(2, C) dominant, the same happens with $w_1$, $w_2$ and $w_3$. Lemmas 4.2(iv) and (v) imply that $\nabla_{31}(v^{\sigma_1})$ and $\nabla_{32}(v^{\sigma_2})$ are GL(2, C) dominant vectors in $V^{\sigma_1}$ and $V^{\sigma_2}$, respectively, of weight $(m_1, m_2)$. On the other hand an easy computation gives that $\nabla_{31}(v^{\sigma_1}) \neq 0$ and $\nabla_{32}(v^{\sigma_2}) \neq 0$. Therefore we have

\[
w_1 = x\,\nabla_{31}(v^{\sigma_1}), \qquad w_2 = y\,\nabla_{32}(v^{\sigma_2}), \qquad w_3 = z\,v^{\sigma_3},
\]

for some constants x, y, z. Now it is straightforward to verify that

\[
x = \frac{1}{(m_1 + 1 - m_2)(m_1 + 2 - m_3)}, \quad
y = -\frac{1}{(m_1 + 1 - m_2)(m_2 + 1 - m_3)}, \quad
z = \frac{1}{(m_1 + 2 - m_3)(m_2 + 1 - m_3)}. \qquad \square
\]
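The decomposition of Proposition 4.3 can be tested in the smallest example, $V = W = \mathbb{C}^3$ with $\mathbf{m} = (1, 0, 0)$ and $v = e_1$. The sketch below is illustrative only: tensors are stored as dictionaries from index tuples to coefficients, the helper names (`E`, `lin`, `tensor`) are ours, and since $m_2 = m_3$ the vector $v^{\sigma_3}$ is not defined, so we assume its term is simply absent.

```python
from fractions import Fraction


def E(i, j, x):
    """Matrix unit E_ij acting as a derivation on a sparse tensor {indices: coeff}."""
    out = {}
    for idx, c in x.items():
        for pos, s in enumerate(idx):
            if s == j:
                key = idx[:pos] + (i,) + idx[pos + 1:]
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}


def lin(*terms):
    """Linear combination of sparse tensors, given as (coefficient, tensor) pairs."""
    out = {}
    for c, x in terms:
        for key, val in x.items():
            out[key] = out.get(key, 0) + c * val
    return {key: c for key, c in out.items() if c != 0}


def tensor(x, j):
    """x (x) e_j: append the index j of a basis vector of W = C^3."""
    return {idx + (j,): c for idx, c in x.items()}


def nabla31(x):
    """nabla_31 = (E11 - E22 + 2) E31 + E21 E32."""
    y = E(3, 1, x)
    return lin((1, E(1, 1, y)), (-1, E(2, 2, y)), (2, y), (1, E(2, 1, E(3, 2, x))))


def nabla32(x):
    return E(3, 2, x)


m1, m2, m3 = 1, 0, 0
v = {(1,): 1}                                   # highest weight vector e_1 of C^3
v_s1 = tensor(v, 1)                             # v^{sigma_1} = v (x) e_1
v_s2 = lin((1, tensor(E(2, 1, v), 1)), (-(m1 - m2), tensor(v, 2)))
x = Fraction(1, (m1 + 1 - m2) * (m1 + 2 - m3))
y = Fraction(-1, (m1 + 1 - m2) * (m2 + 1 - m3))
decomposition = lin((x, nabla31(v_s1)), (y, nabla32(v_s2)))
assert decomposition == tensor(v, 3)            # equals v (x) e_3
```

Here $\nabla_{31}(v^{\sigma_1})$ is symmetric and $\nabla_{32}(v^{\sigma_2})$ is antisymmetric, so the two terms are the components of $e_1 \otimes e_3$ in $S^2\mathbb{C}^3$ and $\Lambda^2\mathbb{C}^3$.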

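PROPOSITION 4.4. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector with $\|v\| = 1$ and let $v^{\sigma_j}$ be defined as in Proposition 4.1. Then we have

\[
\begin{aligned}
\|v^{\sigma_1}\|^2 &= 1, \\
\|v^{\sigma_2}\|^2 &= (m_1 - m_2)(m_1 + 1 - m_2), \\
\|v^{\sigma_3}\|^2 &= (m_2 - m_3)(m_1 + 1 - m_3)(m_2 + 1 - m_3)(m_1 + 2 - m_3).
\end{aligned}
\]

Proof. Recall that we have equipped V and $W = \mathbb{C}^3$ with G-invariant inner products. The canonical basis of $\mathbb{C}^3$ is an orthonormal basis with respect to this inner product. The inner product in V ⊗ W is such that $\langle v \otimes w, v' \otimes w'\rangle = \langle v, v'\rangle\langle w, w'\rangle$.

Therefore we obtain

\[
\|v^{\sigma_1}\|^2 = \langle v \otimes e_1, v \otimes e_1\rangle = \langle v, v\rangle\langle e_1, e_1\rangle = 1.
\]

We have $v^{\sigma_2} = E_{21}v \otimes e_1 - (m_1 - m_2)\,v \otimes e_2$, then

\[
\begin{aligned}
\langle v^{\sigma_2}, v^{\sigma_2}\rangle &= \langle E_{21}v, E_{21}v\rangle + (m_1 - m_2)^2\langle v, v\rangle \\
&= \langle v, E_{12}E_{21}v\rangle + (m_1 - m_2)^2 \\
&= \langle v, (E_{11} - E_{22})v\rangle + (m_1 - m_2)^2 \\
&= (m_1 - m_2) + (m_1 - m_2)^2.
\end{aligned}
\]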


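Finally, for the vector $v^{\sigma_3}$ we have

\[
\|v^{\sigma_3}\|^2 = \|(E_{21}E_{32} - (m_2 - m_3)E_{31})v\|^2 + (m_1 + 1 - m_3)^2\|E_{32}v\|^2 + (m_2 - m_3)^2(m_1 + 1 - m_3)^2.
\]

We also get

\[
\|E_{32}v\|^2 = \langle v, E_{23}E_{32}v\rangle = \langle v, (E_{22} - E_{33})v\rangle = (m_2 - m_3).
\]

On the other hand we have

\[
\begin{aligned}
\|(E_{21}E_{32} - (m_2 - m_3)E_{31})v\|^2 ={}& \langle E_{32}v, E_{12}E_{21}E_{32}v\rangle - 2(m_2 - m_3)\langle v, E_{13}E_{21}E_{32}v\rangle \\
&+ (m_2 - m_3)^2\langle v, E_{13}E_{31}v\rangle.
\end{aligned}
\]

Since v is a GL(3, C) dominant vector we have

\[
E_{12}E_{32}v = E_{32}E_{12}v = 0 \quad\text{and}\quad E_{13}E_{32}v = E_{32}E_{13}v + E_{12}v = 0.
\]

We also have that $E_{32}v$ is a vector of weight $(m_1, m_2 - 1, m_3 + 1)$. Thus

\[
\begin{aligned}
\|(E_{21}E_{32} - (m_2 - m_3)E_{31})v\|^2
&= \langle E_{32}v, (E_{11} - E_{22})E_{32}v\rangle + 2(m_2 - m_3)\langle v, E_{23}E_{32}v\rangle + (m_2 - m_3)^2\langle v, (E_{11} - E_{33})v\rangle \\
&= (m_1 - m_2 + 1)(m_2 - m_3) + 2(m_2 - m_3)^2 + (m_2 - m_3)^2(m_1 - m_3) \\
&= (m_2 - m_3)(m_1 - m_3 + 1)(m_2 - m_3 + 1).
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\|v^{\sigma_3}\|^2 &= (m_2 - m_3)(m_1 - m_3 + 1)(m_2 - m_3 + 1) + (m_1 + 1 - m_3)^2(m_2 - m_3) + (m_2 - m_3)^2(m_1 + 1 - m_3)^2 \\
&= (m_2 - m_3)(m_1 - m_3 + 1)(m_2 - m_3 + 1)(m_1 + 2 - m_3).
\end{aligned}
\]

This completes the proof of the proposition. □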
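The norms in Proposition 4.4 and the dominance claimed in Proposition 4.1 can be checked on concrete highest weight vectors realized inside tensor powers of $\mathbb{C}^3$. This is a sketch under our own conventions: tensors are dictionaries from index tuples to integer coefficients, the helper names are ours, and the test vectors are not normalized, so the predicted norms are multiplied by $\|v\|^2$.

```python
def E(i, j, x):
    """Matrix unit E_ij acting as a derivation on a sparse tensor {indices: coeff}."""
    out = {}
    for idx, c in x.items():
        for pos, s in enumerate(idx):
            if s == j:
                key = idx[:pos] + (i,) + idx[pos + 1:]
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}


def lin(*terms):
    """Linear combination of sparse tensors, given as (coefficient, tensor) pairs."""
    out = {}
    for c, x in terms:
        for key, val in x.items():
            out[key] = out.get(key, 0) + c * val
    return {key: c for key, c in out.items() if c != 0}


def tensor(x, j):
    """x (x) e_j with e_j a canonical basis vector of W = C^3."""
    return {idx + (j,): c for idx, c in x.items()}


def norm_sq(x):
    return sum(c * c for c in x.values())


def sigma_vectors(v, m):
    """The vectors v^{sigma_2}, v^{sigma_3} of Proposition 4.1."""
    m1, m2, m3 = m
    s2 = lin((1, tensor(E(2, 1, v), 1)), (-(m1 - m2), tensor(v, 2)))
    head = lin((1, E(2, 1, E(3, 2, v))), (-(m2 - m3), E(3, 1, v)))
    s3 = lin((1, tensor(head, 1)),
             (-(m1 + 1 - m3), tensor(E(3, 2, v), 2)),
             ((m2 - m3) * (m1 + 1 - m3), tensor(v, 3)))
    return s2, s3


def predicted(v, m):
    """Right-hand sides of Proposition 4.4, rescaled by ||v||^2."""
    m1, m2, m3 = m
    n = norm_sq(v)
    return ((m1 - m2) * (m1 + 1 - m2) * n,
            (m2 - m3) * (m1 + 1 - m3) * (m2 + 1 - m3) * (m1 + 2 - m3) * n)


# Highest weight vectors of weight (1,1,0) and (2,1,0) in tensor powers of C^3.
examples = [({(1, 2): 1, (2, 1): -1}, (1, 1, 0)),
            ({(1, 1, 2): 1, (1, 2, 1): -1}, (2, 1, 0))]
for v, m in examples:
    s2, s3 = sigma_vectors(v, m)
    assert (norm_sq(s2), norm_sq(s3)) == predicted(v, m)
    for s in (s2, s3):
        assert E(1, 2, s) == {} and E(2, 3, s) == {}   # dominance
```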

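4.1. THE OPERATORS $\Omega_\mu$

The proof of Theorem 3.4 is based on the action on $V^{\sigma_1}$ and $V^{\sigma_2}$ of certain elements $\Omega_1$ and $\Omega_2$ in U(g), which transform the GL(3, C) highest weight vectors $v^{\sigma_1} \in V^{\sigma_1}$ and $v^{\sigma_2} \in V^{\sigma_2}$ into GL(2, C) highest weight vectors in $V^{\sigma_1}_{k_1,k_2}$ and $V^{\sigma_2}_{k_1,k_2}$, respectively.

For

\[
\mu = \begin{matrix}
m_1 & & m_2 & & m_3 \\
 & k_1 & & k_2 & \\
 & & k_1 & &
\end{matrix}
\]

we define, as in [10, §68, Theorem 5],

\[
\Omega_\mu = \nabla_{31}^{\,m_1 - k_1}\,\nabla_{32}^{\,m_2 - k_2}. \tag{13}
\]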


THEOREM 4.5. Let $V_{\mathbf{m}}$ be a GL(3, C) module and let $V_{\mathbf{k}}$ be a GL(2, C) irreducible submodule. If $v \in V_{\mathbf{m}}$ is a GL(3, C) highest weight vector, then $\Omega_\mu(v) \in V_{\mathbf{k}}$ is a GL(2, C) highest weight vector associated to the triangle µ.

Proof. It is easy to see that $\Omega_\mu(v)$ is a vector of weight $(k_1, k_2, m_1 + m_2 + m_3 - k_1 - k_2)$. To prove that it is a GL(2, C) dominant vector we need to see that it is annihilated by $E_{12}$. This follows from Lemmas 4.2(iv) and (v). Now it is clear that $\Omega_\mu(v)$ is associated to the triangle µ. From Theorem 4.6 below it will follow that $\Omega_\mu(v)$ is a nonzero vector. □

To compute the constants of the multiplication formulas we need the norms of the vectors $\Omega_\mu(v)$ with respect to a G-invariant inner product in $V_{\mathbf{m}}$. Given a linear operator a, as usual, we shall denote

\[
[a]_n = a(a - 1)\cdots(a - n + 1).
\]

THEOREM 4.6. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector. Then

\[
\|\Omega_\mu(v)\|^2 = (m_1 - k_1)!\,(m_2 - k_2)!\,[m_1 - k_2 + 1]_{m_1 - k_1}\,[m_1 - m_3 + 1]_{m_1 - k_1}\,[m_1 - m_2]_{m_1 - k_1}\,[m_2 - m_3]_{m_2 - k_2}\,\|v\|^2.
\]

A proof of this theorem can be found in [10, Chapter X], where an explicit realization of the GL(n, C) irreducible modules is used. An abstract proof in our case is included for completeness in the Appendix.
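The norm formula of Theorem 4.6 can be tested on a small example. The sketch below is an illustration under our own conventions, not part of the proof: it realizes a highest weight vector of weight (2, 1, 0) inside $(\mathbb{C}^3)^{\otimes 3}$, applies $\nabla_{31}^{m_1-k_1}\nabla_{32}^{m_2-k_2}$ as defined in (13), and compares the squared norm with the closed form for the four admissible $\mathbf{k}$. All helper names are ours.

```python
from math import factorial


def E(i, j, x):
    """Matrix unit E_ij acting as a derivation on a sparse tensor {indices: coeff}."""
    out = {}
    for idx, c in x.items():
        for pos, s in enumerate(idx):
            if s == j:
                key = idx[:pos] + (i,) + idx[pos + 1:]
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}


def lin(*terms):
    """Linear combination of sparse tensors, given as (coefficient, tensor) pairs."""
    out = {}
    for c, x in terms:
        for key, val in x.items():
            out[key] = out.get(key, 0) + c * val
    return {key: c for key, c in out.items() if c != 0}


def nabla31(x):
    """nabla_31 = (E11 - E22 + 2) E31 + E21 E32."""
    y = E(3, 1, x)
    return lin((1, E(1, 1, y)), (-1, E(2, 2, y)), (2, y), (1, E(2, 1, E(3, 2, x))))


def falling(a, n):
    """[a]_n = a (a - 1) ... (a - n + 1), with [a]_0 = 1."""
    out = 1
    for r in range(n):
        out *= a - r
    return out


def norm_sq(x):
    return sum(c * c for c in x.values())


def lowered_norm_sq(v, m, k):
    """|| nabla_31^(m1-k1) nabla_32^(m2-k2) v ||^2 computed directly."""
    x = v
    for _ in range(m[1] - k[1]):
        x = E(3, 2, x)
    for _ in range(m[0] - k[0]):
        x = nabla31(x)
    return norm_sq(x)


def formula(m, k, v_norm_sq):
    """Right-hand side of Theorem 4.6."""
    (m1, m2, m3), (k1, k2) = m, k
    a, b = m1 - k1, m2 - k2
    return (factorial(a) * factorial(b) * falling(m1 - k2 + 1, a)
            * falling(m1 - m3 + 1, a) * falling(m1 - m2, a)
            * falling(m2 - m3, b) * v_norm_sq)


m = (2, 1, 0)
v = {(1, 1, 2): 1, (1, 2, 1): -1}      # killed by E12 and E23, weight (2, 1, 0)
for k in [(2, 1), (2, 0), (1, 1), (1, 0)]:
    assert lowered_norm_sq(v, m, k) == formula(m, k, norm_sq(v))
```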

4.2. COMPUTATION OF THE CONSTANTS

The aim of this section is to give the proof of Theorem 3.4.

THEOREM 3.4. In terms of the parameters $\mathbf{m} = (m_1, m_2, m_3)$ and $\mathbf{k} = (k_1, k_2)$ we have

\[
a_1^2 = a_1^2(\mathbf{m},\mathbf{k}) = \frac{(m_1 - k_1 + 1)(m_1 - k_2 + 2)}{(m_1 - m_2 + 1)(m_1 - m_3 + 2)},
\]

\[
a_2^2 = a_2^2(\mathbf{m},\mathbf{k}) = \frac{(k_1 - m_2)(m_2 - k_2 + 1)}{(m_1 - m_2 + 1)(m_2 - m_3 + 1)},
\]

\[
a_3^2 = a_3^2(\mathbf{m},\mathbf{k}) = \frac{(k_1 - m_3 + 1)(k_2 - m_3)}{(m_1 - m_3 + 2)(m_2 - m_3 + 1)}.
\]
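The strategy of the proof that follows, namely taking ratios of the norms of Theorem 4.6 at the shifted weights $\mathbf{m} + e_j$ and using the norms of Proposition 4.4, can be mirrored in a few lines. The sketch below (with function names of our own) recomputes $a_j^2$ from those two ingredients, with the normalizing factors of Proposition 4.3, and compares with the closed forms above.

```python
from fractions import Fraction
from math import factorial


def falling(a, n):
    """[a]_n = a (a - 1) ... (a - n + 1), with [a]_0 = 1."""
    out = 1
    for r in range(n):
        out *= a - r
    return out


def omega_norm_sq(m, k):
    """Theorem 4.6 for a unit highest weight vector of weight m."""
    (m1, m2, m3), (k1, k2) = m, k
    a, b = m1 - k1, m2 - k2
    return (factorial(a) * factorial(b) * falling(m1 - k2 + 1, a)
            * falling(m1 - m3 + 1, a) * falling(m1 - m2, a) * falling(m2 - m3, b))


def a_squared_closed(m, k):
    (m1, m2, m3), (k1, k2) = m, k
    return (Fraction((m1 - k1 + 1) * (m1 - k2 + 2), (m1 - m2 + 1) * (m1 - m3 + 2)),
            Fraction((k1 - m2) * (m2 - k2 + 1), (m1 - m2 + 1) * (m2 - m3 + 1)),
            Fraction((k1 - m3 + 1) * (k2 - m3), (m1 - m3 + 2) * (m2 - m3 + 1)))


def a_squared_from_norms(m, k):
    """Norm ratios at m + e_j, weighted by Proposition 4.4 and Proposition 4.3."""
    m1, m2, m3 = m
    base = omega_norm_sq(m, k)
    n2 = (m1 - m2) * (m1 + 1 - m2)                                  # ||v^{sigma_2}||^2
    n3 = (m2 - m3) * (m1 + 1 - m3) * (m2 + 1 - m3) * (m1 + 2 - m3)  # ||v^{sigma_3}||^2
    r1 = Fraction(omega_norm_sq((m1 + 1, m2, m3), k), base)
    r2 = Fraction(omega_norm_sq((m1, m2 + 1, m3), k) * n2, base)
    r3 = Fraction(omega_norm_sq((m1, m2, m3 + 1), k) * n3, base)
    return (r1 / ((m1 + 1 - m2) ** 2 * (m1 + 2 - m3) ** 2),
            r2 / ((m1 + 1 - m2) ** 2 * (m2 + 1 - m3) ** 2),
            r3 / ((m1 + 2 - m3) ** 2 * (m2 + 1 - m3) ** 2))


def interlacing_pairs(bound):
    for m1 in range(bound + 1):
        for m2 in range(m1 + 1):
            for m3 in range(m2 + 1):
                for k1 in range(m2, m1 + 1):
                    for k2 in range(m3, m2 + 1):
                        yield (m1, m2, m3), (k1, k2)


for m, k in interlacing_pairs(4):
    assert a_squared_from_norms(m, k) == a_squared_closed(m, k)
```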

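Proof. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector such that $\|v\| = 1$. By Proposition 4.3 we have that

\[
\begin{aligned}
v \otimes e_3 ={}& \frac{1}{(m_1 + 1 - m_2)(m_1 + 2 - m_3)}\,\nabla_{31}(v^{\sigma_1})
- \frac{1}{(m_1 + 1 - m_2)(m_2 + 1 - m_3)}\,\nabla_{32}(v^{\sigma_2}) \\
&+ \frac{1}{(m_1 + 2 - m_3)(m_2 + 1 - m_3)}\,v^{\sigma_3}.
\end{aligned}
\]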


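The operator $\Omega_\mu$ defined in (13) has the property that $\Omega_\mu(v) = v_\mu$ is a GL(2, C) highest weight vector in the submodule $V_{\mathbf{k}} \subseteq V_{\mathbf{m}}$. Moreover since $e_3 \in W$ is a lowest weight vector of weight (0, 0, 1) it follows that $\Omega_\mu(v \otimes e_3) = \Omega_\mu(v) \otimes e_3$. Then we obtain

\[
\begin{aligned}
\Omega_\mu(v) \otimes e_3 ={}& \frac{1}{(m_1 + 1 - m_2)(m_1 + 2 - m_3)}\,\Omega_\mu\nabla_{31}(v^{\sigma_1})
- \frac{1}{(m_1 + 1 - m_2)(m_2 + 1 - m_3)}\,\Omega_\mu\nabla_{32}(v^{\sigma_2}) \\
&+ \frac{1}{(m_1 + 2 - m_3)(m_2 + 1 - m_3)}\,\Omega_\mu(v^{\sigma_3}).
\end{aligned}
\]

This expression is exactly the decomposition of the vector $v_\mu \otimes e_3$ in the direct sum $V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3}$.

The constants $a_1, a_2, a_3$ were defined in such a way that

\[
v_\mu \otimes e_3 = a_1 v_\mu^{\sigma_1} + a_2 v_\mu^{\sigma_2} + a_3 v_\mu^{\sigma_3} \in V^{\sigma_1} \oplus V^{\sigma_2} \oplus V^{\sigma_3},
\]

with $a_j > 0$ and $\|v_\mu^{\sigma_j}\| = 1$. Therefore we observe that

\[
a_1^2 = \frac{1}{(m_1 + 1 - m_2)^2(m_1 + 2 - m_3)^2}\,\frac{\|\Omega_\mu\nabla_{31}(v^{\sigma_1})\|^2}{\|\Omega_\mu(v)\|^2}, \tag{14}
\]

\[
a_2^2 = \frac{1}{(m_1 + 1 - m_2)^2(m_2 + 1 - m_3)^2}\,\frac{\|\Omega_\mu\nabla_{32}(v^{\sigma_2})\|^2}{\|\Omega_\mu(v)\|^2}, \tag{15}
\]

\[
a_3^2 = \frac{1}{(m_1 + 2 - m_3)^2(m_2 + 1 - m_3)^2}\,\frac{\|\Omega_\mu(v^{\sigma_3})\|^2}{\|\Omega_\mu(v)\|^2}. \tag{16}
\]

By Theorem 4.6 we have

\[
\|\Omega_\mu(v)\|^2 = (m_1 - k_1)!\,(m_2 - k_2)!\,[m_1 - k_2 + 1]_{m_1 - k_1}\,[m_1 - m_3 + 1]_{m_1 - k_1}\,[m_1 - m_2]_{m_1 - k_1}\,[m_2 - m_3]_{m_2 - k_2}.
\]

The vector $v^{\sigma_1} \in V^{\sigma_1}$ is GL(3, C) dominant of weight $(m_1 + 1, m_2, m_3)$, and

\[
\Omega_\mu\nabla_{31}(v^{\sigma_1}) = \tilde\Omega_\mu(v^{\sigma_1}) \quad\text{where}\quad \tilde\Omega_\mu = \nabla_{31}^{\,m_1 + 1 - k_1}\,\nabla_{32}^{\,m_2 - k_2}.
\]

Then by Theorem 4.6, replacing $m_1$ by $m_1 + 1$, we obtain

\[
\|\tilde\Omega_\mu(v^{\sigma_1})\|^2 = (m_1 + 1 - k_1)!\,(m_2 - k_2)!\,[m_1 - k_2 + 2]_{m_1 + 1 - k_1}\,[m_1 - m_3 + 2]_{m_1 + 1 - k_1}\,[m_1 + 1 - m_2]_{m_1 + 1 - k_1}\,[m_2 - m_3]_{m_2 - k_2}.
\]

Therefore

\[
\frac{\|\Omega_\mu\nabla_{31}(v^{\sigma_1})\|^2}{\|\Omega_\mu(v)\|^2} = (m_1 + 1 - k_1)(m_1 + 2 - k_2)(m_1 + 2 - m_3)(m_1 + 1 - m_2).
\]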


By replacing in (14) we get

\[
a_1^2 = \frac{(m_1 - k_1 + 1)(m_1 - k_2 + 2)}{(m_1 - m_2 + 1)(m_1 - m_3 + 2)}.
\]

We proceed in a similar way with $a_2^2$.

The vector $v^{\sigma_2}$ is a GL(3, C)-dominant vector of weight $(m_1, m_2 + 1, m_3)$. Let $\Omega = \nabla_{31}^{\,m_1 - k_1}\,\nabla_{32}^{\,m_2 + 1 - k_2}$, then $\Omega_\mu\nabla_{32}(v^{\sigma_2}) = \Omega(v^{\sigma_2})$. Thus by Theorem 4.6 we have

\[
\|\Omega(v^{\sigma_2})\|^2 = (m_1 - k_1)!\,(m_2 + 1 - k_2)!\,[m_1 - k_2 + 1]_{m_1 - k_1}\,[m_1 - m_3 + 1]_{m_1 - k_1}\,[m_1 - m_2 - 1]_{m_1 - k_1}\,[m_2 + 1 - m_3]_{m_2 + 1 - k_2}\,\|v^{\sigma_2}\|^2.
\]

Then by Proposition 4.4 we get

\[
\frac{\|\Omega_\mu\nabla_{32}(v^{\sigma_2})\|^2}{\|\Omega_\mu(v)\|^2} = (m_2 + 1 - k_2)(k_1 - m_2)(m_2 + 1 - m_3)(m_1 + 1 - m_2).
\]

Finally, by replacing in (15) we obtain the expression for $a_2^2$.

The vector $v^{\sigma_3}$ is a G-dominant vector of weight $(m_1, m_2, m_3 + 1)$. Thus by Theorem 4.6 we have

\[
\|\Omega_\mu(v^{\sigma_3})\|^2 = (m_1 - k_1)!\,(m_2 - k_2)!\,[m_1 - k_2 + 1]_{m_1 - k_1}\,[m_1 - m_3]_{m_1 - k_1}\,[m_1 - m_2]_{m_1 - k_1}\,[m_2 - m_3 - 1]_{m_2 - k_2}\,\|v^{\sigma_3}\|^2.
\]

By using Proposition 4.4 we get

\[
\frac{\|\Omega_\mu(v^{\sigma_3})\|^2}{\|\Omega_\mu(v)\|^2} = (k_1 - m_3 + 1)(k_2 - m_3)(m_2 + 1 - m_3)(m_1 + 2 - m_3).
\]

By replacing this expression in (16) we finish the proof of the theorem. □

5. The Three Term Recursion Relation

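We shall start from the identities (10) and (11) established in Theorems 3.3 and 3.6. Since the parameter $\mathbf{k} = (k_1, k_2)$ will be the same in all places it will not be made explicit:

\[
\varphi(g)\Phi_{\mathbf{m}}(g) = a_1^2(\mathbf{m})\,\Phi_{\mathbf{m}+e_1}(g) + a_2^2(\mathbf{m})\,\Phi_{\mathbf{m}+e_2}(g) + a_3^2(\mathbf{m})\,\Phi_{\mathbf{m}+e_3}(g), \tag{17}
\]

\[
\psi(g)\Phi_{\mathbf{m}}(g) = c_1^2(\mathbf{m})\,\Phi_{\mathbf{m}-e_1}(g) + c_2^2(\mathbf{m})\,\Phi_{\mathbf{m}-e_2}(g) + c_3^2(\mathbf{m})\,\Phi_{\mathbf{m}-e_3}(g). \tag{18}
\]

From (17) and (18) we obtain

\[
\begin{aligned}
\varphi(g)\psi(g)\Phi_{\mathbf{m}}(g)
={}& a_1^2(\mathbf{m})\bigl[c_1^2(\mathbf{m}_1)\Phi_{\mathbf{m}}(g) + c_2^2(\mathbf{m}_1)\Phi_{\mathbf{m}_1-e_2}(g) + c_3^2(\mathbf{m}_1)\Phi_{\mathbf{m}_1-e_3}(g)\bigr] \\
&+ a_2^2(\mathbf{m})\bigl[c_1^2(\mathbf{m}_2)\Phi_{\mathbf{m}_2-e_1}(g) + c_2^2(\mathbf{m}_2)\Phi_{\mathbf{m}}(g) + c_3^2(\mathbf{m}_2)\Phi_{\mathbf{m}_2-e_3}(g)\bigr] \\
&+ a_3^2(\mathbf{m})\bigl[c_1^2(\mathbf{m}_3)\Phi_{\mathbf{m}_3-e_1}(g) + c_2^2(\mathbf{m}_3)\Phi_{\mathbf{m}_3-e_2}(g) + c_3^2(\mathbf{m}_3)\Phi_{\mathbf{m}}(g)\bigr],
\end{aligned} \tag{19}
\]

where we put $\mathbf{m}_j = \mathbf{m} + e_j$, j = 1, 2, 3.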


At this point it is convenient to make clear when two spherical functions $\Phi_{\mathbf{m},\mathbf{k}}(g)$ and $\Phi_{\mathbf{m}',\mathbf{k}'}(g)$ are equivalent.

PROPOSITION 5.1. The spherical functions $\Phi_{\mathbf{m},\mathbf{k}}(g)$ and $\Phi_{\mathbf{m}',\mathbf{k}'}(g)$ of the pair (G, K) are equivalent if and only if $\mathbf{m}' = \mathbf{m} + j(1, 1, 1)$ and $\mathbf{k}' = \mathbf{k} + j(1, 1)$ for some j ∈ Z.

Proof. First of all, an irreducible spherical function Φ of (G, K) is characterized by the eigenvalues $[\Delta_2\Phi](e)$ and $[\Delta_3\Phi](e)$ corresponding to the generators $\Delta_2$ and $\Delta_3$ of the center Z(g) of the universal enveloping algebra U(g) (see [4]).

Moreover if $p = m_1 - m_2$ and $q = m_2 - m_3$ then the corresponding eigenvalues for $\Phi_{\mathbf{m},\mathbf{k}}(g)$ coincide with the infinitesimal character $\chi_{\lambda+\delta}$ of $V_{\mathbf{m}}$ evaluated at $\Delta_2$ and $\Delta_3$. Here $\lambda = p\lambda_\alpha + q\lambda_\beta$, $\lambda_\alpha$ and $\lambda_\beta$ being the fundamental weights corresponding to the set of simple roots {α, β}, δ = α + β, and $\chi_{\lambda+\delta} = (\lambda + \delta)\circ\gamma$ where $\gamma: Z(g) \to U(h)^W$ is the Harish-Chandra isomorphism of Z(g) onto the W-invariants in U(h).

Now let $p' = m_1' - m_2'$, $q' = m_2' - m_3'$ and $\lambda' = p'\lambda_\alpha + q'\lambda_\beta$. If the spherical functions $\Phi_{\mathbf{m},\mathbf{k}}(g)$ and $\Phi_{\mathbf{m}',\mathbf{k}'}(g)$ are equivalent then $\chi_{\lambda+\delta} = \chi_{\lambda'+\delta}$, and this implies that $\lambda' + \delta = w(\lambda + \delta)$ for some w ∈ W (see Theorem 5.62 in [8]). But then w = 1 and λ′ = λ because λ′ + δ and λ + δ are strictly dominant weights. Thus p′ = p and q′ = q. If we put $j = m_1' - m_1 = m_2' - m_2 = m_3' - m_3$ we have on one hand $\mathbf{m}' = \mathbf{m} + j(1, 1, 1)$.

On the other hand, since the types of $\Phi_{\mathbf{m},\mathbf{k}}(g)$ and $\Phi_{\mathbf{m}',\mathbf{k}'}(g)$ must be the same we have $n = k_1 + 2k_2 - m_1 - m_2 - m_3 = k_1' + 2k_2' - m_1' - m_2' - m_3'$ and $\ell = k_1 - k_2 = k_1' - k_2'$. Therefore $(k_1' - k_1) + 2(k_2' - k_2) = 3j$ and $k_1' - k_1 = k_2' - k_2$, which give $\mathbf{k}' = \mathbf{k} + j(1, 1)$. The proposition is proved. □

We introduce now a better parametrization for the irreducible spherical functions. Given integral tuples $\mathbf{m} = (m_1, m_2, m_3)$ and $\mathbf{k} = (k_1, k_2)$ subject to $m_1 \ge k_1 \ge m_2 \ge k_2 \ge m_3$, let

\[
w = m_1 - k_1 \quad\text{and}\quad k = m_2 - k_2.
\]

Then it is easy to see that the map $\Phi_{\mathbf{m},\mathbf{k}} \mapsto (w, k, n, \ell)$ gives a one to one correspondence between the set of equivalence classes of irreducible spherical functions of (G, K) and the set

\[
\{(w, k, n, \ell) \in \mathbb{Z}^4 : 0 \le k \le \ell,\; 0 \le w,\; 0 \le w + n + k\}.
\]
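The change of parameters can be made concrete as follows. This is a sketch with function names of our own: `to_wknl` computes $(w, k, n, \ell)$ from $(\mathbf{m}, \mathbf{k})$, and `from_wknl` picks one representative of the equivalence class, normalized (as an arbitrary choice on our part) by prescribing $k_2$.

```python
def to_wknl(m, k):
    """(m, k) -> (w, k, n, ell) with w = m1 - k1, k = m2 - k2."""
    m1, m2, m3 = m
    k1, k2 = k
    return m1 - k1, m2 - k2, k1 + 2 * k2 - m1 - m2 - m3, k1 - k2


def from_wknl(w, k, n, ell, k2=0):
    """One representative (m, k) of the class (w, k, n, ell), with the given k2."""
    k1 = k2 + ell
    m1 = w + k1
    m2 = k + k2
    m3 = k2 - (w + k + n)      # forced by n = k1 + 2 k2 - m1 - m2 - m3
    return (m1, m2, m3), (k1, k2)


def is_interlacing(m, k):
    return m[0] >= k[0] >= m[1] >= k[1] >= m[2]


# The map is constant on classes (m, k) ~ (m + j(1,1,1), k + j(1,1)) ...
m, k = (3, 1, 0), (2, 1)
for j in range(-3, 4):
    shifted_m = tuple(x + j for x in m)
    shifted_k = tuple(x + j for x in k)
    assert to_wknl(shifted_m, shifted_k) == to_wknl(m, k)

# ... and every admissible (w, k, n, ell) is attained by interlacing parameters.
for ell in range(4):
    for kk in range(ell + 1):
        for w in range(4):
            for n in range(-(w + kk), 4):
                mm, kp = from_wknl(w, kk, n, ell)
                assert is_interlacing(mm, kp)
                assert to_wknl(mm, kp) == (w, kk, n, ell)
```

The condition $0 \le w + n + k$ corresponds exactly to $k_2 \ge m_3$, since $k_2 - m_3 = w + k + n$.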

In terms of these new parameters we have

\[
a_1^2(\mathbf{m},\mathbf{k}) = a_1^2(w, k, n, \ell) = \frac{(w + 1)(w + \ell + 2)}{(w + \ell - k + 1)(2w + \ell + n + k + 2)},
\]

\[
a_2^2(\mathbf{m},\mathbf{k}) = a_2^2(w, k, n, \ell) = \frac{(\ell - k)(k + 1)}{(w + \ell - k + 1)(w + n + 2k + 1)},
\]

\[
a_3^2(\mathbf{m},\mathbf{k}) = a_3^2(w, k, n, \ell) = \frac{(w + \ell + n + k + 1)(w + n + k)}{(2w + \ell + n + k + 2)(w + n + 2k + 1)}.
\]


From $c_j^2(\mathbf{m},\mathbf{k}) = a_{4-j}^2(\tilde{\mathbf{m}},\tilde{\mathbf{k}})$, where $\tilde{\mathbf{m}} = (-m_3, -m_2, -m_1)$ and $\tilde{\mathbf{k}} = (-k_2, -k_1)$, we also obtain

\[
c_j^2(\mathbf{m},\mathbf{k}) = c_j^2(w, k, n, \ell) = a_{4-j}^2(w + n + k,\; \ell - k,\; -n - \ell,\; \ell).
\]

Then to rewrite (19) in terms of these new parameters we shall omit the parameters (n, ℓ) since all the spherical functions involved, except φ and ψ, are of the same type. We also put $b_j^2(w, k) = c_j^2(w, k, n - 1, \ell)$ and $a_j^2 = a_j^2(w, k) = a_j^2(w, k, n, \ell)$, for short. More explicitly,

\[
b_1^2(w, k) = \frac{(w + \ell + 1)\,w}{(2w + \ell + n + k + 1)(w + \ell - k + 1)},
\]

\[
b_2^2(w, k) = \frac{k(\ell - k + 1)}{(w + n + 2k)(w + \ell - k + 1)},
\]

\[
b_3^2(w, k) = \frac{(w + n + k)(w + \ell + n + k + 1)}{(w + n + 2k)(2w + \ell + n + k + 1)}.
\]
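The explicit expressions for $b_j^2$ can be checked against their definition $b_j^2(w,k) = c_j^2(w,k,n-1,\ell) = a_{4-j}^2(w+n-1+k,\ \ell-k,\ -n+1-\ell,\ \ell)$. The sketch below (names are ours) does this with exact fractions; to keep every denominator nonzero it restricts to $n \ge 1$, which is an assumption of the test range rather than of the formulas.

```python
from fractions import Fraction


def a_sq(w, k, n, ell):
    """a_1^2, a_2^2, a_3^2 in the parameters (w, k, n, ell)."""
    return (Fraction((w + 1) * (w + ell + 2), (w + ell - k + 1) * (2 * w + ell + n + k + 2)),
            Fraction((ell - k) * (k + 1), (w + ell - k + 1) * (w + n + 2 * k + 1)),
            Fraction((w + ell + n + k + 1) * (w + n + k),
                     (2 * w + ell + n + k + 2) * (w + n + 2 * k + 1)))


def c_sq(w, k, n, ell):
    """c_j^2(w, k, n, ell) = a_{4-j}^2(w + n + k, ell - k, -n - ell, ell)."""
    dual = a_sq(w + n + k, ell - k, -n - ell, ell)
    return dual[2], dual[1], dual[0]


def b_sq_explicit(w, k, n, ell):
    """The explicit expressions for b_1^2, b_2^2, b_3^2."""
    return (Fraction((w + ell + 1) * w, (2 * w + ell + n + k + 1) * (w + ell - k + 1)),
            Fraction(k * (ell - k + 1), (w + n + 2 * k) * (w + ell - k + 1)),
            Fraction((w + n + k) * (w + ell + n + k + 1),
                     (w + n + 2 * k) * (2 * w + ell + n + k + 1)))


for n in range(1, 4):
    for ell in range(4):
        for k in range(ell + 1):
            for w in range(4):
                b = b_sq_explicit(w, k, n, ell)
                assert b == c_sq(w, k, n - 1, ell)
                assert sum(b) == 1 and sum(a_sq(w, k, n, ell)) == 1
```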

Then (19) reads

\[
\begin{aligned}
\varphi(g)\psi(g)\Phi(w, k; g)
={}& \bigl(a_1^2 b_1^2(w + 1, k) + a_2^2 b_2^2(w, k + 1) + a_3^2 b_3^2(w, k)\bigr)\Phi(w, k; g) \\
&+ a_1^2 b_2^2(w + 1, k)\,\Phi(w + 1, k - 1; g) + a_1^2 b_3^2(w + 1, k)\,\Phi(w + 1, k; g) \\
&+ a_2^2 b_1^2(w, k + 1)\,\Phi(w - 1, k + 1; g) + a_2^2 b_3^2(w, k + 1)\,\Phi(w, k + 1; g) \\
&+ a_3^2 b_1^2(w, k)\,\Phi(w - 1, k; g) + a_3^2 b_2^2(w, k)\,\Phi(w, k - 1; g).
\end{aligned} \tag{20}
\]

Remark. The above identity holds for 0 ≤ k ≤ ℓ, 0 ≤ w, 0 ≤ w + n + k, even when some spherical functions on the right-hand side of (20) are not defined, because in such cases the coefficients vanish.

Now we shall encode the set of identities (20) in a set of three term recursion relations by introducing the necessary notation.

For w ≥ max{0, −n} we define a column vector Φ(w, g) of ℓ + 1 spherical functions of type (n, ℓ) as follows

\[
\Phi(w, g) = \Phi(w, n, \ell; g) = (\Phi(w, 0, n, \ell; g), \dots, \Phi(w, \ell, n, \ell; g))^T.
\]

We also define the following (ℓ + 1) × (ℓ + 1) matrices

\[
A_w = \sum_{k=0}^{\ell} a_3^2(w, k)\,b_1^2(w, k)\,E_{kk} + \sum_{k=0}^{\ell-1} a_2^2(w, k)\,b_1^2(w, k + 1)\,E_{k,k+1},
\]

\[
\begin{aligned}
B_w ={}& \sum_{k=0}^{\ell} \bigl(a_1^2(w, k)\,b_1^2(w + 1, k) + a_2^2(w, k)\,b_2^2(w, k + 1) + a_3^2(w, k)\,b_3^2(w, k)\bigr)E_{kk} \\
&+ \sum_{k=1}^{\ell} a_3^2(w, k)\,b_2^2(w, k)\,E_{k,k-1} + \sum_{k=0}^{\ell-1} a_2^2(w, k)\,b_3^2(w, k + 1)\,E_{k,k+1},
\end{aligned}
\]

\[
C_w = \sum_{k=0}^{\ell} a_1^2(w, k)\,b_3^2(w + 1, k)\,E_{kk} + \sum_{k=1}^{\ell} a_1^2(w, k)\,b_2^2(w + 1, k)\,E_{k,k-1}.
\]

We recall that given two square matrices M and P the tensor product matrix M ⊗ P is the matrix obtained by blowing up each entry $M_{ij}$ of M to the matrix $M_{ij}P$. Now let

\[
\tilde A_w = A_w \otimes I, \qquad \tilde B_w = B_w \otimes I, \qquad \tilde C_w = C_w \otimes I,
\]

where I denotes the (ℓ + 1) × (ℓ + 1) identity matrix.

Then from (20) we obtain the following result.

THEOREM 5.2. For each type (n, ℓ), for all integers w ≥ max{0, −n} and all g ∈ G we have

\[
\varphi(g)\psi(g)\Phi(w, g) = \tilde A_w\Phi(w - 1, g) + \tilde B_w\Phi(w, g) + \tilde C_w\Phi(w + 1, g).
\]

Moreover $\tilde C_w$ is a nonsingular matrix.

Remark. If w = 0 we have $A_0 = 0$; this is useful to interpret the above identity since we have not defined Φ(−1, g). When n < 0 and w = −n the first component Φ(w − 1, 0, n, ℓ; g) of Φ(w − 1, g) is not defined, but the first column of $A_w$ is zero, so the identity still makes sense.
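To make the matrices of the recursion concrete, the following sketch builds $A_w$, $B_w$, $C_w$ from the coefficients $a_j^2$ and $b_j^2$ with exact fractions. It is illustrative only: the names are ours, and a product $a \cdot b$ is taken to be 0 whenever $a = 0$, following the convention of the Remark after (20) that coefficients of undefined terms vanish. Evaluating (20) at g = e, where every spherical function is the identity, suggests that each row of $A_w + B_w + C_w$ sums to 1, which the code checks.

```python
from fractions import Fraction


def a_sq(w, k, n, ell):
    return (Fraction((w + 1) * (w + ell + 2), (w + ell - k + 1) * (2 * w + ell + n + k + 2)),
            Fraction((ell - k) * (k + 1), (w + ell - k + 1) * (w + n + 2 * k + 1)),
            Fraction((w + ell + n + k + 1) * (w + n + k),
                     (2 * w + ell + n + k + 2) * (w + n + 2 * k + 1)))


def b_sq(w, k, n, ell):
    return (Fraction((w + ell + 1) * w, (2 * w + ell + n + k + 1) * (w + ell - k + 1)),
            Fraction(k * (ell - k + 1), (w + n + 2 * k) * (w + ell - k + 1)),
            Fraction((w + n + k) * (w + ell + n + k + 1),
                     (w + n + 2 * k) * (2 * w + ell + n + k + 1)))


def recursion_matrices(w, n, ell):
    """A_w, B_w, C_w as lists of rows; requires w >= max(0, -n)."""
    size = ell + 1
    A, B, C = ([[Fraction(0)] * size for _ in range(size)] for _ in range(3))

    def times(coef, ww, kk, j):
        # coef * b_j^2(ww, kk), read as 0 when coef vanishes (b may be undefined there)
        return Fraction(0) if coef == 0 else coef * b_sq(ww, kk, n, ell)[j - 1]

    for k in range(size):
        a1, a2, a3 = a_sq(w, k, n, ell)
        A[k][k] = times(a3, w, k, 1)
        B[k][k] = times(a1, w + 1, k, 1) + times(a2, w, k + 1, 2) + times(a3, w, k, 3)
        C[k][k] = times(a1, w + 1, k, 3)
        if k + 1 < size:
            A[k][k + 1] = times(a2, w, k + 1, 1)
            B[k][k + 1] = times(a2, w, k + 1, 3)
        if k >= 1:
            B[k][k - 1] = times(a3, w, k, 2)
            C[k][k - 1] = times(a1, w + 1, k, 2)
    return A, B, C


for n in range(-2, 3):
    for ell in range(4):
        for w in range(max(0, -n), max(0, -n) + 3):
            A, B, C = recursion_matrices(w, n, ell)
            for k in range(ell + 1):
                assert sum(A[k]) + sum(B[k]) + sum(C[k]) == 1
                assert C[k][k] != 0            # C_w is lower triangular and nonsingular
            if w == 0:
                assert all(x == 0 for row in A for x in row)
```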

Proof. An $(\ell + 1)^2 \times (\ell + 1)$ matrix V will be seen as an (ℓ + 1)-column vector $V = (V_0, \dots, V_\ell)^t$ of (ℓ + 1) × (ℓ + 1) matrices $V_k$. If M is an (ℓ + 1) × (ℓ + 1) matrix and $\tilde M = M \otimes I$, then

\[
(\tilde M\Phi(w, g))_k = \sum_{j=0}^{\ell} M_{kj}\,\Phi(w, j; g).
\]

In particular if $A_{ij}$, $B_{ij}$ and $C_{ij}$ denote, respectively, the ij-entries of the matrices $A_w$, $B_w$ and $C_w$ we have

\[
\begin{aligned}
(\tilde A_w\Phi(w - 1, g))_k &= A_{kk}\Phi(w - 1, k; g) + A_{k,k+1}\Phi(w - 1, k + 1; g), \\
(\tilde B_w\Phi(w, g))_k &= B_{k,k-1}\Phi(w, k - 1; g) + B_{k,k}\Phi(w, k; g) + B_{k,k+1}\Phi(w, k + 1; g), \\
(\tilde C_w\Phi(w + 1, g))_k &= C_{k,k-1}\Phi(w + 1, k - 1; g) + C_{k,k}\Phi(w + 1, k; g).
\end{aligned}
\]

Therefore, using (20) for any 0 ≤ k ≤ ℓ, we obtain

\[
(\tilde A_w\Phi(w - 1, g) + \tilde B_w\Phi(w, g) + \tilde C_w\Phi(w + 1, g))_k = \varphi(g)\psi(g)\Phi(w, g)_k.
\]


To prove the last statement it is enough to observe that $C_w$ is lower triangular, so that

\[
\det\tilde C_w = (\det C_w)^{\ell+1} \quad\text{and}\quad \det C_w = \prod_{k=0}^{\ell} a_1^2(w, k)\,b_3^2(w + 1, k) \neq 0.
\]

The proof of the theorem is completed. □

It is of interest to see how we can write the above theorem when we restrict the spherical functions to the Abelian subgroup A of G of all matrices of the form

\[
a(s) = \begin{pmatrix} \cos s & 0 & \sin s \\ 0 & 1 & 0 \\ -\sin s & 0 & \cos s \end{pmatrix},
\]

for any s ∈ R.

Let M be the centralizer of A in K. Then M consists of all elements of the form

\[
m(r) = \begin{pmatrix} e^{ir} & 0 & 0 \\ 0 & e^{-2ir} & 0 \\ 0 & 0 & e^{ir} \end{pmatrix},
\]

for any r ∈ R.

If Φ is a spherical function on G of type $\pi = \pi_{(n,\ell)} \in \hat K$ then Φ(a(s)) commutes with π(m) for all m ∈ M. On the other hand there exists a basis $\{v_0, \dots, v_\ell\}$ of $V_\pi$ such that $m(r)\cdot v_j = e^{ir(\ell - 2j - n)}v_j$, j = 0, …, ℓ. Therefore we have that Φ(a(s)) diagonalizes in such a basis for each s ∈ R. Let $\operatorname{diag}(\Phi_0(a(s)), \dots, \Phi_\ell(a(s)))$ be the corresponding diagonal matrix.

In the open subset {a(s) ∈ A : 0 < s < π/2} of A we introduce the coordinate $t = \cos^2(s)$ and define the vector valued function

\[
F(t) = (\Phi_0(a(s)), \dots, \Phi_\ell(a(s)))
\]

associated to the spherical function Φ. If Φ(g) = Φ(w, k; g) then we shall also put F(t) = F(w, k; t).

In a similar way, for w ≥ max{0, −n}, corresponding to the function Φ(w, g) we consider the (ℓ + 1) × (ℓ + 1) matrix valued function F(w, t) whose rows are given by the vectors F(w, k; t) for k = 0, 1, …, ℓ. More explicitly

\[
F(w, t) = (F_{ij}(w, t)) \quad\text{with}\quad F_{ij}(w, t) = \Phi_j(w, i; a(s)).
\]

PROPOSITION 5.3. For each fixed type (n, ℓ), for all integers w ≥ max{0, −n} and all 0 < t < 1 we have

\[
tF(w, t) = A_wF(w - 1, t) + B_wF(w, t) + C_wF(w + 1, t). \tag{21}
\]

Proof. We recall that φ(g) is the spherical function of type (−1, 0) associated to the G-module $W = \mathbb{C}^3$ and that ψ(g) is the spherical function of type (1, 0) associated to $W^*$. A direct computation gives

\[
\varphi(a(s)) = \psi(a(s)) = \cos s.
\]
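The direct computation can be illustrated numerically. The sketch below rests on assumptions of ours: we take $\varphi(g) = \langle g e_3, e_3\rangle = g_{33}$ (the K-submodule of W of type (−1, 0) being spanned by the GL(2, C)-invariant vector $e_3$) and $\psi(g) = \varphi(g^{-1})$ from the definition of the dual spherical function, both of which are scalar valued here. With these conventions $\varphi(a(s))\psi(a(s)) = \cos^2 s = t$.

```python
import math


def a(s):
    """The matrix a(s) of the subgroup A."""
    c, sn = math.cos(s), math.sin(s)
    return [[c, 0.0, sn], [0.0, 1.0, 0.0], [-sn, 0.0, c]]


def phi(g):
    """Assumed form of phi: the matrix coefficient <g e3, e3> = g_33."""
    return g[2][2]


def psi_on_A(s):
    """Assumed form of psi on A: phi(a(s)^{-1}) = phi(a(-s))."""
    return phi(a(-s))


for s in (0.1, 0.7, 1.3):
    t = math.cos(s) ** 2
    assert abs(phi(a(s)) - math.cos(s)) < 1e-12
    assert abs(psi_on_A(s) - math.cos(s)) < 1e-12
    assert abs(phi(a(s)) * psi_on_A(s) - t) < 1e-12
```

This is why the factor φ(g)ψ(g) of Theorem 5.2 becomes multiplication by t in (21).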


Therefore from the identity (20), for g = a(s), we obtain

\[
\begin{aligned}
tF(w, k; t) ={}& (A_w)_{kk}F(w - 1, k; t) + (A_w)_{k,k+1}F(w - 1, k + 1; t) \\
&+ (B_w)_{k,k-1}F(w, k - 1; t) + (B_w)_{k,k}F(w, k; t) + (B_w)_{k,k+1}F(w, k + 1; t) \\
&+ (C_w)_{k,k-1}F(w + 1, k - 1; t) + (C_w)_{kk}F(w + 1, k; t).
\end{aligned}
\]

This is nothing but the equality of the kth rows of the identity (21). □

Finally we want to relate these functions F(t) = F(w, t) with the functions H(t) = H(t, w) in [4] and [5] associated to the spherical functions. In [4, Section 4] we considered the function $H(g) = \Phi(g)\Phi_\pi(g)^{-1}$, associated to a spherical function Φ(g) of type $\pi = \pi_{n,\ell} \in \hat K$, and its restriction H(t) = H(a(s)) where $t = \cos^2(s)$. There we viewed the diagonal matrix H(t) as a column vector. Then it is easy to verify that

\[
F(t) = t^{n/2}\,H(t)^t \begin{pmatrix} t^{1/2} & 0 \\ 0 & 1 \end{pmatrix}^{\ell},
\]

where the exponent ℓ denotes the ℓth symmetric power of the matrix. Explicitly $\begin{pmatrix} t^{1/2} & 0 \\ 0 & 1 \end{pmatrix}^{\ell}$ is a diagonal matrix whose jth entry is $t^{(\ell - j)/2}$, with 0 ≤ j ≤ ℓ.

In [5, Section 3], for ℓ ≥ 0, w ≥ max{0, −n} we also defined the matrix valued function H(t) = H(t, w) whose rows are given by the vectors H(w, k; t) for 0 ≤ k ≤ ℓ. Then

\[
F(t) = t^{n/2}\,H(t) \begin{pmatrix} t^{1/2} & 0 \\ 0 & 1 \end{pmatrix}^{\ell}.
\]

By multiplying both sides of (21) on the right by $t^{-n/2}\begin{pmatrix} t^{1/2} & 0 \\ 0 & 1 \end{pmatrix}^{-\ell}$ we obtain

\[
tH(t, w) = A_wH(t, w - 1) + B_wH(t, w) + C_wH(t, w + 1),
\]

which is exactly the three term recursion relation established in [5, Theorem 3.7]. □

Appendix

For completeness we include here the proof of Theorem 4.6. To reach this goal we need some preparatory material. We start by considering the particular case when $m_1 = k_1$.

PROPOSITION 6.1. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector. Then for any n ∈ N we have

\[
\|E_{32}^n v\|^2 = n!\,\frac{(m_2 - m_3)!}{(m_2 - m_3 - n)!}\,\|v\|^2 = n!\,[m_2 - m_3]_n\,\|v\|^2.
\]


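Proof. It is not difficult to prove, by induction on n, that the following identity holds

\[
E_{23}^n E_{32} = E_{32}E_{23}^n + n(E_{22} - E_{33} - n + 1)E_{23}^{n-1}. \tag{22}
\]

Now for n ∈ N we have

\[
E_{23}^n E_{32}^{n-1} v = 0. \tag{23}
\]

In fact for n = 1 it is true, since v is GL(3, C) dominant. If we assume that it holds for n ≥ 1 then, by using (22), we obtain

\[
\begin{aligned}
E_{23}^{n+1}E_{32}^n v &= E_{23}\bigl(E_{32}E_{23}^n + n(E_{22} - E_{33} - n + 1)E_{23}^{n-1}\bigr)E_{32}^{n-1}v \\
&= E_{23}E_{32}E_{23}^nE_{32}^{n-1}v + n(E_{22} - E_{33} - n - 1)E_{23}^nE_{32}^{n-1}v = 0.
\end{aligned}
\]

From (22) and (23) it follows that

\[
\begin{aligned}
E_{23}^nE_{32}^n v &= E_{32}E_{23}^nE_{32}^{n-1}v + n(E_{22} - E_{33} - n + 1)E_{23}^{n-1}E_{32}^{n-1}v \\
&= n(E_{22} - E_{33} - n + 1)E_{23}^{n-1}E_{32}^{n-1}v \\
&= n(m_2 - m_3 - n + 1)E_{23}^{n-1}E_{32}^{n-1}v,
\end{aligned}
\]

by using that the weight of the vector $E_{23}^{n-1}E_{32}^{n-1}v$ is $(m_1, m_2, m_3)$. Now we get

\[
\begin{aligned}
\|E_{32}^n v\|^2 &= \langle v, E_{23}^nE_{32}^n v\rangle = n(m_2 - m_3 - n + 1)\langle v, E_{23}^{n-1}E_{32}^{n-1}v\rangle \\
&= n(m_2 - m_3 - n + 1)\,\|E_{32}^{n-1}v\|^2.
\end{aligned} \tag{24}
\]

Thus the proposition follows by induction on n. □

PROPOSITION 6.2. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector. Let $a = m_1 - k_1$, $b = m_2 - k_2$ and $w = E_{32}^b v$. Then we have

\[
\|\Omega_\mu(v)\|^2 = [m_1 - m_2 + b + 1]_a\,\langle w, E_{13}^a\nabla_{31}^a w\rangle.
\]

Proof. We have

\[
\langle\Omega_\mu(v), \Omega_\mu(v)\rangle = \langle\nabla_{32}^b(v), \nabla_{13}^a\Omega_\mu(v)\rangle.
\]

To compute $\nabla_{13}^a\Omega_\mu(v)$ we first observe that

\[
E_{12}(E_{11} - E_{22} + 1)E_{13} = (E_{11} - E_{22} - 1)E_{12}E_{13} = (E_{11} - E_{22} - 1)E_{13}E_{12}.
\]

Then $\nabla_{13}^a\Omega_\mu(v) = ((E_{11} - E_{22} + 1)E_{13})^a\,\Omega_\mu(v)$ since $E_{12}\Omega_\mu(v) = 0$. Now by induction on a we have

\[
((E_{11} - E_{22} + 1)E_{13})^a = [E_{11} - E_{22} + 1]_a\,E_{13}^a.
\]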


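Therefore we obtain

\[
\begin{aligned}
\nabla_{13}^a\Omega_\mu(v) &= ((E_{11} - E_{22} + 1)E_{13})^a\,\Omega_\mu(v) = [E_{11} - E_{22} + 1]_a\,E_{13}^a\Omega_\mu(v) \\
&= [m_1 - m_2 + b + 1]_a\,E_{13}^a\Omega_\mu(v),
\end{aligned}
\]

because the weight of the vector $E_{13}^a\Omega_\mu(v)$ is $(m_1, m_2 - b, m_3 + b)$. Finally

\[
\begin{aligned}
\langle\Omega_\mu(v), \Omega_\mu(v)\rangle &= [m_1 - m_2 + b + 1]_a\,\langle w, E_{13}^a\Omega_\mu(v)\rangle \\
&= [m_1 - m_2 + b + 1]_a\,\langle w, E_{13}^a\nabla_{31}^a w\rangle. \qquad \square
\end{aligned}
\]

LEMMA 6.3. In U(g) we have

(i) $E_{13}^nE_{21} = E_{21}E_{13}^n - nE_{13}^{n-1}E_{23}$.
(ii) $E_{13}E_{32}^n = E_{32}^nE_{13} + nE_{32}^{n-1}E_{12}$.
(iii) $E_{13}^nE_{31} = n(E_{11} - E_{33} - n + 1)E_{13}^{n-1} + E_{31}E_{13}^n$.
(iv) $E_{13}^{n+1}\nabla_{31}^n \equiv 0 \mod (U(g)E_{12} + U(g)E_{13})$.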
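Identities (i) to (iii) hold in U(g), hence in every representation, so they can be sanity-checked on a concrete one. The sketch below uses $(\mathbb{C}^3)^{\otimes 3}$ with the derivation action; the representation, the helper names and the range n = 1, 2, 3 are choices of ours. Passing the check does not prove the identities, it only rules out sign or index slips. The same code also tests identity (22) from the proof of Proposition 6.1.

```python
from itertools import product


def E(i, j, x):
    """Matrix unit E_ij acting as a derivation on a sparse tensor {indices: coeff}."""
    out = {}
    for idx, c in x.items():
        for pos, s in enumerate(idx):
            if s == j:
                key = idx[:pos] + (i,) + idx[pos + 1:]
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}


def lin(*terms):
    """Linear combination of sparse tensors, given as (coefficient, tensor) pairs."""
    out = {}
    for c, x in terms:
        for key, val in x.items():
            out[key] = out.get(key, 0) + c * val
    return {key: c for key, c in out.items() if c != 0}


def power(i, j, n, x):
    """E_ij^n applied to x."""
    for _ in range(n):
        x = E(i, j, x)
    return x


def check_identities(x, n):
    """Apply both sides of Lemma 6.3 (i)-(iii) and of (22) to the tensor x."""
    p13, q13 = power(1, 3, n, x), power(1, 3, n - 1, x)
    ok_i = power(1, 3, n, E(2, 1, x)) == lin(
        (1, E(2, 1, p13)), (-n, power(1, 3, n - 1, E(2, 3, x))))
    ok_ii = E(1, 3, power(3, 2, n, x)) == lin(
        (1, power(3, 2, n, E(1, 3, x))), (n, power(3, 2, n - 1, E(1, 2, x))))
    ok_iii = power(1, 3, n, E(3, 1, x)) == lin(
        (n, E(1, 1, q13)), (-n, E(3, 3, q13)), (-n * (n - 1), q13), (1, E(3, 1, p13)))
    q23 = power(2, 3, n - 1, x)
    ok_22 = power(2, 3, n, E(3, 2, x)) == lin(
        (1, E(3, 2, power(2, 3, n, x))),
        (n, E(2, 2, q23)), (-n, E(3, 3, q23)), (-n * (n - 1), q23))
    return ok_i and ok_ii and ok_iii and ok_22


basis = [{idx: 1} for idx in product((1, 2, 3), repeat=3)]
for n in (1, 2, 3):
    assert all(check_identities(x, n) for x in basis)
```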

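Proof. (i) Since $E_{21}(E_{13}) = [E_{21}, E_{13}] = E_{23}$, and $E_{23}$ and $E_{13}$ commute, we have

\[
E_{21}(E_{13}^n) = nE_{13}^{n-1}E_{23}.
\]

Therefore

\[
E_{21}E_{13}^n - E_{13}^nE_{21} = nE_{13}^{n-1}E_{23}.
\]

This proves (i), and in the same way we obtain (ii).

(iii) For n = 1 the statement is clearly true. We assume that the identity holds for n ≥ 1 and we get

\[
\begin{aligned}
E_{13}^{n+1}E_{31} &= nE_{13}(E_{11} - E_{33} - n + 1)E_{13}^{n-1} + E_{13}E_{31}E_{13}^n \\
&= n(E_{11} - E_{33} - n - 1)E_{13}^n + (E_{11} - E_{33})E_{13}^n + E_{31}E_{13}^{n+1} \\
&= (n + 1)(E_{11} - E_{33} - n)E_{13}^n + E_{31}E_{13}^{n+1}.
\end{aligned}
\]

This completes the proof of (iii).

(iv) For n = 0 the assertion is obvious. Proceeding by induction, for n ≥ 1 we assume that $E_{13}^n\nabla_{31}^{n-1} \equiv 0 \mod (U(g)E_{12} + U(g)E_{13})$. We have

\[
\begin{aligned}
E_{13}^{n+1}\nabla_{31}^n &= E_{13}^{n+1}\bigl((E_{11} - E_{22} + 2)E_{31} + E_{21}E_{32}\bigr)\nabla_{31}^{n-1} \\
&= (E_{11} - E_{22} - n + 1)E_{13}^{n+1}E_{31}\nabla_{31}^{n-1} + E_{13}^{n+1}E_{21}\nabla_{31}^{n-1}E_{32},
\end{aligned} \tag{25}
\]

where we have used that $E_{32}\nabla_{31} = \nabla_{31}E_{32}$ (see Lemma 4.2(i)).

By (iii) we have

\[
\begin{aligned}
(E_{11} - E_{22} - n + 1)E_{13}^{n+1}E_{31}\nabla_{31}^{n-1}
={}& (n + 1)(E_{11} - E_{22} - n + 1)(E_{11} - E_{33} - n)E_{13}^n\nabla_{31}^{n-1} \\
&+ (E_{11} - E_{22} - n + 1)E_{31}E_{13}^{n+1}\nabla_{31}^{n-1} \\
\equiv{}& 0 \mod (U(g)E_{12} + U(g)E_{13}).
\end{aligned}
\]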


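By (i) we have

\[
E_{13}^{n+1}E_{21}\nabla_{31}^{n-1}E_{32} = E_{21}E_{13}^{n+1}\nabla_{31}^{n-1}E_{32} - (n + 1)E_{23}E_{13}^n\nabla_{31}^{n-1}E_{32}.
\]

Now we note that

\[
E_{12}E_{32} = E_{32}E_{12} \quad\text{and}\quad E_{13}E_{32} = E_{32}E_{13} + E_{12}.
\]

Therefore the left ideal $U(g)E_{12} + U(g)E_{13}$ is invariant under right multiplication by $E_{32}$. Then by the inductive hypothesis we obtain

\[
E_{13}^{n+1}E_{21}\nabla_{31}^{n-1}E_{32} \equiv 0 \mod (U(g)E_{12} + U(g)E_{13}).
\]

By replacing in (25) we obtain $E_{13}^{n+1}\nabla_{31}^n \equiv 0 \mod (U(g)E_{12} + U(g)E_{13})$, and this completes the proof of (iv). □

PROPOSITION 6.4. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector. Let $w = E_{32}^b v$. Then

\[
\langle w, E_{13}^n\nabla_{31}^n w\rangle = n!\,[m_1 - m_3 + 1]_n\,[m_1 - m_2]_n\,\|w\|^2.
\]

Proof. We have

\[
\begin{aligned}
E_{13}^n\nabla_{31}^n &= E_{13}^n\bigl((E_{11} - E_{22} + 2)E_{31} + E_{21}E_{32}\bigr)\nabla_{31}^{n-1} \\
&= (E_{11} - E_{22} + 2 - n)E_{13}^nE_{31}\nabla_{31}^{n-1} + E_{13}^nE_{21}\nabla_{31}^{n-1}E_{32}.
\end{aligned}
\]

By Lemmas 6.3(i) and (iii) we get

\[
\begin{aligned}
E_{13}^n\nabla_{31}^n ={}& n(E_{11} - E_{22} + 2 - n)(E_{11} - E_{33} - n + 1)E_{13}^{n-1}\nabla_{31}^{n-1} \\
&+ (E_{11} - E_{22} + 2 - n)E_{31}E_{13}^n\nabla_{31}^{n-1} + E_{21}E_{13}^n\nabla_{31}^{n-1}E_{32} - nE_{23}E_{13}^{n-1}\nabla_{31}^{n-1}E_{32}.
\end{aligned}
\]

The vector $w = E_{32}^b v$ is of weight $(m_1, m_2 - b, m_3 + b)$ and we note that $E_{12}w = E_{12}E_{32}^b v = 0$, and by Lemma 6.3(ii) we obtain $E_{13}w = E_{13}E_{32}^b v = 0$. Then by Lemma 6.3(iv) we have $E_{13}^n\nabla_{31}^{n-1}w = 0$ and $E_{13}^n\nabla_{31}^{n-1}E_{32}w = 0$. Therefore

\[
E_{13}^n\nabla_{31}^n w = n(m_1 - m_2 + b + 2 - n)(m_1 - m_3 - b - n + 1)E_{13}^{n-1}\nabla_{31}^{n-1}w - nE_{23}E_{13}^{n-1}\nabla_{31}^{n-1}E_{32}w.
\]

Then

\[
\begin{aligned}
\langle w, E_{13}^n\nabla_{31}^n w\rangle ={}& n(m_1 - m_2 + b + 2 - n)(m_1 - m_3 - b - n + 1)\langle w, E_{13}^{n-1}\nabla_{31}^{n-1}w\rangle \\
&- n\langle E_{32}w, E_{13}^{n-1}\nabla_{31}^{n-1}E_{32}w\rangle.
\end{aligned} \tag{26}
\]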


Recall that by (24) we have

\[
\|E_{32}w\|^2 = (b + 1)(m_2 - m_3 - b)\,\|w\|^2.
\]

Now we shall prove the proposition by induction on n.

For n = 1, by using (26) we have that

\[
\begin{aligned}
\langle w, E_{13}\nabla_{31}w\rangle &= (m_1 - m_2 + b + 1)(m_1 - m_3 - b)\|w\|^2 - \|E_{32}w\|^2 \\
&= (m_1 - m_2)(m_1 - m_3 + 1)\|w\|^2.
\end{aligned}
\]

Let us assume that the identity in the statement of the proposition is true for n − 1 and any b. Then we have

\[
\langle w, E_{13}^{n-1}\nabla_{31}^{n-1}w\rangle = (n - 1)!\,[m_1 - m_3 + 1]_{n-1}\,[m_1 - m_2]_{n-1}\,\|w\|^2
\]

and

\[
\begin{aligned}
\langle E_{32}w, E_{13}^{n-1}\nabla_{31}^{n-1}E_{32}w\rangle
&= (n - 1)!\,[m_1 - m_3 + 1]_{n-1}\,[m_1 - m_2]_{n-1}\,\|E_{32}w\|^2 \\
&= (n - 1)!\,[m_1 - m_3 + 1]_{n-1}\,[m_1 - m_2]_{n-1}\,(b + 1)(m_2 - m_3 - b)\,\|w\|^2.
\end{aligned}
\]

Thus by replacing in (26) we obtain

\[
\begin{aligned}
\langle w, E_{13}^n\nabla_{31}^n w\rangle
&= n!\,[m_1 - m_3 + 1]_{n-1}\,[m_1 - m_2]_{n-1}\bigl((m_1 - m_2 + b + 2 - n)(m_1 - m_3 - b - n + 1) - (b + 1)(m_2 - m_3 - b)\bigr)\|w\|^2 \\
&= n!\,[m_1 - m_3 + 1]_{n-1}\,[m_1 - m_2]_{n-1}\,(m_1 - m_3 - n + 2)(m_1 - m_2 - n + 1)\,\|w\|^2 \\
&= n!\,[m_1 - m_3 + 1]_n\,[m_1 - m_2]_n\,\|w\|^2.
\end{aligned}
\]

This completes the proof of the proposition. □

Finally we are in position to give the proof of Theorem 4.6.

THEOREM 4.6. Let $v \in V_{\mathbf{m}}$ be a GL(3, C) highest weight vector. Then

\[
\|\Omega_\mu(v)\|^2 = (m_1 - k_1)!\,(m_2 - k_2)!\,[m_1 - k_2 + 1]_{m_1 - k_1}\,[m_1 - m_3 + 1]_{m_1 - k_1}\,[m_1 - m_2]_{m_1 - k_1}\,[m_2 - m_3]_{m_2 - k_2}\,\|v\|^2.
\]

Proof. Let $w = E_{32}^{m_2 - k_2}v$. By Proposition 6.2 we have

\[
\|\Omega_\mu(v)\|^2 = [m_1 - k_2 + 1]_{m_1 - k_1}\,\langle w, E_{13}^{m_1 - k_1}\nabla_{31}^{m_1 - k_1}w\rangle.
\]

By using Proposition 6.4 we obtain

\[
\langle w, E_{13}^{m_1 - k_1}\nabla_{31}^{m_1 - k_1}w\rangle = (m_1 - k_1)!\,[m_1 - m_3 + 1]_{m_1 - k_1}\,[m_1 - m_2]_{m_1 - k_1}\,\|w\|^2.
\]


By Proposition 6.1 we have

\[ \|w\|^2 = (m_2 - k_2)!\,[m_2 - m_3]_{m_2-k_2}\|v\|^2. \]

Finally, putting these things together, the theorem follows. □

Acknowledgements

It is a pleasure to thank Prof. F. A. Grünbaum for introducing us to the bispectral problem and for his continuous encouragement.

References

1. Bochner, S.: Über Sturm-Liouvillesche Polynomsysteme, Math. Z. 29 (1929), 730–736.
2. Duran, A. and Van Assche, W.: Orthogonal matrix polynomials and higher order recurrence relations, Linear Algebra Appl. 219 (1995), 261–280.
3. Gangolli, R. and Varadarajan, V. S.: Harmonic Analysis of Spherical Functions on Real Reductive Groups, Ergeb. Math. Grenzgeb. 101, Springer-Verlag, Berlin, 1988.
4. Grünbaum, F. A., Pacharoni, I. and Tirao, J.: Matrix valued spherical functions associated to the complex projective plane, J. Funct. Anal. 188 (2002), 350–441.
5. Grünbaum, F. A., Pacharoni, I. and Tirao, J.: A matrix valued solution to Bochner’s problem, J. Phys. A 34 (2001), 10647–10656.
6. Grünbaum, F. A., Pacharoni, I. and Tirao, J.: Spherical functions associated to the three dimensional hyperbolic space, Internat. J. Math. 13(7) (2002), 727–784.
7. Humphreys, J.: Introduction to Lie Algebras and Representation Theory, Springer-Verlag, New York, 1972.
8. Knapp, A.: Lie Groups beyond an Introduction, Progr. Math., Birkhäuser, Boston, 1996.
9. Tirao, J.: Spherical functions, Rev. Un. Mat. Argentina 28 (1977), 75–98.
10. Zelobenko, D. P.: Compact Lie Groups and Their Representations, Transl. Math. Monographs, Amer. Math. Soc., Providence, RI, 1973.


Mathematical Physics, Analysis and Geometry 7: 223–237, 2004.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Four-Vertex Theorems, Sturm Theory and Lagrangian Singularities

To Alain Chenciner on his 60th birthday

RICARDO URIBE-VARGAS
Collège de France, 11, Pl. Marcelin–Berthelot, 75005 Paris, France. e-mail: [email protected]

(Received: 20 March 2003; in final form: 7 November 2003)

Abstract. We prove that the vertices of a curve $\gamma \subset \mathbb{R}^n$ are critical points of the radius of the osculating hypersphere. Using Sturm theory, we give a new proof of the $(2k+2)$-vertex theorem for convex curves in the Euclidean space $\mathbb{R}^{2k}$. We obtain a very practical formula to calculate the vertices of a curve in $\mathbb{R}^n$. We apply our formula and Sturm theory to calculate the number of vertices of the generalized ellipses in $\mathbb{R}^{2k}$. Moreover, we explain the relations between vertices of curves in Euclidean $n$-space, singularities of caustics and Sturm theory (for the fundamental systems of solutions of disconjugate homogeneous linear differential operators $L: C^\infty(S^1) \to C^\infty(S^1)$).

Mathematics Subject Classifications (2000): 51L15, 53A04, 53A07, 53C99, 53D05, 53D12, 58K35, 47B25.

Key words: caustic, Lagrangian manifold, singularity, space curve, Sturm theory, vertex.

Introduction

The geometry of curves is a classical subject which relates geometrical intuition with analysis and topology.

Sturm theory and oscillatory properties of solutions have a clear geometrical interpretation in terms of the geometry of the curve.

In particular, to estimate the possible number of special points, e.g., flattenings (points at which the last torsion vanishes) for different types of closed curves is an important problem involving topological, symplectic and analytic methods.

In [2], V. I. Arnold pointed out that “most of the facts of the differential geometry of submanifolds of Euclidean or of Riemannian space may be translated into the language of contact (or symplectic) geometry and may be proved in this more general setting. Thus we can use the intuition of Euclidean or Riemannian geometry to guess general results of contact (or symplectic) geometry, whose applications to the problem of ordinary differential geometry provide new information in this classical domain.” The classical four-vertex theorem, [15], asserts that a closed convex plane curve has at least 4 critical points of the curvature. It is easy to construct an arbitrarily small perturbation of the Euclidean metric of the plane in a neighbourhood of the unit circle $C$ such that the curve $C$ has only two critical points


of the curvature in the obtained Riemannian plane. The validity of the four-vertex theorem, in the particular case of a Riemannian metric, may be re-established if the vertices of a curve are defined in the following way:

For each point of our curve, consider the geodesic issuing from that point perpendicularly to the curve and in the direction of the inward normal. The point of intersection of such a geodesic with an infinitely close geodesic normal is said to be a conjugate point (along the original normal). All these conjugate points form the caustic of the original curve (the envelope of the family of geodesic normals). The points of our curve corresponding to the singular points of the caustic are called the vertices of the curve (the generic singular points of the caustic are semi-cubic cusps).

So the four-vertex theorem in the Riemannian case asserts that: The caustic of a generic closed convex curve has at least four cusps (counted geometrically). If the curve is nongeneric, then multiplicities must be counted. For instance, the level sets $f = c > 0$ of the function

\[ f(x, y) = x^2 + y^2 + \alpha(x^2 - y^2) + 2\beta xy + x^3 - 3xy^2, \]

with $0 < \alpha^2 + \beta^2 < 0.1$, are convex curves near the origin. For $c$ sufficiently small, the curve $f = c$ has 4 vertices, while for $c > 0.05$, the curve $f = c$ has 6 vertices. For some intermediate value $c = c_0$, the curve $f = c_0$ has 5 vertices, one of them with multiplicity 2.

EXAMPLE. The caustic of a circle is a single (singular!) point, with infinite multiplicity: all points of the circle are vertices.

We give a formula to calculate the vertices of a curve in $\mathbb{R}^n$ and a new proof of a higher dimensional 4-vertex theorem ([20], Theorem 1 below), applying Sturm theory and the theory of Lagrangian singularities. We show the relations between the vertices of curves, the singularities of caustics and Sturm theory of fundamental systems of solutions of disconjugate homogeneous linear differential operators $L: C^\infty(S^1) \to C^\infty(S^1)$.

1. Statement of Results on Vertices

In the sequel $\mathbb{R}^n$ will denote a Euclidean space and we assume that the derivatives of order $1, \dots, n-1$ of our curves are linearly independent at any point (this is true for generic curves).

DEFINITION 1. Let $M$ be a $d$-dimensional submanifold of $\mathbb{R}^n$, considered as a complete intersection: $M = \{x \in \mathbb{R}^n : g_1(x) = \dots = g_{n-d}(x) = 0\}$. We say that $k$ is the order of contact of a curve $\gamma: t \mapsto \gamma(t) \in \mathbb{R}^n$ with the submanifold $M$, or that $\gamma$ and $M$ have $k$-point contact, at a point $\gamma(t_0)$, if each function $g_1 \circ \gamma, \dots, g_{n-d} \circ \gamma$ has a zero of multiplicity at least $k$ at $t = t_0$, and at least one of them has a zero of multiplicity $k$ at $t = t_0$.


Remark. If one needs to make this definition more invariant, one could denote the image of $\gamma$ by $\Gamma$ and then write that the order of contact at a point is the minimum of the multiplicities of zero among the functions of the form $g|_\Gamma: \Gamma \to \mathbb{R}$, at that point, where $g: \mathbb{R}^n \to \mathbb{R}$ belongs to the generating ideal of $M$ and we assume that 0 is a regular value of $g$.

EXAMPLE. A smooth curve in $\mathbb{R}^n$ has 2-point contact with its tangent line (at the point of tangency) for the generic points of the curve. The curve $y = x^3$ has 3-point contact with the line $y = 0$, at the origin: the equation $x^3 = 0$ has a root of multiplicity 3.

By convention, the $k$-dimensional affine subspaces of the Euclidean space $\mathbb{R}^{m+1}$ will be considered as $k$-dimensional spheres of infinite radius.

DEFINITION 2. For $k = 1, \dots, n-1$, a $k$-osculating sphere at a point of a curve in the Euclidean space $\mathbb{R}^n$ is a $k$-dimensional sphere having at least $(k+2)$-point contact with the curve at that point. For $k = n-1$ we will simply write osculating hypersphere.

EXAMPLE. A generic plane curve and its osculating circle have 3-point contact at an ordinary point of the curve.

DEFINITION 3. A vertex of a curve in $\mathbb{R}^n$ is a point at which the curve has at least $(n+2)$-point contact with its osculating hypersphere.

EXAMPLE. A noncircular ellipse in the plane $\mathbb{R}^2$ has 4 vertices. They are the points at which the ellipse intersects its principal axes.

DEFINITION 4. An embedded closed curve in $\mathbb{R}^n$ (or in $\mathbb{R}P^n$) is called convex if it intersects any hyperplane (or projective hyperplane, respectively) at no more than $n$ points, taking multiplicities into account.

EXAMPLE. A closed plane curve is convex if it intersects any straight line in at most two points, taking multiplicities into account.

EXAMPLE. For $n = 2k$, the generalized ellipse, defined as the image of the embedding $t \mapsto (\cos t, \sin t, \cos 2t, \sin 2t, \dots, \cos kt, \sin kt)$, is convex in $\mathbb{R}^{2k}$. Indeed, the number of intersection points of this curve with the hyperplane given by the equation

\[ a_0 + \sum_{j=1}^{k}(a_j x_j + b_j y_j) = 0 \]

is equal to the number of zeroes of the trigonometric polynomial

\[ f(t) = a_0 + \sum_{j=1}^{k}(a_j \cos jt + b_j \sin jt), \]

which is at most $2k$.
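The bound on the number of zeroes can be probed numerically. The sketch below is only an illustration under stated assumptions: the helper names, the random coefficient ranges and the grid size are all hypothetical choices, not part of the paper. It samples random hyperplanes and counts sign changes of the trigonometric polynomial on a uniform grid. A grid can miss zeroes but never invents them, so the count it reports is a lower bound for the true number and should never exceed $2k$.

```python
import math
import random


def trig_poly(a0, a, b, t):
    """Evaluate a0 + sum_j (a[j-1] cos(j t) + b[j-1] sin(j t))."""
    return a0 + sum(
        aj * math.cos(j * t) + bj * math.sin(j * t)
        for j, (aj, bj) in enumerate(zip(a, b), start=1)
    )


def count_sign_changes(f, samples=2000):
    """Count sign changes of f around the circle on a uniform (half-step offset) grid."""
    values = [f(2 * math.pi * (i + 0.5) / samples) for i in range(samples)]
    return sum(
        1 for i in range(samples) if values[i] * values[(i + 1) % samples] < 0
    )


def max_intersections(k, trials=50, seed=0):
    """Largest sign-change count found over randomly chosen hyperplane coefficients."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        a0 = rng.uniform(-1, 1)
        a = [rng.uniform(-1, 1) for _ in range(k)]
        b = [rng.uniform(-1, 1) for _ in range(k)]
        worst = max(worst, count_sign_changes(lambda t: trig_poly(a0, a, b, t)))
    return worst
```

For instance, `max_intersections(2)` samples hyperplanes in $\mathbb{R}^4$ and should return a value no larger than 4.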

The following theorem was proved in [10] and [20].

THEOREM 1. Any closed convex curve in $\mathbb{R}^{2k}$ has at least $2k+2$ vertices (counted geometrically).

This theorem holds for convex curves in the $2k$-sphere $S^{2k}$ ([20]) and also holds for convex curves in Lobachevskian $2k$-space $\Lambda^{2k}$ ([21]). A more general theorem was proved in [22] for a class of curves invariant under conformal transformations and containing the class of convex curves in $\mathbb{R}^{2k}$ (respectively in $S^{2k}$ and in $\Lambda^{2k}$).

Vertices of curves in Euclidean spaces and flattenings of curves in projective (or affine) spaces are related to Sturm theory. In Section 4, we give a new proof of Theorem 1 based on Sturm theory. This new proof allows us to give a formula to calculate the vertices of a curve in $\mathbb{R}^n$ as the zeroes of a determinant:

THEOREM 2. The vertices of any curve $\gamma: S^1 \to \mathbb{R}^n$ (or $\gamma: \mathbb{R} \to \mathbb{R}^n$), $\gamma: s \mapsto (\varphi_1(s), \dots, \varphi_n(s))$, are given by the solutions $s \in S^1$ (or $s \in \mathbb{R}$) of the equation

\[ \det(R_1, \dots, R_n, G) = 0, \]

where $R_i$ (respectively $G$) is the column vector defined by the first $n+1$ derivatives of $\varphi_i$ (of $g = \gamma^2/2$, respectively, where $\gamma^2 := \langle\gamma, \gamma\rangle$).

COROLLARY 1 (see also [23]). The vertices of any curve $\gamma: S^1 \to \mathbb{R}^n$ (or $\gamma: \mathbb{R} \to \mathbb{R}^n$), $\gamma: s \mapsto (\varphi_1(s), \dots, \varphi_n(s))$, correspond to the flattenings of the curve $\Gamma: S^1 \to \mathbb{R}^{n+1}$ (or $\Gamma: \mathbb{R} \to \mathbb{R}^{n+1}$),

\[ \Gamma: s \mapsto \Bigl(\varphi_1(s), \dots, \varphi_n(s), \frac{\gamma^2(s)}{2}\Bigr). \]

Remark. This means that the vertical projection of a curve $\gamma \subset \mathbb{R}^n$ to the paraboloid ‘of revolution’ $z = \tfrac12(x_1^2 + \dots + x_n^2)$ sends the vertices of the curve $\gamma$ onto the flattenings of its image.
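To see the determinant formula of Theorem 2 at work in the simplest case, the following Python sketch evaluates $\det(R_1, R_2, G)$ for the plane ellipse $(a\cos t, b\sin t)$ and counts its sign changes. It is an illustration only: the function names are hypothetical, the derivatives are entered by hand, and the grid-based count assumes the zeroes are simple. For this curve the determinant reduces to $\tfrac32 ab(a^2 - b^2)\sin 2t$, which vanishes exactly where the ellipse meets its principal axes.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def vertex_determinant(a, b, t):
    """det(R1, R2, G) for the ellipse (a cos t, b sin t), with g = |gamma|^2 / 2.

    Row j holds the j-th derivatives (j = 1, 2, 3) of phi1, phi2 and g.
    """
    d = (a * a - b * b) / 2  # g = (a^2 + b^2)/4 + (d/2) cos 2t
    rows = [
        [-a * math.sin(t), b * math.cos(t), -d * math.sin(2 * t)],
        [-a * math.cos(t), -b * math.sin(t), -2 * d * math.cos(2 * t)],
        [a * math.sin(t), -b * math.cos(t), 4 * d * math.sin(2 * t)],
    ]
    return det3(rows)


def count_vertices(a, b, samples=720):
    """Count sign changes of the determinant over one period (half-step offset grid)."""
    values = [
        vertex_determinant(a, b, 2 * math.pi * (i + 0.5) / samples)
        for i in range(samples)
    ]
    return sum(
        1 for i in range(samples) if values[i] * values[(i + 1) % samples] < 0
    )
```

With `a = 2`, `b = 1` the count is 4. For a circle (`a == b`) the determinant vanishes identically, so the function reports no sign changes, which matches the fact that every point of a circle is a vertex.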

Remark. Another formula for the calculation of the vertices of a curve (unfortunately not very practical) was found in [11]. For curves in $\mathbb{R}^3$, such a formula appears in [16], Exercise 6.8.

A curve in the Euclidean plane has a vertex if and only if the radius of its osculating circle is critical. In higher dimensional spaces we have the following theorem (announced in [20] and proved in Section 3).


THEOREM 3. The vertices of a smoothly immersed curve in the Euclidean space $\mathbb{R}^n$ are critical points of the radius of the osculating hypersphere.

Remark. For $n > 2$, the converse is not always true. For example, all the points of the circular helix $t \mapsto (\cos t, \sin t, t)$ are critical points of the radius of the osculating hypersphere. However, it has no vertex. A more generic example is given by the curve $t \mapsto (a\cos t, b\sin t, t)$, which has no vertex for any $a, b \in \mathbb{R} \setminus \{0\}$ such that $|a^2 - b^2| < 1/3$.

Proof of remark. Apply our formula of Theorem 2 to obtain

\[
\begin{vmatrix}
-a\sin t & b\cos t & 1 & \tfrac12(b^2 - a^2)\sin 2t + t \\
-a\cos t & -b\sin t & 0 & (b^2 - a^2)\cos 2t + 1 \\
a\sin t & -b\cos t & 0 & -2(b^2 - a^2)\sin 2t \\
a\cos t & b\sin t & 0 & -4(b^2 - a^2)\cos 2t
\end{vmatrix} = 0,
\]

which gives $ab(1 - 3(b^2 - a^2)\cos 2t) = 0$. For $|a^2 - b^2| < 1/3$, this equation has no solution $t \in \mathbb{R}$. □
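The reduction of the $4 \times 4$ determinant to the closed form $ab(1 - 3(b^2 - a^2)\cos 2t)$ can be checked numerically. The sketch below is an illustration only, with hypothetical function names. It builds the matrix whose rows are the derivatives of order 1 to 4 of $(a\cos t, b\sin t, t, \gamma^2/2)$ and compares its determinant with the closed form.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def helix_vertex_determinant(a, b, t):
    """4x4 determinant for t -> (a cos t, b sin t, t); rows are derivatives of order 1..4."""
    d = b * b - a * a
    s, c = math.sin(t), math.cos(t)
    s2, c2 = math.sin(2 * t), math.cos(2 * t)
    rows = [
        [-a * s, b * c, 1.0, 0.5 * d * s2 + t],
        [-a * c, -b * s, 0.0, d * c2 + 1.0],
        [a * s, -b * c, 0.0, -2.0 * d * s2],
        [a * c, b * s, 0.0, -4.0 * d * c2],
    ]
    return det(rows)


def closed_form(a, b, t):
    """The value a*b*(1 - 3(b^2 - a^2) cos 2t) stated in the text."""
    return a * b * (1.0 - 3.0 * (b * b - a * a) * math.cos(2 * t))
```

For `a = 1.0`, `b = 1.1` we have $|a^2 - b^2| = 0.21 < 1/3$, and the determinant stays bounded away from zero for all `t`, in line with the remark.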

The (noncircular) ellipse is the simplest closed convex curve in the plane having the minimum number of vertices: 4.

DEFINITION 5. A generalized ellipse in $\mathbb{R}^{2k}$ is the convex curve given by the following parametrization ([7]):

\[ \theta \mapsto (a_1\cos\theta, b_1\sin\theta, a_2\cos 2\theta, b_2\sin 2\theta, \dots, a_k\cos k\theta, b_k\sin k\theta). \]

Since generalized ellipses are the convex curves of lowest degree in $\mathbb{R}^{2k}$, one would expect (from the ‘topological economy principle in algebraic geometry’) that they have the minimum number of vertices, i.e. $2k+2$, as is the case for the noncircular ellipse in the 2-plane. However, the following example shows that generic generalized ellipses in $\mathbb{R}^{2k}$ can have more than $2k+2$ vertices.

EXAMPLE. The generalized ellipse in $\mathbb{R}^4$,

\[ \gamma(\theta) = (a_1\cos\theta, b_1\sin\theta, a_2\cos 2\theta, b_2\sin 2\theta), \]

with $a_2^2 \ne b_2^2$ and $a_1b_1a_2b_2 \ne 0$ has 8 vertices. If $a_2^2 = b_2^2$ then $\gamma$ is a spherical curve and all its points are thus vertices.

Denote $C_k = \cos k\theta$ and $S_k = \sin k\theta$.

THEOREM 4. Consider the generalized ellipse in $\mathbb{R}^{2k}$ given by

\[ \gamma(\theta) = (a_1C_1, b_1S_1, a_2C_2, b_2S_2, \dots, a_kC_k, b_kS_k), \]

with $a_1b_1a_2b_2 \cdots a_kb_k \ne 0$. Then, for even $k$, $\gamma$ can have $2k+4, 2k+8, \dots, 4k$ or an infinity of vertices depending on the values of the parameters $a_j$ and $b_j$, for $j \ge (k/2) + 1$. For odd $k$, $\gamma$ can have $2k+2, 2k+6, \dots, 4k$ or an infinity of vertices depending on the values of the parameters $a_j$ and $b_j$, for $j \ge (k+1)/2$.


Remark. In the space of parameters $a_j$ and $b_j$, there is a hypersurface which separates the domains with different numbers of vertices. The points of this hypersurface correspond to generalized ellipses having vertices with multiplicity $> 1$.

To prove that Theorem 1 is sharp, we will construct a convex curve in $\mathbb{R}^{2k}$ having the minimum number of vertices, i.e. $2k+2$.

Consider the generalized ellipse of Theorem 4 with coefficients

\[ a_1 = b_1 = \dots = a_k = b_k = 1 \]

and denote it by $\gamma_0$. Obviously $\gamma_0$ is a spherical curve and all its points are vertices. In order to obtain the desired convex curve, we will perturb $\gamma_0$ in the “radial direction”:

THEOREM 5. For $\varepsilon \ne 0$ sufficiently small, the curve

\[ \gamma_\varepsilon = (1 + \varepsilon\cos(k+1)\theta)\,\gamma_0(\theta) \]

has exactly $2k+2$ vertices.

Theorems 4 and 5 are proved in Section 5.

2. Some Background on Lagrangian Singularities

In this section, we give the basic definitions and facts from Lagrangian singularity theory that we will need later, focused on the normal map introduced below (for a deep study, see [3] or [4]). The reader familiar with Lagrangian singularity theory (and the normal map) can go directly to Section 3.

A symplectic structure on a manifold $M$ is a closed differentiable 2-form $\omega$, nondegenerate on $M$, also called a symplectic form. A manifold equipped with a symplectic structure is called a symplectic manifold.

EXAMPLE. The total space of the cotangent bundle $\pi: T^*\mathbb{R}^n \to \mathbb{R}^n$ of $\mathbb{R}^n$ is a symplectic manifold.

A submanifold of a symplectic manifold $(M^{2n}, \omega)$ is called Lagrangian if it has dimension $n$ and the restriction to it of the symplectic form $\omega$ is equal to 0.

EXAMPLE. Let $N$ be any submanifold in the Euclidean space $\mathbb{R}^n$ and let $L$ be the $n$-dimensional manifold formed by the covectors $\langle v, \cdot\rangle$ at the end-points of the normal vectors $v$ to $N$. Then $L$ is a Lagrangian submanifold of the symplectic space $T^*\mathbb{R}^n$.

A fibration of a symplectic manifold is called a Lagrangian fibration if all the fibers are Lagrangian submanifolds.


EXAMPLE. The cotangent bundle $T^*V \to V$ of any manifold $V$ is a Lagrangian fibration. The standard action 1-form $\lambda = p\,dq$ vanishes along the fibers. Thus its differential $\omega = d\lambda$, which is the 2-form defining the standard symplectic structure on $T^*V$, also vanishes along the fibers.

Consider the inclusion $i: L \to E$ of an immersed Lagrangian submanifold $L$ in the total space of a Lagrangian fibration $\pi: E \to B$. The restriction of the projection $\pi$ to $L$, that is $\pi \circ i: L \to B$, is called a Lagrangian map. Thus, a Lagrangian map is a triple $L \to E \to B$, where the left arrow is a Lagrangian immersion and the right arrow a Lagrangian fibration.

The normal map. Consider the set of all vectors normal to a submanifold $N$ in the Euclidean space $\mathbb{R}^n$. Associate to each vector its end point: to the vector $v$ based at the point $q$ associate the point $q + v$. This Lagrangian map of the $n$-dimensional manifold of normal vectors to $N$ into the $n$-dimensional Euclidean space $\mathbb{R}^n$ is called the normal map. The Lagrangian submanifold $L$ in $T^*\mathbb{R}^n$ is formed by the covectors $\langle v, \cdot\rangle$ at the end points of the normal vectors $v$.

The set of critical values of a Lagrangian map is called its caustic.

EXAMPLE. The focal set or evolute of a submanifold of positive codimension in Euclidean space $\mathbb{R}^n$ is defined as the envelope of the family of normal lines to the submanifold. The caustic of the normal map associated to that submanifold coincides with its focal set.

A Lagrangian equivalence of two Lagrangian maps is a symplectomorphism of the total space transforming the first Lagrangian fibration to the second, and the first Lagrangian immersion to the second. Caustics of equivalent Lagrangian maps are diffeomorphic.

Generating families. Consider the total space of the standard Lagrangian fibration $\mathbb{R}^{2n} \to \mathbb{R}^n$, $(p, q) \mapsto q$, with the form $dp \wedge dq$. Let $F(x, q)$ be the germ, at the point $(x_0, q_0)$, of a family of smooth functions of $k$ variables $x = (x_1, \dots, x_k)$ which depends smoothly on the parameters $q$. Suppose

(a) $\frac{\partial F}{\partial x}(x_0, q_0) = 0$ and
(b) the map $(x, q) \mapsto \frac{\partial F}{\partial x}$ has rank $k$ at $(x_0, q_0)$.

Then the germ at the point $\bigl(\frac{\partial F}{\partial q}(x_0, q_0), q_0\bigr)$ of the set

\[ L_F = \Bigl\{(p, q) : \exists x : \frac{\partial F}{\partial x} = 0,\ p = \frac{\partial F}{\partial q}\Bigr\} \]

is the germ of a smoothly immersed Lagrangian submanifold of $\mathbb{R}^{2n}$. The family germ $F$ is said to be a generating family of $L_F$ and of its Lagrangian map $\pi_F: (p, q) \mapsto q$.


It turns out that the germ of each Lagrangian map is Lagrange equivalent to the germ of the Lagrangian map $\pi_F$ for a suitable Lagrangian generating family $F$. Such a generating family is said to be a Lagrangian representative family of the corresponding germ of Lagrangian map.

The equivalence classes of germs of Lagrangian maps are called Lagrangian singularities. The classification of Lagrangian singularities of manifolds in general position of dimension $n \le 10$ is given in [3].

3. Proof of Theorem 3 and Study of the Focal Set of a Curve in Terms of Its Normal Map

Proof of Theorem 3. We will assume that the derivatives of order $1, \dots, n-1$ of our curve, $\gamma: \mathbb{R} \to \mathbb{R}^n$, are linearly independent at any point (which is true for generic curves). A generating family of the normal map associated to the curve is given by $F: \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$,

\[ F_q(s) = F(q, s) = \tfrac12\|q - \gamma(s)\|^2. \]

The caustic of this Lagrangian map is the focal set of the curve. We shall write

\[ (i) = \{(q, s) \in \mathbb{R}^n \times \mathbb{R} : \partial_s F(q, s) = 0, \dots, \partial_s^i F(q, s) = 0\}, \quad i \ge 1. \]

According to this notation the manifolds

\[ \mathbb{R}^n \times \mathbb{R} \supset (1) \supset (2) \supset \dots \supset (i) \supset \dots \]

are embedded one inside the other. Hence the kernels of the differentials of the restrictions of $F$ to these embedded submanifolds are also embedded one inside another.

The set $(1)$ consists of the pairs $(q, s)$ such that $q$ is the centre of some hypersphere of $\mathbb{R}^n$ tangent to $\gamma$ at $s$ (this means that $q$ lies in the normal hyperplane to $\gamma$ at $s$). One can see from the equations defining $(i)$, $i \le n$, that it consists of the pairs $(q, s)$ such that $q$ is the centre of some hypersphere having at least $(i+1)$-point contact with $\gamma$ at $s$ and that these centres form an affine $(n-i)$-dimensional subspace of $\mathbb{R}^n$.

So $(n)$ is the curve formed by the pairs $(q(s), s)$ such that $q(s)$ is the centre of an osculating hypersphere at $\gamma(s)$. If $\gamma(s)$ is not a flattening of $\gamma$ then the osculating hypersphere at $\gamma(s)$ is unique and hence the value of $F$ at the point $(q(s), s)$ of $(n)$ is one half of the square of the radius of the osculating hypersphere at $\gamma(s)$. The condition for a point $p = \gamma(s_0)$ to be a vertex is equivalent to the fact that the first $n+1$ derivatives of $F$ with respect to $s$ vanish at $s = s_0$. Hence to each point $(q(s), s)$ in $(n+1)$ corresponds a vertex $\gamma(s)$ of the curve.

The set $(i+1)$ is a smooth submanifold of $(i)$ characterised by the fact that at its points the kernel of $(F|_{(i)})_*$ is tangent to $(i)$. Hence the derivative of the restriction of $F$ to the curve $(n)$ has rank 1 at all points of $(n)$, except at the points of $(n+1)$ (where it is 0). That is, a point belonging to $(n+1)$ is a critical point of the restriction of $F$ to the curve $(n)$. So a vertex is a critical point of the radius of the osculating hypersphere. □

Remarks on the focal set (caustic). The centres of the osculating hyperspheres at the vertices of $\gamma$ are the points $q \in \mathbb{R}^n$ for which there exists a solution $s$ of the system of $n+1$ equations

\[ F_q'(s) = 0, \quad F_q''(s) = 0, \quad \dots, \quad F_q^{(n+1)}(s) = 0. \]

For a fixed $s$, the first equation gives the normal hyperplane to the curve at the point $\gamma(s)$. The first two equations give a codimension 1 subspace of the normal hyperplane to the curve at the point $\gamma(s)$. Following this process we obtain (for a generic curve) a complete flag at each nonflattening point of the curve. This complete flag is the osculating flag of the focal curve (which is formed by the centres of the osculating hyperspheres and is determined by the first $n$ equations). In particular, the osculating hyperplane of the focal curve at the point $q(s)$ is the normal hyperplane to the curve $\gamma$ at the point $\gamma(s)$. As the point moves along the curve $\gamma$, the corresponding flag (starting with the codimension 2 subspace) generates a hypersurface which is stratified in a natural way by the components of the flag. This stratified hypersurface is a component of the focal set of the curve $\gamma$ (the other component being the curve $\gamma$ itself). The 1-dimensional stratum is the focal curve of $\gamma$: it is generated by the 0-dimensional subspace of the flag, that is, by the centre of the osculating hypersphere at the moving point. The equation $F_q^{(n+1)}(s) = 0$ gives a finite number of isolated points at which the focal curve is singular (it has a cusp). These points correspond to the vertices of $\gamma$.

As we explained above, the focal set is the caustic of the normal map defined by the generating family $F(q, s) = \tfrac12\|q - \gamma(s)\|^2$. Thus, according to Arnold’s classification of singularities of caustics (see [3] or [4]), the vertices of a curve in $\mathbb{R}^n$ correspond to a Lagrangian singularity $A_{n+1}$ of the normal map.
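For a plane curve ($n = 2$) the first two equations of the system, $F_q'(s) = 0$ and $F_q''(s) = 0$, already determine the focal curve. Written out, they read $\gamma' \cdot q = g'$ and $\gamma'' \cdot q = g''$ with $g = \gamma^2/2$. The sketch below is an illustration with hypothetical function names. It solves this $2 \times 2$ linear system for the ellipse $(a\cos t, b\sin t)$ and compares the result with the classical parametrization of the evolute of an ellipse, which is brought in here as a known fact rather than taken from the paper.

```python
import math


def focal_point(a, b, t):
    """Centre of the osculating circle of (a cos t, b sin t) at parameter t.

    Solves gamma'.q = g', gamma''.q = g'' (with g = |gamma|^2 / 2) by Cramer's rule;
    these are the equations F'_q = 0 and F''_q = 0 written out.
    """
    s, c = math.sin(t), math.cos(t)
    d = b * b - a * a
    g1, g2 = d * s * c, d * (c * c - s * s)   # g' and g''
    m11, m12 = -a * s, b * c                  # components of gamma'
    m21, m22 = -a * c, -b * s                 # components of gamma''
    det = m11 * m22 - m12 * m21               # equals a*b, nonzero for an ellipse
    x = (g1 * m22 - m12 * g2) / det
    y = (m11 * g2 - m21 * g1) / det
    return x, y


def evolute(a, b, t):
    """Classical parametrization of the evolute of the ellipse."""
    s, c = math.sin(t), math.cos(t)
    return (a * a - b * b) / a * c ** 3, (b * b - a * a) / b * s ** 3
```

At `t = 0`, a vertex of the ellipse, the focal point lies at distance $b^2/a$ from the curve, and the evolute has a cusp there, as the remarks above describe.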

4. Proof of the (2k + 2)-Vertex Theorem by Sturm Theory

We begin this section with some definitions and results of Sturm theory, taken from [7] and [12].

A set of functions $\{\varphi_1, \dots, \varphi_{2k+1}\}$ with $\varphi_i: S^1 \to \mathbb{R}$ is a Chebyshev system if any linear combination $a_1\varphi_1 + \dots + a_{2k+1}\varphi_{2k+1}$, $a_i \in \mathbb{R}$, with $a_1^2 + \dots + a_{2k+1}^2 \ne 0$ has at most $2k$ zeroes on $S^1$.


EXAMPLE 1. The set of functions {1, cos θ, sin θ} is a Chebyshev system.

Remark. Any convex closed curve $\theta \mapsto (\varphi_1(\theta), \dots, \varphi_{2k}(\theta))$ in $\mathbb{R}^{2k}$ defines a Chebyshev system: $\{1, \varphi_1, \dots, \varphi_{2k}\}$.

DEFINITION 6. A linear homogeneous differential operator

\[ L: C^\infty(S^1) \to C^\infty(S^1) \]

is called disconjugate if it has a fundamental system of solutions for the equation $Lg = 0$ which are defined on the circle and form a Chebyshev system.

EXAMPLE 2. The operator $L = \partial(\partial^2 + 1)$ is disconjugate. The Chebyshev system $\{1, \cos\theta, \sin\theta\}$ is a fundamental system of solutions for it.

EXAMPLE 3. Any convex curve $\gamma: \theta \mapsto (\varphi_1(\theta), \dots, \varphi_{2k}(\theta))$ in $\mathbb{R}^{2k}$ defines a $(2k+1)$-order disconjugate operator $L_\gamma$ defined by

\[ L_\gamma g = \det(R_1, \dots, R_{2k}, G), \]

where $R_i$ (respectively $G$) is the column vector defined by the first $2k+1$ derivatives of $\varphi_i$ (of $g$, respectively). The Chebyshev system $\{1, \varphi_1, \dots, \varphi_{2k}\}$ is a fundamental system of solutions of the equation $L_\gamma g = 0$.

EXAMPLE 4. The generalized ellipse ([7])

\[ \gamma: \theta \mapsto (a_1\cos\theta, b_1\sin\theta, a_2\cos 2\theta, b_2\sin 2\theta, \dots, a_k\cos k\theta, b_k\sin k\theta) \]

defines, up to a constant factor, the $(2k+1)$-order disconjugate operator

\[ L_\gamma = \partial(\partial^2 + 1)\cdots(\partial^2 + k^2). \]

Some proofs of 4-vertex type theorems are (implicitly) based on the following theorem due to Hurwitz ([13]), which generalizes a theorem of Sturm ([19]):

HURWITZ THEOREM. Any function $f \in C^\infty(S^1)$ whose Fourier series begins with the harmonics of order $N$, $f = \sum_{k \ge N}(a_k\cos k\theta + b_k\sin k\theta)$, has at least $2N$ zeroes.

In fact any function $f \in C^\infty(S^1)$ without harmonics up to order $n$ is orthogonal to the solutions of the equation $\partial(\partial^2 + 1)\cdots(\partial^2 + n^2)\varphi = 0$, and such solutions form a Chebyshev system.
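A quick numerical illustration of the Hurwitz theorem: the sketch below counts sign changes on a grid for a sample function whose lowest harmonic has order $N = 2$. The sample function, the grid size and the helper names are hypothetical choices made for this illustration. A grid count can only underestimate the true number of sign changes, so finding at least $2N = 4$ of them is consistent with the theorem.

```python
import math


def count_sign_changes(f, samples=2000):
    """Count sign changes of f around the circle on a uniform (half-step offset) grid."""
    values = [f(2 * math.pi * (i + 0.5) / samples) for i in range(samples)]
    return sum(
        1 for i in range(samples) if values[i] * values[(i + 1) % samples] < 0
    )


def sample_function(theta):
    """A trigonometric polynomial containing only harmonics of order >= 2."""
    return math.cos(2 * theta) + 0.5 * math.sin(3 * theta) + 0.3 * math.cos(5 * theta)


def pure_harmonic(theta):
    """A single harmonic of order 3, which has exactly 6 simple zeroes on the circle."""
    return math.sin(3 * theta)
```

Here `count_sign_changes(sample_function)` is at least 4, and `count_sign_changes(pure_harmonic)` equals 6, the minimum $2N$ allowed for $N = 3$.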

The following theorem generalizes Hurwitz’s theorem.

STURM–HURWITZ THEOREM ([7, 12]). Let $f: S^1 \to \mathbb{R}$ be a $C^\infty$ function such that $\int_{S^1} f(\theta)\varphi_i(\theta)\,d\theta = 0$, $\{\varphi_i\}_{i=1,\dots,2k+1}$ being a Chebyshev system. Then $f$ has at least $2k+2$ sign changes.


COROLLARY 2 ([12]). Any function in the image of a disconjugate operator of order $2k+1$ ($f = Lg$, where $g \in C^\infty(S^1)$ is any function) has at least $2k+2$ sign changes.

Proof of the $(2k+2)$-vertex theorem in $\mathbb{R}^{2k}$. Let

\[ \gamma: \theta \mapsto (\varphi_1(\theta), \dots, \varphi_{2k}(\theta)) \]

be a convex curve in $\mathbb{R}^{2k}$. Consider the family of functions on the circle $F: \mathbb{R}^{2k} \times S^1 \to \mathbb{R}$ defined by

\[ F_q(\theta) = \tfrac12\|q - \gamma(\theta)\|^2. \]

In the proof of Theorem 3 we saw that the centres of the osculating hyperspheres at the vertices of $\gamma$ are the points $q \in \mathbb{R}^{2k}$ for which there exists a solution $\theta$ of the following system of $2k+1$ equations:

\[ F_q'(\theta) = 0, \quad F_q''(\theta) = 0, \quad \dots, \quad F_q^{(2k+1)}(\theta) = 0. \]

The focal curve $\theta \mapsto q(\theta)$ (consisting of the centres of the osculating hyperspheres) is determined by the first $2k$ equations. The last equation is the condition on this curve determining the vertices. Write $g = \gamma^2/2$. Using the fact that

\[ -F = \gamma \cdot q - \frac{\gamma^2}{2} - \frac{q^2}{2}, \]

the preceding system of equations can be written as

\[ \gamma' \cdot q - g' = 0, \quad \gamma'' \cdot q - g'' = 0, \quad \dots, \quad \gamma^{(2k+1)} \cdot q - g^{(2k+1)} = 0. \]

This means that the vector $(q, -1)$ in $\mathbb{R}^{2k+1}$ is orthogonal to the $2k+1$ vectors

\[ (\gamma', g'), \quad (\gamma'', g''), \quad \dots, \quad (\gamma^{(2k+1)}, g^{(2k+1)}). \]

So the vertices of $\gamma$ are given by the zeros of the determinant of the matrix whose rows are these $2k+1$ vectors. This determinant is equal to $\det(R_1, \dots, R_{2k}, G)$, where $R_i$ (respectively $G$) is the column vector defined by the first $2k+1$ derivatives of $\varphi_i$ (of $g = \gamma^2/2$, respectively). This determinant is the image of the function $g = \gamma^2/2$ under the operator $L_\gamma$ (see Example 3). So Corollary 2 implies that this determinant has at least $2k+2$ sign changes. This proves the theorem. □

Proof of Theorem 2. In the above proof of the $(2k+2)$-vertex theorem for convex curves in $\mathbb{R}^{2k}$, the convexity of the curve and the parity of the dimension were used only in the last step. So the determinant obtained in the proof gives a formula to calculate the vertices of a curve in $\mathbb{R}^n$. This proves Theorem 2. □

5. On the Number of Vertices of Generalized Ellipses

We will give two examples and then we will prove Theorem 4.

EXAMPLE 1. The generalized ellipse in $\mathbb{R}^4$,

\[ \gamma(\theta) = (a_1\cos\theta, b_1\sin\theta, a_2\cos 2\theta, b_2\sin 2\theta), \]

with $a_2^2 \ne b_2^2$ and $a_1b_1a_2b_2 \ne 0$ has 8 vertices. If $a_2^2 = b_2^2$ then $\gamma$ is a spherical curve and all its points are thus vertices.

Proof. Denote $C_k = \cos k\theta$, $S_k = \sin k\theta$ and

\[ g = 2(a_1^2C_1^2 + b_1^2S_1^2 + a_2^2C_2^2 + b_2^2S_2^2). \]

Theorem 2 and Examples 3 and 4 of Section 4 imply that the vertices of $\gamma$ correspond to the roots of the equation $\partial(\partial^2 + 1)(\partial^2 + 2^2)g = 0$. The trigonometric identity

\[ a^2\cos^2\theta + b^2\sin^2\theta = \tfrac12\bigl(a^2 + b^2 + (a^2 - b^2)\cos 2\theta\bigr) \]

allows us to write

\[ g = (a_1^2 - b_1^2)C_2 + (a_2^2 - b_2^2)C_4 + a_1^2 + b_1^2 + a_2^2 + b_2^2. \]

The operator $\partial$ kills the constant terms (i.e. the harmonics of order zero), and the operator $(\partial^2 + 2^2)$ kills the second-order harmonics. Moreover $\partial C_4 = -4S_4$, $\partial S_4 = 4C_4$ and $(\partial^2 + k^2)C_4 = (k^2 - 4^2)C_4$. Thus

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)g = \partial(\partial^2 + 1)(\partial^2 + 2^2)(a_2^2 - b_2^2)C_4 = K(a_2^2 - b_2^2)S_4, \]

where $K = -4(1 - 4^2)(2^2 - 4^2)$ is a nonzero constant. Thus the vertices of $\gamma$ correspond to the solutions of the equation $K(a_2^2 - b_2^2)S_4 = 0$, i.e. $\gamma$ has 8 vertices for $a_2^2 \ne b_2^2$ and all its points are vertices for $a_2^2 = b_2^2$. □
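The constant $K$ comes from two elementary rules: $\partial\cos m\theta = -m\sin m\theta$, and $(\partial^2 + j^2)$ multiplies the $m$-th harmonic by $j^2 - m^2$. A minimal sketch of that bookkeeping follows. The function name is hypothetical.

```python
def sine_coefficient(k, m):
    """Coefficient K with  d(d^2+1)(d^2+2^2)...(d^2+k^2) cos(m t) = K sin(m t)."""
    coeff = -m  # d cos(m t) = -m sin(m t)
    for j in range(1, k + 1):
        coeff *= j * j - m * m  # (d^2 + j^2) multiplies the m-th harmonic by j^2 - m^2
    return coeff


if __name__ == "__main__":
    print(sine_coefficient(2, 4))  # -720, the constant K of Example 1
    print(sine_coefficient(2, 2))  # 0: the second-order harmonic is killed
```

The value vanishes exactly when $1 \le m \le k$, which is why only the harmonics of order greater than $k$ survive in the computations of this section.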

We keep the notation Ck = cos kθ and Sk = sin kθ .


EXAMPLE 2. The generalized ellipse in $\mathbb{R}^6$,

\[ \gamma(\theta) = (a_1\cos\theta, b_1\sin\theta, a_2\cos 2\theta, b_2\sin 2\theta, a_3\cos 3\theta, b_3\sin 3\theta), \]

with $\prod_{i=1}^{3} a_ib_i \ne 0$ may have 8, 12 or an infinity of vertices, depending on the values of the parameters $a_2, b_2, a_3, b_3$. In particular, if $a_2^2 = b_2^2$ and $a_3^2 \ne b_3^2$ then $\gamma$ has 12 vertices, and if $a_2^2 \ne b_2^2$ and $a_3^2 = b_3^2$ then $\gamma$ has 8 vertices. If $a_2^2 = b_2^2$ and $a_3^2 = b_3^2$ then $\gamma$ is a spherical curve and all its points are thus vertices.

Proof. As in Example 1, the vertices of $\gamma$ are the roots of the equation given by

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)(\partial^2 + 3^2)g = 0, \]

where

\[ g = (a_1^2 - b_1^2)C_2 + (a_2^2 - b_2^2)C_4 + (a_3^2 - b_3^2)C_6 + \sum_{i=1}^{3}(a_i^2 + b_i^2). \]

The operator $\partial(\partial^2 + 1)(\partial^2 + 2^2)(\partial^2 + 3^2)$ kills the harmonics of orders zero, one, two and three. Thus

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)(\partial^2 + 3^2)g = K_2(a_2^2 - b_2^2)S_4 + K_3(a_3^2 - b_3^2)S_6, \]

where $K_2$ and $K_3$ are nonzero constants. □

We recall Theorem 4:

THEOREM 4. Consider the generalized ellipse in $\mathbb{R}^{2k}$

\[ \gamma(\theta) = (a_1C_1, b_1S_1, a_2C_2, b_2S_2, \dots, a_kC_k, b_kS_k), \]

with $a_1b_1a_2b_2 \cdots a_kb_k \ne 0$. Then, for even $k$, $\gamma$ can have $2k+4, 2k+8, \dots, 4k$ or an infinity of vertices depending on the values of the parameters $a_j$ and $b_j$, for $j \ge (k/2) + 1$. For odd $k$, $\gamma$ can have $2k+2, 2k+6, \dots, 4k$ or an infinity of vertices depending on the values of the parameters $a_j$ and $b_j$, for $j \ge (k+1)/2$.

Proof. As in Examples 1 and 2, the vertices of $\gamma$ are the roots of the equation given by

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)\cdots(\partial^2 + k^2)g = 0, \]

where

\[ g = \sum_{i=1}^{k}(a_i^2 - b_i^2)C_{2i} + \sum_{i=1}^{k}(a_i^2 + b_i^2). \]

The operator

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)\cdots(\partial^2 + k^2) \]

kills the harmonics from order zero up to order $k$. Thus, for even $k$,

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)\cdots(\partial^2 + k^2)g = \sum_{i \ge \frac{k}{2}+1} K_i(a_i^2 - b_i^2)S_{2i}, \]

where $K_i$ is a nonzero constant, for $i \ge (k/2) + 1$. For odd $k$,

\[ \partial(\partial^2 + 1)(\partial^2 + 2^2)\cdots(\partial^2 + k^2)g = \sum_{i \ge (k+1)/2} K_i(a_i^2 - b_i^2)S_{2i}, \]

where $K_i$ is a nonzero constant, for $i \ge (k+1)/2$. Theorem 4 is proved. □

Proof of Theorem 5. Applying our formula of Theorem 2 we obtain that the number of vertices of the curve $\gamma_\varepsilon(\theta) = (1 + \varepsilon\cos(k+1)\theta)\gamma_0(\theta)$ is given by the number of solutions $\theta \in S^1$ of an equation of the form

\[ \varepsilon\,\partial(\partial^2 + 1)(\partial^2 + 2^2)\cdots(\partial^2 + k^2)\cos(k+1)\theta + \varepsilon^2 f(\theta, \varepsilon) = 0, \]

i.e.

\[ \varepsilon K\sin(k+1)\theta + \varepsilon^2 f(\theta, \varepsilon) = 0, \]

where

\[ K = -(k+1)\prod_{i=1}^{k}\bigl(-(k+1)^2 + i^2\bigr) \ne 0 \]

is a constant and the function $f(\theta, \varepsilon)$ is bounded for $|\varepsilon| < 1$. Thus for $\varepsilon \ne 0$ small enough this equation has exactly $2k+2$ solutions. □

Acknowledgements

The author is grateful to V. I. Arnold and to M. Kazarian for helpful discussions, questions and remarks.

References

1. Anisov, S. S.: Convex Curves in RPn, Proc. Steklov Inst. Math. 221 (1998), 3–39.2. Arnold, V. I.: Contact Geometry and Wave Propagation, Enseign. Math. 34 (1989).3. Arnold, V. I., Varchenko, A. N. and Gussein-Zade, S. M.: Singularities of Differentiable Maps,

Vol. 1, Birkhäuser, 1986. (French version: Singularités des applications différentiables, Vol. 1,Mir, Moscow, 1986. Russian version: Nauka, 1982.)

4. Arnold, V. I.: Singularities of Caustics and Wave Fronts, In: Math. Appl. (Soviet series) 62,Kluwer, Dordrecht, 1991.

5. Arnold, V. I.: Sur les propriétés des projections Lagrangiennes en géométrie symplectique descaustiques, In: Cahiers Math. Decision 9320, CEREMADE, 1993, pp. 1–9. Rev. Mat. Univ.Complut. Madrid 8(1) (1995), 109–119.

Page 233: Mathematical Physics, Analysis and Geometry - Volume 7

FOUR-VERTEX THEOREMS, STURM THEORY AND LAGRANGIAN SINGULARITIES 237

6. Arnold V. I.: The geometry of spherical curves and the algebra of quaternions, Russian Math.Surveys 50(1) (1995), 1–68.

7. Arnold, V. I.: On the number of flattening points of space curves, Amer. Math. Soc. Transl. 171(1995), 11–22.

8. Arnold, V. I.: Topological problems of the theory of wave propagation, Russian Math. Surveys51(1) (1996), 1–47.

9. Arnold, V. I.: Towards the Legendrian Sturm theory of space curves, Functional Anal. Appl.32(2) (1998), 75–80.

10. Barner, M.: Über die Mindestanzahl stationärer Schmiegeebenen bei geschlossenen streng-konvexen Raumkurven, Abh. Math. Sem. Univ. Hamburg 20 (1956), 196–215.

11. Fukui, T. and Nuño-Ballesteros, J. J.: Isolated roundings and flattenings of submanifolds inEuclidean spaces, Preprint.

12. Guieu, L., Mourre, E. and Ovsienko, V. Yu.: Theorem on six vertices of a plane curve via theSturm theory, In: V. I. Arnold, I. M. Gelfand, M. Smirnov and V. S. Retakh (Eds), Arnold–Gelfand Mathematical Seminars, Birkhäuser, Boston, 1997, pp. 257–266.

13. Hurwitz, A.: Über die Fourierschen Konstanten integrierbarer Funktionen, Math. Ann. 57(1903), 425–446.

14. Kazarian, M.: Nonlinear version of Arnold’s theorem on flattening points, C.R. Acad. Sci. ParisSér. I 323(1) (1996), 63–68.

15. Mukhopadhyaya, S.: New methods in the geometry of a plane arc I, Bull. Calcutta Math. Soc.1 (1909), 31–37.

16. Porteous, I. R.: Geometric Differentiation, Cambridge Univ. Press, 1994.17. Romero-Fuster, M. C.: Convexly-generic curves in R

3, Geom. Dedicata 28 (1988), 7–29.18. Sedykh, V. D.: The theorem about four vertices of a convex space curve, Functional Anal. Appl.

26(1) (1992), 28–32.19. Sturm, J. C. F.: Mémoire sur les équations différentielles du second ordre, J. Math. Pures Appl.

1 (1836), 106–186.20. Uribe-Vargas, R.: On the higher dimensional four-vertex theorem, C.R. Acad. Sci. Paris Sér. I

321 (1995), 1353–1358.21. Uribe-Vargas, R.: On the (2k+2)-vertex and (2k+2)-flattening theorems in higher dimensional

Lobatchevskian space, C.R. Acad. Sci. Paris Sér. I 325 (1997), 505–510.22. Uribe-Vargas, R.: Four-vertex theorems in higher dimensional spaces for a larger class of curves

than the convex ones, C.R. Acad. Sci. Paris Sér. I 330 (2000), 1085–1090.23. Uribe-Vargas, R.: On polar duality, Lagrange and Legendre singularities and stereographic

projection to quadrics, Proc. London Math. Soc. 87(3) (2003), 701–724.


Mathematical Physics, Analysis and Geometry 7: 239–287, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Thermal Ionization

Dedicated, in admiration and friendship, to Elliott Lieb and Edward Nelson, on the occasion of their 70th birthdays

JÜRG FRÖHLICH and MARCO MERKLI*
Theoretical Physics, ETH-Hönggerberg, CH-8093 Zürich, Switzerland. e-mail: {juerg, merkli}@itp.phys.ethz.ch

(Received 26 June 2003)

Abstract. In the context of an idealized model describing an atom coupled to black-body radiation at a sufficiently high positive temperature, we show that the atom will end up being ionized in the limit of large times. Mathematically, this is translated into the statement that the coupled system does not have any time-translation invariant state of positive (asymptotic) temperature, and that the expectation value of an arbitrary finite-dimensional projection in an arbitrary initial state of positive (asymptotic) temperature tends to zero, as time tends to infinity. These results are formulated within the general framework of $W^*$-dynamical systems, and the proofs are based on Mourre's theory of positive commutators and a new virial theorem. Results on the so-called standard form of a von Neumann algebra play an important role in our analysis.

Mathematics Subject Classifications (2000): 82C10, 47A10, 46L55.

Key words: open quantum system, black-body radiation, CCR algebra, virial theorem, positive commutators, Mourre estimate, standard form of von Neumann algebras, Fermi Golden Rule, Liouville operator.

1. Introduction

In this paper, we study an idealized model describing an atom or molecule consisting of static nuclei and electrons coupled to black-body radiation. Our aim is to show that when the quantized radiation field is in a thermal state corresponding to a sufficiently high positive temperature, and under suitable conditions on the interaction Hamiltonian, including infrared and ultraviolet cutoffs and a small value of the coupling constant, the atom or molecule will always be ionized in the limit of very large times. This process is called thermal ionization.

Thus, a very dilute gas of atoms or molecules in intergalactic space and subject to the 3K thermal background radiation of the universe will eventually be transformed into a very dilute plasma of nuclei and electrons.

If the temperature of the black-body radiation is small, as compared to a typical atomic ionization energy, then an atom initially prepared in an excited bound state will start to emit light and relax towards its ground state. After a time much longer than its relaxation time, it will be stripped of its electrons in very unlikely events where an atomic electron is hit by a high-energy photon from the thermal background radiation. The lifetime of the ground state of an isolated atom interacting with black-body radiation at inverse temperature $\beta$, before it is ionized, is expected to be exponentially large in $\beta$. A precise description of the temporal evolution of such an atom is difficult to come by, but the claim that it will eventually be ionized is highly plausible. To most physicists, this result must look obvious. Unfortunately, a complete proof of it is likely to be very involved. The main purpose of this paper is to present some partial results, thermal ionization at sufficiently high temperatures for simplified models, supporting this picture.

* Supported by an NSERC Postdoctoral Fellowship and by ETH-Zürich. Present address: Department of Mathematics, McGill University, Canada. e-mail: [email protected]

If the temperature of electromagnetic radiation is strictly zero then an atom initially prepared in a bound state of maximal energy well below its ionization threshold can be shown to always relax to a ground state by emitting photons (for a proof of this statement in some slightly idealized models see [FGS]). This result and our complementary result on thermal ionization provide some qualitative understanding of two fundamental irreversible processes in atomic physics: relaxation to a ground state, and ionization by thermal radiation.

Next, we describe the physical system analyzed in this paper somewhat more precisely (for further details see Section 2.1). It is composed of a subsystem with finitely many degrees of freedom, the 'atom' (or 'molecule'), and a subsystem with infinitely many degrees of freedom, the 'radiation field.' The space of pure state vectors of the atom is a separable Hilbert space, $\mathcal{H}_p$ (where the subscript $p$ stands for 'particle'). Mixed states of the atom are described by density matrices, $\rho$, where $\rho$ is a nonnegative, trace-class operator on $\mathcal{H}_p$ of unit trace. The expectation value of a bounded operator $A$ on $\mathcal{H}_p$ in the state $\rho$ is given by

$$\omega^p_\rho(A) := \operatorname{tr} \rho A. \tag{1}$$

Before the 'atom' or particle system is coupled to the radiation field, the time evolution of a bounded operator $A$ on $\mathcal{H}_p$ in the Heisenberg picture is given by

$$\alpha^p_t(A) := e^{itH_p} A e^{-itH_p}, \tag{2}$$

where $H_p$ is the particle Hamiltonian, which is a selfadjoint operator on $\mathcal{H}_p$ whose spectrum is bounded from below by a constant $E > -\infty$.

To be specific, we may think of $\mathcal{H}_p$ as being the Hilbert space

$$\mathcal{H}_p = \mathbb{C}^n \oplus L^2(\mathbb{R}^3, d^3x), \tag{3}$$

and the Hamiltonian $H_p$ as the operator

$$H_p = \operatorname{diag}(E_0 = E, E_1, \ldots, E_{n-1}) \upharpoonright \mathbb{C}^n \,\oplus\, (-\Delta) \upharpoonright L^2(\mathbb{R}^3, d^3x), \tag{4}$$

describing a one-electron atom (with a static nucleus) with $n$ bound states of energies $E_0, E_1, \ldots, E_{n-1} < 0$ and scattering states of arbitrary energies $k^2 \in [0,\infty)$ spanning the subspace $L^2(\mathbb{R}^3, d^3x)$ of $\mathcal{H}_p$. Thus, the point spectrum of $H_p$ is given by the eigenvalues $\{E_0, E_1, \ldots, E_{n-1}\}$, and the continuous spectrum of $H_p$ covers $[0,\infty)$, has constant (infinite) multiplicity and is absolutely continuous. Just in order to keep things simple, we shall usually assume that $n = 1$.

The bounded operators on a Hilbert space $\mathcal{H}$ form a von Neumann algebra denoted by $\mathcal{B}(\mathcal{H})$. A convenient algebra of operators encoding the kinematics of the 'atom' or particle system is the algebra $\mathcal{A}_p := \mathcal{B}(\mathcal{H}_p)$.

The 'radiation field' is described by a free, massless, scalar Bose field $\varphi$ on physical space $\mathbb{R}^3$, a 'phonon field.' For purposes of physics, it would be preferable to replace $\varphi$ by the free electromagnetic field. In our entire analysis, this replacement can be made without any difficulties, at the price of slightly more complicated notation. A convenient algebra of operators to encode the kinematics of the radiation field is a $C^*$-algebra $\mathcal{A}_f$ which can be viewed as a time-averaged version of the algebra of Weyl operators generated by $\varphi$ and its conjugate momentum field $\pi$. The time evolution of operators in $\mathcal{A}_f$, in the Heisenberg picture, before the field is coupled to the particle system, is given by the free-field time evolution $\alpha^f_t$, which is a one-parameter group of $*$-automorphisms of $\mathcal{A}_f$.

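A one-parameter group $\{\alpha_t \mid t \in \mathbb{R}\}$ defined on a $C^*$-algebra $\mathcal{A}$ is a $*$-automorphism group of $\mathcal{A}$ iff

$$\begin{aligned}
&\alpha_t(A) \in \mathcal{A}, \quad (\alpha_t(A))^* = \alpha_t(A^*), && \text{for all } A \in \mathcal{A},\\
&\alpha_t(A)\,\alpha_t(B) = \alpha_t(AB), && \text{for all } A, B \in \mathcal{A},\\
&\alpha_{t=0}(A) = A, \quad \alpha_t(\alpha_s(A)) = \alpha_{t+s}(A), && \text{for all } A \in \mathcal{A},\ t, s \in \mathbb{R}.
\end{aligned} \tag{5}$$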
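As a finite-dimensional illustration of conditions (5), one can check numerically that the Heisenberg evolution (2) with a diagonal Hamiltonian satisfies them on a matrix algebra. This is only a sketch: the $3 \times 3$ setting, the energies and the function name are assumptions made for the example and have nothing to do with the infinite-dimensional algebras used in the paper.

```python
import numpy as np

def heisenberg_evolution(energies, A, t):
    """Return e^{itH} A e^{-itH} for the diagonal matrix H = diag(energies)."""
    phase = np.exp(1j * t * np.asarray(energies))
    # Entrywise: (e^{itH} A e^{-itH})_{jk} = e^{it(E_j - E_k)} A_{jk}
    return phase[:, None] * A * np.conj(phase)[None, :]

rng = np.random.default_rng(0)
energies = np.array([-1.0, 0.5, 2.0])   # hypothetical spectrum
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
t, s = 0.7, -1.3

def ev(X, time):
    return heisenberg_evolution(energies, X, time)

star_ok = np.allclose(ev(A, t).conj().T, ev(A.conj().T, t))   # compatibility with the adjoint
product_ok = np.allclose(ev(A, t) @ ev(B, t), ev(A @ B, t))   # multiplicativity
identity_ok = np.allclose(ev(A, 0.0), A)                      # alpha_0 = id
group_ok = np.allclose(ev(ev(A, s), t), ev(A, t + s))         # group law

print(star_ok, product_ok, identity_ok, group_ok)
```

In finite dimensions the map $t \mapsto \alpha_t(A)$ is automatically norm continuous; the lack of this property for the Weyl algebra is what motivates the time-averaged algebra introduced in Section 2.1.1.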

Since we work on a time-averaged Weyl algebra, the free field time evolution is norm continuous, i.e., $t \mapsto \alpha^f_t(A)$ is a continuous map from $\mathbb{R}$ to $\mathcal{A}_f$. General states of the radiation field can be described as states on the algebra $\mathcal{A}_f$, i.e., as positive, linear functionals, $\omega$, on $\mathcal{A}_f$ normalized such that $\omega(1) = 1$.

A convenient algebra of operators to encode the kinematics of the system composed of the 'atom' and the 'radiation field' is the $C^*$-algebra, $\mathcal{A}$, given by

$$\mathcal{A} = \mathcal{A}_p \otimes \mathcal{A}_f. \tag{6}$$

The time evolution of operators in $\mathcal{A}$, before the two subsystems are coupled to each other, is given by

$$\alpha_{t,0} := \alpha^p_t \otimes \alpha^f_t. \tag{7}$$

A regularized interaction coupling the two subsystems can be introduced by choosing a bounded, selfadjoint operator $V^{(\varepsilon)} \in \mathcal{A}$, where the superscript $(\varepsilon)$ indicates that a regularization has been imposed on an interaction term, $V$, in such a way that $\|V^{(\varepsilon)}\| = O(1/\varepsilon)$. We define the regularized, interacting time evolution of the coupled system as a $*$-automorphism group $\{\alpha^{(\varepsilon)}_{t,\lambda} \mid t \in \mathbb{R}\}$ of the algebra $\mathcal{A}$ given by the norm-convergent Schwinger–Dyson series

$$\alpha^{(\varepsilon)}_{t,\lambda}(A) = \alpha_{t,0}(A) + \sum_{n=1}^{\infty} (i\lambda)^n \int_0^t dt_1 \cdots \int_0^{t_{n-1}} dt_n\, \bigl[\alpha_{t_n,0}(V^{(\varepsilon)}), \bigl[\alpha_{t_{n-1},0}(V^{(\varepsilon)}), \ldots, \bigl[\alpha_{t_1,0}(V^{(\varepsilon)}), \alpha_{t,0}(A)\bigr] \cdots \bigr]\bigr], \tag{8}$$

for an arbitrary operator $A \in \mathcal{A}$. In Equation (8), $\lambda$ is a coupling constant, and the interaction term $V$ is chosen in accordance with conventional models describing electrons coupled to the quantized radiation field.

We are interested in analyzing the time evolution of the coupled system in some states $\omega$ of physical interest, i.e., in understanding the time-dependence of expectation values

$$\omega(\alpha^{(\varepsilon)}_{t,\lambda}(A)), \quad A \in \mathcal{A}, \tag{9}$$

in the limit where the regularization is removed, i.e., $\varepsilon \to 0$, and for large times $t$.

The states $\omega$ of interest are states 'close to' (technically speaking, normal with respect to) a reference state of the form

$$\omega_{\rho,\beta} := \omega^p_\rho \otimes \omega^f_\beta, \tag{10}$$

where $\omega^p_\rho$ is given by a density matrix $\rho$ on $\mathcal{H}_p$, see Equation (1), and $\omega^f_\beta$ is the thermal equilibrium state of the radiation field at temperature $T = (k_B\beta)^{-1}$, where $k_B$ is Boltzmann's constant. Technically, $\omega^f_\beta$ is defined as the unique $(\alpha^f_t, \beta)$-KMS state on the algebra $\mathcal{A}_f$; it is invariant under (or 'stationary' for) the free-field time evolution $\alpha^f_t$. If the density matrix $\rho$ describes an arbitrary statistical mixture of bound states of $H_p$, but $\rho$ vanishes on the subspace $L^2(\mathbb{R}^3, d^3x)$ of $\mathcal{H}_p$, then $\omega_{\rho,\beta}$ is stationary for the free time evolution $\alpha_{t,0}$ defined in Equation (7). However, it is not an equilibrium (KMS) state for $\alpha_{t,0}$. In fact, because $H_p$ has continuous spectrum, there are no equilibrium (KMS) states on $\mathcal{A}$ for the time evolution $\alpha_{t,0}$.

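Given the algebra $\mathcal{A}$ and a reference state $\omega_{\rho,\beta}$ on $\mathcal{A}$, as in Equation (10), the GNS construction associates with the pair $(\mathcal{A}, \omega_{\rho,\beta})$ a Hilbert space $\mathcal{H}$, a $*$-representation $\pi_\beta$ of $\mathcal{A}$ on $\mathcal{H}$, and a vector $\Omega_\rho \in \mathcal{H}$, cyclic for the algebra $\pi_\beta(\mathcal{A})$, such that

$$\omega_{\rho,\beta}(A) = \langle \Omega_\rho, \pi_\beta(A)\Omega_\rho \rangle, \tag{11}$$

for all $A \in \mathcal{A}$. The closure of the algebra $\pi_\beta(\mathcal{A})$ in the weak operator topology is a von Neumann algebra of bounded operators on $\mathcal{H}$ which we denote by $\mathcal{M}_\beta$. This algebra depends on $\beta$, but is independent of the choice of the density matrix $\rho$. The states $\omega$ on $\mathcal{A}$ of interest to us are given by vectors $\psi \in \mathcal{H}$ in such a way that

$$\omega(A) = \langle \psi, \pi_\beta(A)\psi \rangle, \quad A \in \mathcal{A}. \tag{12}$$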

We shall see that there exists a selfadjoint operator $L^{(\varepsilon)}_\lambda$ on $\mathcal{H}$ generating the time evolution of the coupled system, in the sense that

$$\pi_\beta(\alpha^{(\varepsilon)}_{t,\lambda}(A)) = e^{itL^{(\varepsilon)}_\lambda}\, \pi_\beta(A)\, e^{-itL^{(\varepsilon)}_\lambda}, \tag{13}$$

for $A \in \mathcal{A}$; $L^{(\varepsilon)}_\lambda$ is called the (regularized) Liouvillian. Clearly,

$$\sigma^{(\varepsilon)}_{t,\lambda}(K) := e^{itL^{(\varepsilon)}_\lambda}\, K\, e^{-itL^{(\varepsilon)}_\lambda}, \quad K \in \mathcal{M}_\beta, \tag{14}$$

defines a $*$-automorphism group of time translations on $\mathcal{M}_\beta$. For an interesting class of models, we shall show that

$$\operatorname*{s-lim}_{\varepsilon \to 0} e^{itL^{(\varepsilon)}_\lambda} =: e^{itL_\lambda} \tag{15}$$

exists, for all $t$, and defines a unitary one-parameter group on $\mathcal{H}$. It then follows from (14) and (15) that

$$\sigma_{t,\lambda}(K) := e^{itL_\lambda} K e^{-itL_\lambda} \tag{16}$$

defines a one-parameter group of $*$-automorphisms on the von Neumann algebra $\mathcal{M}_\beta$. The pair $(\mathcal{M}_\beta, \sigma_{t,\lambda})$ defines a so-called $W^*$-dynamical system. If the coupling constant $\lambda$ vanishes then a state $\omega_{\rho,\beta} = \omega^p_\rho \otimes \omega^f_\beta$, where the density matrix $\rho$ vanishes on the subspace $L^2(\mathbb{R}^3, d^3x) \subset \mathcal{H}_p$ corresponding to the continuous spectrum of $H_p$ and commutes with $H_p$, is an invariant state for $\sigma_{t,0}$, in the sense that

$$\omega_{\rho,\beta}(\sigma_{t,0}(K)) := \langle \Omega_\rho, \sigma_{t,0}(K)\Omega_\rho \rangle = \omega_{\rho,\beta}(K), \tag{17}$$

for all $K \in \mathcal{M}_\beta$.

The main result proven in this paper can be described as follows: For an interesting class of interactions, $V$, for an arbitrary inverse temperature $0 < \beta < \infty$, and for all real coupling constants $\lambda$ with $0 < |\lambda| < \lambda_0(\beta)$, where $\lambda_0(\beta)$ depends on the choice of $V$, and on $\beta$ as $\lambda_0(\beta) \sim e^{\beta E_0}$, where $E_0 < 0$ is the ground state energy of the particle system, there do not exist any states $\omega$ on $\mathcal{M}_\beta$ close, in the sense of Equation (12), to a reference state $\omega_{\rho,\beta}$, as in Equation (10), which are invariant under the time evolution $\sigma_{t,\lambda}$ on $\mathcal{M}_\beta$ (in the sense that $\omega(\sigma_{t,\lambda}(K)) = \omega(K)$, for $K \in \mathcal{M}_\beta$).

In other words, we show that, under the hypotheses described above, there are no time-translation invariant states of the coupled system of asymptotic temperature $T = (k_B\beta)^{-1} > 0$. It will turn out that this result is a consequence of the following one: For a certain canonical definition of the Liouvillian $L_\lambda$ of the coupled system, and under the hypotheses sketched above, $L_\lambda$ does not have any eigenvectors $\psi \in \mathcal{H}$; in particular, $\ker L_\lambda = \{0\}$. This result will be proven with the help of Mourre's theory of positive commutators applied to $L_\lambda$ and a new virial theorem.

As a corollary of our results it follows that, for an arbitrary vector $\psi \in \mathcal{H}$ and an arbitrary compact operator $K$ on $\mathcal{H}$,

$$\langle \psi, e^{itL_\lambda} K e^{-itL_\lambda}\psi \rangle \longrightarrow 0, \tag{18}$$

as time $t \to \infty$ (at least in the sense of ergodic means). This means that the survival probability of an arbitrary bound state of the atom coupled to the quantized radiation field in a thermal equilibrium state at positive temperature tends to zero, as time $t \to \infty$. Heuristically, this can be understood by using Fermi's Golden Rule.


One may wonder what the quantum-mechanical motion of an electron looks like after it has been knocked off the atom by a high-energy boson, i.e., after thermal ionization. We cannot give an answer to this question in this paper, because we are not able to analyze appropriately realistic models yet. But it is natural to expect that this motion will be diffusive, furnishing an example of 'quantum Brownian motion.' Progress on this question would be highly desirable.

Organization of the paper. In Section 2, we define the model, and state our main result on thermal ionization, Theorem 2.4, which follows from spectral properties of the Liouvillian proven in our key technical theorem, Theorem 2.3. In Section 3, we state two general virial theorems, Theorems 3.2 and 3.3, we present a result on regularity of eigenfunctions of Liouvillians, Theorem 3.4, and explain some basic ideas of the positive commutator method. The proof of Theorem 2.3 (spectrum of Liouvillian) is given in Section 4. It consists of two main parts: verification that the virial theorems are applicable in the particular situation encountered in the analysis of our models (Subsection 4.2), and proof of a lower bound on a commutator of the Liouvillian with a suitable conjugate operator (Subsections 4.3, 4.4). In Section 5, we establish some technical results on the invariance of operator domains and on certain commutator expansions that are needed in the proofs of the virial theorems and of the theorem on regularity of eigenfunctions. Proofs of the latter results are presented in Section 6. In Section 7, we describe some results on unitary groups generated by vector fields which are needed in the definition of our 'conjugate operator' $A^a_p$ in the positive commutator method. The last section, Section 8, contains proofs of several propositions used in earlier sections of the paper.

2. Definition of Models and Main Results on Thermal Ionization

In Section 2.1, we introduce our model and use it to define a $W^*$-dynamical system $(\mathcal{M}_\beta, \sigma_{t,\lambda})$. Our main results on thermal ionization are described in Section 2.2.

2.1. DEFINITION OF THE MODEL

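Starting with the algebra $\mathcal{A}$ and a (regularized) dynamics $\alpha^{(\varepsilon)}_{t,\lambda}$ on it, we introduce a reference state $\omega_{\rho_0,\beta}$, and consider the induced (regularized) dynamics $\sigma^{(\varepsilon)}_{t,\lambda}$ on $\pi_\beta(\mathcal{A})$, where $(\mathcal{H}, \pi_\beta, \Omega_{\rho_0})$ denotes the GNS representation corresponding to $(\mathcal{A}, \omega_{\rho_0,\beta})$. We show that, as $\varepsilon \to 0$, $\sigma^{(\varepsilon)}_{t,\lambda}$ tends to a $*$-automorphism group, $\sigma_{t,\lambda}$, of the von Neumann algebra $\mathcal{M}_\beta$, defined as the weak closure of $\pi_\beta(\mathcal{A})$ in $\mathcal{B}(\mathcal{H})$. We determine the generator, $L_\lambda$, of the unitary group, $e^{itL_\lambda}$, on $\mathcal{H}$ implementing $\sigma_{t,\lambda}$; $L_\lambda$ is called a Liouvillian. The relation between eigenvalues of $L_\lambda$ and invariant normal states on $\mathcal{M}_\beta$ will be explained later in this section (see Theorem 2.2). We will sometimes write simply $L$ instead of $L_\lambda$, for $\lambda \neq 0$.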


2.1.1. The Algebra $\mathcal{A}_f$

We introduce a $C^*$-algebra suitable for the description of the dynamics of the free field, and, as we explain below, for the description of the interacting dynamics.

Let $\mathfrak{W} = \mathfrak{W}(L^2_0)$ be the Weyl CCR algebra over

$$L^2_0 := L^2(\mathbb{R}^3, d^3k) \cap L^2(\mathbb{R}^3, |k|^{-1} d^3k),$$

i.e., the $C^*$-algebra generated by the Weyl operators, $W(f)$, for $f \in L^2_0$, satisfying

$$W(-f) = W(f)^*, \quad W(f)W(g) = e^{-i\,\mathrm{Im}\langle f,g\rangle/2}\, W(f+g).$$

Here $\langle\cdot,\cdot\rangle$ denotes the inner product of $L^2_0$. The latter relation implies the CCR

$$W(f)W(g) = e^{-i\,\mathrm{Im}\langle f,g\rangle}\, W(g)W(f). \tag{19}$$
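The step from the Weyl product rule to the CCR (19) can be spelled out as follows. This is a routine check, written here as a sketch; it uses only the antisymmetry $\mathrm{Im}\langle g,f\rangle = -\mathrm{Im}\langle f,g\rangle$.

```latex
\begin{align*}
W(f)W(g) &= e^{-\frac{i}{2}\mathrm{Im}\langle f,g\rangle}\, W(f+g), \\
W(g)W(f) &= e^{-\frac{i}{2}\mathrm{Im}\langle g,f\rangle}\, W(g+f)
          = e^{+\frac{i}{2}\mathrm{Im}\langle f,g\rangle}\, W(f+g), \\
\text{hence}\quad
W(f)W(g) &= e^{-\frac{i}{2}\mathrm{Im}\langle f,g\rangle}
            \, e^{-\frac{i}{2}\mathrm{Im}\langle f,g\rangle}\, W(g)W(f)
          = e^{-i\,\mathrm{Im}\langle f,g\rangle}\, W(g)W(f).
\end{align*}
```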

The expectation functional for the KMS state of an infinitely extended free Bose field in thermal equilibrium at inverse temperature $\beta$ is given by

$$g \longmapsto \omega^f_\beta(W(g)) = \exp\left\{ -\frac{1}{4} \int_{\mathbb{R}^3} \left( 1 + \frac{2}{e^{\beta|k|} - 1} \right) |g(k)|^2\, d^3k \right\},$$

which motivates the choice of the space $L^2_0$ (as opposed to $g \in L^2(\mathbb{R}^3)$).
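The role of the weight $|k|^{-1}$ in the definition of $L^2_0$ can be made explicit by rewriting the thermal factor. The following is a short worked computation added for illustration, not a quotation from the paper.

```latex
\begin{align*}
1 + \frac{2}{e^{\beta|k|} - 1}
  &= \frac{e^{\beta|k|} + 1}{e^{\beta|k|} - 1}
   = \coth\!\left(\frac{\beta|k|}{2}\right), \\
\coth\!\left(\frac{\beta|k|}{2}\right)
  &= \frac{2}{\beta|k|} + O(|k|) \quad \text{as } |k| \to 0,
  \qquad
  \coth\!\left(\frac{\beta|k|}{2}\right) \to 1 \quad \text{as } |k| \to \infty.
\end{align*}
```

The exponent is therefore finite precisely when both $\int |g(k)|^2\, d^3k$ and $\int |k|^{-1} |g(k)|^2\, d^3k$ are finite, i.e., when $g \in L^2_0$.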

The free field dynamics on $\mathfrak{W}$ is given by the $*$-automorphism group

$$\alpha^{\mathfrak{W}}_t(W(f)) = W(e^{i|k|t} f). \tag{20}$$

It is well known that for $f \neq 0$, $t \mapsto \alpha^{\mathfrak{W}}_t(W(f))$ is not a continuous map from $\mathbb{R}$ to $\mathfrak{W}$, but $t \mapsto \omega(\alpha^{\mathfrak{W}}_t(W(f)))$ is continuous for a large (weak$^*$ dense) class of states $\omega$ on $\mathfrak{W}$. An interacting dynamics is commonly defined using a Dyson series expansion, hence we should be able to give a sense to time integrals over $\alpha^{\mathfrak{W}}_t(a)$, for $a \in \mathfrak{W}$. Because of the lack of norm-continuity of the free dynamics, such an integral cannot be interpreted in norm sense, but only in a weak, hence representation-dependent, way. In order to give a representation-independent definition of the (coupled) dynamics, we modify the algebra in such a way that the free dynamics becomes norm-continuous. The idea is to introduce a time-averaged Weyl algebra, generated by elements given by

$$a(h) = \int_{\mathbb{R}} ds\, h(s)\, \alpha^{\mathfrak{W}}_s(a), \tag{21}$$

for functions $h$ in a certain class, and $a \in \mathfrak{W}$ (if $h$ is sharply localized at zero, the integral approximates $a \in \mathfrak{W}$). The free dynamics is then given by

$$\alpha^f_t(a(h)) = \int_{\mathbb{R}} ds\, h(s)\, \alpha^{\mathfrak{W}}_s(\alpha^{\mathfrak{W}}_t(a)) = \int_{\mathbb{R}} ds\, h(s-t)\, \alpha^{\mathfrak{W}}_s(a).$$

We now construct a $C^*$-algebra whose elements, when represented on a Hilbert space, are given by (21), where the integral is understood in a weak sense.


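Let $\mathcal{P}$ be the free algebra generated by elements

$$\{a(h) \mid a \in \mathfrak{W},\ \hat{h} \in C_0^\infty(\mathbb{R})\},$$

where $\hat{\ }$ denotes the Fourier transform. Taking the functions $h$ to be analytic (i.e., having a Fourier transform in $C_0^\infty$) allows us to construct KMS states w.r.t. the free dynamics, as we explain below. We equip the algebra $\mathcal{P}$ with the star operation defined by $(a(h))^* = (a^*)(\bar{h})$, and introduce the seminorm

$$p(a(h)) = \sup_{\pi \in \mathrm{Rep}\,\mathfrak{W}} \left\| \int_{\mathbb{R}} dt\, h(t)\, \pi(\alpha^{\mathfrak{W}}_t(a)) \right\|, \tag{22}$$

where the supremum extends over all representations of $\mathfrak{W}$. The integral on the r.h.s. of (22) is understood in the weak sense ($t \mapsto \pi(\alpha^{\mathfrak{W}}_t(a))$ is weakly measurable for any $\pi \in \mathrm{Rep}\,\mathfrak{W}$), and the norm is the one of operators acting on the representation Hilbert space. It is not difficult to verify that

$$\mathcal{N} = \{a \in \mathcal{P} \mid p(a) = 0\}$$

is a two-sided $*$-ideal in $\mathcal{P}$. We can therefore build the quotient $*$-algebra $\mathcal{P}/\mathcal{N}$ consisting of equivalence classes $[a] = \{a + n \mid a \in \mathcal{P},\ n \in \mathcal{N}\}$, on which $p$ defines a norm

$$\|[a]\| = p(a), \quad [a] \in \mathcal{P}/\mathcal{N},$$

having the $C^*$ property

$$\|[a]^*[a]\| = \|[a]\|^2.$$

The $C^*$-algebra $\mathcal{A}_f$ of the field is defined to be the closure of the quotient in this norm,

$$\mathcal{A}_f = \overline{\mathcal{P}/\mathcal{N}}^{\,\|\cdot\|}.$$

Notice that every $\pi_{\mathfrak{W}} \in \mathrm{Rep}\,\mathfrak{W}$ induces a representation $\pi_f \in \mathrm{Rep}\,\mathcal{A}_f$ according to $\pi_f(a(h)) = \int dt\, h(t)\, \pi_{\mathfrak{W}}(\alpha^{\mathfrak{W}}_t(a))$. The algebra $\mathcal{A}_f$ can be viewed as a time-averaged version of the Weyl algebra. The advantage of $\mathcal{A}_f$ over $\mathfrak{W}$ is that the free field dynamics on $\mathcal{A}_f$, defined by

$$\alpha^f_t(a(h)) = a(h_t), \quad h_t(x) = h(x - t), \tag{23}$$

is a norm-continuous $*$-automorphism group, i.e., $\|\alpha^f_t(a) - a\| \to 0$, as $t \to 0$, for any $a \in \mathcal{A}_f$.

There is a one-to-one correspondence between $(\beta, \alpha^{\mathfrak{W}}_t)$-KMS states $\omega^{\mathfrak{W}}_\beta$ on $\mathfrak{W}$ and $(\beta, \alpha^f_t)$-KMS states $\omega^f_\beta$ on $\mathcal{A}_f$, given by the relation

$$\omega^f_\beta(a_1(f_1) \cdots a_n(f_n)) = \int dt_1 \cdots dt_n\, f_1(t_1) \cdots f_n(t_n)\, \omega^{\mathfrak{W}}_\beta\bigl(\alpha^{\mathfrak{W}}_{t_1}(a_1) \cdots \alpha^{\mathfrak{W}}_{t_n}(a_n)\bigr).$$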

If $(\mathcal{H}, \pi^\beta_{\mathfrak{W}}, \Omega)$ is the GNS representation of $(\mathfrak{W}, \omega^{\mathfrak{W}}_\beta)$ then the one of $(\mathcal{A}_f, \omega^f_\beta)$ is given by $(\mathcal{H}, \pi^\beta_f, \Omega)$, where

$$\pi^\beta_f(a_1(f_1) \cdots a_n(f_n)) = \int dt_1 \cdots dt_n\, f_1(t_1) \cdots f_n(t_n)\, \pi^\beta_{\mathfrak{W}}\bigl(\alpha^{\mathfrak{W}}_{t_1}(a_1) \cdots \alpha^{\mathfrak{W}}_{t_n}(a_n)\bigr). \tag{24}$$

It follows that any unitary group implementing the free dynamics relative to $\pi^\beta_{\mathfrak{W}}$ implements it in the representation $\pi^\beta_f$, and conversely.

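2.1.2. The Algebra $\mathcal{A}$ and the Regularized Dynamics $\alpha^{(\varepsilon)}_{t,\lambda}$

The $C^*$-algebra $\mathcal{A}$ describing the 'observables' of the combined system is the tensor product algebra

$$\mathcal{A} = \mathcal{A}_p \otimes \mathcal{A}_f. \tag{25}$$

Here $\mathcal{A}_p = \mathcal{B}(\mathcal{H}_p)$ is the $C^*$-algebra of all bounded operators on the particle Hilbert space

$$\mathcal{H}_p = \mathbb{C} \oplus L^2(\mathbb{R}_+, de; \mathfrak{H}) \equiv \mathbb{C} \oplus \int^{\oplus}_{\mathbb{R}_+} \mathfrak{H}_e\, de, \tag{26}$$

where $de$ is the Lebesgue measure on $\mathbb{R}_+$, $\mathfrak{H}$ is a (separable) Hilbert space, and the r.h.s. is the constant fibre direct integral with $\mathfrak{H}_e \cong \mathfrak{H}$, $e \in \mathbb{R}_+$. An element in $\mathcal{H}_p$ is given by $\psi = \{\psi(e)\}_{e \in \{E\} \cup \mathbb{R}_+}$, where $\psi(E) \in \mathbb{C}$, and $\psi(e) \in \mathfrak{H}$, $e \in \mathbb{R}_+$. $\mathcal{H}_p$ is a Hilbert space with inner product

$$\langle \psi, \phi \rangle = \overline{\psi(E)}\,\phi(E) + \int_{\mathbb{R}_+} \langle \psi(e), \phi(e) \rangle_{\mathfrak{H}}\, de.$$

Let $\alpha^p_t$ denote the $*$-automorphism group on $\mathcal{A}_p$ given by

$$\alpha^p_t(A) = e^{itH_p} A e^{-itH_p},$$

where $H_p$ is a selfadjoint operator on $\mathcal{H}_p$, which is diagonalized by the direct integral decomposition of $\mathcal{H}_p$:

$$H_p = E \oplus \int^{\oplus}_{\mathbb{R}_+} e\, de, \quad \text{for some } E < 0. \tag{27}$$

The domain of definition of $H_p$ is given by

$$\mathcal{D}(H_p) = \mathbb{C} \oplus \left\{ \psi \in \int^{\oplus}_{\mathbb{R}_+} \mathfrak{H}_e\, de \;\middle|\; \int_{\mathbb{R}_+} e^2 \|\psi(e)\|^2_{\mathfrak{H}}\, de < \infty \right\}. \tag{28}$$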

The dense set $C_0^\infty(\mathbb{R}_+; \mathfrak{H}) \equiv C_0^\infty$ consists of all elements $\psi \in \mathcal{H}_p$ s.t. the support, $\operatorname{supp}(\psi \upharpoonright \mathbb{R}_+)$, is a compact set in the open half-axis $(0,\infty)$, and s.t. $\psi$ is infinitely many times continuously differentiable as an $\mathfrak{H}$-valued function. Clearly, $C_0^\infty \subset \mathcal{D}(H_p)$, and since $e^{itH_p}$ leaves $C_0^\infty$ invariant, it follows that $C_0^\infty$ is a core for $H_p$. It is sometimes practical to identify $\mathbb{C} \cong \mathbb{C}\varphi_0$, and we say that $\varphi_0$ is the eigenfunction of $H_p$ corresponding to the eigenvalue $E$.

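EXAMPLE. This model is inspired by considering a block-diagonal Hamiltonian $H_p$ on the Hilbert space $\mathbb{C} \oplus L^2(\mathbb{R}^3, d^3x)$, with $H_p \upharpoonright \mathbb{C} = E < 0$, $H_p \upharpoonright L^2(\mathbb{R}^3, d^3x) = -\Delta$. Passing to a diagonal representation of the Laplacian (Fourier transform), we have the following identifications, using polar coordinates:

$$\begin{aligned}
\mathcal{H}_p &= \mathbb{C} \oplus L^2(\mathbb{R}^3, d^3k)\\
&= \mathbb{C} \oplus L^2(\mathbb{R}_+ \times S^2, |k|^2\, d|k| \times d\Sigma)\\
&= \mathbb{C} \oplus L^2(\mathbb{R}_+, |k|^2\, d|k|; L^2(S^2, d\Sigma))\\
&= \mathbb{C} \oplus L^2(\mathbb{R}_+, d\mu; \mathfrak{H}),
\end{aligned}$$

where we set $\mathfrak{H} = L^2(S^2, d\Sigma)$, and make the change of variables $|k|^2 = e$, so that $d\mu(e) = \mu(e)\, de$, with $\mu(e) = (1/2)\sqrt{e}$. To arrive at the form (26), (27) of $\mathcal{H}_p$, $H_p$, we use the unitary map $U: L^2(\mathbb{R}_+, d\mu; \mathfrak{H}) \to L^2(\mathbb{R}_+, de; \mathfrak{H})$, given by

$$\psi \longmapsto U\psi = \sqrt{\mu}\,\psi.$$

If $H_p$ is the operator of multiplication by $e$ on $L^2(\mathbb{R}_+, d\mu; \mathfrak{H})$, then its transform, $U H_p U^{-1}$, is the operator of multiplication by $e$ on $L^2(\mathbb{R}_+, de; \mathfrak{H})$.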
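The density $\mu(e)$ and the unitarity of $U$ in the example can be verified directly. The computation below is a sketch added for illustration; the symbol $\Sigma$ for the angular variable on $S^2$ is an assumed notation.

```latex
\begin{align*}
|k| = \sqrt{e}, \quad d|k| = \frac{de}{2\sqrt{e}}
  \quad &\Longrightarrow \quad
  |k|^2\, d|k| = \frac{e}{2\sqrt{e}}\, de = \tfrac{1}{2}\sqrt{e}\, de = \mu(e)\, de, \\
\|U\psi\|^2_{L^2(\mathbb{R}_+, de; \mathfrak{H})}
  &= \int_{\mathbb{R}_+} \bigl\|\sqrt{\mu(e)}\,\psi(e)\bigr\|^2_{\mathfrak{H}}\, de
   = \int_{\mathbb{R}_+} \|\psi(e)\|^2_{\mathfrak{H}}\, \mu(e)\, de
   = \|\psi\|^2_{L^2(\mathbb{R}_+, d\mu; \mathfrak{H})}, \\
(U H_p U^{-1}\phi)(e)
  &= \sqrt{\mu(e)}\; e\; \frac{\phi(e)}{\sqrt{\mu(e)}} = e\,\phi(e).
\end{align*}
```

The last line uses only that multiplication by $e$ commutes with multiplication by $\sqrt{\mu(e)}$.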

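We define the noninteracting time-translation $*$-automorphism group of $\mathcal{A}$ (the free dynamics) by

$$\alpha_{t,0} := \alpha^p_t \otimes \alpha^f_t.$$

Given $\varepsilon \neq 0$, set

$$V^{(\varepsilon)} := \sum_\alpha G_\alpha \otimes \frac{1}{2i\varepsilon} \bigl\{ (W(\varepsilon g_\alpha))(h_\varepsilon) - (W(\varepsilon g_\alpha))(h_\varepsilon)^* \bigr\} \in \mathcal{A}, \tag{29}$$

where the sum is over finitely many indices $\alpha$, with $G_\alpha = G_\alpha^* \in \mathcal{B}(\mathcal{H}_p)$, $g_\alpha \in L^2_0$, for all $\alpha$, and where $h_\varepsilon$ is an approximation of the Dirac distribution localized at zero. To be specific, we can take $h_\varepsilon(t) = (1/\varepsilon)\, e^{-t^2/\varepsilon^2}$. For any value of the real coupling constant $\lambda$, the norm-convergent Dyson series

$$\alpha_{t,0}(A) + \sum_{n \geq 1} (i\lambda)^n \int_0^t dt_1 \cdots \int_0^{t_{n-1}} dt_n\, \bigl[\alpha_{t_n,0}(V^{(\varepsilon)}), \bigl[\cdots \bigl[\alpha_{t_1,0}(V^{(\varepsilon)}), \alpha_{t,0}(A)\bigr] \cdots \bigr]\bigr] =: \alpha^{(\varepsilon)}_{t,\lambda}(A), \tag{30}$$

where $A \in \mathcal{A}$, defines a $*$-automorphism group of $\mathcal{A}$. The multiple integral in (30) is understood in the product topology coming from the strong topology of $\mathcal{B}(\mathcal{H}_p)$ and the norm topology of $\mathcal{A}_f$.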

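One should view $\alpha^{(\varepsilon)}_{t,\lambda}$ as a regularized dynamics, in the sense that it has a limit, as $\varepsilon \to 0$, in suitably chosen representations of $\mathcal{A}$ (this is shown below).

The functions $g_\alpha \in L^2_0$ are called form factors. Using spherical coordinates in $\mathbb{R}^3$, we often write $g_\alpha = g_\alpha(\omega, \Sigma)$, where $(\omega, \Sigma) \in \mathbb{R}_+ \times S^2$.

In accordance with the direct integral decomposition of $\mathcal{H}_p$, the operators $G_\alpha$ are determined by integral kernels. For $\psi = \{\psi(e)\} \in \mathcal{H}_p$, we set

$$(G_\alpha\psi)(e) = \begin{cases}
G_\alpha(E,E)\psi(E) + \displaystyle\int_{\mathbb{R}_+} G_\alpha(E,e')\psi(e')\, de', & \text{if } e = E,\\[2ex]
G_\alpha(e,E)\psi(E) + \displaystyle\int_{\mathbb{R}_+} G_\alpha(e,e')\psi(e')\, de', & \text{if } e \in \mathbb{R}_+.
\end{cases} \tag{31}$$

The families of bounded operators $G_\alpha(e,e'): \mathfrak{H}_{e'} \to \mathfrak{H}_e$, with $\mathfrak{H}_E = \mathbb{C}$, have the following symmetry properties (guaranteeing that $G_\alpha$ is selfadjoint):

$$\begin{aligned}
G_\alpha(E,E) &\in \mathbb{R},\\
G_\alpha(E,e)^* &= G_\alpha(e,E), && \forall e \in \mathbb{R}_+,\\
G_\alpha(e,e')^* &= G_\alpha(e',e), && \forall e, e' \in \mathbb{R}_+.
\end{aligned}$$

Here, $*$ indicates taking the adjoint of an operator in $\mathcal{B}(\mathfrak{H}, \mathbb{C})$ or $\mathcal{B}(\mathfrak{H})$.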

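Remarks. (1) The map $G_\alpha(E,e): \mathfrak{H}_e \to \mathbb{C}$ is identified (Riesz) with an element $\Gamma_\alpha(e) \in \mathfrak{H}_e$, so that $G_\alpha(E,e)\psi(e) = \langle \Gamma_\alpha(e), \psi(e) \rangle_{\mathfrak{H}_e}$. Then $G_\alpha(E,e)^*: \mathbb{C} \to \mathfrak{H}_e$ is given by $G_\alpha(E,e)^* z = z\,\Gamma_\alpha(e)$, for all $z \in \mathbb{C}$. Consequently, the above symmetry condition implies that $G_\alpha(e,E) z = z\,\Gamma_\alpha(e)$.

(2) Assuming the strong derivatives w.r.t. the two arguments $(e,e') \in \mathbb{R}^2_+$ of $G_\alpha(\cdot,\cdot)$ exist, we have that $\partial_{1,2} G_\alpha(e,e')$ are operators $\mathfrak{H} \to \mathfrak{H}$. Similarly, one introduces higher derivatives. We assume that all derivatives occurring are bounded operators on $\mathfrak{H}$. For $G_\alpha(\cdot,\cdot) \in C^n(\mathbb{R}_+ \times \mathbb{R}_+, \mathcal{B}(\mathfrak{H}))$, it is easily verified that the above symmetry conditions imply that

$$\bigl(\partial_1^{n_1} \partial_2^{n_2} G_\alpha(e,e')\bigr)^* = \partial_1^{n_2} \partial_2^{n_1} G_\alpha(e',e), \tag{32}$$

for any $n_{1,2} \geq 0$, $n_1 + n_2 \leq n$, where $*$ is the adjoint on $\mathcal{B}(\mathfrak{H})$. Similar statements hold for $G_\alpha(E,e)$, $G_\alpha(e,E)$.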

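The interaction is required to satisfy the following three conditions:

(A1) Infrared and ultraviolet behaviour of the form factors: for any fixed $\Sigma \in S^2$, $g_\alpha(\cdot, \Sigma) \in C^4(\mathbb{R}_+)$, and there are two constants $0 < k_1, k_2 < \infty$, s.t. if $\omega < k_1$, then

$$|\partial_\omega^j g_\alpha(\omega, \Sigma)| < k_2\, \omega^{p-j}, \quad \text{for some } p > 2, \tag{33}$$

uniformly in $\alpha$, $j = 0, \ldots, 4$ and $\Sigma \in S^2$. Similarly, there are two constants $0 < K_1, K_2 < \infty$, s.t. if $\omega > K_1$, then

$$|\partial_\omega^j g_\alpha(\omega, \Sigma)| < K_2\, \omega^{-q-j}, \quad \text{for some } q > \tfrac{7}{2}. \tag{34}$$

(A2) The map $(e,e') \mapsto G_\alpha(e,e')$ is $C^3(\mathbb{R}_+ \times \mathbb{R}_+, \mathcal{B}(\mathfrak{H}))$, and we have

$$\int_{\mathbb{R}_+} de\, \bigl\| e^{-m_1} \partial_1^{m_2} G_\alpha(e,E) \bigr\|^2_{\mathfrak{H}} < \infty, \tag{35}$$

$$\int_{\mathbb{R}_+} de \int_{\mathbb{R}_+} de'\, \bigl\| e^{-m_1} (e')^{-m_1'} \partial_1^{m_2} \partial_2^{m_2'} G_\alpha(e,e') \bigr\|^2_{\mathcal{B}(\mathfrak{H})} < \infty, \tag{36}$$

for all integers $m_{1,2}, m'_{1,2} \geq 0$, s.t. $m_1 + m_1' + m_2 + m_2' = 0, 1, 2, 3$. Moreover,

$$\int_{\mathbb{R}_+} de \int_{\mathbb{R}_+} de'\, \| e\, G_\alpha(e,e') \|^2_{\mathcal{B}(\mathfrak{H})} < \infty. \tag{37}$$

(A3) The Fermi Golden Rule condition. Define a family of bounded operators on $\mathcal{H}_p$ by

$$F(\omega, \Sigma) = \sum_\alpha g_\alpha(\omega, \Sigma)\, G_\alpha. \tag{38}$$

There is an $\varepsilon_0 > 0$, s.t. for $0 < \varepsilon < \varepsilon_0$, we have that

$$\int_{-E}^{\infty} d\omega \int_{S^2} d\Sigma\, \frac{\omega^2}{e^{\beta\omega} - 1}\; p_0\, F(\omega, \Sigma)\, \frac{\bar{p}_0\, \varepsilon}{(H_p - E - \omega)^2 + \varepsilon^2}\, F(\omega, \Sigma)^*\, p_0 \;\geq\; \gamma\, p_0, \tag{39}$$

for some strictly positive constant $\gamma > 0$. Here $p_0$ is the orthogonal projection onto the eigenspace $\mathbb{C}$ of $H_p$ (see (26), (27)), and $\bar{p}_0 = 1 - p_0$ is the projection onto $L^2(\mathbb{R}_+, de; \mathfrak{H})$.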

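Remarks. (1) Since $E < 0$ we have that $\gamma \sim e^{\beta E}$ decays exponentially in $\beta$, for large $\beta$.

(2) Recalling that $G_\alpha(E,e)$ is identified with $\Gamma_\alpha(e) \in \mathfrak{H}_e$, see Remark (1) after (31) above, we can rewrite the l.h.s. of (39) as

$$\int_{(-E,\infty) \times S^2} d\omega\, d\Sigma \int_{\mathbb{R}_+} de\, \frac{\omega^2}{e^{\beta\omega} - 1}\, \frac{\varepsilon}{(e - E - \omega)^2 + \varepsilon^2} \sum_{\alpha,\alpha'} g_\alpha(\omega, \Sigma)\, \langle \Gamma_\alpha(e), \Gamma_{\alpha'}(e) \rangle_{\mathfrak{H}}\, \overline{g_{\alpha'}(\omega, \Sigma)},$$

and this expression has the limit

$$\int_{(-E,\infty) \times S^2} d\omega\, d\Sigma\, \frac{\omega^2}{e^{\beta\omega} - 1} \sum_{\alpha,\alpha'} g_\alpha(\omega, \Sigma)\, \langle \Gamma_\alpha(E+\omega), \Gamma_{\alpha'}(E+\omega) \rangle_{\mathfrak{H}}\, \overline{g_{\alpha'}(\omega, \Sigma)},$$

as $\varepsilon \to 0$, because $\Gamma_\alpha(e)$ is continuous in $e$. Consequently, (39) is satisfied if this integral is strictly positive.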


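2.1.3. The Reference State $\omega_{\rho_0,\beta}$

Let $\rho_0$ be a strictly positive density matrix on $\mathcal{H}_p$, i.e., $\rho_0 > 0$, $\operatorname{tr} \rho_0 = 1$, and denote by $\omega^p_{\rho_0}$ the state on $\mathcal{A}_p$ given by $A \mapsto \operatorname{tr} \rho_0 A$. Let $\omega^f_\beta$ be the $(\alpha^f_t, \beta)$-KMS state on $\mathcal{A}_f$ and define the reference state

$$\omega_{\rho_0,\beta} = \omega^p_{\rho_0} \otimes \omega^f_\beta.$$

The GNS representation $(\mathcal{H}, \pi_\beta, \Omega_{\rho_0})$ corresponding to $(\mathcal{A}, \omega_{\rho_0,\beta})$ is explicitly known. It was first described in [AW] (we follow [JP] in its presentation). The representation Hilbert space is

$$\mathcal{H} = \mathcal{H}_p \otimes \mathcal{H}_p \otimes \mathcal{F}, \tag{40}$$

where $\mathcal{F}$ is a shorthand for the Fock space

$$\mathcal{F} = \mathcal{F}\bigl(L^2(\mathbb{R} \times S^2, du \times d\Sigma)\bigr), \tag{41}$$

$du$ being the Lebesgue measure on $\mathbb{R}$, and $d\Sigma$ the uniform measure on $S^2$. $\mathcal{F}(X)$ denotes the bosonic Fock space over a (normed vector) space $X$:

$$\mathcal{F}(X) := \mathbb{C} \oplus \bigoplus_{n \geq 1} \bigl(\mathcal{S} X^{\otimes n}\bigr), \tag{42}$$

where $\mathcal{S}$ is the projection onto the symmetric subspace of the tensor product. We adopt standard notation, e.g., $\Omega$ is the vacuum vector, $[\psi]_n$ is the $n$-particle component of $\psi \in \mathcal{F}(X)$, $d\Gamma(A)$ is the second quantization of the operator $A$ on $X$, $N = d\Gamma(1)$ is the number operator.

The representation map $\pi_\beta: \mathcal{A} \to \mathcal{B}(\mathcal{H})$ is the product

$$\pi_\beta = \pi_p \otimes \pi^\beta_f,$$

where the $*$-homomorphism $\pi_p: \mathcal{A}_p \to \mathcal{B}(\mathcal{H}_p \otimes \mathcal{H}_p)$ is given by

$$\pi_p(A) = A \otimes 1_p.$$

The representation map $\pi^\beta_f: \mathcal{A}_f \to \mathcal{B}(\mathcal{F})$ is determined by the representation map of the Weyl algebra, $\pi^\beta_{\mathfrak{W}}: \mathfrak{W} \to \mathcal{B}(\mathcal{F})$, according to (24). To describe $\pi^\beta_{\mathfrak{W}}$, we point out that $L^2(\mathbb{R}_+ \times S^2) \oplus L^2(\mathbb{R}_+ \times S^2)$ is isometrically isomorphic to $L^2(\mathbb{R} \times S^2)$ via the map

$$(f, g) \mapsto h, \quad h(u, \Sigma) = \begin{cases} u\, f(u, \Sigma), & u > 0,\\ u\, g(-u, \Sigma), & u < 0. \end{cases} \tag{43}$$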

The representation map πβ

Wis given by

πβ

W= πFock ◦ Tβ,

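where the Bogoliubov transformation $T_\beta: W(L^2_0) \to W(L^2(\mathbb{R} \times S^2))$ acts as $W(f) \mapsto W(\tau_\beta f)$, with $\tau_\beta: L^2(\mathbb{R}_+ \times S^2) \to L^2(\mathbb{R} \times S^2)$ given by

$$(\tau_\beta f)(u,\Sigma) = \sqrt{\frac{u}{1 - e^{-\beta u}}} \begin{cases} \sqrt{u}\, f(u,\Sigma), & u > 0, \\ -\sqrt{-u}\, \overline{f(-u,\Sigma)}, & u < 0. \end{cases} \tag{44}$$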

Remarks. (1) It is easily verified that $\operatorname{Im}\langle \tau_\beta f, \tau_\beta g\rangle_{L^2(\mathbb{R}\times S^2)} = \operatorname{Im}\langle f, g\rangle_{L^2(\mathbb{R}_+\times S^2)}$, for all $f, g \in L^2_0$, so the CCR (19) are preserved under the map τβ.

(2) In the limit β → ∞, the r.h.s. of (44) tends to

$$\begin{cases} u f(u,\Sigma), & u > 0, \\ 0, & u < 0, \end{cases}$$

which is identified via (43) with $f \in L^2_0$. Thus, Tβ reduces to the identity (an imbedding), $\pi^\beta_W$ becomes the Fock representation of $W(L^2_0)$, as β → ∞, and we recover the zero temperature situation.
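Remark (1) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes radial test functions (no angular dependence), takes the inner product on R+ with weight u², as suggested by the map (43), and uses the complex conjugate on the u < 0 branch of (44). The functions `f`, `g` and the value of `BETA` are arbitrary illustrative choices.

```python
import cmath
import math

BETA = 1.3  # illustrative inverse temperature (assumption)


def f(u):
    # Hypothetical radial test function on R+.
    return (1.0 + 0.5j * u) * math.exp(-u)


def g(u):
    # Second hypothetical radial test function on R+.
    return cmath.exp(0.7j * u) * math.exp(-u * u)


def tau(func, u, beta=BETA):
    """Radial version of (44): conjugate and sign flip on the branch u < 0."""
    pref = math.sqrt(u / (1.0 - math.exp(-beta * u)))
    if u > 0:
        return pref * math.sqrt(u) * func(u)
    return -pref * math.sqrt(-u) * func(-u).conjugate()


def inner_glued(n=4000, umax=20.0):
    """Midpoint rule for <tau f, tau g> on the real line (symmetric grid)."""
    h = umax / n
    total = 0j
    for k in range(n):
        u = (k + 0.5) * h
        for s in (u, -u):
            total += tau(f, s).conjugate() * tau(g, s) * h
    return total


def inner_original(n=4000, umax=20.0):
    """Midpoint rule for <f, g> on R+ with weight u^2."""
    h = umax / n
    total = 0j
    for k in range(n):
        u = (k + 0.5) * h
        total += u * u * f(u).conjugate() * g(u) * h
    return total
```

Only the imaginary parts of the two inner products agree; the real parts differ by a thermal contribution, which is why the map preserves the CCR without being an isometry.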

It is useful to introduce the following notation. For f ∈ L2(R × S2), we define unitary operators, W(f), on the Hilbert space (40), by

$$W(f) = e^{i\varphi(f)}, \qquad f \in L^2(\mathbb{R} \times S^2),$$

where ϕ(f) is the selfadjoint operator on F given by

$$\varphi(f) = \frac{a^*(f) + a(f)}{\sqrt{2}}, \tag{45}$$

and a∗(f), a(f) are the creation and annihilation operators on F, smeared out with f. One easily verifies that

$$\pi^\beta_W(W(f)) = W(\tau_\beta f).$$

The cyclic GNS vector is given by

$$\Omega_{\rho_0} = \Omega_p^{\rho_0} \otimes \Omega,$$

where Ω is the vacuum in F, and

$$\Omega_p^{\rho_0} = \sum_{n \ge 0} k_n \varphi_n \otimes C_p \varphi_n \in H_p \otimes H_p. \tag{46}$$

Here, $\{k_n^2\}_{n=0}^\infty$ is the spectrum of ρ0, {ϕn} is an orthogonal basis of eigenvectors of ρ0, and Cp is an antilinear involution on Hp. The origin of Cp lies in the identification of l2(Hp) (Hilbert–Schmidt operators on Hp) with Hp ⊗ Hp, via $|\varphi\rangle\langle\psi| \mapsto \varphi \otimes C_p\psi$. We fix a convenient choice for Cp: it is the antilinear involution on Hp that has the effect of taking complex conjugates of components of vectors, in the basis in which the Hamiltonian Hp is diagonal, i.e.,

$$(C_p\psi)(e) = \begin{cases} \overline{\psi(e)} \in \mathbb{C}, & e = E, \\ \overline{\psi(e)} \in H, & e \in [0,\infty). \end{cases}$$

By $\overline{\psi(e)} \in H$ for e ∈ [0,∞), we understand the element in H obtained by complex conjugation of the components of ψ(e) ∈ H, in an arbitrary, but fixed, orthonormal basis of H. This Cp is also called the time reversal operator, and we have

CpHpCp = Hp.

2.1.4. The W∗-dynamical System (Mβ, σt,λ)

Let Mβ be the von Neumann algebra obtained by taking the weak closure (or equivalently, the double commutant) of πβ(A) in B(H):

$$M_\beta = B(H_p) \otimes 1_p \otimes \pi^\beta_f(A_f)'' \subset B(H).$$

Since ρ0 is strictly positive, $\Omega_p^{\rho_0}$ is cyclic and separating for the von Neumann algebra $\pi_p(A_p)'' = B(H_p) \otimes 1_p$. Similarly, Ω is cyclic and separating for $\pi^\beta_f(A_f)''$, since it is the GNS vector of a KMS state (see, e.g., [BRII]). Consequently, $\Omega_{\rho_0}$ is cyclic and separating for Mβ. Let J be the modular conjugation operator associated to $(M_\beta, \Omega_{\rho_0})$. It is given by

J = Jp ⊗ Jf , (47)

where, for ϕ,ψ ∈ Hp,

Jp(ϕ ⊗ Cpψ) = ψ ⊗ Cpϕ,

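and, for $\psi = \{[\psi]_n\}_{n \ge 0} \in F$,

$$[J_f\psi]_n(u_1,\ldots,u_n) = \overline{[\psi]_n(-u_1,\ldots,-u_n)}, \quad \text{for } n \ge 1,$$

$$[J_f\psi]_0 = \overline{[\psi]_0} \in \mathbb{C}.$$

Clearly, $J\Omega_{\rho_0} = \Omega_{\rho_0}$, and one verifies that

$$J_p \pi_p(A) J_p = 1_p \otimes C_p A C_p, \tag{48}$$

$$J_f \pi^\beta_W(W(f)) J_f = W(-e^{-\beta u/2}\tau_\beta(f)) = W(e^{-\beta u/2}\tau_\beta(f))^*, \tag{49}$$

for $f \in L^2_0$. More generally, for f ∈ L2(R × S2), $J_f W(f) J_f = W(\overline{f(-u,\Sigma)})$.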

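We now construct a unitary implementation of $\alpha^{(\varepsilon)}_{t,\lambda}$ w.r.t. πβ. Recall that $\pi_\beta = \pi_p \otimes \pi^\beta_f$, where πp: B(H) → B(Hp ⊗ Hp) is continuous w.r.t. the strong topologies and $\pi^\beta_f$: Af → B(F) is continuous w.r.t. the norm topologies (because it is a ∗-homomorphism). We thus have, for A ∈ A,

$$\pi_\beta(\alpha^{(\varepsilon)}_{t,\lambda}(A)) = \pi_\beta(\alpha_{t,0}(A)) + \sum_{n \ge 1} (i\lambda)^n \int_0^t dt_1 \cdots \int_0^{t_{n-1}} dt_n\, \big[\pi_\beta(\alpha_{t_n,0}(V^{(\varepsilon)})), \big[\cdots \big[\pi_\beta(\alpha_{t_1,0}(V^{(\varepsilon)})), \pi_\beta(\alpha_{t,0}(A))\big] \cdots \big]\big]. \tag{50}$$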

Because

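$$\pi_p(\alpha^p_t(A)) = e^{itH_p} A e^{-itH_p} \otimes 1_p = e^{it(H_p \otimes 1_p - 1_p \otimes H_p)}\, \pi_p(A)\, e^{-it(H_p \otimes 1_p - 1_p \otimes H_p)},$$

and

$$\pi^\beta_W(\alpha^W_t(W(f))) = \pi^\beta_W(W(e^{i\omega t} f)) = W(e^{iut}\tau_\beta(f)) = e^{it\, d\Gamma(u)}\, W(\tau_\beta(f))\, e^{-it\, d\Gamma(u)} = e^{it\, d\Gamma(u)}\, \pi^\beta_W(W(f))\, e^{-it\, d\Gamma(u)},$$

so that

$$\pi^\beta_f(\alpha^f_t(a)) = e^{it\, d\Gamma(u)}\, \pi^\beta_f(a)\, e^{-it\, d\Gamma(u)}, \qquad a \in A_f,$$

we find that

$$\sigma_{t,0}(\pi_\beta(A)) := \pi_\beta(\alpha_{t,0}(A)) = e^{itL_0}\, \pi_\beta(A)\, e^{-itL_0},$$

for all A ∈ A, where L0 is the selfadjoint operator on H, given by

$$L_0 = H_p \otimes 1_p - 1_p \otimes H_p + d\Gamma(u), \tag{51}$$

commonly called the (noninteracting, standard) Liouvillian. One easily verifies that

$$J e^{itL_0} = e^{itL_0} J. \tag{52}$$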

Remark. There are other selfadjoint operators generating unitary implementations of σt,0 on H. Indeed, we may add to L0 any selfadjoint operator $L_0'$ affiliated with the commutant $M_\beta'$; then $L_0 + L_0'$ still generates a unitary implementation of σt,0 on H. However, the additional condition (52) fixes L0 uniquely, and the generator of this unitary group is called the standard Liouvillian for σt,0. This terminology has been used before in [DJP]. The importance of considering the standard Liouvillian (as opposed to other generators of the dynamics) lies in the fact that its spectrum is related to the dynamical properties of the system; see Theorem 2.2.

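Notice that σt,0 is a group of ∗-automorphisms of πβ(A), in particular, $e^{itL_0}\pi_\beta(A)e^{-itL_0} = \pi_\beta(A)$, ∀t ∈ R. From Tomita–Takesaki theory, we know that $J M_\beta J = M_\beta'$ (the commutant), and since

$$\sigma_{t,0}(J\pi_\beta(V^{(\varepsilon)})J) = J\sigma_{t,0}(\pi_\beta(V^{(\varepsilon)}))J = J\pi_\beta(\alpha_{t,0}(V^{(\varepsilon)}))J \in M_\beta',$$

we can write the multicommutator in (50) as

$$\big[\sigma_{t_n,0}(\pi_\beta(V^{(\varepsilon)}) - J\pi_\beta(V^{(\varepsilon)})J), \big[\cdots \big[\sigma_{t_1,0}(\pi_\beta(V^{(\varepsilon)}) - J\pi_\beta(V^{(\varepsilon)})J), \sigma_{t,0}(\pi_\beta(A))\big] \cdots \big]\big].$$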

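It follows that the r.h.s. of (50) defines a ∗-automorphism group of πβ(A), $\sigma^{(\varepsilon)}_{t,\lambda}$, which is implemented unitarily by

$$\sigma^{(\varepsilon)}_{t,\lambda}(\pi_\beta(A)) = \pi_\beta(\alpha^{(\varepsilon)}_{t,\lambda}(A)) = e^{itL^{(\varepsilon)}_\lambda}\, \pi_\beta(A)\, e^{-itL^{(\varepsilon)}_\lambda},$$

with

$$L^{(\varepsilon)}_\lambda = L_0 + \lambda\pi_\beta(V^{(\varepsilon)}) - \lambda J\pi_\beta(V^{(\varepsilon)})J.$$

It is not difficult to see (using Theorem 3.1) that the regularized Liouvillian $L^{(\varepsilon)}_\lambda$ is essentially selfadjoint on

$$D = C_0^\infty \otimes C_0^\infty \otimes (F(C_0^\infty(\mathbb{R} \times S^2)) \cap F_0) \subset H,$$

where F0 is the finite-particle subspace. Moreover, we have that $J e^{itL^{(\varepsilon)}_\lambda} = e^{itL^{(\varepsilon)}_\lambda} J$.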

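We now explain how to remove the regularization (ε → 0), obtaining a weak∗ continuous ∗-automorphism group σt,λ of the von Neumann algebra Mβ. We recall that a ∗-automorphism group τt on a von Neumann algebra M is called weak∗ continuous iff $t \mapsto \omega(\tau_t(A))$ is continuous, for all A ∈ M and for all normal states ω on M. From

$$\pi_\beta(V^{(\varepsilon)}) = \sum_\alpha G_\alpha \otimes 1_p \otimes \frac{1}{2i\varepsilon} \int_{\mathbb{R}} dt\, h_\varepsilon(t) \big\{ W(e^{iut}\varepsilon\tau_\beta(g_\alpha)) - W(e^{iut}\varepsilon\tau_\beta(g_\alpha))^* \big\},$$

$$J\pi_\beta(V^{(\varepsilon)})J = \sum_\alpha 1_p \otimes C_p G_\alpha C_p \otimes \frac{1}{2i\varepsilon} \int_{\mathbb{R}} dt\, h_\varepsilon(t) \big\{ W(e^{iut}\varepsilon e^{-\beta u/2}\tau_\beta(g_\alpha)) - W(e^{iut}\varepsilon e^{-\beta u/2}\tau_\beta(g_\alpha))^* \big\},$$

where we recall that $h_\varepsilon(t) = (1/\varepsilon)e^{-t^2/\varepsilon^2}$ approximates the Dirac delta distribution concentrated at zero, one verifies that, in the strong sense on D,

$$\lim_{\varepsilon \to 0} \pi_\beta(V^{(\varepsilon)}) = \sum_\alpha G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha)),$$

$$\lim_{\varepsilon \to 0} J\pi_\beta(V^{(\varepsilon)})J = \sum_\alpha 1_p \otimes C_p G_\alpha C_p \otimes \varphi(e^{-\beta u/2}\tau_\beta(g_\alpha)),$$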

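where the operator ϕ(f) has been defined in (45). The symmetric operator Lλ, defined on D by

$$L_\lambda = L_0 + \lambda I, \tag{53}$$

with

$$I = \sum_\alpha \big( G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha)) - 1_p \otimes C_p G_\alpha C_p \otimes \varphi(e^{-\beta u/2}\tau_\beta(g_\alpha)) \big), \tag{54}$$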

is essentially selfadjoint on D, for any real value of λ (this will be shown to be a consequence of Theorem 3.1). Using Theorem 5.1 on invariance of domains, the Duhamel formula gives

$$e^{itL^{(\varepsilon)}_\lambda} = e^{itL_\lambda} - i\lambda \int_0^t ds\, e^{isL_\lambda} \big( I - \pi_\beta(V^{(\varepsilon)}) + J\pi_\beta(V^{(\varepsilon)})J \big) e^{-i(s-t)L^{(\varepsilon)}_\lambda}$$

as operators defined on D, from which it follows that $e^{itL^{(\varepsilon)}_\lambda} \to e^{itL_\lambda}$, as ε → 0, in the strong sense on H. Consequently, for A ∈ πβ(A), we have $\sigma^{(\varepsilon)}_{t,\lambda}(A) \to \sigma_{t,\lambda}(A)$, in the σ-weak topology of B(H). Notice that for A ∈ πβ(A), we have σt,λ(A) ∈ Mβ, because $\sigma_{t,\lambda}(A) = \text{w-}\lim_{\varepsilon \to 0} \sigma^{(\varepsilon)}_{t,\lambda}(A)$, $\sigma^{(\varepsilon)}_{t,\lambda}(A) \in \pi_\beta(A) \subset M_\beta$, and Mβ is weakly closed. Clearly, σt,λ is a σ-weakly continuous ∗-automorphism group of B(H). If A ∈ Mβ, there is a net {Aα} ⊂ πβ(A), s.t. Aα → A, in the weak operator topology. Thus, since σt,λ is weakly continuous, we conclude that

$$\sigma_{t,\lambda}(A) = \text{w-}\lim_\alpha \sigma_{t,\lambda}(A_\alpha) \in M_\beta.$$

We summarize these considerations in a proposition.

PROPOSITION 2.1. (Mβ, σt,λ) is a W∗-dynamical system, i.e., σt,λ is a weak∗ continuous group of ∗-automorphisms of the von Neumann algebra Mβ. Moreover, σt,λ is unitarily implemented by $e^{itL_\lambda}$, where Lλ is given in (53), (54), and

$$J e^{itL_\lambda} = e^{itL_\lambda} J, \quad \text{for all } t \in \mathbb{R}.$$

2.1.5. Kernel of Lλ and Normal Invariant States

Let P be the natural cone associated with $(M_\beta, \Omega_{\rho_0})$, i.e., P is the norm closure of the set

$$\{ A J A \Omega_{\rho_0} \mid A \in M_\beta \} \subset H.$$

The data (Mβ, H, J, P) is called the standard form of the von Neumann algebra Mβ. We have constructed J and P explicitly, starting from the cyclic and separating vector $\Omega_{\rho_0}$. There is, however, a general theory of standard forms of von Neumann algebras; see [BRI, II, Ara, Con] for the case of σ-finite von Neumann algebras (as in our case), or [Haa] for the general case. Among the properties of standard forms, we mention here only the following:

(P) For every normal state ω on Mβ, there exists a unique ξ ∈ P, s.t. ω(A) = ⟨ξ, Aξ⟩, ∀A ∈ Mβ.

Recall that a state ω on Mβ ⊂ B(H) is called normal iff it is σ-weakly continuous, or, equivalently, iff it is given by a density matrix ρ ∈ l1(H), as ω(A) = tr ρA, for all A ∈ Mβ. The uniqueness of the representing vector in the natural cone, according to (P), allows us to establish the following connection between the kernel of Lλ and the normal invariant states (see also, e.g., [DJP]).

THEOREM 2.2. If Lλ does not have a zero eigenvalue, i.e., if ker Lλ = {0}, then there does not exist any σt,λ-invariant normal state on Mβ.

Proof. We show below that, for all t ∈ R,

$$e^{itL_\lambda} P = P. \tag{55}$$

If ω is a normal state on Mβ, invariant under σt,λ, i.e., such that ω ∘ σt,λ = ω, for all t ∈ R, then, for a unique ξ ∈ P,

$$\omega(A) = \langle \xi, A\xi \rangle = \omega(\sigma_{t,\lambda}(A)) = \langle e^{-itL_\lambda}\xi, A e^{-itL_\lambda}\xi \rangle.$$

Since (55) holds, and due to the uniqueness of the vector in P representing a given state, we conclude that $e^{itL_\lambda}\xi = \xi$, for all t ∈ R, i.e., Lλ has a zero eigenvalue with eigenvector ξ.

We now show (55). Notice that (55) is equivalent to $e^{itL_\lambda}P \subseteq P$. Since P is a closed set, it is enough to show that for all A ∈ Mβ, $e^{itL_\lambda} A J A \Omega_{\rho_0} \in P$. Since $e^{itL_\lambda}J = J e^{itL_\lambda}$, $e^{itL_\lambda} A e^{-itL_\lambda} \in M_\beta$, for all A ∈ Mβ, and BJBJP ⊂ P, for all B ∈ Mβ, we only need to prove that

$$e^{itL_\lambda}\Omega_{\rho_0} \in P. \tag{56}$$

The Trotter product formula gives

$$e^{itL_\lambda}\Omega_{\rho_0} = \lim_{n \to \infty} \big( e^{i\frac{t}{n}\lambda I} e^{i\frac{t}{n}L_0} \big)^n \Omega_{\rho_0},$$

and, since P is closed, (56) holds provided the general term under the limit is in P, for all n ≥ 1. We show that $e^{isL_0}P = P$ and $e^{is\lambda I}P = P$, for all s ∈ R. Remarking that

$$e^{isL_0}\Omega_{\rho_0} = (e^{isH_p} \otimes e^{-isH_p} \otimes e^{is\, d\Gamma(u)})\Omega_{\rho_0} = (e^{isH_p} \otimes 1_p)\, J\, (e^{isH_p} \otimes 1_p)\Omega_{\rho_0},$$

where we use that $J_p(e^{isH_p} \otimes 1_p)J_p = 1_p \otimes C_p e^{isH_p} C_p = 1_p \otimes e^{-isH_p}$, recalling that $e^{isL_0}$ implements σt,0, and arguing as above, we see that $e^{itL_0}P = P$.

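The Trotter product formula gives

$$\exp\Big\{ is \sum_{\alpha=1}^N \big( G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha)) - J\, G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha))\, J \big) \Big\}\xi$$

$$= \lim_{n_1 \to \infty} \Big\{ \Big( e^{i\frac{s}{n_1}G_1} \otimes 1_p \otimes W\big(\tfrac{s}{n_1}\tau_\beta(g_1)\big) \Big) \times J \Big( e^{i\frac{s}{n_1}G_1} \otimes 1_p \otimes W\big(\tfrac{s}{n_1}\tau_\beta(g_1)\big) \Big) J \times \exp\Big[ i\frac{s}{n_1} \sum_{\alpha=2}^N \big( G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha)) - J\, G_\alpha \otimes 1_p \otimes \varphi(\tau_\beta(g_\alpha))\, J \big) \Big] \Big\}^{n_1} \xi,$$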

for all ξ ∈ P, and we may apply Trotter's formula repeatedly to conclude that, since AJAJP ⊂ P, for A ∈ Mβ, and P is closed, we have that $e^{is\lambda I}P = P$, for all s ∈ R. □
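The Trotter product formula used twice in this proof can be illustrated in finite dimensions. The sketch below is only an illustration under stated assumptions: the 2 × 2 Hermitian matrices `A` and `B` are arbitrary stand-ins (Pauli matrices) for the two noncommuting generators, and the matrix exponential is computed by a truncated Taylor series, which is adequate for such small matrices.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(x, c):
    return [[c * v for v in row] for row in x]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def expm(x, terms=40):
    """Matrix exponential via its Taylor series (fine for small 2x2 inputs)."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = scale(matmul(term, x), 1.0 / n)
        result = add(result, term)
    return result


def matpow(x, n):
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for _ in range(n):
        result = matmul(result, x)
    return result


# Hypothetical noncommuting generators (Pauli sigma_x and sigma_z).
A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]


def trotter_error(n, t=1.0):
    """Max-entry distance between (e^{itA/n} e^{itB/n})^n and e^{it(A+B)}."""
    exact = expm(scale(add(A, B), 1j * t))
    step = matmul(expm(scale(A, 1j * t / n)), expm(scale(B, 1j * t / n)))
    approx = matpow(step, n)
    return max(abs(approx[i][j] - exact[i][j])
               for i in range(2) for j in range(2))
```

The error decreases roughly like 1/n, which is the finite-dimensional counterpart of the strong convergence invoked in the proof.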

Remark. The proof of Theorem 2.2 uses property (P), which is satisfied in our case, because $\Omega_{\rho_0}$ is cyclic and separating for Mβ. This, in turn, is true because ρ0 has been chosen to be strictly positive. One may start with any reference state of the form $\omega^p_\rho \otimes \omega^f_\beta$, where ρ is any density matrix on Hp; it may be of finite rank. The resulting von Neumann algebra (obtained as the weak closure of A when represented on the GNS Hilbert space corresponding to $(A, \omega^p_\rho \otimes \omega^f_\beta)$) is ∗-isomorphic to Mβ. This is the reason we have not added to Mβ an index for the density matrix ρ0. More specifically, the GNS representation of $(A, \omega^p_\rho \otimes \omega^f_\beta)$ is given by (H1, π1, Ω1), where

$$H_1 = H_p \otimes K_\rho \subseteq H_p \otimes H_p,$$

$$\pi_1(A \otimes (W(f))(h)) = A \otimes 1_p \otimes \int_{\mathbb{R}} dt\, h(t)\, W(e^{iut}\tau_\beta(f)),$$

$$\Omega_1 = \Omega_p^\rho \otimes \Omega.$$

Here, Kρ is the closure of Ran ρ, and $\Omega_p^\rho$ is given as in Equation (46). Consequently,

$$\pi_1(A)'' = B(H_p) \otimes 1_p\!\upharpoonright_{K_\rho} \otimes\, \pi^\beta_f(A_f)'' \cong M_\beta.$$

In particular, π1(A)′′ and Mβ have the same set of normal states. Thus, our particular choice for the reference state is immaterial when examining properties of normal states. One may express this in the following way: (Mβ, H, J, P) is a standard form for all the von Neumann algebras obtained from any reference state $(A, \omega^p_\rho \otimes \omega^f_\beta)$.

2.2. RESULT ON THERMAL IONIZATION

Our main result in this paper is that the W∗-dynamical system (Mβ, σt,λ) introduced above does not have any normal invariant states.

THEOREM 2.3. Assume conditions (A1)–(A3) hold. For any inverse temperature 0 < β < ∞ there is a constant, λ0(β) > 0, proportional to γ given in (39), such that the following holds. If 0 < |λ| < λ0 then the Liouvillian Lλ given in (53) and (54) does not have any eigenvalues.

Remark. Since γ decays exponentially in β, for large β, Theorem 2.3 is a high temperature result (β has to be small for reasonable values of the coupling constant λ). From physics it is clear that thermal ionization takes place for arbitrary positive temperatures (but not at zero temperature, where the coupled system has a ground state).

Combining Theorems 2.3 and 2.2 yields our main result about thermal ionization.

THEOREM 2.4 (Thermal ionization). Under the assumptions of Theorem 2.3, there do not exist any normal σt,λ-invariant states on Mβ.

Remark. For λ = 0, the state ω0, determined by the vector $\Omega_p^0 \otimes \Omega$, where $\Omega_p^0 = \varphi_0 \otimes \varphi_0 \in H_p \otimes H_p$, and ϕ0 is the eigenvector of Hp, is a normal σt,0-invariant state on Mβ. As we have explained in the introduction, the physical interpretation of Theorem 2.4 is that a single atom coupled to black-body radiation at a sufficiently high positive temperature will always end up being ionized.

The proof of Theorem 2.3 is based on a novel virial theorem.

3. Virial Theorems and the Positive Commutator Method

3.1. TWO ABSTRACT VIRIAL THEOREMS

Let H be a Hilbert space, D ⊂ H a core for a selfadjoint operator Y ≥ 1, and X a symmetric operator on D. We say the triple (X, Y, D) satisfies the GJN (Glimm–Jaffe–Nelson) condition, or that (X, Y, D) is a GJN-triple, if there is a constant k < ∞, s.t. for all ψ ∈ D:

$$\|X\psi\| \le k\|Y\psi\|, \tag{57}$$

$$\pm i\{\langle X\psi, Y\psi\rangle - \langle Y\psi, X\psi\rangle\} \le k\langle\psi, Y\psi\rangle. \tag{58}$$

Notice that if (X1, Y, D) and (X2, Y, D) are GJN triples, then so is (X1 + X2, Y, D). Since Y ≥ 1, inequality (57) is equivalent to

$$\|X\psi\| \le k_1\|Y\psi\| + k_2\|\psi\|,$$

for some k1, k2 < ∞.

THEOREM 3.1 (GJN commutator theorem). If (X, Y, D) satisfies the GJN condition, then X determines a selfadjoint operator (again denoted by X), s.t. D(X) ⊃ D(Y). Moreover, X is essentially selfadjoint on any core for Y, and (57) is valid for all ψ ∈ D(Y).

Based on the GJN commutator theorem, we next describe the setting for a general virial theorem. Suppose one is given a selfadjoint operator Λ ≥ 1 with core D ⊂ H, and operators L, A, N, D, Cn, n = 0, 1, 2, 3, all symmetric on D, and satisfying

$$\langle\varphi, D\psi\rangle = i\{\langle L\varphi, N\psi\rangle - \langle N\varphi, L\psi\rangle\}, \tag{59}$$

$$C_0 = L,$$

$$\langle\varphi, C_n\psi\rangle = i\{\langle C_{n-1}\varphi, A\psi\rangle - \langle A\varphi, C_{n-1}\psi\rangle\}, \quad n = 1, 2, 3, \tag{60}$$

where ϕ, ψ ∈ D. We assume that

• (X, Λ, D) satisfies the GJN condition, for X = L, N, D, Cn. Consequently, all these operators determine selfadjoint operators, which we denote by the same letters.

• A is selfadjoint, D ⊂ D(A), and $e^{itA}$ leaves D(Λ) invariant.

Remarks. (1) From the invariance condition $e^{itA}D(\Lambda) \subset D(\Lambda)$, it follows that for some 0 ≤ k, k′ < ∞, and all ψ ∈ D(Λ),

$$\|\Lambda e^{itA}\psi\| \le k e^{k'|t|}\|\Lambda\psi\|. \tag{61}$$

A proof of this can be found in [ABG], Propositions 3.2.2 and 3.2.5.

(2) Condition (57) is phrased equivalently as 'X ≤ kY, in the sense of Kato on D.'

(3) One can show that if (A, Λ, D) satisfies conditions (57), (58), then the above assumption on A holds; see Theorem 5.1.

THEOREM 3.2 (1st virial theorem). Assume that, in addition to (59), (60), we have, in the sense of Kato on D,

$$D \le kN^{1/2}, \tag{62}$$

$$e^{itA}C_1e^{-itA} \le k e^{k'|t|}N^p, \quad \text{some } 0 \le p < \infty, \tag{63}$$

$$e^{itA}C_3e^{-itA} \le k e^{k'|t|}N^{1/2}, \tag{64}$$

for some 0 ≤ k, k′ < ∞, and all t ∈ R. Let ψ be an eigenvector of L. Then there is a one-parameter family {ψα} ⊂ D(L) ∩ D(C1), s.t. ψα → ψ, α → 0, and

$$\lim_{\alpha \to 0} \langle\psi_\alpha, C_1\psi_\alpha\rangle = 0. \tag{65}$$

Remarks. (1) A sufficient condition for (63) to hold (with k′ = 0) is that N and $e^{itA}$ commute, for all t ∈ R, in the strong sense on D, and $C_1 \le kN^p$. This condition will always be satisfied in our applications. A similar remark applies to (64).

(2) In a heuristic way, we understand C1 as the commutator i[L, A] = i(LA − AL), and (65) as ⟨ψ, i[L, A]ψ⟩ = 0, which is a standard way of stating the virial theorem; see, e.g., [ABG] and [GG] for a comparison (and correction) of virial theorems encountered in the literature.

The result of the virial theorem is still valid if we add to the operator A a suitably small perturbation A0:

THEOREM 3.3 (2nd virial theorem). Suppose that we are in the situation of Theorem 3.2 and that A0 is a bounded operator on H s.t. Ran A0 ⊂ D(L) ∩ Ran P(N ≤ n0), for some n0 < ∞. Then i[L, A0] = i(LA0 − A0L) is well defined in the strong sense on D(L), and we have, for the same family of approximating eigenvectors as in Theorem 3.2:

$$\lim_{\alpha \to 0} \langle\psi_\alpha, (C_1 + i[L, A_0])\psi_\alpha\rangle = 0. \tag{66}$$

In conjunction with a positive commutator estimate, the virial theorem implies a certain regularity of eigenfunctions.

THEOREM 3.4 (Regularity of eigenfunctions). Suppose C is a symmetric operator on a domain D(C) s.t., in the sense of quadratic forms on D(C), we have that C ≥ P − B, where P ≥ 0 is a selfadjoint operator, and B is a bounded (everywhere defined) operator. Let ψα be a family of vectors in D(C), with ψα → ψ, as α → 0, and s.t.

$$\lim_{\alpha \to 0} \langle\psi_\alpha, C\psi_\alpha\rangle = 0. \tag{67}$$

Then ⟨ψ, Bψ⟩ ≥ 0, $\psi \in D(P^{1/2})$, and

$$\|P^{1/2}\psi\| \le \langle\psi, B\psi\rangle^{1/2}. \tag{68}$$

Remark. Theorem 3.4 can be viewed as a consequence of an abstract Fatou lemma; see [ABG], Proposition 2.1.1. We give a different, very short proof of (68) at the end of Section 6.

3.2. THE POSITIVE COMMUTATOR METHOD

This method gives a conceptually very easy proof of absence of point spectrum. The subtlety of the method lies in the technical details, since one deals with unbounded operators.

Suppose we are in the setting of the virial theorems described in Section 3.1, and that the operator C1 (or C1 + i[L, A0]) is strictly positive, i.e.,

$$C_1 \ge \gamma, \tag{69}$$

for some γ > 0. Inequality (69) and the virial theorem immediately show that L cannot have any eigenvalues. Indeed, assuming ψ is an eigenfunction of L, we reach the contradiction

$$0 = \lim_{\alpha \to 0} \langle\psi_\alpha, C_1\psi_\alpha\rangle \ge \gamma \lim_{\alpha \to 0} \langle\psi_\alpha, \psi_\alpha\rangle = \gamma\|\psi\|^2 > 0.$$

Although the global PC estimate (69) holds in our situation, often one manages to prove merely a localized version. Suppose g ∈ C∞(J) is a smooth function with support in an interval J ⊆ R, $g\!\upharpoonright_{J_1} = 1$, for some J1 ⊂ J, s.t. g(L) leaves the form domain of C1 invariant. The same reasoning as above shows that if

$$g(L)\, C_1\, g(L) \ge \gamma g^2(L),$$

for some γ > 0, then L has no eigenvalues in the interval J1. The use of PC estimates for spectral analysis of Schrödinger operators originated with Mourre [Mou], and had recent applications in [Ski, BFSS, DJ, Mer].
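A finite-dimensional toy computation shows both the heuristic form of the virial identity and why the method is inherently about unbounded operators. The sketch below is illustrative only: the 2 × 2 Hermitian matrices `L` and `A` are arbitrary choices, not operators from the model. For matrices, ⟨ψ, i[L, A]ψ⟩ vanishes on every eigenvector ψ of L, and i[L, A] has trace zero, so a bound of the form (69) can never hold in finite dimensions.

```python
import math

# Hypothetical selfadjoint matrices standing in for L and A.
L = [[2.0, 1.0], [1.0, 3.0]]
A = [[0.5, -1j], [1j, -0.25]]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator_form(l, a):
    """Return C1 = i(LA - AL), the matrix analogue of i[L, A]."""
    la, al = matmul(l, a), matmul(a, l)
    n = len(l)
    return [[1j * (la[i][j] - al[i][j]) for j in range(n)] for i in range(n)]


def expectation(psi, c):
    """Quadratic form <psi, C psi>, antilinear in the first argument."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * c[i][j] * psi[j]
               for i in range(n) for j in range(n))


def eigenpairs_of_L():
    """Eigenvalues (5 -/+ sqrt 5)/2 of L with eigenvectors (1, lam - 2)."""
    s = math.sqrt(5.0)
    return [(lam, [1.0, lam - 2.0])
            for lam in ((5.0 - s) / 2.0, (5.0 + s) / 2.0)]


C1 = commutator_form(L, A)
```

Evaluating `expectation(psi, C1)` on each eigenvector returned by `eigenpairs_of_L()` gives zero up to rounding, and the diagonal of `C1` sums to zero.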

4. Proof of Theorem 2.3

4.1. STRATEGY OF THE PROOF

As in [JP, Mer], the starting point in the construction of a positive commutator is the adjoint operator Af = dΓ(i∂u), the second quantized generator of translation in the radial variable of the glued Fock space F, see (41). We formally have

$$i[L_0, A_f] = d\Gamma(1_f) = N \ge 0.$$

The kernel of this form is the infinite-dimensional space $H_p \otimes H_p \otimes \operatorname{Ran} P_\Omega$. Following [Mer], one is led to try to add a suitable operator A0 to Af, where A0 depends on the interaction λI, and is designed in such a way that i[L0 + λI, Af + A0] is strictly positive (has trivial kernel). This method is applicable if the (imaginary part of the) so-called level shift operator is strictly positive, or equivalently, if (39) is satisfied, but where the finite-dimensional projection p0 is replaced by the infinite-dimensional projection 1p. Such a positivity condition does not hold for reasonable operators Gα and functions gα.

In order to be able to carry out our program, we add to Af a term Ap ⊗ 1p − 1p ⊗ Ap that reduces the kernel of the commutator. A prime candidate for Ap would be the operator i∂e acting on Hp (we write simply i∂e instead of 0 ⊕ i∂e, cf. (26)), since then

$$i[L_0, A_p \otimes 1_p - 1_p \otimes A_p + A_f] = P_+(H_p) \otimes 1_p + 1_p \otimes P_+(H_p) + N,$$

where $P_+(H_p) = \int^{\oplus}_{\mathbb{R}_+} de$ is the projection onto L2(R+, de; H). The above form now has a one-dimensional kernel, $\operatorname{Ran} p_0 \otimes p_0 \otimes P_\Omega$. By adding a suitable operator A0, as described above, one can obtain a lower bound on the commutator (and in particular, reduce its kernel to {0}), provided (39) is satisfied.

However, the operator Ap chosen above has the inconvenience of not being selfadjoint, while our virial theorems require selfadjointness. We introduce a family of selfadjoint operators $A^a_p$, a > 0, that approximate i∂e in a certain sense (a → 0). The idea of approximating a nonselfadjoint A by a selfadjoint sequence was also used in [Ski]. We now define $A^a$ and then explain, in the remainder of this subsection, how to prove Theorem 2.3.

We define $A^a_p$ as the generator of a unitary group on L2(R+, de; H), which is induced by a flow on R+. For the proof of the following proposition, and more information on unitary groups induced by flows, we refer to Section 7.

PROPOSITION 4.1. Let ξ: R+ → R+ be a bounded, smooth vector field, s.t. ξ(0) = 0, ξ(e) → 1, as e → ∞, and $\|(1+e)\xi'\|_\infty < \infty$. Then ξ generates a global flow, and this flow induces a continuous unitary group on L2(R+, de; H). The generator Ap of this group is essentially selfadjoint on $C_0^\infty$, and it acts on $C_0^\infty$ as

$$A_p = i\big( \tfrac{1}{2}\xi'(e) + \xi(e)\partial_e \big), \tag{70}$$

where ξ′(e) and ξ(e) are multiplication operators. Given a > 0, ξa(e) = ξ(e/a) is a vector field on R+, and $\lim_{a \to 0} \xi_a = 1$, pointwise (except at zero). The generator $A^a_p$ of the unitary group induced by ξa is given on its core, $C_0^\infty$, by

$$A^a_p = i\Big( \frac{1}{2}\,\frac{1}{a}\,\xi'\Big(\frac{e}{a}\Big) + \xi\Big(\frac{e}{a}\Big)\partial_e \Big). \tag{71}$$
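To make the flow in Proposition 4.1 concrete, here is a small numerical sketch. The vector field ξ(e) = e/(1 + e) is a hypothetical choice made for illustration (the text does not fix a specific ξ); it satisfies ξ(0) = 0, ξ(e) → 1 and (1 + e)ξ′(e) = 1/(1 + e) ≤ 1. For this choice, log e + e − t is constant along trajectories, which gives an independent check of the integrator.

```python
import math


def xi(e):
    """Hypothetical vector field: xi(0) = 0 and xi(e) -> 1 as e -> infinity."""
    return e / (1.0 + e)


def flow(e0, t, steps=4000):
    """Approximate the time-t flow of de/dt = xi(e) with classical RK4."""
    h = t / steps
    e = e0
    for _ in range(steps):
        k1 = xi(e)
        k2 = xi(e + 0.5 * h * k1)
        k3 = xi(e + 0.5 * h * k2)
        k4 = xi(e + h * k3)
        e += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e


def invariant(e, t):
    """For this xi, log(e) + e - t is conserved along the flow (e > 0)."""
    return math.log(e) + e - t
```

The point e = 0 is fixed by the flow, and positive points stay positive for all positive and negative times, so the flow maps R+ to itself as required for the induced unitary group.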

We define the selfadjoint operator

$$A^a = A^a_p \otimes 1_p \otimes 1_f - 1_p \otimes A^a_p \otimes 1_f + A_f, \tag{72}$$

and calculate the commutator $C^a_1$ of iL with $A^a$ (in the sense given in (60), see also Subsection 4.2):

$$C^a_1 = \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de \otimes 1_p + 1_p \otimes \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de + N + \lambda I^a_1, \tag{73}$$

where $I^a_1$ is $N^{1/2}$-bounded. In Section 4.3, we show that $C^a_1 + i[L, A_0] \ge M_a$, where Ma is a bounded operator. We will see that $\text{s-}\lim_{a \to 0^+} M_a = M$ (see Proposition 4.6), where M is a bounded, strictly positive operator (see Proposition 4.8). Since Ma, M are bounded, we obtain from the virial theorem

$$0 = \lim_{\alpha \to 0} \langle\psi_\alpha, (C^a_1 + i[L, A_0])\psi_\alpha\rangle \ge \langle\psi, (M_a - M)\psi\rangle + \langle\psi, M\psi\rangle, \tag{74}$$

for any eigenfunction ψ of L. Taking a → 0+ and using strict positivity of M (for small, but nonzero λ, see Proposition 4.8), gives a contradiction, and this will prove Theorem 2.3.

4.2. CONCRETE SETTING FOR THE VIRIAL THEOREMS

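The Hilbert space is the GNS representation space (40), and we set

$$D = C_0^\infty \otimes C_0^\infty \otimes D_f, \tag{75}$$

where

$$D_f = F(C_0^\infty(\mathbb{R} \times S^2)) \cap F_0,$$

and F0 denotes the finite-particle subspace of Fock space. The operator Λ is given by

$$\Lambda = \Lambda_p \otimes 1_p + 1_p \otimes \Lambda_p + \Lambda_f, \tag{76}$$

$$\Lambda_p = \int^{\oplus}_{\mathbb{R}_+} e\, de + 1_p = H_p P_+(H_p) + 1_p, \tag{77}$$

$$\Lambda_f = d\Gamma(u^2 + 1) + 1_f. \tag{78}$$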

In (77), we have introduced P+(Hp), the projection onto the spectral interval R+ of Hp. It is clear that Λ is essentially selfadjoint on D, and Λ ≥ 1. The operator L is the interacting Liouvillian (53), and

$$N = d\Gamma(1) \tag{79}$$

is the particle number operator in F ≡ F(L2(R × S2)). Clearly, X = L, N are symmetric operators on D, and the symmetric operator D on D (see (59)) is given by

$$D = \frac{i\lambda}{\sqrt{2}} \sum_\alpha \big\{ G_\alpha \otimes 1_p \otimes \big( -a^*(\tau_\beta(g_\alpha)) + a(\tau_\beta(g_\alpha)) \big) - 1_p \otimes C_p G_\alpha C_p \otimes \big( -a^*(e^{-\beta u/2}\tau_\beta(g_\alpha)) + a(e^{-\beta u/2}\tau_\beta(g_\alpha)) \big) \big\}. \tag{80}$$

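The operator A is given by $A^a$ defined in (72). Notice that $A^a_p$ leaves $C_0^\infty$ invariant, Af leaves Df invariant, so $A^a$ maps D into D(L). Furthermore, it is easy to see that L maps D into $D(A^a)$, hence the commutator of L with $A^a$ is well defined in the strong sense on D. The same is true for the multiple commutators of L with $A^a$. Setting $\xi'_a(e) = \xi'(e/a)$, $\xi''_a(e) = \xi''(e/a)$, we obtain

$$C^a_1 = \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de \otimes 1_p + 1_p \otimes \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de + N + \lambda I^a_1, \tag{81}$$

$$C^a_2 = \frac{1}{a} \int^{\oplus}_{\mathbb{R}_+} \xi'_a(e)\xi_a(e)\, de \otimes 1_p - 1_p \otimes \frac{1}{a} \int^{\oplus}_{\mathbb{R}_+} \xi'_a(e)\xi_a(e)\, de + \lambda I^a_2, \tag{82}$$

$$C^a_3 = \frac{1}{a^2} \int^{\oplus}_{\mathbb{R}_+} \big( \xi''_a(e)\xi_a(e)^2 + \xi'_a(e)^2\xi_a(e) \big)\, de \otimes 1_p + 1_p \otimes \frac{1}{a^2} \int^{\oplus}_{\mathbb{R}_+} \big( \xi''_a(e)\xi_a(e)^2 + \xi'_a(e)^2\xi_a(e) \big)\, de + \lambda I^a_3, \tag{83}$$

where

$$I^a_n = i^n \sum_{j=0}^n \binom{n}{j} \sum_\alpha \big\{ \mathrm{ad}^{(j)}_{A^a_p}(G_\alpha) \otimes 1_p \otimes \mathrm{ad}^{(n-j)}_{A_f}(\varphi(\tau_\beta(g_\alpha))) + (-1)^j\, 1_p \otimes \mathrm{ad}^{(j)}_{A^a_p}(C_p G_\alpha C_p) \otimes \mathrm{ad}^{(n-j)}_{A_f}(\varphi(e^{-\beta u/2}\tau_\beta(g_\alpha))) \big\}, \tag{84}$$

for n = 1, 2, 3.

We define the bounded selfadjoint operator A0 on H by

$$A_0 = i\theta\lambda \big( \Pi I R_\varepsilon^2 \bar\Pi - \bar\Pi R_\varepsilon^2 I \Pi \big), \tag{85}$$

with

$$R_\varepsilon^2 = (L_0^2 + \varepsilon^2)^{-1}. \tag{86}$$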

Here, θ and ε are positive parameters, and Π is the projection onto the zero eigenspace of L0:

$$\Pi = P_0 \otimes P_\Omega, \tag{87}$$

$$P_0 = p_0 \otimes p_0, \tag{88}$$

$$\bar\Pi = 1 - \Pi, \tag{89}$$

where p0 is the projection in B(Hp) projecting onto the eigenspace corresponding to the eigenvalue E of Hp, i.e., p0ψ = ψ(E) ∈ C, and $P_\Omega$ is the projection in B(F) projecting onto CΩ. We also introduce the notation

$$\bar R_\varepsilon = \bar\Pi R_\varepsilon.$$

Notice that the operator A0 satisfies the conditions given in Theorem 3.3 with n0 = 1. Moreover, [L, A0] = LA0 − A0L extends to a bounded operator on the entire Hilbert space, and

$$\|[L, A_0]\| \le k \Big( \frac{\theta\lambda}{\varepsilon} + \frac{\theta\lambda^2}{\varepsilon^2} \Big). \tag{90}$$

This choice for the operator A0 was initially introduced in [BFSS] for the spectral analysis of Pauli–Fierz Hamiltonians (zero temperature systems), and was adopted in [Mer] to show return to equilibrium (positive temperature systems). The key feature of A0 is that $i\Pi[L, A_0]\Pi = 2\theta\lambda^2\, \Pi I \bar R_\varepsilon^2 I \Pi$ is a nonnegative operator. Assuming the Fermi Golden Rule condition (39), it is a strictly positive operator, as the following proposition shows.

PROPOSITION 4.2. Assume condition (A3). For 0 < ε < ε0, we have

$$\Pi I \bar R_\varepsilon^2 I \Pi \ge \frac{\gamma}{\varepsilon}\Pi. \tag{91}$$

The proof is given in Section 8.

We are now ready to verify that the virial theorems are applicable.

PROPOSITION 4.3. The unitary group $e^{itA^a}$ leaves D(Λ) invariant (a > 0, t ∈ R), and, for ψ ∈ D(Λ),

$$\|\Lambda e^{itA^a}\psi\| \le k e^{k'|t|/a}\|\Lambda\psi\|, \tag{92}$$

where k, k′ < ∞ are independent of a.

The proof is given in Section 8.

Next, we verify the GJN conditions, and the bounds (62), (64), (63). The following result is useful.

PROPOSITION 4.4. Under conditions (35), (36), the multiple commutators of Gα with $A^a_p$ are well defined in the strong sense on $C_0^\infty$, and, for any $\psi \in C_0^\infty$, we have that

$$\|\mathrm{ad}^{(n)}_{A^a_p}(G_\alpha)\psi\| \le k\|\psi\|, \tag{93}$$

for n = 1, 2, 3, and uniformly in a > 0.

The proofs of this and the next proposition are given in Section 8.

PROPOSITION 4.5. The virial theorems, Theorems 3.2 and 3.3, apply in the concrete situation described above, with the following identifications: the domain D of Section 3.1 is given in (75), the operators L, N, D, Λ, A0 appearing in Theorems 3.2, 3.3 are chosen in (53), (79), (80), (76), (85), and the operator A is given by $A^a$ in (72).

4.3. A LOWER BOUND ON $C^a_1 + i[L, A_0]$ UNIFORM IN a

In order to estimate $C^a_1 + i[L, A_0]$ from below, we start with the following observation: in the sense of forms on D,

$$\pm\lambda I^a_1 \le \tfrac{1}{10} N \bar P_\Omega + k\lambda^2, \tag{94}$$

for some k independent of a > 0. This estimate follows in a standard way from the explicit expression for $I^a_1$, Equation (84), and the bound in (93). We conclude from (94), (81) that

$$C^a_1 + i[L, A_0] \ge M_a, \tag{95}$$

where

$$M_a = \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de \otimes 1_p + 1_p \otimes \int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de + \tfrac{9}{10}\bar P_\Omega - k\lambda^2 + i[L, A_0]. \tag{96}$$

The constant k on the r.h.s. is independent of a. Recalling that ξa → 1 a.e., we are led to define the bounded limiting operator

$$M = P_+(H_p) \otimes 1_p + 1_p \otimes P_+(H_p) + \tfrac{9}{10}\bar P_\Omega - k\lambda^2 + i[L, A_0], \tag{97}$$

where k is the same constant as in (96). Using dominated convergence, one readily verifies that $\int^{\oplus}_{\mathbb{R}_+} \xi_a(e)\, de \to P_+(H_p)$, in the strong sense on Hp.

PROPOSITION 4.6. $\lim_{a \to 0^+} M_a = M$, strongly on H.

Our next task is to show that M is strictly positive.

4.4. THE FESHBACH METHOD AND STRICT POSITIVITY OF M

Recall that $\Pi = P_0 \otimes P_\Omega$ is the rank-one projection onto the zero eigenspace of L0, see (87). We apply the Feshbach method to analyze the operator M, with the decomposition

$$H = \operatorname{Ran}\Pi \oplus \operatorname{Ran}\bar\Pi.$$

First, we note that

$$\bar\Pi M \bar\Pi \ge \bar P_0 \otimes P_\Omega \big( P_+(H_p) \otimes 1_p + 1_p \otimes P_+(H_p) - k\lambda^2 \big) + \big( \tfrac{9}{10} - k\lambda^2 \big)\bar P_\Omega + i\bar\Pi[L, A_0]\bar\Pi. \tag{98}$$

Recalling the definitions of P0 and A0, (88) and (85), one easily sees that

$$\bar P_0 \big( P_+(H_p) \otimes 1_p + 1_p \otimes P_+(H_p) \big) \ge \bar P_0,$$

$$i\bar\Pi[L, A_0]\bar\Pi = -\theta\lambda^2 \big( \bar\Pi I \Pi I \bar R_\varepsilon^2 + \bar R_\varepsilon^2 I \Pi I \bar\Pi \big),$$

in particular, $\|i\bar\Pi[L, A_0]\bar\Pi\| \le k\theta\lambda^2/\varepsilon^2$. Together with (98), this shows that there is a constant λ1 > 0 (independent of λ, θ, ε and of β ≥ β0, for any β0 > 0 fixed), s.t.

$$\bar M := \bar\Pi M \bar\Pi\!\upharpoonright_{\operatorname{Ran}\bar\Pi} > \tfrac{1}{2}\bar\Pi, \tag{99}$$

provided

$$|\lambda|,\ \frac{\theta\lambda^2}{\varepsilon^2} < \lambda_1. \tag{100}$$

It follows from Equation (99) that the resolvent set of $\bar M$, $\rho(\bar M)$, contains the interval (−∞, 1/2), and for m < 1/2:

$$\|(\bar M - m\bar\Pi)^{-1}\| < (\tfrac{1}{2} - m)^{-1}. \tag{101}$$

For $m \in \rho(\bar M)$, we define the Feshbach map $F_{\Pi,m}$ applied to M by

$$F_{\Pi,m}(M) = \Pi \big( M - M\bar\Pi(\bar M - m\bar\Pi)^{-1}\bar\Pi M \big) \Pi. \tag{102}$$

The operator $F_{\Pi,m}(M)$ acts on the space Ran Π. In our specific case, Ran Π ≅ C, hence $F_{\Pi,m}(M)$ is a number. (If Ran Π had dimension n, then $F_{\Pi,m}(M)$ would be represented by an n × n matrix.) The following crucial property is called the isospectrality of the Feshbach map (see, e.g., [BFS, DJ]):

$$m \in \rho(\bar M) \cap \sigma(M) \iff m \in \rho(\bar M) \cap \sigma(F_{\Pi,m}(M)), \tag{103}$$

where σ(·) denotes the spectrum. Hence by examining the spectrum of the operator $F_{\Pi,m}(M)$, one obtains information about the spectrum of M. The idea is, of course, that it is easier to examine the former operator, since it acts on a smaller space.
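The isospectrality property (103) is easy to verify by hand for a 2 × 2 matrix. The sketch below is an illustration under stated assumptions: the real symmetric matrix with entries `a`, `b`, `d` is a hypothetical stand-in for M, and Π is taken to be the projection onto the first coordinate, so that the complementary block is the number `d` and the Feshbach map reduces to a scalar Schur complement.

```python
import math

# Hypothetical 2x2 symmetric matrix M = [[a, b], [b, d]].
a, b, d = 1.0, 0.3, 2.0


def feshbach(m):
    """F_{Pi,m}(M) = a - b (d - m)^{-1} b, defined for m in the resolvent set of d."""
    if m == d:
        raise ValueError("m must lie in the resolvent set of the complementary block")
    return a - b * b / (d - m)


def lowest_eigenvalue():
    """Smaller eigenvalue of M from the closed-form 2x2 formula."""
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)


def fixed_point(m=0.0, iterations=60):
    """Solve m = F_{Pi,m}(M) by iteration (a contraction for m well below d)."""
    for _ in range(iterations):
        m = feshbach(m)
    return m
```

Here m is an eigenvalue of the full matrix exactly when m is an "eigenvalue" of the number `feshbach(m)`, i.e., when `feshbach(m) == m`, which is the scalar form of (103).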

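PROPOSITION 4.7. Assume condition (A3) and let 0 < ε < ε0. Then

$$F_{\Pi,m}(M) \ge \frac{2\theta\lambda^2}{\varepsilon}\gamma \Big( 1 - k\theta\Big(1 + \frac{|\lambda|}{\varepsilon}\Big)^2 - \frac{k\varepsilon}{\gamma\theta} \Big)\Pi, \tag{104}$$

uniformly in m < 1/4.

Proof. Recall the structure of $F_{\Pi,m}(M)$, given in (102). We show that $-\Pi M\bar\Pi(\bar M - m\bar\Pi)^{-1}\bar\Pi M\Pi$ is small, as compared to ΠMΠ, and that the latter is strictly positive. Estimate (101) gives

$$-\Pi M\bar\Pi(\bar M - m\bar\Pi)^{-1}\bar\Pi M\Pi \ge -4\,\Pi M\bar\Pi M\Pi, \tag{105}$$

for m < 1/4. An easy calculation shows that

$$\bar\Pi M\Pi = \bar\Pi\, i[L, A_0]\,\Pi = \theta\lambda\,\bar\Pi L \bar R_\varepsilon^2 I\Pi = \theta\lambda\,\bar\Pi \big( L_0\bar R_\varepsilon^2 I + \lambda I\bar R_\varepsilon^2 I \big)\Pi,$$

and using that $\|L_0\bar R_\varepsilon\| \le 1$, $\|\bar R_\varepsilon\| \le 1/\varepsilon$, we obtain the bound

$$\|\bar\Pi M\Pi\psi\| \le \Big( \theta|\lambda| + \frac{k\theta\lambda^2}{\varepsilon} \Big)\|\bar R_\varepsilon I\Pi\psi\|, \tag{106}$$

for any ψ ∈ H, where we have used that $\operatorname{Ran}\bar R_\varepsilon^2 I\Pi \subset \operatorname{Ran} P(N \le 1)$, and $\|I P(N \le 1)\| \le k$. Combining (106) with (105) yields

$$-\Pi M\bar\Pi(\bar M - m\bar\Pi)^{-1}\bar\Pi M\Pi \ge -k\theta^2\lambda^2(1 + |\lambda|/\varepsilon)^2\, \Pi I\bar R_\varepsilon^2 I\Pi.$$

Furthermore, we have that

$$\Pi M\Pi = \Pi\, i[L, A_0]\,\Pi - k\lambda^2\Pi = 2\theta\lambda^2\, \Pi I\bar R_\varepsilon^2 I\Pi - k\lambda^2\Pi.$$

These observations and the definition of the Feshbach map, (102), show that

$$F_{\Pi,m}(M) \ge 2\theta\lambda^2 \Big( 1 - k\theta\Big(1 + \frac{|\lambda|}{\varepsilon}\Big)^2 \Big)\Pi I\bar R_\varepsilon^2 I\Pi - k\lambda^2\Pi,$$

which, by Proposition 4.2, yields (104). □

Estimate (104) tells us that there is a λ2 > 0 s.t.

$$F_{\Pi,m}(M) \ge \frac{\theta\lambda^2}{\varepsilon}\gamma\,\Pi, \tag{107}$$

provided conditions (100) hold, and

$$\theta\Big(1 + \frac{|\lambda|}{\varepsilon}\Big)^2 + \frac{\varepsilon}{\gamma\theta} < \lambda_2, \quad 0 < \varepsilon < \varepsilon_0. \tag{108}$$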

Notice that all these estimates are independent of $m < 1/4$. Using the isospectrality property of the Feshbach map, (103), we conclude that if the bounds (100) and (108) are imposed on the parameters, and if $m < 1/4$ and $m \in \sigma(M)$, then $m > \frac{\theta\lambda^2}{\varepsilon}\gamma$. Consequently,

\[ M \geq \min\Big\{ \frac14,\ \frac{\theta\lambda^2}{\varepsilon}\gamma \Big\} = \frac{\theta\lambda^2}{\varepsilon}\gamma. \]

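Fix a $\theta < \lambda_2/4$ and an $\varepsilon < \min\{\varepsilon_0, \gamma\theta\lambda_2\}$. Then, defining

\[ \lambda_0 = \min\Big\{ \lambda_1,\ \frac{\varepsilon\sqrt{\lambda_1}}{\sqrt{\theta}},\ \varepsilon \Big\}, \]

(100) and (108) are satisfied for $|\lambda| < \lambda_0$.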

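PROPOSITION 4.8. There is a choice of the parameters $\theta$ and $\varepsilon$, and of $\lambda_0 > 0$ (depending on $\theta$, $\varepsilon$, $\beta$) s.t. if $|\lambda| < \lambda_0$ then

\[ M > \frac{\theta\lambda^2}{\varepsilon}\gamma. \tag{109} \]

We have $\lambda_0 \leq k\gamma$ for some $k$ independent of $\beta \geq \beta_0$ (for any $\beta_0 > 0$ fixed), i.e., $\lambda_0 \sim e^{\beta E}$ is exponentially small in $\beta$, as $\beta \to \infty$ (see Remark (1) after (39)).

Proposition 4.8 completes the proof of Theorem 2.3, according to the argument given in (74).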

5. Some Functional Analysis

The following two theorems are useful in our analysis. Their proofs can be found in [Frö].

THEOREM 5.1 (Invariance of domain, [Frö]). Suppose $(X, Y, \mathcal{D})$ satisfies the GJN condition, (57), (58). Then the unitary group, $e^{itX}$, generated by the selfadjoint operator $X$ leaves $\mathcal{D}(Y)$ invariant, and

\[ \|Y e^{itX}\psi\| \leq e^{k|t|}\|Y\psi\|, \tag{110} \]

for some $k \geq 0$, and all $\psi \in \mathcal{D}(Y)$.

THEOREM 5.2 (Commutator expansion, [Frö]). Suppose $\mathcal{D}$ is a core for the selfadjoint operator $Y \geq 1$. Let $X$, $Z$, $\mathrm{ad}_X^{(n)}(Z)$ be symmetric operators on $\mathcal{D}$, where

\[ \mathrm{ad}_X^{(0)}(Z) = Z, \]
\[ \langle\psi, \mathrm{ad}_X^{(n)}(Z)\psi\rangle = i\big\{\langle \mathrm{ad}_X^{(n-1)}(Z)\psi, X\psi\rangle - \langle X\psi, \mathrm{ad}_X^{(n-1)}(Z)\psi\rangle\big\}, \]

for all $\psi \in \mathcal{D}$, $n = 1, \dots, M$. We suppose that the triples $(\mathrm{ad}_X^{(n)}(Z), Y, \mathcal{D})$, $n = 0, 1, \dots, M$, satisfy the GJN condition (57), (58), and that $X$ is selfadjoint, with $\mathcal{D} \subset \mathcal{D}(X)$, $e^{itX}$ leaves $\mathcal{D}(Y)$ invariant, and (110) holds. Then

\[ e^{itX} Z e^{-itX} = Z - \sum_{n=1}^{M-1} \frac{t^n}{n!}\,\mathrm{ad}_X^{(n)}(Z) - \int_0^t dt_1 \cdots \int_0^{t_{M-1}} dt_M\, e^{it_M X}\,\mathrm{ad}_X^{(M)}(Z)\, e^{-it_M X}, \tag{111} \]

as operators on $\mathcal{D}(Y)$.

Remark. This theorem is proved in [Frö], under the assumption that $(X, Y, \mathcal{D})$ satisfies (57), (58). However, [Frö]'s proof only requires the properties of the group $e^{itX}$ indicated in our Theorem 5.2.

An easy, but useful result follows from (110).

PROPOSITION 5.3. Suppose that the unitary group $e^{itX}$ leaves $\mathcal{D}(Y)$ invariant, for some operator $Y$, and that estimate (110) holds. For a function $\chi$ on $\mathbb{R}$ with Fourier transform $\hat\chi \in L^1(\mathbb{R})$, we define $\chi(X) = \int_{\mathbb{R}} \hat\chi(s) e^{isX}\,ds$. If $\hat\chi$ has compact support, then $\chi(X)$ leaves $\mathcal{D}(Y)$ invariant, and, for $\psi \in \mathcal{D}(Y)$,

\[ \|Y\chi(X)\psi\| \leq e^{kR}\|\hat\chi\|_{L^1(\mathbb{R})}\|Y\psi\|, \tag{112} \]

for any $R$ s.t. $\operatorname{supp}\hat\chi \subset [-R, R]$.

The proof is obvious. Proposition 5.4 states a similar result, but for a function whose Fourier transform is not necessarily of compact support.

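PROPOSITION 5.4. Suppose $(X, Y, \mathcal{D})$ satisfies the GJN condition, and so do the triples $(\mathrm{ad}_X^{(n)}(Y), Y, \mathcal{D})$, for $n = 1, \dots, M$, and for some $M \geq 1$. Moreover, assume that, in the sense of Kato on $\mathcal{D}(Y)$, $\pm\mathrm{ad}_X^{(M)}(Y) \leq kX$, for some $k \geq 0$. For $\chi \in C_0^\infty(\mathbb{R})$, a smooth function with compact support, define $\chi(X) = \int \hat\chi(s) e^{isX}$, where $\hat\chi$ is the Fourier transform of $\chi$. Then $\chi(X)$ leaves $\mathcal{D}(Y)$ invariant.

Proof. For $R > 0$, set $\chi_R(X) = \int_{-R}^{R} \hat\chi(s) e^{isX}$, then $\chi_R(X) \to \chi(X)$ in operator norm, as $R \to \infty$. From the invariance of domain theorem, we see that $\chi_R(X)$ leaves $\mathcal{D}(Y)$ invariant. Let $\psi \in \mathcal{D}(Y)$, then using the commutator expansion theorem above, we have

\begin{align*}
Y\chi_R(X)\psi &= \chi_R(X)Y\psi + \int_{-R}^{R} \hat\chi(s)\, e^{isX}\big(e^{-isX} Y e^{isX} - Y\big)\psi \\
&= \chi_R(X)Y\psi - \int_{-R}^{R} \hat\chi(s)\, e^{isX}\Big( \sum_{n=1}^{M-1} \frac{(-s)^n}{n!}\,\mathrm{ad}_X^{(n)}(Y) \\
&\qquad + (-1)^M \int_0^s ds_1 \cdots \int_0^{s_{M-1}} ds_M\, e^{-is_M X}\,\mathrm{ad}_X^{(M)}(Y)\, e^{is_M X} \Big)\psi. \tag{113}
\end{align*}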


The integrand of the $s$-integration in (113) is bounded in norm by

\[ k(|s|^M + 1)(\|Y\psi\| + \|X\psi\|) \leq k(|s|^M + 1)\|Y\psi\|, \]

where we have used that $\|\mathrm{ad}_X^{(M)}(Y) e^{is_M X}\psi\| \leq \|X e^{is_M X}\psi\| \leq \|X\psi\|$. Since $\hat\chi$ is of rapid decrease, it can be integrated against any power of $|s|$, and we conclude that the r.h.s. of (113) has a limit as $R \to \infty$. Since $Y$ is a closed operator, it follows that $\chi(X)\psi \in \mathcal{D}(Y)$. □

PROPOSITION 5.5. Let $\chi \in C_0^\infty(\mathbb{R})$, $\chi = F^2 \geq 0$. Suppose $(X, Y, \mathcal{D})$ satisfies the GJN condition. Suppose $F(X)$ leaves $\mathcal{D}(Y)$ invariant. Let $Z$ be a symmetric operator on $\mathcal{D}$ s.t., for some $M \geq 1$, and $n = 0, 1, \dots, M$, the triples $(\mathrm{ad}_X^{(n)}(Z), Y, \mathcal{D})$ satisfy the GJN condition. Moreover, we assume that the multiple commutators, for $n = 1, \dots, M$, are relatively $X^{2p}$-bounded in the sense of Kato on $\mathcal{D}$, for some $p \geq 0$. In other words, there is some $k < \infty$, s.t. $\forall \psi \in \mathcal{D}$,

\[ \|\mathrm{ad}_X^{(n)}(Z)\psi\| \leq k(\|\psi\| + \|X^{2p}\psi\|), \qquad n = 1, \dots, M. \]

Then the commutator $[\chi(X), Z] = \chi(X)Z - Z\chi(X)$ is well defined on $\mathcal{D}$ and extends to a bounded operator.

Proof. We write $F$, $\chi$ instead of $F(X)$, $\chi(X)$. Since $F$ leaves $\mathcal{D}(Y)$ invariant, we have that

\[ [\chi, Z] = F[F, Z] + [F, Z]F, \]

as operators on $\mathcal{D}(Y)$. We expand the commutator

\begin{align*}
[F, Z] &= \int \hat F(s)\, e^{isX}\big(Z - e^{-isX} Z e^{isX}\big) \\
&= \int \hat F(s)\, e^{isX}\Big\{ \sum_{n=1}^{M-1} \frac{s^n}{n!}\,\mathrm{ad}_X^{(n)}(Z) + \int_0^s ds_1 \cdots \int_0^{s_{M-1}} ds_M\, e^{-is_M X}\,\mathrm{ad}_X^{(M)}(Z)\, e^{is_M X} \Big\}. \tag{114}
\end{align*}

Multiplying this equation from the right with $F$ (and noticing that $F$ commutes with $e^{is_M X}$), we see immediately that $[F, Z]F$ is bounded, and hence $F[F, Z] = -([F, Z]F)^*$ is bounded, too. □

PROPOSITION 5.6. Suppose $(X, Y, \mathcal{D})$ is a GJN triple. Then the resolvent $(X - z)^{-1}$ leaves $\mathcal{D}(Y)$ invariant, for all $z \in \mathbb{C}$ with $|\operatorname{Im} z| > k$, for some $k > 0$.

Proof. Suppose $\operatorname{Im} z < 0$ (the case $\operatorname{Im} z > 0$ is dealt with similarly). We write the resolvent as

\[ (X - z)^{-1} = -i\int_0^\infty dt\, e^{i(X - z)t}, \]

and it follows from Theorem 5.1 that for $\psi \in \mathcal{D}(Y)$,

\[ \|Y(X - z)^{-1}\psi\| \leq \|Y\psi\| \int_0^\infty dt\, e^{(\operatorname{Im} z + k)t} < \infty, \]

provided $\operatorname{Im} z < -k$. □
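The integral representation of the resolvent used in this proof can be checked numerically in the scalar case. The sketch below replaces the operator $X$ by a real number $x$ (an assumption made purely for illustration) and compares $-i\int_0^\infty e^{i(x-z)t}\,dt$, truncated at a finite time and evaluated by the trapezoid rule, with $1/(x - z)$ for $\operatorname{Im} z < 0$. The function name, the sample values and the truncation parameters are hypothetical.

```python
import cmath

def resolvent_via_integral(x, z, t_max=30.0, steps=60000):
    """Approximate -i * integral_0^t_max exp(i (x - z) t) dt by the trapezoid rule.
    For Im z < 0 the integrand decays like exp(Im(z) * t)."""
    h = t_max / steps
    a = 1j * (x - z)
    total = 0.5 * (1.0 + cmath.exp(a * t_max))  # endpoint contributions
    for j in range(1, steps):
        total += cmath.exp(a * j * h)
    return -1j * h * total

x = 1.0          # scalar stand-in for a spectral value of X
z = 0.5 - 2.0j   # Im z < 0, so the integral converges
approx = resolvent_via_integral(x, z)
exact = 1.0 / (x - z)
error = abs(approx - exact)
```

The exponential decay rate of the integrand is $|\operatorname{Im} z|$, which is why the operator version of the estimate needs $\operatorname{Im} z < -k$ to absorb the growth $e^{kt}$ from (110).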

6. Proof of the Virial Theorems and the Regularity Theorem

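Proof of Theorem 3.2. We start by introducing some cutoff operators, and the regularized (cutoff, approximate) eigenfunction.

Let $g_1 \in C_0^\infty((-1, 1))$ be a real valued function, s.t. $g_1(0) = 1$, and set $g = g_1^2 \in C_0^\infty((-1, 1))$, $g(0) = 1$. Pick a real valued function $f$ on $\mathbb{R}$ with the properties that $f(0) = 1$ and $\hat f \in C_0^\infty(\mathbb{R})$ (Fourier transform). We set

\[ f_1(x) = \int_{-\infty}^x f^2(y)\,dy, \]

so that $f_1'(x) = f^2(x)$. Since $\widehat{f_1'}(s) = is\hat f_1(s) = (2\pi)^{-1/2}\,\hat f * \hat f(s)$, it follows that $\hat f_1$ has compact support, and is smooth except at $s = 0$, where it behaves like $s^{-1}$. We have $\widehat{f_1^{(n)}} = (is)^n \hat f_1 \in C_0^\infty$, for $n \geq 1$. Let $\alpha, \nu > 0$ be two parameters and define the cutoff-operators

\begin{align*}
g_{1,\nu} &= g_1(\nu N) = \int_{\mathbb{R}} \hat g_1(s)\, e^{is\nu N}\,ds, \\
g_\nu &= g_{1,\nu}^2, \\
f_\alpha &= f(\alpha A) = \int_{\mathbb{R}} \hat f(s)\, e^{is\alpha A}\,ds.
\end{align*}

For $\eta > 0$, define

\[ f^{\eta}_{1,\alpha} = \frac{1}{\alpha}\int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, \hat f_1(s)\, e^{is\alpha A} = (f^{\eta}_{1,\alpha})^*. \]

$f^{\eta}_{1,\alpha}$ leaves $\mathcal{D}(\Lambda)$ invariant, and $\|f^{\eta}_{1,\alpha}\| \leq k/\alpha$, where $k$ is a constant independent of $\eta$; this can be seen by noticing that $\|f_1\|_\infty < \infty$.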

Suppose that $\psi$ is an eigenfunction of $L$ with eigenvalue $e$: $L\psi = e\psi$. Since $\psi \in \mathcal{D}(L)$, then $\psi = (L + i)^{-1}\varphi$, for some $\varphi \in \mathcal{H}$. Let $\{\varphi_n\} \subset \mathcal{D}(\Lambda)$ be a sequence of vectors converging to $\varphi$. Then

\[ \psi_n := (L + i)^{-1}\varphi_n \longrightarrow \psi, \qquad n \longrightarrow \infty, \tag{115} \]

and moreover, $\psi_n \in \mathcal{D}(\Lambda)$. The latter follows because the resolvent $(L + i)^{-1}$ leaves $\mathcal{D}(\Lambda)$ invariant, see Proposition 5.6; without loss of generality, we assume that $k = 1$. Moreover, by Proposition 5.3, we know that $f_\alpha$ leaves $\mathcal{D}(\Lambda)$ invariant (see also (61)), and $g_\nu$ leaves $\mathcal{D}(\Lambda)$ invariant ($\Lambda$ commutes with $N$ in the strong sense on $\mathcal{D}$). Hence, the regularized eigenfunction

\[ \psi_{\alpha,\nu,n} = f_\alpha g_\nu \psi_n \]

satisfies $\psi_{\alpha,\nu,n} \in \mathcal{D}(\Lambda)$, $\psi_{\alpha,\nu,n} \to \psi$, as $\alpha, \nu \to 0$, $n \to \infty$.

Notice that in the definition of $\psi_n$, we introduced the resolvent of $L$, so that we have $(L - e)\psi_n \to 0$, as $n \to \infty$, which we write as

\[ (L - e)\psi_n = o(n). \tag{116} \]

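We now prove the estimate

\[ \big|\langle i f^{\eta}_{1,\alpha}(L - e)\rangle_{g_\nu\psi_n}\big| \leq k\,\frac{1}{\alpha}\big(\sqrt{\nu} + o(n)\big), \tag{117} \]

where $k$ is some constant independent of $\eta$, $\alpha$, $\nu$, $n$. This estimate follows from the bound

\[ \|(L - e)g_\nu\psi_n\| \leq k\big(\sqrt{\nu} + o(n)\big), \tag{118} \]

which is proven as follows. We have that

\begin{align}
(L - e)g_\nu\psi_n = {} & g_\nu(L - e)\psi_n + {} \tag{119} \\
& + g_{1,\nu}[L, g_{1,\nu}]\psi_n + {} \tag{120} \\
& + [L, g_{1,\nu}]g_{1,\nu}\psi_n, \tag{121}
\end{align}

and the r.h.s. of (119) is $o(n)$, by (116). Let us show that both (120) and (121) are bounded above by $k\sqrt{\nu}$, uniformly in $n$. The commutator expansion of Theorem 5.2 (see also (114)) yields

\[ g_{1,\nu}[L, g_{1,\nu}] = \nu \int_{\mathbb{R}} ds\, \hat g_1(s)\, e^{is\nu N} \int_0^s ds_1\, e^{-is_1\nu N} g_{1,\nu} D\, e^{is_1\nu N}, \tag{122} \]

as operators on $\mathcal{D}(\Lambda)$, where $D$ is given in (80). We use that $g_{1,\nu}$ commutes with $e^{is\nu N}$. From (62), we see that for any $\phi \in \mathcal{D}(\Lambda)$,

\begin{align*}
\|g_{1,\nu} D e^{is_1\nu N}\phi\| &= \sup_{\varphi \in \mathcal{D},\, \varphi \neq 0} \frac{|\langle\varphi, g_{1,\nu} D e^{is_1\nu N}\phi\rangle|}{\|\varphi\|} \leq \sup_{\varphi \in \mathcal{D},\, \varphi \neq 0} \frac{\|D g_{1,\nu}\varphi\|}{\|\varphi\|}\,\|\phi\| \\
&\leq k \sup_{\varphi \in \mathcal{D},\, \varphi \neq 0} \frac{\|N^{1/2} g_{1,\nu}\varphi\|}{\|\varphi\|}\,\|\phi\| \leq k\,\frac{1}{\sqrt{\nu}}\,\|\phi\|,
\end{align*}

and consequently,

\begin{align*}
\|g_{1,\nu}[L, g_{1,\nu}]\phi\| &\leq \nu \int_{\mathbb{R}} ds\, |\hat g_1(s)| \int_0^s ds_1\, \|g_{1,\nu} D e^{is_1\nu N}\phi\| \\
&\leq k\sqrt{\nu} \int_{\mathbb{R}} ds\, |s\hat g_1(s)|\, \|\phi\|. \tag{123}
\end{align*}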


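Thus, the desired bound for (120) is proven, and the same bound is established for (121) by proceeding in a similar way. This proves (118).

Next, since $f^{\eta}_{1,\alpha}$ leaves $\mathcal{D}(\Lambda)$ invariant, the commutator $[f^{\eta}_{1,\alpha}, L]$ is defined in the strong sense on $\mathcal{D}(\Lambda)$, and Theorem 5.2 yields

\begin{align*}
[f^{\eta}_{1,\alpha}, L] = {} & \int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, \hat f_1(s)\, e^{is\alpha A}\Big(sC_1 + \alpha\frac{s^2}{2}C_2\Big) + {} \\
& + \alpha^2 \int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, \hat f_1(s)\, e^{is\alpha A} \int_0^s ds_1 \int_0^{s_1} ds_2 \int_0^{s_2} ds_3\, e^{-is_3\alpha A} C_3\, e^{is_3\alpha A}. \tag{124}
\end{align*}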

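For $n \geq 1$, we have

\[ f_1^{(n)}(\alpha A) = \int_{\mathbb{R}} ds\, (is)^n \hat f_1(s)\, e^{is\alpha A} = \int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, (is)^n \hat f_1(s)\, e^{is\alpha A} - R_{\eta,n}, \]

where the remainder term

\[ R_{\eta,n} = -\int_{-\eta}^{\eta} ds\, (is)^n \hat f_1(s)\, e^{is\alpha A} \]

satisfies $R_{\eta,n} = (R_{\eta,n})^*$, and $\|R_{\eta,n}\| \leq k_n\eta$, with a constant $k_n$ that does not depend on $\alpha$, $\eta$. We obtain from (124)

\begin{align*}
[f^{\eta}_{1,\alpha}, L] = {} & -i\big(f_1'(\alpha A) + R_{\eta,1}\big)C_1 - \frac{\alpha}{2}\big(f_1''(\alpha A) + R_{\eta,2}\big)C_2 + {} \\
& + \alpha^2 \int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, \hat f_1(s)\, e^{is\alpha A} \int_0^s ds_1 \int_0^{s_1} ds_2 \int_0^{s_2} ds_3\, e^{-is_3\alpha A} C_3\, e^{is_3\alpha A}. \tag{125}
\end{align*}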

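Recalling that $f_1'(\alpha A) = f^2(\alpha A) = f_\alpha^2$, we write

\begin{align*}
-if_\alpha^2 C_1 &= -if_\alpha C_1 f_\alpha - if_\alpha \int_{\mathbb{R}} ds\, \hat f(s)\, e^{is\alpha A}\Big(\alpha s C_2 + \alpha^2 \int_0^s ds_1 \int_0^{s_1} ds_2\, e^{-is_2\alpha A} C_3\, e^{is_2\alpha A}\Big) \\
&= -if_\alpha C_1 f_\alpha - \alpha f_\alpha f_\alpha' C_2 - i\alpha^2 f_\alpha \int_{\mathbb{R}} ds\, \hat f(s)\, e^{is\alpha A} \int_0^s ds_1 \int_0^{s_1} ds_2\, e^{-is_2\alpha A} C_3\, e^{is_2\alpha A}, \tag{126}
\end{align*}

where $f_\alpha' = f'(\alpha A)$. Remarking that $f_\alpha f_\alpha' = \frac12 (f^2)'(\alpha A) = \frac12 f_1''(\alpha A)$, we obtain from (125), (126):

\begin{align*}
[f^{\eta}_{1,\alpha}, L] = {} & -if_\alpha C_1 f_\alpha - \alpha f_1''(\alpha A)C_2 - iR_{\eta,1}C_1 - \frac{\alpha}{2}R_{\eta,2}C_2 + {} \\
& + \alpha^2 \int_{\mathbb{R}\setminus(-\eta,\eta)} ds\, \hat f_1(s)\, e^{is\alpha A} \int_0^s ds_1 \int_0^{s_1} ds_2 \int_0^{s_2} ds_3\, e^{-is_3\alpha A} C_3\, e^{is_3\alpha A} - {} \\
& - i\alpha^2 f_\alpha \int_{\mathbb{R}} ds\, \hat f(s)\, e^{is\alpha A} \int_0^s ds_1 \int_0^{s_1} ds_2\, e^{-is_2\alpha A} C_3\, e^{is_2\alpha A}. \tag{127}
\end{align*}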

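Consequently, taking into account estimate (64), we obtain that

\begin{align*}
\langle i[f^{\eta}_{1,\alpha}, L]\rangle_{g_\nu\psi_n} = {} & \langle C_1\rangle_{\psi_{\alpha,\nu,n}} - \operatorname{Re}\, i\alpha\langle f_1''(\alpha A)C_2\rangle_{g_\nu\psi_n} + \operatorname{Re}\langle R_{\eta,1}C_1\rangle_{g_\nu\psi_n} - {} \\
& - \operatorname{Re}\, i\frac{\alpha}{2}\langle R_{\eta,2}C_2\rangle_{g_\nu\psi_n} + O\Big(\frac{\alpha^2}{\sqrt{\nu}}\Big), \tag{128}
\end{align*}

as we show next. We have taken the real part on the right side, since the left side is a real number. To estimate the remainder term, we use condition (64) to obtain

\[ \|e^{-is_3\alpha A} C_3\, e^{is_3\alpha A} g_\nu\psi_n\| \leq k\,\frac{1}{\sqrt{\nu}}\, e^{\alpha k'|s_3|}, \]

uniformly in $n$, so the middle line in (127) is estimated from above by

\[ k\,\frac{\alpha^2}{\sqrt{\nu}} \int_{\mathbb{R}} ds\, |\hat f_1(s)|\, |s|^3 e^{\alpha k'|s|} \leq k\,\frac{\alpha^2}{\sqrt{\nu}}\, e^{\alpha k' K} \int_{\mathbb{R}} ds\, |\hat f_1(s)|\, |s|^3, \]

where $K < \infty$ is such that $\operatorname{supp}\hat f_1 \subset [-K, K]$. The exponential is bounded uniformly in $0 \leq \alpha < 1$, hence the r.h.s. is $\leq k(\alpha^2/\sqrt{\nu})$. The last line in (127) is analyzed in the same way and (128) follows.

Finally, we observe that

\begin{align*}
-\operatorname{Re}\langle i\alpha f_1''(\alpha A)C_2\rangle_{g_\nu\psi_n} &= -\frac{\alpha}{2}\langle i[f_1''(\alpha A), C_2]\rangle_{g_\nu\psi_n} \\
&= -\frac{\alpha^2}{2}\Big\langle \int_{\mathbb{R}} ds\, \widehat{f_1''}(s)\, e^{is\alpha A} \int_0^s ds_1\, e^{-is_1\alpha A} C_3\, e^{is_1\alpha A} \Big\rangle_{g_\nu\psi_n} = O\Big(\frac{\alpha^2}{\sqrt{\nu}}\Big),
\end{align*}

where we use (64) again, as above. A similar estimate yields

\[ \operatorname{Re}\, i\frac{\alpha}{2}\langle R_{\eta,2}C_2\rangle_{g_\nu\psi_n} = -\frac{i\alpha}{4}\langle [R_{\eta,2}, C_2]\rangle_{g_\nu\psi_n} = O\Big(\frac{\alpha^2\eta}{\sqrt{\nu}}\Big), \]

and using the bound (63), we have that

\[ \langle R_{\eta,1}C_1\rangle_{g_\nu\psi_n} = O\Big(\frac{\eta}{\nu^p}\Big). \]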


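Combining this with (128) and (117) shows that

\[ |\langle C_1\rangle_{\psi_{\alpha,\nu,n}}| \leq k\Big(\frac{\sqrt{\nu} + o(n)}{\alpha} + \frac{\alpha^2}{\sqrt{\nu}} + \frac{\eta}{\nu^p}\Big). \tag{129} \]

Notice that

\[ C_1\psi_{\alpha,\nu,n} = \int ds\, \hat f(s)\, C_1 e^{is\alpha A} g_\nu\psi_n \longrightarrow C_1\psi_{\alpha,\nu}, \]

as $n \to \infty$, where $\psi_{\alpha,\nu} := f_\alpha g_\nu\psi$. This follows from the boundedness condition (63) and from $\psi_n \to \psi$, $n \to \infty$, see (115). Consequently we obtain by taking the limit $n \to \infty$ in (129)

\[ |\langle C_1\rangle_{\psi_{\alpha,\nu}}| \leq k\Big(\frac{\sqrt{\nu}}{\alpha} + \frac{\alpha^2}{\sqrt{\nu}} + \frac{\eta}{\nu^p}\Big). \]

Choose, for instance, $\nu = \alpha^3$, $\eta = \alpha^{3p+\delta}$, for any $\delta > 0$, then

\[ \lim_{\alpha\to 0} \langle C_1\rangle_{\psi_{\alpha,\alpha^3}} = 0. \]

This concludes the proof of the theorem. □

Proof of Theorem 3.3. We adopt the definitions and notation introduced in the proof of Theorem 3.2. It suffices to prove

\[ \lim_{\alpha\to 0} \langle\psi_\alpha, i[L, A_0]\psi_\alpha\rangle = 0, \]

where we set $\psi_\alpha = \psi_{\alpha,\nu}|_{\nu=\alpha^3}$; see the proof of Theorem 3.2. The scalar product can be estimated by

\[ |\langle\psi_\alpha, i[L, A_0]\psi_\alpha\rangle| \leq 2|\langle(L - e)\psi_\alpha, A_0\psi_\alpha\rangle| \leq 2\|P(N \leq n_0)(L - e)\psi_\alpha\|\, \|A_0\psi_\alpha\|. \]

We have

\begin{align}
P(N \leq n_0)(L - e)\psi_{\alpha,\nu} = {} & \lim_{n\to\infty} P(N \leq n_0)[L, f_\alpha]g_\nu\psi_n + {} \tag{130} \\
& + \lim_{n\to\infty} P(N \leq n_0)f_\alpha[L, g_\nu]\psi_n. \tag{131}
\end{align}

Using condition (63), we easily find (expanding the commutator $[L, f_\alpha]$) that $\|P(N \leq n_0)[L, f_\alpha]g_\nu\psi_n\| \leq k n_0\alpha$. Similarly, using (62), we find that $\|P(N \leq n_0)f_\alpha[L, g_\nu]\psi_n\| \leq k\sqrt{\nu}$. It follows that $\|P(N \leq n_0)(L - e)\psi_\alpha\| \leq k n_0\alpha$. □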
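The parameter choice $\nu = \alpha^3$, $\eta = \alpha^{3p+\delta}$ made at the end of the proof of Theorem 3.2 balances the three terms of the bound $\sqrt{\nu}/\alpha + \alpha^2/\sqrt{\nu} + \eta/\nu^p$: they become $\alpha^{1/2}$, $\alpha^{1/2}$ and $\alpha^{\delta}$, respectively. The following small sketch tabulates this, with the sample values $p = 1$ and $\delta = 1/2$ chosen only for illustration (they are assumptions, not values from the text).

```python
def bound_terms(alpha, p=1.0, delta=0.5):
    """Three terms of the bound sqrt(nu)/alpha + alpha^2/sqrt(nu) + eta/nu^p
    for the choice nu = alpha^3 and eta = alpha^(3p + delta)."""
    nu = alpha ** 3
    eta = alpha ** (3 * p + delta)
    return (nu ** 0.5 / alpha, alpha ** 2 / nu ** 0.5, eta / nu ** p)

# Evaluate the bound along a sequence alpha -> 0.
alphas = [10.0 ** (-j) for j in range(1, 6)]
totals = [sum(bound_terms(a)) for a in alphas]
```

With these sample exponents each term equals $\alpha^{1/2}$, so the total decreases like $3\sqrt{\alpha}$ along the sequence.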

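Proof of Theorem 3.4. The inequality $C \geq P - B$, the continuity of $B$, and (67) imply that for any $\varepsilon > 0$, there is an $\alpha_0(\varepsilon)$, s.t. if $\alpha < \alpha_0(\varepsilon)$ then

\[ \langle\psi_\alpha, P\psi_\alpha\rangle \leq \langle\psi, B\psi\rangle + \varepsilon. \tag{132} \]

Let $\mu_\phi$ be the spectral measure of $P$ corresponding to some $\phi \in \mathcal{H}$. Then

\[ \langle\psi_\alpha, P\psi_\alpha\rangle = \int_{\mathbb{R}_+} p\, d\mu_{\psi_\alpha}(p) = \lim_{R\to\infty} \int_{\mathbb{R}_+} p\,\chi(p \leq R)\, d\mu_{\psi_\alpha}(p), \]

where $\chi(p \leq R)$ is the indicator of $[0, R]$. We obtain from (132)

\[ \lim_{R\to\infty} \langle\psi_\alpha, \chi(P \leq R)P\psi_\alpha\rangle = \lim_{R\to\infty} \|\chi(P \leq R)P^{1/2}\psi_\alpha\|^2 \leq \langle\psi, B\psi\rangle + \varepsilon \equiv k. \]

We have $\|\chi(P \leq R)P^{1/2}\psi\| \leq R^{1/2}\|\psi - \psi_\alpha\| + \sqrt{k}$, and taking $\alpha \to 0$ gives $\|\chi(P \leq R)P^{1/2}\psi\| \leq \sqrt{k}$, uniformly in $R$, so $\lim_{R\to\infty} \int_0^R p\, d\mu_\psi(p)$ exists and is finite, by the monotone convergence theorem. Since $\mathcal{D}(P^{1/2}) = \{\psi \mid \int_0^\infty p\, d\mu_\psi(p) < \infty\}$, we have that $\psi \in \mathcal{D}(P^{1/2})$, and $\|P^{1/2}\psi\| \leq \langle\psi, B\psi\rangle$. □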

7. Flows and Induced Unitary Groups

Let $R \subseteq \mathbb{R}^n$ be a Borel set of $\mathbb{R}^n$ (with nonempty interior), let $X$ be a vector field on $\mathbb{R}^n$, and consider the initial value problem for $x \in R$:

\[ \frac{d}{dt}x_t = X(x_t), \qquad x_t|_{t=0} = x. \tag{133} \]

We assume that $X$ has the property that, for any initial condition $x \in R$, there is a unique, global (for all $t \in \mathbb{R}$) solution $x_t \in R$ to (133). Let $\Phi_t$ denote the corresponding flow and assume $\Phi_t$ is a diffeomorphism of $R$ into $R$, for all $t \in \mathbb{R}$. The following properties of the flow will be needed: $\Phi_{s+t} = \Phi_s \circ \Phi_t$, $\Phi_t^{-1} = \Phi_{-t}$, $\Phi_0 = 1$. The Jacobian determinant of $\Phi_t(x)$ is given by

\[ J_t(x) = |\det \Phi_t'(x)|, \tag{134} \]

where $\Phi_t'(x)$ is the matrix $\big(\frac{\partial(\Phi_t)_i}{\partial x_j}(x)\big)$.

Let $\mu\colon R \to \mathbb{R}_+$ be a continuous function which is $C^1$ on the interior of $R$ and which is strictly positive except possibly on a set of measure zero. We write $d\mu$ for the absolutely continuous measure $\mu(x)\,dx$, where $dx$ denotes Lebesgue measure on $\mathbb{R}^n$. Given a Hilbert space $\mathcal{H}$, consider $L^2(R, d\mu; \mathcal{H})$, the space of square integrable functions $\psi\colon R \to \mathcal{H}$, equipped with the scalar product

\[ \langle\psi, \phi\rangle = \int_R \langle\psi(x), \phi(x)\rangle_{\mathcal{H}}\, d\mu(x). \]

On the Hilbert space $L^2(R, d\mu; \mathcal{H})$, the flow $\Phi_t$ induces a strongly continuous unitary group, $U_t$, defined by

\[ (U_t\psi)(x) = \sqrt{J_t(x)\frac{\mu(\Phi_t(x))}{\mu(x)}}\; \psi(\Phi_t(x)), \tag{135} \]

for $\psi \in L^2(R, d\mu; \mathcal{H})$. To check that $U_t$ preserves the norm, we make the change of variables $y = \Phi_t(x)$ to arrive at

\begin{align*}
\int_R |(U_t\psi)(x)|^2\, d\mu(x) &= \int_R J_t(x)\, |\psi(\Phi_t(x))|^2 \mu(\Phi_t(x))\, dx \\
&= \int_R J_t(\Phi_t^{-1}(y))\, |\det(\Phi_t^{-1})'(y)|\; |\psi(y)|^2 \mu(y)\, dy.
\end{align*}

We observe that $J_t(\Phi_t^{-1}(y))\,|\det(\Phi_t^{-1})'(y)| = |\det 1| = 1$, hence $\|U_t\psi\| = \|\psi\|$. Next, using that $\Phi_{t+s} = \Phi_t \circ \Phi_s$, one easily shows that $J_{t+s}(x) = J_t(\Phi_s(x))J_s(x)$, and that

\[ \frac{\mu(\Phi_{t+s}(x))}{\mu(x)} = \frac{\mu(\Phi_t(\Phi_s(x)))}{\mu(\Phi_s(x))}\, \frac{\mu(\Phi_s(x))}{\mu(x)}, \]

hence $t \mapsto U_t$ is a unitary group.

In order to see that the unitary group is strongly continuous and to calculate its generator, we impose some additional assumptions on $\mu$ and $X$.

(1) $X$ is $C^\infty$ and bounded,
(2) for any compact set $M \subset R$, there is a $k < \infty$ s.t. $\partial_t|_{t=0}J_t(x) \leq k$, uniformly in $x \in M$,
(3) for any compact set $M \subset R$, there is a $k < \infty$ s.t. $\big\|X'(x)\frac{\nabla\mu(x)}{\mu(x)}\big\| \leq k$, uniformly in $x \in M$,
(4) $t \mapsto \{J_t(x)\mu(\Phi_t(x))\}^{1/2}$ is $C^1$ in a neighbourhood $(-t_0, t_0)$ of zero, and for any compact set $M \subset R$, there is a $k < \infty$ s.t. we have the estimate $|\{J_t(x)\mu(\Phi_t(x))\}^{1/2}| < f(x)$, for $|t| < t_0$, where $f \in L^2_{\mathrm{loc}}(R, dx)$.
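The norm preservation and the group property of the operators $U_t$ defined in (135) can be checked numerically in one dimension. The sketch below uses a hypothetical example that is not taken from the text: the dilation flow $\Phi_t(x) = x e^t$ on $(0, \infty)$ (generated by $X(x) = x$, which is not bounded, so assumption (1) is ignored here since it is only needed for strong continuity and the generator), the density $\mu(x) = x^2$, and the scalar function $\psi(x) = e^{-x}$.

```python
import math

def integrate(f, lo, hi, steps=100000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (j + 0.5) * h) for j in range(steps))

# Hypothetical one-dimensional example on (0, infinity):
# vector field X(x) = x, flow Phi_t(x) = x * exp(t), Jacobian J_t(x) = exp(t).
def flow(t, x):
    return x * math.exp(t)

def jacobian(t, x):
    return math.exp(t)

def mu(x):   # density of the measure d mu = mu(x) dx
    return x * x

def psi(x):  # a sample square-integrable function
    return math.exp(-x)

def U(t, f):
    """Induced operator (U_t f)(x) = sqrt(J_t(x) mu(Phi_t(x)) / mu(x)) f(Phi_t(x))."""
    return lambda x: math.sqrt(jacobian(t, x) * mu(flow(t, x)) / mu(x)) * f(flow(t, x))

def norm_sq(f):
    # The integrands decay exponentially, so truncating at x = 60 is harmless.
    return integrate(lambda x: f(x) ** 2 * mu(x), 0.0, 60.0)

norm_psi = norm_sq(psi)            # exact value: integral of x^2 exp(-2x) = 1/4
norm_shifted = norm_sq(U(0.7, psi))
```

Composing `U(0.3, U(0.4, psi))` and comparing it pointwise with `U(0.7, psi)` illustrates the group property $U_s U_t = U_{s+t}$ in the same setting.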

If $X$ is $C^\infty$ then so is $\Phi_t(x)$ (jointly in $(t, x)$), and using that

\begin{align*}
\Phi_t(x) &= x + \int_0^t X(\Phi_s(x))\, ds, \\
\Phi_t'(x) &= 1 + \int_0^t X'(\Phi_s(x))\,\Phi_s'(x)\, ds, \tag{136}
\end{align*}

it follows immediately that

\[ \|\Phi_t(x)\| \leq \|x\| + |t|\, \|X\|_\infty, \tag{137} \]

where the subscript $\infty$ denotes the supremum norm over $x \in R$. In order to obtain an estimate on $\|\Phi_t'(x)\|$ (the operator norm on $\mathcal{B}(\mathbb{R}^n)$, i.e. the matrix norm, for $x$ fixed), we recall Gronwall's lemma. If $\mu\colon \mathbb{R} \to \mathbb{R}_+$ is continuous, and $\nu\colon \mathbb{R} \to \mathbb{R}_+$ is locally integrable, then the inequality

\[ \mu(t) \leq c + \int_{t_0}^t \nu(s)\mu(s)\, ds, \]

where $c \geq 0$, and $t \geq t_0$, implies that

\[ \mu(t) \leq c\, e^{\int_{t_0}^t \nu(s)\, ds}. \tag{138} \]

Equation (136) implies

\[ \|\Phi_t'(x)\| \leq 1 + \|X'\|_\infty \int_0^t \|\Phi_s'(x)\|\, ds, \]

so Gronwall's lemma yields the estimate

\[ \|\Phi_t'(x)\| \leq \exp(\|X'\|_\infty t), \qquad \forall t > 0. \]

A similar bound holds for $t < 0$, and hence

\[ \|\Phi_t'(x)\| \leq \exp(\|X'\|_\infty |t|), \qquad t \in \mathbb{R}, \tag{139} \]

from which it follows that

\[ J_t(x) \leq \exp(n\|X'\|_\infty |t|). \tag{140} \]
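The bounds (137) and (139) can be illustrated numerically. The following sketch integrates a one-dimensional flow together with its variational equation by the classical Runge-Kutta method, for the sample vector field $X(x) = \sin x$ (a hypothetical choice with $\|X\|_\infty = \|X'\|_\infty = 1$, not taken from the text), and checks that $|\Phi_t(x)| \leq |x| + |t|$ and $|\Phi_t'(x)| \leq e^{|t|}$ at a few sample points.

```python
import math

def flow_and_derivative(x0, t, steps=2000):
    """Integrate x' = X(x) together with the variational equation y' = X'(x) y,
    y(0) = 1, by classical RK4, for the sample field X(x) = sin(x).
    Returns (Phi_t(x0), Phi_t'(x0)); t may be negative."""
    X = math.sin    # bounded: |X| <= 1
    dX = math.cos   # bounded: |X'| <= 1
    h = t / steps
    x, y = x0, 1.0
    for _ in range(steps):
        k1x, k1y = X(x), dX(x) * y
        k2x, k2y = X(x + 0.5 * h * k1x), dX(x + 0.5 * h * k1x) * (y + 0.5 * h * k1y)
        k3x, k3y = X(x + 0.5 * h * k2x), dX(x + 0.5 * h * k2x) * (y + 0.5 * h * k2y)
        k4x, k4y = X(x + h * k3x), dX(x + h * k3x) * (y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

checks = []
for x0 in (-2.0, 0.0, 0.5, 1.0, 3.0):
    for t in (-2.0, -0.5, 0.5, 2.0):
        x_t, dx_t = flow_and_derivative(x0, t)
        growth_ok = abs(x_t) <= abs(x0) + abs(t) + 1e-9      # bound (137)
        gronwall_ok = abs(dx_t) <= math.exp(abs(t)) + 1e-9   # bound (139)
        checks.append(growth_ok and gronwall_ok)
```

At the fixed point $x_0 = 0$ the variational equation reads $y' = y$, so the Gronwall bound is attained there for $t > 0$; away from it the derivative stays strictly below $e^{|t|}$.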

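For $\psi \in C_0^\infty$,

\begin{align*}
-\frac{1}{i}\,\partial_t|_{t=0}(U_t\psi)(x) &= -\frac{1}{i}\Big(\frac12\,\partial_t|_{t=0}J_t(x) + \frac12\,\frac{\nabla\mu(x)\cdot X(x)}{\mu(x)} + X(x)\cdot\nabla\Big)\psi(x) \\
&= (A\psi)(x), \tag{141}
\end{align*}

which defines an operator $A$ on $C_0^\infty$. Notice that due to conditions (1–3), $A$ maps $C_0^\infty$ into $L^2(R, d\mu; \mathcal{H})$.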

PROPOSITION 7.1. Assume conditions (1–4) hold. Then for any $\psi \in C_0^\infty$, in the strong sense on $L^2(R, d\mu; \mathcal{H})$,

\[ -\frac{1}{i}\,\frac{U_t - 1}{t}\,\psi \longrightarrow A\psi, \qquad t \longrightarrow 0. \tag{142} \]

Consequently, $C_0^\infty$ is in the domain of definition of the selfadjoint generator of the unitary group $U_t$, and on $C_0^\infty$, this generator can be identified with the operator $A$ of Equation (141).

Proof. Invoking the dominated convergence theorem, it is enough to verify that

\[ \Big\| -\frac{1}{i}\,\frac{1}{t}(U_t - 1)\psi(x) - (A\psi)(x) \Big\|_{\mathcal{H}}^2 \tag{143} \]

is bounded above by a $d\mu$-integrable function which is independent of $t$, for small $t$. We write

\begin{align}
(143) \leq {} & \frac{1}{\mu(x)}\Big|\frac{1}{t}\big(\sqrt{J_t(x)\mu(\Phi_t(x))} - \sqrt{\mu(x)}\big)\Big|^2 \|\psi(\Phi_t(x))\|_{\mathcal{H}}^2 + {} \tag{144} \\
& + \frac{1}{t^2}\|\psi(\Phi_t(x)) - \psi(x)\|_{\mathcal{H}}^2 + {} \tag{145} \\
& + \|(A\psi)(x)\|_{\mathcal{H}}^2. \tag{146}
\end{align}

Clearly, (146) is integrable, and, using the continuity properties of $\psi$ and $\Phi$ and the bound (139), one sees that (145) is bounded above by a $t$-independent function that is $d\mu$-integrable (use the mean value theorem). Next, if $\psi$ has support in a ball of radius $\rho$ in $R \subset \mathbb{R}^n$, then $\psi \circ \Phi_t$ has support in the ball of radius $\rho + |t|\,\|X\|_\infty \leq \rho + \|X\|_\infty$, for $|t| \leq 1$. This follows from (137). Let $\chi(x)$ denote the indicator function on the ball of radius $\rho + \|X\|_\infty$, then we have for $|t| < t_0$ with $t_0$ as in condition (4),

\[ (144) \leq k\chi(x)\frac{1}{\mu(x)}\Big|\frac{1}{t}\big(\sqrt{J_t(x)\mu(\Phi_t(x))} - \sqrt{\mu(x)}\big)\Big|^2 \leq k\chi(x)\frac{1}{\mu(x)}|f(x)|^2, \]

where we have used the mean value theorem and condition (4). The latter function is $d\mu$-integrable. □

Proof of Proposition 4.1. Since $\xi$ is globally Lipschitz (with Lipschitz constant $\|\xi'\|_\infty$), we have existence and uniqueness of global solutions to the initial value problem (133). Due to uniqueness and the fact that $\mathbb{R} \ni t \mapsto e_t = 0$ is a solution (since $\xi(0) = 0$), we see that $\Phi_t(e) \in (0, \infty)$, for all $t \in \mathbb{R}$, $e \in (0, \infty)$, so $\Phi_t$ is a diffeomorphism in $\mathbb{R}_+$. It is not difficult to verify that conditions (1–4) above are satisfied. Consequently, it follows from Proposition 7.1 that $C_0^\infty \subset \mathcal{D}(A)$, and that $A$ is given by (70) on $C_0^\infty$. Since $\xi$ is infinitely many times differentiable, $A$ leaves $C_0^\infty$ invariant. Hence $C_0^\infty$ is a core for $A$. □

8. Proofs of Some Propositions

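Proof of Proposition 4.2. Since $\Pi I \Pi = 0$ and $\Pi I R_\varepsilon^2 (\bar p_0 \otimes \bar p_0) I \Pi = 0$, we have

\begin{align*}
\Pi I \bar R_\varepsilon^2 I \Pi &= \Pi I R_\varepsilon^2 I \Pi \\
&= \Pi I R_\varepsilon^2 (\bar p_0 \otimes p_0 + p_0 \otimes \bar p_0) I \Pi + \Pi I R_\varepsilon^2 (p_0 \otimes p_0) I \Pi. \tag{147}
\end{align*}

It is not difficult to see that $\varepsilon\, \Pi I R_\varepsilon^2 (p_0 \otimes p_0) I \Pi \to 0$, as $\varepsilon \to 0$, so the last term in (147) does not contribute effectively to a lower bound in the limit $\varepsilon \to 0$.

Let $J$ be the modular conjugation operator introduced in (47). Using the relations $J^2 = 1$, $J(\bar p_0 \otimes p_0) = (p_0 \otimes \bar p_0)J$, $JR_\varepsilon^2 = R_\varepsilon^2 J$, $JI = -IJ$, and the invariance of $\varphi_0 \otimes \varphi_0 \otimes \Omega$ under $J$, one verifies easily that

\begin{align*}
\Pi I R_\varepsilon^2 (\bar p_0 \otimes p_0) I \Pi &= \Pi I R_\varepsilon^2 (p_0 \otimes \bar p_0) I \Pi \\
&= \sum_{\alpha,\alpha'} \Pi \big(G_\alpha \otimes 1_p \otimes a(\tau_\beta(g_\alpha))\big)\, \frac{\bar p_0 \otimes p_0}{(H_p \otimes 1_p - E + L_f)^2 + \varepsilon^2} \times \\
&\qquad \times \big(G_{\alpha'} \otimes 1_p \otimes a^*(\tau_\beta(g_{\alpha'}))\big) \Pi,
\end{align*}

where $L_f = d\Gamma(u)$ and where $\tau_\beta$ has been defined in (44). We pull the annihilation operator through the resolvent, using the pull-through formula (for $f \in L^2(\mathbb{R} \times S^2)$)

\[ a(f) L_f = \int_{\mathbb{R}\times S^2} \bar f(u, \Sigma)\, (L_f + u)\, a(u, \Sigma), \]

and then contract it with the creation operator. This gives the bound

\begin{align*}
\Pi I \bar R_\varepsilon^2 I \Pi \geq {} & \int_{-\infty}^{E} du \int_{S^2} d\Sigma\, \frac{u^2}{e^{-\beta u} - 1} \times \\
& \times \Big( p_0 F(-u, \Sigma)\, \frac{\bar p_0}{(H_p - E + u)^2 + \varepsilon^2}\, F(-u, \Sigma)^* p_0 \Big) \otimes p_0 \otimes P_\Omega,
\end{align*}

where we restricted the domain of integration over $u$ to $(-\infty, E) \subset \mathbb{R}_-$ (as $\varepsilon \to 0$, $\frac{\varepsilon}{(H_p - E + u)^2 + \varepsilon^2}$ tends to the Dirac distribution $\delta(H_p - E + u)$, hence $u = -H_p + E \in (-\infty, E)$), and where we used (44). The desired result (91) now follows by making the change of variable $u \mapsto -u$ in the integral, and by remembering the definition of $\gamma$, (39). □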

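Proof of Proposition 4.3. First, we prove a bound on $\Lambda_p e^{itA_p^a}\psi$, for $\psi \in C_0^\infty$. Let $\Phi_t^a$ denote the flow generated by the vector field $\xi^a$. Then, for each $e \in [0, \infty)$, $((\Lambda_p - 1_p)e^{itA_p^a}\psi)(e) = e\,\psi(\Phi_t^a(e))$, and

\begin{align*}
\|(\Lambda_p - 1_p)e^{itA_p^a}\psi\|^2 &= \int_{\mathbb{R}_+} e^2 \|\psi(\Phi_t^a(e))\|^2\, de \\
&= \int_{\mathbb{R}_+} (\Phi_{-t}^a(y))^2\, \|\psi(y)\|^2\, (\Phi_{-t}^a)'(y)\, dy, \tag{148}
\end{align*}

where we make the change of variables $y = \Phi_t^a(e)$. Recall that $\Phi_t^a(y) = y + \int_0^t \xi(\Phi_s^a(y)/a)\, ds$, so

\[ |\Phi_t^a(y)| \leq |y| + |t|\, \|\xi\|_\infty. \tag{149} \]

Next $(\Phi_t^a)'(y) = 1 + \int_0^t \frac{1}{a}\xi'(\Phi_s^a(y)/a)(\Phi_s^a)'(y)\, ds$ yields

\[ |(\Phi_t^a)'(y)| \leq 1 + \int_0^t \frac{1}{a}\|\xi'\|_\infty\, |(\Phi_s^a)'(y)|\, ds, \tag{150} \]

and Gronwall's estimate, (138), implies that

\[ |(\Phi_t^a)'(y)| \leq e^{\|\xi'\|_\infty |t|/a}. \tag{151} \]

Using (151) and (149) in (148) yields

\begin{align*}
\|(\Lambda_p - 1_p)e^{itA_p^a}\psi\|^2 &\leq e^{\|\xi'\|_\infty |t|/a} \int_{\mathbb{R}_+} (y + \|\xi\|_\infty |t|)^2\, \|\psi(y)\|^2\, dy \\
&\leq 2e^{\|\xi'\|_\infty |t|/a}(1 + \|\xi\|_\infty |t|)^2 \big(\|(\Lambda_p - 1_p)\psi\| + \|\psi\|\big)^2,
\end{align*}

from which it follows that

\begin{align*}
\|\Lambda_p e^{itA_p^a}\psi\| &\leq 4\sqrt{2}\,(1 + \|\xi\|_\infty |t|)\, e^{\|\xi'\|_\infty |t|/a}\, \|\Lambda_p\psi\| \\
&\leq 4\sqrt{2}\, e^{(\|\xi'\|_\infty + \|\xi\|_\infty)|t|/a}\, \|\Lambda_p\psi\|. \tag{152}
\end{align*}

Estimate (152) holds for all $\psi \in C_0^\infty$, which is a core for $\Lambda_p$. Next, let $\psi \in \mathcal{D}(\Lambda_p)$, and let $\{\psi_n\} \subset C_0^\infty$ be a sequence, s.t. $\psi_n \to \psi$, $\Lambda_p\psi_n \to \Lambda_p\psi$, as $n \to \infty$. If $\chi_R$ denotes the cutoff function $\chi(\Lambda_p \leq R)$, for $R > 0$, we have

\begin{align*}
\|\chi_R\Lambda_p e^{itA_p^a}\psi\| &\leq \|\chi_R\Lambda_p e^{itA_p^a}\psi_n\| + R\|\psi - \psi_n\| \\
&\leq 4\sqrt{2}\, e^{(\|\xi'\|_\infty + \|\xi\|_\infty)|t|/a}\, \|\Lambda_p\psi_n\| + R\|\psi - \psi_n\|.
\end{align*}

Taking $n \to \infty$ yields

\[ \|\chi_R\Lambda_p e^{itA_p^a}\psi\| \leq 4\sqrt{2}\, e^{(\|\xi'\|_\infty + \|\xi\|_\infty)|t|/a}\, \|\Lambda_p\psi\|, \]

uniformly in the cutoff parameter $R$. This shows that $e^{itA_p^a}\psi \in \mathcal{D}(\Lambda_p)$, and (152) is valid for all $\psi \in \mathcal{D}(\Lambda_p)$.

We complete the proof of the proposition by examining $\Lambda_f e^{itA_f}\psi$. Let $\psi \in \mathcal{D}_f$. Then one finds the following bound for the $n$-particle component:

\begin{align*}
\|[(\Lambda_f - 1_f)e^{itA_f}\psi]_n\|^2 &= \Big\|\sum_{j=1}^n (u_j^2 + 1)\psi_n(u_1 - t, \dots, u_n - t)\Big\|^2 \\
&= \Big\|\sum_{j=1}^n ((u_j + t)^2 + 1)\psi_n(u_1, \dots, u_n)\Big\|^2 \\
&\leq (2(1 + t^2))^2 \Big\|\sum_{j=1}^n (u_j^2 + 1)\psi_n(u_1, \dots, u_n)\Big\|^2.
\end{align*}

It follows that $\|(\Lambda_f - 1_f)e^{itA_f}\psi\| \leq 2(1 + t^2)\|\Lambda_f\psi\|$, for all $\psi \in \mathcal{D}_f$, so that

\[ \|\Lambda_f e^{itA_f}\psi\| \leq 2(1 + t^2)\|\Lambda_f\psi\| + \|\psi\| \leq 3e^{t}\|\Lambda_f\psi\|, \]

for all $\psi \in \mathcal{D}_f$. A similar argument as above shows that this estimate extends to all $\psi \in \mathcal{D}(\Lambda_f)$. □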
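The last inequality in the bound for the $n$-particle component rests on an elementary pointwise estimate. A short sketch of that step, under the assumption that the sums act as multiplication operators on the $n$-particle functions:

```latex
\[
(u + t)^2 + 1 \;\le\; 2u^2 + 2t^2 + 1 \;\le\; 2(1 + t^2)(u^2 + 1),
\qquad u, t \in \mathbb{R},
\]
% first step: 2ut <= u^2 + t^2;
% second step: 2(1 + t^2)(u^2 + 1) = 2u^2 + 2t^2 + 2 + 2t^2 u^2.
\[
0 \;\le\; \sum_{j=1}^n \big((u_j + t)^2 + 1\big) \;\le\; 2(1 + t^2) \sum_{j=1}^n (u_j^2 + 1).
\]
```

Since both sums are nonnegative multiplication operators, the pointwise bound carries over to the norms with the same constant $2(1 + t^2)$.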

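Proof of Proposition 4.4. We denote the fiber of $A_p^a$ by $A_p^a(e)$, i.e.

\[ A_p^a(e) = i\Big(\frac12\,\frac{1}{a}\,\xi'\Big(\frac{e}{a}\Big) + \xi\Big(\frac{e}{a}\Big)\partial_e\Big), \tag{153} \]

see also (71). For $\psi \in C_0^\infty$, we have

\begin{align*}
(A_p^a G_\alpha\psi)(e) &= A_p^a(e)(G_\alpha\psi)(e) \\
&= A_p^a(e)G_\alpha(e, E)\psi(E) + A_p^a(e)\int_{\mathbb{R}_+} G_\alpha(e, e')\psi(e')\, de'.
\end{align*}

Due to the regularity property (36), we can take the operator $A_p^a(e)$ inside the integral (dominated convergence theorem), and obtain the estimate

\begin{align}
\|A_p^a G_\alpha\psi\|^2 \leq {} & |\psi(E)|^2 \int_{\mathbb{R}_+} \|A_p^a(e)G_\alpha(e, E)\|_{\mathcal{H}}^2\, de + {} \tag{154} \\
& + \int_{\mathbb{R}_+} \Big[\int_{\mathbb{R}_+} \|A_p^a(e)G_\alpha(e, e')\psi(e')\|_{\mathcal{H}}\, de'\Big]^2 de. \tag{155}
\end{align}

Using (153) and the bound $|a^{-1}\xi'(e/a)| \leq e^{-1}\sup_{e \geq 0} e\,\xi'(e) \leq ke^{-1}$, it is easily seen that the integrand of (154) is bounded above by

\[ k\big(\|e^{-1}G_\alpha(e, E)\|_{\mathcal{H}}^2 + \|\partial_1 G_\alpha(e, E)\|_{\mathcal{H}}^2\big), \]

which is integrable, due to (35). We estimate the integrand in (155) by

\[ \|A_p^a(e)G_\alpha(e, e')\psi(e')\|_{\mathcal{H}} \leq k\big(\|e^{-1}G_\alpha(e, e')\|_{\mathcal{B}(\mathcal{H})} + \|\partial_1 G_\alpha(e, e')\|_{\mathcal{B}(\mathcal{H})}\big)\|\psi(e')\|_{\mathcal{H}}, \]

and using Hölder's inequality, we arrive at

\[ (155) \leq k \int_{\mathbb{R}_+} \|\psi(e)\|^2\, de \times \int_{\mathbb{R}_+}\int_{\mathbb{R}_+} \big\{\|e^{-1}G_\alpha(e, e')\|_{\mathcal{B}(\mathcal{H})}^2 + \|\partial_1 G_\alpha(e, e')\|_{\mathcal{B}(\mathcal{H})}^2\big\}\, de\, de'. \]

By condition (36), the double integral is finite. We conclude that

\[ \|A_p^a G_\alpha\psi\| \leq k\|\psi\|. \tag{156} \]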


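One also finds that $\|G_\alpha A_p^a\psi\| \leq k\|\psi\|$, e.g., by noticing that

\[ \|G_\alpha A_p^a\psi\| = \sup_{0 \neq \phi \in C_0^\infty} \|\phi\|^{-1}|\langle\phi, G_\alpha A_p^a\psi\rangle| = \sup_{0 \neq \phi \in C_0^\infty} \|\phi\|^{-1}|\langle A_p^a G_\alpha\phi, \psi\rangle| \]

and using (156). Consequently, we have shown (93) for $n = 1$.

The proof for $n = 2, 3$ follows the above lines. For instance, in order to show boundedness of the third multi-commutator, a typical term to estimate is $\|A_p^a A_p^a G_\alpha A_p^a\psi\|$, for $\psi \in C_0^\infty$. We shall sketch the proof that this term is bounded, all other ones being treated similarly. We have

\[ \|A_p^a A_p^a G_\alpha A_p^a\psi\| = \sup_{0 \neq \phi \in C_0^\infty} \|\phi\|^{-1}|\langle\phi, A_p^a A_p^a G_\alpha A_p^a\psi\rangle|, \tag{157} \]

and the scalar product equals

\[ \int_{\mathbb{R}_+}\int_{\mathbb{R}_+} \langle\phi(e), A_p^a(e)^2 G_\alpha(e, e')A_p^a(e')\psi(e')\rangle_{\mathcal{H}}\, de\, de'. \tag{158} \]

Recalling (153), one can calculate the operator $A_p^a(e)^2 G_\alpha(e, e')$. It can be written as a sum of terms, involving multiplications by functions with argument $e$, and derivatives $\partial_1 G_\alpha(e, e')$, $\partial_1^2 G_\alpha(e, e')$. Using the formulas for the adjoints of the derivatives $\partial_1^{1,2}G_\alpha(e, e')$, see (32), we obtain $[A_p^a(e)^2 G_\alpha(e, e')]^*$, and (158) becomes

\[ \int_{\mathbb{R}_+}\int_{\mathbb{R}_+} \langle A_p^a(e')[A_p^a(e)^2 G_\alpha(e, e')]^*\phi(e), \psi(e')\rangle_{\mathcal{H}}\, de\, de', \tag{159} \]

due to selfadjointness of $A_p^a(e')$ on $\mathcal{H}$, and the fact that for all $e \in \mathbb{R}_+$,

\[ [A_p^a(e)^2 G_\alpha(e, e')]^*\phi(e) \in \mathcal{D}(A_p^a(e')), \]

which follows from condition (36). Moreover, the same condition allows us to estimate

\begin{align*}
|(159)| &\leq \int_{\mathbb{R}_+}\int_{\mathbb{R}_+} \|A_p^a(e')[A_p^a(e)^2 G_\alpha(e, e')]^*\|_{\mathcal{B}(\mathcal{H})}\, \|\phi(e)\|_{\mathcal{H}}\, \|\psi(e')\|_{\mathcal{H}}\, de\, de' \\
&\leq \|\phi\|\, \|\psi\| \Big[\int_{\mathbb{R}_+}\int_{\mathbb{R}_+} \|A_p^a(e')[A_p^a(e)^2 G_\alpha(e, e')]^*\|_{\mathcal{B}(\mathcal{H})}^2\, de\, de'\Big]^{1/2} \\
&\leq k\|\phi\|\, \|\psi\|,
\end{align*}

where we have used Hölder's inequality. This shows that $(157) \leq k\|\psi\|$. □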

Proof of Proposition 4.5. We have mentioned before (90) that $A_0$ satisfies the conditions of Theorem 3.3, so it suffices to verify the conditions of Theorem 3.2.

We need to check that $(X, \Lambda, \mathcal{D})$ is a GJN triple, for $X = L, N, D, C_n^a$, $n = 1, 2, 3$, and that (61), (62), (64), (63) are satisfied. Proposition 4.3 shows that (61) holds. The operator $D$, given in (80), is clearly $N^{1/2}$-bounded in the sense of Kato on $\mathcal{D}$, since $G_\alpha$ are bounded operators, and $g_\alpha$, $e^{-\beta u/2}g_\alpha$ are square-integrable. Hence (62) holds. Recalling Remark (1) after Theorem 3.2, and noticing that $N$ commutes with $e^{itA^a}$, in the strong sense on $\mathcal{D}$, and that $C_3^a \leq kN^{1/2}$, in the sense of Kato on $\mathcal{D}$ (see (83)), we see that (64) is verified. Similarly, $C_1^a \leq kN$ in the sense of Kato on $\mathcal{D}$, see (81), so (63) holds.

It remains to show that the above mentioned triples satisfy the GJN properties. We first look at $(L, \Lambda, \mathcal{D})$. Clearly, $\|L\psi\| \leq k\|\Lambda\psi\|$, for $\psi \in \mathcal{D}$. Moreover, $L_0$ commutes with $\Lambda$ in the strong sense on $\mathcal{D}$, so we need only consider the interaction term in the verification of (58). Due to condition (37), we have for all $\psi \in C_0^\infty$: $\|\Lambda_p G_\alpha\psi\| \leq k\|\psi\|$, $\|G_\alpha\Lambda_p\psi\| \leq k\|\psi\|$. Consequently, for $\psi \in \mathcal{D}$:

\begin{align*}
& |\langle G_\alpha \otimes 1_p \otimes \varphi(g_\alpha)\psi, \Lambda\psi\rangle - \langle\Lambda\psi, G_\alpha \otimes 1_p \otimes \varphi(g_\alpha)\psi\rangle| \\
&\qquad \leq k\|\psi\|\, \|\varphi(g_\alpha)\psi\| + |\langle G_\alpha \otimes 1_p \otimes \varphi(g_\alpha)\psi, \Lambda_f\psi\rangle - \langle\Lambda_f\psi, G_\alpha \otimes 1_p \otimes \varphi(g_\alpha)\psi\rangle| \\
&\qquad \leq k\|\psi\|\, \|\varphi(g_\alpha)\psi\| + k\|\psi\|\, \|\varphi((u^2 + 1)g_\alpha)\psi\| \\
&\qquad \leq k\|\psi\|\, \|\Lambda^{1/2}\psi\| \\
&\qquad \leq k(\|\psi\|^2 + \|\Lambda^{1/2}\psi\|^2) \\
&\qquad \leq k\langle\psi, (\Lambda + 1)\psi\rangle \\
&\qquad \leq 2k\langle\psi, \Lambda\psi\rangle. \tag{160}
\end{align*}

We used in the third step that $\varphi(g_\alpha)$ and $\varphi((u^2 + 1)g_\alpha)$ are relatively $\Lambda_f^{1/2}$-bounded, in the sense of Kato on $\mathcal{D}$. This follows since $(u^2 + 1)g_\alpha \in L^2(\mathbb{R} \times S^2, du \times d\Sigma)$, due to conditions (33) and (34). The same estimates hold for $1_p \otimes C_p G_\alpha C_p \otimes \varphi(e^{-\beta u/2}g_\alpha)$, hence we have shown that $(L, \Lambda, \mathcal{D})$ is a GJN triple.

It is clear that $N \leq \Lambda$ in the sense of Kato on $\mathcal{D}$, and since $N$ commutes with $\Lambda$ in the strong sense on $\mathcal{D}$, we see immediately that $(N, \Lambda, \mathcal{D})$ is a GJN triple.

Next, consider $(D, \Lambda, \mathcal{D})$. Since $D$ has the same structure as $I$, cf. (54) and (80), the proof that $(D, \Lambda, \mathcal{D})$ is a GJN triple goes as the one for $(L, \Lambda, \mathcal{D})$.

We examine $(C_n^a, \Lambda, \mathcal{D})$, $n = 1, 2, 3$, $a > 0$. Recall that the $C_n^a$ are given in (81)–(83). Each $C_n^a$ has a term that acts purely on the particle space. This term is a bounded multiplication operator that commutes with $\Lambda$, in the strong sense on $\mathcal{D}$. Therefore, we need only show that $(N + \lambda I_1^a, \Lambda, \mathcal{D})$, $(I_{2,3}^a, \Lambda, \mathcal{D})$ are GJN triples. Since we have shown it for $(N, \Lambda, \mathcal{D})$, it suffices to treat $(I_n^a, \Lambda, \mathcal{D})$, $n = 1, 2, 3$, $a > 0$. We take the general term in the sum of (84):

\[ X := \mathrm{ad}_{A_p^a}^{(j)}(G_\alpha) \otimes 1_p \otimes \mathrm{ad}_{A_f}^{(n-j)}(\varphi(g_\alpha)). \]

Since $\mathrm{ad}_{A_p^a}^{(j)}(G_\alpha)$ is bounded, $j = 1, 2, 3$ (see Proposition 4.4), and

\[ \mathrm{ad}_{A_f}^{(n-j)}(\varphi(g_\alpha)) = \varphi((i\partial_u)^{n-j}g_\alpha) \]

is relatively $\Lambda_f^{1/2}$-bounded, in the sense of Kato on $\mathcal{D}$ (this follows from $\partial_u^k g_\alpha \in L^2(\mathbb{R} \times S^2)$, $k = 1, 2, 3$, due to (33), (34)), then it is clear that $\|X\psi\| \leq k\|\Lambda\psi\|$, $\psi \in \mathcal{D}$. Next, we verify condition (58) as above in (160):

\[ |\langle X\psi, \Lambda\psi\rangle - \langle\Lambda\psi, X\psi\rangle| \leq k\|\psi\|\, \|\varphi((u^2 + 1)(i\partial_u)^{n-j}g_\alpha)\psi\| \leq k\|\psi\|\, \|\Lambda^{1/2}\psi\|, \]

since $(u^2 + 1)(\partial_u)^k g_\alpha \in L^2(\mathbb{R} \times S^2)$, for $k = 1, 2, 3$, due to (33) and (34). □

Acknowledgements

We thank I. M. Sigal for numerous stimulating discussions on the subject matter of this paper. M.M. is grateful to him for some ideas leading to an early version of Theorem 3.2.

References

[ABG] Amrein, W., Boutet de Monvel, A. and Georgescu, V.: C0-Groups, Commutator Methods and Spectral Theory of N-body Hamiltonians, Birkhäuser, Basel, 1996.

[Ara] Araki, H.: Some properties of modular conjugation operator of von Neumann algebras and a noncommutative Radon–Nikodym theorem with a chain rule, Pacific J. Math. 50(2) (1974), 309–354.

[AW] Araki, H. and Woods, E.: Representations of the canonical commutation relations describing a nonrelativistic infinite free Bose gas, J. Math. Phys. 4 (1963), 637–662.

[BFS] Bach, V., Fröhlich, J. and Sigal, I. M.: Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137(2) (1995), 299–395.

[BFSS] Bach, V., Fröhlich, J., Sigal, I. M. and Soffer, A.: Positive commutators and the spectrum of Pauli–Fierz Hamiltonians of atoms and molecules, Comm. Math. Phys. 207(3) (1999), 557–587.

[BRI, II] Bratteli, O. and Robinson, D.: Operator Algebras and Quantum Statistical Mechanics I, II, 2nd edn, Texts Monogr. Phys., Springer, Berlin, 1987.

[Con] Connes, A.: Caractérisation des algèbres de von Neumann comme espaces vectoriels ordonnés, Ann. Inst. Fourier (Grenoble) 24 (1974), 121–155.

[DJ] Dereziński, J. and Jakšić, V.: Spectral theory of Pauli–Fierz operators, J. Funct. Anal. 180 (2001), 243–327.

[DJP] Dereziński, J., Jakšić, V. and Pillet, C.-A.: Perturbation theory for W∗-dynamics, Liouvilleans and KMS-states, Preprint.

[Frö] Fröhlich, J.: Application of commutator theorems to the integration of representations of Lie algebras and commutation relations, Comm. Math. Phys. 54 (1977), 135–150.

[FGS] Fröhlich, J., Griesemer, M. and Schlein, B.: Asymptotic completeness for Rayleigh scattering, Ann. Inst. H. Poincaré 3 (2002), 107–170.

[GG] Georgescu, V. and Gérard, C.: On the virial theorem in quantum mechanics, Comm. Math. Phys. 208 (1999), 275–281.

[Haa] Haagerup, U.: The standard form of von Neumann algebras, Math. Scand. 37 (1975), 271–283.

[JP] Jakšić, V. and Pillet, C.-A.: On a model for quantum friction III. Ergodic properties of the spin-boson system, Comm. Math. Phys. 178 (1996), 627–651.

[Mer] Merkli, M.: Positive commutators in nonequilibrium quantum statistical mechanics, Comm. Math. Phys. 223 (2001), 327–362.

[Mou] Mourre, E.: Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 91 (1981), 391–408.

[Ski] Skibsted, E.: Spectral analysis of N-body systems coupled to a bosonic field, Rev. Math. Phys. 10(7) (1998), 989–1026.


Mathematical Physics, Analysis and Geometry 7: 289–308, 2004.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

A New Integrable Hierarchy, Parametric Solutions and Traveling Wave Solutions

ZHIJUN QIAO¹,² and SHENGTAI LI²

¹Department of Mathematics, University of Texas–Pan American, Edinburg, TX 78539, USA
²Los Alamos National Laboratory, Los Alamos, NM 87545, USA

(Received: 18 March 2003; in final form: 29 August 2003)

Abstract. In this paper we give a new integrable hierarchy. In the hierarchy there are the following representatives:

\[ u_t = \partial_x^5 u^{-2/3}, \]

\[ u_t = \partial_x^5\, \frac{(u^{-1/3})_{xx} - 2\big((u^{-1/6})_x\big)^2}{u}, \]

\[ u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0. \]

The first two are the positive members of the hierarchy, and the first equation was a reduction of an integrable (2 + 1)-dimensional system (see B. G. Konopelchenko and V. G. Dubrovsky, Phys. Lett. A 102 (1984), 15–17). The third one is the first negative member. All nonlinear equations in the hierarchy are shown to have 3 × 3 Lax pairs through solving a key 3 × 3 matrix equation, and therefore they are integrable. Under a constraint between the potential function and eigenfunctions, the 3 × 3 Lax pair and its adjoint representation are nonlinearized to be two Liouville-integrable Hamiltonian systems. On the basis of the integrability of 6N-dimensional systems we give the parametric solution of all positive members in the hierarchy. In particular, we obtain the parametric solution of the equation $u_t = \partial_x^5 u^{-2/3}$. Finally, we present the traveling wave solutions (TWSs) of the above three representative equations. The TWSs of the first two equations have singularities, but the TWS of the 3rd one is continuous. The parametric solution of the 5th-order equation $u_t = \partial_x^5 u^{-2/3}$ cannot contain its singular TWS. We also analyse Gaussian initial solutions for the equations $u_t = \partial_x^5 u^{-2/3}$ and $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$. Both of them are stable.

Mathematics Subject Classifications (2000): 37K10, 58F07, 35Q35.

Key words: Hamiltonian system, matrix equation, zero curvature representation, parametric solution, traveling wave solution.

1. Introduction

The inverse scattering transformation (IST) method plays a very important role in solving integrable nonlinear evolution equations (NLEEs) [17]. These NLEEs include the well-known KdV equation [22], which is related to a 2nd-order operator (i.e. Hill operator) problem [23, 25], the remarkable Ablowitz–Kaup–Newell–Segur (AKNS) equations [1, 2], which are associated with the Zakharov–Shabat (ZS) spectral problem [33], and other higher-dimensional integrable equations.


In the theory of integrable systems, it is significant to find new integrable evolution equations. Kaup [19] studied the inverse scattering problem for cubic eigenvalue equations of the form $\psi_{xxx} + 6Q\psi_x + 6R\psi = \lambda\psi$, and showed that the 5th-order partial differential equation (PDE) $Q_t + Q_{xxxxx} + 30\big(Q_{xxx}Q + (5/2)Q_{xx}Q_x\big) + 180Q_xQ^2 = 0$ (called the KK equation) is integrable. Afterwards, Kupershmidt [21] constructed a super-KdV equation and established its integrability by giving its bi-Hamiltonian property and Lax form. Recently, Degasperis and Procesi [12] proposed a new integrable equation, $m_t + um_x + 3mu_x = 0$, $m = u - u_{xx}$, called the DP equation, which has a peaked soliton solution.

The DP equation is actually the member with $b = 3$ of the family $m_t + um_x + bmu_x = 0$, $m = u - u_{xx}$, $b$ = constant. It has already been proven that only $b = 2, 3$ are integrable cases [26]. With $b = 2$, the family gives the equation $m_t + um_x + 2mu_x = 0$, which was first derived in Camassa and Holm [8] (1993) by using asymptotic expansions for Euler's equations governing inviscid incompressible flow in the shallow water regime. It was thereby shown to be bi-Hamiltonian and integrable and to have a peaked soliton solution. Its billiard solutions, piecewise smooth solutions and algebro-geometric solutions were successively treated in Alber et al. [3–6] (1994, 1995, 1999, 2001), Constantin and McKean [10] (1999) and in Qiao [29] (2003). Before Camassa and Holm [8] (1993), families of integrable equations similar to the shallow water equation were known to be derivable in the general context of hereditary symmetries in Fokas and Fuchssteiner [16] (1981). However, this equation was not written explicitly, nor was it derived physically as a water wave equation, and its solution properties were not studied before Camassa and Holm [8] (1993). See Fuchssteiner [15] (1996) for an insightful history of how the shallow water equation is associated with hereditary symmetries and symplectic structures.

The DP equation (i.e. the equation with $b = 3$) was proven integrable, associated with a 3rd-order spectral problem [11], $\psi_{xxx} = \psi_x - \lambda m\psi$, and related to the canonical Hamiltonian system under a new nonlinear Poisson bracket (called the Peakon bracket) [18]. In 2002, we extended the DP equation to an integrable hierarchy and dealt with its parametric solution and stationary solutions [28].

In this paper, we propose a new integrable hierarchy. In particular, the following three representatives in the hierarchy

\[ u_t = \partial_x^5 u^{-2/3}, \tag{1} \]

\[ u_t = \partial_x^5\, \frac{(u^{-1/3})_{xx} - 2\big((u^{-1/6})_x\big)^2}{u}, \tag{2} \]

\[ u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0, \tag{3} \]

are shown to have bi-Hamiltonian operator structure and to be integrable. The first two are the positive members of the hierarchy, while the third one is the first negative member of the hierarchy. Konopelchenko and Dubrovsky [20] pointed out that Equation (1) is a reduction of a (2 + 1)-dimensional equation. Here we will deal


with its spectral problem and the parametric representation of its solution from the constraint point of view. All nonlinear equations in the hierarchy are shown to have 3 × 3 Lax pairs through solving a key 3 × 3 matrix equation, and therefore they are integrable. Under a constraint between the potential function and the eigenfunctions, the 3 × 3 Lax pair and its adjoint representation are nonlinearized to be two Liouville-integrable Hamiltonian systems. On the basis of the integrability of the 6N-dimensional systems we give the parametric solution of all positive members in the hierarchy. In particular, we obtain the parametric solution of the equation $u_t = \partial_x^5 u^{-2/3}$. Furthermore, we obtain the traveling wave solutions (TWSs) for Equations (1), (2), and (3). The first two look like a class of cusp soliton solutions (but are not cusp solitons [32]). The TWSs of Equations (1) and (2) have singularities, but the TWS of Equation (3) is continuous. Additionally, the parametric solution of the 5th-order Equation (1) cannot include its singular TWS. Equation (3) has a compacton-like and a parabolic cylinder solution. We also analyse the initial Gaussian solutions for the equations $u_t = \partial_x^5 u^{-2/3}$ and $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$. Both of them are stable (see Figures 1 and 2).

The whole paper is organized as follows. In the next section we describe how to connect the above three equations to a spectral problem and how to cast them into a new hierarchy of NLEEs, and also give the bi-Hamiltonian operators for the whole hierarchy. In Section 3, we construct the zero curvature representations for the new hierarchy through solving a key 3 × 3 matrix equation. In particular, we obtain the Lax pairs of Equations (1), (2), (3), and therefore they are integrable. In Section 4, we show that the 3rd-order spectral problem and its adjoint representation related to the above three equations are nonlinearized as a completely integrable Hamiltonian system under a constraint in $\mathbb{R}^{6N}$. In Section 5 we present the parametric solution for the positive hierarchy of NLEEs. We particularly get the parametric solution of Equation (1). Moreover, in Section 6 we obtain the traveling wave solutions for Equations (1), (2), and (3), and also analyse the initial Gaussian solutions for the equations $u_t = \partial_x^5 u^{-2/3}$ and $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$. Finally, in Section 7 we give some conclusions.

Figure 1. Stable solution for the equation $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$ under the Gaussian initial condition. A shock is developed during the time integration. This figure is very like the Burgers case $u_{xxt} + 3u_{xx}u_x + u_{xxx}u + \varepsilon u_{xxxx} = 0$ obtained by adding a small viscosity term $\varepsilon u_{xxxx}$ to the equation. For instance, when $\varepsilon = -0.01$, the equation $u_{xxt} + 3u_{xx}u_x + u_{xxx}u + \varepsilon u_{xxxx} = 0$ has Figure 2.

Figure 2. Stable solution for the equation $u_{xxt} + 3u_{xx}u_x + u_{xxx}u + \varepsilon u_{xxxx} = 0$, $\varepsilon = -0.01$, under the Gaussian initial condition. This figure is almost the same as Figure 1.

2. Spectral Problem, Hamiltonian Operators, and a New Hierarchy

Let us consider the following spectral problem

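\[ \psi_{xxx} = -\lambda u\psi \tag{4} \]

and its adjoint representation

\[ \psi^*_{xxx} = \lambda u\psi^*. \tag{5} \]

Then, we have their functional gradient $\delta\lambda/\delta u$ with respect to the potential $u$

\[ \frac{\delta\lambda}{\delta u} = \frac{\lambda\psi\psi^*}{E} \equiv \frac{\nabla\lambda}{E}, \tag{6} \]

where

\[ \nabla\lambda = \lambda\psi\psi^*, \qquad E = \int_\Omega u\psi\psi^*\,dx = \text{constant}, \tag{7} \]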

and $\Omega = (-\infty, \infty)$ or $\Omega = (0, T)$. In this procedure, we need the boundary condition that $u$ decays at infinity or that $u$ is periodic with period $T$. Usually, we compute the functional gradient $\delta\lambda/\delta u$ of the eigenvalue $\lambda$ with respect to the potential $u$ by using the method in [13, 9, 31]. Fokas and Anderson constructed


hereditary symmetries and Hamiltonian systems by using the isospectral eigenvalue problems [13]. Later, Cao developed the functional gradient procedure into the nonlinearization method [9], which closely connects finite-dimensional integrable systems to nonlinear integrable partial differential equations (also see details in [27]).

Taking derivatives five times on both sides of Equation (7), we find

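\[ (\nabla\lambda)_{xxxxx} = -3\lambda^2(2u\partial + \partial u)(\psi\psi^*_x - \psi^*\psi_x), \]

\[ (\psi\psi^*_x - \psi^*\psi_x)_{xxx} = (u\partial + 2\partial u)\nabla\lambda, \]

which directly lead to

\[ K\nabla\lambda = \lambda^2 J\nabla\lambda, \tag{8} \]

where

\[ K = \partial^5, \tag{9} \]

\[ J = -3(2u\partial + \partial u)\partial^{-3}(u\partial + 2\partial u). \tag{10} \]

Obviously, $K$, $J$ are antisymmetric, and both of them are Hamiltonian operators because they satisfy the Jacobi identity.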

Now, according to this pair of Hamiltonian operators, we define the hierarchy of nonlinear evolution equations associated with the spectral problems (4) and (5). Let $G_0 \in \operatorname{Ker} J = \{G \in C^\infty(\mathbb{R}) \mid JG = 0\}$ and $G_{-1} \in \operatorname{Ker} K = \{G \in C^\infty(\mathbb{R}) \mid KG = 0\}$. We define the Lenard sequence

\[ G_j = \begin{cases} L^{j}G_0, & j \ge 0,\\ L^{j+1}G_{-1}, & j < 0, \end{cases} \qquad j \in \mathbb{Z}, \tag{11} \]

where $L = J^{-1}K$ is called the recursion operator. Therefore we obtain a new hierarchy of nonlinear evolution equations:

\[ u_{t_k} = JG_k, \quad \forall k \in \mathbb{Z}. \tag{12} \]

Apparently, this hierarchy includes the positive members ($k \ge 0$) and the negative members ($k < 0$), and possesses the bi-Hamiltonian structure because of the Hamiltonian properties of $K$, $J$.

Let us now give specific equations in the hierarchy (12).

• Choosing $G_{-1} = 1/6 \in \operatorname{Ker} K$ yields the first negative member of the hierarchy:

\[ u_t + vu_x + 3v_xu = 0, \quad u = v_{xx}. \tag{13} \]

This equation is actually $v_{xxt} + 3v_{xx}v_x + v_{xxx}v = 0$, which is equivalent to $\partial^2(v_t + vv_x) = 0$. It has the compacton-like solution [30]. Obviously, $v = c_1x + c_0$ ($c_1$, $c_0$ are two constants) is a special solution of this equation.


• Choosing $G_0 = u^{-2/3} \in \operatorname{Ker} J$ leads to the second positive member of the hierarchy:

\[ u_t = \partial_x^5 u^{-2/3}. \tag{14} \]

Konopelchenko and Dubrovsky pointed out that this equation is integrable and is a reduction of a (2 + 1)-dimensional equation [20]. But they did not study solutions of the equation. In the following, we study the relation between the equation and a finite-dimensional integrable system, and will find that it has a parametric solution as well as a traveling wave solution which looks like a cusp.

• Choosing another element $G_0 = \big((u^{-1/3})_{xx} - 2((u^{-1/6})_x)^2\big)/u$ in the kernel $\operatorname{Ker} J$ gives the following positive member of the hierarchy:

\[ u_t = \partial_x^5\, \frac{(u^{-1/3})_{xx} - 2\big((u^{-1/6})_x\big)^2}{u}. \tag{15} \]

This equation also has a cusp-like traveling wave solution; see Section 6.

Of course, we may generate further nonlinear equations by selecting other elements from the kernels of $J$, $K$. In the following, we will see that all equations in the hierarchy (12) are integrable. In particular, the above three Equations (13), (14), (15) are integrable.
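The three kernel choices above can be checked by hand. The following worked computation is only a sketch: it assumes that every integration constant produced by $\partial^{-1}$ is set to zero, it writes $u = v_{xx}$ in the first case, and it introduces the shorthand $w = u^{-1/3}$ (not used in the text) in the third. It uses nothing beyond the definition (10) of $J$, whose right factor $u\partial + 2\partial u$ acts as $G \mapsto 3uG_x + 2u_xG$ and whose left factor $2u\partial + \partial u$ acts as $G \mapsto 3uG_x + u_xG$.

```latex
% (i) G_{-1} = 1/6, with u = v_{xx}:
(u\partial + 2\partial u)\tfrac16 = \tfrac13 u_x = \tfrac13 v_{xxx},
\qquad
\partial^{-3}\big(\tfrac13 v_{xxx}\big) = \tfrac13 v,
\\
J\,\tfrac16 = -3\,(2u\partial + \partial u)\,\tfrac13 v
            = -\big(2uv_x + (uv)_x\big)
            = -(v u_x + 3 v_x u),
\quad\text{so } u_t = JG_{-1} \text{ is Equation (13).}
\\[1ex]
% (ii) G_0 = u^{-2/3}:
(u\partial + 2\partial u)\,u^{-2/3}
   = 3u\cdot\big(-\tfrac23\big)u^{-5/3}u_x + 2u_x u^{-2/3} = 0 .
\\[1ex]
% (iii) G_0 = ((u^{-1/3})_{xx} - 2((u^{-1/6})_x)^2)/u, with w = u^{-1/3}, u = w^{-3}:
G_0 = w^3 w_{xx} - \tfrac12 w^2 w_x^2,
\qquad
(u\partial + 2\partial u)\,G_0 = 3uG_{0,x} + 2u_xG_0 = 3 w_{xxx},
\\
\partial^{-3}(3w_{xxx}) = 3w,
\qquad
(2u\partial + \partial u)\,w = 3u w_x + u_x w = 3w^{-3}w_x - 3w^{-4}w_x\,w = 0 .
```

In cases (ii) and (iii) the result is $JG_0 = 0$, which is the kernel condition required of $G_0$ in the Lenard sequence (11).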

3. Zero Curvature Representations

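Letting $\psi = \psi_1$, we change Equation (4) to the following 3 × 3 matrix spectral problem

\[ \Phi_x = U(u, \lambda)\Phi, \tag{16} \]

\[ U(u, \lambda) = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ -\lambda u & 0 & 0 \end{pmatrix}, \qquad \Phi = \begin{pmatrix} \psi_1\\ \psi_2\\ \psi_3 \end{pmatrix}. \tag{17} \]

Apparently, the Gateaux derivative matrix $U_*(\xi)$ of the spectral matrix $U$ in the direction $\xi \in C^\infty(\mathbb{R})$ at the point $u$ is

\[ U_*(\xi) \triangleq \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} U(u + \varepsilon\xi) = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ -\lambda\xi & 0 & 0 \end{pmatrix}, \tag{18} \]

which is obviously an injective homomorphism, i.e. $U_*(\xi) = 0 \Leftrightarrow \xi = 0$.

For any given $C^\infty$-function $G$, we construct the following 3 × 3 matrix equation with respect to $V = V(G)$

\[ V_x - [U, V] = U_*(KG - \lambda^2 JG). \tag{19} \]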


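THEOREM 1. For spectral problem (16) and an arbitrary $C^\infty$-function $G$, the matrix equation (19) has the following solution

\[ V = \lambda \begin{pmatrix}
-G'' - 3\lambda\partial^{-2}\Upsilon G & 3(G' + \lambda\partial^{-3}\Upsilon G) & -6G\\
-G''' - 3\lambda\partial^{-1}uG' & 2G'' & 3(-G' + \lambda\partial^{-3}\Upsilon G)\\
-G'''' - 3\lambda^2u\partial^{-3}\Upsilon G & G''' - 3\lambda\partial^{-1}uG' & -G'' + 3\lambda\partial^{-2}\Upsilon G
\end{pmatrix}, \tag{20} \]

where $\partial = \partial_x = \partial/\partial x$, $\Upsilon = u\partial + 2\partial u$, and the superscript ′ means the derivative in $x$. Therefore, $J = -3\Upsilon^*\partial^{-3}\Upsilon$ ($\Upsilon^*$ is the conjugate of $\Upsilon$).

Proof. Let us set

\[ V = \begin{pmatrix} V_{11} & V_{12} & V_{13}\\ V_{21} & V_{22} & V_{23}\\ V_{31} & V_{32} & V_{33} \end{pmatrix}, \]

and substitute it into Equation (19), which is an overdetermined equation. Using the calculation techniques in [27], we obtain the following results:

\[ V_{11} = -\lambda G'' - 3\lambda^2\partial^{-2}\Upsilon G, \qquad V_{12} = 3(\lambda G' + \lambda^2\partial^{-3}\Upsilon G), \qquad V_{13} = -6\lambda G, \]

\[ V_{21} = -\lambda G''' - 3\lambda^2\partial^{-1}uG', \qquad V_{22} = 2\lambda G'', \qquad V_{23} = 3\lambda(-G' + \lambda\partial^{-3}\Upsilon G), \]

\[ V_{31} = -\lambda G'''' - 3\lambda^3u\partial^{-3}\Upsilon G, \qquad V_{32} = \lambda G''' - 3\lambda^2\partial^{-1}uG', \qquad V_{33} = -\lambda G'' + 3\lambda^2\partial^{-2}\Upsilon G, \]

which completes the proof. □

THEOREM 2. Let $G_0 \in \operatorname{Ker} J$, $G_{-1} \in \operatorname{Ker} K$, and let each $G_j$ be given through Equation (11). Then,

1. each new vector field $X_k = JG_k$, $k \in \mathbb{Z}$, satisfies the following commutator representation

\[ V_{k,x} - [U, V_k] = U_*(X_k), \quad \forall k \in \mathbb{Z}; \tag{21} \]

2. the new hierarchy (12), i.e.

\[ u_{t_k} = X_k = JG_k, \quad \forall k \in \mathbb{Z}, \tag{22} \]

possesses the zero curvature representation

\[ U_{t_k} - V_{k,x} + [U, V_k] = 0, \quad \forall k \in \mathbb{Z}, \tag{23} \]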


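where

\[ V_k = \sum V(G_j)\lambda^{2(k-j-1)}, \qquad \sum = \begin{cases} \sum_{j=0}^{k-1}, & k > 0,\\ 0, & k = 0,\\ -\sum_{j=k}^{-1}, & k < 0, \end{cases} \tag{24} \]

and $V(G_j)$ is given by Equation (20) with $G = G_j$.

Proof. 1. For $k = 0$, it is obvious. For $k < 0$, we use $JG_j = KG_{j-1}$, which follows from (11), and $KG_{-1} = 0$:

\[
\begin{aligned}
V_{k,x} - [U, V_k] &= -\sum_{j=k}^{-1}\big(V_x(G_j) - [U, V(G_j)]\big)\lambda^{2(k-j-1)}\\
&= -\sum_{j=k}^{-1} U_*\big(KG_j - \lambda^2KG_{j-1}\big)\lambda^{2(k-j-1)}\\
&= U_*\Big(\sum_{j=k}^{-1}\big(KG_{j-1}\lambda^{2(k-j)} - KG_j\lambda^{2(k-j-1)}\big)\Big)\\
&= U_*\big(KG_{k-1} - KG_{-1}\lambda^{2k}\big)\\
&= U_*(KG_{k-1})\\
&= U_*(X_k).
\end{aligned}
\]

The case $k > 0$ is proved similarly.

2. Noticing $U_{t_k} = U_*(u_{t_k})$, we obtain

\[ U_{t_k} - V_{k,x} + [U, V_k] = U_*(u_{t_k} - X_k). \]

The injectiveness of $U_*$ implies that the second result holds. □

From Theorem 2, we immediately obtain the following corollary.

COROLLARY 1. The new hierarchy (12) has the Lax pair:

\[ \psi_{xxx} = -\lambda u\psi, \tag{25} \]

\[ \psi_{t_k} = \sum \lambda^{2(k-j)-1}\big[-6G_j\psi_{xx} + 3(G_j' + \lambda\partial^{-3}\Upsilon G_j)\psi_x - (G_j'' + 3\lambda\partial^{-2}\Upsilon G_j)\psi\big], \tag{26} \]

where all symbols are the same as in Theorem 2 and Theorem 1.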

So, all equations in the hierarchy (12) have Lax pairs and are therefore integrable. In particular, we have the following specific examples.

• When we choose $G_{-1} = 1/6$, Equation (13) has the following Lax pair:

\[ \Phi_x = U(u, \lambda)\Phi, \tag{27} \]

\[ \Phi_t = V(u, \lambda)\Phi, \tag{28} \]

where $u = v_{xx}$, $U(u, \lambda)$ is defined by Equation (17), and $V(u, \lambda)$ is given by

\[ V(u, \lambda) = \begin{pmatrix} v_x & -v & \lambda^{-1}\\ 0 & 0 & -v\\ \lambda vu & 0 & -v_x \end{pmatrix}. \tag{29} \]

Apparently, the Lax pair (27) and (28) is equivalent to

\[ \psi_{xxx} = -\lambda u\psi, \tag{30} \]

\[ \psi_t = \lambda^{-1}\psi_{xx} - v\psi_x + v_x\psi. \tag{31} \]

• In a similar way, choosing $G_0 = u^{-2/3}$ gives the Lax pair of Equation (14), i.e. $u_t = (u^{-2/3})_{xxxxx}$:

\[ \psi_{xxx} = -\lambda u\psi, \tag{32} \]

\[ \psi_t = -6\lambda u^{-2/3}\psi_{xx} + 3\lambda(u^{-2/3})_x\psi_x - \lambda(u^{-2/3})_{xx}\psi. \tag{33} \]

This Lax pair is different from, and inequivalent to, the result in [20].

• Furthermore, through choosing $G_0 = \big((u^{-1/3})_{xx} - 2((u^{-1/6})_x)^2\big)/u$, we find that the new equation (2) has the Lax pair:

\[ \psi_{xxx} = -\lambda u\psi, \tag{34} \]

\[ \psi_t = -6\lambda G_0\psi_{xx} + 3\lambda\big(G_0' + 3\lambda u^{-1/3}\big)\psi_x - \lambda\big(G_0'' + 9\lambda(u^{-1/3})_x\big)\psi. \tag{35} \]
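As a consistency check on the first of these examples, the compatibility condition $(\psi_{xxx})_t = (\psi_t)_{xxx}$ of the scalar Lax pair (30)–(31) can be worked out directly. The computation below is a sketch that assumes $u = v_{xx}$ and uses $\psi_{xxx} = -\lambda u\psi$ repeatedly to eliminate third and higher $x$-derivatives of $\psi$.

```latex
% Right-hand side: three x-derivatives of (31)
(\psi_t)_{xxx} = \lambda^{-1}\psi_{xxxxx} + (v_x\psi)_{xxx} - (v\psi_x)_{xxx},
\\
\lambda^{-1}\psi_{xxxxx} = -(u\psi)_{xx} = -u_{xx}\psi - 2u_x\psi_x - u\psi_{xx},
\\
(v_x\psi)_{xxx} - (v\psi_x)_{xxx}
   = v_{xxxx}\psi + 2v_{xxx}\psi_x - 2v_x\psi_{xxx} - v\psi_{xxxx}
   = u_{xx}\psi + 2u_x\psi_x + 2\lambda u v_x\psi + \lambda v (u\psi)_x,
\\
\Rightarrow\quad
(\psi_t)_{xxx} = -u\psi_{xx} + 3\lambda u v_x\psi - \lambda u v_x \psi + \lambda v u_x\psi + \lambda u v\psi_x .
\\[1ex]
% Left-hand side: t-derivative of (30), with (31) substituted for \psi_t
(\psi_{xxx})_t = -\lambda u_t\psi - \lambda u\psi_t
             = -\lambda u_t\psi - u\psi_{xx} + \lambda u v\psi_x - \lambda u v_x\psi .
\\[1ex]
% Equating the two sides and cancelling common terms
-\lambda u_t \psi = \lambda\,(v u_x + 3 v_x u)\,\psi
\quad\Longleftrightarrow\quad
u_t + v u_x + 3 v_x u = 0 ,
```

which is Equation (13), as expected.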

4. 6N-dimensional Integrable System

To discuss solutions of the hierarchy (12), we use the constraint method [9, 27], which connects finite-dimensional integrable systems to nonlinear integrable partial differential equations. Because Equation (4)/(16) is a 3rd-order eigenvalue problem, we have to investigate it together with its adjoint problem when we adopt the nonlinearization procedure. Ma and Strampp [24] already studied the AKNS problem and its adjoint problem, a 2 × 2 case, by using the so-called symmetry constraint method. Now, we are dealing with the 3 × 3 spectral problem (16) related to the hierarchy (12).

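Let us go back to spectral problem (16) and consider its adjoint problem

\[ \Phi^*_x = \begin{pmatrix} 0 & 0 & u\lambda\\ -1 & 0 & 0\\ 0 & -1 & 0 \end{pmatrix}\Phi^*, \qquad \Phi^* = \begin{pmatrix} \psi_1^*\\ \psi_2^*\\ \psi_3^* \end{pmatrix}, \tag{36} \]

where $\psi^* = \psi_3^*$.

Let $\lambda_j$ ($j = 1, \ldots, N$) be $N$ distinct spectral values of (16) and (36), and let $q_{1j}$, $q_{2j}$, $q_{3j}$ and $p_{1j}$, $p_{2j}$, $p_{3j}$ be the corresponding spectral functions, respectively. Then we have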


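\[ q_{1x} = q_2, \qquad q_{2x} = q_3, \qquad q_{3x} = -u\Lambda q_1, \tag{37} \]

and

\[ p_{1x} = u\Lambda p_3, \qquad p_{2x} = -p_1, \qquad p_{3x} = -p_2, \tag{38} \]

where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_N)$, $q_k = (q_{k1}, q_{k2}, \ldots, q_{kN})^T$, $p_k = (p_{k1}, p_{k2}, \ldots, p_{kN})^T$, $k = 1, 2, 3$.

Let us consider the above two systems in the symplectic space $(\mathbb{R}^{6N}, dp \wedge dq)$, and introduce the following constraint:

\[ u^{-2/3} = \sum_{j=1}^{N}\nabla\lambda_j, \tag{39} \]

where $\nabla\lambda_j = \lambda_jq_{1j}p_{3j}$ is the functional gradient of $\lambda_j$ for spectral problems (16) and (36). Then Equation (39) reads

\[ u = \langle\Lambda q_1, p_3\rangle^{-3/2}. \tag{40} \]

Under this constraint, Equation (37) and its adjoint problem (38) are cast in a canonical Hamiltonian form in $\mathbb{R}^{6N}$:

\[ q_x = \{q, H^+\}, \qquad p_x = \{p, H^+\}, \tag{41} \]

with the Hamiltonian

\[ H^+ = \langle q_2, p_1\rangle + \langle q_3, p_2\rangle + \frac{2}{\sqrt{\langle\Lambda q_1, p_3\rangle}}, \tag{42} \]

where $p = (p_1, p_2, p_3)^T$, $q = (q_1, q_2, q_3)^T \in \mathbb{R}^{6N}$, $\langle\cdot\,,\cdot\rangle$ stands for the standard inner product in $\mathbb{R}^N$, and $\{\cdot\,,\cdot\}$ represents the Poisson bracket of two functions $F_1$, $F_2$ defined by

\[ \{F_1, F_2\} = \sum_{i=1}^{3}\left(\left\langle\frac{\partial F_1}{\partial q_i}, \frac{\partial F_2}{\partial p_i}\right\rangle - \left\langle\frac{\partial F_1}{\partial p_i}, \frac{\partial F_2}{\partial q_i}\right\rangle\right), \tag{43} \]

which is antisymmetric and bilinear and satisfies the Jacobi identity.

To see the integrability of system (41), we take into account the time part $\Phi_t = V_k\Phi$ and its adjoint problem $\Phi^*_t = -V_k^T\Phi^*$, where $V_k$ is defined by $V_k = \sum_{j=0}^{k-1}V(G_j)\lambda^{2(k-j-1)}$, and $V(G_j)$ is given by Equation (20) with $G = G_j$.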


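Let us first look at the $V_1$ case. Then the corresponding time part is:

\[ \Phi_t = \lambda\begin{pmatrix}
-(u^{-2/3})_{xx} & 3(u^{-2/3})_x & -6u^{-2/3}\\
-(u^{-2/3})_{xxx} + 6\lambda u^{1/3} & 2(u^{-2/3})_{xx} & -3(u^{-2/3})_x\\
-(u^{-2/3})_{xxxx} & (u^{-2/3})_{xxx} + 6\lambda u^{1/3} & -(u^{-2/3})_{xx}
\end{pmatrix}\Phi, \tag{44} \]

and its adjoint problem $\Phi^*_t = -V_1^T\Phi^*$ is:

\[ \Phi^*_t = \lambda\begin{pmatrix}
(u^{-2/3})_{xx} & (u^{-2/3})_{xxx} - 6\lambda u^{1/3} & (u^{-2/3})_{xxxx}\\
-3(u^{-2/3})_x & -2(u^{-2/3})_{xx} & -(u^{-2/3})_{xxx} - 6\lambda u^{1/3}\\
6u^{-2/3} & 3(u^{-2/3})_x & (u^{-2/3})_{xx}
\end{pmatrix}\Phi^*. \tag{45} \]

Noticing the following relations, which follow from (37), (38) and (40),

\[
\begin{aligned}
u^{1/3} &= \langle\Lambda q_1, p_3\rangle^{-1/2},\\
(u^{-2/3})_x &= \langle\Lambda q_2, p_3\rangle - \langle\Lambda q_1, p_2\rangle,\\
(u^{-2/3})_{xx} &= \langle\Lambda q_3, p_3\rangle + \langle\Lambda q_1, p_1\rangle - 2\langle\Lambda q_2, p_2\rangle,\\
(u^{-2/3})_{xxx} &= 3\big(\langle\Lambda q_2, p_1\rangle - \langle\Lambda q_3, p_2\rangle\big),\\
(u^{-2/3})_{xxxx} &= 6\langle\Lambda q_3, p_1\rangle + 3\langle\Lambda q_1, p_3\rangle^{-3/2}\big(\langle\Lambda^2q_1, p_2\rangle + \langle\Lambda^2q_2, p_3\rangle\big),
\end{aligned}
\]

we obtain nonlinearized systems of the time parts (44) and (45), and furthermore cast them into the following canonical Hamiltonian system in $\mathbb{R}^{6N}$:

\[ q_{t_1} = \{q, F_1^+\}, \qquad p_{t_1} = \{p, F_1^+\}, \tag{46} \]

with the Hamiltonian

\[
\begin{aligned}
F_1^+ = {} & -\tfrac12\big(\langle\Lambda q_1, p_1\rangle + \langle\Lambda q_3, p_3\rangle\big)^2\\
& + 2\langle\Lambda q_2, p_2\rangle\big(\langle\Lambda q_1, p_1\rangle + \langle\Lambda q_3, p_3\rangle - \langle\Lambda q_2, p_2\rangle\big)\\
& + 3\big(\langle\Lambda q_2, p_3\rangle - \langle\Lambda q_1, p_2\rangle\big)\big(\langle\Lambda q_2, p_1\rangle - \langle\Lambda q_3, p_2\rangle\big)\\
& - 6\langle\Lambda q_1, p_3\rangle\langle\Lambda q_3, p_1\rangle\\
& + \frac{6}{\sqrt{\langle\Lambda q_1, p_3\rangle}}\big(\langle\Lambda^2q_1, p_2\rangle + \langle\Lambda^2q_2, p_3\rangle\big).
\end{aligned} \tag{47}
\]

A direct computation leads to the following theorem.

THEOREM 3.

\[ \{H^+, F_1^+\} = 0, \tag{48} \]

that is, the two Hamiltonian flows commute in $\mathbb{R}^{6N}$.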
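Theorem 3 can be spot-checked numerically. The sketch below is an illustration, not the authors' computation. It assumes N = 2, hypothetical spectral values (1, 2) and an arbitrary phase-space point; it reads the inner products in (42) and (47) with Λ = diag(λ_1, ..., λ_N) placed as ⟨Λq_i, p_j⟩ (and ⟨Λ²q_i, p_j⟩ in the last term of (47)); and it approximates the gradients in the Poisson bracket (43) by central differences, so the bracket is only expected to vanish up to discretization and rounding error.

```python
from math import sqrt

LAM = [1.0, 2.0]  # hypothetical spectral values lambda_1, lambda_2 (assumption)
N = len(LAM)

def ip(a, b, power=1):
    """Weighted inner product <Lambda^power a, b> in R^N."""
    return sum((LAM[j] ** power) * a[j] * b[j] for j in range(N))

def split(z):
    """Unpack z = (q1, q2, q3, p1, p2, p3); each block has length N."""
    return [z[i * N:(i + 1) * N] for i in range(6)]

def h_plus(z):
    """Hamiltonian H+ of (42)."""
    q1, q2, q3, p1, p2, p3 = split(z)
    return ip(q2, p1, 0) + ip(q3, p2, 0) + 2.0 / sqrt(ip(q1, p3))

def f1_plus(z):
    """Hamiltonian F1+ of (47)."""
    q1, q2, q3, p1, p2, p3 = split(z)
    a = ip(q1, p1) + ip(q3, p3)
    b = ip(q2, p2)
    c = ip(q2, p3) - ip(q1, p2)
    d = ip(q2, p1) - ip(q3, p2)
    w = ip(q1, p3)
    s = ip(q1, p2, 2) + ip(q2, p3, 2)
    return -0.5 * a * a + 2 * b * (a - b) + 3 * c * d - 6 * w * ip(q3, p1) + 6 * s / sqrt(w)

def grad(f, z, h=1e-4):
    """Central-difference gradient of f at z."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((f(zp) - f(zm)) / (2 * h))
    return g

def poisson(f, g, z):
    """Poisson bracket (43): sum over i of <f_q, g_p> - <f_p, g_q>."""
    df, dg = grad(f, z), grad(g, z)
    half = 3 * N  # q-coordinates come first, then p-coordinates
    return sum(df[i] * dg[half + i] - df[half + i] * dg[i] for i in range(half))

# Arbitrary test point with <Lambda q1, p3> > 0 so that the square root is real.
Z0 = [0.7, 1.1, 0.3, -0.4, 0.5, 0.2, -0.6, 0.9, 0.8, 0.1, 1.3, 0.6]

if __name__ == "__main__":
    print("{H+, F1+} ~", poisson(h_plus, f1_plus, Z0))
```

For comparison, the bracket of $H^+$ with a function that is not conserved along the x-flow, such as $\langle\Lambda q_1, p_1\rangle$, is clearly nonzero at the same point, so the near-zero value above is not an artifact of the test point.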

For the general case $V_k$, $k > 0$, $k \in \mathbb{Z}$, we consider the following Hamiltonian functions


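\[
\begin{aligned}
F_k^+ = {} & -\frac12\sum_{j=0}^{k-1}\big(\langle\Lambda^{2j+1}q_1, p_1\rangle + \langle\Lambda^{2j+1}q_3, p_3\rangle\big)\big(\langle\Lambda^{2(k-j)-1}q_1, p_1\rangle + \langle\Lambda^{2(k-j)-1}q_3, p_3\rangle\big)\\
& + 2\sum_{j=0}^{k-1}\langle\Lambda^{2j+1}q_2, p_2\rangle\big(\langle\Lambda^{2(k-j)-1}q_1, p_1\rangle + \langle\Lambda^{2(k-j)-1}q_3, p_3\rangle - \langle\Lambda^{2(k-j)-1}q_2, p_2\rangle\big)\\
& + 3\sum_{j=0}^{k-1}\big(\langle\Lambda^{2j+1}q_2, p_3\rangle - \langle\Lambda^{2j+1}q_1, p_2\rangle\big)\big(\langle\Lambda^{2(k-j)-1}q_2, p_1\rangle - \langle\Lambda^{2(k-j)-1}q_3, p_2\rangle\big)\\
& - 6\sum_{j=0}^{k-1}\langle\Lambda^{2j+1}q_1, p_3\rangle\langle\Lambda^{2(k-j)-1}q_3, p_1\rangle\\
& - \frac32\sum_{j=0}^{k}\big(\langle\Lambda^{2j}q_1, p_1\rangle - \langle\Lambda^{2j}q_3, p_3\rangle\big)\big(\langle\Lambda^{2(k-j)}q_1, p_1\rangle - \langle\Lambda^{2(k-j)}q_3, p_3\rangle\big)\\
& - 3\sum_{j=0}^{k}\big(\langle\Lambda^{2j}q_2, p_3\rangle - \langle\Lambda^{2j}q_1, p_2\rangle\big)\big(\langle\Lambda^{2(k-j)}q_2, p_1\rangle + \langle\Lambda^{2(k-j)}q_3, p_2\rangle\big)\\
& + 3H^+\big(\langle\Lambda^{2k}q_1, p_2\rangle + \langle\Lambda^{2k}q_2, p_3\rangle\big).
\end{aligned} \tag{49}
\]

Through a lengthy calculation, we find

\[ \{H^+, F_k^+\} = 0, \qquad \{F_l^+, F_k^+\} = 0, \qquad k, l = 1, 2, \ldots. \tag{50} \]

That is,

THEOREM 4. All canonical Hamiltonian flows $(F_k^+)$ commute with the Hamiltonian system (41). In particular, the Hamiltonian systems (41) and (46) are compatible and therefore integrable in the Liouville sense.

Remark 1. In the proof of this theorem, we have used the following two facts: $\langle q_1, p_2\rangle + \langle q_2, p_3\rangle = c_1$ and $\langle q_1, p_1\rangle - \langle q_3, p_3\rangle = c_2$. They always hold along the x-flow in $\mathbb{R}^{6N}$. Here $c_1$, $c_2$ are two constants.

Remark 2. In fact, the involutive functions $F_k^+$ are generated from the nonlinearization of the time part $\Phi_t = V_k\Phi$ and its adjoint problem $\Phi^*_t = -V_k^T\Phi^*$ under the constraint (39), where $V_k$ is defined by $V_k = \sum_{j=0}^{k-1}V(G_j)\lambda^{2(k-j-1)}$, and $V(G_j)$ is given by Equation (20) with $G = G_j$. In this calculation, we have used the following equalities: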


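\[
\begin{aligned}
G_j &= \langle\Lambda^{2j+1}q_1, p_3\rangle, \quad j = 0, 1, 2, \ldots,\\
G_j' &= \langle\Lambda^{2j+1}q_2, p_3\rangle - \langle\Lambda^{2j+1}q_1, p_2\rangle,\\
G_j'' &= \langle\Lambda^{2j+1}q_3, p_3\rangle + \langle\Lambda^{2j+1}q_1, p_1\rangle - 2\langle\Lambda^{2j+1}q_2, p_2\rangle,\\
G_j''' &= 3\big(\langle\Lambda^{2j+1}q_2, p_1\rangle - \langle\Lambda^{2j+1}q_3, p_2\rangle\big),\\
G_j'''' &= 6\langle\Lambda^{2j+1}q_3, p_1\rangle + 3\langle\Lambda q_1, p_3\rangle^{-3/2}\big(\langle\Lambda^{2j+2}q_1, p_2\rangle + \langle\Lambda^{2j+2}q_2, p_3\rangle\big),\\
\partial^{-1}uG_j' &= \langle\Lambda^{2j}q_3, p_2\rangle + \langle\Lambda^{2j}q_2, p_1\rangle,\\
\partial^{-2}\Upsilon G_j &= \langle\Lambda^{2j}q_1, p_1\rangle - \langle\Lambda^{2j}q_3, p_3\rangle,\\
\partial^{-3}\Upsilon G_j &= -\big(\langle\Lambda^{2j}q_1, p_2\rangle + \langle\Lambda^{2j}q_2, p_3\rangle\big).
\end{aligned}
\]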

5. Parametric Solution

Since the Hamiltonian flows $(H^+)$ and $(F_k^+)$ are completely integrable in $\mathbb{R}^{6N}$ and their Poisson brackets satisfy $\{H^+, F_k^+\} = 0$ ($k = 1, 2, \ldots$), their phase flows $g^x_{H^+}$, $g^{t_k}_{F_k^+}$ commute [7]. Thus, we can define their compatible solution as follows:

\[ \begin{pmatrix} q(x, t_k)\\ p(x, t_k) \end{pmatrix} = g^x_{H^+}\,g^{t_k}_{F_k^+}\begin{pmatrix} q(x^0, t_k^0)\\ p(x^0, t_k^0) \end{pmatrix}, \quad k = 1, 2, \ldots, \tag{51} \]

where $x^0$, $t_k^0$ are the initial values of the phase flows $g^x_{H^+}$, $g^{t_k}_{F_k^+}$.

THEOREM 5. Let $q(x, t_k) = (q_1, q_2, q_3)^T$, $p(x, t_k) = (p_1, p_2, p_3)^T$ be a solution of the compatible Hamiltonian systems $(H^+)$ and $(F_k^+)$ in $\mathbb{R}^{6N}$. Then

\[ u = \frac{1}{\sqrt{\langle\Lambda q_1(x, t_k), p_3(x, t_k)\rangle^3}} \tag{52} \]

satisfies the positive equation of the hierarchy

\[ u_{t_k} = JL^k \cdot u^{-2/3}, \quad k = 1, 2, \ldots, \tag{53} \]

where the operators $L = J^{-1}K$, $J$, $K$ are given by Equations (10) and (9), respectively.

Proof. Direct computation completes the proof. □

In particular, we have the following result.

THEOREM 6. Let $p(x, t)$, $q(x, t)$ ($p(x, t) = (p_1, p_2, p_3)^T$, $q(x, t) = (q_1, q_2, q_3)^T$) be a common solution of the two integrable compatible flows (41) and (46). Then

\[ u = \frac{1}{\sqrt{\langle\Lambda q_1(x, t), p_3(x, t)\rangle^3}} \tag{54} \]


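satisfies the equation:

\[ u_t = \partial_x^5 u^{-2/3}. \tag{55} \]

Proof. By Equation (54), $u^{-2/3} = \langle\Lambda q_1, p_3\rangle$. Taking derivatives in $x$ five times and using (37) and (38), we obtain

\[ \partial_x^5 u^{-2/3} = 9u\big(\langle\Lambda^2q_3, p_3\rangle - \langle\Lambda^2q_1, p_1\rangle\big) + 3u_x\big(\langle\Lambda^2q_1, p_2\rangle + \langle\Lambda^2q_2, p_3\rangle\big), \tag{56} \]

where

\[ u_x = -\frac32\,u\,\frac{\langle\Lambda q_2, p_3\rangle - \langle\Lambda q_1, p_2\rangle}{\langle\Lambda q_1, p_3\rangle}. \]

On the other hand, taking the derivative in $t$ on both sides of Equation (54) yields

\[ u_t = -\frac32\,u\,\frac{\langle\Lambda p_3, q_{1t}\rangle + \langle\Lambda q_1, p_{3t}\rangle}{\langle\Lambda q_1, p_3\rangle} = -\frac32\,u\,\frac{\big\langle\Lambda p_3, \frac{\partial F_1^+}{\partial p_1}\big\rangle - \big\langle\Lambda q_1, \frac{\partial F_1^+}{\partial q_3}\big\rangle}{\langle\Lambda q_1, p_3\rangle}. \]

Substituting the expression of $F_1^+$ into the above equality and calculating, we find that the final result is the same as the right-hand side of Equation (56), which completes the proof. □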

6. Traveling Wave Solutions

First, let us compute the traveling wave solution of Equation (3). Set $u = f(\xi)$, $\xi = x - ct$ ($c$ is some constant speed); then after substituting this setting into Equation (3) we obtain

\[ -cf''' + 3f''f' + f'''f = 0, \]

i.e.

\[ (f^2 - 2cf)''' = 0. \]

Therefore,

\[ (f - c)^2 = A\xi^2 + B\xi + C, \quad \forall A, B, C \in \mathbb{R}. \tag{57} \]

So, the equation $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$ has the following traveling wave solution

\[ u(x, t) = c \pm \sqrt{A(x - ct)^2 + B(x - ct) + C}. \tag{58} \]
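The step from the reduced ordinary differential equation to (57) can be spelled out as follows. This is a short worked derivation; the intermediate constant $C_0$ is introduced here only for illustration.

```latex
(f^2 - 2cf)''' = 2(ff')'' - 2cf'''
             = 2\,(3f'f'' + ff''') - 2cf'''
             = 2\,\big(-cf''' + 3f''f' + f'''f\big) = 0,
\\
f^2 - 2cf = A\xi^2 + B\xi + C_0
\;\Longrightarrow\;
(f - c)^2 = A\xi^2 + B\xi + C, \qquad C = C_0 + c^2 .
```

Solving the last relation for $f$ on any interval where the quadratic is non-negative gives the two branches in (58).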

Let us discuss specific cases as follows:


• When $c = 0$, we get the stationary solution

\[ u(x) = \pm\sqrt{Ax^2 + Bx + C}, \quad \forall A, B, C \in \mathbb{R}, \tag{59} \]

which may be a straight line, circle, ellipse, parabola, or hyperbola according to different choices of the constants $A$, $B$, $C$.

• When $c \ne 0$ and $A \ne 0$, we have

\[ u(x, t) = c \pm \sqrt{A\left(x - ct + \frac{B}{2A}\right)^2 + \frac{4AC - B^2}{4A}}, \quad \forall A, B, C \in \mathbb{R}. \tag{60} \]

Therefore with $4AC - B^2 = 0$ this solution becomes

\[ u(x, t) = c \pm \sqrt{A}\,\left|x - ct + \frac{B}{2A}\right|, \quad \forall A > 0,\ B \in \mathbb{R}. \tag{61} \]

Setting $c = 1$, $A = 1$, $B = 0$ yields

\[ u(x, t) = 1 - |x - t|, \tag{62} \]

and

\[ u(x, t) = 1 + |x - t|. \tag{63} \]

The former looks like a compacton solution [30, 14]. The latter is a "V"-type solution.

• When $c \ne 0$ and $A = 0$, we have

\[ u(x, t) = c \pm \sqrt{B(x - ct) + C}, \quad \forall B, C \in \mathbb{R}, \tag{64} \]

which is a parabolic traveling wave solution if $B \ne 0$ and becomes a constant solution if $B = 0$. In particular,

\[ u(x, t) = 1 + \sqrt{x - t}, \quad x - t \ge 0, \tag{65} \]

and

\[ u(x, t) = 1 - \sqrt{x - t}, \quad x - t \ge 0 \tag{66} \]

are two specific solutions.

So, the 3rd-order equation $u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0$ has the continuous traveling wave solution (58). In addition, we also have the Gaussian initial solution of this 3rd-order equation, which is stable (see Figure 1).

Second, we give the traveling wave solution of the 5th-order equation (1). Set $u = \xi^{-\gamma}$, $\xi = x - ct$ ($c$ is a constant speed to be determined); then after substituting this setting into Equation (1) we obtain

\[ \gamma = \frac{12}{5}, \qquad c = -\frac{336}{625}. \tag{67} \]

So, the 5th-order equation (1) has the following traveling wave solution

\[ u = \left(x + \frac{336}{625}\,t\right)^{-12/5}. \tag{68} \]


Figure 3. This is the stable solution for the 5th-order equation $u_t = \partial_x^5 u^{-2/3}$ under the Gaussian initial condition.

Figure 4. This is the stable solution for the Harry–Dym equation $u_t = \partial_x^3 u^{-1/2}$ under the Gaussian initial condition.

Although at each time solution (68) has a singular point at $x = -(336/625)t$, this 5th-order equation has a smooth and stable traveling wave solution under the Gaussian initial condition (see Figure 3).

So, Figure 3 of the equation $u_t = \partial_x^5 u^{-2/3}$ has a slight difference from Figure 4 of the Harry–Dym equation $u_t = \partial_x^3 u^{-1/2}$.


Figure 5. Solution near singular point.

Third, we give the traveling wave solution for the new integrable 7th-order equation (2). Set $u = \xi^{-\gamma}$, $\xi = x - ct$ ($c$ is a constant speed to be determined); then we have

\[ \gamma = \frac{18}{7}, \qquad c = \frac{31680}{117649}. \tag{69} \]

So, the 7th-order equation (2) has the following traveling wave solution (see Figure 5)

\[ u = \left(x - \frac{31680}{117649}\,t\right)^{-18/7}. \tag{70} \]

Furthermore, we propose the following new equations:

\[ u_t = \partial_x^l u^{-m/n}, \quad l \ge 1,\ n \ne 0,\ m, n \in \mathbb{Z}. \tag{71} \]

This equation has the following traveling wave solution

\[ u(x, t) = (x - ct)^{-n(l-1)/(m+n)}, \qquad c = \frac{m}{n}\prod_{k=1}^{l-1}\left(\frac{m(l-1)}{m+n} - k\right). \tag{72} \]

Apparently, if $mn + n^2 > 0$ this solution has a singularity at the point $x = ct$ at each time, and if $mn + n^2 < 0$ this solution is a polynomial traveling wave solution which is smooth.
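Formula (72) is easy to evaluate and to check exactly. The following sketch (function names are hypothetical) computes the exponent and the speed for given $l$, $m$, $n$ with rational arithmetic, and verifies that the ansatz balances both the power of $\xi = x - ct$ and the coefficient in Equation (71). It assumes $m + n \ne 0$.

```python
from fractions import Fraction

def traveling_wave(l, m, n):
    """Return (gamma, c) of (72): u = (x - c t)^(-gamma) for u_t = d^l/dx^l u^(-m/n)."""
    a = Fraction(m * (l - 1), m + n)       # exponent of xi in u^(-m/n)
    gamma = Fraction(n * (l - 1), m + n)   # u = xi^(-gamma)
    c = Fraction(m, n)
    for k in range(1, l):
        c *= a - k
    return gamma, c

def check(l, m, n):
    """Return (exponents_match, coefficient_residual) for the ansatz in (71)."""
    gamma, c = traveling_wave(l, m, n)
    a = gamma * Fraction(m, n)
    # Left side:  u_t = c * gamma * xi^(-gamma - 1).
    # Right side: d^l/dxi^l xi^a = a(a-1)...(a-l+1) * xi^(a - l).
    rhs = Fraction(1)
    for k in range(l):
        rhs *= a - k
    return a - l == -gamma - 1, c * gamma - rhs

if __name__ == "__main__":
    print(traveling_wave(5, 2, 3))   # 5th-order equation (1)
    print(traveling_wave(3, 1, 2))   # Harry-Dym equation
    print(check(5, 2, 3), check(3, 1, 2))
```

With $(l, m, n) = (5, 2, 3)$ this reproduces $\gamma = 12/5$, $c = -336/625$ of (67), and with $(l, m, n) = (3, 1, 2)$ it gives $\gamma = 4/3$, $c = 2/9$ for the Harry–Dym equation.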

Remark 3. Here are the cusp-like traveling wave solutions with singularities

\[ u(x, t) = \left(x - \frac29\,t\right)^{-4/3} \tag{73} \]

and

\[ u(x, t) = \left(x + \frac{336}{625}\,t\right)^{-12/5} \tag{74} \]

for the Harry–Dym equation $u_t = \partial^3(u^{-1/2})$ and the 5th-order equation $u_t = \partial^5(u^{-2/3})$, respectively.


7. Conclusions

In Section 5, we obtain the parametric solution (54) of the 5th-order equation (1). This parametric solution does not include its traveling wave solution $u = (x + (336/625)t)^{-12/5}$, because the parametric solution is smooth everywhere, but the traveling wave solution has a singularity.

The traveling wave solutions $u = (x + (336/625)t)^{-12/5}$ for the equation $u_t = \partial^5 u^{-2/3}$ and $u = (x - (31680/117649)t)^{-18/7}$ for the equation $u_t = \partial_x^5\big(((u^{-1/3})_{xx} - 2((u^{-1/6})_x)^2)/u\big)$ are singular at each time. That is, the singularity travels with the time $t$ (see Figure 5). Actually, when $n(m + n) > 0$ the traveling wave solution (72) of the general Equation (71) also has this property. A natural question arises here: is Equation (71) integrable for all $l \ge 1$, $m, n \in \mathbb{Z}$, or for which $l \ge 1$, $m, n \in \mathbb{Z}$ is it integrable? We will discuss this elsewhere.

The Harry–Dym equation has the cusp-like traveling wave solution $u(x, t) = (x - (2/9)t)^{-4/3}$, but this is not the cusp soliton which Wadati described in [32], because the current traveling wave solution is singular, whereas the cusp soliton is continuous.

If we consider other constraints between the potential and the eigenfunctions, then we can still get parametric solutions for the other two equations

\[ u_t = \partial_x^5\, \frac{(u^{-1/3})_{xx} - 2\big((u^{-1/6})_x\big)^2}{u}, \]

\[ u_{xxt} + 3u_{xx}u_x + u_{xxx}u = 0. \]

Acknowledgments

The first author is much indebted to Dr. Darryl Holm for his invitation and for providing an opportunity to join his research project. He would like to express his sincere thanks to Prof. Konopelchenko for showing his paper [20] and to Prof. Magri for his fruitful discussion during their visit at Los Alamos National Laboratory. He is grateful to the referee for pointing out references [13, 16].

This work was supported by the Foundation for the Author of National Excellent Doctoral Dissertation (FANEDD) of PR China, and also the Doctoral Programme Foundation of the Institution of High Education of China.

References

1. Ablowitz, M. J., Kaup, D. J., Newell, A. C. and Segur, H.: Nonlinear evolution equations of physical significance, Phys. Rev. Lett. 31 (1973), 125–127.

2. Ablowitz, M. J., Kaup, D. J., Newell, A. C. and Segur, H.: Inverse scattering transform – Fourier analysis for nonlinear problems, Studies Appl. Math. 53 (1974), 249–315.

3. Alber, M. S., Camassa, R., Holm, D. D. and Marsden, J. E.: The geometry of peaked solitons and billiard solutions of a class of integrable PDE's, Lett. Math. Phys. 32 (1994), 137–151.

4. Alber, M. S., Camassa, R., Holm, D. D. and Marsden, J. E.: On the link between umbilic geodesics and soliton solutions of nonlinear PDE's, Proc. Roy. Soc. 450 (1995), 677–692.


5. Alber, M. S., Camassa, R., Fedorov, Y. N., Holm, D. D. and Marsden, J. E.: On billiard solutions of nonlinear PDE's, Phys. Lett. A 264 (1999), 171–178.

6. Alber, M. S., Camassa, R., Fedorov, Y. N., Holm, D. D. and Marsden, J. E.: The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE's of shallow water and Dym type, Comm. Math. Phys. 221 (2001), 197–227.

7. Arnol'd, V. I.: Mathematical Methods of Classical Mechanics, Springer-Verlag, Berlin, 1978.

8. Camassa, R. and Holm, D. D.: An integrable shallow water equation with peaked solitons, Phys. Rev. Lett. 71 (1993), 1661–1664.

9. Cao, C. W.: Nonlinearization of Lax system for the AKNS hierarchy, Sci. China Ser. A (in Chinese) 32 (1989), 701–707; also see English edition: Nonlinearization of Lax system for the AKNS hierarchy, Sci. Sin. A 33 (1990), 528–536.

10. Constantin, A. and McKean, H. P.: A shallow water equation on the circle, Comm. Pure Appl. Math. 52 (1999), 949–982.

11. Degasperis, A., Holm, D. D. and Hone, A. N. W.: A new integrable equation with peakon solutions, Theoret. and Math. Phys. 133 (2002), 1463–1474.

12. Degasperis, A. and Procesi, M.: Asymptotic integrability, In: A. Degasperis and G. Gaeta (eds), Symmetry and Perturbation Theory, World Scientific, 1999, pp. 23–37.

13. Fokas, A. S. and Anderson, R. L.: On the use of isospectral eigenvalue problems for obtaining hereditary symmetries for Hamiltonian systems, J. Math. Phys. 23 (1982), 1066–1073.

14. Fringer, D. and Holm, D. D.: Integrable vs. nonintegrable geodesic soliton behavior, Physica D 150 (2001), 237–263.

15. Fuchssteiner, B.: Some tricks from the symmetry-toolbox for nonlinear equations: Generalizations of the Camassa–Holm equation, Physica D 95 (1996), 229–243.

16. Fuchssteiner, B. and Fokas, A. S.: Symplectic structures, their Baecklund transformations and hereditaries, Physica D 4 (1981), 47–66.

17. Gardner, C. S., Greene, J. M., Kruskal, M. D. and Miura, R. M.: Method for solving the Korteweg–de Vries equation, Phys. Rev. Lett. 19 (1967), 1095–1097.

18. Holm, D. D. and Hone, A. N. W.: Note on Peakon bracket, Private communication, 2002.

19. Kaup, D. J.: On the inverse scattering problem for cubic eigenvalue problems of the class $\psi_{xxx} + 6Q\psi_x + 6R\psi = \lambda\psi$, Stud. Appl. Math. 62 (1980), 189–216.

20. Konopelchenko, B. G. and Dubrovsky, V. G.: Some new integrable nonlinear evolution equations in 2 + 1 dimensions, Phys. Lett. A 102 (1984), 15–17.

21. Kupershmidt, B. A.: A super Korteweg–De Vries equation: an integrable system, Phys. Lett. A 102 (1984), 213–215.

22. Korteweg, D. J. and De Vries, G.: On the change of form of long waves advancing in a rectangular canal, and on a new type of long stationary waves, Phil. Mag. 39 (1895), 422–443.

23. Levitan, B. M. and Gasymov, M. G.: Determination of a differential equation by two of its spectra, Russ. Math. Surveys 19(2) (1964), 1–63.

24. Ma, W. X. and Strampp, W.: An explicit symmetry constraint for the Lax pairs and the adjoint Lax pairs of AKNS systems, Phys. Lett. A 185 (1994), 277–286.

25. Marchenko, V. A.: Certain problems in the theory of second-order differential operators, Dokl. Akad. Nauk SSSR 72 (1950), 457–460.

26. Mikhailov, A. V. and Novikov, V. S.: Perturbative symmetry approach, J. Phys. A 35 (2002), 4775–4790.

27. Qiao, Z. J.: Finite-dimensional Integrable System and Nonlinear Evolution Equations, Higher Education Press, PR China, 2002.

28. Qiao, Z. J.: Integrable hierarchy, 3 × 3 constrained systems, and parametric solutions, preprint, 2002, to appear in Acta Appl. Math.

29. Qiao, Z. J.: The Camassa–Holm hierarchy, N-dimensional integrable systems, and algebro-geometric solution on a symplectic submanifold, Comm. Math. Phys. 239 (2003), 309–341.


30. Rosenau, P. and Hyman, J. M.: Compactons: Solitons with finite wavelength, Phys. Rev. Lett. 70 (1993), 564–567.

31. Tu, G. Z.: An extension of a theorem on gradients of conserved densities of integrable systems, Northeast. Math. J. 6 (1990), 26–32.

32. Wadati, M., Ichikawa, Y. H. and Shimizu, T.: Cusp soliton of a new integrable nonlinear evolution equation, Progr. Theoret. Phys. 64 (1980), 1959–1967.

33. Zakharov, V. E. and Shabat, A. B.: Exact theory of two dimensional self focusing and one dimensional self modulation of waves in nonlinear media, Soviet Phys. JETP 34 (1972), 62–69.


Mathematical Physics, Analysis and Geometry 7: 309–331, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.


Generalized Value Distribution for Herglotz Functions and Spectral Theory

Y. T. CHRISTODOULIDES and D. B. PEARSON
Department of Mathematics, University of Hull, Cottingham Rd., Hull HU6 7RX, UK.
e-mail: [email protected]; [email protected]

(Received: 17 April 2003; in final form: 8 June 2004)

Abstract. We generalize the theory of value distribution for a class of functions defined as boundary values of Herglotz functions, by considering other measures than Lebesgue measure. The link with compositions of Herglotz functions is presented, and precise relations for the associated measures are obtained. We also consider uniformly convergent sequences of Herglotz functions on compact subsets of the upper half-plane, and prove that the corresponding sequence of Herglotz measures and the generalized value distribution of these functions also converge.

Mathematics Subject Classification (2000): 34L05.

Key words: generalized value distribution, Herglotz functions, spectral theory.

1. Introduction

Let F(z) be a Herglotz function, that is, analytic in the upper half-plane and with positive imaginary part. Then F admits the integral representation ([1, 7, 10])

\[
F(z) = a_F + b_F z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\rho(t), \tag{1}
\]

where a_F, b_F are real constants and the function ρ(t) is nondecreasing, right-continuous, and unique up to an additive constant, for given F. In particular, a_F and b_F are given by

\[
a_F = \operatorname{Re} F(i), \qquad b_F = \lim_{s \to \infty} \frac{1}{s} \operatorname{Im} F(is),
\]

and ρ(t) gives rise to a Borel measure µ through the relation ρ(b) − ρ(a) = µ((a, b]) for intervals (a, b]. Thus µ is the Herglotz measure corresponding to F, and it satisfies the integral condition

\[
\int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\rho(t) < +\infty, \tag{2}
\]

which is a sufficient condition to ensure convergence of the representation of F(z) in (1).
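The two formulas for a_F and b_F can be checked numerically on a simple case. The sketch below is only an illustration under assumptions that are not in the paper: the example function F(z) = 3 + 2z − 1/z (so a_F = 3, b_F = 2, and µ is a unit point mass at t = 0) and the helper names `a_const` and `b_estimate` are hypothetical, and the limit s → ∞ is replaced by a single large value of s.

```python
def F(z: complex) -> complex:
    # Hypothetical example: a_F = 3, b_F = 2, unit point mass at t = 0,
    # since the integral term in (1) contributes 1/(0 - z) - 0 = -1/z.
    return 3 + 2 * z - 1 / z


def a_const(func) -> float:
    # a_F = Re F(i)
    return func(1j).real


def b_estimate(func, s: float = 1e6) -> float:
    # b_F = lim_{s -> infinity} (1/s) Im F(is), approximated at one large s
    return func(1j * s).imag / s


a_F = a_const(F)
b_F = b_estimate(F)
print(a_F, b_F)
```

For this example Im F(is) = 2s + 1/s, so the estimate of b_F differs from 2 by 1/s², which is why a single large s is enough here.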


We denote by F_+(λ) the boundary value of F at the point λ ∈ R, defined by F_+(λ) = lim_{ε→0+} F(λ + iε). Thus, F_+(λ) is the limiting value of F(z) as z approaches the point λ on the real line, vertically from the upper half-plane. The measure µ corresponding to F may be decomposed into its absolutely continuous and singular parts, µ = µ_{a.c.} + µ_s ([3, 12, 15]). In the special case that F has real boundary values almost everywhere, µ is purely singular. More generally, µ_{a.c.} is concentrated on the set of λ ∈ R for which Im F_+(λ) > 0, and the density function f of µ_{a.c.} is given by f(λ) = (1/π) Im F_+(λ), λ ∈ R. For an analysis of the supports of the measures µ_{a.c.} and µ_s see [14].

In [4, 5, 13], a theory of value distribution for boundary values of Herglotz functions has been developed, with applications to spectral analysis and to the Weyl–Titchmarsh theory of the m-function [8] for Sturm–Liouville equations. The starting point of this theory is a study of the properties of the value distribution function M, given for Borel subsets of the real line in the case that µ is purely singular by

\[
M(A, S) = \bigl| A \cap F_+^{-1}(S) \bigr|,
\]

where | · | denotes Lebesgue measure.

The main purpose of the current paper is to extend the existing theory to allow a description of value distribution involving measures other than Lebesgue. To outline in broad terms the direction that this extension [6] might take, consider the following simple example, again in the special case that µ is purely singular. For given Herglotz function F, define a measure ν on Borel subsets of R by

\[
\nu(A) = \alpha \bigl| A \cap F_+^{-1}(\mathbb{R}^-) \bigr| + \beta \bigl| A \cap F_+^{-1}(\mathbb{R}^+) \bigr|,
\]

where α, β are positive constants. If α = β = 1, the measure ν reduces to Lebesgue measure. More generally, ν is an absolutely continuous measure weighted according to the sign of F_+(λ). Moreover, we have

\[
\nu(A) = \alpha \int_{-\infty}^{0} \mu_y(A)\, dy + \beta \int_{0}^{\infty} \mu_y(A)\, dy,
\]

where {µ_y} (y ∈ R) is a one-parameter family of measures defined by the Herglotz function F (for precise definitions of µ_y see Section 2 below, in particular Equations (3)–(6)).

Defining an absolutely continuous measure dσ, having density function α on R^− and β on R^+, we can write

\[
\nu(A) = \int_{-\infty}^{\infty} \mu_y(A)\, d\sigma(y).
\]

Again, with α = β = 1 the measure dσ reduces to Lebesgue measure. More generally, we may apply the theory of boundary values for Herglotz functions by noting that dσ is the Herglotz measure associated with the function φ(z) = iπβ + (α − β) log z, and that ν is the Herglotz measure associated with the composed function φ ∘ F, given by (φ ∘ F)(z) = iπβ + (α − β) log F(z).
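That dσ has density α on the negative half-line and β on the positive half-line can be read off from the density formula (1/π) Im φ_+(y). The following sketch checks this numerically; it assumes the principal branch of the logarithm (which is what `cmath.log` computes), and the values of α, β and the small offset `eps` standing in for the limit ε → 0+ are arbitrary illustrative choices.

```python
import cmath
import math


def phi(z: complex, alpha: float, beta: float) -> complex:
    # phi(z) = i*pi*beta + (alpha - beta) * log z, principal branch
    return 1j * math.pi * beta + (alpha - beta) * cmath.log(z)


def density(y: float, alpha: float, beta: float, eps: float = 1e-9) -> float:
    # (1/pi) Im phi(y + i*eps), an approximation of (1/pi) Im phi_+(y)
    return phi(complex(y, eps), alpha, beta).imag / math.pi


alpha, beta = 2.0, 0.5
left = density(-1.0, alpha, beta)   # approximates the density on the negative half-line
right = density(1.0, alpha, beta)   # approximates the density on the positive half-line
print(left, right)
```

The check rests on Im log(y + i0) being π for y < 0 and 0 for y > 0, so that (1/π) Im φ_+ equals β + (α − β) = α on the left and β on the right.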

If ν_S = ν|F_+^{-1}(S) denotes the restriction of ν to F_+^{-1}(S) for an arbitrary Borel set S, we may describe the generalized value distribution of F_+ (that is, the value distribution with ν replacing Lebesgue measure) by means of the value distribution function M_ν given by

\[
M_\nu(A, S) = \nu\bigl(A \cap F_+^{-1}(S)\bigr) = \nu_S(A).
\]

In the following sections, we shall extend these ideas to cover the case in which σ is a general absolutely continuous measure and ν is a general measure weighted according to the values of F_+(λ). Moreover, the general features of the theory, involving measures σ and ν, generalized value distribution function M_ν, and a composed Herglotz function φ ∘ F with associated measure ν, apply quite generally to Herglotz functions F, whether or not the measure µ is purely singular. Within this context of generalized value distribution we prove continuity of M_ν with respect to the Herglotz function F, in the sense that M_ν(A, S; F_n) converges to M_ν(A, S; F) whenever F_n → F uniformly for z in compact subsets of C^+.

The main results of the paper are summarized below in Lemma 2.6, and Theorems 3.4 and 3.8. The paper is organized as follows.

In Section 2 we consider the Herglotz measure dσ and associated Herglotz function φ(z), and define corresponding generalized measures ν, ν_S for the Herglotz function F, where S is an arbitrary Borel set. We verify that the measure dν_S is absolutely continuous provided dσ is absolutely continuous, and prove that dν_S is also a Herglotz measure. In Lemma 2.6 we show that, for any Borel set B, we have ν_S(B) = µ_{(φ_S∘F)}(B) − b_φ µ(B), where µ_{(φ_S∘F)} is the measure corresponding to the composed Herglotz function (φ_S ∘ F), and φ_S(z) is the Herglotz function having the same representation as φ(z), except that now integration with respect to σ takes place over the set S. Here, µ is the measure corresponding to the Herglotz function F, and b_φ is the coefficient of z in the representation of φ of the form (1). Hence, in the case that b_φ = 0, ν_S is precisely the measure corresponding to the composed Herglotz function φ_S ∘ F.

In Section 3 we consider a sequence of Herglotz functions F_n with corresponding measures µ_n, converging uniformly as n → ∞ to the Herglotz function F(z), on compact subsets of the upper half-plane. We prove in Theorem 3.4 that, in that case, we have µ_n((a, b]) → µ((a, b]), provided the points a and b are not discrete points of any of the measures µ_n or µ, the measure corresponding to F. We make use of the standard result regarding Herglotz measures of intervals whose endpoints are not discrete points of the measure. Also, in order to control limits close to the real axis, we use complex contour integration in the upper half-plane [11], where the functions F_n, by assumption, satisfy convergence conditions. Under the further assumption that the measure dσ is absolutely continuous, we will prove in Theorem 3.8 that the generalized value distribution of F_n converges, as n → ∞, to the generalized value distribution of F.


2. Generalized Value Distribution for Herglotz Functions

Given a Herglotz function F, we may construct a one-parameter family of Herglotz functions F_y defined by

\[
F_y(z) = \frac{1}{y - F(z)}, \qquad y \in \mathbb{R}, \tag{3}
\]

with corresponding measures µ_y and representations

\[
F_y(z) = a_y + b_y z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\mu_y(t). \tag{4}
\]

The integral

\[
\int_S \mu_y(A)\, dy, \tag{5}
\]

where A, S are arbitrary Borel sets and µ_y are the measures corresponding to the Herglotz functions F_y, defines the value distribution mapping for the Herglotz function F, as the following result shows.

THEOREM 2.1. Let F be a Herglotz function with associated measure µ. Suppose that F has real boundary values almost everywhere with respect to Lebesgue measure, so that µ is purely singular. Then, for any Borel sets A, S ⊆ R we have

\[
\int_S \mu_y(A)\, dy = \bigl| A \cap F_+^{-1}(S) \bigr|, \tag{6}
\]

where F_+ is the boundary value of F as z approaches the real axis, µ_y are the measures corresponding to the Herglotz functions F_y defined in (3), and | · | denotes Lebesgue measure. In particular, we have

\[
\int_{\mathbb{R}} \mu_y(A)\, dy = |A|. \tag{7}
\]

Proof. See [13, 14]. □

Hence ∫_S µ_y(A) dy is the Lebesgue measure of the points in A for which the boundary value of F is in S. Equation (6) also holds in the case when F has real boundary values almost everywhere on A, and (7) holds for arbitrary Herglotz functions F, that is, without the assumption that F has real boundary values almost everywhere.
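A small worked case may help make (6) concrete. The sketch below assumes the example F(z) = −1/z, which is not taken from the paper: then F_y(z) = z/(1 + yz), and for y ≠ 0 a partial-fraction computation shows that µ_y is a point mass of weight 1/y² at t = −1/y, while F_+(λ) = −1/λ. The function names and the midpoint rule used for the y-integral are illustrative choices only.

```python
def mu_y_of_interval(y: float, a: float, b: float) -> float:
    # mu_y((a, b]) for the assumed example F(z) = -1/z:
    # a point mass of weight 1/y**2 located at t = -1/y (y != 0).
    t = -1.0 / y
    return 1.0 / y**2 if a < t <= b else 0.0


def averaged_measure(a: float, b: float, s_lo: float, s_hi: float, n: int = 200_000) -> float:
    # Midpoint-rule approximation of the integral of mu_y((a, b]) over y in (s_lo, s_hi].
    h = (s_hi - s_lo) / n
    return h * sum(mu_y_of_interval(s_lo + (k + 0.5) * h, a, b) for k in range(n))


# S = (-1, -1/2], so F_+^{-1}(S) = {lambda : -1/lambda in S} = (1, 2].
full = averaged_measure(1.0, 3.0, -1.0, -0.5)   # A = (1, 3]:   |A ∩ (1, 2]| = 1
half = averaged_measure(1.0, 1.5, -1.0, -0.5)   # A = (1, 1.5]: |A ∩ (1, 2]| = 1/2
print(full, half)
```

In this example the identity reduces to the change of variables λ = −1/y, dλ = dy/y², which is why the weighted point masses average out to Lebesgue measure.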

The integral in (5) may be regarded as a spectral average of a family of measures over the set S. There is an extensive literature on spectral averaging with applications to spectral analysis. See [9] for a recent review of this theory. A unifying feature of the treatment in [9] is to consider the average of a one-parameter family of measures, where each measure is obtained from the Herglotz measure corresponding to the composition of a given Herglotz function with a one-parameter group of automorphisms of the upper half-plane. Results such as (7) then follow in a natural way from this underlying theory. In the present paper, we will extend the theory by considering spectral averaging with respect to a wider class of measures, for which we shall establish a connection with ideas of value distribution for boundary values of Herglotz functions. In this more general context, we will need to consider compositions with arbitrary Herglotz functions rather than Möbius transformations as in [9], and this theme will be continued in a subsequent publication.

It will be important for later calculations to know how the constants b_y in (4) depend on y. The following lemma provides an answer to this question.

LEMMA 2.2. For a given Herglotz function F(z), the nonnegative constants b_y appearing in (4) are zero, except possibly for a single value of y. Moreover, b_y may be strictly positive for some value y = y_0; for any given y_0 ∈ R, the condition b_{y_0} > 0 is equivalent to the condition that the point t = 0 is a discrete point of the measure dg, where dg is the measure corresponding to the Herglotz function G(z) defined by

\[
G(z) = \frac{1}{y_0 - F\left(-\frac{1}{z}\right)}. \tag{8}
\]

Proof. The constants b_y are given by

\[
b_y = \lim_{s \to \infty} \frac{1}{s} \operatorname{Im} F_y(is) = \lim_{s \to \infty} \frac{1}{s} \operatorname{Im} \left[ \frac{1}{y - F(is)} \right].
\]

Suppose that F(is) = α(s) + iβ(s), with α, β real. Then,

\[
b_y = \lim_{s \to \infty} \frac{1}{s}\, \frac{\beta(s)}{[y - \alpha(s)]^2 + [\beta(s)]^2}.
\]

If b_y > 0, then α(s) → y and β(s) → 0 as s → ∞, so that F(is) → y as s → ∞. That is, F(is) → y as s → ∞ is a necessary condition for b_y > 0, where y is any real number. So there can be at most one value of y such that b_y > 0, since F(is) cannot tend to two different limits as s → ∞.

Consider now the Herglotz function G(z) defined in (8), and in particular, the limit

\[
\ell = \lim_{w \to 0+} w \operatorname{Im} G(iw), \qquad w \in \mathbb{R}.
\]

Writing w = 1/s, we have

\[
\ell = \lim_{s \to \infty} \frac{1}{s} \operatorname{Im} \left[ \frac{1}{y_0 - F(is)} \right].
\]

Therefore, b_{y_0} = ℓ and thus b_{y_0} > 0 is equivalent to ℓ > 0.

Suppose that G admits the representation

\[
G(z) = a_G + b_G z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} dg(t).
\]

Then,

\[
\ell = \lim_{w \to 0+} w \operatorname{Im} G(iw) = \lim_{w \to 0+} \int_{\mathbb{R}} \frac{w^2}{t^2 + w^2}\, dg(t). \tag{9}
\]

The integrand in (9) has 0 as its pointwise limit, as w → 0+, except in the case when t = 0, in which case the limit is 1. Note also that

\[
\frac{w^2}{t^2 + w^2} \le \frac{1}{t^2 + 1}, \qquad 0 < w \le 1,
\]

where the function 1/(1 + t²) is integrable with respect to the Herglotz measure dg. Hence, we can apply the Lebesgue dominated convergence theorem in (9), to deduce that

\[
\lim_{w \to 0+} \int_{\mathbb{R}} \frac{w^2}{t^2 + w^2}\, dg(t) = g(\{0\}).
\]

Hence b_{y_0} > 0 is equivalent to t = 0 being a discrete point of the dg measure, and moreover in that case we have b_{y_0} = g({0}). □
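The dichotomy in Lemma 2.2 can be seen on a one-line example. The sketch below assumes F(z) = −1/z (an illustration, not an example from the paper), for which F(is) → 0, so y_0 = 0 is the only candidate for a positive constant; the limit s → ∞ is approximated at one large s, and `b_y_estimate` is a hypothetical helper name.

```python
def F(z: complex) -> complex:
    # Assumed example: unit point mass at t = 0, with F(is) -> 0 as s -> infinity.
    return -1 / z


def b_y_estimate(y: float, s: float = 1e6) -> float:
    # b_y = lim_{s -> infinity} (1/s) Im [ 1 / (y - F(is)) ], evaluated at one large s
    return (1 / (y - F(1j * s))).imag / s


b_at_0 = b_y_estimate(0.0)   # F_0(z) = z, so b_0 = 1
b_at_1 = b_y_estimate(1.0)   # any y != 0 gives b_y = 0
print(b_at_0, b_at_1)
```

For this F and y_0 = 0, the function in (8) is G(z) = −1/z, whose measure is a unit point mass at t = 0, in agreement with b_{y_0} = g({0}) = 1.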

We now extend the idea of value distribution associated with a Herglotz function. Given a Herglotz function F(z), and a Borel subset S of R, we define the integral-measures dν and dν_S by

\[
\nu(A) = \int_{\mathbb{R}} \mu_y(A)\, d\sigma(y), \tag{10}
\]

and

\[
M_\nu(A, S) = \nu_S(A) = \int_S \mu_y(A)\, d\sigma(y) \tag{11}
\]

respectively, for any Borel subset A of R. Here, µ_y are the measures corresponding to the Herglotz functions F_y which were defined in (3), and dσ is a Herglotz measure.

LEMMA 2.3. Suppose that the measure dσ is absolutely continuous. Then, the measures dν and dν_S are absolutely continuous.

Proof. Let A be a Borel set having zero Lebesgue measure. Then, from (7) we have µ_y(A) = 0 almost everywhere, and since dσ is absolutely continuous it follows that ν(A) = ν_S(A) = 0. This proves the absolute continuity of dν and dν_S. □


The following lemma shows that, in the special case µ_{a.c.} = 0, ν_S(A) is the ν_S-measure (and also the ν-measure) of the points in A for which the boundary value of F is in S.

LEMMA 2.4. Suppose that F has real boundary values almost everywhere, i.e. the measure µ is purely singular. Then, we have

\[
\nu_S(A) = \nu_S\bigl(A \cap F_+^{-1}(S)\bigr) = \nu\bigl(A \cap F_+^{-1}(S)\bigr). \tag{12}
\]

Proof. Suppose µ is purely singular. Note that the functions F_y will also have real boundary values almost everywhere, and hence the measures µ_y will also be purely singular. The support of µ_y will be the set {λ ∈ R : F_+(λ) = y}. So, for y ∈ S, the support of µ_y will be a subset of F_+^{-1}(S), and we can write

\[
\nu_S(A) = \int_S \mu_y(A)\, d\sigma(y) = \int_S \mu_y\bigl(A \cap F_+^{-1}(S)\bigr)\, d\sigma(y) = \nu_S\bigl(A \cap F_+^{-1}(S)\bigr).
\]

On the other hand, if y ∉ S, the support of µ_y will be a subset of R \ F_+^{-1}(S), and so we also have

\[
\nu_S(A) = \int_{\mathbb{R}} \mu_y\bigl(A \cap F_+^{-1}(S)\bigr)\, d\sigma(y) = \nu\bigl(A \cap F_+^{-1}(S)\bigr),
\]

and the lemma is proved. □

Thus dν_S agrees with dν on F_+^{-1}(S), though in general these measures are different, and we have ν_S = ν|F_+^{-1}(S) in this case.

We refer to dν_S (and M_ν) defined in (11), with dσ absolutely continuous (so that by Lemma 2.3 dν_S will also be absolutely continuous), as the generalized value distribution function for the Herglotz function F.

LEMMA 2.5. The measure ν_S is a Herglotz measure.

Proof. By considering characteristic functions of measurable sets, simple functions, and finally measurable functions h for which the integrals are absolutely convergent, one may easily verify the identity

\[
\int_{\mathbb{R}} h(t)\, d\nu_S(t) = \int_S \left\{ \int_{\mathbb{R}} h(t)\, d\mu_y(t) \right\} d\sigma(y). \tag{13}
\]

From the representation of the Herglotz functions F_y in (4), we obtain

\[
\operatorname{Im} F_y(i) = b_y + \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\mu_y(t).
\]


Therefore, we have

\[
\int_S \left\{ \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\mu_y(t) \right\} d\sigma(y) = \int_S \bigl\{ \operatorname{Im} F_y(i) - b_y \bigr\}\, d\sigma(y). \tag{14}
\]

Suppose F(i) = A + iB, with B > 0. Then Im F_y(i) = B/[(y − A)² + B²], and it can easily be shown that Im F_y(i) ≤ const./(1 + y²), which is integrable with respect to dσ. Since b_y are nonnegative, the integrals in (14) are finite, and hence we have from (13)

\[
\int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\nu_S(t) = \int_S \left\{ \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\mu_y(t) \right\} d\sigma(y) < +\infty,
\]

which is a sufficient condition for dν_S to be a Herglotz measure, for arbitrary Borel set S. □

Furthermore, as we shall prove next, dν_S may be expressed in terms of compositions of Herglotz functions.

Let H_S(z) and φ(z) be Herglotz functions corresponding to the Herglotz measures dν_S and dσ, respectively, with the following representations:

\[
H_S(z) = a_H + b_H z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\nu_S(t), \tag{15}
\]

\[
\varphi(z) = a_\varphi + b_\varphi z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\sigma(t). \tag{16}
\]

Let also φ_S(z) be the Herglotz function having the same representation as that of φ(z) in (16), except that integration takes place over the set S instead of R, that is

\[
\varphi_S(z) = a_\varphi + b_\varphi z + \int_S \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\sigma(t). \tag{17}
\]

Moreover, let the composed Herglotz function (φ_S ∘ F)(z) have the following representation:

\[
(\varphi_S \circ F)(z) = a_{(\varphi_S \circ F)} + b_{(\varphi_S \circ F)} z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\mu_{(\varphi_S \circ F)}(t). \tag{18}
\]

LEMMA 2.6. For any Borel subset B of R, we have

\[
\nu_S(B) = \mu_{(\varphi_S \circ F)}(B) - b_\varphi\, \mu(B), \tag{19}
\]

where µ_{(φ_S∘F)} is the measure corresponding to the composed Herglotz function (φ_S ∘ F), µ is the measure corresponding to the Herglotz function F, and the constant b_φ appears in the representation of the Herglotz function φ_S in (17) (and also in (16)). Note that here we do not assume absolute continuity of the measure dσ.


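Proof. From the representation of H_S(z) in (15) we have

\[
\operatorname{Im} H_S(z) = b_H \operatorname{Im} z + \int_{\mathbb{R}} \operatorname{Im} \left[ \frac{1}{t - z} \right] d\nu_S(t).
\]

The function Im[1/(t − z)] is continuous and hence measurable, and thus we have from Equation (13)

\[
\int_{\mathbb{R}} \operatorname{Im} \left[ \frac{1}{t - z} \right] d\nu_S(t) = \int_S \left\{ \int_{\mathbb{R}} \operatorname{Im} \left[ \frac{1}{t - z} \right] d\mu_y(t) \right\} d\sigma(y).
\]

The representation of the functions F_y in (4) leads to

\[
\operatorname{Im} F_y(z) = b_y \operatorname{Im} z + \int_{\mathbb{R}} \operatorname{Im} \left[ \frac{1}{t - z} \right] d\mu_y(t).
\]

Combining these equations we obtain

\[
\operatorname{Im} H_S(z) = b_H \operatorname{Im} z + \int_S \operatorname{Im} \left[ \frac{1}{y - F(z)} \right] d\sigma(y) - \operatorname{Im} z \int_S b_y\, d\sigma(y). \tag{20}
\]

Note that by Lemma 2.2 we have

\[
b_y = \begin{cases} b_{y^*}, & y = y^*, \\ 0, & y \ne y^*, \end{cases}
\]

where y* is the point for which b_y > 0, if this point exists. Also, from (17) we have

\[
\operatorname{Im} \varphi_S(z) = b_\varphi \operatorname{Im} z + \int_S \operatorname{Im} \left[ \frac{1}{t - z} \right] d\sigma(t).
\]

Therefore, we obtain from (20)

\[
\operatorname{Im} H_S(z) = b_H \operatorname{Im} z + \operatorname{Im} \varphi_S\bigl(F(z)\bigr) - b_\varphi \operatorname{Im} F(z) - b_{y^*} (\operatorname{Im} z)\, \sigma\bigl(\{y^*\} \cap S\bigr). \tag{21}
\]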

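Suppose that the points a and b are not discrete points of any of the measures ν_S, µ_{(φ_S∘F)} or µ. By using the standard result regarding Herglotz measures of intervals (a, b] whose endpoints are not discrete points of the measures [14], we have

\[
\begin{aligned}
\nu_S\bigl((a, b]\bigr) &= \lim_{\varepsilon \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im} H_S(\lambda + i\varepsilon)\, d\lambda \\
&= \lim_{\varepsilon \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im} \varphi_S\bigl(F(\lambda + i\varepsilon)\bigr)\, d\lambda - b_\varphi \lim_{\varepsilon \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im} F(\lambda + i\varepsilon)\, d\lambda \\
&= \mu_{(\varphi_S \circ F)}\bigl((a, b]\bigr) - b_\varphi\, \mu\bigl((a, b]\bigr),
\end{aligned} \tag{22}
\]

since

\[
\lim_{\varepsilon \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im}(\lambda + i\varepsilon)\, d\lambda = 0.
\]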

Now take arbitrary points a and b. We shall show that (22) still holds. Since the set of discrete points of the measures ν_S, µ_{(φ_S∘F)}, and µ is countable, given any ε > 0, there are points in the intervals (a − ε, a) and (b, b + ε) respectively, which are not discrete points of any of the measures ν_S, µ_{(φ_S∘F)}, or µ. Hence, we can construct two sequences of such points, {c_i}_{i∈N} and {d_i}_{i∈N}, with c_i → a− and d_i → b+. We then have, on taking the limit of ν_S((c_i, d_i]),

\[
\nu_S\bigl([a, b]\bigr) = \mu_{(\varphi_S \circ F)}\bigl([a, b]\bigr) - b_\varphi\, \mu\bigl([a, b]\bigr).
\]

By the same argument, we have

\[
\nu_S\bigl(\{x\}\bigr) = \mu_{(\varphi_S \circ F)}\bigl(\{x\}\bigr) - b_\varphi\, \mu\bigl(\{x\}\bigr), \qquad \forall x \in \mathbb{R}.
\]

Combining these two equations, we obtain

\[
\nu_S\bigl((a, b]\bigr) = \mu_{(\varphi_S \circ F)}\bigl((a, b]\bigr) - b_\varphi\, \mu\bigl((a, b]\bigr), \tag{23}
\]

for all points a and b of R.

Equation (23) implies that ν_S and (µ_{(φ_S∘F)} − b_φ µ) are measures defined on the algebra of countable unions of intervals of the form (a, b]. The fact that ν_S is a Herglotz measure implies that ν_S((−N, N]) is finite for any integer N:

\[
\nu_S\bigl((-N, N]\bigr) \le (1 + N^2) \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\nu_S(t) < +\infty.
\]

Since R = ⋃_{N∈N} (−N, N], it follows that ν_S, and also (µ_{(φ_S∘F)} − b_φ µ), are σ-finite measures. Hence, there is a unique extension of these measures to the collection of Lebesgue measurable sets, called the corresponding Lebesgue–Stieltjes measure, and restricting this measure to the Borel sets, we have a unique extension to all Borel sets. Therefore, with this extended measure we have shown that

\[
\nu_S(B) = \mu_{(\varphi_S \circ F)}(B) - b_\varphi\, \mu(B),
\]

for all Borel sets B, and Lemma 2.6 is proved. □

Note that a special case of the result is the case where φ is a Möbius transformation (see [9]). In that case, unless φ is a linear transformation, we have b_φ = 0.


3. Herglotz Functions and Uniform Convergence

Let F_n be a sequence of Herglotz functions with associated measures µ_n, and integral representations

\[
F_n(z) = a_n + b_n z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\mu_n(t). \tag{24}
\]

Throughout this section we assume that the functions F_n(z) converge uniformly in the limit n → ∞ to the Herglotz function F(z), on compact subsets of the upper half-plane. We consider the µ_n measure of an interval (a, b] where a, b are not discrete points of the measure µ. The following lemma allows us to control the µ_n measure of neighbourhoods of a and b.

LEMMA 3.1. Let F_n(z) be a sequence of Herglotz functions, given by Equation (24), converging uniformly as n → ∞ to the Herglotz function F(z), given by Equation (1), on compact subsets of the upper half-plane. Suppose that a and b (a < b) are not discrete points of the measure µ. Let ε > 0 be given. Then, there exists δ_0 with 0 < δ_0 < (b − a)/2, and N_0 ∈ N depending on δ_0 and ε, such that if n > N_0 then µ_n(J_0) < ε, where J_0 = [a − δ_0, a + δ_0] ∪ [b − δ_0, b + δ_0].

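Proof. From the representations (24), (1) of the functions F_n and F respectively we have

\[
\delta \operatorname{Im} F_n(a + i\delta) = b_n \delta^2 + \int_{\mathbb{R}} \frac{\delta^2}{(t - a)^2 + \delta^2}\, d\mu_n(t),
\]

and

\[
\delta \operatorname{Im} F(a + i\delta) = b_F \delta^2 + \int_{\mathbb{R}} \frac{\delta^2}{(t - a)^2 + \delta^2}\, d\mu(t).
\]

Similar expressions hold for δ Im F_n(b + iδ) and δ Im F(b + iδ). It is easy to verify that since a and b are not discrete points of the measure µ, we have

\[
\lim_{\delta \to 0+} \delta \operatorname{Im} F(a + i\delta) = \lim_{\delta \to 0+} \delta \operatorname{Im} F(b + i\delta) = 0. \tag{25}
\]

(Note that

\[
\frac{\delta^2}{(t - a)^2 + \delta^2} \le \frac{1}{(t - a)^2 + 1}, \qquad \forall t \in \mathbb{R},
\]

for 0 < δ < 1, which implies that we can use the Lebesgue dominated convergence theorem.)

Let ε > 0 be given. In view of (25), we can choose δ_0 > 0 such that

\[
\delta_0 \operatorname{Im} \bigl[ F(a + i\delta_0) + F(b + i\delta_0) \bigr] < \frac{\varepsilon}{8}. \tag{26}
\]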


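Since the functions F_n(z) converge to F(z) at the points z = a + iδ_0 and z = b + iδ_0, there is an N_0 ∈ N such that, if n > N_0 then

\[
\delta_0 \bigl| \operatorname{Im} \bigl\{ F_n(a + i\delta_0) - F(a + i\delta_0) + F_n(b + i\delta_0) - F(b + i\delta_0) \bigr\} \bigr| < \frac{\varepsilon}{8}. \tag{27}
\]

It follows from (26) and (27) that for n > N_0 we have

\[
\delta_0 \operatorname{Im} \bigl[ F_n(a + i\delta_0) + F_n(b + i\delta_0) \bigr] = 2 b_n \delta_0^2 + \int_{\mathbb{R}} \left[ \frac{\delta_0^2}{(t - a)^2 + \delta_0^2} + \frac{\delta_0^2}{(t - b)^2 + \delta_0^2} \right] d\mu_n(t) < \frac{\varepsilon}{4}. \tag{28}
\]

Since b_n δ_0² ≥ 0, (28) implies

\[
\int_{\mathbb{R}} \frac{\delta_0^2}{(t - a)^2 + \delta_0^2}\, d\mu_n(t) < \frac{\varepsilon}{4} \quad \text{and} \quad \int_{\mathbb{R}} \frac{\delta_0^2}{(t - b)^2 + \delta_0^2}\, d\mu_n(t) < \frac{\varepsilon}{4}.
\]

For t ∈ [a − δ_0, a + δ_0] we have (t − a)² ≤ δ_0², so that δ_0²/((t − a)² + δ_0²) ≥ 1/2, and hence

\[
\int_{a - \delta_0}^{a + \delta_0} \frac{1}{2}\, d\mu_n(t) \le \int_{a - \delta_0}^{a + \delta_0} \frac{\delta_0^2}{(t - a)^2 + \delta_0^2}\, d\mu_n(t) < \frac{\varepsilon}{4}.
\]

Therefore, for n > N_0 we have

\[
\mu_n\bigl([a - \delta_0, a + \delta_0]\bigr) < \frac{\varepsilon}{2}.
\]

Similarly, for n > N_0 we have

\[
\mu_n\bigl([b - \delta_0, b + \delta_0]\bigr) < \frac{\varepsilon}{2}.
\]

Hence µ_n(J_0) < ε, provided n > N_0, as stated in the lemma. □

Remark 3.2. The proof of Lemma 3.1 does not require uniform convergence of F_n(z). It is sufficient that F_n(z) converge to F(z) at the points z = a + iδ_0 and z = b + iδ_0.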
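The key estimate in the proof is that the mass of [a − δ, a + δ] is at most 2δ Im F(a + iδ), because the kernel δ²/((t − a)² + δ²) is at least 1/2 on that interval. The sketch below checks this for a purely discrete measure; the atoms, the coefficient `b_lin`, and the function name are hypothetical values chosen for illustration, and the real constant and the real term −t/(t² + 1) in (1) are omitted since they do not affect Im F.

```python
def mass_and_bound(a: float, delta: float, atoms, b_lin: float = 0.0):
    # For F(z) = b_lin*z + sum_k w_k/(t_k - z), return the pair
    # (mu([a - delta, a + delta]),  2*delta*Im F(a + i*delta)).
    mass = sum(w for t, w in atoms if abs(t - a) <= delta)
    im_F = b_lin * delta + sum(w * delta / ((t - a) ** 2 + delta ** 2) for t, w in atoms)
    return mass, 2 * delta * im_F


# Hypothetical atoms (location, weight) of a discrete Herglotz measure.
atoms = [(0.0, 1.0), (0.05, 0.5), (2.0, 3.0)]
mass, bound = mass_and_bound(a=0.0, delta=0.1, atoms=atoms, b_lin=0.7)
print(mass, bound)
```

Here the interval [−0.1, 0.1] carries mass 1.5, and the bound is larger because atoms near the centre of the interval contribute to the kernel with weight close to 1 rather than 1/2.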

LEMMA 3.3. With the same notation as in the statement of Lemma 3.1, suppose that a and b (a < b) are not discrete points of the measure µ, and let ε > 0 be given. Then δ > 0 can be chosen such that

\[
\left| \frac{1}{\pi} \int_0^c \left\{ \int_J \left[ \frac{t - a}{(t - a)^2 + s^2} - \frac{t - b}{(t - b)^2 + s^2} \right] d\mu(t) \right\} ds \right| < \varepsilon, \tag{29}
\]

where c > 0 is arbitrary, J = (a − δ, a + δ) ∪ (b − δ, b + δ), and δ is taken to lie in the interval 0 < δ < (b − a)/2.


Proof. Let c > 0 be a constant, and define a function K(c, t) by

\[
K(c, t) = \int_0^c \left[ \frac{|t - a|}{(t - a)^2 + s^2} + \frac{|t - b|}{(t - b)^2 + s^2} \right] ds,
\]

so that K(c, t) ≤ π. Since µ({a}) = µ({b}) = 0, it follows that there is a δ, with 0 < δ < (b − a)/2, such that µ(J) < ε, where J = [a − δ, a + δ] ∪ [b − δ, b + δ].
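The bound K(c, t) ≤ π follows from the elementary integral ∫_0^c |u|/(u² + s²) ds = arctan(c/|u|) for u ≠ 0 (and 0 for u = 0), each of the two terms being at most π/2. The sketch below evaluates K through this closed form; the function signature, with a and b passed explicitly, is an illustrative choice.

```python
import math


def K(c: float, t: float, a: float, b: float) -> float:
    # Closed form of K(c, t): each term integrates to arctan(c/|u|), u = t - a or t - b.
    def piece(u: float) -> float:
        return math.atan(c / abs(u)) if u != 0 else 0.0

    return piece(t - a) + piece(t - b)


# Sample K on a few (c, t) pairs with a = 0, b = 1 and record the largest value.
samples = [K(c, t, 0.0, 1.0) for c in (0.1, 1.0, 100.0) for t in (-2.0, 0.0, 0.3, 1.0, 5.0)]
largest = max(samples)
print(largest, math.pi)
```

Values close to π occur only when c is large compared with both |t − a| and |t − b|, which is the situation near the endpoints that the lemma is designed to control.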

Since the double integral in (29) is absolutely convergent we can change the order of integration [2] to obtain

\[
\left| \frac{1}{\pi} \int_0^c \left\{ \int_J \left[ \frac{t - a}{(t - a)^2 + s^2} - \frac{t - b}{(t - b)^2 + s^2} \right] d\mu(t) \right\} ds \right| \le \frac{1}{\pi} \int_J K(c, t)\, d\mu(t) \le \mu(J) < \varepsilon,
\]

by our choice of δ in the definition of J, and (29) is proved. □

THEOREM 3.4. Let F_n(z) be a sequence of Herglotz functions with corresponding measures µ_n. Suppose that F_n(z) converge uniformly, as n → ∞, to the Herglotz function F(z), on compact subsets of the upper half-plane, and that the points a and b (a < b) are not discrete points of any of the measures µ_n (n ∈ N) or µ, the measure corresponding to F(z). Then, we have

\[
\mu_n\bigl((a, b]\bigr) \to \mu\bigl((a, b]\bigr). \tag{30}
\]
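The role of the hypothesis on the endpoints can be illustrated with a sequence of single point masses. The sketch below assumes F_n(z) = 1/(t_n − z) with t_n = t_∞ + 1/n, which converges to 1/(t_∞ − z) uniformly on compact subsets of the upper half-plane; this family and the helper name `mu_n_interval` are illustrative assumptions, not taken from the paper.

```python
def mu_n_interval(n: int, a: float, b: float, limit_point: float) -> float:
    # mu_n((a, b]) for the assumed example mu_n = unit point mass at t_n = limit_point + 1/n.
    t_n = limit_point + 1.0 / n
    return 1.0 if a < t_n <= b else 0.0


# Limit measure: unit mass at 0.25, which is not an endpoint of (0, 1].
interior = [mu_n_interval(n, 0.0, 1.0, 0.25) for n in range(2, 200)]
# Limit measure: unit mass at 0, which is the endpoint a = 0 of (0, 1].
at_endpoint = [mu_n_interval(n, 0.0, 1.0, 0.0) for n in range(2, 200)]

mu_interior_limit = 1.0   # mu((0, 1]) when the limiting mass sits at 0.25
mu_endpoint_limit = 0.0   # mu((0, 1]) when the limiting mass sits at 0
print(interior[-1], at_endpoint[-1])
```

In the first case µ_n((0, 1]) = 1 = µ((0, 1]) for every n, as (30) predicts. In the second case µ_n((0, 1]) = 1 for every n while µ((0, 1]) = 0, so convergence can fail when an endpoint is a discrete point of the limit measure.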

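Proof. By the standard result regarding Herglotz measures of intervals (a, b] whose endpoints are not discrete points of the measure [14], we have

\[
\bigl| \mu_n\bigl((a, b]\bigr) - \mu\bigl((a, b]\bigr) \bigr| = \left| \lim_{w \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im} F_n(\lambda + iw)\, d\lambda - \lim_{w \to 0+} \frac{1}{\pi} \int_a^b \operatorname{Im} F(\lambda + iw)\, d\lambda \right|. \tag{31}
\]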

The problem is to control the behaviour of the functions F_n(z) and F(z) close to the real axis. We also need to control the behaviour of integrals near the endpoints a and b of the interval (a, b], on a contour perpendicular to the real axis.

Given any ε > 0, set

\[
\varepsilon_0 = \frac{\varepsilon \pi}{6 M (b - a)},
\]

where the constant M is defined to be

\[
M = \frac{2}{s_0} \operatorname{Im} F(i s_0) + 1,
\]

for any s_0 > 0. The role of ε_0 and the constant M will become clear shortly.


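Let z = λ + iw, for some fixed w, 0 < w < ε_0, and let A = a + iw, B = b + iw. Then

\[
\bigl| \mu_n\bigl((a, b]\bigr) - \mu\bigl((a, b]\bigr) \bigr| = \left| \lim_{w \to 0+} \frac{1}{\pi} \operatorname{Im} \int_A^B F_n(z)\, dz - \lim_{w \to 0+} \frac{1}{\pi} \operatorname{Im} \int_A^B F(z)\, dz \right|. \tag{32}
\]

Now let C = b + iε_0, D = a + iε_0, and consider the contour ABCD. It follows by Cauchy's theorem that

\[
\int_{ABCD} F_n(z)\, dz = \int_{ABCD} F(z)\, dz = 0,
\]

so that

\[
\int_A^B F_n(z)\, dz = \int_A^D F_n(z)\, dz + \int_D^C F_n(z)\, dz + \int_C^B F_n(z)\, dz.
\]

A similar expression holds for F(z). On the contour AD let z = a + is, w ≤ s ≤ ε_0, on DC let z = s + iε_0, a ≤ s ≤ b, and on the contour CB let z = b + is, w ≤ s ≤ ε_0. Then, we have

\[
\operatorname{Im} \int_A^B F_n(z)\, dz = \operatorname{Re} \int_w^{\varepsilon_0} \bigl[ F_n(a + is) - F_n(b + is) \bigr]\, ds + \operatorname{Im} \int_a^b F_n(s + i\varepsilon_0)\, ds.
\]

From the representations of F_n in (24) we have

\[
\operatorname{Re} \bigl[ F_n(a + is) - F_n(b + is) \bigr] = (a - b) b_n + \int_{\mathbb{R}} \left[ \frac{t - a}{(t - a)^2 + s^2} - \frac{t - b}{(t - b)^2 + s^2} \right] d\mu_n(t). \tag{33}
\]

Thus, we obtain

\[
\operatorname{Im} \int_A^B F_n(z)\, dz = \int_w^{\varepsilon_0} (a - b) b_n\, ds + \operatorname{Im} \int_a^b F_n(s + i\varepsilon_0)\, ds + \int_w^{\varepsilon_0} \left\{ \int_{\mathbb{R}} \left[ \frac{t - a}{(t - a)^2 + s^2} - \frac{t - b}{(t - b)^2 + s^2} \right] d\mu_n(t) \right\} ds. \tag{34}
\]

A similar expression holds for Im ∫_A^B F(z) dz. For the remainder of the proof let the function P(s, t) be defined by

\[
P(s, t) = \frac{t - a}{(t - a)^2 + s^2} - \frac{t - b}{(t - b)^2 + s^2}.
\]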


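Then

\[
\begin{aligned}
\bigl| \mu_n\bigl((a, b]\bigr) - \mu\bigl((a, b]\bigr) \bigr| \le{} & \left| \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} \left\{ \int_{\mathbb{R}} P(s, t)\, d\mu_n(t) \right\} ds - \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} \left\{ \int_{\mathbb{R}} P(s, t)\, d\mu(t) \right\} ds \right| \\
& + \left| \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} (a - b)(b_n - b_F)\, ds \right| \\
& + \left| \frac{1}{\pi} \int_a^b \operatorname{Im} \bigl[ F_n(s + i\varepsilon_0) - F(s + i\varepsilon_0) \bigr]\, ds \right|.
\end{aligned} \tag{35}
\]

From the representations of F_n(z) in (24) we have

\[
\frac{1}{s} \operatorname{Im} F_n(is) = b_n + \int_{\mathbb{R}} \frac{1}{t^2 + s^2}\, d\mu_n(t),
\]

and thus b_n < (1/s) Im F_n(is) for all s > 0. Similarly, b_F < (1/s) Im F(is) for all s > 0.

In particular, b_F < (1/s_0) Im F(is_0), and b_n < (1/s_0) Im F_n(is_0). Since F_n → F, as n → ∞, at z = is_0, there is an N_1 ∈ N such that if n > N_1, then (1/s_0)|Im[F_n(is_0) − F(is_0)]| < 1/2. Hence b_n < (1/s_0) Im F(is_0) + (1/2), so that

\[
|b_n - b_F| \le b_n + b_F < \frac{2}{s_0} \operatorname{Im} F(i s_0) + \frac{1}{2} < M.
\]

With an application of the Lebesgue dominated convergence theorem, it follows that

\[
\left| \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} (a - b)(b_n - b_F)\, ds \right| < \frac{1}{\pi} \int_0^{\varepsilon_0} (b - a) M\, ds = \frac{\varepsilon}{6}. \tag{36}
\]

Since, by assumption, F_n(z) → F(z) uniformly, as n → ∞, on the horizontal contour joining the points a + iε_0 and b + iε_0, there is an N_2 ∈ N such that if n > N_2 then |F_n(s + iε_0) − F(s + iε_0)| < επ/(6(b − a)), a ≤ s ≤ b. Hence, we also have

\[
\left| \frac{1}{\pi} \int_a^b \operatorname{Im} \bigl[ F_n(s + i\varepsilon_0) - F(s + i\varepsilon_0) \bigr]\, ds \right| < \frac{\varepsilon}{6}. \tag{37}
\]

By Lemma 3.1, there is a δ_1 > 0, a corresponding set J_1 = [a − δ_1, a + δ_1] ∪ [b − δ_1, b + δ_1], and an N_3 ∈ N such that, if n > N_3, then µ_n(J_1) < ε/18. By changing the order of integration and applying the Lebesgue dominated convergence theorem we obtain

\[
\left| \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} \left\{ \int_{J_1} P(s, t)\, d\mu_n(t) \right\} ds \right| = \left| \frac{1}{\pi} \int_{J_1} \left\{ \int_0^{\varepsilon_0} P(s, t)\, ds \right\} d\mu_n(t) \right| \le \frac{1}{\pi} \int_{J_1} K(\varepsilon_0, t)\, d\mu_n(t) \le \mu_n(J_1) < \frac{\varepsilon}{18}, \tag{38}
\]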


provided that n > N_3. The function K(c, t), t ∈ R and any c > 0 fixed, is defined in the proof of Lemma 3.3.

Also, by Lemma 3.3, there is a δ_2 > 0 and a corresponding set J_2 = [a − δ_2, a + δ_2] ∪ [b − δ_2, b + δ_2] such that

\[
\left| \frac{1}{\pi} \int_0^{\varepsilon_0} \left\{ \int_{J_2} P(s, t)\, d\mu(t) \right\} ds \right| < \frac{\varepsilon}{18}.
\]

Hence

\[
\left| \lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} \left\{ \int_{J_2} P(s, t)\, d\mu(t) \right\} ds \right| = \left| \frac{1}{\pi} \int_{J_2} \left\{ \int_0^{\varepsilon_0} P(s, t)\, ds \right\} d\mu(t) \right| < \frac{\varepsilon}{18}. \tag{39}
\]

With δ = min{δ_1, δ_2}, and J = [a − δ, a + δ] ∪ [b − δ, b + δ], (38) and (39) hold with J_1 and J_2 replaced by J.

It remains to estimate the integrand for t ∈ R \ J, and it is not difficult to verify that on this set we have

\[
\bigl| P(s, t) \bigr| \le c_J\, \frac{1}{1 + t^2},
\]

for some constant c_J. The fact that µ_n and µ are Herglotz measures now implies that the integrals

\[
c_1 = \int_{\mathbb{R} \setminus J} P(s, t)\, d\mu_n(t)
\]

and

\[
c_2 = \int_{\mathbb{R} \setminus J} P(s, t)\, d\mu(t)
\]

converge absolutely. Note that

\[
|c_1| \le c_J \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\mu_n(t) = c_J \bigl( \operatorname{Im} F_n(i) - b_n \bigr) \le c_J \operatorname{Im} F_n(i), \tag{40}
\]

and thus c_1 is bounded uniformly in n, since F_n(i) → F(i). Hence, it follows by the Lebesgue dominated convergence theorem that

\[
\lim_{w \to 0+} \frac{1}{\pi} \int_w^{\varepsilon_0} \left\{ \int_{\mathbb{R} \setminus J} P(s, t)\, d\mu_n(t) \right\} ds = \frac{1}{\pi} \int_0^{\varepsilon_0} \left\{ \int_{\mathbb{R} \setminus J} P(s, t)\, d\mu_n(t) \right\} ds, \tag{41}
\]

and similarly for the second double integral. This shows that the integration with respect to Lebesgue measure ds must be performed on the interval [0, ε_0].


Suppose that |c1| � K, for all n ∈ N, for some constant K > 0. We now letc′ = max{K, |c2|}, ε1 = επ/9c′, and set ε′ = (1/2) min{ε1, ε2}. We then have∣∣∣∣ 1

π

∫ ε′

0

{ ∫R\J

P (s, t) dµn(t) −∫

R\JP (s, t) dµ(t)

}ds

∣∣∣∣ � ε

9. (42)

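So far, from (36)–(39), (41), and (42) we have

\[
\bigl| \mu_n((a,b]) - \mu((a,b]) \bigr|
< \frac{5\varepsilon}{9} + \left| \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \left\{ \int_{\mathbb{R} \setminus J} P(s,t)\, d\mu_n(t) - \int_{\mathbb{R} \setminus J} P(s,t)\, d\mu(t) \right\} ds \right|. \tag{43}
\]

But our previous arguments show that

\[
\left| \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \left\{ \int_J P(s,t)\, d\mu_n(t) \right\} ds \right| < \frac{\varepsilon}{18},
\]

provided $n > N_3$, and

\[
\left| \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \left\{ \int_J P(s,t)\, d\mu(t) \right\} ds \right| < \frac{\varepsilon}{18}.
\]

Adding these two contributions over $J$ to the integrals over $\mathbb{R} \setminus J$ in (43), we find

\[
\bigl| \mu_n((a,b]) - \mu((a,b]) \bigr|
< \frac{2\varepsilon}{3} + \left| \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \left\{ \int_{\mathbb{R}} P(s,t)\, d\mu_n(t) - \int_{\mathbb{R}} P(s,t)\, d\mu(t) \right\} ds \right|. \tag{44}
\]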

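By using Equation (33) and a similar expression for $F(z)$, and substituting in (44), we obtain

\[
\begin{aligned}
\bigl| \mu_n((a,b]) - \mu((a,b]) \bigr|
\le{} & \frac{2\varepsilon}{3} + \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \bigl| F_n(a + is) - F(a + is) \bigr|\, ds \\
& + \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \bigl| F_n(b + is) - F(b + is) \bigr|\, ds \\
& + \frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} (b - a)\, |b_F - b_n|\, ds.
\end{aligned} \tag{45}
\]

Since $F_n(z) \to F(z)$ uniformly, as $n \to \infty$, on compact subsets of the upper half-plane, we can find an $N_4 \in \mathbb{N}$ such that, if $n > N_4$, then

\[
\frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} \Bigl[ \bigl| F_n(a + is) - F(a + is) \bigr| + \bigl| F_n(b + is) - F(b + is) \bigr| \Bigr] ds < \frac{\varepsilon}{6}.
\]

Also, from (37) we have

\[
\frac{1}{\pi} \int_{\varepsilon'}^{\varepsilon_0} (b - a)\, |b_F - b_n|\, ds < \frac{\varepsilon}{6}.
\]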


Let $N = \max\{N_1, N_2, N_3, N_4\}$. Then, if $n > N$, we have from (45)

\[
\bigl| \mu_n((a,b]) - \mu((a,b]) \bigr| < \varepsilon,
\]

so that Theorem 3.4 is proved. □

Remark 3.5. Again, rather than uniform convergence of $F_n(z)$, it is sufficient that $F_n(z)$ converge to $F(z)$ at a point $z = is_0$, as $n \to \infty$, for any $s_0 \ge 1$, and uniformly on the Π-shaped contour consisting of the following parts: the horizontal contour joining the points $a + i\varepsilon_0$ and $b + i\varepsilon_0$, and the vertical contours $\{z = a + is : \varepsilon' \le s \le \varepsilon_0\}$, $\{z = b + is : \varepsilon' \le s \le \varepsilon_0\}$, where the positive constants $\varepsilon_0$, $\varepsilon'$ were defined in the proof of Theorem 3.4.

LEMMA 3.6. Suppose that the measure $d\sigma$ corresponding to the Herglotz function $\phi(z)$ is absolutely continuous, and let $h_\sigma(y)$ be the density function of $d\sigma$. For $n \in \mathbb{N}$, let a family of sets $X_n$ be defined by $X_n = \{y \in \mathbb{R} : h_\sigma(y) > n\}$. Then

\[
\lim_{n \to \infty} \int_{X_n} \frac{1}{1 + y^2}\, d\sigma(y) = 0. \tag{46}
\]

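Proof. The result follows from the Lebesgue dominated convergence theorem, using the bound

\[
\frac{\chi_n(y)}{1 + y^2} \le \frac{1}{1 + y^2}, \qquad y \in \mathbb{R},\ n \in \mathbb{N},
\]

where $\chi_n$ is the characteristic function of $X_n$. □

Define now a sequence of Herglotz functions $F^n_y(z)$ ($y \in \mathbb{R}$) by

\[
F^n_y(z) = \frac{1}{y - F_n(z)}, \qquad y \in \mathbb{R},\ n \in \mathbb{N}, \tag{47}
\]

having measures $\mu^n_y$ and integral representations

\[
F^n_y(z) = a^n_y + b^n_y z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\mu^n_y(t). \tag{48}
\]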

The measures $\mu^n_y$ satisfy the following bounds.

LEMMA 3.7. Let $F_n$ be a sequence of Herglotz functions, given by (24), such that $F_n(i)$ lie in a compact subset of $\mathbb{C}^+$. Then, for any fixed $N$ and any $y \in \mathbb{R}$ there exists a constant $c > 0$ independent of $y$ and $n$ such that

\[
\mu^n_y([-N, N]) \le \frac{c}{1 + y^2}, \qquad n \in \mathbb{N}.
\]

Proof. Let $D$ be a compact subset of $\mathbb{C}^+$ such that $F_n(i) \in D$ for all $n$. Let $K_D$ be a constant such that $|z| \le K_D$ for all $z \in D$, and $\delta_D = \inf_{z \in D} \operatorname{Im} z > 0$. For any $y \in \mathbb{R}$ and $n \in \mathbb{N}$ we have

\[
\begin{aligned}
\mu^n_y([-N, N]) &\le (1 + N^2) \int_{-N}^{N} \frac{1}{1 + t^2}\, d\mu^n_y(t) \\
&\le (1 + N^2) \int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\mu^n_y(t) = (1 + N^2) \bigl( \operatorname{Im} F^n_y(i) - b^n_y \bigr) \\
&\le (1 + N^2) \operatorname{Im} F^n_y(i),
\end{aligned}
\]

since $b^n_y \ge 0$ for all $y \in \mathbb{R}$ and $n \in \mathbb{N}$. Let $F_n(i) = A_n + iB_n$, and note that

\[
\operatorname{Im} F^n_y(i) = \operatorname{Im} \left[ \frac{1}{y - F_n(i)} \right] = \frac{B_n}{(y - A_n)^2 + B_n^2},
\]

where $0 < \delta_D \le B_n \le K_D$ and $|A_n| \le K_D$. It is straightforward to show that $\operatorname{Im} F^n_y(i) \le k/(1 + y^2)$, where $k$ is a constant, and the lemma follows with $c = (1 + N^2)k$. □

THEOREM 3.8. Let $F_n(z)$ be a family of Herglotz functions with corresponding measures $\mu_n$, such that $F_n(z) \to F(z)$ uniformly, as $n \to \infty$, on compact subsets of the upper half-plane. Suppose that the measure $d\sigma$ is absolutely continuous with respect to Lebesgue measure. Then, for any Borel set $S$ and any bounded Borel set $B$, we have

\[
\lim_{n \to \infty} \int_S \mu^n_y(B)\, d\sigma(y) = \int_S \mu_y(B)\, d\sigma(y), \tag{49}
\]

where the measures $\mu^n_y$ appear in (48), and the measures $\mu_y$ in (4).

Proof. Suppose $B \subseteq [-N, N]$, and consider the family $\mathcal{A}$ of all Lebesgue measurable subsets $A$ of $[-N, N]$ which satisfy Equation (49). $\mathcal{A}$ is nonempty; to see this, first note that by Lemma 3.7 there exists a constant $K_1 > 0$ such that

\[
\mu^n_y([-N - 1, N + 1]) \le K_1\, \frac{1}{1 + y^2}, \tag{50}
\]

which is integrable with respect to $d\sigma$. Thus, for any subinterval $(a, b]$ of $[-N, N]$, an application of the Lebesgue dominated convergence theorem gives

\[
\lim_{n \to \infty} \int_S \mu^n_y((a, b])\, d\sigma(y) = \int_S \lim_{n \to \infty} \mu^n_y((a, b])\, d\sigma(y).
\]

Since $F_n(z) \to F(z)$ uniformly, as $n \to \infty$, on compact subsets of the upper half-plane, it can easily be verified that also $F^n_y(z) \to F_y(z)$ uniformly, as $n \to \infty$, on compact subsets of the upper half-plane. Therefore, it follows from Theorem 3.4 that $\mu^n_y((a, b]) \to \mu_y((a, b])$ as $n \to \infty$, provided that the endpoints $a$ and $b$ are not discrete points of any of the measures $\mu^n_y$ or $\mu_y$. Hence, such an interval $(a, b]$ satisfies (49).

Next we show that $\mathcal{A}$ is closed under countable unions of disjoint sets. There also exists a constant $K_2 > 0$ such that

\[
\mu_y([-N - 1, N + 1]) \le K_2\, \frac{1}{1 + y^2}, \tag{51}
\]

for all $y \in \mathbb{R}$ and $n \in \mathbb{N}$ (this follows in a similar way as the result in Lemma 3.7). Let the sets $S_0$ and $S_1$ be defined by

\[
S_0 = \{y \in S : h_\sigma(y) \le C\}, \qquad S_1 = \{y \in S : h_\sigma(y) > C\}, \tag{52}
\]

where $h_\sigma$ is the density function of $d\sigma$, $S$ is an arbitrary Borel set, and $C > 0$ is a constant.

Let $\varepsilon > 0$ be given. Then, (50), (51) and Lemma 3.6 enable us to choose the constant $C$ such that we have both

\[
\int_{S_1} \mu^n_y([-N - 1, N + 1])\, d\sigma(y) < \frac{\varepsilon}{6}, \tag{53}
\]

and

\[
\int_{S_1} \mu_y([-N - 1, N + 1])\, d\sigma(y) < \frac{\varepsilon}{6}. \tag{54}
\]

Let $\{A_k\}$ be a disjoint sequence of sets in $\mathcal{A}$. Then

\[
\int_{S_1} \mu^n_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) \le \int_{S_1} \mu^n_y([-N - 1, N + 1])\, d\sigma(y) < \frac{\varepsilon}{6}, \tag{55}
\]

and

\[
\int_{S_1} \mu_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) \le \int_{S_1} \mu_y([-N - 1, N + 1])\, d\sigma(y) < \frac{\varepsilon}{6}. \tag{56}
\]

Also, for each $k$ we have

\[
\int_{S_0} \mu^n_y(A_k)\, d\sigma(y) \le C \int_{S_0} \mu^n_y(A_k)\, dy \le C \int_{\mathbb{R}} \mu^n_y(A_k)\, dy = C|A_k|,
\]

by Equation (7) in the statement of Theorem 2.1, where $|\cdot|$ stands for Lebesgue measure. Since

\[
\sum_k C|A_k| \le C \bigl| [-N, N] \bigr| = 2NC < +\infty,
\]

it follows by a discrete version of the Lebesgue dominated convergence theorem that

\[
\lim_{n \to \infty} \sum_k \int_{S_0} \mu^n_y(A_k)\, d\sigma(y) = \sum_k \int_{S_0} \mu_y(A_k)\, d\sigma(y).
\]

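Therefore, we have

\[
\begin{aligned}
\lim_{n \to \infty} \int_{S_0} \mu^n_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y)
&= \lim_{n \to \infty} \sum_k \int_{S_0} \mu^n_y(A_k)\, d\sigma(y) \\
&= \sum_k \int_{S_0} \mu_y(A_k)\, d\sigma(y) \\
&= \int_{S_0} \mu_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y).
\end{aligned}
\]

Hence, there is an $N_1 \in \mathbb{N}$ such that, if $n > N_1$ then

\[
\left| \int_{S_0} \mu^n_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) - \int_{S_0} \mu_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) \right| < \frac{2\varepsilon}{3}. \tag{57}
\]

Combining (55), (56), and (57), we can see that for $n > N_1$ we have

\[
\left| \int_S \mu^n_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) - \int_S \mu_y\Bigl( \bigcup_k A_k \Bigr)\, d\sigma(y) \right| < \varepsilon,
\]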

which shows that $\mathcal{A}$ is closed under countable unions of disjoint sets.

Now take any measurable subset $B$ of $[-N, N]$. There is an open measurable set $G \subset [-N - 1, N + 1]$ such that $G \supset B$ and $|G| < |B| + \varepsilon/(6C)$. Suppose $G = B \cup B_0$ where $B$ and $B_0$ are disjoint, so that $B_0 = G \setminus B$ and thus $B_0$ is measurable. Then, $|G| = |B| + |B_0|$, which implies that $|B_0| \le \varepsilon/(6C)$. We have $\mu^n_y(G) = \mu^n_y(B) + \mu^n_y(B_0)$, and from Lemma 3.7 $\mu^n_y(B_0)$ is bounded so that we can subtract $\mu^n_y(B_0)$ from both sides of the equation to obtain $\mu^n_y(B) = \mu^n_y(G) - \mu^n_y(B_0)$. Similarly, we have $\mu_y(B) = \mu_y(G) - \mu_y(B_0)$. Therefore,

\[
\left| \int_S \mu^n_y(B)\, d\sigma(y) - \int_S \mu_y(B)\, d\sigma(y) \right|
\le \int_S \Bigl[ \bigl| \mu^n_y(G) - \mu_y(G) \bigr| + \mu^n_y(B_0) + \mu_y(B_0) \Bigr] d\sigma(y). \tag{58}
\]

As before, we split integration over the set $S$ to integration over the disjoint sets $S_0$ and $S_1$. From (53) we have

\[
\int_{S_1} \mu^n_y(B_0)\, d\sigma(y) < \frac{\varepsilon}{6}.
\]

Also,

\[
\int_{S_0} \mu^n_y(B_0)\, d\sigma(y) \le C \int_{S_0} \mu^n_y(B_0)\, dy \le C \int_{\mathbb{R}} \mu^n_y(B_0)\, dy = C|B_0| \le \frac{\varepsilon}{6}.
\]

Hence,

\[
\int_S \mu^n_y(B_0)\, d\sigma(y) < \frac{\varepsilon}{3}. \tag{59}
\]

Similarly, we have

\[
\int_S \mu_y(B_0)\, d\sigma(y) < \frac{\varepsilon}{3}. \tag{60}
\]

Since it is an open set, $G$ is the union of a countable collection of disjoint subintervals of $[-N - 1, N + 1]$, and we may assume that the endpoints of these intervals are not discrete points of any of the measures $\mu^n_y$ or $\mu_y$. Thus, by the first part of this proof (where now we consider disjoint measurable subsets of $[-N - 1, N + 1]$, rather than $[-N, N]$, satisfying (49)), it follows that there is an $N_2 \in \mathbb{N}$ such that if $n > N_2$ then

\[
\left| \int_S \mu^n_y(G)\, d\sigma(y) - \int_S \mu_y(G)\, d\sigma(y) \right| < \frac{\varepsilon}{3}. \tag{61}
\]

Combining (59), (60), and (61), we see from (58) that, if $n > N_2$, then

\[
\left| \int_S \mu^n_y(B)\, d\sigma(y) - \int_S \mu_y(B)\, d\sigma(y) \right| < \varepsilon,
\]

which completes the proof of Theorem 3.8. □

References

1. Akhiezer, N. I. and Glazman, I. M.: Theory of Linear Operators in Hilbert Space I, Pitman, London, 1981.

2. Apostol, T. M.: Mathematical Analysis, Addison Wesley, Massachusetts, 1960.

3. Bartle, G. R.: The elements of Integration, Wiley, New York, 1966.

4. Breimesser, S. V.: Asymptotic value distribution for solutions of the Schrödinger equation. PhD Thesis, University of Hull, 2001.

5. Breimesser, S. V. and Pearson, D. B.: Asymptotic value distribution for solutions of the Schrödinger equation, Math. Phys., Anal. and Geometry 3 (2000), 385–403.

6. Christodoulides, Y. T.: Spectral theory of Herglotz functions and their compositions, and the Schrödinger equation. PhD Thesis, University of Hull, 2001.

7. Coddington, E. A. and Levinson, N.: Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.

8. Eastham, M. S. P. and Kalf, H.: Schrödinger-type Operators with Continuous Spectra, Pitman, London, 1982.

9. Gesztesy, F. and Makarov, K. A.: SL2(R), exponential Herglotz representations, and spectral averaging, St. Petersburg Math. J. (to appear). (Available on Los Alamos archive under math.SP/0203142.)

10. Herglotz, G.: Über Potenzreihen mit positivem, reellem Teil im Einheitskreis, Sächs Acad. Wiss. Leipzig 63 (1911), 501–511.

11. Markushevich, A. I.: Theory of Functions of a Complex Variable I, Prentice-Hall, New Jersey, 1965.

12. Munroe, M. E.: Measure and Integration, Addison-Wesley, Massachusetts, 1971.

13. Pearson, D. B.: Value distribution and spectral theory, Proc. Lond. Math. Soc. 68(3) (1994), 127–144.

14. Pearson, D. B.: Quantum Scattering and Spectral Theory, Academic Press, London, 1988.

15. Wheeden, L. R. and Zygmund, A.: Measure and Integral, Marcel Dekker, New York, 1977.

Mathematical Physics, Analysis and Geometry 7: 333–345, 2004.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Spectral Theory of Herglotz Functions and Their Compositions

Y. T. CHRISTODOULIDES and D. B. PEARSON
Department of Mathematics, University of Hull, Cottingham Rd., Hull HU6 7RX, UK.
e-mail: [email protected]; [email protected]

(Received: 17 April 2003; in final form: 8 June 2004)

Abstract. Recent developments in the theory of value distribution for boundary values of Herglotz functions [5], with applications to the spectral analysis of Herglotz measures and differential operators [2, 3] lead in a natural way to the investigation of measures which relate (through the Herglotz representation theorem) to the composition of a pair of Herglotz functions F, G. The present paper provides results on the boundary values of composed Herglotz functions and on the terms of their Herglotz representation which are dominant at large |z|.

Mathematics Subject Classifications (2000): 30D40, 30D05.

Key words: boundary values, Herglotz functions, spectral theory.

1. Introduction

Let $G(z)$ be a Herglotz function, that is, analytic in the upper half-plane with positive imaginary part. Then $G$ admits the integral representation [1, 7, 10]

\[
G(z) = a_G + b_G z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\rho(t), \tag{1}
\]

where $a_G$, $b_G$ are real constants and the function $\rho(t)$ is nondecreasing, right-continuous, and unique up to an additive constant, for given $G$. In particular, $a_G$ and $b_G$ are given by

\[
a_G = \operatorname{Re} G(i), \qquad b_G = \lim_{s \to +\infty} \frac{1}{s} \operatorname{Im} G(is),
\]

and $\rho(t)$ defines a Borel measure, with $\rho\{(a, b]\} = \rho(b) - \rho(a)$ for intervals $(a, b]$. Thus $d\rho$ is the Herglotz measure corresponding to $G$, and satisfies the integral condition

\[
\int_{\mathbb{R}} \frac{1}{1 + t^2}\, d\rho(t) < \infty. \tag{2}
\]
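The formulas for $a_G$ and $b_G$ can be checked numerically on a simple example. The sketch below is only an illustration: the function `G` (with $a_G = 2$, $b_G = 3$ and a unit point mass of $\rho$ at $t = 0$) and the helper names are hypothetical and do not come from the paper.

```python
# Hypothetical example: G(z) = 2 + 3z - 1/z, i.e. representation (1) with
# a_G = 2, b_G = 3 and rho a unit point mass at t = 0 (1/(0 - z) = -1/z).
def G(z: complex) -> complex:
    return 2.0 + 3.0 * z - 1.0 / z


def a_coefficient(func) -> float:
    """a = Re func(i)."""
    return func(1j).real


def b_estimate(func, s: float) -> float:
    """Finite-s approximation of b = lim_{s -> +inf} Im func(is) / s."""
    return func(1j * s).imag / s


a_G = a_coefficient(G)
b_estimates = [b_estimate(G, s) for s in (1e1, 1e3, 1e5)]

print("a_G =", a_G)
print("b_G estimates:", b_estimates)
```

For this particular choice $\operatorname{Im} G(is)/s = 3 + 1/s^2$, so the estimates approach $b_G = 3$ from above as $s$ grows.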

For given $G$, we can define a one-parameter family $\{G_y\}$ of Herglotz functions ($y \in \mathbb{R}$) by $G_y(z) = 1/(y - G(z))$. Let $d\rho_y$ be the Herglotz measure corresponding to $G_y$. Then if $d\sigma$ is a Herglotz measure (that is, a Borel measure satisfying the integrability condition $\int_{\mathbb{R}} (1/(1 + t^2))\, d\sigma(t) < \infty$), we can define the averaged measure $\nu_S$ over a Borel set $S$ by $\nu_S(A) = \int_S \rho_y(A)\, d\sigma(y)$, for any Borel set $A$ such that the integral is finite.

If $G$ has real boundary value almost everywhere, with boundary value function $G_+(\lambda) = \lim_{\varepsilon \to 0^+} G(\lambda + i\varepsilon)$ ($\lambda \in \mathbb{R}$), the measure $\nu_S$, with $\nu = \nu_{\mathbb{R}}$, defines the value distribution of $G_+(\lambda)$, through the formula

\[
\nu_S(A) = \nu\bigl(A \cap G_+^{-1}(S)\bigr).
\]

See [2, 3, 11] for the theory of value distribution of boundary values of Herglotz functions, with applications to the spectral theory of Sturm–Liouville equations and the Weyl–Titchmarsh m-function [8]. In the special case that the measure $d\sigma$ is Lebesgue measure $|\cdot|$, it follows that $d\nu$ is also Lebesgue measure, and we have the resulting formula in that case

\[
\int_S \rho_y(A)\, dy = \bigl| A \cap G_+^{-1}(S) \bigr|.
\]

More generally [5] one may verify that if $d\sigma$ is purely absolutely continuous, then so are $\nu_S$ and $\nu$. Moreover, if we define a Herglotz function corresponding to measure $d\sigma$ by

\[
\phi_S(z) = a_\phi + b_\phi z + \int_S \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\sigma(t),
\]

then $\nu_S$ is given for Borel sets $B$ by

\[
\nu_S(B) = \rho_{(\phi_S \circ G)}(B) - b_\phi\, \rho(B).
\]

In particular we can take $b_\phi = 0$, in which case $\nu_S$ is precisely the Herglotz measure corresponding to the composition $\phi_S \circ G$ of the two Herglotz functions $\phi_S$ and $G$.

The analysis of value distribution theory for Herglotz functions, in terms of Herglotz measures more general than Lebesgue measure, is treated in [5], with applications to spectral theory. As a key element of that analysis, the present paper deals with the Herglotz representation for composed functions. In particular, we shall consider the two related questions: given two Herglotz functions $F$, $G$,

(i) How does the coefficient $b_{F \circ G}$ of the term linear in $z$ of the Herglotz representation for $F \circ G$ relate to the corresponding coefficients $b_F$, $b_G$ of the functions $F$, $G$? We provide a complete description of this coefficient.

(ii) What can be deduced regarding the boundary values $(F \circ G)_+(\lambda)$ of the composed Herglotz function $F \circ G$? This is an important question in view of the role of the composed function in determining generalized value distribution. One might expect that the boundary value $(F \circ G)_+(\lambda)$ is given quite simply by $F_+(G_+(\lambda))$. But the validity of such a result depends on a careful analysis of the mode of convergence of Herglotz functions to their boundary values, raising such questions as whether or not $G(z)$ approaches $G_+(\lambda)$ as a nontangential limit. We resolve some of these questions and show that the boundary value of $F \circ G$ is indeed $F_+(G_+(\lambda))$ on a set characterized by the notion of approximate monotonicity, and for which the complement is of Lebesgue measure zero.

This paper may be regarded as a sequel to [6]. See [6] for introductory material on spectral averaging and its connection with the composition of Herglotz functions, as well as references on these and related areas.

The main results of the paper are summarized below in Theorems 2.2 and 3.2. The paper is organized as follows.

In Section 2, we consider the coefficient $b_{F \circ G}$ of the $z$ term in the Herglotz representation for the composed function. Theorem 2.2 presents the results of this analysis, which include various special cases. In particular, if the measure $d\mu$ corresponding to the Herglotz function $F$ has no discrete component, then we have $b_{F \circ G} = b_F b_G$. In the case that $d\mu$ does have a discrete point, the canonical result $b_{F \circ G} = b_F b_G$ may be violated, depending on the measure $d\rho$ for the second Herglotz function $G$, and we provide also a complete description of this case. The identities and estimates required to obtain these results would appear to be of interest in their own right.

Section 3 presents an analysis of boundary behaviour for the composed Herglotz function $F \circ G$. The boundary values of $F \circ G$ are important both to determine the spectral density for the corresponding measure in the case of a.c. measure and, in the case of real boundary values, to determine the value distribution of the composed function on the real line. The main problem of this analysis is to obtain some control of the angle at which $G(\lambda + i\varepsilon)$ approaches the real line in the limit as $\varepsilon \to 0^+$, if $G_+(\lambda)$ is real. We make use of general results on the boundary behaviour of analytic functions, as well as local real analysis for a class of functions which are measurable but which may exhibit no local continuity and need not be locally bounded. The main result of this analysis is presented in Theorem 3.2, and depends on an investigation of the consequences of approximate monotonicity, constancy, and oscillation.

2. Coefficient b of Composed Herglotz Functions

Let $G$ be an arbitrary Herglotz function with representation given by Equation (1), and let $F$ be a Herglotz function

\[
F(z) = a_F + b_F z + \int_{\mathbb{R}} \left\{ \frac{1}{t - z} - \frac{t}{t^2 + 1} \right\} d\mu(t). \tag{3}
\]

Throughout this section, we define the function $c(s)$ ($s \in \mathbb{R}^+$) in terms of the spectral measure $d\rho$ for $G$ by

\[
c(s) = \int_{\mathbb{R}} \frac{1}{s^2 + t^2}\, d\rho(t). \tag{4}
\]


The following lemma provides a simple criterion for a Herglotz measure to be finite.

LEMMA 2.1. With $c(s)$ given by Equation (4), the measure $d\rho$ is finite if and only if $a := \lim_{s \to \infty} c(s)s^2 < \infty$, and infinite if and only if $a = \infty$.

Proof. Suppose first that $c(s)s^2 \to a < \infty$ as $s \to \infty$. Then if $d\rho$ is an infinite measure there exists $N_0 \in \mathbb{N}$ such that $\rho((-N_0, N_0]) > a$. Hence

\[
a = \lim_{s \to \infty} \int_{\mathbb{R}} \frac{s^2}{s^2 + t^2}\, d\rho(t)
\ge \lim_{s \to \infty} \int_{-N_0}^{N_0} \frac{s^2}{s^2 + t^2}\, d\rho(t)
= \rho\bigl((-N_0, N_0]\bigr) > a,
\]

which is a contradiction. So $d\rho$ is a finite measure in this case.

Conversely, if $d\rho$ is finite then

\[
\lim_{s \to \infty} c(s)s^2 = \lim_{s \to \infty} \int_{\mathbb{R}} \frac{s^2}{s^2 + t^2}\, d\rho(t) = \rho(\mathbb{R}) < \infty,
\]

where we have used the Lebesgue dominated convergence theorem. The condition for $d\rho$ to be infinite follows easily. □

Note that, in terms of the Herglotz function $G$, the condition for finite measure $d\rho$ can be written

\[
\lim_{s \to \infty} \bigl( s \operatorname{Im} G(is) - b_G s^2 \bigr) := a < \infty.
\]

The following theorem provides a complete characterization of the term $bz$ in the representation for $F \circ G$.

THEOREM 2.2. Let $b_{F \circ G}\, z$ be the term linear in $z$ in the Herglotz representation for the composed function $F \circ G$.

Then if either $b_G \ne 0$ or if the measure $d\rho$ is infinite, we have $b_{F \circ G} = b_F b_G$. If $b_G = 0$ and $d\rho$ is a finite measure, then $b_{F \circ G} = \frac{1}{a}\mu\{t_0\}$, where $a = \rho(\mathbb{R})$ and

\[
t_0 = a_G - \int_{\mathbb{R}} \frac{t}{1 + t^2}\, d\rho(t).
\]

(Observe that if $t_0$ is not a discrete point of the measure $d\mu$ then $b_{F \circ G} = b_F b_G$. Also note that if $d\rho$ is the zero measure then $b_G \ne 0$.)

Proof. The constant $b_{F \circ G}$ is given by

\[
b_{F \circ G} = \lim_{s \to \infty} \frac{1}{s} \operatorname{Im} F\bigl(G(is)\bigr).
\]

From the representations (1), (3) of the Herglotz functions $G$, $F$, we have

\[
\begin{aligned}
\frac{1}{s} \operatorname{Im} F\bigl(G(is)\bigr) ={} & b_F b_G + b_F c(s)
+ b_G \int_{\mathbb{R}} \frac{1}{[t - A(s)]^2 + s^2 [b_G + c(s)]^2}\, d\mu(t) \\
& + \int_{\mathbb{R}} \frac{c(s)}{[t - A(s)]^2 + s^2 [b_G + c(s)]^2}\, d\mu(t),
\end{aligned} \tag{5}
\]

where the function $A(s)$ is defined by

\[
A(s) = a_G + \int_{\mathbb{R}} \frac{t (1 - s^2)}{(s^2 + t^2)(1 + t^2)}\, d\rho(t) \tag{6}
\]

and $c(s)$ is given in Equation (4).

Note that $c(s) \to 0$ as $s \to \infty$, since we can apply the Lebesgue dominated convergence theorem with

\[
\frac{1}{s^2 + t^2} \le \frac{1}{1 + t^2} \quad \text{for } s \ge 1,
\]

and this bound is integrable with respect to the measure $d\rho$. Hence Equation (5) implies that

\[
\begin{aligned}
b_{F \circ G} ={} & b_F b_G + \lim_{s \to \infty} b_G \int_{\mathbb{R}} \frac{1}{[t - A(s)]^2 + s^2 [b_G + c(s)]^2}\, d\mu(t) \\
& + \lim_{s \to \infty} \int_{\mathbb{R}} \frac{c(s)}{[t - A(s)]^2 + s^2 [b_G + c(s)]^2}\, d\mu(t),
\end{aligned} \tag{7}
\]

provided both these limits separately exist.

In the special case that $d\rho$ is the zero measure (so that $b_G \ne 0$) we have $A(s) = a_G$, $c(s) = 0$, and

\[
b_{F \circ G} = b_F b_G + \lim_{s \to \infty} b_G \int_{\mathbb{R}} \frac{1}{(t - a_G)^2 + s^2 b_G^2}\, d\mu(t).
\]

It is straightforward to verify that

\[
\frac{1}{(t - a_G)^2 + s^2 b_G^2} \le \frac{\text{const.}}{1 + t^2} \qquad (t \in \mathbb{R},\ s \ge 1),
\]

which is integrable with respect to $d\mu$, and hence the Lebesgue dominated convergence theorem gives $b_{F \circ G} = b_F b_G$ in this case.

We now consider the case in which $b_G = 0$ and $d\rho$ is finite. Then, from (7) we obtain

\[
b_{F \circ G} = \lim_{s \to \infty} \int_{\mathbb{R}} \frac{1}{\frac{1}{c(s)} [t - A(s)]^2 + s^2 c(s)}\, d\mu(t). \tag{8}
\]

Note that

\[
\left| \frac{t (1 - s^2)}{(s^2 + t^2)(1 + t^2)} \right| \le \frac{|t| s^2}{(s^2 + t^2)(1 + t^2)} \le 1, \qquad s \ge 1,
\]

so that with finite measure $d\rho$ an application of the Lebesgue dominated convergence theorem gives

\[
\lim_{s \to \infty} A(s) = a_G - \int_{\mathbb{R}} \frac{t}{1 + t^2}\, d\rho(t) := t_0. \tag{9}
\]

By Lemma 2.1 we also have $\lim_{s \to \infty} c(s)s^2 = \rho(\mathbb{R}) = a \ne 0$, from which we may deduce that

\[
\frac{t^2 + 1}{\frac{1}{c(s)} [t - A(s)]^2 + c(s)s^2} \le \text{const.},
\]

for all $t \in \mathbb{R}$ and for $s$ sufficiently large. Since $1/(1 + t^2)$ is integrable with respect to $d\mu$, we can apply the Lebesgue dominated convergence theorem to the limit on the right-hand side of Equation (8). Noting that $c(s) \to 0$ and $A(s) \to t_0$, we deduce that if $t_0$ is not a discrete point of $d\mu$, then $b_{F \circ G} = 0$. If $t_0$ is a discrete point of $d\mu$, then

\[
b_{F \circ G} = \mu(\{t_0\}) \lim_{s \to \infty} \left[ \frac{1}{\frac{1}{c(s)} [t_0 - A(s)]^2 + c(s)s^2} \right].
\]

From (6) and (9), applying the Schwarz inequality we have the estimate

\[
\bigl[ t_0 - A(s) \bigr]^2 = \left[ \int_{\mathbb{R}} \frac{t}{s^2 + t^2}\, d\rho(t) \right]^2 \le c(s) \int_{\mathbb{R}} \frac{t^2}{s^2 + t^2}\, d\rho(t),
\]

implying that

\[
\lim_{s \to \infty} \frac{1}{c(s)} \bigl[ t_0 - A(s) \bigr]^2 = 0.
\]

Hence

\[
b_{F \circ G} = \mu(\{t_0\}) \lim_{s \to \infty} \frac{1}{c(s)s^2} = \frac{1}{a}\, \mu(\{t_0\}),
\]

which completes the case $b_G = 0$ with $d\rho$ finite.

Now consider the case $b_G = 0$ with $d\rho$ an infinite measure. Here we shall make use of an inequality for $A(s)$, which may be verified by the Schwarz inequality. For $s \ge 1$, we have

\[
\begin{aligned}
\bigl| A(s) - a_G \bigr| &\le s \left\{ \int_{\mathbb{R}} \frac{1}{s^2 + t^2}\, d\rho(t) \right\}^{1/2} \left\{ \int_{\mathbb{R}} \frac{s^2 t^2}{(s^2 + t^2)(1 + t^2)^2}\, d\rho(t) \right\}^{1/2} \\
&\le \text{const.}\; s \sqrt{c(s)}.
\end{aligned} \tag{10}
\]

We now verify the bound, for all $s$ sufficiently large,

\[
\frac{t^2 + 1}{\frac{1}{c(s)} [t - A(s)]^2 + c(s)s^2} \le \text{const.} \qquad (t \in \mathbb{R}). \tag{11}
\]

To do this, note first of all that $(t^2 + 1)/((t - a_G)^2 + 1)$ is bounded, so that to verify (11) it is sufficient to show that

\[
\frac{(t - a_G)^2 + 1}{\frac{1}{c(s)} [t - A(s)]^2 + c(s)s^2} \le \text{const.} \qquad (t \in \mathbb{R}).
\]

Here it is equivalent to replace $t \in \mathbb{R}$ by $t + a_G$, so that it remains to verify that

\[
\frac{t^2 + 1}{\frac{1}{c(s)} [t - (A(s) - a_G)]^2 + c(s)s^2} \le \text{const.} \tag{12}
\]

Given a positive constant $K_1$, we consider the discriminant $D$ of the quadratic expression in $t$ given by

\[
\frac{1}{c(s)} \bigl[ t - \bigl( A(s) - a_G \bigr) \bigr]^2 + c(s)s^2 - K_1 t^2.
\]

Since

\[
D(s) = 4K_1 s^2 c(s) + \frac{4K_1 (A(s) - a_G)^2}{c(s)} - 4s^2,
\]

where $c(s) \to 0$ as $s \to \infty$, we can use (10) to show that $D(s) < 0$ for all $s$ sufficiently large and for any (fixed) value of $K_1$ sufficiently small. Hence the quadratic is strictly positive for all $s$ sufficiently large, for this value of $K_1$. Since also $c(s)s^2 \to \infty$ in this case, it follows that (12) holds and (11) is verified. We can now apply the Lebesgue dominated convergence theorem to Equation (8) to deduce that $b_{F \circ G} = 0$ in the case $b_G = 0$ with $d\rho$ infinite.

In the remaining case, with $b_G > 0$, similar arguments may be used to confirm the bounds that allow the Lebesgue dominated convergence theorem to be applied to the two integrals on the right-hand side of Equation (7). In this case, both integrals are easily found to converge to zero, and the result $b_{F \circ G} = b_F b_G$ in this case completes the proof of Theorem 2.2. □

3. Boundary Values of Composed Herglotz Functions

Given a Herglotz function $F$, the boundary value $F_+(\lambda)$ is defined for (Lebesgue) almost all $\lambda \in \mathbb{R}$ by $F_+(\lambda) = \lim_{\varepsilon \to 0^+} F(\lambda + i\varepsilon)$, and is the limit of $F(z)$ as $z$ approaches the real axis in a direction at right angles to $\mathbb{R}$. Such limits may not be appropriate while considering the boundary behaviour of a composed Herglotz function $F \circ G$, since if $\lambda \in \mathbb{R}$ is a point such that $G_+(\lambda)$ exists and is real then, in the limit $\varepsilon \to 0^+$, $G(\lambda + i\varepsilon)$ will not approach the point $G_+(\lambda)$ at right angles to $\mathbb{R}$ in general, and it is not clear how the limit of $F(G(\lambda + i\varepsilon))$ should be evaluated. To circumvent this problem we need boundary limits in more general regions.

By a wedge-shaped region with vertex $\lambda \in \mathbb{R}$, we shall mean a set of the form $\{z \in \mathbb{C}^+,\ \alpha < \arg(z - \lambda) < \beta\}$, with $\alpha < \beta < \pi$. By the wedgy (nontangential) limit of $F$ at the point $\lambda \in \mathbb{R}$ we shall mean the limit as $z$ approaches $\lambda$ along a simple curve ending at $\lambda$, and lying entirely in a wedge-shaped region with vertex at $\lambda$. From the theory of analytic functions [4] it is known that if the limit of $F$ at $\lambda$ exists along a simple curve, then the limit also exists along any other simple curve ending at $\lambda$ and contained in a wedge-shaped area with vertex $\lambda$. Moreover these various limits agree, for given $\lambda$, and exist for almost all $\lambda \in \mathbb{R}$. We will denote by $F_w(\lambda)$ the wedgy limit of $F$ at $\lambda \in \mathbb{R}$; thus, $F_w(\lambda) = F_+(\lambda)$ whenever either limit exists.

Boundary behaviour of Herglotz functions can be highly irregular. For example, $F_+(\lambda)$ may exhibit such discontinuities that $F_+$ assumes every real value in every subinterval of $\mathbb{R}$ [12]. To describe such possible boundary behaviour, we shall need a number of ideas and results drawn from local real analysis. The following definitions will be found useful.

DEFINITION 3.1. Let $f$ be a Lebesgue measurable real valued function, finite almost everywhere on $\mathbb{R}$, and let $I_f = \{x \in \mathbb{R};\ f(x) \text{ is finite}\}$. Then $f$ is said to be approximately right monotonic increasing at $x \in I_f$ provided that

\[
\lim_{h \to 0^+} \bigl| \{t \in [x, x + h] \cap I_f : f(t) > f(x)\} \bigr| / h = 1,
\]

where $|\cdot|$ stands for Lebesgue measure. Approximately left monotonic increasing is defined similarly, and a function which is both right and left approximately monotonic increasing at $x$ is said to be approximately monotonic increasing at $x$. Approximately monotonic decreasing is defined in a similar way.

If for some $x$, with $f(x) = y$, $x$ is a point of density for the set $f^{-1}(\{y\})$, then $f$ is said to be approximately constant at $x$. In that case, the set $f^{-1}(\{y\})$ must have strictly positive measure.

The function $f$ is said to be approximately oscillatory (to the right) at a point $x \in I_f$, if there exist sequences $\{h_n\}$, $\{h'_n\}$ of positive numbers, approaching zero, such that

\[
\bigl| \{t \in [x, x + h_n] \cap I_f;\ f(t) < f(x)\} \bigr| / h_n
\]

and

\[
\bigl| \{t \in [x, x + h'_n] \cap I_f;\ f(t) > f(x)\} \bigr| / h'_n
\]

both approach 1 in the limit $n \to \infty$. A similar definition applies to the left of the point $x$, and a function which is approximately oscillatory to both the right and left will be described as approximately oscillatory.

Measurable functions can be categorized in terms of the above definitions, as follows.

THEOREM 3.1. Let $f$ be real-valued, measurable and finite almost everywhere. Then, at almost all $x \in \mathbb{R}$, $f$ is either

(i) approximately monotonic increasing,

(ii) approximately monotonic decreasing,

(iii) approximately constant, or

(iv) approximately oscillatory.

Proof. See [9]. □

We can use Theorem 3.1 to prove the following result for Herglotz functions.

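THEOREM 3.2. Let $F$, $G$ be arbitrary Herglotz functions, and denote by $I_G$, $\tilde{I}_G$ the sets $I_G = \{\lambda \in \mathbb{R} : G_+(\lambda) \text{ exists and } G_+(\lambda) \in \mathbb{R}\}$ and $\tilde{I}_G = \{\lambda \in \mathbb{R} : G_+(\lambda) \text{ exists and } \operatorname{Im} G_+(\lambda) > 0\}$, where $G_+(\lambda) = \lim_{\varepsilon \to 0^+} G(\lambda + i\varepsilon)$.

Then at almost all $\lambda \in \mathbb{R}$ we have

\[
\lim_{\varepsilon \to 0^+} (F \circ G)(\lambda + i\varepsilon) = (F \circ G)_+(\lambda) =
\begin{cases}
F_+\bigl(G_+(\lambda)\bigr), & \lambda \in I_G, \\
F\bigl(G_+(\lambda)\bigr), & \lambda \in \tilde{I}_G.
\end{cases} \tag{13}
\]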

Proof. The conclusion of Theorem 3.2 is easily verified in the case $\lambda \in \tilde{I}_G$, and follows from the continuity of $F$ at $G_+(\lambda)$. It remains to consider the case $\lambda \in I_G$.

Let $A$ be the set $A = \{\lambda \in I_G;\ F_w(G_+(\lambda)) \text{ exists}\}$. Thus $F_+(G_+(\lambda))$ exists for $\lambda \in A$.

If $B \subset \mathbb{R}$ is the set of points at which $F_w$ fails to exist, then $I_G \setminus A = G_+^{-1}(B)$. But $|B| = 0$, implying that $|I_G \setminus A| = |G_+^{-1}(B)| = 0$, since the inverse image of any set of measure zero will have measure zero.

We first verify (13) for $\lambda \in I_G$ with the simplifying assumption that $G$ has real boundary value almost everywhere; this will be so if and only if the measure $d\rho$ is purely singular. Then $|A^c| = 0$ and we can define a function $g$, real almost everywhere, by $g(\lambda) = G_+(\lambda)$. From Theorem 3.1, at almost all $\lambda \in A$, $g$ is approximately monotonic increasing or decreasing, approximately constant, or approximately oscillatory.

The case of approximate constancy may be ruled out. If $\lambda_0 \in A$ is a point of approximate constancy of $g$ then $|\{\lambda \in I_G : G_+(\lambda) = g(\lambda_0)\}| > 0$, which contradicts the fact that the inverse image of any singleton set must have zero Lebesgue measure.

Now suppose that $\lambda$ is a point of $A$ at which $g$ is approximately monotonic increasing or decreasing. By considering the function $\log(G(z) - g(\lambda))$, one may verify the result (see [9])

\[
\operatorname{Arg}\bigl(G(\lambda + i\varepsilon) - g(\lambda)\bigr) = \int_{\mathbb{R}} \frac{\varepsilon\, \xi(t)\, dt}{(t - \lambda)^2 + \varepsilon^2}, \tag{14}
\]

where $\xi(t)$ is almost everywhere the characteristic function of the set $\{t \in I_G : g(t) < g(\lambda)\}$. On the right-hand side of (14), the limit as $\varepsilon \to 0^+$ may be identified (see [11]) with

\[
\lim_{h \to 0^+} \pi \int_{\lambda - h}^{\lambda + h} \frac{\xi(t)}{2h}\, dt,
\]

in the sense that if either limit exists then both limits exist and are equal. Hence

\[
\lim_{\varepsilon \to 0^+} \operatorname{Arg}\bigl(G(\lambda + i\varepsilon) - g(\lambda)\bigr)
= \frac{\pi}{2} \lim_{h \to 0^+} \bigl| \{t \in [\lambda - h, \lambda + h] \cap I_G : g(t) < g(\lambda)\} \bigr| / h, \tag{15}
\]

so that the limit on the left-hand side is π/2.Hence, in the limit as ε → 0+, G(λ + iε) approaches g(λ) along a path con-

tained in a wedge-shaped region with vertex at g(λ). Since λ ∈ A, we know thatFw(G+(λ)) exists, and it follows that

(F ◦ G)+(λ) = F+(G+(λ)

).

If, on the other hand,

Arg(G(λ + ih1) − g(λ)

)� π

2+ ε

and

Arg(G(λ + ih2) − g(λ)

)� π

2− ε,

then by continuity in h′ of Arg(G(λ+ih′)−g(λ)) we can find h′ between h1 and h2

such that (19) holds, with again 0 < h′ < h. Since h can be taken arbitrarily small,we can construct a decreasing sequence {h′

n} of positive numbers, converging tozero, and such that G(λ + ih′

n) lies in the wedge shaped region with vertex g(λ)

and angle 2ε, defined by the inequalities

π

2− ε < Arg

(G(λ + iε′

n) − g(λ))<

π

2+ ε.

Since λ ∈ A and Fw(G+(λ)) exists, it follows again that

(F ◦ G)+(λ) = F+(G+(λ)

).

Finally, consider the general case in which G takes boundary values having strictlypositive imaginary part on a set of positive Lebesgue measure. As noted earlier,for λ ∈ IG we have (F ◦ G)+(λ) = F(G+(λ)). Define a function g, real almosteverywhere, by

g(λ) ={

G+(λ), λ ∈ IG,0, λ ∈ IG.

Page 336: Mathematical Physics, Analysis and Geometry - Volume 7

SPECTRAL THEORY OF HERGLOTZ FUNCTIONS 343

Again by Theorem 3.1, at almost all λ, g is approximately monotonic increasingor decreasing, approximately constant, or approximately oscillatory. At λ ∈ IG, g

can be approximately constant only if G+(λ) = 0. However, G+(λ) = 0 only on aset of λ values having Lebesgue measure zero, so that for almost all λ ∈ IG we canrule out the possibility that g is approximately constant.

Denote by Id the set of λ ∈ A such that λ is a point of density of A. Then IG\Id

has Lebesgue measure zero, and for λ ∈ Id we have

Arg(G(λ + iε) − g(λ)

) =∫

R

εξ(t) dt

(t − λ)2 + ε2,

where now

ξ(t) = 1

πIm log

(G+(t) − g(λ)

),

for almost all t .Next suppose λ ∈ A is a point at which g is approximately oscillatory. Given

any ε with 0 < ε < π/2, we first fix δ, β, with 0 < δ < β, such that∫ β

δ

1

1 + t2dt >

π

2− ε.

Next, note that the definition of approximately oscillatory implies that, for anyh > 0, we can find h1 satisfying 0 < h1 < h, such that∣∣{t ∈ [λ, λ + βh1] ∩ IG : g(t) < g(λ)

}∣∣ > (β − δ)h1. (16)

Proceeding as before from Equation (14), we have

Arg(G(λ + ih1) − g(λ)

)�

∫ λ+βh1

λ

h1ξ(t) dt

(t − λ)2 + h21

�∫ λ+βh1

λ+δh1

h1 dt

(t − λ)2 + h21

=∫ β

δ

1

1 + t2dt >

π

2− ε, (17)

since on the interval λ < t < λ + βh1 we have ξ(t) = 1 for a set of points havingmeasure at least (β − δ)h1 and h1/((t − λ)2 + h2

1) is a decreasing function of t , sothat the minimum value is attained when integration is carried out on the intervalλ + δh1 < t < λ + βh1. Similarly, we can find h2 satisfying 0 < h2 < h, such that∣∣{t ∈ [λ, λ + βh2] ∩ IG : g(t) > g(λ)

}∣∣ > (β − δ)h2.

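In that case, with $\chi_0$ the characteristic function of $\{t \notin I_G : \xi(t) = 0\}$, we have

\[
\begin{aligned}
\operatorname{Arg}\bigl(G(\lambda + i h_2) - g(\lambda)\bigr)
&= \int_{\mathbb{R}} \frac{h_2\,\xi(t)\,dt}{(t - \lambda)^2 + h_2^2}
\le \pi - \int_{\xi(t) = 0} \frac{h_2\,dt}{(t - \lambda)^2 + h_2^2} \\
&\le \pi - \int_{\lambda}^{\lambda + \beta h_2} \frac{h_2\,\chi_0(t)\,dt}{(t - \lambda)^2 + h_2^2}
\le \pi - \int_{\lambda + \delta h_2}^{\lambda + \beta h_2} \frac{h_2\,dt}{(t - \lambda)^2 + h_2^2} \\
&< \pi - \Bigl(\frac{\pi}{2} - \varepsilon\Bigr) = \frac{\pi}{2} + \varepsilon.
\end{aligned} \tag{18}
\]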

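Having found $h_1$, $h_2$ to satisfy (17) and (18) respectively, we now define $h'$ by

\[
h' = h_1 \quad\text{if}\quad \operatorname{Arg}\bigl(G(\lambda + i h_1) - g(\lambda)\bigr) < \frac{\pi}{2} + \varepsilon,
\]

and

\[
h' = h_2 \quad\text{if}\quad \operatorname{Arg}\bigl(G(\lambda + i h_2) - g(\lambda)\bigr) > \frac{\pi}{2} - \varepsilon.
\]

Then in either case we have

\[
\frac{\pi}{2} - \varepsilon < \operatorname{Arg}\bigl(G(\lambda + i h') - g(\lambda)\bigr) < \frac{\pi}{2} + \varepsilon. \tag{19}
\]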
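The change of variables behind the equality in (17), and used again in (18), can be checked numerically: substituting t = λ + hs turns the integral of h/((t − λ)² + h²) over [λ + δh, λ + βh] into the integral of 1/(1 + s²) over [δ, β], which does not depend on h. The sketch below uses a plain midpoint rule; the function names and the sample values of λ, δ, β are assumptions chosen only for illustration.

```python
import math

def scaled_poisson_integral(lam, h, delta, beta, steps=20000):
    """Midpoint-rule approximation of the integral of
    h / ((t - lam)^2 + h^2) over [lam + delta*h, lam + beta*h]."""
    a = lam + delta * h
    width = (beta - delta) * h / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * width
        total += h / ((t - lam) ** 2 + h ** 2)
    return total * width

def closed_form(delta, beta):
    # Value of the integral of 1/(1 + s^2) over [delta, beta].
    return math.atan(beta) - math.atan(delta)
```

Evaluating `scaled_poisson_integral` for several values of `h` with the same `delta` and `beta` should give values agreeing with `closed_form(delta, beta)` up to quadrature error, which is why the bounds in (17) and (18) hold uniformly as h tends to zero.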

Note that $\xi(t)$ satisfies

\[
\xi(t) =
\begin{cases}
1, & t \notin I_G,\ g(t) < g(\lambda), \\
0, & t \notin I_G,\ g(t) > g(\lambda),
\end{cases}
\]

and $0 < \xi(t) < 1$ for $t \in I_G$.

If $\lambda$ is a point at which $g$ is approximately monotonic increasing or decreasing, then Equation (15) holds with $g$ replaced by the function $g$ defined above for the general case, and as before we may deduce that $(F \circ G)_+(\lambda) = F_+(G_+(\lambda))$. On the other hand, if $g$ is approximately oscillatory at $\lambda$, we can follow the same argument as before to construct a decreasing sequence $\{h'_n\}$ of positive numbers, converging to zero, such that

\[
\frac{\pi}{2} - \varepsilon < \operatorname{Arg}\bigl(G(\lambda + i h'_n) - g(\lambda)\bigr) < \frac{\pi}{2} + \varepsilon,
\]

where $\varepsilon$ in the interval $0 < \varepsilon < \pi/2$ is arbitrary. It follows again in this case that

\[
(F \circ G)_+(\lambda) = F_+\bigl(G_+(\lambda)\bigr),
\]

and this completes the proof of Theorem 3.2. □

References

1. Akhiezer, N. I. and Glazman, I. M.: Theory of Linear Operators in Hilbert Space I, Pitman, London, 1981.

2. Breimesser, S. V.: Asymptotic value distribution for solutions of the Schrödinger equation, PhD Thesis, University of Hull, 2001.

3. Breimesser, S. V. and Pearson, D. B.: Asymptotic value distribution for solutions of the Schrödinger equation, Math. Phys., Anal. and Geometry 3 (2000), 385–403.

4. Caratheodory, C.: Theory of Functions of a Complex Variable II, Chelsea, New York, 1954.

5. Christodoulides, Y. T.: Spectral theory of Herglotz functions and their compositions, and the Schrödinger equation, PhD Thesis, University of Hull, 2001.

6. Christodoulides, Y. T. and Pearson, D. B.: Generalised value distribution for Herglotz functions, and spectral theory, Math. Phys., Anal. and Geometry (2003) (to appear).

7. Coddington, E. A. and Levinson, N.: Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.

8. Eastham, M. S. P. and Kalf, H.: Schrödinger-type Operators with Continuous Spectra, Pitman, London, 1982.

9. Elsken, T., Pearson, D. B. and Robinson, P. M.: Approximate monotonicity: theory and applications, J. London Math. Soc. 53(2) (1996), 489–502.

10. Herglotz, G.: Über Potenzreihen mit positivem, reellem Teil im Einheitskreis, Sächs. Acad. Wiss. Leipzig 63 (1911), 501–511.

11. Pearson, D. B.: Value distribution and spectral theory, Proc. London Math. Soc. 68(3) (1994), 127–144.

12. Pearson, D. B.: Quantum Scattering and Spectral Theory, Academic Press, London, 1988.


Mathematical Physics, Analysis and Geometry 7: 347–349, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Errata: “Universality in Orthogonal and Symplectic Invariant Matrix Models with Quartic Potential” *

ALEXANDRE STOJANOVIC
Institut Mathématique de Jussieu, Physique mathématique et Géométrie, Université Paris 7, case 7012, 2 Place Jussieu, 75251 Paris Cedex 05, France. e-mail: [email protected]

* Math. Phys. Anal. Geom. 3(4) (2000), 339–373.

Abstract. The errata mainly concern the last computations for the universality of the local statistics of eigenvalues at the edge of the spectrum in parts (iii) of Theorems 2.3 and 2.4.

1. Corrections

• Page 343, (1.20): there is a factor 1/2 in front of the right-hand side.

• Page 346, (2.10): It can be simplified to

\[
g_{jk} = a_{jk} - \sum_{\ell = k - d}^{2n - 1} \; \sum_{m = 2n - d}^{2n - 1} s_{jm}\, t_{m\ell}\, a_{\ell k}.
\]

• Page 347, before Theorem 2.3: The definition of $b(x, y)$ must be

\[
b_\beta(x, y) = \frac{1}{2}\,\mathrm{Ai}(x)\Bigl(c_\beta - \int_{y}^{+\infty} \mathrm{Ai}(t)\,dt\Bigr),
\quad\text{with}\quad
c_\beta =
\begin{cases}
1, & \text{if } \beta = 1, \\
0, & \text{if } \beta = 4.
\end{cases}
\]

The previous definition corresponds to $c_\beta = 1/2$, which is wrong.

• Page 347, Theorem 2.3, part (ii): $-s'(x - y)$ must be replaced by $s'(x - y)$ in the expression of the matrix kernel $\tau_1(x, y)$.

• Page 348, Theorem 2.3, part (iii):

\[
\Bigl(Z_j - \frac{w}{c_j n^{2/3}},\; Z_j + \frac{v}{c_j n^{2/3}}\Bigr)
\]

must be replaced by

\[
\Bigl(Z_j - \frac{w_j}{c_j n^{2/3}},\; Z_j + \frac{v_j}{c_j n^{2/3}}\Bigr),
\]

with $w_2 = w$, $v_2 = v$, $w_1 = -v$, and $v_1 = -w$, because $\operatorname{sign}(c_j) = (-1)^j$. $\theta_1(x, y)$ is now only defined by the first expression, in which we replace $b(x, y)$ by $b_1(x, y)$; the limit is now taken independently of the parity of $n$, and the result holds for $j = 1$ and $j = 2$. Moreover, we have to multiply by $(-1)^j$ the matrix kernel $\theta_1(x_p, x_q)$ in the expression of the limiting scaled correlation functions.

• Page 349, Theorem 2.4, part (ii): $-s'(2(x - y))$ must be replaced by $s'(2(x - y))$ in the expression of the matrix kernel $\tau_4(x, y)$.

• Page 349, Theorem 2.4, part (iii): We modify the expression of the interval as in the case $\beta = 1$. Replace $b(x, y)$ by $b_4(x, y)$ in the expression of $\theta_4(x, y)$ and multiply it by 1/2.

• Page 350, Theorem 2.4, part (iii): We have to multiply by $(-1)^j$ the matrix kernel $\theta_4(x_p, x_q)$ in the expression of the limiting scaled correlation functions.

• Page 363: In fact, we have

\[
\lim_{n \to \infty} \frac{1}{c_j n^{2/3}}\, K_n\Bigl(Z_j + \frac{x}{c_j n^{2/3}},\; Z_j + \frac{y}{c_j n^{2/3}}\Bigr) = (-1)^j a(x, y).
\]

• Page 367, last line: We have $(-1)^{\sigma_0 + j}$ in the numerator of the fraction.

• Page 369, last line: replace $(-1)^{n+p}$ by $(-1)^{2n+p}$.

• Page 371, bottom: In fact $c'$ has the same value as $c$ if $n$ is odd.

2. Explanations

We give explanations for the mistakes at the end of the computations for the universality of the local statistics of eigenvalues at the edge of the spectrum. The mistakes in the proof of the computations of the terms depending on $H_n$ and $G_n$ have a common origin: if $\|\cdot\|_\infty := \|\cdot\|_{L^\infty((Z_j - \delta,\, Z_j + \delta)^2)}$, then we have said that

\[
\|\psi_{n+p} \otimes (\varepsilon \ast \psi_{n+q}) - \psi_{n+q} \otimes (\varepsilon \ast \psi_{n+p})\|_\infty = O(n^{-1/2}),
\]

\[
\|(\varepsilon \ast \psi_{n+p}) \otimes (\varepsilon \ast \psi_{n+q}) - (\varepsilon \ast \psi_{n+q}) \otimes (\varepsilon \ast \psi_{n+p})\|_\infty = O(n^{-7/6}),
\]

which is wrong. There are some contributions at the level of constants of integration, which invalidate the arguments based on antisymmetry. Thus the estimates on pages 363–364 for $\varepsilon_\mu \ast G_n$, $\varepsilon_\lambda \ast \varepsilon_\mu \ast G_n$ and for the part called $\beta_n$ in the expression of $H_n$ are false. There are now contributions from $\varepsilon_\mu \ast G_n$, $\varepsilon_\lambda \ast \varepsilon_\mu \ast G_n$, $\varepsilon_\mu \ast \beta_n$ and $\varepsilon_\lambda \ast \varepsilon_\mu \ast \beta_n$, and the method of computation is completely different at this level. We now need equivalents of the coefficients $g_{jk}$ and not only estimates, which requires more computations. The details are given in the revision (www.physik.uni-bielefeld.de/bibos/preprints, 02-07-098, BiBoS, Bielefeld, May 2002) of the preprint [13]. The method consists in doing what we have done on pages 367–368 for the computation of the contribution of the term $\alpha_n(\lambda)$. First, we have to give explicitly the values of the coefficients $g_{jk}$ as functions of the coefficients $a_{jk}$, $a'_{jk}$ and $c_{jk}$. Secondly, we have to give the equivalents of these coefficients $g_{jk}$; these coefficients are rational functions of the coefficients $a_{jk}$, $a'_{jk}$ and $c_{jk}$. The denominators are factors from the expression of $\det D$; equivalents are given on pages 370–372. We just have to compute the equivalents of the numerators to get the results. We give the results for the new versions of Lemmas 4.3 and 4.4:

• For $\beta = 1$, $n$ even, we have

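\[
g_{n-3,\,n-2} = \frac{n\sqrt{-t}}{4}\bigl(\sqrt{1 + u} - \sqrt{1 - u}\bigr) + O(n^{2/3}),
\]

\[
g_{n-1,\,n-2} = \frac{n\sqrt{-t}}{2}\sqrt{1 - u} + O(n^{2/3}).
\]

• For $\beta = 1$, $n$ odd, we have

\[
g_{n-2,\,n-3} = \frac{n}{4}\sqrt{-t}\,\bigl(-\sqrt{1 + u} + \sqrt{1 - u}\bigr) + O(n^{2/3}),
\]

\[
g_{n-1,\,n-2} = O(n^{2/3}).
\]

• For $\beta = 4$, we have

\[
g_{2n,\,2n+1} = \frac{n}{4}\sqrt{-2t}\,\Bigl(\sqrt{1 + \sqrt{2}u} + \sqrt{1 - \sqrt{2}u}\Bigr)
\times \left(\frac{\sqrt{1 + \sqrt{2}u} - (-1)^n (1 - 2u^2)^{1/4}}{\sqrt{1 + \sqrt{2}u} + (-1)^n (1 - 2u^2)^{1/4}}\right) + O(n^{2/3}),
\]

\[
\begin{aligned}
g_{2n+1,\,2n+2} = \frac{n}{4}\sqrt{-2t}\,\Biggl[ &\Bigl(\sqrt{1 + \sqrt{2}u} + \sqrt{1 - \sqrt{2}u}\Bigr) \\
&- \Bigl(\sqrt{1 + \sqrt{2}u} - \sqrt{1 - \sqrt{2}u}\Bigr)
\times \left(\frac{\sqrt{1 + \sqrt{2}u} - (-1)^n (1 - 2u^2)^{1/4}}{\sqrt{1 + \sqrt{2}u} + (-1)^n (1 - 2u^2)^{1/4}}\right) \Biggr] + O(n^{2/3}).
\end{aligned}
\]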
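As a reading aid only, the leading coefficients above can be evaluated numerically. The sketch below assumes t < 0 and |u| small enough that every square root is real (|u| ≤ 1 for β = 1, |√2 u| ≤ 1 for β = 4); these ranges and all function names are assumptions made for the illustration, not statements from the errata. It shows two features visible in the formulas: the β = 1 leading coefficients for n even and n odd are opposite, and the quotient appearing in the β = 4 formulas is inverted when the parity of n changes.

```python
import math

def beta1_even_coeff(t, u):
    """Coefficient of n in the leading term of g_{n-3,n-2} (beta = 1, n even)."""
    return math.sqrt(-t) / 4 * (math.sqrt(1 + u) - math.sqrt(1 - u))

def beta1_odd_coeff(t, u):
    """Coefficient of n in the leading term of g_{n-2,n-3} (beta = 1, n odd)."""
    return math.sqrt(-t) / 4 * (-math.sqrt(1 + u) + math.sqrt(1 - u))

def beta4_quotient(n, u):
    """Quotient shared by both beta = 4 formulas, for a given parity of n."""
    root = math.sqrt(1 + math.sqrt(2) * u)
    quartic = (1 - 2 * u * u) ** 0.25
    sign = -1 if n % 2 else 1  # (-1)^n
    return (root - sign * quartic) / (root + sign * quartic)
```

For instance, with t = −1 and u = 0.3 the two β = 1 coefficients sum to zero, and `beta4_quotient(2, 0.3) * beta4_quotient(3, 0.3)` equals 1 up to rounding.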


Mathematical Physics, Analysis and Geometry 7: 351–352, 2004.

Contents of Volume 7 (2004)

Volume 7 No. 1 2004

CRAIG ROBERTS / Relating Thomas–Whitehead Projective Connections by a Gauge Transformation 1–8

IVAN G. AVRAMIDI / Heat Kernel Asymptotics of Zaremba Boundary Value Problem 9–46

A. KOKOTOV and D. KOROTKIN / Tau-functions on Hurwitz Spaces 47–96

Volume 7 No. 2 2004

DIMITRI J. FRANTZESKAKIS, IOANNIS G. STRATIS and ATHANASIOS N. YANNACOPOULOS / On Equilibria of the Two-fluid Model in Magnetohydrodynamics 97–117

ROSTYSLAV O. HRYNIV and YAROSLAV V. MYKYTYUK / Transformation Operators for Sturm–Liouville Operators with Singular Potentials 119–149

ABEL KLEIN and ANDREW KOINES / A General Framework for Localization of Classical Waves: II. Random Media 151–185

P. COULTON and G. GALPERIN / Forces along Equidistant Particle Paths 187–192

Volume 7 No. 3 2004

INÉS PACHARONI and JUAN A. TIRAO / Three Term Recursion Relation for Spherical Functions Associated to the Complex Projective Plane 193–221

RICARDO URIBE-VARGAS / Four-Vertex Theorems, Sturm Theory and Lagrangian Singularities 223–237

JÜRG FRÖHLICH and MARCO MERKLI / Thermal Ionization 239–287

Volume 7 No. 4 2004

ZHIJUN QIAO and SHENGTAI LI / A New Integrable Hierarchy, Parametric Solutions and Traveling Wave Solutions 289–308

Y. T. CHRISTODOULIDES and D. B. PEARSON / Generalized Value Distribution for Herglotz Functions and Spectral Theory 309–331

Y. T. CHRISTODOULIDES and D. B. PEARSON / Spectral Theory of Herglotz Functions and Their Compositions 333–345

ALEXANDRE STOJANOVIC / Errata: “Universality in Orthogonal and Symplectic Invariant Matrix Models with Quartic Potential” 347–349