Differential Equations: An Operational Approach



description

We present an alternative approach to the one that is usually found in textbooks. We have developed original material that is presented here, which deals with the application of operational methods. Because of this, new material is developed in five of the six chapters it contains, paying particular attention to algebraic methods by using differential operators when possible. The methods presented in this book are useful in all applications of differential equations. These methods are used in quantum physics, where most of them have been developed, but they can be used in any other branch of physics and engineering.

Transcript of Differential Equations: An Operational Approach

  • Differential Equations:

    An Operational Approach

    Hector Manuel Moya-Cessa

    Francisco Soto-Eguibar

    Rinton Press, Inc.

  • © 2011 Rinton Press, Inc., 565 Edmund Terrace, Paramus, New Jersey 07652, USA. [email protected] http://www.rintonpress.com

All rights reserved. No part of this book covered by the copyright hereon may be reproduced or used in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, or information storage and retrieval systems) without permission of the publisher.

    Published by Rinton Press, Inc.

    Printed in the United States of America

    ISBN 978-1-58949-060-4

    Preface

This short textbook, mainly on new methods to solve differential equations, is the result of notes of the courses on special topics of mathematical methods that we have taught, for several years, at the Instituto Nacional de Astrofísica, Óptica y Electrónica in Puebla, Mexico.

We present an alternative approach to the one that is usually found in textbooks. We have developed original material that is presented here, which deals with the application of operational methods. Because of this, new material is developed in five of the six chapters it contains, paying particular attention to algebraic methods by using differential operators when possible. The methods presented in this book are useful in all applications of differential equations. These methods are used in quantum physics, where most of them have been developed, but they can be used in any other branch of physics and engineering.

The first Chapter is a review of linear algebra, as it is the basis of the rest of the Chapters. The second Chapter is a survey of special functions, where we introduce new operational techniques that will be used throughout the book. In this very Chapter we show new material, in particular a sum of Hermite polynomials of even order. The third Chapter is devoted to solving finite systems of differential equations; we do this by using what we have called Vandermonde methods. In the fourth and the fifth Chapters we solve infinite and semi-infinite systems of ordinary differential equations using operational methods. Finally, in Chapter six we solve some partial differential equations. The book is self-contained, as the methods established are new and they are completely developed.

The book is intended for undergraduate students, and so we assume that the reader has elementary knowledge of matrix algebra and determinants and a basic working knowledge of ordinary differential equations. The reader acquainted with linear algebra can go directly to Chapter 2, i.e., to the Special Functions Chapter. This Chapter should not be skipped, even when the reader has been exposed to special functions, as the main notions used in the book are introduced here. In some parts of the book, where we think the reader may not have much experience, we provide explicit calculations in order to give details on how the operational calculations have to be done.

    We are very grateful to our colleagues Omar Lopez and Enrique Landgrave for valuable criticisms and friendly recommendations. We would like to thank several students for typing part of the lectures, in particular we are very grateful to Juan Martinez-Carranza.

    Hector Manuel Moya-Cessa and Francisco Soto-Eguibar Santa Maria Tonantzintla, Puebla, Mexico

    March 2011

    Contents

    Preface

    Chapter 1  Linear Algebra
    1.1 Vector spaces
        1.1.1 Subspaces
        1.1.2 The span of a set of vectors
        1.1.3 Linear independence
        1.1.4 Bases and dimension of a vector space
        1.1.5 Coordinate systems. Components of a vector in a given basis
    1.2 The scalar product. Euclidian spaces
        1.2.1 The norm in an Euclidian space
        1.2.2 The concept of angle between vectors. Orthogonality
    1.3 Linear transformations
        1.3.1 The kernel of a linear transformation
        1.3.2 The image of a linear transformation
        1.3.3 Isomorphisms
        1.3.4 Linear transformations and matrices
        1.3.5 The product of linear transformations
    1.4 Eigenvalues and eigenvectors
        1.4.1 The finite dimension case
        1.4.2 Similar matrices
        1.4.3 Diagonal matrices
            1.4.3.1 Procedure to diagonalize a matrix
            1.4.3.2 The Cayley-Hamilton theorem
    1.5 Linear operators acting on Euclidian spaces
        1.5.1 Adjoint operators
        1.5.2 Hermitian and anti-Hermitian operators
        1.5.3 Properties of the eigenvalues and eigenvectors of the Hermitian operators

    Chapter 2  Special functions
    2.1 Hermite polynomials
        2.1.1 Baker-Hausdorff formula
        2.1.2 Series of even Hermite polynomials
        2.1.3 Addition formula
    2.2 Associated Laguerre polynomials
    2.3 Chebyshev polynomials
        2.3.1 Chebyshev polynomials of the first kind
        2.3.2 Chebyshev polynomials of the second kind
    2.4 Bessel functions of the first kind of integer order
        2.4.1 Addition formula
        2.4.2 Series of the Bessel functions of the first kind of integer order
        2.4.3 Relation between the Bessel functions of the first kind of integer order and the Chebyshev polynomials of the second kind

    Chapter 3  Finite systems of differential equations
    3.1 Systems 2 × 2
        3.1.1 Eigenvalue equations
        3.1.2 Cayley-Hamilton theorem method
            3.1.2.1 Case A: λ1 ≠ λ2
            3.1.2.2 Case B: λ1 = λ2 = λ
    3.2 Systems 4 × 4
        3.2.1 Case 1. All the eigenvalues are different (λ1 ≠ λ2 ≠ λ3 ≠ λ4)
        3.2.2 Case 2. λ1 = λ2 ≠ λ3 = λ4
        3.2.3 Case 3. λ1 = λ2 = λ3 = λ ≠ λ4
        3.2.4 Case 4. λ1 = λ2 = λ3 = λ4 = λ
    3.3 Systems n × n
        3.3.1 All the eigenvalues are distinct
        3.3.2 The eigenvalues are λ1 = λ2 = λ3, λ4 = λ5, and the rest are different

    Chapter 4  Infinite systems of differential equations
    4.1 Addition formula for the Bessel functions
    4.2 First neighbors interaction
    4.3 Second neighbors interaction
    4.4 First neighbors interaction with an extra interaction
        4.4.1 Interaction ωn
        4.4.2 Interaction ω(-1)^n

    Chapter 5  Semi-infinite systems of differential equations
    5.1 First a semi-infinite system
    5.2 Semi-infinite system with first neighbors interaction
    5.3 Nonlinear system with first neighbors interaction

    Chapter 6  Partial differential equations
    6.1 A simple partial differential equation
        6.1.1 A Gaussian function as boundary condition
        6.1.2 An arbitrary function as boundary condition
    6.2 Airy system
        6.2.1 Airy function as boundary condition
    6.3 Harmonic oscillator system
    6.4 z-dependent harmonic oscillator

    Appendix A  Dirac notation

    Appendix B  Inverse of the Vandermonde and confluent Vandermonde matrices
    B.1 The inverse of the Vandermonde matrix
    B.2 The inverse of the confluent Vandermonde matrix

    Appendix C  Tridiagonal matrices
    C.1 Fibonacci system

    Bibliography

    Index

  • Chapter 1

    Linear Algebra

In this Chapter, and in the rest of the book, we will assume that the reader is familiar with elementary matrix algebra and with determinants; however, in special cases, and for the sake of clarity, we will define and review some concepts. To review elementary matrix algebra and determinants, we recommend the following books [Shores 07; Larson 09; Lang 86; Lang 87; Nicholson 90; Poole 06].

    1.1 Vector spaces

The concept of vector space is very important in mathematics. This concept has applications in physics, engineering, chemistry, biology, the social sciences, and other areas. It is an abstraction of the idea of geometrical vectors that we acquire in elementary mathematics and physics, and even in our daily life. The idea of a vector is generalized to any dimension and to entirely different kinds of objects.

A vector space consists of a nonempty set V of objects, called vectors, together with an addition + and a multiplication by scalars (real or complex numbers). The following ten axioms are assumed to hold.

Addition
(A1) Closure under addition: If u and v belong to V, then so does u + v.
(A2) Commutative law for addition: If u and v belong to V, then u + v = v + u.
(A3) Associative law for addition: If u, v, and w belong to V, then u + (v + w) = (u + v) + w.
(A4) There exists a zero vector in V, denoted by 0, such that for every vector u in V, u + 0 = u.
(A5) For every vector u in V there exists a vector -u, called the additive inverse of u, such that u + (-u) = 0.

Scalar multiplication
(S1) Closure under scalar multiplication: If u belongs to V, then so does αu for any scalar α.
(S2) For any two scalars α and β and any vector u in V, α(βu) = (αβ)u.
(S3) For any vector u in V, 1u = u.
(S4) For any two scalars α and β and any vector u in V, (α + β)u = αu + βu.
(S5) For any scalar α and any two vectors u and v in V, α(u + v) = αu + αv.

From now on, for simplicity, we will denote the scalar multiplication of a scalar α by a vector v as αv instead of α · v.

    If the scalars are real numbers, we call the vector space a real vector space; if the scalars are complex, we call the vector space a complex vector space.

Example 1. The set of ordered n-tuples of real numbers, {(x1, x2, ..., xn) | xi ∈ ℝ for all i}, with the usual addition and multiplication operations, is the real vector space ℝⁿ.

Example 2. The set of all m × n matrices M with real entries, and with the usual matrix addition and multiplication by a scalar, is the real vector space 𝔐(m × n).

Example 3. The set of all real-valued functions on a set A is a real vector space. Addition and scalar multiplication are defined so that f + g is the function x → f(x) + g(x), and λf is the function x → λf(x). Similarly, the space of complex-valued functions on A is a complex vector space.

Example 4. The set ℭ(a, b) = {f : (a, b) → ℂ | f is a continuous function in (a, b)}, where ℂ is the set of complex numbers, with the same operations of the preceding example, is a complex vector space.

    Example 5. The set of all polynomials of degree at most two with the standard function addition and scalar multiplication forms a vector space.

We list, without demonstration, some properties of vector spaces. Let v be a vector in some vector space V and let c be any scalar. Then
1. 0v = 0.
2. c0 = 0.
3. (-c)v = c(-v) = -(cv).
4. If cv = 0, then v = 0 or c = 0.
5. A vector space has only one zero element.
6. Every vector has only one additive inverse.

    1.1.1 Subspaces

A better understanding of the properties of vector spaces is obtained by introducing the concept of subsets and the bigger sets of which they are subsets. Of particular importance is the case in which a subset is itself a vector space; in that case, the subset is called a subspace. We define subspace as follows:

    If V is a vector space, a subset U of V is called a subspace of V if U is itself a vector space, where U uses the vector addition and scalar multiplication of V.

Example 1. In the case of the vector space ℝ², all the straight lines that go through the origin are subspaces.

Example 2. If we consider the vector space of real-valued continuous functions defined in a certain interval (a, b) in ℝ, the subset of all polynomials of degree at most some fixed n is a subspace.

Let U be a subset of a vector space V. Then U is a subspace, if and only if, it satisfies the following three conditions.
1. The zero vector 0 of V lies in U.
2. If u and v lie in U, then u + v lies in U; i.e., the subset U is closed under addition.
3. If u lies in U, then αu lies in U for every scalar α; i.e., the subset U is closed under multiplication by scalars.

1.1.2 The span of a set of vectors

If {v1, v2, ..., vn} is any set of vectors in a vector space V, the set of all linear combinations of these vectors is called their span, and is denoted by span{v1, v2, ..., vn}.

Example 1. In the vector space ℝ², the vector (1, 1) spans the subspace that consists of a straight line of slope 1 that goes through the origin.


Example 2. The set {1, x, x², x³} spans the vector space of polynomials of degree less than or equal to 3.

Example 3. The three vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) span the vector space ℝ³.

1.1.3 Linear independence

A set of vectors {v1, v2, ..., vn} is called linearly independent if any linear combination of them equal to zero necessarily implies that all the coefficients of the combination are zero; i.e., if

c1 v1 + c2 v2 + ... + cn vn = 0,    (1.1)

then

c1 = c2 = ... = cn = 0.    (1.2)

A set of vectors that is not linearly independent is said to be linearly dependent.

Example 1. The vectors (1, 0, 1), (0, 1, 0) and (-1, 1, 1), of the vector space ℝ³, are linearly independent. We take a linear combination of the three vectors,

r1 (1, 0, 1) + r2 (0, 1, 0) + r3 (-1, 1, 1) = 0.

As

det ( 1  0  -1
      0  1   1
      1  0   1 ) = 2,

the only solution to the homogeneous system is the trivial one; i.e., r1 = r2 = r3 = 0, and then the set is linearly independent.
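The same conclusion can be checked numerically. The following is a short numpy sketch (ours, not part of the book's text): the vectors are independent because the matrix having them as columns has nonzero determinant, or equivalently full rank.

```python
import numpy as np

v1, v2, v3 = (1, 0, 1), (0, 1, 0), (-1, 1, 1)
M = np.column_stack([v1, v2, v3])
print(np.linalg.det(M))          # approximately 2 -> nonzero, so the set is independent
print(np.linalg.matrix_rank(M))  # 3 -> equivalently, the matrix has full rank
```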

Example 2. The subset B = {sin x, sin 2x, sin 3x, ..., sin nx} of the vector space ℭ(-π, π) = {f : (-π, π) → ℝ | f is a continuous function in (-π, π)} is linearly independent for all n. Taking a linear combination of all the elements in the set B and equating it to zero, we get

Σ_{k=1}^{n} ck sin kx = 0.

Multiplying this equation by sin mx, integrating it from -π to π, and using the integral ∫_{-π}^{π} sin mx sin kx dx = π δmk, we get

π Σ_{k=1}^{n} ck δkm = π cm = 0.

As m is an arbitrary integer from 1 to n, necessarily ck = 0 for all k from 1 to n, and we have proved that the set B is linearly independent.

It is clear that a set {v1, v2, ..., vn} of vectors in a vector space V is linearly dependent, if and only if, some vi is a linear combination of the others. We know, from our hypothesis, that in the equation

c1 v1 + c2 v2 + ... + cn vn = 0

at least one coefficient is not null; let us say it is cm. Then, we can write

cm vm + Σ_{k=1, k≠m}^{n} ck vk = 0.

As cm ≠ 0, we get

vm = -(1/cm) Σ_{k=1, k≠m}^{n} ck vk,

as we wanted to show.

Example 3. In the vector space ℝ², the vectors (-1, 2) and (3, -6) are linearly dependent, as one is a multiple of the other: (3, -6) = -3(-1, 2).

If a vector space V can be spanned by n vectors, then any linearly independent set of m vectors in V satisfies m ≤ n.


1.1.4 Bases and dimension of a vector space

One of the main ideas of vector space theory is the notion of a basis. We already know that a spanning set for a vector space V is a set of vectors {v1, v2, ..., vn} such that V = span{v1, v2, ..., vn}. However, some spanning sets are better than others because they have fewer elements. We know that a set of vectors has no redundant vectors in it if and only if it is linearly independent. This observation takes us to the following definition.

A set {e1, e2, ..., en} of vectors in a vector space V is called a basis of V if it satisfies the two conditions:
1. The set {e1, e2, ..., en} is linearly independent.
2. The set {e1, e2, ..., en} spans the vector space V; i.e., V = span{e1, e2, ..., en}.

The fact that a basis spans a vector space means that every vector in the space has a representation as a linear combination of basis vectors. The linear independence of the basis implies that the representation is unique.

Example 1. In the vector space ℝ³, the set of three vectors {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis. It is very easy to show that the three vectors are linearly independent. Any vector (x, y, z) in ℝ³ can be written as (x, y, z) = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1), so the set spans the space.

    The number of vectors in two different bases of a vector space V must be the same.

If {e1, e2, ..., en} is a basis of the nonzero vector space V, the number n of vectors in the basis is called the dimension of V, and we write dim V = n. The zero vector space is defined to have dimension 0. A vector space V is called finite dimensional if V = 0 or V has a finite basis. Otherwise it is infinite dimensional.

Let V be a vector space and assume that dim V = n > 0.
1. No set of more than n vectors in V can be linearly independent.
2. No set of fewer than n vectors can span V.

Let V be a vector space, and assume that dim V = n > 0.
1. Any set of n linearly independent vectors in V is a basis.
2. Any spanning set of n nonzero vectors in V is a basis.


    1.1.5 Coordinate systems. Components of a vector in a given basis

A basis of a vector space V was defined to be an independent set of vectors spanning V. The real significance of a basis lies in the fact that the vectors of any basis of ℂⁿ or ℝⁿ may be regarded as the unit vectors of the space, under a suitably chosen coordinate system, by virtue of the following theorem.

    A given vector v, in a vector space V, can always be expanded as

v = a1 e1 + a2 e2 + ... + an en,    (1.3)

where the set {e1, e2, ..., en} is a basis of the vector space. The scalars {a1, a2, ..., an} (complex or real) are called the components or the coordinates of the vector v in the basis {e1, e2, ..., en}. It is clear that if we have different bases, the components of the vector will be different; however, given a basis, these components are unique.

Example 1. In the real vector space ℝ³, the standard basis is the set {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. The components of the vector (-1, 2, -3) in this basis are -1, 2 and -3. The set {(1, 0, 1), (1, 1, 1), (1, 1, 0)} is linearly independent, so it is also a basis of ℝ³; in this new basis, the components of the vector (-1, 2, -3) are -3, 0 and 2.
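Finding components in a non-standard basis amounts to solving a linear system. The following numpy sketch (ours, not from the book) recovers the components quoted in the example by solving E a = v, where the basis vectors are the columns of E.

```python
import numpy as np

E = np.column_stack([(1, 0, 1), (1, 1, 1), (1, 1, 0)])  # basis vectors as columns
v = np.array([-1, 2, -3])
a = np.linalg.solve(E, v)
print(a)  # [-3.  0.  2.] -> the components -3, 0 and 2
```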

    1.2 The scalar product. Euclidian spaces

We say that a complex vector space V has a scalar product (also called a dot product or an internal product) if to any two elements u and v in V a unique complex number is associated, which will be denoted by u · v, and that association satisfies the following four properties. For any three vectors u, v and w in V, and for any scalar α,
1. Hermitian symmetry: u · v = (v · u)*.
2. Distributivity or linearity: u · (v + w) = u · v + u · w.
3. Associativity or homogeneity: (αu) · v = α(u · v).
4. Positivity: u · u > 0 if u ≠ 0.
In the case of a real vector space, property 1 simplifies to u · v = v · u; the others remain the same.


A vector space with a scalar product is called an Euclidian space. If the scalars are real, it is a real Euclidian space; if the scalars are complex, it is called a complex Euclidian space or unitary space. In what follows, we will work with complex Euclidian spaces, and treat the real Euclidian spaces as a particular case.

Example 1. In the vector space ℝ³, the usual product u · v = u1 v1 + u2 v2 + u3 v3 is a scalar product.


so λm = 0. As m is arbitrary, the necessary conclusion is that λj = 0 for all j from 1 to k, and the set S is linearly independent.

In a real Euclidian space V of finite dimension n, any orthogonal set S of n non-null vectors is a basis.

Given an Euclidian space V of finite dimension n, and S = {e1, e2, ..., en} an orthogonal basis of it, any vector v of V can be expanded as

v = Σ_{k=1}^{n} ck ek,    (1.7)

where

ck = (v · ek) / |ek|²,    (1.8)

for k = 1, 2, 3, ..., n.

We take the scalar product of expression (1.7) with em, m being an arbitrary integer from 1 to n; we use properties 2 and 3 of the scalar product and the fact that the basis is orthogonal (i.e., ei · ej = |ei| |ej| δij), to write

v · em = Σ_{k=1}^{n} ck (ek · em) = Σ_{k=1}^{n} ck |em| |ek| δmk = |em|² cm,

and from this, equation (1.8) follows trivially.

Given an Euclidian space V of finite dimension n and S = {e1, e2, ..., en} an orthonormal basis of it, any vector v of V can be expanded as

v = Σ_{k=1}^{n} ck ek,    (1.9)

where

ck = v · ek,    (1.10)

for k = 1, 2, 3, ..., n.

Parseval formula. Let V be an Euclidian space of finite dimension n and S = {e1, e2, ..., en} an orthonormal basis of it. Then for any two vectors v and u in V, the following formula is satisfied:

v · u = Σ_{k=1}^{n} (v · ek)(u · ek)*.    (1.11)

That implies that

v · u = Σ_{k=1}^{n} vk uk*,    (1.12)

where vk and uk are the components of the two vectors in the given basis. In particular, if u = v,

|v|² = Σ_{k=1}^{n} |vk|².    (1.13)

In all Euclidian spaces it is always possible to build an orthogonal basis, and therefore an orthonormal basis. The construction process is known as the Gram-Schmidt orthogonalization process [Nicholson 90; Poole 06].
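A plain implementation of the Gram-Schmidt process mentioned above is sketched below in numpy (the function name and the sample vectors are ours, not the book's); it turns a set of linearly independent real vectors into an orthonormal basis.

```python
import numpy as np

def gram_schmidt(vectors):
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for e in basis:
            w = w - np.dot(w, e) * e      # remove the component along e
        basis.append(w / np.linalg.norm(w))
    return basis

for e in gram_schmidt([(1, 0, 1), (1, 1, 1), (1, 1, 0)]):
    print(e)
# the printed vectors are mutually orthogonal and have norm 1
```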

    1.3 Linear transformations

A transformation is a function with vector spaces for its domain and codomain. A transformation is linear if it preserves linear combinations; or, more precisely, if V and W are two vector spaces, a function T : V → W is called a linear transformation if it satisfies the following properties:
a) T(v + u) = T(v) + T(u) for all v and u in V.
b) T(rv) = rT(v) for all v in V and all scalars r (real or complex).
Linear transformations are also called linear maps and linear operators. However, we will use the denomination linear operator when the domain and the codomain are the same vector space.

Example 1. The transformation from ℝ³ to ℝ² given by the rule (x, y, z) → (x + y + z, x - 2y - z) is a linear transformation.

Example 2. The differential operator acting on the vector space ℭ(a, b) = {f : (a, b) → ℝ | f is a continuous function in (a, b)} is a linear transformation.


A linear transformation always maps the zero vector of the domain to the zero vector of the codomain. In other words, given a linear transformation T from the vector space V to the vector space W, we have

T(0_V) = 0_W,    (1.14)

where 0_V is the zero vector in the vector space V and 0_W is the zero vector in the vector space W. The second property of a linear transformation, T(rv) = rT(v), holds for any vector v in V and for any scalar r; in particular it holds for the zero scalar, and then expression (1.14) follows trivially.

As linear transformations are in fact functions from one vector space into another, all the operations for functions apply. We can add linear transformations, and we can multiply linear transformations by a scalar. It is easy to show that all these operations give again a linear transformation, so the space of all linear transformations is itself a vector space.

1.3.1 The kernel of a linear transformation

Given a linear transformation T, from the vector space V to the vector space W, we define the kernel of the transformation as the subset of the vector space V of all the vectors that have as image the null vector in W; i.e., if we denote the kernel of the linear transformation as ker(T), we have

ker(T) = {v ∈ V | T(v) = 0}.    (1.15)

The kernel of a linear transformation is a vector subspace of the domain V.

We already know that it is enough to show that the zero vector is in the kernel and that the kernel is closed under addition and under multiplication by a scalar. First, the zero vector is in the kernel, as we also know that the zero vector always goes to the zero vector under a linear map (see expression (1.14)). Second, as the transformation is linear, if we have v, u in ker(T) and two arbitrary scalars r and s, then T(rv + su) = rT(v) + sT(u) = r0 + s0 = 0, showing that the kernel is closed under both operations.

The two following statements are completely equivalent: 1. The kernel of a linear transformation T is the set {0}. 2. If v and u are two elements of the vector space V such that T(v) = T(u), then v = u.


We suppose 1 and show 2. If T(v) = T(u), using the linearity of T we get T(v) - T(u) = T(v - u) = 0, but as by hypothesis the only vector that goes to zero is zero, we obtain v - u = 0, or, as we wanted to show, v = u. Now we suppose 2 and show 1. If we have a vector w in V such that T(w) = 0, it is trivially true that T(w) = T(0) and then, by hypothesis, w = 0.

Example 1. The kernel of the linear transformation from ℝ³ to itself, given by the law (x, y, z) → (x - y, x + y + z, 2x - 3y + z), is clearly the set with only the zero vector, (0, 0, 0); any other vector goes to a nonzero vector in ℝ³.

Example 2. The kernel of the linear operator d/dx, acting in the vector space ℭ(a, b) = {f : (a, b) → ℝ | f is a continuous function in (a, b)}, is the set of constant functions in the interval, {f : (a, b) → ℝ | f = constant}.

1.3.2 The image of a linear transformation

Given a linear transformation T, from the vector space V to the vector space W, we define the image or range of the transformation as the subset of the vector space W of all the vectors that are the image of some vector of V; i.e., if we denote the image of the linear transformation as im(T), we have

im(T) = {w ∈ W | there exists v ∈ V such that T(v) = w}.    (1.16)

The image of a linear transformation is a vector subspace of the codomain W. As we did for the kernel, we only need to show that the zero vector belongs to the image, and that the image is closed under the two vector space operations. First, as for a linear map zero goes to zero, the zero vector in W is in the image. Second, if we have w1, w2 in im(T), then there exist v1 and v2 in V such that T(v1) = w1 and T(v2) = w2. Then, for arbitrary scalars r and s, and as the transformation T is linear, rw1 + sw2 = rT(v1) + sT(v2) = T(rv1 + sv2), and clearly rw1 + sw2 ∈ im(T).

Example 1. The image of a rotation of the plane ℝ² is the plane ℝ² itself.

1.3.3 Isomorphisms

A linear transformation T : V → W is called injective if for every pair of vectors u and v in V, with u ≠ v, we have T(u) ≠ T(v).


With the results of the previous section, we can conclude that a linear transformation is injective, if and only if, the only element that maps to zero is zero.

Example 1. A rotation in ℝ² is an injective mapping, as the only element that "remains" in place is the origin, and only the origin is mapped into the origin.

Example 2. The linear map from ℝ³ into ℝ², with the rule (x, y, z) → (x, y), is not injective, as it maps the whole straight line (0, 0, t), t ∈ ℝ, into the vector (0, 0).

A linear transformation T : V → W is called surjective if im(T) = W.

Example 3. The map from ℝ² into ℝ³ with the rule (x, y) → (x + y, x - y, 3) is not surjective, as none of the elements of the subset {(r, s, t) | r, s, t ∈ ℝ with t ≠ 3} of ℝ³ is the image of any vector of the domain ℝ².

If a linear transformation is injective and surjective, then it is called an isomorphism. An isomorphism is then a one-to-one map.

Any finite dimensional nonzero vector space, over the field of the real numbers ℝ or over the field of the complex numbers ℂ, is isomorphic to ℝⁿ or to ℂⁿ, respectively. To understand this one-to-one correspondence, we have to think of the set of n numbers formed by the components of any vector in a given basis, and associate this set with an element of ℝⁿ or ℂⁿ.

1.3.4 Linear transformations and matrices

An important and very useful result of linear algebra is that there exists a one-to-one relationship between linear transformations acting in finite dimensional vector spaces and matrices. That means that we can associate a linear transformation with a matrix and that every linear transformation has an associated matrix. In fact, there exists an isomorphism between linear transformations on finite dimensional vector spaces and matrices.

    First, we prove the e



(0, 1, -1) = (1, 1, 0) + 0(0, 1, 1) - (1, 0, 1) and (1, 0, 1) = 0(1, 1, 0) + 0(0, 1, 1) + (1, 0, 1), so the matrix that represents T in these bases is

(  1  0
   0  0
  -1  1 ).

Example 4. Let B = {1, x, e^x, x e^x} be a basis of the vector subspace W of continuous functions, and let D be the differential operator on W. To calculate the representation of this operator in this basis, we have to evaluate its action on each of the elements of the basis; we get D(1) = 0, D(x) = 1, D(e^x) = e^x, D(x e^x) = (1 + x) e^x. We now write these functions in terms of the basis: 0 = 0(1) + 0(x) + 0(e^x) + 0(x e^x), 1 = 1(1) + 0(x) + 0(e^x) + 0(x e^x), e^x = 0(1) + 0(x) + 1(e^x) + 0(x e^x), (1 + x) e^x = 0(1) + 0(x) + 1(e^x) + 1(x e^x). Thus, the matrix that represents D in this basis is

( 0  1  0  0
  0  0  0  0
  0  0  1  1
  0  0  0  1 ).

The corresponding operations on linear transformations are mapped into the corresponding operations on matrices; i.e., if we have two linear transformations, T and S, with corresponding matrices A and B, the matrix associated with the linear transformation αT + βS will be αA + βB, with α and β two arbitrary complex numbers (or real numbers, according to the case). Therefore, the sum of linear transformations corresponds to the sum of matrices, and the multiplication of a linear transformation by a scalar corresponds to the multiplication of matrices by scalars. In the next subsection, we will see that matrix multiplication corresponds to the composition or multiplication of linear transformations.
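As a concrete check of Example 4 above, the following numpy sketch (ours, not part of the book) represents a function by its coefficient vector in the basis {1, x, e^x, x e^x} and applies the matrix of D to differentiate it.

```python
import numpy as np

D = np.array([[0, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])   # matrix of d/dx in the basis {1, x, e^x, x e^x}

f = np.array([2, 3, 0, 1])     # f(x) = 2 + 3x + x e^x
print(D @ f)                   # [3 0 1 1] -> f'(x) = 3 + e^x + x e^x
```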

1.3.5 The product of linear transformations

Very important in mathematics and in physics is the composition of linear transformations; it is also called multiplication. We define the composition and give some of its properties in the rest of this subsection.

Given two linear transformations T : V → W and S : W → U, the composite ST : V → U of T and S is defined by

ST(v) = S[T(v)],    (1.19)

for all v in V.

The composition or multiplication of two linear transformations is a linear transformation. Let v and u be two arbitrary vectors in V, and α and β two arbitrary scalars (real or complex). We apply the product of the two linear transformations to a linear combination of our two arbitrary vectors, remembering that both transformations are linear: ST(αv + βu) = S[T(αv + βu)] = S[αT(v) + βT(u)] = αS[T(v)] + βS[T(u)], which is exactly the linearity property.

Example 1. Let T : ℝ⁴ → ℝ³ be such that (x1, x2, x3, x4) → (x1 + x2 - x3, x3, x2 - x4) and S : ℝ³ → ℝ² such that (x1, x2, x3) → (2x1 - x3, x2); then the product of these two linear transformations is a linear transformation with the rule (x1, x2, x3, x4) → (2x1 + x2 - 2x3 + x4, x3), acting from ℝ⁴ into ℝ².

Example 2. Let us consider the real vector space of polynomials of degree less than or equal to 3, with real coefficients, defined in the open interval (0, 1); i.e., the set 𝔓 = {f : (0, 1) → ℝ | f(x) = a0 + a1 x + a2 x² + a3 x³, with a0, a1, a2, a3 ∈ ℝ}. Let D = d/dx be the derivative operator acting on 𝔓. We can multiply D by itself and get the linear operator that corresponds to the second derivative, and so on, and get any order derivative. We can also consider the integral operator acting on this same vector space; then, if we compose the derivative operator with the integral operator, we get the identity transformation.

It is very easy to see that not all pairs of linear transformations can be composed. The codomain of the first linear transformation must be the domain of the second. In Example 1 above, we formed the product ST, but it is clear that the product TS cannot be formed. As another example, if we have the transformation T : ℝ³ → ℝ³ with the rule (x, y, z) → (x + y, y - z, x + y + z), and the transformation S : ℝ² → ℝ³ with the rule (x, y) → (x + y, x - y, y), it is obvious that it is not possible to build the composition ST; however, the product TS exists. Moreover, and very important, even if TS and ST can both be formed, the new linear transformations TS and ST need not be equal. The quantity

    TS-ST (1.20)

    is called the commutator of T and S and is denoted by the symbol [T, S].

Example 3. Let R : ℝ² → ℝ² be a rotation by an angle θ in the plane; that means (x, y) → (x cos θ - y sin θ, x sin θ + y cos θ), and let T : ℝ² → ℝ² be the linear operator with the law (x, y) → (2x - y, x - 2y).


We have TR : ℝ² → ℝ² with the law

(x, y) → (2x cos θ - y cos θ - x sin θ - 2y sin θ, x cos θ - 2y cos θ - 2x sin θ - y sin θ),

and we have RT : ℝ² → ℝ² with the law

(x, y) → (2x cos θ - y cos θ - x sin θ + 2y sin θ, x cos θ - 2y cos θ + 2x sin θ - y sin θ).

The two new linear transformations are so similar that it is very easy to get confused and think that they commute; however, we can see that they are not equal: calculating the commutator, we obtain [T, R] : ℝ² → ℝ², with the rule (x, y) → -4 sin θ (y, x).

Example 4. Two very important linear operators (linear transformations) in quantum physics are x, multiplication by the independent variable, and p = -i d/dx, both acting on the vector space

ℭ(a, b) = {f : (a, b) → ℝ | f is a continuous function in (a, b)}.

It is obvious that xp[f(x)] = -i x df(x)/dx and that px[f(x)] = -i d[x f(x)]/dx = -i f(x) - i x df(x)/dx, so xp ≠ px. In fact, the commutator of these two operators is i; i.e., [x, p] = i.
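The commutation relation can also be checked symbolically. The sketch below uses sympy (our choice of tool, not the book's) to verify that (xp - px) applied to an arbitrary function returns i times that function.

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)

xp = x * (-sp.I * sp.diff(f, x))   # x p f
px = -sp.I * sp.diff(x * f, x)     # p x f
print(sp.simplify(xp - px))        # I*f(x), i.e. [x, p] f = i f
```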

For the identity transformation I, we have

(TI)v = Tv,    (1.21)
(IT)v = Tv.    (1.22)

We shall also consider the multiplication of linear transformations by scalars, treating the equation

S = αT = Tα    (1.23)

as equivalent to the equation

Sv = α(Tv)    (1.24)

for any v.

With the definition of the product of two linear transformations, we can define a new linear transformation by raising a given one to a certain power. For example, T^m v means

T^m v = T T ... T v.    (1.25)


Similarly, it is possible to define functions of linear transformations by their formal power series expansions. One very useful and very important case is the exponential function; the linear transformation e^T formally means

e^T = 1 + T + T²/2! + T³/3! + ... = Σ_{k=0}^{∞} T^k / k!.    (1.26)
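For a matrix, the formal series (1.26) can be truncated and evaluated numerically. The following sketch (ours, not from the book; the matrix and the number of terms are arbitrary choices) compares a truncated series with scipy's matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

def exp_series(T, terms=30):
    result = np.eye(T.shape[0])
    power = np.eye(T.shape[0])
    for k in range(1, terms):
        power = power @ T / k          # accumulates T^k / k!
        result = result + power
    return result

T = np.array([[0.0, 1.0], [-1.0, 0.0]])
print(np.allclose(exp_series(T), expm(T)))   # True
```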

We will now introduce the very important concept of the inverse of a linear transformation. For that we establish the following theorem. Let V and W be finite dimensional vector spaces. The two following conditions are equivalent for a linear transformation T : V → W:
1. T is an isomorphism.
2. There exists a linear transformation S : W → V, such that TS = I_W and ST = I_V, where I_W is the unit operator on the vector space W and I_V is the unit operator on the vector space V.
The isomorphism S is called the inverse of T and is denoted by T⁻¹.

At the end of the previous subsection, we already mentioned that the matrix associated with the product of two linear transformations, acting in finite dimensional vector spaces, is the product of the corresponding matrices; i.e., if we have two linear transformations, T and S, with corresponding matrices A and B, the matrix associated with the linear transformation TS will be AB. In this case the vector space of linear transformations is enriched with a new operation, a multiplication, that has well defined properties; the set obtained is now a non-commutative algebra.

Example 5. In Example 1 of this subsection, we considered the linear transformations T : ℝ⁴ → ℝ³, such that (x1, x2, x3, x4) → (x1 + x2 - x3, x3, x2 - x4), and S : ℝ³ → ℝ², such that (x1, x2, x3) → (2x1 - x3, x2). We found that the product of these two linear transformations is a linear transformation with the rule (x1, x2, x3, x4) → (2x1 + x2 - 2x3 + x4, x3), acting from ℝ⁴ to ℝ².

If we call A the matrix associated with the linear transformation T and B the matrix associated with the linear transformation S, we have

A = ( 1  1  -1   0          B = ( 2  0  -1
      0  0   1   0                0  1   0 ).
      0  1   0  -1 ),

The matrix associated with the composition ST is given by the product of matrices BA; i.e.,

ST  =>  ( 2  1  -2  1
          0  0   1  0 ).

It is very easy to verify that this matrix indeed corresponds to the matrix of the composition.


It is worth noting that the product TS does not exist, and that corresponds to the fact that the product of matrices AB cannot be formed.
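Example 5 can be verified directly with numpy (a sketch of ours, not part of the book): multiplying the two matrices reproduces the matrix of the composition ST.

```python
import numpy as np

A = np.array([[1, 1, -1,  0],
              [0, 0,  1,  0],
              [0, 1,  0, -1]])   # matrix of T : R^4 -> R^3
B = np.array([[2, 0, -1],
              [0, 1,  0]])       # matrix of S : R^3 -> R^2
print(B @ A)
# [[ 2  1 -2  1]
#  [ 0  0  1  0]]  -> the rule (2x1 + x2 - 2x3 + x4, x3)
```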

Example 6. In the case of the linear operators acting on ℝ² that we studied in Example 3, we have the matrix

R = ( cos θ  -sin θ
      sin θ   cos θ )

for the linear operator R, and the matrix

T = ( 2  -1
      1  -2 )

for the linear operator T. The matrix associated with the linear operator TR is then

TR = ( 2 cos θ - sin θ    -cos θ - 2 sin θ
       cos θ - 2 sin θ    -2 cos θ - sin θ ),

and the matrix associated with the linear operator RT is

RT = ( 2 cos θ - sin θ    -cos θ + 2 sin θ
       cos θ + 2 sin θ    -2 cos θ - sin θ ).

Thus, the matrix associated with the commutator of T and R is

[T, R] = -4 sin θ ( 0  1
                    1  0 ),

and it correctly corresponds to the law (x, y) → -4 sin θ (y, x) that we found in Example 3.
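The commutator of Example 6 can be computed symbolically as well; the sketch below uses sympy (our choice of tool, not the book's).

```python
import sympy as sp

th = sp.symbols('theta')
R = sp.Matrix([[sp.cos(th), -sp.sin(th)], [sp.sin(th), sp.cos(th)]])
T = sp.Matrix([[2, -1], [1, -2]])
print(sp.simplify(T * R - R * T))
# Matrix([[0, -4*sin(theta)], [-4*sin(theta), 0]])
```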

    1.4 Eigenvalues and eigenvectors

Let V be a vector space, S a subspace of V, and T a linear transformation from S to V. A scalar λ is an eigenvalue of T if there is a non-null vector v in S such that

T(v) = λv.    (1.27)

The vector v is called an eigenvector of T, corresponding to the eigenvalue λ. Also, the scalar λ is called the eigenvalue of the transformation T corresponding to the eigenvector v. The eigenvalues and eigenvectors are also called characteristic values and characteristic vectors, respectively. It is worth emphasizing that even if the null vector 0 satisfies the equation T(0) = λ0 for any scalar λ, it is not considered an eigenvector. It is also worth noting, and very easy to prove, that there is only one eigenvalue for a given eigenvector.

Example 1. Consider the linear operator T, acting on ℝ³ and defined by the rule (x, y, z) → (x - 2y + z, 0, y + z). The vector (-3, -1, 1) is an eigenvector of T with eigenvalue 0 and the vector (1, 0, 0) is an eigenvector of T with eigenvalue 1. In terms of matrices, the linear transformation is

( 1  -2  1 ) ( x )   ( x - 2y + z )
( 0   0  0 ) ( y ) = (     0      )
( 0   1  1 ) ( z )   (   y + z    )

and the two eigenvector equations are

( 1  -2  1 ) ( -3 )   ( 0 )         ( 1  -2  1 ) ( 1 )   ( 1 )
( 0   0  0 ) ( -1 ) = ( 0 )   and   ( 0   0  0 ) ( 0 ) = ( 0 ).
( 0   1  1 ) (  1 )   ( 0 )         ( 0   1  1 ) ( 0 )   ( 0 )

Example 2. The eigenfunctions of the derivative operator D = d/dx, acting on the vector space ℭ(a, b) = {f : (a, b) → ℝ | f is a continuous function in (a, b)}, are the solutions of the differential equation

df(x)/dx = λ f(x);

they are the functions f(x) = a e^(λx), with a an arbitrary real constant. The corresponding eigenvalue is λ.

We will now give several very useful definitions.

Let V be a vector space, S a subspace of V, and T a linear transformation from S to V. A subspace U of S is called invariant under the transformation T if T maps every element of U into U.

If λ is an eigenvalue of T : S ⊂ V → V, the set

Eλ = Eλ(T) = {v | v ∈ S, T(v) = λv}    (1.28)

is a vector space, a subspace of S, called the eigenspace associated with λ.

It is clear that the subspace generated by an eigenvector is invariant under the transformation.

Let V be a vector space, S a subspace of V, and T a linear transformation from S to V. If we have k eigenvectors {v1, v2, ..., vk} with k different eigenvalues {λ1, λ2, ..., λk}, then the eigenvectors {v1, v2, ..., vk} constitute a linearly independent set.


We prove this theorem by mathematical induction [Bronshtein 07, page 5].
a) We prove the theorem for k = 1. We take a linear combination of the vectors in the set (just v1) and make it equal to zero; as, by definition, v1 is non-null, necessarily c1 = 0.
b) We suppose now the theorem true for k - 1, and we prove it for k. We want to prove that Σ_{i=1}^{k} ci vi = 0 implies that ci = 0 for all i from 1 to k. As T is a linear map and the vi are its eigenvectors,

T(Σ_{i=1}^{k} ci vi) = Σ_{i=1}^{k} ci T(vi) = Σ_{i=1}^{k} ci λi vi,

then

T(Σ_{i=1}^{k} ci vi) - λk Σ_{i=1}^{k} ci vi = Σ_{i=1}^{k} ci λi vi - λk Σ_{i=1}^{k} ci vi
= ck λk vk + Σ_{i=1}^{k-1} ci λi vi - ck λk vk - λk Σ_{i=1}^{k-1} ci vi = Σ_{i=1}^{k-1} ci (λi - λk) vi = 0,

but our induction assumption is that the set {v1, v2, ..., v(k-1)} is linearly independent; then ci (λi - λk) = 0 for i = 1, 2, 3, ..., k - 1. However, in the hypothesis of the theorem we supposed that λi ≠ λj for i, j = 1, 2, 3, ..., k, and then necessarily ci = 0 for i = 1, 2, 3, ..., k - 1.
It only remains to prove that ck is also zero. We take again the original linear combination Σ_{i=1}^{k} ci vi; we use that ci = 0 for i = 1, 2, 3, ..., k - 1 and that all the eigenvectors are different from zero, and arrive at ck = 0.

The inverse of this theorem is not true. If a linear transformation T has a set of k linearly independent eigenvectors, the corresponding eigenvalues do not have to be different. We can have a linear transformation with a set of linearly independent eigenvectors and all the eigenvalues the same, as with the identity transformation.

Example 3. Let us consider the linear operator acting on ℝ³ whose associated matrix is the one analyzed in Example 1 of the next subsection; there we will demonstrate that its eigenvalues are 2, 1 and 3, and we will find the corresponding eigenvectors. As the determinant of the matrix formed with the three eigenvectors as columns is different from zero, the three vectors are linearly independent.


Let V be a finite dimensional vector space, S a subspace of V, and T a linear transformation from S to V. If the dimension of V is n, then the maximum number of eigenvalues is n. If the linear transformation T has exactly n different eigenvalues, then the corresponding eigenvectors constitute a basis for the vector space V, and the matrix A associated to the linear transformation T, in that basis of eigenvectors, is a diagonal matrix with the eigenvalues as diagonal elements.

The existence of n different eigenvalues is a sufficient condition to have a diagonal representation of the associated matrix, but it is not necessary. There are linear transformations with fewer than n eigenvalues and with a diagonal representation.

The existence of n linearly independent eigenvectors is a necessary and sufficient condition for the linear transformation to have a diagonal representation.

Example 4. Again we will analyze a linear operator acting on ℝ³, whose associated matrix is

( -2   2  -3
   2   1  -6
  -1  -2   0 ).

Its eigenvalues and eigenvectors are

5 ↔ (1, 2, -1),   -3 ↔ (-2, 1, 0),   -3 ↔ (3, 0, 1).

The eigenvalue -3 has multiplicity 2. However, the three eigenvectors are linearly independent. To prove that, we calculate the determinant

det (  1  -2  3
       2   1  0
      -1   0  1 ) = 8,

and as it is different from zero, the three vectors are linearly independent. It is clear then, that the condition that all the eigenvalues be different is sufficient but not necessary.

1.4.1 The finite dimension case

In the rest of this section, we will develop methods to find the eigenvalues and eigenvectors of linear transformations between vector spaces of finite dimension.

If T is a linear transformation T : S ⊂ V → V, with dim(V) = n, we want to find the scalars λ such that the equation T(v) = λv has nontrivial solutions; in other words, we want to find the nontrivial solutions of the equation

(λI - T)v = 0,    (1.29)

with v ≠ 0 and with I the identity transformation.
If A is the matrix associated with the linear transformation T : S ⊂ V → V, then the equation

(λI - A)x = 0    (1.30)

has a nontrivial solution, if and only if, the matrix λI - A is singular; in other words, it does not have an inverse, or equivalently det(λI - A) = 0. Then, if λ is an eigenvalue of the linear transformation T, it must satisfy the equation

det(λI - A) = 0.    (1.31)

Conversely, if λ satisfies the above equation, then it is an eigenvalue of the transformation T.

We define the function

p(λ) = det(λI - A);    (1.32)

then
a. The function p(λ) is a polynomial of degree n.
b. The highest degree term is λⁿ.
c. The constant term, p(0), is det(-A) = (-1)ⁿ det A.

The function p(λ) is called the characteristic polynomial of the matrix A. If the vector space is over the real numbers, then the eigenvalues of the linear transformation are the real roots of the characteristic polynomial.

Example 1. As promised in the previous subsection, we will demonstrate now that the eigenvalues of the matrix given there are 2, 1 and 3, and we will find the corresponding eigenvectors.

  • 26 Linear Algebra

    matrix, A = ( ~ ~ 6-:s ) . -6 6v'3 4

a) The characteristic polynomial is

p(λ) = det(λI - A) = det ( λ-7    -√3      6
                           -√3    λ-5   -6√3
                             6   -6√3    λ-4 ) = λ³ - 16λ² - 64λ + 1024.

b) To calculate the eigenvalues, we factorize the characteristic polynomial as

p(λ) = (λ - 8)(λ + 8)(λ - 16),

and then the eigenvalues of this matrix are λ1 = 8, λ2 = -8, λ3 = 16.
c) To find the eigenvectors, we have to solve the linear system A vi = λi vi for each of the already known eigenvalues. We start with λ = 8, and we get the equation

(  7    √3    -6 ) ( x )       ( x )
( √3     5   6√3 ) ( y )  = 8  ( y ).
( -6   6√3     4 ) ( z )       ( z )

The solution of this system is

x = √3 y,  z = 0.

The intersection of these two planes in the space ℝ³ is a straight line with parametric equation (x, y, z) = (√3, 1, 0)t, where t ∈ ℝ - {0}. We have to exclude the zero vector, as it is not considered an eigenvector, even if it satisfies the eigenvalue equation. We can take as eigenvector any one in this invariant subspace. One possibility is to take a normalized eigenvector; for simplicity we take v1 = (√3, 1, 0).
For the other two eigenvalues, the procedure is identical, and the invariant subspaces are the straight line (x, y, z) = (1, -√3, 2)t, with t ∈ ℝ - {0}, for the eigenvalue -8, and (x, y, z) = (-1, √3, 2)t, with t ∈ ℝ - {0}, for the eigenvalue 16. Summarizing, the eigenvalues and the eigenvectors are

8 ↔ (√3, 1, 0),   -8 ↔ (1, -√3, 2),   16 ↔ (-1, √3, 2).
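The result can be reproduced numerically with numpy (a sketch of ours; note that numpy normalizes the eigenvectors, so they may differ from the ones above by a scalar factor and appear in another order).

```python
import numpy as np

s3 = np.sqrt(3)
A = np.array([[ 7,   s3,   -6],
              [s3,    5, 6*s3],
              [-6, 6*s3,    4]])
vals, vecs = np.linalg.eig(A)
print(np.round(vals, 6))   # 8, -8 and 16
print(np.round(vecs, 6))   # columns proportional to (sqrt(3),1,0), (1,-sqrt(3),2), (-1,sqrt(3),2)
```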


    1.4.2 Similar matrices

    Two n x n matrices A and B are called similar if

B = P⁻¹ A P    (1.33)

for some invertible n × n matrix P.

Similar matrices represent the same linear transformation under two different bases. If the matrix A represents the linear transformation T in some basis and the matrix B represents the same linear transformation in another basis, then B = P⁻¹AP, with P being the change-of-basis matrix. The matrix P is sometimes called a similarity transformation. It is worth noting that this condition is sufficient and necessary; thus two matrices are similar, if and only if, they represent the same linear transformation.

    Similarity is an equivalence relation on the space of square matrices. That is, all the matrix representations of a linear operator T form an equivalence class of similar matrices.

Similar matrices share many properties: the determinant, the trace, the eigenvalues, and the characteristic polynomial.

Example 1. Consider the linear operator T on ℝ² defined by T(x, y) = (5x + y, 3x - 2y) and the following two bases, B = {(1, 2), (2, 3)} and B' = {(1, 3), (1, 4)}. We want to find the representation of the linear operator in both bases. We also want the change-of-basis matrix, and finally we want to show how the two representations are related.
We calculate the representations of T. To find the first column of the matrix A representing the linear operator T in the basis B = {(1, 2), (2, 3)}, we have to calculate T(1, 2) = (7, -1) and represent it in the basis B = {(1, 2), (2, 3)}; we get (7, -1) = -23(1, 2) + 15(2, 3). To evaluate the second column, we do the same with the other element of the basis, T(2, 3) = (13, 0) and (13, 0) = -39(1, 2) + 26(2, 3), so

A = ( -23  -39
       15   26 ).

For the matrix B, representing the linear operator T in the basis B' = {(1, 3), (1, 4)}, we do exactly the same, getting

B = (  35   41
      -27  -32 ).

Now we will find the change-of-basis matrix P. As (1, 3) = 3(1, 2) - (2, 3) and (1, 4) = 5(1, 2) - 2(2, 3), we have

P = (  3   5
      -1  -2 ).

Finally, since

P⁻¹ = (  2   5
        -1  -3 ),

we can verify that B = P⁻¹ A P.
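The final verification of Example 1 is a one-liner in numpy (our sketch, not part of the book).

```python
import numpy as np

A = np.array([[-23, -39], [15, 26]])
B = np.array([[35, 41], [-27, -32]])
P = np.array([[3, 5], [-1, -2]])
print(np.allclose(np.linalg.inv(P) @ A @ P, B))   # True -> A and B are similar
```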

    1.4.3 Diagonal matrices

If T is a linear transformation T : S ⊂ V → V, with dim(V) = n, and all the roots {λ1, λ2, ..., λn} of the characteristic polynomial are different in the corresponding field of scalars:
a) The corresponding eigenvectors {u1, u2, ..., un} constitute a basis for V.
b) The matrix that represents the linear transformation T, with respect to the ordered basis {u1, u2, ..., un}, is the diagonal matrix Λ = diag(λ1, λ2, ..., λn).
c) If A is the matrix that represents the linear transformation T with respect to another basis {e1, e2, ..., en}, then

Λ = C⁻¹ A C,    (1.34)

where C is the matrix that relates the two bases; i.e.,

U = E C.    (1.35)

    If the eigenvalues are not all different, that does not mean that there is not


    a diagonal representation. We will have a diagonal representation, if and only if, there are k linearly independent eigenvectors associated with each eigenvalue of multiplicity k.

Example 3. We will find the eigenvalues and the eigenvectors of the matrix

A = ( -13  -8  -4
       12   7   4
       24  16   7 ),

and the diagonal matrix that represents it. Also the similarity matrix will be presented.

a) The characteristic polynomial is

p(λ) = det(λI - A) = det ( λ+13     8      4
                            -12   λ-7     -4
                            -24   -16    λ-7 ) = λ³ - λ² - 5λ - 3 = (λ - 3)(λ + 1)².

b) The eigenvalues, which are the roots of the characteristic polynomial, are -1, with multiplicity 2, and 3.
c) In order to find the eigenvectors, we now have to solve for each eigenvalue the equation Av = λv. For the eigenvalue -1, reducing the system with the Gauss-Jordan procedure [Larson 09; Shores 07], we get the equation 3x + 2y + z = 0; that is a plane that goes through the origin. Then, the invariant subspace corresponding to the eigenvalue -1, with multiplicity 2, is two dimensional; it is a plane. Any pair of linearly independent vectors in that plane works as eigenvectors. The easy way to choose them is to write the parametric equations of the plane; taking as parameters y and z, and renaming them as s and t, respectively, we get (x, y, z) = (-(2/3)s - (1/3)t, s, t) = (-2/3, 1, 0)s + (-1/3, 0, 1)t. To avoid fractions, we take as eigenvectors (-2, 3, 0) and (-1, 0, 3). For the eigenvalue 3, the reduced system of equations is 2x + z = 0 and -2y + z = 0; i.e., the intersection of two planes, a straight line that goes through the origin. The parametric equation of that line is (x, y, z) = t(-1, 1, 2), with t ∈ ℝ and t ≠ 0. Then any vector in this invariant subspace works fine as eigenvector; we take the vector (-1, 1, 2).

Summarizing, the eigenvalues and the eigenvectors are

-1 ↔ (-2, 3, 0),   -1 ↔ (-1, 0, 3),   3 ↔ (-1, 1, 2).

Finally, the diagonal matrix is

( -1   0   0
   0  -1   0
   0   0   3 )

and the similarity matrix is

C = ( -2  -1  -1
       3   0   1
       0   3   2 ).
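Example 3 can be checked with numpy (our sketch, not part of the book): applying the similarity transformation with the eigenvectors as columns of C reproduces the diagonal matrix.

```python
import numpy as np

A = np.array([[-13, -8, -4], [12, 7, 4], [24, 16, 7]])
C = np.array([[-2, -1, -1], [3, 0, 1], [0, 3, 2]])
print(np.round(np.linalg.inv(C) @ A @ C, 10))
# [[-1.  0.  0.]
#  [ 0. -1.  0.]
#  [ 0.  0.  3.]]
```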

Example 4. In this example, we will try to find a diagonal representation for a 2 × 2 matrix A whose characteristic polynomial is

det(λI - A) = (λ - 1)²,

and so we have just one eigenvalue, 1, but with multiplicity 2. To find the corresponding eigenvectors we have to solve the equation (A - I)v = 0, which after the normal procedure is reduced to x = y. Thus, the invariant subspace is a straight line that goes through the origin with a slope of 45 degrees. But we already know that there will be a diagonal representation, if and only if, there are 2 linearly independent eigenvectors associated with our eigenvalue 1; hence the matrix A does not have a diagonal representation.

    1.4.3.1 Procedure to diagonalize a matrix

We present a summary of what we have studied until now about the method to diagonalize a matrix.
1. Calculate all the eigenvalues of the matrix A.
2. Calculate all the corresponding eigenvectors.
3. Build the C matrix with the eigenvectors as columns.
4. Apply the similarity transformation C⁻¹AC.
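The four steps condense into a few lines of numpy, as sketched below (our code, under the assumption that the matrix is diagonalizable; the function name is ours).

```python
import numpy as np

def diagonalize(A):
    eigenvalues, C = np.linalg.eig(A)      # steps 1 and 2
    Lam = np.linalg.inv(C) @ A @ C         # steps 3 and 4
    return C, Lam

A = np.array([[-13.0, -8, -4], [12, 7, 4], [24, 16, 7]])
C, Lam = diagonalize(A)
print(np.round(Lam, 8))   # diagonal matrix with the eigenvalues -1, -1, 3
```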

    1.4.3.2 The Cayley-Hamilton theorem

Functions of linear transformations are of fundamental importance in theory and in practice; especially important, for solving algebraic and differential systems of equations, is the exponential function of a linear operator. In the calculation of that kind of functions, the Cayley-Hamilton theorem, which we present in what follows, is central.
Let T be a linear operator on a finite dimensional vector space V. If p is the characteristic polynomial for T, then p(T) = 0. In other words, every square matrix satisfies its own characteristic equation.
The Cayley-Hamilton theorem always provides a relationship between the powers of the matrix A, associated to the linear operator T, which allows one to simplify expressions involving such powers, and to evaluate them without having to compute the power Aⁿ or any higher powers of A.

Example 1. Given the 2 × 2 matrix

A = ( -1  2
       3  1 ),

we have p_A(λ) = λ² - 7, and then

p_A(A) = A² - 7I = ( 0  0
                     0  0 ) = 0.

Hence, we get

A² = 7I,  A³ = A(7I) = 7A,  A⁴ = AA³ = A × 7A = 7A² = 49I,

and so on. In fact, A^(2j) = 7^j I and A^(2j+1) = 7^j A, for j = 0, 1, 2, .... If we want to calculate e^(tA), we have

e^(tA) = Σ_{k=0}^{∞} (t^k / k!) A^k
       = I Σ_{k=0}^{∞} (√7 t)^(2k) / (2k)! + (A/√7) Σ_{k=0}^{∞} (√7 t)^(2k+1) / (2k+1)!
       = cosh(√7 t) I + (sinh(√7 t)/√7) A,

or explicitly

e^(tA) = ( cosh(√7 t) - (1/√7) sinh(√7 t)            (2/√7) sinh(√7 t)
                (3/√7) sinh(√7 t)            cosh(√7 t) + (1/√7) sinh(√7 t) ).
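The closed form obtained from the Cayley-Hamilton theorem can be checked against scipy's matrix exponential (our sketch, not part of the book; the value of t is arbitrary).

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 2.0], [3.0, 1.0]])
t = 0.7
closed_form = np.cosh(np.sqrt(7)*t) * np.eye(2) + np.sinh(np.sqrt(7)*t) / np.sqrt(7) * A
print(np.allclose(closed_form, expm(t * A)))   # True
```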

    1.5 Linear operators acting on Euclidian spaces

We already said that the words linear transformation, linear map and linear operator are synonymous; however, it is usual that when a linear transformation acts from a vector space into itself, it is called a linear operator, or just operator; we will use that convention in all this section. Also in this section, E will denote an Euclidian space over the complex numbers; i.e., a complex vector space with a specific scalar product.


1.5.1 Adjoint operators

Definition of the adjoint operator. Let T be a linear operator on an Euclidian space E. Then, we say that T has an adjoint on E if there exists a linear operator T' on E such that (T(v), u) = (v, T'(u)) for all v and u in E.

Let T be a linear operator on a finite dimensional Euclidian space E. Then there exists a unique linear operator T' on E such that, for all v, u in E, we have

(T(v), u) = (v, T'(u)).    (1.36)

Thus every linear operator on a finite dimensional Euclidian space E has an adjoint. In the infinite dimensional case this is not always true. When this linear operator exists, we call it the adjoint of T, and we denote it T†.

Two comments should be made about the finite dimensional case.
1. When it exists, the adjoint T† is unique.
2. The adjoint of T depends not only on T, but on the inner product as well.

The adjoint has the following properties.
1. (T + S)† = T† + S†.
2. (TS)† = S†T†.
3. (rT)† = r*T†, where r is an arbitrary scalar.
4. (T†)† = T.

Example 1. In the usual Euclidian space ℝ², consider the linear operator T : (x, y) → (2x - 3y, 5x + y). The adjoint operator is T† : (x, y) → (2x + 5y, -3x + y). Given two arbitrary vectors, v1 = (x1, y1) and v2 = (x2, y2), in ℝ², we have that

(T(v1), v2) = ((2x1 - 3y1, 5x1 + y1), (x2, y2)) = (2x1 - 3y1)x2 + (5x1 + y1)y2 = 2x1x2 - 3y1x2 + 5x1y2 + y1y2

and

(v1, T†(v2)) = ((x1, y1), (2x2 + 5y2, -3x2 + y2)) = x1(2x2 + 5y2) + y1(-3x2 + y2) = 2x1x2 + 5x1y2 - 3x2y1 + y1y2.

Thus, (T(v1), v2) = (v1, T†(v2)), and effectively T† is the adjoint operator of T.

Let T be a linear operator on a finite dimensional Euclidian space E, and let A be the matrix associated with it in a given basis. Then the matrix associated with the adjoint linear operator is the adjoint matrix A†, i.e., the transposed and conjugated matrix.


Example 2. Consider the linear operator of Example 1 above. The matrix associated with the linear operator T is
\begin{pmatrix} 2 & -3 \\ 5 & 1 \end{pmatrix},
and the matrix associated with the adjoint linear operator T† is
\begin{pmatrix} 2 & 5 \\ -3 & 1 \end{pmatrix}.

1.5.2 Hermitian and anti-Hermitian operators

If T is a linear operator on E and (T(v), u) = (v, T(u)) for all v and all u in E, the linear transformation is called Hermitian or self-adjoint; in other words, a linear operator is Hermitian or self-adjoint if T† = T. Hermitian transformations are very important, since they play for operators the role that the real numbers play among the complex numbers. In quantum mechanics, every dynamical variable has an associated operator, and this operator must be a Hermitian operator.

Example 1. In the Euclidian space formed by the vector space
\mathfrak{A} = {f : (a, b) -> C | f is a continuous function in (a, b) and f(a) = f(b) = 0},
with the scalar product (f, g) = \int_a^b f(x) g^*(x)\, dx, the operator p = -i d/dx is a Hermitian operator. If we take any two functions f and g in \mathfrak{A}, and we integrate by parts,
(f, p(g)) = \left(f, -i\frac{dg}{dx}\right) = \int_a^b f(x)\left[-i\frac{dg(x)}{dx}\right]^* dx = \int_a^b \left[-i\frac{df(x)}{dx}\right] g^*(x)\, dx = (p(f), g),

    so the operator satisfies the Hermitian condition.
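A symbolic spot-check of this calculation (a sketch, not part of the book): for two particular functions that vanish at the endpoints of (0, 1), the two scalar products coincide.

    import sympy as sp

    x = sp.symbols('x', real=True)
    f = sp.sin(sp.pi * x)          # f(0) = f(1) = 0
    g = sp.sin(2 * sp.pi * x)      # g(0) = g(1) = 0

    def p(h):
        """The operator p = -i d/dx."""
        return -sp.I * sp.diff(h, x)

    def inner(u, v):
        """Scalar product (u, v) = integral of u v* on (0, 1)."""
        return sp.integrate(u * sp.conjugate(v), (x, 0, 1))

    lhs = sp.simplify(inner(f, p(g)))   # (f, p(g))
    rhs = sp.simplify(inner(p(f), g))   # (p(f), g)
    print(lhs, rhs, sp.simplify(lhs - rhs) == 0)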

If T is a linear operator on E and (T(v), u) = -(v, T(u)) for all v and all u in E, the linear transformation is called anti-Hermitian or anti-adjoint; in other words, a linear operator is anti-Hermitian or anti-adjoint if T† = -T. Anti-Hermitian operators are the analogue of the pure imaginary numbers.

In the case of real Euclidian spaces, the Hermitian and anti-Hermitian linear operators are called symmetric and anti-symmetric linear operators, respectively.

If T is a linear operator on E and {e_1, e_2, ..., e_n} is a basis of E, then
a) The linear operator T is Hermitian, if and only if, (T(e_j), e_i) = (e_j, T(e_i)) for all i and all j from 1 to n.


b) The linear operator T is anti-Hermitian, if and only if, (T(e_j), e_i) = -(e_j, T(e_i)) for all i and all j from 1 to n.
These properties follow directly from the linearity of the operator T and from the fact that in E all vectors can be written as linear combinations of the basis vectors.

If T is a linear operator on E, {e_1, e_2, ..., e_n} is an orthonormal basis of E, and A = (a_{ij}) is the matrix associated with the linear transformation T in the given basis, then
a) The linear operator T is Hermitian, if and only if, the matrix A is self-adjoint or Hermitian; i.e., A† = A.
b) The linear operator T is anti-Hermitian, if and only if, the matrix A is skew-adjoint or anti-Hermitian; i.e., A† = -A.

1.5.3 Properties of the eigenvalues and eigenvectors of the Hermitian operators

If T is a linear operator on the Euclidian space E, \lambda is one of its eigenvalues and v is the corresponding eigenvector, then
\lambda = \frac{(T(v), v)}{(v, v)}.   (1.37)
As v is an eigenvector of the linear operator T with eigenvalue \lambda, we have that Tv = \lambda v; then, from the properties of the scalar product, (T(v), v) = (\lambda v, v) = \lambda(v, v), and expression (1.37) follows immediately. Also

\lambda^* = \frac{(v, T(v))}{(v, v)}.   (1.38)
It is clear that an eigenvalue is real, if and only if,
(T(v), v) = (v, T(v))   (1.39)
for all v in E. This condition is trivially satisfied in a real Euclidian space.
In case that
(T(v), v) = -(v, T(v))   (1.40)


for all v in E, the corresponding eigenvalue is pure imaginary.

Summarizing, if T is a linear operator on E and \lambda is an eigenvalue of it, then
a) If the linear operator T is Hermitian, \lambda is real.
b) If the linear operator T is anti-Hermitian, \lambda is pure imaginary.

If T is a Hermitian linear operator on the Euclidian space E and \lambda and \mu are different eigenvalues with v and u the corresponding eigenvectors, then v and u are orthogonal; i.e., (v, u) = 0. As v and u are eigenvectors of T, we have that (T(v), u) = (\lambda v, u) = \lambda(v, u) and also that (v, T(u)) = (v, \mu u) = \mu^*(v, u); then, as T is Hermitian, (T(v), u) = (v, T(u)) and \lambda(v, u) = \mu^*(v, u), or (\lambda - \mu^*)(v, u) = 0. By hypothesis, the eigenvalues are different (and real, since \mu^* = \mu), so necessarily (v, u) = 0. Exactly the same thing happens in the case of the anti-Hermitian transformations; in other words, the eigenvectors corresponding to different eigenvalues are orthogonal.

    If T is a Hermitian linear operator on E and dimE = n, then there are n eigenvectors v1 , v2 , ... , Vn of T that form an orthonormal basis of E.

The matrix corresponding to T in the basis v_1, v_2, ..., v_n is a diagonal matrix, \Lambda = diag(\lambda_1, \lambda_2, ..., \lambda_n), where \lambda_k is the eigenvalue corresponding to the eigenvector v_k, for k from 1 to n.

    If the transformation is anti-Hermitian a similar thing happens; i.e., also there are n eigenvectors of T that form an orthonormal basis of E.

Any square matrix A = (a_{ij}), Hermitian or anti-Hermitian, is similar to the diagonal matrix \Lambda = diag(\lambda_1, \lambda_2, ..., \lambda_n) of the eigenvalues. There exists, then, a matrix P such that \Lambda = P^{-1}AP. The similarity matrix P is the eigenvectors matrix, i.e., the matrix whose columns are the eigenvectors. The similarity matrix P is non-singular and unitary; i.e., P^{-1} = P†.
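These statements are easy to verify numerically; the sketch below (not part of the book) builds a random Hermitian matrix and checks that its eigenvalues are real and that the eigenvector matrix P is unitary and diagonalizes it.

    import numpy as np

    rng = np.random.default_rng(1)
    M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    A = (M + M.conj().T) / 2              # a Hermitian matrix

    lam, P = np.linalg.eigh(A)            # eigenvalues and orthonormal eigenvectors

    print(np.all(np.isreal(lam)))                           # eigenvalues are real
    print(np.allclose(P.conj().T @ P, np.eye(4)))           # P is unitary: P^{-1} = P^dagger
    print(np.allclose(P.conj().T @ A @ P, np.diag(lam)))    # P^{-1} A P = diag(lambda_k)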


    Chapter 2

    Special functions

In this Chapter we will briefly review a few properties of some special functions that will later be used to solve some differential equations.

    2.1 Hermite polynomials

The generating function for the Hermite polynomials is
e^{2xt - t^2} = \sum_{n=0}^{\infty} H_n(x) \frac{t^n}{n!}.   (2.1)
The Hermite polynomials may be obtained from Rodrigues' formula
H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2};   (2.2)
the first ones are
H_0(x) = 1,
H_1(x) = 2x,
H_2(x) = 4x^2 - 2,
H_3(x) = 8x^3 - 12x,
H_4(x) = 16x^4 - 48x^2 + 12;   (2.3)
some of them are shown in Figure 2.1. From the recurrence relations
H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x),   (2.4)


Fig. 2.1 Some Hermite polynomials.

    and

\frac{dH_n(x)}{dx} = 2nH_{n-1}(x),   (2.5)

    we can generate all the Hermite polynomials.
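The recurrence (2.4) can be turned directly into a small routine; the sketch below (not part of the book) generates the first Hermite polynomials symbolically and reproduces the list (2.3).

    import sympy as sp

    x = sp.symbols('x')

    def hermite_list(nmax):
        """H_0, ..., H_nmax from H_{n+1} = 2x H_n - 2n H_{n-1}."""
        H = [sp.Integer(1), 2 * x]
        for n in range(1, nmax):
            H.append(sp.expand(2 * x * H[n] - 2 * n * H[n - 1]))
        return H[:nmax + 1]

    for n, Hn in enumerate(hermite_list(4)):
        print(n, Hn, sp.simplify(Hn - sp.hermite(n, x)) == 0)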

From the above recurrence relations, we can also prove that, if we define the functions
\psi_n(x) = \frac{1}{\sqrt{2^n n! \sqrt{\pi}}}\, e^{-x^2/2} H_n(x),   (2.6)
then ([Arfken 05])
\frac{1}{\sqrt{2}}\left(x - \frac{d}{dx}\right)\psi_n(x) = \sqrt{n+1}\,\psi_{n+1}(x)   (2.7)
and
\frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_n(x) = \sqrt{n}\,\psi_{n-1}(x).   (2.8)

    The functions (2.6) constitute a complete orthonormal set for the space of square


integrable functions; then we can expand any function in that space as
f(x) = \sum_{n=0}^{\infty} c_n \psi_n(x),   (2.9)
where
c_n = \int_{-\infty}^{\infty} dx\, f(x) \psi_n(x).   (2.10)
Hermite polynomials are also solutions of the second order ordinary differential equation
y'' - 2xy' + 2ny = 0.   (2.11)

Let us define the differential operator p as
p = -i\frac{d}{dx};   (2.12)
then we have that
\frac{d^n}{dx^n} = (ip)^n,
and we can rewrite (2.2) in the form
H_n(x) = (-1)^n e^{x^2} (ip)^n e^{-x^2}\, 1 = (-i)^n \left(e^{x^2} p\, e^{-x^2}\right)^n 1.   (2.13)
The operator inside the parentheses above has the form
e^{\xi A} B e^{-\xi A},   (2.14)
that will appear frequently when we solve differential equations in the following Chapters. We can obtain an expression for this type of operators by developing the exponentials in a Taylor series, to obtain ([Arfken 05; Orszag 08]) the
Hadamard lemma: Given two linear operators A and B, then
e^{\xi A} B e^{-\xi A} = B + \xi AB - \xi BA + \frac{\xi^2}{2!}A^2B - \xi^2 ABA + \frac{\xi^2}{2!}BA^2 + \cdots
= B + \xi[A, B] + \frac{\xi^2}{2!}[A, [A, B]] + \frac{\xi^3}{3!}[A, [A, [A, B]]] + \cdots,   (2.15)
where [A, B] = AB - BA is the commutator of the operators A and B.
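For finite matrices the lemma can be checked term by term; the following sketch (not part of the book) compares e^{\xi A} B e^{-\xi A} with the first few nested commutators for random matrices and a small \xi.

    import numpy as np
    from math import factorial
    from scipy.linalg import expm

    rng = np.random.default_rng(2)
    A = rng.normal(size=(4, 4))
    B = rng.normal(size=(4, 4))
    xi = 0.05

    def comm(X, Y):
        """Commutator [X, Y] = XY - YX."""
        return X @ Y - Y @ X

    exact = expm(xi * A) @ B @ expm(-xi * A)

    series = B.copy()
    nested = B.copy()
    for k in range(1, 8):      # B + xi [A,B] + xi^2/2! [A,[A,B]] + ...
        nested = comm(A, nested)
        series = series + (xi ** k / factorial(k)) * nested

    print(np.max(np.abs(exact - series)))   # very small residual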


We are now in the position to obtain an expression for the operator e^{x^2} p\, e^{-x^2} that appears in formula (2.13) developed above. We identify
A = x^2,   B = p,   \xi = 1,
in equation (2.15), and we get
e^{x^2} p\, e^{-x^2} = p + [x^2, p] + \frac{1}{2!}[x^2, [x^2, p]] + \frac{1}{3!}[x^2, [x^2, [x^2, p]]] + \cdots.   (2.16)
To calculate the first commutator [x^2, p], we use the general property
[AB, C] = A[B, C] + [A, C]B   (2.17)
of commutators, and that
[x, p]f(x) = -i\left(x\frac{d}{dx} - \frac{d}{dx}x\right)f(x) = -ixf'(x) + ixf'(x) + if(x) = if(x);
i.e., [x, p] = i, to get the commutation relation
[x^2, p] = 2ix.
It is obvious that all the other commutators in (2.16) are zero, and we finally get (see [Arfken 05], problem 13.1.5)
H_n(x) = (-i)^n (p + 2ix)^n 1.   (2.18)
This last expression can be used to obtain the generating function. We have,
\sum_{n=0}^{\infty} H_n(x) \frac{\alpha^n}{n!} = \sum_{n=0}^{\infty} \frac{(-i\alpha)^n}{n!}(p + 2ix)^n 1 = e^{-i\alpha(p + 2ix)}\, 1 = e^{2\alpha x - i\alpha p}\, 1.   (2.19)

We obtained the exponential of the sum of two quantities that do not commute. The above exponential can be factorized into a product of exponentials via the Baker-Hausdorff formula [Louisell 90]:

2.1.1 Baker-Hausdorff formula

Given two operators A and B that obey

    [[A,B] ,A]= [[A,B] ,B] = 0, (2.20)


then
e^{A+B} = e^{-\frac{1}{2}[A,B]}\, e^{A}\, e^{B}.   (2.21)

    Demonstration: We define

F(\lambda) = e^{\lambda(A+B)} = e^{g(\lambda)[A,B]}\, e^{\lambda A}\, e^{\lambda B}.   (2.22)

Differentiating the second part of the above equation with respect to \lambda, we have
\frac{dF(\lambda)}{d\lambda} = (A + B)\, e^{\lambda(A+B)} = (A + B)\, F(\lambda),   (2.23)
and differentiating the last part of (2.22) with respect to \lambda, we have
\frac{dF(\lambda)}{d\lambda} = g'(\lambda)[A, B] F(\lambda) + e^{g(\lambda)[A,B]} A e^{\lambda A} e^{\lambda B} + e^{g(\lambda)[A,B]} e^{\lambda A} B e^{\lambda B}.
Using equation (2.15) and the hypothesis of the problem, [[A, B], A] = [[A, B], B] = 0, it is very easy to show that

e^{g(\lambda)[A,B]}\, A = A\, e^{g(\lambda)[A,B]}

and that
e^{\lambda A}\, B = \left(B + \lambda[A, B]\right) e^{\lambda A},
so

\frac{dF(\lambda)}{d\lambda} = \left\{ g'(\lambda)[A, B] + A + B + \lambda[A, B] \right\} F(\lambda).   (2.24)
By comparing (2.23) and (2.24), we obtain the following differential equation,
\left[g'(\lambda) + \lambda\right][A, B] = 0,
with solution
g(\lambda) = -\frac{\lambda^2}{2},
where the initial condition g(\lambda = 0) = 0 has been used, because F(\lambda = 0) = 1. Now we evaluate at \lambda = 1, and we get the Baker-Hausdorff formula (2.21).


We are now ready to apply the Baker-Hausdorff formula, expression (2.21), to the formula
\sum_{n=0}^{\infty} H_n(x) \frac{\alpha^n}{n!} = e^{2\alpha x - i\alpha p}\, 1
derived above (equation (2.19)), and obtain the generating function. We identify
A = 2\alpha x,   B = -i\alpha p,
and then
[A, B] = [2\alpha x, -i\alpha p] = -2i\alpha^2 [x, p] = 2\alpha^2,
such that
e^{2\alpha x - i\alpha p} = e^{-\alpha^2}\, e^{2\alpha x}\, e^{-i\alpha p}.
Using now the obvious fact that e^{-i\alpha p}\, 1 = 1, we get
\sum_{n=0}^{\infty} H_n(x) \frac{\alpha^n}{n!} = e^{2\alpha x - \alpha^2},   (2.25)
that is the generating function for the Hermite polynomials.
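A quick numerical sanity check of (2.25) (a sketch, not part of the book): truncate the sum and compare with exp(2\alpha x - \alpha^2).

    import numpy as np
    from math import factorial
    from scipy.special import eval_hermite

    x, alpha = 0.7, 0.3
    series = sum(eval_hermite(n, x) * alpha ** n / factorial(n) for n in range(40))
    print(series, np.exp(2 * alpha * x - alpha ** 2))   # the two numbers agree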

The Hermite polynomials can also be calculated as the determinant of the n x n matrix whose elements are
M_{ij} = \begin{cases} 2x & \text{if } i = j, \\ \sqrt{2i} & \text{if } j = i + 1, \\ \sqrt{2(i-1)} & \text{if } j = i - 1, \\ 0 & \text{otherwise;} \end{cases}   (2.26)
or in other words,


H_n(x) = \begin{vmatrix}
2x & \sqrt{2} & 0 & 0 & \cdots & 0 & 0 \\
\sqrt{2} & 2x & \sqrt{4} & 0 & \cdots & 0 & 0 \\
0 & \sqrt{4} & 2x & \sqrt{6} & \cdots & 0 & 0 \\
0 & 0 & \sqrt{6} & 2x & \cdots & 0 & 0 \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 2x & \sqrt{2(n-1)} \\
0 & 0 & 0 & 0 & \cdots & \sqrt{2(n-1)} & 2x
\end{vmatrix}.   (2.27)
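The determinant representation is easy to test numerically; this sketch (not part of the book) builds the tridiagonal matrix of (2.26)-(2.27) for a given n and compares its determinant with H_n(x).

    import numpy as np
    from scipy.special import eval_hermite

    def hermite_det(n, x):
        """H_n(x) as the determinant of the n x n tridiagonal matrix (2.27)."""
        M = np.zeros((n, n))
        np.fill_diagonal(M, 2 * x)
        off = np.sqrt(2.0 * np.arange(1, n))     # sqrt(2), sqrt(4), ..., sqrt(2(n-1))
        M[np.arange(n - 1), np.arange(1, n)] = off
        M[np.arange(1, n), np.arange(n - 1)] = off
        return np.linalg.det(M)

    for n in range(1, 7):
        print(n, hermite_det(n, 0.8), eval_hermite(n, 0.8))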

2.1.2 Series of even Hermite polynomials

In order to show the power of the operator methods, we calculate now the value of the following series of even Hermite polynomials,
F(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} H_{2n}(x).

From (2.18), we get H_{2n}(x) = (-1)^n (p + 2ix)^{2n} 1.

Therefore,
F(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} H_{2n}(x) = \sum_{n=0}^{\infty} \frac{t^n}{n!}(-1)^n (p + 2ix)^{2n} 1 = \sum_{n=0}^{\infty} \frac{1}{n!}\left[-t(p + 2ix)^2\right]^n 1 = \exp\left[-t(p + 2ix)^2\right] 1.   (2.28)
Expanding the square in the exponential, we get
F(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} H_{2n}(x) = \exp\left\{-t\left[p^2 - 4x^2 + 2i(xp + px)\right]\right\} 1.   (2.29)


The operators in the exponential in this last expression do not satisfy the conditions of the Baker-Hausdorff formula, equation (2.21); so we need another method to understand the action of the full operator that appears in the right side of expression (2.29). What we do is propose the ansatz (a German word; in physics and mathematics it is an educated guess that is verified later by its results),

F(t) = \exp\left[f(t)x^2\right] \exp\left[g(t)(xp + px)\right] \exp\left[h(t)p^2\right] 1,   (2.30)
where f(t), g(t) and h(t) are functions that we have to determine. Differentiating this expression with respect to t, and dropping the explicit dependence of f(t), g(t) and h(t) on t,

\frac{dF(t)}{dt} = \frac{df}{dt}x^2 F(t) + \frac{dg}{dt}\exp(fx^2)(xp + px)\exp\left[g(xp + px)\right]\exp(hp^2)\, 1 + \frac{dh}{dt}\exp(fx^2)\exp\left[g(xp + px)\right] p^2 \exp(hp^2)\, 1.

Introducing a "smart" 1 in the second and third terms, we get

\frac{dF(t)}{dt} = \frac{df}{dt}x^2 F(t) + \frac{dg}{dt}\, e^{fx^2}(xp + px)\, e^{-fx^2} F(t) + \frac{dh}{dt}\, e^{fx^2} \exp\left[g(xp + px)\right] p^2 \exp\left[-g(xp + px)\right] e^{-fx^2} F(t).   (2.31)

    We work then with the operator in the second term; we have to use the very useful expression, already introduced (expression 2.15 on page 39),

e^{\xi A} B e^{-\xi A} = B + \xi[A, B] + \frac{\xi^2}{2!}[A, [A, B]] + \frac{\xi^3}{3!}[A, [A, [A, B]]] + \cdots,

to get
e^{fx^2}(xp + px)\, e^{-fx^2} = (xp + px) + f\left[x^2, xp + px\right] + \frac{f^2}{2!}\left[x^2, \left[x^2, xp + px\right]\right] + \cdots.   (2.32)

The first commutator that appears in the above expression is easily calculated,
\left[x^2, xp + px\right] = 4ix^2,
and so all the other commutators are zero. Substituting back in (2.32), we get
e^{fx^2}(xp + px)\, e^{-fx^2} = xp + px + 4ifx^2.


We analyze now the third term in expression (2.31), i.e. \exp(fx^2)\exp\left[g(xp + px)\right] p^2 \exp\left[-g(xp + px)\right]\exp(-fx^2).

    We study first only a part of it, exp[g(xp+px)]p2 exp[-g(xp+px)]. Using again (2.15),

\exp\left[g(xp + px)\right] p^2 \exp\left[-g(xp + px)\right] = p^2 + g\left[xp + px, p^2\right] + \frac{g^2}{2!}\left[xp + px, \left[xp + px, p^2\right]\right] + \frac{g^3}{3!}\left[xp + px, \left[xp + px, \left[xp + px, p^2\right]\right]\right] + \cdots.
Calculating the first commutators,

\left[xp + px, p^2\right] = 4ip^2,
\left[xp + px, \left[xp + px, p^2\right]\right] = (4i)^2 p^2 = -16p^2,
\left[xp + px, \left[xp + px, \left[xp + px, p^2\right]\right]\right] = (4i)^3 p^2 = -64ip^2,
and so on. It is clear that
\exp\left[g(xp + px)\right] p^2 \exp\left[-g(xp + px)\right] = p^2 \sum_{j=0}^{\infty} \frac{(4ig)^j}{j!} = p^2 \exp(4ig).

We proceed now to complete the study of the third term in expression (2.31). Until now we have
\exp(fx^2)\exp\left[g(xp + px)\right] p^2 \exp\left[-g(xp + px)\right]\exp(-fx^2) = \exp(4ig)\exp(fx^2)\, p^2 \exp(-fx^2).

    We use once more formula (2.15), to write

\exp(fx^2)\, p^2 \exp(-fx^2) = p^2 + f\left[x^2, p^2\right] + \frac{f^2}{2!}\left[x^2, \left[x^2, p^2\right]\right] + \frac{f^3}{3!}\left[x^2, \left[x^2, \left[x^2, p^2\right]\right]\right] + \cdots.

The first commutator gives
\left[x^2, p^2\right] = 2i(xp + px),
the second one gives


\left[x^2, \left[x^2, p^2\right]\right] = \left[x^2, 2i(xp + px)\right] = -8x^2,
and the third one
\left[x^2, \left[x^2, \left[x^2, p^2\right]\right]\right] = 0,
such that all the other commutators are zero, and

\exp(fx^2)\, p^2 \exp(-fx^2) = p^2 + 2if(xp + px) + \frac{f^2}{2!}(-8x^2) = p^2 + 2if(xp + px) - 4f^2x^2.
Finally, we can write a reduced expression for the derivative of the original series F(t),

\frac{dF(t)}{dt} = \left\{ \frac{df}{dt}x^2 + \frac{dg}{dt}\left(xp + px + 4ifx^2\right) + \exp(4ig)\frac{dh}{dt}\left[p^2 + 2if(xp + px) - 4f^2x^2\right] \right\} F(t),

    and rearranging terms

\frac{dF(t)}{dt} = \left\{ \left[\frac{df}{dt} + 4if\frac{dg}{dt} - 4f^2\exp(4ig)\frac{dh}{dt}\right]x^2 + \left[\frac{dg}{dt} + 2if\exp(4ig)\frac{dh}{dt}\right](xp + px) + \exp(4ig)\frac{dh}{dt}p^2 \right\} F(t).

    We get back to the original expression for the series

F(t) = \exp\left\{-t\left[p^2 - 4x^2 + 2i(xp + px)\right]\right\} 1,
and take the derivative with respect to t,
\frac{dF(t)}{dt} = -\left[p^2 - 4x^2 + 2i(xp + px)\right]\exp\left\{-t\left[p^2 - 4x^2 + 2i(xp + px)\right]\right\} 1 = \left[-p^2 + 4x^2 - 2i(xp + px)\right] F(t).

    Comparing now both expressions, we get the system of differential equations

\frac{df}{dt} + 4if\frac{dg}{dt} - 4f^2\exp(4ig)\frac{dh}{dt} = 4,
\frac{dg}{dt} + 2if\exp(4ig)\frac{dh}{dt} = -2i,   (2.33)
\exp(4ig)\frac{dh}{dt} = -1.


The initial conditions that we must set on these equations can be easily understood from equation (2.29) and from the ansatz (2.30); as for t = 0 the operator in the right side of (2.29) is the identity, we must impose f(0) = g(0) = h(0) = 0. We outline now the procedure to solve the system (2.33). From the last equation the value of dh/dt is found and substituted in the second equation, where the value of dg/dt is also found; then both values are substituted in the first equation. The differential equation so obtained is solved for f, with the initial condition f(0) = 0; substituting it in the second equation, the function g is obtained, and finally both values are substituted in the third one, and the value of h is obtained. The result that we get is

f = \frac{4t}{4t + 1},   g = -\frac{i}{2}\ln(4t + 1),   h = -\frac{t}{4t + 1}.

    We calculate now explicitly

F(t) = \exp(fx^2)\exp\left[g(xp + px)\right]\exp(hp^2)\, 1.
Remembering the definition of p (expression 2.12), it is very easy to see that
\exp(hp^2)\, 1 = 1,
and then
F(t) = \exp(fx^2)\exp\left[g(xp + px)\right] 1.
Using now the commutator [x, p] = i, we write

\exp\left[g(xp + px)\right] 1 = \exp\left[g(2xp - i)\right] 1 = \exp(-ig)\exp(2gxp)\, 1,
and it is also clear that
\exp(2gxp)\, 1 = \left[1 + gxp + \frac{g^2}{2}(xp)^2 + \cdots\right] 1 = 1 + gxp\,1 + \frac{g^2}{2}(xp)^2 1 + \cdots = 1,
and also that
\exp\left[g(xp + px)\right] 1 = \exp(-ig).


    We have then

F(t) = \exp(fx^2 - ig).
Substituting the functions f and g,
F(t) = \exp\left[\frac{4t}{4t + 1}x^2 - \frac{1}{2}\ln(4t + 1)\right] = \frac{1}{\sqrt{4t + 1}}\exp\left(\frac{4tx^2}{4t + 1}\right),
and finally we get the formula we were looking for,
\sum_{n=0}^{\infty} \frac{t^n}{n!} H_{2n}(x) = \frac{1}{\sqrt{4t + 1}}\exp\left(\frac{4tx^2}{4t + 1}\right).   (2.34)

It is interesting to look for some special values. For example, for t = 0, it is obvious that F(t) = 1, as formula (2.34) correctly predicts. As H_{2n}(0) = (-1)^n \frac{(2n)!}{n!} [Arfken 05], we have that for t = 1 and x = 0, formula (2.34) gives us
\sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{(n!)^2} = \frac{\sqrt{5}}{5}.   (2.35)
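Formula (2.34) can also be checked numerically for small t, where the series converges quickly; a sketch (not part of the book):

    import numpy as np
    from math import factorial
    from scipy.special import eval_hermite

    x, t = 0.6, 0.05
    series = sum(t ** n / factorial(n) * eval_hermite(2 * n, x) for n in range(40))
    closed = np.exp(4 * t * x ** 2 / (4 * t + 1)) / np.sqrt(4 * t + 1)
    print(series, closed)   # the two values agree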

2.1.3 Addition formula

We want to apply the form obtained in (2.18), H_n(x) = (-i)^n (p + 2ix)^n 1, to evaluate the quantity H_n(x + y). We write it as
H_n(x + y) = (-i)^n\left[-i\frac{d}{d(x + y)} + 2i(x + y)\right]^n 1,   (2.36)

    by using the chain rule we have

\frac{d}{d(x + y)} = \frac{1}{2}\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right),

so that we may re-express (2.36) in the form
H_n(x + y) = \frac{(-i)^n}{2^{n/2}}\left(-i\frac{\partial}{\partial(\sqrt{2}x)} + 2i\sqrt{2}x - i\frac{\partial}{\partial(\sqrt{2}y)} + 2i\sqrt{2}y\right)^n 1.

    By defining


p_x = -i\frac{\partial}{\partial x},   p_y = -i\frac{\partial}{\partial y},
with X = \sqrt{2}x and Y = \sqrt{2}y, we obtain

H_n(x + y) = \frac{1}{2^{n/2}}\sum_{k=0}^{n}\binom{n}{k}(-i)^k (p_X + 2iX)^k (-i)^{n-k}(p_Y + 2iY)^{n-k}\, 1,
that by using H_n(x) = (-i)^n (p + 2ix)^n 1 adds up to
H_n(x + y) = \frac{1}{2^{n/2}}\sum_{k=0}^{n}\binom{n}{k} H_k(\sqrt{2}x) H_{n-k}(\sqrt{2}y),   (2.37)
which is the addition formula we were looking for.
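A numerical check of the addition formula (2.37) (a sketch, not part of the book):

    import numpy as np
    from scipy.special import comb, eval_hermite

    def hermite_addition(n, x, y):
        """Right-hand side of the addition formula (2.37)."""
        total = sum(comb(n, k) * eval_hermite(k, np.sqrt(2) * x) *
                    eval_hermite(n - k, np.sqrt(2) * y) for k in range(n + 1))
        return total / 2 ** (n / 2)

    n, x, y = 5, 0.4, -0.9
    print(eval_hermite(n, x + y), hermite_addition(n, x, y))   # the two values agree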

    2.2 Associated Laguerre polynomials


The generating function for the associated Laguerre polynomials is [Arfken 05]
\sum_{n=0}^{\infty} L_n^{\alpha}(x)\, t^n = \frac{1}{(1 - t)^{\alpha+1}}\exp\left(\frac{-xt}{1 - t}\right),   |t| < 1.   (2.38)

The associated Laguerre polynomials may be obtained from the corresponding Rodrigues' formula
L_n^{\alpha}(x) = \frac{e^x x^{-\alpha}}{n!}\frac{d^n}{dx^n}\left(e^{-x} x^{n+\alpha}\right),   (2.39)
the first ones being

L_0^{\alpha}(x) = 1,
L_1^{\alpha}(x) = (1 + \alpha) - x,
L_2^{\alpha}(x) = \frac{1}{2}(2 + 3\alpha + \alpha^2) - (2 + \alpha)x + \frac{x^2}{2},
L_3^{\alpha}(x) = \left(1 + \frac{11\alpha}{6} + \alpha^2 + \frac{\alpha^3}{6}\right) - \left(3 + \frac{5\alpha}{2} + \frac{\alpha^2}{2}\right)x + \frac{1}{2}(3 + \alpha)x^2 - \frac{x^3}{6}.   (2.40)

The associated Laguerre polynomials satisfy several recurrence relations. One very useful, when extracting properties of the wave functions of the hydrogen atom, is
(n + 1)L_{n+1}^{\alpha}(x) = (2n + \alpha + 1 - x)L_n^{\alpha}(x) - (n + \alpha)L_{n-1}^{\alpha}(x).   (2.41)


We will use the operator method outlined above for the Hermite polynomials to derive the usual explicit expression for the associated Laguerre polynomials. We rewrite expression (2.39) as
L_n^{\alpha}(x) = \frac{x^{-\alpha}}{n!}\, e^x (ip)^n e^{-x}\, x^{n+\alpha},
where again the operator p = -i d/dx, defined in (2.12), was used. We notice that e^x (ip)^n e^{-x} = \left[e^x (ip)\, e^{-x}\right]^n and that, using (2.15), e^x p\, e^{-x} = p + i, so
L_n^{\alpha}(x) = \frac{x^{-\alpha}}{n!}\left[i(p + i)\right]^n x^{n+\alpha},
or writing explicitly the operator p,
L_n^{\alpha}(x) = \frac{x^{-\alpha}}{n!}\left(\frac{d}{dx} - 1\right)^n x^{n+\alpha}.   (2.42)

Using the binomial expansion,
L_n^{\alpha}(x) = \frac{x^{-\alpha}}{n!}\sum_{m=0}^{n}\binom{n}{m}(-1)^{n-m}\frac{d^m}{dx^m}x^{n+\alpha},
and because
\frac{d^m}{dx^m}x^{n+\alpha} = \frac{(n + \alpha)!}{(n + \alpha - m)!}\,x^{n+\alpha-m},

    we obtain the usual form for associated Laguerre polynomials,

L_n^{\alpha}(x) = \sum_{k=0}^{n}\binom{n + \alpha}{n - k}(-1)^k\frac{x^k}{k!}.   (2.43)
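The explicit form (2.43) can be compared against SciPy's generalized Laguerre polynomials; a sketch (not part of the book):

    from math import factorial
    from scipy.special import binom, eval_genlaguerre

    def laguerre_explicit(n, alpha, x):
        """L_n^alpha(x) from the explicit sum (2.43)."""
        return sum((-1) ** k * binom(n + alpha, n - k) * x ** k / factorial(k)
                   for k in range(n + 1))

    n, alpha, x = 4, 1.5, 0.8
    print(laguerre_explicit(n, alpha, x), eval_genlaguerre(n, alpha, x))   # agree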

The second order ordinary differential equation, of which the associated Laguerre polynomials are solutions, is
xy'' + (\alpha + 1 - x)y' + ny = 0.   (2.44)

The associated Laguerre polynomials are orthogonal in the interval [0, \infty) with the weight function x^{\alpha}\exp(-x); i.e.,
\int_0^{\infty} dx\, x^{\alpha}\exp(-x)\, L_m^{\alpha}(x) L_n^{\alpha}(x) = \frac{(n + \alpha)!}{n!}\,\delta_{mn},   (2.45)


Fig. 2.2 Some Laguerre polynomials.

where \delta_{mn} is the Kronecker delta.

In the special case when \alpha = 0, we get the Laguerre polynomials. The Laguerre polynomials are denoted without the superindex \alpha. The graphs of some of the first Laguerre polynomials are presented in Figure 2.2.

From the recurrence relation (2.41), it is possible to prove that the Laguerre polynomials can be calculated from the determinant of the n x n matrix whose elements are
M_{ij} = \begin{cases} 1 + 2(i - 1) - x & \text{if } i = j, \\ \min(i, j) & \text{if } j = i + 1 \text{ or } j = i - 1, \\ 0 & \text{otherwise;} \end{cases}   (2.46)


or in other words,
L_n(x) = \frac{1}{n!}\begin{vmatrix}
1 - x & 1 & 0 & 0 & \cdots & 0 \\
1 & 3 - x & 2 & 0 & \cdots & 0 \\
0 & 2 & 5 - x & 3 & \cdots & 0 \\
0 & 0 & 3 & 7 - x & \cdots & 0 \\
\vdots & & & & \ddots & n - 1 \\
0 & 0 & 0 & \cdots & n - 1 & 1 + 2(n - 1) - x
\end{vmatrix}.   (2.47)

    2.3 Chebyshev polynomials

    Again, we will give here the main features of Chebyshev polynomials. We first look at the polynomials of the first kind.

2.3.1 Chebyshev polynomials of the first kind

The generating function of the Chebyshev polynomials of the first kind, T_n(x), is given by
\frac{1 - xt}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} T_n(x)\, t^n,   (2.48)

for |x| < 1 and |t| < 1. They may be written in several forms, but one convenient for later purposes is
T_n(x) = \sum_{m=0}^{[n/2]}\binom{n}{2m}\, x^{n-2m}\left(x^2 - 1\right)^m,   (2.49)

    where [n/2] is the so-called floor function, also called the greatest integer function or integer value, and gives the largest integer less than or equal to n/2.

Another form to write them, one that allows for easy calculation of their roots, is
T_n(x) = 2^{n-1}\prod_{k=1}^{n}\left\{x - \cos\left[\frac{(2k - 1)\pi}{2n}\right]\right\}.   (2.50)
A few Chebyshev polynomials of the first kind are

T_0(x) = 1,   T_1(x) = x,   T_2(x) = 2x^2 - 1,   (2.51)


Fig. 2.3 Some Chebyshev polynomials of the first kind.

and with the recurrence relation
T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x),   (2.52)
we can find the rest.

    In Figure 2.3, we plot some of the first Chebyshev polynomials of the first kind.

2.3.2 Chebyshev polynomials of the second kind

The generating function for the Chebyshev polynomials of the second kind, U_n(x), is given by
g(t, x) = \frac{1}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} U_n(x)\, t^n,   (2.53)

for |x| < 1 and |t| < 1. A few Chebyshev polynomials of the second kind are

U_0(x) = 1,   U_1(x) = 2x,   U_2(x) = 4x^2 - 1;   (2.54)


Fig. 2.4 Some Chebyshev polynomials of the second kind.

and with the recurrence relation
U_{n+1}(x) = 2xU_n(x) - U_{n-1}(x),   (2.55)
we can find the rest.

    In figure 2.4, we plot some Chebyshev polynomials of the second kind.

As with the Chebyshev polynomials of the first kind, we can write them as the sum
U_n(x) = \sum_{m=0}^{[n/2]}\binom{n + 1}{2m + 1}\, x^{n-2m}\left(x^2 - 1\right)^m,   (2.56)
and the form that allows easy calculation of the roots is
U_n(x) = 2^n\prod_{k=1}^{n}\left\{x - \cos\left[\frac{k\pi}{n + 1}\right]\right\}.   (2.57)
It also may be written as
U_n(x) = \frac{\sin\left[(n + 1)\cos^{-1}x\right]}{\sin\left[\cos^{-1}x\right]}.   (2.58)


    Both expressions above show that the roots of the Chebyshev polynomials of the second kind are

x_k = \cos\left(\frac{k\pi}{n + 1}\right),   k = 1, 2, \ldots, n.   (2.59)

    A nice way to obtain the Chebyshev polynomials of the second kind is via the determinant

U_n(x) = \begin{vmatrix}
2x & 1 & 0 & \cdots & 0 \\
1 & 2x & 1 & \cdots & 0 \\
0 & 1 & 2x & \cdots & 0 \\
\vdots & & & \ddots & 1 \\
0 & 0 & \cdots & 1 & 2x
\end{vmatrix}.   (2.60)

    2.4 Bessel functions of the first kind of integer order

Bessel functions of the first kind of integer order, J_n(x), are solutions of the Bessel differential equation
x^2 y'' + x y' + (x^2 - n^2) y = 0,   (2.61)

where n is an integer. They may be obtained from the generating function
\exp\left[\frac{x}{2}\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} J_n(x)\, t^n,   (2.62)
and also from the following recurrence relation
\frac{2n}{x}J_n(x) = J_{n-1}(x) + J_{n+1}(x).   (2.63)

    In Figure 2.5, we plot some of the first Bessel functions of the first kind of integer order.

Bessel functions of the first kind of integer order may be written as
J_n(x) = \sum_{m=0}^{\infty}\frac{(-1)^m x^{2m+n}}{2^{2m+n}\, m!(m + n)!},   (2.64)
and also the following integral representation is very useful,
J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-i(n\tau - x\sin\tau)}\, d\tau.   (2.65)
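The integral representation (2.65) is straightforward to verify numerically; a sketch (not part of the book):

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import jv

    def bessel_from_integral(n, x):
        """J_n(x) from the integral representation (2.65); the imaginary part vanishes."""
        value, _ = quad(lambda t: np.cos(n * t - x * np.sin(t)), -np.pi, np.pi)
        return value / (2 * np.pi)

    n, x = 3, 2.7
    print(bessel_from_integral(n, x), jv(n, x))   # the two values agree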


Fig. 2.5 First four Bessel functions of the first kind of integer order.

Other important relations for the Bessel functions of the first kind are the Jacobi-Anger expansions:
e^{ix\sin\theta} = \sum_{n=-\infty}^{\infty} J_n(x)\, e^{in\theta}   (2.66)
and
e^{ix\cos\theta} = \sum_{n=-\infty}^{\infty} i^n J_n(x)\, e^{in\theta}.   (2.67)

2.4.1 Addition formula

Using the operator methods developed in previous sections, we will obtain here the addition formula for the Bessel functions of the first kind of integer order. First, we will derive the following expression for any "well behaved" function f,
e^{iy p_x} f(x)\, e^{-iy p_x}\, 1 = f(x + y),   (2.68)
where p_x = -i d/dx is the operator introduced in Section 1, expression (2.12). Because e^{-iy p_x}\, 1 = 1, developing the function f in a Taylor series (we call c_k


the coefficients in the expansion) and using the linearity of the operator e^{iy p_x},
e^{iy p_x} f(x)\, e^{-iy p_x}\, 1 = e^{iy p_x} f(x) = e^{iy p_x}\sum_{k=0}^{\infty} c_k x^k = \sum_{k=0}^{\infty} c_k\, e^{iy p_x} x^k.
Now, acting on 1,
e^{iy p_x} x^k\, 1 = \left(e^{iy p_x}\, x\, e^{-iy p_x}\right)^k 1 = (x + y)^k,
then
\sum_{k=0}^{\infty} c_k\, e^{iy p_x} x^k\, 1 = \sum_{k=0}^{\infty} c_k (x + y)^k = f(x + y),   (2.69)
as we wanted to prove.

Now consider the Bessel function J_n evaluated at x + y. From expression (2.68) we have
J_n(x + y) = e^{iy p_x} J_n(x)\, e^{-iy p_x}\, 1 = e^{iy p_x} J_n(x),
because e^{-iy p_x}\, 1 = 1, and developing the first exponential in a Taylor series, we obtain
J_n(x + y) = \sum_{m=0}^{\infty}\frac{y^m}{m!}\frac{d^m}{dx^m} J_n(x).   (2.70)

To calculate the m-derivative of J_n, we use the integral representation (2.65) to write
\frac{d^m}{dx^m} J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} (i\sin\tau)^m e^{-i(n\tau - x\sin\tau)}\, d\tau;
substituting \sin\tau = (e^{i\tau} - e^{-i\tau})/2i, and using the binomial expansion,

\frac{d^m}{dx^m} J_n(x) = \frac{1}{2^m}\frac{1}{2\pi}\sum_{k=0}^{m}(-1)^k\binom{m}{k}\int_{-\pi}^{\pi} e^{i(m-k)\tau}\, e^{-ik\tau}\, e^{-i(n\tau - x\sin\tau)}\, d\tau
= \frac{1}{2^m}\frac{1}{2\pi}\sum_{k=0}^{m}(-1)^k\binom{m}{k}\int_{-\pi}^{\pi} e^{-i[(n - m + 2k)\tau - x\sin\tau]}\, d\tau,
and therefore, using again the integral representation (2.65), we obtain
\frac{d^m}{dx^m} J_n(x) = \frac{1}{2^m}\sum_{k=0}^{m}(-1)^k\binom{m}{k} J_{n-m+2k}(x).   (2.71)


Substituting this last expression in equation (2.70), we obtain (we have taken the sum in k up to infinity as we add only zeros)
J_n(x + y) = \sum_{m=0}^{\infty}\frac{y^m}{2^m}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!(m - k)!} J_{n-m+2k}(x).
We now change the order of summation and start the second sum at m = k (because for m < k all the terms are zero),
J_n(x + y) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\sum_{m=k}^{\infty}\frac{y^m}{2^m(m - k)!} J_{n-m+2k}(x).

We do now j = m - 2k and obtain
J_n(x + y) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\sum_{j=-k}^{\infty}\frac{y^{j+2k}}{2^{j+2k}(j + k)!} J_{n-j}(x);
we take the second sum from minus infinity, and exchange the order of the sums,
J_n(x + y) = \sum_{j=-\infty}^{\infty} J_{n-j}(x)\sum_{k=0}^{\infty}\frac{(-1)^k}{k!(j + k)!}\left(\frac{y}{2}\right)^{j+2k} = \sum_{j=-\infty}^{\infty} J_{n-j}(x) J_j(y).
The final expression
J_n(x + y) = \sum_{k=-\infty}^{\infty} J_{n-k}(x) J_k(y)   (2.72)
is known as the addition formula for the Bessel functions of the first kind of integer order.
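A numerical check of the addition formula (2.72), truncating the sum at |k| <= 40 (a sketch, not part of the book):

    from scipy.special import jv

    n, x, y = 2, 1.3, 0.8
    series = sum(jv(n - k, x) * jv(k, y) for k in range(-40, 41))
    print(series, jv(n, x + y))   # the two values agree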

    2.4.2 Series of the Bessel functions of the first kind of integer order

We will derive in this section the value of some sums of the Bessel functions of the first kind of integer order that appear in several applications. We will demonstrate that
\sum_{k=1}^{\infty} k^{2v} J_k^2(x) = \frac{(-1)^v}{4\pi}\int_{-\pi}^{\pi} B_{2v}\left(g'(y), g''(y), \ldots, g^{(2v)}(y)\right) dy,   (2.73)
where v is a positive integer, g(y) = ix\sin y and B_n(x_1, x_2, \ldots, x_n) is the complete Bell polynomial [Bell 27; Boyadzhiev 09; Comtet 74] given by the following determinant:


B_n(x_1, x_2, \ldots, x_n) = \begin{vmatrix}
x_1 & \binom{n-1}{1}x_2 & \binom{n-1}{2}x_3 & \binom{n-1}{3}x_4 & \cdots & x_n \\
-1 & x_1 & \binom{n-2}{1}x_2 & \binom{n-2}{2}x_3 & \cdots & x_{n-1} \\
0 & -1 & x_1 & \binom{n-3}{1}x_2 & \cdots & x_{n-2} \\
0 & 0 & -1 & x_1 & \cdots & x_{n-3} \\
\vdots & & & & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & x_1
\end{vmatrix}.   (2.74)

To demonstrate (2.73), we take the Jacobi-Anger expansion, expression (2.66), to write
\frac{d^n}{dy^n} e^{ix\sin y} = i^n\sum_{k=-\infty}^{\infty} k^n J_k(x)\, e^{iky}.

To calculate the n-derivative on the left side of the equation above, we use Faà di Bruno's formula ([Gradshtein 99], page 22) for the n-derivative of a composition,
\frac{d^n}{dx^n} f(g(x)) = \sum_{k=0}^{n} f^{(k)}(g(x))\, B_{n,k}\left(g'(x), g''(x), \ldots, g^{(n-k+1)}(x)\right),   (2.75)

where B_{n,k}(x_1, x_2, \ldots, x_{n-k+1}) is a Bell polynomial [Bell 27; Boyadzhiev 09; Comtet 74], given by
B_{n,k}(x_1, x_2, \ldots, x_{n-k+1}) = \sum \frac{n!}{j_1!\, j_2! \cdots j_{n-k+1}!}\left(\frac{x_1}{1!}\right)^{j_1}\left(\frac{x_2}{2!}\right)^{j_2}\cdots\left(\frac{x_{n-k+1}}{(n-k+1)!}\right)^{j_{n-k+1}},   (2.76)
the sum extending over all sequences j_1, j_2, j_3, \ldots, j_{n-k+1} of non-negative integers such that j_1 + j_2 + \cdots + j_{n-k+1} = k and j_1 + 2j_2 + 3j_3 + \cdots + (n - k + 1)j_{n-k+1} = n.


Using (2.75),
\frac{d^n}{dy^n} e^{ix\sin y} = e^{ix\sin y}\sum_{k=0}^{n} B_{n,k}\left(g'(y), g''(y), \ldots, g^{(n-k+1)}(y)\right).

We multiply now by the complex conjugate of (2.67) and obtain
i^n\sum_{k,l=-\infty}^{\infty} k^n J_l(x) J_k(x)\, e^{i(k-l)y} = B_n\left(g'(y), g''(y), \ldots, g^{(n)}(y)\right).

Integrating both sides of the above equation from -\pi to \pi, and using that \int_{-\pi}^{\pi} e^{i(k-l)y} dy = 2\pi\delta_{kl}, we arrive at the formula we wanted,
\sum_{k=1}^{\infty} k^{2v} J_k^2(x) = \frac{(-1)^v}{4\pi}\int_{-\pi}^{\pi} B_{2v}\left(g'(y), g''(y), \ldots, g^{(2v)}(y)\right) dy.   (2.77)

In particular, as the complete Bell polynomials for n = 2 and n = 4 are
B_2(x_1, x_2) = x_1^2 + x_2   (2.78)
and
B_4(x_1, x_2, x_3, x_4) = x_1^4 + 6x_1^2 x_2 + 4x_1 x_3 + 3x_2^2 + x_4,   (2.79)

it is very easy to show that
\sum_{k=1}^{\infty} k^2 J_k^2(x) = \frac{x^2}{4}   (2.80)
and
\sum_{k=1}^{\infty} k^4 J_k^2(x) = \frac{3x^4 + 4x^2}{16}.   (2.81)
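Both sums converge quickly and can be checked numerically; a sketch (not part of the book):

    import numpy as np
    from scipy.special import jv

    x = 1.7
    k = np.arange(1, 200)
    Jk2 = jv(k, x) ** 2
    print(np.sum(k ** 2 * Jk2), x ** 2 / 4)                        # checks (2.80)
    print(np.sum(k ** 4 * Jk2), (3 * x ** 4 + 4 * x ** 2) / 16)    # checks (2.81)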

    2.4.3 Relation between the Bessel functions of the first kind of integer order and the Chebyshev polynomials of the second kind

    We finally present here, without demonstration, a useful formula that relates the Fourier transform of the Bessel functions of the first kind of integer order


    with the Chebyshev polynomials of the second kind [Campbell 48]:

\mathcal{F}\left\{\frac{J_n(w)}{w}\right\} = \sqrt{\frac{2}{\pi}}\,\frac{(-i)^{n-1}}{n}\, U_{n-1}(x)\,\sqrt{1 - x^2}\;\mathrm{rect}\left(\frac{x}{2}\right),   (2.82)

where rect is the indicator function of the interval [-1, 1]; i.e.,
\mathrm{rect}\left(\frac{x}{2}\right) = \begin{cases} 1, & \text{if } x \in [-1, 1], \\ 0, & \text{otherwise.} \end{cases}   (2.83)



    Chapter 3

    Finite systems of differential equations

We will study in this Chapter finite systems of coupled first-order ordinary differential equations. For pedagogical reasons, we start with 2 x 2 systems; after that, we study 4 x 4 systems, which have in some sense the essence of the n x n system without the cumbersome algebra, and with the experience obtained, we generalize to arbitrary finite dimension.

3.1 Systems 2 x 2 first

We consider initially a system of two coupled differential equations with constant coefficients of the form
\frac{dx_1}{dt} = a_{11}x_1 + a_{12}x_2,
\frac{dx_2}{dt} = a_{21}x_1 + a_{22}x_2,   (3.1)
that may be re-written in compact form as
\frac{d\vec{x}}{dt} = A\vec{x},   (3.2)
where A is the matrix
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
and \vec{x} is the column vector
\vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.



The formal solution to the differential equations system (3.2) is
\vec{x}(t) = e^{At}\vec{x}(0),   (3.3)
with \vec{x}(0) the initial conditions. Indeed, by substituting (3.3) into (3.2), we obtain that
\frac{d\vec{x}}{dt} = A\, e^{At}\vec{x}(0),   (3.4)
so that we recover (3.2).

3.1.1 Eigenvalue equations

Suppose now that the 2 x 2 matrix A has two different eigenvalues, \lambda_1 and \lambda_2, with the two column eigenvectors \vec{u}_1 and \vec{u}_2, respectively. We already know from Chapter 1 that the matrix T = (\vec{u}_1, \vec{u}_2), whose columns are the eigenvectors, is the similitude matrix that transforms the matrix A into its diagonal form D; i.e.,
D = T^{-1}AT.   (3.5)
We can invert this expression, and write the matrix A in terms of the similitude matrix T and the diagonal matrix D; we get,

A = TDT^{-1}.   (3.6)

    In the solution (3.3), we have the exponential of the matrix A. As we mentioned in Chapter 1, this exponential is defined in terms of the Taylor series of the exponential, i.e.,

e^{At} = \sum_{n=0}^{\infty}\frac{t^n}{n!}A^n.

If we substitute (3.6) in the right side of this definition, we obtain
e^{At} = \sum_{n=0}^{\infty}\frac{t^n}{n!}\left(TDT^{-1}\right)^n,   (3.7)
but as \left(TDT^{-1}\right)^n = TDT^{-1}TDT^{-1}\cdots TDT^{-1} = TD^nT^{-1}, it may be rewritten as
e^{At} = T\left(\sum_{n=0}^{\infty}\frac{t^n}{n!}D^n\right)T^{-1},


    or better yet, as

e^{At} = T\, e^{Dt}\, T^{-1}.   (3.8)

As the D matrix is the diagonal matrix
D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},
it is very easy to show that
D^n = \begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix},
and that
e^{tD} = \begin{pmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{pmatrix}.
So, finally,
e^{At} = T\begin{pmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{pmatrix}T^{-1}.   (3.9)

At this point it is convenient to explicitly calculate the eigenvalues of the matrix A. The characteristic equation is
\det(A - \lambda I) = 0,
where I is the identity matrix. In other words, the characteristic equation is
(a_{11} - \lambda)(a_{22} - \lambda) - a_{12}a_{21} = 0,
which is reduced to
\lambda^2 - \mathrm{tr}(A)\lambda + |A| = 0,
where tr(A) means the trace of the A matrix and |A| its determinant. Then the eigenvalues are given by
\lambda_{1,2} = \frac{\mathrm{tr}(A) \pm \sqrt{\mathrm{tr}^2(A) - 4|A|}}{2},

    and then we can find the exponential (3.9), and therefore, the s