AN INTRODUCTION TO HILBERT SPACES Contents Spaces.pdf · AN INTRODUCTION TO HILBERT SPACES RODICA...

February 19, 2010AN INTRODUCTION TO HILBERT SPACES

RODICA D. COSTIN

Contents

1. Going from finite to infinite dimension 21.1. Recall some basic facts about vector spaces 21.2. Inner product 41.3. Norm 51.4. The space `2 51.5. Metric Spaces 62. Completeness 72.1. Complete spaces 72.2. Examples 82.3. Completion and closure 102.4. The Hilbert space L2[a, b] 102.5. The Banach space C[a, b] 112.6. The Banach spaces Lp 112.7. Closed sets, dense sets 112.8. Sets dense in the Hilbert space L2 122.9. Polynomials are dense in the Banach space C[a, b] 123. Hilbert Spaces 133.1. When does a norm come from an inner product? 133.2. The inner product is continuous 133.3. Orthonormal bases 143.4. Example: generalized Fourier series in L2 153.5. Example: Fourier series in L2[a, b] 153.6. Example: orthogonal polynomials 163.7. Orthogonal complements, The Projection Theorem 163.8. Least squares approximation via subspaces 174. Linear operators in Hilbert spaces 184.1. Shift operators on `2 194.2. Unitary operators. Isomorphic Hilbert spaces 194.3. Integral operators 204.4. Differential operators in L2[a, b] 224.5. A second order boundary value problem 234.6. General second order self-adjoint problems 27

1

2 RODICA D. COSTIN

1. Going from finite to infinite dimension

1.1. Recall some basic facts about vector spaces. Vector spaces aremodeled after the familiar vectors in the line, plane, space etc. abstractlywritten as R, R2, R3, . . . ,Rn, . . .; these are vector spaces over the scalars R.It turned out that there is a great advantage to allow for complex coordi-nates, and then we may also look at C, C2, C3, . . . ,Cn, . . .; these are vectorspaces over the scalars C.

In general:

Definition 1. A vector space over the scalar field F is a set V endowedwith two operations, one between vectors: if x, y ∈ V then x + y ∈ V , andone between scalars and vectors: if c ∈ F and x ∈ V then cx ∈ V having thefollowing properties:- commutativity of addition: x+ y = y + x- associativity of addition: x+ (y + z) = (x+ y) + z- existence of zero: there is an element 0 ∈ V so that x+0 = x for all x ∈ V- existence of the opposite: for any x ∈ V there is an opposite, denoted by−x, so that x+ (−x) = 0- distributivity of scalar multiplication with respect to vector addition: c(x+y) = cx+ cy- distributivity of scalar multiplication with respect to field addition: (c +d)x = cx+ dx- compatibility of scalar multiplication with field multiplication: c(dx) =(cd)x- identity element of scalar multiplication: 1x = x.

Some familiar definitions are reformulated below in a way that allows usto tackle infinite dimensions too.

Definition 2. A set of vectors S ⊂ V is called linealy independent if when-ever, for some x1, x2, . . . , xn ∈ S (for some n) there are scalars c1, c2, . . . , cn ∈F so that c1x1 + c2x2 + . . .+ cnxn = 0 then this necessarily implies that allthe scalars are zero: c1 = c2 = . . . = cn = 0.

Note that the zero vector can never belong to in a linearly independentset.

Definition 3. Given any set of vectors S ⊂ V the span of S is the set

Sp(S) = {c1x1 + c2x2 + . . .+ cnxn |xj ∈ S, cj ∈ F, n = 1, 2, 3 . . .}

Note that Sp(S) forms a vector space, included in V ; it is a subspace ofV .

Definition 4. A set of vectors S ⊂ V is a basis of V if it is linearly inde-pendent and its span equals V .

Theorem 5. Any vector space has a basis. Moreover, all the basis of Vhave the same cardinality, which is called the dimension of V .

AN INTRODUCTION TO HILBERT SPACES 3

By the cardinality of a set we usually mean ”the number of elements”.However, if we allow for infinite dimensions, the notion of ”cardinality”helps us distinguish between different types of ”infinities”. For example,the positive integers form an infinite set, and so do the real numbers; wesomehow feel that we should say there are ”fewer” positive integers thanreals. We call the cardinality of the positive integers countable.

Please note that the integers Z are also countable (we can ”count” them:0, 1,−1, 2,−2, 3,−3, . . .), pairs of positive integers Z2

+ are also countable(count: (1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3), . . .), and so are Z2, and the setof rational numbers Q (rational numbers are ratios of integers m/n). It canbe proved that the reals are not countable.

Notations The positive integers are denoted Z+ = {1, 2, 3, . . .} and thenatural numbers are denoted N = {0, 1, 2, 3 . . .}. However: some authors donot include 0 in the natural numbers, so when you use a book make sureyou know the convention used there.

Examples1. Rn = {(x1, . . . xn) |xj ∈ R} is a vector space over R of dimension n,

with a basis consisting of the vectorse1 = (1, 0, . . . , 0), e2 = (0, 1, 0 . . . , 0), . . . , en = (0, 0, . . . 0, 1)any x = (x1, . . . xn) can be written as a linear combination of them

x = x1e1 + x2e2 + . . .+ xnen

Remark: from now on we will prefer to list horizontally the componentsof vectors.

2. RZ+ = {x = (x1, x2, x3, . . .) |xj ∈ R} is a vector space over R. Wewould like to say that a basis consists of the vectors

(1) e1 = (1, 0, 0, 0, . . .), e2 = (0, 1, 0, 0, 0 . . .), e3 = (0, 0, 1, 0, . . .), . . .

since any x = (x1, x2, x3, . . .) can be written as an (infinite) linear combi-nation

(2) x = x1e1 + x2e2 + x3e3 + . . . =∞∑n=1

xn en

but this is not quite correct, because we have a series rather than a finitesum. It looks like we should accept series as expansions, to get a niceextension to infinite dimensions!

3. Similarly, Cn = {(z1, . . . zn) | zj ∈ C} is a vector space over C ofdimension n, with a basis consisting of the vectors(1, 0, . . . , 0), (0, 1, 0 . . . , 0), . . . , (0, 0, . . . 0, 1).

4. CZ+ = {z = (z1, z2, . . . zn, . . .) | zj ∈ C} is a vector space over C andany z could be thought as the infinite sum z1e1 + z2e2 + z3e3 + . . . where enare given by (1) - but again, this is not a basis in the sense of the definitionfor vector spaces).

4 RODICA D. COSTIN

5. The set of polynomials of degree at most n, with coefficients in F(which is R or C)

Pn = {p(t) = a0 + a1t+ a2t2 + . . .+ ant

n | aj ∈ F}

is a vector space over the field of scalars F , of dimension n+ 1, and a basisis 1, t , t2, . . . tn.

6. The set of all polynomials with coefficients in F (which is R or C)

P = {p(t) = a0 + a1t+ a2t2 + . . .+ ant

n | aj ∈ F, n ∈ N}

is a vector space over the field of scalars F , and has the countable basis1, t, t2, t3 . . ..

7. The set of all functions continuous on a closed interval (it could alsobe open, or extending to ∞):

C[a, b] = {f : [a, b]→ F | fcontinuous}

8. The set of all functions f with absolute value |f | integrable on [a, b]:

L1[a, b] = {f : [a, b]→ F | |f |integrable}Warning: I did not specify what ”integrable” means. L1[a, b] is a bit

more complicated, but for practical purposes this is good enough for now.(We will comment more later.)

Note that P ⊂ C[a, b] ⊂ L1[a, b].

1.2. Inner product. Vectors in Rn and Cn have an interesting operation:

Definition 6. An inner product on vector space V over F (=R or C) isan operation which associate to two vectors x, y ∈ V a scalar 〈x, y〉 ∈ F thatsatisfies the following properties:- it is conjugate symmetric: 〈x, y〉 = 〈y, x〉- it is linear in the second argument: 〈x, y+z〉 = 〈x, y〉+〈x, z〉 and 〈x, cy〉 =c〈x, y〉- its is positive definite: 〈x, x〉 ≥ 0 with equality only for x = 0.

Note that conjugate symmetry combined with linearity implies that 〈., .〉is conjugate linear in the first variable.

Note that for F = R an inner product is symmetric and linear in the firstargument too.

Please keep in mind that most mathematical books use inner productlinear in the first variable, and conjugate linear in the second one. Youshould make sure you know the convention used by each author.

Definition 7. A vector space V equipped with an inner product (V, 〈., .〉) iscalled an inner product space.


ExamplesOn Cn the most used inner product is 〈x, y〉 =

∑nj=1 xjyj .

We may wish to introduce a similar inner product on CZ+ : 〈x, y〉 =∑∞j=1 xjyj . The problem is that the series may not converge!On the space of polynomials P or on continuous functions C[a, b] we can

introduce the inner product

〈f, g〉 =∫ b

af(x) g(x) dx

or more generally, using a weight (which is a positive function w(x)),

〈f, g〉w =∫ b

af(x) g(x)w(x)dx

We may wish to introduce a similar inner product on L1[a, b], only theintegral may not converge. For example, f(x) = 1/

√x is integrable on [0, 1],

but f(x)2 is not.

1.3. Norm. The inner product defines a length by ‖x‖ =√〈x, x〉. This is

a norm, in the following sense:

Definition 8. Given a vector space V , a norm is a function on V so that:- it is positive definite: ‖x‖ ≥ 0 and ‖x‖ = 0 only for x = 0- it is positive homogeneous: ‖cx‖ = |c| ‖x‖ for all c ∈ F and x ∈ V- satisfies the the triangle inequality (i.e. it is subadditive):

‖x+ y‖ ≤ ‖x‖+ ‖y‖

Definition 9. A vector space V equipped with a norm (V, ‖.‖) is called anormed space.

An inner product space is, in particular a normed space (the first twoproperties of the norm are immediate, the triangle inequality is a geometricproperty in finite dimensions and requires a proof in infinite dimensions).

There are some very useful normed spaces (of functions), which are notinner product spaces (more about that later).

1.4. The space `2. Let us consider again CZ+ , which appears to be themost straightforward way to go from finite dimension to an infinite one. Todo linear algebra we need an inner product, or at least a norm. We mustthen restrict to the ”vectors” which do have a norm, in the sense that theseries ‖z‖2 =

∑∞n=1 |zn|2 converges.

Define the vector space `2 by

`2 = {z = (z1, z2, z3, . . .) ∈ CZ+ |∞∑n=1

|zn|2 <∞}

On `2 we therefore have a norm: ‖z‖ =(∑∞

n=1 |zn|2)1/2.

6 RODICA D. COSTIN

Actually, on `2 also the inner product converges, due to the followinginequality, which is one of the most important and powerful tools in infinitedimensions (the triangle inequality is also fundamental):

Theorem 10. The Cauchy-Schwartz inequalityIn an inner product space we have∣∣ 〈x, y〉 ∣∣ ≤ ‖x‖ ‖y‖Therefore, if ‖x‖ and ‖y‖ converge, then 〈x, y〉 converges and moreover,

equality holds if and only if x, y are linearly dependent (which means x = 0or y = 0 or x = cy)

Proof.The proof is presented here for the case of `2, for simplicity.Recall: a series

∑∞n=1 an is said to converge to A, and we write

∑∞n=1 an =

A if the sequence formed by its partial sums SN =∑N

n=1 an converges to Aas N →∞.

Recall: a series∑∞

n=1 an is said to converge absolutely if∑∞

n=1 |an| con-verges.

Recall: absolute convergence implies convergence.Recall: a series with positive terms either converges or has the limit +∞.

For such a series, say∑∞

n=1 |an| , it is customary to write∑∞

n=1 |an| < ∞to express that it converges.

Take the partial sums of 〈x, y〉, use the triangle inequality in dimen-sion N , then Cauchy-Schwarts in dimension N (which follows from u · v =‖u‖ ‖v‖ cosθ):∣∣∣ N∑n=1

xn yn

∣∣∣ ≤ N∑n=1

∣∣∣xn yn∣∣∣ =N∑n=1

|xn| |yn| ≤

(N∑n=1

|xn|2)1/2 ( N∑

n=1

|yn|2)1/2

Taking the limit N →∞ the convergence of the 〈x, y〉 series follows if ‖x‖and ‖y‖ converge.

The argument showing when equality holds is not given here, as it is inaccordance with what happens in finite dimensions. 2

We obtained that `2 is an inner product space, by the Cauchy-Schwartzinequality.

The vectors en of (1) do belong to `2, and they do form an orthonormalset: en ⊥ ek for n 6= k (since 〈en, ek〉 = 0) and ‖en‖ = 1).

1.5. Metric Spaces. Now we would like to make sense of the expansion(2). We can do that using the usual definition of convergence (in R) byreplacing the distance between two vectors x, y by d(x, y) = ‖x − y‖. Thisdistance is a metric, in the following sense:


Definition 11. A distance (or a metric) on a set M is a function d(x, y)for x, y ∈M with the following properties:it is nonnegative: d(x, y) ≥ 0it separates the points: d(x, y) = 0 if and only if x = yit is symmetric: d(x, y) = d(y, x)it satisfies the triangle inequality: d(x, z) ≤ d(x, y) + d(y, z).

(Two of conditions follow from the other, but it is better to have themall listed.)

Definition 12. A metric space (M,d) is a set M equipped with a distanced.

Note that any normed space is a metric space by defining thedistance d:

(3) d(x, y) = ‖x− y‖But there are many interesting metric spaces which are not normed (theymay not even be vector spaces!). For example, we can define distances on asphere (or on the surface of the Earth!) by measuring the (shortest) distancebetween two points on a large circle joining them.

Once we have a distance, we can define convergence of sequences:

Definition 13. Consider a metric space (M,d).We say that a sequence s1, s2, s3 . . . ∈ M converges to L ∈ M if for anyε > 0 there is an N (N depends on ε) so that d(sn, L) < ε for all n ≥ N .

Note that we can use this definition for convergence in normedspaces using the distance (3).

The series expansion (2) of any x ∈ `2 converges (why?). We now have asatisfactory theory of `2.

2. Completeness

There is one more property that is essential for calculus: there need toexist enough limits.

The spaces Rn, Cn, `2 all have this property (inherited from R). However,the space of polynomials P and C[a, b] as inner product spaces with 〈f, g〉 =∫fg do not have this property.

2.1. Complete spaces. In the case of R this special property can be intu-itively formulated as: if a sequence of real numbers an tends to ”pile up”then it is convergent. Here is the rigorous formulation:

Definition 14. The sequence {an}n∈N is called a Cauchy sequence if forany ε > 0 there is an N ∈ N (N depending on ε) so that

|an − am| < ε for all n,m > N

8 RODICA D. COSTIN

Theorem 15. Any Cauchy sequence of real numbers is convergent.

(This is a fundamental property of real numbers, at the essence of whatthey are.)

We can define Cauchy sequences in a metric space very similarly:

Definition 16. Let (M,d) be a metric space.The sequence {an}n∈N ⊂M is called a Cauchy sequence if for any ε > 0

there is an N ∈ N (N depending on ε) so that

d(an, am) < ε for all n,m > N

We would like to work in metric spaces where every Cauchy sequence isconvergent. However, this is not always the case:

Definition 17. A metric space (M,d) is called complete if every Cauchysequence is convergent in M .

(Note that the limit of the Cauchy sequence must belong to M .)In the particular case of normed spaces (when the distance is given by

the norm):

Definition 18. A normed space (V, ‖ . ‖) which is complete is called a Ba-nach1 space.

In the even more special case of inner product spaces (when the norm isgiven by an inner product):

Definition 19. An inner product space (V, 〈 . 〉) which is complete is calleda Hilbert2 space.

2.2. Examples. These are very important examples (some proofs are neededbut not given here).

1. The usual Rn and Cn are Hilbert spaces.

2. The space `2 of sequences:

`2 ≡ `2(Z+) = {x = (x1, x2, x3, . . .) |xn ∈ C,∞∑n=1

|xn|2 <∞}

endowed with the `2 inner product

〈x, y〉 =∞∑n=1

xnyn

1Stefan Banach (1892-1945) was a Polish mathematician, founder of modern functionalanalysis - a domain of mathematics these lectures belong to.

2David Hilbert (1862-1943) was a German mathematician, recognized as one of themost influential and universal mathematicians of the 19th and early 20th centuries. Hediscovered and developed a broad range of fundamental ideas in many areas, includingthe theory of Hilbert spaces, one of the foundations of functional analysis.


is a Hilbert space.For example, the sequence x = (x1, x2, x3, . . .) with xn = 1/n belongs to

`2 and so do sequences with xn = can if |a| < 1.A variation of this space is that of bilateral sequences:

`2(Z) = {x = (. . . , x−2, x−1, x0, x1, x2 . . .) |xn ∈ C,∞∑

n=−∞|xn|2 <∞}

with the `2 inner product 〈x, y〉 =∑∞

n=−∞ xnyn is a Hilbert space.Recall that a bilateral series

∑∞n=−∞ an is called convergent if both series∑∞

n=1 an and∑0

n=−∞ an are convergent.

3. The space C[a, b] of continuous function on the interval [a, b], endowedwith the L2 inner product

〈f, g〉 =∫ b

af(t) g(t) dt

is not a Hilbert space, since it is not complete.For example, take the following approximations of step functions

fn(t) =

0 if t ∈ [0, 1]n (t− 1) if 1 < t < 1 + 1

n1 if t ∈ [1 + 1

n , 2]

The fn are continuous on [0, 2] (the middle line in the definition of fn rep-resents a segment joining the two edges of the step). The sequence {fn}n isCauchy in the L2 norm: (say n < m)

‖fn − fm‖2 =∫ 2

0|fn(t)− fm(t)|2 dt =

∫ 1+ 1n

1|fn(t)− fm(t)|2 dt

=∫ 1+ 1

m

1[n (t− 1)−m (t− 1)]2 dt +

∫ 1+ 1n

1+ 1m

[n (t− 1)− 1]2 dt =

= 1/3(n−m)2

m3− 1/3

(n−m)3

m3n= 1/3

(n−m)2

m2n<

13n→ 0

However, fn is not L2 convergent in C[0, 2]. Indeed, in fact fn convergesto the step function

f(t) ={

0 if t ∈ [0, 1)1 if t ∈ [1, 2]

because

‖fn − f‖2 =∫ 2

0|fn(t)− f(t)|2 dt =

∫ 1+ 1n

1[n (t− 1)− 1]2 dt =

13n→ 0

therefore the L2 limit of fn is f , a function that does not belong to C[0, 2].

10 RODICA D. COSTIN

2.3. Completion and closure. The last example suggests that if a metricspace of interest is not closed, then one can add to that space all the possiblelimits of the Cauchy sequences, and then we obtain a closed space.

This procedure is called ”closure”, and the closure of a metric space Mis denoted by M .

2.4. The Hilbert space L2[a, b]. The closure of C[a, b] in the L2 norm (theclosure depends on the metric!) is a space denoted by L2[a, b], the space ofsquare integrable3 functions:

L2[a, b] = {f : [a, b]→ C(or R)∣∣ |f |2 is integrable on [a, b] }

or, for short,

(4) L2[a, b] = {f : [a, b]→ C∣∣ ∫ b

a|f(t)|2 dt < ∞}

This notation is very suggestive: for practical purposes, if you have an f sothat you can integrate |f |2 (as a proper or improper integral), and the resultis a finite number, then f ∈ L2.

Examples of functions in L2: continuous functions, functions with jumpdiscontinuities (a finite number of jumps, or even countably many!), somefunctions which go to infinity: for example x−1/4 ∈ L2[0, 1] because

∫ 10 (x−1/4)2dx =∫ 1

0 x−1/2dx = 2x1/2

∣∣10

= 2 <∞.A wrinkle: in L2[a, b], if two functions differ by their values at only a finite

number of points (or even on a countable set) they are considered equal. Forexample, the functions f1 and f2 below are equal:

f1(t) ={

0 if t ∈ [0, 1]1 if t ∈ (1, 2] f2(t) =

{0 if t ∈ [0, 1)1 if t ∈ [1, 2] f1 = f2 in L2[0, 2]

as indeed the L2-norm ‖f1 − f2‖ = 0.Therefore L2[a, b] = L2(a, b).Note the similarity of the definition (4) of L2 with the definition of `2.Note that by the Cauchy-Schwarts inequality, the L2 inner product of two

function in L2[a, b] is finite. Therefore the space L2[a, b] is a Hilbert space.The space L2 is also used on infinite intervals, like L2[a,∞), and L2(R).

2.4.1. An important variation of the L2 space. Instead of using the usualelement of length dt we can use an element of ”weighted” length w(t)dt; onephysical interpretation is that if [a, b] represents a wire of variable density

3These are the Lebesgue integrable functions. They are quite close to the familiarRiemann integrable functions, but the advantage is that the Lebesue integrability behavesbetter when taking limits.


w(t) then the element of mass on the wire is w(t)dt. If w(t) is a positivefunction define

L2([a, b], w(t)dt) = {f : [a, b]→ C∣∣ ∫ b

a|f(t)|2w(t)dt < ∞}

with the inner product given by

〈f, g〉w =∫ b

af(t) g(t)w(t)dt

Note that f ∈ L2([a, b], w(t)dt) if and only if f√w∈ L2[a, b].

2.5. The Banach space C[a, b].Recall: a function continuous on a closed interval does have an absolute

maximum and an absolute minimum.The space of continuous function on [a, b] is closed with respect to the

sup norm ‖.‖∞:‖f‖ = ‖f‖∞ = sup

x∈[a,b]|f(x)|

In other words, completeness of C[a, b] in the sup-norm means that: if fnare continuous on [a, b] and if fn → f in the sense thatlimn→∞

supx∈[a,b]

|fn(x)− f(x)| = 0 then f is continuous.

2.6. The Banach spaces Lp. The space

Lp[a, b] = {f : [a, b]→ C(or R)∣∣ ∫ b

a|f(t)|p dt < ∞}

is complete in the Lp norm ‖f‖p = (∫ ba |f |

p)1/p.Lp are called the Lebesgue4 spaces (only L2 is a Hilbert space).

2.7. Closed sets, dense sets.

Definition 20. Let (M,d) be a closed metric space.A subset F ⊂ M is called closed if it contains the limits of the Cauchy

sequences in F : if fn ∈ F and fn → f then f ∈ F .

Examples: for M = R the intervals (a, b), (a,+∞) are not closed sets, butthe intervals [a, b], [a,+∞) are closed. For M = R2 the disk x2 + y2 ≤ 1 isclosed, but the disk x2 + y2 < 1 is not closed, and neither is the punctureddisk 0 < x2 + y2 ≤ 1.

Definition 21. Given a closed metric space (M,d) and a set S ⊂ M wecall the set S the closure of S in M if S contains all the limits of the Cauchysequences sequences (fn)n ⊂ S.

4Henri Lebesgue (1875-1941) was a French mathematician most famous for Lebesgue’stheory of integration.

12 RODICA D. COSTIN

Note that S ⊂ S (since if f ∈ S we can take the trivial Cauchy sequenceconstantly equal to f , fn = f for all n.

Examples: for M = R the closure of the open interval (a, b) is the closedinterval [a, b]. For M = R2 the closure of the (open) disk x2 + y2 < 1, andof the punctured disk 0 < x2 + y2 ≤ 1 is the closed disk x2 + y2 ≤ 1.

Definition 22. Given a closed metric space (M,d) and a set S ⊂ M wesay that a set S is dense in M if S = M .

It is worth repeating: if S is dense in M is means that any f ∈M can beapproximated by elements of S: there exists fn ∈ S so that limn→∞ fn = f .

Examples: for M = R the rational numbers Q are dense in R (why?). ForM = `2 the sequences that terminate5 are dense in `2.

Dense sets have important practical consequences: when we need to es-tablish a property of M that is preserved when taking limits (e.g. equalitiesor inequalities), then we can prove the property on S and then, by takinglimits, we find it on M .

2.8. Sets dense in the Hilbert space L2.C[a, b] is dense (in the L2 norm) in L2[a, b] (by our construction of L2).In fact, we can assume functions as smooth as we wish, and still obtain

dense spaces: the smaller space C1[a, b], of functions which have a continu-ous derivative, is also dense in L2[a, b], and so is Cr[a, b], functions with rcontinuous derivatives, for any r (including r =∞).

Even the smaller space P (of polynomials) is dense in L2[a, b].Another type of (inner product sub)spaces dense in L2 are those consisting

of functions satisfying zero boundary conditions, for example

(5) {f ∈ C[a, b] | f(a) = 0}(Why: any function f in (5) can be approximated in the L2-norm byfunctions in (5): consider the sequence of functions fn which equal f forx ≥ a+ 1

n and whose graph is the segment joining (a, 0) to (a+ 1n , f(a+ 1

n))for a ≤ x < a+ 1

n . Then the L2 norm of fn− f converges to zero, much likein the Example 3. of §2.2.)

Similarly,C0[a, b] = {f ∈ C[a, b] | f(a) = f(b) = 0}

is dense in L2[a, b], and so is the following very useful space

C10 [a, b] = {f | f ′ continuous on [a, b], f(a) = f(b) = 0, f ′(a) = f ′(b) = 0}

2.9. Polynomials are dense in the Banach space C[a, b]. The space ofpolynomials is dense in C[a, b] in the sup norm. (This is the Weierstrass’Theorem: any continuous function f on [a, b] can be uniformly approximatedby polynomials, in the sense that there is a sequence of polynomials pn sothat limn→∞ sup[a,b] |f − pn| = 0.)

5These are the sequences x = (x1, x2, . . . ) so that xn = 0 for all n large enough.


Please note that closure and density are relative to a norm. Forexample, C[a, b] is not closed in the L2 norm, but it is closed in the sup-norm.

3. Hilbert Spaces

In a Hilbert space we can do linear algebra (since it is a vector space), ge-ometry (since we have lengths and angles) and calculus (since it is complete).And they are all combined when we write series expansions.

Recall:

Definition 23. A Hilbert space H is a real or complex inner product spacethat is also a complete metric space with respect to the distance functioninduced by the inner product.

Recall the two fundamental examples: the space of sequences `2, and thespace of square integrable function L2.

3.1. When does a norm come from an inner product?In every Hilbert space the parallelogram identity holds: for any f, g ∈

H

(6) ‖f + g‖2 + ‖f − g‖2 = 2(‖f‖2 + ‖g‖2)

(in a parallelogram the sum of the squares of the sides equals the sum of thesquares of the diagonals).

Relation (6) is proved by a direct calculation (expanding the norms interms of the inner products and collecting the terms).

The remarkable fact is that, if the parallelogram identity holds in a Banachspace, then its norm actually comes from an inner product, and can recoverthe inner product only in terms of the norm bythe polarization identity:for complex spaces

〈f, g〉 =14(‖f + g‖2 − ‖f − g‖2 + i‖f + ig‖2 − i‖f − ig‖2

)and for real spaces

〈f, g〉 =14(‖f + g‖2 − ‖f − g‖2

)3.2. The inner product is continuous. This means that we can takelimits inside the inner product:

Theorem 24. If fn ∈ H, with fn → f then 〈fn, g〉 → 〈f, g〉.

Indeed: |〈fn, g〉 − 〈f, g〉| = |〈fn − f, g〉| ≤ (by Cauchy-Schwarts)‖fn − f‖ ‖g‖ → 0. (Recall: fn → f means that ‖fn − f‖ → 0.)

14 RODICA D. COSTIN

3.3. Orthonormal bases.Consider a Hilbert space H.Just like in linear algebra we define:

Definition 25. If 〈f, g〉 = 0 then f, g ∈ H are called orthogonal (f ⊥ g).

and

Definition 26. A set B ⊂ H is called orthonormal if all f, g ∈ B areorthogonal (f ⊥ g) and unitary (‖f‖ = 1).

Note that as in linear algebra, an orthonormal set is a linearly independentset (why?).

And departing from linear algebra:

Definition 27. A set B ⊂ H is called an orthonormal basis for H if itis an orthonormal set and it is complete in the sense that the span of B isdense in H: Sp(B) = H.

Note that in the above B is not a basis in the sense of linear algebra(unless H is finite-dimensional). But like in linear algebra:

Theorem 28. Any Hilbert space admits an orthonormal basis; furthermore,any two orthonormal bases of the same space have the same cardinality,called the Hilbert dimension of the space.

From here on we will only consider Hilbert spaces which admita (finite or) countable orthonormal basis.

These are the Hilbert spaces encountered in mechanics and electricity.It can be proved that this condition is equivalent to the existence of a

countable set S dense in H. This condition is often easier to check, andthe property is called ”H is separable”. Application: L2[a, b] is separable(why?) therefore L2[a, b] has a countable orthonormal basis. Many physicalproblems are solved by finding special orthonormal basis of L2[a, b]!

[For your amusement: it is quite easy to construct a Hilbert space withan uncountable basis, e.g. just like we took `2(Z+) we could take `2(R).]

We assume from now on that our Hilbert spaces H do have acountable orthonormal basis. Consider one such basis u1, u2, u3, . . ..

For any f ∈ H we have, like in linear algebra (but admitting series insteadof sums), that

(7) f =∞∑n=1

fnun

where

(8) fn = 〈un, f〉

It is not hard to check that the series (7) converges to f .


Recall that, by definition, a series (7) converges to f means that

(9) limN→∞

∥∥f − N∑n=1

fnun∥∥ = 0

Also like in linear algebra we have (why?)

(10) ‖f‖2 =∞∑n=1

|fn|2

As an immediate consequence of (10) we have

(11) ‖f‖2 ≥∑n∈J|fn|2 for any J ⊂ Z+

Note that (separable) Hilbert spaces are essentially `2, since given anorthonormal basis u1, u2, . . ., the elements f ∈ H can be identified with thesequence of their generalized Fourier coefficients (f1, f2, f3, . . .) which, by(10), belongs to `2.

An expansion (7) is called the (generalized) Fourier series of f , the fnare called the (generalized) Fourier coefficients of f , formula (10) is calledthe Parseval’s identity, and (11) is called Bessel’s inequality.

3.4. Example: generalized Fourier series in L2. If H = L2[a, b] then(9) means that

(12) limN→∞

∫ b

a

∣∣f(x)−N∑n=1

fnun(x)∣∣2 dx = 0

(we took the square of the norm rather than the norm).

Formula (12) is often expressed as “f equals∞∑n=1

fnun in the least square

sense ” or ”in the mean”; see also §3.8.Only if f(x) and un(x) are smooth enough (precise conditions can be

found) then the series (7) converges to f for each x (”the series is point-wiseconvergent”) but this is not the case for all f ∈ L2. (The series (12) maynot converge for all x, and even if it converges, it may have a different limitthan f(x).)

3.5. Example: Fourier series in L2[a, b]. You may know that a periodicfunction can be expanded in series in sines and cosines.

It can be easily checked that the functions 1, sin(nx), cos(nx), (n =1, 2, 3 . . .) form an othogonal system in L2[−π, π].

Moreover, they form a basis for the real-valued Hilbert space L2[−π, π](this requires a proof). It follows that if f(x) ∈ L2[−π, π] then f(x) can beapproximated (in the sense of (12)) by the partial sums of a Fourier series:

a0

2+∞∑n=1

[an cos(nx) + bn sin(nx)]

16 RODICA D. COSTIN

where we have, by (8),

an =1π

∫ π

−πf(x) cos(nx) dx , bn =

1π

∫ π

−πf(x) sin(nx) dx,

A more concise and compact formula can be written using Euler’s formulaeinx = cos(nx) + i sin(nx). The functions einx, n ∈ Z form an orthogonalset in (the complex vector space of) complex valued functions in L2[−π, π](check!). Moreover, they form a Hilbert space basis, and any complex valuedfunction in L2[−π, π] can be written expanded in a Fourier series

f(x) =∞∑

n=−∞fne

inx

with the Fourier coefficients given by

fn =1

2π

∫ π

−πf(x)e−inx dx

The series converges in the mean (i.e. in the L2 norm), and, moreover,pointwise if f(x) is ”smooth enough” (precise mathematical formulationsform the topic of one domain of mathematics, harmonic analysis.).

The Fourier coefficients an, bn, fn are related by the formulas

an = fn + f−n for n = 0, 1, 2, . . . , bn = i(fn − f−n) for n = 1, 2, . . .

On a general interval [a, b] the Fourier series is generated by expansion interms of exp(2πinx/(b− a)) (with n ∈ Z) as it is easily seen by a rescalingof x.

3.6. Example: orthogonal polynomials. Consider weighted L2 spaces,L2([a, b], w(x)dx). As we noted, the polynomials form a dense set, and theyare spanned by 1, x, x2, x3, . . . The Gram-Schmidt process (with respect tothe weighted inner product) produces a sequence of orthonormal polynomialswhich span a dense set in the weighted L2, hence it forms an orthonormalbasis for L2([a, b], w(x)dx).

Orthogonal polynomials (with respect to a given weight, on a given in-terval) have been playing a fundamental role in many areas of mathematicsand its applications, and are invaluable in approximations.

3.7. Orthogonal complements, The Projection Theorem. Here theconstructions are quite similar to the finite-dimensional case. One importantdifference is that we often need to assume that subspaces are closed, orotherwise take their closure.

Definition 29. If S is a subset of a Hilbert space H its orthogonal is

S⊥ = {f ∈ H | 〈s, f〉 = 0, for all s ∈ S}


Remark that S⊥ is a closed subspace, therefore it is a Hilbert spaceitself.Why: as in linear algebra, S⊥ is a vector subspace; to see that it is closedtake a sequence fn ∈ S⊥ so that fn → f , and show that f ∈ S⊥. Indeed,for any s ∈ S we have 0 = 〈fn, s〉 → 〈f, s〉 (by Theorem 32) so f ∈ S⊥. 2

Definition 30. If V is a closed subspace of H, then V ⊥ is called the or-thogonal complement of V .

Note that if V is a vector subspace (not necessarily closed) then (V ⊥)⊥ =V (another point to be careful about in infinite diemensions).

3.7.1. Orthogonal Projections. Like in finite dimensions, in Hilbert spacesthe projection fV of f onto a subspace V is the vector that minimizes thedistance between f and V . Note however: when we try to upgrade a state-ment from finite to infinite dimensions we often need to assume that oursubspaces are closed, and if they are not, to replace them by their closure.

Theorem 31. Given V a closed subspace of H and f ∈ H there exists aunique fV ∈ V which minimizes the distance from f to V :

‖f − fV ‖ = min{‖f − g‖ | g ∈ V }

fV is called the orthogonal projection of f onto V .

The outline of the proof is as follows: let gn ∈ V so that ‖f − gn‖ →inf{‖f−g‖ | g ∈ V }. A calculation (expanding ‖gn−gm‖2 and using Cauchy-Schwartz) yields that the sequence (gn)n is Cauchy. Since V was assumedclosed, then gn converges, and fV is its limit. 2

The linear operator PV : H → H which maps f to fV is called theorthogonal projection onto V .

Like in finite dimensions, H decomposes as a direct sum between a sub-space V and its orthogonal complement: but V must to be closed:

Theorem 32. (The Projection Theorem)Let V be a closed subspace of H.Every f ∈ H can be written uniquely as f = fV + f⊥, with fV ∈ V and

f⊥ ∈ V ⊥.In other words, H = V ⊕ V ⊥.

The Projection Theorem follows by proving that f − fV is orthogonal toV (which follows after calculations not included here). 2

3.8. Least squares approximation via subspaces. Consider an orthonor-mal basis of H: u1, u2, u3, . . .. Any given f ∈ H can be written as a (con-vergent) series

f =∞∑n=1

fnun, where fn = 〈un, f〉

18 RODICA D. COSTIN

We can approximate f by finite sums, with control of the errors, in thefollowing way.

It is clear that the N th partial Fourier sum:

f [N ] =N∑n=1

fnun

represents the orthogonal projection of f onto VN = Sp(u1, u2, . . . , uN )(why?). Since VN is closed (it is finite-dimesional, hence it is essentiallyRN or CN ) then by Theorem 31 we have that

‖f − f [N ]‖ = min{‖f − g‖ | g ∈ VN} = min{EN (c1, . . . , cN )∣∣ ck ∈ C}

where EN (c1, . . . , cN ) = ‖f −N∑n=1

cnun‖

The function E2N (c1, . . . , cN ) is called Gauss’ mean squared error. For

example, for H = L2[a, b] we have

E2N (c1, . . . , cN ) =

∫ b

a

∣∣f(x)−N∑n=1

cnun(x)∣∣2 dx

Note that∣∣f(x)−

∑Nn=1 cnun(x)

∣∣2 ≡ s(x) is the squared error at x, and then1b−a

∫ ba s(x)dx is its mean.

Minimizing the squared error (solving the least squares problem) is achievedby (looking for stationary points, hence by) solving the system

∂E2N

∂ck= 0, k = 1, . . . , N

whose solution is (try it!) ck = 〈uk, f〉 = fk (for k = 1, . . . , N).The error in approximating f by f [N ] is EN (f1, . . . , fN ) therefore

E2N (f1, . . . , fN ) = ‖f − f [N ]‖2 = ‖

∞∑n=N+1

fnen‖2 =∞∑

n=N+1

|fn|2 ≤ ‖f‖2

4. Linear operators in Hilbert spaces

Let H be a Hilbert space over the scalar field F = R or C, with a countableorthonormal basis u1, u2, u3, . . ..

In order to extend the concept of a matrix to infinite dimensions, it ispreferable to look at the linear operator associated to it.

Definition 33. An operator T : H → H is called linear if T (f + g) =Tf + Tg and T (cf) = cTf for all f, g ∈ H and c ∈ F .

The linearity conditions are often written more compactly as

T (cf + dg) = cTf + dTg for all f, g ∈ H, and c, d ∈ F


The adjoint T ∗ of an operator T is defined as in the finite-dimensionalcase, by 〈Tf, g〉 = 〈f, T ∗g〉 for all f, g.

We will see that it is sometimes convenient to consider operators T whichare defined on a domain D(T ) ⊂ H which is smaller than the whole Hilbertspace H. In such cases T ∗ is (usually) also defined on a domain D(T ∗) ⊂ Hand we will call T ∗ the adjoint of T .

We will call the operator T formally selfadjoint (or symmetric) if

〈Tf, g〉 = 〈f, Tg〉 for all f ∈ D(T ), g ∈ D(T ∗)

We call T selfadjoint if it is formally selfadjoint and D(T ) = D(T ∗).

4.1. Shift operators on `2. The examples below illustrate a very impor-tant distinction of infinite dimension: for a linear operator to be invertiblewe need to check both that its kernel is null (meaning that the operator isone-to-one) and that its range is the whole space (the operator is onto). Wewill see that the two conditions are not equivalent in infinite dimensions,and that an isometry, while it clearly is one-to-one, need not be onto.

1. Consider the left shift operator S : `2 → `2:

S(x1, x2, x3, x4, . . .) = (x2, x3, x4, . . .)

which is clearly linear.Obviously Ker(S) = Sp(e1) (S is not one-to-one) and Ran(S) = `2 (S is

onto).2. Consider the right shift operator R on `2:

R(x1, x2, x3, x4, . . .) = (0, x1, x2, x3, . . .)

The operator R is clearly linear. Note that R is an isometry, since

‖Rx‖2 =∞∑n=1

|xn|2 = ‖x‖2

thus we have Ker(R) = {0} (R is one-to-one). But R is not onto sinceRan(R) = {x ∈ `2 |x1 = 0}.

4.2. Unitary operators. Isomorphic Hilbert spaces.

Definition 34. A linear operator U : H → H is called unitary if UU∗ =U∗U = I (i.e. its inverse equals it adjoint).

This definition is similar to the finite dimensional case, and, it follows thatif U is unitary, then U is an isometry. Example 2 in §4.1 shows that theconverse is not true in infinite dimensions (an isometry may not be onto).It can be shown that

Theorem 35. If a linear operator is an isometry, and is onto, then it isunitary.

20 RODICA D. COSTIN

We noted that if H is a Hilbert space with a countable orthonormal basisthen H is essentially `2. Mathematically, this is stated as follows:

Theorem 36. Let H be a Hilbert space with a countable orthonormal basis.Then H is isomorphic to `2 in the sense that there exists a unitary operatorU : `2 → H.

To be precise, if H is a complex Hilbert space, then we consider thecomplex `2 (sequences of complex numbers), while if H is a real Hilbertspace, then we consider the real `2 (sequences of real numbers.)

The proof of Theorem 36 is immediate: the operator U is constructed inthe obvious way. Denoting by u1, u2, . . . an orthonormal basis of H we define

U(a1, a2, . . .) = a1u1 + a2u2 + . . . =∞∑n=1

anun for each (a1, a2, . . .) ∈ `2

We need to show that∑∞

n=1 anun ∈ H, that U is one-to-one (clearly trueby the definition of the Hilbert space basis) and onto (intuitively clear butrequires an argument). Finally, by the Parseval’s identity U is an isometry,and by Theorem 35 it is unitary. �

Other examples of unitary operators.It can be shown that the Fourier transform is a unitary operator on L2(R).

4.3. Integral operators.

4.3.1. Illustration - solutions of a first order differential equation. The sim-plest integral operator takes f to

∫ xa f(s) ds (it is clearly linear!). More

general integral operators have the form

(13) Kf(x) =∫ b

aG(x, s)f(s) ds

where G(x, s) (called ”integral kernel”) is continuous, or has jump disconti-nuities.

(Note the similarity of (13) with the action of a matrix on a vector: ifG = (Gxs)x,s is a matrix, and f = (fs)s is a vector, the (Gf)x =

∑sGxsfs.)

Integral operators appear as solutions of nonhomogenous differential equa-tions; then G is called the Green’s functions.

As a first example, consider the differential problem

(14)dy

dx+ y = f(x), y(0) = 0

where we look for solutions y(x) for x in a finite interval, say x ∈ [0, 1].It is well known that the general solution of (14) is obtained by multiplying

by the integrating factor (exp of the integral of the coefficient of y) giving

d

dx( exy) = exf(x) therefore y(x) = Ce−x + e−x

∫exf(x) dx


After fitting in the initial condition y(0) = 0 we obtain the solution

(15) y(x) = e−x∫ x

0esf(s) ds

which has the form (13) for the (Green’s function of the problem)

(16) G(x, s) = e−x es χ[0,x](s)

where χ[0,x] is the characteristic function of the interval [0, x], defined as

(17) χ[0,x](s) ={

1 if s ∈ [0, x]0 if s 6∈ [0, x]

In many applications it is important to understand how solutions of (14)depend on the nonhomogeneous term f , for example, if a small change of fproduces only a small change of the solution y = Kf . This is affirmative ifwe have the following estimate:

(18) there is a constant C so that ‖Kf‖ ≤ C ‖f‖ for all f

If an operator K satisfies (18) then K is called bounded. In fact, condition(18) is equivalent to the fact that the linear operator K is continuous.

It is not hard to see that the integral operator K given by (13), (16) isbounded on L2[0, 1]; this means that the mean squared average of Kf doesnot exceed a constant times the mean squared average of f (the estimatesare shown in §4.3.2 below).

We can find better estimates, point-wise rather than in average, sincethe linear operator K is also bounded as an operator on the Banach spaceC[0, 1], and this means that the maximum of |Kf(x)| does not exceed Ctimes the maximum of |f | (see the proof of this estimate below in §4.3.3).

While the estimate in C[0, 1] (in the sup-norm) is stronger and moreinformative than the estimate in L2[0, 1] (in square-average), we have topay a price: we can only obtain them for continuous forcing terms f .

General principle: integral operators do offer good estimates.

4.3.2. Proof that K is bounded on L2[0, 1] (Optional). Let us denote thenorm in L2[0, 1] by ‖ ·‖L2 , and the inner product by 〈·, ·〉L2 to avoid possibleconfusions.

To see that K is bounded on L2[0, 1] note that Kf(x) is, for each fixedx, an L2 inner product, and using the Cauchy-Schwarts inequality

(19) |Kf(x)| = |〈G(x, ·), f(·)〉L2 | ≤ ‖G(x, ·)‖L2 ‖f‖L2

We can calculate using (16)

‖G(x, ·)‖2L2 =∫ 1

0G(x, s)2 ds =

∫ x

0e2s−2xds =

12(1− e−2x

)<

12

which used in (19) gives

(20) |Kf(x)| ≤ 12‖f‖L2

22 RODICA D. COSTIN

for any f ∈ L2[0, 1]. Then from (20) we obtain

(21) ‖Kf‖2L2 =∫ 1

0|Kf(x)|2dx ≤ 1

4‖f‖2L2

so K satisfies (18) in the L2-norm for C = 1/2. 2

4.3.3. Proof that K is bounded on C[0, 1] (Optional). Recall that C[0, 1] isa Banach space with the sup-norm ‖ · ‖∞. Note that we obviously have|f(x)| ≤ ‖f‖∞ for any continuous f .

We need to use the following (very useful) inequality:∣∣ ∫ b

aF (s) ds

∣∣ ≤ ∫ b

a|F (s)| ds

which is intuitively clear if we think of the definite integral of a function asthe signed area between the graph of the function and the x-axis.

We then have, for any f ∈ C[0, 1],

|Kf(x)| =∣∣e−x ∫ x

0esf(s) ds

∣∣ ≤ e−x ∫ x

0es |f(s)| ds

≤ ‖f‖∞ e−x∫ x

0es ds = ‖f‖∞ (1− e−x) ≤ ‖f‖∞

hence K satisfies (18) in the sup-norm for C = 1. 2

4.3.4. Remarks about the δ function. An alternative method for finding theGreen’s function of (14) is by solving dy

dx + y = δ(x − s) where δ is Diracdelta function. Its solution turns out to be G(x, s) in (16). Then, using thefundamental property of the delta function that

(22) f(x) =∫δ(x− s)f(s)ds

by superposition of solutions we obtain (15).It should be noted that Dirac’s delta function does not belong to L2: since

it equals zero everywhere but a single point then in L2 it must coincide withthe zero function. An alternative view of this fact is that the delta functioncannot be obtained as the L2-limit of a sequence of continuous functions.

What is the δ function? Note that it is mainly used via its action onfunctions, like (22). As a precise mathematical object, then δ is an objectthat acts on functions - it is called a distribution. The study of distributionsand of their applications form a separate chapter in mathematics.

4.4. Differential operators in L2[a, b]. The simplest differential operatoris d

dx , which takes f to dfdx . It is clearly linear.

There are two fundamental differences between the differential operatorddx and the integral operators (13):

1) we cannot define the derivative on all the functions in L2, and


2) ddx is not a bounded operator: we cannot estimate the derivative of

a function by only knowing the magnitude of the function (not in the sup-norm, and not even in average). Indeed, you can easily imagine a smoothfunction, whose values are close to zero and never exceed 1, but which hasa very narrow and sudden spike. The narrower the spike, the higher thederivative, even if the average of the function remains small.

4.4.1. Illustration on a first order problem. Consider again the problem (14).It can be written as

Ly = f, y(0) = 0

where L is the differential operator

L =d

dx+ 1, therefore Ly =

(d

dx+ 1)y =

dy

dx+ y

If we are interested to allow the nonhomogeneous term f to have jumpdiscontinuities, since y′ = f − y then y′ will have jump discontinuities (atsuch a point y′ may not be defined). An example of a such function is:

d

dx|x| =

{−1 if x < 01 if x > 0

A good domain for L is then

D = {y ∈ L2[0, 1] | y′ ∈ L2[0, 1], y(0) = 0}

Note that the initial condition was incorporated in D. Note also that D isdense in L2[0, 1].

Note that the integral operator K found in §4.3.1 is the inverse of thedifferential operator L (defined on D).

4.5. A second order boundary value problem. Quite often separablepartial differential equations lead to second order boundary value problems,illustrated here on a simple example.

Problem: Find the values of the constant λ for which the equation

(23) y′′ + λy = 0

has non-identically zero solutions for x ∈ [0, π] satisfying the boundary con-ditions

(24) y(0) = 0, y(π) = 0

Equation (23) models the simple vibrating string: x ∈ [0, 1] representsthe position on the string, and y(x) is the displacement at position x. Theboundary conditions (24) mean that the endpoints x = 0 and x = 1 of thestring are kept fixed.

24 RODICA D. COSTIN

4.5.1. Formulation of the problem using a differential operator. Equation(23) can be written as

(25) Ly = λy, where L = − d2

dx2

with the domain (dense in L2[0, π])

(26) D = {y ∈ L2[0, π] | y′′ ∈ L2[0, π], y(0) = 0, y(π) = 0}(it must be noted that D needs to be more precisely specified, stating con-ditions on y and y′).

A solution of (23), (24) is a solution of (25) in the domain (26). If this so-lution is not (identically) zero, then it is an eigenfunction of L, correspondingto the eigenvalue λ.

Note that L is self-adjoint on D since for y, g ∈ D we have

〈Ly, g〉 =∫ π

0Ly(x)g(x) dx = −

∫ 1

0y′′(x)g(x) dx

and integration by parts gives (since g(0) = 0 = g(π))

= −y′(x)g(x)∣∣π0

+∫ π

0y′(x)g′(x) dx =

∫ π

0y′(x)g′(x) dx

and integrating by parts again (and using y(0) = 0 = y(π))

= y(x)g(x)∣∣π0−∫ π

0y(x)g′′(x) dx = 〈y, Lg〉

(Note that the boundary terms vanish by virtue of the boundary conditions.)It can be shown, just like in the finite-dimensional case, that since L is

selfadjoint then its eigenvalues are real and the eigenfunctions of L corre-sponding to different eigenvalues are orthogonal (see §4.5.2).

Remark that L = − d2

dx2 is positive definite, motivating the choice of anegative sign in front of the second derivative. Indeed, for y 6= 0

〈Ly, y〉 = −∫ 1

0y′′(x)y(x) dx =

∫ π

0y′(x)y′(x) dx =

∫ π

0|y′(x)|2 dx > 0

where the last inequality is strict because∫ π0 |y

′(x)|2 dx = 0 implies y′ = 0,therefore y is constant, and due to the zero boundary conditions then y = 0.

Then the eigenvalues of L are positive, like in the finite-dimensional case.Indeed, if Lf = λf (f 6= 0) then 〈f, Lf〉 = 〈f, λf〉 hence

−∫ 1

0f(x) f ′′(x) dx = λ

∫ 1

0|f(x)|2 dx

then integrating by parts and using the fact that f(0) = f(1) = 0 we obtain∫ 1

0|f ′(x)|2 dx = λ

∫ 1

0|f(x)|2 dx

which implies λ > 0.


Let us find the eigenvalues and eigenfunctions of L. Since λ 6= 0 thegeneral solution of (23) is y = C1e

√−λx +C2e

−√−λx (note that

√−λ = i

√λ

since λ > 0). Since y(0) = 0 it follows that C2 = −C1, so, choosing C1 = 1,y(x) = ei

√λx − e−i

√λx. The condition y(π) = 0 implies that e2i

√λπ = 1

therefore√λ ∈ Z so λ = λn = n2, n = 1, 2, . . . and the corresponding

eigenfunctions are yn(x) = einx − e−inx = 2i sin(nx).A proof is needed to show that the eigenfunctions are complete, i.e. they

form a basis for the Hilbert space L2[0, π]. The question of finding the eigen-values, eigenfunctions, proving their completeness and finding their proper-ties, is the subject of study in Sturm-Liouville theory.

4.5.2. Review. Let A be a selfadjoint operator.1) If λ is an eigenvalue of A, then λ ∈ R.Indeed, Av = λv for some v 6= 0. Then on one hand we have

〈v,Av〉 = 〈v, λv〉 = λ〈v, v〉 = λ‖v‖2

and on the other hand

〈v,Av〉 = 〈Av, v〉 = 〈λv, v〉 = λ〈v, v〉 = λ‖v‖2

therefore λ‖v‖2 = λ‖v‖2 so (λ− λ)‖v‖2 = 0 therefore λ = λ so λ ∈ R.2) If λ 6= µ are to eigenvalues of A, Av = λv (v 6= 0), and Au = µu

(u 6= 0) then u ⊥ v.Indeed, on one hand

〈u,Av〉 = 〈u, λv〉 = λ〈u, v〉

and on the other hand

〈u,Av〉 = 〈Au, v〉 = 〈µu, v〉 = µ〈u, v〉

therefore λ〈u, v〉 = µ〈u, v〉 so (λ− µ)〈u, v〉 = 0 therefore 〈u, v〉 = 0.

4.5.3. Solution of a boundary value problem using an integral operator. Letus consider again equation (25) on the domain (26) and rewrite it as (L −λ)y = 0. We have non-zero solutions y (eigenfunctions of L) only whenKer(L− λ) 6= {0}.

Let us consider the totally opposite case, and look at complex numbersz for which L − z is invertible. (Recall that, in general, the condition thatthe kernel be zero does not guarantee invertibility of an operator in infinitedimensions.)

Let us invert L−z; this means that for any f we solve (L−z)y = f givingy = (L− z)−1f . The operator (L− z)−1 is called the resolvent of L.

To find y we solve the differential equation

(27) y′′ + zy = −f

with the boundary conditions

(28) y(0) = 0, y(π) = 0

26 RODICA D. COSTIN

Recall that the general solution of (27) is given by

C1y1 + C2y2 − y1

∫y2

Wf + y2

∫y1

Wf

where y1, y2 are two linearly independent solutions of the homogeneous equa-tion y′′ + zy = 0 and W = W [y1, y2] = y′1y2 − y1y

′2 is their Wronskian.

We need to distinguish the cases z = 0 and z 6= 0.I. If z = 0, then y1 = 1, y2 = x and we can easily solve the problem

(27), (28).II. If z 6= 0, then y1 = exp(kx) and y2 = exp(−kx) where k =

√−z (note

that k may be a complex number; in particular, if z > 0 then k = i√z).

Their Wronskian is W = 2k, so the general solution of (27) has the form

C1ekx + C2e

−kx − ekx

2k

∫ x

0e−ksf(s) ds+

e−kx

2k

∫ x

0eksf(s) ds

The boundary condition y(0) = 0 implies C2 = −C1 therefore

y(x) = C(ekx − e−kx

)− ekx

2k

∫ x

0e−ksf(s) ds+

e−kx

2k

∫ x

0eksf(s) ds

(29) ≡ C(ekx − e−kx

)+∫ π

0g(x, s) f(s) ds

where

g(x, s) =−12k

(ekx−ks − e−kx+ks

)χ[0,x](s)

Imposing the boundary condition y(π) = 0 we obtain that C must satisfy

C(ekπ − e−kπ

)+∫ π

0g(π, s) f(s) ds = 0

Solving for C and substituting in (29) we obtain that the solution of theproblem (27), (28) has the form (13)

y(x) =∫ π

0G(x, s) f(s) ds = (L− z)−1f

where G(x, s) is the Green function of the problem

(30) G(x, s) = g(x, s) − ekx − e−kx

ekπ − e−kπg(π, s)

Note that G is not defined if ekπ − e−kπ = 0 which means for ik ∈ Z.Since k =

√−z (z 6= 0), this means that for z = n2 (n = 1, 2, . . .) the Green

function is undefined (for these values the denominator of G vanishes): theresolvent (L− z)−1 does not exist for z = λn = n2.

Note also that G(x, s) is continuous (the discontinuity of χ[0,x](s) at s = xdoes not result in a discontinuity of G(x, s) at s = x because g(x, x) = 0).

It is clear that if z is real then (L − z)−1 is self-adjoint because L isself-adjoint.


The study of operators on Hilbert spaces is the topic of Functional Analy-sis. In one of its chapters it is proved that integral operators (13) (and othersimilar operators, called compact operators) which are selfadjoint are verymuch like selfadjoint matrices, in that they have real eigenvalues µn and thecorresponding eigenfunctions un form an orthonormal basis for the Hilbertspace. The infinite dimensionality of the Hilbert space implies that thereare infinitely many eigenvalues: a countable set, which, moreover, tend tozero: µn → 0. (Zero may also be an eigenvalue.)

Let us see how the eigenvalues µn and eigenfunctions of the resolvent(L− z)−1 are related to the eigenvalues λn and eigenfunctions yn of L.

We have Lyn = λnyn hence (L− z)yn = (λn − z)yn for any number z. IfL − z is invertible (we saw that this is the case for z 6= λk for all k) thenyn = (λn−z)(L−z)−1yn so yn is an eigenfunctions of the resolvent (L−z)−1

corresponding to the eigenvalue µn = (λn − z)−1.Note that λn →∞ (since µn → 0).Note the following quite general facts:◦ Remark the functional calculus aspect: if λn are the eigenvalues of L

then (λn− z)−1 are the eigenvalues of (L− z)−1 and they correspond to thesame eigenvectors.◦ Note again that the eigenvalues of L appear as values of z for which the

Green function has zero denominators.

4.6. General second order self-adjoint problems. Consider the follow-ing generalization of the example studied in §4.5:

General problem: Find the values of the constant λ for which the equation

(31)1

ρ(x)d

dx

(p(x)

d

dx

)y + q(x)y + λy = 0

has non-identically zero solutions for x ∈ [a, b] satisfying the boundary con-ditions

(32) y(a) = 0, y(b) = 0

We shall see that (31) is the most general linear second order equation.This is an eigenvalue problem for the differential operator

L = − d

dx

(1

ρ(x)d

dx

)− q(x)

The operator L is selfadjoint in a weighted L2 space: L2([0.L], ρ(x)dx) on adomain where the bounday values vanish. Indeed, it is easy to check that∫ L

0Lf(x) g(x) ρ(x)dx =

∫ L

0f(x)Lg(x) ρ(x)dx

Then, as in §4.5, it follows (from the theory of compact operators) thatL has a sequence of eigenvalues and the corresponding eigenfunctions forman orthonormal basis for the weighted L2.

28 RODICA D. COSTIN

We will look more closely to these operators in the next chapter, whichstudies the Sturm-Liouville problem.

AN INTRODUCTION TO HILBERT SPACES Contents Spaces.pdf · AN INTRODUCTION TO HILBERT SPACES RODICA...

Documents

Transcript of AN INTRODUCTION TO HILBERT SPACES Contents Spaces.pdf · AN INTRODUCTION TO HILBERT SPACES RODICA...