Source: bass.math.uconn.edu/functional2011.pdf

Functional Analysis

Richard Bass

These notes are based mostly on the book by P. Lax, Functional Analysis.

Functional analysis can best be characterized as infinite dimensional linear algebra. We will use some real analysis, complex analysis, and algebra, but functional analysis is not really an extension of any one of these.

1 Linear spaces

A linear space is the same as a vector space. We have an operation “+” under which the space is an Abelian group: addition is commutative, associative, there exists an identity (which we call 0), and every element has an inverse (the inverse of x will be denoted −x).

We define x− y = x+ (−y).

We also have a field F, which for us will always be the reals or the complex field. Elements of F will be called scalars.

A linear space also has an operation called scalar multiplication, which is associative (a(bx) = (ab)x) and distributive (a(x + y) = ax + ay and (a + b)x = ax + bx), and we have 1x = x, where 1 is the identity in F.

By the same proofs as in the finite dimensional case, we have 0x = 0 because 0x = (0 + 0)x = 0x + 0x, and (−1)x = −x because

0 = 0x = (1 + (−1))x = (1)x + (−1)x = x + (−1)x.

Examples of linear spaces are

1. Rn

2. The collection of all infinite sequences.

3. B(S), the bounded functions on a set S.

4. C(S), the continuous functions on S, where S is a topological space.


5. Ck(R), the functions on R with k continuous derivatives.

6. Lp(X,m), the p-integrable functions on a measure space (X,m).

7. {f : f is analytic in D}, where D is the open unit disk.

If X is a linear space and Y ⊂ X, then Y is a linear subspace of X if y ∈ Y implies ay ∈ Y for all a ∈ F and x, y ∈ Y implies x + y ∈ Y.

Let S be a subset of X. Consider the collection

{Yα : Yα is a linear subspace of X,S ⊂ Yα}.

It is easy to check that ∩αYα is a subspace of X, and it is called the linear span of S.

Proposition 1.1 The linear span of S is equal to

{a1x1 + · · · + anxn : ai ∈ F, xi ∈ S, n ∈ N}.

Proof. Let M be the above set of sums. M is clearly a linear subspace of X containing S, therefore the span of S is contained in M. If Yα is any linear subspace containing S, then Yα must contain M, therefore ∩αYα contains M.
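In the finite dimensional case, Proposition 1.1 yields a concrete membership test: x lies in the span of S exactly when appending x to S does not raise the rank. A minimal sketch in Python (the vectors and the helper names `rank`, `in_span` are illustrative, not from the text); exact rational arithmetic avoids round-off:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors (rows) via Gaussian elimination over Q."""
    rows = [[Fraction(a) for a in r] for r in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(x, S):
    """x lies in the span of S iff adding x does not increase the rank."""
    return rank(S) == rank(S + [x])

S = [[1, 0, 1], [0, 1, 1]]
print(in_span([2, 3, 5], S))   # 2*(1,0,1) + 3*(0,1,1) = (2,3,5) -> True
print(in_span([1, 1, 1], S))   # -> False
```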

Let X and U be linear spaces over F. M : X → U is linear if M(x + y) = M(x) + M(y) and M(kx) = kM(x).

Examples of linear operators:

1) X = L1(m), g is bounded and measurable, and

Tf = ∫ f(x)g(x) m(dx).

2) Let our space be B(S), fix a point x0 ∈ S or points x0, . . . , xn ∈ S, and let Tf = f(x0) or Tf = (f(x0), . . . , f(xn)). Here T maps B(S) to R or Rn+1.

3) Let X be the space of n-tuples, and define the ith coordinate of Mx to be ∑_{j=1}^n aij xj. This is just matrix multiplication, and all linear maps in finite dimensions can be viewed in this way.


4) Let X be the set of bounded sequences, suppose that sup_i ∑_{j=1}^∞ |aij| < ∞, and define the ith coordinate of Mx to be ∑_{j=1}^∞ aij xj.

5) Let X be the set of bounded measurable functions on some measure space with finite measure m, and suppose K(x, y) is jointly measurable and bounded. Define Mf by

Mf(x) = ∫ K(x, y)f(y) m(dy).

Two linear spaces are isomorphic if there exists a one-to-one linear mapping from one space onto the other.

If M is a 1-1 linear mapping from X onto U, then M−1 is also linear. To see this, suppose u1 = Mx1 and u2 = Mx2 are elements of U with x1, x2 ∈ X. Then u1 + u2 = Mx1 + Mx2 = M(x1 + x2). Hence M−1(u1 + u2) = x1 + x2 = M−1u1 + M−1u2. Similarly M−1(ku) = kM−1u.

A set K ⊂ X is convex if λx + (1 − λ)y ∈ K whenever x, y ∈ K and λ ∈ [0, 1].

A convex combination of x1, . . . , xn is a sum of the form ∑_{i=1}^n aixi, where ∑_{i=1}^n ai = 1, n is a positive integer, and all the ai are non-negative.

Lemma 1.2 Linear subspaces are convex. Intersections of convex sets are convex. If M : X → U is linear and K ⊂ X is convex, then {M(x) : x ∈ K} is convex.

If S ⊂ X, the convex hull of S is the intersection of all the convex sets containing S.

Proposition 1.3 The convex hull of S is equal to the set of all convex combinations of points of S.

If K is convex, and E ⊂ K, then E is an extreme subset of K if

(1) E is convex and non-empty

(2) if x ∈ E and x = (y + z)/2 with y, z ∈ K, then y, z ∈ E.

If E is a single point, then the point is called an extreme point of K.

For an example, consider the case where K is a polygon (plus the interior) in R2. Each edge of K is an extreme subset. Each vertex is an extreme point.


2 Linear maps

2.1 Definitions

Definition 2.1 If M and N are linear maps from X into U and k is a scalar, we define

(M +N)(x) = M(x) +N(x), (kM)(x) = kM(x).

So the set of linear maps from X into U is a linear space, and we denote it L(X,U).

If M : X → U and N : U → W , we define (NM)(x) = N(M(x)).

An exercise is to show this is associative but not necessarily commutative. (Multiplication by matrices is an example to show commutativity need not hold.) It is distributive:

M(N +K) = MN +MK, (M +K)N = MN +KN.

We usually write Mx for M(x).

Define the identity I : X → X by Ix = x. We will also write IX when we want to emphasize the space.

We say M : X → U is invertible if there exists M−1 : U → X such that M−1M = IX, MM−1 = IU.

Definition 2.2 The null space or kernel of M is NM = {x ∈ X : Mx = 0} and the range of M is RM = {Mx : x ∈ X}.

Observe that NM ⊂ X and RM ⊂ U .

Some easily checked facts: NM and RM are linear subspaces, and if K, M are invertible, then (KM)−1 = M−1K−1.

Let Y be a subspace of X. Y is called an invariant subspace for M : X → X if M maps Y into Y.

One example of an invariant subspace is to let y be a fixed element of X and set

Y = {p(M)y : p a polynomial}.


To check this, if z ∈ Y, then z = p(M)y for some polynomial p. Then Mz = Mp(M)y = q(M)y, where q(x) = xp(x) is another polynomial, so Mz ∈ Y.

A second example: suppose T : X → X and MT = TM. Then NT is invariant. To see this, if z ∈ NT, then Tz = 0 and then T(Mz) = TMz = MTz = M0 = 0, or Mz ∈ NT.

If N and Y are subspaces of X, we write X = N ⊕ Y if for each x ∈ X, there exist n ∈ N and y ∈ Y such that x = n + y, and the decomposition is unique, i.e., there is only one n and one y that works for any particular x. Of course, n and y depend on x.

As an example, let X = R3 and N = {(y, 0, 0) : y ∈ R}. There are lots of possibilities for Y: in fact, any plane in R3 that passes through the origin and does not contain the first coordinate axis. Given any choice of Y, though, there is only one way to write a given x as n + y.

We will frequently use Zorn’s lemma, which is equivalent to the axiom of choice.

Suppose we have a partially ordered set S, which means that there is an order relation such that a ≤ a for all a ∈ S, and if a ≤ b and b ≤ c, then a ≤ c. A subset is totally ordered if for every pair x, y in the subset, either x ≤ y or y ≤ x. An element u of a partially ordered set is an upper bound for a subset of S if x ≤ u for every x in the subset. An element x of a partially ordered set is maximal if y ≥ x implies y = x. (Lax stated this incorrectly.)

Lemma 2.3 (Zorn’s lemma) Let X be a partially ordered set. If every totally ordered subset of X has an upper bound in X, then X has a maximal element.

Lemma 2.4 Suppose N is a subspace of a linear space X. Then there exists a linear subspace Y such that X = N ⊕ Y.

Proof. Look at {Y : Y a subspace of X, Y ∩ N = {0}}. We partially order this collection by inclusion. If {Yα} is a totally ordered subcollection, then ∪αYα is an upper bound. Let Y0 be the maximal element guaranteed by Zorn’s lemma.

Suppose there is a point x ∈ X that is not in N ⊕ Y0. We adjoin x to Y0 to form Y1, that is, Y1 = {ax + y : y ∈ Y0, a ∈ R}. Y1 is a subspace of X that is strictly bigger than Y0. We argue that Y1 ∩ N = {0}, a contradiction to the fact that Y0 is maximal.

x is not in the direct sum of N and Y0, so x ∉ N, or else we could write x = x + 0. If z ≠ 0 and z ∈ Y1 ∩ N, then there exist a ∈ R and y ∈ Y0 such that z = ax + y. One possibility is that a = 0; but then z = y ∈ Y0 ∩ N, which isn’t possible since z is nonzero. The other possibility is that a ≠ 0. But z ∈ N, so

x = z/a + (−y)/a ∈ N ⊕ Y0,

also a contradiction.

2.2 Index of a map

We say that dimX < ∞, that is, X is finite dimensional, if there exist finitely many points x1, . . . , xn ∈ X such that X is equal to the span of {x1, . . . , xn}. The smallest such n is the dimension of X.

Let X be a linear space and Y a subspace. We say x1 ≡ x2 (mod Y), or that x1 is equivalent to x2, if x1 − x2 ∈ Y. This is an equivalence relation. Let x̄ denote the equivalence class containing x. The collection of all such equivalence classes is denoted X/Y and called the quotient space of X with respect to Y.

Let’s make X/Y into a linear space. If x̄1, x̄2 are in X/Y, define x̄1 + x̄2 to be the equivalence class of x1 + x2. To see that this is well defined, if z1, z2 are elements of x̄1, x̄2, resp., then (x1 + x2) − (z1 + z2) = (x1 − z1) + (x2 − z2), the sum of two elements of Y, hence an element of Y. We similarly define kx̄ to be the equivalence class of kx. It is routine to verify that X/Y is now a linear space.

We define the codimension of Y by

codim Y = dimX/Y.

Let’s look at an example. Suppose X = R5 and Y = {(x, y, 0, 0, 0)}. Then x1 ≡ x2 if and only if the 3rd through 5th coordinates of x1 and x2 agree. Therefore X/Y is, up to isomorphism, the set of 3rd through 5th coordinates of points in R5, hence isomorphic to R3. We see codim Y = dimX/Y = 3, while dimY = 2.


Lemma 2.5 If X = N ⊕ Y , then X/N is isomorphic to Y .

Proof. If x̄ ∈ X/N, choose a representative x and write x = y + n with y ∈ Y and n ∈ N. Define Mx̄ = y. We will show that M is an isomorphism.

First we need to show M is well defined. If x′ is another element of x̄, we can write x′ = y′ + n′. Then x − x′ = (y − y′) + (n − n′) is in N. Since we can also write x − x′ = 0 + (x − x′), uniqueness of the decomposition forces y − y′ = 0, or y = y′.

Next we show M is linear. Write x1 = y1 + n1 and x2 = y2 + n2; then x1 + x2 = (y1 + y2) + (n1 + n2). So M(x̄1 + x̄2) = y1 + y2 = Mx̄1 + Mx̄2. The linearity with respect to scalar multiplication is similar.

We show M is 1-1. If Mx̄ = Mx̄′, and we write x = y + n, x′ = y′ + n′, then y = Mx̄ = Mx̄′ = y′. Hence

x − x′ = (y − y′) + (n − n′) = n − n′ ∈ N,

so x̄ = x̄′.

Finally, M is onto, because if y ∈ Y, then y = y + 0 ∈ X and Mȳ = y.

Let M : X → U . We will need the fact that

Proposition 2.6 X/NM is isomorphic to RM .

Proof. If x̄ ∈ X/NM, we define M̄x̄ to be Mx for any x ∈ x̄. If x′ is any other element of x̄, then x − x′ ∈ NM, or M(x − x′) = 0, or Mx = Mx′. So the map M̄ is well defined. It is routine to check that M̄ is linear.

To show M̄ is 1-1, if M̄x̄ = M̄ȳ, then Mx = My, or M(x − y) = 0, or x − y ∈ NM, so x̄ = ȳ. To show M̄ is onto, if y ∈ RM, then y = Mx for some x ∈ X. Then M̄x̄ = Mx = y.

We say a linear map G is degenerate if dimRG < ∞. Maps M : X → U and L : U → X are pseudo-inverses if there exist G : X → X and H : U → U that are degenerate and

LM = IX + G,    ML = IU + H.


This concept is not interesting in finite dimensions. For example, let X = U = R5, let A be any 5 × 5 matrix, and define Mx = Ax, where we view x as a 5 × 1 matrix. Let B be any other 5 × 5 matrix and define Lx = Bx. Then LMx = BAx = (I + G)x, where G = BA − I. Obviously RG is finite dimensional. We deal with ML similarly. So M and L are pseudo-inverses.

For a non-trivial example of pseudo-inverses, define the right and left shifts on X, the linear space of infinite sequences, by

R(a1, a2, . . .) = (0, a1, a2, . . .),

L(a1, a2, . . .) = (a2, a3, . . .).

Then

RL(a1, a2, . . .) = (0, a2, a3, . . .) = (I + G)(a1, a2, . . .),

where

G(a1, a2, . . .) = (−a1, 0, 0, . . .).

Clearly RG is finite dimensional. We see that LR is the identity, so L and R are pseudo-inverses.

We will prove that M : X → U has a pseudo-inverse if and only if dimNM < ∞ and codim RM < ∞.

We start by proving

Proposition 2.7 If M has a pseudo-inverse, then

dimNM <∞ and codim RM <∞.

Proof. First suppose G is degenerate. We show dimNI+G < ∞. To see this, if x ∈ NI+G, then x + Gx = 0, so x = −Gx ∈ RG. Therefore NI+G ⊂ RG, which implies dimNI+G ≤ dimRG < ∞.

Next, LM = I + G, so if x ∈ NM, then Mx = 0, so LMx = 0, or (I + G)x = 0, or x ∈ NI+G. Therefore NM ⊂ NI+G, hence dimNM ≤ dimNI+G < ∞.

Recall that X/NG is isomorphic to RG. Therefore codim NG = dimRG.

We claim codim RI+G < ∞. To prove the claim, if x ∈ NG, then (I + G)x = x + Gx = x, so NG ⊂ RI+G. Then codim RI+G ≤ codim NG = dimRG < ∞.


Finally, since ML = IU + H with H degenerate, we have RI+H ⊂ RM: if y ∈ RI+H, then y = (I + H)u = M(Lu) for some u ∈ U. The claim above applies equally to H, so we may write

codim RM ≤ codim RI+H < ∞.

We now prove

Proposition 2.8 If dimNM < ∞ and codim RM < ∞, then M has a pseudo-inverse.

Proof. Suppose M : X → U and write X = NM ⊕ Y, U = RM ⊕ V. We claim that M restricted to Y is a 1-1 map onto RM. First we show onto. If z ∈ RM, then z = Mx for some x ∈ X. For some n ∈ NM and y ∈ Y, x = n + y. So z = Mx = Mn + My = My.

Next we show 1-1. If My1 = My2 with y1, y2 ∈ Y, then M(y1 − y2) = 0, or y1 − y2 ∈ NM. Let n = y1 − y2. Then y2 + n = y1 + 0. Since X = NM ⊕ Y, the decomposition is unique, and n = 0 and y1 = y2.

Therefore M : Y → RM is invertible. Define

K = M−1 on RM,    K = 0 on V.

That is, extend K to U = RM ⊕ V by linearity: if u = z + v with z ∈ RM and v ∈ V, let Ku = M−1z.

If y ∈ Y, then My ∈ RM and so KMy = y. If n ∈ NM, then KMn = 0. So

KM = 1 on Y,    KM = 0 on NM.

If z ∈ RM, then z = My for some y ∈ Y, so MKz = MKMy = My = z. If z ∈ V, then Kz = 0. So

MK = 1 on RM,    MK = 0 on V.


Let P be the projection onto NM: if x ∈ X and x = y + n, define Px = n. Then KM = I − P. Since dimRP = dimNM < ∞, P is degenerate.

Similarly MK = I − Q, where Q is the projection onto V. Since V is isomorphic to U/RM by a previous lemma, dimV = codim RM < ∞, and Q is degenerate.

A sequence of spaces and maps

V0 −→ V1 −→ · · · −→ Vn−1 −→ Vn,    with maps Tj : Vj → Vj+1,

is an exact sequence if RTj = NTj+1 for each j.

Lemma 2.9 Suppose we have an exact sequence, each Vj is finite dimensional, and dimV0 = 0 = dimVn. Then

∑_{j=0}^n (−1)^j dimVj = 0.

Note since dimV0 = dimVn, we can write the sum from j = 0 to n, or omit or include j = 0 or j = n as we choose.

Proof. Write Nj for NTj, Rj for RTj, and write Vj = Nj ⊕ Yj. Because we have an exact sequence, Rj−1 = Nj.

Now Vj/Nj is isomorphic to Rj = Nj+1. On the other hand, Yj is isomorphic to Vj/Nj by Lemma 2.5. Hence Yj is isomorphic to Nj+1, and

dimVj = dimNj + dimNj+1.

Note N0 ⊂ V0, so dimN0 = 0. Also, because Rn−1 ⊂ Vn and dimVn = 0, Rn−1 = {0}. Since Vn−1/Nn−1 is isomorphic to Rn−1, dimVn−1 = dimNn−1.

Then

∑ (−1)^j dimVj = ∑ (−1)^j [dimNj + dimNj+1]
  = dimN0 + dimN1 − dimN1 − dimN2 + · · · ± (dimNn−2 + dimNn−1 − dimNn−1)
  = dimN0
  = 0.


Suppose M has a pseudo-inverse. Define

ind(M) = dimNM − codim RM.

Theorem 2.10 If M : X → U and L : U → W both have pseudo-inverses, then LM has a pseudo-inverse and

ind (LM) = ind (L) + ind (M).

Proof. We use an exact sequence: V0 = 0, V1 = NM, V2 = NLM, V3 = NL, V4 = U/RM, V5 = W/RLM, V6 = W/RL, and V7 = 0.

Since NM ⊂ NLM, we let T1 be the inclusion map. T2 = M. T3 is the map taking U to U/RM. T4 is the map taking U/RM to W/RLM, while T5 is the map from W/RLM to W/RL. It is an exercise to check that this is an exact sequence. Using the lemma,

− dimNM + dimNLM − dimNL + codim RM − codim RLM + codim RL = 0.

This does it.
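In finite dimensions the index is easy to compute: for M : Rn → Rm given by a matrix of rank r, dimNM = n − r and codim RM = m − r, so ind(M) = n − m regardless of the matrix, and the additivity ind(LM) = ind(L) + ind(M) is visible directly. A sketch (the particular matrices are arbitrary illustrations, not from the text):

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(A[0])):
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def ind(A, n, m):
    """ind(M) = dim N_M - codim R_M for M : R^n -> R^m given by an m x n matrix A."""
    r = rank(A)
    return (n - r) - (m - r)   # always n - m, whatever the matrix

M = [[1, 2, 3], [0, 0, 0]]               # M : R^3 -> R^2
Lmap = [[1, 1], [2, 2], [3, 3], [0, 1]]  # L : R^2 -> R^4
LM = matmul(Lmap, M)                     # LM : R^3 -> R^4
print(ind(M, 3, 2), ind(Lmap, 2, 4), ind(LM, 3, 4))  # 1 -2 -1, and 1 + (-2) = -1
```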

3 Hahn-Banach theorem

3.1 The theorem

A linear functional ℓ is a linear map from X to F. Let’s take F to be the reals for now.

We will be working with a sublinear functional p(x) in the statement of the theorem. An example to keep in mind is p(x) = ‖x‖, although we have not defined what a norm is yet.

Theorem 3.1 Suppose p : X → R satisfies p(ax) = ap(x) if a > 0 and p(x + y) ≤ p(x) + p(y) if x, y ∈ X. Suppose Y is a linear subspace, ℓ is a linear functional on Y, and ℓ(y) ≤ p(y) for all y ∈ Y. Then ℓ can be extended to a linear functional on X satisfying ℓ(x) ≤ p(x) for all x ∈ X.


Proof. If Y is not all of X, pick z ∈ X \ Y. Look at Y1 = {y + az : y ∈ Y, a ∈ R}. We want to define ℓ(z) to be some real number with the property that if we set

ℓ(y + az) = ℓ(y) + aℓ(z),

we would have ℓ(y) + aℓ(z) ≤ p(y + az) for all y ∈ Y and a ∈ R. This would give us an extension of ℓ from Y to Y1.

For all y, y′ ∈ Y,

ℓ(y′) + ℓ(y) = ℓ(y′ + y) ≤ p(y′ + y) = p((y + z) + (y′ − z)) ≤ p(y + z) + p(y′ − z).

So

ℓ(y′) − p(y′ − z) ≤ p(y + z) − ℓ(y).

This is true for all y, y′ ∈ Y. So choose ℓ(z) to be a number between sup_{y′} [ℓ(y′) − p(y′ − z)] and inf_y [p(y + z) − ℓ(y)]. Therefore

ℓ(y′) − p(y′ − z) ≤ ℓ(z) ≤ p(y + z) − ℓ(y),

or

ℓ(y) + ℓ(z) ≤ p(y + z),    ℓ(y′) − ℓ(z) ≤ p(y′ − z).

If a > 0,

ℓ(y + az) = aℓ(y/a + z) ≤ ap(y/a + z) = p(y + az).

Similarly ℓ(y′ − az) ≤ p(y′ − az) if a > 0.

So we have extended ℓ from Y to Y1, a larger space. Let {(Yα, ℓα)} be the collection of all extensions of (Y, ℓ). This is partially ordered by inclusion. If {(Yβ, ℓβ)} is a totally ordered subset, define ℓ on ∪βYβ by setting ℓ(z) = ℓβ(z) if z ∈ Yβ. By Zorn’s lemma, there is a maximal extension. This maximal extension must be all of X, or else by the above we could extend it.

The Hahn-Banach theorem one learns in a real analysis course has the same proof, but one takes p(x) = c|x|. Saying ℓ(x) ≤ p(x) then translates to saying |ℓ| ≤ c, and we extend ℓ so that the norm of the extension is the same.


3.2 Separating hyperplanes

If ℓ is a linear functional, {x : ℓ(x) = c} is a hyperplane. This splits X into two parts, those x for which ℓ(x) > c and those for which ℓ(x) < c.

A point x0 ∈ S ⊂ X is interior to S if for all y ∈ X, there exists ε > 0 (depending on y) such that x0 + ty ∈ S if −ε < t < ε.

Let K be a convex set with an interior point x0. Without loss of generality, we may assume x0 = 0. The gauge pK is defined by

pK(x) = inf{a > 0 : x/a ∈ K}.

Proposition 3.2 pK is homogeneous and subadditive.

Saying pK is homogeneous means that pK(ax) = apK(x) if a > 0. Subadditive means pK(x + y) ≤ pK(x) + pK(y).

Proof. Homogeneity is obvious. We look at subadditivity. Let x, y ∈ X. If pK(x) or pK(y) is infinite, there is nothing to prove. So suppose both are finite and let ε > 0. Choose pK(x) < a < pK(x) + ε and pK(y) < b < pK(y) + ε. Then x/a and y/b are in K. Letting λ = a/(a + b),

λ(x/a) + (1 − λ)(y/b) = (x + y)/(a + b)

is in K. So

pK(x + y) ≤ a + b ≤ pK(x) + pK(y) + 2ε.

Since ε is arbitrary, we are done.
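The gauge can be computed numerically from a membership oracle by bisection; for the square K = [−1, 1]² in R², the gauge works out to the sup norm max(|x1|, |x2|), and subadditivity can be checked directly. A sketch under those assumptions (`gauge` and `square` are illustrative names, not from the text):

```python
def gauge(in_K, x, hi=1e6, tol=1e-9):
    """p_K(x) = inf{a > 0 : x/a in K}, by bisection on a membership oracle,
    assuming K is convex and contains 0 in its interior."""
    if not in_K([c / hi for c in x]):
        return float("inf")   # x/a never enters K for a up to hi
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid > 0 and in_K([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

# K = unit square [-1, 1]^2; its gauge is the sup norm max(|x1|, |x2|)
square = lambda p: all(abs(c) <= 1 for c in p)
x, y = [3.0, 1.0], [0.5, -2.0]
px = gauge(square, x)
py = gauge(square, y)
pxy = gauge(square, [a + b for a, b in zip(x, y)])
print(round(px, 6), round(py, 6), round(pxy, 6))  # 3.0 2.0 3.5
assert pxy <= px + py + 1e-6   # subadditivity (Proposition 3.2)
```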

The following proposition’s proof is left to the reader.

Proposition 3.3 (a) If K is convex and x ∈ K, then pK(x) ≤ 1. If K is convex and x is interior to K, then pK(x) < 1.

(b) Let p be positive, homogeneous, and subadditive. Then {x : p(x) < 1} is convex and 0 is an interior point. Also {x : p(x) ≤ 1} is convex.

We now prove the hyperplane separation theorem.


Theorem 3.4 Suppose K is a nonempty convex subset of X and all points of K are interior. If y ∉ K, then there exist ℓ and c such that ℓ(x) < c for all x ∈ K and ℓ(y) = c.

Proof. Without loss of generality, assume 0 ∈ K. Note pK(x) < 1 for all x ∈ K. Set ℓ(y) = 1 and ℓ(ay) = a. If a ≤ 0, then ℓ(ay) ≤ 0 ≤ pK(ay). If a > 0, then since y ∉ K, pK(y) ≥ 1, and so pK(ay) ≥ a = ℓ(ay).

We let Y = {ay : a ∈ R} and use Hahn-Banach to extend ℓ to all of X. We have ℓ(x) ≤ pK(x) < 1 if x ∈ K and ℓ(y) = 1. We take c = 1.

Corollary 3.5 If K is convex with at least one interior point and y ∉ K, there exists ℓ ≠ 0 such that ℓ(x) ≤ ℓ(y) for all x ∈ K.

A+B is defined to be {a+ b : a ∈ A, b ∈ B}.

Corollary 3.6 Let H and M be disjoint convex sets, with at least one having an interior point. Then there exist ℓ and c such that

ℓ(u) ≤ c ≤ ℓ(v),    u ∈ H, v ∈ M.

Proof. −M is convex, so K = H + (−M) is convex. K must have an interior point. H ∩ M = ∅, so 0 ∉ K. Let y = 0. There exists ℓ such that

ℓ(x) ≤ ℓ(0) = 0,    x ∈ K.

If u ∈ H and v ∈ M, then x = u − v ∈ K, so ℓ(u − v) ≤ 0, and hence ℓ(u) ≤ ℓ(v). Taking c between sup_{u∈H} ℓ(u) and inf_{v∈M} ℓ(v) finishes the proof.

3.3 Complex linear functionals

Theorem 3.7 Let X be a linear space over C. Suppose p ≥ 0 satisfies p(ax) = |a|p(x) for all x ∈ X, a ∈ C, and p(x + y) ≤ p(x) + p(y). If Y is a subspace of X, ℓ is a linear functional on Y, and |ℓ(y)| ≤ p(y) for all y ∈ Y, then ℓ can be extended to a linear functional on X with |ℓ(x)| ≤ p(x) for all x.


Again, we think of p(x) as ‖x‖, once that is defined.

Proof. Write ℓ as ℓ(y) = ℓ1(y) + iℓ2(y), where ℓ1 and ℓ2 are the real and imaginary parts of ℓ. Since ℓ is linear,

iℓ(y) = ℓ(iy) = ℓ1(iy) + iℓ2(iy).

On the other hand,

iℓ(y) = iℓ1(y) − ℓ2(y)

by substituting in for ℓ(y) and multiplying by i. Equating the real parts, ℓ1(iy) = −ℓ2(y).

One can work this in reverse to see that if ℓ1 is a linear functional over the reals, and we define ℓ(x) = ℓ1(x) − iℓ1(ix), we get a linear functional over the complexes.

To extend ℓ, we have

ℓ1(y) ≤ |ℓ(y)| ≤ p(y).

Use Hahn-Banach to extend ℓ1 to all of X and set ℓ(x) = ℓ1(x) − iℓ1(ix).

We need to show that |ℓ(x)| ≤ p(x) for all x. Fix x and write ℓ(x) = ar, where r ≥ 0 is real and |a| = 1. Then

|ℓ(x)| = r = a−1ℓ(x) = ℓ(a−1x).

Since ℓ(a−1x) = |ℓ(x)|, it is real with no imaginary part, and therefore equals

ℓ1(a−1x) ≤ p(a−1x) = |a−1|p(x) = p(x).

4 An application of the Hahn-Banach theorem

Let S be an arbitrary set and let B be the collection of real-valued bounded functions on S. We say x ≤ y if x(s) ≤ y(s) for all s ∈ S. (We’ll use x ≥ y if y ≤ x.) A function x is non-negative if 0 ≤ x. Let Y be a linear subspace of B. ℓ is a positive linear functional on Y if ℓ(y) ≥ 0 whenever y ≥ 0. Note that if x ≤ y, then 0 ≤ ℓ(y − x) = ℓ(y) − ℓ(x), so ℓ is monotone.

One example is to take ℓ(y) = y(s0) for some point s0 in S. Or we could take a linear combination ∑ ciy(si), provided all the ci ≥ 0. Another example is if S is a measure space and we let ℓ(y) = ∫ y(s) m(ds).

Proposition 4.1 Let Y be a linear subspace and suppose there exists y0 ∈ Y such that y0(s) ≥ 1 for all s. Let ℓ be a positive linear functional on Y. Then ℓ can be extended to a positive linear functional on B.

Proof. Define

p(x) = inf{ℓ(y) : y ∈ Y, y ≥ x}.

Since −cy0 ≤ x ≤ cy0 if x is bounded by c, we are not taking the infimum of an empty set. Since x ≤ cy0, p(x) ≤ cℓ(y0) < ∞. If y ≥ x and y ∈ Y, then −cy0 ≤ x ≤ y, so ℓ(y) ≥ ℓ(−cy0) = −cℓ(y0) > −∞.

It is clear that p is homogeneous. To show that p is subadditive, suppose x1, x2 ∈ B and y1, y2 ∈ Y with x1 ≤ y1, x2 ≤ y2. Then

p(x1 + x2) = inf_{x1+x2≤y} ℓ(y) ≤ inf_{x1≤y1, x2≤y2} ℓ(y1 + y2) = inf_{x1≤y1} ℓ(y1) + inf_{x2≤y2} ℓ(y2) = p(x1) + p(x2).

If y ∈ Y and y′ ≥ y is any other element in Y, then ℓ(y) ≤ ℓ(y′), so p(y) ≥ ℓ(y). If x ≤ 0, then since 0 ∈ Y, p(x) ≤ ℓ(0) = 0.

We now use Hahn-Banach to extend ℓ to all of B. If x ≤ 0, then ℓ(x) ≤ p(x) ≤ ℓ(0) = 0. Hence if x ≥ 0, then ℓ(x) = −ℓ(−x) ≥ 0, which proves that ℓ is positive on B.

5 Normed linear spaces

A norm is a map from X to R, denoted |x|, such that |0| = 0, |x| > 0 if x ≠ 0, |x + y| ≤ |x| + |y|, and |ax| = |a| |x|. A linear space together with its norm is called a normed linear space.


If we define d(x, y) = |x − y|, then d is a metric, and we can use all the terminology of topology.

Two norms |x|1 and |x|2 are equivalent if there exists a constant c > 0 such that

c|x|1 ≤ |x|2 ≤ c−1|x|1,    x ∈ X.

Equivalent norms give rise to the same topology.

A subspace of a normed linear space is again a normed linear space. If Z and U are normed linear spaces, we can make Z ⊕ U into a normed linear space by defining

|(z, u)| = |z| + |u|.

For many purposes it is important to know whether a subspace is closed or not, closed being in the topological sense. Here is an example of a subspace that is not closed. Let X = ℓ2, the set of all sequences x = (x1, x2, . . .) with |x| = (∑_{j=1}^∞ |xj|²)^{1/2} < ∞. Let M be the collection of points in X such that all but finitely many coordinates are zero. Clearly M is a linear subspace. Let y1 = (1, 0, . . .), y2 = (1, 1/2, 0, . . .), y3 = (1, 1/2, 1/4, 0, . . .), and so on. Each yk ∈ M. But it is easy to see that |yk − y| → 0 as k → ∞, where y = (1, 1/2, 1/4, 1/8, . . .) and y ∉ M. Thus M is not closed.
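The convergence |yk − y| → 0 is easy to see numerically: the tail norm shrinks like 2^{−k}. A small sketch, truncating the sequences to finitely many coordinates (the truncation length 60 is an assumption of the illustration, chosen so the neglected tail is below double precision):

```python
def l2_dist(a, b):
    """ℓ2 distance between sequences given as finite lists (zero-padded)."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return sum((s - t) ** 2 for s, t in zip(a, b)) ** 0.5

N = 60                               # keep 60 coordinates of y
y = [2.0 ** -j for j in range(N)]    # y = (1, 1/2, 1/4, ...)
for k in (1, 5, 10, 20):
    yk = y[:k]                       # yk ∈ M: only finitely many nonzero coordinates
    print(k, l2_dist(yk, y))         # tends to 0, yet y itself is not in M
```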

Proposition 5.1 If X is a normed linear space and Y is a closed subspace, then X/Y is a normed linear space with |x̄| = inf_{y∈x̄} |y|.

Proof. Homogeneity is easy. To prove subadditivity, let ε > 0. Given x̄, z̄, there exist x ∈ x̄ and z ∈ z̄ such that |x| < |x̄| + ε and |z| < |z̄| + ε. Then

|x̄ + z̄| ≤ |x + z| ≤ |x| + |z| ≤ |x̄| + |z̄| + 2ε.

But ε is arbitrary.

It remains to prove positivity. Suppose |x̄| = 0. Then there exists a sequence xn ∈ x̄ such that |xn| → 0. Since each xn ∈ x̄, xn − x1 ∈ Y. Let yn = x1 − xn. So lim |x1 − yn| = lim |xn| = 0. Therefore yn → x1. Since Y is closed, x1 ∈ Y, which implies x̄ = x̄1 = 0̄.

If X is a normed linear space and Y is a subspace of X, then the closure of Y is also a linear subspace of X.


A Banach space is a complete normed linear space.

Recall that any metric space can be embedded in a complete metric space.

Examples:

1) ℓ∞ is the collection of infinite sequences (a1, a2, . . .) with each ai ∈ C and sup_i |ai| < ∞. We define |x|∞ = sup_j |aj|. This is a complete space.

2) If 1 ≤ p < ∞, ℓp is the collection of infinite sequences for which

|x|p = (∑_j |aj|^p)^{1/p}

is finite. This is a complete space.
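For finite sequences the ℓp norms and the triangle inequality can be checked directly; a sketch (`lp_norm` is an illustrative helper, with p = ∞ giving the sup norm of example 1):

```python
def lp_norm(x, p):
    """|x|_p for a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(a) for a in x)
    return sum(abs(a) ** p for a in x) ** (1 / p)

x, y = [3, -4, 0], [1, 1, 1]
print(lp_norm(x, 1), lp_norm(x, 2), lp_norm(x, float("inf")))  # 7.0 5.0 4
# triangle inequality |x + y|_p <= |x|_p + |y|_p for several p
s = [a + b for a, b in zip(x, y)]
for p in (1, 1.5, 2, 3, float("inf")):
    assert lp_norm(s, p) <= lp_norm(x, p) + lp_norm(y, p) + 1e-12
```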

3) If S is a set, the collection of bounded functions on S with |f|∞ = sup_s |f(s)| is a complete normed linear space.

4) If S is a topological space, then the collection of continuous bounded functions with |f| = sup_s |f(s)| is a complete normed linear space.

5) The Lp spaces are Banach spaces.

6) (Sobolev spaces) Let D be a domain in Rn and consider the C∞ functions on D with

∫_D |∂^α f(x)|^p dx

finite for all |α| ≤ k. Here ∂^α = ∂^{α1}/∂x1^{α1} · · · ∂^{αn}/∂xn^{αn} and |α| = α1 + · · · + αn. For a norm, we take

|f|_{k,p} = (∑_{|α|≤k} ∫_D |∂^α f(x)|^p dx)^{1/p}.

This is not a complete space, but its completion is denoted W^{k,p} and is called a Sobolev space.

A normed linear space is separable if it contains a countable dense subset. Most of the examples are separable, but B(S) is not if S is uncountable. Another example of one that is not is the collection of finite signed measures on [0, 1], with |m| = ∫_0^1 |dm|. If we let δy be point mass at y, then |δy − δz| = 2 unless y = z.


5.1 The unit ball is not compact in infinite dimensions

In finite dimensions, the closed unit ball is always compact, but this is not the case in infinite dimensions. As an example, consider ℓ2. If ei is the sequence which has a one in the ith place and 0 everywhere else, then |ei − ej| = √2 if i ≠ j. But then {ei} is a sequence contained in the unit ball that has no convergent subsequence, hence the unit ball is not compact.
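The computation |ei − ej| = √2 is immediate; a quick numerical check on truncated basis vectors (the truncation length is an assumption of the illustration):

```python
from math import sqrt

def e(i, n):
    """ith standard basis sequence of ℓ2, truncated to n coordinates."""
    v = [0.0] * n
    v[i] = 1.0
    return v

n = 10
vecs = [e(i, n) for i in range(n)]
dists = {round(sqrt(sum((a - b) ** 2 for a, b in zip(u, v))), 12)
         for u in vecs for v in vecs if u is not v}
print(dists)  # every pair of distinct ei, ej is at distance sqrt(2)
```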

In fact, the closed unit ball B = {x : |x| ≤ 1} is never compact in infinite dimensions.

Theorem 5.2 Let X be an infinite dimensional normed linear space. Then the closed unit ball is not compact.

Proof. Choose y1 such that |y1| = 1. Given y1, . . . , yn−1, let Yn be their linear span. Since Yn is finite dimensional, it is closed. We will show in a moment how to find yn such that |yn| = 1 and inf_{y∈Yn} |y − yn| ≥ 1/2. We continue by induction and find a sequence {yn} contained in the closed unit ball such that |yj − yn| ≥ 1/2 if j < n, hence which has no convergent subsequence.

Since Yn is finite dimensional, it is not all of X and there exists x ∈ X \ Yn. Since Yn is closed, d = inf_{y∈Yn} |y − x| > 0. Choose y′ ∈ Yn such that |x − y′| < 2d. Let z′ = x − y′, so |z′| < 2d. If y ∈ Yn, then y + y′ ∈ Yn and

|z′ − y| = |x− (y + y′)| ≥ d.

If we let yn = z′/|z′|, then for all y ∈ Yn,

|yn − y| = | z′/|z′| − y | = (1/|z′|) |z′ − |z′|y| ≥ d/(2d) = 1/2,

since |z′|y ∈ Yn.

5.2 Isometries

A (not necessarily linear) map M from X onto X is an isometry if

|Mx − My| = |x − y|

for all x, y ∈ X.

As an example, Mx = x + u, where u is a fixed element of X, is an isometry.

The collection of isometries forms a group.

Theorem 5.3 Let X, X′ be two normed linear spaces over the reals, M an isometric map from X onto X′ such that M(0) = 0. Then M is linear.

Proof. Fix x, y and let z = (x + y)/2. Then

|x − z| = |y − z| = |x − y|/2.

Let

A = {u : |x − u| = |y − u| = |x − y|/2}.

We show A is symmetric with respect to z. That is, if u ∈ A, we must show v = 2z − u ∈ A. To see this, 2z = x + y, so v − x = y − u and v − y = x − u, so |v − x| = |y − u| = |x − y|/2, and similarly for |v − y|.

The diameter of A, dA, is defined by dA = sup_{u,w∈A} |u − w|. Since A is symmetric with respect to z,

dA ≥ |u − v| = |u − (2z − u)| = |2u − 2z| = 2|u − z|,

so |u − z| ≤ dA/2.

Let

A1 = {p ∈ A : |u − p| ≤ dA/2 for all u ∈ A}.

Note z ∈ A1. A1 is symmetric with respect to z, because if p ∈ A1 and q = 2z − p, then for u ∈ A, q − u = (2z − u) − p = v − p with v = 2z − u ∈ A, so |q − u| = |v − p| ≤ dA/2.

We have dA1 ≤ dA/2.

We define

A2 = {p ∈ A1 : |u − p| ≤ dA1/2 for all u ∈ A1},

and so on. z is in each of these, the An decrease, and since dAn → 0, the intersection is the single point z.


If we let x′, y′ be the images of x, y under M, we define A′, A′1, . . . analogously. Since the sets are defined only in terms of distances, M maps An into A′n. M−1 is also an isometry, and so M−1 maps A′n into An. Therefore M maps ∩nAn onto ∩nA′n, or

M((x + y)/2) = (x′ + y′)/2.

When y = 0, y′ = M(0) = 0, and this becomes M(x/2) = x′/2 = M(x)/2, and similarly M(y/2) = y′/2. Replacing x by 2x and y by 2y in the two displays, we get M(x + y) = M(x) + M(y), the first property of linearity for M.

Since M(2x) = M(x + x) = M(x) + M(x) = 2M(x) and M(3x) = M(x + 2x) = M(x) + M(2x) = M(x) + 2M(x) = 3M(x) and so on, then M(kx) = kM(x) for positive integers k. Also M(−x) = −M(x), since M(x) + M(−x) = M(0) = 0. Since

M(x) = M(k(x/k)) = kM(x/k),

we have M(x/k) = (1/k)M(x). Therefore M(rx) = rM(x) for all rational r. Since M is an isometry, it is continuous, and therefore M(rx) = rM(x) for all real r.

To give some insight into the above proof, let X = ℓ∞, the set of bounded sequences, let x = (1, 0, 0, . . .), and y = (0, 1, 0, 0, . . .). Then

A = {(1/2, 1/2, a3, a4, . . .) : |ai| ≤ 1/2 for i ≥ 3}.

Then

A1 = {(1/2, 1/2, a3, a4, . . .) : |ai| ≤ 1/4 for i ≥ 3},

and so on. As we define the Ai, we close in on (1/2, 1/2, 0, . . .).

6 Hilbert space

6.1 Scalar product

We first look at the case F = R. We have a scalar product (a, b) mapping X × X into R which is bilinear: (x1 + cx2, y) = (x1, y) + c(x2, y), symmetric: (x, y) = (y, x), and positive: (x, x) > 0 if x ≠ 0.


When F = C, this changes to (ax, y) = a(x, y), (x1 + x2, y) = (x1, y) + (x2, y), and (y, x) = the complex conjugate of (x, y), and still positivity.

Note that (x, ay) = ā(x, y), since (x, ay) is the conjugate of (ay, x) = a(y, x).

Define ‖x‖ = (x, x)1/2.
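These identities are easy to sanity-check with Python’s complex numbers, taking the standard scalar product (x, y) = ∑_j xj ȳj on C^n (the vectors below are arbitrary):

```python
def ip(x, y):
    """Complex scalar product on C^n: (x, y) = sum_j x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, 3 - 1j]   # arbitrary test vectors
y = [2 - 1j, 1j]
a = 2 + 5j
assert ip([a * c for c in x], y) == a * ip(x, y)              # (ax, y) = a(x, y)
assert ip(x, [a * c for c in y]) == a.conjugate() * ip(x, y)  # (x, ay) = conj(a)(x, y)
assert ip(y, x) == ip(x, y).conjugate()                       # (y, x) = conj of (x, y)
print(ip(x, x))  # ‖x‖² = 5 + 10 = 15, a positive real
```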

We have the Cauchy-Schwarz inequality:

Proposition 6.1 |(x, y)| ≤ ‖x‖ ‖y‖, and equality holds if y = 0 or x = ay.

Proof. Let t be real and suppose y ≠ 0. We have

0 ≤ ‖x + ty‖² = ‖x‖² + 2tRe (x, y) + t²‖y‖². (6.1)

(Equality holds only if x = −ty.) Set t = −Re (x, y)/‖y‖². Multiplying by ‖y‖², we have

(Re (x, y))² ≤ ‖x‖²‖y‖².

If (x, y) = re^{iθ}, let a = e^{−iθ}, so that |a| = 1 and a(x, y) = r = |(x, y)| is real. Replacing x by ax in the above, and noting ‖ax‖ = ‖x‖, gives |(x, y)|² ≤ ‖x‖²‖y‖².

A corollary is that ‖x‖ = max_{‖y‖≤1} |(x, y)|. To see this, by Cauchy-Schwarz the right hand side is less than or equal to the left hand side. For the other direction, take y = x/‖x‖.

Corollary 6.2 ‖ · ‖ is a norm.

Proof. It is only subadditivity that is not trivial. Start with

‖x + ty‖² = ‖x‖² + 2tRe (x, y) + t²‖y‖²,

take t = 1, and use Cauchy-Schwarz: ‖x + y‖² ≤ ‖x‖² + 2‖x‖‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)².

If in (6.1) we take first t = 1 and then t = −1 and add, we get the parallelogram identity:

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².
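In a finite-dimensional Hilbert space the two facts just proved are easy to verify numerically. A minimal sketch in R³, with arbitrary example vectors (the vectors are illustrative assumptions, not taken from the notes):

```python
# Numerical sanity check of the Cauchy-Schwarz inequality and the
# parallelogram identity in the Hilbert space R^3.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, -1.0]

# Cauchy-Schwarz: |(x, y)| <= ||x|| ||y||
assert abs(dot(x, y)) <= norm(x) * norm(y)

# Parallelogram identity: ||x+y||^2 + ||x-y||^2 = 2||x||^2 + 2||y||^2
xpy = [a + b for a, b in zip(x, y)]
xmy = [a - b for a, b in zip(x, y)]
lhs = norm(xpy) ** 2 + norm(xmy) ** 2
rhs = 2 * norm(x) ** 2 + 2 * norm(y) ** 2
assert abs(lhs - rhs) < 1e-9
```

The identity holds exactly (up to floating-point rounding) for any pair of vectors, which is what makes it usable in the projection argument below.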


We say x and y are orthogonal if (x, y) = 0.

A linear space with a scalar product that is complete with respect to ‖ · ‖ is a Hilbert space.

One example is ℓ² with (x, y) = Σ_j a_j b_j. Another example is L² with

(x, y) = ∫ x(t)y(t) m(dt).

6.2 Convex subsets

Theorem 6.3 Let K be a convex, nonempty, and closed subset of a Hilbert space H and let x ∈ H. There exists a unique y ∈ K that is closer to x than any other point of K.

Proof. Let d = inf_{z∈K} ‖x − z‖. Choose yn ∈ K such that dn → d, where dn = ‖x − yn‖. By the parallelogram identity applied to (x − yn)/2 and (x − ym)/2, we have

‖x − (yn + ym)/2‖² + (1/4)‖yn − ym‖² = (1/2)dn² + (1/2)dm².

The right hand side tends to d². Since K is convex, (yn + ym)/2 ∈ K, so the first term on the left is greater than or equal to d². Therefore (1/4)‖yn − ym‖² → 0.

Thus {yn} is a Cauchy sequence. Since H is complete, yn → y for some y ∈ H. Since K is closed, y ∈ K. We have

‖x − y‖ = lim ‖x − yn‖ = lim dn = d.

If y′ is another point with ‖x − y′‖ = d, applying the parallelogram identity with x − y and x − y′ for the last equality in what follows,

4d² + ‖y − y′‖² ≤ 4‖x − (y + y′)/2‖² + ‖y − y′‖² = ‖2x − (y + y′)‖² + ‖y − y′‖² = 4d².

So we must have equality throughout, and then ‖y − y′‖² = 0.
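Theorem 6.3 can be illustrated in finite dimensions. For the closed convex set K = [0, 1]³ in R³, the closest point to x is the coordinatewise clamp of x; the sketch below checks this candidate against many sampled points of K. The box, the point x, and the sampling are illustrative assumptions, not part of the notes:

```python
# Closest-point property for a closed convex set: project onto the box
# [0,1]^3 by clamping each coordinate, then verify no sampled point of
# the box is closer.
import math
import random

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def project_to_box(x):
    # clamp each coordinate into [0, 1]
    return [min(1.0, max(0.0, c)) for c in x]

x = [1.7, -0.3, 0.4]
y = project_to_box(x)          # candidate closest point
d = dist(x, y)

random.seed(0)
for _ in range(10000):
    z = [random.random() for _ in range(3)]   # a point of K
    assert dist(x, z) >= d - 1e-12
```

The clamp works here because the box is a product of intervals, so the squared distance splits coordinate by coordinate; for a general convex K the theorem only guarantees existence and uniqueness.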

If Y is a linear subspace of H, the orthogonal complement of Y is

Y ⊥ = {v ∈ H : (v, y) = 0 for all y ∈ Y },

read “Y perp.”


Proposition 6.4 Suppose Y is a closed subspace of H. Then

1) Y ⊥ is a closed linear subspace of H.

2) H = Y ⊕ Y ⊥.

3) (Y ⊥)⊥ = Y .

Proof. 1) (We don't need Y closed for this first part.) That Y⊥ is a linear subspace is easy. If vn ∈ Y⊥ and vn → v, then for any y ∈ Y,

|(v, y) − (vn, y)| = |(v − vn, y)| ≤ ‖v − vn‖ ‖y‖ → 0,

so (v, y) = lim (vn, y) = 0, and hence v ∈ Y⊥. Therefore Y⊥ is closed.

2) If x ∈ H, choose y ∈ Y closest to x and set v = x − y. Since y is closest, for all z ∈ Y and real t,

‖v‖² ≤ ‖v + tz‖² = ‖v‖² + 2t Re (v, z) + t²‖z‖².

Taking t very small and both positive and negative, this is only possible if Re (v, z) = 0. By replacing z by az so that (v, az) is real, we see (v, z) = 0. Therefore v ∈ Y⊥ and x = y + v.

If x = y′ + v′ as well, then y − y′ = v′ − v ∈ Y ∩ Y⊥. So y − y′ is orthogonal to itself, and by positivity, y − y′ = 0. Hence y = y′ and v = v′.

3) If y ∈ Y, then for any v ∈ Y⊥ we have (y, v) = 0, and hence y ∈ (Y⊥)⊥. We thus need to show (Y⊥)⊥ ⊂ Y.

By 2), H = Y ⊕ Y⊥. If y ∈ (Y⊥)⊥, we can write y = v + z with z ∈ Y ⊂ (Y⊥)⊥ and v ∈ Y⊥. Then v = y − z ∈ (Y⊥)⊥. Since v is also in Y⊥, we see v = 0, or y = z ∈ Y.

6.3 Linear functionals

A linear functional ℓ is bounded if there exists c such that |ℓ(x)| ≤ c‖x‖ for all x.

For each y, ℓ(x) = (x, y) is a bounded linear functional, with |ℓ(x)| ≤ ‖y‖ ‖x‖ by Cauchy-Schwarz. The converse also holds; here is the idea. Note the range Rℓ is one-dimensional and Rℓ is isomorphic to H/Nℓ, so the codimension of the null space Nℓ is 1. We have H = Nℓ ⊕ Nℓ⊥. So the obvious candidate to take is y = cp, where p ∈ Nℓ⊥.


Theorem 6.5 (Riesz-Fréchet representation) Let ℓ be a bounded linear functional on a Hilbert space H. Then there exists a unique y ∈ H such that ℓ(x) = (x, y) for all x.

Proof. If ℓ = 0, take y = 0. If ℓ ≠ 0, then Nℓ is a closed proper subspace Y of H. (To show it is closed is easy, since ℓ is continuous.) We can write H = Y ⊕ Y⊥. Take p ∈ Y⊥ with ‖p‖ = 1.

Let z = ℓ(x)p − ℓ(p)x. Then ℓ(z) = 0, so z ∈ Y, and hence (z, p) = 0. This says

ℓ(x)(p, p) − ℓ(p)(x, p) = 0,

or ℓ(x) = ℓ(p)(x, p). In the real case we take y = ℓ(p)p; in the complex case, y is the conjugate of ℓ(p) times p, so that (x, y) = ℓ(p)(x, p) = ℓ(x).

To show uniqueness, if (x, y′) = ℓ(x) = (x, y) for all x, then (x, y − y′) = 0 for all x, in particular when x = y − y′. Now use positivity.
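The construction in the proof can be carried out concretely in R³: for ℓ(x) = (x, a), the orthogonal complement of the null space is spanned by a, so p = a/|a| and y = ℓ(p)p recovers a. The coefficient vector a is an illustrative assumption, not from the notes:

```python
# Following the proof of Theorem 6.5 in R^3: for ell(x) = (x, a), pick a
# unit vector p in the orthogonal complement of the null space and set
# y = ell(p) p; then ell(x) = (x, y) for every x.
import math

a = [2.0, -1.0, 3.0]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

def ell(x):
    return dot(x, a)

# In R^3, the null space's perp is spanned by a, so p = a/|a|.
na = math.sqrt(dot(a, a))
p = [c / na for c in a]
y = [ell(p) * c for c in p]     # y = ell(p) p, as in the proof

for x in ([1, 0, 0], [0, 1, 0], [0.5, -2.0, 1.5]):
    assert abs(ell(x) - dot(x, y)) < 1e-9
```

Here y coincides with a, as the representation theorem predicts.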

Theorem 6.6 (Lax-Milgram lemma) Let H be a Hilbert space and suppose B(x, y) maps H × H into the scalar field and

1. for each y, B(x, y) is linear in x;

2. for each x, B(x, y1 + y2) = B(x, y1) + B(x, y2) and B(x, cy) = cB(x, y);

3. there exists c such that |B(x, y)| ≤ c‖x‖ ‖y‖;

4. there exists b > 0 such that |B(y, y)| ≥ b‖y‖² for all y.

(We do not assume B(x, y) = B(y, x).) Then every bounded linear functional ℓ is of the form ℓ(x) = B(x, y) for some unique y.

Proof. For each y, B(x, y) is a bounded linear functional of x, so there exists z = z(y) such that B(x, y) = (x, z) for all x, and z is unique.

If Z = {z : z = z(y) for some y ∈ H}, then Z is a linear space.

Z is closed: setting x = y and letting z = z(y),

b‖y‖² ≤ |B(y, y)| = |(y, z)| ≤ ‖y‖ ‖z‖,

or

b‖y‖ ≤ ‖z‖.

If zn ∈ Z and zn → z, let yn be a point such that zn = z(yn). Then B(x, yn) = (x, zn), so B(x, yn − ym) = (x, zn − zm), hence b‖yn − ym‖ ≤ ‖zn − zm‖, and therefore {yn} is a Cauchy sequence. H is complete; let y be the limit. Since B(x, yn) → B(x, y) and (x, zn) → (x, z), we have B(x, y) = (x, z), and hence z ∈ Z.

Z = H: If Z ≠ H, then since Z is a closed subspace, there exists a nonzero x ∈ Z⊥. Applying the above with y = x, there exists z(x) such that B(x, x) = (x, z(x)). Since x ∈ Z⊥ and z(x) ∈ Z, b‖x‖² ≤ |B(x, x)| = |(x, z(x))| = 0. So x = 0, a contradiction.

Existence: given ℓ, there exists y such that ℓ(x) = (x, y) for all x. Since Z = H, we have y = z(w) for some w, and then ℓ(x) = (x, z(w)) = B(x, w).

Uniqueness: if there are two such, say y and y′, then B(x, y − y′) = B(x, y) − B(x, y′) = ℓ(x) − ℓ(x) = 0. Now set x = y − y′ and use 4.

6.4 Linear spans

The closed linear span of S is the intersection of all closed linear subspaces containing S.

One can check that the closed linear span of S equals the closure of the linear span of S.

Proposition 6.7 y is in the closed linear span Y of {yj} if and only if (yj, z) = 0 for all j implies (y, z) = 0.

Proof. Suppose y is in the closed linear span. If (yj, z) = 0 for all j, then (Σ ajyj, z) = 0, so (y, z) = 0 by continuity of the inner product. So let's look at the other direction.

Let Z = {z : (z, yj) = 0 for all j}. We claim Z = Y ⊥.

If z is orthogonal to all the yj, then z is orthogonal to all linear combinations, and hence to the limits of linear combinations. Thus z ∈ Y⊥.


If z ∈ Y ⊥, then z is orthogonal to all the yj, hence is in Z.

Therefore Z = Y ⊥, and so Y = (Y ⊥)⊥ = Z⊥.

To conclude the proof, suppose z ∈ Z. Then (z, yj) = 0 for all j. By hypothesis, (z, y) = 0. Then y ∈ Z⊥ = Y.

We say a set {xj} is orthonormal if (xj, xk) equals 0 when j ≠ k and equals 1 when j = k. {xj} forms an orthonormal basis if the elements are orthonormal and the closed linear span is all of H.

If Σ |aj|² < ∞, we can define Σ ajxj as the limit in the Hilbert space norm of the finite partial sums.

We need Bessel’s inequality:

Proposition 6.8 Suppose y ∈ H and aj = (y, xj). Then Σ |aj|² ≤ ‖y‖².

Proof. If G is a finite collection of the xj's, then

0 ≤ ‖y − Σ_G ajxj‖² = ‖y‖² − Σ_G āj(y, xj) − Σ_G aj(xj, y) + Σ_G |aj|² = ‖y‖² − Σ_G |aj|².

Therefore

Σ_G |aj|² ≤ ‖y‖²,

and we take the supremum over all finite collections.
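Bessel's inequality is easy to check numerically; here is a sketch in R⁴ with an orthonormal set of only two vectors, so the inequality is strict. The vectors are illustrative assumptions, not from the notes:

```python
# Numerical check of Bessel's inequality (Proposition 6.8) in R^4 with
# an orthonormal set {x1, x2} that does not span the whole space.
import math

x1 = [1.0, 0.0, 0.0, 0.0]
x2 = [0.0, 1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# confirm orthonormality
assert abs(dot(x1, x1) - 1) < 1e-12 and abs(dot(x2, x2) - 1) < 1e-12
assert abs(dot(x1, x2)) < 1e-12

y = [3.0, 1.0, -1.0, 2.0]
coeff_sq = dot(y, x1) ** 2 + dot(y, x2) ** 2
assert coeff_sq <= dot(y, y) + 1e-12   # Bessel: sum |a_j|^2 <= ||y||^2
```

Equality would hold only if y lay in the closed span of the orthonormal set, which is the content of Lemma 6.9 below.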

Lemma 6.9 If {xj} is an orthonormal set, then the closed linear span is equal to {Σ ajxj : Σ |aj|² < ∞}.

Note in this case ‖x‖² = Σ |aj|² and aj = (x, xj). To see this, because the {xj} are orthonormal, for any finite sum

‖Σ ajxj‖² = Σ ‖ajxj‖² = Σ |aj|²

as the cross terms are zero. We then take limits. To get the formula for aj,

(x, xk) = Σ_j aj(xj, xk) = ak.

Theorem 6.10 Every Hilbert space has an orthonormal basis.

Proof. If we consider all orthonormal sets, ordered by inclusion, then we use Zorn's lemma to get a maximal element. We want to show the closed linear span Y of a maximal element {xj} is all of H.

Suppose not. Suppose there exists y ∉ Y. Let aj = (y, xj). Define x = Σ ajxj (the sum converges by Bessel's inequality), and note x ∈ Y. We have

(y − x, xj) = (y, xj) − (x, xj) = aj − aj = 0.

So y − x is orthogonal to all the xj. Since y − x ≠ 0 (because y ∉ Y and x ∈ Y), (y − x)/‖y − x‖ could be added to the collection {xj} to make a strictly larger orthonormal set, a contradiction.

If H is separable, then any orthonormal basis will be countable, and we wouldn't need Zorn's lemma to obtain the orthonormal basis.

Proposition 6.11 Suppose H is a Hilbert space, and {xj}, {yj} are two orthonormal bases for H. Given x ∈ H, we have x = Σ ajxj with aj = (x, xj). The map x → y = Σ ajyj is an isometry.

7 Applications of Hilbert spaces

7.1 The Radon-Nikodym theorem

We can use Hilbert space techniques to give an alternate proof of the Radon-Nikodym theorem.

Suppose µ and ν are finite measures on a space S and we have the condition ν(A) ≤ µ(A) for all measurable A. For f ∈ L²(µ), define

ℓ(f) = ∫ f dν.

Our condition implies ∫ h dν ≤ ∫ h dµ if h ≥ 0. We use this with h = |f| and use Cauchy-Schwarz and have

|ℓ(f)| = |∫ f dν| ≤ ∫ |f| dν ≤ ∫ |f| dµ ≤ (µ(S))^{1/2} (∫ f² dµ)^{1/2} ≤ c‖f‖_{L²(µ)}.

So ℓ is a bounded linear functional on L²(µ), and there exists g such that ℓ(f) = (f, g), which translates to

∫ f dν = ∫ fg dµ.

Letting f = χ_A, we get ν(A) = ∫_A g dµ.

If ν is absolutely continuous with respect to µ, we let ρ = µ + ν and apply the above to ν and ρ and also to µ and ρ. The absolute continuity implies that dµ/dρ > 0 a.e., and we use

dν/dµ = (dν/dρ) / (dµ/dρ).
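On a finite space the Radon-Nikodym construction is transparent: the density is the ratio of the point masses. A discrete sketch (the measures below are illustrative assumptions, not from the notes):

```python
# Discrete sketch of the Radon-Nikodym theorem: on a finite space S the
# density is g(s) = nu({s}) / mu({s}) wherever mu({s}) > 0, and then
# nu(A) = sum over A of g dmu.
S = ["a", "b", "c", "d"]
mu = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}
nu = {"a": 0.2, "b": 0.20, "c": 0.15, "d": 0.05}   # nu <= mu on singletons

g = {s: nu[s] / mu[s] for s in S}   # the density dnu/dmu

def nu_from_g(A):
    return sum(g[s] * mu[s] for s in A)

for A in (["a"], ["b", "c"], S):
    assert abs(nu_from_g(A) - sum(nu[s] for s in A)) < 1e-12
```

The Hilbert-space proof above produces exactly this g in the general measurable setting, where pointwise ratios are unavailable.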

7.2 The Dirichlet problem

Let D be a bounded domain in Rⁿ, contained in B(0, K), say, where this is the ball of radius K about 0. Let (f, g) be the usual L² scalar product for real valued functions. It is easy to see that if C∞₀(D) is the set of C∞ functions that vanish on the boundary of D, then the completion of C∞₀(D) with respect to the L² norm is simply L²(D). Define

E(f, g) = ∫_D (∇f(x), ∇g(x)) dx.

Clearly E is bilinear and symmetric.

If we start with

f(x1, . . . , xn) = ∫_{−K}^{x1} (∂f/∂x1)(y, x2, . . . , xn) dy

and apply Cauchy-Schwarz, we have

|f(x1, . . . , xn)|² ≤ ∫_{−K}^{K} 1 dy · ∫_{−K}^{K} |∇f(y, x2, . . . , xn)|² dy.


Integrating over (x2, . . . , xn) ∈ [−K, K]^{n−1} we obtain

∫_D |f(x)|² dx ≤ c ∫_D |∇f(x)|² dx,

or in other words,

(f, f) ≤ c E(f, f).

If E(f, f) = 0, then (f, f) = 0, and so f = 0 (a.e., of course). This proves that E is positive. We let H¹₀ be the completion of C∞₀(D) with respect to the norm induced by E. The superscript 1 refers to the fact we are working with first derivatives, the subscript 0 to the fact that our functions vanish on the boundary. E is an example of a Dirichlet form, something we will probably look at more later.

Recall the divergence theorem:

∫_{∂D} (F, n) dσ = ∫_D div F dx,

where D is a reasonably smooth domain, ∂D is the boundary of D, n is the outward pointing unit normal, and σ is surface measure. In three dimensions this is also known as Gauss' theorem, and along with Green's theorem and Stokes' theorem it is a consequence of the fundamental theorem of calculus.

If we apply the divergence theorem to F = u∇v, then

∂F1/∂x1 = (∂u/∂x1)(∂v/∂x1) + u ∂²v/∂x1²,

and similarly for the other coordinates, so

div F = (∇u, ∇v) + u∆v,

where ∆v is the Laplacian. Also,

(F, n) = u ∂v/∂n,

where ∂v/∂n is the normal derivative of v. We then get Green's first identity:

∫_D u∆v + ∫_D (∇u, ∇v) = ∫_{∂D} u ∂v/∂n.


Our goal is to solve the equation ∆v = g in D with v = 0 on the boundary of D. This is Poisson's equation, while the Dirichlet problem more properly refers to the equation ∆v = 0 in D with v equal to some pre-specified function f on the boundary of D.

If we have a solution v and u ∈ C∞₀(D), then by Green's identity we get

∫_D u(x)g(x) dx = −∫_D (∇u(x), ∇v(x)) dx.

So one way of formulating a (weak) solution to Poisson's equation is: given g ∈ L²(D), find v ∈ H¹₀ such that

E(u, v) = −∫ ug

for all u ∈ C∞₀(D).

After all this, it is easy to find a weak solution to the Poisson equation. Suppose g ∈ H¹₀. Define ℓ(u) = −(u, g). Then

|ℓ(u)| ≤ ‖g‖ ‖u‖ ≤ c‖g‖ E(u, u)^{1/2}.

By the Riesz-Fréchet theorem, there exists v ∈ H¹₀ such that ℓ(u) = E(u, v) for all u. So

E(u, v) = ℓ(u) = −(u, g),

and v is the desired solution.
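The weak formulation can be tested in one dimension, where ∆v = v″ and the discretized problem is a tridiagonal linear system. This sketch solves v″ = g on (0, 1) with zero boundary values for g(x) = −π² sin(πx), whose exact solution is v(x) = sin(πx); the finite-difference scheme and the Thomas solver are standard tools assumed here, not part of the notes:

```python
# One-dimensional finite-difference sketch of the Poisson problem:
# (v[i-1] - 2 v[i] + v[i+1]) / h^2 = g[i], v = 0 at the endpoints.
import math

n = 200                      # interior grid points
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]
g = [-math.pi ** 2 * math.sin(math.pi * x) for x in xs]

a = [1.0] * n                # sub-diagonal
b = [-2.0] * n               # diagonal
c = [1.0] * n                # super-diagonal
d = [h * h * gi for gi in g] # right hand side

# Thomas algorithm: forward elimination, then back substitution
for i in range(1, n):
    m = a[i] / b[i - 1]
    b[i] -= m * c[i - 1]
    d[i] -= m * d[i - 1]
v = [0.0] * n
v[-1] = d[-1] / b[-1]
for i in range(n - 2, -1, -1):
    v[i] = (d[i] - c[i] * v[i + 1]) / b[i]

err = max(abs(v[i] - math.sin(math.pi * xs[i])) for i in range(n))
assert err < 1e-3            # second-order accurate on this grid
```

The discrete system is the matrix analogue of E(u, v) = −(u, g): multiplying the difference equation by a test vector and summing by parts reproduces the bilinear form.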

8 Duals of normed linear spaces

8.1 Bounded linear functionals

If X is a normed linear space, a linear functional ℓ is a linear map from X to F, the field of scalars. ℓ is continuous if |xn − x| → 0 implies ℓ(xn) → ℓ(x). ℓ is bounded if there exists c such that |ℓ(x)| ≤ c|x| for all x.

Theorem 8.1 A linear functional ` is continuous if and only if it is bounded.


Proof. If ℓ is bounded,

|ℓ(xn) − ℓ(x)| = |ℓ(xn − x)| ≤ c|xn − x| → 0,

and so it is continuous.

Suppose ℓ is continuous but not bounded. Then there exist xn such that |ℓ(xn)| > n|xn|. If

yn = xn / (√n |xn|),

then |yn − 0| = |yn| → 0, but

|ℓ(yn)| > (1/√n) · n|xn|/|xn| = √n,

which does not tend to 0 = ℓ(0), a contradiction.

The collection of all continuous linear functionals on X is called the dual of X, written X′ or X∗.

Note Nℓ = ℓ⁻¹({0}) is closed, since ℓ is continuous.

Define

|ℓ| = sup_{x≠0} |ℓ(x)| / |x|.

By linearity, this is the same as sup_{|x|=1} |ℓ(x)|.
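The definition of |ℓ| can be explored numerically: for ℓ(x) = (x, a) on Euclidean R³, every sampled ratio |ℓ(x)|/|x| stays below |a|, and x = a/|a| attains it. The vector a and the random sampling are illustrative assumptions, not from the notes:

```python
# Numerical illustration of the dual norm: for ell(x) = (x, a) on
# Euclidean R^3, |ell| = sup_{|x|=1} |ell(x)| equals |a|.
import math
import random

a = [3.0, -4.0, 12.0]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

na = math.sqrt(dot(a, a))        # = 13 for this a

random.seed(1)
best = 0.0
for _ in range(5000):
    x = [random.gauss(0, 1) for _ in range(3)]
    nx = math.sqrt(dot(x, x))
    best = max(best, abs(dot(x, a)) / nx)

assert best <= na + 1e-9         # Cauchy-Schwarz bound on |ell(x)|/|x|
x_star = [c / na for c in a]     # the maximizer x = a/|a|
assert abs(abs(dot(x_star, a)) - na) < 1e-9
```

In a Hilbert space the supremum is always attained; in a general normed space Theorem 8.5 below supplies the attaining functional on the dual side instead.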

Proposition 8.2 X∗ is a Banach space.

Proof. Subadditivity:

|ℓ + m| = sup_{|x|=1} |(ℓ + m)(x)| ≤ sup_{|x|=1} (|ℓ(x)| + |m(x)|) ≤ sup_{|x|=1} |ℓ(x)| + sup_{|x|=1} |m(x)| = |ℓ| + |m|.

Completeness: Suppose |ℓn − ℓm| → 0 as n, m → ∞. For each x,

|ℓn(x) − ℓm(x)| ≤ |ℓn − ℓm| |x| → 0.

Since R and C are complete, lim ℓn(x) exists for each x; call the limit ℓ(x).

Given ε, choose N such that |ℓn − ℓm| < ε if n, m ≥ N. So |ℓn(x) − ℓm(x)| ≤ ε|x|. Let m → ∞, so |ℓn(x) − ℓ(x)| ≤ ε|x| if n ≥ N. This means |ℓn − ℓ| ≤ ε if n ≥ N, or ℓn → ℓ.

8.2 Extensions of bounded linear functionals

Proposition 8.3 Let X be a normed linear space, Y a subspace, ℓ a linear functional on Y with |ℓ(y)| ≤ c|y| for all y ∈ Y. Then ℓ can be extended to a bounded linear functional on X with the same bound on X as on Y.

Proof. This is the Hahn-Banach theorem with p(x) = c|x|.

y1, . . . , yN are said to be linearly independent if Σ_{i=1}^N ci yi = 0 implies all the ci are zero.

Theorem 8.4 Suppose y1, . . . , yN are linearly independent and a1, . . . , aN are scalars. Then there exists a bounded linear functional ℓ such that ℓ(yj) = aj for each j.

Proof. Let Y be the span of y1, . . . , yN. If y ∈ Y, then y can be written as Σ bjyj in only one way, for if Σ b′jyj is another way, then

Σ (bj − b′j)yj = y − y = 0,

and so bj = b′j for all j. Define

ℓ(Σ bjyj) = Σ ajbj.

Since Y is finite dimensional, ℓ is bounded on Y. Now use the preceding proposition to extend ℓ to all of X.

Theorem 8.5 If X is a normed linear space, then

|y| = max_{|ℓ|=1} |ℓ(y)|.


Proof. |ℓ(y)| ≤ |ℓ| |y|, so the maximum on the right hand side is less than or equal to |y|.

If y ∈ X is nonzero, let Y = {ay} and define ℓ(ay) = a|y|. Then the norm of ℓ on Y is 1. Now extend ℓ to all of X so as to have norm 1.

Theorem 8.6 Let X be a normed linear space over C and Y a linear subspace. Let

m(z) = inf_{y∈Y} |z − y|,   M(z) = max_{|ℓ|≤1, ℓ=0 on Y} |ℓ(z)|.

Then m(z) = M(z).

Proof. If y ∈ Y, ℓ = 0 on Y, and |ℓ| ≤ 1, then

|ℓ(z)| = |ℓ(z − y)| ≤ |z − y|.

So

|ℓ(z)| ≤ inf_{y∈Y} |z − y| = m(z),

and hence M(z) ≤ m(z).

Let Y0 = {y + az : y ∈ Y, a ∈ C}. Define ℓ0 on Y0 by ℓ0(y + az) = a m(z). By the definition of m(z), ℓ0 is bounded on Y0 by 1. (To see this, for a ≠ 0 we write

|ℓ0(y + az)| = |a| m(z) ≤ |a| |z − (−y/a)| = |az + y|.)

Extend ℓ0 to all of X with the same bound. Then

ℓ0(z) = ℓ0(0 + 1 · z) = m(z),

so m(z) ≤ M(z).

We write Y⊥ = {ℓ : ℓ = 0 on Y} for the annihilator of Y.

Theorem 8.7 (Spanning criterion) Let Y be the closed linear span of {yj}. Then z ∈ Y if and only if the following holds: every bounded linear functional ℓ with ℓ(yj) = 0 for all j also satisfies ℓ(z) = 0.


Proof. If ℓ(yj) = 0 for all j, then ℓ(y) = 0 for all y of the form Σ ajyj, and by continuity of ℓ, for all y ∈ Y.

Suppose z ∉ Y. Then

inf_{y∈Y} |z − y| = d > 0.

Let Z = {y + az : y ∈ Y}. Define ℓ0 on Z by ℓ0(y + az) = a. For a ≠ 0,

|y + az| = |a| |z − (−y/a)| ≥ d|a|.

Therefore on Z, ℓ0 is bounded by d⁻¹. Extend ℓ0 to all of X. But then ℓ0(yj) = 0 for all j while ℓ0(z) = 1.

8.3 Reflexive spaces

If x ∈ X, define the linear functional Lx on X∗ by

Lx(ℓ) = ℓ(x).

The norm of Lx is |x|, by Theorem 8.5. So we can isometrically embed X into X∗∗.

A Banach space is reflexive if X∗∗ = X.

Theorem 8.8 Hilbert spaces are reflexive.

Proof. Recall X∗ = X, and the result follows from this. To see X∗ = X, if ℓ is a linear functional, there exists y such that ℓ(x) = (x, y) for all x. If we show |ℓ| = ‖y‖, this gives an isometry between X and X∗. By Cauchy-Schwarz, |ℓ(x)| = |(x, y)| ≤ ‖x‖ ‖y‖, so |ℓ| ≤ ‖y‖. Taking x = y, ℓ(y) = ‖y‖², hence |ℓ| ≥ ‖y‖.

If 1 < p, q < ∞ and 1/p + 1/q = 1, then the dual of Lp is isomorphic to Lq. Hence the Lp spaces are reflexive.

To prove the dual of Lp is Lq, two facts from real analysis are needed. See Folland, p. 190 and preceding.


1) Holder’s inequality: ∫|fg| ≤ ‖f‖p‖g‖q.

2)

‖g‖q = sup{∣∣∣ ∫ fg

∣∣∣ : ‖f‖p = 1, f simple}.
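Fact 1) is easy to check numerically for sequences; a sketch with p = 3 and q = 3/2 (so 1/p + 1/q = 1), using illustrative vectors f and g not taken from the notes:

```python
# Numerical check of Holder's inequality for finite sequences with
# conjugate exponents p = 3, q = 3/2.
p, q = 3.0, 1.5
f = [1.0, -2.0, 0.5, 3.0]
g = [0.2, 1.0, -4.0, 0.7]

lhs = sum(abs(a * b) for a, b in zip(f, g))
norm_f = sum(abs(a) ** p for a in f) ** (1 / p)
norm_g = sum(abs(b) ** q for b in g) ** (1 / q)
assert lhs <= norm_f * norm_g + 1e-12
```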

Theorem 8.9 (Lp)∗ = Lq.

Proof. (Sketch) If we define ℓ(f) = ∫ fg, then Hölder's inequality shows that ℓ is a bounded linear functional.

Now suppose ℓ is a bounded linear functional on Lp. The heart of the matter is when µ, the measure, is finite, so let's suppose that. Define ν(E) = ℓ(χ_E). The linearity of ℓ shows ν is finitely additive. The fact that ℓ is a continuous linear functional allows one to prove countable additivity. If µ(E) = 0, then χ_E = 0 (a.e.), so ν(E) = ℓ(0) = 0. Therefore ν is absolutely continuous with respect to µ. Note ν is either a signed measure or is complex valued.

By the Radon-Nikodym theorem there exists g ∈ L¹ such that

ℓ(χ_E) = ν(E) = ∫_E g.

By linearity, if f is simple,

ℓ(f) = ∫ fg.

Since |ℓ(f)| ≤ c‖f‖p, because ℓ is a bounded linear functional on Lp, taking the supremum over simple f's with Lp norm 1, we get

‖g‖q ≤ |ℓ|.

We can then use the continuity of ℓ and Hölder's inequality to obtain

ℓ(f) = ∫ fg

for all f ∈ Lp.


Proposition 8.10 If X is a normed linear space over C and X∗ is separable, then X is separable.

Proof. Since X∗ is separable, there is a countable dense subset {ℓn}. Recall |ℓn| = sup_{|x|=1} |ℓn(x)|. So for each n there exists xn ∈ X such that |xn| = 1 and |ℓn(xn)| > |ℓn|/2.

We claim the linear span of {xn} is dense in X. To prove this, we start by showing that if ℓ is a linear functional on X that vanishes on {xn}, then ℓ vanishes identically.

Suppose not, so that there exists ℓ such that ℓ(xn) = 0 for all n but ℓ ≠ 0. We can normalize so that |ℓ| = 1. Since the ℓn are dense in X∗, there exists ℓn such that |ℓ − ℓn| < 1/3. Therefore |ℓn| > 2/3. Then

1/3 > |(ℓ − ℓn)(xn)| = |ℓn(xn)| > |ℓn|/2 > (1/2) · (2/3) = 1/3,

a contradiction.

Therefore by the spanning criterion, the closed linear span of the xn is all of X. Then the set of finite linear combinations of the xn whose coefficients have rational real and imaginary parts is also dense in X, and is countable.

By the Riesz representation theorem from real analysis, the dual of X = C([0, 1]) is the set of finite signed measures on [0, 1]. X is separable, but X∗ is not, since |δx − δy| = 2 whenever x ≠ y. It follows that X is not reflexive: if it were, we would have X∗∗ separable, but X∗ not, contradicting the previous proposition.

Proposition 8.11 Suppose X is a reflexive normed linear space and Y is a closed subspace. Then Y is reflexive.

Proof. If ℓ is a linear functional on X, let ℓ0 be the restriction of ℓ to Y. By the Hahn-Banach theorem every bounded linear functional on Y can be extended to X. So the restriction map R : ℓ → ℓ0 maps X∗ onto Y∗.

If ϕ ∈ Y∗∗, define Rϕ by

Rϕ(ℓ) = ϕ(ℓ0),   ℓ ∈ X∗.

Here R maps Y∗∗ into X∗∗.

Since X is reflexive, there exists z ∈ X such that Rϕ(ℓ) = ℓ(z) for all linear functionals ℓ. Therefore ℓ(z) = ϕ(ℓ0).

We argue that z ∈ Y. If ℓ ∈ Y⊥, then ℓ0 = 0. So ℓ(z) = ϕ(ℓ0) = ϕ(0) = 0. By the spanning criterion, z is in the closure of Y, and since Y is closed, it is in Y.

Therefore

ℓ0(z) = ϕ(ℓ0).

Every functional in Y∗ is an ℓ0 for some ℓ, so every ϕ ∈ Y∗∗ can be identified with some z ∈ Y.

8.4 The support function

If M is a subset of a linear space, define M̂ to be the closure of the convex hull of M.

If M is a bounded subset of a normed linear space X over R, define the map SM from X∗ into R by

SM(ℓ) = sup_{y∈M} ℓ(y).

SM is called the support function.

If M = {x0}, then SM(ℓ) = ℓ(x0). If M = BR(0), then SM(ℓ) = R|ℓ|. Using the fact that S_{M+N}(ℓ) = SM(ℓ) + SN(ℓ), if M = BR(x0), then SM(ℓ) = ℓ(x0) + R|ℓ|.

Proposition 8.12 If M is a bounded subset of X, then z ∈ M̂ if and only if ℓ(z) ≤ SM(ℓ) for all ℓ ∈ X∗.

Proof. One way is easy. If ℓ ∈ X∗ and z ∈ M̂, then ℓ(z) ≤ S_{M̂}(ℓ). It is easy to check that S_{M̂}(ℓ) = SM(ℓ).

Now suppose z ∉ M̂, but ℓ(z) ≤ SM(ℓ) for all ℓ. M̂ is closed, so there exists R such that BR(z) ∩ M̂ = ∅. By the hyperplane separation theorem, there exist ℓ0 ≠ 0 and c such that

ℓ0(u) ≤ c ≤ ℓ0(v),   u ∈ M̂, v ∈ BR(z).


ℓ0 is a bounded linear functional.

If v ∈ BR(z), then v = z + Rx with |x| < 1, so

c ≤ ℓ0(z) + Rℓ0(x).

Also inf_{|x|<1} ℓ0(x) = −|ℓ0|. So c ≤ ℓ0(z) − R|ℓ0|. We have SM(ℓ0) ≤ c, so

SM(ℓ0) + R|ℓ0| ≤ ℓ0(z),

and since R|ℓ0| > 0, this is a contradiction to ℓ0(z) ≤ SM(ℓ0).

Proposition 8.13 Suppose K is a closed and convex subset of X and z ∉ K. Then

inf_{u∈K} |z − u| = sup_{|ℓ|=1} [ℓ(z) − SK(ℓ)].

Proof. SK(ℓ) ≥ ℓ(u) for all ℓ ∈ X∗ and all u ∈ K. If |ℓ| = 1, then

SK(ℓ) ≥ ℓ(u) = ℓ(z) + ℓ(u − z) ≥ ℓ(z) − |u − z|,

or |u − z| ≥ ℓ(z) − SK(ℓ). So the left hand side is larger than the right hand side.

Let 0 < R < inf_{u∈K} |z − u|. K + BR has positive distance from z. So

S_{K+BR}(ℓ0) < ℓ0(z)

for some ℓ0 ∈ X∗, and we can take |ℓ0| = 1. S_{K+BR}(ℓ0) = SK(ℓ0) + R|ℓ0|, so

R < ℓ0(z) − SK(ℓ0).

Therefore the right hand side is larger than R. This is true for any such R, so the right side is at least as large as the left side.


9 Weak convergence

Let X be a normed linear space. We say xn converges to x weakly, written w-lim xn = x, if ℓ(xn) → ℓ(x) for all ℓ ∈ X∗. xn converges to x strongly, written s-lim xn = x, if |xn − x| → 0.

Strong convergence implies weak convergence because

|ℓ(xn) − ℓ(x)| = |ℓ(xn − x)| ≤ |ℓ| |xn − x| → 0.

As an example where we have weak convergence but not strong convergence, let X = ℓ² and let en be the element whose nth coordinate is 1 and all other coordinates are 0. Since |en| = 1, en does not converge strongly to 0. But it does converge weakly to 0. To see this, if ℓ is any bounded linear functional on X, then ℓ is of the form ℓ(x) = (x, y) for some y ∈ X, which means y = (b1, b2, . . .) with Σ_j |bj|² < ∞. In particular, bj → 0. Then ℓ(en) = bn → 0 = ℓ(0).

This example extends to any Hilbert space. If {xn} is an orthonormal sequence in the space, ℓ(xn) = (xn, y) for some y. By Bessel's inequality, Σ |(xn, y)|² ≤ ‖y‖², so (xn, y) → 0.
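The ℓ² example can be checked in coordinates. With y = (1, 1/2, 1/3, . . .) (an illustrative choice with Σ|bj|² < ∞, not from the notes), ℓ(en) = 1/n → 0 while ‖en‖ = 1 for every n:

```python
# Weak but not strong convergence of e_n in l^2: the norm of e_n is
# always 1, yet (e_n, y) -> 0 for y with square-summable coordinates.
def b(j):                      # coordinates of y; sum of squares is finite
    return 1.0 / j

# ||e_n|| = 1 for all n, so e_n does not converge strongly to 0 ...
norms = [1.0 for n in range(1, 51)]
assert all(abs(v - 1.0) < 1e-12 for v in norms)

# ... but ell(e_n) = (e_n, y) = b_n tends to 0
values = [b(n) for n in range(1, 51)]
assert values[-1] < 0.03
assert all(values[i + 1] < values[i] for i in range(len(values) - 1))
```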

9.1 Uniform boundedness

Theorem 9.1 Let X be a Banach space and {ℓν} a collection of bounded linear functionals such that |ℓν(x)| ≤ M(x) for all ν and each x. Then there exists c such that |ℓν| ≤ c.

In other words, if the ℓν are bounded pointwise, they are bounded uniformly.

The proof relies on the Baire category theorem, which we now recall. If A is a set, we use Ā for the closure of A and Aᵒ for the interior of A. A set A is dense in X if Ā = X, and A is nowhere dense if (Ā)ᵒ = ∅.

The Baire category theorem is the following.

Theorem 9.2 Let X be a complete metric space.


(a) If Gn are open sets with Ḡn = X for each n, then ∩nGn is dense in X.

(b) X cannot be written as the countable union of nowhere dense sets.

We now prove the uniform boundedness theorem. (A generalization to bounded linear maps is called the Banach-Steinhaus theorem.)

Proof. Let M(x) = supν |ℓν(x)|. Let Gn = {x : M(x) > n}. Since x ∈ Gn if and only if for some ν we have |ℓν(x)| > n, and x → |ℓν(x)| is a continuous function by the triangle inequality, we conclude

Gn = ∪ν {x : |ℓν(x)| > n}

is the union of open sets, so is open.

Suppose some GN is not dense in X. Then there exist x0 and r such that B(x0, r) ∩ GN = ∅; that is, if |x| ≤ r, then |ℓν(x0 + x)| ≤ N for all ν. Since x = (x0 + x) − x0,

|ℓν(x)| ≤ |ℓν(x0 + x)| + |ℓν(x0)| ≤ 2N

if |x| ≤ r, and we then have supν |ℓν| ≤ c with c = 2N/r.

The other possibility, by Baire's theorem, is that every Gn is dense in X and ∩nGn is a dense subset of X. But M(x) = ∞ for every x ∈ ∩nGn, contradicting the hypothesis that M(x) < ∞ for each x.

Corollary 9.3 Let X be a normed linear space, {xν} a subset such that for all ℓ ∈ X∗ we have

|ℓ(xν)| ≤ M(ℓ) for all ν.

Then there exists c such that |xν| ≤ c for all ν.

Proof. Write Lν(ℓ) = ℓ(xν). So each xν acts as a bounded linear functional on X∗; now apply Theorem 9.1 and use |xν| = |Lν|.

Corollary 9.4 Let X be a normed linear space and suppose xn converges weakly to x. Then |x| ≤ lim inf |xn|.

Proof. There exists ℓ such that |ℓ| = 1 and |ℓ(x)| = |x|. Then |ℓ(x)| = lim |ℓ(xn)| and |ℓ(xn)| ≤ |ℓ| |xn| = |xn|.


9.2 Weak sequential compactness

The topology that goes along with weak convergence is not necessarily derived from a metric. Thus the topology cannot be characterized by sequential convergence.

We say a subset C of a normed linear space X is weakly sequentially compact if any sequence of points in C has a subsequence converging weakly to a point of C.

Theorem 9.5 Let X be a reflexive Banach space. Then the closed unit ball is weakly sequentially compact.

Proof. Take {yn} with |yn| ≤ 1. Let Y be the closed linear span of the yn's. By Proposition 8.11, Y is reflexive also. Since Y∗∗ = Y is separable, then Y∗ is separable. Let {mj} be a countable dense subset of Y∗. By a diagonalization procedure, there exists a subsequence zn of the yn such that mj(zn) converges for all j. Since the mj are dense in Y∗, we have m(zn) converges for all m ∈ Y∗. Call the limit L(m).

We can check that L(m) is a linear functional on Y∗. We have

|L(m)| ≤ lim sup |m(zn)| ≤ |m| sup_n |zn| ≤ |m|,

so the linear functional L on Y∗ has norm bounded by 1. L ∈ Y∗∗. So there exists y ∈ Y such that L(m) = m(y). For all m ∈ Y∗, m(zn) → m(y), and hence zn converges weakly to y. Since Y ⊂ X, each m ∈ X∗ restricts to an element of Y∗, hence we have the convergence for all m in X∗.

9.3 Weak∗ convergence

We say un ∈ X∗ is weak∗ convergent to u if lim un(x) = u(x) for all x ∈ X.

If X is reflexive, then weak∗ convergence is the same as weak convergence.

Weak convergence in probability theory can be identified as weak∗ con-vergence in functional analysis.


As an example, if S is a compact Hausdorff space and X = C(S), then X∗ is the collection of finite signed measures. Saying a sequence of measures νn converges in the weak∗ sense means that ∫ f dνn converges for each continuous function f.

A set is weak∗ sequentially compact if every sequence in the set has a subsequence which converges in the weak∗ sense to an element of the set.

Theorem 9.6 If X is a separable Banach space, then the closed unit ball in X∗ is weak∗ sequentially compact.

Proof. Let un ∈ X∗ with norms bounded by 1. Let {xk} be a countable dense subset of X. By diagonalization there exists a subsequence vn of the un such that vn(xk) converges for each k. Since |vn| ≤ 1 for all n, then vn(x) converges for all x. Call the limit v(x). This is a linear functional with norm bounded by 1.

10 Applications of weak convergence

10.1 Approximating the δ function

Let kn be a sequence of integrable functions on the interval [−1, 1]. They approximate the δ function (or are an approximation to the identity) if

∫_{−1}^{1} f(t)kn(t) dt → f(0)   (10.1)

as n → ∞ for all f continuous on [−1, 1].

Theorem 10.1 kn approximates the δ function on [−1, 1] if and only if the following three properties hold.

(1) ∫_{−1}^{1} kn(t) dt → 1.

(2) If g is C∞ and 0 in a neighborhood of 0, then

∫_{−1}^{1} g(t)kn(t) dt → 0

as n → ∞.

(3) There exists c such that ∫_{−1}^{1} |kn(t)| dt ≤ c for all n.

Proof. If (1)–(3) hold, write f = (f − f(0)) + f(0), and we may suppose without loss of generality that f(0) = 0. Choose g ∈ C∞ such that g is 0 in a neighborhood of 0 and |g − f| < ε. We have

|∫_{−1}^{1} (f − g)kn| ≤ ε ∫ |kn| ≤ cε

and

∫ g kn → 0.

So

lim sup |∫ f kn| ≤ cε.

Since ε is arbitrary, this shows (10.1).

If (10.1) holds, then (1) holds by taking f identically 1, and (2) holds by taking f equal to g. So we must show (3). If X is the set C of continuous functions on [−1, 1], then X∗ is the collection of finite signed measures. Let mn(dt) = kn(t) dt and m0(dt) = δ0(dt). Then (10.1) says that mn(f) → m0(f) for all f ∈ C, or mn converges to m0 in the sense of weak∗ convergence. For each f, lim sup |mn(f)| < ∞, so |mn(f)| ≤ M(f) for all n, and by the uniform boundedness principle, |mn| ≤ c. Note |mn| is the total variation of mn, which is ∫_{−1}^{1} |kn(t)| dt.
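A concrete family satisfying (1)-(3) is kn = (n/2)·1_{[−1/n, 1/n]}. The sketch below checks the three properties and (10.1) numerically; the test function f and the midpoint-rule quadrature are illustrative assumptions, not from the notes:

```python
# Numerical check that k_n = (n/2) 1_{[-1/n, 1/n]} is an approximation
# to the delta function: integral of k_n is 1, integral of |k_n| is
# bounded, and integral of f k_n tends to f(0).
import math

def integrate(func, lo, hi, steps=20000):
    # midpoint rule on [lo, hi]
    h = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * h) for i in range(steps)) * h

def k(n, t):
    return n / 2.0 if abs(t) <= 1.0 / n else 0.0

def f(t):                        # continuous on [-1, 1], f(0) = 1
    return math.cos(3 * t) + t

for n in (10, 100):
    mass = integrate(lambda t: k(n, t), -1, 1)
    assert abs(mass - 1.0) < 1e-2          # property (1), and (3) with c = 1
    val = integrate(lambda t: f(t) * k(n, t), -1, 1)
    assert abs(val - f(0)) < 0.05          # tends to f(0) = 1
```

Since kn ≥ 0 here, the total variation equals the mass and (3) holds with c = 1; the Dirichlet kernels in the next section change sign, and it is exactly (3) that fails for them.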

10.2 Divergence of Fourier series

From the approximation of the δ-function, we can show that there exists a continuous function f whose Fourier series diverges at 0.

We look at the set of continuous functions on S¹, the unit circle. We say f(θ) has Fourier series Σ_{n=−∞}^{∞} an e^{inθ} with

an = (1/2π) ∫_{−π}^{π} f(θ)e^{−inθ} dθ.


The Fourier series converges at 0 if

lim_{N→∞} Σ_{n=−N}^{N} an = f(0).

Now

Σ_{n=−N}^{N} an = (1/2π) ∫_{−π}^{π} f(θ)kN(θ) dθ,

where

kN(θ) = Σ_{n=−N}^{N} e^{−inθ} = sin((N + 1/2)θ) / sin(θ/2).

So the convergence of the Fourier series at 0 is equivalent to kN being an approximation to the δ function. And if (3) fails, then Σ_{n=−N}^{N} an does not converge for some f.

Since |sin x| ≤ |x|, we have

|1 / sin(x/2)| ≥ 2/|x|,

and therefore

∫_{−π}^{π} |kN(θ)| dθ ≥ 2 ∫_{−π}^{π} |sin((N + 1/2)θ)| dθ/|θ| = 4 ∫_0^{(N+1/2)π} |sin x| dx/x ≥ c log N.
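The logarithmic growth of ∫|kN| can be observed numerically; the midpoint-rule quadrature and the comparison constant c = 1 are illustrative assumptions, not from the notes:

```python
# Numerical check that the L^1 norms of the Dirichlet kernels k_N grow
# (like log N), so property (3) of Theorem 10.1 fails.
import math

def k(N, theta):
    # Dirichlet kernel sin((N + 1/2) theta) / sin(theta / 2)
    s = math.sin(theta / 2.0)
    if abs(s) < 1e-12:
        return 2 * N + 1.0
    return math.sin((N + 0.5) * theta) / s

def l1_norm(N, steps=200000):
    h = 2 * math.pi / steps
    return sum(abs(k(N, -math.pi + (i + 0.5) * h)) for i in range(steps)) * h

n5, n50 = l1_norm(5), l1_norm(50)
assert n50 > n5                 # the norms grow with N
assert n50 > math.log(50)       # consistent with >= c log N for c = 1
```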

10.3 The Herglotz theorem

Theorem 10.2 (Herglotz) Let f be analytic in {|z| < 1} and h = Re f ≥ 0. Then

f(z) = ∫_0^{2π} (1 + ze^{−iθ})/(1 − ze^{−iθ}) m(dθ) + ic

for some finite positive measure m and some real constant c. Conversely, if f satisfies the above equation, f is analytic with nonnegative real part. The measure m is uniquely determined by f.


Proof. For R < 1 we have the Poisson integral formula:

f(z) = (1/2π) ∫_0^{2π} (R + ze^{−iθ})/(R − ze^{−iθ}) h(Re^{iθ}) dθ + ic.

When z = 0,

h(0) = (1/2π) ∫_0^{2π} h(Re^{iθ}) dθ.

Take a sequence Rn ↑ 1. Define ℓn on C(S¹) by

ℓn(u) = (1/2π) ∫_0^{2π} h(Rn e^{iθ})u(θ) dθ.

Since h ≥ 0,

|ℓn(u)| ≤ (1/2π) |u| ∫_0^{2π} h(Rn e^{iθ}) dθ = |u| h(0).

C(S¹) is separable, so by Helly's theorem, there exists a subsequence (also called ℓn) which is weak∗ convergent, say to ℓ. If |un − u| → 0 and ℓn(u) → ℓ(u), then

|ℓn(un) − ℓn(u)| ≤ |ℓn| |un − u| → 0,

so ℓn(un) → ℓ(u).

Let

un = (Rn + ze^{−iθ})/(Rn − ze^{−iθ}),   u = (1 + ze^{−iθ})/(1 − ze^{−iθ}).

Fix z. Then |un − u| → 0, and since f(z) = ℓn(un) + ic for each n, we get f(z) = ℓ(u) + ic. The ℓn are positive linear functionals, so ℓ is too. By Riesz representation, there exists a positive measure m such that

ℓ(u) = ∫_0^{2π} u(θ) m(dθ).

It is routine to check that if f(z) has the representation in the theorem, then f is analytic.

If we take real parts,

h(z) = ∫_0^{2π} (1 − r²)/(1 − 2r cos(φ − θ) + r²) m(dθ),

where z = re^{iφ}. Multiply by u(φ)/2π and integrate to get

(1/2π) ∫_0^{2π} h(re^{iφ})u(φ) dφ = ∫_0^{2π} [ (1/2π) ∫ (1 − r²)/(1 − 2r cos(φ − θ) + r²) u(φ) dφ ] m(dθ).

Call the expression inside the brackets ur(θ). Letting r → 1, ur → u uniformly, so

lim_{r→1} (1/2π) ∫ h(re^{iφ})u(φ) dφ = ∫_0^{2π} u(θ) m(dθ).

The left hand side does not depend on m. If m1 and m2 are two such measures,

∫_0^{2π} u(θ) m1(dθ) = ∫_0^{2π} u(θ) m2(dθ)

for all continuous u. Therefore m1 = m2.

11 Weak and weak∗ topologies

The weak topology is the coarsest topology (i.e., fewest sets) in which allbounded linear functionals are continuous.

Bounded linear functionals are continuous in the usual norm topology(also called the strong topology), so the weak topology is coarser than thestrong topology.

We argue that every open set in the weak topology is the union of finite intersections of sets of the form {x : a < ℓ(x) < b}. Let

S = { {x : ai < ℓi(x) < bi, i = 1, . . . , n} : n ∈ N, ai < bi, ℓi ∈ X∗ }.

We claim S is a basis for the weak topology, that is, every open set is a union of elements of S. Note that each element of S is in the weak topology because ℓi⁻¹((ai, bi)) must be in the weak topology, so ∩_{i=1}^n ℓi⁻¹((ai, bi)) is an element of the weak topology. Note also that unions of elements of S form a topology and each ℓ is continuous with respect to this topology. Therefore S is a basis for the weak topology.

Finite intersections of sets in S are unbounded if X is infinite dimensional, so every nonempty open set in the weak topology when X is infinite dimensional is unbounded. The open unit ball {x : |x| < 1} is thus a set that is open in the strong topology but not the weak topology.

Given a set S, the weak sequential closure of S is

{x : there exist xn ∈ S with xn → x weakly}.

If the weak sequential closure of S is equal to S, then S is said to be weakly sequentially closed.

Proposition 11.1 (1) The weak sequential closure of S is contained in theclosure of S with respect to the weak topology.

(2) If X is infinite dimensional, there are sets that are weakly sequentially closed but are not closed in the weak topology.

Proof. 1) If x is not in the weak closure of S, there exists an open set A in the weak topology such that x ∈ A and A ∩ S = ∅. We can choose A to be a finite intersection of sets of the form {w : a < ℓ(w) < b}, so

A = ∩_{i=1}^n {w : ai < ℓi(w) < bi}.

If x is in the weak sequential closure of S, there exist xk ∈ S such that xk → x weakly. For each i, ℓi(xk) → ℓi(x), so for k large enough, ℓi(xk) ∈ (ai, bi). Taking k large enough, ℓi(xk) ∈ (ai, bi) for all i ≤ n, hence xk ∈ A, contradicting A ∩ S = ∅.

(2) Let X_k be finite dimensional subspaces of X with dim X_k = k. Let S_k = {x_{kj}} be a finite 1/k-net for the sphere {x ∈ X_k : |x| = k}, and let S = S_2 ∪ S_3 ∪ ⋯.

0 is in the closure of S in the weak topology: if A is open and contains 0, then A contains a set of the form {x : |ℓ_i(x)| < ε, i = 1, …, n} with |ℓ_i| = 1. If k > n, then the dimension of X_k is greater than the number of linear functionals, so there exists x_k ∈ X_k such that |x_k| = k and ℓ_i(x_k) = 0 for all i ≤ n. There exists x_{kj} ∈ S_k within 1/k of x_k, so

|ℓ_i(x_{kj})| = |ℓ_i(x_{kj} − x_k)| ≤ |x_{kj} − x_k| < 1/k.

So for k > max(n, 1/ε), x_{kj} ∈ A.

S contains no nontrivial weakly convergent sequences: in any ball of radius R, S has only finitely many points. Now use the uniform boundedness principle: if x_n w−→ x, then |x_n| is bounded; see Corollary 9.3.

Proposition 11.2 Suppose K is convex and contained in a Banach space X. If K is closed in the strong topology, then K is closed in the weak topology.

Proof. Suppose z ∉ K. We show z is not in the weak closure of K. K is closed in the strong topology, so there exists a ball B_R(z) disjoint from K. By the hyperplane separation theorem, there exist a nonzero linear functional ℓ and a constant c such that

ℓ(u) ≤ c ≤ ℓ(v),   u ∈ K, v ∈ B_R(z).

If w ∈ B_R(0) and v = −w + z, then

ℓ(−w) = ℓ(v − z) = ℓ(v) − ℓ(z) ≥ c − ℓ(z).

So ℓ(w) ≤ ℓ(z) − c. It follows that |ℓ(w)| is bounded above for w ∈ B_R(0), which implies ℓ is bounded.

If v ∈ B_R(z), then v = z + x with |x| < R. Then

c ≤ inf_{v ∈ B_R(z)} ℓ(v) = ℓ(z) + inf_{|x|<R} ℓ(x) = ℓ(z) − R|ℓ|.

Therefore ℓ(z) > c. (Before we only had ℓ(z) ≥ c.) So A = {x : ℓ(x) > c} contains z but no point of K. A is open in the weak topology, so z is not in the weak closure of K.

11.1 The Alaoglu theorem

Consider X∗, where X is a Banach space. For x ∈ X, define L_x : X∗ → R by L_x(ℓ) = ℓ(x). The weak∗ topology is the coarsest topology on X∗ with respect to which all the L_x with x ∈ X are continuous.

Theorem 11.3 (Alaoglu) The closed unit ball B in X∗ is compact in the weak∗ topology.

There is a connection with the Prohorov theorem of probability theory. Let X = C(S), where S is a compact Hausdorff space. If μ_n is a sequence of probability measures, then the μ_n are elements of the closed unit ball in X∗. If S is in addition metrizable, then C(S) is separable, the weak∗ topology on the unit ball is metrizable, and the Alaoglu theorem implies there must be a subsequence which converges in the weak∗ sense.
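A concrete sketch of this picture (an illustration, not from the notes): take S = [0, 1] and the probability measures μ_n = (1/n) Σ_{k=1}^n δ_{k/n}, which lie in the unit ball of C(S)∗. They converge weak∗ to Lebesgue measure, i.e. ∫ f dμ_n → ∫_0^1 f(x) dx for every continuous f.

```python
import numpy as np

# mu_n = (1/n) * (sum of point masses at k/n), a probability measure on [0,1].
# Weak* convergence means integral of f d(mu_n) -> integral_0^1 f(x) dx for
# every continuous f; we test one continuous function.
f = lambda x: x ** 2            # integral over [0,1] is exactly 1/3

for n in [10, 100, 10_000]:
    mu_n_f = f(np.arange(1, n + 1) / n).mean()
    print(n, mu_n_f)
# The printed values approach 1/3, the integral against the weak* limit.
```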

Proof. If u ∈ B, then |u(x)| ≤ |x| for every x. Let

P = ∏_{x∈X} I_x,

where I_x = [−|x|, |x|]. Map B into P by setting φ(u) = {u(x)}_{x∈X}, the point of P whose xth coordinate is u(x). The topology on P that we use is the coarsest one where all the coordinate projections p → p_x are continuous. As in the proof that S is a basis for the weak topology, we can check that finite intersections of sets of the form {p : a < p_x < b} are a basis for this topology, which is the product topology. The same argument shows that the corresponding sets form a basis for the weak∗ topology on B. By Tychonov's theorem, P is compact. So it suffices to show that φ(B) is closed in P.

Let p be in the closure of φ(B). We will show p = φ(u) for some u ∈ B. If p_x denotes the xth component of p, we want u(x) = p_x, so define u on X by u(x) = p_x.

If q ∈ φ(B), then q_x + q_y = q_{x+y} and q_{ax} = a q_x, since q = φ(v) for some v ∈ X∗. Because p is in the closure of φ(B) in the product topology, for each x, y, and scalar a there exist q^n ∈ φ(B) such that (q^n)_x → p_x, (q^n)_y → p_y, (q^n)_{x+y} → p_{x+y}, and (q^n)_{ax} → p_{ax}. Passing to the limit in the identities above, p_{x+y} = p_x + p_y and p_{ax} = a p_x, so u is linear. Since p_x ∈ I_x, we have |u(x)| ≤ |x|, so the norm of u is at most 1. Therefore p = φ(u) ∈ φ(B).

Corollary 11.4 Suppose S ⊂ X∗ is closed in the weak∗ topology. Then S is weak∗ compact if and only if it is bounded in norm.

Proof. If S is bounded in norm, then S is contained in a closed ball about the origin. The closed ball is weak∗ compact, and S is a weak∗ closed subset of it, hence compact.

Now suppose S is weak∗ compact. For each x, the image of S under the map u → u(x) is compact, since continuous functions map compact sets to compact sets. Therefore, for each x, {u(x) : u ∈ S} is a bounded set. By the uniform boundedness principle, the collection S is bounded in norm.

12 Locally convex topological spaces

We look at topologies other than those defined in terms of linear functionals.

A locally convex topological (LCT) linear space is a linear space over the reals with a Hausdorff topology satisfying

1) (x, y) → x + y is a continuous mapping from X × X to X.

2) (k, x) → kx is a continuous mapping from F × X to X.

3) Every open set containing the origin contains a convex open set containing the origin.

It is an exercise to show that the weak and weak∗ topologies are locally convex (i.e., satisfy 3)).

Proposition 12.1 In a LCT linear space,

(1) if T is open, so are T − x, kT for k ≠ 0, and −T.

(2) Every point of an open set T is interior to T .

Proof. (1) T − x is the inverse image of T under the continuous map y → x + y, hence open. The statements for kT (k ≠ 0) and −T are similar.

(2) After a translation we may suppose the point in question is 0 ∈ T. Fix x ∈ X. The map k → kx is continuous, so {k : kx ∈ T} is open, and it contains 0 since 0 ∈ T. Therefore there exists an interval about 0 such that kx ∈ T whenever k is in this interval. This is true for every x, and therefore 0 is an interior point.

12.1 Separation of points

In a LCT space, we can talk about continuous linear functionals, but not bounded linear functionals.

Proposition 12.2 Continuous linear functionals on a LCT linear space X separate points: if y ≠ z, there exists a continuous linear functional ℓ such that ℓ(y) ≠ ℓ(z).


Proof. Without loss of generality assume y = 0. There exists an open set T that contains 0 but not z, since the topology is Hausdorff, and by local convexity we can take T to be convex. Since 0 is interior to T, the gauge function p_T is finite. Recall p_T(u) < 1 if u ∈ T.

By the hyperplane separation theorem, there exists ℓ such that ℓ(z) = 1 and ℓ(x) ≤ p_T(x) for all x. Since ℓ(y) = ℓ(0) = 0 ≠ 1 = ℓ(z), ℓ separates y and z.

It remains to prove that ℓ is continuous.

H = {w : ℓ(w) < c} is open: if w ∈ H and u ∈ T, let r = c − ℓ(w) > 0. Then

ℓ(w + ru) = ℓ(w) + rℓ(u) ≤ ℓ(w) + r p_T(u) < ℓ(w) + r = c,

so ℓ < c on the open set w + rT, which contains w. Hence H is open.

Making T symmetric about the origin by replacing it with T ∩ (−T), a similar argument shows {w : ℓ(w) > d} is open. Hence sets of the form {w : d < ℓ(w) < c} are open and ℓ is continuous.

Using the extended hyperplane separation theorem, we have

Corollary 12.3 Let K be a closed convex set in a LCT space and z ∉ K. There exists a continuous linear functional ℓ such that ℓ(y) ≤ c for all y ∈ K and ℓ(z) > c.

12.2 Krein-Milman theorem

Theorem 12.4 Let K be a nonempty, compact, convex subset of a LCT linear space X. Then

1) K has at least one extreme point.

2) K is the closure of the convex hull of its extreme points.

Proof. 1) Let {E_j} be the collection of all nonempty closed extreme subsets of K. It is nonempty because it contains K. We partially order it by inclusion. We show that every totally ordered subcollection has the lower bound ∩_j E_j, and hence by Zorn's lemma there is a minimal element. (To be able to apply Zorn's lemma without change, say E_1 ≤ E_2 if E_1^c ⊂ E_2^c, and translate what a maximal element and an upper bound mean.)


The intersection of any finite totally ordered subcollection of {E_j} is just the smallest one. Since K is compact, by the finite intersection property the intersection of any totally ordered subcollection is nonempty. (If ∩E_j = ∅, then {E_j^c} forms an open cover of K, so there is a finite subcover, and then the intersection of those finitely many E_j is empty, a contradiction.) The intersection of closed sets is closed, and it is easy to check that the intersection of extreme sets is extreme.

By Zorn's lemma, there is a minimal element E. We claim E is a single point. If not, there exists a continuous linear functional ℓ that separates two of the points of E. Let μ be the maximum value of ℓ on E; since E is compact, this maximum is attained. Let M = {x ∈ E : ℓ(x) = μ}. M ≠ E since ℓ is not constant on E. ℓ is continuous and E is closed, so M is closed. M = ℓ^{-1}({μ}) ∩ E is extreme in E, and since E is extreme in K, M is extreme in K. But this contradicts the fact that E was a minimal extreme subset.

2) Let K_e be the set of extreme points of K. We'll show that if z is not in the closure of the convex hull of K_e, then z ∉ K. By Corollary 12.3 there exists a continuous linear functional ℓ such that ℓ(y) ≤ c for y in that closed convex set and ℓ(z) > c; in particular ℓ ≤ c on K_e. K is compact and ℓ is continuous, so ℓ achieves its maximum on a closed subset E of K. E is extreme, and E must contain an extreme point p of K. Since p ∈ K_e, then ℓ(p) ≤ c. Since ℓ(p) = max_K ℓ(x), then ℓ(x) ≤ ℓ(p) ≤ c for all x ∈ K. Since ℓ(z) > c, then z ∉ K.
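A finite-dimensional sketch of part 2) (an illustration, not from the notes): for K = [0, 1]² the extreme points are the four corners, and bilinear weights give an explicit convex combination of corners recovering any point of K.

```python
import numpy as np

# Extreme points of the compact convex set K = [0,1]^2.
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)

def as_convex_combination(x, y):
    """Weights for (0,0), (1,0), (0,1), (1,1): nonnegative, summing to 1."""
    return np.array([(1 - x) * (1 - y), x * (1 - y), (1 - x) * y, x * y])

w = as_convex_combination(0.3, 0.8)
print(w.sum())          # 1.0: a convex combination
print(w @ corners)      # recovers the point (0.3, 0.8)
```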

12.3 Choquet’s theorem

Here is a theorem of Caratheodory.

Theorem 12.5 Suppose K is a nonempty compact convex subset of a LCT linear space X. Let K_e be the set of extreme points and K̄_e its closure. If u ∈ K, there exists a measure m_u of total mass 1 on K̄_e such that

u = ∫_{K̄_e} e m_u(de)

in the weak sense.

A measure with total mass one is a probability measure, but this theorem has nothing to do with probability.


The equation holding in the weak sense means

ℓ(u) = ∫_{K̄_e} ℓ(e) m_u(de)

for all continuous linear functionals ℓ.

Proof. Let m and M be the minimum and maximum of a continuous linear functional ℓ on K. K is compact, so these values are achieved. Then {x ∈ K : ℓ(x) = m} is an extreme subset of K, and similarly with m replaced by M. Each of these sets contains an extreme point. So if u ∈ K,

min_{p∈K_e} ℓ(p) ≤ ℓ(u) ≤ max_{p∈K_e} ℓ(p).    (12.1)

If ℓ_1 and ℓ_2 are equal on K_e, then applying the above to ℓ_1 − ℓ_2 shows they are equal on K.

Let L be the class of continuous functions on K̄_e that are the restriction of a continuous linear functional. Fix u. Define r on L by setting

r(ℓ) = ℓ(u).

If L contains the constant function 1, say as the restriction of ℓ_0, then by (12.1) we have r(ℓ_0) = ℓ_0(u) = 1. If L does not contain the constant functions, adjoin the constant function f_0 = 1 to L and set r(f_0) = 1. The set L is a linear subspace of C(K̄_e). Check, using (12.1), that r is a positive linear functional on L.

Now use the Hahn–Banach theorem to extend r from L to C(K̄_e).

K̄_e is a closed subset of K, hence compact. r is a positive linear functional on C(K̄_e). By the Riesz representation theorem from measure theory, there exists a measure m such that

r(f) = ∫_{K̄_e} f dm.

Since r(f_0) = 1, then m(K̄_e) = 1.

An example: in R³, let K be the convex hull of the unit circle in the (x, y) plane and the segment {(1, 0, z) : |z| ≤ 1}. Then (1, 0, 0) is a limit of extreme points but is not itself extreme, so the collection of extreme points is not closed.

Choquet's theorem is an important extension of Caratheodory's theorem in that we can take the integral to be over K_e rather than its closure, provided K is metrizable.

Theorem 12.6 Let K be a nonempty compact convex subset of a LCT space X that is metrizable. Then if u ∈ K,

u = ∫_{K_e} e m_u(de)

in the weak sense, for some measure m_u of total mass 1 on K_e.

13 Examples of convex sets and extreme points

Let Q be a compact Hausdorff space and X = C(Q). Let P be the set of positive linear functionals on X = C(Q) such that ℓ(1) = 1. If r ∈ Q, define e_r ∈ P by e_r(f) = f(r).

Proposition 13.1 P is convex and the set of extreme points is {er}.

Proof. Convexity is easy. Suppose e_r were not extreme. Then e_r = am + (1 − a)ℓ, where m, ℓ ∈ P and a ∈ (0, 1). Choose f ∈ C(Q) such that f ≥ 0 and f(r) = 0. Then

e_r(f) = f(r) = 0 = a m(f) + (1 − a) ℓ(f).

Since m(f), ℓ(f) ≥ 0 and a ∈ (0, 1), then ℓ(f) = m(f) = 0.

Now take any f ∈ C(Q) with f(r) = 0. Writing f = f⁺ − f⁻, we have f⁺(r) = 0, and by the above ℓ(f⁺) = 0; the same holds for m. Similarly f⁻(r) = 0, so the same is true for ℓ and m. Combining, ℓ(f) = 0 and likewise m(f) = 0. Thus N_{e_r} ⊂ N_m, N_ℓ. The null space of a non-zero linear functional has codimension 1, so these inclusions force N_ℓ = N_m = N_{e_r}, and hence ℓ and m are constant multiples of e_r. Since ℓ(1) = m(1) = 1 = e_r(1), we get ℓ = m = e_r. Therefore e_r is extreme.

Now let ℓ be an extreme element of P. By the Riesz representation theorem there exists a unique measure μ such that ℓ(f) = ∫ f dμ. Since ℓ(1) = 1, then μ(Q) = 1. If the support of μ is not a single point, there exist probability measures μ_1, μ_2 that are not equal, with μ = aμ_1 + (1 − a)μ_2 for some a ∈ (0, 1). Let ℓ_j(f) = ∫ f dμ_j. Then ℓ = aℓ_1 + (1 − a)ℓ_2; since ℓ is extreme, ℓ_1 = ℓ_2, which implies μ_1 = μ_2, a contradiction. Hence the support of μ is a single point, and since μ(Q) = 1, μ(dx) = δ_r(dx) for some r. Therefore ℓ(f) = ∫ f(x) δ_r(dx) = f(r), i.e., ℓ = e_r.

Choquet's representation here is

ℓ = ∫ e_r m(dr).

13.1 Convex functions

Let C be the set of convex functions on [0, 1] such that f(0) = 0, f(1) = 1, and f ≥ 0. Clearly C is a convex set. For 0 ≤ r < 1, let e_r(x) be the function that is 0 on [0, r], 1 at 1, and linear on [r, 1]; let e_1(x) be the function that is 0 on [0, 1) and 1 at 1.

Proposition 13.2 {e_r} are the extreme points of C.

Proof. Suppose e_r = af + (1 − a)g with f, g ∈ C and a ∈ (0, 1). Since f, g ≥ 0, they must both be 0 on [0, r]. If they were not both equal to e_r on [r, 1], one of them, say f, would be larger than e_r at some point y ∈ (r, 1). But f(r) = 0, f(1) = 1, and by convexity f must lie below the line connecting (r, 0) and (1, 1), contradicting f(y) > e_r(y). Therefore f = g = e_r, and e_r is extreme.

Now suppose f ∈ C is extreme. We will show in a minute that we can write

f(x) = ∫_0^1 e_r(x) m(dr)    (13.1)

for some measure m on [0, 1] with total mass 1. As in the proof above for P, the measure m must be supported on a single point, and hence f = e_r for some r.

To prove (13.1), recall that a convex function is differentiable almost everywhere and that its derivative f′ is increasing. Let m̄ be the Lebesgue–Stieltjes measure associated with the right continuous version of f′, and set m(dr) = (1 − r) m̄(dr).

If r > x, then e_r(x) = 0. Then

f(x) = f(x) − f(0) = ∫_0^x f′(r) dr = ∫_0^x ∫_0^r m̄(ds) dr
     = ∫_0^x ∫_s^x dr m̄(ds) = ∫_0^x (x − s) m̄(ds)
     = ∫_0^x [(x − r)/(1 − r)] (1 − r) m̄(dr) = ∫_0^1 e_r(x) m(dr).

Since f(1) = 1 and e_r(1) = 1, we conclude m([0, 1]) = 1.

f = ∫_0^1 e_r dm is the Choquet representation of f.
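As a numerical sanity check (an illustration only, not part of the notes): for f(x) = x², which lies in C, we have f′(r) = 2r, whose Lebesgue–Stieltjes measure is 2 dr, so the representing measure is m(dr) = 2(1 − r) dr, and (13.1) should reproduce x².

```python
import numpy as np

# Check f(x) = integral_0^1 e_r(x) m(dr) for f(x) = x^2, m(dr) = 2(1-r) dr.
def e(r, x):
    """The extreme functions e_r of C evaluated at x (vectorized in r)."""
    return np.where(x <= r, 0.0, (x - r) / (1.0 - r))

r = np.linspace(0.0, 1.0, 200_001)[:-1]   # grid on [0, 1), avoiding r = 1
dr = r[1] - r[0]
density = 2.0 * (1.0 - r)                 # m(dr) = 2(1 - r) dr

print((density * dr).sum())               # total mass of m: about 1
x = 0.7
approx = (e(r, x) * density * dr).sum()
print(approx)                             # about 0.49 = 0.7**2
```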

14 Bounded linear maps

14.1 Boundedness and continuity

A linear map M , which is the same as a linear operator or linear transforma-tion, from a Banach space X to a Banach space U is continuous if xn → ximplies Mxn → Mx. M is bounded if there exists a constant c such that|Mx| ≤ c|x|.

The following is proved just as for linear functionals.

Proposition 14.1 M is continuous if and only if it is bounded.

If X and U are only normed linear spaces and M is bounded, then M can be extended by continuity to a map from the completion of X to the completion of U.

Define

|M| = sup_{x ≠ 0} |Mx| / |x|,

or, what is the same,

|M| = sup_{|x|=1} |Mx|.
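In finite dimensions this supremum can be computed and checked numerically (an illustration, not from the notes): for a matrix acting between Euclidean spaces, |M| is the largest singular value.

```python
import numpy as np

# |M| = sup over unit vectors x of |Mx|, here for M = diag(3, 1) on R^2.
rng = np.random.default_rng(0)
M = np.array([[3.0, 0.0], [0.0, 1.0]])

exact = np.linalg.norm(M, 2)             # operator norm = largest singular value = 3
xs = rng.normal(size=(100_000, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)    # random unit vectors
brute = np.linalg.norm(xs @ M.T, axis=1).max()
print(exact, brute)                      # brute-force sup approaches 3 from below
```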

Proposition 14.2 |aM| = |a| |M|; |M| ≥ 0, with |M| = 0 if and only if M = 0; and |M + N| ≤ |M| + |N|.

The proofs are easy.

We write L(X, U) for the set of bounded linear maps from a Banach space X to a Banach space U.

Proposition 14.3 L(X, U) is itself a Banach space.

The proof is the same as for linear functionals. There we used the completeness of R or C; here we use the completeness of U.

Proposition 14.4 The null space N_M is closed.

Proof. {0} is closed and M is continuous, so N_M = M^{−1}({0}) is closed.

Suppose M : X → U. We define the transpose M′ : U∗ → X∗ as follows. If ℓ ∈ U∗, then x → ℓ(Mx) is a bounded linear functional on X, and we call this linear functional M′ℓ.

We sometimes use the notation ℓ(u) = (u, ℓ) for ℓ ∈ U∗ and ξ(x) = (x, ξ) for ξ ∈ X∗. With this notation,

(Mx, ℓ) = ℓ(Mx) = M′ℓ(x) = (x, M′ℓ),

which justifies the name adjoint or transpose.

If R is a subspace of U, then R^⊥, the annihilator of R, is the set of bounded linear functionals that vanish on R; R^⊥ ⊂ U∗.

If S ⊂ X∗, then S^⊥ consists of those vectors in X that are annihilated by every linear functional in S; S^⊥ ⊂ X.

Proposition 14.5 Let R_M denote the range of M. Then

(1) M′ is bounded and |M′| = |M|.

(2) N_{M′} = R_M^⊥.

(3) N_M = R_{M′}^⊥.

(4) (M + N)′ = M′ + N′.


Proof. (1) |M′| = sup_{|ℓ|=1} |M′ℓ| (recall M′ℓ ∈ X∗) and

|M′ℓ| = sup_{|x|=1} |M′ℓ(x)| = sup_{|x|=1} |ℓ(Mx)|.

So

|M′| = sup_{|ℓ|=1, |x|=1} |ℓ(Mx)| = sup_{|x|=1} |Mx| = |M|,

using in the last step the Hahn–Banach consequence that |u| = sup_{|ℓ|=1} |ℓ(u)| for u ∈ U.

(2) Suppose ℓ ∈ N_{M′}. Then M′ℓ = 0. If x ∈ X, then 0 = M′ℓ(x) = ℓ(Mx), so ℓ ∈ R_M^⊥. On the other hand, if ℓ ∈ R_M^⊥, then (M′ℓ)(x) = ℓ(Mx) = 0 for all x, hence M′ℓ = 0, i.e., ℓ ∈ N_{M′}.

(3) is similar and (4) is easy.

14.2 Strong and weak topologies

We define some topologies on L(X,U).

1) Uniform: this is the topology defined in terms of the norm |M|.

2) Strong: for each x ∈ X, let R_x : L → U be defined by R_x M = Mx. The strong topology is the coarsest one where all the R_x are continuous.

3) Weak: for each x ∈ X and ℓ ∈ U∗, let S_{x,ℓ} : L → R be defined by S_{x,ℓ} M = (Mx, ℓ) = ℓ(Mx). The weak topology is the coarsest one such that all the S_{x,ℓ} are continuous.

We say M_n is strongly convergent to M if |M_n x − Mx| → 0 for all x. M_n is weakly convergent to M if M_n x w−→ Mx for all x, that is, for all ℓ ∈ U∗,

ℓ(M_n x) → ℓ(Mx).

Proposition 14.6 Suppose X and U are Banach spaces, M_n ∈ L, and |M_n| ≤ c for all n. Suppose lim M_n x exists for a dense subset of x's. Then M_n x converges for every x.

Proof. Let ε > 0, and given x choose x_k in the dense subset such that |x − x_k| < ε. Then for all m, n,

|M_n x − M_m x| ≤ |M_n x − M_n x_k| + |M_n x_k − M_m x_k| + |M_m x_k − M_m x| ≤ 2cε + |M_n x_k − M_m x_k|.

Since M_n x_k converges, lim sup_{m,n→∞} |M_n x − M_m x| ≤ 2cε. As ε is arbitrary, M_n x is a Cauchy sequence, and since U is complete, it converges.


14.3 Uniform boundedness principle

Proposition 14.7 If {M_ν} is a collection of elements of L(X, U), where X, U are Banach spaces, and sup_ν |M_ν x| is finite for each x, then there exists c such that |M_ν| ≤ c for all ν.

The proof is nearly identical to the linear functional case.

Theorem 14.8 Suppose X and U are Banach spaces and M_ν ∈ L(X, U) are such that for all x ∈ X and all ℓ ∈ U∗, sup_ν |(M_ν x, ℓ)| < ∞. Then there exists c such that |M_ν| ≤ c for all ν.

Proof. Fix x. Let u_ν = M_ν x. Then sup_ν |ℓ(u_ν)| < ∞ for every ℓ. By the corollary to the uniform boundedness principle for linear functionals,

sup_ν |M_ν x| = sup_ν |u_ν| < ∞.

Now apply the preceding proposition.

14.4 Composition of maps

Proposition 14.9 Suppose X, U, W are Banach spaces, M is a bounded linear map from X to U, and N is a bounded linear map from U to W. Then

(1) |NM| ≤ |N| |M|;

(2) (NM)′ = M′N′.

Proof. (1) |NMx| ≤ |N| |Mx| ≤ |N| |M| |x|.

(2) (NMx, m) = (Mx, N′m) = (x, M′N′m).
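A quick matrix check of both parts (a finite-dimensional illustration, not from the notes), with the operator norm and the matrix transpose playing the roles of |·| and ′:

```python
import numpy as np

# M : R^3 -> R^4 and N : R^4 -> R^5, composed as N M : R^3 -> R^5.
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 3))
N = rng.normal(size=(5, 4))

norm = lambda A: np.linalg.norm(A, 2)      # operator (spectral) norm
print(norm(N @ M) <= norm(N) * norm(M))    # submultiplicativity: True
print(np.allclose((N @ M).T, M.T @ N.T))   # (NM)' = M'N': True
```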


14.5 Open mapping principle

Theorem 14.10 Suppose M is a bounded linear map from a Banach space X onto a Banach space U. Then there exists d > 0 such that B_d(0) ⊂ M(B_1(0)).

The “onto” condition is important.

Proof. M is onto, so ∪_n M(B_n(0)) = U. By the Baire category theorem, at least one of the sets M(B_n(0)) is dense in some open subset of U. Say V = M(B_n(0)) is dense in an open set containing a ball centered at y_0; then V − y_0 is dense in some open ball around the origin. Since M is onto, there exists x_0 such that Mx_0 = y_0. So M(B_n(0) − x_0) is dense in an open ball about 0. Since B_n(0) − x_0 ⊂ B_{n+|x_0|}(0), then M(B_{n+|x_0|}(0)) is dense in a ball about 0. By homogeneity, M(B_1(0)) is dense in B_r(0) for some r > 0, and then M(B_c(0)) is dense in B_{cr}(0) for all c > 0.

We show M(B_2(0)) contains B_r(0). Let u ∈ B_r(0).

Because M(B_1(0)) is dense in B_r(0), there exists x_1 such that |u − Mx_1| < r/2 and |x_1| < 1. Because u − Mx_1 ∈ B_{r/2}(0), there exists x_2 such that

|(u − Mx_1) − Mx_2| < r/4,   |x_2| < 1/2.

We continue with this construction.

Since |x_i| < 2^{−i+1}, we have Σ_{i=m}^n |x_i| ≤ Σ_{i=m}^n 2^{−i+1}, so

|Σ_{i=m}^n x_i| → 0

as m, n → ∞. Since X is complete, the sequence Σ_{i=1}^n x_i converges, say to x. We have |x| ≤ Σ|x_i| < 2. M is bounded, so M(Σ_{i=1}^n x_i) → Mx. Because

|u − M(Σ_{i=1}^n x_i)| ≤ r/2^n → 0,

we have Mx = u. Thus B_r(0) ⊂ M(B_2(0)), and by homogeneity B_d(0) ⊂ M(B_1(0)) with d = r/2.

Corollary 14.11 M maps open sets onto open sets.

Corollary 14.12 If M is one-to-one and onto and bounded, then M^{−1} is bounded.


Proof. By the open mapping theorem there exists d > 0 such that B_d(0) ⊂ M(B_1(0)). If u ∈ U with |u| = d/2, there exists x with |x| < 1 and Mx = u. By homogeneity, for any u ∈ U there exists x ∈ X with Mx = u and |x| < 2|u|/d. So x = M^{−1}u and |M^{−1}| ≤ 2/d.

A map M : X → U is closed if whenever x_n → x and Mx_n → u, then Mx = u. This is equivalent to the graph {(x, Mx) : x ∈ X} being a closed set.

If M is continuous, it is closed. The converse fails: if D is the differentiation operator on the set of differentiable functions on [0, 1], then D is closed but not continuous.

Theorem 14.13 (Closed graph theorem) If X and U are Banach spaces and M is a closed linear map, then M is continuous.

Proof. Let G = {(x, Mx) : x ∈ X} with norm |g| = |x| + |Mx| for g = (x, Mx). It is easy to see that this is a norm, and G is complete precisely because M is closed. Define P : G → X by P(x, Mx) = x, so that P is the projection onto the first coordinate.

|Pg| = |x| ≤ |x| + |Mx| = |g|, so P is bounded with norm at most 1. P is linear, one-to-one, and onto, so P^{−1} is bounded, i.e., there exists c such that |g| ≤ c|Pg|. So |Mx| ≤ (c − 1)|x|, which proves M is bounded.

Corollary 14.14 Suppose X has two norms such that if |x_n − x|_1 → 0 and |x_n − y|_2 → 0, then x = y. Suppose X is complete with respect to both norms. Then the norms are equivalent.

Proof. Let X_1 = (X, |·|_1) and similarly X_2, and let I : X_1 → X_2 be the identity map. The hypothesis is equivalent to I being closed. By the closed graph theorem, I and I^{−1} are bounded, which is the equivalence of the norms.

Corollary 14.15 Suppose X is a Banach space and X = Y ⊕ Z, where Y and Z are closed linear subspaces. If x = y + z with y ∈ Y and z ∈ Z, define P_Y x = y and P_Z x = z. Then

(1) P_Y, P_Z are linear operators, P_Y² = P_Y, P_Z² = P_Z, P_Y P_Z = 0.

(2) P_Y, P_Z are continuous.


PY and PZ are projections.

Proof. (1) is easy. (2) Since Y, Z are closed and the decomposition is unique, the graphs of P_Y and P_Z are closed. To see this, if x_n = y_n + z_n, x_n → x, y_n → y′, and z_n → z′, then y′ ∈ Y, z′ ∈ Z, and x = y′ + z′. The decomposition is unique, so y′ = P_Y x and z′ = P_Z x.

By the closed graph theorem, P_Y and P_Z are continuous.

15 Distributions

(This is from Appendix B of Lax.)

15.1 Definitions and examples

Let C∞_0 be the C∞ functions on R^n with compact support. We will work with complex-valued functions.

Let D_i u = ∂u/∂x_i. For α = (α_1, …, α_n), let |α| = α_1 + ⋯ + α_n, and write D^α for D_1^{α_1} ⋯ D_n^{α_n}.

We say u_k → u if there exists a compact set K such that supp(u_k) ⊂ K for all k (here supp(u) is the support of u) and for each multi-index α, D^α u_k converges uniformly to D^α u.

A distribution is an element of the dual of C∞_0; that is, ℓ is a complex-valued linear functional such that ℓ(u_k) → ℓ(u) whenever u_k → u.

For an integer N , let|u|N = max

|α|≤N|Dαu|.

Proposition 15.1 Let ℓ be a distribution and K a compact set. There exist N and c depending on K such that if u ∈ C∞_0 has support in K, then |ℓ(u)| ≤ c|u|_N.

Proof. If not, then for each n there exists u_n with support in K such that ℓ(u_n) = 1 and |u_n|_n ≤ 1/n. Therefore u_n → 0 in the sense above. But ℓ(u_n) = 1 while ℓ(0) = 0, contradicting continuity.


Let D be the set of distributions. We embed C∞_0 in D as follows: if v ∈ C∞_0, define ℓ_v(u) = ∫ uv dx = (u, v).

We will use the notation ℓ(u) = (u, ℓ).

Here are some examples of distributions.

1) If v is a continuous function, define `(u) =∫uv dx.

2) Dirac delta function: δ(u) = u(0).

3) If v is integrable and α is fixed, define ℓ(u) = ∫ (D^α u) v dx.

4) The principal value:

ℓ(u) = PV ∫ u(x)/x dx = lim_{ε→0+} ∫_{|x|>ε} u(x)/x dx.

If G ⊂ R^n is open, let C∞_G be the C∞ functions with support contained in G. We can talk about distributions on G as elements of the dual of C∞_G.

Suppose U is a continuous linear map from C∞_0 into C∞_0. Define Tℓ by

(v, Tℓ) = (Uv, ℓ).

It is an exercise to show that Tℓ is a distribution. If v ∈ C∞_0 and U′ denotes the transpose of U on C∞_0 (so that ∫ (Uu)v dx = ∫ u (U′v) dx), it is easy to check that Tℓ_v = ℓ_{U′v}. Therefore T acts as an extension of U′.

Here are some examples.

1) Let U be multiplication by a C∞ function t. Here T is an extension of U.

2) Let U = −D_i, differentiation with a minus sign. T is an extension of −U.

3) Translation: (U_a u)(x) = u(x + a). Here T_a is an extension of U_{−a}.

4) Reflection: Ru(x) = u(−x), and T is an extension of R.

5) Convolution: if t is continuous with compact support, define

Uu = (t ∗ u)(x) = ∫ t(y) u(x − y) dy.

T is convolution with Rt.

6) Let φ : R^n → R^n be C∞ and map compact sets to compact sets. Suppose ψ = φ^{−1} exists and has the same properties. Define Uu(x) = u(φ(x)). Here U′ is given by U′v(y) = v(ψ(y)) J(y), where J is the Jacobian determinant of ψ.


Note that one cannot, in general, define the product of two distributions, or quantities like δ(x²).

15.2 Local properties

Let G be open. We say ℓ is zero on G if ℓ(u) = 0 for all C∞_0 functions u whose support is contained in G.

Lemma 15.2 If ` is zero on G1 and G2, then ` is zero on G1 ∪G2.

Proof. This is just the usual partition of unity proof. Given u with support in G_1 ∪ G_2, we will write u = u_1 + u_2 with supp(u_1) ⊂ G_1 and supp(u_2) ⊂ G_2. Since ℓ(u) = ℓ(u_1) + ℓ(u_2) = 0, that will do it.

Fix x ∈ supp(u). Since G_1, G_2 are open, we can find h = h_x such that h is non-negative, h(x) > 0, h is C∞, and the support of h is contained either in G_1 or in G_2. The set B_x = {y : h_x(y) > 0} is open and contains x. By compactness we can cover supp u by finitely many such sets. Look at the corresponding finite collection of h's.

Let h_1 be the sum of those h's in the finite collection whose support is in G_1 and define h_2 similarly. Then let

u_1 = (h_1 / (h_1 + h_2)) u,   u_2 = (h_2 / (h_1 + h_2)) u.

If we have an arbitrary collection of open sets and ℓ is zero on each one, and supp u is contained in their union, then by compactness there exist finitely many of them that cover supp u, and by the above ℓ(u) = 0.

The union of all open sets on which ℓ is zero is an open set on which ℓ is zero. Its complement is called the support of ℓ.

Example: for the Dirac delta function, the support is {0}. Note that the support of D^α δ is also {0}.

Lemma 15.3 If ℓ is a distribution and supp ℓ = {0}, then there exists N such that D^α u(0) = 0 for all |α| ≤ N implies ℓ(u) = 0.


Proof. Let f ∈ C∞ be 0 on {|x| < 1} and 1 on {|x| > 2}, and let v = (1 − f)u. Since fu vanishes for |x| < 1 and supp ℓ = {0}, we get ℓ(fu) = 0 and

ℓ(v) = ℓ(u) − ℓ(fu) = ℓ(u).

Thus it suffices to consider u such that supp u ⊂ B_3(0).

By Proposition 15.1 there exist N and c such that |ℓ(u)| ≤ c|u|_N. Define f_k(x) = f(kx) and u_k = f_k u. Then u_k = u if |x| > 2/k.

If |x| < 2/k, then since u and D^β u vanish at 0 for |β| ≤ N, Taylor's theorem gives

|D^β u(x)| ≤ c|x|^{N+1−|β|} ≤ c k^{|β|−N−1}

if |β| ≤ N. Calculus (the product rule) then shows that

|D^α u_k(x)| ≤ c k^{|α|−N−1}

if |x| < 2/k and |α| ≤ N.

Since u_k = u for |x| > 2/k, it follows that u_k → u in the C^N norm, so ℓ(u_k) → ℓ(u). But u_k is zero in a ball about the origin, so ℓ(u_k) = 0, which implies ℓ(u) = 0.

Theorem 15.4 If supp ℓ = {0}, then

ℓ = Σ_{|α|≤N} c_α D^α δ

for some N and constants c_α.

Proof. If u_1, u_2 and all their derivatives up to order N agree at 0, then ℓ(u_1 − u_2) = 0 by Lemma 15.3, so ℓ(u_1) = ℓ(u_2). Thus ℓ(u) depends only on the values of u and its derivatives up to order N at 0, and so

ℓ(u) = Σ_{|α|≤N} c_α D^α u |_{x=0}.    (15.1)

To elaborate on the last line of the proof, consider the case where the dimension is 1. We take functions p_i that are equal to x^i near 0 but are in C∞_0. Choose the c_α so that (15.1) holds when u is equal to each p_i. Then it will hold for all u.

We will need the Sobolev inequality. First suppose 1 ≤ p < n and p∗ = np/(n − p), so that

1/p∗ = 1/p − 1/n

and p∗ > p.

Theorem 15.5 Suppose 1 ≤ p < n. There exists a constant C such that

‖u‖_{L^{p∗}} ≤ C‖Du‖_{L^p}

for all u ∈ C¹ with compact support.

Letting u ≡ 1 shows why compact support is necessary, but C does not depend on the size of the support.

Proof. 1) Suppose p = 1. Write

u(x) = ∫_{−∞}^{x_i} D_i u(x_1, …, x_{i−1}, y_i, x_{i+1}, …, x_n) dy_i.

Then

|u(x)| ≤ ∫_{−∞}^{∞} |Du(x_1, …, y_i, …, x_n)| dy_i.

From here on, all integrals are over R. Taking the product over i = 1, …, n, we have

|u(x)|^{n/(n−1)} ≤ ∏_{i=1}^n ( ∫ |Du(…, y_i, …)| dy_i )^{1/(n−1)}.

Then

∫ |u(x)|^{n/(n−1)} dx_1 ≤ ∫ ∏_{i=1}^n ( ∫ |Du| dy_i )^{1/(n−1)} dx_1

= ( ∫ |Du| dy_1 )^{1/(n−1)} ∫ ∏_{i=2}^n ( ∫ |Du| dy_i )^{1/(n−1)} dx_1

≤ ( ∫ |Du| dy_1 )^{1/(n−1)} ∏_{i=2}^n ( ∫∫ |Du| dx_1 dy_i )^{1/(n−1)},

where the last inequality is from the generalized Hölder inequality.

Integrating with respect to x_2,

∫∫ |u|^{n/(n−1)} dx_1 dx_2 ≤ ( ∫∫ |Du| dx_1 dy_2 )^{1/(n−1)} ∫ ∏_{i≠2} I_i^{1/(n−1)} dx_2,

where

I_1 = ∫ |Du| dy_1   and   I_i = ∫∫ |Du| dx_1 dy_i,  i = 3, …, n.

Then

∫∫ |u|^{n/(n−1)} dx_1 dx_2 ≤ ( ∫∫ |Du| dx_1 dy_2 )^{1/(n−1)} ( ∫∫ |Du| dy_1 dx_2 )^{1/(n−1)} × ∏_{i=3}^n ( ∫∫∫ |Du| dx_1 dx_2 dy_i )^{1/(n−1)}.

We continue, and after integrating with respect to x_n, we obtain the desired inequality.

2) If p > 1, apply the case p = 1 to v = |u|^γ, where

γ = p(n − 1)/(n − p) > 1.

We obtain

( ∫ |u|^{γn/(n−1)} )^{(n−1)/n} ≤ ∫ |D|u|^γ| dx = γ ∫ |u|^{γ−1} |Du| dx ≤ c ( ∫ |u|^{(γ−1)p/(p−1)} )^{(p−1)/p} ( ∫ |Du|^p )^{1/p}

by Hölder's inequality. Since γn/(n − 1) = (γ − 1)p/(p − 1) = p∗, dividing both sides by the first factor on the right gives the general case.

Theorem 15.6 If k < n/p and

1/q = 1/p − k/n,

then

‖u‖_{L^q} ≤ c‖u‖_{W^{k,p}}.

Proof. Use the above theorem and induction on k.

Proposition 15.7 If ℓ has compact support, there exist an integer L and continuous functions g_α such that

ℓ = Σ_{|α|≤L} D^α g_α.

An example: the delta function is the derivative of h, where h is 0 for x < 0 and 1 for x ≥ 0, and h is the derivative of g, where g is 0 for x < 0 and g(x) = x for x ≥ 0. Therefore δ = D²g.
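This example can be checked on a grid (a numerical illustration, not from the notes): the centered second difference of the ramp g(x) = max(x, 0) is a single spike of total mass 1 at the kink, a discrete stand-in for δ = D²g.

```python
import numpy as np

# Discrete second derivative of the ramp g(x) = max(x, 0).
x = np.linspace(-1.0, 1.0, 201)
h = x[1] - x[0]
g = np.maximum(x, 0.0)

second_diff = (g[2:] - 2 * g[1:-1] + g[:-2]) / h ** 2   # approximates D^2 g
print(second_diff.sum() * h)       # total mass: about 1
print(second_diff.max() * h)       # all the mass sits at one node near x = 0
```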

Proof. Let h ∈ C∞_0 be equal to 1 on the support of ℓ. Then ℓ((1 − h)u) = 0, so ℓ(u) = ℓ(hu). Therefore there exist N and c such that

|ℓ(hu)| ≤ c|hu|_N ≤ c′|u|_N.

Hence |ℓ(u)| ≤ c′|u|_N.

Let

‖u‖_M = ( Σ_{|α|≤M} ∫ |D^α u|² dx )^{1/2},

and let H_M be the completion of C∞_0 with respect to this norm. This is a Hilbert space.

By the Sobolev inequality,

|u|_N ≤ c‖u‖_M,   N < M − n/2.

So any Cauchy sequence in H_M is also a Cauchy sequence with respect to the C^N norm. This shows H_M can be embedded in C^N.

Since |ℓ(u)| ≤ c|u|_N ≤ c‖u‖_M, ℓ extends to a bounded linear functional on H_M. By the Riesz–Fréchet representation theorem, there exists g ∈ H_M such that

ℓ(u) = (u, g)_M = Σ_{|α|≤M} (D^α u, D^α g).

This is equal to

Σ_{|α|≤M} (−1)^{|α|} (u, D^{2α} g),

using the example of transformations where U is differentiation. So

ℓ = Σ_{|α|≤M} (−1)^{|α|} D^{2α} g.

Since g ∈ H_M, then g ∈ C^N, so N of the derivatives in each term can be taken classically. Now take L = 2M − N.

Lemma 15.8 Suppose b is a C∞_0 function with support contained in B_1(0) and ∫ b(x) dx = 1. Let b_k(x) = k^n b(kx). Then ℓ_k = b_k ∗ ℓ converges to ℓ in the sense of distributions.

Proof. We have

ℓ_k(u) = (b_k ∗ ℓ)(u) = (b_k ∗ ℓ, u) = (ℓ, Rb_k ∗ u).

Rb_k is an approximation to the δ function, so

D^α(Rb_k ∗ u) = Rb_k ∗ D^α u → D^α u uniformly.

Hence Rb_k ∗ u converges to u in the C∞_0 sense, and this implies ℓ_k(u) = (ℓ, Rb_k ∗ u) → (ℓ, u) = ℓ(u).
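A one-dimensional numerical sketch of the approximate identity (an illustration, not from the notes): with b a normalized C∞ bump supported in (−1, 1) and b_k(x) = k b(kx), a change of variables gives ∫ b_k(x) f(x) dx = ∫ b(y) f(y/k) dy → f(0) for continuous f, i.e. b_k → δ.

```python
import numpy as np

y = np.linspace(-1.0, 1.0, 200_001)[1:-1]   # open interval (-1, 1)
b = np.exp(-1.0 / (1.0 - y ** 2))           # the standard C-infinity bump
dy = y[1] - y[0]
b /= b.sum() * dy                           # normalize: integral of b is 1

f = lambda t: np.cos(t) + t ** 3            # continuous test function, f(0) = 1
vals = [(b * f(y / k)).sum() * dy for k in [1, 10, 100]]
print(vals)                                 # approaches f(0) = 1 as k grows
```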

Proposition 15.9 Suppose g is continuous and D_j g is continuous, where D_j g is the partial derivative of g in the distribution sense. Then D_j g is also the partial derivative of g in the classical sense.


Proof. Let g_k = b_k ∗ g. Then g_k → g in the sense of distributions. D_j g_k = b_k ∗ D_j g, so g_k → g and D_j g_k → D_j g uniformly on compact subsets, since g and D_j g are continuous. If b − a is a multiple of the unit vector in the x_j direction, we have

g_k(b) − g_k(a) = ∫_a^b D_j g_k dx_j.

Passing to the limit,

g(b) − g(a) = ∫_a^b D_j g dx_j.

A distribution ` is positive if `(u) ≥ 0 whenever u is non-negative.

Example: ℓ given by a non-negative measure m, that is, ℓ(u) = ∫ u dm.

Let |u| = sup |u(x)|.

Lemma 15.10 Suppose ℓ is a positive distribution and K is compact. There exists c depending on K such that |ℓ(u)| ≤ c|u| for all u supported in K.

Proof. Let p ∈ C∞_0 with p ≥ 0 and p = 1 on K. Then for u supported in K,

|u(x)| ≤ |u| p(x).

So ℓ(u) ≤ |u| ℓ(p). Similarly, −u ≤ |u| p, so −ℓ(u) ≤ |u| ℓ(p). Now let c = ℓ(p).

Proposition 15.11 Every positive distribution is given by a measure.

Proof. Using Lemma 15.10, extend ℓ by continuity to all continuous functions with compact support. Then use the Riesz representation theorem.


15.3 Fourier transforms

Let S be the class of complex-valued C∞ functions u such that |x^β D^α u(x)| → 0 as |x| → ∞ for all multi-indices α and all positive integers β. S is called the Schwartz class. An example of an element of the Schwartz class is e^{−|x|²}.

Define |u|_{β,α} = sup_x |x|^β |D^α u(x)|. We say u_n ∈ S converges to u if |u_n − u|_{β,α} → 0 for all β, α.

We can find a metric for S: define

d(u, v) = Σ_{α,β} 2^{−β−|α|} · |u − v|_{β,α} / (1 + |u − v|_{β,α}).

The dual of S is the set of continuous linear functionals on S. Here continuity means that if u_n → u in the sense of the Schwartz class, then ℓ(u_n) → ℓ(u).

Since C∞_0 ⊂ S, then S∗ ⊂ D. This means that every element of S∗ is a distribution. Elements of S∗ are called tempered distributions.

Any distribution of compact support is a tempered distribution. So is ℓ(u) = ∫ vu dx if v grows slower than some power of |x| as |x| → ∞.

For u ∈ S, define the Fourier transform Fu = û by

û(ξ) = (2π)^{−n/2} ∫ u(x) e^{ix·ξ} dx.
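With this normalization the Gaussian e^{−x²/2} is its own Fourier transform in one dimension; a direct quadrature checks this (a numerical illustration, not from the notes):

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 100_001)
dx = x[1] - x[0]
u = np.exp(-x ** 2 / 2)

def fourier(xi):
    """Riemann-sum approximation of (2*pi)^(-1/2) * int u(x) e^{i x xi} dx."""
    return (2 * np.pi) ** -0.5 * (u * np.exp(1j * x * xi)).sum() * dx

for xi in [0.0, 1.0, 2.0]:
    print(xi, fourier(xi).real, np.exp(-xi ** 2 / 2))
# The two columns agree: the transform of e^{-x^2/2} is e^{-xi^2/2}.
```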

Theorem 15.12 F maps S into S continuously.

Proof. For elements of S, D^α Fu = F((ix)^α u). If u ∈ S, |x^α u| → 0 faster than any power of |x|, so x^α u ∈ L¹. This implies D^α Fu is a continuous function, and therefore Fu ∈ C∞.

By integration by parts, ξ^β D^α Fu = i^{|α|+|β|} F(D^β(x^α u)). By the product rule, D^β(x^α u) is in L¹, so ξ^β D^α Fu is continuous and bounded. Therefore Fu ∈ S, and continuity of F follows from the same identities.

Proposition 15.13 For u ∈ S:

(1) If T_a u(x) = u(x − a), then F T_a u = e^{ia·ξ} Fu and T_a Fu = F(e^{−ia·x} u).

(2) F(iD_j u) = ξ_j Fu and D_j Fu = F(ix_j u).

(3) If A is a rotation about the origin and R is reflection about the origin, then FR = RF and FA = AF.

(4) F(u ∗ v) = (2π)^{n/2} (Fu)(Fv).

Because

Fu(ξ) = c ∫ e^{ix·ξ} u(x) dx,

we have

(Fu, v) = c ∫∫ e^{ix·ξ} u(x) v(ξ) dx dξ = ∫ u(x) (Fv)(x) dx = (u, Fv).

So F∗ = F.

If ` is a tempered distribution, define F` by

(v, F`) = (Fv, `)

for all v ∈ S.

It then follows that the proposition above holds for ` ∈ S∗.

Proposition 15.14 Let ` ≡ 1. Then F` = (2π)^{n/2} δ.

` ≡ 1 means ` = `_1, that is, `(u) = ∫ u · 1 dx = ∫ u(x) dx.

Proof. Write d for F`.

x_j d = x_j F1 = F(iD_j 1) = F(0) = 0.

Since x_j d = 0, the support of d is contained in {x_j = 0}. This is true for each j, so the support of d is {0}. By a previous proposition,

d = ∑_{|α|≤N} c_α D^α δ

for some N and coefficients c_α.


For any multi-index γ with |γ| > 0, x^γ d = 0. But x^γ d = ∑_{|α|≤N} c_α x^γ D^α δ.

We claim

x^γ D^α δ = 0 if |α| < |γ|,
x^γ D^α δ = 0 if |α| = |γ|, α ≠ γ,
x^γ D^α δ = (−1)^{|α|} α! δ if α = γ.

To show this, if u ∈ C∞_0,

(u, x^γ D^α δ) = (x^γ u, D^α δ) = (−1)^{|α|} (D^α(x^γ u), δ) = (−1)^{|α|} D^α(x^γ u)(0).

Checking the three cases proves the claim.

Suppose there exists α with |α| = N > 0 and c_α ≠ 0. But

0 = x^α d = c_α x^α D^α δ = c_α (−1)^{|α|} α! δ,

so c_α = 0, a contradiction. Therefore d = c_0 δ.

To calculate c_0, let ` ≡ 1 and v = e^{−|x|^2/2} in (Fv, `) = (v, F`). Then F` = d = c_0 δ and Fv = e^{−|ξ|^2/2}, so

(2π)^{n/2} = (e^{−|ξ|^2/2}, 1) = (e^{−|x|^2/2}, c_0 δ) = c_0.

Theorem 15.15 (1) F is an invertible map from S into S and

u(x) = (2π)^{−n/2} ∫ û(ξ) e^{−iξ·x} dξ.

(2) F is an invertible map from S∗ into S∗ and F^{−1} = FR.

Proof. (1) We have F T_a = e^{ia·ξ} F, so F(e^{−ia·x} `) = T_a F`. Take ` ≡ 1 to get

F e^{−ia·x} = (2π)^{n/2} δ(x − a).

Now let ` = e^{−ia·x} and use (v, F`) = (Fv, `) to get

∫ v̂(ξ) e^{−ia·ξ} dξ = (Fv, e^{−ia·ξ}) = (v, (2π)^{n/2} δ(x − a)) = (2π)^{n/2} v(a).


(2) From (1),

u(x) = (2π)^{−n/2} ∫ û(−ξ) e^{ix·ξ} dξ,

so F^{−1} = FR, and hence FRF = I on S. Then

(v, `) = (FRFv, `) = (RFv, F`) = (Fv, RF`) = (v, FRF`),

thus FRF` = `. The inverse of F exists because RF is a right inverse for F, FR is a left inverse, and F and R commute. Therefore RF` = F^{−1}`.

Theorem 15.16 If u ∈ L^2, then û ∈ L^2 and ‖û‖_{L^2} = ‖u‖_{L^2}.

Proof. We will prove this for u ∈ S and then approximate functions in L^2 by functions in S. Now

\overline{Fu} = (2π)^{−n/2} ∫ \overline{u} e^{−iξ·x} dx = R F \overline{u}.

Letting v = R F \overline{u} in (Fu, v) = (u, Fv) yields

(Fu, \overline{Fu}) = (u, F(R F \overline{u})) = (u, \overline{u}),

using FRF = I.
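The Gaussian computation in Proposition 15.14 and the Plancherel identity above can be checked numerically in one dimension, using the notes' convention û(ξ) = (2π)^{−1/2} ∫ u(x) e^{ixξ} dx. A minimal sketch with NumPy (the grid sizes and the Gaussian test function are my choices, not from the notes):

```python
import numpy as np

# n = 1 Fourier transform with the convention of these notes:
# (Fu)(xi) = (2*pi)^(-1/2) * integral u(x) exp(i*x*xi) dx,
# approximated by a Riemann sum on a large grid.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

def F(u_vals, xi_grid):
    return np.array([np.sum(u_vals * np.exp(1j * x * s)) for s in xi_grid]) \
        * dx / np.sqrt(2.0 * np.pi)

u = np.exp(-x**2 / 2)                 # the Gaussian e^{-x^2/2}
Fu = F(u, x)

# F maps this Gaussian to itself ...
gauss_err = np.max(np.abs(Fu - np.exp(-x**2 / 2)))

# ... and preserves the L^2 norm (Plancherel).
norm_u = np.sum(np.abs(u)**2) * dx
norm_Fu = np.sum(np.abs(Fu)**2) * dx
print(gauss_err, norm_u, norm_Fu)
```

The quadrature error is far below the tolerances used here because the integrand is smooth and rapidly decaying.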

16 Banach algebras

16.1 Normed algebras

An algebra is a linear space that is also a ring under multiplication. We assume there is an identity for the multiplication, which we call I. Our algebras will be over the scalar field C; the reasons will be very apparent shortly.

An algebra is a normed algebra if the linear space is normed, |NM| ≤ |N| |M|, and |I| = 1. If the normed algebra is complete, it is called a Banach algebra.


One example is to let L = L(X,X), the set of bounded linear maps from X into X. Another is to let L be the collection of bounded continuous functions on some set.

An element M of L is invertible if there exists N ∈ L such that NM = MN = I.

M has a left inverse A if AM = I and a right inverse B if MB = I. If it has both, then B = AMB = A, and so M is invertible.

Proposition 16.1 (1) If M and K are invertible, then

(MK)−1 = K−1M−1.

(2) If M and K commute and MK is invertible, then M and K are invertible.

Proof. (1) is easy. For (2), let N = (MK)^{−1}. Then MKN = I, so KN is a right inverse for M. Also, I = NMK = NKM, so NK is a left inverse for M. Since M has a left and a right inverse, it is invertible. The argument for K is similar.

Proposition 16.2 If K is invertible, then so is L = K − A, provided |A| < 1/|K^{−1}|.

Proof. First we suppose K = I. If |B| < 1, then

|∑_{i=m}^{n} B^i| ≤ ∑_{i=m}^{n} |B^i| ≤ ∑_{i=m}^{n} |B|^i,

so the partial sums form a Cauchy sequence, and S = ∑_{i=0}^∞ B^i converges. We see BS = ∑_{i=1}^∞ B^i = S − I, so (I − B)S = I. Similarly S(I − B) = I.

For the general case, write K − A = K(I − K^{−1}A), and let B = K^{−1}A. Then |B| ≤ |K^{−1}| |A| < 1, and

(K − A)^{−1} = (I − K^{−1}A)^{−1} K^{−1}.
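The proof above is the Neumann series, which is easy to test for matrices. A sketch (the random matrix and the scaling to |B| = 1/2 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
B *= 0.5 / np.linalg.norm(B, 2)     # scale so |B| = 0.5 < 1 in operator norm

# S = I + B + B^2 + ... inverts I - B; the tail is dominated by sum |B|^i.
S = np.zeros((n, n))
term = np.eye(n)
for _ in range(200):                # |B|^200 = 2^-200, far below roundoff
    S += term
    term = term @ B

err = np.linalg.norm(S @ (np.eye(n) - B) - np.eye(n), 2)
print(err)
```

In exact arithmetic S(I − B) = I − B^200, so the residual is essentially machine precision.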


The resolvent set of M, ρ(M), is the set of λ ∈ C such that λI − M is invertible. The spectrum of M, σ(M), is the set of λ for which λI − M is not invertible. We sometimes write λ − M for λI − M.

Let f : G → X, where G ⊂ C. f(z) is strongly analytic if

lim_{h→0} (f(z + h) − f(z))/h

exists in the norm topology for all z ∈ G. One can check that most of complex analysis can be extended to strongly analytic functions.

Proposition 16.3 (1) ρ(M) is open in C.

(2) (z − M)^{−1} is an analytic function of z in ρ(M).

Proof. If λ ∈ ρ(M), letting K = λI − M and A = −hI, we see K − A = (λ + h)I − M is invertible if |h| is small. So λ + h ∈ ρ(M).

For (2),

((λ + h) − M)^{−1} = ∑_{n=0}^∞ (−1)^n (λ − M)^{−n−1} h^n

for h small. So the resolvent can be expanded in a power series in h, which is valid if |h| < |(λ − M)^{−1}|^{−1}.

We write

|σ(M)| = sup_{λ∈σ(M)} |λ|,

and call this the spectral radius of M.

Theorem 16.4 (1) σ(M) is closed, bounded, and nonempty.

(2) |σ(M)| = lim_{k→∞} |M^k|^{1/k}.

Proof. (1) ρ(M) is open, so σ(M) is closed.

(zI − M)^{−1} = z^{−1} (I − M z^{−1})^{−1} = ∑_{n=0}^∞ M^n z^{−n−1}

converges if |z^{−1} M| < 1, or equivalently, |z| > |M|. Therefore, if |z| > |M|, then z ∈ ρ(M). Hence the spectrum is contained in the closed ball B_{|M|}(0).

(z − M)^{−1} = ∑_{n=0}^∞ M^n z^{−n−1}

is a Laurent series. If σ(M) = ∅, then (z − M)^{−1} would be everywhere analytic. So if c > |M| and C is the circle |z| = c,

(1/2πi) ∫_C (z − M)^{−1} dz = 0.

But integrating ∑ M^n z^{−n−1} term by term over the curve C, all the terms are zero except the n = 0 term, where we get

(1/2πi) ∫_C z^{−1} dz · I = I,

a contradiction. Therefore σ(M) is nonempty.

(2) Fix k for the moment. If we write n = kq + r with 0 ≤ r < k,

|∑_{n=0}^∞ M^n z^{−n−1}| ≤ ∑_n |M^n| / |z|^{n+1} ≤ ∑_{r=0}^{k−1} (|M|^r / |z|^{r+1}) ∑_q (|M^k| / |z|^k)^q.

So ∑ M^n z^{−n−1} converges absolutely if |M^k|/|z|^k < 1, or if |z| > |M^k|^{1/k}.

If |z| > |M^k|^{1/k}, then z ∈ ρ(M). Hence if λ ∈ σ(M), then |λ| ≤ |M^k|^{1/k}. This is true for all k, so |σ(M)| ≤ lim inf_{k→∞} |M^k|^{1/k}.

Let C be a circle enclosing σ(M), namely the one about the origin with radius |σ(M)| + δ. Using (z − M)^{−1} = ∑ M^n z^{−n−1},

(1/2πi) ∫_C (z − M)^{−1} z^n dz = M^n.

So

|M^n| ≤ (1/2π) ∫_C |(z − M)^{−1}| |z|^n |dz| ≤ c (|σ(M)| + δ)^{n+1},

where c = sup_{z∈C} |(z − M)^{−1}|. Thus

|M^n|^{1/n} ≤ c^{1/n} (|σ(M)| + δ)^{1 + 1/n}.

Hence

lim sup_{n→∞} |M^n|^{1/n} ≤ |σ(M)| + δ.

Since this is true for all δ, that does it.
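Theorem 16.4(2) is Gelfand's spectral radius formula, which can be checked directly for a matrix. A sketch (the random matrix is my choice; note |M^k|^{1/k} ≥ |σ(M)| always, and for a diagonalizable matrix the gap closes like O(1/k)):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))

rho = max(abs(np.linalg.eigvals(M)))    # |sigma(M)|, the spectral radius

Mk = np.eye(5)
for k in range(1, 201):
    Mk = Mk @ M
approx = np.linalg.norm(Mk, 2) ** (1.0 / 200)   # |M^200|^{1/200}

print(rho, approx)   # approx decreases toward rho from above
```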

Note that not every element of σ(M) is an eigenvalue of M. For example, if M : `^2 → `^2 is defined by

M(x_1, x_2, x_3, . . .) = (x_1, x_2/2, x_3/4, . . .),

then 1, 1/2, 1/4, . . . are eigenvalues. Since the spectrum is closed, 0 ∈ σ(M), but 0 is not an eigenvalue of M.

16.2 Functional calculus

We can define p(M) = ∑_{i=0}^n a_i M^i for any polynomial p(z) = ∑_{i=0}^n a_i z^i (with M^0 = I), and since

lim sup |M^k|^{1/k} = |σ(M)|,

also f(M) for any analytic function f whose power series has radius of convergence larger than the spectral radius (by the root test).

Let G be a domain containing σ(M), f analytic in G, and C a closed curve in G ∩ ρ(M) whose winding number is 1 about each point of σ(M) and 0 about each point of G^c. Define

f(M) = (1/2πi) ∫_C (z − M)^{−1} f(z) dz.

By Cauchy's theorem, this is independent of the contour chosen.
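For matrices this contour integral can be evaluated with the trapezoid rule, which converges geometrically for periodic analytic integrands; checking it against f(z) = z^2 verifies the definition numerically. A sketch (matrix, contour radius, and point count are my choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
f = lambda z: z**2

# Circle C about 0 of radius r > |sigma(M)|, so C winds once around sigma(M).
r = max(abs(np.linalg.eigvals(M))) + 1.0
N = 400
z = r * np.exp(2j * np.pi * np.arange(N) / N)

# f(M) = (1/2 pi i) * integral_C (z - M)^{-1} f(z) dz; on the circle
# dz = i z dtheta, so the trapezoid rule is an average of f(z) z (z - M)^{-1}.
fM = sum(f(zk) * zk * np.linalg.inv(zk * np.eye(4) - M) for zk in z) / N

err = np.linalg.norm(fM - M @ M)
print(err)
```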

Theorem 16.5 (1) If f is a polynomial, the two definitions agree.

(2) f(M)g(M) = (fg)(M).

(3) (Spectral mapping theorem)

σ(f(M)) = f(σ(M)).


Proof. (1) follows from

(1/2πi) ∫_C (z − M)^{−1} z^n dz = M^n.

(2) We have

(zI − M) − (wI − M) = (z − w)I.

Multiplying this by

(z − M)^{−1} (w − M)^{−1} / (z − w),

we get

(1/(z − w)) [(w − M)^{−1} − (z − M)^{−1}] = (z − M)^{−1} (w − M)^{−1},

which is known as the resolvent identity.

f → f(M) is linear. Let C, D be closed curves as above with D strictly inside C. Then, using the resolvent identity,

f(M) g(M) = (1/2πi)^2 ∫_C ∫_D (z − M)^{−1} (w − M)^{−1} f(z) g(w) dw dz

= (1/2πi)^2 ∫_C ∫_D ((w − M)^{−1}/(z − w)) f(z) g(w) dw dz − (1/2πi)^2 ∫_C ∫_D ((z − M)^{−1}/(z − w)) f(z) g(w) dw dz.

Since

(1/2πi) ∫_C f(z)/(z − w) dz = f(w)

(note C winds once around each point of D), the first integral is

(1/2πi) ∫_D (w − M)^{−1} f(w) g(w) dw = (fg)(M).

Because D does not wind around any point of C,

(1/2πi) ∫_D g(w)/(z − w) dw = 0,

and so the second integral is 0.

(3) Suppose µ /∈ f(σ(M)); we show µ /∈ σ(f(M)). Since µ ≠ f(λ) for every λ ∈ σ(M), f(z) − µ does not vanish on σ(M). So g(z) = (f(z) − µ)^{−1} is analytic in an open set containing σ(M) and we can define g(M) there.

Let h(z) = (f(z) − µ) g(z) ≡ 1. So by (2),

[f(M) − µI] g(M) = h(M) = I.

Therefore g(M) is the inverse of f(M) − µI, and hence µ /∈ σ(f(M)).

Suppose λ ∈ σ(M). We show f(λ) ∈ σ(f(M)). Let

k(z) = (f(z) − f(λ))/(z − λ).

k(z) is analytic in an open set containing σ(M), so we can define k(M) there. Since (z − λ) k(z) = f(z) − f(λ),

(M − λI) k(M) = f(M) − f(λ)I.

Since λ ∈ σ(M), M − λI is not invertible. If the product were invertible, then we saw that each factor would be as well. Therefore f(M) − f(λ)I is not invertible.

Corollary 16.6 Suppose σ(M) = σ_1 ∪ · · · ∪ σ_N, where the σ_j are closed and disjoint. Let C_j be a closed curve that winds once around each point of σ_j but not around any point of any other σ_k. Let

P_j = (1/2πi) ∫_{C_j} (z − M)^{−1} dz.

Then

(1) P_j^2 = P_j, and P_j P_k = 0 if j ≠ k.

(2) ∑_j P_j = I.

(3) P_j ≠ 0 if σ_j ≠ ∅.

The proof uses the observation that C = ∑_j C_j winds once around each point of σ(M).
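The P_j are the Riesz spectral projections, and the corollary can be checked numerically the same way as the functional-calculus integral. A sketch (the spectrum {1, 1.2, 5}, the similarity transform, and the contours are invented for the test):

```python
import numpy as np

rng = np.random.default_rng(3)
V = rng.standard_normal((3, 3))
M = V @ np.diag([1.0, 1.2, 5.0]) @ np.linalg.inv(V)   # sigma(M) = {1, 1.2, 5}

def riesz_projection(M, center, radius, N=400):
    # P = (1/2 pi i) * integral (z - M)^{-1} dz over the circle; with
    # z = center + radius * e^{i theta}, dz = i (z - center) dtheta.
    z = center + radius * np.exp(2j * np.pi * np.arange(N) / N)
    return sum((zk - center) * np.linalg.inv(zk * np.eye(3) - M) for zk in z) / N

P1 = riesz_projection(M, 1.1, 0.5)    # winds once around sigma_1 = {1, 1.2}
P2 = riesz_projection(M, 5.0, 0.5)    # winds once around sigma_2 = {5}

print(np.linalg.norm(P1 @ P1 - P1))          # P_1^2 = P_1
print(np.linalg.norm(P1 @ P2))               # P_1 P_2 = 0
print(np.linalg.norm(P1 + P2 - np.eye(3)))   # the P_j sum to I
```

The trace of P1 recovers the number of eigenvalues enclosed (here 2).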


17 Commutative Banach algebra

We look at commutative Banach algebras with a unit. Commutative means MN = NM for all M, N ∈ L.

p is a multiplicative functional on L if p is a homomorphism from L into C.

Proposition 17.1 Every homomorphism is a contraction.

Proof. M = IM, so p(M) = p(IM) = p(I) p(M), or p(I) = 1. If K is invertible,

p(K) p(K^{−1}) = p(K K^{−1}) = p(I) = 1,

so p(K) ≠ 0. Suppose |p(M)| > |M| for some M. Then if B = M/p(M), we have |B| < 1, so K = I − B is invertible. But

p(K) = p(I) − p(M/p(M)) = 1 − 1 = 0,

a contradiction.

We will show that if p(K) ≠ 0 for all homomorphisms, then K is invertible.

I ⊂ L is an ideal if I is a linear subspace, I ≠ {0}, I ≠ L, and whenever M ∈ L and J ∈ I, then MJ ∈ I.

As an example, let L = C(S), let r ∈ S, and let I = {f : f(r) = 0}.

Sometimes the requirement that I ≠ L is not part of the definition, and one then calls the ideals properly contained in L proper ideals.

If the identity I belongs to the ideal I, then I = L. If I contains an invertible element, then I contains the identity, and hence equals L.

Lemma 17.2 Let q be a homomorphism from L onto A, where q is not an isomorphism and q(L) ≠ 0. Then

(1) {K ∈ L : q(K) = 0} is an ideal. (This set is called the kernel of q.)

(2) If I is an ideal, then I is the kernel of some non-trivial homomorphism.


Proof. (1) is easy. For (2), let A = L/I, and let q map M to the equivalence class containing M. Then the kernel of q is I.

Proposition 17.3 If K ∈ L, K ≠ 0, and K is not invertible, then K lies in some ideal.

Proof. Look at KL = {KM : M ∈ L}. Note KL does not contain the identity.

Lemma 17.4 Every ideal is contained in some maximal ideal.

Proof. Order the ideals by inclusion. The union of a totally ordered subcollection is an upper bound. (Note that if I /∈ I_α for all α, then I /∈ ∪_α I_α.) Then use Zorn's lemma.

A division algebra is one where every nonzero element is invertible.

Proposition 17.5 If M is a maximal ideal of L, then A = L/M is a division algebra.

Proof. Suppose C ∈ A, C ≠ 0, and C is not invertible. Then CA is an ideal in A. Let q : L → L/M = A be the quotient map. Then q^{−1}(CA) is an ideal in L that properly contains M, because q(M) = 0 for M ∈ M while C = CI ∈ CA is non-zero. This contradicts M being maximal.

Lemma 17.6 The closure of an ideal is an ideal.

Proof. The only thing to prove is that the identity I is not in the closure of the ideal I. We know I /∈ I, and if N ∈ B_1(I), then N is invertible, and hence not in I. So B_1(I) is an open set about I that is disjoint from I. Therefore I is not in the closure of I.


Lemma 17.7 If M is a maximal ideal, then M is closed.

Proof. If not, the closure of M is an ideal strictly larger than M.

Lemma 17.8 If I is a closed ideal in L, then L/I is a Banach algebra.

Proposition 17.9 If A is a Banach algebra with unit that is a division algebra, then A is isomorphic to C.

Proof. If K ∈ A, there exists κ ∈ σ(K). So κI − K is not invertible; since A is a division algebra, κI − K = 0, or K = κI. The map K → κ is the desired isomorphism.

Theorem 17.10 K ∈ L is invertible if and only if p(K) ≠ 0 for all homomorphisms p of L into C.

Proof. Suppose K is not invertible. Then K is in some maximal ideal M. M is closed, and L/M is a division algebra, hence isomorphic to C. The composition

p : L → L/M → C

is a homomorphism onto C, and its null space is M. Since K ∈ M, p(K) = 0. The converse is contained in the proof of Proposition 17.1: if K is invertible, then p(K) ≠ 0.

18 Applications of Banach algebras

18.1 Absolutely convergent Fourier series

Let L be the set of continuous functions from the unit circle S^1 to the complex numbers such that f(θ) = ∑ c_n e^{inθ} with ∑ |c_n| < ∞. We let the norm of f be ∑ |c_n|. We check that L is a Banach algebra. To do that, we use the fact that the Fourier coefficients of fg are the convolution of those of f and those of g. But the convolution of two `^1 sequences is in `^1, so fg ∈ L.

If w ∈ S^1, set p_w(f) = f(w). Then p_w is a homomorphism from L to C.


Proposition 18.1 If p is a homomorphism from L to C, then there exists w such that p(f) = f(w) for all f ∈ L.

Proof. p(I) = 1 and |p(M)| ≤ |M|, so p has norm 1. Then

|p(e^{iθ})| ≤ 1, |p(e^{−iθ})| ≤ 1,

and

1 = p(1) = p(e^{iθ}) p(e^{−iθ}).

We must have |p(e^{iθ})| = 1, or we would have strict inequality in the above.

Therefore there exists w such that p(e^{iθ}) = e^{iw}. Since p is a homomorphism, by induction p(e^{inθ}) = e^{inw}. By linearity,

p(∑_{n=−N}^{N} c_n e^{inθ}) = ∑_{n=−N}^{N} c_n e^{inw}.

If f ∈ L, since p is continuous and ∑ |c_n| < ∞, we have p(f) = f(w).

Theorem 18.2 Suppose f has an absolutely convergent Fourier series and f is never 0 on S^1. Then 1/f also has an absolutely convergent Fourier series.

Proof. If p is a homomorphism on L, then p(f) = f(w) for some w. Since f is never 0, p(f) ≠ 0 for all non-trivial homomorphisms p. By Theorem 17.10, f is invertible in L, and its inverse is 1/f.
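This is Wiener's theorem, and it can be illustrated numerically. For f(θ) = 2 + cos θ, which never vanishes, one can compute that the Fourier coefficients of 1/f are c_n = (1/√3)(√3 − 2)^{|n|}, so ∑ |c_n| = 1 — in particular finite, as the theorem guarantees. A sketch using the FFT (the example f and the grid size are my choices):

```python
import numpy as np

# f(theta) = 2 + cos(theta) has coefficients c_0 = 2, c_1 = c_{-1} = 1/2,
# so f is in the algebra L, and f is never 0 on the circle.
N = 1024
theta = 2 * np.pi * np.arange(N) / N
f = 2 + np.cos(theta)

# Fourier coefficients of 1/f via the FFT:
# c_n ~ (1/N) * sum_k (1/f)(theta_k) e^{-i n theta_k}.
c = np.fft.fft(1.0 / f) / N

l1 = np.sum(np.abs(c))      # finite: 1/f is in L as well (here the sum is 1)
print(l1, c[0].real)        # c_0 = 1/sqrt(3)
```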

18.2 The corona problem

Let H∞ be the set of functions that are analytic and bounded in the unit disk D. If we set |f| = sup_{z∈D} |f(z)|, this makes H∞ into a commutative Banach algebra.

M_z = {f ∈ H∞ : f(z) = 0} is a maximal ideal. But if z_n is a sequence converging to the boundary, then M = {f ∈ H∞ : lim f(z_n) = 0} is an ideal, so it is contained in a maximal ideal that is not any of the M_z. Therefore {M_z : z ∈ D} is not equal to the set A of all maximal ideals. The corona theorem says that it is dense in A if we provide A with the natural topology.

If f ∈ H∞, we define f̂ : A → C, the Gel'fand transform of f, as follows. If M ∈ A, then H∞/M is isomorphic to C; let I_M be the isomorphism. Define f̂(M) = I_M(f̃), where f̃ is the equivalence class of H∞/M containing f.

Note that f̂(M) = 0 implies I_M(f̃) = 0, so f̃ = 0, which implies f ∈ M.

Let z ∈ D. If f̃ ∈ H∞/M_z, then f̃ = {g : g − f ∈ M_z} = {g : g(z) = f(z)}. So the isomorphism mapping H∞/M_z to C is just I_{M_z}(f̃) = f(z), and thus f̂(M_z) = f(z).

We define a basic neighborhood of N ∈ A to be a set of the form

V = {M ∈ A : |f̂_j(M) − f̂_j(N)| < δ, j = 1, . . . , n}

for some δ > 0 and some f_1, . . . , f_n ∈ H∞.

Theorem 18.3 Suppose δ > 0, n ≥ 1, and f_1, . . . , f_n ∈ H∞ with

max_j |f_j(z)| ≥ δ

for each z ∈ D. Then there exist g_1, . . . , g_n ∈ H∞ with

f_1(z) g_1(z) + · · · + f_n(z) g_n(z) = 1

for each z ∈ D.

The f_j are called corona data, the g_j corona solutions.

The corona theorem is a corollary to the above theorem.

Theorem 18.4 The closure of {M_z : z ∈ D} is equal to A.

Proof. Suppose not. Then there exist N ∈ A and a neighborhood V of N of the form

V = {M ∈ A : |ĥ_j(M) − ĥ_j(N)| < δ, j = 1, . . . , n}

that contains no M_z.

Let f_j = h_j − ĥ_j(N), i.e., we normalize the h_j by subtracting a constant. Then

V = {M ∈ A : |f̂_j(M)| < δ, j = 1, . . . , n}.

By the normalization, f̂_j(N) = 0, so f_j ∈ N for each j.

If z ∈ D, then M_z /∈ V, so |f̂_j(M_z)| ≥ δ for some j, or equivalently, |f_j(z)| ≥ δ for some j. By the previous theorem, there exist g_1, . . . , g_n such that f_1 g_1 + · · · + f_n g_n = 1. But since each f_j ∈ N and N is an ideal, 1 ∈ N, a contradiction.

19 Operators and their spectra

19.1 Invertible maps

Proposition 19.1 Suppose X is a Banach space, and K : X → X is bounded and onto. Then there exists ε > 0 such that if |A| < ε, then K − A is onto.

Note we do not assume K is one-to-one. Compare this with the result that K invertible implies K − A is invertible if the norm of A is small enough.

Proof. By the open mapping theorem, there exists k such that for each z there is some x with Kx = z and |x| ≤ k|z|. Suppose |A| < 1/k. We show K − A is onto.

Fix u, let x_0 = 0, and define x_n recursively by

K x_{n+1} = A x_n + u.

We may take |x_1| ≤ k|u|, and since

K(x_{n+1} − x_n) = A(x_n − x_{n−1}),

we may choose the x_n so that

|x_{n+1} − x_n| ≤ k|A| |x_n − x_{n−1}|.

Since k|A| < 1, x_n converges. If x is the limit, taking the limit in the definition of x_{n+1} gives Kx = Ax + u, which is what we wanted.


Note

|x| ≤ ∑_n |x_{n+1} − x_n| ≤ ∑_n (k|A|)^n |x_1| ≤ (k/(1 − k|A|)) |u|.

Proposition 19.2 Suppose M is a bounded linear map from a Banach space X into itself. Then σ(M′) = σ(M).

Proof. Let K = λ − M. If K is invertible, there exists L such that KL = LK = I. Therefore L′K′ = K′L′ = I′ = I, or K′ has an inverse.

If X were reflexive, we could say K′ invertible implies K = K′′ is invertible. In general, we have K′′L′′ = L′′K′′ = I′′. Check that the restriction of K′′ to X is K and the restriction of I′′ to X is I; therefore the null space of K is trivial, and hence K is one-to-one.

The restriction of L′′ to X inverts K on R_K. Since L′′ is continuous, R_K is closed. If R_K ≠ X, we can use Hahn-Banach to find ` ≠ 0 annihilating R_K. Then ` ∈ N_{K′}. But K′ is invertible, a contradiction.

19.2 Shifts

Let R and L be the right and left shifts, respectively, on `^2. We have LR = I but RL ≠ I. Check that R′ = L and L′ = R.

Proposition 19.3 σ(R) = σ(L) = {λ : |λ| ≤ 1}.

Proof. L is a contraction, so |L| ≤ 1. Similarly |L^n| ≤ 1. Then |σ(L)| = lim |L^n|^{1/n} ≤ 1.

If |λ| < 1 and we set a_n = λ^n a_0 with a_0 ≠ 0, then a ∈ `^2 and La = λa. Therefore σ(L) contains the open unit ball. Since σ(L) is closed, this does it; and σ(R) = σ(L′) = σ(L) by Proposition 19.2.
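The geometric-sequence eigenvectors of L can be seen in a finite truncation of the shift; a sketch (the dimension and the value of λ are arbitrary choices):

```python
import numpy as np

# Truncate the left shift L(a_0, a_1, ...) = (a_1, a_2, ...) to n coordinates.
n = 60
L = np.diag(np.ones(n - 1), 1)     # superdiagonal of ones: (L a)_k = a_{k+1}

lam = 0.5 + 0.3j                   # any |lambda| < 1
a = lam ** np.arange(n)            # a_k = lambda^k a_0 with a_0 = 1; in l^2

# L a = lambda a except in the last coordinate, a truncation artifact
# of size |lambda|^n.
err = np.linalg.norm((L @ a - lam * a)[:-1])
print(err)
```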

19.3 Volterra integral operators

Let X = C[0, 1] and

(V x)(s) = ∫_0^s x(r) dr.


Proposition 19.4 σ(V) = {0}.

Proof. By induction and integration by parts,

V^n x(s) = (1/(n − 1)!) ∫_0^s (s − r)^{n−1} x(r) dr.

So

|V^n x(s)| ≤ (1/(n − 1)!) ∫_0^s (s − r)^{n−1} |x| dr ≤ |x|/n!.

Then |V^n| ≤ 1/n!, so |σ(V)| = lim |V^n|^{1/n} = 0.
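A discretization shows the factorial decay concretely; a sketch (the grid size and the left-endpoint rule are my choices):

```python
import numpy as np
from math import factorial

# Discretize (Vx)(s) = integral_0^s x(r) dr on an n-point grid of [0, 1]
# by the left-endpoint rule: V is strictly lower triangular with entries 1/n.
n = 200
V = np.tril(np.full((n, n), 1.0 / n), -1)

# |V^k| <= 1/k!, so |V^k|^{1/k} -> 0 even though |V| itself is of order 1.
Vk = np.eye(n)
for k in range(1, 21):
    Vk = Vk @ V
print(np.linalg.norm(Vk, 2), 1.0 / factorial(20))
```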

19.4 Fourier transform

We have F^2 = R, reflection, and R^2 = I, so F^4 = I. Therefore σ(F) ⊂ {±1, ±i} by the spectral mapping theorem.
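The unitary discrete Fourier transform matrix is a finite analogue of F and satisfies the same relations; a sketch (the size N = 16 is arbitrary):

```python
import numpy as np

N = 16
F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # unitary DFT matrix

F2 = F @ F          # a reflection permutation (indices negated mod N)
F4 = F2 @ F2
print(np.linalg.norm(F4 - np.eye(N)))    # F^4 = I

eig = np.linalg.eigvals(F)
print(np.max(np.abs(eig**4 - 1)))        # eigenvalues lie in {1, -1, i, -i}
```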

20 Compact maps

20.1 Basic properties

A subset S is precompact if its closure is compact. Recall that if A is a subset of a metric space, A is precompact if and only if every sequence in A has a subsequence which converges in the closure of A. Also, A is compact if and only if A is complete and totally bounded.

A map K from a Banach space X to a Banach space U is compact if K(B_1^X(0)) is precompact in U.

One example is if K is degenerate, so that R_K is finite dimensional. The identity on `^2 is not compact.

Here is a more complicated example. Let X = U = `^2 and define

K(a_1, a_2, a_3, . . .) = (a_1/2, a_2/2^2, a_3/2^3, . . .).

Take a sequence x_n in K(B_1(0)). The jth coordinates of the x_n are bounded by 2^{−j}. So there exists a subsequence such that the jth coordinates x_n^j converge. Use the diagonalization procedure to get a subsequence along which x_n^j converges for every j. The limit will have jth coordinate bounded by 2^{−j}, so x_n converges along the subsequence to an element of the closure of K(B_1(0)).

The following facts are easy:

(1) If C_1, C_2 are precompact subsets of a Banach space, then C_1 + C_2 is precompact.

(2) If C is precompact, so is the convex hull of C.

(3) If M : X → U is bounded and C is precompact in X, then M(C) is precompact in U.

Proposition 20.1 (1) If K_1 and K_2 are compact maps and k is a scalar, so is kK_1 + K_2.

(2) If L : X → U is compact and M : U → V is bounded, then ML is compact.

(3) In the same situation as (2), if L is bounded and M is compact, then ML is compact.

(4) If K_n are compact maps and lim |K_n − K| = 0, then K is compact.

We can use (4) to give another proof that our example K above is compact. It is the limit in norm of K_n, where

K_n(a_1, a_2, . . .) = (a_1/2, a_2/2^2, . . . , a_n/2^n, 0, . . .).
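For this diagonal example the rate of convergence is explicit: |K − K_n| = 2^{−(n+1)}, the largest deleted diagonal entry. A sketch on a finite section (the section size is my choice):

```python
import numpy as np

# A finite section of K(a_1, a_2, ...) = (a_1/2, a_2/2^2, ...).
m = 30
K = np.diag(0.5 ** np.arange(1, m + 1))    # diagonal entries 2^{-j}

for n in (1, 5, 10):
    Kn = K.copy()
    Kn[n:, :] = 0.0                        # keep only the first n coordinates
    print(n, np.linalg.norm(K - Kn, 2))    # equals 2^{-(n+1)}
```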

Proof. (1) For the sum, (K_1 + K_2)(B) ⊂ K_1(B) + K_2(B), and multiplication by k is similar.

(2) ML(B) will be precompact because the closure of L(B) is compact and M is continuous.

(3) L(B) will be contained in some ball, so ML(B) is precompact.

(4) Let ε > 0. Choose n such that |K_n − K| < ε. K_n(B) can be covered by finitely many balls of radius ε, so K(B) is covered by the balls with the same centers and radius 2ε. Therefore K(B) is totally bounded.

One way of rephrasing (2)-(4) is that the set of compact maps is a closed 2-sided ideal in L(X,X).


Proposition 20.2 If X and U are Banach spaces, K : X → U is compact, and Y is a closed subspace of X, then the map K|_Y is compact.

Proposition 20.3 If K : X → X is compact and Y is a closed invariant subspace of the Banach space X, then the induced map K : X/Y → X/Y is compact.

Recall the following fact, which we proved in Chapter 5: if X is a normed linear space and Y a proper closed linear subspace, then there exists x ∈ X with |x| = 1 and d(x, Y) ≥ 1/2.

Theorem 20.4 Suppose K : X → X is compact and T = I − K.

(1) dim(N_T) < ∞.

(2) If N_j = N_{T^j}, there exists i such that N_k = N_i for k > i.

(3) R_T is closed.

Proof. (1) y ∈ N_T implies y = Ky. So the unit ball in N_T is precompact. But if the unit ball of a space is compact, the space is finite dimensional.

(2) If not, N_{i−1} is a proper subset of N_i for all i. There exists y_i ∈ N_i such that |y_i| = 1 and d(y_i, N_{i−1}) > 1/2. If m < n,

Ky_n − Ky_m = y_n − Ty_n − y_m + Ty_m.

Now −Ty_n − y_m + Ty_m ∈ N_{n−1}. So |Ky_n − Ky_m| > 1/2. Therefore no subsequence of Ky_n converges, a contradiction.

(3) Suppose y_k → y, where y_k = Tx_k. We need to show y ∈ R_T. Let d_k = d(x_k, N_T).

Step 1. d_k is bounded: choose z_k ∈ N_T such that w_k = x_k − z_k satisfies |w_k| = |x_k − z_k| < 2d_k. Tz_k = 0, so Tw_k = Tx_k − Tz_k = y_k. Since y_k → y, {y_k} is bounded. If d_k is unbounded, then along a subsequence

T(w_k/d_k) = y_k/d_k → 0.

Let u_k = w_k/d_k, so |u_k| < 2. Then

Tu_k = u_k − Ku_k → 0.


Since Ku_k has a convergent subsequence, u_k does too. Suppose u_k → u (along the subsequence). Then since T is bounded, Tu = lim Tu_k = 0, or u ∈ N_T. But if z ∈ N_T, then |x_k − (z_k + d_k z)| ≥ d_k, so |w_k − d_k z| ≥ d_k, hence |u_k − z| ≥ 1; this contradicts u_k → u ∈ N_T.

Step 2. Since |w_k| < 2d_k and d_k is bounded, w_k is a bounded sequence. w_k − Kw_k = y_k → y. Kw_k has a convergent subsequence, so w_k does too, say with limit w. Then w − Kw = Tw = y, and R_T is closed.

Proposition 20.5 Suppose K : X → X is compact and T = I − K. If N_T = {0}, then R_T = X.

Proof. Since dim N_T = 0, T is one-to-one. Assume X_1 = R_T is a proper subset of X. Then X_2 = TX_1 is a proper subset of X_1. To see this, suppose u ∈ X and u /∈ X_1. Then Tu ∈ TX = X_1. But if Tu ∈ X_2, then Tu = Tv for some v ∈ X_1, and then u = v ∈ X_1 since T is one-to-one, a contradiction. Continue to define X_3 and so on.

We have X_k = R_{T^k}, and

T^k = (I − K)^k = I + ∑_{j=1}^k (−1)^j \binom{k}{j} K^j,

so T^k is equal to I plus a compact operator. Therefore X_k is closed.

Pick x_k ∈ X_k such that |x_k| = 1 and dist(x_k, X_{k+1}) > 1/2. Then for m < n,

Kx_m − Kx_n = x_m − Tx_m − x_n + Tx_n.

The last three terms are in X_{m+1}, so |Kx_n − Kx_m| ≥ 1/2, hence no subsequence converges, a contradiction to K being compact.

20.2 Spectral theory

Theorem 20.6 Suppose X is a Banach space and K : X → X is compact.

(1) σ(K) consists of countably many complex numbers λ_n, whose only possible accumulation point is 0. If dim X = ∞, then 0 ∈ σ(K).

(2) Each λ_j ≠ 0 is an eigenvalue:

(a) The null space of K − λ_j is finite dimensional.

(b) There exists i such that the null space of (K − λ_j)^k is the same as the null space of (K − λ_j)^i if k > i.

(b) says that the multiplicity of each nonzero eigenvalue is finite.

Proof. We prove (2) first. Suppose λ_j ∈ σ(K) and λ_j ≠ 0. Let T = I − λ_j^{−1} K. Then λ_j T = λ_j − K is not invertible, so its null space, which is the same as that of T, is finite dimensional. If N_T = {0}, then R_T = X, and then T is invertible. So N_T is larger than {0}. If z ∈ N_T, then (λ_j − K)z = 0, or z is an eigenvector. By the previous theorem, the multiplicity is finite.

Now we look at (1). Suppose λ_n is a sequence of distinct non-zero eigenvalues with corresponding eigenvectors x_n. Let Y_n be the linear space spanned by {x_1, . . . , x_n}. We claim the x_n are linearly independent. If not, we can write x_n = ∑_{j=1}^{n−1} c_j x_j, and then

λ_n x_n = Kx_n = ∑_{j=1}^{n−1} c_j Kx_j = ∑_{j=1}^{n−1} c_j λ_j x_j,

or

∑_{j=1}^{n−1} c_j x_j = ∑_{j=1}^{n−1} c_j (λ_j/λ_n) x_j,

which says {x_1, . . . , x_{n−1}} are linearly dependent. We then use induction. Therefore Y_{n−1} is a proper subset of Y_n. Choose y_n ∈ Y_n such that |y_n| = 1 and dist(y_n, Y_{n−1}) > 1/2. We can write

y_n = ∑_{j=1}^n a_j x_j.

So

Ky_n − λ_n y_n = ∑_{j=1}^n (λ_j − λ_n) a_j x_j ∈ Y_{n−1}.

Therefore if n > m,

Ky_n − Ky_m = (Ky_n − λ_n y_n) − (Ky_m − λ_m y_m) − λ_m y_m + λ_n y_n,

which equals λ_n y_n plus an element of Y_{n−1}, and so

|Ky_n − Ky_m| ≥ |λ_n|/2.

Since K is compact, for each δ > 0 there can only be finitely many λ_n with |λ_n| > δ.

If dim X = ∞, define the y_n as before. If 0 /∈ σ(K), then K is invertible. Since K is compact, there is a subsequence Ky_{n_j} which converges. But then y_{n_j} = K^{−1} Ky_{n_j} converges, a contradiction.

Theorem 20.7 K is compact if and only if K ′ is compact.

Proof. We want to show that if {`_n} ⊂ U∗ with |`_n| ≤ 1, then {K′`_n} has a Cauchy subsequence. Let J be the closure of K(B). The quantities |(`_n, u)| are uniformly bounded, and the `_n are equicontinuous on J:

|(`_n, u) − (`_n, v)| = |(`_n, u − v)| ≤ |u − v|.

J is compact, so by Ascoli-Arzelà there exists a uniformly convergent subsequence. So given ε, there exists N such that |(`_n, u) − (`_m, u)| < ε if n, m ≥ N and u ∈ J. So

ε > |(`_n − `_m, Kx)| = |(K′`_n − K′`_m, x)|

for all x ∈ B. Therefore |K′`_n − K′`_m| ≤ ε.

Conversely, if K′ is compact, then K′′ is compact. But K is the restriction of K′′ to X, so it is also compact.

21 Positive compact operators

We’ll do the Krein-Rutman theorem, which is a generalization of the Perron-Frobenius theorem.

Theorem 21.1 Suppose Q is compact and Hausdorff and X = C(Q), the complex-valued continuous functions on Q. Suppose K : C(Q) → C(Q) is compact. Suppose further that K maps real-valued functions to real-valued functions. Finally, suppose that whenever p ≥ 0 and p is not identically zero, then Kp is strictly positive. Then K has a positive eigenvalue σ of multiplicity one, the associated eigenfunction is positive, and all the other eigenvalues of K are strictly smaller in absolute value than σ.

Examples include matrices with all positive entries, the semigroup P_t at t = 1 for reflecting Brownian motion on a bounded interval, and

Kf(x) = ∫ K(x, y) f(y) µ(dy),

where K(x, y) is jointly continuous and positive and µ is a finite measure. It is an exercise to show that this operator K is compact.

Proof. If x ≤ y and x ≢ y, then y − x ≥ 0, so K(y − x) > 0, or Kx < Ky.

Step 1. Suppose κ > 0 is such that there exists x ∈ C(Q) with x ≥ 0 and κx ≤ Kx at all points of Q.

Step 2. There exists such a κ > 0: let x ≡ 1. Then Kx is strictly positive, and we can let κ = min Kx.

Now

κKx = K(κx) ≤ K(Kx) = K^2 x.

So

κ^2 x ≤ κKx ≤ K^2 x,

and in general,

κ^n x ≤ K^n x.

Since x ≥ 0,

κ^n |x| ≤ |K^n x| ≤ |K^n| |x|,

so

|σ(K)| = lim |K^n|^{1/n} ≥ κ.

Therefore |σ(K)| is strictly positive. Since K is compact, the set of eigenvalues of K is nonempty. We have shown that there exists a non-zero eigenvalue for K.

Step 3. K is compact, so there exist an eigenvalue λ and an eigenfunction z such that Kz = λz and |λ| = |σ(K)|. Let λ and z be any pair with |λ| = |σ(K)|.

(a) We claim: if y = |z| and σ = |σ(K)|, then σy ≤ Ky.

Proof: Let q ∈ Q. Multiply z by α ∈ C such that |α| = 1 and αλz(q) is real and non-negative; of course α depends on q. Write z = u + iv. Then

Ku(q) + iKv(q) = Kz(q) = λz(q).

Looking at the real part,

λz(q) = (Ku)(q).

Next, u ≤ |z| = y, and

|λ| y(q) = |λz(q)| = Ku(q) ≤ (Ky)(q). (21.1)

Then

σ y(q) ≤ Ky(q). (21.2)

Although z depends on α, which depends on q, neither σ nor y depends on q. Since q was arbitrary, the inequality (21.2) holds for all q.

(b) We claim σy = Ky.

Proof: If not, there exists q such that σy(q) < Ky(q). By continuity, there exist δ > 0 and a neighborhood N of q such that

σy(s) + δ ≤ Ky(s), s ∈ N.

Let p be continuous with p > 0 in N and p = 0 outside of N, so that Kp > 0.

We will find c, ε > 0 and set x = y + εp, κ = σ + cε, and get κx ≤ Kx. This will be a contradiction, since by Steps 1 and 2 the largest κ possible is less than or equal to |σ(K)| = σ.

Now Kp > 0, so there exists c ≤ 1 such that cy ≤ Kp. If s ∈ N,

Kx(s) = Ky(s) + εKp(s) ≥ Ky(s) + εc y(s) ≥ σy(s) + δ + εc y(s).

Now

κx(s) = (σ + cε)(y + εp)(s) = σy(s) + εc y(s) + σεp(s) + cε^2 p(s) ≤ Kx(s) − δ + σεp(s) + cε^2 p(s).

Since p is bounded above, we can take ε small enough so that the last expression is less than or equal to Kx(s).

If s /∈ N, then p(s) = 0 and

κx(s) = κy(s) = (σ + cε) y(s) = σy(s) + εc y(s) ≤ Ky(s) + εKp(s) = Kx(s).

Step 4. We next show that any other eigenvalue that has absolute value σ is in fact equal to σ. Let z be any eigenfunction corresponding to λ with |λ| = σ. Fix q ∈ Q. As before, we may assume λz(q) ≥ 0, and writing z = u + iv we get λz(q) = Ku(q). We have u ≤ |z| = y.

Suppose u < y at some point q′ ∈ Q. Then u ≤ y with u < y at one point means that Ku < Ky at every point, and so

|λ| y(q) = |λz(q)| = λz(q) = Ku(q) < Ky(q).

So σy(q) < Ky(q). But we showed σy = Ky. Therefore u is identically equal to y. This implies that z is real and non-negative, and then it follows that λ is real and positive. Since z = σ^{−1} Kz, z is strictly positive.

Step 5. Finally, we show σ has multiplicity 1. If not, there exist two linearly independent real eigenfunctions y_1, y_2. But some linear combination w of y_1, y_2 will be real and take both positive and negative values. As before, |w| will be an eigenfunction that is non-negative and must also take the value 0 somewhere; moreover the corresponding eigenvalue is σ. But then 0 < K|w| = σ|w|, a contradiction to |w| taking the value 0.
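In the finite-dimensional case (Perron-Frobenius), all of these conclusions are easy to observe with power iteration, since iterating K on a positive vector converges to the eigenvector of the largest eigenvalue. A sketch (the random positive matrix and iteration count are my choices):

```python
import numpy as np

# A matrix with strictly positive entries maps nonnegative nonzero vectors
# to strictly positive ones, so the theorem applies: a positive leading
# eigenvalue sigma of multiplicity one, with a strictly positive eigenvector.
rng = np.random.default_rng(4)
K = rng.uniform(0.1, 1.0, size=(5, 5))

x = np.ones(5)
for _ in range(500):          # power iteration
    x = K @ x
    x /= np.linalg.norm(x)
sigma = x @ K @ x             # Rayleigh-type estimate of the leading eigenvalue

eigvals = np.linalg.eigvals(K)
top = eigvals[np.argmax(np.abs(eigvals))]
print(sigma, top)             # the leading eigenvalue is real and positive
print(x.min())                # the eigenvector is strictly positive
```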

22 Invariant subspaces

22.1 Compact maps

There exist operators without any eigenvectors, but they may still have non-trivial invariant subspaces. Y is a non-trivial invariant subspace for K if Y is a closed proper subspace of X, Y ≠ {0}, and KY ⊂ Y. Note that if K has an eigenvector y, then the linear subspace spanned by {y} is a non-trivial invariant subspace. What if the only element of σ(K) is the point 0?


Theorem 22.1 Let X be a Banach space over C of dimension greater than 1 and let K : X → X be compact. Then K has a non-trivial invariant subspace.

Proof. Suppose K ≠ 0 and normalize so that |K| = 1. Choose x_0 ∈ X such that |x_0| > 1 and |Kx_0| > 1. Let B = B_1(x_0), and note 0 /∈ B. Let D be the closure of K(B); D is compact. Since |x_0| > 1 and |K| = 1, 0 /∈ D.

Suppose K has no non-trivial invariant subspaces. Given y ≠ 0, the set {p(K)y : p a polynomial} is invariant under K. Its closure is a closed invariant subspace, so it must be all of X. Therefore {p(K)y} is dense in X.

Since 0 /∈ D, if y ∈ D, there exists a polynomial p such that |p(K)y − x_0| < 1. The set of z satisfying |p(K)z − x_0| < 1 is an open set containing y. D is compact, so it can be covered by finitely many such sets. Therefore there exist p_1, . . . , p_N such that if y ∈ D, then |p_i(K)y − x_0| < 1 for some p_i.

Let K_i = p_i(K). If y ∈ D, then |K_i y − x_0| < 1 for at least one i.

Now x_0 ∈ B and Kx_0 ∈ D. If y = Kx_0, there exists i_1 such that |K_{i_1} Kx_0 − x_0| < 1. So K_{i_1} Kx_0 ∈ B, hence K K_{i_1} Kx_0 ∈ D. Let y = K K_{i_1} Kx_0; then there exists i_2 such that

|K_{i_2} K K_{i_1} K x_0 − x_0| < 1.

Continue, so that

|∏_{k=1}^n (K_{i_k} K) x_0 − x_0| < 1.

Hence

|∏_{k=1}^n (K_{i_k} K) x_0| > |x_0| − 1 > 0.

The K_i's and K commute, so

|(∏_{k=1}^n K_{i_k}) K^n x_0| > |x_0| − 1.

Let c = sup_i |K_i|. Therefore

c^n |K^n| |x_0| > |x_0| − 1.

Then

c |K^n|^{1/n} |x_0|^{1/n} > (|x_0| − 1)^{1/n}.

Letting n → ∞, |σ(K)| ≥ 1/c > 0. By spectral theory, σ(K) contains points other than 0, and these are eigenvalues. The corresponding eigenspace is invariant under K, a contradiction.

23 Compact symmetric operators

Let H be a complex Hilbert space. A : H → H is Hermitian (symmetric) if A = A∗, that is, (Ax, y) = (x, Ay) for all x, y.

Proposition 23.1 (1) (Ax, x) is real.

(2) (Ax, x) is not identically 0 unless A = 0.

Proof. (1) (Ax, x) = (x, Ax) = \overline{(Ax, x)}, so (Ax, x) is real.

(2) If (Ax, x) = 0 for all x, then

0 = (A(x + y), x + y) = (Ax, x) + (Ay, y) + (Ax, y) + (Ay, x) = (Ax, y) + (y, Ax) = (Ax, y) + \overline{(Ax, y)}.

So Re (Ax, y) = 0. Replacing x by ix, Re (i(Ax, y)) = 0, so Im (Ax, y) = 0. Hence (Ax, y) = 0 for all x, y, and A = 0.

If (Ax, x) ≥ 0 for all x, we say A is positive, and write A ≥ 0. Writing A ≤ B means B − A ≥ 0. For matrices, one uses positive definite.

Now suppose A is compact.

Proposition 23.2 If x_n converges weakly to x, then Ax_n converges strongly to Ax.

Proof. If x_n → x weakly, then Ax_n → Ax weakly, since (Ax_n, y) = (x_n, Ay) → (x, Ay) = (Ax, y). If x_n converges weakly, then ‖x_n‖ is bounded, so Ax_n lies in a precompact set.

Any subsequence of Ax_n has a further subsequence which converges strongly. The limit must be Ax. Therefore Ax_n → Ax strongly.

Theorem 23.3 (Spectral theorem) Suppose H is a complex Hilbert space and A is compact and symmetric. There exist z_n ∈ H such that {z_n} is an orthonormal basis for H, each z_n is an eigenvector of A, the eigenvalues are real, and their only possible point of accumulation is 0.

Proof. If A = 0, any orthonormal basis will do. So suppose A ≠ 0. Let

M = sup_{‖x‖=1} (Ax, x).

We may assume M > 0, since there exists x such that (Ax, x) ≠ 0. (Look at −A if necessary.) We have (Ax, x) ≤ ‖Ax‖ ‖x‖, so M ≤ ‖A‖.

We claim the maximum is attained. Choose xn with ‖xn‖ = 1 such that (Axn, xn) → M. There exists a subsequence, also denoted xn, which converges weakly, say to z. Since A is compact, Axn → Az strongly. So (Az, z) = lim(Axn, xn) = M. Since ‖xn‖ ≤ 1, ‖z‖ ≤ 1, and M > 0 implies z ≠ 0. Let y = z/‖z‖. Then (Ay, y) = M/‖z‖². If ‖z‖ < 1, then (Ay, y) > M, a contradiction. Therefore ‖z‖ = 1.

Note z maximizes

R_A(x) = (Ax, x)/‖x‖²

over all x ≠ 0.

Let w ∈ H, t ∈ R. Then R_A(z + tw) ≤ R_A(z), so (∂/∂t) R_A(z + tw)|_{t=0} = 0. Doing the calculation,

[(Aw, z) + (Az, w)]/‖z‖² − (Az, z)[(w, z) + (z, w)]/‖z‖⁴ = 0.

This implies

Re (Az − Mz, w) = 0.

This is true for all w, so Az − Mz = 0; that is, z is an eigenvector.


Suppose we have found eigenvectors z1, z2, . . . , zn. Let Y be the orthogonal complement of the linear subspace spanned by {z1, . . . , zn}. If x ∈ Y, then

(Ax, zk) = (x, Azk) = λk(x, zk) = 0,

so Ax ∈ Y. We then look at A|_Y, which is still compact and symmetric, and find a new eigenvector zn+1.

We next prove that the set of eigenvectors forms a basis. Suppose y is orthogonal to every eigenvector. Then

(Ay, zi) = (y, Azi) = (y, αi zi) = 0

if zi is an eigenvector with eigenvalue αi, so Ay is also orthogonal to every eigenvector. If Y is the orthogonal complement of {zi} and Y ≠ {0}, then A : Y → Y and A|_Y is compact and symmetric, so there exists an eigenvector for A|_Y, a contradiction since Y is orthogonal to every eigenvector.

It remains to show that 0 is the only accumulation point. If αn ≠ αm, then

αn(zn, zm) = (Azn, zm) = (zn, Azm) = αm(zn, zm),

so (zn, zm) = 0. We can take the zn to have norm 1.

Now if αn → α ≠ 0, then using the compactness of A, there is a subsequence, also called zn, such that Azn converges strongly, say to w. Then

zn = (1/αn) Azn → (1/α) w.

But ‖zn − zm‖² = 2 if n ≠ m, a contradiction to zn converging.

If α1 ≥ α2 ≥ · · · > 0 and Azn = αn zn, then our construction shows that

αN = max_{x ⊥ z1, . . . , zN−1} (Ax, x)/‖x‖².

This is known as the Rayleigh principle.
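In finite dimensions the Rayleigh principle is easy to check numerically. Here is a minimal sketch (an assumed numpy example, not from the notes): the Rayleigh quotient of a random symmetric matrix is maximized at the top eigenvector, and no vector exceeds the top eigenvalue.

```python
import numpy as np

# Hypothetical finite-dimensional check of the Rayleigh principle:
# max over x != 0 of R_A(x) = (Ax, x)/||x||^2 is the largest eigenvalue.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                       # symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)    # eigh returns ascending eigenvalues
alpha_max = eigvals[-1]
z = eigvecs[:, -1]                      # corresponding eigenvector

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

# The top eigenvector attains the maximum...
assert np.isclose(rayleigh(A, z), alpha_max)
# ...and random vectors never exceed it (up to roundoff).
for _ in range(100):
    x = rng.standard_normal(5)
    assert rayleigh(A, x) <= alpha_max + 1e-12
```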

Let

R_A(x) = (Ax, x)/‖x‖².


Proposition 23.4 Let A be compact and symmetric, with eigenvalues αk ordered so that α1 ≥ α2 ≥ · · · . Then

(1) (Fischer's principle)

αN = max_{S_N} min_{x∈S_N} R_A(x),

where the max is over linear subspaces S_N of dimension N.

(2) (Courant's principle)

αN = min_{S_{N−1}} max_{x⊥S_{N−1}} R_A(x),

where the min is over linear subspaces S_{N−1} of dimension N − 1.

Proof. Let z1, . . . , zN be eigenvectors with corresponding eigenvalues α1 ≥ α2 ≥ · · · ≥ αN. Let T_N be the linear subspace spanned by {z1, . . . , zN}. If y ∈ T_N, we have y = ∑_{j=1}^N cj zj for some complex numbers cj, and then

(Ay, y) = ∑_{i=1}^N ∑_{j=1}^N ci c̄j (Azi, zj) = ∑_i ∑_j ci c̄j αi (zi, zj)

= ∑_i |ci|² αi ≥ ∑_i |ci|² αN = αN (y, y),

using the fact that the zi are orthonormal by our construction.

(1) Let the zk be the eigenvectors. Let S_N be a subspace of dimension N. Since N vectors cannot satisfy N − 1 independent linear constraints only trivially, there exists y ∈ S_N, y ≠ 0, such that (y, zk) = 0 for k = 1, . . . , N − 1. Since

αN = max_{x ⊥ z1, . . . , zN−1} R_A(x),

y is one of the vectors over which the max is being taken, so R_A(y) ≤ αN. Hence min_{x∈S_N} R_A(x) ≤ αN. This is true for all subspaces of dimension N, so the right-hand side is less than or equal to αN.

But if S_N is the linear span of {z1, . . . , zN}, the minimum of R_A on S_N is αN, achieved at zN.

(2) If dim S_{N−1} = N − 1, then T_N, the span of {z1, . . . , zN}, contains a nonzero vector y perpendicular to S_{N−1}. Since y ∈ T_N, R_A(y) ≥ αN, so max_{x⊥S_{N−1}} R_A(x) ≥ αN. Taking S_{N−1} = T_{N−1}, we get equality.
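The min-max characterization can be verified numerically. A sketch (an assumed finite-dimensional example): restricted to the orthogonal complement of the top N − 1 eigenvectors, the maximum of the Rayleigh quotient is exactly αN.

```python
import numpy as np

# Assumed example: check that the max of R_A over the complement of
# span{z_1, ..., z_{N-1}} equals the N-th eigenvalue alpha_N.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B @ B.T + np.eye(6)                  # positive definite, so all alpha_k > 0
eigvals, eigvecs = np.linalg.eigh(A)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # alpha_1 >= alpha_2 >= ...

N = 3
Z = eigvecs[:, :N - 1]                   # z_1, ..., z_{N-1}
P = np.eye(6) - Z @ Z.T                  # projection onto their orthogonal complement
# On that complement, P A P acts as A with eigenvalues alpha_N, ..., alpha_6
# (and 0 on the span of z_1, ..., z_{N-1}); its top eigenvalue is alpha_N.
alpha_N_numeric = np.linalg.eigvalsh(P @ A @ P)[-1]
assert np.isclose(alpha_N_numeric, eigvals[N - 1])
```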


Proposition 23.5 Suppose A ≤ B with eigenvalues αk, βk, respectively, ordered to be decreasing. Then αk ≤ βk for all k.

Proof. A ≤ B implies (Ax, x) ≤ (Bx, x), so R_A(x) ≤ R_B(x). Now use either Fischer's or Courant's principle.

Theorem 23.6 Suppose A is compact and symmetric, and f defined on σ(A) is bounded and complex-valued. We can define f(A) such that

(1) If f1(σ) ≡ 1, then f1(A) = I.

(2) If f2(σ) ≡ σ, then f2(A) = A.

(3) f → f(A) is an isomorphism of the ring of bounded functions on σ(A) into the algebra of bounded maps of H into H.

(4) ‖f(A)‖ = sup_{σ∈σ(A)} |f(σ)|.

(5) If f is real, f(A) is symmetric.

(6) If f ≥ 0 on σ(A), then f(A) is positive.

Proof. If {zn} is an orthonormal basis of eigenvectors and x = ∑ cn zn, set

f(A)x = ∑ f(αn) cn zn.

If A is positive, then σ(A) ⊂ [0, ∞). We can define √A by taking f(λ) = √λ.
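For matrices, the recipe f(A)x = ∑ f(αn) cn zn is just "apply f to the eigenvalues in an eigenbasis." A minimal sketch (an assumed numpy example) for f = √·:

```python
import numpy as np

# Assumed example of the functional calculus: f(A) = Z diag(f(alpha)) Z^T
# where A = Z diag(alpha) Z^T is the spectral decomposition.
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)                  # positive symmetric

def f_of_A(f, A):
    w, Z = np.linalg.eigh(A)             # eigenvalues w, orthonormal eigenvectors Z
    return Z @ np.diag(f(w)) @ Z.T

sqrtA = f_of_A(np.sqrt, A)
assert np.allclose(sqrtA @ sqrtA, A)     # (sqrt A)^2 = A
assert np.allclose(sqrtA, sqrtA.T)       # sqrt A is symmetric
```

Properties (1)-(6) of Theorem 23.6 can all be checked the same way on matrices.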

24 Examples of compact symmetric operators

Let

Lu(x) = −u′′(x) + q(x)u(x)

for u ∈ C²[0, 2π] with u(0) = u(2π) = 0. Suppose q is continuous and q ≥ 1 for all x.

Fact from PDE: Lu = f with u(0) = 0 and u(2π) = 0 has a unique solution if f is continuous.

Write u = Af.


Lemma 24.1 The set

C = {u : ∫_0^{2π} (u′)² ≤ 1, ∫_0^{2π} u² ≤ 1}

is precompact in L².

Proof. If 0 ≤ x ≤ y ≤ 2π, then

|u(y) − u(x)| = |∫_x^y u′(z) dz| ≤ |y − x|^{1/2} (∫_0^{2π} |u′(z)|² dz)^{1/2}

using Cauchy-Schwarz. Thus the functions in C are equicontinuous. If |u(y)| ≥ 100 for some u ∈ C and some y, then

|u(x)| ≥ |u(y)| − |u(x) − u(y)| ≥ 100 − (2π)^{1/2}

for all x, and then ∫u² > 1, a contradiction to u being in C. Therefore the functions in C are uniformly bounded. By the Ascoli-Arzelà theorem, every sequence in C has a subsequence which converges uniformly on [0, 2π], and hence in L². Therefore C is precompact.

Proposition 24.2 A is bounded, compact, symmetric, and positive.

Proof. If u ∈ C² and Lu = f, then by integration by parts,

∫_0^{2π} uf dx = ∫_0^{2π} uLu dx = ∫ ((u′)² + qu²).

Now

∫ uf ≤ (∫u²)^{1/2} (∫f²)^{1/2} ≤ ½ ∫u² + ½ ∫f²,

using the inequality xy ≤ ½(x² + y²). Since q ≥ 1,

∫(u′)² + ½ ∫u² ≤ ½ ∫f².

This implies A is bounded.


(Since

∫uf = ∫(au)(f/a) ≤ (a²/2) ∫u² + (1/(2a²)) ∫f²,

all we really need is that q is bounded below by a positive constant.)

To show A is symmetric, note (Lu, v) = (u, Lv). Letting f = Lu and g = Lv, this reads (f, Ag) = (Af, g).

To see that A is positive, note

(Af, f) = ∫uf = ∫((u′)² + qu²) ≥ 0.

It remains to prove A is compact.

Since ∫(u′)² + ½ ∫u² ≤ ½ ∫f², if f is in the unit ball in L², then Af has L² norm bounded by 1 and ∫((Af)′)² ≤ 1/2. By the lemma, the image of the unit ball under A is precompact.

We can extend A to all of L². Let en be the eigenfunctions and αn the eigenvalues, so Aen = αn en. Since ‖e′n‖ ≤ c‖Aen‖ < ∞, each en is continuous. Also Len = αn^{−1} en. The αn → 0, so αn^{−1} → ∞.

Remark 24.3 The above also applies to Lu = −∆u + qu in a domain D with u = 0 on the boundary.

For an example, take q(x) ≡ c. Solving

−f′′(x) + cf(x) = λf(x), f(0) = f(2π) = 0,

we see that the solutions are en(x) = cn sin(nx/2), with eigenvalues λn = n²/4 + c. From our theory we know the en are orthogonal and, suitably normalized, form a complete orthonormal basis.
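One can see these eigenvalues numerically. A rough sketch (the finite-difference discretization is an assumption, not part of the notes): replacing −u′′ by the standard second-difference matrix on [0, 2π] with Dirichlet conditions gives eigenvalues close to n²/4 + c.

```python
import numpy as np

# Assumed finite-difference discretization of Lu = -u'' + cu on [0, 2pi]
# with u(0) = u(2pi) = 0; eigenvalues should approximate n^2/4 + c.
c, n_pts = 1.0, 500
h = 2 * np.pi / (n_pts + 1)
main = np.full(n_pts, 2.0 / h**2 + c)          # diagonal: -u'' stencil plus c
off = np.full(n_pts - 1, -1.0 / h**2)          # off-diagonal of the stencil
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

lam = np.linalg.eigvalsh(L)[:4]                # four smallest eigenvalues
expected = np.array([n**2 / 4 + c for n in (1, 2, 3, 4)])
assert np.allclose(lam, expected, rtol=1e-3)
```

The corresponding discrete eigenvectors approximate sin(nx/2) on the grid.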

25 Trace class operators

25.1 Polar decomposition

Proposition 25.1 Suppose T : H → H is compact. Then T = UA, where A is a positive symmetric operator, U∗U = I on the range of A, and U = 0 on R_A^⊥.


This is called the polar decomposition and is the analog of writing a complex number as re^{iθ}.

Proof. T∗T is a non-negative symmetric compact operator, since (T∗Tu, u) = (Tu, Tu) ≥ 0. So there exists a square root: A = (T∗T)^{1/2}. Then

‖Tu‖² = (Tu, Tu) = (u, T∗Tu) = (u, A²u) = (Au, Au) = ‖Au‖². (1)

So if Au = Av, then A(u − v) = 0, so T(u − v) = 0, so Tu = Tv. Define U : R_A → H by U(Au) = Tu.

By (1), U is an isometry on R_A. Define Un = 0 if n ∈ R_A^⊥. For such n, (Un, v) = (n, U∗v) = 0 for all v, so U∗ : H → (R_A^⊥)^⊥ ⊂ R_A.

Write R for R_A. We claim U∗Uw = w for w ∈ R. If z, w ∈ R,

(z, w) = (Uz, Uw) = (z, U∗Uw)

since U is an isometry. So (z, U∗Uw − w) = 0, i.e., U∗Uw − w ⊥ R. Since U∗ : H → R, U∗Uw − w is in R and is orthogonal to R, hence is 0. Therefore U∗Uw = w.

The only place we used the compactness was in defining (T∗T)^{1/2}. We'll see later that we can define the positive square root of T∗T for any bounded linear operator T.

T compact implies that T ∗T is compact, so A is compact. Let sj be theeigenvalues of A. These are called the singular values of T .
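In finite dimensions the polar decomposition can be built from the SVD. A sketch (an assumed numpy example): for T = WΣVᵀ one has A = VΣVᵀ = (TᵀT)^{1/2} and U = WVᵀ, and the eigenvalues of A are the singular values.

```python
import numpy as np

# Assumed matrix illustration of T = U A with A = (T^T T)^{1/2} positive
# symmetric, built from the SVD T = W diag(S) V^T.
rng = np.random.default_rng(3)
T = rng.standard_normal((4, 4))
W, S, Vt = np.linalg.svd(T)

A = Vt.T @ np.diag(S) @ Vt               # (T^T T)^{1/2}, positive symmetric
U = W @ Vt                               # isometry (here: an orthogonal matrix)

assert np.allclose(U @ A, T)             # T = U A
assert np.allclose(A, A.T)               # A symmetric
assert np.allclose(U.T @ U, np.eye(4))   # U*U = I
# The eigenvalues of A are the singular values s_j(T).
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(S))
```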

25.2 Trace class

Let T : H → H be compact. T is of trace class if

‖T‖_tr = ∑ sj(T) < ∞.

Proposition 25.2 Let T be of trace class and B bounded. Then

(1) ‖T‖_tr = ‖T∗‖_tr.

(2) ‖BT‖_tr ≤ ‖B‖ ‖T‖_tr.

(3) ‖TB‖_tr ≤ ‖B‖ ‖T‖_tr.

(4) ‖S + T‖_tr ≤ ‖S‖_tr + ‖T‖_tr.


Proof. (1) We will show sj(T) = sj(T∗). Since sj(T∗) is an eigenvalue of (T∗∗T∗)^{1/2} = (TT∗)^{1/2}, it suffices to show that TT∗ and T∗T have the same nonzero eigenvalues.

Let z, λ be an eigenvector-eigenvalue pair for T∗T with λ ≠ 0, so that T∗Tz = λz. Then TT∗(Tz) = λTz, and λ is an eigenvalue for TT∗. (Tz ≠ 0 since λ ≠ 0.)

(2) We will show sj(BT) ≤ ‖B‖ sj(T). We have

(T∗B∗BTu, u) = ‖BTu‖² ≤ ‖B‖² ‖Tu‖² = ‖B‖² (T∗Tu, u).

So

(BT)∗BT ≤ ‖B‖² T∗T,

hence

sj²(BT) ≤ ‖B‖² sj²(T).

(3)

sj(TB) = sj(B∗T∗) ≤ ‖B∗‖ sj(T∗) = ‖B‖ sj(T).

(4) We will show

‖T‖_tr = sup ∑_n |(Tfn, en)|,

where the supremum is over all orthonormal bases {fn}, {en}.

Let zj be normalized eigenvectors for A: Azj = sj zj, ‖zj‖ = 1. Then

f = ∑ (f, zj) zj, Af = ∑ sj (f, zj) zj.

Then

Tf = UAf = ∑ sj (f, zj) wj,

where wj = Uzj; the wj are an orthonormal set in R = R_A. Then

(Tf, e) = ∑ sj (f, zj)(wj, e),

and so

∑_n (Tfn, en) = ∑_n ∑_j sj (fn, zj)(wj, en).


By Cauchy-Schwarz, the right-hand side is at most in absolute value

∑_j sj (∑_n |(fn, zj)|²)^{1/2} (∑_n |(wj, en)|²)^{1/2} = ∑_j sj (‖zj‖² ‖wj‖²)^{1/2} = ∑ sj = ‖T‖_tr,

using Parseval. Therefore sup ∑ |(Tfn, en)| ≤ ‖T‖_tr.

Now let fn = zn and en = wn (supplemented arbitrarily on (R_A)^⊥). Then (Tfn, en) = sn, and the sup is attained.

In particular,

‖S + T‖_tr = sup ∑_n |((S + T)fn, en)|,

and we use subadditivity on the right-hand side.

If {fn} is an orthonormal basis, we define tr T = ∑ (Tfn, fn). This is comparable with the trace of a matrix, which is the sum of the diagonal elements.

Proposition 25.3 The definition of trace is independent of the choice of orthonormal basis. If T is of trace class, then the sum converges absolutely.

Proof. We have

∑ (Tfn, en) = ∑∑ sj (fn, zj)(wj, en).

Take en = fn to get

∑ (Tfn, fn) = ∑∑ sj (fn, zj)(wj, fn).

We have already shown that the right-hand side converges absolutely, and the sum is bounded by ‖T‖_tr. By Parseval,

∑_n (wj, fn)(fn, zj) = (wj, zj),

so

∑ (Tfn, fn) = ∑_j sj (wj, zj),

which is independent of the sequence fn. Here zj are the eigenvectors of A and wj = Uzj.

Proposition 25.4 (1) |tr T| ≤ ‖T‖_tr.

(2) tr T is a linear function of T.

(3) tr T∗ is the complex conjugate of tr T.

(4) If B is bounded, tr BT = tr TB.

Proof. (1) has been done already. (2) and (3) follow from

tr T = ∑ (Tfn, fn).

For (4),

Tf = UAf = U(∑ sj (f, zj) zj) = ∑_j sj (f, zj) wj.

Then BTfn = ∑ sj (fn, zj) Bwj, so using Parseval,

∑ (BTfn, fn) = ∑∑ sj (fn, zj)(Bwj, fn) = ∑ sj (Bwj, zj).

From Tf = ∑ sj (f, zj) wj,

TBf = ∑ sj (Bf, zj) wj = ∑ sj (f, B∗zj) wj.

So

tr TB = ∑ (TBfn, fn) = ∑∑ sj (fn, B∗zj)(wj, fn) = ∑ sj (wj, B∗zj) = ∑ sj (Bwj, zj),

which agrees with tr BT.


Theorem 25.5 If T is of trace class, compact, and symmetric, then tr T = ∑ λj, the sum of the eigenvalues.

Proof. Let fn be an orthonormal basis consisting of eigenvectors of T . Then

tr T = ∑ (Tfn, fn) = ∑ λn (fn, fn) = ∑ λn.

This theorem is true under much weaker assumptions on T .
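Both halves of the trace story are easy to see for matrices. A sketch (an assumed numpy example, not from the notes): the sum ∑ (Tfn, fn) does not depend on the orthonormal basis, and it equals the sum of the eigenvalues.

```python
import numpy as np

# Assumed finite-dimensional check: tr T = sum_n (T f_n, f_n) is
# basis-independent and equals the sum of the eigenvalues.
rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
T = (B + B.T) / 2                                   # symmetric

# A random orthonormal basis via QR.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

trace_standard = np.trace(T)                        # standard basis
trace_rotated = sum(Q[:, n] @ T @ Q[:, n] for n in range(5))

assert np.isclose(trace_standard, trace_rotated)
assert np.isclose(trace_standard, np.linalg.eigvalsh(T).sum())
```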

Define K : L²[0, 1] → L²[0, 1] by

Ku(s) = ∫_0^1 K(s, t) u(t) dt,

overloading K for both the operator and its kernel. K∗ has kernel K(t, s).

Suppose the kernel K is continuous, symmetric, and real-valued. Then the operator K is compact. This is an exercise, but the idea is to use equicontinuity of Ku:

|Ku(s) − Ku(v)| = |∫_0^1 [K(s, t) − K(v, t)] u(t) dt| ≤ (∫_0^1 |K(s, t) − K(v, t)|² dt)^{1/2} (∫_0^1 u(t)² dt)^{1/2}.

So K(B), the image of the unit ball B, is a compact subset of C[0, 1] ⊂ L²[0, 1].

Therefore there exists a complete orthonormal system of eigenpairs (κj, ej). Since K : L² → C[0, 1], ej = κj^{−1} Kej is continuous if κj ≠ 0.

Theorem 25.6 (Mercer) Suppose K is real-valued, symmetric, and continuous, and K is positive: (Ku, u) ≥ 0 for all u ∈ H. Then

K(s, t) = ∑_j κj ej(s) ej(t),

and the series converges uniformly and absolutely.

An example is to let K = Pt, the transition density of absorbing or reflecting Brownian motion.


Proof. First, K ≥ 0 on the diagonal. If not, say K(r, r) < 0, then K(s, t) < 0 if |s − r|, |t − r| < δ for some δ. Take u = 1_{[r−δ/2, r+δ/2]}. Then

(Ku, u) = ∫∫ K(s, t) u(t) u(s) ds dt < 0,

a contradiction.

Let K_N(s, t) = ∑_{j=1}^N κj ej(s) ej(t). K − K_N is a positive operator, for its eigenvectors are the ej with eigenvalues κj, j > N. So K − K_N is non-negative on the diagonal:

0 ≤ K(s, s) − ∑_{j=1}^N κj ej(s)².

Each term is non-negative, so the sum converges for each s. By Dini's theorem, ∑_{j=1}^N κj ej(s)² converges uniformly and absolutely in s. By Cauchy-Schwarz, K_N(s, t) converges uniformly in s and t.

Let the limit be K∞. We need to show K∞ = K. K and K∞ have the same eigenfunctions and the same eigenvalues. They both map all functions that are orthogonal to all the ej's to 0. Therefore Ku = K∞u for all u, so they have the same kernel.

Setting s = t,

K(s, s) = ∑ κj ej(s)².

Integrating over s,

∫_0^1 K(s, s) ds = ∑ κj.

Therefore, since the left-hand side is finite, K is of trace class. This is true more generally: K need not be symmetric.

K : H → H is a Hilbert-Schmidt operator if there exists an orthonormal basis {ej} with ∑ ‖Kej‖² < ∞. We will not prove this, but K is Hilbert-Schmidt if and only if ∑ sj(K)² < ∞.
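For matrices this equivalence is the familiar identity between the Frobenius norm (sum of squared column norms) and the sum of squared singular values. A minimal sketch (an assumed example):

```python
import numpy as np

# Assumed check: sum_j ||K e_j||^2 over the standard basis equals
# sum_j s_j(K)^2, i.e. the squared Frobenius norm.
rng = np.random.default_rng(5)
K = rng.standard_normal((6, 6))

# K e_j is the j-th column of K.
col_norms_sq = sum(np.linalg.norm(K[:, j])**2 for j in range(6))
singular_sq = np.sum(np.linalg.svd(K, compute_uv=False)**2)

assert np.isclose(col_norms_sq, singular_sq)
```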

26 Spectral theory of symmetric operators

We let M be a symmetric operator on a complex Hilbert space, so that (Mx, y) = (x, My).


If A is compact, we can write x = ∑ an en and Ax = ∑ λn an en. Let En be the projection onto the eigenspace with eigenvalue λn, so x = ∑ En x and Ax = ∑ λn En x.

If we define a projection-valued measure E(S) by

E(S) = ∑_{λn∈S} En

for S a Borel subset of R, then x = ∫ E(dλ)x and Ax = ∫ λ E(dλ)x.

Here E is a pure point measure. In general, we get the same result, but E might not be pure point.
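In the compact (here: matrix) case, the projection-valued measure is concrete enough to compute with. A sketch (an assumed numpy example): E(S) sums the spectral projections with eigenvalue in S, E(R) = I, disjoint sets give orthogonal projections, and ∫ λ E(dλ) rebuilds A.

```python
import numpy as np

# Assumed finite-dimensional projection-valued measure for a symmetric matrix.
rng = np.random.default_rng(8)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
lam, Z = np.linalg.eigh(A)                  # eigenvalues lam, eigenvectors Z

def E(S):
    """E(S): sum of rank-one projections E_n with eigenvalue lam[n] in S."""
    P = np.zeros((5, 5))
    for n in range(5):
        if S(lam[n]):
            P += np.outer(Z[:, n], Z[:, n])
    return P

assert np.allclose(E(lambda l: True), np.eye(5))        # E(R) = I
# Ax = integral of lambda E(dlambda) x, i.e. A = sum_n lam_n E_n.
A_rebuilt = sum(lam[n] * np.outer(Z[:, n], Z[:, n]) for n in range(5))
assert np.allclose(A_rebuilt, A)
# Disjoint Borel sets give orthogonal projections.
P_neg, P_pos = E(lambda l: l < 0), E(lambda l: l >= 0)
assert np.allclose(P_neg @ P_pos, np.zeros((5, 5)))
```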

Proposition 26.1 (1) If B is bounded and symmetric, then (Bx, y) is bounded, linear in x, and skew linear in y (that is, (Bx, cy) = c̄(Bx, y)).

(2) If b(x, y) is skew-symmetric (b(y, x) is the complex conjugate of b(x, y)), linear in x, and |b(x, y)| ≤ c‖x‖ ‖y‖, then b(x, y) = (x, By), where B is bounded, symmetric, and ‖B‖ ≤ c.

Proof. (1) is easy.

(2) Fix y and let ℓ(x) = b(x, y); then |ℓ(x)| ≤ c‖x‖ ‖y‖. So there exists w, depending on y, such that b(x, y) = (x, w). Taking x = w,

‖w‖² = b(w, y) ≤ c‖w‖ ‖y‖,

so ‖w‖ ≤ c‖y‖. Define B by w = By. Then (x, By) = b(x, y), which is the conjugate of b(y, x) = (y, Bx); hence (x, By) = (Bx, y), so B is symmetric.

26.1 Spectrum of symmetric operators

Proposition 26.2 If M is bounded and symmetric, then σ(M) ⊂ R.


Proof. Let λ = α + iβ with β ≠ 0. We want to show that λ is in the resolvent set. Define B(x, y) = (x, (M − λ)y). B is linear in x and skew linear in y, and

|B(x, y)| ≤ ‖x‖ ‖(M − λ)y‖ ≤ ‖x‖ ‖y‖ (‖M‖ + |λ|).

Now

B(y, y) = (y, (M − λ)y) = (y, My) − λ̄(y, y).

Since (x, Mx) = (Mx, x), and these two are conjugates of each other, (x, Mx) is real. Therefore

|B(y, y)| ≥ |Im B(y, y)| = |β| ‖y‖².

By the Lax-Milgram lemma, if z ∈ H and ℓ(x) = (x, z), there exists y such that B(x, y) = ℓ(x) for all x. So

(x, (M − λ)y) = B(x, y) = (x, z).

This is true for all x, so z = (M − λ)y, which proves M − λ is invertible.

Proposition 26.3 |σ(M)| = ‖M‖, where |σ(M)| denotes the spectral radius.

Proof.

‖Mx‖² = (Mx, Mx) = (x, M²x) ≤ ‖x‖ ‖M²x‖ ≤ ‖x‖² ‖M²‖.

So ‖M‖² ≤ ‖M²‖. Similarly, ‖M‖ⁿ ≤ ‖Mⁿ‖ if n = 2^k. The other direction follows from ‖AB‖ ≤ ‖A‖ ‖B‖. Taking nth roots,

|σ(M)| = lim ‖Mⁿ‖^{1/n} = ‖M‖.

Proposition 26.4 Let a = inf_{‖x‖=1} (x, Mx) and let b be the corresponding supremum. Then σ(M) ⊂ [a, b] and a, b ∈ σ(M).


Proof. Let λ < a. Then

(x, (M − λ)x) = (x, Mx) − λ(x, x) ≥ (a − λ)‖x‖².

If we define B(x, y) = (x, (M − λ)y), then the hypotheses of the Lax-Milgram lemma hold, and as in the proof of Proposition 26.2 we see that M − λ is invertible. Thus λ ∉ σ(M), and similarly for λ > b.

We have |(x, Mx)| ≤ ‖x‖ ‖Mx‖ ≤ ‖M‖ if ‖x‖ = 1. So |a|, |b| ≤ ‖M‖, and since ‖M‖ = |σ(M)|, we have |σ(M)| ≤ |a| ∨ |b|. So if b > |a|, then b ∈ σ(M), and if |a| > b, then a ∈ σ(M).

Replacing M by M + cI shifts the spectrum, and a and b, by c. Choosing c appropriately, we conclude both a, b ∈ σ(M).

Proposition 26.5 Suppose M, N are bounded and symmetric. Define dist(σ(M), σ(N)) to be the larger of

max_{ν∈σ(N)} min_{μ∈σ(M)} |ν − μ|,  max_{μ∈σ(M)} min_{ν∈σ(N)} |ν − μ|.

Then dist(σ(M), σ(N)) ≤ ‖M − N‖.

Proof. Let d = ‖M − N‖. Suppose the first quantity is larger than d, so for some ν ∈ σ(N), min_{μ∈σ(M)} |μ − ν| > d.

Then ν ∈ ρ(M), so M − ν is invertible. Now σ((M − ν)^{−1}) = (σ(M) − ν)^{−1}, so |σ((M − ν)^{−1})| < d^{−1}. Since (M − ν)^{−1} is symmetric,

‖(M − ν)^{−1}‖ = |σ((M − ν)^{−1})| < d^{−1}.

Write

N − νI = M − νI + N − M = (M − νI)(I + K),

where

K = (M − ν)^{−1}(N − M).

Note

‖K‖ ≤ ‖(M − ν)^{−1}‖ ‖N − M‖ < d^{−1} · d = 1.

Therefore I + K is invertible, and so N − νI is invertible, a contradiction to ν ∈ σ(N).
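This eigenvalue stability bound is easy to observe numerically. A sketch (an assumed numpy example): the Hausdorff-type distance between the spectra of two symmetric matrices is bounded by the operator norm of their difference.

```python
import numpy as np

# Assumed finite-dimensional check of dist(sigma(M), sigma(N)) <= ||M - N||.
rng = np.random.default_rng(6)
B = rng.standard_normal((5, 5))
C = rng.standard_normal((5, 5))
M = (B + B.T) / 2
N = M + 0.1 * (C + C.T) / 2              # a symmetric perturbation of M

mu = np.linalg.eigvalsh(M)
nu = np.linalg.eigvalsh(N)

# The larger of the two one-sided max-min quantities in Proposition 26.5.
hausdorff = max(max(min(abs(m - n) for m in mu) for n in nu),
                max(min(abs(m - n) for n in nu) for m in mu))
op_norm = np.linalg.norm(M - N, 2)       # operator norm = largest singular value
assert hausdorff <= op_norm + 1e-12
```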


26.2 Functional calculus for symmetric operators

Let q be a polynomial with real coefficients. If M is symmetric, then q(M) is symmetric. Also, σ(q(M)) = q(σ(M)). So

‖q(M)‖ = sup_{λ∈σ(M)} |q(λ)|.

Let f be continuous on σ(M). Extend f continuously to an interval I containing σ(M). By the Weierstrass approximation theorem, there are polynomials qn converging uniformly to f on I, so (qn) is Cauchy in the uniform norm: given ε, there exists N such that

sup_{λ∈I} |qn(λ) − qm(λ)| < ε

if n, m ≥ N. Then

‖qn(M) − qm(M)‖ < ε.

So lim_{n→∞} qn(M) exists, and we call the limit f(M).

Proposition 26.6 (1) (f + g)(M) = f(M) + g(M) and (fg)(M) = f(M)g(M).

(2) ‖f(M)‖ = sup_{λ∈σ(M)} |f(λ)|.

(3) f(M) is symmetric and σ(f(M)) = f(σ(M)).

Proof. (1) This holds for polynomials, so take limits.

(2)

‖f(M)‖ = lim ‖qn(M)‖ = lim sup_{λ∈σ(M)} |qn(λ)| = sup_{λ∈σ(M)} |f(λ)|.

(3) qn(M) is symmetric, and then

(x, f(M)y) = lim (x, qn(M)y) = lim (qn(M)x, y) = (f(M)x, y).

Finally, since

dist(σ(qn(M)), σ(f(M))) ≤ ‖qn(M) − f(M)‖ → 0,

M is positive if (Mx, x) ≥ 0 for all x.

Proposition 26.7 Let M be bounded and symmetric. M is positive if and only if σ(M) ⊂ [0, ∞).

Proof. If σ(M) ⊂ [0, ∞), then f(λ) = √λ is continuous on σ(M), and so N = √M exists. Then

(Mx, x) = (N²x, x) = (Nx, Nx) ≥ 0.

If M is positive, then σ(M) ⊂ [a, ∞), where a = inf_{‖x‖=1} (x, Mx) ≥ 0.

Corollary 26.8 Every positive symmetric operator has a positive symmetricsquare root.

As mentioned before, this allows us to write T = UA, where A is positive and symmetric and U is an isometry on R_A and 0 on R_A^⊥.

26.3 Spectral resolution

Fix x, y ∈ H and define ℓ_{x,y}(f) = (f(M)x, y) for f ∈ C(σ(M)). Recall σ(M) is bounded and closed, hence compact. ℓ_{x,y} is a bounded linear functional, and by the Riesz representation theorem there exists a complex-valued measure m_{x,y} such that

(f(M)x, y) = ∫_{σ(M)} f(λ) m_{x,y}(dλ).

Proposition 26.9 (1) m_{x,y} is linear in x and skew linear in y.

(2) m_{y,x} is the complex conjugate of m_{x,y}.

(3) The total variation of m_{x,y} is less than or equal to ‖x‖ ‖y‖.

(4) The measure m_{x,x} is real and non-negative.


Proof. (1) ℓ_{x,y} is linear in x and skew linear in y. m_{x+z,y} and m_{x,y} + m_{z,y} both represent ℓ_{x+z,y}, and by the uniqueness in the Riesz representation, m_{x,y} + m_{z,y} = m_{x+z,y}.

(2) Since M is symmetric, (f(M)x, y) is skew symmetric in x and y.

(3) The total variation is the same as the norm of ℓ_{x,y}. But

|ℓ_{x,y}(f)| = |(f(M)x, y)| ≤ ‖f(M)‖ ‖x‖ ‖y‖ ≤ (sup_λ |f|) ‖x‖ ‖y‖.

(4) If f ≥ 0, then σ(f(M)) = f(σ(M)) ⊂ [0, ∞), so f(M) is a positive operator. Then ℓ_{x,x}(f) = (f(M)x, x) ≥ 0.

If S ⊂ σ(M) is a Borel set, then m_{x,y}(S) is a bounded symmetric function of x and y. So there exists a bounded symmetric operator E(S) such that

m_{x,y}(S) = (E(S)x, y).

Proposition 26.10 (1) E(S)∗ = E(S).

(2) ‖E(S)‖ ≤ 1.

(3) E(∅) = 0, E(σ(M)) = I.

(4) If S, T are disjoint, E(S ∪ T) = E(S) + E(T).

(5) E(S) and M commute.

(6) E(S ∩ T) = E(S)E(T).

(7) E(S) is an orthogonal projection. If S, T are disjoint, then the ranges of E(S), E(T) are orthogonal.

(8) E(S) and E(T) commute.

Proof. (1) E(S) is symmetric by construction.

(2) This follows since the total variation of m_{x,y} is bounded by ‖x‖ ‖y‖.

(3) m_{x,y}(∅) = 0, so E(∅) = 0. If f ≡ 1, then f(M) = I, and

(x, y) = ∫_{σ(M)} m_{x,y}(dλ) = (E(σ(M))x, y).


This is true for all y, so x = E(σ(M))x for all x.

(4) Because m_{x,y} is additive.

(5)

m_{Mx,y}(f) = (f(M)Mx, y) = (Mf(M)x, y) = (f(M)x, My) = m_{x,My}(f),

where m_{x,y}(f) denotes the integral of f with respect to m_{x,y}.

(6) (Lax never really does this one clearly.) Since m_{x,y} is a finite measure, we can approximate (E(S)x, y) = m_{x,y}(S) by m_{x,y}(f) with f continuous. So, writing E(f) for f(M), it suffices to show E(f)E(g) = E(fg) for f, g real continuous and use approximations.

Now

(E(f)E(g)x, y) = ∫ f(λ) m_{E(g)x,y}(dλ) = (f(M)E(g)x, y)

= (E(g)x, f(M)y) = ∫ g(λ) m_{x,f(M)y}(dλ)

= (g(M)x, f(M)y) = (f(M)g(M)x, y)

= ((fg)(M)x, y) = ∫ (fg)(λ) m_{x,y}(dλ)

= (E(fg)x, y).

This is true for all y, so E(f)E(g)x = E(fg)x.

(7) Setting S = T in (6) shows E(S) = E(S)², so E(S) is a projection. If S ∩ T = ∅, then E(S)E(T) = E(∅) = 0, so the ranges are orthogonal.

(8)

E(S)E(T) = E(S ∩ T) = E(T ∩ S) = E(T)E(S).

We have proved most of the following:

Theorem 26.11 Let H be a complex Hilbert space and M a bounded symmetric operator. There exists a projection-valued measure E such that E(S ∩ T) = E(S)E(T),

f(M) = ∫_{σ(M)} f(λ) E(dλ),

and the measure E is unique.


A few remarks.

(1) Uniqueness follows from the uniqueness of m_{x,y}.

(2) Suppose f is bounded and measurable. If f is simple, i.e., f = ∑ ci χ_{Ai} with the Ai disjoint, we define f(M) = ∑ ci E(Ai). In the next proposition, we show

‖f(M)‖ = sup_{λ∈σ(M)} |f(λ)|

if f is simple. If f is bounded and measurable, we can take simple fn converging to f uniformly. Then

‖fn(M) − fm(M)‖ = sup_{λ∈σ(M)} |fn − fm|,

since fn − fm is simple, and therefore fn(M) is a Cauchy sequence. We define f(M) to be the limit of fn(M).

(3) We showed that the above equality holds in the weak sense:

(f(M)x, y) = ∫_{σ(M)} f(λ) (E(dλ)x, y).

But in fact the equality holds in the norm topology. One needs to show that the integral exists, and to do that one approximates by Riemann-Stieltjes sums. The key estimate is the next proposition.

(4) If we write m_{x,y} = m^P_{x,y} + m^S_{x,y} + m^C_{x,y}, the decomposition of the measure into pure point part, singular part, and absolutely continuous part, we get corresponding operators E^P, E^S, E^C, and we can write

H = R_{E^P} ⊕ R_{E^S} ⊕ R_{E^C}.

Here is the promised proposition.

Proposition 26.12 Suppose the Ai are disjoint. Then

‖∑ ci E(Ai)‖ = max_i |ci|.

Proof. Let r = max_i |ci|. Given x, let xi = E(Ai)x. Then

(xi, xj) = (E(Ai)x, E(Aj)x) = (x, E(Ai)E(Aj)x) = 0

if i ≠ j. Then

‖∑_i ci E(Ai)x‖² = (∑_i ci xi, ∑_j cj xj) = ∑_i |ci|² ‖xi‖² ≤ r² ∑_i ‖xi‖² ≤ r² ‖x‖².

Therefore the operator norm is less than or equal to r. If j is such that |cj| = r, take x in the range of E(Aj); then ∑_i ci E(Ai)x = cj x, which implies that the norm is equal to r.

Proposition 26.13

‖f(M)x‖² = ∫ |f(λ)|² m_{x,x}(dλ).

Proof. If f is real,

‖f(M)x‖² = (f(M)x, f(M)x) = (f(M)²x, x) = ∫ f(λ)² m_{x,x}(dλ),

and the proof for complex f is similar.

26.4 Normal operators

Let F be a commutative Banach algebra with unit. Then if Q ∈ F,

σ(Q) = {p(Q) : p a homomorphism of F into C}.

The reason is that λ ∈ σ(Q) if and only if λI − Q is not invertible, which happens if and only if p(λI − Q) = 0 for some homomorphism p. Since p(I) = 1, this happens if and only if λ = p(Q) for some p.

Proposition 26.14 If p is a homomorphism, then p(T∗) is the complex conjugate of p(T).


Proof. Let A = (T + T∗)/2 and B = (T − T∗)/2. Then A∗ = A, B∗ = −B, T = A + B, T∗ = A − B, so p(T) = p(A) + p(B) and similarly with T replaced by T∗.

It suffices to show p(A) is real and p(B) is purely imaginary. Write p(A) = a + ib and let U = A + itI, so that U∗ = A − itI. Then

U∗U = A² + t²I.

We have p(U) = a + i(b + t), so |p(U)|² = a² + (b + t)². Also |p(U)| ≤ ‖U‖ and ‖U‖² = ‖U∗U‖ ≤ ‖A‖² + t², hence

a² + (b + t)² ≤ ‖A‖² + t²

for all t, which can only happen if b = 0. (If b > 0, take t large positive; if b < 0, take t large negative.) The operator iB is self-adjoint, so apply the above to iB.

Proposition 26.15 If T and T∗ commute, then ‖T‖ = |σ(T)|.

Proof. We already know this if T is self-adjoint. For general T,

p(T∗T) = p(T∗)p(T) = |p(T)|²,

using Proposition 26.14. Every point in σ(T) is of the form p(T) for some homomorphism p, so

|σ(T∗T)| = |σ(T)|².

We also know that ‖T∗T‖ = |σ(T∗T)| since T∗T is symmetric. Since

‖Tx‖² = |(x, T∗Tx)| ≤ ‖x‖² ‖T∗T‖,

we get ‖T‖² ≤ ‖T∗T‖. We also have ‖T∗T‖ ≤ ‖T‖², so ‖T‖² = ‖T∗T‖ = |σ(T∗T)| = |σ(T)|².

An operator N is normal if N∗N = NN∗.

Theorem 26.16 Let N be normal. There exists an orthogonal projection-valued measure E on σ(N) such that I = ∫_{σ(N)} dE and N = ∫_{σ(N)} λ E(dλ).


Proof. Let q(x, y) be a polynomial in x and y. If we let w = x + iy ∈ C, we can write x = (w + w̄)/2, y = (w − w̄)/(2i), and q(x, y) = R(w, w̄) for some polynomial R. Set Q = R(N, N∗). Since N and N∗ commute, they each commute with Q, and so Q and Q∗ commute. In the lemma below we will show σ(Q) = {R(λ, λ̄) : λ ∈ σ(N)}. We have ‖Q‖ = |σ(Q)|, since Q and Q∗ commute. Therefore

‖Q‖ = sup_{λ∈σ(N)} |R(λ, λ̄)|.

Now we can define f(N) as a limit of such polynomials, and the rest of the proof is as before.

Lemma 26.17

σ(Q) = {R(λ, λ̄) : λ ∈ σ(N)}.

Proof. Operators of the form R(N, N∗) form a commutative algebra with unit. Let F be its closure in the operator norm.

Now p(Q) = R(p(N), p(N∗)) = R(p(N), p(N)‾). Then σ(Q) equals the set of points R(p(N), p(N)‾) where p is a homomorphism, which is the same as the set of R(λ, λ̄) with λ ∈ σ(N).

26.5 Unitary operators

U is unitary if it is linear, isometric, one-to-one, and onto. (Cf. rotations.) So ‖Ux‖ = ‖x‖, i.e., (Ux, Ux) = (x, x). By polarization, (Ux, Uy) = (x, y), so (x, U∗Uy) = (x, y), which implies U∗U = I. U is invertible, since it is one-to-one and onto, and thus U^{−1} = U∗.

It is an exercise to show that σ(U) ⊂ {z : |z| = 1}. Since U∗U = I = UU∗, unitary operators are also normal operators.

27 Spectral theory of unbounded operators

An example: consider {f : f ∈ C², f(0) = f(1) = 0} ⊂ L²[0, 1]. By integration by parts,

∫_0^1 f′′g = −∫_0^1 f′g′ = ∫_0^1 fg′′,

or (Lf, g) = (f, Lg), where Lf = f′′. Then L, the second derivative operator, is symmetric, but it is an unbounded operator. Writing Lf = λf, there are solutions satisfying the boundary conditions only if λ < 0, namely λ = −n²π². The corresponding eigenfunctions are sin nπx, although these are unnormalized. The set {−n²π²} is unbounded, so L is not a bounded operator on L².

Proposition 27.1 Let H be a Hilbert space over C and M self-adjoint. If M is defined everywhere on H, then M is bounded.

Proof. M is closed: if xn → x and Mxn → u, then

(Mxn, y) = (xn, My) → (x, My) = (Mx, y).

Also (Mxn, y) → (u, y). This is true for all y, so Mx = u.

By the closed graph theorem, M is bounded.

If H is a Hilbert space over C and D a dense subspace with A defined on D, then D∗ is the set of v ∈ H for which there exists a vector A∗v ∈ H such that (Au, v) = (u, A∗v) for all u ∈ D. Since D is dense, A∗v is uniquely defined. A is self-adjoint if D = D∗ and A∗ = A. (So (Au, v) = (u, Av) for all u, v ∈ D(A), and the domain cannot be enlarged.)

Our goal is to prove

Theorem 27.2 Let A be self-adjoint, with D, H as above. There exists a projection-valued measure E such that

(1) E(∅) = 0, E(R) = I.

(2) E(S ∩ T) = E(S)E(T).

(3) E(S)∗ = E(S).

(4) E commutes with A.

(5) D = {u : ∫ λ² (E(dλ)u, u) < ∞} and Au = ∫ λ E(dλ)u.

We say z is in the resolvent set if A− zI maps D one-to-one onto H.

Proposition 27.3 If z is not real, then z is in the resolvent set.


Proof. (1) R = Range(A − zI) is a closed subspace.

R is the set of all vectors u of the form Av − zv = u for some v ∈ D. Then (Av, v) − z(v, v) = (u, v). A is self-adjoint, so (Av, v) is real. Looking at the imaginary parts,

−Im(z) ‖v‖² = Im(u, v),

so |Im z| ‖v‖² ≤ ‖u‖ ‖v‖, or

‖v‖ ≤ (1/|Im z|) ‖u‖.

If un ∈ R and un → u, write (A − z)vn = un; then ‖vn − vm‖ ≤ (1/|Im z|)‖un − um‖, so (vn) is a Cauchy sequence and converges to some point v.

Since Avn − zvn = un → u and zvn converges to zv, Avn converges, say to r, and then r − zv = u. Since (Avn, w) = (vn, Aw) for w ∈ D, we get (r, w) = (v, Aw), which implies v ∈ D and Av = r. Hence u = (A − z)v ∈ R.

(2) R = H. If not, there exists k ≠ 0 orthogonal to R, and then

(Av − zv, k) = (Av, k) − z(v, k) = 0

for all v ∈ D. Then (Av, k) = (v, z̄k), so k ∈ D and Ak = z̄k. But then (Ak, k) = z̄(k, k) is not real, a contradiction.

(3) A − zI is one-to-one. If not, there exists k ∈ D such that (A − zI)k = 0. But then ‖k‖ ≤ (1/|Im z|)‖0‖ = 0, so k = 0.

If we set R(z) = (A − zI)^{−1}, the resolvent, we have

‖R(z)‖ ≤ 1/|Im z|.

If u, w ∈ H and v = R(z)u, then (A − z)v = u, and

(u, R(z̄)w) = ((A − z)v, R(z̄)w) = (v, (A − z̄)R(z̄)w) = (v, w) = (R(z)u, w).

So the adjoint of R(z) is R(z̄).


27.1 Cayley transform

Define

U = (A − i)(A + i)^{−1}.

This is the image of A under the function

F(z) = (z − i)/(z + i),

which maps the real line onto ∂B₁(0) \ {1}.

Proposition 27.4 U is a unitary operator.

Proof. A + i and A − i map D(A) one-to-one onto H, so U maps H onto itself.

U is norm preserving: let u ∈ H, v = (A + i)^{−1}u, w = Uu, so that (A + i)v = u and (A − i)v = w. Then

‖u‖² = ((A + i)v, (A + i)v) = ‖Av‖² + ‖v‖² + i[(v, Av) − (Av, v)] = ‖Av‖² + ‖v‖²,

and similarly

‖w‖² = ((A − i)v, (A − i)v) = ‖Av‖² + ‖v‖².
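For a symmetric matrix (where no domain issues arise) the Cayley transform is easy to compute directly. A sketch (an assumed numpy example): U = (A − iI)(A + iI)^{−1} is unitary, its eigenvalues lie on the unit circle, and 1 is not an eigenvalue.

```python
import numpy as np

# Assumed matrix illustration of the Cayley transform of a symmetric A.
rng = np.random.default_rng(7)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                        # real symmetric (self-adjoint)
I = np.eye(4)

U = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

assert np.allclose(U.conj().T @ U, I)    # U*U = I: U is unitary
w = np.linalg.eigvals(U)
assert np.allclose(np.abs(w), 1.0)       # spectrum on the unit circle
assert not np.any(np.isclose(w, 1.0))    # 1 is not in the spectrum
```

The eigenvalues of U are exactly F(λ) = (λ − i)/(λ + i) for λ an eigenvalue of A, matching the picture of F mapping R onto ∂B₁(0) \ {1}.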

Proposition 27.5 If U is unitary, then σ(U) ⊂ {|z| = 1}.

Proof. λI − U = λ(I − U/λ). Since U is an isometry, ‖U‖ = 1. Then I − (1/λ)U is invertible if (1/|λ|)‖U‖ < 1, i.e., if |λ| > 1.

If |λ| < 1, write λI − U = U(λU^{−1} − I) = U(λU∗ − I). Since ‖λU∗‖ = |λ| < 1, I − λU∗ is invertible, hence so is λI − U.

Proposition 27.6 Given A and U as above and E the spectral resolution for U, we have E({1}) = 0.


Proof. Write E₁ for E({1}). If E₁ ≠ 0, there exists z ≠ 0 in the range of E₁, say z = E₁w. Then

Uz = ∫_{σ(U)} λ E(dλ)z = ∫_{σ(U)} λ (E − E₁)(dλ)z + ∫_{{1}} λ E₁(dλ)z.

The first integral is zero, since (E − E₁)(S) and E₁ are orthogonal for all S. The second integral is equal to

E₁z = E₁E₁w = E₁w = z,

since E₁ is a projection.

We conclude z is an eigenvector for U with eigenvalue 1. So (A − iI)(A + iI)^{−1}z = z. Let v = (A + iI)^{−1}z, so z = (A + iI)v. Then

z = (A − iI)(A + iI)^{−1}z = (A − iI)v,

and subtracting, iv = −iv, so v = 0, hence z = 0, a contradiction.

Let M be a bounded symmetric operator and f bounded and measurable. Define

m_{x,y}(f) = ∫_{σ(M)} f(λ) m_{x,y}(dλ).

This is a bounded symmetric function of x and y, since this holds for m_{x,y}(S) and we can take limits. So there exists an operator, called f(M), such that

m_{x,y}(f) = (f(M)x, y), x, y ∈ H.

We have

|(f(M)x, y)| = |m_{x,y}(f)| ≤ (sup |f|) |m_{x,y}|(σ(M)) ≤ (sup |f|) ‖x‖ ‖y‖.

Therefore ‖f(M)‖ ≤ sup |f|. We thus can extend our construction of f(M) from continuous f to bounded and measurable f. We now want to define f(M) for some unbounded functions f.

(The rest of this chapter is from Rudin's Functional Analysis.)

Proposition 27.7 Let

D_f = {x : ∫_{σ(M)} |f(λ)|² m_{x,x}(dλ) < ∞}.

Then

(1) D_f is a dense subspace of H.

(2) If x, y ∈ H,

∫_{σ(M)} |f(λ)| |m_{x,y}|(dλ) ≤ ‖y‖ (∫_{σ(M)} |f(λ)|² m_{x,x}(dλ))^{1/2}.

(3) If f is bounded and v = f(M)z, then

m_{x,v}(dλ) = f(λ) m_{x,z}(dλ), x, z ∈ H.

Proof. (1) Let S ⊂ σ(M) and z = x + y. Then

‖E(S)z‖² ≤ (‖E(S)x‖ + ‖E(S)y‖)² ≤ 2‖E(S)x‖² + 2‖E(S)y‖².

So

m_{z,z}(S) ≤ 2m_{x,x}(S) + 2m_{y,y}(S).

This is true for all S, so

m_{z,z}(dλ) ≤ 2m_{x,x}(dλ) + 2m_{y,y}(dλ),

and hence D_f is a subspace.

Let Sn = {λ ∈ σ(M) : |f(λ)| < n}. If x = E(Sn)z, then E(S)x = E(S ∩ Sn)x, so m_{x,x}(S) = m_{x,x}(S ∩ Sn). Then

∫_{σ(M)} |f(λ)|² m_{x,x}(dλ) = ∫_{Sn} |f(λ)|² m_{x,x}(dλ) ≤ n² ‖x‖² < ∞.

Therefore the range of E(Sn) is contained in D_f. Since σ(M) = ∪_n Sn, y = lim E(Sn)y for every y ∈ H, hence every y is in the closure of D_f.

(2) If x, y ∈ H and f is bounded, then

f(λ) m_{x,y}(dλ) is absolutely continuous with respect to |f(λ)| |m_{x,y}|(dλ),

so there exists u with |u| = 1 such that

u(λ) f(λ) m_{x,y}(dλ) = |f(λ)| |m_{x,y}|(dλ).


So

∫_{σ(M)} |f(λ)| |m_{x,y}|(dλ) = ((uf)(M)x, y) ≤ ‖(uf)(M)x‖ ‖y‖.

But

‖(uf)(M)x‖² = ∫ |uf|² dm_{x,x} = ∫ |f|² dm_{x,x}.

So (2) holds for bounded f. Now take a limit.

(3) Let g be continuous. Then

∫_{σ(M)} g dm_{x,v} = (g(M)x, v) = (g(M)x, f(M)z) = ((fg)(M)x, z) = ∫ gf dm_{x,z}.

This is true for all continuous g, so dm_{x,v} = f dm_{x,z}.

Theorem 27.8 Let E be a resolution of the identity.

(a) Suppose f : σ(M) → C is measurable. There exists a densely defined operator f(M) with domain D_f such that

(f(M)x, y) = ∫_{σ(M)} f(λ) m_{x,y}(dλ) (1)

and

‖f(M)x‖² = ∫_{σ(M)} |f(λ)|² m_{x,x}(dλ). (2)

(b) If D_{fg} ⊂ D_g, then f(M)g(M) = (fg)(M).

(c) f(M)∗ = f̄(M) and f(M)f(M)∗ = f(M)∗f(M) = |f|²(M).

Proof. (a) If x ∈ D_f, then y → ∫_{σ(M)} f dm_{x,y} is a bounded linear functional with norm at most (∫ |f|² dm_{x,x})^{1/2}. Choose f(M)x ∈ H to satisfy (1) for all y.

Let fn = f 1_{(|f|≤n)}. Then D_{f−fn} = D_f. By the dominated convergence theorem,

‖f(M)x − fn(M)x‖² ≤ ∫_{σ(M)} |f − fn|² dm_{x,x} → 0.

Since fn is bounded, (2) holds for fn. Now let n → ∞.

(b) and (c): Prove these for fn and let n → ∞.

Theorem 27.9 (Change of measure principle) Let E be a resolution of the identity on A, and let Φ : A → B be one-to-one and bimeasurable. Let E′(S′) = E(Φ^{−1}(S′)). Then E′ is a resolution of the identity on B, and

∫_B f dm′_{x,y} = ∫_A (f ◦ Φ) dm_{x,y}.

Proof. Prove this for f the indicator of a set, use linearity, and take limits.

Proof of the spectral theorem. Start with the unbounded operator A. Let U = (A − iI)(A + iI)^{−1}. Then U is unitary with spectrum contained in ∂B_1(0) \ {1}. Let the resolution of the identity for U be given by E′.

Define Φ to be the map taking ∂B_1(0) \ {1} to R given by the inverse Cayley transform μ ↦ i(1 + μ)/(1 − μ), and apply the change of measure principle. If we let E(S) = E′(Φ^{−1}(S)), it is just a matter of checking that E is a resolution of the identity for A.
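The Cayley transform step can be sanity-checked in finite dimensions, where every Hermitian matrix is a bounded self-adjoint operator. The sketch below (an added illustration) verifies that U = (A − iI)(A + iI)^{−1} is unitary and that its spectrum lies on the unit circle, avoiding 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# A Hermitian matrix plays the role of the self-adjoint operator A.
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

I = np.eye(n)
# Cayley transform U = (A - iI)(A + iI)^{-1}
U = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

# U is unitary ...
assert np.allclose(U.conj().T @ U, I)

# ... and its eigenvalues mu = (lam - i)/(lam + i), lam real, lie on the
# unit circle; |mu - 1| = 2/sqrt(lam^2 + 1) is never 0.
mu = np.linalg.eigvals(U)
assert np.allclose(np.abs(mu), 1.0)
assert np.min(np.abs(mu - 1.0)) > 1e-8
```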

28 Examples of self-adjoint operators

28.1 Extensions

M is symmetric if (x, My) = (Mx, y). Saying M is self-adjoint includes conditions on the domains; the difference arises only for unbounded operators.

C is an extension of B if D(B) ⊂ D(C) and Cu = Bu for u ∈ D(B).

Given B symmetric, so that (Bu, v) = (u,Bv) for all u, v ∈ D(B), can weextend B to a self-adjoint operator?

If un ∈ D(B), un → u, and Bun → w, then

(Bun, v) = (un, Bv)→ (u,Bv)

129

for all v ∈ D(B). Also (Bun, v) → (w, v), so (w, v) = (u, Bv) for all v ∈ D(B). Since D(B) is dense in H, w is uniquely determined by u. Define B̄, the closure of B, by B̄u = w whenever there exist un ∈ D(B) with un → u and Bun → w.

Proposition 28.1 Let B be a densely defined symmetric operator and B̄ its closure.

(1) B̄ is a closed operator.

(2) B̄ is symmetric.

(3) If z ∉ R, then B̄ − z maps D(B̄) one-to-one onto a closed subspace of H.

Proof. (1) is easy.

(2) If u, v ∈ D(B̄), choose vn ∈ D(B) such that vn → v and Bvn → B̄v. For u ∈ D(B), (Bu, vn) = (u, Bvn); let n → ∞ to get (Bu, v) = (u, B̄v). Approximating u in the same way gives (B̄u, v) = (u, B̄v).

(3) If u ∈ D(B̄), let f = (B̄ − z)u. Then

(B̄u, u) − z(u, u) = (f, u).

Since B̄ is symmetric, the first term is real. Taking imaginary parts,

|Im z| ‖u‖² = |Im (f, u)| ≤ ‖f‖ ‖u‖,

so

‖u‖ ≤ ‖f‖ / |Im z|.

So B̄ − z is one-to-one.

If fn is in the range of B̄ − z and fn → f, write (B̄ − z)un = fn. By the above inequality, {un} is a Cauchy sequence, hence converges, say, to u. Then un and fn converge, hence B̄un = fn + zun converges. Since B̄ is closed, u ∈ D(B̄) and B̄u = f + zu, so f is in the range of B̄ − z.

Corollary 28.2 If A is self-adjoint, then A is closed.

Proof. A is symmetric, so the closure Ā exists and D(Ā) ⊂ D(A*) = D(A). Hence Ā = A, i.e., A is closed.


Theorem 28.3 Let A be a symmetric operator. A is self-adjoint if and only if C \ R ⊂ ρ(A).

Proof. That A self-adjoint implies that all non-real z are in the resolvent set has already been proved.

For the converse, suppose z is non-real, so z, z̄ ∈ ρ(A).

We claim (A − z̄)^{−1} is the adjoint of (A − z)^{−1}. We need to show that if f, g ∈ H, then

((A − z)^{−1}f, g) = (f, (A − z̄)^{−1}g).

Let x = (A − z)^{−1}f and y = (A − z̄)^{−1}g. So we need to show

(x, (A − z̄)y) = ((A − z)x, y).

Since A is symmetric, this is true for all x, y ∈ D(A). Since A − z and A − z̄ map D(A) one-to-one onto H, it is true for all f, g ∈ H.

A is self-adjoint: we have to show that if v ∈ D(A*), then v ∈ D(A) and A*v = Av. Suppose v ∈ D(A*) with w = A*v. Then (Ax, v) = (x, w) for all x ∈ D(A), or ((A − z)x, v) = (x, w − z̄v). Let x = (A − z)^{−1}f, so

(f, v) = ((A − z)^{−1}f, w − z̄v) = (f, (A − z̄)^{−1}(w − z̄v)).

This must hold for all f ∈ H, so v = (A − z̄)^{−1}(w − z̄v). Since the range of (A − z̄)^{−1} is D(A), we get v ∈ D(A). Hit both sides with A − z̄, and then Av = w.

28.2 Examples

1. Let H = L²(R), B = i(d/dx), and D(B) = C¹₀, the C¹ functions with compact support.

Proposition 28.4 B is symmetric and its closure is self-adjoint.

Proof. Symmetry follows by integration by parts.


Let z ∈ C. Suppose f is in the range of B − z, say iu′ − zu = f with u ∈ C¹₀. Then

i (d/dx)(e^{izx}u) = e^{izx}f,

so, integrating and using that u has compact support,

0 = ∫_{−∞}^{∞} e^{izx} f(x) dx.

Conversely, if this equality holds, define

u(x) = −i ∫_{−∞}^{x} e^{iz(y−x)} f(y) dy.

Then u ∈ C¹ solves iu′ − zu = f, and if f has compact support, so does u. The set of such f is dense in L². If z ∉ R, the range of B̄ − z is closed by Proposition 28.1, so the range is all of H. Since B̄ − z is one-to-one, z ∈ ρ(B̄), and therefore B̄ is self-adjoint by Theorem 28.3.
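A discrete analogue of this example (an added illustration): the centered-difference approximation of d/dx with zero boundary values, standing in for compact support, is an antisymmetric matrix, so multiplying by i gives a Hermitian one. This mirrors the integration-by-parts computation behind the symmetry of B.

```python
import numpy as np

# Centered-difference approximation of d/dx on a grid; the implicit zero
# boundary values play the role of compact support.
n, h = 50, 0.1
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)

# D is antisymmetric (discrete integration by parts), so B = iD is Hermitian.
B = 1j * D
assert np.allclose(B.conj().T, B)

rng = np.random.default_rng(2)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Symmetry (Bx, y) = (x, By); with (u, v) = v* u this is
# np.vdot(y, B @ x) == np.vdot(B @ y, x).
assert abs(np.vdot(y, B @ x) - np.vdot(B @ y, x)) < 1e-8
```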

2. H = L²[0, ∞), B as before, D(B) as before, but now C¹₀ means support in (0, ∞).

Proposition 28.5 B is symmetric, but B̄ is not self-adjoint. B has no self-adjoint extension.

Proof. f ∈ D(B) implies f = 0 in a neighborhood of 0, so we can useintegration by parts as before to show the symmetry.

As before, if f is in the range of B − z, then 0 = ∫_0^∞ e^{izx} f(x) dx. If Im z < 0, then as before the range of B̄ − z is all of H.

If Im z > 0, the complex conjugate of e^{izx} is square integrable on [0, ∞), and the displayed condition says exactly that f is orthogonal to it; therefore the range of B̄ − z is a proper closed subspace of H. So B̄ is not self-adjoint.

If A is a self-adjoint extension of B, then A is also an extension of B̄. Let v ∈ D(A) \ D(B̄) and let Im z < 0. B̄ − z maps D(B̄) onto H, so there exists u ∈ D(B̄) such that

(B̄ − z)u = (A − z)v.

Since A extends B̄, (A − z)(v − u) = 0. A is symmetric, so its eigenvalues are real; since z ∉ R, v − u = 0, a contradiction.


29 Semigroups

Let X be a Banach space over the complex numbers, and for each t ≥ 0 let Z(t) = Zt be a bounded linear operator on X. Z is a semigroup if Z(t + s) = Z(t)Z(s) and Z(0) = I.

Proposition 29.1 (1) Let G : X → X be bounded. Then Z(t) = e^{tG} (defined as e^{tG} = Σ t^n G^n/n!) is a semigroup that is continuous in the norm topology.

(2) If Z(t) is a semigroup and lim_{t→0} |Z(t) − I| = 0, then Z(t) = e^{tG} for some bounded G.

Proof. (1) This is functional calculus for operators.

(2) Define

log Z = log(I + (Z − I)) = (Z − I) − (Z − I)²/2 + ⋯ ,

which converges when |Z − I| < 1. So at least for t small we can define log Z(t).

Choose a such that |Z(t) − I| < 1/3 for t < a, and define L(t) = log Z(t). Z(t) and Z(s) commute, so L(t + s) = L(t) + L(s). Hence for rational t < a, (1/t)L(t) does not depend on t. Let G = (1/t)L(t), so L(t) = tG for rational t < a.

Z(t + h) − Z(t) = Z(t)[Z(h) − I],

so Z is continuous in t, and therefore L(t) = tG for all t < a. Then e^{tG} = Z(t) for t < a. By the semigroup property, this holds for all t.
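For matrices the proposition can be checked directly: Z(t) = e^{tG} satisfies the semigroup law, and (1/t) log Z(t) recovers G for small t, exactly as in the proof. A brief scipy sketch (an added illustration; the matrix G is arbitrary):

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(3)

# Any matrix is a bounded generator; Z(t) = e^{tG}.
G = rng.standard_normal((4, 4))
Z = lambda t: expm(t * G)

s, t = 0.3, 0.7
# Semigroup property Z(t+s) = Z(t)Z(s), Z(0) = I
assert np.allclose(Z(s + t), Z(t) @ Z(s))
assert np.allclose(Z(0.0), np.eye(4))

# Recovering the generator as in the proof: (1/t) log Z(t) = G for small t.
t_small = 0.05
assert np.allclose(logm(Z(t_small)) / t_small, G)
```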

29.1 Strongly continuous semigroups

To use this in PDE, we need weaker conditions. Z(t) is strongly continuousat t = 0 if |Z(t)x− x| → 0 as t→ 0 for all x ∈ X.

Proposition 29.2 Suppose Z(t) is a strongly continuous semigroup at 0.

(1) There exists b and k such that |Z(t)| ≤ bekt.

(2) Z(t)x is strongly continuous in t for all x ∈ X.


Proof. We claim |Z(t)| is bounded near 0. If not, there exist tj → 0 such that |Z(tj)| → ∞. By the uniform boundedness principle, there is an x with sup_j |Z(tj)x| = ∞, contradicting Z(tj)x → x. So there exist a, b such that |Z(t)| ≤ b for t ≤ a.

Write t = na + r with 0 ≤ r < a. Then Z(t) = Z(a)^n Z(r), so

|Z(t)| ≤ |Z(a)|^n |Z(r)| ≤ b^{n+1} ≤ b e^{kt}

with k = (1/a) log b.

(2) Z(t)x− Z(s)x = Z(s)[Z(t− s)x− x], so

|Z(t)x− Z(s)x| ≤ |Z(s)| |Z(t− s)x− x| → 0.

Suppose D is dense in X and G : D → X is closed. z ∈ ρ(G), the resolventset, if z − G maps D = D(G) 1-1 onto X. Write R(z) = Rz = (zI − G)−1.Since G is closed, then R(z) is closed. R(z) is defined on all of X, so by theclosed graph theorem, R(z) is a bounded operator.

We define G′ by (Gx, `) = (x,G′`), where D(G′) is the set of ` such that(Gx, `) is a bounded linear functional of x ∈ D(G).

Let Z be a strongly continuous one parameter semigroup. The infinitesimal generator G of Z is defined by

Gx = s-lim_{h→0} (Z(h)x − x)/h,

with the domain of G being those x for which the strong limit exists.

Proposition 29.3 (1) G commutes with Z(t) in the sense that if x ∈ D(G),then Z(t)x ∈ D(G) and GZ(t)x = Z(t)Gx.

(2) D(G) is dense in X.

(3) D(Gn) is dense.

(4) G is closed.

(5) If |Z(t)| ≤ bekt and Re z > k, then z ∈ ρ(G). The resolvent of G isthe Laplace transform of Z(t).


Proof. (1)

[(Z(t + h) − Z(t))/h] x = Z(t) [(Z(h) − I)/h] x = [(Z(h) − I)/h] Z(t)x.

If x ∈ D(G), the middle term converges to Z(t)Gx. So the limit in the third term exists, and Z(t)x ∈ D(G). Moreover (d/dt)Z(t)x = Z(t)Gx = GZ(t)x.

(2) We claim

Z(t)x − x = G ∫_0^t Z(s)x ds.

To see this, note that Z(s)x is a continuous function of s. By a Riemann sum approximation,

[(Z(h) − I)/h] ∫_0^t Z(s)x ds = (1/h) ∫_0^t [Z(s + h)x − Z(s)x] ds
= (1/h) ∫_t^{t+h} Z(s)x ds − (1/h) ∫_0^h Z(s)x ds
→ Z(t)x − x.

So ∫_0^t Z(s)x ds ∈ D(G). But (1/t) ∫_0^t Z(s)x ds → x as t → 0, so D(G) is dense.

(3) Let φ be C^∞ and supported in [0, 1]. Let

x_φ = ∫ φ(s) Z(s)x ds.

Then Gx_φ = −∫ φ′(s) Z(s)x ds. Repeating, x_φ ∈ D(G^n). Take φ_j approximating the identity.

(4) First, Z(t)x − x = ∫_0^t Z(s)Gx ds for x ∈ D(G): both sides are 0 at t = 0, and the derivative of the left side is Z(t)Gx, which is the same as the derivative of the right side. Now let xn ∈ D(G), xn → x, Gxn → y. Then

Z(t)xn − xn = ∫_0^t Z(s)Gxn ds → ∫_0^t Z(s)y ds.

The left hand side converges to Z(t)x − x. Divide by t and let t → 0; the right hand side converges to y. Therefore x ∈ D(G) and Gx = y, so G is closed.

(5) Let

L(z)x = ∫_0^∞ e^{−zs} Z(s)x ds.


The Riemann integral converges when Re z > k:

|L(z)x| ≤ ∫_0^∞ b e^{(k − Re z)s} |x| ds = [b/(Re z − k)] |x|.

We claim L(z) = R(z). Check that e^{−zt}Z(t) is also a semigroup, with infinitesimal generator G − zI. Then

e^{−zt}Z(t)x − x = (G − zI) ∫_0^t e^{−zs} Z(s)x ds.

As t → ∞, the left hand side tends to −x and the right hand side tends to (G − zI)L(z)x. Since G is closed, x = (zI − G)L(z)x. So L(z) is a right inverse of zI − G. Similarly, we see that it is also a left inverse.
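The identity L(z) = R(z), the resolvent as the Laplace transform of the semigroup, can be tested numerically for a matrix generator of a contraction semigroup. The sketch below (an added illustration; the particular G is a hypothetical example with G + G^T ≤ 0, so k = 0) compares a midpoint-rule approximation of the integral with the resolvent.

```python
import numpy as np
from scipy.linalg import expm

# Generator of a contraction semigroup: antisymmetric part plus a strictly
# negative shift, so G + G^T <= 0 and |e^{tG}| <= 1 (k = 0 in the notes).
rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3))
G = (B - B.T) / 2 - np.eye(3)

z = 0.5
R = np.linalg.inv(z * np.eye(3) - G)   # resolvent (zI - G)^{-1}

# Midpoint-rule approximation of L(z) = int_0^inf e^{-zs} e^{sG} ds,
# truncated at s = T (the integrand decays like e^{-1.5 s} here).
T, N = 20.0, 20000
ds = T / N
E = expm(ds * G)                       # one-step propagator e^{ds G}
P = expm((ds / 2) * G)                 # current midpoint value e^{sG}
L = np.zeros((3, 3))
for k in range(N):
    L += np.exp(-z * (k + 0.5) * ds) * P * ds
    P = E @ P

assert np.allclose(L, R, atol=1e-3)    # L(z) = R(z)
```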

29.2 Generation of semigroups

Proposition 29.4 A strongly continuous semigroup of operators is uniquely determined by its infinitesimal generator.

Proof. If W, Z have the same generator G, let x ∈ D(G). Then

(d/dt) W(t)Z(s − t)x = W(t)GZ(s − t)x − W(t)GZ(s − t)x = 0.

Therefore

0 = ∫_0^s (d/dr) W(r)Z(s − r)x dr = W(s)Z(0)x − W(0)Z(s)x,

or W(s)x = Z(s)x. Now use the fact that D(G) is dense.

Z(t) is a contraction if |Z(t)| ≤ 1 for all t.

Theorem 29.5 (1) The infinitesimal generator G of a strongly continuous semigroup of contractions satisfies (0, ∞) ⊂ ρ(G) and

|R(λ)| = |(λI − G)^{−1}| ≤ 1/λ. (1)


(2) (Hille-Yosida theorem) Let G be a densely defined unbounded operator such that (0, ∞) ⊂ ρ(G) and (1) is satisfied. Then G is the infinitesimal generator of a strongly continuous semigroup of contractions.

Proof. (1) We already did this; it is the case b = 1, k = 0.

(2) Note nR(n) − I = R(n)G since R(n)(nI − G) = I. Let Gn = nGR(n). Then Gn = n²R(n) − nI, so Gn is a bounded operator. Define Zn(t) = e^{tGn}.

Claim: nR(n)x→ x for all x.

To prove this, note

|nR(n)x − x| = |R(n)Gx| ≤ (1/n)|Gx|,

so the claim is true for x ∈ D(G). Since |nR(n)| ≤ 1 and D(G) is dense in X, the claim holds for all x.

If x ∈ D(G), then Gnx → Gx:

Gnx = nGR(n)x = nR(n)Gx → Gx.

We have

Zn(t) = e^{tGn} = e^{−nt} e^{n²R(n)t} = e^{−nt} Σ_m [(n²t)^m/m!] R(n)^m,

so, using |R(n)^m| ≤ n^{−m},

|Zn(t)| ≤ e^{−nt} e^{nt} = 1.

Gn and Gm commute with Zn and Zm, and

(d/dt) Zn(s − t)Zm(t)x = Zn(s − t)Zm(t)[Gm − Gn]x.

The norm of the right hand side is bounded by |Gmx − Gnx|. So

|Zn(s)x − Zm(s)x| ≤ s|Gnx − Gmx| → 0

as n, m → ∞ for x ∈ D(G). Therefore Zn(s)x converges, say, to Z(s)x, uniformly in s on bounded intervals; since D(G) is dense and the Zn are contractions, this holds for all x.

Each Zn(s) is a strongly continuous semigroup of contractions, so the same holds for Z(s).


It remains to show that G is the infinitesimal generator of Z. We have

Zn(t)x − x = ∫_0^t Zn(s)Gnx ds.

If x ∈ D(G), we can let n → ∞ to get

Z(t)x − x = ∫_0^t Z(s)Gx ds.

If H is the generator of Z, dividing by t and letting t → 0, we get D(G) ⊂ D(H) and H = G on D(G). So H is an extension of G. If λ > 0, then λ ∈ ρ(G) ∩ ρ(H), which implies H cannot be a proper extension.

This last assertion is proved as follows. Suppose x ∈ D(H) \ D(G). Since (λ − H)x ∈ X,

(λ − G)^{−1}(λ − H)x ∈ D(G) ⊂ D(H).

Because H agrees with G on D(G),

(λ − H)(λ − G)^{−1}(λ − H)x = (λ − G)(λ − G)^{−1}(λ − H)x = (λ − H)x.

Hit both sides with (λ − H)^{−1} to obtain (λ − G)^{−1}(λ − H)x = x. So x ∈ D(G), a contradiction.
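The construction in the proof can be imitated numerically: for a dissipative matrix G, the Yosida approximations Gn = n²R(n) − nI are bounded, generate contractions, and converge to G. A sketch (an added illustration; the choice of G is a hypothetical example):

```python
import numpy as np
from scipy.linalg import expm

# A dissipative matrix G (so (0, inf) is in rho(G) and |R(lambda)| <= 1/lambda),
# built as an antisymmetric part plus a negative shift: G + G^T <= 0.
rng = np.random.default_rng(5)
B = rng.standard_normal((3, 3))
G = (B - B.T) / 2 - 2 * np.eye(3)
I = np.eye(3)
x = rng.standard_normal(3)

n = 1000
R = np.linalg.inv(n * I - G)           # R(n) = (nI - G)^{-1}
Gn = n**2 * R - n * I                  # bounded Yosida approximation

# nR(n)x -> x and G_n x -> Gx as n -> infinity
assert np.linalg.norm(n * R @ x - x) < 0.05
assert np.linalg.norm(Gn @ x - G @ x) < 0.5

# Z_n(t) = e^{t G_n} is a contraction and approximates Z(t) = e^{tG}
t = 1.0
assert np.linalg.norm(expm(t * Gn), 2) <= 1 + 1e-10
assert np.linalg.norm(expm(t * Gn) - expm(t * G), 2) < 0.1
```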

29.3 Perturbation of semigroups

Lemma 29.6 (Lumer-Phillips) Let G be densely defined in a Hilbert space H and suppose (0, ∞) ⊂ ρ(G). Then |R(λ)| ≤ 1/λ for all λ > 0 if and only if Re (x, Gx) ≤ 0 for all x ∈ D(G).

If the last property holds, we say G is dissipative.

Proof. Suppose |R(λ)| ≤ 1/λ, i.e.,

‖(λI − G)^{−1}u‖² ≤ (1/λ²)‖u‖².

Let x = (λI − G)^{−1}u. Then

(x, x) ≤ (1/λ²)(λx − Gx, λx − Gx).

Expanding and rearranging, this becomes

(x, Gx) + (Gx, x) ≤ (1/λ)‖Gx‖².

This is true for all λ > 0; letting λ → ∞ gives 2Re (x, Gx) ≤ 0.

The converse is left as an exercise.
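In the matrix case, dissipativity Re (x, Gx) ≤ 0 for all x is exactly the condition G + G* ≤ 0, and both the resolvent bound and the contraction property of e^{tG} can be verified directly. A numerical sketch (an added illustration; the shift used to make G dissipative is a hypothetical construction):

```python
import numpy as np
from scipy.linalg import expm

# Make a dissipative G by shifting a random matrix until G + G^T <= 0.
rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4))
G = B - np.linalg.norm(B + B.T, 2) * np.eye(4)
assert np.max(np.linalg.eigvalsh(G + G.T)) <= 1e-10   # dissipative

# The resolvent bound |R(lambda)| <= 1/lambda for lambda > 0 ...
for lam in [0.5, 1.0, 10.0]:
    R = np.linalg.inv(lam * np.eye(4) - G)
    assert np.linalg.norm(R, 2) <= 1 / lam + 1e-12

# ... and the semigroup e^{tG} is a contraction for every t >= 0.
for t in [0.1, 1.0, 5.0]:
    assert np.linalg.norm(expm(t * G), 2) <= 1 + 1e-12
```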

Theorem 29.7 (Trotter) Suppose G is the infinitesimal generator of a semigroup of contractions in a Hilbert space. Let H be a densely defined dissipative operator such that D(G) ⊂ D(H) and there exist b > 0 and a ∈ (0, 1) such that

‖Hx‖ ≤ a‖Gx‖ + b‖x‖, x ∈ D(G).

Then G + H (defined on D(G)) is the generator of a contraction semigroup.

Proof. First, G + H is closed: let xn ∈ D(G) with xn → x and yn = (G + H)xn → y. Then

G(xn − xm) = yn − ym − H(xn − xm),

and

‖G(xn − xm)‖ ≤ ‖yn − ym‖ + a‖G(xn − xm)‖ + b‖xn − xm‖.

Since a < 1, Gxn converges, and therefore Hxn converges. G is closed, so x ∈ D(G) ⊂ D(H) and Gxn → Gx. Then

‖Hxn − Hx‖ ≤ a‖Gxn − Gx‖ + b‖xn − x‖ → 0,

so (G + H)xn → (G + H)x = y.

Next, if λ is sufficiently large, then λ ∈ ρ(G + H): by the Lumer-Phillips lemma, G is dissipative; H is as well, so G + H is dissipative. By Lumer-Phillips,

‖x‖ ≤ (1/λ)‖(λI − (G + H))x‖.

Therefore the range of (G + H) − λI is closed.

The range is all of H: if not, there exists v ≠ 0 perpendicular to the range. G − λI is invertible, so there exists x ∈ D(G) such that (G − λI)x = v. Then v + Hx = (G + H − λI)x is in the range, so (v + Hx, v) = 0, i.e., ‖v‖² + (Hx, v) = 0, whence

‖v‖² ≤ ‖Hx‖ ‖v‖,

and so ‖v‖ ≤ ‖Hx‖. Then

‖Gx − λx‖ = ‖v‖ ≤ ‖Hx‖ ≤ a‖Gx‖ + b‖x‖.

Squaring and using the fact that G is dissipative,

‖Gx‖² + λ²‖x‖² ≤ a²‖Gx‖² + 2ab‖Gx‖ ‖x‖ + b²‖x‖².

Since a < 1, for λ large enough this forces x = 0, hence v = 0, and the range is the whole space.

Now use Lumer-Phillips (not quite, since this is only true for large λ).

Examples: non-divergence form operators and Lévy processes.

30 Groups of unitary operators

We prove Stone’s theorem.

Theorem 30.1 (1) Suppose A is self-adjoint and H is a Hilbert space. There exists a strongly continuous group U(t) of unitary operators with infinitesimal generator iA.

(2) Given a strongly continuous group of unitary operators, the generatoris of the form iA where A is self-adjoint.

Proof. (1) We saw ‖R(z)‖ ≤ 1/|Im z|. The resolvent sets of iA and −iA contain the positive reals, so iA and −iA satisfy the Hille-Yosida theorem. Let U(t), V(t) be the respective semigroups.

V and U are inverses:

(d/dt) U(t)V(t)x = U(t)iAV(t)x − U(t)iAV(t)x = 0.

So U(t)V(t)x is independent of t. When t = 0, we get x. So U(t)V(t)x = x if x ∈ D(A). But D(A) is dense.

Both U and V are contractions. Since U(t)V(t) = I, they must be norm preserving. This is because

‖x‖ = ‖U(t)V(t)x‖ ≤ ‖V(t)x‖ ≤ ‖x‖,

so ‖x‖ = ‖V(t)x‖, and similarly with U. Since they are invertible isometries, they are unitary. Define U(t) = V(−t) for t < 0.

(2) Let V(t) = U(−t). Then U(t) and V(t) are strongly continuous semigroups of contractions, and the infinitesimal generators are additive inverses. So the generators are G, −G.

Since both G,−G are infinitesimal generators, all real numbers except 0are in the resolvent set of G. Take x ∈ D(G).

‖U(t)x‖2 = (U(t)x, U(t)x) = ‖x‖2.

Take the derivative with respect to t:

(Gx, x) + (x,Gx) = 0.

Replacing x by x + y and then by x + iy, we get

(Gx, y) = −(x, Gy),

so G is antisymmetric and G* is an extension of −G. If z ≠ 0 and z ∈ R, then z ∈ ρ(G), so z ∈ ρ(G*); also z ∈ ρ(−G). So G* cannot be a proper extension of −G, hence G* = −G. Writing A = −iG, we get A* = iG* = −iG = A, so the generator is G = iA with A self-adjoint.
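In finite dimensions Stone's theorem is explicit: for a Hermitian matrix A, U(t) = e^{itA} is a unitary group with generator iA. A closing sketch (an added illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)

# Hermitian A; Stone's theorem in finite dimensions: U(t) = e^{itA}.
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

U = lambda t: expm(1j * t * A)

I = np.eye(n)
s, t = 0.4, 1.3
# Group law, unitarity, and U(-t) = U(t)^{-1} = U(t)*
assert np.allclose(U(s + t), U(s) @ U(t))
assert np.allclose(U(t).conj().T @ U(t), I)
assert np.allclose(U(-t), U(t).conj().T)

# The generator is iA: (U(h)x - x)/h -> iAx as h -> 0
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = 1e-6
assert np.linalg.norm((U(h) @ x - x) / h - 1j * A @ x) < 1e-4
```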
