Lecture Notes in Real Analysis 2009
Anant R. Shastri
Department of Mathematics
Indian Institute of Technology
Bombay
November 9, 2009
Lecture 1
Why real numbers?
Example 1 Gaps in the rational number system. By simply
employing the unique factorization theorem for integers, we can easily
conclude that there is no rational number r such that r² = 2. So there
are gaps in the rational number system in this sense. The gaps are
somewhat subtle. To illustrate this fact let us consider any positive
rational number p and put
q = (2p + 2)/(p + 2) = p − (p² − 2)/(p + 2).        (1)
Check that
q² − 2 = 2(p² − 2)/(p + 2)².        (2)
Since (p + 2)² > 0, the sign of q² − 2 is the same as that of p² − 2.
Now if p² < 2 then check that p < q and q² < 2. Similarly, if p² > 2
then check that q < p and 2 < q². This shows that there exists a
sequence r₁ > r₂ > r₃ > · · · of rational numbers such that rₙ² > 2, and
a sequence of rational numbers s₁ < s₂ < · · · such that sₙ² < 2. In other
words, the set of all positive rationals r such that r² > 2 has no
least element, and similarly the set of all positive rationals s such that
s² < 2 has no greatest element. The real number system fulfills
this kind of requirement, which the rational number system is unable to
fulfill.
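This computation is easy to experiment with. A minimal sketch in Python (the helper name `step` is ours, not the notes'), using exact rational arithmetic so that everything stays inside Q:

```python
from fractions import Fraction

def step(p):
    # q = p - (p^2 - 2)/(p + 2); q^2 - 2 has the same sign as p^2 - 2
    return p - (p * p - 2) / (p + 2)

# Starting above sqrt(2): a strictly decreasing sequence of rationals with r^2 > 2.
r = Fraction(2)
for _ in range(4):
    nxt = step(r)
    assert nxt < r and nxt * nxt > 2
    r = nxt

# Starting below sqrt(2): a strictly increasing sequence of rationals with s^2 < 2.
s = Fraction(1)
for _ in range(4):
    nxt = step(s)
    assert nxt > s and nxt * nxt < 2
    s = nxt
```

Both sequences close in on the "gap" at √2 from either side, yet no rational term ever satisfies r² = 2.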
Some Basic Set Theory Membership, union, intersection, power set,
De Morgan's laws, (the episode of RAMA and SITA), ordered pairs: (x, y) :=
{{x}, {x, y}}, the Cartesian product X × Y as a subset of the power set of the
power set of X ∪ Y, relations, functions, Cartesian products of arbitrary
families of sets, cardinality, finiteness and infiniteness, countability of Q.
Notation:
N = {0, 1, 2, . . .} the set of natural numbers.
Z = the set of integers.
Z+ = the set of positive integers
Q = the set of rational numbers.
R = the set of real numbers.
C = the set of complex numbers.
Lecture 2
Definition 1 Let X be a set and let R ⊂ X × X be a relation on X. We shall
write x < y whenever (x, y) ∈ R. We say R is an order (total order
or linear order) on X if the following conditions hold:
(i) Transitivity: x < y, y < z =⇒ x < z for any x, y, z ∈ X.
(ii) Law of Trichotomy: Given x, y ∈ X, exactly one of x < y, y < x, x = y holds.
We shall read x < y as ‘x is less than y’. We shall write x ≤ y
to mean either x < y or x = y. We shall also write x > y to mean
y < x, and x ≥ y to mean y ≤ x. Note that > is then another order
on the set X. However, these two orders on X are so closely related to
each other that any information about one of them can be recovered
from the corresponding information about the other.
Let now A ⊂ X. We say x ∈ X is an upper bound of A if a ≤ x
for all a ∈ A. If such an x exists we then say A is bounded above.
Likewise we define lower bounds and bounded below as well.
An element x ∈ X is called a least upper bound (abbreviated
lub, or supremum, written 'sup') of A if x is an upper bound
of A and x ≤ y for every upper bound y of A. Similarly we define
a greatest lower bound (glb or 'inf' = infimum).
Remark 1
(i) The set of integers is an ordered set with the usual order < . The
subset of positive integers is bounded below but not bounded above.
Also it has a greatest lower bound, viz. 1. Indeed the set of rational
numbers is also an ordered set with the natural order and this way we
can view Z as an ordered subset of Q. Note that Z+ is not bounded
above even in Q.
(ii) Even if a set A is bounded above, there may not be a least upper
bound, as seen in Example 1. However, if it exists then it is unique.
(iii) Let A = ∅ be the empty subset of an ordered set X. Then every
member of X is an upper bound for A. Therefore, least upper bound
for A would exist iff X has a least element.
(iv) Let A = X. Then an upper bound for A is nothing but the greatest
element of X, if it exists, and hence the lub of X is also equal to this
element.
Definition 2 An ordered set X is said to be order complete if for
every nonempty subset A of X which is bounded above there is a least
upper bound for A in X.
Definition 3 By a binary operation on a set X we mean a function
· : X ×X → X
Remark 2 It is customary to denote ·(a, b) by a · b, or by some other
conjunction symbol between the two letters a and b, or, if there is no scope
for confusion, by ab. Typical examples of binary operations are addition
and multiplication defined on the set of integers (rational numbers),
etc.
Definition 4 A field K is a set together with two binary operations
denoted by + and · satisfying a number of properties called field ax-
ioms which we shall express in three different lists:
List(A) Axioms for addition:
(A1) Associativity: x + (y + z) = (x + y) + z; x, y, z ∈ K.
(A2) Commutativity: x + y = y + x; x, y ∈ K.
(A3) The zero element: There exists 0 ∈ K such that x+0 = x; x ∈ K.
(A4) Negative: For x ∈ K there is a y ∈ K such that x + y = 0.
List (M) Axioms for multiplication:
(M1) Associativity: x(yz) = (xy)z; x, y, z ∈ K.
(M2) Commutativity: xy = yx; x, y ∈ K.
(M3) The unit element: There exists 1 ∈ K, 1 ≠ 0, such that 1x =
x; x ∈ K.
(M4) Inverse: For each x ∈ K such that x ≠ 0 there exists z ∈ K such
that xz = 1.
List (D) Distributivity: x(y + z) = xy + xz; x, y, z ∈ K.
Remark 3 Note that the zero element is unique. Therefore (M3) and
(M4) make sense. Moreover, the unit element is also unique. Further
the negative and the inverse are also unique and are denoted respectively
by −x and 1/x. Because of associativity, we can drop writing
brackets altogether. We also use the notation n to indicate the sum
1 + 1 + · · · + 1 (n times). Likewise we use the notation xⁿ to denote
xx · · · x (n times). Thus all 'polynomial' expressions of elements of K
make sense. That is to say, if p(t) = a₀ + a₁t + · · · + aₙtⁿ with aᵢ ∈ K,
then we can substitute any element x ∈ K for t and obtain a well-defined
element of K. The most important example of a field for us now
is the field of rational numbers K = Q.
Definition 5 An ordered field is a field K with an order < satisfying
the following axioms:
(O1) x < y =⇒ x + z < y + z for all x, y, z ∈ K.
(O2) x > 0, y > 0 =⇒ xy > 0 for all x, y ∈ K.
Remark 4 Once again a typical example is the field of rational num-
bers with its usual order. All familiar rules for working with inequalities
will be valid in any ordered field. For example the square of any ele-
ment in an ordered field cannot be negative. Let us list a few of such
properties which can be derived easily from the axioms:
Theorem 1 Let K be an ordered field with the order < . Then the
following properties are true for elements of K :
(a) 0 < x iff −x < 0.
(b) 0 < x, y < z =⇒ xy < xz.
(c) x ≠ 0 =⇒ 0 < x². In particular, 0 < 1.
(d) 0 < x < y =⇒ 0 < 1/y < 1/x.
Exercise 1 Let K be an ordered field. Temporarily let us denote the
identity element of K by 1_K and 1_K + · · · + 1_K (m times) by m1_K.
(a) Show that the mapping m ↦ m1_K defines an injective ring homomorphism
φ : Z → K which is order preserving, viz.,
φ(x + y) = φ(x) + φ(y); φ(xy) = φ(x)φ(y); x < y =⇒ φ(x) < φ(y).
(b) Show that φ extends to an injective field homomorphism Q → K
which is order preserving. In this way we can now say that every
ordered field contains the field of rational numbers.
We shall now state a result which asserts the existence of the real number
system. We shall not prove it; the interested reader may consult
[R].
Theorem 2 There is a unique ordered field R which contains the or-
dered field Q and is order complete.
Remark 5 Note that an ordered field K is order complete iff every nonempty
subset A of K which is bounded below has a greatest lower bound. This follows
easily by considering −A. The uniqueness of R has to be interpreted
correctly, in the sense that if there is another such R′ then there is a
bijection φ : R → R′ such that
(i) φ(r) = r, r ∈ Q;
(ii) φ(x + y) = φ(x) +′ φ(y), x, y ∈ R;
(iii) φ(xy) = φ(x) ·′ φ(y), x, y ∈ R;
(iv) x < y =⇒ φ(x) <′ φ(y), x, y ∈ R.
Lecture 3 (tutorial)
Theorem 3 Z+ is not bounded above in R.
Proof: Suppose Z+ is bounded above, say n ≤ x for all n ∈ Z+ for some
x ∈ R. Then we can take the least upper bound l ∈ R of Z+. Since l − 1
is not an upper bound, there must exist n ∈ Z+
such that l − 1 < n. This implies l < n + 1, which is absurd. ♠
Theorem 4 Archimedean Property
(A) For every x ∈ R there exists n ∈ Z+ such that x < n.
(B) x, y ∈ R, 0 < x =⇒ there exists n ∈ Z+ such that y < nx.
Proof: (A) This is just a restatement of the above theorem.
(B) Apply (A) to y/x. ♠
Theorem 5 If S is a nonempty subset of Z which is bounded above
then S has a maximum.
Proof: (Recall that a set has a maximum iff its least upper bound
exists and belongs to the set.) Let y ∈ R be the least upper bound of
S. We claim that y ∈ S. Suppose y is not in S. Now there exists m ∈ S
such that y − 1/2 < m < y. This implies 0 < y − m < 1/2. Also, since
(y + m)/2 < y, there exists n ∈ S such that (y + m)/2 < n < y. This implies that
0 < (y − m)/2 < n − m < y − m < 1/2, which is absurd since m and n are
integers. ♠
Definition 6 We can now define the ‘floor’ and ‘ceiling’ functions on
R. Given any x ∈ R consider the set Zₓ = {m ∈ Z : m ≤ x}. Clearly
Zₓ is bounded above and nonempty (Archimedean property). By
the above theorem, it has a maximum, which is of course unique. We
define this maximum to be ⌊x⌋. Likewise ⌈x⌉ is also defined.
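The definition can be sanity-checked directly. A brute-force sketch in Python (the finite search window and the helper names are ours, purely illustrative):

```python
import math

def floor_via_max(x, bound=100):
    # floor(x) = max{ m in Z : m <= x }, searching a finite window of integers
    return max(m for m in range(-bound, bound) if m <= x)

def ceil_via_min(x, bound=100):
    # ceil(x) = min{ m in Z : m >= x }
    return min(m for m in range(-bound, bound) if m >= x)

for x in (2.7, -2.7, 3.0, -0.5):
    assert floor_via_max(x) == math.floor(x)
    assert ceil_via_min(x) == math.ceil(x)
```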
Lemma 1 Let x, y ∈ R be such that y − x > 1. Then there exists
m ∈ Z such that x < m < y.
Proof: By the definition of floor, it follows that x < 1 + ⌊x⌋ ≤ 1 + x.
Therefore, since 1 < y − x,
x < 1 + ⌊x⌋ ≤ 1 + x < y,
and by taking m = 1 + ⌊x⌋, we are done. ♠
Theorem 6 Density of Q in R. Given x < y in R there exists r ∈ Q such that x < r < y.
Proof: We have to find r = m/n such that x < m/n < y which is the
same as finding integers m, n with n > 0 such that nx < m < ny. This is
possible if we can find n > 0 such that the interval (nx, ny) has length
n(y − x) > 1, by Lemma 1. Such an n exists by (B) of Theorem 4. ♠
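The proof is constructive and can be traced in code. A sketch in Python (the helper name `rational_between` is ours):

```python
import math

def rational_between(x, y):
    # By the Archimedean property pick n with n*(y - x) > 1;
    # then the interval (n*x, n*y) has length > 1 and contains an integer m.
    assert x < y
    n = math.floor(1 / (y - x)) + 1
    m = math.floor(n * x) + 1        # n*x < m <= n*x + 1 < n*y
    return m, n                      # the rational m/n lies in (x, y)

m, n = rational_between(math.sqrt(2), 1.5)
assert math.sqrt(2) < m / n < 1.5
```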
Lecture 4
The last theorem may lead us to believe that the set of real numbers
is not too much larger than the set of rational numbers.
However, the fact is quite the opposite. This was a mild shock to
the mathematical community in the early days of the invention of the real
numbers. We define an irrational number to be a real number which is
not a rational number. To begin with we shall prove:
Theorem 7 The set R \Q of irrationals is dense in R.
Proof: Given any two real numbers x < y we must find an irrational
number φ such that x < φ < y. By the earlier theorem we can first
choose rational numbers x1, y1 such that x < x1 < y1 < y and then
show that there is an irrational number φ such that x1 < φ < y1. By
clearing the denominators we can then reduce this to assuming that x, y
are integers and then by taking the difference, we can further assume
that x = 0. But then we can as well assume that y = 1. Thus it is
enough to show that there is an irrational number between 0 and 1.
If this were not true, by translation, it would follow that there are no
irrational numbers at all! ♠
Remark 6 Pay attention to this argument which occurs in somewhat
different forms in several places in mathematics. Later we shall show
that R \Q is uncountable. To begin with, at least, we can now be sure
that there is a real number x such that x2 = 2.
Theorem 8 Given any positive real number y, there is a unique posi-
tive real x such that x2 = y.
Proof: The uniqueness is easy to prove: x₁² = x₂² =⇒ (x₁ + x₂)(x₁ −
x₂) = 0, which for positive x₁, x₂ implies x₁ − x₂ = 0. Let us prove the existence.
In the case y = 1 there is nothing to prove. The case y < 1 can be
converted into the case y > 1 by taking inverses. So, we shall assume
now that y > 1. As in Example 1, for any p ∈ R+ let us define
q := φ(p) := p − (p² − y)/(p + y) = y(p + 1)/(p + y).        (3)
We then have
q² − y = (y² − y)(p² − y)/(p + y)².        (4)
Let now
S = {x ∈ R+ : x² > y}; T = {x ∈ R+ : x² < y}.
Both S, T are nonempty; S is bounded below and T is bounded above.
Therefore r = glb(S) and s = lub(T) exist. We claim that r = s and
r² = y. Suppose r² > y. Then by (3), r > φ(r) > 0. But from (4),
φ(r)² > y. Therefore φ(r) ∈ S and hence r ≤ φ(r), which is absurd.
Therefore r² ≤ y. In a symmetric manner we also obtain that s² ≥ y.
Now note that every element of T is smaller than every element of S, so
s ≤ r. If r² < y, then y ≤ s² ≤ r² < y, which is absurd. Therefore r² = y
as claimed. In a similar manner, we can also see that s² = y, and hence r = s. ♠
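The map φ of (3) can also be used to compute square roots numerically; each application shrinks |p² − y|. A hedged sketch in Python (the starting point and iteration count are our choices):

```python
def sqrt_approx(y, iters=60):
    # iterate phi(p) = y*(p + 1)/(p + y) = p - (p^2 - y)/(p + y)
    p = max(y, 1.0)            # then p^2 >= y, so the iterates decrease toward sqrt(y)
    for _ in range(iters):
        p = y * (p + 1) / (p + y)
    return p

assert abs(sqrt_approx(2.0) - 2.0 ** 0.5) < 1e-9
assert abs(sqrt_approx(9.0) - 3.0) < 1e-9
```

The fixed-point equation p = y(p + 1)/(p + y) reduces exactly to p² = y, which is why the iteration settles at the square root.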
Remark 7 We now encourage you to try to prove the following fact:
given any real number y > 0 and any positive integer n, there exists
a unique real number x such that xn = y. After trying for some time,
look into the book [R] for a proof. In any case, we shall prove this fact
a little later as an easy consequence of the intermediate value theorem.
Exercise 2 Let S be a non empty subset of R which is bounded above.
Let T be the set of all upper bounds of S. Show that lub(S) = glb(T ).
Exercise 3 Fix x > 1.
1. For positive integers p, q, m, n such that q ≠ 0, n ≠ 0 and p/q =
m/n = r, show that (x^m)^(1/n) = (x^p)^(1/q). This allows us to define
x^r = (x^m)^(1/n) unambiguously.
2. Show that for rational numbers r, s: x^(r+s) = x^r x^s; (x^r)^s = x^(rs).
3. For any real number α, let
S(α) = {x^r : r ≤ α, r ∈ Q}.
Show that if α is rational then x^α = lub(S(α)). This prompts us
to define, for any real number α,
x^α := lub S(α).
4. Prove that x^α x^β = x^(α+β) for all real numbers α, β.
Lecture 5
Sequences
Definition 7 By a sequence in a set X we mean a function s : N → X.
Remark 8 Often it is our practice to display a sequence in the form:
s₀, s₁, s₂, . . . , OR {sₙ}, n ≥ 0, OR simply {sₙ}.
Here sₙ denotes s(n) ∈ X. The set N itself, displayed as
0, 1, 2, 3, . . . ,
can then be thought of as the sequence of natural numbers.
Definition 8 Let X be an ordered set and s a sequence in X. We
say s is monotonically increasing (resp. strictly increasing) if sₙ ≤ sₙ₊₁
(resp. sₙ < sₙ₊₁) for all n ∈ N. Similarly we can define (strictly) monotonically
decreasing sequences. A sequence of either of these two types is
simply referred to as a monotone sequence. We say s is bounded above
if there exists x ∈ X such that sₙ ≤ x for all n ∈ N. Similarly one can
define bounded below sequences.
Definition 9 By a subsequence t of a sequence s : N → X we mean a
sequence which can be written in the form s ◦ α for some strictly increasing
α : N → N. We then display t in the form {tₙ} = {s_{α(n)}}.
Theorem 9 Every sequence s : N → X in an ordered set X has a monotone
subsequence.
Proof: Assume that s is a sequence in X which has no monotone
subsequence; we shall arrive at a contradiction. Try to build an increasing
subsequence: put η₁ = s₁, and at each stage take the next term sᵢ which is
bigger than the current one. Our assumption does not allow us to go on like
this forever. This means there is an index m such that sₙ < sₘ for all n > m;
call such an index a peak, and put p₁ = sₘ. Starting beyond this peak, we now
climb down: take the next term smaller than the current one, and so on. Again
our assumption forbids this from going on forever, so there is an index n₁ > m
such that sₙ₁ < sₙ for all n > n₁; call such an index a valley, and put v₁ = sₙ₁.
Starting beyond v₁ we climb up until we hit another peak p₂, then down until we
hit another valley v₂, and so on. Note that each peak pⱼ = sₙⱼ has the property
that sₙ < pⱼ for all n > nⱼ (and similarly each valley vₖ = sₙₖ satisfies
vₖ < sₙ for all n > nₖ). In particular, p₁ > p₂ > · · ·, so {pⱼ} is a
subsequence of {sₙ} which is monotonically decreasing, a contradiction! ♠
We shall from now on consider sequences of real numbers.
Definition 10 Let s : N → R be a sequence of real numbers. We say
s converges to l ∈ R if for every positive real number ε there exists
n0 = n(ε) ∈ N such that for all n ≥ n0 we have sn ∈ (l − ε, l + ε). We
then say s is a convergent sequence, call l the limit of the sequence s
and write
limₙ→∞ sₙ = l OR sₙ → l as n → ∞.
Remark 9 Note that the limit l, if it exists, is unique. For if l′ ≠ l is
another limit of s, choose ε = |l − l′|/2 > 0. Then according to the definition
of the limit applied to l and l′ we get two numbers n₀ and n′₀ such
that
sₙ ∈ (l − ε, l + ε), n ≥ n₀; sₙ ∈ (l′ − ε, l′ + ε), n ≥ n′₀.
But these two intervals are disjoint, so taking n bigger than both n₀ and n′₀
we arrive at the absurd conclusion that sₙ lies in both of them.
Theorem 10 Every bounded monotone sequence of real numbers is
convergent.
Proof: We shall show that if s is an increasing sequence which is
bounded above, then it converges to the least upper bound l of the set
{sₙ : n ∈ N}. Let ε > 0 be any real number. Then l − ε < l and
hence there exists n₀ such that l − ε < sₙ₀. But then, since s is increasing,
it follows that sₙ ∈ (l − ε, l] for all n ≥ n₀. This proves the claim. ♠
Remark 10 Observe that the process of obtaining q from p in Example
1 may be repeated indefinitely to obtain a sequence. Thus starting
with a positive rational number p = s₀ such that p² > 2 we obtain a
monotonically decreasing sequence {sₙ} of rationals, whereas starting
with p = t₀ > 0 such that p² < 2 we obtain a monotonically increasing
sequence {tₙ} of rationals. What are the limits?
Exercise 4
(i) Show that every convergent sequence is bounded.
(ii) Show that if sn → c then every subsequence of s converges to c.
This fact can be used in different ways. If you know somehow that a
sequence is convergent but have to compute the limit, you can do so by
taking any convenient subsequence. Also if you know one subsequence
of s which is not convergent then you may immediately conclude that
the sequence itself is not convergent. Or if you know two subsequences
of s converging to different limits, then also you can conclude that the
sequence s is not convergent.
(iii) Let sₙ ≤ tₙ. Show that (if the limits exist) limₙ sₙ ≤ limₙ tₙ.
(iv) Sandwich Theorem: Let sₙ ≤ rₙ ≤ tₙ. Suppose limₙ sₙ = limₙ tₙ =
l. Show that {rₙ} converges to l.
Operations on Sequences of real numbers Given two sequences
s and t of real numbers we can define the sum sequence s + t by the
formula (s + t)(n) = sn + tn. Likewise we can define a sequence αs
where α is a real number and also the sequence st. It is not difficult to
see that
(i) If s, t are convergent then s + t, st, and αs are all convergent. Moreover,
limₙ (s + t)ₙ = limₙ sₙ + limₙ tₙ; limₙ (st)ₙ = (limₙ sₙ)(limₙ tₙ); limₙ (αs)ₙ = α limₙ sₙ.
Extended Real Number System We adjoin two extra symbols ±∞
to the set of real numbers and extend the order in R as follows:
−∞ < r < ∞, for all r ∈ R.
We often denote this set by [−∞,∞]; then of course (−∞,∞) represents
the set of real numbers. The arithmetic operations can also be
extended partially as follows:
−∞ + r = −∞; ∞ + r = ∞, for all r ∈ R;
r · ∞ = ∞; r/∞ = 0; r/0 = ∞, for all r > 0.
With these conventions we can define sₙ → ∞ if for all k > 0 there
exists n₀ = n(k) such that sₙ > k for all n > n₀, and write
limₙ→∞ sₙ = ∞.
Likewise one can also define when −∞ is the limit of a sequence. However,
we shall not call such sequences convergent. Instead, we simply say
that the sequence diverges to ∞ (or to −∞).
Definition 11 A sequence of real numbers which is not convergent to
any value in the extended real number system is called an oscillating
sequence.
Example 2 A simple example of an oscillating sequence is
1, −1, 1, −1, . . . .
One can easily have an oscillating sequence which is unbounded as well,
e.g.,
1, −2, 3, −4, 5, −6, . . . .
Remark 11 The extended number system provides a certain ease of
expressing our ideas in an unhindered fashion. For instance, we can
now define the supremum or infimum of any subset of real numbers,
not necessarily bounded. Thus, if A is a set of real numbers which is
not bounded above then supA is defined to be equal to ∞ whereas if
it is bounded above then its supremum is as defined earlier. Similarly,
a set which is not bounded below has its infimum equal to −∞.
Another advantage of the extended real number system is that
even the empty set of real numbers has a lub now, viz., −∞. For every
real number is an upper bound for the elements of ∅; therefore the set of
upper bounds is unbounded below, and hence the 'smallest' one is −∞.
Likewise, every real number is a lower bound for ∅, and hence the 'largest'
one is +∞. We can therefore say that every subset of R has a lub and
a glb in [−∞,∞].
Exercise 5
(i) Let sₙ → ∞ and tₙ → ∞. Then sₙ + tₙ → ∞; sₙtₙ → ∞; and αsₙ → ∞ for
α > 0.
(ii) Let sₙ → −∞ and tₙ → −∞. Then sₙ + tₙ → −∞; sₙtₙ → ∞; and αsₙ → −∞
for α > 0.
(iii) Let s₁ = √2 and sₙ₊₁ = √(2sₙ). Show that s is increasing and
bounded by 2. Compute the limit.
(iv) Let s₁ > s₂ > 0 and sₙ₊₂ = (sₙ + sₙ₊₁)/2. Show that s₁, s₃, . . . is
decreasing and s₂, s₄, . . . is increasing. Also show that s is convergent.
Compute the limit.
(v) Let {sₙ} be a sequence of real numbers. Prove or disprove the
following statements:
(a) {sn} has a subsequence which is non oscillating.
(b) {sn} has a convergent subsequence.
(c) If {sn} is bounded then it has a convergent subsequence.
(d) If {sn} is unbounded then it has a subsequence which diverges to
±∞.
(e) s₂ₙ → t and s₂ₙ₊₁ → t implies sₙ → t.
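Part (iii) of the last exercise is easy to explore numerically before proving it. A sketch in Python:

```python
s = 2 ** 0.5                  # s_1 = sqrt(2)
terms = [s]
for _ in range(50):           # s_{n+1} = sqrt(2 * s_n)
    s = (2 * s) ** 0.5
    terms.append(s)

assert all(a <= b for a, b in zip(terms, terms[1:]))   # increasing
assert all(t <= 2.0 for t in terms)                    # bounded by 2
assert abs(terms[-1] - 2.0) < 1e-9                     # limit L solves L = sqrt(2L), i.e. L = 2
```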
Lecture 6
Limsup and Liminf The limit of a sequence {sₙ}, if it exists, tells us
the approximate value of sₙ for large n. What happens for sequences
which do not have limits? We would like to have a device which tells
us how large or how small sₙ may become for large n, where {sₙ} is an
arbitrary sequence of real numbers. This is fulfilled by the concepts of
limsup and liminf.
Definition 12 Let {sn} be a sequence of real numbers. Put
un = sup{sk : k ≥ n}.
Note that the sets involved are nonempty and hence uₙ ∈ (−∞,∞].
Also note that if A ⊂ B then sup A ≤ sup B. Therefore {uₙ} is a
decreasing sequence. This means it has a limit in [−∞,∞]. We define
lim supₙ sₙ = limₙ uₙ.
Likewise we take lₙ = glb{sₖ : k ≥ n} and define
lim infₙ sₙ = limₙ lₙ.
The following properties are immediate.
Theorem 11 Let {sn}, {tn} be any two sequences of real numbers.
(i) lim supn sn ≥ lim infn sn.
(ii) {sn} is bounded above iff lim supn sn 6= ∞.
(iii) {sn} is bounded below iff lim infn sn 6= −∞.
(iv) If limₙ sₙ exists in [−∞,∞] then
lim supₙ sₙ = limₙ sₙ = lim infₙ sₙ.
(v) If sₙ ≤ tₙ for all n then lim supₙ sₙ ≤ lim supₙ tₙ and lim infₙ sₙ ≤ lim infₙ tₙ.
(vi) If {sₙ}, {tₙ} are bounded sequences of real numbers, then
lim supₙ (sₙ + tₙ) ≤ lim supₙ sₙ + lim supₙ tₙ
and
lim infₙ (sₙ + tₙ) ≥ lim infₙ sₙ + lim infₙ tₙ.
Proof: We shall prove (iv) only and leave the rest to you as an exercise.
Let L = limₙ sₙ. Consider the case when L is finite. Then for every
ε > 0 there exists n₀ such that for n ≥ n₀ we have
L − ε < sₙ < L + ε.
It follows that, for n ≥ n₀,
L − ε < uₙ ≤ L + ε; L − ε ≤ lₙ < L + ε.
Therefore
L − ε ≤ limₙ uₙ ≤ L + ε; L − ε ≤ limₙ lₙ ≤ L + ε.
Since this is true for every ε > 0, the conclusion follows. Now consider
the case when L = ∞. This means for every M > 0 there exists n₀
such that sₙ > M for all n ≥ n₀. Therefore uₙ > M and lₙ ≥ M for n ≥ n₀,
and hence limₙ uₙ = ∞ = limₙ lₙ. The case L = −∞ is similar. ♠
Remark 12 It is not true in general even for bounded sequences that
lim supₙ(sₙ + tₙ) = lim supₙ sₙ + lim supₙ tₙ. The simplest example is sₙ =
(−1)ⁿ, tₙ = −(−1)ⁿ. Then sₙ + tₙ = 0 for all n, and hence the LHS is 0
while the RHS is 2!
Exercise 6
1. Let {sn} be a bounded sequence of real numbers. Show that the
number U = lim supn sn is characterized by the property: For
every ε > 0, there are at most finitely many values of n such that
sₙ > U + ε, and there are infinitely many values of n for which
sₙ > U − ε. Formulate a similar statement for L = lim infₙ sₙ and
prove it.
2. Use lim supn to show that every bounded sequence of real numbers
has a convergent subsequence.
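On a finite prefix of a sequence one can at least watch the tail suprema uₙ = sup{sₖ : k ≥ n} decrease. A rough sketch in Python (a finite prefix only approximates the true limsup):

```python
def tail_suprema(seq):
    # u_n = sup{ s_k : k >= n } is decreasing; its limit is limsup_n s_n.
    return [max(seq[n:]) for n in range(len(seq))]

s = [(-1) ** n + 1 / (n + 1) for n in range(50)]   # limsup = 1, liminf = -1
u = tail_suprema(s)
assert all(u[i] >= u[i + 1] for i in range(len(u) - 1))  # {u_n} is decreasing
assert abs(u[40] - 1.0) < 0.1                            # tail suprema approach 1
```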
Cauchy Sequences Consider a sequence sn of real numbers which is
convergent to a limit L. Then we know that for every ε > 0 there is n0
such that for n ≥ n0,
sn ∈ (L− ε, L + ε).
This fact can be interpreted to mean that the members of the sequence
come as close to each other as we want after a certain stage.
That is, for all n, m ≥ n₀ we have
|sₙ − sₘ| < 2ε.
This interpretation makes no reference to the limit itself and hence is
possibly useful in situations where we know neither the value of L nor
its existence. That is indeed the case.
Definition 13 A sequence sn of real numbers is called a Cauchy se-
quence, if for every ε > 0 there is n0 such that for n, m ≥ n0,
|sn − sm| < ε.
Theorem 12 A sequence of real numbers is convergent (to a finite
limit) iff it is a Cauchy sequence.
Proof: We have already seen the 'only if' part. Now assume that {sₙ}
is a Cauchy sequence. It is easily seen that {sₙ} is bounded. Also we
know that it has a subsequence {tₖ = sₙₖ} which is convergent, say to
L. It is easily seen that sₙ → L. ♠
Remark 13 For an alternative proof using limsup see the books.
Exercise 7 Some Special Sequences Establish the following:
1. For p > 1, limₙ pⁿ = ∞; limₙ p⁻ⁿ = 0.
2. For p > 0, limₙ 1/n^p = 0.
3. For p > 0, limₙ p^(1/n) = 1; also limₙ n^(1/n) = 1.
4. For p > 0 and α real, limₙ n^α/(1 + p)ⁿ = 0.
Hints: (1) and (2): Archimedean property.
(3) Put xₙ = p^(1/n) − 1 and show that 0 ≤ xₙ ≤ (p − 1)/n.
(4) For k > α, k > 0 and for n > 2k we have
(1 + p)ⁿ > [n(n − 1) · · · (n − k + 1)/k!] pᵏ > nᵏpᵏ/(2ᵏk!).
Therefore, 0 < n^α/(1 + p)ⁿ < (2ᵏk!/pᵏ) n^(α−k) → 0 by (2).
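These limits are easy to corroborate numerically (the cutoffs below are arbitrary choices of ours). A sketch in Python:

```python
# (3) the n-th root of n tends to 1
roots = [n ** (1 / n) for n in (10, 100, 1000, 10000)]
assert all(abs(r - 1.0) < 0.3 for r in roots)
assert abs(roots[-1] - 1.0) < 0.01

# (4) exponentials beat powers: n^alpha / (1 + p)^n -> 0, here alpha = 2, p = 0.1
vals = [n ** 2 / 1.1 ** n for n in (10, 100, 500)]
assert vals[-1] < 1e-9
```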
Lecture 7
The following theorem characterizes lim supₙ aₙ for any sequence {aₙ} of
real numbers.
Theorem 13 (Limsup-I) If α > lim supₙ aₙ then there exists n₀
such that aₙ < α for all n ≥ n₀.
(Limsup-II) If β < lim supₙ aₙ then there exist infinitely many nⱼ
such that aₙⱼ > β.
Theorem 14 For any sequence {aₙ} of real numbers consider the set
S = {r ∈ [−∞,∞] : there exists a subsequence aₙₖ → r}.
Then lim supₙ aₙ = sup S.
In an exactly similar way, lim infₙ has analogous properties.
Series
Remark 14 Given two numbers, we can add them to get another
number. Repeatedly carrying out this operation allows us to talk about
sums of any finitely many numbers. We would like to talk about ‘sum’
of infinitely many numbers as well. A natural way to do this is to label
the given numbers, take sums of first n of them and look at the ‘limit’
of the sequence of numbers so obtained.
Thus given a (countable) collection of numbers, the first step is to
label them to get a sequence {sₙ}. In the second step, we form another
sequence, the sequence of partial sums tₙ = ∑ₖ₌₀ⁿ sₖ. Observe that
the first sequence {sₙ} can be recovered completely from the second
one {tₙ}. The third step is to assign a limit to the second sequence,
provided the limit exists. This entire process is covered by the single
term 'series'. However, below, we shall stick to the popular definition
of a series.¹
¹For a rigorous definition of a series, see [G-L].
Definition 14 By a series of real or complex numbers we mean a
formal infinite sum:
∑ₙ sₙ := s₀ + s₁ + · · · + sₙ + · · ·
Of course, it is possible that there are only finitely many nonzero
terms here. The sequence of partial sums associated to the above series
is defined by tₙ := ∑ₖ₌₀ⁿ sₖ. We say the series ∑ₙ sₙ is convergent
if the associated sequence {tₙ} of partial sums is convergent.
In that case, if s is the limit of this sequence, then we say s is the
sum of the series and write
∑ₙ sₙ := s.
It should be noted that even if s is finite, it is not obtained via an
arithmetic operation of taking sums of members of {sₙ}, but by taking
the limit of the associated sequence {tₙ} of partial sums. Since displaying
all elements of {tₙ} allows us to recover the original sequence {sₙ}
by the formula sₙ = tₙ − tₙ₋₁ (with s₀ = t₀), results that we formulate
for sequences have their counterparts for series and vice versa, and hence
in principle we need to do this only for one of them. For example, we
can talk of a series which is the sum of two series ∑ₙ aₙ, ∑ₙ bₙ, viz.
∑ₙ(aₙ + bₙ), and if both ∑ₙ aₙ, ∑ₙ bₙ are convergent to finite sums
then the sum series ∑ₙ(aₙ + bₙ) is convergent to the sum of their sums.
Nevertheless, it is good to go through these notions. For example,
Cauchy's criterion for the convergence of the sequence {tₙ} can be
converted into
Theorem 15 A series ∑ₙ sₙ is convergent to a finite sum iff for every
ε > 0 there exists n₀ such that |∑ₖ₌ₙᵐ sₖ| < ε for all m ≥ n ≥ n₀.
As a corollary we obtain
Corollary 1 If ∑ₙ sₙ is convergent to a finite sum then sₙ → 0.
Of course the converse does not hold, as seen by the harmonic series
∑ₙ 1/n.
Once again it is immediate that if ∑ₙ zₙ and ∑ₙ wₙ are convergent
series then, for any complex number λ, the series ∑ₙ λzₙ and ∑ₙ(zₙ + wₙ)
are convergent and
∑ₙ λzₙ = λ ∑ₙ zₙ; ∑ₙ(zₙ + wₙ) = ∑ₙ zₙ + ∑ₙ wₙ.        (5)
Theorem 16 A series of positive terms ∑ₙ aₙ is convergent iff the
sequence of partial sums is bounded.
Theorem 17 Comparison Test
(a) If |aₙ| ≤ cₙ for all n ≥ n₀ for some n₀, and ∑ₙ cₙ is convergent, then
∑ₙ aₙ is convergent.
(b) If aₙ ≥ bₙ ≥ 0 for all n ≥ n₀ for some n₀, and ∑ₙ bₙ diverges,
then ∑ₙ aₙ diverges.
The geometric series is the mother of all series:
Theorem 18 Geometric Series If |x| < 1 then ∑ₙ xⁿ = 1/(1 − x).
If |x| ≥ 1, then the series diverges.
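A minimal numerical check of the statement (Python):

```python
def geometric_partial(x, n):
    # t_n = 1 + x + ... + x^n; for |x| < 1 this tends to 1/(1 - x)
    return sum(x ** k for k in range(n + 1))

assert abs(geometric_partial(0.5, 100) - 2.0) < 1e-9
assert abs(geometric_partial(-0.9, 2000) - 1 / 1.9) < 1e-9
# for |x| >= 1 the terms do not tend to 0, so by Corollary 1 the series cannot converge
```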
Theorem 19 The series ∑ₙ 1/n! is convergent and its sum is denoted by e.
We have 2 < e < 3.
Proof: For n ≥ 2, we have
2 < tₙ = 1 + 1 + 1/2! + · · · + 1/n! < 1 + (1 + 1/2 + · · · + 1/2ⁿ⁻¹) < 1 + 1/(1 − 1/2) = 3. ♠
Theorem 20 limₙ (1 + 1/n)ⁿ = e.
Proof: Put tₙ = ∑ₖ₌₀ⁿ 1/k! and rₙ = (1 + 1/n)ⁿ. By the binomial theorem,
rₙ = 1 + 1 + (1/2!)(1 − 1/n) + · · · + (1/n!)(1 − 1/n)(1 − 2/n) · · · (1 − (n − 1)/n) < tₙ.
Therefore lim supₙ rₙ ≤ e. On the other hand, for a fixed m, if n ≥ m
we have
rₙ ≥ 1 + 1 + (1/2!)(1 − 1/n) + · · · + (1/m!)(1 − 1/n) · · · (1 − (m − 1)/n).
Letting n → ∞, we get
lim infₙ rₙ ≥ tₘ.
Since this is true for every m, we get e ≤ lim infₙ rₙ. ♠
Remark 15 The rapidity with which this sequence converges is estimated
by considering:
e − tₙ = 1/(n + 1)! + 1/(n + 2)! + · · · < [1/(n + 1)!][1 + 1/(n + 1) + 1/(n + 1)² + · · ·] = 1/(n!n).
Thus
0 < e − tₙ < 1/(n!n).
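The bound 0 < e − tₙ < 1/(n!n) can be checked numerically for small n. A sketch in Python:

```python
import math

for n in (5, 8, 10):
    t_n = sum(1 / math.factorial(k) for k in range(n + 1))
    err = math.e - t_n                       # 0 < e - t_n ...
    assert 0 < err < 1 / (math.factorial(n) * n)   # ... < 1/(n! n)
```

Already at n = 10 the error is below 3 × 10⁻⁸, which is what makes the irrationality argument below so effective.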
Corollary 2 e is irrational.
Proof: Assume on the contrary that e = p/q with p, q positive integers.
Then q!e and q!t_q are both integers. On the other hand,
0 < q!e − q!t_q < 1/q ≤ 1, which is absurd. ♠
Definition 15 A series ∑ₙ aₙ is said to be absolutely convergent if
the series ∑ₙ |aₙ| is convergent to a finite limit.
Theorem 21 Suppose {aₙ} is a decreasing sequence of positive terms.
Then ∑ₙ aₙ is convergent iff ∑ₖ 2ᵏ a_{2^k} is convergent.
Theorem 22 ∑ₙ 1/n^p < ∞ iff p > 1.
Corollary 3 The harmonic series is divergent.
Theorem 23 The series ∑ₙ₌₂ 1/(n ln n)^p is convergent iff p > 1.
Theorem 24 Ratio Test: If {aₙ} is a sequence of positive terms such
that
lim supₙ aₙ₊₁/aₙ = r < 1,
then ∑ₙ aₙ is convergent. If aₙ₊₁/aₙ ≥ 1 for all n ≥ n₀ for some n₀, then
∑ₙ aₙ is divergent.
Proof: To see the first part, choose s so that r < s < 1. Then there exists
N such that aₙ₊₁/aₙ < s for all n ≥ N. This implies a_{N+k} < a_N sᵏ, k ≥ 1.
Since the geometric series ∑ₖ sᵏ is convergent, the convergence of
∑ₙ aₙ follows by the comparison test. The second part is obvious, since then
aₙ cannot converge to 0. ♠
Tutorial Session on Wed. 12th August
Exercise 8
1. Let zₙ = xₙ + iyₙ, n ≥ 1. Show that zₙ → z = x + iy iff xₙ → x
and yₙ → y.
2. Let ∑ₙ zₙ be a convergent series of complex numbers such that
ℜ(zₙ) ≥ 0 for all n. If ∑ₙ zₙ² is also convergent, show that ∑ₙ |zₙ|²
is convergent.
3. For 0 ≤ θ < 2π and for any α ∈ R, define the closed sector S(α, θ)
with span θ by
S(α, θ) = {rE(β) : r ≥ 0 & α ≤ β ≤ α + θ}.
Let ∑ₙ zₙ be a convergent series. If zₙ ∈ S(α, θ), n ≥ 1, where
θ < π, then show that ∑ₙ |zₙ| is convergent. (This is an improvement
on Exercise 2 above!)
4. Let ∑ₙ zₙ be a series of complex numbers such that each of its four
subseries, consisting of the terms lying in the same closed quadrant, is
convergent. Show that ∑ₙ |zₙ| is convergent.
5. Telescoping: Given a sequence {xₙ}, define the difference sequence
aₙ := xₙ − xₙ₊₁. Then show that the series ∑ₙ aₙ is
convergent iff the sequence {xₙ} is convergent, and in that case
∑ₙ aₙ = x₀ − limₙ→∞ xₙ.
6. Let {zₙ} be a bounded sequence and ∑ₙ wₙ an absolutely convergent
series. Show that ∑ₙ zₙwₙ is absolutely convergent.
7. Abel’s Test: For any sequence of complex numbers {an}, define
S0 = 0 and Sn =∑n
k=1 ak, n ≥ 1. Let {bn} be any sequence of
complex numbers.
(i) Prove Abels’ Identity:
n∑k=m
akbk =n−1∑k=m
Sk(bk − bk+1)− Sm−1bm + Snbn, 1 ≤ m ≤ n.
(LHS =∑
(Sk − Sk−1)bk =∑n
m Skbk −∑n−1
m−1 Skbk+1 = RHS.)
(ii) Show that∑
n anbn is convergent if the series∑
k Sk(bk−bk+1)
is convergent and limn−→∞
Snbn exits.
(iii) Abel’s Test: Let∑
n an be a convergent series and {bn} be
a bounded monotonic sequence of real numbers. Then show that∑n anbn is convergent.
8. Dirichlet’s Test: Let∑
n an be such that the partial sums are
bounded and let {bn} be a monotonic sequence tending to zero.
Then show that∑
n anbn is convergent.
23
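To see Dirichlet's test in action (a sketch of ours, not from the notes), take a_n = (−1)^{n+1}, whose partial sums are bounded, and b_n = 1/n, which decreases monotonically to 0; the resulting series is the alternating harmonic series, with sum ln 2.

```python
import math

def dirichlet_partial_sums(a, b, N):
    """Partial sums of sum_{n=1}^{N} a(n) * b(n)."""
    out, s = [], 0.0
    for n in range(1, N + 1):
        s += a(n) * b(n)
        out.append(s)
    return out

# a_n = (-1)^{n+1}: partial sums alternate between 1 and 0, hence bounded;
# b_n = 1/n decreases monotonically to 0, so Dirichlet's test applies.
sums = dirichlet_partial_sums(lambda n: (-1) ** (n + 1), lambda n: 1.0 / n, 100_000)
approx = sums[-1]          # the limit here is ln 2
```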
Lecture 8
Today we shall write down properly some of the important things
that we saw in the previous tutorial session.
1. Let ∑_n a_n = s be a convergent series of non negative real numbers. Then ∑_n a_n^2 is convergent.
(∑_{k=0}^n a_k^2 ≤ (∑_{k=0}^n a_k)^2 ≤ s^2.)

2. Suppose {z_n} is a bounded sequence of complex numbers and ∑_n w_n is absolutely convergent. Then ∑_n z_n w_n is absolutely convergent.
(∑_{k=0}^n |z_k w_k| ≤ M ∑_{k=0}^n |w_k|.)

3. Abel's Test: For any sequence of complex numbers {a_n}, define S_0 = 0 and S_n = ∑_{k=1}^n a_k, n ≥ 1. Let {b_n} be any sequence of complex numbers.
(i) Prove Abel's Identity:

∑_{k=m}^n a_k b_k = ∑_{k=m}^{n−1} S_k(b_k − b_{k+1}) − S_{m−1} b_m + S_n b_n, 1 ≤ m ≤ n.

(LHS = ∑_{k=m}^n (S_k − S_{k−1}) b_k = ∑_{k=m}^n S_k b_k − ∑_{k=m−1}^{n−1} S_k b_{k+1} = RHS.)
(ii) Show that ∑_n a_n b_n is convergent if the series ∑_k S_k(b_k − b_{k+1}) is convergent and lim_{n→∞} S_n b_n exists.
(Put m = 1.)
(iii) Abel's Test: Let ∑_n a_n be a convergent series and {b_n} be a bounded monotonic sequence of real numbers. Then show that ∑_n a_n b_n is convergent.
(∑_n (b_n − b_{n+1}) is convergent by Telescoping, and absolutely so, since {b_n} is monotonic. The series ∑_n a_n is convergent and hence {S_n} is bounded. By the previous exercise, the series ∑_n S_n(b_n − b_{n+1}) is convergent. Since both S_n and b_n are convergent, S_n b_n is convergent. Therefore, (ii) applies.)

4. Dirichlet's Test: Let ∑_n a_n be such that the partial sums are bounded and let {b_n} be a monotonic sequence tending to zero. Then show that ∑_n a_n b_n is convergent.
(The arguments are already there in the above exercise.)

5. Derive the following Leibniz test from Dirichlet's Test: If {c_n} is a monotonic sequence converging to 0, then the alternating series ∑_n (−1)^n c_n is convergent.
(Take a_n = (−1)^n and b_n = c_n in Dirichlet's test.)

6. Generalize Leibniz's test as follows: If {c_n} is a monotonic sequence converging to 0 and ζ is a complex number such that |ζ| = 1, ζ ≠ 1, then ∑_n ζ^n c_n is convergent.

7. Show that if ∑_n a_n is convergent then the following series are all convergent:
(a) ∑_n a_n/n^p, p > 0; (b) ∑_n a_n/log^p n; (c) ∑_n n^{1/n} a_n; (d) ∑_n (1 + 1/n)^n a_n.

8. Show that for any p > 0 and for every real number x, ∑_n sin(nx)/n^p is convergent.
Theorem 25 Root Test: For a sequence {a_n} of positive terms, put l = lim sup_n a_n^{1/n}. Then
(a) l < 1 ⇒ ∑_n a_n < ∞.
(b) l > 1 ⇒ ∑_n a_n = ∞.
(c) If l = 1, the series ∑_n a_n can be finite or infinite.

Proof: Choose l < r < 1 and then an integer N such that a_n^{1/n} < r for all n ≥ N. Therefore a_n < r^n and we can now compare with the geometric series. The proof of (b) is similar. (c) is demonstrated by the series ∑_n 1/n and ∑_n 1/n^2. ♠
Remark 16 As compared to the ratio test, the root test is more powerful, in the sense that wherever the ratio test is conclusive, so is the root test. Also, there are cases where the ratio test fails but the root test works. However, the ratio test is easier to apply.
Example 3 Put a_{2n+1} = 1/2^{n+1}, a_{2n} = 1/3^n. Then

lim inf_n a_{n+1}/a_n = lim_n 2^n/3^n = 0;  lim inf_n a_n^{1/n} = lim_n (1/3^n)^{1/2n} = 1/√3.

lim sup_n a_{n+1}/a_n = lim_n (3/2)^n = ∞;  lim sup_n a_n^{1/n} = lim_n (1/2^{n+1})^{1/(2n+1)} = 1/√2.

The ratio test cannot be applied. The root test gives the convergence.
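These four limits are easy to check numerically (our own sketch; the names below are ours): over a late stretch of indices, the n-th roots of a_n stay below 1 while the ratios blow up.

```python
import math

# Terms of Example 3: a_{2n} = 1/3^n, a_{2n+1} = 1/2^{n+1}.
def a(k):
    n, rem = divmod(k, 2)
    return 3.0 ** -n if rem == 0 else 2.0 ** -(n + 1)

ks = range(101, 201)
root_limsup = max(a(k) ** (1.0 / k) for k in ks)    # approaches 1/sqrt(2) < 1
ratio_limsup = max(a(k + 1) / a(k) for k in ks)     # grows like (3/2)^n: unbounded
```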
The following theorem proves the claim that we have made in the above
remark.
Theorem 26 For any sequence {a_n} of positive terms,

lim inf_n a_{n+1}/a_n ≤ lim inf_n a_n^{1/n} ≤ lim sup_n a_n^{1/n} ≤ lim sup_n a_{n+1}/a_n.
Definition 16 A series ∑_n z_n is said to be absolutely convergent if the series ∑_n |z_n| is convergent.

Again, it is easily seen that an absolutely convergent series is convergent, whereas the converse is not true, as seen with the standard example ∑_n (−1)^n/n. The notion of absolute convergence plays a very important role throughout the study of convergence of series. As an illustration we shall obtain the following useful result about the convergence of the product series.
Definition 17 Given two series ∑_n a_n, ∑_n b_n, the Cauchy product of these two series is defined to be ∑_n c_n, where c_n = ∑_{k=0}^n a_k b_{n−k}.
Theorem 27 If ∑_n a_n, ∑_n b_n are two absolutely convergent series, then their Cauchy product series is absolutely convergent and its sum is equal to the product of the sums of the two series:

∑_n c_n = (∑_n a_n)(∑_n b_n). (6)
Proof: We begin with the remark that if both the series consist of
only non negative real numbers, then the assertion of the theorem is
obvious. We shall use this in what follows.
Consider the remainders after n − 1 terms of the corresponding absolute series:

R_n = ∑_{k≥n} |a_k|;  T_n = ∑_{k≥n} |b_k|.

Clearly,

∑_{n≥0} |c_n| ≤ ∑_{k≥0} ∑_{l≥0} |a_k||b_l| = R_0 T_0.

Therefore the series ∑_n c_n is absolutely convergent. Further,

|∑_{k≤2n} c_k − (∑_{k≤n} a_k)(∑_{k≤n} b_k)| ≤ R_0 T_{n+1} + T_0 R_{n+1},

since the terms that remain on the LHS after cancellation are of the form a_k b_l where either k ≥ n + 1 or l ≥ n + 1. Upon taking the limit as n → ∞, we obtain (6). ♠
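The identity (6) is easy to test numerically (our own sketch, not part of the notes) on two absolutely convergent geometric series.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k b_{n-k} of the Cauchy product."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]

# Two absolutely convergent geometric series: sum (1/2)^n = 2 and sum (1/3)^n = 3/2.
N = 60
a = [0.5 ** n for n in range(N)]
b = [(1.0 / 3.0) ** n for n in range(N)]
c = cauchy_product(a, b)
product_sum = sum(c)        # close to 2 * (3/2) = 3, up to a tiny truncation error
```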
Remark 17 This theorem is true even if one of the two series is abso-
lutely convergent and the other is convergent. For a proof of this, see
[R].
An important property of an absolutely convergent series is:

Theorem 28 Let ∑_n z_n be an absolutely convergent series. Then every rearrangement ∑_n z_{σ(n)} of the series is also absolutely convergent, and hence convergent. Moreover, each such rearrangement converges to the same sum.
Proof: (Recall that a rearrangement ∑_n z_{σ(n)} of ∑_n z_n is obtained by taking a bijection σ : N → N.) Let ∑_n z_n = z. The only thing that needs a proof at this stage is that ∑_n z_{σ(n)} = z. Let us denote the partial sums s_n = ∑_{k=0}^n z_k, t_n = ∑_{k=0}^n z_{σ(k)}. Since ∑_n z_n is absolutely convergent, given ε > 0 there is an N such that ∑_{k=n}^m |z_k| < ε for all m ≥ n ≥ N. Pick N_1 large enough so that

{1, 2, . . . , N} ⊂ {σ(1), σ(2), . . . , σ(N_1)}.

Then for n ≥ N_1, we have |s_n − t_n| ≤ ∑_{k≥N+1} |z_k| < ε. Therefore, lim_n s_n = lim_n t_n. ♠

Riemann's Rearrangement Theorem: Let ∑_n a_n be a convergent series of real numbers which is not absolutely convergent. Given −∞ ≤ α ≤ β ≤ ∞, there exists a rearrangement ∑_n a_{τ(n)} of ∑_n a_n with partial sums t_n such that

lim inf_n t_n = α;  lim sup_n t_n = β.

We are not going to prove this. See [R] for a proof.
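The standard greedy construction behind Riemann's theorem (our own sketch, not the proof in [R]) is easy to run on the alternating harmonic series: keep taking unused positive terms while the partial sum is below the target, and unused negative terms while it is above.

```python
import math

def rearrange_to(target, n_terms=200_000):
    """Greedily rearrange the conditionally convergent series
    sum (-1)^{n+1}/n: take the next unused positive term while the partial
    sum is <= target, the next negative term while it is above."""
    pos = (1.0 / n for n in range(1, 10 ** 9, 2))     # 1, 1/3, 1/5, ...
    neg = (-1.0 / n for n in range(2, 10 ** 9, 2))    # -1/2, -1/4, ...
    s = 0.0
    for _ in range(n_terms):
        s += next(pos) if s <= target else next(neg)
    return s

s = rearrange_to(math.pi)   # partial sums can be steered toward any target
```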
Lecture 9
Definition 18 By a formal power series in one variable t over K, we mean a sum of the form

∑_{n=0}^∞ a_n t^n, a_n ∈ K.

Note that for this definition to make sense, the sequence {a_n} can be inside any set. However, we shall restrict this and assume that the sequences are taken inside a field K. Let K[[t]] denote the set of all formal power series ∑_n a_n t^n in t with coefficients a_n ∈ K. Observe that when at most a finite number of a_n are non zero, the above sum gives a polynomial. Thus, all polynomials in t are power series in t, i.e., K[t] ⊂ K[[t]].
Just like polynomials, we can add two power series 'term-by-term' and we can also multiply them by scalars, viz.,

∑_n a_n t^n + ∑_n b_n t^n := ∑_n (a_n + b_n) t^n;  α(∑_n a_n t^n) := ∑_n α a_n t^n.

Verify that the above two operations make K[[t]] into a vector space over K.

Further, we can even multiply two formal power series:

(∑_n a_n t^n)(∑_n b_n t^n) := ∑_n c_n t^n,

where c_n = ∑_{k=0}^n a_k b_{n−k}. This product is called the Cauchy product. One can directly check that K[[t]] is then a commutative ring with the multiplicative identity being the power series

1 := ∑_n a_n t^n, where a_0 = 1 and a_n = 0, n ≥ 1.

(Together with the vector space structure, K[[t]] is actually a K-algebra.) Observe that the ring of polynomials in t forms a subring of K[[t]]. What we are now interested in is to get nice functions out of power series.
Observe that, if p(t) is a polynomial over K then by the method
of substitution, it defines a function a 7→ p(a), from K to K. It
is customary to denote this map by p(t) itself. However, due to the
infinite nature of the sum involved, given a power series P and a point
a ∈ K, the substitution P (a) may not make sense in general. This is
the reason why we have to treat power series with a little more care,
via the notion of convergence.
Definition 19 A formal power series P(t) = ∑_n a_n t^n is said to be convergent at z_0 ∈ C if the sequence {s_n}, where s_n = ∑_{k=0}^n a_k z_0^k, is convergent. In that case we write P(z_0) = lim_{n→∞} s_n for this limit. Putting t_n = a_n z_0^n, this just means that the series of complex numbers ∑_n t_n is convergent.
Remark 18 Observe that every power series is convergent at 0.
Definition 20 A power series is said to be a convergent power series,
if it is convergent at some point z0 6= 0.
The following few theorems, which are attributed to Cauchy-Hadamard2
and Abel3, are most fundamental in the theory of convergent power se-
ries.
Theorem 29 Cauchy-Hadamard Formula: Let P = ∑_{n≥0} a_n t^n be a power series over C. Put L = lim sup_n |a_n|^{1/n} and R = 1/L, with the convention 1/0 = ∞, 1/∞ = 0. Then
(a) for all 0 < r < R, the series P(t) is absolutely and uniformly convergent in |z| ≤ r, and
(b) for all |z| > R the series is divergent.

2 Jacques Hadamard (1865-1963) was a French mathematician who was the most influential mathematician of his days; he worked in several areas of mathematics such as complex analysis, analytic number theory, partial differential equations, hydrodynamics and logic.
3 Niels Henrik Abel (1802-1829) was a Norwegian, who died young under deprivation. At the age of 21, he proved the impossibility of solving a general quintic by radicals. He did not get any recognition during his lifetime for his now famous works on convergence, on so-called abelian integrals, and on elliptic functions.
Proof: (a) Let 0 < r < R. Choose r < s < R. Then 1/s > 1/R = L and hence by property (Limsup-I), we must have n_0 such that for all n ≥ n_0, |a_n|^{1/n} < 1/s. Therefore, for all |z| ≤ r, |a_n z^n| < (r/s)^n, n ≥ n_0. Since r/s < 1, by the Weierstrass majorant criterion (Theorem 32), it follows that P(z) is absolutely and uniformly convergent.
(b) Suppose |z| > R. We fix s such that |z| > s > R. Then 1/s < 1/R = L, and hence by property (Limsup-II), there exist infinitely many n_j for which |a_{n_j}|^{1/n_j} > 1/s. This means that |a_{n_j} z^{n_j}| > (|z|/s)^{n_j} > 1. It follows that the n-th term of the series ∑_n a_n z^n does not converge to 0 and hence the series is divergent. ♠
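The formula R = 1/lim sup_n |a_n|^{1/n} also lends itself to a crude numerical estimate (our own sketch; the function name is ours).

```python
import math

def radius_estimate(coeff, N=400, tail=200):
    """Estimate R = 1 / limsup_n |a_n|^(1/n) (Cauchy-Hadamard) by taking the
    max of |a_n|^(1/n) over a late stretch of indices."""
    L = max(abs(coeff(n)) ** (1.0 / n) for n in range(N - tail, N))
    return math.inf if L == 0.0 else 1.0 / L

R_geom = radius_estimate(lambda n: 2.0 ** n)   # a_n = 2^n  ->  R = 1/2 exactly
R_lin = radius_estimate(lambda n: float(n))    # a_n = n    ->  R close to 1
```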
Definition 21 Given a power series ∑_n a_n t^n,

R = sup{|z| : ∑_n a_n z^n < ∞}

is called the radius of convergence of the series. The above theorem gives you the formula for R.
Remark 19 Observe that if P(t) is convergent at some z, then the radius of convergence of P is at least |z|. The second part of the theorem gives you the formula for R; this is called the Cauchy-Hadamard formula. It is implicit in this theorem that the collection of all points at which a given power series converges consists of an open disc centered at the origin and perhaps some points on the boundary of the disc. This disc is called the disc of convergence of the power series. Observe that the theorem does not say anything about the convergence of the series at points on the boundary |z| = R. The examples below will tell you that anything can happen.
Example 4 The series ∑_n t^n, ∑_n t^n/n, ∑_n t^n/n^2 all have radius of convergence 1. The first one is not convergent at any point of the boundary of the disc of convergence |z| = 1. The second is convergent at all the points of the boundary except at z = 1 (Dirichlet's test) and the last one is convergent at all the points of the boundary (compare with ∑_n 1/n^2). These examples clearly illustrate that the boundary behavior of a power series needs to be studied more carefully.
Remark 20 It is not hard to see that the sum of two convergent power
series is convergent. Indeed, the radius of convergence of the sum is
at least the minimum of the radii of convergence of the summands.
Similar statement holds for Cauchy product. Since Cauchy product of
two convergent series with non negative real coefficients is convergent,
it follows that the radius of convergence of the Cauchy product of two
series is at least the minimum of the radii of convergence of the two
series.
Example 5 Here is an example of the usefulness of the Cauchy product. Consider the geometric series g(t) = 1 + t + t^2 + · · · with radius of convergence equal to 1. We can easily compute (g(t))^2 and see that

(g(t))^2 = 1 + 2t + 3t^2 + · · · + n t^{n−1} + · · ·

which also has radius of convergence at least 1. Also, it is not convergent at t = 1. Hence the radius of convergence is exactly one. Thus, it follows that ∑_k k t^k = t g(t)^2 also has radius of convergence equal to 1. By the Cauchy-Hadamard theorem, it follows that lim sup_n n^{1/n} = 1. In turn, it follows that for all integers m, the series ∑_k k^m t^k have radius of convergence 1.
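Both observations in this example are easy to check numerically (our own sketch): the Cauchy square of the geometric series has n-th coefficient n + 1, and n^{1/n} → 1.

```python
def convolve(a, b):
    """Cauchy-product coefficients c_n = sum_k a_k b_{n-k} (truncated)."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]

ones = [1] * 10                  # truncation of g(t) = 1 + t + t^2 + ...
g2 = convolve(ones, ones)        # coefficients 1, 2, 3, ..., 10 of g(t)^2

# n^(1/n) tends to 1, consistent with radius of convergence 1 for sum k t^k.
root_at_large_n = 100_000 ** (1.0 / 100_000)
```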
Definition 22 Given a power series P(t) = ∑_{n≥0} a_n t^n, the derived series P′(t) is defined by taking term-by-term differentiation: P′(t) = ∑_{n≥1} n a_n t^{n−1}. The series ∑_{n≥0} a_n t^{n+1}/(n + 1) is called the integrated series.
As an application of Cauchy-Hadamard formula, we derive:
Theorem 30 A power series P (t), its derived series P ′(t) and any
series obtained by integrating P (t) all have the same radius of conver-
gence.
Proof: Let the radii of convergence of P(t) = ∑_n a_n t^n and P′(t) be r, r′ respectively. It is enough to prove that r = r′.

We will first show that r ≥ r′. For this we may assume without loss of generality that r′ > 0. Let 0 < r_1 < r′. Then

∑_{n≥1} |a_n| r_1^n = r_1 ∑_{n≥1} |a_n| r_1^{n−1} ≤ r_1 ∑_{n≥1} n |a_n| r_1^{n−1} < ∞.

It follows that r ≥ r_1. Since this is true for all 0 < r_1 < r′, this means r ≥ r′.

Now to show that r ≤ r′, we can assume that r > 0 and let 0 < r_1 < r. Choose r_2 such that r_1 < r_2 < r. Then for each n ≥ 1,

n r_1^{n−1} = (n/r_1)(r_1/r_2)^n r_2^n ≤ (M/r_1) r_2^n,

where M = ∑_{k≥1} k (r_1/r_2)^k < ∞, since the radius of convergence of ∑_k k t^k is at least 1 (see Example 5). Therefore,

∑_{n≥1} n |a_n| r_1^{n−1} ≤ (M/r_1) ∑_{n≥1} |a_n| r_2^n < ∞.

We conclude that r′ ≥ r_1 and since this holds for all r_1 < r, it follows that r′ ≥ r. ♠
Remark 21
(i) For any sequence {b_n} of non negative real numbers, one can directly try to establish

lim sup_n ((n + 1) b_{n+1})^{1/n} = lim sup_n b_n^{1/n},

which is equivalent to proving Theorem 30. However, the full details of such a proof are no simpler than the above proof. Moreover, in this approach, we still need to compute the sum of the derived series.
(ii) A typical error a student falls into is the following: it is not true that

(lim sup_n a_n)(lim sup_n b_n) = lim sup_n a_n b_n

for any two sequences of real numbers, as can be seen by the example (1, 0, 1, 0, . . .) and (0, 1, 0, 1, . . .). However, it is true that

(lim_n a_n)(lim sup_n b_n) = lim sup_n a_n b_n

whenever {a_n} or {b_n} is a convergent sequence. What is true in general (for sequences of non negative terms) is:

(lim sup_n a_n)(lim sup_n b_n) ≥ lim sup_n a_n b_n.

Now assume that {a_n} is convergent. Let {b_{n_k}} be a subsequence which converges to b = lim sup_n b_n. Then the subsequence a_{n_k} b_{n_k} → ab, where a = lim_n a_n. This immediately implies that lim sup_n a_n b_n ≥ ab.
(iii) A power series with radius of convergence 0 is apparently 'useless' for us, for it only defines a function at a point. It should be noted that there are other areas of mathematics where formal power series, whether they converge or not, have many applications.
(iv) A power series P(t) with a positive radius of convergence R defines a continuous function z ↦ P(z) in the disc of convergence B_R(0), by Theorem 33. Also, by shifting the origin, we can even get continuous functions defined in B_R(z_0), viz., by substituting t = z − z_0.
(v) One expects that functions which agree with a convergent power series in a small neighborhood of every point will have properties akin to those of polynomials. So, the first step towards this is to see that a power series indeed defines a C-differentiable function in the disc of convergence.
Example 6 Hemachandra Numbers: For any positive integer n, let H_n denote the number of patterns you may be able to produce on a drum in a fixed duration of n beats. For instance, in Dha-dhin-dhin, the first Dha takes two beats whereas the following two Dhin's take one beat each. Clearly H_1 = 1 and H_2 = 2. Hemachandra 4 noted that since the last syllable is either of one beat or two beats, it follows that H_n = H_{n−1} + H_{n−2} for all n ≥ 3. These numbers were known to Indian poets, musicians and percussionists as Hemachandra numbers.

Define F_0 = 0, F_1 = 1 and F_n = F_{n−1} + F_{n−2}, n ≥ 2. Note that F_n = H_{n−1}, n ≥ 2. These F_n are called Fibonacci numbers. 5 (Thus the first few Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . .)
Form the formal power series

F(t) = ∑_{n=0}^∞ F_n t^n. (7)

Multiplying the given recurrence relation by t^n and summing over n from 2 to ∞ gives

∑_{n=2}^∞ F_n t^n = t ∑_{n=2}^∞ F_{n−1} t^{n−1} + t^2 ∑_{n=2}^∞ F_{n−2} t^{n−2}, (8)

and hence

(1 − t − t^2) F(t) = t.

Write (1 − t − t^2) = (1 − αt)(1 − βt), where α = (1 + √5)/2 and β = (1 − √5)/2. Put S_w(t) = 1 + wt + w^2 t^2 + · · ·. Then (1 − wt) S_w(t) = 1 and

F(t) = S_α(t) S_β(t) t.

Comparing the coefficients of t^{n+1} on either side, we get

F_{n+1} = ∑_{j=0}^n α^j β^{n−j} = (α^{n+1} − β^{n+1})/(α − β) = (1/√5)(α^{n+1} − β^{n+1}). (9)

4 Hemachandra Suri (1089-1175) was born in Dhandhuka, Gujarat. He was a Jain monk and was an adviser to king Kumarapala. His work in the early 12th century is already based on even earlier works of Gopala.
5 Leonardo Pisano (Fibonacci) (1175-1250) was born in Pisa, Italy; his book Liber abbaci introduced the Hindu-Arabic decimal system to the western world. He discovered these numbers at least 50 years later than Hemachandra's record.
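The closed form (9) can be checked directly against the recurrence (our own sketch, not part of the notes).

```python
import math

def fib(n):
    """Fibonacci numbers via the recurrence F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Formula (9): F_n = (alpha^n - beta^n) / sqrt(5).
alpha = (1 + math.sqrt(5)) / 2
beta = (1 - math.sqrt(5)) / 2
binet = lambda n: (alpha ** n - beta ** n) / math.sqrt(5)
```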
Exercise 9 *
1. Verify that K[[t]] is a K-algebra, i.e., a K-vector space which is a
commutative ring with a multiplicative unit.
2. For a non zero element p = ∑_{n≥0} a_n t^n ∈ K[[t]], the order ω(p) of p is defined to be the least integer n for which a_n ≠ 0. By convention, we define ω(0) = +∞. (This is consistent with the convention that the infimum of an empty subset of real numbers is +∞.) Show that ω(p + q) ≥ min{ω(p), ω(q)} and ω(pq) = ω(p) + ω(q).
3. Given p ∈ K[[t]], show that p has a multiplicative inverse iff
ω(p) = 0.
4. Show that K[[t]] is an integral domain, i.e., p, q ∈ K[[t]] such that
pq = 0 implies p = 0 or q = 0.
5. A family {p_j = ∑_n a_{n,j} t^n} of elements in K[[t]] is said to be a summable family if for each n ≥ 0 the number of j for which the coefficient of t^n in p_j is not zero is finite, i.e.,

#{j : a_{n,j} ≠ 0} < ∞.

In this case, we define the sum of this family to be the element p(t) = ∑_{n≥0} a_n t^n where a_n = ∑_j a_{n,j}.

Put p = ∑_n a_n t^n, q = ∑_n b_n t^n.
(a) Verify that the Cauchy product pq is indeed the sum of the family {a_n b_m t^{m+n}}.
(b) If {p_j} is a summable family then for any series q the family {p_j q} is also summable.
(c) Assume that b_0 = 0, i.e., ω(q) ≥ 1. Then show that the family {a_n q^n : n ≥ 0} is summable.
6. The sum of the above family of series in (c) is called the series obtained by substituting t = q in p, or the composition series, and is written p ◦ q. Continue to assume that b_0 = 0. Let p ◦ q(t) = ∑_n α_n t^n.
(a) Show that for each positive integer n, there exists a (universal) polynomial U_n(A_1, . . . , A_n, B_1, B_2, . . . , B_n) with the following properties:
(i) all coefficients are positive integers;
(ii) each U_n is linear in A_1, . . . , A_n, and in B_n, with the coefficient of B_n being A_1;
(iii) each U_n is weighted homogeneous of degree n + 1, where deg A_j = 1; deg B_j = j.
Moreover, the U_n have the property

α_n = U_n(a_1, . . . , a_n, b_1, b_2, . . . , b_n). (10)

Write down explicitly U_1, U_2, U_3.
(b) Show that (p_1 + p_2) ◦ q = p_1 ◦ q + p_2 ◦ q.
(c) (p_1 p_2) ◦ q = (p_1 ◦ q)(p_2 ◦ q).
(d) If r = ∑_n c_n t^n is such that c_0 = 0, then we have p ◦ (q ◦ r) = (p ◦ q) ◦ r.
(e) Consider the element I(t) = t ∈ K[[t]]. Show that it is a two-sided identity for the composition, i.e., p ◦ I = I ◦ p = p for all p ∈ K[[t]].
7. Show that if p is a polynomial then the composition p ◦ q makes sense for all q ∈ K[[t]], i.e., even without the assumption that ω(q) ≥ 1.
8. Let ′ denote the derived series. Show that
(a) p′ = 0 iff p is a constant;
(b) (p + q)′ = p′ + q′; (pq)′ = p′q + pq′;
(c) (p^n)′ = n p^{n−1} p′, for all integers n (here, if n is negative, you have to assume that p^n makes sense, which is guaranteed if p(0) ≠ 0);
(d) if {p_j} is a summable family of power series then (∑_j p_j)′ = ∑_j p_j′;
(e) Chain rule: (p ◦ q)′ = (p′ ◦ q) q′.
9. Inverse Function Theorem for Formal Power Series: Given an element p = ∑_{n≥0} a_n t^n ∈ K[[t]], show that there is a q ∈ K[[t]] such that q(0) = 0 and p ◦ q = I iff a_0 = 0 and a_1 ≠ 0. Show that such a q is unique. Further, in this case, show that also q ◦ p = I.
10. In the above exercise, if a_0 ≠ 0, we can still do something, viz., we consider r = p − a_0, and apply the above conclusion to r to get s such that s ◦ r = I = r ◦ s, r(0) = 0. From this we conclude that

p ◦ s(t) = r ◦ s(t) + a_0 = t + a_0.
11. Let us consider two of the most important series

E(t) = 1 + t + t^2/2! + · · · + t^n/n! + · · ·

L(t) = t − t^2/2 + t^3/3 − · · · + (−1)^{n−1} t^n/n ∓ · · ·

respectively called the exponential series and the logarithmic series.
(a) Verify that E(t + s) = E(t)E(s);
(b) E′(t) = E(t);
(c) Show that there is a unique F such that F(0) = 0, E ◦ F(t) = 1 + t; F ◦ (E − 1) = Id. (See Ex. 10.)
(d) Prove that F′(t) = ∑_{n=0}^∞ (−1)^n t^n and hence F(t) = L(t). Thus E ◦ L(t) = 1 + t. Also L ◦ (E − 1) = Id. For this reason, we write Ln(1 + t) := L(t). We then have E ◦ Ln(1 + t) = E ◦ L(t) = E ◦ F(t) = 1 + t. Since this is an identity, we can express this as E ◦ Ln = Id. Similarly, Ln ◦ E = L ◦ (E − 1) = Id.
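The degree-by-degree solvability in Exercise 9 can be sketched in code (our own illustration; `compose` and `comp_inverse` are made-up names): each new coefficient b_n of the inverse is forced linearly by a_1, exactly as in the universal-polynomial identity of Exercise 6. Exact rational arithmetic keeps the check honest.

```python
from fractions import Fraction

def compose(p, q, N):
    """First N coefficients of p(q(t)); assumes q[0] == 0."""
    out = [Fraction(0)] * N
    qk = [Fraction(1)] + [Fraction(0)] * (N - 1)   # current power q^k, k = 0
    for k in range(min(len(p), N)):
        for n in range(N):
            out[n] += Fraction(p[k]) * qk[n]
        # update qk to q^{k+1} by one truncated convolution with q
        qk = [sum(qk[j] * Fraction(q[n - j]) for j in range(n + 1) if n - j < len(q))
              for n in range(N)]
    return out

def comp_inverse(p, N):
    """Coefficients of q with p(q(t)) = t (needs p[0] = 0, p[1] != 0),
    solved degree by degree."""
    q = [Fraction(0)] * N
    q[1] = 1 / Fraction(p[1])
    for n in range(2, N):
        # with q[n] still 0, the t^n coefficient of p(q) misses only p[1]*q[n]
        c = compose(p, q, n + 1)[n]
        q[n] = -c / Fraction(p[1])
    return q

# p(t) = t + t^2: the inverse solves q + q^2 = t, so q = t - t^2 + 2t^3 - 5t^4 + ...
p = [0, 1, 1]
q = comp_inverse(p, 6)
```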
Exercises on Convergent Power Series
Throughout these exercises let p(t) = ∑_{n=0}^∞ a_n t^n, q(t) = ∑_{n=0}^∞ b_n t^n be two power series.

12. Let p, q both have radius of convergence ≥ r > 0.
(a) The radii of convergence of both p + q and pq are ≥ r. Moreover, for |z| < r, we have (p + q)(z) = p(z) + q(z) and (pq)(z) = p(z)q(z).
(b) Assume further that q(0) = b_0 = 0. Then the composite series p ◦ q has positive radius of convergence.
Solution: (i) Take any z such that |z| < r. Then ∑_n |a_n z^n|, ∑_n |b_n z^n| are convergent series of positive terms. Therefore ∑_n |a_n + b_n||z|^n and ∑_n (∑_j |a_j||b_{n−j}|)|z|^n are both convergent.
(ii) Let p ◦ q(t) = ∑_n c_n t^n. There are universal polynomials

U_n(A_0, . . . , A_n; B_1, B_2, . . . , B_n)

with positive integer coefficients such that

c_n = U_n(a_0, a_1, . . . , a_n; b_1, b_2, . . . , b_n).

Therefore

|c_n| ≤ U_n(|a_0|, . . . , |a_n|; |b_1|, . . . , |b_n|).

Thus if we put P = ∑_n |a_n| t^n, Q = ∑_n |b_n| t^n, and P ◦ Q(t) = ∑_n C_n t^n, then |c_n| ≤ C_n for all n. Therefore, the radius of convergence of p ◦ q is bigger than or equal to the radius of convergence of P ◦ Q. Therefore, without loss of generality, we may as well assume that a_n, b_n are non negative real numbers.

For 0 ≤ t < r we have q(t) = ∑_{n≥0} b_n t^n < ∞. Therefore α(t) = ∑_{n≥1} b_n t^n < ∞ and defines a continuous function in |t| < r. Therefore q(t) → 0 as t → 0, and we can find s > 0 such that q(s) < r. But then

p ◦ q(s) = ∑_n c_n s^n = p(q(s)) < ∞

by the rearrangement theorem for convergent series of positive terms.
13. Let pq = 1. If the radius of convergence of p is positive then so is the radius of convergence of q.
Solution: Without loss of generality, we may assume p(0) = 1. Put s = 1 − p. Then s(0) = 0 and we have

q = 1 + s + s^2 + · · · = T ◦ s,

where T = 1 + t + t^2 + · · · is the geometric series. Now appeal to the previous exercise.
14. Given α ≠ 0, β ≠ 0, and a positive integer n, show that there is a unique formal power series p such that p(0) = α and p^n = α^n + βt. Show that p is of positive radius of convergence.
Solution: First consider the case when α = 1 = β. Take P = E ◦ ((1/n) Ln(1 + t)). Then P(0) = 1 and P^n(t) = E ◦ Ln(1 + t) = 1 + t. In the general case, take p(t) = αP(βt/α^n). Since Ln has positive radius of convergence (= 1), it follows that Ln(1 + t)/n has positive radius of convergence (= 1). Since E has radius of convergence ∞ (use the Ratio Test), it follows that P has positive radius of convergence.
15. Given α ≠ 0, show that there is a unique power series p of positive radius of convergence such that

p^2 = α^2 + βt + γt^2; p(0) = α.

Solution: We may assume that γ ≠ 0 and then that γ = 1. Factorize the RHS into linear factors and apply the previous exercise.
16. Show that there is a unique power series which satisfies

p^2 − (α^2 + βt)p + γt = 0; p(0) = 0 (11)

and it has a positive radius of convergence.
Solution: By completing the square and replacing p + λt + δ, for some constants λ, δ, by p, this problem can be reduced to the earlier one.
17. For some positive numbers α, r, M, let

P(t) = αt − ∑_{n≥2} (M/r^n) t^n.

If Q is the compositional inverse of P, show that Q is of positive radius of convergence.
Solution:

P(t) = αt − (Mt^2/r^2)(1 + t/r + t^2/r^2 + · · ·).

Therefore

(1 − t/r) P(t) = αt (1 − t/r) − Mt^2/r^2.

If Q is the compositional inverse of P then we must have

(1 − Q/r) t = αQ (1 − Q/r) − MQ^2/r^2,

which can be rewritten in the form (11).
18. Let p(t) = ∑_n a_n t^n, q(t) = ∑_n b_n t^n, a_0 = 0 = b_0, a_1 ≠ 0 and p ◦ q = Id. Suppose P(t) = A_1 t − ∑_{n≥2} A_n t^n is such that A_1 = |a_1| and |a_n| < A_n for all n ≥ 2. Let Q = ∑_n B_n t^n be the compositional inverse of P with Q(0) = 0. Then show that |b_n| ≤ B_n for all n.
Solution: Recall that b_1 = 1/a_1 and

a_1 b_n + V_n(a_2, . . . , a_n; b_1, . . . , b_{n−1}) = 0.

Here the V_n are certain (universal) polynomials with non negative integer coefficients, linear in a_2, . . . , a_n. Therefore (B_1 = 1/|a_1| and)

A_1 B_n − V_n(A_2, . . . , A_n; B_1, . . . , B_{n−1}) = 0.

Now it follows by simple induction that |b_n| ≤ B_n for all n.
19. Inverse Function Theorem for Analytic Functions: Let p ◦ q = Id, where p(0) = 0 and p′(0) ≠ 0. If p is of positive radius of convergence then so is q.
Solution: Choose r > 0 so that p is convergent at r. Choose M > 0 so that |a_n| r^n < M for all n. Choose α = |a_1|. Then P as in Exercise 17 has positive radius of convergence and hence Q has positive radius of convergence. But Q majorizes q, by the previous exercise, and hence q is also of positive radius of convergence.
Lecture 10
The fact that a power series p of positive radius of convergence defines a function inside its disc of convergence via substitution is something that we cannot ignore any longer. Let us take up the study of such functions. The sequence of partial sums of p, each being a polynomial, defines a function on the whole of the complex plane. (If all the coefficients of p are real, we can view each of the partial sums as a real valued function defined on R.) However, the limit makes sense only inside the disc of convergence. More generally, we can talk about a sequence {f_n} of functions defined on some subset A ⊂ C such that at each point z ∈ A the sequence is convergent. We then get a function f : A → C as the limit function, viz.,

f(z) = lim_n f_n(z), z ∈ A.
Remember that this means for each ε > 0 there exists n0(z) such
that n ≥ n0 implies |fn(z)−f(z)| < ε. The number n0(z) may well vary
drastically as we vary the point z ∈ A. In order that the limit function
f retains some properties of the members of the sequence {fn} it is
anticipated that there must be some control over the possible n0(z).
This leads us to the notion of uniform convergence.
Definition 23 Let {fn} be a sequence of complex valued functions on
a set A. We say that it is uniformly convergent on A to a function f
if for every ε > 0 there exists n0, such that for all n ≥ n0, we have,
|fn(x)− f(x)| < ε, for all x ∈ A.
Remark 22 Clearly, uniform convergence implies pointwise convergence. The converse is easily seen to be false, by considering the sequence f_n(x) = 1/(1 + nx^2). However, it is fairly easy to see that the converse does hold if A is a finite set. Thus the interesting case of uniform convergence occurs only when A is an infinite set. The terminology is also adopted in an obvious way for series of functions via the associated sequences of partial sums. As in the case of ordinary convergence, we have Cauchy's criterion here also.
Theorem 31 A sequence of complex valued functions {f_n} is uniformly convergent iff it is uniformly Cauchy, i.e., given ε > 0, there exists n_0 such that for all n ≥ n_0, p ≥ 0, and for all x ∈ A, we have

|f_{n+p}(x) − f_n(x)| < ε.
Example 7 The mother of all convergent series is the geometric series

1 + z + z^2 + · · ·

The sequence of partial sums is given by

1 + z + · · · + z^{n−1} = (1 − z^n)/(1 − z).

For |z| < 1, upon taking the limit we obtain

1 + z + z^2 + · · · + z^n + · · · = 1/(1 − z). (12)

In fact, if we take 0 < r < 1, then in the disc B_r(0) the series is uniformly convergent. For, given ε > 0, choose n_0 such that r^{n_0} < ε(1 − r). Then for all |z| < r and n ≥ n_0, we have

|(1 − z^n)/(1 − z) − 1/(1 − z)| = |z^n/(1 − z)| ≤ r^n/(1 − r) ≤ r^{n_0}/(1 − r) < ε.
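The point of the estimate is that the bound r^n/(1 − r) does not depend on z; a small numerical check of this (our own sketch) samples points throughout the closed disc |z| ≤ r and compares the actual tail error with the bound.

```python
import cmath
import math

r, n = 0.9, 80
bound = r ** n / (1 - r)   # the uniform estimate r^n/(1 - r) on |z| <= r

# sample z = rho * e^{i*theta} throughout the closed disc |z| <= r
samples = [rho * cmath.exp(1j * 2 * math.pi * k / 12)
           for rho in (0.3, 0.6, 0.9) for k in range(12)]
worst = max(abs(z ** n / (1 - z)) for z in samples)   # actual worst tail error
```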
There is a pattern in what we saw in the above example. This is
extremely useful in determining uniform convergence:
Theorem 32 Weierstrass 6 M-test: Let ∑_n a_n be a convergent series of positive terms. Suppose there exist M > 0 and an integer N such that |f_n(x)| < M a_n for all n ≥ N and for all x ∈ A. Then ∑_n f_n is uniformly and absolutely convergent in A.

Proof: Given ε > 0, choose n_0 > N such that a_n + a_{n+1} + · · · + a_{n+p} < ε/M for all n ≥ n_0. This is possible by Cauchy's criterion, since ∑_n a_n is convergent. Then it follows that

|f_n(x)| + · · · + |f_{n+p}(x)| ≤ M(a_n + · · · + a_{n+p}) < ε,

for all n ≥ n_0 and for all x ∈ A. Again, by Cauchy's criterion, this means that ∑_n f_n is uniformly and absolutely convergent. ♠
Remark 23 The series ∑_n a_n in the above theorem is called a 'majorant' for the series ∑_n f_n. Here is an illustration of the importance of uniform convergence.
Theorem 33 Let {f_n} be a sequence of continuous functions defined and uniformly convergent on a subset A of R or C. Then the limit function f(x) = lim_{n→∞} f_n(x) is continuous on A.

Proof: Let x ∈ A be any point. In order to prove the continuity of f at x, given ε > 0 we should find δ > 0 such that for all y ∈ A with |y − x| < δ, we have |f(y) − f(x)| < ε. So, by the uniform convergence, first we get n_0 such that |f_{n_0}(y) − f(y)| < ε/3 for all y ∈ A. Since f_{n_0} is continuous at x, we also get δ > 0 such that for all y ∈ A with |y − x| < δ, we have |f_{n_0}(y) − f_{n_0}(x)| < ε/3. Now, using the triangle inequality, we get

|f(y) − f(x)| ≤ |f(y) − f_{n_0}(y)| + |f_{n_0}(y) − f_{n_0}(x)| + |f_{n_0}(x) − f(x)| < ε,

whenever y ∈ A is such that |y − x| < δ. ♠

6 Karl Weierstrass (1815-1897), a German mathematician, is well known for his perfect rigor. He clarified any remaining ambiguities in the notions of a function, of derivatives, of minimum, etc., prevalent in his time.
Exercise 10 Put f_n(z) = z^n/(1 − z^n). Determine the domain on which the sum ∑_n f_n(z) defines a continuous function.
Theorem 35 (Abel) Let ∑_{n≥0} a_n t^n be a power series of radius of convergence R > 0. Then the function defined by
f(z) = ∑_n a_n (z − z0)^n
is complex differentiable in B_R(z0). Moreover, the derivative of f is given by the derived series
f′(z) = ∑_{n≥1} n a_n (z − z0)^{n−1}
inside |z − z0| < R.
Proof: Without loss of generality, we may assume that z0 = 0. We already know that the derived series is convergent in B_R(0) and hence defines a continuous function g on it. We have to show that this function g is the derivative of f at each point of B_R(0). So, fix a point z ∈ B_R(0). Let |z| < r < R and let 0 ≠ |h| ≤ r − |z|, so that |z + h| ≤ r. Consider the difference quotient
(f(z + h) − f(z))/h − g(z) = ∑_{n≥1} u_n(h)   (13)
where we have put u_n(h) := a_n[(z + h)^n − z^n]/h − n a_n z^{n−1}. We must show that given ε > 0, there exists δ > 0 such that for all 0 < |h| < δ, we have
|(f(z + h) − f(z))/h − g(z)| < ε.   (14)
The idea here is that the sum of first few terms can be controlled
by continuity whereas the remainder term can be controlled by the
convergence of the derived series. Using the algebraic formula
(α^n − β^n)/(α − β) = ∑_{k=0}^{n−1} α^{n−1−k} β^k
with α = z + h, β = z, we get
u_n(h) = a_n[(z + h)^{n−1} + (z + h)^{n−2} z + · · · + (z + h) z^{n−2} + z^{n−1} − n z^{n−1}].   (15)
Since |z| < r and |z + h| ≤ r, it follows that
|u_n(h)| ≤ 2n |a_n| r^{n−1}.   (16)
Since the derived series has radius of convergence R > r, it follows that
we can find n0 such that
2 ∑_{n≥n0} n |a_n| r^{n−1} < ε/2.   (17)
On the other hand, again using (15), each u_n(h) is a polynomial in h which vanishes at h = 0. Therefore so does the finite sum ∑_{0<n<n0} u_n(h). Hence by continuity, there exists δ′ > 0 such that for |h| < δ′ we have
|∑_{0<n<n0} u_n(h)| < ε/2.   (18)
Taking δ = min{δ′, r − |z|} and combining (17) and (18) yields (14). ♠

The exponential function
The exponential function plays a central role in analysis, more so in
the case of complex analysis and is going to be our first example using
the power series method. We define
exp z := e^z := ∑_{n≥0} z^n/n! = 1 + z + z²/2! + z³/3! + · · · .   (19)
By the comparison test it follows that for any real number r > 0, the series exp(r) is convergent. Therefore, the radius of convergence of (19) is ∞. Hence, from theorem 35, exp is differentiable throughout C and its derivative is given by
exp′(z) = ∑_{n≥1} (n/n!) z^{n−1} = exp(z)   (20)
for all z. It may be worth recalling some elementary facts about the
exponential function that you probably know already. Let us denote
by
e := exp(1) = 1 + 1 + 1/2! + · · · + 1/n! + · · ·
Clearly, exp(0) = 1 and 2 < e. By comparing with the geometric series ∑_n 1/2^n, it can be shown easily that e < 3. Also we have
e = lim_{n→∞} (1 + 1/n)^n.   (21)
To see this, put t_n = ∑_{k=0}^n 1/k! and s_n = (1 + 1/n)^n, and use the binomial expansion to see that
lim sup_n s_n ≤ e ≤ lim inf_n s_n.
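The two approximations to e in (21) can be compared numerically (a sketch; the cut-offs n = 20 and n = 10^6 are arbitrary choices of ours):

```python
import math

t20 = sum(1.0 / math.factorial(k) for k in range(20))   # partial sum t_n at n = 20
s   = (1.0 + 1e-6) ** 1e6                               # s_n = (1 + 1/n)^n at n = 10^6

assert abs(t20 - math.e) < 1e-12   # the factorial series converges very fast
assert abs(s - math.e) < 1e-5      # s_n converges to e much more slowly
```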
Since ∑_{k=0}^n z̄^k/k! is the complex conjugate of ∑_{k=0}^n z^k/k!, by continuity of conjugation it follows that
exp z̄ = conj(exp z),   (22)
where conj denotes complex conjugation.
Formula (20) together with the property exp (0) = 1, tells us that exp
is a solution of the initial value problem:
f ′(z) = f(z); f(0) = 1. (23)
It can be easily seen that any analytic function which is a solution
of (23) has to be equal to exp . (Ex. Prove this.)
We can verify that
exp(a + b) = exp(a) exp(b), ∀ a, b ∈ C   (24)
directly by using the product formula for power series. (Use the binomial expansion of (a + b)^n.) This can also be proved by using the uniqueness of the solution of (23), which we shall leave to you as an entertaining exercise.
Thus, we have shown that exp defines a homomorphism from the additive group C to the multiplicative group C* := C \ {0}. As a simple consequence of this rule we have exp(nz) = exp(z)^n for all integers n. In particular, exp(n) = e^n. This is the justification for the notation
e^z := exp(z).
Combining (22) and (24), we obtain
|e^{ıy}|² = e^{ıy} · conj(e^{ıy}) = e^{ıy} e^{−ıy} = e^0 = 1.
Hence,
|e^{ıy}| = 1, y ∈ R.   (25)
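Both (24) and (25) are easy to test numerically (a quick sketch; the sample points a, b, y are arbitrary choices of ours):

```python
import cmath

a, b = 0.7 - 1.2j, -0.3 + 2.1j

# (24): exp is a homomorphism from (C, +) to (C*, ·)
assert abs(cmath.exp(a + b) - cmath.exp(a) * cmath.exp(b)) < 1e-12

# (25): e^{iy} lies on the unit circle for real y
y = 1.2345
assert abs(abs(cmath.exp(1j * y)) - 1.0) < 1e-14
```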
Example 8 (Trigonometric Functions) Recall the Taylor series
sin x = x − x³/3! + x⁵/5! − + · · · ;
cos x = 1 − x²/2! + x⁴/4! − + · · · ,
valid on the whole of R, since the radii of convergence of the two series are ∞. Motivated by this, we can define the complex trigonometric functions by
sin z = z − z³/3! + z⁵/5! − + · · · ;  cos z = 1 − z²/2! + z⁴/4! − + · · · .   (26)
Check that
sin z = (e^{ız} − e^{−ız})/2ı;  cos z = (e^{ız} + e^{−ız})/2.   (27)
It turns out that these complex trigonometric functions also have
differentiability properties similar to the real case, viz., (sin z)′ = cos z; (cos z)′ =
− sin z, etc.. Also, from (27) additive properties of sin and cos can be
derived.
Other trigonometric functions are defined in terms of sin and cos as usual. For example, tan z = sin z / cos z, and its domain of definition is the set of all points in C at which cos z ≠ 0.
In what follows, we shall obtain other properties of the exponential
function by the formula
eız = cos z + ı sin z. (28)
In particular,
ex+ıy = exeıy = ex(cos y + ı sin y). (29)
It follows that e^{2πı} = 1. Indeed, we shall prove that e^z = 1 iff z = 2nπı for some integer n. Observe that e^x > 0 for all x ∈ R and that if x > 0 then e^x > 1. Hence for all x < 0 we have e^x = 1/e^{−x} < 1. It follows that e^x = 1 iff x = 0. Now let z = x + ıy and e^z = 1. This means that e^x cos y = 1 and e^x sin y = 0. Since e^x ≠ 0 for any x, we must have sin y = 0. Hence y = mπ for some integer m. Therefore e^x cos mπ = 1. Since cos mπ = ±1 and e^x > 0 for all x ∈ R, it follows that cos mπ = 1 and e^x = 1. Therefore x = 0 and m = 2n, as desired.
Finally, let us prove:
exp(C) = C*.   (30)
Write 0 ≠ w = r(cos θ + ı sin θ), r ≠ 0. Since e^x is a monotonically increasing function with e^x → 0 as x → −∞ and e^x → ∞ as x → ∞, it follows from the Intermediate Value Theorem that there exists x such that e^x = r. (Here x is nothing but ln r.) Now take y = θ, z = x + ıθ and use (29) to verify that e^z = w. This is one place where we depend heavily on the intuitive properties of the angle and the corresponding properties of the real sin and cos functions. We remark that it is possible to avoid this by defining sin and cos by the formula (27) in terms of exp and deriving all these properties rigorously from the properties of exp alone.
Remark 25 One of the most beautiful equations:
eπı + 1 = 0 (31)
which relates in a simple arithmetic way, five of the most fundamental
numbers, made Euler7 believe in the existence of God!
Example 9 Let us study the mapping properties of the tan function. Since tan z = sin z / cos z, it follows that tan is defined and complex differentiable at all points where cos z ≠ 0. Also, tan(z + nπ) = tan z. In order to determine the range of this function, we take an arbitrary w ∈ C and try to solve the equation tan z = w for z. Putting X = e^{ız} temporarily, this equation reduces to (X² − 1)/(ı(X² + 1)) = w. Hence
X² = (1 + ıw)/(1 − ıw).
This latter equation makes sense iff w ≠ −ı, and then it has, in general, two solutions. The solutions are ≠ 0 iff w ≠ ı. Once we pick such a non-zero X, we can use the surjectivity of exp : C → C \ {0} to get a z such that e^{ız} = X. It then follows that tan z = w, as required. Therefore we have proved that the range of tan is equal to C \ {±ı}. From this analysis it also follows that tan z1 = tan z2 iff z1 = z2 + nπ.
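The computation in Example 9 can be retraced numerically (a sketch; the sample point w = 2 + 3ı of C \ {±ı} is our own choice):

```python
import cmath

w  = 2 + 3j                          # any w != ±i
X2 = (1 + 1j * w) / (1 - 1j * w)     # X^2 = (1+iw)/(1-iw), as derived above
X  = cmath.sqrt(X2)                  # pick one of the two square roots
z  = cmath.log(X) / 1j               # surjectivity of exp gives e^{iz} = X

assert abs(cmath.tan(z) - w) < 1e-9  # indeed tan z = w
```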
Likewise, the hyperbolic functions are defined by
sinh z = (e^z − e^{−z})/2;  cosh z = (e^z + e^{−z})/2.   (32)
7See E.T. Bell’s book for some juicy stories
It is easy to see that these functions are C-differentiable. Moreover,
all the usual identities which hold in the real case amongst these func-
tions also hold in the complex case and can be verified directly. One
can study the mapping properties of these functions as well, which have
wide range of applications.
Remark 26 Before we proceed onto another example, we would like
to draw your attention to some special properties of the exponential
and trigonometric functions. You are familiar with the real limit
lim_{x→∞} exp(x) = ∞.
However, such a result is not true when we replace the real x by a
complex z. In fact, given any complex number w 6= 0, we have seen
that there exists z such that exp (z) = w. But then exp (z +2nπı) = w
for all n. Hence we can get z′ having arbitrarily large modulus such
that exp(z′) = w. As a consequence, it follows that lim_{z→∞} exp(z) does not exist. Using the formula for sin and cos in terms of exp, it can easily be shown that sin and cos are both surjective mappings of C onto C. In particular, remember that they are not bounded, unlike their real counterparts.
Lecture 11
Summability Given a sequence {an} of complex numbers, a method
T first associates another sequence {tn} to it and then takes the limit
of {tn}. If this limit exists and is equal to L then we say {an} is T -
summable to the T-limit L and write
T lim_n a_n = L, or lim_n a_n = L (T).
Example 10 Series summation is such a summation method, in which t_n is just the partial sum s_n = a_1 + · · · + a_n. Another method is called (C, 1) summation (Cesàro-1), in which t_n = s_n/n = (a_1 + · · · + a_n)/n. Note that if the sequence {a_n} converges to L, then it is (C, 1)-summable to L. [Proof: lim_n a_n = L is the same as saying lim_n (a_n − L) = 0. Given ε > 0 there is N0 such that |a_n − L| < ε/2 for n ≥ N0. Also, the sequence {a_n − L} is bounded, so there is M > 0 such that |a_n − L| < M for all n. Therefore
|t_n − L| = |(a_1 + · · · + a_n)/n − L| ≤ ((N0 − 1)M + (n − N0 + 1)ε/2)/n ≤ (N0 − 1)M/n + ε/2 · · · ]
Another example is a_n = (−1)^n. Of course the sequence is not convergent, but it is (C, 1)-summable to 0. The (C, 1)-limit is a good representation of the average.
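A quick numerical illustration of the (C, 1) method on a_n = (−1)^n (a sketch; the cut-off 10000 is an arbitrary choice of ours):

```python
# Averages t_n = (a_1 + ... + a_n)/n of the divergent sequence a_n = (-1)^n.
partial, averages = 0, []
for n in range(1, 10001):
    partial += (-1) ** n
    averages.append(partial / n)

# {a_n} itself oscillates between -1 and 1, but its averages tend to 0.
assert abs(averages[-1]) < 1e-3   # t_10000
assert abs(averages[-2]) < 1e-3   # t_9999
```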
Example 11 More generally, given k ≥ 1, we define a sequence {a_n} to be (C, k)-summable to L if
t_n = [ ∑_{j=1}^n C(n + k − 1 − j, n − j) a_j ] / C(n + k − 1, n − 1) → L,
where C(m, r) denotes the binomial coefficient. It is not hard to check that if {a_n} is (C, k)-summable to L, then it is (C, k + 1)-summable to L. Also, there are sequences which are (C, k + 1)-summable but not (C, k)-summable. For instance, the sequence 1, −1, 2, −2, 3, −3, . . . is not (C, 1)-summable but is (C, 2)-summable. Similarly, the sequence 1, −2, 3, −4, 5, −6, . . . is not (C, 2)-summable but is (C, 3)-summable.
Example 12 (General Weighted Averages) Even more generally, given a sequence of positive real numbers P = {p_1, p_2, . . . , p_n, . . .}, we put P_n = ∑_{j=1}^n p_j, and we say that a sequence {a_n} is P-summable to L if the sequence
t_n = ( ∑_{j=1}^n a_j p_{n−j} ) / P_n
converges to L, in which case we write P lim a_n = L. Check that each (C, k) is indeed a P-method for some sequence P. Thus each Cesàro sum can be thought of as a combinatorial (binomial) average.
Definition 25 We say a summability method T is regular if whenever lim_n a_n = L then T lim_n a_n = L.
What we have seen above is that each (C, k) is regular. On the
other hand the series method is not regular.
Theorem 36 P is regular iff for each k,
lim_n p_{n−k}/P_n = 0.   (33)
Proof: Suppose P is regular. Take a_n = 0 for n ≠ k + 1 and a_{k+1} = 1 to see (33). Conversely, suppose (33) holds and let a_n → L. WLOG we may assume that L = 0. Given ε > 0, find N0 such that |a_n| < ε for n ≥ N0. Then for each k ≤ N0 find N_k such that |p_{n−k}/P_n| < ε/N0 for n ≥ N_k. Take N = max{N0, . . . , N_{N0}}. Then for n ≥ N we have |t_n| < ε(M + 1), where M is a bound for {|a_n|}. ♠
Remark 27 In this sense, series summation is not a regular summability method, whereas all Cesàro summabilities are.
Definition 26 Given a series ∑_n a_n with partial sums {s_n}, we say that ∑_n a_n is (C, 1)-summable to S if
lim_n s_n = S (C, 1),
and then we write
∑_n a_n = S (C, 1).
A sequence {a_n} is called square summable, or is said to be of class ℓ², if ∑_n a_n² < ∞. We can add two square summable sequences to get another such; indeed, square summable sequences form a vector space. {1/n} is in ℓ², whereas {√(1/n)} is not in ℓ².
Lecture 12
Definition 27 By a metric or a distance function on a set X we mean
a function d : X ×X → R such that
(a) d(x, y) ≥ 0 for all (x, y), and d(x, y) = 0 iff x = y;
(b) d(x, y) = d(y, x);
(c) d(x, y) ≤ d(x, z) + d(z, y).
A set X together with a chosen metric on it is called a metric space.
Example 13
1. The simplest and most important examples of metric spaces are the Euclidean spaces R^n with d(x, y) = √( ∑_{i=1}^n (x_i − y_i)² ). In the case n = 1 this also takes the form d(x, y) = |x − y|. So, we also use this notation in the general case.
2. A metric on X automatically restricts to a metric on any subset of
X and thus, it makes sense to talk about subspaces of metric spaces.
For instance, if we consider R^n × {0} ⊂ R^{n+1}, then the standard metric on R^n is seen to be the restriction of that on R^{n+1}.
3. For any set X consider the function
d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y.
Verify that this is a distance function. It is called the discrete metric.
4. On Rn define
dmax(x, y) = max{|x1 − y1|, . . . , |xn − yn|}
5. On Rn define
d_1(x, y) = ∑_{i=1}^n |x_i − y_i|.
6. On the set of square summable sequences of real numbers, define
d_2(x, y) = √( ∑_i (x_i − y_i)² )
7. On the set of bounded continuous real valued functions on an interval J, define
d_s(f, g) = sup{|f(x) − g(x)| : x ∈ J}.
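The metrics in items 1, 4 and 5 are easy to transcribe and test (a sketch; the random sample points and the helper names d2, dmax, d1 are our own):

```python
import math, random

def d2(x, y):   return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))   # item 1
def dmax(x, y): return max(abs(a - b) for a, b in zip(x, y))                # item 4
def d1(x, y):   return sum(abs(a - b) for a, b in zip(x, y))                # item 5

random.seed(1)
x, y, z = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))
for d in (d2, dmax, d1):
    assert d(x, x) == 0 and d(x, y) > 0           # (a)
    assert abs(d(x, y) - d(y, x)) < 1e-15         # (b) symmetry
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-15   # (c) triangle inequality
```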
Definition 28 Let (X, d) be a metric space, x ∈ X, δ > 0. We shall
denote
Bδ(x) := {y ∈ X : d(x, y) < δ}
and call it the open ball of radius δ and center x.
Exercise 11 Draw a picture of the unit ball in R^n in each of the various metrics that we have seen above.
Definition 29 Let (X, d) be a metric space.
1. By an open subset in X we mean a subset U ⊂ X which is the union
of some open balls in X.
2. A set Y is called a neighborhood (nbd) of x ∈ X if there is an open set U in X such that x ∈ U ⊂ Y.
3. A subset F in X is closed in X if X \ F is open in X.
4. A point x ∈ X is called a limit point of A ⊂ X if every nbd Y of x contains a point of A not equal to x.
5. If x ∈ A is not a limit point of A, then it is called an isolated point of A.
6. The set of all points a of a given set Y such that Y is a nbd of a is called the interior of Y.
7. A ⊂ X is called bounded if there exist M > 0 and p ∈ X such that A ⊂ B_M(p).
8. A ⊂ X is called dense in X if every point of X \ A is a limit point of A.
Theorem 37 Let {Uj} be a family of open sets in X. Then the union
U = ∪jUj is open. Also intersection of any two open sets is open.
Remark 28 The empty set and the whole set X are open.
Theorem 38 A set is closed iff it contains all its limit points.
Definition 30 The closure A of a set A is defined to be the union of
A with all its limit points.
Theorem 39 Let Y be a subspace of X. A subset A ⊂ Y is open in Y iff there exists an open set U in X such that A = U ∩ Y.
Definition 31 By a cover of A ⊂ X we mean a family {Uj} of sets
in X such that A ⊂ ∪jUj. It is called an open cover if every member
Uj is open. By a subcover we mean a cover {Vi} whose members are members of the cover {Uj}. A subset K of X is compact if every open cover of K
admits a finite subcover.
Theorem 40 Let Y be a subspace of X. Then K ⊂ Y is compact as a subset of Y iff K is compact as a subset of X.
Proof: Let K be compact in X and let {U_j} be any cover of K by open subsets of Y. Then there exist open sets V_j in X such that U_j = V_j ∩ Y. But then {V_j} is an open cover of K in X. Therefore there are finitely many, say V_{j_1}, . . . , V_{j_k}, such that K ⊂ ∪_{i=1}^k V_{j_i}. But then K ⊂ ∪_{i=1}^k U_{j_i}.
We leave the proof of the converse to you. ♠
Theorem 41 Every closed subset of a compact set is compact.
Proof: Easy.
Theorem 42 Every compact subset of a metric space is closed and
bounded.
Proof: Let K be a compact subset of (X, d). We shall prove that X \ K is open. Fix a point p ∈ X \ K. For each x ∈ K consider δ_x = (1/2) d(p, x). Then {B_{δ_x}(x)}_{x∈K} forms an open cover for K. Since K is compact, there exist x_1, . . . , x_k such that K ⊂ ∪_{i=1}^k B_{δ_{x_i}}(x_i). It follows easily that V = ∩_{i=1}^k B_{δ_{x_i}}(p) is an open set containing p with V ⊂ X \ K.
To show that K is bounded, fix any point x ∈ X and consider the family {B_δ(x) : δ > 0} of open sets, which actually cover the whole of X and hence K. A finite subcover then gives a single δ such that K ⊂ B_δ(x). ♠
Corollary 4 If F is closed and K is compact then F ∩K is compact.
Theorem 43 Let {K_j} be a collection of compact subsets of a metric space X such that the intersection of any finitely many members is non-empty. Then ∩_j K_j ≠ ∅.
Proof: Put U_j = X \ K_j. Then we know that each U_j is open. Now if ∩_j K_j = ∅, then it follows that X = ∪_j U_j. In particular, {U_j} is an open cover for K_1, which is compact. Therefore, there are finitely many j_1, . . . , j_k such that
K_1 ⊂ U_{j_1} ∪ · · · ∪ U_{j_k}.
This means K_1 ∩ K_{j_1} ∩ · · · ∩ K_{j_k} = ∅, a contradiction. ♠
Corollary 5 If {Kn} is a sequence of non empty compact sets in a
metric space, then ∩nKn 6= ∅.
Theorem 44 If A is an infinite subset of a compact subset K of a metric space, then A has a limit point in K.
Proof: If not, then every point x ∈ K has a nbd U_x such that U_x ∩ A ⊂ {x}. If {U_{x_1}, . . . , U_{x_k}} is a finite subcover of K, this will imply A ⊂ ∪_i U_{x_i}. Therefore A = ∪_i (U_{x_i} ∩ A) ⊂ {x_1, . . . , x_k}, which contradicts the infiniteness of A. ♠
We shall now examine the compactness property inside R^n.
Lemma 2 Let In = [an, bn] be a decreasing nested sequence of nonempty
closed intervals, i.e.,
I1 ⊃ · · · ⊃ In ⊃ In+1 ⊃ · · ·
then ∩nIn 6= ∅.
Proof: Put x = sup_n a_n. Claim: x ∈ I_n for all n. (Indeed, a_m ≤ b_n for all m and n, so a_n ≤ x ≤ b_n.) ♠
Lemma 3 If In is a decreasing sequence of closed cells in Rk, then
∩nIn 6= ∅.
Lecture 13
Theorem 45 Every closed cell in Rk is compact.
Proof: Use iterated bisection technique. ♠
Theorem 46 (Heine-Borel) A subset K of Rk is compact iff it is closed
and bounded.
Proof: We have to prove that if K is a closed and bounded subset of R^k, then it is compact. Since it is bounded, it is contained in a closed cell. Since it is then a closed subset of a compact set, it is compact. ♠
Theorem 47 A subset K of Rk is compact iff every infinite subset of
K has a limit point in K.
Proof: Again, we only have to prove the ‘if’ part. We shall prove that K is closed and bounded.
If K is not bounded, then for each n we can find x_n ∈ K such that |x_n| > n. The subset E = {x_n} has no limit points in R^k and hence none whatsoever in K. This is a contradiction.
Now suppose K is not closed. This means there is a limit point x of K which is not in K. We now construct an infinite sequence {x_n} in K which converges to x and hence has no limit point inside K. Having found x_n, put δ_n = |x − x_n|/2 and consider the open ball B_{δ_n}(x), which must have a point of K not equal to x; call this point x_{n+1}. ♠
Theorem 48 (Weierstrass) Every bounded infinite subset of Rk has a
limit point in Rk.
Proof: The closure of this set is compact. ♠
Theorem 49 (Bolzano-Weierstrass) Let A be a bounded subset of Rk.
Then every infinite sequence in A has a subsequence which is conver-
gent.
Proof: Look at the image set and consider the two cases according to
whether it is finite or infinite.
Theorem 50 Let f : X → Y be a function: TFAE:
(1) f is continuous.
(2) f−1(U) is open in X for every open set U in Y.
(3) f−1(F ) is closed in X for every closed set F in Y.
Theorem 51 Let f : X → Y be a continuous function of metric
spaces. If K is a compact subset of X, then f(K) is a compact subset
of Y.
Theorem 52 Every continuous real valued function on a compact set
attains its minimum and maximum.
Proof: The image is closed and bounded and hence has maximum and
minimum.
Exercise 12
(1) Let F be a closed subset of a metric space. Consider f(x) =
d(x, F ) = inf{d(x, y) : y ∈ F}. Show that f is continuous.
(2) Let f : X → Y be any function, x0 ∈ X. Prove that the following are equivalent:
(a) f is continuous at x0.
(b) For every sequence {xn} in X which converges to x0 the sequence
{f(xn)} converges to f(x0).
(3) Let f, g : X → R be any two continuous functions. Define Max{f, g} and Min{f, g} by the formulae:
Max{f, g}(x) = max{f(x), g(x)};  Min{f, g}(x) = min{f(x), g(x)}.
Show that Max{f, g} and Min{f, g} are both continuous.
Theorem 53 (Lebesgue Covering Lemma) Let {U_j} be an open covering of a compact metric space K. Then there exists a number δ > 0 such that any ball of radius δ with center in K is contained in some member of {U_j}.
Proof: By compactness of K, we may assume that the cover is finite, say U_1, . . . , U_n. Put F_j = K \ U_j, so that each F_j is a closed set. Now consider the function f_j : K → R given by f_j(x) = d(x, F_j). Check that it is continuous. Next put f = max{f_1, f_2, . . . , f_n}. Show that f is also continuous. Check that f(x) > 0 for x ∈ K. Now let δ = inf{f(x) : x ∈ K}. Then by the previous theorem δ is actually the minimum and hence is positive. Now let x ∈ K, and consider B_δ(x). If it is not contained in any of U_1, . . . , U_n, that would mean that the ball contains points from each of the F_j, which means that the distance of x from each F_j is strictly less than δ. That means that the maximum of these distances, viz. f(x), is less than δ, which is absurd. ♠
Definition 32 Let f : X → Y be a function from one metric space to
another metric space. We say f is uniformly continuous, if for every ε >
0 there exists a δ > 0 such that dX(x1, x2) < δ =⇒ dY (f(x1), f(x2)) <
ε.
Theorem 54 (Uniform Continuity) Every continuous real valued function on a compact space is uniformly continuous.
Proof: Given ε > 0, by continuity, for each x ∈ K there exists δ_x > 0 such that d_Y(f(x), f(y)) < ε/2 for all y ∈ B_{δ_x}(x). Since K is compact, by the Lebesgue Covering Lemma there exists δ > 0 such that any ball of radius δ is contained in some member of {B_{δ_x}(x)}. Now let a, b ∈ K be such that d(a, b) < δ. Choose x ∈ K such that a, b ∈ B_{δ_x}(x). Then it follows that d_Y(f(a), f(x)) < ε/2 and d_Y(f(b), f(x)) < ε/2, and therefore d_Y(f(a), f(b)) < ε. ♠
Example 14 f : [0, ∞) → [0, ∞) defined by f(x) = x² is not uniformly continuous.
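Numerically (a sketch; the fixed displacement h and the sample points are our own choices): for f(x) = x², the increment f(x + h) − f(x) = 2xh + h² grows without bound in x for any fixed h, so no single δ can serve a given ε:

```python
f = lambda x: x * x
h = 1e-3                                              # one fixed displacement
gaps = [f(x + h) - f(x) for x in (1.0, 10.0, 100.0, 1000.0)]

assert all(b > a for a, b in zip(gaps, gaps[1:]))     # the increments keep growing
assert gaps[-1] > 1.0                                 # and eventually exceed ε = 1
```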
Connectedness
Definition 33 Let X be a metric space. We say X is connected if the
only subsets A ⊂ X which are both open and closed in X are X and
∅.
Theorem 55 Let X be a metric space. Then the following are equiv-
alent:
(a) X is connected.
(b) if A ∪ B = X, both A and B are open, and A ≠ ∅ ≠ B, then A ∩ B ≠ ∅.
(c) if A ∪ B = X, both A and B are closed, and A ≠ ∅ ≠ B, then A ∩ B ≠ ∅.
(d) if ∅ ≠ A ⊂ X is both open and closed, then A = X.
Theorem 56 A subset of R is connected iff it is an interval.
Proof: Suppose A ⊂ R is not an interval. This means there exist x < z < y such that x, y ∈ A but z ∉ A. Put F = A ∩ (−∞, z), G = A ∩ (z, ∞). Then both F and G are open in A, nonempty, and their union is A, so A is not connected.
Conversely, let A be an interval in R with A = F ∪ G, x ∈ F, y ∈ G, and x < y. Assume that both F and G are closed in A; we shall show that F ∩ G ≠ ∅. Put w = sup(F ∩ [x, y]). Then w ∈ A, and since F is closed, w ∈ F. Clearly w ≤ y. Now for any z with w < z ≤ y, z ∉ F and hence z ∈ G. This means w is a limit point of G (and if w = y, then w ∈ G directly, since y ∈ G). Since G is closed, w ∈ G. ♠
Theorem 57 Let f : X → Y be a continuous function, A ⊂ X is
connected. Then f(A) is connected.
Theorem 58 (Intermediate Value Property) Let f : [a, b] → R be a
continuous function. Let f(a) < z < f(b). Then there exists a < c < b
such that f(c) = z.
Remark 29 IVP is equivalent to intervals being connected.
Assignment: Show that if a totally ordered set is connected, then it has the lub property, and similarly the glb property.
Example 15
(i) Every path is connected.
(ii) Every path connected space is connected. But the converse is not true.
(iii) R^n is connected.
(iv) Every cell in R^n is connected.
(v) The complement of a countable set in R^n, n ≥ 2, is connected.
(vi) The complement of a vector subspace of codimension ≥ 2 in R^n is connected.
(vii) Every convex subset is connected.
(viii) Spheres, ellipsoids etc. are connected, but not necessarily hyperboloids.
Lecture 14
Fundamental Theorem of Algebra
As promised before, we shall give an elementary proof of Funda-
mental Theorem of Algebra (FTA) in this section.
Theorem 59 Every non constant polynomial in one variable with co-
efficients in C has at least one root in C.
The proof uses only elementary Real Analysis which you have learnt
so far. All proofs of FTA use Intermediate Value Theorem (IVP) im-
plicitly or explicitly. We shall use it here explicitly. Apart from that,
the only important result that we use is Weierstrass’s theorem.
We begin with:
Lemma 4 For every polynomial function p : C → C, the function
|p| : C → R attains its infimum.
Proof: Given a polynomial p, we have to show that there exists z0 ∈ C such that |p(z0)| ≤ |p(z)| for all z ∈ C.
We know that p(z) → ∞ as z → ∞. (Exercise.) This means that there exists R > 0 such that |p(z)| > |p(0)| for all |z| > R.
It follows that
Inf {|p(z)| : z ∈ C} = Inf {|p(z)| : |z| ≤ R} ≤ |p(0)|.
But the disc {z : |z| ≤ R} is closed and bounded. Since the function
z 7→ |p(z)| is continuous, it attains its infimum on this disc. This
completes the proof of the lemma. ♠
Slowly but surely, an idea of the proof of FTA now emerges: observe that FTA is true iff the point z0 obtained in the above lemma, at which |p| attains its infimum, is a zero of p, i.e., p(z0) = 0. Therefore, in order to complete a proof of FTA, it is enough to assume that p(z0) ≠ 0 and arrive at a contradiction. (This idea is essentially due to Argand.)
Consider the polynomial q(z) = p(z + z0). Both polynomials p, q have the same value set, and hence the minimum of |q(z)| is equal to the minimum of |p(z)|, which is |p(z0)| = |q(0)|. We shall assume that q(0) ≠ 0 and arrive at a contradiction.
Write q(z) = q(0)φ(z), where
φ(z) = 1 + wz^k + z^{k+1} f(z)
with w ≠ 0 some complex number, k ≥ 1, and f(z) some polynomial. Observe that |q(0)| is the minimum of |q(z)| iff 1 is the minimum of |φ(z)|. It is thus enough to prove:
Lemma 5 (Argand’s Inequality) For any polynomial f, positive integer k, and any w ∈ C \ {0},
min{|1 + wz^k + z^{k+1} f(z)| : z ∈ C} < 1.   (34)
Choose r > 0 such that r^k = |w| (IVP) (see Exercise 1.5.13). Now replace z by z/r in (34). Thus, we may assume |w| = 1 in (34).
At this stage, Argand’s proof uses de Moivre’s theorem, viz., for every complex number α and every positive integer k, the equation z^k = α has a solution. For its simplicity, we present this proof of lemma 5 first:
Choose λ such that λ^k = −w^{−1}. Replace z by λz in (34) to reduce it to proving
min{|1 − z^k + z^{k+1} g(z)| : z ∈ C} < 1.   (35)
Now restrict z to positive real numbers, z = t > 0. Since g(t) is a polynomial, tg(t) → 0 as t → 0. So there exists 0 < t < 1 for which |tg(t)| < 1/2. But then
|1 − t^k + t^{k+1} g(t)| ≤ |1 − t^k| + t^k/2 = 1 − t^k + t^k/2 < 1,
thereby completing the proof of (34).
Why do we want to avoid using de Moivre’s Theorem? The answer
is that it depends heavily upon the intuitive concept of the angle which
needs to be established rigorously. (It should also be noted that during
Argand’s time, one could not expect a rigorous proof of lemma 4, which
Argand simply assumed.8)
Instead, we now follow an idea of Littlewood which is coded in the
following two lemmas:
Lemma 6 Given any complex number w of modulus 1, one of the four
numbers ±w,±ıw has its real part less than −1/2.
Proof: [This is seen easily as illustrated in Fig. 1. The four shaded regions, which cover the whole of the boundary circle, are obtained by rotating the region ℜ(z) < −1/2. However, it is important to note that the following proof is completely independent of the picture.] Since |w| = 1, either |ℜ(w)| or |ℑ(w)| has to be bigger than 1/2. In the former case, one of ±w will have the required property. In the latter case, one of ±ıw will do. ♠
Fig. 1
⁸For more learned comments, see R. Remmert’s article on ‘Fundamental Theorem of Algebra’ in [Ebb].
Lemma 7 For any integer n ≥ 1, the four equations
z^n = ±1; z^n = ±ı   (36)
all have solutions in C.
Proof: Write n = 2^k m, where m = 4l + 1 or 4l + 3. Since we can take successive square-roots, for k ≥ 0 let α_k, β_k, γ_k be such that
α_k^{2^k} = −1, β_k^{2^k} = ı, γ_k^{2^k} = −ı.
(For k = 0, this just means α_0 = −1, β_0 = ı, γ_0 = −ı.)
Now let us take the four equations one by one:
(a) For zn = 1, we can always take z = 1.
(b) For the equation z^n = −1, take z = α_k. Then (α_k)^n = (α_k^{2^k})^m = (−1)^m = −1.
(c) For the equation z^n = ı: take z = β_k if m = 4l + 1. Then (β_k)^n = ı^m = ı. If m = 4l + 3, then take z = γ_k, so that (γ_k)^n = (−ı)^m = (−ı)^3 = ı.
(d) This case follows easily from (b) and (c). Choose z1, z2 such that z1^n = −1 and z2^n = ı. Then (z1 z2)^n = −ı. ♠
[At this stage, the proof given in literature first establishes de Moivre’s
theorem and then follows the arguments given above. Here, we shall
directly derive Argand’s inequality.]
Returning to the proof of lemma 5, choose τ = ±1 or ±ı so that ℜ(τw) < −1/2 (Lemma 6). Choose α ∈ C such that α^k = τ (Lemma 7).
Now replace z by αz, so that we may assume that w = a + ıb, where a < −1/2 and a² + b² = 1.
Since f is continuous, it follows that tf(t) → 0 as t → 0. Restricting to just the real values of t, we can choose 0 < δ < 1 such that |tf(t)| < 1/3 for all 0 < t < δ. For such a choice of t, we have
|1 + wt^k + t^{k+1} f(t)| ≤ |1 + wt^k| + t^k/3 = [(1 + at^k)² + b² t^{2k}]^{1/2} + t^k/3.
We want to choose 0 < t < δ such that this quantity is less than 1. For a² + b² = 1 and 0 < t < 1 we have
[(1 + at^k)² + b² t^{2k}]^{1/2} + t^k/3 < 1
iff [(1 + at^k)² + b² t^{2k}]^{1/2} < 1 − t^k/3
iff (1 + at^k)² + b² t^{2k} < (1 − t^k/3)² = 1 − 2t^k/3 + t^{2k}/9
iff 1 + 2at^k + t^{2k} < 1 − 2t^k/3 + t^{2k}/9
iff (8/9) t^k < −(2a + 2/3).
This last condition can be fulfilled by choosing t such that t^k < 3/8, for then
(8/9) t^k < 1/3 < −(2a + 2/3).
Thus, for any 0 < t < δ such that t^k < 3/8 (IVP again), we have
|1 + wt^k + t^{k+1} f(t)| < 1.
This completes the proof of lemma 5 and thereby that of FTA. ♠
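The strategy of the whole proof, that the minimum of |p| is attained and equals 0, can be watched numerically (a sketch; the sample cubic p, the grid, and the Newton refinement are our own choices, not part of the argument):

```python
# Locate the minimizer of |p| for a sample cubic: a coarse grid search
# followed by Newton's iteration drives |p| to 0, i.e. the minimizer is a root.
p  = lambda z: z ** 3 + z + (1 + 1j)
dp = lambda z: 3 * z ** 2 + 1

grid = [complex(a, b) / 10 for a in range(-20, 21) for b in range(-20, 21)]
z = min(grid, key=lambda w: abs(p(w)))   # rough minimizer of |p| on [-2,2]^2

for _ in range(50):                      # refine: z <- z - p(z)/p'(z)
    z = z - p(z) / dp(z)

assert abs(p(z)) < 1e-9                  # min |p| = 0: z is a root, as FTA predicts
```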
Lecture 15
We have seen that a sequence of continuous functions which is uni-
formly convergent produces a limit function which is also continuous.
We shall strengthen this result now.
Theorem 60 Let f_n : X → R (or C) be a sequence of continuous functions, and let A ⊂ X be a subset on which {f_n} converges uniformly. Then {f_n} converges on the closure Ā of A to a function f which is continuous.
Proof: Let us fix a point x0 ∈ Ā. We must first of all show that the sequence {f_n(x0)} is convergent; it is enough to show that it is Cauchy. Given ε > 0, there exists n0 such that n, m > n0 implies
|f_n(x) − f_m(x)| < ε/3
for all x ∈ A. By continuity of f_n and f_m we can find δ > 0 such that d(x, x0) < δ implies
|f_m(x) − f_m(x0)| + |f_n(x) − f_n(x0)| < 2ε/3.
Now since x0 ∈ Ā, there exists x ∈ B_δ(x0) ∩ A. With the help of this x, we have
|f_n(x0) − f_m(x0)| ≤ |f_m(x) − f_m(x0)| + |f_n(x) − f_n(x0)| + |f_n(x) − f_m(x)| < ε.
Therefore, we have got a function f : Ā → R which is the limit of {f_n}, and the convergence is uniform on Ā.
We now want to show that f is continuous at x0. We have
|f(x) − f(x0)| ≤ |f(x) − f_n(x)| + |f_n(x) − f_n(x0)| + |f_n(x0) − f(x0)|.
Given ε > 0, we can choose N1 such that n > N1 implies
|f(x) − f_n(x)| + |f_n(x0) − f(x0)| < 2ε/3, for all x ∈ Ā.
Fix one such n. Then by continuity of f_n we can find δ > 0 such that d(x, x0) < δ implies |f_n(x) − f_n(x0)| < ε/3. Combining the three estimates gives |f(x) − f(x0)| < ε for all x ∈ Ā with d(x, x0) < δ, which proves the continuity of f at x0. ♠
Remark 30 What about differentiability under uniform convergence? We should be careful here, as illustrated by the example f_n(x) = x/(1 + nx²) on [0, 1]. This sequence converges uniformly to the function which is identically 0. However, the derived sequence f′_n(x) = (1 − nx²)/(1 + nx²)² converges to a function which is not even continuous. It is also true that a uniform limit of a sequence of smooth functions can be continuous but not differentiable, or differentiable but not continuously differentiable, and so on.
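The example in the remark can be checked numerically (a sketch; the sampling grid and the cut-off n = 10^6 are arbitrary choices of ours):

```python
import math

f  = lambda n, x: x / (1 + n * x * x)
df = lambda n, x: (1 - n * x * x) / (1 + n * x * x) ** 2

n = 10 ** 6
sup = max(abs(f(n, k / 1000)) for k in range(1001))   # sup of |f_n| on [0,1], sampled
assert sup <= 1 / (2 * math.sqrt(n)) + 1e-12          # uniform convergence to 0

assert df(n, 0.0) == 1.0         # f_n'(0) = 1 for every n ...
assert abs(df(n, 0.5)) < 1e-5    # ... while f_n'(x) -> 0 for x != 0: a jump at 0
```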
On the positive side, we shall now see that by controlling the limiting process of the derived sequence itself, we get better results:
Theorem 61 Let fn : [a, b] → R be a sequence of differentiable func-
tions such that f ′n converges uniformly in [a, b] to a function g. Also
suppose for some x0 ∈ [a, b], the sequence {fn(x0)} is convergent. Then
the sequence fn converges uniformly to a function f and f ′ = g =
limn→∞ f ′n.
Proof: First we want to show that fn is uniformly convergent and for
this it is enough to show that it is uniformly Cauchy, i.e., given ε > 0
we must find n0 such that n, m > n0 implies
|fn(x)− fm(x)| < ε, x ∈ [a, b] (37)
Using the hypothesis we get n1 such that n, m > n1 implies

|f′n(x) − f′m(x)| < ε/(2(b − a)), x ∈ [a, b]. (38)
Put φmn = fn − fm. Therefore by the Mean Value Theorem applied to
φmn, we have

|(φmn(x1) − φmn(x2))/(x1 − x2)| < ε/(2(b − a)), x1, x2 ∈ [a, b], m, n > n1. (39)

This is the same as

|fn(x1) − fm(x1) − fn(x2) + fm(x2)| < |x1 − x2| ε/(2(b − a)) ≤ ε/2. (40)
We now use the fact that fn(x0) is convergent and hence find n2
such that n, m > n2 implies
|fn(x0)− fm(x0)| < ε/2. (41)
Combining the above two inequalities we conclude that fn is uniformly
Cauchy,
|fn(x)− fm(x)| < ε, m, n > max{n1, n2} (42)
as required. Let now f(x) = limn→∞ fn(x). To show that f′ = g:
Fix x2 ∈ [a, b] and put hn(x1) = (fn(x1) − fn(x2))/(x1 − x2). Then (39) implies
that hn is uniformly Cauchy in [a, b] \ {x2} and hence converges to a
continuous function h(x1), which is nothing but

limn→∞ (fn(x1) − fn(x2))/(x1 − x2) = (f(x1) − f(x2))/(x1 − x2).
Therefore the limit function is continuous on the closure of [a, b] \ {x2},
which is [a, b]. We can now interchange taking the limit with respect
to n with the limit with respect to x, i.e.,

g(x1) = limn→∞ f′n(x1) = limn→∞ limx2→x1 (fn(x2) − fn(x1))/(x2 − x1)
      = limx2→x1 limn→∞ (fn(x2) − fn(x1))/(x2 − x1)
      = limx2→x1 (f(x2) − f(x1))/(x2 − x1) = f′(x1).
♠
Lecture 16 : Riemann-Stieltjes Integration
Throughout this section α will denote a monotonically increasing
function on an interval [a, b].
Let f be a bounded function on [a, b].
Let P = {a = a0 < a1 < · · · < an = b} be a partition of [a, b]. Put

∆αi = α(ai) − α(ai−1);
Mi = sup{f(x) : ai−1 ≤ x ≤ ai};   mi = inf{f(x) : ai−1 ≤ x ≤ ai};
U(P, f) = Σ_{i=1}^n Mi ∆αi;   L(P, f) = Σ_{i=1}^n mi ∆αi.

The upper and the lower integrals are defined by

upper∫_a^b f dα = inf{U(P, f) : P};   lower∫_a^b f dα = sup{L(P, f) : P}.
Definition 34 If upper∫_a^b f dα = lower∫_a^b f dα then we say f is Riemann-Stieltjes
integrable w.r.t. α and denote this common value by

∫_a^b f dα := ∫_a^b f(x) dα(x) := upper∫_a^b f dα = lower∫_a^b f dα.

Let R(α) denote the class of all R-S integrable functions on [a, b].
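As a numerical illustration of these definitions (a sketch of mine, valid as written only for monotone f, where the sup and inf on each subinterval are attained at the endpoints), take f(x) = x and the integrator α(x) = x on [0, 1]: the upper and lower sums squeeze the value 1/2.

```python
def upper_lower(f, alpha, a, b, n):
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    U = L = 0.0
    for i in range(1, n + 1):
        # f monotone: sup and inf on each subinterval sit at the endpoints
        Mi = max(f(pts[i - 1]), f(pts[i]))
        mi = min(f(pts[i - 1]), f(pts[i]))
        dalpha = alpha(pts[i]) - alpha(pts[i - 1])
        U += Mi * dalpha
        L += mi * dalpha
    return U, L

U, L = upper_lower(lambda x: x, lambda x: x, 0.0, 1.0, 1000)
print(U, L)   # U and L squeeze 1/2; here U - L = 1/1000
```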
Definition 35 A partition P′ of [a, b] is called a refinement of another
partition P of [a, b] if the points of P are all present in P′. We then write
P ≤ P′.
Lemma 8 If P ≤ P′ then L(P) ≤ L(P′) and U(P) ≥ U(P′).
Proof: It is enough to do this under the assumption that P′ has one extra
point compared to P.
Theorem 62 upper∫_a^b f dα ≥ lower∫_a^b f dα.

Proof: Let P and Q be any two partitions of [a, b]. By taking a common
refinement T = P ∪ Q and applying the above lemma, we get

U(P) ≥ U(T) ≥ L(T) ≥ L(Q).

Now varying Q over all possible partitions and taking the supremum,
we get

U(P) ≥ lower∫_a^b f dα.

Now varying P over all partitions of [a, b] and taking the infimum, we
get the theorem. ♠
Theorem 63 Let f be a bounded function and α be a monotonically in-
creasing function. Then the following are equivalent.
(i) f ∈ R(α).
(ii) Given ε > 0 there exists a partition P of [a, b] such that
U(P) − L(P) < ε.
(iii) Given ε > 0 there exists a partition P of [a, b] such that for all
refinements Q of P we have
U(Q) − L(Q) < ε.
(iv) Given ε > 0 there exists a partition P = {a0 < a1 < · · · < an} of [a, b]
such that for arbitrary points ti, si ∈ [ai−1, ai] we have

Σ_{i=1}^n |f(si) − f(ti)| ∆αi < ε.

(v) There exists a real number η such that for every ε > 0, there exists
a partition P = {a0 < a1 < · · · < an} of [a, b] such that for arbitrary
points ti ∈ [ai−1, ai], we have |Σ_{i=1}^n f(ti)∆αi − η| < ε.
Proof: (i) =⇒ (ii): By definition of the upper and lower integrals,
there exist partitions Q, T such that

U(Q) − upper∫_a^b f dα < ε/2;   lower∫_a^b f dα − L(T) < ε/2.

Take a common refinement P of Q, T and replace Q, T by P in the
above inequalities, then add the two inequalities and use the hy-
pothesis (i) to conclude (ii).
(ii) =⇒ (i): Since L(P) ≤ lower∫_a^b f dα ≤ upper∫_a^b f dα ≤ U(P), the conclusion
follows.
(ii) =⇒ (iii): This follows from Lemma 8, for if P′ ≥ P then
L(P) ≤ L(P′) ≤ U(P′) ≤ U(P).
(iii) =⇒ (ii): Obvious.
(iii) =⇒ (iv): Note that |f(si) − f(ti)| ≤ Mi − mi. Therefore,

Σ_i |f(si) − f(ti)| ∆αi ≤ Σ_i (Mi − mi)∆αi = U(P) − L(P) < ε.

(iv) =⇒ (iii): Choose points ti, si ∈ [ai−1, ai] such that

|mi − f(si)| < ε/(2n∆αi),   |Mi − f(ti)| < ε/(2n∆αi)

(whenever ∆αi ≠ 0). Then

U(P) − L(P) = Σ_i (Mi − mi)∆αi
            ≤ Σ_i [|Mi − f(ti)| + |f(ti) − f(si)| + |f(si) − mi|]∆αi < 2ε.
Thus so far, we have proved that (i) to (iv) are all equivalent to each
other.
(i) =⇒ (v): We first note that having proved that (i) to (iv) are all
equivalent, we can use any one of them. We take η = ∫_a^b f dα. Given
ε > 0 we choose a partition P such that |L(P) − η| < ε/3, and a
partition Q such that (iv) holds with ε replaced by ε/3. We then take a
common refinement T of these two partitions, for which again the same
would hold because of (iii). We now choose si ∈ [ai−1, ai] such that
|mi − f(si)| < ε/(3n∆αi) whenever ∆αi is non-zero. (If ∆αi = 0 we can take si
to be any point.) Then for arbitrary points ti ∈ [ai−1, ai], we have

|Σ_i f(ti)∆αi − η|
  = |Σ_i [(f(ti) − f(si)) + (f(si) − mi) + mi]∆αi − η|
  ≤ Σ_i |f(ti) − f(si)|∆αi + Σ_i |f(si) − mi|∆αi + |L(P) − η|
  ≤ ε/3 + ε/3 + ε/3 = ε.

(v) =⇒ (iv): Given ε > 0, choose a partition as in (v) with ε replaced
by ε/2. ♠

Lecture 17
Fundamental Properties of the Integral

Theorem 64 Let f be a bounded function and α be an increasing func-
tion on an interval [a, b].
(a) Linearity in f: This just means that if f, g ∈ R(α), λ, µ ∈ R then
λf + µg ∈ R(α). Moreover,

∫_a^b (λf + µg) dα = λ ∫_a^b f dα + µ ∫_a^b g dα.

(b) Semi-linearity in α: This just means that if f ∈ R(αj), j = 1, 2, and λj > 0,
then f ∈ R(λ1α1 + λ2α2) and moreover,

∫_a^b f d(λ1α1 + λ2α2) = λ1 ∫_a^b f dα1 + λ2 ∫_a^b f dα2.

(c) Let a < c < b. Then f ∈ R(α) on [a, b] if f ∈ R(α) on [a, c] as well
as on [c, b]. Moreover, we have

∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.
(d) If f1 ≤ f2 on [a, b] and fi ∈ R(α), then ∫_a^b f1 dα ≤ ∫_a^b f2 dα.
(e) If f ∈ R(α) and |f(x)| ≤ M, then

|∫_a^b f dα| ≤ M[α(b) − α(a)].

(f) If f is continuous on [a, b] then f ∈ R(α).
(g) If f : [a, b] → [c, d] is in R(α) and φ : [c, d] → R is continuous, then
φ ◦ f ∈ R(α).
(h) If f ∈ R(α) then f² ∈ R(α).
(i) If f, g ∈ R(α) then fg ∈ R(α).
(j) If f ∈ R(α) then |f| ∈ R(α) and

|∫_a^b f dα| ≤ ∫_a^b |f| dα.
Proof: (a) Put h = f + g. Given ε > 0, choose partitions P, Q of [a, b]
such that

U(P, f) − L(P, f) < ε/2,   U(Q, g) − L(Q, g) < ε/2,

and replace these partitions by their common refinement T and then
appeal to

L(T, f) + L(T, g) ≤ L(T, h) ≤ U(T, h) ≤ U(T, f) + U(T, g).

For a constant λ ≥ 0, since

U(P, λf) = λU(P, f);   L(P, λf) = λL(P, f)

(for λ < 0 the roles of U and L get interchanged), it follows that
∫_a^b λf dα = λ ∫_a^b f dα. Combining these two we get the
proof of (a).
(b) This is easier: in any partition P we have

∆(λ1α1 + λ2α2) = λ1∆α1 + λ2∆α2,

from which the conclusion follows.
(c) All that we do is to stick to those partitions of [a, b] which contain
the point c.
(d) This is easy, and
(e) is a consequence of (d).
(f) Given ε > 0, put ε1 = ε/(α(b) − α(a)). Then by uniform continuity of f,
there exists a δ > 0 such that |f(t) − f(s)| < ε1 whenever t, s ∈ [a, b]
and |t − s| < δ. Choose a partition P such that ∆xi < δ for all i. Then
it follows that Mi − mi ≤ ε1 and hence U(P) − L(P) ≤ ε.
(g) Given ε > 0, by uniform continuity of φ, we get ε > δ > 0 such that
|φ(t) − φ(s)| < ε for all t, s ∈ [c, d] with |t − s| < δ. There is a partition
P of [a, b] such that

U(P, f) − L(P, f) < δ².

The differences Mi − mi may behave in two different ways; accordingly,
let us define

A = {1 ≤ i ≤ n : Mi − mi < δ},   B = {1, 2, . . . , n} \ A.

Put h = φ ◦ f. It follows that

Mi(h) − mi(h) ≤ ε, i ∈ A.

Therefore we have

δ (Σ_{i∈B} ∆αi) ≤ Σ_{i∈B} (Mi − mi)∆αi ≤ U(P, f) − L(P, f) < δ².

Therefore we have Σ_{i∈B} ∆αi < δ. Now let K be a bound for |φ(t)| on
[c, d]. Then

U(P, h) − L(P, h) = Σ_i (Mi(h) − mi(h))∆αi
  = Σ_{i∈A} (Mi(h) − mi(h))∆αi + Σ_{i∈B} (Mi(h) − mi(h))∆αi
  ≤ ε(α(b) − α(a)) + 2Kδ < ε(α(b) − α(a) + 2K).

Since ε > 0 is arbitrary, we are done.
(h) Follows from (g) by taking φ(t) = t².
(i) Write fg = [(f + g)² − (f − g)²]/4.
(j) Take φ(t) = |t| and apply (g) to see that |f| ∈ R(α). Now choose λ = ±1
so that λ ∫_a^b f dα ≥ 0. Then

|∫_a^b f dα| = λ ∫_a^b f dα = ∫_a^b λf dα ≤ ∫_a^b |f| dα.

This completes the proof of the theorem. ♠
Theorem 65 Suppose f is monotonic and α is continuous and mono-
tonically increasing. Then f ∈ R(α).
Proof: Given ε > 0, by uniform continuity of α we can find a partition
P such that each ∆αi < ε.
Now if f is increasing, then we have Mi = f(ai), mi = f(ai−1).
Therefore,

U(P) − L(P) = Σ_i [f(ai) − f(ai−1)]∆αi < ε [f(b) − f(a)].

Since ε > 0 is arbitrary, we are done. ♠

Lecture 18
Theorem 66 Let f be a bounded function on [a, b] with finitely many
discontinuities. Suppose α is continuous at every point where f is dis-
continuous. Then f ∈ R(α).
Proof: Because of (c) of Theorem 64, it is enough to prove this for the
case when c ∈ [a, b] is the only discontinuity of f. Put K = sup |f(t)|.
Given ε > 0, we can find δ1 > 0 such that α(c + δ1) − α(c − δ1) < ε.
By uniform continuity of f on [a, b] \ (c − δ1, c + δ1) we can find δ2 > 0
such that |x − y| < δ2 implies |f(x) − f(y)| < ε. Given any partition
P of [a, b], choose a partition Q which contains the point c and whose
mesh is less than min{δ1, δ2}. It follows that U(Q) − L(Q) < ε(α(b) −
α(a)) + 2Kε. Since ε > 0 is arbitrary, this implies f ∈ R(α). ♠
Remark 31 The above result leads one to the following question:
keeping the continuity hypothesis on α, how large can the set of
discontinuities of a function f be such that f ∈ R(α)? The answer is not
available within R-S theory. Lebesgue had to invent a new, powerful theory which
not only answers this and several such questions raised by Riemann in-
tegration theory but also provides a sound foundation to the theory of
probability.
Example 16 We shall denote the unit step function at 0 by U, which
is defined as follows:

U(x) = 0 for x ≤ 0;   U(x) = 1 for x > 0.

By shifting the origin to other points we can get other unit step func-
tions. For example, suppose c ∈ [a, b]. Consider α(x) = U(x − c), x ∈
[a, b]. For any bounded function f : [a, b] → R, let us try to compute
∫_a^b f dα. Consider any partition P of [a, b] in which c = ak. The only
non-zero difference ∆α is the one over the interval [ak, ak+1], and it equals 1.
Therefore U(P) − L(P) = Mk(f) − mk(f), where Mk, mk are taken over [ak, ak+1].
Now assume that f is continuous at c. Then by choosing ak+1 close
to ak = c, we can make Mk − mk → 0. This means that f ∈ R(α).
Indeed, it follows that Mk → f(c) and mk → f(c). Therefore,

∫_a^b f dα = f(c).

Now suppose f has a discontinuity at c of the first kind, i.e., in
particular, f(c+) exists. It then follows that |Mk − mk| → |f(c) − f(c+)|.
Therefore, f ∈ R(α) iff f(c+) = f(c).
Thus, we see that it is possible to destroy integrability by just dis-
turbing the value of the function at one single point where α itself is
discontinuous.
In particular, take f = α. It follows that α ∉ R(α) on [a, b].
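Example 16 can be tested numerically. In the sketch below (names and the choice c = 1/π are mine), α is the unit step at c and we form left-tagged Riemann-Stieltjes sums; only the subinterval containing c contributes, and the sums tend to f(c).

```python
import math

c = 1 / math.pi                       # the step point; deliberately not a grid point
alpha = lambda x: 0.0 if x <= c else 1.0
f = math.cos

def rs_sum(n):
    pts = [i / n for i in range(n + 1)]
    # left-tagged R-S sum; only the subinterval straddling c contributes
    return sum(f(pts[i]) * (alpha(pts[i + 1]) - alpha(pts[i])) for i in range(n))

for n in [10, 100, 10000]:
    print(n, rs_sum(n), f(c))   # the sums tend to f(c)
```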
We shall now prove a partial converse to (c) of Theorem 64.
Theorem 67 Let f be a bounded function and α an increasing function
on [a, b]. Let c ∈ [a, b] be a point at which at least one of f or α is continuous. If
f ∈ R(α) on [a, b] then f ∈ R(α) on both [a, c] and [c, b]; moreover, in
that case,

∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.
Proof: Assume α is continuous at c. If Tc is the translation function
Tc(x) = x − c, then the functions g1 = U ◦ Tc and g2 = 1 − U ◦ Tc
are both in R(α), since they are discontinuous only at c. Therefore
fg1, fg2 ∈ R(α). But these respectively imply that f ∈ R(α) on [c, b]
and on [a, c].
We now consider the case when f is continuous at c. We shall prove
that f ∈ R(α) on [a, c], the proof that f ∈ R(α) on [c, b] being similar.
Recall that the set of discontinuities of a monotonic function is
countable. Therefore there exists a sequence of points cn in [a, c] (we
are assuming that a < c) at which α is continuous, such that cn → c. By the
earlier case, f ∈ R(α) on each of the intervals [a, cn]. We claim that the sequence

sn := ∫_a^{cn} f dα

converges to a limit which is equal to ∫_a^c f dα. Let K > 0 be a bound
for α. Given ε > 0 we can choose δ > 0 such that for x, y ∈ [c − δ, c +
δ], |f(x) − f(y)| < ε/2K. If n0 is big enough then n, m ≥ n0 implies
that |sn − sm| < ε. This means {sn} is Cauchy and hence is convergent
with limit equal to, say, s. Now choose n so that |s − sn| < ε.
Put ∆ = α(c) − α(c−). Since cn → c from the left, it follows that
α(cn) → α(c−). Choose n large enough so that

|α(cn) − α(c−)| < ε/L,

where L is a bound for f.
Now choose any partition Q of [a, cn] so that |U(Q, f) − sn| < ε.
This is possible because f ∈ R(α) on [a, cn]. Put P = Q ∪ {c}, M =
max{f(x) : x ∈ [cn, c]}. Then

|s + ∆f(c) − U(P, f)|
  ≤ |s − sn| + |sn − U(Q, f)| + |∆f(c) − (α(c) − α(cn))M|
  ≤ ε + ε + ∆|f(c) − M| + |(α(cn) − α(c−))M|
  ≤ 2ε + ∆ · ε/(2K) + |M| · ε/L ≤ 4ε.
Theorem 68 Let {cn} be a sequence of non-negative real numbers such
that Σ_n cn < ∞. Let tn ∈ (a, b) be a sequence of distinct points in
the open interval and let α = Σ_n cn U ◦ Ttn. Then for any continuous
function f on [a, b] we have

∫_a^b f dα = Σ_n cn f(tn).

Proof: Observe that for any x ∈ [a, b], 0 ≤ Σ_n cn U(x − tn) ≤ Σ_n cn and
hence α(x) makes sense. Also, clearly α is monotonically increasing,
α(a) = 0 and α(b) = Σ_n cn. Given ε > 0 choose n0 such that
Σ_{n>n0} cn < ε. Take

α1 = Σ_{n≤n0} cn U ◦ Ttn,   α2 = Σ_{n>n0} cn U ◦ Ttn.

By (b) of Theorem 64 and from the example above, we have

∫_a^b f dα1 = Σ_{n≤n0} cn f(tn).

If K is a bound for |f| on [a, b], we also have

|∫_a^b f dα2| ≤ K(α2(b) − α2(a)) = K Σ_{n>n0} cn < Kε.

Therefore,

|∫_a^b f dα − Σ_{n≤n0} cn f(tn)| < Kε.

This proves the claim. ♠
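Theorem 68 with finitely many steps can be checked directly. The sketch below (the coefficients and points are arbitrary choices of mine) compares a fine left-tagged R-S sum against Σ cn f(tn).

```python
import math

cs = [0.5, 0.25, 0.125]                       # c_n >= 0 with finite sum
ts = [math.sqrt(2) / 7, 0.52, math.pi / 4]    # distinct points in (0, 1)
alpha = lambda x: sum(c for c, t in zip(cs, ts) if x > t)
f = math.exp

n = 10000
pts = [i / n for i in range(n + 1)]
# left-tagged Riemann-Stieltjes sum for the pure step integrator alpha
rs = sum(f(pts[i]) * (alpha(pts[i + 1]) - alpha(pts[i])) for i in range(n))
exact = sum(c * f(t) for c, t in zip(cs, ts))
print(rs, exact)   # the two values agree to about 1/n
```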
Theorem 69 Let α be an increasing function and α′ ∈ R on [a, b].
Then for any bounded real function f on [a, b], f ∈ R(α) iff fα′ ∈ R.
Furthermore, in this case,

∫_a^b f dα = ∫_a^b f(x)α′(x) dx.

Proof: Given ε > 0, since α′ is Riemann integrable, by (iv) of Theorem
63, there exists a partition P = {a = a0 < a1 < · · · < an = b} of [a, b]
such that for all si, ti ∈ [ai−1, ai] we have

Σ_{i=1}^n |α′(si) − α′(ti)|∆xi < ε.

Apply the MVT to α to obtain ti ∈ [ai−1, ai] such that ∆αi = α′(ti)∆xi.
Put M = sup |f(x)|. Then

Σ_{i=1}^n f(si)∆αi = Σ_{i=1}^n f(si)α′(ti)∆xi.

Therefore,

|Σ_{i=1}^n f(si)∆αi − Σ_{i=1}^n f(si)α′(si)∆xi| ≤ Σ_i |f(si)||α′(ti) − α′(si)|∆xi < Mε.

Therefore

Σ_{i=1}^n f(si)∆αi ≤ Σ_{i=1}^n f(si)α′(si)∆xi + Mε ≤ U(P, fα′) + Mε.

Since this is true for arbitrary si ∈ [ai−1, ai], it follows that

U(P, f, α) ≤ U(P, fα′) + Mε.

Likewise, we also obtain

U(P, fα′) ≤ U(P, f, α) + Mε.

Thus

|U(P, f, α) − U(P, fα′)| ≤ Mε.

Exactly in the same manner, we also get

|L(P, f, α) − L(P, fα′)| ≤ Mε.

Note that the above two inequalities hold for refinements of P as well.
Now suppose f ∈ R(α). We can then assume that the partition P is
chosen so that

|U(P, f, α) − L(P, f, α)| < Mε.

It then follows that

|U(P, fα′) − L(P, fα′)| < 3Mε.

Since ε > 0 is arbitrary, this implies fα′ is Riemann integrable. The
other way implication is similar. Moreover, the above inequalities also
establish the last part of the theorem. ♠
Remark 32 The above theorems illustrate the power of Stieltjes' mod-
ification of Riemann's theory. In the first case, α was a staircase function
(also called a pure step function), and the integral therein reduces to
a finite or infinite sum. In the latter case, α is a differentiable func-
tion and the integral reduces to the ordinary Riemann integral. Thus
the R-S theory brings a unification of the discrete case with the
continuous case, so that we can treat both of them in one go. As an
illustrative example, consider a thin straight wire of finite length. The
moment of inertia about an axis perpendicular to the wire and through
an end point is given by

∫_0^l x² dm,

where m(x) denotes the mass of the segment [0, x] of the wire. If the
mass is given by a density function ρ, then m(x) = ∫_0^x ρ(t) dt, or equiva-
lently, dm = ρ(x)dx, and the moment of inertia takes the form

∫_0^l x² ρ(x) dx.

On the other hand, if the mass is made up of finitely many values mi
concentrated at points xi, then the moment of inertia takes the form

Σ_i xi² mi.
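Theorem 69 is easy to illustrate numerically: with α(x) = x² (so α′(x) = 2x) and f(x) = x on [0, 1], both the R-S sums and the Riemann sums of fα′ approach 2/3. A sketch (mine, using left tags):

```python
n = 10000
pts = [i / n for i in range(n + 1)]
# R-S sum of f(x) = x with respect to alpha(x) = x^2, left tags
rs = sum(pts[i] * (pts[i + 1] ** 2 - pts[i] ** 2) for i in range(n))
# Riemann sum of f(x) * alpha'(x) = x * 2x, left tags
riemann = sum(pts[i] * 2 * pts[i] * (1.0 / n) for i in range(n))
print(rs, riemann)   # both close to 2/3
```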
Theorem 70 (Change of Variable Formula) Let φ : [a, b] → [c, d] be
a strictly increasing differentiable function such that φ(a) = c, φ(b) = d.
Let α be an increasing function on [c, d] and f be a bounded function
on [c, d] such that f ∈ R(α). Put β = α ◦ φ, g = f ◦ φ. Then g ∈ R(β)
and we have

∫_a^b g dβ = ∫_c^d f dα.

Proof: Since φ is strictly increasing, it defines a one-one correspon-
dence of partitions of [a, b] with those of [c, d], given by

{a = a0 < a1 < · · · < an = b} ↔ {c = φ(a0) < φ(a1) < · · · < φ(an) = d}.

Under this correspondence, observe that the values of the two functions
f, g are the same and the values of the functions α, β are also the same.
Therefore the upper and lower sums are the same, and hence the
two upper and lower integrals are the same. The result follows. ♠
Lecture 19 : Functions of Bounded Variation
Definition 36 Let f : [a, b] → R be any function. For each partition
P = {a = a0 < a1 < · · · < an = b} of [a, b], consider the variation

V(P, f) = Σ_{k=1}^n |f(ak) − f(ak−1)|.

Let

Vf = Vf[a, b] = sup{V(P, f) : P is a partition of [a, b]}.

If Vf is finite we say f is of bounded variation on [a, b]. Then Vf is
called the total variation of f on [a, b]. Let us denote the space of all
functions of bounded variation on [a, b] by BV[a, b].
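The total variation is easy to approximate from the definition. In the sketch below (mine, not from the notes) we compute V(P, f) for f = sin on [0, 2π] over refining uniform partitions; the values increase toward the total variation 4 (sin rises by 1, falls by 2, rises by 1).

```python
import math

def variation(f, a, b, n):
    # V(P, f) over the uniform partition with n subintervals
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(abs(f(pts[i]) - f(pts[i - 1])) for i in range(1, n + 1))

for n in [3, 30, 3000]:
    print(n, variation(math.sin, 0.0, 2 * math.pi, n))   # increases toward 4
```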
Lemma 9 If Q is a refinement of P then V (Q, f) ≥ V (P, f).
Theorem 71 (a) f, g ∈ BV[a, b], α, β ∈ R =⇒ αf + βg ∈ BV[a, b].
Indeed, we also have Vαf+βg ≤ |α|Vf + |β|Vg.
(b) f ∈ BV[a, b] =⇒ f is bounded on [a, b].
(c) f, g ∈ BV[a, b] =⇒ fg ∈ BV[a, b]. Indeed, if |f| ≤ K, |g| ≤ L then
Vfg ≤ LVf + KVg.
(d) If f ∈ BV[a, b] and f is bounded away from 0, then 1/f ∈ BV[a, b].
(e) Given c ∈ [a, b], f ∈ BV[a, b] iff f ∈ BV[a, c] and f ∈ BV[c, b].
Moreover, we have

Vf[a, b] = Vf[a, c] + Vf[c, b].

(f) For any f ∈ BV[a, b] the function Vf : [a, b] → R defined by
Vf(x) = Vf[a, x] is an increasing function.
(g) For any f ∈ BV[a, b], the function Df = Vf − f is an increasing
function on [a, b].
(h) Every monotonic function f on [a, b] is of bounded variation on
[a, b].
(i) A function f : [a, b] → R is in BV[a, b] iff it is the difference of
two monotonic functions.
(j) If f is continuous on [a, b] and differentiable on (a, b) with the
derivative f′ bounded on (a, b), then f ∈ BV[a, b].
(k) f ∈ BV[a, b] is continuous at c ∈ [a, b] iff Vf : [a, b] → R is
continuous at c.
Proof: (a) Indeed, for every partition we have V(P, αf + βg) ≤ |α|V(P, f) +
|β|V(P, g). The result follows upon taking the supremum.
(b) Take M = Vf + |f(a)|. Then |f(x)| ≤ |f(x) − f(a)| + |f(a)| ≤
V(P, f) + |f(a)| ≤ M, where P is any partition in which a, x are consecu-
tive terms.
(c) For any two points x, y we have

|f(x)g(x) − f(y)g(y)| ≤ |f(x)||g(x) − g(y)| + |g(y)||f(x) − f(y)|,

and summing over a partition gives V(P, fg) ≤ KVg + LVf.
(d) Let 0 < m ≤ |f(x)| for all x ∈ [a, b]. Then

|1/f(x) − 1/f(y)| = |f(x) − f(y)|/|f(x)f(y)| ≤ |f(x) − f(y)|/m²,

so that V(1/f) ≤ Vf/m².
(e) Follows from the lemma above, by including the point c in any
partition.
(f) Follows from (e).
(g) Let a ≤ x < y ≤ b. Proving Vf[a, x] − f(x) ≤ Vf[a, y] − f(y) is the
same as proving Vf[a, x] + f(y) − f(x) ≤ Vf[a, y]. For any partition P
of [a, x], let P* = P ∪ {y}. Then

V(P, f) + f(y) − f(x) ≤ V(P, f) + |f(y) − f(x)| = V(P*, f) ≤ Vf[a, y].

Since this is true for all partitions P of [a, x], we are through.
(h) We may assume f is increasing. But then for every partition P we
have V(P, f) = f(b) − f(a) and hence Vf = f(b) − f(a).
(i) If f ∈ BV [a, b], from (f) and (g), we have f = Vf − (Vf − f) as a
difference of two increasing functions. The converse follows from (a)
and (h).
(j) This is because f then satisfies, by the Mean Value Theorem, the Lipschitz condition

|f(x) − f(y)| ≤ M|x − y| for all x, y ∈ [a, b],

where M is a bound for |f′|. Therefore for every partition P we have V(P, f) ≤ M(b − a).
(k) Observe that Vf is increasing and hence Vf(c±) exist. By (h) it
follows that the same is true for f. We shall show that f(c) = f(c±) iff
Vf(c) = Vf(c±), which would imply (k). So, assume that f(c) = f(c+).
Given ε > 0 we can find δ1 > 0 such that |f(x) − f(c)| < ε for all
c < x < c + δ1, x ∈ [a, b]. We can also choose a partition P = {c =
x0 < x1 < · · · < xn = b} of [c, b] such that

Vf[c, b] − ε < Σ_k |∆fk|.

Put δ = min{δ1, x1 − c}. Let now c < x < c + δ. Then

Vf(x) − Vf(c) = Vf[c, x] = Vf[c, b] − Vf[x, b]
  < ε + Σ_k |∆fk| − Vf[x, b]
  ≤ ε + |f(x) − f(c)| + |f(x1) − f(x)| + Σ_{k≥2} |∆fk| − Vf[x, b]
  ≤ ε + ε + Vf[x, b] − Vf[x, b] = 2ε.

This proves that Vf(c+) = Vf(c), as required.
Conversely, suppose Vf(c+) = Vf(c). Then given ε > 0 we can find
δ > 0 such that for all c < x < c + δ we have Vf(x) − Vf(c) < ε. But
then given x, y such that c < y < x < c + δ it follows that

|f(y) − f(c)| + |f(x) − f(y)| ≤ Vf[c, x] = Vf(x) − Vf(c) < ε,

which certainly implies that |f(x) − f(y)| ≤ ε. This completes the proof
that Vf(c+) = Vf(c) iff f(c+) = f(c). Similar arguments will prove that
Vf(c−) = Vf(c) iff f(c−) = f(c). ♠
Example 17 Not all continuous functions on a closed and bounded
interval are of bounded variation. A typical example is f : [0, π] → R
defined by

f(x) = x cos(1/x), x ≠ 0;   f(0) = 0.

For each n consider the partition

P = {0, π/(2n), π/(2n − 1), . . . , π}.

Then V(P, f) = π Σ_{k=1}^n 1/k. As n → ∞, we know this tends to ∞.
However, the function g(x) = xf(x) is of bounded variation. To
see this, observe that g is differentiable in [0, π] and the derivative is
bounded (though not continuous), and so we can apply (j) of the above
theorem.
Also note that even a partial converse to (j) is not true, i.e., a
differentiable function of bounded variation need not have its derivative
bounded. For example, h(x) = x^{1/3}, being an increasing function, is of
bounded variation on [0, 1], but its derivative is not bounded.
Remark 33 We are now going to extend the R-S integral to integra-
tors α which are not necessarily increasing functions. In this connection, it
should be noted that condition (v) of Theorem 63 becomes the strongest and
hence we adopt that as the definition.
Definition 37 Let f, α : [a, b] → R be any two functions. We say f is
R-S integrable with respect to α and write f ∈ R(α) if there exists
a real number η such that for every ε > 0 there exists a partition P of
[a, b] such that for every refinement Q = {a = x0 < x1 < · · · < xn = b} of P
and points ti ∈ [xi−1, xi] we have

|Σ_{i=1}^n f(ti)∆αi − η| < ε.

We then write η = ∫_a^b f dα and call it the R-S integral of f with respect
to α.
It should be noted that, in this general situation, several properties
listed in Theorem 64 may not be valid. However, property (b) of Theorem
64 is valid and indeed becomes better.
Lemma 10 For any two functions αj and real numbers λj, f ∈
R(αj), j = 1, 2, implies f ∈ R(λ1α1 + λ2α2). Moreover, in this case
we have

∫_a^b f d(λ1α1 + λ2α2) = λ1 ∫_a^b f dα1 + λ2 ∫_a^b f dα2.

Proof: This is so because for any fixed partition we have the linearity
property of ∆:

∆(λ1α1 + λ2α2)i = λ1(∆α1)i + λ2(∆α2)i,

and hence the same is true of the R-S sums. Therefore, if ηj = ∫_a^b f dαj,
then it follows that

λ1η1 + λ2η2 = ∫_a^b f d(λ1α1 + λ2α2).

♠
Theorem 72 Let α be a function of bounded variation and let V de-
note its total variation function V : [a, b] → R defined by V(x) =
Vα[a, x]. Let f be any bounded function. Then f ∈ R(α) iff f ∈ R(V)
and f ∈ R(V − α).
Proof: The 'if' part is easy because of (a). Also, we need only prove
that if f ∈ R(α) then f ∈ R(V). Given ε > 0, choose a partition Pε so
that for all refinements P of Pε, and for all choices of tk, sk ∈ [ak−1, ak],
we have

|Σ_{k=1}^n (f(tk) − f(sk))∆αk| < ε,   V(b) < Σ_k |∆αk| + ε.

We shall establish that

U(P, f, V) − L(P, f, V) < εK

for some constant K. By adding and subtracting, this task may be
broken up into establishing two inequalities:

Σ_k [Mk(f) − mk(f)][∆Vk − |∆αk|] < εK/2;   Σ_k [Mk(f) − mk(f)]|∆αk| < εK/2.

Now observe that ∆Vk − |∆αk| ≥ 0 for all k. Therefore if M is a bound
for |f|, then

Σ_k [Mk(f) − mk(f)][∆Vk − |∆αk|] ≤ 2M Σ_k (∆Vk − |∆αk|)
  = 2M(V(b) − Σ_k |∆αk|) < 2Mε.

To prove the second inequality, let us put

A = {k : ∆αk ≥ 0};   B = {1, 2, . . . , n} \ A.

For k ∈ A choose tk, sk ∈ [ak−1, ak] such that

f(tk) − f(sk) > Mk(f) − mk(f) − ε;

and for k ∈ B choose them so that

f(sk) − f(tk) > Mk(f) − mk(f) − ε.

We then have

Σ_k [Mk(f) − mk(f)]|∆αk|
  < Σ_{k∈A} (f(tk) − f(sk))|∆αk| + Σ_{k∈B} (f(sk) − f(tk))|∆αk| + ε Σ_k |∆αk|
  = Σ_k (f(tk) − f(sk))∆αk + ε Σ_k |∆αk| ≤ ε + εV(b) ≤ ε(1 + V(b)).

Putting K = 2 max{2M, 1 + V(b)}, we are done. ♠
Corollary 6 Let α : [a, b] → R be of bounded variation and f : [a, b] →
R be any function. If f ∈ R(α) on [a, b] then it is so on every subin-
terval [c, d] of [a, b].

Corollary 7 Let f : [a, b] → R be of bounded variation and α : [a, b] →
R be continuous and of bounded variation. Then f ∈ R(α).
Proof: By (k) of Theorem 71, we see that Vα and Vα − α are
both continuous and increasing. Hence, by a previous theorem, Vf
and Vf − f, being monotonic, are both integrable with respect to Vα and Vα − α.
Now we just use the additive property. ♠
Lect. 20
Let us now consider functions on open intervals which are finite or
infinite.
Definition 38 For f : (a, b) → R we say f is of bounded variation if
there exists M such that for every subinterval [c, d] ⊂ (a, b) we have
Vf [c, d] ≤ M. Also, in this case the total variation of f on (a, b) is
defined to be the supremum of all Vf [c, d] where [c, d] varies over all
subintervals of (a, b).
Remark 34 Look at the results in Theorem 71 one by one and see
whether you can replace the closed interval [a, b] there by an open
interval. We see no trouble whatsoever till (h). That one is obviously
not true as stated, since a monotonic function on an open interval need not
be bounded. Likewise we need to modify (i) as well. Indeed:
Theorem 73 Every bounded monotonic function on (a, b) is of bounded
variation. Every element of BV(a, b) is expressible as the difference of
two bounded increasing functions.
The last part of the above theorem follows because if f ∈ BV(a, b) then
Vf is bounded.
Exercise 13
1. Let f : [0, 1] → R be an increasing function. If f(x) ≠ x for any
x ∈ [0, 1] and f(0) > 0, then show that f(1) > 1.
2. Suppose x1 < x2 < · · · < xk are the roots of a polynomial function
f lying in [a, b]. What is Vf[a, b]?
3. Alternative proof of Theorem 71 (i): Let f ∈ BV[a, b]. For every
partition P of [a, b] define

A(P) = {k : ∆fk > 0};   B(P) = {k : ∆fk < 0}.
Define

pf[a, b] = sup{Σ_{k∈A(P)} ∆fk : P is a partition of [a, b]},
nf[a, b] = sup{Σ_{k∈B(P)} |∆fk| : P is a partition of [a, b]}.

Then pf and nf are called the positive and negative variations of
f on [a, b]. We define pf(x) = pf[a, x], nf(x) = nf[a, x], a < x ≤ b, and
pf(a) = 0, nf(a) = 0. Check that
(i) Vf(x) = pf(x) + nf(x).
(ii) 0 ≤ pf(x) ≤ Vf(x); 0 ≤ nf(x) ≤ Vf(x).
(iii) pf and nf are increasing on [a, b].
(iv) f(x) = f(a) + pf(x) − nf(x), x ∈ [a, b]. [This is part of
the statement of Theorem 71 (i).]
(v) 2pf(x) = Vf(x) + f(x) − f(a); 2nf(x) = Vf(x) − f(x) + f(a).
(vi) Every point of continuity of f is also a point of continuity of
pf and nf.
4. Absolute Continuity A function f : [a, b] → R is said to be
absolutely continuous if for every ε > 0 there exists a δ > 0 such
that for any finitely many disjoint subintervals (ak, bk) of [a, b]
with Σ_k (bk − ak) < δ, we have

Σ_{k=1}^n |f(bk) − f(ak)| < ε.

(i) Every absolutely continuous function is continuous.
(ii) Every absolutely continuous function is of bounded variation.
(iii) If f satisfies a uniform Lipschitz condition of order 1, i.e., if
there exists M such that |f(x) − f(y)| ≤ M|x − y| for all x, y ∈
[a, b], then f is abs. cont.
(iv) The set of abs. continuous functions on [a, b] forms a vector
space.
(v) If f is abs. continuous and bounded away from 0, then 1/f is
also abs. continuous.
(vi) If f is absolutely continuous then |f| is absolutely continuous.
Remark 35 There are continuous functions of bounded variation
which are not absolutely continuous. Find one of them.
5. Rectifiable Curves Let γ : [a, b] → R^n be a path, i.e., a contin-
uous map. For each partition P of [a, b] consider the sum

l(P, γ) = Σ_{k=1}^n ‖γ(ak) − γ(ak−1)‖.

Let

l(γ) = sup{l(P, γ) : P is a partition of [a, b]}.

If l(γ) is finite we say γ is a rectifiable path and call l(γ) the arc
length of γ.
(i) If γ = (γ1, . . . , γn), then γ is rectifiable iff each γi is of
bounded variation.
(ii) If γ′ is continuous on [a, b], then γ is rectifiable and we have

∫_a^b ‖γ′(t)‖ dt = l(γ).

(iii) The arc length is invariant under a change of parameterization,
i.e., if φ : [c, d] → [a, b] is an onto map with φ′(t) > 0 for all t, then
l(γ) = l(γ ◦ φ).

6. The graph of x sin(1/x) is non-rectifiable.
7. Consider the following function defined on [0, 1] by

f(t) = 0, t = 0;
f(t) = 2nt − 1, 1/(2n) ≤ t ≤ 1/(2n − 1), n ≥ 1;
f(t) = 1 − 2nt, 1/(2n + 1) ≤ t ≤ 1/(2n), n ≥ 1.

Show that the graph is non-rectifiable.
8. Let △ABC be an equilateral triangle in R². Start at the midpoint
M1 of AB, join it to the opposite vertex C and trace the line seg-
ment M1C up to the midpoint M2 of CM1. Extend BM2 to meet
the side AC at N2. Let M3 be the midpoint of CN2. Trace this
segment from M2 to M3. Repeat this process infinitely. Observe
that the sequence of points Mj converges to the midpoint M0
of BC. Show that this process defines a non-rectifiable continuous
path.
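The arc-length formula of exercise 5(ii) can be sanity-checked numerically on the circle γ(t) = (cos t, sin t), t ∈ [0, 2π]: the polygonal lengths l(P, γ) increase to 2π. A sketch (mine):

```python
import math

def poly_length(n):
    # l(P, gamma) over the uniform partition of [0, 2*pi] with n pieces
    pts = [2 * math.pi * i / n for i in range(n + 1)]
    return sum(math.hypot(math.cos(pts[i]) - math.cos(pts[i - 1]),
                          math.sin(pts[i]) - math.sin(pts[i - 1]))
               for i in range(1, n + 1))

for n in [6, 60, 6000]:
    print(n, poly_length(n))   # increases toward 2*pi
```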
Lecture 22
Example 18 :
1. Consider the double sequence

s_{m,n} = m/(m + n), m, n ≥ 1.

Compute the two iterated limits

lim_m lim_n s_{m,n},   lim_n lim_m s_{m,n}

and record your results.
2. Let fn(x) = x²/(1 + x²)^n, x ∈ R, n ≥ 1, and put f(x) = Σ_n fn(x).
Check that fn is continuous. Compute f and see that f is not
continuous.
3. Define gm(x) = lim_{n→∞} (cos m!πx)^{2n} and put g(x) = lim_{m→∞} gm(x).
Compute g and see that g is discontinuous everywhere. Directly
check that it is not Riemann integrable.
4. Consider the sequence hn(x) = sin(nx)/√n and put h(x) = lim_n hn(x).
Check that h ≡ 0. On the other hand, compute lim_n h′n(x).
5. Put λn(x) = n²x(1 − x²)^n. Compute lim_n λn(x). On the other
hand, check that

∫_0^1 λn(x) dx = n²/(2n + 2) → ∞.

Therefore we have

∞ = lim_n [∫_0^1 λn(x) dx] ≠ ∫_0^1 [lim_n λn(x)] dx = 0.
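For item 1 above, a quick numerical experiment (a sketch of mine) already reveals the answer: the two iterated limits of s_{m,n} = m/(m + n) are different.

```python
def s(m, n):
    return m / (m + n)

big = 10 ** 8
# inner limit in n first: s(m, n) -> 0 for each fixed m
print(s(5, big))
# inner limit in m first: s(m, n) -> 1 for each fixed n
print(s(big, 5))
```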
We know that if a sequence of continuous functions converges uni-
formly to a function, then the limit function is continuous. We can
now ask for the converse: Suppose a sequence of continuous functions
fn converges pointwise to a function f which is also continuous. Is
the convergence uniform? The answer in general is NO. But there is a
situation when we can say yes as well.
Theorem 74 Let X be a compact metric space and fn : X → R be a
sequence of continuous functions converging pointwise to a continuous function f.
Suppose further that the sequence fn is monotone. Then fn → f uniformly on
X.
Proof: That the sequence fn is monotone means that for each x ∈ X we have

· · · ≤ fn(x) ≤ fn+1(x) ≤ · · ·

or the other way round, where all inequalities are reversed. It is enough to
consider one of these cases; put gn(x) = f(x) − fn(x) and assume that
gn is a sequence of non-negative continuous functions monotonically decreasing
to the function 0. Given ε > 0 we want to find n0 such that gn(x) < ε for
all n ≥ n0 and for all x ∈ X. Put

Kn = {x ∈ X : gn(x) ≥ ε}.

Then each Kn is a closed subset of X. Also, since gn(x) ≥ gn+1(x), it follows
that Kn+1 ⊂ Kn. On the other hand, since gn(x) → 0, it follows that
∩n Kn = ∅. Since this is happening in a compact space X, we conclude
that Kn0 = ∅ for some n0, and then Kn = ∅ for all n ≥ n0. ♠

Remark 36 The compactness is crucial, as illustrated by the example

fn(x) = 1/(nx + 1), 0 < x < 1.
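The example in Remark 36 can be probed numerically (a sketch of mine): each fn(x) = 1/(nx + 1) tends to 0 at every fixed x ∈ (0, 1), yet evaluating at the moving point x = 1/(10n) shows the sup over (0, 1) never drops below 10/11, so the convergence is not uniform.

```python
def f(n, x):
    return 1 / (n * x + 1)

for n in [10, 1000, 100000]:
    # at a fixed point the values die out, but at the moving point
    # x = 1/(10n) the value is always 10/11, so sup over (0,1) stays near 1
    print(n, f(n, 0.5), f(n, 1 / (10 * n)))
```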
Uniform Convergence and Integration
Theorem 75 Let α be an increasing function on [a, b] and let fn ∈ R(α), n ≥ 1,
on [a, b]. Suppose fn converges uniformly to f on [a, b]. Then f ∈ R(α)
and we have

lim_n ∫_a^b fn dα = ∫_a^b f dα.
Proof: Put εn = sup{|fn(x) − f(x)| : a ≤ x ≤ b}. The uniform
convergence implies that lim_n εn = 0.
We have for each n

fn − εn ≤ f ≤ fn + εn.

Therefore,

∫_a^b (fn − εn) dα ≤ lower∫_a^b f dα ≤ upper∫_a^b f dα ≤ ∫_a^b (fn + εn) dα.

Therefore

0 ≤ upper∫_a^b f dα − lower∫_a^b f dα ≤ 2εn[α(b) − α(a)].

Now we can take the limit as n → ∞ and apply the Sandwich theorem to
conclude that f ∈ R(α). Going back two steps, this now gives

∫_a^b (fn − εn) dα ≤ ∫_a^b f dα ≤ ∫_a^b (fn + εn) dα

and hence

−εn[α(b) − α(a)] ≤ ∫_a^b f dα − ∫_a^b fn dα ≤ εn[α(b) − α(a)].

Hence we can take the limit once again. ♠
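Theorem 75 (with α(x) = x, i.e., ordinary Riemann integration) can be illustrated as follows. The functions fn(x) = sin(nx)/n converge uniformly to 0 on [0, 1], and their integrals, approximated here by a midpoint rule (an implementation choice of mine), tend to 0 as the theorem predicts.

```python
import math

def integral(f, a, b, m=20000):
    # composite midpoint rule; accurate enough for this illustration
    h = (b - a) / m
    return sum(f(a + (i + 0.5) * h) for i in range(m)) * h

for n in [1, 10, 1000]:
    # f_n(x) = sin(nx)/n has sup|f_n| = 1/n -> 0, so convergence is uniform
    print(n, integral(lambda x: math.sin(n * x) / n, 0.0, 1.0))
```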
Example 19 A continuous function which is nowhere differen-
tiable Put
φ(x) = |x|, − 1 ≤ x ≤ 1
and extend this function all over R by periodicity:
φ(x + 2) = φ(x).
103
This function is continuous on R and not differentiable at any integer
value of x.
Let φn(x) = φ(4^n x). Then each φn has properties similar to those of φ, but the period has decreased and the number of points at which it is not differentiable has increased, viz., all those rational numbers q such that 4^n q ∈ Z. We now take
\[ f(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \varphi_n(x). \]
Observe that |φn(x)| ≤ 1 for all n, and hence the above series is uniformly convergent and hence defines a continuous function on R. It is also clear that the function is not differentiable at any dyadic rational number. But there is a bonus: it is not differentiable anywhere.
Let x ∈ R. For each integer m consider 4^m x. Then one of the intervals (4^m x, 4^m x + 1/2), (4^m x − 1/2, 4^m x) will not contain any integer. Choose one such and accordingly define δm = ±(1/2)4^{−m} so that there is no integer strictly between 4^m x and 4^m(x + δm).
Now if n > m then 4^n δm is an even integer and hence φn(x + δm) − φn(x) = 0. Also, for 0 ≤ n ≤ m we have |φn(x + δm) − φn(x)| ≤ |4^n δm|, with equality when n = m. Therefore
\[ \left| \frac{f(x + \delta_m) - f(x)}{\delta_m} \right| = \left| \sum_{n=0}^{m} \left(\frac{3}{4}\right)^n \frac{\varphi_n(x + \delta_m) - \varphi_n(x)}{\delta_m} \right| \ge 3^m - \sum_{n=0}^{m-1} 3^n = 3^m - \frac{3^m - 1}{2} = \frac{3^m + 1}{2}. \]
Therefore upon taking the limit as m →∞, we see that f ′(x) does not
exist.
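The blow-up of the difference quotients can be watched numerically. The sketch below (my own illustration; variable names are mine) truncates the series at a level beyond which the omitted terms cancel exactly in the difference, chooses δm = ±(1/2)4^{−m} as in the argument above, and checks the lower bound (3^m + 1)/2:

```python
import math

def phi(t):
    # phi(t) = |t| on [-1, 1], extended with period 2 over R
    t = t - 2.0 * math.floor((t + 1.0) / 2.0)   # reduce to [-1, 1)
    return abs(t)

def quotient(x, m, terms=12):
    # difference quotient of the truncated series sum_{n<=terms} (3/4)^n phi(4^n x);
    # for m < n <= terms the two phi-values cancel, since 4^n*delta is an even integer
    y = 4.0 ** m * x
    frac = y - math.floor(y)
    # pick the side with no integer strictly between 4^m x and 4^m (x + delta)
    delta = (0.5 if frac < 0.5 else -0.5) / 4.0 ** m
    s = 0.0
    for n in range(terms + 1):
        s += (3.0 / 4.0) ** n * (phi(4.0 ** n * (x + delta)) - phi(4.0 ** n * x))
    return s / delta

x = 0.7317   # arbitrary sample point
for m in range(1, 6):
    q = quotient(x, m)
    # lower bound (3^m + 1)/2 from Example 19 (small slack for float rounding)
    assert abs(q) >= (3 ** m + 1) / 2 - 1e-3
```

Since the bound (3^m + 1)/2 → ∞, no finite difference quotient limit can exist at x.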
Lecture 23
Uniform metric
Let X be any set and B(X) be the set of all real (or complex) valued
functions on X which are bounded. Then for each f ∈ B(X),
‖f‖ = sup{|f(x)| : x ∈ X} < ∞
and is called the norm of f. One easily checks that
(a) f ≡ 0 iff ‖f‖ = 0.
(b) ‖αf‖ = |α| ‖f‖, α ∈ R (or C).
(c) ‖f + g‖ ≤ ‖f‖ + ‖g‖.
Therefore, if we define d(f, g) = ‖f − g‖, then d becomes a metric on B(X), which is called the uniform metric. (The norm above is called the sup norm.) Note that if X is a compact metric space then any continuous real valued function on X is bounded. In particular, C[a, b] ⊂ B[a, b].
Theorem 76 A sequence {fn} in B(X) is convergent with respect to the uniform metric iff it is uniformly convergent on X as a sequence of functions.
Theorem 77 B(X) is a complete metric space.
Remark 37 Indeed, it follows from the preceding theorems that if K is a compact subset of Rn, then the space C(K) of continuous functions is a closed subset of B(K).
Theorem 78 Weierstrass The set of all polynomial functions on [a, b]
is dense in C[a, b].
Proof: Given a continuous function f : [a, b] → R and ε > 0 we must
find a polynomial P such that
|f(x)− P (x)| < ε, a ≤ x ≤ b.
Step 1 Enough to prove this for the case [a, b] = [0, 1].
Put g(t) = f(a + (b − a)t), 0 ≤ t ≤ 1, get a polynomial Q such that
\[ |g(t) - Q(t)| < \varepsilon, \quad 0 \le t \le 1, \]
and put P(x) = Q\left(\frac{x - a}{b - a}\right).
Step 2 Bernstein’s Polynomials. For n ≥ 1, and 0 ≤ x ≤ 1, define
\[ B_n(x) := B^f_n(x) := \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} f(k/n). \]
We have
(I) If f(x) ≡ 1 then B^f_n(x) = 1.
(II) If f(x) = x then B^f_n(x) = x.
(III) If f(x) = x² then B^f_n(x) = x²(1 − 1/n) + x/n.
(IV) \[ \sum_{k=0}^{n} \left(\frac{k}{n} - x\right)^2 \binom{n}{k} x^k (1 - x)^{n-k} = \frac{x(1-x)}{n}. \]
[Proof: I is obvious. For II and III consider the binomial expansion
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \]
Differentiate this w.r.t. x and multiply by x/n to obtain
\[ x(x + y)^{n-1} = \sum_{k=0}^{n} \frac{k}{n} \binom{n}{k} x^k y^{n-k}. \]
If you put y = 1 − x now you get II. Differentiate this again with respect to x, multiply by x/n and substitute y = 1 − x to obtain III. Finally, (IV) is verified by expanding out and using I, II, III.]
Step 3 We shall now prove
Lemma 11 Given any continuous function f : [0, 1] → R, the sequence B^f_n of Bernstein polynomials converges uniformly to f on [0, 1].
Given ε > 0, choose (by uniform continuity) δ > 0 such that
\[ |f(x) - f(y)| < \varepsilon/2, \quad \text{for } |x - y| < \delta,\; x, y \in [0, 1]. \]
Now for any x ∈ [0, 1], by (I) above we have
\[ f(x) - B_n(x) = f(x)\sum_{k=0}^{n}\binom{n}{k}x^k(1-x)^{n-k} - \sum_{k=0}^{n} f(k/n)\binom{n}{k}x^k(1-x)^{n-k} = \sum_{k=0}^{n}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} = \sum_{k\in A} + \sum_{k\in B}, \]
where A = {k : |f(x) − f(k/n)| < ε/2} and B = {0, 1, . . . , n} \ A. Note that A and B depend on x. In any case, we have
\[ \left| \sum_{k\in A}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} \right| < \frac{\varepsilon}{2}\sum_{k=0}^{n}\binom{n}{k}x^k(1-x)^{n-k} = \frac{\varepsilon}{2}. \]
It is the second sum on the right that needs more careful handling. For k ∈ B we have |f(x) − f(k/n)| ≥ ε/2 and therefore |x − k/n| ≥ δ. This means (k − nx)² ≥ n²δ². Therefore
\[ \left| \sum_{k\in B}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} \right| \le 2\|f\|\sum_{k\in B}\binom{n}{k}x^k(1-x)^{n-k}\,\frac{(k-nx)^2}{n^2\delta^2} \]
\[ \le \frac{2\|f\|}{n^2\delta^2}\sum_{k=0}^{n}(k-nx)^2\binom{n}{k}x^k(1-x)^{n-k} = \frac{2\|f\|}{n^2\delta^2}\,nx(1-x) \le \frac{2\|f\|}{n\delta^2}. \]
Luckily this bound is independent of x. All that we have to do now is to choose N such that \frac{2\|f\|}{N\delta^2} < \frac{\varepsilon}{2}, i.e., N > \frac{4\|f\|}{\delta^2\varepsilon}.
Hence for all n ≥ N and all x ∈ [0, 1] we have |f(x) − B_n(x)| < ε/2 + ε/2 = ε. ♠
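Identity (III) and the uniform convergence of Lemma 11 can be checked by direct computation. The following sketch (my own illustration in plain Python; function names are mine) builds B_n^f from the definition and compares it with the closed form for f(x) = x²:

```python
from math import comb

def bernstein(f, n, x):
    # B_n^f(x) = sum_k C(n,k) x^k (1-x)^(n-k) f(k/n)
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n) for k in range(n + 1))

f = lambda t: t * t
for n in (1, 5, 20):
    for x in (0.0, 0.25, 0.5, 0.9, 1.0):
        closed_form = x * x * (1 - 1 / n) + x / n   # identity (III)
        assert abs(bernstein(f, n, x) - closed_form) < 1e-12

# uniform error decreases as n grows (Lemma 11); for f(x)=x^2 it is x(1-x)/n
err = lambda n: max(abs(bernstein(f, n, x / 100) - f(x / 100)) for x in range(101))
assert err(200) < err(10) < err(2)
```

For f(x) = x² the error is exactly x(1 − x)/n, so the sup error on [0, 1] is 1/(4n), matching the 1/n-type decay in the proof above.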
Remark 38 The above lemma actually implies, in probability theory, the so called weak law of large numbers.
Exercise 14 Write down B1, B2, B3 explicitly for f(x) = x2, and f(x) =
x3.
Lecture 24 (Friday 23rd Oct.)
Alternative proof of Weierstrass’s theorem:
As before, we may assume that [a, b] = [0, 1]. We may further assume that f(0) = f(1) = 0, by considering the function g(x) = f(x) − f(0) − x[f(1) − f(0)]. Moreover, we can now extend f all over R by defining it to be 0 outside [0, 1], so that f is uniformly continuous on R.
Lemma 12 For any continuous function f : R → R such that supp f ⊂ [0, 1], define the polynomial functions
\[ P_n(f)(x) = \int_0^1 f(s)\,Q_n(s - x)\,ds \tag{43} \]
where
\[ Q_n(x) = c_n (1 - x^2)^n \]
and the constant cn is chosen so that
\[ \int_{-1}^{1} Q_n(x)\,dx = 1, \quad n \ge 1. \]
Then {Pn(f)} is a sequence of polynomials converging uniformly to the function f on [0, 1].
Proof: For each fixed x ∈ R, the integrand in (43) is a continuous function of s and hence is Riemann integrable on [0, 1]. Also, the integrand is a polynomial in x with coefficients which are continuous functions of s; upon taking the definite integral w.r.t. s, we obtain Pn(f) as a polynomial function in x.
We begin with some estimate of the size of the constants cn.
Claim: cn < √n. Indeed, using (1 − x²)^n ≥ 1 − nx² on [0, 1/√n],
\[ \int_{-1}^{1} (1 - x^2)^n\,dx = 2\int_0^1 (1 - x^2)^n\,dx \ge 2\int_0^{1/\sqrt n} (1 - x^2)^n\,dx \ge 2\int_0^{1/\sqrt n} (1 - nx^2)\,dx = \frac{4}{3\sqrt n} > \frac{1}{\sqrt n}, \]
and since c_n \int_{-1}^{1} (1 - x^2)^n\,dx = 1, it follows that cn < √n.
Now if 0 < δ < 1 then for δ ≤ |x| ≤ 1 we have
\[ Q_n(x) \le \sqrt{n}\,(1 - \delta^2)^n. \]
Since √n(1 − δ²)^n → 0 as n → ∞, Qn → 0 uniformly on δ ≤ |x| ≤ 1.
Next we shall rewrite Pn: putting s = x + t, we get
\[ P_n(f)(x) = \int_{-x}^{1-x} f(x + t)\,Q_n(t)\,dt. \]
Since f = 0 outside [0, 1], we see that for x ∈ [0, 1]
\[ P_n(f)(x) = \int_{-1}^{1} f(x + t)\,Q_n(t)\,dt. \]
Given ε > 0 choose δ > 0 so that
|x− y| < δ implies that |f(x)− f(y)| < ε/2.
Let M = sup{|f(x)| : x ∈ R}. Then for any x ∈ [0, 1],
\[ |P_n(f)(x) - f(x)| = \left| \int_{-1}^{1} [f(x + t) - f(x)]\,Q_n(t)\,dt \right| \le \int_{-1}^{1} |f(x + t) - f(x)|\,Q_n(t)\,dt \]
\[ \le 2M\int_{-1}^{-\delta} Q_n(t)\,dt + \frac{\varepsilon}{2}\int_{-\delta}^{\delta} Q_n(t)\,dt + 2M\int_{\delta}^{1} Q_n(t)\,dt \le 4M\sqrt{n}\,(1 - \delta^2)^n + \frac{\varepsilon}{2} < \varepsilon \]
for sufficiently large n. ♠
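The convolution construction can be tested numerically. The sketch below is my own illustration (trapezoidal quadrature; test function f(s) = s(1 − s) on [0, 1], zero outside, is an assumption for the demo): it computes Pn(f) from (43) and checks that the approximation improves with n on an interior window.

```python
def trap(vals, h):
    # composite trapezoid rule over equally spaced samples
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def f(s):
    # continuous, supported in [0, 1] (illustrative choice)
    return s * (1.0 - s) if 0.0 <= s <= 1.0 else 0.0

def P(n, x, steps=2000):
    # P_n(f)(x) = int_0^1 f(s) c_n (1 - (s - x)^2)^n ds, c_n normalizing the kernel
    h = 2.0 / steps
    kernel = [(1.0 - (-1.0 + i * h) ** 2) ** n for i in range(steps + 1)]
    c = 1.0 / trap(kernel, h)
    hs = 1.0 / steps
    vals = [f(i * hs) * c * (1.0 - (i * hs - x) ** 2) ** n for i in range(steps + 1)]
    return trap(vals, hs)

def sup_err(n):
    xs = [0.2 + 0.015 * i for i in range(41)]   # interior window [0.2, 0.8]
    return max(abs(P(n, x) - f(x)) for x in xs)

assert sup_err(200) < sup_err(20)   # accuracy improves with n
assert sup_err(200) < 0.05
```

The kernel Qn concentrates near 0 like a bump of width about 1/√n, which is exactly the mechanism in the estimate above.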
Remark 39 Given a continuous function f : R → R, it is not true that we can find a sequence of polynomials approximating f all over R. For instance, in the above discussion, the polynomials Pn would obviously diverge to ±∞ as x → ∞, whereas the function f is identically 0 outside [0, 1].
Remark 40 The space B(X) is not only a vector space but is also an algebra, i.e., if f, g ∈ B(X) then fg ∈ B(X). We have earlier remarked that if K is a compact subset of Rn then C(K) is a closed subset of B(K). Indeed we can also verify that C(K) is a subalgebra. More generally we have:
Theorem 79 If A is a subalgebra of B(X) then its closure Ā is a subalgebra of B(X).
Definition 39 Let A be a family of functions on a set X. We say A
separates points in X if given any two distinct points x1, x2 ∈ X there
exists at least one f ∈ A such that f(x1) 6= f(x2). Likewise, we say A
vanishes at no point of X if for each x ∈ X there is at least one f ∈ A
such that f(x) 6= 0.
Example 20 A typical example of A satisfying the above properties is the family of polynomial functions, where X is any subset of Rn. On the other hand, the family of even polynomials on [−1, 1] does not separate points, and the family of odd polynomials vanishes at x = 0.
Theorem 80 Let A be an algebra of (real or complex valued) functions
on a set X which separates points of X and which does not vanish at
any point of X. Given x1 6= x2 and constants c1, c2 there exists f ∈ A
such that f(xj) = cj, j = 1, 2.
Proof: First find functions g, h, k such that
g(x1) 6= g(x2), h(x1) 6= 0, k(x2) 6= 0.
Put
\[ f(x) = c_1\,\frac{(g(x) - g(x_2))\,h(x)}{(g(x_1) - g(x_2))\,h(x_1)} + c_2\,\frac{(g(x) - g(x_1))\,k(x)}{(g(x_2) - g(x_1))\,k(x_2)}. \]
♠
Remark 41 Are you reminded of Newton’s interpolation formula?
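The two-point formula can be sanity-checked with concrete choices (mine, purely for illustration): g(x) = x separates points, and h = k = 1 + x² vanishes nowhere.

```python
# Sanity check of the two-point formula in Theorem 80 with illustrative choices.
g = lambda x: x                # separates x1 and x2
h = lambda x: 1.0 + x * x      # vanishes nowhere
k = h

x1, x2 = 1.0, 3.0
c1, c2 = 5.0, -2.0

def f(x):
    # f = c1*(g - g(x2))h / ((g(x1) - g(x2))h(x1)) + c2*(g - g(x1))k / ((g(x2) - g(x1))k(x2))
    term1 = c1 * (g(x) - g(x2)) * h(x) / ((g(x1) - g(x2)) * h(x1))
    term2 = c2 * (g(x) - g(x1)) * k(x) / ((g(x2) - g(x1)) * k(x2))
    return term1 + term2

assert abs(f(x1) - c1) < 1e-12   # at x1 the second term vanishes, the first is c1
assert abs(f(x2) - c2) < 1e-12   # symmetrically at x2
```

Note that f is built from g, h, k using only products, sums and scalar multiples, so it indeed lies in the algebra generated by them, exactly as the theorem requires.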
Theorem 81 Stone-Weierstrass Theorem Let A be an algebra of bounded real continuous functions on a compact metric space X which separates points of X and vanishes at no point of X. Then C(X) ⊂ Ā, the closure being taken in the uniform metric.
Proof: Step 1: If f ∈ Ā then |f| ∈ Ā.
Let a = sup{|f(x)| : x ∈ X}. Now find polynomials Pn(t) such that |Pn(t) − |t|| < 1/n for −a ≤ t ≤ a (these exist by Weierstrass's theorem). We can also assume that Pn(0) = 0 by considering Qn(t) = Pn(t) − Pn(0); since |Pn(0)| < 1/n, we then have |Qn(t) − |t|| < 2/n for −a ≤ t ≤ a. Consider gn(x) = Qn(f(x)) = c1 f(x) + c2 f²(x) + · · · + ck f^k(x) ∈ Ā (there is no constant term). On the other hand, for all x ∈ X we have
\[ |g_n(x) - |f(x)|| = |Q_n(f(x)) - |f(x)|| < 2/n. \]
This implies gn → |f| uniformly, and since Ā is closed, we are through.
Step 2 If f, g ∈ Ā, then max{f, g}, min{f, g} ∈ Ā.
This follows since
\[ \max\{f, g\} = \frac{f + g + |f - g|}{2}; \qquad \min\{f, g\} = \frac{f + g - |f - g|}{2}. \]
By repeated application of this, it follows that the maximum (or minimum) of finitely many functions in Ā is again in Ā.
Step 3 Let f : X → R be a continuous function and x ∈ X. Given ε > 0 there exists gx ∈ Ā such that gx(x) = f(x) and
\[ g_x(t) > f(t) - \varepsilon, \quad t \in X. \tag{44} \]
Using the property of separation of points and nonvanishing (Theorem 80), for every t ∈ X we have a function ht ∈ A such that ht(x) = f(x), ht(t) = f(t). Since ht − f is continuous and vanishes at t, there is a nbd Vt of t in X such that ht(y) > f(y) − ε for y ∈ Vt. Since X is compact, we get
\[ X \subset V_{t_1} \cup V_{t_2} \cup \cdots \cup V_{t_k}. \]
Put
\[ g_x = \max\{h_{t_1}, \ldots, h_{t_k}\}. \]
Then gx(x) = f(x), and if t ∈ Vti, we have
\[ g_x(t) \ge h_{t_i}(t) > f(t) - \varepsilon. \]
By Step 2, gx ∈ Ā.
Step 4 Given a continuous function f : X → R and ε > 0 there exists g ∈ Ā such that |f(t) − g(t)| < ε, t ∈ X.
For each x ∈ X, let gx ∈ Ā be a function as in Step 3. Since gx − f is continuous and vanishes at x, there is a nbd Ux of x such that gx(t) < f(t) + ε for all t ∈ Ux. Cover X with finitely many Ux1, . . . , Uxm and take g = min{gx1, . . . , gxm}. By Step 2, g ∈ Ā. Since each gxi has the property (44), it follows that g(t) > f(t) − ε, t ∈ X. On the other hand, if t ∈ Uxi then g(t) ≤ gxi(t) < f(t) + ε. Therefore for all t ∈ X we have f(t) − ε < g(t) < f(t) + ε. ♠
Remark 42 The theorem does not hold for algebras of complex valued functions without the additional hypothesis that A is self-adjoint, i.e., closed under conjugation: if f = u + ıv ∈ A then f̄ = u − ıv ∈ A. This can be illustrated by the following example.
Let X = S¹, the unit circle, and let A be the algebra of all polynomial functions with complex coefficients. Then A separates points, and the polynomial z ∈ A does not vanish on X. The function f(z) = 1/z is continuous on X. However, it does not belong to Ā. For, we have ∫_{S¹} P(z)\,dz = 0 for all polynomials P, whereas ∫_{S¹} dz/z = 2πı. If there were a sequence of polynomials uniformly converging to 1/z, then the integral would have been zero, by Theorem 75.
The situation can be saved if we make one more assumption.
Theorem 82 Let X be any compact metric space and A be a self ad-
joint algebra over C, of complex valued continuous functions on X.
Assume that A separates points of X and does not vanish anywhere on
X. Then A contains all continuous complex valued functions on X.
Proof: (Note that A has the additional property: f ∈ A =⇒ ıf, f̄ ∈ A, as compared with an algebra over R; being an algebra over the complex numbers is implicit when we talk about self-adjoint algebras.)
Let AR denote the subspace of all members of A which take only real values. Then AR is a subalgebra which also has the two additional properties. First of all, observe that if f ∈ A then <(f) = (f + f̄)/2 ∈ A and =(f) = (f − f̄)/2ı ∈ A. Therefore <(f), =(f) ∈ AR. Now given x1 ≠ x2 ∈ X, let f ∈ A be such that f(x1) ≠ f(x2). Then <(f)(x1) ≠ <(f)(x2) or =(f)(x1) ≠ =(f)(x2), and accordingly we get some g ∈ AR with g(x1) ≠ g(x2). Similarly, if f ∈ A is such that f(x) ≠ 0, then one of <(f)(x) ≠ 0, =(f)(x) ≠ 0 holds, and so we are done.
Now given any continuous function f : X → C, we can apply the real Stone-Weierstrass theorem to AR to conclude that <(f) ∈ Ā and =(f) ∈ Ā. Therefore f ∈ Ā. ♠
Lecture 25
Solution of IVP a la Picard
The existence and uniqueness of the solution of an Initial Value Problem (IVP)
\[ y' = f(x, y), \quad y(x_0) = y_0 \tag{45} \]
is of fundamental importance in several branches of mathematics, not just in the theory of differential equations. However, it is not taught in any first course in differential equations, since the students do not have the required analysis background; a student may then never take a formal course in differential equations, thereby totally 'missing' this beautiful theorem.
Observe that f is a given real valued function defined in a (rectangular) neighbourhood of the point (x0, y0) ∈ R². By a solution of (45), we mean a once differentiable function φ defined in some neighbourhood of the point x0, say (x0 − δ, x0 + δ), satisfying
\[ \varphi(x_0) = y_0, \quad \text{and} \quad \varphi'(x) = f(x, \varphi(x)), \; x \in (x_0 - \delta, x_0 + \delta). \tag{46} \]
By Fundamental Theorem of Riemann Integration, we can convert (45)
into an integral equation:
\[ y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\,dt \tag{47} \]
and it is in this form Picard came up with his classical solution of this
problem, via the so called iteration method. Here we give a simple
version of this great theorem. Before that, we would like to present the
modern avatar of iteration principle:
Definition 40 Let X be a metric space. By a contraction map on
X we mean a function T : X → X such that there exists a constant
0 < c < 1 such that for x, y ∈ X we have
d(T (x), T (y)) ≤ c d(x, y).
Remark 43 It is easy to see that every contraction mapping is con-
tinuous. The map f(x) = λx on Rn is a contraction iff |λ| < 1. The
most important property of contraction mapping is:
Theorem 83 Contraction Mapping Principle On a complete metric space, every contraction mapping T has precisely one fixed point, i.e., there exists exactly one point t0 ∈ X such that T(t0) = t0.
Proof: First let us prove the uniqueness. If T(t1) = t1 and T(t2) = t2 then we have
\[ d(t_1, t_2) = d(T(t_1), T(t_2)) \le c\, d(t_1, t_2), \]
which is absurd unless t1 = t2. Now starting with any point t ∈ X define
\[ t_1 = T(t),\; t_2 = T(t_1),\; \ldots,\; t_n = T(t_{n-1}),\; \ldots \]
Verify that {tn} is a Cauchy sequence (use d(tn+1, tn) ≤ c^n d(t1, t) and sum the geometric series). Since X is a complete metric space, it follows that tn → t0, say. Then, by continuity of T,
\[ T(t_0) = T(\lim_n t_n) = \lim_n T(t_n) = \lim_n t_{n+1} = t_0. \]
This completes the proof of the theorem. ♠
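A quick numerical illustration (mine, not from the notes): T(x) = cos x maps [0, 1] into itself and is a contraction there, since |T′(x)| = |sin x| ≤ sin 1 < 0.85; the iteration from the proof converges to the unique fixed point from any starting value.

```python
import math

def iterate_to_fixed_point(T, t, tol=1e-12, max_steps=10_000):
    # the iteration t, T(t), T(T(t)), ... from the proof of Theorem 83
    for _ in range(max_steps):
        t_next = T(t)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    raise RuntimeError("no convergence")

T = math.cos   # contraction on [0, 1] with constant c = sin(1) < 1
p = iterate_to_fixed_point(T, 0.0)
assert abs(T(p) - p) < 1e-11          # p is a fixed point
q = iterate_to_fixed_point(T, 1.0)
assert abs(p - q) < 1e-10             # same point from a different start: uniqueness
```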
Remark 44 This principle has the following wonderful interpretation.
Take a map of a country which is ‘to the scale’ and throw it inside the
country. Then there is (exactly) one point on the map which lies exactly
on the point in the country which it represents. You may wonder why
it should be true for countries like USA which has several connected
components but this is true!
Theorem 84 Let R = [a, b] × [c, d], let f : R → R be a continuous real valued function, and let M be a constant such that f satisfies the following Lipschitz condition in the second variable:
\[ |f(x, y_1) - f(x, y_2)| \le M|y_1 - y_2|, \quad (x, y_j) \in [a, b] \times [c, d]. \tag{48} \]
Given a < x0 < b, c < y0 < d, there exist a δ > 0 and a unique function φ which satisfies (46).
Proof: Put K = sup{|f(x, y)| : (x, y) ∈ R}. Choose δ > 0 so that
Mδ < 1, a < x0 − δ < x0 + δ < b and c < y0 − Kδ < y0 + Kδ < d.
Consider the space A = C[x0 − δ, x0 + δ] of all continuous real valued functions on the closed interval. We know that this is a complete metric space. Now consider the subspace B of those φ ∈ A such that
|φ(x)− y0| ≤ Kδ.
Then B is a closed subspace of A and hence is a complete metric space.
It is important to note that B is non empty. (Why?)
We consider the map T : B → B defined by
\[ T(\varphi)(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t))\,dt. \tag{49} \]
By theory of Riemann integration, it follows that T (φ) is continuous.
For x ∈ [x0 − δ, x0 + δ], we have,
\[ |T(\varphi)(x) - y_0| \le \left| \int_{x_0}^{x} f(t, \varphi(t))\,dt \right| \le K|x - x_0| \le K\delta. \]
This implies that T (φ) ∈ B.
Observe that φ ∈ B is a solution of (46) iff T (φ) = φ. Therefore,
our aim is to prove that T is a contraction mapping. Given φj ∈ B
consider
\[ |T(\varphi_1)(x) - T(\varphi_2)(x)| = \left| \int_{x_0}^{x} (f(t, \varphi_1(t)) - f(t, \varphi_2(t)))\,dt \right| \le M \left| \int_{x_0}^{x} |\varphi_1(t) - \varphi_2(t)|\,dt \right| \le M\delta\, d(\varphi_1, \varphi_2), \]
and since this is true for all x ∈ [x0 − δ, x0 + δ], we have
\[ d(T(\varphi_1), T(\varphi_2)) = \sup\{|T(\varphi_1)(x) - T(\varphi_2)(x)| : x \in [x_0 - \delta, x_0 + \delta]\} \le M\delta\, d(\varphi_1, \varphi_2). \]
Since Mδ < 1, T is a contraction, and its unique fixed point is the required solution. This completes the proof of the theorem. ♠
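Picard's iteration (49) can actually be run on a computer. A sketch (my own illustration, with integrals approximated by cumulative trapezoid sums on a grid): for y′ = y, y(0) = 1 on [0, 0.5], the iterates T^k(φ) converge to the exact solution e^x.

```python
import math

def picard_step(phi, xs, h, y0, f):
    # T(phi)(x) = y0 + int_{x0}^{x} f(t, phi(t)) dt, via cumulative trapezoid sums
    vals = [f(x, p) for x, p in zip(xs, phi)]
    out = [y0]
    acc = y0
    for i in range(1, len(xs)):
        acc += 0.5 * h * (vals[i - 1] + vals[i])
        out.append(acc)
    return out

# IVP: y' = y, y(0) = 1 on [0, 0.5]; exact solution exp(x)
f = lambda x, y: y
h = 1e-3
xs = [i * h for i in range(501)]
phi = [1.0] * len(xs)          # start from the constant function y0
for _ in range(25):
    phi = picard_step(phi, xs, h, 1.0, f)

err = max(abs(p - math.exp(x)) for x, p in zip(xs, phi))
assert err < 1e-4
```

Each iteration of the integral operator reproduces one more term of the Taylor series of e^x, which is exactly the geometric-rate convergence promised by the contraction principle.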
Lecture 26. Fourier Series
Some important Exercises of Integration:
Exercise 15 Throughout, let α be a fixed increasing function on
[a, b].
1. Famous Inequalities. Let p > 1 be a real number and define q by 1/p + 1/q = 1.
(a) Show that φ(x) = (1/p)x − x^{1/p}, x ≥ 0, attains its minimum at x = 1. Note that φ(1) = 1/p − 1 = −1/q, so that 1/p + 1/q = 1, and that both p, q > 1. They are called a 'dual pair' of numbers, i.e., q is the dual of p and p is the dual of q. Observe that if p = 2 then q = 2, i.e., 2 is dual to itself.
(b) If u, v ≥ 0 then
\[ uv \le \frac{u^p}{p} + \frac{v^q}{q}. \]
Show that equality holds iff u^p = v^q.
(c) Let f, g ∈ R(α) with f, g ≥ 0 be such that
\[ \int_a^b f^p\,d\alpha = 1 = \int_a^b g^q\,d\alpha. \]
Then show that \int_a^b fg\,d\alpha \le 1.
(d) Let f, g be any complex valued functions in R(α). Then prove Hölder's Inequality:
\[ \left| \int_a^b fg\,d\alpha \right| \le \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} \left( \int_a^b |g|^q\,d\alpha \right)^{1/q}. \]
(e) Schwarz's Inequality. With f, g as in (d), show that
\[ \left| \int_a^b fg\,d\alpha \right| \le \left( \int_a^b |f|^2\,d\alpha \right)^{1/2} \left( \int_a^b |g|^2\,d\alpha \right)^{1/2}. \]
(f) For any u ∈ R(α) and p > 0, define
\[ \|u\|_p := \left[ \int_a^b |u|^p\,d\alpha \right]^{1/p}. \]
For any f, g ∈ R(α), prove Minkowski's Inequality:
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
(g) Show that dp(f, g) = ‖f − g‖p satisfies triangle inequality.
Solution:
(a) φ′(x) = (1/p)(1 − x^{1/p − 1}) = 0 iff x = 1, and φ′′(1) > 0. The conclusion follows.
(b) Put x = up/vq in (a).
(c) f, g ∈ R(α) implies |f|^p, |g|^q ∈ R(α). (Why? Remember how we proved f² ∈ R(α)?) Now by (b), f(x)g(x) ≤ f(x)^p/p + g(x)^q/q. Upon integrating and using the fact that 1/p + 1/q = 1, we are done.
(d) Apply (c) to appropriate multiples of f, g.
(e) Put p = q = 2.
(f) Notice that 1/p + 1/q = 1 with p, q > 0 implies p, q ≥ 1. Put k = \int_a^b (|f| + |g|)^p\,d\alpha. Then
\[ k = \int_a^b (|f| + |g|)(|f| + |g|)^{p-1}\,d\alpha = \int_a^b |f|(|f| + |g|)^{p-1}\,d\alpha + \int_a^b |g|(|f| + |g|)^{p-1}\,d\alpha \]
\[ \le \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} \left( \int_a^b (|f| + |g|)^{(p-1)q}\,d\alpha \right)^{1/q} + \left( \int_a^b |g|^p\,d\alpha \right)^{1/p} \left( \int_a^b (|f| + |g|)^{(p-1)q}\,d\alpha \right)^{1/q} \]
\[ = \left[ \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} + \left( \int_a^b |g|^p\,d\alpha \right)^{1/p} \right] k^{1/q}, \]
because (p − 1)q = p. The result follows upon dividing by k^{1/q}.
(g) Easy.
2. Let f ∈ R(α) on [a, b]. Given ε > 0 show that there exists a
continuous function g : [a, b] → R such that ‖f − g‖2 < ε.
Solution: We have seen that f ∈ R(α) implies that f² ∈ R(α) too. So choose a partition P = {a = a0, . . . , an = b} such that for it (and for all refinements of it as well) we have
\[ \sum_i |f(t_i) - f(s_i)|^2 \Delta\alpha_i < \varepsilon^2/4, \quad \text{for all } t_i, s_i \in [a_{i-1}, a_i]. \]
Put
\[ g(t) = \frac{a_i - t}{\Delta a_i} f(a_{i-1}) + \frac{t - a_{i-1}}{\Delta a_i} f(a_i), \quad a_{i-1} \le t \le a_i, \]
where Δai = ai − ai−1. Then clearly g is continuous. For ai−1 ≤ ti ≤ ai we have
\[ f(t_i) - g(t_i) = \frac{a_i - t_i}{\Delta a_i}\,(f(t_i) - f(a_{i-1})) + \frac{t_i - a_{i-1}}{\Delta a_i}\,(f(t_i) - f(a_i)). \]
Therefore
\[ |f(t_i) - g(t_i)| \le |f(t_i) - f(a_{i-1})| + |f(t_i) - f(a_i)| \le 2|f(t_i) - f(s_i)|, \]
where si = ai or ai−1, whichever gives the larger difference. Therefore
\[ \sum_i |f(t_i) - g(t_i)|^2 \Delta\alpha_i \le 4\sum_i |f(t_i) - f(s_i)|^2 \Delta\alpha_i < \varepsilon^2. \]
Definition 41 A function f : R → R, (C) is called periodic with pe-
riod λ > 0 if f(x + λ) = f(x) for all x ∈ R.
As an immediate corollary of Theorem 82, we have
Theorem 85 Let f : R → R be a continuous function with the prop-
erty f(x + 2π) = f(x) for all x ∈ R. Then there exists a sequence
\[ S_N(x) = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx), \quad a_0, a_n, b_n \in R, \tag{50} \]
which converges uniformly to f on the whole of R.
Proof: Functions of the above form SN are called trigonometric poly-
nomials. Notice that each summand that occurs on the RHS of the
formula for SN has the property
g(x + 2π) = g(x), x ∈ R.
Such functions are called periodic with period 2π. The important thing
to note about them is that their behavior on R is completely known by
their behaviour on any interval of length (≥) 2π.
If we allow complex coefficients a0, an, bn in (50) then using the
identities
\[ \cos x = \frac{e^{\imath x} + e^{-\imath x}}{2}, \qquad \sin x = \frac{e^{\imath x} - e^{-\imath x}}{2\imath}, \]
it follows that we can rewrite (50) in the form
\[ S_N(x) = \sum_{n=-N}^{N} c_n e^{\imath n x}, \quad c_n \in C. \tag{51} \]
Let A denote the collection of all such functions sN . Check that A is
a self-adjoint algebra of continuous functions on the whole of R (but
we shall consider these functions on the closed interval [−π, π]). Also
check that this algebra separates points of [−π, π] and does not vanish
anywhere (since it contains constant functions). Therefore its closure
contains the space C[−π, π].
Now given any continuous periodic function f : R → R with period
2π restrict f : [−π, π] → R. Now by what we have concluded above,
we get a sequence {sN(x)} ∈ A (with coefficients a0, an, bn ∈ C) which
uniformly converges to f. Upon rewriting it in terms of cos nx and
sin nx and taking the real part the theorem follows. ♠The above theorem prods us into studying many related concepts
which lead us to the so called Theory of Fourier series. We shall only
give a few basics of this vast theory here depending only on the math-
ematics that we have developed so far. Full justification to this topic
cannot be done without the support of Lebesgue's theory.
Lemma 13 Let n be an integer. Then
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{\imath n x}\,dx = \begin{cases} 1 & \text{if } n = 0; \\ 0 & \text{otherwise.} \end{cases} \tag{52} \]
Definition 42 By a trigonometric series we mean a sum of the form
\[ \sum_{n=-\infty}^{\infty} c_n e^{\imath n x} \tag{53} \]
whose N-th partial sum S_N is given by (51). Given a Riemann integrable function f on [−π, π], and an integer n, we define its n-th Fourier coefficient by the formula
\[ c_n(f) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-\imath n x}\,dx. \tag{54} \]
The Fourier series (also called trigonometric series) associated to f is defined to be \sum_{n=-\infty}^{\infty} c_n(f) e^{\imath n x}. We express this often by
\[ f \sim \sum_{n=-\infty}^{\infty} c_n(f) e^{\imath n x}. \tag{55} \]
Remark 45 We observe that if S_N is a trigonometric polynomial as in (51), then c_n(S_N) = c_n for |n| ≤ N and c_n(S_N) = 0 for |n| > N. Thus the Fourier series of S_N reduces to a trigonometric polynomial. One of the fundamental problems in the theory is: when can we write '=' in place of '∼' in (55)? Of course there are many subquestions related to this as well, viz., what should be the meaning of '=' here. For instance, it is clear that at all cost we should insist that the RHS converges. If the convergence is uniform then it follows that the function represented is periodic and moreover continuous. The first property is desirable whereas the second one is NOT. The applications that we have in mind involve, more often than not, functions which have discontinuities.
For instance, if the series (53) converges to some function f, then we would like the so called Euler's formula
\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-\imath n x}\,dx \]
to be true. If we grant uniform convergence, then term-by-term integration is valid, and hence using (52) one easily checks this property. This is similar to the case of an analytic function, whose n-th derivative at 0 determines the coefficient of x^n. For trigonometric series, or for more general Fourier series, we are looking for similar properties under conditions more general than uniform convergence.
Definition 43 Let {φj} be a family of complex valued integrable functions on [a, b] with the property
\[ \int_a^b \varphi_j(x)\,\overline{\varphi_k(x)}\,dx = 0, \quad j \ne k. \tag{56} \]
Then we say {φj} is an orthogonal family of functions. If in addition
\[ \int_a^b |\varphi_j(x)|^2\,dx = 1 \tag{57} \]
we call it an orthonormal family.
Example 21 We have seen that the family {e^{\imath n x}/\sqrt{2\pi}} is an orthonormal family on [−π, π]. Similarly,
\[ \left\{ \frac{1}{\sqrt{2\pi}},\; \frac{\cos x}{\sqrt{\pi}},\; \frac{\sin x}{\sqrt{\pi}},\; \frac{\cos 2x}{\sqrt{\pi}},\; \frac{\sin 2x}{\sqrt{\pi}},\; \cdots \right\} \]
is also an orthonormal family on [−π, π]. (Note the normalization: \int_{-\pi}^{\pi} \cos^2 nx\,dx = \pi, so the trigonometric functions are divided by \sqrt{\pi}.)
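Orthonormality of the exponential family can be verified by quadrature. The sketch below (an editor-added illustration) checks (56) and (57) for {e^{inx}/√(2π)} using the composite trapezoid rule, which for these periodic integrands is exact up to rounding:

```python
import cmath, math

def inner(n, m, samples=1024):
    # (1/2pi) * int_{-pi}^{pi} e^{inx} * conj(e^{imx}) dx by the trapezoid rule;
    # for periodic integrands this equals the exact integral up to rounding
    h = 2.0 * math.pi / samples
    total = 0.0 + 0.0j
    for i in range(samples):
        x = -math.pi + i * h
        total += cmath.exp(1j * n * x) * cmath.exp(1j * m * x).conjugate()
    return total * h / (2.0 * math.pi)

for n in range(-3, 4):
    for m in range(-3, 4):
        expected = 1.0 if n == m else 0.0   # Kronecker delta, as in Lemma 13
        assert abs(inner(n, m) - expected) < 1e-12
```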
Definition 44 Given an integrable function f on [a, b] we define
\[ c_j(f) := \int_a^b f(x)\,\overline{\varphi_j(x)}\,dx \tag{58} \]
to be the j-th Fourier coefficient of f with respect to the family {φj}. Moreover, the formal sum \sum_j c_j(f)\varphi_j(x) is then called the Fourier series of f with respect to {φj}, and we express this by
\[ f(x) \sim \sum_j c_j(f)\varphi_j(x). \]
For any two integrable functions f, g on [a, b], let us write
\[ \langle f, g \rangle = \int_a^b f\,\bar g\,dx. \]
Also let us write
\[ \|f\|_2 = \sqrt{\langle f, f \rangle}. \]
Theorem 86 Pythagoras' theorem: If 〈f, g〉 = 0 then
\[ \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2. \]
Proof: Direct.
Theorem 87 Least Square Approximation Let f be an integrable function on [a, b]. Let {φn} be an orthonormal system and let
\[ s_n(x) := \sum_{m=1}^{n} c_m \varphi_m(x) \]
be the n-th partial sum of the Fourier series of f. Then for all
\[ t_n(x) = \sum_{m=1}^{n} \gamma_m \varphi_m(x) \]
we have
\[ \int_a^b |f - s_n|^2\,dx \le \int_a^b |f - t_n|^2\,dx \tag{59} \]
with equality holding iff γm = cm for all 1 ≤ m ≤ n.
Proof: Check that f − sn is orthogonal to sn − tn and use the above theorem to conclude that
\[ \|f - t_n\|_2^2 = \|f - s_n\|_2^2 + \|s_n - t_n\|_2^2. \]
This proves (59). As for the last part, repeated application of Pythagoras' theorem yields
\[ \|s_n - t_n\|_2^2 = \sum_{m=1}^{n} |c_m - \gamma_m|^2, \]
from which the conclusion follows. ♠
Theorem 88 Bessel's Inequality: For any integrable function f on [a, b], if f ∼ \sum_m c_m \varphi_m then
\[ \sum_n |c_n|^2 \le \|f\|_2^2. \]
Proof: Putting tn = 0 in the proof of the above theorem, we first obtain that f − sn is orthogonal to sn. (Or do this directly afresh.) Again by Pythagoras' theorem, we get
\[ \|f\|_2^2 = \|f - s_n\|_2^2 + \|s_n\|_2^2. \]
The conclusion follows. ♠

In particular, we have the so called
Theorem 89 Riemann-Lebesgue theorem: For any integrable function f on [−π, π], the sequence of Fourier coefficients converges to 0:
\[ \lim_{k\to\infty} \int_{-\pi}^{\pi} f(t) \cos kt\,dt = 0; \qquad \lim_{k\to\infty} \int_{-\pi}^{\pi} f(t) \sin kt\,dt = 0. \tag{60} \]
Proof: Bessel's inequality implies that lim_{n→±∞} c_n = 0. The two integrals above are, up to constant factors, c_k + c_{−k} and c_k − c_{−k} respectively, so both tend to 0. ♠
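The decay of these oscillatory integrals can be observed numerically (my illustration, not from the notes): for f(t) = t, one computes by parts that ∫_{−π}^{π} t sin kt dt = 2π(−1)^{k+1}/k, which tends to 0 as k → ∞.

```python
import math

def osc_integral(k, samples=200_000):
    # trapezoid approximation of int_{-pi}^{pi} t*sin(k t) dt
    h = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples + 1):
        t = -math.pi + i * h
        w = 0.5 if i in (0, samples) else 1.0
        total += w * t * math.sin(k * t)
    return total * h

for k in (1, 10, 100):
    exact = 2.0 * math.pi * (-1) ** (k + 1) / k   # integration by parts
    assert abs(osc_integral(k) - exact) < 1e-4

assert abs(osc_integral(100)) < 0.1   # Riemann-Lebesgue decay, O(1/k) here
```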
Lecture 28
Theorem 90 Parseval's Theorem: Let f, g be integrable functions with period 2π. Put
\[ f(x) \sim \sum_{m=-\infty}^{\infty} c_m e^{\imath m x}; \qquad g(x) \sim \sum_{m=-\infty}^{\infty} \gamma_m e^{\imath m x}. \]
Then
(i) \lim_{N\to\infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - s_N(f; x)|^2\,dx = 0;
(ii) \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,\overline{g(x)}\,dx = \sum_{m=-\infty}^{\infty} c_m \bar\gamma_m;
(iii) \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{m=-\infty}^{\infty} |c_m|^2.
Proof: We shall denote \|h\|_2 = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |h(x)|^2\,dx \right)^{1/2}. Since f is
integrable and f(−π) = f(π), from a previous exercise 15.2, given
ε > 0, we have a continuous 2π-periodic function h such that
‖f − h‖2 < ε.
By Theorem 85 above, there is a trigonometric polynomial
\[ P = \sum_{m=-N}^{N} \gamma_m e^{\imath m x}, \]
of degree N, say, such that |P(x) − h(x)| < ε for all x ∈ [−π, π] and hence ‖P − h‖₂ < ε.
Let us use a slightly modified notation: for any g ∈ R(α)[−π, π],
\[ s_n(g) := \sum_{k=-n}^{n} c_k(g) e^{\imath k x}. \]
By Least Square Approximation, it follows that
‖h− sn(h)‖2 ≤ ‖h− P‖2 < ε, for n ≥ N.
Also Bessel’s inequality, we have,
‖sn(h)− sn(f)‖2 = ‖sn(h− f)‖2 ≤ ‖h− f‖2 < ε.
Finally by Triangle inequality, we have
‖f − sn(f)‖2 ≤ ‖f − h‖2 + ‖h− sn(h)‖2 + ‖sn(h)− sn(f)‖2 < 3ε
for all n ≥ N. This proves (i).
To prove (ii), we first observe that at the finite sum level we have
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} s_N(f)\,\bar g\,dx = \frac{1}{2\pi} \sum_{n=-N}^{N} c_n \int_{-\pi}^{\pi} e^{\imath n x}\,\overline{g(x)}\,dx = \sum_{n=-N}^{N} c_n \bar\gamma_n. \]
Therefore, using Schwarz's inequality, we get
\[ \left| \int f\bar g - \int s_N(f)\,\bar g \right| \le \int |f - s_N(f)|\,|g| \le \left( \int |f - s_N(f)|^2 \right)^{1/2} \left( \int |g|^2 \right)^{1/2}. \]
Letting N → ∞ and using (i), we get (ii). (iii) follows from (ii) by putting g = f. ♠

Convergence problem for Trigonometric Series
We shall from now on deal only with trigonometric series, and consider functions f with period 2π which are Riemann integrable over [−π, π].
Consider the trigonometric polynomial with all its coefficients equal to 1. (By analogy, this plays the role of the polynomial which is the n-th partial sum of the geometric series for (1 − x)^{−1}.) The trigonometric polynomial
\[ D_N(x) = \sum_{n=-N}^{N} e^{\imath n x} \]
is called the Dirichlet kernel. Multiplying it by e^{\imath x} − 1 we get
\[ (e^{\imath x} - 1)\,D_N(x) = e^{\imath(N+1)x} - e^{-\imath N x}. \]
Multiplying further by e^{-\imath x/2} we get
\[ 2\imath \sin(x/2)\,D_N(x) = 2\imath \sin\left((N + 1/2)x\right). \]
Therefore
\[ D_N(x) = \frac{\sin\left((N + 1/2)x\right)}{\sin(x/2)}. \tag{61} \]
Another interesting property of the Dirichlet kernel is that
\[ \int_{-\pi}^{\pi} D_N(t)\,dt = 2\pi. \tag{62} \]
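Both (61) and (62) are easy to confirm numerically (an editor-added illustration):

```python
import math, cmath

def dirichlet_sum(N, x):
    # D_N(x) = sum_{n=-N}^{N} e^{inx}; the sum is real by symmetry
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    # D_N(x) = sin((N + 1/2) x) / sin(x/2), valid when x is not a multiple of 2*pi
    return math.sin((N + 0.5) * x) / math.sin(0.5 * x)

for N in (1, 3, 7):
    for x in (0.1, 1.0, 2.5, -1.7):
        assert abs(dirichlet_sum(N, x) - dirichlet_closed(N, x)) < 1e-10

# (62): the integral over [-pi, pi] is 2*pi, since only the n = 0 term survives
h = 2 * math.pi / 4096
integral = sum(dirichlet_sum(5, -math.pi + i * h) for i in range(4096)) * h
assert abs(integral - 2 * math.pi) < 1e-9
```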
Given any f ∈ R(α)[−π, π] we can rewrite s_N(f) in terms of the Dirichlet kernel:
\[ s_N(f)(x) = \sum_{n=-N}^{N} \frac{1}{2\pi} \left( \int_{-\pi}^{\pi} f(t) e^{-\imath n t}\,dt \right) e^{\imath n x} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{n=-N}^{N} e^{\imath n (x - t)}\,dt \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,D_N(x - t)\,dt = \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} f(x - s)\,D_N(s)\,ds = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - s)\,D_N(s)\,ds, \]
the last equality being the result of the periodicity of the integrand.
We shall now prove a local convergence theorem:
Theorem 91 Suppose for some x there exist δ > 0 and M < ∞ such that
\[ |f(x + t) - f(x)| \le M|t|, \quad t \in (-\delta, \delta). \tag{63} \]
Then
\[ \lim_{N\to\infty} s_N(f; x) = f(x). \]
Proof: Put
\[ g(t) = \begin{cases} \dfrac{f(x - t) - f(x)}{\sin(t/2)}, & 0 < |t| \le \pi; \\[1ex] 0, & t = 0. \end{cases} \]
We first note that g ∈ R(α)[−π, π]. [Let us prove that g satisfies Riemann's condition on [0, π], the proof for the interval [−π, 0] being the same. Since t/sin(t/2) → 2 as t → 0, we can choose δ1 > 0 such that |t/sin(t/2)| < 4 for 0 < |t| ≤ δ1; by (63), |g(t)| ≤ M|t|/|sin(t/2)| ≤ 4M there. Now choose δ2 = min{δ, δ1, ε/16M}, so that the contribution of [0, δ2] to the difference of upper and lower sums is at most 2 · 4M · δ2 ≤ ε/2. On [δ2, π], g is integrable, and hence we can find a partition P := {δ2 = a1 < a2 < · · · < an = π} for which g satisfies Riemann's condition for ε/2. It then follows that for the partition Q := {0 < δ2 = a1 < · · · < an}, g satisfies Riemann's condition on the interval [0, π] for ε.]
Using (62) we get
\[ s_N(f; x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x - t) - f(x)]\,D_N(t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\sin(t/2)\,D_N(t)\,dt \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\sin\left((N + 1/2)t\right) dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\left[\sin(t/2)\cos Nt + \cos(t/2)\sin Nt\right] dt = \alpha_N + \beta_N, \]
where αN and βN are, up to constant factors, the N-th cosine coefficient of g(t) sin(t/2) and the N-th sine coefficient of g(t) cos(t/2), respectively. Both these functions are Riemann integrable on the closed interval, since g is. Therefore, by the Riemann-Lebesgue theorem (89), it follows that αN → 0, βN → 0 as N → ∞. ♠
Remark 46 It follows that if f ∈ C2 then it satisfies (63) and hence the
Fourier series is convergent. However, by carrying out integration by
parts twice and using Weierstrass’s majorant criterion, one can directly
prove that the Fourier series is uniformly convergent to a function g.
But then term-by-term integration is valid and hence it follows that
the function g is equal to f.
Lemma 14 Let g ∈ R(α)[0, π]. Then
\[ \lim_{N\to\infty} \int_0^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds = 0. \tag{64} \]
Proof: Extend g over all of [−π, π] by defining g(t) = 0 for t ∈ [−π, 0). Then g ∈ R(α)[−π, π] and we have
\[ \int_0^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds = \int_{-\pi}^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds. \]
Use the fact that
\[ \sin\left[(N + 1/2)s\right] = \sin Ns \cos(s/2) + \cos Ns \sin(s/2) \]
and appeal to Theorem 89. ♠
Theorem 92 Let f ∈ R(α)[−π, π] and let x ∈ [−π, π]. Assume that
f(x±), f ′(x±) exist. Then the Fourier series for f at x will converge to
[f(x+) + f(x−)]/2.
Proof: The hypothesis that f′(x+), f′(x−) exist implies that f satisfies the following Lipschitz conditions:
\[ |f(x + t) - f(x+)| \le Mt, \quad 0 \le t \le \delta, \]
and
\[ |f(x - t) - f(x-)| \le Mt, \quad 0 \le t \le \delta, \]
for some M, δ > 0.
Now we use the property D_N(−s) = D_N(s) to see that
\[ s_N(f; x) = \frac{1}{2\pi} \int_0^{\pi} [f(x + s) + f(x - s)]\,D_N(s)\,ds. \]
Therefore
\[ s_N(f; x) - \frac{f(x+) + f(x-)}{2} = \frac{1}{2\pi} \int_0^{\pi} [f(x + s) + f(x - s) - f(x+) - f(x-)]\,D_N(s)\,ds \]
\[ = \frac{1}{2\pi} \int_0^{\pi} (f(x + s) - f(x+))\,D_N(s)\,ds + \frac{1}{2\pi} \int_0^{\pi} (f(x - s) - f(x-))\,D_N(s)\,ds \]
\[ = \frac{1}{2\pi} \int_0^{\pi} g_+(s)\sin\left[(N + 1/2)s\right] ds + \frac{1}{2\pi} \int_0^{\pi} g_-(s)\sin\left[(N + 1/2)s\right] ds, \]
where g± are defined in a similar way as in the proof of the above theorem:
\[ g_\pm(s) = \begin{cases} \dfrac{f(x \pm s) - f(x\pm)}{\sin(s/2)}, & 0 < s \le \pi; \\[1ex] 0, & s = 0. \end{cases} \]
Exactly as in the above theorem, it follows that g± ∈ R(α)[0, π]. By the lemma above, each of the terms on the RHS converges to 0 and we are through. ♠

(C, 1) Summability of Fourier series
Given f ∈ R(α)[−π, π], let us discuss the (C, 1)-summability of the series Σ cn(f)e^{ınx}. We consider the sequence
\[ \sigma_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} s_k(f; x) \]
and ask the question: under what conditions does
\[ \lim_{n\to\infty} \sigma_n(x) = f(x)? \]
Thus it is natural to consider the sequence of sums
\[ K_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} D_k(x). \]
These functions are called Fejér kernels. We have
\[ K_n(x) = \frac{1}{n \sin(x/2)} \sum_{k=0}^{n-1} \sin\left((k + 1/2)x\right) = \frac{\sin^2(nx/2)}{n \sin^2(x/2)}. \]
Also observe that from (62) it follows that
\[ \int_{-\pi}^{\pi} K_n(x)\,dx = 2\pi. \tag{65} \]
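The closed form of the Fejér kernel, and its non-negativity (which is what makes the (C, 1) means better behaved than the partial sums), can be confirmed numerically (my illustration):

```python
import math

def fejer_avg(n, x):
    # K_n(x) = (1/n) * sum_{k=0}^{n-1} D_k(x), with D_k(x) = sin((k+1/2)x)/sin(x/2)
    return sum(math.sin((k + 0.5) * x) / math.sin(0.5 * x) for k in range(n)) / n

def fejer_closed(n, x):
    # K_n(x) = sin^2(n x / 2) / (n * sin^2(x / 2))
    return math.sin(0.5 * n * x) ** 2 / (n * math.sin(0.5 * x) ** 2)

for n in (1, 2, 5, 12):
    for x in (0.3, 1.1, 2.9, -0.7):
        assert abs(fejer_avg(n, x) - fejer_closed(n, x)) < 1e-10
        assert fejer_closed(n, x) >= 0.0    # the kernel is non-negative
```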
Theorem 93 Let f ∈ R(α)[−π, π] and let x ∈ (−π, π) be such that f is continuous at x. Then the Fourier series of f is (C, 1)-summable to f(x) at x.
Proof: We have to show that σn(x) → f(x). As before, this is the same as showing
\[ \lim_{n\to\infty} \int_0^{\pi} [f(x + t) + f(x - t) - 2f(x)]\,K_n(t)\,dt = 0. \]
By continuity of f at x, we can find 0 < δ < |π − x| such that for 0 ≤ t ≤ δ we have
\[ |f(x + t) + f(x - t) - 2f(x)| < \varepsilon/2. \]
On the other hand, for δ ≤ t ≤ π we have
\[ K_n(t) = \frac{\sin^2(nt/2)}{n \sin^2(t/2)} \le \frac{1}{n \sin^2(\delta/2)}, \]
and hence for sufficiently large n we can make
\[ \left| \int_\delta^{\pi} [f(x + t) + f(x - t) - 2f(x)]\,K_n(t)\,dt \right| \le \frac{4\pi M}{n \sin^2(\delta/2)} < \varepsilon/2, \]
where M = sup |f|. The theorem follows. ♠
Remark 47 If x is one of the end points ±π, then the continuity of f at x should be interpreted to mean that f(−π) = f(π) and that the extended function defined by f(x + 2π) = f(x) all over R is continuous at x = π. With this meaning the above arguments go through in this case also. Further, if f is continuous on the whole of [−π, π] (and f(−π) = f(π)), then the choice of δ in the above proof can be made independent of x, and so can the choice of n. This yields:
Theorem 94 Let f be a periodic continuous function. Then the Fourier
series of f (C, 1)-converges uniformly to f all over R.
Exercise 16 1. Let f : R → R be a non constant function such that
f(x + y) = f(x) + f(y) for all x, y ∈ R.
(i) If f is continuous at x = 0 show that it is continuous on R.
(ii) Determine all such continuous f.
2. Let f : R → R be a non constant function such that
f(x + y) = f(x)f(y) for all x, y ∈ R.
(i) If f is continuous at x = 0 show that it is continuous on R.
(ii) Determine all such continuous f.
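Before solving Exercises 1 and 2, one can at least verify that the expected families of solutions do satisfy the two functional equations. The sketch below is only a sanity check with an arbitrarily chosen constant c (our choice, purely illustrative); it is not the requested proof.

```python
import math

# Sanity check for Exercises 1 and 2 (illustration, not a proof):
# f(x) = c*x satisfies the additive equation f(x+y) = f(x) + f(y), and
# g(x) = exp(c*x) satisfies the multiplicative one g(x+y) = g(x)*g(y).

c = 1.7  # an arbitrary nonzero constant

f = lambda x: c * x
g = lambda x: math.exp(c * x)

for x, y in [(0.3, -1.2), (2.0, 5.5), (-0.7, 0.7)]:
    assert abs(f(x + y) - (f(x) + f(y))) < 1e-12
    assert abs(g(x + y) - g(x) * g(y)) < 1e-9
```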
3. Apply Parseval's theorem to the function f(x) = x, 0 ≤ x < 2π, and obtain the value of ∑_{n=1}^∞ 1/n².
4. Prove that on [−π, π],

(π − |x|)² = π²/3 + ∑_{n=1}^∞ (4/n²) cos nx.

Evaluate ∑_{n=1}^∞ 1/n² and ∑_{n=1}^∞ 1/n⁴.
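The sums that Exercise 4 leads to are the classical values ∑ 1/n² = π²/6 (set x = 0 in the identity) and ∑ 1/n⁴ = π⁴/90 (via Parseval's theorem). A quick numerical check of these values, added here as an illustration and not as a substitute for the derivation:

```python
import math

# Check the classical values one obtains in Exercise 4:
# sum 1/n^2 = pi^2/6 and sum 1/n^4 = pi^4/90, via large partial sums.

def zeta_partial(s, terms=100_000):
    """Partial sum of sum_{n>=1} 1/n^s over the first `terms` terms."""
    return sum(1 / n**s for n in range(1, terms + 1))

# The tail of the s = 2 series is about 1/terms, so 1e-4 is a safe tolerance.
assert abs(zeta_partial(2) - math.pi**2 / 6) < 1e-4
assert abs(zeta_partial(4) - math.pi**4 / 90) < 1e-10
```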
5. Integration by Parts: Let α be an increasing function on [a, b]. Suppose f(x) = F′(x) on [a, b]. Then

∫_a^b α(x)f(x) dx = F(b)α(b) − F(a)α(a) − ∫_a^b F dα.
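A numerical sanity check of the integration-by-parts formula, for the smooth case where dα = α′(x) dx (the particular choices of F, f, and α below are ours, purely illustrative):

```python
import math

# Illustrative check of integration by parts with F(x) = sin(x),
# f(x) = F'(x) = cos(x), and the increasing alpha(x) = x^2 on [0, 1].
# Since alpha is smooth here, the Stieltjes integral of F d(alpha)
# reduces to the Riemann integral of F(x) * alpha'(x).

def riemann(g, a, b, n=200_000):
    """Composite midpoint approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 1.0
F, f = math.sin, math.cos
alpha = lambda x: x * x    # increasing on [0, 1]
dalpha = lambda x: 2 * x   # alpha'(x), so F d(alpha) = F(x) * 2x dx

lhs = riemann(lambda x: alpha(x) * f(x), a, b)
rhs = F(b) * alpha(b) - F(a) * alpha(a) - riemann(lambda x: F(x) * dalpha(x), a, b)
assert abs(lhs - rhs) < 1e-8
```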
Lecture 32. Cantor set

Here we shall define an operator ||∞ from the class of all closed intervals [a, b], a < b ∈ R, to the class of compact subsets of R. Given any closed interval J = [a, b], let us define φ(J) to be the set obtained by deleting the middle-1/3 open interval of J from J. That is,

φ(J) := J \ (a + (b − a)/3, a + 2(b − a)/3).
For any set A which is a finite union of disjoint closed intervals, A = ∪_{i=1}^k [a_i, b_i], define

φ(A) = ∪_i φ([a_i, b_i]).
Put I_0 = [a, b] and inductively put I_n = φ(I_{n−1}), n ≥ 1. We then have a decreasing sequence of closed subsets

I_0 ⊃ I_1 ⊃ · · · ⊃ I_n ⊃ · · ·
Put

||∞[a, b] := ∩_n I_n.

The function ||∞ is called Cantor's construction. The set C = ||∞[0, 1] is called the Cantor set. We shall call ||∞[a, b] a Cantor set for every closed interval [a, b]. These sets have wonderful properties:
(a) ||∞[a, b] is a non empty compact subset of [a, b].
(b) If J is one of the connected components of In for some n then
||∞(J) ⊂ ||∞[a, b].
(c) a, b ∈ ||∞[a, b].
(d) Let f(x) = a + (b− a)x. Then f induces a continuous bijection of
C = ||∞[0, 1] with ||∞[a, b].
From now onward we shall specialize to C = ||∞[0, 1]. Each of the
properties of C which we list below is carried over to an identical or
similar property of ||∞[a, b] by the similarity map f above.
(e) The end points of every component of I_n, n ≥ 0, are in C.
(f) The set of all rationals of the form ∑_{k=1}^n a_k/3^k, where each a_k = 0 or 2, is contained in C.
(g) C contains no open intervals.
(h) Every point of C is a limit point of C. (Such closed subsets of R^n are called perfect sets.)
(i) C is uncountable.
(j) C is totally disconnected.
(k) C is of length zero.
Proof: (a)-(d) Obvious.
(e) This is an easy consequence of (b) and (c).
(f) This is just the restatement of (e).
(g) Let J = (c, d) be any open interval contained in [0, 1]. Choose n so that d − c > 2/3^n. Then for some i with 0 ≤ i < 3^n, J_1 := [i/3^n, (i + 1)/3^n] ⊂ J. It follows that I_{n+1} does not contain the middle-1/3 of J_1 and hence J ⊄ I_{n+1}.
(h) Let x ∈ C and let J be an interval around x. If n is chosen as above, there is a unique i with 0 ≤ i < 3^n such that x ∈ [i/3^n, (i + 1)/3^n] = J_1. Now both the end points of J_1 are in C, and one of them, not equal to x, has to be inside J. Hence J contains a point of C other than x.
(i) This can be deduced from the fact that C is a perfect set. Here is an easier way. From (f), since C is closed, it follows that every number
represented as an infinite sum

∑_{k=1}^∞ a_k/3^k, with each a_k ∈ {0, 2},

belongs to C. Let A be the set of all sequences α : N → {0, 2}. We know that A is uncountable. The assignment

(a_k) ↦ ∑_{k=1}^∞ a_k/3^k

defines an injective mapping of A into C.
(j) Given any two points x < y in C, since by (g) the interval [x, y] is not contained in C, there exists z ∉ C such that x < z < y. Then {[0, z] ∩ C, [z, 1] ∩ C} defines a separation of C.
(k) This follows from the fact that the deleted middle thirds have total length ∑_{n=1}^∞ 2^{n−1}/3^n = 1.
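The construction and properties (f) and (k) can be simulated exactly with rational arithmetic. Below is a hedged computational sketch (not part of the notes; the function names are ours): it represents each I_n as a list of disjoint closed intervals with Fraction endpoints, applies the middle-thirds deletion φ repeatedly, and checks both the total-length formula behind (k) and a membership instance of (f).

```python
from fractions import Fraction

# Exact simulation of the middle-thirds construction on [0, 1]:
# sets are kept as lists of disjoint closed intervals (pairs of Fractions).

def phi(intervals):
    """Delete the open middle third of every interval in the list."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((a + 2 * third, b))
    return out

def iterate(n):
    """I_n: the n-th stage of the construction starting from I_0 = [0, 1]."""
    I = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        I = phi(I)
    return I

def total_length(intervals):
    return sum(b - a for a, b in intervals)

# Behind property (k): the total length of I_n is (2/3)^n, which tends to 0,
# i.e. the deleted lengths sum_{n>=1} 2^{n-1}/3^n add up to 1.
for n in range(6):
    assert total_length(iterate(n)) == Fraction(2, 3) ** n

# Property (f): a point sum a_k/3^k with digits a_k in {0, 2} lies in every
# stage I_n; check it for the sample digit string (0, 2, 2, 0, 2).
x = sum(Fraction(a, 3**k) for k, a in enumerate((0, 2, 2, 0, 2), start=1))
for n in range(1, 6):
    assert any(a <= x <= b for a, b in iterate(n))
```

Using Fraction rather than floats keeps every endpoint exact, so the length identity holds with equality instead of up to rounding error.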