Lecture Notes in Real Analysis 2009
Anant R. Shastri
Department of Mathematics
Indian Institute of Technology
Bombay
November 9, 2009
Lecture 1
Why real numbers?
Example 1 Gaps in the rational number system. By simply
employing the unique factorization theorem for integers, we can easily
conclude that there is no rational number r such that r² = 2. So there
are gaps in the rational number system in this sense. The gaps are
somewhat subtle. To illustrate this fact let us consider any positive
rational number p and put
q = (2p + 2)/(p + 2) = p − (p² − 2)/(p + 2).        (1)
Check that
q² − 2 = 2(p² − 2)/(p + 2)².        (2)
Since (p + 2)² > 0, the sign of q² − 2 is the same as that of p² − 2.
Now if p² < 2 then check that p < q and q² < 2. Similarly, if p² > 2
then check that q < p and 2 < q². This shows that there exists a
sequence r₁ > r₂ > r₃ > · · · of rational numbers such that rₙ² > 2, and
a sequence of rational numbers s₁ < s₂ < · · · such that sₙ² < 2. In other
words, the set of all positive rationals r such that r² > 2 has no
least element, and similarly the set of all positive rationals s such that
s² < 2 has no greatest element. The real number system fulfills
this kind of requirement, which the rational number system is unable to
fulfill.
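This computation is easy to experiment with. A minimal sketch in Python (the helper name `step` is ours, not the notes'), using exact rational arithmetic so that everything stays inside Q:

```python
from fractions import Fraction

def step(p):
    # q = p - (p^2 - 2)/(p + 2); q^2 - 2 has the same sign as p^2 - 2
    return p - (p * p - 2) / (p + 2)

# Starting above sqrt(2): a strictly decreasing sequence of rationals with r^2 > 2.
r = Fraction(2)
for _ in range(4):
    nxt = step(r)
    assert nxt < r and nxt * nxt > 2
    r = nxt

# Starting below sqrt(2): a strictly increasing sequence of rationals with s^2 < 2.
s = Fraction(1)
for _ in range(4):
    nxt = step(s)
    assert nxt > s and nxt * nxt < 2
    s = nxt
```

Both sequences close in on the "gap" at √2 from either side, yet no rational term ever satisfies r² = 2.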
Some Basic Set Theory Membership, union, intersection, power set,
De Morgan's laws, (the episode of RAMA and SITA), ordered pairs: (x, y) :=
{{x}, {x, y}}, the Cartesian product X × Y as a subset of the power set of the
power set of X ∪ Y, relations, functions, Cartesian products of arbitrary
families of sets, cardinality, finiteness and infiniteness, countability of Q.
Notation:
N = {0, 1, 2, . . .} the set of natural numbers.
Z = the set of integers.
Z+ = the set of positive integers
Q = the set of rational numbers.
R = the set of real numbers.
C = the set of complex numbers.
Lecture 2
Definition 1 Let X be a set and let R ⊂ X × X be a relation on X. We shall
write x < y whenever (x, y) ∈ R. We say R is an order (total order
or linear order) on X if the following conditions hold:
(i) Transitivity: x < y, y < z =⇒ x < z for any x, y, z ∈ X.
(ii) Law of Trichotomy: Given x, y ∈ X, exactly one of x < y, y < x, x = y holds.
We shall read x < y as ‘x is less than y’. We shall write x ≤ y
to mean either x < y or x = y. We shall also write x > y to mean
y < x, and x ≥ y to mean y ≤ x. Note that > is then another order
on the set X. However, these two orders on X are so closely related to
each other that any information about one of them can be recovered
from the corresponding information about the other.
Let now A ⊂ X. We say x ∈ X is an upper bound of A if a ≤ x
for all a ∈ A. If such an x exists we then say A is bounded above.
Likewise we define lower bounds and bounded below as well.
An element x ∈ X is called a least upper bound (abbreviated
lub, or supremum, written 'sup') of A if x is an upper bound
of A and x ≤ y for every upper bound y of A. Similarly we define
a greatest lower bound (glb or 'inf' = infimum).
Remark 1
(i) The set of integers is an ordered set with the usual order < . The
subset of positive integers is bounded below but not bounded above.
Also it has a greatest lower bound, viz. 1. Indeed the set of rational
numbers is also an ordered set with the natural order and this way we
can view Z as an ordered subset of Q. Note that Z+ is not bounded
above even in Q.
(ii) Even if a set A is bounded above, there may not be a least upper
bound, as seen in Example 1. However, if it exists then it is unique.
(iii) Let A = ∅ be the empty subset of an ordered set X. Then every
member of X is an upper bound for A. Therefore, least upper bound
for A would exist iff X has a least element.
(iv) Let A = X. Then an upper bound for A is nothing but the greatest
element of X, if it exists, and hence the lub of X is also equal to this
element.
Definition 2 An ordered set X is said to be order complete if for
every nonempty subset A of X which is bounded above there is a least
upper bound for A in X.
Definition 3 By a binary operation on a set X we mean a function
· : X ×X → X
Remark 2 It is customary to denote ·(a, b) by a · b, or by some other
conjunction symbol between the two letters a and b, or, if there is no scope
for confusion, by ab. Typical examples of binary operations are addition
and multiplication defined on the set of integers (rational numbers),
etc.
Definition 4 A field K is a set together with two binary operations
denoted by + and · satisfying a number of properties called field ax-
ioms which we shall express in three different lists:
List(A) Axioms for addition:
(A1) Associativity: x + (y + z) = (x + y) + z; x, y, z ∈ K.
(A2) Commutativity: x + y = y + x; x, y ∈ K.
(A3) The zero element: There exists 0 ∈ K such that x+0 = x; x ∈ K.
(A4) Negative: For x ∈ K there is a y ∈ K such that x + y = 0.
List (M) Axioms for multiplication:
(M1) Associativity: x(yz) = (xy)z; x, y, z ∈ K.
(M2) Commutativity: xy = yx; x, y ∈ K.
(M3) The unit element: There exists 1 ∈ K, 1 ≠ 0, such that 1x =
x; x ∈ K.
(M4) Inverse: For each x ∈ K such that x ≠ 0 there exists z ∈ K such
that xz = 1.
List (D) Distributivity: x(y + z) = xy + xz; x, y, z ∈ K.
Remark 3 Note that the zero element is unique. Therefore (M3) and
(M4) make sense. Moreover, the unit element is also unique. Further
the negative and the inverse are also unique and are denoted respectively
by −x and 1/x. Because of associativity, we can drop writing
brackets altogether. We also use the notation n to indicate the sum
1 + 1 + · · · + 1 (n times). Likewise we use the notation xⁿ to denote
xx · · · x (n times). Thus all 'polynomial' expressions of elements of K
make sense. That is to say, if p(t) = a₀ + a₁t + · · · + aₙtⁿ with aᵢ ∈ K,
then we can substitute any element x ∈ K for t and obtain a well-defined
element of K. The most important example of a field for us now
is the field of rational numbers K = Q.
Definition 5 An ordered field is a field K with an order < satisfying
the following axioms:
(O1) x < y =⇒ x + z < y + z for all x, y, z ∈ K.
(O2) x > 0, y > 0 =⇒ xy > 0 for all x, y ∈ K.
Remark 4 Once again a typical example is the field of rational num-
bers with its usual order. All familiar rules for working with inequalities
will be valid in any ordered field. For example the square of any ele-
ment in an ordered field cannot be negative. Let us list a few of such
properties which can be derived easily from the axioms:
Theorem 1 Let K be an ordered field with the order < . Then the
following properties are true for elements of K :
(a) 0 < x iff −x < 0.
(b) 0 < x, y < z =⇒ xy < xz.
(c) x ≠ 0 =⇒ 0 < x². In particular, 0 < 1.
(d) 0 < x < y =⇒ 0 < 1/y < 1/x.
Exercise 1 Let K be an ordered field. Temporarily let us denote the
identity element of K by 1_K and 1_K + · · · + 1_K (m times) by m1_K.
(a) Show that the mapping m ↦ m1_K defines an injective ring homomorphism
φ : Z → K which is order preserving, viz.,
φ(x + y) = φ(x) + φ(y); φ(xy) = φ(x)φ(y); x < y =⇒ φ(x) < φ(y).
(b) Show that φ extends to an injective field homomorphism Q → K
which is order preserving. In this way we can now say that every
ordered field contains the field of rational numbers.
We shall now state a result which asserts the existence of the real number
system. We shall not prove it; the interested reader may consult
[R].
Theorem 2 There is a unique ordered field R which contains the or-
dered field Q and is order complete.
Remark 5 Note that an ordered field K is order complete iff every nonempty
subset A of K which is bounded below has a greatest lower bound. This follows
easily by considering −A. The uniqueness of R has to be interpreted
correctly, in the sense that if there is another such R′ then there is a
bijection φ : R → R′ such that
(i) φ(r) = r, r ∈ Q;
(ii) φ(x + y) = φ(x) +′ φ(y), x, y ∈ R;
(iii) φ(xy) = φ(x) ·′ φ(y), x, y ∈ R;
(iv) x < y =⇒ φ(x) <′ φ(y), x, y ∈ R.
Lecture 3 (tutorial)
Theorem 3 Z+ is not bounded above in R.
Proof: Suppose Z+ is bounded above, say n ≤ x for all n ∈ Z+ for some
x ∈ R. Then we can take the least upper bound l ∈ R of Z+. Since l − 1
is not an upper bound, there must exist n ∈ Z+
such that l − 1 < n. This implies l < n + 1, which is absurd. ♠
Theorem 4 Archimedean Property
(A) For every x ∈ R there exists n ∈ Z+ such that x < n.
(B) x, y ∈ R, 0 < x =⇒ there exists n ∈ Z+ such that y < nx.
Proof: (A) This is just a restatement of the above theorem.
(B) Apply (A) to y/x. ♠
Theorem 5 If S is a nonempty subset of Z which is bounded above
then S has a maximum.
Proof: (Recall that a set has a maximum iff its least upper bound
exists and belongs to the set.) Let y ∈ R be the least upper bound of
S. We claim that y ∈ S. Suppose y is not in S. Now there exists m ∈ S
such that y − 1/2 < m < y. This implies 0 < y − m < 1/2. Also, since
(y + m)/2 < y, there exists n ∈ S such that (y + m)/2 < n < y. This implies that
0 < (y − m)/2 < n − m < y − m < 1/2, which is absurd since m and n are
integers. ♠
Definition 6 We can now define the ‘floor’ and ‘ceiling’ functions on
R. Given any x ∈ R consider the set Zₓ = {m ∈ Z : m ≤ x}. Clearly
Zₓ is bounded above and nonempty (Archimedean property). By
the above theorem, it has a maximum, which is of course unique. We
define this maximum to be ⌊x⌋. Likewise ⌈x⌉ is also defined.
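The definition can be sanity-checked directly. A brute-force sketch in Python (the finite search window and the helper names are ours, purely illustrative):

```python
import math

def floor_via_max(x, bound=100):
    # floor(x) = max{ m in Z : m <= x }, searching a finite window of integers
    return max(m for m in range(-bound, bound) if m <= x)

def ceil_via_min(x, bound=100):
    # ceil(x) = min{ m in Z : m >= x }
    return min(m for m in range(-bound, bound) if m >= x)

for x in (2.7, -2.7, 3.0, -0.5):
    assert floor_via_max(x) == math.floor(x)
    assert ceil_via_min(x) == math.ceil(x)
```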
Lemma 1 Let x, y ∈ R be such that y − x > 1. Then there exists
m ∈ Z such that x < m < y.
Proof: By the definition of floor, it follows that x < 1 + ⌊x⌋ ≤ 1 + x.
Therefore, since 1 < y − x,
x < 1 + ⌊x⌋ ≤ 1 + x < y,
and by taking m = 1 + ⌊x⌋, we are done. ♠
Theorem 6 Density of Q in R. Given x < y in R there exists r ∈ Q such that x < r < y.
Proof: We have to find r = m/n such that x < m/n < y which is the
same as finding integers m, n with n > 0 such that nx < m < ny. This is
possible if we can find n > 0 such that the interval (nx, ny) has length
n(y − x) > 1, by Lemma 1. Such an n exists by (B) of Theorem 4. ♠
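The proof is constructive and can be traced in code. A sketch in Python (the helper name `rational_between` is ours):

```python
import math

def rational_between(x, y):
    # By the Archimedean property pick n with n*(y - x) > 1;
    # then the interval (n*x, n*y) has length > 1 and contains an integer m.
    assert x < y
    n = math.floor(1 / (y - x)) + 1
    m = math.floor(n * x) + 1        # n*x < m <= n*x + 1 < n*y
    return m, n                      # the rational m/n lies in (x, y)

m, n = rational_between(math.sqrt(2), 1.5)
assert math.sqrt(2) < m / n < 1.5
```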
Lecture 4
The last theorem may lead us to believe that the set of real numbers
is not too much larger than the set of rational numbers.
However, the fact is quite the opposite. This was a mild shock to
the mathematical community in the early days of the invention of the real
numbers. We define an irrational number to be a real number which is
not a rational number. To begin with we shall prove:
Theorem 7 The set R \Q of irrationals is dense in R.
Proof: Given any two real numbers x < y we must find an irrational
number φ such that x < φ < y. By the earlier theorem we can first
choose rational numbers x1, y1 such that x < x1 < y1 < y and then
show that there is an irrational number φ such that x1 < φ < y1. By
clearing the denominators we can then reduce this to assuming that x, y
are integers and then by taking the difference, we can further assume
that x = 0. But then we can as well assume that y = 1. Thus it is
enough to show that there is an irrational number between 0 and 1.
If this were not true, by translation, it would follow that there are no
irrational numbers at all! ♠
Remark 6 Pay attention to this argument which occurs in somewhat
different forms in several places in mathematics. Later we shall show
that R \Q is uncountable. To begin with, at least, we can now be sure
that there is a real number x such that x2 = 2.
Theorem 8 Given any positive real number y, there is a unique posi-
tive real x such that x2 = y.
Proof: The uniqueness is easy to prove: x₁² = x₂² =⇒ (x₁ + x₂)(x₁ −
x₂) = 0, which for positive x₁, x₂ implies x₁ − x₂ = 0. Let us prove the existence.
In the case y = 1 there is nothing to prove. The case y < 1 can be
converted into the case y > 1 by taking inverses. So, we shall assume
now that y > 1. As in Example 1, for any p ∈ R+ let us define
q := φ(p) := p − (p² − y)/(p + y) = y(p + 1)/(p + y).        (3)
We then have
q² − y = (y² − y)(p² − y)/(p + y)².        (4)
Let now
S = {x ∈ R+ : x² > y}; T = {x ∈ R+ : x² < y}.
Both S, T are nonempty; S is bounded below and T is bounded above.
Therefore r = glb(S) and s = lub(T) exist. We claim that r = s and
r² = y. Suppose r² > y. Then by (3), r > φ(r) > 0. But from (4),
φ(r)² > y. Therefore φ(r) ∈ S and hence r ≤ φ(r), which is absurd.
Therefore r² ≤ y. In a symmetric manner we also obtain that s² ≥ y.
Now note that every element of T is smaller than every element of S, so
s ≤ r. If r² < y, then y ≤ s² ≤ r² < y, which is absurd. Therefore r² = y
as claimed. In a similar manner, we can also see that s² = y, and hence r = s. ♠
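The map φ of (3) can also be used to compute square roots numerically; each application shrinks |p² − y|. A hedged sketch in Python (the starting point and iteration count are our choices):

```python
def sqrt_approx(y, iters=60):
    # iterate phi(p) = y*(p + 1)/(p + y) = p - (p^2 - y)/(p + y)
    p = max(y, 1.0)            # then p^2 >= y, so the iterates decrease toward sqrt(y)
    for _ in range(iters):
        p = y * (p + 1) / (p + y)
    return p

assert abs(sqrt_approx(2.0) - 2.0 ** 0.5) < 1e-9
assert abs(sqrt_approx(9.0) - 3.0) < 1e-9
```

The fixed-point equation p = y(p + 1)/(p + y) reduces exactly to p² = y, which is why the iteration settles at the square root.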
Remark 7 We now encourage you to try to prove the following fact:
given any real number y > 0 and any positive integer n, there exists
a unique real number x such that xn = y. After trying for some time,
look into the book [R] for a proof. In any case, we shall prove this fact
a little later as an easy consequence of the intermediate value theorem.
Exercise 2 Let S be a non empty subset of R which is bounded above.
Let T be the set of all upper bounds of S. Show that lub(S) = glb(T ).
Exercise 3 Fix x > 1.
1. For positive integers p, q, m, n such that q ≠ 0, n ≠ 0 and p/q =
m/n = r, show that (x^m)^(1/n) = (x^p)^(1/q). This allows us to define
x^r = (x^m)^(1/n) unambiguously.
2. Show that for rational numbers r, s: x^(r+s) = x^r x^s; (x^r)^s = x^(rs).
3. For any real number α, let
S(α) = {x^r : r ≤ α, r ∈ Q}.
Show that if α is rational then x^α = lub(S(α)). This prompts us
to define, for any real number α,
x^α := lub S(α).
4. Prove that x^α x^β = x^(α+β) for all real numbers α, β.
Lecture 5
Sequences
Definition 7 By a sequence in a set X we mean a function s : N → X.
Remark 8 Often it is our practice to display a sequence in the form:
s₀, s₁, s₂, . . . , OR {sₙ}, n ≥ 0, OR simply {sₙ}.
Here sₙ denotes s(n) ∈ X. The set N itself, displayed as
0, 1, 2, 3, . . . ,
can then be thought of as the sequence of natural numbers.
Definition 8 Let X be an ordered set and s a sequence in X. We
say s is monotonically increasing (resp. strictly increasing) if sₙ ≤ sₙ₊₁
(resp. sₙ < sₙ₊₁) for all n ∈ N. Similarly we can define (strictly) monotonically
decreasing sequences. A sequence of either of these two types is
simply referred to as a monotone sequence. We say s is bounded above
if there exists x ∈ X such that sₙ ≤ x for all n ∈ N. Similarly one can
define bounded below sequences.
Definition 9 By a subsequence t of a sequence s : N → X we mean a
sequence which can be written in the form s ◦ α for some strictly increasing
α : N → N. We then display t in the form {tₙ} = {s_{α(n)}}.
Theorem 9 Every sequence s : N → X in an ordered set X has a monotone
subsequence.
Proof: Assume that s is a sequence in X which has no monotone
subsequence; we shall arrive at a contradiction. Try to build an increasing
subsequence: put η₁ = s₁, and at each stage take the next term sᵢ which is
bigger than the current one. Our assumption does not allow us to go on like
this forever. This means there is an index m such that sₙ < sₘ for all n > m;
call such an index a peak, and put p₁ = sₘ. Starting beyond this peak, we now
climb down: take the next term smaller than the current one, and so on. Again
our assumption forbids this from going on forever, so there is an index n₁ > m
such that sₙ₁ < sₙ for all n > n₁; call such an index a valley, and put v₁ = sₙ₁.
Starting beyond v₁ we climb up until we hit another peak p₂, then down until we
hit another valley v₂, and so on. Note that each peak pⱼ = sₙⱼ has the property
that sₙ < pⱼ for all n > nⱼ (and similarly each valley vₖ = sₙₖ satisfies
vₖ < sₙ for all n > nₖ). In particular, p₁ > p₂ > · · ·, so {pⱼ} is a
subsequence of {sₙ} which is monotonically decreasing, a contradiction! ♠
We shall from now on consider sequences of real numbers.
Definition 10 Let s : N → R be a sequence of real numbers. We say
s converges to l ∈ R if for every positive real number ε there exists
n0 = n(ε) ∈ N such that for all n ≥ n0 we have sn ∈ (l − ε, l + ε). We
then say s is a convergent sequence, call l the limit of the sequence s
and write
limₙ→∞ sₙ = l OR sₙ → l as n → ∞.
Remark 9 Note that the limit l, if it exists, is unique. For if l′ ≠ l is
another limit of s, choose ε = |l − l′|/2 > 0. Then according to the definition
of the limit applied to l and l′ we get two numbers n₀ and n′₀ such
that
sₙ ∈ (l − ε, l + ε), n ≥ n₀; sₙ ∈ (l′ − ε, l′ + ε), n ≥ n′₀.
But these two intervals are disjoint, so taking n bigger than both n₀ and n′₀
we arrive at the absurd conclusion that sₙ lies in both of them.
Theorem 10 Every bounded monotone sequence of real numbers is
convergent.
Proof: We shall show that if s is an increasing sequence which is
bounded above, then it converges to the least upper bound l of the set
{sₙ : n ∈ N}. Let ε > 0 be any real number. Then l − ε < l and
hence there exists n₀ such that l − ε < sₙ₀. But then, since s is increasing,
it follows that sₙ ∈ (l − ε, l] for all n ≥ n₀. This proves the claim. ♠
Remark 10 Observe that the process of obtaining q from p in Example
1 may be repeated indefinitely to obtain a sequence. Thus starting
with a positive rational number p = s₀ such that p² > 2 we obtain a
monotonically decreasing sequence {sₙ} of rationals, whereas starting
with p = t₀ > 0 such that p² < 2 we obtain a monotonically increasing
sequence {tₙ} of rationals. What are the limits?
Exercise 4
(i) Show that every convergent sequence is bounded.
(ii) Show that if sn → c then every subsequence of s converges to c.
This fact can be used in different ways. If you know somehow that a
sequence is convergent but have to compute the limit, you can do so by
taking any convenient subsequence. Also if you know one subsequence
of s which is not convergent then you may immediately conclude that
the sequence itself is not convergent. Or if you know two subsequences
of s converging to different limits, then also you can conclude that the
sequence s is not convergent.
(iii) Let sₙ ≤ tₙ. Show that (if the limits exist) limₙ sₙ ≤ limₙ tₙ.
(iv) Sandwich Theorem: Let sₙ ≤ rₙ ≤ tₙ. Suppose limₙ sₙ = limₙ tₙ =
l. Show that {rₙ} converges to l.
Operations on Sequences of real numbers Given two sequences
s and t of real numbers we can define the sum sequence s + t by the
formula (s + t)(n) = sn + tn. Likewise we can define a sequence αs
where α is a real number and also the sequence st. It is not difficult to
see that
(i) If s, t are convergent then s + t, st, and αs are all convergent. Moreover,
limₙ (s + t)ₙ = limₙ sₙ + limₙ tₙ; limₙ (st)ₙ = (limₙ sₙ)(limₙ tₙ); limₙ (αs)ₙ = α limₙ sₙ.
Extended Real Number System We adjoin two extra symbols ±∞
to the set of real numbers and extend the order in R as follows:
−∞ < r < ∞, for all r ∈ R.
We often denote this set by [−∞,∞]; then of course (−∞,∞) represents
the set of real numbers. The arithmetic operations can also be
extended partially as follows:
−∞ + r = −∞; ∞ + r = ∞, for all r ∈ R;
r · ∞ = ∞; r/∞ = 0; r/0 = ∞, for all r > 0.
With these conventions we can define sₙ → ∞ if for all k > 0 there
exists n₀ = n(k) such that sₙ > k for all n > n₀, and write
limₙ→∞ sₙ = ∞.
Likewise one can also define when −∞ is the limit of a sequence. However,
we shall not call such sequences convergent. Instead, we simply say
that the sequence diverges to ∞ (or to −∞).
Definition 11 A sequence of real numbers which is not convergent to
any value in the extended real number system is called an oscillating
sequence.
Example 2 A simple example of an oscillating sequence is
1, −1, 1, −1, . . . .
One can easily have an oscillating sequence which is unbounded as well,
e.g.,
1, −2, 3, −4, 5, −6, . . . .
Remark 11 The extended number system provides a certain ease of
expressing our ideas in an unhindered fashion. For instance, we can
now define the supremum or infimum of any subset of real numbers,
not necessarily bounded. Thus, if A is a set of real numbers which is
not bounded above then supA is defined to be equal to ∞ whereas if
it is bounded above then its supremum is as defined earlier. Similarly,
a set which is not bounded below has its infimum equal to −∞.
Another advantage of the extended real number system is that
even the empty set of real numbers has a lub now, viz., −∞. For every
real number is an upper bound for the elements of ∅; therefore the set of
upper bounds is unbounded below, and hence the 'smallest' one is −∞.
Likewise, every real number is a lower bound for ∅, and hence the 'largest'
one is +∞. We can therefore say that every subset of R has a lub and
a glb in [−∞,∞].
Exercise 5
(i) Let sₙ → ∞ and tₙ → ∞. Then sₙ + tₙ → ∞; sₙtₙ → ∞; and αsₙ → ∞ for
α > 0.
(ii) Let sₙ → −∞ and tₙ → −∞. Then sₙ + tₙ → −∞; sₙtₙ → ∞; and αsₙ → −∞
for α > 0.
(iii) Let s₁ = √2 and sₙ₊₁ = √(2sₙ). Show that s is increasing and
bounded by 2. Compute the limit.
(iv) Let s₁ > s₂ > 0 and sₙ₊₂ = (sₙ + sₙ₊₁)/2. Show that s₁, s₃, . . . is
decreasing and s₂, s₄, . . . is increasing. Also show that s is convergent.
Compute the limit.
(v) Let {sₙ} be a sequence of real numbers. Prove or disprove the
following statements:
(a) {sn} has a subsequence which is non oscillating.
(b) {sn} has a convergent subsequence.
(c) If {sn} is bounded then it has a convergent subsequence.
(d) If {sn} is unbounded then it has a subsequence which diverges to
±∞.
(e) s₂ₙ → t and s₂ₙ₊₁ → t implies sₙ → t.
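Part (iii) of the last exercise is easy to explore numerically before proving it. A sketch in Python:

```python
s = 2 ** 0.5                  # s_1 = sqrt(2)
terms = [s]
for _ in range(50):           # s_{n+1} = sqrt(2 * s_n)
    s = (2 * s) ** 0.5
    terms.append(s)

assert all(a <= b for a, b in zip(terms, terms[1:]))   # increasing
assert all(t <= 2.0 for t in terms)                    # bounded by 2
assert abs(terms[-1] - 2.0) < 1e-9                     # limit L solves L = sqrt(2L), i.e. L = 2
```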
Lecture 6
Limsup and Liminf The limit of a sequence {sₙ}, if it exists, tells us
the approximate value of sₙ for large n. What happens for sequences
which do not have limits? We would like to have a device which tells
us how large or how small sₙ may become for large n, where {sₙ} is an
arbitrary sequence of real numbers. This is fulfilled by the concepts of
limsup and liminf.
Definition 12 Let {sn} be a sequence of real numbers. Put
un = sup{sk : k ≥ n}.
Note that the sets involved are nonempty and hence uₙ ∈ (−∞,∞].
Also note that if A ⊂ B then sup A ≤ sup B. Therefore {uₙ} is a
decreasing sequence. This means it has a limit in [−∞,∞]. We define
lim supₙ sₙ = limₙ uₙ.
Likewise we take lₙ = glb{sₖ : k ≥ n} and define
lim infₙ sₙ = limₙ lₙ.
The following properties are immediate.
Theorem 11 Let {sn}, {tn} be any two sequences of real numbers.
(i) lim supn sn ≥ lim infn sn.
(ii) {sn} is bounded above iff lim supn sn 6= ∞.
(iii) {sn} is bounded below iff lim infn sn 6= −∞.
(iv) If limₙ sₙ exists in [−∞,∞] then
lim supₙ sₙ = limₙ sₙ = lim infₙ sₙ.
(v) If sₙ ≤ tₙ for all n then lim supₙ sₙ ≤ lim supₙ tₙ and lim infₙ sₙ ≤ lim infₙ tₙ.
(vi) If {sₙ}, {tₙ} are bounded sequences of real numbers, then
lim supₙ (sₙ + tₙ) ≤ lim supₙ sₙ + lim supₙ tₙ
and
lim infₙ (sₙ + tₙ) ≥ lim infₙ sₙ + lim infₙ tₙ.
Proof: We shall prove (iv) only and leave the rest to you as an exercise.
Let L = limₙ sₙ. Consider the case when L is finite. Then for every
ε > 0 there exists n₀ such that for n ≥ n₀ we have
L − ε < sₙ < L + ε.
It follows that, for n ≥ n₀,
L − ε < uₙ ≤ L + ε; L − ε ≤ lₙ < L + ε.
Therefore
L − ε ≤ limₙ uₙ ≤ L + ε; L − ε ≤ limₙ lₙ ≤ L + ε.
Since this is true for every ε > 0, the conclusion follows. Now consider
the case when L = ∞. This means for every M > 0 there exists n₀
such that sₙ > M for all n ≥ n₀. Therefore uₙ > M and lₙ ≥ M for n ≥ n₀,
and hence limₙ uₙ = ∞ = limₙ lₙ. The case L = −∞ is similar. ♠
Remark 12 It is not true in general even for bounded sequences that
lim supₙ(sₙ + tₙ) = lim supₙ sₙ + lim supₙ tₙ. The simplest example is sₙ =
(−1)ⁿ, tₙ = −(−1)ⁿ. Then sₙ + tₙ = 0 for all n, and hence the LHS is 0
while the RHS is 2!
Exercise 6
1. Let {sn} be a bounded sequence of real numbers. Show that the
number U = lim supn sn is characterized by the property: For
every ε > 0, there are at most finitely many values of n such that
sₙ > U + ε, and there are infinitely many values of n for which
sₙ > U − ε. Formulate a similar statement for L = lim infₙ sₙ and
prove it.
2. Use lim supn to show that every bounded sequence of real numbers
has a convergent subsequence.
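On a finite prefix of a sequence one can at least watch the tail suprema uₙ = sup{sₖ : k ≥ n} decrease. A rough sketch in Python (a finite prefix only approximates the true limsup):

```python
def tail_suprema(seq):
    # u_n = sup{ s_k : k >= n } is decreasing; its limit is limsup_n s_n.
    return [max(seq[n:]) for n in range(len(seq))]

s = [(-1) ** n + 1 / (n + 1) for n in range(50)]   # limsup = 1, liminf = -1
u = tail_suprema(s)
assert all(u[i] >= u[i + 1] for i in range(len(u) - 1))  # {u_n} is decreasing
assert abs(u[40] - 1.0) < 0.1                            # tail suprema approach 1
```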
Cauchy Sequences Consider a sequence sn of real numbers which is
convergent to a limit L. Then we know that for every ε > 0 there is n0
such that for n ≥ n0,
sn ∈ (L− ε, L + ε).
This fact can be interpreted to mean that the members of the sequence
come as close to each other as we want after a certain stage.
That is, for all n, m ≥ n₀ we have
|sₙ − sₘ| < 2ε.
This interpretation makes no reference to the limit itself and hence is
possibly useful in situations where we know neither the value of L nor
its existence. That is indeed the case.
Definition 13 A sequence sn of real numbers is called a Cauchy se-
quence, if for every ε > 0 there is n0 such that for n, m ≥ n0,
|sn − sm| < ε.
Theorem 12 A sequence of real numbers is convergent (to a finite
limit) iff it is a Cauchy sequence.
Proof: We have already seen the 'only if' part. Now assume that {sₙ}
is a Cauchy sequence. It is easily seen that {sₙ} is bounded. Also we
know that it has a subsequence {tₖ = sₙₖ} which is convergent, say to
L. It is easily seen that sₙ → L. ♠
Remark 13 For an alternative proof using limsup see the books.
Exercise 7 Some Special Sequences Establish the following:
1. For p > 1, limₙ pⁿ = ∞; limₙ p⁻ⁿ = 0.
2. For p > 0, limₙ 1/n^p = 0.
3. For p > 0, limₙ p^(1/n) = 1; also limₙ n^(1/n) = 1.
4. For p > 0 and α real, limₙ n^α/(1 + p)ⁿ = 0.
Hints: (1) and (2): Archimedean property.
(3) Put xₙ = p^(1/n) − 1 and show that 0 ≤ xₙ ≤ (p − 1)/n.
(4) For k > α, k > 0 and for n > 2k we have
(1 + p)ⁿ > [n(n − 1) · · · (n − k + 1)/k!] pᵏ > nᵏpᵏ/(2ᵏk!).
Therefore, 0 < n^α/(1 + p)ⁿ < (2ᵏk!/pᵏ) n^(α−k) → 0 by (2).
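These limits are easy to corroborate numerically (the cutoffs below are arbitrary choices of ours). A sketch in Python:

```python
# (3) the n-th root of n tends to 1
roots = [n ** (1 / n) for n in (10, 100, 1000, 10000)]
assert all(abs(r - 1.0) < 0.3 for r in roots)
assert abs(roots[-1] - 1.0) < 0.01

# (4) exponentials beat powers: n^alpha / (1 + p)^n -> 0, here alpha = 2, p = 0.1
vals = [n ** 2 / 1.1 ** n for n in (10, 100, 500)]
assert vals[-1] < 1e-9
```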
Lecture 7
The following theorem characterizes lim supₙ aₙ for any sequence {aₙ} of
real numbers.
Theorem 13 (Limsup-I) If α > lim supₙ aₙ then there exists n₀
such that aₙ < α for all n ≥ n₀.
(Limsup-II) If β < lim supₙ aₙ then there exist infinitely many nⱼ
such that aₙⱼ > β.
Theorem 14 For any sequence {aₙ} of real numbers consider the set
S = {r ∈ [−∞,∞] : there exists a subsequence aₙₖ → r}.
Then lim supₙ aₙ = sup S.
In an exactly similar way, lim infₙ has analogous properties.
Series
Remark 14 Given two numbers, we can add them to get another
number. Repeatedly carrying out this operation allows us to talk about
sums of any finitely many numbers. We would like to talk about ‘sum’
of infinitely many numbers as well. A natural way to do this is to label
the given numbers, take sums of first n of them and look at the ‘limit’
of the sequence of numbers so obtained.
Thus given a (countable) collection of numbers, the first step is to
label them to get a sequence {sₙ}. In the second step, we form another
sequence, the sequence of partial sums tₙ = ∑ₖ₌₀ⁿ sₖ. Observe that
the first sequence {sₙ} can be recovered completely from the second
one {tₙ}. The third step is to assign a limit to the second sequence,
provided the limit exists. This entire process is covered by the single
term 'series'. However, below, we shall stick to the popular definition
of a series.¹
¹For a rigorous definition of a series, see [G-L].
Definition 14 By a series of real or complex numbers we mean a
formal infinite sum:
∑ₙ sₙ := s₀ + s₁ + · · · + sₙ + · · ·
Of course, it is possible that there are only finitely many nonzero
terms here. The sequence of partial sums associated to the above series
is defined by tₙ := ∑ₖ₌₀ⁿ sₖ. We say the series ∑ₙ sₙ is convergent
if the associated sequence {tₙ} of partial sums is convergent.
In that case, if s is the limit of this sequence, then we say s is the
sum of the series and write
∑ₙ sₙ := s.
It should be noted that even if s is finite, it is not obtained via an
arithmetic operation of taking sums of members of {sₙ}, but by taking
the limit of the associated sequence {tₙ} of partial sums. Since displaying
all elements of {tₙ} allows us to recover the original sequence {sₙ}
by the formula sₙ = tₙ − tₙ₋₁ (with s₀ = t₀), results that we formulate
for sequences have their counterparts for series and vice versa, and hence
in principle we need to do this only for one of them. For example, we
can talk of a series which is the sum of two series ∑ₙ aₙ, ∑ₙ bₙ, viz.
∑ₙ(aₙ + bₙ), and if both ∑ₙ aₙ, ∑ₙ bₙ are convergent to finite sums
then the sum series ∑ₙ(aₙ + bₙ) is convergent to the sum of their sums.
Nevertheless, it is good to go through these notions. For example,
Cauchy's criterion for the convergence of the sequence {tₙ} can be
converted into
Theorem 15 A series ∑ₙ sₙ is convergent to a finite sum iff for every
ε > 0 there exists n₀ such that |∑ₖ₌ₙᵐ sₖ| < ε for all m ≥ n ≥ n₀.
As a corollary we obtain
Corollary 1 If ∑ₙ sₙ is convergent to a finite sum then sₙ → 0.
Of course the converse does not hold, as seen by the harmonic series
∑ₙ 1/n.
Once again it is immediate that if ∑ₙ zₙ and ∑ₙ wₙ are convergent
series then, for any complex number λ, the series ∑ₙ λzₙ and ∑ₙ(zₙ + wₙ)
are convergent and
∑ₙ λzₙ = λ ∑ₙ zₙ; ∑ₙ(zₙ + wₙ) = ∑ₙ zₙ + ∑ₙ wₙ.        (5)
Theorem 16 A series of positive terms ∑ₙ aₙ is convergent iff the
sequence of partial sums is bounded.
Theorem 17 Comparison Test
(a) If |aₙ| ≤ cₙ for all n ≥ n₀ for some n₀, and ∑ₙ cₙ is convergent, then
∑ₙ aₙ is convergent.
(b) If aₙ ≥ bₙ ≥ 0 for all n ≥ n₀ for some n₀, and ∑ₙ bₙ diverges,
then ∑ₙ aₙ diverges.
The geometric series is the mother of all series:
Theorem 18 Geometric Series If |x| < 1 then ∑ₙ xⁿ = 1/(1 − x).
If |x| ≥ 1, then the series diverges.
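A minimal numerical check of the statement (Python):

```python
def geometric_partial(x, n):
    # t_n = 1 + x + ... + x^n; for |x| < 1 this tends to 1/(1 - x)
    return sum(x ** k for k in range(n + 1))

assert abs(geometric_partial(0.5, 100) - 2.0) < 1e-9
assert abs(geometric_partial(-0.9, 2000) - 1 / 1.9) < 1e-9
# for |x| >= 1 the terms do not tend to 0, so by Corollary 1 the series cannot converge
```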
Theorem 19 The series ∑ₙ 1/n! is convergent and its sum is denoted by e.
We have 2 < e < 3.
Proof: For n ≥ 2, we have
2 < tₙ = 1 + 1 + 1/2! + · · · + 1/n! < 1 + (1 + 1/2 + · · · + 1/2ⁿ⁻¹) < 1 + 1/(1 − 1/2) = 3. ♠
Theorem 20 limₙ (1 + 1/n)ⁿ = e.
Proof: Put tₙ = ∑ₖ₌₀ⁿ 1/k! and rₙ = (1 + 1/n)ⁿ. By the binomial theorem,
rₙ = 1 + 1 + (1/2!)(1 − 1/n) + · · · + (1/n!)(1 − 1/n)(1 − 2/n) · · · (1 − (n − 1)/n) < tₙ.
Therefore lim supₙ rₙ ≤ e. On the other hand, for a fixed m, if n ≥ m
we have
rₙ ≥ 1 + 1 + (1/2!)(1 − 1/n) + · · · + (1/m!)(1 − 1/n) · · · (1 − (m − 1)/n).
Letting n → ∞, we get
lim infₙ rₙ ≥ tₘ.
Since this is true for every m, we get e ≤ lim infₙ rₙ. ♠
Remark 15 The rapidity with which this sequence converges is estimated
by considering:
e − tₙ = 1/(n + 1)! + 1/(n + 2)! + · · · < [1/(n + 1)!][1 + 1/(n + 1) + 1/(n + 1)² + · · ·] = 1/(n!n).
Thus
0 < e − tₙ < 1/(n!n).
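The bound 0 < e − tₙ < 1/(n!n) can be checked numerically for small n. A sketch in Python:

```python
import math

for n in (5, 8, 10):
    t_n = sum(1 / math.factorial(k) for k in range(n + 1))
    err = math.e - t_n                       # 0 < e - t_n ...
    assert 0 < err < 1 / (math.factorial(n) * n)   # ... < 1/(n! n)
```

Already at n = 10 the error is below 3 × 10⁻⁸, which is what makes the irrationality argument below so effective.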
Corollary 2 e is irrational.
Proof: Assume on the contrary that e = p/q with p, q positive integers.
Then q!e and q!t_q are both integers. On the other hand,
0 < q!e − q!t_q < 1/q ≤ 1, which is absurd. ♠
Definition 15 A series ∑ₙ aₙ is said to be absolutely convergent if
the series ∑ₙ |aₙ| is convergent to a finite limit.
Theorem 21 Suppose {aₙ} is a decreasing sequence of positive terms.
Then ∑ₙ aₙ is convergent iff ∑ₖ 2ᵏ a_{2^k} is convergent.
Theorem 22 ∑ₙ 1/n^p < ∞ iff p > 1.
Corollary 3 The harmonic series is divergent.
Theorem 23 The series ∑ₙ₌₂ 1/(n ln n)^p is convergent iff p > 1.
Theorem 24 Ratio Test: If {aₙ} is a sequence of positive terms such
that
lim supₙ aₙ₊₁/aₙ = r < 1,
then ∑ₙ aₙ is convergent. If aₙ₊₁/aₙ ≥ 1 for all n ≥ n₀ for some n₀, then
∑ₙ aₙ is divergent.
Proof: To see the first part, choose s so that r < s < 1. Then there exists
N such that aₙ₊₁/aₙ < s for all n ≥ N. This implies a_{N+k} < a_N sᵏ, k ≥ 1.
Since the geometric series ∑ₖ sᵏ is convergent, the convergence of
∑ₙ aₙ follows by the comparison test. The second part is obvious, since then
aₙ cannot converge to 0. ♠
Tutorial Session on Wed. 12th August
Exercise 8
1. Let zₙ = xₙ + iyₙ, n ≥ 1. Show that zₙ → z = x + iy iff xₙ → x
and yₙ → y.
2. Let ∑ₙ zₙ be a convergent series of complex numbers such that
ℜ(zₙ) ≥ 0 for all n. If ∑ₙ zₙ² is also convergent, show that ∑ₙ |zₙ|²
is convergent.
3. For 0 ≤ θ < 2π and for any α ∈ R, define the closed sector S(α, θ)
with span θ by
S(α, θ) = {rE(β) : r ≥ 0 & α ≤ β ≤ α + θ}.
Let ∑ₙ zₙ be a convergent series. If zₙ ∈ S(α, θ), n ≥ 1, where
θ < π, then show that ∑ₙ |zₙ| is convergent. (This is an improvement
on Exercise 2 above!)
4. Let ∑ₙ zₙ be a series of complex numbers such that each of its four
subseries, consisting of the terms lying in the same closed quadrant, is
convergent. Show that ∑ₙ |zₙ| is convergent.
5. Telescoping: Given a sequence {xₙ}, define the difference sequence
aₙ := xₙ − xₙ₊₁. Then show that the series ∑ₙ aₙ is
convergent iff the sequence {xₙ} is convergent, and in that case
∑ₙ aₙ = x₀ − limₙ→∞ xₙ.
6. Let {zₙ} be a bounded sequence and ∑ₙ wₙ an absolutely convergent
series. Show that ∑ₙ zₙwₙ is absolutely convergent.
7. Abel’s Test: For any sequence of complex numbers {an}, define
S0 = 0 and Sn =∑n
k=1 ak, n ≥ 1. Let {bn} be any sequence of
complex numbers.
(i) Prove Abels’ Identity:
n∑k=m
akbk =n−1∑k=m
Sk(bk − bk+1)− Sm−1bm + Snbn, 1 ≤ m ≤ n.
(LHS =∑
(Sk − Sk−1)bk =∑n
m Skbk −∑n−1
m−1 Skbk+1 = RHS.)
(ii) Show that∑
n anbn is convergent if the series∑
k Sk(bk−bk+1)
is convergent and limn−→∞
Snbn exits.
(iii) Abel’s Test: Let∑
n an be a convergent series and {bn} be
a bounded monotonic sequence of real numbers. Then show that∑n anbn is convergent.
8. Dirichlet’s Test: Let∑
n an be such that the partial sums are
bounded and let {bn} be a monotonic sequence tending to zero.
Then show that∑
n anbn is convergent.
23
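To see Dirichlet's test in action (a sketch of ours, not from the notes), take a_n = (−1)^{n+1}, whose partial sums are bounded, and b_n = 1/n, which decreases monotonically to 0; the resulting series is the alternating harmonic series, with sum ln 2.

```python
import math

def dirichlet_partial_sums(a, b, N):
    """Partial sums of sum_{n=1}^{N} a(n) * b(n)."""
    out, s = [], 0.0
    for n in range(1, N + 1):
        s += a(n) * b(n)
        out.append(s)
    return out

# a_n = (-1)^{n+1}: partial sums alternate between 1 and 0, hence bounded;
# b_n = 1/n decreases monotonically to 0, so Dirichlet's test applies.
sums = dirichlet_partial_sums(lambda n: (-1) ** (n + 1), lambda n: 1.0 / n, 100_000)
approx = sums[-1]          # the limit here is ln 2
```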
Lecture 8
Today we shall write down properly some of the important things
that we saw in the previous tutorial session.
1. Let ∑_n a_n = s be a convergent series of non negative real numbers. Then ∑_n a_n^2 is convergent.
(∑_{k=0}^n a_k^2 ≤ (∑_{k=0}^n a_k)^2 ≤ s^2.)

2. Suppose {z_n} is a bounded sequence of complex numbers and ∑_n w_n is absolutely convergent. Then ∑_n z_n w_n is absolutely convergent.
(∑_{k=0}^n |z_k w_k| ≤ M ∑_{k=0}^n |w_k|.)

3. Abel's Test: For any sequence of complex numbers {a_n}, define S_0 = 0 and S_n = ∑_{k=1}^n a_k, n ≥ 1. Let {b_n} be any sequence of complex numbers.
(i) Prove Abel's Identity:

∑_{k=m}^n a_k b_k = ∑_{k=m}^{n−1} S_k(b_k − b_{k+1}) − S_{m−1} b_m + S_n b_n, 1 ≤ m ≤ n.

(LHS = ∑_{k=m}^n (S_k − S_{k−1}) b_k = ∑_{k=m}^n S_k b_k − ∑_{k=m−1}^{n−1} S_k b_{k+1} = RHS.)
(ii) Show that ∑_n a_n b_n is convergent if the series ∑_k S_k(b_k − b_{k+1}) is convergent and lim_{n→∞} S_n b_n exists.
(Put m = 1.)
(iii) Abel's Test: Let ∑_n a_n be a convergent series and {b_n} be a bounded monotonic sequence of real numbers. Then show that ∑_n a_n b_n is convergent.
(∑_n (b_n − b_{n+1}) is convergent by Telescoping, and absolutely so, since {b_n} is monotonic. The series ∑_n a_n is convergent and hence {S_n} is bounded. By the previous exercise, the series ∑_n S_n(b_n − b_{n+1}) is convergent. Since both S_n and b_n are convergent, S_n b_n is convergent. Therefore, (ii) applies.)

4. Dirichlet's Test: Let ∑_n a_n be such that the partial sums are bounded and let {b_n} be a monotonic sequence tending to zero. Then show that ∑_n a_n b_n is convergent.
(The arguments are already there in the above exercise.)

5. Derive the following Leibniz test from Dirichlet's Test: If {c_n} is a monotonic sequence converging to 0, then the alternating series ∑_n (−1)^n c_n is convergent.
(Take a_n = (−1)^n and b_n = c_n in Dirichlet's test.)

6. Generalize Leibniz's test as follows: If {c_n} is a monotonic sequence converging to 0 and ζ is a complex number such that |ζ| = 1, ζ ≠ 1, then ∑_n ζ^n c_n is convergent.

7. Show that if ∑_n a_n is convergent then the following series are all convergent:
(a) ∑_n a_n/n^p, p > 0; (b) ∑_n a_n/log^p n; (c) ∑_n n^{1/n} a_n; (d) ∑_n (1 + 1/n)^n a_n.

8. Show that for any p > 0 and for every real number x, ∑_n sin(nx)/n^p is convergent.
Theorem 25 Root Test: For a sequence {a_n} of positive terms, put l = lim sup_n a_n^{1/n}. Then
(a) l < 1 ⇒ ∑_n a_n < ∞.
(b) l > 1 ⇒ ∑_n a_n = ∞.
(c) If l = 1, the series ∑_n a_n can be finite or infinite.

Proof: Choose l < r < 1 and then an integer N such that a_n^{1/n} < r for all n ≥ N. Therefore a_n < r^n and we can now compare with the geometric series. The proof of (b) is similar. (c) is demonstrated by the series ∑_n 1/n and ∑_n 1/n^2. ♠
Remark 16 As compared to the ratio test, the root test is more powerful, in the sense that wherever the ratio test is conclusive, so is the root test. Also, there are cases where the ratio test fails but the root test works. However, the ratio test is easier to apply.
Example 3 Put a_{2n+1} = 1/2^{n+1}, a_{2n} = 1/3^n. Then

lim inf_n a_{n+1}/a_n = lim_n 2^n/3^n = 0;  lim inf_n a_n^{1/n} = lim_n (1/3^n)^{1/2n} = 1/√3.

lim sup_n a_{n+1}/a_n = lim_n (3/2)^n = ∞;  lim sup_n a_n^{1/n} = lim_n (1/2^{n+1})^{1/(2n+1)} = 1/√2.

The ratio test cannot be applied. The root test gives the convergence.
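These four limits are easy to check numerically (our own sketch; the names below are ours): over a late stretch of indices, the n-th roots of a_n stay below 1 while the ratios blow up.

```python
import math

# Terms of Example 3: a_{2n} = 1/3^n, a_{2n+1} = 1/2^{n+1}.
def a(k):
    n, rem = divmod(k, 2)
    return 3.0 ** -n if rem == 0 else 2.0 ** -(n + 1)

ks = range(101, 201)
root_limsup = max(a(k) ** (1.0 / k) for k in ks)    # approaches 1/sqrt(2) < 1
ratio_limsup = max(a(k + 1) / a(k) for k in ks)     # grows like (3/2)^n: unbounded
```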
The following theorem proves the claim that we have made in the above
remark.
Theorem 26 For any sequence {a_n} of positive terms,

lim inf_n a_{n+1}/a_n ≤ lim inf_n a_n^{1/n} ≤ lim sup_n a_n^{1/n} ≤ lim sup_n a_{n+1}/a_n.
Definition 16 A series ∑_n z_n is said to be absolutely convergent if the series ∑_n |z_n| is convergent.

Again, it is easily seen that an absolutely convergent series is convergent, whereas the converse is not true, as seen with the standard example ∑_n (−1)^n/n. The notion of absolute convergence plays a very important role throughout the study of convergence of series. As an illustration we shall obtain the following useful result about the convergence of the product series.
Definition 17 Given two series ∑_n a_n, ∑_n b_n, the Cauchy product of these two series is defined to be ∑_n c_n, where c_n = ∑_{k=0}^n a_k b_{n−k}.
Theorem 27 If ∑_n a_n, ∑_n b_n are two absolutely convergent series, then their Cauchy product series is absolutely convergent and its sum is equal to the product of the sums of the two series:

∑_n c_n = (∑_n a_n)(∑_n b_n). (6)
Proof: We begin with the remark that if both the series consist of
only non negative real numbers, then the assertion of the theorem is
obvious. We shall use this in what follows.
Consider the remainders after n − 1 terms of the corresponding absolute series:

R_n = ∑_{k≥n} |a_k|;  T_n = ∑_{k≥n} |b_k|.

Clearly,

∑_{n≥0} |c_n| ≤ ∑_{k≥0} ∑_{l≥0} |a_k||b_l| = R_0 T_0.

Therefore the series ∑_n c_n is absolutely convergent. Further,

|∑_{k≤2n} c_k − (∑_{k≤n} a_k)(∑_{k≤n} b_k)| ≤ R_0 T_{n+1} + T_0 R_{n+1},

since the terms that remain on the LHS after cancellation are of the form a_k b_l where either k ≥ n + 1 or l ≥ n + 1. Upon taking the limit as n → ∞, we obtain (6). ♠
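The identity (6) is easy to test numerically (our own sketch, not part of the notes) on two absolutely convergent geometric series.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k b_{n-k} of the Cauchy product."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]

# Two absolutely convergent geometric series: sum (1/2)^n = 2 and sum (1/3)^n = 3/2.
N = 60
a = [0.5 ** n for n in range(N)]
b = [(1.0 / 3.0) ** n for n in range(N)]
c = cauchy_product(a, b)
product_sum = sum(c)        # close to 2 * (3/2) = 3, up to a tiny truncation error
```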
Remark 17 This theorem is true even if one of the two series is abso-
lutely convergent and the other is convergent. For a proof of this, see
[R].
An important property of an absolutely convergent series is:

Theorem 28 Let ∑_n z_n be an absolutely convergent series. Then every rearrangement ∑_n z_{σ(n)} of the series is also absolutely convergent, and hence convergent. Moreover, each such rearrangement converges to the same sum.
Proof: (Recall that a rearrangement ∑_n z_{σ(n)} of ∑_n z_n is obtained by taking a bijection σ : N → N.) Let ∑_n z_n = z. The only thing that needs a proof at this stage is that ∑_n z_{σ(n)} = z. Let us denote the partial sums s_n = ∑_{k=0}^n z_k, t_n = ∑_{k=0}^n z_{σ(k)}. Since ∑_n z_n is absolutely convergent, given ε > 0 there is an N such that ∑_{k=n}^m |z_k| < ε for all m ≥ n ≥ N. Pick N_1 large enough so that

{1, 2, . . . , N} ⊂ {σ(1), σ(2), . . . , σ(N_1)}.

Then for n ≥ N_1, we have |s_n − t_n| ≤ ∑_{k≥N+1} |z_k| < ε. Therefore, lim_n s_n = lim_n t_n. ♠

Riemann's Rearrangement Theorem: Let ∑_n a_n be a convergent series of real numbers which is not absolutely convergent. Given −∞ ≤ α ≤ β ≤ ∞, there exists a rearrangement ∑_n a_{τ(n)} of ∑_n a_n with partial sums t_n such that

lim inf_n t_n = α;  lim sup_n t_n = β.

We are not going to prove this. See [R] for a proof.
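The standard greedy construction behind Riemann's theorem (our own sketch, not the proof in [R]) is easy to run on the alternating harmonic series: keep taking unused positive terms while the partial sum is below the target, and unused negative terms while it is above.

```python
import math

def rearrange_to(target, n_terms=200_000):
    """Greedily rearrange the conditionally convergent series
    sum (-1)^{n+1}/n: take the next unused positive term while the partial
    sum is <= target, the next negative term while it is above."""
    pos = (1.0 / n for n in range(1, 10 ** 9, 2))     # 1, 1/3, 1/5, ...
    neg = (-1.0 / n for n in range(2, 10 ** 9, 2))    # -1/2, -1/4, ...
    s = 0.0
    for _ in range(n_terms):
        s += next(pos) if s <= target else next(neg)
    return s

s = rearrange_to(math.pi)   # partial sums can be steered toward any target
```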
Lecture 9
Definition 18 By a formal power series in one variable t over K, we mean a sum of the form

∑_{n=0}^∞ a_n t^n, a_n ∈ K.

Note that for this definition to make sense, the sequence {a_n} can be inside any set. However, we shall restrict this and assume that the sequences are taken inside a field K. Let K[[t]] denote the set of all formal power series ∑_n a_n t^n in t with coefficients a_n ∈ K. Observe that when at most a finite number of a_n are non zero, the above sum gives a polynomial. Thus, all polynomials in t are power series in t, i.e., K[t] ⊂ K[[t]].
Just like polynomials, we can add two power series 'term-by-term' and we can also multiply them by scalars, viz.,

∑_n a_n t^n + ∑_n b_n t^n := ∑_n (a_n + b_n) t^n;  α(∑_n a_n t^n) := ∑_n α a_n t^n.

Verify that the above two operations make K[[t]] into a vector space over K.

Further, we can even multiply two formal power series:

(∑_n a_n t^n)(∑_n b_n t^n) := ∑_n c_n t^n,

where c_n = ∑_{k=0}^n a_k b_{n−k}. This product is called the Cauchy product. One can directly check that K[[t]] is then a commutative ring with the multiplicative identity being the power series

1 := ∑_n a_n t^n, where a_0 = 1 and a_n = 0, n ≥ 1.

(Together with the vector space structure, K[[t]] is actually a K-algebra.) Observe that the ring of polynomials in t forms a subring of K[[t]]. What we are now interested in is to get nice functions out of power series.
Observe that, if p(t) is a polynomial over K then by the method
of substitution, it defines a function a 7→ p(a), from K to K. It
is customary to denote this map by p(t) itself. However, due to the
infinite nature of the sum involved, given a power series P and a point
a ∈ K, the substitution P (a) may not make sense in general. This is
the reason why we have to treat power series with a little more care,
via the notion of convergence.
Definition 19 A formal power series P(t) = ∑_n a_n t^n is said to be convergent at z_0 ∈ C if the sequence {s_n}, where s_n = ∑_{k=0}^n a_k z_0^k, is convergent. In that case we write P(z_0) = lim_{n→∞} s_n for this limit. Putting t_n = a_n z_0^n, this just means that the series of complex numbers ∑_n t_n is convergent.
Remark 18 Observe that every power series is convergent at 0.
Definition 20 A power series is said to be a convergent power series,
if it is convergent at some point z0 6= 0.
The following few theorems, which are attributed to Cauchy-Hadamard2
and Abel3, are most fundamental in the theory of convergent power se-
ries.
Theorem 29 Cauchy-Hadamard Formula: Let P = ∑_{n≥0} a_n t^n be a power series over C. Put L = lim sup_n |a_n|^{1/n} and R = 1/L, with the convention 1/0 = ∞, 1/∞ = 0. Then
(a) for all 0 < r < R, the series P(t) is absolutely and uniformly convergent in |z| ≤ r, and
(b) for all |z| > R the series is divergent.

2 Jacques Hadamard (1865-1963) was a French mathematician who was the most influential mathematician of his days; he worked in several areas of mathematics such as complex analysis, analytic number theory, partial differential equations, hydrodynamics and logic.
3 Niels Henrik Abel (1802-1829) was a Norwegian, who died young under deprivation. At the age of 21, he proved the impossibility of solving a general quintic by radicals. He did not get any recognition during his lifetime for his now famous works on convergence, on so-called abelian integrals, and on elliptic functions.
Proof: (a) Let 0 < r < R. Choose r < s < R. Then 1/s > 1/R = L and hence by property (Limsup-I), we must have n_0 such that for all n ≥ n_0, |a_n|^{1/n} < 1/s. Therefore, for all |z| ≤ r, |a_n z^n| < (r/s)^n, n ≥ n_0. Since r/s < 1, by the Weierstrass majorant criterion (Theorem 32), it follows that P(z) is absolutely and uniformly convergent.
(b) Suppose |z| > R. We fix s such that |z| > s > R. Then 1/s < 1/R = L, and hence by property (Limsup-II), there exist infinitely many n_j for which |a_{n_j}|^{1/n_j} > 1/s. This means that |a_{n_j} z^{n_j}| > (|z|/s)^{n_j} > 1. It follows that the n-th term of the series ∑_n a_n z^n does not converge to 0 and hence the series is divergent. ♠
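The formula R = 1/lim sup_n |a_n|^{1/n} also lends itself to a crude numerical estimate (our own sketch; the function name is ours).

```python
import math

def radius_estimate(coeff, N=400, tail=200):
    """Estimate R = 1 / limsup_n |a_n|^(1/n) (Cauchy-Hadamard) by taking the
    max of |a_n|^(1/n) over a late stretch of indices."""
    L = max(abs(coeff(n)) ** (1.0 / n) for n in range(N - tail, N))
    return math.inf if L == 0.0 else 1.0 / L

R_geom = radius_estimate(lambda n: 2.0 ** n)   # a_n = 2^n  ->  R = 1/2 exactly
R_lin = radius_estimate(lambda n: float(n))    # a_n = n    ->  R close to 1
```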
Definition 21 Given a power series ∑_n a_n t^n,

R = sup{|z| : ∑_n a_n z^n < ∞}

is called the radius of convergence of the series. The above theorem gives you the formula for R.
Remark 19 Observe that if P(t) is convergent at some z, then the radius of convergence of P is at least |z|. The second part of the theorem gives you the formula for R; this is called the Cauchy-Hadamard formula. It is implicit in this theorem that the collection of all points at which a given power series converges consists of an open disc centered at the origin and perhaps some points on the boundary of the disc. This disc is called the disc of convergence of the power series. Observe that the theorem does not say anything about the convergence of the series at points on the boundary |z| = R. The examples below will tell you that anything can happen.
Example 4 The series ∑_n t^n, ∑_n t^n/n, ∑_n t^n/n^2 all have radius of convergence 1. The first one is not convergent at any point of the boundary of the disc of convergence |z| = 1. The second is convergent at all the points of the boundary except at z = 1 (Dirichlet's test) and the last one is convergent at all the points of the boundary (compare with ∑_n 1/n^2). These examples clearly illustrate that the boundary behavior of a power series needs to be studied more carefully.
Remark 20 It is not hard to see that the sum of two convergent power
series is convergent. Indeed, the radius of convergence of the sum is
at least the minimum of the radii of convergence of the summands.
Similar statement holds for Cauchy product. Since Cauchy product of
two convergent series with non negative real coefficients is convergent,
it follows that the radius of convergence of the Cauchy product of two
series is at least the minimum of the radii of convergence of the two
series.
Example 5 Here is an example of the usefulness of the Cauchy product. Consider the geometric series g(t) = 1 + t + t^2 + · · · with radius of convergence equal to 1. We can easily compute (g(t))^2 and see that

(g(t))^2 = 1 + 2t + 3t^2 + · · · + n t^{n−1} + · · ·

which also has radius of convergence at least 1. Also, it is not convergent at t = 1. Hence the radius of convergence is exactly one. Thus, it follows that ∑_k k t^k = t g(t)^2 also has radius of convergence equal to 1. By the Cauchy-Hadamard theorem, it follows that lim sup_n n^{1/n} = 1. In turn, it follows that for all integers m, the series ∑_k k^m t^k have radius of convergence 1.
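Both observations in this example are easy to check numerically (our own sketch): the Cauchy square of the geometric series has n-th coefficient n + 1, and n^{1/n} → 1.

```python
def convolve(a, b):
    """Cauchy-product coefficients c_n = sum_k a_k b_{n-k} (truncated)."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]

ones = [1] * 10                  # truncation of g(t) = 1 + t + t^2 + ...
g2 = convolve(ones, ones)        # coefficients 1, 2, 3, ..., 10 of g(t)^2

# n^(1/n) tends to 1, consistent with radius of convergence 1 for sum k t^k.
root_at_large_n = 100_000 ** (1.0 / 100_000)
```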
Definition 22 Given a power series P(t) = ∑_{n≥0} a_n t^n, the derived series P′(t) is defined by taking term-by-term differentiation: P′(t) = ∑_{n≥1} n a_n t^{n−1}. The series ∑_{n≥0} a_n t^{n+1}/(n + 1) is called the integrated series.
As an application of Cauchy-Hadamard formula, we derive:
Theorem 30 A power series P (t), its derived series P ′(t) and any
series obtained by integrating P (t) all have the same radius of conver-
gence.
Proof: Let the radii of convergence of P(t) = ∑_n a_n t^n and P′(t) be r, r′ respectively. It is enough to prove that r = r′.

We will first show that r ≥ r′. For this we may assume without loss of generality that r′ > 0. Let 0 < r_1 < r′. Then

∑_{n≥1} |a_n| r_1^n = r_1 ∑_{n≥1} |a_n| r_1^{n−1} ≤ r_1 ∑_{n≥1} n |a_n| r_1^{n−1} < ∞.

It follows that r ≥ r_1. Since this is true for all 0 < r_1 < r′, this means r ≥ r′.

Now to show that r ≤ r′, we can assume that r > 0 and let 0 < r_1 < r. Choose r_2 such that r_1 < r_2 < r. Then for each n ≥ 1,

n r_1^{n−1} = (n/r_1)(r_1/r_2)^n r_2^n ≤ (M/r_1) r_2^n,

where M = ∑_{k≥1} k (r_1/r_2)^k < ∞, since the radius of convergence of ∑_k k t^k is at least 1 (see Example 5). Therefore,

∑_{n≥1} n |a_n| r_1^{n−1} ≤ (M/r_1) ∑_{n≥1} |a_n| r_2^n < ∞.

We conclude that r′ ≥ r_1 and since this holds for all r_1 < r, it follows that r′ ≥ r. ♠
Remark 21
(i) For any sequence {b_n} of non negative real numbers, one can directly try to establish

lim sup_n ((n + 1) b_{n+1})^{1/n} = lim sup_n b_n^{1/n},

which is equivalent to proving Theorem 30. However, the full details of such a proof are no simpler than the above proof. Moreover, in this approach, we still need to compute the sum of the derived series.
(ii) A typical error a student falls into is the following: it is not true that

(lim sup_n a_n)(lim sup_n b_n) = lim sup_n a_n b_n

for any two sequences of real numbers, as can be seen by the example (1, 0, 1, 0, . . .) and (0, 1, 0, 1, . . .). However, it is true that

(lim_n a_n)(lim sup_n b_n) = lim sup_n a_n b_n

whenever {a_n} or {b_n} is a convergent sequence. What is true in general (for sequences of non negative terms) is:

(lim sup_n a_n)(lim sup_n b_n) ≥ lim sup_n a_n b_n.

Now assume that {a_n} is convergent. Let {b_{n_k}} be a subsequence which converges to b = lim sup_n b_n. Then the subsequence a_{n_k} b_{n_k} → ab, where a = lim_n a_n. This immediately implies that lim sup_n a_n b_n ≥ ab.
(iii) A power series with radius of convergence 0 is apparently 'useless' for us, for it only defines a function at a point. It should be noted that there are other areas of mathematics where formal power series, whether they converge or not, have many applications.
(iv) A power series P(t) with a positive radius of convergence R defines a continuous function z ↦ P(z) in the disc of convergence B_R(0), by Theorem 33. Also, by shifting the origin, we can even get continuous functions defined in B_R(z_0), viz., by substituting t = z − z_0.
(v) One expects that functions which agree with a convergent power series in a small neighborhood of every point will have properties akin to those of polynomials. So, the first step towards this is to see that a power series indeed defines a C-differentiable function in the disc of convergence.
Example 6 Hemachandra Numbers: For any positive integer n, let H_n denote the number of patterns you may be able to produce on a drum in a fixed duration of n beats. For instance, in Dha-dhin-dhin, the first Dha takes two beats whereas the following two Dhin's take one beat each. Clearly H_1 = 1 and H_2 = 2. Hemachandra 4 noted that since the last syllable is either of one beat or two beats, it follows that H_n = H_{n−1} + H_{n−2} for all n ≥ 3. These numbers were known to Indian poets, musicians and percussionists as Hemachandra numbers.

Define F_0 = 0, F_1 = 1 and F_n = F_{n−1} + F_{n−2}, n ≥ 2. Note that F_n = H_{n−1}, n ≥ 2. These F_n are called Fibonacci numbers. 5 (Thus the first few Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . .)
Form the formal power series

F(t) = ∑_{n=0}^∞ F_n t^n. (7)

Multiplying the given recurrence relation by t^n and summing over n from 2 to ∞ gives

∑_{n=2}^∞ F_n t^n = t ∑_{n=2}^∞ F_{n−1} t^{n−1} + t^2 ∑_{n=2}^∞ F_{n−2} t^{n−2}, (8)

and hence

(1 − t − t^2) F(t) = t.

Write (1 − t − t^2) = (1 − αt)(1 − βt), where α = (1 + √5)/2 and β = (1 − √5)/2. Put S_w(t) = 1 + wt + w^2 t^2 + · · ·. Then (1 − wt) S_w(t) = 1 and

F(t) = S_α(t) S_β(t) t.

Comparing the coefficients of t^{n+1} on either side, we get

F_{n+1} = ∑_{j=0}^n α^j β^{n−j} = (α^{n+1} − β^{n+1})/(α − β) = (1/√5)(α^{n+1} − β^{n+1}). (9)

4 Hemachandra Suri (1089-1175) was born in Dhandhuka, Gujarat. He was a Jain monk and was an adviser to king Kumarapala. His work in the early 12th century is already based on even earlier works of Gopala.
5 Leonardo Pisano (Fibonacci) (1175-1250) was born in Pisa, Italy; his book Liber abbaci introduced the Hindu-Arabic decimal system to the western world. He discovered these numbers at least 50 years later than Hemachandra's record.
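The closed form (9) can be checked directly against the recurrence (our own sketch, not part of the notes).

```python
import math

def fib(n):
    """Fibonacci numbers via the recurrence F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Formula (9): F_n = (alpha^n - beta^n) / sqrt(5).
alpha = (1 + math.sqrt(5)) / 2
beta = (1 - math.sqrt(5)) / 2
binet = lambda n: (alpha ** n - beta ** n) / math.sqrt(5)
```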
Exercise 9 *
1. Verify that K[[t]] is a K-algebra, i.e., a K-vector space which is a
commutative ring with a multiplicative unit.
2. For a non zero element p = ∑_{n≥0} a_n t^n ∈ K[[t]], the order ω(p) of p is defined to be the least integer n for which a_n ≠ 0. By convention, we define ω(0) = +∞. (This is consistent with the convention that the infimum of an empty subset of real numbers is +∞.) Show that ω(p + q) ≥ min{ω(p), ω(q)} and ω(pq) = ω(p) + ω(q).
3. Given p ∈ K[[t]], show that p has a multiplicative inverse iff
ω(p) = 0.
4. Show that K[[t]] is an integral domain, i.e., p, q ∈ K[[t]] such that
pq = 0 implies p = 0 or q = 0.
5. A family {p_j = ∑_n a_{n,j} t^n} of elements in K[[t]] is said to be a summable family if for each n ≥ 0 the number of j for which the coefficient of t^n in p_j is not zero is finite, i.e.,

#{j : a_{n,j} ≠ 0} < ∞.

In this case, we define the sum of this family to be the element p(t) = ∑_{n≥0} a_n t^n where a_n = ∑_j a_{n,j}.

Put p = ∑_n a_n t^n, q = ∑_n b_n t^n.
(a) Verify that the Cauchy product pq is indeed the sum of the family {a_n b_m t^{m+n}}.
(b) If {p_j} is a summable family then for any series q the family {p_j q} is also summable.
(c) Assume that b_0 = 0, i.e., ω(q) ≥ 1. Then show that the family {a_n q^n : n ≥ 0} is summable.
6. The sum of the above family of series in (c) is called the series obtained by substituting t = q in p, or the composition series, and is written p ◦ q. Continue to assume that b_0 = 0. Let p ◦ q(t) = ∑_n α_n t^n.
(a) Show that for each positive integer n, there exists a (universal) polynomial U_n(A_1, . . . , A_n, B_1, B_2, . . . , B_n) with the following properties:
(i) all coefficients are positive integers;
(ii) each U_n is linear in A_1, . . . , A_n, and in B_n, with the coefficient of B_n being A_1;
(iii) each U_n is weighted homogeneous of degree n + 1, where deg A_j = 1; deg B_j = j.
Moreover, the U_n have the property

α_n = U_n(a_1, . . . , a_n, b_1, b_2, . . . , b_n). (10)

Write down explicitly U_1, U_2, U_3.
(b) Show that (p_1 + p_2) ◦ q = p_1 ◦ q + p_2 ◦ q.
(c) (p_1 p_2) ◦ q = (p_1 ◦ q)(p_2 ◦ q).
(d) If r = ∑_n c_n t^n is such that c_0 = 0, then we have p ◦ (q ◦ r) = (p ◦ q) ◦ r.
(e) Consider the element I(t) = t ∈ K[[t]]. Show that it is a two-sided identity for the composition, i.e., p ◦ I = I ◦ p = p for all p ∈ K[[t]].
7. Show that if p is a polynomial then the composition p ◦ q makes sense for all q ∈ K[[t]], i.e., even without the assumption that ω(q) ≥ 1.
8. Let ′ denote the derived series. Show that
(a) p′ = 0 iff p is a constant;
(b) (p + q)′ = p′ + q′; (pq)′ = p′q + pq′;
(c) (p^n)′ = n p^{n−1} p′, for all integers n (here, if n is negative, you have to assume that p^n makes sense, which is guaranteed if p(0) ≠ 0);
(d) if {p_j} is a summable family of power series then (∑_j p_j)′ = ∑_j p_j′;
(e) Chain rule: (p ◦ q)′ = (p′ ◦ q) q′.
9. Inverse Function Theorem for Formal Power Series: Given an element p = ∑_{n≥0} a_n t^n ∈ K[[t]], show that there is a q ∈ K[[t]] such that q(0) = 0 and p ◦ q = I iff a_0 = 0 and a_1 ≠ 0. Show that such a q is unique. Further, in this case, show that also q ◦ p = I.
10. In the above exercise, if a_0 ≠ 0, we can still do something, viz., we consider r = p − a_0, and apply the above conclusion to r to get s such that s ◦ r = I = r ◦ s, r(0) = 0. From this we conclude that

p ◦ s(t) = r ◦ s(t) + a_0 = t + a_0.
11. Let us consider two of the most important series

E(t) = 1 + t + t^2/2! + · · · + t^n/n! + · · ·

L(t) = t − t^2/2 + t^3/3 − · · · + (−1)^{n−1} t^n/n ∓ · · ·

respectively called the exponential series and the logarithmic series.
(a) Verify that E(t + s) = E(t)E(s);
(b) E′(t) = E(t);
(c) Show that there is a unique F such that F(0) = 0, E ◦ F(t) = 1 + t; F ◦ (E − 1) = Id. (See Ex. 10.)
(d) Prove that F′(t) = ∑_{n=0}^∞ (−1)^n t^n and hence F(t) = L(t). Thus E ◦ L(t) = 1 + t. Also L ◦ (E − 1) = Id. For this reason, we write Ln(1 + t) := L(t). We then have E ◦ Ln(1 + t) = E ◦ L(t) = E ◦ F(t) = 1 + t. Since this is an identity, we can express this as E ◦ Ln = Id. Similarly, Ln ◦ E = L ◦ (E − 1) = Id.
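The degree-by-degree solvability in Exercise 9 can be sketched in code (our own illustration; `compose` and `comp_inverse` are made-up names): each new coefficient b_n of the inverse is forced linearly by a_1, exactly as in the universal-polynomial identity of Exercise 6. Exact rational arithmetic keeps the check honest.

```python
from fractions import Fraction

def compose(p, q, N):
    """First N coefficients of p(q(t)); assumes q[0] == 0."""
    out = [Fraction(0)] * N
    qk = [Fraction(1)] + [Fraction(0)] * (N - 1)   # current power q^k, k = 0
    for k in range(min(len(p), N)):
        for n in range(N):
            out[n] += Fraction(p[k]) * qk[n]
        # update qk to q^{k+1} by one truncated convolution with q
        qk = [sum(qk[j] * Fraction(q[n - j]) for j in range(n + 1) if n - j < len(q))
              for n in range(N)]
    return out

def comp_inverse(p, N):
    """Coefficients of q with p(q(t)) = t (needs p[0] = 0, p[1] != 0),
    solved degree by degree."""
    q = [Fraction(0)] * N
    q[1] = 1 / Fraction(p[1])
    for n in range(2, N):
        # with q[n] still 0, the t^n coefficient of p(q) misses only p[1]*q[n]
        c = compose(p, q, n + 1)[n]
        q[n] = -c / Fraction(p[1])
    return q

# p(t) = t + t^2: the inverse solves q + q^2 = t, so q = t - t^2 + 2t^3 - 5t^4 + ...
p = [0, 1, 1]
q = comp_inverse(p, 6)
```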
Exercises on Convergent Power Series
Throughout these exercises let p(t) = ∑_{n=0}^∞ a_n t^n, q(t) = ∑_{n=0}^∞ b_n t^n be two power series.

12. Let p, q both have radius of convergence ≥ r > 0.
(a) The radii of convergence of both p + q and pq are ≥ r. Moreover, for |z| < r, we have (p + q)(z) = p(z) + q(z) and (pq)(z) = p(z)q(z).
(b) Assume further that q(0) = b_0 = 0. Then the composite series p ◦ q has positive radius of convergence.
Solution: (i) Take any z such that |z| < r. Then ∑_n |a_n z^n|, ∑_n |b_n z^n| are convergent series of positive terms. Therefore ∑_n |a_n + b_n||z|^n and ∑_n (∑_j |a_j||b_{n−j}|)|z|^n are both convergent.
(ii) Let p ◦ q(t) = ∑_n c_n t^n. There are universal polynomials

U_n(A_0, . . . , A_n; B_1, B_2, . . . , B_n)

with positive integer coefficients such that

c_n = U_n(a_0, a_1, . . . , a_n; b_1, b_2, . . . , b_n).

Therefore

|c_n| ≤ U_n(|a_0|, . . . , |a_n|; |b_1|, . . . , |b_n|).

Thus if we put P = ∑_n |a_n| t^n, Q = ∑_n |b_n| t^n, and P ◦ Q(t) = ∑_n C_n t^n, then |c_n| ≤ C_n for all n. Therefore, the radius of convergence of p ◦ q is bigger than or equal to the radius of convergence of P ◦ Q. Therefore, without loss of generality, we may as well assume that a_n, b_n are non negative real numbers.

For 0 ≤ t < r we have q(t) = ∑_{n≥0} b_n t^n < ∞. Therefore α(t) = ∑_{n≥1} b_n t^n < ∞ and defines a continuous function in |t| < r. Therefore q(t) → 0 as t → 0, and we can find s > 0 such that q(s) < r. But then

p ◦ q(s) = ∑_n c_n s^n = p(q(s)) < ∞

by the rearrangement theorem for convergent series of positive terms.
13. Let pq = 1. If the radius of convergence of p is positive then so is the radius of convergence of q.
Solution: Without loss of generality, we may assume p(0) = 1. Put s = 1 − p. Then s(0) = 0 and we have

q = 1 + s + s^2 + · · · = T ◦ s,

where T = 1 + t + t^2 + · · · is the geometric series. Now appeal to the previous exercise.
14. Given α ≠ 0, β ≠ 0, and a positive integer n, show that there is a unique formal power series p such that p(0) = α and p^n = α^n + βt. Show that p is of positive radius of convergence.
Solution: First consider the case when α = 1 = β. Take P = E ◦ ((1/n) Ln(1 + t)). Then P(0) = 1 and P^n(t) = E ◦ Ln(1 + t) = 1 + t. In the general case, take p(t) = αP(βt/α^n). Since Ln has positive radius of convergence (= 1), it follows that Ln(1 + t)/n has positive radius of convergence (= 1). Since E has radius of convergence ∞ (use the Ratio Test), it follows that P has positive radius of convergence.
15. Given α ≠ 0, show that there is a unique power series p of positive radius of convergence such that

p^2 = α^2 + βt + γt^2; p(0) = α.

Solution: We may assume that γ ≠ 0 and then that γ = 1. Factorize the RHS into linear factors and apply the previous exercise.
16. Show that there is a unique power series which satisfies

p^2 − (α^2 + βt)p + γt = 0; p(0) = 0 (11)

and it has a positive radius of convergence.
Solution: By completing the square and replacing p + λt + δ, for some constants λ, δ, by p, this problem can be reduced to the earlier one.
17. For some positive numbers α, r, M, let

P(t) = αt − ∑_{n≥2} (M/r^n) t^n.

If Q is the compositional inverse of P, show that Q is of positive radius of convergence.
Solution:

P(t) = αt − (Mt^2/r^2)(1 + t/r + t^2/r^2 + · · ·).

Therefore

(1 − t/r) P(t) = αt (1 − t/r) − Mt^2/r^2.

If Q is the compositional inverse of P then we must have

(1 − Q/r) t = αQ (1 − Q/r) − MQ^2/r^2,

which can be rewritten in the form (11).
18. Let p(t) = ∑_n a_n t^n, q(t) = ∑_n b_n t^n, a_0 = 0 = b_0, a_1 ≠ 0 and p ◦ q = Id. Suppose P(t) = A_1 t − ∑_{n≥2} A_n t^n is such that A_1 = |a_1| and |a_n| < A_n for all n ≥ 2. Let Q = ∑_n B_n t^n be the compositional inverse of P with Q(0) = 0. Then show that |b_n| ≤ B_n for all n.
Solution: Recall that b_1 = 1/a_1 and

a_1 b_n + V_n(a_2, . . . , a_n; b_1, . . . , b_{n−1}) = 0.

Here the V_n are certain (universal) polynomials with non negative integer coefficients, linear in a_2, . . . , a_n. Therefore (B_1 = 1/|a_1| and)

A_1 B_n − V_n(A_2, . . . , A_n; B_1, . . . , B_{n−1}) = 0.

Now it follows by simple induction that |b_n| ≤ B_n for all n.
19. Inverse Function Theorem for Analytic Functions: Let p ◦ q = Id, where p(0) = 0 and p′(0) ≠ 0. If p is of positive radius of convergence then so is q.
Solution: Choose r > 0 so that p is convergent at r. Choose M > 0 so that |a_n| r^n < M for all n. Choose α = |a_1|. Then P as in Exercise 17 has positive radius of convergence and hence Q has positive radius of convergence. But Q majorizes q, by the previous exercise, and hence q is also of positive radius of convergence.
Lecture 10
The fact that a power series p of positive radius of convergence defines a function inside its disc of convergence via substitution is something that we cannot ignore any longer. Let us take up the study of such functions. The sequence of partial sums of p, each being a polynomial, defines a function on the whole of the complex plane. (If all the coefficients of p are real, we can view each of the partial sums as a real valued function defined on R.) However, the limit makes sense only inside the disc of convergence. More generally, we can talk about a sequence {f_n} of functions defined on some subset A ⊂ C such that at each point z ∈ A the sequence is convergent. We then get a function f : A → C as the limit function, viz.,

f(z) = lim_n f_n(z), z ∈ A.
Remember that this means for each ε > 0 there exists n0(z) such
that n ≥ n0 implies |fn(z)−f(z)| < ε. The number n0(z) may well vary
drastically as we vary the point z ∈ A. In order that the limit function
f retains some properties of the members of the sequence {fn} it is
anticipated that there must be some control over the possible n0(z).
This leads us to the notion of uniform convergence.
Definition 23 Let {fn} be a sequence of complex valued functions on
a set A. We say that it is uniformly convergent on A to a function f
if for every ε > 0 there exists n0, such that for all n ≥ n0, we have,
|fn(x)− f(x)| < ε, for all x ∈ A.
Remark 22 Clearly, uniform convergence implies pointwise convergence. The converse is easily seen to be false, by considering the sequence f_n(x) = 1/(1 + nx^2). However, it is fairly easy to see that the converse does hold if A is a finite set. Thus the interesting case of uniform convergence occurs only when A is an infinite set. The terminology is also adopted in an obvious way for series of functions via the associated sequences of partial sums. As in the case of ordinary convergence, we have Cauchy's criterion here also.
Theorem 31 A sequence of complex valued functions {f_n} is uniformly convergent iff it is uniformly Cauchy, i.e., given ε > 0, there exists n_0 such that for all n ≥ n_0, p ≥ 0, and for all x ∈ A, we have

|f_{n+p}(x) − f_n(x)| < ε.
Example 7 The mother of all convergent series is the geometric series

1 + z + z^2 + · · ·

The sequence of partial sums is given by

1 + z + · · · + z^{n−1} = (1 − z^n)/(1 − z).

For |z| < 1, upon taking the limit we obtain

1 + z + z^2 + · · · + z^n + · · · = 1/(1 − z). (12)

In fact, if we take 0 < r < 1, then in the disc B_r(0) the series is uniformly convergent. For, given ε > 0, choose n_0 such that r^{n_0} < ε(1 − r). Then for all |z| < r and n ≥ n_0, we have

|(1 − z^n)/(1 − z) − 1/(1 − z)| = |z^n/(1 − z)| ≤ r^n/(1 − r) ≤ r^{n_0}/(1 − r) < ε.
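The point of the estimate is that the bound r^n/(1 − r) does not depend on z; a small numerical check of this (our own sketch) samples points throughout the closed disc |z| ≤ r and compares the actual tail error with the bound.

```python
import cmath
import math

r, n = 0.9, 80
bound = r ** n / (1 - r)   # the uniform estimate r^n/(1 - r) on |z| <= r

# sample z = rho * e^{i*theta} throughout the closed disc |z| <= r
samples = [rho * cmath.exp(1j * 2 * math.pi * k / 12)
           for rho in (0.3, 0.6, 0.9) for k in range(12)]
worst = max(abs(z ** n / (1 - z)) for z in samples)   # actual worst tail error
```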
There is a pattern in what we saw in the above example. This is
extremely useful in determining uniform convergence:
Theorem 32 Weierstrass 6 M-test: Let ∑_n a_n be a convergent series of positive terms. Suppose there exist M > 0 and an integer N such that |f_n(x)| < M a_n for all n ≥ N and for all x ∈ A. Then ∑_n f_n is uniformly and absolutely convergent in A.

Proof: Given ε > 0, choose n_0 > N such that a_n + a_{n+1} + · · · + a_{n+p} < ε/M for all n ≥ n_0. This is possible by Cauchy's criterion, since ∑_n a_n is convergent. Then it follows that

|f_n(x)| + · · · + |f_{n+p}(x)| ≤ M(a_n + · · · + a_{n+p}) < ε,

for all n ≥ n_0 and for all x ∈ A. Again, by Cauchy's criterion, this means that ∑_n f_n is uniformly and absolutely convergent. ♠
Remark 23 The series ∑_n a_n in the above theorem is called a 'majorant' for the series ∑_n f_n. Here is an illustration of the importance of uniform convergence.
Theorem 33 Let {f_n} be a sequence of continuous functions defined and uniformly convergent on a subset A of R or C. Then the limit function f(x) = lim_{n→∞} f_n(x) is continuous on A.

Proof: Let x ∈ A be any point. In order to prove the continuity of f at x, given ε > 0 we should find δ > 0 such that for all y ∈ A with |y − x| < δ, we have |f(y) − f(x)| < ε. So, by the uniform convergence, first we get n_0 such that |f_{n_0}(y) − f(y)| < ε/3 for all y ∈ A. Since f_{n_0} is continuous at x, we also get δ > 0 such that for all y ∈ A with |y − x| < δ, we have |f_{n_0}(y) − f_{n_0}(x)| < ε/3. Now, using the triangle inequality, we get

|f(y) − f(x)| ≤ |f(y) − f_{n_0}(y)| + |f_{n_0}(y) − f_{n_0}(x)| + |f_{n_0}(x) − f(x)| < ε,

whenever y ∈ A is such that |y − x| < δ. ♠

6 Karl Weierstrass (1815-1897), a German mathematician, is well known for his perfect rigor. He clarified any remaining ambiguities in the notions of a function, of derivatives, of minimum, etc., prevalent in his time.
Exercise 10 Put f_n(z) = z^n/(1 − z^n). Determine the domain on which the sum ∑_n f_n(z) defines a continuous function.
Theorem 35 (Abel) Let ∑_{n≥0} a_n t^n be a power series of radius of convergence R > 0. Then the function defined by
f(z) = ∑_n a_n (z − z0)^n
is complex differentiable in B_R(z0). Moreover, the derivative of f is given by the derived series
f′(z) = ∑_{n≥1} n a_n (z − z0)^{n−1}
inside |z − z0| < R.
Proof: Without loss of generality, we may assume that z0 = 0. We already know that the derived series is convergent in B_R(0) and hence defines a continuous function g on it. We have to show that this function g is the derivative of f at each point of B_R(0). So, fix a point z ∈ B_R(0). Let |z| < r < R and let 0 ≠ |h| ≤ r − |z|, so that |z + h| ≤ r. Consider the difference quotient
(f(z + h) − f(z))/h − g(z) = ∑_{n≥1} u_n(h)   (13)
where we have put u_n(h) := a_n[(z + h)^n − z^n]/h − n a_n z^{n−1}. We must show that given ε > 0, there exists δ > 0 such that for all 0 < |h| < δ, we have
|(f(z + h) − f(z))/h − g(z)| < ε.   (14)
The idea here is that the sum of first few terms can be controlled
by continuity whereas the remainder term can be controlled by the
convergence of the derived series. Using the algebraic formula
(α^n − β^n)/(α − β) = ∑_{k=0}^{n−1} α^{n−1−k} β^k
with α = z + h, β = z, we get
u_n(h) = a_n[(z + h)^{n−1} + (z + h)^{n−2} z + · · · + (z + h) z^{n−2} + z^{n−1} − n z^{n−1}].   (15)
Since |z| < r and |z + h| ≤ r, it follows that
|u_n(h)| ≤ 2n |a_n| r^{n−1}.   (16)
Since the derived series has radius of convergence R > r, it follows that
we can find n0 such that
2 ∑_{n≥n0} n |a_n| r^{n−1} < ε/2.   (17)
On the other hand, again using (15), each u_n(h) is a polynomial in h which vanishes at h = 0. Therefore so does the finite sum ∑_{0<n<n0} u_n(h). Hence by continuity, there exists δ′ > 0 such that for |h| < δ′ we have
|∑_{0<n<n0} u_n(h)| < ε/2.   (18)
Taking δ = min{δ′, r − |z|} and combining (17) and (18) yields (14). ♠

The exponential function
The exponential function plays a central role in analysis, more so in
the case of complex analysis and is going to be our first example using
the power series method. We define
exp z := e^z := ∑_{n≥0} z^n/n! = 1 + z + z²/2! + z³/3! + · · · .   (19)
By the comparison test it follows that for any real number r > 0, the series exp(r) is convergent. Therefore, the radius of convergence of (19) is ∞. Hence, from theorem 35, exp is differentiable throughout C and its derivative is given by
exp′(z) = ∑_{n≥1} (n/n!) z^{n−1} = exp(z)   (20)
for all z. It may be worth recalling some elementary facts about the
exponential function that you probably know already. Let us denote
by
e := exp(1) = 1 + 1 + 1/2! + · · · + 1/n! + · · ·
Clearly, exp(0) = 1 and 2 < e. By comparing with the geometric series ∑_n 1/2^n, it can be shown easily that e < 3. Also we have
e = lim_{n→∞} (1 + 1/n)^n.   (21)
To see this, put t_n = ∑_{k=0}^n 1/k! and s_n = (1 + 1/n)^n, and use the binomial expansion to see that
lim sup_n s_n ≤ e ≤ lim inf_n s_n.
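The two approximations to e in (21) can be compared numerically (a sketch; the cut-offs n = 20 and n = 10^6 are arbitrary choices of ours):

```python
import math

t20 = sum(1.0 / math.factorial(k) for k in range(20))   # partial sum t_n at n = 20
s   = (1.0 + 1e-6) ** 1e6                               # s_n = (1 + 1/n)^n at n = 10^6

assert abs(t20 - math.e) < 1e-12   # the factorial series converges very fast
assert abs(s - math.e) < 1e-5      # s_n converges to e much more slowly
```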
Since ∑_{k=0}^n z̄^k/k! is the complex conjugate of ∑_{k=0}^n z^k/k!, by continuity of conjugation it follows that
exp z̄ = conj(exp z),   (22)
where conj denotes complex conjugation.
Formula (20) together with the property exp (0) = 1, tells us that exp
is a solution of the initial value problem:
f ′(z) = f(z); f(0) = 1. (23)
It can be easily seen that any analytic function which is a solution
of (23) has to be equal to exp . (Ex. Prove this.)
We can verify that
exp(a + b) = exp(a) exp(b), ∀ a, b ∈ C   (24)
directly by using the product formula for power series. (Use the binomial expansion of (a + b)^n.) This can also be proved by using the uniqueness of the solution of (23), which we shall leave to you as an entertaining exercise.
Thus, we have shown that exp defines a homomorphism from the additive group C to the multiplicative group C* := C \ {0}. As a simple consequence of this rule we have exp(nz) = exp(z)^n for all integers n. In particular, exp(n) = e^n. This is the justification for the notation
e^z := exp(z).
Combining (22) and (24), we obtain
|e^{ıy}|² = e^{ıy} · conj(e^{ıy}) = e^{ıy} e^{−ıy} = e^0 = 1.
Hence,
|e^{ıy}| = 1, y ∈ R.   (25)
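Both (24) and (25) are easy to test numerically (a quick sketch; the sample points a, b, y are arbitrary choices of ours):

```python
import cmath

a, b = 0.7 - 1.2j, -0.3 + 2.1j

# (24): exp is a homomorphism from (C, +) to (C*, ·)
assert abs(cmath.exp(a + b) - cmath.exp(a) * cmath.exp(b)) < 1e-12

# (25): e^{iy} lies on the unit circle for real y
y = 1.2345
assert abs(abs(cmath.exp(1j * y)) - 1.0) < 1e-14
```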
Example 8 (Trigonometric Functions) Recall the Taylor series
sin x = x − x³/3! + x⁵/5! − + · · · ;
cos x = 1 − x²/2! + x⁴/4! − + · · · ,
valid on the whole of R, since the radii of convergence of the two series are ∞. Motivated by this, we can define the complex trigonometric functions by
sin z = z − z³/3! + z⁵/5! − + · · · ;  cos z = 1 − z²/2! + z⁴/4! − + · · · .   (26)
Check that
sin z = (e^{ız} − e^{−ız})/2ı;  cos z = (e^{ız} + e^{−ız})/2.   (27)
It turns out that these complex trigonometric functions also have
differentiability properties similar to the real case, viz., (sin z)′ = cos z; (cos z)′ =
− sin z, etc.. Also, from (27) additive properties of sin and cos can be
derived.
Other trigonometric functions are defined in terms of sin and cos as usual. For example, tan z = sin z / cos z, and its domain of definition is the set of all points in C at which cos z ≠ 0.
In what follows, we shall obtain other properties of the exponential
function by the formula
eız = cos z + ı sin z. (28)
In particular,
ex+ıy = exeıy = ex(cos y + ı sin y). (29)
It follows that e^{2πı} = 1. Indeed, we shall prove that e^z = 1 iff z = 2nπı for some integer n. Observe that e^x > 0 for all x ∈ R and that if x > 0 then e^x > 1. Hence for all x < 0 we have e^x = 1/e^{−x} < 1. It follows that e^x = 1 iff x = 0. Now let z = x + ıy and e^z = 1. This means that e^x cos y = 1 and e^x sin y = 0. Since e^x ≠ 0 for any x, we must have sin y = 0. Hence y = mπ for some integer m. Therefore e^x cos mπ = 1. Since cos mπ = ±1 and e^x > 0 for all x ∈ R, it follows that cos mπ = 1 and e^x = 1. Therefore x = 0 and m = 2n, as desired.
Finally, let us prove:
exp(C) = C*.   (30)
Write 0 ≠ w = r(cos θ + ı sin θ), r ≠ 0. Since e^x is a monotonically increasing function with e^x → 0 as x → −∞ and e^x → ∞ as x → ∞, it follows from the Intermediate Value Theorem that there exists x such that e^x = r. (Here x is nothing but ln r.) Now take y = θ, z = x + ıθ and use (29) to verify that e^z = w. This is one place where we depend heavily on the intuitive properties of the angle and the corresponding properties of the real sin and cos functions. We remark that it is possible to avoid this by defining sin and cos by the formula (27) in terms of exp and deriving all these properties rigorously from the properties of exp alone.
Remark 25 One of the most beautiful equations:
eπı + 1 = 0 (31)
which relates in a simple arithmetic way, five of the most fundamental
numbers, made Euler7 believe in the existence of God!
Example 9 Let us study the mapping properties of the tan function. Since tan z = sin z / cos z, it follows that tan is defined and complex differentiable at all points where cos z ≠ 0. Also, tan(z + nπ) = tan z. In order to determine the range of this function, we take an arbitrary w ∈ C and try to solve the equation tan z = w for z. Putting X = e^{ız} temporarily, this equation reduces to (X² − 1)/(ı(X² + 1)) = w. Hence
X² = (1 + ıw)/(1 − ıw).
This latter equation makes sense iff w ≠ −ı, and then it has, in general, two solutions. The solutions are ≠ 0 iff w ≠ ı. Once we pick such a non-zero X, we can use the surjectivity of exp : C → C \ {0} to get a z such that e^{ız} = X. It then follows that tan z = w, as required. Therefore we have proved that the range of tan is equal to C \ {±ı}. From this analysis it also follows that tan z1 = tan z2 iff z1 = z2 + nπ.
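The computation in Example 9 can be retraced numerically (a sketch; the sample point w = 2 + 3ı of C \ {±ı} is our own choice):

```python
import cmath

w  = 2 + 3j                          # any w != ±i
X2 = (1 + 1j * w) / (1 - 1j * w)     # X^2 = (1+iw)/(1-iw), as derived above
X  = cmath.sqrt(X2)                  # pick one of the two square roots
z  = cmath.log(X) / 1j               # surjectivity of exp gives e^{iz} = X

assert abs(cmath.tan(z) - w) < 1e-9  # indeed tan z = w
```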
Likewise, the hyperbolic functions are defined by
sinh z = (e^z − e^{−z})/2;  cosh z = (e^z + e^{−z})/2.   (32)
7See E.T. Bell’s book for some juicy stories
It is easy to see that these functions are C-differentiable. Moreover,
all the usual identities which hold in the real case amongst these func-
tions also hold in the complex case and can be verified directly. One
can study the mapping properties of these functions as well, which have
wide range of applications.
Remark 26 Before we proceed onto another example, we would like
to draw your attention to some special properties of the exponential
and trigonometric functions. You are familiar with the real limit
lim_{x→∞} exp(x) = ∞.
However, such a result is not true when we replace the real x by a
complex z. In fact, given any complex number w 6= 0, we have seen
that there exists z such that exp (z) = w. But then exp (z +2nπı) = w
for all n. Hence we can get z′ having arbitrarily large modulus such
that exp(z′) = w. As a consequence, it follows that lim_{z→∞} exp(z) does not exist. Using the formula for sin and cos in terms of exp, it can easily be shown that sin and cos are both surjective mappings of C onto C. In particular, remember that they are not bounded, unlike their real counterparts.
Lecture 11
Summability Given a sequence {an} of complex numbers, a method
T first associates another sequence {tn} to it and then takes the limit
of {tn}. If this limit exists and is equal to L then we say {an} is T -
summable to the T-limit L and write
T lim_n a_n = L, or lim_n a_n = L (T).
Example 10 Series summation is such a summation method, in which t_n is just the partial sum s_n = a_1 + · · · + a_n. Another method is called (C, 1) summation (Cesàro-1), in which t_n = s_n/n = (a_1 + · · · + a_n)/n. Note that if the sequence {a_n} converges to L, then it is (C, 1)-summable to L. [Proof: lim_n a_n = L is the same as saying lim_n (a_n − L) = 0. Given ε > 0 there is N0 such that |a_n − L| < ε/2 for n ≥ N0. Also, the sequence {a_n − L} is bounded, so there is M > 0 such that |a_n − L| < M for all n. Therefore
|t_n − L| = |(a_1 + · · · + a_n)/n − L| ≤ ((N0 − 1)M + (n − N0 + 1)ε/2)/n ≤ (N0 − 1)M/n + ε/2 · · · ]
Another example is a_n = (−1)^n. Of course the sequence is not convergent, but it is (C, 1)-summable to 0. The (C, 1)-limit is a good representation of the average.
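A quick numerical illustration of the (C, 1) method on a_n = (−1)^n (a sketch; the cut-off 10000 is an arbitrary choice of ours):

```python
# Averages t_n = (a_1 + ... + a_n)/n of the divergent sequence a_n = (-1)^n.
partial, averages = 0, []
for n in range(1, 10001):
    partial += (-1) ** n
    averages.append(partial / n)

# {a_n} itself oscillates between -1 and 1, but its averages tend to 0.
assert abs(averages[-1]) < 1e-3   # t_10000
assert abs(averages[-2]) < 1e-3   # t_9999
```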
Example 11 More generally, given k ≥ 1, we define a sequence {a_n} to be (C, k)-summable to L if
t_n = [ ∑_{j=1}^n C(n + k − 1 − j, n − j) a_j ] / C(n + k − 1, n − 1) → L,
where C(m, r) denotes the binomial coefficient. It is not hard to check that if {a_n} is (C, k)-summable to L, then it is (C, k + 1)-summable to L. Also, there are sequences which are (C, k + 1)-summable but not (C, k)-summable. For instance, the sequence 1, −1, 2, −2, 3, −3, . . . is not (C, 1)-summable but is (C, 2)-summable. Similarly, the sequence 1, −2, 3, −4, 5, −6, . . . is not (C, 2)-summable but is (C, 3)-summable.
Example 12 (General Weighted Averages) Even more generally, given a sequence of positive real numbers P = {p_1, p_2, . . . , p_n, . . .}, we put P_n = ∑_{j=1}^n p_j, and we say that a sequence {a_n} is P-summable to L if the sequence
t_n = ( ∑_{j=1}^n a_j p_{n−j} ) / P_n
converges to L, in which case we write P lim a_n = L. Check that each (C, k) is indeed a P-method for some sequence P. Thus each Cesàro sum can be thought of as a combinatorial (binomial) average.
Definition 25 We say a summability method T is regular if whenever lim_n a_n = L then T lim_n a_n = L.
What we have seen above is that each (C, k) is regular. On the
other hand the series method is not regular.
Theorem 36 P is regular iff for each k,
lim_n p_{n−k}/P_n = 0.   (33)
Proof: Suppose P is regular. Take a_n = 0 for n ≠ k + 1 and a_{k+1} = 1 to see (33). Conversely, suppose (33) holds and let a_n → L. WLOG we may assume that L = 0. Given ε > 0, find N0 such that |a_n| < ε for n ≥ N0. Then for each k ≤ N0 find N_k such that |p_{n−k}/P_n| < ε/N0 for n ≥ N_k. Take N = max{N0, . . . , N_{N0}}. Then for n ≥ N we have |t_n| < ε(M + 1), where M is a bound for {|a_n|}. ♠
Remark 27 In this sense, series summation is not a regular summability method, whereas all Cesàro summabilities are.
Definition 26 Given a series ∑_n a_n with partial sums {s_n}, we say that ∑_n a_n is (C, 1)-summable to S if
lim_n s_n = S (C, 1),
and then we write
∑_n a_n = S (C, 1).
A sequence {a_n} is called square summable, or is said to be of class ℓ², if ∑_n a_n² < ∞. We can add two square summable sequences to get another such; indeed, square summable sequences form a vector space. {1/n} is in ℓ², whereas {√(1/n)} is not in ℓ².
Lecture 12
Definition 27 By a metric or a distance function on a set X we mean
a function d : X ×X → R such that
(a) d(x, y) ≥ 0 for all (x, y), and d(x, y) = 0 iff x = y;
(b) d(x, y) = d(y, x);
(c) d(x, y) ≤ d(x, z) + d(z, y).
A set X together with a chosen metric on it is called a metric space.
Example 13
1. The simplest and most important examples of metric spaces are the Euclidean spaces R^n with d(x, y) = √( ∑_{i=1}^n (x_i − y_i)² ). In the case n = 1 this also takes the form d(x, y) = |x − y|. So, we also use this notation in the general case.
2. A metric on X automatically restricts to a metric on any subset of
X and thus, it makes sense to talk about subspaces of metric spaces.
For instance, if we consider R^n × {0} ⊂ R^{n+1}, then the standard metric on R^n is seen to be the restriction of that on R^{n+1}.
3. For any set X consider the function
d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y.
Verify that this is a distance function. It is called the discrete metric.
4. On Rn define
dmax(x, y) = max{|x1 − y1|, . . . , |xn − yn|}
5. On Rn define
d_1(x, y) = ∑_{i=1}^n |x_i − y_i|.
6. On the set of square summable sequences of real numbers, define
d_2(x, y) = √( ∑_i (x_i − y_i)² )
7. On the set of bounded continuous real valued functions on an interval J, define
d_s(f, g) = sup{|f(x) − g(x)| : x ∈ J}.
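The metrics in items 1, 4 and 5 are easy to transcribe and test (a sketch; the random sample points and the helper names d2, dmax, d1 are our own):

```python
import math, random

def d2(x, y):   return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))   # item 1
def dmax(x, y): return max(abs(a - b) for a, b in zip(x, y))                # item 4
def d1(x, y):   return sum(abs(a - b) for a, b in zip(x, y))                # item 5

random.seed(1)
x, y, z = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))
for d in (d2, dmax, d1):
    assert d(x, x) == 0 and d(x, y) > 0           # (a)
    assert abs(d(x, y) - d(y, x)) < 1e-15         # (b) symmetry
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-15   # (c) triangle inequality
```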
Definition 28 Let (X, d) be a metric space, x ∈ X, δ > 0. We shall
denote
Bδ(x) := {y ∈ X : d(x, y) < δ}
and call it the open ball of radius δ and center x.
Exercise 11 Draw a picture of the unit ball in R^n in each of the various metrics that we have seen above.
Definition 29 Let (X, d) be a metric space.
1. By an open subset in X we mean a subset U ⊂ X which is the union
of some open balls in X.
2. A set Y is called a neighborhood (nbd) of x ∈ X if there is an open set U in X such that x ∈ U ⊂ Y.
3. A subset F in X is closed in X if X \ F is open in X.
4. A point x ∈ X is called a limit point of A ⊂ X if every nbd Y of x contains a point of A not equal to x.
5. If x ∈ A is not a limit point of A, then it is called an isolated point of A.
6. The set of all points a of a given set Y such that Y is a nbd of a is called the interior of Y.
7. A ⊂ X is called bounded if there exist M > 0 and p ∈ X such that A ⊂ B_M(p).
8. A ⊂ X is called dense in X if every point of X \ A is a limit point of A.
Theorem 37 Let {Uj} be a family of open sets in X. Then the union
U = ∪jUj is open. Also intersection of any two open sets is open.
Remark 28 The empty set and the whole set X are open.
Theorem 38 A set is closed iff it contains all its limit points.
Definition 30 The closure A of a set A is defined to be the union of
A with all its limit points.
Theorem 39 Let Y be a subspace of X. A subset A ⊂ Y is open in Y iff there exists an open set U in X such that A = U ∩ Y.
Definition 31 By a cover of A ⊂ X we mean a family {Uj} of sets
in X such that A ⊂ ∪jUj. It is called an open cover if every member
Uj is open. By a subcover we mean a cover {Vi} whose members are members of the cover {Uj}. A subset K of X is compact if every open cover of K
admits a finite subcover.
Theorem 40 Let Y be a subspace of X. Then K ⊂ Y is compact as a subset of Y iff K is compact as a subset of X.
Proof: Let K be compact in X and let {U_j} be any cover of K by open subsets of Y. Then there exist open sets V_j in X such that U_j = V_j ∩ Y. But then {V_j} is an open cover of K in X. Therefore there are finitely many, say V_{j_1}, . . . , V_{j_k}, such that K ⊂ ∪_{i=1}^k V_{j_i}. But then K ⊂ ∪_{i=1}^k U_{j_i}.
We leave the proof of the converse to you. ♠
Theorem 41 Every closed subset of a compact set is compact.
Proof: Easy.
Theorem 42 Every compact subset of a metric space is closed and
bounded.
Proof: Let K be a compact subset of (X, d). We shall prove that X \ K is open. Fix a point p ∈ X \ K. For each x ∈ K consider δ_x = (1/2) d(p, x). Then {B_{δ_x}(x)}_{x∈K} forms an open cover for K. Since K is compact, there exist x_1, . . . , x_k such that K ⊂ ∪_{i=1}^k B_{δ_{x_i}}(x_i). It follows easily that V = ∩_{i=1}^k B_{δ_{x_i}}(p) is an open set containing p with V ⊂ X \ K.
To show that K is bounded, fix any point x ∈ X and consider the family {B_δ(x) : δ > 0} of open sets, which actually cover the whole of X and hence K. A finite subcover then gives a single δ such that K ⊂ B_δ(x). ♠
Corollary 4 If F is closed and K is compact then F ∩K is compact.
Theorem 43 Let {K_j} be a collection of compact subsets of a metric space X such that the intersection of any finitely many members is non-empty. Then ∩_j K_j ≠ ∅.
Proof: Put U_j = X \ K_j. Then we know that each U_j is open. Now if ∩_j K_j = ∅, then it follows that X = ∪_j U_j. In particular, {U_j} is an open cover for K_1, which is compact. Therefore, there are finitely many j_1, . . . , j_k such that
K_1 ⊂ U_{j_1} ∪ · · · ∪ U_{j_k}.
This means K_1 ∩ K_{j_1} ∩ · · · ∩ K_{j_k} = ∅, a contradiction. ♠
Corollary 5 If {Kn} is a sequence of non empty compact sets in a
metric space, then ∩nKn 6= ∅.
Theorem 44 If A is an infinite subset of a compact subset K of a metric space, then A has a limit point in K.
Proof: If not, then every point x ∈ K has a nbd U_x such that U_x ∩ A ⊂ {x}. If {U_{x_1}, . . . , U_{x_k}} is a finite subcover of K, this will imply A ⊂ ∪_i U_{x_i}. Therefore A = ∪_i (U_{x_i} ∩ A) ⊂ {x_1, . . . , x_k}, which contradicts the infiniteness of A. ♠
We shall now examine the compactness property inside R^n.
Lemma 2 Let In = [an, bn] be a decreasing nested sequence of nonempty
closed intervals, i.e.,
I1 ⊃ · · · ⊃ In ⊃ In+1 ⊃ · · ·
then ∩nIn 6= ∅.
Proof: Put x = sup_n a_n. Claim: x ∈ I_n for all n. (Indeed, a_m ≤ b_n for all m and n, so a_n ≤ x ≤ b_n.) ♠
Lemma 3 If In is a decreasing sequence of closed cells in Rk, then
∩nIn 6= ∅.
Lecture 13
Theorem 45 Every closed cell in Rk is compact.
Proof: Use iterated bisection technique. ♠
Theorem 46 (Heine-Borel) A subset K of Rk is compact iff it is closed
and bounded.
Proof: We have to prove that if K is a closed and bounded subset of R^k, then it is compact. Since it is bounded, it is contained in a closed cell. Since it is then a closed subset of a compact set, it is compact. ♠
Theorem 47 A subset K of Rk is compact iff every infinite subset of
K has a limit point in K.
Proof: Again, we only have to prove the ‘if’ part. We shall prove that K is closed and bounded.
If K is not bounded, then for each n we can find x_n ∈ K such that |x_n| > n. The subset E = {x_n} has no limit points in R^k and hence none whatsoever in K. This is a contradiction.
Now suppose K is not closed. This means there is a limit point x of K which is not in K. We now construct an infinite sequence {x_n} in K which converges to x and hence has no limit point inside K. Having found x_n, put δ_n = |x − x_n|/2 and consider the open ball B_{δ_n}(x), which must have a point of K not equal to x; call this point x_{n+1}. ♠
Theorem 48 (Weierstrass) Every bounded infinite subset of Rk has a
limit point in Rk.
Proof: The closure of this set is compact. ♠
Theorem 49 (Bolzano-Weierstrass) Let A be a bounded subset of Rk.
Then every infinite sequence in A has a subsequence which is conver-
gent.
Proof: Look at the image set and consider the two cases according to
whether it is finite or infinite.
Theorem 50 Let f : X → Y be a function: TFAE:
(1) f is continuous.
(2) f−1(U) is open in X for every open set U in Y.
(3) f−1(F ) is closed in X for every closed set F in Y.
Theorem 51 Let f : X → Y be a continuous function of metric
spaces. If K is a compact subset of X, then f(K) is a compact subset
of Y.
Theorem 52 Every continuous real valued function on a compact set
attains its minimum and maximum.
Proof: The image is closed and bounded and hence has maximum and
minimum.
Exercise 12
(1) Let F be a closed subset of a metric space. Consider f(x) =
d(x, F ) = inf{d(x, y) : y ∈ F}. Show that f is continuous.
(2) Let f : X → Y be any function, x0 ∈ X. Prove that the following are equivalent:
(a) f is continuous at x0.
(b) For every sequence {xn} in X which converges to x0 the sequence
{f(xn)} converges to f(x0).
(3) Let f, g : X → R be any two continuous functions. Define Max{f, g} and Min{f, g} by the formulae:
Max{f, g}(x) = max{f(x), g(x)};  Min{f, g}(x) = min{f(x), g(x)}.
Show that Max{f, g} and Min{f, g} are both continuous.
Theorem 53 (Lebesgue Covering Lemma) Let {U_j} be an open covering of a compact metric space K. Then there exists a number δ > 0 such that any ball of radius δ with center in K is contained in some member of {U_j}.
Proof: By compactness of K, we may assume that the cover is finite, say U_1, . . . , U_n. Put F_j = K \ U_j, so that each F_j is a closed set. Now consider the function f_j : K → R given by f_j(x) = d(x, F_j). Check that it is continuous. Next put f = max{f_1, f_2, . . . , f_n}. Show that f is also continuous. Check that f(x) > 0 for x ∈ K. Now let δ = inf{f(x) : x ∈ K}. Then by the previous theorem δ is actually the minimum and hence is positive. Now let x ∈ K, and consider B_δ(x). If it is not contained in any of U_1, . . . , U_n, that would mean that the ball contains points from each of the F_j, which means that the distance of x from each F_j is strictly less than δ. That means that the maximum of these distances, viz. f(x), is less than δ, which is absurd. ♠
Definition 32 Let f : X → Y be a function from one metric space to
another metric space. We say f is uniformly continuous, if for every ε >
0 there exists a δ > 0 such that dX(x1, x2) < δ =⇒ dY (f(x1), f(x2)) <
ε.
Theorem 54 (Uniform Continuity) Every continuous real valued function on a compact space is uniformly continuous.
Proof: Given ε > 0, by continuity, for each x ∈ K there exists δ_x > 0 such that d_Y(f(x), f(y)) < ε/2 for all y ∈ B_{δ_x}(x). Since K is compact, by the Lebesgue Covering Lemma there exists δ > 0 such that any ball of radius δ is contained in some member of {B_{δ_x}(x)}. Now let a, b ∈ K be such that d(a, b) < δ. Choose x ∈ K such that a, b ∈ B_{δ_x}(x). Then it follows that d_Y(f(a), f(x)) < ε/2 and d_Y(f(b), f(x)) < ε/2, and therefore d_Y(f(a), f(b)) < ε. ♠
Example 14 f : [0, ∞) → [0, ∞) defined by f(x) = x² is not uniformly continuous.
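Numerically (a sketch; the fixed displacement h and the sample points are our own choices): for f(x) = x², the increment f(x + h) − f(x) = 2xh + h² grows without bound in x for any fixed h, so no single δ can serve a given ε:

```python
f = lambda x: x * x
h = 1e-3                                              # one fixed displacement
gaps = [f(x + h) - f(x) for x in (1.0, 10.0, 100.0, 1000.0)]

assert all(b > a for a, b in zip(gaps, gaps[1:]))     # the increments keep growing
assert gaps[-1] > 1.0                                 # and eventually exceed ε = 1
```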
Connectedness
Definition 33 Let X be a metric space. We say X is connected if the
only subsets A ⊂ X which are both open and closed in X are X and
∅.
Theorem 55 Let X be a metric space. Then the following are equiv-
alent:
(a) X is connected.
(b) if A ∪ B = X, both A and B are open, and A ≠ ∅ ≠ B, then A ∩ B ≠ ∅.
(c) if A ∪ B = X, both A and B are closed, and A ≠ ∅ ≠ B, then A ∩ B ≠ ∅.
(d) if ∅ ≠ A ⊂ X is both open and closed, then A = X.
Theorem 56 A subset of R is connected iff it is an interval.
Proof: Suppose A ⊂ R is not an interval. This means there exist x < z < y such that x, y ∈ A but z ∉ A. Put F = A ∩ (−∞, z), G = A ∩ (z, ∞). Then both F and G are open in A, nonempty, and their union is A, so A is not connected.
Conversely, let A be an interval in R with A = F ∪ G, x ∈ F, y ∈ G, and x < y. Assume that both F and G are closed in A; we shall show that F ∩ G ≠ ∅. Put w = sup(F ∩ [x, y]). Then w ∈ A, and since F is closed, w ∈ F. Clearly w ≤ y. Now for any z with w < z ≤ y, z ∉ F and hence z ∈ G. This means w is a limit point of G (and if w = y, then w ∈ G directly, since y ∈ G). Since G is closed, w ∈ G. ♠
Theorem 57 Let f : X → Y be a continuous function, A ⊂ X is
connected. Then f(A) is connected.
Theorem 58 (Intermediate Value Property) Let f : [a, b] → R be a
continuous function. Let f(a) < z < f(b). Then there exists a < c < b
such that f(c) = z.
Remark 29 IVP is equivalent to intervals being connected.
Assignment: Show that if a totally ordered set is connected, then it has the lub property, and similarly the glb property.
Example 15
(i) Every path is connected.
(ii) Every path connected space is connected. But the converse is not true.
(iii) R^n is connected.
(iv) Every cell in R^n is connected.
(v) The complement of a countable set in R^n, n ≥ 2, is connected.
(vi) The complement of a vector subspace of codimension ≥ 2 in R^n is connected.
(vii) Every convex subset is connected.
(viii) Spheres, ellipsoids etc. are connected, but not necessarily hyperboloids.
Lecture 14
Fundamental Theorem of Algebra
As promised before, we shall give an elementary proof of Funda-
mental Theorem of Algebra (FTA) in this section.
Theorem 59 Every non constant polynomial in one variable with co-
efficients in C has at least one root in C.
The proof uses only elementary Real Analysis which you have learnt
so far. All proofs of FTA use Intermediate Value Theorem (IVP) im-
plicitly or explicitly. We shall use it here explicitly. Apart from that,
the only important result that we use is Weierstrass’s theorem.
We begin with:
Lemma 4 For every polynomial function p : C → C, the function
|p| : C → R attains its infimum.
Proof: Given a polynomial p, we have to show that there exists z0 ∈ C such that |p(z0)| ≤ |p(z)| for all z ∈ C.
We know that p(z) → ∞ as z → ∞. (Exercise.) This means that there exists R > 0 such that |p(z)| > |p(0)| for all |z| > R.
It follows that
Inf {|p(z)| : z ∈ C} = Inf {|p(z)| : |z| ≤ R} ≤ |p(0)|.
But the disc {z : |z| ≤ R} is closed and bounded. Since the function
z 7→ |p(z)| is continuous, it attains its infimum on this disc. This
completes the proof of the lemma. ♠
Slowly but surely, an idea of the proof of FTA now emerges: observe that FTA is true iff the point z0 obtained in the above lemma, at which |p| attains its infimum, is a zero of p, i.e., p(z0) = 0. Therefore, in order to complete a proof of FTA, it is enough to assume that p(z0) ≠ 0 and arrive at a contradiction. (This idea is essentially due to Argand.)
Consider the polynomial q(z) = p(z + z0). Both polynomials p, q have the same value set, and hence the minimum of |q(z)| is equal to the minimum of |p(z)|, which is |p(z0)| = |q(0)|. We shall assume that q(0) ≠ 0 and arrive at a contradiction.
Write q(z) = q(0)φ(z), where
φ(z) = 1 + wz^k + z^{k+1} f(z)
with w ≠ 0 some complex number, k ≥ 1, and f(z) some polynomial. Observe that |q(0)| is the minimum of |q(z)| iff 1 is the minimum of |φ(z)|. It is thus enough to prove:
Lemma 5 (Argand’s Inequality) For any polynomial f, positive integer k, and any w ∈ C \ {0},
min{|1 + wz^k + z^{k+1} f(z)| : z ∈ C} < 1.   (34)
Choose r > 0 such that r^k = |w| (IVP) (see Exercise 1.5.13). Now replace z by z/r in (34). Thus, we may assume |w| = 1 in (34).
At this stage, Argand’s proof uses de Moivre’s theorem, viz., for every complex number α and every positive integer k, the equation z^k = α has a solution. For its simplicity, we present this proof of lemma 5 first:
Choose λ such that λ^k = −w^{−1}. Replace z by λz in (34) to reduce it to proving
min{|1 − z^k + z^{k+1} g(z)| : z ∈ C} < 1.   (35)
Now restrict z to positive real numbers, z = t > 0. Since g(t) is a polynomial, tg(t) → 0 as t → 0. So there exists 0 < t < 1 for which |tg(t)| < 1/2. But then
|1 − t^k + t^{k+1} g(t)| ≤ |1 − t^k| + t^k/2 = 1 − t^k + t^k/2 < 1,
thereby completing the proof of (34).
Why do we want to avoid using de Moivre’s Theorem? The answer
is that it depends heavily upon the intuitive concept of the angle which
needs to be established rigorously. (It should also be noted that during
Argand’s time, one could not expect a rigorous proof of lemma 4, which
Argand simply assumed.8)
Instead, we now follow an idea of Littlewood which is coded in the
following two lemmas:
Lemma 6 Given any complex number w of modulus 1, one of the four
numbers ±w,±ıw has its real part less than −1/2.
Proof: [This is seen easily as illustrated in Fig. 1. The four shaded regions, which cover the whole of the boundary circle, are obtained by rotating the region ℜ(z) < −1/2. However, it is important to note that the following proof is completely independent of the picture.] Since |w| = 1, either |ℜ(w)| or |ℑ(w)| has to be bigger than 1/2. In the former case, one of ±w will have the required property. In the latter case, one of ±ıw will do. ♠
Fig. 1
⁸For more learned comments, see R. Remmert’s article on ‘Fundamental Theorem of Algebra’ in [Ebb].
Lemma 7 For any integer n ≥ 1, the four equations
z^n = ±1; z^n = ±ı   (36)
all have solutions in C.
Proof: Write n = 2^k m, where m = 4l + 1 or 4l + 3. Since we can take successive square-roots, for k ≥ 0 let α_k, β_k, γ_k be such that
α_k^{2^k} = −1, β_k^{2^k} = ı, γ_k^{2^k} = −ı.
(For k = 0, this just means α_0 = −1, β_0 = ı, γ_0 = −ı.)
Now let us take the four equations one by one:
(a) For zn = 1, we can always take z = 1.
(b) For the equation z^n = −1, take z = α_k. Then (α_k)^n = (α_k^{2^k})^m = (−1)^m = −1.
(c) For the equation z^n = ı: take z = β_k if m = 4l + 1. Then (β_k)^n = ı^m = ı. If m = 4l + 3, then take z = γ_k, so that (γ_k)^n = (−ı)^m = (−ı)^3 = ı.
(d) This case follows easily from (b) and (c). Choose z1, z2 such that z1^n = −1 and z2^n = ı. Then (z1 z2)^n = −ı. ♠
[At this stage, the proof given in literature first establishes de Moivre’s
theorem and then follows the arguments given above. Here, we shall
directly derive Argand’s inequality.]
Returning to the proof of lemma 5, choose τ = ±1 or ±ı so that ℜ(τw) < −1/2 (Lemma 6). Choose α ∈ C such that α^k = τ (Lemma 7).
Now replace z by αz, so that we may assume that w = a + ıb, where a < −1/2 and a² + b² = 1.
Since f is continuous, it follows that tf(t) → 0 as t → 0. Restricting to just the real values of t, we can choose 0 < δ < 1 such that |tf(t)| < 1/3 for all 0 < t < δ. For such a choice of t, we have
|1 + wt^k + t^{k+1} f(t)| ≤ |1 + wt^k| + t^k/3 = [(1 + at^k)² + b² t^{2k}]^{1/2} + t^k/3.
We want to choose 0 < t < δ such that this quantity is less than 1. For a² + b² = 1 and 0 < t < 1 we have
[(1 + at^k)² + b² t^{2k}]^{1/2} + t^k/3 < 1
iff [(1 + at^k)² + b² t^{2k}]^{1/2} < 1 − t^k/3
iff (1 + at^k)² + b² t^{2k} < (1 − t^k/3)² = 1 − 2t^k/3 + t^{2k}/9
iff 1 + 2at^k + t^{2k} < 1 − 2t^k/3 + t^{2k}/9
iff (8/9) t^k < −(2a + 2/3).
This last condition can be fulfilled by choosing t such that t^k < 3/8, for then
(8/9) t^k < 1/3 < −(2a + 2/3).
Thus, for any 0 < t < δ such that t^k < 3/8 (IVP again), we have
|1 + wt^k + t^{k+1} f(t)| < 1.
This completes the proof of lemma 5 and thereby that of FTA. ♠
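The strategy of the whole proof, that the minimum of |p| is attained and equals 0, can be watched numerically (a sketch; the sample cubic p, the grid, and the Newton refinement are our own choices, not part of the argument):

```python
# Locate the minimizer of |p| for a sample cubic: a coarse grid search
# followed by Newton's iteration drives |p| to 0, i.e. the minimizer is a root.
p  = lambda z: z ** 3 + z + (1 + 1j)
dp = lambda z: 3 * z ** 2 + 1

grid = [complex(a, b) / 10 for a in range(-20, 21) for b in range(-20, 21)]
z = min(grid, key=lambda w: abs(p(w)))   # rough minimizer of |p| on [-2,2]^2

for _ in range(50):                      # refine: z <- z - p(z)/p'(z)
    z = z - p(z) / dp(z)

assert abs(p(z)) < 1e-9                  # min |p| = 0: z is a root, as FTA predicts
```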
Lecture 15
We have seen that a sequence of continuous functions which is uni-
formly convergent produces a limit function which is also continuous.
We shall strengthen this result now.
Theorem 60 Let f_n : X → R (or C) be a sequence of continuous functions, and let A ⊂ X be a subset on which {f_n} converges uniformly. Then {f_n} converges on the closure Ā of A to a function f which is continuous.
Proof: Let us fix a point x0 ∈ Ā. We must first of all show that the sequence {f_n(x0)} is convergent; it is enough to show that it is Cauchy. Given ε > 0, there exists n0 such that n, m > n0 implies
|f_n(x) − f_m(x)| < ε/3
for all x ∈ A. By continuity of f_n and f_m we can find δ > 0 such that d(x, x0) < δ implies
|f_m(x) − f_m(x0)| + |f_n(x) − f_n(x0)| < 2ε/3.
Now since x0 ∈ Ā, there exists x ∈ B_δ(x0) ∩ A. With the help of this x, we have
|f_n(x0) − f_m(x0)| ≤ |f_m(x) − f_m(x0)| + |f_n(x) − f_n(x0)| + |f_n(x) − f_m(x)| < ε.
Therefore, we have got a function f : Ā → R which is the limit of {f_n}, and the convergence is uniform on Ā.
We now want to show that f is continuous at x0. We have
|f(x) − f(x0)| ≤ |f(x) − f_n(x)| + |f_n(x) − f_n(x0)| + |f_n(x0) − f(x0)|.
Given ε > 0, we can choose N1 such that n > N1 implies
|f(x) − f_n(x)| + |f_n(x0) − f(x0)| < 2ε/3, for all x ∈ Ā.
Fix one such n. Then by continuity of f_n we can find δ > 0 such that d(x, x0) < δ implies |f_n(x) − f_n(x0)| < ε/3. Combining the three estimates gives |f(x) − f(x0)| < ε for all x ∈ Ā with d(x, x0) < δ, which proves the continuity of f at x0. ♠
Remark 30 What about differentiability under uniform convergence? We should be careful here, as illustrated by the example f_n(x) = x/(1 + nx²) on [0, 1]. This sequence converges uniformly to the function which is identically 0. However, the derived sequence f′_n(x) = (1 − nx²)/(1 + nx²)² converges to a function which is not even continuous. It is also true that a uniform limit of a sequence of smooth functions can be continuous but not differentiable, or differentiable but not continuously differentiable, and so on.
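The example in the remark can be checked numerically (a sketch; the sampling grid and the cut-off n = 10^6 are arbitrary choices of ours):

```python
import math

f  = lambda n, x: x / (1 + n * x * x)
df = lambda n, x: (1 - n * x * x) / (1 + n * x * x) ** 2

n = 10 ** 6
sup = max(abs(f(n, k / 1000)) for k in range(1001))   # sup of |f_n| on [0,1], sampled
assert sup <= 1 / (2 * math.sqrt(n)) + 1e-12          # uniform convergence to 0

assert df(n, 0.0) == 1.0         # f_n'(0) = 1 for every n ...
assert abs(df(n, 0.5)) < 1e-5    # ... while f_n'(x) -> 0 for x != 0: a jump at 0
```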
On the positive side, we shall now see that by controlling the limiting process of the derived sequence itself, we get better results:
Theorem 61 Let fn : [a, b] → R be a sequence of differentiable func-
tions such that f ′n converges uniformly in [a, b] to a function g. Also
suppose for some x0 ∈ [a, b], the sequence {fn(x0)} is convergent. Then
the sequence fn converges uniformly to a function f and f ′ = g =
limn→∞ f ′n.
Proof: First we want to show that fn is uniformly convergent and for
this it is enough to show that it is uniformly Cauchy, i.e., given ε > 0
we must find n0 such that n, m > n0 implies
|fn(x)− fm(x)| < ε, x ∈ [a, b] (37)
Using the hypothesis we get n1 such that n, m > n1 implies

|f′n(x) − f′m(x)| < ε/(2(b − a)), x ∈ [a, b]. (38)
Put φmn = fn − fm. Therefore by the Mean Value Theorem applied to
φmn, we have

|(φmn(x1) − φmn(x2))/(x1 − x2)| < ε/(2(b − a)), x1, x2 ∈ [a, b], m, n > n1. (39)

This is the same as

|fn(x1) − fm(x1) − fn(x2) + fm(x2)| < |x1 − x2| ε/(2(b − a)) ≤ ε/2. (40)
We now use the fact that fn(x0) is convergent and hence find n2
such that n, m > n2 implies
|fn(x0)− fm(x0)| < ε/2. (41)
Combining the above two inequalities we conclude that fn is uniformly
Cauchy,
|fn(x)− fm(x)| < ε, m, n > max{n1, n2} (42)
as required. Let now f(x) = limn→∞ fn(x). To show that f′ = g:
Fix x2 ∈ [a, b] and put hn(x1) = (fn(x1) − fn(x2))/(x1 − x2). Then (39) implies
that hn is uniformly Cauchy in [a, b] \ {x2} and hence converges to a
continuous function h(x1), which is nothing but

limn→∞ (fn(x1) − fn(x2))/(x1 − x2) = (f(x1) − f(x2))/(x1 − x2).
Therefore the limit function is continuous on the closure of [a, b] \ {x2},
which is [a, b]. We can now interchange taking the limit with respect
to n with the limit with respect to x, i.e.,

g(x1) = limn→∞ f′n(x1) = limn→∞ limx2→x1 (fn(x2) − fn(x1))/(x2 − x1)
      = limx2→x1 limn→∞ (fn(x2) − fn(x1))/(x2 − x1)
      = limx2→x1 (f(x2) − f(x1))/(x2 − x1) = f′(x1).
♠
Lecture 16 : Riemann-Stieltjes Integration
Throughout this section α will denote a monotonically increasing
function on an interval [a, b].
Let f be a bounded function on [a, b].
Let P = {a = a0 < a1 < · · · < an = b} be a partition of [a, b]. Put

∆αi = α(ai) − α(ai−1);
Mi = sup{f(x) : ai−1 ≤ x ≤ ai};   mi = inf{f(x) : ai−1 ≤ x ≤ ai};
U(P, f) = Σ_{i=1}^n Mi ∆αi;   L(P, f) = Σ_{i=1}^n mi ∆αi.

The upper and the lower integrals are defined by

upper∫_a^b f dα = inf{U(P, f) : P};   lower∫_a^b f dα = sup{L(P, f) : P}.
Definition 34 If upper∫_a^b f dα = lower∫_a^b f dα then we say f is Riemann-Stieltjes
integrable w.r.t. α and denote this common value by

∫_a^b f dα := ∫_a^b f(x) dα(x) := upper∫_a^b f dα = lower∫_a^b f dα.

Let R(α) denote the class of all R-S integrable functions on [a, b].
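As a numerical illustration of these definitions (a sketch of mine, valid as written only for monotone f, where the sup and inf on each subinterval are attained at the endpoints), take f(x) = x and the integrator α(x) = x on [0, 1]: the upper and lower sums squeeze the value 1/2.

```python
def upper_lower(f, alpha, a, b, n):
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    U = L = 0.0
    for i in range(1, n + 1):
        # f monotone: sup and inf on each subinterval sit at the endpoints
        Mi = max(f(pts[i - 1]), f(pts[i]))
        mi = min(f(pts[i - 1]), f(pts[i]))
        dalpha = alpha(pts[i]) - alpha(pts[i - 1])
        U += Mi * dalpha
        L += mi * dalpha
    return U, L

U, L = upper_lower(lambda x: x, lambda x: x, 0.0, 1.0, 1000)
print(U, L)   # U and L squeeze 1/2; here U - L = 1/1000
```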
Definition 35 A partition P′ of [a, b] is called a refinement of another
partition P of [a, b] if the points of P are all present in P′. We then write
P ≤ P′.
Lemma 8 If P ≤ P′ then L(P) ≤ L(P′) and U(P) ≥ U(P′).
Proof: It is enough to do this under the assumption that P′ has one extra
point compared to P.
Theorem 62 upper∫_a^b f dα ≥ lower∫_a^b f dα.

Proof: Let P and Q be any two partitions of [a, b]. By taking a common
refinement T = P ∪ Q and applying the above lemma, we get

U(P) ≥ U(T) ≥ L(T) ≥ L(Q).

Now varying Q over all possible partitions and taking the supremum,
we get

U(P) ≥ lower∫_a^b f dα.

Now varying P over all partitions of [a, b] and taking the infimum, we
get the theorem. ♠
Theorem 63 Let f be a bounded function and α be a monotonically in-
creasing function. Then the following are equivalent.
(i) f ∈ R(α).
(ii) Given ε > 0 there exists a partition P of [a, b] such that
U(P) − L(P) < ε.
(iii) Given ε > 0 there exists a partition P of [a, b] such that for all
refinements Q of P we have
U(Q) − L(Q) < ε.
(iv) Given ε > 0 there exists a partition P = {a0 < a1 < · · · < an} of [a, b]
such that for arbitrary points ti, si ∈ [ai−1, ai] we have

Σ_{i=1}^n |f(si) − f(ti)| ∆αi < ε.

(v) There exists a real number η such that for every ε > 0, there exists
a partition P = {a0 < a1 < · · · < an} of [a, b] such that for arbitrary
points ti ∈ [ai−1, ai], we have |Σ_{i=1}^n f(ti)∆αi − η| < ε.
Proof: (i) =⇒ (ii): By definition of the upper and lower integrals,
there exist partitions Q, T such that

U(Q) − upper∫_a^b f dα < ε/2;   lower∫_a^b f dα − L(T) < ε/2.

Take a common refinement P of Q, T and replace Q, T by P in the
above inequalities, then add the two inequalities and use the hy-
pothesis (i) to conclude (ii).
(ii) =⇒ (i): Since L(P) ≤ lower∫_a^b f dα ≤ upper∫_a^b f dα ≤ U(P), the conclusion
follows.
(ii) =⇒ (iii): This follows from Lemma 8, for if P′ ≥ P then
L(P) ≤ L(P′) ≤ U(P′) ≤ U(P).
(iii) =⇒ (ii): Obvious.
(iii) =⇒ (iv): Note that |f(si) − f(ti)| ≤ Mi − mi. Therefore,

Σ_i |f(si) − f(ti)| ∆αi ≤ Σ_i (Mi − mi)∆αi = U(P) − L(P) < ε.

(iv) =⇒ (iii): Choose points ti, si ∈ [ai−1, ai] such that

|mi − f(si)| < ε/(2n∆αi),   |Mi − f(ti)| < ε/(2n∆αi)

(whenever ∆αi ≠ 0). Then

U(P) − L(P) = Σ_i (Mi − mi)∆αi
            ≤ Σ_i [|Mi − f(ti)| + |f(ti) − f(si)| + |f(si) − mi|]∆αi < 2ε.
Thus so far, we have proved that (i) to (iv) are all equivalent to each
other.
(i) =⇒ (v): We first note that having proved that (i) to (iv) are all
equivalent, we can use any one of them. We take η = ∫_a^b f dα. Given
ε > 0 we choose a partition P such that |L(P) − η| < ε/3, and a
partition Q such that (iv) holds with ε replaced by ε/3. We then take a
common refinement T of these two partitions, for which again the same
would hold because of (iii). We now choose si ∈ [ai−1, ai] such that
|mi − f(si)| < ε/(3n∆αi) whenever ∆αi is non-zero. (If ∆αi = 0 we can take si
to be any point.) Then for arbitrary points ti ∈ [ai−1, ai], we have

|Σ_i f(ti)∆αi − η|
  = |Σ_i [(f(ti) − f(si)) + (f(si) − mi) + mi]∆αi − η|
  ≤ Σ_i |f(ti) − f(si)|∆αi + Σ_i |f(si) − mi|∆αi + |L(P) − η|
  ≤ ε/3 + ε/3 + ε/3 = ε.

(v) =⇒ (iv): Given ε > 0, choose a partition as in (v) with ε replaced
by ε/2. ♠

Lecture 17
Fundamental Properties of the Integral

Theorem 64 Let f be a bounded function and α be an increasing func-
tion on an interval [a, b].
(a) Linearity in f: This just means that if f, g ∈ R(α), λ, µ ∈ R then
λf + µg ∈ R(α). Moreover,

∫_a^b (λf + µg) dα = λ ∫_a^b f dα + µ ∫_a^b g dα.

(b) Semi-linearity in α: This just means that if f ∈ R(αj), j = 1, 2, and λj > 0,
then f ∈ R(λ1α1 + λ2α2) and moreover,

∫_a^b f d(λ1α1 + λ2α2) = λ1 ∫_a^b f dα1 + λ2 ∫_a^b f dα2.

(c) Let a < c < b. Then f ∈ R(α) on [a, b] if f ∈ R(α) on [a, c] as well
as on [c, b]. Moreover, we have

∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.
(d) If f1 ≤ f2 on [a, b] and fi ∈ R(α), then ∫_a^b f1 dα ≤ ∫_a^b f2 dα.
(e) If f ∈ R(α) and |f(x)| ≤ M, then

|∫_a^b f dα| ≤ M[α(b) − α(a)].

(f) If f is continuous on [a, b] then f ∈ R(α).
(g) If f : [a, b] → [c, d] is in R(α) and φ : [c, d] → R is continuous, then
φ ◦ f ∈ R(α).
(h) If f ∈ R(α) then f² ∈ R(α).
(i) If f, g ∈ R(α) then fg ∈ R(α).
(j) If f ∈ R(α) then |f| ∈ R(α) and

|∫_a^b f dα| ≤ ∫_a^b |f| dα.
Proof: (a) Put h = f + g. Given ε > 0, choose partitions P, Q of [a, b]
such that

U(P, f) − L(P, f) < ε/2,   U(Q, g) − L(Q, g) < ε/2,

and replace these partitions by their common refinement T and then
appeal to

L(T, f) + L(T, g) ≤ L(T, h) ≤ U(T, h) ≤ U(T, f) + U(T, g).

For a constant λ ≥ 0, since

U(P, λf) = λU(P, f);   L(P, λf) = λL(P, f)

(for λ < 0 the roles of U and L get interchanged), it follows that
∫_a^b λf dα = λ ∫_a^b f dα. Combining these two we get the
proof of (a).
(b) This is easier: in any partition P we have

∆(λ1α1 + λ2α2) = λ1∆α1 + λ2∆α2,

from which the conclusion follows.
(c) All that we do is to stick to those partitions of [a, b] which contain
the point c.
(d) This is easy, and
(e) is a consequence of (d).
(f) Given ε > 0, put ε1 = ε/(α(b) − α(a)). Then by uniform continuity of f,
there exists a δ > 0 such that |f(t) − f(s)| < ε1 whenever t, s ∈ [a, b]
and |t − s| < δ. Choose a partition P such that ∆xi < δ for all i. Then
it follows that Mi − mi ≤ ε1 and hence U(P) − L(P) ≤ ε.
(g) Given ε > 0, by uniform continuity of φ, we get ε > δ > 0 such that
|φ(t) − φ(s)| < ε for all t, s ∈ [c, d] with |t − s| < δ. There is a partition
P of [a, b] such that

U(P, f) − L(P, f) < δ².

The differences Mi − mi may behave in two different ways; accordingly,
let us define

A = {1 ≤ i ≤ n : Mi − mi < δ},   B = {1, 2, . . . , n} \ A.

Put h = φ ◦ f. It follows that

Mi(h) − mi(h) ≤ ε, i ∈ A.

Therefore we have

δ (Σ_{i∈B} ∆αi) ≤ Σ_{i∈B} (Mi − mi)∆αi ≤ U(P, f) − L(P, f) < δ².

Therefore we have Σ_{i∈B} ∆αi < δ. Now let K be a bound for |φ(t)| on
[c, d]. Then

U(P, h) − L(P, h) = Σ_i (Mi(h) − mi(h))∆αi
  = Σ_{i∈A} (Mi(h) − mi(h))∆αi + Σ_{i∈B} (Mi(h) − mi(h))∆αi
  ≤ ε(α(b) − α(a)) + 2Kδ < ε(α(b) − α(a) + 2K).

Since ε > 0 is arbitrary, we are done.
(h) Follows from (g) by taking φ(t) = t².
(i) Write fg = [(f + g)² − (f − g)²]/4.
(j) Take φ(t) = |t| and apply (g) to see that |f| ∈ R(α). Now choose λ = ±1
so that λ ∫_a^b f dα ≥ 0. Then

|∫_a^b f dα| = λ ∫_a^b f dα = ∫_a^b λf dα ≤ ∫_a^b |f| dα.

This completes the proof of the theorem. ♠
Theorem 65 Suppose f is monotonic and α is continuous and mono-
tonically increasing. Then f ∈ R(α).
Proof: Given ε > 0, by uniform continuity of α we can find a partition
P such that each ∆αi < ε.
Now if f is increasing, then we have Mi = f(ai), mi = f(ai−1).
Therefore,

U(P) − L(P) = Σ_i [f(ai) − f(ai−1)]∆αi < ε [f(b) − f(a)].

Since ε > 0 is arbitrary, we are done. ♠

Lecture 18
Theorem 66 Let f be a bounded function on [a, b] with finitely many
discontinuities. Suppose α is continuous at every point where f is dis-
continuous. Then f ∈ R(α).
Proof: Because of (c) of Theorem 64, it is enough to prove this for the
case when c ∈ [a, b] is the only discontinuity of f. Put K = sup |f(t)|.
Given ε > 0, we can find δ1 > 0 such that α(c + δ1) − α(c − δ1) < ε.
By uniform continuity of f on [a, b] \ (c − δ1, c + δ1) we can find δ2 > 0
such that |x − y| < δ2 implies |f(x) − f(y)| < ε. Given any partition
P of [a, b], choose a partition Q which contains the point c and whose
mesh is less than min{δ1, δ2}. It follows that U(Q) − L(Q) < ε(α(b) −
α(a)) + 2Kε. Since ε > 0 is arbitrary, this implies f ∈ R(α). ♠
Remark 31 The above result leads one to the following question:
keeping the continuity hypothesis on α, how large can the set of
discontinuities of a function f be such that f ∈ R(α)? The answer is not
available within R-S theory. Lebesgue had to invent a new, powerful theory which
not only answers this and several such questions raised by Riemann in-
tegration theory but also provides a sound foundation to the theory of
probability.
Example 16 We shall denote the unit step function at 0 by U, which
is defined as follows:

U(x) = 0 for x ≤ 0;   U(x) = 1 for x > 0.

By shifting the origin to other points we can get other unit step func-
tions. For example, suppose c ∈ [a, b]. Consider α(x) = U(x − c), x ∈
[a, b]. For any bounded function f : [a, b] → R, let us try to compute
∫_a^b f dα. Consider any partition P of [a, b] in which c = ak. The only
non-zero difference ∆α is the one over the interval [ak, ak+1], and it equals 1.
Therefore U(P) − L(P) = Mk(f) − mk(f), where Mk, mk are taken over [ak, ak+1].
Now assume that f is continuous at c. Then by choosing ak+1 close
to ak = c, we can make Mk − mk → 0. This means that f ∈ R(α).
Indeed, it follows that Mk → f(c) and mk → f(c). Therefore,

∫_a^b f dα = f(c).

Now suppose f has a discontinuity at c of the first kind, i.e., in
particular, f(c+) exists. It then follows that |Mk − mk| → |f(c) − f(c+)|.
Therefore, f ∈ R(α) iff f(c+) = f(c).
Thus, we see that it is possible to destroy integrability by just dis-
turbing the value of the function at one single point where α itself is
discontinuous.
In particular, take f = α. It follows that α ∉ R(α) on [a, b].
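Example 16 can be tested numerically. In the sketch below (names and the choice c = 1/π are mine), α is the unit step at c and we form left-tagged Riemann-Stieltjes sums; only the subinterval containing c contributes, and the sums tend to f(c).

```python
import math

c = 1 / math.pi                       # the step point; deliberately not a grid point
alpha = lambda x: 0.0 if x <= c else 1.0
f = math.cos

def rs_sum(n):
    pts = [i / n for i in range(n + 1)]
    # left-tagged R-S sum; only the subinterval straddling c contributes
    return sum(f(pts[i]) * (alpha(pts[i + 1]) - alpha(pts[i])) for i in range(n))

for n in [10, 100, 10000]:
    print(n, rs_sum(n), f(c))   # the sums tend to f(c)
```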
We shall now prove a partial converse to (c) of Theorem 64.
Theorem 67 Let f be a bounded function and α an increasing function
on [a, b]. Let c ∈ [a, b] be a point at which at least one of f or α is continuous. If
f ∈ R(α) on [a, b] then f ∈ R(α) on both [a, c] and [c, b]; moreover, in
that case,

∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.
Proof: Assume α is continuous at c. If Tc is the translation function
Tc(x) = x − c, then the functions g1 = U ◦ Tc and g2 = 1 − U ◦ Tc
are both in R(α), since they are discontinuous only at c. Therefore
fg1, fg2 ∈ R(α). But these respectively imply that f ∈ R(α) on [c, b]
and on [a, c].
We now consider the case when f is continuous at c. We shall prove
that f ∈ R(α) on [a, c], the proof that f ∈ R(α) on [c, b] being similar.
Recall that the set of discontinuities of a monotonic function is
countable. Therefore there exists a sequence of points cn in [a, c] (we
are assuming that a < c) at which α is continuous, such that cn → c. By the
earlier case, f ∈ R(α) on each of the intervals [a, cn]. We claim that the sequence

sn := ∫_a^{cn} f dα

converges to a limit which is equal to ∫_a^c f dα. Let K > 0 be a bound
for α. Given ε > 0 we can choose δ > 0 such that for x, y ∈ [c − δ, c +
δ], |f(x) − f(y)| < ε/2K. If n0 is big enough then n, m ≥ n0 implies
that |sn − sm| < ε. This means {sn} is Cauchy and hence is convergent
with limit equal to, say, s. Now choose n so that |s − sn| < ε.
Put ∆ = α(c) − α(c−). Since cn → c from the left, it follows that
α(cn) → α(c−). Choose n large enough so that

|α(cn) − α(c−)| < ε/L,

where L is a bound for f.
Now choose any partition Q of [a, cn] so that |U(Q, f) − sn| < ε.
This is possible because f ∈ R(α) on [a, cn]. Put P = Q ∪ {c}, M =
max{f(x) : x ∈ [cn, c]}. Then

|s + ∆f(c) − U(P, f)|
  ≤ |s − sn| + |sn − U(Q, f)| + |∆f(c) − (α(c) − α(cn))M|
  ≤ ε + ε + ∆|f(c) − M| + |(α(cn) − α(c−))M|
  ≤ 2ε + ∆ · ε/(2K) + |M| · ε/L ≤ 4ε.
Theorem 68 Let {cn} be a sequence of non-negative real numbers such
that Σ_n cn < ∞. Let tn ∈ (a, b) be a sequence of distinct points in
the open interval and let α = Σ_n cn U ◦ Ttn. Then for any continuous
function f on [a, b] we have

∫_a^b f dα = Σ_n cn f(tn).

Proof: Observe that for any x ∈ [a, b], 0 ≤ Σ_n cn U(x − tn) ≤ Σ_n cn and
hence α(x) makes sense. Also, clearly α is monotonically increasing,
α(a) = 0 and α(b) = Σ_n cn. Given ε > 0 choose n0 such that
Σ_{n>n0} cn < ε. Take

α1 = Σ_{n≤n0} cn U ◦ Ttn,   α2 = Σ_{n>n0} cn U ◦ Ttn.

By (b) of Theorem 64 and from the example above, we have

∫_a^b f dα1 = Σ_{n≤n0} cn f(tn).

If K is a bound for |f| on [a, b], we also have

|∫_a^b f dα2| ≤ K(α2(b) − α2(a)) = K Σ_{n>n0} cn < Kε.

Therefore,

|∫_a^b f dα − Σ_{n≤n0} cn f(tn)| < Kε.

This proves the claim. ♠
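Theorem 68 with finitely many steps can be checked directly. The sketch below (the coefficients and points are arbitrary choices of mine) compares a fine left-tagged R-S sum against Σ cn f(tn).

```python
import math

cs = [0.5, 0.25, 0.125]                       # c_n >= 0 with finite sum
ts = [math.sqrt(2) / 7, 0.52, math.pi / 4]    # distinct points in (0, 1)
alpha = lambda x: sum(c for c, t in zip(cs, ts) if x > t)
f = math.exp

n = 10000
pts = [i / n for i in range(n + 1)]
# left-tagged Riemann-Stieltjes sum for the pure step integrator alpha
rs = sum(f(pts[i]) * (alpha(pts[i + 1]) - alpha(pts[i])) for i in range(n))
exact = sum(c * f(t) for c, t in zip(cs, ts))
print(rs, exact)   # the two values agree to about 1/n
```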
Theorem 69 Let α be an increasing function and α′ ∈ R on [a, b].
Then for any bounded real function f on [a, b], f ∈ R(α) iff fα′ ∈ R.
Furthermore, in this case,

∫_a^b f dα = ∫_a^b f(x)α′(x) dx.

Proof: Given ε > 0, since α′ is Riemann integrable, by (iv) of Theorem
63, there exists a partition P = {a = a0 < a1 < · · · < an = b} of [a, b]
such that for all si, ti ∈ [ai−1, ai] we have

Σ_{i=1}^n |α′(si) − α′(ti)|∆xi < ε.

Apply the MVT to α to obtain ti ∈ [ai−1, ai] such that ∆αi = α′(ti)∆xi.
Put M = sup |f(x)|. Then

Σ_{i=1}^n f(si)∆αi = Σ_{i=1}^n f(si)α′(ti)∆xi.

Therefore,

|Σ_{i=1}^n f(si)∆αi − Σ_{i=1}^n f(si)α′(si)∆xi| ≤ Σ_i |f(si)||α′(ti) − α′(si)|∆xi < Mε.

Therefore

Σ_{i=1}^n f(si)∆αi ≤ Σ_{i=1}^n f(si)α′(si)∆xi + Mε ≤ U(P, fα′) + Mε.

Since this is true for arbitrary si ∈ [ai−1, ai], it follows that

U(P, f, α) ≤ U(P, fα′) + Mε.

Likewise, we also obtain

U(P, fα′) ≤ U(P, f, α) + Mε.

Thus

|U(P, f, α) − U(P, fα′)| ≤ Mε.

Exactly in the same manner, we also get

|L(P, f, α) − L(P, fα′)| ≤ Mε.

Note that the above two inequalities hold for refinements of P as well.
Now suppose f ∈ R(α). We can then assume that the partition P is
chosen so that

|U(P, f, α) − L(P, f, α)| < Mε.

It then follows that

|U(P, fα′) − L(P, fα′)| < 3Mε.

Since ε > 0 is arbitrary, this implies fα′ is Riemann integrable. The
other way implication is similar. Moreover, the above inequalities also
establish the last part of the theorem. ♠
Remark 32 The above theorems illustrate the power of Stieltjes' mod-
ification of Riemann's theory. In the first case, α was a staircase function
(also called a pure step function), and the integral therein reduces to
a finite or infinite sum. In the latter case, α is a differentiable func-
tion and the integral reduces to the ordinary Riemann integral. Thus
the R-S theory brings a unification of the discrete case with the
continuous case, so that we can treat both of them in one go. As an
illustrative example, consider a thin straight wire of finite length. The
moment of inertia about an axis perpendicular to the wire and through
an end point is given by

∫_0^l x² dm,

where m(x) denotes the mass of the segment [0, x] of the wire. If the
mass is given by a density function ρ, then m(x) = ∫_0^x ρ(t) dt, or equiva-
lently, dm = ρ(x)dx, and the moment of inertia takes the form

∫_0^l x² ρ(x) dx.

On the other hand, if the mass is made up of finitely many values mi
concentrated at points xi, then the moment of inertia takes the form

Σ_i xi² mi.
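Theorem 69 is easy to illustrate numerically: with α(x) = x² (so α′(x) = 2x) and f(x) = x on [0, 1], both the R-S sums and the Riemann sums of fα′ approach 2/3. A sketch (mine, using left tags):

```python
n = 10000
pts = [i / n for i in range(n + 1)]
# R-S sum of f(x) = x with respect to alpha(x) = x^2, left tags
rs = sum(pts[i] * (pts[i + 1] ** 2 - pts[i] ** 2) for i in range(n))
# Riemann sum of f(x) * alpha'(x) = x * 2x, left tags
riemann = sum(pts[i] * 2 * pts[i] * (1.0 / n) for i in range(n))
print(rs, riemann)   # both close to 2/3
```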
Theorem 70 (Change of Variable Formula) Let φ : [a, b] → [c, d] be
a strictly increasing differentiable function such that φ(a) = c, φ(b) = d.
Let α be an increasing function on [c, d] and f be a bounded function
on [c, d] such that f ∈ R(α). Put β = α ◦ φ, g = f ◦ φ. Then g ∈ R(β)
and we have

∫_a^b g dβ = ∫_c^d f dα.

Proof: Since φ is strictly increasing, it defines a one-one correspon-
dence of partitions of [a, b] with those of [c, d], given by

{a = a0 < a1 < · · · < an = b} ↔ {c = φ(a0) < φ(a1) < · · · < φ(an) = d}.

Under this correspondence, observe that the values of the two functions
f, g are the same and the values of the functions α, β are also the same.
Therefore the upper and lower sums are the same, and hence the
two upper and lower integrals are the same. The result follows. ♠
Lecture 19 : Functions of Bounded Variation
Definition 36 Let f : [a, b] → R be any function. For each partition
P = {a = a0 < a1 < · · · < an = b} of [a, b], consider the variation

V(P, f) = Σ_{k=1}^n |f(ak) − f(ak−1)|.

Let

Vf = Vf[a, b] = sup{V(P, f) : P is a partition of [a, b]}.

If Vf is finite we say f is of bounded variation on [a, b]. Then Vf is
called the total variation of f on [a, b]. Let us denote the space of all
functions of bounded variation on [a, b] by BV[a, b].
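The total variation is easy to approximate from the definition. In the sketch below (mine, not from the notes) we compute V(P, f) for f = sin on [0, 2π] over refining uniform partitions; the values increase toward the total variation 4 (sin rises by 1, falls by 2, rises by 1).

```python
import math

def variation(f, a, b, n):
    # V(P, f) over the uniform partition with n subintervals
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(abs(f(pts[i]) - f(pts[i - 1])) for i in range(1, n + 1))

for n in [3, 30, 3000]:
    print(n, variation(math.sin, 0.0, 2 * math.pi, n))   # increases toward 4
```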
Lemma 9 If Q is a refinement of P then V (Q, f) ≥ V (P, f).
Theorem 71 (a) f, g ∈ BV[a, b], α, β ∈ R =⇒ αf + βg ∈ BV[a, b].
Indeed, we also have Vαf+βg ≤ |α|Vf + |β|Vg.
(b) f ∈ BV[a, b] =⇒ f is bounded on [a, b].
(c) f, g ∈ BV[a, b] =⇒ fg ∈ BV[a, b]. Indeed, if |f| ≤ K, |g| ≤ L then
Vfg ≤ LVf + KVg.
(d) If f ∈ BV[a, b] and f is bounded away from 0, then 1/f ∈ BV[a, b].
(e) Given c ∈ [a, b], f ∈ BV[a, b] iff f ∈ BV[a, c] and f ∈ BV[c, b].
Moreover, we have

Vf[a, b] = Vf[a, c] + Vf[c, b].

(f) For any f ∈ BV[a, b] the function Vf : [a, b] → R defined by
Vf(x) = Vf[a, x] is an increasing function.
(g) For any f ∈ BV[a, b], the function Df = Vf − f is an increasing
function on [a, b].
(h) Every monotonic function f on [a, b] is of bounded variation on
[a, b].
(i) A function f : [a, b] → R is in BV[a, b] iff it is the difference of
two monotonic functions.
(j) If f is continuous on [a, b] and differentiable on (a, b) with the
derivative f′ bounded on (a, b), then f ∈ BV[a, b].
(k) f ∈ BV[a, b] is continuous at c ∈ [a, b] iff Vf : [a, b] → R is
continuous at c.
Proof: (a) Indeed, for every partition we have V(P, αf + βg) ≤ |α|V(P, f) +
|β|V(P, g). The result follows upon taking the supremum.
(b) Take M = Vf + |f(a)|. Then |f(x)| ≤ |f(x) − f(a)| + |f(a)| ≤
V(P, f) + |f(a)| ≤ M, where P is any partition in which a, x are consecu-
tive terms.
(c) For any two points x, y we have

|f(x)g(x) − f(y)g(y)| ≤ |f(x)||g(x) − g(y)| + |g(y)||f(x) − f(y)|,

and summing over a partition gives V(P, fg) ≤ KVg + LVf.
(d) Let 0 < m ≤ |f(x)| for all x ∈ [a, b]. Then

|1/f(x) − 1/f(y)| = |f(x) − f(y)|/|f(x)f(y)| ≤ |f(x) − f(y)|/m²,

so that V(1/f) ≤ Vf/m².
(e) Follows from the lemma above, by including the point c in any
partition.
(f) Follows from (e).
(g) Let a ≤ x < y ≤ b. Proving Vf[a, x] − f(x) ≤ Vf[a, y] − f(y) is the
same as proving Vf[a, x] + f(y) − f(x) ≤ Vf[a, y]. For any partition P
of [a, x], let P* = P ∪ {y}. Then

V(P, f) + f(y) − f(x) ≤ V(P, f) + |f(y) − f(x)| = V(P*, f) ≤ Vf[a, y].

Since this is true for all partitions P of [a, x], we are through.
(h) We may assume f is increasing. But then for every partition P we
have V(P, f) = f(b) − f(a) and hence Vf = f(b) − f(a).
(i) If f ∈ BV [a, b], from (f) and (g), we have f = Vf − (Vf − f) as a
difference of two increasing functions. The converse follows from (a)
and (h).
(j) This is because f then satisfies, by the Mean Value Theorem, the Lipschitz condition

|f(x) − f(y)| ≤ M|x − y| for all x, y ∈ [a, b],

where M is a bound for |f′|. Therefore for every partition P we have V(P, f) ≤ M(b − a).
(k) Observe that Vf is increasing and hence Vf(c±) exist. By (h) it
follows that the same is true for f. We shall show that f(c) = f(c±) iff
Vf(c) = Vf(c±), which would imply (k). So, assume that f(c) = f(c+).
Given ε > 0 we can find δ1 > 0 such that |f(x) − f(c)| < ε for all
c < x < c + δ1, x ∈ [a, b]. We can also choose a partition P = {c =
x0 < x1 < · · · < xn = b} of [c, b] such that

Vf[c, b] − ε < Σ_k |∆fk|.

Put δ = min{δ1, x1 − c}. Let now c < x < c + δ. Then

Vf(x) − Vf(c) = Vf[c, x] = Vf[c, b] − Vf[x, b]
  < ε + Σ_k |∆fk| − Vf[x, b]
  ≤ ε + |f(x) − f(c)| + |f(x1) − f(x)| + Σ_{k≥2} |∆fk| − Vf[x, b]
  ≤ ε + ε + Vf[x, b] − Vf[x, b] = 2ε.

This proves that Vf(c+) = Vf(c), as required.
Conversely, suppose Vf(c+) = Vf(c). Then given ε > 0 we can find
δ > 0 such that for all c < x < c + δ we have Vf(x) − Vf(c) < ε. But
then given x, y such that c < y < x < c + δ it follows that

|f(y) − f(c)| + |f(x) − f(y)| ≤ Vf[c, x] = Vf(x) − Vf(c) < ε,

which certainly implies that |f(x) − f(y)| ≤ ε. This completes the proof
that Vf(c+) = Vf(c) iff f(c+) = f(c). Similar arguments will prove that
Vf(c−) = Vf(c) iff f(c−) = f(c). ♠
Example 17 Not all continuous functions on a closed and bounded
interval are of bounded variation. A typical example is f : [0, π] → R
defined by

f(x) = x cos(1/x), x ≠ 0;   f(0) = 0.

For each n consider the partition

P = {0, π/(2n), π/(2n − 1), . . . , π}.

Then V(P, f) = π Σ_{k=1}^n 1/k. As n → ∞, we know this tends to ∞.
However, the function g(x) = xf(x) is of bounded variation. To
see this, observe that g is differentiable in [0, π] and the derivative is
bounded (though not continuous), and so we can apply (j) of the above
theorem.
Also note that even a partial converse to (j) is not true, i.e., a
differentiable function of bounded variation need not have its derivative
bounded. For example, h(x) = x^{1/3}, being an increasing function, is of
bounded variation on [0, 1], but its derivative is not bounded.
Remark 33 We are now going to extend the R-S integral to integra-
tors α which are not necessarily increasing functions. In this connection, it
should be noted that condition (v) of Theorem 63 becomes the strongest and
hence we adopt that as the definition.
Definition 37 Let f, α : [a, b] → R be any two functions. We say f is
R-S integrable with respect to α and write f ∈ R(α) if there exists
a real number η such that for every ε > 0 there exists a partition P of
[a, b] such that for every refinement Q = {a = x0 < x1 < · · · < xn = b} of P
and points ti ∈ [xi−1, xi] we have

|Σ_{i=1}^n f(ti)∆αi − η| < ε.

We then write η = ∫_a^b f dα and call it the R-S integral of f with respect
to α.
It should be noted that, in this general situation, several properties
listed in Theorem 64 may not be valid. However, property (b) of Theorem
64 is valid and indeed becomes better.
Lemma 10 For any two functions αj and real numbers λj, f ∈
R(αj), j = 1, 2, implies f ∈ R(λ1α1 + λ2α2). Moreover, in this case
we have

∫_a^b f d(λ1α1 + λ2α2) = λ1 ∫_a^b f dα1 + λ2 ∫_a^b f dα2.

Proof: This is so because for any fixed partition we have the linearity
property of ∆:

∆(λ1α1 + λ2α2)i = λ1(∆α1)i + λ2(∆α2)i,

and hence the same is true of the R-S sums. Therefore, if ηj = ∫_a^b f dαj,
then it follows that

λ1η1 + λ2η2 = ∫_a^b f d(λ1α1 + λ2α2).

♠
Theorem 72 Let α be a function of bounded variation and let V de-
note its total variation function V : [a, b] → R defined by V(x) =
Vα[a, x]. Let f be any bounded function. Then f ∈ R(α) iff f ∈ R(V)
and f ∈ R(V − α).
Proof: The 'if' part is easy because of (a). Also, we need only prove
that if f ∈ R(α) then f ∈ R(V). Given ε > 0, choose a partition Pε so
that for all refinements P of Pε, and for all choices of tk, sk ∈ [ak−1, ak],
we have

|Σ_{k=1}^n (f(tk) − f(sk))∆αk| < ε,   V(b) < Σ_k |∆αk| + ε.

We shall establish that

U(P, f, V) − L(P, f, V) < εK

for some constant K. By adding and subtracting, this task may be
broken up into establishing two inequalities:

Σ_k [Mk(f) − mk(f)][∆Vk − |∆αk|] < εK/2;   Σ_k [Mk(f) − mk(f)]|∆αk| < εK/2.

Now observe that ∆Vk − |∆αk| ≥ 0 for all k. Therefore if M is a bound
for |f|, then

Σ_k [Mk(f) − mk(f)][∆Vk − |∆αk|] ≤ 2M Σ_k (∆Vk − |∆αk|)
  = 2M(V(b) − Σ_k |∆αk|) < 2Mε.

To prove the second inequality, let us put

A = {k : ∆αk ≥ 0};   B = {1, 2, . . . , n} \ A.

For k ∈ A choose tk, sk ∈ [ak−1, ak] such that

f(tk) − f(sk) > Mk(f) − mk(f) − ε;

and for k ∈ B choose them so that

f(sk) − f(tk) > Mk(f) − mk(f) − ε.

We then have

Σ_k [Mk(f) − mk(f)]|∆αk|
  < Σ_{k∈A} (f(tk) − f(sk))|∆αk| + Σ_{k∈B} (f(sk) − f(tk))|∆αk| + ε Σ_k |∆αk|
  = Σ_k (f(tk) − f(sk))∆αk + ε Σ_k |∆αk| ≤ ε + εV(b) ≤ ε(1 + V(b)).

Putting K = 2 max{2M, 1 + V(b)}, we are done. ♠
Corollary 6 Let α : [a, b] → R be of bounded variation and f : [a, b] →
R be any function. If f ∈ R(α) on [a, b] then it is so on every subin-
terval [c, d] of [a, b].

Corollary 7 Let f : [a, b] → R be of bounded variation and α : [a, b] →
R be continuous and of bounded variation. Then f ∈ R(α).
Proof: By (k) of Theorem 71, we see that Vα and Vα − α are
both continuous and increasing. Hence, by a previous theorem, Vf
and Vf − f, being monotonic, are both integrable with respect to Vα and Vα − α.
Now we just use the additive property. ♠
Lect. 20
Let us now consider functions on open intervals which are finite or
infinite.
Definition 38 For f : (a, b) → R we say f is of bounded variation if
there exists M such that for every subinterval [c, d] ⊂ (a, b) we have
Vf [c, d] ≤ M. Also, in this case the total variation of f on (a, b) is
defined to be the supremum of all Vf [c, d] where [c, d] varies over all
subintervals of (a, b).
Remark 34 Look at the results in Theorem 71 one by one and see
whether you can replace the closed interval [a, b] there by an open
interval. We see no trouble whatsoever till (h). That one is obviously
not true as stated, since a monotonic function on an open interval need not
be bounded. Likewise we need to modify (i) as well. Indeed:
Theorem 73 Every bounded monotonic function on (a, b) is of bounded
variation. Every element of BV(a, b) is expressible as the difference of
two bounded increasing functions.
The last part of the above theorem follows because if f ∈ BV(a, b) then
Vf is bounded.
Exercise 13
1. Let f : [0, 1] → R be an increasing function. If f(x) ≠ x for any
x ∈ [0, 1] and f(0) > 0, then show that f(1) > 1.
2. Suppose x1 < x2 < · · · < xk are the roots of a polynomial function
f lying in [a, b]. What is Vf[a, b]?
3. Alternative proof of Theorem 71 (i): Let f ∈ BV[a, b]. For every
partition P of [a, b] define

A(P) = {k : ∆fk > 0};   B(P) = {k : ∆fk < 0}.
Define

pf[a, b] = sup{Σ_{k∈A(P)} ∆fk : P is a partition of [a, b]},
nf[a, b] = sup{Σ_{k∈B(P)} |∆fk| : P is a partition of [a, b]}.

Then pf and nf are called the positive and negative variations of
f on [a, b]. We define pf(x) = pf[a, x], nf(x) = nf[a, x], a < x ≤ b, and
pf(a) = 0, nf(a) = 0. Check that
(i) Vf(x) = pf(x) + nf(x).
(ii) 0 ≤ pf(x) ≤ Vf(x); 0 ≤ nf(x) ≤ Vf(x).
(iii) pf and nf are increasing on [a, b].
(iv) f(x) = f(a) + pf(x) − nf(x), x ∈ [a, b]. [This is part of
the statement of Theorem 71 (i).]
(v) 2pf(x) = Vf(x) + f(x) − f(a); 2nf(x) = Vf(x) − f(x) + f(a).
(vi) Every point of continuity of f is also a point of continuity of
pf and nf.
4. Absolute Continuity A function f : [a, b] → R is said to be
absolutely continuous if for every ε > 0 there exists a δ > 0 such
that for any finitely many disjoint subintervals (ak, bk) of [a, b]
with Σ_k (bk − ak) < δ, we have

Σ_{k=1}^n |f(bk) − f(ak)| < ε.

(i) Every absolutely continuous function is continuous.
(ii) Every absolutely continuous function is of bounded variation.
(iii) If f satisfies a uniform Lipschitz condition of order 1, i.e., if
there exists M such that |f(x) − f(y)| ≤ M|x − y| for all x, y ∈
[a, b], then f is abs. cont.
(iv) The set of abs. continuous functions on [a, b] forms a vector
space.
(v) If f is abs. continuous and bounded away from 0, then 1/f is
also abs. continuous.
(vi) If f is absolutely continuous then |f| is absolutely continuous.
Remark 35 There are continuous functions of bounded variation
which are not absolutely continuous. Find one of them.
5. Rectifiable Curves Let γ : [a, b] → R^n be a path, i.e., a contin-
uous map. For each partition P of [a, b] consider the sum

l(P, γ) = Σ_{k=1}^n ‖γ(ak) − γ(ak−1)‖.

Let

l(γ) = sup{l(P, γ) : P is a partition of [a, b]}.

If l(γ) is finite we say γ is a rectifiable path and call l(γ) the arc
length of γ.
(i) If γ = (γ1, . . . , γn), then γ is rectifiable iff each γi is of
bounded variation.
(ii) If γ′ is continuous on [a, b], then γ is rectifiable and we have

∫_a^b ‖γ′(t)‖ dt = l(γ).

(iii) The arc length is invariant under a change of parameterization,
i.e., if φ : [c, d] → [a, b] is an onto map with φ′(t) > 0 for all t, then
l(γ) = l(γ ◦ φ).

6. The graph of x sin(1/x) is non-rectifiable.
7. Consider the following function defined on [0, 1] by

f(t) = 0, t = 0;
f(t) = 2nt − 1, 1/(2n) ≤ t ≤ 1/(2n − 1), n ≥ 1;
f(t) = 1 − 2nt, 1/(2n + 1) ≤ t ≤ 1/(2n), n ≥ 1.

Show that the graph is non-rectifiable.
8. Let △ABC be an equilateral triangle in R². Start at the midpoint
M1 of AB, join it to the opposite vertex C and trace the line seg-
ment M1C up to the midpoint M2 of CM1. Extend BM2 to meet
the side AC at N2. Let M3 be the midpoint of CN2. Trace this
segment from M2 to M3. Repeat this process infinitely. Observe
that the sequence of points Mj converges to the midpoint M0
of BC. Show that this process defines a non-rectifiable continuous
path.
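The arc-length formula of exercise 5(ii) can be sanity-checked numerically on the circle γ(t) = (cos t, sin t), t ∈ [0, 2π]: the polygonal lengths l(P, γ) increase to 2π. A sketch (mine):

```python
import math

def poly_length(n):
    # l(P, gamma) over the uniform partition of [0, 2*pi] with n pieces
    pts = [2 * math.pi * i / n for i in range(n + 1)]
    return sum(math.hypot(math.cos(pts[i]) - math.cos(pts[i - 1]),
                          math.sin(pts[i]) - math.sin(pts[i - 1]))
               for i in range(1, n + 1))

for n in [6, 60, 6000]:
    print(n, poly_length(n))   # increases toward 2*pi
```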
Lecture 22
Example 18 :
1. Consider the double sequence

s_{m,n} = m/(m + n), m, n ≥ 1.

Compute the two iterated limits

lim_m lim_n s_{m,n},   lim_n lim_m s_{m,n}

and record your results.
2. Let fn(x) = x²/(1 + x²)^n, x ∈ R, n ≥ 1, and put f(x) = Σ_n fn(x).
Check that fn is continuous. Compute f and see that f is not
continuous.
3. Define gm(x) = lim_{n→∞} (cos m!πx)^{2n} and put g(x) = lim_{m→∞} gm(x).
Compute g and see that g is discontinuous everywhere. Directly
check that it is not Riemann integrable.
4. Consider the sequence hn(x) = sin(nx)/√n and put h(x) = lim_n hn(x).
Check that h ≡ 0. On the other hand, compute lim_n h′n(x).
5. Put λn(x) = n²x(1 − x²)^n. Compute lim_n λn(x). On the other
hand, check that

∫_0^1 λn(x) dx = n²/(2n + 2) → ∞.

Therefore we have

∞ = lim_n [∫_0^1 λn(x) dx] ≠ ∫_0^1 [lim_n λn(x)] dx = 0.
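For item 1 above, a quick numerical experiment (a sketch of mine) already reveals the answer: the two iterated limits of s_{m,n} = m/(m + n) are different.

```python
def s(m, n):
    return m / (m + n)

big = 10 ** 8
# inner limit in n first: s(m, n) -> 0 for each fixed m
print(s(5, big))
# inner limit in m first: s(m, n) -> 1 for each fixed n
print(s(big, 5))
```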
We know that if a sequence of continuous functions converges uni-
formly to a function, then the limit function is continuous. We can
now ask for the converse: Suppose a sequence of continuous functions
fn converges pointwise to a function f which is also continuous. Is
the convergence uniform? The answer in general is NO. But there is a
situation when we can say yes as well.
Theorem 74 Let X be a compact metric space and fn : X → R be a
sequence of continuous functions converging pointwise to a continuous function f.
Suppose further that the sequence fn is monotone. Then fn → f uniformly on
X.
Proof: That the sequence fn is monotone means that for each x ∈ X we have

· · · ≤ fn(x) ≤ fn+1(x) ≤ · · ·

or the other way round, where all inequalities are reversed. It is enough to
consider one of these cases; put gn(x) = f(x) − fn(x) and assume that
gn is a sequence of non-negative continuous functions monotonically decreasing
to the function 0. Given ε > 0 we want to find n0 such that gn(x) < ε for
all n ≥ n0 and for all x ∈ X. Put

Kn = {x ∈ X : gn(x) ≥ ε}.

Then each Kn is a closed subset of X. Also, since gn(x) ≥ gn+1(x), it follows
that Kn+1 ⊂ Kn. On the other hand, since gn(x) → 0, it follows that
∩n Kn = ∅. Since this is happening in a compact space X, we conclude
that Kn0 = ∅ for some n0, and then Kn = ∅ for all n ≥ n0. ♠

Remark 36 The compactness is crucial, as illustrated by the example

fn(x) = 1/(nx + 1), 0 < x < 1.
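The example in Remark 36 can be probed numerically (a sketch of mine): each fn(x) = 1/(nx + 1) tends to 0 at every fixed x ∈ (0, 1), yet evaluating at the moving point x = 1/(10n) shows the sup over (0, 1) never drops below 10/11, so the convergence is not uniform.

```python
def f(n, x):
    return 1 / (n * x + 1)

for n in [10, 1000, 100000]:
    # at a fixed point the values die out, but at the moving point
    # x = 1/(10n) the value is always 10/11, so sup over (0,1) stays near 1
    print(n, f(n, 0.5), f(n, 1 / (10 * n)))
```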
Uniform Convergence and Integration
Theorem 75 Let α be an increasing function on [a, b] and let fn ∈ R(α), n ≥ 1,
on [a, b]. Suppose fn converges uniformly to f on [a, b]. Then f ∈ R(α)
and we have

lim_n ∫_a^b fn dα = ∫_a^b f dα.
Proof: Put εn = sup{|fn(x) − f(x)| : a ≤ x ≤ b}. The uniform
convergence implies that lim_n εn = 0.
We have for each n

fn − εn ≤ f ≤ fn + εn.

Therefore,

∫_a^b (fn − εn) dα ≤ lower∫_a^b f dα ≤ upper∫_a^b f dα ≤ ∫_a^b (fn + εn) dα.

Therefore

0 ≤ upper∫_a^b f dα − lower∫_a^b f dα ≤ 2εn[α(b) − α(a)].

Now we can take the limit as n → ∞ and apply the Sandwich theorem to
conclude that f ∈ R(α). Going back two steps, this now gives

∫_a^b (fn − εn) dα ≤ ∫_a^b f dα ≤ ∫_a^b (fn + εn) dα

and hence

−εn[α(b) − α(a)] ≤ ∫_a^b f dα − ∫_a^b fn dα ≤ εn[α(b) − α(a)].

Hence we can take the limit once again. ♠
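Theorem 75 (with α(x) = x, i.e., ordinary Riemann integration) can be illustrated as follows. The functions fn(x) = sin(nx)/n converge uniformly to 0 on [0, 1], and their integrals, approximated here by a midpoint rule (an implementation choice of mine), tend to 0 as the theorem predicts.

```python
import math

def integral(f, a, b, m=20000):
    # composite midpoint rule; accurate enough for this illustration
    h = (b - a) / m
    return sum(f(a + (i + 0.5) * h) for i in range(m)) * h

for n in [1, 10, 1000]:
    # f_n(x) = sin(nx)/n has sup|f_n| = 1/n -> 0, so convergence is uniform
    print(n, integral(lambda x: math.sin(n * x) / n, 0.0, 1.0))
```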
Example 19 A continuous function which is nowhere differen-
tiable Put
φ(x) = |x|, − 1 ≤ x ≤ 1
and extend this function all over R by periodicity:
φ(x + 2) = φ(x).
103
This function is continuous on R and not differentiable at any integer
value of x.
Let φn(x) = φ(4^n x). Then each φn has properties similar to those of φ, but the period has decreased and the number of points at which it is not differentiable has increased, viz., all those rational numbers q such that 4^n q ∈ Z. We now take
\[ f(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \varphi_n(x). \]
Observe that |φn(x)| ≤ 1 for all n, and hence the above series is uniformly convergent and hence defines a continuous function on R. It is also clear that the function is not differentiable at any dyadic rational number. But there is a bonus: it is not differentiable anywhere.
Let x ∈ R. For each integer m consider 4^m x. Then one of the intervals (4^m x, 4^m x + 1/2), (4^m x − 1/2, 4^m x) will not contain any integer. Choose one such and accordingly define δm = ±(1/2)4^{−m} so that there is no integer strictly between 4^m x and 4^m(x + δm).
Now if n > m then 4^n δm is an even integer and hence φn(x + δm) − φn(x) = 0. Also, for 0 ≤ n ≤ m we have |φn(x + δm) − φn(x)| ≤ |4^n δm|, with equality when n = m. Therefore
\[ \left| \frac{f(x + \delta_m) - f(x)}{\delta_m} \right| = \left| \sum_{n=0}^{m} \left(\frac{3}{4}\right)^n \frac{\varphi_n(x + \delta_m) - \varphi_n(x)}{\delta_m} \right| \ge 3^m - \sum_{n=0}^{m-1} 3^n = 3^m - \frac{3^m - 1}{2} = \frac{3^m + 1}{2}. \]
Therefore upon taking the limit as m →∞, we see that f ′(x) does not
exist.
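The blow-up of the difference quotients can be watched numerically. The sketch below (my own illustration; variable names are mine) truncates the series at a level beyond which the omitted terms cancel exactly in the difference, chooses δm = ±(1/2)4^{−m} as in the argument above, and checks the lower bound (3^m + 1)/2:

```python
import math

def phi(t):
    # phi(t) = |t| on [-1, 1], extended with period 2 over R
    t = t - 2.0 * math.floor((t + 1.0) / 2.0)   # reduce to [-1, 1)
    return abs(t)

def quotient(x, m, terms=12):
    # difference quotient of the truncated series sum_{n<=terms} (3/4)^n phi(4^n x);
    # for m < n <= terms the two phi-values cancel, since 4^n*delta is an even integer
    y = 4.0 ** m * x
    frac = y - math.floor(y)
    # pick the side with no integer strictly between 4^m x and 4^m (x + delta)
    delta = (0.5 if frac < 0.5 else -0.5) / 4.0 ** m
    s = 0.0
    for n in range(terms + 1):
        s += (3.0 / 4.0) ** n * (phi(4.0 ** n * (x + delta)) - phi(4.0 ** n * x))
    return s / delta

x = 0.7317   # arbitrary sample point
for m in range(1, 6):
    q = quotient(x, m)
    # lower bound (3^m + 1)/2 from Example 19 (small slack for float rounding)
    assert abs(q) >= (3 ** m + 1) / 2 - 1e-3
```

Since the bound (3^m + 1)/2 → ∞, no finite difference quotient limit can exist at x.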
Lecture 23
Uniform metric
Let X be any set and B(X) be the set of all real (or complex) valued
functions on X which are bounded. Then for each f ∈ B(X),
‖f‖ = sup{|f(x)| : x ∈ X} < ∞
and is called the norm of f. One easily checks that
(a) f ≡ 0 iff ‖f‖ = 0.
(b) ‖αf‖ = |α| ‖f‖, α ∈ R (or C).
(c) ‖f + g‖ ≤ ‖f‖ + ‖g‖.
Therefore, if we define d(f, g) = ‖f − g‖, then d becomes a metric on B(X), which is called the uniform metric. (The norm above is called the sup norm.) Note that if X is a compact metric space then any continuous real valued function on X is bounded. In particular, C[a, b] ⊂ B[a, b].
Theorem 76 A sequence {fn} in B(X) is convergent with respect to the uniform metric iff it is uniformly convergent on X as a sequence of functions.
Theorem 77 B(X) is a complete metric space.
Remark 37 Indeed, it follows from the preceding theorems that if K is a compact subset of Rn, then the space C(K) of continuous functions is a closed subset of B(K).
Theorem 78 Weierstrass The set of all polynomial functions on [a, b]
is dense in C[a, b].
Proof: Given a continuous function f : [a, b] → R and ε > 0 we must
find a polynomial P such that
|f(x)− P (x)| < ε, a ≤ x ≤ b.
Step 1 Enough to prove this for the case [a, b] = [0, 1].
Put g(t) = f(a + (b − a)t), 0 ≤ t ≤ 1, get a polynomial Q such that
\[ |g(t) - Q(t)| < \varepsilon, \quad 0 \le t \le 1, \]
and put P(x) = Q\left(\frac{x - a}{b - a}\right).
Step 2 Bernstein’s Polynomials. For n ≥ 1, and 0 ≤ x ≤ 1, define
\[ B_n(x) := B^f_n(x) := \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} f(k/n). \]
We have
(I) If f(x) ≡ 1 then B^f_n(x) = 1.
(II) If f(x) = x then B^f_n(x) = x.
(III) If f(x) = x² then B^f_n(x) = x²(1 − 1/n) + x/n.
(IV) \[ \sum_{k=0}^{n} \left(\frac{k}{n} - x\right)^2 \binom{n}{k} x^k (1 - x)^{n-k} = \frac{x(1-x)}{n}. \]
[Proof: I is obvious. For II and III consider the binomial expansion
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \]
Differentiate this w.r.t. x and multiply by x/n to obtain
\[ x(x + y)^{n-1} = \sum_{k=0}^{n} \frac{k}{n} \binom{n}{k} x^k y^{n-k}. \]
If you put y = 1 − x now you get II. Differentiate this again with respect to x, multiply by x/n and substitute y = 1 − x to obtain III. Finally, (IV) is verified by expanding out and using I, II, III.]
Step 3 We shall now prove
Lemma 11 Given any continuous function f : [0, 1] → R, the sequence B^f_n of Bernstein polynomials converges uniformly to f on [0, 1].
Given ε > 0, choose (by uniform continuity) δ > 0 such that
\[ |f(x) - f(y)| < \varepsilon/2, \quad \text{for } |x - y| < \delta,\; x, y \in [0, 1]. \]
Now for any x ∈ [0, 1], by (I) above we have
\[ f(x) - B_n(x) = f(x)\sum_{k=0}^{n}\binom{n}{k}x^k(1-x)^{n-k} - \sum_{k=0}^{n} f(k/n)\binom{n}{k}x^k(1-x)^{n-k} = \sum_{k=0}^{n}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} = \sum_{k\in A} + \sum_{k\in B}, \]
where A = {k : |f(x) − f(k/n)| < ε/2} and B = {0, 1, . . . , n} \ A. Note that A and B depend on x. In any case, we have
\[ \left| \sum_{k\in A}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} \right| < \frac{\varepsilon}{2}\sum_{k=0}^{n}\binom{n}{k}x^k(1-x)^{n-k} = \frac{\varepsilon}{2}. \]
It is the second sum on the right that needs more careful handling. For k ∈ B we have |f(x) − f(k/n)| ≥ ε/2 and therefore |x − k/n| ≥ δ. This means (k − nx)² ≥ n²δ². Therefore
\[ \left| \sum_{k\in B}[f(x) - f(k/n)]\binom{n}{k}x^k(1-x)^{n-k} \right| \le 2\|f\|\sum_{k\in B}\binom{n}{k}x^k(1-x)^{n-k}\,\frac{(k-nx)^2}{n^2\delta^2} \]
\[ \le \frac{2\|f\|}{n^2\delta^2}\sum_{k=0}^{n}(k-nx)^2\binom{n}{k}x^k(1-x)^{n-k} = \frac{2\|f\|}{n^2\delta^2}\,nx(1-x) \le \frac{2\|f\|}{n\delta^2}. \]
Luckily this bound is independent of x. All that we have to do now is to choose N such that \frac{2\|f\|}{N\delta^2} < \frac{\varepsilon}{2}, i.e., N > \frac{4\|f\|}{\delta^2\varepsilon}.
Hence for all n ≥ N and all x ∈ [0, 1] we have |f(x) − B_n(x)| < ε/2 + ε/2 = ε. ♠
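Identity (III) and the uniform convergence of Lemma 11 can be checked by direct computation. The following sketch (my own illustration in plain Python; function names are mine) builds B_n^f from the definition and compares it with the closed form for f(x) = x²:

```python
from math import comb

def bernstein(f, n, x):
    # B_n^f(x) = sum_k C(n,k) x^k (1-x)^(n-k) f(k/n)
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n) for k in range(n + 1))

f = lambda t: t * t
for n in (1, 5, 20):
    for x in (0.0, 0.25, 0.5, 0.9, 1.0):
        closed_form = x * x * (1 - 1 / n) + x / n   # identity (III)
        assert abs(bernstein(f, n, x) - closed_form) < 1e-12

# uniform error decreases as n grows (Lemma 11); for f(x)=x^2 it is x(1-x)/n
err = lambda n: max(abs(bernstein(f, n, x / 100) - f(x / 100)) for x in range(101))
assert err(200) < err(10) < err(2)
```

For f(x) = x² the error is exactly x(1 − x)/n, so the sup error on [0, 1] is 1/(4n), matching the 1/n-type decay in the proof above.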
Remark 38 The above lemma actually implies, in probability theory, the so called weak law of large numbers.
Exercise 14 Write down B1, B2, B3 explicitly for f(x) = x2, and f(x) =
x3.
Lecture 24 (Friday 23rd Oct.)
Alternative proof of Weierstrass’s theorem:
As before, we may assume that [a, b] = [0, 1]. We may further assume that f(0) = f(1) = 0, by considering the function g(x) = f(x) − f(0) − x[f(1) − f(0)]. Moreover, we can now extend f all over R by defining it to be 0 outside [0, 1], so that f is uniformly continuous on R.
Lemma 12 For any continuous function f : R → R such that supp f ⊂ [0, 1], define the polynomial functions
\[ P_n(f)(x) = \int_0^1 f(s)\,Q_n(s - x)\,ds \tag{43} \]
where
\[ Q_n(x) = c_n (1 - x^2)^n \]
and the constant cn is chosen so that
\[ \int_{-1}^{1} Q_n(x)\,dx = 1, \quad n \ge 1. \]
Then {Pn(f)} is a sequence of polynomials converging uniformly to the function f on [0, 1].
Proof: For each fixed x ∈ R, the integrand in (43) is a continuous function of s and hence is Riemann integrable on [0, 1]. Also, the integrand is a polynomial in x with coefficients which are continuous functions of s; upon taking the definite integral w.r.t. s, we obtain Pn(f) as a polynomial function in x.
We begin with some estimate of the size of the constants cn.
Claim: cn < √n. Indeed, using (1 − x²)^n ≥ 1 − nx² on [0, 1/√n],
\[ \int_{-1}^{1} (1 - x^2)^n\,dx = 2\int_0^1 (1 - x^2)^n\,dx \ge 2\int_0^{1/\sqrt n} (1 - x^2)^n\,dx \ge 2\int_0^{1/\sqrt n} (1 - nx^2)\,dx = \frac{4}{3\sqrt n} > \frac{1}{\sqrt n}, \]
and since c_n \int_{-1}^{1} (1 - x^2)^n\,dx = 1, it follows that cn < √n.
Now if 0 < δ < 1 then for δ ≤ |x| ≤ 1 we have
\[ Q_n(x) \le \sqrt{n}\,(1 - \delta^2)^n. \]
Since √n(1 − δ²)^n → 0 as n → ∞, Qn → 0 uniformly on δ ≤ |x| ≤ 1.
Next we shall rewrite Pn: putting s = x + t, we get
\[ P_n(f)(x) = \int_{-x}^{1-x} f(x + t)\,Q_n(t)\,dt. \]
Since f = 0 outside [0, 1], we see that for x ∈ [0, 1]
\[ P_n(f)(x) = \int_{-1}^{1} f(x + t)\,Q_n(t)\,dt. \]
Given ε > 0 choose δ > 0 so that
|x− y| < δ implies that |f(x)− f(y)| < ε/2.
Let M = sup{|f(x)| : x ∈ R}. Then for any x ∈ [0, 1],
\[ |P_n(f)(x) - f(x)| = \left| \int_{-1}^{1} [f(x + t) - f(x)]\,Q_n(t)\,dt \right| \le \int_{-1}^{1} |f(x + t) - f(x)|\,Q_n(t)\,dt \]
\[ \le 2M\int_{-1}^{-\delta} Q_n(t)\,dt + \frac{\varepsilon}{2}\int_{-\delta}^{\delta} Q_n(t)\,dt + 2M\int_{\delta}^{1} Q_n(t)\,dt \le 4M\sqrt{n}\,(1 - \delta^2)^n + \frac{\varepsilon}{2} < \varepsilon \]
for sufficiently large n. ♠
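The convolution construction can be tested numerically. The sketch below is my own illustration (trapezoidal quadrature; test function f(s) = s(1 − s) on [0, 1], zero outside, is an assumption for the demo): it computes Pn(f) from (43) and checks that the approximation improves with n on an interior window.

```python
def trap(vals, h):
    # composite trapezoid rule over equally spaced samples
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def f(s):
    # continuous, supported in [0, 1] (illustrative choice)
    return s * (1.0 - s) if 0.0 <= s <= 1.0 else 0.0

def P(n, x, steps=2000):
    # P_n(f)(x) = int_0^1 f(s) c_n (1 - (s - x)^2)^n ds, c_n normalizing the kernel
    h = 2.0 / steps
    kernel = [(1.0 - (-1.0 + i * h) ** 2) ** n for i in range(steps + 1)]
    c = 1.0 / trap(kernel, h)
    hs = 1.0 / steps
    vals = [f(i * hs) * c * (1.0 - (i * hs - x) ** 2) ** n for i in range(steps + 1)]
    return trap(vals, hs)

def sup_err(n):
    xs = [0.2 + 0.015 * i for i in range(41)]   # interior window [0.2, 0.8]
    return max(abs(P(n, x) - f(x)) for x in xs)

assert sup_err(200) < sup_err(20)   # accuracy improves with n
assert sup_err(200) < 0.05
```

The kernel Qn concentrates near 0 like a bump of width about 1/√n, which is exactly the mechanism in the estimate above.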
Remark 39 Given a continuous function f : R → R, it is not true that we can find a sequence of polynomials approximating f all over R. For instance, in the above discussion, the polynomials Pn would obviously diverge to ±∞ as x → ∞, whereas the function f is identically 0 outside [0, 1].
Remark 40 The space B(X) is not only a vector space but is also an algebra, i.e., if f, g ∈ B(X) then fg ∈ B(X). We have earlier remarked that if K is a compact subset of Rn then C(K) is a closed subset of B(K). Indeed we can also verify that C(K) is a subalgebra. More generally we have:
Theorem 79 If A is a subalgebra of B(X) then its closure Ā is a subalgebra of B(X).
Definition 39 Let A be a family of functions on a set X. We say A
separates points in X if given any two distinct points x1, x2 ∈ X there
exists at least one f ∈ A such that f(x1) 6= f(x2). Likewise, we say A
vanishes at no point of X if for each x ∈ X there is at least one f ∈ A
such that f(x) 6= 0.
Example 20 A typical example of A satisfying the above properties is the family of polynomial functions, where X is any subset of Rn. On the other hand, the family of even polynomials on [−1, 1] does not separate points, and the family of odd polynomials vanishes at x = 0.
Theorem 80 Let A be an algebra of (real or complex valued) functions
on a set X which separates points of X and which does not vanish at
any point of X. Given x1 6= x2 and constants c1, c2 there exists f ∈ A
such that f(xj) = cj, j = 1, 2.
Proof: First find functions g, h, k such that
g(x1) 6= g(x2), h(x1) 6= 0, k(x2) 6= 0.
Put
\[ f(x) = c_1\,\frac{(g(x) - g(x_2))\,h(x)}{(g(x_1) - g(x_2))\,h(x_1)} + c_2\,\frac{(g(x) - g(x_1))\,k(x)}{(g(x_2) - g(x_1))\,k(x_2)}. \]
♠
Remark 41 Are you reminded of Newton’s interpolation formula?
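The two-point formula can be sanity-checked with concrete choices (mine, purely for illustration): g(x) = x separates points, and h = k = 1 + x² vanishes nowhere.

```python
# Sanity check of the two-point formula in Theorem 80 with illustrative choices.
g = lambda x: x                # separates x1 and x2
h = lambda x: 1.0 + x * x      # vanishes nowhere
k = h

x1, x2 = 1.0, 3.0
c1, c2 = 5.0, -2.0

def f(x):
    # f = c1*(g - g(x2))h / ((g(x1) - g(x2))h(x1)) + c2*(g - g(x1))k / ((g(x2) - g(x1))k(x2))
    term1 = c1 * (g(x) - g(x2)) * h(x) / ((g(x1) - g(x2)) * h(x1))
    term2 = c2 * (g(x) - g(x1)) * k(x) / ((g(x2) - g(x1)) * k(x2))
    return term1 + term2

assert abs(f(x1) - c1) < 1e-12   # at x1 the second term vanishes, the first is c1
assert abs(f(x2) - c2) < 1e-12   # symmetrically at x2
```

Note that f is built from g, h, k using only products, sums and scalar multiples, so it indeed lies in the algebra generated by them, exactly as the theorem requires.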
Theorem 81 Stone-Weierstrass Theorem Let A be an algebra of bounded real continuous functions on a compact metric space X which separates points of X and vanishes at no point of X. Then C(X) ⊂ Ā, the closure being taken in the uniform metric.
Proof: Step 1: If f ∈ Ā then |f| ∈ Ā.
Let a = sup{|f(x)| : x ∈ X}. Now find polynomials Pn(t) such that |Pn(t) − |t|| < 1/n for −a ≤ t ≤ a (these exist by Weierstrass's theorem). We can also assume that Pn(0) = 0 by considering Qn(t) = Pn(t) − Pn(0); since |Pn(0)| < 1/n, we then have |Qn(t) − |t|| < 2/n for −a ≤ t ≤ a. Consider gn(x) = Qn(f(x)) = c1 f(x) + c2 f²(x) + · · · + ck f^k(x) ∈ Ā (there is no constant term). On the other hand, for all x ∈ X we have
\[ |g_n(x) - |f(x)|| = |Q_n(f(x)) - |f(x)|| < 2/n. \]
This implies gn → |f| uniformly, and since Ā is closed, we are through.
Step 2 If f, g ∈ Ā, then max{f, g}, min{f, g} ∈ Ā.
This follows since
\[ \max\{f, g\} = \frac{f + g + |f - g|}{2}; \qquad \min\{f, g\} = \frac{f + g - |f - g|}{2}. \]
By repeated application of this, it follows that the maximum (or minimum) of finitely many functions in Ā is again in Ā.
Step 3 Let f : X → R be a continuous function and x ∈ X. Given ε > 0 there exists gx ∈ Ā such that gx(x) = f(x) and
\[ g_x(t) > f(t) - \varepsilon, \quad t \in X. \tag{44} \]
Using the property of separation of points and nonvanishing (Theorem 80), for every t ∈ X we have a function ht ∈ A such that ht(x) = f(x), ht(t) = f(t). Since ht − f is continuous and vanishes at t, there is a nbd Vt of t in X such that ht(y) > f(y) − ε for y ∈ Vt. Since X is compact, we get
\[ X \subset V_{t_1} \cup V_{t_2} \cup \cdots \cup V_{t_k}. \]
Put
\[ g_x = \max\{h_{t_1}, \ldots, h_{t_k}\}. \]
Then gx(x) = f(x), and if t ∈ Vti, we have
\[ g_x(t) \ge h_{t_i}(t) > f(t) - \varepsilon. \]
By Step 2, gx ∈ Ā.
Step 4 Given a continuous function f : X → R and ε > 0 there exists g ∈ Ā such that |f(t) − g(t)| < ε, t ∈ X.
For each x ∈ X, let gx ∈ Ā be a function as in Step 3. Since gx − f is continuous and vanishes at x, there is a nbd Ux of x such that gx(t) < f(t) + ε for all t ∈ Ux. Cover X with finitely many Ux1, . . . , Uxm and take g = min{gx1, . . . , gxm}. By Step 2, g ∈ Ā. Since each gxi has the property (44), it follows that g(t) > f(t) − ε, t ∈ X. On the other hand, if t ∈ Uxi then g(t) ≤ gxi(t) < f(t) + ε. Therefore for all t ∈ X we have f(t) − ε < g(t) < f(t) + ε. ♠
Remark 42 The theorem does not hold for algebras of complex valued functions without the additional hypothesis that A is self-adjoint, i.e., closed under conjugation: if f = u + ıv ∈ A then f̄ = u − ıv ∈ A. This can be illustrated by the following example.
Let X = S¹, the unit circle, and let A be the algebra of all polynomial functions with complex coefficients. Then A separates points, and the polynomial z ∈ A does not vanish on X. The function f(z) = 1/z is continuous on X. However, it does not belong to Ā. For, we have ∫_{S¹} P(z)\,dz = 0 for all polynomials P, whereas ∫_{S¹} dz/z = 2πı. If there were a sequence of polynomials uniformly converging to 1/z, then the integral would have been zero, by Theorem 75.
The situation can be saved if we make one more assumption.
Theorem 82 Let X be any compact metric space and A be a self ad-
joint algebra over C, of complex valued continuous functions on X.
Assume that A separates points of X and does not vanish anywhere on
X. Then A contains all continuous complex valued functions on X.
Proof: (Note that A has the additional property: f ∈ A =⇒ ıf, f̄ ∈ A, as compared with an algebra over R; being an algebra over the complex numbers is implicit when we talk about self-adjoint algebras.)
Let AR denote the subspace of all members of A which take only real values. Then AR is a subalgebra which also has the two additional properties. First of all, observe that if f ∈ A then <(f) = (f + f̄)/2 ∈ A and =(f) = (f − f̄)/2ı ∈ A. Therefore <(f), =(f) ∈ AR. Now given x1 ≠ x2 ∈ X, let f ∈ A be such that f(x1) ≠ f(x2). Then <(f)(x1) ≠ <(f)(x2) or =(f)(x1) ≠ =(f)(x2), and accordingly we get some g ∈ AR with g(x1) ≠ g(x2). Similarly, if f ∈ A is such that f(x) ≠ 0, then one of <(f)(x) ≠ 0, =(f)(x) ≠ 0 holds, and so we are done.
Now given any continuous function f : X → C, we can apply the real Stone-Weierstrass theorem to AR to conclude that <(f) ∈ Ā and =(f) ∈ Ā. Therefore f ∈ Ā. ♠
Lecture 25
Solution of IVP a la Picard
The existence and uniqueness of the solution of an Initial Value Problem (IVP)
\[ y' = f(x, y), \quad y(x_0) = y_0 \tag{45} \]
is of fundamental importance in several branches of mathematics, not just in the theory of differential equations. However, it is not taught in any first course in differential equations, since the students do not have the required analysis background; a student may then never take a formal course in differential equations, thereby totally 'missing' this beautiful theorem.
Observe that f is a given real valued function defined in a (rectangular) neighbourhood of the point (x0, y0) ∈ R². By a solution of (45), we mean a once differentiable function φ defined in some neighbourhood of the point x0, say (x0 − δ, x0 + δ), satisfying
\[ \varphi(x_0) = y_0, \quad \text{and} \quad \varphi'(x) = f(x, \varphi(x)), \; x \in (x_0 - \delta, x_0 + \delta). \tag{46} \]
By Fundamental Theorem of Riemann Integration, we can convert (45)
into an integral equation:
\[ y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\,dt \tag{47} \]
and it is in this form Picard came up with his classical solution of this
problem, via the so called iteration method. Here we give a simple
version of this great theorem. Before that, we would like to present the
modern avatar of iteration principle:
Definition 40 Let X be a metric space. By a contraction map on
X we mean a function T : X → X such that there exists a constant
0 < c < 1 such that for x, y ∈ X we have
d(T (x), T (y)) ≤ c d(x, y).
Remark 43 It is easy to see that every contraction mapping is con-
tinuous. The map f(x) = λx on Rn is a contraction iff |λ| < 1. The
most important property of contraction mapping is:
Theorem 83 Contraction Mapping Principle On a complete metric space, every contraction mapping T has precisely one fixed point, i.e., there exists exactly one point t0 ∈ X such that T(t0) = t0.
Proof: First let us prove the uniqueness. If T(t1) = t1 and T(t2) = t2 then we have
\[ d(t_1, t_2) = d(T(t_1), T(t_2)) \le c\, d(t_1, t_2), \]
which is absurd unless t1 = t2. Now starting with any point t ∈ X define
\[ t_1 = T(t),\; t_2 = T(t_1),\; \ldots,\; t_n = T(t_{n-1}),\; \ldots \]
Verify that {tn} is a Cauchy sequence (use d(tn+1, tn) ≤ c^n d(t1, t) and sum the geometric series). Since X is a complete metric space, it follows that tn → t0, say. Then, by continuity of T,
\[ T(t_0) = T(\lim_n t_n) = \lim_n T(t_n) = \lim_n t_{n+1} = t_0. \]
This completes the proof of the theorem. ♠
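A quick numerical illustration (mine, not from the notes): T(x) = cos x maps [0, 1] into itself and is a contraction there, since |T′(x)| = |sin x| ≤ sin 1 < 0.85; the iteration from the proof converges to the unique fixed point from any starting value.

```python
import math

def iterate_to_fixed_point(T, t, tol=1e-12, max_steps=10_000):
    # the iteration t, T(t), T(T(t)), ... from the proof of Theorem 83
    for _ in range(max_steps):
        t_next = T(t)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    raise RuntimeError("no convergence")

T = math.cos   # contraction on [0, 1] with constant c = sin(1) < 1
p = iterate_to_fixed_point(T, 0.0)
assert abs(T(p) - p) < 1e-11          # p is a fixed point
q = iterate_to_fixed_point(T, 1.0)
assert abs(p - q) < 1e-10             # same point from a different start: uniqueness
```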
Remark 44 This principle has the following wonderful interpretation.
Take a map of a country which is ‘to the scale’ and throw it inside the
country. Then there is (exactly) one point on the map which lies exactly
on the point in the country which it represents. You may wonder why
it should be true for countries like USA which has several connected
components but this is true!
Theorem 84 Let R = [a, b] × [c, d], let f : R → R be a continuous real valued function, and let M be a constant such that f satisfies the following Lipschitz condition in the second variable:
\[ |f(x, y_1) - f(x, y_2)| \le M|y_1 - y_2|, \quad (x, y_j) \in [a, b] \times [c, d]. \tag{48} \]
Given a < x0 < b, c < y0 < d, there exist a δ > 0 and a unique function φ which satisfies (46).
Proof: Put K = sup{|f(x, y)| : (x, y) ∈ R}. Choose δ > 0 so that
Mδ < 1, a < x0 − δ < x0 + δ < b and c < y0 − Kδ < y0 + Kδ < d.
Consider the space A = C[x0 − δ, x0 + δ] of all continuous real valued functions on the closed interval. We know that this is a complete metric space. Now consider the subspace B of those φ ∈ A such that
|φ(x)− y0| ≤ Kδ.
Then B is a closed subspace of A and hence is a complete metric space.
It is important to note that B is non empty. (Why?)
We consider the map T : B → B defined by
\[ T(\varphi)(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t))\,dt. \tag{49} \]
By theory of Riemann integration, it follows that T (φ) is continuous.
For x ∈ [x0 − δ, x0 + δ], we have,
\[ |T(\varphi)(x) - y_0| \le \left| \int_{x_0}^{x} f(t, \varphi(t))\,dt \right| \le K|x - x_0| \le K\delta. \]
This implies that T (φ) ∈ B.
Observe that φ ∈ B is a solution of (46) iff T (φ) = φ. Therefore,
our aim is to prove that T is a contraction mapping. Given φj ∈ B
consider
\[ |T(\varphi_1)(x) - T(\varphi_2)(x)| = \left| \int_{x_0}^{x} (f(t, \varphi_1(t)) - f(t, \varphi_2(t)))\,dt \right| \le M \left| \int_{x_0}^{x} |\varphi_1(t) - \varphi_2(t)|\,dt \right| \le M\delta\, d(\varphi_1, \varphi_2), \]
and since this is true for all x ∈ [x0 − δ, x0 + δ], we have
\[ d(T(\varphi_1), T(\varphi_2)) = \sup\{|T(\varphi_1)(x) - T(\varphi_2)(x)| : x \in [x_0 - \delta, x_0 + \delta]\} \le M\delta\, d(\varphi_1, \varphi_2). \]
Since Mδ < 1, T is a contraction, and its unique fixed point is the required solution. This completes the proof of the theorem. ♠
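Picard's iteration (49) can actually be run on a computer. A sketch (my own illustration, with integrals approximated by cumulative trapezoid sums on a grid): for y′ = y, y(0) = 1 on [0, 0.5], the iterates T^k(φ) converge to the exact solution e^x.

```python
import math

def picard_step(phi, xs, h, y0, f):
    # T(phi)(x) = y0 + int_{x0}^{x} f(t, phi(t)) dt, via cumulative trapezoid sums
    vals = [f(x, p) for x, p in zip(xs, phi)]
    out = [y0]
    acc = y0
    for i in range(1, len(xs)):
        acc += 0.5 * h * (vals[i - 1] + vals[i])
        out.append(acc)
    return out

# IVP: y' = y, y(0) = 1 on [0, 0.5]; exact solution exp(x)
f = lambda x, y: y
h = 1e-3
xs = [i * h for i in range(501)]
phi = [1.0] * len(xs)          # start from the constant function y0
for _ in range(25):
    phi = picard_step(phi, xs, h, 1.0, f)

err = max(abs(p - math.exp(x)) for x, p in zip(xs, phi))
assert err < 1e-4
```

Each iteration of the integral operator reproduces one more term of the Taylor series of e^x, which is exactly the geometric-rate convergence promised by the contraction principle.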
Lecture 26. Fourier Series
Some important Exercises of Integration:
Exercise 15 Throughout, let α be a fixed increasing function on
[a, b].
1. Famous Inequalities. Let p > 1 be a real number and define q by 1/p + 1/q = 1.
(a) Show that φ(x) = (1/p)x − x^{1/p}, x ≥ 0, attains its minimum at x = 1. Note that φ(1) = 1/p − 1 = −1/q, so that 1/p + 1/q = 1, and that both p, q > 1. They are called a 'dual pair' of numbers, i.e., q is the dual of p and p is the dual of q. Observe that if p = 2 then q = 2, i.e., 2 is dual to itself.
(b) If u, v ≥ 0 then
\[ uv \le \frac{u^p}{p} + \frac{v^q}{q}. \]
Show that equality holds iff u^p = v^q.
(c) Let f, g ∈ R(α) with f, g ≥ 0 be such that
\[ \int_a^b f^p\,d\alpha = 1 = \int_a^b g^q\,d\alpha. \]
Then show that \int_a^b fg\,d\alpha \le 1.
(d) Let f, g be any complex valued functions in R(α). Then prove Hölder's Inequality:
\[ \left| \int_a^b fg\,d\alpha \right| \le \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} \left( \int_a^b |g|^q\,d\alpha \right)^{1/q}. \]
(e) Schwarz's Inequality. With f, g as in (d), show that
\[ \left| \int_a^b fg\,d\alpha \right| \le \left( \int_a^b |f|^2\,d\alpha \right)^{1/2} \left( \int_a^b |g|^2\,d\alpha \right)^{1/2}. \]
(f) For any u ∈ R(α) and p > 0, define
\[ \|u\|_p := \left[ \int_a^b |u|^p\,d\alpha \right]^{1/p}. \]
For any f, g ∈ R(α), prove Minkowski's Inequality:
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
(g) Show that dp(f, g) = ‖f − g‖p satisfies triangle inequality.
Solution:
(a) φ′(x) = (1/p)(1 − x^{1/p − 1}) = 0 iff x = 1, and φ′′(1) > 0. The conclusion follows.
(b) Put x = up/vq in (a).
(c) f, g ∈ R(α) implies |f|^p, |g|^q ∈ R(α). (Why? Remember how we proved f² ∈ R(α)?) Now by (b), f(x)g(x) ≤ f(x)^p/p + g(x)^q/q. Upon integrating and using the fact that 1/p + 1/q = 1, we are done.
(d) Apply (c) to appropriate multiples of f, g.
(e) Put p = q = 2.
(f) Notice that 1/p + 1/q = 1 with p, q > 0 implies p, q ≥ 1. Put k = \int_a^b (|f| + |g|)^p\,d\alpha. Then
\[ k = \int_a^b (|f| + |g|)(|f| + |g|)^{p-1}\,d\alpha = \int_a^b |f|(|f| + |g|)^{p-1}\,d\alpha + \int_a^b |g|(|f| + |g|)^{p-1}\,d\alpha \]
\[ \le \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} \left( \int_a^b (|f| + |g|)^{(p-1)q}\,d\alpha \right)^{1/q} + \left( \int_a^b |g|^p\,d\alpha \right)^{1/p} \left( \int_a^b (|f| + |g|)^{(p-1)q}\,d\alpha \right)^{1/q} \]
\[ = \left[ \left( \int_a^b |f|^p\,d\alpha \right)^{1/p} + \left( \int_a^b |g|^p\,d\alpha \right)^{1/p} \right] k^{1/q}, \]
because (p − 1)q = p. The result follows upon dividing by k^{1/q}.
(g) Easy.
2. Let f ∈ R(α) on [a, b]. Given ε > 0 show that there exists a
continuous function g : [a, b] → R such that ‖f − g‖2 < ε.
Solution: We have seen that f ∈ R(α) implies that f² ∈ R(α) too. So choose a partition P = {a = a0, . . . , an = b} such that for it (and for all refinements of it as well) we have
\[ \sum_i |f(t_i) - f(s_i)|^2 \Delta\alpha_i < \varepsilon^2/4, \quad \text{for all } t_i, s_i \in [a_{i-1}, a_i]. \]
Put
\[ g(t) = \frac{a_i - t}{\Delta a_i} f(a_{i-1}) + \frac{t - a_{i-1}}{\Delta a_i} f(a_i), \quad a_{i-1} \le t \le a_i, \]
where Δai = ai − ai−1. Then clearly g is continuous. For ai−1 ≤ ti ≤ ai we have
\[ f(t_i) - g(t_i) = \frac{a_i - t_i}{\Delta a_i}\,(f(t_i) - f(a_{i-1})) + \frac{t_i - a_{i-1}}{\Delta a_i}\,(f(t_i) - f(a_i)). \]
Therefore
\[ |f(t_i) - g(t_i)| \le |f(t_i) - f(a_{i-1})| + |f(t_i) - f(a_i)| \le 2|f(t_i) - f(s_i)|, \]
where si = ai or ai−1, whichever gives the larger difference. Therefore
\[ \sum_i |f(t_i) - g(t_i)|^2 \Delta\alpha_i \le 4\sum_i |f(t_i) - f(s_i)|^2 \Delta\alpha_i < \varepsilon^2. \]
Definition 41 A function f : R → R, (C) is called periodic with pe-
riod λ > 0 if f(x + λ) = f(x) for all x ∈ R.
As an immediate corollary of Theorem 82, we have
Theorem 85 Let f : R → R be a continuous function with the prop-
erty f(x + 2π) = f(x) for all x ∈ R. Then there exists a sequence
\[ S_N(x) = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx), \quad a_0, a_n, b_n \in R, \tag{50} \]
which converges uniformly to f on the whole of R.
Proof: Functions of the above form SN are called trigonometric poly-
nomials. Notice that each summand that occurs on the RHS of the
formula for SN has the property
g(x + 2π) = g(x), x ∈ R.
Such functions are called periodic with period 2π. The important thing
to note about them is that their behavior on R is completely known by
their behaviour on any interval of length (≥) 2π.
If we allow complex coefficients a0, an, bn in (50) then using the
identities
\[ \cos x = \frac{e^{\imath x} + e^{-\imath x}}{2}, \qquad \sin x = \frac{e^{\imath x} - e^{-\imath x}}{2\imath}, \]
it follows that we can rewrite (50) in the form
\[ S_N(x) = \sum_{n=-N}^{N} c_n e^{\imath n x}, \quad c_n \in C. \tag{51} \]
Let A denote the collection of all such functions sN . Check that A is
a self-adjoint algebra of continuous functions on the whole of R (but
we shall consider these functions on the closed interval [−π, π]). Also
check that this algebra separates points of [−π, π] and does not vanish
anywhere (since it contains constant functions). Therefore its closure
contains the space C[−π, π].
Now given any continuous periodic function f : R → R with period
2π restrict f : [−π, π] → R. Now by what we have concluded above,
we get a sequence {sN(x)} ∈ A (with coefficients a0, an, bn ∈ C) which
uniformly converges to f. Upon rewriting it in terms of cos nx and
sin nx and taking the real part the theorem follows. ♠The above theorem prods us into studying many related concepts
which lead us to the so called Theory of Fourier series. We shall only
give a few basics of this vast theory here depending only on the math-
ematics that we have developed so far. Full justification to this topic
cannot be done without the support of Lebesgue's theory.
Lemma 13 Let n be an integer. Then
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{\imath n x}\,dx = \begin{cases} 1 & \text{if } n = 0; \\ 0 & \text{otherwise.} \end{cases} \tag{52} \]
Definition 42 By a trigonometric series we mean a sum of the form
\[ \sum_{n=-\infty}^{\infty} c_n e^{\imath n x} \tag{53} \]
whose N-th partial sum S_N is given by (51). Given a Riemann integrable function f on [−π, π], and an integer n, we define its n-th Fourier coefficient by the formula
\[ c_n(f) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-\imath n x}\,dx. \tag{54} \]
The Fourier series (also called trigonometric series) associated to f is defined to be \sum_{n=-\infty}^{\infty} c_n(f) e^{\imath n x}. We express this often by
\[ f \sim \sum_{n=-\infty}^{\infty} c_n(f) e^{\imath n x}. \tag{55} \]
Remark 45 We observe that if S_N is a trigonometric polynomial as in (51), then c_n(S_N) = c_n for |n| ≤ N and c_n(S_N) = 0 for |n| > N. Thus the Fourier series of S_N reduces to a trigonometric polynomial. One of the fundamental problems in the theory is: when can we write '=' in place of '∼' in (55)? Of course there are many subquestions related to this as well, viz., what should be the meaning of '=' here. For instance, it is clear that at all cost we should insist that the RHS converges. If the convergence is uniform then it follows that the function represented is periodic and moreover continuous. The first property is desirable whereas the second one is NOT. The applications that we have in mind involve, more often than not, functions which have discontinuities.
For instance, if the series (53) converges to some function f, then we would like the so called Euler's formula
\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-\imath n x}\,dx \]
to be true. If we grant uniform convergence, then term-by-term integration is valid, and hence using (52) one easily checks this property. This is similar to the case of an analytic function, whose n-th derivative at 0 determines the coefficient of x^n. For trigonometric series, or for more general Fourier series, we are looking for similar properties under conditions more general than uniform convergence.
Definition 43 Let {φj} be a family of complex valued integrable functions on [a, b] with the property
\[ \int_a^b \varphi_j(x)\,\overline{\varphi_k(x)}\,dx = 0, \quad j \ne k. \tag{56} \]
Then we say {φj} is an orthogonal family of functions. If in addition
\[ \int_a^b |\varphi_j(x)|^2\,dx = 1 \tag{57} \]
we call it an orthonormal family.
Example 21 We have seen that the family {e^{\imath n x}/\sqrt{2\pi}} is an orthonormal family on [−π, π]. Similarly,
\[ \left\{ \frac{1}{\sqrt{2\pi}},\; \frac{\cos x}{\sqrt{\pi}},\; \frac{\sin x}{\sqrt{\pi}},\; \frac{\cos 2x}{\sqrt{\pi}},\; \frac{\sin 2x}{\sqrt{\pi}},\; \cdots \right\} \]
is also an orthonormal family on [−π, π]. (Note the normalization: \int_{-\pi}^{\pi} \cos^2 nx\,dx = \pi, so the trigonometric functions are divided by \sqrt{\pi}.)
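Orthonormality of the exponential family can be verified by quadrature. The sketch below (an editor-added illustration) checks (56) and (57) for {e^{inx}/√(2π)} using the composite trapezoid rule, which for these periodic integrands is exact up to rounding:

```python
import cmath, math

def inner(n, m, samples=1024):
    # (1/2pi) * int_{-pi}^{pi} e^{inx} * conj(e^{imx}) dx by the trapezoid rule;
    # for periodic integrands this equals the exact integral up to rounding
    h = 2.0 * math.pi / samples
    total = 0.0 + 0.0j
    for i in range(samples):
        x = -math.pi + i * h
        total += cmath.exp(1j * n * x) * cmath.exp(1j * m * x).conjugate()
    return total * h / (2.0 * math.pi)

for n in range(-3, 4):
    for m in range(-3, 4):
        expected = 1.0 if n == m else 0.0   # Kronecker delta, as in Lemma 13
        assert abs(inner(n, m) - expected) < 1e-12
```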
Definition 44 Given an integrable function f on [a, b] we define
\[ c_j(f) := \int_a^b f(x)\,\overline{\varphi_j(x)}\,dx \tag{58} \]
to be the j-th Fourier coefficient of f with respect to the family {φj}. Moreover, the formal sum \sum_j c_j(f)\varphi_j(x) is then called the Fourier series of f with respect to {φj}, and we express this by
\[ f(x) \sim \sum_j c_j(f)\varphi_j(x). \]
For any two integrable functions f, g on [a, b], let us write
\[ \langle f, g \rangle = \int_a^b f\,\bar g\,dx. \]
Also let us write
\[ \|f\|_2 = \sqrt{\langle f, f \rangle}. \]
Theorem 86 Pythagoras' theorem: If 〈f, g〉 = 0 then
\[ \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2. \]
Proof: Direct.
Theorem 87 Least Square Approximation Let f be an integrable function on [a, b]. Let {φn} be an orthonormal system and let
\[ s_n(x) := \sum_{m=1}^{n} c_m \varphi_m(x) \]
be the n-th partial sum of the Fourier series of f. Then for all
\[ t_n(x) = \sum_{m=1}^{n} \gamma_m \varphi_m(x) \]
we have
\[ \int_a^b |f - s_n|^2\,dx \le \int_a^b |f - t_n|^2\,dx \tag{59} \]
with equality holding iff γm = cm for all 1 ≤ m ≤ n.
Proof: Check that f − sn is orthogonal to sn − tn and use the above theorem to conclude that
\[ \|f - t_n\|_2^2 = \|f - s_n\|_2^2 + \|s_n - t_n\|_2^2. \]
This proves (59). As for the last part, repeated application of Pythagoras' theorem yields
\[ \|s_n - t_n\|_2^2 = \sum_{m=1}^{n} |c_m - \gamma_m|^2, \]
from which the conclusion follows. ♠
Theorem 88 Bessel's Inequality: For any integrable function f on [a, b], if f ∼ \sum_m c_m \varphi_m then
\[ \sum_n |c_n|^2 \le \|f\|_2^2. \]
Proof: Putting tn = 0 in the proof of the above theorem, we first obtain that f − sn is orthogonal to sn. (Or do this directly afresh.) Again by Pythagoras' theorem, we get
\[ \|f\|_2^2 = \|f - s_n\|_2^2 + \|s_n\|_2^2. \]
The conclusion follows. ♠

In particular, we have the so called
Theorem 89 Riemann-Lebesgue theorem: For any integrable function f on [−π, π], the sequence of Fourier coefficients converges to 0:
\[ \lim_{k\to\infty} \int_{-\pi}^{\pi} f(t) \cos kt\,dt = 0; \qquad \lim_{k\to\infty} \int_{-\pi}^{\pi} f(t) \sin kt\,dt = 0. \tag{60} \]
Proof: Bessel's inequality implies that lim_{n→±∞} c_n = 0. The two integrals above are, up to constant factors, c_k + c_{−k} and c_k − c_{−k} respectively, so both tend to 0. ♠
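The decay of these oscillatory integrals can be observed numerically (my illustration, not from the notes): for f(t) = t, one computes by parts that ∫_{−π}^{π} t sin kt dt = 2π(−1)^{k+1}/k, which tends to 0 as k → ∞.

```python
import math

def osc_integral(k, samples=200_000):
    # trapezoid approximation of int_{-pi}^{pi} t*sin(k t) dt
    h = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples + 1):
        t = -math.pi + i * h
        w = 0.5 if i in (0, samples) else 1.0
        total += w * t * math.sin(k * t)
    return total * h

for k in (1, 10, 100):
    exact = 2.0 * math.pi * (-1) ** (k + 1) / k   # integration by parts
    assert abs(osc_integral(k) - exact) < 1e-4

assert abs(osc_integral(100)) < 0.1   # Riemann-Lebesgue decay, O(1/k) here
```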
Lecture 28
Theorem 90 Parseval's Theorem: Let f, g be integrable functions with period 2π. Put
\[ f(x) \sim \sum_{m=-\infty}^{\infty} c_m e^{\imath m x}; \qquad g(x) \sim \sum_{m=-\infty}^{\infty} \gamma_m e^{\imath m x}. \]
Then
(i) \lim_{N\to\infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - s_N(f; x)|^2\,dx = 0;
(ii) \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,\overline{g(x)}\,dx = \sum_{m=-\infty}^{\infty} c_m \bar\gamma_m;
(iii) \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{m=-\infty}^{\infty} |c_m|^2.
Proof: We shall denote \|h\|_2 = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |h(x)|^2\,dx \right)^{1/2}. Since f is
integrable and f(−π) = f(π), from a previous exercise 15.2, given
ε > 0, we have a continuous 2π-periodic function h such that
‖f − h‖2 < ε.
By Theorem 85 above, there is a trigonometric polynomial
\[ P = \sum_{m=-N}^{N} \gamma_m e^{\imath m x}, \]
of degree N, say, such that |P(x) − h(x)| < ε for all x ∈ [−π, π] and hence ‖P − h‖₂ < ε.
Let us use a slightly modified notation: for any g ∈ R(α)[−π, π],
\[ s_n(g) := \sum_{k=-n}^{n} c_k(g) e^{\imath k x}. \]
By Least Square Approximation, it follows that
‖h− sn(h)‖2 ≤ ‖h− P‖2 < ε, for n ≥ N.
Also Bessel’s inequality, we have,
‖sn(h)− sn(f)‖2 = ‖sn(h− f)‖2 ≤ ‖h− f‖2 < ε.
Finally by Triangle inequality, we have
‖f − sn(f)‖2 ≤ ‖f − h‖2 + ‖h− sn(h)‖2 + ‖sn(h)− sn(f)‖2 < 3ε
for all n ≥ N. This proves (i).
To prove (ii), we first observe that at the finite sum level we have
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} s_N(f)\,\bar g\,dx = \frac{1}{2\pi} \sum_{n=-N}^{N} c_n \int_{-\pi}^{\pi} e^{\imath n x}\,\overline{g(x)}\,dx = \sum_{n=-N}^{N} c_n \bar\gamma_n. \]
Therefore, using Schwarz's inequality, we get
\[ \left| \int f\bar g - \int s_N(f)\,\bar g \right| \le \int |f - s_N(f)|\,|g| \le \left( \int |f - s_N(f)|^2 \right)^{1/2} \left( \int |g|^2 \right)^{1/2}. \]
Letting N → ∞ and using (i), we get (ii). (iii) follows from (ii) by putting g = f. ♠

Convergence problem for Trigonometric Series
We shall from now on deal only with trigonometric series, and consider functions f with period 2π which are Riemann integrable over [−π, π].
Consider the trigonometric polynomial with all its coefficients equal to 1. (By analogy, this plays the role of the polynomial which is the n-th partial sum of the geometric series for (1 − x)^{−1}.) The trigonometric polynomial
\[ D_N(x) = \sum_{n=-N}^{N} e^{\imath n x} \]
is called the Dirichlet kernel. Multiplying it by e^{\imath x} − 1 we get
\[ (e^{\imath x} - 1)\,D_N(x) = e^{\imath(N+1)x} - e^{-\imath N x}. \]
Multiplying further by e^{-\imath x/2} we get
\[ 2\imath \sin(x/2)\,D_N(x) = 2\imath \sin\left((N + 1/2)x\right). \]
Therefore
\[ D_N(x) = \frac{\sin\left((N + 1/2)x\right)}{\sin(x/2)}. \tag{61} \]
Another interesting property of the Dirichlet kernel is that
\[ \int_{-\pi}^{\pi} D_N(t)\,dt = 2\pi. \tag{62} \]
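Both (61) and (62) are easy to confirm numerically (an editor-added illustration):

```python
import math, cmath

def dirichlet_sum(N, x):
    # D_N(x) = sum_{n=-N}^{N} e^{inx}; the sum is real by symmetry
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    # D_N(x) = sin((N + 1/2) x) / sin(x/2), valid when x is not a multiple of 2*pi
    return math.sin((N + 0.5) * x) / math.sin(0.5 * x)

for N in (1, 3, 7):
    for x in (0.1, 1.0, 2.5, -1.7):
        assert abs(dirichlet_sum(N, x) - dirichlet_closed(N, x)) < 1e-10

# (62): the integral over [-pi, pi] is 2*pi, since only the n = 0 term survives
h = 2 * math.pi / 4096
integral = sum(dirichlet_sum(5, -math.pi + i * h) for i in range(4096)) * h
assert abs(integral - 2 * math.pi) < 1e-9
```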
Given any f ∈ R(α)[−π, π] we can rewrite s_N(f) in terms of the Dirichlet kernel:
\[ s_N(f)(x) = \sum_{n=-N}^{N} \frac{1}{2\pi} \left( \int_{-\pi}^{\pi} f(t) e^{-\imath n t}\,dt \right) e^{\imath n x} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{n=-N}^{N} e^{\imath n (x - t)}\,dt \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,D_N(x - t)\,dt = \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} f(x - s)\,D_N(s)\,ds = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - s)\,D_N(s)\,ds, \]
the last equality being the result of the periodicity of the integrand.
We shall now prove a local convergence theorem:
Theorem 91 Suppose for some x there exist δ > 0 and M < ∞ such that
\[ |f(x + t) - f(x)| \le M|t|, \quad t \in (-\delta, \delta). \tag{63} \]
Then
\[ \lim_{N\to\infty} s_N(f; x) = f(x). \]
Proof: Put
\[ g(t) = \begin{cases} \dfrac{f(x - t) - f(x)}{\sin(t/2)}, & 0 < |t| \le \pi; \\[1ex] 0, & t = 0. \end{cases} \]
We first note that g ∈ R(α)[−π, π]. [Let us prove that g satisfies Riemann's condition on [0, π], the proof for the interval [−π, 0] being the same. Since t/sin(t/2) → 2 as t → 0, we can choose δ1 > 0 such that |t/sin(t/2)| < 4 for 0 < |t| ≤ δ1; by (63), |g(t)| ≤ M|t|/|sin(t/2)| ≤ 4M there. Now choose δ2 = min{δ, δ1, ε/16M}, so that the contribution of [0, δ2] to the difference of upper and lower sums is at most 2 · 4M · δ2 ≤ ε/2. On [δ2, π], g is integrable, and hence we can find a partition P := {δ2 = a1 < a2 < · · · < an = π} for which g satisfies Riemann's condition for ε/2. It then follows that for the partition Q := {0 < δ2 = a1 < · · · < an}, g satisfies Riemann's condition on the interval [0, π] for ε.]
Using (62) we get
\[ s_N(f; x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x - t) - f(x)]\,D_N(t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\sin(t/2)\,D_N(t)\,dt \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\sin\left((N + 1/2)t\right) dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t)\left[\sin(t/2)\cos Nt + \cos(t/2)\sin Nt\right] dt = \alpha_N + \beta_N, \]
where αN and βN are, up to constant factors, the N-th cosine coefficient of g(t) sin(t/2) and the N-th sine coefficient of g(t) cos(t/2), respectively. Both these functions are Riemann integrable on the closed interval, since g is. Therefore, by the Riemann-Lebesgue theorem (89), it follows that αN → 0, βN → 0 as N → ∞. ♠
Remark 46 It follows that if f ∈ C2 then it satisfies (63) and hence the
Fourier series is convergent. However, by carrying out integration by
parts twice and using Weierstrass’s majorant criterion, one can directly
prove that the Fourier series is uniformly convergent to a function g.
But then term-by-term integration is valid and hence it follows that
the function g is equal to f.
Lemma 14 Let g ∈ R(α)[0, π]. Then
\[ \lim_{N\to\infty} \int_0^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds = 0. \tag{64} \]
Proof: Extend g over all of [−π, π] by defining g(t) = 0 for t ∈ [−π, 0). Then g ∈ R(α)[−π, π] and we have
\[ \int_0^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds = \int_{-\pi}^{\pi} g(s) \sin\left[(N + 1/2)s\right] ds. \]
Use the fact that
\[ \sin\left[(N + 1/2)s\right] = \sin Ns \cos(s/2) + \cos Ns \sin(s/2) \]
and appeal to Theorem 89. ♠
Theorem 92 Let f ∈ R(α)[−π, π] and let x ∈ [−π, π]. Assume that
f(x±), f ′(x±) exist. Then the Fourier series for f at x will converge to
[f(x+) + f(x−)]/2.
Proof: The hypothesis that f′(x+), f′(x−) exist implies that f satisfies the following Lipschitz conditions:
\[ |f(x + t) - f(x+)| \le Mt, \quad 0 \le t \le \delta, \]
and
\[ |f(x - t) - f(x-)| \le Mt, \quad 0 \le t \le \delta, \]
for some M, δ > 0.
Now we use the property D_N(−s) = D_N(s) to see that
\[ s_N(f; x) = \frac{1}{2\pi} \int_0^{\pi} [f(x + s) + f(x - s)]\,D_N(s)\,ds. \]
Therefore
\[ s_N(f; x) - \frac{f(x+) + f(x-)}{2} = \frac{1}{2\pi} \int_0^{\pi} [f(x + s) + f(x - s) - f(x+) - f(x-)]\,D_N(s)\,ds \]
\[ = \frac{1}{2\pi} \int_0^{\pi} (f(x + s) - f(x+))\,D_N(s)\,ds + \frac{1}{2\pi} \int_0^{\pi} (f(x - s) - f(x-))\,D_N(s)\,ds \]
\[ = \frac{1}{2\pi} \int_0^{\pi} g_+(s)\sin\left[(N + 1/2)s\right] ds + \frac{1}{2\pi} \int_0^{\pi} g_-(s)\sin\left[(N + 1/2)s\right] ds, \]
where g± are defined in a similar way as in the proof of the above theorem:
\[ g_\pm(s) = \begin{cases} \dfrac{f(x \pm s) - f(x\pm)}{\sin(s/2)}, & 0 < s \le \pi; \\[1ex] 0, & s = 0. \end{cases} \]
Exactly as in the above theorem, it follows that g± ∈ R(α)[0, π]. By the lemma above, each of the terms on the RHS converges to 0 and we are through. ♠

(C, 1) Summability of Fourier series
Given f ∈ R(α)[−π, π], let us discuss the (C, 1)-summability of the series Σ cn(f)e^{ınx}. We consider the sequence
\[ \sigma_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} s_k(f; x) \]
and ask the question: under what conditions does
\[ \lim_{n\to\infty} \sigma_n(x) = f(x)? \]
Thus it is natural to consider the sequence of sums
\[ K_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} D_k(x). \]
These functions are called Fejér kernels. We have
\[ K_n(x) = \frac{1}{n \sin(x/2)} \sum_{k=0}^{n-1} \sin\left((k + 1/2)x\right) = \frac{\sin^2(nx/2)}{n \sin^2(x/2)}. \]
Also observe that from (62) it follows that
\[ \int_{-\pi}^{\pi} K_n(x)\,dx = 2\pi. \tag{65} \]
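The closed form of the Fejér kernel, and its non-negativity (which is what makes the (C, 1) means better behaved than the partial sums), can be confirmed numerically (my illustration):

```python
import math

def fejer_avg(n, x):
    # K_n(x) = (1/n) * sum_{k=0}^{n-1} D_k(x), with D_k(x) = sin((k+1/2)x)/sin(x/2)
    return sum(math.sin((k + 0.5) * x) / math.sin(0.5 * x) for k in range(n)) / n

def fejer_closed(n, x):
    # K_n(x) = sin^2(n x / 2) / (n * sin^2(x / 2))
    return math.sin(0.5 * n * x) ** 2 / (n * math.sin(0.5 * x) ** 2)

for n in (1, 2, 5, 12):
    for x in (0.3, 1.1, 2.9, -0.7):
        assert abs(fejer_avg(n, x) - fejer_closed(n, x)) < 1e-10
        assert fejer_closed(n, x) >= 0.0    # the kernel is non-negative
```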
Theorem 93 Let f ∈ R(α)[−π, π] and let x ∈ (−π, π) be such that f is continuous at x. Then the Fourier series of f is (C, 1)-summable to f(x) at x.
Proof: We have to show that σn(x) → f(x). As before, this is the same as showing
\[ \lim_{n\to\infty} \int_0^{\pi} [f(x + t) + f(x - t) - 2f(x)]\,K_n(t)\,dt = 0. \]
By continuity of f at x, we can find 0 < δ < |π − x| such that for 0 ≤ t ≤ δ we have
\[ |f(x + t) + f(x - t) - 2f(x)| < \varepsilon/2. \]
On the other hand, for δ ≤ t ≤ π we have
\[ K_n(t) = \frac{\sin^2(nt/2)}{n \sin^2(t/2)} \le \frac{1}{n \sin^2(\delta/2)}, \]
and hence for sufficiently large n we can make
\[ \left| \int_\delta^{\pi} [f(x + t) + f(x - t) - 2f(x)]\,K_n(t)\,dt \right| \le \frac{4\pi M}{n \sin^2(\delta/2)} < \varepsilon/2, \]
where M = sup |f|. The theorem follows. ♠
Remark 47 If x is one of the end points ±π, then the continuity of f at x should be interpreted to mean that f(−π) = f(π) and that the extended function defined by f(x + 2π) = f(x) all over R is continuous at x = π. With this meaning the above arguments go through in this case also. Further, if f is continuous on the whole of [−π, π] (and f(−π) = f(π)), then the choice of δ in the above proof can be made independent of x, and so can the choice of n. This yields:
Theorem 94 Let f be a periodic continuous function. Then the Fourier
series of f (C, 1)-converges uniformly to f all over R.
Exercise 16 1. Let f : R → R be a non constant function such that
f(x + y) = f(x) + f(y) for all x, y ∈ R.
(i) If f is continuous at x = 0 show that it is continuous on R.
(ii) Determine all such continuous f.
2. Let f : R → R be a non constant function such that
f(x + y) = f(x)f(y) for all x, y ∈ R.
(i) If f is continuous at x = 0 show that it is continuous on R.
(ii) Determine all such continuous f.
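Before solving Exercises 1 and 2, one can at least verify that the expected families of solutions do satisfy the two functional equations. The sketch below is only a sanity check with an arbitrarily chosen constant c (our choice, purely illustrative); it is not the requested proof.

```python
import math

# Sanity check for Exercises 1 and 2 (illustration, not a proof):
# f(x) = c*x satisfies the additive equation f(x+y) = f(x) + f(y), and
# g(x) = exp(c*x) satisfies the multiplicative one g(x+y) = g(x)*g(y).

c = 1.7  # an arbitrary nonzero constant

f = lambda x: c * x
g = lambda x: math.exp(c * x)

for x, y in [(0.3, -1.2), (2.0, 5.5), (-0.7, 0.7)]:
    assert abs(f(x + y) - (f(x) + f(y))) < 1e-12
    assert abs(g(x + y) - g(x) * g(y)) < 1e-9
```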
3. Apply Parseval's theorem to the function f(x) = x, 0 ≤ x < 2π, and obtain the value of ∑_{n=1}^∞ 1/n².
4. Prove that on [−π, π],

(π − |x|)² = π²/3 + ∑_{n=1}^∞ (4/n²) cos nx.

Evaluate ∑_{n=1}^∞ 1/n² and ∑_{n=1}^∞ 1/n⁴.
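The sums that Exercise 4 leads to are the classical values ∑ 1/n² = π²/6 (set x = 0 in the identity) and ∑ 1/n⁴ = π⁴/90 (via Parseval's theorem). A quick numerical check of these values, added here as an illustration and not as a substitute for the derivation:

```python
import math

# Check the classical values one obtains in Exercise 4:
# sum 1/n^2 = pi^2/6 and sum 1/n^4 = pi^4/90, via large partial sums.

def zeta_partial(s, terms=100_000):
    """Partial sum of sum_{n>=1} 1/n^s over the first `terms` terms."""
    return sum(1 / n**s for n in range(1, terms + 1))

# The tail of the s = 2 series is about 1/terms, so 1e-4 is a safe tolerance.
assert abs(zeta_partial(2) - math.pi**2 / 6) < 1e-4
assert abs(zeta_partial(4) - math.pi**4 / 90) < 1e-10
```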
5. Integration by Parts: Let α be an increasing function on [a, b]. Suppose f(x) = F′(x) on [a, b]. Then

∫_a^b α(x)f(x) dx = F(b)α(b) − F(a)α(a) − ∫_a^b F dα.
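A numerical sanity check of the integration-by-parts formula, for the smooth case where dα = α′(x) dx (the particular choices of F, f, and α below are ours, purely illustrative):

```python
import math

# Illustrative check of integration by parts with F(x) = sin(x),
# f(x) = F'(x) = cos(x), and the increasing alpha(x) = x^2 on [0, 1].
# Since alpha is smooth here, the Stieltjes integral of F d(alpha)
# reduces to the Riemann integral of F(x) * alpha'(x).

def riemann(g, a, b, n=200_000):
    """Composite midpoint approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 1.0
F, f = math.sin, math.cos
alpha = lambda x: x * x    # increasing on [0, 1]
dalpha = lambda x: 2 * x   # alpha'(x), so F d(alpha) = F(x) * 2x dx

lhs = riemann(lambda x: alpha(x) * f(x), a, b)
rhs = F(b) * alpha(b) - F(a) * alpha(a) - riemann(lambda x: F(x) * dalpha(x), a, b)
assert abs(lhs - rhs) < 1e-8
```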
Lecture 32. Cantor set

Here we shall define an operator ||∞ from the class of all closed intervals [a, b], a < b ∈ R, to the class of compact subsets of R. Given any closed interval J = [a, b], let us define φ(J) to be the set obtained by deleting the middle-1/3 open interval of J from J. That is,

φ(J) := J \ (a + (b − a)/3, a + 2(b − a)/3).
For any set A which is a finite union of disjoint closed intervals, A = ∪_{i=1}^k [a_i, b_i], define

φ(A) = ∪_i φ([a_i, b_i]).
Put I_0 = [a, b] and inductively put I_n = φ(I_{n−1}), n ≥ 1. We then have a decreasing sequence of closed subsets

I_0 ⊃ I_1 ⊃ · · · ⊃ I_n ⊃ · · ·
Put

||∞[a, b] := ∩_n I_n.

The function ||∞ is called Cantor's construction. The set C = ||∞[0, 1] is called the Cantor set. We shall call ||∞[a, b] a Cantor set for every closed interval [a, b]. These sets have wonderful properties:
(a) ||∞[a, b] is a non empty compact subset of [a, b].
(b) If J is one of the connected components of In for some n then
||∞(J) ⊂ ||∞[a, b].
(c) a, b ∈ ||∞[a, b].
(d) Let f(x) = a + (b− a)x. Then f induces a continuous bijection of
C = ||∞[0, 1] with ||∞[a, b].
From now onward we shall specialize to C = ||∞[0, 1]. Each of the
properties of C which we list below is carried over to an identical or
similar property of ||∞[a, b] by the similarity map f above.
(e) The end points of every component of I_n, n ≥ 0, are in C.
(f) The set of all rationals of the form ∑_{k=1}^n a_k/3^k, where each a_k = 0 or 2, is contained in C.
(g) C contains no open intervals.
(h) Every point of C is a limit point of C. (Such closed subsets of R^n are called perfect sets.)
(i) C is uncountable.
(j) C is totally disconnected.
(k) C is of length zero.
Proof: (a)-(d) Obvious.
(e) This is an easy consequence of (b) and (c).
(f) This is just the restatement of (e).
(g) Let J = (c, d) be any open interval contained in [0, 1]. Choose n so that d − c > 2/3^n. Then for some i with 0 ≤ i < 3^n, J_1 := [i/3^n, (i + 1)/3^n] ⊂ J. It follows that I_{n+1} does not contain the middle-1/3 of J_1 and hence J ⊄ I_{n+1}.
(h) Let x ∈ C and let J be an interval around x. If n is chosen as above, there is a unique i with 0 ≤ i < 3^n such that x ∈ [i/3^n, (i + 1)/3^n] = J_1. Now both the end points of J_1 are in C, and one of them, not equal to x, has to be inside J. Hence J contains a point of C other than x.
(i) This can be deduced from the fact that C is a perfect set. Here is an easier way. From (f), since C is closed, it follows that every number
represented as an infinite sum

∑_{k=1}^∞ a_k/3^k, with each a_k ∈ {0, 2},

belongs to C. Let A be the set of all sequences α : N → {0, 2}. We know that A is uncountable. The assignment

(a_k) ↦ ∑_{k=1}^∞ a_k/3^k

defines an injective mapping of A into C.
(j) Given any two points x < y in C, since by (g) the interval [x, y] is not contained in C, there exists z ∉ C such that x < z < y. Then {[0, z] ∩ C, [z, 1] ∩ C} defines a separation of C.
(k) This follows from the fact that the deleted middle thirds have total length ∑_{n=1}^∞ 2^{n−1}/3^n = 1.
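The construction and properties (f) and (k) can be simulated exactly with rational arithmetic. Below is a hedged computational sketch (not part of the notes; the function names are ours): it represents each I_n as a list of disjoint closed intervals with Fraction endpoints, applies the middle-thirds deletion φ repeatedly, and checks both the total-length formula behind (k) and a membership instance of (f).

```python
from fractions import Fraction

# Exact simulation of the middle-thirds construction on [0, 1]:
# sets are kept as lists of disjoint closed intervals (pairs of Fractions).

def phi(intervals):
    """Delete the open middle third of every interval in the list."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((a + 2 * third, b))
    return out

def iterate(n):
    """I_n: the n-th stage of the construction starting from I_0 = [0, 1]."""
    I = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        I = phi(I)
    return I

def total_length(intervals):
    return sum(b - a for a, b in intervals)

# Behind property (k): the total length of I_n is (2/3)^n, which tends to 0,
# i.e. the deleted lengths sum_{n>=1} 2^{n-1}/3^n add up to 1.
for n in range(6):
    assert total_length(iterate(n)) == Fraction(2, 3) ** n

# Property (f): a point sum a_k/3^k with digits a_k in {0, 2} lies in every
# stage I_n; check it for the sample digit string (0, 2, 2, 0, 2).
x = sum(Fraction(a, 3**k) for k, a in enumerate((0, 2, 2, 0, 2), start=1))
for n in range(1, 6):
    assert any(a <= x <= b for a, b in iterate(n))
```

Using Fraction rather than floats keeps every endpoint exact, so the length identity holds with equality instead of up to rounding error.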