Fundamentals of Analysis


FUNDAMENTALS OF ANALYSIS

W W L CHEN

© W W L Chen, 1983, 2008.

This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990.

It is available free to all individuals, on the understanding that it is not to be used for financial gain,

and may be downloaded and/or photocopied, with or without permission from the author.

However, this document may not be kept on any information storage and retrieval system without permission

from the author, unless such system is not accessible to any individuals other than its owners.

Chapter 1

THE NUMBER SYSTEM

1.1. The Real Numbers

In this chapter, we shall make a detailed study of some of the important properties of the real numbers. Most readers will be familiar with some of these properties, or have at least used most of them, perhaps sometimes unaware of their generality. Throughout, we denote the set of all real numbers by R, and write a ∈ R to indicate that a is a real number.

We shall take an axiomatic approach to the real numbers. In other words, we offer no proof of these properties, and simply treat and accept them as given.

The first collection of properties of R is generally known as the Field axioms. They enable us to study arithmetic.

FIELD AXIOMS.

(A1) For every a, b ∈ R, we have a + b ∈ R.

(A2) For every a,b,c ∈ R, we have a + (b + c) = (a + b) + c.

(A3) For every a ∈ R, we have a + 0 = a.

(A4) For every a ∈ R, there exists −a ∈ R such that a + (−a) = 0.

(A5) For every a, b ∈ R, we have a + b = b + a.

(M1) For every a, b ∈ R, we have ab ∈ R.

(M2) For every a,b,c ∈ R, we have a(bc) = (ab)c.

(M3) For every a ∈ R, we have a·1 = a.

(M4) For every a ∈ R such that a ≠ 0, there exists a^{-1} ∈ R such that aa^{-1} = 1.

(M5) For every a, b ∈ R, we have ab = ba.

(D) For every a, b, c ∈ R, we have a(b + c) = ab + ac.


Remark. The properties (A1)–(A5) concern the operation addition, while the properties (M1)–(M5) concern the operation multiplication. In the terminology of group theory, we say that the set R forms an abelian group under addition, and that the set of all non-zero real numbers forms an abelian group under multiplication. We also say that the set R forms a field under addition and multiplication. The property (D) is called the Distributive law.

The second collection of properties of R is generally known as the Order axioms. They enable us to study inequalities.

ORDER AXIOMS.

(O1) For every a, b ∈ R, exactly one of a < b, a = b, a > b holds.

(O2) For every a, b, c ∈ R satisfying a > b and b > c, we have a > c.

(O3) For every a, b, c ∈ R satisfying a > b, we have a + c > b + c.

(O4) For every a, b, c ∈ R satisfying a > b and c > 0, we have ac > bc.

Remark. Clearly the Order axioms as given do not appear to include many other properties of the real numbers. However, these can be deduced from the Field axioms and Order axioms.

Example 1.1.1. Suppose that the real number a > 0. Then the real number −a < 0. To see this, note first that by Axiom (A4), there exists −a ∈ R such that a + (−a) = 0. Hence

0 = a + (−a) from above,

> 0 + (−a) by Axiom (O3),

= (−a) + 0 by Axiom (A5),

= −a by Axiom (A3),

as required.

Example 1.1.2. For every a ∈ R, we have a·0 = 0. To see this, note first that a·0 ∈ R, in view of Axiom (M1). On the other hand, it follows from Axioms (A3) and (D) that a·0 = a(0 + 0) = a·0 + a·0. Note next that −(a·0) ∈ R and a·0 + (−(a·0)) = 0, in view of Axiom (A4). Hence

0 = a·0 + (−(a·0)) from above,

= (a·0 + a·0) + (−(a·0)) from above,

= a·0 + (a·0 + (−(a·0))) by Axiom (A2),

= a·0 + 0 by Axiom (A4),

= a·0 by Axiom (A3),

as required.

Example 1.1.3. Suppose that the real number a > 0. Then the real number a^{-1} > 0. To see this, note first that by Axiom (M4), there exists a^{-1} ∈ R such that aa^{-1} = 1. Suppose on the contrary that it is not true that a^{-1} > 0. Then it follows from Axiom (O1) that a^{-1} = 0 or a^{-1} < 0. If a^{-1} = 0, then

1 = aa^{-1} by Axiom (M4),

= a·0

= 0 by Example 1.1.2,

and so

a = a·1 by Axiom (M3),

= a·0 from above,

= 0 by Example 1.1.2,

a contradiction. If a^{-1} < 0, then

0 = a·0 by Example 1.1.2,

= 0·a by Axiom (M5),

> a^{-1}a by Axiom (O4),

= aa^{-1} by Axiom (M5),

= 1 by Axiom (M4),

and so

0 = a·0 by Example 1.1.2,

> a·1 from above,

= a by Axiom (M3),

again a contradiction.

Example 1.1.4. Suppose that the real numbers a > 0 and b > 0. Then the real number ab > 0. To see this, note first that by Axiom (M1), we have ab ∈ R. Suppose on the contrary that it is not true that ab > 0. Then it follows from Axiom (O1) that ab = 0 or ab < 0. Since b > 0, it follows from Axiom (O1) that b ≠ 0, from Axiom (M4) that b^{-1} ∈ R, and from Example 1.1.3 that b^{-1} > 0. If ab = 0, then

a = a·1 by Axiom (M3),

= a(bb^{-1}) by Axiom (M4),

= (ab)b^{-1} by Axiom (M2),

= 0·b^{-1}

= b^{-1}·0 by Axiom (M5),

= 0 by Example 1.1.2,

a contradiction. If ab < 0, then

a = a·1 by Axiom (M3),

= a(bb^{-1}) by Axiom (M4),

= (ab)b^{-1} by Axiom (M2),

< 0·b^{-1} by Axiom (O4),

= b^{-1}·0 by Axiom (M5),

= 0 by Example 1.1.2,

again a contradiction.

Example 1.1.5. Suppose that a, b ∈ R and 0 < a < b. Then b^{-1} < a^{-1}. To see this, note first from Example 1.1.3 that a^{-1} > 0 and b^{-1} > 0, and from Example 1.1.4 that b^{-1}a^{-1} > 0. Hence

b^{-1} = b^{-1}·1 by Axiom (M3),

= b^{-1}(aa^{-1}) by Axiom (M4),

= b^{-1}(a^{-1}a) by Axiom (M5),

= (b^{-1}a^{-1})a by Axiom (M2),

< (b^{-1}a^{-1})b by Axiom (O4),

= (a^{-1}b^{-1})b by Axiom (M5),

= a^{-1}(b^{-1}b) by Axiom (M2),

= a^{-1}(bb^{-1}) by Axiom (M5),

= a^{-1}·1 by Axiom (M4),

= a^{-1} by Axiom (M3),

as required.




An important subset of the set R of all real numbers is the set of all natural numbers, given by

N = {1, 2, 3, . . .}.

However, this definition does not bring out some of the main properties of the set N in a natural way. The following more complicated definition is therefore sometimes preferred.

AXIOMS OF THE NATURAL NUMBERS.

(N1) 1 ∈ N.

(N2) If n ∈ N, then the number n + 1, called the successor of n, also belongs to N.

(N3) Every n ∈ N other than 1 is the successor of some number in N.

(WO) Every non-empty subset of N has a least element.

Remark. The condition (WO) is called the Well-ordering principle.

To explain the significance of each of these four axioms, note first that Axioms (N1) and (N2) together imply that N contains 1, 2, 3, . . . . However, these two axioms alone are insufficient to exclude from N numbers such as 5.5. Now, if N contained 5.5, then by Axiom (N3), N must also contain 4.5, 3.5, 2.5, 1.5, 0.5, −0.5, −1.5, −2.5, . . . , and so would not have a least element. We therefore exclude this possibility by stipulating that N has a least element. This is achieved by Axiom (WO).

It can be shown that Axiom (WO) implies the Principle of induction. The following two forms of the Principle of induction are particularly useful. In fact, both are equivalent to Axiom (WO).

PRINCIPLE OF INDUCTION (WEAK FORM). Suppose that the statement p(.) satisfies the following conditions:

(PIW1) p(1) is true; and

(PIW2) p(n + 1) is true whenever p(n) is true.

Then p(n) is true for every n ∈ N.

PRINCIPLE OF INDUCTION (STRONG FORM). Suppose that the statement p(.) satisfies the following conditions:

(PIS1) p(1) is true; and

(PIS2) p(n + 1) is true whenever p(m) is true for all m ≤ n.

Then p(n) is true for every n ∈ N.

Proof of the equivalence of the Well-ordering principle and the two Principles of induction. Our first step is to show that Axiom (WO) is equivalent to the Principle of induction (strong form) (PIS).

((WO) ⇒ (PIS)) Suppose that the conclusion of (PIS) does not hold. Then the subset

S = {n ∈ N : p(n) is false}

of N is non-empty. By Axiom (WO), S has a least element, n_0 say. If n_0 = 1, then clearly (PIS1) does not hold. If n_0 > 1, then p(m) is true for all m ≤ n_0 − 1 but p(n_0) is false, contradicting (PIS2).

((PIS) ⇒ (WO)) Suppose that a non-empty subset S of N does not have a least element. Consider the statement p(n), given by n ∉ S. Then p(1) is true, otherwise 1 would be the least element of S. Suppose next that p(m) is true for every natural number m ≤ n, so that none of the numbers 1, 2, 3, . . . , n belongs to S. Then p(n + 1) must also be true, for otherwise n + 1 would be the least element of S. It now follows from (PIS) that S does not contain any element of N, contradicting the assumption that S is a non-empty subset of N.

Next, we complete the proof by showing that the Principle of induction (weak form) (PIW) is equivalent to the Principle of induction (strong form) (PIS).


((PIS) ⇒ (PIW)) Suppose that (PIW1) and (PIW2) both hold. Then clearly (PIS1) holds, since it is the same as (PIW1). On the other hand, if p(m) is true for all m ≤ n, then p(n) is true in particular, so it follows from (PIW2) that p(n + 1) is true, and this gives (PIS2). It now follows from (PIS) that p(n) is true for every n ∈ N.

((PIW) ⇒ (PIS)) Suppose that (PIS1) and (PIS2) both hold for a statement p(.). Consider a statement q(.), where q(n) denotes the statement

p(m) is true for every m ≤ n.

Then the two conditions (PIS1) and (PIS2) for the statement p(.) imply respectively the two conditions (PIW1) and (PIW2) for the statement q(.). It follows from (PIW) that q(n) is true for every n ∈ N, and this clearly implies that p(n) is true for every n ∈ N.

1.2. Completeness of the Real Numbers

The set Z of all integers is an extension of the set N of all natural numbers to include 0 and all numbers of the form −n, where n ∈ N. The set Q of all rational numbers is the set of all real numbers of the form pq^{-1}, where p ∈ Z and q ∈ N. It is easy to see that the Field axioms and Order axioms hold good if the set R is replaced by the set Q. We therefore need to find a property that distinguishes R from Q. A good starting point is the following well known result.

THEOREM 1A. No rational number x ∈ Q satisfies x^2 = 2.

Proof. Suppose that pq^{-1} has square 2, where p ∈ Z and q ∈ N. We may assume, without loss of generality, that p and q have no common factors apart from ±1. Then p^2 = 2q^2 is even, so that p is even. We can write p = 2r, where r ∈ Z. Then q^2 = 2r^2 is even, so that q is even, contradicting the assumption that p and q have no common factors apart from ±1.
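The parity argument can be complemented by a small computational check. The following Python sketch is an illustration only and not part of the proof: it searches all denominators q up to a chosen bound for an integer p with p^2 = 2q^2. The function name `rational_square_roots_of_two` and the search bound are assumptions made for this example.

```python
from math import isqrt

def rational_square_roots_of_two(q_max):
    """Return every pair (p, q) with 1 <= q <= q_max, p >= 0 and p*p == 2*q*q."""
    hits = []
    for q in range(1, q_max + 1):
        target = 2 * q * q
        p = isqrt(target)          # largest integer p with p*p <= target
        if p * p == target:        # only then would (p/q)^2 equal 2 exactly
            hits.append((p, q))
    return hits

print(rational_square_roots_of_two(10_000))
```

In line with Theorem 1A, the search finds no such pair. A finite search of this kind cannot, of course, replace the proof.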

It follows that the real number we know as √2 does not belong to the set Q. We say that the set Q is not complete. Our idea is then to distinguish the set R from the set Q by completeness. In particular, we want to ensure that the set R contains numbers like √2.

There are a number of ways to describe the completeness of the set R. We shall first of all introduce completeness via the Axiom of bound.

Definition. A non-empty set S of real numbers is said to be bounded above if there exists a number K ∈ R such that x ≤ K for every x ∈ S. The number K is called an upper bound of the set S.

Definition. A non-empty set T of real numbers is said to be bounded below if there exists a number k ∈ R such that x ≥ k for every x ∈ T. The number k is called a lower bound of the set T.

AXIOM OF BOUND. Suppose that a non-empty set S of real numbers is bounded above. Then there is a real number M ∈ R satisfying the following two conditions:

(S1) For every x ∈ S, the inequality x ≤ M holds.

(S2) For every ε > 0, there exists x ∈ S such that x > M − ε.

Remark. It is not difficult to prove that the number M above is unique. It is also easy to deduce that if a non-empty set T of real numbers is bounded below, then there is a unique real number m ∈ R satisfying the following two conditions:

(I1) For every x ∈ T, the inequality x ≥ m holds.

(I2) For every ε > 0, there exists x ∈ T such that x < m + ε.

Definition. The real number M satisfying conditions (S1) and (S2) is called the supremum of the non-empty set S, and denoted by M = sup S. The real number m satisfying conditions (I1) and (I2) is called the infimum of the non-empty set T, and denoted by m = inf T.

Remark. Note that the most important point of the Axiom of bound is that the supremum M is a real number. Similarly, the infimum m is also a real number.

Let us now try to understand how numbers like √2 fit into this setting. Recall that there is no rational number which satisfies the equation x^2 = 2. This means that the number that we know as √2 is not a rational number. We now want to show that it is a real number. Let

S = {x ∈ R : x^2 < 2}.

Clearly the set S is non-empty, since 0 ∈ S. On the other hand, the set S is bounded above; for example, it is not difficult to show that if x ∈ S, then we must have x ≤ 2; for if x > 2, then we must have x^2 > 4, so that x ∉ S. Hence S is a non-empty set of real numbers and S is bounded above. It follows from the Axiom of bound that there is a real number M satisfying conditions (S1) and (S2). We shall show that M^2 = 2.

Suppose on the contrary that M^2 ≠ 2. Then it follows from Axiom (O1) that M^2 < 2 or M^2 > 2. Let us investigate these two cases separately.

If M^2 < 2, then we have

(M + ε)^2 = M^2 + 2Mε + ε^2 < 2 whenever 0 < ε < min{1, (2 − M^2)/(2M + 1)}.

This means that M + ε ∈ S, contradicting condition (S1).

If M^2 > 2, then we have

(M − ε)^2 = M^2 − 2Mε + ε^2 > 2 whenever 0 < ε < (M^2 − 2)/(2M).

This implies that any x > M − ε will not belong to S, contradicting condition (S2).

Note that M^2 = 2 and M is a real number. It follows that what we know as √2 is a real number.
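The argument shows that sup S exists without computing it. As a rough numerical companion, the sketch below brackets sup S by bisection with exact rational arithmetic: the left endpoint always belongs to S, and the right endpoint is always an upper bound of S. The name `bracket_sup` and the starting bracket [0, 2] are choices made for this illustration and do not come from the text.

```python
from fractions import Fraction

def bracket_sup(steps):
    """Bracket sup{x : x^2 < 2} between a member of S and an upper bound of S."""
    lo, hi = Fraction(0), Fraction(2)    # 0 lies in S, and 2 is an upper bound
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid                     # mid belongs to S
        else:
            hi = mid                     # mid > 0 and mid^2 >= 2: an upper bound
    return lo, hi

lo, hi = bracket_sup(30)
print(float(lo), float(hi))
```

Each step halves the bracket, so after k steps the two endpoints differ by 2/2^k. Both endpoints are rational, so neither of them can have square exactly 2; only the supremum itself does.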

Example 1.2.1. The set N is not bounded above but is bounded below with infimum 1.

Example 1.2.2. The set Z is not bounded above or below.

Example 1.2.3. The closed interval [√2, 2] = {x ∈ R : √2 ≤ x ≤ 2} is bounded above and below, with supremum 2 and infimum √2. Note that the supremum and infimum belong to the interval.

Example 1.2.4. The open interval (√2, 2) = {x ∈ R : √2 < x < 2} is bounded above and below, with supremum 2 and infimum √2. Note that the supremum and infimum do not belong to the interval.

Example 1.2.5. The set {x ∈ R : x = (−1)^n n^{-1} for some n ∈ N} is bounded above and below, with supremum 1/2 and infimum −1.

Example 1.2.6. The set {x ∈ Q : x^2 < 2} is bounded above and below, with supremum √2 and infimum −√2.

The argument concerning √2 can be adapted to prove the following result.

THEOREM 1B. Suppose that a real number c ∈ R is positive. Then for every natural number q ∈ N, there exists a unique positive real number x ∈ R such that x^q = c.

We denote by c^{1/q} or \sqrt[q]{c} the unique positive real solution of the equation x^q = c given by Theorem 1B. For every p ∈ Z and q ∈ N, we define c^{p/q} = (c^{1/q})^p. It can be shown that the definition of c^m, where m = p/q with p ∈ Z and q ∈ N, is independent of the choice of p and q. Furthermore, the Index laws are satisfied: For every positive real number c ∈ R and rational numbers m, n ∈ Q, we have c^m c^n = c^{m+n} and (c^m)^n = c^{mn}.

We next elaborate on Example 1.2.1, and prove formally that the set N is not bounded above. This is a consequence of the Axiom of bound.

THEOREM 1C. (ARCHIMEDEAN PROPERTY) For every real number x ∈ R, there exists a natural number n ∈ N such that n > x.

Proof. Suppose that x ∈ R, and suppose on the contrary that n ≤ x for every n ∈ N. Then the set N is bounded above by x, and so has a supremum M, say. In particular, we have

M ≥ 2, M ≥ 3, M ≥ 4, . . . ,

and so

M − 1 ≥ 1, M − 1 ≥ 2, M − 1 ≥ 3, . . . .

Hence M − 1 is an upper bound for N, contradicting the hypothesis that M is the supremum of N.

We now establish the following important result central to the theory of mathematical analysis.

THEOREM 1D. The rational numbers and irrational numbers are dense in the set R. More precisely, between any two distinct real numbers, there exist a rational number and an irrational number.

Proof. Suppose that x, y ∈ R and x < y. We shall first show that there exists r ∈ Q such that x < r < y. The idea is very simple. Heuristically, if we choose a natural number q large enough, then the interval (qx, qy) has length greater than 1 and must contain an integer p, so that qx < p < qy. To make this precise, assume first that x > 0. By the Archimedean property, there exists q ∈ N such that q > 1/(y − x), so that 1 < q(y − x). Consider the positive real number qx. By the Archimedean property, there exists n ∈ N such that n > qx. Using the Well-ordering principle, let p be the smallest such natural number n. Then clearly p − 1 ≤ qx. To see this, note that if p = 1, then p − 1 = 0 < qx; if p ≠ 1, then p − 1 > qx would contradict the definition of p. It now follows that

qx < p ≤ qx + 1 < qx + q(y − x) = qy,

so that x < p/q < y, and we may take r = p/q.

Suppose next that x ≤ 0. By the Archimedean property, there exists k ∈ N such that k > −x, so that k + x > 0. By the case already treated, there exists s ∈ Q such that x + k < s < y + k, so that x < s − k < y. Clearly s − k ∈ Q.

To show that there exists z ∈ R \ Q such that x < z < y, we first use our earlier argument twice, and conclude that there exist r_1, r_2 ∈ Q such that x < r_1 < r_2 < y. The number

z = r_1 + (1/√2)(r_2 − r_1)

is clearly irrational and satisfies r_1 < z < r_2, and so x < z < y.
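The construction of the rational number in the first part of the proof is effective, and can be sketched in Python. The example below assumes rational inputs (so that the arithmetic is exact), and it deviates slightly from the proof: instead of shifting by a natural number k when x ≤ 0, it takes p to be the smallest integer greater than qx, which handles both cases at once. The name `rational_between` is hypothetical.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """Return a rational p/q with x < p/q < y, for x < y."""
    if not x < y:
        raise ValueError("need x < y")
    q = floor(1 / (y - x)) + 1     # a natural number q with q > 1/(y - x)
    p = floor(q * x) + 1           # the smallest integer p with p > q*x
    # Then q*x < p <= q*x + 1 < q*x + q*(y - x) = q*y.
    return Fraction(p, q)

print(rational_between(Fraction(1, 3), Fraction(1, 2)))
```

With x = 1/3 and y = 1/2, the sketch chooses q = 7 and p = 3, and 1/3 < 3/7 < 1/2. With floating-point inputs, rounding errors could invalidate the strict inequalities, which is why exact fractions are assumed here.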




1.3. The Complex Numbers

In this section, we briefly review some important properties of the complex numbers. It is easy to see that the equation x^2 + 1 = 0 has no solution x ∈ R. In order to solve this equation, we have to introduce extra numbers into our number system.

Define the number i by i^2 + 1 = 0. We then extend the field of all real numbers by adjoining the number i, which is then combined with the real numbers by the operations addition and multiplication in accordance with the Field axioms in Section 1.1. The numbers a + bi, where a, b ∈ R, of the extended field are then added and multiplied in accordance with the Field axioms, suitably extended, and the restriction i^2 + 1 = 0. Note that the number a + 0i, where a ∈ R, behaves like the real number a.

The set C = {z = x + yi : x, y ∈ R} is called the set of all complex numbers. Note that in C, we lose the Order axioms and the Axiom of bound.

Suppose that z = x + yi, where x, y ∈ R. The real number x is called the real part of z, and denoted by x = Re z. The real number y is called the imaginary part of z, and denoted by y = Im z. Furthermore, we write

|z| = √(x^2 + y^2)

and call this the modulus of z.

Definition. A set S of complex numbers is said to be bounded if there exists a number K ∈ R such that |z| ≤ K for every z ∈ S.

THEOREM 1E. For every z, w ∈ C, we have (a) |zw| = |z||w|; and (b) |z + w| ≤ |z| + |w|.

Proof. The first part is left as an exercise. To prove the Triangle inequality (b), note that the result is trivial if z + w = 0. Suppose now that z + w ≠ 0. Then

(|z| + |w|)/|z + w| = |z|/|z + w| + |w|/|z + w| = |z/(z + w)| + |w/(z + w)|

≥ Re(z/(z + w)) + Re(w/(z + w)) = Re(z/(z + w) + w/(z + w)) = Re 1 = 1.

The result follows immediately.

Applying the Triangle inequality a finite number of times, we can show that for every z_1, . . . , z_k ∈ C, we have

|z_1 + . . . + z_k| ≤ |z_1| + . . . + |z_k|.

We shall use this to establish the following result which shows that a polynomial is eventually dominated by its term of highest order.

THEOREM 1F. Consider a polynomial P(z) = a_0 + a_1 z + . . . + a_n z^n in the complex variable z ∈ C, with coefficients a_0, a_1, . . . , a_n ∈ C and a_n ≠ 0. For every z ∈ C satisfying

|z| ≥ R_0 = 2(|a_0| + |a_1| + . . . + |a_n|)/|a_n|,

we have

(1/2)|a_n||z|^n ≤ |P(z)| ≤ (3/2)|a_n||z|^n.





Proof. Note first of all that

|P(z)| ≤ |a_0 + a_1 z + . . . + a_{n−1} z^{n−1}| + |a_n||z|^n

and

|a_n||z|^n = |P(z) − (a_0 + a_1 z + . . . + a_{n−1} z^{n−1})| ≤ |P(z)| + |a_0 + a_1 z + . . . + a_{n−1} z^{n−1}|.

It therefore remains to establish the inequality

|a_0 + a_1 z + . . . + a_{n−1} z^{n−1}| ≤ (1/2)|a_n||z|^n.

Clearly R_0 > 1, so that if |z| ≥ R_0, we have

|a_0 + a_1 z + . . . + a_{n−1} z^{n−1}| ≤ |a_0| + |a_1||z| + . . . + |a_{n−1}||z|^{n−1}

≤ (|a_0| + |a_1| + . . . + |a_{n−1}|)|z|^{n−1}

≤ (|a_0| + |a_1| + . . . + |a_{n−1}| + |a_n|)|z|^{n−1}

= (1/2) R_0 |a_n||z|^{n−1}

≤ (1/2)|a_n||z|^n

as required.
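The two-sided bound in Theorem 1F is easy to test numerically. The following Python sketch evaluates R_0, |P(z)| and the two bounds for a given list of coefficients; it is an illustration under floating-point arithmetic, and the function name `dominance_check` and the sample polynomial 2z^2 − 3z + 1 are assumptions made for this example.

```python
def dominance_check(coeffs, z):
    """coeffs = [a_0, ..., a_n] with a_n != 0.

    Returns (R_0, lower bound, |P(z)|, upper bound) as in Theorem 1F.
    """
    n = len(coeffs) - 1
    lead_coeff = abs(coeffs[-1])
    r0 = 2 * sum(abs(a) for a in coeffs) / lead_coeff
    value = abs(sum(a * z**k for k, a in enumerate(coeffs)))
    lead = lead_coeff * abs(z)**n          # |a_n| |z|^n
    return r0, 0.5 * lead, value, 1.5 * lead

# P(z) = 1 - 3z + 2z^2, evaluated at z = 6i, where |z| = R_0 = 6.
r0, lower, value, upper = dominance_check([1, -3, 2], 6j)
print(r0, lower <= value <= upper)
```

For this polynomial, R_0 = 6 and P(6i) = −71 − 18i, whose modulus lies between 36 and 108 as the theorem predicts. The bound R_0 is convenient rather than sharp; the inequalities may well hold for smaller |z| too.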

1.4. Countability

In this brief account, we treat intuitively the distinction between finite and infinite sets. A set is finite if it contains a finite number of elements. To treat infinite sets, our starting point is the set N of all natural numbers, an example of an infinite set.

Definition. A set X is said to be countably infinite if there exists a bijective mapping from X to N. A set X is said to be countable if it is finite or countably infinite.

Remark. Suppose that X is countably infinite. Then we can write

X = {x_1, x_2, x_3, . . .}.

Here we understand that there is a bijective mapping φ : X → N where φ(x_n) = n for every n ∈ N.

THEOREM 1G. A countable union of countable sets is countable.

Proof. Let I be a countable index set, where for each i ∈ I, the set X_i is countable. Either (a) I is finite; or (b) I is countably infinite. We shall only consider (b), since (a) needs only minor modification.

Since I is countably infinite, there exists a bijective mapping from I to N. We may therefore assume, without loss of generality, that I = N. For each n ∈ N, since X_n is countable, we may write

X_n = {a_{n1}, a_{n2}, a_{n3}, . . .},

with the convention that if X_n is finite, then the sequence a_{n1}, a_{n2}, a_{n3}, . . . is constant from some point onwards. Hence we have a doubly infinite array

a_{11} a_{12} a_{13} . . .

a_{21} a_{22} a_{23} . . .

a_{31} a_{32} a_{33} . . .

 ⋮ ⋮ ⋮ ⋱

of elements of the set

X = ⋃_{n∈N} X_n.

We now list these elements in the order indicated by

[Figure: the entries of the array, joined by a path that passes through every entry.]

but discarding duplicates. If X is infinite, the above clearly gives rise to a bijection from X to N.
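The listing procedure in this proof can be sketched in Python. The example below assumes one common choice of path, namely running through the array one anti-diagonal at a time; the path in the original figure may differ in detail. The names `diagonal_pairs` and `enumerate_union` are hypothetical, and the array is supplied as a function a(n, k) returning the entry a_{nk}.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield all index pairs (n, k) with n, k >= 1, one anti-diagonal at a time."""
    for s in count(2):                 # s = n + k is constant on each anti-diagonal
        for n in range(1, s):
            yield (n, s - n)

def enumerate_union(a, limit):
    """List the first `limit` distinct entries a(n, k), discarding duplicates.

    Assumes the union has at least `limit` elements; otherwise the loop never ends.
    """
    if limit <= 0:
        return []
    seen, listed = set(), []
    for n, k in diagonal_pairs():
        value = a(n, k)
        if value not in seen:          # skip entries that were listed before
            seen.add(value)
            listed.append(value)
            if len(listed) == limit:
                return listed

print(list(islice(diagonal_pairs(), 6)))
print(enumerate_union(lambda n, k: n * k, 5))
```

Every pair (n, k) lies on exactly one anti-diagonal and each anti-diagonal is finite, so every entry of the array is reached after finitely many steps. This is the property that makes the resulting list a bijection onto N when X is infinite.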

Example 1.4.1. The set Z is countable. Simply note that Z = N ∪ {0} ∪ {−1, −2, −3, . . .}.

Example 1.4.2. The set Q is countable. To see this, note that any x ∈ Q can be written in the form p/q, where p ∈ Z and q ∈ N. It is easy to see that for every n ∈ N, the set Q_n = {p/n : p ∈ Z} is countable. The result follows from Theorem 1G on observing that

Q = ⋃_{n∈N} Q_n.

Suppose that two sets X_1 and X_2 are both countably infinite. Since both can be mapped to N bijectively, it follows that each can be mapped to the other bijectively. In this case, we say that the two sets X_1 and X_2 have the same cardinality. Cardinality can be considered as a way of measuring size. If there exists a one-to-one mapping from X_1 to X_2 and no one-to-one mapping from X_2 to X_1, then we say that X_2 has greater cardinality than X_1. For example, N and Q have the same cardinality. We shall now show that R has greater cardinality than Q.

To do so, we first need an intermediate result.

THEOREM 1H. Any subset of a countable set is countable.

Proof. Let X be a countable set. If X is finite, then the result is trivial. We therefore assume that X is countably infinite, so that we can write

X = {x_1, x_2, x_3, . . .}.

Let Y be a subset of X. If Y is finite, then the result is trivial. If Y is infinite, then we can write

Y = {x_{n_1}, x_{n_2}, x_{n_3}, . . .},

where

n_1 = min{n ∈ N : x_n ∈ Y},

and where, for every p ≥ 2,

n_p = min{n > n_{p−1} : x_n ∈ Y}.

The result follows.
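The recursive choice of the indices in this proof amounts to scanning N in increasing order and keeping those n for which x_n ∈ Y. A minimal Python sketch, in which the membership test `in_Y` is a hypothetical stand-in for the condition x_n ∈ Y:

```python
from itertools import count, islice

def subset_indices(in_Y):
    """Yield n_1 < n_2 < n_3 < ..., the indices n for which in_Y(n) holds."""
    for n in count(1):
        if in_Y(n):                # the next index produced is the least one not yet used
            yield n

# Illustration: Y consists of the x_n whose index n is a multiple of 3.
print(list(islice(subset_indices(lambda n: n % 3 == 0), 4)))
```

If Y is infinite, the generator never runs dry, and the p-th index it produces is the number n_p of the proof.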

THEOREM 1J. The set R is not countable.

Proof. In view of Theorem 1H, it suffices to show that the set [0, 1) is not countable. Suppose on the contrary that [0, 1) is countable. Then we can write

[0, 1) = {x_1, x_2, x_3, . . .}. (1)

For each n ∈ N, we express x_n in decimal notation in the form

x_n = .x_{n1}x_{n2}x_{n3} . . . ,

where for each k ∈ N, the digit x_{nk} ∈ {0, 1, 2, . . . , 9}. Note that this expression may not be unique, but it does not matter, as we simply choose one. We now have

x_1 = .x_{11}x_{12}x_{13} . . . ,

x_2 = .x_{21}x_{22}x_{23} . . . ,

x_3 = .x_{31}x_{32}x_{33} . . . ,

 ⋮

Let y = .y_1y_2y_3 . . . , where for each n ∈ N, y_n ∈ {0, 1, 2, . . . , 9} and y_n ≡ x_{nn} + 5 (mod 10). Then clearly y ≠ x_n for any n ∈ N. But y ∈ [0, 1), contradicting (1).
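The choice of the digits of y can be mimicked on a finite table of digits. The Python sketch below is an illustration only, since the proof concerns an infinite list: given the first N digits of each of N numbers, it returns the first N digits of y. The name `diagonal_digits` and the sample table are assumptions made for this example.

```python
def diagonal_digits(rows):
    """rows[n][k] is the (k+1)-th decimal digit of the (n+1)-th listed number.

    Returns the digits y_1, ..., y_N, where y_n is x_nn + 5 reduced modulo 10.
    """
    return [(rows[n][n] + 5) % 10 for n in range(len(rows))]

table = [
    [1, 4, 1],    # x_1 = .141...
    [7, 1, 8],    # x_2 = .718...
    [3, 3, 3],    # x_3 = .333...
]
y = diagonal_digits(table)
print(y)
```

For this table the diagonal digits are 1, 1, 3, so y begins .668. By construction, y differs from the n-th listed number in the n-th decimal place, and the difference between the two digits is always exactly 5.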

Example 1.4.3. Note that the set R \ Q of all irrational numbers is not countable. It follows that in the sense of cardinality, there are far more irrational numbers than rational numbers.

1.5. Cardinal Numbers

It is easy to show that there exists a bijective mapping from a finite set X_1 to a finite set X_2 if and only if the two sets X_1 and X_2 have the same number of elements. In this case, we say that the two sets have the same cardinality. It is then convenient to denote the cardinality of a finite set by the number of elements that it contains, and take the non-negative integers to represent the finite cardinal numbers.

This may appear to be satisfactory. Strictly speaking, we need the following axiom which covers infinite sets as well.

POSTULATE OF THE CARDINAL NUMBERS. For every set X, there exists an object |X|, called the cardinal number of X, which satisfies the following property: For any two sets X and Y, we have |X| = |Y| if and only if there exists a bijective mapping f : X → Y.

Remarks. (1) Note that the cardinal number of an infinite set cannot be equal to the cardinal number of a finite set, since there cannot be a bijective mapping from an infinite set to a finite set.

(2) We write ℵ_0 = |N| and c = |R|.

(3) Note that |X| = ℵ_0 for any countably infinite set X.

(4) In view of Theorem 1J, we have ℵ_0 ≠ c.


Problems for Chapter 1

1. Suppose that a, b ∈ R satisfy a > 0 and b < 0. Show that ab < 0.

2. Suppose that a, b ∈ R satisfy b < a < 0. Show that b^{-1} > a^{-1}.

3. For each of the following sets A, determine whether sup A and inf A exist, and find their values if appropriate and determine also whether sup A and inf A belong to the set A:

a) A = {n^{-1} : n ∈ N}

b) A = {(|n| + 1)^{-2} : n ∈ Z}

c) A = {n + n^{-1} : n ∈ N}

d) A = {2^{-m} − 3^n : m, n ∈ N}

e) A = {x ∈ R : x^3 − 4x < 0}

f) A = {1 + x^2 : x ∈ R}

4. Suppose that A is a bounded set of real numbers, and that B is a non-empty subset of A. Explain why inf A ≤ inf B ≤ sup B ≤ sup A.

5. Suppose that a, b ∈ R satisfy a < b + n^{-1} for every n ∈ N. Prove that a ≤ b.

6. a) Suppose that x ≤ a for every x ∈ A. Show that sup A ≤ a.

b) Show that the corresponding statement with ≤ replaced by < does not hold.

7. Suppose that A and B are non-empty sets of real numbers bounded above and below.

a) Let A ∪ B = {x : x ∈ A or x ∈ B}. Prove that

sup(A ∪ B) = max{sup A, sup B} and inf(A ∪ B) = min{inf A, inf B}.

b) Discuss the case A ∩ B = {x : x ∈ A and x ∈ B}.

8. Suppose that A and B are non-empty sets of real numbers bounded above and below.

a) Let A + B = {a + b : a ∈ A and b ∈ B}. Prove that

sup(A + B) = sup A + sup B and inf(A + B) = inf A + inf B.

b) Discuss the case A − B = {a − b : a ∈ A and b ∈ B}.

9. Suppose that A and B are non-empty sets of positive real numbers bounded above and below.

a) Let AB = {ab : a ∈ A and b ∈ B}. Prove that

sup(AB) = (sup A)(sup B) and inf(AB) = (inf A)(inf B).

b) Discuss the case when the sets A and B can contain negative real numbers.

10. Suppose that A is a non-empty set of real numbers bounded above and below. For any real number k ∈ R, consider the set kA = {ka : a ∈ A}. What can we say about sup(kA) and inf(kA)?

11. Prove that the cartesian product of two countable sets is countable.

12. A rational point in C is one with rational real and imaginary parts. Prove that the set of all rational points in C is countable.

13. Prove that any isolated point set in C is countable.

14. a) Find a bijection from (0, 1) to (0, ∞).

b) Find a bijection from (−1, 1) to R.

c) Suppose that A, B ∈ R with A < B. Find a bijection from (A, B) to (−1, 1).

d) What is the cardinality of the interval (A, B) in part (c)?

15. A real algebraic number is any real solution of a polynomial equation with coefficients in Z. Prove that the set of all real algebraic numbers is countable.

W W L CHEN

© W W L Chen, 1982, 2008.






Chapter 2

SEQUENCES AND LIMITS

2.1. Introduction

A sequence is a set of terms occurring in order. In simple cases, a sequence is defined by an explicit formula giving the n-th term z_n in terms of n. We shall simply refer to the sequence z_n. For example, z_n = 1/n represents the sequence

1, 1/2, 1/3, 1/4, . . . .

We shall only be concerned with the case when all the terms of a sequence are real or complex numbers, so that throughout this chapter, z_n represents a real or complex sequence.

It is not necessary to start the sequence with z_1. However, the set N of all natural numbers is a convenient tool to indicate the order in which the terms of the sequence occur.

Remark. Formally, a complex sequence is a function of the form f : N → C, where for every n ∈ N, we write f(n) = z_n.

Definition. We say that a sequence z_n converges to a finite limit z ∈ C, denoted by z_n → z as n → ∞ or by

lim_{n→∞} z_n = z,

if, given any ε > 0, there exists N = N(ε) ∈ R, depending on ε, such that |z_n − z| < ε whenever n > N.

Furthermore, we say that a sequence z_n is convergent if it converges to some finite limit z as n → ∞, and that a sequence z_n is divergent if it is not convergent.


Remark. The quantity |z_n − z| measures the difference between z_n and its intended limit z. The definition thus says that this difference can be made as small as we like, provided that n is large enough. It follows that the convergence is not affected by the initial terms. Observe that the inequality |z_n − z| < ε is equivalent to saying that the point z_n lies inside a circle of radius ε and centred at z.

[Figure: a circle centred at z, with the point z_n inside it.]

In the case when z_n = x_n and z = x are real, the inequality |x_n − x| < ε is equivalent to the inequalities x − ε < x_n < x + ε, so that x_n lies in the open interval (x − ε, x + ε).

Example 2.1.1. Consider the sequence z_n = 1/n. Then z_n → 0 as n → ∞. We have

|z_n − 0| = |1/n − 0| = 1/n < ε

whenever n > 1/ε. We may take N = 1/ε.

Example 2.1.2. Consider the sequence z_n = i^n/n^2. Then z_n → 0 as n → ∞. We have

|z_n − 0| = |i^n/n^2 − 0| = 1/n^2 < ε

whenever n > √(1/ε). We may take N = √(1/ε).

Example 2.1.3. Consider the sequence z_n = (n + 2i)/n. Then z_n → 1 as n → ∞. We have

|z_n − 1| = |(n + 2i)/n − 1| = |2i/n| = 2/n < ε

whenever n > 2/ε. We may take N = 2/ε.

Example 2.1.4. Consider the sequence z_n = √((n + 1)/n). Then z_n → 1 as n → ∞. We have

|z_n − 1| = |√((n + 1)/n) − 1| = ((n + 1)/n − 1)/(√((n + 1)/n) + 1) < 1/(2n) < ε

whenever n > 1/(2ε). We may take N = 1/(2ε).

Example 2.1.5. Consider the sequence z_n = (2n + 3)/(3n + 4). Then z_n → 2/3 as n → ∞. We have

|z_n − 2/3| = |(2n + 3)/(3n + 4) − 2/3| = 1/(3(3n + 4)) < 1/(9n) < ε

whenever n > 1/(9ε). We may take N = 1/(9ε).
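The choices of N in these examples can be spot-checked numerically. The Python sketch below tests the inequality |z_n − z| < ε for a finite run of integers n > N, using exact fractions; it can refute a bad choice of N but cannot prove a good one, since only finitely many n are examined. The name `within_eps_after` and the parameter `how_many` are assumptions made for this illustration.

```python
from fractions import Fraction
from math import floor

def within_eps_after(z, limit, eps, N, how_many=1000):
    """Check |z(n) - limit| < eps for the first `how_many` natural numbers n > N."""
    start = max(1, floor(N) + 1)       # smallest natural number greater than N
    return all(abs(z(n) - limit) < eps
               for n in range(start, start + how_many))

eps = Fraction(1, 1000)

# Example 2.1.1: z_n = 1/n, limit 0, N = 1/eps.
print(within_eps_after(lambda n: Fraction(1, n), 0, eps, 1 / eps))

# Example 2.1.5: z_n = (2n + 3)/(3n + 4), limit 2/3, N = 1/(9 eps).
print(within_eps_after(lambda n: Fraction(2 * n + 3, 3 * n + 4),
                       Fraction(2, 3), eps, 1 / (9 * eps)))
```

Both checks succeed for ε = 1/1000. Replacing N = 1/ε by a much smaller value such as N = 10 in the first check makes it fail, since 1/11 is not less than 1/1000.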

A simple and immediate consequence of our definition of convergence is the following result.

THEOREM 2A. The limit of a convergent sequence is unique.

Proof. Suppose that z_n → z′ and z_n → z″ as n → ∞. Then given any ε > 0, there exist N′, N″ ∈ R such that

|z_n − z′| < ε whenever n > N′,

and

|z_n − z″| < ε whenever n > N″.

Let N = max{N′, N″} ∈ R. It follows that whenever n > N, we have

|z′ − z″| = |(z′ − z_n) + (z_n − z″)| ≤ |z_n − z′| + |z_n − z″| < 2ε.

Now |z′ − z″| is a non-negative constant less than any 2ε > 0, so we must have |z′ − z″| = 0, whence z′ = z″.

Definition. A sequence z_n is said to be bounded if there exists a number M ∈ R such that |z_n| ≤ M for every n ∈ N.

Example 2.1.6. The sequence z_n = 1/n is bounded, with |z_n| ≤ 1 for every n ∈ N.

Example 2.1.7. The sequence z_n = i^n/n^2 is bounded, with |z_n| ≤ 1 for every n ∈ N.

Example 2.1.8. The sequence z_n = (n + 2i)/n is bounded, with |z_n| ≤ √5 for every n ∈ N.

Example 2.1.9. The sequence z_n = √((n + 1)/n) is bounded, with |z_n| ≤ √2 for every n ∈ N.

Example 2.1.10. The sequence z_n = (2n + 3)/(3n + 4) is bounded, with |z_n| ≤ 5/3 for every n ∈ N.

Note that the bounded sequences in Examples 2.1.6–2.1.10 are precisely the convergent sequences in Examples 2.1.1–2.1.5 respectively. They illustrate the fact that convergence implies boundedness. More precisely, we have the following result.

THEOREM 2B. A convergent sequence is bounded.

Proof. Suppose that z_n → z as n → ∞. Then there exists N ∈ N such that |z_n − z| < 1 for every n > N. Hence

|z_n| < |z| + 1 whenever n > N.

Let M = max{|z_1|, . . . , |z_N|, |z| + 1}. Then clearly |z_n| ≤ M for every n ∈ N.

The next example shows that a bounded sequence is not necessarily convergent.

Example 2.1.11. The sequence z_n = (−1)^n is bounded, with |z_n| ≤ 1 for every n ∈ N. We now show that this sequence is not convergent. Let z be any given complex number. We shall show that the sequence z_n does not converge to z. Note first of all that for every n ∈ N, we have |z_{n+1} − z_n| = 2. It follows that

2 = |z_{n+1} − z_n| = |(z_{n+1} − z) + (z − z_n)| ≤ |z_{n+1} − z| + |z_n − z|.

This means that for every n ∈ N, at least one of the two inequalities |z_{n+1} − z| ≥ 1 and |z_n − z| ≥ 1 must hold. Hence the condition for convergence cannot be satisfied with ε = 1.

The next result shows that we can do arithmetic on limits.

THEOREM 2C. Suppose that $z_n \to z$ and $w_n \to w$ as $n \to \infty$. Then
(a) $z_n + w_n \to z + w$ as $n \to \infty$;
(b) $z_n w_n \to zw$ as $n \to \infty$; and
(c) if $w \ne 0$, then $z_n/w_n \to z/w$ as $n \to \infty$.

Remark. Let $w_n = 1/n$ and $t_n = (-1)^n$. Then $w_n \to 0$ as $n \to \infty$, but $t_n$ does not converge as $n \to \infty$. On the other hand, it is easy to check that $z_n = w_n t_n \to 0$ as $n \to \infty$. Note now that $t_n = z_n/w_n$, but since $w_n \to 0$ as $n \to \infty$, we cannot use Theorem 2C(c).

Proof of Theorem 2C. (a) We shall use the inequality

$$|(z_n + w_n) - (z + w)| \le |z_n - z| + |w_n - w|.$$

Given any $\epsilon > 0$, there exist $N_1, N_2 \in \mathbb{R}$ such that

$$|z_n - z| < \epsilon/2 \quad \text{whenever } n > N_1,$$

and

$$|w_n - w| < \epsilon/2 \quad \text{whenever } n > N_2.$$

Let $N = \max\{N_1, N_2\} \in \mathbb{R}$. It follows that whenever $n > N$, we have

$$|(z_n + w_n) - (z + w)| \le |z_n - z| + |w_n - w| < \epsilon.$$

(b) We shall use the inequality

$$|z_n w_n - zw| = |z_n w_n - z_n w + z_n w - zw| = |z_n(w_n - w) + (z_n - z)w| \le |z_n||w_n - w| + |w||z_n - z|.$$

Since $z_n \to z$ as $n \to \infty$, there exists $N_1 \in \mathbb{R}$ such that

$$|z_n - z| < 1 \quad \text{whenever } n > N_1,$$

so that

$$|z_n| < |z| + 1 \quad \text{whenever } n > N_1.$$

On the other hand, given any $\epsilon > 0$, there exist $N_2, N_3 \in \mathbb{R}$ such that

$$|z_n - z| < \frac{\epsilon}{2(|w| + 1)} \quad \text{whenever } n > N_2,$$

and

$$|w_n - w| < \frac{\epsilon}{2(|z| + 1)} \quad \text{whenever } n > N_3.$$

Let $N = \max\{N_1, N_2, N_3\} \in \mathbb{R}$. It follows that whenever $n > N$, we have

$$|z_n w_n - zw| \le |z_n||w_n - w| + |w||z_n - z| < \epsilon.$$

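(c) We shall first show that $1/w_n \to 1/w$ as $n \to \infty$. To do this, we shall use the identity

$$\left|\frac{1}{w_n} - \frac{1}{w}\right| = \frac{|w_n - w|}{|w_n||w|}.$$

Since $w \ne 0$ and $w_n \to w$ as $n \to \infty$, there exists $N_1 \in \mathbb{R}$ such that

$$|w_n - w| < |w|/2 \quad \text{whenever } n > N_1,$$

so that

$$|w_n| > |w|/2 \quad \text{whenever } n > N_1.$$

On the other hand, given any $\epsilon > 0$, there exists $N_2 \in \mathbb{R}$ such that

$$|w_n - w| < \epsilon|w|^2/2 \quad \text{whenever } n > N_2.$$

Let $N = \max\{N_1, N_2\} \in \mathbb{R}$. It follows that whenever $n > N$, we have

$$\left|\frac{1}{w_n} - \frac{1}{w}\right| = \frac{|w_n - w|}{|w_n||w|} \le \frac{2|w_n - w|}{|w|^2} < \epsilon.$$

We now apply part (b) to $z_n$ and $1/w_n$ to get the desired result.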

Definition. We say that a sequence zn diverges to ∞ as n → ∞, denoted by zn → ∞ as n → ∞, if, for every E > 0, there exists N ∈ R such that |zn| > E whenever n > N.

Remarks. (1) It can be shown that zn → ∞ as n → ∞ if and only if 1/zn → 0 as n → ∞.

(2) Note that Theorem 2C does not apply in the case when a sequence diverges to ∞.

Example 2.1.12. The sequences $z_n = n$, $z_n = n^2$ and $z_n = (-1)^n n$ all satisfy $z_n \to \infty$ as $n \to \infty$.

Example 2.1.13. Suppose that $x_n$ is a sequence of positive terms such that $x_n \to 0$ as $n \to \infty$. For every fixed $m \in \mathbb{N}$, we have $x_n^m \to 0$ as $n \to \infty$, in view of Theorem 2C(b). For every negative integer $m$, we have $x_n^m \to \infty$ as $n \to \infty$, noting that $x_n > 0$ for every $n \in \mathbb{N}$. How about $m = 0$?

2.2. Real Sequences

Real sequences are particularly interesting since the real numbers are ordered, unlike the complex numbers. This enables us to establish special results for convergence which apply only to real sequences.

We begin with a simple example. Imagine that you have a ham sandwich, and you do the most disgusting thing of squeezing the two slices of bread together. Where does the ham go?

THEOREM 2D. (SQUEEZING PRINCIPLE) Suppose that $x_n \to x$ and $y_n \to x$ as $n \to \infty$. Suppose further that $x_n \le a_n \le y_n$ for every $n \in \mathbb{N}$. Then $a_n \to x$ as $n \to \infty$.

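Example 2.2.1. Consider the sequence

$$a_n = \frac{4n + 3}{4n^2 + 3n + 1}.$$

Then

$$\frac{1}{2n} = \frac{4n}{8n^2} < \frac{4n + 3}{4n^2 + 3n + 1} < \frac{4n + 3 + n^{-1}}{4n^2 + 3n + 1} = \frac{1}{n}.$$

Writing

$$x_n = \frac{1}{2n} \quad \text{and} \quad y_n = \frac{1}{n},$$

we have that $x_n \to 0$ and $y_n \to 0$ as $n \to \infty$. Hence $a_n \to 0$ as $n \to \infty$.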
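The two bounds in Example 2.2.1 can be checked for a finite range of n. The following is only a sketch, not part of the argument: the function names `a` and `squeeze_holds` are our own, and exact rational arithmetic is used so that no rounding enters the comparison.

```python
from fractions import Fraction


def a(n):
    """Term a_n = (4n + 3) / (4n^2 + 3n + 1) of Example 2.2.1, as an exact fraction."""
    return Fraction(4 * n + 3, 4 * n * n + 3 * n + 1)


def squeeze_holds(n):
    """Check the strict inequalities 1/(2n) < a_n < 1/n for one value of n."""
    return Fraction(1, 2 * n) < a(n) < Fraction(1, n)


# Show the lower bound, the term and the upper bound side by side.
for n in (1, 10, 100, 1000):
    print(n, float(Fraction(1, 2 * n)), float(a(n)), float(Fraction(1, n)))
```

A finite check of this kind proves nothing about the limit; it only confirms that the inequalities used in the example hold for the values tested.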

Example 2.2.2. Consider the sequence $a_n = n^{-1} \cos n$. If $x_n = -1/n$ and $y_n = 1/n$, then clearly $x_n \le a_n \le y_n$ for every $n \in \mathbb{N}$. Since $x_n \to 0$ and $y_n \to 0$ as $n \to \infty$, we have $a_n \to 0$ as $n \to \infty$.

Example 2.2.3. It is important that $x_n$ and $y_n$ converge to the same limit. For example, if $x_n = -1$ and $y_n = 1$ for every $n \in \mathbb{N}$, then both $x_n$ and $y_n$ converge as $n \to \infty$. Let $a_n = (-1)^n$. Then $x_n \le a_n \le y_n$ for every $n \in \mathbb{N}$. Note from Example 2.1.11 that $a_n$ does not converge as $n \to \infty$. In this case, the hypotheses of Theorem 2D are not satisfied. Note that $x_n$ and $y_n$ converge to different limits, so no squeezing occurs.

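Example 2.2.4. Consider the sequence $x_n = a^n$, where $a \in \mathbb{R}$. There are various cases:
• If $a = 1$, then $x_n = 1$ for every $n \in \mathbb{N}$, so that $x_n \to 1$ as $n \to \infty$.
• If $a = 0$, then $x_n = 0$ for every $n \in \mathbb{N}$, so that $x_n \to 0$ as $n \to \infty$.
• If $a > 1$, then $a = 1 + k$, where $k > 0$. Then for every $E > 0$, we have

$$|a^n| = (1 + k)^n \ge 1 + kn > E \quad \text{for every } n > \frac{E - 1}{k}.$$

It follows that $x_n \to \infty$ as $n \to \infty$.
• If $0 < a < 1$, then $a = 1/b$, where $b > 1$. Hence $1/x_n = b^n \to \infty$ as $n \to \infty$. It follows that $x_n \to 0$ as $n \to \infty$.
• If $-1 < a < 0$, then $a = -b$, where $0 < b < 1$. We then have $b^n \to 0$ as $n \to \infty$. Also, $-b^n \le x_n \le b^n$ for every $n \in \mathbb{N}$. It follows from the Squeezing principle that $x_n \to 0$ as $n \to \infty$.
• If $a = -1$, then $x_n = (-1)^n$ does not converge as $n \to \infty$.
• If $a < -1$, then $a = 1/b$, where $-1 < b < 0$. Hence $1/x_n = b^n \to 0$ as $n \to \infty$. It follows that $x_n \to \infty$ as $n \to \infty$.

Proof of Theorem 2D. By Theorem 2C, we have $y_n - x_n \to 0$ as $n \to \infty$. It follows that given any $\epsilon > 0$, there exist $N', N'' \in \mathbb{R}$ such that

$$|y_n - x_n| < \epsilon/2 \quad \text{whenever } n > N',$$

and

$$|x_n - x| < \epsilon/2 \quad \text{whenever } n > N''.$$

Let $N = \max\{N', N''\} \in \mathbb{R}$. It follows that whenever $n > N$, we have

$$|a_n - x| \le |a_n - x_n| + |x_n - x| \le |y_n - x_n| + |x_n - x| < \epsilon.$$

Hence $a_n \to x$ as $n \to \infty$.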

Our next task is to study monotonic sequences which are particularly interesting.

Definition. Let $x_n$ be a real sequence.
(1) We say that $x_n$ is increasing if $x_{n+1} \ge x_n$ for every $n \in \mathbb{N}$.
(2) We say that $x_n$ is decreasing if $x_{n+1} \le x_n$ for every $n \in \mathbb{N}$.
(3) We say that $x_n$ is bounded above if there exists $B \in \mathbb{R}$ such that $x_n \le B$ for every $n \in \mathbb{N}$.
(4) We say that $x_n$ is bounded below if there exists $b \in \mathbb{R}$ such that $x_n \ge b$ for every $n \in \mathbb{N}$.

Remark. Note that a real sequence is bounded if and only if it is bounded above and below.

THEOREM 2E. Suppose that $x_n$ is an increasing real sequence.
(a) If $x_n$ is bounded above, then $x_n$ converges as $n \to \infty$.
(b) If $x_n$ is not bounded above, then $x_n \to \infty$ as $n \to \infty$.

THEOREM 2F. Suppose that $x_n$ is a decreasing real sequence.
(a) If $x_n$ is bounded below, then $x_n$ converges as $n \to \infty$.
(b) If $x_n$ is not bounded below, then $x_n \to \infty$ as $n \to \infty$.

Proof of Theorem 2E. (a) Suppose that the sequence xn is bounded above. Then the set

S = {xn : n ∈ N}

is a non-empty set of real numbers which is bounded above. Let $x = \sup S$. We shall show that $x_n \to x$ as $n \to \infty$. Given any $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that $x_N > x - \epsilon$. Since the sequence $x_n$ is increasing and bounded above by $x$, it follows that whenever $n > N$, we have $x \ge x_n \ge x_N > x - \epsilon$, so that $|x_n - x| < \epsilon$.

(b) Suppose that the sequence $x_n$ is not bounded above. Then for every $E > 0$, there exists $N \in \mathbb{N}$ such that $x_N > E$. Since the sequence $x_n$ is increasing, it follows that $|x_n| = x_n \ge x_N > E$ for every $n > N$. Hence $x_n \to \infty$ as $n \to \infty$.

Example 2.2.5. The sequence xn = 3 − 1/n is increasing and bounded above. It is not too difficult to show that the smallest real number B ∈ R such that xn ≤ B for every n ∈ N is 3. It is easy to show that xn → 3 as n → ∞.

Example 2.2.6. Consider the sequence $x_n = 1 + a + a^2 + \ldots + a^n$. Then $x_n = n + 1$ if $a = 1$ and

$$x_n = \frac{1 - a^{n+1}}{1 - a} \quad \text{if } a \ne 1.$$

Suppose that $a > 0$. Then $x_n$ is increasing. If $0 < a < 1$, then $x_n < 1/(1 - a)$ for all $n \in \mathbb{N}$, and so $x_n$ converges as $n \to \infty$. If $a \ge 1$, then $x_n$ is not bounded above, so that $x_n \to \infty$ as $n \to \infty$. In fact, if $a \ne 1$, then the convergence or divergence of $x_n$ depends on the convergence or divergence of $a^{n+1}$, which we have considered before in Example 2.2.4.

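Example 2.2.7. Consider the sequence

$$x_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \ldots + \frac{1}{n!}.$$

Clearly $x_n$ is an increasing sequence. On the other hand, since $k! \ge (k - 1)k$ for every $k \ge 2$, we have for every $n \ge 2$ that

$$x_n \le 1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \ldots + \frac{1}{(n - 1)n} = 1 + 1 + \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \ldots + \left(\frac{1}{n - 1} - \frac{1}{n}\right) = 3 - \frac{1}{n} < 3,$$

so that $x_n$ is bounded above. Unfortunately, it is very hard to find the smallest real number $B \in \mathbb{R}$ such that $x_n \le B$ for every $n \in \mathbb{N}$. While Theorem 2E tells us that the sequence $x_n$ converges, it does not tell us the precise value of the limit. In fact, the limit in this case is the number $e$.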
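As a numerical companion to Example 2.2.7, the sketch below computes the partial sums exactly and compares them with the bound 3 − 1/n and with the floating-point value of e. The function name `partial_sum` is our own choice, and the comparison with `math.e` is only an illustration of the stated limit, not a proof of it.

```python
from fractions import Fraction
import math


def partial_sum(n):
    """x_n = 1 + 1/1! + 1/2! + ... + 1/n!, computed with exact rationals."""
    total = Fraction(1)
    factorial = 1
    for k in range(1, n + 1):
        factorial *= k              # factorial now equals k!
        total += Fraction(1, factorial)
    return total


for n in (1, 2, 5, 10, 20):
    x = partial_sum(n)
    # Columns: n, x_n, the upper bound 3 - 1/n, and the gap e - x_n.
    print(n, float(x), float(3 - Fraction(1, n)), math.e - float(x))
```

The printed gap shrinks very quickly, which is consistent with the sequence being increasing, bounded above by 3, and convergent to e.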

2.3. Tests for Convergence

We first of all apply our knowledge of real sequences in Section 2.2 to study complex sequences.

THEOREM 2G. Suppose that xn and yn are real sequences and zn = xn + iyn. Then

zn → z = x + iy as n → ∞

if and only if

xn → x and yn → y as n → ∞.

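Proof. (⇒) Suppose first of all that $z_n \to z = x + iy$ as $n \to \infty$. Then given any $\epsilon > 0$, there exists $N \in \mathbb{R}$ such that

$$|z_n - z| < \epsilon \quad \text{whenever } n > N.$$

Observe now that

$$|x_n - x| = \sqrt{(x_n - x)^2} \le \sqrt{(x_n - x)^2 + (y_n - y)^2} = |z_n - z|.$$

It follows that

$$|x_n - x| < \epsilon \quad \text{whenever } n > N.$$

Similarly,

$$|y_n - y| < \epsilon \quad \text{whenever } n > N.$$

(⇐) Suppose next that $x_n \to x$ and $y_n \to y$ as $n \to \infty$. Then given any $\epsilon > 0$, there exist $N_1, N_2 \in \mathbb{R}$ such that

$$|x_n - x| < \epsilon/2 \quad \text{whenever } n > N_1,$$

and

$$|y_n - y| < \epsilon/2 \quad \text{whenever } n > N_2.$$

Observe now that

$$|z_n - z| = |(x_n + iy_n) - (x + iy)| \le |x_n - x| + |y_n - y|.$$

Let $N = \max\{N_1, N_2\} \in \mathbb{R}$. It follows that

$$|z_n - z| < \epsilon \quad \text{whenever } n > N.$$

This completes the proof.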

We now return to Theorem 2D. It turns out often that the sequences $x_n$ and $y_n$ in Theorem 2D can be constructed artificially. An example is the following result.

THEOREM 2H. (RATIO TEST) Suppose that the sequence $z_n$ satisfies

$$\left|\frac{z_{n+1}}{z_n}\right| \to \ell \quad \text{as } n \to \infty. \qquad (1)$$

(a) If $\ell < 1$, then $z_n \to 0$ as $n \to \infty$.
(b) If $\ell > 1$, then $z_n \to \infty$ as $n \to \infty$.

Proof. (a) Suppose that $\ell < 1$. Write $L = \frac{1}{2}(1 + \ell)$. Then clearly $\ell < L < 1$. On the other hand, it follows from (1) and taking $\epsilon = \frac{1}{2}(1 - \ell) > 0$ that there exists an integer $N_0$ such that

$$\left|\left|\frac{z_{n+1}}{z_n}\right| - \ell\right| < \frac{1 - \ell}{2} \quad \text{whenever } n \ge N_0.$$

In particular, we have

$$\left|\frac{z_{n+1}}{z_n}\right| < \ell + \frac{1 - \ell}{2} = L \quad \text{whenever } n \ge N_0.$$

It follows that for every $n > N_0$, we have

$$|z_n| < L|z_{n-1}| < L^2|z_{n-2}| < \ldots < L^{n-N_0}|z_{N_0}| = L^{-N_0}|z_{N_0}|L^n.$$

Let

$$M = \max_{1 \le n \le N_0} \frac{|z_n|}{L^n}.$$

Then for every $n \in \mathbb{N}$, we have

$$0 \le |z_n| \le ML^n.$$

Clearly the sequence $ML^n \to 0$ as $n \to \infty$. It follows from Theorem 2D that $|z_n| \to 0$ as $n \to \infty$, so that $z_n \to 0$ as $n \to \infty$.

(b) Suppose that $\ell > 1$. Let $w_n = 1/z_n$. Then $|w_{n+1}/w_n| \to 1/\ell$ as $n \to \infty$. It follows from (a) that $w_n \to 0$ as $n \to \infty$, so that $z_n \to \infty$ as $n \to \infty$.

Remark. No firm conclusion can be drawn when $\ell = 1$, as can be seen from the following sequences which all have $\ell = 1$:
• The sequence $z_n = c$ converges to $c$ as $n \to \infty$.
• The sequence $z_n = (-1)^n$ diverges as $n \to \infty$.
• The sequence $z_n = 1/n$ converges to 0 as $n \to \infty$.
• The sequence $z_n = n$ diverges to infinity as $n \to \infty$.
• The sequence $z_n = i^n n$ diverges to infinity as $n \to \infty$.

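Example 2.3.1. Consider the sequence $z_n = \dfrac{(n!)^2}{(2n)!}$. We have

$$\left|\frac{z_{n+1}}{z_n}\right| = \frac{z_{n+1}}{z_n} = \frac{((n + 1)!)^2}{(2(n + 1))!} \bigg/ \frac{(n!)^2}{(2n)!} = \frac{(n + 1)^2}{(2n + 2)(2n + 1)} = \frac{n^2 + 2n + 1}{4n^2 + 6n + 2} \to \frac{1}{4} \quad \text{as } n \to \infty.$$

It follows from Theorem 2H that $z_n \to 0$ as $n \to \infty$.

Example 2.3.2. Consider the sequence $z_n = \dfrac{(n!)^2}{(2n)!}5^n$. Then $|z_{n+1}/z_n| \to 5/4$ as $n \to \infty$. It follows from Theorem 2H that $z_n \to \infty$ as $n \to \infty$.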
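The ratios in Examples 2.3.1 and 2.3.2 can be tabulated directly. The sketch below is illustrative only; the names `z`, `w` and `ratio` are ours, and exact fractions are used so that the simplified form of the ratio can be compared without rounding.

```python
from fractions import Fraction
from math import factorial


def z(n):
    """Term (n!)^2 / (2n)! of Example 2.3.1."""
    return Fraction(factorial(n) ** 2, factorial(2 * n))


def w(n):
    """Term (n!)^2 5^n / (2n)! of Example 2.3.2."""
    return z(n) * 5 ** n


def ratio(seq, n):
    """Ratio of consecutive terms seq(n + 1) / seq(n)."""
    return seq(n + 1) / seq(n)


for n in (1, 10, 100, 1000):
    # The first ratio should approach 1/4 and the second 5/4.
    print(n, float(ratio(z, n)), float(ratio(w, n)))
```

Since the second ratio exceeds 1 for every n, the terms of Example 2.3.2 increase from the very first step, in line with the conclusion that they diverge to infinity.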

2.4. Recurrence Relations

In practice, it may not always be convenient to define a sequence explicitly. Sequences may often be defined by a relation connecting two or more successive terms. Here we shall not make a thorough study of such relations, but confine our discussion to two examples of real sequences.

Example 2.4.1. Suppose that $x_1 = 3$ and

$$x_{n+1} = \frac{4x_n + 2}{x_n + 3}$$

for every $n \in \mathbb{N}$. Note first of all that $0 < x_2 < x_1$. Suppose that $n > 1$ and $0 < x_n < x_{n-1}$. Then clearly $x_{n+1} > 0$. Furthermore,

$$x_{n+1} - x_n = \frac{4x_n + 2}{x_n + 3} - \frac{4x_{n-1} + 2}{x_{n-1} + 3} = \frac{10(x_n - x_{n-1})}{(x_n + 3)(x_{n-1} + 3)} < 0.$$

It follows from the Principle of induction that $x_n$ is a decreasing sequence and bounded below by 0, so that $x_n$ converges as $n \to \infty$. Suppose that $x_n \to x$ as $n \to \infty$. Then

$$x = \lim_{n \to \infty} x_{n+1} = \lim_{n \to \infty} \frac{4x_n + 2}{x_n + 3} = \frac{4x + 2}{x + 3}.$$

Hence $x = 2$. Note that the other solution $x = -1$ has to be discounted, since $x_n > 0$ for every $n \in \mathbb{N}$.
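The behaviour described in Example 2.4.1 can be observed by iterating the recurrence. The following sketch assumes nothing beyond the recurrence itself; the function name `iterate` is hypothetical, and exact fractions are used so that the monotonicity check is not affected by rounding.

```python
from fractions import Fraction


def iterate(x1, steps):
    """Return [x_1, x_2, ..., x_{steps+1}] for x_{n+1} = (4 x_n + 2) / (x_n + 3)."""
    xs = [Fraction(x1)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((4 * x + 2) / (x + 3))
    return xs


xs = iterate(3, 10)
for n, x in enumerate(xs, start=1):
    # The terms decrease towards 2 and remain above it.
    print(n, float(x))
```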

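Example 2.4.2. Let $s > 0$. Suppose that $x_1 > 0$ and that for $n > 1$, we have

$$x_n = \frac{1}{2}\left(x_{n-1} + \frac{s}{x_{n-1}}\right).$$

It is not difficult to show that $x_n > 0$ for every $n \in \mathbb{N}$. On the other hand, for $n > 1$, we have

$$x_n^2 = \frac{1}{4}\left(x_{n-1}^2 + \frac{s^2}{x_{n-1}^2} + 2s\right),$$

so that

$$x_n^2 - s = \frac{1}{4}\left(x_{n-1}^2 + \frac{s^2}{x_{n-1}^2} - 2s\right) = \frac{1}{4}\left(x_{n-1} - \frac{s}{x_{n-1}}\right)^2 \ge 0,$$

and so

$$x_{n+1} - x_n = \frac{1}{2}\left(x_n + \frac{s}{x_n}\right) - x_n = \frac{1}{2}\left(\frac{s}{x_n} - x_n\right) = \frac{s - x_n^2}{2x_n} \le 0.$$

It follows that, with the possible exception that $x_2 \le x_1$ may not hold, the sequence $x_n$ is decreasing and bounded below, so that $x_n$ converges as $n \to \infty$. Suppose that $x_n \to x$ as $n \to \infty$. Then

$$x = \lim_{n \to \infty} x_n = \lim_{n \to \infty} \frac{1}{2}\left(x_{n-1} + \frac{s}{x_{n-1}}\right) = \frac{1}{2}\left(x + \frac{s}{x}\right),$$

so that $x^2 = s$. This gives a proof that $s$ has a square root.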
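The recurrence of Example 2.4.2 is easy to run for a particular s. The sketch below takes s = 2 and x_1 = 1 purely as an illustration (both values are our own choice, as is the function name `square_root_iterates`); it shows the possible exception at the first step, followed by a decreasing sequence whose squares stay at least s.

```python
from fractions import Fraction


def square_root_iterates(s, x1, steps):
    """Return [x_1, ..., x_{steps+1}] for x_n = (x_{n-1} + s / x_{n-1}) / 2."""
    xs = [Fraction(x1)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + Fraction(s) / x) / 2)
    return xs


xs = square_root_iterates(2, 1, 6)
for n, x in enumerate(xs, start=1):
    # Columns: n, x_n, and x_n^2 - s, which is non-negative from n = 2 onwards.
    print(n, float(x), float(x * x - 2))
```

With this starting value we have x_2 > x_1, so the sequence is decreasing only from the second term onwards, exactly as the example allows.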

2.5. Subsequences

In this section, we discuss subsequences. Heuristically, a subsequence is obtained from a sequence by possibly omitting some of the terms, and keeping the remainder in the original order. We can make this more formal in the following way.

Definition. Suppose that

$$z_1, z_2, z_3, \ldots, z_n, \ldots$$

is a sequence. Suppose further that $n_1 < n_2 < n_3 < \ldots < n_p < \ldots$ is an infinite sequence of natural numbers. Then the sequence

$$z_{n_1}, z_{n_2}, z_{n_3}, \ldots, z_{n_p}, \ldots$$

is called a subsequence of the original sequence.

Example 2.5.1. The sequence 2, 4, 6, 8, . . . of even natural numbers is a subsequence of the sequence 1, 2, 3, 4, . . . of natural numbers.

Example 2.5.2. The sequence 2, 3, 5, 7, . . . of primes is not a subsequence of the sequence 1, 3, 5, 7, . . . of odd natural numbers.

Example 2.5.3. The sequence 1, 2, 3, 4, . . . of natural numbers is a subsequence of the sequence √1, √2, √3, √4, . . . .

We would like to obtain conditions under which convergent subsequences exist. We first investigate the special case of real sequences.

THEOREM 2J. Every sequence of real numbers has either an increasing subsequence or a decreasing subsequence, possibly both.

Proof. Let $x_n$ denote the sequence. We shall say that $n \in \mathbb{N}$ is a "peak" point if $x_n > x_m$ for every $m > n$. There are precisely two possibilities:

(i) Suppose that there are infinitely many peak points $n_1 < n_2 < n_3 < \ldots < n_p < \ldots$ . Then

$$x_{n_1} > x_{n_2} > x_{n_3} > \ldots > x_{n_p} > \ldots$$

is a decreasing subsequence.

(ii) Suppose that there are no or only finitely many peak points. Let $n_1 = 1$ if there are no peak points, and let $n_1 = N + 1$ if $N$ represents the largest peak point. Then $n_1$ is not a peak point, and so there exists $n_2 > n_1$ such that $x_{n_1} \le x_{n_2}$. On the other hand, $n_2$ is not a peak point, and so there exists $n_3 > n_2$ such that $x_{n_2} \le x_{n_3}$. Continuing inductively, we conclude that there exists an infinite sequence $n_1 < n_2 < n_3 < \ldots < n_p < \ldots$ of natural numbers such that

$$x_{n_1} \le x_{n_2} \le x_{n_3} \le \ldots \le x_{n_p} \le \ldots$$

is an increasing subsequence.

THEOREM 2K. Every bounded sequence of real numbers has a convergent subsequence.

Proof. By Theorem 2J, there is either an increasing subsequence which is necessarily bounded above, or a decreasing subsequence which is necessarily bounded below. It follows from Theorems 2E and 2F that the subsequence must be convergent.
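The notion of a peak point used in the proof of Theorem 2J can be illustrated on a finite list. This is only a finite analogue under our own assumptions: indices are 0-based, the function name `peak_indices` is hypothetical, and in a finite list the last index is always a peak point vacuously, so the dichotomy of the proof itself cannot be reproduced.

```python
def peak_indices(xs):
    """Return the 0-based indices n with xs[n] > xs[m] for every later index m."""
    peaks = []
    later_max = float("-inf")  # largest term strictly to the right of position n
    for n in range(len(xs) - 1, -1, -1):
        if xs[n] > later_max:
            peaks.append(n)
            later_max = xs[n]
    peaks.reverse()
    return peaks


xs = [5, 1, 4, 2, 3, 0]
idx = peak_indices(xs)
# The terms at the peak indices form a strictly decreasing subsequence.
print(idx, [xs[n] for n in idx])
```

Scanning from the right keeps the work linear in the length of the list, since each term needs to be compared only with the largest term after it.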

Example 2.5.4. For the sequence $x_n = (-1)^n$, it is easy to check that all increasing or decreasing subsequences of $x_n$ are eventually constant and so convergent.

Example 2.5.5. For the sequence $x_n = (1 + (-1)^n)n$, it is easy to check that there is an increasing subsequence 4, 8, 12, . . . ($n = 2, 4, 6, \ldots$), as well as a decreasing subsequence 0, 0, 0, . . . ($n = 1, 3, 5, \ldots$).

Example 2.5.6. The sequence $x_n = (-1)^n n^{-1}$ is convergent with limit 0. It is easy to check that there is an increasing subsequence ($n$ odd), as well as a decreasing subsequence ($n$ even), and both converge to 0. Can you convince yourself that every other subsequence of $x_n$ converges to 0 also? If not, see Theorem 2L below.

Example 2.5.7. The sequence $x_n = n$ diverges to infinity. Can you convince yourself that every subsequence of $x_n$ is increasing and diverges to infinity also?

We now no longer restrict our study to real sequences, and consider subsequences of sequences of complex numbers.

THEOREM 2L. Suppose that a sequence $z_n \to z$ as $n \to \infty$. Then for every subsequence $z_{n_p}$ of $z_n$, we have $z_{n_p} \to z$ as $p \to \infty$. In other words, every subsequence of a convergent sequence converges to the same limit.

Proof. Given any $\epsilon > 0$, there exists $N \in \mathbb{R}$ such that

$$|z_n - z| < \epsilon \quad \text{whenever } n > N.$$

Note next that $n_p \ge p$ for every $p \in \mathbb{N}$, so that $n_p > N$ whenever $p > N$. It follows that

$$|z_{n_p} - z| < \epsilon \quad \text{whenever } p > N.$$

Hence $z_{n_p} \to z$ as $p \to \infty$.

We now extend Theorem 2K to complex sequences.

THEOREM 2M. (BOLZANO-WEIERSTRASS THEOREM) Every bounded sequence of complex numbers has a convergent subsequence.

Proof. Suppose that $z_n$ is a bounded sequence of complex numbers. Let $x_n$ and $y_n$ be real sequences such that $z_n = x_n + iy_n$. Since $z_n$ is bounded, there exists $M \in \mathbb{R}$ such that $|z_n| \le M$ for every $n \in \mathbb{N}$. Then clearly $|x_n| \le M$ and $|y_n| \le M$ for every $n \in \mathbb{N}$, so that $x_n$ and $y_n$ are both bounded. By Theorem 2K, the sequence $x_n$ has a convergent subsequence $x_{n_p}$. Consider the corresponding subsequence $y_{n_p}$ of the sequence $y_n$. Clearly $|y_{n_p}| \le M$ for every $p \in \mathbb{N}$, so that $y_{n_p}$ is bounded. By Theorem 2K again, the sequence $y_{n_p}$ has a convergent subsequence $y_{n_{p_s}}$. The corresponding subsequence $x_{n_{p_s}}$ of the sequence $x_{n_p}$, being a subsequence of a convergent sequence, is again convergent, in view of Theorem 2L. It now follows from Theorem 2G that the subsequence $z_{n_{p_s}} = x_{n_{p_s}} + iy_{n_{p_s}}$ of the sequence $z_n$ is convergent.

Definition. A complex number $\zeta \in \mathbb{C}$ is said to be a limit point of a sequence $z_n$ if there exists a subsequence $z_{n_p}$ of $z_n$ such that $z_{n_p} \to \zeta$ as $p \to \infty$.

Example 2.5.8. The sequence $z_n = n$ has no limit points. To see this, note that $z_n \to \infty$ as $n \to \infty$. Let $w_n = 1/z_n$. Then $w_n \to 0$ as $n \to \infty$. It follows from Theorem 2L that every subsequence of $w_n$ converges to 0. Hence every subsequence of $z_n$ diverges to infinity.

Example 2.5.9. The sequence zn = i^n has four limit points, namely ±1 and ±i.

Example 2.5.10. The sequence

$$1, \frac{1}{2}, \frac{2}{2}, \frac{1}{3}, \frac{2}{3}, \frac{3}{3}, \frac{1}{4}, \frac{2}{4}, \frac{3}{4}, \frac{4}{4}, \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}, \frac{5}{5}, \ldots$$

has infinitely many limit points. In fact, the set of all limit points is the closed interval [0, 1]. This is a famous result in diophantine approximation.
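To see why every point of [0, 1] is approached by terms of the sequence in Example 2.5.10, one can look at each block of terms with a fixed denominator q. The sketch below uses our own names `terms` and `closest_in_block`; it illustrates that the block with denominator q contains a term within 1/q of any target in [0, 1], which is the elementary observation behind the claim and not a full proof of it.

```python
from fractions import Fraction
from itertools import islice


def terms():
    """Yield 1/1, 1/2, 2/2, 1/3, 2/3, 3/3, ... block by block."""
    q = 1
    while True:
        for p in range(1, q + 1):
            yield Fraction(p, q)
        q += 1


def closest_in_block(target, q):
    """Return the term p/q (1 <= p <= q) lying closest to target."""
    return min((Fraction(p, q) for p in range(1, q + 1)),
               key=lambda t: abs(t - target))


print(list(islice(terms(), 10)))
target = Fraction(2, 7)
for q in (5, 50, 500):
    t = closest_in_block(target, q)
    print(q, t, float(abs(t - target)))
```

Choosing one such term from blocks with increasing denominators gives a subsequence converging to the target.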

Remark. Note that Theorem 2L says that a convergent sequence has exactly one limit point. Note also

that the sequence 1, 2, 1, 3, 1, 4, 1, 5, . . . has exactly one limit point but does not converge.

We now characterize convergence of sequences in terms of boundedness and limit points.

THEOREM 2N. A sequence of complex numbers is convergent if and only if it is bounded and has exactly one limit point.

Proof. (⇒) This is a combination of Theorems 2B and 2L.

(⇐) Suppose that $z_n$ is bounded and has exactly one limit point $\zeta$. We shall show that $z_n \to \zeta$ as $n \to \infty$. Suppose on the contrary that $z_n$ does not converge to $\zeta$ as $n \to \infty$. Then there exists a constant $\epsilon_0 > 0$ such that for every $N \in \mathbb{N}$, there exists $n > N$ such that $|z_n - \zeta| \ge \epsilon_0$. Putting $N = 1$, there exists $n_1 > 1$ such that $|z_{n_1} - \zeta| \ge \epsilon_0$. Putting $N = n_1$, there exists $n_2 > n_1$ such that $|z_{n_2} - \zeta| \ge \epsilon_0$. Putting $N = n_2$, there exists $n_3 > n_2$ such that $|z_{n_3} - \zeta| \ge \epsilon_0$. Proceeding inductively, we obtain a sequence $n_1 < n_2 < n_3 < \ldots < n_p < \ldots$ of natural numbers such that $|z_{n_p} - \zeta| \ge \epsilon_0$ for every $p \in \mathbb{N}$. Since $z_n$ is bounded, the subsequence $z_{n_p}$ is also bounded. It follows from the Bolzano-Weierstrass theorem that $z_{n_p}$ has a convergent subsequence $z_{n_{p_s}}$. Suppose that $z_{n_{p_s}} \to z$ as $s \to \infty$. Then clearly $z \ne \zeta$, for $|z_{n_{p_s}} - \zeta| \ge \epsilon_0$ for every $s \in \mathbb{N}$. This means that $z$ is another limit point of the sequence $z_n$, contradicting the assumption that $z_n$ has exactly one limit point.

Recall that the set $\mathbb{R}$ is complete, in terms of the Axiom of bound. We now study completeness from a different viewpoint.

Definition. A sequence $z_n$ of complex numbers is said to be a Cauchy sequence if, given any $\epsilon > 0$, there exists $N = N(\epsilon) \in \mathbb{R}$, depending on $\epsilon$, such that $|z_m - z_n| < \epsilon$ whenever $m > n \ge N$.

It is easy to establish the following.

THEOREM 2P. Suppose that a sequence zn is convergent. Then zn is a Cauchy sequence.

Proof. Suppose that $z_n \to z$ as $n \to \infty$. Then given any $\epsilon > 0$, there exists $N \in \mathbb{R}$ such that

$$|z_n - z| < \epsilon/2 \quad \text{whenever } n > N.$$

It follows that

$$|z_m - z_n| = |(z_m - z) + (z - z_n)| \le |z_m - z| + |z_n - z| < \epsilon \quad \text{whenever } m > n \ge N + 1.$$

Hence zn is a Cauchy sequence.

An alternative way of saying that R and C are complete is the following result.

THEOREM 2Q. Suppose that zn is a Cauchy sequence. Then zn is convergent.

Proof. Since $z_n$ is a Cauchy sequence, there exists $N \in \mathbb{N}$ such that

$$|z_n - z_N| < 1 \quad \text{whenever } n \ge N,$$

so that

$$|z_n| < 1 + |z_N| \quad \text{whenever } n \ge N.$$

Let $M = 1 + \max\{|z_1|, \ldots, |z_N|\}$. Then $|z_n| \le M$ for every $n \in \mathbb{N}$, so that $z_n$ is bounded. It follows from the Bolzano-Weierstrass theorem that $z_n$ has a convergent subsequence $z_{n_p}$. Suppose that $z_{n_p} \to \zeta$ as $p \to \infty$. In view of Theorem 2N, it remains to show that $\zeta$ is the only limit point of $z_n$. Suppose on the contrary that $z$ is another limit point of $z_n$. Then there exists another subsequence $z_{n'_r}$ of $z_n$ such that $z_{n'_r} \to z$ as $r \to \infty$.

Let $\epsilon = \frac{1}{3}|\zeta - z| > 0$. Then there exist $P, R \in \mathbb{R}$ such that

$$|z_{n_p} - \zeta| < \epsilon \quad \text{whenever } p > P,$$

and

$$|z_{n'_r} - z| < \epsilon \quad \text{whenever } r > R.$$

It follows that for every $p > P$ and $r > R$, we have

$$|z_{n_p} - z_{n'_r}| = |(z_{n_p} - \zeta) - (z_{n'_r} - z) + (\zeta - z)| \ge |\zeta - z| - |z_{n_p} - \zeta| - |z_{n'_r} - z| > \frac{1}{3}|\zeta - z|,$$

contradicting that $z_n$ is a Cauchy sequence.

1. Consider the sequence $z_n = \dfrac{4n + 3}{5n + 2}$.
a) Make a guess for the limit of $z_n$ as $n \to \infty$.
b) Use the $\epsilon$-$N$ definition to verify that your guess is correct.

2. Show that the sequence

$$z_n = \frac{n}{2n + 1} + \frac{\cos(e^{\sin(25\pi n^5)} \log(n^2))}{n^3}$$

is convergent as $n \to \infty$, find its limit and explain every step of your argument.

3. Suppose that $z_n \to \ell$ as $n \to \infty$, and that $w_n = \dfrac{z_1 + z_2 + \ldots + z_n}{n}$. Show that $w_n \to \ell$ as $n \to \infty$.
[Hint: Consider first the case $\ell = 0$.]

4. Prove that the following sequences converge as $n \to \infty$ and find their limits except for part (d):
a) $z_n = (n + 1)^{1/4} - n^{1/4}$
b) $z_n = \dfrac{1 + 2 + \ldots + n}{n^2}$
c) $z_n = \dfrac{n}{2^n}$
d) $z_n = \dfrac{1}{n + 1} + \dfrac{1}{n + 2} + \ldots + \dfrac{1}{2n}$

5. Show that the real sequence $x_n = \left(1 + \dfrac{1}{n}\right)^n$ is increasing and bounded above.
[Remark: Hence it converges. The limit is the number $e$.]

6. Suppose that $z$ is a fixed complex number. Discuss the convergence and divergence of the sequence

$$z_n = \frac{z + z^n}{1 + z^n},$$

explain every step of your argument, and take care to distinguish the four cases
a) $|z| > 1$; b) $|z| < 1$; c) $z = 1$; d) $|z| = 1$, but $z \ne 1$.

7. A real sequence $x_n$ is defined inductively by $x_1 = 1$ and $x_{n+1} = \sqrt{x_n + 6}$ for every $n \in \mathbb{N}$.
a) Prove by induction that $x_n$ is increasing, and $x_n < 3$ for every $n \in \mathbb{N}$.
b) Deduce that $x_n$ converges as $n \to \infty$ and find its limit.

8. Suppose that $x_1 < x_2$ and $x_{n+2} = \frac{1}{2}(x_{n+1} + x_n)$ for every $n \in \mathbb{N}$. Show that
a) $x_{n+2} > x_n$ for every odd $n \in \mathbb{N}$;
b) $x_{n+2} < x_n$ for every even $n \in \mathbb{N}$; and
c) $x_n \to \frac{1}{3}(x_1 + 2x_2)$ as $n \to \infty$.

9. Find the limit points of each of the following complex sequences:
a) $z_n = (-1)^n$
b) $z_n = (2i)^n$
c) $z_n = \left(\dfrac{1 + i}{\sqrt{2}}\right)^n$

10. Show that a complex sequence $z_n$ has exactly one of the following two properties:
a) $z_n \to \infty$ as $n \to \infty$.
b) $z_n$ has a convergent subsequence.
[Hint: Assume that (a) fails. Show that (b) must then hold.]

11. Suppose that $0 < b < 1$ and that the sequence $a_n$ satisfies the condition that $|a_{n+1} - a_n| \le b^n$ for every $n \in \mathbb{N}$. Use Theorem 2Q to prove that $a_n$ is convergent as $n \to \infty$.




W W L CHEN

c W W L Chen, 1982, 2008.






Chapter 3

SERIES

3.1. Introduction

Suppose that $z_n$ is a real or complex sequence. For every $N \in \mathbb{N}$, let

$$s_N = \sum_{n=1}^{N} z_n = z_1 + \ldots + z_N.$$

We shall call

$$\sum_{n=1}^{\infty} z_n \qquad (1)$$

a series, and $s_N$ the $N$-th partial sum of the series.

Definition. If the sequence $s_N$ converges to $s$ as $N \to \infty$, then we say that the series (1) converges to the sum $s$ and write

$$\sum_{n=1}^{\infty} z_n = s.$$

In this case, we sometimes simply say that the series (1) is convergent. On the other hand, if the sequence $s_N$ diverges as $N \to \infty$, then we say that the series (1) is divergent.

Since the partial sums of a series form a sequence, we deduce immediately from Theorems 2P and 2Q the following useful result.


THEOREM 3A. (GENERAL PRINCIPLE OF CONVERGENCE FOR SERIES) The series (1) is convergent if and only if, given any $\epsilon > 0$, there exists a number $N_0$ such that

$$\left|\sum_{n=N+1}^{M} z_n\right| < \epsilon \quad \text{whenever } M > N \ge N_0.$$

Remark. Note that Theorem 3A says that the series (1) is convergent if and only if the sequence $s_N$ of partial sums forms a Cauchy sequence. To prove Theorem 3A, we simply observe that

$$\sum_{n=N+1}^{M} z_n = s_M - s_N.$$

Before we study the convergence of series in general, we first look at some very useful examples.

THEOREM 3B. (GEOMETRIC SERIES) The real geometric series

$$\sum_{n=1}^{\infty} x^{n-1} = 1 + x + x^2 + x^3 + \ldots$$

is convergent if and only if $|x| < 1$.

Proof. It is easy to see that the sequence $s_N$ of partial sums satisfies

$$s_N = \begin{cases} \dfrac{1 - x^N}{1 - x} & \text{if } x \ne 1; \\ N & \text{if } x = 1. \end{cases}$$

If $x = 1$, then the sequence $s_N$ is clearly not bounded, and so is not convergent as $N \to \infty$. On the other hand, we note from Example 2.2.4 that $x^N \to 0$ as $N \to \infty$ if $|x| < 1$, so that the series is convergent in this case. Finally, we note from Example 2.2.4 again that $x^N$ is divergent if $x > 1$ or $x \le -1$, so that the series is divergent in these cases.
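The closed form for the partial sums in the proof of Theorem 3B can be compared with a direct summation. In the sketch below the names `partial_sum` and `closed_form` are our own, and the argument x is assumed to be a `Fraction` so that both sides are computed exactly.

```python
from fractions import Fraction


def partial_sum(x, N):
    """s_N = 1 + x + x^2 + ... + x^(N-1), summed term by term."""
    return sum((x ** (n - 1) for n in range(1, N + 1)), Fraction(0))


def closed_form(x, N):
    """(1 - x^N) / (1 - x) if x != 1, and N if x = 1."""
    return Fraction(N) if x == 1 else (1 - x ** N) / (1 - x)


for x in (Fraction(1, 2), Fraction(-2, 3), Fraction(1), Fraction(3)):
    # For |x| < 1 the partial sums settle down; otherwise they do not.
    print(x, [float(partial_sum(x, N)) for N in (1, 5, 10, 20)])
```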

THEOREM 3C. (HARMONIC SERIES) The real harmonic series

$$\sum_{n=1}^{\infty} n^{-k}$$

is convergent if $k > 1$ and is divergent if $k \le 1$.

Proof. Consider first the case $k = 1$. Clearly

$$s_N = \sum_{n=1}^{N} n^{-1}$$

is an increasing real sequence. To show that the series is divergent, it suffices, in view of Theorem 2E, to show that the sequence $s_N$ is not bounded above. We shall achieve this by proving that

$$s_{2^m} \ge 1 + \tfrac{1}{2}m \quad \text{for every } m \in \mathbb{N}. \qquad (2)$$

The inequality is clearly true for $m = 1$, since $s_2 = \frac{3}{2}$. Suppose now that $s_{2^p} \ge 1 + \frac{1}{2}p$. Then

$$s_{2^{p+1}} = s_{2^p} + \frac{1}{2^p + 1} + \frac{1}{2^p + 2} + \ldots + \frac{1}{2^{p+1}} \ge s_{2^p} + \frac{2^p}{2^{p+1}} \ge 1 + \frac{1}{2}p + \frac{1}{2} = 1 + \frac{1}{2}(p + 1).$$

The assertion (2) now follows from the Principle of induction.

Suppose next that $k < 1$. In this case, we have $n^{-k} \ge n^{-1}$ for every $n \in \mathbb{N}$, and so

$$s_N = \sum_{n=1}^{N} n^{-k} \ge \sum_{n=1}^{N} n^{-1}.$$

It therefore follows from the first part that the sequence $s_N$ is not bounded above. Clearly $s_N$ is an increasing real sequence. It follows from Theorem 2E that the series is divergent.

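Suppose finally that $k > 1$. Again, the sequence

$$s_N = \sum_{n=1}^{N} n^{-k}$$

is an increasing sequence. To show that the series is convergent, it suffices, in view of Theorem 2E, to show that the sequence $s_N$ is bounded above. Let $t \in \mathbb{N}$ satisfy $N < 2^t$. Then

$$s_N \le s_{2^t - 1} = 1 + \frac{1}{2^k} + \frac{1}{3^k} + \ldots + \frac{1}{(2^t - 1)^k}$$

$$= 1 + \left(\frac{1}{2^k} + \frac{1}{3^k}\right) + \left(\frac{1}{4^k} + \ldots + \frac{1}{7^k}\right) + \left(\frac{1}{8^k} + \ldots + \frac{1}{15^k}\right) + \ldots + \left(\frac{1}{(2^{t-1})^k} + \ldots + \frac{1}{(2^t - 1)^k}\right)$$

$$< 1 + \frac{2}{2^k} + \frac{4}{4^k} + \frac{8}{8^k} + \ldots + \frac{2^{t-1}}{(2^{t-1})^k}$$

$$= 1 + \frac{1}{2^{k-1}} + \left(\frac{1}{2^{k-1}}\right)^2 + \left(\frac{1}{2^{k-1}}\right)^3 + \ldots + \left(\frac{1}{2^{k-1}}\right)^{t-1} < M,$$

where

$$M = 1 + \frac{1}{2^{k-1}} + \left(\frac{1}{2^{k-1}}\right)^2 + \left(\frac{1}{2^{k-1}}\right)^3 + \ldots = \sum_{n=1}^{\infty} \left(\frac{1}{2^{k-1}}\right)^{n-1}$$

is the sum of a convergent geometric series.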
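Both halves of the proof of Theorem 3C can be checked numerically for small cases. The sketch below is an illustration under our own choices: the function name `partial_sum` is hypothetical, the exponent k is restricted to positive integers so that exact fractions can be used, and for the convergent case we take k = 2, for which the bound M in the proof equals 2.

```python
from fractions import Fraction


def partial_sum(N, k):
    """s_N = sum of n^(-k) for n = 1, ..., N, with k a positive integer."""
    return sum(Fraction(1, n ** k) for n in range(1, N + 1))


# Case k = 1: the inequality (2), s_{2^m} >= 1 + m/2.
for m in range(1, 9):
    print(m, float(partial_sum(2 ** m, 1)), 1 + m / 2)

# Case k = 2: the partial sums stay below M = 2.
for N in (1, 10, 100, 500):
    print(N, float(partial_sum(N, 2)))
```

The first table grows without bound, if slowly, while the second stays below 2, matching the two conclusions of the theorem for these exponents.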

We now turn to some very simple properties of series. The proofs of the following three results are left as exercises.

THEOREM 3D. The convergence or divergence of a series is unaffected if a finite number of terms are inserted, deleted or altered.

THEOREM 3E. Suppose that

$$\sum_{n=1}^{\infty} z_n = s \quad \text{and} \quad \sum_{n=1}^{\infty} w_n = t.$$

Then for all real numbers $a, b \in \mathbb{R}$, we have

$$\sum_{n=1}^{\infty} (az_n + bw_n) = as + bt.$$

THEOREM 3F. Suppose that the series (1) is convergent. Then $z_n \to 0$ as $n \to \infty$.

Remark. The converse of Theorem 3F is not true. For example, let $z_n = 1/n$. Clearly $z_n \to 0$ as $n \to \infty$. Note that the series (1) is not convergent in this case, in view of Theorem 3C.

3.2. Real Series

We first summarize the main idea in the proof of Theorem 3C.

THEOREM 3G. Suppose that $x_n \ge 0$ for every $n \in \mathbb{N}$. Then the series

$$\sum_{n=1}^{\infty} x_n$$

either converges to the supremum of the partial sums, or diverges to $\infty$.

Proof. The partial sums form an increasing sequence. The result follows from Theorem 2E.

Very often, we can study the convergence or divergence of a series by comparing it with another series. We shall first of all study this phenomenon in the special case of series with non-negative terms.

THEOREM 3H. (COMPARISON TEST FOR SERIES WITH NON-NEGATIVE TERMS) Let $C$ be a positive constant independent of $n \in \mathbb{N}$. Suppose that for all sufficiently large natural numbers $n \in \mathbb{N}$, the inequalities $u_n \ge 0$, $v_n \ge 0$ and $u_n \le Cv_n$ hold.
(a) If $\sum_{n=1}^{\infty} v_n$ is convergent, then $\sum_{n=1}^{\infty} u_n$ is convergent.
(b) If $\sum_{n=1}^{\infty} u_n$ is divergent, then $\sum_{n=1}^{\infty} v_n$ is divergent.

Proof. Note that (a) and (b) are equivalent, so we shall only prove (a). We shall use the General principle of convergence for series. Since the series

$$\sum_{n=1}^{\infty} v_n$$

is convergent, it follows that, given any $\epsilon > 0$, there exists $N_0$ such that for every natural number $n > N_0$, the three given inequalities hold, and

$$\sum_{n=N+1}^{M} v_n < \frac{\epsilon}{C} \quad \text{whenever } M > N \ge N_0,$$

so that

$$\sum_{n=N+1}^{M} u_n < \epsilon \quad \text{whenever } M > N \ge N_0.$$

The convergence of the series

$$\sum_{n=1}^{\infty} u_n$$

now follows from the General principle of convergence for series.

Example 3.2.1. Suppose that $p \in \mathbb{Q}$ and $0 < a < 1$. We shall prove that the series

$$\sum_{n=1}^{\infty} n^p a^n \qquad (3)$$

is convergent. Using the Ratio test for sequences, we can show that the sequence $n^{p+2}a^n \to 0$ as $n \to \infty$. It follows that for all sufficiently large natural numbers $n \in \mathbb{N}$, we have $n^{p+2}a^n < 1$, so that $n^p a^n < n^{-2}$. This last inequality allows us to compare the series (3) with the convergent harmonic series

$$\sum_{n=1}^{\infty} n^{-2}.$$

We now investigate series where the terms can be negative as well as non-negative real numbers. There is then the possibility of cancellation among terms. We first study a simple example.

Example 3.2.2. Recall that the series

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \ldots$$

is divergent. Let us now consider the series

$$\sum_{n=1}^{\infty} (-1)^{n-1}\frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots . \qquad (4)$$

Denote the partial sums by

$$s_N = \sum_{n=1}^{N} (-1)^{n-1}\frac{1}{n}.$$

Then it is not too difficult to see that for every $m \in \mathbb{N}$, we have

$$s_1 \ge s_3 \ge s_5 \ge \ldots \ge s_{2m-1} \ge s_{2m} \ge \ldots \ge s_6 \ge s_4 \ge s_2.$$

It follows that the sequence $s_1, s_3, s_5, \ldots$ is decreasing and bounded below by $s_2$, while the sequence $s_2, s_4, s_6, \ldots$ is increasing and bounded above by $s_1$. So both sequences converge. Note also that

$$s_{2m-1} - s_{2m} = \frac{1}{2m} \to 0$$

as $m \to \infty$, so that the two sequences converge to the same limit. This means that the sequence $s_N$ converges as $N \to \infty$, so that the series (4) is convergent.
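The interlacing of the odd and even partial sums in Example 3.2.2 can be observed directly. The sketch below uses a hypothetical helper `partial_sums`; the comparison with `math.log(2)` anticipates the value of the sum quoted in Example 3.2.3 and is included only as an illustration.

```python
from fractions import Fraction
import math


def partial_sums(N):
    """Return [s_1, ..., s_N] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], Fraction(0)
    for n in range(1, N + 1):
        total += Fraction((-1) ** (n - 1), n)
        sums.append(total)
    return sums


sums = partial_sums(200)
odd = sums[0::2]    # s_1, s_3, s_5, ...
even = sums[1::2]   # s_2, s_4, s_6, ...
print([float(s) for s in odd[:4]], [float(s) for s in even[:4]])
print(float(sums[-1]), math.log(2))
```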

We now state and establish the result in general.

THEOREM 3J. (ALTERNATING SERIES TEST) Suppose that
(a) $a_n > 0$ for every $n \in \mathbb{N}$;
(b) $a_n$ is a decreasing sequence; and
(c) $a_n \to 0$ as $n \to \infty$.
Then the series

$$\sum_{n=1}^{\infty} (-1)^{n-1}a_n$$

is convergent.

Proof. Consider the sequence of partial sums

$$s_N = \sum_{n=1}^{N} (-1)^{n-1}a_n.$$

In view of conditions (a) and (b), it is not too difficult to see that for every $m \in \mathbb{N}$, we have

$$s_1 \ge s_3 \ge s_5 \ge \ldots \ge s_{2m-1} \ge s_{2m} \ge \ldots \ge s_6 \ge s_4 \ge s_2.$$

It follows that the sequence $s_1, s_3, s_5, \ldots$ is decreasing and bounded below by $s_2$, while the sequence $s_2, s_4, s_6, \ldots$ is increasing and bounded above by $s_1$. So both sequences converge. Note also that in view of condition (c), we have

$$s_{2m-1} - s_{2m} = a_{2m} \to 0$$

as $m \to \infty$, so that the two sequences converge to the same limit. Hence the sequence $s_N$ converges as $N \to \infty$.

Example 3.2.3. The logarithmic series

$$\sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^n}{n}$$

is convergent (with sum $\log 2$) if $x = 1$ and divergent if $x = -1$.

3.3. Complex Series

THEOREM 3K. Suppose that $z_n \in \mathbb{C}$ for every $n \in \mathbb{N}$. If the series

$$\sum_{n=1}^{\infty} |z_n| \qquad (5)$$

is convergent, then the series

$$\sum_{n=1}^{\infty} z_n \qquad (6)$$

is convergent. Furthermore, we have

$$\left|\sum_{n=1}^{\infty} z_n\right| \le \sum_{n=1}^{\infty} |z_n|.$$

We shall give two proofs of this result. The first proof uses the General principle of convergence, while the second one relies on considering real and imaginary parts of the terms $z_n$ and then studying the non-negative and negative parts of the real sequences that arise.

First Proof of Theorem 3K. Since the series (5) is convergent, it follows from the General principle of convergence for series that, given any $\epsilon > 0$, there exists a number $N_0$ such that

$$\sum_{n=N+1}^{M} |z_n| < \epsilon \quad \text{whenever } M > N \ge N_0.$$

By the Triangle inequality, we have

$$\left|\sum_{n=N+1}^{M} z_n\right| \le \sum_{n=N+1}^{M} |z_n| < \epsilon \quad \text{whenever } M > N \ge N_0.$$

It follows from the General principle of convergence for series that the series (6) is convergent. Note next that the sequence

$$T_N = \sum_{n=1}^{N} |z_n| - \left|\sum_{n=1}^{N} z_n\right|$$

is a non-negative convergent sequence as $N \to \infty$, in view of the Triangle inequality. It follows that

$$\lim_{N \to \infty} T_N = \sum_{n=1}^{\infty} |z_n| - \left|\sum_{n=1}^{\infty} z_n\right| \quad \text{and} \quad \lim_{N \to \infty} T_N \ge 0.$$

Second Proof of Theorem 3K. Assume first of all that the first part of Theorem 3K holds for the special case when the sequence $z_n$ is replaced by a real sequence $u_n$. Then for $z_n \in \mathbb{C}$, we write $z_n = x_n + \mathrm{i} y_n$, where $x_n, y_n \in \mathbb{R}$. Since the series (5) is convergent, the inequalities $|x_n| \le |z_n|$ and $|y_n| \le |z_n|$ enable us to use the Comparison test to conclude that the two series

$$\sum_{n=1}^{\infty} |x_n| \quad \text{and} \quad \sum_{n=1}^{\infty} |y_n|$$

are convergent, and so it follows from the special case of the first part of Theorem 3K that the series

$$\sum_{n=1}^{\infty} x_n \quad \text{and} \quad \sum_{n=1}^{\infty} y_n$$

are convergent. The convergence of the series (6) now follows from Theorem 3E.

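To show that the first part of Theorem 3K holds for real sequences $u_n$, note that for every $n \in \mathbb{N}$, we clearly have $u_n = u_n^+ - u_n^-$, where

$$u_n^+ = \begin{cases} u_n & \text{if } u_n \ge 0, \\ 0 & \text{if } u_n < 0, \end{cases} \quad \text{and} \quad u_n^- = \begin{cases} 0 & \text{if } u_n \ge 0, \\ -u_n & \text{if } u_n < 0. \end{cases}$$

Furthermore, $0 \le u_n^+ \le |u_n|$ and $0 \le u_n^- \le |u_n|$ for every $n \in \mathbb{N}$. If the series

$$\sum_{n=1}^{\infty} |u_n|$$

is convergent, then it follows from the Comparison test that the series

$$\sum_{n=1}^{\infty} u_n^+ \quad \text{and} \quad \sum_{n=1}^{\infty} u_n^-$$

are both convergent. The convergence of the series

$$\sum_{n=1}^{\infty} u_n = \sum_{n=1}^{\infty} (u_n^+ - u_n^-)$$

now follows from Theorem 3E.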

The second part of Theorem 3K is proved in the same way as before.

Definition. A series $\sum_{n=1}^{\infty} z_n$ is said to be absolutely convergent if the series $\sum_{n=1}^{\infty} |z_n|$ is convergent.

Remark. Theorem 3K states that every absolutely convergent series is convergent.

The Comparison test can now be stated in a much stronger form.

THEOREM 3L. (COMPARISON TEST) Let $C$ be a positive constant independent of $n \in \mathbb{N}$. Suppose that for all sufficiently large natural numbers $n \in \mathbb{N}$, the inequality $|z_n| \le C v_n$ holds. Suppose further that the real series

$$\sum_{n=1}^{\infty} v_n$$

is convergent. Then the series

$$\sum_{n=1}^{\infty} z_n$$

is absolutely convergent.

Much of the study of convergence of series is underpinned by our ability to compare a given series with an artificially constructed series. Two examples of this technique are given by the two tests below.

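THEOREM 3M. (RATIO TEST) Suppose that the sequence $z_n$ satisfies

$$\left| \frac{z_{n+1}}{z_n} \right| \to \ell \quad \text{as } n \to \infty. \tag{7}$$

Then the series

$$\sum_{n=1}^{\infty} z_n \tag{8}$$

is absolutely convergent if $\ell < 1$ and divergent if $\ell > 1$.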

Proof. Suppose first of all that $\ell < 1$. Let $L = \frac{1}{2}(1 + \ell)$. Clearly $\ell < L < 1$. Since (7) holds, there exists an integer $N$ such that

$$\left| \frac{z_{n+1}}{z_n} \right| < L \quad \text{whenever } n \ge N.$$

It follows that

$$|z_n| < \frac{|z_N|}{L^N} L^n \quad \text{whenever } n > N.$$

On the other hand, the geometric series

$$\sum_{n=1}^{\infty} L^n$$

is convergent. It follows from the Comparison test that the series (8) is absolutely convergent. Suppose next that $\ell > 1$. Then clearly $|z_n| \not\to 0$ as $n \to \infty$. The result follows from Theorem 3F.

THEOREM 3N. (ROOT TEST) Suppose that the sequence $z_n$ satisfies

$$|z_n|^{1/n} \to \ell \quad \text{as } n \to \infty. \tag{9}$$

Then the series

$$\sum_{n=1}^{\infty} z_n \tag{10}$$

is absolutely convergent if $\ell < 1$ and divergent if $\ell > 1$.

Proof. Suppose first of all that $\ell < 1$. Let $L = \frac{1}{2}(1 + \ell)$. Clearly $\ell < L < 1$. Since (9) holds, there exists an integer $N$ such that

$$|z_n|^{1/n} < L \quad \text{whenever } n > N.$$

It follows that

$$|z_n| < L^n \quad \text{whenever } n > N.$$

On the other hand, the geometric series

$$\sum_{n=1}^{\infty} L^n$$

is convergent. It follows from the Comparison test that the series (10) is absolutely convergent. Suppose next that $\ell > 1$. Then clearly $|z_n| \not\to 0$ as $n \to \infty$. The result follows from Theorem 3F.

Remark. No firm conclusion can be drawn in the two settings above if $\ell = 1$. In the case of the Ratio test, consider the two series

$$\sum_{n=1}^{\infty} \frac{1}{n} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^2}.$$

It is easy to show that $\ell = 1$ in both cases. Note from Theorem 3C that the first series is divergent while the second series is convergent.
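
The inconclusive case can be explored numerically. The sketch below, in which the helper names `ratio`, `partial_sum`, `harmonic` and `inverse_square` are all hypothetical, prints the ratio of consecutive terms for both series together with a few partial sums. It is only an illustration under these assumptions: the ratios drift towards 1 in both cases, yet the partial sums behave quite differently.

```python
def harmonic(n):
    """n-th term of the series sum 1/n."""
    return 1 / n

def inverse_square(n):
    """n-th term of the series sum 1/n^2."""
    return 1 / n**2

def ratio(term, n):
    """Return |term(n+1) / term(n)|, the quantity studied in the Ratio test."""
    return abs(term(n + 1) / term(n))

def partial_sum(term, N):
    """Return term(1) + ... + term(N)."""
    return sum(term(n) for n in range(1, N + 1))

for n in (10, 1000, 100000):
    print(n, ratio(harmonic, n), ratio(inverse_square, n))

for N in (100, 10000):
    print(N, partial_sum(harmonic, N), partial_sum(inverse_square, N))
```

The partial sums of the first series keep growing, roughly like log N, while those of the second stay below 2.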

We conclude this section by considering rearrangements of a given series. The following example is famous.

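Example 3.3.1. Recall that the series

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \ldots$$

is divergent. On the other hand, the series

$$\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots$$

is convergent, in view of the Alternating series test. Let $s$ be its sum, so that

$$s = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots.$$

We next rearrange the terms and consider the series

$$\begin{aligned}
& 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \frac{1}{12} + \ldots \\
&\quad = \left( 1 - \frac{1}{2} - \frac{1}{4} \right) + \left( \frac{1}{3} - \frac{1}{6} - \frac{1}{8} \right) + \left( \frac{1}{5} - \frac{1}{10} - \frac{1}{12} \right) + \ldots \\
&\quad = \left( \frac{1}{2} - \frac{1}{4} \right) + \left( \frac{1}{6} - \frac{1}{8} \right) + \left( \frac{1}{10} - \frac{1}{12} \right) + \ldots \\
&\quad = \frac{1}{2} \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \ldots \right) = \frac{s}{2}.
\end{aligned}$$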

Note that no term has been omitted or inserted in the rearrangement. Note also that $s \ne 0$. Yet we end up with a different sum. The only possible explanation is that the convergence of the original and the rearranged series depends on cancellation between positive and negative terms. The difference therefore has to arise from the nature of such cancellation.
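
The effect of the rearrangement can also be observed numerically. The sketch below is an illustration only: the function name `rearranged_partial_sum` is hypothetical, and the value log 2 for the sum s is taken from Example 3.2.3. It adds up the rearranged series in blocks of three terms, one positive term followed by two negative ones, and compares the result with s/2.

```python
import math

def rearranged_partial_sum(K):
    """Sum the first K blocks 1/(2k-1) - 1/(4k-2) - 1/(4k) of the rearranged series."""
    total = 0.0
    for k in range(1, K + 1):
        total += 1 / (2 * k - 1)   # next unused positive term
        total -= 1 / (4 * k - 2)   # next two unused negative terms
        total -= 1 / (4 * k)
    return total

s = math.log(2)  # sum of the original alternating series
for K in (10, 1000, 100000):
    print(K, rearranged_partial_sum(K), s / 2)
```

Each block contributes 1/(4k − 2) − 1/(4k), which is half of the corresponding pair 1/(2k − 1) − 1/(2k) in the original series, so the partial sums settle near s/2 rather than s.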

Suppose now that the convergence of a series does not depend on the cancellation between positive and negative terms. Then it is reasonable to ask whether any rearrangement of the terms may still alter the sum of the series.

THEOREM 3P. Any rearrangement of an absolutely convergent series

$$\sum_{n=1}^{\infty} z_n \tag{11}$$

does not alter its sum.

Proof. Assume first of all that Theorem 3P holds for the special case when the sequence $z_n$ is replaced by a real sequence $u_n$. Then for $z_n \in \mathbb{C}$, we write $z_n = x_n + \mathrm{i} y_n$, where $x_n, y_n \in \mathbb{R}$. Since the series (11) is absolutely convergent, the inequalities $|x_n| \le |z_n|$ and $|y_n| \le |z_n|$ enable us to use the Comparison test to conclude that the two series

$$\sum_{n=1}^{\infty} x_n \quad \text{and} \quad \sum_{n=1}^{\infty} y_n$$

are absolutely convergent, and so it follows from the special case of Theorem 3P that rearrangement does not alter their sums. It now follows from Theorem 3E that rearrangement does not alter the sum of the series (11).

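To establish the special case of Theorem 3P, suppose that the real series

$$\sum_{n=1}^{\infty} u_n$$

is absolutely convergent, and that the sequence $v_n$ is a rearrangement of the sequence $u_n$. We now define $u_n^+, u_n^-, v_n^+, v_n^-$ in the same way as in the second proof of Theorem 3K. Then $v_n^+$ is a rearrangement of $u_n^+$ and $v_n^-$ is a rearrangement of $u_n^-$. Clearly the series

$$\sum_{n=1}^{\infty} u_n^+$$

is convergent. Also, the sequence

$$\sum_{n=1}^{N} v_n^+$$

is increasing and bounded above by

$$\sum_{n=1}^{\infty} u_n^+,$$

so that

$$\sum_{n=1}^{\infty} v_n^+ \le \sum_{n=1}^{\infty} u_n^+.$$

Arguing in the opposite way, we must have

$$\sum_{n=1}^{\infty} u_n^+ \le \sum_{n=1}^{\infty} v_n^+.$$

Hence

$$\sum_{n=1}^{\infty} v_n^+ = \sum_{n=1}^{\infty} u_n^+.$$

Similarly,

$$\sum_{n=1}^{\infty} v_n^- = \sum_{n=1}^{\infty} u_n^-.$$

It now follows that

$$\sum_{n=1}^{\infty} v_n = \sum_{n=1}^{\infty} v_n^+ - \sum_{n=1}^{\infty} v_n^- = \sum_{n=1}^{\infty} u_n^+ - \sum_{n=1}^{\infty} u_n^- = \sum_{n=1}^{\infty} u_n,$$

and the proof is complete.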

3.4. Power Series

Suppose that $z \in \mathbb{C}$. A series of the form

$$\sum_{n=0}^{\infty} a_n z^n, \tag{12}$$

where the coefficients $a_n \in \mathbb{C}$ for every $n \in \mathbb{N} \cup \{0\}$, is called a power series in the variable $z$. Note that it is convenient here to start the series with $n = 0$.

In the first two examples below, the case $z = 0$ is obvious, while the Ratio test can be applied to study the case $z \ne 0$.

Example 3.4.1. The exponential series

$$\sum_{n=0}^{\infty} \frac{z^n}{n!}$$

is absolutely convergent for every $z \in \mathbb{C}$.


Example 3.4.2. The logarithmic series

$$\sum_{n=1}^{\infty} (-1)^{n-1} \frac{z^n}{n}$$

is absolutely convergent for every $z \in \mathbb{C}$ satisfying $|z| < 1$ and is divergent for every $z \in \mathbb{C}$ satisfying $|z| > 1$.
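
To see why the Ratio test separates the cases |z| < 1 and |z| > 1 for the logarithmic series, one can tabulate the ratio of consecutive terms. The following is a minimal sketch; the helper names `log_series_term` and `term_ratio` are hypothetical, and the sample points are arbitrary choices. Moderate values of n are used so that the powers of z neither underflow nor overflow in floating point.

```python
def log_series_term(z, n):
    """n-th term (-1)^(n-1) z^n / n of the logarithmic series."""
    return (-1) ** (n - 1) * z**n / n

def term_ratio(z, n):
    """Return |term_{n+1} / term_n|, which equals |z| * n / (n + 1)."""
    return abs(log_series_term(z, n + 1) / log_series_term(z, n))

for z in (0.5 + 0.5j, 2j):
    for n in (10, 50, 200):
        print(z, n, term_ratio(z, n), abs(z))
```

In each case the ratio approaches |z|, so the limit in the Ratio test is |z| itself.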

Example 3.4.3. The series

$$\sum_{n=1}^{\infty} n! \, z^n$$

is divergent for every non-zero $z \in \mathbb{C}$. To see this, we use Theorem 3F, and note that for any fixed $z \ne 0$, the sequence $n! \, z^n$ does not converge to 0 as $n \to \infty$.

The purpose of this section is to establish the following important result.

THEOREM 3Q. (CONVERGENCE THEOREM FOR POWER SERIES) For the power series (12), exactly one of the following holds:
(a) The series is absolutely convergent for every $z \in \mathbb{C}$.
(b) There exists a positive real number $R$ such that the series is absolutely convergent for every $z \in \mathbb{C}$ satisfying $|z| < R$ and is divergent for every $z \in \mathbb{C}$ satisfying $|z| > R$.
(c) The series is divergent for every non-zero $z \in \mathbb{C}$.

Definition. The number $R$ in Theorem 3Q is called the radius of convergence of the power series (12). We also say that the radius of convergence is 0 if case (c) occurs, and that the power series (12) has infinite radius of convergence if case (a) occurs.

Remark. Note that Theorem 3Q does not indicate whether the power series is convergent if |z| = R.

A crucial step in the proof of Theorem 3Q is summarized by the result below.

THEOREM 3R. Suppose that the series (12) is convergent for a particular value z = z0. Then the series is absolutely convergent for every z ∈ C satisfying |z| < |z0|.

Proof. Suppose that the series

$$\sum_{n=0}^{\infty} a_n z_0^n$$

is convergent. Then it follows from Theorem 3F that $a_n z_0^n \to 0$ as $n \to \infty$. Recall that any convergent sequence is bounded, so that there exists $M \in \mathbb{R}$ such that $|a_n z_0^n| \le M$ for every $n \in \mathbb{N} \cup \{0\}$. For every $z \in \mathbb{C}$ satisfying $|z| < |z_0|$, we have

$$|a_n z^n| \le M \left| \frac{z}{z_0} \right|^n$$

for every $n \in \mathbb{N} \cup \{0\}$. Note that $|z/z_0| < 1$. Hence the series (12) is absolutely convergent by comparison with the convergent geometric series

$$\sum_{n=0}^{\infty} \left| \frac{z}{z_0} \right|^n.$$

Proof of Theorem 3Q. Consider the set

$$S = \{ x \ge 0 : \text{the series (12) converges at } z = x \}.$$

Clearly $S$ contains the number 0, and is therefore non-empty. Exactly one of the following three cases applies:

(i) If $S$ is not bounded above, then for every $z \in \mathbb{C}$, we can choose $x_0 \in S$ such that $|z| < x_0$. Since the series (12) is convergent at $x_0$, it follows from Theorem 3R that the series (12) is absolutely convergent at $z$.

(ii) Suppose that $S$ is bounded above with supremum $R > 0$. For every $z \in \mathbb{C}$ satisfying $|z| < R$, we can choose $x_0 \in S$ such that $|z| < x_0$. Since the series (12) is convergent at $x_0$, it follows from Theorem 3R that the series (12) is absolutely convergent at $z$. On the other hand, for every $z \in \mathbb{C}$ satisfying $|z| > R$, we can choose $x_0 > R$ such that $|z| > x_0$. If the series (12) is convergent at $z$, then it follows from Theorem 3R that the series (12) is absolutely convergent at $x_0$, so that $x_0 \in S$, clearly a contradiction. Hence the series (12) must be divergent at $z$.

(iii) If $S = \{0\}$, then for every non-zero $z \in \mathbb{C}$, we can choose $x_0 > 0$ such that $|z| > x_0$. If the series (12) is convergent at $z$, then it follows from Theorem 3R that the series (12) is absolutely convergent at $x_0$, a contradiction. Hence the series (12) must be divergent at $z$.

3.5. Multiplication of Series

Multiplication of two series is not always a straightforward operation, in the sense that the product series may be affected by the order of the terms. The purpose of this section is to show that we need not worry if the series involved are absolutely convergent.

THEOREM 3S. Suppose that the series

$$\sum_{n=0}^{\infty} a_n \quad \text{and} \quad \sum_{n=0}^{\infty} b_n$$

are absolutely convergent, and converge to sums $a$ and $b$ respectively. Then the series

$$\sum a_i b_j, \tag{13}$$

consisting of the products, in any order, of every term of the first series by every term of the second series, is absolutely convergent, and converges to the sum $ab$.

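Proof. The products of pairs of terms can be arranged in a doubly infinite array.

$$\begin{array}{cccc}
a_0 b_0 & a_1 b_0 & a_2 b_0 & \ldots \\
a_0 b_1 & a_1 b_1 & a_2 b_1 & \ldots \\
a_0 b_2 & a_1 b_2 & a_2 b_2 & \ldots \\
\vdots & \vdots & \vdots & \ddots
\end{array}
\qquad \qquad
\begin{array}{cccc}
a_0 b_0 & a_1 b_0 & a_2 b_0 & \ldots \\
a_0 b_1 & a_1 b_1 & a_2 b_1 & \ldots \\
a_0 b_2 & a_1 b_2 & a_2 b_2 & \ldots \\
\vdots & \vdots & \vdots & \ddots
\end{array}$$

[Figure: the same array shown twice, with lines marking the order of summation: by squares on the left, along diagonals on the right.]

The sum of all these terms can be arranged as a single series. Two such ways are indicated above. We have summation by squares on the left, and diagonal summation on the right. No matter in what order the terms are arranged, the series

$$\sum |a_i b_j|$$

is a series of non-negative terms and clearly does not exceed

$$\left( \sum_{n=0}^{\infty} |a_n| \right) \left( \sum_{n=0}^{\infty} |b_n| \right).$$

It follows that the series (13) is absolutely convergent. In view of Theorem 3P, the sum is independent of the order of the arrangement of the terms. Since

$$\left( \sum_{n=0}^{N} a_n \right) \left( \sum_{n=0}^{N} b_n \right) \to ab \quad \text{as } N \to \infty,$$

the sum must be $ab$.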

THEOREM 3T. (CAUCHY PRODUCT) Suppose that the series

$$\sum_{n=0}^{\infty} a_n \quad \text{and} \quad \sum_{n=0}^{\infty} b_n$$

are absolutely convergent, and converge to sums $a$ and $b$ respectively. Then the series

$$\sum_{n=0}^{\infty} c_n,$$

where

$$c_n = \sum_{r=0}^{n} a_r b_{n-r} \quad \text{for every } n \in \mathbb{N} \cup \{0\},$$

is absolutely convergent, and converges to the sum $ab$.

Proof. This is simply using diagonal summation in Theorem 3S.

The Cauchy product is useful in establishing the following result on the exponential series.

THEOREM 3U. The series

$$E(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$$

is absolutely convergent for every $z \in \mathbb{C}$. Furthermore, for every $z_1, z_2 \in \mathbb{C}$, we have

$$E(z_1) E(z_2) = E(z_1 + z_2).$$

Proof. The first part of the theorem is trivial for $z = 0$, and can be proved by using the Ratio test for $z \ne 0$. To prove the second part, note that

$$\frac{(z_1 + z_2)^n}{n!} = \frac{1}{n!} \sum_{r=0}^{n} \frac{n!}{r! \, (n-r)!} z_1^r z_2^{n-r} = \sum_{r=0}^{n} \frac{z_1^r}{r!} \frac{z_2^{n-r}}{(n-r)!}.$$

The result now follows from Theorem 3T.
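
The identity used in this proof can be checked numerically on truncated series. The sketch below is only an illustration under stated assumptions: the helper names `exp_coeffs` and `cauchy_product`, the truncation level N = 30 and the sample points z1, z2 are all choices made here, not part of the notes. It forms the Cauchy product coefficients of the two exponential series and compares them with the terms of the series for E(z1 + z2).

```python
import cmath
from math import factorial

def exp_coeffs(z, N):
    """Terms z^n / n! of the exponential series for n = 0, ..., N."""
    return [z**n / factorial(n) for n in range(N + 1)]

def cauchy_product(a, b):
    """Return c_n = sum_{r=0}^{n} a_r * b_{n-r} for the indices both lists cover."""
    N = min(len(a), len(b)) - 1
    return [sum(a[r] * b[n - r] for r in range(n + 1)) for n in range(N + 1)]

z1, z2 = 0.5 + 1.0j, -1.25 + 0.3j   # arbitrary sample points
N = 30                               # arbitrary truncation level
c = cauchy_product(exp_coeffs(z1, N), exp_coeffs(z2, N))
direct = exp_coeffs(z1 + z2, N)

print(max(abs(c[n] - direct[n]) for n in range(N + 1)))
print(sum(c), cmath.exp(z1 + z2))
```

Term by term, the product coefficients agree with the terms of the series for E(z1 + z2) up to rounding, which is exactly the binomial identity displayed in the proof.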






1. Let $a_n = -\frac{1}{n}$ if 3 divides $n$, and $a_n = \frac{1}{n}$ otherwise. By considering the sequence of partial sums $s_{3N}$, show that the series $\sum_{n=1}^{\infty} a_n$ is divergent.

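2. For each of the following series, discuss whether the series is convergent or divergent, and justify your assertion:

a) $\sum_{n=1}^{\infty} \frac{n}{n^2 + 5n - 3}$

b) $\sum_{n=1}^{\infty} (-1)^n (\sqrt{n+1} - \sqrt{n})$

c) $\sum_{n=1}^{\infty} \frac{(n!)^2}{(2n)!}$

d) $\sum_{n=1}^{\infty} (n!)^{1/n}$

e) $\sum_{n=1}^{\infty} \frac{1}{n} \sin \frac{n\pi}{2}$

f) $\sum_{n=1}^{\infty} \left( \frac{n}{n+1} \right)^{n^2}$

g) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \left( 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} \right)$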

3. For each of the following series, determine the values of $x \in \mathbb{R}$ for which the series is convergent, and justify your assertion:

a) $\sum_{n=1}^{\infty} \frac{\cos nx}{n^2}$

b) $\sum_{n=1}^{\infty} \sin nx$

c) $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}$

d)

∞n=1

nx

n2 − 2

4. Suppose that $u_n \ge 0$ and $v_n \ge 0$ for every $n \in \mathbb{N}$. Suppose further that $u_n / v_n \to 2$ as $n \to \infty$. Show that the series

$$\sum_{n=1}^{\infty} u_n \quad \text{and} \quad \sum_{n=1}^{\infty} v_n$$

are either both convergent or both divergent.

5. a) Suppose that the real series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are both convergent. Suppose further that $a_n \ge 0$ and $b_n \ge 0$ for every $n \in \mathbb{N}$. Prove that the series $\sum_{n=1}^{\infty} a_n b_n$ is convergent.

b) Discuss also the case when the terms an and bn can be negative.

6. For every $n \in \mathbb{N}$, let $a_n = \frac{1}{\sqrt{n}} + \frac{(-1)^{n+1}}{n}$.

a) Show that $a_n > 0$ for every $n \in \mathbb{N}$, and that $a_n \to 0$ as $n \to \infty$.

b) Show that the series $\sum_{n=1}^{\infty} (-1)^n a_n$ is divergent.

c) Comment on the result.

7. For each of the following series, determine the values of $z \in \mathbb{C}$ for which the series is convergent, and justify your assertion:

a) $\sum_{n=1}^{\infty} z^{n^2}$

b) $\sum_{n=1}^{\infty} n! \, z^n$

c) $\sum_{n=1}^{\infty} n! \, z^{n!}$

8. Suppose that $\frac{22}{7} \le |a_n| \le 100$ for every $n \in \mathbb{N} \cup \{0\}$. Discuss the radius of convergence of the power series $\sum_{n=0}^{\infty} a_n z^n$, and justify your assertion.


W W L CHEN

c W W L Chen, 1982, 2008.






Chapter 4

FUNCTIONS AND CONTINUITY

4.1. Limits of Functions

We begin by studying the behaviour of a function $f(x)$ as $x \to +\infty$. Corresponding to the definition of the limit of a real sequence, we have the following direct analogue for real valued functions of a real variable. In this chapter, all functions $f(x)$ are assumed to be real valued and are defined on $\mathbb{R}$ or suitable subsets of $\mathbb{R}$.

Definition. We say that $f(x) \to L$ as $x \to +\infty$, or

$$\lim_{x \to +\infty} f(x) = L,$$

if, for every $\epsilon > 0$, there exists $D > 0$ such that $|f(x) - L| < \epsilon$ whenever $x > D$.

We can also study the behaviour of a function $f(x)$ as $x \to -\infty$. Corresponding to the above, we have the following obvious analogue.

Definition. We say that $f(x) \to L$ as $x \to -\infty$, or

$$\lim_{x \to -\infty} f(x) = L,$$

if, for every $\epsilon > 0$, there exists $D > 0$ such that $|f(x) - L| < \epsilon$ whenever $x < -D$.

It is not difficult to see that we can establish suitable analogues of Theorems 2A, 2C and 2D concerning the uniqueness of limits, the arithmetic of limits and the Squeezing principle respectively.

While the natural numbers are discrete, the real number line is a continuous object. We can therefore also study the behaviour of a function $f(x)$ as $x$ gets close to a given real number $a$.


Definition. We say that $f(x) \to L$ as $x \to a$, or

$$\lim_{x \to a} f(x) = L,$$

if, for every $\epsilon > 0$, there exists $\delta > 0$ such that $|f(x) - L| < \epsilon$ whenever $0 < |x - a| < \delta$.

Remark. The restriction $|x - a| > 0$ is to omit discussion of the situation when $x = a$. After all, we are only interested in those $x$ which are close to $a$ but not equal to $a$.

Much of the theory of limits of sequences can be translated to this new setting of limits of functions as $x \to a$, courtesy of the result below.

THEOREM 4A. We have $f(x) \to L$ as $x \to a$ if and only if $f(x_n) \to L$ as $n \to \infty$ for every sequence $x_n$ of real numbers such that $x_n \ne a$ for any $n \in \mathbb{N}$ and $x_n \to a$ as $n \to \infty$.

Proof. Suppose first of all that $f(x) \to L$ as $x \to a$. Then given any $\epsilon > 0$, there exists $\delta > 0$ such that

$$|f(x) - L| < \epsilon \quad \text{whenever } 0 < |x - a| < \delta.$$

Let $x_n$ be any sequence of real numbers such that $x_n \ne a$ for any $n \in \mathbb{N}$ and $x_n \to a$ as $n \to \infty$. Then there exists $N \in \mathbb{R}$ such that

$$0 \ne |x_n - a| < \delta \quad \text{whenever } n > N.$$

Hence

$$|f(x_n) - L| < \epsilon \quad \text{whenever } n > N.$$

This shows that $f(x_n) \to L$ as $n \to \infty$.

Suppose next that $f(x) \not\to L$ as $x \to a$. Then there exists $\epsilon > 0$ such that for every $n \in \mathbb{N}$, there exists $x_n$ such that

$$0 < |x_n - a| < \frac{1}{n} \quad \text{and} \quad |f(x_n) - L| \ge \epsilon.$$

Clearly $x_n \ne a$ for any $n \in \mathbb{N}$ and $x_n \to a$ as $n \to \infty$. However, it is not difficult to see that $f(x_n) \not\to L$ as $n \to \infty$.

Using Theorem 4A, we can immediately establish the following three results which are the analogues of Theorems 2A, 2C and 2D respectively.

THEOREM 4B. The limit of a function as x → a is unique if it exists.

THEOREM 4C. Suppose that the functions $f(x) \to L$ and $g(x) \to M$ as $x \to a$. Then
(a) $f(x) + g(x) \to L + M$ as $x \to a$;
(b) $f(x) g(x) \to LM$ as $x \to a$; and
(c) if $M \ne 0$, then $f(x)/g(x) \to L/M$ as $x \to a$.

THEOREM 4D. Suppose that $g(x) \le f(x) \le h(x)$ for every $x \ne a$ in some open interval containing $a$. Suppose further that $g(x) \to L$ and $h(x) \to L$ as $x \to a$. Then $f(x) \to L$ as $x \to a$.

A similar theory can be established on one-sided limits.

Definition. We say that $f(x) \to L$ as $x \to a+$, or

$$\lim_{x \to a+} f(x) = L,$$

if, for every $\epsilon > 0$, there exists $\delta > 0$ such that $|f(x) - L| < \epsilon$ whenever $0 < x - a < \delta$. In this case, $L$ is called the right-hand limit.

Definition. We say that $f(x) \to L$ as $x \to a-$, or

$$\lim_{x \to a-} f(x) = L,$$

if, for every $\epsilon > 0$, there exists $\delta > 0$ such that $|f(x) - L| < \epsilon$ whenever $0 < a - x < \delta$. In this case, $L$ is called the left-hand limit.

It is very easy to deduce the following result.

THEOREM 4E. We have

$$\lim_{x \to a} f(x) = L \quad \text{if and only if} \quad \lim_{x \to a-} f(x) = \lim_{x \to a+} f(x) = L.$$

It is not difficult to formulate suitable analogues of the arithmetic of limits and the Squeezing principle. Their precise statements are left as exercises.

Definition. We say that a function $f(x)$ is continuous at $x = a$ if $f(x) \to f(a)$ as $x \to a$; in other words, if

$$\lim_{x \to a} f(x) = f(a).$$

Since continuity is defined in terms of limits, we immediately have the following consequences of Theorem 4C.

THEOREM 4F. Suppose that the functions $f(x)$ and $g(x)$ are continuous at $x = a$. Then
(a) $f(x) + g(x)$ is continuous at $x = a$;
(b) $f(x) g(x)$ is continuous at $x = a$; and
(c) if $g(a) \ne 0$, then $f(x)/g(x)$ is continuous at $x = a$.

4.2. Continuity in Intervals

Definition. Suppose that $A, B \in \mathbb{R}$ with $A < B$. We say that a function $f(x)$ is continuous in the open interval $(A, B)$ if $f(x)$ is continuous at $x = a$ for every $a \in (A, B)$.

To formulate a suitable definition for continuity in a closed interval, we consider first an example.

Example 4.2.1. Consider the function

$$f(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}$$

It is clear that this function is not continuous at $x = 0$, since

$$\lim_{x \to 0-} f(x) = 0 \quad \text{and} \quad \lim_{x \to 0+} f(x) = 1.$$

However, let us investigate the behaviour of the function in the closed interval $[0, 1]$. It is clear that $f(x)$ is continuous at $x = a$ for every $a \in (0, 1)$. Furthermore, we have

$$\lim_{x \to 0+} f(x) = f(0) \quad \text{and} \quad \lim_{x \to 1-} f(x) = f(1).$$

This example leads us to conclude that it is not appropriate to insist on continuity of the function at the end-points of the closed interval, and that a more suitable requirement is one-sided continuity instead.

Definition. Suppose that $A, B \in \mathbb{R}$ with $A < B$. We say that a function $f(x)$ is continuous in the closed interval $[A, B]$ if $f(x)$ is continuous in the open interval $(A, B)$ and if

$$\lim_{x \to A+} f(x) = f(A) \quad \text{and} \quad \lim_{x \to B-} f(x) = f(B).$$

Remark. It follows that for continuity of a function in a closed interval, we need right-hand continuity of the function at the left-hand end-point of the interval, left-hand continuity of the function at the right-hand end-point of the interval, and continuity at every point in between.

Observe that so far in our discussion in this chapter, there has been no analogue of Theorem 2B concerning boundedness.

4.3. Continuity in Closed Intervals

Definition. Suppose that a function $f(x)$ is defined on an interval $I \subseteq \mathbb{R}$. We say that $f(x)$ is bounded above on $I$ if there exists a real number $K \in \mathbb{R}$ such that $f(x) \le K$ for every $x \in I$, and that $f(x)$ is bounded below on $I$ if there exists a real number $k \in \mathbb{R}$ such that $f(x) \ge k$ for every $x \in I$. Furthermore, we say that $f(x)$ is bounded on $I$ if it is bounded above and bounded below on $I$.

The following can be considered an analogue of Theorem 2B.

THEOREM 4G. Suppose that a function $f(x)$ is continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Then $f(x)$ is bounded on $[A, B]$.

Proof. Suppose on the contrary that $f(x)$ is not bounded on $[A, B]$. Then it is either not bounded above on $[A, B]$ or not bounded below on $[A, B]$, or both. By considering the function $-f(x)$ if necessary, we may assume, without loss of generality, that $f(x)$ is not bounded above on $[A, B]$. Then for every $n \in \mathbb{N}$, there exists $x_n \in [A, B]$ such that $f(x_n) > n$. The real sequence $x_n$ is clearly bounded. It follows from Theorem 2K that $x_n$ has a convergent subsequence $x_{n_p}$, say. Suppose that $x_{n_p} \to c$ as $p \to \infty$. Clearly $c \in [A, B]$. Suppose first of all that $c \in (A, B)$. Since $f(x)$ is continuous at $x = c$, it follows from Theorem 4A that $f(x_{n_p}) \to f(c)$ as $p \to \infty$. But this is a contradiction, since the sequence $f(x_{n_p})$ satisfies $f(x_{n_p}) > n_p \ge p$ for every $p \in \mathbb{N}$, and so is not bounded, and hence not convergent in view of Theorem 2B. If $c = A$ or $c = B$, then there is only one-sided continuity at $x = c$, and the proof has to be slightly modified.

In fact, we can establish more.

THEOREM 4H. (MAX-MIN THEOREM) Suppose that a function f (x) is continuous in the closed interval [A, B], where A, B ∈ R with A < B. Then there exist real numbers x1, x2 ∈ [A, B] such that

f (x1) ≤ f (x) ≤ f (x2) for every x ∈ [A, B]. In other words, the function f (x) attains a maximum value and a minimum value in the closed interval [A, B].

Proof. We shall only establish the existence of the real number $x_2 \in [A, B]$, as the existence of the real number $x_1 \in [A, B]$ can be established by repeating the argument here on the function $-f(x)$. Note first of all that it follows from Theorem 4G that the set

$$S = \{ f(x) : x \in [A, B] \}$$

is bounded above. Let $M = \sup S$. Then $f(x) \le M$ for every $x \in [A, B]$. Suppose on the contrary that there does not exist $x_2 \in [A, B]$ such that $f(x_2) = M$. Then $f(x) < M$ for every $x \in [A, B]$, and so it follows from Theorem 4F that the function

$$g(x) = \frac{1}{M - f(x)}$$

is continuous in the closed interval $[A, B]$, and is therefore bounded above on $[A, B]$ as a consequence of Theorem 4G. Suppose that $g(x) \le K$ for every $x \in [A, B]$. Since $g(x) > 0$ for every $x \in [A, B]$, we must have $K > 0$. But then the inequality $g(x) \le K$ gives the inequality

$$f(x) \le M - \frac{1}{K},$$

contradicting the assumption that $M = \sup S$.

THEOREM 4J. (INTERMEDIATE VALUE THEOREM) Suppose that a function f (x) is continuous in the closed interval [A, B], where A, B ∈ R with A < B. Suppose further that the real numbers x1, x2 ∈ [A, B] satisfy f (x1) ≤ f (x) ≤ f (x2) for every x ∈ [A, B]. Then for every real number y ∈ R

satisfying f (x1) ≤ y ≤ f (x2), there exists a real number x0 ∈ [A, B] such that f (x0) = y.

Proof. We may clearly suppose that $f(x_1) < y < f(x_2)$. By considering the function $-f(x)$ if necessary, we may further assume, without loss of generality, that $x_1 < x_2$. The idea of the proof is then to follow the graph of the function $f(x)$ from the point $(x_1, f(x_1))$ to the point $(x_2, f(x_2))$. This clearly touches the horizontal line at height $y$ at least once; the reader is advised to draw a picture. Our technique is then to trap the last occasion when this happens. Accordingly, we consider the set

$$T = \{ x \in [x_1, x_2] : f(x) \le y \}.$$

This set is clearly bounded above. Let $x_0 = \sup T$. We shall show that $f(x_0) = y$. Suppose on the contrary that $f(x_0) \ne y$. Then exactly one of the following two cases applies:

(i) We have $f(x_0) > y$. In this case, let $\epsilon = f(x_0) - y > 0$. Since $f(x)$ is continuous at $x = x_0$, it follows that there exists $\delta > 0$ such that $|f(x) - f(x_0)| < \epsilon$ whenever $|x - x_0| < \delta$. This implies that $f(x) > y$ for every real number $x \in (x_0 - \delta, x_0 + \delta)$, so that $x_0 - \delta$ is an upper bound of $T$, contradicting the assumption that $x_0 = \sup T$.

(ii) We have $f(x_0) < y$. In this case, let $\epsilon = y - f(x_0) > 0$. Since $f(x)$ is continuous at $x = x_0$, it follows that there exists $\delta > 0$ such that $|f(x) - f(x_0)| < \epsilon$ whenever $|x - x_0| < \delta$. This implies that $f(x) < y$ for every real number $x \in (x_0 - \delta, x_0 + \delta)$, so that $x_0$ cannot be an upper bound of $T$, again contradicting the assumption that $x_0 = \sup T$.
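
The theorem guarantees that a suitable point exists but the supremum argument above does not compute it. A common numerical counterpart, which is a different technique from the proof, is bisection. The sketch below assumes a continuous function with f(lo) ≤ y ≤ f(hi); the function name `bisect_for_value`, the tolerance and the sample function are all hypothetical choices made for illustration.

```python
def bisect_for_value(f, y, lo, hi, tol=1e-12):
    """Approximate some x in [lo, hi] with f(x) = y by repeated halving.

    Assumes f is continuous on [lo, hi] and f(lo) <= y <= f(hi).
    """
    if not (f(lo) <= y <= f(hi)):
        raise ValueError("need f(lo) <= y <= f(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) <= y:
            lo = mid   # keep the invariant f(lo) <= y
        else:
            hi = mid   # keep the invariant f(hi) > y
    return (lo + hi) / 2

# Sample use: solve x^3 + x = 1 on [0, 1].
x0 = bisect_for_value(lambda x: x**3 + x, 1.0, 0.0, 1.0)
print(x0, x0**3 + x0)
```

The loop keeps f(lo) ≤ y throughout and, after the first update of hi, also f(hi) > y, so the interval always brackets a point where the value y is attained.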

Remark. Suppose that the function $f(x)$ is continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Then Theorems 4G, 4H and 4J together imply that the range

$$f([A, B]) = \{ f(x) : x \in [A, B] \}$$

is a closed interval. In other words, a continuous real valued function of a real variable maps a closed interval to another closed interval.

1. Consider the function

$$f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Prove that $f(x)$ is continuous at 0.

2. Consider the function

$$f(x) = \begin{cases} x & \text{if } x \in \mathbb{Q}, \\ 1 - x & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$

a) Prove that $f(x)$ is discontinuous everywhere except at $\frac{1}{2}$.

b) Hence, or otherwise, find a bijection $g : [0, 1] \to [0, 1]$ which is discontinuous everywhere in $(0, 1)$.

3. Consider the function

$$f(x) = \begin{cases} e^{-1/|x|} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Prove that $f(x)$ is continuous in $\mathbb{R}$.

4. A function $f : \mathbb{R} \to \mathbb{R}$ is continuous at every $x \in \mathbb{R}$, and satisfies $f(x) \to 0$ as $x \to +\infty$ as well as $f(x) \to 3$ as $x \to -\infty$. Prove that the range $f(\mathbb{R})$ is bounded.

5. Suppose that a function $f : [A, B] \to \mathbb{R}$ is continuous and strictly increasing in the closed interval $[A, B]$, so that $f(x_1) < f(x_2)$ whenever $A \le x_1 < x_2 \le B$. Suppose further that $f(A) = \alpha$ and $f(B) = \beta$.

a) Explain why $\{ f(x) : x \in [A, B] \} = [\alpha, \beta]$.

b) Show that for every $y \in [\alpha, \beta]$, there exists a unique $x \in [A, B]$ such that $f(x) = y$.

c) Show that the function $g : [\alpha, \beta] \to [A, B]$, defined for every $y \in [\alpha, \beta]$ by $g(y) = x$, where $x \in [A, B]$ is uniquely determined in part (b) by $f(x) = y$, is strictly increasing and continuous in the closed interval $[\alpha, \beta]$.




W W L CHEN

c W W L Chen, 1994, 2008.





Chapter 5

DIFFERENTIATION

5.1. Introduction

We begin by recalling the familiar definition of differentiability.

Definition. We say that a function $f(x)$ is differentiable at $x = a$ if the limit

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$

exists. In this case, the limit is denoted by $f'(a)$ and called the derivative of $f(x)$ at $x = a$.

Example 5.1.1. Consider the function $f(x) = c$, where $c \in \mathbb{R}$ is a constant. For every $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = 0 \to 0$$

as $x \to a$. It follows that $f'(a) = 0$ for every $a \in \mathbb{R}$.

Example 5.1.2. Consider the function $f(x) = x$. For every $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = 1 \to 1$$

as $x \to a$. It follows that $f'(a) = 1$ for every $a \in \mathbb{R}$.

Example 5.1.3. Consider the function $f(x) = x^n$, where $n \ge 2$ is an integer. For every $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = \frac{x^n - a^n}{x - a} = x^{n-1} + x^{n-2} a + x^{n-3} a^2 + \ldots + x^2 a^{n-3} + x a^{n-2} + a^{n-1} \to n a^{n-1}$$

as $x \to a$. It follows that $f'(a) = n a^{n-1}$ for every $a \in \mathbb{R}$.
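
The limit in Example 5.1.3 can be observed numerically by evaluating the difference quotient at points x = a + h for shrinking h. This is a minimal sketch only; the helper name `difference_quotient` and the sample values n = 5, a = 1.5 are assumptions made for illustration, and very small h would eventually be spoiled by floating point rounding.

```python
def difference_quotient(f, a, h):
    """Return (f(a + h) - f(a)) / h, the quotient from the definition with x = a + h."""
    return (f(a + h) - f(a)) / h

n, a = 5, 1.5                # arbitrary sample exponent and point

def f(x):
    return x**n

exact = n * a ** (n - 1)     # the value n a^(n-1) from Example 5.1.3

for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(f, a, h), exact)
```

The printed quotients approach the value n a^(n−1), with an error that shrinks roughly in proportion to h.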


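Example 5.1.4. Consider the function $f(x) = \sqrt{x}$. For every positive $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = \frac{\sqrt{x} - \sqrt{a}}{x - a} = \frac{\sqrt{x} - \sqrt{a}}{(\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a})} = \frac{1}{\sqrt{x} + \sqrt{a}} \to \frac{1}{2\sqrt{a}}$$

as $x \to a$. It follows that $f'(a) = 1/(2\sqrt{a})$ for every positive $a \in \mathbb{R}$.

Example 5.1.5. Consider the function $f(x) = \sin x$. For every $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = \frac{\sin x - \sin a}{x - a} = \frac{2 \cos \frac{1}{2}(x + a) \sin \frac{1}{2}(x - a)}{x - a} = \frac{\sin \frac{1}{2}(x - a)}{\frac{1}{2}(x - a)} \cos \frac{1}{2}(x + a) \to \cos a$$

as $x \to a$. It follows that $f'(a) = \cos a$ for every $a \in \mathbb{R}$.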

Example 5.1.6. Consider the function $f(x) = \cos x$. For every $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = \frac{\cos x - \cos a}{x - a} = \frac{-2 \sin \frac{1}{2}(x + a) \sin \frac{1}{2}(x - a)}{x - a} = -\frac{\sin \frac{1}{2}(x - a)}{\frac{1}{2}(x - a)} \sin \frac{1}{2}(x + a) \to -\sin a$$

as $x \to a$. It follows that $f'(a) = -\sin a$ for every $a \in \mathbb{R}$.

Example 5.1.7. Consider the function $f(x) = x^{1/3}$. For every non-zero $a \in \mathbb{R}$, we have

$$\frac{f(x) - f(a)}{x - a} = \frac{x^{1/3} - a^{1/3}}{x - a} = \frac{1}{x^{2/3} + x^{1/3} a^{1/3} + a^{2/3}} \to \frac{1}{3 a^{2/3}}$$

as $x \to a$. It follows that $f'(a) = \frac{1}{3} a^{-2/3}$ for every non-zero $a \in \mathbb{R}$. On the other hand, we note that

$$\frac{f(x) - f(0)}{x - 0} = \frac{x^{1/3}}{x} = \frac{1}{x^{2/3}}$$

does not tend to a limit as $x \to 0$, so that the function $f(x)$ is not differentiable at $x = 0$.

Examples 5.1.3 and 5.1.7 above raise the question of determining derivatives of functions of the type $f(x) = x^n$, where $n$ is a real number, not necessarily a positive integer. We state the following important result.

THEOREM 5A. Suppose that $n \in \mathbb{Q}$ is a fixed rational number. Then for the function $f(x) = x^n$, we have $f'(a) = n a^{n-1}$ for every $a \in \mathbb{R}$, except for
(a) $a = 0$ and $n < 1$; or
(b) $a \le 0$ when $n = p/q$ in lowest terms with $p \in \mathbb{Z}$ and even $q \in \mathbb{N}$.

We shall leave the proof of this result until later in this section.


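Example 5.1.8. Consider the function

$$f(x) = \begin{cases} x & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$

For every $a \in \mathbb{R}$, it is not difficult to check that

$$\frac{f(x) - f(a)}{x - a}$$

does not tend to a limit as $x \to a$, so that the function $f(x)$ is differentiable nowhere.

Example 5.1.9. Consider the function $f(x) = |x|$, so that

$$f(x) = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$

For every non-zero $a \in \mathbb{R}$, it is not difficult to check that

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \begin{cases} 1 & \text{if } a > 0, \\ -1 & \text{if } a < 0, \end{cases}$$

so that $f'(a) = 1$ for every positive $a \in \mathbb{R}$ and $f'(a) = -1$ for every negative $a \in \mathbb{R}$. On the other hand, we note that

$$\frac{f(x) - f(0)}{x - 0}$$

does not tend to a limit as $x \to 0$, so that the function $f(x)$ is not differentiable at $x = 0$.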

Suppose that a function $f(x)$ is differentiable at $x = a$. Then

$$\frac{f(x) - f(a)}{x - a} \to f'(a)$$

as $x \to a$. On the other hand, clearly the function $x - a \to 0$ as $x \to a$. By the product rule of limits, we have

$$f(x) - f(a) = \left( \frac{f(x) - f(a)}{x - a} \right) (x - a) \to 0$$

as $x \to a$. It follows that $f(x) \to f(a)$ as $x \to a$. We have therefore established the following result.

THEOREM 5B. Suppose that a function f (x) is differentiable at x = a. Then f (x) is continuous at x = a.

As in the case of limits and continuity, we have the sum, product and quotient rules for derivatives. We shall establish the following result.

THEOREM 5C. Suppose that the functions $f(x)$ and $g(x)$ are differentiable at $x = a$. Then
(a) $f(x) + g(x)$ is differentiable at $x = a$;
(b) $f(x) g(x)$ is differentiable at $x = a$; and
(c) if $g(a) \ne 0$, then $f(x)/g(x)$ is differentiable at $x = a$.

Furthermore, we have
(a) $(f + g)'(a) = f'(a) + g'(a)$;
(b) $(fg)'(a) = f(a) g'(a) + f'(a) g(a)$; and
(c) $\left( \dfrac{f}{g} \right)'(a) = \dfrac{g(a) f'(a) - f(a) g'(a)}{g^2(a)}$.

Proof. (a) Note that

$$\frac{(f(x) + g(x)) - (f(a) + g(a))}{x - a} = \frac{f(x) - f(a)}{x - a} + \frac{g(x) - g(a)}{x - a}.$$

It follows from Theorem 4C that

$$\lim_{x \to a} \frac{(f(x) + g(x)) - (f(a) + g(a))}{x - a} = f'(a) + g'(a).$$

(b) Note that

$$\frac{f(x) g(x) - f(a) g(a)}{x - a} = \frac{f(x) g(x) - f(x) g(a) + f(x) g(a) - f(a) g(a)}{x - a} = f(x) \frac{g(x) - g(a)}{x - a} + g(a) \frac{f(x) - f(a)}{x - a}.$$

In view of Theorem 5B, we clearly have $f(x) \to f(a)$ as $x \to a$. It follows from Theorem 4C that

$$\lim_{x \to a} \frac{f(x) g(x) - f(a) g(a)}{x - a} = f(a) g'(a) + g(a) f'(a).$$

(c) We shall first show that $1/g(x)$ is differentiable at $x = a$. Note that

$$\frac{(1/g(x)) - (1/g(a))}{x - a} = -\frac{g(x) - g(a)}{x - a} \cdot \frac{1}{g(x)} \cdot \frac{1}{g(a)}.$$

In view of Theorem 5B, we clearly have $g(x) \to g(a)$ as $x \to a$. It follows from Theorem 4C that

$$\lim_{x \to a} \frac{(1/g(x)) - (1/g(a))}{x - a} = -\frac{g'(a)}{g^2(a)}.$$

We now apply part (b) to $f(x)$ and $1/g(x)$ to get the desired result.

Example 5.1.10. Consider the function $f(x) = \tan x$. We know that

$$\tan x = \frac{\sin x}{\cos x}.$$

It follows that for every $a \in \mathbb{R}$ such that $\cos a \ne 0$, we have, by the quotient rule, that

$$f'(a) = \frac{\cos^2 a + \sin^2 a}{\cos^2 a} = \frac{1}{\cos^2 a} = \sec^2 a.$$

Example 5.1.11. Consider the function $f(x) = \csc x$. We know that

$$\csc x = \frac{1}{\sin x}.$$

It follows that for every $a \in \mathbb{R}$ such that $\sin a \ne 0$, we have, by the quotient rule, that

$$f'(a) = \frac{0 - \cos a}{\sin^2 a} = -\cot a \csc a.$$

Example 5.1.12. Consider the function

$$f(x) = \frac{x^3 \sin x}{x^2 + 3}.$$

We can write $f(x) = g(x)/h(x)$, where $g(x) = x^3 \sin x$ and $h(x) = x^2 + 3$. For every $a \in \mathbb{R}$, we have $g'(a) = a^3 \cos a + 3a^2 \sin a$ and $h'(a) = 2a$. It follows that

$$f'(a) = \frac{h(a) g'(a) - g(a) h'(a)}{h^2(a)} = \frac{(a^2 + 3)(a^3 \cos a + 3a^2 \sin a) - 2a^4 \sin a}{(a^2 + 3)^2}.$$





From now on, we shall slightly abuse our notation, and simply refer to $f'(x)$ as the derivative of the function $f(x)$. We shall further write

$$y = f(x) \quad \text{and} \quad \frac{dy}{dx} = f'(x).$$

It follows, for example, that if we write

$$\frac{d}{dx} \left( \frac{x}{\sin x} \right) = \frac{\sin x - x \cos x}{\sin^2 x},$$

then we mean that we are considering the function $f(x) = x / \sin x$, and that for every $a \in \mathbb{R}$ for which $\sin a \ne 0$, we have $f'(a) = (\sin a - a \cos a) / \sin^2 a$.

An important technique in differentiation is through the use of composite functions.

Example 5.1.13. Let $y = (x^3 + 1)^2$. To calculate the derivative $dy/dx$, we can first of all write $y = x^6 + 2x^3 + 1$, and then differentiate to obtain

\[ \frac{dy}{dx} = 6x^5 + 6x^2 = 6x^2(x^3 + 1). \]

Let us look at this in a different way. We can write $y = u^2$, where $u = x^3 + 1$. Then

\[ \frac{dy}{du} = 2u \quad\text{and}\quad \frac{du}{dx} = 3x^2. \]

Note that

\[ \frac{dy}{du}\,\frac{du}{dx} = 6ux^2 = 6x^2(x^3 + 1). \]

We therefore have

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}. \]
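The agreement between the two computations in this example can be checked exactly on integer inputs. This is a small illustrative sketch; the function names are assumptions introduced here and do not come from the text.

```python
def dy_dx_expanded(x):
    # Derivative of y = x^6 + 2x^3 + 1, differentiated term by term.
    return 6 * x**5 + 6 * x**2

def dy_dx_chain(x):
    # Same derivative via y = u^2 with u = x^3 + 1.
    u = x**3 + 1
    dy_du = 2 * u
    du_dx = 3 * x**2
    return dy_du * du_dx

for x in range(-3, 4):
    print(x, dy_dx_expanded(x), dy_dx_chain(x))
```

Because the inputs are integers, the comparison involves no rounding error; both columns coincide at every sample point.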

THEOREM 5D. Suppose that y is a differentiable function of u, and that u is a differentiable function of x. Then y is a differentiable function of x, and

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}. \]

Proof. Write $y = g(u)$, $u = f(x)$ and $b = f(a)$. Then $y = (g \circ f)(x)$. Note that

\[ \frac{(g \circ f)(x) - (g \circ f)(a)}{x - a} = \frac{(g \circ f)(x) - (g \circ f)(a)}{f(x) - f(a)} \cdot \frac{f(x) - f(a)}{x - a} = \frac{g(u) - g(b)}{u - b} \cdot \frac{f(x) - f(a)}{x - a}. \]

Here it is tempting to deduce the conclusion immediately. However, it is possible that $u - b = 0$. To overcome this difficulty, let us introduce the function

\[ G(u) = \begin{cases} \dfrac{g(u) - g(b)}{u - b} & \text{if } u \neq b, \\[1ex] g'(b) & \text{if } u = b. \end{cases} \]

Since $g(u)$ is differentiable at $u = b$, we have $G(u) \to g'(b)$ as $u \to b$. Furthermore, since $G(b) = g'(b)$, it follows that $G(u)$ is continuous at $u = b$. On the other hand, as $x \to a$, we have $u \to b$. Hence

\[ G(u) \to g'(b) \quad\text{as } x \to a. \]

Suppose now that $u \neq b$. Then we clearly have

\[ \frac{(g \circ f)(x) - (g \circ f)(a)}{x - a} = G(u)\,\frac{f(x) - f(a)}{x - a}. \]

Note that this also holds when $u = b$, since both sides are equal to 0. It now follows that

\[ \lim_{x \to a} \frac{(g \circ f)(x) - (g \circ f)(a)}{x - a} = g'(b)f'(a) = g'(f(a))f'(a) \]

as required.

Definitions.

(1) A function $f(x)$ is said to be strictly increasing in the closed interval $[A, B]$ if $f(x_1) < f(x_2)$ whenever $A \le x_1 < x_2 \le B$.

(2) A function $f(x)$ is said to be strictly decreasing in the closed interval $[A, B]$ if $f(x_1) > f(x_2)$ whenever $A \le x_1 < x_2 \le B$.

THEOREM 5E. Suppose that a function $y = f(x)$ is continuous and strictly increasing in the closed interval $[A, B]$. Suppose further that $f(x)$ is differentiable at $x = a$ for some $a \in (A, B)$, with $f(a) = b$ and $f'(a) \neq 0$. Then the inverse function $x = g(y)$ is differentiable at $y = b$, with

\[ g'(b) = \frac{1}{f'(a)}. \]

Proof. The existence of the continuous and strictly increasing inverse function is a consequence of Problem 5 for Chapter 4. Note next that

\[ \frac{g(y) - g(b)}{y - b} = \frac{x - a}{f(x) - f(a)}, \]

and that $x \to a$ as $y \to b$, a consequence of the continuity of the inverse function.

Proof of Theorem 5A. The case when n is a positive integer has been studied in Examples 5.1.2 and 5.1.3. The case when $n = 0$ and $a \neq 0$ has been studied in Example 5.1.1. Suppose next that n is a negative integer. Then $-n$ is a positive integer, and

\[ \frac{f(x) - f(a)}{x - a} = \frac{1}{x - a}\left(\frac{1}{x^{-n}} - \frac{1}{a^{-n}}\right) = -\frac{x^{-n} - a^{-n}}{(x - a)x^{-n}a^{-n}} \]

\[ = -\frac{x^{-n-1} + x^{-n-2}a + x^{-n-3}a^2 + \ldots + x^2 a^{-n-3} + x a^{-n-2} + a^{-n-1}}{x^{-n}a^{-n}} \to \frac{n a^{-n-1}}{a^{-2n}} = n a^{n-1} \]

as $x \to a$, provided that $a \neq 0$. Suppose now that $n = p/q$ in lowest terms, where $p \in \mathbb{Z}$ and $q \in \mathbb{N}$, and where exceptions (a) and (b) do not hold. Then $y = x^n$ can be described by $y = u^p$ and $u = x^{1/q}$, so that $x = u^q$ in particular. By Theorems 5D and 5E, we have

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \frac{dy/du}{dx/du} = \frac{p u^{p-1}}{q u^{q-1}} = \frac{p}{q}\,u^{p-q} = n x^{n-1}. \]


Example 5.1.14. Consider the function $f(x) = c^x$, where $c \in \mathbb{R}$ is a fixed positive real number. Then

\[ \frac{f(x) - f(a)}{x - a} = \frac{c^x - c^a}{x - a} = c^a\,\frac{c^{x-a} - 1}{x - a} \to c^a \lim_{h \to 0} \frac{c^h - 1}{h} \]

as $x \to a$. In the special case when $c = e$, we have

\[ \lim_{h \to 0} \frac{e^h - 1}{h} = 1, \]

so that for the function $f(x) = e^x$, we have

\[ \frac{f(x) - f(a)}{x - a} \to e^a \]

as $x \to a$. Hence $f'(a) = f(a)$ for every $a \in \mathbb{R}$ in this case.

Example 5.1.15. Consider the function $f(x) = \log x$. Then the inverse function is given by $g(y) = e^y$. For every positive real number $a \in \mathbb{R}$, writing $b = \log a$, we have $f'(a)g'(b) = 1$ by Theorem 5E. It then follows from Example 5.1.14 that $f'(a)g(b) = 1$, and so

\[ f'(a) = \frac{1}{g(b)} = \frac{1}{a}. \]

5.2. Some Important Results on Derivatives

In this section, we indicate some results which summarize, with rigour, the important role played by the derivative $f'(x)$ in the study of properties of a given function $f(x)$. The first of these results appears to be very restrictive, as it involves a hypothesis which is rarely satisfied.

THEOREM 5F. (ROLLE’S THEOREM) Suppose that a function $f(x)$ is continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Suppose further that $f'(a)$ exists for every $a \in (A, B)$. If $f(A) = f(B)$, then there exists $c \in (A, B)$ such that $f'(c) = 0$.

[Figure: graph of $y = f(x)$ over the interval $[A, B]$.]

Proof. Since $f(x)$ is continuous in the closed interval $[A, B]$, it follows from Theorem 4H that there exist $x_1, x_2 \in [A, B]$ such that $f(x_1) \le f(x) \le f(x_2)$ for every $x \in [A, B]$.

Case 1. Suppose that both $x_1$ and $x_2$ are endpoints of the interval $[A, B]$. Since $f(A) = f(B)$, it follows that $f(x)$ is constant in the interval $[A, B]$, so that $f'(c) = 0$ for every $c \in (A, B)$.

Case 2. Suppose that $x_1 \in (A, B)$. Then $f(x)$ has a local minimum at $x = x_1$. We claim that $f'(x_1) = 0$. Suppose on the contrary that $f'(x_1) \neq 0$. Without loss of generality, assume that

\[ f'(x_1) = \lim_{x \to x_1} \frac{f(x) - f(x_1)}{x - x_1} > 0. \]

Then there exists $\delta > 0$ such that

\[ \left| \frac{f(x) - f(x_1)}{x - x_1} - f'(x_1) \right| < \frac{1}{2}|f'(x_1)| \quad\text{whenever } 0 < |x - x_1| < \delta, \]

so that

\[ \frac{f(x) - f(x_1)}{x - x_1} > 0 \quad\text{whenever } 0 < |x - x_1| < \delta. \]

It follows that $f(x) - f(x_1) < 0$ if $x_1 - \delta < x < x_1$, contradicting that $f(x)$ has a local minimum at $x = x_1$.

Case 3. Suppose that $x_2 \in (A, B)$. Then $f(x)$ has a local maximum at $x = x_2$. A similar argument as in Case 2 gives $f'(x_2) = 0$.

Example 5.2.1. We can prove that between any two real roots of $\sin x = 0$ must lie a real root of $\cos x = 0$. To do this, let $f(x) = \sin x$, and let $A < B$ be any two real roots of $\sin x = 0$. Clearly $f(A) = f(B)$. Furthermore, all the other hypotheses of Rolle’s theorem are satisfied. It follows that there exists $c \in (A, B)$ such that $f'(c) = 0$. Note, however, that $f'(x) = \cos x$.

Example 5.2.2. Consider the polynomial $f(x) = x^3 + 3x^2 + 6x + 1$. We can prove that the polynomial equation $f(x) = 0$ has exactly one real root. Note that $f(-1) < 0$ and $f(1) > 0$. Applying the Intermediate value theorem to $f(x)$ in the closed interval $[-1, 1]$, we know that there exists $x_0 \in (-1, 1)$ such that $f(x_0) = 0$. It follows that the equation $f(x) = 0$ has at least one real root. Suppose that there is more than one real root. Let $A < B$ be two such roots. Then clearly $f(A) = f(B)$. Applying Rolle’s theorem with $f(x) = x^3 + 3x^2 + 6x + 1$ in the interval $[A, B]$, we conclude that there exists $c \in (A, B)$ such that $f'(c) = 0$. Note, however, that $f'(x) = 3x^2 + 6x + 6 = 3(x^2 + 2x + 1 + 1) = 3(x + 1)^2 + 3 \neq 0$ for any $x \in \mathbb{R}$.
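The existence half of Example 5.2.2 can be made concrete by locating the root numerically. The sketch below uses bisection, which mirrors the Intermediate value theorem argument; the routine `bisect_root` and its tolerance are illustrative assumptions, not something taken from the text.

```python
def f(x):
    # The polynomial from Example 5.2.2.
    return x**3 + 3 * x**2 + 6 * x + 1

def bisect_root(func, lo, hi, tol=1e-12):
    # Hypothetical helper: assumes func(lo) < 0 < func(hi) and func continuous.
    assert func(lo) < 0 < func(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid   # root lies in the right half
        else:
            hi = mid   # root lies in the left half
    return (lo + hi) / 2

root = bisect_root(f, -1.0, 1.0)
print(root, f(root))
```

Since $f'(x) = 3(x + 1)^2 + 3 \ge 3$, the function is strictly increasing, so the value returned approximates the only real root.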

The hypotheses of Rolle’s theorem are rather restrictive, in that we require the function to have equal values at the two end-points of the interval in question. However, this restriction is only apparent, as we can use Rolle’s theorem to establish the following more general result.

THEOREM 5G. (MEAN VALUE THEOREM) Suppose that a function $f(x)$ is continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Suppose further that $f'(a)$ exists for every $a \in (A, B)$. Then there exists $c \in (A, B)$ such that $f(B) - f(A) = f'(c)(B - A)$.

To understand the Mean value theorem, it is easiest to rewrite the conclusion as

\[ \frac{f(B) - f(A)}{B - A} = f'(c). \]

The left-hand side represents the slope of the line joining the points $(A, f(A))$ and $(B, f(B))$. It follows that the theorem merely says that the tangent to the curve is sometimes parallel to this line.

[Figure: graph of $y = f(x)$ over the interval $[A, B]$.]

It is therefore clear that Rolle’s theorem is a special case of the Mean value theorem. We now show that the Mean value theorem can be deduced fairly easily from Rolle’s theorem.

Proof of Theorem 5G. Consider the function

\[ g(x) = f(x) - \frac{f(B) - f(A)}{B - A}\,(x - A). \]

Then clearly $g(x)$ is continuous in the closed interval $[A, B]$, $g'(a)$ exists for every $a \in (A, B)$ and $g(A) = g(B)$. It follows from Rolle’s theorem that there exists $c \in (A, B)$ such that $g'(c) = 0$. Note now that

\[ g'(c) = f'(c) - \frac{f(B) - f(A)}{B - A}. \]


To illustrate the power of the Mean value theorem, we shall deduce the following simple but powerful consequences.

THEOREM 5H. Suppose that a function $f(x)$ is continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Suppose further that $f'(a)$ exists for every $a \in (A, B)$.
(a) If $f'(a) = 0$ for every $a \in (A, B)$, then $f(x)$ is constant in $[A, B]$.
(b) If $f'(a) > 0$ for every $a \in (A, B)$, then $f(x)$ is strictly increasing in $[A, B]$.
(c) If $f'(a) < 0$ for every $a \in (A, B)$, then $f(x)$ is strictly decreasing in $[A, B]$.

Proof. Suppose that $A \le x_1 < x_2 \le B$. Applying the Mean value theorem to the function $f(x)$ in the closed interval $[x_1, x_2]$, we have

\[ f(x_2) - f(x_1) = (x_2 - x_1)f'(c) \]

for some $c \in (x_1, x_2) \subseteq (A, B)$. It follows that

\[ f(x_2) - f(x_1) \begin{cases} = 0 & \text{in case (a)}, \\ > 0 & \text{in case (b)}, \\ < 0 & \text{in case (c)}, \end{cases} \]

giving the desired results.

We next discuss a generalization of the Mean value theorem to one involving two functions.

THEOREM 5J. (CAUCHY’S MEAN VALUE THEOREM) Suppose that functions $f(x)$ and $g(x)$ are continuous in the closed interval $[A, B]$, where $A, B \in \mathbb{R}$ with $A < B$. Suppose further that $f'(a)$ and $g'(a)$ exist for every $a \in (A, B)$, and that $g'(a)$ is non-zero for every $a \in (A, B)$. Then there exists $c \in (A, B)$ such that

\[ \frac{f(B) - f(A)}{g(B) - g(A)} = \frac{f'(c)}{g'(c)}. \]

Proof. We let $h(x) = f(x) - kg(x)$, where $k \in \mathbb{R}$ is a suitably chosen constant which ensures that $h(A) = h(B)$, so that

\[ k = \frac{f(B) - f(A)}{g(B) - g(A)}. \]

Here we observe that the denominator $g(B) - g(A)$ is non-zero, in view of Rolle’s theorem and the assumption that $g'(a)$ is non-zero for every $a \in (A, B)$. Clearly $h(x)$ is continuous in the closed interval $[A, B]$, $h'(a)$ exists for every $a \in (A, B)$ and $h(A) = h(B)$. It follows from Rolle’s theorem that there exists $c \in (A, B)$ such that $h'(c) = 0$. Note now that

\[ \frac{h'(c)}{g'(c)} = \frac{f'(c)}{g'(c)} - k = \frac{f'(c)}{g'(c)} - \frac{f(B) - f(A)}{g(B) - g(A)}. \]


We are now in a position to establish the following important result.

THEOREM 5K. (L’HOPITAL’S RULE) Suppose that functions $f(x)$ and $g(x)$ are differentiable in an open interval I containing the real number a. Suppose further that $f(a) = g(a) = 0$. Then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}, \]

provided that the limit on the right-hand side exists.

Proof. For any $x \in I$ such that $x \neq a$, we apply Cauchy’s mean value theorem to the closed interval $[a, x]$ if $x > a$ and to the closed interval $[x, a]$ if $x < a$. It is easy to check that the hypotheses of Cauchy’s mean value theorem are satisfied. Hence there exists $c \in (a, x)$ or $c \in (x, a)$ such that

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)}. \]

Clearly $c \to a$ as $x \to a$. Hence

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{c \to a} \frac{f'(c)}{g'(c)}, \]

and the result follows.

5.3. Stationary Points and Second Derivatives

Definitions.

(1) A function $f(x)$ is said to have a local maximum at $x = a$ if there is an open interval I containing the real number a and such that $f(x) \le f(a)$ for every $x \in I$.

(2) A function $f(x)$ is said to have a local minimum at $x = a$ if there is an open interval I containing the real number a and such that $f(x) \ge f(a)$ for every $x \in I$.

(3) A function $f(x)$ is said to have a stationary point at $x = a$ if $f'(a) = 0$.

Example 5.3.1. Consider the function $f(x) = x^2$. Since $f'(x) = 2x$ for every $x \in \mathbb{R}$, the only stationary point is at $x = 0$. On the other hand, note that for every $x \neq 0$, we have $f(x) = x^2 > 0 = f(0)$. It follows that there is a local minimum at $x = 0$.

Example 5.3.2. Consider the function $f(x) = x^3$. Since $f'(x) = 3x^2$ for every $x \in \mathbb{R}$, the only stationary point is at $x = 0$. On the other hand, note that for every $x < 0$, we have $f(x) = x^3 < 0 = f(0)$, whereas for every $x > 0$, we have $f(x) = x^3 > 0 = f(0)$. It follows that $x = 0$ does not represent a local minimum or a local maximum.

To detect a local maximum or local minimum, we have the following result.

THEOREM 5L. Suppose that I is an open interval containing a. Suppose further that a function $f(x)$ is continuous in I, and differentiable at every $x \in I$, except possibly at $x = a$.
(a) If $f'(x) > 0$ for every $x < a$ in I and $f'(x) < 0$ for every $x > a$ in I, then the function $f(x)$ has a local maximum at $x = a$.
(b) If $f'(x) < 0$ for every $x < a$ in I and $f'(x) > 0$ for every $x > a$ in I, then the function $f(x)$ has a local minimum at $x = a$.

Proof. Suppose that $x \in I$ and $x \neq a$. By the Mean value theorem, there exists a real number c in the open interval with endpoints a and x such that $f(x) - f(a) = (x - a)f'(c)$.

(a) Since $f'(c) > 0$ if $x < a$ and $f'(c) < 0$ if $x > a$, we clearly have $f(x) - f(a) < 0$. Hence $f(x)$ has a local maximum at $x = a$.

(b) Since $f'(c) < 0$ if $x < a$ and $f'(c) > 0$ if $x > a$, we clearly have $f(x) - f(a) > 0$. Hence $f(x)$ has a local minimum at $x = a$.

Example 5.3.3. Consider the function $f(x) = 2x^3 - 9x^2 + 12x - 5$. Since

\[ f'(x) = 6x^2 - 18x + 12 = 6(x^2 - 3x + 2) = 6(x - 1)(x - 2) \]

for every $x \in \mathbb{R}$, it is clear that the only stationary points are at $x = 1$ and $x = 2$. To determine whether either of these represents a local maximum or a local minimum, we study the function $f'(x)$ more closely. It is easy to see that

\[ f'(x) \begin{cases} > 0 & \text{if } x \in (0, 1), \\ < 0 & \text{if } x \in (1, 2), \\ > 0 & \text{if } x \in (2, 3). \end{cases} \]

It follows that $f(x)$ has a local maximum at $x = 1$ and a local minimum at $x = 2$.

If the first derivative measures the rate of change of a function, then the second derivative measures the rate of change of the first derivative. Since the first derivative represents the slope of the tangent to the curve, it follows that the second derivative measures the rate of change of this slope. The following result is suggested by heuristics based on these ideas.

THEOREM 5M. Suppose that I is an open interval containing a real number a. Suppose further that the function $f(x)$ is differentiable at every $x \in I$, and that $f'(a) = 0$.
(a) If $f''(a) < 0$, then the function $f(x)$ has a local maximum at $x = a$.
(b) If $f''(a) > 0$, then the function $f(x)$ has a local minimum at $x = a$.

Proof. We shall only prove (a), as the proof for (b) is similar. Since

\[ f''(a) = \lim_{x \to a} \frac{f'(x) - f'(a)}{x - a} < 0, \]

it follows that there exists $\delta > 0$ such that

\[ \left| \frac{f'(x) - f'(a)}{x - a} - f''(a) \right| < \frac{1}{2}|f''(a)| \quad\text{whenever } 0 < |x - a| < \delta, \]

so that

\[ \frac{f'(x) - f'(a)}{x - a} < 0 \quad\text{whenever } 0 < |x - a| < \delta. \]

Now let $I' = (a - \delta, a + \delta)$. Since $f'(a) = 0$, it is easy to see that $f'(x) > 0$ for every $x < a$ in $I'$ and $f'(x) < 0$ for every $x > a$ in $I'$. It now follows from Theorem 5L that $f(x)$ has a local maximum at $x = a$.

Example 5.3.4. Consider the function $f(x) = 2x^3 - 9x^2 + 12x - 5$, as discussed earlier in Example 5.3.3. Since

\[ f'(x) = 6x^2 - 18x + 12 = 6(x^2 - 3x + 2) = 6(x - 1)(x - 2) \]

for every $x \in \mathbb{R}$, it is clear that the only stationary points are at $x = 1$ and $x = 2$. On the other hand, we have $f''(x) = 12x - 18$ for every $x \in \mathbb{R}$, so that $f''(1) < 0$ and $f''(2) > 0$. It follows that $f(x)$ has a local maximum at $x = 1$ and a local minimum at $x = 2$.
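The second derivative test in this example is mechanical enough to write down as a short routine. This is an illustrative sketch only; the function `classify` and its return strings are assumptions made here, and the "inconclusive" branch reflects the fact that Theorem 5M says nothing when the second derivative vanishes.

```python
def f_prime(x):
    # First derivative of f(x) = 2x^3 - 9x^2 + 12x - 5.
    return 6 * x**2 - 18 * x + 12

def f_second(x):
    # Second derivative of the same function.
    return 12 * x - 18

def classify(a):
    # Hypothetical helper applying the second derivative test at x = a.
    if f_prime(a) != 0:
        return "not stationary"
    if f_second(a) < 0:
        return "local maximum"
    if f_second(a) > 0:
        return "local minimum"
    return "inconclusive"

for a in (1, 2, 3):
    print(a, classify(a))
```

With integer inputs the arithmetic is exact, so the stationary-point check `f_prime(a) != 0` is reliable here; with floating-point inputs a tolerance would be needed.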

5.4. Series Expansion

The purpose of this section is to show that if a given function has derivatives of all orders, then it has a nice power series expansion. We begin by establishing the following generalized version of the Mean value theorem.

THEOREM 5N. (TAYLOR’S THEOREM) Suppose that $n \in \mathbb{N}$. Suppose further that a function $f(x)$ satisfies the following conditions:
(a) $f(x)$ and its first $(n-1)$ derivatives $f'(x), f''(x), \ldots, f^{(n-1)}(x)$ are continuous in the closed interval $[a, a + h]$; and
(b) the n-th derivative exists in the open interval $(a, a + h)$.

Then

\[ f(a + h) = f(a) + hf'(a) + \frac{h^2}{2!}f''(a) + \ldots + \frac{h^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{h^n}{n!}f^{(n)}(a + \theta h), \]

where $\theta \in \mathbb{R}$ satisfies $0 < \theta < 1$.

Remark. Taylor’s theorem is sometimes known as the Mean value theorem of the n-th order. Note that for $n = 1$, Taylor’s theorem reduces to the Mean value theorem.

Proof of Theorem 5N. For every $t \in [0, h]$, write

\[ g(t) = f(a + t) - f(a) - tf'(a) - \ldots - \frac{t^{n-1}}{(n-1)!}f^{(n-1)}(a) - \frac{t^n}{n!}C, \tag{1} \]

where we shall choose C to ensure that $g(h) = 0$. It is easy to check that

\[ g(0) = g'(0) = \ldots = g^{(n-1)}(0) = 0. \]

We now proceed to use Rolle’s theorem n times. Since $g(0) = g(h) = 0$, there exists $h_1 \in (0, h)$ such that $g'(h_1) = 0$. Since $g'(0) = g'(h_1) = 0$, there exists $h_2 \in (0, h_1)$ such that $g''(h_2) = 0$, and so on. Finally, since $g^{(n-1)}(0) = g^{(n-1)}(h_{n-1}) = 0$, there exists $h_n \in (0, h_{n-1})$ such that $g^{(n)}(h_n) = 0$. Clearly $0 < h_n < h$, and so $h_n = \theta h$ for some $\theta \in \mathbb{R}$ satisfying $0 < \theta < 1$. Observe now that

\[ g^{(n)}(t) = f^{(n)}(a + t) - C. \]

It follows that $C = f^{(n)}(a + \theta h)$. The result follows on substituting this into (1), letting $t = h$ and noting that $g(h) = 0$.

In Taylor’s theorem, we can write

\[ f(a + h) = S_n + R_n, \]

where

\[ S_n = f(a) + hf'(a) + \frac{h^2}{2!}f''(a) + \ldots + \frac{h^{n-1}}{(n-1)!}f^{(n-1)}(a) \]

and

\[ R_n = \frac{h^n}{n!}f^{(n)}(a + \theta h). \tag{2} \]

If $R_n \to 0$ as $n \to \infty$, then $S_n \to f(a + h)$ as $n \to \infty$. We therefore have the following series version of Taylor’s theorem.

THEOREM 5P. (TAYLOR SERIES) Suppose that a function $f(x)$ satisfies the following conditions:
(a) $f(x)$ and all its derivatives $f'(x), f''(x), \ldots$ are continuous in the closed interval $[a, a + h]$; and
(b) the sequence $R_n$ defined by (2) converges to 0 as $n \to \infty$.

Then

\[ f(a + h) = \sum_{n=0}^{\infty} \frac{h^n}{n!}f^{(n)}(a), \]

with the convention that $0! = 1$.

Remark. The Maclaurin series is the Taylor series in the special case $a = 0$. Under suitable conditions, we have

\[ f(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}f^{(n)}(0). \tag{3} \]

Example 5.4.1. Consider the function $f(x) = e^x$. Then $f(x)$ has derivatives of all orders, all equal to $e^x$. Note that $f^{(n)}(0) = 1$ for every $n \in \mathbb{N} \cup \{0\}$. It follows that the Maclaurin series of the exponential function is given by

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}. \]

This is the exponential series.
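The convergence of the exponential series can be observed numerically, together with the bound that Taylor’s theorem gives for the remainder. The sketch below is illustrative: the name `exp_partial_sum` is an assumption, and the bound $|R_n| \le e^x x^n / n!$ for $x > 0$ is obtained here from the remainder formula (2) with $a = 0$, $h = x$ and $f^{(n)} = e^x$, using $0 < \theta < 1$.

```python
import math

def exp_partial_sum(x, n):
    # S_n: the first n terms x^k / k! for k = 0, ..., n - 1.
    term = 1.0
    total = 0.0
    for k in range(n):
        total += term
        term *= x / (k + 1)   # turn x^k / k! into x^(k+1) / (k+1)!
    return total

x = 1.0
for n in (2, 5, 10):
    error = abs(math.exp(x) - exp_partial_sum(x, n))
    bound = math.exp(x) * x**n / math.factorial(n)
    print(n, error, bound)
```

In each printed row the observed error should not exceed the bound, and both shrink rapidly as n grows.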

Example 5.4.2. Consider the function $f(x) = \log(1 + x)$. Then $f(x)$ has derivatives of all orders near $x = 0$. Furthermore, it can be proved by induction that for every $n \in \mathbb{N}$, we have

\[ f^{(n)}(x) = \frac{(-1)^{n-1}(n-1)!}{(1 + x)^n}, \]

so that $f^{(n)}(0) = (-1)^{n-1}(n-1)!$. Note also that $f(0) = 0$. It follows that the Maclaurin series for the function is given by

\[ \log(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^n}{n}. \]

This is the logarithmic series.
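As a rough numerical illustration of the logarithmic series, the partial sums can be compared with a library logarithm. In this sketch the name `log_series_partial_sum` is an assumption, and the error bound used (the size of the first omitted term, valid for $0 < x \le 1$) is the standard alternating series estimate, which is not derived in the text.

```python
import math

def log_series_partial_sum(x, terms):
    # Sum of (-1)^(n-1) x^n / n for n = 1, ..., terms.
    total = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= x
        total += (-1) ** (n - 1) * power / n
    return total

x = 0.5
for terms in (5, 10, 20):
    error = abs(math.log1p(x) - log_series_partial_sum(x, terms))
    first_omitted = x ** (terms + 1) / (terms + 1)
    print(terms, error, first_omitted)
```

For this value of x the error after each partial sum stays below the first omitted term.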

Example 5.4.3. Consider the function $f(x) = (1 + x)^\alpha$, where $\alpha \in \mathbb{R} \setminus \{0, 1, 2, 3, \ldots\}$. Then $f(x)$ has derivatives of all orders near $x = 0$. Furthermore, for every $n \in \mathbb{N}$, we have

\[ f^{(n)}(x) = \alpha(\alpha - 1) \ldots (\alpha - n + 1)(1 + x)^{\alpha - n}, \]

so that

\[ f^{(n)}(0) = \alpha(\alpha - 1) \ldots (\alpha - n + 1). \]

Note also that $f(0) = 1$. It follows that the Maclaurin series for the function is given by

\[ (1 + x)^\alpha = 1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha - 1) \ldots (\alpha - n + 1)}{n!}x^n. \]

This is the Extended binomial theorem.

Example 5.4.4. Consider the function $f(x) = (1 + x)^n$, where $n \in \mathbb{N}$. Then $f(x)$ has derivatives of all orders near $x = 0$. Furthermore, for every $r = 1, \ldots, n$, we have

\[ f^{(r)}(x) = n(n - 1) \ldots (n - r + 1)(1 + x)^{n - r}, \]

so that

\[ f^{(r)}(0) = n(n - 1) \ldots (n - r + 1). \]

On the other hand, for every natural number $r > n$, we have $f^{(r)}(x) = 0$. Note also that $f(0) = 1$. It follows that the Maclaurin series for the function has zero coefficients beyond the term $x^n$ and is given by

\[ (1 + x)^n = \sum_{r=0}^{n} \frac{n(n - 1) \ldots (n - r + 1)}{r!}x^r, \]

where the term with $r = 0$ is understood to be 1. This is a special case of the Binomial theorem.


Problems for Chapter 5

1. a) Suppose that $f(x)$ and $g(x)$ are twice differentiable at $x = a$. Show that

\[ (fg)''(a) = f''(a)g(a) + 2f'(a)g'(a) + f(a)g''(a). \]

b) Suppose that $f(x)$ and $g(x)$ are three times differentiable at $x = a$. Obtain a corresponding formula for $(fg)'''(a)$.

c) Suppose that $f(x)$ and $g(x)$ are n times differentiable at $x = a$. Analyze the results in parts (a) and (b), make a guess for the corresponding formula for $(fg)^{(n)}(a)$, and prove your formula by induction on n.

2. Suppose that $f''(a)$ exists. Prove that

\[ \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2} = f''(a). \]

3. Let
\[ f(x) = \begin{cases} x \sin\dfrac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
a) Show that $f(x)$ is continuous at $x = 0$.
b) Find the derivative of $f(x)$ when $x \neq 0$.
c) Show that $f(x)$ is not differentiable at $x = 0$.

4. Let
\[ f(x) = \begin{cases} x^2 \sin\dfrac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
a) Prove that $f'(x)$ exists for every real number x.
b) Find $f'(0)$.
c) Find $f'(x)$ when $x \neq 0$.
d) Prove that $f'(x)$ is not continuous at $x = 0$.

5. Construct a function $g(x)$ for which $g'(0) > 0$, but there is no interval $(-A, A)$ in which $g(x)$ is a strictly increasing function.
[Hint: Try $g(x) = f(x) + kx$, where k is a suitable constant and $f(x)$ is given in Problem 4.]

6. Consider the function $f(x) = |x| - 3$.
a) Show that $f(x)$ is differentiable at $x = a$ for every non-zero $a \in \mathbb{R}$.
b) Comment in view of Theorem 5L.

7. Suppose that the function $f(x)$ satisfies $f(0) = 0$, $f'(0) = 0$ and $f''(0) > 0$.
a) Explain why there exists $\delta > 0$ such that
\[ \frac{f'(x) - f'(0)}{x - 0} > 0 \quad\text{for every non-zero } x \in (-\delta, \delta). \]
b) Deduce that $f'(x) > 0$ for every $x \in (0, \delta)$, and that $f'(x) < 0$ for every $x \in (-\delta, 0)$.
c) Use Rolle’s theorem to show that $f(x) \neq 0$ for every non-zero $x \in (-\delta, \delta)$.
d) Use the Mean value theorem to show that $f(x) > 0$ for every non-zero $x \in (-\delta, \delta)$.

8. Consider the function $f(x) = x^{2/3}$ in the closed interval $[-1, 1]$.
a) Show that $f(-1) = f(1)$.
b) Show that there is no number $c \in (-1, 1)$ such that $f'(c) = 0$.
c) Show that $f(x)$ is not differentiable at $x = 0$.
d) Explain why the conclusion of Rolle’s theorem does not hold.

9. Explain why $x = 1$ is the only real solution of the equation $x^3 - 3x^2 + 9x - 7 = 0$.

10. Use the relevant theorems to prove that the equation $e^x = 3 - x$ has exactly one real solution.

11. Show that the equation $3x - 2 + \cos\dfrac{\pi x}{2} = 0$ has exactly one real root.

12. Use the Mean value theorem to prove the inequality $|\sin A - \sin B| \le |A - B|$ for all real numbers A and B.

13. Let $f(x) = \tan x - x$. Find $f(0)$ and use the derivative $f'(x)$ to prove that $\tan x > x$ for every x satisfying $0 < x < \pi/2$.

14. Suppose that $p(x)$ is a polynomial, and that $k \in \mathbb{R}$ is a constant. Suppose further that $A < B$ are consecutive roots of the equation $p(x) = 0$.
a) Write $p(x) = (x - A)^m (x - B)^n q(x)$, where $q(A) \neq 0$ and $q(B) \neq 0$. Prove that if we write $p'(x) = (x - A)^{m-1}(x - B)^{n-1} r(x)$, then $r(A)$ and $r(B)$ have opposite signs.
b) Hence, or otherwise, prove that there is a root of the equation $p'(x) + kp(x) = 0$ in the interval $[A, B]$.

15. Suppose that a function $f(x)$ is differentiable at every $x \in [A, B]$. Prove that $f'(x)$ takes every value between $f'(A)$ and $f'(B)$.

16. Use L’Hopital’s rule to find each of the following:
a) $\displaystyle\lim_{x \to 0} \frac{x - \sin x}{x^3}$
b) $\displaystyle\lim_{x \to 0+} x^{2x}$
c) $\displaystyle\lim_{x \to 0} \frac{\tan x - x}{x^3}$

17. Find the Maclaurin expansion of the functions $\sin x$ and $\cos x$.

18. Find all the terms up to and including $x^3$ in the Taylor expansion of each of the following functions:
a) $f(x) = (x + 1)\sin x$
b) $f(x) = e^x \cos x$
c) $f(x) = \tan x$

W W L CHEN

© W W L Chen, 1996, 2008.

This chapter is available free to all individuals, on the understanding that it is not to be used for financial gain,




Chapter 6

THE RIEMANN INTEGRAL

6.1. Introduction

Suppose that a function $f(x)$ is bounded on the interval $[A, B]$, where $A, B \in \mathbb{R}$ and $A < B$. Suppose further that

\[ \Delta : A = x_0 < x_1 < x_2 < \ldots < x_n = B \]

is a dissection of the interval $[A, B]$.

Definition. The sums

\[ s(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1}) \inf_{x \in [x_{i-1}, x_i]} f(x) \quad\text{and}\quad S(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1}) \sup_{x \in [x_{i-1}, x_i]} f(x) \]

are called respectively the lower Riemann sum and the upper Riemann sum of $f(x)$ corresponding to the dissection $\Delta$.

Example 6.1.1. Consider the function $f(x) = x^2$ in the interval $[0, 1]$. Suppose that $n \in \mathbb{N}$ is given and fixed. Let us consider a dissection

\[ \Delta_n : 0 = x_0 < x_1 < x_2 < \ldots < x_n = 1 \]

of the interval $[0, 1]$, where $x_i = i/n$ for every $i = 0, 1, 2, \ldots, n$. For every $i = 1, 2, \ldots, n$, we have

\[ \inf_{x \in [x_{i-1}, x_i]} f(x) = \inf_{\frac{i-1}{n} \le x \le \frac{i}{n}} x^2 = \frac{(i-1)^2}{n^2} \quad\text{and}\quad \sup_{x \in [x_{i-1}, x_i]} f(x) = \sup_{\frac{i-1}{n} \le x \le \frac{i}{n}} x^2 = \frac{i^2}{n^2}. \]


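It follows that

\[ s(f, \Delta_n) = \sum_{i=1}^{n} (x_i - x_{i-1}) \inf_{x \in [x_{i-1}, x_i]} f(x) = \sum_{i=1}^{n} \frac{(i-1)^2}{n^3} = \frac{(n-1)n(2n-1)}{6n^3} \]

and

\[ S(f, \Delta_n) = \sum_{i=1}^{n} (x_i - x_{i-1}) \sup_{x \in [x_{i-1}, x_i]} f(x) = \sum_{i=1}^{n} \frac{i^2}{n^3} = \frac{n(n+1)(2n+1)}{6n^3}. \]

Note that $s(f, \Delta_n) \le S(f, \Delta_n)$, and that both terms converge to $\frac{1}{3}$ as $n \to \infty$.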
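The closed forms in Example 6.1.1 can be confirmed by computing the lower and upper Riemann sums directly in exact rational arithmetic. This is a minimal sketch; the function name `lower_upper_sums` is an assumption, and the code relies on the fact, used in the example, that $x^2$ is increasing on $[0, 1]$, so that the infimum and supremum on each subinterval are attained at its left and right endpoints.

```python
from fractions import Fraction

def lower_upper_sums(n):
    # Dissection points x_i = i/n of [0, 1]; each subinterval has width 1/n.
    width = Fraction(1, n)
    # x^2 is increasing on [0, 1]: inf at the left endpoint, sup at the right.
    lower = sum(width * Fraction(i - 1, n) ** 2 for i in range(1, n + 1))
    upper = sum(width * Fraction(i, n) ** 2 for i in range(1, n + 1))
    return lower, upper

for n in (1, 10, 100, 1000):
    lower, upper = lower_upper_sums(n)
    print(n, float(lower), float(upper))
```

As n grows, the two printed values squeeze towards 1/3 from below and from above.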

THEOREM 6A. Suppose that a function $f(x)$ is bounded on the interval $[A, B]$, where $A, B \in \mathbb{R}$ and $A < B$. Suppose further that $\Delta'$ and $\Delta$ are dissections of the interval $[A, B]$, and that $\Delta' \subseteq \Delta$. Then

\[ s(f, \Delta') \le s(f, \Delta) \quad\text{and}\quad S(f, \Delta) \le S(f, \Delta'). \]

Proof. Suppose that $x' < x''$ are consecutive dissection points of $\Delta'$, and suppose that

\[ x' = y_0 < y_1 < \ldots < y_m = x'' \]

are all the dissection points of $\Delta$ in the interval $[x', x'']$. Then, drawing a picture if necessary, it is easy to see that

\[ \sum_{i=1}^{m} (y_i - y_{i-1}) \inf_{x \in [y_{i-1}, y_i]} f(x) \ge \sum_{i=1}^{m} (y_i - y_{i-1}) \inf_{x \in [x', x'']} f(x) = (x'' - x') \inf_{x \in [x', x'']} f(x) \]

and

\[ \sum_{i=1}^{m} (y_i - y_{i-1}) \sup_{x \in [y_{i-1}, y_i]} f(x) \le \sum_{i=1}^{m} (y_i - y_{i-1}) \sup_{x \in [x', x'']} f(x) = (x'' - x') \sup_{x \in [x', x'']} f(x). \]

The result follows on summing over all consecutive points of the dissection $\Delta'$.

THEOREM 6B. Suppose that a function $f(x)$ is bounded on the interval $[A, B]$, where $A, B \in \mathbb{R}$ and $A < B$. Suppose further that $\Delta'$ and $\Delta''$ are dissections of the interval $[A, B]$. Then

\[ s(f, \Delta') \le S(f, \Delta''). \]

Proof. Consider the dissection $\Delta = \Delta' \cup \Delta''$ of $[A, B]$. Then it follows from Theorem 6A that

\[ s(f, \Delta') \le s(f, \Delta) \quad\text{and}\quad S(f, \Delta) \le S(f, \Delta''). \tag{1} \]

On the other hand, it is easy to check that

\[ s(f, \Delta) \le S(f, \Delta). \tag{2} \]

The result follows on combining (1) and (2).

Definition. The real numbers

\[ I^-(f, A, B) = \sup_{\Delta} s(f, \Delta) \quad\text{and}\quad I^+(f, A, B) = \inf_{\Delta} S(f, \Delta), \]

where the supremum and infimum are taken over all dissections $\Delta$ of $[A, B]$, are called respectively the lower integral and the upper integral of $f(x)$ over $[A, B]$.

Remark. Since $f(x)$ is bounded on $[A, B]$, it follows that $s(f, \Delta)$ and $S(f, \Delta)$ are bounded above and below. This guarantees the existence of $I^-(f, A, B)$ and $I^+(f, A, B)$.

THEOREM 6C. Suppose that a function $f(x)$ is bounded on the interval $[A, B]$, where $A, B \in \mathbb{R}$ and $A < B$. Then $I^-(f, A, B) \le I^+(f, A, B)$.

Proof. Suppose that $\Delta'$ is a dissection of $[A, B]$. Then it follows from Theorem 6B that

\[ s(f, \Delta') \le S(f, \Delta) \]

for every dissection $\Delta$ of $[A, B]$. Keeping $\Delta'$ fixed and taking the infimum over all dissections $\Delta$ of $[A, B]$, we conclude that

\[ s(f, \Delta') \le \inf_{\Delta} S(f, \Delta) = I^+(f, A, B). \]

Taking now the supremum over all dissections $\Delta'$ of $[A, B]$, we conclude that

\[ I^+(f, A, B) \ge \sup_{\Delta'} s(f, \Delta') = I^-(f, A, B). \]

The result follows.

Definition. Suppose that $I^-(f, A, B) = I^+(f, A, B)$. Then we say that the function $f(x)$ is Riemann integrable over $[A, B]$, denoted by $f \in \mathcal{R}([A, B])$, and write

\[ \int_A^B f(x)\,dx = I^-(f, A, B) = I^+(f, A, B). \]

Example 6.1.2. Let us return to Example 6.1.1, and consider again the function $f(x) = x^2$ in the interval $[0, 1]$. Recall that both $s(f, \Delta_n)$ and $S(f, \Delta_n)$ converge to $\frac{1}{3}$ as $n \to \infty$. It follows that

\[ I^-(f, 0, 1) \ge \frac{1}{3} \quad\text{and}\quad I^+(f, 0, 1) \le \frac{1}{3}. \]

In view of Theorem 6C, we must have

\[ I^-(f, 0, 1) = I^+(f, 0, 1) = \frac{1}{3}, \]

so that

\[ \int_0^1 x^2\,dx = \frac{1}{3}. \]

We can establish the following characterization of Riemann integrable functions in terms of Riemann sums.

THEOREM 6D. Suppose that a function $f(x)$ is bounded on the interval $[A, B]$, where $A, B \in \mathbb{R}$ and $A < B$. Then the following two statements are equivalent:
(a) $f \in \mathcal{R}([A, B])$.
(b) Given any $\epsilon > 0$, there exists a dissection $\Delta$ of $[A, B]$ such that

\[ S(f, \Delta) - s(f, \Delta) < \epsilon. \tag{3} \]

Proof. ((a)⇒(b)) If $f \in \mathcal{R}([A, B])$, then

\[ \sup_{\Delta} s(f, \Delta) = \inf_{\Delta} S(f, \Delta), \tag{4} \]

where the supremum and infimum are taken over all dissections $\Delta$ of $[A, B]$. For every $\epsilon > 0$, there exist dissections $\Delta_1$ and $\Delta_2$ of $[A, B]$ such that

\[ s(f, \Delta_1) > \sup_{\Delta} s(f, \Delta) - \frac{\epsilon}{2} \quad\text{and}\quad S(f, \Delta_2) < \inf_{\Delta} S(f, \Delta) + \frac{\epsilon}{2}. \tag{5} \]

Let $\Delta = \Delta_1 \cup \Delta_2$. Then by Theorem 6A, we have

\[ s(f, \Delta) \ge s(f, \Delta_1) \quad\text{and}\quad S(f, \Delta) \le S(f, \Delta_2). \tag{6} \]

The inequality (3) now follows on combining (4)–(6).

((b)⇒(a)) Suppose that $\epsilon > 0$ is given. We can choose a dissection $\Delta$ of $[A, B]$ such that (3) holds. Clearly

\[ s(f, \Delta) \le I^-(f, A, B) \le I^+(f, A, B) \le S(f, \Delta). \tag{7} \]

Combining (3) and (7), we conclude that $0 \le I^+(f, A, B) - I^-(f, A, B) < \epsilon$. Note now that $\epsilon > 0$ is arbitrary, and that $I^+(f, A, B) - I^-(f, A, B)$ is independent of $\epsilon$. It follows that we must have $I^+(f, A, B) - I^-(f, A, B) = 0$.

6.2. Properties of the Riemann Integral

In this section, we shall study some simple but useful properties of the Riemann integral. We begin by

studying the arithmetic of Riemann integrals.

THEOREM 6E. Suppose that $f, g \in \mathcal{R}([A, B])$, where $A, B \in \mathbb{R}$ and $A < B$. Then the following statements hold:

(a) We have $f + g \in \mathcal{R}([A, B])$, and

\[ \int_A^B (f(x) + g(x))\,dx = \int_A^B f(x)\,dx + \int_A^B g(x)\,dx. \]

(b) For every $c \in \mathbb{R}$, we have $cf \in \mathcal{R}([A, B])$, and

\[ \int_A^B cf(x)\,dx = c\int_A^B f(x)\,dx. \]

(c) If $f(x) \ge 0$ for every $x \in [A, B]$, then

\[ \int_A^B f(x)\,dx \ge 0. \]

(d) If $f(x) \le g(x)$ for every $x \in [A, B]$, then

\[ \int_A^B f(x)\,dx \le \int_A^B g(x)\,dx. \]

Proof. (a) Since $f, g \in \mathcal{R}([A, B])$, it follows from Theorem 6D that for every $\epsilon > 0$, there exist dissections $\Delta_1$ and $\Delta_2$ of $[A, B]$ such that

\[ S(f, \Delta_1) - s(f, \Delta_1) < \frac{\epsilon}{2} \quad\text{and}\quad S(g, \Delta_2) - s(g, \Delta_2) < \frac{\epsilon}{2}. \]

Let $\Delta = \Delta_1 \cup \Delta_2$. Then in view of Theorem 6A, we have

\[ S(f, \Delta) - s(f, \Delta) < \frac{\epsilon}{2} \quad\text{and}\quad S(g, \Delta) - s(g, \Delta) < \frac{\epsilon}{2}. \tag{8} \]

Suppose that the dissection $\Delta$ is given by $\Delta : A = x_0 < x_1 < x_2 < \ldots < x_n = B$. It is easy to see that for every $i = 1, \ldots, n$, we have

\[ \sup_{x \in [x_{i-1}, x_i]} (f(x) + g(x)) \le \sup_{x \in [x_{i-1}, x_i]} f(x) + \sup_{x \in [x_{i-1}, x_i]} g(x) \]

and

\[ \inf_{x \in [x_{i-1}, x_i]} (f(x) + g(x)) \ge \inf_{x \in [x_{i-1}, x_i]} f(x) + \inf_{x \in [x_{i-1}, x_i]} g(x). \]

It follows that

\[ S(f + g, \Delta) \le S(f, \Delta) + S(g, \Delta) \quad\text{and}\quad s(f + g, \Delta) \ge s(f, \Delta) + s(g, \Delta). \tag{9} \]

Combining (8) and (9), we have

\[ S(f + g, \Delta) - s(f + g, \Delta) \le (S(f, \Delta) - s(f, \Delta)) + (S(g, \Delta) - s(g, \Delta)) < \epsilon. \]

It now follows from Theorem 6D that $f + g \in \mathcal{R}([A, B])$. To establish the second assertion, suppose now that $\Delta_1$ and $\Delta_2$ are any two dissections of $[A, B]$. As before, let $\Delta = \Delta_1 \cup \Delta_2$. Then in view of Theorem 6A and (9), we have

\[ S(f, \Delta_1) + S(g, \Delta_2) \ge S(f, \Delta) + S(g, \Delta) \ge S(f + g, \Delta) \ge I^+(f + g, A, B), \]

so that

\[ S(g, \Delta_2) \ge I^+(f + g, A, B) - S(f, \Delta_1). \]

Keeping $\Delta_1$ fixed and taking the infimum over all dissections $\Delta_2$ of $[A, B]$, we have

\[ I^+(g, A, B) \ge I^+(f + g, A, B) - S(f, \Delta_1), \]

so that

\[ S(f, \Delta_1) \ge I^+(f + g, A, B) - I^+(g, A, B). \]

Taking the infimum over all dissections $\Delta_1$ of $[A, B]$, we have

\[ I^+(f, A, B) \ge I^+(f + g, A, B) - I^+(g, A, B), \]

so that

\[ I^+(f + g, A, B) \le I^+(f, A, B) + I^+(g, A, B). \tag{10} \]

Similarly, in view of Theorem 6A and (9), we have

\[ s(f, \Delta_1) + s(g, \Delta_2) \le s(f, \Delta) + s(g, \Delta) \le s(f + g, \Delta) \le I^-(f + g, A, B), \]

so that

\[ s(g, \Delta_2) \le I^-(f + g, A, B) - s(f, \Delta_1). \]

Keeping $\Delta_1$ fixed and taking the supremum over all dissections $\Delta_2$ of $[A, B]$, we have

\[ I^-(g, A, B) \le I^-(f + g, A, B) - s(f, \Delta_1), \]

so that

\[ s(f, \Delta_1) \le I^-(f + g, A, B) - I^-(g, A, B). \]

Taking the supremum over all dissections $\Delta_1$ of $[A, B]$, we have

\[ I^-(f, A, B) \le I^-(f + g, A, B) - I^-(g, A, B), \]

so that

\[ I^-(f, A, B) + I^-(g, A, B) \le I^-(f + g, A, B). \tag{11} \]

Combining (10) and (11), we have

\[ I^-(f, A, B) + I^-(g, A, B) \le I^-(f + g, A, B) = I^+(f + g, A, B) \le I^+(f, A, B) + I^+(g, A, B). \tag{12} \]

Clearly $I^-(f, A, B) = I^+(f, A, B)$ and $I^-(g, A, B) = I^+(g, A, B)$, and so equality must hold everywhere in (12). In particular, we have $I^+(f, A, B) + I^+(g, A, B) = I^+(f + g, A, B)$.

(b) The case $c = 0$ is trivial. Suppose now that $c > 0$. Since $f \in \mathcal{R}([A, B])$, it follows from Theorem 6D that for every $\epsilon > 0$, there exists a dissection $\Delta$ of $[A, B]$ such that

\[ S(f, \Delta) - s(f, \Delta) < \frac{\epsilon}{c}. \]

It is easy to see that

\[ S(cf, \Delta) = cS(f, \Delta) \quad\text{and}\quad s(cf, \Delta) = cs(f, \Delta). \tag{13} \]

Hence

\[ S(cf, \Delta) - s(cf, \Delta) < \epsilon. \]

It follows from Theorem 6D that $cf \in \mathcal{R}([A, B])$. Also, (13) clearly implies $I^+(cf, A, B) = cI^+(f, A, B)$. Suppose next that $c < 0$. Since $f \in \mathcal{R}([A, B])$, it follows from Theorem 6D that for every $\epsilon > 0$, there exists a dissection $\Delta$ of $[A, B]$ such that

\[ S(f, \Delta) - s(f, \Delta) < -\frac{\epsilon}{c}. \]

It is easy to see that

\[ S(cf, \Delta) = cs(f, \Delta) \quad\text{and}\quad s(cf, \Delta) = cS(f, \Delta). \tag{14} \]

Hence

\[ S(cf, \Delta) - s(cf, \Delta) < \epsilon. \]

It follows from Theorem 6D that $cf \in \mathcal{R}([A, B])$. Also, (14) clearly implies $I^+(cf, A, B) = cI^-(f, A, B)$.

(c) Note simply that

\[ \int_A^B f(x)\,dx \ge (B - A) \inf_{x \in [A, B]} f(x), \]

where the right-hand side is the lower sum corresponding to the trivial dissection.

(d) Note that $g - f \in \mathcal{R}([A, B])$ in view of (a) and (b). We apply part (c) to the function $g - f$.

Next, we investigate the question of breaking up the interval [A, B] of integration.

THEOREM 6F. Suppose that $f \in \mathcal{R}([A, B])$, where $A, B \in \mathbb{R}$ and $A < B$. Then for every real number $C \in (A, B)$, we have $f \in \mathcal{R}([A, C])$ and $f \in \mathcal{R}([C, B])$. Furthermore, we have

\[ \int_A^B f(x)\,dx = \int_A^C f(x)\,dx + \int_C^B f(x)\,dx. \tag{15} \]

Proof. We shall first show that for every C′, C″ ∈ R satisfying A ≤ C′ < C″ ≤ B, we have f ∈ R([C′, C″]). Since f ∈ R([A, B]), it follows from Theorem 6D that given any ε > 0, there exists a dissection ∆∗ of [A, B] such that

S (f, ∆∗) − s(f, ∆∗) < ε.

It follows from Theorem 6A that the dissection ∆ = ∆∗ ∪ {C′, C″} of [A, B] satisfies

S (f, ∆) − s(f, ∆) < ε. (16)

Suppose that the dissection ∆ is given by ∆ : A = x0 < x1 < x2 < . . . < xn = B. Then there exist k′, k″ ∈ {0, 1, 2, . . . , n} satisfying k′ < k″ such that C′ = x_{k′} and C″ = x_{k″}. It follows that

\[ \Delta_0 : C' = x_{k'} < x_{k'+1} < x_{k'+2} < \ldots < x_{k''} = C'' \]

is a dissection of [C′, C″]. Furthermore,

\[
\begin{aligned}
S(f, \Delta_0) - s(f, \Delta_0)
&= \sum_{i=k'+1}^{k''} (x_i - x_{i-1}) \left( \sup_{x \in [x_{i-1}, x_i]} f(x) - \inf_{x \in [x_{i-1}, x_i]} f(x) \right) \\
&\le \sum_{i=1}^{n} (x_i - x_{i-1}) \left( \sup_{x \in [x_{i-1}, x_i]} f(x) - \inf_{x \in [x_{i-1}, x_i]} f(x) \right) \\
&= S(f, \Delta) - s(f, \Delta) < \epsilon,
\end{aligned}
\]

in view of (16). It now follows from Theorem 6D that f ∈ R([C′, C″]). To establish (15), note that by definition, we have

\[ \int_A^B f(x)\,dx = \inf_{\Delta} S(f, \Delta), \tag{17} \]

while

\[ \int_A^C f(x)\,dx = \inf_{\Delta_1} S(f, \Delta_1) \quad\text{and}\quad \int_C^B f(x)\,dx = \inf_{\Delta_2} S(f, \Delta_2). \tag{18} \]

Here ∆, ∆1 and ∆2 run over all dissections of [A, B], [A, C] and [C, B] respectively. The identity (15) will follow from (17) and (18) if we can show that

\[ \inf_{\Delta} S(f, \Delta) = \inf_{\Delta_1} S(f, \Delta_1) + \inf_{\Delta_2} S(f, \Delta_2). \tag{19} \]

Suppose first of all that ∆ is a dissection of [A, B]. Then we can write ∆ ∪ {C} = ∆′ ∪ ∆″, where ∆′ and ∆″ are dissections of [A, C] and [C, B] respectively. By Theorem 6A, we have

S (f, ∆) ≥ S (f, ∆ ∪ {C}) = S (f, ∆′) + S (f, ∆″).

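Clearly

\[ S(f, \Delta') + S(f, \Delta'') \ge \inf_{\Delta_1} S(f, \Delta_1) + \inf_{\Delta_2} S(f, \Delta_2). \]

Hence

\[ S(f, \Delta) \ge \inf_{\Delta_1} S(f, \Delta_1) + \inf_{\Delta_2} S(f, \Delta_2). \]

Taking the infimum over all dissections ∆ of [A, B], we conclude that

\[ \inf_{\Delta} S(f, \Delta) \ge \inf_{\Delta_1} S(f, \Delta_1) + \inf_{\Delta_2} S(f, \Delta_2). \tag{20} \]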

To establish the opposite inequality, suppose next that ∆1 and ∆2 are dissections of [A, C] and [C, B] respectively. Then ∆1 ∪ ∆2 is a dissection of [A, B], and

\[ S(f, \Delta_1) + S(f, \Delta_2) = S(f, \Delta_1 \cup \Delta_2) \ge \inf_{\Delta} S(f, \Delta). \]

This implies that

\[ S(f, \Delta_1) \ge \inf_{\Delta} S(f, \Delta) - S(f, \Delta_2). \]

Keeping ∆2 fixed and taking the infimum over all dissections ∆1 of [A, C], we have

\[ \inf_{\Delta_1} S(f, \Delta_1) \ge \inf_{\Delta} S(f, \Delta) - S(f, \Delta_2), \]

and so

\[ S(f, \Delta_2) \ge \inf_{\Delta} S(f, \Delta) - \inf_{\Delta_1} S(f, \Delta_1). \]

Taking the infimum over all dissections ∆2 of [C, B], we have

\[ \inf_{\Delta_2} S(f, \Delta_2) \ge \inf_{\Delta} S(f, \Delta) - \inf_{\Delta_1} S(f, \Delta_1), \]

and so

\[ \inf_{\Delta_1} S(f, \Delta_1) + \inf_{\Delta_2} S(f, \Delta_2) \ge \inf_{\Delta} S(f, \Delta). \tag{21} \]

The assertion (19) now follows on combining (20) and (21).
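As a numerical illustration only, the following Python sketch checks the identity S(f, ∆1) + S(f, ∆2) = S(f, ∆1 ∪ ∆2) used in the second half of the proof. The name upper_sum, the choice f = sin, the two dissections and the estimation of each supremum by sampling finitely many points are all assumptions made for this example; they are not part of the text.

```python
import math

def upper_sum(f, points, samples=200):
    """Approximate S(f, D) for the dissection `points`.

    Each supremum is only estimated, by sampling samples + 1 equally
    spaced points in the subinterval (an assumption of this sketch).
    """
    total = 0.0
    for left, right in zip(points, points[1:]):
        sup = max(f(left + (right - left) * j / samples)
                  for j in range(samples + 1))
        total += (right - left) * sup
    return total

d1 = [0.0, 0.5, 1.0]           # a dissection of [A, C] = [0, 1]
d2 = [1.0, 1.5, 2.5, 3.0]      # a dissection of [C, B] = [1, 3]
union = d1 + d2[1:]            # their union, a dissection of [0, 3]

lhs = upper_sum(math.sin, d1) + upper_sum(math.sin, d2)
rhs = upper_sum(math.sin, union)
print(lhs, rhs)                # the two values agree up to rounding
```

The agreement reflects the fact that the subintervals of ∆1 ∪ ∆2 are exactly those of ∆1 together with those of ∆2.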

Next, we investigate the question of combining two intervals of integration.

THEOREM 6G. Suppose that A, B, C ∈ R and A < C < B. Suppose further that f ∈ R([A, C]) and f ∈ R([C, B]). Then f ∈ R([A, B]). Furthermore,

\[ \int_A^B f(x)\,dx = \int_A^C f(x)\,dx + \int_C^B f(x)\,dx. \]

Proof. Since f ∈ R([A, C]) and f ∈ R([C, B]), it follows from Theorem 6D that given any ε > 0, there exist dissections ∆1 and ∆2 of [A, C] and [C, B] respectively such that

\[ S(f, \Delta_1) - s(f, \Delta_1) < \frac{\epsilon}{2} \quad\text{and}\quad S(f, \Delta_2) - s(f, \Delta_2) < \frac{\epsilon}{2}. \tag{22} \]

Clearly ∆ = ∆1 ∪ ∆2 is a dissection of [A, B]. Furthermore,

S (f, ∆) = S (f, ∆1) + S (f, ∆2) and s(f, ∆) = s(f, ∆1) + s(f, ∆2).

Hence

S (f, ∆) − s(f, ∆) = (S (f, ∆1) − s(f, ∆1)) + (S (f, ∆2) − s(f, ∆2)) < ε,

in view of (22). It now follows from Theorem 6D that f ∈ R([A, B]). The last assertion now follows immediately from Theorem 6F.

Finally, we consider the question of altering the value of the function at a finite number of points. The following result may be applied a finite number of times.

THEOREM 6H. Suppose that f ∈ R([A, B]), where A, B ∈ R and A < B. Suppose further that the real number C ∈ [A, B], and that f (x) = g(x) for every x ∈ [A, B] except possibly at x = C. Then g ∈ R([A, B]), and

\[ \int_A^B f(x)\,dx = \int_A^B g(x)\,dx. \]

Proof. Write h(x) = f (x) − g(x) for every x ∈ [A, B]. Since g = f − h, the result follows from Theorem 6E(a) and (b) if we show that h ∈ R([A, B]) and

\[ \int_A^B h(x)\,dx = 0. \]

Note that h(x) = 0 whenever x ≠ C. The case h(C) = 0 is trivial, so we assume, without loss of generality, that h(C) ≠ 0. Given any ε > 0, we shall choose a dissection ∆ of [A, B] such that C is not one of the dissection points and such that the subinterval containing C has length less than ε/|h(C)|. Since −|h(C)| ≤ h(C) ≤ |h(C)|, it is easy to check that

\[ S(h, \Delta) < |h(C)| \cdot \frac{\epsilon}{|h(C)|} = \epsilon \quad\text{and}\quad s(h, \Delta) > -|h(C)| \cdot \frac{\epsilon}{|h(C)|} = -\epsilon. \]

Hence

\[ -\epsilon < s(h, \Delta) \le I^-(h, A, B) \le I^+(h, A, B) \le S(h, \Delta) < \epsilon. \]

Note now that ε > 0 is arbitrary, and the terms I −(h,A,B) and I +(h,A,B) are independent of ε. It follows that we must have I −(h,A,B) = I +(h,A,B) = 0. This completes the proof.

6.3. Sufficient Conditions for Integrability

There are a few conditions that guarantee Riemann integrability. Here we shall study two such instances.

Definition. Suppose that f (x) is a function defined on an interval I.
(1) We say that f (x) is increasing in I if f (x1) ≤ f (x2) for every x1, x2 ∈ I satisfying x1 < x2.
(2) We say that f (x) is decreasing in I if f (x1) ≥ f (x2) for every x1, x2 ∈ I satisfying x1 < x2.
(3) We say that f (x) is monotonic in I if it is increasing in I or decreasing in I.

Remark. Note that a constant function on an interval I is both increasing in I and decreasing in I .

THEOREM 6J. Suppose that a function f (x) is monotonic in the closed interval [A, B], where A, B ∈ R and A < B. Then f ∈ R([A, B]).

Proof. The result is trivial if f (A) = f (B), so we may assume that f (A) ≠ f (B). We may further assume, without loss of generality, that f (x) is increasing in [A, B], so that f (A) < f (B). Given any ε > 0, we shall consider a dissection

∆ : A = x0 < x1 < x2 < . . . < xn = B

of [A, B] such that

\[ x_i - x_{i-1} < \frac{\epsilon}{f(B) - f(A)} \quad\text{for every } i = 1, \ldots, n. \]

Since f (x) is increasing in [A, B], we have

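\[ S(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1}) f(x_i) \quad\text{and}\quad s(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1}) f(x_{i-1}), \]

so that

\[ S(f, \Delta) - s(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1})(f(x_i) - f(x_{i-1})) < \frac{\epsilon}{f(B) - f(A)} \sum_{i=1}^{n} (f(x_i) - f(x_{i-1})) = \epsilon. \]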

The result now follows from Theorem 6D.
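The telescoping step in this proof can be checked exactly on a small example. The sketch below is an illustration under assumptions of our own: the helper name sums_increasing, the function f(x) = x³ on [0, 2] and the use of dissections into n equal parts are not taken from the text. For an increasing function and equal parts of length (B − A)/n, the difference S − s collapses to (B − A)(f(B) − f(A))/n.

```python
from fractions import Fraction

def sums_increasing(f, a, b, n):
    """Lower and upper sums of an increasing f on [a, b], n equal parts.

    For increasing f the infimum on each subinterval is attained at the
    left endpoint and the supremum at the right endpoint.
    """
    xs = [a + (b - a) * Fraction(i, n) for i in range(n + 1)]
    lower = sum((xs[i] - xs[i - 1]) * f(xs[i - 1]) for i in range(1, n + 1))
    upper = sum((xs[i] - xs[i - 1]) * f(xs[i]) for i in range(1, n + 1))
    return lower, upper

cube = lambda x: x ** 3
for n in (10, 100, 1000):
    lower, upper = sums_increasing(cube, Fraction(0), Fraction(2), n)
    print(n, float(lower), float(upper), upper - lower)   # gap is 16/n
```

Exact rational arithmetic is used so that the gap 16/n appears without rounding error.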

THEOREM 6K. Suppose that a function f (x) is continuous in the closed interval [A, B], where A, B ∈ R and A < B. Then f ∈ R([A, B]).

Here we need the idea of uniformity in continuity.

Definition. A function f (x) is said to be uniformly continuous in an interval I if, given any ε > 0, there exists δ > 0 such that

|f (x) − f (y)| < ε whenever x, y ∈ I and |x − y| < δ.

It is easy to show that if f (x) is uniformly continuous in an interval I, then it is continuous in I. The converse is not true, as can be seen from the following example.

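Example 6.3.1. Consider the function f (x) = 1/x in the open interval (0, 1). Then given any δ > 0, there exists n ∈ N such that n² > δ⁻¹. Note now that

\[ \left| f\!\left(\frac{1}{n}\right) - f\!\left(\frac{1}{n+1}\right) \right| = 1 \quad\text{and}\quad \left| \frac{1}{n} - \frac{1}{n+1} \right| = \frac{1}{n(n+1)} < \frac{1}{n^2} < \delta. \]

Hence no δ > 0 satisfies the definition with ε = 1, so that f (x) is not uniformly continuous in (0, 1).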
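The choice of n in Example 6.3.1 can be made concrete. The following sketch is illustrative only: the function name witness and the particular way of choosing n (an integer square root plus 2, which also keeps both points inside (0, 1)) are assumptions of this example. Exact fractions are used so that the value 1 is obtained without rounding.

```python
from fractions import Fraction
import math

def witness(delta):
    """Return x, y in (0, 1) with |x - y| < delta and |1/x - 1/y| = 1.

    `delta` is a positive Fraction. The integer n chosen here satisfies
    n >= 2 and n * n > 1 / delta.
    """
    n = math.isqrt(int(1 / delta)) + 2
    return Fraction(1, n), Fraction(1, n + 1)

for delta in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10 ** 6)):
    x, y = witness(delta)
    print(delta, x, y, abs(x - y) < delta, abs(1 / x - 1 / y))
```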

THEOREM 6L. Suppose that a function f (x) is continuous in the closed interval [A, B], where A, B ∈ R and A < B. Then f (x) is uniformly continuous in [A, B].

Proof. Suppose on the contrary that f (x) is not uniformly continuous in [A, B]. Then there exists ε > 0 such that for every n ∈ N, there exist x_n, y_n ∈ [A, B] such that

\[ |x_n - y_n| < \frac{1}{n} \quad\text{and}\quad |f(x_n) - f(y_n)| \ge \epsilon. \]

The sequence x_n is clearly bounded, and so has a convergent subsequence x_{n_p}. Suppose that x_{n_p} → c as p → ∞. Then

\[ |y_{n_p} - c| \le |x_{n_p} - y_{n_p}| + |x_{n_p} - c| \to 0 \quad\text{as } p \to \infty, \]

so that y_{n_p} → c as p → ∞. Suppose first of all that c ∈ (A, B). Since f (x) is continuous in [A, B], it is continuous at c, and so f (x_{n_p}) → f (c) and f (y_{n_p}) → f (c) as p → ∞. Note now that

\[ |f(x_{n_p}) - f(y_{n_p})| \le |f(x_{n_p}) - f(c)| + |f(y_{n_p}) - f(c)|. \]

This implies that |f (x_{n_p}) − f (y_{n_p})| → 0 as p → ∞, clearly a contradiction. If c = A or c = B, then there is only one-sided continuity at c, and the proof requires minor modification.

Proof of Theorem 6K. In view of Theorem 6L, given any ε > 0, there exists δ > 0 such that

\[ |f(x) - f(y)| < \frac{\epsilon}{B - A} \quad\text{whenever } x, y \in [A, B] \text{ and } |x - y| < \delta. \]

We now consider a dissection

∆ : A = x0 < x1 < x2 < .. . < xn = B

of [A, B] such that

xi − xi−1 < δ for every i = 1, . . . , n .

Then

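\[ S(f, \Delta) - s(f, \Delta) = \sum_{i=1}^{n} (x_i - x_{i-1}) \left( \sup_{x \in [x_{i-1}, x_i]} f(x) - \inf_{x \in [x_{i-1}, x_i]} f(x) \right) \le \frac{\epsilon}{B - A} \sum_{i=1}^{n} (x_i - x_{i-1}) = \epsilon. \]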

The result now follows from Theorem 6D.

6.4. Integration as the Inverse of Differentiation

In this section, we shall establish the principle that if we can find an indefinite integral, then we can calculate definite integrals. However, we shall first establish some properties of the indefinite integral.

THEOREM 6M. Suppose that f ∈ R([A, B]), where A, B ∈ R and A < B. Suppose further that

\[ F(x) = \int_A^x f(t)\,dt \]

for every x ∈ [A, B]. Then the following assertions hold:
(a) The function F (x) is continuous in [A, B].
(b) For every a ∈ (A, B) such that f (x) is continuous at x = a, we have F′(a) = f (a).

Proof. (a) Suppose that a ∈ (A, B). Then

\[ F(a + h) - F(a) = \int_a^{a+h} f(t)\,dt. \]

If h > 0, then it follows from Theorem 6E(d) that

\[ h \inf_{t \in [A,B]} f(t) \le \int_a^{a+h} f(t)\,dt \le h \sup_{t \in [A,B]} f(t), \]

so that F (a + h) − F (a) → 0 as h → 0+. An essentially similar argument holds for h < 0 and h → 0−. The argument has to be slightly modified if a = A or a = B.

(b) Suppose first of all that h > 0. Then it follows from Theorem 6E(d) that

\[ h \inf_{t \in [a,a+h]} f(t) \le \int_a^{a+h} f(t)\,dt \le h \sup_{t \in [a,a+h]} f(t), \]

so that

\[ \inf_{t \in [a,a+h]} f(t) \le \frac{F(a + h) - F(a)}{h} \le \sup_{t \in [a,a+h]} f(t). \]

If f (x) is continuous at x = a, then

\[ \inf_{t \in [a,a+h]} f(t) \to f(a) \quad\text{and}\quad \sup_{t \in [a,a+h]} f(t) \to f(a) \quad\text{as } h \to 0+, \]

so that

\[ \frac{F(a + h) - F(a)}{h} \to f(a) \quad\text{as } h \to 0+. \]

An essentially similar argument holds for h < 0 and h → 0−.

THEOREM 6N. Suppose that f (x) is continuous in the interval [A, B], where A, B ∈ R and A < B. Suppose further that φ′(x) = f (x) for every x ∈ [A, B]. Then for every x ∈ [A, B], we have

\[ \int_A^x f(t)\,dt = \phi(x) - \phi(A). \]

Proof. It follows from Theorem 6M that F′(x) − φ′(x) = 0 for every x ∈ (A, B), so that F (x) − φ(x) is constant in [A, B] by Theorem 5H(a). Since F (A) = 0, we must have F (x) = φ(x) − φ(A) for every x ∈ [A, B].
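Theorem 6N can be compared with the definition of the integral on a simple example. The sketch below is an illustration under our own assumptions: f(t) = t² with φ(t) = t³/3 on [1, 3], the helper name lower_upper, and dissections into n equal parts are not taken from the text. Since f is increasing there, the lower and upper sums are computed from left and right endpoints, and they bracket φ(3) − φ(1) = 26/3.

```python
from fractions import Fraction

def lower_upper(f, a, b, n):
    """Lower and upper sums of an increasing f on [a, b], n equal parts."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(h * f(x) for x in xs[:-1])   # left endpoints give infima
    upper = sum(h * f(x) for x in xs[1:])    # right endpoints give suprema
    return lower, upper

f = lambda t: t * t
phi = lambda t: t ** 3 / 3                    # satisfies phi'(t) = f(t)

A, x = Fraction(1), Fraction(3)
exact = phi(x) - phi(A)                       # equals 26/3
for n in (10, 100, 1000):
    low, up = lower_upper(f, A, x, n)
    print(n, float(low), float(exact), float(up))
```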

6.5. An Important Example

In this section, we shall find a function that is not Riemann integrable. Consider the function

\[ g(x) = \begin{cases} 0 & \text{if } x \text{ is rational,} \\ 1 & \text{if } x \text{ is irrational.} \end{cases} \]

We know from Theorem 1D that in any open interval, there are rational numbers and irrational numbers. It follows that in any interval [α, β], where α < β, we have

\[ \inf_{x \in [\alpha,\beta]} g(x) = 0 \quad\text{and}\quad \sup_{x \in [\alpha,\beta]} g(x) = 1. \]

It follows that for every dissection ∆ of [0, 1], we have

s(g, ∆) = 0 and S (g, ∆) = 1,

so that

I −(g, 0, 1) = 0 ≠ 1 = I +(g, 0, 1).

It follows that g(x) is not Riemann integrable over the closed interval [0, 1].

Note, on the other hand, that the rational numbers in [0, 1] are countable, while the irrational numbers in [0, 1] are not countable. In the sense of cardinality, there are far more irrational numbers than rational numbers in [0, 1]. However, the definition of the Riemann integral does not highlight this inequality.

We wish therefore to develop a theory of integration more general than Riemann integration. This is the motivation for the Lebesgue integral.

1. Calculate the integral $\int_0^1 x\,dx$ by dissecting the interval [0, 1] into equal parts.

2. Calculate the integral $\int_A^B x^k\,dx$, where k > 0 is fixed, by dissecting the interval [A, B] into n parts in geometric progression, so that $A < Aq < Aq^2 < \ldots < Aq^n = B$.

3. a) By using the method of Problem 2, prove that

\[ \int_1^2 \frac{1}{x^2}\,dx = \frac{1}{2}. \]

b) Deduce that

\[ \lim_{n\to\infty} n \left( \frac{1}{(n+1)^2} + \frac{1}{(n+2)^2} + \ldots + \frac{1}{(2n)^2} \right) = \frac{1}{2}. \]

4. Calculate the integral $\int_0^\alpha \sin x\,dx$ by dissecting the interval [0, α] into equal parts.

5. Consider the function f (x) = 1/x in the closed interval [1, 2]. For every n ∈ N, let ∆n denote the dissection of the interval [1, 2] into n subintervals of equal length.

a) Find s(f, ∆n) and S (f, ∆n), and show that

\[ S(f, \Delta_n) - s(f, \Delta_n) = \frac{1}{2n}. \]

b) Show that f ∈ R([1, 2]).

c) Explain why the value of the integral is equal to

\[ \lim_{n\to\infty} \left( \frac{1}{n+1} + \frac{1}{n+2} + \ldots + \frac{1}{2n} \right). \]

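6. In this question, we shall try to verify from the definition of the Riemann integral that

\[ \int_0^1 f(x)\,dx = \frac{2}{\pi}, \quad\text{where } f(x) = \cos\frac{\pi x}{2}. \]

For every n ∈ N, let ∆n denote the dissection of the interval [0, 1] into n subintervals of equal length.

a) Find s(f, ∆n) and S (f, ∆n), and show that

\[ S(f, \Delta_n) - s(f, \Delta_n) = \frac{1}{n}. \]

b) Show that f ∈ R([0, 1]).

c) Explain why

\[ \int_0^1 f(x)\,dx = \lim_{n\to\infty} S(f, \Delta_n). \]

d) Note that $\cos (k-1)\theta = \operatorname{Re}(e^{i(k-1)\theta})$, so that S (f, ∆n) is the real part of a geometric series. Sum the geometric series and show that

\[ S(f, \Delta_n) = \frac{1}{n} \operatorname{Re}\left( \frac{1 - e^{in\theta}}{1 - e^{i\theta}} \right) = \frac{1}{n} \operatorname{Re}\left( \frac{1 - i}{1 - e^{i\theta}} \right) = \frac{\theta}{\pi} + \frac{\theta \sin\theta}{\pi(1 - \cos\theta)}, \quad\text{where } \theta = \frac{\pi}{2n}. \]

e) Explain why

\[ \lim_{n\to\infty} S(f, \Delta_n) = \frac{2}{\pi}. \]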

7. Suppose that a function f (x) is bounded on the closed interval [A, B], where A, B ∈ R and A < B.

a) Show that for any closed interval I ⊆ [A, B],

\[ \sup_{x \in I} |f(x)| - \inf_{x \in I} |f(x)| \le \sup_{x \in I} f(x) - \inf_{x \in I} f(x). \]

b) Show that for every dissection ∆ of the interval [A, B],

S (|f |, ∆) − s(|f |, ∆) ≤ S (f, ∆) − s(f, ∆).

c) Show that if f ∈ R([A, B]), then |f | ∈ R([A, B]).

d) Note that −|f (x)| ≤ f (x) ≤ |f (x)| for every x ∈ [A, B]. Use this to show that if f ∈ R([A, B]), then

\[ \left| \int_A^B f(x)\,dx \right| \le \int_A^B |f(x)|\,dx. \]

8. Suppose that f, g ∈ R([A, B]), where A, B ∈ R and A < B.

a) Show that f² ∈ R([A, B]).

b) Use part (a) to deduce that f g ∈ R([A, B]).

c) Suppose further that m ≤ f (x) ≤ M and g(x) ≥ 0 for every x ∈ [A, B]. Show that

\[ m \int_A^B g(x)\,dx \le \int_A^B f(x) g(x)\,dx \le M \int_A^B g(x)\,dx. \]

d) By considering the integral

\[ \int_A^B (\lambda f(x) + \mu g(x))^2\,dx \]

for suitable constants λ and µ, establish Schwarz’s inequality

\[ \left( \int_A^B f(x) g(x)\,dx \right)^2 \le \left( \int_A^B f^2(x)\,dx \right) \left( \int_A^B g^2(x)\,dx \right). \]










Chapter 7

FURTHER TREATMENT OF LIMITS

7.1. Upper and Lower Limits of a Real Sequence

Suppose that xn is a sequence of real numbers bounded above. For every n ∈ N, let

K n = sup{xn, xn+1, xn+2, . . .}.

Then K n is a decreasing sequence, and converges as n → ∞ if it is bounded below.

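Definition. Suppose that xn is a sequence of real numbers bounded above. The number

\[ \Lambda = \lim_{n\to\infty} \Bigl( \sup_{r \ge n} x_r \Bigr), \]

if it exists, is called the upper limit of xn, and denoted by

\[ \Lambda = \limsup_{n\to\infty} x_n \quad\text{or}\quad \Lambda = \varlimsup_{n\to\infty} x_n. \]

Definition. Suppose that xn is a sequence of real numbers bounded below. The number

\[ \lambda = \lim_{n\to\infty} \Bigl( \inf_{r \ge n} x_r \Bigr), \]

if it exists, is called the lower limit of xn, and denoted by

\[ \lambda = \liminf_{n\to\infty} x_n \quad\text{or}\quad \lambda = \varliminf_{n\to\infty} x_n. \]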


Remark. It is obvious that λ ≤ Λ, since the infimum of a bounded set of real numbers never exceeds the corresponding supremum.

Example 7.1.1. For the sequence xn = (−1)^n, we have Λ = 1 and λ = −1.

Example 7.1.2. For the sequence xn = n/(n + 1), we have Λ = λ = 1.

Example 7.1.3. For the sequence xn = n(1 + (−1)n), we have λ = 0 and Λ does not exist.

Example 7.1.4. For the sequence xn = sin(½nπ), we have Λ = 1 and λ = −1.
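The quantities sup{x_r : r ≥ n} and inf{x_r : r ≥ n} in Examples 7.1.1 to 7.1.3 can be explored numerically. The sketch below is only a rough aid: a computer can inspect a finite window n ≤ r ≤ horizon, not the whole tail, so the values are approximations of the suprema and infima. The names tail_sup_inf and horizon are hypothetical.

```python
from fractions import Fraction

def tail_sup_inf(x, n, horizon):
    """Return (max, min) of x(n), x(n + 1), ..., x(horizon).

    This finite window only approximates the supremum and infimum
    over the infinite tail r >= n.
    """
    tail = [x(r) for r in range(n, horizon + 1)]
    return max(tail), min(tail)

alternating = lambda n: (-1) ** n             # Example 7.1.1
ratio = lambda n: Fraction(n, n + 1)          # Example 7.1.2
unbounded = lambda n: n * (1 + (-1) ** n)     # Example 7.1.3

for n in (10, 100):
    print(n,
          tail_sup_inf(alternating, n, 1000),
          tail_sup_inf(ratio, n, 1000),
          tail_sup_inf(unbounded, n, 1000))
```

For the third sequence the window maximum keeps growing as the horizon is enlarged, which is consistent with Λ not existing in Example 7.1.3.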

THEOREM 7A. Suppose that xn is a sequence of real numbers. Then the following two statements are equivalent:
(a) We have Λ = lim sup_{n→∞} xn.
(b) For every ε > 0, we have
(i) xn < Λ + ε for all sufficiently large n ∈ N; and
(ii) xn > Λ − ε for infinitely many n ∈ N.

Proof. ((a)⇒(b)) Suppose that

\[ \Lambda = \limsup_{n\to\infty} x_n = \lim_{n\to\infty} K_n, \quad\text{where } K_n = \sup_{r \ge n} x_r. \]

Given any ε > 0, there exists N ∈ N such that |K_N − Λ| < ε, so that in particular, K_N < Λ + ε. It follows that xn < Λ + ε for every n ≥ N, giving (i). On the other hand, for every ε > 0 and every N ∈ N, there exists n ≥ N such that xn > K_N − ε. Clearly K_N ≥ Λ for every N ∈ N, giving (ii).

((b)⇒(a)) Given any ε > 0, it follows from (i) that K_n ≤ Λ + ε for all sufficiently large n ∈ N, and from (ii) that K_n > Λ − ε for every n ∈ N. Clearly K_n → Λ as n → ∞.

Similarly, we have the following result.

THEOREM 7B. Suppose that xn is a sequence of real numbers. Then the following two statements are equivalent:
(a) We have λ = lim inf_{n→∞} xn.
(b) For every ε > 0, we have
(i) xn > λ − ε for all sufficiently large n ∈ N; and
(ii) xn < λ + ε for infinitely many n ∈ N.

We now establish the following important result.

THEOREM 7C. Suppose that xn is a sequence of real numbers. Then

\[ \lim_{n\to\infty} x_n = \ell \quad\text{if and only if}\quad \limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \ell. \]

Proof. (⇒) Suppose that xn → ℓ as n → ∞. Then the upper and lower limits of the sequence xn clearly exist, since xn is bounded in this case. Also, given any ε > 0, there exists N ∈ N such that ℓ − ε < xn < ℓ + ε for every n ≥ N. The conclusion follows immediately from Theorems 7A and 7B.

(⇐) Suppose that the upper and lower limits are both equal to ℓ. Then it follows from Theorem 7A that xn < ℓ + ε for all sufficiently large n ∈ N, and from Theorem 7B that xn > ℓ − ε for all sufficiently large n ∈ N. Hence |xn − ℓ| < ε for all sufficiently large n ∈ N, whence xn → ℓ as n → ∞.

7.2. Double and Repeated Limits

We shall consider a double sequence zmn of complex numbers, represented by a doubly infinite array

\[
\begin{array}{cccc}
z_{11} & z_{12} & z_{13} & \ldots \\
z_{21} & z_{22} & z_{23} & \ldots \\
z_{31} & z_{32} & z_{33} & \ldots \\
\vdots & \vdots & \vdots & \ddots
\end{array}
\]

of complex numbers. More precisely, a double sequence of complex numbers is simply a mapping from N × N to C.

Definition. We say that a double sequence zmn converges to a finite limit z ∈ C, denoted by zmn → z as m, n → ∞ or by

\[ \lim_{m,n\to\infty} z_{mn} = z, \]

if, given any ε > 0, there exists N = N(ε) ∈ R, depending on ε, such that |zmn − z| < ε whenever m, n > N. Furthermore, we say that a double sequence zmn is convergent if it converges to some finite limit z as m, n → ∞, and that a double sequence zmn is divergent if it is not convergent.

Example 7.2.1. For the double sequence

\[ z_{mn} = \frac{1}{m + n}, \]

we have zmn → 0 as m, n → ∞.

Example 7.2.2. The double sequence

\[ z_{mn} = \frac{m}{m + n} \]

does not converge to a finite limit as m, n → ∞. Note that for all sufficiently large m, n ∈ N with m = n, we have zmn = 1/2, whereas for all sufficiently large m, n ∈ N with m = 2n, we have zmn = 2/3.
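The two values quoted in Example 7.2.2 can be reproduced exactly. The sketch below is an illustration only; the function name z and the sample indices are our own choices. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def z(m, n):
    """The double sequence z_mn = m / (m + n) of Example 7.2.2."""
    return Fraction(m, m + n)

diagonal = [z(n, n) for n in (10, 100, 1000)]        # along m = n
steeper = [z(2 * n, n) for n in (10, 100, 1000)]     # along m = 2n
print(diagonal)
print(steeper)
```

Since the terms stay at 1/2 along one path and at 2/3 along another, no single number z can satisfy |zmn − z| < ε for all large m, n once ε is small.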

The question we want to study is the relationship, if any, between the following three limiting processes when applied to a double sequence zmn of complex numbers:

• m, n → ∞.
• n → ∞ followed by m → ∞.
• m → ∞ followed by n → ∞.

THEOREM 7D. Suppose that a double sequence zmn satisfies the following conditions:
(a) The double limit $\lim_{m,n\to\infty} z_{mn}$ exists.
(b) For every m ∈ N, the limit $\lim_{n\to\infty} z_{mn}$ exists.
Then the repeated limit $\lim_{m\to\infty} \bigl( \lim_{n\to\infty} z_{mn} \bigr)$ exists, and is equal to the double limit $\lim_{m,n\to\infty} z_{mn}$.

Remark. We need to make the assumption (b), as it does not necessarily follow from assumption (a). Consider, for example, the double sequence

\[ z_{mn} = \frac{(-1)^n}{m}. \]

Proof of Theorem 7D. Suppose that zmn → z as m, n → ∞. Suppose also that for every m ∈ N, zmn → ζm as n → ∞. We need to show that ζm → z as m → ∞. Given any ε > 0, there exists N ∈ R such that

\[ |z_{mn} - z| < \frac{\epsilon}{2} \quad\text{whenever } m, n > N. \]

On the other hand, given any m ∈ N, there exists M(m) ∈ R such that

\[ |z_{mn} - \zeta_m| < \frac{\epsilon}{2} \quad\text{whenever } n > M(m). \]

Now let m > N. Then choosing n > max{N, M(m)}, we have

|ζm − z| ≤ |zmn − ζm| + |zmn − z| < ε.

Hence ζ m → z as m → ∞.

We immediately have the following generalization.

THEOREM 7E. Suppose that a double sequence zmn satisfies the following conditions:
(a) The double limit $\lim_{m,n\to\infty} z_{mn}$ exists.
(b) For every m ∈ N, the limit $\lim_{n\to\infty} z_{mn}$ exists.
(c) For every n ∈ N, the limit $\lim_{m\to\infty} z_{mn}$ exists.
Then the repeated limits $\lim_{m\to\infty} \bigl( \lim_{n\to\infty} z_{mn} \bigr)$ and $\lim_{n\to\infty} \bigl( \lim_{m\to\infty} z_{mn} \bigr)$ exist, and are both equal to the double limit $\lim_{m,n\to\infty} z_{mn}$.

We can further generalize the above to a result concerning series.

Definition. Suppose that zmn is a double sequence of complex numbers. For every m, n ∈ N, let

\[ s_{mn} = \sum_{i=1}^{m} \sum_{j=1}^{n} z_{ij}. \]

If the double sequence smn → s as m, n → ∞, then we say that the double series

\[ \sum_{m,n=1}^{\infty} z_{mn} \]

is convergent, with sum s.

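THEOREM 7F. Suppose that a double sequence zmn satisfies the following conditions:
(a) The double series $\sum_{m,n=1}^{\infty} z_{mn}$ is convergent, with sum s.
(b) For every m ∈ N, the series $\sum_{n=1}^{\infty} z_{mn}$ is convergent.
(c) For every n ∈ N, the series $\sum_{m=1}^{\infty} z_{mn}$ is convergent.
Then the repeated series

\[ \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} z_{mn} \right) \quad\text{and}\quad \sum_{n=1}^{\infty} \left( \sum_{m=1}^{\infty} z_{mn} \right) \]

are both convergent, with sum s.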




7.3. Infinite Products

An infinite product is an expression of the form

(1 + z1)(1 + z2)(1 + z3) . . .

with an infinitude of factors. We denote this by

\[ \prod_{n=1}^{\infty} (1 + z_n). \tag{1} \]

We also make the natural assumption that zn ≠ −1 for any n ∈ N.

For every N ∈ N, let

\[ p_N = \prod_{n=1}^{N} (1 + z_n) = (1 + z_1) \ldots (1 + z_N). \]

We shall call pN the N-th partial product of the infinite product (1).

Definition. If the sequence pN converges to a non-zero limit p as N → ∞, then we say that the infinite product (1) converges to p and write

\[ \prod_{n=1}^{\infty} (1 + z_n) = p. \]

In this case, we sometimes simply say that the infinite product (1) is convergent. On the other hand, if the sequence pN does not converge to a non-zero limit as N → ∞, then we say that the infinite product (1) is divergent. In particular, if pN → 0 as N → ∞, then we say that the infinite product (1) diverges to zero.

Let us first examine the special case when all the terms zn are real.

THEOREM 7G. Suppose that an ≥ 0 for every n ∈ N. Then the infinite product

\[ \prod_{n=1}^{\infty} (1 + a_n) \]

is convergent if and only if the series

\[ \sum_{n=1}^{\infty} a_n \]

is convergent.

Proof. Let sN be the N-th partial sum of the series. Since an ≥ 0 for every n ∈ N, the sequences sN and pN are both increasing. On the other hand, note that 1 + a ≤ e^a for every a ≥ 0. It follows that for every N ∈ N, we have

\[ a_1 + \ldots + a_N \le (1 + a_1) \ldots (1 + a_N) \le e^{a_1 + \ldots + a_N}, \]

so that $s_N \le p_N \le e^{s_N}$. It follows that the sequences sN and pN are bounded or unbounded together. The result follows from Theorem 2E.
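The sandwich s_N ≤ p_N ≤ e^{s_N} from the proof of Theorem 7G can be observed numerically. The sketch below is illustrative only: the choice a_n = 1/n² and the helper name partial_sum_and_product are assumptions of this example, and floating-point arithmetic is used, so the printed values are approximate.

```python
import math

def partial_sum_and_product(a, N):
    """Return (s_N, p_N) for the terms a(1), ..., a(N)."""
    s, p = 0.0, 1.0
    for n in range(1, N + 1):
        s += a(n)
        p *= 1.0 + a(n)
    return s, p

a = lambda n: 1.0 / n ** 2       # a convergent series of non-negative terms

for N in (1, 10, 1000):
    s, p = partial_sum_and_product(a, N)
    print(N, s, p, math.exp(s))  # s <= p <= exp(s) in each row
```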


If an ≤ 0 for every n ∈ N, then we write an = −bn and consider the infinite product

\[ \prod_{n=1}^{\infty} (1 - b_n). \tag{2} \]

THEOREM 7H. Suppose that 0 ≤ bn < 1 for every n ∈ N. Then the infinite product (2) is convergent if and only if the series

\[ \sum_{n=1}^{\infty} b_n \tag{3} \]

is convergent.

This follows immediately from the following two results.

THEOREM 7J. Suppose that 0 ≤ bn < 1 for every n ∈ N. Suppose further that the series (3) is convergent. Then the infinite product (2) is convergent.

THEOREM 7K. Suppose that 0 ≤ bn < 1 for every n ∈ N. Suppose further that the series (3) is divergent. Then the infinite product (2) diverges to zero.

Proof of Theorem 7J. Since the series (3) is convergent, there exists M ∈ N such that

\[ \sum_{n=M+1}^{\infty} b_n < \frac{1}{2}. \]

Hence for every N > M, we have

\[ (1 - b_{M+1})(1 - b_{M+2}) \ldots (1 - b_N) \ge 1 - b_{M+1} - b_{M+2} - \ldots - b_N > \frac{1}{2}. \]

It follows that the sequence pN is a decreasing sequence bounded below by ½ pM > 0, so that pN converges to a non-zero limit as N → ∞.

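Proof of Theorem 7K. Note that 1 − b ≤ e^{−b} whenever 0 ≤ b < 1. It follows that for every N ∈ N, we have

\[ 0 \le (1 - b_1) \ldots (1 - b_N) \le e^{-b_1 - \ldots - b_N}. \]

Since the series (3) is divergent and has non-negative terms, b1 + . . . + bN → ∞, and so $e^{-b_1 - \ldots - b_N} \to 0$ as N → ∞. The result follows from the Squeezing principle.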
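Theorem 7K can be illustrated with a product whose partial products are known in closed form. The sketch below assumes the example b_n = 1/(n + 1), which is our own choice and not taken from the text: the series of b_n is divergent (it is the harmonic series without its first term), and the partial products telescope to 1/(N + 1). The helper names are hypothetical.

```python
from fractions import Fraction
import math

def partial_product(N):
    """Exact value of (1 - b_1)...(1 - b_N) with b_n = 1/(n + 1)."""
    p = Fraction(1)
    for n in range(1, N + 1):
        p *= 1 - Fraction(1, n + 1)
    return p

def partial_sum(N):
    """Exact value of b_1 + ... + b_N with b_n = 1/(n + 1)."""
    return sum(Fraction(1, n + 1) for n in range(1, N + 1))

for N in (10, 100, 1000):
    p, s = partial_product(N), partial_sum(N)
    # p equals 1/(N + 1) and is bounded above by exp(-s), as in the proof.
    print(N, p, float(p), math.exp(-float(s)))
```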

We now investigate the general case, where zn ∈ C \ {−1} for every n ∈ N.

Definition. The infinite product (1) is said to be absolutely convergent if the infinite product

\[ \prod_{n=1}^{\infty} (1 + |z_n|) \]

is convergent.

The following result is an obvious consequence of Theorem 7G.




THEOREM 7L. The infinite product (1) is absolutely convergent if and only if the series

\[ \sum_{n=1}^{\infty} z_n \tag{4} \]

is absolutely convergent.

On the other hand, as in series, we have the following result.

THEOREM 7M. Suppose that the infinite product (1) is absolutely convergent. Then it is also convergent.

Proof. For every N ∈ N, let

\[ p_N = \prod_{n=1}^{N} (1 + z_n) \quad\text{and}\quad P_N = \prod_{n=1}^{N} (1 + |z_n|). \]

If N ≥ 2, then

\[ p_N - p_{N-1} = (1 + z_1) \ldots (1 + z_{N-1}) z_N \quad\text{and}\quad P_N - P_{N-1} = (1 + |z_1|) \ldots (1 + |z_{N-1}|) |z_N|, \]

so that

\[ |p_N - p_{N-1}| \le P_N - P_{N-1}. \tag{5} \]

If we write p0 = P0 = 0, then (5) holds also for N = 1. Furthermore, for every N ∈ N, we have

\[ p_N = \sum_{n=1}^{N} (p_n - p_{n-1}) \quad\text{and}\quad P_N = \sum_{n=1}^{N} (P_n - P_{n-1}). \]

Since PN converges as N → ∞, it follows from the Comparison test that pN converges as N → ∞. It remains to show that pN does not converge to 0 as N → ∞. Note from Theorem 7L that the series (4) is absolutely convergent, so that zn → 0 as n → ∞, and so 1 + zn → 1 as n → ∞. Hence the series

\[ \sum_{n=1}^{\infty} \left| \frac{z_n}{1 + z_n} \right| \]

is convergent, and so it follows from Theorem 7L that the infinite product

\[ \prod_{n=1}^{\infty} \left( 1 + \left( -\frac{z_n}{1 + z_n} \right) \right) \tag{6} \]

is absolutely convergent. Repeating the first part of our argument on the infinite product (6), we conclude that the sequence

\[ \prod_{n=1}^{N} \left( 1 - \frac{z_n}{1 + z_n} \right) \]

is convergent as N → ∞. Note now that this product is precisely 1/pN, so that pN cannot converge to 0 as N → ∞.


7.4. Double Integrals

The purpose of this last section is to give a sketch of the proof of the following result concerning double integrals.

THEOREM 7N. Suppose that a function f (x, y) is continuous in a closed rectangle [A, B] × [C, D], where A, B, C, D ∈ R satisfy A < B and C < D. Then the double integrals

\[ \int_A^B dx \int_C^D f(x, y)\,dy \quad\text{and}\quad \int_C^D dy \int_A^B f(x, y)\,dx \]

exist in the sense of Riemann, and are equal to each other.

Sketch of Proof. The idea is to first show that f (x, y) is uniformly continuous in the rectangle [A, B] × [C, D], in the spirit of Theorem 6L. Using the uniform continuity, one can then show that the function

\[ \phi(y) = \int_A^B f(x, y)\,dx \]

is continuous in the closed interval [C, D]. It follows from Theorem 6K that the integral

\[ \int_C^D dy \int_A^B f(x, y)\,dx \]

exists. Similarly the other integral

\[ \int_A^B dx \int_C^D f(x, y)\,dy \]

exists. To show that the two integrals are equal, we make use of the uniform continuity again. Given any ε > 0, there exist dissections A = x0 < x1 < . . . < xk = B and C = y0 < y1 < . . . < yn = D of the intervals [A, B] and [C, D] respectively such that

\[ M_{ij} - m_{ij} < \frac{\epsilon}{(B - A)(D - C)} \quad\text{for every } i = 1, \ldots, k \text{ and } j = 1, \ldots, n, \]

where

\[ M_{ij} = \sup_{\substack{x_{i-1} \le x \le x_i \\ y_{j-1} \le y \le y_j}} f(x, y) \quad\text{and}\quad m_{ij} = \inf_{\substack{x_{i-1} \le x \le x_i \\ y_{j-1} \le y \le y_j}} f(x, y). \]

For every i = 1, . . . , k and j = 1, . . . , n, we have

\[ m_{ij}(x_i - x_{i-1}) \le \int_{x_{i-1}}^{x_i} f(x, y)\,dx \le M_{ij}(x_i - x_{i-1}) \quad\text{for every } y \in [y_{j-1}, y_j], \]

so that

\[ m_{ij}(x_i - x_{i-1})(y_j - y_{j-1}) \le \int_{y_{j-1}}^{y_j} dy \int_{x_{i-1}}^{x_i} f(x, y)\,dx \le M_{ij}(x_i - x_{i-1})(y_j - y_{j-1}). \]

Summing over all i and j, we obtain

\[ \sum_{i=1}^{k} \sum_{j=1}^{n} m_{ij}(x_i - x_{i-1})(y_j - y_{j-1}) \le \int_C^D dy \int_A^B f(x, y)\,dx \le \sum_{i=1}^{k} \sum_{j=1}^{n} M_{ij}(x_i - x_{i-1})(y_j - y_{j-1}). \]

A similar argument gives

\[ \sum_{i=1}^{k} \sum_{j=1}^{n} m_{ij}(x_i - x_{i-1})(y_j - y_{j-1}) \le \int_A^B dx \int_C^D f(x, y)\,dy \le \sum_{i=1}^{k} \sum_{j=1}^{n} M_{ij}(x_i - x_{i-1})(y_j - y_{j-1}). \]

Hence

\[ \left| \int_C^D dy \int_A^B f(x, y)\,dx - \int_A^B dx \int_C^D f(x, y)\,dy \right| \le \sum_{i=1}^{k} \sum_{j=1}^{n} (M_{ij} - m_{ij})(x_i - x_{i-1})(y_j - y_{j-1}) < \epsilon. \]

The result now follows since ε > 0 is arbitrary and the left hand side is independent of ε.
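The equality of the two iterated integrals can be observed numerically on a simple continuous function. The sketch below is an illustration under assumptions of our own: it replaces the upper and lower sums of the proof by midpoint Riemann sums, and uses f(x, y) = xy² on [0, 1] × [0, 2], whose iterated integrals both equal 4/3. The helper names are hypothetical.

```python
def midpoints(a, b, n):
    """Step length and midpoints of n equal subintervals of [a, b]."""
    h = (b - a) / n
    return h, [a + (i + 0.5) * h for i in range(n)]

def iterated_dx_first(f, A, B, C, D, n=200):
    """Approximate the integral over y of the integral over x of f."""
    hx, xs = midpoints(A, B, n)
    hy, ys = midpoints(C, D, n)
    return sum(hy * sum(hx * f(x, y) for x in xs) for y in ys)

def iterated_dy_first(f, A, B, C, D, n=200):
    """Approximate the integral over x of the integral over y of f."""
    hx, xs = midpoints(A, B, n)
    hy, ys = midpoints(C, D, n)
    return sum(hx * sum(hy * f(x, y) for y in ys) for x in xs)

f = lambda x, y: x * y ** 2
first = iterated_dx_first(f, 0.0, 1.0, 0.0, 2.0)
second = iterated_dy_first(f, 0.0, 1.0, 0.0, 2.0)
print(first, second)    # both are close to 4/3
```

For finite sums the two orders of summation always agree up to rounding; the content of Theorem 7N is that this agreement survives the passage to the limit.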

It turns out that the conclusion of Theorem 7N may still hold even if the function f (x, y) is not continuous everywhere in the rectangle [A, B] × [C, D]. We state without proof the following result.

THEOREM 7P. Suppose that a function f (x, y) is continuous in a closed rectangle [A, B] × [C, D], where A, B, C, D ∈ R satisfy A < B and C < D, except possibly at points along a curve of type defined by one of the following:
(a) x = α for some α ∈ [A, B].
(b) y = γ for some γ ∈ [C, D].
(c) x = ψ(y) for y ∈ [γ, δ], where C ≤ γ ≤ δ ≤ D and ψ(y) is strictly monotonic and continuous.
Then the conclusion of Theorem 7N holds.





1. Suppose that xn and yn are bounded real sequences.

a) Show that

\[ \varliminf_{n\to\infty} x_n + \varliminf_{n\to\infty} y_n \le \varliminf_{n\to\infty} (x_n + y_n) \le \varliminf_{n\to\infty} x_n + \varlimsup_{n\to\infty} y_n \le \varlimsup_{n\to\infty} (x_n + y_n) \le \varlimsup_{n\to\infty} x_n + \varlimsup_{n\to\infty} y_n. \]

b) Find sequences xn and yn where equality holds nowhere in part (a).

c) Suppose further that xn ≥ 0 and yn ≥ 0 for every n ∈ N. Establish a chain of inequalities as in part (a) but with products in place of sums.

d) Find sequences xn and yn where equality holds nowhere in part (c).

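2. For each of the following double sequences zmn, find the double limit $\lim_{m,n\to\infty} z_{mn}$ and the repeated limits $\lim_{m\to\infty} \lim_{n\to\infty} z_{mn}$ and $\lim_{n\to\infty} \lim_{m\to\infty} z_{mn}$, if they exist:

a) $z_{mn} = \dfrac{m - n}{m + n}$

b) $z_{mn} = \dfrac{m + n}{m^2}$

c) $z_{mn} = \dfrac{m + n}{m^2 + n^2}$

d) $z_{mn} = (-1)^{m+n} \left( \dfrac{1}{m} + \dfrac{1}{n} \right)$

e) $z_{mn} = \dfrac{mn}{m^2 + n^2}$

f) $z_{mn} = (-1)^{m+n} \dfrac{1}{n} \left( 1 + \dfrac{1}{m} \right)$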

3. Does there exist a double sequence zmn such that zmn converges as m, n → ∞ but also that zmn is not bounded? Justify your assertion.

4. Suppose that xmn is a bounded double sequence of real numbers satisfying the following conditions:
a) For every fixed m ∈ N, the sequence xmn is increasing in n.
b) For every fixed n ∈ N, the sequence xmn is increasing in m.
Prove that xmn converges as m, n → ∞.

5. Use Problem 4 to prove the Comparison test for double series: Suppose that 0 ≤ umn ≤ vmn for every m, n ∈ N. Suppose further that the double series

\[ \sum_{m,n=1}^{\infty} v_{mn} \]

is convergent. Then the double series

\[ \sum_{m,n=1}^{\infty} u_{mn} \]

is convergent.

6. Using ideas from the proof of the Alternating series test, prove that the infinite product

\[ \prod_{n=1}^{\infty} \left( 1 + \frac{(-1)^{n-1}}{n} \right) \]

is convergent.

7. Prove Theorem 7N.








Chapter 8

UNIFORM CONVERGENCE

8.1. Introduction

We begin by making a somewhat familiar definition.

Definition. Suppose that fn : X → C is a sequence of functions on a set X ⊆ R. We say that the sequence fn converges pointwise to the function f : X → C if for every x ∈ X, we have

|fn(x) − f (x)| → 0 as n → ∞.

Example 8.1.1. Let X = [0, 1]. For every n ∈ N and every x ∈ [0, 1], let fn(x) = x^n. Then for every x ∈ [0, 1], fn(x) → f (x) as n → ∞, where f (x) = 0 if 0 ≤ x < 1 and f (1) = 1. Note that each of the functions fn(x) is continuous on [0, 1], but the limit function f (x) is not continuous on [0, 1]. Hence the continuity property of the functions fn(x) is not carried over to the limit function f (x).

To carry over certain properties of the individual functions of a sequence to the limit function, we need a type of convergence which is stronger than pointwise convergence.

Definition. Suppose that fn : X → C is a sequence of functions on a set X ⊆ R. We say that the sequence fn converges uniformly to the function f : X → C if

\[ \sup_{x \in X} |f_n(x) - f(x)| \to 0 \quad\text{as } n \to \infty. \]

Example 8.1.2. In Example 8.1.1, we have fn(x) → f (x) pointwise in [0, 1]. However, if 0 ≤ x < 1, then |fn(x) − f (x)| = x^n and so

\[ \sup_{x \in [0,1]} |f_n(x) - f(x)| \ge \sup_{x \in [0,1)} |f_n(x) - f(x)| = \sup_{x \in [0,1)} x^n = 1 \]

for every n ∈ N. It follows that fn(x) → f (x) as n → ∞, pointwise but not uniformly on [0, 1].
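The contrast in Example 8.1.2 between convergence at a fixed point and the behaviour of the supremum can be seen numerically. The sketch below is only an approximation: the supremum over [0, 1) is estimated on a finite grid of points 1 − 2^{−j} accumulating at 1, and the names sup_estimate and depth, as well as the choice of grid, are assumptions of this example.

```python
def f_n(n, x):
    """The function f_n(x) = x^n of Example 8.1.1."""
    return x ** n

def f(x):
    """The pointwise limit: 0 on [0, 1) and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def sup_estimate(n, depth=40):
    """Estimate sup |f_n - f| over [0, 1) on the grid 1 - 2^(-j)."""
    grid = [1.0 - 2.0 ** (-j) for j in range(depth + 1)]
    return max(abs(f_n(n, x) - f(x)) for x in grid)

for n in (1, 10, 100, 1000):
    # At the fixed point x = 0.5 the error tends to 0, while the
    # estimated supremum stays close to 1.
    print(n, abs(f_n(n, 0.5) - f(0.5)), sup_estimate(n))
```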


Remark. Pointwise convergence means that given any ε > 0, for every x ∈ X, there exists N = N(ε, x) such that

|fn(x) − f (x)| < ε whenever n > N(ε, x).

Uniform convergence means that given any ε > 0, there exists N = N(ε), independent of x ∈ X, such that

|fn(x) − f (x)| < ε whenever n > N(ε) and x ∈ X.

8.2. Criteria for Uniform Convergence

We shall first of all extend the General principle of convergence to the case of uniform convergence.

THEOREM 8A. (GENERAL PRINCIPLE OF UNIFORM CONVERGENCE) Suppose that fn is a sequence of real or complex valued functions defined on a set X ⊆ R. Then fn(x) converges uniformly on X as n → ∞ if and only if, given any ε > 0, there exists N such that

\[ \sup_{x \in X} |f_m(x) - f_n(x)| < \epsilon \quad\text{whenever } m > n \ge N. \]

Proof. (⇒) Suppose that fn(x) → f (x) uniformly on X as n → ∞. Then given any ε > 0, there exists N such that

\[ \sup_{x \in X} |f_n(x) - f(x)| < \tfrac{1}{2}\epsilon \quad\text{whenever } n \ge N. \]

It follows that

|fm(x) − fn(x)| ≤ |fm(x) − f (x)| + |fn(x) − f (x)| < ε whenever m > n ≥ N and x ∈ X,

and so

\[ \sup_{x \in X} |f_m(x) - f_n(x)| \le \epsilon \quad\text{whenever } m > n \ge N. \]

(⇐) Since R and C are complete, for every x ∈ X, the sequence fn(x) converges pointwise to a limit f (x), say, as n → ∞. We shall show that fn(x) → f (x) uniformly on X as n → ∞. Given any ε > 0, there exists N such that for every x ∈ X,

|fm(x) − fn(x)| < ε whenever m > n ≥ N.

Hence for every x ∈ X,

\[ |f(x) - f_n(x)| = \lim_{m\to\infty} |f_m(x) - f_n(x)| \le \epsilon \quad\text{whenever } n \ge N, \]

so that

\[ \sup_{x \in X} |f_n(x) - f(x)| \le \epsilon \quad\text{whenever } n \ge N. \]

Hence fn(x) → f (x) uniformly on X as n → ∞.

We next turn our attention to series of real or complex valued functions.

Definition. Suppose that un is a sequence of real or complex valued functions defined on a set X ⊆ R. We say that the series

\[ \sum_{n=1}^{\infty} u_n(x) \]

converges uniformly on X if the sequence of partial sums

\[ s_N(x) = \sum_{n=1}^{N} u_n(x) \]

converges uniformly on X.

We have the analogue of the Comparison test.

THEOREM 8B. (WEIERSTRASS’S M-TEST) Suppose that un is a sequence of real or complex valued functions defined on a set X ⊆ R. Suppose further that for every n ∈ N, there exists a real constant M nsuch that the series

∞n=1

M n

is convergent, and that |un(x)| ≤ M n for every x ∈ X . Then the series

∞

n=1

un(x)

converges uniformly and absolutely on X .

Proof. Given any $\epsilon > 0$, it follows from the General principle of convergence for series that there exists $N$ such that

$$M_{n+1} + \ldots + M_m < \epsilon \quad \text{whenever } m > n \ge N.$$

It follows that

$$|s_m(x) - s_n(x)| = |u_{n+1}(x) + \ldots + u_m(x)| \le M_{n+1} + \ldots + M_m < \epsilon \quad \text{whenever } m > n \ge N \text{ and } x \in X,$$

so that

$$\sup_{x \in X} |s_m(x) - s_n(x)| \le \epsilon \quad \text{whenever } m > n \ge N.$$

It now follows from Theorem 8A that the series

$$\sum_{n=1}^{\infty} u_n(x)$$

converges uniformly on $X$. Note finally that absolute convergence follows pointwise from the proof of the Comparison test.
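The estimate in this proof can be checked numerically. The sketch below uses the assumed example $u_n(x) = \sin(nx)/n^2$ with $M_n = 1/n^2$; the function names, the sample interval $[-10, 10]$ and the indices $N = 50$, $M = 200$ are illustrative choices only.

```python
import math


def partial_sum(x, n_terms):
    """Partial sum s_N(x) of the series sum_{n>=1} sin(n x) / n**2."""
    return sum(math.sin(n * x) / n**2 for n in range(1, n_terms + 1))


def tail_bound(n_from, n_to):
    """M_{n_from+1} + ... + M_{n_to} with M_n = 1 / n**2."""
    return sum(1.0 / n**2 for n in range(n_from + 1, n_to + 1))


N, M = 50, 200
grid = [-10.0 + 0.01 * i for i in range(2001)]  # sample points in [-10, 10]
worst = max(abs(partial_sum(x, M) - partial_sum(x, N)) for x in grid)
bound = tail_bound(N, M)
print(worst, bound)
```

The printed difference of partial sums should not exceed the printed bound, and the bound does not depend on $x$, which is the point of the M-test.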

The General principle of uniform convergence can also be used to establish the following two results.

THEOREM 8C. (DIRICHLET'S TEST) Suppose that $a_n$ and $b_n$ are two sequences of real valued functions defined on a set $X \subseteq \mathbb{R}$, and satisfy the following conditions:

(a) There exists $K \in \mathbb{R}$ such that $|s_n(x)| \le K$ for every $n \in \mathbb{N}$ and every $x \in X$, where $s_n(x)$ denotes the sequence of partial sums $s_n(x) = a_1(x) + \ldots + a_n(x)$.

(b) For every $x \in X$, the sequence $b_n(x)$ is monotonic.

(c) The sequence $b_n(x) \to 0$ uniformly on $X$ as $n \to \infty$.

Then the series

$$\sum_{n=1}^{\infty} a_n(x) b_n(x)$$

converges uniformly on $X$.


Proof. Since $b_n(x) \to 0$ uniformly on $X$ as $n \to \infty$, given any $\epsilon > 0$, there exists $N_0$ such that

$$|b_n(x)| < \frac{\epsilon}{4K} \quad \text{whenever } n > N_0 \text{ and } x \in X.$$

It follows that whenever $M > N \ge N_0$, we have

$$\begin{aligned}
\left| \sum_{n=N+1}^{M} a_n(x) b_n(x) \right|
&= |(s_{N+1}(x) - s_N(x)) b_{N+1}(x) + \ldots + (s_M(x) - s_{M-1}(x)) b_M(x)| \\
&= |-s_N(x) b_{N+1}(x) + s_{N+1}(x)(b_{N+1}(x) - b_{N+2}(x)) + \ldots + s_{M-1}(x)(b_{M-1}(x) - b_M(x)) + s_M(x) b_M(x)| \\
&\le K \left( |b_{N+1}(x)| + |b_{N+1}(x) - b_{N+2}(x)| + \ldots + |b_{M-1}(x) - b_M(x)| + |b_M(x)| \right) \\
&= K \left( |b_{N+1}(x)| + |b_{N+1}(x) - b_M(x)| + |b_M(x)| \right) \le 2K \left( |b_{N+1}(x)| + |b_M(x)| \right) < \epsilon.
\end{aligned}$$

The result follows from the General principle of uniform convergence.
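The rearrangement in the second line of the display above (partial summation) is easy to get wrong by one index, so a quick numerical check may be useful. The Python sketch below compares both sides for an arbitrary random sequence $a_n$ and the assumed monotonic sequence $b_n = 1/n$; the helper names `direct_sum` and `by_parts_sum` and the values of $N$ and $M$ are hypothetical choices for illustration.

```python
import random


def direct_sum(a, b, N, M):
    """Sum of a_n * b_n for n = N+1, ..., M (sequences indexed from 1)."""
    return sum(a[n] * b[n] for n in range(N + 1, M + 1))


def by_parts_sum(a, b, N, M):
    """Right-hand side of the partial summation identity used in the proof."""
    # s[n] = a_1 + ... + a_n, with s[0] = 0.
    s = [0.0] * (M + 1)
    for n in range(1, M + 1):
        s[n] = s[n - 1] + a[n]
    # Terms s_k (b_k - b_{k+1}) for k = N+1, ..., M-1.
    middle = sum(s[k] * (b[k] - b[k + 1]) for k in range(N + 1, M))
    return -s[N] * b[N + 1] + middle + s[M] * b[M]


random.seed(0)
M, N = 30, 7
# Index 0 is unused padding so that a[n], b[n] match the text's a_n, b_n.
a = [0.0] + [random.uniform(-1.0, 1.0) for _ in range(M)]
b = [0.0] + [1.0 / n for n in range(1, M + 1)]
print(direct_sum(a, b, N, M), by_parts_sum(a, b, N, M))
```

The two printed values should agree up to rounding error, since the identity is purely algebraic and does not use conditions (a) to (c).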

THEOREM 8D. (ABEL'S TEST) Suppose that $a_n$ and $b_n$ are two sequences of real valued functions defined on a set $X \subseteq \mathbb{R}$, and satisfy the following conditions:

(a) The series $\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $X$.

(b) For every $x \in X$, the sequence $b_n(x)$ is monotonic.

(c) There exists $K \in \mathbb{R}$ such that $|b_n(x)| \le K$ for every $n \in \mathbb{N}$ and every $x \in X$.

Then the series

$$\sum_{n=1}^{\infty} a_n(x) b_n(x)$$

converges uniformly on $X$.


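Proof. In view of condition (a) and Theorem 8A, given any $\epsilon > 0$, there exists $N_0$ such that

$$\left| \sum_{n=N+1}^{m} a_n(x) \right| < \frac{\epsilon}{3K} \quad \text{whenever } m > N \ge N_0 \text{ and } x \in X.$$

In other words, writing $s_n(x) = a_1(x) + \ldots + a_n(x)$, we have

$$|s_m(x) - s_N(x)| < \frac{\epsilon}{3K} \quad \text{whenever } m > N \ge N_0 \text{ and } x \in X.$$

It follows that whenever $M > N \ge N_0$, we have

$$\begin{aligned}
\left| \sum_{m=N+1}^{M} a_m(x) b_m(x) \right|
&= \left| \sum_{m=N+1}^{M} (s_m(x) - s_{m-1}(x)) b_m(x) \right|
 = \left| \sum_{m=N+1}^{M} ((s_m(x) - s_N(x)) - (s_{m-1}(x) - s_N(x))) b_m(x) \right| \\
&= \left| \sum_{m=N+1}^{M} (s_m(x) - s_N(x)) b_m(x) - \sum_{m=N+1}^{M-1} (s_m(x) - s_N(x)) b_{m+1}(x) \right| \\
&\le \sum_{m=N+1}^{M-1} |s_m(x) - s_N(x)| |b_m(x) - b_{m+1}(x)| + |s_M(x) - s_N(x)| |b_M(x)| \\
&\le \frac{\epsilon}{3K} \sum_{m=N+1}^{M-1} |b_m(x) - b_{m+1}(x)| + \frac{\epsilon}{3K} |b_M(x)| \\
&= \frac{\epsilon}{3K} \left| \sum_{m=N+1}^{M-1} (b_m(x) - b_{m+1}(x)) \right| + \frac{\epsilon}{3K} |b_M(x)| \\
&= \frac{\epsilon}{3K} |b_{N+1}(x) - b_M(x)| + \frac{\epsilon}{3K} |b_M(x)| \\
&\le \frac{\epsilon}{3K} \left( |b_{N+1}(x)| + 2|b_M(x)| \right) \le \epsilon.
\end{aligned}$$

Here the passage from the sum of absolute values to the absolute value of the sum uses the monotonicity of $b_n(x)$, and the last step uses condition (c).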

The result follows from the General principle of uniform convergence.

8.3. Consequences of Uniform Convergence

In this section, we discuss the implications of uniform convergence on continuity, integrability and differentiability. To answer the question first raised in Section 8.1, we have the following result.

THEOREM 8E. Suppose that a sequence of functions $f_n : X \to \mathbb{C}$ converges uniformly on a set $X \subseteq \mathbb{R}$ to a function $f : X \to \mathbb{C}$ as $n \to \infty$. Suppose further that $c \in X$ and that the function $f_n$ is continuous at $c$ for every $n \in \mathbb{N}$. Then the function $f$ is continuous at $c$.

Remark. The conclusion of Theorem 8E can be written in the form

$$\lim_{x \to c} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to c} f_n(x).$$

Theorem 8E then says that if the sequence of functions converges uniformly on $X$, then the order of the two limiting processes can be interchanged.

Proof of Theorem 8E. Given any $\epsilon > 0$, there exists $n \in \mathbb{N}$ such that

$$\sup_{x \in X} |f_n(x) - f(x)| < \frac{\epsilon}{3}.$$

Since $f_n$ is continuous at $c$, there exists $\delta > 0$ such that

$$|f_n(x) - f_n(c)| < \frac{\epsilon}{3} \quad \text{whenever } x \in X \text{ and } |x - c| < \delta.$$

It follows that whenever $x \in X$ and $|x - c| < \delta$, we have

$$|f(x) - f(c)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(c)| + |f_n(c) - f(c)| < \epsilon.$$

Hence $f$ is continuous at $c$.

We immediately have the following corollary of Theorem 8E.

THEOREM 8F. Suppose that $u_n$ is a sequence of real or complex valued functions defined on a set $X \subseteq \mathbb{R}$, and that the series

$$\sum_{n=1}^{\infty} u_n(x)$$

converges uniformly to a function $s(x)$ on $X$. Suppose further that $c \in X$ and that the function $u_n$ is continuous at $c$ for every $n \in \mathbb{N}$. Then the function $s$ is continuous at $c$.

We next study the effect of uniform convergence on integrability.

THEOREM 8G. Suppose that $f_n$ is a sequence of real valued functions integrable over a closed interval $[A, B]$. Suppose further that $f_n \to f$ uniformly on $[A, B]$ as $n \to \infty$. Then the function $f$ is integrable over $[A, B]$, and

$$\int_A^B f(x) \, dx = \lim_{n \to \infty} \int_A^B f_n(x) \, dx. \tag{1}$$

Remark. The conclusion of Theorem 8G can be written in the form

$$\int_A^B \left( \lim_{n \to \infty} f_n(x) \right) dx = \lim_{n \to \infty} \int_A^B f_n(x) \, dx.$$

Theorem 8G then says that if the sequence of functions converges uniformly on $[A, B]$, then the order of integration and taking limits as $n \to \infty$ can be interchanged.

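Proof of Theorem 8G. Given any $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that

$$\sup_{x \in [A,B]} |f_n(x) - f(x)| < \frac{\epsilon}{3(B - A)} \quad \text{whenever } n \ge N. \tag{2}$$

It follows in particular that

$$f_N(x) - \frac{\epsilon}{3(B - A)} < f(x) < f_N(x) + \frac{\epsilon}{3(B - A)} \quad \text{whenever } x \in [A, B].$$

Hence for any dissection $\Delta$ of $[A, B]$, we have

$$s(f_N, \Delta) - \frac{\epsilon}{3} \le s(f, \Delta) \le S(f, \Delta) \le S(f_N, \Delta) + \frac{\epsilon}{3},$$

so that

$$S(f, \Delta) - s(f, \Delta) \le S(f_N, \Delta) - s(f_N, \Delta) + \frac{2\epsilon}{3}.$$

Since $f_N$ is integrable over $[A, B]$, there exists a dissection $\Delta$ of $[A, B]$ such that

$$S(f_N, \Delta) - s(f_N, \Delta) < \frac{\epsilon}{3}, \quad \text{so that} \quad S(f, \Delta) - s(f, \Delta) < \epsilon.$$

Hence $f$ is integrable over $[A, B]$. On the other hand, it follows from (2) that

$$\left| \int_A^B f_n(x) \, dx - \int_A^B f(x) \, dx \right| \le \int_A^B |f_n(x) - f(x)| \, dx < \epsilon \quad \text{whenever } n \ge N.$$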

The assertion (1) follows immediately.
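To see that (1) can fail when the convergence is only pointwise, consider the assumed example $f_n(x) = nx(1 - x^2)^n$ on $[0, 1]$, which is not taken from the text above. Here $f_n(x) \to 0$ for every $x \in [0, 1]$, while the substitution $u = 1 - x^2$ gives $\int_0^1 f_n(x) \, dx = n/(2(n+1)) \to 1/2$. The Python sketch below compares this closed form with a midpoint-rule approximation; the function names and step count are illustrative choices.

```python
def f(n, x):
    """Assumed example f_n(x) = n x (1 - x**2)**n on [0, 1]."""
    return n * x * (1.0 - x * x) ** n


def midpoint_integral(n, steps=20_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f(n, (i + 0.5) * h) for i in range(steps))


def exact_integral(n):
    """Closed form n / (2 (n + 1)), from the substitution u = 1 - x**2."""
    return n / (2.0 * (n + 1))


# The last column shows f_n at the fixed point x = 0.3 shrinking to 0,
# while the integrals approach 1/2 rather than 0.
for n in (5, 50, 500):
    print(n, midpoint_integral(n), exact_integral(n), f(n, 0.3))
```

Since the integrals do not tend to the integral of the limit function, Theorem 8G shows that the convergence of this particular sequence on $[0, 1]$ cannot be uniform.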

We immediately have the following corollary of Theorem 8G.

THEOREM 8H. Suppose that $u_n$ is a sequence of real valued functions defined on a closed interval $[A, B]$, and that the series

$$\sum_{n=1}^{\infty} u_n(x)$$

converges uniformly to a function $s(x)$ on $[A, B]$. Suppose further that the function $u_n$ is integrable over $[A, B]$ for every $n \in \mathbb{N}$. Then the function $s$ is integrable over $[A, B]$, and

$$\int_A^B s(x) \, dx = \sum_{n=1}^{\infty} \int_A^B u_n(x) \, dx.$$

Remark. The conclusion of Theorem 8H can be written in the form

$$\int_A^B \left( \sum_{n=1}^{\infty} u_n(x) \right) dx = \sum_{n=1}^{\infty} \int_A^B u_n(x) \, dx.$$

Theorem 8H then says that if the series of functions converges uniformly on $[A, B]$, then the order of integration and summation can be interchanged. In other words, the series can be integrated term by term.

We next study the effect of uniform convergence on differentiability.

THEOREM 8J. Suppose that $f_n$ is a sequence of real valued functions differentiable in a closed interval $[A, B]$; in other words, differentiable at every point in the open interval $(A, B)$, right differentiable at $A$ and left differentiable at $B$. Suppose further that the sequence $f_n(x_0)$ converges for some $x_0 \in [A, B]$, and that the sequence of derivatives $f_n'$ converges uniformly on $[A, B]$. Then the sequence $f_n$ converges uniformly on $[A, B]$, and the limit function $f$ is differentiable in $[A, B]$. Furthermore, for every $x \in [A, B]$, we have

$$f'(x) = \lim_{n \to \infty} f_n'(x).$$

Remark. The conclusion of Theorem 8J can be written in the form

$$\left( \lim_{n \to \infty} f_n(x) \right)' = \lim_{n \to \infty} f_n'(x).$$

Theorem 8J then says essentially that if the sequence of functions satisfies some mild convergence property and the sequence of derivatives converges uniformly on $[A, B]$, then the order of differentiation and taking limits as $n \to \infty$ can be interchanged.

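Proof of Theorem 8J. Suppose that $f_n' \to g$ as $n \to \infty$. Since the convergence is uniform in $[A, B]$, given any $\epsilon > 0$, there exists $N$ such that

$$\sup_{x \in [A,B]} |f_n'(x) - g(x)| < \frac{\epsilon}{4(1 + (B - A))} \quad \text{whenever } n \ge N, \tag{3}$$

so that

$$\sup_{x \in [A,B]} |f_m'(x) - f_n'(x)| < \frac{\epsilon}{2(1 + (B - A))} \quad \text{whenever } m > n \ge N.$$

Suppose that $\eta_1, \eta_2 \in [A, B]$ and $m > n \ge N$. Applying the Mean value theorem to the function $f_m - f_n$, we have

$$|(f_m(\eta_1) - f_n(\eta_1)) - (f_m(\eta_2) - f_n(\eta_2))| = |\eta_1 - \eta_2| |f_m'(\xi) - f_n'(\xi)| \le \frac{\epsilon |\eta_1 - \eta_2|}{2(1 + (B - A))} < \frac{\epsilon}{2} \tag{4}$$

for some $\xi$ between $\eta_1$ and $\eta_2$. On the other hand, since $f_n(x_0)$ converges as $n \to \infty$, there exists $N'$ such that

$$|f_m(x_0) - f_n(x_0)| < \frac{\epsilon}{4} \quad \text{whenever } m > n \ge N'.$$

It follows from (4), with $\eta_1 = x$ and $\eta_2 = x_0$, that for every $x \in [A, B]$,

$$|f_m(x) - f_n(x)| < |f_m(x_0) - f_n(x_0)| + \frac{\epsilon}{2} < \frac{3\epsilon}{4} \quad \text{whenever } m > n \ge \max\{N, N'\},$$

and so it follows from the General principle of uniform convergence that $f_n(x)$ converges uniformly in $[A, B]$. Suppose that $f_n(x) \to f(x)$ as $n \to \infty$. Let $c \in [A, B]$ be fixed. For every $x \in [A, B]$ with $x \ne c$, it follows from (4), with $\eta_1 = x$ and $\eta_2 = c$, that

$$\left| \frac{f_m(x) - f_m(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \frac{\epsilon}{2(1 + (B - A))} \quad \text{whenever } m > n \ge N,$$

so that on taking $n = N$ and letting $m \to \infty$, we have

$$\left| \frac{f(x) - f(c)}{x - c} - \frac{f_N(x) - f_N(c)}{x - c} \right| < \frac{\epsilon}{2}. \tag{5}$$

Since $f_N$ is differentiable at $c$, there exists $\delta > 0$ such that

$$\left| \frac{f_N(x) - f_N(c)}{x - c} - f_N'(c) \right| < \frac{\epsilon}{4} \quad \text{whenever } 0 < |x - c| < \delta \text{ and } x \in [A, B]. \tag{6}$$

Combining (5), (6) and (3), we conclude that

$$\left| \frac{f(x) - f(c)}{x - c} - g(c) \right| \le \left| \frac{f(x) - f(c)}{x - c} - \frac{f_N(x) - f_N(c)}{x - c} \right| + \left| \frac{f_N(x) - f_N(c)}{x - c} - f_N'(c) \right| + |f_N'(c) - g(c)| < \epsilon$$

whenever $0 < |x - c| < \delta$ and $x \in [A, B]$. Hence

$$f'(c) = g(c) = \lim_{n \to \infty} f_n'(c).$$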
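The hypothesis on the derivatives in Theorem 8J cannot simply be replaced by uniform convergence of $f_n$ itself. As a hedged illustration, take the assumed example $f_n(x) = \sin(nx)/n$ on $[0, 2\pi]$, which is not from the text above: here $|f_n(x)| \le 1/n$, so $f_n \to 0$ uniformly, but $f_n'(x) = \cos(nx)$ does not converge at $x = \pi$. The Python sketch below tabulates both facts; the grid and the range of $n$ are arbitrary choices.

```python
import math


def f(n, x):
    """Assumed example f_n(x) = sin(n x) / n."""
    return math.sin(n * x) / n


def f_prime(n, x):
    """Derivative f_n'(x) = cos(n x)."""
    return math.cos(n * x)


grid = [2.0 * math.pi * i / 1000 for i in range(1001)]  # points in [0, 2 pi]

# Grid estimates of sup |f_n|; each is bounded by 1/n.
sup_estimates = {n: max(abs(f(n, x)) for x in grid) for n in (1, 10, 100)}

# Values of f_n'(pi) = cos(n pi), which alternate in sign.
derivatives_at_pi = [f_prime(n, math.pi) for n in range(1, 7)]

print(sup_estimates)
print(derivatives_at_pi)
```

In this example the limit function $f = 0$ is differentiable, yet $f'(\pi) = 0$ is not the limit of $f_n'(\pi)$, because that limit does not exist.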


We immediately have the following corollary of Theorem 8J.

THEOREM 8K. Suppose that $u_n$ is a sequence of real valued functions differentiable in a closed interval $[A, B]$. Suppose further that the series

$$\sum_{n=1}^{\infty} u_n(x_0)$$

converges for some $x_0 \in [A, B]$, and that the series of derivatives

$$\sum_{n=1}^{\infty} u_n'(x)$$

converges uniformly on $[A, B]$. Then the series

$$\sum_{n=1}^{\infty} u_n(x)$$

converges uniformly on $[A, B]$, and its sum $s(x)$ is differentiable in $[A, B]$. Furthermore, for every $x \in [A, B]$, we have

$$s'(x) = \sum_{n=1}^{\infty} u_n'(x).$$

Remark. The conclusion of Theorem 8K can be written in the form

$$\left( \sum_{n=1}^{\infty} u_n(x) \right)' = \sum_{n=1}^{\infty} u_n'(x).$$

Theorem 8K then says essentially that if the series of functions satisfies some mild convergence property and the series of derivatives converges uniformly on $[A, B]$, then the order of differentiation and summation can be interchanged.

8.4. Application to Power Series

Consider a power series in $z \in \mathbb{C}$, of the form

$$\sum_{n=0}^{\infty} a_n z^n, \tag{7}$$

where $a_n \in \mathbb{C}$ for every $n \in \mathbb{N} \cup \{0\}$. Recall Theorem 3Q, that if the power series (7) has radius of convergence $R$ and if $0 < r < R$, then the series

$$\sum_{n=0}^{\infty} |a_n| r^n$$

converges. Since $|a_n z^n| \le |a_n| r^n$ whenever $|z| \le r$, it follows from Weierstrass's M-test, with $M_n = |a_n| r^n$, that the power series (7) converges uniformly on the set $\{z \in \mathbb{C} : |z| \le r\}$. Suppose now that $|z_0| < R$. Then there exists $r$ such that $|z_0| < r < R$. It follows from Theorem 8F that the sum of the power series is continuous at $z_0$. We have therefore proved the following result.

THEOREM 8L. Suppose that the power series (7) has radius of convergence $R$. Then for every $r$ satisfying $0 < r < R$, the power series converges uniformly on the set $\{z \in \mathbb{C} : |z| \le r\}$. Furthermore, the sum of the power series is continuous on the set $\{z \in \mathbb{C} : |z| < R\}$.

We next consider real power series.

THEOREM 8M. Suppose that the real power series

$$\sum_{n=0}^{\infty} a_n x^n, \tag{8}$$

where $a_n \in \mathbb{R}$ for every $n \in \mathbb{N} \cup \{0\}$, converges in the interval $(-R, R)$ to a function $f(x)$. Then $f(x)$ is differentiable on $(-R, R)$, and

$$f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}.$$

On the other hand, if $|X| < R$, then

$$\int_0^X f(x) \, dx = \sum_{n=0}^{\infty} \frac{a_n}{n + 1} X^{n+1}.$$

Proof. It is not difficult to see that the power series

$$\sum_{n=1}^{\infty} n a_n x^{n-1} \tag{9}$$

converges in the interval $(-R, R)$; for instance, since $n^{1/n} \to 1$ as $n \to \infty$, the series (9) has the same radius of convergence as the series (8). It follows from Theorem 8L that the series (9) converges uniformly on any closed subinterval of $(-R, R)$. The first assertion follows from Theorem 8K. The second assertion follows from Theorem 8H on noting that the power series (8) converges uniformly on the closed interval with endpoints $0$ and $X$.
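As a concrete check of both formulas in Theorem 8M, take the assumed example $a_n = 1$ for every $n$, so that $R = 1$ and $f(x) = 1/(1 - x)$ on $(-1, 1)$. Term by term differentiation should then give $1/(1 - x)^2$, and term by term integration should give $-\log(1 - X)$. The Python sketch below compares partial sums with these closed forms at the point $0.5$; the function names and the number of terms are illustrative choices.

```python
import math


def derivative_series(x, terms=200):
    """Partial sum of sum_{n>=1} n x**(n-1), the termwise derivative."""
    return sum(n * x ** (n - 1) for n in range(1, terms + 1))


def integral_series(X, terms=200):
    """Partial sum of sum_{n>=0} X**(n+1) / (n+1), the termwise integral."""
    return sum(X ** (n + 1) / (n + 1) for n in range(terms))


x = 0.5
print(derivative_series(x), 1.0 / (1.0 - x) ** 2)
print(integral_series(x), -math.log(1.0 - x))
```

Each printed pair should agree closely, since the neglected tails are geometrically small at $x = 0.5$.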

We conclude this chapter by establishing the following useful result.

THEOREM 8N. (ABEL'S THEOREM) Suppose that the real series

$$\sum_{n=0}^{\infty} a_n$$

is convergent. Then

$$\sum_{n=0}^{\infty} a_n x^n \to \sum_{n=0}^{\infty} a_n \quad \text{as } x \to 1-.$$

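Proof. It follows from Abel's test, applied with the constant functions $a_n(x) = a_n$ and with $b_n(x) = x^n$, which is monotonic in $n$ and satisfies $|x^n| \le 1$ on $[0, 1]$, that the series

$$\sum_{n=0}^{\infty} a_n x^n$$

converges uniformly on $[0, 1]$. Let $s(x)$ be its sum. Then it follows from Theorem 8F that $s(x)$ is continuous on $[0, 1]$. In particular, we have $s(x) \to s(1)$ as $x \to 1-$.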
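A standard illustration of Abel's theorem, given here as an assumed example rather than one from the text, is $a_n = (-1)^n/(n + 1)$. For $0 < x < 1$ the power series sums to $\log(1 + x)/x$, and the theorem then gives $\sum_{n=0}^{\infty} (-1)^n/(n + 1) = \log 2$. The Python sketch below evaluates partial sums of the power series as $x$ approaches $1$ from below; the function name and the number of terms are illustrative choices.

```python
import math


def power_series(x, terms=2000):
    """Partial sum of sum_{n>=0} (-1)**n x**n / (n + 1)."""
    return sum((-x) ** n / (n + 1) for n in range(terms))


limit = math.log(2.0)  # value of sum_{n>=0} (-1)**n / (n + 1)
for x in (0.9, 0.99, 0.999):
    print(x, power_series(x), limit)
```

The printed partial sums should move towards $\log 2$ as $x$ increases towards $1$, although for $x$ very close to $1$ more terms are needed before the partial sum is a good approximation to the full series.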





1. For each of the following, prove that the sequence of functions converges pointwise on its domain of definition as $n \to \infty$, and determine whether the convergence is uniform on this set:

a) $f_n(x) = \dfrac{nx}{n + x}$ on $[0, \infty)$

b) $f_n(x) = \dfrac{nx}{1 + n^2 x^2}$ on $[0, \infty)$

c) $f_n(x) = x^n (1 - x)$ on $[0, 1]$

d) $f_n(x) = \dfrac{\sin nx}{nx}$ on $(0, 1)$

e) $f_n(x) = nx e^{-nx^2}$ on $[0, 1]$

2. Suppose that $f_n$ and $g_n$ are complex valued functions defined on a set $X \subseteq \mathbb{R}$. Suppose further that $f_n(x) \to f(x)$ and $g_n(x) \to g(x)$ as $n \to \infty$ uniformly on $X$.

a) Prove that $\alpha f_n(x) + \beta g_n(x) \to \alpha f(x) + \beta g(x)$ as $n \to \infty$ uniformly on $X$ for any $\alpha, \beta \in \mathbb{C}$.

b) Is it necessarily true that $f_n(x) g_n(x) \to f(x) g(x)$ as $n \to \infty$ uniformly on $X$? Justify your assertion.

3. a) Suppose that $f_n(x) \to f(x)$ as $n \to \infty$ uniformly on each of the sets $X_1, \ldots, X_k$ in $\mathbb{R}$. Prove that $f_n(x) \to f(x)$ as $n \to \infty$ uniformly on the union $X_1 \cup \ldots \cup X_k$.

b) Give an example to show that the analogue for an infinite collection of sets does not hold.

4. The series $\sum_{n=1}^{\infty} u_n(x)$ is uniformly convergent on a set $S \subseteq \mathbb{R}$.

a) Is the series necessarily absolutely convergent for every $x \in S$? Justify your assertion.

b) Is the series necessarily absolutely convergent for some $x \in S$? Justify your assertion.

5. Prove that the series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n(1 + x^{2n})}$ converges uniformly on $\mathbb{R}$.

6. Suppose that $\sum_{n=1}^{\infty} a_n$ is a convergent real series.

a) Prove that the series $\sum_{n=1}^{\infty} a_n x^n$ converges uniformly on $[0, 1]$.

b) Prove that the series $\sum_{n=1}^{\infty} a_n n^{-x}$ converges uniformly on $[0, \infty)$.

7. For every $n \in \mathbb{N}$, let $f_n(x) = n^{-1} e^{-x/n}$.

a) Show that $f_n(x)$ converges uniformly on $(0, \infty)$.
