Microchip Mathematics: Number Theory for Computer Users

Keith Devlin, Mathematics Department, University of Lancaster

SHIVA PUBLISHING LIMITED, 64 Welsh Row, Nantwich, Cheshire CW5 5ES, England

© Keith Devlin, 1984

ISBN 1 85014 047 2

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise, without the prior written permission of the Publishers.

This book is sold subject to the Standard Conditions of Sale of Net Books and may not be resold in the UK below the net price given by the Publishers in their current price list.

The front cover shows the author with a print-out of the largest known prime number, a number with 39751 digits. The print-out is 9 feet in length. It required over half an hour of main frame computer time to work out the digits in this number. (Photograph taken at The Computer Unit, Warwick University, courtesy of Dr Keith Halstead.)

Printed and bound in Great Britain by Billing and Sons Limited

Contents

PREFACE

0. BACKGROUND: PRIME NUMBERS
   1. Prime Numbers
   2. The Sieve of Eratosthenes
   3. The Distribution of Primes
   4. Largest Known Primes
   5. Conjectures About Primes
   Exercises 0
   Computer Problems 0

I. BASIC CONCEPTS
   1. Mathematical Induction
   2. Divisibility. The Euclidean Algorithm
   3. Efficiency of Algorithms. Multiprecision Arithmetic
   4. The Fibonacci Sequence and the Efficiency of the Euclidean Algorithm
   5. Prime Numbers
   6. Diophantine Equations
   Exercises I
   Computer Problems I

II. CONGRUENCES
   1. Congruence
   2. Modular Arithmetic
   3. Fermat's Little Theorem and the Euler Phi-Function
   4. Random Number Generators and Primitive Roots
   Exercises II
   Computer Problems II

III. PRIMALITY TESTING AND FACTORISATION
   1. Perfect Numbers and Mersenne Primes
   2. Public Key Cryptography
   3. Primality Testing
   4. Factorisation Techniques
   Exercises III
   Computer Problems III

RECOMMENDED FURTHER READING

INDEX OF NOTATION

INDEX

Preface

In the Autumn of 1983, in the face of the phenomenal growth of

home computer sales in the U.K., the national British newspaper

The Guardian decided to produce, each week, a 'Computer Page'.

No one was quite sure exactly what should go into the page on a

regular basis, but it was thought that a fortnightly column on

computer mathematics might be a good idea, and when the computer

page first appeared on 20th October of that year, it included a

small item on binary arithmetic by me.

From the mail I received after my column had been running

for a few months, it was clear that the microcomputer age had brought

with it a huge increase in the number of (potential)

'recreational mathematicians'. Though in many cases without any

formal training in mathematics, my correspondents displayed tremendous

mathematical ability, and I was frequently asked if I could recommend

any suitable books. What they seemed to want was a genuine

mathematics text book, but one which did not require a great deal

of prior knowledge. This is intended to be just such a book.

Number Theory is one of the few areas of modern mathematics

which is accessible to the non-expert. (At least, the kind of

Number Theory considered here: there is a lot of other material

which also goes under the title 'Number Theory', most of which

is pretty well inaccessible to the majority of trained

mathematicians!) It is also an area in which there is a genuine

two-way flow between man and the computer. Indeed, it was this

fascinating interplay of brain power and computer power that

awakened my own interest in the subject to a level where I began

to give a course on the subject at Lancaster University and,

coincidentally, write about it in The Guardian. (Previously my

mathematical research work had been in Set Theory, a subject dealing

almost exclusively with the mysterious world of the infinite.)

This is a book about (the computational aspects of) Number

Theory. Though written for university undergraduates in

mathematics, I have tried to present the material in such a way

that it can be followed by the keen but largely untrained 'amateur'

sitting at home with (or possibly even without) a cheap home

computer. I do not pretend to give a complete coverage of the

computational aspects of Number Theory. (For instance, no mention

is made of Quadratic Reciprocity, a tremendously important part

of the subject.) Rather my aim is to cover the (very) basic parts

of Number Theory and at the same time give some indication of the

way in which Number Theory both feeds off and leads to advances

in Computation Theory. Consequently, if the book were used as

a text to accompany a university lecture course, the lecturer would

presumably deal with additional topics not covered in this book.

In writing this book, I made extensive reference to, in

particular, two excellent books, to which this text could be regarded

as a precursor. David Burton's book Elementary Number Theory

gives a wonderfully readable coverage of (essentially the

non-computational aspects of) Number Theory, and covers many more topics

than I have space for here, whilst Donald Knuth's 'The Art of

Computer Programming, Volume 2' is the 'bible' for serious

computational number theorists.

The book is structured in a way that assumes a more or less

direct passage from start to finish, though an index is provided

to enable the book to be used as a reference text if necessary.

Each chapter (including an informal preparatory chapter) ends with

a selection of (mathematical) Exercises, grouped according to the

section they refer to, and some Computer Problems. The latter

are, for the most part, just initial 'pointers' as to what can be

tried out on a computer, and I would hope that these are enough

to spur the reader on to carrying out further computer investigations

of his or her own devising.

To assist readers who wish to skip proofs and concentrate on

the development of the main results, the symbol □ is used to

indicate the end of a proof. (Whenever this symbol occurs

immediately following the statement of a result, this indicates

that the proof is so obvious as to require no further comment.)

For easy reference, all results obtained are numbered consecutively,

the reference numbers consisting of the Chapter number, section

number, and result number.

Keith Devlin

Lancaster, August 1984

Pierre de Fermat: 'The Father of Number Theory'. Born in 1601

near Toulouse in France, Fermat was a jurist by profession, and

only took up mathematics as a hobby in his thirties. Through

correspondence with many of the leading scholars of the day, Fermat

developed most of the pivotal ideas of present day Number Theory.

Many of his ideas to simplify mental calculation are nowadays

employed to speed up computer algorithms. This painting is from

the collection of the Academie des Sciences, Inscriptions et

Belles Lettres de Toulouse; it is reproduced here with the kind

permission of Robert Gillis.

0 Background: Prime Numbers

Numbers constitute the one mathematical system familiar to all

mankind, at least if by 'number' you mean 'positive whole number'

as did the Ancient Greeks. Today the professional mathematician

uses the phrase 'natural number' to denote the positive whole

numbers 1, 2, 3, ... This is a book about these 'natural' numbers,

and we shall rarely have occasion to speak of other numbers such

as proper fractions like 1/2. The study of the natural

numbers is known as 'Number Theory', and in keeping with the

traditions of that subject we shall use the word 'number' to mean

'natural number' unless otherwise indicated. (This convention

is used in the very name 'Number Theory' of course.)

The natural numbers are so fundamental to the rest of

mathematics that the famous 19th Century mathematician Leopold

Kronecker once remarked that 'God created the natural numbers,

and all the rest is the work of man.' What he meant by this

was that, starting from the natural numbers it is possible to

construct, in a rigorous fashion, the entire edifice of modern

mathematics, which is true, and that the natural numbers themselves

cannot be constructed (in a mathematical sense) from any simpler

entities, which was true when Kronecker made his remark but is

no longer valid, Cantor's Set Theory having provided a way of

constructing the natural numbers using simple sets. But this

last point notwithstanding, Kronecker's remark is still pretty

indicative of the status of the natural numbers in mathematics.

As the natural numbers are fundamental to the rest of

mathematics, so are the prime numbers fundamental to the natural

numbers. Strictly speaking, we shall not be in a position to

make a proper study of the prime numbers until we have developed

our Number Theory sufficiently, but so basic are the prime numbers

that it will be helpful to present a few basic facts before we

do anything else. All of these facts will be proved rigorously

in due course. (This is not to say that anything we say is at

all likely to strike you as unlikely, quite the contrary. But

in mathematics it is prudent to leave nothing to chance, as history

has taught us time and time again.)

1. PRIME NUMBERS

A number (natural number) p is said to be prime if it is greater

than 1 and is divisible (without remainder) by only 1 and p.

A number greater than 1 which is not prime is said to be composite.

For example, 2,3,5,7,11,13,17,19 are all prime and 4,6,8,9,10,

12,14,15,16,18,20 are all composite. Obviously, with the

exception of 2, all primes are odd numbers, a fact which leads

to the old joke that 2, being even, is a very 'odd' prime.

In Book IX of his 'Elements', Euclid proved that there are

infinitely many prime numbers. (It is obvious that there are

infinitely many composite numbers, for instance every even number

greater than 2 is composite.)

The reason why the prime numbers play such a fundamental part

in Number Theory lies to a great extent in the following simple

fact, which we shall prove when we come to develop the theory of

prime numbers in a proper fashion: if p is a prime and p divides

a product ab of two numbers a and b, then p must divide (at least)

one of a and b on its own.

(This is not true for non-primes p: for instance, 6 divides

36 = 4·9 but 6 does not divide either of 4 or 9.)

Using the above fact, it can be proved that every number greater

than 1 can be expressed as a product of prime numbers, and that

moreover such an expression is unique apart from the order in which

the prime factors appear. This result is known as the Fundamental

Theorem of Arithmetic. For example, $1200 = 2^4 \cdot 3 \cdot 5^2$. (Actually,

it is perhaps prudent to make a remark here about the use of the

word 'product' in mathematics. Ordinarily, by a 'product' of

numbers one means two numbers multiplied together. In mathematics,

the word 'product' is used to mean the result of any number of

numbers multiplied together. Included in this is the degenerate

case of a single number, where in reality there is no multiplication

involved at all. Thus, for example, the prime number 3 is a

'product' of primes, as is any prime number. Though on the face

of it it may seem a little strange to refer to individual primes

as 'products' of primes, this is done in order to simplify the

statement of mathematical results. For instance, without this

convention it would be necessary to exclude the prime numbers from

the statement of The Fundamental Theorem of Arithmetic.)
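The unique factorisation just described can be explored on a computer. The following is a minimal sketch in Python; the function name `prime_factors` is an illustrative assumption and not part of the text, and the method (repeated trial division) is the simplest one available rather than an efficient one.

```python
def prime_factors(n):
    """Return the prime factors of n (n > 1) in non-decreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out each prime as often as it occurs
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(1200))     # [2, 2, 2, 2, 3, 5, 5]
print(prime_factors(3))        # [3], a 'product' with a single factor
```

Note that a prime such as 3 comes back as a list with one entry, which matches the convention that a single prime counts as a 'product' of primes.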

2. THE SIEVE OF ERATOSTHENES

Given a particular number, how can we determine whether it is prime

or not? The most obvious method is to go through all smaller

numbers greater than 1 and see if any of them divide into it.

If a divisor is found, the number cannot be prime; if no divisor

is found, it must be prime. Though simple to describe, this method

is unwieldy in practice: for example, to check if 83 is prime would

require 81 trial divisions.

The above can be speeded up considerably by the observation

that if a number a has a factor (other than 1 and a) it necessarily

has a factor not exceeding $\sqrt{a}$. (This is easily proved.) So in order

to check if a number is prime it is only necessary to look for

possible factors not exceeding its square root. For numbers such

as 83, this makes the method feasible, of course, since then only

the numbers 2, ..., 9 need be checked, but for larger numbers the

method is still unwieldy.
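The square-root shortcut described above can be sketched in a few lines of Python. The name `is_prime` is an assumption made for illustration; `math.isqrt` returns the integer part of the square root, so for 83 the loop tries exactly the divisors 2 to 9.

```python
import math

def is_prime(a):
    """Trial division by 2, 3, ..., up to the integer square root of a."""
    if a < 2:
        return False
    for d in range(2, math.isqrt(a) + 1):
        if a % d == 0:         # a proper divisor was found, so a is composite
            return False
    return True

print(is_prime(83))            # True, after trying only 2, ..., 9
print(is_prime(91))            # False, since 91 = 7 * 13
```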

A simple technique for determining all the primes less than

a given number without using any arithmetic at all was invented

by the Greek mathematician Eratosthenes of Cyrene (276-194 B.C.).

To find all the primes less than N, you begin by writing all the

numbers 2, 3, 4, 5, ..., N in a list. Starting from 2, every second

number on the list will be even, of course, and hence, excepting

2 itself, will be composite. So you go through the list deleting

every second number (but leaving 2 untouched). Now turn your

attention to the next number on the list which has not been crossed

out, namely 3. Starting from 3, every third number will be a

multiple of 3, hence, excepting 3 itself, composite. So leave

3 untouched and then proceed to cross out every third number

thereafter. (In counting every third number, you include the

crossed-out numbers.) The next number remaining on the list (i.e.

not crossed out) is 5. Starting at 5, cross out every fifth number

(but leave 5). And so on. By the time you reach the largest

number not exceeding the square root of N by this procedure, you will

have deleted all composite numbers from the list, and what is

left will constitute a list of all the primes less than N.
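The sieving procedure translates almost word for word into a program. The sketch below is one possible Python rendering, not the author's own; the name `primes_below` and the use of a list of 'crossed out' flags are assumptions made for illustration.

```python
def primes_below(N):
    """Sieve of Eratosthenes: return the list of all primes less than N."""
    crossed_out = [False] * max(N, 2)
    primes = []
    for p in range(2, N):
        if not crossed_out[p]:           # p survived every earlier pass
            primes.append(p)
            for m in range(2 * p, N, p): # every p-th number after p itself
                crossed_out[m] = True
    return primes

print(primes_below(20))                  # [2, 3, 5, 7, 11, 13, 17, 19]
```

As in the hand method, no division is performed: the multiples of each surviving number are reached purely by counting in steps.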

The process of successively eliminating the multiples of

2,3,5,7, etc in the above method is known as 'sieving' (for obvious

reasons): hence the name 'The Sieve of Eratosthenes'. Later in

the book we shall have occasion to study other 'sieving' procedures.

Though it eliminates the need for arithmetic, it is clear

that if N is much greater than, say 100, sieving is also not a

very practical way to find prime numbers. In fact, utilising

various mathematical results we shall obtain in this book, it is

possible to develop much more efficient methods for primality

testing.

3. THE DISTRIBUTION OF THE PRIMES

If you use the Sieve of Eratosthenes to list all the primes less

than, say, 100, you will be able to see that, though the primes

are common amongst the smaller numbers (less than 20, say), they

become less frequent the higher up you go. In fact, the sieving

method makes it quite clear why this is the case. The higher

up you are, the more numbers will be sieved out by the time you

get there.

If we denote by $\pi(n)$ the number of primes less than n, then

the following table shows how $\pi(n)$ varies with n for a few values

of n.

              n       $\pi(n)$

          1,000            168

         10,000          1,229

        100,000          9,592

      1,000,000         78,498

     10,000,000        664,579

    100,000,000      5,761,455

In 1896, Hadamard and de la Vallée Poussin independently

succeeded in proving that as n tends to infinity, the ratio of $\pi(n)$

to $n/\log(n)$ approaches 1, i.e.

$$\lim_{n \to \infty} \frac{\pi(n)}{n/\log(n)} = 1$$

(This followed considerable work on the problem by Tchebychef,

Riemann, and others.) This result is known as The Prime Number

Theorem. It had been conjectured over a hundred years earlier

by Legendre and Gauss, based upon the numerical evidence supplied

by tables such as the above.

An even better formula approximating $\pi(n)$ for 'large' n was

suggested by Gauss and subsequently proved by de la Vallée Poussin,

namely the function

$$\mathrm{Li}(n) = \int_2^n \frac{dx}{\log(x)}$$

The accuracy of these approximating functions can be judged from

the following table, which extends the one above.

              n       $\pi(n)$    $n/\log(n)$          Li(n)

          1,000            168            145            178

         10,000          1,229          1,086          1,246

        100,000          9,592          8,686          9,630

      1,000,000         78,498         72,382         78,628

     10,000,000        664,579        620,420        664,918

    100,000,000      5,761,455      5,428,681      5,762,209
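The first rows of the table can be reproduced on a home computer. The following Python sketch counts primes with a simple sieve and compares the count with n/log(n); the function name `prime_count_below` is an assumption made for illustration, and Li(n) is left out because it would need numerical integration.

```python
import math

def prime_count_below(n):
    """Count the primes less than n with a simple sieve."""
    if n < 3:
        return 0
    is_candidate = bytearray([1]) * n
    is_candidate[0] = is_candidate[1] = 0
    for p in range(2, math.isqrt(n - 1) + 1):
        if is_candidate[p]:
            for m in range(p * p, n, p):   # cross out the multiples of p
                is_candidate[m] = 0
    return sum(is_candidate)

for n in (1_000, 10_000, 100_000):
    print(n, prime_count_below(n), round(n / math.log(n)))
# 1000 168 145
# 10000 1229 1086
# 100000 9592 8686
```

Here `math.log` is the natural logarithm, which is the logarithm intended in the Prime Number Theorem.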

One thing that is immediately apparent from the above table

is that, whilst Li(n) approximates $\pi(n)$ with considerable accuracy

for quite modest values of n, it always does so on the large side:

$\mathrm{Li}(n) - \pi(n)$ is always positive. Is this in fact always the case,

or are there values of n for which $\mathrm{Li}(n) - \pi(n)$ is negative?

This is one of those salutary occasions when the mathematical

fact is at variance with all the available numerical evidence.

No number n has ever been found for which $\mathrm{Li}(n) - \pi(n)$ is

negative, despite considerable computer searches. Nevertheless,

the mathematician J.E. Littlewood proved that such an n must exist.

In fact, the sign of $\mathrm{Li}(n) - \pi(n)$ changes infinitely often as n

runs up through all the numbers. It must certainly change

somewhere before a certain number of incomprehensible magnitude,

almost the largest number ever to play a genuine part in

mathematics. It seems

likely that, no matter how much computers develop in the future,

no one will ever know of a specific example of a number n for

which $\pi(n)$ exceeds Li(n).

4. LARGEST KNOWN PRIMES

Knowing that there are infinitely many primes, mankind's curiosity

has naturally resulted in computer searches being made for 'record'

primes. Such searches involve some interesting mathematics, and

require very efficient computer programs. For mathematical reasons

which will be explained later in the book, record primes are

nowadays always of the form $2^n - 1$ for certain numbers n.

Prior to 1971, the largest known prime was $2^{11,213} - 1$, a

number which would require some 3,376 digits to write out in the

normal way. This was discovered by Donald B. Gillies in 1963

using the ILLIAC-II computer. In 1971, Bryant Tuckerman used

an IBM 360-91 computer to show that the 6,002 digit number

$2^{19,937} - 1$ is prime. In 1978, two 18 year old American high

school students, Laura Nickel and Curt Noll, discovered the prime

$2^{21,701} - 1$, using a CDC-CYBER-174. This feat so caught the

imagination of the American public that Nickel and Noll's discovery

was announced on nationwide television and made every major American

newspaper. The Nickel-Noll prime has 6,533 digits. One year

later, Noll used the same computer to better the record with the

6,987 digit number $2^{23,209} - 1$.

It took the CDC-CYBER-174 well over eight hours to run the

check on primality for Noll's number. Two weeks later, David

Slowinski used the immensely powerful CRAY-1 computer to check

the primality of the same number: it took a mere seven minutes.

Aided by Harry Nelson, Slowinski used the CRAY-1 to discover, on

April 8, 1979, that the 13,395 digit number $2^{44,497} - 1$ is prime.

For the period 1976 until 1984, the CRAY-1 was probably the most

powerful computer in the world, so it is not too surprising to

learn that Slowinski and his CRAY-1 kept the record for the world's

largest known prime. On September 25, 1982 the 25,962 digit

prime $2^{86,243} - 1$ was discovered. Then, on September 19, 1983

(at 11:36:33 a.m.) the 39,751 digit giant $2^{132,049} - 1$ was found,

this time using a CRAY-XMP computer, essentially two CRAY-1

computers joined together. At the time of writing, this is the

largest known prime number in the world.
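The digit counts quoted in this section can be checked without ever writing the numbers out. Since a power of 2 is never a power of 10, the number $2^n - 1$ has the same number of decimal digits as $2^n$, namely the integer part of $n \log_{10} 2$, plus 1. The Python sketch below applies this; the function name `mersenne_digits` is an assumption made for illustration.

```python
import math

def mersenne_digits(n):
    """Number of decimal digits of 2**n - 1, for n >= 1."""
    return math.floor(n * math.log10(2)) + 1

for n in (11213, 19937, 21701, 23209, 44497, 86243, 132049):
    print(n, mersenne_digits(n))
# 11213 3376
# 19937 6002
# 21701 6533
# 23209 6987
# 44497 13395
# 86243 25962
# 132049 39751
```

This only counts digits; it says nothing about whether the number is prime.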

5. CONJECTURES ABOUT PRIMES

There are many easily formulated conjectures about primes, based

upon numerical evidence, which have resisted numerous attempts

at solution. For instance, to make the business of primality

testing feasible, record primes are nowadays always sought amongst

the numbers of the type $2^n - 1$. It is conjectured that there

are infinitely many prime numbers of this kind, but this has never

been proved. In fact the numerical evidence is rather flimsy.

Including the examples listed in the previous section, only 29

examples of such primes are known. A similar unsolved problem

is whether there are infinitely many primes of the form $2^n + 1$.

Are there infinitely many primes of the form $n^2 + 1$? The

conjecture is that there are. Again, Fermat, the great 17th

Century number theorist, conjectured that all numbers of the form

$$F_n = 2^{2^n} + 1$$

are prime. This is certainly true for $F_0 = 3$, $F_1 = 5$, $F_2 = 17$,

$F_3 = 257$, $F_4 = 65,537$. But unfortunately, there it stops.

In 1732, Euler found that $F_5 = 4,294,967,297$ is divisible by 641.

Despite considerable computerised searches, no prime numbers of

the form $F_n$ for n > 4 have ever been found, and the present day

conjecture is that $F_n$ is composite for all n > 4.
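Euler's observation about the fifth Fermat number is easy to confirm by machine. The following Python sketch, with the illustrative function name `fermat` (an assumption, not from the text), lists the first few Fermat numbers and checks the divisibility by 641.

```python
def fermat(n):
    """The n-th Fermat number, 2**(2**n) + 1."""
    return 2 ** (2 ** n) + 1

print([fermat(n) for n in range(5)])   # [3, 5, 17, 257, 65537]
print(fermat(5))                       # 4294967297
print(fermat(5) % 641)                 # 0, so 641 divides F5
print(fermat(5) // 641)                # 6700417, the cofactor
```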

Two primes which are only 2 apart are said to be twin primes.

For example, 5 and 7 constitute a pair of twin primes, as do 17

and 19. Thousands of examples of such pairs have been discovered,

but the conjecture that there are infinitely many pairs of twin

primes remains unresolved.

In a letter to his colleague Euler written in 1742, Christian

Goldbach conjectured that every even number is the sum of two

numbers that are either prime or 1. For example, 4 = 2+2, 6 = 3+3,

8 = 3+5. Computer searches have demonstrated that this is true

up to 1,000,000,000, but the general problem remains unsolved,

and is known today as the Goldbach Conjecture. A similar open

question is whether every even number can be expressed as the

difference of two consecutive primes in infinitely many ways.

And in 1775 Lagrange conjectured that every odd number greater

than 5 can be written in the form p + 2q where p and q are both

primes, again still open.
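A small-scale check of the Goldbach Conjecture can be sketched as follows. This is only numerical evidence over a small range, not a proof. The function names are assumptions made for illustration, and the sketch uses the modern form of the conjecture, in which both summands must be prime (1 is not allowed).

```python
import math

def is_prime(a):
    """Trial division up to the integer square root of a."""
    if a < 2:
        return False
    for d in range(2, math.isqrt(a) + 1):
        if a % d == 0:
            return False
    return True

def goldbach_pair(n):
    """Return primes (p, q) with p + q == n and p <= q, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None

print(goldbach_pair(4), goldbach_pair(6), goldbach_pair(8))   # (2, 2) (3, 3) (3, 5)
print(all(goldbach_pair(n) for n in range(4, 201, 2)))        # True
```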

Is it possible to find arbitrarily long finite arithmetic

progressions of prime numbers? At present the longest known is

of length 18, starting with the prime 107,928,278,317 and increasing

in steps of 9,922,782,870 until the number 276,615,587,107 is

reached. Even more demanding, are there arbitrarily long finite

arithmetic progressions of consecutive primes? The longest known

has length 6, starting with 121,174,811 and going up in steps of

30.
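The 18-term progression quoted above can be written out and tested directly. The Python sketch below builds the terms from the starting prime and the step given in the text and tests each one by trial division with odd divisors; the helper name `is_prime` is an assumption made for illustration, and the run may take a second or so because the numbers are large.

```python
import math

def is_prime(a):
    """Trial division by 2 and then by odd numbers up to the square root."""
    if a < 2:
        return False
    if a % 2 == 0:
        return a == 2
    for d in range(3, math.isqrt(a) + 1, 2):
        if a % d == 0:
            return False
    return True

start, step, length = 107_928_278_317, 9_922_782_870, 18
terms = [start + k * step for k in range(length)]

print(terms[-1])                        # 276615587107, the last term quoted
print(all(is_prime(t) for t in terms))  # reports whether every term is prime
```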

Occasionally a conjecture about primes does get solved.

For instance, in 1850, Tchebychef established Bertrand's Conjecture

that for every number n > 1 there is a prime number strictly between

n and 2n. And in 1950 it was shown that every number greater

than 9 can be written as a sum of distinct odd primes. But by

and large, most of the present day open conjectures about primes

seem to be extremely hard to answer.

EXERCISES 0

1. Use the Sieve of Eratosthenes to determine all the primes

less than 100.

2. Prove that if n is composite, it has a prime factor not exceeding

$\sqrt{n}$.

3. A number is said to be square-free if it is not divisible

by any perfect square. Prove that a number n > 1 is square-

free if, and only if, it is a product of distinct primes.

4. Prove that the only prime of the form $n^3 - 1$ is 7.

5. Prove that if $a^n - 1$ is prime (where n > 1), then a = 2.

6. Show that any prime greater than 3 is either one less or one

more than a multiple of 6. (This requires the Division

Algorithm considered in Chapter I.)

7. Show that if p is a prime greater than 5, either $p^2 - 1$ or $p^2 + 1$

is divisible by 10. (This requires the Division Algorithm,

considered in Chapter I.)

8. Use Bertrand's Conjecture to show that if $p_n$ is the n-th prime

(so $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, etc.), then $p_n$ [...]

4. A natural question which arises from the proof of the

infinitude of the primes (see question 9 in the above

Exercises: see also Chapter I) is whether the numbers

$$p_1 p_2 \cdots p_n + 1$$

are prime, where $p_n$ denotes the n-th prime number.

Investigate this question.

5. Investigate the primality of the values of the quadratic

polynomial

$$f(n) = n^2 + n + 41$$

for n = 0, 1, 2, ..., 100.

6. There is an arithmetic progression of seven primes which starts

with a number less than 10 and increases in steps of 150.

Find it.

7. The de Polignac Conjecture states that every odd number is

the sum of a prime and a power of 2. Find a counterexample.

8. Verify the Goldbach Conjecture for all even numbers up to

1,000.

9. Find five primes of the form $2^n - 1$.

10. Find 10 pairs of twin primes. Find 100. Investigate the

behaviour of the function t(n) which gives the number of twin

pairs less than n.

I Basic Concepts

This preparatory chapter collects together an assortment of concepts

and ideas that will be required throughout the remainder of the

book. Most of the material covered will probably be familiar to

you, though the 'casual' reader may be a little surprised at the

degree of rigour with which the development proceeds. Exercising

great care and leaving nothing to 'chance' may well seem unnecessary

during the early stages, but when things become more complicated

and there is little or no intuition to be the guide, mathematical

rigour is the only guarantee of correctness, so it is as well to

start out as you mean to go on.

1. MATHEMATICAL INDUCTION

Many results in Number Theory are of the form 'all numbers have

such and such a property'. For example, the statement that for

any number n

$$1 + 2 + 3 + \cdots + (n-1) + n = \tfrac{1}{2}n(n+1)$$

How can one set about proving that such a statement is true?

Certainly one might begin by trying the formula with a few values

of n, say n = 1, 2, 3, ..., 10. In the case of the example given,

this will show that the formula is valid for each of these values

of n. Whilst this may well give cause to suspect the general

validity of the formula for all values of n, it does not at all

prove its universal validity. Indeed, the unreliability of

numerical evidence in situations such as this was highlighted by

the result of Littlewood mentioned in Chapter 0.3. (Another

instance was provided by Computer Problem 5 in that chapter.)

In order to demonstrate rigorously that a statement is valid

for all values of the number n, it is necessary to prove that it

is impossible for there ever to be an n for which the statement

is false, no matter how large and computably inaccessible it may

be. In order to do this, it is common to argue like this.

Suppose, for the sake of argument, that there were a value of n

for which the statement in question is false. Then there would

have to be a first (i.e. least) such value of n. For this particular

value of n we would then have the situation where the statement

is true for all values 1, 2, 3, ..., n-1 but fails at n. The method

of proof by Mathematical Induction works by preventing this situation,

or rather by demonstrating that it can never arise.

The method is more easily understood by considering a specific

example, such as the one given above. That is, we want to prove

that for any number n

$$1 + 2 + 3 + \cdots + n = \tfrac{1}{2}n(n+1)$$

We begin by observing that the formula is valid for the special

case n = 1. Now suppose that it were false for some number n.

Let K be the least number for which the formula is false. We

thus have the situation

(i) $1 + 2 + \cdots + (K-1) = \tfrac{1}{2}(K-1)K$

(ii) $1 + 2 + \cdots + K \neq \tfrac{1}{2}K(K+1)$

To complete the proof we demonstrate that this situation is

contradictory, i.e. that equations (i) and (ii) cannot both be

valid. This is the part of any induction proof where some ingenuity

is required.

In the case of this example suppose we take equation (i) and

add K to both sides. This gives

$$1 + 2 + \cdots + (K-1) + K = \tfrac{1}{2}(K-1)K + K$$

which, when we simplify the right hand side, gives

$$1 + 2 + \cdots + (K-1) + K = \tfrac{1}{2}K(K+1)$$

This contradicts equation (ii), of course. (Strictly speaking,

(ii) is not an 'equation' but an 'inequation' but mathematicians

never bother about this kind of detail: there are more important

details to worry about.) It follows that the original assumption

that there were an n for which the formula is false cannot be a

valid assumption. In other words, the formula must be true for

all numbers n.
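The simplification of the right hand side used in the proof above is the one place where algebra is needed, so it may help to see it written out in full. A short worked step, in the same notation as equations (i) and (ii):

```latex
\tfrac{1}{2}(K-1)K + K
  = \tfrac{1}{2}K\,\bigl[(K-1) + 2\bigr]
  = \tfrac{1}{2}K(K+1)
```

Taking out the common factor $\tfrac{1}{2}K$ is all that is involved.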

The reason the above method of induction works can be explained

like this. The difficulty in trying to prove that something is

true for all numbers n is that there are infinitely many numbers

and it is impossible to consider them all individually. By making

the assumption that the statement is false for some number, and

then concentrating on the least number for which it is false, you

are reduced to looking at just one number, namely that least value.

Of course, you do not know just which number that is. (Indeed,

if the result you are trying to prove turns out to be true, then

there will in fact not be such a number, so there is no way you

can know its value. But at the time of trying to prove the result

you do not know this, at least not 'officially'.) But as the

above example shows, in order to arrive at a contradiction it is

not at all necessary to know anything about the critical value

K other than that it is (by assumption) a critical value where

the statement in question first becomes false.

Notice that in the above example, in order to reach our final

contradiction we began with equation (i). But there will only

be an equation (i) provided that K > 1. This was why the proof

began with the observation that the result was valid for n = 1.

This means that K, for which the result is assumed false, must

indeed be greater than 1. This important point is often overlooked

by the beginner, so we shall emphasise it by trying to prove,

by induction, the false statement

$$1 + 3 + 5 + \cdots + (2n-1) = n^2 + 3$$

Suppose that the above equation is false for some value of

n. Let K be the least value of n for which it fails. Then we

have the situation

(i) $1 + 3 + 5 + \cdots + (2K-3) = (K-1)^2 + 3$

(ii) $1 + 3 + 5 + \cdots + (2K-1) \neq K^2 + 3$

Add 2K-1 to both sides of equation (i) to obtain

$$1 + 3 + 5 + \cdots + (2K-3) + (2K-1) = (K-1)^2 + 2K + 2$$

Rearranging the right hand side of this equation gives

$$1 + 3 + 5 + \cdots + (2K-1) = K^2 + 3,$$

which contradicts (ii). So far this looks very similar to the

proof used in our first example. The difference is that in this

example the result is not true for n = 1. This 'small' fact means

that the argument just given does not lead to the conclusion that

the formula concerned is valid for all values of n. In fact it

is false for all values of n.
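The false formula makes a good experiment: the algebra of the inductive step works for every K, yet the formula itself never holds. The Python sketch below checks both facts over a finite range; the function names are assumptions made for illustration, and a finite check is of course evidence rather than proof.

```python
def odd_sum(n):
    """1 + 3 + 5 + ... + (2n - 1), added up term by term."""
    return sum(range(1, 2 * n, 2))

def step_is_consistent(K):
    """The algebra of the inductive step: (K-1)^2 + 3 + (2K-1) equals K^2 + 3."""
    return (K - 1) ** 2 + 3 + (2 * K - 1) == K ** 2 + 3

print(all(step_is_consistent(K) for K in range(2, 1000)))     # True
print(any(odd_sum(n) == n * n + 3 for n in range(1, 1000)))   # False
print(odd_sum(1), 1 * 1 + 3)                                  # 1 4
```

The last line shows where the argument breaks down: the starting case n = 1 fails, so there is nothing for the inductive step to build on.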

Proofs by mathematical induction are often written in a slightly

different fashion. In order to prove that some statement A(n)

involving the number n is valid for all numbers n, it is possible

to proceed as follows.

1. Establish (usually by simple observation) that A(1) is

valid.

2. Give an algebraic proof that the truth of A(n) implies

that of A(n+1) (for an unspecified n).

This procedure is in fact logically equivalent to the first one.

Step 1 is, of course, common to both approaches. Step 2 above

will clearly preclude the existence of a K for which A(K) is false,

since any such K will have to be greater than 1 (by Step 1), and

so the least K will be of the form K = n+1 where A(n) is true

(K being the least for which A(K) is false), and Step 2 then implies

A(n+1) is true, i.e. A(K) is true, a contradiction.

We shall use this method of writing the proof to establish

the correct version of the formula for the sum of the first n

odd numbers, considered above. The formula is

(i) $1 + 3 + 5 + \cdots + (2n-1) = n^2$

To prove this by the method of induction, we begin by observing

that the formula is valid for n = 1. Now we make the assumption

that it is valid for an unspecified n, i.e. we assume that equation

(i) is indeed valid for some (unspecified) n, and we try to prove

that it is valid for n+1, i.e. that

(ii) $1 + 3 + 5 + \cdots + (2n+1) = (n+1)^2$

How do we prove that (ii) follows from (i)? This is easy.

Simply add 2n+1 to both sides of equation (i) and simplify the

right hand side.
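Written out, the step from (i) to (ii) looks like this. The first equality uses the assumption (i) for the unspecified n, and the second is just the expansion of a square:

```latex
1 + 3 + 5 + \cdots + (2n-1) + (2n+1)
  = n^2 + (2n+1)
  = (n+1)^2
```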

Really the only difference in the two approaches is that in

the former we perform the algebra on the special (but unknown)

value n=K where the statement is false for the first time, and

in the second we perform the same algebra on some fixed, but likewise

unknown, n.

Notice in particular that in the second formulation of

induction, we do not make the assumption that A(n) is true for

all n, indeed, it is precisely in order to prove this that induction

is being used in the first place. Rather we assume that A(n)

is true for a single but totally unspecified value of n, which,

being unspecified, has to be referred to as 'n' throughout. (Some

authors introduce a second symbol, 'k', at this point and speak

about 'letting n=k', and you can do this if you prefer, preserving

the distinction between 'n', the variable, and 'k', a fixed but

arbitrary number. But the algebra remains the same, except that

n becomes k everywhere.)


The two examples of induction considered so far both involved

the verification of an equation. This is not always the case.

As an illustration, let us use induction to prove that for all

numbers n, 6 divides into $7^n - 1$. For n=1 this is obviously true.

Now assume the result is valid for some arbitrary but fixed number

n. We shall try to use this assumption in order to prove that

6 divides into $7^{n+1} - 1$. Notice that

$7^{n+1} - 1 = 7 \cdot 7^n - 7 + 6 = 7(7^n - 1) + 6.$

By our 'induction hypothesis', 6 divides into $7^n - 1$, so certainly

6 divides into $7(7^n - 1)$. It follows at once that 6 divides into

$7(7^n - 1) + 6$, of course, so we have succeeded in proving that 6

divides into $7^{n+1} - 1$. It follows ('by induction') that 6 divides

into $7^n - 1$ for all n.
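A numerical check is no substitute for the induction proof, but it can catch slips in the algebra. The following is a minimal sketch in Python; the function names are illustrative and not part of the text. It tests the rewriting used at the 'n to n+1 step' and the divisibility claim for a range of n.

```python
def induction_step_identity(n):
    """Check that 7^(n+1) - 1 equals 7*(7^n - 1) + 6 for one value of n."""
    return 7 ** (n + 1) - 1 == 7 * (7 ** n - 1) + 6


def six_divides(n):
    """Check that 6 divides 7^n - 1 for one value of n."""
    return (7 ** n - 1) % 6 == 0


# Spot-check the first few cases; this illustrates the claim, it does not prove it.
for n in range(1, 11):
    assert induction_step_identity(n)
    assert six_divides(n)
```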

You may well ask, why write $7^{n+1} - 1$ in the form we did?

The only answer is that this led to the result we wanted. Different

situations will require different 'tricks', and induction proofs

often require considerable ingenuity at the 'n to n+1 step'.

We end this section with a particularly important application

of the method of induction: The Binomial Theorem. This allows

us to express powers of the form $(a + b)^n$ as a sum of products

of powers of a and b. For example, the following are well known

and easily proved by direct evaluation:

$(a + b)^2 = a^2 + 2ab + b^2$, $(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$.

In order to obtain a general result of this kind we need the

factorial function. For any number n, factorial n (also called


'n factorial') is that number, denoted by n!, obtained by

multiplying together all the numbers 1, 2, 3, ..., n. Thus

$n! = n(n-1) \cdots 3 \cdot 2 \cdot 1.$

For example,

1! = 1, 2! = 2.1 = 2, 3! = 3.2.1 = 6, 4! = 4.3.2.1 = 24,

5! = 5.4.3.2.1 = 120, 6! = 6.5.4.3.2.1 = 720,

7! = 7.6.5.4.3.2.1 = 5040, 8! = 40320.

From the above examples it should be clear that the values of n!

increase very rapidly as n increases. It should also be clear

that there is a simple recursive procedure for calculating values

of n!, namely, for any n,

$(n+1)! = (n+1) \cdot n!$

For convenience, we define 0! = 1.
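The recursive rule translates almost word for word into a program. The sketch below is in Python (chosen for this illustration; the function name is an assumption) and uses the convention 0! = 1 as the base case.

```python
def factorial(n):
    """Return n! using the rule (n+1)! = (n+1) * n!, with 0! = 1."""
    if n < 0:
        raise ValueError("factorial is only defined here for n >= 0")
    if n == 0:
        return 1  # the convention 0! = 1
    return n * factorial(n - 1)


# The values listed above, recomputed.
print([factorial(n) for n in range(1, 9)])
```

For large n a loop avoids the recursion depth limit, but the recursive form mirrors the definition most directly.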

For any numbers n, r such that $0 \le r \le n$, the binomial coefficient $C^n_r$ is defined by:

$C^n_r = \frac{n!}{r!\,(n-r)!} = \frac{n(n-1) \cdots (n-r+1)}{r!}$

For example,

$C^2_0 = 1$, $C^2_1 = 2$, $C^2_2 = 1$;

$C^3_0 = 1$, $C^3_1 = 3$, $C^3_2 = 3$, $C^3_3 = 1$;

$C^4_0 = 1$, $C^4_1 = 4$, $C^4_2 = 6$, $C^4_3 = 4$, $C^4_4 = 1$.

Note that for any n, $C^n_0 = C^n_n = 1$, and that for any n, r, $C^n_r = C^n_{n-r}$.

Theorem I.1.1 (The Binomial Theorem) For any $n \ge 2$,

$(a + b)^n = C^n_0 a^n + C^n_1 a^{n-1} b + C^n_2 a^{n-2} b^2 + \cdots + C^n_r a^{n-r} b^r + \cdots + C^n_{n-1} a b^{n-1} + C^n_n b^n.$

Proof: By induction on n. The cases n = 2, 3 follow from previous observations. So assume the result holds for n (i.e. as stated above) and prove it for n+1. By this induction hypothesis, we have:

$a(a + b)^n = C^n_0 a^{n+1} + C^n_1 a^n b + C^n_2 a^{n-1} b^2 + \cdots + C^n_r a^{n-r+1} b^r + \cdots + C^n_n a b^n$

$b(a + b)^n = C^n_0 a^n b + C^n_1 a^{n-1} b^2 + C^n_2 a^{n-2} b^3 + \cdots + C^n_r a^{n-r} b^{r+1} + \cdots + C^n_{n-1} a b^n + C^n_n b^{n+1}$

Adding these two expressions, we obtain

$(a + b)^{n+1} = C^n_0 a^{n+1} + (C^n_1 + C^n_0) a^n b + \cdots + (C^n_r + C^n_{r-1}) a^{n-r+1} b^r + \cdots + (C^n_n + C^n_{n-1}) a b^n + C^n_n b^{n+1}.$

Since $C^n_0 = C^{n+1}_0 = 1$ and $C^n_n = C^{n+1}_{n+1} = 1$, we shall have completed our proof if we can show that $C^n_r + C^n_{r-1} = C^{n+1}_r$ for all r, since the above expression will then be the theorem for n+1 in place of n. So we must prove that

$\frac{n!}{r!\,(n-r)!} + \frac{n!}{(r-1)!\,(n-r+1)!} = \frac{(n+1)!}{r!\,(n+1-r)!}$

But this is easy. Simply combine the two fractions on the left into a single fraction, and upon simplification the expression on the right is obtained. This completes the proof of the binomial theorem. □
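The two facts on which the proof turns, the addition rule for the coefficients and the expansion itself, are easily checked by machine for small cases. The sketch below is in Python; `binomial` and `expand` are names invented for this illustration, and `math.factorial` is the standard library routine.

```python
from math import factorial


def binomial(n, r):
    """Binomial coefficient n! / (r! (n-r)!) for 0 <= r <= n."""
    if not 0 <= r <= n:
        raise ValueError("expected 0 <= r <= n")
    return factorial(n) // (factorial(r) * factorial(n - r))


def expand(n, a, b):
    """Evaluate the right-hand side of the Binomial Theorem term by term."""
    return sum(binomial(n, r) * a ** (n - r) * b ** r for r in range(n + 1))


# The addition rule used in the induction step, for small n and r.
for n in range(1, 12):
    for r in range(1, n + 1):
        assert binomial(n, r) + binomial(n, r - 1) == binomial(n + 1, r)

# The expansion agrees with direct evaluation of (a + b)^n.
assert expand(5, 2, 3) == (2 + 3) ** 5
```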

2. DIVISIBILITY. THE EUCLIDEAN ALGORITHM

The notion of divisibility of one number by another is fundamental

to practically all aspects of Number Theory. Given any two

numbers one can add them or multiply them and obtain a new (natural)

number. If you allow for negative numbers (and zero), by

considering the integers rather than just the positive integers,

you can subtract as well. But division cannot, in general, be

performed, which is to say the result of dividing one number*

or integer by another is not necessarily another number* or

integer. For instance, you cannot divide 2 by 3 and obtain a

natural number as the result. Division is an operation for which

you need, at the very least, the rational number system. But

rational numbers are not what we study in Number Theory (at least,

not for most of the time).

When you are restricting yourself to whole numbers, either

the natural numbers or the integers, the process of division

results in a 'quotient' and a 'remainder'. For example, when

you try to divide 9 by 4 you get a quotient of 2 and a remainder

of 1:

9 = 4.2 + 1

This fundamental fact is embodied in a result called The Division

Algorithm. This is a bit of a misnomer, since the result itself

* Remember that we have agreed that the word 'number' shall mean 'natural number' except where indicated otherwise.


is not an 'algorithm' at all. On the contrary, it merely asserts

the existence of a quotient and a remainder, and does not tell

you how to calculate them. (Though it can presumably be safely

assumed that you are, in fact, able to perform this task should

it prove necessary to do so.)

Theorem I.2.1 (The Division Algorithm) Let a,b be integers, b > 0.

Then there exist unique integers q,r such that

a = qb + r and 0 ≤ r < b.

where O~r,r'

The following result is more general than Theorem 1.2.1.

Theorem I.2.2 Let a,b be integers, b ≠ 0. Then there are unique integers q,r such that

a = qb + r and 0 ≤ r < |b|.

Proof: We only need to consider the case b < 0, the case b > 0 having been dealt with in Theorem I.2.1 above. Since |b| > 0, an application of Theorem I.2.1 gives unique integers q',r' such that

a = q'|b| + r' and 0 ≤ r' < |b|.

Since |b| = -b, if we set q = -q' and r = r', we get

a = qb + r and 0 ≤ r < |b|,

which proves the theorem. □
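The construction in this proof can be followed step by step in a program. The sketch below is in Python and rests on one assumption worth stating: for a positive divisor, Python's built-in `divmod` returns a remainder in the range 0 to divisor-1, which is the remainder of the Division Algorithm. The function name `divide` is illustrative only.

```python
def divide(a, b):
    """Return (q, r) with a = q*b + r and 0 <= r < |b|, for any integer b != 0."""
    if b == 0:
        raise ZeroDivisionError("b must be non-zero")
    # Divide by |b| first, as in the proof; this gives q' and r'.
    q_prime, r = divmod(a, abs(b))
    # If b < 0 then |b| = -b, so take q = -q' and keep r = r'.
    q = q_prime if b > 0 else -q_prime
    return q, r


for a, b in [(9, 4), (-7, 3), (7, -3), (-7, -3)]:
    q, r = divide(a, b)
    assert a == q * b + r and 0 <= r < abs(b)
```

Many languages truncate towards zero for negative operands, so the remainder they return can be negative; the sign adjustment would then need more care than is shown here.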

Simple though it is, the Division Algorithm enables us to

prove various results that can be of help in simplifying

computational work. For instance, suppose that we were looking

for numbers which are the square of a prime. It may be useful

to know that the square of any prime greater than 2, in fact the

square of any odd number, is one more than a multiple of 8. (For

example, $3^2 = 9 = 8 + 1$, $5^2 = 25 = 3 \cdot 8 + 1$.) To prove this fact, note that by the Division Algorithm, any number can be expressed in one of the forms 4q, 4q+1, 4q+2, 4q+3, so any odd number is of one of the forms 4q+1, 4q+3. Squaring each of these gives

$(4q+1)^2 = 16q^2 + 8q + 1 = 8(2q^2 + q) + 1$

$(4q+3)^2 = 16q^2 + 24q + 9 = 8(2q^2 + 3q + 1) + 1.$

In both cases the result is one more than a multiple of 8.
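A quick machine check of this fact takes a few lines. The Python sketch below (the function name is made up for the illustration) collects the remainders left by the squares of odd numbers on division by 8; the argument above says the only remainder that can occur is 1.

```python
def odd_square_remainders(limit):
    """Set of remainders of m*m on division by 8, for odd m below limit."""
    return {(m * m) % 8 for m in range(1, limit, 2)}


# Only the remainder 1 should appear.
assert odd_square_remainders(1000) == {1}
```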

Further examples of this kind are given in the Exercises at

the end of this chapter. (See also Exercises 6 and 7 in

Chapter 0.)

An integer b is said to be divisible by a non-zero integer a,

written symbolically as a | b, if and only if there is an integer c

such that b = ac.

The following result lists the basic properties of divisibility.

Lemma I.2.3 Let a,b,c be integers, a ≠ 0. Then:

(i) a | 0, 1 | a, a | a;

(ii) a | 1 if and only if a = ±1;

(iii) if a | b and c | d then ac | bd (for c ≠ 0);

(iv) if a | b and b | c then a | c (for b ≠ 0);

(v) [a | b and b | a] if and only if a = ±b;

(vi) if a | b and b ≠ 0, then |a| ≤ |b|;

(vii) if a | b and a | c then a | (bx + cy) for any integers x,y.

Proof: In each case the proof simply involves going back to the definition of a | b. For example, to prove (iv), the assumptions mean that there are integers d,e such that b = da and c = eb, from which it follows that c = (de)a, and so a | c. To take another case, consider (vi). Since a | b there is an integer c such that b = ac. Thus |b| = |a|.|c|. Since b ≠ 0, we must have c ≠ 0, so |c| ≥ 1. Hence |a| ≤ |b|. The remaining cases are left as an exercise. □

Let a,b be given integers, at least one of which is not zero.

Then there are only a finite number of numbers d such that d | a

and d | b. The largest of these numbers d is called the greatest

common divisor of a and b, denoted by gcd(a,b) or, more simply

if the meaning is clear from the context, by (a,b).

For example, take the integers 18 and -24. The positive

divisors of 18 are 1,2,3,6,9,18, whilst those of -24 are

1,2,3,4,6,8,12,24. The common divisors of 18 and -24 (amongst

the positive numbers) are thus 1,2,3,6. Thus 6 = gcd(18,-24).

(In the shorter notation, 6 = (18,-24).) Notice that, as this

example indicates, the gcd of two integers is always positive,

regardless of the sign of the two given integers.

A very useful result concerning the gcd of two integers is

that gcd(a,b) can always be expressed as a linear combination

(with integer coefficients) of a and b. For instance, in the

case of the example 6 = (18,-24) considered above, we have

6 = (-1).(-24) + (-1).(18).

Theorem I.2.4 Let a,b be integers, not both zero. Then there are integers x,y such that

(a,b) = xa + yb.

Proof: Assume, for the sake of argument, that a ≠ 0. Let S be the set

S = {ua + vb | u,v are integers and ua + vb > 0}.

Let u = 1 if a > 0 and let u = -1 if a < 0. Then ua + 0b ∈ S. This shows that S is not empty. Let d be the smallest member of S. Pick x,y so that d = xa + yb, as guaranteed by the definition of S. We complete the proof by demonstrating that d = (a,b).

By the Division Algorithm there are integers q,r such that

a = qd + r and 0 ≤ r < d.

Then,

r = a - qd = a - q(xa + yb) = (1 - qx)a + (-qy)b.

If r > 0, this would mean that r ∈ S, contrary to the minimality of d in S. Thus we must have r = 0. But then a = qd, so d | a. Similarly, d | b.

Having proved that d divides into both a and b, we now prove that d is the largest such number. Suppose that c is a number which divides both a and b. Then (by Lemma I.2.3, part (vii)) c | (xa + yb), i.e. c | d. It follows that c ≤ d, and hence that d is the greatest common divisor of a,b. □

Corollary I.2.5 Let a,b be integers, not both zero, and let

T = {xa + yb | x,y are integers}.

Then T is the set of all multiples of (a,b).

Proof: Let d = (a,b). Clearly, every member of T is a multiple of d. Conversely, since we may write $d = x_0 a + y_0 b$ for some integers $x_0, y_0$, we have

$nd = (n x_0) a + (n y_0) b$

for any integer n, so every multiple of d is a member of T. □

We say that integers a,b are coprime (or relatively prime)

if (a,b) = 1.

In the case of coprime integers, Theorem I.2.4 has a converse,

namely:

Theorem I.2.6 Let a,b be integers, not both zero. Then a,b

are coprime if and only if there are integers x,y such that

xa + yb = 1.

Proof: If (a,b) = 1 then the existence of such x,y follows from Theorem I.2.4. Conversely, suppose such x,y exist. Let d = (a,b). Since d | a and d | b we must have d | (xa + yb), i.e. d | 1. But d > 0 (since it is a gcd). Thus d = 1. □

We have already observed that division is not a permissible

operation when we are restricting ourselves to whole numbers.

But in the case where integers a,b are such that a | b, by definition there is a (necessarily unique) integer c such that b = ac, and we shall write b/a to denote that unique integer c. We make use of this natural convention in the next result, a corollary to the above theorem.

Lemma I.2.7 If (a,b) = d then (a/d, b/d) = 1.

Proof: Write d = xa + yb. Since d | a and d | b we can rewrite this equation as

1 = x(a/d) + y(b/d).

So by Theorem I.2.6, (a/d, b/d) = 1. □

Notice that a | c and b | c do not necessarily imply that ab | c. For example, 6 | 24 and 8 | 24 but 48 ∤ 24. However, we do have:

Lemma I.2.8 Suppose that (a,b) = 1. If a | c and b | c then ab | c.

Proof: Pick r,s so that c = ra, c = sb. Pick x,y so that xa + yb = 1. Then xac + ybc = c, so

c = xasb + ybra = ab(xs + yr).

Thus ab | c. □

The following result, sometimes known as Euclid's Lemma, turns

out to be of fundamental importance in Number Theory.

Theorem I.2.9 If a | bc and (a,b) = 1 then a | c.

Proof: Write 1 = xa + yb, bc = na. Then

c = xac + ybc = xac + yna = a(xc + yn),

so a | c. □

How do you go about calculating the gcd of two given integers?

The 'obvious' method is to factor each number into a product of

primes and see which primes (with multiplicities) are common to

both. For example, to calculate (90,2268), notice that

$90 = 2 \cdot 3^2 \cdot 5$ and $2268 = 2^2 \cdot 3^4 \cdot 7$

so that $(90,2268) = 2 \cdot 3^2 = 18$. (It is easy to see that this method

always works.) The problem with this method is that factoring

a number into primes is an extremely time consuming business.

(See later.) A much more efficient method of calculating a gcd

is to use The Euclidean Algorithm. This depends upon the following

lemma.

Lemma I.2.10 If a = qb + r then (a,b) = (b,r).

Proof: Let d = (a,b). Then d | a and d | b, so d | r. Hence d divides both b and r. Suppose that c > 0 also divides both b and r. Then c | b and c | r so c | a (= qb + r). Thus c divides both a and b, and so (by Theorem I.2.4 and Lemma I.2.3, part (vii)) c | d. It follows that c ≤ d. So, by definition, d = (b,r). □

We are now able to describe the Euclidean Algorithm to determine the gcd of two given integers a,b. We may assume that neither of a,b is zero. (Otherwise the problem is trivial.) Since (a,b) = (|a|,|b|) = (b,a), we may further assume that a ≥ b > 0.

By the Division Algorithm applied to the pair a,b we can find integers $q_1, r_1$ such that

$a = q_1 b + r_1$, $0 \le r_1 < b$.

If $r_1 = 0$ then b | a so (a,b) = b and we are done. Otherwise $r_1 > 0$ and by the above Lemma I.2.10 we have $(a,b) = (b,r_1)$. We now apply the Division Algorithm to $b, r_1$ to obtain integers $q_2, r_2$ such that

$b = q_2 r_1 + r_2$, $0 \le r_2 < r_1$.

If $r_2 = 0$ then $r_1$ | b so $(a,b) = (b,r_1) = r_1$ and we are done. Otherwise $r_2 > 0$ and by Lemma I.2.10 again we have $(a,b) = (b,r_1) = (r_1,r_2)$. Now apply the Division Algorithm to $r_1, r_2$ to obtain $q_3, r_3$ such that

$r_1 = q_3 r_2 + r_3$, $0 \le r_3 < r_2$.

Keep on in this fashion. Since $b > r_1 > r_2 > r_3 > \cdots \ge 0$, there must come a stage n for which $r_{n+1} = 0$. Then $r_n = (a,b)$, and we are done.

As an example, we shall find the gcd of the numbers 12345

and 678. Applying the Euclidean Algorithm as just outlined, we

obtain the following steps:

12345 = 18.678 + 141

678 = 4.141 + 114

141 = 1.114 + 27

114 = 4.27 + 6

27 = 4.6 + 3

6 = 2.3 + 0

Thus the gcd of 12345 and 678 is 3, the last non-zero remainder

obtained.
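The steps of this example can be produced mechanically. The sketch below is in Python (a choice made for this illustration; `euclid_steps` and `gcd_from_steps` are invented names) and records each division as it goes, assuming a ≥ b > 0 as in the description of the algorithm.

```python
def euclid_steps(a, b):
    """Division steps (dividend, quotient, divisor, remainder) of the
    Euclidean Algorithm applied to a >= b > 0."""
    if not (a >= b > 0):
        raise ValueError("expected a >= b > 0")
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r  # by Lemma I.2.10, (a, b) = (b, r)
    return steps


def gcd_from_steps(steps):
    """The gcd is the divisor of the final step, i.e. the last non-zero remainder."""
    return steps[-1][2]


for dividend, q, divisor, r in euclid_steps(12345, 678):
    print(f"{dividend} = {q}.{divisor} + {r}")
```

The number of entries in the list returned is the number of divisions performed, a quantity examined later in the chapter.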

It is obvious that the above computation is easily carried out using at most a pocket calculator. To obtain the same result by factoring into primes takes longer. The relevant factorisations are

12345 = 3.5.823 and 678 = 2.3.113.

From these factorisations it is immediate that the gcd is 3, the

only prime factor the two numbers have in common. In fact, for

numbers of this size it is not so apparent that the factorisation

technique is not always feasible. The necessity of checking that

the numbers 823 and 113 are prime in the above example, though

involving more work than in the Euclidean Algorithm, is nonetheless

not too onerous. But, as we shall indicate in Chapter III, for

larger numbers, factorisation is virtually impossible, and must

therefore be avoided wherever possible.


Theorem 1.2.4 tells us that the gcd of two numbers can be

expressed as a linear combination (with integer coefficients) of

those two numbers. By tracing backwards through the Euclidean

Algorithm it is possible to find such an expression. This method

is best explained by means of an example. Consider the computation

above to determine gcd(12345,678). How can we express 3, the

answer, as a linear combination of 12345 and 678? Working our way back through the calculation we find:

3 = 27 - 4.6

  = 27 - 4.(114 - 4.27)

  = 27 - 4.114 + 16.27

  = 17.27 - 4.114

  = 17.(141 - 1.114) - 4.114

  = 17.141 - 21.114

  = 17.141 - 21.(678 - 4.141)

  = 101.141 - 21.678

  = 101.(12345 - 18.678) - 21.678

  = 101.12345 - 1839.678
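On a computer it is usually more convenient to carry the coefficients forward during the divisions than to trace backwards afterwards. The Python sketch below does this; it is a variant of the hand method shown above rather than a transcription of it, and the name `extended_gcd` is an assumption of the illustration. For this example it arrives at the same coefficients.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a, b) = x*a + y*b, for a, b >= 0 not both zero."""
    # Invariant: the current a equals x0*a_orig + y0*b_orig,
    #            the current b equals x1*a_orig + y1*b_orig.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


d, x, y = extended_gcd(12345, 678)
assert d == x * 12345 + y * 678
print(d, x, y)
```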

We shall examine the Euclidean Algorithm more closely in the

next section.

3. EFFICIENCY OF ALGORITHMS. MULTI-PRECISION ARITHMETIC

There are two distinct senses in which a mathematical problem can

be said to be 'solved'. First there is the pure 'existence' proof,

which demonstrates that, say, a number exists having certain

properties, but gives no indication as to just what that number

is. An example of such a solution is Littlewood's Theorem,


mentioned in Chapter 0.3, that there is a number n for which

Li(n) - π(n) is negative. No-one has any real idea of how to

actually find such a number. (Of course, in a sense there is

a method: examine each number in turn until one is found with the

desired property, but for reasons indicated in Chapter 0.3 this

is not at all a feasible method.) The second type of solution

is the computational solution, whereby a method (or 'algorithm')

is given which enables one to calculate numbers with the property

concerned. The Euclidean Algorithm is a good example of such

a solution.

As soon as you start talking about algorithms for the solution

of problems, the questions arise: 'How efficient is the algorithm?'

'Is it feasible in practical, computing terms?' 'And if so, for

what 'inputs' is it feasible?' The whole subject of algorithm

efficiency is a big one in its own right, and for the most part

lies outside our present scope, but insofar as it concerns our

subject matter we need to know a little bit about it.

First of all, just what do we mean by an 'algorithm'? It

is possible to give a fairly precise definition, but at this stage

it is sufficient to say that an algorithm is a sequence of

instructions which describe, in 'reasonable' detail, the steps

that must be performed in order to compute something: usually

the algorithm will have one or more numerical 'inputs' and produce

one or more numerical 'outputs'. The Euclidean Algorithm described

in the last section is a good example of such a procedure. (The

name 'algorithm' derives from al-Khowârizmî, an 8th Century Arabic

mathematician who wrote an influential textbook explaining the

Hindu system of decimal arithmetic.)

The first arithmetical algorithm that we ever meet is the

classical method for adding two numbers in decimal notation.

In order to develop the ideas we shall need to discuss algorithm

efficiency, let us have a quick look at this algorithm.

The classical addition algorithm depends upon the prior

knowledge of the sums of all pairs of 1-digit numbers (1 + 3 = 4, 5 + 7 = 12, etc.) Then, to add two n-digit numbers $X = x_n x_{n-1} \ldots x_2 x_1$ and $Y = y_n y_{n-1} \ldots y_2 y_1$ (where the $x_i, y_j$ are single digits), we perform a sequence of n additions of the form

$x_i + y_i + c_i$ (i = 1, ..., n),

where $c_2, \ldots, c_n$ are the possible 'carries', defined by (setting $c_1 = 0$ for convenience)

$c_{i+1} = 0$ if $x_i + y_i + c_i \le 9$, and $c_{i+1} = 1$ if $x_i + y_i + c_i > 9$.
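The classical algorithm can be written out as a short program working on lists of digits. In the Python sketch below the digits are stored least significant first, so that position i in the list corresponds to the digit with subscript i+1 above; this storage order and the helper names `to_digits`, `from_digits` and `add_digits` are assumptions made for the illustration.

```python
def to_digits(n, width):
    """Decimal digits of n, least significant first, padded with zeros to width."""
    return [(n // 10 ** i) % 10 for i in range(width)]


def from_digits(digits):
    """Inverse of to_digits."""
    return sum(d * 10 ** i for i, d in enumerate(digits))


def add_digits(x, y):
    """Add two equal-length digit lists, one single-digit step at a time."""
    if len(x) != len(y):
        raise ValueError("pad the shorter number with leading zeros first")
    result = []
    carry = 0  # the first carry is 0
    for xi, yi in zip(x, y):
        s = xi + yi + carry
        result.append(s % 10)
        carry = 1 if s > 9 else 0  # the rule defining the next carry
    if carry:
        result.append(carry)  # a final carry adds one more digit
    return result


total = add_digits(to_digits(12345, 5), to_digits(678, 5))
assert from_digits(total) == 12345 + 678
```

The loop body runs once for each digit position, which is the observation behind the running time estimate that follows.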

Now, our discussion of algorithms will really only make sense

when applied to computers, which perform the steps of the

algorithm in sequence at a fixed rate. So let us imagine that

we are to use the above addition algorithm in such a fashion,

taking no short cuts and performing each step in succession.

(The basic operation of adding two 1-digit numbers will correspond

to the basic addition operation provided in the computer hardware.)

Let $t_0$ be the (assumed constant) time it takes to perform one basic,

single digit addition, and let T(n) denote the time taken to add

two n-digit numbers using the above algorithm. At first glance

it would seem that

$T(n) = n \cdot t_0$.

This is not quite accurate, however, since we have ignored the

various 'book-keeping' tasks involved to keep track of where we

are in the algorithm. (Computer programmers refer to the time

taken for such operations as the 'overheads' involved in the

computation.) A few moments' reflection should indicate that these additional steps might themselves require a total time of the order of $2n \cdot t_0$. At any rate, we will have a bound of the form

$T(n) \le c \cdot n \cdot t_0$

for some constant c. We would say that the addition algorithm

'runs in linear time' to describe this situation: that is, the

time taken to perform the computation using the algorithm depends

linearly upon the size of the inputs (expressed in terms of the

number of digits in the two inputs). (If we wanted to express

the efficiency of the algorithm in terms of the magnitude of the

inputs rather than the number of digits involved, we would say

that the algorithm runs in 'log linear time'. This is because

the number of digits in a number N is approximately equal to

$\log_{10} N$, which means that the computation bound would be of the

form

Time taken to add two numbers of the order of N $\le c \cdot \log_{10} N \cdot t_0$.)

So much for addition (and, by a trivial modification to the

algorithm, subtraction). What about the other fundamental number

theoretic operation: multiplication? We start by examining the

conventional multiplication algorithm we learn at school. This

depends upon knowing in advance the product of any two 1-digit

numbers (4.5 = 20, 6.9 = 54, etc.). Normally, when we make use

of this algorithm we lay out the calculation more or less like

this:

     35
   x 24
   ----
     20   (4x5=20)
    120   (4x3=12)
    100   (2x5=10)
    600   (2x3=6)
   ----
    840   (adding)

Thus we reduce the problem of multiplying two 2-digit numbers to

that of performing 4 multiplications of 1-digit pairs, using

position to take care of the multiples of 10 involved (with a

units column, a tens column, etc.). In fact it will be more

convenient for us to write out such a calculation in the form

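24.35 = 100.2.3 + 10.2.5 + 10.4.3 + 4.5

In general, if X and Y are two 2-digit numbers, say $X = x_2 x_1$ and $Y = y_2 y_1$, then

$XY = 100\,x_2 y_2 + 10\,x_2 y_1 + 10\,x_1 y_2 + x_1 y_1.$

Extending the above algorithm to the general case of two n-digit numbers we have: if

$X = x_n x_{n-1} \ldots x_2 x_1$ and $Y = y_n y_{n-1} \ldots y_2 y_1$

are n-digit numbers then

$XY = \sum_{i=1}^{n} \sum_{j=1}^{n} 10^{i+j-2}\, x_i y_j.$

In the course of this calculation, $x_i y_j$ is calculated for each value of i,j = 1, ... ,n. What else is involved? There are some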

additions, of course, n(n+1) of them, ignoring the final collection of the various powers of 10. Each of these is essentially a 2-digit addition, so runs in time $2t_0$, where $t_0$ is the time for single digit addition, giving a total addition time of $2n(n+1)t_0$. (This will turn out to be a good enough approximation for our needs.) There are also the multiplications by the various powers of 10, but since multiplication by $10^k$ simply involves a 'shift' along (accompanied by the addition of zeros) of k places, this operation can be assumed to require a time $k \cdot t_0$. The final additions will require a time of at most $c_0 n$ for some constant $c_0$, and there are 2n of them, so this part of the calculation requires a time bounded by $c_1 n^2$ for a suitable constant $c_1$. We may assume that $t_1$, the time taken to perform a basic single digit multiplication, is not less than $t_0$. Thus the algorithm has a running time

T(n)

where $c_2$ is chosen large enough to take care of any overheads involved in the basic multiplications, $c_3$ to allow for the 2-digit addition overheads, $c_4$ to allow for the shifting overheads, and $c_5$ covers overheads in running the whole show. In other words,

for a suitable constant c.

You may well think that a multiplication algorithm for

multiplying two n-digit numbers in a time proportional to $n^2$ is

the best possible. As we show next, this is not the case at all.

There is room for considerable improvement.

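Let

$X = x_{2n} x_{2n-1} \ldots x_2 x_1$ and $Y = y_{2n} y_{2n-1} \ldots y_2 y_1$

be two 2n-digit numbers. We wish to calculate the 4n-digit product XY. To this end, split each of the numbers X,Y into a most significant (left) half and a least significant (right) half, as follows:

$X = 10^n X_l + X_r$ and $Y = 10^n Y_l + Y_r$.

Clearly,

$XY = 10^{2n} X_l Y_l + 10^n (X_l Y_r + X_r Y_l) + X_r Y_r.$

Now observe that this can be rearranged to give

$XY = (10^{2n} + 10^n) X_l Y_l + 10^n (X_l - X_r)(Y_r - Y_l) + (10^n + 1) X_r Y_r.$

Apart from various shifts and additions (including the formation of $X_l - X_r$ and $Y_r - Y_l$), only three multiplications are required here, namely

$X_l Y_l$, $(X_l - X_r)(Y_r - Y_l)$, $X_r Y_r$,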

each of which is a multiplication of two n-digit numbers. Thus,

for this algorithm (which simply reduces a single 2n-digit multi-

plication to three n-digit multiplications, and does not

completely 'solve' the problem) we have, for a suitable constant c

$T(2n) \le 3\,T(n) + cn$ (*)

How do we take care of the three n-digit multiplications? We use the same trick again (replace n by n+1 if n is odd). And so on, until you get down to basic, 1-digit products. If we do this we obtain a 'recursive' algorithm which keeps referring back to itself for smaller and smaller arguments. What is the running time for this algorithm? If we choose the constant c large enough so that $c \ge T(2)$, then by an easy induction argument using inequality (*) we see that for all $k \ge 1$,

$T(2^k) \le c\,(3^k - 2^k).$

Let $\lceil x \rceil$ denote, for any real number x, the least integer greater than or equal to x. Then, for any number n we have, from the above inequality,

$T(n) \le T(2^{\lceil \log_2 n \rceil}) \le c\,(3^{\lceil \log_2 n \rceil} - 2^{\lceil \log_2 n \rceil}) < 3c \cdot 3^{\log_2 n} = 3c \cdot n^{\log_2 3}.$

Since $\log_2 3 \approx 1.59$, this means that for some constant K,

$T(n) \le K \cdot n^{1.59}$

For 'large' values of n this will be significantly faster than

the classroom algorithm, of course. (In practice, 'large' may

mean 'greater than 4' here.)
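The recursive scheme just described can be sketched in a few lines of Python. The sketch works on ordinary integers and splits them by powers of 10 rather than on digit arrays, and it handles the sign of the middle product explicitly, since $X_l - X_r$ and $Y_r - Y_l$ may be negative; these choices, and the name `split_multiply`, are assumptions of the illustration and not details taken from the text.

```python
def split_multiply(x, y):
    """Multiply non-negative integers using three half-size products per level."""
    if x < 10 or y < 10:
        return x * y  # basic product involving a 1-digit number
    n = max(len(str(x)), len(str(y))) // 2
    p = 10 ** n
    xl, xr = divmod(x, p)  # x = 10^n * xl + xr
    yl, yr = divmod(y, p)  # y = 10^n * yl + yr
    left = split_multiply(xl, yl)
    right = split_multiply(xr, yr)
    d1, d2 = xl - xr, yr - yl
    sign = -1 if (d1 < 0) != (d2 < 0) else 1
    middle = sign * split_multiply(abs(d1), abs(d2))
    # xy = (10^2n + 10^n)*left + 10^n*middle + (10^n + 1)*right
    return left * p * p + (left + middle + right) * p + right


assert split_multiply(24, 35) == 840
assert split_multiply(12345678, 87654321) == 12345678 * 87654321
```

Python's built-in integers are already multi-precision, so nothing is gained in speed here; the point is only to show the three-product structure.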


So far all of our discussion has had a somewhat artificial

air to it, since in practice all sorts of short cuts are available

in hand calculation, and in any case no-one would ever commence

a calculation that looked too complicated to carry out in a

reasonable time. But for the computer programmer, prior knowledge

of how long it will take a program to run is very important, as

are any tricks that might be employed to speed up a calculation.

The programmer only needs to examine algorithms for addition and

multiplication when it is necessary to deal with numbers which

are too large to fit into one half a computer word (when overflow

would result when a multiplication of two such numbers were

attempted).

Multi-Precision Arithmetic is the name used to describe the

procedures for performing arithmetic on numbers larger than one-

half the computer word size. The numbers themselves have to be

stored in arrays over two or more words, and to manipulate them

one needs to use algorithms very like the ones described above.

In fact only minor changes need to be made to adapt the algorithms

we have studied to make them suitable for computer implementation.

First of all the basic 'units' involved in the computations

are not single digits but the single-word parts of the multi-word

numbers. Secondly, the basic operations in terms of which the

computations must be performed are, as you might expect, the standard

single-word arithmetical operations provided by the computer

hardware. Thirdly, since modern computers perform all of their

arithmetic in binary form rather than decimal form, it is necessary

to replace '10' by '2' throughout. Subject to these changes,

all of our discussion about algorithm running times now holds for


multi-precision arithmetic routines on a computer.

4. THE FIBONACCI SEQUENCE AND THE EFFICIENCY OF THE EUCLIDEAN

ALGORITHM

In order to investigate the efficiency of the Euclidean Algorithm

it will be helpful to introduce a famous, classical number sequence:

the Fibonacci Sequence.

The Fibonacci sequence gets its name from the great 13th

Century Italian mathematician Leonardo of Pisa, who wrote under

the name of 'Fibonacci' (from 'filius Bonacci' - son of Bonacci).

His influential work Liber Abaci, written in 1202, introduced

the Hindu-Arabic decimal number system to Western Europe. In

this book appears the following problem:

A man puts one pair of rabbits in a certain place

surrounded by a wall. How many pairs of rabbits can

be produced from that pair in a year, if the nature of

these rabbits is such that every month each pair bears

a new pair which from the second month on becomes

productive?

It does not take long to figure out that the number of pairs

of rabbits present each month is given by the sequence

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

The general rule for generating this sequence is

$u_{n+2} = u_{n+1} + u_n$ (for all n ≥ 1),

where $u_n$ is the n-th term in the sequence. Thus:

3 = 2 + 1

5 = 3 + 2

8 = 5 + 3

etc.

This sequence is now known as the Fibonacci sequence.

From the recursive definition of the Fibonacci sequence given

above, it is easy to prove the following result:

Lemma I.4.1 For any n ≥ 1, $(u_n, u_{n+1}) = 1$.

Proof: Suppose that the lemma were false, and let d > 1 divide both $u_n$ and $u_{n+1}$. Then d divides $u_{n-1} = u_{n+1} - u_n$. Hence d divides $u_{n-2} = u_n - u_{n-1}$. Continuing in this fashion we arrive eventually at the conclusion that d divides $u_1$, i.e. d | 1, which is absurd. Thus the lemma must in fact be true. □

Using the Fibonacci sequence we can easily show that there

is no upper bound on the number of steps (divisions) necessary

to calculate a gcd using the Euclidean Algorithm. Specifically:

Lemma I.4.2 Let n > 1. The number of divisions necessary to calculate $(u_n, u_{n+1})$ is exactly n.

Proof: Applying the Euclidean Algorithm to $u_n, u_{n+1}$ clearly leads to the following system of equations:

$u_{n+1} = 1 \cdot u_n + u_{n-1}$

$u_n = 1 \cdot u_{n-1} + u_{n-2}$

...

$u_3 = 1 \cdot u_2 + u_1$

$u_2 = 2 \cdot u_1 + 0$

Thus $(u_n, u_{n+1}) = 1$, and exactly n divisions have been required. □
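This lemma is easy to test by machine. The Python sketch below uses the numbering of this section, in which the sequence starts 1, 2, 3, 5, ...; the function names `fibonacci` and `count_divisions` are illustrative only.

```python
def fibonacci(n):
    """The n-th term u_n of the sequence 1, 2, 3, 5, 8, ... (n >= 1)."""
    a, b = 1, 2
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def count_divisions(a, b):
    """Number of divisions the Euclidean Algorithm performs on 0 < a < b."""
    count = 0
    while a != 0:
        b, a = a, b % a
        count += 1
    return count


# Consecutive terms u_n, u_(n+1) need exactly n divisions.
for n in range(2, 30):
    assert count_divisions(fibonacci(n), fibonacci(n + 1)) == n
```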

Closely related to the above lemma is the following result,

which shows that the Fibonacci numbers are rather special with

regards to the Euclidean Algorithm.

Lemma I.4.3 For any n > 1, $u_n$ is the least number such that there is a number $b > u_n$ for which n divisions are required in order to calculate $(u_n, b)$ using the Euclidean Algorithm.

Proof: Let a be the least number such that there is a b > a for which n divisions are required in order to calculate (a,b) using the Euclidean Algorithm. By virtue of Lemma I.4.2 we know that $u_n \ge a$, so it suffices to prove that $a \ge u_n$.

Let the Euclidean Algorithm applied to the pair (a,b) be:

$b = q_n a + r_{n-1}$  $(0 < r_{n-1} < a)$

$a = q_{n-1} r_{n-1} + r_{n-2}$  $(0 < r_{n-2} < r_{n-1})$

$r_{n-1} = q_{n-2} r_{n-2} + r_{n-3}$  $(0 < r_{n-3} < r_{n-2})$

...

$r_4 = q_3 r_3 + r_2$  $(0 < r_2 < r_3)$

$r_3 = q_2 r_2 + r_1$  $(0 < r_1 < r_2)$

$r_2 = q_1 r_1$

Now, we know that $r_2 > r_1 > 0$. Also, each $q_i$ is a natural number. Hence, working our way back through the above equations we see that:

r 2 > r l

10.u n

(by equation (1. o

Corollary I.4.5 For any n ≥ 1, $u_{5n+1}$ has at least n+1 digits.

Proof: Since $u_6 = 13$, which has 2 digits, the result is valid for n=1. The result follows by induction using the lemma. (The easy details are left as an exercise.) □

At last we are able to prove our result concerning the efficiency of the Euclidean Algorithm.

Theorem I.4.6 Let b > a > 1. In order to calculate (a,b) using the Euclidean Algorithm, at most 5k divisions are required, where k is the number of digits in a.

Proof: Let n be the number of divisions required to calculate (a,b) using the Euclidean Algorithm. We must show that n ≤ 5k. By Lemma I.4.3, $a \ge u_n$. Let d be the number of digits in $u_n$. Since d ≤ k, it suffices to prove that n ≤ 5d.

For some number t we have

5t < n ≤ 5(t+1).

Since n > 5t, Corollary I.4.5 implies that $u_n$ has at least t+1 digits, i.e. d ≥ t+1. Thus

n ≤ 5(t+1) ≤ 5d,

as required. □

Further discussion of the Euclidean Algorithm and its

efficiency is provided in the Exercises to this chapter.


5. PRIME NUMBERS

Though discussed briefly in Chapter 0, the treatment given there

was far from rigorous, so we shall here develop the theory of

prime numbers from the very beginning.

A prime number is a number p > 1 whose only divisors amongst

the integers are 1 and p (alternatively, whose only divisors

amongst the natural numbers are 1 and p). A number greater than

1 which is not prime is said to be composite.

One of the most basic properties of prime numbers is provided

by our first lemma on the subject:

Lemma I.5.1 If p is a prime and p | ab, then p | a or p | b.

Proof: Assume that p ∤ a. Thus (p,a) = 1. (By the definition of p being a prime.) So by Theorem I.2.9 (Euclid's Lemma), p | b, and we are done. □

Corollary I.5.2 If p is a prime and $p \mid a_1 a_2 \cdots a_n$, then $p \mid a_i$ for some i (1 ≤ i ≤ n).

Proof: This follows from the lemma by an easy induction argument which we leave to the reader to supply. □

Using the above corollary we can already establish one of

the most fundamental theorems of Number Theory.

Theorem I.5.3 (The Fundamental Theorem of Arithmetic.) Every

number n > 1 can be expressed as a product of prime numbers,

furthermore, this expression is unique up to the order of the

prime factors.


Proof: The theorem is certainly valid for n=2, since 2 is already

prime and hence, by convention, is a 'product' of prime numbers.

So, if we assume that the theorem is false and let n be the least

number which is not a product of primes, we have n > 2. If n

were prime it would be (by convention) a 'product' of primes.

Consequently, n cannot be prime. Thus there must be numbers

a and b such that n = ab, where both a and b are less than n.

Being less than n, a and b must be products of primes. But then

n = ab is a product of products of primes, and is thus a product

of primes, a contradiction. This proves that every number is
a product of primes.

We turn now to the uniqueness of the prime factorisation of
any number. Suppose that there were a number which had two
prime factorisations

p_1 p_2 ... p_m = q_1 q_2 ... q_n

(possibly with m ≠ n).

Then p_1 | q_1 q_2 ... q_n, so by Corollary I.5.2, p_1 divides one of
q_1, q_2, ..., q_n. By rearranging q_1, q_2, ..., q_n if necessary we may
assume that p_1 divides q_1. Since q_1 is prime and p_1 > 1, this means
that p_1 = q_1. So we can divide p_1 from the above equation to obtain

p_2 p_3 ... p_m = q_2 q_3 ... q_n.

Repeating the same argument we see that, with a possible
rearrangement of q_2, q_3, ..., q_n, p_2 = q_2, and hence that

p_3 p_4 ... p_m = q_3 q_4 ... q_n.

Continuing this process we see that it must lead to the
conclusion that m = n and (after various rearrangements)

p_1 = q_1, p_2 = q_2, ..., p_m = q_m.

This completes the proof of the theorem. □

Corollary I.5.4 Every number n > 1 can be written uniquely in
the form

n = p_1^{k_1} p_2^{k_2} ... p_r^{k_r},

where each p_i is prime, p_1 < p_2 < ... < p_r, and each k_i is a
positive integer.

Proof: Immediate. □
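
The factorisation promised by the Fundamental Theorem can be computed for small numbers by trial division. The Python sketch below is one simple way of doing so and makes no claim to efficiency; the function name prime_power_factorisation is hypothetical. It returns the pairs (p_i, k_i), with the primes in increasing order.

```python
def prime_power_factorisation(n):
    """Return [(p_1, k_1), (p_2, k_2), ...] with p_1 < p_2 < ... and
    n equal to the product of the powers p_i ** k_i."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            k = 0
            while n % p == 0:  # divide out every copy of p
                n //= p
                k += 1
            factors.append((p, k))
        p += 1
    if n > 1:  # whatever remains has no divisor up to its square root
        factors.append((n, 1))
    return factors
```

For example, prime_power_factorisation(360) returns [(2, 3), (3, 2), (5, 1)], that is, 360 = 2^3 . 3^2 . 5. Any p that divides the remaining n at a given stage must be prime, since all smaller primes have already been divided out.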

The following proof that there are infinitely many primes is

due essentially to Euclid.

Theorem I.5.5 There are infinitely many primes.

Proof: Assume, on the contrary, that there were only a finite
number of primes. Then we could list them all in increasing order as

p_1, p_2, ..., p_n.

Now form the number

P = p_1 p_2 ... p_n + 1.

Since P > p_n, P must be composite. Hence P is divisible by some
prime less than P. Thus for some k, P is divisible by p_k.
But the division of P by p_k clearly leaves a remainder of 1, so
this is impossible. This contradiction proves the theorem. □
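
Euclid's construction can be tried out on the first few primes. The Python sketch below is an illustration only (the names is_prime and euclid_numbers are hypothetical). It multiplies together the first n primes, adds 1, and reports the result. Each such number leaves remainder 1 on division by every prime used to build it, but it need not itself be prime.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def euclid_numbers(count):
    """Return [(p_n, p_1 p_2 ... p_n + 1)] for the first `count` primes."""
    result = []
    product = 1
    candidate = 2
    while len(result) < count:
        if is_prime(candidate):
            product *= candidate
            result.append((candidate, product + 1))
        candidate += 1
    return result
```

The first five values 3, 7, 31, 211, 2311 are prime, but the sixth is 30031 = 59 . 509, which is composite.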

The above proof raises the question whether infinitely many
of the numbers

P_n = p_1 p_2 ... p_n + 1,

where p_1, p_2, p_3, ..., p_n, ... enumerates the primes in order, are
themselves prime. This is not known. Nor is it known if
infinitely many of them are composite. (Presumably the answer
to both questions is 'Yes'.)

6. DIOPHANTINE EQUATIONS

In honour of the Ancient Greek mathematician Diophantus, we use

the name Diophantine Equation to refer to an equation with integer

coefficients for which a solution is sought in the integers.

The simplest non-trivial form of Diophantine equation is

the linear equation in two variables:

ax + by = c,

where a,b,c are integers and integer solutions for x,y are sought.

There may be no solutions, as is the case with the equation

6x + 8y = 13.

Or there may be many solutions. For instance, the equation

6x + 8y = 14

has the solutions x=1, y=1, and x=5, y=-2, and x=9, y=-5 (and

infinitely many more).
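
Both claims can be checked by a crude search. The Python sketch below (the function name small_solutions and the search bound are hypothetical choices) simply tries every pair x, y in a small range. It is meant only as a check on the two examples, not as a method of solution.

```python
def small_solutions(a, b, c, bound):
    """Return all (x, y) with |x|, |y| <= bound and a*x + b*y == c."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if a * x + b * y == c]
```

Here small_solutions(6, 8, 13, 50) is empty, as it must be, since 6x + 8y is always even whilst 13 is odd. On the other hand small_solutions(6, 8, 14, 10) contains the three solutions (1, 1), (5, -2) and (9, -5) listed above.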

In a moment we shall see how the Euclidean Algorithm may

be used to find the solutions to Diophantine equations directly,

but first we prove a theorem which tells us exactly when a solution

will exist, and what form the solutions will then have.


Theorem I.6.1 The Diophantine equation

ax + by = c

has a solution if and only if (a,b) | c. If (x_0, y_0) is one solution,
then all other solutions are given by

x = x_0 + (b/d)t, y = y_0 - (a/d)t,

where t is any integer, and where d = (a,b).

Proof: Suppose first that a solution exists. Then by
Corollary I.2.5 we know that d | c. So that's half the theorem
already.

Conversely, suppose that d | c, say c = dt. Pick integers
x_0, y_0 so that

a x_0 + b y_0 = d.

(By Theorem I.2.4 we know that such integers exist.) Then

c = dt = (a x_0 + b y_0)t = a(x_0 t) + b(y_0 t),

so x = x_0 t and y = y_0 t solve the equation.

Now suppose that x_0, y_0 is any solution to the equation.
Thus, if x_1, y_1 is any other solution, we will have

a x_0 + b y_0 = c = a x_1 + b y_1,

so

a(x_0 - x_1) = b(y_1 - y_0).

By Lemma I.2.7 there are relatively prime integers r,s such that
a = dr, b = ds. So

dr(x_0 - x_1) = ds(y_1 - y_0),

i.e.

r(x_0 - x_1) = s(y_1 - y_0).

Now, r | s(y_1 - y_0) and (r,s) = 1 so by Euclid's Lemma (Theorem
I.2.9), r | (y_1 - y_0). So for some integer t, y_1 - y_0 = rt.
Thus r(x_0 - x_1) = srt, which gives x_0 - x_1 = st. Thus

x_1 = x_0 - st = x_0 - (b/d)t and y_1 = y_0 + rt = y_0 + (a/d)t.

(Replacing t by -t gives the form stated in the theorem.)
Moreover, for any value of t, if x_1 and y_1 are as above, then
x_1, y_1 are solutions to the given equation, as is easily seen,
so our proof is complete. □

The existence part of the above proof indicates how the
calculation of the greatest common divisor of a,b and its
expression as a linear combination of a and b plays a role in
the solution of such an equation. We illustrate this by means
of an example.

We shall solve the Diophantine equation

210x + 1001y = 21.

First we use the Euclidean Algorithm to find (210,1001).

1001 = 4.210 + 161
 210 = 1.161 + 49
 161 = 3.49 + 14
  49 = 3.14 + 7
  14 = 2.7.


Thus (210,1001) = 7. Since 7 | 21, the equation does have a

solution. To find a solution we work back through the above

calculation to find 7 as a linear combination of 210 and 1001.

7 = 49 - 3.14
  = 49 - 3.(161 - 3.49)
  = 10.49 - 3.161
  = 10.(210 - 1.161) - 3.161
  = 10.210 - 13.161
  = 10.210 - 13.(1001 - 4.210)
  = 62.210 - 13.1001.

Thus

7 = 62.210 - 13.1001.

Multiplying through by 3 to make the left hand side equal to 21,

the constant term in the original equation, we get

21 = 210.(186) + 1001.(-39).

Thus x = 186, y = -39 is a solution to the original Diophantine
equation. All other solutions are given by

x = 186 + (1001/7)t = 186 + 143t
y = -39 - (210/7)t = -39 - 30t,

as t ranges over all integers. For instance, putting t = -1 we
obtain the solution consisting of the smallest numbers in absolute
value, namely x = 43, y = -9.
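
The whole calculation can be mechanised. The Python sketch below is one possible arrangement, not the only one: instead of working back through the divisions afterwards, it carries the coefficients of a and b forward at each step. The names extended_gcd and solve_linear_diophantine are hypothetical, and the sketch assumes that a and b are positive.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a,b) and d = a*x + b*y (a, b positive)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r  # the usual division step
        old_x, x = x, old_x - q * x  # keep old_r == a*old_x + b*old_y
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Return (x0, y0, b/d, a/d), or None if (a,b) does not divide c.

    Every solution is then x = x0 + (b/d)t, y = y0 - (a/d)t.
    """
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None
    scale = c // d
    return x * scale, y * scale, b // d, a // d
```

For the example above, extended_gcd(210, 1001) returns (7, 62, -13) and solve_linear_diophantine(210, 1001, 21) returns (186, -39, 143, 30), in agreement with the hand calculation.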

Sometimes we are only interested in solutions within a certain

range. For instance, suppose that in the above example we want


to find all positive solutions. Thus we need to find all those

values of t for which

-39 - 30t > 0 and 186 + 143t > 0.

The first of these inequalities implies that t ≤ -2 whilst the
second implies that t ≥ -1. Thus in this case we see that there

are in fact no positive solutions.
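
The same reasoning gives the admissible range of t in general. In the Python sketch below (the function name positive_solutions and its parameter names are hypothetical), the solutions are assumed to be given in the form x = x0 + step_x.t, y = y0 - step_y.t with step_x and step_y positive, as in Theorem I.6.1.

```python
def positive_solutions(x0, y0, step_x, step_y):
    """List all (x, y) with x > 0 and y > 0, where
    x = x0 + step_x * t and y = y0 - step_y * t (step_x, step_y > 0)."""
    # x > 0 means t > -x0/step_x: the least such integer t
    t_min = (-x0) // step_x + 1
    # y > 0 means t < y0/step_y: the greatest such integer t
    t_max = -((-y0) // step_y) - 1
    return [(x0 + step_x * t, y0 - step_y * t)
            for t in range(t_min, t_max + 1)]
```

With the values of the example, positive_solutions(186, -39, 143, 30) is empty, since t would have to satisfy both t ≥ -1 and t ≤ -2. For the equation 6x + 8y = 14, with particular solution x = 1, y = 1, the call positive_solutions(1, 1, 4, 3) returns the single solution (1, 1).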

EXERCISES I

SECTION 1

1. Prove each of the following statements by induction. Try

to use both methods of writing out your proof.

(i) 1 + 4 + 9 + ... + n^2 = n(n+1)(2n+1)/6.

(ii) 1 + 8 + 27 + ... + n^3 = (n(n+1)/2)^2 = (1 + 2 + 3 + ... + n)^2.

(iii) 1/(1.2) + 1/(2.3) + ... + 1/(n.(n+1)) = n/(n+1).

2. Prove that for any n,

1.(1!) + 2.(2!) + ... + n.(n!) = (n+1)! - 1.

3. Do Exercise 0.8.

SECTION 2

4. Let a,b,c be integers. Prove the following:

(i) if a | b then a | bc

(ii) if a | b and a | c then a^2 | bc

(iii) if c ≠ 0, then a | b if and only if ac | bc.


5. Prove that every odd number is of one of the forms 4n+1 or

4n+3. (In advanced work, this classification of all odd numbers

into two classes turns out to be a fundamental one. See

also Exercise 16 below.)

6. Let a,b be integers, not both zero, and let d be any number.

Prove that d = (a,b) if and only if

(i) d | a and d | b, and

(ii) whenever c | a and c | b then c | d.

7. Prove the following:

(i) if (a,b) = 1 and (a,c) = 1 then (a,bc) = 1

(ii) if (a,b) = 1 and c | a then (b,c) = 1

(iii) if (a,b) = 1 then (ac,b) = (c,b)

8. The least common multiple of two non-zero integers a and b,

written lcm(a,b), is defined to be the smallest positive

integer m such that a | m and b | m. Prove that this is always
defined and that for any positive integers a,b,

(a,b).lcm(a,b) = ab.

Deduce that for any numbers a and b, lcm(a,b) = ab if and only
if (a,b) = 1.

9. Use the Euclidean Algorithm to find the greatest common

divisor of each of the following pairs of numbers, and in

each case express the gcd as a linear combination of the two

given numbers :

56, 72;   24, 138;   119, 272;   1769, 2378.


10. Prove that the product of four consecutive integers is one

less than a perfect square.

11. Prove the following version of the Division Algorithm. Given
integers a and b with b ≠ 0, there exist unique integers q
and r such that

a = qb + r and -|b|/2 < r ≤ |b|/2.

(Hint. Write a = q'b + r', where 0 ≤ r' < |b|. If
0 ≤ r' ≤ |b|/2, let r = r' and q = q'. If |b|/2 < r' < |b|,
let r = r' - |b| and q = q'+1 if b > 0 or q = q' - 1 if
b < 0.)

SECTION 4

12. Define numbers u_n by u_0 ...

Show that

u_n ...

and

...

Show further that the smallest numbers a > b > 0 for which
the algorithm of question 11 requires n division steps are
a = u_n + u_{n-1} and b = u_n.

SECTION 5

13. Do Exercises 2 through 7 and 10 of Chapter O.

14. A classical theorem of Dirichlet says that if a and b are
relatively prime numbers, then the arithmetic progression

a, a+b, a+2b, a+3b, ..., a+kb, ...


contains infinitely many primes. Prove that no arithmetic

progression can consist entirely of primes.

15. Prove that the sequence

(n+1)! + 2, (n+1)! + 3, ..., (n+1)! + (n+1)

provides a sequence of n consecutive composite numbers.

16. Prove that there are infinitely many primes of the form 4n+3.

(There are also infinitely many primes of the form 4n+1, but

the proof of this is rather difficult.)

SECTION 6

17. Find all solutions to the following Diophantine equations:

(i) 56x + 72y = 40 ,

(ii) 221x + 91y = 117

18. Find all positive solutions to the following Diophantine

equations:

(i) 30x + 17y = 300

(ii) 54x + 21y = 906

19. Professor Euclid cashes a cheque at the bank, but the cashier

mixes up the number of pounds and the number of pence, so


instead of receiving £a.b he receives £b.a. Professor Euclid

fails to notice this, but after spending 68p he is surprised

to see that he still has twice the amount he wrote his cheque

for. What is the smallest value for which the cheque could

have been made out?

COMPUTER PROBLEMS I

1. Write a computer program which calculates n! for any given n.

(Hint. It may be a better approach to consider the following

'recursive' definition of n! :

1! = 1, (n+1)! = (n+1).(n!).

In any event, the rapid growth of n! as n increases will mean

that your program will only run for a few values of n.)

Arrange for the computer to print out the values 1!,2!,3!,etc.

as far as it will go.

2. Write a routine for carrying out multi-precision multiplication

for numbers containing twice the number of digits as your

computer allows in integers, and use this routine to extend

your program to calculate n! from Problem 1 above.

3. Write routines for the addition and the multiplication of

integers (positive or negative) of arbitrary (as far as possible)

size. Use the multiplication routine to obtain decimal print-

outs of the record prime numbers described in Chapter 0.4.

4. Write a multiplication routine for numbers occupying 2n computer

words using the 'fast' method described in section 3. Compare

its running time with that of the classical method. (This

will require your accessing the internal clock of your

computer.)

5. Multi-precision routines written commercially are usually

written in the assembly language of the computer concerned,


to enable efficient manipulation of the individual bits of

the numbers in store. If you are able to program in assembly

language, write a routine for the multiplication of two 2n

bit binary numbers using the 'fast' method described in section

3. Compare the speed of this routine with that of the

classical algorithm programmed in a high level language.

(This will require your being able to access the internal clock

of your computer.)

6. Write a program to calculate the greatest common divisor of

two given numbers using the Euclidean Algorithm. Include

in your program a count of the number of division steps required

in each calculation.

7. Fix a value of a and run your Euclidean Algorithm program to

find (a,b) for a series of different values of b > a. (Do

this by means of a loop so as to obtain a large number of runs.)

Theoretical considerations indicate that the average number

of division steps required by the Euclidean Algorithm for

varying values of b greater than a fixed value of a is

approximately 1.94 log_10 a. (This is, of course, much less
than the bound provided by Theorem I.4.6.) See how closely

your computed results agree with this theoretical estimate.

Repeat the computation for different values of the number a.

8. If multiprecision arithmetic is required, the Euclidean

Algorithm becomes a rather inefficient method for calculating

greatest common divisors, since multi-precision division

routines tend to be relatively slow. There is a simple


algorithm for calculating greatest common divisors which uses

only the operations of subtraction, testing whether a number

is even or odd (which for binary numbers involves simply looking

at the last bit), and halving even numbers (which for binary

numbers involves nothing more than a shift of the entire number

one place to the right). This algorithm depends upon the

following facts about positive numbers a and b:

(1) If a and b are both even then (a,b) = 2(a/2,b/2).

(2) If a is even and b is odd then (a,b) = (a/2,b).

(3) If a > b, then (a,b) = (a-b,b).

(4) If a and b are both odd, then a-b is even and

|a-b| < max(a,b).

Prove these facts and then use them to develop an algorithm

to calculate greatest common divisors of binary numbers.

If you can program in assembly language, write a program

which implements this algorithm, both for single precision

arithmetic and multiple precision work.
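
As a starting point, one possible arrangement of facts (1) to (4) is sketched below in Python rather than assembly language. It is an assumption on our part that the facts are combined in this order; other arrangements are possible, and the function name binary_gcd is hypothetical. Only subtraction, parity tests and shifts are used.

```python
def binary_gcd(a, b):
    """Greatest common divisor of positive integers a and b,
    using only subtraction, parity tests and one-place shifts."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    shift = 0
    while a % 2 == 0 and b % 2 == 0:  # fact (1): take out common factors of 2
        a >>= 1
        b >>= 1
        shift += 1
    # From here on at least one of a, b is odd.
    while a != b:
        if a % 2 == 0:      # fact (2): a even, b odd
            a >>= 1
        elif b % 2 == 0:    # fact (2) with the roles of a and b exchanged
            b >>= 1
        elif a > b:         # facts (3) and (4): both odd, so a - b is even
            a -= b
        else:
            b -= a
    return a << shift       # restore the factors of 2 removed at the start
```

For example, binary_gcd(56, 72) returns 8 and binary_gcd(210, 1001) returns 7.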

9. Write a program that finds a solution to a given Diophantine

equation of the form

ax + by = c,

using the Euclidean algorithm, as described in section 6.

10. Modify the program from Problem 9 to look for a positive

solution to the equation.


II Congruences

Frequently in mathematics, a real breakthrough is made simply by

regarding a familiar notion from a different viewpoint. Such

is the case with the study of the notion of congruence, which is

but a study of divisibility carried out in a special way. It

is the brain-child of the great 19th Century German mathematician

Karl Friedrich Gauss. Large parts of modern day number theory

can be traced back to their origins in Gauss' Disquisitiones

Arithmeticae, a monumental work carried out whilst Gauss was in

his early twenties. Congruences appear in the first chapter of
this volume.

1. CONGRUENCE

Let n be a fixed number. Two integers a and b are said to be
congruent modulo n, written

a ≡ b (mod n),

if and only if n | (a - b).

For example, 3 ≡ 24 (mod 7), -31 ≡ 11 (mod 7), -15 ≡ -64 (mod 7).

Given any integer a, by the Division Algorithm there are

integers q,r such that


a = qn + r (0 ≤ r < n).

By definition of congruence,

a ≡ r (mod n).
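
Both the definition and the remainder r are easily computed. The Python sketch below is an illustration only (the names is_congruent and remainder_mod are hypothetical) and assumes n > 0. It relies on the fact that Python's divmod rounds the quotient downwards, so that the remainder satisfies 0 ≤ r < n even when a is negative.

```python
def is_congruent(a, b, n):
    """True if a and b are congruent modulo n, i.e. n divides a - b."""
    return (a - b) % n == 0


def remainder_mod(a, n):
    """Return the r with a = q*n + r and 0 <= r < n (n positive)."""
    q, r = divmod(a, n)  # floor division, so r is never negative for n > 0
    return r
```

For example, is_congruent(-31, 11, 7) is true, and remainder_mod(-31, 7) returns 4, since -31 = (-5).7 + 4.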

Clearly, no two numbers less than n can be congruent modulo n

(unless they are equal), so we see that every integer a is

congruent modulo n to a unique r such that 0 ≤ r < n. The unique

number r is called the residue of a modulo n, or more precisely,

the least positive residue modulo n. This last remark is to allow

f