Discrete Maths Notes

Discrete Mathematics I (CS127)

Lecture Notes

Alexander Tiskin

University of Warwick

Autumn Term 2004/05

This course introduces some of the fundamental mathematical ideas that are used in the design and analysis of computer systems and software. The course makes you familiar with basic concepts and notation, helps you to develop a good understanding of mathematical proofs, and enables you to apply mathematics to solving computer science problems.

Problem sheets and seminars

The course is accompanied by a series of problem sheets relating to topics covered in the lectures. To develop proper understanding it is essential that you try to solve these problems during your own private study. The seminars provide an opportunity to get help with difficulties experienced in tackling the problem sheets, or with understanding the material from lectures. Please sign up for a weekly seminar at a time which suits you, and do attend it. Your performance at seminars will not be assessed, so nothing can prevent you from showing your solutions, whatever your confidence level in them might be. Confidence tends to grow with practice, and so does your exam potential.

Lecture notes and books

The lecture notes are self-contained, but you may find it helpful also to consult some books. The library contains several which cover all or part of the course syllabus, and exploration of the catalogue and shelves is recommended. The three books listed below are all suitable. They are in the library and should be available in the University bookshop. They cover the material in different ways and in different styles. It is suggested that you look at them all, to see which one you find most accessible.

K. A. Ross and C. R. B. Wright, Discrete Mathematics (5th ed.), Prentice Hall, 2003.

K. H. Rosen, Discrete Mathematics and Its Applications (5th ed.), McGraw-Hill, 2003.

J. K. Truss, Discrete Mathematics for Computer Scientists (2nd ed.), Addison-Wesley, 1999.

Another book well worth considering is

E. Bloch, Proofs and Fundamentals: a First Course in Abstract Mathematics, Birkhäuser, 2002.

It is less suitable as a general reference for the course material, but instead concentrates on what is arguably its most important aspect: the concept of a proof. It is very clearly written, and in many respects complements the books on the course's main reading list.

Electronic resources

As the course progresses, the material will be available on the course website: http://www.dcs.warwick.ac.uk/~tiskin/teach/dm1.html . The Rosen book has a website of its own: http://www.mhhe.com/rosen .

A forum (discussion group) on Warwick Forums has been set up to exchange messages relevant to the course. In the past, it proved to be a useful tool for communication within the CS127 student population, and also between students and tutors. The forum is available at http://forums.warwick.ac.uk . The University IT Services should be able to help in case of any problems with accessing this forum. As with all discussion groups, its abuse will not be tolerated.

Assessment

One of the main challenges of the course is the lack of continuous coursework assessment. This means that you have to work hard, without being forced to. The course is assessed by a two-hour examination in week 1 of Summer Term. Results of this and other exams will be announced at the end of the academic year.

A new element of the course introduced last year is the class test, which will be held in week 7 of Autumn Term. The test will consist of a one-hour paper with 20 true or false questions, to be answered on specially prepared sheets, which then will be scanned and marked automatically. The resulting mark will not contribute to your official course assessment, and the class test itself is not mandatory. However, it is strongly recommended to take the test, in order to get feedback on your progress and to prepare yourself for the Summer Term exam.

1 A Brief Tour of the Discrete Mathematics Zoo

Mathematics studies concepts that are abstract, idealised images of the real world. An example of such a concept is natural numbers:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . .

We all learn it in early childhood, yet nobody has ever seen "three", as opposed to "three oranges" or "figure 3 in black ink in the top-right corner of this page".

A philosopher would say here: well, our concept of "three" captures the "threeness" of all three-element sets that we have seen before or may see in future: three apples, three penguins, or two sheep with a sheepdog in the field. Number 0 can be accommodated by this view as well: it represents an empty set, a set that contains nothing.

While the philosopher's answer makes a lot of sense, it is also true that in mathematics, concepts depart from immediate reality, and start to live a life of their own. Consider, for example, the notion of a set, which our friend the philosopher has used to define natural numbers. We can have a set of apples or penguins, so why not think about sets of numbers? Say, the set of this week's National Lottery winning numbers: $\{14, 20, 25, 32, 47, 49\}$ (note the use of curly brackets to denote a set). We could then think of some more interesting (in my opinion) examples, such as the set $\mathbb{N}$ of all natural numbers:

$\mathbb{N} = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, \ldots\}$,

the set of all integers (natural numbers and their negatives):

$\mathbb{Z} = \{\ldots, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, \ldots\}$,

or the set of all even integers:

$\{\ldots, -10, -8, -6, -4, -2, 0, 2, 4, 6, 8, 10, \ldots\}$.

For a mathematician, the last three sets are just as legitimate as a set of three apples. However, there is a crucial difference: the new sets are infinite. Infinite sets do not occur in reality: even the number of atoms in the Universe is finite. Yet, we have just imagined a few infinite sets. Even if we cannot write down the elements of these sets without resorting to the ". . ." notation, we can capture these sets in our mind, and treat them as we would treat any real-world set.

Of course, to make our theory of sets useful, we will have to answer some important questions:

- do infinite sets have a size? (yes they do, but of course these sizes are beyond natural numbers);

- can two infinite sets have different size? (yes, their sizes can vary so greatly it is hard to imagine even for a mathematician);

- can one form a set of all sets? (no, this is asking too much: one cannot even form a set of all possible set sizes).

This sort of question cannot be answered from any empirical evidence: infinite sets simply do not exist in reality. At this point, we are confronted with a major distinction between mathematics and natural sciences: instead of experiments, mathematics relies on proofs. The answers to the above questions given in brackets can be formally and unambiguously proved to be correct. Experimentation also plays a role in mathematics, but rather a supporting one: it helps our intuition to understand the concepts and find the right idea for a proof. For example, to answer the first question above, we could think of various infinite sets that can be imagined, and ask ourselves if they are likely to have a sensible notion of size. Then we would formulate this notion precisely, and prove that it satisfies all the properties that we associate with size: for example, by adding new elements to a set we cannot decrease its size. The same approach can be applied to the other two questions. For the final question, this approach has an additional twist: we want to prove that a certain object (a set of all sets) does not exist. In order to tackle this, we imagine that it does exist, and try to consider all consequences of its existence. Somewhere in our reasoning we come to a contradiction (it turns out that the set of all sets cannot be assigned any sensible size). The contradiction proves that the object we imagined (the set of all sets) cannot exist without violating the basic laws of logic.

To be able to write down our proofs, we need a language that is both precise (does not allow any ambiguity) and concise (allows us to express complicated ideas relatively briefly). We should indicate exactly the concepts that we consider basic, i.e. that require no definition. For example, a set and a natural number are basic concepts. All concepts that are not basic must be given a formal definition. For example, we will have to define "finite set" and "even number". We should also indicate exactly the statements that we consider to be axioms, i.e. that require no proof. For example, "two sets are the same if they consist of the same elements" is an axiom. All statements that we hold to be true, but that are not axioms, are called theorems; they must be given formal proofs. For example, we will have to prove the answers that we gave to the above list of questions on infinite sets.

This approach to mathematics is called the axiomatic method. It requires a special language and a set of proof rules known as logic. Logic is a part of mathematics both as a tool and an object of study; we will see some details of it in the beginning of our Discrete Mathematics course. Concepts and laws of logic allow us to formalise ways of reasoning that we learn together with our mother tongue:


All eagles can fly;
Some pigs cannot fly.

Therefore, some pigs are not eagles.

The conclusion seems obvious, but in mathematics we must know a precise reason why it follows from the two given conditions. Firstly, we must define exactly the class of all things that these statements speak about: suppose this is the class of all living creatures. The first condition can be reformulated as follows: "If a creature is an eagle, it can fly". The laws of logic tell us that this is equivalent to saying: "If a creature cannot fly, it is not an eagle". The second condition says that there is a creature which is a pig and cannot fly. Taken together with the previous statement, this leads to the conclusion being proved: there is a creature which is a pig and cannot fly, and therefore is not an eagle. Note that we can only prove what logically follows from the given conditions. For example, we do not have enough information to conclude that all pigs are not eagles.
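The argument above can also be checked mechanically on small examples. The following is only a sketch under assumptions that are not part of the notes: a creature is modelled by three Boolean properties, and a "world" consists of exactly two creatures. The sketch enumerates all such worlds in which both conditions hold, and tests which conclusions hold in every one of them.

```python
from itertools import product

# Assumption of this sketch: a creature is a triple (is_eagle, is_pig, can_fly).
CREATURES = list(product([False, True], repeat=3))

def premises_hold(world):
    """Both given conditions: all eagles can fly; some pig cannot fly."""
    all_eagles_fly = all(can_fly for is_eagle, _, can_fly in world if is_eagle)
    some_pig_cannot_fly = any(is_pig and not can_fly for _, is_pig, can_fly in world)
    return all_eagles_fly and some_pig_cannot_fly

def some_pig_is_not_eagle(world):
    return any(is_pig and not is_eagle for is_eagle, is_pig, _ in world)

def no_pig_is_eagle(world):
    return all(not is_eagle for is_eagle, is_pig, _ in world if is_pig)

# Every world consisting of two creatures, filtered by the two conditions.
WORLDS = list(product(CREATURES, repeat=2))
GOOD_WORLDS = [w for w in WORLDS if premises_hold(w)]

conclusion_always_follows = all(some_pig_is_not_eagle(w) for w in GOOD_WORLDS)
stronger_claim_always_follows = all(no_pig_is_eagle(w) for w in GOOD_WORLDS)
```

In this model the conclusion holds in every admissible world, while the stronger claim fails, for instance, in a world containing a flying creature that is both a pig and an eagle together with a pig that cannot fly. A finite check of this kind supports intuition; it does not replace the proof.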

Armed with logic, we will take a closer look at sets, and will introduce two concepts that are central to all mathematics: relation between elements of two sets, and function from one set to another. We will study different types of relations and functions, and eventually will consider graphs, an especially powerful concept in dealing with complicated sets, in particular the ones occurring in computer science.

In summary, the basic ingredients of our course are sets and natural numbers, glued together by logic. We will use these ingredients to build more complicated structures, and will apply the axiomatic method to study their properties. A lot of emphasis will be put on being able to prove facts, rather than just memorise them. This ability, of course, comes only with practice: hence the weekly problem sheets and seminars to discuss your solutions. Please do attempt to solve the problems, and be active at the seminars: our subject is discrete, rather than discreet, mathematics.


2 Logic

2.1 Statements and operators

We use all sorts of sentences in everyday speech. Our language has special ways in which we can communicate information, ask a question, give a command, express our thoughts, feelings or emotions. In mathematics, however, we restrict ourselves to only one type of sentence: statements, which must be either true or false. Here are some examples of statements:

Five is less than ten.
Pigs can fly.
There is life on Mars.

Note that we know the last statement must be true or false, despite the fact that we cannot decide between true and false from our present knowledge.

Here are some examples of sentences that are not statements:

Welcome to Tweedy's farm!
What's in the pies?
It's not as bad as it seems...

The last sentence will become a statement if we substitute the name of a particular object for the pronoun "it". Of course, we must also give a clear, unambiguous definition of "bad", "seems", etc.

Thus, every statement has a value taken from the set B = {F, T}. The two elements of this set are called Boolean values. There are special operations, called Boolean operators, that one can perform on Boolean values (rather like addition and multiplication on natural numbers):

- negation (logical NOT), denoted $\neg$;
- conjunction (logical AND), denoted $\wedge$;
- disjunction (logical OR), denoted $\vee$;
- implication (IF ... THEN ...), denoted $\Rightarrow$;
- equivalence (... IF AND ONLY IF ...), denoted $\Leftrightarrow$.

The negation (NOT) operator simply reverses the value of a statement to the opposite value. We can define the action of operator $\neg$ applied to a statement $A$ by the following truth table:

$A$   $\neg A$
F     T
T     F

The conjunction (AND) operator applies to two separate statements. The conjunction of $A$ and $B$ is true when both $A$ and $B$ are true; the conjunction is false when either $A$, or $B$ (or both) are false. Thus, operator $\wedge$ can be defined by the following truth table:

$A$   $B$   $A \wedge B$
F     F     F
F     T     F
T     F     F
T     T     T

The disjunction (OR) operator also applies to two separate statements, and is complementary to conjunction. The disjunction of $A$ and $B$ is true when either $A$ or $B$ (or both) are true; the disjunction is false when both $A$ and $B$ are false. Here is the truth table for $\vee$:

$A$   $B$   $A \vee B$
F     F     F
F     T     T
T     F     T
T     T     T

The two statements connected by conjunction or disjunction do not need to be related in any way. Thus,

$(5 < 10) \wedge (\text{Pigs can fly})$ means $T \wedge F$ means $F$
$(5 < 10) \vee (\text{Pigs can fly})$ means $T \vee F$ means $T$

The same applies to statements connected by implication (IF ... THEN ...). In ordinary life, we usually think of implication as a cause-effect relationship: "if the bird is happy, then it sings loud". This relationship is one-way: if the bird sings, it does not necessarily mean that it is happy; perhaps there are other reasons for a bird to sing. And, if the bird is not happy, we cannot conclude whether it should sing or not, so we must accept both possibilities. The same reasoning applies in logic, but here the statements connected by implication do not have to be related. For any two statements $A$, $B$, the value of the implication is determined by the truth table:

$A$   $B$   $A \Rightarrow B$
F     F     T
F     T     T
T     F     F
T     T     T

Thus, a false statement implies anything, no matter true or false, but a true statement can only imply another true statement.

The equivalence operator (... IF AND ONLY IF ...) can be thought of as the two-way version of implication: $A$ is equivalent to $B$, when $A$ implies $B$, and $B$ implies $A$. In other words, the values of $A$ and $B$ must agree: either both true, or both false. Here is the truth table:

$A$   $B$   $A \Leftrightarrow B$
F     F     T
F     T     F
T     F     F
T     T     T
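The truth tables of this section can be reproduced with a few lines of code. This is a minimal sketch, assuming Python's built-in `bool` values stand for T and F; the names `OPERATORS`, `truth_table` and `show` are chosen here for illustration only.

```python
from itertools import product

# The four binary operators, each written to mirror its verbal description.
OPERATORS = {
    "AND": lambda a, b: a and b,               # true when both are true
    "OR": lambda a, b: a or b,                 # true when at least one is true
    "IMPLIES": lambda a, b: not (a and not b), # false only for T, F
    "IFF": lambda a, b: a == b,                # true when the values agree
}

def negation(a):
    return not a

def truth_table(op):
    """Rows (A, B, op(A, B)) in the order F F, F T, T F, T T."""
    return [(a, b, op(a, b)) for a, b in product([False, True], repeat=2)]

def show(value):
    return "T" if value else "F"

if __name__ == "__main__":
    for name, op in OPERATORS.items():
        print(f"A B  A {name} B")
        for a, b, result in truth_table(op):
            print(show(a), show(b), "", show(result))
```

Running the script prints one table per operator, row by row in the same order as the tables above.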

Implication and equivalence play a special role in mathematics. Many mathematical theorems have the form

if A then B

or

A implies B

sometimes disguised as

A is sufficient for B

or

B is necessary for A

The meaning of all these sentences is the same: $A \Rightarrow B$. A standard way of proving such theorems is by a chain of implications:

$A \Rightarrow P_1 \Rightarrow P_2 \Rightarrow \ldots \Rightarrow P_n \Rightarrow B$

where $P_1, P_2, \ldots, P_n$ are some statements, chosen so that every implication in the chain can be proved in one step.

Another common form of theorems is

A if and only if B

often disguised as

A and B are equivalent

or

A is necessary and sufficient for B

or

B is necessary and sufficient for A

The meaning of all these is $A \Leftrightarrow B$. A standard way of proving such theorems is by a chain of equivalences:

$A \Leftrightarrow P_1 \Leftrightarrow P_2 \Leftrightarrow \ldots \Leftrightarrow P_n \Leftrightarrow B$

where $P_1, P_2, \ldots, P_n$ are some statements, chosen so that every equivalence in the chain can be proved in one step.


2.2 Laws of logic

The truth tables completely define Boolean operators, so, in principle, the truth value of any compound statement, however complicated, can be found by a series of truth table lookups. In practice, we often want an easier and more intuitive method of dealing with compound statements. One such method consists in applying certain properties of Boolean operators, known as the laws of logic. From the formal point of view, these laws do not add anything new to the operator definitions: each of the laws follows directly from the truth tables. However, the laws offer an alternative, complementary approach to logic, and are widely applicable. Many of these laws are similar to the properties of arithmetic operators $+$ and $\cdot$.

In the following formulas, letters $A$, $B$, $C$ stand for arbitrary statements. The statements of the laws are always true, irrespective of the truth values of $A$, $B$, $C$.

The first group of laws involves only one operator and at most two elementary statements each.

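$\neg\neg A \Leftrightarrow A$   (double negation law)
$A \wedge A \Leftrightarrow A$,   $A \vee A \Leftrightarrow A$   (idempotence of $\wedge$, $\vee$)
$A \wedge B \Leftrightarrow B \wedge A$,   $A \vee B \Leftrightarrow B \vee A$   (commutativity of $\wedge$, $\vee$)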

The double negation law is similar to $-(-a) = a$, and the commutativity laws correspond to $a \cdot b = b \cdot a$ and $a + b = b + a$. The idempotence laws have no direct counterparts in arithmetic.

The second group of laws involves more than one operator, and/or more than two elementary statements each.

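$(A \wedge B) \wedge C \Leftrightarrow A \wedge (B \wedge C)$   (associativity of $\wedge$)
$(A \vee B) \vee C \Leftrightarrow A \vee (B \vee C)$   (associativity of $\vee$)
$A \wedge (B \vee C) \Leftrightarrow (A \wedge B) \vee (A \wedge C)$   (distributivity of $\wedge$ over $\vee$)
$A \vee (B \wedge C) \Leftrightarrow (A \vee B) \wedge (A \vee C)$   (distributivity of $\vee$ over $\wedge$)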

The associativity laws correspond to the arithmetic laws $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ and $(a + b) + c = a + (b + c)$. These laws allow us to write $A \wedge B \wedge C$ and $A \vee B \vee C$ without any brackets, just as we write $a \cdot b \cdot c$ and $a + b + c$. The first distributivity law corresponds to $a \cdot (b + c) = a \cdot b + a \cdot c$. The second distributivity law has no counterpart in arithmetic, since, in general, $a + b \cdot c \neq (a + b) \cdot (a + c)$.
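Since each law follows from the truth tables, it can be confirmed by trying every assignment of truth values. The helper `holds_for_all` below is a hypothetical name introduced for this sketch; the last lines give one concrete instance of the arithmetic inequality mentioned above.

```python
from itertools import product

def holds_for_all(law, variables=3):
    """True if the law (a function returning a bool) holds for every T/F assignment."""
    return all(law(*values) for values in product([False, True], repeat=variables))

# Associativity of AND and of OR.
assoc_and = lambda a, b, c: ((a and b) and c) == (a and (b and c))
assoc_or = lambda a, b, c: ((a or b) or c) == (a or (b or c))

# Distributivity of AND over OR, and of OR over AND.
dist_and_over_or = lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c))
dist_or_over_and = lambda a, b, c: (a or (b and c)) == ((a or b) and (a or c))

# Arithmetic counterpart of the second distributivity law fails, e.g. for a = b = c = 1.
a, b, c = 1, 1, 1
left_side = a + b * c           # 1 + 1*1
right_side = (a + b) * (a + c)  # (1 + 1)*(1 + 1)
```

For example, `holds_for_all(dist_or_over_and)` checks all eight assignments of the second distributivity law, while `left_side` and `right_side` differ.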

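Note that in all laws so far, we can replace all symbols $\wedge$ by $\vee$, and, simultaneously, all symbols $\vee$ by $\wedge$. The resulting statement will still be true for any $A$, $B$, $C$. This is a general rule that applies to all laws for $\neg$, $\wedge$, $\vee$ that we introduce in this section, provided that in laws mentioning the values T and F explicitly, these two values are swapped as well.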


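The following pair of laws, called De Morgan's laws, describes the close relationship between operators $\wedge$, $\vee$.

$\neg(A \wedge B) \Leftrightarrow \neg A \vee \neg B$
$\neg(A \vee B) \Leftrightarrow \neg A \wedge \neg B$

These two laws allow us to express $\wedge$ via $\vee$ and $\neg$:

$A \wedge B \Leftrightarrow \neg\neg(A \wedge B) \Leftrightarrow \neg(\neg A \vee \neg B)$

and $\vee$ via $\wedge$ and $\neg$:

$A \vee B \Leftrightarrow \neg\neg(A \vee B) \Leftrightarrow \neg(\neg A \wedge \neg B)$

This means that any one of the two operators $\wedge$, $\vee$ is redundant: we can rewrite any statement without using either one or the other. Of course, it is usually more convenient to use both.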

Another group of laws deals with the case of known truth values appearing explicitly in compound statements:

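$A \wedge T \Leftrightarrow A$,   $A \vee F \Leftrightarrow A$   (identity laws)
$A \wedge F \Leftrightarrow F$,   $A \vee T \Leftrightarrow T$   (annihilation laws)
$A \wedge \neg A \Leftrightarrow F$,   $A \vee \neg A \Leftrightarrow T$   (excluded middle)
$A \wedge (A \vee B) \Leftrightarrow A$,   $A \vee (A \wedge B) \Leftrightarrow A$   (absorption laws)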

Identity laws correspond to $a \cdot 1 = a$, $a + 0 = a$. An arithmetic annihilation law does not hold for addition, but holds for multiplication: $a \cdot 0 = 0$. An arithmetic analogue of the law of excluded middle does not hold for multiplication, but holds for addition: $a + (-a) = 0$.

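Finally, the following two laws completely describe the two remaining Boolean operators, $\Rightarrow$ and $\Leftrightarrow$:

$(A \Rightarrow B) \Leftrightarrow (\neg A \vee B)$
$(A \Leftrightarrow B) \Leftrightarrow (A \Rightarrow B) \wedge (B \Rightarrow A) \Leftrightarrow (A \wedge B) \vee (\neg A \wedge \neg B)$

Again, both $\Rightarrow$ and $\Leftrightarrow$ are formally redundant, but, as we mentioned before, very useful in practice.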
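To see concretely that implication and equivalence are redundant, one can compare their truth tables with the rewritten forms that use only NOT, AND and OR. In this sketch the dictionaries `IMPLIES` and `IFF` are simply the truth tables of Section 2.1 typed in by hand; the function names are illustrative.

```python
from itertools import product

BOOLS = [False, True]

# Truth tables from Section 2.1, keyed by the pair (A, B).
IMPLIES = {(False, False): True, (False, True): True,
           (True, False): False, (True, True): True}
IFF = {(False, False): True, (False, True): False,
       (True, False): False, (True, True): True}

def implies_via_or(a, b):
    # (NOT A) OR B
    return (not a) or b

def iff_via_implications(a, b):
    # (A => B) AND (B => A)
    return IMPLIES[(a, b)] and IMPLIES[(b, a)]

def iff_via_and_or(a, b):
    # (A AND B) OR (NOT A AND NOT B)
    return (a and b) or ((not a) and (not b))

law_for_implication = all(
    IMPLIES[(a, b)] == implies_via_or(a, b) for a, b in product(BOOLS, repeat=2)
)
law_for_equivalence = all(
    IFF[(a, b)] == iff_via_implications(a, b) == iff_via_and_or(a, b)
    for a, b in product(BOOLS, repeat=2)
)
```

Both flags come out true, so every occurrence of implication or equivalence in a compound statement could, in principle, be rewritten using the other three operators.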

All the above laws are in fact theorems, and proving them is a good exercise in applying truth tables. Here is a table that proves one of De Morgan's laws, $\neg(A \wedge B) \Leftrightarrow (\neg A \vee \neg B)$:

$A$   $B$   $A \wedge B$   $\neg(A \wedge B)$   $\neg A$   $\neg B$   $\neg A \vee \neg B$
T     T     T              F                    F          F          F
T     F     F              T                    F          T          T
F     T     F              T                    T          F          T
F     F     F              T                    T          T          T
                           *                                          *

The columns for the two sides of the law (marked *) are identical, hence their truth values agree for any $A$, $B$.

We can use our laws of logic to prove new theorems. Here is an example.

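Theorem 1 (Principle of proof by contradiction). For any statements $A$, $B$, we have $(A \Rightarrow B) \Leftrightarrow (\neg B \Rightarrow \neg A)$.

Proof. We apply the law for $\Rightarrow$, then the law of double negation, commutativity of $\vee$, and finally the law for $\Rightarrow$ once again, this time in the opposite direction.

$(\neg B \Rightarrow \neg A) \Leftrightarrow (\neg\neg B \vee \neg A) \Leftrightarrow (B \vee \neg A) \Leftrightarrow (\neg A \vee B) \Leftrightarrow (A \Rightarrow B)$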

The above theorem gives us a useful generic proof method. When we are given a statement $A$, and we are asked to prove a statement $B$, we may start by assuming that $B$ is false (i.e. $\neg B$ holds), and then show that a statement contradicting $A$ (i.e. $\neg A$) follows from our assumption. The principle of proof by contradiction tells us that in this case, $B$ must be a logical consequence of $A$.
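As a cross-check of the proof of Theorem 1, each statement in the chain of equivalences can be evaluated for all four combinations of truth values. This is a sketch only; `implies` and `proof_chain` are names made up here, and the comments indicate which law links each step to the previous one.

```python
from itertools import product

def implies(p, q):
    # Truth table of implication: false only when p is true and q is false.
    return not (p and not q)

def proof_chain(a, b):
    """Values of the five statements in the chain, from left to right."""
    return [
        implies(not b, not a),     # (not B) => (not A)
        (not (not b)) or (not a),  # law for =>
        b or (not a),              # double negation
        (not a) or b,              # commutativity of OR
        implies(a, b),             # law for =>, in the opposite direction
    ]

# The chain is valid if all five statements agree for every choice of A and B.
chain_is_valid = all(
    len(set(proof_chain(a, b))) == 1 for a, b in product([False, True], repeat=2)
)
```

The check confirms that no step of the chain changes the truth value, which is exactly what a chain of equivalences requires.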

2.3 Predicates and quantified statements

Statements we have been making so far declared facts about specific objects:

Five is less than ten.
The pie is not as bad as it looks.

Often we need more than that: we want to declare a fact about a specific set of objects. For example, we could say:

Some natural numbers are less than ten.
All pies are not as bad as they look.

In the first case, we could try to come up with a specific example that proves it: say, five is less than ten. In the second case, we could restrict our attention to a finite number of possible pies; let this set be {Chicken pie, Mushroom pie, Cabbage pie}. Then the statement "All pies are not as bad as they look" is a conjunction:

(Chicken pie is not as bad as it looks) $\wedge$ (Mushroom pie is not as bad as it looks) $\wedge$ (Cabbage pie is not as bad as it looks)

There are problems with both these approaches. In the first case, it was easy to find a specific instance (five) that proved our statement; for other statements, it could be much harder. We would like to have a way of saying "some numbers are less than ten" without having to show a specific example. In the second case, the chosen set of pies was too small; in reality, there are millions of individual pies, so our statement has to be a conjunction of a huge number of individual statements. This would be hard to deal with if we were to use it in proofs. Furthermore, this approach would completely fail if the statement were about all possible pies, and then it turned out that this set is infinite. We would like to have a way of making a statement about all elements or some elements of any set, including infinite ones.

We achieve the stated goal by using the notion of a predicate. A predicate is simply a sentence containing variables ranging over a particular set. The set of values for a variable is called the range of that variable. We will always assume that the range is nonempty. The sentence must become true or false when an element of the range is substituted for every variable. Here are some examples:

Number x is less than ten.
Pie p is not as bad as it looks.

Here $x$ is a variable that stands for a member of set $\mathbb{N}$ (i.e. ranges over $\mathbb{N}$), and $p$ is a variable that stands for a member of the set of all pies (i.e. ranges over that set). Of course, in the latter case we must specify precisely what we understand by "all pies".

A predicate may contain more than one variable. For example, these are valid predicates:

Number x is less than number y.
Pie p is better than pie q.

Ordinary statements can be regarded as a special case of predicates, containing zero variables, for example:

Number 5 is less than number 10.
This chicken pie is better than that apple pie.

In the latter case, we assume that we are talking about two specific, well-defined pies.

A predicate with more than one variable can be made a statement by substituting a specific element of the range for every variable. A different way of making a statement from a predicate is by using quantifiers. Let us denote by $P(x)$ a predicate with the variable $x$. There are two quantifiers:

- existential (FOR SOME $x$, $P(x)$): $\exists x : P(x)$;
- universal (FOR ALL $x$, $P(x)$): $\forall x : P(x)$.

Here, the range of $x$ (i.e. the set from which $x$ is taken) is implicit. Often, we want to specify the range of a variable. The above examples can be written as:

$\exists x \in \mathbb{N} : x < 10$
$\forall p \in \mathit{Pies} : p \text{ is not as bad as it looks}$

The sign $\in$ stands for "belongs", and denotes the membership of an element in a set. The general form is

$\exists x \in S : P(x)$
$\forall x \in S : P(x)$

With predicates having more than one variable, we can write more complicated quantified statements:

$\forall x \in \mathbb{N} : \exists y \in \mathbb{N} : x < y$
$\exists y \in \mathbb{N} : \forall x \in \mathbb{N} : x < y$

Note that the meaning, and even the truth value, of the above two statements is different: the first one is true (for every natural number, there is a greater number), while the second is false (it claims that there is a natural number greater than all natural numbers). In general, the meaning of a quantified statement depends on the order of the quantifiers.
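On a finite range, the dependence on quantifier order can be observed directly, with Python's `all` playing the role of the universal quantifier and `any` that of the existential one. The range `S` and the predicate `P` below are hypothetical, chosen only because they give different answers for the two orders; they are not the example from the notes, whose range is infinite.

```python
S = range(3)  # a small finite range {0, 1, 2}, chosen only for illustration

def P(x, y):
    # A hypothetical two-variable predicate: "x + y is divisible by 3".
    return (x + y) % 3 == 0

# For all x there is some y with P(x, y): y may depend on x.
forall_exists = all(any(P(x, y) for y in S) for x in S)

# There is some y such that P(x, y) for all x: one y must work for every x.
exists_forall = any(all(P(x, y) for x in S) for y in S)
```

Here `forall_exists` is true (take y = 0, 2, 1 for x = 0, 1, 2 respectively), whereas `exists_forall` is false, since no single y works for every x.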

The meaning of a quantified statement does not change if we change the quantifier variable consistently throughout the statement. For example, we can write:

$\exists z \in \mathbb{N} : z < 10$
$\forall \mathit{pi} \in \mathit{Pies} : \mathit{pi} \text{ is not as bad as it looks}$

The variable in a quantified statement is only defined within the statement; it is not visible from outside. In programming, such variables are called local. In mathematics, we call them dummies, or bound variables. In the examples above, variables $x$, $z$ are bound by the quantifier $\exists$, and variables $p$, $\mathit{pi}$ are bound by the quantifier $\forall$. In contrast, a variable in a predicate not bound by any quantifier (such as $x$ in $P(x)$ or $z$ in $z < 10$) is called free. We have the following laws of changing the bound variable:

$\exists x : P(x) \Leftrightarrow \exists y : P(y)$
$\forall x : P(x) \Leftrightarrow \forall y : P(y)$

As we have seen before, a universally quantified statement with a finite range $S = \{a_1, \ldots, a_n\}$ can be expressed by a conjunction:

$\forall x \in S : P(x) \Leftrightarrow P(a_1) \wedge \cdots \wedge P(a_n)$

Similarly, an existentially quantified statement with a finite range can be expressed by a disjunction:

$\exists x \in S : P(x) \Leftrightarrow P(a_1) \vee \cdots \vee P(a_n)$


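These equivalences do not hold for an infinite $S$, since their right-hand sides would not be well-defined. However, the following laws will hold for any nonempty range, finite or infinite:

$\forall x : T \Leftrightarrow T$,   $\exists x : T \Leftrightarrow T$
$\forall x : F \Leftrightarrow F$,   $\exists x : F \Leftrightarrow F$

$(\forall x : P(x)) \Rightarrow (\exists x : P(x))$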

In predicate logic, we also have the following analogue of De Morgan's laws:

$\neg \forall x : P(x) \Leftrightarrow \exists x : \neg P(x)$
$\neg \exists x : P(x) \Leftrightarrow \forall x : \neg P(x)$

On a finite range, these laws can be proved by the laws of Boolean logic, using properties of conjunction for $\forall$, and those of disjunction for $\exists$. On an infinite range, the new laws must be taken as axioms.
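For a finite range, the quantifier laws of this section can be tested exhaustively: a predicate on a three-element range is just a choice of T or F for each element, so there are only eight predicates to try. The sketch below assumes a hypothetical range `S` with placeholder element names; `all` computes the conjunction over the range and `any` the disjunction.

```python
from itertools import product

S = ["a1", "a2", "a3"]  # a hypothetical finite, nonempty range

def all_predicates(elements):
    """Yield every predicate on the range, as a dict mapping element -> truth value."""
    for values in product([False, True], repeat=len(elements)):
        yield dict(zip(elements, values))

def laws_hold(P):
    forall = all(P[x] for x in S)  # P(a1) AND P(a2) AND P(a3)
    exists = any(P[x] for x in S)  # P(a1) OR P(a2) OR P(a3)
    not_forall_law = (not forall) == any(not P[x] for x in S)
    not_exists_law = (not exists) == all(not P[x] for x in S)
    forall_implies_exists = (not forall) or exists
    return not_forall_law and not_exists_law and forall_implies_exists

all_laws_hold = all(laws_hold(P) for P in all_predicates(S))

# On an empty range "for all" is vacuously true while "for some" is false,
# so "for all implies for some" would fail there.
empty_forall, empty_exists = all([]), any([])
```

The last line illustrates why the range is assumed to be nonempty: over an empty range the universal statement holds but the existential one does not.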

When several predicates are involved in a quantified statement, all the usual laws of Boolean logic apply to these predicates. However, when we introduce a quantifier, we must be careful not to "capture" inadvertently any existing free variables, or any variables bound by other quantifiers. For example, the statement $(\exists x : P(x)) \wedge (\exists x : Q(x))$ is, in general, not equivalent to $\exists x : (P(x) \wedge Q(x))$. This is because in the former statement, $P(x)$ and $Q(x)$ may be satisfied by different values of $x$, whereas in the latter statement the value of $x$ must be the same for both $P$ and $Q$. We can make this argument even more forceful by replacing the first statement by its logical equivalent: $(\exists x : P(x)) \wedge (\exists y : Q(y))$. By a similar reasoning, there is no equivalence between the statements $(\forall x : P(x)) \vee (\forall x : Q(x))$ and $\forall x : (P(x) \vee Q(x))$, since the former is equivalent to $(\forall x : P(x)) \vee (\forall y : Q(y))$. However, the following equivalences hold:

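$(\forall x : P(x)) \wedge (\forall x : Q(x)) \Leftrightarrow \forall x : (P(x) \wedge Q(x))$
$(\exists x : P(x)) \vee (\exists x : Q(x)) \Leftrightarrow \exists x : (P(x) \vee Q(x))$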

As before, they can be proved by laws of Boolean logic for a finite range, but must be taken as axioms when the range is infinite.
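A concrete counterexample makes the warning about capturing variables tangible. In this sketch the range `S` and the predicates `P` ("x is even") and `Q` ("x is odd") are hypothetical choices for illustration.

```python
S = [0, 1, 2, 3]  # a hypothetical finite range

def P(x):
    return x % 2 == 0  # "x is even"

def Q(x):
    return x % 2 == 1  # "x is odd"

# Not equivalent: an even number exists and an odd number exists,
# but no single number is both even and odd.
exists_separately = any(P(x) for x in S) and any(Q(x) for x in S)
exists_jointly = any(P(x) and Q(x) for x in S)

# Not equivalent: every number is even or odd,
# but neither "all are even" nor "all are odd" holds.
forall_separately = all(P(x) for x in S) or all(Q(x) for x in S)
forall_jointly = all(P(x) or Q(x) for x in S)

# The two valid equivalences, checked for this particular P and Q.
forall_and_law = (all(P(x) for x in S) and all(Q(x) for x in S)) == all(
    P(x) and Q(x) for x in S
)
exists_or_law = (any(P(x) for x in S) or any(Q(x) for x in S)) == any(
    P(x) or Q(x) for x in S
)
```

The two members of each "not equivalent" pair take different truth values here, while both valid equivalences hold, as expected.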

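In general, a quantifier $\forall x$ or $\exists x$ is safe to "capture" a predicate $Q$, as long as $Q$ does not contain $x$ as a free variable (in other words, as long as all occurrences of $x$ in $Q$ are bound by other quantifiers). Therefore, we have the following laws, where $Q$ is always assumed to be a predicate not containing $x$ as a free variable:

$(\forall x : P(x)) \wedge Q \Leftrightarrow \forall x : (P(x) \wedge Q)$
$(\forall x : P(x)) \vee Q \Leftrightarrow \forall x : (P(x) \vee Q)$
$(\exists x : P(x)) \wedge Q \Leftrightarrow \exists x : (P(x) \wedge Q)$
$(\exists x : P(x)) \vee Q \Leftrightarrow \exists x : (P(x) \vee Q)$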

(x : P (x)) Q x : (P (x) Q)(x : P (x)) Q x : (P (x) Q)Q (x : P (x)) x : (Q P (x))Q (x : P (x)) x : (Q P (x))(x : P (x)) Q x : (P (x) Q)(x : P (x)) Q x : (P (x) Q)

Just like laws of Boolean logic, which are useful in simplifying statements involving Boolean operators, the above laws, along with other laws introduced in this section, allow us to simplify statements involving quantifiers. The ultimate purpose of all these laws, and of logic as a whole, is to allow us to express and prove facts about objects and sets that we build across all branches of mathematics. In the following sections of the course, we will make extensive use of this section's language and ideas.


3 Sets

3.1 The naïve set theory

The notion of a set is central to mathematics. However, it was not until the late 1800s and early 1900s that mathematicians began to study sets in their own right. Sets and set elements are basic concepts, and, as such, are left without a formal definition. Georg Cantor (1845–1918), one of the creators of modern set theory, gave the following description:

By a set we shall understand any collection into a whole M of definite, distinct objects of our intuition or of our thought. These objects are called the elements of M.

The above is not a mathematical definition: it just describes our intuitive idea of sets (collections) and their elements (objects). However, we can formulate some characteristic properties that we associate with sets:

Any object can be an element of a set. For example, we can form the following sets:

Planets = {Mercury, Venus, . . . , Pluto}

$\mathbb{N}_{\mathrm{even}} = \{0, 2, 4, 6, 8, 10, \ldots\}$

Junk = {239, banana, ace of spades}

The order of elements in a set does not matter. For example,

Junk = {239, banana, ace of spades}

Repetition of elements in a set does not matter. For example,

Junk = {banana, banana, ace of spades, 239, 239, 239}

A set can be an element of another set. For example,

Junk = {banana, banana, ace of spades, 239, 239, 239}

SuperJunk = {239, Junk , } = {239, {banana, ace of spades, 239}, }

There is a special set, which contains no elements. It is called the empty set, and denoted $\emptyset$: $\emptyset = \{\}$. Any set with exactly one element is called a singleton. For example, we can form the following singletons:

MorningStars = {Venus}

NonpositiveNaturals = {0}

EmptySets = $\{\emptyset\}$


Note that the set EmptySets is not empty: it contains an element, which happens to be the set $\emptyset$. Likewise, the set MorningStars is distinct from the planet Venus, and the set NonpositiveNaturals is distinct from the number zero.

The fact that $x$ is an element of set $S$ is written as $x \in S$. Thus, Jupiter $\in$ Planets, orange $\notin$ Junk. A set $A$ is called a subset of a set $B$ ($A \subseteq B$), if all elements of $A$ are also elements of $B$ (but not necessarily the other way round). For example, $\mathbb{N}_{\mathrm{even}}$ is a subset of $\mathbb{N}$ ($\mathbb{N}_{\mathrm{even}} \subseteq \mathbb{N}$), since every even natural number is a natural number. We can write the definition formally as follows:

$A \subseteq B \Leftrightarrow \forall x : x \in A \Rightarrow x \in B$

By this definition, the empty set is a subset of any set (since no $x$ satisfies the condition $x \in \emptyset$, the quantified statement in the definition holds vacuously), and every set is a subset of itself.
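The definition of a subset translates almost word for word into code. This is a sketch on finite samples only: `N_SAMPLE` and `N_EVEN_SAMPLE` are hypothetical finite stand-ins for the infinite sets of natural and even natural numbers, and `is_subset` is a name chosen here (Python's built-in sets already offer the same test as `A <= B`).

```python
def is_subset(A, B):
    """A is a subset of B when every element of A is also an element of B."""
    return all(x in B for x in A)

N_SAMPLE = set(range(13))            # finite stand-in for N: {0, 1, ..., 12}
N_EVEN_SAMPLE = set(range(0, 13, 2)) # finite stand-in for the even naturals
EMPTY = set()

even_in_naturals = is_subset(N_EVEN_SAMPLE, N_SAMPLE)
naturals_in_even = is_subset(N_SAMPLE, N_EVEN_SAMPLE)
empty_in_naturals = is_subset(EMPTY, N_SAMPLE)  # vacuously true: nothing to check
naturals_in_itself = is_subset(N_SAMPLE, N_SAMPLE)
```

Note that `is_subset(EMPTY, N_SAMPLE)` is true because `all` over no elements is true, matching the remark that the empty set is a subset of any set.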

It is very important to distinguish between the signs $\in$ (element inclusion) and $\subseteq$ (subset inclusion). Despite their superficial similarity, their meaning is very different: the first indicates an individual member of a set, the second indicates an arbitrary subset of a set, including the two possible extremes: the empty set and the whole working set. Element inclusion is a basic concept, and therefore has no formal definition; the definition of subset inclusion in terms of element inclusion was given in the previous paragraph.

Our intuitive idea of a set is an arbitrary collection of elements, where the order and any repetitions of elements are ignored. Can we make this idea formal by giving to the basic concept of a set the appropriate axioms? The fact that order and repetitions do not matter is easy to express:

Axiom (The Law of Extensionality). If two sets contain the same elements, they are equal.

In other words, for any sets A, B, we have

$(A \subseteq B \wedge B \subseteq A) \Rightarrow A = B$

In particular, any two sets without elements are equal, therefore there is only one empty set $\emptyset$.
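Python's built-in sets behave in line with the law of extensionality, which makes them convenient for experimenting with the examples of this section. The sketch below rests on one assumption specific to Python rather than to set theory: a set that is to be an element of another set must be an immutable `frozenset`. The function name `equal_by_extensionality` is introduced here for illustration.

```python
junk = {239, "banana", "ace of spades"}
junk_rewritten = {"banana", "banana", "ace of spades", 239, 239, 239}

def equal_by_extensionality(A, B):
    """Two sets are equal when each is a subset of the other."""
    return all(x in B for x in A) and all(x in A for x in B)

# Order and repetition do not matter.
same_set = equal_by_extensionality(junk, junk_rewritten)

# The empty set, and a singleton whose only element is the empty set.
empty = frozenset()
empty_sets = {empty}

# Any two sets without elements are equal, so there is only one empty set.
only_one_empty_set = equal_by_extensionality(frozenset(), set())
```

Here `len(empty)` is 0 while `len(empty_sets)` is 1, mirroring the distinction between the empty set and the singleton that contains it.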

When dealing with sets, we often need to select from a given set a subset that satisfies a certain property. For example, we could start from the set $\mathbb{N}$, and select from it only those numbers that are even. In general, let $S$ be our working set; then we can express any property of its elements by a predicate $P(x)$, where $x$ is a variable ranging over $S$. A set of all elements $x$ of $S$ for which $P(x)$ is T is denoted $\{x \in S \mid P(x)\}$. For example,

$\mathbb{N}_{\mathrm{even}} = \{x \in \mathbb{N} \mid x \text{ is even}\}$


The variable $x$ in the above expression is a dummy: the set $\mathbb{N}_{\mathrm{even}}$ will not change if we replace all occurrences of $x$ in its definition by $y$, or by any other variable.

For any set S, we have

$\{x \in S \mid T\} = S$
$\{x \in S \mid F\} = \emptyset$

Here are some more examples:

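$$\{x \in \mathbb{N} \mid x > 0\} = \{1, 2, 3, 4, 5, 6, \ldots\}$$

$$\{x \in \text{Planets} \mid x \text{ is red}\} = \{\text{Mars}\}$$

$$\{x \in \mathbb{N} \mid x \geq 0\} = \mathbb{N}$$

$$\{x \in \text{Planets} \mid x \text{ is a banana}\} = \emptyset$$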
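Set abstraction over a working set maps directly onto a set comprehension. The following is a minimal sketch, assuming a finite stand-in for $\mathbb{N}$ (cut off at 10) and a small hypothetical colour table for a few planets; neither the cut-off nor the colour data comes from the notes.

```python
# Finite stand-ins for the sets in the text. Assumptions: N is cut off at 10,
# and the colour table is illustrative data, not part of the notes.
N = set(range(10))
planet_colour = {"Mercury": "grey", "Venus": "yellow", "Earth": "blue", "Mars": "red"}
planets = set(planet_colour)

# {x in N | x is even}
n_even = {x for x in N if x % 2 == 0}

# The bound variable is a dummy: renaming x to y gives the same set.
n_even_renamed = {y for y in N if y % 2 == 0}

# {x in Planets | x is red}
red_planets = {x for x in planets if planet_colour[x] == "red"}

# {x in S | T} = S and {x in S | F} = empty set
everything = {x for x in N if True}
nothing = {x for x in N if False}
```

The predicate is simply the `if` clause of the comprehension; the constant predicates `True` and `False` select the whole working set and the empty set respectively.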

Using the predicate notation, we can attempt to formalise completely our intuitive notion of a set. We have described a set as an arbitrary collection of elements; that is, we can form a set of elements satisfying any given predicate. We can now make it our second axiom.

Axiom (The Law of Abstraction). For any predicate $P(x)$, there is a set $A = \{x \mid P(x)\}$, such that an element $x$ is in $A$ if and only if $P(x)$ is true.

Our two axioms, the law of extensionality and the law of abstraction, formalise our intuition about sets. We could try to base a whole theory on these two axioms. Indeed, such attempts were made in the early stages of set theory development. Unfortunately, it was soon realised that the extensionality and abstraction laws, taken together, are inconsistent; that is, a theory based on these laws leads to contradictions. The simplest of these contradictions is called Russell's paradox, after the great logician and philosopher Bertrand Russell (1872–1970).

Consider the following predicate: $P(x) \Leftrightarrow x \notin x$ (note that it involves element inclusion, rather than subset inclusion). In words, we could say that $P(x)$ means "$x$ is not a member of itself". This would be definitely true if $x$ is not a set; it is also true for all sets we have seen so far, and for all sets we can think of (except perhaps an imaginary "set of all sets"). We may or may not believe that $P(x)$ is true for all $x$: whether this is the case or not is irrelevant, since both possibilities will lead to a contradiction. What is relevant is that $P(x)$ is a well-formed predicate (i.e. is true or false for any given $x$). Therefore, by the law of abstraction, we can form the set $B$ of all objects $x$ that satisfy the predicate $P(x)$:

$$B = \{x \mid P(x)\} = \{x \mid x \notin x\}$$


In words, $B$ is the set of all objects that are not their own members.

Now consider the following statement $R$: $B \in B$. It is a well-formed statement, so it must be either true or false. Suppose statement $R$ is true, so $B$ is a member of $B$, and, like all members of $B$, must not be a member of itself. This makes the statement $R$ false, which is impossible, since we assumed it was true. Now suppose statement $R$ is false, so $B$ is not a member of $B$. By definition of set $B$, everything that is not a member of itself must be a member of $B$, so $B$ itself must be a member of $B$. This makes statement $R$ true, which is impossible, since we assumed it was false! Thus, statement $R$ cannot be either true or false, so there must be something wrong in our reasoning. The only thing that can be wrong is the law of abstraction that we used to form the set $B$.

There is an alternative, somewhat lighter form of Russell's paradox. Imagine a village that has a single (male) barber with the following code of practice: the barber will shave every man in the village, but only if this man does not shave himself. Must the barber shave himself? The question has no answer, since both choices of the answer lead to a contradiction. Therefore, the barber's rule is inconsistent.

Because of Russell's paradox, the theory based on the laws of extensionality and abstraction is often called the naïve set theory. It captures our intuitive notion of a set but, being inconsistent, cannot serve as a formal foundation of mathematics. A lot of time and effort have been spent in order to provide a more sound axiomatic system for sets. Now, several such systems exist; they are all significantly more complicated than the naïve set theory. We shall not go into their details in this course. For the rest of the course, we will use implicitly the laws of extensionality and abstraction, and in particular the convenient notation for set abstraction $\{x \mid P(x)\}$. On the level of our course, no paradoxes similar to Russell's will arise. Indeed, unless mathematicians create them artificially, they seldom arise at all.

3.2 Operations on sets

We have already studied the concept of set abstraction, that allows us (ideally) to form a set $\{x \mid P(x)\}$ from any predicate $P(x)$. We will now use this method to define operations that create new sets from existing ones. Despite the problems with abstraction arising due to Russell's paradox, these new set operations will be completely non-controversial.

Let $A$, $B$ be any sets. The intersection of $A$, $B$, denoted $A \cap B$, is a set that contains all elements which are members of both $A$ and $B$:

$$A \cap B = \{x \mid (x \in A) \land (x \in B)\}$$

The union of $A$, $B$, denoted $A \cup B$, is a set that contains all elements which are members of either $A$, or $B$ (or both):

$$A \cup B = \{x \mid (x \in A) \lor (x \in B)\}$$


The difference of $A$, $B$, denoted $A \setminus B$, is a set that contains all elements which are members of $A$, except those which are members of $B$:

$$A \setminus B = \{x \mid (x \in A) \land (x \notin B)\}$$

As we see from the definitions, set operations are closely related to Boolean operators. In particular, they have properties very similar to the laws of Boolean logic, where $\cap$ is analogous to conjunction, and $\cup$ to disjunction.

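$A \cap A = A$, $A \cup A = A$ (idempotence of $\cap$, $\cup$)

$A \cap B = B \cap A$, $A \cup B = B \cup A$ (commutativity of $\cap$, $\cup$)

Also,

$(A \cap B) \cap C = A \cap (B \cap C)$ (associativity of $\cap$)

$(A \cup B) \cup C = A \cup (B \cup C)$ (associativity of $\cup$)

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ (distributivity of $\cap$ over $\cup$)

$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ (distributivity of $\cup$ over $\cap$)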

Set difference does not directly correspond to negation, since it involves two sets rather than one. In order to obtain an analogue of negation, let us fix a particular set $S$ (the universal set). We now restrict ourselves to sets that are subsets of $S$. For any set $A \subseteq S$, the complement of $A$ (with respect to $S$) is the difference $\overline{A} = S \setminus A$. The laws of complement are analogous to the laws of Boolean negation. We have the law of double complement:

$$\overline{\overline{A}} = A$$

and De Morgan's laws:

$$\overline{A \cap B} = \overline{A} \cup \overline{B} \qquad \overline{A \cup B} = \overline{A} \cap \overline{B}$$

Here, $A$, $B$ are arbitrary subsets of $S$.

The universal set $S$ corresponds to the statement $T$, and the empty set $\emptyset$ to the statement $F$. Note that $\emptyset \subseteq S$. We have:

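$A \cap S = A$, $A \cup \emptyset = A$ (identity laws)

$A \cap \emptyset = \emptyset$, $A \cup S = S$ (annihilation laws)

$A \cap \overline{A} = \emptyset$, $A \cup \overline{A} = S$ (excluded middle)

$A \cap (A \cup B) = A = A \cup (A \cap B)$ (absorption laws)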

All the above laws are theorems, and are easy to prove by the laws of Boolean logic. Here is an example:


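Theorem 2 (De Morgan's Law). For any universal set $S$, and for any sets $A, B \subseteq S$, we have $\overline{A \cap B} = \overline{A} \cup \overline{B}$.

Proof. We apply the definition of complement, the Boolean De Morgan's law, the Boolean distributivity law, once again the definition of complement, and finally the definition of set union:

$$\begin{aligned}
\overline{A \cap B} &= S \setminus (A \cap B) \\
&= \{x \mid (x \in S) \land (x \notin A \cap B)\} \\
&= \{x \mid (x \in S) \land \neg((x \in A) \land (x \in B))\} \\
&= \{x \mid (x \in S) \land ((x \notin A) \lor (x \notin B))\} \\
&= \{x \mid ((x \in S) \land (x \notin A)) \lor ((x \in S) \land (x \notin B))\} \\
&= \{x \mid (x \in S \setminus A) \lor (x \in S \setminus B)\} \\
&= \{x \mid (x \in \overline{A}) \lor (x \in \overline{B})\} = \overline{A} \cup \overline{B}
\end{aligned}$$

Let us compare once again the laws of Boolean logic with the laws of sets. In logic, we have the set of Boolean values $\mathbb{B} = \{F, T\}$, and Boolean operators $\neg$, $\land$, $\lor$. In set theory, we have a fixed universal set $S$, and set operations $\overline{\phantom{A}}$ (complement), $\cap$, $\cup$. The laws obeyed by these two structures (set $\mathbb{B}$ and the set of all subsets of $S$) are essentially the same. There are many other similar structures in mathematics, with operations governed by exactly the same laws. Such structures are called Boolean algebras.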
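The set laws above can be checked exhaustively on a small universal set. The sketch below is only a sanity check, not a proof: it assumes a three-element universal set `S` chosen for illustration, and the helper names `subsets`, `complement` and `laws_hold` are hypothetical.

```python
from itertools import combinations

S = frozenset({1, 2, 3})  # a small universal set, chosen only for illustration


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def complement(a):
    """Complement of a with respect to the universal set S."""
    return S - a


def laws_hold():
    """Check double complement, excluded middle, De Morgan and absorption."""
    for a in subsets(S):
        if complement(complement(a)) != a:
            return False
        if (a & complement(a)) != frozenset() or (a | complement(a)) != S:
            return False
        for b in subsets(S):
            if complement(a & b) != (complement(a) | complement(b)):
                return False
            if complement(a | b) != (complement(a) & complement(b)):
                return False
            if (a & (a | b)) != a or (a | (a & b)) != a:
                return False
    return True
```

Calling `laws_hold()` runs over all 8 subsets of `S` (and all 64 pairs), which mirrors how the Boolean laws can be checked by a truth table.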

The Boolean algebra formed by all subsets of a given set $S$ is called the powerset of $S$. Formally, the powerset of $S$ is a set $\mathcal{P}(S) = \{A \mid A \subseteq S\}$. In other words, a set is a member of $\mathcal{P}(S)$ if and only if it is a subset of $S$: $\forall A : A \in \mathcal{P}(S) \Leftrightarrow A \subseteq S$.

Let us consider some examples. The simplest case is $S = \emptyset$. The empty set has exactly one subset: the empty set itself. Hence, the powerset of $\emptyset$ is a singleton: $\mathcal{P}(\emptyset) = \{\emptyset\}$. Note: the powerset of the empty set is not empty.

Now let $S$ be a singleton, for example $S = \{\text{Bunty}\}$. Set $S$ has two subsets: $S$ itself, and the empty set. Hence, the powerset of $S$ consists of two elements: $\mathcal{P}(S) = \{\emptyset, \{\text{Bunty}\}\}$. In general, the powerset of any set $S$ contains, among other elements, the set $S$ itself, and the empty set. For example,

$$\mathcal{P}(\{a, b, c\}) = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$$

When we form a subset of a given set $S$, we have two choices for each element: either to include, or not to include this element in the subset. Thus, for a finite set of $n \in \mathbb{N}$ elements, we make $n$ independent choices, leading to $2^n$ different subsets. Therefore, the powerset of a finite set is finite. Furthermore, the powerset of an $n$-element finite set consists of $2^n$ elements. Note that this also holds for $\mathcal{P}(\emptyset)$, since $2^0 = 1$.
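The counting argument (two independent choices per element) can be turned into a construction. This is a minimal sketch, assuming finite input sets; the function name `powerset` is a hypothetical helper, and subsets are represented as `frozenset` values so that they can themselves be members of a set.

```python
def powerset(s):
    """Build the powerset of a finite set by the include/exclude choice."""
    result = [frozenset()]  # the powerset of the empty set is {empty set}
    for element in s:
        # Every subset found so far either excludes the new element (kept as is)
        # or includes it (added here), so the count doubles at each step.
        result += [subset | {element} for subset in result]
    return set(result)
```

For example, `powerset({"a", "b", "c"})` has 8 members, and `powerset(set())` is the singleton containing only the empty set.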


If set $S$ is infinite, then its powerset $\mathcal{P}(S)$ must also be infinite. This is because $\mathcal{P}(S)$ contains, among other elements, all singletons $\{a\}$, such that $a \in S$. Since $S$ is infinite, the number of such singletons is also infinite.

The last set operation that we consider in this section is based on the idea of a sequence. Let $x_1, x_2, \ldots, x_n$ be any objects ($n \in \mathbb{N}$). A (finite) sequence $(x_1, x_2, \ldots, x_n)$ is different from a set $\{x_1, x_2, \ldots, x_n\}$ in that the order and repetition of elements do matter in a sequence. For example, the sequence JunkSeq1 = (239, banana, ace of spades) is different from the sequence JunkSeq2 = (banana, 239, ace of spades, 239). Natural number $n$ is called the length of the sequence. For example, the length of JunkSeq1 is three, and the length of JunkSeq2 is four. We will give a formal definition of sequences further in the course.
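The contrast between sequences and sets can be illustrated with Python tuples and sets. This is only an informal illustration, assuming that strings stand in for the non-numeric objects of the example.

```python
# Tuples model finite sequences: order and repetition matter.
junk_seq1 = (239, "banana", "ace of spades")
junk_seq2 = ("banana", 239, "ace of spades", 239)

# Sets ignore order and repetition, so both sequences have the same set of elements.
junk_set1 = set(junk_seq1)
junk_set2 = set(junk_seq2)
```

Here `junk_seq1` and `junk_seq2` differ and have lengths 3 and 4, while `junk_set1` and `junk_set2` are equal.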

A sequence of length two is called an ordered pair. Let $A$, $B$ be any sets. The Cartesian product of sets $A$, $B$, denoted $A \times B$, is the set of all ordered pairs $(a, b)$, where $a \in A$, $b \in B$. In other words, $A \times B = \{(a, b) \mid (a \in A) \land (b \in B)\}$.

The Cartesian product is named after the great philosopher and mathematician René Descartes (1596–1650). Descartes lived long before sets emerged as a separate mathematical concept. However, Descartes was the first to realise that in geometry, a point in the plane can be represented by a pair of numbers, called coordinates. Therefore, the whole plane is represented by what we now call a Cartesian product of two lines.

Here are some examples of Cartesian products:

$A \times \emptyset = \emptyset \times A = \emptyset$ for any set $A$

$\{\text{Bunty}\} \times \{\text{Fowler}\} = \{(\text{Bunty}, \text{Fowler})\}$

$\{\text{Fowler}\} \times \{\text{Bunty}\} = \{(\text{Fowler}, \text{Bunty})\}$

$\{a, b, c\} \times \{d, e\} = \{(a, d), (a, e), (b, d), (b, e), (c, d), (c, e)\}$

$\mathbb{N} \times \text{Planets} = \{(n, x) \mid (n \in \mathbb{N}) \land (x \in \text{Planets})\} = \{(5, \text{Saturn}), (239, \text{Earth}), \ldots\}$

The Cartesian product of a set $A$ with itself is called the Cartesian square of $A$, and denoted $A^2 = A \times A$. For example,

$\{a, b\}^2 = \{(a, a), (a, b), (b, a), (b, b)\}$

$\mathbb{N}^2 = \mathbb{N} \times \mathbb{N} = \{(m, n) \mid m, n \in \mathbb{N}\}$

Thus, the plane is the Cartesian square of a line.

When forming a pair $(a, b)$ in the Cartesian product $A \times B$, we make two independent choices: we choose $a \in A$, and $b \in B$. For finite sets $A$, $B$, with $m$ and $n$ elements respectively, there are $m \cdot n$ possible pairs. Therefore, the Cartesian product of two finite sets is finite. Furthermore, the Cartesian product of an $m$-element set and an $n$-element set consists of $m \cdot n$ elements. Note that this also holds for the products involving the empty set: the Cartesian product of the empty set with any other set is empty.

If one of the sets $A$, $B$ is infinite, and the other is non-empty, then the Cartesian product $A \times B$ must be infinite. This is because if, say, $A$ is infinite, and $b \in B$, then we can form an infinite number of distinct pairs $(x, b)$, where $x \in A$. Each of such pairs belongs to $A \times B$.

In general $A \times B \neq B \times A$ (the equality only holds when $A = B$, or when one of $A$, $B$ is empty). Hence, the Cartesian product operator is not commutative. Furthermore, a nested pair $((a, b), c)$ is different from the nested pair $(a, (b, c))$, hence $(A \times B) \times C \neq A \times (B \times C)$, so the Cartesian product operator is not associative. However, it still has some distributive properties with respect to other set operations:

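$A \times (B \cap C) = (A \times B) \cap (A \times C)$ (distributivity of $\times$ over $\cap$)

$(A \cap B) \times C = (A \times C) \cap (B \times C)$

$A \times (B \cup C) = (A \times B) \cup (A \times C)$ (distributivity of $\times$ over $\cup$)

$(A \cup B) \times C = (A \times C) \cup (B \times C)$

$A \times (B \setminus C) = (A \times B) \setminus (A \times C)$ (distributivity of $\times$ over $\setminus$)

$(A \setminus B) \times C = (A \times C) \setminus (B \times C)$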
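The properties of the Cartesian product stated above can be tried out on small finite sets. This is a minimal sketch using `itertools.product` from the Python standard library; the helper name `cartesian` and the sample sets `A`, `B`, `C`, `X` are assumptions made for illustration.

```python
from itertools import product


def cartesian(a, b):
    """Cartesian product of two finite sets, as a set of ordered pairs (tuples)."""
    return set(product(a, b))


# Small sample sets, chosen only for illustration.
A = {"a", "b", "c"}
B = {"d", "e"}
C = {"e", "f"}
X = {"a", "b"}

size_ok = len(cartesian(A, B)) == len(A) * len(B)        # m * n pairs
not_commutative = cartesian(A, B) != cartesian(B, A)     # order inside a pair matters
over_intersection = cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C)
over_union = cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C)
over_difference = cartesian(A, B - C) == cartesian(A, B) - cartesian(A, C)
left_difference = cartesian(A - X, C) == cartesian(A, C) - cartesian(X, C)
```

All six flags evaluate to `True` for these sets. A check on a few sets does not prove the laws, but it is a quick way to catch a misremembered one.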

For a finite sequence of sets $A_1, A_2, \ldots, A_k$, we can define the Cartesian product $A_1 \times A_2 \times \cdots \times A_k$ as the set of all finite sequences $(a_1, a_2, \ldots, a_k)$, where $a_i \in A_i$ for all $i \in \mathbb{N}$, $1 \leq i \leq k$. Alternatively, we can define the multiple Cartesian product $A_1 \times A_2 \times \cdots \times A_k$ as nested binary Cartesian products $(((A_1 \times A_2) \times A_3) \times \cdots) \times A_k$ or $A_1 \times (A_2 \times (A_3 \times (\cdots \times A_k)))$. From the formal viewpoint, the above three definitions are not equivalent, since the sequence $(a_1, a_2, \ldots, a_k)$ is different from the nested pairs $(a_1, (a_2, (\ldots, a_k)))$ and $(((a_1, a_2), \ldots), a_k)$. However, the structure of the resulting sets is similar, and in most applications we can treat the above as three equivalent definitions of the Cartesian product of a sequence of sets. If all sets in the sequence are finite, with set $A_i$ having $n_i$ elements for every $i$, then the Cartesian product $A_1 \times A_2 \times \cdots \times A_k$ (by any of the three definitions) has $n_1 \cdot n_2 \cdot \ldots \cdot n_k$ elements.

Similarly to the Cartesian square, we can define the $k$-th Cartesian power of a set $A$ as $A^k = A \times A \times \cdots \times A$ ($k$ times). Thus, the three-dimensional space is the Cartesian cube (i.e. the third Cartesian power) of a line.

It is possible to define the Cartesian product of an infinite sequence of sets by considering infinite sequences of elements, each element coming from the corresponding set in the sequence. We will not use Cartesian products of an infinite number of sets in this course.


4 Relations

4.1 Introduction to relations

We usually think of a relation between sets as a certain set of ordered pairs, where each element of a pair is taken from its corresponding set. For example, we could have the relation between the set of all cars and the set of all people, which would consist of all pairs $(x, y)$, where car $x$ is driven by person $y$. A car may be driven by more than one person, so there may be several pairs with the same $x$; a person may drive more than one car, so there may be several pairs with the same $y$. Some cars may have no drivers, and some people may not drive any cars, therefore some set elements may not be included in any of the pairs.

Thus, for any sets $A$, $B$, a relation between $A$ and $B$ is an arbitrary subset of the Cartesian product $A \times B$. In other words, a relation between $A$ and $B$ is an arbitrary set of ordered pairs $(a, b)$, where $a \in A$, $b \in B$. Although ordinary set notation would be sufficient, there is an alternative, more convenient notation for relations. We denote a relation $R_p \subseteq A \times B$ by $R_p : A \leftrightarrow B$. Instead of writing $(a, b) \in R_p$, we write $a \mathrel{p} b$. This is in line with the normal practice of mathematics, where we use e.g. $x \leq y$ instead of $(x, y) \in R_\leq$.

Relation $R_\leq : \mathbb{N} \leftrightarrow \mathbb{N}$ is an example of a relation between the set $\mathbb{N}$ and itself. For any set $A$, we say that relation $R_p : A \leftrightarrow A$ is a relation on the set $A$. From arithmetic, we already know several other relations on the set $\mathbb{N}$: $R_=$, $R_\neq$, $R_\geq$. Another example of a relation on the set $\mathbb{N}$ is the relation $R_| : \mathbb{N} \leftrightarrow \mathbb{N}$, where $m \mid n$ is true if $m$ divides $n$ (i.e. $n$ is a multiple of $m$).

We can define the following relations between any sets A, B:

the empty relation $\emptyset \subseteq A \times B$

the complete relation $A \times B \subseteq A \times B$

On any set $A$, we can define the equality relation $R_{=_A} : A \leftrightarrow A$ as $R_{=_A} = \{(a, a) \mid a \in A\}$. The equality relation consists of all pairs where both elements are equal. When the set $A$ is clear from the context, we drop the subscript, so instead of $a =_A b$ we write simply $a = b$.

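Let $R_p : A \leftrightarrow B$, $R_q : B \leftrightarrow C$. The composition of relations $R_p$ and $R_q$ is a relation $R_{p \circ q} : A \leftrightarrow C$, defined as follows:

$$\forall (a, c) \in A \times C : \bigl(a \mathrel{(p \circ q)} c \Leftrightarrow \exists b \in B : (a \mathrel{p} b) \land (b \mathrel{q} c)\bigr)$$

In other words, an element $a \in A$ is related to an element $c \in C$ by the composition $R_{p \circ q}$, if there is (at least one) intermediate element $b \in B$, such that $a$ is related to $b$ by $R_p$, and $b$ is related to $c$ by $R_q$.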


Consider, for example, the relation $R_q : \text{People} \leftrightarrow \text{People}$, where $x \mathrel{q} y$ if $x$ is a child of $y$. The composition $R_{q \circ q}$ relates two elements $x$, $y$, if $x$ is a grandchild of $y$.

Let $R_p : A \leftrightarrow B$. The inverse of relation $R_p$ is a relation $R_{p^{-1}} : B \leftrightarrow A$, defined as follows:

$$\forall (b, a) \in B \times A : \bigl(b \mathrel{(p^{-1})} a \Leftrightarrow a \mathrel{p} b\bigr)$$

In other words, an element $b \in B$ is related to an element $a \in A$ by the inverse relation $R_{p^{-1}}$, if $a$ is related to $b$ by the original relation $R_p$. The superscript $-1$ is just a symbol for inversion, not the number "minus one".

For example, the inverse of the "child" relation $R_q : \text{People} \leftrightarrow \text{People}$ is the "parent" relation $R_{q^{-1}}$, that relates two elements $x$, $y$, if $x$ is a parent of $y$.
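Composition and inverse are easy to compute when a relation is stored as a set of pairs. The following is a minimal sketch under that representation; the function names `compose` and `inverse`, and the three-person family used as data, are hypothetical.

```python
def compose(rp, rq):
    """Pairs (a, c) such that a is related to some b by rp, and b to c by rq."""
    return {(a, c) for (a, b1) in rp for (b2, c) in rq if b1 == b2}


def inverse(rp):
    """Swap the two elements of every pair."""
    return {(b, a) for (a, b) in rp}


# Hypothetical family: Bob is a child of Ann, and Cat is a child of Bob.
child_of = {("Bob", "Ann"), ("Cat", "Bob")}

grandchild_of = compose(child_of, child_of)  # composing "child of" with itself
parent_of = inverse(child_of)                # inverse of "child of"
```

With this data, `grandchild_of` contains the single pair `("Cat", "Ann")`, and `parent_of` pairs each parent with their child. Inverting twice gives back the original relation.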

We now switch our attention from relations between two arbitrary sets $A$, $B$ to relations on a given set $A$ (the previous two examples were already of this type). Let $R_p : A \leftrightarrow A$. Relation $R_p$ is

reflexive, if every element is related to itself: $\forall a \in A : a \mathrel{p} a$

symmetric, if every two elements are related in both possible orders, as long as they are related at all: $\forall a, b \in A : a \mathrel{p} b \Rightarrow b \mathrel{p} a$

antisymmetric, if no two distinct elements are related in both possible orders: $\forall a, b \in A : (a \mathrel{p} b \land b \mathrel{p} a) \Rightarrow a = b$

transitive, if every two elements related via an intermediate third element are also related directly: $\forall a, b, c \in A : (a \mathrel{p} b \land b \mathrel{p} c) \Rightarrow a \mathrel{p} c$

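In other words, a relation $R_p : A \leftrightarrow A$ is reflexive, if $R_{=_A} \subseteq R_p$; symmetric, if $R_{p^{-1}} \subseteq R_p$; antisymmetric, if $R_p \cap R_{p^{-1}} \subseteq R_{=_A}$; transitive, if $R_{p \circ p} \subseteq R_p$. Verifying these claims is left as an exercise.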
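For a relation on a finite set, the four properties can be tested directly from their definitions. This is a minimal sketch, assuming the relation is given as a set of pairs; the function names and the sample relations on $\{1, \ldots, 6\}$ are chosen for illustration only.

```python
def is_reflexive(r, a):
    """Every element of the set a is related to itself."""
    return all((x, x) in r for x in a)


def is_symmetric(r):
    """Whenever (x, y) is in r, so is (y, x)."""
    return all((y, x) in r for (x, y) in r)


def is_antisymmetric(r):
    """If both (x, y) and (y, x) are in r, then x equals y."""
    return all(x == y for (x, y) in r if (y, x) in r)


def is_transitive(r):
    """If (x, y) and (y, z) are in r, then so is (x, z)."""
    return all((x, z) in r for (x, y) in r for (y2, z) in r if y == y2)


A = range(1, 7)
equality = {(a, a) for a in A}
leq = {(a, b) for a in A for b in A if a <= b}
divides = {(a, b) for a in A for b in A if b % a == 0}
mixed = {(1, 2), (2, 1), (1, 3)}  # neither symmetric nor antisymmetric
```

On this data, `equality` satisfies all four properties, `leq` and `divides` are reflexive, antisymmetric and transitive but not symmetric, and `mixed` fails both symmetry and antisymmetry.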

Note that a relation that is not symmetric need not be antisymmetric, and vice versa. Any relation which contains both pairs $(a, b)$ and $(b, a)$ for some distinct $a, b \in A$, and which for some other distinct $c, d \in A$ contains $(c, d)$ but not $(d, c)$, would be an example of a relation that is neither symmetric nor antisymmetric. The equality relation is an example of a relation which is both symmetric and antisymmetric (and also reflexive and transitive).

The most interesting relations are those that satisfy more than one of the above properties. In particular, a relation is

an equivalence relation, if it is reflexive, symmetric and transitive;

a partial order, if it is reflexive, antisymmetric and transitive.

The equality relation is both an equivalence relation and a partial order. In the following sections, we shall see more examples of each type of relations.


4.2 Equivalence relations

An equivalence relation is a relation that is reflexive, symmetric and transitive. Examples of equivalence relations are abundant in mathematics and in everyday life. For example, consider the relation on the set of all people, where person $a$ is related to person $b$, if $a$ and $b$ are of the same age (in whole number of years). It is easy to check that all necessary properties in the definition of an equivalence relation are satisfied. A relation where $a$ is related to $b$ if $a$ and $b$ were born on the same day (but possibly in different years) is another equivalence relation. In geometry, we can define an equivalence relation on the set of all straight lines in the plane, where line $a$ is related to line $b$, if $a$ and $b$ are parallel (every line is considered to be parallel to itself).

In arithmetic, given a fixed number $n \in \mathbb{Z}$, we can define the relation $R_{\equiv_n} : \mathbb{Z} \leftrightarrow \mathbb{Z}$, where two numbers are related, if their difference is a multiple of $n$: $a \equiv_n b \Leftrightarrow n \mid (a - b)$. The relation $R_{\equiv_n}$ is called congruence modulo $n$. It is an equivalence relation for every natural $n > 0$.

Let $A$ be any set, and $R_\equiv : A \leftrightarrow A$ an equivalence relation ($\equiv$ is a general mathematical sign for equivalence). For any element $a \in A$, the equivalence class of $a$, denoted $[a]_\equiv$, is the set of all elements in $A$ related to $a$: $[a]_\equiv = \{x \in A \mid x \equiv a\}$. Since $R_\equiv$ is reflexive, every element belongs to its own equivalence class: for all $a \in A$, $a \in [a]_\equiv$. Sometimes an element $a$ is called a representative of the equivalence class $[a]_\equiv$.

For example, if $a \equiv b$ means that $a$ and $b$ are two people of the same age, then the equivalence classes are all possible ages, and every person represents all people of his or her age. If $a \equiv b$ means that persons $a$ and $b$ share a birthday, then the equivalence classes are all 366 possible birthdays, and every person represents all people with the same birthday. If $a \equiv b$ means that lines $a$ and $b$ are parallel, then these lines share the same direction, and we can think of all possible directions as the equivalence classes. For the congruence relation $R_{\equiv_n}$, the equivalence class of any $a \in \mathbb{Z}$ consists of all numbers that give the same remainder as $a$, when divided by $n$. Thus, $[2]_{\equiv_5} = \{\ldots, -18, -13, -8, -3, 2, 7, 12, 17, \ldots\}$.

The importance of equivalence classes is that in a set with an equivalence relation, every element belongs to one, and only one, equivalence class. In other words, we have the following theorem.

Theorem 3. Let $R_\equiv : A \leftrightarrow A$ be an equivalence relation. The equivalence classes of $R_\equiv$ are pairwise disjoint. The union of all equivalence classes is the whole set $A$.

Proof. To prove that the classes are pairwise disjoint, we need to show that for all $a, b \in A$ : $([a]_\equiv = [b]_\equiv) \lor ([a]_\equiv \cap [b]_\equiv = \emptyset)$. Consider two cases:

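Case $a \equiv b$. Consider any $x \in [a]_\equiv$. By transitivity of $R_\equiv$, we have:

$$x \equiv a,\ a \equiv b \implies x \equiv b \implies x \in [b]_\equiv$$

Hence $[a]_\equiv \subseteq [b]_\equiv$. By symmetry of $R_\equiv$ we also have $b \equiv a$, so swapping $a$ and $b$, we get $[b]_\equiv \subseteq [a]_\equiv$, therefore $[a]_\equiv = [b]_\equiv$.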

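Case $a \not\equiv b$. Suppose $[a]_\equiv \cap [b]_\equiv \neq \emptyset$, then there is some $x \in [a]_\equiv \cap [b]_\equiv$. By symmetry and transitivity of $R_\equiv$, we have $a \equiv x,\ x \equiv b \implies a \equiv b$, contradiction. Therefore $[a]_\equiv \cap [b]_\equiv = \emptyset$.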

By the law of excluded middle, one of the above two cases must be true, hence $\forall a, b \in A : ([a]_\equiv = [b]_\equiv) \lor ([a]_\equiv \cap [b]_\equiv = \emptyset)$.

Finally, by reflexivity of $R_\equiv$, we have $a \equiv a$, therefore $a \in [a]_\equiv$, so every element of $A$ belongs to some equivalence class. On the other hand, every equivalence class is a subset of $A$, therefore the union of all equivalence classes is the whole set $A$.

Theorem 3 allows us to think of any equivalence relation as a partitioning of the set into disjoint subsets. In many cases, such partitioning has a well-understood intuitive meaning:

The equivalence relation "person $a$ is of the same age as person $b$" (in whole number of years) has approximately 110–120 equivalence classes, corresponding to all possible ages. Note that these ages need not be a contiguous set of natural numbers, if e.g. there is a person of age 120, but no person of age 119.

The equivalence relation "person $a$ was born on the same day as person $b$" (possibly in different years) has exactly 366 equivalence classes, corresponding to every date in a year. Note that the sizes of all classes will be nearly equal, except the class corresponding to 29 February, which will be approximately four times smaller than others.

The equivalence relation "line $a$ is parallel (or equal) to line $b$" has an infinite number of equivalence classes corresponding to all possible directions of a line in the plane. In fact, we can define "direction" as an equivalence class of this relation.

The "congruence modulo $n$" relation $R_{\equiv_n}$ has $n$ equivalence classes, represented by numbers $0, 1, \ldots, n - 1$. For example, for $n = 5$, we have:

$[0]_{\equiv_5} = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$

$[1]_{\equiv_5} = \{\ldots, -9, -4, 1, 6, 11, \ldots\}$

$[2]_{\equiv_5} = \{\ldots, -8, -3, 2, 7, 12, \ldots\}$

$[3]_{\equiv_5} = \{\ldots, -7, -2, 3, 8, 13, \ldots\}$

$[4]_{\equiv_5} = \{\ldots, -6, -1, 4, 9, 14, \ldots\}$

Although the number of classes is finite, each class is an infinite set. The classes $[a]_{\equiv_n}$ are called residue classes modulo $n$.


For a given equivalence relation $R_\equiv : A \leftrightarrow A$, the set of all its equivalence classes is called the quotient set of $A$ with respect to $R_\equiv$, and is denoted by $A/R_\equiv = \{[a]_\equiv \mid a \in A\}$. In the examples above, the quotient sets are respectively the set of all ages, the set of all birthdays, the set of all line directions, and the set of all residue classes modulo $n$ (for a given $n \in \mathbb{N}$, $n > 0$). The latter set is usually denoted by $\mathbb{Z}_n = \mathbb{Z}/R_{\equiv_n} = \{[a]_{\equiv_n} \mid a \in \mathbb{Z}\}$. The set $\mathbb{Z}_n$ possesses very interesting arithmetic properties, which are studied in number theory.

For a finite set $A$, the quotient set $A/R_\equiv$ must be finite. In particular, if $A$ has $n$ elements, and if all equivalence classes happen to be of equal size $m$, then $n$ must be a multiple of $m$, and the quotient set will have $n/m$ elements (i.e. equivalence classes). For an infinite set $A$, the quotient set may be finite or infinite.
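On a finite set, the quotient set can be computed by sorting elements into classes one at a time. The sketch below assumes the relation passed in really is an equivalence relation (so that, by Theorem 3, comparing with a single representative of each class is enough); the names `equivalence_classes` and `congruent_mod`, and the finite window of integers standing in for $\mathbb{Z}$, are hypothetical.

```python
def equivalence_classes(elements, related):
    """Partition elements into the classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # One representative per class suffices, because classes are
            # either equal or disjoint.
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})  # x starts a new class
    return classes


def congruent_mod(n):
    """Return the congruence-modulo-n test as a two-argument function."""
    return lambda a, b: (a - b) % n == 0


window = range(-10, 15)  # a finite window of integers, used instead of all of Z
classes = equivalence_classes(window, congruent_mod(5))
```

Here the window has 25 elements and every class has 5 of them, so the quotient set has 25 / 5 = 5 classes, matching the counting remark above.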

4.3 Partial orders

A partial order is a relation that is reflexive, antisymmetric and transitive. Whereas an equivalence relation is an abstraction of "equality" or "similarity" between objects, a partial order is an abstraction of one object being in some sense "smaller" (or "greater") than another, or of one object "preceding" (or "succeeding") another. Consider, for example, a relation on the set of all people, where person $a$ is related to person $b$, if $a$ is a descendant of $b$ (i.e. a child, a grandchild, a great-grandchild, etc.). We count every person as his or her own descendant, therefore the relation is reflexive. The relation is also transitive, since a descendant of a descendant of a person is a descendant of that person. Of course, the relation is not symmetric, since "person $a$ is a descendant of $b$" does not imply that "person $b$ is a descendant of $a$". Moreover, these two statements can both be true in one case only: when $a$ and $b$ are the same person (who, by definition, is a descendant of him/herself). Thus, we have the antisymmetry property, and our relation is a partial order on the set of all people.

It is easy to check that the arithmetic relations $R_\leq$ and $R_\geq$, both on $\mathbb{N}$ and on $\mathbb{Z}$, are partial orders.

The divisibility relation $R_| : \mathbb{N} \leftrightarrow \mathbb{N}$, which we mentioned several times before, is formally defined as follows: for $m, n \in \mathbb{N}$, we have $m \mid n$ ($m$ divides $n$, $n$ is a multiple of $m$), if there is a number $k \in \mathbb{N}$, such that $m \cdot k = n$. Note that by this definition, number 1 divides every number: to prove $1 \mid n$, we take $k = n$. Also, number 0 is a multiple of every number: to prove $m \mid 0$, we take $k = 0$. We have the following theorem.

Theorem 4. The divisibility relation $R_| : \mathbb{N} \leftrightarrow \mathbb{N}$ is a partial order.

Proof. Let $n \in \mathbb{N}$. We have $n \cdot 1 = n$, hence $n \mid n$ by definition of relation $R_|$. Therefore, relation $R_|$ is reflexive.


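Let $m, n \in \mathbb{N}$, $m \mid n$, $n \mid m$. By definition of relation $R_|$, there are $k, l \in \mathbb{N}$, such that $n = k \cdot m$, $m = l \cdot n$. If $n = 0$, then $m = l \cdot 0 = 0 = n$. Otherwise, $n = k \cdot l \cdot n$ with $n \neq 0$, so $k \cdot l = 1$. Since $k$ and $l$ are natural numbers, this can only be true if $k = l = 1$, hence $n = 1 \cdot m = m$. Therefore, relation $R_|$ is antisymmetric.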

Let $m, n, p \in \mathbb{N}$, $m \mid n$, $n \mid p$. By definition of relation $R_|$, there are $k, l \in \mathbb{N}$, such that $n = k \cdot m$, $p = l \cdot n$. Hence, $p = k \cdot l \cdot m$, so $m \mid p$. Therefore, relation $R_|$ is transitive.

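Another important example of a partial order is the subset inclusion relation $A \subseteq B$, where $A$, $B$ are both subsets of a given set $S$. Since the objects being related are subsets of $S$, the subset inclusion relation is defined on the powerset of $S$: $R_\subseteq : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$. The relation is reflexive, since for any $A \subseteq S$, $A \subseteq A$; antisymmetric, since for any $A, B \subseteq S$, $(A \subseteq B) \land (B \subseteq A) \Rightarrow (A = B)$; transitive, since for any $A, B, C \subseteq S$, $(A \subseteq B) \land (B \subseteq C) \Rightarrow (A \subseteq C)$.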

Note that in a partial order, some pairs of elements may be incomparable. For example, for any two persons one does not have to be an ancestor of the other: they could be siblings, cousins, or not related at all. Likewise, there are pairs of numbers neither of which divides the other (e.g. 4 and 5), and pairs of sets neither of which is a subset of the other (e.g. $\{1, 2\}, \{1, 3\} \subseteq \{1, 2, 3\}$). On the other hand, relations $R_\leq$ and $R_\geq$ satisfy an additional property: for any numbers $a$, $b$, we have either $a \leq b$, or $b \leq a$ (or both, if $a = b$). In general, a partial order $R_\sqsubseteq : A \leftrightarrow A$ is called total, if for all $a, b \in A$, we have either $a \sqsubseteq b$, or $b \sqsubseteq a$. Thus, partial orders $R_\leq$ and $R_\geq$ are total; partial orders $R_|$ and $R_\subseteq$ are not total.
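Totality can be tested on a finite set by checking every pair for comparability. The following is a minimal sketch: `divides` follows the definition of divisibility on natural numbers given earlier (including the cases involving 0), `is_total` is a hypothetical helper name, and the finite ranges stand in for $\mathbb{N}$.

```python
def divides(m, n):
    """m | n on natural numbers: some natural k satisfies m * k == n."""
    if m == 0:
        return n == 0  # 0 * k is always 0, so 0 divides only 0
    return n % m == 0


def is_total(elements, leq):
    """Every two elements are comparable in at least one direction."""
    return all(leq(a, b) or leq(b, a) for a in elements for b in elements)


def less_or_equal(a, b):
    return a <= b


window = range(10)  # a finite stand-in for the natural numbers
total_leq = is_total(window, less_or_equal)
total_divides = is_total(window, divides)
```

Here `total_leq` is `True`, while `total_divides` is `False`, because pairs such as 4 and 5 are incomparable under divisibility.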

Consider a partial (not necessarily total) order $R_\sqsubseteq : A \leftrightarrow A$. Let $a, b \in A$. We say that $c \in A$ is an upper bound of $a$, if $a \sqsubseteq c$. In particular, every element is an upper bound of itself. An element $c \in A$ is a (common) upper bound of $a$ and $b$, if $a \sqsubseteq c$ and $b \sqsubseteq c$. An arbitrary pair of elements $a$, $b$ may have no common upper bound at all, or several common upper bounds. In the latter case, one of the bounds may play a special role, being the "closest" to $a$ and $b$ among all their common upper bounds. Formally, an element $c \in A$ is called the least upper bound of $a$, $b$, denoted $\mathrm{lub}_\sqsubseteq(a, b)$, if $c$ is an upper bound of $a$, $b$, and for any upper bound $x$ of $a$, $b$, we have $c \sqsubseteq x$. In other words,

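$$c = \mathrm{lub}_\sqsubseteq(a, b) \Leftrightarrow (a \sqsubseteq c) \land (b \sqsubseteq c) \land \bigl(\forall x \in A : (a \sqsubseteq x) \land (b \sqsubseteq x) \Rightarrow (c \sqsubseteq x)\bigr)$$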

The least upper bound of $a$, $b$ does not have to exist, even if elements $a$, $b$ have some common upper bounds.

All the above definitions can be easily restated for lower, rather than upper bounds. Thus, $d \in A$ is a lower bound of $a$, if $d \sqsubseteq a$. Every element is a lower bound of itself. An element $d \in A$ is a (common) lower bound of $a$ and $b$, if $d \sqsubseteq a$ and $d \sqsubseteq b$. Two elements can have any number of common lower bounds, or no common lower bounds at all. An element $d \in A$ is called the greatest lower bound of $a$, $b$, denoted $\mathrm{glb}_\sqsubseteq(a, b)$, if $d$ is a lower bound of $a$, $b$, and for any lower bound $x$ of $a$, $b$, we have $x \sqsubseteq d$. In other words,

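$$d = \mathrm{glb}_\sqsubseteq(a, b) \Leftrightarrow (d \sqsubseteq a) \land (d \sqsubseteq b) \land \bigl(\forall x \in A : (x \sqsubseteq a) \land (x \sqsubseteq b) \Rightarrow (x \sqsubseteq d)\bigr)$$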

Two elements may not have the greatest lower bound, even if they have some common lower bounds. However, if two elements have the greatest lower bound, then it is unique (why?). The same applies to the least upper bound.

As an example, consider the partial order "$a$ is a descendant of $b$". For any two people, their common upper bound is any common ancestor, if one exists. Thus, if two persons are cousins, then either of the two common grandparents is their common upper bound. Neither of these upper bounds is the least, since the two grandparents are not ancestors of each other. There are many other common upper bounds, provided by ancestors of these grandparents, but none of these upper bounds is the least.

In the same partial order, the common lower bound of any two people is their common descendant, if one exists. Thus, if two persons are in-laws, i.e. each of them is a parent of the partner of the other's child, then each of their common grandchildren is their common lower bound. There may be many other common lower bounds, provided by descendants of common grandchildren. If the two in-laws have exactly one common grandchild, he/she is their greatest lower bound, since all other common lower bounds would be that grandchild's descendants.

In arithmetic, the greatest lower bound of two numbers $a, b \in \mathbb{N}$ with respect to the divisibility relation $R_|$ is the two numbers' greatest common divisor: $\mathrm{glb}_|(a, b) = \gcd(a, b)$. (Sometimes the greatest common divisor is called highest common factor.) In the same partial order, the least upper bound of two numbers $a, b \in \mathbb{N}$ is their least common multiple: $\mathrm{lub}_|(a, b) = \mathrm{lcm}(a, b)$. In contrast with the previous example, every two non-zero natural numbers have the greatest common divisor and the least common multiple, and therefore the greatest lower bound and the least upper bound in $R_|$.
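The definitions of greatest lower bound and least upper bound can be evaluated by brute force on a finite set and compared against `math.gcd`. This is a sketch under stated assumptions: the helper names `glb` and `lub` are hypothetical, the window of positive integers is a finite stand-in for the non-zero naturals (large enough to contain the least common multiples being tested), and the four-person family is invented to illustrate the cousins example.

```python
from math import gcd


def glb(a, b, elements, leq):
    """Greatest lower bound of a and b within elements, or None if there is none."""
    lower = [d for d in elements if leq(d, a) and leq(d, b)]
    greatest = [d for d in lower if all(leq(x, d) for x in lower)]
    return greatest[0] if greatest else None


def lub(a, b, elements, leq):
    """Least upper bound of a and b within elements, or None if there is none."""
    upper = [c for c in elements if leq(a, c) and leq(b, c)]
    least = [c for c in upper if all(leq(c, x) for x in upper)]
    return least[0] if least else None


def divides(m, n):
    """m | n for positive m."""
    return n % m == 0


window = range(1, 200)  # finite stand-in for the non-zero natural numbers

# Hypothetical family: cousins c1, c2 share the grandparents g1, g2.
people = ["c1", "c2", "g1", "g2"]
ancestors = {"c1": {"g1", "g2"}, "c2": {"g1", "g2"}, "g1": set(), "g2": set()}


def descendant_of(a, b):
    """a is a descendant of b (everyone counts as their own descendant)."""
    return a == b or b in ancestors[a]


cousins_lub = lub("c1", "c2", people, descendant_of)
```

For instance, `glb(12, 18, window, divides)` is 6 and `lub(12, 18, window, divides)` is 36, while `cousins_lub` is `None`: the cousins have two common upper bounds, but neither is the least.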

Another example of an arithmetic partial order with guaranteed greatest lower and least upper bounds is the total order $R_\leq : \mathbb{N} \leftrightarrow \mathbb{N}$. Here, the greatest lower bound of two numbers $a$, $b$ is simply their minimum $a \sqcap b$ ($a \sqcap b = a$ if $a \leq b$, and $a \sqcap b = b$ otherwise). The least upper bound of $a$, $b$ is their maximum $a \sqcup b$ ($a \sqcup b = b$ if $a \leq b$, and $a \sqcup b = a$ otherwise). In fact, it is easy to see that greatest lower and least upper bounds are guaranteed to exist in every totally ordered set.

Finally, consider the subset inclusion relation $R_\subseteq : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$ on the subsets of any (not necessarily finite) set $S$. The greatest lower bound of two subsets $A, B \subseteq S$ is their intersection $\mathrm{glb}_\subseteq(A, B) = A \cap B$, and the least upper bound is their union $\mathrm{lub}_\subseteq(A, B) = A \cup B$. For any two sets, we can form their intersection and their union, therefore the relation $R_\subseteq$ is another example of a partial order where greatest lower and least upper bounds always exist. In general, a partially ordered set where for every two elements one can find their greatest lower bound and least upper bound is called a lattice. The partial orders $R_| : \mathbb{N} \leftrightarrow \mathbb{N}$, $R_\leq : \mathbb{N} \leftrightarrow \mathbb{N}$ and $R_\subseteq : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$ (for any set $S$) are examples of lattices.

In many partially ordered sets, it is worthwhile to look for elements that are in some sense "extreme". Since the set may include incomparable elements, we have two possible notions of extremality. Consider a partial (not necessarily total) order $R_\sqsubseteq : A \leftrightarrow A$. We say that $a \in A$ is a maximal element, if for all $x \in A$, we have $(a \sqsubseteq x) \Rightarrow (a = x)$. In other words, the only element "higher than or equal to" $a$ is $a$ itself. We say that $c \in A$ is the greatest element, if for all $x \in A$, we have $x \sqsubseteq c$. In other words, $c$ is "higher than or equal to" all elements of $A$. Note that by this definition, the greatest element must be comparable to (and "higher" than) all other elements, whereas a maximal element may be comparable to (and "higher" than) some elements and incomparable to others.

Both above definitions can be restated for the opposite extremes. We say that $b \in A$ is a minimal element, if for all $x \in A$, we have $(x \sqsubseteq b) \Rightarrow (x = b)$. In other words, the only element "lower than or equal to" $b$ is $b$ itself. We say that $d \in A$ is the least element, if for all $x \in A$, we have $d \sqsubseteq x$. In other words, $d$ is "lower than or equal to" all elements of $A$. Again, the least element is comparable to all other elements, whereas a minimal element may be comparable to some elements and incomparable to others.

As an example, consider the partial order "a is a descendant of b". A minimal element in this partial order is any person without children. There is no least element, since no person is everyone's descendant.

In the total order $R_\le : \mathbb{N} \leftrightarrow \mathbb{N}$, number 0 is the least element, and the only minimal element. There are no maximal or greatest elements. In the partial order $R_| : \mathbb{N} \leftrightarrow \mathbb{N}$, number 1 is the least element, since it divides all natural numbers. Number 0 is (somewhat contrary to the intuition) the greatest element, since every natural number divides 0. It is also the only maximal element.

An interesting variation of the previous example is the same partial order $R_|$, considered on the set of all natural numbers, except 0, 1. In this partial order, every prime number is a minimal element, since it is not divisible by any other natural number in the set. There is no least element, since no number (except 1, which is excluded) divides all natural numbers. There are no maximal elements, since for every number other than 0, there is a distinct multiple (e.g. $x \ne 2x$ and $x \mid 2x$ for any $x \in \mathbb{N}$, $x \ne 0$). There is no greatest element, since no positive number is a multiple of all other numbers.

In the subset inclusion relation $R_\subseteq : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$, the least (and the only minimal) element is $\emptyset$, and the greatest (and the only maximal) element is $S$. If $\emptyset$ and $S$ are excluded, and $S$ is neither empty nor a singleton, then there will be many minimal elements (all singletons $\{a\}$, where $a \in S$) and many maximal elements (all complements of such singletons), but no greatest or least element.

It is easy to prove that any greatest element is maximal, and that any least element is minimal (try it!). As the above examples show, the converse is not true: a maximal element need not be the greatest, and a minimal element need not be the least. It is also easy to prove that if the greatest (or the least) element exists, then it must be unique (try it!). However, if a maximal or a minimal element is unique, it still does not have to be the greatest or the least (why?).
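The four definitions can be checked by brute force on a finite partial order. The sketch below is only an illustration: the helper names are made up, and the divisibility example is truncated to the finite set {2, ..., 12}. Because of that truncation the set does have maximal elements, unlike the infinite set discussed above, but it still has no greatest and no least element.

```python
def maximal_elements(elems, leq):
    """All a such that, for every x, (a <= x) implies (a == x)."""
    return [a for a in elems if all(a == x or not leq(a, x) for x in elems)]

def minimal_elements(elems, leq):
    """All b such that, for every x, (x <= b) implies (x == b)."""
    return [b for b in elems if all(b == x or not leq(x, b) for x in elems)]

def greatest_element(elems, leq):
    """The element c with x <= c for every x, or None if there is none."""
    return next((c for c in elems if all(leq(x, c) for x in elems)), None)

def least_element(elems, leq):
    """The element d with d <= x for every x, or None if there is none."""
    return next((d for d in elems if all(leq(d, x) for x in elems)), None)

def divides(a, b):
    """Divisibility order; assumes a != 0."""
    return b % a == 0

# Finite stand-in for "all natural numbers except 0 and 1".
ELEMS = list(range(2, 13))
```

On `ELEMS`, the minimal elements are exactly the primes 2, 3, 5, 7, 11, and `least_element` and `greatest_element` both return `None`.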

The results of this section show us that the concept of a relation, and in particular equivalence relations and partial orders, gives us a useful general tool, applicable in various branches of mathematics and computer science. We will apply our knowledge of relations in the following sections.


5 Functions

5.1 Introduction to functions

The word "function" takes on different meanings in different branches of mathematics and computer science. One often thinks of a function as a transformation rule, or a set of rules, that allows us to map, or transform, objects into other objects. There are various ways to make this concept of a function precise. In this course, we take the approach of ignoring the process of transformation (which may not even be computable), and instead we concentrate on the initial object the function was applied to, and the final object that is the result of this application. In other words, we view a function as a relation between the set of all possible inputs and all possible outputs.

The special property of functions, which distinguishes them from other relations, is that for every input, the function produces exactly one output (see Figure 1). Formally, a function $f$ from set $A$ to set $B$ is a relation $R_f : A \leftrightarrow B$, where for every $a \in A$, there is exactly one $b \in B$, such that $a \mathrel{f} b$ (that is, $(a, b) \in R_f$). Set $A$ is the domain of $f$, set $B$ is the co-domain of $f$. We say that function $f$ maps $A$ into $B$. We say that a function $f : A \to A$ is a function on the set $A$.

There is special notation and terminology associated with functions. We indicate that $f$ is a function from $A$ into $B$ by writing $f : A \to B$. As an alternative notation to $(a, b) \in R_f$ or $a \mathrel{f} b$, we write $f(a) = b$. This notation is unambiguous since, by definition of a function, for every $a$ there is exactly one $b = f(a)$. We say that function $f$ maps $a$ to $b$. For a given function $f$, element $b = f(a)$ is called the image of $a$, and $a$ is called the pre-image of $b$.

We have already seen some examples of functions earlier in the course. In particular, the equality relation on a set $A$, defined as $R_{=_A} = \{(a, a) \mid a \in A\}$, is a function on $A$. It is called the identity function on $A$, and denoted $\mathit{id}_A : A \to A$. For all $a \in A$, we have $\mathit{id}_A(a) = a$.

As an example of an arithmetic function, we can take the function $\mathit{sq} : \mathbb{Z} \to \mathbb{N}$, defined as the set of pairs $R_{sq} = \{(m, n) \in \mathbb{Z} \times \mathbb{N} \mid m^2 = n\}$. This set satisfies the definition of a function, since every integer has exactly one square. We have

[Figure 1: A function (diagram of f mapping set A into set B)]

$$R_{sq} = \{\ldots, (-3, 9), (-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4), (3, 9), \ldots\}$$

Consider any function $f : A \to B$, and let $H \subseteq A$. The restriction of $f$ on set $H$ is the function $f|_H$, defined as $f|_H = \{(a, f(a)) \mid a \in H\}$. In other words, the restriction agrees with the original function on all elements of $H$, and is undefined on all elements not in $H$. For example, the restriction of $\mathit{sq}$ to the set of all natural numbers is the function $\mathit{sq}|_{\mathbb{N}} : \mathbb{N} \to \mathbb{N}$. We have

$$R_{sq|_{\mathbb{N}}} = \{(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), \ldots\}$$
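The definition of a function as a relation with exactly one image per input can be tested directly on finite sets of pairs. The following Python sketch is illustrative only: `is_function` and `restrict` are hypothetical helper names, and the infinite sets of integers and natural numbers are replaced by small finite windows.

```python
def is_function(pairs, domain, codomain):
    """True if `pairs` relates every element of `domain` to exactly one
    element of `codomain`, and contains nothing outside domain x codomain."""
    pairs = set(pairs)
    if any(a not in domain or b not in codomain for a, b in pairs):
        return False
    return all(len({b for (x, b) in pairs if x == a}) == 1 for a in domain)

def restrict(pairs, subset):
    """Restriction of a function (given as a set of pairs) to `subset`."""
    return {(a, b) for (a, b) in pairs if a in subset}

# Finite windows standing in for the integers and the natural numbers.
Z_WINDOW = range(-3, 4)      # -3, ..., 3
N_WINDOW = range(0, 10)      # 0, ..., 9

R_SQ = {(m, m * m) for m in Z_WINDOW}       # the square function as pairs
R_SQ_N = restrict(R_SQ, set(range(0, 4)))   # its restriction to 0, ..., 3
```

Here `R_SQ` passes the `is_function` test on the chosen windows, whereas a relation such as `{(1, 'x'), (1, 'y')}` fails it, because the input 1 has two images.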

Two other special cases of a function that we considered before are finite and infinite sequences. Let $A$ be any set, finite or infinite. A finite sequence of elements of $A$ is a function $\mathbb{N}_k \to A$, where $k \in \mathbb{N}$ is the length of the sequence. Notation $(a_0, a_1, \ldots, a_{k-1}) \in A^k$ is simply an alternative, shorter way of writing

$$a : \mathbb{N}_k \to A \qquad a(0) = a_0 \quad a(1) = a_1 \quad \ldots \quad a(k-1) = a_{k-1}$$

Similarly, an infinite sequence of elements of $A$ is a function $\mathbb{N} \to A$. Notation $(a_0, a_1, a_2, a_3, \ldots)$, where $\forall i \in \mathbb{N} : a_i \in A$, is an alternative to

$$a : \mathbb{N} \to A \qquad a(0) = a_0 \quad a(1) = a_1 \quad a(2) = a_2 \quad a(3) = a_3 \quad \ldots$$

Thus, unlike sets, sequences need not be a basic, undefined concept: we define sequences via functions. Since functions are a special case of relations, and relations are a special case of sets, the concept of an ordered sequence is ultimately reduced to the concept of an unordered set.

Since functions are relations, the operations of composition and inversion can be applied to functions just like to any other relations. The result of such application is a relation, but is not a priori guaranteed to be a function. It still turns out that the result of function composition will always be a function.

Theorem 5. Let $f : A \to B$, $g : B \to C$. The composite relation $R_{f \circ g}$ is a function $A \to C$.

Proof. Let $a \in A$. Since $f$ is a function, there is a unique $b = f(a) \in B$. Since $g$ is a function, there is a unique $c = g(b) = g(f(a)) \in C$. By definition of relation composition, we have $(a, c) \in R_{f \circ g}$. Since such element $c$ is unique, relation $R_{f \circ g}$ is a function $f \circ g : A \to C$.


[Figure 2: The range of a function (diagram of f mapping set A into set B, with the range f(A) marked inside B)]

Thus, $(f \circ g)(a) = g(f(a))$. This explains why in some books, the order of the notation for function composition is inverted: $g \circ f$ instead of $f \circ g$. We prefer the latter notation, which indicates that in the expression $g(f(a))$, function $f$ is applied first, followed by function $g$.
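Composition in this left-to-right order can be sketched for finite functions stored as sets of pairs. The helper name `compose` and the sample functions are made up for illustration.

```python
def compose(r_f, r_g):
    """Relation composition in the order used in these notes:
    r_f is applied first, then r_g."""
    return {(a, c) for (a, b) in r_f for (b2, c) in r_g if b == b2}

# Sample functions f: {1, 2, 3} -> {'x', 'y'} and g: {'x', 'y'} -> {True, False}.
F = {(1, 'x'), (2, 'y'), (3, 'x')}
G = {('x', True), ('y', False)}

FG = compose(F, G)   # the composite, again a set of pairs
```

Each input of `F` ends up with exactly one image in `FG`, as Theorem 5 predicts, and `dict(FG)[a]` coincides with `dict(G)[dict(F)[a]]` for every `a` in the domain.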

In general, there is no analogue of Theorem 5 for function inversion. For a function $f : A \to B$, the inverse relation $R_{f^{-1}} : B \leftrightarrow A$ need not be a function. Consider, for example, the function $\mathit{sq} : \mathbb{Z} \to \mathbb{N}$. Its inverse is the square root relation $R_{sq^{-1}} = \{(n, m) \in \mathbb{N} \times \mathbb{Z} \mid m^2 = n\}$. We have

$$R_{sq^{-1}} = \{\ldots, (9, -3), (4, -2), (1, -1), (0, 0), (1, 1), (4, 2), (9, 3), \ldots\}$$

This set of pairs does not satisfy the definition of a function: some natural numbers, such as 2, 3, 5, 6, ..., do not have an integer square root, whereas other natural numbers, such as 1, 4, 9, 16, ..., have two integer square roots of opposite signs. Thus, neither the existence nor the uniqueness condition from the definition of a function is satisfied.
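The failure of both conditions can be observed on a finite window of the square function. This is a small sketch with hypothetical helper names; the window from -3 to 3 is an arbitrary choice.

```python
def inverse(pairs):
    """Inverse relation: swap the components of every pair."""
    return {(b, a) for (a, b) in pairs}

def images(pairs, a):
    """Sorted list of all elements related to `a`."""
    return sorted(b for (x, b) in pairs if x == a)

R_SQ = {(m, m * m) for m in range(-3, 4)}   # squares of -3, ..., 3
R_SQRT = inverse(R_SQ)                      # the square root relation
```

In `R_SQRT`, the number 4 has two images (-2 and 2), the number 2 has none, and only 0 has exactly one.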

Let us go back to the definition of a function $f : A \to B$. Note that the domain $A$ and the co-domain $B$ play different, non-symmetric roles: for every element of the domain, there must be a unique image in the co-domain, but not vice versa. The set of all elements of the co-domain that do have a pre-image in the domain (not necessarily a unique one) is called the range of the function (see Figure 2). The range of a function $f : A \to B$ is denoted $f(A)$. For example, the range of the square function $\mathit{sq} : \mathbb{Z} \to \mathbb{N}$ is the set of all squares.

Many important functions satisfy stronger conditions than just the existence and uniqueness of the image. Here we concentrate on two such conditions.

A function $f : A \to B$ is called surjective, if its range is the whole co-domain $B$:

$$f(A) = B$$

(see Figure 3). Such a function $f$ is said to map the domain $A$ onto $B$: $f : A \twoheadrightarrow B$. An example of a surjective function is the function $\mathit{suit} : \mathit{Cards} \twoheadrightarrow \{\spadesuit, \heartsuit, \diamondsuit, \clubsuit\}$, which maps the finite set of cards in a standard pack to the set of four suits. Since there is at least one card of every suit in the pack, function $\mathit{suit}$ is surjective.


[Figure 3: A surjective function (diagram of f mapping set A onto set B)]

[Figure 4: An injective function (diagram of f mapping set A into set B one-to-one)]

A function $f : A \to B$ is called injective, if it maps different elements of the domain $A$ to different elements of the co-domain $B$:

$$\forall x, y \in A : (f(x) = f(y)) \Rightarrow (x = y)$$

(see Figure 4). Such a function $f$ is said to map $A$ to $B$ one-to-one: $f : A \rightarrowtail B$. An example of an injective function is the square function on the set of natural numbers: $\mathit{sq}|_{\mathbb{N}} : \mathbb{N} \rightarrowtail \mathbb{N}$. Since every two different natural numbers have different squares, function $\mathit{sq}|_{\mathbb{N}}$ is injective. The square function on the set of all integers is not injective, since e.g. $\mathit{sq}(-5) = \mathit{sq}(5) = 25$.

The concepts of surjective and injective functions are in a certain sense complementary: for any pair of non-empty sets $A$, $B$, there is a surjective function from $A$ to $B$, if and only if there is an injective function from $B$ to $A$. The proof of this statement is left as an exercise.

A function $f : A \to B$ is called bijective, if it is both surjective and injective. For every element of the co-domain $B$, such a function has a unique pre-image in the domain $A$:

$$\forall b \in B : \exists! a \in A : f(a) = b$$

(see Figure 5). A bijective function $f$ from $A$ to $B$ is also called a one-to-one correspondence between $A$ and $B$: $f : A \xrightarrow{\sim} B$. An example of a bijective function is the function "add five" on the set of all integers:

$$\mathit{add5} : \mathbb{Z} \xrightarrow{\sim} \mathbb{Z} \qquad \forall a \in \mathbb{Z} : \mathit{add5}(a) = a + 5$$


[Figure 5: A bijective function (diagram of f putting sets A and B in one-to-one correspondence)]

For every integer $b \in \mathbb{Z}$, number $b - 5$ is the pre-image, therefore function $\mathit{add5}$ is surjective. Adding five to two different integers produces different results, therefore function $\mathit{add5}$ is injective. Thus, function $\mathit{add5}$ is a bijective function from the set $\mathbb{Z}$ to itself.
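For finite functions, the three properties can be checked mechanically. The sketch below stores a function as a Python dictionary; the helper names, the encoding of cards as (rank, suit) pairs, and the use of addition modulo 12 as a finite stand-in for add5 are all assumptions made for the sake of illustration.

```python
def is_injective(f: dict) -> bool:
    """Different arguments always give different values."""
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain) -> bool:
    """Every element of the co-domain has at least one pre-image
    (assumes all values of f lie in the co-domain)."""
    return set(f.values()) == set(codomain)

def is_bijective(f: dict, codomain) -> bool:
    return is_injective(f) and is_surjective(f, codomain)

# suit: Cards -> suits, on a 52-card pack encoded as (rank, suit) pairs.
SUITS = ["spades", "hearts", "diamonds", "clubs"]
CARDS = [(rank, s) for s in SUITS for rank in range(1, 14)]
suit = {card: card[1] for card in CARDS}

# The square function on the naturals 0..5, with co-domain 0..25.
sq_nat = {n: n * n for n in range(6)}

# Finite stand-in for add5: adding 5 modulo 12 on {0, ..., 11}.
add5_mod12 = {a: (a + 5) % 12 for a in range(12)}
```

With these definitions, `suit` is surjective but not injective, `sq_nat` is injective but not surjective onto 0..25, and `add5_mod12` is bijective on {0, ..., 11}.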

A bijective function from any set to itself is called a permutation on that set. In the previous example, function $\mathit{add5}$ is a permutation on $\mathbb{Z}$. A special case of a permutation is an involution, which is any bijection that coincides with its own inverse. Under an involution, every element of the domain is either left unchanged, or swapped with another element. An example of an involution is the function that inverts the sign of an integer:

$$\mathit{neg} : \mathbb{Z} \xrightarrow{\sim} \mathbb{Z} \qquad \forall a \in \mathbb{Z} : \mathit{neg}(a) = -a$$

Proof of the following properties of functions is left as an exercise:

• composition of two surjective (respectively injective, bijective) functions is surjective (injective, bijective);

• the inverse relation of a bijective function is a bijective function.

A special example of a bijection puts the powerset of any given set $S$ in one-to-one correspondence with the set of all possible functions from $S$ to the two-element set $\mathbb{B} = \{F, T\}$. For any subset $A \in \mathcal{P}(S)$, the corresponding function is the indicator function of $A$, $\chi_A : S \to \mathbb{B}$, defined as follows:

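$$\forall x \in S : \chi_A(x) = \begin{cases} T & \text{if } x \in A \\ F & \text{if } x \notin A \end{cases}$$

The proof that the mapping $\chi : A \mapsto \chi_A$ is a bijection between $\mathcal{P}(S)$ and the set of functions $\mathbb{B}(S) = \{f \mid f : S \to \mathbb{B}\}$ is left as an exercise.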
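For a small set S the correspondence between subsets and indicator functions can be enumerated exhaustively. This is only a sanity check under some representation choices, not a proof: a function from S to {F, T} is stored here as a tuple of Python booleans indexed by the sorted elements of S, and the helper names are hypothetical.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def indicator(subset, s):
    """Indicator function of `subset`, as a tuple of booleans
    indexed by the sorted elements of s."""
    return tuple(x in subset for x in sorted(s))

S = {'a', 'b', 'c'}
SUBSETS = powerset(S)
# Every function S -> {False, True}, in the same tuple representation.
ALL_FUNCTIONS = set(product([False, True], repeat=len(S)))
# Images of all subsets under the mapping A |-> indicator of A.
IMAGES = {indicator(a, S) for a in SUBSETS}
```

The mapping hits every function (`IMAGES == ALL_FUNCTIONS`) and no two subsets share an indicator (`len(IMAGES) == len(SUBSETS)`), so on this example it is a bijection between 8 subsets and 8 functions.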

5.2 Set cardinality

Putting two sets in one-to-one correspondence is one of the most basic activities that can be performed on sets. Intuition tells us that it is possible if and only if both sets have the same size. In fact, the idea of one-to-one correspondence, or bijection, allows us to define precisely what "size" means, even for infinite sets.


We say that two sets $A$, $B$ are equinumerous ($A \cong B$), if there is a bijective function $f : A \xrightarrow{\sim} B$. For any given set $S$, we can think of "equinumerous" as a relation on the subsets of $S$: $R_\cong : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$. Since every set can be put in one-to-one correspondence with itself by the identity function, relation $R_\cong$ is reflexive. Since both the inverse of a bijective function and a composition of two bijective functions are bijective, relation $R_\cong$ is symmetric and transitive. Thus, $R_\cong : \mathcal{P}(S) \leftrightarrow \mathcal{P}(S)$ is an equivalence relation. Each of its equivalence classes is composed of sets of "the same size"; in fact, every such class can be thought of as an abstraction of set size, either finite or infinite. In mathematics, these "set sizes" are called cardinalities.

It is easy for us to get hold of finite cardinalities, since we accepted the natural numbers as one of our basic concepts. For any $n \in \mathbb{N}$, let $\mathbb{N}_n$ be defined as the set of the first $n$ natural numbers:

$$\mathbb{N}_n = \{x \in \mathbb{N} \mid x < n\}$$

Thus, $\mathbb{N}_0 = \emptyset$, $\mathbb{N}_1 = \{0\}$, $\mathbb{N}_2 = \{0, 1\}$, etc. Intuitively, sets $\mathbb{N}_n$ are a representative collection of what we would like to call finite sets: we define a set to be finite, if it is equinumerous with the set $\mathbb{N}_n$ for some $n \in \mathbb{N}$. None of the sets $\mathbb{N}_n$ with different values of $n$ are equinumerous; we accept this as one of the axiomatic properties of natural numbers. Given this property, it is easy to prove that every finite set is equinumerous with exactly one of $\mathbb{N}_n$.

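Theorem 6. For every finite set $A$, there is a unique $n \in \mathbb{N}$, such that $A \cong \mathbb{N}_n$.

Proof. Suppose $A$ is equinumerous with $\mathbb{N}_k$ and $\mathbb{N}_l$, $k, l \in \mathbb{N}$. We have the bijections $f : A \xrightarrow{\sim} \mathbb{N}_k$ and $g : A \xrightarrow{\sim} \mathbb{N}_l$. Function $f^{-1} \circ g : \mathbb{N}_k \xrightarrow{\sim} \mathbb{N}_l$ is also a bijection (why?). Therefore, sets $\mathbb{N}_k$ and $\mathbb{N}_l$ are equinumerous. This can only happen if $k = l$.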

By the above theorem, every finite set has a uniquely defined natural number as its cardinality. This fact gives some precision to our introductory remark that natural numbers are an abstraction of finite set sizes.

We now turn our attention to cardinalities of infinite sets. A priori, it is not obvious whether different infinite sets (e.g. $\mathbb{N}$, $\mathbb{N}_{even}$, $\mathbb{N}^2$, $\mathbb{N}^3$, $\mathbb{Z}$, $\mathcal{P}(\mathbb{N})$) have different cardinalities. We begin our study of infinite cardinalities from the set $\mathbb{N}$. We call an infinite set countable, if it is equinumerous with the set of all natural numbers $\mathbb{N}$. Intuitively, such a set can be "counted", i.e. put in one-to-one correspondence with $\mathbb{N}$.

It may appear at first that by removing elements from $\mathbb{N}$, we can obtain infinite sets with a cardinality different from that of $\mathbb{N}$. It turns out that this is not the case. Let us look at some examples.

Theorem 7. Set $\mathbb{N}^+ = \mathbb{N} \setminus \{0\}$ is countable.


Proof. Consider function $f : \mathbb{N} \to \mathbb{N}^+$, which adds one to every natural number: $\forall n : f(n) = n + 1$.

With respect to function $f$, every element of $\mathbb{N}^+$ has a pre-image:

$$\forall n \in \mathbb{N}^+ : n = (n - 1) + 1 = f(n - 1)$$

Therefore, function $f$ is surjective.

Furthermore, function $f$ maps different elements of $\mathbb{N}$ to different elements of $\mathbb{N}^+$:

$$\forall m, n \in \mathbb{N} : (m \ne n) \Rightarrow (m + 1 \ne n + 1)$$

Therefore, function $f$ is injective.

Since $f$ is surjective and injective, $f$ is bijective.

The above proof can be represented graphically as follows:

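$$\begin{array}{cccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow \\
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8
\end{array}$$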

Theorem 8. Set $\mathbb{N}_{even} = \{0, 2, 4, 6, \ldots\}$ is countable.

Proof. Consider function $f : \mathbb{N} \to \mathbb{N}_{even}$, which doubles every natural number: $\forall n : f(n) = 2n$.

With respect to function $f$, every element of $\mathbb{N}_{even}$ has a pre-image:

$$\forall n \in \mathbb{N}_{even} : n = 2 \cdot (n/2) = f(n/2)$$

Therefore, function $f$ is surjective.

Furthermore, function $f$ maps different elements of $\mathbb{N}$ to different elements of $\mathbb{N}_{even}$:

$$\forall m, n \in \mathbb{N} : (m \ne n) \Rightarrow (2m \ne 2n)$$

Therefore, function $f$ is injective.

Since $f$ is surjective and injective, $f$ is bijective.


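$$\begin{array}{cccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow \\
0 & 2 & 4 & 6 & 8 & 10 & 12 & 14
\end{array}$$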

Theorems 7 and 8 suggest that, contrary to the intuition, a part (i.e. a proper subset) of an infinite set can be of the same size as the whole. In fact, it can be proved that every subset of a countable set is either finite or countable; in other words, the cardinality of $\mathbb{N}$ is the smallest among infinite cardinalities. As a consequence, for any equivalence relation on a countable set, the quotient set (i.e. the set of all equivalence classes) is either finite or countable. This can be shown by selecting an arbitrary representative from every equivalence class. The function that maps every equivalence class to its representative is a bijection between the quotient set and the set of selected representatives (why?), therefore the quotient set is equinumerous with a subset of the initial set. Since the initial set is countable, its quotient set must be finite or countable.

It turns out that not only subsets, but also certain supersets of $\mathbb{N}$ may be countable.

Theorem 9. Set Z is countable.

Proof. Consider function $f : \mathbb{N} \to \mathbb{Z}$, which counts negative integers by even naturals, and positive integers by odd naturals:

$$\forall n : f(n) = \begin{cases} (n + 1)/2 & \text{if } n \text{ odd} \\ -n/2 & \text{if } n \text{ even} \end{cases}$$

Function $f$ is bijective (proof left as an exercise).


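$$\begin{array}{ccccccccc}
-4 & -3 & -2 & -1 & 0 & 1 & 2 & 3 & 4 \\
\updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow \\
8 & 6 & 4 & 2 & 0 & 1 & 3 & 5 & 7
\end{array}$$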
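The counting of integers by naturals shown above can be written out together with a candidate inverse. This is a sketch with made-up function names; checking a round trip on a finite range illustrates, but does not prove, that the mapping is a bijection.

```python
def nat_to_int(n: int) -> int:
    """Odd naturals count the positive integers,
    even naturals count zero and the negative integers."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

def int_to_nat(z: int) -> int:
    """Candidate inverse of nat_to_int."""
    return 2 * z - 1 if z > 0 else -2 * z
```

The first nine values of `nat_to_int` are 0, 1, -1, 2, -2, 3, -3, 4, -4, which agrees with the diagram.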

Perhaps taking a Cartesian square or a higher Cartesian power of a countable set will produce a bigger set? It turns out that the answer is no.

Theorem 10. Set $\mathbb{N}^2$ is countable.

Proof. We only give the main idea of the proof. The set $\mathbb{N}^2$ can be represented as an infinite two-dimensional table, where the entry in row $i$ and column $j$ corresponds to the pair $(i, j)$, $i, j \in \mathbb{N}$. The entries in such a table can be counted by diagonals:

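$$\begin{array}{c|ccccc}
 & 0 & 1 & 2 & 3 & 4 \\ \hline
0 & 0 & 1 & 3 & 6 & 10 \\
1 & 2 & 4 & 7 & 11 & \\
2 & 5 & 8 & 12 & & \\
3 & 9 & 13 & & & \\
4 & 14 & & & &
\end{array}$$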

This method gives us a bijection between $\mathbb{N}$ and $\mathbb{N}^2$; with a little extra effort, the formula for this bijection can be given explicitly (left as an advanced exercise).
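The counting by diagonals can be reproduced with a generator that simply walks the table in the same order, without using the explicit formula (which remains an exercise). The names below are illustrative.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield all pairs (i, j) of natural numbers, diagonal by diagonal:
    first all pairs with i + j = 0, then i + j = 1, and so on."""
    for d in count():
        for i in range(d + 1):
            yield (i, d - i)

# The first 15 pairs cover the five diagonals shown in the table.
FIRST = list(islice(diagonal_pairs(), 15))
# Number assigned to each pair: its position in the enumeration.
INDEX = {pair: k for k, pair in enumerate(FIRST)}
```

For example, `INDEX[(0, 4)]` is 10 and `INDEX[(4, 0)]` is 14, matching the entries in row 0, column 4 and row 4, column 0 of the table.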


The above theorem implies that any finite Cartesian power of a countable set is countable. For instance,

$$\mathbb{N}^3 = (\mathbb{N} \times \mathbb{N}) \times \mathbb{N} \cong \mathbb{N} \times \mathbb{N} \cong \mathbb{N}$$

In our quest for uncountable infinity, we may be tempted to extend the set of natural numbers so that, roughly speaking, we would have an infinity of numbers everywhere. More precisely, we may want to consider the set $\mathbb{Q}$ of rational numbers, defined as fractions $m/n$, where $m, n \in \mathbb{Z}$, $n \ne 0$. Two fractions $a/b$ and $c/d$ are considered equal, i.e. representing the same rational number, if $a \cdot d = b \cdot c$. Therefore, we have an equivalence relation on the set of all integer pairs:

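$$R_\sim : \mathbb{Z}^2 \leftrightarrow \mathbb{Z}^2 \qquad (a, b) \sim (c, d) \iff a \cdot d = b \cdot c$$

Every rational number is defined as an equivalence class of this relation. The whole set of rational numbers is the quotient set $\mathbb{Q} = \mathbb{Z}^2 / R_\sim$.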

In contrast with sets $\mathbb{N}$ and $\mathbb{Z}$, the set of rational numbers $\mathbb{Q}$ is dense: between any two rational numbers, no matter how close, there is another rational number. In fact, in every segment between two rational numbers, no matter how tiny, there is an infinite number of other rational numbers. Intuitively, it feels as if there must be many more rational numbers than integers, in order to fill up all those segments. However, we already know that the set $\mathbb{Q}$ must be countable: it is infinite, and it is defined as a quotient set of the set $\mathbb{Z}^2$, which is countable as a finite Cartesian power of the countable set $\mathbb{Z}$.
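Countability of the rationals can also be made concrete by listing them one after another. The sketch below does not follow the quotient-set argument from the text; instead it uses a different, commonly used technique: walk through all fractions m/n in order of increasing |m| + n and skip every value that has already appeared. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def rationals():
    """Yield every rational number exactly once."""
    seen = set()
    for d in count(1):                  # d = |m| + n, with n >= 1
        for n in range(1, d + 1):
            m = d - n
            # Try both signs of the numerator (only one sign for 0).
            for num in ((m, -m) if m else (0,)):
                q = Fraction(num, n)
                if q not in seen:       # skip other fractions of the same value
                    seen.add(q)
                    yield q
```

The listing starts 0, 1, -1, 2, -2, 1/2, -1/2, and every rational number is reached after finitely many steps, so its position in the listing gives the corresponding natural number.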

Do uncountable sets exist at all? The answer to this question is given by Cantor's theorem: no set can be equinumerous with its own powerset.

Theorem 11. For all sets $A$, $A \not\cong \mathcal{P}(A)$.

Proof. The proof method is called Cantor's diagonal argument, and is reminiscent of Russell's paradox.

To prove the statement by contradiction, suppose that for some set $A$, there exists a bijective function $f : A \xrightarrow{\sim} \mathcal{P}(A)$, which puts elements of $A$ in one-to-one correspondence with subsets of $A$. Consider the set of all elements of $A$ that are not in their corresponding subsets: $D = \{a \in A \mid a \notin f(a)\}$. Since $D$ is a subset of $A$, it must, like all other subsets, have a corresponding element $d$, such that $f(d) = D$.

Consider the statement $d \in D$. Suppose this statement is true. Then $d$ is an element of the set $D$ of all elements that are not in their corresponding subsets. But the corresponding subset of $d$ is set $D$ itself, therefore, by the definition of $D$, we have $d \notin D$. Hence, the statement $d \in D$ cannot be true.

Suppose the statement $d \in D$ is false. Then $d$ is not an element of the corresponding set $D$. We have a special set for such elements, which happens to be $D$ itself! Therefore, by the definition of $D$, we have $d \in D$. Hence, the statement $d \in D$ cannot be false.


By the laws of logic, $d \in D$ must be true or false. As we have shown above, both cases lead to a contradiction. Therefore, our initial assumption must be false, and the bijective function $f$ cannot exist.
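The diagonal construction can be tried out on a small finite set, where all functions from A to its powerset can be enumerated. This is only an illustration of the argument on an example (for finite sets the result also follows from counting); the helper names are hypothetical, and a function is stored as a dictionary from elements to frozensets.

```python
from itertools import combinations, product

def powerset(elements):
    """All subsets of the given elements, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(f):
    """D = {a in A | a not in f(a)} for a function f: A -> P(A)."""
    return frozenset(a for a in f if a not in f[a])

def diagonal_is_always_missed(elements):
    """Check, for every function f: A -> P(A), that D is not a value of f."""
    items = list(elements)
    subsets = powerset(items)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(f) in images:
            return False
    return True
```

For A = {0, 1, 2} there are 8 subsets and 512 functions from A to the powerset, and `diagonal_is_always_missed(range(3))` confirms that none of them has its diagonal set D among its values, so none of them is surjective.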

The above theorem implies that the set $\mathcal{P}(\mathbb{N})$ is uncountable. Since the powerset of any set $A$ is equinumerous with the set of all Boolean functions $A \to \mathbb{B}$, the set of functions from $\mathbb{N}$ to $\mathbb{B} = \{F, T\}$ is also uncountable. By replacing $F$ with 0 and $T$ with 1, we can obtain a simple bijection from the latter set to the set of all functions $\mathbb{N} \to \{0, 1\}$. This set, in its turn, is a subset of the set of all functions $\mathbb{N} \to \mathbb{N}$, which can be regarded as the set of all infinite sequences of natural numbers, or as the infinite Cartesian product $\mathbb{N} \times \mathbb{N} \times \cdots$. Therefore, unlike finite Cartesian products, an infinite Cartesian product of countable sets need not be countable.

The fac
