NOTES ON DISCRETE MATHEMATICS -...

NOTES ONDISCRETE MATHEMATICS

MIGUEL A. LERMA

Contents

Introduction 2

1. Logic 3

2. Set Theory 18

3. Relations 25

4. Functions 33

5. Counting 44

6. Mathematical Induction 53

7. Divisibility 64

8. Modular Arithmetic 69

9. Graphs 76

10. Trees 86

Appendix A. 92

Date: March 8, 2001.1

NOTES ON DISCRETE MATHEMATICS 2

Introduction

These notes are intended to be a summary of the main ideas in courseCS 310: Mathematical Foundations of Computer Science. I may keepworking on this document as the course goes on, so these notes will notbe completely finished until the end of the quarter.

The textbook for this course is Ralph P. Grimaldi: Discrete andCombinatorial Mathematics. With few exceptions I will follow the no-tation in the book. However, compared to the textbook, the informa-tion in these notes will be:

1. More schematic. I will try to “get to the point”.

2. More reduced. The textbook is excellent for obtaining an ex-tended knowledge in the subject, but it exceeds what it is possibleto cover in one quarter.

3. Arranged in a different way. I have grouped the subject followinga slightly different criteria than the one used by the textbook. So,for instance, I have put the “counting techniques” together (com-binatorics, inclusion-exclusion principle, pigeonhole principle...),instead of having them spread out in several disconnected sec-tions.

These notes contain some questions and “exercises” intended to stim-ulate the reader who wants to play a somehow active role while studyingthe subject. They are not homework nor need to be addressed at allif the reader does not wish to. I will recommend exercises and givehomework assignments separately.

Finally, if you find any typos or errors, or you have any suggestions,please, do not hesitate to tell me. You may email me, or use the webform for feedback on the web pages for the course.

Miguel A. [email protected] UniversityWinter 2001


1. Logic

1.1. Statements. A statement (or proposition) is a declarative sen-tence that is either true or false (but not both). For instance, thefollowing are statements: “Paris is in France” (true), “London is inDenmark” (false), “2 < 4” (true), “4 = 7 (false)”. However the follow-ing are not statements: “what is your name?” (this is a question), “doyour homework” (this is a command), “this sentence is false” (neithertrue nor false), “x is an even number” (it depends on what x repre-sents), “Socrates” (it is not even a sentence). The truth or falsehoodof a statement is called its truth value.

1.2. Connectives, Truth Tables. Connectives are used for makingcompound statements. The main ones are the following (p and q rep-resent given statements):

Name Represented MeaningNegation ¬p “no p”Conjunction p ∧ q “p and q”Disjunction p ∨ q “p or q”Exclusive Or p Y q “either p or q”Implication p → q “p implies q”Biconditional p ↔ q “p if and only if q”

The hierarchy for evaluation of connectives is: ¬,∧,Y,∨,→,↔ from“do first” to “do last”. A string of the same operations is evaluatedfrom left to right, except →, which is evaluated from right to left. So,for instance, ¬p → p ∧ q ∧ s ∨ r → t ↔ q means ((¬p) → ((((p ∧ q) ∧s) ∨ r) → t)) ↔ q. Although the given order of precedence allows usto write many unambiguous statements without parenthesis, it is goodpractice to use some parenthesis in order to facilitate readability; e.g.,the previous statement may be written (¬p → (((p∧q∧s)∨r) → t)) ↔q. Sometimes parenthesis are replaced with brackets.

A statement is said to be primitive if it cannot be broken downinto simpler statements. For instance “Socrates is a human being” isprimitive, but “it is raining and the floor is wet” is compound.

The truth value of a compound statement depends only on the valueof its components. Writing 0 for “false” and 1 for “true”, we cansummarize the meaning of the connectives in the following way:


p q ¬p p ∧ q p ∨ q p Y q p → q p ↔ q0 0 1 0 0 0 1 10 1 1 0 1 1 1 01 0 0 0 1 1 0 01 1 0 1 1 0 1 1

We note that ∨ represents a non-exclusive or, i.e., p∨ q is true whenany of p, q is true and also when both are true. On the other hand Yrepresents an exclusive or, i.e., p Y q is true only when exactly one of pand q is true.

The statement “p → q” can be read “p implies q”, “if p then q”,“p is a sufficient condition for q”, and “q is a necessary condition forp”. Note that the implication p → q is true always except when p istrue and q is false. So, the followings sentences are true: “if 2 < 4then Paris is in France” (true → true), “if London is in Denmark then2 < 4” (false → true), “if 4 = 7 then London is in Denmark” (false →false). However the following one is false: “if 2 < 4 then London is inDenmark” (true → false).

In might seem strange that “p → q” is considered true when p isfalse, regardless of the truth value of q. This will become clearer whenwe study open statements such as “if x is a multiple of 4 then x is amultiple of 2”. That implication is obviously true, although for theparticular case x = 3 it becomes “if 3 is a multiple of 4 then 3 is amultiple of 2”.

The statement “p ↔ q” can be read “p if and only if q”, or “p is anecessary and sufficient condition for q”. Also sometimes it is written“p iff q”.

It is possible to make truth tables for more complex statements, forinstance p → (q ∧ r) can be analyzed in the following way:

p q r q ∧ r p → (q ∧ r)0 0 0 0 10 0 1 0 10 1 0 0 10 1 1 1 11 0 0 0 01 0 1 0 01 1 0 0 01 1 1 1 1


1.3. Tautologies and Contradictions. A tautology is a statementthat is always true regardless of the truth value of its components, forinstance: p ∨ ¬p.

A contradiction is a statement that is always false regardless of thetruth value of its components, for instance: p ∧ ¬p.

We will represent by T0 a primitive statement that is always true,and by F0 a primitive statement that is always false. They are notactual sentences, just symbols representing the ideas of “always true”and “always false”. They are not to be confused with 1 and 0. Thelatter represent the truth values “true” and “false” respectively, whilethe former represent statements.

1.4. Logical Equivalence. Two statements s1, s2, are logically equiv-alent, and we write it “s1 ⇔ s2”, when they always have the same truthvalue regardless of the truth value of their components.1 For instance(p → q) ⇔ ¬p ∨ q, as shown by the following truth table:

p q ¬p ¬p ∨ q p → q0 0 1 1 10 1 1 1 11 0 0 0 01 1 0 1 1

Other examples of logical equivalence: p ∨ ¬p ⇔ T0, p ∧ ¬p ⇔ F0,p ∧ (q ∨ p) ⇔ p, etc.

1.5. Laws of Logic. Complicated statements can sometimes be re-placed with simpler equivalent statements. For instance p ∧ (p → q)can be replaced with the simpler p ∧ q because of the following chainof logical equivalences:

p ∧ (p → q) ⇔ p ∧ (¬p ∨ q)⇔ (p ∧ ¬p) ∨ (p ∧ q)⇔ F0 ∨ (p ∧ q)⇔ p ∧ q .

1Note that s1 ⇔ s2 is conceptually different from s1 ↔ s2. The latter is astatement, while the former is a relation between two statements.


This is similar to a sequence of arithmetic computations, except thatthey are performed following a different set of rules. The laws of thealgebra of propositions are shown in table 1.1.

Table 1.1. Laws of Logic

Algebra of PropositionsDouble Negation: ¬¬p ⇔ pDeMorgan’s: ¬(p ∨ q) ⇔ ¬p ∧ ¬q

¬(p ∧ q) ⇔ ¬p ∨ ¬qCommutative: p ∨ q ⇔ q ∨ p

p ∧ q ⇔ q ∧ pAssociative: p ∨ (q ∨ r) ⇔ (p ∨ q) ∨ r

p ∧ (q ∧ r) ⇔ (p ∧ q) ∧ rDistributive: p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)

p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)Idempotent: p ∨ p ⇔ p

p ∧ p ⇔ pIdentity: p ∨ F0 ⇔ p

p ∧ T0 ⇔ pInverse: p ∨ ¬p ⇔ T0

p ∧ ¬p ⇔ F0

Domination: p ∨ T0 ⇔ T0

p ∧ F0 ⇔ F0

Absorption: p ∨ (p ∧ q) ⇔ pp ∧ (p ∨ q) ⇔ p

Any compound statement can now be transformed into other equiva-lent ones by using those rules plus the following relations, which allowsus to express any statement in terms of the connectives ∧, ∨ and ¬only:

p → q ⇔ ¬p ∨ q

p ↔ q ⇔ (p → q) ∧ (q → p)

p Y q ⇔ ¬(p ↔ q)

(1.1)

So, ∧, ∨ and ¬ can be considered as “primitive” logic operators,while →, ↔ as Y are “defined” in terms of ∧, ∨ and ¬.2 Now the

2The number of primitive connectives could be reduced even more by “defining”p ∨ q ⇔ ¬(¬p ∧ ¬q). This would leave us with ∧ and ¬ as only primitive logic op-erators. Question: would it be possible to reduce the whole algebra of propositionsto an algebra with only one conveniently chosen primitive logic operator?


logical equivalence p∧ (p → q) ⇔ p∧ q can be justified in the followingway:

p ∧ (p → q) ⇔ p ∧ (¬p ∨ q) Definition of →⇔ (p ∧ ¬p) ∨ (p ∧ q) Distributive Law⇔ F0 ∨ (p ∧ q) Inverse Law⇔ p ∧ q Identity Law .

1.6. Disjunctive Normal Form (DNF). Every compound state-ment can be written as a disjunction of conjunctions called disjunctivenormal form (DNF), and defined in the following way.

1. A literal is a letter (p) or the negation of a letter (¬p).2. A fundamental conjunction is either a literal or a conjunction of

two or more literals (p ∧ ¬q ∧ r . . . ).3. A disjunctive normal form is either a fundamental conjunction or

a disjunction of two or more fundamental conjunctions ((p∧¬q)∨(r ∧ s ∧ t) ∨ . . . ).

In order to write a compound statement in DNF, do the following:

1. Remove all occurrences of “→”, “↔” and “Y” by using the equiv-alences (1.1).

2. Move all negations inside by using the DeMorgan’s laws.3. Apply the distributive laws to obtain a DNF.

For instance:

¬(p → q) ∧ (q → r) ⇔ ¬(¬p ∨ q) ∧ (¬q ∨ r)⇔ p ∧ ¬q ∧ (¬q ∨ r)⇔ (p ∧ ¬q ∧ ¬q) ∨ (p ∧ ¬q ∧ r)⇔ (p ∧ ¬q) ∨ (p ∧ ¬q ∧ r)

1.7. Conjunctive Normal Form (CNF). Every compound state-ment can be written as a conjunction of disjunctions called conjunctivenormal form (CNF). The definition is analogous to DNF, but with theroles of “conjunction” and “disjunction” interchanged.

1.8. Principle of Duality. The dual of a sentence s containing onlythe connectives ∧ and ∨ is the sentence sd obtained by replacing eachoccurrence of ∧ and ∨ by ∨ and ∧ respectively, and each occurrenceof T0 and F0 by F0 and T0 respectively. So, for instance, the dual of


p∧ (q∨T0) is p∨ (q∧F0). The Principle of duality states that if s ⇔ t,then sd ⇔ td.

1.9. Implications. Look at the following truth table:

p q p → q ¬q → ¬p q → p ¬p → ¬q0 0 1 1 1 10 1 1 1 0 01 0 0 0 1 11 1 1 1 1 1

The statement ¬q → ¬p is called the contrapositive of p → q; q → pis the converse and ¬p → ¬q is the inverse. We note that an implica-tion is logically equivalent to its contrapositive:

p → q ⇔ ¬q → ¬p .

However it is not equivalent to its converse. For instance the sentence“if it is raining then the ground gets wet” is equivalent to “if the grounddoes not get wet then it is not raining”, but it is not equivalent to “if itis not raining then the ground does not get wet” (there might be otherways for the ground to get wet besides raining).

Example: Consider the following statement: “If a is an integer and a2

is odd, then a is odd.” This sentence is hard to prove directly, howeverits contrapositive (“if a is even then a2 is even”) is much easier to prove.In fact, if a is even then a = 2 a′ for some integer a′, and a2 = 4 a′2 willbe even.

1.10. Logical Implication. Given two statements p, q, we say that“p logically implies q”, and we write it “p ⇒ q”, if q is true whenever pis true. In other words, p ⇒ q when p → q is a tautology. For instance,p ∧ (p → q) ⇒ q, as we observe from the following truth table:

p q p → q p ∧ (p → q) (p ∧ (p → q)) → q0 0 1 0 10 1 1 0 11 0 0 0 11 1 1 1 1

Exercise: Check the following logical implications:


1. (¬p → p) ⇒ p2. p ⇒ (¬p → q)3. (p → q) ⇒ ((q → r) → (p → r))

1.11. Arguments, Rules of Inference. An argument is a mecha-nism that allows us to go from a set of statements p1, p2, . . . , pn calledpremises to another statement q called conclusion. The argument iscalled valid if q is true whenever p1, p2, . . . , pn are true, i.e., if p1 ∧ p2 ∧· · · ∧ pn ⇒ q.

Usually arguments are carried out in small steps by using rules ofinference. The rules of inference are represented in the following way:

p1

p2...pn

∴ q

where p1, p2, . . . , pn are the premises and q is the conclusion.

Some rules of inference are the following:

1. Modus Ponens or Rule of Detachment :

pp → q

∴ q

2. Syllogism:

p → qq → r

∴ p → r

3. Modus Tollens :p → q¬q

∴ ¬p

4. Conjunction:

pq

∴ p ∧ q


5. Disjunctive Syllogism:

p ∨ q¬p

∴ q

6. Contradiction:¬p → F0

∴ p

7. Conjunctive Simplification:

p ∧ q∴ p

8. Disjunctive Amplification:

p∴ p ∨ q

9. Conditional Proof :

p ∧ qp → (q → r)

∴ r

10. Proof by Cases :

p → rq → r

∴ p ∨ q → r

11. Constructive Dilemma:p → qr → sp ∨ r

∴ q ∨ s

12. Destructive Dilemma:p → qr → s¬q ∨ ¬s

∴ ¬p ∨ ¬r

An argument can be shown to be valid by describing it as a sequenceof steps in each one of which we introduce either: 1) a premise, or 2)a statement that follows by applying a rule of inference to statementsobtained on previous steps.


As an example, consider the following statements: “I take the busor I walk. If I walk I get tired. I do not get tired. Therefore I takethe bus.” We can formalize this by calling B = “I take the bus”, W =“I walk” and T = “I get tired”. The premises are B ∨W , W → T and¬T , and the conclusion is B. The argument can be described in thefollowing steps:

step statement reason

1) W → T Premise2) ¬T Premise3) ¬W 1,2, Modus Tollens4) B ∨W Premise5) ∴ B 4,3, Disjunctive Syllogism

1.12. Proof by Contradiction. The proof by contradiction or Re-ductio ad Absurdum consists of assuming that what we are trying toprove is false and then deriving a contradiction. In other words, weprove statement q from premises p1, p2, . . . , pn by adding ¬q, to thepremises and deriving a contradiction, i.e., we show

p1 ∧ p2 ∧ · · · ∧ pn ⇒ q

by showing

p1 ∧ p2 ∧ · · · ∧ pn ∧ ¬q ⇒ F0

If we represent the premises by a single proposition p, the proof bycontradiction rests on the following logical equivalence:

p → q ⇔ (p ∧ ¬q) → F0 .

Its validity can be determined by looking at the truth table for thesentence (p → q) ↔ [(p ∧ ¬q) → F0] (left as an exercise).

Example: Prove that√

2 is irrational, i.e., that√

2 cannot be writtenas a quotient of two integers. Answer : Assume that

√2 is rational, i.e.,√

2 = a/b, where a and b are integers. We may assume that the fractionis written in its least terms, so that a and b have no common factors(≥ 2). Now we have 2 = a2/b2, hence 2 b2 = a2. Since the left handside is even, then a2 is even, but this implies that a itself is even, soa = 2 a′. Hence: 2 b2 = 4 a′2, and simplifying: b2 = 2 a′2. But now theright hand side is even, so b2 is even, which implies that b is even andb = 2 b′. But this means that a and b have a common factor, namely2, which is a contradiction. Consequently, our assumption that

√2 is

rational has to be false.


1.13. Predicate Logic. An open statement or predicate is a declara-tive sentence containing variables, such that it becomes a statement ifits variables are replaced by appropriate values. For instance “x + 2 =7”, “X is a human being”, “x < y”, “p is a prime number” are openstatements. The truth value of the open statement depends on thevalue assigned to its variables. For instance if we replace x with 1in the open statement “x + 2 = 7” we obtain “1 + 2 = 7”, which isfalse, but if we replace it with 5 we get “5 + 2 = 7”, which is true.We represent an open statement by a letter followed by the variablesenclosed between parenthesis: p(x), q(x, y), etc. An example for p(x)is a value of x for which p(x) is true. A counterexample is a value of xfor which p(x) is false. So, 5 is an example for “x + 2 = 7”, while 1 isa counterexample.

The allowable values for the variables of an open statement are calledthe universe of discourse. For instance, in the open statement “x+2 =7”, x may represent an integer and the universe of discourse will be theset of integers. The open statement “X is a human being” may makesense if the universe of discourse is the set of all living beings. But“x+2 = 7” would not make sense for a universe of discourse consistingof the set of all living beings.

1.14. Quantifiers. Given an open statement p(x), the sentence “forsome x, p(x)” (or “there is some x such that p(x)”), represented “∃x p(x)”,has a definite truth value, so it is a statement in the usual sense. Forinstance if p(x) is “x+2 = 7” with the integers as universe of discourse,then ∃x p(x) is true, since there is indeed an integer, namely 5, suchthat p(5) is a true statement. However, if q(x) is “2x = 7” and theuniverse of discourse is still the integers, then ∃x q(x) is false. On theother hand, ∃x q(x) would be true if we extend the universe of dis-course to the rational numbers. The symbol ∃ is called the existentialquantifier.

Analogously, the sentence “for all x, p(x)”—also “for any x, p(x)”,“for every x, p(x)”, “for each x, p(x)”—, represented “∀x p(x)”, hasa definite truth value. For instance, if p(x) is “x + 2 = 7” and theuniverse of discourse is the integers, ∀x p(x) is false. However if q(x)represents “(x + 1)2 = x2 + 2x + 1” then ∀x q(x) is true. The symbol∀ is called the universal quantifier.

In open statements with more than one variable it is possible touse several quantifiers at the same time, for instance ∀x∀y∃z p(x, y, z),


meaning “for all x and all y there is some z such that p(x, y, z)”. Quan-tifiers have the same precedence as the negation symbol, and are paren-thesized from right to left; e.g., the previous statement would be paren-thesized ∀x (∀y (∃z p(x, y, z))). Another example: ∀x¬∃y p(x, y)∧ q →r means (∀x (¬(∃y p(x, y))) ∧ q) → r.

Note that in general the existential and universal quantifiers cannotbe permuted, i.e., in general ∀x∃y p(x, y) means something differentfrom ∃y∀x p(x, y). For instance if x and y represent human beingsand p(x, y) = “x likes y”, then ∀x∃y p(x, y) means that everybody likessomeone, but ∃y∀x p(x, y) means that everybody likes the same person.

An open statement can be partially quantified, e.g. ∀x∃y p(x, y, z, t).The variables quantified (x and y in the example) are called boundvariables, and the rest (z and t in the example) are called free variables.A partially quantified open statement is still open, but its truth valuedepends on fewer variables.

The scope of a quantifier is the part of the statement containing thevariables bounded by the quantifier. The scope of the quantifier in aparenthesized statement of the form “∀x (p) . . . ” or “∃x (p) . . . ” is p.Example: The scope of “∀” in “∀x∃y (x + y = 0) → (x + 1 > x)” is“∃y (x + y = 0)”, because the fully parenthesized version of the givenstatement is (∀x (∃y (x + y = 0))) → (x + 1 > x)—in particular thismeans that the occurrence of x in x + 1 > x is outside the scope of“∀”, so x is a free variable in that part of the statement.

1.15. Interpretations, Models, Countermodels. An interpreta-tion of a quantified statement is the assignation of a meaning to thatstatement in some universe of discourse. For instance, a possible inter-pretation for “∀x (x + y = 7)” consists of assuming that x representsan integer, and assigning y = 3, i.e., “∀x (x + 3 = 7)” as a statementabout integers. Specifically, an interpretation consists of the following:

1. A universe of discourse for the bound variables.3

3Usually all bound variables are interpreted in a single universe of discourse, thesame for all of them. However in the so called many-sorted logics different kindsof variables may be interpreted in different universes of discourse. An example isthe NBG (Neumann-Bernays-Godel) formal system for set theory, which containstwo kinds of variables representing sets and classes respectively—different types ofvariables are represented by different sets of symbols, for instance “sets” by smallletters (x, y, z) and “classes” by capital letters (X, Y , Z).


2. A choice of definite value (taken from the universe of discourse)for each free variable.

If an interpretation is true then it is called a model. If it is false, it iscalled a countermodel. For instance, a possible model for “∃x, (x2 = y)”is the set of integers as universe of discourse, and y = 9. A counter-model is the set of real numbers as universe of discourse, and y = −1.

The concepts of logical equivalence and logical implication can beapplied to quantified statements in the following sense. Given twoquantified statements p and q (the quantifiers and variables are notshown here for simplicity), we say that p is logically equivalent to q,written p ⇔ q, if they have the same truth value in any possible inter-pretation. We say that p logically implies q, written p ⇒ q, if q is truefor any interpretation for which p is true. The next paragraph showssome examples of quantified statements that are logically equivalent.

1.16. Negation of Quantified Statements. If ∃x p(x) is false thenthere is no value of x for which p(x) is true, or in other words, p(x) isalways false. Hence

¬[∃x p(x)] ⇔ ∀x¬p(x) .

On the other hand, if ∀x p(x) is false then it is not true that for everyx, p(x) holds, hence for some x, p(x) must be false. Thus:

¬[∀x p(x)] ⇔ ∃x¬p(x) .

This two rules can be applied in successive steps to find the negationof a more complex quantified statement, for instance:

¬[∃x∀y p(x, y)] ⇔ ∀x[¬∀y p(x, y)] ⇔ ∀x∃y ¬p(x, y) .

1.17. Laws of Logic for Quantified Statements. Besides the rulesfor the negation of quantified statements, the following rules hold:

1. ¬(∃x p(x)) ⇔ ∀x¬p(x)

2. ¬(∀x p(x)) ⇔ ∃x¬p(x)

3. ∃x∃y p(x, y) ⇔ ∃y∃x p(x, y)

4. ∀x∀y p(x, y) ⇔ ∀y∀x p(x, y)

5. ∃x (p(x) ∨ q(x)) ⇔ ∃x p(x) ∨ ∃x q(x)


6. ∃x (p(x) ∧ q(x)) ⇒ ∃x p(x) ∧ ∃x q(x)

7. ∀x (p(x) ∨ q(x)) ⇐ ∀x p(x) ∨ ∀x q(x)

8. ∀x (p(x) ∧ q(x)) ⇔ ∀x p(x) ∧ ∀x q(x)

9. If y does not occur in p(x) then x can be renamed y:

∃x p(x) ⇔ ∃y p(y)∀x p(x) ⇔ ∀y p(y)

10. If x does not occur in p then

∃x (p ∨ q(x)) ⇔ p ∨ ∃x q(x)∀x (p ∨ q(x)) ⇔ p ∨ ∀x q(x)∃x (p ∧ q(x)) ⇔ p ∧ ∃x q(x)∀x (p ∧ q(x)) ⇔ p ∧ ∀x q(x)∃x (p → q(x)) ⇔ p → ∃x q(x)∀x (p → q(x)) ⇔ p → ∀x q(x)∃x (q(x) → p) ⇔ ∀x q(x) → p∀x (q(x) → p) ⇔ ∃x q(x) → p

Exercise: Prove:

∃x (p(x) → q(x)) ⇔ ∀x p(x) → ∃x q(x)

Exercise: Prove that the logical implications in rules 6 and 7 cannotbe reversed, i.e., find and interpretation for which “∃x (p(x)∧ q(x))” isfalse and “∃x p(x) ∧ ∃x q(x)” is true, and an interpretation for which“∀x (p(x) ∨ q(x))” is true and “∀x p(x) ∨ ∀x q(x)” is false.

1.18. Prenex Normal Form. A quantified statement is in prenexnormal form if all its quantifiers are on the left of the expression. Forinstance, the following sentence is in prenex form:

∃x ∀y [p(x) ∧ q(y)]

However, the following one is not:

∃x [p(x) ∧ ∀y q(y)]

A quantified statement can be transformed into prenex normal formin the following way:


1. Rename its variables so that no quantifiers use the same variablename and so that the bound variable names are different from thefree variable names.

2. Move the quantifiers to the left by using the rules of section 1.17.

For instance, consider the sentence “if I have a wife then I am mar-ried”. If we call p(x) = “x is my wife” and q = “I am married”, thenthe sentence can be formalized like this:

∃x p(x) → q .

Using the appropriate rule, we can transform the sentence into prenexnormal form:

∀x (p(x) → q) ,

which reads “for all x, if x is my wife then I am married”.

1.19. Inference Rules for Quantified Statements. We state therules for statements with one variable, but they can be generalized tostatements with two or more variables.

1. Rule of Universal Specification. If ∀x p(x) is true, then p(a) istrue for each specific element a in the universe of discourse; i.e.:

∀x p(x)∴ p(a)

For instance, from ∀x (x + 1 = 1 + x) we can derive 7 + 1 = 1 + 7.

2. Rule of Existential Specification. If ∃x p(x) is true, then p(a) istrue for some specific element a in the universe of discourse; i.e.:

∃x p(x)∴ p(a)

The difference respect to the previous rule is the restriction in themeaning of a, which now represents some (not any) element ofthe universe of discourse. So, for instance, from ∃x (x2 = 2) (theuniverse of discourse is the real numbers) we derive the existence ofsome element, which we may represent ±√2, such that (±√2)2 =2.

3. Rule of Universal Generalization. If p(x) is proven to be true for ageneric element in the universe of discourse, then ∀x p(x) is true;i.e.:

p(x)∴ ∀x p(x)


By “generic” we mean an element for which we do not make anyassumption other than its belonging to the universe of discourse.So, for instance, we can prove ∀x [(x + 1)2 = x2 + 2x + 1] (say, forreal numbers) by assuming that x is a generic real number andusing algebra to prove (x + 1)2 = x2 + 2x + 1.

4. Rule of Existential Generalization. If p(a) is true for some specificelement a in the universe of discourse, then ∃x p(x) is true; i.e.:

p(a)∴ ∃x p(x)

For instance: from 7 + 1 = 8 we can derive ∃x (x + 1 = 8).

Example: Show that a counterexample can be used to disprove auniversal statement, i.e., if a is an element in the universe of discourse,then from ¬p(a) we can derive ¬∀x p(x). Answer : The argument is asfollows:

step statement reason

1) ¬p(a) Premise2) ∃x¬p(x) Existential Generalization3) ¬∀x p(x) Negation of Universal Statement


2. Set Theory

2.1. Sets. A set is a collection of objects, called elements of the set.A set can be represented by listing its elements between braces: A ={1, 2, 3, 4, 5}. The symbol ∈ is used to express that an element is (orbelongs to) a set, for instance 3 ∈ A. Its negation is represented by 6∈,e.g. 7 6∈ A.

Some important sets are the following:

1. N = {0, 1, 2, 3, · · · } = the set of natural numbers.1

2. Z = {−3,−2,−1, 0, 1, 2, 3, · · · } = the set of integers.3. Q = the set of rational numbers.4. R = the set of real numbers.5. C = the set of complex numbers.

Usually S+ = set of positive elements in S, for instance Z+ ={1, 2, 3, · · · } = the set of positive integers. Also S∗ = set of elements inS excluding zero, for instance R∗ = the set of non zero real numbers.

An alternative way to define a set, called set-builder notation, is bystating a property p(x) verified by exactly its elements, for instanceA = {x ∈ Z | 1 ≤ x ≤ 5} = “set of integers x such that 1 ≤ x ≤ 5”—i.e.: A = {1, 2, 3, 4, 5}. In general: A = {x ∈ U | p(x)}, where U isthe universe of discourse in which the open statement p(x) must beinterpreted, or A = {x | p(x)} if the universe of discourse for p(x) isimplicitly understood. In set theory the term universal set is oftenused in place of “universe of discourse” for a given open statement.2

Example: The following represent various kinds of intervals of realnumbers:

• (a, b) = {x ∈ R | a < x < b}• [a, b] = {x ∈ R | a ≤ x ≤ b}• [a, b) = {x ∈ R | a ≤ x < b}• (a, b] = {x ∈ R | a < x ≤ b}• (a,∞) = {x ∈ R | a < x}• [a,∞) = {x ∈ R | a ≤ x}• (−∞, b) = {x ∈ R | x < b}• (−∞, b] = {x ∈ R | x ≤ b}

1Note that N includes zero—for some authors N = {1, 2, 3, · · · }, without zero.2Properly speaking, the universe of discourse of set theory is the collection of all

sets (which is not a set).


Principle of Extension:3 Two sets are equal if and only if they havethe same elements, i.e.:

A = B ⇔ ∀x (x ∈ A ↔ x ∈ B) .

Subset : We say that A is a subset of set B, or A is contained in B,and we represent it “A ⊆ B”, if all elements of A are in B, i.e.

A ⊆ B ⇔ ∀x (x ∈ A → x ∈ B) .

In particular A ⊆ A. Note that

A = B ⇔ (A ⊆ B) ∧ (B ⊆ A) ,

so it is possible to prove that two sets A and B are equal by showingthat all elements of A are in B, and all elements of B are in A.

A is a proper subset of B, represented “A ⊂ B”, if A ⊆ B but A 6= B,i.e., there is some element in B which is not in A.

Empty Set : A set with no elements is called empty set or null set,and is represented by ∅ or {}. Exercise: Prove that the empty set is asubset of every set, i.e.: ∅ ⊆ A for any set A—Hint: ∀x [F0 → p(x)] isalways true. Then prove that the empty set is unique, i.e.:

[∀x (x 6∈ A)] ∧ [∀x (x 6∈ B)] ⇒ A = B .

Note that nothing prevents a set from possibly being an element ofanother set (which is not the same as being a subset!). For instanceif A = {1, a, {3, t}, {1, 2, 3}} and B = {3, t}, then obviously B is anelement of A.4

Power Set : The collection of all subsets of a set A is called the powerset of A, and is represented P(A). For instance, if A = {1, 2, 3}, then

P(A) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, A} .

Question: If a set A has n elements, how many elements does P(A)have?

3Two ordinary sets are identical if they have the same elements, so for instance,{a, a, b} and {a, b} are the same set because they have exactly the same elements,namely a and b. However, in some applications it might be useful to allow repeatedelements in a set. In that case we use multisets, which are mathematical entitiessimilar to sets, but with possibly repeated elements. So, as multisets, {a, a, b} and{a, b} would be considered different.

4Question: Is there any difference between ∅ and {∅}? How many elements doeseach of those sets have?


Cardinality : The number of elements in a set A, or cardinality ofA, is denoted |A|. For instance, if A = {a, b, c} then |A| = 3. If thecardinality of a set is a natural number, then the set is finite. Otherwise,it is infinite.5

Notation: The statement “every element x in set A verifies propertyp(x)” will be written “∀x ∈ A, p(x)”. Similarly, “some element x in setA verifies property p(x)” will be written “∃x ∈ A, p(x)”. For instance,“every integer x verifies x2 ≥ 0” can be written “∀x ∈ Z, x2 ≥ 0”.

2.2. Venn Diagrams. Venn diagrams are graphic representations ofsets as enclosed areas in the plane. For instance, in figure 2.1, therectangle represents the universal set (the set of all elements consideredin a given problem) and the shaded region represents a set A. The otherfigures represent various set operations.

��

��

A

Figure 2.1. Venn Diagram.

��

��

BA

Figure 2.2. Intersection A ∩B.

5One may think that all infinite sets have the same cardinality, i.e.,|A| = ∞.However it is possible to compare infinite sets and establish a hierarchy of differ-ent levels of infinity. This consideration has given rise to the so called theory oftransfinite numbers. The smallest transfinite number is the cardinality of the set ofnatural numbers, and is represented by |N| = ℵ0 (“aleph-naught”).


��

��BA

Figure 2.3. Union A ∪B.

��

��

A

Figure 2.4. Complement A.

B

��

��

A

Figure 2.5. Difference A−B.

2.3. Set Operations.

1. Intersection: The common elements of two sets:

A ∩B = {x | x ∈ A ∧ x ∈ B} .

If A ∩B = ∅, the sets are said to be disjoint.

2. Union: The set of elements that belong to either of two sets:

A ∪B = {x | x ∈ A ∨ x ∈ B} .


��

��

BA

Figure 2.6. Symmetric Difference A4B.

3. Complement : The set of elements (in the universal set) that donot belong to a given set:

A = {x ∈ U | x 6∈ A} .

4. Difference or Relative Complement : The set of elements that be-long to a set but not to another:

A−B = {x | x ∈ A ∧ x 6∈ B} = A ∩B .

5. Symmetric Difference: The set of elements that belong to any oftwo given sets but not to both:

A4B = {x | x ∈ A Y x ∈ B} .

Exercise: Prove:

A4B = A ∪B − A ∩B = (A−B) ∪ (B − A) .

As an illustration, we prove the first equality, i.e., A4B = A ∪ B −A∩B. Equality of sets is usually proven by showing that each set is asubset of the other one.

1. Proof of A4B ⊆ A ∪B − A ∩B:

x ∈ A4B ⇒ x ∈ A Y x ∈ B⇒ (x ∈ A ∨ x ∈ B) ∧ ¬(x ∈ A ∧ x ∈ B)⇒ (x ∈ A ∪B) ∧ (x 6∈ A ∩B)⇒ x ∈ A ∪B − A ∩B .

2. Proof of A ∪B − A ∩B ⊆ A4B:

x ∈ A ∪B − A ∩B ⇒ (x ∈ A ∪B) ∧ (x 6∈ A ∩B)⇒ (x ∈ A ∨ x ∈ B) ∧ ¬(x ∈ A ∧ x ∈ B)⇒ x ∈ A Y x ∈ B⇒ x ∈ A4B .


2.4. Algebra of Sets. The set operations verify the following prop-erties:

1. Law of Double Complement:

A = A

2. DeMorgan’s Laws:

A ∪B = A ∩B

A ∩B = A ∪B

3. Commutative Laws:

A ∪B = B ∪ AA ∩B = B ∩ A

4. Associative Laws:

A ∪ (B ∪ C) = (A ∪B) ∪ CA ∩ (B ∩ C) = (A ∩B) ∩ C

5. Distributive Laws:

A ∪ (B ∩ C) = (A ∪B) ∩ (A ∪ C)A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C)

6. Idempotent Laws:

A ∪ A = AA ∩ A = A

7. Identity Laws:

A ∪ ∅ = AA ∩ U = A

8. Inverse Laws:

A ∪ A = U

A ∩ A = ∅9. Domination Laws:

A ∪ U = U

A ∩ ∅ = ∅10. Absorption Laws:

A ∪ (A ∩B) = AA ∩ (A ∪B) = A


Note that the laws come in pairs of dual statements containing sets(or their complements) and the set operations ∩ and ∪. By dual ofsuch a statement s we mean another statement sd obtained from s byreplacing each occurrence of ∩ and ∪ by ∪ and ∩ respectively and eachoccurrence of ∅ and U by U and ∅ respectively.

Principle of Duality : If a statement s about sets is a theorem, thenits dual sd is also a theorem.

2.5. Generalized Union and Intersection. Given a collection ofsets A1, A2, . . . , AN , their union is defined as the set of elements thatbelong to at least one of the sets (here n represents an integer in therange from 1 to N):

N⋃n=1

An = A1 ∪ A2 ∪ · · · ∪ AN = {x | ∃n (x ∈ An)} .

Analogously, their intersection is the set of elements that belong to allthe sets simultaneously:

N⋂n=1

An = A1 ∩ A2 ∩ · · · ∩ AN = {x | ∀n (x ∈ An)} .

These definitions can be applied to infinite collections of sets as well.For instance:

∞⋃n=1

[n, n + 1] = [1, 2] ∪ [2, 3] ∪ [3, 4] · · · = [1,∞) .

Exercise: What is∞⋂

n=1

[− 1n, 1

n] ?


3. Relations

3.1. Ordered Pairs, Cartesian Product. An ordinary pair {a, b} isa set with two elements. In a set the order of the elements is irrelevant,so {a, b} = {b, a}. If the order of the elements must be relevant, thenwe use a different object called ordered pair, represented (a, b). Now(a, b) 6= (b, a) (unless a = b). In general (a, b) = (a′, b′) iff a = a′ andb = b′.

Given two sets A, B, their Cartesian product A×B is the set of allordered pairs (a, b) such that a ∈ A and b ∈ B:

A×B = {(a, b) | a ∈ A ∧ b ∈ B} .

Analogously we can define triples or 3-tuples (a, b, c), 4-tuples (a, b, c, d),. . . , n-tuples (a1, a2, . . . , an), and the corresponding 3-fold, 4-fold, . . . ,n-fold Cartesian products:

A1 × A2 × · · · × An =

{(a1, a2, . . . , an) | a1 ∈ A1 ∧ a2 ∈ A2 ∧ · · · ∧ an ∈ An} .

If all the sets in a Cartesian product are the same, then we can usean exponent: A2 = A× A, A3 = A× A× A, etc. In general:

An = A× A× (n times)· · · × A .

An example of Cartesian product is the real plane R2, where R isthe set of real numbers (R is sometimes called real line).

3.2. Relations. Assume that we have a set of men M and a set ofwomen W , some of whom are married. We want to express which menin M are married to which women in W . One way to do that is bylisting the set of pairs (m,w) such that m is a man, w is a woman, andm is married to w. So, the relation “married to” can be representedby a subset of the Cartesian product M ×W . In general, a relation R

from a set A to a set B will be understood as a subset of the Cartesianproduct A × B, i.e., R ⊆ A× B. If an element a ∈ A is related to anelement b ∈ B, we often write a R b instead of (a, b) ∈ R.

If A and B are the same set, then any subset of A × A will be abinary relation in A. For instance, assume A = {1, 2, 3, 4}. Then the


binary relation “less than” in A will be:

<A= {(x, y) ∈ A× A | x < y}= {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)} .

Notation: A set A with a binary relation R is sometimes representedby the pair (A,R). So, for instance, (Z,≤) means the set of integerstogether with the relation of non-strict inequality.

3.3. Representations of Relations. Venn diagrams and arrows canbe used for representing relations between given sets. As an example,figure 3.1 represents the relation from A = {a, b, c, d} to B = {1, 2, 3, 4}given by R = {(a, 1), (b, 1), (c, 2), (c, 3)}. In the diagram, an arrow fromx to y means that x is related to y.

ab

c

1

23

4d

A B

Figure 3.1. Relation.

Another example is given in diagram 3.2, which represents the divis-ibility relation on the set {1, 2, 3, 4, 5, 6, 7, 8, 9}.

1 2

3

4

5

6

7

8

9

Figure 3.2. Binary relation of divisibility.


3.4. Composition of Relations. Let A, B and C be three sets.Given a relation R from A to B and a relation S from B to C, thenthe composition R ◦ S of relations R and S is a relation from A to Cdefined by:

a (R ◦ S) c ⇔ there exists some b ∈ B such that a R b and b S c .

For instance, if R is the relation “to be the father of”, and S is therelation “to be married to”, then R ◦S is the relation “to be the fatherin law of”.

3.5. Properties of Binary Relations. A binary relation R on A iscalled:

1. Reflexive if for all x ∈ A, x Rx. For instance on Z the relation“equal to” (=) is reflexive.

2. Transitive if for all x, y, z ∈ A, x R y and y R z implies x R z.For instance equality (=) and inequality (<) on Z are transitiverelations.

3. Symmetric if for all x, y ∈ A, x R y ⇒ y Rx. For instance on Z,equality (=) is symmetric, but strict inequality (<) is not.

4. Antisymmetric if for all x, y ∈ A, x R y and y Rx implies x = y.For instance, non-strict inequality (≤) on Z is antisymmetric.

3.6. Partial Orders. A partial order, or simply, an order on a set Ais a binary relation “4” on A with the following properties:

1. Reflexive: for all x ∈ A, x 4 x.2. Antisymmetric: x 4 y ∧ y 4 x ⇒ x = y.3. Transitive: x 4 y ∧ y 4 z ⇒ x 4 z.

Examples:

1. The non-strict inequality (≤) in Z.

2. Relation of divisibility on Z+: a|b ⇔ ∃t, b = at.

3. Set inclusion (⊆) on P(A) (the collection of subsets of a givenset A).


Exercise: prove that the aforementioned relations are in fact partialorders. As an example we prove that integer divisibility is a partialorder:

1. Reflexive: a = a 1 ⇒ a|a.

2. Antisymmetric: a|b ⇒ b = at for some t and b|a ⇒ a = bt′ forsome t′. Hence a = att′, which implies tt′ = 1 ⇒ t′ = t−1. Theonly invertible positive integer is 1, so t = t′ = 1 ⇒ a = b.

3. Transitive: a|b and b|c implies b = at for some t and c = bt′ forsome t′, hence c = att′, i.e., a|c.

Question: is the strict inequality (<) a partial order on Z?

A set A with a partial order 4 is called a partially ordered set orposet. We represent it (A,4). Two elements a, b ∈ A are said to becomparable if either x 4 y or y 4 x, otherwise they are said to benoncomparable. The order is called total or linear when every pairof elements x, y ∈ A are comparable. For instance (Z,≤) is totallyordered, but (Z+, |), where “|” represents integer divisibility, is not. Atotally ordered subset of a poset is called a chain; for instance the set{1, 2, 4, 8, 16, . . . } is a chain in (Z+, |).

If x 4 y and x 6= y, x is called a predecessor of y, and y is called asuccessor of x. If there is no element between them, i.e., if there is noelement z different from x and y such that x 4 z 4 y, then x is calledan immediate predecessor of y, and y is called a immediate successorof x. For instance, in (Z,≤), all negative numbers are predecessorsof 0 and all positive numbers are successors of 0, but the immediatepredecessor of 0 is −1 and its immediate successor is 1. Note than anelement may have more than one immediate successor or predecessor,for instance, in (Z+, |), all prime numbers are immediate successorsof 1.

3.7. Hasse diagrams. A Hasse diagram is a graphical representationof a partially ordered set in which each element is represented by a dot(node or vertex of the diagram). Its immediate successors are placedabove the node and connected to it by straight line segments. As anexample, figure 3.3 represents the Hasse diagram for the relation ofdivisibility on {1, 2, 3, 4, 5, 6, 7, 8, 9}.


1

4

8

6

2

7

9

3

5

Figure 3.3. Hasse diagram for divisibility.

Question: How does the Hasse diagram look for a totally orderedset?

3.8. Distinguished Elements in a Poset. In a poset (A,4) an el-ement x is called:

1. Greatest if ∀a ∈ A, a 4 x.

2. Least if ∀a ∈ A, x 4 a.

3. Maximal if ∀a ∈ A such that a 6= x, x 64 a.

4. Minimal if ∀a ∈ A such that a 6= x, a 64 x.

Note the difference between greatest and maximal, and between leastand minimal. An element is greatest if it is preceded by all the elementsin the set, and is maximal if it does not precede any other element inthe set. Obviously, a greatest element is also maximal (why?), but theconverse is not true in general. For instance, in the poset representedin figure 3.3, the elements 5, 6, 7, 8 and 9 are maximal, but none ofthem is greatest. However, in the same example, 1 is at the same timeminimal and the least element in the set.

Exercise: Prove that a poset cannot have more than one greatestelement (an analogous result holds for the least element).

Some posets have no maximal (or minimal) elements. For instance,(Z,≤) has no maximal nor minimal elements. However every finiteposet does have maximal and minimal elements. Exercise: prove thata finite poset has a greatest element if and only if its set of maximalelements has only one element (Question: is the same true for infiniteposets? Prove it, or refute it with a counterexample.)


If (A,4) is a poset, and B ⊆ A, then and element x ∈ A is called:

1. An upper bound of B if ∀b ∈ B, b 4 x.

2. A lower bound of B if ∀b ∈ B, x 4 b.

3. A least upper bound (lub) of B if x is an upper bound of B andfor any other upper bound x′ of B, we have x 4 x′.

4. A greatest lower bound (glb) of B if x is an lower bound of B andfor any other lower bound x′ of B, we have x′ 4 x.

Example: in (R,≤), an upper bound for the interval (0, 1) = {r ∈R | 0 < r < 1} is, for instance, 7. On the other hand, the least upperbound of (0, 1) is 1.

Exercise: Let (Z+, |) be the positive integers with the relation ofdivisibility. Let {a1, a2, . . . , an} be a finite subset of Z+. What is theglb of {a1, a2, . . . , an} on (Z+, |)? What is its lub?

Exercise: Let (P(A),⊆) be the collection of subsets of a given setA ordered by set inclusion. Given A1, A2 ∈ P(A), what is the glb of{A1, A2}. What is its lub?

3.9. Well-Ordering. A poset (A,4) is said to be well ordered if itsorder is total and every nonempty subset in A has a least (or “first”)element. For instance, (N,≤) is well ordered, i.e., every nonempty setof natural numbers has a least element. However (Z,≤), (Q,≤), (R,≤),are not well ordered.

Exercise: Prove that every subset of a well ordered set is also wellordered (e.g.: the set of even natural numbers is well ordered).

3.10. Equivalence Relations. An equivalence relation on a set A isa binary relation “∼” on A with the following properties:

1. Reflexive: for all x ∈ A, x ∼ x.2. Symmetric: x ∼ y ⇒ y ∼ x.3. Transitive: x ∼ y ∧ y ∼ z ⇒ x ∼ z.

For instance, on Z, the equality (=) is an equivalence relation.

Another example, also on Z, is the following: x ≡ y (mod 2) (“x iscongruent to y modulo 2”) iff x−y is even. For instance, 6 ≡ 2 (mod 2)


because 6− 2 = 4 is even, but 7 6≡ 4 (mod 2), because 7− 4 = 3 is noteven. Congruence modulo 2 is in fact an equivalence relation:

1. Reflexive: for every integer x, x− x = 0 is indeed even, so x ≡ x(mod 2).

2. Symmetric: if x ≡ y (mod 2) then x−y = t is even, but y−x = −tis also even, hence y ≡ x (mod 2).

3. Transitive: assume x ≡ y (mod 2) and y ≡ z (mod 2). Thenx− y = t and y − z = u are even. From here, x− z = (x− y) +(y − z) = t + u is also even, hence x ≡ z (mod 2).

3.11. Equivalence Classes, Quotient Set, Partitions. Given anequivalence relation ∼ on a set A, and an element x ∈ A, the setof elements of A related to x are called the equivalence class of x,represented [x] = {y ∈ A | y ∼ x}. Element x is said to be a repre-sentative of class [x]. The collection of equivalence classes, representedA/∼ = {[x] | x ∈ A}, is called quotient set of A by ∼.

Exercise: Find the equivalence classes on Z with the relation of con-gruence modulo 2.

One of the main properties of an equivalence relation on a set Ais that the quotient set, i.e. the collection of equivalence classes, is apartition of A.

A partition of a set A is a collection of non-empty subsets A1, A2, A3, . . .of A which are pairwise disjoint and whose union equals A:

1. Ai ∩ Aj = ∅ for i 6= j,

2.⋃

n An = A.

Example: in Z with the relation of congruence modulo 2 (call it“∼2”), there are two equivalence classes: the set E of even integers andthe set O of odd integers. The quotient set of Z by the relation “∼2”of congruence modulo 2 is Z/∼2 = {E, O}. We see that it is in fact apartition of Z, because E ∩O = ∅, and Z = E ∪O.

Exercise: Given a set A and an equivalence relation ∼ in A, prove:

1. ∀x ∈ A, x ∈ [x]. Deduce [x] 6= ∅ and⋃

x [x] = A.


2. x′ ∈ [x] ⇒ [x′] = [x]. Deduce [x] ∩ [y] 6= ∅ ⇒ [x] = [y], hence[x] 6= [y] ⇒ [x] ∩ [y] = ∅.

3. [x] = [y] ⇔ x ∼ y.

Exercise: Let m be an integer greater than or equal to 2. On Zwe define the relation x ≡ y (mod m) ⇔ m|(y − x) (i.e., m dividesexactly y − x). Prove that it is an equivalence relation. What are theequivalence classes? How many are there?

Exercise: On the Cartesian product Z × Z∗ we define the relation(a, b) R (c, d) ⇔ ad = bc. Prove that R is an equivalence relation.Would it still be an equivalence relation if we extend it to Z× Z?


4. Functions

4.1. Correspondences. Suppose that to each element of a set A weassign some elements of another set B. For instance, A = N, B = Z,and to each element x ∈ N we assign all elements y ∈ Z such thaty2 = x (fig. 4.1).

1

89

10

0

2

3 45

6

-11

2

-2

-3

0

3

ZN

7

Figure 4.1. Correspondence x 7→ ±√x.

This operation can be interpreted as a relation, but when we wantto stress the fact that it is an assignment of some elements to otherelements, we call it a correspondence.

4.2. Functions. A function or mapping f from a set A to a set B,denoted f : A → B, is a correspondence in which to each element x ofA corresponds exactly one element y = f(x) of B (fig. 4.2).

A B

Figure 4.2. Function.

Sometimes we represent the function with a diagram like this:

f : A → Bx 7→ y

or Af→ B

x 7→ y


For instance, the following represents the function from Z to Z de-fined by f(x) = 2x + 1:

f : Z → Zx 7→ 2x + 1

The element y = f(x) is called the image of x, and x is a preimageof y. For instance, if f(x) = 2x + 1 then f(7) = 2 · 7 + 1 = 15. Theset A is the domain of f , and B is its codomain. If A′ ⊆ A, the imageof A′ by f is f(A′) = {f(x) | x ∈ A′}, i.e., the subset of B consistingof all images of elements of A′. The subset f(A) of B consisting ofall images of elements of A is called the range of f . For instance, therange of f(x) = 2x + 1 is the set of all integers of the form 2x + 1 forsome integer x, i.e., all odd numbers.

Example: Two useful functions from R to Z are the following:

1. The floor function:

bxc = greatest integer less than or equal to x .

For instance: b2c = 2, b2.3c = 2, bπc = 3, b−2.5c = −3.

2. The ceiling function:

dxe = least integer greater than or equal to x .

For instance: d2e = 2, d2.3e = 3, dπe = 4, d−2.5e = −2.

Graph: The graph of a function f : A → B is the subset of A × Bdefined by G(f) = {(x, f(x)) | x ∈ A} (fig. 4.3).

0

1

2

3

4

y

–2 –1 1 2x

Figure 4.3. Graph of f(x) = x2.

Equality of functions : Two functions f : A → B and g : C → D areequal (f = g) if A = C, B = D, and for every a ∈ A, f(a) = g(a).Note that in order to be equal f and g must have the same domain and


the same codomain. For instance, consider the functions f : R∗ → Rdefined by f(x) = 3x/x, and g : R → R defined by g(x) = 3. Thenf(x) = g(x) for every x in the domain of f , however f 6= g, because gis defined on a larger domain.

Restriction of a function: Given a function f : A → B and a subsetA′ ⊆ A, the function f |A′ : A′ → B defined by f |A′(a) = f(a) forevery a ∈ A′ is called the restriction of f to A′. For instance, assumethat f : R → R is f(x) = 2x, and g = f |R+ is the restriction of fto positive real numbers. Then f(3) = g(3) = 6, f(−3) = −6, butg(−3) is undefined, because the domain of g is R+ (by definition), and−3 6∈ R+.

4.3. Types of Functions.

1. One-to-One or Injective: A function f : A → B is called one-to-one or injective if each element of B is the image of at most oneelement of A (fig. 4.4):

∀x, x′ ∈ A, f(x) = f(x′) ⇒ x = x′ .

For instance, f(x) = 2x from Z to Z is injective.

A B

Figure 4.4. One-to-one function.

2. Onto or Surjective: A function f : A → B is called onto orsurjective if every element of B is the image of some element of A(fig. 4.5):

∀y ∈ B, ∃x ∈ A such that y = f(x) .

For instance, f(x) = x2 from R to R+ ∪ {0} is onto.

3. One-to-One Correspondence or Bijective Function:1 A functionf : A → B is said to be bijective or a one-to-one correspondence

1Note the difference between one-to-one function and one-to-one correspondence.


A B

Figure 4.5. Onto function.

if it is one-to-one and onto (fig. 4.6). For instance, f(x) = x + 3from Z to Z is a one-to-one correspondence.

A B

Figure 4.6. One-to-One Correspondence.

Exercise: Given a function f : A → B, and two subsets A1, A2 ⊆ A,prove the following statements:

1. f(A1 ∪ A2) = f(A1) ∪ f(A2).

2. f(A1 ∩ A2) ⊆ f(A1) ∩ f(A2).

3. In the previous statement the equality does not hold in general(find a counterexample), but if f is one-to-one then f(A1 ∩A2) =f(A1) ∩ f(A2).

4.4. Identity Function. Given a set A, the function 1A : A → Adefined by 1A(x) = x for every x in A is called the identity functionfor A.

4.5. Function Composition. Given two functions f : A → B andg : B → C, the composite function of f and g is the function g ◦ f :


A → C defined by (g ◦ f)(x) = g(f(x)) for every x in A:2

Ax

f //Â //

g◦f((

By=f(x)

g //Â //

Cz=g(y)=g(f(x))

For instance, if A = B = C = Z, f(x) = x + 1, g(x) = x2, then(g ◦ f)(x) = f(x)2 = (x + 1)2. Also (f ◦ g)(x) = g(x) + 1 = x2 + 1 (thecomposition of functions is not commutative in general).

Some properties of function composition are the following:

1. If f : A → B is a function from A to B, we have that f ◦ 1A =1B ◦ f = f .

2. Given two functions Af→ B

g→ C, we have:

(a) If f and g are one-to-one, then g ◦ f is one-to-one.

(b) If f and g are onto, then g ◦ f is onto.

(c) If g ◦ f is one-to-one then f is one-to-one.

(d) If g ◦ f is onto then g is onto.

3. Function composition is associative, i.e., given three functions

Af→ B

g→ Ch→ D ,

we have that h ◦ (g ◦ f) = (h ◦ g) ◦ f .

Function iteration: If f : A → A is a function from A to A, thenit makes sense to compose it with itself: f 2 = f ◦ f . For instance, iff : Z → Z is f(x) = 2x + 1, then f 2(x) = 2(2x + 1) + 1 = 4x + 3.

Analogously we can define f 3 = f ◦f ◦f , and so on, fn = f ◦ (n times). . . ◦f .

4.6. Inverse Function. Given a function f : A → B, the inverse off is a function f−1 : B → A such that f−1 ◦ f = 1A and f ◦ f−1 = 1B.

For instance, if f : Z → Z is defined by f(x) = x+3, then its inverseis f−1(x) = x− 3.

2Note that the definition is precisely the contrary of the composition of relations.


Note that not every function has an inverse. For instance, if f : R →R+ ∪ {0} is f(x) = x2, one may think that its inverse is g(x) =

√x,

but that is incorrect.3 The composition (f ◦ g)(x) = (√

x)2 = x is the

identity for R+∪{0}, but (g ◦ f)(x) =√

x2 = |x| is not the identity forR.

A function that has inverse is called invertible. The necessary andsufficient condition for a function to be invertible is to be a one-to-onecorrespondence.

Exercise: Given two invertible functions f : A → B, g : B → Cprove that g ◦ f is invertible and (g ◦ f)−1 = f−1 ◦ g−1.

4.7. Preimage Set. Given a function f : A → B and a subset B′ ⊆B, the preimage of B′ under f is the subset of A defined by f−1(B′) ={x ∈ A | f(x) ∈ B′}.

For instance, if f : R → R is f(x) = x2, then f−1([9, 16]) =[−4,−3] ∪ [3, 4].

Warning : Here f−1 does not represent the inverse of f (perhaps fdoes not even have an inverse). The inverse of f (if it exists) is afunction from B to A. The symbol f−1 defined here, however, repre-sents a function from P(B) to P(A), i.e., it acts on subsets rather thanelements of B.

Exercise: Given a function f : A → B and two subsets B1 and B2

of B, prove that B1 ⊆ B2 ⇒ f−1(B1) ⊆ f−1(B2). In general is therea simple expression for f−1(B1 ∪ B2) and f−1(B1 ∩ B2) in terms off−1(B1) and f−1(B2)?

4.8. Cardinality. Here we give definitions for the equality and in-equality of cardinalities of sets.

Two sets A and B are said to be equipotent, or have the same cardi-nality, if there is a one-to-one correspondence from A to B:

|A| = |B| ⇔ there exists a one-to-one correspondence f : A → B .

On the other hand, we define cardinal inequality in the followingway:

|A| ≤ |B| ⇔ there exists an injective function f : A → B .

3The symbol√

a represents the non-negative solution of the equation x2 = a.


As usual, we write |A| < |B| meaning |A| ≤ |B| and |A| 6= |B|.Some properties of the relations between cardinalities are the follow-

ing:

1. Set equipotence is an equivalence relation, i.e., is reflexive, sym-metric and transitive.

2. Schroder-Bernstein theorem:

(|A| ≤ |B|) ∧ (|B| ≤ |A|) ⇒ |A| = |B| ,i.e., cardinal inequality is antisymmetric.

3. Cardinal inequality is a total order. In particular, given two setsA, B, either |A| ≤ |B| or |B| ≤ |A|.

4. Cantor’s theorem: For every set A, |A| < |P(A)|, where P(A)represents the power set of A.

If A, B are finite sets then A ⊂ B ⇒ |A| < |B|. However, if theyare infinite, then we can only conclude that |A| ≤ |B|. Example: Theset Z+ = {1, 2, 3, 4, . . . } is a proper subset of N = {0, 1, 2, 3, 4, . . . },however they are equipotent because the following is a one-to-one cor-respondence:

N → Z+

n 7→ n + 1

Also |P | = |Z|, where P = set of even integers, as proven by thefollowing one-to-one correspondence:

Z → Pn 7→ 2n

Exercise: Prove that |N| = |Z|.Exercise: Prove that f(x) = ex/(1 + ex) is a one-to-one correspon-

dence between R and the interval (0, 1). Hence |R| = |(0, 1)|.4

4If |N| = |X|, the set X is called countable; if |N| < |X|, X is called uncountable.It is known that |N| = |Z| = |Q| < |(0, 1)| = |R| = |P(N)|. Hence N, Z and Q arecountable, but (0, 1), R and |P(N)| are uncountable.


4.9. Operations. An n-ary operation on a set A with values in a setB is a function f : An → B. If B ⊆ A then we say that the operationis closed on A, or A is closed under f .

If n = 1, the operation is called a monary operation. For instance,change of sign on Z and multiplicative inversion on Q∗:

Z → Zx 7→ −x

Q∗ → Q∗

x 7→ 1/x

If n = 2, the operation is called a binary operation on A. For in-stance, addition and multiplication of integers are binary operationson Z:

Z× Z +→ Z(x, y) 7→ x + y

Z× Z ·→ Z(x, y) 7→ x y

It is customary to write binary operations in infix notation, i.e., by plac-ing the symbol for the operation between its arguments. For instance,the sum of two numbers is written x + y instead of +(x, y).

If a set A with a binary operation ∗ : A2 → A on A is denoted (A, ∗).For instance, (Z, +) is Z with the operation of addition. We may denotein similar way sets with more than one operation, for instance (Z, +, ·)is the set of integers with the operations of addition and multiplication.

4.10. Properties of Binary Operations. Let ∗ : A2 → A be abinary operation.

1. Commutativity : The operation is said to be commutative if forevery x, y ∈ A, x∗ y = y ∗x. Examples: the addition and productof integers are commutative, but matrix multiplication is not.

2. Associativity : The operation is said to be associative if for everyx, y, z ∈ A, x ∗ (y ∗ z) = (x ∗ y) ∗ z. Examples: the additionand product of integers are associative. An example of a nonassociative operation could be ∗ : (Z+)2 → Z+ defined by x ∗ y =xy (check it).

3. Identity Element : An element e ∈ A is said to be an identityelement respect to ∗ if for every x ∈ A, x ∗ e = e ∗ x = x.For instance, 0 (zero) is the identity element in (Z, +), and 1 isthe identity element in (Z, ·). Exercise: Prove that the identityelement has to be unique.


4. Inverse: Assume that A has an identity element e respect to ∗.The inverse of a given element x ∈ A is another element x−1 suchthat x ∗ x−1 = x−1 ∗ x = e. For instance, in (Z, +), the inverse ofan element x is −x. In (Z, ·), if x 6= 0 then its inverse is 1/x.

4.11. Structures. Here G, R are nonempty sets and ∗, +, · representclosed binary operations.

4.11.1. Group. (G, ∗) is a group if ∗ has the following properties:

1. Associative: ∀x, y, z ∈ G, (x ∗ y) ∗ z = x ∗ (y ∗ z).2. Identity Element: ∃e ∈ G, ∀x ∈ G, x ∗ e = e ∗ x.3. Inverse: ∀x ∈ G, ∃x−1 ∈ G, x ∗ x−1 = x−1 ∗ x = e.

If ∗ is also commutative, then (G, ∗) is called a commutative or Abeliangroup. Example: (Z, +) and (Q∗, ·) are commutative groups.

Exercise: Prove that in a group the inverse of an element is unique,i.e., x ∗ x′ = x′ ∗ x = e and x ∗ x′′ = x′′ ∗ x = e implies x′ = x′′. Whatis the inverse of the inverse ((x−1)−1)?

4.11.2. Ring. (R, +, ·) is a ring if it has the following properties:

1. (R, +) is a commutative group.2. “·” is associative: ∀x, y, z ∈ R, (x · y) · z = x · (y · z).3. Distributive Laws of “·” respect to +:

(a) x · (y + z) = x · y + x · z.(b) (x + y) · z = x · z + y · z.

The identity element in (R, +) is called the zero element (usuallyrepresented “0”).

Exercise: Prove that in any ring, 0 · x = x · 0 = 0 (hint: 0 = 0 + 0).

Exercise: Prove the rule of signs on a ring: x · (−y) = (−x) · y =−(x · y), (−x) · (−y) = x · y.

If “·” is commutative, (R, +, ·) is called a commutative ring. Theidentity element for “·”, if it exists and is different from 0, is calledunity (usually represented “1”), and (R, +, ·) is called a ring with unity.

Example: (Z, +, ·), (Q, +, ·), (R, +, ·), (C, +, ·) are commutative ringswith unity.


In a ring with unity (R, +, ·), the inverse x−1 of an element x respectto “·” is called the multiplicative inverse of x.

Exercise: Prove that in a ring with unity, 0 does not have a multi-plicative inverse.

An element a such that ab = 0 for some nonzero b is called a divisorof zero or zero divisor. Obviously 0 is a divisor of zero, and in somerings such as (Z, +, ·), (Q, +, ·), (R, +, ·), (C, +, ·) it is the only divisorof zero, so in those rings ab = 0 implies a = 0 or b = 0. A nonzerodivisor of zero (if it exists) is called a proper divisor of zero or properzero divisor (see bellow). Exercise: Prove that a divisor of zero has nomultiplicative inverse.

An element in a ring with unity is called a unit if it has a multi-plicative inverse.5 For instance, in (Z, +, ·) the units are 1 and −1.Exercise: If (R, +, ·) is a ring with unity and R∗ is the set of units inR, prove that (R∗, ·) is a multiplicative group.

A commutative ring in which all nonzero elements are units is calleda field.

4.11.3. Field. A field is a commutative ring with unity in which ev-ery nonzero element has a multiplicative inverse.6 Example: (Q, +, ·),(R, +, ·), (C, +, ·) are fields, but (Z, +, ·) is not.

Exercise: Prove that a field has no proper divisors of zero, i.e., in afield ab = 0 implies a = 0 or b = 0.

Exercise: Given a set A with at least two elements, prove that(P(A),4,∩) (the power set of A with the operations of symmetricdifference and intersection) is a commutative ring with unity, but nota field. Prove that (P(A),4,∩) has proper divisors of zero.

4.12. Substructures.

4.12.1. Subgroup. Given a group (G, ∗), a subgroup of G is a subsetH ⊆ G such that (H, ∗) is a group with the same identity element e asG. For instance, (Z, +) is a subgroup of (Q, +).

5Do not confuse ”unit” with “unity”.6If we drop “commutative” in the definition we get a so called division ring.


4.12.2. Subring, Subfield. Given a ring (R, +, ·), a subring of R is asubset S ⊆ R such that (S, +, ·) is a ring with the same zero 0 andunity 1 (if R has unity) as R. For instance, (Z, +, ·) is a subring of(Q, +, ·). If R and S are fields, then S is called a subfield of R. Forinstance, (Q, +, ·) is a subfield of (R, +, ·).

4.13. Homomorphisms. A homomorphism between two structuresof the same kind S1, S2 is a function f : S1 → S2 that preserves theoperations of the structures.

If (G, ∗), (H, ◦) are groups, a group-homomorphism from G to H isa function f : G → H such that ∀x, y ∈ G, f(x ∗ y) = f(x) ◦ f(y). Forinstance f(n) = 2n is a homomorphism from (Z, +) to (Q∗, ·), becausef(n + m) = 2m+n = 2m 2n = f(n) · f(m).

If (R, +, ·), (S, +, ·) are rings, a ring-homomorphism from R to S isa function f : R → S such that ∀x, y ∈ R, f(x + y) = f(x) + f(y) andf(x · y) = f(x) · f(y). If R and S have unity, then f is also requiredto verify f(1R) = 1S, where 1R is the unity in R and 1S is the unity inS. If R, S are fields, f is called a field-homomorphism. For instance,complex conjugation x + yi 7→ x − yi is a field-homomorphism from(C, +, ·) to itself.

A homomorphism that is also a one-to-one correspondence is calledisomorphism. For instance, f(x) = 2x is a group-isomorphism from(Z, +) to (2Z, +).


5. Counting

5.1. The Rule of Sum. If a task can be performed in m ways, whileanother task can be performed in n ways, and the two tasks cannot beperformed simultaneously, then performing either task can be accom-plished in m + n ways.

Set theoretical version of the sum rule: If A and B are disjoint sets(A ∩B = ∅) then

|A ∪B| = |A|+ |B| .More generally, if the sets A1, A2, . . . , An are pairwise disjoint, then:

|A1 ∪ A2 ∪ · · · ∪ An| = |A1|+ |A2|+ · · ·+ |An| .

For instance, if a class has 30 male students and 25 female students,then the class has 30 + 25 = 45 students.

5.2. The Rule of Product. If a task can be performed in m waysand another independent task can be performed in n ways, then thecombination of both tasks can be performed in mn ways.

Set theoretical version of the product rule: Let A×B be the Carte-sian product of sets A and B. Then:

|A×B| = |A| · |B| .More generally:

|A1 × A2 × · · · × An| = |A1| · |A2| · · · |An| .

For instance, assume that a license plate contains two letters followedby three digits. How many different license plates can be printed?Answer : each letter can be printed in 26 ways, and each digit can beprinted in 10 ways, so 26 · 26 · 10 · 10 · 10 = 676000 different plates canbe printed.

Exercise: Given a set A with m elements and a set B with n elements,find the number of functions from A to B.

5.3. Permutations. Assume that we have n objects. Any arrange-ment of any k of these objects in a given order is called a permutationof size k. If k = n then we call it just a permutation of the n objects.For instance, the permutations of the letters a, b, c are the following:


abc, acb, bac, bca, cab, cba. The permutations of size 2 of the lettersa, b, c, d are: ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.

Note that the order is important. Given two permutations, they areconsidered equal if they have the same elements arranged in the sameorder.

We find the number P (n, k) of permutations of size k of n givenobjects in the following way: The first object in an arrangement canbe chosen in n ways, the second one in n − 1 ways, the third one inn− 2 ways, and so on, hence:

P (n, k) = n× (n− 1)× (k factors)· · · × (n− k + 1) =n!

(n− k)!,

where n! = 1× 2× 3× (n factors)· · · × n is called “n factorial”.

The number P (n, k) of permutations of n objects is

P (n, n) = n! .

By convention 0! = 1.

For instance, there are 3! = 6 permutations of the 3 letters a, b, c. Thenumber of permutations of size 2 of the 4 letters a, b, c, d is P (4, 2) =4× 3 = 12.

Exercise: Given a set A with m elements and a set B with n elements,find the number of one-to-one functions from A to B.

5.4. Permutations with Repeated Elements. Assume that we havean alphabet with k letters and we want to write all possible words con-taining n1 times the first letter of the alphabet, n2 times the secondletter, . . . , nk times the kth letter. How many words can we write?We call this number P (n; n1, n2, . . . , nk), where n = n1 + n2 + · · ·+ nk.

Example: With 3 a’s and 2 b’s we can write the following 5-letterwords: aaabb, aabab, abaab, baaab, aabba, ababa, baaba, abbaa, babaa,bbaaa.

We may solve this problem in the following way, as illustrated withthe example above. Let us distinguish the different copies of a letterwith subindices: a1a2a3b1b2. Next, generate each permutation of thisfive elements by choosing 1) the position of each kind of letter, then 2)the subindices to place on the 3 a’s, then 3) these subindices to placeon the 2 b’s. Task 1) can be performed in P (5; 3, 2) ways, task 2) can be


performed in 3! ways, task 3) can be performed in 2!. By the productrule we have 5! = P (5; 3, 2)× 3!× 2!, hence P (5; 3, 2) = 5!/3! 2!.

In general the formula is:

P (n; n1, n2, . . . , nk) =n!

n1! n2! . . . nk!.

5.5. Combinations. Assume that we have a set A with n objects.Any subset of A of size r is called a combination of n elements takenr at a time. For instance, the combinations of the letters a, b, c, d, etaken 3 at a time are: abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde,where two combinations are considered identical if they have the sameelements regardless of their order.

The number of subsets of size r in a set A with n elements is:

C(n, r) =n!

r! (n− r)!.

The symbol(

nr

)(read “n choose r”) is often used instead of C(n, r).

One way to derive the formula for C(n, r) is the following. Let Abe a set with n objects. In order to generate all possible permutationsof size r of the elements of A we 1) take all possible subsets of sizer in the set A, and 2) permute the k elements in each subset in allpossible ways. Task 1) can be performed in C(n, r) ways, and task2) can be performed in P (r, r) ways. By the product rule we haveP (n, r) = C(n, r)× P (r, r), hence

C(n, r) =P (n, r)

P (r, r)=

n!

r! (n− r)!.

5.6. Binomial Theorem. The following identities can be easily checked:

(x + y)0 = 1

(x + y)1 = x + y

(x + y)2 = x2 + 2 xy + y2

(x + y)3 = x3 + 3 x2y + 3 xy2 + y3


They can be generalized by the following formula, called the BinomialTheorem:

(x + y)n =n∑

k=0

(n

k

)xn−kyk

=

(n

0

)xn +

(n

1

)xn−1y +

(n

2

)xn−2y2 + · · ·

+

(n

n− 1

)xyn−1 +

(n

n

)yn .

We can find this formula by writing

(x + y)n = (x + y)× (x + y)× (n factors)· · · × (x + y) ,

expanding, and grouping terms of the form xayb. Since there are nfactors of the form (x + y), we have a + b = n, hence the terms mustbe of the form xn−kyk. The coefficient of xn−kyk will be equal to thenumber of ways in which we can select the y from any k of the factors(and the x from the remaining n − k factors), which is C(n, k) =

(nk

).

The expression(

nk

)is often called binomial coefficient.

Exercise: Proven∑

k=0

(n

k

)= 2n and

n∑k=0

(−1)k

(n

k

)= 0 .

Hint: Apply the binomial theorem to (1 + 1)2 and (1− 1)2.

5.7. Properties of Binomial Coefficients. The binomial coefficientshave the following properties:

1.

(n

k

)=

(n

n− k

)

2.

(n + 1

k + 1

)=

(n

k

)+

(n

k + 1

)

The first property follows easily from

(n

k

)=

n!

k!(n− k)!.

The second property can be proven by choosing a distinguished el-ement a in a set A of n + 1 elements. The set A has

(n+1k+1

)subsets

of size k + 1. Those subsets can be partitioned into two classes: that


of the subsets containing a, and that of the subsets not containing a.The number of subsets containing a equals the number of subsets ofA− {a} of size k, i.e.,

(nk

). The number of subsets not containing a is

the number of subsets of A − {a} of size k + 1, i.e.,(

nk+1

). Using the

sum principle we find that in fact(

n+1k+1

)=

(nk

)+

(n

k+1

).

5.8. Pascal’s Triangle. The properties shown in the previous sectionallow us to compute binomial coefficients in a simple way. Look at thefollowing triangular arrangement of binomial coefficients:(

00

)(10

) (11

)(20

) (21

) (22

)(30

) (31

) (32

) (33

)(40

) (41

) (42

) (43

) (44

)We notice that each binomial coefficient on this arrangement must

be the sum of the two closest binomial coefficients on the line above it.This together with

(n0

)=

(nn

)= 1, allows us to compute very quickly

the values of the binomial coefficients on the arrangement:

11 1

1 2 11 3 3 1

1 4 6 4 1

This arrangement of binomial coefficients is called Pascal’s Triangle.1

5.9. Combinations with Repetition. Assume that we have a set Awith n elements. Any selection of r objects from A, where each objectcan be selected more than once, is called a combination of n objectstaken r at a time with repetition. For instance, the combinations of theletters a, b, c, d taken 3 at a time with repetition are: aaa, aab, aac,aad, abb, abc, abd, acc, acd, add, bbb, bbc, bbd, bcc, bcd, bdd, ccc, ccd,cdd, ddd. Two combinations with repetition are considered identicalif they have the same elements repeated the same number of times,regardless of their order.

Note that the following are equivalent:

1Although it was already known by the Chinese in the XIV century.


1. The number of combinations of n objects taken r at a time withrepetition.

2. The number of ways r identical objects can be distributed amongn distinct containers.

3. The number of nonnegative integer solutions of the equation:

x1 + x2 + · · ·+ xn = r .

Example: Assume that we have 3 different (empty) milk containersand 7 quarts of milk that we can measure with a one quart measuringcup. In how many ways can we distribute the milk among the threecontainers? We solve the problem in the following way. Let x1, x2, x3 bethe quarts of milk to put in containers number 1, 2 and 3 respectively.The number of possible distributions of milk equals the number of nonnegative integer solutions for the equation x1 + x2 + x3 = 7. Insteadof using numbers for writing the solutions, we will use strokes, so forinstance we represent the solution x1 = 2, x2 = 1, x3 = 4, or 2 + 1 + 4,like this: ||+ |+ ||||. Now, each possible solution is an arrangement of 7strokes and 2 plus signs, so the number of arrangements is P (9; 7, 2) =9!/7! 2! =

(97

).

The general solution is:

P (n + r − 1; r, n− 1) =(n + r − 1)!

r! (n− 1)!=

(n + r − 1

r

).

5.10. The Inclusion-Exclusion Principle. We saw that if two setsA, B are disjoint, then |A ∪ B| = |A| + |B|. The inclusion-exclusionrule generalizes this formula to non-disjoint sets.

In general, for arbitrary (but finite) sets A, B:

|A ∪B| = |A|+ |B| − |A ∩B| .We can derive this formula by applying the sum rule to the disjointsets A−B, B − A, A ∩B:

|A| = |(A−B) ∪ A ∩B| = |A−B|+ |A ∩B||B| = |(B − A) ∪ A ∩B| = |B − A|+ |A ∩B|

|A ∪B| = |(A−B) ∪ (B − A) ∪ A ∩B|= |A−B|+ |B − A|+ |A ∩B| ,

which combined give:

|A|+ |B| − |A ∪B| = |A ∩B| .


Example: Assume that in a university with 1000 students, 200 stu-dents are taking a course in mathematics, 300 are taking a course inphysics, and 50 students are taking both. 1) How many students aretaking at least one of those courses? 2) How many are not taking ei-ther of those courses? 3) How many are taking exactly one of thosecourses? Answer : If U = total set of students in the university, M =set of students taking Mathematics, P = set of students taking Physics,then:

1) |M ∪ P | = |M |+ |P | − |M ∩ P | = 300 + 200− 50 = 450 studentsare taking Mathematics or Physics.

2) |U − (M ∪ P )| = |U | − |M ∪ P | = 1000− 450 = 550 students arenot taking any of those courses.

3) |M 4P | = |(M∪P )−(M∩P )| = |M∪P |−|M∩P | = 450−50 =400 students are taking exactly one of those courses.

For three sets the following formula applies:

|A ∪B ∪ C| = |A|+ |B|+ |C|− |A ∩B| − |A ∩ C| − |B ∩ C|+ |A ∩B ∩ C| ,

and for an arbitrary union of sets:

|A1 ∪ A2 ∪ · · · ∪ An| = s1 − s2 + s3 − s4 + · · · ± sn ,

where sk = sum of the cardinalities of all possible k-tuple intersectionsof the given sets.

Example: Find the number of permutations of a, b, c, d, e, f, g thatdo not contain the sequences abc, de, fg. Answer : Let A = set of allpermutations of a, b, c, d, e, f, g; A1 = permutations that contain abc;A2 = permutations that contain de; A3 = permutations that containfg. We want to find |A− (A1 ∪ A2 ∪ A3)| = |A| − |A1 ∪ A2 ∪ A3|. Weeasily find:

|A| = 7! |A1| = 5! |A2| = 6! |A3| = 6!|A1 ∩ A2| = 4! |A1 ∩ A3| = 4! |A2 ∩ A3| = 5! |A1 ∩ A2 ∩ A3| = 3!

Using the inclusion-exclusion principle:

|A1 ∪ A2 ∪ A3| = 5! + 6! + 6!− 4!− 4!− 5! + 3! = 1398 ,

hence:

|A− (A1 ∪ A2 ∪ A3)| = 7!− 1398 = 5040− 1398 = 3642 .


5.11. Counting with Venn Diagrams. A Venn diagram with n setsintersecting in the most general way divides the plane into 2n regions.2

If we have information about the number of elements of some portionsof the diagram, then we can find the number of elements in each of theregions and use the sum rule for obtaining the number of elements inother portions of the plane.

Example: Let M , P and C be the sets of students taking Mathematicscourses, Physics courses and Computer Science courses respectively ina university. Assume |M | = 300, |P | = 350, |C| = 450, |M ∩P | = 100,|M ∩ C| = 150, |P ∩ C| = 75, |M ∩ P ∩ C| = 10. How many studentsare taking exactly one of those courses? (fig. 5.1)

10

185

235

65140

C

60 90

M P

Figure 5.1. Counting with Venn diagrams.

We see that |(M∩P )−(M∩P∩C)| = 100−10 = 10, |(M∩C)−(M∩P ∩C)| = 150−10 = 140 and |(P ∩C)− (M ∩P ∩C)| = 75−10 = 65.Then the region corresponding to students taking Mathematics coursesonly has cardinality 300−(90+10+140) = 60. Analogously we computethe number of students taking Physics courses only (185) and takingComputer Science courses only (235). The sum 60 + 185 + 235 = 480is the number of students taking exactly one of those courses.

5.12. Probability. Assume that we perform an experiment. The setof possible outcomes is called the sample space of the experiment. Anevent is a subset of the sample space. For instance, if we toss a cointhree times, the sample space is

S = {HHH,HHT,HTH,HTT, THH, THT, TTH, TTT} .

The event “at least two heads in a row” would be the subset

A = {HHH,HHT, THH} .

2Caveat: Venn diagrams are not very useful if n > 3.


If all possible outcomes of an experiment have the same likelihood ofoccurrence, then the probability of an event A ⊂ S is given by Laplace’srule:

Pr(A) =|A||S| .

For instance, the probability of getting at least two heads in a row inthe above experiment is 3/8.

5.13. The Pigeonhole Principle. The pigeonhole principle is usedfor proving that a certain situation must actually occur. It says thefollowing: If n pigeonholes are occupied by m pigeons and m > n, thenat least one pigeonhole is occupied by more than one pigeon.3

Example: In any given set of 13 people at least two of them havetheir birthday during the same month.

Example: Let S be a set of eleven 2-digit numbers. Prove that S musthave two elements whose digits have the same difference (for instance inS = {10, 14, 19, 22, 26, 28, 49, 53, 70, 90, 93}, the digits of the numbers28 and 93 have the same difference: 8 − 2 = 6, 9 − 3 = 6.) Answer :The digits of a two-digit number can have 10 possible differences (from0 to 9). So, in a list of 11 numbers there must be two with the samedifference.

Example: Assume that we choose three different digits from 1 to 9and write all permutations of those digits. Prove that among the 3-digitnumbers written that way there are two whose difference is a multipleof 500. Answer : There are 9 · 8 · 7 = 504 permutations of three digits.On the other hand if we divide the 504 numbers by 500 we can getonly 500 possible remainders, so at least two numbers give the sameremainder, and their difference must be a multiple of 500.

Exercise: Prove that if we select n + 1 numbers from the set S ={1, 2, 3, . . . , 2n}, among the numbers selected there are two such thatone is a multiple of the other one.

3The Pigeonhole Principle (Schubfachprinzip) was first used by Dirichlet in Num-ber Theory. The term pigeonhole actually refers to one of those old-fashioned writ-ing desks with thin vertical wooden partitions in which to file letters.


6. Mathematical Induction

Many properties of positive integers can be proven by mathematicalinduction. Here we look at several equivalent forms of this principle.

6.1. Principle of Mathematical Induction (First Form). Let Pbe a property of positive integers such that:

1. Basis Step: P (1) is true, and

2. Inductive Step: if P (n) is true, then P (n + 1) is true.

Then P (n) is true for all positive integers.

Symbolically (here the universe of discourse is Z+):

P (1) ∧ ∀n [P (n) → P (n + 1)] ⇒ ∀nP (n) .

Remark : The premise P (n) in the inductive step is called InductionHypothesis.

The validity of the Principle of Mathematical Induction is obvious.The basis step states that P (1) is true. Then the inductive step impliesthat P (2) is also true. By the inductive step again we see that P (3)is true, and so on. Consequently the property is true for all positiveintegers.

Remark : In the basis step we may replace 1 with some other integerm. Then the conclusion is that the property is true for every integer ngreater than or equal to m.

Example: Prove that the sum of the n first odd positive integersis n2, i.e., 1 + 3 + 5 + · · · + (2n − 1) = n2. Answer : Let S(n) =1 + 3 + 5 + · · ·+ (2n− 1). We want to prove by induction that for allpositive integer n, S(n) = n2.

1. Basis Step: If n = 1 we have S(1) = 1 = 12, so the property istrue for 1.

2. Inductive Step: Assume (Induction Hypothesis) that the propertyis true for some positive integer n, i.e.: S(n) = n2. We must provethat it is also true for n + 1, i.e., S(n + 1) = (n + 1)2. In fact:

S(n + 1) = 1 + 3 + 5 + · · ·+ (2n + 1) = S(n) + 2n + 1 .


But by induction hypothesis, S(n) = n2, hence:

S(n + 1) = n2 + 2n + 1 = (n + 1)2 .

This completes the induction, and shows that the property is true forall positive integers.

Exercise: Prove the following identities by induction:

• 1 + 2 + 3 + · · ·+ n =n (n + 1)

2.

• 12 + 22 + 32 + · · ·+ n2 =n (n + 1) (2n + 1)

6.

• 13 + 23 + 33 + · · ·+ n3 = (1 + 2 + 3 + · · ·+ n)2.

6.2. Principle of Mathematical Induction (Second Form). LetP be a property of positive integers such that:

1. Basis Step: P (1) is true, and

2. Inductive Step: if P (k) is true for all 1 ≤ k ≤ n then P (n + 1) istrue.

Then P (n) is true for all positive integers.

Symbolically:

P (1) ∧ ∀n [(∀k ≤ n, P (k)) → P (n + 1)] ⇒ ∀nP (n) .

Example: Prove that every integer n ≥ 2 is prime or a product ofprimes. Answer :

1. Basis Step: 2 is a prime number, so the property holds for n = 2.1

2. Inductive Step: Assume that if 2 ≤ k ≤ n, then k is a primenumber or a product of primes. Now, either n + 1 is a primenumber or it is not. If it is a prime number then it verifies theproperty. If it is not a prime number, then it can be written as theproduct of two positive integers, n = k1 k2, such that 1 < k1, k2 <n. By induction hypothesis each of k1 and k2 must be a prime ora product of primes, hence n is a product of primes.

1Note that in this example the basis step corresponds to n = 2.


This completes the proof.

6.3. The Well-Ordering Principle. Every nonempty subset of Z+

has a smallest element, i.e., Z+ is well-ordered.

6.4. Recursive Definitions. A definition such that the object de-fined occurs in the definition is called a recursive definition. For in-stance, consider the Fibonacci sequence

0, 1, 1, 2, 3, 5, 8, 13, . . .

It can be defined as a sequence whose two first terms are F0 = 0,F1 = 1 and each subsequent term is the sum of the two previous ones:Fn+1 = Fn + Fn−1 (for n ≥ 1).

Other examples:

• Factorial:1. 0! = 12. (n + 1)! = n! (n + 1) (n ≥ 0)

• Power:1. a0 = 12. an+1 = an a (n ≥ 0)

In all these examples we have:

1. A Basis, where the function is explicitly evaluated for one or morevalues of its argument.

2. A Recursive Step, stating how to compute the function from itsprevious values.

Several advantages of a recursive definition are the following

1. It is sometimes easier than giving an explicit formula. For in-stance, the Fibonacci sequence has the following explicit formula:

Fn =1√5

{(1 +

√5

2

)n

−(

1−√5

2

)n},

but its recursive definition is much simpler.

2. It allows us to prove by mathematical induction many propertiesof the object defined.


Example: Consider the Lucas sequence:

2, 1, 3, 4, 7, 11, 18, 29, . . . ,

recursively defined in the following way:

1. L0 = 2, L1 = 1.2. Ln+1 = Ln + Ln−1 (n > 0)

Prove that for all n ≥ 1, Ln = Fn−1 +Fn+1. Answer : By mathematicalinduction:

1. Basis Step: We have L1 = 1, F0 = 0, F2 = 1, hence L1 = F0 + F2

and the property holds for n = 1.

2. Inductive Step: Assume the property holds for 1 ≤ k ≤ n, i.e.:Lk = Fk−1 + Fk+1 if 1 ≤ k ≤ n. Then

Ln+1 = Ln + Ln−1

= Fn−1 + Fn+1 + Fn−2 + Fn

= (Fn−1 + Fn−2) + (Fn+1 + Fn)= Fn + Fn+2 ,

hence the property holds for n + 1.

Hence, by the Principle of Mathematical Induction the property holdsfor every n ≥ 1, Q.E.D.2

The following exercise shows that recursive definitions and math-ematical induction can be applied to more general objects than se-quences and functions defined on integers.

Exercise: Given a set A we define recursively the following n-aryfunction on P(A):

D1(S) = S

Dn+1(S1, . . . , Sn+1) = Dn(S1, . . . , Sn)4Sn+1 (n > 0)

S, S1, . . . , Sn represent arbitrary subsets of A, and 4 is the symmetricdifference of sets. Prove that Dn(S1, . . . , Sn) = elements of A thatbelong exactly to an odd number of the sets S1, . . . , Sn.

2Q.E.D. = quod erat demonstrandum, i.e., “which was to be shown”.


6.5. Recurrence Relations. Here we look at recursive definitions un-der a different point of view. Rather than definitions they will be con-sidered as equations that we must solve. The point is that a recursivedefinition is actually a definition when there is one and only one objectsatisfying it, i.e., when the equations involved in that definition havea unique solution. Also, the solution to those equations may provide aclosed-form (explicit) formula for the object defined.

The recursive step in a recursive definition is also called a recurrencerelation. We will focus on kth-order linear recurrence relations, whichare of the form

C0 xn + C1 xn−1 + C2 xn−2 + · · ·+ Ck xn−k = bn ,

where C0, C1, . . . , Ck are constant and C0 6= 0. If bn = 0 the recurrencerelation is called homogeneous. Otherwise it is called nonhomogeneous.

The basis of the recursive definition is also called initial conditionsof the recurrence. So, for instance, in the recursive definition of theFibonacci sequence, the recurrence is

Fn = Fn−1 + Fn−2

or

Fn − Fn−1 − Fn−2 = 0 ,

and the initial conditions are

F0 = 0, F1 = 1 .

6.6. First-Order Linear Recurrence Relations. The homogeneouscase can be written in the following way:

xn = r xn−1 (n > 0) ; x0 = A .

Its general solution is

xn = Arn ,

which is a geometric sequence with ratio r.

Exercise: Prove this result by mathematical induction.

The non-homogeneous case can be written in the following way:

xn = r xn−1 + cn (n > 0) ; x0 = A .


Using the summation notation, its solution can be expressed like this:

xn = Arn +n∑

k=1

ck rn−k .

We examine two particular cases. The first one is

xn = r xn−1 + c (n > 0); x0 = A .

where c is a constant. The solution is

xn = Arn + c

n∑k=1

rn−k = Arn + crn − 1

r − 1if r 6= 1 ,

and

xn = A + c n if r = 1 .

Example: We bought a house for $200,000 and got a 30-year mortgageat a fixed rate of 7.8%. How much must we pay each month? Answer :Let xn be the principal after the nth monthly payment, r = 7.8/(12 ·100) = 0.0065 the monthly rate, and m the monthly payment. Then

xn = xn−1 + 0.0065xn−1 −m = 1.0065 · xn−1 −m ; x0 = 200000 .

The solution to this recurrence is

xn = 200000 · 1.0065n −m1.0065n − 1

0.0065.

The principal after 30 · 12 = 360 payments must be zero, hence

200000 · 1.0065360 −m1.0065360 − 1

0.0065= 0 ,

so

m =200000 · 1.0065360 · 0.0065

1.0065360 − 1= $1439.74 .

The second particular case is for r = 1 and cn = c+d n, where c andd are constant (so cn is an arithmetic sequence):

xn = xn−1 + c + d n (n > 0) ; x0 = A .

The solution is now

xn = A +n∑

k=1

(c + d k) = A + c n +d n (n + 1)

2.


6.7. Second-Order Linear Homogeneous Recurrence Relations.Now we look at the recurrence relation

C0 xn + C1 xn−1 + C2 xn−2 = 0 .

First we will look for solutions of the form xn = c rn. By plugging inthe equation we get:

C0 c rn + C1 c rn−1 + C2 c rn−2 = 0 ,

hence r must be a solution of the following equation, called the char-acteristic equation of the recurrence:

C0 r2 + C1 r + C2 = 0 .

Let r1, r2 be the two (in general complex) roots of the above equation.They are called characteristic roots. We distinguish three cases:

1. Distinct Real Roots. In this case the general solution of the recur-rence relation is

xn = c1 rn1 + c2 rn

2 ,

where c1, c2 are arbitrary constants.

2. Double Real Root. If r1 = r2 = r, the general solution of therecurrence relation is

xn = c1 rn + c2 n rn ,

where c1, c2 are arbitrary constants.

3. Complex Roots. In this case the solution could be expressed inthe same way as in the case of distinct real roots, but in order toavoid the use of complex numbers we write r1 = r eαi, r2 = r e−αi,k1 = c1 + c2, k2 = (c1 − c2) i, which yields:3

xn = k1 rn cos nα + k2 rn sin nα .

Example: Find a closed-form formula for the Fibonacci sequence de-fined by:

Fn+1 = Fn + Fn−1 (n > 0) ; F0 = 0, F1 = 1 .

Answer : The recurrence relation can be written

Fn − Fn−1 − Fn−2 = 0 .

The characteristic equation is

r2 − r − 1 = 0 .

3Reminder: eαi = cos α + i sin α.


Its roots are:4

r1 = φ =1 +

√5

2; r2 = −φ−1 =

1−√5

2.

They are distinct real roots, so the general solution for the recurrenceis:

Fn = c1 φn + c2 (−φ−1)n .

Using the initial conditions we get the value of the constants:{(n = 0) c1 + c2 = 0

(n = 1) c1 φ + c2 (−φ−1) = 1⇒

{c1 = 1/

√5

c2 = −1/√

5

Hence:

Fn =1√5

{φn − (−φ)−n

}.

6.8. Generating Functions. Given a sequence a0, a1, a2, . . . , the func-tion

f(x) = a0 + a1 x + a2 x2 + · · · =∞∑

n=0

an xn

is called the generating function for the given sequence.

Examples:

1. (1 + x)m =∑m

n=0

(mn

)xn is the generating function for(

m

0

),

(m

1

),

(m

2

), . . . ,

(m

m

), 0, 0, 0, . . . .

2. 1/(1 − x) = 1 + x + x2 + x3 + · · · is the generating function forthe constant sequence an = 1.

3. The generating function of a geometric sequence an = Arn is

A

1− r x= A + Ar x + Ar2 x2 + Ar3 x3 + . . .

4. If f(x) = a0 + a1 x + a2 x2 + a3 x3 + · · · is the generating functionfor a0, a1, a2, . . . , then the following is the generating function for

4φ = 1+√

52 is the Golden Ratio.


the sequence of partial sums∑n

k=0 ak:

f(x)

1− x= f(x)(1 + x + x2 + x3 + · · · )= (a0 + a1 x + a2 x2 + · · · )(1 + x + x2 + x3 + · · · )= a0 + (a0 + a1) x + (a0 + a1 + a2) x2 + · · ·

Example: Use generating functions to find the sum of the terms of ageometric sequence: sn = 1 + r + r2 + · · · + rn (r 6= 1). Answer : Wehave

1

1− rx=

∞∑n=0

rnxn ,

hence

1

(1− x)(1− rx)=

∞∑n=0

snxn .

Next, we expand the left hand side in partial fractions:

1

(1− x)(1− rx)=

1

1− r

(1

1− x− r

1− rx

)

=1

r − 1

( ∞∑n=0

rn+1xn −∞∑

n=0

xn

)

=∞∑

n=0

rn+1 − 1

r − 1xn ,

hence

sn =rn+1 − 1

r − 1.

6.9. Generating Function of a Linear Homogeneous Recur-rence. Given a kth-order linear homogeneous recurrence relation:

xn + c1 xn−1 + c2 xn−2 + · · ·+ ck xn−k = 0

(note that the leading coefficient is 1), its characteristic polynomial is

Q(r) = rk + c1 rn−1 + c2 rn−2 + · · ·+ ck .

Let q(x) = xk Q(x−1) be the polynomial5

q(x) = xk Q(x−1) = 1 + c1 x + c2 x2 + · · · ck xk .

5The polynomial q(x) is called reciprocal of Q(x).


Then the solution of the recurrence has the following generating func-tion:

f(x) =p(x)

q(x),

where p(x) is some polynomial of degree less than r which depends onthe initial conditions for the recurrence.

Example: We use the method of generating functions to obtain againthe close-form formula for the Fibonacci sequence. The Fibonacci se-quence Fn has the following characteristic polynomial:

P (x) = r2 − r − 1 ,

so q(x) = 1− x− x2. Hence,

a + b x

1− x− x2= F0 + F1 x + F2 x2 + · · · .

We determine the numerator of the generating function by writing

a + b x = (1− x− x2) (F0 + F1 x + F2 x2 + · · · )= F0 + {F1 − F0}x + {F2 − F1 − F0}x2 + . . . ,

so that a = F0 = 0, b = F1 − F0 = 1, hence:x

1− x− x2= F0 + F1 x + F2 x2 + · · · .

We can factor the denominator like this

1− x− x2 = (1− φx) (1 + φ−1 x) ,

where

φ =1 +

√5

2; −φ−1 =

1−√5

2.

The generating function can be expressed as a sum of partial fractionsin the following way:

x

1− x− x2=

1√5

{1

1− φx− 1

1 + φ−1 x

}.

Since in general 1/(1− a x) is the generating function for the sequencean, we get that

x

1− x− x2=

1√5

{ ∞∑n=0

φn xn −∞∑

n=0

(−φ−1)n xn

}

=1√5

∞∑n=0

{φn − (−φ)−n

}xn .


That must be equal to

x

1− x− x2=

∞∑n=0

Fn xn ,

so we get:

Fn =1√5

{φn − (−φ)−n

}.


7. Divisibility

Given two integers a, b, a 6= 0, we say that a divides b, written a|b,if there is some integer c such that ac = b:

a|b ⇔ ∃c, ac = b .

We also say that a is a divisor of b or that b is a multiple of a.

Some properties of divisibility are the following:

1. 1|a, a|0.

2. If a, b ≥ 1 and a|b then a ≤ b.

3. a|b ∧ b|a ⇒ a = ±b.

4. a|b ∧ b|c ⇒ a|c (the relation of divisibility is transitive).

5. a|b ⇒ ∀x ∈ Z, a|bx.

6. a|c1 ∧ a|c2 ∧ · · · ∧ a|cn ⇒ ∀x1, x2, . . . , xn ∈ Z, a|(c1 x1 + c2 x2 +· · · + cn xn). The expression c1 x1 + c2 x2 + · · · + cn xn is called alinear combination of c1, c2, . . . , cn.

Example: Prove that there are no integers x, y, z such that

4 x + 6 y + 10 z = 9482681648618641 .

Answer : Since 4, 6 and 10 are divisible by 2, the left hand side shouldbe divisible by 2. However the right hand side is not divisible by 2.

7.1. Prime Numbers. A prime number is an integer p ≥ 2 whoseonly positive divisors are 1 and p. Any integer n ≥ 2 that is not primeis called composite. A non-trivial divisor of n ≥ 2 is a divisor d ofn such that 1 < d < n, so n ≥ 2 is composite iff it has non-trivialdivisors.

Warning : 1 is not considered either prime or composite.1

Some results about prime numbers:

1. For all n ≥ 2 there is some prime p such that p|n. Proof: Let Sbe the set of all divisors of n that are greater than or equal to 2.

1Consequently, Z+ can be partitioned into three classes: primes, composites andunity.


Since n ∈ S, S is not empty, and by the well-ordering principleit must have a smallest element p. If p had a non-trivial divisord, by transitivity d would be a divisor of n too, so d ∈ S. Butd < p, so p would not be the smallest element in S. Consequently,p must be a prime divisor of n.

2. (Euclid) There are infinitely many prime numbers. Proof: Weprove it by contradiction. Assume there are finitely many primesp1, p2, . . . , pm. Call P = p1p2 . . . pm the product of all prime num-bers, and consider the number N = P + 1. N must have someprime divisor p. Since p is one of the pk, it must divide P . Butif p divides N and P , then it must divide N − P = 1, which isimpossible because 1 has no prime divisors. Consequently, ourassumption that there are finitely many prime numbers must befalse.

3. If p|ab then p|a or p|b. More generally, if p|a1a2 . . . an then p|ak

for some k = 1, 2, . . . , n.

7.2. The Division Algorithm. The following result is known as TheDivision Algorithm:2 If a, b ∈ Z, b > 0, then there exist unique q, r ∈ Zsuch that a = q b+ r, 0 ≤ r < b. Here q is called quotient of the integerdivision of a by b, and r is called remainder.

7.3. Greatest Common Divisor. A positive integer c is called acommon divisor of the integers a1, a2, . . . , an, if c divides all the ak.The greatest possible such a c is called the greatest common divisor ofa1, a2, . . . , an, denoted gcd(a1, a2, . . . , an).

Example: The set of positive divisors of 12 and 30 is {1, 2, 3, 6}. Thegreatest common divisor of 12 and 30 is gcd(12, 30) = 6.

Two integers a, b are called relatively prime if gcd(a, b) = 1.

The greatest common divisor has the following property: If d ∈ Z+ isa common divisor of a1, a2, . . . , an ∈ Z+ then d divides gcd(a1, a2, . . . , an).In other words, gcd(a1, a2, . . . , an) is the greatest lower bound of theset {a1, a2, . . . , an} in the poset (Z+, |).

2In spite of its name, the result is just a mathematical theorem, not an actualalgorithm. There are, however, algorithms that allow us to compute the quotientand the remainder in an integer division.


Another important result is the following: The equation

a1 x1 + a2 x2 + · · ·+ an xn = b

where a1, a2, . . . , an, b are integers, has integer solutions if and onlyif gcd(a1, a2, . . . , an) divides b. In particular, gcd(a1, a2, . . . , an) canalways be written as a linear combination of a1, a2, . . . , an.

Example: We have gcd(12, 30) = 6, and in fact we can write 6 =1 ·30−2 ·12. The solution is not unique, for instance 6 = 3 ·30−7 ·12.

7.4. The Euclidean Algorithm. We examine a method to computethe gcd of two given positive integers a, b. The method provides at thesame time a solution to the equation:

a x + b y = gcd(a, b) .(7.1)

Let a, b be two positive integers. We define recursively the followingsequence:

r0 = ar1 = b

rn−1 = qn+1 rn + rn+1, 0 ≤ rn+1 < rn (for n > 0)

where the recursive step consists of an application of the division al-gorithm in order to obtain rn+1 from rn and rn−1. Then the last non-zero element in the sequence r0, r1, r2, . . . is gcd(a, b). Furthermore, byworking backwards we can express gcd(a, b) as a linear combination ofa, b.

Example: Assume that we wish to compute gcd(500, 222). Then wearrange the computations in the following way:

500 = 2 · 222 + 56 → r2 = 56222 = 3 · 56 + 54 → r3 = 5456 = 1 · 54 + 2 → r4 = 254 = 27 · 2 + 0 → r5 = 0

The last nonzero remainder is r4 = 2, hence gcd(500, 222) = 2. Fur-thermore, if we want to express 2 as a linear combination of 500 and222, we can do it by working backwards:

2 = 56− 1 · 54 = 56− 1 · (222− 3 · 56) = 4 · 56− 1 · 222

= 4 · (500− 2 · 222)− 1 · 222 = 4 · 500− 9 · 222 .


7.5. Diophantine Equations. A Diophantine equation is an equa-tion in which the solutions must be integers. The method described inthe previous paragraph can be used to solve Diophantine equations ofthe form

a x + b y = c g ,(7.2)

where g = gcd(a, b). In order to do that, first solve equation (7.1).If (x′, y′) is the solution obtained following the steps in the previousparagraph, then obviously (x0, y0) = (c x′, c y′) is a particular solutionto (7.2). Finally add the general solution to the homogeneous equation

a x + b y = 0 ,

which is x = bk/g, y = −ak/g, k ∈ Z. Hence, the general solution to(7.2) is

x = x0 + bk/g

y = y0 − ak/g

with k ∈ Z.

7.6. Least Common Multiple. A positive integer c is called a com-mon multiple of the integers a1, a2, . . . , an, if c is a multiple of all theak’s. The smallest possible such a c is called least common multiple ofa1, a2, . . . , an, denoted lcm(a1, a2, . . . , an).

Example: The set of common multiples of 12 and 30 is

{60, 120, 180, 240, . . . }The least common multiple of 12 and 30 is lcm(12, 30) = 60.

The least common multiple has the following property: If m isa common multiple of a1, a2, . . . , an ∈ Z+, then m is a multiple oflcm(a1, a2, . . . , an). In other words, lcm(a1, a2, . . . , an) is the least up-per bound of the set {a1, a2, . . . , an} in the poset (Z+, |).

The least common multiple of two numbers can be computed withthe following formula:

ab = gcd(a, b) · lcm(a, b) .

So, in order to compute lcm(a, b), we can use the Euclidean algorithmfor computing gcd(a, b), then use the above formula for computinglcm(a, b).


7.7. The Fundamental Theorem of Arithmetic. Every integern ≥ 2 can be written as a product of primes uniquely, up to the orderof the primes.

It is customary to write the factorization in the following way:

n = ps11 ps2

2 . . . pskk ,

where all the exponents are positive and the primes are written so thatp1 < p2 < · · · < pk. For instance:

13104 = 24 · 32 · 7 · 13 .

The factorization of integers provides another method for computingthe gcd and lcm:

1. gcd(a1, . . . , an) = product of the primes that occur in the primefactorizations of all the ak, raised to their lowest exponent.

2. lcm(a1, . . . , an) = product of the primes that occur in the primefactorization of any ak, raised to their largest exponent.

Example: We have 720 = 24·32·5, 4536 = 23·34·7, so: gcd(720, 4536) =23 · 32 = 72, and lcm(720, 4536) = 24 · 34 · 5 · 7 = 453560.

Exercise: Find the number of divisors of a number n. Answer : Firstwe find the prime factorization of n = ps1

1 ps22 . . . psk

k . Its divisors willbe of the form d = pt1

1 pt22 . . . ptk

k , with 0 ≤ ti ≤ si. Each exponent tican take si + 1 possible values, so, by the product rule, the number ofdivisors will be (s1 + 1)(s2 + 1) . . . (sk + 1). For instance 200 = 23 · 52

has 4 · 3 = 12 divisors, namely: 1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 200.


8. Modular Arithmetic

8.1. Congruence modulo m. Given an integer m ≥ 2, we say thata is congruent to b modulo m, written a ≡ b (mod m), if m|(a − b).Note that the following conditions are equivalent

1. a ≡ b (mod m).2. a = b + km for some integer k.3. a and b have the same remainder when divided by m.

The relation of congruence modulo m is an equivalence relation. Itpartitions Z into m equivalence classes of the form

[x] = [x]m = {x + km | k ∈ Z} .

For instance, for m = 5, each one of the following rows is an equivalenceclass:

. . . −10 −5 0 5 10 15 20 . . .

. . . −9 −4 1 6 11 16 21 . . .

. . . −8 −3 2 7 12 17 22 . . .

. . . −7 −2 3 8 13 18 23 . . .

. . . −6 −1 4 9 14 19 24 . . .

Each equivalence class has exactly a representative r such that 0 ≤ r <m, namely the common remainder of all elements in that class when di-vided by m. Hence an equivalence class may be denoted [r] or x+m Z,where 0 ≤ r < m. Often we will omit the brackets, so that the equiva-lence class [r] will be represented just r. The set of equivalence classes(i.e., the quotient set of Z by the relation of congruence modulo m) isdenoted Zm = {0, 1, 2, . . . , m− 1}. For instance, Z5 = {0, 1, 2, 3, 4}.

Remark : When writing “r” as a notation for the class of r we maystress the fact that r represents the class of r rather than the integer rby including “ (mod p)” at some point. For instance 8 = 3 (mod p).Note that in “a ≡ b (mod m)”, a and b represent integers, while in“a = b (mod m)” they represent elements of Zm.

Reduction Modulo m: Once a set of representatives has been chosenfor the elements of Zm, we will call “r reduced modulo m”, written“r mod m”, the chosen representative for the class of r. For instance,if we choose the representatives for the elements of Z5 in the intervalfrom 0 to 4 (Z5 = {0, 1, 2, 3, 4}), then 9 mod 5 = 4. Another possibilityis to choose the representatives in the interval from −2 to 2 (Z5 ={−2,−1, 0, 1, 2}), so that 9 mod 5 = −1


In Zm it is possible to define an addition and a multiplication in thefollowing way:

[x] + [y] = [x + y] ; [x] · [y] = [x · y] .

Exercise: Prove that those operations are well-defined, i.e., the resultdoes not depend on the choice of class representatives.

As an example, table 8.1 shows the addition and multiplication tablesfor Z5.

+ 0 1 2 3 40 0 1 2 3 41 1 2 3 4 02 2 3 4 0 13 3 4 0 1 24 4 0 1 2 3

· 0 1 2 3 40 0 0 0 0 01 0 1 2 3 42 0 2 4 1 33 0 3 1 4 24 0 4 3 2 1

Table 8.1. Operational tables for Z5

An important result is that (Zm, +, ·) is always a ring (see paragraph4.11.2). Furthermore, (Zm, +, ·) is a field if and only if m is a primenumber. In other words, we can “divide” in (Zm, +, ·) only when m isa prime number.

Example: (Z5, +, ·) is a ring, since 5 is a prime number. Question:what is the multiplicative inverse of 2 in (Z5, +, ·)? Answer : if we lookat the row of 2 in the multiplication table for Z5 we see that there is a1 in the column for 3, hence 2 · 3 = 1 (mod 5), and 3 is the inverse of2 in (Z5, +, ·).

Exercise: Prove that (Z6, +, ·) is not a field, i.e., it has nonzero ele-ments without multiplicative inverse.

Exercise: Prove that [a] ∈ Zm is a divisor of zero if and only ifgcd(a,m) > 1.

Exercise: Prove that [a] ∈ Zm is a unit (has a multiplicative inverse)if and only if gcd(a,m) = 1.

The set of units in (Zm, +, ·) is denoted Z∗m. For instance Z∗6 = {1, 5},because 0, 2, 3 and 4 are divisors of zero in (Z6, +, ·).1 If Z∗m denotesthe set of units in (Zm, +, ·) then (Z∗m, ·) is a multiplicative group.

1Note that Z∗m is not the same as the set of nonzero elements of Zm.


Given an element [a] in Z∗m, its inverse can be computed by usingthe Euclidean algorithm to find gcd(a,m), since that algorithm alsoprovides a solution to the equation ax + my = gcd(a,m) = 1, which isequivalent to ax ≡ 1 (mod m).

Example: Find the inverse of 17 in (Z∗64, ·). Answer : We use theEuclidean algorithm:

64 = 3 · 17 + 13 → r2 = 1317 = 1 · 13 + 4 → r3 = 413 = 3 · 4 + 1 → r4 = 14 = 4 · 1 + 0 → r5 = 0

Now we compute backwards:

1 = 13− 3 · 4 = 13− 3 · (17− 1 · 13) = 4 · 13− 3 · 17

= 4 · (64− 3 · 17)− 3 · 17 = 4 · 64− 15 · 17 .

Hence (−15) · 17 ≡ 1 (mod 64), but −15 ≡ 49 (mod 64), so the in-verse of 17 in (Z∗64, ·) is 49. We will denote this by writing 17−1 = 49(mod 64), or 17−1 mod 64 = 49.

8.2. Applications to Diophantine Equations. Congruences areuseful in exploring the solutions to some Diophantine equations.

Example: Prove that the equation x2 + y2 = 3 z2 has no positiveinteger solutions. Answer : Assume that (x, y, z) is a primitive solution,i.e., a solution such that gcd(x, y, z) = 1.2 Then we have x2 +y2 ≡ 3 z2

(mod 4). Since 3 ≡ −1 (mod 4), then x2 + y2 + z2 ≡ 0 (mod 4). But02 = 0 ≡ 0 (mod 4), 12 = 1 ≡ 1 (mod 4), 22 = 4 ≡ 0 (mod 4),32 = 9 ≡ 1 (mod 4), so a2 can only be congruent to 0 or 1 modulo 4.So the only solution modulo 4 is x ≡ 0 (mod 4), y ≡ 0 (mod 4), z ≡ 0(mod 4), but that implies that x, y, z are multiple of 4, contradictingthe assumption gcd(x, y, z) = 1.

Exercise: The First Case of Fermat’s Last Theorem for a prime ex-ponent p ≥ 3 says that there are no positive integers x, y, z such thatp 6 | xyz and xp + yp = zp. Prove the First Case of Fermat’s Last Theo-rem for p = 3. (Hint: Study the equation modulo 9.)

2If (x, y, z) is a solution, we can always divide x, y, z by gcd(x, y, z) to obtain aprimitive solution.


8.3. Euler’s Phi Function. The number of units in (Zm, +, ·) isequal to the number of positive integers not greater than and rela-tively prime to m, i.e., the number of integers a such that 1 ≤ a ≤ mand gcd(a,m) = 1. That number is given by the so called Euler’s phifunction:

φ(m) = |{a | [1 ≤ a ≤ m] ∧ [gcd(a,m) = 1]}| .For instance, the positive integers not greater than and relatively primeto 15 are: 1, 2, 4, 7, 8, 11, 13, 14, hence φ(15) = 8.

Exercise: Prove:

1. If p is a prime number and s ≥ 1, then φ(ps) = ps − ps−1 =ps(1− 1/p). In particular φ(p) = p− 1.

2. If m1, m2 are two relatively prime positive integers, then φ(m1m2) =φ(m1) φ(m2).

3

3. If m = ps11 ps2

2 . . . pskk , where the pk are prime and the sk are posi-

tive, then

φ(m) = m (1− 1/p1) (1− 1/p2) . . . (1− 1/pk) .

8.4. Euler’s Theorem. If a and m are two relatively prime positiveintegers, m ≥ 2, then

aφ(m) ≡ 1 (mod m) .

The particular case in which m is a prime number p, Euler’s theoremis called Fermat’s Little Theorem:

ap−1 ≡ 1 (mod p) .

For instance, if a = 2 and p = 7, then we have, in fact, 27−1 = 26 =64 = 1 + 9 · 7 ≡ 1 (mod 7).

A consequence of Euler’s Theorem is the following. If gcd(a,m) = 1then

x ≡ y (mod φ(m)) ⇒ ax ≡ ay (mod m) .

Consequently, the following function is well defined:

Z∗m × Zφ(m) → Z∗m([a]m, [x]φ(m)) 7→ [ax]m

3A function f(x) of positive integers such that gcd(a, b) = 1 ⇒ f(ab) = f(a)f(b)is called multiplicative.


Hence, we can compute powers modulo m in the following way:

an = an mod φ(m) (mod m) ,

if gcd(a,m) = 1. For instance:

39734888 mod 100 = 39734888 mod φ(100) mod 100

= 39734888 mod 40 mod 100 = 38 mod 100 = 6561 mod 100 = 61 .

Fermat’s Little Theorem can be used as test of primality, or rather astest of non-primality. Example: Prove that 716311796279 is not prime.Answer : We have that4

2716311796279−1 = 2716311796278 ≡ 127835156517 (mod 716311796279) .

But if 716311796279 were prime, 2716311796279−1 would be congruent to1 modulo m. Warning : Note that am−1 ≡ 1 (mod m) does not implythat m is prime; the only thing that we can say is that if am−1 6≡ 1(mod m) (and gcd(a,m) = 1) then m is not prime.

8.5. The Chinese Remainder Theorem. Let m1,m2, . . . ,mk bepairwise relatively prime integers greater than or equal to 2. The fol-lowing system of congruences

x ≡ r1 (mod m1)x ≡ r2 (mod m2)

. . .x ≡ rk (mod mk)

has a unique solution modulo M = m1m2 . . . mk.

We can find a solution to that system in the following way. LetMi = M/mi, and si = the inverse of Mi in Zmi

. Then

x = M1s1r1 + M2s2r2 + · · ·+ Mkskrk

is a solution of the system.

Example: A group of objects can be arranged in 3 rows leaving 2 left,in 5 rows leaving 4 left, and in 7 rows leaving 6 left. How many objectsare there? Answer : We must solve the following system of congruences:

x ≡ 2 (mod 3)x ≡ 4 (mod 5)x ≡ 6 (mod 7)

4Given a power ax, there is an efficient method (easy to implement as an algo-rithm) to reduce ax modulo m—see paragraph A.1 in Appendix A.


We have: M = 3 · 5 · 7 = 105, M1 = 105/3 = 35 ≡ 2 (mod 3),M2 = 105/5 = 21 ≡ 1 (mod 5), M3 = 105/7 = 15 ≡ 1 (mod 7); s1 =“inverse of 2 in Z3” = 2, s2 = “inverse of 1 in Z5” = 1, s3 = “inverseof 1 in Z7” = 1. Hence the solution is

x = 35 · 2 · 2 + 21 · 1 · 4 + 15 · 1 · 6 = 314 ≡ 104 (mod 105) .

Hence, any group of 104 + 105 k objects is a possible solution to theproblem.

8.6. Application to Cryptography: RSA Algorithm. The RSAalgorithm is an encryption scheme designed in 1977 by Ronald Rivest,Adi Shamir and Leonard Adleman. It allows encrypting a messagewith a key (the encryption key) and decrypting it with a different key(the decryption key). The encryption key is public and can be givento everybody. The decryption key is private and is known only by therecipient of the encrypted message.

The RSA algorithm is based on the following facts. Given two primenumbers p and q, and a positive number m relatively prime to p andq, Euler’s theorem tells us that:

mφ(pq) = m(p−1)(q−1) ≡ 1 (mod pq) .

Assume now that we have two integers e and d such that e · d ≡ 1(mod φ(pq)). Then we have that

(me)d = me·d ≡ m (mod pq) .

So, given me we can recover m modulo pq by raising to the dth power.

The RSA algorithm consists of the following:

1. Generate two large primes p and q. Find their product n = pq.

2. Find two numbers e and d (in the range from 2 to φ(n)) such thate · d ≡ 1 (mod φ(n)). This requires some trial and error. Firste is chosen at random, and the Euclidean algorithm is used tofind gcd(e,m), solving at the same time the equation ex + my =gcd(e,m). If gcd(e,m) = 1 then the value obtained for x is d.Otherwise, e is no relatively prime to φ(n) and we must try adifferent value for e.

3. The public encryption key will be the pair (n, e). The privatedecryption key will be the pair (n, d). The encryption key is givento everybody, while the decryption key is kept secret by the futurerecipient of the message.


4. The message to be encrypted is divided into small pieces, andeach piece is encoded numerically as a positive integer m smallerthan n.

5. The number me is reduced modulo n; m′ = me mod n.

6. The recipient computes m′′ = m′d mod n, so that 0 ≤ m′′ < n.

We are left to prove that m′′ = m. If m is relatively prime to p andq, then from Euler’s theorem we get that m′′ ≡ m (mod n), and sinceboth are in the range from 0 to n− 1 they must be equal. The case inwhich p or q divides m is left as an exercise.


9. Graphs

9.1. Graphs. Consider the following examples:

1. A roadmap, consisting of a number of towns connected with roads.

2. The representation of a binary relation defined on a given set. Therelation of a given element x to another element y is representedwith an arrow connecting x to y.

The former is an example of (undirected) graph. The latter is anexample of a directed graph or digraph.

a b

cd

Figure 9.1. Undirected Graph.

In general a graph G consists of two things:

1. The vertex set V , whose elements are called vertices, nodes orpoints.

2. The edge set E, or set of edges connecting pairs of vertices. If theedges are directed then they are also called directed edges or arcs.

a b

cd

Figure 9.2. Directed Graph.


A graph is sometimes represented by the pair (V,E) (we assume Vand E finite). If the graph is undirected, the edge connecting x and yis represented by the unordered pair {x, y}, so E is a set of unorderedpairs. If the graph is directed, then the edge pointing from x to y isrepresented by the ordered pair (x, y), so E is a set of ordered pairs. Inthis case the vertex x is called origin, source or initial point of the edge,and y is called the terminus, terminating vertex or terminal point.

Two vertices connected by an edge are called adjacent. They are alsothe endpoints of the edge, and the edge is said to be incident to eachof its endpoints. If the graph is directed, an edge pointing from vertexx to vertex y is said to be incident from x and incident to y. An edgeconnecting a vertex to itself is called a loop.

The degree of a vertex v, represented deg(v), is the number of edgesthat contain it (loops are counted twice). A vertex of degree zero (notconnected to any other vertex) is called isolated. A vertex of degree 1is called pendant.

Next, we give a few more definitions.

• Multiple edge or multiedge: two or more edges connecting thesame pair of vertices (and pointing in the same direction if thegraph is directed).

• Multigraph: a graph with multiedges.1

b

cd

a

Figure 9.3. Multigraph.

• x-y Walk : a (loop-free) sequence of vertices (vk) and edges (ek)of the form x = v0, e1, v1, e2, v2, . . . , en, vn = y, where each edge ek

connects vk−1 with vk (and points from vk−1 to vk if the graph is

1In a multigraph the set E of edges is actually a multiset.


directed). The length of the walk is the number (n) of edges thatit contains. If v0 = vn then the walk is called a closed walk.

• x-y Trail : x-y walk without repeated edges. A closed trail (ofnon-zero length) is called a circuit.

• x-y Path: x-y walk without repeated vertices.2 A closed path (ofnon-zero length) is called a cycle. A graph without cycles is calledacyclic. If a path exists from some vertex a to another vertex b,then b is said to be reachable from a.3

Exercise: Prove that in an indirect graph, if there is a trail froma vertex a to another vertex b (6= a), then there is a path from ato b.

9.2. Special Graphs. Here we examine a few special graphs.

Complete Graph: a loop-free undirected graph G such that everypair of distinct vertices in G are connected by an edge. The completegraph of n vertices is represented Kn (fig. 9.4). A complete directedgraph is a loop-free directed graph G = (V,E) such that every pair ofdistinct vertices in G are connected by exactly one edge—so, for eachpair of distinct vertices, either (x, y) or (y, x) (but not both) is in E.

a

b

cd

e

Figure 9.4. Complete graph K5.

Bipartite Graph: a graph G = (V,E) in which V can be partitionedinto two subsets V1 and V2 so that each edge in G connects some vertexin V1 to some vertex in V2. A bipartite graph is called complete ifeach vertex in V1 is connected to each vertex in V2. If |V1| = m and

2Note that a path is also a trail.3Exercise: Prove that the relation aR b ⇔ “b is reachable from a” is an equiv-

alence relation. The equivalence classes define the connected components of thegraph.


|V2| = n, the corresponding complete bipartite graph is representedKm,n (fig. 9.5).

b

p

q

r

s

a

c

Figure 9.5. Complete bipartite graph K3,4.

Regular Graph: a graph whose vertices have all the same degree.

Tree: a loop-free, connected, acyclic undirected graph.

9.3. Graph Structures. Here we study some structures present in agraph.

Subgraph: Given a graph G = (V,E), a subgraph G′ = (V ′, E ′) of Gis another graph such that V ′ ⊆ V and E ′ ⊆ E. If V ′ = V then G′ iscalled a spanning subgraph of G.

Given a subset of vertices U ⊆ V , the subgraph of G induced byU , denoted 〈U〉, is the graph whose vertex set is U , and its edge setcontains all edges from G connecting vertices in U .4

Given a graph G = (V,E), and a vertex v ∈ V , we denote G− v thegraph obtained by deleting from G the vertex v and all edges incidentwith v. If e is an edge, G− e represents the subgraph (V,E − {e}).

Given a set of n vertices V , let Kn be the complete graph withvertex set V . Note that any graph G = (V,E) is a subgraph of Kn.The complement of a graph G is the subgraph G = (V,E) of Kn withthe same vertex set, and E = all the edges of Kn that are not in G.

Connected Graph: A graph G is called connected if there is a pathbetween any two distinct vertices of G. Otherwise the graph is calleddisconnected. A directed graph is connected if its associated undirected

4Note that the relation “to be a subgraph” is a partial order among graphs. Thegraph induced by U in G is the greatest subgraph of G with vertices in U .


graph (obtained by ignoring the directions of the edges) is connected.A connected component of G is any connected subgraph G′ = (V ′, E ′)of G = (V,E) such there is not edge (in G) from a vertex in V to avertex in V − V ′.

Graph Isomorphism: Given two graphs G1 = (V1, E1), G2 = (V2, E2),a graph isomorphism from G1 to G2 is a one-to-one correspondencef : V1 → V2 such that each two vertices a, b are connected in G1 ifand only if f(a), f(b) are connected (in the same direction if the graphis directed) in G2. If such a isomorphism exists, then G1 and G1 arecalled isomorphic graphs (fig. 9.6).

a1 b1

c1

d1

e1

a2 b2

c2

d2

e2

Figure 9.6. Two isomorphic graphs.

Graph Homeomorphism. Two loop-free indirected graphs are saidto be homeomorphic if they are isomorphic or can be obtained fromthe same loop-free indirected graph by a sequence of elementary sub-divisions. An elementary subdivision consists of dividing an edge withan additional vertex v, i.e., adding a new vertex v to the graph andreplacing some edge {a, b} with the edges {a, v} and {v, b} (fig. 9.7).

d

e a

b d

e

f c

d

ea

b

a

b

h

c c

Figure 9.7. Three homeomorphic graphs.


9.4. Euler Trails and Circuits. Let G = (V,E) be a graph or multi-graph with no isolated vertices. An Euler trail in G is a trail thattransverses every edge of the graph exactly once. Analogously, an Eu-ler circuit in G is a circuit that transverses every edge of the graphexactly once.

��

��

Figure 9.8. The Seven Bridges of Konigsberg.

Figure 9.9. Multigraph for the Seven Bridges of Konigsberg.

The graphs that have an Euler trail or circuit can be characterizedby looking at the degree of their vertices. Recall that the degree ofa vertex v, represented deg(v), is the number of edges that contain v(loops are counted twice). An even vertex is a vertex with even degree;an odd vertex is a vertex with odd degree.

If G = (V,E) is a directed graph or multigraph, then we define theincoming degree of a vertex v as id(v) = number of edges incidentinto v. Also, we define the outcoming degree of a vertex v as od(v) =number of edges incident from v.

We have the following results:


1. The sum of the degrees of the vertices of G = (V,E) is twice

its number of edges:∑v∈V

deg(v) = 2|E|. As a consequence, the

number of odd vertices in G must be even.

2. Let G be an indirected graph or multigraph with no isolated ver-tices. Then:(a) G contains an Euler circuit if and only if G is connected and

all its vertices have even degree.(b) G contains an Euler trail from vertex a to vertex b (6= a) if

and only if G is connected, a and b have odd degree, and allits other vertices have even degree.

3. Let G be a directed graph or multigraph with no isolated vertices.Then:(a) G contains a directed Euler circuit if and only if G is connected

and for every vertex v in G, id(v) = od(v).(b) G contains a directed Euler trail from vertex a to vertex b

(6= a) if and only if G is connected, id(a) = od(a)− 1, id(b) =od(b) + 1, and for all other vertices v, id(v) = od(v).

9.5. Hamilton Paths and Cycles. Let G be a graph or multigraphwith at least three vertices. A Hamilton cycle in G is a cycle thatcontains all vertices of G. A Hamilton path in G is a path (not a cycle)that contains all vertices of G. Note that by deleting an edge in aHamilton cycle we get a Hamilton path, so if a graph has a Hamiltoncycle, then it also has a Hamilton path. The converse is not true, i.e., agraph may have a Hamilton path but not a Hamilton cycle. Exercise:Find a graph with a Hamilton path but no Hamilton cycle.

In general it is not easy to determine if a given graph has a Hamiltonpath or cycle. However some either necessary or sufficient conditionsare known. We state some of them.

1. Every complete directed graph has a Hamilton path.Proof: We can construct a Hamilton path in the following way.

We start by choosing any vertex v in G, which can be consideredby itself as a path of length zero. Assume that at a given stagewe have built a path of length k: v0, e1, v1, e2, v2, . . . , vk. Take anyvertex v′ of G that is not in the path yet. If all edges between thevi and v′ point towards v′, we can add v′ to the end of the pathto obtain a path of length k + 1 from v0 to v′. Otherwise, let i


be the minimum index such that there is an edge from v′ to vi. Ifi = 0 then we add v′ at the beginning of the path. If i 6= 0, thenthe edge between vi−1 and v′ must point towards v′, so we insertv′ between vi−1 and vi. We continue the process until exhaustingall vertices in G.

2. Let G be a loop-free undirected graph with n ≥ 2 vertices. If forevery pair of distinct vertices x, y we have deg(x)+deg(y) ≥ n−1,then G has a Hamilton path. As a consequence, if deg(v) ≥(n− 1)/2 for every vertex v in G, then G has a Hamilton path.

3. Let G be a loop-free undirected graph with n ≥ 3 vertices. Ifdeg(x)+deg(y) ≥ n for every pair of nonadjacent distinct verticesx, y, then G contains a Hamilton cycle. As a consequence, ifdeg(v) ≥ n/2 for every vertex v in G, then G has a Hamiltoncycle. Also, if G has at least

(n−1

2

)+ 2 edges, then G has a

Hamilton cycle.

Next, a negative result: Let G = (V,E) a bipartite graph such thatV can be partitioned into two subsets V1 and V2 so that each edge inG connects some vertex in V1 to some vertex in V2. If |V1| 6= |V2| thenG has no Hamilton cycle. This is so because any path must containalternatively vertices from V1 and V2, so any cycle in G must have thesame number of vertices from each of both sets.

9.6. Planar Graphs. A graph or multigraph G is planar if it can bedrawn in the plane with its edges intersecting at their vertices only.One such drawing is called an embedding of the graph in the plane.Note that if a graph G is planar, then all graphs homeomorphic to Gare also planar.

A particular planar representation of a planar graph or multigraphis called a map. A map divides the plane into a number of regions (oneof them infinite).

9.6.1. Some Results About Planar Graphs.

1. Euler’s Formula: Let G = (V,E) be a connected planar graph ormultigraph, and let v = |V |, e = |E|, and r = number of regionsin which some given embedding of G divides the plane. Then:

v − e + r = 2 .


Note that this implies that all plane embeddings of a given graphor multigraph define the same number of regions.

2. Let G = (V,E) be a loop-free connected planar graph with vvertices, e ≥ 3 edges and r regions. Then 3r ≤ 2e and e ≤ 3v− 6.

3. The graph K5 is nonplanar. Proof: in K5 we have v = 5 ande = 10, hence 3v−6 = 9 < e = 10, which contradicts the previousresult.

4. The graph K3,3 is nonplanar. Proof: in K3,3 we have v = 6 ande = 9. If K3,3 were planar, from Euler’s formula we would haver = 5. On the other hand, each region is bounded by at least fouredges, so 4r ≤ 2e, i.e., 20 ≤ 18, which is a contradiction.

5. Kuratowski’s Theorem: A graph is nonplanar if and only if itcontains a subgraph that is homeomorphic to either K5 or K3,3.

9.6.2. Dual Graph of a Map. A map is defined by some planar graphG = (V,E) embedded in the plane. Assume that the map divides theplane into a set of regions R = {r1, r2, . . . , rk}. For each region ri,select a point pi in the interior of ri. The dual graph of that map is thegraph Gd = (V d, Ed), where V d = {p1, p2, . . . , pk}, and for each edge inE separating the regions ri and rj, there is an edge in Ed connecting pi

and pj. Warning : Note that a different embedding of the same graphG may give different (and nonisomorphic) dual graphs. Exercise: Findthe duals of the maps shown in figure 9.6, and prove that they are notisomorphic.

a

b

dce

Figure 9.10. Dual graph of a map.

9.7. Graph Coloring. Consider the problem of coloring a map M insuch a way that no adjacent regions (sharing a border) have the same


color. This is equivalent to coloring the vertices of the dual map of Min such a way that no adjacent vertices have the same color.

In general, a coloring of a graph is an assignment of a color to eachvertex of the graph. The coloring is called proper if there are no adja-cent vertices with the same color. If a graph can be properly coloredwith n colors we say that it is n-colorable. The minimum number of col-ors needed to properly color a given graph G = (V,E) is called the chro-matic number of G, and is represented χ(G). Obviously χ(G) ≤ |V |.

9.7.1. Some Results About Graph Coloring. Here we deal with undi-rected loop-free graphs only.

1. χ(Kn) = n.

2. Let G be a graph. The following statement are equivalent:(a) χ(G) = 2.(b) G is bipartite.(c) Every cycle in G has even length

3. Five Color Theorem (not hard to prove): Every planar graph is5-colorable.

4. Four Color Theorem (Appel and Haken, 1976), proved with anintricate computer analysis of configurations: Every planar graphis 4-colorable.

Exercise: Find a planar graph G such that χ(G) = 4.


10. Trees

10.1. Trees. A tree is an undirected, loop-free, connected, acyclicgraph.

Since the graph is acyclic, there is only one path between two distinctgiven vertices.

Given a loop-free undirected graph G = (V,E), the following state-ments are equivalent:

1. G is a tree.

2. G is connected, and the removal of an edge disconnects G intotwo trees.

3. G is acyclic and |V | = |E|+ 1.

4. G is connected and |V | = |E|+ 1.

5. G is acyclic and the graph obtained by adding an edge betweentwo non adjacent vertices contains exactly one cycle.

Example: Assume that we have a number of cities (plus a powerplant) connected with a network of electrical lines. Each line can beactivated to transport electricity or deactivated by the electric com-pany. Once in a while a line gets accidentally interrupted, and theelectric company has to rearrange the active lines so that all citiesreceive electricity, but for economy this needs to be done with the min-imum possible number of lines. If we represent the network of electricallines with a graph, then all cities will receive electricity as long as theactive lines form a connected subgraph of the original graph contain-ing all its vertices. This kind of minimal subgraph is what is called aspanning tree.

Spanning Tree:1 Let G be an undirected, connected, loop-free graph.A subgraph T of G is a spanning tree of G if T is a tree and containsall vertices of G. Exercise: Prove that T is a spanning tree of G if andonly if T is a minimal spanning connected subgraph of G.

1Recall that a spanning subgraph of G is a subgraph of G that contains allvertices of G.


10.2. Rooted Trees. A directed tree is a directed graph whose associ-ated undirected graph (obtained by ignoring the direction of its edges)is a tree. A rooted tree is a directed tree with a node called root withno incoming edges, and such that all its other vertices have exactly oneincoming edge. A vertex with no outcoming edges is called a leaf orterminal vertex. All other vertices are branch nodes or internal ver-tices. If every vertex has at most 2 incoming edges, the tree is called abinary tree.

Rooted trees are usually represented with all edges pointing down,so the arrows are not needed (fig. 10.1).

r

a b c

d e f gh i

j k l m n o p

Figure 10.1. A rooted tree.

The length of the path from the root to a given vertex v is the levelof v. If there is an edge from v1 to v2, we call v1 the parent of v2 andv2 the child of v1. If there is a directed path from v1 to v2 then we saythat v1 is and ancestor of v2 and v2 is a descendant of v1. Two verticeswith a common parent are called siblings.

10.3. Ordered Rooted Trees. In this section we study various waysof ordering the vertices of a rooted tree.

In order to motivate this subject, we introduce the concept of Polishnotation. Given a (not necessarily commutative) binary operation ◦,it is customary to represent the result of applying the operation to twoelements a, b by placing the operation symbol in the middle:

a ◦ b .

This is called infix notation. The Polish notation consists of placingthe symbol to the left:

◦ a b .


The reverse Polish notation consists of placing the symbol to the right:

a b ◦ .

The advantage of Polish notation is that it allows us to write ex-pressions without need for parenthesis. For instance, the expressiona∗ (b+c) in Polish notation would be ∗ a+b c, while a∗b+c is +∗a b c.Also, Polish notation is easier to evaluate in a computer.

In order to evaluate an expression in Polish notation, we scan theexpression from right to left, placing the elements in a stack.2 Eachtime we find an operator, we replace the two top symbols of the stackby the result of applying the operator to those elements. For instance,the expression ∗+ 2 3 4 (which in infix notation is “(2 + 3) ∗ 4”) wouldbe evaluated like this:

expression stack

∗+ 2 3 4∗+ 2 3 4∗+ 2 3 4∗+ 2 3 4∗ 5 4

20

An algebraic expression can be represented by a binary rooted treeobtained recursively in the following way. The tree for a constant orvariable a has a as its only vertex. If the algebraic expression S is ofthe form SL ◦ SR, where SL and SR are subexpressions with trees TL

and TR respectively, and ◦ is an operator, then the tree T for S consistsof ◦ as root, and the subtrees TL and TR (fig. 10.2).

For instance, consider the following algebraic expression:

a + b ∗ c + d ↑ e ∗ (f + h) ,

where + denotes addition, ∗ denotes multiplication and ↑ denotes ex-ponentiation. The binary tree for this expression is given in figure 10.3.

Given the binary tree of an algebraic expression, its Polish, reversePolish and infix representation are different ways of ordering the ver-tices of the tree, namely in preorder, postorder and inorder respectively.

2A stack or last-in first-out (LIFO) system, is a linear list of elements in whichinsertions and deletions take place only at one end, called top of the list. A queueor first-in first-out (FIFO) system, is a linear list of elements in which deletionstake place only at one end, called front of the list, and insertions take place onlyat the other end, called rear of the list.


o

T L T R

Figure 10.2. Tree of S1 ◦ S2.

+

*

+

d e f h

*

b c

+

a

Figure 10.3. Tree for a + b ∗ c + d ↑ e ∗ (f + h).

The following are recursive definitions of several orderings of thevertices of a rooted tree T = (V,E) with root r. If T has only onevertex r, then r by itself constitutes the preorder, postorder and inordertransversal of T . Otherwise, let T1, . . . , Tk the subtrees of T from leftto right (fig. 10.4). Then:

r

T1 T2 ... Tk

Figure 10.4. Ordering of trees.

1. Preorder Transversal : Pre(T ) = r, Pre(T1), . . . ,Pre(Tk).

2. Postorder Transversal : Post(T ) = Post(T1), . . . ,Post(Tk), r.


3. Inorder Transversal. If T is a binary tree with root r, left subtreeTL and right subtree TR, then: In(T ) = In(TL), r, In(TR).

10.4. Search Algorithms. Next, we give two algorithms to find thespanning tree T of a loop-free connected undirected graph G = (V,E).We assume that the vertices of G are given in a certain order v1, v2, . . . , vn.

10.4.1. Depth-First Search Algorithm. The idea of this algorithm is tomake a path as long as possible, and then go back (backtrack) to addbranches also as long as possible.

This algorithm uses a stack (initially empty) to store vertices of thegraph. In consists of the following:

1. Add v1 to T , insert it in the stack and mark it as “visited”.2. If the stack is empty, then we are done. Otherwise let v be the

vertex on the top of the stack.3. If there is no vertex v′ that is adjacent to v and has not been

visited yet, then delete v and go to step 2 (backtrack). Otherwise,let v′ be the first non-visited vertex that is adjacent to v.

4. Add vertex v′ and edge (v, v′) to T , insert v′ in the stack and markit as “visited”.

5. Go to step 2.

An alternative recursive definition is as follows. We define recursivelya process P applied to a given vertex v in the following way:

1. Add vertex v to T and mark it as “visited”.2. If there is no vertex v′ that is adjacent to v and has not been

visited yet, then return. Otherwise, let v′ be the first non-visitedvertex that is adjacent to v.

3. Add the edge (v, v′) to T .4. Apply P to v′.5. Go to step 2 (backtrack).

The Depth-First Search Algorithm consists of applying the process justdefined to v1.

10.4.2. Breadth-First Search Algorithm. Now the idea is to start withvertex v1 as root, add the vertices that are adjacent to v1, then theones that are adjacent to the latter and have not been visited yet, andso on.


This algorithm uses a queue (initially empty) to store vertices of thegraph. In consists of the following:

1. Add v1 to T , insert it in the queue and mark it as “visited”.2. If the queue is empty, then we are done. Otherwise let v be the

vertex in the front of the queue.3. For each vertex v′ of G that has not been visited yet and is ad-

jacent to v (there might be none) taken in order of increasingsubindices, add vertex v′ and edge (v, v′) to T , insert v′ in thequeue and mark it as “visited”.

4. Delete v from the queue.5. Go to step 2.

10.5. Decision Trees. A decision tree is a rooted tree in which theinternal vertices represent actions, the edges represent outcomes of anaction, and the leaves represent final outcomes.

Example: In how many ways can we toss a coin four times withouthaving two heads in a row? Answer : Figure 10.5 shows all possibleways in which it can happen. Starting at the root, the tree has twoedges, one representing the outcome “H” in the first toss, and anotherone representing “T”. If the first outcome is “T”, then the second onemay be either “H” or “T”. But if it is “H”, then the only outcome forthe second toss that verifies the condition is “T”, otherwise we wouldget two “H’s” in a row. After completing the tree we see that it has 8leaves, so the answer to the problem is 8.

TH

T H T

T T H T

T T T H T

H

T HH

Figure 10.5. Decision tree.


Appendix A.

A.1. Efficient Computation of Powers Modulo m. We illustratean efficient method of computing powers modulo m with an example.Assume that we want to compute 3547 mod 10. First write 547 in base2: 1000100011, hence 547 = 29 +25 +2+1 = ((24 +1) 24 +1) 2+1, so:

3547 = ((324 ·3)24 ·3)2·3. Next we compute the expression beginning withthe inner parenthesis, and reducing modulo 10 at each step: 32 = 9(mod 10), 322

= 92 = 81 = 1 (mod 10), 323= 12 = 1 (mod 10),

324= 12 = 1 (mod 10), 324 · 3 = 1 · 3 = 3 (mod 10), etc. At the end

we find 3547 = 7 (mod 10).

The following is a program in C implementing the algorithm:

int pow(int a, int x, int m) {

int p = 1;

int y = (1 << (8 * sizeof(int) - 2));

a %= m;

while (!(y & x)) y >>= 1;

while (y) {

p *= p;

p %= m;

if (x & y) {

p *= a;

p %= m;

}

y >>= 1;

}

return p;

}

NOTES ON DISCRETE MATHEMATICS -...

Documents

Transcript of NOTES ON DISCRETE MATHEMATICS -...