
Epsilon Critical Documentation

Celal Cakiroglu

Mar 18, 2017

Contents

1 2D Truss System Solver
  1.1 Direct Stiffness Method for 2D Trusses

2 2D Frame System Solver
  2.1 Direct Stiffness Method for 2D Frames

3 Machine Learning
  3.1 Basic Concepts
  3.2 Linear Binary Classification

4 3D Modeling Tutorials
  4.1 Intersection of a Datum Plane with an Extruded Body (8.0)
  4.2 Mirroring of curves created by intersection of datum planes and extruded bodies (8.0)
  4.3 Assigning Different Colors To Different Partitions of a Body (8.0)
  4.4 Changing the Background Color of the Model (8.0)
  4.5 Making a Part Transparent (10.0)

5 Pre-calculus
  5.1 1. Number sets
  5.2 2. Axioms of the set of real numbers R
  5.3 3. Factoring Polynomials
  5.4 4. Laws of Exponents

6 Processing
  6.1 Where to start?
  6.2 List of Processing functions and built-in variables
  6.3 Coordinates on a Canvas
  6.4 A Dynamic Example

7 Calculus
  7.1 Vector norm and inner product
  7.2 Mean Value Theorem and Rolle’s Theorem
  7.3 Taylor’s theorem
  7.4 Integration by Parts
  7.5 Power Series
  7.6 L’Hospital’s Rule
  7.7 Cauchy Mean Value Theorem
  7.8 Logarithm
  7.9 Absolute value
  7.10 The Fundamental Theorem of Calculus
  7.11 Differentiation Rules
  7.12 Binomial theorem
  7.13 Weierstrass maximum minimum theorem
  7.14 Every bounded sequence has a convergent subsequence (Bolzano-Weierstrass)
  7.15 Cauchy criterion for integrability

8 Polygon Meshing through Triangulation
  8.1 Data Structures
  8.2 Auxiliary Functions

9 Flow Between Parallel Plates
  9.1 Analytical Solution
  9.2 Numerical Solution using OpenFOAM

10 Biography of Celal Cakiroglu
  10.1 Indices and tables


Contents:

xi denotes the damping variable $\xi$ in the equation of motion $\ddot{x} + 2\xi\omega_n\dot{x} + \omega_n^2 x = 0$. $\xi$ is supposed to be in the interval $[0, 1]$. Systems where $\xi$ has values greater than about 0.3 are usually highly damped systems.

A0 is the initial displacement of the mass from the unstretched position of the spring which causes the subsequent vibration of the mass around that position. The equation of motion can be derived using the free body diagram which shows the mass and the forces acting on it (free vibration $\Rightarrow f(t) = 0$). The equilibrium of these forces ($m\ddot{x} + c\dot{x} + kx = f(t) = 0$) gives us the equation of motion.

Fig. 1: Free body diagram

The damping coefficient c is the force required for unit velocity to occur across the damper. The stiffness k is the force required for a unit extension of the spring. The equation of motion is obtained by dividing the equilibrium equation by m and letting $c/m = 2\xi\omega_n$, $k/m = \omega_n^2$. $\omega_n$ is the angular natural frequency of the system.

The solution of the equation of motion can be sought by assuming a displacement of the form $x(t) = e^{rt}$ where $r$ is some parameter, possibly complex valued. Inserting the first and second derivatives of that assumed expression in the equation of motion we obtain

$$e^{rt}(r^2 + 2\xi\omega_n r + \omega_n^2) = 0$$

Since $e^{rt} \neq 0$ we obtain

$$r^2 + 2\xi\omega_n r + \omega_n^2 = 0 \;\Rightarrow\; r_{1,2} = \omega_n\left(-\xi \pm \sqrt{\xi^2 - 1}\right)$$

We know that $\xi$ is less than 1 most of the time. Therefore $r_{1,2} = \omega_n\left(-\xi \pm j\sqrt{1 - \xi^2}\right)$ where $j = \sqrt{-1}$. Let’s coin a new parameter $\omega_d = \omega_n\sqrt{1 - \xi^2}$ for the damped angular natural frequency of the system. Using this new variable leads to $r_{1,2} = -\xi\omega_n \pm j\omega_d$. Obviously $r_{1,2}$ are complex valued, which leaves us with two complex valued solutions of the equation of motion, $x_1(t) = e^{(-\xi\omega_n + j\omega_d)t}$ and $x_2(t) = e^{(-\xi\omega_n - j\omega_d)t}$. These solutions are of little use to describe the motion of an object. Fortunately there is a way around this. According to the theorem of superposition, if $x_1(t)$ and $x_2(t)$ are solutions of the differential equation then any linear combination of them is also a solution. Using this theorem we can obtain real valued solutions, which can actually describe the motion of the mass, by adding and subtracting $x_1(t)$ and $x_2(t)$.

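$$x_1(t) = e^{(-\xi\omega_n + j\omega_d)t} = e^{-\xi\omega_n t}e^{j\omega_d t} = e^{-\xi\omega_n t}(\cos\omega_d t + j\sin\omega_d t)$$

$$x_2(t) = e^{(-\xi\omega_n - j\omega_d)t} = e^{-\xi\omega_n t}e^{-j\omega_d t} = e^{-\xi\omega_n t}(\cos\omega_d t - j\sin\omega_d t)$$

$$x_3(t) = \frac{1}{2}\left[x_1(t) + x_2(t)\right] = \frac{1}{2}\left[e^{-\xi\omega_n t}\,2\cos\omega_d t\right] = e^{-\xi\omega_n t}\cos\omega_d t$$

$$x_4(t) = \frac{1}{2j}\left[x_1(t) - x_2(t)\right] = \frac{1}{2j}\left[e^{-\xi\omega_n t}\,2j\sin\omega_d t\right] = e^{-\xi\omega_n t}\sin\omega_d t$$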

The general solution of the equation of motion is

$$x(t) = e^{-\xi\omega_n t}(a_0\cos\omega_d t + b_0\sin\omega_d t)$$

where 𝑎0, 𝑏0 are arbitrary constants.
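The general solution can be evaluated numerically once the two constants are fixed. The following is a minimal Python sketch, assuming the mass is released from rest, so that $x(0) = A_0$ and $\dot{x}(0) = 0$; the zero initial velocity is an assumption made for this illustration. Under it, $a_0 = A_0$ and $b_0 = \xi\omega_n A_0/\omega_d$. The function name and its parameters are hypothetical.

```python
import math

def damped_free_response(t, A0, xi, omega_n):
    """Displacement x(t) of the underdamped system (0 <= xi < 1).

    Assumes x(0) = A0 and zero initial velocity, which fixes a0 and b0.
    """
    if not 0.0 <= xi < 1.0:
        raise ValueError("this closed form only covers 0 <= xi < 1")
    omega_d = omega_n * math.sqrt(1.0 - xi ** 2)  # damped angular natural frequency
    a0 = A0                                       # from x(0) = A0
    b0 = xi * omega_n * A0 / omega_d              # from x'(0) = 0
    decay = math.exp(-xi * omega_n * t)
    return decay * (a0 * math.cos(omega_d * t) + b0 * math.sin(omega_d * t))

if __name__ == "__main__":
    # Sample the response of a lightly damped system at a few instants.
    for t in (0.0, 0.25, 0.5, 1.0):
        print(t, damped_free_response(t, A0=1.0, xi=0.1, omega_n=2.0 * math.pi))
```

Other initial conditions only change the two lines that set `a0` and `b0`.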


CHAPTER 1

2D Truss System Solver

Direct Stiffness Method for 2D Trusses

• Step 1: Definition of the joint positions and the truss members between the joints. This includes the cross-section area and Young’s modulus for each truss member as well as the boundary conditions for each joint. In the process of defining the joint positions, a code vector is also defined for each joint. The code vector of each joint consists of two numbers corresponding to the two possible directions of displacement that the joint can undergo. The assignment of code numbers to the joints (also called “nodes” in the rest of this text) is such that the first node has the code vector (0, 1), the second node has the code vector (2, 3) and so on. The joint positions and these code vectors are packed together in a data structure called “node”. Later on, while the truss members are being defined, the geometry and material properties of each truss member are packed together in a data structure called “bar”. The “bar” data structure also contains a code vector which is established by joining the code vectors of the two nodes belonging to the particular truss member.

• Step 2: Definition of the local member stiffness matrices $\mathbf{k}'$. These matrices are defined with respect to the local coordinate system ($(x', y')$ in Figure 1) of each member.

$$\mathbf{k}' = \begin{bmatrix} \frac{EA}{L} & -\frac{EA}{L} \\ -\frac{EA}{L} & \frac{EA}{L} \end{bmatrix}$$

• Step 3: Definition of the coordinate transformation matrices $\mathbf{t}$ for each truss member. Using these matrices the local member stiffness matrices, local displacements $(u'_1, u'_2)$ and forces $(q'_1, q'_2)$ at each end of the truss members are transformed into the global coordinate system. These matrices are populated by the cosines and sines of the angle between the member axis and the global x-coordinate system (usually a horizontal axis).

$$\mathbf{t} = \begin{bmatrix} \cos\theta_x & \sin\theta_x & 0 & 0 \\ 0 & 0 & \cos\theta_x & \sin\theta_x \end{bmatrix}$$

In Figure 1 $u_{1x}, u_{1y}, u_{2x}, u_{2y}$ and $q_{1x}, q_{1y}, q_{2x}, q_{2y}$ denote the global end displacements and end forces respectively. The conversion of the forces, displacements and the stiffness matrices between the local and global coordinate systems can be done as follows:

$$\mathbf{q} = \mathbf{t}^T\mathbf{q}', \quad \mathbf{u}' = \mathbf{t}\mathbf{u}, \quad \mathbf{q} = \mathbf{k}\mathbf{u} \;\Rightarrow\; \mathbf{k} = \mathbf{t}^T\mathbf{k}'\mathbf{t}$$


Fig. 1.1: Figure 1: Member end forces and displacements in local and global coordinates

• Step 5: Assemblage of the global stiffness matrix for the entire system from the global stiffness matrices of the bars. This operation uses the code vectors of the truss members. As mentioned in step 1, each 2D truss member is assigned a code vector consisting of 4 numbers. As an example, if a bar is located between the first and third (in the order of definition) nodes of the system, then the code vector of this bar would be (0, 1, 4, 5). The data structure “bar” contains a vector called “codeVec” where the numbers (0, 1, 4, 5) would be stored for this particular bar. Let’s assume as an example that the total number of nodes in the system is 3. Then the total number of possible joint displacements (in other words the total degrees of freedom of the system) would be 6 and the global system stiffness matrix would be a 6×6 matrix. Let’s call this matrix K. In the process of programming this method, K is initialized as a zero matrix. Afterwards the entries of the member global stiffness matrices are added to the proper parts of K. In case of the example bar the following operations would be necessary: K[0][0] += k[0][0], K[0][1] += k[0][1], K[0][4] += k[0][2], K[0][5] += k[0][3], K[1][4] += k[1][2], K[1][5] += k[1][3] and so on. The following pseudocode would do this operation for all bars in the system and assemble the system global stiffness matrix:

for(i=0;i<total number of bars;i++)
    for(j=0;j<4;j++)
        for(m=0;m<4;m++)
            index1=bars[i].codeVec[j]
            index2=bars[i].codeVec[m]
            K[index1][index2]+=bars[i].k[j][m]
        next
    next
next

• Step 6: Partitioning of the global stiffness matrix K, the global displacement vector U and the global force vector Q. Q and U are related to each other as follows:

Q = KU



In the above equation both Q and U have known and unknown parts by the definition of the system, such that where U is known, Q is unknown and vice versa. In order to come up with an equation system from which the unknown parts of U can be solved, a sub global stiffness matrix Ks as well as sub load and displacement vectors Qs and Us have to be defined. For this purpose the code values in the entire system corresponding to the degrees of freedom where the displacement is unknown but the force is known are packed into a vector called “subIndices”. Also, the known forces at these degrees of freedom are packed into the vector Qs. The next step is to initialize Ks as a zero matrix and then to populate it using the following pseudocode.

for(i=0;i<length of subIndices;i++)
    for(j=0;j<length of subIndices;j++)
        Ks[i][j]=K[subIndices[i]][subIndices[j]]
    next
next

• Step 7: The unknown displacements Us are solved from the equation system Qs = KsUs. Afterwards the values in Us are added to the initially zero vector U using the following pseudocode.

for(i=0;i<length of subIndices;i++)
    U[subIndices[i]]=Us[i]
next

• Step 8: Computation of the force vector with Q = KU.

• Step 9: Using Q and U, global force and displacement vectors q and u are assigned to each truss member as follows.

for(i=0;i<total number of bars;i++)
    for(j=0;j<4;j++)
        index=bars[i].codeVec[j]
        bars[i].q[j]=Q[index]
        bars[i].u[j]=U[index]
    next
next

• Step 10: For each truss member, the global load and displacement vectors are transformed into the local coordinate system in order to compute the axial forces and displacements of each truss member. The equations $\mathbf{u}' = \mathbf{t}\mathbf{u}$ and $\mathbf{q}' = \mathbf{k}'\mathbf{u}'$ are used. As shown in Figure 1 the local force $q'_2$ is defined as a tensile force. In Figure 1, $q'_1$ and $q'_2$ correspond to q′[0] and q′[1] respectively. Therefore a positive value of q′[1] indicates tension in the member whereas a negative value indicates compression.
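Steps 2 to 10 can be sketched end to end in a few dozen lines of Python. This is an illustration under stated assumptions, not the solver’s actual source. Nodes are plain (x, y) tuples and bars are (i, j, E, A) tuples rather than the “node” and “bar” structures. Every constrained displacement is assumed to be zero. The function names and the small Gaussian elimination helper are hypothetical.

```python
import math

def member_matrices(p1, p2, E, A):
    """Steps 2-3: local stiffness k', transformation t and global k = t^T k' t."""
    L = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    c, s = (p2[0] - p1[0]) / L, (p2[1] - p1[1]) / L
    k_local = [[E * A / L, -E * A / L], [-E * A / L, E * A / L]]
    t = [[c, s, 0.0, 0.0], [0.0, 0.0, c, s]]
    k = [[sum(t[a][j] * k_local[a][b] * t[b][m] for a in range(2) for b in range(2))
          for m in range(4)] for j in range(4)]
    return k_local, t, k

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for cc in range(col, n + 1):
                M[r][cc] -= factor * M[col][cc]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][cc] * x[cc] for cc in range(r + 1, n))) / M[r][r]
    return x

def solve_truss(nodes, bars, free_dofs, loads):
    """free_dofs plays the role of subIndices, loads the role of Qs.

    Returns (U, Q, axial) where axial[i] is q'[1] of bar i (positive = tension).
    """
    ndof = 2 * len(nodes)
    K = [[0.0] * ndof for _ in range(ndof)]
    members = []
    for i, j, E, A in bars:
        k_local, t, k = member_matrices(nodes[i], nodes[j], E, A)
        code = [2 * i, 2 * i + 1, 2 * j, 2 * j + 1]         # code vector of the bar
        for a in range(4):                                   # Step 5: assembly
            for b in range(4):
                K[code[a]][code[b]] += k[a][b]
        members.append((code, t, k_local))
    Ks = [[K[r][c] for c in free_dofs] for r in free_dofs]  # Step 6: sub-matrix
    Us = solve_linear(Ks, loads)                             # Step 7: Qs = Ks Us
    U = [0.0] * ndof
    for index, value in zip(free_dofs, Us):
        U[index] = value
    Q = [sum(K[r][c] * U[c] for c in range(ndof)) for r in range(ndof)]  # Step 8
    axial = []
    for code, t, k_local in members:                         # Steps 9 and 10
        u = [U[c] for c in code]
        u_local = [sum(t[a][c] * u[c] for c in range(4)) for a in range(2)]
        axial.append(k_local[1][0] * u_local[0] + k_local[1][1] * u_local[1])
    return U, Q, axial

if __name__ == "__main__":
    # Two bars meeting at a loaded apex; both supports are pinned.
    nodes = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
    bars = [(0, 2, 100.0, 1.0), (1, 2, 100.0, 1.0)]
    print(solve_truss(nodes, bars, free_dofs=[4, 5], loads=[0.0, -10.0]))
```

For the symmetric two-bar example, statics gives an axial force of $-P/\sqrt{2}$ in each bar for a downward apex load $P$, which is a convenient check on the sign convention of Step 10.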

References

[1] Hibbeler R.C., Structural Analysis, 8th edition, ISBN: 978-0132570534


CHAPTER 2

2D Frame System Solver

Direct Stiffness Method for 2D Frames

The elements that make up a frame structure are capable of carrying shear forces and bending moments in addition to the axial forces. Also, in addition to the translational degrees of freedom at the two nodes of an element, frame members have rotational degrees of freedom. The numbering convention and the positive directions of these degrees of freedom are shown below.

Fig. 2.1: Figure 1: Translational and rotational degrees of freedom of a frame member

The first step in the derivation of the element stiffness matrix is to describe the flexural displacement $v(x)$ and axial displacement $u(x)$ of the frame members in polynomial form.

$$v(x) = c_1 + c_2x + c_3x^2 + c_4x^3$$

$$u(x) = c_5 + c_6x$$

In the Euler-Bernoulli beam theory the equation for the bending moment is $M(x) = EI\frac{d^2v}{dx^2}$. Also, since in the finite element method loads are applied at the nodes of an element, $M(x)$ must vary linearly between the nodes (the contributions of the distributed loads to the bending moments are added after the system equations are solved). Therefore $v(x)$ is described as a third order polynomial so that its second derivative varies linearly between the element nodes. The description of the axial displacement as a first order polynomial follows from the fact that the axial force carried by a member is assumed to be constant along the member length. This leads to constant axial strain which has the equation $\varepsilon_x = \frac{du}{dx}$. After the application of the boundary conditions $u(x)|_{x=0} = u_1$, $v(x)|_{x=0} = v_1$, $v'(x)|_{x=0} = \theta_1$, $u(x)|_{x=L} = u_2$, $v(x)|_{x=L} = v_2$, $v'(x)|_{x=L} = \theta_2$, we obtain the following expressions for $v(x)$ and $u(x)$:

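$$v(x) = \left(1 - \frac{3x^2}{L^2} + \frac{2x^3}{L^3}\right)v_1 + \left(x - \frac{2x^2}{L} + \frac{x^3}{L^2}\right)\theta_1 + \left(\frac{3x^2}{L^2} - \frac{2x^3}{L^3}\right)v_2 + \left(\frac{x^3}{L^2} - \frac{x^2}{L}\right)\theta_2$$

$$u(x) = u_1 + \frac{(u_2 - u_1)}{L}x$$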

The strain energy in a member can be computed as $U = \frac{1}{2}\int_V \sigma_x\varepsilon_x\,dV$ where $V$ is the total volume of the member.

The strain energy can be computed as the superposition of the strain energies related to flexure ($U_f$) and axial loading ($U_a$). In case of axial loading stress, strain and resulting strain energy are computed as follows:

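$$\varepsilon_x = \frac{du}{dx} = \frac{(u_2 - u_1)}{L}, \quad \sigma_x = E\varepsilon_x = E\frac{du}{dx}$$

$$U_a = \frac{1}{2}\int_V E\left(\frac{du}{dx}\right)^2 dV = \frac{1}{2}E\int_0^L\left(\frac{du}{dx}\right)^2\int_A dA\,dx = \frac{1}{2}EA\int_0^L\frac{(u_2 - u_1)^2}{L^2}\,dx$$

$$U_a = \frac{EA}{2L}(u_2 - u_1)^2$$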

The stress, strain and strain energy associated with the flexural loading can be computed as follows:

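$$\varepsilon_x = -y\frac{d^2v}{dx^2} = -y(2c_3 + 6c_4x), \quad \sigma_x = E\varepsilon_x = -yE\frac{d^2v}{dx^2}$$

$$U_f = \frac{1}{2}E\int_V y^2\left(\frac{d^2v}{dx^2}\right)^2 dV = \frac{1}{2}E\int_0^L\left(\frac{d^2v}{dx^2}\right)^2\int_A y^2\,dA\,dx = \frac{1}{2}EI\int_0^L\left(\frac{d^2v}{dx^2}\right)^2 dx$$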

At this point it is convenient to describe 𝑣(𝑥) as a combination of 4 shape functions

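$$N_1 = 1 - \frac{3x^2}{L^2} + \frac{2x^3}{L^3}, \quad N_2 = x - \frac{2x^2}{L} + \frac{x^3}{L^2}$$

$$N_3 = \frac{3x^2}{L^2} - \frac{2x^3}{L^3}, \quad N_4 = \frac{x^3}{L^2} - \frac{x^2}{L}$$

$$v(x) = N_1v_1 + N_2\theta_1 + N_3v_2 + N_4\theta_2$$

$$U_f = \frac{1}{2}EI\int_0^L\left(\frac{d^2N_1}{dx^2}v_1 + \frac{d^2N_2}{dx^2}\theta_1 + \frac{d^2N_3}{dx^2}v_2 + \frac{d^2N_4}{dx^2}\theta_2\right)^2 dx$$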

The total strain energy of the frame member can be computed using the superposition of 𝑈𝑎 and 𝑈𝑓 as follows:

$$U = \frac{EA}{2L}(u_2 - u_1)^2 + \frac{1}{2}EI\int_0^L\left(\frac{d^2N_1}{dx^2}v_1 + \frac{d^2N_2}{dx^2}\theta_1 + \frac{d^2N_3}{dx^2}v_2 + \frac{d^2N_4}{dx^2}\theta_2\right)^2 dx$$

Once the total strain energy of a member is known, the forces and moments in the local coordinates acting at its nodes can be computed using Castigliano’s first theorem such that:

$$N_1 = \frac{\partial U}{\partial u_1} = \frac{EA}{L}(u_1 - u_2)$$

$$V_1 = \frac{\partial U}{\partial v_1} = EI\int_0^L\left(N_1''N_1''v_1 + N_2''N_1''\theta_1 + N_3''N_1''v_2 + N_4''N_1''\theta_2\right)dx$$

$$M_1 = \frac{\partial U}{\partial \theta_1} = EI\int_0^L\left(N_1''N_2''v_1 + N_2''N_2''\theta_1 + N_3''N_2''v_2 + N_4''N_2''\theta_2\right)dx$$

$$N_2 = \frac{\partial U}{\partial u_2} = \frac{EA}{L}(u_2 - u_1)$$

$$V_2 = \frac{\partial U}{\partial v_2} = EI\int_0^L\left(N_1''N_3''v_1 + N_2''N_3''\theta_1 + N_3''N_3''v_2 + N_4''N_3''\theta_2\right)dx$$

$$M_2 = \frac{\partial U}{\partial \theta_2} = EI\int_0^L\left(N_1''N_4''v_1 + N_2''N_4''\theta_1 + N_3''N_4''v_2 + N_4''N_4''\theta_2\right)dx$$

The above equations can be written in matrix form:

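$$
\begin{bmatrix} N_1 \\ V_1 \\ M_1 \\ N_2 \\ V_2 \\ M_2 \end{bmatrix}
=
\begin{bmatrix}
\frac{EA}{L} & 0 & 0 & -\frac{EA}{L} & 0 & 0 \\
0 & EI\int_0^L N_1''N_1''\,dx & EI\int_0^L N_2''N_1''\,dx & 0 & EI\int_0^L N_3''N_1''\,dx & EI\int_0^L N_4''N_1''\,dx \\
0 & EI\int_0^L N_1''N_2''\,dx & EI\int_0^L N_2''N_2''\,dx & 0 & EI\int_0^L N_3''N_2''\,dx & EI\int_0^L N_4''N_2''\,dx \\
-\frac{EA}{L} & 0 & 0 & \frac{EA}{L} & 0 & 0 \\
0 & EI\int_0^L N_1''N_3''\,dx & EI\int_0^L N_2''N_3''\,dx & 0 & EI\int_0^L N_3''N_3''\,dx & EI\int_0^L N_4''N_3''\,dx \\
0 & EI\int_0^L N_1''N_4''\,dx & EI\int_0^L N_2''N_4''\,dx & 0 & EI\int_0^L N_3''N_4''\,dx & EI\int_0^L N_4''N_4''\,dx
\end{bmatrix}
\begin{bmatrix} u_1 \\ v_1 \\ \theta_1 \\ u_2 \\ v_2 \\ \theta_2 \end{bmatrix}
$$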

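After evaluating the integrals in the above matrix equation, the frame member stiffness matrix in local coordinates is found as

$$
\mathbf{k}' =
\begin{bmatrix}
\frac{EA}{L} & 0 & 0 & -\frac{EA}{L} & 0 & 0 \\
0 & \frac{12EI}{L^3} & \frac{6EI}{L^2} & 0 & -\frac{12EI}{L^3} & \frac{6EI}{L^2} \\
0 & \frac{6EI}{L^2} & \frac{4EI}{L} & 0 & -\frac{6EI}{L^2} & \frac{2EI}{L} \\
-\frac{EA}{L} & 0 & 0 & \frac{EA}{L} & 0 & 0 \\
0 & -\frac{12EI}{L^3} & -\frac{6EI}{L^2} & 0 & \frac{12EI}{L^3} & -\frac{6EI}{L^2} \\
0 & \frac{6EI}{L^2} & \frac{2EI}{L} & 0 & -\frac{6EI}{L^2} & \frac{4EI}{L}
\end{bmatrix}
$$

The transformation of the stiffness matrices into the global coordinate system and the assemblage of the global stiffness matrix can be done similarly to 2-dimensional trusses.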
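The bending entries of the stiffness matrix can be checked numerically by integrating the products of the second derivatives of the shape functions. The sketch below is an illustration with a hypothetical function name. It uses Simpson’s rule on a single interval, which is exact here because each integrand $N_i''N_j''$ is a quadratic polynomial in $x$.

```python
def frame_local_stiffness(E, A, I, L):
    """6x6 local stiffness matrix, DOF order (u1, v1, theta1, u2, v2, theta2)."""
    # Second derivatives of the shape functions N1..N4.
    d2N = [
        lambda x: -6.0 / L ** 2 + 12.0 * x / L ** 3,
        lambda x: -4.0 / L + 6.0 * x / L ** 2,
        lambda x: 6.0 / L ** 2 - 12.0 * x / L ** 3,
        lambda x: 6.0 * x / L ** 2 - 2.0 / L,
    ]

    def integral(f, g):
        # Simpson's rule on [0, L]; exact for the quadratic product f * g.
        h = lambda x: f(x) * g(x)
        return L / 6.0 * (h(0.0) + 4.0 * h(L / 2.0) + h(L))

    k = [[0.0] * 6 for _ in range(6)]
    axial = E * A / L
    k[0][0] = k[3][3] = axial
    k[0][3] = k[3][0] = -axial
    bending_dofs = [1, 2, 4, 5]  # matrix positions of v1, theta1, v2, theta2
    for a, i in enumerate(bending_dofs):
        for b, j in enumerate(bending_dofs):
            k[i][j] = E * I * integral(d2N[a], d2N[b])
    return k

if __name__ == "__main__":
    for row in frame_local_stiffness(E=210e9, A=0.01, I=8.0e-5, L=3.0):
        print(row)
```

Comparing the printed rows against the closed-form entries such as $12EI/L^3$ and $4EI/L$ is a quick way to catch a sign or index mistake when implementing the matrix by hand.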

In case of elements carrying distributed loading, the reaction forces that the distributed load would cause on a single beam element are added to the load vectors of the element nodes. These reaction forces and moments are shown in Figure 2.

In Figure 2 the nodes at the two ends of the frame element are denoted with $N_1$ and $N_2$. From elementary strength of materials we know that a uniformly distributed load $q$ on a beam of length $L$ clamped at both ends would cause end moments of magnitude $\frac{qL^2}{12}$ and support shear forces of magnitude $\frac{qL}{2}$. In Figure 2, the reaction forces and moments acting in the opposite directions of the degrees of freedom given in Figure 1 have negative sign. On the other hand, the forces and moments acting on the nodes of the element because of the distributed load act in the opposite directions of the reaction forces and moments. A nodal force or moment caused by $q$ is only added to the system load vector if the node is not constrained in the direction of that force or moment.

After solving the system equations, the shear forces and bending moments are transformed to local coordinates. Finally, according to the sign convention of Figure 1, $qL/2$ and $qL^2/12$ are added to the local forces and moments.

References

[1] Hibbeler R.C., Structural Analysis, 8th edition, ISBN: 978-0132570534

[2] Hutton D.V., Fundamentals of Finite Element Analysis, ISBN: 0072395362



Fig. 2.2: Figure 2: Reaction forces and moments of a frame member


CHAPTER 3

Machine Learning

Basic Concepts

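Hyperplane: $\{\mathbf{x} : \mathbf{w}^T\mathbf{x} = b\}$, $\mathbf{w}, \mathbf{x} \in \mathbb{R}^d$, $b \in \mathbb{R}$. A set of points in $\mathbb{R}^d$ with a constant inner product to a given vector $\mathbf{w} \in \mathbb{R}^d$. The constant $b$ determines the offset of the hyperplane $(\mathbf{w}, b)$ from the origin.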

Training Set $T$: The set of pairs $(\mathbf{x}_t, y_t) \in \mathbb{R}^d \times \mathbb{R}$ where the vectors $\mathbf{x}_t$ are samples with known behaviour and $y_t$ are their labels. In classification, each sample $\mathbf{x}_t$ in the training set is assigned a label $y_t$ which places the sample in one of a given number of classes. In the notation $\mathbf{x}_t$, $t$ belongs to the set $\{1, 2, ..., l\}$ where $l$ is the total number of samples in the training set.

Output Domain $Y$: For binary classification of the vectors in $T$, the output domain is $Y = \{-1, 1\}$. A classifier function is developed based on the training set which should assign any new d-dimensional vector to one of the classes -1 and 1 correctly. In the ideal case the samples in the training set describe the general behaviour of the vectors in each class sufficiently well so that the classifier function is able to classify any vector correctly; otherwise the classifier needs to be updated using new training samples.

Linear Binary Classification

Classification function: The classification function is defined as

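$$f(\mathbf{x}_t) = \begin{cases} 1 & \text{if } \langle\mathbf{w},\mathbf{x}_t\rangle + b > 0 \\ -1 & \text{else} \end{cases} \qquad (3.1)$$

In Eq. (3.1) $\mathbf{w}$ is the weight vector having the same dimension as the training sample vectors $\mathbf{x}_t$ and $b$ is the bias coefficient. The symbol $\langle\cdot,\cdot\rangle$ denotes the scalar product of two vectors.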

The classifier function can be obtained using the perceptron algorithm.

The Perceptron Algorithm: A supervised learning algorithm based on the gradual improvement of a classifier function until it classifies all the samples in the training set correctly. The steps of the algorithm can be summarized as follows:

• Initialize: $\mathbf{w}^{(0)} = \mathbf{0}$, $b^{(0)} = 0$, total number of updates = 0, number of misclassifications in one pass = 0, $R = \max_t \|\mathbf{x}_t\|$

• For t = 1 to number of samples: check if $y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + b^{(k)}) > 0$ is satisfied.

• If the above condition is not satisfied then do:

– $\mathbf{w}^{(k+1)} = \mathbf{w}^{(k)} + y_t\mathbf{x}_t$

– $b^{(k+1)} = b^{(k)} + y_tR^2$
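The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The stopping rule (repeat passes until one pass produces no misclassification) follows the description of the algorithm. The `max_passes` safety cap and the function names are assumptions added for this example.

```python
import math

def train_perceptron(samples, labels, max_passes=1000):
    """Perceptron with the bias update b += y * R**2.

    samples: list of d-dimensional sequences, labels: list of -1 / +1 values.
    Returns (w, b, updates).
    """
    d = len(samples[0])
    w, b, updates = [0.0] * d, 0.0, 0
    R = max(math.sqrt(sum(v * v for v in x)) for x in samples)
    for _ in range(max_passes):
        mistakes = 0  # misclassifications in this pass
        for x, y in zip(samples, labels):
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y * R * R
                updates += 1
                mistakes += 1
        if mistakes == 0:  # a full pass without updates: all samples are correct
            return w, b, updates
    raise RuntimeError("no separating hyperplane found within max_passes")

def classify(w, b, x):
    """Classification function of Eq. (3.1)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

if __name__ == "__main__":
    samples = [(2.0, 2.0), (3.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
    labels = [1, 1, -1, -1]
    w, b, updates = train_perceptron(samples, labels)
    print(w, b, updates, [classify(w, b, x) for x in samples])
```

If the training set is not linearly separable the loop never reaches a clean pass, which is why the sketch raises an error after `max_passes` instead of looping forever.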

The perceptron algorithm uses a training set $T = \{(\mathbf{x}_t, y_t) : \mathbf{x}_t \in \mathbb{R}^d, y_t \in \{-1, 1\}\}$ of samples with known behaviour in order to obtain a proper weight vector and bias coefficient. The letter $t$ in the notation for a training sample $\mathbf{x}_t$ is the index of the training sample and belongs to the set $\{1, ..., l\}$ where $l$ is the total number of training samples. For each training sample $\mathbf{x}_t$ in the training set, a label $y_t$ is assigned according to the known behaviour of the sample.

The classifier function in Eq. (3.1) makes a decision about the class of a vector $\mathbf{x}_t$ which depends on whether the expression $\langle\mathbf{w},\mathbf{x}_t\rangle + b$ has a value greater than 0 or not. Therefore the set of vectors $\mathbf{x}$ in $\mathbb{R}^d$ for which this expression is equal to 0 forms the decision boundary.

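Formally the decision boundary is defined as $DB = \{\mathbf{x} \in \mathbb{R}^d : \langle\mathbf{w},\mathbf{x}\rangle + b = 0\}$. The weight vector $\mathbf{w}$ determines the orientation of the decision boundary and is perpendicular to it. In order to show this, let $\mathbf{v}_1$ and $\mathbf{v}_2$ be any two vectors in $DB$. Then $\langle\mathbf{w},\mathbf{v}_1\rangle + b = \langle\mathbf{w},\mathbf{v}_2\rangle + b = 0$ and $\langle\mathbf{w},\mathbf{v}_1 - \mathbf{v}_2\rangle = 0$. Therefore any vector lying within $DB$, that is any difference $\mathbf{v}_1 - \mathbf{v}_2$ of two of its points, is perpendicular to $\mathbf{w}$. This can be easily visualized in $\mathbb{R}^2$.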

In the above figure, the symbols ‘x’ and ‘o’ represent the samples of two different classes and $\gamma_G$ denotes the geometric margin of the decision boundary $DB$ between these two classes. The geometric margin is the smallest distance between any sample in the training set and the hyperplane that separates the two classes in the training set (the decision boundary). An expression to compute the magnitude of the geometric margin can be obtained as follows: Let $\mathbf{x}_t$ be the training sample having the least distance to the decision boundary. Then $\mathbf{x}_t$ can be expressed as the sum of its orthogonal projection $(\mathbf{x}_t)_\perp$ onto the hyperplane through the origin parallel to $DB$ and another vector which is parallel to the weight vector $\mathbf{w}$, as in Eq. (3.2)

$$\mathbf{x}_t = (\mathbf{x}_t)_\perp + (\gamma_G + d_O)\frac{\mathbf{w}}{\|\mathbf{w}\|} \qquad (3.2)$$

In Eq. (3.2), $d_O$ is the distance of the decision boundary from the origin and is closely related to the bias coefficient $b$ first introduced in Eq. (3.1). In order to see that, let $\mathbf{x}_0$ be a vector that is perpendicular to the decision boundary and has a magnitude equal to the distance of the decision boundary from the origin such that $\|\mathbf{x}_0\| = d_O$. Then $\mathbf{x}_0$ has the same direction as $\mathbf{w}$ and can be written as $\mathbf{x}_0 = \|\mathbf{x}_0\|\frac{\mathbf{w}}{\|\mathbf{w}\|}$. Clearly, $\mathbf{x}_0$ is also in the decision boundary and therefore $\langle\mathbf{w},\mathbf{x}_0\rangle + b = \frac{\|\mathbf{x}_0\|}{\|\mathbf{w}\|}\langle\mathbf{w},\mathbf{w}\rangle + b = \|\mathbf{x}_0\|\|\mathbf{w}\| + b = 0$. Therefore $d_O$ can be computed as in Eq. (3.3).

$$d_O = -\frac{b}{\|\mathbf{w}\|} \qquad (3.3)$$

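Plugging the expression for $d_O$ from Eq. (3.3) into Eq. (3.2) and taking the scalar product of both sides of the equation with $\mathbf{w}$ we obtain Eq. (3.4)

$$\langle\mathbf{w},\mathbf{x}_t\rangle = \langle\mathbf{w},(\mathbf{x}_t)_\perp\rangle + \frac{\gamma_G}{\|\mathbf{w}\|}\langle\mathbf{w},\mathbf{w}\rangle - \frac{b}{\|\mathbf{w}\|^2}\langle\mathbf{w},\mathbf{w}\rangle \qquad (3.4)$$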

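Since $\mathbf{w}$ and $(\mathbf{x}_t)_\perp$ are perpendicular to each other, the $\langle\mathbf{w},(\mathbf{x}_t)_\perp\rangle$ term in Eq. (3.4) vanishes. After adding $b$ to both sides of Eq. (3.4), we obtain Eq. (3.5) which shows the expression for the geometric margin $\gamma_G$ [3].

$$\langle\mathbf{w},\mathbf{x}_t\rangle + b = \gamma_G\|\mathbf{w}\| \;\Rightarrow\; \gamma_G = \frac{\langle\mathbf{w},\mathbf{x}_t\rangle + b}{\|\mathbf{w}\|} \qquad (3.5)$$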
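Eqs. (3.3) and (3.5) translate directly into code. The sketch below uses hypothetical function names. It evaluates the signed distance of a sample from the decision boundary and takes the smallest absolute value over a set of samples as the geometric margin, following the definition given above.

```python
import math

def signed_distance(w, b, x):
    """(<w, x> + b) / ||w||: distance of x from the boundary, signed by side."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

def origin_distance(w, b):
    """d_O = -b / ||w|| from Eq. (3.3)."""
    return -b / math.sqrt(sum(wi * wi for wi in w))

def geometric_margin(w, b, samples):
    """Smallest distance between any sample and the decision boundary."""
    return min(abs(signed_distance(w, b, x)) for x in samples)

if __name__ == "__main__":
    w, b = [3.0, 4.0], -5.0
    print(origin_distance(w, b))
    print(geometric_margin(w, b, [(3.0, 4.0), (0.0, 0.0), (1.0, 3.0)]))
```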

The perceptron algorithm starts with the initialization of $\mathbf{w}$ and $b$ as $\mathbf{w}^{(0)} = \mathbf{0}$, $b^{(0)} = 0$. In the next step the algorithm traverses the training set. For each training sample $\mathbf{x}_t$ in the training set, if the product $y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + b^{(k)})$ is less than or equal to zero, then this implies that the actual label of the training sample and the predicted label have opposite signs and the training sample is misclassified. In this case, $\mathbf{w}^{(k)}$ and $b^{(k)}$ are updated as follows:

$$\mathbf{w}^{(k+1)} = \mathbf{w}^{(k)} + y_t\mathbf{x}_t \qquad (3.6)$$

$$b^{(k+1)} = b^{(k)} + y_tR^2 \qquad (3.7)$$

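where $R = \max_t \|\mathbf{x}_t\|$. It can be proven that the above updates improve the weight vector and the bias coefficient as follows [1]: Assume that after the updates, another attempt is made in order to classify the same training sample $\mathbf{x}_t$. Then Eq. (3.8) shows that the new product $y_t(\langle\mathbf{w}^{(k+1)},\mathbf{x}_t\rangle + b^{(k+1)})$ is closer to a positive value compared to $y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + b^{(k)})$. The third line uses $y_t^2 = 1$.

$$
\begin{aligned}
y_t(\langle\mathbf{w}^{(k+1)},\mathbf{x}_t\rangle + b^{(k+1)}) &= y_t(\langle\mathbf{w}^{(k)} + y_t\mathbf{x}_t,\mathbf{x}_t\rangle + b^{(k)} + y_tR^2) \\
&= y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + y_t\|\mathbf{x}_t\|^2 + b^{(k)} + y_tR^2) \\
&= y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + b^{(k)}) + \|\mathbf{x}_t\|^2 + R^2 \\
&\geq y_t(\langle\mathbf{w}^{(k)},\mathbf{x}_t\rangle + b^{(k)})
\end{aligned}
\qquad (3.8)
$$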

It can also be proven that after a finite number of updates a proper classifier function can be obtained as long as the samples in the training set are linearly separable with a functional margin $\gamma_F > 0$. The functional margin $\gamma_t$ of a training sample $\mathbf{x}_t$ with respect to a separating hyperplane (decision boundary) $(\mathbf{w}, b)$ is defined as [2]:

$$\gamma_t = y_t(\langle\mathbf{w},\mathbf{x}_t\rangle + b) \qquad (3.9)$$

and the functional margin $\gamma_F$ of a separating hyperplane $(\mathbf{w}, b)$ is defined as the minimum of all functional margins associated with a training set. For a fixed $\|\mathbf{w}\|$, a larger functional margin implies that the training samples are geometrically farther away from the separating hyperplane and therefore the two classes are more distinctly separated. The relationship between the functional margin and the geometric separateness of the classes can be seen by comparing the expressions for the geometric margin (Eq. (3.5)) and the functional margin (Eq. (3.9)).
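A short sketch makes the comparison between Eq. (3.5) and Eq. (3.9) concrete. The function names are hypothetical. Scaling $(\mathbf{w}, b)$ by a positive factor leaves the hyperplane unchanged but multiplies the functional margin by that factor, which is why it has to be divided by $\|\mathbf{w}\|$ before it can be read as a distance.

```python
def functional_margins(w, b, samples, labels):
    """gamma_t = y_t * (<w, x_t> + b) for every training sample, Eq. (3.9)."""
    return [y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            for x, y in zip(samples, labels)]

def functional_margin(w, b, samples, labels):
    """gamma_F: the minimum functional margin over the training set."""
    return min(functional_margins(w, b, samples, labels))

if __name__ == "__main__":
    samples = [(2.0, 2.0), (3.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
    labels = [1, 1, -1, -1]
    print(functional_margin([3.0, 4.0], 0.0, samples, labels))
    # Same hyperplane, doubled weights: the functional margin doubles as well.
    print(functional_margin([6.0, 8.0], 0.0, samples, labels))
```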



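In order to prove that the perceptron algorithm converges to a solution after a finite number of iterations, the following new weight vectors $\hat{\mathbf{w}}$ and training samples $\hat{\mathbf{x}}_t$ are defined by appending $R$ to every training sample and $b^{(k)}/R$ to every weight vector $\mathbf{w}^{(k)}$:

$$\hat{\mathbf{x}}_t = (\mathbf{x}_t^T, R), \quad \hat{\mathbf{w}}^{(k)} = (\mathbf{w}^{(k)T}, b^{(k)}/R) \qquad (3.10)$$

Given that the training samples are linearly separable, there exists a separating hyperplane $(\mathbf{w}^*, b^*)$ such that for any $(\mathbf{x}_t, y_t) \in T$, $y_t(\langle\mathbf{w}^*,\mathbf{x}_t\rangle + b^*) = y_t\langle\hat{\mathbf{w}}^*,\hat{\mathbf{x}}_t\rangle \geq \gamma^*$ where $\gamma^*$ is the functional margin of $(\mathbf{w}^*, b^*)$.

Assume that the weight vector $\hat{\mathbf{w}}^{(k-1)}$ resulted in a misclassification of the sample $\hat{\mathbf{x}}_t$ and is therefore updated to $\hat{\mathbf{w}}^{(k)}$. $\hat{\mathbf{w}}^{(k)}$ and $\hat{\mathbf{w}}^*$ both belong to the vector space $\mathbb{R}^{d+1}$ and the cosine of the angle between them is defined as in Eq. (3.11).

$$\cos(\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(k)}) = \frac{\langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(k)}\rangle}{\|\hat{\mathbf{w}}^*\|\|\hat{\mathbf{w}}^{(k)}\|} \leq 1 \qquad (3.11)$$

In Eq. (3.11), the expression for $\hat{\mathbf{w}}^{(k)}$ can be expanded as follows.

$$\hat{\mathbf{w}}^{(k)} = \left(\mathbf{w}^{(k)T}, \frac{b^{(k)}}{R}\right) = \left(\mathbf{w}^{(k-1)T} + y_t\mathbf{x}_t^T, \frac{b^{(k-1)} + y_tR^2}{R}\right) = \left(\mathbf{w}^{(k-1)T} + y_t\mathbf{x}_t^T, \frac{b^{(k-1)}}{R} + y_tR\right) = \hat{\mathbf{w}}^{(k-1)} + y_t\hat{\mathbf{x}}_t$$

It can also be shown that the scalar product term in Eq. (3.11) is greater than or equal to $k\gamma^*$. In order to show this let $k = 1$, then $\langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(1)}\rangle = \langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(0)} + y_t\hat{\mathbf{x}}_t\rangle = y_t\langle\hat{\mathbf{w}}^*, \hat{\mathbf{x}}_t\rangle \geq \gamma^*$ since $\hat{\mathbf{w}}^{(0)}$ is initialized as the zero vector. If for some $n \in \mathbb{N}$, $\langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(n)}\rangle \geq n\gamma^*$, then $\langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(n+1)}\rangle = \langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(n)} + y_t\hat{\mathbf{x}}_t\rangle = y_t\langle\hat{\mathbf{w}}^*, \hat{\mathbf{x}}_t\rangle + \langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(n)}\rangle \geq \gamma^* + n\gamma^* = (n+1)\gamma^*$. By induction it follows that for any $k \in \mathbb{N}$, $\langle\hat{\mathbf{w}}^*, \hat{\mathbf{w}}^{(k)}\rangle \geq k\gamma^*$. Using this result the inequality in Eq. (3.12) is established.

$$\frac{k\gamma^*}{\|\hat{\mathbf{w}}^*\|\|\hat{\mathbf{w}}^{(k)}\|} \leq 1 \qquad (3.12)$$

Furthermore, a bound on the norm $\|\hat{\mathbf{w}}^{(k)}\|$ can be shown as follows:

$$\|\hat{\mathbf{w}}^{(k)}\|^2 = \langle\hat{\mathbf{w}}^{(k-1)} + y_t\hat{\mathbf{x}}_t, \hat{\mathbf{w}}^{(k-1)} + y_t\hat{\mathbf{x}}_t\rangle = \|\hat{\mathbf{w}}^{(k-1)}\|^2 + 2y_t\langle\hat{\mathbf{w}}^{(k-1)}, \hat{\mathbf{x}}_t\rangle + \|\hat{\mathbf{x}}_t\|^2$$

Since the weight vector $\hat{\mathbf{w}}^{(k-1)}$ resulted in a misclassification, it is known that $y_t\langle\hat{\mathbf{w}}^{(k-1)}, \hat{\mathbf{x}}_t\rangle \leq 0$. Therefore,

$$\|\hat{\mathbf{w}}^{(k)}\|^2 \leq \|\hat{\mathbf{w}}^{(k-1)}\|^2 + \|\hat{\mathbf{x}}_t\|^2$$

Using $\|\hat{\mathbf{x}}_t\|^2 = \|\mathbf{x}_t\|^2 + R^2 \leq 2R^2$ yields

$$\|\hat{\mathbf{w}}^{(k)}\|^2 \leq \|\hat{\mathbf{w}}^{(k-1)}\|^2 + 2R^2 \qquad (3.13)$$

Eq. (3.13) implies a bound on $\|\hat{\mathbf{w}}^{(k)}\|$. In order to prove this, consider that for $k = 1$, $\|\hat{\mathbf{w}}^{(1)}\|^2 \leq \|\hat{\mathbf{w}}^{(0)}\|^2 + 2R^2 = 2R^2 = 2kR^2$. If for some $n \in \mathbb{N}$, $\|\hat{\mathbf{w}}^{(n)}\|^2 \leq 2nR^2$, then using Eq. (3.13) we obtain $\|\hat{\mathbf{w}}^{(n+1)}\|^2 \leq \|\hat{\mathbf{w}}^{(n)}\|^2 + 2R^2 \leq 2nR^2 + 2R^2 = 2(n+1)R^2$. By induction it follows that:

$$\forall k \in \mathbb{N}, \quad \|\hat{\mathbf{w}}^{(k)}\|^2 \leq 2kR^2 \qquad (3.14)$$

By plugging Eq. (3.14) in Eq. (3.12), squaring both sides of the inequality and using the fact that $\|\hat{\mathbf{w}}^*\|$ is finite, Eq. (3.15) shows that only a finite number of iterations are needed in order to obtain a proper classifier function.

$$\frac{k\gamma^*}{\|\hat{\mathbf{w}}^*\|R\sqrt{2k}} \leq 1 \;\Rightarrow\; \frac{k^2(\gamma^*)^2}{\|\hat{\mathbf{w}}^*\|^2R^2\,2k} \leq 1 \;\Rightarrow\; k \leq \frac{2R^2\|\hat{\mathbf{w}}^*\|^2}{(\gamma^*)^2} \qquad (3.15)$$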
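The bound in Eq. (3.15) can be evaluated for a concrete training set once some separating hyperplane is known. The sketch below uses a hypothetical function name, and `w_star` and `b_star` stand for any hyperplane that separates the data. It builds the extended weight vector of Eq. (3.10) implicitly by adding $(b^*/R)^2$ to $\|\mathbf{w}^*\|^2$.

```python
import math

def perceptron_update_bound(samples, labels, w_star, b_star):
    """Upper bound 2 R^2 ||w_hat*||^2 / gamma*^2 on the number of updates."""
    R = max(math.sqrt(sum(v * v for v in x)) for x in samples)
    gamma = min(y * (sum(wi * xi for wi, xi in zip(w_star, x)) + b_star)
                for x, y in zip(samples, labels))   # functional margin gamma*
    if gamma <= 0:
        raise ValueError("(w_star, b_star) does not separate the training set")
    norm_sq = sum(wi * wi for wi in w_star) + (b_star / R) ** 2  # ||w_hat*||^2
    return 2.0 * R * R * norm_sq / gamma ** 2

if __name__ == "__main__":
    samples = [(2.0, 2.0), (3.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
    labels = [1, 1, -1, -1]
    print(perceptron_update_bound(samples, labels, [3.0, 4.0], 0.0))
```

Different separating hyperplanes give different bounds; the true number of updates is no larger than the smallest of them.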

References

[1] Tommi Jaakkola, course materials for 6.867 Machine Learning, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on 26 May 2014.

[2] Cristianini N., Shawe-Taylor J., An Introduction to Support Vector Machines and other Kernel Based Learning Methods, Cambridge University Press, 2000.

[3] Bishop C.M., Pattern Recognition and Machine Learning, Springer, 1st ed., 2006, ISBN-10: 0-387-31073-8.


CHAPTER 4

3D Modeling Tutorials

In this section different 3D modeling techniques are presented using Siemens NX 8.0/10.0.

The most common way of creating a 3D object is to extrude a base geometry drawn on a plane.

Intersection of a Datum Plane with an Extruded Body (8.0)

It is possible to define new curves by the intersection of datum planes with extruded bodies. After defining the datum plane we can open a sketch on the datum plane. Afterwards we can use the “Intersection Curve” button in the direct Sketch Toolbar. Here we must have the “Intersection Curve” button activated. The rest is simple. Also typing “Intersect” in the “Command Finder” helps. The “Command Finder” is by default in the menubar of the program on the right hand side of “Redo”.

To switch between different modules of the program like “Modeling” and “NX Sheet Metal” the Start button at the upper left corner, right below “File”, can be used.

Mirroring of curves created by intersection of datum planes and extruded bodies (8.0)

There is a function called “Mirror curve” which is hidden by default. In order to make this function available:

• Step 1: Right click on the “Insert” menu.

• Step 2: Put a check mark on the left hand side of “Curve”

• Step 3: The curve toolbar appears. Click on Toolbar options -> Add or remove buttons -> Curve

• Step 4: Put a check mark on the left hand side of “Mirror Curve”.

The “Intersection Curve” and “Mirror Curve” commands can be added to the menubar by dragging the toolbar that shows up after the 4th step above into the menubar.



Assigning Different Colors To Different Partitions of a Body (8.0)

This can be achieved using “Split Body” from the Trim drop-down (second toolbar row, 6th drop-down from the left).

Changing the Background Color of the Model (8.0)

Preferences -> Color Palette -> Selected Color -> Edit Background

Making a Part Transparent (10.0)

• Step 1: Go to the “View” menu at the top

• Step 2: Click on “Edit Object Display” in the “View” menu

• Step 3: Select the part that you want to make transparent

• Step 4: Click “OK”

• Step 5: Adjust the value of “Translucency” under “Shaded Display”


CHAPTER 5

Pre-calculus

1. Number sets

• Real numbers R: This set contains all the numbers that we deal with in calculus except complex numbers.

• Natural numbers or positive integers N: This set consists of the integers {1, 2, 3, ...}

• Integers Z: {...,−3,−2,−1, 0, 1, 2, 3, ...}.

• Rational numbers Q: These are the numbers that we obtain by dividing one integer by another, non-zero integer. In other words, for every rational number $q \in \mathbb{Q}$ there exist two integers $p, r \in \mathbb{Z}$, $r \neq 0$, such that $q = p/r$.

2. Axioms of the set of real numbers R

• For any 𝑎, 𝑏 ∈ R, 𝑎 + 𝑏 ∈ R and 𝑎 · 𝑏 ∈ R (Closure law).

• For any 𝑎, 𝑏 ∈ R, 𝑎 + 𝑏 = 𝑏 + 𝑎 and 𝑎 · 𝑏 = 𝑏 · 𝑎 (Commutative law).

• For any 𝑎, 𝑏, 𝑐 ∈ R, (𝑎 + 𝑏) + 𝑐 = 𝑎 + (𝑏 + 𝑐) and (𝑎 · 𝑏) · 𝑐 = 𝑎 · (𝑏 · 𝑐) (Associative law).

• For any 𝑎, 𝑏, 𝑐 ∈ R, 𝑎 · (𝑏 + 𝑐) = 𝑎𝑏 + 𝑎𝑐 and (𝑎 + 𝑏) · 𝑐 = 𝑎𝑐 + 𝑏𝑐 (Distributive law).

3. Factoring Polynomials

For any 𝑎, 𝑏, 𝑐, 𝑥 ∈ R:

1. $a^2 - b^2 = (a + b)(a - b)$

2. $a^2 + 2ab + b^2 = (a + b)^2$

3. $a^2 - 2ab + b^2 = (a - b)^2$

4. $a^3 + b^3 = (a + b)(a^2 - ab + b^2)$

5. $a^3 - b^3 = (a - b)(a^2 + ab + b^2)$

6. $x^2 + (a + b)x + ab = (x + a)(x + b)$

7. To factor $ax^2 + bx + c$ with $c \neq 0$, find two factors of $ac$ that add to $b$. Let $ac = f_1 f_2$ such that $f_1 + f_2 = b$. Then the polynomial can be written as $ax^2 + f_1 x + f_2 x + c$. Since $a = f_1 f_2 / c$, the polynomial is equal to $\frac{f_1 f_2}{c} x^2 + f_1 x + f_2 x + c$. This is an expression with four components added together. The first two components have the common factor $f_1 x$, therefore they can be combined into $f_1 x \left( \frac{f_2}{c} x + 1 \right) = \frac{1}{c} f_1 x (f_2 x + c)$. Here the sum of the third and fourth components, $f_2 x + c$, appears as one of the factors of the sum of the first and second components. Therefore the whole polynomial can be factored as $\frac{1}{c}(f_2 x + c)(f_1 x + c)$.

8. To factor $ax^2 + bxy + cy^2$, find two factors of $ac$ that add to $b$. Let $ac = f_1 f_2$ such that $f_1 + f_2 = b$. Then the polynomial can be written as $ax^2 + f_1 xy + f_2 xy + cy^2$. Since $a = f_1 f_2 / c$, the polynomial can be written as

$$\frac{f_1 f_2}{c} x^2 + f_1 xy + f_2 xy + cy^2 = \frac{1}{c}\left( f_1 f_2 x^2 + f_1 c\,xy + f_2 c\,xy + c^2 y^2 \right)$$

$$= \frac{f_1 x}{c}(f_2 x + cy) + \frac{1}{c}\, cy\, (f_2 x + cy)$$

$$= \frac{1}{c}(f_2 x + cy)(f_1 x + cy)$$
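The search for the two factors in rules 7 and 8 can be automated for integer coefficients. The following is a minimal sketch in Python; the function name `find_ac_pair` is an assumption made for this illustration, and the brute-force search over divisors is only one possible way to find the pair.

```python
def find_ac_pair(a, b, c):
    """Return integers (f1, f2) with f1 * f2 == a * c and f1 + f2 == b.

    Returns None if no such integer pair exists.
    """
    product = a * c
    if product == 0:
        return None  # rules 7 and 8 assume that a and c are non-zero
    # Try every non-zero integer divisor f1 of a*c, negative ones included.
    for f1 in range(-abs(product), abs(product) + 1):
        if f1 != 0 and product % f1 == 0:
            f2 = product // f1
            if f1 + f2 == b:
                return f1, f2
    return None


print(find_ac_pair(1, 1, -2))   # coefficients of x^2 + x - 2
print(find_ac_pair(24, 19, 2))  # coefficients of 24x^2 + 19x + 2
print(find_ac_pair(3, 8, 4))    # coefficients of 3x^2 + 8xy + 4y^2
```

If the function returns `None`, the polynomial cannot be split this way with integer factors, and another method (for example the quadratic formula) is needed.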

Example 3.1 : Factor $x^2 + x - 2$

Solution: This is a polynomial in the format $ax^2 + bx + c$ where $a = 1$, $b = 1$, $c = -2$. Therefore we can use the 7th rule of polynomial factorization. Since $a \cdot c = -2 = 2 \cdot (-1)$ and $2 + (-1) = 1 = b$, we can write the polynomial as follows:

$$x^2 + 2x + (-1)x + (-1)(2)$$

The above expression has 4 components added to each other. Let's focus on the first two of these components, because $x^2$ and $2x$ have the common factor $x$. These two can be combined into a single term as $x(x + 2)$. The third and fourth components also have a common factor, which is $-1$. Therefore they can be combined into $(-1)(x + 2)$. So the whole polynomial becomes $x(x + 2) + (-1)(x + 2)$. This last expression has only two components, added together, with the common factor $(x + 2)$. If we divide the first component by $(x + 2)$ we obtain $x$, and if we divide the second component by $(x + 2)$ we obtain $(-1)$. Therefore this expression can be re-written as $(x + 2)(x - 1)$.

Example 3.2 : Simplify the following rational expression

$$\frac{x^2 - 1}{x^2 + x - 2}$$

Solution: A rational expression is a fractional expression in which both the numerator and the denominator are polynomials. In the above example the numerator is $x^2 - 1$ and the denominator is $x^2 + x - 2$. Both of these expressions can be factored. First let's focus on the numerator: $x^2 - 1 = (x + 1)(x - 1)$. Now focus on the denominator: using the result of Example 3.1, $x^2 + x - 2 = (x + 2)(x - 1)$. Now we can write the rational expression as follows:

$$\frac{x^2 - 1}{x^2 + x - 2} = \frac{(x + 1)(x - 1)}{(x + 2)(x - 1)}$$

The $(x - 1)$ factors in the numerator and the denominator cancel each other (for $x \neq 1$). Therefore the expression can be simplified as

$$\frac{x + 1}{x + 2}$$
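A simplification like the one in Example 3.2 can be spot-checked with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the helper names `original` and `simplified` are assumptions made for this illustration. The points $x = 1$ and $x = -2$ are skipped because the original expression is undefined there.

```python
from fractions import Fraction


def original(x):
    """(x^2 - 1) / (x^2 + x - 2), evaluated exactly for an integer x."""
    return Fraction(x * x - 1, x * x + x - 2)


def simplified(x):
    """(x + 1) / (x + 2), evaluated exactly for an integer x."""
    return Fraction(x + 1, x + 2)


# Skip the zeros of the original denominator, x = 1 and x = -2.
sample_points = [x for x in range(-10, 11) if x not in (1, -2)]
all_equal = all(original(x) == simplified(x) for x in sample_points)
print(all_equal)
```

Agreement on sample points is not a proof, but a mismatch at any point would reveal a factoring mistake immediately.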

Example 3.3 : Factor the polynomial $3x^2 + 8xy + 4y^2$

Solution: Since $3 \cdot 4 = 12 = 6 \cdot 2$ and $8 = 6 + 2$, we have

$$3x^2 + 8xy + 4y^2 = 3x^2 + 6xy + 2xy + 4y^2$$

$$= 3x(x + 2y) + 2y(x + 2y)$$

$$= (3x + 2y)(x + 2y)$$

Example 3.4 : Find the sum $(x^3 - 6x^2 + 2x + 4) + (x^3 + 5x^2 - 7x)$

Solution: We sum two polynomials by combining like terms. Like terms are terms with the same variable (x, y, etc.) raised to the same power. For example, if one polynomial contains the term $a \cdot x^2$ and the other one contains $b \cdot x^2$, then in the sum we would have $(a + b)x^2$. In this example we have

$$(1 + 1)x^3 + (-6 + 5)x^2 + (2 + (-7))x^1 + (4 + 0)x^0 = 2x^3 - x^2 - 5x + 4$$

Example 3.5 : Factor the polynomial $24x^2 + 19x + 2$

Solution: The polynomial has the format $ax^2 + bx + c$ where $a = 24$, $b = 19$, $c = 2$. Using the 7th rule of polynomial factorization we obtain $a \cdot c = 24 \cdot 2 = 48$. Now we try to find two factors of 48 that add up to 19:

$$48 = 1 \cdot 48, \quad 1 + 48 = 49 \neq 19$$

$$48 = 2 \cdot 24, \quad 2 + 24 = 26 \neq 19$$

$$48 = 3 \cdot 16, \quad 3 + 16 = 19$$

Therefore we should write the polynomial as $24x^2 + 16x + 3x + 2$. Focusing on term 1 and term 2 we can see that they have $8x$ in common, because $24x^2 = 3 \cdot 8 \cdot x \cdot x$ and $16x = 2 \cdot 8 \cdot x$. Now we can write the polynomial as $8x(3x + 2) + (3x + 2)$. As the last step we factor $(3x + 2)$ out of both components of the sum and write the polynomial as $(3x + 2)(8x + 1)$.

Example 3.6 : Multiply $(x + 2)(x^2 + 2x + 3)$

Solution: Since $x$ is a variable that can have real number values, we can use the distributive law of the set of real numbers. We use the distributive law three times in this example. First we treat the second polynomial as a single real number. Call this number $a$. Then the multiplication becomes $(x + 2)a = xa + 2a = x(x^2 + 2x + 3) + 2(x^2 + 2x + 3)$ using the distributive law. Then we use the distributive law for the second time on $xa$ and for the third time on $2a$. As a result we obtain $x \cdot x^2 + x \cdot 2x + x \cdot 3 + 2 \cdot x^2 + 2 \cdot 2x + 2 \cdot 3 = x^3 + 4x^2 + 7x + 6$. The term by term multiplication steps are illustrated in the following figure.

Fig. 5.1: The steps of term by term multiplication of polynomials

4. Laws of Exponents

1. $x^a x^b = x^{a+b}$

2. $(xy)^a = x^a y^a$

3. $(x^a)^b = x^{ab}$

4. $\dfrac{x^a}{x^b} = x^{a-b} = \dfrac{1}{x^{b-a}}$

5. $\left( \dfrac{x}{y} \right)^a = \dfrac{x^a}{y^a}$

6. $x^{-a} = \dfrac{1}{x^a}$

7. $\sqrt{x} = x^{1/2}$

8. $\sqrt[p]{x^n} = x^{n/p}$

Example 4.1 : Simplify $x^{1/2} x^{1/3}$

Solution: Using the first law we obtain

$$x^{1/2} x^{1/3} = x^{1/2 + 1/3} = x^{5/6}$$

Example 4.2 : Simplify $(x^4 y^6)^{-1/2}$

Solution: Using the second law:

$$(x^4 y^6)^{-1/2} = (x^4)^{-1/2} (y^6)^{-1/2}$$

Using the third law:

$$(x^4)^{-1/2} (y^6)^{-1/2} = x^{4 \cdot (-1/2)}\, y^{6 \cdot (-1/2)} = x^{-2} y^{-3}$$

Using the sixth law:

$$x^{-2} y^{-3} = \frac{1}{x^2} \cdot \frac{1}{y^3} = \frac{1}{x^2 y^3}$$

Example 4.3 : Simplify $(2a^3 b^2)(3ab^4)^3$

Solution: Using the second law on the second parenthesis:

$$(2a^3 b^2)(3ab^4)^3 = (2a^3 b^2)(3^3 a^3 (b^4)^3)$$

Using the third law on the expression $(b^4)^3$:

$$(2a^3 b^2)(3^3 a^3 (b^4)^3) = (2a^3 b^2)(3^3 a^3 b^{12}) = (2a^3 b^2)(27 a^3 b^{12})$$

Grouping factors with the same base:

$$(2a^3 b^2)(27 a^3 b^{12}) = (2 \cdot 27)(a^3 a^3)(b^2 b^{12})$$

Using the first law:

$$(54)(a^3 a^3)(b^2 b^{12}) = (54)(a^{3+3})(b^{2+12}) = 54 a^6 b^{14}$$

Example 4.4 : Simplify $(2\sqrt{x})(3\sqrt[3]{x})$

Solution: Using laws 7 and 8:

$$(2\sqrt{x})(3\sqrt[3]{x}) = 2x^{1/2} \, 3x^{1/3}$$

Grouping factors with the same base:

$$2x^{1/2} \, 3x^{1/3} = 2 \cdot 3 \cdot x^{1/2} \cdot x^{1/3}$$

Using law 1:

$$(2 \cdot 3) \cdot x^{1/2} \cdot x^{1/3} = 6x^{1/2 + 1/3} = 6x^{5/6} = 6\sqrt[6]{x^5}$$
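The results of Examples 4.1 to 4.4 can be checked numerically. The following Python sketch evaluates both sides of each result at one arbitrarily chosen pair of positive values (the sample values 2.7 and 1.3 and the variable names are assumptions of this illustration) and compares them with `math.isclose`, since floating-point results are only approximately equal.

```python
import math

# Arbitrary positive sample values; fractional exponents assume positive bases.
x, y = 2.7, 1.3

# Example 4.1: x^(1/2) * x^(1/3) = x^(5/6)
ex_4_1 = math.isclose(x ** (1 / 2) * x ** (1 / 3), x ** (5 / 6))

# Example 4.2: (x^4 y^6)^(-1/2) = 1 / (x^2 y^3)
ex_4_2 = math.isclose((x ** 4 * y ** 6) ** (-1 / 2), 1 / (x ** 2 * y ** 3))

# Example 4.3 with a = x and b = y: (2 a^3 b^2)(3 a b^4)^3 = 54 a^6 b^14
ex_4_3 = math.isclose((2 * x ** 3 * y ** 2) * (3 * x * y ** 4) ** 3,
                      54 * x ** 6 * y ** 14)

# Example 4.4: (2 sqrt(x))(3 cbrt(x)) = 6 * (x^5)^(1/6)
ex_4_4 = math.isclose((2 * math.sqrt(x)) * (3 * x ** (1 / 3)),
                      6 * (x ** 5) ** (1 / 6))

print(ex_4_1, ex_4_2, ex_4_3, ex_4_4)
```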

References

[1] Stewart J., Redlin L., Watson S.; “Precalculus - Mathematics for Calculus”, 7th edition, ISBN: 978-1305071759

[2] Safier F.; “Schaum’s Outline of Theory and Problems of Precalculus”, ISBN: 0-07-057261-5


CHAPTER 6

Processing

Where to start ?

The first thing to do is to download the processing files from the following address

https://processing.org/download/?processing

If you have a Windows operating system you should choose either the Windows 64 bit or the Windows 32 bit version of the Processing files. You can see which version of Windows you have as follows: click on the Start button -> Settings -> System -> About. Here the system type is listed. The Processing files come packed in a zip file. Extract the contents of this zip file and you can start the Processing platform right away by executing the processing.exe file. No installation is needed.

List of Processing functions and built-in variables:

• void setup (): The size of the drawing canvas and the background colour are defined in this function. Every Processing program starts with void setup().

• void draw (): All the drawing operations are done within this function. By default, the code inside the draw() function is executed 60 times per second.

• stroke(r,g,b): Defines the colour of the lines drawn after this function is called. r, g, b define the red, green and blue intensity. They take values between 0 and 255.

• noStroke (): Geometric shapes like rectangles and ellipses that follow this command are drawn without outlines.

• strokeWeight (line thickness in pixels)

• line (x0,y0,x1,y1)

• rect (upperLeftX, upperLeftY, width, height)



• quad (x0,y0,x1,y1,x2,y2,x3,y3): Draws a quadrilateral with the given 4 corners (vertices). Here (x0,y0) is the first corner point, (x1,y1) is the second and so on. A quadrilateral is a polygon with 4 straight sides (edges) and 4 corners. The vertices should be specified in either clockwise or counterclockwise direction.

• triangle (x0,y0,x1,y1,x2,y2): Draws a triangle with the given 3 corners.

• ellipse (centerX, centerY, width, height)

• fill (r,g,b): fills inside of rectangles and ellipses with colour.

• noLoop (): This function prevents the draw() function from repeating when called inside the draw() function.

• println (): Prints to console.

• mouseX, mouseY : Variables holding the current x and y coordinates of the mouse when it is positioned on the canvas. These coordinates are the pixel distances in horizontal and vertical direction from the upper left corner of the canvas.

• height, width : These variables are always equal to the height and width of the sketch window.

Coordinates on a Canvas

In Processing the canvas has an invisible coordinate system with its origin at the upper left corner of the canvas. As an example, when we say the upper left corner of a rectangle has the coordinates (50,65), this means that the upper left corner of the rectangle is 50 pixels to the right of the canvas upper left corner and 65 pixels below it. This is also illustrated in the following figure.

Fig. 6.1: Canvas coordinate system

Example 1 : Drawing lines, rectangles and ellipses.

void setup() {
  size(250, 250);
  background(150);
}

void draw() {
  // Blue outline, 1 pixel thick
  stroke(0, 0, 255);
  strokeWeight(1);
  line(20, 20, 120, 120);
  fill(120, 120, 120);
  rect(140, 140, 45, 35);
  ellipse(140, 140, 45, 35);

  // Red outline, 2 pixels thick; the grey fill is still active
  stroke(255, 0, 0);
  strokeWeight(2);
  line(20, 10, 120, 110);
  rect(190, 190, 45, 35);

  // Magenta fill without an outline
  fill(240, 0, 240);
  noStroke();
  ellipse(255, 255, 55, 45);

  // Green line, 5 pixels thick
  stroke(0, 255, 0);
  strokeWeight(5);
  line(20, 40, 120, 140);

  noLoop();  // draw only once
}

Fig. 6.2: The outcome of the code in Example 1

The above example shows that once you define a line colour or filling colour, it affects every geometric shape that comes afterwards. This means we don't need to re-define the line and filling colours unless we want to change them. Notice that as we change the line thickness with the strokeWeight() function, the outline thicknesses of the rectangles and the ellipses also change. Also, using the noStroke() function we removed the outline of the last ellipse.

Example 2 : Draw a quadrilateral with the vertex coordinates (25,25), (150, 50), (100, 175), (25, 200). First define the filling colour as red and define the corner coordinates in clockwise direction. Then change the filling colour to blue and use the quad function a second time, but this time enter the vertex coordinates in random order.

Solution

void setup() {
  size(250, 250);
  background(150);
}

void draw() {
  stroke(0, 0, 0);
  fill(255, 0, 0);
  quad(25, 25, 150, 50, 100, 175, 25, 200);
  fill(0, 0, 255);
  quad(25, 25, 100, 175, 25, 200, 150, 50);
  noLoop();
}

Fig. 6.3: The outcome of the code in Example 2

The quad() function draws from vertex to vertex, which means that entering the same corner points in a different sequence may result in different shapes.

Exercise 1 : Draw a snowman using the functions mentioned so far, for example ellipse(), triangle() and line(). You can also use the rect() function to draw the ground that the snowman stands on. You can draw a carrot nose for the snowman using the triangle() function, etc.

A Dynamic Example

In this example we simulate the motion of a ball which bounces back each time it hits one of the boundaries of the drawing canvas. To do this simulation we use the fact that the draw() function is called 60 times per second. The position of the ball is defined by the variables (x,y), which hold the center of the circle representing the ball. These variables are updated each time the draw() function is called by adding to them the increments growX and growY, whose signs are reversed whenever the ball reaches a boundary. Increasing the value of growX makes the ball move faster in the horizontal direction.

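int y = 200;
int x = 30;
int frame = 0;  // own frame counter; named "frame" so that it does not hide Processing's built-in frameCount
int growX = 8;
int growY = 3;

void setup() {
  size(400, 350);
}

void draw() {
  background(150);
  fill(255, 0, 0);
  ellipse(x, y, 60, 60);
  x += growX;
  y += growY;
  // reverse the direction when the ball (radius 30) reaches a canvas boundary
  if ((x + 30) >= width || x - 30 < 0) growX *= -1;
  if ((y + 30) >= height || y - 30 < 0) growY *= -1;
  if (frame > 300) noLoop();
  if (frame % 2 == 0) saveFrame();  // save every second frame
  frame++;
}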

Since the frame rate in a usual movie is 30 frames per second, only half of the drawings made by the draw() function are saved by the saveFrame() function in order to use them in movie making.
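The update rule of the bouncing ball can also be studied without a canvas. The following Python sketch replays only the arithmetic of the draw() function above; the function name `step` and its default arguments are assumptions of this illustration (they mirror the canvas size 400 x 350 and the ball radius 30), and no drawing is performed.

```python
def step(x, y, grow_x, grow_y, width=400, height=350, radius=30):
    """Advance the ball by one frame and reverse direction at the boundaries."""
    x += grow_x
    y += grow_y
    if x + radius >= width or x - radius < 0:
        grow_x *= -1
    if y + radius >= height or y - radius < 0:
        grow_y *= -1
    return x, y, grow_x, grow_y


# Same starting values as in the Processing example: x, y, growX, growY.
state = (30, 200, 8, 3)
trajectory = []
for frame in range(301):
    state = step(*state)
    trajectory.append(state[:2])

xs = [p[0] for p in trajectory]
ys = [p[1] for p in trajectory]
print(min(xs), max(xs), min(ys), max(ys))
```

Because the boundary test runs after the position update, the center can pass the ideal turning point by up to one increment before the direction is reversed, which the printed minimum and maximum values make visible.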


CHAPTER 7

Calculus

I would like to start this section with the proof of a very useful formula in calculus, which is the formula for the cosine of an angle between two vectors. Once the cosine of an angle is known, the angle itself can be computed using the Math.acos() function of JavaScript. The Math.acos() function can be executed from the “Web Console” of the Firefox web browser, which can be invoked by pressing “F12” or “Ctrl+Shift+K”.

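In the above figure, the cosines and sines of the angles $\theta_a$, $\theta_b$ and of the angle between the vectors $\mathbf{a} = (a_1, a_2)$ and $\mathbf{b} = (b_1, b_2)$ can be expressed as follows:

$$\cos\theta_a = \frac{a_1}{\|\mathbf{a}\|}, \quad \cos\theta_b = \frac{b_1}{\|\mathbf{b}\|}, \quad \sin\theta_a = \frac{a_2}{\|\mathbf{a}\|}, \quad \sin\theta_b = \frac{b_2}{\|\mathbf{b}\|}$$

$$\cos(\theta_a - \theta_b) = \cos(\theta_a)\cos(\theta_b) + \sin(\theta_a)\sin(\theta_b) = \frac{a_1}{\|\mathbf{a}\|}\frac{b_1}{\|\mathbf{b}\|} + \frac{a_2}{\|\mathbf{a}\|}\frac{b_2}{\|\mathbf{b}\|} = \frac{\langle \mathbf{a}, \mathbf{b} \rangle}{\|\mathbf{a}\|\,\|\mathbf{b}\|}$$

where $\| \cdot \|$ denotes the Euclidean norm or the magnitude of a vector and $\langle \cdot, \cdot \rangle$ denotes the scalar product or inner product of two vectors.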
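The formula above translates directly into code. The text mentions JavaScript's Math.acos(); the sketch below uses the equivalent `math.acos` from Python's standard library. The function name `angle_between` is an assumption of this illustration, and the clamping step is an added safeguard against rounding errors, not part of the formula itself.

```python
import math


def angle_between(a, b):
    """Angle in radians between two non-zero vectors given as sequences."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    cosine = dot / (norm_a * norm_b)
    # Rounding can push the value slightly outside [-1, 1], where acos fails.
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)


print(angle_between((1, 0), (0, 1)))  # perpendicular vectors
print(angle_between((1, 0), (1, 1)))  # vectors 45 degrees apart
```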

Vector norm and inner product

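All vectors are denoted with bold letters. The inner product of two vectors in the Euclidean n-space $\mathbb{R}^n$ is defined by $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i=1}^{n} x_i y_i$, and the Euclidean norm by $\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$. Some of the properties of the inner product are as follows [6]:

$$|\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\| \cdot \|\mathbf{y}\|$$

This can be proven using the concept of linear independence.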

Linear independence

Let's say we have a set of $k$ vectors $\{\mathbf{v}_1, ..., \mathbf{v}_k\}$ in the Euclidean n-space $\mathbb{R}^n$. These vectors are either linearly dependent or independent. If there exists a set of $k$ coefficients $\{\alpha_1, ..., \alpha_k\}$ such that not all of these coefficients are zero and $\alpha_1 \mathbf{v}_1 + ... + \alpha_k \mathbf{v}_k = \mathbf{0}$, then the vectors are linearly dependent, because we could express one of these vectors as a linear combination of the rest of the vectors in the set. As an example suppose that $\alpha_1 \neq 0$. Then we could write $\mathbf{v}_1 = -(\alpha_2/\alpha_1)\mathbf{v}_2 - (\alpha_3/\alpha_1)\mathbf{v}_3 - ... - (\alpha_k/\alpha_1)\mathbf{v}_k$. On the other hand, if the only way to express the zero vector as a linear combination of these vectors is with $\alpha_i = 0 \ \forall i \in \{1, ..., k\}$, then the vectors are linearly independent. If the vectors $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent, then one of them can be expressed in terms of the other such that $\mathbf{x} = \alpha \mathbf{y}$ for some $\alpha \in \mathbb{R}$. Then we obtain:

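$$|\langle \mathbf{x}, \mathbf{y} \rangle| = |\langle \alpha\mathbf{y}, \mathbf{y} \rangle| = |\alpha \langle \mathbf{y}, \mathbf{y} \rangle| = |\alpha|\,\|\mathbf{y}\|^2 = \|\alpha\mathbf{y}\|\,\|\mathbf{y}\| = \|\mathbf{x}\|\,\|\mathbf{y}\|$$

On the other hand, if $\mathbf{x}$ and $\mathbf{y}$ are linearly independent, then $\|\mathbf{x} - \alpha\mathbf{y}\| \neq 0$ for any $\alpha \in \mathbb{R}$ and we obtain:

$$0 < \|\mathbf{x} - \alpha\mathbf{y}\|^2 = \sum_{i=1}^{n} (x_i - \alpha y_i)^2 = \sum_{i=1}^{n} \left( x_i^2 + \alpha^2 y_i^2 - 2\alpha x_i y_i \right)$$

which is a quadratic expression in $\alpha$ of the form $a\alpha^2 + b\alpha + c$ with $a = \sum_{i=1}^{n} y_i^2$, $b = -2\sum_{i=1}^{n} x_i y_i$ and $c = \sum_{i=1}^{n} x_i^2$. Since this expression is always greater than zero, there are no real values of $\alpha$ which would make it equal to zero. As a result the discriminant $b^2 - 4ac$ must be less than zero, because if it were greater than or equal to zero, then $(-b \pm \sqrt{b^2 - 4ac})/(2a)$ would give us some real values that make the quadratic expression equal to zero. Therefore:

$$\left( -2\sum_{i=1}^{n} x_i y_i \right)^2 - 4\left( \sum_{i=1}^{n} y_i^2 \right)\left( \sum_{i=1}^{n} x_i^2 \right) < 0$$

$$|\langle \mathbf{x}, \mathbf{y} \rangle|^2 < \|\mathbf{y}\|^2 \|\mathbf{x}\|^2$$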

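This property leads to another one which is called the triangle inequality:

$$\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$$

To prove this we can proceed as follows:

$$\|\mathbf{x} + \mathbf{y}\|^2 = \sum_{i=1}^{n} (x_i + y_i)^2 = \sum_{i=1}^{n} \left( x_i^2 + y_i^2 + 2x_i y_i \right) = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 + 2\langle \mathbf{x}, \mathbf{y} \rangle$$

$$\le \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 + 2\|\mathbf{x}\|\,\|\mathbf{y}\| = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2$$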
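Both inequalities can be spot-checked on random vectors. The Python sketch below is only a numerical illustration, not a proof; the helper names `inner` and `norm`, the dimension 4, the seed and the small tolerance for rounding errors are all assumptions of this example.

```python
import math
import random


def inner(x, y):
    """Inner product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(inner(x, x))


random.seed(0)   # fixed seed so that the run is reproducible
TOL = 1e-12      # tolerance for floating-point rounding
checks = []
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    cauchy_schwarz = abs(inner(x, y)) <= norm(x) * norm(y) + TOL
    x_plus_y = [xi + yi for xi, yi in zip(x, y)]
    triangle = norm(x_plus_y) <= norm(x) + norm(y) + TOL
    checks.append(cauchy_schwarz and triangle)

all_hold = all(checks)
print(all_hold)

# Linearly dependent vectors (x = 2y) give equality in the first inequality.
y = [1.0, -2.0, 3.0, 0.5]
x = [2 * yi for yi in y]
equality_case = math.isclose(abs(inner(x, y)), norm(x) * norm(y))
print(equality_case)
```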

Now let's turn back to the vectors $\mathbf{a}$, $\mathbf{b}$ in $\mathbb{R}^2$ and the angle between them. The formula for the cosine of the difference between two angles (or of the angle between two vectors) can be derived as follows [1]:



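Let $f(\theta) = \cos(\theta - \beta) + \alpha_1 \cos(\theta) + \alpha_2 \sin(\theta)$ where $f : \mathbb{R} \to \mathbb{R}$ and $\beta, \alpha_1, \alpha_2 \in \mathbb{R}$ are arbitrary. The first and second derivatives of $f(\theta)$ are:

$$f'(\theta) = -\sin(\theta - \beta) - \alpha_1 \sin(\theta) + \alpha_2 \cos(\theta)$$

$$f''(\theta) = -\cos(\theta - \beta) - \alpha_1 \cos(\theta) - \alpha_2 \sin(\theta)$$

from which

$$f(\theta) + f''(\theta) = 0$$

follows. If we choose $\alpha_1$ and $\alpha_2$ as

$$\alpha_1 = -\cos(\beta), \quad \alpha_2 = -\sin(\beta)$$

we obtain

$$f(\theta) = \cos(\theta - \beta) - \cos(\theta)\cos(\beta) - \sin(\theta)\sin(\beta)$$

$$f(0) = f'(0) = 0$$

Let's define $g : \mathbb{R} \to \mathbb{R}$ as $g(\theta) = (f(\theta))^2 + (f'(\theta))^2$. Then

$$g'(\theta) = 2f(\theta)f'(\theta) + 2f'(\theta)f''(\theta) = 2f'(\theta)\left( f(\theta) + f''(\theta) \right) = 0$$

Since $g'(\theta) = 0$ for all $\theta \in \mathbb{R}$, $g(\theta)$ is a constant function and equal to $g(0) = (f(0))^2 + (f'(0))^2 = 0$ for all $\theta \in \mathbb{R}$.

Assume that $f(\theta_0) \neq 0$ for some $\theta_0 \in \mathbb{R}$. Then $g(\theta_0) = (f(\theta_0))^2 + (f'(\theta_0))^2 > 0$. This contradiction proves that $f(\theta) = 0$ everywhere on $\mathbb{R}$ and therefore $\cos(\theta - \beta) = \cos(\theta)\cos(\beta) + \sin(\theta)\sin(\beta)$.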

In the above proof we used the fact that if the derivative of a function is zero everywhere, then this function has a constant value. This can be proven using the mean value theorem as follows:

Mean Value Theorem and Rolle’s Theorem

Let $[a, b] \subset \mathbb{R}$ with $a < b$. Then $g(\theta)$ is differentiable on $[a, b]$. According to the mean value theorem, there exists $\xi \in (a, b)$ such that

$$g'(\xi) = \frac{g(b) - g(a)}{b - a} = 0 \ \Rightarrow\ g(b) = g(a) \quad \forall a, b \in \mathbb{R}, \quad \therefore\ g(\theta) = \text{const}$$

In order to prove the mean value theorem, it is possible to define another function $G : \mathbb{R} \to \mathbb{R}$ as $G(\theta) = g(\theta) + \alpha\theta$ for some $\alpha \in \mathbb{R}$. Then for any interval $[a, b] \subset \mathbb{R}$, $G(\theta)$ is differentiable on $[a, b]$. Also, $\alpha$ can be chosen in such a way that $G(a) = G(b)$. Since $G(a) = g(a) + \alpha a$ and $G(b) = g(b) + \alpha b$, choosing $\alpha = (g(b) - g(a))/(a - b)$ would imply that $G(a) = G(b)$. Since $G(\theta)$ is differentiable on $[a, b]$, according to Rolle's theorem, there exists $\xi \in (a, b)$ such that

$$G'(\xi) = 0 = g'(\xi) + \frac{g(b) - g(a)}{a - b} \ \Rightarrow\ g'(\xi) = \frac{g(b) - g(a)}{b - a}$$

Once it is known that $G(a) = G(b)$, there are only three possibilities for the behaviour of $G(\theta)$ at some point $\theta_0 \in (a, b)$. The first possibility is that $G(a) = G(\theta_0) = G(b)$. If this is true for every $\theta_0 \in (a, b)$ then $G(\theta)$ is constant on $[a, b]$ and its derivative is zero at any $\xi \in (a, b)$ because of the definition of the derivative:

$$G'(\xi) = \lim_{\theta \to \xi} \frac{G(\theta) - G(\xi)}{\theta - \xi} = \lim_{\theta \to \xi} \frac{0}{\theta - \xi} = 0$$

The second possibility is that for some $\theta_0 \in (a, b)$, $G(\theta_0) > G(a) = G(b)$. In this case Weierstrass' maximum-minimum theorem guarantees the existence of some $\theta_{max} \in (a, b)$ such that $G(\theta_{max}) \ge G(\theta_0) > G(a) = G(b)$ and for any $\theta \in (a, b)$, $G(\theta) \le G(\theta_{max})$. We also know that $G'(\theta_{max})$ exists and is equal to the right-hand and left-hand derivatives of $G$ at $\theta_{max}$.

$$0 \le \lim_{\theta \to \theta_{max}^-} \frac{G(\theta) - G(\theta_{max})}{\theta - \theta_{max}} = G'(\theta_{max}) = \lim_{\theta \to \theta_{max}^+} \frac{G(\theta) - G(\theta_{max})}{\theta - \theta_{max}} \le 0$$

From the above inequalities it is clear that $G'(\theta_{max}) = 0$. This completes the proof of Rolle's theorem, since the only remaining possibility is that for some $\theta_0 \in (a, b)$, $G(\theta_0) < G(a) = G(b)$ and the proof of this case is identical to the previous case.
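The mean value theorem guarantees that a point $\xi$ exists but does not say how to find it. For a concrete function the point can be located numerically. The Python sketch below does this for the illustrative choice $g(x) = x^3$ on $[0, 2]$; the function names and the use of bisection are assumptions of this example, and bisection works here only because $g'(x) - \text{slope}$ changes sign on the interval.

```python
import math


def g(x):
    return x ** 3


def g_prime(x):
    return 3 * x ** 2


def mean_value_point(func, func_prime, a, b, tol=1e-12):
    """Find xi in (a, b) with func_prime(xi) == (func(b) - func(a)) / (b - a).

    Uses bisection on h(t) = func_prime(t) - slope and therefore assumes
    that h(a) and h(b) have opposite signs.
    """
    slope = (func(b) - func(a)) / (b - a)

    def h(t):
        return func_prime(t) - slope

    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid   # the sign change is in the left half
        else:
            lo = mid   # the sign change is in the right half
    return (lo + hi) / 2


xi = mean_value_point(g, g_prime, 0.0, 2.0)
print(xi)
```

For this example the exact value follows from $3\xi^2 = 4$, that is $\xi = 2/\sqrt{3}$, which the numerical result approximates.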

Taylor’s theorem

A generalization of the mean value theorem to n times differentiable functions is Taylor's theorem. According to Taylor's theorem, if $f^{(n-1)}(x)$ exists and is continuous on $[a, b]$ and $f^{(n)}(x)$ exists on $(a, b)$, then there exists $\xi \in (a, b)$ such that

$$f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(b - a)^k + \frac{f^{(n)}(\xi)}{n!}(b - a)^n$$

In order to prove this, we define the following function $\varphi(x)$ [2]:

$$\varphi(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}(b - x)^k + M(b - x)^n$$

Clearly $\varphi$ is continuous on $[a, b]$ and differentiable on $(a, b)$, and $\varphi(b) = f(b)$. Therefore if we choose a value for $M$ such that $\varphi(a) = \varphi(b) = f(b)$, then from Rolle's theorem it would follow that there exists $\xi \in (a, b)$ such that $\varphi'(\xi) = 0$.

$$\varphi'(x) = f'(x) + \sum_{k=1}^{n-1} \left[ \frac{f^{(k+1)}(x)}{k!}(b - x)^k - \frac{f^{(k)}(x)}{k!}\,k\,(b - x)^{k-1} \right] - Mn(b - x)^{n-1}$$

$$= f'(x) + \sum_{k=2}^{n} \frac{f^{(k)}(x)}{(k-1)!}(b - x)^{k-1} - \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{(k-1)!}(b - x)^{k-1} - Mn(b - x)^{n-1}$$

$$= f'(x) - f'(x) + \frac{f^{(n)}(x)}{(n-1)!}(b - x)^{n-1} - Mn(b - x)^{n-1}$$

$$\varphi'(\xi) = 0 \ \Rightarrow\ \frac{f^{(n)}(\xi)}{(n-1)!}(b - \xi)^{n-1} = Mn(b - \xi)^{n-1} \ \Rightarrow\ M = \frac{f^{(n)}(\xi)}{n!}$$

Inserting the above found M into the expression 𝜑(𝑎) = 𝜑(𝑏) completes the proof of Taylor’s theorem.
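The remainder term can be used to bound the error of a Taylor polynomial. The Python sketch below does this for the illustrative choice $f(x) = e^x$, $a = 0$, $b = 1$: since $f^{(n)}(\xi) = e^{\xi} \le e^{b}$ for $\xi \in (0, b)$, the error of the sum of the first $n$ terms is at most $e^{b} b^n / n!$. The function names are assumptions of this example.

```python
import math


def taylor_exp(b, n):
    """Sum of the first n Taylor terms (k = 0 .. n-1) of exp about a = 0."""
    return sum(b ** k / math.factorial(k) for k in range(n))


def lagrange_bound(b, n):
    """Upper bound for the remainder when f = exp, a = 0 and b > 0."""
    # f^(n)(xi) = exp(xi) <= exp(b) for every xi in (0, b)
    return math.exp(b) * b ** n / math.factorial(n)


b = 1.0
for n in (2, 5, 10):
    error = math.exp(b) - taylor_exp(b, n)
    print(n, error, lagrange_bound(b, n))
```

In each printed row the actual error is positive and stays below the bound, and both shrink quickly as $n$ grows.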

Taylor's theorem can also be expressed in integral form using the fundamental theorem of calculus, which says that if a function $f(x)$ is differentiable on $[a, b]$ and $\int_a^b f'(x)\,dx$ exists, then $f(b) - f(a) = \int_a^b f'(x)\,dx$. This expression can be reformulated as

$$f(b) = \frac{1}{0!} f(a)(b - a)^0 + \frac{1}{0!}\int_a^b f'(x)\,dx = p_0 + r_0$$

Using integration by parts, the $r_0$ part of the above equation can be expanded as follows:

$$r_0 = -\frac{1}{1!}\int_a^b f'(x)\,d(b - x)$$

$$u = f'(x), \quad du = f''(x)\,dx, \quad dv = d(b - x), \quad v = b - x$$

$$= -\frac{1}{1!}\left[ f'(x)(b - x)\Big|_a^b - \int_a^b f''(x)(b - x)\,dx \right]$$

$$= -\frac{1}{1!}\left[ -f'(a)(b - a) - \int_a^b f''(x)(b - x)\,dx \right]$$

$$= \frac{1}{1!} f'(a)(b - a)^1 + \frac{1}{1!}\int_a^b f''(x)(b - x)\,dx$$

which gives us

$$p_1 = \frac{1}{0!} f^{(0)}(a)(b - a)^0 + \frac{1}{1!} f^{(1)}(a)(b - a)^1, \quad r_1 = \frac{1}{1!}\int_a^b f^{(2)}(x)(b - x)^1\,dx$$

Continuing this way, if $f^{(n+1)}(x)$ is continuous on $[a, b]$, then we would obtain

$$p_n = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(b - a)^k, \quad r_n = \frac{1}{n!}\int_a^b f^{(n+1)}(x)(b - x)^n\,dx$$

In order to show this inductively, we can expand $r_n$ as follows

$$r_n = -\frac{1}{(n+1)!}\int_a^b f^{(n+1)}(x)\,d(b - x)^{n+1}$$

$$= -\frac{1}{(n+1)!}\left[ f^{(n+1)}(x)(b - x)^{n+1}\Big|_a^b - \int_a^b f^{(n+2)}(x)(b - x)^{n+1}\,dx \right]$$

$$= \frac{1}{(n+1)!} f^{(n+1)}(a)(b - a)^{n+1} + \frac{1}{(n+1)!}\int_a^b f^{(n+2)}(x)(b - x)^{n+1}\,dx$$

which gives us

$$p_{n+1} = \sum_{k=0}^{n+1} \frac{f^{(k)}(a)}{k!}(b - a)^k, \quad r_{n+1} = \frac{1}{(n+1)!}\int_a^b f^{(n+2)}(x)(b - x)^{n+1}\,dx$$

Therefore, if $f^{(n)}(x)$ is continuous on $[a, b]$, then the integral form of Taylor's theorem is

$$f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(b - a)^k + \frac{1}{(n-1)!}\int_a^b f^{(n)}(x)(b - x)^{n-1}\,dx$$

Integration by Parts

We used this integration rule while deriving the integral form of Taylor's theorem. The rule is based on the fundamental theorem of calculus, which says that if $f, g$ are differentiable functions and $f', g'$ are integrable on $[a, b]$ then $\int_a^b (f(x)g(x))'\,dx = f(x)g(x)\big|_a^b$.

Using the product rule for differentiation we obtain:

$$\int_a^b (f(x)g(x))'\,dx = \int_a^b \left[ f'(x)g(x) + f(x)g'(x) \right] dx = f(x)g(x)\Big|_a^b$$

$$\Rightarrow\ \int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b g(x)f'(x)\,dx$$

If we let $f(x) = u$, $g(x) = v$, this rule can also be expressed as $\int_a^b u\,dv = uv\big|_a^b - \int_a^b v\,du$.
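As a short worked illustration of the rule (the integrand is chosen here only as an example), take $u = x$ and $dv = e^x\,dx$ on $[0, 1]$, so that $du = dx$ and $v = e^x$:

```latex
\int_0^1 x e^x \, dx
  = x e^x \Big|_0^1 - \int_0^1 e^x \, dx
  = e - \left( e - 1 \right)
  = 1
```

The choice of $u$ matters: differentiating $u = x$ removes the factor $x$, which leaves an integral that can be evaluated directly.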



Power Series

Series in the form of the Taylor expansion of a function $f : [a, b] \to \mathbb{R}$ at $b$ about $a$ are called power series. Furthermore, for every power series $\sum_{k=n_0}^{\infty} c_k (x - a)^k$ there is a value $R$ such that the series absolutely converges if $|x - a| < R$ and diverges if $|x - a| > R$. This value is called the radius of convergence. Before proving that every power series has a radius of convergence, first let's clarify the concept of absolute convergence and show that absolutely convergent series are a subset of convergent series.

A series of the form $\sum_{k=n_0}^{\infty} a_k$ is called absolutely convergent if the series $\sum_{k=n_0}^{\infty} |a_k|$ is convergent. To show that an absolutely convergent series is convergent, we use a property of the absolute value operator which states that if $x, c \in \mathbb{R}$ and $c \ge 0$ then $|x| \le c$ if and only if $-c \le x \le c$. Using this we obtain $-|a_k| \le a_k \le |a_k|$ and $0 \le a_k + |a_k| \le 2|a_k|$. According to the direct comparison test for the convergence of series, if $\sum_{k=n_0}^{\infty} |a_k|$ converges then $\sum_{k=n_0}^{\infty} 2|a_k|$ converges and $\sum_{k=n_0}^{\infty} (a_k + |a_k|)$ converges. We know that $\sum_{k=n_0}^{\infty} a_k = \sum_{k=n_0}^{\infty} (a_k + |a_k| - |a_k|) = \sum_{k=n_0}^{\infty} (a_k + |a_k|) - \sum_{k=n_0}^{\infty} |a_k|$. This means that $\sum_{k=n_0}^{\infty} a_k$ is the difference of two convergent series and therefore is itself convergent.

Direct Comparison Test

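This test is used in order to determine the convergence behaviour of a series $\sum_{k=n_0}^{\infty} a_k$ based on the behaviour of another series $\sum_{k=n_0}^{\infty} b_k$. If there exists $N \in \mathbb{N}$ such that $\forall k \ge N$, $0 \le a_k \le b_k$, then $\sum_{k=n_0}^{\infty} b_k$ is convergent $\Rightarrow$ $\sum_{k=n_0}^{\infty} a_k$ is convergent, and $\sum_{k=n_0}^{\infty} a_k$ is divergent $\Rightarrow$ $\sum_{k=n_0}^{\infty} b_k$ is divergent. Let $M_a = \sum_{k=n_0}^{N} a_k$ and $M_b = \sum_{k=n_0}^{N} b_k$. Then $\sum_{k=n_0}^{\infty} a_k = M_a + \sum_{k=N+1}^{\infty} a_k$ and $\sum_{k=n_0}^{\infty} b_k = M_b + \sum_{k=N+1}^{\infty} b_k$. For all $n > N$ let $S_n = \sum_{k=N+1}^{n} a_k$ and $T_n = \sum_{k=N+1}^{n} b_k$. If $\sum_{k=n_0}^{\infty} b_k$ is a convergent series, then $\{T_n\}$ must be a convergent and therefore bounded sequence. Since $0 \le S_n \le T_n$, $\{S_n\}$ is bounded as well. Since $a_k$ is non-negative for $k \ge N$, $\{S_n\}$ is also a monotonically increasing sequence. Therefore $\{S_n\}$ is convergent and $\sum_{k=n_0}^{\infty} a_k = M_a + \lim_{n \to \infty} S_n$.

Assume that $\sum_{k=n_0}^{\infty} a_k$ is divergent but $\sum_{k=n_0}^{\infty} b_k$ is convergent. Then $\{T_n\}$ must be convergent and bounded, which implies the boundedness and convergence of $\{S_n\}$ and $\sum_{k=n_0}^{\infty} a_k$. This contradiction proves the divergence of $\sum_{k=n_0}^{\infty} b_k$.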

Every convergent sequence is bounded

While proving why the direct comparison test works we used the fact that convergent sequences must be bounded. Let $a_n \to L$. Then $\exists N \in \mathbb{N} : n \ge N \Rightarrow |a_n - L| < 1 \Rightarrow |a_n| < 1 + |L|$, where we are using another property of the absolute value operator which is as follows: $\big| |x| - |y| \big| \le |x - y|, \ \forall x, y \in \mathbb{R}$. Let $M = \max\{|a_1|, |a_2|, ..., |a_{N-1}|, 1 + |L|\}$. Then $\forall n$, $|a_n| \le M$ and $\{a_n\}$ is bounded.

Limit Comparison Test

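[4] Let $\lim_{n \to \infty} \frac{a_n}{b_n} = L$ and $0 < a_n, b_n$ for $n$ greater than or equal to some $N \in \mathbb{N}$. If $0 < L < \infty$, then either both $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ converge or both diverge.

For large enough $n$, $a_n, b_n > 0$ and $\frac{L}{2} < \frac{a_n}{b_n} < \frac{3L}{2} \Rightarrow \frac{L}{2} b_n < a_n < \frac{3L}{2} b_n$. Therefore by the direct comparison test either both $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ converge or both diverge.

If $L = 0$ then for large enough $n$, $a_n, b_n > 0$ and $\frac{a_n}{b_n} < 1 \Rightarrow 0 < a_n < b_n$. It follows that if $\sum_{n=0}^{\infty} b_n$ is convergent then $\sum_{n=0}^{\infty} a_n$ is convergent.

If $L = \infty$ then for large enough $n$, $a_n, b_n > 0$ and $1 < \frac{a_n}{b_n} \Rightarrow 0 < b_n < a_n$. It follows that if $\sum_{n=0}^{\infty} b_n$ is divergent then $\sum_{n=0}^{\infty} a_n$ is divergent.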



Ratio Test

[4] Let $a_n > 0$ for all $n$ and $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \rho$. If $\rho < 1$ then the series $\sum_{n=n_0}^{\infty} a_n$ converges, if $\rho > 1$ then the series diverges and if $\rho = 1$ then the test is inconclusive. First, let us investigate the case when $\rho < 1$. Let $\rho < r < 1$. Then there exists $N \in \mathbb{N}$ such that if $n \ge N$ then $\frac{a_{n+1}}{a_n} - \rho < r - \rho \Rightarrow \frac{a_{n+1}}{a_n} < r$. It follows that

$$a_{N+1} < r a_N$$

$$a_{N+2} < r a_{N+1} < r^2 a_N$$

$$\Rightarrow\ a_{N+k} < r^k a_N$$

Since $|r| < 1$, $a_N \sum_{k=1}^{\infty} r^k$ is a convergent series and by the direct comparison test $\sum_{k=N+1}^{\infty} a_k$ is a convergent series. Considering that $\sum_{k=n_0}^{N} a_k$ is a finite value, we can conclude that $\sum_{k=n_0}^{\infty} a_k$ is convergent when $\rho < 1$.

If $\rho > 1$ then for all large $n$, $0 < a_n < a_{n+1}$ and $\{a_n\}$ does not converge to zero, which implies that in this case $\sum_{n=n_0}^{\infty} a_n$ is divergent.

In both cases $a_n = 1/n$ and $a_n = 1/n^2$, $\frac{a_{n+1}}{a_n} \to 1$, but the first of these series is divergent and the second one is convergent. Therefore, the test is inconclusive if $\rho = 1$.

Root Test

[4] Let $\sqrt[n]{a_n} \to \rho$ and $a_n \ge 0, \forall n \ge N$. If $\rho < 1$ then $\sum_{n=0}^{\infty} a_n$ is convergent and if $\rho > 1$ then the series is divergent. In case of $\rho = 1$ the test is inconclusive. Suppose $\rho < r < 1$. For large enough $n$, $\sqrt[n]{a_n} - \rho < r - \rho \Rightarrow a_n < r^n$. Suppose $a_n < r^n$ for $n \ge K > N$. Let $M_a = \sum_{n=0}^{K-1} a_n$ and compare the series $\sum_{n=K}^{\infty} a_n$ and $\sum_{n=K}^{\infty} r^n$. Since $|r| < 1$, the geometric series $\sum_{n=K}^{\infty} r^n$ converges to $r^K/(1 - r)$ and by the direct comparison test $\sum_{n=K}^{\infty} a_n$ and therefore $\sum_{n=0}^{\infty} a_n = M_a + \sum_{n=K}^{\infty} a_n$ are convergent. On the other hand if $\rho > 1$ then for all large $n$, $(a_n)^{1/n} > 1 \Rightarrow a_n > 1$, which means that $a_n$ does not converge to 0 and therefore the series diverges. In order to prove the inconclusiveness of the test when $\rho = 1$, consider the series $\sum_{n=1}^{\infty} 1/n$ and $\sum_{n=1}^{\infty} 1/n^2$. In both cases $\sqrt[n]{a_n} \to 1$, but the first series is divergent whereas the second is convergent.

Dirichlet Test

[2] Let $a_k \to 0$ and let $S_n = \sum_{k=0}^{n} b_k$ be a bounded sequence such that for every $n$, $|S_n| \le B$. Furthermore let the sequence $\{a_k\}$ be of bounded variation, which means that $\sum_{k=0}^{\infty} |a_{k+1} - a_k|$ is convergent. Then $\sum_{k=0}^{\infty} a_k b_k$ is convergent.

Let $\varepsilon > 0$. There exists $N$ such that whenever $m \ge n \ge N$, $\sum_{k=n}^{m} |a_{k+1} - a_k| < \frac{\varepsilon}{3B}$ by the Cauchy convergence criterion. Also whenever $k \ge N$, $|a_k| < \frac{\varepsilon}{3B}$.



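Let $m \ge n \ge N$. Using Abel's lemma:

$$\left| \sum_{k=n}^{m} a_k b_k \right| = \left| \sum_{k=n}^{m} a_k (S_k - S_{k-1}) \right|$$

$$= \left| a_{m+1} S_m - a_n S_{n-1} - \sum_{k=n}^{m} (a_{k+1} - a_k) S_k \right|$$

$$\le |a_{m+1} S_m| + |a_n S_{n-1}| + \left| \sum_{k=n}^{m} (a_{k+1} - a_k) S_k \right|$$

$$\le |a_{m+1}|\,|S_m| + |a_n|\,|S_{n-1}| + \sum_{k=n}^{m} |a_{k+1} - a_k|\,|S_k|$$

$$< \frac{\varepsilon}{3B} B + \frac{\varepsilon}{3B} B + \frac{\varepsilon}{3B} B = \varepsilon$$

Therefore $\sum_{k=0}^{\infty} a_k b_k$ is convergent according to the Cauchy convergence criterion.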
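A standard series that satisfies the hypotheses of the Dirichlet test is the alternating harmonic series, with $a_k = 1/k$ and $b_k = (-1)^k$, whose sum is $-\ln 2$. The Python sketch below is a numerical illustration only; it starts the index at $k = 1$ (instead of $k = 0$, because $1/k$ is undefined there), and the variable names and the number of terms are assumptions of this example.

```python
import math

N = 100_000          # number of terms in the partial sum
partial_sum = 0.0    # running value of sum a_k * b_k
s = 0                # running value of S_n = sum of b_k
max_abs_s = 0        # largest |S_n| seen so far

for k in range(1, N + 1):
    b_k = -1 if k % 2 else 1     # b_k = (-1)^k
    s += b_k
    max_abs_s = max(max_abs_s, abs(s))
    partial_sum += b_k / k       # a_k = 1/k tends to 0 monotonically

print(max_abs_s)                 # the partial sums S_n stay bounded
print(partial_sum, -math.log(2))
```

Here the bound $B$ of the test can be taken as 1, and $\{a_k\}$ is of bounded variation because it is monotone and convergent.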

Cauchy convergence criterion

[1] This criterion says that a sequence is convergent if and only if it is a Cauchy sequence. A sequence is called a Cauchy sequence if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that whenever $n, m \ge N$, $|a_n - a_m| < \varepsilon$.

Let $a_n \to L$ and $\varepsilon > 0$. There exists $N \in \mathbb{N}$ such that $n, m \ge N \Rightarrow |a_n - L| < \varepsilon/2$ and $|a_m - L| < \varepsilon/2$. Therefore $|a_n - a_m| = |a_n - L + L - a_m| \le |a_n - L| + |a_m - L| < \varepsilon/2 + \varepsilon/2 = \varepsilon$.

Conversely, if $\{a_n\}$ is a Cauchy sequence, then first of all it is a bounded sequence. We know that there exists $N \in \mathbb{N}$ such that $n, m \ge N \Rightarrow |a_n - a_m| < 1 \Rightarrow |a_n| < 1 + |a_N|$. Let $M = \max\{|a_1|, |a_2|, ..., |a_{N-1}|, 1 + |a_N|\}$. Then $\{a_n\}$ is bounded by $M$. Since $\{a_n\}$ is bounded, it has a convergent subsequence $a_{n_k} \to c$ (Bolzano-Weierstrass theorem). Let $\varepsilon > 0$. For some $N$, $|a_n - a_m|$ is always less than $\varepsilon/2$ if $n, m \ge N$. Also there exists $K > N$ such that if $k \ge K$, then $|a_{n_k} - c| < \varepsilon/2$. Let $n \ge N$ and $k \ge K$. Considering that $n_k \ge k$, we obtain $|a_n - c| = |a_n - a_{n_k} + a_{n_k} - c| \le |a_n - a_{n_k}| + |a_{n_k} - c| < \varepsilon/2 + \varepsilon/2 = \varepsilon$.

This proves that every Cauchy sequence is a convergent sequence.

Abel’s Lemma

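[2] This lemma states that $\sum_{k=n}^{m} a_k (b_{k+1} - b_k) = a_{m+1} b_{m+1} - a_n b_n - \sum_{k=n}^{m} (a_{k+1} - a_k) b_{k+1}$. This can be proven as follows:

$$\sum_{k=n}^{m} a_k (b_{k+1} - b_k) = \sum_{k=n}^{m} a_k b_{k+1} - \sum_{k=n}^{m} a_k b_k$$

$$= \sum_{k=n}^{m} a_k b_{k+1} - \sum_{k=n+1}^{m+1} a_k b_k + a_{m+1} b_{m+1} - a_n b_n$$

$$= \sum_{k=n}^{m} \left[ a_k b_{k+1} - a_{k+1} b_{k+1} \right] + a_{m+1} b_{m+1} - a_n b_n$$

$$= \sum_{k=n}^{m} b_{k+1} \left[ a_k - a_{k+1} \right] + a_{m+1} b_{m+1} - a_n b_n$$

$$= a_{m+1} b_{m+1} - a_n b_n - \sum_{k=n}^{m} (a_{k+1} - a_k) b_{k+1}$$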
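Because Abel's lemma is a purely algebraic identity, it can be checked exactly on sample sequences. The Python sketch below uses the standard `fractions` module so that no rounding occurs; the function name `abel_sides` and the two sample sequences are assumptions of this illustration.

```python
from fractions import Fraction


def abel_sides(a, b, n, m):
    """Return both sides of Abel's lemma for the index range n..m.

    The sequences a and b must be indexable up to m + 1.
    """
    lhs = sum(a[k] * (b[k + 1] - b[k]) for k in range(n, m + 1))
    rhs = (a[m + 1] * b[m + 1] - a[n] * b[n]
           - sum((a[k + 1] - a[k]) * b[k + 1] for k in range(n, m + 1)))
    return lhs, rhs


# Two arbitrary sample sequences with exact rational entries.
a = [Fraction(1, k + 1) for k in range(12)]
b = [Fraction(k * k - 3 * k, 2) for k in range(12)]

lhs, rhs = abel_sides(a, b, 2, 9)
print(lhs == rhs)
```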



Radius of convergence

[2] For every power series there exists a value $R$ called the radius of convergence such that $0 \le R \le \infty$. If $|x - a| < R$ then the series $\sum_{k=n_0}^{\infty} c_k (x - a)^k$ absolutely converges and if $|x - a| > R$ then the series diverges.

Consider a convergent series $\sum_{k=n_0}^{\infty} c_k (x_0 - a)^k$ and let $|x - a| < |x_0 - a|$. For the sake of convenience let $y = x - a$ and $y_0 = x_0 - a$. Since $\sum_{k=n_0}^{\infty} c_k y_0^k$ is convergent, its terms converge to zero and are therefore bounded, so there exists a real number $M$ such that $|c_k y_0^k| \le M$ for all $k$. Then $|c_k y^k| = |c_k y_0^k| \left| \frac{y}{y_0} \right|^k \le M \left| \frac{y}{y_0} \right|^k$. Since $\left| \frac{y}{y_0} \right| < 1$, the right hand side of the inequality is the general term of a convergent geometric series and using the direct comparison test we obtain that $\sum_{k=n_0}^{\infty} c_k y^k$ absolutely converges.

Let $S = \{r \ge 0 : \sum_{k=n_0}^{\infty} c_k r^k \text{ is convergent}\}$. If $S$ is unbounded, then for every $y \in \mathbb{R}$ there exists $r \in S$ such that $|y| < |r|$ and $\sum_{k=n_0}^{\infty} c_k y^k$ is absolutely convergent. This means that the series is absolutely convergent for $|y| < \infty$ or $|x - a| < \infty$. If $S$ is bounded then using the completeness axiom of the set of real numbers we know that it has a supremum. Let $R = \sup S$ and $|y| < R$. Then there exists $r \in S$ such that $|y| < r$, otherwise $|y|$ would be an upper bound of $S$ smaller than the supremum. It follows that $\sum_{k=n_0}^{\infty} c_k y^k$ is absolutely convergent when $|y| < R$. This means that $\sum_{k=n_0}^{\infty} c_k (x - a)^k$ is absolutely convergent when $x \in (-R + a, R + a)$. As another possibility, suppose that $R < |y|$. Then there exists some $r$ such that $R < r < |y|$. Assume that $\sum_{k=n_0}^{\infty} c_k y^k$ is convergent. Then $\sum_{k=n_0}^{\infty} c_k r^k$ must be absolutely convergent and therefore convergent, which means that $r$ is in $S$ and at the same time greater than the supremum of $S$. This is a contradiction, therefore if $|y| > R$ then $\sum_{k=n_0}^{\infty} c_k y^k$ is divergent.

As an example we can analyze the series $\sum_{k=2}^{\infty} \frac{x^k}{\log k}$. Using the ratio test with $a_k = x^k / \log k$:

$$\lim_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right| = \lim_{k\to\infty} \left|\frac{x^{k+1} \log(k)}{x^k \log(k+1)}\right| = |x| \lim_{k\to\infty} \frac{\log(k)}{\log(k+1)} = |x| \lim_{k\to\infty} \frac{1/k}{1/(k+1)} = |x|$$

Therefore the series converges absolutely when $|x| < 1$, diverges when $|x| > 1$, and the radius of convergence is 1. When computing the limit in the above example, which includes the logarithm function, we resorted to L’Hospital’s rule.
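The ratio used in the example above can also be inspected numerically. The following is only an illustrative sketch; the function name `ratio` and the sample value of `x` are assumptions made for this example and do not come from the original notes.

```python
import math

def ratio(k: int, x: float) -> float:
    """|a_{k+1} / a_k| for a_k = x**k / log(k), valid for k >= 2."""
    return abs(x) * math.log(k) / math.log(k + 1)

# The ratio approaches |x| from below as k grows.
for k in (10, 1_000, 1_000_000):
    print(k, ratio(k, 0.5))
```

The printed values increase towards 0.5, which is consistent with the limit $|x|$ obtained by L’Hospital’s rule.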

L’Hospital’s Rule

[7] Let $f : (a,b) \to \mathbb{R}$ and $g : (a,b) \to \mathbb{R}$ both be differentiable on $(a,b)$ with $g'(x) \neq 0$ on $(a,b)$, and let $\lim_{x \to a^+} \frac{f'(x)}{g'(x)} = A \in \mathbb{R}$.

Choose $p, q, \varepsilon$ such that $A \in (p+\varepsilon, q-\varepsilon)$. Since $f$ and $g$ are differentiable on $(a,b)$, according to the Cauchy mean value theorem for any $x < y$ in $(a,b)$ there exists $\xi \in (x,y)$ such that $\frac{f'(\xi)}{g'(\xi)} = \frac{f(x)-f(y)}{g(x)-g(y)}$.

Suppose that $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$. Since $f'/g'$ converges to $A$ as $x$ converges to $a$, there exists a neighbourhood of $a$ such that the intersection of that neighbourhood with $(a,b)$ is non-empty and for every $x_0$ in this intersection $f'(x_0)/g'(x_0) \in (p+\varepsilon, q-\varepsilon)$. Let us call this intersection $(a,c)$ for some $c \in (a,b)$. Let $x, y \in (a,c)$.

Then $\frac{f(x)-f(y)}{g(x)-g(y)} \in (p+\varepsilon, q-\varepsilon)$. Furthermore, $\lim_{x \to a^+} \frac{f(x)-f(y)}{g(x)-g(y)} = \frac{f(y)}{g(y)} \in [p+\varepsilon, q-\varepsilon]$, which means that for any neighbourhood $(p,q)$ of $A$ there exists a neighbourhood of $a$ whose intersection with $(a,b)$ is a non-empty set $(a,c)$ such that for every $y \in (a,c)$, $f(y)/g(y) \in (p,q)$. Therefore $\lim_{x \to a^+} \frac{f(x)}{g(x)} = A$.

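Another case where L’Hospital’s rule can be applied is when $g(x) \to \infty$ as $x \to a^+$. Fix $y \in (a,c)$. Since $g(x) \to \infty$ as $x \to a^+$, there exists $c_1 \in (a,c)$ such that for every $x \in (a,c_1)$, $g(x) > 0$ and $g(x) > g(y)$. Let $x \in (a,c_1)$. Multiplying by $1 - \frac{g(y)}{g(x)} > 0$ we obtain

$$p + \varepsilon < \frac{f(x)-f(y)}{g(x)-g(y)} < q - \varepsilon$$

$$\Rightarrow (p+\varepsilon)\left(1 - \frac{g(y)}{g(x)}\right) < \frac{f(x)}{g(x)} - \frac{f(y)}{g(x)} < (q-\varepsilon)\left(1 - \frac{g(y)}{g(x)}\right)$$

$$\Rightarrow p + \varepsilon + \frac{1}{g(x)}\big(f(y) - (p+\varepsilon)g(y)\big) < \frac{f(x)}{g(x)} < q - \varepsilon + \frac{1}{g(x)}\big(f(y) - (q-\varepsilon)g(y)\big)$$

Since $y$ is fixed and $g(x) \to \infty$ as $x \to a^+$, it is possible to choose $x$ close enough to $a$, and therefore $g(x)$ large enough, such that

$$\left|\frac{1}{g(x)}\big(f(y) - (p+\varepsilon)g(y)\big)\right| < \varepsilon \quad \text{and} \quad \left|\frac{1}{g(x)}\big(f(y) - (q-\varepsilon)g(y)\big)\right| < \varepsilon$$

Let $c_2 \in (a,c_1)$ be such that every $x \in (a,c_2)$ satisfies these conditions. It follows that $x \in (a,c_2) \Rightarrow p < \frac{f(x)}{g(x)} < q$, that is, $f(x)/g(x) \in (p,q)$, and $\lim_{x \to a^+} \frac{f(x)}{g(x)} = A$.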

Cauchy Mean Value Theorem

Let $f$ and $g$ be continuous on $[a,b]$ and differentiable on $(a,b)$, with $g'(x) \neq 0$ on $(a,b)$ (so that $g(b) \neq g(a)$ by Rolle’s theorem). Then there exists $\xi \in (a,b)$ such that $\frac{f'(\xi)}{g'(\xi)} = \frac{f(b)-f(a)}{g(b)-g(a)}$. In order to prove this, we can define a function $\varphi$ as follows:

$$\varphi(x) = (f(x)-f(a))(g(b)-g(a)) - (g(x)-g(a))(f(b)-f(a))$$

Clearly, $\varphi(a) = \varphi(b) = 0$ and from Rolle’s theorem there exists $\xi \in (a,b)$ such that

$$\varphi'(\xi) = f'(\xi)(g(b)-g(a)) - g'(\xi)(f(b)-f(a)) = 0 \Rightarrow \frac{f'(\xi)}{g'(\xi)} = \frac{f(b)-f(a)}{g(b)-g(a)}.$$

Logarithm

The logarithm function is defined for $x > 0$ as

$$\log(x) = \int_1^x \frac{1}{t}\,dt$$

Using the fundamental theorem of calculus we can derive the following equality:

log(𝑥𝑦) = log(𝑥) + log(𝑦), 𝑥, 𝑦 > 0

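Let $xy = u$ for $x, y > 0$, with $y$ held fixed. Then $\log(xy) = \log(u) = \int_1^u \frac{1}{t}\,dt \Rightarrow \frac{d}{dx}\log(xy) = \frac{d}{du}\log(u)\,\frac{du}{dx}$ by the chain rule. Since $1/t$ is continuous at $t = u$ we obtain $\frac{d}{dx}\log(xy) = \frac{1}{xy}\,y = \frac{1}{x}$. The derivative of $\log(x)$ with respect to $x$ is also equal to $\frac{1}{x}$. Therefore $\log(xy) = \log(x) + C$ where $C$ is a constant that does not depend on $x$. Setting $x = 1$ and using $\log(1) = 0$ we obtain $\log(1 \cdot y) = 0 + C \Rightarrow C = \log(y) \Rightarrow \log(xy) = \log(x) + \log(y)$.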

Using the above equality we obtain $0 = \log(1) = \log(x \cdot x^{-1}) = \log(x) + \log(x^{-1}) \Rightarrow \log(x^{-1}) = -\log(x)$.

Clearly $\log(x^1) = 1 \cdot \log(x)$. Let $n \in \mathbb{N}$. If $\log(x^n) = n \cdot \log(x)$, then $\log(x^{n+1}) = \log(x^n) + \log(x) = (n+1)\log(x)$ $\therefore \forall n \in \mathbb{N}, \forall x > 0, \log(x^n) = n \log(x)$ by induction.

$\log(x^0) = 0 \cdot \log(x)$ and $\log(x^{-n}) = \log((x^{-1})^n)$. Since $x^{-1} > 0$, $\log((x^{-1})^n) = n \cdot \log(x^{-1}) = -n \log(x)$. Therefore for every integer $m \in \mathbb{Z}$, $\log(x^m) = m \log(x)$.

Let $b^n = x \Rightarrow b = x^{1/n} > 0 \Rightarrow \log(x) = \log(b^n) = n \log(b) = n \log(x^{1/n}) \Rightarrow \log(x^{1/n}) = \frac{1}{n}\log(x)$.

Let $q \in \mathbb{Q}$ be any rational number. Then there exist an integer $m$ and a positive integer $n$ such that $q = m/n$ and $\log(x^q) = \log((x^{1/n})^m) = m \log(x^{1/n}) = \frac{m}{n}\log(x) = q \log(x)$. Therefore for every rational number $q$ and for every positive real number $x$, $\log(x^q) = q \log(x)$.
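The integral definition of the logarithm and the identities derived from it can be checked numerically. The sketch below approximates $\int_1^x \frac{1}{t}\,dt$ with the midpoint rule; the function name `log_integral` and the number of subintervals are assumptions chosen for illustration, not part of the original notes.

```python
import math

def log_integral(x: float, n: int = 100_000) -> float:
    """Approximate log(x) as the integral of 1/t from 1 to x (midpoint rule)."""
    if x <= 0:
        raise ValueError("x must be positive")
    h = (x - 1.0) / n  # negative when x < 1, so the integral changes sign
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(n))

# Product rule, inverse rule and power rule for the logarithm.
print(log_integral(6.0), log_integral(2.0) + log_integral(3.0))
print(log_integral(0.5), -log_integral(2.0))
print(log_integral(8.0), 3 * log_integral(2.0), math.log(8.0))
```

Each printed row should agree up to the discretization error of the midpoint rule.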



While showing that the series $\sum_{k=2}^{\infty} \frac{x^k}{\log k}$ has the radius of convergence 1, we made use of $\lim_{k \to \infty} \log k = \infty$.

The limits of the log function at $+\infty$ and at $0^+$ can be obtained as follows: Let $(n \log(x), +\infty)$ be any neighbourhood of $+\infty$ where $n \in \mathbb{N}$ and $x > 1$, so that $\log(x) > 0$. There exists another neighbourhood $(x^n, +\infty)$ of $+\infty$ such that $\forall x_0 \in (x^n, +\infty)$, $\log(x^n) = n \log(x) < \log(x_0)$, since $\log(x)$ is a strictly increasing function. Therefore $\log(x_0) \in (n \log(x), +\infty)$ and $\lim_{x \to +\infty} \log(x) = +\infty$.

Similarly, let $(-\infty, -n \log(x))$ be any neighbourhood of $-\infty$ where $n \in \mathbb{N}$ and $x > 1$. There exists a neighbourhood $(0, x^{-n})$ of $0$ such that $\forall x_0 \in (0, x^{-n})$, $\log(x_0) < \log(x^{-n}) = -n \log(x)$, since $\log(x)$ is a strictly increasing function. Therefore $\log(x_0) \in (-\infty, -n \log(x))$ and $\lim_{x \to 0^+} \log(x) = -\infty$.

In order to prove that the logarithm is a strictly increasing function we use the fact that its derivative is always positive. Let $0 < u < v$. Since $\log(x)$ is differentiable on $(u,v)$ and continuous on $[u,v]$, according to the mean value theorem (mvt) there exists $c \in (u,v)$ such that

$$\frac{d}{dx}\log(x)\Big|_{x=c} = \frac{1}{c} = \frac{\log(v)-\log(u)}{v-u} > 0 \Rightarrow \log(u) < \log(v) \quad \forall (u,v) \subset (0,\infty).$$

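It can also be proven that the range of the log function is all of $\mathbb{R}$. Since $\log(x) \to +\infty$ as $x \to +\infty$ and $\log(x) \to -\infty$ as $x \to 0^+$, $\forall r \in \mathbb{R}, \exists p, q \in (0, +\infty) : \log(p) < r < \log(q)$. Since $\log$ is continuous, according to the Bolzano intermediate value theorem $\exists x \in (p,q) : \log(x) = r$.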

Absolute value

[1] Some of the most significant properties of the absolute value can be proven as follows:

Let $x, y \in \mathbb{R}$. Then $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$ $\Rightarrow -(|x|+|y|) \le x + y \le (|x|+|y|)$. Also using $|-y| = |y|$ we obtain $|x \pm y| \le |x| + |y|$.

$|x| = |x+y-y| \le |x+y| + |y| \Rightarrow |x| - |y| \le |x+y|$. $|y| = |y+x-x| \le |x+y| + |x| \Rightarrow |y| - |x| \le |x+y|$ $\Rightarrow \big||x|-|y|\big| \le |x \pm y| \le |x| + |y|$.

Another property of the absolute value operator that we used in the section about the radius of convergence is that for any $x, y \in \mathbb{R}$ with $x \neq 0$, $|x|^y = |x^y|$. Using the representation of real numbers as complex numbers without imaginary part we obtain $x = re^{i\theta} = |x|e^{i\theta}$ and $|x^y| = \big||x|^y e^{iy\theta}\big| = \big||x|^y\big| \, |\cos(y\theta) + i\sin(y\theta)| = \big||x|^y\big| \cdot 1 = |x|^y$.

The Fundamental Theorem of Calculus

Let $\int_a^b f(x)\,dx$ exist and let $F : [a,b] \to \mathbb{R}$ be an antiderivative of $f(x)$, which means that $F'(x) = f(x), \forall x \in [a,b]$. Then the fundamental theorem of calculus states that $F(b) - F(a) = \int_a^b f(x)\,dx$. In order to prove this, let $P$ be any partition of $[a,b]$ so that $P = \{x_0 = a, x_1, x_2, ..., x_{n-1}, x_n = b\}$. Then $F(b) - F(a) = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$. Since $F(x)$ is differentiable on every subinterval $[x_{i-1}, x_i]$, according to the mean value theorem, for every $i \in \{1, ..., n\}$, $\exists c_i \in (x_{i-1}, x_i)$ such that

$$F'(c_i) = f(c_i) = \frac{F(x_i) - F(x_{i-1})}{x_i - x_{i-1}}$$

Therefore $F(b) - F(a) = \sum_{i=1}^{n} f(c_i)(x_i - x_{i-1})$, which is a Riemann sum of $f$ with respect to $P$. The lower sum $L(P,f)$ and upper sum $U(P,f)$ of $f$ with respect to $P$ are defined as

$$L(P,f) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}), \quad m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}$$

$$U(P,f) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}), \quad M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}$$



Since the infimum of $f$ on each subinterval is at most $f(c_i)$ and the supremum is at least $f(c_i)$, we have $L(P,f) \le F(b) - F(a) \le U(P,f)$. Since $P$ was chosen arbitrarily, $F(b) - F(a)$ is an upper bound for the set of all lower sums of $f$ and a lower bound for the set of all upper sums of $f$ on the interval $[a,b]$. Since $\int_a^b f(x)\,dx$ exists, by definition the upper and lower integrals of $f$ on $[a,b]$ must both be equal to $\int_a^b f(x)\,dx$. The upper integral $U(f)$ is the greatest lower bound of the set of all upper sums of $f$ and the lower integral $L(f)$ is the least upper bound of the set of all lower sums of $f$.

$$L(f) = \sup\{L(P,f) : P \text{ partitions } [a,b]\}, \quad U(f) = \inf\{U(P,f) : P \text{ partitions } [a,b]\}$$

From the above definitions it follows that

$$L(f) \le F(b) - F(a) \le U(f) \Rightarrow F(b) - F(a) = \int_a^b f(x)\,dx$$
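The sandwich $L(P,f) \le F(b) - F(a) \le U(P,f)$ used in the proof can be observed numerically. The following sketch assumes an increasing integrand, so that the infimum and supremum on each subinterval are attained at the left and right endpoints; the helper name `lower_upper_sums` and the choice $f(x) = x^2$ on $[0,1]$ are illustrative assumptions.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums of an increasing function f on a uniform partition."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    # For increasing f the infimum sits at the left end, the supremum at the right end.
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return lower, upper

f = lambda x: x * x        # integrand, increasing on [0, 1]
F = lambda x: x ** 3 / 3   # an antiderivative of f

for n in (10, 100, 1000):
    lo, up = lower_upper_sums(f, 0.0, 1.0, n)
    print(n, lo, F(1.0) - F(0.0), up)
```

As the partition is refined, both sums close in on $F(1) - F(0) = 1/3$ from opposite sides.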

According to the fundamental theorem of calculus, if $g : [a,b] \to \mathbb{R}$ is integrable on $[a,b]$ and $G(x) = \int_a^x g(t)\,dt$ for any $x \in [a,b]$, then $G(x)$ is continuous on $[a,b]$. Also, if $g$ is continuous at some $c \in [a,b]$ then $G'(c) = g(c)$. First of all, since $g$ is integrable, it is also bounded in absolute value by some $M \in \mathbb{R}$. Let $x, y \in [a,b]$ and $x \neq y$. Consider $|G(x) - G(y)| = \left|\int_x^y g(t)\,dt\right| \le M|x-y| \Rightarrow \frac{|G(x)-G(y)|}{|x-y|} \le M$, which proves that $G$ is Lipschitz and therefore continuous on $[a,b]$.

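Suppose that $g$ is continuous at some $c \in [a,b]$. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that if $|x-c| < \delta$ then $|g(x) - g(c)| < \varepsilon$. Fix $\varepsilon$ and choose $x \neq c$ such that $|x-c| < \delta$. Consider

$$\left|\frac{G(x)-G(c)}{x-c} - g(c)\right| = \left|\frac{1}{x-c}\int_c^x g(t)\,dt - g(c)\right| = \left|\frac{1}{x-c}\int_c^x [g(t)-g(c)]\,dt\right|$$

Since every $t$ between $c$ and $x$ satisfies $|t-c| < \delta$, we have $|g(t)-g(c)| < \varepsilon$.

$$\Rightarrow \left|\frac{G(x)-G(c)}{x-c} - g(c)\right| \le \frac{1}{|x-c|}\,\varepsilon\,|x-c| = \varepsilon$$

$$\therefore \lim_{x \to c} \frac{G(x)-G(c)}{x-c} = G'(c) = g(c)$$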

It can also be proven that a function which is Lipschitz on an interval is also uniformly continuous and therefore continuous on this interval. Assume that $G$ is Lipschitz but not uniformly continuous on $[a,b]$. Then there exists $\varepsilon > 0$ such that for all $n \in \mathbb{N}$ there exist $x_n, y_n \in [a,b]$ with $|x_n - y_n| < 1/n$ and $|G(x_n) - G(y_n)| \ge \varepsilon$. Since $G$ is Lipschitz, there exists $M \in \mathbb{R}$ such that $\left|\frac{G(x_n)-G(y_n)}{x_n - y_n}\right| \le M$ for all $n$. It follows that for large enough $n$:

$$|G(x_n) - G(y_n)| \le M|x_n - y_n| < \frac{M}{n} < \varepsilon$$

But our assumption was that $|G(x_n) - G(y_n)| \ge \varepsilon$ for all $n$. This contradiction proves that if a function is Lipschitz on some interval $[a,b]$ then it is uniformly continuous there.

Differentiation Rules

While proving Taylor’s theorem we made use of the product rule and the chain rule of differentiation.



The Product Rule

The product rule was utilized while taking the derivative of $\frac{f^{(k)}(x)}{k!}(b-x)^k$ with respect to $x$. Let $G(x) = f(x)g(x)$ where $f'$ and $g'$ both exist at some $x = a$. Then the derivative of $G(x)$ at $x = a$ can be expressed as follows [1]:

$$\begin{aligned} G'(a) &= \lim_{x \to a} \frac{G(x)-G(a)}{x-a} \\ &= \lim_{x \to a} \frac{f(x)g(x) - f(a)g(x) + f(a)g(x) - f(a)g(a)}{x-a} \\ &= \lim_{x \to a} \left[\frac{f(x)-f(a)}{x-a}\,g(x) + \frac{g(x)-g(a)}{x-a}\,f(a)\right] \\ &= f'(a)g(a) + g'(a)f(a) \end{aligned}$$

This gives us the product rule of differentiation. The existence of $f'(a)$ and $g'(a)$ implies the continuity of $f$ and $g$ at $x = a$, which is used in the last step of the above proof in order to obtain $\lim_{x \to a} g(x) = g(a)$ and $\lim_{x \to a} f(x) = f(a)$. This can be shown using the definition of the derivative as follows:

$$\begin{aligned} f(x) - f(a) &= \frac{f(x)-f(a)}{x-a}(x-a) \Rightarrow f(x) = f(a) + \frac{f(x)-f(a)}{x-a}(x-a) \\ \Rightarrow \lim_{x \to a} f(x) &= \lim_{x \to a} f(a) + \lim_{x \to a} \frac{f(x)-f(a)}{x-a}(x-a) \\ &= f(a) + \lim_{x \to a} \frac{f(x)-f(a)}{x-a} \cdot \lim_{x \to a}(x-a) \\ &= f(a) + f'(a) \cdot 0 = f(a) \end{aligned}$$

While proving the continuity of a function at a point where it is differentiable, we used the product rule of the limit operator, which says that if $f$ and $g$ are two functions such that $\lim_{x \to x_0} f(x) = F$ and $\lim_{x \to x_0} g(x) = G$ then $\lim_{x \to x_0} f(x)g(x) = FG$. The proof of that statement is as follows [3]: Since the limits exist, we know that for any $\varepsilon > 0$ there exist $\delta_f, \delta_g$ such that whenever $0 < |x - x_0| < \delta_f$, $|f(x) - F| < \frac{\varepsilon}{2(1+|G|)}$ and whenever $0 < |x - x_0| < \delta_g$, $|g(x) - G| < \frac{\varepsilon}{2(1+|F|)}$. Also, for $\varepsilon = 1$ we know that there exists $\delta_1$ such that whenever $0 < |x - x_0| < \delta_1$, $|g(x) - G| < 1$.

Suppose that $\varepsilon > 0$ and $\delta = \min\{\delta_f, \delta_g, \delta_1\}$. If $0 < |x - x_0| < \delta$, then we obtain:

$$\begin{aligned} |f(x)g(x) - FG| &= |f(x)g(x) - Fg(x) + Fg(x) - FG| = |g(x)(f(x)-F) + F(g(x)-G)| \\ &\le |g(x)(f(x)-F)| + |F(g(x)-G)| = |g(x)| \cdot |f(x)-F| + |F| \cdot |g(x)-G| \\ &< |g(x)| \frac{\varepsilon}{2(1+|G|)} + (1+|F|)\frac{\varepsilon}{2(1+|F|)} \end{aligned}$$

At this point we need to show that $|g(x)| < (1 + |G|)$:

$$|g(x)| = |g(x) - G + G| \le |g(x) - G| + |G| < 1 + |G|$$

Therefore

$$|f(x)g(x) - FG| < (1+|G|)\frac{\varepsilon}{2(1+|G|)} + (1+|F|)\frac{\varepsilon}{2(1+|F|)} = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$

The Chain Rule

The chain rule of differentiation is applied in order to take the derivative of composite functions of the form $f(g(x))$ or $(f \circ g)(x)$ with respect to $x$. If we set $u = g(x)$, then the derivative of $f(g(x))$ with respect to $x$ is computed as $f'(u)g'(x)$. In order to prove this formula we can use the definition of the derivative as follows [4]: Let $y = f(u)$, $y_0 = f(u_0)$, $u = g(x)$, $u_0 = g(x_0)$, then

$$\frac{dy}{dx}\Big|_{x=x_0} = \lim_{x \to x_0} \frac{y - y_0}{x - x_0} = \lim_{x \to x_0} \frac{y - y_0}{u - u_0} \cdot \frac{u - u_0}{x - x_0}$$



Using Taylor’s theorem, at any value of $x$ and $u$, $f(u)$ and $g(x)$ can be expressed as follows:

$$f(u) = f(u_0) + f'(u_0)(u-u_0) + ... + \frac{f^{(n)}(\xi)}{n!}(u-u_0)^n, \quad \xi \in (u_0, u)$$

$$f(u) - f(u_0) = f'(u_0)(u-u_0) + \varepsilon_1(u-u_0) \Rightarrow \frac{f(u)-f(u_0)}{u-u_0}(u-u_0) = (f'(u_0) + \varepsilon_1)(u-u_0)$$

$$g(x) = g(x_0) + g'(x_0)(x-x_0) + ... + \frac{g^{(n)}(c)}{n!}(x-x_0)^n, \quad c \in (x_0, x)$$

$$g(x) - g(x_0) = g'(x_0)(x-x_0) + \varepsilon_2(x-x_0) \Rightarrow \frac{g(x)-g(x_0)}{x-x_0}(x-x_0) = (g'(x_0) + \varepsilon_2)(x-x_0)$$

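In the above expressions, after the first derivative of $f$ and $g$, the remaining parts of the Taylor expansions are summarized as $\varepsilon_1(u-u_0)$ and $\varepsilon_2(x-x_0)$ respectively. Using the Taylor expansions it can be shown that $\varepsilon_1$ and $\varepsilon_2$ both converge to zero as $x$ converges to $x_0$:

$$\lim_{x \to x_0} \left[\frac{g(x)-g(x_0)}{x-x_0} - g'(x_0)\right] = \lim_{x \to x_0} \varepsilon_2 = 0$$

$$\lim_{x \to x_0} (u - u_0) = \lim_{x \to x_0} (g(x) - g(x_0)) = \lim_{x \to x_0} (g'(x_0) + \varepsilon_2)(x-x_0) = 0$$

$$\lim_{x \to x_0} \left[\frac{f(u)-f(u_0)}{u-u_0} - f'(u_0)\right] = \lim_{u \to u_0} \left[\frac{f(u)-f(u_0)}{u-u_0} - f'(u_0)\right] = \lim_{u \to u_0} \varepsilon_1 = 0$$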

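Using this result the derivative of $f(g(x))$ with respect to $x$ is computed as follows:

$$y - y_0 = (f'(u_0) + \varepsilon_1)(u - u_0) = (f'(u_0) + \varepsilon_1)(g'(x_0) + \varepsilon_2)(x - x_0)$$

$$\lim_{x \to x_0} \frac{y - y_0}{x - x_0} = \lim_{x \to x_0} \left[f'(u_0) \cdot g'(x_0) + \varepsilon_1 \cdot g'(x_0) + \varepsilon_2 \cdot f'(u_0) + \varepsilon_1 \cdot \varepsilon_2\right] = f'(u_0) \cdot g'(x_0) = f'(g(x_0)) \cdot g'(x_0)$$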

Another differentiation rule that we used while proving Taylor’s theorem is the rule to calculate the derivative of a power. According to this rule, if a function has the form $f(x) = x^n$ with $n \in \mathbb{N}$, then its derivative with respect to $x$ is $nx^{n-1}$. There are two ways to prove this formula. The first one uses the binomial theorem. The derivative of $f$ at some $x = x_0$ is computed as $\lim_{h \to 0} \frac{f(x_0+h) - f(x_0)}{h}$. Using the binomial expansion of $f(x_0 + h)$ we obtain

$$\begin{aligned} f'(x_0) &= \lim_{h \to 0} \frac{(x_0+h)^n - x_0^n}{h} \\ &= \lim_{h \to 0} \frac{\binom{n}{0}x_0^n + \binom{n}{1}x_0^{n-1}h + ... + \binom{n}{n-1}x_0 h^{n-1} + \binom{n}{n}h^n - x_0^n}{h} \\ &= nx_0^{n-1} + \lim_{h \to 0} h\left(\binom{n}{2}x_0^{n-2} + \binom{n}{3}x_0^{n-3}h + ... + \binom{n}{n-1}x_0 h^{n-3} + \binom{n}{n}h^{n-2}\right) \\ &= nx_0^{n-1} \end{aligned}$$

The second way to prove the formula for the derivative of a power uses the following expansion

$$x^n - x_0^n = (x - x_0)(x^{n-1} + x_0 x^{n-2} + x_0^2 x^{n-3} + x_0^3 x^{n-4} + ... + x_0^{n-2}x + x_0^{n-1})$$



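The derivative of $f$ at some $x = x_0$ can also be computed as $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$. Using the above expansion we obtain:

$$\begin{aligned} f'(x_0) &= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0} \\ &= \lim_{x \to x_0} (x^{n-1} + x_0 x^{n-2} + x_0^2 x^{n-3} + x_0^3 x^{n-4} + ... + x_0^{n-2}x + x_0^{n-1}) \\ &= (x_0^{n-1} + x_0 x_0^{n-2} + x_0^2 x_0^{n-3} + ... + x_0^{n-2}x_0 + x_0^{n-1}) \\ &= nx_0^{n-1} \end{aligned}$$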

The expansion used in the above proof can be obtained using the finite geometric series summation formula. This formula states that:

$$\sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r}, \quad r \neq 1$$

In the above formula let $r = x_0/x$ with $x \neq 0$. If $x = x_0$ then $x^n - x_0^n = 0$ and there is no need for an expansion formula. Suppose $x \neq x_0$. Then $\sum_{k=0}^{n-1} \left(\frac{x_0}{x}\right)^k = \frac{1 - \left(\frac{x_0}{x}\right)^n}{1 - \frac{x_0}{x}}$. Using this result we can write $x^n - x_0^n$ in the following form:

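$$\begin{aligned} x^n - x_0^n &= x^n\left(1 - \left(\frac{x_0}{x}\right)^n\right) = x^n\left(1 - \frac{x_0}{x}\right)\sum_{k=0}^{n-1}\left(\frac{x_0}{x}\right)^k \\ &= x\left(1 - \frac{x_0}{x}\right) x^{n-1}\left(1 + \frac{x_0}{x} + \left(\frac{x_0}{x}\right)^2 + ... + \left(\frac{x_0}{x}\right)^{n-2} + \left(\frac{x_0}{x}\right)^{n-1}\right) \\ &= (x - x_0)(x^{n-1} + x_0 x^{n-2} + x_0^2 x^{n-3} + ... + x_0^{n-2}x + x_0^{n-1}) \end{aligned}$$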
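The factorization of $x^n - x_0^n$ and the resulting value of the derivative can be verified with exact integer arithmetic. In the sketch below, the function name `power_difference_quotient` is an assumption made for this illustration.

```python
def power_difference_quotient(x, x0, n):
    """(x**n - x0**n) / (x - x0) written as x**(n-1) + x0*x**(n-2) + ... + x0**(n-1)."""
    return sum(x0 ** k * x ** (n - 1 - k) for k in range(n))

# The factorization holds exactly for integers.
print((7 - 3) * power_difference_quotient(7, 3, 5) == 7 ** 5 - 3 ** 5)

# Setting x = x0 in the sum gives n * x0**(n-1), the derivative of x**n at x0.
print(power_difference_quotient(2, 2, 4) == 4 * 2 ** 3)
```

Because the sum is a polynomial in $x$, evaluating it at $x = x_0$ is exactly the limit taken in the second proof.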

Binomial theorem

The binomial theorem states that for any $a, b \in \mathbb{R}$ and $n \in \mathbb{N}$, [1]

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + ... + \binom{n}{n-2}a^2 b^{n-2} + \binom{n}{n-1}ab^{n-1} + \binom{n}{n}b^n$$

This can be inductively proven with the help of Pascal’s triangle theorem which states that

$$\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$$

Pascal’s triangle theorem can be proven by inserting the definition of the binomial coefficient $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ in the above equation:

$$\begin{aligned} \frac{n!}{(k-1)!(n-k+1)!} + \frac{n!}{k!(n-k)!} &= \frac{n!}{(n-k)!(k-1)!}\left[\frac{1}{n-k+1} + \frac{1}{k}\right] \\ &= \frac{n!}{(n-k)!(k-1)!}\left[\frac{n+1}{k(n-k+1)}\right] \\ &= \frac{(n+1)!}{k!(n+1-k)!} = \binom{n+1}{k} \end{aligned}$$



For $n = 1$, $\sum_{k=0}^{1} \binom{1}{k}a^{1-k}b^k = \binom{1}{0}a + \binom{1}{1}b = (a+b)^1$ and the binomial formula for $(a+b)^n$ is true. Suppose the formula is also true for some $n \in \mathbb{N}$. Then, using $\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$,

$$\begin{aligned} (a+b)^{n+1} &= (a+b)(a+b)^n = (a+b)\sum_{k=0}^{n}\binom{n}{k}a^{n-k}b^k \\ &= \left[\binom{n+1}{0}a^{n+1} + \binom{n}{1}a^n b + \binom{n}{2}a^{n-1}b^2 + ... + \binom{n}{n-1}a^2 b^{n-1} + \binom{n}{n}ab^n\right] \\ &\quad + \left[\binom{n}{0}a^n b + \binom{n}{1}a^{n-1}b^2 + ... + \binom{n}{n-2}a^2 b^{n-1} + \binom{n}{n-1}ab^n + \binom{n+1}{n+1}b^{n+1}\right] \\ &= \binom{n+1}{0}a^{n+1} + a^n b\left[\binom{n}{1} + \binom{n}{0}\right] + a^{n-1}b^2\left[\binom{n}{2} + \binom{n}{1}\right] + ... \\ &\quad + a^2 b^{n-1}\left[\binom{n}{n-1} + \binom{n}{n-2}\right] + ab^n\left[\binom{n}{n} + \binom{n}{n-1}\right] + \binom{n+1}{n+1}b^{n+1} \\ &= \binom{n+1}{0}a^{n+1} + \binom{n+1}{1}a^n b + \binom{n+1}{2}a^{n-1}b^2 + ... + \binom{n+1}{n-1}a^2 b^{n-1} + \binom{n+1}{n}ab^n + \binom{n+1}{n+1}b^{n+1} \\ &= \sum_{k=0}^{n+1}\binom{n+1}{k}a^{n+1-k}b^k \end{aligned}$$

which shows that the formula is also true for 𝑛+1 if it is true for 𝑛. This completes the proof of the binomial theorem.

The binomial theorem is also one of the reasons why $0^0$ was defined as equal to 1 by mathematicians. Consider the following expansion [5]:

$$(0 + x)^n = \binom{n}{0}0^n x^0 + \binom{n}{1}0^{n-1}x + ... + \binom{n}{n-1}0^1 x^{n-1} + \binom{n}{n}0^0 x^n = x^n$$

If $0^0$ were undefined or defined as zero, then the binomial theorem would yield $x^n = 0^0 x^n = 0$ or $x^n = \text{undefined}$.
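The binomial formula, Pascal’s triangle theorem and the role of the $0^0 = 1$ convention can be checked with exact integer arithmetic. The helper names below are assumptions introduced for this sketch; `math.comb` is the standard library binomial coefficient (Python 3.8 or newer).

```python
from math import comb

def binomial_expand(a, b, n):
    """Right-hand side of the binomial theorem: sum of C(n, k) * a**(n-k) * b**k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

def pascal_rule_holds(n, k):
    """Check C(n, k-1) + C(n, k) == C(n+1, k) for 1 <= k <= n."""
    return comb(n, k - 1) + comb(n, k) == comb(n + 1, k)

print(binomial_expand(2, 3, 5) == (2 + 3) ** 5)
print(all(pascal_rule_holds(n, k) for n in range(1, 20) for k in range(1, n + 1)))
# The next line relies on Python evaluating 0 ** 0 as 1, matching the convention above.
print(binomial_expand(0, 7, 4) == 7 ** 4)
```

The last check only succeeds because the term $\binom{n}{n}0^0 x^n$ is evaluated with $0^0 = 1$.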

Weierstrass maximum minimum theorem

While proving Rolle’s theorem we made use of Weierstrass’ maximum-minimum theorem which states that if a function is continuous on a closed interval $[a,b]$, then this function has a maximum and a minimum value on $[a,b]$. We can start the proof of Weierstrass’ maximum-minimum theorem by showing that the continuity of $f : [a,b] \to \mathbb{R}$ on $[a,b]$ implies its boundedness on $[a,b]$. This can be proven by contradiction. Assume that $f : [a,b] \to \mathbb{R}$ is continuous but not bounded. Then for any $n \in \mathbb{N}$ there must be $x_n \in [a,b]$ such that $|f(x_n)| > n$. Obviously, $\{x_n\}$ is a sequence bounded by $a$ and $b$. From the boundedness of $\{x_n\}$ it follows that $\{x_n\}$ has a convergent subsequence $\{x_{n_k}\}$ such that $x_{n_k} \to c \in [a,b]$. Since $f$ is a continuous function, $f(x_{n_k}) \to f(c)$. This means that for any real number $\varepsilon > 0$ there exists $k_0 \in \mathbb{N}$ such that if $k \ge k_0$ then $|f(x_{n_k}) - f(c)| < \varepsilon$ and therefore $|f(x_{n_k})| < \varepsilon + |f(c)|$. Since $n_k \to \infty$, it is possible to choose $k \ge k_0$ large enough so that $\varepsilon + |f(c)| < n_k$. But in this case we obtain $|f(x_{n_k})| < n_k$, which is in contradiction with our initial assumption that $|f(x_n)| > n$ for any $n \in \mathbb{N}$. This proves the boundedness of $f : [a,b] \to \mathbb{R}$. As a result, $f$ has a supremum $S$ on $[a,b]$. Using the definition of the supremum, we know that for every $n \in \mathbb{N}$ there exists $x_n \in [a,b]$ such that $S - 1/n < f(x_n) \le S$, from which $f(x_n) \to S$ follows. This gives us another bounded sequence $\{x_n\}$ with a convergent subsequence $x_{n_k} \to c$ in $[a,b]$ and $f(x_{n_k}) \to f(c)$. Since $f(x_{n_k})$ is a subsequence of $f(x_n)$, these two sequences have to converge to the same limit, so that $f(c) = S$. Since $c \in [a,b]$ and $\forall x \in [a,b], f(x) \le f(c)$, this completes the proof of the maximum part of Weierstrass’ maximum-minimum theorem. The minimum part can be proven in the same way.

Combining Weierstrass’ maximum-minimum theorem with the integrability of a continuous function we obtain the mean value theorem for integrals.

Mean Value Theorem for Integrals

Let $f$ be continuous on $[a,b]$. Then $\exists p, q \in [a,b] : f(p) = \inf\{f(x) : x \in [a,b]\}, f(q) = \sup\{f(x) : x \in [a,b]\}$. It follows that

$$f(p)(b-a) \le \int_a^b f \le f(q)(b-a) \Rightarrow f(p) \le \frac{1}{b-a}\int_a^b f \le f(q)$$

Therefore according to the Bolzano intermediate value theorem $\exists c \in [a,b] : f(c) = \frac{1}{b-a}\int_a^b f \Rightarrow \int_a^b f = f(c)(b-a)$.

Bolzano intermediate value theorem

This theorem states that if $f$ is continuous on $[a,b]$ and $f(a) < 0 < f(b)$ then $\exists c \in (a,b) : f(c) = 0$.

Consider $f\left(\frac{a+b}{2}\right)$. If $f\left(\frac{a+b}{2}\right) = 0$ then we found $c$. Otherwise, if $f\left(\frac{a+b}{2}\right) < 0$ let $a_1 = (a+b)/2$, $b_1 = b$, and if $f\left(\frac{a+b}{2}\right) > 0$ let $a_1 = a$, $b_1 = (a+b)/2$. Then consider $f\left(\frac{a_1+b_1}{2}\right)$. If $f\left(\frac{a_1+b_1}{2}\right) = 0$ then we found $c$. Otherwise, if $f\left(\frac{a_1+b_1}{2}\right) > 0$ let $b_2 = (a_1+b_1)/2$, $a_2 = a_1$, and if $f\left(\frac{a_1+b_1}{2}\right) < 0$ let $a_2 = (a_1+b_1)/2$, $b_2 = b_1$, and consider $f\left(\frac{a_2+b_2}{2}\right)$. Continuing this way we either find $c$ after a finite number of updates or we get a monotone increasing sequence $\{a_n\}$ and a monotone decreasing sequence $\{b_n\}$. Since these sequences are also bounded, they are both convergent. Also, because $b_n = a_n + \frac{b-a}{2^n}$, they converge to the same limit $c \in [a,b]$. Due to the continuity of $f$ on $[a,b]$ it follows that $f(a_n) \to f(c)$, $f(b_n) \to f(c)$. Furthermore $\forall n$, $f(a_n) < 0$, $f(b_n) > 0$, which implies that $0 \le f(c) \le 0 \Rightarrow f(c) = 0$.
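The halving construction in this proof is the bisection method. A minimal sketch follows, assuming a continuous `f` with `f(a) < 0 < f(b)`; the function name `bisect_root`, the tolerance and the iteration cap are assumptions added for illustration.

```python
def bisect_root(f, a, b, tol=1e-12, max_iter=200):
    """Locate c in (a, b) with f(c) = 0 by repeated halving, as in the proof."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    for _ in range(max_iter):
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid          # found c after finitely many updates
        if value < 0:
            a = mid             # keep f(a_n) < 0
        else:
            b = mid             # keep f(b_n) > 0
        if b - a < tol:
            break
    return (a + b) / 2

print(bisect_root(lambda x: x * x - 2, 0.0, 2.0))
```

The interval length halves on every step, mirroring $b_n - a_n = (b-a)/2^n$, so both end points approach the same limit.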

In the proof of Weierstrass’ maximum-minimum theorem we made use of several facts without showing why they are true. The first one of these facts is that any bounded sequence has a convergent subsequence (Bolzano-Weierstrass theorem).

Every bounded sequence has a convergent subsequence (Bolzano-Weierstrass)

Let $\{x_n\}$ be any real valued sequence. We can call $x_p$ a peak value of $\{x_n\}$ if for all $k \in \mathbb{N}$, $x_{p+k} \le x_p$. Then $\{x_n\}$ has either an infinite number of peak values or only a finite number of them. In case of infinitely many peak values $x_{n_1}, x_{n_2}, ...$ with $n_1 < n_2 < ...$, for any $k \in \mathbb{N}$ there exists a peak value $x_{n_k}$ and these peak values build a monotone decreasing subsequence $\{x_{n_k}\}$.



In case of a finite number of peak values, let $x_N$ be the last of them and let $n_1 > N$. Then $x_{n_1}$ is not a peak value and therefore there exists $n_2 > n_1$ such that $x_{n_1} < x_{n_2}$. In the same way, for any $k \in \mathbb{N}$ there exist $x_{n_k}$ and $x_{n_{k+1}}$ such that $n_{k+1} > n_k > N$ and $x_{n_k} < x_{n_{k+1}}$. Therefore a monotone increasing subsequence $\{x_{n_k}\}$ of $\{x_n\}$ can be built using these non-peak values with indices greater than $N$. If there are no peak values at all, the same construction works starting from any index. It follows that any real valued sequence has a monotone subsequence. It can also be shown that if a monotone sequence is bounded, then it is convergent. Now suppose that $\{x_n\}$ is a real-valued and bounded sequence and $\{x_{n_k}\}$ is its monotone increasing subsequence. Then $\{x_{n_k}\}$ is also bounded. Let $S$ be the supremum of $\{x_{n_k}\}$. Then for every $\varepsilon > 0$ there exists $K \in \mathbb{N}$ such that $S - \varepsilon < x_{n_K} \le S$. Since $\{x_{n_k}\}$ is an increasing sequence, $\forall k > K$, $S - \varepsilon < x_{n_K} \le x_{n_k} \le S$, from which we obtain, by subtracting $S$ from all sides of the inequality, the relationship $|x_{n_k} - S| < \varepsilon$. The monotone decreasing case is handled in the same way using the infimum. This completes the proof that the monotone subsequence of a bounded sequence is convergent and therefore every bounded sequence has a convergent subsequence.

The next fact that we used in the proof of Weierstrass’ maximum-minimum theorem is that if a convergent sequence $a_n \to L$ is in $[A,B]$ then its limit $L$ is also in $[A,B]$. We can start the proof of this fact by first proving that the limit of a non-negative convergent sequence $a_n \to L$ is also non-negative. Clearly, for any $\varepsilon > 0$ there exists $N_\varepsilon \in \mathbb{N}$ such that $n > N_\varepsilon$ implies $|a_n - L| < \varepsilon$. If we assume a negative limit then we obtain $a_n - L < \varepsilon \Rightarrow a_n < \varepsilon + L$. However we could choose $\varepsilon$ small enough such that $\varepsilon < |L|$. Then we would obtain $a_n < \varepsilon + L < 0$, which is a contradiction. Therefore the limit of a non-negative convergent sequence must be non-negative. The next step in the proof is to observe the behaviour of the non-negative sequences $\{a_n - A\}$ and $\{B - a_n\}$. Clearly, $a_n - A \to L - A \ge 0 \Rightarrow A \le L$ and $B - a_n \to B - L \ge 0 \Rightarrow L \le B$. It follows that $L \in [A,B]$.

In the proof of Weierstrass’ maximum-minimum theorem we also used the fact that a sequence is convergent with a limit if and only if each of its subsequences is convergent with the same limit. In order to prove this let $x_n \to L$. Then for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that $n > N_\varepsilon$ implies $|x_n - L| < \varepsilon$. Then let $\{x_{n_k}\}$ be any subsequence of $\{x_n\}$. For every $k > N_\varepsilon$ we know that $n_k \ge k > N_\varepsilon$ and $|x_{n_k} - L| < \varepsilon$, and therefore $x_{n_k} \to L$. Conversely, if every subsequence of $\{x_n\}$ converges to $L$, then since $\{x_n\}$ is a subsequence of itself, $x_n \to L$.

A continuous function is integrable

Another place where Weierstrass’ maximum-minimum theorem can be used is in the proof of the integrability of a continuous function. While proving Weierstrass’ maximum-minimum theorem, we made use of the boundedness of a continuous function. A further implication of the continuity is that a function $f$ which is continuous on an interval $[a,b] \subset \mathbb{R}$ is integrable on $[a,b]$. In order to prove this, we use the fact that $f$ is also uniformly continuous on $[a,b]$. Suppose $\varepsilon > 0$; then $\exists \delta > 0$ such that for any $x, y \in [a,b]$ with $|x-y| < \delta$, $|f(x) - f(y)| < \varepsilon/(b-a)$. We can choose a partition $P = \{x_0, x_1, ..., x_n\}$ of $[a,b]$ such that for any $i \in \{1, ..., n\}$, $|x_i - x_{i-1}| < \delta$. Since $f$ is continuous on every interval $[x_{i-1}, x_i]$, according to Weierstrass’ maximum-minimum theorem on each one of these intervals there exist $p_i, q_i \in [x_{i-1}, x_i]$ such that $f(p_i) = \inf\{f(x) : x \in [x_{i-1}, x_i]\}$ and $f(q_i) = \sup\{f(x) : x \in [x_{i-1}, x_i]\}$. Furthermore, since $|q_i - p_i|$ is always less than $\delta$, $|f(q_i) - f(p_i)|$ is always less than $\varepsilon/(b-a)$. Now, $U(P,f) - L(P,f)$ can be computed as follows:

$$U(P,f) - L(P,f) = \sum_{i=1}^{n} (f(q_i) - f(p_i))(x_i - x_{i-1}) < \frac{\varepsilon}{b-a}\sum_{i=1}^{n}(x_i - x_{i-1}) = \frac{\varepsilon}{b-a}(b-a) = \varepsilon$$

Therefore, according to the Cauchy criterion for integrability, $\int_a^b f(x)\,dx$ exists. The definitions of $U(P,f)$, $L(P,f)$ can be found in the section about the fundamental theorem of calculus.

In order to prove that if $f$ is continuous on $[a,b]$ then it is uniformly continuous on $[a,b]$, we can assume the contrary, namely that $\exists \varepsilon > 0 : \forall n\ \exists x_n, y_n \in [a,b] : |x_n - y_n| < 1/n$ and $|f(x_n) - f(y_n)| \ge \varepsilon$. Then $\{x_n\}$ has a convergent subsequence $\{x_{n_k}\}$ with $x_{n_k} \to c \in [a,b]$, and since $|x_{n_k} - y_{n_k}| < 1/n_k\ \forall k$, it follows that $y_{n_k} \to c$ as well $\Rightarrow f(x_{n_k}) \to f(c), f(y_{n_k}) \to f(c)$. Therefore for large enough $k$, $|f(x_{n_k}) - f(y_{n_k})| < \varepsilon$. This contradiction completes the proof.



Cauchy criterion for integrability

According to this criterion a function $f$ is integrable on an interval $[a,b]$ if and only if for every $\varepsilon > 0$ there exists a partition $P$ of $[a,b]$ such that $U(P,f) - L(P,f) < \varepsilon$.

If $\int_a^b f = \alpha$ then there exists a sequence of partitions $\{P_n\}$ such that $U(P_n,f) \to \alpha$ and $L(P_n,f) \to \alpha$. Then $U(P_n,f) - L(P_n,f) \to 0$ and for every $\varepsilon > 0$, for large enough $n$, $U(P_n,f) - L(P_n,f) < \varepsilon$.

Conversely, if for every $\varepsilon > 0$ there exists $P_\varepsilon$ such that $U(P_\varepsilon,f) - L(P_\varepsilon,f) < \varepsilon$ then $0 \le U(f) - L(f) < \varepsilon$ for every positive $\varepsilon$, which implies that $U(f) = L(f) = \int_a^b f$.

In the proof of the Cauchy integrability criterion we used the fact that if $f$ is integrable on $[a,b]$ then there exists a sequence of partitions $\{P_n\}$ such that $U(P_n,f) \to \alpha$ and $L(P_n,f) \to \alpha$.

If $f$ is integrable on $[a,b]$ then $U(f) = L(f) = \alpha$, from which it follows that for every $n \in \mathbb{N}$ there exist partitions $Q_n, R_n$ of $[a,b]$ and their union $P_n = Q_n \cup R_n$ such that

$$\alpha - \frac{1}{n} < L(Q_n,f) \le L(P_n,f) \le U(P_n,f) \le U(R_n,f) < \alpha + \frac{1}{n}$$

$$\Rightarrow |L(P_n,f) - \alpha| < \frac{1}{n},\ |U(P_n,f) - \alpha| < \frac{1}{n} \Rightarrow L(P_n,f) \to \alpha,\ U(P_n,f) \to \alpha$$

Conversely, if there exists a sequence of partitions $\{P_n\}$ such that $U(P_n,f) \to \alpha$ and $L(P_n,f) \to \alpha$, then

$$\alpha \le L(f) \le U(f) \le \alpha \Rightarrow L(f) = U(f) = \alpha = \int_a^b f$$

To see that $\alpha \le L(f)$, assume that $L(P_n,f) \to \alpha$ and $L(f) < \alpha$. Then for large enough $n$, $|L(P_n,f) - \alpha| < \alpha - L(f)$. It follows that $L(f) - \alpha < L(P_n,f) - \alpha$, so $L(P_n,f)$ is greater than the least upper bound of the lower sums of $f$, which is a contradiction. $U(f) \le \alpha$ can be proven similarly.

In order to prove that $L(f) \le U(f)$, let $Q, R$ be any partitions and let $P = Q \cup R$. Then $L(Q,f) \le L(P,f) \le U(P,f) \le U(R,f)$. Therefore any lower sum is less than or equal to any upper sum. In other words, any lower sum is a lower bound for the set of all upper sums. Since $U(f)$ is the greatest lower bound for the set of all upper sums, we have $L(Q,f) \le U(f)$. Since $Q$ could be any partition, it follows that $U(f)$ is an upper bound for the set of all lower sums. Since $L(f)$ is the least upper bound for the set of all lower sums, $L(f) \le U(f)$ follows.

In the above proofs we frequently used the fact that the refinement of a partition increases lower sums and decreases upper sums. The increase of lower sums and the decrease of upper sums can be proven in the same way. In order to prove the increase of lower sums we can insert an additional point $p$ into the partition $P$ and call the refined partition $P'$, such that

$$P' = \{x_0, x_1, ..., x_{k-1}, p, x_k, x_{k+1}, ..., x_n\}$$

Let $m' = \inf\{f(x) : x \in [x_{k-1}, p]\}$, $m'' = \inf\{f(x) : x \in [p, x_k]\}$, $m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}$. It follows that $m' \ge m_k$ and $m'' \ge m_k$. Therefore

$$\begin{aligned} L(P',f) &= \sum_{i=1}^{k-1} m_i(x_i - x_{i-1}) + m'(p - x_{k-1}) + m''(x_k - p) + \sum_{i=k+1}^{n} m_i(x_i - x_{i-1}) \\ &\ge \sum_{i=1}^{k-1} m_i(x_i - x_{i-1}) + m_k(p - x_{k-1}) + m_k(x_k - p) + \sum_{i=k+1}^{n} m_i(x_i - x_{i-1}) \\ &= \sum_{i=1}^{k-1} m_i(x_i - x_{i-1}) + m_k(x_k - x_{k-1}) + \sum_{i=k+1}^{n} m_i(x_i - x_{i-1}) \\ &= L(P,f) \end{aligned}$$



References

[1] Muldowney, James S.; “Mathematics 117 Lecture Notes”, University of Alberta

[2] Bowman, John C.; “Math 117/118 Honours Calculus Lecture Notes”, University of Alberta

[3] http://planetmath.org/proofoflimitruleofproduct

[4] Thomas’ Calculus, 12th edition.

[5] http://www.askamathematician.com/2010/12/q-what-does-00-zero-raised-to-the-zeroth-power-equal-why-do-mathematicians-and-high-school-teachers-disagree/

[6] Spivak M. (1965); “Calculus on Manifolds”, ISBN 0-8053-9021-9

[7] Rudin W. (1976); “Principles of Mathematical Analysis”, ISBN 0-07-054235-X



CHAPTER 8

Polygon Meshing through Triangulation

The quality of the triangular mesh in the above example can be improved by clicking on the ‘Flip’ button.

In the above canvas a new triangulation can be started by clicking on the ‘Redefine boundary’ button. The canvas responds to mouse clicks by drawing a dot. Using mouse clicks the boundary of the domain can be defined. Once the vertices constituting the boundary are positioned on the canvas, the “boundary defined” button should be clicked. It is important that the boundary is defined either in clockwise or in counterclockwise direction. Also, the boundary should be convex (any two points inside the boundary can be joined with a line which does not intersect the boundary). In the next step the interior of the boundary is populated with vertices at random positions. In this process the random vertices are generated between the maximum and minimum x and y coordinates of the boundary vertices. In order to determine if a randomly generated vertex is inside the boundary or not, the following steps are applied [1]:

• Step 1: For each newly generated interior vertex (test vertex), the y coordinate of this vertex is called the y-threshold. Find the edges constituting the boundary which have one end above and the other end below the y-threshold. If one vertex of an edge is exactly on the y-threshold, then this vertex is assumed to be above the threshold. This is an arbitrary decision. Choosing the other way round would work too, as long as this rule is applied consistently.

• Step 2: Locate all points on the boundary where a straight horizontal line through the test vertex intersects the boundary. These are the points on the edges found in Step 1 having the same y-coordinate as the y-threshold. This can be done by interpolating between the two ends of the edge for the x coordinate that corresponds to the y-threshold. If the interpolated x-coordinate is less than the x-coordinate of the test vertex, then the intersection point is on the left hand side of the test vertex.

• Step 3: If there is an odd number of intersection points on both the left and the right hand side of the test vertex (like 3 and 3), then the test vertex is inside the polygon. If there is an even number of intersection points on both sides, then it is outside the polygon.
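The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the canvas: the function name `point_in_polygon`, the representation of the boundary as a list of `(x, y)` tuples in drawing order, and the treatment of an intersection exactly at the test vertex as lying on the right are all choices made for this example.

```python
def point_in_polygon(px, py, vertices):
    """Crossing-count test following Steps 1 to 3 (vertices given in boundary order)."""
    left = 0
    right = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Step 1: a vertex exactly on the y-threshold counts as "above".
        above1 = y1 >= py
        above2 = y2 >= py
        if above1 == above2:
            continue  # the edge does not straddle the threshold
        # Step 2: interpolate the x coordinate where the edge meets y = py.
        x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
        if x_cross < px:
            left += 1
        else:
            right += 1
    # Step 3: odd counts on both sides mean the test vertex is inside.
    return left % 2 == 1 and right % 2 == 1

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))   # inside the square
print(point_in_polygon(5, 2, square))   # to the right of the square
```

Edges that straddle the threshold can never be horizontal, so the division by `y2 - y1` is safe.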

Once the interior of the boundary is populated with the desired number of interior vertices, the domain can be triangulated by clicking the “mesh” button. Depending on the positions of the interior vertices, the resulting triangular mesh may or may not be acceptable. In order to improve the quality of the mesh, the “Flip” button can be clicked repeatedly until the mesh quality is satisfactory. This procedure is not automated in order to clearly demonstrate the process of step-by-step mesh improvement. The “Flip” button invokes a function which carries out one round of mesh improvement in the following steps:



• Step 1: Based on the fact that every interior edge of the triangular mesh has two opposite angles, all internal edges are traversed and these two opposite angles are assigned to each of them.

• Step 2: Also, every interior edge has two neighbour triangles. If for any interior edge the sum of its opposite angles is greater than 180∘, then these two neighbour triangles are deleted from the list of triangles using the “neighbourTriangleIndices” property of the interior edge.

• Step 3: In the place of the two deleted triangles, two new triangles are added to the list of triangles. The first of these triangles is defined using the indices of the opposite nodes and the first node index of the interior edge. The second new triangle is defined using the opposite node indices and the second node index of the interior edge.

• Step 4: The interior edge having the sum of its opposite angles greater than 180∘ is replaced with a new edge between the opposite nodes of the old edge.
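As a rough illustration of Steps 2 to 4, the following Python sketch flips a single edge. It assumes that the opposite nodes, opposite angle sum and neighbour triangle indices of the edge have already been filled in (Step 1). The dictionary layout and the function name `flip_edge` are hypothetical; only the key names are borrowed from the properties described in this chapter.

```python
def flip_edge(triangles, edge):
    """Flip one interior edge in place; returns True if a flip was performed.

    triangles: list of 3-tuples of node indices.
    edge: dict with the keys 'nodeIndices', 'oppositeNodeIndices',
          'neighbourTriangleIndices' and 'sumOpAngles' (degrees).
    """
    if edge["sumOpAngles"] <= 180.0:
        return False
    first, second = edge["nodeIndices"]
    opp_a, opp_b = edge["oppositeNodeIndices"]
    # Step 2: delete the two neighbour triangles, highest index first so
    # that the remaining index stays valid.
    for t in sorted(edge["neighbourTriangleIndices"], reverse=True):
        del triangles[t]
    # Step 3: two new triangles from the opposite nodes plus one edge node each.
    triangles.append((opp_a, opp_b, first))
    triangles.append((opp_a, opp_b, second))
    # Step 4: the old edge is replaced by the edge between the opposite nodes.
    edge["nodeIndices"] = [opp_a, opp_b]
    return True


triangles = [(0, 1, 2), (0, 2, 3)]
edge = {
    "nodeIndices": [0, 2],
    "oppositeNodeIndices": [1, 3],
    "neighbourTriangleIndices": [0, 1],
    "sumOpAngles": 200.0,
}
flip_edge(triangles, edge)
print(triangles, edge["nodeIndices"])
```

The sketch does not renumber the remaining triangles or reset the opposite-node lists of the edge after the flip; a full implementation has to do both before the next round.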

How does an edge flip improve the mesh quality?

The idea behind this operation is to make each triangle in the mesh look as much as possible like an equilateral triangle. Without changing the positions of the vertices, this can be achieved by arranging the internal edges in such a way that the circumscribed circle of each triangle contains no vertices other than the ones belonging to that triangle.

This idea is illustrated in the above image. As we can see in the left image, the triangle abc has a very desirable shape. However, the vertex d is so close to it that it falls inside the circumscribed circle of the triangle abc. As a result the triangle that this vertex forms (acd) has very small angles and needs improvement. As shown in the left image, the sum of the opposite angles of the edge ac is greater than 180∘. This condition triggers the edge flip function, and its result is shown in the right image. Clearly, the circumscribed circles of the new triangles do not contain any vertices other than the ones belonging to the triangles that they circumscribe. It can also be seen that the sum of the opposite angles of the newly created edge db is less than 180∘.
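The opposite angle criterion can be checked numerically. In the sketch below the coordinates of a, b, c and d are made up to resemble the situation described above (d close to the edge ac), and the helper names `angle_at` and `opposite_angle_sum` are assumptions for this example.

```python
import math


def angle_at(p, q, r):
    """Interior angle in degrees at vertex p of the triangle (p, q, r)."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def opposite_angle_sum(e1, e2, o1, o2):
    """Sum of the angles at the opposite nodes o1, o2 subtended by the edge (e1, e2)."""
    return angle_at(o1, e1, e2) + angle_at(o2, e1, e2)


# Hypothetical coordinates: b above the edge ac, d just below it.
a, c = (0.0, 0.0), (2.0, 0.0)
b, d = (1.0, 1.7), (1.0, -0.3)

before = opposite_angle_sum(a, c, b, d)   # edge ac, opposite nodes b and d
after = opposite_angle_sum(b, d, a, c)    # edge db, opposite nodes a and c
print(round(before, 1), round(after, 1))
```

Since the four interior angles of the quadrilateral abcd add up to 360∘, the two sums are complementary: whenever the old edge violates the criterion, the flipped edge satisfies it.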

Data Structures

Node: All the vertices defining the boundary and the triangular mesh are stored in a list of “Node” objects. The Node object has the following properties:

50 Chapter 8. Polygon Meshing through Triangulation


• index: Every node is assigned an integer valued index. Boundary node indices have values from “zero” up to “number of boundary nodes - 1” and interior nodes have indices from “number of boundary nodes” up to “total number of nodes - 1”.

• interior: A boolean property that has a “false” default value and becomes “true” for interior nodes.

• x and y coordinates.

• A list of triangles that contain that particular vertex.

• A list of edges that contain that particular vertex.

Edge:

• index: Every edge is assigned an integer valued index. Boundary edge indices have values from “zero” up to “number of boundary edges - 1” and interior edges have indices from “number of boundary edges” up to “total number of edges - 1”.

• nodeIndices: A list containing the indices of the two nodes defining the edge.

• A one dimensional Float32Array, containing the x and y coordinates of the nodes defining the edge (for OpenGL purposes).

• interior: A boolean property that has a “false” default value and becomes “true” for interior edges.

• oppositeNodeIndices: Every interior edge has two opposite nodes. The indices of these opposite nodes are pushed into this initially empty list if the edge is interior.

• oppositeAngles: Similar to oppositeNodeIndices.

• neighbourTriangleIndices: A list containing the indices of the triangles having this edge in common.

• addOppositeAngles(): This is a function which identifies the opposite angles of the interior edge and pushes them into the initially empty list “oppositeAngles”.

• sumOpAngles: The sum of the opposite angles of an interior edge.

• zero(): After each edge flip operation this function empties the lists “oppositeNodeIndices”, “oppositeAngles” and “neighbourTriangleIndices”. Otherwise these lists would keep on growing with each edge flip operation.

Triangle:

• index

• nodeIndices: A list containing the indices of the three vertices that belong to the triangle. Each triangle has a first, second and third node index. As a result the vertices of a triangle are ordered.

• A Float32Array with six members consisting of the x and y coordinates of the vertices (for OpenGL purposes).

• For each vertex of the triangle two vectors are defined pointing from the vertex towards the other two vertices. Using these vectors the interior angles of the triangle are computed.

• An array containing the interior angles of the triangle. The interior angles have the same order as the vertices.
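The three data structures can be summarised in a compact Python sketch. It is not a transcription of the original implementation: the Float32Array buffers are left out, and the names `triangle_indices`, `edge_indices` and `interior_angles` are hypothetical. The remaining property names follow the lists above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    index: int
    x: float
    y: float
    interior: bool = False
    triangle_indices: list = field(default_factory=list)  # hypothetical name
    edge_indices: list = field(default_factory=list)      # hypothetical name


@dataclass
class Edge:
    index: int
    nodeIndices: list
    interior: bool = False
    oppositeNodeIndices: list = field(default_factory=list)
    oppositeAngles: list = field(default_factory=list)
    neighbourTriangleIndices: list = field(default_factory=list)

    @property
    def sumOpAngles(self):
        return sum(self.oppositeAngles)

    def zero(self):
        # Empty the per-flip lists so that they do not keep growing.
        self.oppositeNodeIndices.clear()
        self.oppositeAngles.clear()
        self.neighbourTriangleIndices.clear()


@dataclass
class Triangle:
    index: int
    nodeIndices: list

    def interior_angles(self, nodes):
        """Interior angles in degrees, in the same order as nodeIndices."""
        pts = [(nodes[i].x, nodes[i].y) for i in self.nodeIndices]
        angles = []
        for k in range(3):
            p, q, r = pts[k], pts[(k + 1) % 3], pts[(k + 2) % 3]
            # Two vectors pointing from the vertex towards the other two vertices.
            v1 = (q[0] - p[0], q[1] - p[1])
            v2 = (r[0] - p[0], r[1] - p[1])
            dot = v1[0] * v2[0] + v1[1] * v2[1]
            cross = v1[0] * v2[1] - v1[1] * v2[0]
            angles.append(math.degrees(math.atan2(abs(cross), dot)))
        return angles


nodes = [Node(0, 0.0, 0.0), Node(1, 1.0, 0.0), Node(2, 0.0, 1.0)]
triangle = Triangle(0, [0, 1, 2])
print(triangle.interior_angles(nodes))
```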

Auxiliary Functions

Finding if a point lies to the left or right of a line:

Finding the intersection point of two line segments: We start this procedure by determining whether the two segments intersect or not. This can be done by checking the orientation of the segments with respect to each other [2]. As an example, the orientation of the segment (𝑝1, 𝑝2) in Figure 1 with respect to the vertex 𝑝3 is counterclockwise and ((𝑝2 − 𝑝1) Λ (𝑝3 − 𝑝2)) · k > 0. Similarly the orientation of the segment (𝑝1, 𝑝2) with respect to the vertex 𝑝4 is clockwise and ((𝑝2 − 𝑝1) Λ (𝑝4 − 𝑝2)) · k < 0. Here the symbol Λ denotes the cross product operation and k denotes a unit vector in the positive z-direction according to the right hand rule.

Fig. 8.1: Figure 1: Intersection of Segments

The necessary conditions for two segments oriented as in Figure 1 to intersect each other are:

Condition 1: 𝑝3 and 𝑝4 must have opposite orientation with respect to the segment (𝑝1, 𝑝2).

Condition 2: 𝑝1 and 𝑝2 must have opposite orientation with respect to the segment (𝑝3, 𝑝4).
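A minimal Python sketch of the orientation test and the two conditions is given below. The function names are assumptions for this example, and the parametric formula used for the intersection point is a standard approach that is not taken from this chapter. Collinear and touching segments are deliberately treated as non-intersecting.

```python
def orientation(p1, p2, p3):
    """z-component of (p2 - p1) x (p3 - p2).

    Positive: counterclockwise, negative: clockwise, zero: collinear.
    """
    return (p2[0] - p1[0]) * (p3[1] - p2[1]) - (p2[1] - p1[1]) * (p3[0] - p2[0])


def segments_intersect(p1, p2, p3, p4):
    # Condition 1: p3 and p4 on opposite sides of the segment (p1, p2).
    cond1 = orientation(p1, p2, p3) * orientation(p1, p2, p4) < 0
    # Condition 2: p1 and p2 on opposite sides of the segment (p3, p4).
    cond2 = orientation(p3, p4, p1) * orientation(p3, p4, p2) < 0
    return cond1 and cond2


def intersection_point(p1, p2, p3, p4):
    """Intersection of two properly crossing segments, or None."""
    if not segments_intersect(p1, p2, p3, p4):
        return None
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    # Solve p1 + t * d1 = p3 + s * d2 for t.
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


print(intersection_point((0.0, 0.0), (2.0, 2.0), (0.0, 2.0), (2.0, 0.0)))
```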

References

[1] http://alienryderflex.com/polygon/

[2] http://www.geeksforgeeks.org/check-if-two-given-line-segments-intersect/


CHAPTER 9

Flow Between Parallel Plates

This section is about the computation and visualization of the velocity and pressure fields in an incompressible fluid between two parallel, infinitely large plates using OpenFOAM and ParaView. The results obtained by OpenFOAM are compared to the analytical solutions available for this particular flow case.

Analytical Solution

The fluid is set in motion due to the shear forces caused by the movement of the plates relative to each other. The positions and velocities of the plates are given in Figure 1 with respect to a cartesian coordinate system (x,y,z).

The plates are a distance 2h apart from each other and the coordinate system is centered in the middle of this distance so that the top plate is located at z=h and the bottom plate is located at z=-h. The Navier-Stokes equations describing incompressible fluid flow in the x, y, z directions are given below [1]:

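x-direction:

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z} = -\frac{1}{\rho}\frac{\partial p}{\partial x} + \nu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right)$$

y-direction:

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + w\frac{\partial v}{\partial z} = -\frac{1}{\rho}\frac{\partial p}{\partial y} + \nu\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2}\right)$$

z-direction:

$$\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial x} + v\frac{\partial w}{\partial y} + w\frac{\partial w}{\partial z} = -g - \frac{1}{\rho}\frac{\partial p}{\partial z} + \nu\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + \frac{\partial^2 w}{\partial z^2}\right)$$

In the Navier-Stokes equation for the z-direction the g term denotes the gravitational acceleration. In all equations $\nu$ is the kinematic viscosity of the fluid, $\rho$ is its density, $p$ is the pressure and $u, v, w$ are the velocity components in the x, y, z directions respectively. The steady state flow assumption requires the local time derivatives of the velocity components ($\partial u/\partial t$, $\partial v/\partial t$, $\partial w/\partial t$) to be equal to zero. Another assumption is that the pressure changes only in the x and z directions. The no-slip boundary conditions are given below: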

$$u = U, \quad v = V \quad \text{at } z = h$$

$$u = U', \quad v = V' \quad \text{at } z = -h$$

Fig. 9.1: Figure 1: Parallel plates and the cartesian coordinate system [1]

In order to obtain the velocity profile at an arbitrary point, the velocity components $u$ and $v$ are assumed to be functions of z only such that $\mathbf{v} = u(z)\mathbf{e}_1 + v(z)\mathbf{e}_2$, where $\mathbf{v}$ denotes the velocity vector field and $\mathbf{e}_1$ and $\mathbf{e}_2$ denote the unit vectors in the x and y directions respectively.

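Here is a list of the assumptions that we have made so far to describe fluid flow between two infinitely large plates moving with respect to each other:

• The flow is parallel to the plates ⇒ $w = 0$

• Steady flow ⇒ $\frac{\partial u}{\partial t} = \frac{\partial v}{\partial t} = \frac{\partial w}{\partial t} = 0$

• $\frac{\partial p}{\partial y} = 0$

• $\mathbf{v} = u(z)\mathbf{e}_1 + v(z)\mathbf{e}_2$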

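The application of the above assumptions to the Navier-Stokes equations yields the following simplified governing equations of fluid motion, where the rows correspond to the x-, y- and z-directions respectively:

$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\rho}\partial_x p + \nu u'' \\ \nu v'' \\ -g - \frac{1}{\rho}\partial_z p \end{pmatrix}$$

or

$$\left[-\frac{1}{\rho}\partial_x p + \nu u''\right]\mathbf{e}_1 + \left[\nu v''\right]\mathbf{e}_2 + \left[-g - \frac{1}{\rho}\partial_z p\right]\mathbf{e}_3 = 0$$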

In the above equations $\partial_x$ and $\partial_z$ stand for the partial derivatives with respect to x and z respectively. By integrating the simplified z-direction Navier-Stokes equation once we obtain:

$$p(x, z) = -g\rho z + f_1(x) \qquad (1)$$



From the above description of $p(x, z)$ it follows that $\partial_x p = \partial_x f_1$. Plugging this relationship into the simplified x-direction Navier-Stokes equation we obtain:

$$\rho\nu u'' = \partial_x f_1$$

Since the left hand side of the above equation is a function of z only and the right hand side is a function of x only, both sides must be equal to a constant value C such that:

$$\rho\nu u'' = \partial_x f_1 = \partial_x p = C$$

A second expression for the pressure field can be obtained by integrating the equation $\partial_x p = C$ once with respect to x. This expression is given in Eq.(2).

$$p(x, z) = Cx + f_2(z) \qquad (2)$$

A comparison of Eq.(1) and Eq.(2) shows that, up to an additive constant, $f_1(x) = Cx$ and $f_2(z) = -g\rho z$. Using this, the pressure field can be described as in Eq.(3), where $p_0$ is the pressure at the point x=0, z=0.

$$p(x, z) = Cx - g\rho z + p_0 \qquad (3)$$

Furthermore, integrating the equation $\rho\nu u'' = C$ twice with respect to z, we obtain Eq.(4) which describes the x-component of the velocity field:

$$u(z) = \frac{C}{2\rho\nu}z^2 + c_1 z + c_2 \qquad (4)$$

Applying the boundary conditions for u at z=-h and at z=h, the constants of integration $c_1$, $c_2$ can be computed, which gives Eq.(5).

$$u(z) = \frac{C}{2\rho\nu}z^2 + \frac{U - U'}{2h}z + \frac{U + U'}{2} - \frac{Ch^2}{2\rho\nu} \qquad (5)$$

Similarly, the velocity field in the y-direction can be obtained by integrating the equation $\nu v'' = 0$ (the simplified Navier-Stokes equation for the y-direction) twice with respect to z and using the boundary conditions for v at z=-h and at z=h, as in Eq.(6).

$$v(z) = \frac{V - V'}{2h}z + \frac{V + V'}{2} \qquad (6)$$
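Eq.(5) and Eq.(6) can be checked against the no-slip boundary conditions with a short Python sketch. The function names and all numerical values below (plate velocities, pressure gradient, density, viscosity) are made up for illustration and are not taken from the OpenFOAM case.

```python
def u_profile(z, h, U, U_prime, C, rho, nu):
    """x-velocity from Eq.(5); C is the constant pressure gradient dp/dx."""
    return (C / (2.0 * rho * nu) * z**2
            + (U - U_prime) / (2.0 * h) * z
            + (U + U_prime) / 2.0
            - C * h**2 / (2.0 * rho * nu))


def v_profile(z, h, V, V_prime):
    """y-velocity from Eq.(6)."""
    return (V - V_prime) / (2.0 * h) * z + (V + V_prime) / 2.0


# Hypothetical values: top plate moving, bottom plate at rest.
h, U, U_prime = 0.1, 1.0, 0.0
C, rho, nu = -2.0, 1000.0, 1.0e-3

for z in (-h, 0.0, h):
    print(z, u_profile(z, h, U, U_prime, C, rho, nu))
```

At z=h the two terms containing C cancel and the linear terms add up to U, while at z=-h they add up to U'. Setting C to zero leaves only the linear part of the profile.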

A sub-class of flow between parallel plates is called Couette flow, which occurs when $\partial_x p = 0$ in addition to the assumptions listed previously. In the next section about the simulation in OpenFOAM the Couette flow is demonstrated first. Afterwards the more general case of $\partial_x p \neq 0$ is demonstrated, which is called Poiseuille flow.

In case of Couette flow the application of the condition $\partial_x p = C = 0$ to Eq.(5) results in the following solution for the x-direction velocity and pressure profiles:

$$\text{Couette flow:} \quad u(z) = \frac{U - U'}{2h}z + \frac{U + U'}{2} \qquad (7)$$

$$p(z) = -\rho g z + p_0 \qquad (8)$$

Numerical Solution using OpenFOAM

This section contains step-by-step instructions for the pre-processing, solving and post-processing of Couette and Poiseuille flows whose analytical solutions have been derived in the previous section.



Couette Flow

Inside the home/username/OpenFOAM folder create a new folder called Couette. Then, inside the Couette folder create three more folders called 0, constant and system. The 0 folder will contain the initial velocity and pressure conditions, the constant folder will contain the mesh description and material properties, and the system folder will contain some solver parameters which will be explained with examples in the subsequent sections.
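The folder layout can also be created programmatically. The following Python sketch builds the three folders inside an arbitrary parent directory; the function name `create_case` is hypothetical and a temporary directory stands in for home/username/OpenFOAM.

```python
import tempfile
from pathlib import Path


def create_case(parent, name="Couette"):
    """Create the case folder with its 0, constant and system sub-folders."""
    case = Path(parent) / name
    for sub in ("0", "constant", "system"):
        (case / sub).mkdir(parents=True, exist_ok=True)
    return case


# A temporary directory is used here so that the example leaves no traces.
with tempfile.TemporaryDirectory() as parent:
    case = create_case(parent)
    print(sorted(p.name for p in case.iterdir()))
```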

Definition of the simulation domain and the mesh properties: The Couette flow will be simulated by taking a strip from the infinite fluid between the plates. The long side of this strip is 4 m long in x-direction, its height is equal to 2h=0.2 m and its depth is equal to 0.01 m. The geometry of this finite strip is shown in Figure 2.

Fig. 9.2: Figure 2: Couette flow simulation domain

In order to discretize the geometry shown in Figure 2, create a new folder called polyMesh inside the constant folder and inside the polyMesh folder create a file called ‘blockMeshDict’. The blockMeshDict file contains the parameters used by the ‘blockMesh’ program in order to generate the finite volume mesh for the geometry discretization. The blockMesh command should be executed in the Linux terminal from within the Couette folder in order to invoke the mesh generation program blockMesh. The following code block shows what the blockMeshDict file should look like.

blockMeshDict file:

/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.4.0                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      blockMeshDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

convertToMeters 0.01;

vertices
(
    (0 0 0)
    (400 0 0)
    (400 0 20)
    (0 0 20)
    (0 2 0)
    (400 2 0)
    (400 2 20)
    (0 2 20)
);

blocks
(
    hex (0 1 2 3 4 5 6 7) (40 1 20) simpleGrading (1 1 1)
);

edges
(
);

boundary
(
    top
    {
        type wall;
        faces
        (
            (3 7 6 2)
        );
    }
    bottom
    {
        type wall;
        faces
        (
            (1 5 4 0)
        );
    }
    inlet
    {
        type patch;
        faces
        (
            (0 4 7 3)
        );
    }
    outlet
    {
        type patch;
        faces
        (
            (2 6 5 1)
        );
    }
    frontAndBack
    {
        type empty;
        faces
        (
            (0 3 2 1)
            (4 5 6 7)
        );
    }
);

mergePatchPairs
(
);

The initial part of the above blockMeshDict file up to the convertToMeters command can be copied from one of the sample files that come with the OpenFOAM installation and can be found in the folder home/username/OpenFOAM/FOAM_RUN/tutorials. In the next part of this section the commands in the blockMeshDict file are explained.

Explanation of the blockMeshDict file

convertToMeters: The number that comes after this command is multiplied with the vertex coordinates. The results of this multiplication are stored in the computer memory as the vertex coordinates with respect to the cartesian coordinate system shown in Figure 2 with the unit of meters. For example, if the number that comes after convertToMeters is 0.1 and the x-coordinate of a vertex is defined as 20 in the blockMeshDict file, then the x-coordinate of this vertex is stored in the computer memory as 2 meters away from the origin in x-direction.
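The effect of convertToMeters amounts to a plain multiplication, which the following Python sketch reproduces for the vertex list of the blockMeshDict file above. The helper name `scale_vertices` is an assumption for this example; blockMesh performs this scaling internally.

```python
def scale_vertices(vertices, factor):
    """Multiply every coordinate by the convertToMeters factor."""
    return [tuple(factor * c for c in v) for v in vertices]


# Vertex coordinates as listed in the blockMeshDict file.
vertices = [
    (0, 0, 0), (400, 0, 0), (400, 0, 20), (0, 0, 20),
    (0, 2, 0), (400, 2, 0), (400, 2, 20), (0, 2, 20),
]

scaled = scale_vertices(vertices, 0.01)
for index, vertex in enumerate(scaled):
    print(index, vertex)

# The example from the text: a factor of 0.1 and an x-coordinate of 20.
example = scale_vertices([(20, 0, 0)], 0.1)
```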

vertices: In OpenFOAM the domain of analysis is partitioned into blocks and afterwards a meshing scheme is defined for each block. In this current example, since the domain is simple, it can be described using a single block. This block has the shape of a rectangular prism (Figure 2) and it can be defined using the coordinates of its eight vertices. These coordinates are defined with respect to the coordinate system shown in Figure 2. It is important that this coordinate system is right-handed and its origin is located at one of the vertices that make up the block. The order in which the vertices are defined is also important since this order determines the index of each vertex and the x, y, z directions of the coordinate system.

The first vertex (0,0,0) has the index 0 and defines the position of the origin of the coordinate system. The second vertex (400,0,0) has the index 1 and defines the direction of the x-axis so that the x-axis is oriented from vertex 0 to vertex 1. The third vertex (400,0,20) has the index 2 and determines the direction of the y-axis so that the y-axis is oriented from vertex 1 towards vertex 2. The fourth vertex (0,0,20) does not play a role in determining the direction of an axis but it is essential for defining one of the six faces of the prism. The fifth vertex (0,2,0) determines the direction of the z-axis so that the z-axis is oriented from vertex 0 towards vertex 4. The remaining vertices are simply offset from the vertices 1, 2 and 3 and serve the purpose of defining another face of the prism.

blocks and hex: Using the two rectangular faces defined with eight vertices, a block is defined inside the blocks command. The hex command implies that the prism which constitutes the block is bounded by six faces. Inside the first parentheses following the hex command the vertices that make up two opposite faces of the prism are listed. In this example the vertices 0,1,2,3 define the first face and 4,5,6,7 define its opposite face.

The second parenthesis after the hex command defines the number of cells that the block should be divided into in the x, y, z directions. In this example the 40 inside the second parenthesis after hex means that the block will be divided into 40 cells in the x-direction of the finite volume mesh. The 1 that comes after the 40 means that there will be only 1 cell in the y-direction. This makes sense since we are interested in the x-direction flow profile only and the y-direction flow is expected to have the same pattern (Figure 1). Therefore no discretization is needed in the y-direction. The 20 in this second parenthesis implies that in the z-direction the block will be divided into 20 cells.

The first, second and third numbers in the last parenthesis after the hex command define the grading of the mesh in the x-, y- and z-directions respectively. Each number in this last parenthesis defines the ratio of the length of the last cell in a certain direction to the length of the first cell in that same direction. In this example all cells in a certain direction have equal length, therefore the last parenthesis after the hex command is populated with ones.
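To make the meaning of the grading ratio concrete, the following Python sketch computes the cell lengths along one edge of a block. It assumes that the cell lengths form a geometric progression between the first and the last cell, which is how the expansion ratio is understood here; the function name `cell_lengths` is hypothetical and the sketch is not the blockMesh implementation.

```python
def cell_lengths(length, n_cells, expansion_ratio):
    """Cell lengths along an edge for a given last-to-first length ratio."""
    if n_cells == 1 or expansion_ratio == 1:
        return [length / n_cells] * n_cells
    # Constant growth factor between neighbouring cells.
    r = expansion_ratio ** (1.0 / (n_cells - 1))
    # First cell length from the sum of the geometric series.
    first = length * (1.0 - r) / (1.0 - r**n_cells)
    return [first * r**i for i in range(n_cells)]


# The x-direction of this example: 4 m divided into 40 cells, ratio 1.
uniform = cell_lengths(4.0, 40, 1)

# A hypothetical graded edge: the last cell is 4 times longer than the first.
graded = cell_lengths(1.0, 10, 4)
print(uniform[0], graded[0], graded[-1])
```

With a ratio of 1 the function reduces to uniform cells, which is the case used in the blockMeshDict file of this example.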

edges: This command is used in cases where a block has curved boundaries. In this example the parentheses after this command are left empty since the block is bounded by straight lines.

boundary: In this part of the file, different parts of the boundary are given appropriate labels like top, bottom, etc. Also, each part is given an appropriate type like wall, patch or empty. After a label and a type are defined for the boundary part, the faces that constitute that boundary part are listed using their vertex indices. The order in which these vertices are listed inside the faces command is important. The vertices should be listed in such a way that a person sitting inside the block would perceive it as being in counter-clockwise direction.

mergePatchPairs: This command is needed when more than one block has to be merged at some patch. Here it is left empty since we have only one block.

Definition of the initial conditions: The initial conditions for velocity and pressure are defined inside the 0 folder that we created inside the Couette folder together with the constant and system folders. In this context the meaning of 0 is that the pressure and velocity conditions at the time t=0 are defined. For this purpose two different files are created inside the 0 folder with the file names p and U. In the following part the contents of these files are listed for the Couette flow example and after each file the commands used in that file are explained. The contents of the p file are as follows:

p file:

/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.4.0                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       volScalarField;
    object      p;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

dimensions      [0 2 -2 0 0 0 0];

internalField   uniform 0;

boundaryField
{
    top
    {
        type            zeroGradient;
    }

    bottom
    {
        type            zeroGradient;
    }

    inlet
    {
        type            fixedValue;
        value           uniform 0;
    }

    outlet
    {
        type            fixedValue;
        value           uniform 0;
    }

    frontAndBack
    {
        type            empty;
    }
}

// ************************************************************************* //

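Similar to the blockMeshDict file, the initial part of the p file up to the dimensions command can be copied from one of the sample files that come with OpenFOAM. The seven numbers inside the brackets following the dimensions command define the unit in which the pressure initial condition is given. They are the exponents of the SI base units for mass, length, time, temperature, quantity, current and luminous intensity, so that [0 2 -2 0 0 0 0] stands for m²/s², the unit of the pressure divided by the density.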
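The dimensions entry can be decoded with a few lines of Python. The sketch below assumes the usual OpenFOAM ordering of the seven exponents (mass, length, time, temperature, quantity, current, luminous intensity); the function name `format_dimensions` is made up for this example.

```python
# SI base units in the order of the seven exponents.
SI_BASE_UNITS = ("kg", "m", "s", "K", "mol", "A", "cd")


def format_dimensions(exponents):
    """Turn a list of seven exponents into a readable unit string."""
    parts = []
    for unit, exponent in zip(SI_BASE_UNITS, exponents):
        if exponent == 0:
            continue
        parts.append(unit if exponent == 1 else f"{unit}^{exponent}")
    return " ".join(parts) if parts else "dimensionless"


print(format_dimensions([0, 2, -2, 0, 0, 0, 0]))   # the p file of this example
print(format_dimensions([1, -1, -2, 0, 0, 0, 0]))  # pressure in pascal
```

The first entry corresponds to a pressure divided by the density, the second one to a pressure in the usual sense.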

References [1] Granger R.A., Fluid Mechanics, Dover Publications, 1995, ISBN:9781621986546


CHAPTER 10

Biography of Celal Cakiroglu

Education:

• MSc. Computational Sciences in Engineering, Technische Universitaet Carolo Wilhelmina zu Braunschweig

• BSc. Civil Engineering, Istanbul Technical University

Professional Service:

• Research and Teaching Assistant at the University of Alberta

• Research Assistant at the German Aerospace Center (master thesis)

Favourite Books:

• Stranger in a Strange Land, Robert A. Heinlein

• The Player of Games, Iain M Banks

Favourite Movies:

• Her

• The Matrix

How I made this documentation: Installing Sphinx and making it work the way you want, and linking the git repository with the readthedocs account, can be quite tedious. Here are some useful links which explain the procedure:

https://codeandchaos.wordpress.com/2012/07/30/sphinx-autodoc-tutorial-for-dummies/

https://www.youtube.com/watch?v=oJsUvBQyHBs

http://numericjs.com/index.php

Contact Celal.
