
MATH2114 Lecture Notes
Numerical Methods and Statistics for Engineers

Solution of Nonlinear Equations

Dr Eddie Ly

School of Mathematical & Geospatial Sciences (SMGS)
College of Sciences, Engineering & Health (SEH)

RMIT University, Melbourne, Australia

July 2014

Contents

1 Introduction
2 Bisection Method
    2.1 Convergence Criterion
3 Fixed Point Iteration Method
    3.1 Convergence Criterion
4 Rate of Convergence
5 Termination of the Iterations
6 Newton's Method
    6.1 Convergence and Stopping Criteria
7 Secant Method
    7.1 Convergence and Stopping Criteria
8 Remarks on Multiple Roots
9 Self Assessment Exercises


1 Introduction

Since senior high school, you have used the quadratic formula,

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$

to solve

$$f(x) = ax^2 + bx + c = 0,$$

where $a$, $b$ and $c$ are constants. The values obtained with the quadratic formula are called the roots, or solutions, of the equation $f(x) = 0$. They represent the values of $x$ (denoted by $r$ in this document) that make $f(x)$ equal to zero. Thus, we can define a root of an equation as a value $x = r$ for which $f(r) = 0$. For this reason, roots are sometimes called the zeros of the function. If $f(r) = 0$ and $f'(r) \neq 0$, then $r$ is a simple root. For example, for $f(x) = x^2 - 1$, the simple roots are $x = \pm 1$, since $f(\pm 1) = 0$, $f'(x) = 2x$ and $f'(\pm 1) = \pm 2 \neq 0$. Although the quadratic formula is handy for solving such equations, there are many other (nonlinear) functions for which the roots cannot be determined so easily. For these cases, the numerical methods described here provide efficient means of obtaining the answers.

The methods for solving nonlinear equations are iterative methods, in which a sequence of approximations to the solution is computed. The iteration is terminated once a sufficiently accurate estimate is found. An equation $f(x) = 0$ may have no roots, a finite number of roots, or infinitely many roots. For a simple function, the number of roots and their approximate values can often be found from a sketch.

EXAMPLE

Determine the approximate values of the roots of $x^2 + e^x = 4$.

Solution:

The equation can be written as $e^x = 4 - x^2$, so we sketch the functions $y = e^x$ and $y = 4 - x^2$ on the same axes, and estimate the $x$-coordinates of the points of intersection.

[Figure: sketch of $y = 4 - x^2$ and $y = e^x$ on the same axes, for $x$ from $-2.5$ to $2$ and $y$ from $0$ to $5$.]



From the sketch, we can see that there are two roots, with $r_1 \approx -1.95$ and $r_2 \approx 1.05$. We can verify the estimates as follows:

$$f(x) = x^2 + e^x - 4,$$

$$f(-1.95) = (-1.95)^2 + e^{-1.95} - 4 \approx -0.06,$$

$$f(1.05) = (1.05)^2 + e^{1.05} - 4 \approx -0.04.$$
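The check of the two sketch estimates can be repeated numerically. The following is a minimal Python sketch; the function name `f` simply mirrors the notation above, and the printed format is an arbitrary choice.

```python
import math

def f(x):
    """f(x) = x^2 + e^x - 4, so the roots of f solve x^2 + e^x = 4."""
    return x**2 + math.exp(x) - 4.0

# Estimates read from the sketch; residuals close to zero support them.
for r in (-1.95, 1.05):
    print(f"f({r}) = {f(r):+.4f}")
```

Both residuals are small compared with the size of the terms in $f(x)$, which is all a graphical estimate can be expected to deliver.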

Although graphical methods are useful for obtaining rough estimates of roots, they are limited by their lack of precision. An alternative approach is trial and error, a technique that consists of guessing a value of $x$ and evaluating whether $f(x)$ is zero. If not, another guess is made, and $f(x)$ is again evaluated to determine whether the new value provides a better estimate of the root. The process is repeated until a guess is obtained that results in $f(x) \approx 0$. Such graphical and trial-and-error methods are inefficient and inadequate for the requirements of scientific and engineering practice. This course presents alternatives that are also approximate, but employ systematic strategies to home in on the root.

2 Bisection Method

If $f(x)$ is continuous on the interval $a \le x \le b$, and $f(a)$ and $f(b)$ have opposite signs, that is, $f(a)\,f(b) < 0$, then there is at least one root in the interval. Suppose that there is only one root in the interval. If the interval is halved, the half that contains the root can be determined, and this process can be continued until the root is known to sufficient accuracy. This method is known as the bisection method, binary chopping, interval halving or Bolzano's method.

[Figure: graph of $y = f(x)$ with root $x = r$, showing the original interval $[a_0, b_0]$ with midpoint $c_0$, and the intervals $[a_1, b_1]$, $[a_2, b_2]$, $[a_3, b_3]$ with midpoints $c_1$, $c_2$, $c_3$ after the first, second and third iterations.]




As shown in the illustration, the original interval is denoted by $[a_0, b_0]$, and the subsequent intervals by $[a_1, b_1]$, $[a_2, b_2]$, $[a_3, b_3]$, and so on. Suppose that we know the root lies in the interval $[a, b]$. We compute the midpoint of the interval, $c = \tfrac{1}{2}(a + b)$. If $f(b)\,f(c) < 0$, then the root $r$ lies in the interval $c \le x \le b$, so $a$ is replaced by $c$. Otherwise, the root $r$ is in the interval $a \le x \le c$, and $b$ is replaced by $c$. We now have an interval that is half as long and still contains the root. This bisection of the interval is repeated until we can estimate the root to the required accuracy.

At each iteration, we take $x = c$ as the current estimate of the root. Suppose that an estimate of the root is required such that

$$|\text{Error}| \le \epsilon,$$

where $\epsilon$ is a given tolerance. It can be seen from the illustration that $r$ cannot be further from $c$ than the distance of $b$ from $c$, that is,

$$|r - c| \le b - c.$$

Hence, the iteration is terminated when

$$b - c \le \epsilon; \qquad 0 < \epsilon \ll 1.$$
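The update rule and stopping test described above can be sketched in Python as follows. The function name `bisect`, its return value (the final midpoint together with the number of halvings performed) and the early exit when $f(c)$ is exactly zero are assumptions made for this illustration, not part of the notes.

```python
import math

def bisect(f, a, b, eps):
    """Bisection on [a, b]; returns (c, n) with c the final midpoint
    and n the number of times the interval was halved."""
    fb = f(b)
    if f(a) * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    n = 0
    while True:
        c = 0.5 * (a + b)          # midpoint of the current interval
        fc = f(c)
        if fc == 0.0 or b - c <= eps:
            return c, n            # |r - c| <= b - c <= eps
        if fb * fc < 0:
            a = c                  # root lies in [c, b]
        else:
            b, fb = c, fc          # root lies in [a, c]
        n += 1

# Example: root of x - 1 = exp(-x) in [1, 1.4] with tolerance 0.02.
c, n = bisect(lambda x: x - 1.0 - math.exp(-x), 1.0, 1.4, 0.02)
print(c, n)
```

For this interval and tolerance the loop stops at the midpoint $c = 1.2875$ after four halvings, since $b - c = 0.0125 \le 0.02$ at that point.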

EXAMPLE

Use the bisection method to find the root of $x - 1 = e^{-x}$ that lies in the interval $[1, 1.4]$ to a tolerance of 0.02.

Solution:

Let $f(x) = x - 1 - e^{-x}$, $a_0 = 1$, $b_0 = 1.4$ and $\epsilon = 0.02$. Thus,

$$c_0 = \tfrac{1}{2}(a_0 + b_0) = 1.2,$$

$$f(c_0) = f(1.2) = 1.2 - 1 - e^{-1.2} = -0.10,$$

$$f(b_0) = f(1.4) = 1.4 - 1 - e^{-1.4} = 0.15.$$




Since $f(b_0)\,f(c_0) < 0$, the next interval that contains the root is $[c_0, b_0]$, which is half as long as the original interval $[a_0, b_0]$. For the next iteration, we set $a_1 = c_0$ and $b_1 = b_0$, and compute $c_1$, $f(c_1)$ and $f(b_1)$ (which is the same as $f(b_0)$ since $b_1 = b_0$):

$$a_1 = c_0 = 1.2,$$

$$b_1 = b_0 = 1.4,$$

$$c_1 = \tfrac{1}{2}(a_1 + b_1) = 1.3,$$

$$f(c_1) = f(1.3) = 1.3 - 1 - e^{-1.3} = 0.027,$$

$$f(b_1) = f(b_0) = 0.15.$$

Since $f(b_1)\,f(c_1) > 0$, the root must be in the interval $[a_1, c_1]$. So, for the next iteration, we set $a_2 = a_1$, $b_2 = c_1$ and $f(b_2) = f(c_1)$. The bisection of the interval is repeated until we can estimate the root to the required accuracy. Putting the computed results in a table gives:

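a       b      c        b - c            f(b)     f(c)      f(b) f(c)   action
1       1.4    1.2      0.2    (> ε)     0.15     -0.10     < 0         Set a = c
1.2     1.4    1.3      0.1    (> ε)     0.15      0.027    > 0         Set b = c, f(b) = f(c)
1.2     1.3    1.25     0.05   (> ε)     0.027    -0.037    < 0         Set a = c
1.25    1.3    1.275    0.025  (> ε)     0.027    -0.0044   < 0         Set a = c
1.275   1.3    1.2875   0.0125 (≤ ε)                                    Terminate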




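2.1 Convergence Criterion

Each bisection halves the length of the interval, so

$$b_1 - a_1 = \tfrac{1}{2}(b_0 - a_0),$$

$$b_2 - a_2 = \tfrac{1}{2}(b_1 - a_1) = \frac{1}{2^2}(b_0 - a_0),$$

$$b_3 - a_3 = \tfrac{1}{2}(b_2 - a_2) = \frac{1}{2^3}(b_0 - a_0),$$

$$\vdots$$

$$b_n - a_n = \frac{1}{2^n}(b_0 - a_0).$$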

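Since $c_n$ is the midpoint of the interval $[a_n, b_n]$, we have $c_n = \tfrac{1}{2}(a_n + b_n)$, and

$$|r - c_n| \le b_n - c_n = \tfrac{1}{2}(b_n - a_n) = \frac{1}{2^{n+1}}(b_0 - a_0).$$

As $n \to \infty$, $\frac{1}{2^{n+1}}(b_0 - a_0) \to 0$, thus $|r - c_n| \to 0$ as well. Hence, $c_n$ approaches the root $r$ as $n \to \infty$. That is, the bisection method converges for any initial interval containing the root. Furthermore, for an initial interval $[a_0, b_0]$ and a given tolerance $\epsilon$, we can estimate the required number of iterations $n$ by noticing that $|r - c_n| \le \epsilon$ if

$$\frac{1}{2^{n+1}}(b_0 - a_0) \le \epsilon \quad \Longrightarrow \quad n \ge \frac{\ln\left(\dfrac{b_0 - a_0}{\epsilon}\right)}{\ln 2} - 1.$$

Note that the number of iterations is independent of the function $f(x)$.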

EXAMPLE

How many iterations of the bisection method are required to find the solution of $x - 1 - e^{-x} = 0$ to six decimal places if the initial interval $[1, 1.5]$ is used?

Solution:

A tolerance of $\epsilon = 5 \times 10^{-7}$ (or smaller) is required to ensure six decimal places of accuracy. Here, $a_0 = 1$ and $b_0 = 1.5$, so

$$n \ge \frac{\ln\left(\dfrac{1.5 - 1}{5 \times 10^{-7}}\right)}{\ln 2} - 1 \approx 18.9.$$

At least 19 iterations are required.
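The iteration estimate can be evaluated directly. The following is a minimal Python sketch; the helper name `bisection_iterations` is hypothetical and simply rounds the bound on $n$ up to the next integer.

```python
import math

def bisection_iterations(a0, b0, eps):
    """Smallest integer n satisfying n >= ln((b0 - a0)/eps)/ln 2 - 1."""
    bound = math.log((b0 - a0) / eps) / math.log(2.0) - 1.0
    return max(0, math.ceil(bound))

n_six_places = bisection_iterations(1.0, 1.5, 5e-7)   # example above
n_coarse = bisection_iterations(1.0, 1.4, 0.02)       # earlier example
print(n_six_places, n_coarse)
```

The second call uses the interval $[1, 1.4]$ and tolerance 0.02 from the earlier worked example, for which the bound is about 3.3. Because the bound does not involve $f(x)$, the helper takes only the interval and the tolerance.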




3 Fixed Point Iteration Method

Suppose that $x = g(x)$ is a rearrangement of $f(x) = 0$. Then, $f(r) = 0$ implies that $r = g(r)$, and the root $r$ is called a fixed point of the function $g(x)$. A fixed point can sometimes be found using the fixed point iteration method (also known as simple iteration, the method of successive substitution and one-point iteration),

$$x_{n+1} = g(x_n) \quad \text{for } n = 0, 1, 2, \ldots,$$

given an initial estimate $x_0$ of the root $r$. The above relation provides a formula to predict a new value of $x$ as a function of an old value of $x$. Examples of convergent and divergent iterations are shown below:

[Figure: two panels, each showing $y = g(x)$ and $y = x$ with the fixed point $r$ and the iterates $x_0$, $x_1 = g(x_0)$, $x_2 = g(x_1)$. First panel: convergent fixed point iteration. Second panel: divergent fixed point iteration.]




3.1 Convergence Criterion

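Recall that the error is the difference between the true value and the approximate (estimated) value. If $r$ represents the exact root of $f(x)$ and $x_{n+1}$ represents the most recent estimate of $r$, then the error of $x_{n+1}$ is

$$e_{n+1} = r - x_{n+1}.$$

Since $r = g(r)$ and $x_{n+1} = g(x_n)$, we have

$$e_{n+1} = r - x_{n+1} = g(r) - g(x_n) = g'(c_n)\,(r - x_n) = g'(c_n)\,e_n$$

for some $c_n$ between $r$ and $x_n$ (by the Mean Value Theorem).

Suppose that $|g'(x)| \le K$ for $|r - x| < \delta$. Then,

$$\begin{aligned}
|e_{n+1}| = |g'(c_n)\,e_n| = |g'(c_n)|\,|e_n| &\le K\,|e_n| && \text{if } |e_n| < \delta \\
&\le K \cdot K\,|e_{n-1}| = K^2\,|e_{n-1}| && \text{if } |e_{n-1}| < \delta \\
&\le K^2 \cdot K\,|e_{n-2}| = K^3\,|e_{n-2}| && \text{if } |e_{n-2}| < \delta \\
&\;\;\vdots \\
&\le K^{n+1}\,|e_0| && \text{if } |e_0| < \delta.
\end{aligned}$$

If $K < 1$, then $K^{n+1} \to 0$ and $K^{n+1}|e_0| \to 0$ as $n \to \infty$ for any value of $e_0$. Hence, $|e_{n+1}| = |r - x_{n+1}| \to 0$. Therefore, $x_{n+1}$ approaches the exact root $r$ as $n \to \infty$. As a result, the fixed point iteration $x_{n+1} = g(x_n)$ converges to $r$ for any initial value $x_0 \in (r - \delta, r + \delta)$ if

$$|g'(x)| \le K < 1.$$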

EXAMPLE

Two rearrangements of the equation $e^{-x} = \sin x$ are

$$x = -\log(\sin x) \quad \text{and} \quad x = \arcsin(e^{-x}).$$

Determine whether the fixed point iteration method will converge to the first positive solution $r \approx 0.6$ for these rearrangements. If the iteration does converge, compute $x_3$ using an initial value of 0.6.




Solution:

If $g(x) = -\log(\sin x)$, then

$$g'(x) = -\frac{d}{dx}\log(\sin x) = -\frac{\cos x}{\sin x} = -\cot x,$$

giving $g'(r) \approx g'(0.6) \approx -1.4617$. Since $|g'(x)| > 1$ near the root, the fixed point iteration method will not converge.

If $g(x) = \arcsin(e^{-x})$, then

$$g'(x) = \frac{d}{dx}\arcsin(e^{-x}) = -\frac{e^{-x}}{\sqrt{1 - e^{-2x}}},$$

so $g'(r) \approx g'(0.6) \approx -0.6565$. As $|g'(x)| < 1$ near the root, the fixed point iteration method will converge.

Using the second rearrangement of the equation, the iteration $x_{n+1} = \arcsin(e^{-x_n})$ with starting value $x_0 = 0.6$ gives

$$x_1 = \arcsin(e^{-x_0}) = 0.580942,$$

$$x_2 = \arcsin(e^{-x_1}) = 0.593627,$$

$$x_3 = \arcsin(e^{-x_2}) = 0.585145.$$

Note that the convergence is rather slow, as expected from the large value of $|g'(x)|$.
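The derivative check and the three iterations of this example can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `fixed_point` and the variable names `dg1`, `dg2` are hypothetical.

```python
import math

def fixed_point(g, x0, steps):
    """Return [x0, x1, ..., x_steps] with x_{n+1} = g(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

# Derivatives of the two rearrangements, evaluated at the estimate 0.6.
dg1 = -1.0 / math.tan(0.6)                                # g(x) = -log(sin x)
dg2 = -math.exp(-0.6) / math.sqrt(1.0 - math.exp(-1.2))   # g(x) = arcsin(e^-x)
print(abs(dg1) < 1.0, abs(dg2) < 1.0)

# Only the second rearrangement satisfies |g'(x)| < 1, so iterate with it.
xs = fixed_point(lambda x: math.asin(math.exp(-x)), 0.6, 3)
print([round(x, 6) for x in xs])
```

The iterates fall alternately on either side of the root, which is consistent with $g'(x)$ being negative there.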

4 Rate of Convergence

Suppose $x_0, x_1, x_2, \ldots$ is a sequence of approximations to a root $r$. If $x_n \to r$ as $n \to \infty$, we say the sequence is convergent. The rate of convergence is important. If

$$\lim_{n \to \infty} \frac{|r - x_{n+1}|}{|r - x_n|^p} = \lim_{n \to \infty} \frac{|e_{n+1}|}{|e_n|^p} = K$$

for positive constants $K$ and $p$, we say that $x_n$ converges to $r$ with order of convergence $p$. The larger the value of $p$ and the smaller the value of $K$, the faster the convergence. If $p = 1$ and $K < 1$, the convergence is linear; if $p = 2$, the convergence is quadratic.




For the fixed point iteration method, $x_{n+1} = g(x_n)$, it was shown that

$$e_{n+1} = g'(c_n)\,e_n \quad \text{where } c_n \text{ lies between } x_n \text{ and } r.$$

Thus,

$$\lim_{n \to \infty} \frac{|e_{n+1}|}{|e_n|} = \lim_{n \to \infty} |g'(c_n)| = |g'(r)|,$$

so the iteration will converge linearly if $K = |g'(r)| < 1$. If $g'(r) = 0$, the order of convergence will be two or higher. The bisection method converges linearly with $K = 1/2$.
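The limit $|e_{n+1}|/|e_n| \to |g'(r)|$ can be observed numerically. The sketch below reuses the rearrangement $g(x) = \arcsin(e^{-x})$ from the earlier example; taking the result of 200 iterations as a stand-in for the exact root $r$ is an assumption of this illustration.

```python
import math

def g(x):
    return math.asin(math.exp(-x))

# Reference value for the root: iterate until the sequence has settled.
r = 0.6
for _ in range(200):
    r = g(r)

# Errors of the first few iterates and the ratios of successive errors.
x, errors = 0.6, []
for _ in range(10):
    errors.append(abs(r - x))
    x = g(x)
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]

# Theoretical constant K = |g'(r)| for this g.
K = math.exp(-r) / math.sqrt(1.0 - math.exp(-2.0 * r))
print([round(q, 4) for q in ratios], round(K, 4))
```

The ratios settle near $K$, so each iteration removes roughly the same fraction of the error, which is what linear convergence means in practice.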

5 Termination of the Iterations

We want to terminate the iteration process when the estimate $x_n$ is sufficiently close to the actual root $r$, that is, when $|r - x_n| \le \epsilon$ for some prescribed tolerance $\epsilon$. Several criteria are employed in practice to terminate such an iteration process. One criterion is to terminate when $|f(x_n)| \le \epsilon$ (since $f(x_n) \to 0$ when $x_n \to r$), but this does not ensure that $|r - x_n| \le \epsilon$. Another criterion is to terminate when $|x_{n+1} - x_n| \le \epsilon$.
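A small Python sketch shows why the residual test alone can be misleading. The test function $f(x) = (x - 1)^3$, the trial point and the helper names are hypothetical choices for illustration only; the function is very flat near its root $r = 1$.

```python
def stop_on_residual(f, x, eps):
    """First criterion: |f(x_n)| <= eps."""
    return abs(f(x)) <= eps

def stop_on_step(x_new, x_old, eps):
    """Second criterion: |x_{n+1} - x_n| <= eps."""
    return abs(x_new - x_old) <= eps

f = lambda x: (x - 1.0) ** 3   # hypothetical example with root r = 1
r, x, eps = 1.0, 1.05, 5e-4

residual_ok = stop_on_residual(f, x, eps)   # f(1.05) is about 1.25e-4
error_ok = abs(r - x) <= eps                # but the error is 0.05
print(residual_ok, error_ok)
```

Here the residual test is satisfied although the estimate is still 0.05 away from the root, a hundred times the tolerance.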

6 Newton's Method

One of the major achievements in mathematics was the proof that there is no general formula for solving polynomial equations of degree greater than four (proved by the Norwegian mathematician Niels Henrik Abel (1802–1829)).

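[Figure: graph of $y = f(x)$ with root $r$, showing the tangent lines at $(x_0, f(x_0))$ and $(x_1, f(x_1))$, with slopes $f'(x_0)$ and $f'(x_1)$, cutting the $x$-axis at $x_1$ and $x_2$.]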

Newton's method can be derived from the geometrical interpretation in the above illustration. Suppose $f(x)$ is differentiable, and has a root at an unknown point $x = r$. If we have a point $x_0$ close to $r$, we can obtain a better approximation to $r$ by drawing a tangent line through the point $(x_0, f(x_0))$. This tangent line, whose slope is given by

$$f'(x_0) \equiv \frac{\text{rise}}{\text{run}} = \frac{f(x_0) - 0}{x_0 - x_1},$$

cuts the $x$-axis at $x = x_1$, which will be our new estimate of $r$. Solving for $x_1$ gives

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \quad \text{provided } f'(x_0) \neq 0.$$

Repeat the procedure at $(x_1, f(x_1))$, and let $x_2$ be the $x$-intercept of the second tangent line. From

$$f'(x_1) = \frac{f(x_1) - 0}{x_1 - x_2},$$

we obtain

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} \quad \text{provided } f'(x_1) \neq 0.$$

Continuing in this manner, we determine $x_{n+1}$ from

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \quad \text{for } n = 0, 1, 2, \ldots \text{ and provided } f'(x_n) \neq 0.$$

The repeated use of the above formula yields a sequence $x_1, x_2, x_3, \ldots$ of approximations that, when $x_0$ is sufficiently close to $r$, converges to the root $r$; that is, $x_n \to r$ as $n \to \infty$. This method is called Newton's method or the Newton-Raphson method. A potential problem in implementing this method is the evaluation of the derivative. Although this is not inconvenient for polynomials and many other functions, there are certain functions whose derivatives may be impossible, extremely difficult, or computationally expensive to evaluate. For these cases, the derivative can be approximated by a backward finite divided difference,

$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.$$

EXAMPLE

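Perform two iterations of Newton's method with $x_0 = 1.2$ to find an estimate of the root of $x - 1 = e^{-x}$.

Solution:

Let $f(x) = x - 1 - e^{-x}$. Then, $f'(x) = 1 + e^{-x}$, and

$$x_{n+1} = x_n - \frac{x_n - 1 - e^{-x_n}}{1 + e^{-x_n}} \quad \text{for } n = 0, 1, 2, \ldots.$$

Newton's iterations:

$$x_1 = x_0 - \frac{x_0 - 1 - e^{-x_0}}{1 + e^{-x_0}} = 1.2 - \frac{1.2 - 1 - e^{-1.2}}{1 + e^{-1.2}} = 1.277770,$$

$$x_2 = x_1 - \frac{x_1 - 1 - e^{-x_1}}{1 + e^{-x_1}} = 1.277770 - \frac{1.277770 - 1 - e^{-1.277770}}{1 + e^{-1.277770}} = 1.278465.$$

Hence, $r \approx x_2 = 1.2785$.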
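The two Newton iterations of this example can be sketched in Python as follows. The function name `newton` and its argument list are hypothetical; the derivative is supplied analytically, as in the worked solution.

```python
import math

def newton(f, df, x0, steps):
    """Return [x0, x1, ..., x_steps] with x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        d = df(x)
        if d == 0.0:
            raise ZeroDivisionError("f'(x_n) = 0; Newton step undefined")
        xs.append(x - f(x) / d)
    return xs

f = lambda x: x - 1.0 - math.exp(-x)
df = lambda x: 1.0 + math.exp(-x)

xs = newton(f, df, 1.2, 2)
print([round(x, 6) for x in xs])
```

The guard on `d` reflects the proviso $f'(x_n) \neq 0$ in the iteration formula.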

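6.1 Convergence and Stopping Criteria

Newton's method converges quadratically ($p = 2$) to a simple root, since

$$\lim_{n \to \infty} \frac{|e_{n+1}|}{|e_n|^p} = \lim_{n \to \infty} \frac{|r - x_{n+1}|}{|r - x_n|^2} = \lim_{n \to \infty} \left| \frac{f''(c_n)}{2 f'(x_n)} \right| = \frac{|f''(r)|}{2\,|f'(r)|} = K$$

(note that $x_n \to r$ as $n \to \infty$). But for a multiple root, where $f'(r) = 0$, the convergence rate reduces to linear ($p = 1$).

The method is terminated when

$$|x_{n+1} - x_n| \le \epsilon,$$

ensuring $|r - x_n| \le \epsilon$ for a simple root. Of course, $x_{n+1}$ will have an even smaller error.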




EXAMPLE

Find the solution of $x^3 = 2$ to a tolerance of 0.0005 using Newton's method with $x_0 = 1.2$.

Solution:

Let $f(x) = x^3 - 2$. Then $f'(x) = 3x^2$, and

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^3 - 2}{3x_n^2} \quad \text{for } n = 0, 1, 2, \ldots.$$

For $n = 0$ (first iteration):

$$x_1 = x_0 - \frac{x_0^3 - 2}{3x_0^2} = 1.2 - \frac{(1.2)^3 - 2}{3\,(1.2)^2} = 1.262963.$$

Since $|x_1 - x_0| = 0.062963 > 0.0005$, we continue with the next iteration ($n = 1$):

$$x_2 = x_1 - \frac{x_1^3 - 2}{3x_1^2} = 1.262963 - \frac{(1.262963)^3 - 2}{3\,(1.262963)^2} = 1.259928.$$

Again, $|x_2 - x_1| = 0.003035 > 0.0005$, so we continue to the next iteration ($n = 2$):

$$x_3 = x_2 - \frac{x_2^3 - 2}{3x_2^2} = 1.259928 - \frac{(1.259928)^3 - 2}{3\,(1.259928)^2} = 1.259921.$$

Now, $|x_3 - x_2| = 7.321 \times 10^{-6} < 0.0005$, so the iteration process is terminated, and $x_3 = 1.259921$ is accepted as an approximation to $r$. That is, $r \approx 1.260$. Note that the exact solution is $\sqrt[3]{2} \approx 1.25992105$, so the error in the approximation is about $5 \times 10^{-8}$, which is much smaller than $7.321 \times 10^{-6}$.
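The stopping rule $|x_{n+1} - x_n| \le \epsilon$ used in this example can be sketched in Python as follows. The function name `newton_tol` and the safety cap `max_iter` are assumptions added for the illustration; the notes do not specify an iteration limit.

```python
def newton_tol(f, df, x0, eps, max_iter=50):
    """Newton's method, stopping when |x_{n+1} - x_n| <= eps.
    Returns (x_{n+1}, number of iterations performed)."""
    x = x0
    for n in range(1, max_iter + 1):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) <= eps:
            return x_new, n
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

root, iterations = newton_tol(lambda x: x**3 - 2.0,
                              lambda x: 3.0 * x**2,
                              1.2, 5e-4)
print(round(root, 6), iterations)
```

The loop stops on the third iteration, and the value returned is the newer iterate $x_{n+1}$, which carries the smaller error.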




7 Secant Method

In the secant method, an improved estimate $x_2$ of the solution of $f(x) = 0$ is found from two estimates $x_0$ and $x_1$ by determining the point where the chord through $(x_0, f(x_0))$ and $(x_1, f(x_1))$ cuts the $x$-axis (see the illustration below). Note that $x_0$ and $x_1$ do not have to bracket the solution.

[Figure: graph of $y = f(x)$ with root $r$, showing the chord through $(x_0, f(x_0))$ and $(x_1, f(x_1))$ cutting the $x$-axis at $x_2$.]

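Equating the two expressions for the slope of the chord (over the intervals $[x_1, x_0]$ and $[x_1, x_2]$):

$$\text{slope} \equiv \frac{\text{rise}}{\text{run}} = \frac{f(x_0) - f(x_1)}{x_0 - x_1} = \frac{f(x_2) - f(x_1)}{x_2 - x_1}.$$

Solving the above relation for $x_2$, noting that $f(x_2) = 0$, gives

$$x_2 - x_1 = \frac{0 - f(x_1)}{f(x_0) - f(x_1)}\,(x_0 - x_1) \quad \Longrightarrow \quad x_2 = x_1 - \frac{f(x_1)\,(x_1 - x_0)}{f(x_1) - f(x_0)}.$$

The next approximation $x_3$ is computed in the same manner from $x_1$ and $x_2$, and so on. The secant method can be formulated as

$$x_{n+1} = x_n - \frac{f(x_n)\,(x_n - x_{n-1})}{f(x_n) - f(x_{n-1})} \quad \text{for } n = 1, 2, 3, \ldots.$$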

7.1 Convergence and Stopping Criteria

For some problems, the secant method may not converge for some choices of $x_0$ and $x_1$. However, convergence is guaranteed if $x_0$ and $x_1$ are sufficiently close to $r$. The order of convergence of the secant method is

$$p = \frac{1 + \sqrt{5}}{2} \approx 1.618.$$




Consequently, the secant method can be expected to converge much faster than the bisection method ($p = 1$) or the fixed point iteration method ($p = 1$), but slower than Newton's method ($p = 2$). For a simple root with $x_n$ close to $r$, we have $r - x_n \approx x_{n+1} - x_n$. The secant method is therefore terminated when

$$|x_{n+1} - x_n| \le \epsilon,$$

which will ensure that $|r - x_n| \le \epsilon$ for a simple root.

EXAMPLE

Perform two iterations of the secant method with $x_0 = 1.3$ and $x_1 = 1.1$ to find an estimate of the root of $x - 1 = e^{-x}$.

Solution:

Let $f(x) = x - 1 - e^{-x}$. Then,

$$x_2 = x_1 - \frac{f(x_1)\,(x_1 - x_0)}{f(x_1) - f(x_0)} = 1.1 - \frac{f(1.1)\,(1.1 - 1.3)}{f(1.1) - f(1.3)} = 1.278898,$$

and

$$x_3 = x_2 - \frac{f(x_2)\,(x_2 - x_1)}{f(x_2) - f(x_1)} = 1.278898 - \frac{f(1.278898)\,(1.278898 - 1.1)}{f(1.278898) - f(1.1)} = 1.278473.$$

After two iterations, the estimate is $r \approx x_3 = 1.278$.
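The secant iterations of this example can be sketched in Python as follows. The function name `secant` and the guard against equal function values are assumptions for the illustration.

```python
import math

def secant(f, x0, x1, steps):
    """Return [x0, x1, x2, ...] using
    x_{n+1} = x_n - f(x_n)(x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))."""
    xs = [x0, x1]
    for _ in range(steps):
        x_prev, x_curr = xs[-2], xs[-1]
        f_prev, f_curr = f(x_prev), f(x_curr)
        if f_curr == f_prev:
            raise ZeroDivisionError("chord is horizontal; step undefined")
        xs.append(x_curr - f_curr * (x_curr - x_prev) / (f_curr - f_prev))
    return xs

f = lambda x: x - 1.0 - math.exp(-x)
xs = secant(f, 1.3, 1.1, 2)
print([round(x, 6) for x in xs])
```

Unlike Newton's method, no derivative is supplied; each step needs only the two most recent iterates.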

8 Remarks on Multiple Roots

A multiple root corresponds to a point where a function is tangent to the $x$-axis. For example, a double root results from

$$f(x) = (x - 2)(x - 1)(x - 1),$$

or, multiplying terms, $f(x) = x^3 - 4x^2 + 5x - 2$. The equation has a double root because one value of $x$ makes two factors of $f(x)$ equal to zero. Graphically, this corresponds to the curve touching the $x$-axis tangentially at the double root (the function touches the axis, but does not cross it at the root). A triple root corresponds to the case where one $x$ value makes three factors equal to zero, as in

$$f(x) = (x - 2)(x - 1)(x - 1)(x - 1),$$

or, multiplying terms, $f(x) = x^4 - 5x^3 + 9x^2 - 7x + 2$. Graphically, this function is tangent to the axis at the root, but in this case the axis is crossed. In general, the function crosses the axis at roots of odd multiplicity, but not at roots of even multiplicity. For example, at a quadruple root (four repeated roots) the function does not cross the axis.

Multiple roots pose some difficulties for many of the numerical methods presented in this document:

(1) The fact that the function does not change sign at even multiple roots precludes the use of reliable bracketing methods such as the bisection method. Therefore, you are limited to the fixed point iteration method, which may diverge.

(2) Another issue is related to the fact that not only $f(x)$ but also $f'(x)$ goes to zero at the root. This poses problems for both Newton's method and the secant method, which both contain the derivative (or its estimate) in the denominator of their respective formulae. This could result in division by zero when the solution converges very close to the root.

(3) It can be demonstrated that Newton's method and the secant method are linearly, rather than quadratically, convergent for multiple roots. However, the formula for Newton's method can be slightly modified in the following manner to return the method to quadratic convergence,

$$x_{n+1} = x_n - m\,\frac{f(x_n)}{f'(x_n)},$$

where $m$ is the multiplicity of the root. That is, $m = 2$ for a double root, $m = 3$ for a triple root, etc. Of course, this may be an unsatisfactory alternative, because it hinges on foreknowledge of the multiplicity of the root. Another alternative is to define a new function $u(x)$ as the ratio of the function to its derivative,

$$u(x) = \frac{f(x)}{f'(x)}.$$

It can be shown that this function has roots at all the same locations as the original function. Therefore, this function can be inserted into Newton's method to develop an alternative form,

$$x_{n+1} = x_n - \frac{u(x_n)}{u'(x_n)}.$$
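The effect of the two modifications can be compared on the double-root example $f(x) = (x - 2)(x - 1)^2$. The following Python sketch rests on several assumptions: the function names `newton_m` and `newton_u`, the starting value $x_0 = 0$ and the limit of six iterations are hypothetical choices, the derivatives are written in factored form, and $u/u'$ is expanded as $f f' / (f'^2 - f f'')$, which follows from differentiating $u = f/f'$.

```python
def newton_m(f, df, x0, m=1, steps=6):
    """x_{n+1} = x_n - m f(x_n)/f'(x_n); m = 1 is the standard method."""
    x = x0
    for _ in range(steps):
        d = df(x)
        if d == 0.0:        # derivative vanished, e.g. exactly at the root
            break
        x = x - m * f(x) / d
    return x

def newton_u(f, df, d2f, x0, steps=6):
    """x_{n+1} = x_n - u/u' with u = f/f', i.e. u/u' = f f'/(f'^2 - f f'')."""
    x = x0
    for _ in range(steps):
        fx, dfx = f(x), df(x)
        denom = dfx * dfx - fx * d2f(x)
        if denom == 0.0:    # both f and f' vanished at the multiple root
            break
        x = x - fx * dfx / denom
    return x

f = lambda x: (x - 2.0) * (x - 1.0) ** 2      # double root at x = 1
df = lambda x: (x - 1.0) * (3.0 * x - 5.0)    # f'(x) = 3x^2 - 8x + 5
d2f = lambda x: 6.0 * x - 8.0                 # f''(x)

x_std = newton_m(f, df, 0.0, m=1)
x_mod = newton_m(f, df, 0.0, m=2)
x_u = newton_u(f, df, d2f, 0.0)
print(abs(x_std - 1.0), abs(x_mod - 1.0), abs(x_u - 1.0))
```

After six iterations the standard method is still a few hundredths away from the root, while both modified forms have essentially reached it. The zero checks on the denominators correspond to the division-by-zero hazard noted in item (2).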


9 Self Assessment Exercises


(c) Use Newton's method with $x_0 = \pi/4$ to estimate $x_1$ and $x_2$.

(d) What is the approximate size of the error of $x_2$, obtained from Newton's method, as an estimate of the solution of the above equation?

(e) Taking $x_0 = 0$ and $x_1 = \pi/2$, compute $x_2$ and $x_3$ using the secant method.

[13] Suppose we wish to solve the nonlinear equation,

$$f(x) = x - e^{x-1} = 0.$$

(a) Use Newton's method to find improved approximations $x_1$, $x_2$ and $x_3$ from an initial approximation $x_0 = 2$.

(b) Describe the observed rate of convergence of Newton's method given that the exact solution is $x = 1$. What do the results suggest about the nature of the solution at $x = 1$?

[14] Given $f(x) = -2x^6 - 1.6x^4 + 12x + 1$. Use the bisection method to determine the maximum of this function. Employ an initial interval of $[0, 1]$, and perform iterations until the error is less than 0.005.

[15] The velocity $v$ of a falling parachutist is given by

$$v = \frac{g\,m}{c_d}\left(1 - e^{-c_d t/m}\right),$$

where $g = 9.8\ \mathrm{m/s^2}$. For a parachutist with a drag coefficient $c_d = 15$ kg/s, compute the mass $m$ so that the velocity is 35 m/s at time $t = 9$ s. Use the bisection method to find $m$ to three decimal places of accuracy.