Chapter 2

Matrix Algebra 1: LU-decomposition

2. LU-DECOMPOSITION

2.1 Introduction

In this chapter we use Gaussian elimination to solve the system of equations represented by the matrix equation

\[ Ax = b, \]

where $A$ is an $m \times n$ matrix and $b$ an $m$-vector. This leads to the LU-decomposition of $A$, which is well suited to computer computations. The decomposition has the form

\[ A = LU, \]

where $L$ is a lower triangular $m \times m$ matrix and $U$ an upper triangular $m \times n$ matrix. The decomposition is used to solve systems of linear equations and to find the inverse of a matrix. As we shall see, if $A$ is a band matrix, then the matrices $L$ and $U$ are band matrices as well (provided no row exchanges are needed). This is a useful, memory-saving property. First we concentrate on square matrices and then we discuss the general case of $m \times n$ matrices. The LU-decomposition of a matrix $A$ is obtained by left multiplying it by so-called Gaussian matrices until we have an upper triangular matrix.

2.2 Constructing LU-decomposition

Gaussian matrix. Let $x$ be an $n$-vector. We construct a square matrix $L_k$ such that the first $k$ elements of the vector $x$ remain fixed and the rest are changed to zero,

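\[
L_k x = \begin{bmatrix} x_1 \\ \vdots \\ x_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]

The Gaussian matrix

\[
L_k = I - l_k e_k^T, \qquad
l_k = \frac{1}{x_k} \begin{bmatrix} 0 \\ \vdots \\ 0 \\ x_{k+1} \\ \vdots \\ x_n \end{bmatrix}, \qquad x_k \neq 0,
\]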

carries out the required transformation. Note that the Gaussian matrix depends on the vector $x$ and does not generally work for other vectors.

Exercise 2.2.1 Check that $L_k x = [x_1, \ldots, x_k, 0, \ldots, 0]^T$.

It is easy to compute the inverse of a Gaussian matrix. The inverse is (show this)

\[ L_k^{-1} = I + l_k e_k^T. \]

Because $L_k$ is invertible, multiplying the system of equations by $L_k$ does not change the solution.

Exercise 2.2.2 Show that the inverse of a Gaussian matrix is of this form and that $L_k e_i = e_i$ if $i \neq k$.

Suppose we have a matrix $A$ written in terms of its columns,

\[ A = [a_1, a_2, \ldots, a_n]. \]
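The construction of the Gaussian matrix and its inverse can be checked numerically. The following is a minimal sketch in plain Python with exact fractions; the function names and the sample vector are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def gaussian_matrix(x, k):
    """Build L_k = I - l_k e_k^T for the vector x (k is 1-based, x_k != 0)."""
    n = len(x)
    pivot = Fraction(x[k - 1])
    if pivot == 0:
        raise ValueError("x_k must be nonzero")
    # l_k has zeros in its first k entries and x_i / x_k below them.
    l = [Fraction(0)] * k + [Fraction(x[i]) / pivot for i in range(k, n)]
    L = identity(n)
    for i in range(n):
        L[i][k - 1] -= l[i]
    return L, l


def gaussian_inverse(l, k):
    """Build L_k^{-1} = I + l_k e_k^T from the same vector l_k."""
    n = len(l)
    L_inv = identity(n)
    for i in range(n):
        L_inv[i][k - 1] += l[i]
    return L_inv


def matvec(M, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


x = [3, 6, 9, 12]                      # hypothetical sample vector
L2, l2 = gaussian_matrix(x, 2)
transformed = matvec(L2, x)            # first two entries kept, the rest zeroed
product = matmul(L2, gaussian_inverse(l2, 2))   # should be the identity
other = matvec(L2, [1, 1, 1, 1])       # L2 was built for x, not for this vector
```

Applying `L2` to a different vector, as in the last line, leaves nonzero entries below position 2, which illustrates the remark that a Gaussian matrix is tied to the vector it was built from.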

We transform $A$ into an upper triangular matrix by left multiplying it by suitable Gaussian matrices. First we use a suitable matrix $L_1$ to bring the first column of $A$ to upper triangular form. Then we use a suitable $L_2$ to do the same for the second column, and so on. After $n-1$ steps we have an upper triangular matrix.

1. column: We assume that $a_{11} \neq 0$ and choose a Gaussian matrix $L_1$ that transforms the first column of $A$ so that it is parallel to $e_1$. We left multiply $A$ by the matrix

\[
L_1 = I - l_1 e_1^T =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
-l_{21} & 1 & 0 & \cdots & 0 \\
-l_{31} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
-l_{n1} & 0 & 0 & \cdots & 1
\end{bmatrix},
\]

\[
l_{i1} = \frac{a_{i1}}{a_{11}}, \quad i = 2, \ldots, n, \qquad
l_1^T = [0, l_{21}, l_{31}, \ldots, l_{n1}].
\]


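According to exercise 1.16.1 we get

\[
L_1 [a_1, a_2, \ldots, a_n] = [L_1 a_1, L_1 a_2, \ldots, L_1 a_n]
= [a_{11} e_1, L_1 a_2, \ldots, L_1 a_n],
\]

with the componentwise representation

\[
L_1 A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} \\
0 & a_{32}^{(1)} & \cdots & a_{3n}^{(1)} \\
\vdots & \vdots & & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)}
\end{bmatrix},
\]

where the $a_{ij}^{(1)} = a_{ij} - l_{i1} a_{1j}$, $i, j = 2, \ldots, n$, are the new elements. The first column is now of the correct form.

2. column: If

\[ a_{22}^{(1)} \neq 0, \]

we can use a Gaussian matrix that leaves the first two elements of the second column of the previous matrix intact and transforms the rest of the elements to zero: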

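\[
L_2 = I - l_2 e_2^T, \qquad l_2^T = [0, 0, l_{32}, \ldots, l_{n2}],
\]

where

\[
l_{i2} = \frac{a_{i2}^{(1)}}{a_{22}^{(1)}}, \quad i = 3, 4, \ldots, n,
\]

and the componentwise representation is

\[
L_2 =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & -l_{32} & 1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
0 & -l_{n2} & 0 & \cdots & 1
\end{bmatrix}.
\]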

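We get the product

\[
L_2 L_1 A = L_2 [a_{11} e_1, L_1 a_2, \ldots, L_1 a_n]
= [a_{11} e_1, L_2 L_1 a_2, \ldots, L_2 L_1 a_n],
\]

that is,

\[
L_2 L_1 A =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & a_{n3}^{(2)} & \cdots & a_{nn}^{(2)}
\end{bmatrix}.
\]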

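Note that exercise 2.2.2 shows us that the first column remains the same, since $L_2 e_1 = e_1$.

k. column: We describe the general step after $k-1$ steps have been performed. If

\[ a_{kk}^{(k-1)} \neq 0, \]

then we can find a Gaussian matrix that leaves the first $k$ elements of the $k$th column intact and transforms the others to zero. We set

\[
l_{ik} = \frac{a_{ik}^{(k-1)}}{a_{kk}^{(k-1)}}, \quad i = k+1, \ldots, n,
\]

so the Gaussian matrix has the following elementwise representation

\[
L_k =
\begin{bmatrix}
1 & & & & & \\
 & \ddots & & & & \\
 & & 1 & & & \\
 & & -l_{k+1,k} & 1 & & \\
 & & \vdots & & \ddots & \\
 & & -l_{nk} & & & 1
\end{bmatrix},
\]

where the multipliers stand in column $k$, below the diagonal element of row $k$.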

Finally $k = n-1$ and we have

\[ L_{n-1} L_{n-2} \cdots L_1 A = U, \]

where $U$ is upper triangular as we wanted. The product of two lower triangular matrices that have ones on the diagonal is a lower triangular matrix that has ones on the diagonal (exercise 1.16.8), so the matrix

\[ \hat{L} = L_{n-1} L_{n-2} \cdots L_1 \]

is lower triangular with ones on the diagonal. Now we can write

\[
\hat{L} A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & a_{nn}^{(n-1)}
\end{bmatrix}
= U,
\]

where $U$ is an upper triangular matrix. Because the inverse of a lower triangular matrix with ones on the diagonal is a lower triangular matrix with ones on the diagonal (exercise 1.21.5), we finally get the decomposition we were after:

\[ A = \hat{L}^{-1} U = LU. \]

Exercise 2.2.3 The lower triangular matrix $L$ can easily be written down by using the Gaussian matrices used to form the decomposition. Prove that

\[
L = \hat{L}^{-1} = I + \sum_{i=1}^{n-1} l_i e_i^T.
\]
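The construction above translates directly into a short routine: each step computes the multipliers $l_{ik}$, subtracts multiples of row $k$ from the rows below it, and stores the multipliers in column $k$ of $L$, as in the formula of exercise 2.2.3. The following is a minimal sketch without row exchanges, using exact fractions; the function name and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction


def lu_decompose(A):
    """Return (L, U) with A = L U and L unit lower triangular.

    No row exchanges are made, so every pivot must be nonzero.
    """
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        pivot = U[k][k]
        if pivot == 0:
            raise ValueError(f"zero pivot in step {k + 1}; a row exchange is needed")
        for i in range(k + 1, n):
            l_ik = U[i][k] / pivot          # multiplier l_ik
            L[i][k] = l_ik                  # L = I + sum of l_k e_k^T
            for j in range(k, n):
                U[i][j] -= l_ik * U[k][j]   # row_i <- row_i - l_ik * row_k
    return L, U


# Hypothetical sample matrix, chosen so that all pivots are nonzero.
A = [[4, 3, 2], [8, 7, 9], [12, 13, 32]]
L, U = lu_decompose(A)
```

Collecting the multipliers directly in `L` avoids forming and inverting the product of the Gaussian matrices explicitly.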

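Example 2.2.1 We now find the LU-decomposition of the matrix

\[
A = \begin{bmatrix} 2 & 2 & 1 \\ 2 & 3 & 1 \\ 2 & 1 & 2 \end{bmatrix}.
\]

First we form the matrix $L_1$,

\[
L_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix},
\]

and now

\[
L_1 A = \begin{bmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}.
\]

Next we have $L_2$,

\[
L_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix},
\]

and we finally get the upper triangular matrix $U$ and

\[
L_2 L_1 A = \begin{bmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = U, \qquad
\hat{L} = L_2 L_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 1 & 1 \end{bmatrix}.
\]

Note that the previous exercise provides us with an easy way to compute the inverse of $\hat{L}$. We get

\[
\hat{L}^{-1} = L = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}.
\]

The LU-decomposition for the matrix $A$ is

\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = LU.
\]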

Exercise 2.2.4 Find the LU-decomposition of the matrix

\[
A = \begin{bmatrix} 2 & 3 & 3 \\ 4 & 9 & 10 \\ 6 & 21 & 26 \end{bmatrix}.
\]

2.2.1 Existence of LU-decomposition

Now we want to know when the LU-decomposition of a given matrix exists. In the construction of an LU-decomposition we use the elements

\[ a_{11},\ a_{22}^{(1)},\ a_{33}^{(2)},\ \ldots,\ a_{n-1,n-1}^{(n-2)} \]

as denominators. These elements and $a_{nn}^{(n-1)}$ are called pivot elements. From the construction of the LU-decomposition it is clear that if the first $n-1$ pivot elements differ from zero, then the LU-decomposition of a square matrix $A$ exists. The following simple condition ensures the existence of the LU-decomposition.

Theorem 2.2.1. The first $k$ leading principal minors of an $n \times n$ matrix $A$ differ from zero if and only if the first $k$ pivot elements differ from zero. If the leading principal minors of $A$ are denoted

\[
A_1 = a_{11}, \quad
A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \ldots, \quad
A_n = |A|,
\]

then

\[
A_k = a_{11}\, a_{22}^{(1)} \cdots a_{kk}^{(k-1)}, \quad k = 1, 2, \ldots, n.
\]

Proof. The product $L_k A$ is the matrix obtained by subtracting scalar multiples of the $k$th row of $A$ from the rows below it. This does not change the value of the determinant. The same operations are performed on every leading principal submatrix, so the leading principal minors do not change either. The claim follows by computing the leading principal minors of the matrices $A$ and $\hat{L} A$: those of the upper triangular matrix $\hat{L} A$ are the products of its first $k$ diagonal elements.

If it turns out that one of the first $n-1$ pivot elements is zero, we can exchange rows and continue the procedure.

Theorem 2.2.2. If $\det(A) \neq 0$, then it is possible to reorder the rows of $A$ so that we get nonzero pivot elements when constructing the LU-decomposition.

Proof. If in the first step of the procedure we have $a_{11} = 0$, we choose some other element $a_{k1} \neq 0$. If such an element does not exist, we have

\[
\det(A) =
\begin{vmatrix}
0 & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
0 & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
= 0,
\]

which contradicts the assumption $\det(A) \neq 0$. We exchange the 1st and $k$th rows by using the permutation matrix $P_1$. Now the element $a_{k1}$ is in the upper left corner and we are able to continue the procedure.

$k$th step: If $a_{kk}^{(k-1)} = 0$, we choose some nonzero element under the pivot element,

\[ a_{ik}^{(k-1)} \neq 0, \quad i > k. \]

If there does not exist such an element, then

\[ \det(L_{k-1} P_{k-1} \cdots L_1 P_1 A) = 0, \]

and hence $\det(A) = 0$, which is a contradiction. We use the permutation matrix $P_k$ to exchange the $i$th and $k$th rows and continue the procedure by left multiplying by the matrix $L_k$.
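The relation between leading principal minors and pivot elements stated in Theorem 2.2.1 can be checked on a small case. The sketch below is only an illustration: the helper names are assumptions, the determinant is computed by cofactor expansion (adequate for tiny matrices only), and the matrix is the one of Example 2.2.1, read here as $A = [2\ 2\ 1;\ 2\ 3\ 1;\ 2\ 1\ 2]$.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row (small M only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def pivots(A):
    """Pivot elements a_11, a_22^(1), ... from elimination without row exchanges."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    found = []
    for k in range(n):
        found.append(U[k][k])
        if U[k][k] == 0:
            break                      # cannot continue without a row exchange
        for i in range(k + 1, n):
            factor = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return found


A = [[2, 2, 1], [2, 3, 1], [2, 1, 2]]

# Leading principal minors A_1, A_2, A_3.
minors = [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

# Running products a_11, a_11 * a_22^(1), ... of the pivot elements.
running = []
product = Fraction(1)
for p in pivots(A):
    product *= p
    running.append(product)
```

For this matrix the two lists agree entry by entry, as the theorem predicts.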


If the necessary exchanges of rows are done right at the beginning by using a permutation matrix $P$, we have proven the existence theorem of the LU-decomposition.

Theorem 2.2.3. (LU-decomposition for $n \times n$ matrices.) Let $A$ be a nonsingular $n \times n$ matrix. Then there exists a permutation matrix $P$ such that $PA = LU$, where $L$ is a lower triangular matrix with ones on the diagonal and $U$ an upper triangular matrix.

2.2.2 Pivoting

The numerical stability of the LU-decomposition can be improved by choosing, in every elimination cycle, the element with the biggest absolute value as the pivot element $a_{kk}^{(k-1)}$, by changing the order of rows and/or columns. In partial pivoting the new pivot element is only searched for among the elements below the original pivot element $a_{kk}^{(k-1)}$, that is, in column $k$.
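Partial pivoting as just described can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name, the way the permutation is returned (as a list of row indices rather than a matrix $P$) and the sample matrix are all hypothetical.

```python
from fractions import Fraction


def lu_partial_pivoting(A):
    """Return (perm, L, U) such that the rows of A taken in order perm equal L U."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Search column k, rows k..n-1, for the entry of largest absolute value.
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if U[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            U[k], U[p] = U[p], U[k]
            L[k], L[p] = L[p], L[k]          # carry along multipliers already stored
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    for i in range(n):
        L[i][i] = Fraction(1)                # ones on the diagonal of L
    return perm, L, U


# Hypothetical sample matrix with a zero in the upper left corner.
A = [[0, 2, 1], [1, 1, 0], [2, 0, 4]]
perm, L, U = lu_partial_pivoting(A)
```

Because the pivot is the largest entry of its column, every multiplier stored in `L` has absolute value at most one, which is what limits the growth of rounding errors in floating-point arithmetic.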

In complete pivoting the new pivot element is searched for among the elements that have indices $i \geq k$ and $j \geq k$. This means that in complete pivoting we may change the order of the variables as well as the order of the equations.

2.2.3 The solution of a system ($n \times n$) of linear equations by using LU-decomposition

LU-decomposition is mainly used to solve matrix equations $Ax = b$. First we find the LU-decomposition of the matrix $A$. Partial pivoting does not change the solution, because only the order of the equations is changed. In complete pivoting the solution may come out permuted, since the order of the variables $x_i$ can change. Partial pivoting leads to an equation of the form

\[ Ax = P^T L U x = b \]

that is easy to solve. The inverse of a permutation matrix $P$ is its transpose, so we get the equation $LUx = Pb$. We can split this into two parts,

\[ Ly = Pb, \qquad Ux = y. \]

The first equation has a unique solution $y = L^{-1} P b$, because $L$ is invertible. The matrix $U$ is an upper triangular matrix that has the pivot elements of $A$ on its diagonal. The latter equation also has a unique solution if all the pivot elements, including the last one, are nonzero. The previous results are summed up in the next theorem.

Theorem 2.2.4. Let $A$ be an $n \times n$ matrix and $b$ an $n$-vector. If $\det(A) \neq 0$, the equation $Ax = b$ has a unique solution for every vector $b$.

Exercise 2.2.5 Assume that $\det(A) \neq 0$ and $PA = LU$. Show that the matrices $L$ and $U$ are unique.

Example 2.2.2. Let us use the LU-decomposition to solve the equation $Ax = b$, where $A$ is the matrix in example 2.2.1 and $b = [1, 1, 1]^T$. The LU-decomposition of the matrix $A$ is

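\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = LU.
\]

With the notation $Ly = b$ and $Ux = y$ we can easily solve the system of equations, because the equation

\[
Ly = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\]

is simple when we start from the first equation. The solution is $y = [1, 0, 0]^T$. Then we solve the equation $Ux = y$. This is not hard either, because $U$ is an upper triangular matrix:

\[
\begin{bmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\quad \Longrightarrow \quad x_3 = 0, \; x_2 = 0, \; x_1 = 1/2.
\]

The solution is

\[
x = \begin{bmatrix} 1/2 \\ 0 \\ 0 \end{bmatrix}.
\]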
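The two triangular solves of the example can be written as forward and back substitution. The sketch below is a minimal illustration with exact fractions; the function names are assumptions, and the factors are those of the example, with $L = [1\ 0\ 0;\ 1\ 1\ 0;\ 1\ {-1}\ 1]$.

```python
from fractions import Fraction


def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal."""
    y = []
    for i in range(len(b)):
        s = sum(L[i][j] * y[j] for j in range(i))
        y.append((Fraction(b[i]) - s) / L[i][i])
    return y


def back_substitution(U, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - s) / U[i][i]
    return x


L = [[1, 0, 0], [1, 1, 0], [1, -1, 1]]
U = [[2, 2, 1], [0, 1, 0], [0, 0, 1]]
b = [1, 1, 1]

y = forward_substitution(L, b)   # start from the first equation
x = back_substitution(U, y)      # start from the last equation
```

With these inputs `y` comes out as $[1, 0, 0]^T$ and `x` as $[1/2, 0, 0]^T$, in agreement with the hand computation.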

Theorem 2.2.5 Let $A$ and $B$ be square matrices. Then $\det(AB) = \det(A)\det(B)$.

Proof. Let $\det(A) \neq 0$ and let $A = P^T L U$ be the LU-decomposition of the square matrix $A$. The determinant of the matrix $AB$ can be written in the form

\[ \det(AB) = \det(P^T L U B) = (-1)^{s(A)} \det(LUB), \]

where $s(A)$ is the number of row exchanges carried out by the permutation matrix $P$. By using exercises 1.24.6-9 we can form the following chain of equations:

\[
\det(AB) = (-1)^{s(A)} \det(LUB) = (-1)^{s(A)} \det(UB) = (-1)^{s(A)} \det(U)\det(B)
= (-1)^{s(A)} \det(LU)\det(B) = \det(P^T L U)\det(B) = \det(A)\det(B).
\]

If $\det(A) = 0$, then, as will be shown in section 2.5, we can construct for the matrix $A$ a corresponding LU-decomposition in which the last row of the matrix $U$ consists of zeros. In this case we have $\det(AB) = (-1)^{s(A)} \det(U)\det(B) = 0 = \det(A)\det(B)$, because $\det(U) = 0$.

Exercise 2.2.6 Construct the LU-decomposition of the matrices

\[
\text{a)} \quad A = \begin{bmatrix} 2 & 2 & 2 \\ 2 & 3 & 1 \\ 4 & 2 & 2 \end{bmatrix}
\qquad \text{and} \qquad
\text{b)} \quad A = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 3 & 1 \end{bmatrix}
\]

and solve the equation $Ax = [1\ 2\ 3]^T$.

2.3 Finding the inverse of a matrix by using LU-decomposition


LU-decomposition gives us new information about the inverse of a matrix: if $\det(A) \neq 0$, then $A$ has an inverse that can be computed by solving the systems of equations

\[ A x_i = e_i, \quad i = 1, 2, \ldots, n. \]

This can also be written in the form

\[ A [x_1, x_2, \ldots, x_n] = [e_1, e_2, \ldots, e_n] = I, \]

so $AX = I$. The properties of the determinant now say that $\det(AX) = \det(A)\det(X) = 1 = \det(I)$, so $\det(X) \neq 0$, and theorem 2.2.4 can be used to find an $n \times n$ matrix $Y$ that satisfies the equation $XY = I$. We left multiply this equation by $A$:

\[ \underbrace{AX}_{I}\, Y = A \quad \Longrightarrow \quad Y = A. \]

So $AX = XA = I$, that is, $X = A^{-1}$. We have almost proved the following theorem.

Theorem 2.5.1 Let $A$ be an $n \times n$ matrix. The following conditions are equivalent:
(a) $\det(A) \neq 0$,
(b) there exists a matrix $A^{-1}$,
(c) the equation $Ax = b$ has a unique solution $x = A^{-1} b$ for every $b$,
(d) $Ax = 0 \Rightarrow x = 0$.

Proof. (a) $\Rightarrow$ (c) $\Rightarrow$ (b): Follows from theorem 2.2.4 and what is said above.

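(b) $\Rightarrow$ (a): $1 = \det(I) = \det(AA^{-1}) = \det(A)\det(A^{-1}) \Rightarrow \det(A) \neq 0$.

(c) $\Rightarrow$ (d): trivial (choose $b = 0$).

(d) $\Rightarrow$ (a): Suppose the contrary, that is, (d) holds and $\det(A) = 0$. We construct the LU-decomposition of $A$ as earlier. Because $\det(A) = 0$, there is a pivot element with value zero. Let the first pivot element that is zero be

\[ a_{kk}^{(k-1)} = 0. \]

The elements underneath it are also zero (otherwise a row exchange would produce a nonzero pivot). We have the matrix

\[
LA = \begin{bmatrix} A_1 & a & B \\ 0 & 0 & C \end{bmatrix},
\]

where the column blocks have widths $k-1$, $1$ and $n-k$, the row blocks have heights $k-1$ and $n-k+1$, and $L$ denotes here the product of the Gaussian matrices (and possible row exchanges) applied so far. The submatrix $A_1$ is upper triangular and its diagonal elements are

\[ a_{11},\ a_{22}^{(1)},\ \ldots,\ a_{k-1,k-1}^{(k-2)} \neq 0. \]

This means that $A_1^{-1}$ exists. We choose

\[
x = \begin{bmatrix} -A_1^{-1} a \\ 1 \\ 0 \end{bmatrix} \neq 0,
\]

with blocks of sizes $k-1$, $1$ and $n-k$. Left multiplying $x$ by $LA$ gives us

\[
\begin{bmatrix} A_1 & a & B \\ 0 & 0 & C \end{bmatrix}
\begin{bmatrix} -A_1^{-1} a \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} -a + a \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix} = 0,
\]

which contradicts (d), because $Ax = 0$ follows from the equation $LAx = 0$. So (d) $\Rightarrow$ (a) and the proof of the theorem is completed.

2.4 Computation times of LU-decompositions

Constructing the LU-decomposition of a nonsingular $n \times n$ matrix takes $n^3/3$ flops (to leading order in $n$, with one multiplication and one addition counted together as a flop). Solving the equation $Ax = LUx = b$ requires the solutions of the equations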

\[ Ly = b \ \Rightarrow\ y = L^{-1} b \qquad \text{and} \qquad Ux = y, \]

each of which requires $n^2/2$ flops. Altogether this takes $n^3/3 + n^2$ flops. Because $\det(A) = \pm \det(U) = \pm u_{11} u_{22} \cdots u_{nn}$, where the sign depends on the number of row exchanges, the determinant can be computed in $n^3/3$ flops, if only the highest power of $n$ is considered. If the equation $Ax = b$ is solved for several vectors $b$, the LU-decomposition is constructed only once. If we solve the equation with $p$ different vectors, we need

\[ \frac{n^3}{3} + p\, n^2 \]

flops to get all the answers. Computing the inverse $A^{-1}$ requires that we solve for the $n$ vectors $e_i$, $i = 1, \ldots, n$ (section 2.3), so it takes

\[ \frac{n^3}{3} + n \cdot n^2 = \frac{4}{3} n^3 \]

flops. To save memory space we can overwrite the matrix $A$ with its LU-decomposition in the following way:

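\[
\begin{bmatrix}
a_{11} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2n} \\
\vdots & & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{bmatrix}
\longrightarrow
\begin{bmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
l_{21} & u_{22} & \cdots & u_{2n} \\
\vdots & \ddots & \ddots & \vdots \\
l_{n1} & l_{n2} & \cdots & u_{nn}
\end{bmatrix}.
\]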
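A minimal sketch of this packed storage, and of reusing one factorization for several right-hand sides, is given below. It assumes that no row exchanges are needed; the function names and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction


def lu_in_place(A):
    """Overwrite A with its LU factors: U on and above the diagonal,
    the multipliers l_ij below it (nonzero pivots assumed, no row exchanges)."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i][k] = A[i][k] / A[k][k]      # store l_ik where the zero would appear
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return A


def solve_packed(LU, b):
    """Solve A x = b from the packed factors by forward and back substitution."""
    n = len(b)
    y = []
    for i in range(n):                        # L y = b, the diagonal of L is 1
        y.append(b[i] - sum(LU[i][j] * y[j] for j in range(i)))
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):            # U x = y
        s = sum(LU[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / LU[i][i]
    return x


def inverse(A):
    """Factor once, then solve A x_i = e_i for each unit vector e_i."""
    n = len(A)
    LU = lu_in_place([[Fraction(v) for v in row] for row in A])
    cols = [solve_packed(LU, [Fraction(int(i == j)) for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]   # columns to rows


# Hypothetical sample matrix with nonzero pivots.
A = [[4, 3, 2], [8, 7, 9], [12, 13, 32]]
packed = lu_in_place([[Fraction(v) for v in row] for row in A])
A_inv = inverse(A)
```

In `inverse` the factorization is computed once and each of the $n$ columns costs only two triangular solves, which is the source of the count $n^3/3 + n \cdot n^2$ above.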

The memory space is sufficient because the diagonal elements of $L$ are ones and there is no need to use memory for them.

2.5 LU-decomposition of singular and $m \times n$ matrices

Let $A$ be an $m \times n$ matrix. We construct the LU-decomposition as we did for square matrices by using Gaussian matrices. We transform the columns one by one to the required form and use partial pivoting when necessary. The difference now is that there might be a situation in which the pivot element and all the elements under it are zero. In this kind of situation we continue the elimination as before, but in the same row and the next column. The next example should clarify the procedure.

Example 2.5.1 Consider the matrix

\[
A = \begin{bmatrix}
2 & 1 & 2 & 1 & 2 \\
2 & 1 & 3 & 2 & 1 \\
2 & 1 & 3 & 2 & 1 \\
2 & 1 & 3 & 2 & 1
\end{bmatrix}.
\]

Now

\[
L_1 = \begin{bmatrix}
1 & 0 & 0 & 0 \\
-1 & 1 & 0 & 0 \\
-1 & 0 & 1 & 0 \\
-1 & 0 & 0 & 1
\end{bmatrix}.
\]

1st step:

\[
L_1 A = \begin{bmatrix}
2 & 1 & 2 & 1 & 2 \\
0 & 0 & 1 & 1 & -1 \\
0 & 0 & 1 & 1 & -1 \\
0 & 0 & 1 & 1 & -1
\end{bmatrix}.
\]

We see that $a_{22}^{(1)} = 0$ and $a_{i2}^{(1)} = 0$ when $i = 3, 4$. We continue the procedure in the third column and transform the elements under the second element in this column to zero. Now we get

\[
L_2 = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & -1 & 1 & 0 \\
0 & -1 & 0 & 1
\end{bmatrix}
\]

and

\[
L_2 L_1 A = \begin{bmatrix}
2 & 1 & 2 & 1 & 2 \\
0 & 0 & 1 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix} = U.
\]
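The rule "same row, next column" can be sketched in code as follows. This is an illustration under stated assumptions: the function name is hypothetical, the permutation is returned as a list of row indices, and a row exchange is made only when the current pivot position holds a zero (the first nonzero entry below it is taken, rather than the largest one).

```python
from fractions import Fraction


def echelon_lu(A):
    """Reduce an m x n matrix to staircase form.

    Returns (perm, L, U) such that the rows of A in order perm equal L U.
    """
    m, n = len(A), len(A[0])
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(0)] * m for _ in range(m)]
    perm = list(range(m))
    row = 0
    for col in range(n):
        if row == m:
            break
        # Look for a nonzero entry in this column, at or below the current row.
        p = next((i for i in range(row, m) if U[i][col] != 0), None)
        if p is None:
            continue        # pivot and everything under it are zero: same row, next column
        if p != row:
            U[row], U[p] = U[p], U[row]
            L[row], L[p] = L[p], L[row]
            perm[row], perm[p] = perm[p], perm[row]
        for i in range(row + 1, m):
            L[i][row] = U[i][col] / U[row][col]   # multiplier of elimination step row+1
            for j in range(col, n):
                U[i][j] -= L[i][row] * U[row][j]
        row += 1
    for i in range(m):
        L[i][i] = Fraction(1)
    return perm, L, U


# The matrix of Example 2.5.1.
A = [[2, 1, 2, 1, 2],
     [2, 1, 3, 2, 1],
     [2, 1, 3, 2, 1],
     [2, 1, 3, 2, 1]]
perm, L, U = echelon_lu(A)
```

For this matrix no row exchange occurs, the second column is skipped, and the routine ends with two nonzero rows in `U`.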

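In general the resulting matrix has the form

\[
U = \begin{bmatrix}
0 & 0 & 0 & * & x & x & x & x & x & x & x \\
0 & 0 & 0 & 0 & 0 & * & x & x & x & x & x \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & x & x \\
\vdots & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]

with $r$ nonzero rows at the top followed by $m - r$ zero rows,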

where the elements denoted by $*$ are nonzero and the elements denoted by $x$ may be zero or nonzero. We end up with the equation $PA = LU$, where $P$ is an $m \times m$ permutation matrix, $L$ an $m \times m$ nonsingular lower triangular matrix that has ones on the main diagonal, and $U$ an $m \times n$ upper triangular matrix of the form shown above. It can be shown that the number of nonzero rows is unique and does not depend on the method used to obtain the form above.

2.6 Application: Solving partial differential equations by the difference method

Difference method. We consider a simple boundary value problem. We want to find the function $v(x)$ with $v(0) = v(1) = 0$ such that $v''(x) = f(x)$ on the interval $[0, 1]$, where $f$ is a given function. The problem can be solved in the following way by using the difference method. We divide the interval $[0, 1]$ into $n+1$ subintervals by the points $x_0 = 0$, $x_1 = h$, $x_2 = 2h$, ..., $x_i = ih$, ..., $x_{n+1} = 1$, where the length of every subinterval is $h = 1/(n+1)$. The values of the functions at the points of the partition are denoted $v(x_i) = v_i$, $f(x_i) = f_i$, $i = 0, 1, \ldots, n+1$. The boundary conditions say that $v_0 = v_{n+1} = 0$. Taking difference quotients gives us approximations for the derivatives,

\[
v'(x_i) \approx \frac{v_{i+1} - v_{i-1}}{2h}, \qquad
v''(x_i) \approx \frac{v_{i+1} - 2v_i + v_{i-1}}{h^2}.
\]

At each point we form the equation

\[
\frac{v_{i+1} - 2v_i + v_{i-1}}{h^2} = f_i, \quad i = 1, 2, \ldots, n.
\]

Write down this equation in matrix notation with $v_i$, $i = 1, \ldots, n$, as variables. You will get a band matrix. What is the band width of this matrix? Is it symmetric and positive definite? The bigger the number $n$ is, the more accurate the solution is. Solve the equation when $n = 5$ and a) $f(x) = 1$, b) $f(x) = x(1-x)$.
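As a sketch of how the band structure is exploited, the code below assembles the difference equations, multiplied through by $h^2$, and solves them with an LU-decomposition that only touches the three nonzero diagonals. The function names are assumptions, no row exchanges are made, and a smaller case than the one in the exercise ($n = 3$, $f(x) = 1$) is used as a hypothetical example.

```python
from fractions import Fraction


def difference_system(n, f):
    """Return (sub, diag, sup, rhs) for v[i-1] - 2 v[i] + v[i+1] = h^2 f(x_i)."""
    h = Fraction(1, n + 1)
    sub = [Fraction(1)] * (n - 1)     # coefficients of v[i-1]
    diag = [Fraction(-2)] * n         # coefficients of v[i]
    sup = [Fraction(1)] * (n - 1)     # coefficients of v[i+1]
    rhs = [h * h * f(i * h) for i in range(1, n + 1)]
    return sub, diag, sup, rhs


def solve_tridiagonal(sub, diag, sup, rhs):
    """LU-decomposition without row exchanges, restricted to the three bands."""
    n = len(diag)
    u = [diag[0]]                     # diagonal of U; its superdiagonal equals sup
    l = []                            # subdiagonal of L
    for i in range(1, n):
        l.append(sub[i - 1] / u[i - 1])
        u.append(diag[i] - l[i - 1] * sup[i - 1])
    y = [rhs[0]]                      # forward substitution L y = rhs
    for i in range(1, n):
        y.append(rhs[i] - l[i - 1] * y[i - 1])
    v = [Fraction(0)] * n             # back substitution U v = y
    v[n - 1] = y[n - 1] / u[n - 1]
    for i in range(n - 2, -1, -1):
        v[i] = (y[i] - sup[i] * v[i + 1]) / u[i]
    return v


# Hypothetical small case: n = 3 interior points and f(x) = 1.
v = solve_tridiagonal(*difference_system(3, lambda x: Fraction(1)))
```

Here $L$ has only one nonzero subdiagonal and $U$ only one nonzero superdiagonal, so the factors need memory proportional to $n$ rather than $n^2$.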
