Page 1

Classroom Figures for the Conjugate Gradient Method
Without the Agonizing Pain

Edition 1¼

Jonathan Richard Shewchuk

August 4, 1994

School of Computer Science
Carnegie Mellon University

Pittsburgh, PA 15213

Abstract

This report contains a set of full-page figures designed to be used as classroom transparencies for teaching from the article “An Introduction to the Conjugate Gradient Method Without the Agonizing Pain”.

Supported in part by the Natural Sciences and Engineering Research Council of Canada under a 1967 Science and Engineering Scholarship and by the National Science Foundation under Grant ASC-9318163. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either express or implied, of NSERC, NSF, or the U.S. Government.

Page 2

Keywords: conjugate gradient method, transparencies, agonizing pain

Page 3

[Figure: the lines $3x_1 + 2x_2 = 2$ and $2x_1 + 6x_2 = -8$ plotted in the $(x_1, x_2)$ plane; they intersect at the solution $(2, -2)$.]

Sample 2-d linear system of the form $Ax = b$:

$\begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} x = \begin{bmatrix} 2 \\ -8 \end{bmatrix}$
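A minimal numpy sketch of this sample system (not part of the original transparencies); the matrix and right-hand side are the ones in the figure.

```python
import numpy as np

# Sample 2-d system Ax = b from the figure above.
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])

x = np.linalg.solve(A, b)
print(x)  # [ 2. -2.], the intersection point of the two lines
```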

Page 4

[Figure: surface plot of $f(x)$ over the $(x_1, x_2)$ plane.]

Graph of the quadratic form $f(x) = \frac{1}{2}x^T A x - b^T x + c$. The minimum point of this surface is the solution to $Ax = b$.

[Figure: contour plot of $f(x)$ in the $(x_1, x_2)$ plane.]

Contours of the quadratic form. Each ellipsoidal curve has constant $f(x)$.
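A small sketch of the quadratic form (assuming $c = 0$, which the figure leaves unspecified); it checks numerically that the minimum of $f$ sits at the solution of $Ax = b$.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
c = 0.0  # assumed constant; it shifts the surface but not the minimum

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x + c."""
    return 0.5 * x @ A @ x - b @ x + c

x_star = np.linalg.solve(A, b)
# f is larger at any perturbed point than at the solution of Ax = b.
for d in np.random.default_rng(0).normal(size=(5, 2)):
    assert f(x_star + d) > f(x_star)
```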

Page 5

[Figure: the gradient field drawn over the contours in the $(x_1, x_2)$ plane.]

Gradient $f'(x) = Ax - b$ of the quadratic form. For every $x$, the gradient points in the direction of steepest increase of $f(x)$, and is orthogonal to the contour lines.
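A quick finite-difference check (an illustrative sketch, not from the report) that the gradient of the quadratic form is $f'(x) = Ax - b$ when $A$ is symmetric.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])

f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b  # f'(x) for symmetric A

# Central differences at a sample point agree with Ax - b.
x, h = np.array([1.0, 1.0]), 1e-6
fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
assert np.allclose(fd, grad(x), atol=1e-4)
```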

Page 6

[Figure: four surface plots of $f(x)$, panels (a)-(d).]

(a) Quadratic form for a positive-definite matrix.
(b) For a negative-definite matrix.
(c) For a singular (and positive-indefinite) matrix. A line that runs through the bottom of the valley is the set of solutions.
(d) For an indefinite matrix.

Page 7

[Figure: four panels. (a) The starting point $x_{(0)}$ and residual $r_{(0)}$ in the $(x_1, x_2)$ plane. (b) The surface $f(x)$. (c) The cross-section $f(x_{(0)} + \alpha r_{(0)})$ as a function of $\alpha$. (d) The new point $x_{(1)}$.]

The method of Steepest Descent.

Page 8

[Figure: gradients and search-line slopes at several points along a search line in the $(x_1, x_2)$ plane.]

Solid arrows: Gradients.

Dotted arrows: Slope along search line.

Page 9

[Figure: the zigzag path of Steepest Descent from $x_{(0)}$, drawn over the contours.]

The method of Steepest Descent.
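A minimal implementation sketch of Steepest Descent with exact line search; the starting point $(-2, -2)$ is an assumption chosen to resemble the figures.

```python
import numpy as np

def steepest_descent(A, b, x, iters=50, tol=1e-10):
    """Steepest Descent for symmetric positive-definite A."""
    for _ in range(iters):
        r = b - A @ x                  # residual: direction of steepest descent
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ A @ r)  # exact minimizer along the search line
        x = x + alpha * r
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(steepest_descent(A, b, np.array([-2.0, -2.0])))  # -> approx [2, -2]
```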

Page 10

[Figure: the vectors $v$, $Bv$, $B^2v$, $B^3v$ shrinking toward the origin.]

$v$ is an eigenvector of $B$ with a corresponding eigenvalue of $-0.5$. As $i$ increases, $B^i v$ converges to zero.

[Figure: the vectors $v$, $Bv$, $B^2v$, $B^3v$ growing in length.]

Here, $v$ has a corresponding eigenvalue of 2. As $i$ increases, $B^i v$ diverges to infinity.

[Figure: the vectors $x$, $Bx$, $B^2x$, $B^3x$ together with the eigenvector components $v_1$, $v_2$.]

$x = v_1 + v_2$. One eigenvector diverges, so $x$ also diverges.
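A sketch of the idea, using a hypothetical symmetric $B$ with eigenvalues $-0.5$ and 2 (the matrix in the figure is not given, so this one is an assumption).

```python
import numpy as np

# Hypothetical B with eigenvalues -0.5 and 2 and orthonormal eigenvectors.
Q = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
B = Q @ np.diag([-0.5, 2.0]) @ Q.T

v1, v2 = Q[:, 0], Q[:, 1]  # eigenvectors
x = v1 + v2
for i in range(1, 5):
    Bi = np.linalg.matrix_power(B, i)
    # |B^i v1| -> 0 (eigenvalue -0.5); |B^i x| blows up (the v2 component).
    print(i, np.linalg.norm(Bi @ v1), np.linalg.norm(Bi @ x))
```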

Page 11

[Figure: the two eigenvectors drawn over the contours, labeled with the eigenvalues 7 and 2.]

The eigenvectors of $A$ are directed along the axes of the paraboloid defined by the quadratic form $f(x)$. Each eigenvector is labeled with its associated eigenvalue.

Page 12

[Figure: six panels (a)-(f) in the $(x_1, x_2)$ plane. Panel (a) shows the eigenvectors of $B$ with eigenvalues $-0.47$ and $0.47$; the remaining panels show the successive iterates of the Jacobi method starting from $x_{(0)}$.]

Convergence of the Jacobi Method.

In (a), the eigenvectors of $B$ are shown with their corresponding eigenvalues. These eigenvectors are NOT the axes of the paraboloid.
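A sketch of the Jacobi iteration for the sample problem; the spectral radius of the iteration matrix $B$ comes out near the $0.47$ labeled in panel (a).

```python
import numpy as np

def jacobi(A, b, x, iters=50):
    """Jacobi method: split A = D + E (D diagonal) and iterate
    x <- D^{-1}(b - E x), i.e. x <- B x + z with B = -D^{-1} E."""
    D = np.diag(np.diag(A))
    E = A - D
    Dinv = np.diag(1.0 / np.diag(A))
    for _ in range(iters):
        x = Dinv @ (b - E @ x)
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
B = -np.diag(1.0 / np.diag(A)) @ (A - np.diag(np.diag(A)))
print(max(abs(np.linalg.eigvals(B))))        # ~0.471, the convergence rate
print(jacobi(A, b, np.array([-2.0, -2.0])))  # -> approx [2, -2]
```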

Page 13

[Figure: a single Steepest Descent step along an eigenvector direction, drawn over the contours.]

Steepest Descent converges to the exact solution on the first iteration if the error term is an eigenvector.

Page 14

[Figure: a single Steepest Descent step on circular contours.]

Steepest Descent converges to the exact solution on the first iteration if the eigenvalues are all equal.

Page 15

[Figure: two vectors drawn to the same ellipsoidal contour in the $(x_1, x_2)$ plane.]

The energy norm of these two vectors is equal.
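The energy norm is $\|e\|_A = \sqrt{e^T A e}$. The pair below is an assumed illustration (the figure's own vectors are not recoverable): equal energy norms, different Euclidean lengths.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])

def energy_norm(e):
    """||e||_A = sqrt(e^T A e)."""
    return np.sqrt(e @ A @ e)

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0 / np.sqrt(2.0)])
print(energy_norm(e1), energy_norm(e2))        # both sqrt(3)
print(np.linalg.norm(e1), np.linalg.norm(e2))  # different Euclidean lengths
```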

Page 16

[Figure: surface plot of $\omega$ as a function of $\mu$ (0 to 20) and $\kappa$ (1 to 100).]

Convergence $\omega$ of Steepest Descent. $\mu$ is the slope of $e_{(i)}$ with respect to the eigenvector axes. $\kappa$ is the condition number of $A$. Convergence is worst when $\mu = \pm\kappa$.

Page 17

[Figure: four Steepest Descent trajectories, panels (a)-(d), in the $(x_1, x_2)$ plane.]

(a) Large $\kappa$, small $\mu$.
(b) An example of poor convergence. $\kappa$ and $\mu$ are both large.
(c) Small $\kappa$ and $\mu$.
(d) Small $\kappa$, large $\mu$.

Page 18

[Figure: worst-case starting points $x_{(0)}$ and the resulting steps, drawn over the contours.]

Solid lines: Worst starting points for Steepest Descent.

Dashed lines: Steps toward convergence.

Grey arrows: Eigenvector axes.

Here, $\kappa = 3.5$.

Page 19

[Figure: the per-iteration convergence bound plotted against the condition number $\kappa$, from 0 to 100.]

Convergence of Steepest Descent (per iteration) worsens asthe condition number of the matrix increases.
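The curve plotted here is the worst-case per-iteration bound $\omega \le (\kappa - 1)/(\kappa + 1)$; a one-function sketch.

```python
def sd_bound(kappa):
    """Worst-case energy-norm contraction per Steepest Descent step:
    ||e_(i+1)||_A <= ((kappa - 1) / (kappa + 1)) ||e_(i)||_A."""
    return (kappa - 1.0) / (kappa + 1.0)

for kappa in [2.0, 3.5, 10.0, 100.0]:
    print(kappa, sd_bound(kappa))  # approaches 1 as kappa grows
```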

Page 20

[Figure: the two steps of the Method of Orthogonal Directions from $x_{(0)}$, with the directions $d_{(0)}$, $d_{(1)}$ and the errors $e_{(0)}$, $e_{(1)}$ in the $(x_1, x_2)$ plane.]

The Method of Orthogonal Directions.

Page 21

[Figure: pairs of vectors in the $(x_1, x_2)$ plane.]

These pairs of vectors are $A$-orthogonal . . .

[Figure: the same pairs after the space has been stretched so the contours become circular.]

. . . because these pairs of vectors are orthogonal.
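A sketch of the definition: $u$ and $v$ are $A$-orthogonal (conjugate) if $u^T A v = 0$. The pair below is a hypothetical example built by one conjugation step.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])

def a_orthogonal(u, v):
    """u and v are A-orthogonal (conjugate) if u^T A v = 0."""
    return np.isclose(u @ A @ v, 0.0)

# Make v conjugate to u by removing the A-projection of w onto u.
u = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])
v = w - (u @ A @ w) / (u @ A @ u) * u
print(a_orthogonal(u, v))  # True
```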

Page 22

[Figure: two panels over the contours, showing the steps $d_{(0)}$, $d_{(1)}$ from $x_{(0)}$ and the errors $e_{(0)}$, $e_{(1)}$.]

The method of Conjugate Directions converges in $n$ steps. $e_{(1)}$ must be $A$-orthogonal to $d_{(0)}$.

Page 23

[Figure: the vectors $u_0$, $u_1$ and the directions $d_{(0)}$, $d_{(1)}$; $u_1$ is split into a component $u^+$ parallel to $d_{(0)}$ and a component $u^*$ that is $A$-orthogonal to it.]

Gram-Schmidt conjugation of two vectors.
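A sketch of Gram-Schmidt conjugation for any number of vectors: subtract from each $u_i$ its $A$-projections onto the earlier directions, so only the $A$-orthogonal portion ($u^*$ in the figure) remains.

```python
import numpy as np

def conjugate_directions(U, A):
    """Gram-Schmidt conjugation: turn linearly independent vectors
    u_0, ..., u_{n-1} into mutually A-orthogonal directions."""
    D = []
    for u in U:
        d = u.astype(float).copy()
        for dj in D:
            d -= (u @ A @ dj) / (dj @ A @ dj) * dj  # remove A-projection
        D.append(d)
    return D

A = np.array([[3.0, 2.0], [2.0, 6.0]])
d0, d1 = conjugate_directions([np.array([1.0, 0.0]),
                               np.array([0.0, 1.0])], A)
print(d0 @ A @ d1)  # ~0: the two directions are A-orthogonal
```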

Page 24

[Figure: Conjugate Directions stepping along the coordinate axes, drawn over the contours.]

The method of Conjugate Directions using the axial unit vectors, also known as Gaußian elimination.

Page 25

[Figure: the errors $e_{(0)}$, $e_{(1)}$, $e_{(2)}$, the directions $d_{(0)}$, $d_{(1)}$, and an ellipsoidal contour of the energy norm.]

The shaded area is $e_{(0)} + \mathcal{D}_2 = e_{(0)} + \mathrm{span}\{d_{(0)}, d_{(1)}\}$.

The ellipsoid is a contour on which the energy norm is constant.

After two steps, CG finds $e_{(2)}$, the point on $e_{(0)} + \mathcal{D}_2$ that minimizes $\|e\|_A$.

Page 26

[Figure: four panels showing the directions $d_{(0)}$, $d_{(1)}$, the residuals $r_{(0)}$, $r_{(1)}$, the errors $e_{(0)}$, $e_{(1)}$, $e_{(2)}$, and the subspaces $\mathcal{D}_1$ and $\mathcal{D}_2$ translated to the starting point.]

(a) 2D problem.
(b) Stretched 2D problem.
(c) 3D problem.
(d) Stretched 3D problem.

Page 27

[Figure: the directions $d_{(0)}$, $d_{(1)}$, $d_{(2)}$, the vectors $u_0$, $u_1$, $u_2$, the residual $r_{(2)}$, and the error $e_{(2)}$ in three dimensions.]

$d_{(0)}$ and $d_{(1)}$ span the same subspace as $u_0$, $u_1$ (the gray-colored plane $\mathcal{D}_2$).

$e_{(2)}$ is $A$-orthogonal to $\mathcal{D}_2$.

$r_{(2)}$ is orthogonal to $\mathcal{D}_2$.

$d_{(2)}$ is constructed (from $u_2$) to be $A$-orthogonal to $\mathcal{D}_2$.

Page 28

[Figure: as on the previous page, but with the residuals $r_{(0)}$, $r_{(1)}$, $r_{(2)}$ in place of $u_0$, $u_1$, $u_2$.]

$d_{(0)}$ and $d_{(1)}$ span the same subspace as $r_{(0)}$, $r_{(1)}$ (the gray-colored plane $\mathcal{D}_2$).

$e_{(2)}$ is $A$-orthogonal to $\mathcal{D}_2$.

$r_{(2)}$ is orthogonal to $\mathcal{D}_2$.

$d_{(2)}$ is constructed (from $r_{(2)}$) to be $A$-orthogonal to $\mathcal{D}_2$.

Page 29

[Figure: the two steps of Conjugate Gradients from $x_{(0)}$, drawn over the contours.]

The method of Conjugate Gradients.
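A minimal sketch of the (linear) Conjugate Gradient method; on the 2-d sample problem it reaches the solution in two steps, as the figure shows.

```python
import numpy as np

def conjugate_gradients(A, b, x, tol=1e-10):
    """Conjugate Gradients for symmetric positive-definite A."""
    r = b - A @ x  # residual
    d = r.copy()   # first search direction
    delta = r @ r
    for _ in range(len(b)):
        q = A @ d
        alpha = delta / (d @ q)  # exact line search
        x = x + alpha * d
        r = r - alpha * q
        delta_new = r @ r
        if np.sqrt(delta_new) < tol:
            break
        d = r + (delta_new / delta) * d  # new direction, conjugate to the old
        delta = delta_new
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradients(A, b, np.array([-2.0, -2.0])))  # -> [2, -2]
```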

Page 30

[Figure: four panels plotting polynomials against $\lambda$, with marks at the eigenvalues 2 and 7. (a) $P_0(\lambda)$. (b) $P_1(\lambda)$. (c) $P_2(\lambda)$. (d) $P_2(\lambda)$.]

The convergence of CG after $i$ iterations depends on how close a polynomial $P_i$ of degree $i$ can be to zero on each eigenvalue, given the constraint that $P_i(0) = 1$.

Page 31

[Figure: four panels plotting $T_2(\omega)$, $T_5(\omega)$, $T_{10}(\omega)$, and $T_{49}(\omega)$ on the interval $[-1, 1]$.]

Chebyshev polynomials of degree 2, 5, 10, and 49.
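A sketch using the closed form $T_i(\omega) = \frac{1}{2}\left[(\omega + \sqrt{\omega^2 - 1})^i + (\omega - \sqrt{\omega^2 - 1})^i\right]$, which oscillates between $-1$ and 1 on $[-1, 1]$.

```python
import numpy as np

def chebyshev(i, w):
    """Chebyshev polynomial T_i(w) via the closed form; complex
    arithmetic handles |w| < 1, where w^2 - 1 is negative."""
    w = np.asarray(w, dtype=complex)
    t = 0.5 * ((w + np.sqrt(w**2 - 1))**i + (w - np.sqrt(w**2 - 1))**i)
    return t.real

print(chebyshev(2, 0.5))   # T_2(w) = 2w^2 - 1, so this is -0.5
print(chebyshev(49, 1.0))  # T_i(1) = 1 for every degree i
```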

Page 32

[Figure: the polynomial $P_2(\lambda)$ plotted for $\lambda$ between 0 and 8.]

The optimal polynomial $P_2(\lambda)$ for $\lambda_{\min} = 2$ and $\lambda_{\max} = 7$ in the general case.

$\|e\|_A$ is reduced by a factor of at least 0.183 after two iterations of CG.
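The 0.183 figure can be reproduced from the Chebyshev analysis: the reduction factor after $i$ iterations is $T_i\left(\frac{\kappa+1}{\kappa-1}\right)^{-1}$, with $\kappa = 7/2$ here.

```python
import numpy as np

def cg_bound(i, kappa):
    """Energy-norm reduction factor of CG after i iterations:
    ||e_(i)||_A <= T_i((kappa + 1)/(kappa - 1))^{-1} ||e_(0)||_A."""
    w = (kappa + 1.0) / (kappa - 1.0)
    t = 0.5 * ((w + np.sqrt(w**2 - 1))**i + (w - np.sqrt(w**2 - 1))**i)
    return 1.0 / t

print(cg_bound(2, 3.5))  # ~0.183, the factor quoted above
```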

Page 33

[Figure: the per-iteration convergence bound of CG plotted against the condition number, from 0 to 100.]

Convergence of Conjugate Gradients (per iteration) as afunction of condition number.

[Figure: iterations of Steepest Descent per iteration of CG, plotted against condition numbers up to 1000.]

Number of iterations of Steepest Descent required to matchone iteration of CG.

Page 34

[Figure: contour plot in the $(x_1, x_2)$ plane.]

Contour lines of the quadratic form of the diagonally preconditioned sample problem. The condition number has improved from 3.5 to roughly 2.8.
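A sketch of where 2.8 comes from: with the diagonal (Jacobi) preconditioner $M = \mathrm{diag}(A)$, the relevant spectrum is that of $M^{-1/2} A M^{-1/2}$.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])

def condition_number(M):
    ev = np.linalg.eigvalsh(M)  # ascending eigenvalues of a symmetric matrix
    return ev[-1] / ev[0]

Minv_sqrt = np.diag(1.0 / np.sqrt(np.diag(A)))  # M^{-1/2} for M = diag(A)
A_pre = Minv_sqrt @ A @ Minv_sqrt

print(condition_number(A))      # 3.5
print(condition_number(A_pre))  # ~2.78, i.e. roughly 2.8
```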

Page 35

[Figure: four panels over the $(x_1, x_2)$ plane: (a) a surface plot of $f(x)$; (b) an iterate path from $x_{(0)}$; (c) a 1-d cross-section of the surface; (d) another iterate path from $x_{(0)}$.]

The nonlinear Conjugate Gradient Method.

(b) Fletcher-Reeves CG.

(c) Cross-section of the surface corresponding to the first step of Fletcher-Reeves.

(d) Polak-Ribière CG.
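The two variants differ only in the formula for $\beta$, the weight on the previous search direction; a sketch of both (the $\max\{\cdot, 0\}$ clamp restarts Polak-Ribière whenever the value would go negative).

```python
import numpy as np

# r is the current residual (negative gradient), r_old the previous one.
def beta_fletcher_reeves(r, r_old):
    return (r @ r) / (r_old @ r_old)

def beta_polak_ribiere(r, r_old):
    # Clamping at zero discards the old direction: an automatic restart.
    return max(0.0, (r @ (r - r_old)) / (r_old @ r_old))
```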

Page 36

[Figure: an iterate path of nonlinear CG from $x_{(0)}$ in the $(x_1, x_2)$ plane.]

Nonlinear CG can be more effective with periodic restarts.

Page 37

[Figure: a 1-d function and a parabola fitted to it at one point.]

The Newton-Raphson method.

Solid curve: The function to minimize.

Dashed curve: Parabolic approximation to the function, based on first and second derivatives at $x$.

$x$ is chosen at the base of the parabola.
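A 1-d sketch on a hypothetical function $f(x) = x^4/4 + x^2/2 - x$ (the report's curve is not specified): each step jumps to the base of the local parabola, $x \leftarrow x - f'(x)/f''(x)$.

```python
def newton_raphson(df, d2f, x, iters=10):
    """Newton-Raphson minimization in 1-d: repeatedly move to the
    base of the parabolic approximation at the current point."""
    for _ in range(iters):
        x = x - df(x) / d2f(x)
    return x

# Hypothetical example: f(x) = x^4/4 + x^2/2 - x.
df  = lambda x: x**3 + x - 1   # f'(x)
d2f = lambda x: 3 * x**2 + 1   # f''(x)
print(newton_raphson(df, d2f, x=1.0))  # -> ~0.6823, where f'(x) = 0
```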

Page 38

[Figure: a 1-d function and a parabola fitted from two sample points.]

The Secant method.

Solid curve: The function to minimize.

Dashed curve: Parabolic approximation to the function, based on first derivatives at $x = 0$ and $x = 2$.

$x$ is chosen at the base of the parabola.
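The same hypothetical function, minimized by the Secant method starting from the two points in the caption: $f''$ is approximated by the slope of $f'$ between the two most recent points, so no second derivative is needed.

```python
def secant(df, x0, x1, iters=10):
    """Secant minimization in 1-d: a Newton-like step with f''
    approximated from two evaluations of f'."""
    for _ in range(iters):
        d2f_approx = (df(x1) - df(x0)) / (x1 - x0)
        x0, x1 = x1, x1 - df(x1) / d2f_approx
    return x1

df = lambda x: x**3 + x - 1  # f'(x) for f(x) = x^4/4 + x^2/2 - x
print(secant(df, x0=0.0, x1=2.0))  # -> ~0.6823, matching Newton-Raphson
```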

Page 39

[Figure: an iterate path of preconditioned nonlinear CG from $x_{(0)}$ in the $(x_1, x_2)$ plane.]

The preconditioned nonlinear Conjugate Gradient Method.

Polak-Ribière formula and a diagonal preconditioner.

The space has been “stretched” to show the improvement in circularity of the contour lines around the minimum.