Conjugate Gradients I: Setup
CS 205A: Mathematical Methods for Robotics, Vision, and Graphics
Doug James (and Justin Solomon)
CS 205A: Mathematical Methods Conjugate Gradients I: Setup 1 / 1
Time for Gaussian Elimination

$A \in \mathbb{R}^{n \times n} \implies O(n^3)$
Common Case

"Easy to apply, hard to invert."

- Sparse matrices
- Special structure
New Philosophy

Iteratively improve an approximation rather than
solving in closed form.
For Today

$A\vec{x} = \vec{b}$

- Square
- Symmetric
- Positive definite
Variational Viewpoint

$A\vec{x} = \vec{b}$
$\Updownarrow$
$\min_{\vec{x}} \left[ \frac{1}{2}\vec{x}^\top A\vec{x} - \vec{b}^\top\vec{x} + c \right]$
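This equivalence can be checked numerically: for symmetric positive definite $A$, the gradient of $f(\vec{x}) = \frac{1}{2}\vec{x}^\top A\vec{x} - \vec{b}^\top\vec{x} + c$ is $A\vec{x} - \vec{b}$, so the unique minimizer is exactly the solution of $A\vec{x} = \vec{b}$. A minimal sketch (the matrix and vector below are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
b = np.array([1.0, 2.0])

def f(x, c=0.0):
    return 0.5 * x @ A @ x - b @ x + c

def grad_f(x):
    return A @ x - b                      # gradient of the quadratic

x_star = np.linalg.solve(A, b)            # closed-form solution of Ax = b
assert np.allclose(grad_f(x_star), 0.0)   # gradient vanishes at the solution

# f is strictly larger at any perturbed point
rng = np.random.default_rng(0)
for _ in range(5):
    assert f(x_star) < f(x_star + rng.normal(size=2))
```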
Gradient Descent Strategy

1. Compute the search direction
   $\vec{d}_k \equiv -\nabla f(\vec{x}_{k-1}) = \vec{b} - A\vec{x}_{k-1}$
2. Do a line search to find
   $\vec{x}_k \equiv \vec{x}_{k-1} + \alpha_k \vec{d}_k.$
Line Search Along $\vec{d}$ from $\vec{x}$

$\min_\alpha g(\alpha) \equiv f(\vec{x} + \alpha\vec{d})$

$\alpha = \dfrac{\vec{d}^\top(\vec{b} - A\vec{x})}{\vec{d}^\top A\vec{d}}$
Gradient Descent with Closed-Form Line Search

$\vec{d}_k = \vec{b} - A\vec{x}_{k-1}$
$\alpha_k = \dfrac{\vec{d}_k^\top \vec{d}_k}{\vec{d}_k^\top A\vec{d}_k}$
$\vec{x}_k = \vec{x}_{k-1} + \alpha_k\vec{d}_k$
Convergence

See book.

$\dfrac{f(\vec{x}_k) - f(\vec{x}^*)}{f(\vec{x}_{k-1}) - f(\vec{x}^*)} \le 1 - \dfrac{1}{\operatorname{cond} A}$

Conclusions:
- Conditioning affects speed and quality
- Unconditional convergence ($\operatorname{cond} A \ge 1$)
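The per-step bound can be observed numerically. A sketch on a randomly generated SPD system (all constants below are illustrative; the small slack term guards against floating-point roundoff):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)          # SPD by construction
b = rng.normal(size=5)
x_star = np.linalg.solve(A, b)

f = lambda x: 0.5 * x @ A @ x - b @ x
bound = 1.0 - 1.0 / np.linalg.cond(A)

x = np.zeros(5)
for _ in range(20):
    d = b - A @ x
    f_prev = f(x)
    x = x + (d @ d) / (d @ (A @ d)) * d
    # each step shrinks the objective gap at least by the factor 1 - 1/cond(A)
    assert f(x) - f(x_star) <= bound * (f_prev - f(x_star)) + 1e-9
```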
Can We Do Better?

- Can iterate forever: should stop after $O(n)$ iterations!
- Lots of repeated work when poorly conditioned
Observation

$f(\vec{x}) = \frac{1}{2}(\vec{x} - \vec{x}^*)^\top A(\vec{x} - \vec{x}^*) + \text{const.}$

$A = LL^\top \implies f(\vec{x}) = \frac{1}{2}\|L^\top(\vec{x} - \vec{x}^*)\|_2^2 + \text{const.}$
Substitution

$\vec{y} \equiv L^\top\vec{x}, \quad \vec{y}^* \equiv L^\top\vec{x}^*$
$\implies \bar{f}(\vec{y}) = \|\vec{y} - \vec{y}^*\|_2^2$

Proposition
Suppose $\{\vec{w}_1, \ldots, \vec{w}_n\}$ are orthogonal in $\mathbb{R}^n$.
Then, $\bar{f}$ is minimized in at most $n$ steps by line
searching in direction $\vec{w}_1$, then direction $\vec{w}_2$, and
so on.
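A numerical sketch of the proposition: exact line search along $n$ orthonormal directions minimizes $\bar{f}(\vec{y}) = \|\vec{y} - \vec{y}^*\|_2^2$ in at most $n$ steps. (The target and directions below are random illustrative choices; the directions come from a QR factorization.)

```python
import numpy as np

rng = np.random.default_rng(2)
y_star = rng.normal(size=4)                    # unknown minimizer of fbar
W, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # columns: orthonormal directions

y = np.zeros(4)
for i in range(4):
    w = W[:, i]
    alpha = w @ (y_star - y)   # minimizes ||y + alpha*w - y_star||^2 over alpha
    y = y + alpha * w          # later steps never disturb earlier components

assert np.allclose(y, y_star)  # exact minimizer after n = 4 line searches
```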
Visualization

In 2D, any two orthogonal directions suffice!
Undoing Change of Coordinates

Line search on $\bar{f}$ along $\vec{w}$ is the same as
line search on $f$ along $(L^\top)^{-1}\vec{w}$.

Take $\vec{w}_i = L^\top\vec{v}_i$:
$0 = \vec{w}_i \cdot \vec{w}_j = (L^\top\vec{v}_i)^\top(L^\top\vec{v}_j) = \vec{v}_i^\top(LL^\top)\vec{v}_j = \vec{v}_i^\top A\vec{v}_j$
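The chain of equalities above can be verified directly: with the Cholesky factorization $A = LL^\top$, the ordinary dot product after mapping by $L^\top$ equals the $A$-inner product. A sketch on a random SPD matrix (all inputs below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
A = M @ M.T + 4 * np.eye(4)      # SPD, so a Cholesky factorization exists
L = np.linalg.cholesky(A)        # lower-triangular L with A = L @ L.T

v, w = rng.normal(size=4), rng.normal(size=4)
lhs = (L.T @ v) @ (L.T @ w)      # ordinary dot product after the map L^T
rhs = v @ A @ w                  # A-inner product
assert np.isclose(lhs, rhs)      # the two inner products agree
```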
Conjugate Directions

A-Conjugate Vectors
Two vectors $\vec{v}$ and $\vec{w}$ are A-conjugate if $\vec{v}^\top A\vec{w} = 0$.

Corollary
Suppose $\{\vec{v}_1, \ldots, \vec{v}_n\}$ are A-conjugate. Then, $f$
is minimized in at most $n$ steps by line searching
in direction $\vec{v}_1$, then direction $\vec{v}_2$, and so on.
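A sketch of the corollary, using one convenient source of A-conjugate directions: eigenvectors of symmetric $A$ are A-conjugate, since $\vec{v}_i^\top A\vec{v}_j = \lambda_j\,\vec{v}_i\cdot\vec{v}_j = 0$ for $i \ne j$. Line searching along them solves the system in exactly $n$ steps (the random SPD system below is illustrative; this is not how conjugate gradients finds its directions):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)      # SPD
b = rng.normal(size=5)
_, V = np.linalg.eigh(A)         # columns of V: A-conjugate directions

x = np.zeros(5)
for i in range(5):
    v = V[:, i]
    # check A-conjugacy against all previous directions
    assert all(abs(v @ A @ V[:, j]) < 1e-8 for j in range(i))
    alpha = (v @ (b - A @ x)) / (v @ (A @ v))   # closed-form line search
    x = x + alpha * v

assert np.allclose(A @ x, b)     # exact solution after n = 5 line searches
```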
High-Level Ideas So Far

- Steepest descent may not be fastest descent
  (surprising!)
- Two inner products:
  $\vec{v} \cdot \vec{w}$ and $\langle\vec{v}, \vec{w}\rangle_A \equiv \vec{v}^\top A\vec{w}$
New Problem

Find $n$ A-conjugate directions.
Gram-Schmidt?

- Potentially unstable
- Storage increases with each iteration
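For concreteness, here is a sketch of Gram-Schmidt carried out in the $\langle\cdot,\cdot\rangle_A$ inner product (not the CG update). Note that conjugating each new direction requires storing and touching every previous direction, which is exactly the storage/work objection above. All inputs are illustrative random choices:

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(4, 4))
A = M @ M.T + 4 * np.eye(4)                      # SPD

dirs = []                                        # all past directions kept!
for k in range(4):
    v = rng.normal(size=4)                       # candidate direction
    for u in dirs:                               # subtract A-projection onto
        v = v - (u @ A @ v) / (u @ A @ u) * u    # every previous direction
    dirs.append(v)

for i in range(4):
    for j in range(i):
        assert abs(dirs[i] @ A @ dirs[j]) < 1e-8  # pairwise A-conjugate
```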
Another Clue

$\vec{r}_k \equiv \vec{b} - A\vec{x}_k$
$\vec{r}_{k+1} = \vec{r}_k - \alpha_{k+1} A\vec{v}_{k+1}$

Proposition
When performing gradient descent on $f$,
$\operatorname{span}\{\vec{r}_0, \ldots, \vec{r}_k\} = \operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^k\vec{r}_0\}$.

Krylov space?!
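A numerical sketch of the proposition: the residuals produced by gradient descent span the same subspace as the Krylov vectors $\vec{r}_0, A\vec{r}_0, \ldots, A^k\vec{r}_0$. Equality of spans is checked with a rank test (the random SPD system below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.normal(size=5)

x = np.zeros(5)
residuals = []
for _ in range(3):
    r = b - A @ x
    residuals.append(r)
    x = x + (r @ r) / (r @ (A @ r)) * r          # gradient-descent step

R = np.column_stack(residuals)                   # [r_0 | r_1 | r_2]
r0 = residuals[0]
K = np.column_stack([r0, A @ r0, A @ A @ r0])    # [r_0 | A r_0 | A^2 r_0]

# equal spans: stacking either basis onto the other adds no rank
rank = np.linalg.matrix_rank
assert rank(R) == rank(K) == rank(np.hstack([R, K])) == 3
```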
Gradient Descent: Issue

$\vec{x}_k - \vec{x}_0 \ne \operatorname*{arg\,min}_{\vec{v} \in \operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^{k-1}\vec{r}_0\}} f(\vec{x}_0 + \vec{v})$

But if this did hold... convergence in $n$ steps!