

Conjugate Gradients I: Setup

CS 205A: Mathematical Methods for Robotics, Vision, and Graphics

Doug James (and Justin Solomon)


Time for Gaussian Elimination

$A \in \mathbb{R}^{n\times n} \implies O(n^3)$

Common Case

“Easy to apply, hard to invert.”

- Sparse matrices
- Special structure
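One concrete instance (my example, not the slides'): applying the tridiagonal 1D Laplacian costs $O(n)$ time and storage, while running Gaussian elimination on a dense copy would cost $O(n^3)$. A minimal NumPy sketch:

```python
import numpy as np

def laplacian_apply(x):
    """Apply the n x n tridiagonal matrix with 2 on the diagonal and
    -1 on the off-diagonals, without ever storing the matrix."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

x = np.random.default_rng(0).standard_normal(10**6)
y = laplacian_apply(x)  # O(n) work: "easy to apply, hard to invert"
```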

New Philosophy

Iteratively improve an approximation rather than solve in closed form.

For Today

$A\vec{x} = \vec{b}$

- Square
- Symmetric
- Positive definite

Variational Viewpoint

$$A\vec{x} = \vec{b} \quad\Longleftrightarrow\quad \min_{\vec{x}} \left[\tfrac{1}{2}\vec{x}^\top A\vec{x} - \vec{b}^\top\vec{x} + c\right]$$
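The equivalence follows from a standard computation (the constant $c$ is irrelevant to the minimizer): for symmetric $A$,

$$\nabla f(\vec{x}) = A\vec{x} - \vec{b},$$

so the critical point satisfies $A\vec{x} = \vec{b}$, and positive definiteness makes $f$ strictly convex, so that critical point is the unique minimum.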

Gradient Descent Strategy

1. Compute search direction
   $$\vec{d}_k \equiv -\nabla f(\vec{x}_{k-1}) = \vec{b} - A\vec{x}_{k-1}$$

2. Do line search to find
   $$\vec{x}_k \equiv \vec{x}_{k-1} + \alpha_k\vec{d}_k.$$

Line Search Along $\vec{d}$ from $\vec{x}$

$$\min_\alpha g(\alpha) \equiv f(\vec{x} + \alpha\vec{d})$$

$$\alpha = \frac{\vec{d}^\top(\vec{b} - A\vec{x})}{\vec{d}^\top A\vec{d}}$$
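Filling in the step the slides skip: since $\nabla f(\vec{x}) = A\vec{x} - \vec{b}$,

$$g'(\alpha) = \vec{d}^\top\big(A(\vec{x} + \alpha\vec{d}) - \vec{b}\big) = \alpha\,\vec{d}^\top A\vec{d} - \vec{d}^\top(\vec{b} - A\vec{x}),$$

and setting $g'(\alpha) = 0$ gives the closed-form step above ($\vec{d}^\top A\vec{d} > 0$ because $A$ is positive definite).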

Gradient Descent with Closed-Form Line Search

$$\vec{d}_k = \vec{b} - A\vec{x}_{k-1}$$
$$\alpha_k = \frac{\vec{d}_k^\top\vec{d}_k}{\vec{d}_k^\top A\vec{d}_k}$$
$$\vec{x}_k = \vec{x}_{k-1} + \alpha_k\vec{d}_k$$

(Here the numerator of $\alpha_k$ simplifies to $\vec{d}_k^\top\vec{d}_k$ because the search direction is itself the residual $\vec{b} - A\vec{x}_{k-1}$.)
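A minimal NumPy sketch of this iteration (the function name, tolerance, and toy test system are my own choices, not from the slides):

```python
import numpy as np

def gradient_descent(A, b, x0=None, tol=1e-10, max_iter=10_000):
    """Minimize f(x) = 0.5 x^T A x - b^T x for symmetric positive
    definite A, using the closed-form line search above."""
    x = np.zeros(len(b)) if x0 is None else x0.astype(float).copy()
    for k in range(max_iter):
        d = b - A @ x                    # residual = -grad f = search direction
        if np.linalg.norm(d) < tol:      # small residual: converged
            return x, k
        alpha = (d @ d) / (d @ (A @ d))  # closed-form step length
        x = x + alpha * d
    return x, max_iter

# Toy test: a well-conditioned SPD system
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50.0 * np.eye(50)          # SPD by construction
b = rng.standard_normal(50)
x, iters = gradient_descent(A, b)
print(iters, np.linalg.norm(A @ x - b))  # converges quickly here
```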

Convergence

See book.

$$\frac{f(\vec{x}_k) - f(\vec{x}^*)}{f(\vec{x}_{k-1}) - f(\vec{x}^*)} \leq 1 - \frac{1}{\operatorname{cond} A}$$

Conclusions:

- Conditioning affects speed and quality
- Unconditional convergence ($\operatorname{cond} A \geq 1$)
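A consequence worth making explicit (my arithmetic, not the slides'): the error contracts by at least $1 - 1/\operatorname{cond} A$ per step, and

$$\left(1 - \frac{1}{\operatorname{cond} A}\right)^k \le e^{-k/\operatorname{cond} A} \le \epsilon \quad\text{once}\quad k \ge \operatorname{cond} A \cdot \ln(1/\epsilon),$$

so the number of iterations needed for a fixed accuracy grows linearly with the condition number.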

Visualization


Can We Do Better?

- Can iterate forever: should be able to stop after $O(n)$ iterations!
- Lots of repeated work when poorly conditioned

Observation

$$f(\vec{x}) = \tfrac{1}{2}(\vec{x} - \vec{x}^*)^\top A(\vec{x} - \vec{x}^*) + \text{const.}$$

$$A = LL^\top \implies f(\vec{x}) = \tfrac{1}{2}\,\|L^\top(\vec{x} - \vec{x}^*)\|_2^2 + \text{const.}$$
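A quick NumPy check of this identity; the constant is $f(\vec{x}^*)$, and the matrix, sizes, and names below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)      # SPD
b = rng.standard_normal(4)
x_star = np.linalg.solve(A, b)     # minimizer of f
L = np.linalg.cholesky(A)          # A = L L^T

def f(x):
    return 0.5 * x @ A @ x - b @ x

x = rng.standard_normal(4)
lhs = f(x) - f(x_star)             # f(x) minus its constant part
rhs = 0.5 * np.linalg.norm(L.T @ (x - x_star)) ** 2
print(np.isclose(lhs, rhs))        # True
```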

Substitution

$$\vec{y} \equiv L^\top\vec{x}, \qquad \vec{y}^* \equiv L^\top\vec{x}^* \quad\implies\quad \bar{f}(\vec{y}) = \|\vec{y} - \vec{y}^*\|_2^2$$

Proposition
Suppose $\{\vec{w}_1, \ldots, \vec{w}_n\}$ are orthogonal in $\mathbb{R}^n$. Then, $\bar{f}$ is minimized in at most $n$ steps by line searching in direction $\vec{w}_1$, then direction $\vec{w}_2$, and so on. (In an orthogonal basis, $\bar{f}$ splits into $n$ independent one-dimensional problems, so each line search settles one coordinate for good.)

Visualization

Any two orthogonal directions suffice!


Undoing Change of Coordinates

Line search on $\bar{f}$ along $\vec{w}$ is the same as line search on $f$ along $(L^\top)^{-1}\vec{w}$. With $\vec{w}_i = L^\top\vec{v}_i$,

$$0 = \vec{w}_i \cdot \vec{w}_j = (L^\top\vec{v}_i)^\top(L^\top\vec{v}_j) = \vec{v}_i^\top(LL^\top)\vec{v}_j = \vec{v}_i^\top A\vec{v}_j.$$

Conjugate Directions

A-Conjugate Vectors
Two vectors $\vec{v}$ and $\vec{w}$ are A-conjugate if $\vec{v}^\top A\vec{w} = 0$.

Corollary
Suppose $\{\vec{v}_1, \ldots, \vec{v}_n\}$ are A-conjugate. Then, $f$ is minimized in at most $n$ steps by line searching in direction $\vec{v}_1$, then direction $\vec{v}_2$, and so on.
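A minimal sketch of the corollary in action, assuming we already have $n$ A-conjugate directions; here they come from naive Gram-Schmidt in the $A$-inner product, and the sizes and names are illustrative only. It uses the general step $\alpha = \vec{v}^\top(\vec{b} - A\vec{x})/\vec{v}^\top A\vec{v}$ from the line-search slide:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)     # SPD
b = rng.standard_normal(6)

# A-conjugate directions via Gram-Schmidt in <v, w>_A = v^T A w,
# starting from the standard basis.
dirs = []
for e in np.eye(6):
    v = e.copy()
    for u in dirs:
        v -= ((u @ A @ e) / (u @ A @ u)) * u
    dirs.append(v)

# n line searches, one per conjugate direction
x = np.zeros(6)
for v in dirs:
    alpha = (v @ (b - A @ x)) / (v @ A @ v)
    x = x + alpha * v

print(np.allclose(x, np.linalg.solve(A, b)))  # True: exact after n steps
```

Note this direction-building step has exactly the drawbacks flagged on the Gram-Schmidt slide below: it stores every direction and orthogonalizes against all of them.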

High-Level Ideas So Far

- Steepest descent may not be fastest descent (surprising!)
- Two inner products: $\vec{v} \cdot \vec{w}$ and $\langle\vec{v}, \vec{w}\rangle_A \equiv \vec{v}^\top A\vec{w}$

New Problem

Find $n$ A-conjugate directions.

Gram-Schmidt?

- Potentially unstable
- Storage increases with each iteration

Another Clue

$$\vec{r}_k \equiv \vec{b} - A\vec{x}_k, \qquad \vec{r}_{k+1} = \vec{r}_k - \alpha_{k+1}A\vec{v}_{k+1}$$

Proposition
When performing gradient descent on $f$,
$$\operatorname{span}\{\vec{r}_0, \ldots, \vec{r}_k\} = \operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^k\vec{r}_0\}.$$

Krylov space?!
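A small numerical check of the proposition, reusing the gradient descent loop from earlier (toy size, my own variable names). Equal spans means stacking either basis onto the other cannot raise the rank:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((8, 8))
A = M @ M.T + 8.0 * np.eye(8)
b = rng.standard_normal(8)

# A few gradient descent steps, collecting the residuals r_k
x = np.zeros(8)
residuals = [b - A @ x]
for _ in range(4):
    r = residuals[-1]
    x = x + ((r @ r) / (r @ A @ r)) * r
    residuals.append(b - A @ x)

# The Krylov vectors {r_0, A r_0, ..., A^k r_0}
krylov = [residuals[0]]
for _ in range(4):
    krylov.append(A @ krylov[-1])

R = np.column_stack(residuals)
K = np.column_stack(krylov)
rank = np.linalg.matrix_rank
print(rank(np.hstack([R, K])) == rank(R) == rank(K))  # True
```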

Gradient Descent: Issue

$$\vec{x}_k - \vec{x}_0 \;\neq\; \operatorname*{arg\,min}_{\vec{v}\,\in\,\operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^{k-1}\vec{r}_0\}} f(\vec{x}_0 + \vec{v})$$

But if this did hold... convergence in $n$ steps! (The Krylov space gains at most one dimension per iteration, so by step $n$ it would be all of $\mathbb{R}^n$, and the minimum over it would be the global minimum.)

Next