
  • Conjugate Gradients I: Setup

    CS 205A: Mathematical Methods for Robotics, Vision, and Graphics

    Doug James (and Justin Solomon)


  • Time for Gaussian Elimination

    $A \in \mathbb{R}^{n \times n} \implies O(n^3)$

  • Common Case

    "Easy to apply, hard to invert."

    - Sparse matrices
    - Special structure

  • New Philosophy

    Iteratively improve the approximation rather than solve in closed form.

  • For Today

    $A\vec{x} = \vec{b}$

    - Square
    - Symmetric
    - Positive definite

  • Variational Viewpoint

    Solving $A\vec{x} = \vec{b}$ is equivalent to minimizing

    $f(\vec{x}) = \tfrac{1}{2}\vec{x}^\top A\vec{x} - \vec{b}^\top\vec{x} + c$
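
    Why the equivalence holds: for symmetric $A$, $\nabla f(\vec{x}) = A\vec{x} - \vec{b}$, so the minimizer of $f$ is exactly the solution of the system. A minimal NumPy check (the test matrix, right-hand side, and tolerances are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive definite test matrix (illustrative choice):
# B B^T is positive semidefinite; adding n*I makes it positive definite.
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
b = rng.standard_normal(n)

def f(x):
    """The quadratic from the slide, with the constant c dropped."""
    return 0.5 * x @ A @ x - b @ x

x_star = np.linalg.solve(A, b)   # solution of A x = b

# Central-difference gradient of f at x_star should vanish.
eps = 1e-6
grad = np.array([(f(x_star + eps * e) - f(x_star - eps * e)) / (2 * eps)
                 for e in np.eye(n)])
print(np.allclose(grad, 0.0, atol=1e-4))  # True: x* is a critical point of f
```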

  • Gradient Descent Strategy

    1. Compute the search direction
       $\vec{d}_k \equiv -\nabla f(\vec{x}_{k-1}) = \vec{b} - A\vec{x}_{k-1}$

    2. Do a line search to find
       $\vec{x}_k \equiv \vec{x}_{k-1} + \alpha_k\vec{d}_k$.

  • Line Search Along $\vec{d}$ from $\vec{x}$

    $\min_\alpha g(\alpha) \equiv f(\vec{x} + \alpha\vec{d})$

    $\alpha = \dfrac{\vec{d}^\top(\vec{b} - A\vec{x})}{\vec{d}^\top A\vec{d}}$
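
    Filling in the step the slide leaves implicit: differentiate $g$ and solve for the stationary $\alpha$.

    $$g'(\alpha) = \vec{d}^\top \nabla f(\vec{x} + \alpha\vec{d}) = \vec{d}^\top\bigl(A(\vec{x} + \alpha\vec{d}) - \vec{b}\bigr) = \alpha\,\vec{d}^\top A\vec{d} - \vec{d}^\top(\vec{b} - A\vec{x})$$

    Setting $g'(\alpha) = 0$ and dividing by $\vec{d}^\top A\vec{d} > 0$ (positive definiteness) recovers the formula above; positive definiteness also makes $g$ convex, so this stationary point is the minimum.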

  • Gradient Descent with Closed-Form Line Search

    $\vec{d}_k = \vec{b} - A\vec{x}_{k-1}$

    $\alpha_k = \dfrac{\vec{d}_k^\top\vec{d}_k}{\vec{d}_k^\top A\vec{d}_k}$

    $\vec{x}_k = \vec{x}_{k-1} + \alpha_k\vec{d}_k$
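
    The three updates translate line-for-line into code. A minimal NumPy sketch (the function name, tolerance, and iteration cap are illustrative choices):

```python
import numpy as np

def gradient_descent_solve(A, b, tol=1e-10, max_iter=10_000):
    """Solve A x = b (A symmetric positive definite) by gradient descent
    with the closed-form line search from the slide."""
    x = np.zeros_like(b)
    for _ in range(max_iter):
        d = b - A @ x                    # d_k: residual = -grad f
        if np.linalg.norm(d) < tol:
            break                        # residual ~ 0: converged
        alpha = (d @ d) / (d @ A @ d)    # alpha_k: optimal step
        x = x + alpha * d                # x_k: line-search update
    return x

# Usage on a small SPD system (illustrative data):
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)
b = rng.standard_normal(4)
x = gradient_descent_solve(A, b)
print(np.allclose(A @ x, b, atol=1e-8))  # True
```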

  • Convergence

    See book.

    $\dfrac{f(\vec{x}_k) - f(\vec{x}^*)}{f(\vec{x}_{k-1}) - f(\vec{x}^*)} \le 1 - \dfrac{1}{\operatorname{cond} A}$

    Conclusions:

    - Conditioning affects speed and quality
    - Unconditional convergence ($\operatorname{cond} A \ge 1$)
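
    An empirical look at the bound (a hedged sketch; the diagonal test matrix makes $\operatorname{cond} A$ easy to control, and the step count is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

diag = np.array([1.0, 3.0, 10.0, 50.0])   # eigenvalues; cond(A) = 50
A = np.diag(diag)
b = rng.standard_normal(4)
x_star = b / diag
cond_A = diag.max() / diag.min()

f = lambda x: 0.5 * x @ A @ x - b @ x

x = np.zeros(4)
gap = f(x) - f(x_star)
for _ in range(20):
    d = b - A @ x
    if np.linalg.norm(d) < 1e-14:
        break
    x = x + (d @ d) / (d @ A @ d) * d
    new_gap = f(x) - f(x_star)
    # Each per-step ratio should respect the slide's bound 1 - 1/cond(A).
    assert new_gap <= (1 - 1 / cond_A) * gap + 1e-12
    gap = new_gap
print("all ratios within the bound 1 - 1/cond(A) =", 1 - 1 / cond_A)
```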

  • Visualization


  • Can We Do Better?

    - Can iterate forever: should stop after $O(n)$ iterations!
    - Lots of repeated work when poorly conditioned

  • Observation

    $f(\vec{x}) = \tfrac{1}{2}(\vec{x} - \vec{x}^*)^\top A(\vec{x} - \vec{x}^*) + \text{const.}$

    $A = LL^\top \implies f(\vec{x}) = \tfrac{1}{2}\bigl\|L^\top(\vec{x} - \vec{x}^*)\bigr\|_2^2 + \text{const.}$
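
    A quick numeric sanity check of the rewritten objective, using NumPy's Cholesky factorization (random test data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)          # SPD, so A = L L^T exists
b = rng.standard_normal(n)

L = np.linalg.cholesky(A)            # lower-triangular Cholesky factor
x_star = np.linalg.solve(A, b)
x = rng.standard_normal(n)

lhs = 0.5 * (x - x_star) @ A @ (x - x_star)
rhs = 0.5 * np.linalg.norm(L.T @ (x - x_star)) ** 2
print(np.isclose(lhs, rhs))          # True: the two forms of f agree
```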

  • Substitution

    $\vec{y} \equiv L^\top\vec{x}, \qquad \vec{y}^* \equiv L^\top\vec{x}^*$

    $\implies \bar{f}(\vec{y}) = \|\vec{y} - \vec{y}^*\|_2^2$

    Proposition
    Suppose $\{\vec{w}_1, \ldots, \vec{w}_n\}$ are orthogonal in $\mathbb{R}^n$. Then, $\bar{f}$ is minimized in at most $n$ steps by line searching in direction $\vec{w}_1$, then direction $\vec{w}_2$, and so on.
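
    The proposition can be demonstrated directly: line search $\bar{f}$ along $n$ orthogonal directions in turn, and the minimizer appears after the $n$-th step. A sketch (orthogonal basis taken from a QR factorization; data illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
y_star = rng.standard_normal(n)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthonormal columns

y = np.zeros(n)
for i in range(n):
    w = Q[:, i]
    # Exact line search of f_bar(y + t w) = ||y + t w - y_star||^2;
    # the optimal t is w . (y_star - y) since w . w = 1 for QR columns.
    t = w @ (y_star - y)
    y = y + t * w
print(np.allclose(y, y_star))  # True after exactly n line searches
```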

  • Visualization

    Any two orthogonal directions suffice!


  • Undoing Change of Coordinates

    Line search on $\bar{f}$ along $\vec{w}$ is the same as line search on $f$ along $(L^\top)^{-1}\vec{w}$.

    Writing $\vec{w}_i = L^\top\vec{v}_i$:

    $0 = \vec{w}_i \cdot \vec{w}_j = (L^\top\vec{v}_i)^\top(L^\top\vec{v}_j) = \vec{v}_i^\top(LL^\top)\vec{v}_j = \vec{v}_i^\top A\vec{v}_j$

  • Conjugate Directions

    A-Conjugate Vectors
    Two vectors $\vec{v}$ and $\vec{w}$ are A-conjugate if $\vec{v}^\top A\vec{w} = 0$.

    Corollary
    Suppose $\{\vec{v}_1, \ldots, \vec{v}_n\}$ are A-conjugate. Then, $f$ is minimized in at most $n$ steps by line searching in direction $\vec{v}_1$, then direction $\vec{v}_2$, and so on.
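
    Pairing the corollary with the closed-form line search yields an $n$-step solver once A-conjugate directions are in hand. In this sketch the directions come from the eigenvectors of $A$ (orthogonal for symmetric $A$, hence A-conjugate); that construction is for illustration only and is not how conjugate gradients will build them:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
b = rng.standard_normal(n)

# Eigenvectors of symmetric A are orthogonal, so for i != j:
# v_i^T A v_j = lambda_j (v_i . v_j) = 0, i.e. they are A-conjugate.
_, V = np.linalg.eigh(A)

x = np.zeros(n)
for i in range(n):
    v = V[:, i]
    alpha = v @ (b - A @ x) / (v @ A @ v)   # closed-form line search
    x = x + alpha * v
print(np.allclose(A @ x, b))                # True: done in n line searches
```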

  • High-Level Ideas So Far

    - Steepest descent may not be fastest descent (surprising!)
    - Two inner products: $\vec{v} \cdot \vec{w}$ and $\langle\vec{v}, \vec{w}\rangle_A \equiv \vec{v}^\top A\vec{w}$

  • New Problem

    Find n A-conjugate directions.


  • Gram-Schmidt?

    - Potentially unstable
    - Storage increases with each iteration
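
    For concreteness, a sketch of Gram-Schmidt in the $\langle\cdot, \cdot\rangle_A$ inner product (classical variant; the function name and test data are illustrative). It produces A-conjugate directions on small examples but shows exactly the drawbacks above: every new direction is projected against all stored predecessors.

```python
import numpy as np

def a_conjugate_gram_schmidt(A, basis):
    """A-orthogonalize the columns of `basis` via classical Gram-Schmidt
    in <v, w>_A = v^T A w. All previous directions stay in memory."""
    dirs = []
    for u in basis.T:
        v = u.copy()
        for d in dirs:                       # k stored vectors at step k
            v = v - (d @ A @ u) / (d @ A @ d) * d
        dirs.append(v)
    return np.column_stack(dirs)

# Usage (illustrative data): A-conjugate directions from the standard basis.
rng = np.random.default_rng(6)
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
V = a_conjugate_gram_schmidt(A, np.eye(n))

G = V.T @ A @ V                              # Gram matrix in <.,.>_A
print(np.allclose(G - np.diag(np.diag(G)), 0.0, atol=1e-8))  # off-diag ~ 0
```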

  • Another Clue

    $\vec{r}_k \equiv \vec{b} - A\vec{x}_k, \qquad \vec{r}_{k+1} = \vec{r}_k - \alpha_{k+1}A\vec{v}_{k+1}$

    Proposition
    When performing gradient descent on $f$,
    $\operatorname{span}\{\vec{r}_0, \ldots, \vec{r}_k\} = \operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^k\vec{r}_0\}$.

    Krylov space?!
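
    The proposition admits a quick numerical check: collect the residuals from a few gradient-descent steps and compare their span to the Krylov vectors via matrix ranks (step count and data illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
b = rng.standard_normal(n)

# Gradient descent from x_0 = 0, so r_0 = b.
x = np.zeros(n)
residuals = [b - A @ x]
k = 3
for _ in range(k):
    d = residuals[-1]
    x = x + (d @ d) / (d @ A @ d) * d
    residuals.append(b - A @ x)

R = np.column_stack(residuals)               # r_0, ..., r_k
K = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(k + 1)])

# Equal spans <=> neither basis adds rank to the other.
rank = np.linalg.matrix_rank
print(rank(R) == rank(K) == rank(np.hstack([R, K])))  # True
```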

  • Gradient Descent: Issue

    $\vec{x}_k - \vec{x}_0 \ne \operatorname*{arg\,min}_{\vec{v} \in \operatorname{span}\{\vec{r}_0, A\vec{r}_0, \ldots, A^{k-1}\vec{r}_0\}} f(\vec{x}_0 + \vec{v})$

    But if this did hold... convergence in $n$ steps!

    Next