  • The Fréchet derivative of a generalized matrix function

    Vanni Noferini (University of Essex)

    “Network Science meets Matrix Functions”

    University of Oxford, 1-2/9/16

    September 1st, 2016

    Vanni Noferini The Fréchet derivative of a generalized matrix function 1 / 33

  • Classical matrix functions

    Classical definition in linear algebra: given

    $f : \mathbb{C} \to \mathbb{C}$

    and a square matrix $A \in \mathbb{C}^{n \times n}$, we wish to define $f(A)$ in a way that “mimics” the properties of $f$.

    For example:

    - $A^{1/2}$ should solve $X^2 = A$;
    - $e^{tA}$ should integrate $\dot{y}(t) = A y(t)$, $y(0) = v$;
    - ...

  • Defining matrix functions from the eigendecomposition

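    From Linear Algebra 1:

    A Jordan block is a bidiagonal Toeplitz matrix with unit superdiagonal:

    $$\begin{bmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 1 & 0 \\ 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & \lambda \end{bmatrix}$$

    A matrix is in Jordan canonical form if it is the direct sum of Jordan blocks:

    $$\begin{bmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \mu \end{bmatrix}$$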

  • Defining matrix functions from the eigendecomposition

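    Rules:

    The function of a Jordan block is an upper triangular Toeplitz matrix:

    $$f\left( \begin{bmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 1 & 0 \\ 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & \lambda \end{bmatrix} \right) = \begin{bmatrix} f(\lambda) & f'(\lambda) & f''(\lambda)/2! & f'''(\lambda)/3! \\ 0 & f(\lambda) & f'(\lambda) & f''(\lambda)/2! \\ 0 & 0 & f(\lambda) & f'(\lambda) \\ 0 & 0 & 0 & f(\lambda) \end{bmatrix}$$

    $f(A_1 \oplus A_2) = f(A_1) \oplus f(A_2)$

    $f(ZAZ^{-1}) = Z f(A) Z^{-1}$

    Hence $f(A)$ is defined as long as $f$ is defined and sufficiently many times differentiable on the spectrum of $A$.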
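
    As a hedged illustration of the Jordan block rule, the sketch below builds $f(J)$ for a 4×4 Jordan block with $f = \exp$ and compares it with a truncated power series. The helper name `f_of_jordan_block`, the value $\lambda = 0.5$ and the use of NumPy are assumptions made for this example, not part of the talk.

```python
import math
import numpy as np

def f_of_jordan_block(derivs, size):
    """Upper triangular Toeplitz matrix with f^(k)(lam)/k! on the k-th superdiagonal.

    derivs[k] must hold the k-th derivative of f evaluated at the eigenvalue.
    """
    F = np.zeros((size, size))
    for k in range(size):
        F += (derivs[k] / math.factorial(k)) * np.eye(size, k=k)
    return F

lam, n = 0.5, 4
J = lam * np.eye(n) + np.eye(n, k=1)   # 4x4 Jordan block with eigenvalue lam

# f = exp: every derivative at lam equals exp(lam)
F_rule = f_of_jordan_block([math.exp(lam)] * n, n)

# Independent check: truncated power series sum_k J^k / k!
F_series = np.zeros((n, n))
term = np.eye(n)                       # J^0 / 0!
for k in range(1, 40):
    F_series += term
    term = term @ J / k                # becomes J^k / k!

print(np.allclose(F_rule, F_series))
```

    The series is used only as an independent check; it is not how matrix functions are computed in practice.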

  • Computing matrix functions

    On a computer, it is important to know the sensitivity of the attempted computation.

    A backward stable algorithm computes the exact solution of a nearby problem:

    $\hat{f}(A) = f(A + E)$.

    Whether this is close to $f(A)$ or not depends on the sensitivity of the problem and, intuitively, on the “first derivative” of $f$ at $A$.

    For a scalar, analytic $f$:

    $f(A + E) = f(A) + f'(A)E + O(E^2)$.

  • Derivatives in Banach spaces

    Let $X, Y$ be Banach spaces over $\mathbb{R}$, $x \in X$ and

    $f : X \to Y$.

    If the limit

    $$\lim_{t \to 0} \frac{f(x + te) - f(x)}{t}$$

    exists for all $e \in X$ then $f$ is Gâteaux differentiable at $x$.

    If there exists a bounded $\mathbb{R}$-linear map $L_f(x, \cdot)$ s.t.

    $$\lim_{\|h\|_X \to 0} \frac{\|f(x + h) - f(x) - L_f(x, h)\|_Y}{\|h\|_X} = 0$$

    then $f$ is Fréchet differentiable at $x$.
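
    A minimal numerical sketch of the Fréchet definition, assuming the map $f(X) = X^2$ on real 3×3 matrices (an example chosen here, not taken from the talk). For this map $L_f(X, H) = XH + HX$ and the remainder is exactly $H^2$, so the quotient in the definition should shrink linearly with $\|H\|$.

```python
import numpy as np

def f(X):
    return X @ X

def L_f(X, H):
    # Candidate Fréchet derivative of X -> X^2; linear in H
    return X @ H + H @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
H = rng.standard_normal((3, 3))
H /= np.linalg.norm(H)                 # direction with unit Frobenius norm

ratios = []
for t in [1e-1, 1e-2, 1e-3]:
    h = t * H
    remainder = f(X + h) - f(X) - L_f(X, h)
    ratios.append(np.linalg.norm(remainder) / np.linalg.norm(h))

print(ratios)                          # decreases roughly like t
```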

  • Derivatives in Banach spaces

    Fréchet differentiable $\Rightarrow$ Gâteaux differentiable, but $\not\Leftarrow$.

    If $f$ is Fréchet differentiable then

    $\mathrm{cond}_{\mathrm{abs}}(f, A) = \|L_f(A, \cdot)\|$

  • The Fréchet derivative of a classical matrix function

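    When $A$ is diagonalizable, an explicit formula is known.

    Theorem (Daleckiĭ and Kreĭn 1965)

    Let $A = ZDZ^{-1}$ and $f$ differentiable on the spectrum of $A$. Then

    $L_f(A, E) = Z \left( F \circ (Z^{-1} E Z) \right) Z^{-1}$

    with $F_{ij} = \dfrac{f(D_{ii}) - f(D_{jj})}{D_{ii} - D_{jj}}$ if $D_{ii} \neq D_{jj}$, or $F_{ij} = f'(D_{ii})$ otherwise. Here $\circ$ denotes the entrywise (Hadamard) product.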

  • Example

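    Take

    $$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \quad f(A) = e^A, \quad E = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}.$$

    Then, since $A$ is diagonal ($Z = I$) and $\dfrac{e^2 - e^{-2}}{2 - (-2)} = \dfrac{\sinh(2)}{2}$,

    $$F = \begin{bmatrix} e^2 & e^2 & \sinh(2)/2 \\ e^2 & e^2 & \sinh(2)/2 \\ \sinh(2)/2 & \sinh(2)/2 & e^{-2} \end{bmatrix} \;\Rightarrow\; L_f(A, E) = \begin{bmatrix} e^2 & 2e^2 & \sinh(2)/2 \\ 0 & 0 & 0 \\ 0 & 0 & -e^{-2} \end{bmatrix}.$$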
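
    The example can be reproduced with a short sketch of the Daleckiĭ-Kreĭn formula. The function name `daleckii_krein` and its signature are hypothetical, and the exact comparison of eigenvalues is a simplification that only suits toy data like this.

```python
import numpy as np

def daleckii_krein(f, df, Z, d, E):
    """L_f(A, E) for A = Z diag(d) Z^{-1}, via the divided-difference matrix F."""
    n = len(d)
    F = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if d[i] != d[j]:           # exact test: fine for toy data only
                F[i, j] = (f(d[i]) - f(d[j])) / (d[i] - d[j])
            else:
                F[i, j] = df(d[i])
    Zinv = np.linalg.inv(Z)
    # '*' is the entrywise (Hadamard) product, '@' the matrix product
    return Z @ (F * (Zinv @ E @ Z)) @ Zinv, F

d = np.array([2.0, 2.0, -2.0])         # A = diag(2, 2, -2), so Z = I
E = np.array([[1.0, 2.0, 1.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0]])
L, F = daleckii_krein(np.exp, np.exp, np.eye(3), d, E)
print(L)
```

    The off-diagonal divided difference between the eigenvalues $2$ and $-2$ is $(e^2 - e^{-2})/4 = \sinh(2)/2$, which is the value the code produces in position $(1, 3)$.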

  • Consequences

    The eigenvector matrix $Z$ appears in the Daleckiĭ-Kreĭn theorem. This leads to issues caused by non-normality of the argument matrix.

    Theorem (N. Higham 2008)

    $\mathrm{cond}_{\mathrm{abs}}(f, A) \leq \kappa_2(Z)^2 \|F\|$.

  • Generalized matrix functions

    Given $A \in \mathbb{C}^{m \times n}$ with SVD $A = USV^*$ and a function

    $f : [0, \infty) \to \mathbb{R}$

    we wish to define $f^\diamond(A)$ in some sensible way.

  • Gmf: definition

    From Linear Algebra 2:

    In an SVD $A = USV^*$, $U, V$ are unitary and $S$ is diagonal with nonincreasing nonnegative elements.

    This leads to the compact SVD (CSVD) $A = U_r S_r V_r^*$, where $U_r, V_r$ are submatrices of $U, V$ and $S_r$ is square invertible (empty if $A = 0$).

  • Gmf: definition

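    Definition (Hawkins and Ben-Israel 1973):

    $A = U_r S_r V_r^* \mapsto f^\diamond(A) = U_r f(S_r) V_r^*$.

    Equivalent definition:

    $A = USV^* \mapsto f^\diamond(A) = U f^\diamond(S) V^*$,

    where $f^\diamond(S)$ is diagonal with diagonal entries

    $$f^\diamond(\sigma_i) = \begin{cases} f(\sigma_i) & \text{if } \sigma_i \neq 0, \\ 0 & \text{if } \sigma_i = 0. \end{cases}$$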
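
    The definition translates almost directly into code. The following is a minimal sketch, assuming NumPy's thin SVD and a hypothetical relative tolerance `tol` for deciding which singular values count as zero; the function name `gmf` is introduced here for illustration only.

```python
import numpy as np

def gmf(f, A, tol=1e-12):
    """Generalized matrix function U f<>(S) V^* computed from the thin SVD of A."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    fs = np.zeros_like(s)
    nonzero = s > tol * max(s.max(), 1.0)   # hypothetical rank threshold
    fs[nonzero] = f(s[nonzero])             # f is applied to nonzero sigma only
    return (U * fs) @ Vh                    # scales the columns of U by fs

# 4x3 input with singular values 2, 2, 0
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
cube = gmf(lambda x: x**3, A)
expo = gmf(np.exp, A)
print(cube)
print(expo)
```

    Zero singular values are mapped to zero regardless of $f(0)$, which is why the exponential case does not produce $e^0 = 1$ in the third diagonal position.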

  • Why do we care?

    Among the applications:

    - complex networks (Arrigo, Benzi, Estrada, Fenu, D. Higham, Klymko...);
    - computing classical matrix functions of structured matrices (Del Buono, Lopez, Politi...);
    - computer vision (Bylow, Kahl, Larsson, Olsson...);
    - the unitary factor of the polar decomposition of a full rank matrix is a gmf;
    - the Moore-Penrose pseudoinverse of $A$ is a gmf of $A^*$.

    (So they are familiar to at least some people in the room...)

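  • Example

    Toy matrix:

    $$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

    For $f(\sigma) = \sigma^3$,

    $$f^\diamond(A) = \begin{bmatrix} 8 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

    For $f(\sigma) = e^\sigma$,

    $$f^\diamond(A) = \begin{bmatrix} e^2 & 0 & 0 \\ 0 & e^2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

    Note that $e^0 \neq 0$.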

  • Basic observations

    - In spite of their name, gmf do not reduce to classical mf when $m = n$;
    - Even when $m = n$, $f^\diamond(A)$ for a polynomial function $f$ is not a polynomial in $A$ in the classical sense;
    - If $f(0) \neq 0$ and $A$ is rank deficient then $f^\diamond$ is not continuous – let alone differentiable – at $A$.
    - If $A = QH$ is a polar decomposition then $f^\diamond(A) = Q f^\diamond(H)$.
    - If $H$ is Hermitian positive definite (Hpd) then $f^\diamond(H) = f(H)$; if $H$ is Hermitian positive semidefinite (Hpsd) this is not generally true unless $f(0) = 0$.
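
    Two of these observations can be checked on 2×2 matrices. The sketch below is illustrative only: the matrices, the choice $f(x) = x^2$ and the helper `gmf` (with its hypothetical cutoff `tol`) are assumptions made for this example.

```python
import numpy as np

def gmf(f, A, tol=1e-12):
    """f<>(A) = U f<>(S) V^* from the SVD, with f<>(0) = 0 (tol is a hypothetical cutoff)."""
    U, s, Vh = np.linalg.svd(A)
    fs = np.where(s > tol, f(s), 0.0)
    return (U * fs) @ Vh

square = lambda x: x**2

# Square but indefinite: singular values are (1, 1), eigenvalues are (1, -1)
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
gmf_A = gmf(square, A)        # U S^2 V^T = U V^T = A
classical_A = A @ A           # classical A^2 = I

# Hermitian positive definite: gmf and classical function agree
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])
gmf_H = gmf(square, H)
classical_H = H @ H

print(gmf_A, classical_A, sep="\n")
```

    For the indefinite matrix the generalized square returns $A$ itself while the classical square is the identity, so the two notions differ even though $m = n$.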

  • Intuitions on computational advantage

    Although you do not compute classical mf via the (unstable?) eigendecomposition, their definition may lead to numerical ill conditioning.

    Gmf are defined via the (stable) SVD. Does this imply that they are always well conditioned?

  • Gmf: conditions on differentiability

    Theorem

    Let $A \in \mathbb{C}^{m \times n}$ and $f : [0, \infty) \to \mathbb{R}$ be differentiable on an open subset containing the positive singular values of $A$. Moreover, if $\operatorname{rank} A < \min(m, n)$, suppose that $f(0) = 0$ and $f$ is right differentiable at $0$. Then $f^\diamond(X)$ is Fréchet differentiable at $X = A$.

    Important: in this theorem, the Fréchet derivative is the real Fréchet derivative. Generally, generalized matrix functions are not complex differentiable.

  • Some notation...

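    Given $X \in \mathbb{C}^{m \times n}$, $\Upsilon : \mathbb{C}^{m \times n} \to \mathbb{C}^{m \times n}$ is the following real-linear operator:

    $$\Upsilon(X) = \begin{cases}
    \begin{bmatrix} X_1^* \\ X_2 \end{bmatrix} & \text{if } m > n \text{ and } X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}; \\[2ex]
    X^* & \text{if } m = n; \\[2ex]
    \begin{bmatrix} X_1^* & X_2 \end{bmatrix} & \text{if } m < n \text{ and } X = \begin{bmatrix} X_1 & X_2 \end{bmatrix}.
    \end{cases}$$

    (Here $X_1$ is the square block, so that $\Upsilon(X)$ has the same size as $X$.)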

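  • Notation again...

    Given the singular values of $A \in \mathbb{C}^{m \times n}$ we introduce two matrices $F, G \in \mathbb{R}^{m \times n}$:

    $$F_{ij} = \begin{cases}
    \dfrac{\sigma_i f(\sigma_i) - \sigma_j f(\sigma_j)}{\sigma_i^2 - \sigma_j^2} & \text{if } i \neq j,\ \max(i, j) \leq \nu, \text{ and } \sigma_i \neq \sigma_j; \\[2ex]
    \dfrac{\sigma_i f'(\sigma_i) + f(\sigma_i)}{2\sigma_i} & \text{if } i \neq j,\ \max(i, j) \leq \nu, \text{ and } \sigma_i = \sigma_j \neq 0; \\[2ex]
    \dfrac{f(\sigma_j)}{\sigma_j} & \text{if } i > n, \text{ and } \sigma_j \neq 0; \\[2ex]
    \dfrac{f(\sigma_i)}{\sigma_i} & \text{if } j > m, \text{ and } \sigma_i \neq 0; \\[2ex]
    f'(\sigma_i) & \text{otherwise;}
    \end{cases}$$

    $$G_{ij} = \begin{cases}
    \dfrac{\sigma_j f(\sigma_i) - \sigma_i f(\sigma_j)}{\sigma_i^2 - \sigma_j^2} & \text{if } i \neq j,\ i, j \leq \nu, \text{ and } \sigma_i \neq \sigma_j; \\[2ex]
    \dfrac{\sigma_i f'(\sigma_i) - f(\sigma_i)}{2\sigma_i} & \text{if } i \neq j,\ i, j \leq \nu, \text{ and } \sigma_i = \sigma_j \neq 0; \\[2ex]
    0 & \text{otherwise.}
    \end{cases}$$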
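
    A hedged sketch of how $F$ and $G$ might be assembled, restricted to a square matrix so that the cases $i > n$ and $j > m$ never arise. It assumes $\nu = \min(m, n) = n$, which the slide does not state explicitly; the function name `build_F_G` and the sample singular values are likewise assumptions for illustration.

```python
import numpy as np

def build_F_G(f, df, sigma):
    """F and G for a square matrix with singular values sigma (assumes nu = n)."""
    n = len(sigma)
    F = np.empty((n, n))
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            si, sj = sigma[i], sigma[j]
            if i != j and si != sj:
                F[i, j] = (si * f(si) - sj * f(sj)) / (si**2 - sj**2)
                G[i, j] = (sj * f(si) - si * f(sj)) / (si**2 - sj**2)
            elif i != j and si != 0:       # repeated nonzero singular value
                F[i, j] = (si * df(si) + f(si)) / (2 * si)
                G[i, j] = (si * df(si) - f(si)) / (2 * si)
            else:                          # diagonal entry, or repeated zero
                F[i, j] = df(si)
    return F, G

sigma = np.array([2.0, 2.0, 1.0])
F, G = build_F_G(lambda x: x**2, lambda x: 2 * x, sigma)
print(F)
print(G)
```

    The exact floating-point comparisons `si != sj` and `si != 0` are a simplification; a practical code would need tolerances for nearly equal or nearly zero singular values.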

  • I am lost! Help!

    Forgetting about the degenerate cases...

    $F$ is $f'(\sigma_i)$ on the main diagonal, and it is

    $$\frac{\sigma_i f(\sigma_i) - \sigma_j f(\sigma_j)}{\sigma_i^2 - \sigma_j^2}$$

    off the main diagonal.

    $G$ is $0$ on the main diagonal, and it is

    $$\frac{\sigma_j f(\sigma_i) - \sigma_i f(\sigma_j)}{\sigma_i^2 - \sigma_j^2}$$

    off the main diagonal.

  • Daleckiĭ-Kreĭn formula for gmf

    Theorem

    Given $A = USV^*$, under the assumptions of the previous theorem,

    $$L_{f^\diamond}(A, E) = U \left( F \circ \hat{E} + iH \circ \operatorname{Im}(\hat{E}) + G \circ \Upsilon(\hat{E}) \right) V^*,$$

    where $\hat{E} = U^* E V$, $F, G$ are as in the previous slides, and $H$ is diagonal with $H_{ii} = f(\sigma_i)/\sigma_i - F_{ii}$ if $\sigma_i \neq 0$ or $H_{ii} = f'(0) - F_{ii}$ otherwise.

  • Daleckiĭ-Kreĭn formula for real gmf

    $U, V, E$ are real, so that the formula simplifies a little.

    Theorem

    Given $A = USV^T$, under the assumptions of the previous theorem,

    $$L_{f^\diamond}(A, E) = U \left( F \circ \hat{E} + G \circ \Upsilon(\hat{E}) \right) V^T,$$

    where $\hat{E} = U^T E V$, and $F, G$ are as defined a few slides ago.

  • Special cases

    For special $f$, formulae may simplify. For example, $f = 1$ $\Rightarrow$ unitary factor of a polar decomposition.

    ($f(0) \neq 0$, hence differentiability $\Leftrightarrow$ $A$ has full rank).

    The theorem then leads to efficient algorithms for the computation of the Fréchet derivative, see Arslan, Noferini and Tisseur (in preparation).

  • Example

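    Toy matrices:

    $$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}.$$

    For $f(\sigma) = \sigma^2$,

    $$F = \begin{bmatrix} 4 & 3 & 7/3 \\ 3 & 4 & 7/3 \\ 7/3 & 7/3 & 2 \end{bmatrix}, \quad G = \begin{bmatrix} 0 & 1 & 2/3 \\ 1 & 0 & 2/3 \\ 2/3 & 2/3 & 0 \end{bmatrix},$$

    $$L_{f^\diamond}(A, E) = \begin{bmatrix} 4 & -3 & 2/3 \\ -1 & 0 & 2/3 \\ 7/3 & 7/3 & 2 \end{bmatrix}.$$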
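
    As a sanity check, the stated derivative can be compared with a central finite difference of $f^\diamond$ computed through the SVD. This is only a sketch: the helper `gmf` assumes full rank, and the step size `t` is an illustrative choice, not a recommendation from the talk.

```python
import numpy as np

def gmf(f, A):
    """f<>(A) = U f(S) V^T via the SVD; A is assumed to have full rank."""
    U, s, Vh = np.linalg.svd(A)
    return (U * f(s)) @ Vh

A = np.diag([2.0, 2.0, 1.0])
E = np.array([[1.0, -1.0, 0.0],
              [0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0]])
f = lambda x: x**2

t = 1e-6   # finite-difference step (illustrative choice)
L_fd = (gmf(f, A + t * E) - gmf(f, A - t * E)) / (2 * t)

# Derivative as stated on the slide
L_slide = np.array([[4.0, -3.0, 2/3],
                    [-1.0, 0.0, 2/3],
                    [7/3, 7/3, 2.0]])
print(np.max(np.abs(L_fd - L_slide)))
```

    Since $A$ is diagonal with nonincreasing positive entries, $U = V = I$ and the formula reduces to $F \circ E + G \circ E^T$, which is what the finite difference approximates.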

  • Example

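    Toy matrices:

    $$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}.$$

    For $f(\sigma) = 1$ (polar decomposition),

    $$F = \begin{bmatrix} 0 & 1/4 & 1/3 \\ 1/4 & 0 & 1/3 \\ 1/3 & 1/3 & 0 \end{bmatrix} = -G, \quad L_{f^\diamond}(A, E) = \begin{bmatrix} 0 & -1/4 & -1/3 \\ 1/4 & 0 & -1/3 \\ 1/3 & 1/3 & 0 \end{bmatrix}.$$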

  • Application to conditioning

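    Theorem

    Assuming $m \geq n$ and $A \in \mathbb{R}^{m \times n}$,

    $\mathrm{cond}_{\mathrm{abs}}(f^\diamond, A) \leq \max\{\max_i |F_{ii}|, \max_j \ldots$ [the remainder of the bound is cut off in the transcript]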

  • Some more bounds

    In practice, only a few singular values may be known:

    Theorem

    If $A \in \mathbb{R}^{m \times n}$ has full rank and smallest singular value $\sigma_r$, suppose that $M$ is the sup norm of $f$ on $[\sigma_r, \|A\|_2]$ and that $f$ is Lipschitz continuous on the same interval with constant $K$. Then

    $\mathrm{cond}_{\mathrm{abs}}(f^\diamond, A) \leq \max\{K, M\sigma_r^{-1}\}$.

    Theorem

    If $A \in \mathbb{R}^{m \times n}$ and $f(0) = 0$, suppose that $f$ is Lipschitz continuous on $[0, \|A\|_2]$ with constant $K$. Then

    $\mathrm{cond}_{\mathrm{abs}}(f^\diamond, A) \leq K$.

  • Some more bounds

    Again, for special choices of $f$ more can be said.

    The following is known but can be recovered as a special case:

    Theorem (Kenney and Laub, 1991)

    If $A \in \mathbb{R}^{m \times n}$ has full rank $r$, then for $f(x) = 1$

    $$\mathrm{cond}_{\mathrm{abs}}(f^\diamond, A) = \frac{2}{\sigma_r + \sigma_{r-1}}.$$

    Theorem

    If $A \in \mathbb{R}^{m \times n}$ has full rank $r$, then for $f(x) = e^x$, defining $g(x) = e^x/x$:

    $$\mathrm{cond}_{\mathrm{abs}}(f^\diamond, A) = \max\{g(\sigma_r), g(\|A\|_2), e^{\|A\|_2}\}.$$
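
    Both closed forms can be checked numerically on a small matrix. The sketch below estimates $\|L_{f^\diamond}(A, \cdot)\|$ by assembling the matrix of the linear map from central differences, one basis direction at a time. It assumes the Frobenius norm on both sides, a full-rank test matrix $A = \operatorname{diag}(2, 2, 1)$ and an illustrative step `t`; the names `gmf` and `cond_abs_fd` are hypothetical.

```python
import numpy as np

def gmf(f, A):
    """f<>(A) via the thin SVD; A is assumed to have full rank."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U * f(s)) @ Vh

def cond_abs_fd(f, A, t=1e-6):
    """Estimate ||L_f(A, .)|| in the Frobenius norm from central differences."""
    m, n = A.shape
    K = np.empty((m * n, m * n))       # matrix of the map vec(E) -> vec(L(A, E))
    for k in range(m * n):
        E = np.zeros(m * n)
        E[k] = 1.0
        E = E.reshape(m, n)            # k-th basis direction
        D = (gmf(f, A + t * E) - gmf(f, A - t * E)) / (2 * t)
        K[:, k] = D.ravel()
    return np.linalg.norm(K, 2)        # largest singular value of the map

A = np.diag([2.0, 2.0, 1.0])           # sigma_r = 1, sigma_{r-1} = 2, ||A||_2 = 2
c_polar = cond_abs_fd(lambda x: np.ones_like(x), A)
c_exp = cond_abs_fd(np.exp, A)
print(c_polar, 2.0 / (1.0 + 2.0))
print(c_exp, max(np.exp(1.0) / 1.0, np.exp(2.0) / 2.0, np.exp(2.0)))
```

    Finite differences are used here only because they follow directly from the definition of the Fréchet derivative; they are not an efficient or accurate way to compute condition numbers in general.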

  • Relative condition number

    ...it is the relative condition number that is of interest, but it is more convenient to state results for the absolute condition number.

  • Some more bounds

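    Define $\mu = \sqrt{f(\|A\|_2)^2 + f(\sigma_r)^2}$.

    Theorem

    If $A \in \mathbb{R}^{m \times n}$ has full rank and smallest singular value $\sigma_r$, suppose that $M$ is the sup norm of $f$ on $[\sigma_r, \|A\|_2]$ and that $f$ is Lipschitz continuous on the same interval with constant $K$. Then

    $$\mathrm{cond}_{\mathrm{rel}}(f^\diamond, A) \leq \sqrt{\max(m, n)}\, \|A\|_2 \max\{K/\mu, \sigma_r^{-1}\}.$$

    Theorem

    If $A \in \mathbb{R}^{m \times n}$ and $f(0) = 0$, suppose that $f$ is Lipschitz continuous on $[0, \|A\|_2]$ with constant $K$. Then

    $$\mathrm{cond}_{\mathrm{rel}}(f^\diamond, A) \leq \sqrt{\max(m, n)}\, K \|A\|_2 \mu^{-1}.$$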

  • The big picture

    If $f(0) = 0$, then $f^\diamond$ has essentially the same condition number as the scalar function $f$. Unlike classical mf, gmf are never numerically dodgier than the scalar case.

    If $f(0) \neq 0$, trouble may happen only if

    $$\max_i |f(\sigma_i)/\sigma_i| \gg \max_i |f'(\sigma_i)|.$$

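  • Can trouble happen?

    If $f(0) \neq 0$, $f^\diamond$ is discontinuous at a rank deficient $A$. Intuitively, we expect numerical issues for an ill conditioned $A$.

    Let

    $$A = \begin{bmatrix} \epsilon & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad f(x) = 1 + (x - \epsilon)^2.$$

    Then,

    $\mathrm{cond}_{\mathrm{rel}}(f, \epsilon) = 0$, $\mathrm{cond}_{\mathrm{rel}}(f, 1) = 1 + O(\epsilon^2)$.

    Yet,

    $$\mathrm{cond}_{\mathrm{rel}}(f^\diamond, A) = \frac{1}{\epsilon\sqrt{5}} + O(1).$$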
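
    The growth like $1/(\epsilon\sqrt{5})$ can be observed numerically. The sketch below estimates the relative condition number in the Frobenius norm by central differences. The value $\epsilon = 10^{-2}$, the step `t`, the cutoff `tol` and the helper names are all assumptions made for illustration.

```python
import numpy as np

def gmf(f, A, tol=1e-14):
    """f<>(A) via the thin SVD, with f<>(0) = 0 (tol is a hypothetical cutoff)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    fs = np.where(s > tol, f(s), 0.0)
    return (U * fs) @ Vh

def cond_rel_fd(f, A, t):
    """Relative condition number estimate (Frobenius norm) from central differences."""
    m, n = A.shape
    K = np.empty((m * n, m * n))
    for k in range(m * n):
        E = np.zeros(m * n)
        E[k] = 1.0
        E = E.reshape(m, n)
        K[:, k] = ((gmf(f, A + t * E) - gmf(f, A - t * E)) / (2 * t)).ravel()
    cond_abs = np.linalg.norm(K, 2)
    return cond_abs * np.linalg.norm(A) / np.linalg.norm(gmf(f, A))

eps = 1e-2
A = np.array([[eps, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
f = lambda x: 1.0 + (x - eps) ** 2

c_rel = cond_rel_fd(f, A, t=1e-3 * eps)    # step kept well below eps
predicted = 1.0 / (eps * np.sqrt(5.0))
print(c_rel, predicted)
```

    The step is tied to $\epsilon$ because $f^\diamond$ varies on that scale near $A$; a fixed step would become unreliable as $\epsilon$ shrinks.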

    [Figure: log-log plot with axes $\epsilon$ (ticks $10^{-8}$ to $10^{-3}$) and $\rho$ (ticks $10^{-10}$ to $10^{-4}$).]
