
Matrix Functions: Theory and Algorithms

Nick Higham
Department of Mathematics
University of Manchester
http://www.ma.man.ac.uk/~higham/

Includes joint work with Philip Davies

Function of Matrix – p.1/42


OUTLINE

▶ Definitions of f(A)
  Applications
  Algorithms for particular f
  Schur–Parlett algorithm for general f
  Computing f(A)b


Defining by Substitution

Want to define $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, but not elementwise.

Given f(t), can define f(A) by substituting A for t:

$$f(t) = \frac{1 + t^2}{1 - t} \;\Rightarrow\; f(A) = (I - A)^{-1}(I + A^2).$$

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad |x| < 1$$

$$\Rightarrow\; \log(I + A) = A - \frac{A^2}{2} + \frac{A^3}{3} - \frac{A^4}{4} + \cdots, \quad \rho(A) < 1.$$

Works for f
• a polynomial,
• a rational,
• or with a convergent power series.
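A minimal NumPy sketch of the two substitutions above. The function names `rational_f` and `log1p_series`, the 2-by-2 test matrix and the 60-term cutoff are illustrative assumptions, not part of the talk.

```python
import numpy as np

def rational_f(A):
    """f(A) = (I - A)^{-1} (I + A^2), the matrix form of f(t) = (1 + t^2)/(1 - t)."""
    I = np.eye(A.shape[0])
    # Solve (I - A) F = I + A^2 instead of forming the inverse explicitly.
    return np.linalg.solve(I - A, I + A @ A)

def log1p_series(A, terms=60):
    """Truncated series A - A^2/2 + A^3/3 - ... for log(I + A); needs rho(A) < 1."""
    total = np.zeros_like(A, dtype=float)
    power = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        power = power @ A                      # power now holds A^k
        total += (-1) ** (k + 1) * power / k
    return total

A = np.array([[0.2, 0.1],
              [0.0, 0.3]])                     # triangular, rho(A) = 0.3
F = rational_f(A)
L = log1p_series(A)
```

Because A is triangular here, the diagonals of F and L are just the scalar functions applied to 0.2 and 0.3, which gives an easy sanity check.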


Multiplicity of Definitions

"There have been proposed in the literature since 1880 eight distinct definitions of a matric function, by Weyr, Sylvester and Buchheim, Giorgi, Cartan, Fantappiè, Cipolla, Schwerdtfeger and Richter."

-- R. F. Rinehart, The Equivalence of Definitions of a Matric Function, Amer. Math. Monthly (1955)


Cauchy Integral Theorem

Definition 1

$$f(A) = \frac{1}{2\pi i} \int_\Gamma f(z)(zI - A)^{-1}\,dz,$$

where f is analytic inside a closed contour Γ which encloses λ(A).


Jordan Canonical Form

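$$Z^{-1}AZ = J = \operatorname{diag}(J_1, J_2, \dots, J_p), \qquad
J_k = \begin{bmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix} \in \mathbb{C}^{m_k \times m_k}.$$

Definition 2

$$f(A) = Z f(J) Z^{-1} = Z \operatorname{diag}(f(J_k)) Z^{-1},$$

$$f(J_k) = \begin{bmatrix}
f(\lambda_k) & f'(\lambda_k) & \cdots & \dfrac{f^{(m_k-1)}(\lambda_k)}{(m_k-1)!} \\
 & f(\lambda_k) & \ddots & \vdots \\
 & & \ddots & f'(\lambda_k) \\
 & & & f(\lambda_k)
\end{bmatrix}.$$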
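As an illustration of Definition 2, the sketch below fills in f(J_k) for a single Jordan block, one superdiagonal at a time. The helper name `f_jordan_block` and its `derivs(j, lam)` callback signature are assumptions made for this example.

```python
import math
import numpy as np

def f_jordan_block(lam, m, derivs):
    """Return f(J) for an m-by-m Jordan block J with eigenvalue lam.

    derivs(j, lam) must return the j-th derivative of f at lam
    (hypothetical callback signature chosen for this sketch).
    """
    F = np.zeros((m, m), dtype=complex)
    for j in range(m):
        # The j-th superdiagonal is constant: f^(j)(lam) / j!
        F += derivs(j, lam) / math.factorial(j) * np.eye(m, k=j)
    return F

# Example: f = exp, whose derivatives all equal exp(lam).
lam, m = 0.5, 3
F = f_jordan_block(lam, m, lambda j, x: np.exp(x))

# Cross-check: J = lam*I + N with N^3 = 0, so exp(J) = e^lam (I + N + N^2/2).
N = np.eye(m, k=1)
expected = np.exp(lam) * (np.eye(m) + N + N @ N / 2)
```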


Interpolation

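Definition 3 (Sylvester, 1883; Buchheim, 1886) Let A have distinct eigenvalues $\lambda_1, \dots, \lambda_s$, and let $n_i$ be the index of $\lambda_i$ (the size of the largest Jordan block in which $\lambda_i$ appears). Then $f(A) = r(A)$, where r is the unique Hermite interpolating polynomial of degree less than $\sum_{i=1}^{s} n_i$ satisfying the interpolation conditions

$$r^{(j)}(\lambda_i) = f^{(j)}(\lambda_i), \quad j = 0\colon n_i - 1, \quad i = 1\colon s.$$

The polynomial r depends on A.

This definition preserves functional relations $G(f_1, \dots, f_p) = 0$, where G is a polynomial. E.g. $\sin^2(A) + \cos^2(A) = I$.

But of course $e^{A+B} \ne e^A e^B$ in general.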


Non-Primary Functions

Horn & Johnson call these definitions primary matrix functions. But not all possible functions are captured when there are multiple eigenvalues. E.g.,

$$A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad
X = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}, \quad
Y = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$

X and Y are square roots of A but are not polynomials in A. However, A = givens(π) and Y = givens(π/2) is a natural square root.

Virtually all existing theory and methods are for primary functions.

Non-primary functions are sometimes needed when tracking f(A(t)) as eigenvalues of A(t) coalesce.
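The example above is easy to check numerically. In this sketch `givens(theta)` is assumed to mean the plane rotation [[cos θ, −sin θ], [sin θ, cos θ]], which is one reading of the slide's notation.

```python
import numpy as np

A = -np.eye(2)
X = np.array([[1j, 0], [0, -1j]])
Y = np.array([[0.0, -1.0], [1.0, 0.0]])

def givens(theta):
    """Plane rotation through angle theta (assumed meaning of givens(.))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Any polynomial in A = -I is a multiple of I, so neither X nor Y can be one.
x_squares_to_a = np.allclose(X @ X, A)
y_squares_to_a = np.allclose(Y @ Y, A)
y_is_quarter_turn = np.allclose(givens(np.pi / 2), Y)
a_is_half_turn = np.allclose(givens(np.pi), A)
```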


Textbook References

[1] F. R. Gantmacher. The Theory of Matrices, volume one. Chelsea, New York, 1959.

[2] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, MD, USA, third edition, 1996.

[3] Roger A. Horn and Charles R. Johnson. Topics in Matrix Analysis. Cambridge University Press, 1991.

[4] Peter Lancaster and Miron Tismenetsky. The Theory of Matrices. Academic Press, London, second edition, 1985.


OUTLINE

  Definitions of f(A)
▶ Applications
  Algorithms for particular f
  Schur–Parlett algorithm for general f
  Computing f(A)b


Application: Differential equations

Nuclear magnetic resonance: Solomon equations

$$dM/dt = -RM, \quad M(0) = I,$$

where M(t) = matrix of intensities and R = symmetric relaxation matrix. NMR workers need to solve both forward and inverse problems.

Exponential time differencing for stiff systems (Cox & Matthews, J. Comp. Phys., 2002)

$$y' = Ay + F(y, t).$$

Methods based on exact integration of the linear part require one accurate evaluation of exp(hA) per integration.


Application: Control theory

Convert continuous-time system

$$\frac{dx}{dt} = Ax(t) + Bu(t)$$

to discrete-time state-space system

$$x_{k+1} = F x_k + G u_k,$$

where $F = e^{A\tau}$ and τ is the sampling period. (E.g., MATLAB Control System Toolbox, c2d, d2c.)


OUTLINE

  Definitions of f(A)
  Applications
▶ Algorithms for particular f
  Schur–Parlett algorithm for general f
  Computing f(A)b


Classic MATLAB

< M A T L A B >
Version of 01/10/84

HELP is available

<>help

Type HELP followed by
INTRO (To get started)
NEWS (recent revisions)
ABS ANS ATAN BASE CHAR CHOL CHOP CLEA COND CONJ COS
DET DIAG DIAR DISP EDIT EIG ELSE END EPS EXEC EXIT
EXP EYE FILE FLOP FLPS FOR FUN HESS HILB IF IMAG
INV KRON LINE LOAD LOG LONG LU MACR MAGI NORM ONES
ORTH PINV PLOT POLY PRIN PROD QR RAND RANK RCON RAT
REAL RETU RREF ROOT ROUN SAVE SCHU SHOR SEMI SIN SIZE
SQRT STOP SUM SVD TRIL TRIU USER WHAT WHIL WHO WHY
< > ( ) = . , ; \ / ' + - * :


Classic MATLAB

<>help fun

FUN For matrix arguments X, the functions SIN, COS, ATAN, SQRT, LOG, EXP and X**p are computed using eigenvalues D and eigenvectors V. If <V,D> = EIG(X) then f(X) = V*f(D)/V. This method may give inaccurate results if V is badly conditioned. Some idea of the accuracy can be obtained by comparing X**1 with X.
For vector arguments, the function is applied to each component.

"The availability of [FUN] in early versions of MATLAB quite possibly contributed to the system's technical and commercial success."

-- Cleve Moler (2003)


Setup

• General nonsymmetric A
• Factorization of A feasible
• May not want full accuracy
• Many applications.
• Methods for very large, sparse A often require solution of smaller, dense subproblems.


Matrix Exponential

Cleve Moler and Charles Van Loan. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev., 45 (2003).

• 355 citations on Science Citation Index.

Scaling and squaring (SS) method for $X \approx e^A$ (Ward, 1977; Moler & Van Loan, 1978).

1. $A \leftarrow A/2^k$ so $\|A\|_\infty \le 1/2$
2. $r(A)$ = [6/6] Padé approximant to $e^A$
3. $X = r(A)^{2^k}$

Used by MATLAB's expm.
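The three steps above can be sketched in NumPy as follows. This is a bare-bones illustration under simplifying assumptions (no balancing, no special handling of badly scaled A), not a reproduction of MATLAB's expm; the names `pade_exp` and `expm_ss` are made up for the example. The Padé coefficients are the standard diagonal ones, $c_j = (2q-j)!\,q! / ((2q)!\,j!\,(q-j)!)$.

```python
import math
import numpy as np

def pade_exp(A, q=6):
    """[q/q] Pade approximant r(A) = D(A)^{-1} N(A) to exp(A)."""
    n = A.shape[0]
    N = np.zeros((n, n))
    D = np.zeros((n, n))
    power = np.eye(n)                          # holds A^j
    for j in range(q + 1):
        c = (math.factorial(2 * q - j) * math.factorial(q)
             / (math.factorial(2 * q) * math.factorial(j) * math.factorial(q - j)))
        N += c * power                         # numerator:   sum c_j A^j
        D += c * (-1) ** j * power             # denominator: sum c_j (-A)^j
        power = power @ A
    return np.linalg.solve(D, N)

def expm_ss(A, q=6):
    """Scaling and squaring: exp(A) = (exp(A / 2^k))^(2^k)."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, np.inf)
    # Step 1: smallest k >= 0 with ||A / 2^k||_inf <= 1/2.
    k = max(0, math.ceil(math.log2(norm / 0.5))) if norm > 0 else 0
    # Step 2: Pade approximant of the scaled matrix.
    X = pade_exp(A / 2 ** k, q)
    # Step 3: undo the scaling by k repeated squarings.
    for _ in range(k):
        X = X @ X
    return X

t = 2.0
A = np.array([[0.0, -t], [t, 0.0]])            # exp(A) is a rotation through angle t
X = expm_ss(A)
```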


Alternative SS Algorithm for $e^A$

Suggested by Najfeld & Havel (1995): exploit

$$\tau(A) = A\coth(A) = A(e^{2A} + I)(e^{2A} - I)^{-1}
= I + \cfrac{A^2}{3I + \cfrac{A^2}{5I + \cfrac{A^2}{7I + \cdots}}}.$$

1. $B = A/2^{k+1}$ so $\|A^2\|_\infty / 2^{2k+2} \le 1.152$
2. $r(B)$ = [8/8] Padé approximant to $\tau(B)$.
3. $X = \bigl[(r(B) + B)(r(B) - B)^{-1}\bigr]^{2^k}$

• Claimed to require fewer flops than original SS alg.


Principal Log and pth Root

Let $A \in \mathbb{C}^{n\times n}$ have no eigenvalues on $\mathbb{R}^-$.

Log
X = log A denotes the unique X such that

1. $e^X = A$.
2. $-\pi < \operatorname{Im}(\lambda(X)) < \pi$.

pth root
For integer p > 0, $X = A^{1/p}$ is the unique X such that

1. $X^p = A$.
2. $-\pi/p < \arg(\lambda(X)) < \pi/p$.


Briggs’ Log Method (1617)

$$\log(ab) = \log a + \log b \;\Rightarrow\; \log a = 2 \log a^{1/2}.$$

Use repeatedly:

$$\log a = 2^k \log a^{1/2^k}.$$

Write $a^{1/2^k} = 1 + x$ and note $\log(1 + x) \approx x$. Briggs worked to base 10 and used

$$\log_{10} a \approx 2^k \cdot \log_{10} e \cdot (a^{1/2^k} - 1).$$
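Briggs' approximation is simple to try in floating point. The sketch below uses only the standard library; the function name `briggs_log10` and the choice k = 20 are illustrative assumptions.

```python
import math

def briggs_log10(a, k):
    """Approximate log10(a) via k repeated square roots (Briggs' idea)."""
    root = a
    for _ in range(k):
        root = math.sqrt(root)                 # after the loop: root = a^(1/2^k)
    # log(1 + x) ~ x with x = root - 1, then convert to base 10 and rescale.
    return 2 ** k * math.log10(math.e) * (root - 1.0)

approx = briggs_log10(2.0, 20)
exact = math.log10(2.0)
```

Increasing k shrinks the truncation error of log(1 + x) ≈ x, but the subtraction `root - 1.0` then loses more digits to cancellation, so k cannot be taken arbitrarily large in fixed precision.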


"Briggs must be viewed as one of the great figures in numerical analysis."

-- Herman H. Goldstine, A History of Numerical Analysis (1977)


Can we generalize to matrices: $\log A = 2^k \log A^{1/2^k}$?


Splitting Lemma

Lemma (Cheng, H, Kenney & Laub, 2001) Suppose A = BC has no eigenvalues on $\mathbb{R}^-$ and

1. BC = CB.
2. Every eigenvalue of B (or C) lies in the open half-plane of the corresponding eigenvalue of $A^{1/2}$.

Then log A = log B + log C.

[Figure: complex plane with axes Re λ and Im λ, showing $\lambda_B$ in the same open half-plane as $\lambda_{A^{1/2}}$.]


Matrix Logarithm

Use the Briggs idea:

$$\log A = 2^k \log A^{1/2^k}.$$

Kenney & Laub's (1989) inverse scaling and squaring method:

• Bring A close to I by repeated square roots.
• Approximate $\log A^{1/2^k}$ using an [m/m] Padé approximant $r_m(x) \approx \log(1 - x)$.
• Rescale to find log A.


Alg of Cheng, H, Kenney & Laub (2001)

• Transformation-free: uses only matrix mult, LU, inv.

• Sq. roots by product form of Denman–Beavers iteration:

$$M_{k+1} = \frac{1}{2}\Bigl[I + \frac{1}{2}\bigl(M_k + M_k^{-1}\bigr)\Bigr], \quad M_0 = A,$$

$$Y_{k+1} = Y_k (I + M_k^{-1})/2, \quad Y_0 = A,$$

where $M_k \to I$ and $Y_k \to A^{1/2}$.

• Aims for a specified accuracy.

• Padé degree m chosen using K & L's (1989) bound:

$$\|r_m(X) - \log(I - X)\| \le |r_m(\|X\|) - \log(1 - \|X\|)|.$$

• $r_m$ evaluated using partial fraction expansion

$$r_m(x) = \sum_{j=1}^{m} \frac{\alpha_j^{(m)} x}{1 + \beta_j^{(m)} x}:$$

fast and accurate (H, 2001).
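A rough sketch of how these pieces fit together: product-form Denman–Beavers square roots inside an inverse scaling and squaring loop. It is not the Cheng, Higham, Kenney & Laub algorithm itself: the fixed degree m = 8, the threshold 0.25 and the stopping tolerances are arbitrary assumptions, and the Padé step is written for log(I + X) with X = A^{1/2^k} − I, using the m-point Gauss–Legendre form of the approximant that the log(A) b slide mentions. The function names are made up for the example.

```python
import numpy as np

def sqrtm_db_product(A, tol=1e-14, max_iter=100):
    """Square root by the product form of the Denman-Beavers iteration."""
    I = np.eye(A.shape[0])
    M = A.copy()
    Y = A.copy()
    for _ in range(max_iter):
        Minv = np.linalg.inv(M)
        Y = Y @ (I + Minv) / 2                 # Y_{k+1} = Y_k (I + M_k^{-1}) / 2
        M = (I + (M + Minv) / 2) / 2           # M_{k+1} = [I + (M_k + M_k^{-1})/2] / 2
        if np.linalg.norm(M - I, 1) <= tol:    # M_k -> I signals convergence
            break
    return Y

def logm_iss(A, m=8, theta=0.25, max_roots=60):
    """log(A) = 2^k log(A^(1/2^k)) with a Gauss-Legendre/Pade inner step."""
    A = np.asarray(A, dtype=float)
    I = np.eye(A.shape[0])
    k = 0
    while np.linalg.norm(A - I, 1) > theta and k < max_roots:
        A = sqrtm_db_product(A)                # bring A closer to I
        k += 1
    X = A - I
    # log(I + X) = int_0^1 X (I + t X)^{-1} dt, discretized with m nodes on [0, 1].
    nodes, weights = np.polynomial.legendre.leggauss(m)
    t = (nodes + 1) / 2
    w = weights / 2
    L = sum(wj * np.linalg.solve(I + tj * X, X) for tj, wj in zip(t, w))
    return 2 ** k * L                          # rescale

A = np.array([[4.0, 1.0],
              [0.0, 9.0]])
L = logm_iss(A)
```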


Matrix pth Root

Square root: Björck & Hammarling (1983). Compute the Schur decomposition $A = QTQ^*$ and then solve $R^2 = T$ by

$$r_{ii} = \sqrt{t_{ii}}, \qquad r_{ij} = \frac{t_{ij} - \sum_{k=i+1}^{j-1} r_{ik} r_{kj}}{r_{ii} + r_{jj}}.$$

Extended to pth roots by Smith (2003), with a much more complicated recurrence.

These algs

• Have essentially optimal numerical stability.
• Generalize to real Schur decomp.
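The triangular recurrence can be sketched directly in NumPy. To stay self-contained the example starts from an upper triangular T rather than computing a Schur decomposition first; the function name `sqrtm_triu` and the test matrix are assumptions for illustration.

```python
import numpy as np

def sqrtm_triu(T):
    """Principal square root R of an upper triangular T, so that R @ R = T."""
    n = T.shape[0]
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):                         # work column by column
        R[j, j] = np.sqrt(complex(T[j, j]))
        for i in range(j - 1, -1, -1):         # move up the column
            # s = sum_{k=i+1}^{j-1} r_ik r_kj (zero when the range is empty)
            s = R[i, i + 1:j] @ R[i + 1:j, j]
            R[i, j] = (T[i, j] - s) / (R[i, i] + R[j, j])
    return R

T = np.array([[4.0, 1.0, 2.0],
              [0.0, 9.0, 3.0],
              [0.0, 0.0, 16.0]])
R = sqrtm_triu(T)
```

The divisor r_ii + r_jj is a sum of principal square roots, so it cannot vanish when T has no eigenvalues on the closed negative real axis.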


Matrix Cosine

Algorithm (Serbin & Blalock, 1980) Given $A \in \mathbb{R}^{n\times n}$ and parameter α > 0 this alg approximates cos(A).

    Choose m such that $2^{-m}\|A\| \approx \alpha$.
    $C_0$ = Taylor or Padé approximation to $\cos(A/2^m)$.
    for i = 0: m − 1
        $C_{i+1} = 2C_i^2 - I$
    end

Choice of m (i.e., α)?

Which approximation?

Effect of rounding errors?
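A small NumPy sketch of the double-angle algorithm above, using a truncated Taylor series for C_0. The values α = 0.25 and 8 Taylor terms are arbitrary choices made for the example (the slide leaves exactly these choices open), and `cosm_double_angle` is a made-up name.

```python
import math
import numpy as np

def cosm_double_angle(A, alpha=0.25, taylor_terms=8):
    """Approximate cos(A) by scaling, a Taylor polynomial and double-angle steps."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    norm = np.linalg.norm(A, np.inf)
    # Choose m so that 2^{-m} ||A|| <= alpha.
    m = max(0, math.ceil(math.log2(norm / alpha))) if norm > 0 else 0
    B = A / 2 ** m
    B2 = B @ B
    # C_0 = sum_k (-1)^k B^(2k) / (2k)!, truncated after taylor_terms terms.
    C = np.zeros((n, n))
    term = np.eye(n)
    for k in range(taylor_terms):
        C += term
        term = -term @ B2 / ((2 * k + 1) * (2 * k + 2))
    # Double-angle recurrence cos(2X) = 2 cos(X)^2 - I, applied m times.
    for _ in range(m):
        C = 2 * C @ C - np.eye(n)
    return C

A = np.array([[0.0, 2.0],
              [2.0, 0.0]])                     # A^2 = 4I, so cos(A) = cos(2) I
C = cosm_double_angle(A)
```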


Alg of H & Smith (2002)

• Initial argument reduction and balancing to reduce norm.
• [8/8] Padé approximation proved fully accurate in IEEE double if $\|A\|_\infty \le 1$. More economical than Taylor series.
• "Schoolboy" evaluation of $r_8(A)$.
• Total cost: $(4 + \lceil \log_2(\|A\|_\infty) \rceil)M + D$.
• Error analysis gives a bound containing terms $(4.1)^m$ and norms of intermediate $C_i$.


Numerical Stability

Is $\|\hat{f} - f\|$ consistent with condition of problem?

Is $\hat{f} = f(A + E)$ with E "small", i.e., is the residual $f^{-1}(\hat{f}) - A$ "small"?

Unclear for all algs discussed except "yes" for $A^{1/p}$.

• Currently lack characterizations of when an f(A) problem is ill conditioned for nonnormal A.


OUTLINE

  Definitions of f(A)
  Applications
  Algorithms for particular f
▶ Schur–Parlett algorithm for general f
  Computing f(A)b


Similarity Transformations

Can use the formula

$$A = XBX^{-1} \;\Rightarrow\; f(A) = Xf(B)X^{-1},$$

provided f(B) is easily computable. E.g. $B = \operatorname{diag}(\lambda_i)$ if A is diagonalizable.

Problem: any error ∆B in f(B) is magnified by up to $\kappa(X) = \|X\|\,\|X^{-1}\| \ge 1$.

Prefer to work with unitary X: thus can use

• eigendecomposition (diagonal B) when A is normal ($AA^* = A^*A$),
• Schur decomposition (triangular B) in general.


Example: Eigendecomposition

    function F = funm_ev(A,fun)
    [V,D] = eig(A);
    F = V * diag(feval(fun,diag(D))) / V;

    >> A = [3 -1; 1 1]; X = funm_ev(A,@sqrt)
    X =
       1.7678e+000  -3.5355e-001
       3.5355e-001   1.0607e+000

    >> norm(A-X^2) % cond(V) = 9.4e7
    ans =
       9.9519e-009

    >> Y = sqrtm(A); norm(A-Y^2)
    ans =
       6.4855e-016


Parlett’s Recurrence

Schur decomposition $A = QTQ^*$ reduces the problem to F = f(T), T upper triangular.

$f_{ii} = f(t_{ii})$ is immediate.

Parlett (1976): from FT = TF obtain recurrence

$$f_{ij} = t_{ij}\,\frac{f_{ii} - f_{jj}}{t_{ii} - t_{jj}} + \sum_{k=i+1}^{j-1} \frac{f_{ik} t_{kj} - t_{ik} f_{kj}}{t_{ii} - t_{jj}}.$$

Used in MATLAB's funm.
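A direct NumPy transcription of the recurrence, one superdiagonal at a time. It assumes T is upper triangular with distinct diagonal entries (otherwise the divisions break down); the name `parlett` and the test matrix are illustrative, and this is not MATLAB's funm.

```python
import numpy as np

def parlett(T, f):
    """F = f(T) for upper triangular T with distinct diagonal entries."""
    n = T.shape[0]
    F = np.zeros((n, n), dtype=complex)
    for i in range(n):
        F[i, i] = f(T[i, i])                   # f_ii = f(t_ii)
    for d in range(1, n):                      # d = j - i: sweep superdiagonals
        for i in range(n - d):
            j = i + d
            # s = sum_{k=i+1}^{j-1} (f_ik t_kj - t_ik f_kj)
            s = F[i, i + 1:j] @ T[i + 1:j, j] - T[i, i + 1:j] @ F[i + 1:j, j]
            F[i, j] = (T[i, j] * (F[i, i] - F[j, j]) + s) / (T[i, i] - T[j, j])
    return F

T = np.array([[1.0, 2.0, 3.0],
              [0.0, 2.0, 4.0],
              [0.0, 0.0, 4.0]])
F = parlett(T, np.exp)

# Reference via the eigendecomposition (fine here: eigenvalues 1, 2, 4 are well separated).
w, V = np.linalg.eig(T)
F_ref = V @ np.diag(np.exp(w)) @ np.linalg.inv(V)
```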


Fails when T has repeated eigenvalues.


Parlett vs. Björck & Hammarling

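Parlett recurrence is not "optimal", as is clear from the square root case: $x_{12}$ is obtained from

$$\text{Parlett:}\quad \frac{a_{12}(\sqrt{a_{11}} - \sqrt{a_{22}})}{a_{11} - a_{22}} \;=\; \frac{a_{12}}{\sqrt{a_{11}} + \sqrt{a_{22}}} \quad \text{: B \& H.}$$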
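The two expressions agree in exact arithmetic but not in floating point when a11 and a22 are close. The sketch below compares them against a 50-digit reference computed with the standard `decimal` module; the particular values of a11, a12 and a22 are assumptions picked to expose the cancellation.

```python
import math
from decimal import Decimal, getcontext

a11, a12 = 1.0, 1.0
a22 = 1.0 + 1e-12                              # eigenvalues nearly equal

# Parlett's form: a difference quotient, so nearby numbers are subtracted.
x12_parlett = a12 * (math.sqrt(a11) - math.sqrt(a22)) / (a11 - a22)
# Bjorck & Hammarling's form: only an addition in the denominator.
x12_bh = a12 / (math.sqrt(a11) + math.sqrt(a22))

# High-precision reference for the same floating-point inputs.
getcontext().prec = 50
ref = Decimal(a12) / (Decimal(a11).sqrt() + Decimal(a22).sqrt())
err_parlett = abs(Decimal(x12_parlett) - ref) / ref
err_bh = abs(Decimal(x12_bh) - ref) / ref
```

Comparing `err_parlett` with `err_bh` shows the difference quotient losing several digits relative to the Björck & Hammarling form.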


Schur–Parlett Algorithm

H & Davies (2002):

• Compute Schur decomposition $A = QTQ^*$.
• Re-order T to block triangular form in which eigenvalues within a block are "close" and those of separate blocks are "well separated".
• Evaluate $F_{ii} = f(T_{ii})$.
• Solve the Sylvester equations

$$T_{ii}F_{ij} - F_{ij}T_{jj} = F_{ii}T_{ij} - T_{ij}F_{jj} + \sum_{k=i+1}^{j-1} (F_{ik}T_{kj} - T_{ik}F_{kj}).$$

• Undo the unitary transformations.


Function of Atomic Block

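Assume f has a Taylor series with infinite radius of convergence and derivatives available.

For diagonal blocks T use

$$T = \sigma I + M, \quad \sigma = \operatorname{trace}(T)/n: \qquad f(T) = \sum_{k=0}^{\infty} \frac{f^{(k)}(\sigma)}{k!} M^k.$$

Truncate the series based on a strict error bound, not using the size of terms. NB: for n = 2,

$$M = \begin{bmatrix} \varepsilon & \alpha \\ 0 & -\varepsilon \end{bmatrix}
\;\Rightarrow\;
M^{2k} = \begin{bmatrix} \varepsilon^{2k} & 0 \\ 0 & \varepsilon^{2k} \end{bmatrix}, \quad
M^{2k+1} = \begin{bmatrix} \varepsilon^{2k+1} & \alpha\varepsilon^{2k} \\ 0 & -\varepsilon^{2k+1} \end{bmatrix}.$$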
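For f = exp, where every derivative at σ equals e^σ, the shifted Taylor series can be sketched as below. The fixed cutoff of 30 terms is a simplification for the example (the slide calls for truncation based on a strict error bound), and `exp_atomic_block` is a made-up name.

```python
import math
import numpy as np

def exp_atomic_block(T, terms=30):
    """exp(T) via the Taylor series about sigma = trace(T)/n."""
    n = T.shape[0]
    sigma = np.trace(T) / n
    M = T - sigma * np.eye(n)                  # T = sigma*I + M
    F = np.zeros((n, n))
    power = np.eye(n)                          # holds M^k
    for k in range(terms):
        # For f = exp, f^(k)(sigma) = exp(sigma) for every k.
        F += math.exp(sigma) / math.factorial(k) * power
        power = power @ M
    return F

T = np.array([[1.0, 2.0],
              [0.0, 1.1]])                     # a block with close eigenvalues
F = exp_atomic_block(T)
```

Here M has the 2-by-2 form shown above with ε = −0.05 and α = 2, so its even powers are multiples of I and the large off-diagonal entry only appears in odd powers.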


Features of Algorithm

• Costs $O(n^3)$ flops, or up to $n^4/3$ flops if large blocks are needed (close, repeated eigenvalues).
• Needs derivatives if block size > 1: the price to pay for treating general f and nonnormal A.
• Best general f(A) alg. Benchmark for comparing other f(A) algs, both general and specific.
• The basis of a new funm for next MATLAB release.


OUTLINE

  Definitions of f(A)
  Applications
  Algorithms for particular f
  Schur–Parlett algorithm for general f
▶ Computing f(A)b


log(A) b

Apply quadrature rule $\int_0^1 f(t)\,dt \approx \sum_{k=1}^{m} c_k f(t_k)$ to (Wouk, 1965)

$$\log A = \int_0^1 (A - I)\bigl[t(A - I) + I\bigr]^{-1}\,dt.$$

Combine with Hessenberg reduction $A = QHQ^T$ to get

$$(\log A)\,b \approx Q \sum_{k=1}^{m} c_k \bigl[t_k(H - I) + I\bigr]^{-1} d, \qquad d = Q^T(A - I)b.$$

Costs $(10/3)n^3 + 2mn^2$ flops.

When $\|I - A\| < 1$ can use m-point Gauss-Legendre ≡ Padé approximation! Choose m using (Kenney & Laub, 2001)

$$\|r_{mm}(X) - \log(I + X)\| \le |r_{mm}(-\|X\|) - \log(1 - \|X\|)|.$$

When $\|I - A\| > 1$ use adaptive quadrature.
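A minimal sketch of the Gauss–Legendre variant for the case ‖I − A‖ < 1. For brevity it solves with the full matrix at each node instead of first reducing A to Hessenberg form, so it does not achieve the flop count quoted above; the name `logm_times_vector`, the choice m = 12 and the test data are assumptions.

```python
import numpy as np

def logm_times_vector(A, b, m=12):
    """Approximate log(A) @ b by m-point Gauss-Legendre quadrature on [0, 1]."""
    I = np.eye(A.shape[0])
    nodes, weights = np.polynomial.legendre.leggauss(m)
    t = (nodes + 1) / 2                        # map nodes from [-1, 1] to [0, 1]
    c = weights / 2
    d = (A - I) @ b
    # (A - I) commutes with [t(A - I) + I]^{-1}, so it can be applied to b first.
    return sum(ck * np.linalg.solve(tk * (A - I) + I, d) for tk, ck in zip(t, c))

A = np.array([[1.2, 0.1],
              [0.1, 0.9]])                     # symmetric, ||I - A|| < 1
b = np.array([1.0, -2.0])
y = logm_times_vector(A, b)

# Reference from the symmetric eigendecomposition A = V diag(w) V^T.
w, V = np.linalg.eigh(A)
y_ref = V @ (np.log(w) * (V.T @ b))
```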


$A^\alpha b$

$$\frac{dy}{dt} = \alpha(A - I)[t(A - I) + I]^{-1}y, \quad y(0) = b$$

has unique solution $y(t) = [t(A - I) + I]^\alpha b \;\Rightarrow\; y(1) = A^\alpha b$.
Used by Allen, Baglama & Boyd (2000) for α = 1/2, spd A.

Example using MATLAB's ode45.
A = gallery('parter',64), b = randn(64,1).

f(A)      tol    Succ. steps  Fail. atts  f evals  Rel. err
A^{-1/2}  1e-3   12           0           73       3.5e-8
          1e-6   14           0           85       6.0e-9
          1e-9   40           0           241      7.7e-12
A^{2/5}   1e-3   15           0           79       2.8e-8
          1e-6   16           0           91       2.4e-9
          1e-9   54           0           325      1.8e-12
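The ODE formulation can be tried with any integrator. The sketch below swaps MATLAB's adaptive ode45 for a fixed-step classical Runge–Kutta (RK4) scheme so that it needs only NumPy; the step count, the 2-by-2 test matrix and the name `power_times_vector` are assumptions, and the numbers in the table above are not reproduced by it.

```python
import numpy as np

def power_times_vector(A, b, alpha, steps=200):
    """Approximate A^alpha @ b by integrating the ODE from t = 0 to t = 1 with RK4."""
    I = np.eye(A.shape[0])
    E = A - I

    def rhs(t, y):
        # y' = alpha (A - I) [t(A - I) + I]^{-1} y
        return alpha * (E @ np.linalg.solve(t * E + I, y))

    h = 1.0 / steps
    y = np.array(b, dtype=float)
    for i in range(steps):
        t = i * h
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h / 2 * k1)
        k3 = rhs(t + h / 2, y + h / 2 * k2)
        k4 = rhs(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                     # symmetric positive definite
b = np.array([1.0, 1.0])
y = power_times_vector(A, b, alpha=0.5)

# Reference via the symmetric eigendecomposition.
w, V = np.linalg.eigh(A)
y_ref = V @ (w ** 0.5 * (V.T @ b))
```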


Interpolation

If A has distinct eigenvalues $\lambda_j$, Lagrange interp poly:

$$f(A)b = \sum_{j=0}^{n} f_j\,\ell_j(A)\,b, \qquad
\ell_j(x) = \frac{\prod_{k=0,\,k\ne j}^{n} (x - \lambda_k)}{\prod_{k=0,\,k\ne j}^{n} (\lambda_j - \lambda_k)}.$$

Cost: $O(n^4)$ flops.

For any A, Newton divided difference form:

$$f(A)b = \sum_{i=0}^{n} c_i \prod_{j=0}^{i-1} (A - \lambda_j I)\,b, \qquad c_i = \text{(confluent) divided differences}.$$

Requires derivatives of f. Cost: $O(n^3)$ flops.


Cauchy Integral Theorem

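$$y = \frac{1}{2\pi i} \int_\Gamma f(z)(zI - A)^{-1} b\,dz =: \int_\Gamma g(z)\,dz.$$

Take circle $\Gamma: z - \alpha = \beta e^{i\theta}$, $0 \le \theta \le 2\pi$.

Apply repeated trapezium rule (using $dz = i(z - \alpha)\,d\theta$):

$$\int_\Gamma g(z)\,dz = i\int_0^{2\pi} (z(\theta) - \alpha)\,g(z(\theta))\,d\theta \approx \frac{2\pi i}{n} \sum_{k=0}^{n-1} (z_k - \alpha)\,g(z_k),$$

where $z_k - \alpha = \beta e^{2\pi k i/n}$.

Use Hessenberg reduction, as before.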
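Substituting g back in, the trapezium sum becomes y ≈ (1/n) Σ_k (z_k − α) f(z_k) (z_k I − A)^{-1} b. A small NumPy sketch follows, again without the Hessenberg reduction; the circle centre α = 0, radius β = 3, n = 64 nodes and the name `cauchy_fAb` are assumptions chosen for a small well-conditioned example.

```python
import numpy as np

def cauchy_fAb(f, A, b, alpha=0.0, beta=3.0, n=64):
    """Approximate f(A) @ b by the trapezium rule on the circle |z - alpha| = beta."""
    I = np.eye(A.shape[0])
    y = np.zeros(A.shape[0], dtype=complex)
    for k in range(n):
        z = alpha + beta * np.exp(2j * np.pi * k / n)
        # One shifted linear solve per quadrature node.
        y += (z - alpha) * f(z) * np.linalg.solve(z * I - A, b)
    return y / n

A = np.array([[1.2, 0.1],
              [0.1, 0.9]])                     # eigenvalues well inside the circle
b = np.array([1.0, -2.0])
y = cauchy_fAb(np.exp, A, b)

# Reference via the symmetric eigendecomposition.
w, V = np.linalg.eigh(A)
y_ref = V @ (np.exp(w) * (V.T @ b))
```

The computed y is complex with a negligible imaginary part for this real problem; the contour must enclose all eigenvalues of A while staying away from them.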


Euler-Maclaurin Error Bound

h(x) has period 2π, is in $C^{2k+1}(-\infty, \infty)$, and $|h^{(2k+1)}(x)| \le M$:

$$\left| \int_0^{2\pi} h(x)\,dx - T_n(h) \right| \le \frac{4\pi M\,\zeta(2k + 1)}{n^{2k+1}}.$$

• $h^{(2k+1)}(x)$ is proportional to $\beta^{2k+2}$, where β = radius of circle.

• $h^{(2k+1)}(x)$ contains powers of the resolvent $(z(\theta)I - A)^{-1}$. Bad if the contour is close to some $\lambda_i$ or A is highly nonnormal.

• $h^{(2k+1)}(x)$ contains derivatives of f on the contour.

Conclude: restricted to matrices

• not too nonnormal,
• whose $\lambda_i$ can be enclosed in a circle of small radius not close to a singularity of the derivatives of f.


Future Work

• Theory and algorithms for non-primary functions, perhaps linked to an f(A(t)) application.
• Better understanding of conditioning of f(A).
• Exploiting structure, e.g. A ∈ matrix automorphism group (H, Mackey, Mackey & Tisseur, 2003).

http://www.ma.man.ac.uk/~higham/
