The pivoted Cholesky decomposition and its application to stochastic PDEs
Helmut Harbrecht, Michael Peters,
and Reinhold Schneider
H. Harbrecht
Institute of Applied Analysis and Numerical Simulation
University of Stuttgart (Germany)
Overview
• Motivation
• Pivoted Cholesky decomposition
• Karhunen-Loève expansion
• Second moment analysis
• Concluding remarks
Helmut Harbrecht
2
Motivation
• elliptic boundary value problems can be solved with high accuracy, provided that the input data are known exactly
• the practical significance of highly accurate numerical solutions is limited due to inexact input data

Model equation:

−div[α(ω)∇u(ω)] = f(ω) in D(ω)
u(ω) = 0 on ∂D(ω)

Quantities of interest:
• expectation: Eu(x) = ∫_Ω u(x,ω) dP(ω)
• two-point correlation: Cor_u(x,y) = ∫_Ω u(x,ω) u(y,ω) dP(ω)
• variance: Vu(x) = Eu²(x) − E²u(x) = Cor_u(x,x) − E²u(x)
Goal: For given mean and two-point correlation of the stochastic input data, compute
the mean and the two-point correlation of the random solution of the boundary value
problem.
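These statistics can be estimated directly by sampling. A minimal numpy sketch, using a hypothetical one-dimensional random field u(x,ω) = ξ·sin(πx) with ξ ~ N(1, 0.1) as a stand-in for an actual PDE solution, so the exact statistics are known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random field standing in for the PDE solution:
# u(x, omega) = xi * sin(pi x) with xi ~ N(1, 0.1), so that the exact
# statistics are known (E u = sin(pi x), V u = 0.01 sin(pi x)^2).
x = np.linspace(0.0, 1.0, 101)
n_samples = 20000
xi = rng.normal(1.0, 0.1, size=n_samples)
U = np.outer(xi, np.sin(np.pi * x))      # one realization u(., omega) per row

Eu = U.mean(axis=0)                      # expectation E u(x)
Cor = U.T @ U / n_samples                # two-point correlation Cor_u(x, y)
Var = np.diag(Cor) - Eu**2               # V u(x) = Cor_u(x, x) - E^2 u(x)

print(np.abs(Var - 0.01 * np.sin(np.pi * x)**2).max())
```

The point of the deck is that such sampling is expensive for PDEs; the deterministic equations for Eu and Cor_u below avoid it.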
Karhunen-Loève expansion

Approximation of stochastic fields α ∈ L²(D) ⊗ L²_P(Ω) by the truncated Karhunen-Loève expansion

α(x,ω) ≈ Eα(x) + ∑_{i=1}^m √λ_i ϕ_i(x) ψ_i(ω)

with orthogonal collections {ϕ_i} ⊂ L²(D) and {ψ_i} ⊂ L²_P(Ω).

The Karhunen-Loève expansion involves the computation of the dominant eigenpairs (λ_i, ϕ_i) of the integral operator

(K ϕ_i)(x) = ∫_D Covar_α(x,y) ϕ_i(y) dy = λ_i ϕ_i(x), x ∈ D,

with covariance kernel

Covar_α(x,y) = ∫_Ω (α(x,ω) − Eα(x)) (α(y,ω) − Eα(y)) dP(ω) ∈ L²(D×D).

The eigenvalue problem for a nonlocal operator requires fast methods.

Theorem. (Schwab/Todor [2006]) If Covar_α ∈ H^p(D×D), then the eigenvalues (λ_m)_{m∈N} of K decay like λ_m ≲ m^{−p/d} as m → ∞.
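The decay of the eigenvalues can be observed numerically. A sketch under assumed data: a Gaussian covariance kernel with correlation length σ = 0.5 on D = [0,1], discretized by a midpoint (Nyström) rule:

```python
import numpy as np

# Discretize K phi = lambda phi for Covar(x, y) = exp(-(x - y)^2 / sigma^2)
# on D = [0, 1] with a midpoint quadrature rule (Nystrom method).
n, sigma = 200, 0.5
x = (np.arange(n) + 0.5) / n
C = np.exp(-(x[:, None] - x[None, :])**2 / sigma**2)
lam = np.linalg.eigvalsh(C / n)[::-1]        # eigenvalues, largest first

# The smooth kernel yields rapidly decaying eigenvalues, so only a few
# Karhunen-Loeve terms are needed to capture almost all of the variance:
m = int(np.searchsorted(np.cumsum(lam), 0.9999 * lam.sum())) + 1
print(m)
```

For this smooth kernel the eigenvalues decay far faster than any algebraic rate, so the truncation rank m stays small.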
Pivoted Cholesky decomposition
Lemma. Let the matrix

A = [ a   bᵀ ]
    [ b   C  ]  ∈ R^{n×n}

be symmetric and positive semi-definite with a > 0. Then the Schur complement

S := C − (1/a) bbᵀ ∈ R^{(n−1)×(n−1)}

is well-defined and also symmetric and positive semi-definite.

• Observation. Pivoting makes it possible to apply the Cholesky decomposition to positive semi-definite matrices. Hence, if A has finite rank m, the pivoted Cholesky decomposition terminates with a rank-m decomposition A = L_m L_mᵀ.

• Question. What happens if A is only approximately of rank m, i.e.,

‖A − A_m‖₂ ≤ ε

with A_m being a positive semi-definite rank-m matrix?
Pivoted Cholesky decomposition
• Trace norm. The best possible reduction of the trace error in one Cholesky step is achieved if the trace of the Schur complement becomes as small as possible. This amounts to the problem

trace(A − A_m) = trace S = trace(A − A_{m−1}) − (1/a^{(m−1)}_{i,i}) ‖a^{(m−1)}_i‖₂² → min over i = 1, …, n

— too expensive!

• Strategy. Remove the largest diagonal coefficient of the remainder matrix:

trace(A − A_m) = trace S = trace(A − A_{m−1}) − max_i a^{(m−1)}_{i,i}

— total pivoting!

Algorithm (total pivoting): Permute the matrix such that the largest diagonal element is at the (1,1)-position and then compute the Cholesky step:

A = A_m + E_m = L_m L_mᵀ + E_m  with  E_m := P₁P₂⋯P_m [ 0  0 ; 0  S_m ] P_m⋯P₂P₁.
Algorithm 1: Pivoted Cholesky decomposition (cost O(m²n))
Data: matrix A = [a_{i,j}] ∈ R^{n×n} and error tolerance ε > 0
Result: low-rank approximation A_m = ∑_{i=1}^m ℓ_i ℓ_iᵀ such that trace(A − A_m) ≤ ε
begin
  set m := 1;
  set d := diag(A) and error := ‖d‖₁;
  initialize π := (1, 2, …, n);
  while error > ε do
    set i := argmax{ d_{π_j} : j = m, m+1, …, n };
    swap π_m and π_i;
    set ℓ_{m,π_m} := √(d_{π_m});
    for m+1 ≤ i ≤ n do
      compute ℓ_{m,π_i} := ( a_{π_m,π_i} − ∑_{j=1}^{m−1} ℓ_{j,π_m} ℓ_{j,π_i} ) / ℓ_{m,π_m};
      update d_{π_i} := d_{π_i} − ℓ²_{m,π_i};
    compute error := ∑_{i=m+1}^n d_{π_i};
    increase m := m + 1;
end
Notice that only the diagonal entries of the matrix A and the m rows associated with the pivot elements need to be evaluated to compute the rank-m approximation. All other matrix coefficients do not enter the computation. This makes the method highly attractive for the sparse approximation of smooth nonlocal operators (see Thm. 3.2). For operators with kernel functions that exhibit a singularity on the diagonal x = y, it may be better to introduce a suitable partitioning of the matrix, which leads to the original adaptive cross approximation as introduced in [1, 2].
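For illustration, a compact numpy version of Algorithm 1 (a sketch, not the authors' reference code; instead of physically permuting the matrix it pivots on the largest entry of the residual diagonal and subtracts rank-one updates, which is algebraically equivalent):

```python
import numpy as np

def pivoted_cholesky(get_column, diag, eps):
    """Pivoted Cholesky decomposition in the spirit of Algorithm 1.

    get_column(i) returns column i of a symmetric positive semi-definite
    matrix A; diag is diag(A). Returns L with trace(A - L @ L.T) <= eps.
    Only the diagonal and one column per step are ever evaluated.
    """
    d = np.array(diag, dtype=float)
    L = []
    error = d.sum()
    while error > eps:
        i = int(np.argmax(d))                     # total pivoting on the diagonal
        col = np.array(get_column(i), dtype=float)
        for ell in L:                             # subtract previous rank-1 terms
            col -= ell[i] * ell
        ell = col / np.sqrt(d[i])
        L.append(ell)
        d -= ell**2                               # diagonal of the Schur complement
        d[i] = 0.0
        error = d.sum()                           # current trace error
    return np.column_stack(L)

# Usage on a smooth (hence numerically low-rank) kernel matrix:
x = np.linspace(0.0, 1.0, 300)
A = np.exp(-(x[:, None] - x[None, :])**2)
L = pivoted_cholesky(lambda i: A[:, i], np.diag(A), eps=1e-8)
print(L.shape[1], np.trace(A - L @ L.T))
```

Step m touches one column of A and subtracts m − 1 rank-one corrections, so the total cost is O(m²n), in agreement with Theorem 3.1 below.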
Theorem 3.1. Let A ∈ R^{n×n} be symmetric and positive semi-definite. Then performing m steps of the pivoted Cholesky decomposition is of complexity O(m²n).

Proof. The most expensive part of Algorithm 1 is the computation of the Cholesky vectors ℓ_k, k = 1, 2, …, m. This requires

∑_{k=1}^m ∑_{i=k+1}^n ∑_{j=1}^{k−1} 1 ≤ ∑_{k=1}^m (k−1) n ≤ (m²/2) n

additions and multiplications each, which proves the assertion.
Features
• symmetric low-rank approximation: A≈ Am = LmLTm
• approximation error is rigorously controlled in terms of the trace norm
• stable variant of the Cholesky decomposition, especially if the eigenvalues decay rapidly
• only the diagonal coefficients and the m columns of A associated with the pivot elements need to be computed
• extremely simple to implement
• coincides with the adaptive cross approximation for symmetric matrices
• a purely algebraic convergence proof is available
Convergence

Theorem. (H./Peters/Schneider) Assume that the eigenvalues of A ∈ R^{n×n} satisfy

4^m λ_m ≲ exp(−bm)

for some b > 0 uniformly in n. Then the pivoted Cholesky approximation A_m with rank m ∼ |log(ε/n)| satisfies trace(A − A_m) ≲ ε uniformly as ε tends to zero.
Proof. Assume that A is permuted such that the k-th pivot is found at the (k,k)-position for all k = 1, 2, …, n. Then L_m ∈ R^{n×m} is always a lower triangular matrix. It follows from

A_m = L_m L_mᵀ
    = [ L_{1,1}  0 ] [ L_{1,1}ᵀ  L_{2,1}ᵀ ]
      [ L_{2,1}  0 ] [    0         0     ]
    = [ L_{1,1} L_{1,1}ᵀ   L_{1,1} L_{2,1}ᵀ ]
      [ L_{2,1} L_{1,1}ᵀ   L_{2,1} L_{2,1}ᵀ ]
    = [ A_{1,1}   A_{1,2}          ]
      [ A_{2,1}   L_{2,1} L_{2,1}ᵀ ]

that L_{1,1} L_{1,1}ᵀ is the (pivoted) Cholesky decomposition of A_{1,1}. Consequently, we have

1/λ_m(A_{1,1}) = ‖A_{1,1}^{−1}‖₂ = ‖L_{1,1}^{−1}‖₂² ≤ (4^m + 6m − 1)/(9 ℓ²_{m,m}) ≤ 4^m/ℓ²_{m,m},

where the first bound is sharp. The trace norm of A − A_m is bounded by (n − m) times the pivot element ℓ²_{m,m}:

trace(A − A_m) ≤ (n − m) ℓ²_{m,m} ≤ 4^m n λ_m(A_{1,1}) ≤ 4^m n λ_m(A),

where the last inequality follows from the Courant-Fischer theorem.
Numerical results I
Gauss kernel: (2πσ²)^{−1/2} exp(−|x−y|²/σ²)
Rank m for tolerance ε and value of σ:

    ε    | σ = 1 | σ = 0.5 | σ = 0.1 | σ = 0.05 | σ = 0.01
  10^-1  |   2   |    3    |   10    |    19    |    89
  10^-2  |   3   |    5    |   15    |    28    |   137
  10^-3  |   4   |    5    |   19    |    36    |   173
  10^-4  |   5   |    6    |   21    |    39    |   187
  10^-5  |   5   |    7    |   24    |    46    |   214
  10^-6  |   5   |    8    |   27    |    50    |   238
[Figure: error in the trace norm, eigenvalues of the matrix A, and eigenvalues of the Hilbert-Schmidt operator, plotted against the rank.]
Jumping Gauss kernel:
Rank m for tolerance ε and value of σ:

    ε    | σ = 1 | σ = 0.5 | σ = 0.1 | σ = 0.05 | σ = 0.01
  10^-1  |   2   |    3    |   10    |    17    |    81
  10^-2  |   3   |    4    |   15    |    28    |   131
  10^-3  |   4   |    5    |   18    |    34    |   168
  10^-4  |   4   |    6    |   21    |    39    |   186
  10^-5  |   5   |    7    |   24    |    45    |   211
  10^-6  |   5   |    8    |   26    |    50    |   234
[Figure: error in the trace norm, eigenvalues of the matrix A, and eigenvalues of the Hilbert-Schmidt operator, plotted against the rank.]
Numerical results II
Random kernel: A = ∑_{k=1}^m λ_k v_k v_kᵀ, λ_k = exp(−σk), v_kᵀ v_ℓ = δ_{k,ℓ}
Rank m for tolerance ε and value of σ:

    ε    | σ = 1 | σ = 0.5 | σ = 0.1 | σ = 0.05 | σ = 0.01
  10^-1  |   3   |    6    |   29    |    61    |   333
  10^-2  |   6   |   11    |   56    |   115    |   610
  10^-3  |   8   |   15    |   81    |   167    |   873
  10^-4  |  10   |   21    |  106    |   216    |  1126
  10^-5  |  13   |   25    |  130    |   266    |  1375
  10^-6  |  15   |   30    |  154    |   315    |  1618
[Figure: error in the trace norm, eigenvalues of the matrix A, and eigenvalues of the Hilbert-Schmidt operator, plotted against the rank.]
Poisson kernel: exp(−σ|x−y|)/√σ

Rank m for tolerance ε and value of σ:

    ε    | σ = 1 | σ = 10^-1 | σ = 10^-2 | σ = 10^-3 | σ = 10^-4
  10^-1  |    5  |     1     |     1     |     1     |     1
  10^-2  |   36  |     5     |     1     |     1     |     1
  10^-3  |  376  |    36     |     5     |     1     |     1
  10^-4  | 3616  |   376     |    36     |     5     |     1
[Figure: error in the trace norm, eigenvalues of the matrix A, and eigenvalues of the Hilbert-Schmidt operator, plotted against the rank.]
Fast eigenpair computation
Generalized eigenvalue problem:

Ax = λBx,  A = [(K ψ_i, ψ_j)]_{i,j},  B = [(ψ_i, ψ_j)]_{i,j}

Inserting the low-rank approximation

A ≈ A_m := L_m L_mᵀ,  L_m ∈ R^{n×m}

leads to

L_m L_mᵀ x = λBx  ⟺  B^{−1/2} L_m L_mᵀ B^{−1/2} x̃ = λ x̃,  x = B^{−1/2} x̃.

Since the nonzero eigenvalues of MMᵀ and MᵀM coincide, we can replace the large eigenvalue problem by a small one:

(L_mᵀ B^{−1} L_m) x̂ = λ x̂  with  L_mᵀ B^{−1} L_m ∈ R^{m×m},  x = B^{−1} L_m x̂.

Error estimate: (Bauer/Fike)

|λ_k − λ̃_k| ≤ ‖B^{−1/2}(A − A_m)B^{−1/2}‖₂ ≲ ‖A − A_m‖₂,  k = 1, 2, …, m

With the low-rank approximation, this yields a speed-up of more than a factor of ten compared to ARPACK.
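The reduction to the m×m problem can be checked numerically. A sketch with synthetic data (a random L_m of rank m = 8 and a random s.p.d. matrix B, not the Galerkin matrices from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 400, 8

# Synthetic data (assumption for this sketch): A of exact rank m, B s.p.d.
Lm = rng.standard_normal((n, m))
A = Lm @ Lm.T
Q = rng.standard_normal((n, n))
B = Q @ Q.T + n * np.eye(n)

# Small m x m eigenvalue problem: (Lm^T B^-1 Lm) xhat = lambda xhat,
# with eigenvectors recovered as x = B^-1 Lm xhat.
M = Lm.T @ np.linalg.solve(B, Lm)
lam_small, Xhat = np.linalg.eigh(M)
X = np.linalg.solve(B, Lm @ Xhat)

# Reference: eigenvalues of the symmetrized full problem B^-1/2 A B^-1/2.
w, V = np.linalg.eigh(B)
Bm12 = V @ np.diag(w**-0.5) @ V.T
lam_full = np.linalg.eigvalsh(Bm12 @ A @ Bm12)

print(np.abs(np.sort(lam_small) - lam_full[-m:]).max())
```

Only one dense m×m eigenproblem and m linear solves with B are needed, instead of a full n×n generalized eigensolve.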
Eigenvalue computation

Gauss kernel exp(−100‖x−y‖²): approximate spectrum for ε = 0.1/0.01/0.001/0.0001
[Figure: approximate spectra plotted against the index.]

Poisson kernel exp(−‖x−y‖): approximate spectrum for ε = 0.1/0.05/0.025/0.01
[Figure: approximate spectra plotted against the index.]
Stochastic loadings

Stochastic boundary value problem:

−div[α∇u(ω)] = f(ω) in D,  u(ω) = 0 on ∂D

→ the random solution depends linearly on the stochastic input data

Theorem. (Schwab/Todor) For the PDE with stochastic loading

−div[α∇u(ω)] = f(ω) in D,  u(ω) = g(ω) on ∂D,

one has

−div[α∇Eu] = Ef in D,  Eu = Eg on ∂D

and

(div_x ⊗ div_y)[(α(x) ⊗ α(y))(∇_x ⊗ ∇_y) Cor_u(x,y)] = Cor_f(x,y),  x, y ∈ D,
−div_x[α(x) ∇_x Cor_u(x,y)] = 0,  x ∈ D, y ∈ ∂D,
−div_y[α(y) ∇_y Cor_u(x,y)] = 0,  x ∈ ∂D, y ∈ D,
Cor_u(x,y) = 0,  x, y ∈ ∂D.

By perturbation theory, similar equations are derived in the case of stochastic diffusion coefficients or stochastic domains.
Two-point correlation functions

Second order statistics:

A Eu = Ef,  (A ⊗ A) Cor_u = Cor_f

Approximate Cor_f(x,y) ≈ ∑_{i=1}^m ψ_i(x) ψ_i(y) by the pivoted Cholesky decomposition and solve A ϕ_i = ψ_i for all i = 1, 2, …, m. Then it holds that

Cor_u(x,y) ≈ ∑_{i=1}^m ϕ_i(x) ϕ_i(y),  Vu(x) ≈ ∑_{i=1}^m ϕ_i²(x) − E²u(x).
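A sketch of this pipeline in numpy for −u″ = f on (0,1) with homogeneous Dirichlet conditions; the Cor_f kernel is assumed here, and the low-rank factors ψ_i are obtained by spectral truncation as a stand-in for the pivoted Cholesky:

```python
import numpy as np

# Discretize -u'' = f on (0, 1) with homogeneous Dirichlet conditions by
# finite differences; Cor_f is an assumed Gaussian kernel, and the factors
# psi_i come from spectral truncation as a stand-in for the pivoted Cholesky.
n = 199
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

Cf = np.exp(-(x[:, None] - x[None, :])**2 / 0.1)
lam, V = np.linalg.eigh(Cf)
keep = lam > 1e-10 * lam.max()
Psi = V[:, keep] * np.sqrt(lam[keep])          # Cor_f ~ Psi @ Psi.T

Phi = np.linalg.solve(A, Psi)                  # solve A phi_i = psi_i
Cor_u = Phi @ Phi.T                            # Cor_u ~ sum_i phi_i phi_i^T
Var_u = np.diag(Cor_u)                         # minus E^2 u if E f != 0

# Reference: (A (x) A) Cor_u = Cor_f is equivalent to A Cor_u A^T = Cor_f.
Cor_u_full = np.linalg.solve(A, np.linalg.solve(A, Cf).T)
print(np.abs(Cor_u - Cor_u_full).max())
```

Only m deterministic solves with A are needed, instead of one huge solve with the tensor-product operator A ⊗ A.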
Smooth kernel: 1/(σ + ‖x−y‖²)

Rank m for tolerance ε and value of σ:

    ε    | σ = 0.1 | σ = 0.2 | σ = 0.4 | σ = 0.8 | σ = 1.6
  10^-1  |    85   |    46   |    27   |    14   |     9
  10^-2  |   234   |   122   |    66   |    37   |    21
  10^-3  |   442   |   236   |   123   |    68   |    38
  10^-4  |   710   |   371   |   198   |   108   |    61
  10^-5  |  1038   |   539   |   290   |   157   |    87
  10^-6  |  1426   |   748   |   395   |   214   |   118
Concluding remarks
• the pivoted Cholesky decomposition is a simple algorithm to compute low-rank approximations of symmetric positive semi-definite matrices
• an algebraic convergence proof is available
• no need to compute the complete matrix,
only (m+1)n matrix coefficients have to be computed
• the pivoted Cholesky decomposition leads to an
extremely efficient eigenvalue solver
Preprint: H. Harbrecht, M. Peters, and R. Schneider.
On the low-rank approximation by the pivoted Cholesky decomposition.
Preprint 2010-32, SimTech Cluster of Excellence, Universität Stuttgart, Germany, 2010.