Litvinenko low-rank kriging +FFT poster

1
Kriging accelerated by orders of magnitude: combining low-rank with FFT techniques Alexander Litvinenko 1 , Wolfgang Nowak 2 1 CEMSE Division, King Abdullah University of Science and Technology, 2 SimTech, Universit¨ at Stuttgart, Germany [email protected], [email protected] Center for Uncertainty Quantification Abstract Kriging algorithms based on FFT, the separability of certain co- variance functions and low-rank representations of covariance functions have been investigated. The current study combines these ideas, and so combines the individual speedup factors of all ideas. The reduced computational complexity is O(dLlogL), where L := max i n i , i =1..d. For separable covariance functions, the results are exact, and non-separable covariance functions can be approximated through sums of separable components. Speedup factor is 10 8 , problem sizes 15e + 12 and 2e + 15 estima- tion points for Kriging and spatial design. 1. Kriging Task 1: Let m be number of measurement points, n number of estimation points. Let ˆ s R n be the kriging vector to be esti- mated with mean μ s =0 and cov. matrix Q ss R n×n : ˆ s = Q sy Q -1 yy y | {z } ξ . where y R m vector of measurement values, Q sy R n×m cross cov. matrix and Q yy R m×m auto cov. matrix. Task 2: Estimation of Variance ˆ σ s R n Let Q ss|y be the condi- tional covariance matrix. Then ˆ σ s = diag(Q ss|y ) = diag Q ss - Q sy Q -1 yy Q ys = diag (Q ss ) - m X i=1 h (Q sy Q -1 yy e i ) Q T ys (i) i , where Q T ys (i) is the transpose of the i-th row in Q ys . Task 3: Geostatistical optimal design The goal is to optimize sampling patterns from which the data values in y are to be ob- tained. Two most common objective function to be minimized are: φ A = n -1 trace h Q ss|y i and φ C = c T Q ss|y c = c T (Q ss - Q sy Q -1 yy Q ys )c = σ 2 z - (c T Q sy )Q -1 yy (Q ys c), with σ 2 z = c T Q ss c. Any Toeplitz covariance matrix Q ss R n×n (the first column is denoted by q) can be embedded in a larger circulant matrix ˇ Q R ˇ n× ˇ n (the first column is denoted by ˇ q). Sampling and Injection: Consider the m × n sampling matrix H: H i,j = 1 for x i = x j 0 otherwise , where x i are the coordinates of the i-th measurement location in y, and x j are the coordinates of the j -th estimation point in s. sampling: ξ m×1 = Hu n×1 injection: u n×1 = H T ξ m×1 Q ys = HQ ss and Q sy = Q ss H T Q yy = HQ ss H T + R Q sy ξ = Q ss H T ξ = Q ss H T ξ | {z } u . Embedding and extraction: Let M R ˇ n×n maps the entries of the finite embedded domain onto the periodic embedding do- main. M has one single entry of unity per column. Extraction of an embedded Toeplitz matrix Q ss from the embedding circu- lant matrix ˇ Q as follows: Q ss = M T ˇ QM, q = M ˇ q. Kriging vector can be efficiently estimated via d-dimensional FFT: (Q sy ξ =)Q ss u = M T ˇ QMu = M T F [-d] F [d] q) ◦F [d] (Mu) . 2. Low-rank kriging via FFT Lemma 1 Let u = k u j =1 N d i=1 u ji , where u R n , u ji R n i and n = Q d i=1 n i . Then the d-dimensional Fourier transformation ˜ u = F [d] (u) is ˜ u = k u X j =1 d O i=1 ( F i ( u ji )) Lemma 2 Let u and q have CP representation, then u q = k u X j =1 d O i=1 u ji k q X =1 d O i=1 q ‘i = k u X j =1 k q X =1 d O i=1 ( u ji q ‘i ) . F [d] (u q)= k u X =1 k q X j =1 d O i=1 ( F i ( u ji q ‘i )) . 2.1 Accuracy If kq - q (k q ) k F ε q , kq - q (k q ) k F kqk F ε rel,q . Then kQ ss - Q (k q ) ss k F q , kQ ss - Q (k q ) ss k F kQ ss k F ε rel,q . 2.2 d-dimensional embedding/extraction Since M [d] = N d i=1 M i , have ˇ u = M [d] u = d O i=1 M i · k u X j =1 d O i=1 u ji = k u X j =1 d O i=1 M i u ji =: k u X j =1 d O i=1 ˇ u ji , 2.3 Low-rank kriging estimate Q ss u Q ss u Q (k q ) ss u (k u ) = k q X =1 k u X j =1 d O i=1 M T i F -1 i (F i ˇ q ‘i ) (F i ˇ u ji ) . with accuracy kQ ss u - Q (k q ) ss u (k u ) k F q kuk + kQ ss ε u . The total costs is O(k u k q d ˇ L * log ˇ L * ) instead of On log ˇ n), with ˇ L * = max i=1...d ˇ n i and ˇ n = Q d i=1 ˇ n. Kriging estimator ˆ s = Q sy ξ = Q ss H T ξ = M T ˇ QMH T ξ . can be written in the CP tensor format ˆ s = Q sy ξ = m X j =1 ξ j k q X =1 d O i=1 M T i F -1 i (F i ˇ q ‘i ) (F i ˇ h ji ) . 3. Numerics: CPU time and storage 10 -4 10 -2 10 0 10 2 CPU time [s] (a) kriging (b) estimation variance standard FFT/Kron FFT/Kron w/o final FFT 10 3 10 6 10 9 10 12 10 15 10 -4 10 -2 10 0 10 2 CPU time [s] number of grid nodes [-] (c) sum of estimation variance 10 3 10 6 10 9 10 12 10 15 number of grid nodes [-] (d) quadratic form Figure 1: CPU time of the four different methods depending on the number of lattice points in a rectangular domain. 10 0 10 2 10 4 Memory [MByte] (a) kriging (b) estimation variance standard FFT/Kron FFT/Kron w/o final FFT 10 3 10 6 10 9 10 12 10 15 10 0 10 2 10 4 Memory [MByte] number of grid nodes [-] (c) sum of estimation variance 10 3 10 6 10 9 10 12 10 15 number of grid nodes [-] (d) quadratic form Figure 2: Memory requirements of the four different methods de- pending on the number of lattice points in a rectangular domain. Example: concentration of an ore mineral Domain: 20m × 20m × 20m, n = 25000 3 dofs., m = 4000 measurements randomly distributed within the volume, with increasing data density towards the lower left back cor- ner of the domain. The covariance model is anisotropic Gaussian with unit variance and with 32 correlation lengths fitting into the domain in the horizontal directions, and 64 correlation lengths fitting into the vertical direction. Figure 3: The top left figure shows the entire domain at a sam- pling rate of 1:64 per direction, and then a series of zooms into the respective lower left back corner with zoom factors (sampling rates) of 4 (1:16), 16 (1:4), 64 (1:1) for the top right, bottom left and bottom right plots, respectively. Color scale: showing the 95% confidence interval [μ - 2σ, μ +2σ ]. 4. Conclusion We develop new algorithms for large-scale Kriging problems (in- cluding the estimation variance and measures for the optimality of sampling patterns), combining low-rank tensor approximations with existing fast methods based on the FFT. The computational cost is O(k q mdL * log L * ), where k q is the rank, m number of mea- surements, d dimension and L * = max(n i ) with n = Q d i=1 n i . Memory cost: O ( [k q + m]dL ) , where L = d i=1 n i . 2D Kriging with 2.7e+7 estimation points and 100 measure- ment values takes 0.25 sec., the estimation variance takes < 1 sec., the spatial average of the estimation variance (the A-criterion of geostat. optimal design) for 2 · 10 12 estim. points takes 30 sec., the C -criterion of geostat. optimal design for 2 · 10 15 estimation points takes 30 sec., 3D Kriging problem with 15 · 10 12 estimation points and 4000 measurement data values takes 20 sec. Acknowledgements A. Litvinenko is a member of the KAUST SRI Center for Uncer- tainty Quantification in Computational Science and Engineering. Additionally this research has been funded by the Cluster of Ex- cellence in Simulation Technology (EXC 310/1) at the University of Stuttgart, and by the joint DFG project (NO 805/3-1 and MA 2236/16-1). References [1] W. Nowak, A. Litvinenko, Kriging accelerated by orders of magnitude: combining low-rank covariance approximations with FFT-techniques, Mathematical Geosciences, 2013, Vol. 45, Is- sue 4, pp 411-435.

Transcript of Litvinenko low-rank kriging +FFT poster

Page 1: Litvinenko low-rank kriging +FFT  poster

Center for UncertaintyQuantification

Center for UncertaintyQuantification

Center for Uncertainty Quantification Logo Lock-up

Kriging accelerated by orders of

magnitude: combining low-rank with

FFT techniquesAlexander Litvinenko1, Wolfgang Nowak2

1 CEMSE Division, King Abdullah University of Science and Technology,2 SimTech, Universitat Stuttgart, Germany

[email protected],[email protected]

Center for UncertaintyQuantification

Center for UncertaintyQuantification

Center for Uncertainty Quantification Logo Lock-up

Abstract

Kriging algorithms based on FFT, the separability of certain co-variance functions and low-rank representations of covariancefunctions have been investigated. The current study combinesthese ideas, and so combines the individual speedup factors ofall ideas. The reduced computational complexity is O(dLlogL),where L := maxini, i = 1..d. For separable covariance functions,the results are exact, and non-separable covariance functionscan be approximated through sums of separable components.Speedup factor is 108, problem sizes 15e+ 12 and 2e+ 15 estima-tion points for Kriging and spatial design.

1. Kriging

Task 1: Let m be number of measurement points, n number ofestimation points. Let s ∈ Rn be the kriging vector to be esti-mated with mean µs = 0 and cov. matrix Qss ∈ Rn×n:

s = QsyQ−1yy y︸ ︷︷ ︸ξ

.

where y ∈ Rm vector of measurement values, Qsy ∈ Rn×m crosscov. matrix and Qyy ∈ Rm×m auto cov. matrix.Task 2: Estimation of Variance σs ∈ Rn Let Qss|y be the condi-tional covariance matrix. Then

σs = diag(Qss|y) = diag(Qss −QsyQ

−1yyQys

)= diag (Qss)−

m∑i=1

[(QsyQ

−1yy ei) ◦QT

ys(i)],

where QTys(i) is the transpose of the i-th row in Qys.

Task 3: Geostatistical optimal design The goal is to optimizesampling patterns from which the data values in y are to be ob-tained. Two most common objective function to be minimizedare:

φA = n−1 trace[Qss|y

]and

φC = cTQss|yc = cT (Qss −QsyQ−1yyQys)c

= σ2z − (cTQsy)Q−1

yy (Qysc),

with σ2z = cTQssc.

Any Toeplitz covariance matrix Qss ∈ Rn×n (the first columnis denoted by q) can be embedded in a larger circulant matrixQ ∈ Rn×n (the first column is denoted by q).Sampling and Injection: Consider the m× n sampling matrix H:

Hi,j =

{1 for xi = xj0 otherwise ,

where xi are the coordinates of the i-th measurement location iny, and xj are the coordinates of the j-th estimation point in s.

sampling: ξm×1 = Hun×1

injection: un×1 = HTξm×1

Qys = HQss and Qsy = QssHT

Qyy = HQssHT + R

Qsyξ = QssHTξ = Qss

(HTξ

)︸ ︷︷ ︸

u

.

Embedding and extraction: Let M ∈ Rn×n maps the entries ofthe finite embedded domain onto the periodic embedding do-main. M has one single entry of unity per column. Extractionof an embedded Toeplitz matrix Qss from the embedding circu-lant matrix Q as follows:

Qss = MT QM, q = Mq.

Kriging vector can be efficiently estimated via d-dimensional FFT:

(Qsyξ =)Qssu = MT QMu = MTF [−d](F [d] (q) ◦ F [d] (Mu)

).

2. Low-rank kriging via FFT

Lemma 1 Let u =∑kuj=1

⊗di=1 uji, where u ∈ Rn, uji ∈ Rni

and n =∏di=1 ni. Then the d-dimensional Fourier transformation

u = F [d](u) is

u =

ku∑j=1

d⊗i=1

(Fi(uji))

Lemma 2 Let u and q have CP representation, then

u ◦ q =

ku∑j=1

d⊗i=1

uji

◦ kq∑`=1

d⊗i=1

q`i

=

ku∑j=1

kq∑`=1

d⊗i=1

(uji ◦ q`i

).

F [d](u ◦ q) =

ku∑`=1

kq∑j=1

d⊗i=1

(Fi(uji ◦ q`i

)).

2.1 AccuracyIf

‖q− q(kq)‖F ≤ εq,‖q− q(kq)‖F‖q‖F

≤ εrel,q.

Then

‖Qss −Q(kq)ss ‖F ≤

√nεq,

‖Qss −Q(kq)ss ‖F

‖Qss‖F≤ εrel,q .

2.2 d-dimensional embedding/extraction

Since M[d] =⊗d

i=1 Mi, have

u = M[d]u =

d⊗i=1

Mi

· ku∑j=1

d⊗i=1

uji =

ku∑j=1

d⊗i=1

Miuji =:

ku∑j=1

d⊗i=1

uji,

2.3 Low-rank kriging estimate Qssu

Qssu ≈ Q(kq)ss u(ku) =

kq∑`=1

ku∑j=1

d⊗i=1

MTi F−1i

[(Fiq`i) ◦ (Fiuji)

].

with accuracy

‖Qssu−Q(kq)ss u(ku)‖F ≤

√nεq‖u‖ + ‖Qss‖ · εu.

The total costs is O(kukqd L∗ log L∗) instead of O(n log n), with

L∗ = maxi=1...d ni and n =∏di=1 n.

Kriging estimator

s = Qsyξ = QssHTξ = MT QMHTξ.

can be written in the CP tensor format

s = Qsyξ =

m∑j=1

ξj

kq∑`=1

d⊗i=1

(MT

i F−1i

[(Fiq`i) ◦ (Fihji)

]).

3. Numerics: CPU time and storage

10−4

10−2

100

102

CP

U tim

e [s]

(a) kriging (b) estimation variance

standard

FFT/Kron

FFT/Kron w/o final

FFT

103

106

109

1012

1015

10−4

10−2

100

102

CP

U tim

e [s]

number of grid nodes [−]

(c) sum of estimation variance

103

106

109

1012

1015

number of grid nodes [−]

(d) quadratic form

Figure 1: CPU time of the four different methods depending onthe number of lattice points in a rectangular domain.

100

102

104

Mem

ory

[M

Byte

]

(a) kriging (b) estimation variance

standard

FFT/Kron

FFT/Kron w/o final

FFT

103

106

109

1012

1015

100

102

104

Mem

ory

[M

Byte

]

number of grid nodes [−]

(c) sum of estimation variance

103

106

109

1012

1015

number of grid nodes [−]

(d) quadratic form

Figure 2: Memory requirements of the four different methods de-pending on the number of lattice points in a rectangular domain.

Example: concentration of an ore mineral

Domain: 20m × 20m × 20m, n = 250003 dofs., m = 4000measurements randomly distributed within the volume, withincreasing data density towards the lower left back cor-ner of the domain. The covariance model is anisotropicGaussian with unit variance and with 32 correlation lengthsfitting into the domain in the horizontal directions, and64 correlation lengths fitting into the vertical direction.

Figure 3: The top left figure shows the entire domain at a sam-pling rate of 1:64 per direction, and then a series of zooms intothe respective lower left back corner with zoom factors (samplingrates) of 4 (1:16), 16 (1:4), 64 (1:1) for the top right, bottom leftand bottom right plots, respectively. Color scale: showing the95% confidence interval [µ− 2σ, µ + 2σ].

4. Conclusion

We develop new algorithms for large-scale Kriging problems (in-cluding the estimation variance and measures for the optimalityof sampling patterns), combining low-rank tensor approximationswith existing fast methods based on the FFT. The computationalcost isO(kqmdL

∗ logL∗), where kq is the rank, m number of mea-surements, d dimension and L∗ = max(ni) with n =

∏di=1 ni.

Memory cost: O([kq + m]dL

), where L =

∑di=1 ni.

• 2D Kriging with 2.7e+7 estimation points and 100 measure-ment values takes 0.25 sec.,

• the estimation variance takes < 1 sec.,

• the spatial average of the estimation variance (the A-criterionof geostat. optimal design) for 2 · 1012 estim. points takes 30sec.,

• the C-criterion of geostat. optimal design for 2 · 1015 estimationpoints takes 30 sec.,

• 3D Kriging problem with 15 · 1012 estimation points and 4000measurement data values takes 20 sec.

Acknowledgements

A. Litvinenko is a member of the KAUST SRI Center for Uncer-tainty Quantification in Computational Science and Engineering.Additionally this research has been funded by the Cluster of Ex-cellence in Simulation Technology (EXC 310/1) at the Universityof Stuttgart, and by the joint DFG project (NO 805/3-1 and MA2236/16-1).

References

[1] W. Nowak, A. Litvinenko, Kriging accelerated by orders ofmagnitude: combining low-rank covariance approximations withFFT-techniques, Mathematical Geosciences, 2013, Vol. 45, Is-sue 4, pp 411-435.