
The Block Grade of a Block Krylov Space

Martin H. Gutknecht^1, Thomas Schmelzer^2

^1 Seminar for Applied Mathematics, ETH Zurich

^2 Computing Laboratory, Oxford University

9th Copper Mountain Conference on Iterative Methods

6 April 2006


Linear systems of equations with multiple RHSs

Given a nonsingular linear system with s RHSs,

A x = b    (1)

where A ∈ C^{N×N}, b ∈ C^{N×s}, x ∈ C^{N×s}.    (2)

We may try to solve it for all RHSs at once by using a block Krylov space solver such as Block CG or Block GMRES.

Advantages:

- The search space for the solutions is much bigger.
- Several matrix-vector multiplications (MVs) can be done at once.


Block Krylov space methods

We seek approximate solutions of the form

x_n ∈ x_0 + B^□_n(A, r_0) ⊆ C^{N×s},    (3)

where the block Krylov (sub)space B^□_n :≡ B^□_n(A, r_0) is

B^□_n(A, r_0) :≡ block span(r_0, A r_0, ..., A^{n−1} r_0) ⊂ C^{N×s}    (4)

              :≡ { Σ_{k=0}^{n−1} A^k r_0 γ_k ;  γ_k ∈ C^{s×s} (k = 0, ..., n−1) }.

DEFINITION. A (complex) block vector is a matrix y ∈ C^{N×s}.

Hence, the elements of B^□_n are block vectors.
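As a concrete illustration of definition (4), the sketch below (illustrative code, not from the talk; all names are made up) builds one element of B^□_n as Σ_k A^k r_0 γ_k with random s×s coefficient matrices γ_k, and confirms that the result is an N×s block vector lying in the span of the columns of (r_0, A r_0, A² r_0).

```python
# Illustration of definition (4): an element of the block Krylov space
# B_n(A, r0) is a sum  sum_k A^k r0 gamma_k  with s x s coefficient
# matrices gamma_k, hence itself an N x s "block vector".
import numpy as np

rng = np.random.default_rng(0)
N, s, n = 8, 2, 3
A = rng.standard_normal((N, N))
r0 = rng.standard_normal((N, s))

def block_krylov_element(A, r0, gammas):
    """Return sum_k A^k r0 gamma_k for a list of s x s matrices gammas."""
    y = np.zeros_like(r0)
    power = r0.copy()                 # A^0 r0
    for gamma in gammas:
        y += power @ gamma            # accumulate A^k r0 gamma_k
        power = A @ power             # advance to A^{k+1} r0
    return y

gammas = [rng.standard_normal((s, s)) for _ in range(n)]
y = block_krylov_element(A, r0, gammas)
print(y.shape)   # (8, 2): elements of B_n are N x s block vectors
```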


Block Krylov space methods (cont’d)

This means that each individual approximation x^{(j)} satisfies

x^{(j)}_n ∈ x^{(j)}_0 + B_n(A, r_0) ⊆ C^N,    (5)

where

B_n :≡ B_n(A, r_0) :≡ K^{(1)}_n + ··· + K^{(s)}_n,    (6)

with the s “usual” Krylov (sub)spaces for the s systems,

K^{(j)}_n :≡ K_n(A, r^{(j)}_0) :≡ { Σ_{k=0}^{n−1} A^k r^{(j)}_0 β_{k,j} ;  β_{k,j} ∈ C (∀k) }.    (7)

In other words, each approximation x^{(j)} is taken from a space that is as large as all s “usual” Krylov spaces together: dim B_n ≤ ns.

B^□_n is a Cartesian product of s copies of B_n:

B^□_n = B_n × ··· × B_n    (s times).
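The bound dim B_n ≤ ns can be checked numerically: dim B_n is the rank of the matrix with columns (r_0, A r_0, ..., A^{n−1} r_0). The snippet below is an illustrative sketch (not from the talk) on random data.

```python
# Numerical check that dim B_n(A, r0) = rank [r0, A r0, ..., A^{n-1} r0]
# never exceeds n*s (and equals n*s for generic data while n*s <= N).
import numpy as np

rng = np.random.default_rng(1)
N, s = 10, 3
A = rng.standard_normal((N, N))
r0 = rng.standard_normal((N, s))

def block_krylov_matrix(A, r0, n):
    """The N x (n*s) matrix with columns (r0, A r0, ..., A^{n-1} r0)."""
    blocks, power = [], r0.copy()
    for _ in range(n):
        blocks.append(power)
        power = A @ power
    return np.hstack(blocks)

for n in range(1, 5):
    dim = np.linalg.matrix_rank(block_krylov_matrix(A, r0, n))
    print(n, dim)        # dim B_n <= n*s, and also <= N
```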


Linear dependence of residuals, deflation

B_n :≡ B_n(A, r_0) :≡ K^{(1)}_n + ··· + K^{(s)}_n

is, in general, not a direct sum.

The initial residuals may already be linearly dependent.

Moreover, some of the Krylov subspace dimensions generated later may not increase the block Krylov subspace.

The treatment of these cases requires deflation: the explicit determination of linear dependencies during the generation of the block Krylov subspaces (application of a rank-revealing QR factorization, RRQR).

Deflation leads to a “reduction of the number of RHSs”.

It is possible not only when “one of the systems converges”, but also when “a linear combination of the s systems converges”.
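A minimal sketch of such a dependence check, using SciPy's column-pivoted QR as the rank-revealing factorization (an assumption for illustration; the talk does not prescribe a specific RRQR implementation). A block of three initial residuals with one exact linear dependence is deflated to width two.

```python
# Deflation detection with a rank-revealing QR (here SciPy's column-
# pivoted QR): a dependent column of the residual block is flagged
# and dropped, reducing the effective number of RHSs.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
N = 12
r1 = rng.standard_normal(N)
r2 = rng.standard_normal(N)
r3 = 0.5 * r1 - 2.0 * r2          # exactly dependent initial residual
R0 = np.column_stack([r1, r2, r3])

def deflate(X, tol=1e-10):
    """Orthonormal basis of range(X); columns whose pivoted-QR diagonal
    falls below tol * |R[0,0]| are deflated (approximate deflation)."""
    Q, R, piv = qr(X, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    rank = int(np.sum(diag > tol * diag[0]))
    return Q[:, :rank], rank

Q, rank = deflate(R0)
print(rank)    # 2: the block of 3 RHSs deflates to width 2
```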


The grade

Recall from the single RHS case (s = 1):

Characteristic properties of the grade ν(A, y) of y with respect to A:

dim K_n(A, y) = { n if n ≤ ν,  ν if n ≥ ν } ;

ν = min{ n | dim K_n(A, y) = dim K_{n+1}(A, y) }
  = min{ n | K_n(A, y) = K_{n+1}(A, y) } ;

ν = min{ n | A^{−1} y ∈ K_n(A, y) } ;

ν = min{ n | x_⋆ ∈ x_0 + K_n(A, r_0) } ,

where A x_⋆ = b, r_0 :≡ b − A x_0.
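These characterizations translate directly into code. The sketch below (illustrative, not from the slides) computes ν as the first n at which dim K_n stalls, and then checks the equivalent characterization A^{−1} y ∈ K_ν(A, y) on a small diagonal example where y "sees" three distinct eigenvalues.

```python
# Compute the grade nu(A, y) as the first n with
# dim K_n(A, y) = dim K_{n+1}(A, y), then verify A^{-1} y in K_nu(A, y).
import numpy as np

A = np.diag([1.0, 1.0, 2.0, 2.0, 3.0])   # 3 distinct eigenvalues
y = np.ones(5)

def grade(A, y):
    """Smallest n with dim K_n(A, y) = dim K_{n+1}(A, y)."""
    cols = [y]
    for n in range(1, A.shape[0] + 1):
        cols.append(A @ cols[-1])                  # extend to K_{n+1}
        if np.linalg.matrix_rank(np.column_stack(cols)) == n:
            return n
    return A.shape[0]

nu = grade(A, y)
print(nu)  # 3: y "sees" the three distinct eigenvalues 1, 2, 3

# Equivalent characterization: A^{-1} y already lies in K_nu(A, y).
K_nu = np.column_stack([np.linalg.matrix_power(A, k) @ y for k in range(nu)])
coef, *_ = np.linalg.lstsq(K_nu, np.linalg.solve(A, y), rcond=None)
print(np.allclose(K_nu @ coef, np.linalg.solve(A, y)))  # True
```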


The grade (cont’d)

All this is based on the simple fact that

A^ν y = −y γ_0 − A y γ_1 − ··· − A^{ν−1} y γ_{ν−1} ,    (8)

where γ_0 ≠ 0 and ν is minimal. This can be written as

ψ(A) y = o ,  where  ψ(t) :≡ ψ_{A,y}(t) :≡ t^ν + γ_{ν−1} t^{ν−1} + ··· + γ_1 t + γ_0

is the minimum polynomial of y with respect to A.

LEMMA

K_ν(A, y) is the smallest A-invariant subspace that contains y. The polynomial ψ = ψ_{A,y} is the smallest divisor of the minimal polynomial χ_A of A with ψ(A) y = o. In particular, ν ≤ ∂χ_A.
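A sketch of recovering ψ_{A,y} numerically from relation (8) (illustrative code, not from the slides): solve A^ν y = Σ_k A^k y g_k for the coefficients g_k (so γ_k = −g_k), then check that ψ(A) y = 0 and that the roots of ψ are exactly the eigenvalues seen by y, consistent with ψ dividing χ_A.

```python
# Recover the minimum polynomial psi_{A,y} from (8) by least squares.
import numpy as np

A = np.diag([1.0, 1.0, 2.0, 2.0, 3.0])
y = np.ones(5)
nu = 3                                   # the grade of y w.r.t. A

K = np.column_stack([np.linalg.matrix_power(A, k) @ y for k in range(nu)])
rhs = np.linalg.matrix_power(A, nu) @ y
g, *_ = np.linalg.lstsq(K, rhs, rcond=None)   # A^nu y = sum_k A^k y g_k

print(np.allclose(rhs, K @ g))                # True: psi(A) y = 0

# psi(t) = t^nu - g_{nu-1} t^{nu-1} - ... - g_0; its roots divide chi_A.
monic = np.concatenate([[1.0], -g[::-1]])     # highest degree first
print(np.sort(np.roots(monic)))               # approx [1. 2. 3.]
```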


The grade (cont’d)

In practice, in most problems the grade ν is irrelevant, because ν is large and we need convergence for n ≪ ν.

There are exceptions where ν is small. For such problems projection methods (CG, BICG, GMRES, ...) are very effective.

In any case, considerations about the grade can help us understand the effectiveness of Krylov space methods and block Krylov space methods.

To justify this, we must replace the grade by a more subtle measure that takes approximate solutions into account. For one proposal see Ilic/Turner ['03 ANZIAM J.], ['05 NLAA].


The block grade

In the multiple RHS case (s > 1):

Introduce the block grade ν(A, y) of y with respect to A with the characteristic properties:

ν = min{ n | dim B_n(A, y) = dim B_{n+1}(A, y) }
  = min{ n | B_n(A, y) = B_{n+1}(A, y) }
  = min{ n | B_n(A, y) = B_{n+ℓ}(A, y) (∀ℓ ∈ N) } ;

ν = min{ n | A^{−1} y ∈ B^□_n(A, y) } ;

ν = min{ n | x_⋆ ∈ x_0 + B^□_n(A, r_0) } ,

where A x_⋆ = b, r_0 :≡ b − A x_0.
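The first characterization gives an immediate way to compute the block grade: grow B_n and stop when its dimension stalls. In the made-up example below, the two RHS columns jointly cover all four eigenvalues of a diagonal A, and the block grade (2) is smaller than either individual grade (3).

```python
# Compute the block grade nu(A, r0) as the first n at which dim B_n stalls.
import numpy as np

A = np.diag([1.0, 2.0, 3.0, 4.0])
r0 = np.array([[1.0, 1.0],
               [1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])

def block_grade(A, r0):
    """Smallest n with dim B_n(A, r0) = dim B_{n+1}(A, r0)."""
    blocks, power = [r0], r0.copy()
    dim_prev = np.linalg.matrix_rank(r0)
    for n in range(1, A.shape[0] + 1):
        power = A @ power
        blocks.append(power)
        dim = np.linalg.matrix_rank(np.hstack(blocks))
        if dim == dim_prev:
            return n
        dim_prev = dim
    return A.shape[0]

print(block_grade(A, r0))   # 2
```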


The block grade (cont’d)

Relations between the ordinary grade and the block grade:

LEMMA

The block grade of the block Krylov space and the grades of the corresponding individual Krylov spaces are related by

ν(A, y) ≤ max_{i=1,...,s} ν(A, y^{(i)}) .    (9)

LEMMA

A block Krylov space and the corresponding individual Krylov spaces are related by

B_{ν(A,y)}(A, y) = K_{ν(A,y^{(1)})}(A, y^{(1)}) + ··· + K_{ν(A,y^{(s)})}(A, y^{(s)}) ,

and ν(A, y) is the smallest index for which this holds.
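A numerical spot check of inequality (9) on a made-up example (a sketch, not from the talk): with a generic 5×2 residual block each column has grade 5, while the block grade is only 3, since the block gains up to s dimensions per step.

```python
# Spot check of inequality (9): block grade <= max individual grade.
import numpy as np

rng = np.random.default_rng(3)
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
r0 = rng.standard_normal((5, 2))

def grade(A, y):
    cols = [y]
    for n in range(1, A.shape[0] + 1):
        cols.append(A @ cols[-1])
        if np.linalg.matrix_rank(np.column_stack(cols)) == n:
            return n
    return A.shape[0]

def block_grade(A, r0):
    blocks, power = [r0], r0.copy()
    dim_prev = np.linalg.matrix_rank(r0)
    for n in range(1, A.shape[0] + 1):
        power = A @ power
        blocks.append(power)
        dim = np.linalg.matrix_rank(np.hstack(blocks))
        if dim == dim_prev:
            return n
        dim_prev = dim
    return A.shape[0]

nu_block = block_grade(A, r0)
nu_max = max(grade(A, r0[:, j]) for j in range(r0.shape[1]))
print(nu_block, nu_max)        # 3 5
```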


The block grade (cont’d)

In the single RHS case, in exact arithmetic, computing x_⋆ requires dim K_ν = ν MVs.

In the multiple RHS case, in exact arithmetic, computing x_⋆ requires dim B_ν ∈ [ν, s·ν] MVs.

This is a big interval!

Block methods are most effective (compared to single RHS methods) if

dim B_ν ≪ s·ν .

More exactly: block methods are most effective if

dim B_ν(A, r_0) ≪ Σ_{k=1}^{s} dim K_ν(A, r^{(k)}_0) .
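The favorable end of the interval can be exhibited on a made-up example (a sketch): when the second column of r_0 is A times the first, the two RHSs share almost all of their Krylov content, and dim B_ν = 6 is far below s·ν = 10 and below the sum of the individual Krylov dimensions, 12.

```python
# Strongly dependent RHSs: dim B_nu << s*nu and << sum of individual dims.
import numpy as np

A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
c1 = np.ones(6)
r0 = np.column_stack([c1, A @ c1])   # second RHS = A times the first
s = 2

def krylov_dim(A, y, n):
    return np.linalg.matrix_rank(np.column_stack(
        [np.linalg.matrix_power(A, k) @ y for k in range(n)]))

def block_krylov_dim(A, r0, n):
    blocks, power = [], r0.copy()
    for _ in range(n):
        blocks.append(power)
        power = A @ power
    return np.linalg.matrix_rank(np.hstack(blocks))

nu = 5                                  # block grade: dim B_5 = dim B_6 = 6
dim_B = block_krylov_dim(A, r0, nu)
sum_individual = sum(krylov_dim(A, r0[:, j], 6) for j in range(s))
print(dim_B, s * nu, sum_individual)    # 6 10 12
```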


The block grade (cont’d)

In other words: block methods are most effective (compared to single RHS methods) if deflation is possible and used!

However, exact deflation is rare, and we need approximate deflation depending on a deflation tolerance in the RRQR.

Approximate deflation introduces a deflation error.

The deflation error may deteriorate the convergence speed and/or the accuracy of the computed solution.

Restarting the iteration can be useful from this point of view.


Block Krylov bases

In the single right-hand side case, the columns of the N × n Krylov matrix

K_n :≡ ( r_0  A r_0  ...  A^{n−1} r_0 )

form the Krylov basis of K_n (for n = 1, ..., ν).

In the multiple right-hand side case, the columns of the N × (n·s) matrix

B_n :≡ ( r_0  A r_0  ...  A^{n−1} r_0 )

are still a spanning set of B_n.

But in general they are not linearly independent. We need to delete columns, starting with those in A^{n−1} r_0.

This is nonunique! Deleting the leftmost ones would be arbitrary.

We obtain a tree of block Krylov bases for B_1, ..., B_ν.

We also obtain a set of minimum polynomials of y with respect to A.
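One possible greedy realization of this column selection (a sketch, not the authors' scheme): scan the columns of each new block A^{n−1} r_0 and keep only those that enlarge the span; once an entire block is discarded, the stalling property of the block grade guarantees that B_ν has been reached.

```python
# Greedy construction of one block Krylov basis by column deletion.
import numpy as np

A = np.diag([1.0, 2.0, 3.0, 4.0])
c1 = np.ones(4)
r0 = np.column_stack([c1, A @ c1])   # dependent pair: deflation must occur

def block_krylov_basis(A, r0):
    """Keep the columns of (r0, A r0, A^2 r0, ...) that enlarge the span;
    stop when a whole new block is dependent on the columns kept so far."""
    kept = []
    power = r0.copy()
    while True:
        added = 0
        for j in range(power.shape[1]):
            cand = kept + [power[:, j]]
            if np.linalg.matrix_rank(np.column_stack(cand)) == len(cand):
                kept = cand
                added += 1
        if added == 0:               # no column survived: nu reached
            return np.column_stack(kept)
        power = A @ power

basis = block_krylov_basis(A, r0)
print(basis.shape)   # (4, 4): a basis of B_nu = C^4
```

A production code would also shrink the block width once a column's trajectory is deflated, rather than re-testing it at every step as this sketch does.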


Thanks for listening and come to ...


The ε–grade

Since x_⋆ − x_0 ∈ K_ν(A, r_0) we could write

x_⋆ − x_0 = [ r_0 ξ_0 + A r_0 ξ_1 + ··· + A^{n−1} r_0 ξ_{n−1} ]  (∈ K_n)
          + [ A^n r_0 ξ_n + ··· + A^{ν−1} r_0 ξ_{ν−1} ]  (the “remainder”),

but we prefer an orthogonal decomposition.

We could use the Arnoldi algorithm to construct a B-orthonormal basis {v_0, ..., v_{ν−1}} of K_ν(A, r_0):

x_⋆ − x_0 = [ v_0 ω_0 + v_1 ω_1 + ··· + v_{n−1} ω_{n−1} ]  (∈ K_n)
          + [ v_n ω_n + ··· + v_{ν−1} ω_{ν−1} ]  (the “remainder”, ⊥_B K_n).

Let

x_n :≡ x_0 + v_0 ω_0 + v_1 ω_1 + ··· + v_{n−1} ω_{n−1} ∈ x_0 + K_n ;

then “remainder” = x_⋆ − x_n ⊥_B K_n, i.e., x_n is optimal in the B-norm.

Definition: ν_ε(A, r_0, B) :≡ n once ‖“remainder”‖_B ≤ ε.
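A minimal sketch of this definition for B = I (the plain 2-norm), with all data made up: orthonormalize the Krylov directions and report the first n at which the remainder x_⋆ − x_n drops below ε in norm.

```python
# epsilon-grade for B = I: first n with ||err - proj_{K_n} err||_2 <= eps.
import numpy as np

rng = np.random.default_rng(5)
N = 8
A = rng.standard_normal((N, N)) + N * np.eye(N)   # safely nonsingular
b = rng.standard_normal(N)
x0 = np.zeros(N)
r0 = b - A @ x0
err = np.linalg.solve(A, b) - x0                  # x_star - x0

def eps_grade(A, r0, err, eps):
    Q = np.zeros((len(r0), 0))
    v = r0.copy()
    for n in range(1, len(r0) + 1):
        for _ in range(2):                        # Gram-Schmidt, 2 passes
            v = v - Q @ (Q.T @ v)
        v = v / np.linalg.norm(v)
        Q = np.column_stack([Q, v])
        remainder = err - Q @ (Q.T @ err)         # part outside K_n
        if np.linalg.norm(remainder) <= eps:
            return n
        v = A @ v                                 # next Krylov direction
    return len(r0)

n_eps = eps_grade(A, r0, err, eps=1e-8)
print(n_eps)   # some n with 1 <= n_eps <= N
```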


The ε–grade (cont’d)

When B = A^⋆A, the partial sums of the B-orthogonal series

x_⋆ = x_0 + [ v_0 ω_0 + v_1 ω_1 + ··· + v_{n−1} ω_{n−1} ]  (∈ K_n)
          + [ v_n ω_n + ··· + v_{ν−1} ω_{ν−1} ]  (the “remainder”, ⊥_B K_n)

just represent the iterates of the GCR method (and of GMRES).

Likewise, when A is Hpd and B = A, they represent the CG iterates.

So, we have not really gained much!

In the block case analogous statements can be made.
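This identification can be checked numerically (a sketch in real arithmetic, so A^⋆ = A^T, on made-up data): the partial sums of the B-orthogonal expansion with B = A^T A coincide with the iterates minimizing the residual 2-norm over x_0 + K_n, i.e. with the GCR/GMRES iterates.

```python
# Check: B-orthogonal partial sums (B = A^T A) = minimum-residual iterates.
import numpy as np

rng = np.random.default_rng(6)
N = 6
A = rng.standard_normal((N, N)) + N * np.eye(N)
b = rng.standard_normal(N)
x0 = np.zeros(N)
r0 = b - A @ x0
x_star = np.linalg.solve(A, b)
B = A.T @ A                                     # inner-product matrix

# B-orthonormal basis of K_N(A, r0), Gram-Schmidt with reorthogonalization.
V = []
v = r0.copy()
for _ in range(N):
    for _ in range(2):
        for w in V:
            v = v - w * (w @ (B @ v))
    v = v / np.sqrt(v @ (B @ v))
    V.append(v)
    v = A @ V[-1]
V = np.column_stack(V)

omega = V.T @ (B @ (x_star - x0))               # expansion coefficients

max_diff = 0.0
for n in range(1, N + 1):
    xn = x0 + V[:, :n] @ omega[:n]              # partial-sum iterate
    K = np.column_stack([np.linalg.matrix_power(A, k) @ r0 for k in range(n)])
    K = K / np.linalg.norm(K, axis=0)           # scale columns for lstsq
    xi, *_ = np.linalg.lstsq(A @ K, r0, rcond=None)   # min ||b - A x||_2
    max_diff = max(max_diff, np.linalg.norm(xn - (x0 + K @ xi)))
print(max_diff)   # tiny: the two iterate sequences coincide up to roundoff
```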
