Saddle-Point Systems
Krylov Subspace Methods
Iterative Solvers
Preconditioners
Stationary Iteration Methods
Uzawa’s Method
Augmented Lagrangian Method
Multilevel Methods
Multigrid Method
BPX Preconditioner
Efficient and Robust Solution Strategies for Saddle-Point Systems
Mentor: Dimitar Trenev
Jeremy Chiu, Lola Davidson, Aritra Dutta, Jia Gou, Kak Choon Loy, Mark Thom
Mathematical Modeling in Industry XVIII
University of British Columbia
August 16, 2014
Overview
1 Saddle-Point Systems
2 Krylov Subspace Methods
    Iterative Solvers
    Preconditioners
3 Stationary Iteration Methods
    Uzawa’s Method
    Augmented Lagrangian Method
4 Multilevel Methods
    Multigrid Method
    BPX Preconditioner
Definition
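Saddle-point systems are linear systems of the form
\[
\begin{bmatrix} A & B_1^T \\ B_2 & C \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}
\]
satisfying
$A$ and $C$ are square matrices with $\dim A \ge \dim C$
$A + A^T$ is positive semidefinite
$C$ is symmetric and negative semidefinite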
The Challenge
Saddle-point systems from finite element method discretizations are
sparse but very large: suggests use of iterative solvers
poorly conditioned, indefinite: hard for iterative solvers
Simplified Problem of Interest
Start with a “simple” problem:
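\[
\begin{bmatrix} A & B_1^T \\ B_2 & C \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}
\]
where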
A symmetric and positive definite
B1 = B2
C = 0
Krylov Subspace Methods
Iterative Solvers
Direct methods won’t work!
too slow and not enough memory
turn to iterative methods
Early testing identified:
ℓ-Generalized Biconjugate Gradient Stabilized Method (BiCGstab(ℓ))
Minimal Residual Method (MinRes)
Generalized Minimal Residual Method (GMRes)
Preliminary Results
Figure: Computation vs. mesh size
Time per iteration can be reduced (parallel computing).
Iterating is an inherently serial process, so we want fewer iterations!
Preconditioning
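Recall that the condition number $\kappa(A)$ of a matrix $A$ is
\[ \kappa(A) = \|A\| \cdot \|A^{-1}\| \]
Problem: Solve $Ax = b$, where $\kappa(A) \gg 1$
Idea:
Define $P$ such that $P \approx A$
Hope $P^{-1} \approx A^{-1}$ (and that $P^{-1}$ is much easier to apply!)
Solve $(P^{-1}A)x = P^{-1}b$
Since $P^{-1}A$ resembles the identity, we expect
\[ \kappa(P^{-1}A) < \kappa(A) \]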
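A minimal sketch of the idea, not taken from the talk: the test matrix below is a hypothetical, badly scaled symmetric positive definite matrix, and the preconditioner is the simple Jacobi choice P = diag(A). The condition number is measured as the ratio of extreme eigenvalues, which for P^{-1}A is computed through the similar symmetric matrix P^{-1/2} A P^{-1/2}.

```python
import numpy as np

n = 50
# Hypothetical test matrix: a well-conditioned tridiagonal core M,
# symmetrically rescaled by factors spanning three orders of magnitude.
M = np.eye(n) + 0.3 * (np.eye(n, k=1) + np.eye(n, k=-1))
s = np.logspace(0, 3, n)
A = s[:, None] * M * s[None, :]          # A = diag(s) M diag(s), still SPD


def spectral_cond(sym_matrix):
    """Return lambda_max / lambda_min of a symmetric positive definite matrix."""
    eigs = np.linalg.eigvalsh(sym_matrix)  # eigenvalues in ascending order
    return eigs[-1] / eigs[0]


# Jacobi preconditioner P = diag(A). P^{-1} A has the same eigenvalues as
# the symmetric matrix P^{-1/2} A P^{-1/2}, so we measure that instead.
p_inv_sqrt = 1.0 / np.sqrt(np.diag(A))
A_prec = p_inv_sqrt[:, None] * A * p_inv_sqrt[None, :]

kappa_A = spectral_cond(A)
kappa_PA = spectral_cond(A_prec)
print(f"kappa(A)      = {kappa_A:.3e}")
print(f"kappa(P^-1 A) = {kappa_PA:.3e}")
```

On this particular matrix the diagonal scaling removes almost all of the ill-conditioning; on matrices whose difficulty is not a scaling issue, a diagonal preconditioner can help much less.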
Finding P
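For nonsingular $A$, we have the block LDU factorization
\[
\begin{bmatrix} A & B_1^T \\ B_2 & 0 \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ B_2 A^{-1} & I \end{bmatrix}
\begin{bmatrix} A & 0 \\ 0 & -S \end{bmatrix}
\begin{bmatrix} I & A^{-1} B_1^T \\ 0 & I \end{bmatrix}
\]
where
\[ S = B_2 A^{-1} B_1^T \]
So a great preconditioner is
\[ P = \begin{bmatrix} A & 0 \\ 0 & S \end{bmatrix} \]
(using $+S$ rather than $-S$ keeps $P$ symmetric positive definite when $A$ is symmetric positive definite and $B_1 = B_2$ has full row rank).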
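A small numerical check of the factorization, as a sketch on a hypothetical random problem with B_1 = B_2 = B. With S = B A^{-1} B^T, the middle factor carries -S in its (2,2) block so that the product reproduces the zero block. The script also computes the eigenvalues of P^{-1}K for the exact block-diagonal P = diag(A, S); these are known to take only the three values 1 and (1 ± √5)/2, which is why a Krylov method would converge in very few iterations if P could be applied exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                      # SPD (1,1) block
B = np.eye(m, n) + 0.1 * rng.standard_normal((m, n))  # full row rank

Znm = np.zeros((n, m))
K = np.block([[A, B.T], [B, np.zeros((m, m))]])  # saddle-point matrix

A_inv = np.linalg.inv(A)                         # fine for a tiny demo
S = B @ A_inv @ B.T                              # Schur complement (positive sign)

# Block LDU factors; note the -S in the middle factor.
L = np.block([[np.eye(n), Znm], [B @ A_inv, np.eye(m)]])
D = np.block([[A, Znm], [Znm.T, -S]])
U = np.block([[np.eye(n), A_inv @ B.T], [Znm.T, np.eye(m)]])
factorization_error = np.linalg.norm(L @ D @ U - K)

# Exact block-diagonal preconditioner P = diag(A, S).
P = np.block([[A, Znm], [Znm.T, S]])

# Eigenvalues of P^{-1} K, computed via the similar symmetric matrix
# C^{-1} K C^{-T}, where P = C C^T is the Cholesky factorization.
C = np.linalg.cholesky(P)
sym = np.linalg.solve(C, np.linalg.solve(C, K).T)
eigs = np.linalg.eigvalsh(sym)

print("factorization error:", factorization_error)
print("distinct eigenvalues:", np.unique(np.round(eigs, 6)))
```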
Use P
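But finding $P^{-1}$ exactly is as expensive as solving our system!
We took
\[ P = \begin{bmatrix} \hat{A} & 0 \\ 0 & \hat{S} \end{bmatrix}, \]
with the following choices for $\hat{A}$:
$D = \operatorname{diag}(A)$
$T$ = tridiagonal part of $A$
$L_A$ = lumped-mass matrix
where $\hat{S} = B_2 \hat{A}^{-1} B_1^T$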
Preconditioner Comparison
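Two solvers on a 10,000 by 10,000 matrix
(I = identity, i.e. no preconditioning; D = diag(A); T = tridiagonal part of A)

Name        Preconditioner   Residual     Iterations   Runtime (sec)
bicgstabl   I                $10^{-6}$    2118         13.156
bicgstabl   D                $10^{-6}$    15           53.3905
bicgstabl   T                $10^{-6}$    16           159.968
minres      I                $10^{-6}$    3239         6.7009
minres      D                $10^{-6}$    46           41.0382
minres      T                $10^{-6}$    45           106.7194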
Preconditioner Comparison
MinRes on different matrix sizes
Preconditioner Results
GMRes on 10,000 by 10,000 matrix

             No Preconditioner   Preconditioner (D)
iterations   398,620             95
run time     1428 secs.          77 secs.
residual     $10^{-8}$           $10^{-8}$
Summary of Krylov Subspace Methods
Of the solvers tested, MINRES performed the best in runtime and iterations; however, it requires a symmetric system.
Preconditioners are necessary to keep the number of iterations low.
Several preconditioners were tested, with the diagonal preconditioner performing the best.
These preconditioners require inner iterations, which are computationally costly.
We need to look into speeding up the application of the preconditioner.
Stationary Iteration Methods
Uzawa’s Method
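The constrained optimization problem associated with the saddle-point system is
\[
\min_x \ \tfrac{1}{2}\langle Ax, x\rangle - \langle f, x\rangle
\quad \text{subject to} \quad Bx = g.
\]
The unconstrained version of the above problem is given by the Lagrangian
\[
L(x, y) = \tfrac{1}{2}\langle Ax, x\rangle - \langle f, x\rangle + \langle y, Bx - g\rangle,
\]
where $y \in \mathbb{R}^m$ is the Lagrange multiplier vector.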
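As a short check, assuming $A$ is symmetric (as in the simplified problem): setting both gradients of $L$ to zero recovers the saddle-point system with $C = 0$ and $B_1 = B_2 = B$.

```latex
\begin{aligned}
\nabla_x L(x, y) &= Ax - f + B^T y = 0, \\
\nabla_y L(x, y) &= Bx - g = 0
\end{aligned}
\quad\Longleftrightarrow\quad
\begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}.
```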
Algorithm
Uzawa’s Method
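Starting with initial guesses $x_0$ and $y_0$, solve
\[
\begin{aligned}
A x_{k+1} &= f - B^T y_k, \\
y_{k+1} &= y_k + \omega (B x_{k+1} - g),
\end{aligned}
\]
where $\omega > 0$ is a relaxation parameter.
Substituting $x_{k+1}$ into the second equation, we obtain Richardson's scheme
\[ y_{k+1} = y_k + \omega (B A^{-1} f - g - B A^{-1} B^T y_k). \]
The bound for $\omega$ is
\[ 0 < \omega < \frac{2}{\lambda_{\max}}, \]
and
\[ \omega_{\text{optimal}} = \frac{2}{\lambda_{\max} + \lambda_{\min}}, \]
where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of $B A^{-1} B^T$.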
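The iteration above can be sketched in a few lines of NumPy. This is an illustrative dense implementation on a hypothetical small random problem, not the code used for the results in this talk; the function name `uzawa`, the stopping test on the constraint residual, and the test data are all assumptions. The relaxation parameter is set to the optimal value, which requires the extreme eigenvalues of B A^{-1} B^T and is therefore only practical for a toy problem.

```python
import numpy as np


def uzawa(A, B, f, g, omega, tol=1e-10, max_iter=50_000):
    """Uzawa iteration for [[A, B^T], [B, 0]] [x; y] = [f; g].

    Returns (x, y, iterations). Stops when ||B x - g|| < tol.
    """
    x = np.zeros(A.shape[0])
    y = np.zeros(B.shape[0])
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(A, f - B.T @ y)   # A x_{k+1} = f - B^T y_k
        r = B @ x - g                         # constraint residual
        if np.linalg.norm(r) < tol:
            return x, y, k
        y = y + omega * r                     # y_{k+1} = y_k + omega (B x_{k+1} - g)
    return x, y, max_iter


# Hypothetical test problem: SPD A, full-row-rank B.
rng = np.random.default_rng(0)
n, m = 8, 3
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
B = np.eye(m, n) + 0.1 * rng.standard_normal((m, n))
f = rng.standard_normal(n)
g = rng.standard_normal(m)

# Optimal omega from the extreme eigenvalues of the Schur complement B A^{-1} B^T.
lam = np.linalg.eigvalsh(B @ np.linalg.solve(A, B.T))
omega_opt = 2.0 / (lam[0] + lam[-1])

x, y, iterations = uzawa(A, B, f, g, omega_opt)

K = np.block([[A, B.T], [B, np.zeros((m, m))]])
residual = np.linalg.norm(K @ np.concatenate([x, y]) - np.concatenate([f, g]))
print(f"iterations = {iterations}, full-system residual = {residual:.2e}")
```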
Results
Figure: Residual vs Iteration plot for Uzawa’s Algorithm
Augmented Lagrangian Method
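The augmented Lagrange function associated with the saddle-point system is
\[
L(x, y, \gamma) = \tfrac{1}{2}\langle Ax, x\rangle - \langle f, x\rangle + \langle y, Bx - g\rangle + \frac{\gamma}{2}\|Bx - g\|_2^2,
\]
where $\gamma > 0$ is a balancing parameter. Using an alternating strategy, minimizing $L(x, y, \gamma)$ with respect to $x$ and then taking an ascent step in the multiplier $y$ (since $L$ is linear in $y$, it has no minimizer in $y$), we find
\[ x_{k+1} = \arg\min_x L(x, y_k, \gamma) \]
and
\[ y_{k+1} = y_k + \gamma \nabla_y L(x_{k+1}, y_k, \gamma) = y_k + \gamma (B x_{k+1} - g). \]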
Algorithm
Augmented Lagrangian method
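Starting with initial guesses $x_0$ and $y_0$, solve
\[
\begin{aligned}
(A + \gamma B^T B) x_{k+1} &= f + \gamma B^T g - B^T y_k, \\
y_{k+1} &= y_k + \gamma (B x_{k+1} - g),
\end{aligned}
\]
where $\gamma > 0$ is a balancing parameter. The term $\gamma B^T g$ comes from the penalty $\frac{\gamma}{2}\|Bx - g\|_2^2$ and vanishes when $g = 0$.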
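A minimal dense sketch of the augmented Lagrangian iteration, again on a hypothetical small random problem rather than the systems studied in this talk. The function name, the value γ = 100, and the stopping test are assumptions. The x-update is derived by setting the x-gradient of the augmented Lagrangian to zero, which gives the right-hand side f + γ B^T g - B^T y_k (the middle term is zero when g = 0).

```python
import numpy as np


def augmented_lagrangian(A, B, f, g, gamma, tol=1e-10, max_iter=1000):
    """Augmented Lagrangian iteration for [[A, B^T], [B, 0]] [x; y] = [f; g].

    Returns (x, y, iterations). Stops when ||B x - g|| < tol.
    """
    x = np.zeros(A.shape[0])
    y = np.zeros(B.shape[0])
    M = A + gamma * (B.T @ B)                 # matrix solved in every step
    for k in range(1, max_iter + 1):
        # (A + gamma B^T B) x_{k+1} = f + gamma B^T g - B^T y_k
        x = np.linalg.solve(M, f + gamma * (B.T @ g) - B.T @ y)
        r = B @ x - g                         # constraint residual
        y = y + gamma * r                     # multiplier update
        # After the update, A x + B^T y = f holds up to rounding,
        # so only the constraint residual needs to be monitored.
        if np.linalg.norm(r) < tol:
            return x, y, k
    return x, y, max_iter


# Hypothetical test problem: SPD A, full-row-rank B.
rng = np.random.default_rng(0)
n, m = 8, 3
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
B = np.eye(m, n) + 0.1 * rng.standard_normal((m, n))
f = rng.standard_normal(n)
g = rng.standard_normal(m)

x, y, iterations = augmented_lagrangian(A, B, f, g, gamma=100.0)

K = np.block([[A, B.T], [B, np.zeros((m, m))]])
residual = np.linalg.norm(K @ np.concatenate([x, y]) - np.concatenate([f, g]))
print(f"iterations = {iterations}, full-system residual = {residual:.2e}")
```

A larger γ reduces the iteration count but makes A + γ B^T B harder to solve, which is the trade-off behind the cost per step.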
Results
Figure: Residual vs Iteration plot for ALM
Summary of Stationary Iteration Methods
Implementation of Uzawa’s Algorithm
    The method converges very slowly.
    The parameter ω needs adjustment.
    The choice of ω can vary from system to system.
Implementation of Augmented Lagrangian Method
    The method converges very fast, but iteration steps are costly.
    The method depends on the adjustment of the parameter γ.
    The system solved in each iteration needs better preconditioning.
Multilevel Methods
Introduction to Multigrid
Find the restriction and prolongation operators $I_h^H$ and $I_H^h$
Implement the algorithm
Results
Creating $I_h^H$ and $I_H^h$
Mapping the Degrees of Freedom
Multigrid Algorithms
The two-grid method consists of the following steps:
1. Perform $m$ pre-smoothing steps on $A_h u_h = b_h$;
2. Compute the fine grid residual $r_h = b_h - A_h u_h$;
3. Apply the restriction to the fine grid residual, $r_H = I_h^H r_h$;
4. Solve the coarse grid problem $A_H u_H = r_H$;
5. Add the correction, $u_h = u_h + \alpha I_H^h u_H$;
6. Perform $n$ post-smoothing steps.
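The steps above can be sketched on a scalar model problem. Everything specific here is an assumption made for illustration, not the setup of the talk: a 1D Poisson matrix instead of a saddle-point system, weighted Jacobi as the smoother, linear interpolation for the prolongation, its scaled transpose for the restriction, a Galerkin coarse matrix, and α = 1.

```python
import numpy as np


def poisson_1d(n):
    """Finite-difference 1D Poisson matrix on n interior nodes of (0, 1)."""
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2


def prolongation(n_coarse):
    """Linear interpolation from n_coarse to 2*n_coarse + 1 interior nodes."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j : 2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P


def jacobi(A, u, b, steps, weight=2.0 / 3.0):
    """Weighted Jacobi smoothing steps."""
    d = np.diag(A)
    for _ in range(steps):
        u = u + weight * (b - A @ u) / d
    return u


def two_grid(A_h, b_h, u_h, P, m_pre=2, n_post=2, alpha=1.0):
    """One two-grid cycle, following the six steps listed above."""
    R = 0.5 * P.T                          # restriction I_h^H (full weighting)
    A_H = R @ A_h @ P                      # Galerkin coarse-grid matrix
    u_h = jacobi(A_h, u_h, b_h, m_pre)     # 1. pre-smoothing
    r_h = b_h - A_h @ u_h                  # 2. fine-grid residual
    r_H = R @ r_h                          # 3. restrict the residual
    u_H = np.linalg.solve(A_H, r_H)        # 4. coarse-grid solve
    u_h = u_h + alpha * (P @ u_H)          # 5. add the prolongated correction
    return jacobi(A_h, u_h, b_h, n_post)   # 6. post-smoothing


n_coarse = 31
P = prolongation(n_coarse)
A_h = poisson_1d(2 * n_coarse + 1)
b_h = np.ones(A_h.shape[0])
u_h = np.zeros_like(b_h)

for cycle in range(10):
    u_h = two_grid(A_h, b_h, u_h, P)

rel_res = np.linalg.norm(b_h - A_h @ u_h) / np.linalg.norm(b_h)
print(f"relative residual after 10 cycles: {rel_res:.2e}")
```

Replacing the direct coarse solve in step 4 by a recursive call on the coarse problem would turn this two-grid cycle into a multigrid V-cycle.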
Results
BPX Preconditioner: Definition
The Bramble-Pasciak-Xu Preconditioner (BPX) is defined as
\[ P^{-1} = B_{BPX} = \sum_{j=0}^{J} I_j I_j^T \]
where $I_j$ is a prolongation operator that maps from triangulation level $j$ to the finest triangulation level $J$.
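To make the sum concrete, here is a sketch that assembles B_BPX explicitly for a scalar model problem. The setting is an assumption chosen for brevity and is not the saddle-point setup of the talk: a 2D five-point Laplacian on a uniform tensor-product grid, nested levels with 2^(j+1) - 1 interior nodes per direction, and bilinear interpolation built from Kronecker products of 1D prolongations. In practice one would apply I_j and I_j^T as operators rather than forming a dense matrix.

```python
import numpy as np


def prolongation_1d(n_coarse):
    """Linear interpolation from n_coarse to 2*n_coarse + 1 interior nodes."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j : 2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P


def bpx_matrix(J):
    """Dense B_BPX = sum_{j=0}^{J} I_j I_j^T on a 2D tensor-product grid."""
    n_fine = 2 ** (J + 1) - 1
    I_1d = np.eye(n_fine)          # 1D prolongation from level J to level J
    n = n_fine                     # nodes per direction on the current level
    B = np.zeros((n_fine**2, n_fine**2))
    for j in range(J, -1, -1):
        I_j = np.kron(I_1d, I_1d)  # 2D prolongation from level j to level J
        B += I_j @ I_j.T
        if j > 0:
            n = (n - 1) // 2
            I_1d = I_1d @ prolongation_1d(n)   # extend down to level j - 1
    return B


def spectral_cond(sym_matrix):
    """lambda_max / lambda_min of a symmetric positive definite matrix."""
    eigs = np.linalg.eigvalsh(sym_matrix)
    return eigs[-1] / eigs[0]


J = 3
n = 2 ** (J + 1) - 1
T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = np.kron(np.eye(n), T) + np.kron(T, np.eye(n))   # 2D five-point Laplacian

B = bpx_matrix(J)

# B A has the same eigenvalues as the symmetric matrix L^T B L, where A = L L^T.
L = np.linalg.cholesky(A)
kappa_A = spectral_cond(A)
kappa_BA = spectral_cond(L.T @ B @ L)
print(f"kappa(A)       = {kappa_A:.1f}")
print(f"kappa(B_BPX A) = {kappa_BA:.1f}")
```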
Results
GMRes on 2680 by 2680 matrix

             No BPX        with BPX
iterations   9575          1701
run time     408 secs.     76 secs.
residual     $10^{-6}$     $10^{-6}$

MinRes on 2680 by 2680 matrix

             No BPX         with BPX
iterations   622            289
run time     0.1978 secs.   0.239 secs.
residual     $10^{-4}$      $10^{-4}$
Pros and Cons
Pros
Don’t have to compute $P^{-1}$ using a solver
Only have to build $B_{BPX}$ one time for multiple systems
Cons
$B_{BPX}$ takes a long time to compute
In our testing, it converges in more iterations than when using the block diagonal preconditioner with $\hat{A} = \operatorname{diag}(A)$.
Summary of Multilevel Methods
Implementation of Multigrid
    Initial results of Multigrid are promising, but more investigation is needed.
    Multigrid could be implemented as a preconditioner to other methods.
Implementation of Bramble-Pasciak-Xu Preconditioner (BPX)
    Again, initial results are promising, but comparison tests are needed.
    BPX could be a highly useful preconditioner since it only needs to be built once.
Thank you!
Questions?