Toward an automatic parallel tool for solving systems of nonlinear
equations
Antonio M. Vidal
Jesús Peinado
Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia
Solving Systems of Nonlinear Equations
Given $F:\mathbb{R}^n \to \mathbb{R}^n$, find $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$.
Newton's iteration:
$x_+ = x_c - J(x_c)^{-1} F(x_c)$
Newton’s Algorithm
Choose $x^{(0)}$
Evaluate $F(x^{(0)})$
While $\|F(x^{(i)})\| > \text{bound}$
    Compute Jacobian matrix $J(x^{(i)})$
    Solve $J(x^{(i)})\, s^{(i)} = -F(x^{(i)})$
    $x^{(i+1)} = x^{(i)} + s^{(i)}$
    Evaluate $F(x^{(i+1)})$
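The loop in Newton's algorithm can be sketched in a few lines of sequential Python. This is only an illustration, not the authors' parallel implementation: the dense solver `solve_linear`, the iteration cap `max_iter`, and the two-equation test problem are assumptions added here.

```python
import math

def solve_linear(A, b):
    """Solve A s = b by Gaussian elimination with partial pivoting (illustrative helper)."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    s = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * s[j] for j in range(i + 1, n))
        s[i] = (M[i][n] - tail) / M[i][i]
    return s

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

def newton(F, J, x0, bound=1e-10, max_iter=50):
    """Newton's algorithm: solve J(x) s = -F(x), then set x = x + s."""
    x = list(x0)
    Fx = F(x)
    iterations = 0
    # max_iter is a safeguard added for this sketch.
    while norm2(Fx) > bound and iterations < max_iter:
        s = solve_linear(J(x), [-f for f in Fx])
        x = [xi + si for xi, si in zip(x, s)]
        Fx = F(x)
        iterations += 1
    return x, iterations

# Hypothetical test problem: x0^2 + x1^2 = 4 and x0 * x1 = 1.
def F_example(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0]

def J_example(x):
    return [[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]]

root, iters = newton(F_example, J_example, [2.0, 0.5])
```

Each pass performs one Jacobian evaluation, one linear solve, and one function evaluation, which are the three costs the rest of the talk weighs against each other.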
Methods to solve Nonlinear Systems
• Newton's methods: solve the linear system with a direct method (LU, Cholesky, ...). Several variants: Newton, Shamanskii, Chord, ...
• Quasi-Newton methods: approximate the Jacobian matrix (Broyden's method, BFGS, ...).
  $B(x_c) \approx J(x_c)$
  $B(x_+) = B(x_c) + u v^T$
• Inexact Newton methods: solve the linear system with an iterative method (GMRES, conjugate gradient, ...).
  $\|J(x_k) s_k + F(x_k)\|_2 = \eta_k \|F(x_k)\|_2$
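As an illustration of the quasi-Newton rank-one update B(x+) = B(xc) + uv^T, the sketch below uses the standard Broyden choice u = (y - B s) / (s^T s) and v = s, where s is the step and y the change in F. The slides do not spell out u and v, so this choice and the sample numbers are assumptions.

```python
def broyden_update(B, s, y):
    """Return B + u v^T with u = (y - B s) / (s^T s) and v = s (Broyden's update)."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    ss = sum(sj * sj for sj in s)
    u = [(y[i] - Bs[i]) / ss for i in range(n)]
    v = s
    return [[B[i][j] + u[i] * v[j] for j in range(n)] for i in range(n)]

# Hypothetical data: current approximation B, step s, and change in F, y.
B = [[2.0, 0.0], [0.0, 1.0]]
s = [1.0, -1.0]
y = [3.0, 0.5]

B_new = broyden_update(B, s, y)

# The updated matrix satisfies the secant equation B_new s = y.
secant_residual = [
    sum(B_new[i][j] * s[j] for j in range(2)) - y[i] for i in range(2)
]
```

The update needs only matrix-vector work, so no new Jacobian evaluation is required after the first one.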
Difficulties in the solution of nonlinear systems by a non-expert scientist
• Several methods
• Slow convergence
• Many trials are needed to find the best algorithm
• If parallelization is attempted, the possibilities increase dramatically: shared memory, distributed memory, message-passing environments, computational kernels, several parallel numerical libraries, …
• Libraries provide no guidance for solving a nonlinear system
Objective
• To build a software tool that automatically gets the best out of a sequential or parallel machine when solving a nonlinear system, for every problem and transparently to the user
Work done
• A set of parallel algorithms has been implemented: Newton, quasi-Newton and inexact Newton algorithms for symmetric and nonsymmetric Jacobian matrices
• The implementations are independent of the problem
• They have been tested with several problems of different kinds
• They have been developed using the support and the philosophy of ScaLAPACK
• They can be seen as part of a more general software environment for message-passing machines
ScaLAPACK
• Example of the data distribution for solving a linear system with Jacobian matrix J and problem function F
• Programming model: SPMD
• Interconnection network: logical mesh
• Two-dimensional distribution of data: block cyclic
[Figure: block-cyclic distribution of a 9 × 9 Jacobian matrix J and the right-hand side −F over a 2 × 3 logical process mesh, with blocks of MB × NB = 2 × 2 elements]
Process row 0 holds rows 1, 2, 5, 6, 9; process row 1 holds rows 3, 4, 7, 8.
Process column 0 holds columns 1, 2, 7, 8; process column 1 holds columns 3, 4, 9; process column 2 holds columns 5, 6.
The entries of −F are split by rows in the same way: −F1, −F2, −F5, −F6, −F9 on process row 0 and −F3, −F4, −F7, −F8 on process row 1.
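The ownership pattern of a block-cyclic distribution can be checked with a short index computation. The sketch below assumes the first block is assigned to process 0 in each dimension; the function names are illustrative and are not ScaLAPACK routines.

```python
def owner(global_index, block_size, num_procs):
    """Map a 0-based global row (or column) index to its process coordinate."""
    return (global_index // block_size) % num_procs

def local_indices(proc, n, block_size, num_procs):
    """List the 0-based global indices stored by one process coordinate."""
    return [g for g in range(n) if owner(g, block_size, num_procs) == proc]

# Parameters of the example: 9 x 9 matrix, 2 x 2 blocks, 2 x 3 process mesh.
n, MB, NB, P_ROWS, P_COLS = 9, 2, 2, 2, 3

# Convert to 1-based indices to match the element labels j11 ... j99.
rows_by_proc = [
    [g + 1 for g in local_indices(p, n, MB, P_ROWS)] for p in range(P_ROWS)
]
cols_by_proc = [
    [g + 1 for g in local_indices(q, n, NB, P_COLS)] for q in range(P_COLS)
]
```

With these parameters, process (0, 0) stores rows 1, 2, 5, 6, 9 and columns 1, 2, 7, 8. Changing the block size or the mesh shape shows how the load balance shifts, which is the tuning knob discussed later under data distribution.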
Software environment
[Figure: layered software environment, from top to bottom]
• User
• Automatic Parallel Tool
• Numerical Parallel Algorithms
• Packages: ScaLAPACK (Scalable Linear Algebra Package); CERFACS iterative solvers (CG, GMRES); MINPACK (minimization package); other packages
• Global level: PBLAS (Parallel BLAS)
• Local level: LAPACK (Linear Algebra Package), BLAS (Basic Linear Algebra Subroutines), BLACS (Basic Linear Algebra Communication Subroutines)
• Message-passing primitives (MPI, PVM, ...)
Developing a systematic approach
How to choose the best method?
• Specification of the problem data:
  i. Starting point.
  ii. Function F.
  iii. Jacobian matrix J.
  iv. Structure of the Jacobian matrix (dense, sparse, banded, …).
  v. Required precision.
  vi. Use of chaotic techniques.
  vii. Possibilities for parallelization (function, Jacobian matrix, …).
• Sometimes only the function is known: prospecting with a minimal, simple algorithm (Newton + finite differences + sequential approach) can be worthwhile.
[Figure: Methodology (1). General scheme]
Developing a systematic approach
Method            Cost (flops)
Newton            $k_N\,(C_E + C_J + \tfrac{2}{3}n^3)$
Shamanskii        $k_S\,(C_J + \tfrac{2}{3}n^3 + m\,(C_E + 2n^2))$
Chord             $C_J + \tfrac{2}{3}n^3 + k_C\,(C_E + 2n^2)$
Newton-Cholesky   $k_{NCH}\,(C_E + C_J + \tfrac{n^3}{3})$
Broyden           $C_E + C_J + \tfrac{4}{3}n^3 + k_B\,(C_E + 29n^2)$
BFGS              $C_E + C_J + \tfrac{n^3}{3} + k_{BF}\,(2n^2 + C_E) + m\,(C_J + \tfrac{n^3}{3}) + (k_{BF} - m)\,15n^2$
Newton-GMRES      $C_E + k_{NG}\,(C_E + C_J + k_G \cdot 2n^2 m + C_E)$
Newton-CG         $C_E + k_{NCG}\,(C_J + k_{CG}\,n^2 + C_E)$
$C_E$ = function evaluation cost; $C_J$ = Jacobian matrix evaluation cost
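To see how such cost expressions separate the methods, the sketch below evaluates three of them, read as k_N (C_E + C_J + 2n^3/3) for Newton, C_J + 2n^3/3 + k_C (C_E + 2n^2) for Chord, and k_S (C_J + 2n^3/3 + m (C_E + 2n^2)) for Shamanskii. Treating k as an iteration count and m as the number of inner steps per Jacobian evaluation is an interpretation, and the sample values of n, k, C_E and C_J are hypothetical.

```python
def newton_cost(n, k, c_e, c_j):
    """Each iteration pays for F, J and an LU factorization (2/3 n^3)."""
    return k * (c_e + c_j + 2.0 * n**3 / 3.0)

def chord_cost(n, k, c_e, c_j):
    """J is evaluated and factorized once; each iteration pays for F and a solve (2 n^2)."""
    return c_j + 2.0 * n**3 / 3.0 + k * (c_e + 2.0 * n**2)

def shamanskii_cost(n, k, m, c_e, c_j):
    """Each outer iteration factorizes J once and reuses it for m inner steps."""
    return k * (c_j + 2.0 * n**3 / 3.0 + m * (c_e + 2.0 * n**2))

# Hypothetical problem: n = 100, C_E = n^2 flops, C_J = n^3 flops.
n = 100
c_e = float(n**2)
c_j = float(n**3)

newton_total = newton_cost(n, k=5, c_e=c_e, c_j=c_j)
chord_total = chord_cost(n, k=20, c_e=c_e, c_j=c_j)
shamanskii_total = shamanskii_cost(n, k=5, m=1, c_e=c_e, c_j=c_j)
```

In this made-up setting Chord is cheaper than Newton even with four times as many iterations, because the Jacobian and its factorization dominate the cost of a Newton step.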
Developing a systematic approach
• The function and the Jacobian matrix characterize the nonlinear system.
• It is important to know the features of both: sparse or dense, how they are computed (sequentially or in parallel), structure, …
• It is useful to classify problems according to their cost, especially to identify the best method or avoid the worst one, and to decide what must be parallelized.
Problem classes by cost of F (rows) and cost of J (columns):

F \ J      O(n)   O(n^2)   O(n^3)   O(n^4)   >O(n^4)
O(n)       P11    P12      P13      P14      P1+
O(n^2)     P21    P22      P23      P24      P2+
O(n^3)     P31    P32      P33      P34      P3+
O(n^4)     P41    P42      P43      P44      P4+
>O(n^4)    P+1    P+2      P+3      P+4      P++
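The class label can be derived mechanically from the two cost orders. A minimal sketch, assuming each cost is summarized by an integer exponent p in O(n^p), with the first index taken from F and the second from J (as in the examples later in the talk, where F O(n^3) and J O(n^4) give P34); the function names are illustrative.

```python
def cost_index(exponent):
    """Map the exponent p of an O(n^p) cost to the table index '1'..'4' or '+'."""
    if exponent < 1:
        raise ValueError("exponent must be at least 1")
    return str(exponent) if exponent <= 4 else "+"

def problem_class(f_exponent, j_exponent):
    """Return the class label: first index from the cost of F, second from the cost of J."""
    return "P" + cost_index(f_exponent) + cost_index(j_exponent)

label_analytic = problem_class(3, 3)      # F O(n^3), J O(n^3)
label_finite_diff = problem_class(3, 4)   # F O(n^3), J O(n^4)
label_expensive_f = problem_class(5, 1)   # F above O(n^4), J O(n)
```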
Developing a systematic approach
• Once the best sequential option has been selected, the process can end
• If the best parallel algorithm is required, the following items must be analyzed:
– Computer architecture: (tf, )
– Programming environments: PVM/MPI, …
– Data distribution to obtain the best parallelization
– Cost of the parallel algorithms
Developing a systematic approach
Data Distribution
It depends on the parallel environment. In the case of ScaLAPACK, a block-cyclic distribution is used: optimize the block size and the size of the process mesh.
Parallelization chances
Function evaluation and/or computation of the Jacobian matrix.
Parallelize the most expensive operation!
Cost of the parallel algorithms
Use the parallel cost table with the parameters of the parallel machine: (tf, )
Developing a systematic approach
Final decision for choosing the method

CE   CJ   Advisable
0    0    Choose according to the speed of convergence. If it is slow, choose Newton or Newton-GMRES.
0    1    Avoid computing the Jacobian matrix. Choose Broyden or use finite differences.
1    0    Newton or Newton-GMRES are adequate. Avoid evaluating the function.
1    1    Try to do a small number of iterations. Use Broyden to avoid computing the Jacobian matrix.

Cost < O(n^3) => 0; Cost >= O(n^3) => 1
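The decision table for choosing the method amounts to a lookup keyed by two flags. The sketch below encodes it under the stated threshold (cost below O(n^3) gives 0, otherwise 1), again assuming each cost is summarized by an integer exponent; the names `ADVICE`, `cost_flag` and `advise` are illustrative and not part of the authors' tool.

```python
# Advice keyed by (C_E flag, C_J flag), taken from the decision table.
ADVICE = {
    (0, 0): "Choose according to the speed of convergence; "
            "if it is slow, choose Newton or Newton-GMRES.",
    (0, 1): "Avoid computing the Jacobian matrix; "
            "choose Broyden or use finite differences.",
    (1, 0): "Newton or Newton-GMRES are adequate; "
            "avoid evaluating the function.",
    (1, 1): "Try to do a small number of iterations; "
            "use Broyden to avoid computing the Jacobian matrix.",
}

def cost_flag(exponent):
    """Return 0 for a cost below O(n^3) and 1 for O(n^3) or above."""
    return 1 if exponent >= 3 else 0

def advise(f_exponent, j_exponent):
    """Look up the advice for costs O(n^f_exponent) for F and O(n^j_exponent) for J."""
    return ADVICE[(cost_flag(f_exponent), cost_flag(j_exponent))]

cheap_f_costly_j = advise(2, 4)
both_costly = advise(3, 4)
```

A table-driven form keeps the rules easy to revise after the feedback step, since changing a recommendation means editing one entry.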
Developing a systematic approach
Final decision for parallelization

Fun.   Jac.   Advisable
0      0      Try to do few iterations. Use Broyden or Chord to avoid computing the Jacobian matrix.
0      1      Newton or Newton-GMRES are adequate. Do few iterations and avoid evaluating the function.
1      0      Compute the Jacobian matrix only a few times. Use Broyden or Chord if possible.
1      1      Choose according to the speed of convergence. Newton or Newton-GMRES are adequate.

No chance of parallelization => 0; Chance of parallelization => 1
Developing a systematic approach
Finish or feedback: IF selected method is convenient
THEN finish
ELSE feedback
Sometimes bad results are obtained due to:
– No convergence.
– High computational cost.
– Unsatisfactory parallelization.
[Figure: Methodology (12). Scheme of the guided process]
How does it work?
• Inverse Toeplitz Symmetric Eigenvalue Problem
• Well-known problem: starting point, function, analytical Jacobian matrix or finite-difference approach, …
• Kind of problem:
– Analytical Jacobian: F O(n^3), J O(n^3): P33
– Finite-difference Jacobian: F O(n^3), J O(n^4): P34
• The cost of the Jacobian matrix is high: avoid computing it. Use Chord or Broyden.
• High chance of parallelization, even if finite differences are used.
• If convergence is slow, use Broyden but insert some Newton iterations.
How does it work?
• Leakage minimization in a water distribution network
• Well-known problem: starting point, function, analytical Jacobian matrix or finite-difference approach, …
• Jacobian matrix: symmetric, positive definite.
• Kind of problem: F O(n^2), J O(n^2): P22
• Avoid methods with a high cost per iteration, such as Newton-Cholesky.
• The computation of F and J can be parallelized.
• Use Newton-CG (to speed up convergence) or BFGS.
Conclusions
• Part of this work was carried out in the Ph.D. thesis of J. Peinado: “Resolución Paralela de Sistemas de Ecuaciones no Lineales”, Univ. Politécnica de Valencia, Sept. 2003
• All specifications and parallel algorithms have been developed
• The implementation stage of the automatic parallel tool starts in January 2004 within the framework of a CICYT project: “Desarrollo y optimización de código paralelo para sistemas de Audio 3D”, TIC2003-08230-C02-02