Numerical Simulation of 3D Fully Nonlinear Water Waves on Parallel Computers
Xing Cai
University of Oslo
PARA'98
Outline of the Talk
Mathematical model
Numerical scheme (sequential)
Parallelization strategy (domain decomposition)
Object-oriented implementation
Numerical experiment
Mathematical Model
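Fully nonlinear 3D water waves. Primary unknowns: $\varphi$ (velocity potential), $\eta$ (free surface elevation)
$$
\begin{aligned}
\nabla^2\varphi &= 0 && \text{in water volume}\\
\eta_t + \varphi_x\eta_x + \varphi_y\eta_y - \varphi_z &= 0 && \text{on water surface}\\
\varphi_t + (\varphi_x^2 + \varphi_y^2 + \varphi_z^2)/2 + g\eta &= 0 && \text{on water surface}\\
\partial\varphi/\partial n &= 0 && \text{on solid walls}
\end{aligned}
$$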
Numerical Scheme
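Physical domain:
$$\Omega(t) = \{(x,y,z) \mid (x,y)\in\Omega_{xy},\ -H \le z \le \eta(x,y,t)\}$$
Transformation:
$$\bar z = H\left(\frac{z+H}{\eta+H} - 1\right)$$
which maps $\Omega(t)$ to a fixed domain:
$$\bar\Omega = \{(x,y,\bar z) \mid (x,y)\in\Omega_{xy},\ -H \le \bar z \le 0\}$$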
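A minimal sketch of the vertical coordinate mapping, assuming the transformation on this slide reads z̄ = H((z + H)/(η + H) − 1) with a constant depth H. The function names and the sample values of H and η are illustrative and do not come from the talk.

```python
def to_fixed_domain(z, eta, H):
    """Map a physical height z in [-H, eta] to the fixed coordinate in [-H, 0]."""
    return H * ((z + H) / (eta + H) - 1.0)


def to_physical(z_bar, eta, H):
    """Inverse map: fixed coordinate in [-H, 0] back to a physical height."""
    return (z_bar + H) * (eta + H) / H - H


# Illustrative values (assumptions): depth 2.0, local surface elevation 0.3.
H, eta = 2.0, 0.3
print(to_fixed_domain(-H, eta, H))   # bottom of the water column
print(to_fixed_domain(eta, eta, H))  # free surface
```

The bottom z = −H stays at −H and the moving surface z = η is sent to 0, so the computational grid does not have to follow the surface in time.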
Numerical Scheme
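• Operator splitting
• At each time level:
  • FDM for updating free surface conditions
  • FEM solution of an elliptic boundary value problem in $\bar\Omega$:
$$\nabla\cdot(K\nabla\varphi) = 0$$
$$
K(x,y,\bar z,t) = \frac{1}{H}
\begin{pmatrix}
\eta+H & 0 & -(\bar z+H)\eta_x\\
0 & \eta+H & -(\bar z+H)\eta_y\\
-(\bar z+H)\eta_x & -(\bar z+H)\eta_y & \dfrac{H^2 + (\bar z+H)^2(\eta_x^2+\eta_y^2)}{\eta+H}
\end{pmatrix}
$$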
Preconditioning
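Elliptic boundary value problem: the most CPU-intensive part
Resulting system of linear equations: $Ax = b$
Preconditioning: $Ax = b \;\Rightarrow\; M^{-1}Ax = M^{-1}b$
Computational cost ($N$ = number of unknowns):
| Method | Cost |
| --- | --- |
| Gauss-Seidel | $O(N^2)$ |
| CG+MILU | $O(N^{7/6})$ |
| CG+MG/DD | $O(N)$ |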
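To make the three cost estimates concrete, the sketch below compares how the work would grow when the number of unknowns increases, using only the exponents quoted on this slide (2, 7/6 and 1). The helper name and the coarse 21x21x21 grid are illustrative assumptions; the 41x41x41 grid is the one used in the experiments later in the talk. Constants hidden in the O-notation are ignored.

```python
def relative_cost(n_small, n_large, exponent):
    """Growth factor of an O(N^exponent) method when N goes from n_small to n_large."""
    return (n_large / n_small) ** exponent


coarse = 21 ** 3  # assumed coarse grid, for illustration only
fine = 41 ** 3    # global grid size used in the experiments

for name, exponent in [("Gauss-Seidel", 2.0), ("CG+MILU", 7.0 / 6.0), ("CG+MG/DD", 1.0)]:
    factor = relative_cost(coarse, fine, exponent)
    print(f"{name:12s} cost grows by a factor of about {factor:.1f}")
```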
The Question
How to do the parallelization?
Different approaches on different levels:
  Automatic parallelization
  Parallelization on the low matrix-vector level
  Parallelization on the level of simulators
Starting point: an object-oriented water wave simulator (built in Diffpack, a C++ environment for scientific computing)
Parallelization Strategy
Domain Decomposition
• Divide and conquer
• Solution of the original large problem through iteratively solving many smaller subproblems -- solution method or preconditioner
• Flexible -- localized treatment of irregular geometries, singularities, etc.
• Very efficient numerical methods -- even on sequential computers
• Suitable for coarse-grained parallelization
Overlapping Domain Decomposition
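Alternating Schwarz method for two subdomains
Example: solving an elliptic boundary value problem
$$Au = f \ \text{in } \Omega = \Omega_1\cup\Omega_2, \qquad u = g \ \text{on } \partial\Omega$$
A sequence of approximations $u^0, u^1, \ldots, u^n$, where
$$
\begin{aligned}
Au_1^n &= f_1 && \text{in } \Omega_1\\
u_1^n &= g && \text{on } \partial\Omega_1\backslash\Gamma_1\\
u_1^n &= u_2^{n-1}|_{\Gamma_1} && \text{on } \Gamma_1
\end{aligned}
\qquad
\begin{aligned}
Au_2^n &= f_2 && \text{in } \Omega_2\\
u_2^n &= g && \text{on } \partial\Omega_2\backslash\Gamma_2\\
u_2^n &= u_1^n|_{\Gamma_2} && \text{on } \Gamma_2
\end{aligned}
$$
Here $\Gamma_i$ is the artificial (interior) boundary of $\Omega_i$ and $f_i$ is the restriction of $f$ to $\Omega_i$.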
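The alternating iteration can be illustrated on a much simpler problem than the one in the talk. The sketch below assumes the 1D model problem −u'' = 1 on (0, 1) with u(0) = u(1) = 0, a uniform grid, and two overlapping subdomains covering nodes 0..b and a..n. The grid size, the overlap and all function names are illustrative choices, not taken from the slides.

```python
def solve_dirichlet(f_vals, h, left, right):
    """Solve -u'' = f at the interior nodes (Thomas algorithm),
    with Dirichlet values `left` and `right` at the two end nodes."""
    m = len(f_vals)
    rhs = [h * h * fv for fv in f_vals]
    rhs[0] += left
    rhs[-1] += right
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def alternating_schwarz(n=40, a=15, b=25, sweeps=30):
    """Subdomain 1 covers nodes 0..b, subdomain 2 covers nodes a..n (a < b)."""
    h = 1.0 / n
    f = [1.0] * (n + 1)
    u = [0.0] * (n + 1)  # initial guess u^0; also holds g = 0 at both ends
    for _ in range(sweeps):
        # Solve on subdomain 1; the value at node b comes from the last solve on subdomain 2.
        u[1:b] = solve_dirichlet(f[1:b], h, u[0], u[b])
        # Solve on subdomain 2; the value at node a comes from the fresh solve on subdomain 1.
        u[a + 1:n] = solve_dirichlet(f[a + 1:n], h, u[a], u[n])
    return u


u = alternating_schwarz()
n = len(u) - 1
exact = [(i / n) * (1.0 - i / n) / 2.0 for i in range(n + 1)]
print("max error:", max(abs(ui - ei) for ui, ei in zip(u, exact)))
```

Each sweep only ever solves problems of the original form on a smaller interval, which is why an existing sequential solver can be reused unchanged for the subdomain solves.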
Numerical Foundation
Additive Schwarz Method
Subproblems are of the same form as the original large problem, with possibly different boundary conditions on artificial boundaries.
Subproblems can be solved in parallel.
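One common way to use the additive Schwarz idea is as a preconditioner inside CG: the preconditioner sums independent subdomain solves, z = Σ Rᵢᵀ Aᵢ⁻¹ Rᵢ r. The sketch below assumes a 1D Poisson matrix tridiag(−1, 2, −1), two overlapping index ranges, and no coarse grid correction. These choices and all function names are assumptions for illustration and are not the setup used in the talk. The loop over subdomains runs sequentially here; the terms are independent and could be computed in parallel.

```python
def thomas(rhs):
    """Solve T x = rhs for T = tridiag(-1, 2, -1)."""
    m = len(rhs)
    c, d = [0.0] * m, [0.0] * m
    c[0], d[0] = -0.5, rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def apply_A(v):
    """Matrix-vector product with A = tridiag(-1, 2, -1)."""
    m = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < m - 1 else 0.0) for i in range(m)]


def additive_schwarz(r, subdomains):
    """z = sum_i R_i^T A_i^{-1} R_i r; every term is an independent local solve."""
    z = [0.0] * len(r)
    for lo, hi in subdomains:        # unknowns lo..hi-1 belong to this subdomain
        local = thomas(r[lo:hi])     # the restricted matrix is again tridiag(-1, 2, -1)
        for k, val in enumerate(local):
            z[lo + k] += val
    return z


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(b, subdomains, tol=1e-10, max_iter=100):
    """Conjugate gradients on A x = b, preconditioned by additive Schwarz."""
    x = [0.0] * len(b)
    r = list(b)
    z = additive_schwarz(r, subdomains)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, it
        z = additive_schwarz(r, subdomains)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


h = 1.0 / 40
x, iterations = pcg([h * h] * 39, [(0, 25), (15, 39)])  # -u'' = 1, u(0) = u(1) = 0
print("PCG iterations:", iterations)
```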
Convergence of the Solution
Example: Solving the Poisson problem on the unit square
Numerical Foundation
Coarse Grid Correction
Important for good DD convergence
Run on each processor, shared with subdomain
simulators on the same processor
Some Observations
Parallel Computing
  efficiency relies on the parallelization
Domain Decomposition
  is well suited to parallel computing
  a good parallelization strategy
Object-Oriented Programming Technique
  flexible and efficient sequential simulators
  can be used in subdomain solves -- main ingredient of DD
New Programming Model
A simulator-parallel model
Each processor hosts an arbitrary number of subdomains
  balance between numerical efficiency and load balancing
Each subdomain is assigned a sequential simulator
Flexibility -- different types of grids, linear system solvers, preconditioners, convergence monitors etc. are allowed for different subproblems
Domain decomposition on the level of subdomain simulators!
![Page 15: Numerical Simulation of 3D Fully Nonlinear Waters Waves on Parallel Computers](https://reader035.fdocuments.net/reader035/viewer/2022062721/5681386a550346895da01be4/html5/thumbnails/15.jpg)
PARA'98
Simulator-Parallel
Reuse of existing sequential simulators
Data distribution is implied
No need for global data
Needs additional functionalities for exchanging nodal values inside the overlapping region
Needs some global administration
A Generic Programming Framework
An add-on library (SPMD model)
Use of object-oriented programming technique
Flexibility and portability
Simplified parallelization process for end-user
The Administrator
Parameter Interface
  solution method or preconditioner, max iterations, stopping criterion, etc.
DD Algorithm Interface
  access to predefined numerical algorithms, e.g. CG
Operation Interface (standard codes & UDC)
  access to subdomain simulators, matrix-vector product, inner product, etc.
The Subdomain Simulator
Subdomain Simulator -- a generic representation
C++ class hierarchy
Interface of generic member functions
Adaptation of Sequential Simulator
Class SubdomainSimulator - generic representation of a sequential simulator.
Class SubdomainFEMSolver - generic representation of a sequential simulator using FEM.
A new sequential wave simulator that fits in the framework is readily extended from the existing sequential simulator, while also being a subclass of SubdomainFEMSolver.
Class hierarchy (figure): SubdomainFEMSolver derives from SubdomainSimulator; NewWSimulator derives from both WaveSimulator and SubdomainFEMSolver.
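The adaptation pattern described here can be sketched in Python as an analogue of the C++ hierarchy. Only the four class names come from the slide. The method names (`solve_local`, `solve_at_this_time_step`), the constructor argument and the `run_subdomain_solves` helper are hypothetical and are not the Diffpack interface.

```python
from abc import ABC, abstractmethod


class SubdomainSimulator(ABC):
    """Generic view of a sequential simulator, as seen by the DD administrator."""

    @abstractmethod
    def solve_local(self):
        """Carry out one subdomain solve (hypothetical generic member function)."""


class SubdomainFEMSolver(SubdomainSimulator):
    """Generic representation of a sequential simulator that uses FEM."""


class WaveSimulator:
    """Stand-in for the existing sequential wave simulator."""

    def __init__(self, grid_points):
        self.grid_points = grid_points

    def solve_at_this_time_step(self):
        return f"sequential solve on {self.grid_points} grid points"


class NewWSimulator(WaveSimulator, SubdomainFEMSolver):
    """Reuses the sequential simulator and plugs it into the generic interface."""

    def solve_local(self):
        return self.solve_at_this_time_step()


def run_subdomain_solves(simulators):
    """The administrator only relies on the generic interface."""
    return [sim.solve_local() for sim in simulators]


print(run_subdomain_solves([NewWSimulator(6929), NewWSimulator(6929)]))
```

The new class adds only the glue between the two parents, so the numerical code in the existing simulator is left untouched.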
Performance
Algorithmic efficiency
  efficiency of original sequential simulator(s)
  efficiency of domain decomposition method
Parallel efficiency
  communication overhead (low)
  coarse grid correction overhead (normally low)
  synchronization overhead
  load balancing
    subproblem size
    work on subdomain solves
Parallel Simulation of Waves
Parallel Efficiency
Fixed number of subdomains M=16. Subdomain grids from partition of a global 41x41x41 grid. Simulation over 32 time steps. DD as preconditioner of CG for the Laplace eq. Multigrid V-cycle as subdomain solver.
| P | Execution time | Speedup | Efficiency |
| --- | --- | --- | --- |
| 1 | 1404.44 | N/A | N/A |
| 2 | 715.32 | 1.96 | 0.98 |
| 4 | 372.79 | 3.77 | 0.94 |
| 8 | 183.99 | 7.63 | 0.95 |
| 16 | 90.89 | 15.45 | 0.97 |
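The tabulated values are consistent with the usual definitions, speedup S(P) = T(1)/T(P) and efficiency E(P) = S(P)/P. A short sketch that recomputes them from the execution times above (the function and variable names are illustrative):

```python
# Execution times from the table, keyed by number of processors P.
timings = {1: 1404.44, 2: 715.32, 4: 372.79, 8: 183.99, 16: 90.89}


def speedup_and_efficiency(times):
    """Return {P: (speedup, efficiency)} relative to the single-processor time."""
    t1 = times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in times.items()}


for p, (s, e) in speedup_and_efficiency(timings).items():
    print(f"P={p:2d}  speedup={s:.2f}  efficiency={e:.2f}")
```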
Overall Efficiency
Number of subdomains equal to number of processors
| P = M | Execution time | Subgrid size | Iterations |
| --- | --- | --- | --- |
| 1 | 642.14 | 68921 | 7.69 |
| 2 | 597.47 | 38663 | 9.00* |
| 4 | 265.62 | 21689 | 13.59 |
| 8 | 172.23 | 12259 | 17.25 |
| 16 | 90.89 | 6929 | 16.56 |
*For P=2 parallel BiCGStab is used.
Summary
Efficient solution of elliptic boundary value problems
Parallelization based on DD
Introduction of a simulator-parallel model
A generic framework for implementation
http://www.nobjects.com