A Higher-Order Accurate Unstructured Finite Volume Newton-Krylov Algorithm for Inviscid Compressible Flows
by
AMIR NEJAT
B.Sc. (Aerospace Engineering), Amirkabir University of Technology, 1996
M.Sc. (Aerospace Engineering), Amirkabir University of Technology, 1998
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
(Department of Mechanical Engineering)
THE UNIVERSITY OF BRITISH COLUMBIA
April 2007
© Amir Nejat, 2007
Abstract
A fast implicit (Newton-Krylov) finite volume algorithm is developed for higher-order un-
structured (cell-centered) steady-state computation of inviscid compressible flows (Euler
equations). The matrix-free Generalized Minimal Residual (GMRES) algorithm is used for
solving the linear system arising from implicit discretization of the governing equations,
avoiding expensive and complicated explicit computation of the higher-order Jacobian ma-
trix. An incomplete lower-upper (ILU) factorization technique is employed as the preconditioning strategy, with a first-order Jacobian as the preconditioning matrix. The solution process is di-
vided into two phases: start-up and Newton iterations. In the start-up phase an approximate
solution of the fluid flow is computed which includes most of the physical characteristics
of the steady-state flow. A defect correction procedure is proposed for the start-up phase
consisting of multiple implicit pre-iterations. At the end of the start-up phase (when the
linearization of the flow field is accurate enough for steady-state solution) the solution is
switched to the Newton phase, taking an infinite time step and recovering a semi-quadratic convergence rate (in most cases). A proper limiter implementation for higher-order
discretization is discussed and a new formula for limiting the higher-order terms of the
reconstruction polynomial is introduced. The issue of mesh refinement in accuracy mea-
surement for unstructured meshes is revisited. A straightforward methodology is applied for
accuracy assessment of the higher-order unstructured approach based on total pressure loss,
drag measurement, and direct solution error calculation. The accuracy, fast convergence
and robustness of the proposed higher-order unstructured Newton-Krylov solver for differ-
ent speed regimes are demonstrated via several test cases for the 2nd, 3rd and 4th-order
discretization. Solutions of different orders of accuracy are compared in detail through sev-
eral investigations. The possibility of reducing the computational cost required for a given
level of accuracy using high-order discretization is demonstrated.
Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Mesh Generation and Spatial Discretization . . . . . . . . . . . . . 7
1.2.2 Higher-Order Discretization . . . . . . . . . . . . . . . . . . . . . . 10
1.2.3 Implicit Method and Convergence Acceleration . . . . . . . . . . . . 12
1.3 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 Flow Solver 18
2.1 Governing Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 Implicit Time-Advance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3 Upwind Flux Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.1 The Godunov Approach . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.2 Roe’s Flux Difference Splitting Scheme . . . . . . . . . . . . . . . . . 26
2.4 Boundary Flux Treatment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.1 Wall Boundary Condition . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.2 Inlet/Outlet Boundary Conditions . . . . . . . . . . . . . . . . . . . 29
2.5 Flux Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Reconstruction and Monotonicity 33
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 K-Exact Least-Square Reconstruction . . . . . . . . . . . . . . . . . . . . . 34
3.2.1 Conservation of the Mean . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2.2 K-Exact Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.3 Compact Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2.4 Boundary Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2.4.1 Dirichlet Boundary Constraint . . . . . . . . . . . . . . . . 39
3.2.4.2 Neumann Boundary Constraint . . . . . . . . . . . . . . . 40
3.2.5 Constrained Least-Square System . . . . . . . . . . . . . . . . . . . . 40
3.3 Accuracy Assessment for a Smooth Function . . . . . . . . . . . . . . . . . 41
3.4 Monotonicity Enforcement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4.1 Flux Limiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4.2 Slope Limiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4 Flux Jacobian 52
4.1 What is the Jacobian ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2 Flux Jacobian Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2.1 Roe’s Flux Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Boundary Flux Jacobians . . . . . . . . . . . . . . . . . . . . . . . . 60
4.2.2.1 Wall Boundary Flux Jacobian . . . . . . . . . . . . . . . . 61
4.2.2.2 Subsonic Inlet Flux Jacobian . . . . . . . . . . . . . . . . . 61
4.2.2.3 Subsonic Outlet Flux Jacobian . . . . . . . . . . . . . . 64
4.2.2.4 Supersonic Inlet/Outlet Flux Jacobians . . . . . . . . . . . 65
4.3 Finite Difference Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4 Numerical Flux Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5 Linear Solver and Solution Strategy 71
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 GMRES Linear Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2.1 The Basic GMRES Algorithm . . . . . . . . . . . . . . . . . . . . . . 74
5.2.2 Matrix-Vector Products Computation in GMRES . . . . . . . . . . 77
5.2.3 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2.4 GMRES with Right Preconditioning . . . . . . . . . . . . . . . . . . 83
5.3 Solution Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.3.1 Start-up Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3.2 Newton Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6 Results(I): Verification Cases 90
6.1 Reconstruction Test Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.1.1 Square Test Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.1.2 Annulus Test Case . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.2 Subsonic Flow Past a Semi-Circular Cylinder . . . . . . . . . . . . . . . . . 97
6.3 Supersonic Vortex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.3.1 Numerical Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.3.2 Solution Accuracy Measurement . . . . . . . . . . . . . . . . . . 117
7 Results(II): Simulation Cases 124
7.1 Subsonic Airfoil, NACA 0012, M = 0.63, α = 2.0° . . . . . . . . . . . 124
7.1.1 Solution Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.2 Transonic Airfoil, NACA 0012, M = 0.8, α = 1.25° . . . . . . . . . . 153
7.3 Supersonic Flow, Diamond Airfoil, M = 2.0, α = 0.0 . . . . . . . . . 161
8 Concluding Remarks 169
8.1 Summary and Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
8.2 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.3 Recommended Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.3.1 Start-up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.3.2 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.3.3 Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.3.4 Extension to 3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.3.5 Extension to Viscous Flows . . . . . . . . . . . . . . . . . . . . . . . 174
Bibliography 175
List of Tables
1.1 Qualitative illustration of research on solver development . . . . . . . . . . 4
5.1 Ratio of non-zero elements in factorized matrix . . . . . . . . . . . . . . . . 82
6.1 2nd-order error norms for the square case . . . . . . . . . . . . . . . . . . . 91
6.2 3rd-order error norms for the square case . . . . . . . . . . . . . . . . . . . 92
6.3 4th-order error norms for the square case . . . . . . . . . . . . . . . . . . . 93
6.4 2nd-order error norms for the annulus case . . . . . . . . . . . . . . . . . . 94
6.5 3rd-order error norms for the annulus case . . . . . . . . . . . . . . . . . . . 94
6.6 4th-order error norms for the annulus case . . . . . . . . . . . . . . . . . . . 95
6.7 Sizes and ratios of the control volumes for circular cylinder meshes . . . . . 98
6.8 Error norms for total pressure, 2nd-order solution . . . . . . . . . . . . . . 106
6.9 Error norms for total pressure, 3rd-order solution . . . . . . . . . . . . . . . 106
6.10 Error norms for total pressure, 4th-order solution . . . . . . . . . . . . . . . 109
6.11 Solution error norms, 2nd-order discretization . . . . . . . . . . . . . . . . . 121
6.12 Solution error norms, 3rd-order discretization . . . . . . . . . . . . . . . . . 121
6.13 Solution error norms, 4th-order discretization . . . . . . . . . . . . . . . . . 121
7.1 Mesh detail for NACA 0012 airfoil . . . . . . . . . . . . . . . . . . 124
7.2 Convergence summary for NACA 0012 airfoil, M = 0.63, α = 2° . . . . . 131
7.3 Lift and drag coefficients for all meshes and discretization orders, NACA 0012, M = 0.63, α = 2°, far field size of 25 chords . . . . . . . . . 142
7.4 Effect of the far field distance on lift and drag coefficients, NACA 0012, M = 0.63, α = 2° . . . . . . . . . . . . . . . . . . . . . . . . 143
7.5 Convergence summary for NACA 0012 airfoil, M = 0.8, α = 1.25° . . . . 155
7.6 Lift and drag coefficients, NACA 0012, M = 0.8, α = 1.25° . . . . . . 156
7.7 Convergence summary for diamond airfoil, M = 2.0, α = 0° . . . . . . . 164
7.8 Drag coefficient, diamond airfoil, M = 2.0, α = 0° . . . . . . . . . . 164
List of Figures
1.1 Main approaches in fluid dynamics . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 CFD overall algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Example of a structured and an unstructured mesh over a 2D airfoil . . . . 8
2.1 Propagation of a linear wave in positive direction . . . . . . . . . . . . . . . 24
2.2 Shock-Tube problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Rounding the characteristic slope near zero . . . . . . . . . . . . . . . . . . 28
2.4 Schematic illustration of Gauss quadrature points . . . . . . . . . . . . . . . 31
3.1 A typical cell center control volume and its reconstruction stencil, including
three layers of neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Imposing boundary constraint at the Gauss boundary points . . . . . . . . 39
3.3 Typical unlimited/limited linear reconstruction . . . . . . . . . . . . . . . . 46
3.4 Using first neighbors for monotonicity enforcement . . . . . . . . . . . . . . 47
3.5 Typical unlimited/limited quadratic reconstruction . . . . . . . . . . . . . . 50
3.6 Defining σ as a function of φ . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.1 Schematic of Direct Neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Typical cell-centered mesh numbering . . . . . . . . . . . . . . . . . . . . . 57
4.3 Total numerical error versus perturbation magnitude . . . . . . . . . . . . . 67
5.1 Linearization of a sample function . . . . . . . . . . . . . . . . . . . . . . . 72
6.1 Unstructured meshes for a square domain . . . . . . . . . . . . . . . . . . . 92
6.2 Error-Mesh plot for the square case . . . . . . . . . . . . . . . . . . . . . . . 93
6.3 Unstructured meshes for a curved domain (annulus) . . . . . . . . . . . . . 95
6.4 Error-Mesh plot for the annulus case . . . . . . . . . . . . . . . . . . . . . . 96
6.5 Circular domain over half a cylinder, Mesh 1 (1376 CVs) . . . . . . . . . . . 97
6.6 Circular cylinder, Mesh 1 (1376CVs) . . . . . . . . . . . . . . . . . . . . . . 98
6.7 Circular cylinder, Mesh 2 (5539 CVs) . . . . . . . . . . . . . . . . . . . . . 99
6.8 Circular cylinder, Mesh 3 (22844 CVs) . . . . . . . . . . . . . . . . . . . . . 99
6.9 Convergence history for the coarse mesh (Mesh 1) . . . . . . . . . . . . . . 101
6.10 Convergence history for the fine mesh (Mesh 3) . . . . . . . . . . . . . . . . 101
6.11 2nd-order pressure coefficient contours, Mesh 1 . . . . . . . . . . . . . . . . 102
6.12 3rd-order pressure coefficient contours, Mesh 1 . . . . . . . . . . . . . . . . 102
6.13 4th-order pressure coefficient contours, Mesh 1 . . . . . . . . . . . . . . . . 103
6.14 4th-order pressure coefficient contours, Mesh 3 . . . . . . . . . . . . . . . . 104
6.15 Pressure coefficient along the axis . . . . . . . . . . . . . . . . . . . . . . . . 104
6.16 Close up of the pressure coefficient along the axis (suction region) . . . . . 105
6.17 Pressure coefficient along the axis, Mesh 3 . . . . . . . . . . . . . . . . . . 105
6.18 Error in total pressure ratio, 2nd-order discretization, Mesh 1 . . . . . . . . 107
6.19 Error in total pressure ratio, 3rd-order discretization, Mesh 1 . . . . . . . . 107
6.20 Error in total pressure ratio, 4th-order discretization, Mesh 1 . . . . . . . . 108
6.21 Error in total pressure ratio, 4th-order discretization, Mesh 3 . . . . . . . . 108
6.22 Error-Mesh plot for the total pressure . . . . . . . . . . . . . . . . . . . . . 109
6.23 Drag coefficient versus mesh size . . . . . . . . . . . . . . . . . . . . . . . . 110
6.24 Drag coefficient versus CPU time . . . . . . . . . . . . . . . . . . . . . . . . 111
6.25 Annulus, Mesh 1 (108 CVs) . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.26 Annulus, Mesh 2 (427 CVs) . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.27 Annulus, Mesh 3 (1703 CVs) . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.28 Annulus, Mesh 4 (6811 CVs) . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.29 Annulus, Mesh 5 (27389 CVs) . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.30 Convergence history for the coarse mesh (Mesh 1) . . . . . . . . . . . . . . 116
6.31 Convergence history for the fine mesh (Mesh 5) . . . . . . . . . . . . . . . . 116
6.32 2nd-order Mach contours for the coarse mesh (Mesh 1) . . . . . . . . . . . . 117
6.33 3rd-order Mach contours for the coarse mesh (Mesh 1) . . . . . . . . . . . . 118
6.34 4th-order Mach contours for the coarse mesh (Mesh 1) . . . . . . . . . . . . 118
6.35 2nd-order density error for the coarse mesh (Mesh 1) . . . . . . . . . . . . . 119
6.36 3rd-order density error for the coarse mesh (Mesh 1) . . . . . . . . . . . . . 119
6.37 4th-order density error for the coarse mesh (Mesh 1) . . . . . . . . . . . . . 120
6.38 Density, 4th-order solution over the fine mesh (Mesh 5) . . . . . . . . . . . 120
6.39 Error-Mesh plot for the solution (Density) . . . . . . . . . . . . . . . . . . . 122
6.40 Error versus CPU Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.1 NACA 0012, Mesh 1, 1245 CVs . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.2 NACA 0012, Mesh 2, 2501 CVs . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.3 NACA 0012, Mesh 3, 4958 CVs . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.4 NACA 0012, Mesh 4, 9931 CVs . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.5 NACA 0012, Mesh 5, 19957 CVs . . . . . . . . . . . . . . . . . . . . . . . . 127
7.6 Cp over the upper surface after start-up, Mesh1 (1245 CVs) . . . . . . . . . 128
7.7 Cp over the upper surface after start-up, Mesh 5 (19957 CVs) . . . . . . . . 129
7.8 CPU time versus the grid size, NACA 0012, M = 0.63, α = 2° . . . . . 132
7.9 Total work unit versus the grid size, NACA 0012, M = 0.63, α = 2° . . 133
7.10 Newton phase work unit versus the grid size, NACA 0012, M = 0.63, α = 2° 133
7.11 Convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2° . . . . . . 134
7.12 Convergence history, Mesh 5, M = 0.63, α = 2° . . . . . . . . . . . . 135
7.13 Non-linear residual versus linear system residual, Mesh 3, M = 0.63, α = 2° 135
7.14 Linear system residual dropping order, Mesh 3, M = 0.63, α = 2° . . . 136
7.15 Eigenvalue pattern for the preconditioned system, Mesh 3, M = 0.63, α = 2° 137
7.16 Condition No. of the preconditioned system, Mesh 3, M = 0.63, α = 2° . 138
7.17 Condition No. of the preconditioned system, Mesh 5, M = 0.63, α = 2° . 138
7.18 Lift coefficient convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2° 139
7.19 Drag coefficient convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2° 140
7.20 Lift coefficient convergence history, NACA 0012, Mesh 5, M = 0.63, α = 2° 141
7.21 Drag coefficient convergence history, NACA 0012, Mesh 5, M = 0.63, α = 2° 141
7.22 NACA 0012, Mesh 3 (4958 CVs) . . . . . . . . . . . . . . . . . . . . . . . . 144
7.23 2nd-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2° 145
7.24 3rd-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2° 145
7.25 4th-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2° 146
7.26 Mach profile, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2° . 146
7.27 Mach profile close up, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2° 147
7.28 Mesh 1, close-up at the leading edge region . . . . . . . . . . . . . 147
7.29 Mach profile, lower side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2° . 148
7.30 Mach profile, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2° . 148
7.31 Mach profile close up, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2° 149
7.32 Mach profile, lower side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2° . 149
7.33 1 − Pt/Pt∞, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2° . 150
7.34 1 − Pt/Pt∞, lower side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2° . 150
7.35 1 − Pt/Pt∞, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2° . 151
7.36 1 − Pt/Pt∞, lower side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2° . 151
7.37 Mach profile at the end of start-up process, M = 0.8, α = 1.25° . . . 154
7.38 Convergence history for NACA 0012, M = 0.8, α = 1.25° . . . . . . . . 155
7.39 2nd-order Mach contours, NACA 0012, M = 0.8, α = 1.25° . . . . . . . 156
7.40 3rd-order Mach contours, NACA 0012, M = 0.8, α = 1.25° . . . . . . . 157
7.41 4th-order Mach contours, NACA 0012, M = 0.8, α = 1.25° . . . . . . . 157
7.42 Limiter φ (3rd-order), NACA 0012, M = 0.8, α = 1.25° . . . . . . . . 158
7.43 Limiter σ (3rd-order), NACA 0012, M = 0.8, α = 1.25° . . . . . . . . 159
7.44 Mach profile, NACA 0012, M = 0.8, α = 1.25° . . . . . . . . . . . . . 159
7.45 Mach profile in shock regions, NACA 0012, M = 0.8, α = 1.25° . . . . 160
7.46 Mesh (7771 CVs), diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . 162
7.47 Cp at the end of start-up process, diamond airfoil, M = 2.0, α = 0.0 . 163
7.48 Convergence history for diamond airfoil, M = 2.0, α = 0° . . . . . . . 164
7.49 2nd-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . 165
7.50 3rd-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . 165
7.51 4th-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . 166
7.52 2nd-order Cp, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . . . . . . . 166
7.53 3rd-order Cp, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . . . . . . . 167
7.54 4th-order Cp, diamond airfoil, M = 2.0, α = 0.0 . . . . . . . . . . . . . . . 167
List of Symbols
Roman Symbols
a speed of sound
A area, Jacobian matrix
b right hand side (Ax = b)
B fixed iteration matrix
C center
CL lift coefficient
CD drag coefficient
Cp specific heat at constant pressure, pressure coefficient
Cv specific heat at constant volume
D reconstruction solution vector (derivatives)
e specific internal energy
E total energy, error
F flux vector
G Gauss (integration) point
h specific enthalpy, mesh length scale
H Hessenberg matrix
I identity matrix
J Jacobian matrix
k subspace size (Krylov subspace dimension)
K order of accuracy, polynomial order, constant in Venkatakrishnan limiter
l length of the control volume face, norm
L norm, left eigenvector
m number of restarts
M Mach number, moments, preconditioner matrix, coefficient matrix in reconstruc-
tion
n normal vector
N total number of ...
p polynomial order
P static pressure, polynomial
r residual, distance
R gas constant, residual, radius, right eigenvector
S slope in higher-order limiter
t time, exponent of the weighting in reconstruction
T static temperature
U solution vector, conservative variables
u, v velocity components
V primitive variables, velocity, subspace
x, y Cartesian coordinates, unknown vectors
z preconditioning vector
Greek Symbols
α angle of attack
γ specific heat ratio
ε perturbation parameter
λ eigenvalue, wave speed
ρ density
σ higher-order limiter
φ slope limiter
ω relaxation factor
Superscripts
i iteration index
k, K order, iteration number, subspace number
m, n polynomial exponent
Subscripts
b boundary
c center
CD central difference
CV control volume
DB Dirichlet boundary
FD forward difference
G Gauss point
i inner, control volume index
in inlet
L left side
m magnitude
min, max minimum and maximum
n normal, normalized
N neighbor
NB Neumann boundary
o outer
out outlet
ref reference
R right side
RB reconstructed at the boundary
SubIn subsonic inlet
SubOut subsonic outlet
t total
W wall
x, y Cartesian directions
Acknowledgments
I would like to express my deep appreciation to my research supervisor, Dr. Carl Ollivier-
Gooch, for all of his guidance, support, and patience. His remarkable feedback throughout the course of this research was very helpful.
I also would like to thank Dr. Chen Greif from the Computer Science Department. Attending his valuable lectures on sparse matrix solvers and having the opportunity to engage in discussions with him have greatly benefited my research.
I am grateful to all my colleagues in the ANSLab research group in the Mechanical Engineering Department, especially Chris Michalak, Serge Gosselin, and Harsha Perera, for their computer assistance on many occasions.
Finally, my most sincere and deepest appreciation is for my great family, especially my
mother “Homa” and my father “Mehdi”, whose constant support, encouragement, patience
and love I have enjoyed all through my life.
Chapter 1
Introduction
Prediction of fluid flow quantities, such as pressure, velocity, and temperature, and the
study of flow behavior are the main goals in the field of fluid dynamics. Like other science
and engineering disciplines, fluid dynamics has greatly benefited from the development of
computing technology (numerical algorithms and computational tools) over the last four
decades, resulting in the emergence of a new approach in the field known as Computational Fluid Dynamics (CFD), Fig (1.1). CFD has shown remarkable capability for fluid
flow analysis both in academia and industry. CFD has not only made possible the sim-
ulation of flows (such as reentry of space vehicles) for which complete analysis (either by
theory or experiment) was impossible before, but also has provided a valuable feedback and
information source for improving theoretical and experimental fluid dynamics. Increasing
computing power over the last two decades has resulted in the development of new computa-
tional techniques and algorithms, enhancing the versatility of CFD applications. Nowadays, CFD is no longer just a research tool; it is used extensively and successfully in industry throughout the design process, from preliminary design to shape optimization.
1.1 Motivation
In the field of computational aerodynamics, the final goal is the accurate simulation of the
flow field around (and/or inside) complex 3D geometries to compute aerodynamic force
coefficients. In the mid-1980s, Jameson [36] was the first to compute the three-
dimensional flow over realistic aerodynamic configurations using the finite volume technique
(a robust conservative numerical approach for discretization of the fluid flow equations over
general meshes [38]); since then tremendous progress has been made in this area and appli-
cation of CFD for aerodynamic computations has revolutionized the process of aerodynamic
design [27].
To simulate the flow field around a 3D complex geometry accurately, a CFD package should
include three essential parts:
1. State of the art mesh generation capability. Mesh generation, or domain discretization, is one of the most important parts (if not the most important!) of a CFD simulation. Since the discretization of the fluid flow equations is carried out on the mesh, without a good domain discretization the CFD solution can be very inaccurate. Furthermore, generating an appropriate mesh, especially for complex geometries, is in practice the most time-consuming part for a CFD user and is certainly not a trivial task. According to a real aerodynamic case study in aerospace engineering, the mesh preparation time can be up to 45 times larger than the computation time required for the fluid flow simulation [47]. Therefore, it is both desirable and necessary to reduce the mesh preparation time, and there is a large potential gain in automating this process. Ideally,
meshing software should be able to generate a geometrically and physically suitable
mesh around or inside a complex 3D geometry with a reasonable user workload. Also
the user should be able to refine the mesh according to geometric parameters and
to adapt the mesh based on flow features without excessive effort. The unstructured
mesh technique, among other types of mesh generation methods, is a very good candi-
date to address mesh generation issues due to its automation capability in generating
meshes for complex geometries and its flexibility in refinement and adaptation.
2. Accurate physical modeling of high Reynolds number turbulent flow. Most
practical engineering applications such as aircraft aerodynamics involve turbulent
flows. The Direct Numerical Simulation (DNS) of turbulent flows for practical pur-
poses is not feasible at least for the next couple of decades due to computing technology
limitations (memory and speed). Therefore modeling the turbulent flow is the only
viable approach in high Reynolds number CFD simulations and will remain a major
active research area for the foreseeable future [7, 75]. Discussing the physical and numerical criteria for choosing an appropriate turbulence model and related issues is beyond the scope of this thesis, but accurate physical modeling is a key part of valid CFD simulation for practical engineering problems. It should be noted that the mod-
eling of the physical phenomena in CFD simulations is not just limited to turbulence,
but is also essential for combustion, multiphase flows, hydromagnetic flows, and other
types of fluid flows.
[Figure: Theoretical Fluid Dynamics and Experimental Fluid Dynamics, together with Applied Math, Numerical Algorithms, and Computing Tools, feeding into Computational Fluid Dynamics]
Figure 1.1: Main approaches in fluid dynamics
3. A robust, efficient, accurate flow solver for the generic mesh. Numerical
solution of the fluid flow equations is what CFD is all about and this solution must
be stable and converge to the correct answer at a reasonable cost. A solver algo-
rithm includes three separate components: discretization of the fluid flow equations, numerical flux computation, and updating the solution, usually through a time advance method. Fig (1.2) shows the overall schematic of a CFD algorithm. Current CFD flow solver algorithms still have some limitations, both in terms of efficiency and accuracy, especially for the simulation of physically complicated flows. Specifically, the memory
and speed of current computers (even using parallel techniques) do not yet allow us to
simulate physically complex flows in realistic geometries, with sufficient accuracy, in
a short time and at a reasonable cost. As a result, we have to simplify the physics of
the fluid flow via approximate modeling, and neglect some of the numerical/physical
issues caused by insufficient mesh density (especially for complex geometries), limiting
the validity of the simulation and adversely affecting its application.
Before proceeding with further details, it should be mentioned that this thesis is aimed only at the third component mentioned above: improving the efficiency and accuracy of CFD algorithms.
The numerical error in a simulation can be written in the form O(h^p), where h is the mesh length scale and p is the discretization order of accuracy. Clearly, then, improving the accuracy of a numerical simulation (modeling issues aside) is possible by means of increasing
Order        | Structured Explicit | Structured Implicit | Unstructured Explicit | Unstructured Implicit
Second-Order | ♣♣♣♣♣♣♣             | ♣♣♣♣                | ♣♣♣♣                  | ♣♣
Higher-Order | ♣♣                  | ♣                   | ♣                     | ?

Table 1.1: Qualitative illustration of research on solver development
the mesh density (using a finer grid, i.e., decreasing h), and/or increasing the discretization order. Reducing the mesh length scale can be achieved by global or local mesh refinement or adaptation.
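The error model E ≈ C·h^p can be checked numerically by comparing error norms on two successively refined meshes, which is the standard way of measuring an observed order of accuracy. The sketch below uses purely hypothetical error values, not results from this thesis:

```python
import math

def observed_order(e_coarse, e_fine, h_coarse, h_fine):
    """Estimate the discretization order p from the error model E ~ C * h**p,
    given error norms measured on two meshes with length scales h."""
    return math.log(e_coarse / e_fine) / math.log(h_coarse / h_fine)

# Hypothetical error norms for a nominally second-order scheme: halving h
# should reduce the error by roughly a factor of four.
p = observed_order(e_coarse=4.0e-3, e_fine=1.0e-3, h_coarse=0.1, h_fine=0.05)
print(round(p, 2))  # → 2.0
```

A comparison of this kind is what underlies error-versus-mesh plots such as those listed for Chapter 6.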
Nearly all modern CFD codes use second-order methods, which produce a diffusive error proportional to h², arising from diffusive derivative terms and from any artificial viscosity added for stability (in central difference schemes). For instance, in a second-order 2D finite volume formulation in Cartesian coordinates, this numerical error for each control volume can be written in the following form:
\mathrm{Numerical\ diffusive\ error} = \frac{\partial^2 U}{\partial x^2}\,\frac{\Delta x^2}{2} + \frac{\partial^2 U}{\partial x\,\partial y}\,\Delta x\,\Delta y + \frac{\partial^2 U}{\partial y^2}\,\frac{\Delta y^2}{2} \qquad (1.1)
This leading-order error term causes two significant numerical problems. First, it smears
sharp gradients in convection dominated parts of the flow and spoils the conservation of
total pressure in isentropic regions of the flow field as it acts like a (numerical) viscosity.
Second, this numerical diffusivity produces parasitic error in viscous regions by adding extra
diffusion, which is very grid dependent, to the solution. Therefore using a high resolution
numerical scheme for discretization is quite desirable. Application of discretization orders
larger than second-order both for structured and unstructured grids has been an area of
ongoing research for the last two decades [35, 10], and will be the focus of this thesis.
However, convergence of high-resolution schemes is not as efficient as that of second-order schemes, especially for unstructured grids [88, 46], due to the increased complexity of the discretization, decreased damping (lack of diffusive damping), and the presence of additional error modes (which must be damped in the solution process). Consequently, implementing a higher-order unstructured discretization within an implicit framework to achieve efficient convergence is extremely helpful, if not necessary. As shown in Fig (1.2), up to the flux integral compu-
tation, the overall CFD algorithm is the same, but the choice of the time advance technique
in updating the solution changes the level of complexity of the solution process completely.
Integration of the discretized equations in time can be done either explicitly or implicitly.
Figure 1.2: CFD overall algorithm. (Flowchart: the geometry and solution domain feed a mesh generation package producing the discretized domain; the physics and fluid flow equations, with boundary and initial conditions, are discretized over this domain to form the flux integral; the solution is then updated by either an explicit time advance (multistage techniques) or an implicit time advance (flux integral linearization, preconditioning, sparse matrix solver).)

In the explicit time integration, the space discretization is performed at the previous time
level, using the known flow quantities found at the previous time iteration. In the implicit
time integration both the space and the time discretizations are performed at the current
time level where the flow quantities are needed as unknowns. Equation (1.2) shows a typical
unsteady fluid flow PDE where the right hand side represents the spatial discretization and
the left hand side shows the time derivative. For example, employing the first-order forward
differencing time advance technique in the explicit and implicit forms leads to Eq. (1.3)
and Eq. (1.4).

\[
\frac{dU}{dt} = -R(U) \qquad (1.2)
\]

Explicit time advance:
\[
\frac{U_i^{n+1} - U_i^n}{\Delta t} = -R(U_i^n) \qquad (1.3)
\]

Implicit time advance:
\[
\frac{U_i^{n+1} - U_i^n}{\Delta t} = -R(U_i^{n+1}) \qquad (1.4)
\]
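The two update formulas can be contrasted with a minimal sketch (a hypothetical scalar residual R(u) = u², standing in for the flux integral): the explicit step evaluates R at the known level n, while the implicit step must solve a nonlinear equation at level n+1, here with a small Newton iteration.

```python
# Sketch: explicit vs. implicit first-order time advance for dU/dt = -R(U).
# R(u) = u**2 is an illustrative stand-in, not the Euler residual.

def R(u):
    return u ** 2

def dR(u):          # derivative of R, needed for the implicit (Newton) solve
    return 2.0 * u

def explicit_step(u, dt):
    # Eq. (1.3): residual evaluated at the known level n
    return u - dt * R(u)

def implicit_step(u, dt, iters=20):
    # Eq. (1.4): residual evaluated at level n+1; solve the nonlinear
    # equation g(v) = v - u + dt*R(v) = 0 for v with Newton's method
    v = u
    for _ in range(iters):
        g = v - u + dt * R(v)
        v -= g / (1.0 + dt * dR(v))
    return v

u_exp = u_imp = 1.0
dt = 0.5            # a step this large already strains explicit accuracy
for _ in range(10):
    u_exp = explicit_step(u_exp, dt)
    u_imp = implicit_step(u_imp, dt)
```

Each implicit step costs a nonlinear solve, but it remains stable for time steps far larger than an explicit scheme tolerates; this is the trade-off discussed in the text.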
While an explicit update needs only multistage integration of the flux integral using a Runge-Kutta-type
scheme, an implicit update requires linearization of the flux integral (Chapter 2)
and construction of a large linear system, which in turn requires an efficient sparse matrix
solver (discussed in Chapters 4 and 5). Efficiently solving a large linear system, especially with
an ill-conditioned matrix resulting from a higher-order discretization, demands effective
preconditioning, which adds to the complexity of the process. However, the error reduction of
each solution update in implicit integration is far larger than in explicit integration, since implicit
methods do not suffer from the stability restrictions of explicit methods and can take large
time steps. On balance, therefore, it is preferable to bear the complexity of the algorithm
and accelerate the solution toward steady state in a relatively small number of implicit
iterations.
Table 1.1 provides a qualitative summary of finite volume solver development research,
where the number of symbols represents the approximate volume of research on
solver development since the early 1980s (based on the author's survey). Clearly the trend in
solver development is moving toward:
1. Unstructured meshes to address the mesh generation issues for complex geometries
2. Higher-order discretizations to increase the global solution accuracy
3. Implicit techniques to improve the efficiency of the solution process
This research is mainly intended to contribute to the development of efficient and accurate
flow solvers, filling the gap in high-order implicit methods on unstructured meshes.
1.2 Background
This section provides an overview of relevant aspects of current CFD solvers including
discretization type and order as well as implicit algorithms. This overview is complemented
by detailed discussion of previous work related to each part of the solver in the relevant
chapters of the thesis.
1.2.1 Mesh Generation and Spatial Discretization
Numerical simulation of fluid flow consists of two main parts: discretization of the flow field
around or inside the geometry by a finite number of cells (grid generation) and solution of
the fluid flow equations over the discretized domain (flow solver). Structured and unstructured
meshes, Fig. 1.3, are the most common types of grids used in CFD applications.
In a structured mesh, all cells and vertices have the same topology, but in an unstructured
mesh, elements can have irregular and variable topologies. Generating
structured grids around complex configurations has proved to be a considerable challenge.
Sophisticated structured approaches such as multi-block grid generation have addressed this
issue by dividing the domain between the body and the far field into simple geometrical blocks;
structured grids are generated inside each block. However, automation of the blocking
procedure is still a relatively difficult job [76]. Another structured approach is overlap or
chimera grids. In this technique, the computational domain is divided into multiple zones
and a suitable grid is generated in each zone. The chimera approach allows zones to over-
lap, and interpolation routines are used to transmit data between the overset grids in the
flow solver. However, generalizing the grid generation and adaptation in this approach is
still not an easy task and demands a high level of expertise as well as considerable effort.
Furthermore, interpolation between the blocks and overlapped meshes has its own issues
and can introduce additional error. The most powerful approach for complex geometries is
unstructured grids (typically triangular in 2D and tetrahedral in 3D). Unstructured grids
have a higher flexibility in refinement based on the geometry and adaptation based on the
solution features and gradients. Therefore, unstructured meshes are one of the most suitable
choices for complex geometries; associated solvers are becoming more common in modern
CFD applications and promise to be more capable and successful for complex aerodynamic
problems [88].
The fluid flow equations (PDEs) are generally discretized in one of the following forms:
finite difference, finite element, or finite volume. Finite difference is a pointwise representation
of the flow field where the flow equations are solved only for variables defined at
grid points.

(a) Structured (b) Unstructured
Figure 1.3: Example of a structured and an unstructured mesh over a 2D airfoil

The finite difference scheme was the original approach to the CFD problems
and it is well suited to structured grids, where its higher-order implementation is easily
achieved by employing higher-order differencing formulas. However, finite difference
discretization does not conserve the mass, momentum and energy of the flow, which is
a serious limitation for many practical applications such as shock capturing. More
importantly, finite difference discretization cannot be implemented on
unstructured grids.
The finite element method, another discretization technique, is one of the most complete and
well-established mathematical approaches for the numerical solution of PDEs. In this method,
the flow equations are multiplied by a test function and then integrated over the discretized
domain. The solution is represented by a local basis function (interpolation function) for
each element. The finite element method is very flexible both in theory and application,
and it can easily be used on unstructured meshes. Its high-order extension is also
fairly common, employing higher-order basis and test functions. The challenging part in
applying the finite element method to CFD computation is again conservation of the flow
equations, especially for non-smooth flows. Although conserving mass, momentum and
energy is possible in the finite element formulation, it is not an easy task, and finite
element codes require considerable fine tuning for shock capturing.
The finite volume approach is designed around the conservation of mass, momentum
and energy. The solution is represented by control volume averages, and the equations are
discretized as volume integrals. Like the finite element method, it is very flexible
for complex geometries and unstructured meshes. At the same time, its robustness for
nearly all CFD applications, especially shock capturing problems, is well established. Higher-order
finite volume methods are possible by using a higher-order polynomial
inside each control volume whose integral average over the control volume
matches the control volume average.
Jameson and Mavriplis [37] reported some of the earliest unstructured finite volume CFD
results, solving two-dimensional inviscid flow on regular triangular grids obtained by subdividing
quadrilateral grids; central differencing was used. Their approach was second-order
(linear distribution). The artificial viscosity and the second-order truncation error made
this approach relatively diffusive on general irregular unstructured meshes.
Second-order upwind schemes (discretization of the flow equations based on the physical
wave propagation directions) have also been used on unstructured grids, through either
the Green-Gauss gradient technique or the least-squares linear reconstruction method. Applying
upwind schemes on unstructured grids is more complicated than on structured grids, especially
for higher-order approximation. For the unstructured case, over each finite volume
(a triangle in 2D) a polynomial approximation to the solution is reconstructed with the help
of the neighboring control volumes, and then the Riemann (shock tube) problem is solved
approximately at the control volume interfaces. One of the most successful approaches to
applying the upwind scheme was undertaken by Barth [10], who defined
a general upwind formulation (in multiple dimensions), introducing the minimum-energy least-squares
reconstruction procedure for flux calculation up to the desired order of accuracy.
However, any upwind scheme of higher than first-order accuracy (first-order is monotone
by nature) often causes oscillations in the vicinity of sharp gradients and discontinuities, which can produce
instability problems. For example, Agarwal and Halt [6] proposed a compact higher-order
scheme for the solution of the Euler equations on unstructured grids using explicit time integration;
considerable over- and undershoots were evident in their third-order transonic airfoil case.
A common remedy is the use of limiters, which enforce monotonicity at the
expense of adversely affecting both accuracy and convergence. Barth and Jespersen [13]
introduced a multi-dimensional limiter to achieve a monotone solution. Although that
approach was quite successful in suppressing oscillations, reaching full convergence was not
possible even after freezing the limiter, because the limiter was not differentiable. Since
then several attempts have been made to design a differentiable limiter for unstructured
grid solvers, and Venkatakrishnan's limiter [87] seems to be one of the most robust. That
limiter does not strictly enforce monotonicity but allows only small overshoots in the
converged solution. However, it preserves accuracy, especially in smooth regions containing
local extrema. Although this limiter shows better convergence behavior, it still has some
convergence issues with implicit solvers. Designing an appropriate limiter for higher-order
unstructured grid solvers is a fairly unexplored topic so far and needs to be addressed for
practical higher-order unstructured application.
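To make the limiting idea concrete, the sketch below implements a Barth-Jespersen-style limiter for one control volume with a linear reconstruction. The function name and data layout are illustrative, not the thesis formulation (the thesis's own limiter is introduced in Chapter 3).

```python
# Sketch of a Barth-Jespersen-style limiter: compute a factor phi in [0, 1]
# that scales the reconstruction gradient so the reconstructed face values
# stay within the range of neighboring control volume averages.

def barth_jespersen_phi(u_i, u_neighbors, du_faces):
    """u_i         -- control volume average of this cell
    u_neighbors -- averages of the neighboring control volumes
    du_faces    -- unlimited reconstructed differences u(x_f) - u_i at faces
    """
    u_max = max(u_i, *u_neighbors)
    u_min = min(u_i, *u_neighbors)
    phi = 1.0
    for du in du_faces:
        if du > 0.0:
            phi = min(phi, (u_max - u_i) / du)   # cap positive excursions
        elif du < 0.0:
            phi = min(phi, (u_min - u_i) / du)   # cap negative excursions
        # du == 0 imposes no constraint on this face
    return max(0.0, min(1.0, phi))
```

Note the min over faces: a single overshooting face value clips the whole gradient. This non-smooth min is exactly why the original limiter is non-differentiable and stalls convergence, motivating the differentiable alternatives discussed above.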
Another, more sophisticated, approach to curing oscillations in compressible flow computation
is the essentially non-oscillatory (ENO) family of schemes. These schemes are uniformly
accurate and prevent oscillations in non-smooth regions by detecting discontinuities and
modifying the reconstruction stencil from cell to cell and time level to time level [30]. ENO
schemes are computationally expensive and sacrifice fast convergence because of their dynamic
stencils [34]. Weighted ENO (WENO) schemes were developed to address the problems
caused by dynamic stencils: near discontinuities, WENO schemes ([3, 57, 28])
remove the effect of non-smooth data in the reconstruction stencil by giving it an asymptotically
small weight. However, no comprehensive convergence analysis or computational
cost studies have been presented. Furthermore, the performance of ENO/WENO schemes in the context
of implicit time advance, which is one of the most efficient solution strategies, has not
yet been studied.
1.2.2 Higher-Order Discretization
For structured meshes, the application of higher-order algorithms has progressed considerably,
and it has been shown that, for practical levels of accuracy, a higher-order accurate
method can be more efficient both in solution time and in memory usage. With
higher-order accurate methods, the cost of flux computation, integration, and other associated
numerical calculations increases per control volume. However, because a coarser
mesh can be used, computation time and memory are saved overall and accuracy can be increased as
well.
well. De Rango and Zingg [66] applied a globally third-order accurate algorithm for steady
turbulent flow over a 2-D airfoil using a structured grid. They showed this approach can
lead to a dramatic reduction in numerical error in drag using relatively coarse grids, and
the results provide a convincing demonstration of the benefits of higher-order methods for
practical flows. Zingg et al. [96] compared different flux discretization techniques with
higher-order accuracy for laminar and turbulent flows (including transition) both in sub-
sonic and transonic speed regimes. Extending the conclusion of the previous research, it
was shown that the higher-order discretization produces solutions of a given accuracy much
more efficiently than the second-order methods. More aspects of implementation of higher-
order methods have been discussed by De Rango and Zingg [67], and convergence behavior
of this approach has been studied in detail. The same researchers [68] applied a higher-order
algorithm to the flow around a multi-element airfoil using the multi-block grid
technique. A grid convergence study in that work showed
that the higher-order discretization produces a substantial reduction in numerical error
in the flow field in comparison with the second-order algorithm; this smaller error was
achieved on a grid several times coarser than that used for the second-order
algorithm. In summary, these studies show that achieving the desired accuracy in
practical aerodynamic flows using higher-order algorithms is not only possible but also
advantageous.
Research in high-order unstructured solvers is motivated by the desire to combine the ac-
curacy and efficiency benefits seen in the application of high-order methods on structured
meshes with the geometric and adaptive flexibility of unstructured meshes. The application
of methods having higher than second-order accuracy for solving the compressible Euler
and Navier-Stokes equations on unstructured meshes has not been thoroughly investigated
yet and remains an active research topic.
Several researchers have achieved higher-order accuracy by the use of the finite element
method. Bey and Oden [17] have used a discontinuous Galerkin method to reach fourth-
order accuracy for smooth flows. Bassi and Rebay [15] have used a new discontinuous
element for discretization of Euler equations and have computed the compressible flow over
a simple unstructured grid.
The finite volume approach has received more attention. Barth and Fredrickson [12] derived
a general condition for a scheme to be higher-order accurate, including a reconstruction procedure
satisfying the properties of conservation of the mean, K-exactness and compact support
(these criteria are discussed in detail in Chapter 3). They also proposed a minimum-energy
(least-squares) reconstruction to calculate the required polynomial coefficients. Delanaye and
Essers [23] proposed a quadratic reconstruction finite volume scheme for compressible flows
on unstructured adaptive grids. The overall accuracy of the scheme was second-order. The
inviscid flux was computed directly from their quadratic polynomials; however, diffusive
derivatives were obtained through a linear interpolation. For monotonicity enforcement, a
discontinuity detector was introduced and higher-order terms in reconstructed polynomials
were dropped in the vicinity of discontinuities. Ollivier-Gooch and Van Altena [60] have
analyzed a new approach for higher-order accurate finite-volume discretization for diffusive
fluxes that is based on the gradients computed during solution reconstruction; a fourth-order
accurate solution for the advection-diffusion problem was computed. Recently, Nejat
and Ollivier-Gooch [48] developed an implicit higher-order unstructured solver for Poisson's
equation, clearly showing that higher-order discretization over an unstructured stencil
can reduce the computational cost required for a given level of solution accuracy for
certain fluid problems.
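The least-squares reconstruction idea underlying these schemes can be sketched for the simplest (linear, k = 1) case: the gradient in a cell is chosen to minimize the mismatch with neighboring control volume data. The geometry, data layout, and function name below are hypothetical; higher-order (quadratic and cubic) reconstruction generalizes this to more polynomial coefficients and larger stencils.

```python
# Sketch: unstructured least-squares gradient reconstruction for one cell.
# Solves the 2x2 normal equations for grad u minimizing
#   sum_j ( u_j - u_i - grad_u . (x_j - x_i) )^2   over neighbors j.
# Assumes a non-degenerate stencil (neighbors not all collinear).

def ls_gradient(xi, ui, neighbors):
    """xi: (x, y) of cell i; ui: its average; neighbors: list of ((x, y), u)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), u in neighbors:
        dx, dy, du = x - xi[0], y - xi[1], u - ui
        a11 += dx * dx; a12 += dx * dy; a22 += dy * dy
        b1 += dx * du;  b2 += dy * du
    det = a11 * a22 - a12 * a12          # Cramer's rule on the 2x2 system
    return ((a22 * b1 - a12 * b2) / det,  # du/dx
            (a11 * b2 - a12 * b1) / det)  # du/dy
```

For any exactly linear field the reconstructed gradient is recovered to machine precision, which is the k-exactness property (here k = 1) mentioned above.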
1.2.3 Implicit Method and Convergence Acceleration
Flow features, especially in physically complicated flows, vary greatly in size. In particular,
time and length scales associated with physical phenomena like turbulence and combustion
can be very disparate. The use of millions of cells, which is common today in practical
simulations, results in global length scales spanning the computational domain which are
several orders of magnitude larger than the smallest scales resolved by neighboring grid
points. With such disparate length scales, and such large, numerically stiff problems, efficient
time integration and solution convergence are real challenges for the solution of the
resulting discrete systems.
Generally explicit integration methods (multi-stage Runge-Kutta family schemes), even with
the help of acceleration techniques, such as local time stepping and residual smoothing, still
show slow convergence behavior for steady-state solution of large and/or stiff CFD problems.
Implicit methods are a fairly common and efficient approach for the steady-state solution of
the fluid flow equations. Regardless of the spatial discretization technique, an implicit solution
of fluid flow problems requires solving a large linear system resulting from the linearization
of the fluid flow equations in time (temporal discretization).
In the limit of very large time steps, implicit time advance schemes approach Newton’s
method. Newton methods, which will be discussed in Chapter 5, have been used in CFD
since the late 1980s and are considered an attractive approach for solution convergence of
steady flows due to their property of quadratic convergence (when starting from a good
initial guess). In early attempts, direct methods were employed for solving the linear system
arising at each Newton iteration [85, 89, 8]. While direct solvers have been developed for
stiff linear systems, the size of the systems of equations arising in CFD makes
direct methods impractical. Therefore, using iterative linear solvers with proper
preconditioning, which is a crucial factor for complex problems, is the only reasonable choice
for solving the linear system in each Newton iteration. At the same time, the cost of each
iteration in terms of CPU time and memory usage for a pure Newton method is relatively
high. Quasi-Newton methods can have satisfactory convergence behavior, lower memory
usage and less cost per iteration at the expense of increasing total number of Newton
iterations and losing quadratic convergence rate [62]. Quasi-Newton methods are generally
categorized as Approximate Newton and Inexact Newton methods.
In Approximate Newton methods the flux Jacobian on the left hand side (arising from
linearization) either is computed through some simplifications or is evaluated based on
lower-order discretization, while the flux integral on the right hand side is evaluated up to
the desired order of accuracy. In either case the linearization is approximate, and
although the Jacobian matrix has a simpler structure and is better conditioned (i.e. easier
to invert), the overall convergence rate of the non-linear problem will be degraded. This
approach is also known as the defect correction technique, and it is useful when there are
memory limitations and storing the full Jacobian is impractical, specifically in 3D. If the true
Jacobian is very stiff and solving the linear system of the true linearization is challenging,
Approximate Newton may work better than the original Newton method. This is often true
in the early stage of Newton iterations when a good starting solution is not yet available.
Therefore, Approximate Newton is a very good candidate for the start-up process, especially
if the solution process is started from a poor initial guess.
In the second category, Inexact Newton methods [25], the complete linearization based on
the flux integral on the right hand side is employed and the true Jacobian is calculated.
However, the resultant linear system at each Newton iteration is solved approximately by
an iterative linear solver. For highly non-linear problems such as compressible flows, the
linearized system, especially at the initial iterations, is not an accurate representation of the
non-linear problem. As a result, completely solving the linear system does not improve the
overall convergence rate. Instead, the linear system is solved to some tolerance,
normally chosen as a fraction (typically between 10^{-1} and 10^{-2}) of the flux integral
on the right hand side.
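An inexact Newton loop can be sketched as follows; here `solve_to_tol` stands in for an iterative linear solver (e.g. GMRES) that stops once the linear residual drops below the forcing tolerance η·||R||. All names are illustrative, not the thesis's implementation.

```python
# Sketch of an inexact Newton iteration: the linear system J du = -R(u)
# is solved only to a relative tolerance eta (the "forcing term"),
# since an exact linear solve buys nothing while the linearization is poor.

def norm(v):
    return sum(x * x for x in v) ** 0.5

def inexact_newton(residual, jacobian, solve_to_tol, u,
                   eta=1e-1, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        r = residual(u)
        if norm(r) < tol:                 # nonlinear convergence reached
            break
        # solve J*du = -r only until ||J*du + r|| <= eta * ||r||
        du = solve_to_tol(jacobian(u), [-x for x in r], eta * norm(r))
        u = [ui + d for ui, d in zip(u, du)]
    return u
```

In the verifier test below a trivial exact scalar solve plays the role of `solve_to_tol`; in practice the inner solver is preconditioned GMRES and η controls the work spent per Newton step.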
Among iterative linear solvers, Krylov subspace methods are the most common; amongst
these, the Generalized Minimal Residual (GMRES) [72] algorithm (covered in detail in
Section 5.2) was developed mainly for non-symmetric systems such as those resulting from
unstructured meshes. In the matrix-free GMRES, the matrix vector products required by
the GMRES algorithm are computed without forming the matrix explicitly. Matrix-free
GMRES [14] is a very attractive technique for dealing with complicated Jacobian matri-
ces, because it reduces memory usage considerably and eliminates the problem of explicitly
forming the Jacobian matrix. This is especially helpful for higher-order unstructured mesh
solvers where full (analytic) Jacobian calculation is extremely costly and difficult, if not
impossible. GMRES efficiency depends strongly on the conditioning of the linear system.
This is especially important for higher-order discretizations, which make the Jacobian matrix
less diagonally dominant and quite ill-conditioned, and for the Euler equations
(compressible flow), with their non-linear flux function and possible discontinuities in the solution.
Applying a good preconditioner for GMRES under these circumstances becomes a
necessity [49].
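The matrix-free idea rests on approximating the Jacobian-vector products GMRES needs by a directional finite difference of the residual, Jv ≈ [R(u + εv) − R(u)]/ε, so the Jacobian is never formed. A minimal sketch follows; the ε heuristic shown is one common choice, not necessarily the one used in the thesis.

```python
# Sketch: matrix-free Jacobian-vector product as used inside GMRES.
# J*v is approximated by a one-sided finite difference of the residual R.

def jac_vec_product(residual, u, v, eps=None):
    if eps is None:
        # common heuristic: sqrt(machine epsilon) scaled by vector sizes
        nv = sum(x * x for x in v) ** 0.5 or 1.0
        nu = sum(x * x for x in u) ** 0.5
        eps = (2.2e-16 ** 0.5) * (1.0 + nu) / nv
    r0 = residual(u)
    r1 = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(r1, r0)]
```

Each product costs one extra residual evaluation, trading Jacobian storage and assembly for flux evaluations; this is what makes the approach attractive for higher-order discretizations whose analytic Jacobians are costly or intractable.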
As a result, there are a wide variety of Quasi-Newton methods, where the forming of the
Jacobian, solving the linear system, choosing the preconditioner and starting up the solution
process are the key factors in overall performance and robustness of the solver.
For structured meshes, Pueyo and Zingg [63, 64, 65] presented an efficient matrix-free
Newton-GMRES solver for steady-state aerodynamic flow computations. They investigated
the efficiency of different Quasi-Newton methods, Incomplete Lower-Upper (ILU) precondi-
tioning strategies (see section 5.2.3), and reordering techniques for a variety of compressible
inviscid, laminar and turbulent flows using a GMRES iterative solver. They showed that
the Approximate Newton method using matrix-free GMRES-ILU(2) with Reverse Cuthill-
McKee reordering, when the lower order Jacobian is employed as the preconditioner matrix,
provides the best efficiency in terms of CPU time for most cases. Later, Nichols and Zingg
[54] developed a 3D multi-block Newton-Krylov solver for the Euler equations using the
same approach. Through a parametric study, they showed that ILU(1) provides an adequate
balance between good preconditioning and low computational time.
For unstructured meshes, Venkatakrishnan and Mavriplis [90] developed an approximate
Newton-GMRES implicit solver for computing compressible inviscid and turbulent flows
around a multi-element airfoil. They compared different preconditioning strategies and
found that GMRES with ILU preconditioning had the best performance. In their case
the graph of the linearized Jacobian and the unstructured mesh were the same, as the
Jacobian was approximated based on the direct neighbors, and ILU(0) demonstrated satisfactory
results. Barth and Linton [14] successfully applied both full-matrix and matrix-free
Newton-GMRES for computing turbulent compressible flow on unstructured meshes. They
also presented a technique for constructing matrix-vector products as an exact calculation
of directional derivatives. Delanaye et al. [24] presented an ILU-preconditioned
matrix-free Newton-GMRES solver for the Euler and Navier-Stokes equations on unstructured
adaptive grids using quadratic reconstruction. This study showed that ILU(0) can be
insufficient for reaching full convergence on stiff problems when the left-hand-side Jacobian
is higher-order. By permitting one more fill level in the ILU decomposition
(ILU(1)), full convergence was achieved. A totally matrix-free implicit Newton-GMRES
method was introduced by Luo et al. [44] for 3D compressible flows using LU-SGS (Lower-
Upper Symmetric Gauss-Seidel) as the preconditioning strategy. They completely elimi-
nated the storage of the preconditioning Jacobian matrix by approximating the Jacobian
with numerical fluxes. However, most probably because of stability considerations for
their preconditioning strategy (LU-SGS), full Newton iterations were not performed and
the convergence rate remained nearly linear. Manzano et al. [45] presented an efficient ILU-preconditioned
matrix-free Newton-Krylov (GMRES) algorithm for 3D unstructured meshes.
preconditioned matrix-free Newton-Krylov(GMRES) algorithm for 3D unstructured meshes.
They used different levels of fill (ILU(1-3)) depending on the case and the flux residual to
achieve optimum performance. Nejat and Ollivier-Gooch [50] developed an LU-SGS-preconditioned
matrix-free Newton-GMRES algorithm for higher-order inviscid compressible flow
computations. Their results show that LU-SGS-GMRES works almost as efficiently for the
third-order discretization as for the second-order one. They also presented a supersonic
airfoil case in which the third-order discretization converged faster than the second-order
discretization. That again raised the possibility that using higher-order discretization could
in fact increase both accuracy and efficiency.
1.3 Objective
To develop an efficient and accurate algorithm for compressible viscous flow (i.e. the Navier-Stokes
equations), the first step is to develop the base flow solver for compressible inviscid
flow (i.e. the Euler equations). It is worth mentioning that extending an inviscid solver to
a viscous solver does not change the overall algorithm in principle, although it requires
support for anisotropic/hybrid meshes as well as computation of the viscous flux function
(instead of the inviscid one).
The goal of this research is to develop an efficient and accurate solution algorithm for inviscid
compressible fluid flow simulations using unstructured meshes. To achieve this objective
an existing higher-order unstructured reconstruction procedure [60] is combined with an
efficient implicit time advance algorithm.
1.4 Contributions
A fast and efficient implicit (matrix-free) algorithm is designed and successfully implemented
for the 2nd, 3rd and 4th order accurate steady-state computation of inviscid compressible
flows. The robustness and accuracy of the developed solver have been verified through
several test cases. It should be noted that the 4th-order unstructured finite volume solution
for inviscid compressible flows has not been available prior to this research, and the currently
available 3rd-order results in the literature lack curved boundary implementation.
The solution process has been divided into two separate phases, a start-up phase and a
Newton phase. A defect correction procedure is proposed for the start-up phase consisting
of multiple implicit pre-iterations.
This research provides a deep insight into preconditioning, a comprehensive convergence
comparison, and a meaningful cost breakdown study for the implicit algorithm for the 2nd,
3rd and 4th-order unstructured discretization methods.
A differentiable switch is designed for the limiting of the higher-order terms in the recon-
struction polynomial in the vicinity of discontinuities.
Accuracy assessment is performed for a series of independently generated irregular unstructured
meshes which, to the author's knowledge, is unprecedented at this scale and
demonstrates the overall accuracy of the developed solver algorithm.
Unstructured finite volume solutions of the 2nd, 3rd and 4th-order methods have been compared
in detail for subsonic, transonic and supersonic cases.
The possibility of reducing computational cost required for a given level of accuracy using
high-order unstructured discretization is demonstrated.
1.5 Outline
In Chapter 2, the main flow solver is introduced, including governing equations, implicit time
advance algorithm (linearization of the flux in time), upwind flux formulation, boundary
flux treatment, and flux integration.
In Chapter 3, the employed reconstruction procedure, monotonicity enforcement and higher-
order limiter implementation are discussed; a new formula for limiting the higher-order terms
of the reconstruction polynomial is introduced including a smooth switch.
In Chapter 4, the Jacobian matrix calculation used in preconditioning is described. A first-
order approximate analytical Jacobian is derived for the interior and boundary fluxes. Also,
a similar low-order Jacobian computation using the finite difference technique is introduced
in addition to further discussion on the accuracy of this finite difference Jacobian.
Chapter 5 lays out the solution strategy and introduces the matrix-free GMRES linear
solver (with special higher-order considerations) and the applied preconditioning in detail.
Chapters 6 and 7 are devoted to the numerical results, which include the verification cases
(Part I) and simulation cases (Part II), respectively. First, the accuracy and performance of
the developed solver are verified for basic (but not necessarily easy) test cases. Then the fast
convergence and robustness of the proposed higher-order unstructured Newton-Krylov solver
are investigated for subsonic, transonic, and supersonic flows. Furthermore, solutions
of different orders of accuracy are compared in detail for all test cases.
Finally, in Chapter 7, the thesis is brought to closure by summarizing the research, describ-
ing the contributions, providing conclusions based on the results, and recommending some
future work.
Chapter 2
Flow Solver
2.1 Governing Equations
Conservation of mass (continuity), momentum (Newton's second law) and energy (the first
law of thermodynamics) are the principal equations governing the dynamics of all fluid
flows. To simulate a fluid flow field, we need to solve these equations with proper implementation
of the physical boundary conditions. If dissipation, viscous transport phenomena, and
thermal conductivity are neglected in a fluid flow, the flow is considered to be non-viscous
or inviscid. The Swiss mathematician Leonhard Euler first derived the inviscid
fluid flow equations, known today as the Euler equations, in 1757. The Euler equations are the limiting
form of the Navier-Stokes equations as the Reynolds number goes to infinity. For many practical
aerodynamic applications, Euler flow is a relatively accurate representation of the flow field;
it includes both rotational and discontinuous (shock) phenomena in the flow, providing an
excellent approximation for lift, induced drag and wave drag. Also, a robust Euler solver is
an essential part of any Navier-Stokes solver.
The finite volume formulation of the unsteady 2D Euler equations for an arbitrary control
volume can be written in the following form of a volume and a surface integral (2.1), where
U is the solution vector in conservative variables, and F is the flux vector¹.

$$\frac{d}{dt}\int_{CV} U\, dV + \oint_{CS} F\, dA = 0 \qquad (2.1)$$

¹More formally, F should be referred to as a dyad.
where
$$U = \begin{bmatrix} \rho \\ \rho u \\ \rho v \\ E \end{bmatrix}, \qquad
F = \begin{bmatrix} \rho u_n \\ \rho u u_n + P n_x \\ \rho v u_n + P n_y \\ (E + P)\, u_n \end{bmatrix} \qquad (2.2)$$
In (2.2), $u_n = u n_x + v n_y$, and $[\rho\ \ \rho u\ \ \rho v\ \ E]^T$ are the densities of mass, x-momentum, y-momentum,
and energy, respectively. The energy is related to the pressure through the perfect
gas equation of state (i.e. $P = \rho R T$). The following equations (or definitions) correlate
thermodynamic properties such as energy and enthalpy with density, temperature, pressure,
and velocity in the Euler equations:
$$E = \frac{P}{\gamma - 1} + \rho\,\frac{u^2 + v^2}{2} \qquad (2.3)$$

$$\gamma = \frac{C_p}{C_v}, \qquad C_p = \frac{\gamma R}{\gamma - 1}, \qquad C_v = \frac{R}{\gamma - 1}, \qquad R = C_p - C_v \qquad (2.4)$$

$$e = C_v T, \qquad e_t = e + \tfrac{1}{2}\left(u^2 + v^2\right), \qquad E = \rho e_t, \qquad h = e + \frac{P}{\rho} \qquad (2.5)$$
In (2.3), E is the total energy and γ is the ratio of specific heats. In (2.4), R is the gas
constant, and $C_p$ and $C_v$ are the specific heats. Finally, in (2.5), e is the internal energy per unit
mass (specific energy), $e_t$ is the total energy per unit mass, and h is the specific enthalpy.
For a compressible flow, most of the flow properties are described as a function of the Mach
number, M, a non-dimensional form of the flow speed, defined as the ratio
of the velocity magnitude to the local sound speed, a.
$$M = \frac{\sqrt{u^2 + v^2}}{a}, \qquad a = \sqrt{\frac{\gamma P}{\rho}} = \sqrt{(\gamma - 1)\,h} = \sqrt{\gamma R T} \qquad (2.6)$$
Normally for aerodynamic applications, density is on the order of 1, velocity is on the order
of 10², and energy is on the order of 10⁵. As a result, the solution vector, U, and the flux
vector, F, in (2.2) have, in their dimensional form, quantities with very different scales. This
introduces tremendous potential for numerical errors, as well as unnecessary stiffness of
the resultant linearized system of equations (next section). To avoid that issue we need to
normalize the system of equations in such a way that all quantities become of order one.
This can be done by normalizing the quantities with respect to some properly chosen reference
values, such as free-stream values. While normalization does not change the form of the original
equations, it transforms them into the proper non-dimensional form. However, some of the
correlations between the properties may look different. As an example, part of this process
for the Euler equations is shown below:
Reference quantities:
$$\rho_{ref} = \rho_\infty, \qquad T_{ref} = T_\infty, \qquad P_{ref} = \gamma P_\infty, \qquad R_{ref} = R_\infty = \text{const.} \qquad (2.7)$$
Non-Dimensional quantities:
$$\rho_n = \frac{\rho}{\rho_{ref}}, \qquad T_n = \frac{T}{T_{ref}}, \qquad P_n = \frac{P}{P_{ref}}, \qquad R_n = \frac{R}{R_{ref}} = 1 \qquad (2.8)$$
It is not difficult to show that the speed of sound in its non-dimensional form is equal to
the square root of the normalized temperature, i.e. $a_n = \sqrt{T_n}$.
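This identity can be checked numerically. The sketch below uses an illustrative sea-level-like free-stream state (the numbers are not from the thesis) and verifies that, with the reference values of Eq. (2.7), the non-dimensional sound speed reduces to the square root of the normalized temperature:

```python
import math

# Sanity check of a_n = sqrt(T_n) under the normalization of Eqs. (2.7)-(2.8).
# Free-stream and local states are illustrative values only.
gamma, R = 1.4, 287.0            # ratio of specific heats; gas constant [J/(kg K)]
rho_inf, T_inf = 1.225, 288.15   # free-stream density and temperature
P_inf = rho_inf * R * T_inf      # perfect gas law: P = rho R T

# Reference values as in Eq. (2.7); note P_ref = gamma * P_inf.
rho_ref, T_ref, P_ref = rho_inf, T_inf, gamma * P_inf

# An arbitrary (hypothetical) local state
rho, T = 1.1 * rho_inf, 0.9 * T_inf
P = rho * R * T

# Non-dimensional quantities, Eq. (2.8)
rho_n, T_n, P_n = rho / rho_ref, T / T_ref, P / P_ref

# With this normalization the equation of state becomes P_n = rho_n * T_n / gamma,
# so a_n^2 = gamma * P_n / rho_n = T_n.
a_n = math.sqrt(gamma * P_n / rho_n)
print(abs(a_n - math.sqrt(T_n)))  # ~0, up to round-off
```

Note that the factor γ in $P_{ref}$ is exactly what cancels the γ in $a^2 = \gamma P/\rho$, which is the reason for that choice of reference pressure.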
In general, the mathematical character of the fluid flow equations depends on the flow
speed regime and changes with Mach number. For instance, the system of steady
Euler equations is elliptic in space for subsonic flow and hyperbolic in space for
supersonic flow. If the fluid flow equations are written in their unsteady form, they are
hyperbolic in time independent of the speed regime. As a result, it is relatively easy to
integrate the fluid flow equations in time rather than in space. The set of unsteady Euler equations
is a first-order nonlinear PDE system, hyperbolic in time, and therefore
properties propagate along characteristic lines. To obtain a steady-state solution, the time-marching
process is continued until all time derivatives satisfy some tolerance criteria.
2.2 Implicit Time-Advance
Assuming the discretized physical domain does not change in time, U can be brought out
from the integral in Eq. (2.1), as the average solution vector of the control volume:
$$\frac{dU_i}{dt} = -\frac{1}{A_{CV_i}} \oint_{CS_i} F\, dA \qquad (2.9)$$
The left-hand side of (2.9) is the first time derivative of the average solution vector in each
control volume, and the right-hand side represents the spatial discretization for the same
control volume. The right-hand side of (2.9) is called the flux integral, or residual, of the control
volume; it is a nonlinear function of the solution vector and can be rewritten in the form
of (2.10) for control volume i:

$$\frac{dU_i}{dt} = -R(U_i) \qquad (2.10)$$
To form an implicit expression for (2.10), we need to evaluate both sides of the equation at
time level "n+1". This can be done by backward time differencing (backward Euler) of
the left-hand side and residual linearization in time of the right-hand side.
$$\frac{U_i^{n+1} - U_i^n}{\Delta t} = -R\!\left(U_i^{n+1}\right)
= -\left[ R\!\left(U_i^n\right) + \frac{\partial R}{\partial U}\left(U_i^{n+1} - U_i^n\right)
+ O\!\left(\left(U_i^{n+1} - U_i^n\right)^2\right) \right] \qquad (2.11)$$
or
$$\left( \frac{I}{\Delta t} + \frac{\partial R}{\partial U} \right) \delta U_i = -R\!\left(U_i^n\right), \qquad U_i^{n+1} = U_i^n + \delta U_i \qquad (2.12)$$
where $\partial R / \partial U$ is the Jacobian matrix resulting from the residual linearization. Equation (2.12)
is a large linear system of equations which must be solved at each time step to obtain
an update for the vector of unknowns. With this approach the accuracy in time is first
order; however, as we are only interested in the steady-state solution, time accuracy is not our
concern, and advancing the solution in time continues until the residual of the nonlinear
problem practically converges to zero.
In Eq. (2.12), the Jacobian matrix, $\partial R / \partial U$, is the essential element of the implicit formulation;
setting the Jacobian equal to zero is equivalent to explicit time advance. Note that in the case
of the Euler flux, Jacobian calculation is not a trivial task and takes a considerable amount of
effort. The degree of difficulty of the Jacobian evaluation depends on the type of flux
formula employed and its complexity. The details of Jacobian evaluation will be discussed in
Chapter 4. Once the Jacobian is computed, the other part of an implicit solver is solving, at each
time iteration, the resultant large linear system, which is often very ill-conditioned and far from
diagonally dominant. Due to the size and sparsity structure (graph) of the matrix, subspace iterative
methods are the most appropriate choice for solving such a system (Chapter 5).
Coding an implicit formulation is much more complicated than a similar explicit formulation,
and efficient parallelization of implicit solvers is a relatively difficult process. However,
implicit methods have a critical advantage over explicit methods: stability and fast
convergence. Implicit methods do not suffer from the restrictive stability condition of explicit
methods. In theory they are unconditionally stable, and by taking large time steps the steady-state
solution is reached rapidly. If we take an infinite time step, Eq. (2.12) becomes the
Newton iteration (2.13). Newton's method generally converges quadratically in the vicinity
of the solution. However, near singularities the convergence rate may degrade from quadratic
to linear [14].
$$\left( \frac{\partial R}{\partial U} \right) \delta U_i = -R\!\left(U_i^n\right), \qquad U_i^{n+1} = U_i^n + \delta U_i \qquad (2.13)$$
In practice, taking too large a time step is not beneficial when the solution process is started
from a poor initial guess. Because Euler flow is highly nonlinear, the linearization of the
flow equations is not an accurate representation of the original nonlinear problem at an early
stage of the iterations. Therefore any update based on a large time step (such as a Newton
update) would be inaccurate, often causing instability or a stall in convergence. Normally, a
start-up process is needed to advance the solution from an initial guess to a solution
state good enough for taking large time steps and eventually switching to Newton iteration.
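The interplay between the pseudo-time term of (2.12) and the Newton limit (2.13) can be seen in a scalar analogue. The model residual, step size, and switching tolerance below are illustrative choices, not the thesis's settings; the real solver works on a large sparse system rather than a single equation:

```python
import math

# Scalar analogue of the two-phase strategy: implicit pseudo-time steps
# (Eq. 2.12) during start-up, then an infinite time step, i.e. Newton's
# method (Eq. 2.13), once the linearization is trustworthy.
def R(u):
    return math.exp(u) - 2.0      # model "residual", steady state u* = ln 2

def dRdU(u):
    return math.exp(u)            # exact Jacobian of the model residual

u, dt = 5.0, 0.5                  # deliberately poor initial guess
for it in range(100):
    # keep the 1/dt diagonal term while far from the solution;
    # drop it (infinite time step -> Newton) once the residual is small
    inv_dt = 1.0 / dt if abs(R(u)) > 1e-2 else 0.0
    u += -R(u) / (inv_dt + dRdU(u))   # solve the 1x1 linear system (2.12)
    if abs(R(u)) < 1e-13:
        break
# start-up contracts the residual roughly linearly; the Newton phase
# then finishes in a handful of quadratically convergent iterations
```

The same structure appears in the solver: a finite Δt keeps the system diagonally dominant while the guess is poor, and removing it recovers Newton's quadratic convergence near the solution.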
One strategy for this start-up process is defect correction, in which the flux integral on the
right-hand side is computed to the desired order of accuracy while on the left-hand side the
linearization is based on a lower-order discretization (normally first-order):
$$\left( \frac{I}{\Delta t} + \left( \frac{\partial R}{\partial U} \right)_{low} \right) \delta U_i = -R_{high}\!\left(U_i^n\right), \qquad U_i^{n+1} = U_i^n + \delta U_i \qquad (2.14)$$
With this approach, the higher-order Jacobian computation, which is very expensive and
inaccurate in the early stages of the iterations, is avoided. As the linear system based on the lower-order
discretization has a simpler structure in terms of matrix bandwidth and is better conditioned,
solving the resultant system is easier and more stable, especially for stiff Jacobians.
The defect correction linear system is relatively inexpensive to form, and it can be preconditioned
effectively by the same matrix. In addition, early high-frequency oscillations are
damped efficiently when a moderate time step is used, which is necessary anyhow for the
start-up process.
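The essential trade-off of defect correction shows up even in a scalar model: with an exact ("higher-order") residual on the right and a deliberately approximate ("lower-order") Jacobian on the left, the iteration still converges to the correct steady state, but at a linear rather than quadratic rate. The residual and the 50% Jacobian error below are purely illustrative:

```python
# Scalar analogue of the defect-correction update (Eq. 2.14): the residual
# is evaluated exactly, the left-hand-side Jacobian only approximately.
def R_high(u):
    return u**3 - 8.0              # model residual, steady state u* = 2

def J_low(u):
    return 0.5 * 3.0 * u * u       # deliberately inaccurate Jacobian (50% of exact)

u, dt = 3.0, 1.0
for _ in range(200):
    du = -R_high(u) / (1.0 / dt + J_low(u))   # solve the 1x1 "linear system"
    u += du
    if abs(R_high(u)) < 1e-10:
        break
# converges to u* = 2, but only linearly: the mismatched Jacobian caps
# the contraction factor per iteration
```

Because the right-hand side is exact, the converged answer is still the higher-order steady state; only the path to it is slowed, which is acceptable for a start-up phase.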
Another start-up technique is to use Eq. (2.12) with the higher-order Jacobian but take
small time steps at the early stage of the solution process. With this approach, ∆t is small, so the linear
system is diagonally dominant, or at least better conditioned, and consequently easier
to solve. The drawback is that for complicated flows, such as transonic flow, the higher-order
Jacobian is very ill-conditioned, and to be able to solve the linear system we must
choose a very small ∆t, which may not accelerate the convergence efficiently.
In practice one may combine the above strategies with mesh sequencing: starting the solution
process on coarser meshes and then transferring it to the final fine mesh. A first-order solution
can also be used as an initial guess, either independently or in combination with other
strategies. In general, it is hard to say which strategy or combination will work best for the
start-up process, as it depends on the problem and the type of linear solver (Chapter 5).
2.3 Upwind Flux Formulation
To compute the control volume residual or flux integral, we need to evaluate the numerical
flux at Gauss points located on the control volume boundaries and integrate along the boundary
edges. In other words, residual or flux integral computation includes three steps:
1. Reconstructing the solution vector V (in primitive variables) over each control volume up
to the desired accuracy (next chapter).
2. Computing the numerical flux vector at the control volume interfaces based on the
reconstructed V.
3. Integrating the numerical flux along the control volume interfaces using a Gauss quadrature
rule.
To describe an upwind scheme, it is helpful to start from the first-order linear wave equation,
Eq. (2.15), where λ is the wave speed. For a positive value of λ, this represents the propagation
of a wave from left to right (the positive direction) along the x axis, as shown
in Fig. (2.1).

$$\frac{\partial u}{\partial t} + \lambda \frac{\partial u}{\partial x} = 0 \qquad (2.15)$$
Figure 2.1: Propagation of a linear wave in the positive direction
Based on the physics and the model equation, it is evident that the information in the field
propagates with the wave speed and in the wave direction. Therefore, the properties at
point i are influenced by the propagated properties originating from point i−1, and the properties
at point i+1 will not physically affect the properties at point i. As a result it makes sense
that, in the discretization of the wave equation, $\partial u / \partial x$ is replaced by $(u_i - u_{i-1})/\Delta x$ or by any other type of
backward differencing. It is well understood that using central or forward differencing would
result in non-physical and/or oscillatory behavior, and typically instability in the solution. In
other words, any numerical scheme used in the discretization of the fluid flow equations should
respect the physical characteristics of the fluid flow, i.e. the velocity and direction of the
propagated information throughout the flow field.
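The backward-differencing idea can be demonstrated directly on Eq. (2.15). The sketch below advects a step profile with the first-order upwind scheme; the mesh, time step, and profile are illustrative. Note that the update keeps the solution monotone and merely smears the front with numerical dissipation, exactly the behavior that central or forward differencing would destroy:

```python
# First-order upwind discretization of u_t + lam * u_x = 0 (Eq. 2.15), lam > 0:
# the backward difference follows the direction of wave propagation.
lam, dx, dt = 1.0, 0.1, 0.05                       # wave speed, spacing, step (CFL = 0.5)
u = [1.0 if i < 10 else 0.0 for i in range(40)]    # right-moving step profile

for _ in range(20):                                # wave travels 20*lam*dt = 1.0 = 10 cells
    un = u[:]
    for i in range(1, len(u)):
        u[i] = un[i] - lam * dt / dx * (un[i] - un[i - 1])  # backward (upwind) difference
# the step has advected ~10 cells to the right, smeared but free of oscillations
```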
The unsteady Euler equations, which are hyperbolic in time, behave somewhat similarly to
the wave equation, although they are much more complex, nonlinear, and in vector
form. In Euler (inviscid compressible) flow, information travels along characteristic lines,
and the slopes of these lines are the eigenvalues of the Jacobian matrix, $\partial F / \partial U$, the
derivative of the Euler flux vector with respect to the solution vector, Eq. (2.16).
$$\frac{\partial F}{\partial U} =
\begin{bmatrix}
0 & n_x & n_y & 0 \\
(\gamma - 1)e_k n_x - u u_n & u_n - (\gamma - 2)u n_x & u n_y - (\gamma - 1)v n_x & (\gamma - 1)n_x \\
(\gamma - 1)e_k n_y - v u_n & v n_x - (\gamma - 1)u n_y & u_n - (\gamma - 2)v n_y & (\gamma - 1)n_y \\
\left((\gamma - 1)e_k - h_t\right)u_n & h_t n_x - (\gamma - 1)u u_n & h_t n_y - (\gamma - 1)v u_n & \gamma u_n
\end{bmatrix} \qquad (2.16)$$

$$e_t = e + e_k, \qquad e_k = \tfrac{1}{2}\left(u^2 + v^2\right), \qquad h_t = h + e_k$$

$$\lambda_i = \begin{bmatrix} u_n & u_n & u_n + a & u_n - a \end{bmatrix} \qquad (2.17)$$
These eigenvalues, Eq. (2.17), describe the direction and speed of disturbance (information)
propagation in the flow field. Obviously, the upwinding direction is determined by the
direction of these characteristic lines. Several categories of upwind methods, such as Flux
Vector Splitting (FVS), Flux Difference Splitting (FDS), and the Advection Upstream Splitting
Method (AUSM), have been developed during the last three decades for flux evaluation and
have been successfully employed in compressible flow computation. Hirsch [35] and Laney
[41] review upwind schemes in detail.
2.3.1 The Godunov Approach
S.K. Godunov in 1959 introduced an elegant idea for numerical flux calculation [35]. He
suggested using the locally exact Euler solution at the boundary interfaces of the domain
cells. The solution in each cell is considered to be piecewise constant, and the cell interface
separates two different solution states, i.e. $U_L$ and $U_R$ (Fig. (2.2)). Knowing the initial
solution, the evolution of the flow to the next time step can be calculated exactly through the
wave interactions originating at the boundaries between two adjacent cells. This leads to
solving the shock tube, or Riemann, problem. The Riemann problem has an exact solution
composed of a shock wave, a contact discontinuity, and an expansion fan separating regions
with constant solution states. Propagating these waves over a time interval
∆t evolves those regions and their states from the solution at $t = t_1$ to that at $t = t_1 + \Delta t$.
All waves propagate consistently with the upwind directions, making the resultant solution
depend only on the physical zone of dependence. As the shock tube problem requires solving
a nonlinear algebraic equation, which is quite time consuming, implementing an exact
Figure 2.2: Shock-tube problem ($P_L > P_R$): expansion waves, contact surface, and shock wave separating the constant states at $t = t_1 + \Delta t$
Riemann solver for upwind flux evaluation would not be efficient². Therefore, different
approximate Riemann solvers have been developed to reduce the computational cost through
some averaging procedure. Roe's approximate Riemann solver, introduced in the early
1980s [69], is one of the most successful and most commonly employed approximate Riemann
solvers in modern CFD, and it is the one used in this research.
2.3.2 Roe’s Flux Difference Splitting Scheme
Roe's approximate Riemann solver [70] is based on a characteristic decomposition of the
flux differences and is a clever extension of the linear wave decomposition to the nonlinear
shock tube problem while keeping the conservation properties of the scheme. The details of Roe's
scheme are quite involved [41], and for brevity only the flux formulation is presented here.
Roe's flux formula at the interface of a control volume is equal to the average of the fluxes of the
left and right states minus a differencing term which splits the difference of the fluxes on the
two sides of the control volume:

$$F(U_L, U_R) = \frac{1}{2}\left[ F(U_L) + F(U_R) \right] - \frac{1}{2}\left|\tilde{A}\right|\left(U_R - U_L\right) \qquad (2.18)$$

²There is some evidence suggesting that an exact Riemann solver for polytropic gases can be competitive with most approximate techniques [31].
$\tilde{A}$ is the Jacobian matrix evaluated from Roe's average properties, and $|\tilde{A}|$ is written
in practice in diagonalized form as:

$$\left|\tilde{A}\right| = X^{-1}\left|\tilde{\Lambda}\right| X, \qquad \left|\tilde{\Lambda}\right| = \mathrm{diag}\left(|\lambda_i|\right) \qquad (2.19)$$

where X are the right eigenvectors and Λ are the eigenvalues of the Jacobian matrix [71].
Roe's average properties are given by [69]:

$$\tilde{\rho} = \sqrt{\rho_L \rho_R}, \qquad
\tilde{u} = \frac{\sqrt{\rho_L}\, u_L + \sqrt{\rho_R}\, u_R}{\sqrt{\rho_L} + \sqrt{\rho_R}}, \qquad
\tilde{v} = \frac{\sqrt{\rho_L}\, v_L + \sqrt{\rho_R}\, v_R}{\sqrt{\rho_L} + \sqrt{\rho_R}}, \qquad
\tilde{h}_t = \frac{\sqrt{\rho_L}\, h_{t_L} + \sqrt{\rho_R}\, h_{t_R}}{\sqrt{\rho_L} + \sqrt{\rho_R}} \qquad (2.20)$$
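The square-root-of-density weighting of Eq. (2.20) can be sketched directly. The helper name `roe_average` and the sample states are illustrative; the Roe-averaged sound speed is recovered from $a^2 = (\gamma - 1)(h_t - e_k)$, which follows from Eqs. (2.5)-(2.6):

```python
import math

# Roe-averaged state from left/right primitive states, Eq. (2.20).
gamma = 1.4

def roe_average(rhoL, uL, vL, PL, rhoR, uR, vR, PR):
    # total specific enthalpy: h_t = gamma/(gamma-1) * P/rho + (u^2 + v^2)/2
    htL = gamma / (gamma - 1.0) * PL / rhoL + 0.5 * (uL * uL + vL * vL)
    htR = gamma / (gamma - 1.0) * PR / rhoR + 0.5 * (uR * uR + vR * vR)
    sL, sR = math.sqrt(rhoL), math.sqrt(rhoR)
    rho = sL * sR                               # geometric-mean density
    u = (sL * uL + sR * uR) / (sL + sR)         # sqrt(rho)-weighted velocity
    v = (sL * vL + sR * vR) / (sL + sR)
    ht = (sL * htL + sR * htR) / (sL + sR)      # sqrt(rho)-weighted enthalpy
    a = math.sqrt((gamma - 1.0) * (ht - 0.5 * (u * u + v * v)))
    return rho, u, v, ht, a

# identical left/right states must reproduce themselves
r = roe_average(1.0, 0.5, 0.0, 1.0, 1.0, 0.5, 0.0, 1.0)
```

Consistency (identical states map to themselves) is one of the properties Roe built the average around; the weighting also makes the linearized Jacobian exactly satisfy $F(U_R) - F(U_L) = \tilde{A}(U_R - U_L)$.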
Although Roe's scheme is one of the most accurate flux functions available, it is quite an
expensive method for flux computation: due to its matrix differencing term it requires
O(n²) operations per control volume in each iteration (n is the number of equations). If a real
gas is used, the differencing term must be re-derived for the new equation
of state, which can introduce some difficulty in the derivation of the differencing matrix. At the
same time, applying Roe's flux formula in an implicit context requires differentiation of the
differencing matrix, which has proven to be very tricky and expensive even for a perfect gas
(Chapter 4). Finally, some empirical cure needs to be implemented in Roe's formulation
for the cases where the Mach number approaches one (sonic speed) or
zero (stagnation point). In either case, we are dealing with eigenvalues equal to zero, and
the method cannot distinguish the correct upwind direction, resulting in either a non-physical
solution (entropy reduction) or a lack of convergence due to the non-differentiable flux. Adding
a small value ε to the eigenvalues, or rounding the slope of the characteristic lines near zero, which
is the approach taken in this thesis, Eq. (2.21), resolves that issue (Fig. (2.3)).
$$\text{if } |\lambda_i| < C_{cutoff}: \qquad \lambda_i^c = \left( \lambda_i^2 + C_2 \right) C_1 \qquad (2.21)$$
Figure 2.3: Rounding the characteristic slope near zero (eigenvalue correction with cut-off = 0.1)
$$C_1 = \frac{0.5}{C_{cutoff}}, \qquad C_2 = C_{cutoff}^2, \qquad C_{cutoff} \approx 0.1 \ \text{(typically)}$$
2.4 Boundary Flux Treatment
Implementing correct boundary conditions is a necessary condition for numerical fluid flow
computation. Generally speaking, any mistake in the boundary condition treatment
results in convergence problems and/or an inaccurate solution. This is especially
true if the imposed boundary condition does not match the flow physics. For an implicit
solver it is also essential to formulate the boundary treatment implicitly, both for the main
solver and for the preconditioner (Chapter 4); otherwise the performance of the implicit solver is
degraded severely, in addition to introducing a restrictive CFL limitation. Normally boundary
conditions are categorized as wall and inlet/outlet boundary conditions, which
are discussed in the following subsections.
2.4.1 Wall Boundary Condition
In steady-state inviscid flow, there is a slip velocity condition: the velocity normal to the wall
surface is zero (i.e. the velocity at the boundary is parallel to the surface). As a result, the
convective flux normal to the surface is zero (flow does not cross the wall), and the pressure
flux in the momentum equations is the only remaining flux. Both the continuity and energy
fluxes are zero at steady state. The easiest way to implement the wall boundary flux in Euler
flow is to set $u_n = 0$ in the flux formula. There are alternative wall treatments,
such as reflective boundary conditions using ghost cells, or reconstruction of the flow properties
(Chapter 3) with the constraint $u_n = 0$. All of these treatments produce more or less the
same result, as in all cases the convective flux is forced to be zero. In the current research, the
normal velocity is set to zero using a proper constraint in the reconstruction, and the pressure
is reconstructed up to the desired order of accuracy at the wall.
2.4.2 Inlet/Outlet Boundary Conditions
As mentioned in section 2.3, the numerical modeling should be consistent with the physical
characteristics of the flow field, with information propagating along characteristic
lines (according to the eigenvalues of the Jacobian matrix). The inflow and outflow boundary
condition implementations are completely different for subsonic and supersonic flows; therefore
the first task is to determine the type of boundary condition. That can easily be done
by computing the flow quantities (primitive variables) at the boundary using reconstructed
values and working out the Mach number and flow direction with respect to the boundary
surface. Knowing the type of the boundary, the proper boundary treatment can then be
carried out.
For a subsonic inlet, three eigenvalues are positive, meaning that three characteristic lines
transfer inlet information from the inflow toward the solution domain. The remaining
eigenvalue is negative at a subsonic inlet, i.e., one set of information travels from the solution
domain towards the inlet. In this situation three parameters, in our case total pressure,
total temperature, and angle of attack, are set at the subsonic inlet, and the static pressure is
taken from the interior. The following formulas show the details of computing the velocity
components and density at a subsonic inlet.
$$T_t = 1 + \frac{\gamma - 1}{2} M_\infty^2, \qquad T_\infty = 1.0 \qquad (2.22)$$

$$P_t = \frac{1}{\gamma}\left( 1 + \frac{\gamma - 1}{2} M_\infty^2 \right)^{\frac{\gamma}{\gamma - 1}}, \qquad P_\infty = \frac{1.0}{\gamma} \qquad (2.23)$$
$$T_b = T_t \left( \frac{P_{RB}}{P_t} \right)^{\frac{\gamma - 1}{\gamma}} \qquad (2.24)$$

$P_b = P_{RB}$: reconstructed pressure at the boundary

$$V_{m_b} = \sqrt{ \frac{2}{\gamma - 1} \left( \frac{T_t}{T_b} - 1 \right) } \qquad (2.25)$$

$$u_b = V_m \cos(\alpha), \qquad v_b = V_m \sin(\alpha) \qquad (2.26)$$

α: angle of attack

$$\rho_b = \frac{\gamma P_b}{T_b} \qquad (2.27)$$
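The subsonic-inlet recipe of Eqs. (2.22)-(2.27) can be transcribed step by step. In the sketch below, the free-stream Mach number, flow angle, and the reconstructed boundary pressure `P_rb` are illustrative inputs (in the solver, `P_rb` comes from the higher-order reconstruction of the interior solution); the non-dimensionalization with $T_\infty = 1$ and $P_\infty = 1/\gamma$ follows the thesis:

```python
import math

# Subsonic-inlet state from Eqs. (2.22)-(2.27), in the thesis's
# non-dimensional variables (T_inf = 1, P_inf = 1/gamma).
gamma, M_inf, alpha = 1.4, 0.5, math.radians(2.0)   # illustrative inputs

Tt = 1.0 + 0.5 * (gamma - 1.0) * M_inf**2                                    # (2.22)
Pt = (1.0 / gamma) * (1.0 + 0.5 * (gamma - 1.0) * M_inf**2) ** (gamma / (gamma - 1.0))  # (2.23)

P_rb = 0.99 / gamma                  # hypothetical reconstructed static pressure
Tb = Tt * (P_rb / Pt) ** ((gamma - 1.0) / gamma)                             # (2.24)
Vm = math.sqrt(2.0 / (gamma - 1.0) * (Tt / Tb - 1.0))                        # (2.25)
ub, vb = Vm * math.cos(alpha), Vm * math.sin(alpha)                          # (2.26)
rho_b = gamma * P_rb / Tb                                                    # (2.27)
```

Only one quantity (the static pressure) is taken from the interior, matching the single negative eigenvalue at a subsonic inlet; the other three conditions are imposed through the fixed total state and flow angle.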
At a supersonic inlet all eigenvalues are positive, which means all boundary fluxes are computed
from the inlet properties ($a_\infty = 1$):

$$P_b = P_{in} = P_\infty, \qquad \rho_b = \rho_{in} = \rho_\infty \qquad (2.28)$$

$$u_b = u_{in} = M_\infty \cos(\alpha), \qquad v_b = v_{in} = M_\infty \sin(\alpha) \qquad (2.29)$$
At a subsonic outlet, three eigenvalues are negative and one is positive (the normal is toward
the solution domain); therefore three pieces of information are taken from the solution
domain (through reconstruction) and only the static pressure is fixed at the outlet as the
back pressure.

$$\rho_{out} = \rho_{RB}, \qquad u_{out} = u_{RB}, \qquad v_{out} = v_{RB}, \qquad P_{out} = P_{BP} \qquad (2.30)$$

$P_{BP} = P_\infty$ for external flow
Figure 2.4: Schematic illustration of Gauss quadrature points (2nd-order: one Gauss point per face; 3rd- and 4th-order: two Gauss points per face)
In the case of supersonic outflow, since all eigenvalues are negative, all the information
comes from the solution domain towards the boundary, and no condition is set at the
supersonic outlet.

$$\rho_{out} = \rho_{RB}, \qquad u_{out} = u_{RB}, \qquad v_{out} = v_{RB}, \qquad P_{out} = P_{RB} \qquad (2.31)$$
In all cases where implementing the boundary condition requires properties to be taken from
the solution domain, the higher-order reconstruction of those properties has been used to
ensure the accuracy of the boundary condition treatment.
2.5 Flux Integration
To compute the residual for each control volume, the numerical fluxes should be integrated over
the control volume edges. The flux residual is the summation of these flux integrals. The accuracy
of the flux integration should be equal to or higher than the accuracy of the flow reconstruction
in order to evaluate the residuals with higher-order accuracy. This is achieved by Gauss
quadrature integration. For the 2nd-order, or linear reconstruction, case one quadrature
point per face is used. For the 3rd-order (quadratic reconstruction) and 4th-order (cubic
reconstruction) cases, we use two quadrature points per face with the proper weights.
In the case of a curved boundary, the quadrature points are located on the curved arc
segment instead of the straight edge for higher-order flux integration. This is due to the
fact that mistaking a curved segment for a straight edge introduces a 2nd-order error
(i.e. O(δS²), where S is the arc length) in both computing and integrating the fluxes at curved
boundaries. Fig. (2.4) shows the quadrature points for the flux integration schematically.
More information regarding the locations and weights of the Gauss quadrature integration
points is provided in Ref. [80].
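The two-point rule used for the 3rd- and 4th-order cases can be sketched for a straight edge. The function below places the Gauss points at the standard $\pm 1/\sqrt{3}$ locations mapped onto the edge; it is exact for integrands up to cubic along the edge, which is what the 4th-order flux integration requires (the helper name is ours):

```python
import math

# Two-point Gauss quadrature of a scalar integrand along a straight edge,
# exact for polynomials up to degree 3 along the edge.
def edge_integral(f, p0, p1):
    (x0, y0), (x1, y1) = p0, p1
    length = math.hypot(x1 - x0, y1 - y0)
    s = 0.5 / math.sqrt(3.0)               # +/- 1/sqrt(3) mapped to [0, 1]
    total = 0.0
    for t in (0.5 - s, 0.5 + s):           # parametric Gauss-point locations
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        total += 0.5 * f(x, y)             # weight 1/2 per point on [0, 1]
    return total * length                  # scale by edge length (arc-length Jacobian)

# a cubic is integrated exactly: int_0^1 x^3 dx = 1/4
print(edge_integral(lambda x, y: x**3, (0.0, 0.0), (1.0, 0.0)))  # 0.25 up to round-off
```

In the solver, `f` would be one component of the numerical flux evaluated from the reconstructed states on each side of the face; for curved boundary faces the two points are placed on the arc rather than the chord, as discussed above.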
Chapter 3
Reconstruction Procedure and
Monotonicity Enforcement
3.1 Introduction
To compute a higher-order solution, the numerical fluxes should be evaluated with higher-order
accuracy, which involves two separate steps: first, calculating all flow variables anywhere inside
a control volume up to the desired order of accuracy; second, higher-order integration of
the evaluated fluxes over the boundary edges using the proper number of Gauss points. The
latter is relatively easy, but higher-order approximation of the flow variables inside a control
volume is neither unique nor trivial. This task can be done by implementing higher-order
upwind differencing and/or a higher-order representation of the solution inside a control volume
(i.e. solution reconstruction) using polynomials of degree one or higher [82]. In
structured discretization, this led to the development of MUSCL (Monotone Upstream-centered
Schemes for Conservation Laws) [35]. For structured meshes, higher-order reconstruction
of a solution is much easier due to mesh regularity. However, in the case of
unstructured meshes a more complicated procedure needs to be adopted to represent the
solution with higher-order accuracy within a control volume.
Barth [13] laid out a straightforward procedure for gradient estimation on unstructured
tessellations using the Green-Gauss gradient technique. Because of its simplicity and robustness,
the linear Green-Gauss reconstruction has been employed extensively in finite volume solvers
since then. Barth also developed a minimum energy (i.e. least-squares) reconstruction
procedure capable of computing the flow variables up to (K+1)th-order accuracy using
a Kth-order polynomial, known as K-exact reconstruction [10, 12, 11]. The least-squares
reconstruction is relatively more difficult to implement, but in general the linear least-squares
reconstruction is more accurate, especially in the case of distorted meshes and/or in the
presence of a limiter [4]. That is expected, since the least-squares technique is not as sensitive
as the Green-Gauss method to noise in the solution. Delanaye [22] compared the
accuracy of the linear and quadratic reconstruction techniques for a smooth flow (Ringleb's
flow) on unstructured meshes. The linear least-squares reconstruction performed better
than the linear Green-Gauss reconstruction both in terms of accuracy and in preserving the
total pressure (i.e. less entropy production). The quadratic Green-Gauss and least-squares
reconstructions behaved very similarly, although this time the total pressure loss was a bit
higher for the least-squares method.
In this research a K-exact least-squares reconstruction [60, 56] is employed, which is described
in the next section. In section 3.3 the accuracy assessment methodology is studied.
Finally, section 3.4 discusses monotonicity enforcement and introduces the limiting
approach taken to address that issue.
3.2 K-Exact Least-Squares Reconstruction
The goal here is to reconstruct a solution polynomial from the control volume data
such that the truncation error of the solution in each control volume remains of
order (∆x)^K, where ∆x is the local mesh length scale. Obviously, certain physical and
mathematical criteria need to be met to reconstruct such a polynomial for each control
volume. These criteria are: 1- Conservation of the Mean, 2- K-Exact Reconstruction, 3-
Compact Support, which are introduced briefly in the next three subsections. In the last two
subsections, the implementation of boundary conditions at Gauss points using boundary
constraints, and the solution of the least-squares system, are discussed.
3.2.1 Conservation of the Mean
The finite volume approach provides us with piecewise constant data in each control volume,
equal to the integral control volume average. This integral average is updated from
one iteration to the next and is the most meaningful measure of the physical quantities in
each control volume. A Kth-order piecewise polynomial $U_R^{(K)}(x, y)$ can be reconstructed in
such a way that the integral average of $U_R^{(K)}(x, y)$ over the control volume is equal to the control
volume average, Eq. (3.1). This very important property is called Conservation of the
Mean.

$$\frac{1}{A_{CV}}\int_{CV} U_R^{(K)}(x, y)\, dA = \overline{U}_{CV} \qquad (3.1)$$

$$U_R^{(K)}(x, y) = \sum_{m+n \le K} \alpha_{(m,n)}\, P_{(m,n)}(x - x_c,\, y - y_c) \qquad (3.2)$$

$$P_{(m,n)}(x - x_c,\, y - y_c) = (x - x_c)^m (y - y_c)^n \qquad (3.3)$$

$(x_c, y_c)$ are the coordinates of the reference point for the control volume, which in our case
(cell-centered formulation) is the cell centroid, Fig. (3.1).
In general, the $\alpha_{(m,n)}$ are the derivatives of the continuous function represented by the reconstructed
solution polynomial $U_R^{(K)}(x, y)$ with respect to the independent geometric variables
(i.e. x and y):
$$U_R^{(K)}(x, y) = U(x_c, y_c)
+ \left.\frac{\partial U}{\partial x}\right|_C \Delta x
+ \left.\frac{\partial U}{\partial y}\right|_C \Delta y
+ \left.\frac{\partial^2 U}{\partial x^2}\right|_C \frac{\Delta x^2}{2}
+ \left.\frac{\partial^2 U}{\partial x \partial y}\right|_C \Delta x \Delta y
+ \left.\frac{\partial^2 U}{\partial y^2}\right|_C \frac{\Delta y^2}{2}
+ \left.\frac{\partial^3 U}{\partial x^3}\right|_C \frac{\Delta x^3}{6}
+ \left.\frac{\partial^3 U}{\partial x^2 \partial y}\right|_C \frac{\Delta x^2 \Delta y}{2}
+ \left.\frac{\partial^3 U}{\partial x \partial y^2}\right|_C \frac{\Delta x \Delta y^2}{2}
+ \left.\frac{\partial^3 U}{\partial y^3}\right|_C \frac{\Delta y^3}{6}
+ \ldots \qquad (3.4)$$
where

$$\Delta x = x - x_c, \qquad \Delta y = y - y_c \qquad (3.5)$$
Imposing the conservation of the mean condition, or mean constraint, is accomplished by
integrating $U_R^{(K)}(x, y)$ over control volume i and setting the result equal to the control
volume average, i.e. Eq. (3.1). For simplicity, the derivative terms in the following
equations are written in subscript notation, e.g. $U_{xy} \equiv \partial^2 U / \partial x \partial y$.
Figure 3.1: A typical cell-centered control volume and its reconstruction stencil, including three layers of neighbors
Carrying out this integration and introducing the control volume moment definitions gives Eq. (3.6):

$$\overline{U}_i = U_C
+ \frac{1}{A_i} \left\langle \left. U_x \right|_C M_x + \left. U_y \right|_C M_y \right\rangle_i
+ \frac{1}{A_i} \left\langle \tfrac{1}{2} \left. U_{xx} \right|_C M_{x^2} + \left. U_{xy} \right|_C M_{xy} + \tfrac{1}{2} \left. U_{yy} \right|_C M_{y^2} \right\rangle_i
+ \frac{1}{A_i} \left\langle \tfrac{1}{6} \left. U_{xxx} \right|_C M_{x^3} + \tfrac{1}{2} \left. U_{xxy} \right|_C M_{x^2 y} + \tfrac{1}{2} \left. U_{xyy} \right|_C M_{x y^2} + \tfrac{1}{6} \left. U_{yyy} \right|_C M_{y^3} \right\rangle_i
+ \ldots \qquad (3.6)$$

where

$$\left. M_{x^m y^n} \right|_i = \int_{CV_i} (x - x_c)^m (y - y_c)^n\, dA \qquad (3.7)$$
These moments can be computed up to the desired order of accuracy by Gauss's theorem.
Notice that setting m and n in Eq. (3.7) to zero gives the area of the control volume.
The mean constraint just introduced forms the first row (equation) of the least-squares system.
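The Gauss-theorem evaluation of the moments can be sketched for a polygonal control volume: the area integral of Eq. (3.7) is rewritten as a boundary integral and each straight edge is handled with two-point Gauss quadrature, which is exact for the low moments shown. The helper name and the choice of reference point are ours:

```python
import math

# Area moments M_{x^m y^n} of Eq. (3.7) for a counter-clockwise polygon,
# about a reference point (xc, yc), via Green's theorem:
#   int (x-xc)^m (y-yc)^n dA = oint (x-xc)^(m+1) (y-yc)^n / (m+1) dy
def moment(verts, m, n, xc=0.0, yc=0.0):
    g = 0.5 / math.sqrt(3.0)                     # 2-point Gauss offsets on [0, 1]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(verts, verts[1:] + verts[:1]):
        dy = y1 - y0
        for t in (0.5 - g, 0.5 + g):
            x = x0 + t * (x1 - x0) - xc
            y = y0 + t * (y1 - y0) - yc
            total += 0.5 * (x ** (m + 1)) * (y ** n) / (m + 1) * dy
    return total

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # CCW unit square
print(moment(square, 0, 0))   # area = 1
print(moment(square, 1, 0))   # M_x about the origin = 1/2
```

In the solver, the moments are taken about the cell centroid, so $M_x = M_y = 0$ and only the second and higher moments enter Eq. (3.6); curved boundary cells need the quadrature points placed on the arc, as in the flux integration.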
3.2.2 K-Exact Reconstruction
Consider a problem which is smooth and has an exact solution of the form $U_E(x, y)$.
If $\langle U_R^{(K)}(x, y) \rangle_i$ is a Kth-order polynomial reconstructing an approximate solution for the
problem over control volume i, then the error in the solution reconstruction for control
volume i is of order $O(\Delta x_i)^{K+1}$, Eq. (3.8).

$$\left\langle U_R^{(K)}(x, y) \right\rangle_i = U_E(x, y) + O(\Delta x_i)^{K+1} \qquad (3.8)$$

In other words, $\langle U_R^{(K)}(x, y) \rangle_i$ reconstructs the solution exactly if the exact solution is a Kth-order
polynomial, i.e. $U_E(x, y) = P_E^K(x, y)$. This property is called K-Exact Reconstruction.
It can also be expressed in integral form:

$$\left. \text{Error} \right|_{CV_i} = \frac{1}{A_i} \left[ \int_{CV_i} U_E(x, y)\, dA - \int_{CV_i} U_R^{(K)}(x, y)\, dA \right] = O(\Delta x_i)^{K+1} \qquad (3.9)$$
3.2.3 Compact Support
Notice that the reconstructed polynomial,⟨U
(K)R (x, y)
⟩i, is piecewise and changes from one
control volume to another. Therefore integrating the reconstructed function of a control
volume over its neighbors does not satisfy the average value of the neighboring control
volumes exactly. For the Kth-order solution reconstruction, we are reconstructing a (K-1)th
degree polynomial which satisfies the mean property for the control volume i and minimizes
the error in computing control volume averages of the neighboring control volumes. The
neighboring control volumes used in reconstruction constitute the reconstruction stencil,
Fig (3.1). The reconstruction stencil should include a proper number of neighboring control
volumes for computing the mentioned derivatives. The number of unknowns for the linear,
quadratic, and cubic reconstructions are 3, 6 and 10 leading to 2nd, 3rd and 4th-order
accuracy. Generally speaking, for a K-th order accurate reconstruction, at least K(K +1)/2
neighboring control volumes must be included in the reconstruction stencil. However, for
practical reasons one may use a few more control volumes in the reconstruction stencil. This
leads to a larger least-squares system but provides additional information for reconstruction
usually resulting in more robust solution reconstruction in the presence of non-smooth
and/or vigorously oscillatory data. In this research, the reconstruction stencils for 2nd, 3rd,
and 4th-order methods include 4, 9 and 16 control volumes respectively. Not surprisingly,
the polynomial ⟨U_R^(K)(x, y)⟩_i for control volume i should be reconstructed based on control
volumes that are topologically and physically close to the control volume i. This property
is called Compact Support. Consequently, the 4th-order reconstruction stencil covers three
layers of the neighboring control volumes. This could adversely affect the compactness
of the reconstruction for very coarse meshes. Since for very coarse meshes the distance
spanned by the reconstruction stencil is uncomfortably close to the characteristic size of the
domain, full convergence might not be achieved as reconstruction stencils of the boundary
control volumes overlap considerably with the reconstruction stencils of interior control
volumes. Because the size of the stencil and the associated cost of the reconstruction
increase quadratically with the order of accuracy, there is a practical limit to the benefits
of the classic finite-volume higher-order reconstruction.
In reconstructing non-smooth data in the vicinity of a discontinuity such as a shock wave,
even when we use a geometrically compact stencil, the stencil is not physically compact,
as the data employed in the reconstruction are irrelevant for control volumes across the shock
wave. There are two different approaches to address this issue. One is choosing the best
stencil containing the smoothest possible data and/or downplaying the role of non-smooth
data in the reconstruction by assigning small weights to them, leading to ENO (Essentially Non
Oscillatory) and WENO (Weighted Essentially Non Oscillatory) schemes [33, 43, 3, 57].
Figure 3.2: Imposing boundary constraint at the Gauss boundary points
The other, passive approach is to employ a limiter and enforce monotonicity by correcting
(reducing) the computed gradients, suppressing the over- and undershoots resulting from
reconstruction over a non-physically compact stencil [13, 87, 24]. The latter approach has been
taken in this research.
3.2.4 Boundary Constraints
In addition to the Mean Constraint, it is also possible to impose additional boundary
constraints on the reconstructed polynomial, satisfying the physical boundary conditions at
the solution domain boundaries [80, 60]. The straightforward way of doing this is imposing the
boundary constraint at the Gauss points where fluxes are actually calculated. The
coefficients of the reconstructed polynomial are then computed such that the physical boundary
condition is imposed automatically. As a result, the number of boundary constraints per control
volume is equal to the number of control volume Gauss points on boundary edges, Fig. (3.2).
3.2.4.1 Dirichlet Boundary Constraint
For a Dirichlet boundary condition, the value of the flow variable is known at the boundary.
Consequently, the reconstructed polynomial should give us the same boundary value at the
Gauss points (3.10):
U_R^(K)(x_G1, y_G1) = U_DB(x_G1, y_G1) ,  U_R^(K)(x_G2, y_G2) = U_DB(x_G2, y_G2)   (3.10)
3.2.4.2 Neumann Boundary Constraint
For a Neumann boundary condition, the normal derivative of the flow variable is known at
the boundary edge (3.11). Therefore, we take the normal derivative of the
reconstruction polynomial at the boundary Gauss points and equate it to the known
normal derivative provided by the Neumann boundary condition (3.12 and 3.13):
∂U_R^(K)/∂n = U_NB   (3.11)

∂U_R^(K)(x_G1, y_G1)/∂x (n_x)_G1 + ∂U_R^(K)(x_G1, y_G1)/∂y (n_y)_G1 = U_NB(x_G1, y_G1)   (3.12)

∂U_R^(K)(x_G2, y_G2)/∂x (n_x)_G2 + ∂U_R^(K)(x_G2, y_G2)/∂y (n_y)_G2 = U_NB(x_G2, y_G2)   (3.13)
3.2.5 Constrained Least-Square System
Knowing the constraints and the neighbors involved in the polynomial reconstruction for
control volume i, we can set up the least-squares system in the form of Eq. (3.14):

[ M_{1,1}          M_{1,2}          . . .  M_{1,q}          ] [ U_C           ]   [ Ū_i          ]
[ B_{2,1}          B_{2,2}          . . .  B_{2,q}          ] [ ∂U/∂x|_C      ]   [ BC_i         ]
[ w_{i,1}N_{1,1}   w_{i,1}N_{1,2}   . . .  w_{i,1}N_{1,q}   ] [ ∂U/∂y|_C      ]   [ w_{i,1}Ū_N1  ]
[ w_{i,2}N_{2,1}   w_{i,2}N_{2,2}   . . .  w_{i,2}N_{2,q}   ] [ ∂²U/∂x²|_C    ] = [ w_{i,2}Ū_N2  ]
[ w_{i,3}N_{3,1}   w_{i,3}N_{3,2}   . . .  w_{i,3}N_{3,q}   ] [ ∂²U/∂x∂y|_C   ]   [ w_{i,3}Ū_N3  ]
[       .                .                       .          ] [       .       ]   [       .      ]
[ w_{i,n}N_{n,1}   w_{i,n}N_{n,2}   . . .  w_{i,n}N_{n,q}   ] [ ∂^K U/∂y^K|_C ]   [ w_{i,n}Ū_Nn  ]
                                                                                          (3.14)

w_{i,j} = 1 / |X_Ci − X_Cj|^t   (3.15)
In Eq. (3.14) the first two rows constitute the mean and boundary constraints, which have
to be enforced exactly. The rest of the rows are additional equations arising from integration
of control volume i's polynomial over its neighbors N1, N2, N3, ..., Nn, which need to
be satisfied in the least-squares sense, minimizing the L2 norm of the system residual. Of
course, for interior control volumes there is no boundary constraint and the second row does
not appear in the least-squares system. A weighting strategy is also considered in forming
the least-squares system. Weights can be a function of geometric parameters and/or the
solution data [10, 57]. The main purpose of these weights is to control the influence of the
neighboring solution data on the reconstruction. In this research only geometric weighting in
the form of Eq. (3.15) with t = 1 is used, weakening the effect of solution data far from
the reconstructed control volume in inverse proportion to the geometric distance.
To satisfy the constraints exactly, Gaussian elimination with column pivoting is performed
on (3.14) for as many rows as the number of constraints in the system. The remaining least-
squares system is reduced to an upper-triangular system using Householder transformations
[92] with proper scaling, and finally the system is solved through backward substitution.
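As an illustrative sketch only (not code from this work), the constraint-elimination-then-least-squares procedure can be written compactly in Python; `constrained_lstsq` is a hypothetical helper name, and NumPy's `lstsq` stands in for the Householder reduction and backward substitution described above:

```python
import numpy as np

def constrained_lstsq(C, d, A, b):
    """Minimize ||A x - b||_2 subject to C x = d: constraint rows are
    removed by Gaussian elimination with column pivoting, the reduced
    rows are solved in the least-squares sense, and the constrained
    unknowns are recovered afterwards."""
    C, d = np.array(C, float), np.array(d, float)
    A, b = np.array(A, float), np.array(b, float)
    pivots = []
    for r in range(C.shape[0]):
        p = int(np.argmax(np.abs(C[r])))      # column pivot for constraint r
        d[r] /= C[r, p]
        C[r] /= C[r, p]
        for k in range(C.shape[0]):           # eliminate pivot from other constraints
            if k != r:
                d[k] -= C[k, p] * d[r]
                C[k] -= C[k, p] * C[r]
        b -= A[:, p] * d[r]                   # eliminate pivot from least-squares rows
        A -= np.outer(A[:, p], C[r])
        pivots.append(p)
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # reduced system, solved in the L2 sense
    for r, p in enumerate(pivots):            # back out the constrained unknowns
        x[p] = d[r] - C[r] @ x + C[r, p] * x[p]
    return x
```

For example, minimizing the distance of (x0, x1) to (2, 0) subject to x0 + x1 = 1 returns (1.5, −0.5), with the constraint satisfied exactly while the remaining rows hold only in the least-squares sense.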
3.3 Accuracy Assessment for a Smooth Function
Error norms are the most common measure for accuracy assessment of a numerical solution.
They provide useful information about the overall accuracy of the numerical solution as well
as the local source of the (maximum) error. For a 2D finite volume formulation, the general
form of the error norm can be computed by Eq. (3.16), where p is the norm index, Ncv is
the total number of control volumes in the domain, Ai is the area of control volume i,
and Ei is the average error of control volume i.
L_p = ( Σ_{i=1}^{Ncv} A_i |E_i|^p / Σ_{i=1}^{Ncv} A_i )^(1/p)   (3.16)

E_i = (1/A_i) ∫_CVi Error(x, y) dA   (3.17)
L1 and L2 are global norms and are good indicators of overall accuracy. In contrast, L∞
is the largest magnitude of the error in the solution domain and is a local error
indicator:

L∞ = max_i |E_i|
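A minimal Python sketch of the area-weighted norms of Eq. (3.16) and the L∞ norm (illustrative only; `lp_norm` and `linf_norm` are hypothetical helper names, not part of the solver described here):

```python
def lp_norm(errors, areas, p):
    """Area-weighted L_p error norm of Eq. (3.16); errors holds the
    control-volume average errors E_i and areas the areas A_i."""
    weighted = sum(a * abs(e) ** p for e, a in zip(errors, areas))
    return (weighted / sum(areas)) ** (1.0 / p)

def linf_norm(errors):
    """L-infinity norm: the largest control-volume error magnitude."""
    return max(abs(e) for e in errors)
```

On equal-area cells with errors (1, −2), the L1 norm is 1.5, the L2 norm is √2.5, and L∞ is 2.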
Having a proper tool for error measurement, we now proceed with the accuracy assessment
discussion. As previously mentioned (3.2.2), a K-exact reconstruction reconstructs the
solution to the (K+1)th order of accuracy. To verify the accuracy of our reconstruction
technique, we need to examine it for a smooth function. Here a simple but very common
procedure is introduced to assess the accuracy of the numerical solution over discretized
domains.

Assume a 1-D finite volume problem over a uniformly discretized domain M1 with
length scale ∆x. The average numerical solution for an arbitrary control volume i of M1
can be written in the following form:
Ū_i|_M1 = (1/∆x) ∫_CVi U_E(x) dx + O(∆x)^K   (3.18)
where Ū_i is the control volume average, U_E(x) is the exact solution of the problem, and
K is the discretization order (i.e. a polynomial of degree (K − 1) is used in the solution
reconstruction). Then the average solution error for each control volume can be expressed
as:
Error|_CVi = (1/∆x) ∫_CVi U_E(x) dx − Ū_i|_M1 ≡ O(∆x)^K
Therefore a similar relation holds when taking any norm of the error (‖E‖_{p=1,2,∞})
over the same domain:

‖E|_M1‖ ≡ O(∆x)^K   (3.19)

Now, if M2 represents the discretization of the same problem over the same domain with
the discretization length scale halved (∆x/2), the error norm on M2 is:

‖E|_M2‖ ≡ O(∆x/2)^K   (3.20)

and the ratio of these two error norms is a function of the discretization order only:

‖E|_M1‖ / ‖E|_M2‖ ≡ O(∆x)^K / O(∆x/2)^K = 2^K   (3.21)
To examine the order of discretization accuracy, employing (3.21), the error norms must
be plotted on a logarithmic scale versus mesh size, where the mesh length scale is uniformly
reduced each time by half. The asymptotic slope of the error-mesh plot shows the
numerical order of accuracy. This analysis can be extended to multiple dimensions as long
as the mesh length scale is decreased uniformly in each dimension throughout the
domain. For unstructured meshes in the general case, it is nearly impossible to decrease
the mesh length scale uniformly. Even if we try to refine the whole mesh globally in a
self-similar way (all angles and ratios remain the same in the refinement process), it is
not feasible for all triangles (except for structured triangulations of regular geometries).
Therefore in general, the best we can do is use a series of semi-uniform unstructured
meshes where the mesh density is increased each time by a factor of four. With this
approach we expect the overall mesh length scale to be reduced by a factor of two.
Based on Eq. (3.21), error norm ratios of 4, 8, and 16 are expected for 2nd-, 3rd-,
and 4th-order accuracy should we globally reduce the mesh length scale by half.
But of course in practice, we are neither dealing with uniform unstructured meshes nor can
we refine them uniformly everywhere; as a consequence, obtaining exactly the nominal order
of accuracy for all norms is not always possible.
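The observed order of accuracy implied by Eq. (3.21) follows from two error norms on successively refined meshes; a one-line illustrative sketch (`observed_order` is a hypothetical helper name):

```python
import math

def observed_order(err_coarse, err_fine, refinement_ratio=2.0):
    """Asymptotic order of accuracy from error norms on two meshes whose
    length scales differ by refinement_ratio, per Eq. (3.21):
    K = log(E_coarse / E_fine) / log(ratio)."""
    return math.log(err_coarse / err_fine) / math.log(refinement_ratio)
```

An error-norm ratio of 16 under halving of the length scale indicates 4th-order accuracy, a ratio of 4 indicates 2nd-order.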
3.4 Monotonicity Enforcement
First-order upwind methods often exhibit poor resolution and do not resolve physical
phenomena in the fluid flow accurately, since they introduce a considerable amount of
numerical diffusion into the solution process. This is due to the nature of these techniques,
in which the fluxes are computed based on cell-average data. Despite this weakness, they
have two great advantages: first, they are monotone and produce non-oscillatory results
for the whole domain even in the presence of discontinuities; second, they are stable and
converge very well.
High-resolution upwind methods (higher than first-order) cure the accuracy problem, but
they require precautionary measures such as limiting to overcome oscillatory behavior
and stability issues, especially for non-smooth flows.
3.4.1 Flux Limiting
In central difference schemes, additional terms in the form of artificial viscosity are added
to the flux computation such that the accuracy of the solution in the smooth region is not
affected, but these terms provide enough diffusivity in the shock region to make it oscillation
free and stabilize the convergence. One of the most successful efforts on this topic was made
by Antony Jameson [38], who introduced a second-order differencing term to address the
stability problem and a fourth-order differencing term to enhance the convergence of the
numerical solution. With this approach the solution in the vicinity of a discontinuity is first-
order (locally) while the global accuracy is preserved in smooth regions.
In principle, high resolution upwind techniques use a similar approach, adopting a low-order
resolution near the shock region to take advantage of the monotonicity of the first-order
upwind scheme. This idea can be better explained by Eq. (3.22), where Φ(U : j) is the
flux limiter function and j is the control volume stencil from which U is calculated. In
smooth regions, Φ(U : j) should be very close to one and therefore the flux is computed
with higher-order resolution. In contrast, in discontinuous regions Φ(U : j) must be nearly
zero to eliminate the higher-order flux, recovering the monotonic low-order scheme.
F(U : j) = F_L(U : j) + Φ(U : j) [F_H(U : j) − F_L(U : j)]   (3.22)
In other words, the goal is to provide just enough diffusion to prevent overshoots. One family
of schemes that applies this procedure is the Total Variation Diminishing (TVD) schemes
[35, 42]. In a TVD scheme, fluxes F(U) are limited such that the total variation (TV) of
the solution over the domain decreases with time during the solution process, compared to
the total variation of the initial data:
TV = ∫_Ω |∂U/∂x| dx   (3.23)

TV(U) = Σ_i |U_{i+1} − U_i|   (3.24)

TV(U^{n+1}) ≤ TV(U^n) ≤ TV(U^0)   (3.25)
All TVD schemes result in monotonic (non-oscillatory) solutions and stable convergence.
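The discrete total variation of Eq. (3.24) is simple to evaluate; an illustrative sketch (`total_variation` is a hypothetical helper name), which makes the TVD inequality (3.25) easy to check for any pair of solution snapshots:

```python
def total_variation(u):
    """Discrete total variation of Eq. (3.24): sum of the magnitudes of
    the jumps between adjacent control-volume averages."""
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))
```

A monotone step profile (0, 0, 1, 1) has TV = 1, while an oscillatory state such as (0, 1.2, 0.9, 1) has TV = 1.6; a TVD update must not let the latter grow.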
3.4.2 Slope Limiting
In a structured discretization, where there is a clear and straightforward connection between
the mesh data structure and the discretization scheme, employing flux limiters to achieve a
non-oscillatory solution is quite common and successful. However, in general, this clear
connection between mesh and discretization method no longer exists for unstructured
meshes, and implementing flux limiters is not that straightforward. At the same time, if
the flux function is computed based on Godunov's approach (even for a structured discretiza-
tion), determining a proper flux limiter is again not an easy job. There is another approach
for imposing the TVD condition in the solution process, known as slope or gradient limiting,
which is more appropriate for generic mesh discretization schemes and Godunov's approach.
In this approach, the computed solution gradient from reconstruction is corrected (reduced)
to meet the monotonicity condition defined by Eq. (3.28):
U_max = max(Ū_i, Ū_FNj)   (3.26)

U_min = min(Ū_i, Ū_FNj)   (3.27)

U_min ≤ U_i(x_G, y_G) ≤ U_max   (3.28)

where Ū_i is the control volume average and Ū_FNj are the control volume averages of the
first neighbors. This limiting procedure can be better explained by visualizing a linear
reconstruction for a simple 1-D case. Fig. (3.3-a) illustrates a typical unlimited linear
reconstruction of 1-D control volume averages. Using a slope limiter corrects the
overshoots and undershoots in the solution reconstruction by reducing the slope of the
reconstruction, as shown in Fig. (3.3-b).
In theory, Eq. (3.28) should be valid for all points inside the control volume i, but in practice
this condition will be checked and satisfied (if necessary) only for Gauss points where the
actual fluxes are computed. Assuming a linear reconstruction (2nd-order method) for
control volume i (Fig. (3.4)), the unlimited reconstructed value at Gauss point G is
written in the form of Eq. (3.29):

U_G = U(x_c, y_c) + ∇U|_C · r⃗_G   (3.29)

U(x_c, y_c) is the value of the flow variable at the cell center, which in this case (2nd-order) is
also the average value of the control volume. The ∇U|_C computed from the reconstruction
procedure needs to be adjusted according to the monotonicity condition (3.28) by a scalar
value φ called the slope limiter:

U_G = U(x_c, y_c) + φ_i ∇U|_C · r⃗_G   (3.30)
The goal is to determine the largest acceptable value for φi in such a way that the computed
Figure 3.3: Typical unlimited/limited linear reconstruction; (a) an unlimited linear
reconstruction, (b) a limited linear reconstruction
Figure 3.4: Using first neighbors for monotonicity enforcement
reconstructed value for U at all Gauss points is bounded by the maximum and minimum
of the neighboring control volume averages (including the average of control volume i):
φ_Gj = min(1, (U_max − Ū_i)/(U_Gj − Ū_i))   if (U_Gj − Ū_i) > 0
φ_Gj = min(1, (U_min − Ū_i)/(U_Gj − Ū_i))   if (U_Gj − Ū_i) < 0
φ_Gj = 1                                    if (U_Gj − Ū_i) = 0
                                                              (3.31)
φi = min(φG1 , φG2 , ...) (3.32)
This limiting procedure was introduced by Barth [13], and such an implementation guarantees
the monotonicity principle at all Gauss points. The Barth limiter produces a strictly
monotonic solution and removes all oscillations; however, it has some convergence and
accuracy issues. First, Barth's limiter formulation is clearly not differentiable, and this
severely hampers the convergence process as limiter values oscillate across the shock.
Second, in smooth regions (including the stagnation region), we expect to have some local
extrema which cause the limiter to fire. This reduces the accuracy of the solution in those
circumstances.
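The Barth procedure of Eqs. (3.31)-(3.32) can be sketched in a few lines of Python (illustrative only; `barth_limiter` is a hypothetical helper name, with scalar data standing in for one conserved variable):

```python
def barth_limiter(u_bar, u_neighbors, u_gauss):
    """Barth-Jespersen slope limiter, Eqs. (3.31)-(3.32): u_bar is the
    control volume average, u_neighbors the first-neighbor averages, and
    u_gauss the unlimited reconstructed values at the Gauss points."""
    u_max = max([u_bar] + list(u_neighbors))
    u_min = min([u_bar] + list(u_neighbors))
    phi = 1.0
    for ug in u_gauss:
        d2 = ug - u_bar
        if d2 > 0.0:
            phi_g = min(1.0, (u_max - u_bar) / d2)
        elif d2 < 0.0:
            phi_g = min(1.0, (u_min - u_bar) / d2)
        else:
            phi_g = 1.0                 # no excursion at this Gauss point
        phi = min(phi, phi_g)           # Eq. (3.32): most restrictive point wins
    return phi
```

A Gauss-point value of 3 against an average of 1 and neighbor range [0, 2] is cut back by φ = 0.5, while a value of 1.5 inside the neighbor range is left unlimited (φ = 1).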
In general, an ideal limiter is differentiable and acts firmly in the shock region, not allowing
oscillatory behavior around discontinuities. Such a limiter should also be inactive in smooth
regions despite the existence of non-monotone solutions resulting from smooth local extrema.
The Venkatakrishnan limiter [86, 87, 4], which is semi-differentiable, nicely addresses most of
the aforementioned issues, and it has been employed in this research:
∆1,max = U_max − Ū_i ,  ∆1,min = U_min − Ū_i ,  ∆2 = U_G − Ū_i   (3.33)

φ_G = (1/∆2) [ ((∆1,max² + ε²)∆2 + 2∆2²∆1,max) / (∆1,max² + 2∆2² + ∆1,max∆2 + ε²) ]   if ∆2 > 0

φ_G = (1/∆2) [ ((∆1,min² + ε²)∆2 + 2∆2²∆1,min) / (∆1,min² + 2∆2² + ∆1,min∆2 + ε²) ]   if ∆2 < 0   (3.34)

sign(∆2) = ∆2/|∆2| ,  ∆2 = sign(∆2)(|∆2| + ω)   (3.35)

ε² = (K∆x)³   (3.36)
To avoid division by zero or by a very small value in Eq. (3.34), as a practical measure ∆2 is
replaced by Eq. (3.35), where ω is chosen to be 10⁻¹² for 64-bit arithmetic
(according to Ref. [87]). ∆x in Eq. (3.36) is the mesh length scale; it can be taken as
an average mesh length scale or a local mesh length scale. A local mesh length scale has
been used in this research, defined as the diameter of the largest circle that can
be inscribed in a local control volume. This length scale is proportional to the square
root of the control volume area, and as a simplification, we assume the control volumes are
equilateral triangles:
∆x = 2 √( Area / (3√3) )   (3.37)
In smooth regions, ∆1,max, ∆1,min, and ∆2 are all of order (∆x)², either in constant
regions or at extrema, so the squared ∆ terms in Eq. (3.34) are of order (∆x)⁴. Since ε² is
made proportional to (∆x)³, ε² dominates the ∆1,max, ∆1,min, and ∆2 terms and we recover
the scheme without limiting [86, 87]. For instance, near an extremum (say a maximum),
∆1,max = 0 and ∆2 ≤ 0, therefore φ → ε²/ε² ≡ 1. For flat solutions, again
the limiter does not fire. However, in general smooth cases the limiter value
remains close to 1.0 but not exactly 1.0.
In the shock region, ∆1,max, ∆1,min, and ∆2 are O(U) ≈ 1; they dominate the ε² term
and the limiter fires as originally intended. The key point here is the value of the
constant K, the coefficient of ∆x in Eq. (3.36). K determines the extent of the
monotonicity enforcement by setting a threshold below which oscillations in the solution
are not limited. A very large value of K essentially means no limiting at all, and it can
make the solution process unstable. Normally, increasing K up to some value enhances
the convergence characteristics as long as divergence does not occur. In contrast, a small
value for the constant slows or stops convergence, although it produces a more monotonic
solution. Typical values of K used in the literature are 0.3, 1, and 10; in general, the
optimal value for both convergence and accuracy is determined based on experience.
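Eqs. (3.33)-(3.36) translate directly into a short routine; the sketch below is illustrative only (`venkat_limiter` is a hypothetical helper name; scalar data stands in for one conserved variable):

```python
import math

def venkat_limiter(u_bar, u_neighbors, u_gauss, dx, K=1.0, omega=1e-12):
    """Venkatakrishnan limiter, Eqs. (3.33)-(3.36), for one control
    volume: dx is the local mesh length scale, K the monotonicity
    threshold constant, omega the division-by-zero guard of Eq. (3.35)."""
    u_max = max([u_bar] + list(u_neighbors))
    u_min = min([u_bar] + list(u_neighbors))
    eps2 = (K * dx) ** 3                          # Eq. (3.36)
    phi = 1.0
    for ug in u_gauss:
        d2 = ug - u_bar                           # Delta_2 of Eq. (3.33)
        if d2 == 0.0:
            continue                              # unlimited at this point
        d2 = math.copysign(abs(d2) + omega, d2)   # Eq. (3.35)
        d1 = (u_max - u_bar) if d2 > 0.0 else (u_min - u_bar)
        num = (d1 * d1 + eps2) * d2 + 2.0 * d2 * d2 * d1
        den = d1 * d1 + 2.0 * d2 * d2 + d1 * d2 + eps2
        phi = min(phi, num / (d2 * den))          # Eq. (3.34)
    return phi
```

With a small excursion on a coarse mesh (large ε²) the limiter stays near one, while a large excursion on a fine mesh (small ε²) is limited well below one, which is the smooth-region/shock-region behavior discussed above.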
Having computed the limiter value, we can directly multiply all the derivatives in the higher-
order reconstructed polynomial (linear and non-linear) by the limiter value as was described
previously for linear reconstruction:
U_G^(K)(x_G, y_G) = U(x_c, y_c)
    + φ_i [ ∂U/∂x|_C ∆x + ∂U/∂y|_C ∆y ]
    + φ_i [ ∂²U/∂x²|_C ∆x²/2 + ∂²U/∂x∂y|_C ∆x∆y + ∂²U/∂y²|_C ∆y²/2 ]
    + φ_i [ ∂³U/∂x³|_C ∆x³/6 + ∂³U/∂x²∂y|_C ∆x²∆y/2 + ∂³U/∂x∂y²|_C ∆x∆y²/2 + ∂³U/∂y³|_C ∆y³/6 ]
    + ...   (3.38)
However, our experience as well as other research [22, 24] shows that applying the limiter
to all gradients yields a more diffusive solution. The approach of [24, 23] is followed, where
the limiter is applied only to the linear part of the reconstruction, and the non-linear part
of the reconstruction is dropped. This also helps the limiting process around the shock, as
higher derivatives tend to show oscillatory behavior in the vicinity of discontinuities. The
limited reconstruction polynomial is expressed as:
U_G^(K)(x_G, y_G) = Ū_i + [(1 − σ)φ_i + σ] (Linear Part) + σ (Higher-Order Part)   (3.39)
where σ is a limiter for the higher-order terms. In smooth regions, the full higher-order
reconstruction is applied by choosing σ = 1. Near discontinuities we switch from the
higher-order to the limited linear polynomial to prevent possible oscillatory behavior of the
second and third derivatives. This is done by a discontinuity detector employing the value of
the limiter: if the limiter fires aggressively, we assign zero to σ. This idea is also shown for
a simplified 1-D case in Fig. (3.5).
Using a switch to set σ to either zero or one stalls the convergence. To overcome this
Figure 3.5: Typical unlimited/limited quadratic reconstruction; (a) an unlimited quadratic
reconstruction, (b) a limited quadratic reconstruction
Figure 3.6: Defining σ as a function of φ
problem, σ is defined as a differentiable function of φ, such that σ is nearly one in regions
where φ ≥ φ0 and quickly goes toward zero for other values of φ; making σ differentiable
greatly enhances convergence. As a differentiable switch, a smooth semi-step function is
employed:

σ = [1 − tanh(S(φ0 − φ))] / 2   (3.40)

S in (3.40) determines the sharpness of the step function and adjusts how fast the switching
transition between zero and one is; φ0 defines the limiter value that activates the switch.
It appears that φ0 = 0.8 and S = 20 provide a reasonable switch function whose good
behavior is relatively case-independent in this research, Fig. (3.6).
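The smooth semi-step of Eq. (3.40) is a one-liner; an illustrative sketch (`sigma_switch` is a hypothetical helper name) using the φ0 = 0.8 and S = 20 values quoted above:

```python
import math

def sigma_switch(phi, phi0=0.8, S=20.0):
    """Differentiable switch of Eq. (3.40): sigma is near one for
    phi >= phi0 (full higher-order reconstruction) and near zero
    otherwise (limited linear reconstruction)."""
    return 0.5 * (1.0 - math.tanh(S * (phi0 - phi)))
```

At φ = 1 the switch is essentially one, at φ = 0 essentially zero, and it passes through exactly 0.5 at φ = φ0, so convergence no longer stalls on a hard zero/one toggle.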
Chapter 4
Flux Jacobian
4.1 What is the Jacobian?
Any implicit formulation for solving a PDE includes some sort of Jacobian calculation. We
seek a solution vector X = (X1, X2, X3, ..., Xn) for a coupled non-linear system of algebraic
equations defined by:
F (X) = 0 (4.1)
where,
F = (F1, F2, F3, ..., Fn) (4.2)
The solution can be found via an iterative process (Newton's method):

(∂F/∂X)|_i δX^(i+1) = −F(X^i) ,  X^(i+1) = X^i + δX^(i+1)   (4.3)
Since F is a vector operator, the derivatives of F with respect to the vector X are expressed
as a matrix called the Jacobian matrix, ∂F/∂X, such that the (∂F/∂X)_{j,k} entry is the
derivative of the j-th component of F with respect to the k-th component of X:

        [ J_{1,1}  J_{1,2}  . . .  J_{1,n} ]
        [ J_{2,1}  J_{2,2}  . . .  J_{2,n} ]
∂F/∂X = [    .        .               .    ] ,  J_{j,k} = ∂F_j/∂X_k   (4.4)
        [    .        .               .    ]
        [ J_{n,1}  J_{n,2}  . . .  J_{n,n} ]
Generally speaking, in an implicit formulation such as (4.3) the Jacobian matrix constitutes
the coefficient matrix and represents the linearization of a non-linear problem which must
be solved through an iterative process. In CFD the non-linear function F , determined by
the physics of the problem, is the residual of fluxes for the discretized domain. In other
words, we look for a solution vector U (conserved flow variables) which satisfies the flux
balance for all control volumes in a meshed domain:
Res(U) = 0 (4.5)
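The Newton iteration of Eq. (4.3) can be sketched for a small algebraic system; this is an illustrative toy (not the thesis solver), with a forward-difference Jacobian and a tiny dense solve (`newton_solve` and `gauss_solve` are hypothetical helper names):

```python
def gauss_solve(A, b):
    """Tiny dense solver: Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p], b[c], b[p] = A[p], A[c], b[p], b[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            b[r] -= f * b[c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (b[r] - s) / A[r][r]
    return x

def newton_solve(F, x0, tol=1e-10, h=1e-7, max_iter=50):
    """Newton's method, Eq. (4.3): linearize F about the current iterate,
    solve J dX = -F, and update. The Jacobian here is built column by
    column with forward differences."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        r = F(x)
        if max(abs(v) for v in r) < tol:
            break
        J = [[0.0] * n for _ in range(n)]
        for k in range(n):
            xp = list(x)
            xp[k] += h
            rp = F(xp)
            for j in range(n):
                J[j][k] = (rp[j] - r[j]) / h
        dx = gauss_solve(J, [-v for v in r])
        x = [xi + di for xi, di in zip(x, dx)]
    return x
```

For the system (x0² + x1² − 4, x0 − x1) = 0 this converges in a handful of iterations to x0 = x1 = √2, illustrating the fast local convergence that motivates the Newton phase of the solver.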
Equation (4.3) is similar to the implicit time advance formula, Eq. (2.12), except for the
diagonal term in (2.12), normalized by a time step (mainly added for stabilization purposes).
Equation (2.12) can be regarded as Newton's formula with pseudo-time stepping. In either
case, the Jacobian matrix is needed not only for forming the linear system but also for
building the preconditioner matrix, and it is the most expensive part of the implicit solver.
The convergence rate of the implicit solver depends on the accuracy and correctness of the
Jacobian matrix, and to achieve quadratic convergence, we need to employ the true
flux Jacobian on the left-hand side of Eq. (4.3). Since the Jacobian matrix consists of
the derivatives of the discretized governing equations with respect to the flow variables
used in the local discretization, the Jacobian entries in each row are zero except for those
few control volumes that contribute to that flux integral (i.e. the union of the reconstruction
stencils of the neighbors). The level of difficulty in flux Jacobian computation depends
on the flux function (physics), the numerical flux formulation (i.e. how the flux is being
computed, such as Roe's flux formula), the type of the mesh, the number of dimensions, and
the order of accuracy. For instance, taking the derivative of the first-order flux function for
the 1-D linear wave equation, Eq. (2.15), in the finite-volume formulation is fairly easy
(∂F_i/∂U_i = λ and ∂F_i/∂U_{i−1} = −λ, since (Flux)_i = λ(u_i − u_{i−1})). But in general,
for non-linear flux functions and/or formulations, especially in multiple dimensions and with
unstructured meshes, the flux Jacobian computation involves a large number of control
volumes and much more work per control volume. Consequently, increasing the order of
accuracy not only adds to the
complexity of flux Jacobian computation but also makes it considerably costly both in terms
of computation time and memory usage.
The Jacobian matrix can be built by either analytical or numerical (finite-difference)
differentiation; both approaches have been employed quite successfully [84]. Analytical
differentiation can be performed manually, symbolically [94], or automatically [78]. The
analytical approach computes the exact Jacobian (ignoring round-off error in numerical
evaluation), but it is relatively difficult to apply for complicated flux functions
and/or higher-order discretizations by manual or even symbolic differentiation. Automatic
differentiation (AD), a very powerful tool for computing complex Jacobians, augments the
computer program to compute the derivatives by applying the chain rule repeatedly to the
elementary arithmetic operations performed by the program. The derivatives are
computed relatively cheaply by this method and are accurate to machine accuracy without
any truncation error. However, AD requires very careful programming and a considerable
amount of memory [20], and has not been considered in this research. Numerical
differentiation is fairly straightforward even for higher-order and complicated flux functions.
It also works reasonably well where the flux function is not differentiable, but like any other
numerical approach, numerical differentiation suffers from truncation error as
well as round-off error.
For some Newton-Krylov solvers, like GMRES, only products of the Jacobian matrix
with vectors are needed, and explicit computation of the Jacobian matrix can be avoided.
These products are computed using directional finite differencing. Therefore the higher-
order Jacobian of complex CFD problems can be employed on the left-hand side of the
implicit formula to match the true order of accuracy of the residual on the right-hand
side, Eq. (2.12), without excessive computational effort and memory usage. However, in
most cases, an approximate Jacobian matrix is required explicitly for preconditioning of
the linear system solver. It is possible to use a simplified or first-order Jacobian (based on
direct neighbors and mostly first-order) for preconditioning of the linear system resulting
from higher-order discretization [51, 50].
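The directional finite differencing that makes the method matrix-free is a one-liner; an illustrative sketch (`jacobian_vector_product` is a hypothetical helper name; the residual function R stands in for the discretized flux residual):

```python
def jacobian_vector_product(R, u, v, eps=1e-7):
    """Matrix-free Jacobian-vector product J v ~ [R(u + eps*v) - R(u)] / eps,
    as used inside GMRES so the higher-order Jacobian is never formed
    explicitly; eps trades truncation error against round-off error."""
    up = [ui + eps * vi for ui, vi in zip(u, v)]
    r0, rp = R(u), R(up)
    return [(a - b) / eps for a, b in zip(rp, r0)]
```

Each GMRES iteration then costs one extra residual evaluation instead of a Jacobian assembly, which is the saving exploited by the solver described here.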
In this chapter, first, the lower-order Euler Jacobian computation procedure is presented
for an unstructured mesh stencil based on Roe’s flux formula. This flux Jacobian is used
for preconditioning of the linear system solver. The analytical and numerical approaches
for flux differentiation are discussed step by step.
Figure 4.1: Schematic of Direct Neighbors
4.2 Flux Jacobian Formulation
In the case of the 2D Euler equations discretized over a cell-centered unstructured mesh, each
control volume has 3 direct (first) neighbors, and the first-order flux integral of the control
volume i only depends on these three neighbors and the control volume itself (Fig(4.1)):
R_i = Σ_{m=1,2,3} (F n ds)_m = F(U_i, U_N1) n_1 l_1 + F(U_i, U_N2) n_2 l_2 + F(U_i, U_N3) n_3 l_3   (4.6)
where, nm and lm are the outward unit normal and the length of the face m of the control
volume i respectively. To compute the flux Jacobian of the control volume i, we need to
take the derivatives of the flux integral or residual function, Eq. (4.6), with respect to
all control volume averages involved in the residual function evaluation as shown in the
following formulas:
J(i,N1) = ∂R_i/∂U_N1 = ∂F(U_i, U_N1)/∂U_N1 n_1 l_1   (4.7)

J(i,N2) = ∂R_i/∂U_N2 = ∂F(U_i, U_N2)/∂U_N2 n_2 l_2   (4.8)

J(i,N3) = ∂R_i/∂U_N3 = ∂F(U_i, U_N3)/∂U_N3 n_3 l_3   (4.9)

J(i,i) = ∂R_i/∂U_i = ∂F(U_i, U_N1)/∂U_i n_1 l_1 + ∂F(U_i, U_N2)/∂U_i n_2 l_2 + ∂F(U_i, U_N3)/∂U_i n_3 l_3   (4.10)
Since both the flux F and the solution U are 4-component vectors, each entry in the Jacobian
matrix J is a 4 × 4 block, and the total size of the block matrix is n × n blocks, where n is
the total number of control volumes, as shown in Eq. (4.11). However, most of these blocks
are zero since in general there are no more than three neighboring control volumes in the
first-order cell-centered formulation, and the Jacobian matrix is very sparse, with at most
four non-zero blocks per row.
    [ [4×4]_{1,1}  . . .  [4×4]_{1,k}  . . .  [4×4]_{1,n} ]
    [ [4×4]_{2,1}  . . .  [4×4]_{2,k}  . . .  [4×4]_{2,n} ]
J = [      .                  .                    .      ]   (4.11)
    [      .                  .                    .      ]
    [ [4×4]_{n,1}  . . .  [4×4]_{n,k}  . . .  [4×4]_{n,n} ]

where each [4×4]_{j,k} denotes a 4 × 4 block of scalar entries.
Figure (4.2) displays typical numbering and the resulting connectivity between triangles in a
sample unstructured mesh. For this sample mesh, the only non-zero blocks in the Jacobian
matrix corresponding to control volume 42 (row 42) are (42,38), (42,42), (42,49), and
(42,57); for this row, two non-zero blocks are above the diagonal block. For row 54, the
non-zero blocks are (54,43), (54,45), (54,54), and (54,65); this time two non-zero blocks are
below the diagonal block.
In the next two subsections, the analytic derivation of the first-order flux Jacobian entries
Figure 4.2: Typical cell-centered mesh numbering
for both interior and boundary fluxes are described in detail.
4.2.1 Roe’s Flux Jacobian
After introducing the structure of the flux Jacobian, we need to compute the derivatives of
the fluxes, i.e. Roe’s flux function. The flux at the face between cell i and N1 is:
F(U_i, U_N1) = (1/2) [ (F(U_i) + F(U_N1)) − |Ã| (U_N1 − U_i) ]   (4.12)
The Euler flux formula, F(U), and its derivative with respect to the conserved variables,
∂F/∂U, were described by Eq. (2.2) and Eq. (2.16). The matrix |Ã|, the Jacobian of the
Euler flux evaluated at Roe's averaged properties, was introduced by Eq. (2.19). Roe's
average formulas (tilde form) for density, velocity components, and total enthalpy are the
same as those introduced in section (2.3.2), but this time they have been rearranged to
reduce the evaluation cost.

|Ã| = L |λ̃| R   (4.13)
L =
1 0 1eC2
1eC2
u −nyeueC2
+ nx
eCeueC2
− nx
eCv nx
eveC2
+ny
eCeveC2
− ny
eCQ ut
1β
+ eun
eC +eQ
eC2
1β− eun
eC +eQ
eC2
(4.14)
∣∣∣λ∣∣∣ =
|un| 0 0 0
0 |un| 0 0
0 0∣∣∣un + C
∣∣∣ 0
0 0 0∣∣∣un − C
∣∣∣
(4.15)
R = \begin{bmatrix}
1 - \frac{\beta\tilde{Q}}{\tilde{C}^2} & \frac{\beta\tilde{u}}{\tilde{C}^2} & \frac{\beta\tilde{v}}{\tilde{C}^2} & -\frac{\beta}{\tilde{C}^2} \\
-\tilde{u}_t & -n_y & n_x & 0 \\
\frac{1}{2}\left(\beta\tilde{Q} - \tilde{C}\tilde{u}_n\right) & \frac{1}{2}\left(n_x\tilde{C} - \beta\tilde{u}\right) & \frac{1}{2}\left(n_y\tilde{C} - \beta\tilde{v}\right) & \frac{1}{2}\beta \\
\frac{1}{2}\left(\beta\tilde{Q} + \tilde{C}\tilde{u}_n\right) & -\frac{1}{2}\left(n_x\tilde{C} + \beta\tilde{u}\right) & -\frac{1}{2}\left(n_y\tilde{C} + \beta\tilde{v}\right) & \frac{1}{2}\beta
\end{bmatrix} \qquad (4.16)
\beta = \gamma - 1\,, \quad \gamma = \frac{C_p}{C_v} \qquad (4.17)

\tilde{\rho} = \sqrt{\rho_L \rho_R} \qquad (4.18)

\tilde{u} = \frac{\rho_L u_L + \tilde{\rho}\, u_R}{\rho_L + \tilde{\rho}} \qquad (4.19)

\tilde{v} = \frac{\rho_L v_L + \tilde{\rho}\, v_R}{\rho_L + \tilde{\rho}} \qquad (4.20)

\tilde{h}_t = \frac{\rho_L h_{tL} + \tilde{\rho}\, h_{tR}}{\rho_L + \tilde{\rho}} \qquad (4.21)

\tilde{Q} = \frac{1}{2}\left(\tilde{u}^2 + \tilde{v}^2\right) \qquad (4.22)

\tilde{C} = \sqrt{\beta\left(\tilde{h}_t - \tilde{Q}\right)} \qquad (4.23)

\tilde{u}_t = -n_y \tilde{u} + n_x \tilde{v} \qquad (4.24)

\tilde{u}_n = n_x \tilde{u} + n_y \tilde{v} \qquad (4.25)
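To make the reconstructed decomposition |Ã| = L|λ̃|R of Eqs. (4.13)-(4.16) concrete, it can be checked numerically: R must be the inverse of L. A minimal sketch, not the thesis code; variable names and sample values are illustrative, with tildes dropped:

```python
import numpy as np

def roe_eigensystem(u, v, Q, C, nx, ny, gamma=1.4):
    """Assemble L, |lambda|, R of Eqs. (4.14)-(4.16) from Roe-averaged
    quantities (tildes dropped). (nx, ny) must be a unit normal and
    Q = (u^2 + v^2)/2, as in Eq. (4.22)."""
    beta = gamma - 1.0
    un = nx * u + ny * v                       # Eq. (4.25)
    ut = -ny * u + nx * v                      # Eq. (4.24)
    C2 = C * C
    L = np.array([
        [1.0, 0.0, 1.0 / C2,                1.0 / C2],
        [u,   -ny, u / C2 + nx / C,         u / C2 - nx / C],
        [v,    nx, v / C2 + ny / C,         v / C2 - ny / C],
        [Q,    ut, 1/beta + un/C + Q/C2,    1/beta - un/C + Q/C2]])
    lam = np.diag(np.abs([un, un, un + C, un - C]))     # Eq. (4.15)
    R = np.array([
        [1 - beta*Q/C2,        beta*u/C2,            beta*v/C2,            -beta/C2],
        [-ut,                  -ny,                   nx,                    0.0],
        [0.5*(beta*Q - C*un),  0.5*(nx*C - beta*u),   0.5*(ny*C - beta*v),   0.5*beta],
        [0.5*(beta*Q + C*un), -0.5*(nx*C + beta*u),  -0.5*(ny*C + beta*v),   0.5*beta]])
    return L, lam, R
```

As a consistency check, R @ L should reproduce the 4x4 identity, and |Ã| is then L @ lam @ R.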
The flux differencing term in Eq. (4.12), |Ã|(U_{N1} − U_i), can be recast in the form of |Ã|ΔU, and the full derivative of this term with respect to the solution vector (in general form) is:

\frac{\partial\left(\left|\tilde{A}\right|\Delta U\right)}{\partial U} = \overbrace{\frac{\partial\left|\tilde{A}\right|}{\partial U}\Delta U}^{1} + \overbrace{\left|\tilde{A}\right|\frac{\partial(\Delta U)}{\partial U}}^{2} \qquad (4.26)
Now, we take a look at the first term in Eq. (4.26):

\frac{\partial\left|\tilde{A}\right|}{\partial U}\Delta U = \frac{\partial\left(L\left|\tilde{\lambda}\right|R\right)}{\partial U}\Delta U = \left[\frac{\partial L}{\partial U}\left|\tilde{\lambda}\right|R + L\frac{\partial\left|\tilde{\lambda}\right|}{\partial U}R + L\left|\tilde{\lambda}\right|\frac{\partial R}{\partial U}\right]\Delta U \qquad (4.27)
Differentiation of ∂|Ã|/∂U produces third-rank tensors, which are not only difficult to derive but also quite expensive to compute. Barth [9] has derived the full derivative ∂(|Ã|ΔU)/∂U with some clever modifications that eliminate the tensor computations, reducing the complexity of the Jacobian computation to some degree. Through a spectral radius analysis for 1-D flow he showed that for a smooth flow the approximate Jacobian can be accurate up to CFL = 1000 or even above. However, for the shock tube problem the difference between the true Jacobian and the approximate Jacobian grows after CFL = 10 and becomes noticeable after CFL = 100, showing that the approximate Jacobian will not be accurate enough for larger CFL numbers. Looking back at Eq. (4.26), a similar physical insight can be drawn. For a smooth flow, the magnitude of (∂|Ã|/∂U)ΔU is very small compared to that of |Ã|∂(ΔU)/∂U, and the resulting approximate Jacobian will be acceptable. In the case of a flow with a discontinuity, this approximation is no longer accurate, because the entries of ∂|Ã|/∂U as well as ΔU will be large at least in some regions of the flow field; except where ΔU is very small (i.e. smooth regions of the flow field), the approximation loses some of its accuracy. Therefore applying a large CFL number to accelerate the solution is not necessarily useful when the overall flow linearization is relatively inaccurate.
Ignoring changes in Ã (treating it as a constant) greatly simplifies the differentiation of Roe's flux, as well as the overall Jacobian computation cost; such a differentiation of the Roe flux of Eq. (4.12) with respect to cells i and N1 is called the approximate Roe flux Jacobian and can be written in the following form:

\frac{\partial F(U_i, U_{N1})}{\partial U_{N1}} = \frac{1}{2}\left[\frac{\partial F(U_{N1})}{\partial U_{N1}} - \left|\tilde{A}\right|\right] \qquad (4.28)
\frac{\partial F(U_i, U_{N1})}{\partial U_i} = \frac{1}{2}\left[\frac{\partial F(U_i)}{\partial U_i} + \left|\tilde{A}\right|\right] \qquad (4.29)
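Assembling the frozen-|Ã| blocks of Eqs. (4.28)-(4.29) is then a purely arithmetic step once |Ã| and the analytic Euler flux Jacobians are available. A minimal sketch; the 4x4 input arrays and their names are illustrative assumptions:

```python
import numpy as np

def approx_roe_jacobians(dFdU_i, dFdU_N1, A_abs):
    """Approximate Roe flux Jacobians with |A~| treated as constant.
    dFdU_i, dFdU_N1: analytic Euler flux Jacobians at cells i and N1;
    A_abs: |A~| at the Roe-averaged face state (all 4x4 arrays)."""
    J_N1 = 0.5 * (dFdU_N1 - A_abs)   # Eq. (4.28)
    J_i = 0.5 * (dFdU_i + A_abs)     # Eq. (4.29)
    return J_i, J_N1
```

Note that the two blocks sum to the simple flux average Jacobian, since the |Ã| contributions cancel.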
The other Jacobian terms in Eq. (4.8) through Eq. (4.10) can be derived easily by repeating
similar differentiation.
Because of the difficulties in Jacobian calculation, especially for complicated flux functions such as Roe's, several researchers [83, 93] have successfully employed simpler flux functions like Steger-Warming [77] for linearization purposes (building the Jacobian on the left-hand side), while the right-hand side (the residual) is still evaluated based on the Roe flux. A similar approach can be taken for preconditioning of the left-hand side when the explicit Jacobian matrix is not required (matrix-free methods) but a preconditioner matrix is still needed to enhance the performance of the linear solver [22, 61]. In either case, due to the inconsistency between the left- and right-hand sides (i.e. linearization and flux evaluation), the resultant convergence rate per iteration will not be as fast as if a consistent Jacobian were employed.
4.2.2 Boundary Flux Jacobians
To have a true implicit time-advance and/or Newton formulation, the linearization must be
extended to the boundary fluxes. Both the Jacobian matrix and the preconditioner matrix
(in our case the first-order Jacobian) need to include derivatives of the boundary fluxes as
well. Otherwise, the implicit formulation does not converge as fast as it should and/or some
stability issues may well occur if large CFL numbers are attempted [79, 32]. The goal here
is to compute the Jacobian matrix of the boundary Euler flux with respect to conservative
variables. Since on some occasions boundary conditions (section 2.4) are implemented based
on the primitive variables (V ) rather than conservative variables (U), it is more convenient
to compute the Jacobian of the Euler analytic flux with respect to primitive variables and
then convert that Jacobian to the conservative format using the chain rule:

\frac{\partial F}{\partial U} = \frac{\partial F}{\partial V}\frac{\partial V}{\partial U} \qquad (4.30)

\frac{\partial F}{\partial V} = \begin{bmatrix}
u_n & \rho n_x & \rho n_y & 0 \\
u u_n & \rho(u_n + u n_x) & \rho u n_y & n_x \\
v u_n & \rho v n_x & \rho(u_n + v n_y) & n_y \\
h_t u_n - \frac{\gamma P u_n}{\rho(\gamma-1)} & \rho(u u_n + h_t n_x) & \rho(v u_n + h_t n_y) & \frac{\gamma}{\gamma-1}u_n
\end{bmatrix} \qquad (4.31)
\frac{\partial V}{\partial U} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
-\frac{u}{\rho} & \frac{1}{\rho} & 0 & 0 \\
-\frac{v}{\rho} & 0 & \frac{1}{\rho} & 0 \\
\frac{1}{2}(u^2 + v^2)(\gamma-1) & -u(\gamma-1) & -v(\gamma-1) & (\gamma-1)
\end{bmatrix} \qquad (4.32)
U = \begin{bmatrix}\rho \\ \rho u \\ \rho v \\ E\end{bmatrix}, \qquad V = \begin{bmatrix}\rho \\ u \\ v \\ P\end{bmatrix}, \qquad F = \begin{bmatrix}\rho u_n \\ \rho u u_n + P n_x \\ \rho v u_n + P n_y \\ \rho h_t u_n\end{bmatrix} \qquad (4.33)

e_t = e + e_k\,, \quad e = C_v T = \frac{P}{\rho(\gamma-1)}\,, \quad e_k = \frac{1}{2}(u^2 + v^2)\,, \quad E = \rho e_t\,, \quad h = e + \frac{P}{\rho}\,, \quad h_t = h + e_k \qquad (4.34)
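The chain rule (4.30) with the matrices (4.31)-(4.32) can be verified against a finite-difference Jacobian of the flux (4.33). A self-contained sketch under the assumption γ = 1.4; names are illustrative, not the thesis code:

```python
import numpy as np

GAMMA = 1.4
BETA = GAMMA - 1.0

def flux(Ucons, nx, ny):
    """Analytic Euler flux of Eq. (4.33), from conservative variables."""
    rho, mu, mv, E = Ucons
    u, v = mu / rho, mv / rho
    P = BETA * (E - 0.5 * rho * (u * u + v * v))
    un = nx * u + ny * v
    ht = GAMMA * P / (BETA * rho) + 0.5 * (u * u + v * v)   # Eq. (4.34)
    return np.array([rho * un,
                     rho * u * un + P * nx,
                     rho * v * un + P * ny,
                     rho * ht * un])

def dFdU_analytic(Ucons, nx, ny):
    """Chain rule of Eq. (4.30): dF/dU = (dF/dV)(dV/dU),
    with dF/dV from Eq. (4.31) and dV/dU from Eq. (4.32)."""
    rho, mu, mv, E = Ucons
    u, v = mu / rho, mv / rho
    P = BETA * (E - 0.5 * rho * (u * u + v * v))
    un = nx * u + ny * v
    ht = GAMMA * P / (BETA * rho) + 0.5 * (u * u + v * v)
    dFdV = np.array([
        [un,       rho * nx,             rho * ny,             0.0],
        [u * un,   rho * (un + u * nx),  rho * u * ny,         nx],
        [v * un,   rho * v * nx,         rho * (un + v * ny),  ny],
        [ht * un - GAMMA * P * un / (rho * BETA),
                   rho * (u * un + ht * nx),
                                         rho * (v * un + ht * ny),
                                                               GAMMA * un / BETA]])
    dVdU = np.array([
        [1.0,                            0.0,        0.0,        0.0],
        [-u / rho,                       1.0 / rho,  0.0,        0.0],
        [-v / rho,                       0.0,        1.0 / rho,  0.0],
        [0.5 * (u * u + v * v) * BETA,  -u * BETA,  -v * BETA,   BETA]])
    return dFdV @ dVdU
```

Comparing `dFdU_analytic` with a central difference of `flux` is a quick way to catch sign or index slips in the matrices.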
4.2.2.1 Wall Boundary Flux Jacobian
The wall boundary condition for the first-order Euler flux can be satisfied by setting the normal velocity to zero (u_n = 0), section 2.4.1, and the wall flux Jacobian is derived by taking the derivative of this simplified wall flux:

F_W = \begin{bmatrix}0 \\ P n_x \\ P n_y \\ 0\end{bmatrix} \qquad (4.35)
\left.\frac{\partial F}{\partial V}\right|_W = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & n_x \\
0 & 0 & 0 & n_y \\
0 & 0 & 0 & 0
\end{bmatrix}, \qquad \left.\frac{\partial F}{\partial U}\right|_W = \left.\frac{\partial F}{\partial V}\right|_W \frac{\partial V}{\partial U} \qquad (4.36)
4.2.2.2 Subsonic Inlet Flux Jacobian
As explained in section 2.4.2, total pressure, total temperature, and angle of attack are set at a subsonic inlet as constant inlet conditions, and the static pressure is taken from the interior as a parameter. Here it is easier to rewrite the Euler flux in terms of total pressure, total temperature, static pressure, and angle of attack and then proceed with taking derivatives. The velocity magnitude, V_m, can be defined as a function of total temperature, and the normal velocity component is described in terms of the velocity magnitude and angle of attack:
V_m = \left[\left(\frac{2}{\gamma-1}\right)\left(\frac{T_t}{T} - 1\right)\right]^{\frac{1}{2}} \qquad (4.37)

u = V_m \cos\alpha = \left[\left(\frac{2}{\gamma-1}\right)\left(\frac{T_t}{T} - 1\right)\right]^{\frac{1}{2}}\cos\alpha \qquad (4.38)

v = V_m \sin\alpha = \left[\left(\frac{2}{\gamma-1}\right)\left(\frac{T_t}{T} - 1\right)\right]^{\frac{1}{2}}\sin\alpha \qquad (4.39)

u_n = \left[\left(\frac{2}{\gamma-1}\right)\left(\frac{T_t}{T} - 1\right)\right]^{\frac{1}{2}}\left(\cos\alpha\, n_x + \sin\alpha\, n_y\right) \qquad (4.40)
Now, we write density in terms of total temperature, total pressure and static pressure:

\rho = \frac{\gamma P}{T}\,, \qquad T = T_t\left(\frac{P}{P_t}\right)^{\frac{\gamma-1}{\gamma}} \qquad (4.41)

\rho = \frac{\gamma P}{T_t\left(\frac{P}{P_t}\right)^{\frac{\gamma-1}{\gamma}}} \qquad (4.42)
And finally h_t is rewritten in terms of total pressure and static pressure:

h_t = \frac{\gamma P}{(\gamma-1)\,\frac{\gamma P}{T_t}\left(\frac{P}{P_t}\right)^{\frac{\gamma-1}{\gamma}}} + \frac{1}{2}\left(\frac{2}{\gamma-1}\right)\left(\left(\frac{P_t}{P}\right)^{\frac{\gamma-1}{\gamma}} - 1\right)
The Euler flux vector is recast in its new format after substitution of \rho, u, v, u_n, h_t with their new definitions:

F = \begin{bmatrix}F_1 \\ F_2 \\ F_3 \\ F_4\end{bmatrix} \qquad (4.43)
F_1 = \frac{\gamma C_2}{T_t P_t^{-C_1}}\left[\frac{2}{\beta}\left(P_t^{C_1} P^{(2-3C_1)} - P^{(2-2C_1)}\right)\right]^{\frac{1}{2}} \qquad (4.44)
F_2 = \frac{\gamma C_2 \cos\alpha}{T_t P_t^{-C_1}}\left[\frac{2}{\beta}\left(P_t^{C_1} P^{(1-2C_1)} - P^{(1-C_1)}\right)\right] + P n_x \qquad (4.45)
F_3 = \frac{\gamma C_2 \sin\alpha}{T_t P_t^{-C_1}}\left[\frac{2}{\beta}\left(P_t^{C_1} P^{(1-2C_1)} - P^{(1-C_1)}\right)\right] + P n_y \qquad (4.46)
F_4 = \frac{\gamma C_2}{T_t P_t^{-C_1}}\left[\frac{T_t P_t^{-C_1} P}{\beta} + \frac{1}{\beta}\left(P_t^{C_1} P^{(1-2C_1)} - P^{(1-C_1)}\right)\right]\left[\frac{2}{\beta}\left(P_t^{C_1} P^{-C_1} - 1\right)\right]^{\frac{1}{2}} \qquad (4.47)
\beta = \gamma - 1\,, \quad C_1 = \frac{\beta}{\gamma}\,, \quad C_2 = \cos\alpha\, n_x + \sin\alpha\, n_y \qquad (4.48)
Total pressure, total temperature and angle of attack for a steady inlet are constant, so ∂η/∂V, where η is the inlet variable vector, is greatly simplified, Eq. (4.49). Applying the chain rule (Eq. (4.50)), and since the only primitive variable appearing in the subsonic inlet flux vector is the static pressure, P, the first three columns of the subsonic inlet flux Jacobian will be zero and only the fourth column needs to be evaluated.
\frac{\partial\eta}{\partial V} = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}, \qquad \eta = \begin{bmatrix}P_t \\ T_t \\ \alpha \\ P\end{bmatrix} \qquad (4.49)
\frac{\partial F}{\partial V} = \frac{\partial F}{\partial\eta}\frac{\partial\eta}{\partial V} \qquad (4.50)
\left.\frac{\partial F}{\partial V}\right|_{SubIn} = \begin{bmatrix}
0 & 0 & 0 & \frac{\partial F_1}{\partial P} \\
0 & 0 & 0 & \frac{\partial F_2}{\partial P} \\
0 & 0 & 0 & \frac{\partial F_3}{\partial P} \\
0 & 0 & 0 & \frac{\partial F_4}{\partial P}
\end{bmatrix} \qquad (4.51)
\frac{\partial F_1}{\partial P} = \frac{1}{2}\left(\frac{\gamma C_2}{T_t P_t^{-C_1}}\right)\left(\frac{2}{\beta}\left[P_t^{C_1} P^{(2-3C_1)} - P^{(2-2C_1)}\right]\right)^{-\frac{1}{2}} \times \left(\frac{2}{\beta}\left[(2-3C_1)\,P_t^{C_1} P^{(1-3C_1)} - 2(1-C_1)\,P^{(1-2C_1)}\right]\right) \qquad (4.52)
\frac{\partial F_2}{\partial P} = \frac{\gamma C_2 \cos\alpha}{T_t P_t^{-C_1}}\left[\frac{2}{\beta}\left((1-2C_1)\,P_t^{C_1} P^{-2C_1} - (1-C_1)\,P^{-C_1}\right)\right] + n_x \qquad (4.53)
\frac{\partial F_3}{\partial P} = \frac{\gamma C_2 \sin\alpha}{T_t P_t^{-C_1}}\left[\frac{2}{\beta}\left((1-2C_1)\,P_t^{C_1} P^{-2C_1} - (1-C_1)\,P^{-C_1}\right)\right] + n_y \qquad (4.54)
\frac{\partial F_4}{\partial P} = \frac{\sqrt{2}\,\gamma C_2 \sqrt{P_t^{C_1} P^{-C_1} - 1}}{\beta\sqrt{\beta}\, T_t P_t^{-C_1}}\left[T_t P_t^{-C_1} + (1-2C_1)\,P_t^{C_1} P^{-2C_1} - (1-C_1)\,P^{-C_1}\right] - \frac{\sqrt{2}\,\gamma C_2 C_1 P_t^{C_1} P^{-C_1}}{2\beta\sqrt{\beta}\, T_t P_t^{-C_1}\sqrt{P_t^{C_1} P^{-C_1} - 1}}\left[T_t P_t^{-C_1} + P_t^{C_1} P^{-2C_1} - P^{-C_1}\right] \qquad (4.55)
Having \left.\frac{\partial F}{\partial V}\right|_{SubIn} evaluated, the subsonic inlet flux Jacobian with respect to conservative variables can be easily computed:

\left.\frac{\partial F}{\partial U}\right|_{SubIn} = \left.\frac{\partial F}{\partial V}\right|_{SubIn}\frac{\partial V}{\partial U} \qquad (4.56)
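The inlet derivatives can be spot-checked by differencing the recast flux itself; here F1 of Eq. (4.44) is checked against its derivative, Eq. (4.52). A small sketch; the sample values are illustrative assumptions and must satisfy P < P_t:

```python
import numpy as np

GAMMA = 1.4
BETA = GAMMA - 1.0
C1 = BETA / GAMMA

def F1(P, Pt, Tt, C2):
    """Mass-flux component of the subsonic-inlet flux, Eq. (4.44)."""
    a = GAMMA * C2 / (Tt * Pt**(-C1))
    inner = (2.0 / BETA) * (Pt**C1 * P**(2 - 3 * C1) - P**(2 - 2 * C1))
    return a * np.sqrt(inner)

def dF1dP(P, Pt, Tt, C2):
    """Analytic derivative of F1 with respect to static pressure, Eq. (4.52)."""
    a = GAMMA * C2 / (Tt * Pt**(-C1))
    inner = (2.0 / BETA) * (Pt**C1 * P**(2 - 3 * C1) - P**(2 - 2 * C1))
    dinner = (2.0 / BETA) * ((2 - 3 * C1) * Pt**C1 * P**(1 - 3 * C1)
                             - 2 * (1 - C1) * P**(1 - 2 * C1))
    return 0.5 * a * inner**(-0.5) * dinner
```

A central difference of `F1` should agree with `dF1dP` to well below the truncation level of the perturbation.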
4.2.2.3 Subsonic Outlet Flux Jacobian
For a subsonic outlet, section 2.4.2, three flow variables (\rho, u, v) are taken from the interior, and only the static pressure is fixed at the outlet:

F_{SubOut} = \begin{bmatrix}\rho u_n \\ \rho u u_n + P_{out} n_x \\ \rho v u_n + P_{out} n_y \\ \rho h_{t\text{-}out} u_n\end{bmatrix} \qquad (4.57)

h_{t\text{-}out} = \frac{\gamma P_{out}}{\rho(\gamma-1)} + \frac{1}{2}(u^2 + v^2) \qquad (4.58)
Since the static pressure at the outlet is treated as a constant, the last column of the outlet
flux Jacobian is filled with zeros:
\left.\frac{\partial F}{\partial V}\right|_{SubOut} = \begin{bmatrix}
u_n & \rho n_x & \rho n_y & 0 \\
u u_n & \rho(u_n + u n_x) & \rho u n_y & 0 \\
v u_n & \rho v n_x & \rho(u_n + v n_y) & 0 \\
h_{t\text{-}out} u_n - \frac{\gamma P_{out} u_n}{\rho(\gamma-1)} & \rho(u u_n + h_{t\text{-}out} n_x) & \rho(v u_n + h_{t\text{-}out} n_y) & 0
\end{bmatrix} \qquad (4.59)
and the rest is trivial:
\left.\frac{\partial F}{\partial U}\right|_{SubOut} = \left.\frac{\partial F}{\partial V}\right|_{SubOut}\frac{\partial V}{\partial U} \qquad (4.60)
4.2.2.4 Supersonic Inlet/Outlet Flux Jacobians
As discussed in section 2.4.2, we impose all flow variables directly at the supersonic inlet, and their values do not depend on the variables of the interior solution domain. Therefore the supersonic inlet flux Jacobian is simply a zero block. On the contrary, at a supersonic outlet all variables are taken from the interior domain, and the outlet flux Jacobian with respect to primitive variables is the same as Eq. (4.31).
4.3 Finite Difference Differentiation
The flux Jacobian can be computed via finite differencing, removing the issue of analytic differentiation of the flux function. Finite difference differentiation is a well-known technique for computing derivatives of a complex function using perturbations of the independent variables. Equations (4.61) and (4.62) show typical forward and central differencing formulas:
\left.\frac{\partial F}{\partial U}\right|_{FD} \equiv \frac{F(U+\varepsilon) - F(U)}{\varepsilon} - \frac{\partial^2 F}{\partial U^2}\left(\frac{\varepsilon}{2}\right) \qquad \text{Forward Differencing} \quad (4.61)

\left.\frac{\partial F}{\partial U}\right|_{CD} \equiv \frac{F(U+\varepsilon) - F(U-\varepsilon)}{2\varepsilon} - \frac{\partial^3 F}{\partial U^3}\left(\frac{\varepsilon^2}{6}\right) \qquad \text{Central Differencing} \quad (4.62)
Central differencing has a smaller truncation error, but it comes with the cost penalty of one more function evaluation, making it twice as expensive as forward differencing. If the function evaluation is expensive, which is indeed the case for residual evaluation in CFD problems, central differencing may not be the best choice.
The accuracy of the differentiation is clearly a function of ε, and it appears that a smaller ε results in a smaller error. This is true for the truncation error, but choosing a very small value of ε introduces more noise into the numerical differentiation and amplifies the existing error in the numerical function evaluation (caused by round-off error), since that error is divided by a very small number. This can be shown by a simple analysis, where the actual function evaluated by the computer, \bar{F}, is replaced by the exact value of F plus the associated round-off error E_r:
\bar{F}(U) = F(U) + E_r(U)\,, \qquad \bar{F}(U \pm \varepsilon) = F(U \pm \varepsilon) + E_r(U \pm \varepsilon) \qquad (4.63)

\left.\frac{\partial \bar{F}}{\partial U}\right|_{FD} = \left.\frac{\partial F}{\partial U}\right|_{FD} + \frac{E_r(U+\varepsilon) - E_r(U)}{\varepsilon} \qquad (4.64)

\left.\frac{\partial \bar{F}}{\partial U}\right|_{CD} = \left.\frac{\partial F}{\partial U}\right|_{CD} + \frac{E_r(U+\varepsilon) - E_r(U-\varepsilon)}{2\varepsilon} \qquad (4.65)
If (E_r)_{max} is an error bound for the round-off error, the total error in numerical differentiation for forward differencing (FD) and central differencing (CD) can be written as a combination of the truncation error and the error caused by round-off, also known as cancellation or condition error [29]:

E_{Total-FD} = \left\|\frac{\partial^2 F}{\partial U^2}\right\|\left(\frac{\varepsilon}{2}\right) + \frac{2(E_r)_{max}}{\varepsilon} \qquad (4.66)

E_{Total-CD} = \left\|\frac{\partial^3 F}{\partial U^3}\right\|\left(\frac{\varepsilon^2}{6}\right) + \frac{(E_r)_{max}}{\varepsilon} \qquad (4.67)
By examining Eq. (4.66) and Eq. (4.67), it is clear that if the magnitude of the perturbation, ε, is large, the dominant error in numerical differentiation is the truncation error. The truncation error decreases linearly for forward differencing and quadratically for central differencing as the perturbation magnitude is reduced. However, the condition error increases linearly with decreasing ε. Therefore there is an optimum value of the perturbation magnitude which minimizes the total error in numerical differentiation.
Figure (4.3) displays the total relative numerical error in finite difference differentiation versus perturbation magnitude for a sample cubic function, F = X^3 + X^2 + X + 0.1, at X_0 = 0.05. The relative error is computed by:
Figure 4.3: Total numerical error versus perturbation magnitude
\text{Error}_{total\text{-}relative} = \frac{\left|F'(X_0) - \left.\frac{\partial F(X_0)}{\partial X}\right|_{FiniteDiff.}\right|}{\left|F'(X_0)\right|} \qquad (4.68)
Clearly, there is an optimal value of ε for which the numerical differentiation error is minimum; in this case ε ≈ 10^{-9} for forward differencing and ε ≈ 10^{-6} for central differencing. A very small perturbation (i.e. ε < 10^{-9}) causes the condition error to dominate, to the point that forward and central differencing essentially produce the same total error.
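The trade-off of Eqs. (4.66)-(4.67) is easy to reproduce for the sample cubic of Figure (4.3). A short sketch, assuming double precision arithmetic:

```python
import numpy as np

def f(x):
    return x**3 + x**2 + x + 0.1

def fprime(x):
    return 3 * x**2 + 2 * x + 1

x0 = 0.05
exact = fprime(x0)

def rel_err_fd(eps):
    """Total relative error of forward differencing, as in Eq. (4.68)."""
    return abs((f(x0 + eps) - f(x0)) / eps - exact) / abs(exact)

def rel_err_cd(eps):
    """Total relative error of central differencing, as in Eq. (4.68)."""
    return abs((f(x0 + eps) - f(x0 - eps)) / (2 * eps) - exact) / abs(exact)

# Sweep the perturbation magnitude and locate the minima
eps_values = 10.0 ** np.arange(-15.0, 0.0)
best_fd = eps_values[np.argmin([rel_err_fd(e) for e in eps_values])]
best_cd = eps_values[np.argmin([rel_err_cd(e) for e in eps_values])]
```

For this function the minima land near ε ≈ 10^{-8} to 10^{-9} (forward) and ε ≈ 10^{-6} (central), consistent with the figure.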
Since the condition error is caused by round-off error, the optimum ε should be determined based on the machine precision, ε_M. The error bound for the round-off error, (E_r)_{max}, can be estimated by ε_M‖F‖. To find the optimum ε, the derivatives of Eq. (4.66) and Eq. (4.67) are taken with respect to ε and set equal to zero:
\frac{\partial E_{Total-FD}}{\partial\varepsilon} = \frac{1}{2}\left\|\frac{\partial^2 F}{\partial U^2}\right\| - \frac{2\varepsilon_M\|F\|}{\varepsilon^2} = 0 \qquad (4.69)

\frac{\partial E_{Total-CD}}{\partial\varepsilon} = \frac{\varepsilon}{3}\left\|\frac{\partial^3 F}{\partial U^3}\right\| - \frac{\varepsilon_M\|F\|}{\varepsilon^2} = 0 \qquad (4.70)
and by solving the above equations for ε, the optimum perturbations for forward and central differencing are estimated:

\varepsilon_{opt_{FD}} = \sqrt{\frac{4\varepsilon_M\|F\|}{\left\|\frac{\partial^2 F}{\partial U^2}\right\|}}\,, \qquad \varepsilon_{opt_{CD}} = \sqrt[3]{\frac{3\varepsilon_M\|F\|}{\left\|\frac{\partial^3 F}{\partial U^3}\right\|}}
Therefore, in addition to the machine accuracy, the optimum perturbation value depends on the norm (magnitude) of the function F and its higher derivatives, and on the type of numerical differentiation. There is another consideration in choosing the size of the perturbation for numerical differentiation of a higher-order function, namely the grid size or discretization length scale; this will be discussed in chapter 5. Not surprisingly, there are other practical factors in scientific computing for selecting ε, which are beyond the scope of this research. Several researchers [40, 26, 29, 61] have studied the effect of ε in numerical differentiation, and they provide some practical formulations and guidelines for choosing the best value of ε.
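Plugging the sample cubic at X_0 = 0.05 into the optimum-perturbation formulas gives values consistent with the minima of Figure (4.3). A quick check, assuming double precision:

```python
import numpy as np

eps_M = np.finfo(float).eps            # machine precision, ~2.2e-16

# Sample cubic F = X^3 + X^2 + X + 0.1 of Figure (4.3), at X0 = 0.05
x0 = 0.05
F = abs(x0**3 + x0**2 + x0 + 0.1)      # ||F||
F2 = abs(6 * x0 + 2)                   # ||d2F/dX2||
F3 = 6.0                               # ||d3F/dX3||

eps_opt_fd = np.sqrt(4 * eps_M * F / F2)           # forward differencing
eps_opt_cd = (3 * eps_M * F / F3) ** (1.0 / 3.0)   # central differencing
```

These evaluate to roughly 8e-9 and 2.6e-6 respectively, in the optimal regions of the figure.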
4.4 Numerical Flux Jacobian
In numerical Jacobian calculation using the finite difference method, the principle is unchanged; we still need to take the derivatives of the residual function for a control volume, Eq. (4.6), and compute the flux Jacobian terms through Eq. (4.7) to Eq. (4.10). However, this time, instead of analytic differentiation of the flux function at each face, Eq. (4.12), the finite difference technique is applied.
The Euler flux function includes 4 equations and 4 primitive variables. Since the flux derivatives are needed with respect to the conservative variables, it is the conservative variables that must be perturbed. The following procedure demonstrates such a flux Jacobian computation for interior faces using forward differencing:
Flux = FluxFunction(V)
for j = 1 to 4
    for i = 1 to 4
        δU[i] = ε E[j][i]
        U_P[i] = U[i] + δU[i]
    V_P = ConservativeToPrimitive(U_P)
    Flux_P = FluxFunction(V_P)
    for i = 1 to 4
        DFDU[i][j] = (Flux_P[i] - Flux[i]) / ε
where Flux is the flux vector, V is the primitive variable vector, U is the conservative variable vector, the subscript P denotes the perturbed vectors, and E(j,i) is the identity matrix:

E(j,i) = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} \qquad (4.71)
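The interior-face procedure above translates directly into code. A minimal sketch; the flux and conversion routines are assumptions standing in for the solver's own:

```python
import numpy as np

def numerical_flux_jacobian(U, flux_function, cons_to_prim, eps=1e-8):
    """Forward-difference flux Jacobian dF/dU at one face (interior case).
    `flux_function` maps primitive variables to the face flux;
    `cons_to_prim` converts conservative to primitive variables.
    Both are assumed to be supplied by the solver."""
    base_flux = flux_function(cons_to_prim(U))
    n = len(U)
    dFdU = np.zeros((n, n))
    for j in range(n):
        Up = U.copy()
        Up[j] += eps                  # perturb one conservative variable
        dFdU[:, j] = (flux_function(cons_to_prim(Up)) - base_flux) / eps
    return dFdU
```

The boundary-face version differs only in passing the perturbed primitive state through the boundary condition procedure before the flux evaluation, exactly as in the second procedure below.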
For boundary faces, exactly the same procedure is implemented, but the boundary condition implementation is also considered, both for the perturbed variables and for the flux functions:

Flux_BC = BoundaryFluxFunction(V_B)
for j = 1 to 4
    for i = 1 to 4
        δU[i] = ε E[j][i]
        U_P[i] = U[i] + δU[i]
    V_P = ConservativeToPrimitive(U_P)
    V_PB = BoundaryConditionProcedure(V_P)
    Flux_BCP = BoundaryFluxFunction(V_PB)
    for i = 1 to 4
        (DFDU)_BC[i][j] = (Flux_BCP[i] - Flux_BC[i]) / ε
Chapter 5
Linear Solver and Solution Strategy
5.1 Introduction
Up to this point, two major elements of the implicit solver (i.e. residual evaluation and Jacobian computation) have been explained. This chapter describes the iterative process used to reach the steady-state solution, including the linear system solver, preconditioning, the start-up process, and the overall solution strategy.
Linearization of the fluid flow equations over a meshed domain leads to a large sparse linear system, which needs to be solved in multiple iterations. In principle, there are two separate issues here: forming the linear system and solving the linear system. As described in section 2.2, it may be useful to start the solution process from a low CFL number (generally resulting in a small solution update) and even a low-order linearization, Eq. (2.14), especially if the initial condition is far away from the steady-state solution of the problem. This is particularly true for non-linear problems such as compressible flow, where a large solution update based on a high CFL number at an early stage of the solution process does not necessarily help the convergence; it may slow the convergence or even cause divergence. This can be better explained by Fig. (5.1), where a sample scalar function is linearized. The linearization of a function F(U) at point A, far away from the solution point C, does not provide a good slope for estimating the solution point. It also makes no meaningful difference whether the linearization at this point is exact or approximate, as both result in an inaccurate solution estimate. Now if the slope
Figure 5.1: Linearization of a sample function
of the function F(U) at point A is used for the solution update, only a small fraction of ΔU can reasonably be applied. At the same time, for a small ΔU, an approximate linearization (in our case a low-order linearization) still produces a reasonable solution update. On the other hand, when we are close enough to the solution point C, the linearization is a fair representative of the non-linear problem, and the more accurately this linearization is performed, the better the expected solution update.
Direct methods (Gaussian elimination and LU factorization) solve the linear system exactly (to within round-off error), and they were applied to CFD problems in early implicit CFD solvers [8, 89]. However, the computing cost (CPU time) and the storage requirement (memory usage) of direct methods for the large linear systems arising from CFD problems seriously limit their application. Furthermore, quite often the linearization is not exact and we are solving a non-linear problem through multiple linear iterations, so it is not beneficial from a performance point of view to solve the linear system exactly at each iteration.
Iterative methods, on the contrary, require far less memory, and can provide a good approx-
imation for the solution at a reasonable cost. It should be mentioned that the effectiveness
of iterative methods as compared with direct methods for solving linear systems is related
to the sparsity of the coefficient matrix. In general, iterative methods are more efficient
for large sparse systems. Development of iterative methods for large sparse linear sys-
tems has been a very active research area in the field of numerical linear algebra, and the
Krylov-subspace family methods have emerged as modern iterative techniques [74]. In these
techniques, a very large sparse linear system is reduced to a much smaller system through
creation of some subspace (known as Krylov subspace), and then solution of the original
large system is approximated using the constructed subspace by satisfying some optimality
criterion for the original system.
A wide variety of Krylov iterative solvers [81] are in use for various applications; the Generalized Minimal Residual (GMRES) algorithm [72], which was developed mainly for non-symmetric matrices, is employed in this research for the following reasons:
1. The linear system arising from unstructured discretization is asymmetric, both in terms of the matrix structure and the values of the entries.
2. The GMRES algorithm needs only matrix-vector multiplications, removing the issue of higher-order Jacobian matrix computation.
3. GMRES minimizes the residual of the linearized system. This implies that if we are close enough to the solution and the linearization of the non-linear system is accurate, then GMRES computes the best solution update at each iteration, making it a very suitable choice for Newton iteration.
It should be noted that the GMRES algorithm has been applied quite successfully and extensively in implicit CFD solvers since the early 1990s, and it is a well-known linear solver for CFD applications, providing a great deal of experience both in solver and preconditioning implementation.
Preconditioning is another important issue in linear system solving since no iterative method
for general large linear systems demonstrates suitable performance without proper precon-
ditioning. In principle, preconditioning transforms the original linear system to a modified
linear system making it easier to solve by an iterative technique.
5.2 GMRES Linear Solver
Like other modern iterative techniques, GMRES uses a projection process to extract an approximation to the linear system solution from a subspace. Assume A is a real n × n matrix and x is a solution vector in ℝ^n for the linear system:

Ax = b \qquad (5.1)
An approximate solution of the above system is found within a subspace K of ℝ^n. The size of the subspace is m ≪ n, and generally m constraints in the form of orthogonality conditions (a minimization problem) have to be imposed to obtain the best approximate solution. A logical candidate for imposing the orthogonality condition is the residual vector:

r = b - Ax \qquad (5.2)

The residual vector is required to be orthogonal to another subspace L of dimension m. In other words, we are looking for an approximate solution x ∈ K subject to the constraint r ⊥ L. It can be shown that if the subspace L is chosen in the form L = AK (oblique projection), imposing such an orthogonality condition minimizes the l2 norm of the residual vector over the affine space x_0 + K, where x_0 is the initial guess and the starting vector [74].
5.2.1 The Basic GMRES Algorithm
For a linear system Ax = b, GMRES [72] seeks an approximate solution x_k of the form x_k = x_0 + z_k, where x_0 is the initial solution vector and z_k is a member of the Krylov subspace (search directions), K_k ≡ span{r_0, Ar_0, A^2 r_0, ..., A^{k-1} r_0}, chosen to minimize the l2 norm of the residual vector; here r_0 = b - Ax_0 is the initial residual.
A subspace V_k ≡ {v_1, v_2, ..., v_k} is constructed using Arnoldi's algorithm (applying the Gram-Schmidt method) to compute an l2-orthonormal basis of the Krylov subspace, where v_1 = r_0/‖r_0‖. The residual vector at the kth iteration, r_k, is then l2-orthogonal to AK_k:
GMRES:
1. Use initial guess x_0 and compute r_0 = b - Ax_0; normalize the initial residual, v_1 = r_0/‖r_0‖.
2. Build the orthonormal basis (Arnoldi's algorithm):
   for j = 1, 2, ..., k until satisfied (‖b - Ax_k‖ ≤ tolerance) do:
       h_{i,j} = (Av_j, v_i), i = 1, 2, ..., j
       v_{j+1} = Av_j - Σ_{i=1}^{j} h_{ij} v_i
       h_{j+1,j} = ‖v_{j+1}‖; if h_{j+1,j} = 0, stop
       v_{j+1} = v_{j+1}/h_{j+1,j}
3. Form the approximate solution:
   x_k = x_0 + V_k y_k such that x_k minimizes ‖b - Ax_k‖.
After k iterations of Arnoldi's process, the l2-orthonormal subspace V_{k+1} and a (k+1) × k matrix \bar{H}_k are formed. \bar{H}_k is the upper Hessenberg matrix whose nonzero entries are the h_{i,j} described above, including the additional row containing the nonzero element h_{k+1,k}. The following important relation holds:

A V_k = V_{k+1}\bar{H}_k \qquad (5.3)
For the minimization, the update x_k = x_0 + z should satisfy the least-squares problem:

\min_z \|b - A(x_0 + z)\| = \min_z \|r_0 - Az\|\,, \quad z \in K_k \qquad (5.4)

By setting z = V_k y, r_0 = \beta v_1, and \beta = \|r_0\|, we have:

\|r_0 - Az\| = \|\beta v_1 - A V_k y\| = \left\|V_{k+1}\left(\beta e_1 - \bar{H}_k y\right)\right\| \qquad (5.5)

where e_1 is the first column of the (k+1) × (k+1) identity matrix. Since V_{k+1} is an orthonormal set, the least-squares problem is reduced to:

\min_y \left\|\beta e_1 - \bar{H}_k y\right\|\,, \quad y \in \mathbb{R}^k \qquad (5.6)
Considering that this is a minimization problem, Eq. (5.4), the residual at each iteration should be smaller than, or at most equal to, the residual at the previous iteration, i.e. ‖r_{k+1}‖ ≤ ‖r_k‖. Therefore the residual decreases at each iteration or, at worst, stalls. The algorithm terminates after n iteration steps in the absence of round-off error, meaning that the exact solution is computed if the size of the subspace is chosen equal to the size of the system. The iterations or steps taken in building the subspace or search directions in part 2 of the GMRES algorithm are called inner iterations. Since, by increasing the number of search directions or the subspace size, the cost of the algorithm rises linearly in memory usage and quadratically in computing time, the Krylov subspace size for large matrices is limited to m ≪ n, as was mentioned earlier. In practice, we hope that by picking a small subspace size a sufficiently accurate solution is computed; if that does not happen, we can restart the algorithm after m steps to limit its cost. However, there may be occasions when the convergence criterion for the linear system is not satisfied without an excessive number of restarts. In that case, the user normally limits the number of inner iterations and/or restarts, and moves on to the next non-linear iteration with whatever solution update was achieved in the linear solver.
The restarted version of the algorithm, GMRES(m), is as follows:

GMRES(m):
1. Use initial guess x_0 and compute r_0 = b - Ax_0; normalize the initial residual, v_1 = r_0/‖r_0‖.
2. Build the orthonormal basis (Arnoldi's algorithm):
   for j = 1, 2, ..., m do:
       h_{i,j} = (Av_j, v_i), i = 1, 2, ..., j
       v_{j+1} = Av_j - Σ_{i=1}^{j} h_{ij} v_i
       h_{j+1,j} = ‖v_{j+1}‖; if h_{j+1,j} = 0, stop
       v_{j+1} = v_{j+1}/h_{j+1,j}
3. Form the approximate solution:
   x_m = x_0 + V_m y_m, where y_m minimizes f(y) = ‖βe_1 - \bar{H}_m y‖, y ∈ ℝ^m.
4. Restart:
   Compute r_m = b - Ax_m; if satisfied (‖r_m‖ ≤ tolerance), stop;
   else set x_0 := x_m, v_1 := r_m/‖r_m‖ and go to 2.
The only possibility for breakdown of the GMRES algorithm is in the Arnoldi part, where the orthogonalization process is performed. If h_{j+1,j} becomes zero, the algorithm stops and the solution at the last iteration is returned. This happens only when the residual is zero, which means the exact solution has been reached.
The least-squares solution in step 3 can be carried out via classical QR factorization of \bar{H}_k using plane rotations. As suggested in the original article [72], it is preferable to update the factorization of \bar{H}_k progressively, as each column appears at every step of the orthogonalization process. This enables us to compute the residual norm of the approximate solution without computing x_k, and to terminate the algorithm as soon as the convergence criterion for the linear system is satisfied, without further operations. This also means the residual r_m = b - Ax_m does not need to be computed explicitly, saving the cost of one matrix-vector multiplication, since ‖βe_1 - \bar{H}_k y‖ and ‖r_m‖ are equal. Depending on the convergence criterion, the GMRES algorithm may be completed with an inner iteration count smaller than, equal to, or larger than the maximum subspace size (m). For instance, if one restart is allowed in the algorithm and a maximum subspace size of 30 is used, and the convergence criterion in part 4 is not met within the inner iterations, then 60 inner iterations are performed before completion of the algorithm.
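The restarted algorithm can be sketched compactly. This is not the thesis implementation: the small least-squares problem is solved here with a dense `lstsq` instead of progressively updated plane rotations, which is simpler but forgoes the cheap residual-norm monitoring described above:

```python
import numpy as np

def gmres_restarted(A, b, x0, m=30, tol=1e-10, max_restarts=50):
    """GMRES(m): Arnoldi basis, least-squares solve, restart on stagnation."""
    x = x0.copy()
    for _ in range(max_restarts + 1):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta <= tol:
            return x
        V = np.zeros((len(b), m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        k = m
        for j in range(m):                      # Arnoldi (Gram-Schmidt)
            w = A @ V[:, j]
            for i in range(j + 1):
                H[i, j] = w @ V[:, i]
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] == 0.0:              # breakdown: exact solution
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        e1 = np.zeros(k + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
        x = x + V[:, :k] @ y                    # x_m = x_0 + V_m y_m
    return x
```

With A supplied as a routine rather than a stored matrix, replacing `A @ v` by a finite-difference product gives the matrix-free variant discussed in the next section.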
5.2.2 Matrix-Vector Products Computation in GMRES
Considering the complexity of the higher-order discretization method used in this research,
Chapter 3, computing the higher-order Jacobian matrix is a highly expensive task in terms
of memory usage and computation cost. Even if memory were not an issue, computing
a higher-order Jacobian would be quite expensive in terms of CPU time no matter what
approach in Jacobian calculation is taken, Chapter 4.
Therefore, a linear solver in which the explicit Jacobian computation is not required, such
as GMRES, is an attractive candidate for a higher-order implicit solver. Looking back at
section 5.2.1, it is clear that in Arnoldi’s algorithm, only matrix-vector products are needed,
and these products can be easily approximated through the (forward) finite difference tech-
nique:
F (U + δU) = F (U) +∂F
∂UδU + O(δU)2 (5.7)
By choosing δU = εv (ε is a small number), Eq. (5.7) can be rewritten:
F (U + εv) = F (U) +
A︷︸︸︷∂F
∂Uεv + O(εv)2 (5.8)
and since ε is a scalar ( v is a vector), the matrix vector product, Av , is evaluated by:
Av =∂F
∂Uv ≈ F (U + εv) − F (U)
ε(5.9)
Eq. (5.9) is a first-order approximation; a more accurate approximation can be achieved by employing the central differencing approach at the extra cost of one more flux (residual) evaluation per matrix-vector product (section 4.3). If forward differencing is employed for the evaluation of the matrix-vector product, one flux evaluation is needed for each inner iteration of the GMRES algorithm.
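Eq. (5.9) is the whole of the matrix-free machinery: one extra residual evaluation per product. A one-line sketch, where `residual` stands in for the solver's residual function:

```python
import numpy as np

def matvec_free(residual, U, v, eps):
    """First-order (forward-difference) Jacobian-vector product, Eq. (5.9):
    A v ~= [F(U + eps*v) - F(U)] / eps."""
    return (residual(U + eps * v) - residual(U)) / eps
```

In practice F(U) is computed once per outer iteration and reused for every inner GMRES iteration, so only the perturbed evaluation is paid per product.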
So far no consideration has been given to the discretization order and ε, but the choice of ε can dramatically affect the correctness of the higher-order Jacobian matrix-vector products, even beyond the considerations of section 4.3. The issue here is how much the higher-order terms in the reconstruction polynomial are affected by a small perturbation in the flow field. In other words, do the higher-order terms remain arithmetically significant, or measurable, after a perturbation of size δU on a discretized domain with length scale Δx? To address this question we need to go back to Chapter 3.
The reconstruction procedure led to solving a least-squares problem, Eq. (3.14), that can be simplified in the form of Eq. (5.10):

MD = U \qquad (5.10)

where M is the coefficient matrix containing the moment information of the reconstruction stencil, D is the solution vector including the derivatives of the reconstruction polynomial, and U is the control volume average vector for the reconstruction. In Eq. (5.10), the matrix M is purely geometric and is not solution dependent; therefore, changing U to U + δU does not affect M. Now we would like to know how much the derivatives are perturbed by a change in the control volume averages:

M(D + \delta D) = U + \delta U = MD + \delta U \qquad (5.11)

and

M\delta D = \delta U \quad \text{or} \quad \delta D = M^{-1}\delta U \qquad (5.12)
By taking norms in Eq. (5.10) and Eq. (5.12), these two norm inequalities are found:

\|U\| \le \|M\|\|D\|\,, \qquad \|\delta D\| \le \left\|M^{-1}\right\|\|\delta U\| \qquad (5.13)

and finally, after multiplication of these two norm inequalities, the relative change in the derivatives can be expressed in terms of the relative perturbation in U as:

\frac{\|\delta D\|}{\|D\|} \le \underbrace{\left\|M^{-1}\right\|\|M\|}_{\kappa(M)}\frac{\|\delta U\|}{\|U\|} \qquad (5.14)
In Eq. (5.14), κ(M) is called the condition number of the matrix M, and it is evident that the change in the derivatives is a direct function of the condition number of the coefficient matrix. The matrix M is relatively small, and its entries are functions of the mesh geometry only. In our case, the quality of the employed isotropic mesh is guaranteed by the meshing software (GRUMMP [59], [19]), so the condition number of the matrix M is expected to be fairly small (O(10) for typical meshes). Therefore changes in the derivatives are of the same order as (or at most one order larger than) the perturbations in the solution.
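The bound (5.14) is easy to probe numerically: the better conditioned M is, the less a perturbation of the control-volume averages can be amplified in the derivatives. A small sketch; the 2-norm and the sample matrices are illustrative assumptions:

```python
import numpy as np

def kappa(M):
    """Condition number kappa(M) = ||M^-1|| ||M||, here in the 2-norm."""
    return np.linalg.norm(M, 2) * np.linalg.norm(np.linalg.inv(M), 2)

# Illustrative stand-ins for the moment matrix M of Eq. (5.10):
M_good = np.eye(3)                  # well-conditioned stencil, kappa = 1
M_bad = np.diag([1.0, 1.0, 1e-6])   # nearly singular stencil, kappa = 1e6

# Check the bound (5.14): ||dD||/||D|| <= kappa(M) ||dU||/||U||
rng = np.random.default_rng(2)
D = rng.standard_normal(3)                  # "derivatives"
U = M_bad @ D                               # control-volume averages
dU = 1e-8 * rng.standard_normal(3)          # perturbation of the averages
dD = np.linalg.solve(M_bad, dU)             # resulting derivative change
lhs = np.linalg.norm(dD) / np.linalg.norm(D)
rhs = kappa(M_bad) * np.linalg.norm(dU) / np.linalg.norm(U)
```

Since both inequalities in (5.13) hold for any submultiplicative norm, `lhs <= rhs` holds regardless of the sampled vectors.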
Consider a simplified 1-D cubic reconstruction polynomial which leads to the 4th-order
discretization for a mesh with the uniform length scale of ∆x, Eq. (5.15). U is perturbed
by δU and the perturbed reconstruction polynomial can be expressed by Eq. (5.16) where
the superscript p shows the perturbed coefficients:
F(U) = a₀ + a₁(∆x) + a₂(∆x)² + a₃(∆x)³ (5.15)

F(U + δU) = a₀ᵖ + a₁ᵖ(∆x) + a₂ᵖ(∆x)² + a₃ᵖ(∆x)³ (5.16)
Now we study the difference between the original reconstruction and the perturbed one:
F(U + δU) − F(U) = (a₀ᵖ − a₀) + (a₁ᵖ − a₁)(∆x) + (a₂ᵖ − a₂)(∆x)² + (a₃ᵖ − a₃)(∆x)³ (5.17)
In order to have a correct higher-order Jacobian in the matrix-vector product computed by
Eq. (5.9), all terms in Eq. (5.17) are required to be arithmetically significant; otherwise, the
overall accuracy of the linearization would be reduced and the convergence rate is adversely
affected. Of course the highest-order term, in our case (a₃ᵖ − a₃)(∆x)³, is the main issue,
and consequently:

‖a₃ᵖ − a₃‖ (∆x)³ ≥ ε_M , ε_M : machine precision (5.18)
Using Eq. (5.14) and assuming δU = εv, the bound on the perturbation of the highest
derivative is estimated, Eq. (5.19). Combining this bound with Eq. (5.18) determines the
minimum magnitude of ε for keeping the higher-order terms significant in the finite
difference matrix-vector multiplication, Eq. (5.20).
‖a₃ᵖ − a₃‖/‖a₃‖ ≤ κ(M) ε ‖v‖/‖U‖ (5.19)

ε ≥ ε_M ‖U‖ / (‖a₃‖ (∆x)³ κ(M) ‖v‖) (5.20)
Further simplification can be made if ‖U‖/(‖a₃‖ κ(M)) in Eq. (5.20) is assumed to be O(1):

ε ≥ ε_M / ((∆x)³ ‖v‖) (5.21)
Although this result has been obtained with several simplifying assumptions, it gives us a
very useful insight about the reasonable magnitude of ε for proper finite difference matrix-
vector product computation. In the case of the standard GMRES algorithm, ‖v‖ = 1 due
to the normalization process, and ε can be found based on the geometric length scale. For
a non-uniform mesh, ∆x obviously is chosen based on the smallest length scale; as an
example, if ∆x ≡ O(10⁻⁴) and ε_M ≡ O(10⁻¹⁶), then choosing ε ≈ O(10⁻⁴) or larger would
keep the higher-order terms significant. It should be noted that in the presence of the limiter,
using a large ε (perturbation) often causes limiter firing in the residual evaluation, leading to
convergence problems. Therefore using a large perturbation for flows with an active limiter
is not suggested.
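This sizing rule amounts to a one-line estimate; the helper below is only a back-of-the-envelope sketch using the example numbers from the text (∆x ~ 10⁻⁴, ε_M ~ 10⁻¹⁶, ‖v‖ = 1).

```python
# Back-of-the-envelope estimate of the minimum perturbation size of Eq. (5.21):
# eps >= eps_M / ((dx)^3 * ||v||). With dx ~ 1e-4 and eps_M ~ 1e-16 this
# recovers the eps ~ 1e-4 quoted in the text.
def min_perturbation(dx, eps_machine=1e-16, v_norm=1.0):
    return eps_machine / (dx ** 3 * v_norm)
```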
5.2.3 Preconditioning
Convergence of iterative techniques, including Krylov subspace methods, is highly depen-
dent on the conditioning of the linear system, i.e. the Jacobian matrix. Using a higher-order
discretization introduces more off-diagonal entries and increases the bandwidth of the Ja-
cobian matrix considerably. In addition, in the case of Euler equations (compressible flow),
with a non-linear flux function and possible discontinuities in the solution, the Jacobian
matrix is off-diagonally dominant. All these factors lead to poor convergence of the linear
solver, and consequently slow or stall the solution process of the non-linear problem.
To remove this obstacle and enhance the performance of the linear solver, the linear system
needs to be modified through a process called preconditioning, such that the preconditioned
system has better spectral properties and clustered eigenvalues. Although in the case of
the GMRES method, eigenvalues may not describe the convergence of nonsymmetric matrix
iterations as well as they do for symmetric iterative solvers such as the Conjugate
Gradient (CG) method, a clustered spectrum (not too close to zero) normally improves the
convergence characteristics of the linear solver [16].
Consider a preconditioner M as a nonsingular matrix which approximates the Jacobian
matrix A. Equation (5.1) then can be modified by multiplying by M−1 on both sides:
M−1Ax = M−1b (5.22)
If M−1 is a good approximation to A−1 , M−1A becomes close to the identity matrix,
increasing the performance of the linear solver through eigenvalue clustering around unity.
Equation (5.22) is called left preconditioning, which affects not only the matrix operator
but also vector b on the right hand side. It is also possible to introduce the preconditioner
operator in the right side of the matrix A, and leave the right hand intact:
AM⁻¹z = b , z = Mx , x = M⁻¹z (5.23)
This is called right preconditioning. If the GMRES algorithm is used with left precondi-
tioning, M−1(b − Axk) is minimized instead of b − Axk , resulting in a different stopping
criterion (residual norm) for the algorithm. However, right preconditioning, still minimizes
the residual norm of the original system [73, 81].
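The key property of right preconditioning — that the preconditioned residual coincides with the residual of the original system — can be illustrated on a toy 2×2 example (the matrices here are arbitrary, chosen only for the demonstration):

```python
# Tiny illustration of Eq. (5.23): with right preconditioning the residual of
# the preconditioned system, b - (A M^{-1}) z, equals the residual of the
# original system, b - A x, once x = M^{-1} z. A and M^{-1} are arbitrary.

def matvec2(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def residual_match(A, Minv, b, z):
    x = matvec2(Minv, z)                          # recover x = M^{-1} z
    r_orig = [bi - ri for bi, ri in zip(b, matvec2(A, x))]
    AMinv = [[sum(A[i][k] * Minv[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    r_prec = [bi - ri for bi, ri in zip(b, matvec2(AMinv, z))]
    return r_orig, r_prec
```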
Of course the apparent best choice for M is A itself, since then AM⁻¹ = I; but if we could
easily solve the system with the matrix A, we would never have needed preconditioning in the
first place. The optimal preconditioner matrix is not unique, since it is highly problem
dependent and also depends on how the preconditioner is applied. There are also circumstances where the
matrix M need not be computed explicitly and a preconditioner operator replaces M , like
polynomial preconditioners [81]. But in general three factors are considered in choosing a
preconditioner:
1. M is a reasonably good approximation to the coefficient matrix A.
2. M is better conditioned, more narrow banded, and less expensive to build compared
to matrix A.
3. The system Mx = z should be much easier to solve than Ax = b.
The bottom line is that the cost of constructing and applying the preconditioner should be
small relative to the cost of solving the original linear system.
For effective preconditioning, in addition to applying a good preconditioner matrix, we need
to employ a good preconditioning technique. A preconditioning technique is a sparse linear
solver itself, and it can be a direct or iterative method. Iterative stationary methods such
as successive over-relaxation (SOR) rely on the fixed-point iteration of Eq. (5.24), where B is the
fixed iteration matrix and c is a fixed vector:

xᵢ₊₁ = Bxᵢ + c (5.24)
Stationary methods are simple and easy to implement, often have parallelization advan-
tages, are low cost (memory and CPU time), and finally they are effective in damping
high frequency errors. However, they often have a restrictive stability condition, reducing
the benefits of Newton's method. This is especially true if the preconditioner matrix
is off-diagonally dominant, which is the case for compressible flow. At the same time, for relatively
large systems slow damping of the low frequency errors becomes a noticeable issue for these
techniques in the absence of a multigrid augmentation strategy.
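A minimal sketch of such a stationary iteration, in Jacobi form, on an arbitrary 2×2 diagonally dominant example:

```python
# Sketch of the stationary fixed-point iteration of Eq. (5.24) in Jacobi form:
# x_{i+1} = B x_i + c with B = -D^{-1}(L+U) and c = D^{-1} b.

def jacobi(A, b, iters=100):
    x = [0.0] * len(b)
    for _ in range(iters):
        # both components use the previous iterate (Jacobi, not Gauss-Seidel):
        # the list on the right is built before x is reassigned
        x = [(b[0] - A[0][1] * x[1]) / A[0][0],
             (b[1] - A[1][0] * x[0]) / A[1][1]]
    return x
```

The iteration converges here because the example matrix is diagonally dominant; as the text notes, for off-diagonally dominant Jacobians this kind of method degrades or diverges.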
Preconditioning    Ratio of non-zero elements (factored matrix / original matrix)
ILU-0              1
ILU-1              1.25
ILU-2              1.5
ILU-3              1.8
ILU-4              2.1
LU                 7.2
Table 5.1: Ratio of non-zero elements in factorized matrix
Another preconditioning technique is the Incomplete Lower-Upper Factorization (ILU)
method. Factoring the matrix M into two triangular matrices, M = LU , where L is a
lower triangular matrix, and U is an upper triangular matrix normally results in factored
matrices that are far less sparse than the original matrix M . As a consequence, a con-
siderable amount of memory needs to be assigned for factorization, and the factorization
process is quite expensive. One solution to this problem is incomplete factorization in which
additional nondiagonal fill-in entries are allowed only for a predefined set of locations in the
LU factorizations. In other words, we choose a non-zero pattern in advance for the elements
of the factored matrices [74]. This is called ILU-P where P is the fill-level in the factorized
matrix. P equal to zero means no fill is permitted during ILU decomposition. In ILU-0 the
factorized matrix and the original (non-factored) matrix have the same graph or non-zero
element locations. Choosing P larger than zero allows some additional fill-in in the factored
matrix improving the accuracy of factorization and the preconditioning quality. However,
increasing the fill-level comes at the expense of memory usage and extra computing cost,
imposing a restriction in increasing fill-level in practice. Table 5.1 shows the number of
non-zero elements in the factored matrix where the original preconditioning matrix is a
first-order Jacobian [52]. For instance, ILU-2 needs 50% and ILU-4 a little more
than 100% extra memory for storing the decomposed matrix in comparison with the
original matrix graph. Full LU is infeasible in practical applications because of its huge
memory requirements.
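A minimal sketch of ILU(0) — Gaussian elimination with fill allowed only on the original non-zero pattern — using dense storage for clarity (a real CFD solver would use a sparse block format):

```python
# Sketch of ILU(0): incomplete LU factorization in which updates are dropped
# at any position where the original matrix has a zero, so the factors keep
# exactly the non-zero pattern (graph) of A, as described in the text.

def ilu0(A):
    n = len(A)
    pattern = {(i, j) for i in range(n) for j in range(n) if A[i][j] != 0.0}
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if (i, k) not in pattern:
                continue
            LU[i][k] /= LU[k][k]                 # multiplier, stored in place of L
            for j in range(k + 1, n):
                if (i, j) in pattern:            # drop fill outside the pattern
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU   # unit-lower L (strict lower part) and U share one array
```

For a matrix whose pattern is already full, ILU(0) reduces to the exact LU factorization, which makes the routine easy to sanity-check.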
Both the preconditioner matrix and the preconditioning technique may be developed based on
the specific problem, but first one should have detailed knowledge of the
numerical aspects of the problem, which is not always possible; second, generally speaking,
such a preconditioning procedure is sensitive to the type of the problem. Application of
general preconditioning techniques such as ILU is more common for compressible CFD
solvers. Another modification in the ILU family is ILU(P,τ), where in addition to the
static non-zero pattern, a tolerance τ is added as a dropping criterion for entries of the
factored matrices [16]. In other words, all fill-in entries within level P are set to
zero if they do not satisfy some tolerance condition. Since the proper fill-level and
tolerance criterion in ILU(P,τ) for efficient preconditioning of the compressible flows are
highly dependent on the test case, and they are not determined uniquely, ILU(P,τ) is not
considered in this research.
Several researchers have implemented ILU methods for preconditioning of the GMRES
linear solver for compressible fluid flows [91, 14, 24, 18, 45]. Their results show that ILU
with some additional fill-in (i.e. ILU-1 and ILU-2) is a reliable and robust preconditioning strategy
for a variety of test cases, while ILU-0 fails to provide fast convergence in some cases.
Reordering is another important factor in ILU factorization. Reordering is designed to
reduce the bandwidth of a matrix. Smaller bandwidth leads to a sparser matrix, and
consequently during factorization (Gaussian elimination process) of the matrix fewer fill-in
entries will appear. Knowing that increasing the fill-level in ILU preconditioning has its
own disadvantages, reordering the original preconditioner matrix becomes essential to keep
the accuracy of preconditioning for a low fill-level factorization. Experience shows that
quite often a non-reordered preconditioner matrix with a low fill-level works poorly, while the
same fill-level factorization performs quite well when the original matrix is reordered. The
most common available reordering technique is Reverse Cuthill-McKee (RCM) [74] which
has been successfully used for ILU-P preconditioning.
In this thesis research, the GMRES linear solver, including the ILU-P factorization, is
provided by the PETSc library developed by Argonne National Laboratory [1].
5.2.4 GMRES with Right Preconditioning
In right preconditioning, AM⁻¹z = b is solved, where z = Mx. As in standard GMRES,
the objective in right-preconditioned GMRES is minimizing the residual r = b − Ax.
Although the initial residual is r₀ = b − AM⁻¹z₀, z₀ need not be computed explicitly,
and all the elements of the Krylov subspace are created without referring to z.
Right Preconditioned GMRES(m):
1. Use the initial guess x₀ and compute r₀ = b − Ax₀;
   normalize the initial residual: v₁ = r₀/‖r₀‖.
2. Build the orthonormal basis (Arnoldi's algorithm):
   for j = 1, 2, ..., m do:
      compute wⱼ = AM⁻¹vⱼ
      hᵢ,ⱼ = (wⱼ, vᵢ) , i = 1, 2, ..., j
      wⱼ = wⱼ − Σᵢ₌₁ʲ hᵢⱼvᵢ
      hⱼ₊₁,ⱼ = ‖wⱼ‖ ; if hⱼ₊₁,ⱼ = 0, stop
      vⱼ₊₁ = wⱼ/hⱼ₊₁,ⱼ
3. Form the approximate solution:
   xₘ = x₀ + M⁻¹Vₘyₘ , where yₘ minimizes f(y) = ‖βe₁ − H̄ₘy‖ , y ∈ ℝᵐ.
4. Restart:
   Compute rₘ = b − Axₘ; if satisfied (‖rₘ‖ ≤ tolerance) then stop;
   else set x₀ := xₘ , v₁ := rₘ/‖rₘ‖ and go to 2.
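As a concrete illustration, the algorithm above can be written out for a small dense system. This is a toy sketch — diagonal preconditioner, normal-equations solve for the small least-squares problem — not the PETSc implementation used in the thesis.

```python
# Illustrative right-preconditioned GMRES(m) for a small dense system.
# M^{-1} is a diagonal preconditioner, so applying it is one division per entry.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def gauss_solve(G, g):
    # direct solve of the small k x k system via Gaussian elimination
    k = len(g)
    aug = [row[:] + [g[i]] for i, row in enumerate(G)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for cc in range(c, k + 1):
                aug[r][cc] -= f * aug[c][cc]
    y = [0.0] * k
    for r in range(k - 1, -1, -1):
        y[r] = (aug[r][k] - sum(aug[r][cc] * y[cc] for cc in range(r + 1, k))) / aug[r][r]
    return y

def gmres_right(A, b, Minv_diag, m=5, tol=1e-10, max_restarts=20):
    n = len(b)
    x = [0.0] * n
    for _ in range(max_restarts):
        r = [bi - ri for bi, ri in zip(b, matvec(A, x))]
        beta = norm(r)
        if beta <= tol:
            return x
        V = [[ri / beta for ri in r]]                    # orthonormal basis v_1..
        H = []                                           # Hessenberg columns
        for j in range(m):
            z = [d * vi for d, vi in zip(Minv_diag, V[j])]   # z = M^{-1} v_j
            w = matvec(A, z)                                 # w_j = A M^{-1} v_j
            h = []
            for i in range(j + 1):                       # modified Gram-Schmidt
                hij = sum(wi * vi for wi, vi in zip(w, V[i]))
                h.append(hij)
                w = [wi - hij * vi for wi, vi in zip(w, V[i])]
            hnext = norm(w)
            h.append(hnext)
            H.append(h)
            if hnext < 1e-14:                            # happy breakdown
                break
            V.append([wi / hnext for wi in w])
        # minimize ||beta*e1 - Hbar y|| via normal equations (fine for tiny k)
        k = len(H)
        Hbar = [[H[c][ri] if ri < len(H[c]) else 0.0 for c in range(k)]
                for ri in range(k + 1)]
        G = [[sum(Hbar[ri][i] * Hbar[ri][j2] for ri in range(k + 1))
              for j2 in range(k)] for i in range(k)]
        g = [beta * Hbar[0][i] for i in range(k)]
        y = gauss_solve(G, g)
        for i in range(n):                               # x = x0 + M^{-1} V_k y
            vy = sum(V[j][i] * y[j] for j in range(k))
            x[i] += Minv_diag[i] * vy
    return x
```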
Applying the preconditioner, the Krylov subspace is built on a right-preconditioned basis:

K = span{ r₀, AM⁻¹r₀, ..., (AM⁻¹)ᵐ⁻¹r₀ } (5.25)
This time the matrix-vector operator is AM⁻¹v = Az, where z = M⁻¹v is the preconditioned
vector. Nielsen et al. [55] have suggested scaling ε in Eq. (5.9) by the RMS of the vector z
to improve convergence, since using a constant ε can result in convergence stall after a
couple of orders of magnitude of (non-linear) residual reduction:

Az = [ F(U + εz) − F(U) ] / ε , ε = ε₀ / RMS(z) (5.26)

where ε₀ is a small scalar, usually on the order of √ε_M (ε_M: machine accuracy). However,
as described in Section 5.2.2, this small scalar may need to be taken some orders of
magnitude larger than √ε_M due to mesh length scale issues.
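A sketch of this RMS-scaled, matrix-free Jacobian-vector product; the residual F below is a toy stand-in for the flux residual, used only to make the product checkable against an exact Jacobian.

```python
import math

# Finite-difference, matrix-free Jacobian-vector product of Eq. (5.26).
# F here is a toy nonlinear residual (F_i = U_i^2 - 1), NOT the flux residual;
# its exact Jacobian-vector product is 2*U_i*z_i, which lets us verify the code.

def F(U):
    return [u * u - 1.0 for u in U]

def jacobian_vector(F, U, z, eps0=math.sqrt(2.2e-16)):
    rms = math.sqrt(sum(zi * zi for zi in z) / len(z))
    eps = eps0 / rms                       # scale eps by RMS(z), Eq. (5.26)
    Up = [u + eps * zi for u, zi in zip(U, z)]
    FU, FUp = F(U), F(Up)
    return [(fp - f) / eps for fp, f in zip(FUp, FU)]
```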
5.3 Solution Strategy
The goal of all CFD solver developers is to reduce the cost of computation as well as to
improve the accuracy of the solution. The latter can be achieved by applying higher-order
discretization methods. In a practical sense, however, that is possible only if the cost
of the computation is comparable to the cost of common second-order methods. That
is why implicit techniques (the Newton family of methods) play a crucial role in achieving that
goal: the best explicit techniques are far less efficient in terms of computational cost,
making higher-order computation quite uncompetitive. In addition to the computational cost,
i.e. CPU time, memory usage is also an issue for solver developers. But in most cases, CPU
time is the main concern, since if storing a set of data (e.g. the Jacobian matrix) is expensive,
then computing that data is expensive too, except when the expensive
computation is performed once and the result is reused over multiple iterations.
Knowing the overall approach (the implicit formulation) and the main objective (the
steady-state solution), we would like to lay out an efficient strategy to reach the
steady-state solution as fast as possible.
The overall computing cost for a steady-state CFD problem by an implicit approach, can
be analyzed by dividing the solution process into three major parts:
1. Forming the linear system including linearization and the non-linear residual compu-
tation.
2. Solving the linear system including preconditioning.
3. Number of implicit iterations needed for steady-state convergence.
It is obvious that we would like to minimize the cost of the first and second part of the
solution process and reach the steady-state solution with minimum number of iterations.
The third part has a cumulative effect on the overall solution cost, since the first two
parts are performed once in each (non-linear) iteration.
Most CFD problems, including compressible flows, are highly non-linear, which makes them
difficult to solve via linearization. In addition to non-linearity in the mathematical
sense, the behavior of a compressible flow (the physics of the flow) can dramatically change
in some flow conditions for a fixed geometry. Therefore if the solution process is started from
an arbitrary initial condition which is not reasonably close to the steady-state solution, large
updates to the solution based on linearization are not likely to be helpful. Also, Newton’s
method is well known to converge to the solution quadratically if started from the vicinity
of the solution; otherwise it will diverge or stall quickly. Since finding a good approximate
solution in general (that is except the cases for which some close solution is available through
analytical or experimental data) is impossible without actually starting to solve the problem,
the solution process should consist of two phases:
1. Start-up phase
2. Newton phase
In the start-up phase, multiple linearizations with small or moderate solution updates are
required to advance the solution to the point that the linearization is accurate enough for
finding the steady-state solution, i.e. a good approximate solution as an initial starting point
for the Newton phase. Having a good initial guess, the solution process is switched to the
Newton iteration where superlinear or quadratic convergence is possible by taking a very
large (or infinite) time step.
5.3.1 Start-up Phase
Finding a good initial guess or reasonable approximate solution is about knowing the physics
of the problem and finding a proper way to reach that good initial solution faster by
employing various techniques such as mesh sequencing, multigrid, mixed explicit/implicit
iterations, the exact solution of a simplified problem, a potential flow solution, and so on.
Hence, like preconditioning, start-up is problem dependent and is more an art than an exact
science, implying that the proper start-up process is not unique either. In this research
a proper start-up process for compressible flows is suggested which is based on the defect
correction procedure [46].
In the start-up process, considering the fact that multiple iterations are required for advancing
the solution toward a good initial guess, an implicit iterative process is used in which the
linearization is performed based on the inexpensive first-order discretization, Section 4.2,
while the flux calculation remains higher-order:

( I/∆t + ∂R/∂U|ⁿ_1st ) ∆Uⁿ⁺¹ = −R_High(Uⁿ) , Uⁿ⁺¹ = Uⁿ + ω∆Uⁿ⁺¹ (5.27)
A relaxation factor, ω, is applied to the solution update; ω = 1 results in standard defect
correction, while using different values of ω (typically ω = 0.9-1.3) can over-relax or under-
relax the update [52]. Over-relaxation is often helpful in subsonic flows, where it accelerates
the solution, while under-relaxation can prevent divergence resulting from an inaccurate
solution update, for instance in transonic cases.
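The character of defect correction — an accurate residual driven by an inexact Jacobian — can be seen in a scalar analogue; the problem and the deliberately wrong slope below are illustrative only, not the thesis solver.

```python
# Scalar analogue of the defect-correction iteration of Eq. (5.27): the update
# uses an approximate ("first-order") Jacobian while the residual is the
# accurate ("higher-order") one. Root of R(u) = u^3 - 8 is u = 2.

def defect_correction(u0, omega=1.0, iters=60):
    u = u0
    for _ in range(iters):
        R_high = u ** 3 - 8.0       # accurate residual
        J_approx = 4.0 * u * u      # deliberately inexact Jacobian (exact: 3u^2)
        du = -R_high / J_approx
        u += omega * du             # relaxed update, omega as in Eq. (5.27)
    return u
```

Convergence is only linear (each step reduces the error by a fixed factor), mirroring the text's point that an inexact linearization does not permit large Newton-like steps but still drives the accurate residual to zero.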
The Jacobian matrix on the left hand side is the first-order Jacobian and its computation
cost is approximately the same as the cost of the second-order residual calculation. The
cost of one Jacobian evaluation is 0.6-0.7 of the cost of a second-order residual evaluation
(limiting cost excluded) for the approximate analytical Jacobian and it is 1.3-1.5 of the
computation cost of the same residual evaluation if the finite-difference Jacobian is employed
(for moderate mesh sizes).
To reduce the cost of forming the linear system, the Jacobian matrix may be computed
and stored for several iterations [39]. With this approach the Jacobian is updated every j
iterations (typically j = 1-4) [52].
The time step ∆t, which is actually scaled by the volume (area) of each control volume, is
based on the CFL number times the maximum allowable explicit time step:

∆tᵢ = CFL ∆tᵢ₋ₘₐₓ / Aᵢ (5.28)

∆tᵢ₋ₘₐₓ = Aᵢ / ∮_CVᵢ λₘₐₓ ds , λₘₐₓ : maximum wave speed (5.29)
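As written in Eqs. (5.28)-(5.29), the control-volume area cancels, reflecting the area scaling of the I/∆t term; a sketch with made-up face data:

```python
# Local time step per Eqs. (5.28)-(5.29) for one control volume; the face
# data (wave speed, face length) below are made up for illustration.

def local_time_step(area, faces, cfl):
    # faces: list of (lambda_max, ds) pairs around the control volume
    dt_max = area / sum(lam * ds for lam, ds in faces)   # Eq. (5.29)
    # Eq. (5.28): the area-scaled time step; note the area cancels, since
    # the I/dt term in Eq. (5.27) is itself scaled by the control volume
    return cfl * dt_max / area
```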
As was noted before and also has been graphically illustrated for a sample non-linear func-
tion, Fig (5.1), using the slope of the residual function for estimating the behavior of the
residual function over a large span of the independent variables is not accurate in the early
stage of the solution process. Therefore it makes sense to start from a relatively small ∆t
or CFL, and to gradually increase CFL to some modest number as we are getting closer to
the solution of the residual function, i.e. F(U) = 0. This justifies applying the first-order
linearization in the start-up process instead of the correct-order Jacobian calculation; the
linearization does not allow large steps toward the steady-state solution, so why spend
too much work computing it, especially if implementation of the Jacobian in the linear
solver requires several matrix-vector multiplications? At the same time, solving a linear
system based on the first-order Jacobian is much easier than doing so for a higher-order
Jacobian (even the second-order one), because of the structure of the higher-order Jacobian
matrix. Therefore we are better off forming and solving a less expensive system, since we
must inevitably perform several iterations in the start-up phase.
The next step is solving the linear system. Since we are using an approximate linearization,
it does not make sense to solve the approximate linear system exactly, therefore the linear
system in each defect correction iteration, referred to as pre-iteration from now on, is solved
approximately. The goal of each pre-iteration is to reduce the non-linear higher-order residual
by cheap linearization. Consequently it is logical to solve the linear system only up to some
fraction of the non-linear residual, i.e. tolerance = C × Res(U), C ≪ 1. Employing this
approach saves us from spending too much effort on finding a precise solution to a linear
system that we know will not give us the correct solution to the non-linear problem. The
right-preconditioned GMRES(m) algorithm is used with a limited subspace size (K = 15-20),
and a relatively loose tolerance (0.1-0.05 Res(U)) is set for solving the linear system. No
restart is allowed, and if the tolerance is not reached within the maximum search directions
(subspace size) GMRES is terminated anyway and the computed solution update up to that
point is taken to move to the next non-linear outer iteration. The preconditioner matrix is
the same as the linear system matrix, which is the perfect choice for preconditioning: the
first-order Jacobian has already been built for forming the linear system, and it is the
most accurate preconditioner matrix in that case. To reduce the preconditioning cost, a
low fill-level incomplete factorization, ILU(1), is applied, which is accurate and efficient
enough for this first-order linear system.
Now the question is how close we should be to the solution before switching to Newton
iterations. Normally the decrease of some norm of the non-linear residual at the current
iteration compared to the initial residual, ‖Res(U)‖₀/‖Res(U)‖ₙ, is chosen for that purpose.
That is, if the solution reaches a specific relative norm of the non-linear residual, then the
linearization is accurate enough to take a very large time step at each iteration. It should
be noted that the value of this criterion varies from problem to problem. However, it is
possible to find reasonable values for different categories of compressible flows (i.e. subsonic,
supersonic and transonic). More detail is provided in that regard in the results chapter.
5.3.2 Newton Phase
By this stage most of the transient behaviors of the non-linear CFD problem have been
removed from the flow field solution, and major steady features of the fluid flow have
appeared in the solution domain. Consequently the linearization of the non-linear residual
is fairly accurate for seeking the steady-state solution, and taking infinite or very large time
steps accelerates the solution toward quadratic or superlinear convergence, i.e. switching to
the Newton iteration formula:
∂R/∂U|ⁿ_High ∆Uⁿ⁺¹ = −R_High(Uⁿ) , Uⁿ⁺¹ = Uⁿ + ∆Uⁿ⁺¹ (5.30)
To have an accurate linearization, the higher-order Jacobian must be applied during the
Newton iterations, Eq. (5.30). This has been implemented through matrix-free GMRES,
using the finite difference directional matrix-vector multiplication technique. The linear
system is right preconditioned by the first-order Jacobian, which is indeed essential due
to the poor conditioning of the left hand side. ILU(P = 2-4) is used for preconditioning;
normally for higher-order computation (especially fourth-order transonic flow), increasing
the fill-level is considerably beneficial [52]. A fixed number of search directions is employed
(K = 30), and it is important to keep this limited since the matrix-free GMRES performs
multiple perturbations of the non-linear residual, which is very expensive for higher-order
residual computation. The linear system is again solved approximately, but this time with
a tighter tolerance (10⁻² Res(U)) as an accurate update is required. No restart is allowed,
and if the tolerance is not reached, the next outer iteration starts with the best update
from the last inner GMRES iteration. Approximately solving the linear system in this way
is called the inexact Newton method, and although it can increase the number of non-linear
outer iterations, it reduces the overall CPU time, as considerable computation time is saved
by not computing the exact solution of the linear system [64, 18, 45, 51]. Notice that the
left hand side is not quite exact because of the truncation error in the matrix-vector
multiplications and is possibly polluted by round-off error, especially for the higher-order
terms. Therefore it is preferable to give up quadratic convergence rather than increase the
solution CPU time.
Chapter 6
Results(I): Verification Cases
The first step in accuracy verification of an unstructured CFD solver is to verify the accu-
racy of the reconstruction procedure. For this purpose two complete sets of the reconstruc-
tion tests including straight and curved boundaries are presented here. Then to study the
correctness, basic performance and solution accuracy of the proposed higher-order unstruc-
tured Newton-Krylov solver, smooth subsonic and supersonic cases with known solutions
have been investigated. This chapter is devoted to numerical study of these test cases.
6.1 Reconstruction Test Cases
The objective here is to reconstruct a test function over a domain with different mesh
resolutions, and to measure the error between the reconstruction of the test function and
the exact value of the test function on the discretized meshes. Two different domains have been
chosen for examining the overall accuracy of the reconstruction procedure. The first test
case is a unit square (a case with straight edges) and the second test case is an annulus
which has two curved edges. The test function is a smooth analytical function described by
Eq. (6.1).
f(x, y) = 1.0 + 0.1 sin(πx) sin(πy) (6.1)
The first step in reconstruction is computing the mean value for a control volume, implying
that we need to be able to integrate a given function over any control volume (including
boundary triangles with curved edges) at least as accurately as the nominal discretization
CHAPTER 6. RESULTS(I): VERIFICATION CASES 91
2nd-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/460 CVs 0.0001667 — 0.0002948 — 0.0017002 —
Mesh 2/1948 CVs 4.551149e-5 3.66 8.73185e-5 3.37 0.0004369 3.89
Mesh 3/7856 CVs 1.15364e-5 3.94 2.27180e-5 3.84 0.0001099 3.97
Mesh 4/31668 CVs 2.92207e-6 3.95 5.81718e-6 3.9 2.75408e-5 3.99
Table 6.1: 2nd-order error norms for the square case
order. The integration of a given function over the area of a control volume is carried
out using a Gauss quadrature rule, Eq. (6.2), where Wⱼ is the Gauss weight. While this
procedure looks straightforward, determining Gauss point locations and their associated
weights robustly for higher-order integration over an area with curved edges has proved to
be a nontrivial task [95].
∫_CVᵢ f(x, y) dA = Σⱼ₌₁ⁿ Wⱼ f(xⱼ, yⱼ) (6.2)
In this research, a 6th-order integration routine using 7 Gauss points is employed for the
square case, and a 4th-order integration routine using 10 Gauss points (developed by
Dr. Carl Ollivier-Gooch in the ANSLib framework [58]) for the annulus case. All the grids
are generated using GRUMMP Version 0.3.2 [59], which generates high-quality triangular
meshes for domains with curved boundaries [19].
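For illustration, control-volume averaging via Eq. (6.2) can be sketched with the classical 3-point edge-midpoint rule for straight-sided triangles (exact for quadratics) — a lower-order stand-in for the 6th- and 4th-order rules actually used:

```python
# Gauss quadrature over a straight-sided triangular control volume, Eq. (6.2),
# using the classical 3-point edge-midpoint rule: weights are A/3 at the three
# edge midpoints, exact for polynomials up to degree 2.

def triangle_area(p1, p2, p3):
    return abs((p2[0] - p1[0]) * (p3[1] - p1[1])
               - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2.0

def integrate_triangle(f, p1, p2, p3):
    A = triangle_area(p1, p2, p3)
    mids = [((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2),
            ((p2[0] + p3[0]) / 2, (p2[1] + p3[1]) / 2),
            ((p3[0] + p1[0]) / 2, (p3[1] + p1[1]) / 2)]
    return A * sum(f(x, y) for x, y in mids) / 3.0
```

Dividing the result by the area gives the control volume average used as the starting point of the reconstruction.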
6.1.1 Square Test Case
Four unstructured meshes, shown in Fig (6.1), were generated for a unit square domain. The
density of control volumes increases by a factor of four at each mesh level with respect to the
previous one. The mesh length scale is not uniform, with control volumes at the corners clearly
having larger length scales than control volumes in the interior of the square domain.
The error norms and the ratios of error norms are tabulated in Table 6.1 through Table 6.3.
Despite the non-uniform mesh distribution, all the error norm ratios for Mesh 4 confirm
the nominal accuracy of the reconstruction. Fig (6.2) displays the Error-Mesh plot for the
square case. Notice that the error norm is reduced considerably by increasing the order
of accuracy, and the slope of the graph confirms the reconstruction accuracy. Asymptotic
convergence of the error norms is clearly observed by examining the tabulated data and
Fig (6.2) for this series of meshes.
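Since each refinement roughly quadruples the number of control volumes (h → h/2), the "Ratio" columns in Tables 6.1-6.3 translate into an observed order of accuracy as log₂(ratio); a one-line helper:

```python
import math

# Observed order of accuracy from the error-norm ratio between successive
# meshes: for a refinement factor of 2 in length scale, order = log2(ratio).
def observed_order(ratio, refinement=2.0):
    return math.log(ratio) / math.log(refinement)
```

For example, the 4th-order L1 ratio of 15.7 on the finest mesh pair (Table 6.3) corresponds to an observed order of about 3.97, consistent with the nominal accuracy.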
[Figure: four mesh panels — (a) 460, (b) 1948, (c) 7876, (d) 31668 control volumes]
Figure 6.1: Unstructured meshes for a square domain
3rd-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/460 CVs 3.54269e-5 — 4.82035e-5 — 0.0002061 —
Mesh 2/1948 CVs 5.18616e-6 6.83 7.59731e-6 6.34 3.47214e-5 5.9
Mesh 3/7856 CVs 6.92254e-7 7.50 1.01755e-6 7.46 4.71759e-6 7.36
Mesh 4/31668 CVs 8.96936e-8 7.70 1.31888e-7 7.70 6.09472e-7 7.74
Table 6.2: 3rd-order error norms for the square case
4th-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/460 CVs 3.98124e-6 — 9.87666e-6 — 0.0001084 —
Mesh 2/1948 CVs 3.30234e-7 12.06 8.20473e-7 12.03 8.39794e-6 12.90
Mesh 3/7856 CVs 2.11865e-8 15.60 5.32239e-8 15.40 5.49060e-7 15.30
Mesh 4/31668 CVs 1.35160e-9 15.70 3.40196e-9 15.65 3.4699e-8 15.82
Table 6.3: 4th-order error norms for the square case
[Figure: L1 norm of the error vs. number of control volumes; measured slopes 1.98 (2nd-order), 2.94 (3rd-order), 3.97 (4th-order)]
Figure 6.2: Error-Mesh plot for the square case
2nd-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/427 CVs 0.00087885 — 0.00143183 — 0.0091068 —
Mesh 2/1703 CVs 0.0001974 4.45 0.0003387 4.23 0.0035145 2.59
Mesh 3/6811 CVs 4.61658e-5 4.28 8.01761e-5 4.22 0.0008615 4.08
Mesh 4/27389 CVs 1.09032e-5 4.23 1.85523e-5 4.32 0.0002311 3.72
Table 6.4: 2nd-order error norms for the annulus case
3rd-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/427 CVs 0.00037211 — 0.00051522 — 0.0020959 —
Mesh 2/1703 CVs 4.86257e-5 7.65 7.05354e-5 7.3 0.0003919 5.34
Mesh 3/6811 CVs 6.15972e-6 7.89 8.83369e-6 7.98 6.93723e-5 5.65
Mesh 4/27389 CVs 7.53171e-7 8.17 1.05894e-6 8.34 9.78171e-6 7.09
Table 6.5: 3rd-order error norms for the annulus case
6.1.2 Annulus Test Case
This is the case that includes curved boundaries. The domain inside an annulus is dis-
cretized with four mesh densities. The Geometry and meshes are shown in Fig (6.3). Again
refinement is performed using global parameters and except for the center portion of the
annulus, refinement has been carried out relatively uniformly. The boundary region con-
trol volumes which normally introduce a major part of the reconstruction error still remain
larger than the interior control volumes. The error norms and their ratio for all orders of
reconstruction are presented in Table 6.4 to Table 6.6. l1 and l2 norms have converged
after 3 level refinement. l∞ norm has not converged as fast as the other two norms. Since
l∞ is the local error indicator and presents the maximum error in the reconstruction, the
slower convergence rate for l∞ often is expected. The nominal order of accuracy for l1 and
l2 are achieved through the employed 4 level meshes. The ratios for l∞ are also fairly close
to the nominal order of accuracy. Fig(6.4) demonstrates the reduction in l1 norm of the
error versus control volume density. The accuracy of the reconstruction also can be verified
by measuring the slope of this Error-Mesh plot. In addition, such a plot can give us a
good estimation of the average solution error for a certain mesh density. In other words
we can determine how many control volumes do we need to reach a certain level of error
based on the chosen discretization order. It is interesting to note that the l1 norm of the
reconstruction error on the finest mesh (for this test function) is 15 times smaller for the
3rd-order discretization and more than 500 times smaller for the 4th-order discretization
than the error for the 2nd-order discretization.
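Because each refinement quadruples the number of control volumes, the mesh length scale roughly halves, and the observed order of accuracy follows from the error ratio between consecutive meshes. A minimal sketch of this calculation (the norms are copied from Table 6.5; the helper name is illustrative):

```python
import math

# l1 reconstruction-error norms from Table 6.5 (3rd-order, annulus case)
l1_norms = [0.00037211, 4.86257e-5, 6.15972e-6, 7.53171e-7]

def observed_order(e_coarse, e_fine, refinement=2.0):
    """Order p recovered from e_coarse / e_fine ≈ refinement**p."""
    return math.log(e_coarse / e_fine) / math.log(refinement)

for ec, ef in zip(l1_norms, l1_norms[1:]):
    print(f"ratio = {ec / ef:5.2f}, observed order ≈ {observed_order(ec, ef):4.2f}")
```

For the tabulated ratios of roughly 7.7 to 8.2, the observed orders come out close to the nominal value of 3.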
CHAPTER 6. RESULTS(I): VERIFICATION CASES 95
Figure 6.3: Unstructured meshes for a curved domain (annulus); (a) 427, (b) 1703, (c) 6811, and (d) 27389 control volumes
4th-order L1 Ratio L2 Ratio L∞ Ratio
Mesh 1/427 CVs 8.03432e-5 — 0.00013598 — 0.0016860 —
Mesh 2/1703 CVs 5.87669e-6 13.67 1.11776e-5 12.16 0.0001678 10.04
Mesh 3/6811 CVs 3.64156e-7 16.14 7.14849e-7 15.63 1.42043e-5 11.82
Mesh 4/27389 CVs 2.09949e-8 17.3 3.88537e-8 18.39 1.15369e-6 12.31
Table 6.6: 4th-order error norms for the annulus case
Figure 6.4: Error-Mesh plot for the annulus case (l1 norm of the error versus number of control volumes; measured slopes 2.08, 3.03, and 4.11 for the 2nd-, 3rd-, and 4th-order reconstructions)
Figure 6.5: Circular domain over half a cylinder, Mesh 1 (1376 CVs)
6.2 Subsonic Flow Past a Semi-Circular Cylinder
The smooth inviscid flow over a semi-circular (half) cylinder with R = 1, at M∞ = 0.3, is
computed. The flow direction is from left to right and the angle of attack is zero. The
far-field boundary is located at R = 300 to make sure that all perturbations are damped
effectively (Fig (6.5)). Three different meshes, Mesh 1 (1376 CVs), Mesh 2 (5539 CVs), and
Mesh 3 (22844 CVs), are generated for this purpose; close-ups are shown in Fig (6.6) to Fig
(6.8). A special effort has been made in the grid refinement process to achieve
relatively self-similar refinement throughout the whole solution domain. Each mesh is nearly
four times denser than the immediately coarser mesh, implying that the mesh length scale
in all parts of the domain has been reduced by approximately a factor of two (section (3.3)).
Figure 6.6: Circular cylinder, Mesh 1 (1376 CVs)
Mesh  CVs    Max A_CVi   Min A_CVi    √(Amax/Amin)  √((Amax)coarse/(Amax)fine)  √((Amin)coarse/(Amin)fine)
1     1376   1475.98     0.00360815   639.6         —                           —
2     5539   516.15      0.000853626  777.6         1.69                        2.06
3     22844  125.98      0.000240918  723.2         2.02                        1.88
Table 6.7: Sizes and ratios of the control volumes for circular cylinder meshes
Table 6.7 summarizes information about the sizes of the control volumes at each mesh
level. The length scale of each control volume can be taken to be proportional to the
square root of the control volume area. From the tabulated data it is clear that
each mesh level contains a wide variety of length scales, which change greatly across the
solution domain. The table also shows the approximate refinement factor from each coarse
level to the immediately finer level. Even assuming that the refinement is done uniformly (which
is not the case in reality), the length scale of the mesh is not exactly reduced by a factor
of two in each refinement. Since we do not have a uniform mesh to start with, we cannot
uniformly refine the mesh and reduce the length scale of the control volumes in an
orderly fashion throughout the whole solution domain (especially in boundary regions), so it is not
too surprising that the measured order of accuracy will be somewhat away from the
nominal expected order, section (3.3).
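The refinement factors in Table 6.7 can be reproduced directly from the tabulated areas, taking √A as the control-volume length scale. A small sketch (the numbers are copied from the table):

```python
import math

# (CVs, max area, min area) for the three circular-cylinder meshes (Table 6.7)
meshes = [(1376, 1475.98, 0.00360815),
          (5539, 516.15, 0.000853626),
          (22844, 125.98, 0.000240918)]

for (_, amax_c, amin_c), (_, amax_f, amin_f) in zip(meshes, meshes[1:]):
    # ratio of coarse-to-fine length scales, using sqrt(area) as length scale
    r_max = math.sqrt(amax_c / amax_f)
    r_min = math.sqrt(amin_c / amin_f)
    print(f"largest CVs refined by {r_max:.2f}, smallest by {r_min:.2f}")
```

The ratios land near, but not exactly at, the ideal factor of two, which is the point made in the text.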
Figure 6.7: Circular cylinder, Mesh 2 (5539 CVs)
Figure 6.8: Circular cylinder, Mesh 3 (22844 CVs)
The potential (incompressible) flow solution is used as an initial solution for all meshes
and orders of accuracy. Since the solution is started from a good initial guess, no start-up
procedure is needed. Newton iterations are performed using matrix-free GMRES with a
maximum Krylov subspace size of 30. However, for the sake of robustness, we keep the
I/∆t term and apply the time-advance formula with a large CFL number, Eq. (2.12). To
reduce the cost of the inner iterations, where multiple residual perturbations are required,
a subspace of 20 is employed for the early outer iterations. When the l2 norm of the non-linear
residual drops below 5 × 10−9, the subspace of 30 is used. This is especially helpful for
the 4th-order case, where the residual computation is very expensive. The tolerance of the
linear system solver is chosen to be 1 × 10−2 of the l2 norm of the non-linear residual, which
often is not reached within the allowable inner iterations. The convergence criterion for the
steady-state solution is 1 × 10−12 for the l2 norm of the non-linear residual. For
Mesh 1 and Mesh 2, CFL=5000 for the 2nd- and 3rd-order discretizations and CFL=1000 for the
4th-order discretization are used when starting the solution from the initial guess, and the CFL
is increased to 10,000 once the non-linear residual has been reduced by one order of magnitude. For Mesh 3
the starting CFL is 100 for all orders of accuracy; it is increased gradually to CFL=5000
and, after a one-order reduction in the residual, to CFL=10,000. The preconditioner
matrix for all cases is the approximate analytic Jacobian and the preconditioning strategy
is ILU(4).
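Keeping the I/∆t term amounts to solving a diagonally shifted Newton system at each outer iteration, where the shift shrinks as the CFL number (and hence ∆t) grows. A schematic dense-matrix sketch of one such update (the actual solver uses ILU-preconditioned matrix-free GMRES rather than a direct solve; names here are illustrative):

```python
import numpy as np

def pseudo_transient_step(jac, res, dt):
    """One implicit update: solve (I/dt + J) du = -R(u).

    The I/dt term adds diagonal dominance for robustness at finite CFL;
    as dt -> infinity the update approaches a pure Newton step.
    """
    n = res.size
    lhs = np.eye(n) / dt + jac
    return np.linalg.solve(lhs, -res)

# for a linear residual R(u) = A (u - u*), a very large dt gives the Newton
# step that lands on u* in a single update
A = np.diag([2.0, 4.0])
target = np.array([1.0, 2.0])
du = pseudo_transient_step(A, A @ (np.zeros(2) - target), dt=1e12)
print(du)  # ≈ target
```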
The solution convergence histories for the coarse mesh (Mesh 1) and the fine mesh (Mesh 3)
are shown in Fig (6.9) and Fig (6.10). As is evident, full convergence is achieved, but
the CPU time for the 4th-order case is much larger than for the other two orders of accuracy.
This was partially expected, since the 4th-order discretization requires more operations,
and the cost of the reconstruction rises quadratically with the discretization order.
However, a considerable part of the computing cost is due to a noticeable increase in the number of outer
iterations. Furthermore, the linear system arising from the 4th-order discretization
is ill-conditioned and difficult to solve, which results in relatively ineffective linearization of
the non-linear problem (considering the fixed subspace size and the inexact Newton approach),
demanding more outer iterations. In contrast, both the 2nd- and 3rd-order cases converge
quickly, displaying the effectiveness of the linearization.
Pressure coefficient contours (banded) for all orders of accuracy (Mesh 1) are shown in
Fig (6.11) to Fig (6.13). The quality of the contours, which reflects the smoothness of
the reconstructed solution, is visibly improved by increasing the order of accuracy of the
solution1. The difference in contour quality is clearer in regions where control
1Visualization is done by Tecplot 360 using a 3-node triangle finite element format; consequently, although the solution is higher-order, its visual representation is only second-order.
Figure 6.9: Convergence history for the coarse mesh (Mesh 1)
Figure 6.10: Convergence history for the fine mesh (Mesh 3)
Figure 6.11: 2nd-order pressure coefficient contours, Mesh 1
Figure 6.12: 3rd-order pressure coefficient contours, Mesh 1
Figure 6.13: 4th-order pressure coefficient contours, Mesh 1
volumes are quite coarse. The 4th-order pressure coefficient contours over Mesh 3 are also
shown in Fig (6.14) as a reference for quality comparison with the previously mentioned
cases. Since the flow is inviscid and the geometry is symmetric, the pressure contours are
symmetric with respect to the normal axis passing through the center of the cylinder. The front and
rear stagnation regions, where the maximum pressure occurs (Cp = 1.0), and the suction peak
on the top of the cylinder are visible in the contours.
The pressure coefficient distribution along the surface of the cylinder (Gauss point data over
the circle) is shown in Fig (6.15) to Fig (6.17). In terms of the solution, all discretization
orders predict nearly the same pressure coefficient over the cylinder. In general, for
a smooth flow this is what is expected, since results should differ only within the order of
the mesh size. In the 2nd-order case, when the mesh is coarse, the suction peak is neither
recovered smoothly nor located quite at the top of the cylinder; however, these issues
are considerably improved for the 3rd- and 4th-order solutions over the same mesh. The 4th-
order case over the coarse mesh shows a slight over-prediction both at the front stagnation
point and at the suction peak compared to the same solution over the fine mesh. Finally,
the solution over the fine mesh for all orders of accuracy is displayed in Fig (6.17), showing
that all of them have essentially converged to the same pressure distribution, such that the
difference between them is not visible to plotting accuracy.
Figure 6.14: 4th-order pressure coefficient contours, Mesh 3
Figure 6.15: Pressure coefficient along the axis
Figure 6.16: Close-up of the pressure coefficient along the axis (suction region)
Figure 6.17: Pressure coefficient along the axis, Mesh 3
2nd-order L1 Ratio L∞ Ratio
Mesh 1/1376 CVs 0.000553 — 0.013409 —
Mesh 2/5539 CVs 9.4548e-5 5.85 0.003713 3.61
Mesh 3/22844 CVs 2.0492e-5 4.61 0.00129 2.88
Table 6.8: Error norms for total pressure, 2nd-order solution
3rd-order L1 Ratio L∞ Ratio
Mesh 1/1376 CVs 0.000426 — 0.010250 —
Mesh 2/5539 CVs 3.5353e-5 12.05 0.002030 5.05
Mesh 3/22844 CVs 6.0965e-6 5.8 0.000543 3.74
Table 6.9: Error norms for total pressure, 3rd-order solution
As the flow is inviscid and subsonic (shock free), the total pressure of the flow should be
preserved everywhere, and it equals the total pressure of the free stream at the far field.
Therefore, deviation of the total pressure ratio from one can be interpreted as an error
indicator for this type of flow (E = Pt/Pt∞ − 1). The discrete norms of the errors in total pressure
ratio of the control volume averages are computed for all orders of accuracy and meshes and
are tabulated in Table 6.8 to Table 6.10. The l1 norm, a global measure of the error,
is roughly half an order below the nominal order of the discretization for the 3rd-order case,
and is about right for the 4th-order discretization. The l∞ norm, a local measure of
the maximum error, follows the same trend and is about one order below the nominal
order of the discretization for the 3rd-order solution.
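The error indicator E = Pt/Pt∞ − 1 can be evaluated per control volume from the averaged state using the standard isentropic total-pressure relation. A sketch, assuming γ = 1.4 and illustrative function names:

```python
GAMMA = 1.4

def total_pressure(p, mach):
    """Isentropic total pressure from static pressure and Mach number."""
    return p * (1.0 + 0.5 * (GAMMA - 1.0) * mach**2) ** (GAMMA / (GAMMA - 1.0))

def total_pressure_error(p, mach, p_inf, mach_inf):
    """Deviation of the total-pressure ratio from one: E = Pt / Pt_inf - 1."""
    return total_pressure(p, mach) / total_pressure(p_inf, mach_inf) - 1.0

# at the free-stream state itself the indicator is exactly zero
print(total_pressure_error(1.0, 0.3, 1.0, 0.3))  # → 0.0
```

Any nonzero value of the indicator in a shock-free inviscid computation is numerical error, which is what Tables 6.8 to 6.10 quantify.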
The total pressure error is also shown in Fig (6.18) to Fig (6.21), over the coarse mesh for
all orders of accuracy and over the fine mesh for the 4th-order discretization (as a reference
for comparison). These figures show that the surface boundary is the main source
of the error, and they show how different discretization orders under-predict and/or over-predict
the total pressure in the stagnation regions and at the maximum acceleration area (top of the
cylinder). For instance, the 2nd- and 3rd-order discretizations slightly under-predict
the total pressure over nearly all of the boundary surface, and this under-prediction reaches
its peak at the rear stagnation region. The 4th-order method slightly over-predicts the
total pressure nearly everywhere along the boundary surface, which is consistent with the
4th-order pressure coefficient along the boundary. By refining the mesh and/or increasing
the accuracy order, the error in total pressure improves dramatically (Fig (6.21)).
The Error-Mesh plot for l1, Fig (6.22), demonstrates the total pressure error reduction
versus mesh size. It should be noted that, despite the deviation of the measured accuracy
order from its nominal value, the higher-order discretizations still produce noticeably smaller
errors. For instance, in the case of the 3rd-order solution over the finest mesh, where the
Figure 6.18: Error in total pressure ratio, 2nd-order discretization, Mesh 1
Figure 6.19: Error in total pressure ratio, 3rd-order discretization, Mesh 1
Figure 6.20: Error in total pressure ratio, 4th-order discretization, Mesh 1
Figure 6.21: Error in total pressure ratio, 4th-order discretization, Mesh 3
4th-order L1 Ratio L∞ Ratio
Mesh 1/1376 CVs 0.000535 — 0.0148117 —
Mesh 2/5539 CVs 2.7473e-5 19.47 0.001097 13.5
Mesh 3/22844 CVs 2.0954e-6 13.11 8.1810e-5 13.4
Table 6.10: Error norms for total pressure, 4th-order solution
Figure 6.22: Error-Mesh plot for the total pressure (l1 norm versus number of control volumes; measured slopes 2.2, 2.54, and 3.71)
Figure 6.23: Drag coefficient versus mesh size (measured slopes 3.41, 3.34, and 4.83 for the 2nd-, 3rd-, and 4th-order discretizations)
largest deviation from the nominal accuracy occurs, the l1 norm is 3.4 times smaller than
the l1 norm of the 2nd-order solution on the same mesh.
The drag coefficient can also be used for error measurement, since the exact drag coefficient
for inviscid subsonic flow over a circular cylinder is zero. Figure (6.23) displays the drag
reduction versus mesh size. This time the measured order of accuracy is somewhat better
than the nominal order, showing some error cancellation due to the symmetry of the solution.
As expected, the higher-order discretizations result in much smaller drag, closer to the
exact solution.
To show the benefits of higher-order discretization, it is useful to assess the performance
of a higher-order solution versus its accuracy. This can be done cleanly for the circular
cylinder case. The CPU time of the solution versus the drag coefficient is shown in Fig
(6.24). Assume that the figure shown is the solver performance characteristic graph for
all meshes, and that by changing the mesh size (within the range of interest) the behavior
of the performance graph does not change. For a desired accuracy of, for instance, CD = 10−4,
the 3rd-order solution has the minimum CPU time. For a fixed solution time of 100 seconds,
again the 3rd-order case results in the minimum drag. If a large solution time is bearable,
the 4th-order case provides the best solution accuracy, since its CD/CPU-time slope is
Figure 6.24: Drag coefficient versus CPU time
sharper than for the other two orders of accuracy. This implies that if a very accurate solution is
needed, it is more efficient to increase the discretization order than to use a finer mesh.
This comparison is performed for a simple problem, as opposed to a practical engineering
case, and at the same time engineering accuracy requirements are typically limited. However,
this case study still provides some interesting insight into the prospects for higher-order
methods.
Figure 6.25: Annulus, Mesh 1 (108 CVs)
6.3 Supersonic Vortex
Supersonic flow inside an annulus geometry is studied. This flow is isentropic (shock free)
and is a standard test case for accuracy measurement. The inner and outer radii are 2.0
and 3.0 respectively, and the inlet Mach number at the inner radius is 2.0.
Five different meshes (Mesh 1 to Mesh 5) are employed in this test case (Fig (6.25) to Fig
(6.29)). Each mesh level has four times more control volumes than the immediately coarser level,
and uniform refinement has been applied in the grid generation. Notice that the meshes are
still irregular unstructured meshes, and each finer level is generated completely independently
of the previous coarser level.
The exact solution for the supersonic vortex can be found in Aftosmis and Berger [5]. Having
the exact solution provides a direct means of measuring the accuracy of the numerical
solution. The exact solution for the supersonic vortex is given in Eq. (6.3).
Figure 6.26: Annulus, Mesh 2 (427 CVs)
Figure 6.27: Annulus, Mesh 3 (1703 CVs)
Figure 6.28: Annulus, Mesh 4 (6811 CVs)
Figure 6.29: Annulus, Mesh 5 (27389 CVs)
Ri = 2.0, Mi = 2.0, ρi = 1.0

R = √(x² + y²)

ρ = ρi [1 + ((γ − 1)/2)(1 − Ri²/R²) Mi²]^(1/(γ−1))

URi = Ri Mi (ρi)^((γ−1)/2)

u = y URi / R², v = −x URi / R², P = ρ^γ / γ    (6.3)
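Eq. (6.3) translates directly into code. A sketch of the exact supersonic-vortex state used as the initial and reference solution (γ = 1.4; names are illustrative):

```python
GAMMA = 1.4
R_I, M_I, RHO_I = 2.0, 2.0, 1.0  # inner radius, inlet Mach number, inner density

def vortex_exact(x, y):
    """Exact isentropic supersonic-vortex state (rho, u, v, p) at (x, y)."""
    r2 = x * x + y * y
    rho = RHO_I * (1.0 + 0.5 * (GAMMA - 1.0)
                   * (1.0 - R_I**2 / r2) * M_I**2) ** (1.0 / (GAMMA - 1.0))
    u_ri = R_I * M_I * RHO_I ** (0.5 * (GAMMA - 1.0))  # circulation constant
    u = y * u_ri / r2
    v = -x * u_ri / r2
    p = rho**GAMMA / GAMMA
    return rho, u, v, p
```

As a spot check, at the inner radius (x = Ri, y = 0) this returns ρ = 1 and a flow speed of 2, i.e. Mach 2 at the local sound speed of 1, matching the stated inlet conditions.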
6.3.1 Numerical Solution
The exact solution is used as an initial solution; it does not exactly satisfy the discretized
equations due to truncation error. Starting a numerical solution from the exact solution is
unrealistic in general: if the exact solution were known, there would be no need for a
numerical solution. However, this is a special case for the efficiency study of the developed
Newton-Krylov solver, which excludes the start-up issue completely. This way we can show
how the efficiency of our matrix-free approach is affected purely by the discretization order
when the best possible starting solution is used. Newton iteration (infinite time step) is performed
for all cases. An approximate analytic Jacobian with ILU(4) is used for preconditioning.
The convergence criterion, as in section (6.2), is L2(Res(U)) = 1 × 10−12, and the same
subspace size has been used. Fig (6.30) and Fig (6.31) show the convergence history for
Mesh 1 and Mesh 5 in terms of CPU time. Since the solution is started from a good initial
guess, superlinear convergence is achieved from the very first iteration for both the 2nd- and
3rd-order discretizations. The 4th-order case is (as expected) much slower than the other
two and requires more outer iterations (two more for the fine mesh) to reach
full convergence. This shows that even when the solution is started from the best possible
initial guess (i.e., the exact solution), the 4th-order discretization results in a complicated linear
system that requires considerable effort to solve.
The solution Mach contours over the coarse mesh for all orders of accuracy are displayed
in Fig (6.32) to Fig (6.34). The flow Mach number is supposed to be constant along any
given radius and to decrease with increasing radius. In other words, the flow at a smaller radius has
larger acceleration, with the maximum occurring along the inner wall boundary. By inspecting
the contour lines, it is clearly visible that the quality of the solution using the higher-
order discretizations has improved noticeably. This improvement is most visible close
to the inner wall. This shows that, especially in regions where the solution has
Figure 6.30: Convergence history for the coarse mesh (Mesh 1)
Figure 6.31: Convergence history for the fine mesh (Mesh 5)
Figure 6.32: 2nd-order Mach contours for the coarse mesh (Mesh 1)
large, inadequately resolved derivatives (but is continuous), linear reconstruction of the flow
quantities is not the optimal choice.
Fig (6.35) to Fig (6.37) display the density error of the solution over the coarse grid. The
scale of the error map is the same for all discretization orders, providing a reasonable sense
of how the solution error decreases when a higher-order discretization is employed. As
expected, the boundaries are the original source of the error, and the maximum error occurs
at the inner wall. The 4th-order density solution over the fine grid is shown in Fig (6.38) as
a reference. The behavior of the density contours is opposite to that of the Mach
contours, and the density reaches its maximum value at the outer wall.
6.3.2 Solution Accuracy Measurement
The density of the solution is used for the accuracy measurement. Error norms are tabulated
in Table 6.11 to Table 6.13 for all sets of grids and discretization orders. The maximum error
norm (l∞) is about one order less than the nominal order of accuracy due to the boundary
error. The global norm, l1, however, converges to the nominal order of accuracy for all cases.
Hence the overall accuracy of the computed solution is consistent with the discretization
Figure 6.33: 3rd-order Mach contours for the coarse mesh (Mesh 1)
Figure 6.34: 4th-order Mach contours for the coarse mesh (Mesh 1)
Figure 6.35: 2nd-order density error for the coarse mesh (Mesh 1)
Figure 6.36: 3rd-order density error for the coarse mesh (Mesh 1)
Figure 6.37: 4th-order density error for the coarse mesh (Mesh 1)
Figure 6.38: Density, 4th-order solution over the fine mesh (Mesh 5)
Mesh CVs L1 Ratio L∞ Ratio
1 108 0.014872 — 0.097239 —
2 427 0.003847 3.86 0.036135 2.69
3 1703 0.001073 3.59 0.012825 2.82
4 6811 0.000258 4.15 0.005168 2.48
5 27389 6.6334e-5 3.89 0.002019 2.56
Table 6.11: Solution error norms, 2nd-order discretization
Mesh CVs L1 Ratio L∞ Ratio
1 108 0.0044813 — 0.028904 —
2 427 0.000976 4.59 0.011439 2.53
3 1703 0.000139 7.0 0.002225 5.14
4 6811 2.4415e-5 5.69 0.000634 3.51
5 27389 3.3202e-6 7.35 0.000136 4.66
Table 6.12: Solution error norms, 3rd-order discretization
order. The error convergence plot for the l1 norm is shown in Fig (6.39) for graphical
verification of the solution accuracy.
Again, some accuracy-efficiency analysis can be carried out here. The solution CPU time
versus the error norm is plotted in Fig (6.40). For a fixed CPU time, for example 10 seconds,
the 3rd-order performance characteristic line generates the minimum error.
For a fixed error level, l1 = 1 × 10−5, the 4th-order performance characteristic line has
the minimum computation time, although at that error level the 3rd- and 4th-order lines
are very close to each other. In all cases the 2nd-order characteristic line is totally
outperformed by its higher-order counterparts. It should be mentioned that a similar performance
characteristic graph with the same trend, but different CPU time, can be obtained when
the solution starts from a constant Mach number as an initial guess with a start-up process.
The supersonic vortex is an internal flow; therefore, the accuracy of the solution over the whole
domain was measured and studied. If one modifies the error quantification, or limits the
Mesh CVs L1 Ratio L∞ Ratio
1 108 0.002199 — 0.01632 —
2 427 0.000410 5.36 0.005102 3.20
3 1703 2.3761e-5 17.25 0.000546 9.34
4 6811 2.0200e-6 11.76 8.3718e-5 6.52
5 27389 1.2514e-7 16.14 1.0024e-5 8.35
Table 6.13: Solution error norms, 4th-order discretization
Figure 6.39: Error-Mesh plot for the solution (density) (l1 norm versus number of control volumes; measured slopes 1.95, 2.88, and 4.01)
importance of the solution to a specific part of the domain, the performance characteristic
graph may change considerably. Therefore it is crucial to know what is important in a
flow field, and how the error is defined and quantified. All of this is application oriented
and varies from case to case. However, the studied test case clearly shows the potential
performance advantages of higher-order discretizations.
Figure 6.40: Error versus CPU time
Chapter 7
Results(II): Simulation Cases
In this chapter, the start-up procedure, fast convergence, robustness, and overall performance
of the proposed higher-order unstructured Newton-Krylov solver are studied in detail
for subsonic, transonic, and supersonic flows.
7.1 Subsonic Airfoil, NACA 0012, M = 0.63, α = 2.00
To study the convergence and robustness of the proposed Newton-Krylov solver with
higher-order unstructured discretization, a subsonic case is presented here that exercises
most features of the solver's performance. The convergence characteristics are
investigated for a series of meshes. Five different meshes with an O-domain, from a coarse to a
relatively fine mesh, have been used (Table 7.1 and Fig (7.1) to Fig (7.5)). All meshes have
proper refinement at the leading and trailing edges. The far field is located at 25 chords,
and characteristic boundary conditions are implemented implicitly.
Mesh No. of CVs along the chord (upper side/lower side) Total No. of CVs
1 61/58 1245
2 101/99 2501
3 127/127 4958
4 198/192 9931
5 260/253 19957
Table 7.1: Mesh detail for NACA 0012 airfoil
CHAPTER 7. RESULTS(II): SIMULATION CASES 125
Figure 7.1: NACA 0012, Mesh 1, 1245 CVs
Figure 7.2: NACA 0012, Mesh 2, 2501 CVs
Figure 7.3: NACA 0012, Mesh 3, 4958 CVs
Figure 7.4: NACA 0012, Mesh 4, 9931 CVs
Figure 7.5: NACA 0012, Mesh 5, 19957 CVs
7.1.1 Solution Process
The tolerance for solving the linear system in the start-up phase is 5 × 10−2 and in the
Newton phase is 1 × 10−2. For all test cases a subspace of 30 has been set and no restart
is allowed. The preconditioning for the start-up pre-iterations employs
the approximate analytical Jacobian matrix with ILU(1) factorization, while for the Newton
iterations the first-order finite difference Jacobian matrix with ILU(4) factorization is used.
The Newton iteration is matrix-free, and ε = ε0‖z‖ with ε0 = 1 × 10−6 is used for directional
differencing. The initial condition is the free-stream flow.
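The matrix-free Jacobian-vector products behind the GMRES iterations are one-sided directional differences of the residual, Jz ≈ (R(u + εz) − R(u))/ε. A minimal illustration, with the residual standing in for the solver's and the ε scaling taken as stated in the text:

```python
import numpy as np

def matfree_jacvec(residual, u, z, eps0=1.0e-6):
    """Approximate J(u) @ z without forming J, via directional differencing."""
    norm_z = np.linalg.norm(z)
    if norm_z == 0.0:
        return np.zeros_like(z)
    eps = eps0 * norm_z  # perturbation scaling as given in the text
    return (residual(u + eps * z) - residual(u)) / eps

# check against a residual with a known Jacobian: R(u) = A @ u, so J z = A z
A = np.array([[2.0, 1.0], [0.0, 3.0]])
residual = lambda u: A @ u
u = np.array([1.0, -1.0])
z = np.array([0.5, 2.0])
print(matfree_jacvec(residual, u, z))  # ≈ A @ z = [3.0, 6.0]
```

For a linear residual the difference is exact up to roundoff; for the non-linear Euler residual it is a first-order approximation, which is why the choice of ε matters.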
No attempt is made to optimize the start-up process. The solution starts with 30 pre-
iterations in the start-up phase to reach a good initial solution before switching to Newton
iterations. The CFL starts at 2.0 and is increased gradually to CFL=100 over the first 15
pre-iterations, which are performed with first-order accurate flux evaluation. The remaining
15 pre-iterations are performed in the form of defect correction with a constant CFL of
100, where the first-order Jacobian is used both for constructing the left-hand side and for
preconditioning the linear system. The right-hand side (Eq. (5.27)) is evaluated to the
correct order of accuracy. The cost of each pre-iteration includes one first-order Jacobian
Figure 7.6: Cp over the upper surface after start-up, Mesh 1 (1245 CVs)
evaluation and its incomplete factorization, one flux evaluation, and one system solve using
GMRES, which is not matrix-free since the Jacobian matrix is available explicitly.
Experiments for subsonic flow have shown that a reasonable starting point for Newton
iteration can be reached easily with a relatively small number of pre-iterations; there is
no need to decrease the residual by a significant factor. A rough physical solution over
the airfoil is good enough for starting the Newton iterations. It is also interesting to look at the
solution after the start-up finishes, right before switching to Newton iteration. Fig (7.6)
displays the pressure coefficient over the upper surface of the airfoil at the end of the start-
up process for Mesh 1, the coarsest mesh. An approximate physical solution has
been reached for all discretization orders and the suction peak is nearly resolved. Although
the suction peak is recovered better in the start-up process for the higher-order discretizations,
the difference is small; for the start-up, where only an approximate solution is needed,
the 2nd-order defect correction generates a reasonably good solution to start the Newton
iterations for all discretization orders. Considering the difference in cost between the 2nd-order
residual evaluation and the 3rd- and 4th-order residual evaluations, it makes sense
to use the 2nd-order defect correction scheme for start-up.
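The defect-correction pre-iteration pairs the cheap first-order Jacobian on the left-hand side with the higher-order residual on the right-hand side (Eq. (5.27)). A schematic dense-matrix sketch (the solver itself uses an ILU-preconditioned GMRES solve on a sparse system; matrices and names here are illustrative):

```python
import numpy as np

def defect_correction_step(jac_low, res_high, u, dt):
    """One defect-correction pre-iteration: (I/dt + J_low) du = -R_high(u)."""
    n = u.size
    lhs = np.eye(n) / dt + jac_low           # cheap first-order LHS plus I/dt
    du = np.linalg.solve(lhs, -res_high(u))  # higher-order residual drives update
    return u + du

# the LHS Jacobian need not match the RHS discretization for the iteration
# to converge to the root of the higher-order residual
A = np.array([[3.0, 1.0], [0.0, 2.0]])   # stands in for d(res_high)/du
B = np.array([[3.0, 0.0], [0.0, 2.0]])   # cheaper "first-order" Jacobian
target = np.array([1.0, -1.0])
res_high = lambda u: A @ (u - target)
u = np.zeros(2)
for _ in range(50):
    u = defect_correction_step(B, res_high, u, dt=10.0)
print(u)  # → approximately [1.0, -1.0]
```

The mismatched left-hand side only slows convergence; the converged state is still the root of the higher-order residual, which is the rationale for running the start-up this way.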
A similar graph is shown for the finest mesh (Mesh 5) after the start-up process in Fig (7.7).
Figure 7.7: Cp over the upper surface after start-up, Mesh 5 (19957 CVs)
While the general trend of the pressure coefficient over the airfoil is recovered, the suction
peak has not been properly resolved yet. That is not surprising: due to the mesh size,
the diversity of mesh length scales, and the local time-stepping approach, the solution's
time evolution is far from complete and more pre-iterations would be needed to resolve the
suction peak. The noticeable point is the violent oscillation of the 4th-order discretization
around the suction region in the course of the solution evolution. Oscillation around
extrema is normal, and occurs for the 2nd- and 3rd-order discretizations as well.
But in the 4th-order case, due to the low dissipation of the fourth-order scheme, the oscillations
are quite vigorous. These oscillations can be a source of instability in the start-up process,
and indeed the 4th-order case is sensitive in the early period of the solution process. Since the
cost of higher-order start-up exceeds that of 2nd-order start-up, and it can generate a noisy
solution state (before switching to Newton iteration), it is more robust to perform all of the
defect correction with 2nd-order residual evaluation. A similar approach is taken
for all meshes of the current test case, where the defect correction iterations are performed
with 2nd-order accuracy.
After start-up, the solution process is switched to Newton iteration and an infinite CFL
is employed. To compare the computing cost across different meshes, a work unit is used,
equal to the cost of one residual evaluation at the corresponding order of accuracy; two
different meshes therefore have two different work units in terms of CPU time. Convergence
is reached (the solution process is stopped) when the L2 norm of the non-linear density
residual falls below 1 × 10−12.
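The work-unit normalization is just a division by the per-evaluation cost; a one-line helper makes the bookkeeping explicit. The timing numbers below are illustrative, chosen to be consistent with the scale of the Mesh 1, 2nd-order row of Table 7.2, not measured values:

```python
def to_work_units(total_cpu_seconds, one_residual_eval_seconds):
    """Normalize a CPU time by the cost of one residual evaluation at the
    same order of accuracy and on the same mesh (one 'work unit')."""
    return total_cpu_seconds / one_residual_eval_seconds

# Illustrative: a 5.76 s run on a mesh where a single 2nd-order residual
# evaluation costs 0.024 s amounts to 240 work units.
print(to_work_units(5.76, 0.024))
```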
Table 7.2 shows the convergence summary for the 2nd, 3rd and 4th-order discretizations in
terms of the total number of residual evaluations, total CPU time, total work units, number
of Newton iterations, and the cost of the Newton phase in work units. For all meshes, after
30 pre-iterations the solution converges in a few Newton iterations. The total number of
work units increases as finer meshes are used: the linear system arising from a denser grid
is more difficult to solve than a similar system arising from a coarser grid. Using a constant
subspace size without restart becomes less and less effective as the size and complexity of
the system increase, reducing the effectiveness of each linearization. Consequently, more
outer (Newton) iterations are needed to reduce the non-linear residual as the mesh size or
discretization order increases. Also, the solution state after start-up on the fine mesh is
not as close to the steady-state solution as it is on the coarse mesh, which is another factor
slowing the Newton convergence on the fine mesh. Notice that the total work units increase
roughly by a factor of 2.5 while the mesh size increases by a factor of 16.
For a similar subsonic case, the total computation cost in work units for one of the fastest
and most efficient available Newton-Krylov algorithms (2nd-order/artificial dissipation) for
unstructured meshes [45] is just under 300. This confirms that the performance of the
Newton-Krylov solver developed here is quite comparable to world-class algorithms.
However, as will be noted later, a combination of mesh sequencing and multigrid techniques
is needed to maintain such performance as the mesh size increases.
Figure (7.8) displays the solution CPU time versus grid size on a logarithmic scale. The
computing cost in CPU time follows a power-law relation with the mesh size, t = m(N_CV)^k,
and all discretization orders show more or less the same trend (k_2nd = 1.131, k_3rd = 1.021
and k_4th = 1.163), demonstrating the scalability of the solver's CPU time with the mesh
size. This graph emphasizes the need for parallel computing and multigrid methods in the
case of large meshes. Figure (7.9) and Fig (7.10) show the total work units and the Newton
phase cost in work units. The 2nd and 3rd-order solutions have nearly the same Newton
phase cost in work units; the difference between the two is mainly due to the start-up phase
cost, which for small meshes can be a considerable part of the total solution cost. For the
4th-order case, the cost of the start
Test Case No. of Res. Time (Sec) Work Unit No. of Newton Itr. Newton Phase Work Unit
Mesh 1
2nd 100 5.76 240.0 3 96.7
3rd 132 8.87 197.1 4 122.4
4th 244 27.09 258.0 7 226.8
Mesh 2
2nd 121 12.88 280.0 3 118.3
3rd 136 17.71 213.4 4 128.2
4th 283 58.15 312.6 8 274.8
Mesh 3
2nd 126 26.88 349.1 3 136.1
3rd 147 36.03 248.5 4 141.2
4th 247 90.54 289.3 7 239.2
Mesh 4
2nd 158 60.39 399.9 4 182.8
3rd 158 73.98 276.0 4 159.0
4th 318 217.60 371.4 9 317.8
Mesh 5
2nd 254 164.10 562.0 7 325.3
3rd 286 225.87 456.3 8 321.8
4th 542 682.0 639.8 16 577.1
Table 7.2: Convergence summary for NACA 0012 airfoil, M = 0.63, α = 2°
Figure 7.8: CPU time versus the grid size, NACA 0012, M = 0.63, α = 2°
up is a minor part of the total solution cost, since normalizing by the more expensive
4th-order residual evaluation makes the scaled start-up cheaper than for the 3rd-order
(and, in turn, the 2nd-order) case. It should be mentioned that the start-up technique is
neither unique nor optimized for this series of test cases, and its cost should be studied
separately. If a reasonably good pre-computed initial solution is available (e.g. a full or
incompressible potential solution, the exact solution of some approximate problem, or an
empirical solution), the cost of the start-up phase can be ignored, and in practice the total
solution cost equals the Newton phase cost. The start-up methodology in this research is
reasonably efficient for small and moderately sized meshes; however, its cost grows if more
pre-iterations are needed for large meshes. In general, for large meshes it is not efficient to
perform the start-up procedure from scratch; it is much more efficient to start the
computation from a coarse mesh. Therefore, for very large meshes, augmenting the
pre-iteration start-up with a mesh sequencing technique could be quite helpful. At the same
time, the growth in total solution cost with mesh size, in both the start-up and Newton
phases, can be dramatically reduced if a multi-grid procedure is employed.
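The power-law cost model t = m(N_CV)^k quoted above can be recovered from run times by a straight-line fit in log-log space. The data below are synthetic, generated from a known exponent, not the thesis measurements:

```python
import numpy as np

def fit_power_law(n_cvs, cpu_times):
    """Fit t = m * N**k by linear least squares on log t = log m + k log N."""
    k, log_m = np.polyfit(np.log(n_cvs), np.log(cpu_times), 1)
    return np.exp(log_m), k

# Synthetic timings generated to follow t = 1e-3 * N**1.13 exactly.
n = np.array([1246.0, 2492.0, 4958.0, 9876.0, 19957.0])
t = 1e-3 * n**1.13
m, k = fit_power_law(n, t)
print(round(k, 3))  # → 1.13
```

With real, noisy timings the same fit returns the least-squares exponent, which is how slopes such as k_2nd = 1.131 are read off a log-log plot.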
The convergence histories for the coarse grid (Mesh 1) and the fine grid (Mesh 5) are shown
in Fig (7.11) and Fig (7.12). Although the start-up process is the same for all orders of
Figure 7.9: Total work unit versus the grid size, NACA 0012, M = 0.63, α = 2°
Figure 7.10: Newton phase work unit versus the grid size, NACA 0012, M = 0.63, α = 2°
Figure 7.11: Convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2°
discretization, the starting residual differs for two reasons. First, the initial residual (r0)
that initializes the implicit start-up is based on the corresponding order of residual
evaluation. Second, the mesh moments are pre-computed and employed for each order of
accuracy. However, as the start-up reaches its end, the solution state is almost the same for
all discretization orders. After start-up, full convergence is achieved within a few Newton
iterations for the 2nd and 3rd-order cases; for the 4th-order case, convergence is about two
times slower in terms of outer iterations. The superlinear convergence of the 2nd and
3rd-order cases is evident.
Fig (7.13) examines the correlation between the non-linear residual and the residual of the
linear system solution: it plots the non-linear residual after applying the update produced
by solving the linear system against the residual of that linear system. The correlation
between the two is nearly linear. Fig (7.14) shows the quality of the linear system solution
at each Newton iteration: the residual reduction achieved within the GMRES linear solver
is plotted against the final residual of that system for all outer iterations. In other words,
it shows how much the linear system residual is reduced through the inner iterations. For
the 2nd-order case the linear system is solved quite effectively, and more than 8 orders of
residual reduction are achieved with a subspace size of 30 and no restart. For the 3rd-
Figure 7.12: Convergence history, Mesh 5, M = 0.63, α = 2°
Figure 7.13: Non-linear residual versus linear system residual, Mesh 3, M = 0.63, α = 2°
Figure 7.14: Orders of residual reduction in the linear system solve, Mesh 3, M = 0.63, α = 2°
order and 4th-order discretizations, this residual reduction is only about 5 and 2 orders
respectively for the same subspace size. This loss of quality in the linear system solution
ultimately increases the number of Newton iterations for the higher-order discretizations.
However, it has been shown that by using multiple restarts and solving the linear system
to machine accuracy at each Newton iteration, semi-quadratic convergence is attainable
even for the 4th-order discretization, at a dramatic penalty in computation cost [53].
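The inner solve described here, GMRES over a fixed 30-dimensional Krylov subspace with no restart, touching the operator only through matrix-vector products, can be sketched compactly. The operator below is a well-conditioned stand-in, not a flow Jacobian, so the 8-plus orders of residual reduction mirror the easy 2nd-order behavior reported above:

```python
import numpy as np

def gmres_no_restart(matvec, b, m=30):
    """Minimal GMRES(m) without restart: build an m-dimensional Krylov
    basis by Arnoldi (modified Gram-Schmidt), then minimize the residual
    over it via least squares on the small Hessenberg system."""
    n = b.size
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)            # x0 = 0, so r0 = b
    Q[:, 0] = b / beta
    for j in range(m):
        w = matvec(Q[:, j])
        for i in range(j + 1):          # orthogonalize against the basis
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)  # min ||beta*e1 - H y||
    return Q[:, :m] @ y

# Stand-in operator, accessed only through its action on a vector.
rng = np.random.default_rng(0)
A = np.diag(np.linspace(1.0, 2.0, 40))  # tightly clustered spectrum
b = rng.standard_normal(40)
x = gmres_no_restart(lambda v: A @ v, b, m=30)
drop = np.log10(np.linalg.norm(b) / np.linalg.norm(b - A @ x))
print(drop > 8)  # → True
```

For a preconditioned operator with eigenvalues clustered near one, as in the 2nd-order case, a 30-vector subspace is ample; as the spectrum scatters, the same subspace buys fewer orders of reduction, which is exactly the trend in Fig (7.14).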
A perfect preconditioner would cluster all eigenvalues at one. For the proposed precondi-
tioning strategy, it is impossible to cluster all eigenvalues exactly at one, for two reasons.
First, the first-order linearization (Jacobian) matrix is employed, which is only an approx-
imation to the higher-order linearization. Second, an incomplete factorization is used as
the preconditioning technique instead of a full factorization. Therefore, the best that can
be expected is that the eigenvalues of the preconditioned operator are scattered around
unity, with some of them very close to one. As a result, the distance of the eigenvalues
from unity can be used as one indicator of preconditioning quality. It is also desirable for
the eigenvalues to lie far from the origin, to avoid ill-conditioning and singularity issues
[53]. To evaluate and compare the quality of the preconditioning for different discretization
orders, the approximate eigenvalue
Figure 7.15: Eigenvalue pattern for the preconditioned system, Mesh 3, M = 0.63, α = 2°
spectrum of the preconditioned linear system (estimated via the GMRES algorithm) is
plotted at the last iteration (i.e., at the converged solution) for the Mesh 3 test case,
Fig (7.15). The eigenvalues associated with the higher-order discretizations are scattered
farther from one than the 2nd-order eigenvalues. This indicates a reduction in the quality
of the linear system solution with increasing discretization order, which was not hard to
predict. At the same time, the higher the discretization order, the closer some eigenvalues
lie to the origin, shifting the matrix toward singularity. This is especially the case for the
4th-order discretization, where one eigenvalue is very close to the origin.
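The effect of incomplete factorization on eigenvalue clustering can be illustrated with SciPy's incomplete LU on a stand-in sparse matrix (a 2-D Laplacian, not a flow Jacobian): an exact factorization pins every eigenvalue of M⁻¹A at one, while dropping entries scatters them around unity:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu, spilu

# Stand-in sparse system: a 2-D Laplacian on a 10x10 grid.
n = 10
T = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n))
I = sp.eye(n)
A = (sp.kron(I, T) - sp.kron(sp.diags([1.0, 1.0], [-1, 1], shape=(n, n)), I)).tocsc()

def preconditioned_eigs(factorization):
    """Eigenvalues of M^-1 A, applying M^-1 column by column."""
    cols = [factorization.solve(col) for col in A.toarray().T]
    return np.linalg.eigvals(np.column_stack(cols))

full = preconditioned_eigs(splu(A))                       # complete LU: M = A
incomplete = preconditioned_eigs(spilu(A, drop_tol=0.1))  # ILU with dropping

scatter_full = abs(full - 1.0).max()
scatter_ilu = abs(incomplete - 1.0).max()
print(scatter_full < 1e-8, scatter_ilu > scatter_full)  # → True True
```

The more aggressive the dropping (or the worse the first-order Jacobian approximates the true linearization), the wider the scatter around unity, which is the behavior Fig (7.15) shows for increasing discretization order.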
The estimated condition numbers of the linear systems for the Newton iterations are shown
in Fig (7.16) and Fig (7.17) for the Mesh 3 and Mesh 5 cases, plotted as a function of the
drop in residual. Res0, the reference residual, is the initial non-linear residual of the
corresponding mesh computed from the far-field flow condition; the ratio of the non-linear
residual at the end of each Newton iteration to Res0 therefore reflects the relative
convergence after each Newton iteration. The first condition number reflects the linear
system formed from the linearization at the end of the start-up phase; the remaining
condition numbers are associated with the linear systems formed from the solution updates
at the end of successive Newton iterations. While the 2nd-order discretization initially has
a larger condition number,
Figure 7.16: Condition No. of the preconditioned system, Mesh 3, M = 0.63, α = 2°
Figure 7.17: Condition No. of the preconditioned system, Mesh 5, M = 0.63, α = 2°
Figure 7.18: Lift coefficient convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2°
the condition number decreases gradually with convergence. In the higher-order cases the
condition number is initially smaller than for the 2nd-order case, but it grows with the
Newton iterations. In conclusion, the overall performance of the flow solver depends not
only on the conditioning of the linear system but also on how well the non-linear problem
is represented by the linearization at each iteration. For the higher-order discretizations
the quality of the linearization is not as good as for the 2nd-order discretization, and more
outer (Newton) iterations are needed for full convergence. It should be mentioned that the
reported condition numbers are relatively small (since the meshes are not very large);
generally speaking, the resulting preconditioned linear systems are well conditioned, and
the condition numbers are reported only to monitor the solution process.
Lift and drag coefficient convergence histories as a function of the drop in residual are
shown in Fig (7.18) to Fig (7.21) for the coarse and fine meshes. For the coarse mesh, a
drop of almost 6 orders in residual appears necessary for lift and drag convergence, while
for the higher-order discretizations lift and drag are nearly converged after a 4-5 order
drop. For the fine mesh, all the coefficients are completely converged after a 4 order drop
in residual. As a practical matter this is an important point, since the computation cost
can be noticeably reduced by making
Figure 7.19: Drag coefficient convergence history, NACA 0012, Mesh 1, M = 0.63, α = 2°
the convergence criteria less strict and terminating the solution process after fewer Newton
iterations on the fine mesh, where the computing cost increases considerably. The
higher-order (especially the 4th-order) discretizations benefit greatly from this, as the
unnecessary computing cost of reducing the residual beyond a certain value is avoided.
Lift and drag coefficients for all meshes and discretization orders are tabulated in Table 7.3.
On the finest mesh, the lift coefficients have converged to the same value to the third
decimal place and the drag coefficients to the fourth. However, the drag coefficient is still
far from zero, mainly because of the lack of pressure recovery at the trailing edge, which
is a singular point in potential flow. The 3rd-order drag coefficient is consistently larger
than the 2nd-order drag coefficient, which is surprising; the 4th-order drag is consistently
the smallest, as expected. Without drawing a general conclusion about the 3rd-order drag,
it is fair to say that the behavior of the discretization at the trailing edge of the airfoil is
a determining factor, and apparently in such a region discretization methods with even
orders of accuracy performed better than the odd-order method, possibly due to a
cancellation effect.
All of the test cases so far have used a far field size of 25 chords. To study the
Figure 7.20: Lift coefficient convergence history, NACA 0012, Mesh 5, M = 0.63, α = 2°
Figure 7.21: Drag coefficient convergence history, NACA 0012, Mesh 5, M = 0.63, α = 2°
Test Case Lift coefficient Drag coefficient
Mesh1
2nd 0.322318 0.00097664
3rd 0.317393 0.00184889
4th 0.322223 0.00072576
Mesh2
2nd 0.322302 0.00057225
3rd 0.321846 0.00103438
4th 0.322448 0.00040369
Mesh3
2nd 0.324905 0.00040197
3rd 0.325214 0.00049820
4th 0.325588 0.00034757
Mesh4
2nd 0.325159 0.000336973
3rd 0.324740 0.00037595
4th 0.325323 0.00032525
Mesh5
2nd 0.324801 0.00032136
3rd 0.324568 0.00032711
4th 0.324474 0.00030828
Table 7.3: Lift and drag coefficients for all meshes and discretization orders, NACA 0012, M = 0.63, α = 2°, far field size of 25 chords
Test Case/Mesh Size /Far field distance Lift coefficient Drag coefficient
Mesh 4-1/10604 CVs/128 Chords
2nd 0.332506 0.00010622
3rd 0.332882 0.00012667
4th 0.332699 7.09651e-5
Mesh 5-1/22421 CVs/200 Chords
2nd 0.333575 4.88588e-5
3rd 0.333530 5.98949e-5
4th 0.334306 4.04234e-5
Mesh 5-2/24514 CVs/1600 Chords
2nd 0.332908 1.82357e-5
3rd 0.333715 2.35558e-5
4th 0.333374 5.66076e-6
Table 7.4: Effect of the far field distance on lift and drag coefficients, NACA 0012, M = 0.63, α = 2°
effect of the far field size on lift and drag, and to show that a smaller computed drag is
possible at the same mesh resolution, the far field size is increased considerably while the
mesh resolution, the number of points on the airfoil, and the refinement factors remain the
same. In other words, the new meshes are equivalent to the meshes of the previous test
cases out to 25 chords. The results are tabulated in Table 7.4. In all cases the drag
coefficient is reduced dramatically by increasing the far field distance; for instance, the
4th-order computed drag on Mesh 5-2 is more than 50 times smaller than the 4th-order
computed drag on Mesh 5. The lift coefficient is also affected by the far field distance: in
most cases it increases slightly as the far field is extended. It should be mentioned that no
far field correction is used in this research; a higher-order far field correction was beyond
the scope of the thesis.
Reference [21] reports a CL of 0.3289 and a CD of 0.0004 for a 2nd-order computation of
a similar flow over an adaptively refined Cartesian mesh with a total of 10694 CVs (494
body nodes) and a far field distance of 128 chords, which in mesh and far field size is
closely comparable to the Mesh 4-1 test case.
Mach contours for Mesh 3 (4958 CVs) are displayed in Fig (7.23) to Fig (7.25); the
corresponding mesh is shown in Fig (7.22). Flow field features such as the stagnation
region just below the leading edge, the flow acceleration near the leading edge (where the
maximum Mach number occurs), and the flow deceleration near the trailing edge are
visible. Once
Figure 7.22: NACA 0012, Mesh 3 (4958 CVs)
again, where the mesh is coarse the higher-order contours are visibly smoother than the
2nd-order contours, showing the improved quality of the flow computation in those regions.
The Mach profiles (computed at Gauss points) along the upper side of the airfoil for Mesh 1
(the coarsest mesh, Fig (7.28)) are shown in Fig (7.26) and Fig (7.27). Since there is a
sudden change in the area of the control volumes over the surface of the airfoil in the
acceleration region, as is often the case for coarse meshes, a jump is observed in the Mach
profile at those locations. The higher-order solutions, however, reduce the amplitude of
this jump, to the point that in the 4th-order case the jump has disappeared. The Mach
profile over the lower surface of the airfoil is shown in Fig (7.29), where the 2nd-order
Mach profile lies slightly below its higher-order counterparts.
Fig (7.30) to Fig (7.32) display the Mach profiles on the upper and lower sides for all
discretization orders on the finest mesh, Mesh 5. As expected, all orders of accuracy result
in essentially the same Mach distribution. The close-up of the acceleration region
demonstrates that where the flow experiences large changes, linear reconstruction
(2nd-order discretization) does not produce as smooth a solution as the higher-order
discretizations.
The error distribution in total pressure, 1 − Pt/Pt∞, along the chord for Mesh 1 and Mesh 5
Figure 7.23: 2nd-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2°
Figure 7.24: 3rd-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2°
Figure 7.25: 4th-order Mach contours for NACA 0012 airfoil, Mesh 3, M = 0.63, α = 2°
Figure 7.26: Mach profile, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2°
Figure 7.27: Mach profile close-up, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2°
Figure 7.28: Mesh 1, close-up at the leading edge region
Figure 7.29: Mach profile, lower side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2°
Figure 7.30: Mach profile, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2°
Figure 7.31: Mach profile close-up, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2°
Figure 7.32: Mach profile, lower side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2°
Figure 7.33: 1 − Pt/Pt∞, upper side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2°
Figure 7.34: 1 − Pt/Pt∞, lower side, NACA 0012 airfoil, Mesh 1, M = 0.63, α = 2°
Figure 7.35: 1 − Pt/Pt∞, upper side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2°
Figure 7.36: 1 − Pt/Pt∞, lower side, NACA 0012 airfoil, Mesh 5, M = 0.63, α = 2°
are shown in Fig (7.33) to Fig (7.36). In all cases there are sudden spikes at the trailing
edge, and although the error amplitude on the fine mesh is noticeably reduced, the trailing
edge remains one of the main sources of total pressure error. Another source of error is the
leading edge area, which covers both the stagnation and acceleration regions. In all cases
the 4th-order solution produces less entropy over the whole airfoil than the 2nd and
3rd-order discretizations, its stagnation pressure error being nearly zero everywhere,
including the leading edge region, apart from the above-mentioned spikes at the trailing edge.
7.2 Transonic Airfoil, NACA 0012, M = 0.8, α = 1.25°
For a transonic flow it is, in general, more difficult to obtain fast convergence, because of
the mixed subsonic/supersonic nature of the flow and the existence of discontinuities
(shocks) in the solution. The methodology for handling discontinuities can increase the
complexity of the problem. This is especially true for implicit schemes, where the solution
update can change greatly from iteration to iteration and the limiter values can oscillate
strongly. In the matrix-free approach, in which the matrix-vector product is computed
through flux perturbation, any oscillatory behavior of the limiter can severely degrade
convergence. All of these factors add to the complexity and difficulty of solving transonic
flows.
The transonic flow around the NACA 0012 airfoil at M = 0.8, α = 1.25° is studied. Mesh 3
of Section 7.1, with 4958 CVs (Fig (7.3)), is employed for this test case. The flow is solved
at all orders of accuracy using the Venkatakrishnan limiter with the appropriate
higher-order modification of Section 3.4.2. For the 2nd and 3rd-order cases K = 10 is used
in the limiter, and for the 4th-order discretization K = 1 is employed. The limiter values
are allowed to change through all iterations; no freezing is applied. As in the previous test
cases, the tolerance for solving the linear system is 5 × 10−2 in the start-up phase and
1 × 10−2 in the Newton phase. For all test cases a subspace size of 30 is set and no restart
is allowed. Preconditioning for the start-up pre-iterations uses the approximate analytical
Jacobian matrix with ILU(1) factorization; for the Newton iterations, a finite-difference
Jacobian matrix with ILU(4) factorization is applied. The Newton iteration is matrix-free,
and ε = ε0‖z‖ with ε0 = 1 × 10−7 is used for the directional differencing. The initial
condition is free-stream flow. Convergence is reached when the L2 norm of the non-linear
density residual falls below 1 × 10−12.
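The matrix-free product of the Jacobian with a direction vector can be sketched generically by one-sided directional differencing. The residual below is a toy stand-in, and the step scaling is one common choice for keeping the perturbation O(ε0), shown for illustration rather than as the thesis' exact ε = ε0‖z‖ formula:

```python
import numpy as np

def jacobian_vector_product(residual, u, v, eps0=1e-7):
    """Matrix-free J(u) @ v via one-sided directional differencing:
    J v ≈ (R(u + eps*v) - R(u)) / eps.  The step is scaled so the actual
    perturbation of u stays on the order of eps0 (a common convention)."""
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return np.zeros_like(u)
    eps = eps0 * (1.0 + np.linalg.norm(u)) / nv
    return (residual(u + eps * v) - residual(u)) / eps

# Toy nonlinear residual with a known Jacobian: R(u) = u**2, J = diag(2u).
R = lambda u: u**2
u = np.array([1.0, 2.0, 3.0])
v = np.array([0.5, -1.0, 2.0])
approx = jacobian_vector_product(R, u, v)
exact = 2.0 * u * v
print(np.max(np.abs(approx - exact)))  # O(eps0) truncation error
```

This is the mechanism that lets GMRES run without ever forming the higher-order Jacobian; it also shows why limiter oscillations hurt, since any non-smoothness in R feeds directly into the differenced product.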
For transonic flow, before switching to Newton iteration the shock locations and strengths
need to be captured relatively accurately; otherwise the Newton iterations will not decrease
the residual of the non-linear problem effectively. This is normally achieved by reducing
the non-linear residual by some orders of magnitude, typically 1.5-2, with respect to the
initial residual during the start-up process.
Multiple implicit pre-iterations are performed in the form of defect correction before
switching to Newton iteration. For the 2nd and 3rd-order start-up phases, the pre-iterations
continue until the residual of the non-linear problem drops 1.5 orders below the residual
of the initial condition. In the defect correction phase the starting CFL number is 2, and
it is increased gradually to 200 over 50 iterations. The CFL is not
Figure 7.37: Mach profile at the end of start-up process, M = 0.8, α = 1.25°
increased above 200, since raising the CFL does not help much when the linearization is
inaccurate, as it is in the start-up phase. 69 and 81 pre-iterations were required in the
start-up phase to reduce the residual by 1.5 orders for the 2nd and 3rd-order
discretizations, respectively. In the Newton phase, the 2nd and 3rd-order cases proceed
with an infinite time step. The start-up phase for the 4th-order discretization includes 200
pre-iterations with a similar CFL schedule; although the target residual reduction of 1.5
orders was not achieved through the pre-iterations, the solution after 200 iterations was
good enough to start the Newton phase. Fig (7.37) displays the Mach profile at the end of
the start-up phase for all orders of accuracy. There are some oscillations, especially along
the upper-surface Mach profile close to the shock location; however, the shock locations
and strengths are captured reasonably accurately before switching to Newton iteration.
For the 4th-order transonic case, an infinite time step does not accelerate convergence, and
it causes inaccurate linearization and limiter oscillation. Since this leads to slow
convergence, CFL = 10,000 is set for the Newton phase.
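The start-up logic above, a CFL ramp from 2 toward 200 followed by a switch to the Newton phase once the non-linear residual has dropped 1.5 orders, can be sketched as below. The geometric ramp shape is an assumption; the text fixes only the endpoints, the cap, and the iteration count:

```python
import math

def startup_cfl(iteration, cfl_start=2.0, cfl_max=200.0, ramp_iters=50):
    """Geometric CFL ramp from cfl_start to cfl_max over ramp_iters
    pre-iterations, then held at the cap (the ramp shape is an assumed
    illustration; only the endpoints are given in the text)."""
    if iteration >= ramp_iters:
        return cfl_max
    growth = (cfl_max / cfl_start) ** (1.0 / ramp_iters)
    return cfl_start * growth**iteration

def ready_for_newton(res_norm, res0_norm, orders=1.5):
    """Switch to the Newton phase once the non-linear residual has
    dropped `orders` orders of magnitude below the initial residual."""
    return math.log10(res0_norm / res_norm) >= orders

print(startup_cfl(0), startup_cfl(50))     # → 2.0 200.0
print(ready_for_newton(2.0e-2, 1.0))       # 1.7 orders dropped → True
```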
The convergence summary is shown in Table 7.5. As in the subsonic case, the number of
Newton iterations for the 4th-order discretization is twice as large as for the 2nd and
3rd-order discretizations. The major cost of the solution in all cases is the start-up
Test Case No. of Res. Time (Sec) Work Unit No. of Newton Itr. Newton Phase Work Unit
2nd 197 65.6 279 4 91
3rd 241 106.7 281 5 119
4th 450 311.4 590 10 221
Table 7.5: Convergence summary for NACA 0012 airfoil, M = 0.8, α = 1.25°
Figure 7.38: Convergence history for NACA 0012, M = 0.8, α = 1.25°
cost. There is therefore considerable potential to reduce the total solution work by
employing an optimized start-up technique able to capture the shocks and establish the
major flow features efficiently. Figure (7.38) shows the convergence history. This time, the
reduced convergence slope of the 4th-order case is due not only to the poorer convergence
of the linear system solution in each Newton iteration, but also to the limiter firing, which
changes the linearization at each iteration.
Again, for a similar transonic case, the total computation cost in work units for a very fast
Newton-Krylov algorithm (2nd-order/artificial dissipation) on unstructured meshes [45] is
just under 400, showing that the performance of the Newton-Krylov solver developed in
this research is very competitive.
Table 7.6 summarizes the lift and drag coefficients for all discretization orders which are
Test case CL CD
2nd 0.337593 0.0220572
3rd 0.339392 0.0222634
4th 0.345111 0.0224720
AGARD-211[2]/Structured (192×39) 0.3474 0.0221
Table 7.6: Lift and drag coefficients, NACA 0012, M = 0.8, α = 1.25°
Figure 7.39: 2nd-order Mach contours, NACA 0012, M = 0.8, α = 1.25°
Figure 7.40: 3rd-order Mach contours, NACA 0012, M = 0.8, α = 1.25°
Figure 7.41: 4th-order Mach contours, NACA 0012, M = 0.8, α = 1.25°
Figure 7.42: limiter φ (3rd-order), NACA 0012, M = 0.8, α = 1.25°
in good agreement with the reference data [2]. Both CL and CD in transonic flow are strongly influenced by the location and strength of the shocks, so accurate shock capturing is crucial for a good prediction of these coefficients.
Figures 7.39 to 7.41 display banded Mach contours for all orders of discretization. The weak shock at the lower surface and the strong shock at the upper surface of the airfoil are clearly visible. The presence of the discontinuity and the firing of the limiter in the shock regions result in non-smooth contour lines in the shock vicinity (mostly for coarser control volumes). This non-smoothness widens with increasing discretization order because of the larger reconstruction stencil.
In addition to hampering convergence, the limiter has the disadvantage of reducing the accuracy of the solution by altering the gradients and higher-order terms. The behavior of the limiter values φ and σ is shown in Fig. 7.42 and Fig. 7.43. Some isolated limiter firing is observed in the leading and trailing edge regions, but it is limited to a small number of control volumes, and the value is generally close to one. The limiter is mainly active in the strong upper shock region, while it remains inactive at the weak lower shock. Note that wherever the gradient limiter φ fires aggressively, the higher-order terms limiter σ is nearly zero.
Figure 7.43: Limiter σ (3rd-order), NACA 0012, M = 0.8, α = 1.25° (contour levels from 0.05 to 0.95)
Figure 7.44: Mach profile, NACA 0012, M = 0.8, α = 1.25° (2nd-, 3rd-, and 4th-order results with AGARD reference data)
Figure 7.45: Mach profile in shock regions, NACA 0012, M = 0.8, α = 1.25° (2nd-, 3rd-, and 4th-order results with AGARD reference data)
The Mach profile along the surface is shown in Fig. 7.44 and Fig. 7.45. Both the location and the strength of the shocks are in good agreement with the AGARD data [2]. The 2nd-order discretization shows some overshoot just ahead of the upper shock, which is also visible in the corresponding Mach contours. The 3rd-order discretization appears to produce less noise in this shock-capturing case, probably because of the character of its quadratic reconstruction. The cubic reconstruction associated with the 4th-order discretization, however, shows some oscillations inside the shock region; this behavior is typical when a (sharp) discontinuity is approximated by a higher-order polynomial.
7.3 Supersonic flow, Diamond airfoil, M = 2.0, α = 0.0
The supersonic flow field around a diamond airfoil has been computed using all discretization orders at a far-field Mach number of 2.0 and zero angle of attack. The diamond airfoil has a thickness of 15% and is located in a 7×14 rectangular domain. The mesh contains 7771 control volumes in total, with 80 control volumes along the chord. Appropriate mesh refinement has been carried out at the leading and trailing edges (shock locations) and at the apexes of the airfoil (expansion locations) to enhance the quality of the solution in these high-gradient regions, Fig. 7.46.
The interesting parts of the flow field are the discontinuities (shocks) and very sharp gradients (expansion fans). Because these features would cause extensive limiter firing, which would adversely affect the global accuracy, no limiter has been employed in this test case. Consequently, the resulting solution is not monotone and there are considerable oscillations around the shocks and expansion fans. The tolerance for solving the linear system is 5 × 10⁻² in the start-up phase and 1 × 10⁻² in the Newton phase. For all test cases a Krylov subspace size of 30 has been set and no restart is allowed. Preconditioning for the start-up pre-iterations uses the approximate analytical Jacobian matrix with ILU(1) factorization, while for the Newton iterations a finite difference Jacobian matrix with ILU(4) factorization is applied. The Newton iterations are matrix-free, and ε = ε0‖z‖ with ε0 = 1 × 10⁻⁷ is used for directional differencing. The initial condition is free-stream flow. Convergence is reached when the L2 norm of the non-linear density residual falls below 1 × 10⁻¹².
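The matrix-free product at the heart of these GMRES iterations can be sketched as follows. Here `residual` stands in for the nonlinear residual routine R(u); the step size follows the ε = ε0‖z‖ scaling quoted above. This is a minimal illustration of the directional-differencing idea, not the thesis code:

```python
import numpy as np

def jac_vec(residual, u, z, eps0=1e-7):
    """Approximate J(u) @ z by one-sided directional differencing:
    J z ≈ (R(u + ε z) - R(u)) / ε, with ε = ε0 * ||z|| as in the text.
    This matvec is the only access to the Jacobian that matrix-free
    GMRES needs, so the higher-order Jacobian is never formed."""
    nz = np.linalg.norm(z)
    if nz == 0.0:
        return np.zeros_like(u)
    eps = eps0 * nz
    return (residual(u + eps * z) - residual(u)) / eps
```

For an affine residual R(u) = A u + b the formula reproduces A z up to rounding error, which makes the accuracy of the approximation easy to check.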
In the current test case, the major physical flow characteristics, including both shocks and expansion fans, need to be captured approximately before switching to Newton iterations. Otherwise, because of the highly non-linear and discontinuous regions in the supersonic flow field, convergence stalls shortly after the starting point. Therefore, 30 pre-iterations in the form of defect correction have been performed before switching to the Newton phase. The starting CFL is 5, and it is increased to 50 after 10 iterations. For the start-up of the 4th-order discretization case, 2nd-order defect correction is used to reduce the start-up cost. This also improves the robustness of the start-up, since a 4th-order defect correction start-up for such a flow (without a limiter) could be unstable. The pressure coefficient along the chord at the end of the start-up phase is shown in Fig. 7.47, demonstrating the effectiveness of the defect correction start-up for supersonic flow and the fact that the shocks and expansion fans are fairly accurately established by the end of the start-up process.
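In outline, each defect-correction pre-iteration solves a low-order implicit system with the full residual on the right-hand side, under a CFL ramp like the 5 → 50 schedule above. The sketch below applies the idea to a small artificial nonlinear system; the cubic term plays the role of the higher-order discretization, and all names and the model problem are illustrative rather than taken from the thesis:

```python
import numpy as np

def defect_correction(res_high, jac_low, u, n_iter=30, cfl0=5.0, cfl1=50.0):
    """Start-up pre-iterations: each step solves
        (I/CFL + J_low) du = -R_high(u)
    where J_low is the cheap low-order Jacobian used on the left-hand
    side and R_high is the accurate residual on the right-hand side."""
    for k in range(n_iter):
        cfl = cfl0 if k < 10 else cfl1           # CFL ramp, as in the text
        lhs = np.eye(len(u)) / cfl + jac_low(u)  # low-order implicit operator
        du = np.linalg.solve(lhs, -res_high(u))
        u = u + du
    return u

# Illustrative model: R_high(u) = A u + 0.1 u^3 - b, with J_low = A only
A = np.array([[3.0, -1.0], [-1.0, 2.0]])
b = np.array([1.0, 1.0])
res_high = lambda u: A @ u + 0.1 * u**3 - b
jac_low = lambda u: A
u = defect_correction(res_high, jac_low, np.zeros(2))
```

Because the left-hand operator only approximates the true Jacobian, the iteration converges linearly rather than quadratically, which is exactly why it is used only to reach a good starting point for the Newton phase.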
After starting the Newton phase, full convergence is achieved rapidly for all orders of discretization. Figure 7.48 shows the convergence history for the diamond airfoil, displaying
Figure 7.46: Mesh with 7771 CVs, diamond airfoil, M = 2.0, α = 0.0 — (a) diamond airfoil, (b) nose section, (c) mid section
Figure 7.47: Cp at the end of the start-up process, diamond airfoil, M = 2.0, α = 0.0 (2nd-order start-up, used for the 2nd- and 4th-order solutions; 3rd-order start-up; exact solution)
the fast convergence. Notice that the residual is still large right before the first Newton iteration, but by that point the main structure of the flow field has formed, and the residual drops quickly once the Newton iterations begin. In fact, the high Mach number helps to damp all errors, including low-frequency errors, quite effectively as soon as the shocks and expansion fans are established over the airfoil: first, the wave propagation speed is quite fast, and second, all waves travel toward the outlet without reflecting from boundaries. Table 7.7 provides the convergence summary for all discretization orders. For the 2nd-order case the major part of the total work is spent in the start-up phase, while for the 4th-order case, as expected from previous test cases, most of the time is spent on the Newton iterations. This is especially true here since the 2nd-order start-up is employed and the 4th-order start-up is avoided (as in the subsonic airfoil test case), reducing the relative cost of the start-up phase for the 4th-order solution. For the 3rd-order case, the costs of the start-up and Newton phases are split evenly.
The Mach contours for all discretization orders are shown in Fig. 7.49 to Fig. 7.51. All the steady-state flow features, including symmetric shocks right at the sharp leading edge, expansion fans at the lower and upper apexes, and symmetric fish-tail shocks at the sharp trailing edge of the diamond airfoil, are clearly captured.
Figure 7.48: Convergence history for diamond airfoil, M = 2.0, α = 0.0 (L2 residual versus CPU time in seconds; 2nd-, 3rd-, and 4th-order curves with Newton iterations marked)
Test case   No. of Res.   Time (Sec)   Work Unit   No. of Newton Itr.   Newton Phase Work Unit
2nd         61            27.85        236.0       2                    50.0
3rd         126           48.62        243.1       3                    120.8
4th         254           133.57       301.5       7                    250.0

Table 7.7: Convergence summary for diamond airfoil, M = 2.0, α = 0.0
Test case   CD
2nd         0.0524258
3rd         0.0524661
4th         0.0524671
Exact       0.0524663

Table 7.8: Drag coefficient, diamond airfoil, M = 2.0, α = 0.0
Figure 7.49: 2nd-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 (contour levels from 1.55 to 2.45)
Figure 7.50: 3rd-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 (contour levels from 1.55 to 2.45)
Figure 7.51: 4th-order Mach contours, diamond airfoil, M = 2.0, α = 0.0 (contour levels from 1.55 to 2.45)
Figure 7.52: 2nd-order Cp, diamond airfoil, M = 2.0, α = 0.0 (numerical versus exact solution)
Figure 7.53: 3rd-order Cp, diamond airfoil, M = 2.0, α = 0.0 (numerical versus exact solution)
Figure 7.54: 4th-order Cp, diamond airfoil, M = 2.0, α = 0.0 (numerical versus exact solution)
Figures 7.52 to 7.54 display the pressure coefficient along the chord of the diamond airfoil for all orders of discretization. In all cases there are overshoots and/or undershoots at the leading and trailing edges of the airfoil (where the shocks are located) and at the middle section of the diamond (where the expansion fans are formed). Naturally, the overshoots and undershoots are worse for higher discretization orders. The 4th-order Cp shows some oscillatory behavior in the front region, between the two large-gradient points of the diamond airfoil. This is due to the continuing effect of the overshoots and undershoots, which are not fully damped in that region because of the very large reconstruction stencil and the character of the cubic polynomial. It is worth mentioning that the higher-order reconstruction procedure used in this research has some compactness issues for such a supersonic flow, since the reconstruction routine uses data from regions of the flow field that are physically independent; that is, downstream data is included in the reconstruction. Despite all this, there is overall good agreement between the exact and numerical solutions over the supersonic diamond airfoil.
The computed drag coefficient for all orders of discretization is tabulated in Table 7.8. Again, there is a very good match between the numerical and exact solutions. For the inviscid supersonic diamond airfoil case, because the shock and expansion fan locations are fixed by the geometry, the drag coefficient is solely a function of the shock strengths and of the acceleration of the flow at the expansion apexes. This is why the computed drag coefficient remains fairly accurate despite the overshoots and undershoots, as their mean values remain very close to zero.
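As a rough cross-check on the drag values above, supersonic thin-airfoil (Ackeret) theory gives a closed-form wave-drag estimate for a symmetric double wedge at zero incidence, cd = 4(t/c)²/√(M∞² − 1). This linearized value is not the exact shock-expansion result quoted in the table, but for a 15% diamond at M = 2 it lands within about one percent of it:

```python
import math

def ackeret_wave_drag(thickness_ratio, mach):
    """Linearized (Ackeret) wave-drag coefficient for a symmetric diamond
    airfoil at zero angle of attack: cd = 4 (t/c)^2 / sqrt(M^2 - 1).
    Each of the four panels has surface slope t/c and pressure
    coefficient cp = ±2*slope/sqrt(M^2 - 1)."""
    beta = math.sqrt(mach**2 - 1.0)
    return 4.0 * thickness_ratio**2 / beta

cd_lin = ackeret_wave_drag(0.15, 2.0)  # ≈ 0.0520, vs. 0.0524663 exact
```

The small gap between the linearized estimate and the exact value reflects the second-order (in slope) terms that shock-expansion theory retains.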
Chapter 8
Concluding Remarks
8.1 Summary and Contributions
A fast implicit finite volume flow solver for higher-order unstructured (cell-centered) steady-
state computation of inviscid compressible flows (the Euler equations) has been developed.
The matrix-free GMRES algorithm is used as a solver for the linear system arising from the
discretized equations, avoiding explicit computation of the higher-order Jacobian matrix.
The solution process has been divided into two separate phases, i.e. a start-up phase and a
Newton phase.
A defect correction procedure is proposed for the start-up phase, consisting of multiple implicit pre-iterations. The approximate first-order analytical Jacobian is used both for constructing the linear system on the left-hand side and for preconditioning it, reducing the complexity and cost of the pre-iterations. A low fill-level incomplete factorization, ILU(1), is used for right preconditioning of the low-order linear system associated with the defect correction iterations. The GMRES algorithm in its standard form is used for solving the resultant linear system. Since the linearization is not very accurate at this stage, the CFL number is kept low (on the order of 10²) and a moderate tolerance is set for solving the linear system. The defect correction procedure is very effective in reaching an approximate solution that includes most of the physical characteristics of the steady-state flow.
Having an approximate solution at the end of the start-up phase, where the linearization of the flow field is accurate enough for the steady-state solution, the solution process is switched to the Newton phase, taking an infinite time step. In this phase, a first-order finite difference
Jacobian is used for preconditioning the higher-order linear system in the matrix-free context. A high fill-level incomplete factorization, ILU(4), is applied for right preconditioning of the Newton-GMRES iterations, to take the best possible advantage of the employed approximate Jacobian¹. Semi-quadratic or superlinear convergence has been achieved within a few Newton iterations for most of the cases.
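This pattern, GMRES applied through matvecs with an incomplete factorization as preconditioner, can be mimicked with SciPy's sparse tools. Note two caveats: SciPy's `spilu` controls fill through `fill_factor`/`drop_tol` rather than a level-of-fill p, so it is only an analogue of ILU(1)/ILU(4), and the side on which SciPy applies `M` is a library detail that need not match the right preconditioning described above. A sketch, not the thesis code:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def gmres_with_ilu(A, b, restart=30):
    """Preconditioned GMRES in the spirit of the solver described here:
    an incomplete LU factorization of the (low-order) matrix serves as
    the preconditioner, while A itself is only touched through matvecs."""
    ilu = spla.spilu(sp.csc_matrix(A), fill_factor=10.0)
    M = spla.LinearOperator(A.shape, matvec=ilu.solve)  # apply ILU back-solve
    x, info = spla.gmres(A, b, M=M, restart=restart, maxiter=5)
    return x

# Small 1D Laplacian as a stand-in linear system
n = 50
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = gmres_with_ilu(A, b)
```

With a good preconditioner the Krylov subspace stays small, which is the same trade-off the thesis makes by limiting the subspace to 30 vectors.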
A modified version of the Venkatakrishnan limiter [87] is used to enforce loose monotonicity for transonic flows. The higher-order terms in the reconstruction polynomial are dropped wherever the limiter fires strongly in the flow field. This is done through a differentiable switch, σ, which is a function of the limiter value φ itself. Also, a large constant K in the Venkatakrishnan limiter (typically K = 10) is chosen, which somewhat sacrifices monotonicity in favor of rapid convergence. The overall proposed limiting procedure ensures fast convergence while reducing the accuracy disadvantages of limiter application.
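The ingredients just described can be sketched as follows: the standard Venkatakrishnan limiter function with its constant K, plus one plausible smooth form for the switch σ. The exact σ formula of the thesis is not reproduced here, so the smoothstep below is only an illustrative stand-in:

```python
def venkat_phi(d1, d2, h, K=10.0):
    """Standard Venkatakrishnan limiter value for one face of a cell.
    d1: admissible difference (u_max - u_i if d2 > 0, else u_min - u_i)
    d2: reconstructed face difference (u_face - u_i)
    h:  local mesh length scale; eps^2 = (K*h)^3 relaxes limiting in
        nearly smooth regions, and a large K (typically 10) trades some
        monotonicity for convergence."""
    if abs(d2) < 1e-14:
        return 1.0
    eps2 = (K * h) ** 3
    num = (d1 * d1 + eps2) * d2 + 2.0 * d2 * d2 * d1
    den = d2 * (d1 * d1 + 2.0 * d2 * d2 + d1 * d2 + eps2)
    return num / den

def sigma_switch(phi, threshold=0.75):
    """Hypothetical differentiable switch sigma(phi): near 1 when the
    gradient limiter is inactive, smoothly dropping the higher-order
    terms to 0 where phi fires strongly (a smoothstep stand-in, not
    the formula used in the thesis)."""
    t = min(max(phi / threshold, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)
```

In smooth regions φ stays near one and σ leaves the higher-order terms intact; at a strong shock φ drops well below one and σ removes them, matching the behavior seen in Figs. 7.42 and 7.43.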
The effect of the choice of ε on the accuracy of the matrix-vector products computed by the directional derivative technique in the matrix-free GMRES was analyzed thoroughly. It was shown that for a fine mesh it is necessary to choose a relatively large ε to properly account for the higher-order terms.
The issue of mesh refinement in accuracy measurement for unstructured meshes is revisited. A straightforward methodology is applied for accuracy assessment of the higher-order unstructured approach, based on total pressure loss, drag measurement, and solution error calculation.
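The assessment step above reduces to the standard computation of an observed order of convergence from error norms on two systematically refined meshes; the symbols here are generic, not tied to specific thesis data:

```python
import math

def observed_order(err_coarse, err_fine, ratio=2.0):
    """Observed order of accuracy p from error norms on two meshes whose
    length scales differ by `ratio`: e ~ C h^p  =>  p = log(e_c/e_f)/log(r).
    For an unstructured mesh an effective h is often taken as
    sqrt(domain_area / n_control_volumes)."""
    return math.log(err_coarse / err_fine) / math.log(ratio)
```

For example, halving the length scale should cut the error of a 4th-order scheme by roughly a factor of 16, and deviations from that rate flag under-resolved regions such as boundaries.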
The effects of discretization order, mesh size, and far-field distance were studied in detail, and a careful measurement of the solver performance was provided. The accuracy, fast convergence, and robustness of the proposed higher-order unstructured Newton-Krylov solver for different speed regimes were demonstrated via several test cases.
Solutions of different orders of accuracy are compared in detail through several investigations. The possibility of reducing the computational cost required for a given level of accuracy by using high-order discretization is demonstrated.
¹For such a Newton-GMRES solver, it is shown that ILU(4) has performance equal to full LU factorization in terms of outer iterations, while the cost of the ILU(4) factorization, in both time and memory usage, remains fairly small compared to LU factorization [52].
8.2 Conclusions
While, in general, computing cost remains one of the main concerns in higher-order computation of fluid flow problems, the current research shows that fast convergence, at a reasonable computing cost, is indeed possible for higher-order unstructured solutions.
Results for the implicit flow solution algorithm described in this thesis show that the 2nd- and 3rd-order solutions both display semi-quadratic or superlinear convergence for all test cases when started from a good initial guess or approximate solution. The 4th-order solution still converges quickly, although more slowly than the other discretization orders, requiring nearly twice the number of outer iterations as the 2nd-order solution started from a similar approximate flow solution.
The employed start-up procedure is very effective in providing, at reasonable cost, a good approximate solution as an initial condition for the Newton phase. For subsonic or supersonic cases, a limited number of pre-iterations is sufficient to reach a good initial solution. For the transonic case, however, the number of pre-iterations increases, since some residual reduction target must be met before switching to Newton iterations.
The proposed start-up technique takes a considerable part of the solution time for the 2nd- and 3rd-order solutions. As a result, reducing the start-up cost would significantly improve the overall performance of the flow solver. Finding a reasonably good initial solution quickly is something of an art rather than an exact science. If a pre-computed solution is available, then the cost of the start-up phase can be ignored, and in practice the total solution cost equals the Newton phase cost. A first-order solution, a full potential solution, an exact solution of an approximated problem, or an empirical solution can be chosen as the initial solution for the Newton phase, depending on availability and the problem at hand.
Preconditioning and the accuracy of the linear system solution are vital factors in the Newton phase of the Newton-GMRES solver. Evidence suggests that the number of outer Newton iterations in the 4th-order case can be reduced to the level of the 2nd- and 3rd-order methods provided that the linear system resulting from the 4th-order discretization is solved completely [53]. Given the current preconditioning strategy, this requires a considerable increase (by a factor of 4-5) in the number of 4th-order residual evaluations in the matrix-free GMRES algorithm. Therefore, it is preferable to keep the subspace size limited to a moderate number (i.e., 30) and bear the cost of roughly doubling the number of outer iterations.
The eigenvalue spectrum of the preconditioned system shows that the higher-order system is
shifted toward singularity, increasing the condition number of the linear system. Therefore
more effort is needed to solve the linear system resulting from the higher-order discretization,
especially in the vicinity of the steady-state solution.
For transonic flow, as for smooth flows, and despite the use of a limiter and the presence of normal shocks, superlinear convergence for the 2nd- and 3rd-order discretizations and fast convergence for the 4th-order discretization have been obtained in the Newton phase. The limiter, which was active mainly at the strong shock, did not strongly deteriorate the convergence rate, and the employed continuous switch performed as designed, removing the higher-order terms near shocks.
An efficient start-up technique, a good preconditioner matrix, and an effective preconditioning strategy are the key ingredients for the robustness and fast convergence of the Newton-GMRES solver. The performance of the current flow solver in terms of CPU time scales linearly with mesh size for all orders of discretization. As an overall performance assessment (including the start-up phase), the 3rd-order solution is about 1.3 to 1.5 times, and the 4th-order solution about 3.5 to 5 times, more expensive than the 2nd-order solution with the developed solver technology. It should be mentioned that the current 2nd-order solver algorithm, even without multigrid augmentation, is very competitive with the most efficient reported results for unstructured meshes in terms of normalized total solution time (work units). The 3rd- and 4th-order schemes are comparable in efficiency to the 2nd-order scheme as measured by residual evaluations.
Mesh refinement is the critical issue in accuracy verification, and reducing the length scale uniformly for irregular unstructured grids is not an easy task. Consequently, in the accuracy assessment process the generated mesh levels must be examined closely, especially at boundaries. Boundaries in general, and the solid wall boundary in particular, are the main sources of solution error.
Qualitative and quantitative comparisons of the fluid flow solutions for different discretization orders over several meshes show that the quality of the solution for smooth flows is dramatically improved by the high-order discretization technique, especially on coarse grids. This suggests that, depending on the error measurement criteria, significant efficiency and accuracy gains are attainable with high-order discretization.
Since the developed implicit algorithm is matrix-free, memory usage is not an issue. In the matrix-free case, memory usage is determined by the preconditioning technique; for the ILU(4) used in this thesis, it is a bit more than twice the memory required to store the first-order Jacobian, Table 5.1.
8.3 Recommended Future Work
The proposed solver algorithm works quite efficiently and accurately for a variety of inviscid
compressible flows. However, there are a number of areas where this research could be either
improved or extended.
8.3.1 Start-up
For very large meshes, it is not efficient to perform the start-up procedure from scratch; it is more efficient to start the computation from a coarser mesh. Therefore, for very large meshes, augmenting the pre-iterations with a mesh sequencing technique could be quite helpful. At the same time, the growth of total solution cost with mesh size during start-up can be dramatically reduced if a multigrid procedure is employed.
8.3.2 Preconditioning
Applying a more accurate preconditioning matrix (that is, a better approximation to the higher-order Jacobian) instead of the first-order Jacobian would improve the quality of the linear system solution, reducing the number of outer iterations, especially for the 4th-order method. That said, the preconditioning cost should be kept reasonably low, since the factorization cost of the preconditioner matrix should be much smaller than the cost of solving the original linear system. Adding a multigrid scheme to the preconditioning strategy could also be helpful for large meshes.
8.3.3 Reconstruction
In general, limiters adversely affect the solution accuracy by manipulating the reconstruction polynomial. This issue becomes more important for higher-order reconstruction if the limiter fires unnecessarily and extensively in the flow field. As a result, WENO schemes, in which limiters are avoided through a carefully weighted non-oscillatory reconstruction procedure, may be an appropriate approach for higher-order reconstruction of non-smooth flows. Furthermore, the benefit of applying higher-order reconstruction to high Mach number, non-smooth flows, considering the naturally oscillatory behavior of the reconstruction polynomial in such flows, is still an open area of research.
8.3.4 Extension to 3D
Extension of the current higher-order 2D solver algorithm to 3D, given the availability of a 3D reconstruction procedure, is reasonably straightforward if an appropriate higher-order representation of the 3D mesh is available. Since the 2D algorithm is developed for unstructured meshes, there is no fundamental limitation to such an extension, and the convergence of the flow solver should improve with the introduction of the third dimension (the 3D relieving effect). However, the matrix storage requirement in 3D needs to be studied carefully, since storing even the first-order Jacobian for preconditioning (because of the large number of cells in 3D) could be challenging. The incomplete factorization fill level in 3D is another issue, and it may not be possible to use high fill-level ILU(p) for 3D cases.
8.3.5 Extension to Viscous Flows
The implicit procedure introduced here could be applied to viscous flow computation, provided that a viscous residual function evaluation and a proper anisotropic unstructured/hybrid mesh are available. However, there would be some issues with the conditioning of the linear system, due to the wide range of geometrical and physical length scales, which would need to be addressed by proper preconditioning. In the case of turbulence modeling, the degree of coupling of the turbulence model into the implicit procedure would be another issue.
Bibliography
[1] PETSc, The Portable Extensible Toolkit for Scientific Computation. Argonne National
Lab, http://www-unix.mcs.anl.gov/petsc/petsc-as/index.html.
[2] Test Cases for Inviscid Flow Fields. Technical Report AR-211, Advisory Group for
Aerospace Research and Development (AGARD), NATO, 1985.
[3] R. Abgrall. On Essentially Non-oscillatory Schemes on Unstructured Meshes: Analysis
and Implementation. Journal of Computational Physics, 114:45–58, 1994.
[4] M. Aftosmis, D. Gaitonde, and S. T. Tavares. Behavior of Linear Reconstruction
Techniques on Unstructured Meshes. AIAA Journal, 33(11):2038–2049, 1995.
[5] M. J. Aftosmis and M. J. Berger. Multilevel Error Estimation and Adaptive h-
Refinement for Cartesian Meshes With Embedded Boundaries. AIAA Paper 2002-0863,
40th AIAA Aerospace Sciences Meeting and Exhibit, 2002.
[6] R. K. Agarwal. A Compact High-Order Unstructured Grids Method for the Solution
of Euler Equations. International Journal of Numerical Methods in Fluids, 31:121–147,
1999.
[7] F.R. Bailey. High-End Computing Challenges in Aerospace Design and Engineering. In
Third International Conference on Computational Fluid Dynamics (Invited Lecture),
2004.
[8] H. E. Bailey and R. M. Beam. Newton’s Method Applied to Finite Difference Approx-
imations for Steady State Compressible Navier-Stokes Equations. Journal of Compu-
tational Physics, 93:108–127, 1991.
[9] T. J. Barth. Analysis of Implicit Local Linearization Techniques for Upwind and TVD
Algorithms. AIAA Paper 87-0595, 25th Aerospace Sciences Meeting and Exhibit, 1987.
[10] T. J. Barth. Aspects of Unstructured Grids and Finite-Volume Solvers for the Euler
and Navier-Stokes Equations, in Unstructured Grid Methods for Advection-Dominated
Flows. Technical Report AGARD-R787, AGARD, 1992.
[11] T. J. Barth. Recent Development in High Order K-Exact Reconstruction on Unstruc-
tured Meshes. AIAA Paper 93-0668, 31st Aerospace Sciences Meeting and Exhibit,
1993.
[12] T. J. Barth and P. O. Frederickson. Higher-Order Solution of the Euler Equations
on Unstructured Grids Using Quadratic Reconstruction. AIAA Paper 90-0013, 28th
Aerospace Sciences Meeting and Exhibit, 1990.
[13] T. J. Barth and D. C. Jespersen. The Design and Application of Upwind Schemes
on Unstructured Meshes. AIAA Paper 89-0366, 27th Aerospace Sciences Meeting and
Exhibit, 1989.
[14] T. J. Barth and S. Linton. An Unstructured Mesh Newton Solver for Compressible
Fluid Flow and its Parallel Implementation. AIAA Paper 95-0221, 33rd Aerospace
Sciences Meeting and Exhibit, 1995.
[15] F. Bassi and S. Rebay. High-Order Accurate Discontinuous Finite Element Solution of
the 2D Euler Equations. Journal of Computational Physics, 138:251–285, 1997.
[16] M. Benzi. Preconditioning Techniques for Large Linear Systems: A Survey. Journal
of Computational Physics, 182:418–477, 2002.
[17] K. S. Bey and J. T. Oden. A Runge-Kutta Local Projection P1-Discontinuous Galerkin
Finite Element Method for High Speed Flows. AIAA Paper 91-1575, 1991.
[18] M. Blanco and D. Zingg. A Fast Solver for the Euler Equations on Unstructured Grids
Using a Newton-GMRES Method. Number AIAA Paper 97-0331, 1997.
[19] C. Boivin and C. Ollivier-Gooch. Guaranteed-Quality Triangular Mesh Generation for
Domains with Curved Boundaries. International Journal for Numerical Methods in
Engineering, 55(10):1185–1213, 2002.
[20] G. Corliss, Ch. Faure, A. Griewank, L. Hascoet, and U. Naumann. Automatic Differ-
entiation of Algorithms From Simulation to Optimization. Springer, 2002.
[21] Darren L. De Zeeuw. A Quadtree-Based Adaptively-Refined Cartesian-Grid Algorithm
for Solution of the Euler Equations. PhD thesis, Aerospace Engineering and Scientific
Computing in the University of Michigan, 1993.
[22] M. Delanaye. Polynomial Reconstruction Finite Volume Schemes for the Compressible
Euler and Navier-Stokes Equations on Unstructured Adaptative Grids. PhD thesis,
Universite de Liege, Faculte des Sciences Appliquees, 1998.
[23] M. Delanaye and J. A. Essers. Quadratic-Reconstruction Finite-Volume Scheme for
Compressible Flows on Unstructured Adaptive Grids. AIAA Journal, 35(4):631–639,
1997.
[24] M. Delanaye, Ph. Geuzaine, and J. A. Essers. Compressible Flows on Unstructured
Adaptive Grids. AIAA Paper 97-2120, 13th AIAA Computational Fluid Dynamics
Conference, 1997.
[25] R. S. Dembo, S. C. Eisenstat, and T. Steihaug. Inexact Newton Methods. SIAM
Journal on Numerical Analysis, 19(2):400–408, 1982.
[26] J. E. Dennis and R. B. Schnable. Numerical Methods for Unconstrained Optimizations
and Non Linear Equations. Prentice-Hall, 1983.
[27] J. Forrester, E. Tinoco, and Yu. Jong. Thirty years of development and application of
CFD at Boeing Commercial Airplanes. AIAA Paper 2003-3439, 2003.
[28] O. Friedrich. Weighted Essentially Non-Oscillatory Schemes for the Interpolation of
Mean Values on Unstructured Grids. Journal of Computational Physics, 144:194–212,
1998.
[29] P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Elsevier Academic
Press, 1986.
[30] A. G. Godfrey, C. R. Mitchell, and R. W. Walters. Practical Aspects of Spatially
High Accurate Methods. AIAA Paper 92-0054, 30th Aerospace Sciences Meeting and
Exhibit, 1992.
[31] J. J. Gottlieb and C. P. T. Groth. Assessment of Riemann Solvers for Unsteady
One-Dimensional Inviscid Flows of Perfect Gases. Journal of Computational Physics,
78(2):437–458, 1988.
[32] W. D. Gropp, D. E. Keyes, L. C. Mcinnes, and M. D. Tidriri. Globalized Newton-
Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD. Int. J. High Per-
formance Computing Applications, 14:102–136, 2000.
[33] A. Harten and S. Osher. Uniformly high-order accurate nonoscillatory schemes. SIAM
Journal of Numerical Analysis, 24, 1987.
[34] A. Haselbacher. A WENO Reconstruction Algorithm for Unstructured Grids Based on
Explicit Stencil Construction. Number AIAA Paper 2005-879 in 43rd AIAA Aerospace
Sciences Meeting and Exhibit, 2005.
[35] C. Hirsch. Numerical Computation of Internal and External Flows, Computational
Methods for Inviscid and Viscous Flows, volume 2. John Wiley & Sons, 1990.
[36] A. Jameson. Success and Challenges in Computational Aerodynamics. AIAA Paper
87-1184, 8th Computational Fluid Dynamics Conference, 1987.
[37] A. Jameson and D. Mavriplis. Finite-Volume Solution of the Two-Dimensional Euler
Equations on a Regular Triangular Mesh. AIAA Journal, 24:611–618.
[38] A. J. Jameson, W. Schmidt, and E. Turkel. Numerical Solutions of the Euler Equations
by a Finite-Volume Method Using Runge-Kutta Time-Stepping Schemes. AIAA Paper
81-1259, 1981.
[39] D. B. Kim and P. D. Orkwis. Jacobian Update Strategies for Quadratic and Near-
Quadratic Convergence of Newton and Newton-Like Implicit Schemes. AIAA Paper
93-0878, 1993.
[40] K. A. Knoll and D. E. Keyes. Jacobian Free Newton Krylov Methods: A Survey of
Approaches and Applications. Journal of Computational Physics, 193:357–397, 2004.
[41] C. B. Laney. Computational Gasdynamics. Cambridge University Press, 1998.
[42] R. J. Le Veque. Numerical Methods for Conservation Laws. Birkhauser Verlag, 1990.
[43] X-D. Liu, S. Osher, and T. Chan. Weighted Essentially Non-Oscillatory Schemes.
Journal of Computational Physics, 115, 1994.
[44] H. Luo, J. Baum, and R. Lohner. A Fast, Matrix-free Implicit Method for Compressible
Flows on Unstructured Grids. Journal of Computational Physics, 146:664–690, 1998.
[45] L. M. Manzano, J. V. Lassaline, P. Wong, and D. W. Zingg. A Newton-Krylov Al-
gorithm for the Euler Equations Using Unstructured Grids. AIAA Paper 2003-0274,
41st AIAA Aerospace Sciences Meeting and Exhibit, 2003.
[46] D. J. Mavriplis. On Convergence Acceleration Techniques for Unstructured Meshes.
Technical Report ICASE No. 98-44, Institute for Computer Applications in Science
and Engineering (ICASE), NASA Langley Research Center, Hampton, VA 23681-2199,
1998.
[47] F. Mavriplis. CFD in Aerospace in the New Millennium. Canadian Aeronautics and
Space Journal, 46(4):167–176, 2000.
[48] A. Nejat and C. Ollivier-Gooch. A High-Order Accurate Unstructured GMRES Solver
for Poisson’s Equation. 11th Annual Conference of the Computational Fluid Dynamics
Society of Canada, pages 344–349, 2003.
[49] A. Nejat and C. Ollivier-Gooch. A High-Order Accurate Unstructured GMRES Solver
for the Compressible Euler Equations. Third International Conference on Computa-
tional Fluid Dynamics, 2004.
[50] A. Nejat and C. Ollivier-Gooch. A High-Order Accurate Unstructured GMRES Algo-
rithm for Inviscid Compressible Flows. AIAA Paper 2005-5341, 17th AIAA Computa-
tional Fluid Dynamics Conference, 2005.
[51] A. Nejat and C. Ollivier-Gooch. A High-Order Accurate Unstructured Newton-Krylov
Solver for Inviscid Compressible Flows. AIAA Paper 2006-3711, 36th AIAA Fluid
Dynamics Conference, 2006.
[52] A. Nejat and C. Ollivier-Gooch. On Preconditioning of Newton-GMRES Algorithm for
a Higher-Order Accurate Unstructured Solver. 14th Annual Conference of the CFD
Society of Canada, 2006.
[53] A. Nejat and C. Ollivier-Gooch. Effect of Discretization Order on Preconditioning
and Convergence of a Higher-Order Unstructured Newton-Krylov Solver for Inviscid
Compressible Flows. AIAA Paper 2007-719, 45th AIAA Aerospace Sciences Meeting
and Exhibit, 2007.
[54] J. Nichols and D. W. Zingg. A Three-Dimensional Multi-Block Newton-Krylov Flow
Solver for the Euler Equations. AIAA Paper 2005-5230, 17th AIAA Computational
Fluid Dynamics Conference, 2005.
[55] E. J. Nielsen, W. K. Anderson, R. W. Walters, and D. E. Keyes. Application of Newton-
Krylov Methodology to a Three-Dimensional Unstructured Euler Code. AIAA Paper
95-1733, 12th AIAA Computational Fluid Dynamics Conference, 1995.
[56] C. Ollivier-Gooch. High-Order ENO Schemes for Unstructured Meshes Based on Least-
Square Reconstruction. AIAA Paper 97-0540, 35th AIAA Aerospace Sciences Meeting
and Exhibit, 1997.
[57] C. Ollivier-Gooch. Quasi-ENO Schemes for Unstructured Meshes Based on Unlimited
Data-Dependent Least-Squares Reconstruction. Journal of Computational Physics,
133:6–17, 1997.
[58] C. Ollivier-Gooch. Programmer's Reference Guide for the ANSLib Generic Finite-
Volume Solver Software Library. Advanced Numerical Simulation Laboratory
(ANSLab), Mechanical Engineering Department, The University of British Columbia,
2002.
[59] C. Ollivier-Gooch. GRUMMP (Generation and Refinement of Unstructured, Mixed-
Element Meshes in Parallel) Version 0.3.2 User’s Guide. Advanced Numerical Simu-
lation Laboratory (ANSLab), Mechanical Engineering Department, The University of
British Columbia, 2005.
[60] C. Ollivier-Gooch and M. Van Altena. A Higher-Order Accurate Unstructured Mesh
Finite-Volume Scheme for the Advection-Diffusion Equation. Journal of Computational
Physics, 181:729–752, 2002.
[61] O. Onur and S. Eyi. Effects of the Jacobian Evaluation on Newton’s Solution of the
Euler Equations. International Journal for Numerical Methods in Fluids, 49:211–231,
2005.
[62] P. D. Orkwis. Comparison of Newton's and Quasi-Newton's Method Solvers for the
Navier-Stokes Equations. AIAA Journal, 31:832–836, 1993.
[63] A. Pueyo and D. W. Zingg. An Efficient Newton-GMRES Solver for Aerodynamic
Computations. AIAA Paper 97-1955, 1997.
[64] A. Pueyo and D. W. Zingg. Progress in Newton-Krylov Methods for Aerodynamic
Calculations. AIAA Paper 97-0877, 35th Aerospace Sciences Meeting and Exhibit,
1997.
[65] A. Pueyo and D. W. Zingg. Improvement to a Newton-Krylov Solver for Aerodynamic
Flows. AIAA Paper 98-0619, 36th Aerospace Sciences Meeting and Exhibit, 1998.
[66] S. De Rango and D. W. Zingg. Aerodynamic Computations Using a Higher-Order
Algorithm. AIAA Paper 99-0167, 37th AIAA Aerospace Sciences Meeting and Exhibit,
1999.
[67] S. De Rango and D. W. Zingg. Further Investigation of a Higher-Order Algorithm for
Aerodynamic Computations. AIAA Paper 2000-0823, 38th Aerospace Sciences Meeting
and Exhibit, 2000.
[68] S. De Rango and D. W. Zingg. Higher-Order Aerodynamic Computations on Multi-
Block Grids. AIAA Paper 2001-0263, 39th AIAA Aerospace Sciences Meeting and
Exhibit, 2001.
[69] P. Roe. Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes.
Journal of Computational Physics, 43:357–372, 1981.
[70] P. Roe. Characteristic-Based Schemes for the Euler Equations. Annual Review of Fluid
Mechanics, 18:337–365, 1986.
[71] A. Rohde. Eigenvalues and Eigenvectors of the Euler Equations in General Geometries.
AIAA Paper 2001-2609, 39th Aerospace Sciences Meeting and Exhibit, 2001.
[72] Y. Saad and M. H. Schultz. GMRES: A Generalized Minimal Residual Algorithm for
Solving Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical
Computing, 7:856–869, 1986.
[73] Y. Saad. A Flexible Inner-Outer Preconditioned GMRES Algorithm. SIAM Journal on
Scientific Computing, 14(2):461–469, 1993.
[74] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, second edition, 2003.
[75] P.R. Spalart. Topics in Detached-Eddy Simulation. In Third International Conference
on Computational Fluid Dynamics (Invited Lecture), 2004.
[76] R. M. Spitaleri and V. Regolo. Multiblock Multigrid Grid Generation Algorithms:
Overcoming Multigrid Anisotropy. Applied Mathematics and Computation, 84:247–267,
1997.
[77] J. L. Steger and R. F. Warming. Flux Vector Splitting of the Inviscid Gasdynamics
Equations with Application to Finite Difference Methods. Journal of Computational
Physics, 40:263–293, 1981.
[78] M. Tadjouddine, S. A. Forth, and N. Qin. Elimination AD Applied to Jacobian As-
sembly for an Implicit Compressible CFD Solver. International Journal for Numerical
Methods in Fluids, 47:1315–1321, 2005.
[79] M. D. Tidriri. Preconditioning Techniques for the Newton-Krylov Solution of Com-
pressible Flows. Journal of Computational Physics, 132:51–61, 1997.
[80] M. Van Altena. High-Order Finite-Volume Discretisations for Solving a Modified
Advection-Diffusion Problem on Unstructured Triangular Meshes. Master’s thesis,
The University of British Columbia, Mechanical Engineering Department, 1999.
[81] H. A. van der Vorst. Iterative Krylov Methods for Large Linear Systems. Cambridge
University Press, 2003.
[82] B. Van Leer. Towards the Ultimate Conservative Difference Scheme. V. A Second-
Order Sequel to Godunov’s Method. Journal of Computational Physics, 32:101–136,
1979.
[83] B. Van Leer, J. L. Thomas, P. L. Roe, and R. W. Newsome. A Comparison of Numer-
ical Flux Formulas for the Euler and Navier-Stokes Equations. AIAA Paper 87-1104,
1987.
[84] K. J. Vanden and P. D. Orkwis. Comparison of Numerical and Analytical Jacobians.
AIAA Journal, 34(6):1125–1129, 1996.
[85] V. Venkatakrishnan. Newton Solution of Inviscid and Viscous Problems. AIAA Paper
88-0143, 26th Aerospace Sciences Meeting and Exhibit, 1988.
[86] V. Venkatakrishnan. On the Accuracy of Limiters and Convergence to Steady State
Solutions. AIAA Paper 93-0880, 31st Aerospace Sciences Meeting and Exhibit, 1993.
[87] V. Venkatakrishnan. Convergence to Steady-State Solutions of the Euler Equations on
Unstructured Grids with Limiters. Journal of Computational Physics, 118:120–130,
1995.
[88] V. Venkatakrishnan. A Perspective on Unstructured Grid Flow Solvers. Technical
Report ICASE 95-3, NASA, 1995.
[89] V. Venkatakrishnan and T. J. Barth. Application of Direct Solvers to Unstructured
Meshes for the Euler and Navier-Stokes Equations Using Upwind Schemes. AIAA
Paper 89-0364, 27th Aerospace Sciences Meeting and Exhibit, 1989.
[90] V. Venkatakrishnan and D. Mavriplis. Implicit Solvers for Unstructured Meshes. Jour-
nal of Computational Physics, 105:83–91, 1993.
[91] V. Venkatakrishnan and D. J. Mavriplis. Implicit Solvers for Unstructured Meshes.
AIAA Paper 91-1537, 1991.
[92] D. S. Watkins. Fundamentals of Matrix Computations. John Wiley & Sons, 1991.
[93] D. L. Whitaker. Three Dimensional Unstructured Grid Euler Computations Using a
Fully-Implicit Upwind Method. AIAA Paper 93-3337, 1993.
[94] L. Wigton. Application of MACSYMA and Sparse Matrix Technology to Multielement
Airfoil Calculations. AIAA Paper 87-1142, 1987.
[95] D. J. Willis, J. Peraire, and J. K. White. A Quadratic Basis Function, Quadratic
Geometry, High Order Panel Method. AIAA Paper 2005-0854, 43rd AIAA Aerospace
Sciences Meeting and Exhibit, 2005.
[96] D. W. Zingg, S. De Rango, M. Nemec, and T. H. Pulliam. Comparison of Several
Spatial Discretizations for the Navier-Stokes Equations. Journal of Computational
Physics, 160:683–704, 2000.