Finite Element Solver for Flux-Source Equations

Weston B. Lowrie

A thesis submitted in partial fulfillment of the requirements for the degree of

Master of Science in Aeronautics & Astronautics

University of Washington

2008

Program Authorized to Offer Degree: Aeronautics & Astronautics

University of Washington

Graduate School

This is to certify that I have examined this copy of a master’s thesis by

Weston B. Lowrie

and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.

Committee Members:

Uri Shumlak

Thomas Jarboe

Date:

In presenting this thesis in partial fulfillment of the requirements for a master’s degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this thesis is allowable only for scholarly purposes, consistent with “fair use” as prescribed in the U.S. Copyright Law. Any other reproduction for any purpose or by any means shall not be allowed without my written permission.

Signature

Date

University of Washington

Abstract

Finite Element Solver for Flux-Source Equations

Weston B. Lowrie

Chair of the Supervisory Committee: Professor Uri Shumlak

Aeronautics and Astronautics

An implicit finite element solver is being developed. The solver uses the flux-source equation form so that many equation sets can be implemented easily. This form simplifies the discretization of the finite element method by keeping the specification of the physics separate. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used for parallel matrix solvers and parallel data structures. The motivation behind the development is to have a general solver that can handle many equation sets, run on large parallel machines, and eventually be extended to multiple dimensions. The development of the 1D solver and results for several test case solutions to the pseudo-1D Euler equations are discussed. Accuracy, convergence, and computational timing studies of the method are also described.

TABLE OF CONTENTS

List of Figures

List of Tables

Chapter 1: Introduction
1.1 Motivation

Chapter 2: Finite Element Method
2.1 Flux-Source Equation Form
2.2 Galerkin’s Method - Weak Form of Equations
2.3 Nodal Basis Function using Lagrange Polynomials
2.4 Modal Basis Functions using Jacobi Polynomials
2.5 Basis Function Amplitudes
2.6 Gaussian Quadrature

Chapter 3: Solver Formulation
3.1 General Equation Form
3.2 Spatial Discretization
3.3 Mass and Stiffness Matrices Construction
3.4 Nonlinear Solver with Implicit Time Advance
3.5 Jacobians with Respect to Basis Functions

Chapter 4: General Boundary Conditions
4.1 Natural Boundary Condition
4.2 Specifying Different Boundary Equations
4.3 Summary of Boundary Conditions

Chapter 5: Artificial Dissipation

Chapter 6: PETSc Parallelization and Solvers
6.1 Vectors and Matrices
6.2 Scalable Linear Equations Solvers (KSP)
6.3 Scalable Nonlinear Equations Solvers (SNES)
6.4 SuperLU Direct Solver

Chapter 7: Pseudo-1D Euler Equations
7.1 Diverging Nozzle Setup
7.2 Boundary Condition Considerations
7.3 Supersonic Inflow and Outflow in a Diverging Nozzle
7.4 Supersonic Inflow and Subsonic Outflow in a Diverging Nozzle
7.5 Subsonic Inflow and Outflow in a Diverging Nozzle
7.6 Euler Shock Tube

Chapter 8: Accuracy, Convergence, and Timing Studies
8.1 Varying Polynomial Order
8.2 Varying Timestep
8.3 Errors with Large Timesteps
8.4 Computational Timing

Chapter 9: Future Developments and Plans
9.1 Incorporate Quadrilateral/Hexahedral Structured Grid Generator
9.2 Extend Algorithm to Three Dimensions

Chapter 10: Conclusions
10.1 Flux-Source Form
10.2 Nodal versus Modal Basis Functions
10.3 PETSc Data Structures
10.4 Implicit Time Advance

Bibliography

Appendix A: 1D Finite Element Equation Solver Manual
A.1 Introduction
A.2 Compiling with PETSc libraries
A.3 Running the Code
A.4 Algorithm Outline
A.5 Data Structures
A.6 Physics and Equation Specification Module

Appendix B: Source Code

LIST OF FIGURES

2.1 Sixth order nodal (Lagrange) polynomials on the domain $x \in [-1, 1]$. Each node has a corresponding polynomial with a value of one at the node. At all other nodes, the value of the same polynomial is zero.

2.2 A three element system with second order nodal (Lagrange) polynomials. Each node has a corresponding polynomial, and the nodes that share element boundaries have polynomials that span both elements. For instance $\alpha^1_3$ and $\alpha^2_1$ from the first and second element provide the $C^0$ continuity through the shared node 3.

2.3 Modal (Jacobi) polynomials $J_n^{(1,1)}$ with highest order of 7 on the domain $x \in [-1, 1]$. All polynomials are defined within the domain and go to zero at the domain boundaries, with the exception of the linear polynomials. The two linear polynomials range from zero on one boundary to one at the other boundary. These linear polynomials provide the continuity between elements.

2.4 A three element system with second order modal (Jacobi) polynomials $\alpha_n$. Each polynomial is not associated with any particular node, but defined at all points. The linear polynomials provide the $C^0$ continuity by spanning across element boundaries. Quadrature points are distributed evenly in this case and include the element boundaries, but they could also be defined only on the interior parts of the elements.

7.1 Nozzle used in solving the pseudo-1D Euler equations. Dashed box indicates the diverging section of the nozzle that is used in the simulations. The subscript $c$ indicates the chamber, $t$ represents the nozzle throat, $i$ the inflow (for the computational domain), $e$ the exit (outflow), and the shock subscript indicates a possible shock location when supersonic inflow and subsonic outflow conditions exist. The analytic cross sectional area function $A$ indicates the modeled section geometry.

7.2 Supersonic inflow and outflow in a nozzle after reaching a steady state (t=20). Plots of pressure $p$, density $\rho$, velocity $u$, and energy $e$. The dashed line represents the initial condition, while the solid line represents the solution at t=20. Each of the variables are normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time $t = t'/\tau$, and the length of the domain to a characteristic length $x = x'/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain.

7.3 Supersonic inflow and outflow after reaching a steady state (t=10) with overspecified boundary conditions. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t=10. A dissipation of $\epsilon = 5 \cdot 10^{-2}$ was used to resolve the boundary layer/shock. Each of the variables are normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time $t = t'/\tau$, and the length of the domain to a characteristic length $x = x'/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain.

7.4 Supersonic inflow and subsonic outflow after reaching a steady state (t = 50) for $p_e$ = 1.20 (blue), 1.30 (red), 1.40 (green) and 1.50 (magenta). Each of the variables are normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time $t = t'/\tau$, and the length of the domain to a characteristic length $x = x'/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain. A dissipation factor of $\epsilon = 5 \cdot 10^{-2}$ was used to resolve the shocks and control the dispersion.

7.5 Subsonic inflow and outflow conditions after reaching a steady state for pressure, density, velocity, and energy after t = 300. Each of the variables are normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time $t = t'/\tau$, and the length of the domain to a characteristic length $x = x'/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain.

7.6 Euler shock tube result after t = 1.5. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t=1.5. Each of the variables are normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time $t = t'/\tau$, and the length of the domain to a characteristic length $x = x'/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain. A dissipation factor of $\epsilon = 5 \cdot 10^{-3}$ was used to resolve the shocks and control dispersion.

8.1 Nozzle convergence for varying polynomial order. The $L_2$ norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Polynomial degrees of 2, 3, 4, 5, 6, 7, and 8 are shown.

8.2 Nozzle convergence for varying time step sizes $\Delta t$. The $L_2$ norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Several different time step sizes are shown: $\Delta t$ = 0.25, 0.333, 0.50, 0.667, 1.0, 1.333, 1.667, and 2.0.

8.3 Relative deviation from a steady state solution due to increase in time step size. The deviation measured is an infinity norm of the difference between the steady state solution and peak error due to oscillations. Figure 8.4 shows an example oscillatory error that results from large time steps. Five different spatial resolutions are compared, $N_e$ = 30, 40, 50, 100, and 200.

8.4 Velocity deviation from supersonic inflow and outflow steady state solution due to large time steps. The initial condition is the flat dashed line, the curved dashed line is the steady state solution, and the solid line is the erroneous solution due to large time steps.

8.5 Matrix structure for a 10 element system with 4th order polynomials (left) and 5 element system with 7th order polynomials (right). Both have 31 degrees of freedom.

9.1 A circle geometry showing the partitions (a) and after a structured quadrilateral mesh on each piece (b)

9.2 A cylinder geometry showing the partitions (a) and after a structured hexahedral mesh on each piece (b)

9.3 A cylinder geometry with cutaway showing the partitions (a) and after a structured hexahedral mesh on each piece (b)

9.4 A HIT like geometry showing the partitions (a) and after a structured hexahedral mesh on each partition (b)

LIST OF TABLES

4.1 Function $R_b$ and Jacobian $R'_b$ equations for both Neumann and Dirichlet boundary conditions applied to a primary variable and a non-primary variable. This table applies to both nodal and modal basis functions, where all but one basis function is zero at the boundary.

7.1 Inflow and outflow boundary condition requirements for the pseudo-1D Euler equations [8]. Characteristics are the eigenvalues for the pseudo-1D Euler system of equations, where $u$ is the bulk fluid velocity, and $a$ is the sound speed in the fluid. The (+) indicates a right moving characteristic and the (−) indicates a left moving characteristic.

7.2 Numerical versus analytical shock location in nozzle for several inflow/outflow pressure ratios.

8.1 Average Newton and average linear solver (GMRES) iteration counts for varying time steps after an equal number of time steps (100). Average Newton iterations are per time step, and the linear iterations are also averaged per time step.

8.2 CPU timing and average Newton and linear (GMRES) iteration counts for varying spatial resolution with polynomial order 4. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

8.3 CPU timing and average Newton and linear (GMRES) iteration counts for varying polynomial degree. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

8.4 CPU timing, average CPU time per time step, and average Newton and linear (GMRES) iteration counts for varying time step with parameters: poly = 6, $\Theta = 0.50$, $N_e = 50$, $t_{final} = 10.0$, $\epsilon = 1 \cdot 10^{-2}$. $\Delta t$ is the time step size, and $N_t$ is the total number of time steps. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

8.5 Various linear solver types included with the PETSc libraries, and the SuperLU direct solver, with their descriptions and parameters used for the runs in Table 8.6.

8.6 CPU timing and iteration results for different iterative linear solver methods described in Table 8.5

Chapter 1

INTRODUCTION

A finite element solver is being developed to solve equations in the flux-source form. This form enables physics equations of many types and levels of complexity to be solved generally, with a relatively small amount of editing to the code. The finite element method is chosen for its ability to effectively solve systems of equations with smooth solutions on arbitrarily defined geometries. The solver takes advantage of the Portable, Extensible Toolkit for Scientific Computation (PETSc) libraries for parallel data structure and solver management. It makes use of both the linear and nonlinear solvers built into PETSc as well as the interface to the SuperLU direct solver for large sparse matrices. Using these solver libraries enables the code to scale to large machines without major rewrites.

1.1 Motivation

The motivation behind developing a one dimensional code of this type is to prepare for

developing a 3D, fully implicit, parallel, finite/spectral element code, that can solve the

extended magnetohydrodynamic (MHD) equations and other plasma systems such as the

two-fluid equations on general body-fitted grids. This is a large and complicated undertaking

and using a one dimensional code can greatly simplify algorithm development, and ease the

transition to three dimensions.

The pseudo-1D Euler equations are implemented in this formulation because they are a relatively simple equation set that can be posed in the flux-source equation form. This equation set is complex enough that a solver can be developed and tested, but simple enough that it does not impede development.

Chapter 2

FINITE ELEMENT METHOD

The finite element method is a robust method for solving partial differential equations

on complex geometries. The method splits a large problem into many small elements and

solves each piece simultaneously. Each element makes up a piecewise continuous solution of

the larger problem. Within each element the solution is represented by basis (interpolation)

functions that determine the solution in the interior of the element. With careful selection

of basis functions the solution can be guaranteed continuous on the element boundaries.

To take advantage of the piecewise representation, the PDE must be written not in the differential form but in the integral (weak) form. This form gives an approximate solution to the problem over any specified range, and therefore the problem can be broken into elements.

2.1 Flux-Source Equation Form

The flux-source equation form is used for its convenience. Many equation sets can be represented in this form, and thus a solver can be formulated that generally solves this equation type. The form is also known as divergence form, and is written

\[ \frac{\partial \vec{q}}{\partial t} + \nabla \cdot \vec{f} = \vec{s} \tag{2.1} \]

where $\vec{q}$ is a vector of primary variables, and $\vec{f}$ and $\vec{s}$ are the fluxes and sources associated with each of the primary variables.
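As a rough illustration of how the flux-source form keeps the physics separate from the discretization, the sketch below specifies two scalar equation sets purely through a flux function and a source function. The names `EquationSet`, `burgers`, `advection_decay`, and `residual_terms`, and the constants used, are hypothetical and are not taken from the solver described in this thesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class EquationSet:
    """Hypothetical container: an equation set is described only by f(q) and s(q)."""
    name: str
    flux: Callable[[Vector], Vector]
    source: Callable[[Vector], Vector]


# Inviscid Burgers equation: dq/dt + d(q^2 / 2)/dx = 0
burgers = EquationSet(
    name="burgers",
    flux=lambda q: [0.5 * q[0] ** 2],
    source=lambda q: [0.0],
)

# Linear advection with decay: dq/dt + d(c q)/dx = -k q
# (c and k are assumed constants chosen only for illustration)
C_SPEED, K_DECAY = 2.0, 0.1
advection_decay = EquationSet(
    name="advection_decay",
    flux=lambda q: [C_SPEED * q[0]],
    source=lambda q: [-K_DECAY * q[0]],
)


def residual_terms(eqs: EquationSet, q: Vector) -> Tuple[Vector, Vector]:
    """A discretization would only call flux() and source(), never the physics itself."""
    return eqs.flux(q), eqs.source(q)
```

Under this assumed interface, switching from one equation set to another changes only the object passed to `residual_terms`; nothing in the discretization would need to be edited.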

2.2 Galerkin’s Method - Weak Form of Equations

Galerkin’s method converts a continuous PDE to a discrete problem by formulating the equation in the weak form. The weak form is constructed by multiplying the equation by a test function and integrating over the problem domain. In the Galerkin formulation the test function is the same interpolating (basis) function that is used to represent each variable.

Using the Galerkin discretization, the PDE is converted to the weak form

\[ \int_\Omega \alpha_i \frac{\partial \vec{q}}{\partial t}\, d\vec{x} + \int_\Omega \alpha_i \nabla \cdot \vec{f}\, d\vec{x} = \int_\Omega \alpha_i \vec{s}\, d\vec{x} \tag{2.2} \]

where the $\alpha$’s represent some interpolating basis function and $\Omega$ is the domain that the basis functions span. Additionally, the variables are expanded in terms of the basis functions and the amplitudes of the basis functions

\[ \vec{q} = \sum_i \alpha_i(x)\, q_i \tag{2.3} \]

2.3 Nodal Basis Function using Lagrange Polynomials

A nodal basis function set has one particular function associated with each node. All other

functions at this node are zero. The degree of polynomial thus determines the number of

nodes required in the system.

A common nodal basis set is the Lagrange polynomials, which are used because of the $C^0$ continuity they provide and their simplicity. For $n$ points $(x_j, y_j)$, the Lagrange interpolating polynomial is the polynomial of degree $\le (n-1)$ that passes through all $n$ points. It has the form

\[ \alpha(x) = \sum_{j=1}^{n} y_j\, \alpha_j(x) \tag{2.4} \]

where

\[ \alpha_j(x) = \prod_{k=1,\, k \ne j}^{n} \frac{x - x_k}{x_j - x_k} \tag{2.5} \]

Written out in full, this formulation is [3]

\[ \alpha(x) = \frac{(x-x_2)(x-x_3)\cdots(x-x_n)}{(x_1-x_2)(x_1-x_3)\cdots(x_1-x_n)}\, y_1 + \frac{(x-x_1)(x-x_3)\cdots(x-x_n)}{(x_2-x_1)(x_2-x_3)\cdots(x_2-x_n)}\, y_2 + \cdots + \frac{(x-x_1)(x-x_2)\cdots(x-x_{n-1})}{(x_n-x_1)(x_n-x_2)\cdots(x_n-x_{n-1})}\, y_n. \tag{2.6} \]

Figure 2.1 shows sixth order Lagrange polynomials and Figure 2.2 shows a second-order, three-element system. Notice that the basis functions associated with element boundaries provide the continuity. A special property of the Lagrange polynomials is that each basis function has a value of one at its corresponding node and zero at every other node. This property is useful because it makes the basis function amplitude the same as the primary variable value at that node.
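The nodal property described above can be checked numerically. The following is a minimal sketch that evaluates the product formula of Eqn. 2.5 directly; the function names and the choice of seven equally spaced nodes are assumptions made for illustration, not details of the thesis code.

```python
from typing import List


def lagrange_basis(nodes: List[float], j: int, x: float) -> float:
    """Evaluate the j-th Lagrange basis function (Eqn. 2.5) at x; j is 0-based."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != j:
            value *= (x - xk) / (nodes[j] - xk)
    return value


def interpolate(nodes: List[float], amplitudes: List[float], x: float) -> float:
    """Sum of amplitude times basis function, as in the expansion of Eqn. 2.3."""
    return sum(a * lagrange_basis(nodes, j, x) for j, a in enumerate(amplitudes))


# Seven equally spaced nodes on [-1, 1] give sixth-degree polynomials
# (equal spacing is an assumption of this sketch).
nodes = [-1.0 + 2.0 * i / 6 for i in range(7)]
```

Because each basis function is one at its own node and zero at the others, passing nodal values of the primary variable as `amplitudes` reproduces those values exactly at the nodes.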

Figure 2.1: Sixth order nodal (Lagrange) polynomials on the domain $x \in [-1, 1]$. Each node has a corresponding polynomial with a value of one at the node. At all other nodes, the value of the same polynomial is zero.

Figure 2.2: A three element system with second order nodal (Lagrange) polynomials. Each node has a corresponding polynomial, and the nodes that share element boundaries have polynomials that span both elements. For instance $\alpha^1_3$ and $\alpha^2_1$ from the first and second element provide the $C^0$ continuity through the shared node 3.

2.4 Modal Basis Functions using Jacobi Polynomials

Modal basis sets have an arbitrary number of functions defined within each element and

are not associated with any specific nodes. They have a polynomial defined for each order

up to the highest specified. For instance a third order element will have a linear, quadratic,

and cubic basis function defined. This differs from the nodal basis sets because all their

polynomials are of the highest order specified. This means for a third order nodal element,

all basis functions are cubic.

A common modal basis set is the Jacobi polynomials, which are solutions to the Jacobi differential equation. They can be used effectively as modal basis functions in the finite element method because of their ability to provide $C^0$ continuity and their complete spectral sampling. It is also simple to compute the functions for an arbitrary polynomial order, which makes them a convenient choice for numerical methods. They are defined by the Rodrigues formula

\[ J_n^{(\alpha_p,\beta_p)}(x) = \frac{(-1)^n}{2^n n!} (1-x)^{-\alpha_p} (1+x)^{-\beta_p} \frac{d^n}{dx^n} \left[ (1-x)^{\alpha_p+n} (1+x)^{\beta_p+n} \right] \tag{2.7} \]

for $\alpha_p, \beta_p > -1$, where $\alpha_p$ and $\beta_p$ are polynomial parameters and not the basis functions. A special case of the Jacobi polynomials is the Legendre polynomials, obtained when $\alpha_p = \beta_p = 0$.

In order to provide $C^0$ continuity with Jacobi polynomials, at least one of the functions must span continuously from one element to another. For simplicity the linear function is defined twice with opposite slopes and all other functions go to zero at the element boundaries. These functions have the form

\[ \begin{aligned} P_0(x) &= 1 \\ P_{1a}(x) &= (1+x)/2 \\ P_{1b}(x) &= (1-x)/2 \\ P_n(x) &= (1-x^2)\, J_{n-2}^{(\alpha_p,\beta_p)}(x) \quad (n \ge 2) \end{aligned} \tag{2.8} \]

where $J^{(\alpha_p,\beta_p)}$ is the Jacobi polynomial. This provides functions on the interval $x \in [-1, 1]$, which can be mapped linearly onto the domain of choice. Figure 2.3 shows these polynomials up to seventh order for $\alpha_p = \beta_p = 1$. The linear functions provide the continuity between elements. Figure 2.4 shows how these linear functions provide the continuity in a three element system. The quadrature points are placed at the element boundaries and at the roots of the polynomials. These points could be placed anywhere in the element domain as long as there are at least as many points as basis functions.
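The modal functions of Eqn. 2.8 can be evaluated with a short routine. The sketch below is one possible approach, not the thesis implementation: it computes the Jacobi polynomials with the standard three-term recurrence rather than by differentiating Eqn. 2.7, and it leaves out the constant $P_0$ on the assumption that an element of order N carries N + 1 functions, as in Figure 2.4. The function names are hypothetical.

```python
from typing import List


def jacobi(n: int, a: float, b: float, x: float) -> float:
    """Jacobi polynomial J_n^(a,b)(x) from the standard three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev = 1.0
    p_curr = 0.5 * (a - b + (a + b + 2.0) * x)
    for m in range(1, n):
        c = 2.0 * m + a + b
        lhs = 2.0 * (m + 1.0) * (m + a + b + 1.0) * c
        p_next = ((c + 1.0) * ((c + 2.0) * c * x + a * a - b * b) * p_curr
                  - 2.0 * (m + a) * (m + b) * (c + 2.0) * p_prev) / lhs
        p_prev, p_curr = p_curr, p_next
    return p_curr


def modal_basis(order: int, x: float, a: float = 1.0, b: float = 1.0) -> List[float]:
    """Values at x of the two linear functions and the interior modes of Eqn. 2.8."""
    values = [(1.0 - x) / 2.0, (1.0 + x) / 2.0]  # P_1b and P_1a carry the C0 continuity
    for n in range(2, order + 1):
        values.append((1.0 - x * x) * jacobi(n - 2, a, b, x))  # zero at x = -1 and x = 1
    return values
```

For a second order element, `modal_basis(2, 0.0)` gives the values of the two linear functions and the quadratic mode at the element midpoint.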

Figure 2.3: Modal (Jacobi) polynomials $J_n^{(1,1)}$ with highest order of 7 on the domain $x \in [-1, 1]$. All polynomials are defined within the domain and go to zero at the domain boundaries, with the exception of the linear polynomials. The two linear polynomials range from zero on one boundary to one at the other boundary. These linear polynomials provide the continuity between elements.

Figure 2.4: A three element system with second order modal (Jacobi) polynomials $\alpha_n$. Each polynomial is not associated with any particular node, but defined at all points. The linear polynomials provide the $C^0$ continuity by spanning across element boundaries. Quadrature points are distributed evenly in this case and include the element boundaries, but they could also be defined only on the interior parts of the elements.

2.5 Basis Function Amplitudes

The finite element solver advances the amplitudes q of the basis function as the solution. The actual primary variables can be recovered by evaluating the summation from Eqn. 2.3. This formulation is convenient because it enables the solution to be represented continuously within elements, rather than just at nodal locations. Consequently, this also means the initial condition, flux and source must be represented as amplitudes of the basis functions rather than by the primary variables.

2.5.1 Initial Condition

A set of nodes with a size determined by the number of degrees of freedom in the problem is defined. The initial condition is then defined on this set of nodes. This initial condition represents the primary variables, but the solver needs to know the amplitudes of the basis functions corresponding to the primary variables. The amplitudes are found by solving a linear system for the whole domain. Figures 2.2 and 2.4 show an example domain consisting of three quadratic elements for nodal (Lagrange) and modal (Jacobi) polynomials respectively.

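The system of equations can be expressed in matrix form for the three element system

\[ \begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \\ q_5 \\ q_6 \\ q_7 \end{pmatrix} = \begin{pmatrix} \alpha^1_1(1) & \alpha^1_2(1) & \alpha^1_3(1) & 0 & 0 & 0 & 0 \\ \alpha^1_1(2) & \alpha^1_2(2) & \alpha^1_3(2) & 0 & 0 & 0 & 0 \\ \alpha^1_1(3) & \alpha^1_2(3) & \alpha^1_3(3) & \alpha^2_2(3) & \alpha^2_3(3) & 0 & 0 \\ 0 & 0 & \alpha^2_1(4) & \alpha^2_2(4) & \alpha^2_3(4) & 0 & 0 \\ 0 & 0 & \alpha^2_1(5) & \alpha^2_2(5) & \alpha^2_3(5) & \alpha^3_2(5) & \alpha^3_3(5) \\ 0 & 0 & 0 & 0 & \alpha^3_1(6) & \alpha^3_2(6) & \alpha^3_3(6) \\ 0 & 0 & 0 & 0 & \alpha^3_1(7) & \alpha^3_2(7) & \alpha^3_3(7) \end{pmatrix} \cdot \begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \\ q_5 \\ q_6 \\ q_7 \end{pmatrix} \tag{2.9} \]

where $q_n$ on the left-hand side represents the primary variables defined at some point $n$, and the vector on the right-hand side holds the corresponding basis function amplitudes. This equation is a simple linear system and can be solved by inverting the matrix of basis functions to find the corresponding amplitudes. The size of the system is determined by the number of degrees of freedom, which corresponds to the total number of polynomial basis functions defined in the problem. For instance, the system shown in Figure 2.4 requires the initial condition to be defined at seven points. There must be the same number of points as there are basis functions in order for the system to be solved.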

With the Jacobi polynomials described in Eqn. 2.8 and for the system shown in Figure 2.4, the matrix in Eqn. 2.9 is

\[ M_{Jacobi} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0.5 & 1 & 0.5 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 1 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.5 & 1 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \tag{2.10} \]

(Note: When using a nodal basis representation such as Lagrange functions, this matrix is simply the identity matrix because each function has a value of one at its corresponding node and zero at every other node. No inversion is required.)
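To make the amplitude solve concrete, the sketch below builds the matrix of Eqn. 2.10 for quadratic modal elements and solves Eqn. 2.9 for the amplitudes. It uses a small hand-written Gaussian elimination so that it is self-contained; this is an assumption made for illustration and not the solver used in the thesis. The sample values and function names are likewise hypothetical.

```python
from typing import List


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def modal_matrix(n_elements: int) -> List[List[float]]:
    """Basis function values at the points for quadratic modal elements (Eqn. 2.10)."""
    n = 2 * n_elements + 1
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 1.0          # linear function at a shared point, or the
                               # quadratic mode (1 - x^2) at an element midpoint
        if i % 2 == 1:         # midpoint: both linear functions equal 0.5
            m[i][i - 1] = 0.5
            m[i][i + 1] = 0.5
    return m


# Primary variable sampled at the seven points (illustrative values: q = n^2).
samples = [float(i * i) for i in range(7)]
amplitudes = solve(modal_matrix(3), samples)
```

At the shared points the amplitude equals the sampled value, while each midpoint amplitude is the sample minus the average of its two neighbours, which is what the rows containing 0.5 in Eqn. 2.10 express.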

2.5.2 Flux and Source

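The flux and source amplitudes must also be found, in a similar way to the amplitudes of the initial primary variables. Since the flux and source are defined in terms of the primary variables, the amplitudes of the primary variables must be calculated first.

\[ \vec{f} = \sum_i \alpha_i(x)\, f_i, \qquad \vec{s} = \sum_i \alpha_i(x)\, s_i \tag{2.11} \]

The variables $\vec{q}$ from Eqn. 2.3 are used to compute the flux $\vec{f}$ and source $\vec{s}$ at the initial points. Then a system of equations similar to the initial condition system is formed for the flux and source

\[ \begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \end{pmatrix} = \begin{pmatrix} \alpha^1_1(1) & \alpha^1_2(1) & \alpha^1_3(1) & 0 & 0 & 0 & 0 \\ \alpha^1_1(2) & \alpha^1_2(2) & \alpha^1_3(2) & 0 & 0 & 0 & 0 \\ \alpha^1_1(3) & \alpha^1_2(3) & \alpha^1_3(3) & \alpha^2_2(3) & \alpha^2_3(3) & 0 & 0 \\ 0 & 0 & \alpha^2_1(4) & \alpha^2_2(4) & \alpha^2_3(4) & 0 & 0 \\ 0 & 0 & \alpha^2_1(5) & \alpha^2_2(5) & \alpha^2_3(5) & \alpha^3_2(5) & \alpha^3_3(5) \\ 0 & 0 & 0 & 0 & \alpha^3_1(6) & \alpha^3_2(6) & \alpha^3_3(6) \\ 0 & 0 & 0 & 0 & \alpha^3_1(7) & \alpha^3_2(7) & \alpha^3_3(7) \end{pmatrix} \cdot \begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \end{pmatrix}, \tag{2.12} \]

\[ \begin{pmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \\ s_6 \\ s_7 \end{pmatrix} = \begin{pmatrix} \alpha^1_1(1) & \alpha^1_2(1) & \alpha^1_3(1) & 0 & 0 & 0 & 0 \\ \alpha^1_1(2) & \alpha^1_2(2) & \alpha^1_3(2) & 0 & 0 & 0 & 0 \\ \alpha^1_1(3) & \alpha^1_2(3) & \alpha^1_3(3) & \alpha^2_2(3) & \alpha^2_3(3) & 0 & 0 \\ 0 & 0 & \alpha^2_1(4) & \alpha^2_2(4) & \alpha^2_3(4) & 0 & 0 \\ 0 & 0 & \alpha^2_1(5) & \alpha^2_2(5) & \alpha^2_3(5) & \alpha^3_2(5) & \alpha^3_3(5) \\ 0 & 0 & 0 & 0 & \alpha^3_1(6) & \alpha^3_2(6) & \alpha^3_3(6) \\ 0 & 0 & 0 & 0 & \alpha^3_1(7) & \alpha^3_2(7) & \alpha^3_3(7) \end{pmatrix} \cdot \begin{pmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \\ s_6 \\ s_7 \end{pmatrix}. \tag{2.13} \]

As in Eqn. 2.9, the vector on the left-hand side holds the values at the points and the vector on the right-hand side holds the basis function amplitudes.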

Notice the matrices are identical because they are a representation of the geometry and

element connectivity, which remains constant for the flux and source. These equations

must be solved every time the flux and source are evaluated. At minimum this occurs once

per time step, although since the matrix is identical it only needs to be inverted or factored

once before the time stepping begins.
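The idea of factoring the matrix once and reusing it for every flux and source evaluation can be sketched as follows. The LU routines here are written in plain Python so the example is self-contained; they stand in for whatever factorization the solver actually uses, and the function names and sample values are assumptions for illustration only.

```python
from typing import List, Tuple


def lu_factor(matrix: List[List[float]]) -> Tuple[List[List[float]], List[int]]:
    """Factor once: combined unit-lower L and upper U, plus the row permutation."""
    n = len(matrix)
    lu = [row[:] for row in matrix]
    perm = list(range(n))
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lu[r][col]))
        lu[col], lu[pivot] = lu[pivot], lu[col]
        perm[col], perm[pivot] = perm[pivot], perm[col]
        for r in range(col + 1, n):
            lu[r][col] /= lu[col][col]            # multiplier stored in place of L
            for c in range(col + 1, n):
                lu[r][c] -= lu[r][col] * lu[col][c]
    return lu, perm


def lu_solve(lu: List[List[float]], perm: List[int], rhs: List[float]) -> List[float]:
    """Reuse the stored factors: forward substitution, then back substitution."""
    n = len(rhs)
    y = [0.0] * n
    for r in range(n):
        y[r] = rhs[perm[r]] - sum(lu[r][c] * y[c] for c in range(r))
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (y[r] - sum(lu[r][c] * x[c] for c in range(r + 1, n))) / lu[r][r]
    return x


# One quadratic modal element: rows are the points, columns the basis functions.
basis_at_points = [[1.0, 0.0, 0.0],
                   [0.5, 1.0, 0.5],
                   [0.0, 0.0, 1.0]]
lu, perm = lu_factor(basis_at_points)          # done once, before time stepping

flux_samples = [2.0, 3.0, 6.0]                 # illustrative values only
source_samples = [0.0, 1.0, 0.0]
flux_amplitudes = lu_solve(lu, perm, flux_samples)      # repeated at each evaluation
source_amplitudes = lu_solve(lu, perm, source_samples)
```

Only the two inexpensive substitution passes in `lu_solve` are repeated inside the time loop; the factorization cost is paid once.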

2.6 Gaussian Quadrature

The integrals arising from the weak form of the equations need to be calculated in some way. A numerical quadrature is a simple way to integrate some arbitrary function $G(x)$, where the analytic integral might not be known. The method approximates the integral as a summation of the function evaluated at some quadrature points $x$ multiplied by some weighting values $w$. This has the form

\[ \int_a^b G(x)\, dx \approx \sum_{i=1}^{n} w_i\, G(x_i) \tag{2.14} \]

where each quadrature point $x_i \in [a, b]$ has a corresponding weight $w_i$ associated with it. The method finds the quadrature points using the roots of some polynomial set. Usually these points are found on an interval of $[-1, 1]$ and they are transformed to some physical interval $[a, b]$. The method also finds corresponding weight values specific to the polynomial set. With the quadrature points and weighting values known, the summation is evaluated to approximate the integral in Eqn. 2.14.

The polynomial set used plays a role in the convergence rates of the solution. For instance, using the weights and roots of the Jacobi polynomials to perform numerical integration of Jacobi functions provides spectral convergence of the solution [1]. Different quadrature types can be used for different basis functions, but this will not necessarily ensure spectral convergence.


Chapter 3

SOLVER FORMULATION

3.1 General Equation Form

To make the solver general, the flux-source equation form is used. This equation involves a vector of primary variables $\vec{q}$ and the fluxes $\vec{f}$ and sources $\vec{s}$ associated with each of the primary variables.

$$
\frac{\partial \vec{q}}{\partial t} + \nabla \cdot \vec{f} = \vec{s} \tag{3.1}
$$

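By applying the Galerkin spatial discretization described in Section 2.2, the weak form of the equation results.

$$
\int_\Omega \alpha_i \frac{\partial \vec{q}}{\partial t}\,d\vec{x}
+ \int_\Omega \alpha_i \nabla \cdot \vec{f}\,d\vec{x}
= \int_\Omega \alpha_i \vec{s}\,d\vec{x} \tag{3.2}
$$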

3.2 Spatial Discretization

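Further spatial discretization is performed by expanding $\vec{q}$, $\vec{f}$, and $\vec{s}$ with respect to the basis functions, and their amplitudes.

$$
\vec{q} = \sum_j \alpha_j(x)\,q_j(t), \qquad
\vec{f} = \sum_j \alpha_j(x)\,f_j(t), \qquad
\vec{s} = \sum_j \alpha_j(x)\,s_j(t) \tag{3.3}
$$

where $\alpha_j(x)$ is the jth basis function, and $q_j$, $f_j$, and $s_j$ are the jth amplitudes of the basis functions. In one dimension with this representation after dropping the summation notation, Eqn. 3.2 now becomes

$$
\int_\Omega \alpha_i \alpha_j\,d\vec{x}\;\frac{\partial q_j}{\partial t}
+ \int_\Omega \alpha_i \frac{\partial \alpha_j}{\partial x}\,d\vec{x}\;f_j
= \int_\Omega \alpha_i \alpha_j\,d\vec{x}\;s_j \tag{3.4}
$$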

Notice the spatial component of the primary variables is entirely represented by the basis

function, and therefore the amplitudes can be taken outside the integral. Each of the

integrals has a two index summation and can be represented as an element matrix.

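$$
M_e \frac{\partial \vec{q}}{\partial t} + K_e \vec{f} = M_e \vec{s} \tag{3.5}
$$

where $\vec{q}$, $\vec{f}$, and $\vec{s}$ are vectors of $q_j$, $f_j$, and $s_j$ from Eqn. 3.3 and

$$
M_e = \int_\Omega \alpha_i \alpha_j\,d\vec{x}, \quad \text{and} \quad
K_e = \int_\Omega \alpha_i \frac{\partial \alpha_j}{\partial x}\,d\vec{x}. \tag{3.6}
$$

Each element matrix can be assembled into a global matrix that represents the whole domain

$$
M \frac{\partial \vec{q}}{\partial t} + K \vec{f} = M \vec{s}. \tag{3.7}
$$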

3.3 Mass and Stiffness Matrices Construction

The mass matrix M and the stiffness matrix K arise from the weak form of the flux-source equation (Eqn. 2.1) when the basis functions are separated from the primary variables. The integrals are calculated using numerical quadrature, which gives one element matrix per element. These matrices represent the coupling between the spatial basis functions. For the system shown in Figure 2.4 the element mass matrix for element 1 is

$$
M_{e1} = \sum_k w_k
\begin{bmatrix}
\alpha^1_1\alpha^1_1 & \alpha^1_1\alpha^1_2 & \alpha^1_1\alpha^1_3 \\
\alpha^1_2\alpha^1_1 & \alpha^1_2\alpha^1_2 & \alpha^1_2\alpha^1_3 \\
\alpha^1_3\alpha^1_1 & \alpha^1_3\alpha^1_2 & \alpha^1_3\alpha^1_3
\end{bmatrix} \tag{3.8}
$$

where e1 represents the first element, the superscripts represent the element number, and the subscripts represent the basis function. The other elements are analogous.

The global mass matrix is assembled by adding each element matrix into a large N x N

matrix, where N is the total number of basis functions for the system. When elements share

basis functions, the element matrices overlap in the global matrix and are added together.

This summation is really just adding both sides of the integral together, which is split at

the element boundary. For the three element system the mass matrix is

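$$
\sum_k w_k
\begin{bmatrix}
\alpha^1_1\alpha^1_1 & \alpha^1_1\alpha^1_2 & \alpha^1_1\alpha^1_3 & 0 & 0 & 0 & 0 \\
\alpha^1_2\alpha^1_1 & \alpha^1_2\alpha^1_2 & \alpha^1_2\alpha^1_3 & 0 & 0 & 0 & 0 \\
\alpha^1_3\alpha^1_1 & \alpha^1_3\alpha^1_2 & \alpha^1_3\alpha^1_3 + \alpha^2_1\alpha^2_1 & \alpha^2_1\alpha^2_2 & \alpha^2_1\alpha^2_3 & 0 & 0 \\
0 & 0 & \alpha^2_2\alpha^2_1 & \alpha^2_2\alpha^2_2 & \alpha^2_2\alpha^2_3 & 0 & 0 \\
0 & 0 & \alpha^2_3\alpha^2_1 & \alpha^2_3\alpha^2_2 & \alpha^2_3\alpha^2_3 + \alpha^3_1\alpha^3_1 & \alpha^3_1\alpha^3_2 & \alpha^3_1\alpha^3_3 \\
0 & 0 & 0 & 0 & \alpha^3_2\alpha^3_1 & \alpha^3_2\alpha^3_2 & \alpha^3_2\alpha^3_3 \\
0 & 0 & 0 & 0 & \alpha^3_3\alpha^3_1 & \alpha^3_3\alpha^3_2 & \alpha^3_3\alpha^3_3
\end{bmatrix} \tag{3.9}
$$

where the superscripts represent the element number. The summed values (e.g. $\alpha^1_3\alpha^1_3 + \alpha^2_1\alpha^2_1$) should be the same between elements, since they only represent an integral of the spatial basis function over the same size domain. The stiffness matrix is similar except that it represents the coupling between the basis functions $\alpha$ and their derivatives $\alpha'$. The matrix should have a sparsity pattern similar to that of the mass matrix.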
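The assembly just described can be sketched as follows. This is a minimal illustration under stated assumptions rather than the thesis code: the basis is taken to be a quadratic nodal (Lagrange) basis on equal-size elements, the three-point Gauss-Legendre rule is used for the integrals, and all function names are hypothetical.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1]; exact for the degree-4
# products of two quadratic basis functions.
GAUSS_POINTS = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
GAUSS_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def basis(xi):
    """Assumed quadratic Lagrange basis on [-1, 1], nodes at -1, 0, 1."""
    return (0.5 * xi * (xi - 1.0), (1.0 - xi) * (1.0 + xi), 0.5 * xi * (xi + 1.0))


def element_mass_matrix(h):
    """Element matrix sum_k w_k alpha_i alpha_j for an element of length h."""
    Me = [[0.0] * 3 for _ in range(3)]
    for xi, w in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        a = basis(xi)
        for i in range(3):
            for j in range(3):
                Me[i][j] += w * a[i] * a[j] * 0.5 * h  # 0.5*h maps [-1, 1] to the element
    return Me


def assemble_global_mass_matrix(num_elements, h):
    """Add each element matrix into the global N x N matrix."""
    n = 2 * num_elements + 1          # neighbouring elements share one basis function
    M = [[0.0] * n for _ in range(n)]
    Me = element_mass_matrix(h)
    for e in range(num_elements):
        offset = 2 * e                # first global index of element e
        for i in range(3):
            for j in range(3):
                M[offset + i][offset + j] += Me[i][j]
    return M
```

For three elements this gives a 7 x 7 matrix with the block-overlap pattern of Eqn. 3.9: the entries at the shared indices (third and fifth diagonal entries) are sums of two element contributions.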

3.4 Nonlinear Solver with Implicit Time Advance

For an explicit time advance Eqn. 3.7 is modified to

$$
M\left(\frac{q^{n+1} - q^n}{\Delta t}\right) = M s^n - K f^n = X^n \tag{3.10}
$$

where n signifies the time step and the vector notation has been dropped for q, f, and s. With an implicit time advance using the Θ scheme, the equation is

$$
M\left(\frac{q^{n+1} - q^n}{\Delta t}\right) = \left[\Theta X^{n+1} + (1 - \Theta) X^n\right] \tag{3.11}
$$

Since $X^{n+1}$ is not known, an iterative scheme is used to solve the equation and it is rewritten in terms of a residual R as a function of the unknown $q^{n+1}$.

$$
R(q^{n+1}) = M\left(\frac{q^{n+1} - q^n}{\Delta t}\right) - \left[\Theta X^{n+1} + (1 - \Theta) X^n\right] = 0 \tag{3.12}
$$

where $X^{n+1}$ is also a function of $q^{n+1}$.

Newton’s method is used, which solves the equation for when R(qn+1) = 0. The method

is formulated by approximating the function R using a Taylor series expansion.

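$$
R(q^{n+1}) \approx R(q^k) + \left.\left(\frac{\partial R}{\partial q^{n+1}}\right)\right|_{q^k} \Delta q = 0 \tag{3.13}
$$

where $\Delta q = q^{k+1} - q^k$ and the index k is the iterate. This is then rewritten as

$$
\left.\left(\frac{\partial R}{\partial q^{n+1}}\right)\right|_{q^k} \Delta q = -R(q^k) \tag{3.14}
$$

which is a linear system and can be solved for $\Delta q$ provided the Jacobian $\partial R / \partial q^{n+1}$ and the function $R(q^k)$ are known. The Jacobian is found by taking a derivative of R with respect to $q^{n+1}$.

$$
\left.\left(\frac{\partial R}{\partial q^{n+1}}\right)\right|_{q^k}
= \frac{\partial}{\partial q^{n+1}}\left[\frac{M}{\Delta t}\Delta q - \left(\Theta X^{n+1} + (1 - \Theta) X^n\right)\right] \tag{3.15}
$$

This equation simplifies to

$$
\left.\left(\frac{\partial R}{\partial q^{n+1}}\right)\right|_{q^k}
= \frac{M}{\Delta t} - \Theta \left.\frac{\partial X^{n+1}}{\partial q^{n+1}}\right|_{q^k} \tag{3.16}
$$

The resulting Jacobian can be used in Eqn. 3.14, along with the iterate function evaluation, to solve the linear system for $\Delta q$. The iterate value is updated

$$
q^{k+1} = q^k + \Delta q \tag{3.17}
$$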

Since qk is an estimate for qn+1, the solution to the linear system is inaccurate. The

inaccuracy can be measured by evaluating Eqn. 3.12 with the updated iterate value qk+1

and comparing to some tolerance.

$$
\left. R(q^{n+1}) \right|_{q^{k+1}} \le tol \approx 0 \tag{3.18}
$$

If the evaluation of the function is within the tolerance limits, the solution is considered

converged. Otherwise the process is repeated by evaluating the function and Jacobian with

qk+1, the linear system from Eqn. 3.14 is solved again, the iterate value updated, and the

function is again checked against the tolerance. When the tolerance is met,

$$
q^k \longrightarrow q^{n+1} \tag{3.19}
$$

and the iterate value is considered the solution at the next time step qn+1.
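The iteration of Eqns. 3.12 to 3.19 can be sketched for a single scalar unknown, where the "linear system" of Eqn. 3.14 reduces to a division. This is a minimal sketch under that scalar assumption; the function name `theta_step` and its arguments are hypothetical, and the right-hand side X(q) with its derivative are supplied by the caller.

```python
def theta_step(q_n, dt, theta, X, dX_dq, mass=1.0, tol=1e-12, max_iter=50):
    """Advance  mass * dq/dt = X(q)  by one step of the Theta scheme."""
    X_n = X(q_n)
    q_k = q_n                       # initial iterate: previous time level
    for _ in range(max_iter):
        # Residual of Eqn. 3.12 evaluated at the current iterate
        R = mass * (q_k - q_n) / dt - (theta * X(q_k) + (1.0 - theta) * X_n)
        if abs(R) <= tol:           # convergence check, Eqn. 3.18
            return q_k              # iterate accepted as q^{n+1}, Eqn. 3.19
        # Jacobian of Eqn. 3.16
        J = mass / dt - theta * dX_dq(q_k)
        dq = -R / J                 # Eqn. 3.14 for a scalar unknown
        q_k = q_k + dq              # Eqn. 3.17
    raise RuntimeError("Newton iteration did not converge")
```

With X(q) = −q the residual is linear in the unknown, so a single Newton update reaches the answer: Θ = 1 gives the backward Euler value q/(1 + Δt), and Θ = 1/2 gives the Crank-Nicolson value. A nonlinear choice such as X(q) = −q² needs several iterations.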

3.5 Jacobians with Respect to Basis Functions

The Jacobian from Eqn. 3.16 is needed in the Newton method solution process and is defined using derivatives with respect to the basis function amplitudes. Since the Jacobian is defined in terms of amplitudes of basis functions, it needs to be calculated in much the same way as for the initial condition and flux and source amplitudes.

After using the original definition for X

$$
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[\frac{M}{\Delta t}\Delta q - \Theta\left(M s^n - K f^n\right)\right] \tag{3.20}
$$

which is rewritten as

$$
\frac{\partial R}{\partial q} = \frac{M}{\Delta t} - \Theta\left[M \frac{\partial s}{\partial q} - K \frac{\partial f}{\partial q}\right] \tag{3.21}
$$

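The flux Jacobian $\partial f / \partial \hat{q}$ and the source Jacobian $\partial s / \partial \hat{q}$ are needed in terms of the amplitudes, written here as $\hat{q}$ to distinguish them from the expanded variable $q$. It can be seen that

$$
\frac{\partial f}{\partial \hat{q}} = \frac{\partial f}{\partial q}\frac{\partial q}{\partial \hat{q}}
\quad \text{and} \quad
\frac{\partial s}{\partial \hat{q}} = \frac{\partial s}{\partial q}\frac{\partial q}{\partial \hat{q}} \tag{3.22}
$$

If $\partial q / \partial \hat{q}$ is expanded in terms of the basis functions it can be seen that

$$
\frac{\partial q}{\partial \hat{q}} = \frac{\partial \sum_j \alpha_j \hat{q}_j}{\partial \hat{q}} = \alpha_j \tag{3.23}
$$

and thus

$$
\frac{\partial f}{\partial \hat{q}} = \frac{\partial f}{\partial q}\,\alpha_j
\quad \text{and} \quad
\frac{\partial s}{\partial \hat{q}} = \frac{\partial s}{\partial q}\,\alpha_j \tag{3.24}
$$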

With the equalities from Eqn. 3.24 a linear system can be constructed in much the same manner as in section 2.5 for the initial condition, flux, and source amplitudes. The constructed linear system can be solved for $\partial s_l / \partial q$ and $\partial f_l / \partial q$ with respect to a particular basis function l. Each l represents a column in a resulting matrix.

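$$
\begin{bmatrix}
\left[\frac{\partial s}{\partial q}\right]_1 \alpha_1 \\
\left[\frac{\partial s}{\partial q}\right]_2 \alpha_2 \\
\vdots \\
\vdots \\
\left[\frac{\partial s}{\partial q}\right]_n \alpha_n
\end{bmatrix}_l
= [\alpha_n] \cdot
\begin{bmatrix}
\left[\frac{\partial s}{\partial q}\right]_1 \alpha_1 \\
\left[\frac{\partial s}{\partial q}\right]_2 \alpha_2 \\
\vdots \\
\vdots \\
\left[\frac{\partial s}{\partial q}\right]_n \alpha_n
\end{bmatrix}_l \tag{3.25}
$$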

where [αn] is a matrix that is identical to the matrix from section 2.5.

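In the case for multiple primary variables each block in Eqn. 3.25 is a $N_{eq} \times N_{eq}$ matrix, where $N_{eq}$ is the number of primary variables. For a system with three variables, the first block would look like

$$
\begin{bmatrix}
\frac{\partial s^1}{\partial q^1} & \frac{\partial s^1}{\partial q^2} & \frac{\partial s^1}{\partial q^3} \\
\frac{\partial s^2}{\partial q^1} & \frac{\partial s^2}{\partial q^2} & \frac{\partial s^2}{\partial q^3} \\
\frac{\partial s^3}{\partial q^1} & \frac{\partial s^3}{\partial q^2} & \frac{\partial s^3}{\partial q^3}
\end{bmatrix}_1 \tag{3.26}
$$

where the superscript represents the different primary variables.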

Equation 3.25 is analogous to equations 2.3 and 2.9 in section 2.5, where variables at

known points are used to solve for the amplitudes of the basis functions. In this case

the Jacobian is known at some specific locations, and a Jacobian defined in terms of the

amplitudes of the basis functions is needed. The vector on the left hand side in Eqn. 3.25

represents the known values, which are used to solve for the amplitude values.


Chapter 4

GENERAL BOUNDARY CONDITIONS

The goal is to have a generalized form of the boundary conditions such that it is easy

to specify boundary fluxes or specify a separate equation to be solved on the boundary.

This is accomplished by having lists of boundary nodes and interior nodes. With these lists

the equations that are specified for boundaries are applied only to boundary nodes, while

the interior equations are solved on all the interior nodes. Two major types of boundary

specification are used. One is the natural boundary condition, where the flux is controlled,

and the second involves specifying an alternative arbitrary boundary equation.

4.1 Natural Boundary Condition

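A natural boundary condition is applied by specifying the flux term of the weak form of the governing equation (Eqn. 2.2). In one dimension the equation looks like

$$
\underbrace{\int_\Omega \alpha \frac{\partial \vec{f}}{\partial x}\,dx}_{\text{flux term}}
= \underbrace{-\int_\Omega \frac{\partial \alpha}{\partial x}\vec{f}\,dx}_{\text{volume term}}
+ \underbrace{\left[\alpha \vec{f}\right]_{\partial\Omega}}_{\text{surface term}} \tag{4.1}
$$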

which is derived by integrating the term by parts and separating it into a volume term and a surface term, where Ω is the domain of interest. In one dimension, the surface term is a point evaluation, because each boundary consists of one node. This evaluation represents the amount of flux through the boundary nodes. Therefore the surface term can be specified to control the flux of the primary variables. For instance if one were to examine the fluid continuity equation

$$
\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = 0 \tag{4.2}
$$

the resulting surface term is $[\alpha \rho u]_{\partial\Omega}$, which is the momentum $\rho u$ multiplied by the basis function $\alpha$ evaluated at the boundary. The boundary flux, here the momentum $\rho u$, can be controlled by specifying the value of this term.

The specification of the flux term has several variants. The surface term can be treated identically to an interior equation, zeroed, or explicitly set to some value. When the surface term is treated identically to the interior elements, the flux passes through the surface term, and contributions to the term originate only from the element interior. It is as if the contribution from a neighboring element were excluded, but in this case it is a physical boundary. This is a useful boundary condition when no reflections are desired at the boundary. Zeroing the flux term gives what is also called a "zero-flux" boundary condition. The term is completely removed, which is useful for specifying a solid wall boundary. The third variant involves explicitly specifying the flux, which is useful for specifying inflow and outflow conditions on a boundary.

4.2 Specifying Different Boundary Equations

As an alternative to specifying the boundary flux, a separate equation can be specified for boundary nodes. The governing equation is replaced by another equation on the boundary nodes, while the interior nodes keep the standard governing equation. This is effective for specifying Dirichlet and Neumann boundary conditions.

4.2.1 Dirichlet Boundary Condition

Dirichlet on Primary Variable

Specifying a Dirichlet boundary condition involves changing the governing equation to

$$
q = \beta_D \tag{4.3}
$$

where $\beta_D$ is some specified value for the primary variable q, which can potentially be time dependent. In order to solve this equation in the finite element method described, the equation is modified on the boundary to

$$
R_b = q - \beta_D = 0 \tag{4.4}
$$

Similarly to the interior equation, this is converted to the weak form in one dimension and the variable expanded in terms of the basis function

$$
R_b = \int_\Omega \delta(x - x_b)\left[\sum_j \alpha_j(x)\,q_j - \beta_D\right] dx = 0 \tag{4.5}
$$

In this case rather than integrating over the whole domain with the basis function, an

evaluation at the boundary is performed using a delta function δ(x−xb) about the boundary

location xb. The delta function is critical because it reduces the integral to an evaluation

and excludes the contribution of the basis functions integrated over the element domain.

Despite the fact that all but one of the basis functions are zero at the element boundary,

their integrals over the element domain are nonzero and would impact the boundary node.

The primary variable q is expanded in terms of basis functions and amplitudes and the

delta function collapses the integral.

$$
R_b = \sum_j \alpha_j(x)\,q_j - \beta_D = 0 \tag{4.6}
$$

The summation is now over each of the basis functions at the boundary, and since all but one of them are zero there the summation is dropped and the equation simplifies to

$$
R_b = \sum_j \alpha_j(x_b)\,q_j - \beta_D \;\Rightarrow\; R_b = \alpha_j(x_b)\,q_j - \beta_D \tag{4.7}
$$

where $x_b$ is x at the boundary. (Note: in general all the basis functions can have nonzero values at the element boundaries, and this would lead to different continuity properties between elements. For simplicity this formulation uses only one nonzero basis function to provide the continuity, while all others are zero at the boundary.)

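The Jacobian also needs to be altered for the boundary equation.

$$
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[\sum_j \alpha_j(x_b)\,q_j - \beta_D\right] \tag{4.8}
$$

This simplifies to

$$
\frac{\partial R}{\partial q} = \sum_j \alpha_j(x_b) \;\Rightarrow\; \alpha_j(x_b) \tag{4.9}
$$

where $x_b$ is x at the boundary.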

Dirichlet on Non-Primary Variable

To hold a non-primary variable fixed at the boundary the condition is

$$
R_b = \frac{q^2}{q^1} - \beta_D = 0 \tag{4.10}
$$

where $q^1$ and $q^2$ are each primary variables and some combination (possibly nonlinear) yields the desired condition. For example if $q^1 = \rho$ and $q^2 = \rho u$, then $q^2/q^1 = u$ and u is desired to be held fixed. In the weak form using a delta function, with $q^1$ and $q^2$ expanded in terms of the basis function, the equation is

$$
R_b = \int_\Omega \delta(x - x_b)\left(\frac{\sum_j \alpha_j(x)\,q^2_j}{\sum_j \alpha_j(x)\,q^1_j} - \beta_D\right) dx = 0 \tag{4.11}
$$

Similar to the other case, this simplifies to

$$
R_b = \frac{\sum_j \alpha_j(x)\,q^2_j}{\sum_j \alpha_j(x)\,q^1_j} - \beta_D = 0 \tag{4.12}
$$

The function $R_b$ is trivial to evaluate, but since the equation is a function of more than one of the primary variables, the Jacobian will be more complicated.

$$
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[\frac{\sum_j \alpha_j(x)\,q^2_j}{\sum_j \alpha_j(x)\,q^1_j}\right]_{x = x_b}
\;\Rightarrow\;
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[\frac{\alpha_j(x)\,q^2_j}{\alpha_j(x)\,q^1_j}\right] \tag{4.13}
$$

where q includes all primary variables $q^1, q^2, \ldots, q^n$, and $x_b$ is x at the boundary. This equation must be evaluated and used as the Jacobian at the boundary.
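As an illustration of what the boundary Jacobian of Eqn. 4.13 contains, the sketch below works out the two nonzero entries for the velocity example. It assumes the simplified case in which a single basis function is nonzero at the boundary, so that its value cancels in the ratio and the residual depends only on the boundary amplitudes of $q^1$ and $q^2$. The function names are hypothetical.

```python
def dirichlet_velocity_residual(q1, q2, beta):
    """Boundary residual R_b = q2/q1 - beta (Eqn. 4.10 with u = q2/q1)."""
    return q2 / q1 - beta


def dirichlet_velocity_jacobian(q1, q2):
    """Derivatives of R_b with respect to the boundary amplitudes.

    Returns (dR/dq1, dR/dq2) = (-q2/q1**2, 1/q1), from the quotient rule.
    Entries for any other primary variable would be zero.
    """
    return (-q2 / (q1 * q1), 1.0 / q1)
```

The boundary row of the Jacobian therefore couples two primary variables, unlike the primary-variable case of Eqn. 4.9 where only one entry appears. A finite-difference comparison is a simple way to check such hand-derived entries.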

4.2.2 Neumann Boundary Condition

Neumann on Primary Variable

A Neumann boundary condition imposed on a primary variable has the form

$$
\frac{\partial q}{\partial x} = \beta_N \tag{4.14}
$$

The boundary equation is now

$$
R_b = \frac{\partial q}{\partial x} - \beta_N = 0 \tag{4.15}
$$

and the equation solved at the boundary in the weak form using a delta function is

$$
R_b = \int_\Omega \delta(x - x_b)\left(\frac{\partial q}{\partial x} - \beta_N\right) dx = 0 \tag{4.16}
$$

Again the delta function $\delta(x - x_b)$ is used to evaluate at the boundary rather than integrating over the whole domain. By expanding q in terms of the basis function the equation can be rewritten as

$$
R_b = \left.\sum_j \frac{\partial \alpha_j(x)}{\partial x}\,q_j\right|_{x = x_b} - \beta_N = 0 \tag{4.17}
$$

where $x_b$ is x at the boundary. Similar to the Dirichlet conditions this amounts to changing the R function at the boundary to Eqn. 4.17. The Jacobian will also be different and has the form

$$
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[\sum_j \alpha'_j(x_b)\,q_j - \beta_N\right] \tag{4.18}
$$

where $\alpha' = \partial\alpha/\partial x$. This equation simplifies to

$$
\frac{\partial R}{\partial q} = \sum_j \alpha'_j(x_b) \tag{4.19}
$$

This is analogous to the Dirichlet case, except that the basis function evaluation is a derivative.

Neumann on Non-Primary Variable

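A Neumann boundary condition on a non-primary variable is slightly more complicated than the primary variable case. Again as an example $q^2/q^1$ is used as the non-primary variable.

$$
\frac{\partial}{\partial x}\left(\frac{q^2}{q^1}\right) = \beta_N \tag{4.20}
$$

The weak form using a delta function is

$$
R_b = \int_\Omega \delta(x - x_b)\left[\frac{\partial}{\partial x}\left(\frac{q^2}{q^1}\right) - \beta_N\right] dx = 0 \tag{4.21}
$$

Expanding $q^1$ and $q^2$ with respect to the basis functions and collapsing the integral and delta function

$$
R_b = \left.\frac{\sum_j \alpha'_j(x)\,q^2_j}{\sum_j \alpha_j(x)\,q^1_j}\right|_{x = x_b}
- \left.\frac{\sum_j \alpha_j(x)\,q^2_j \;\sum_j \alpha'_j(x)\,q^1_j}{\left(\sum_j \alpha_j(x)\,q^1_j\right)^2}\right|_{x = x_b}
- \beta_N = 0 \tag{4.22}
$$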


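Table 4.1: Function $R_b$ and Jacobian $R'_b$ equations for both Neumann and Dirichlet boundary conditions applied to a primary variable and a non-primary variable. This table applies to both nodal and modal basis functions, where all but one basis function is zero at the boundary.

Boundary condition          $R_b$          $R'_b$

Dirichlet, Conserved          $\alpha_j q_j - \beta$          $\alpha_j$

Dirichlet, Non-Conserved          $\dfrac{\alpha_j q^2_j}{\alpha_j q^1_j} - \beta$          $\dfrac{\partial}{\partial q}\left[\dfrac{\alpha_j q^2_j}{\alpha_j q^1_j}\right]$

Neumann, Conserved          $\sum_j \alpha'_j q_j - \beta$          $\sum_j \alpha'_j$

Neumann, Non-Conserved          $\dfrac{\sum_j \alpha'_j q^2_j}{\sum_j \alpha_j q^1_j} - \dfrac{\sum_j \alpha_j q^2_j \sum_j \alpha'_j q^1_j}{\left(\sum_j \alpha_j q^1_j\right)^2} - \beta$          $\dfrac{\partial}{\partial q}\left[\dfrac{\sum_j \alpha'_j q^2_j}{\sum_j \alpha_j q^1_j} - \dfrac{\sum_j \alpha_j q^2_j \sum_j \alpha'_j q^1_j}{\left(\sum_j \alpha_j q^1_j\right)^2}\right]$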

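where $x_b$ is x at the boundary. Again the Jacobian is more complicated and looks like

$$
\frac{\partial R}{\partial q} = \frac{\partial}{\partial q}\left[
\frac{\sum_j \alpha'_j(x)\,q^2_j}{\sum_j \alpha_j(x)\,q^1_j}
- \frac{\sum_j \alpha_j(x)\,q^2_j \;\sum_j \alpha'_j(x)\,q^1_j}{\left(\sum_j \alpha_j(x)\,q^1_j\right)^2}
\right]_{x = x_b} \tag{4.23}
$$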

4.3 Summary of Boundary Conditions

For both Neumann and Dirichlet boundary conditions applied to a primary variable (e.g.

q1, q2, q3, . . . , etc) and non-primary variable (e.g. q2/q1), the function evaluation and

Jacobian differ from the interior equations. Table 4.1 summarizes the different equation

forms for the function Rb and Jacobian R′b at boundaries.


Chapter 5

ARTIFICIAL DISSIPATION

When solving problems using continuous finite elements, resolving shocks or other sharp

changes in the flow can be difficult and lead to numerical instabilities. The solution is

constrained to be continuous by virtue of the method, so whenever a sharp discontinuity is present, the solution develops high frequency oscillations (Gibbs phenomenon) that ultimately destroy the solution.

One way to counter the high frequency oscillations is to add a dissipation term to the

governing equations. The goal is to give finite width to shocks and other sharp features,

that would otherwise have large changes from one node to the next. A simple addition

of a second order term like a Laplacian suffices to dampen the high frequency oscillations

that occur. When added to the governing equations, the term can alter the physics of the

problem. One way to minimize the impact of adding the dissipation term is to scale it such that differing levels of dissipation can be added. To do this, the term is multiplied by a scalar ε. The governing equation now looks like

$$
\frac{\partial \vec{q}}{\partial t} + \nabla \cdot \vec{f} + \varepsilon \nabla^2 \vec{q} = \vec{s} \tag{5.1}
$$

where the Laplacian can be applied to only the primary variables of choice within $\vec{q}$. For instance it is common to only apply the term to the velocity or momentum.

Applying the Galerkin spatial discretization method to the term yields

$$
\varepsilon \nabla^2 \vec{q} \;\Rightarrow\; \int_\Omega \alpha_i\,\varepsilon \nabla^2 \vec{q}\,d\vec{x} \tag{5.2}
$$

In one dimension this simplifies to

$$
\int_\Omega \alpha_i\,\varepsilon \nabla^2 \vec{q}\,dx \;\Rightarrow\; \int_\Omega \alpha_i\,\varepsilon \frac{\partial^2 \vec{q}}{\partial x^2}\,dx \tag{5.3}
$$

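In order to reduce the order of the derivatives this term is now integrated by parts.

$$
\int_\Omega \alpha_i\,\varepsilon \frac{\partial^2 \vec{q}}{\partial x^2}\,dx
= -\varepsilon \int_\Omega \frac{\partial \alpha_i}{\partial x}\frac{\partial \vec{q}}{\partial x}\,dx
+ \varepsilon \left[\alpha_i \frac{\partial \vec{q}}{\partial x}\right]_{\partial\Omega} \tag{5.4}
$$

where ∂Ω represents the domain boundary. $\vec{q}$ is expanded in terms of the basis functions and amplitudes

$$
\int_\Omega \alpha_i\,\varepsilon \frac{\partial^2 \vec{q}}{\partial x^2}\,dx
= -\varepsilon \int_\Omega \alpha'_i \sum_j \alpha'_j\,q_j\,dx
+ \varepsilon \left[\alpha_i(x_b)\,\alpha'_j(x_b)\,q_j\right]_{\partial\Omega} \tag{5.5}
$$

After dropping the summation notation and moving q outside each of these terms, Eqn. 5.5 simplifies to

$$
\int_\Omega \alpha_i\,\varepsilon \frac{\partial^2 \vec{q}}{\partial x^2}\,dx
= \sum_j \left[-\varepsilon \int_\Omega \alpha'_i \alpha'_j\,dx + \varepsilon \left[\alpha_i \alpha'_j\right]_{\partial\Omega}\right] q_j \tag{5.6}
$$

This can now be represented as a linear combination of matrices and the vector of amplitudes q.

$$
[V_1 + V_2]\,q \tag{5.7}
$$

where

$$
V_1 = -\varepsilon \int_\Omega \alpha'_i \alpha'_j\,dx
\quad \text{and} \quad
V_2 = \varepsilon \left[\alpha_i(x_b)\,\alpha'_j(x_b)\right]_{\partial\Omega}
$$

Equation 3.7 can now be modified to include the dissipation terms. The new equation is

$$
M \frac{\partial \vec{q}}{\partial t} + K \vec{f} + [V_1 + V_2]\,\vec{q} = M \vec{s} \tag{5.8}
$$

This is a relatively simple modification to the governing equations and allows for solutions that develop sharp discontinuities during their evolution, as well as solutions that contain shocks from the start.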
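The structure of the volume matrix $V_1$ can be illustrated with the simplest possible basis. The sketch below assumes linear (two-node) elements of equal length h, for which the basis derivatives are the constants ∓1/h; this is chosen only to keep the integrals trivial and is not the basis used elsewhere in this document. The name `assemble_V1` is hypothetical, and the sign follows the definition of $V_1$ given with Eqn. 5.7.

```python
def assemble_V1(num_elements, h, eps):
    """Assemble V1 = -eps * integral(alpha_i' alpha_j') for linear elements."""
    n = num_elements + 1
    V1 = [[0.0] * n for _ in range(n)]
    # On one element alpha_1' = -1/h and alpha_2' = +1/h, so
    # integral(alpha_i' alpha_j') dx = (1/h) * [[1, -1], [-1, 1]].
    Ve = [[-eps / h, eps / h],
          [eps / h, -eps / h]]
    for e in range(num_elements):
        for i in range(2):
            for j in range(2):
                V1[e + i][e + j] += Ve[i][j]   # shared nodes accumulate
    return V1
```

Each row of the assembled matrix sums to zero, so a spatially constant field produces no contribution from $V_1$; only node-to-node variation is acted on, which is the behavior expected of a second-derivative term.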


Chapter 6

PETSC PARALLELIZATION AND SOLVERS

The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used for solver

data structures. These include the vectors and matrices, the nonlinear solver (SNES), the

Krylov subspace iterative linear solver (KSP), and an interface to the SuperLU direct linear

solver. Using these data structures and solvers allows for relatively simple implementation

and provides the groundwork for a scalable parallel solver. All of these data structures are

designed for parallel implementations, so once the variables are defined in the proper way,

the parallelization is mostly automatic.

6.1 Vectors and Matrices

The PETSc vectors and matrices are created by using the PETSc commands VecCreate() or MatCreate(). These objects need to know the global dimensions as well as the range assigned to each processor. The processor range can also be calculated by PETSc by using PETSC_DECIDE for the size. This feature allows for a fairly automatic partitioning of parallel data to each processor.
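To make the idea of an automatically chosen processor range concrete, the sketch below computes a contiguous, nearly even split of N global rows over P processes. It is only an illustration of the kind of partition a library can choose when the local size is left undecided; it is not PETSc code, and `split_ownership` is a hypothetical helper.

```python
def split_ownership(global_size, num_procs):
    """Return a list of (start, end) row ranges, one per process.

    Rows are divided as evenly as possible; when the division is not
    exact, the first few processes each receive one extra row.
    """
    base, extra = divmod(global_size, num_procs)
    ranges = []
    start = 0
    for rank in range(num_procs):
        local = base + (1 if rank < extra else 0)
        ranges.append((start, start + local))   # half-open range [start, end)
        start += local
    return ranges
```

For the seven-amplitude example of Chapter 2 split over three processes, this gives local sizes of 3, 2, and 2 rows.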

6.2 Scalable Linear Equations Solvers (KSP)

The PETSc libraries include a variety of linear solvers based on Krylov subspace iterative

methods. Some of these methods are generalized minimal residual (GMRES), conjugate gradient (CG), and bi-conjugate gradient (BICG). Several more types of iterative methods are available to suit specific problem types.

The convergence parameters for the KSP solver are:

• Relative tolerance - Tolerance relative to the previous iteration. (Default: RTOL = $10^{-5}$)

• Absolute tolerance - Global tolerance for convergence. (Default: ABSTOL = $10^{-50}$)

• Divergence - Number of iterations until the solution is considered diverged. (Default: DIVERGENCE = $10^{4}$)

• Preconditioning Side - The side of the matrix on which the preconditioner is applied. (Default: Left)

6.3 Scalable Nonlinear Equations Solvers (SNES)

A nonlinear solver is needed to approximate the solution of most interesting physical systems. Therefore a nonlinear solver is employed in the method to allow for these types of systems. The solver is the scalable nonlinear equation solver (SNES), which is built into PETSc. It uses a Newton-based method, which solves the approximate linear system

$$
R' \Delta q = -R \tag{6.1}
$$

where R is the function and R′ is the Jacobian. The solvers employ KSP for solutions to the linear systems while using a trust region method.[5] They then need a user specified function to evaluate the linear function, as well as the Jacobian.

6.3.1 Linear Function and Jacobian Evaluation

The linear function evaluation and Jacobian evaluation subroutines are specified using the SNESSetFunction() and SNESSetJacobian() functions, respectively. This provides an easy way to modularize the code such that these subroutines are defined for the physics equations at hand.

6.3.2 Convergence Criteria

There are several convergence criteria for the SNES solver:

• Absolute Tolerance - Tolerance for global root calculations. (Default: ABSTOL = $10^{-50}$)

• Relative Tolerance - Tolerance of norm compared to previous iteration's quantity. (Default: RTOL = $10^{-8}$)

• Step Tolerance - Tolerance in terms of the norm of the change in the solution between steps. (Default: STOL = $10^{-8}$)

• Maximum Iterations - Maximum number of Newton nonlinear iterations per time step. (Default: MAXIT = 50)

• Maximum Evaluations - Maximum number of function evaluations per time step. (Default: MAXF = $10^{4}$)

These can all be set using the SNESSetTolerances() function, or set using runtime parameters (e.g. -snes_rtol <value>).
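The way such criteria are typically combined inside a Newton loop can be sketched as below. This is a simplified illustration, not PETSc's implementation: the function name, the returned strings, and the order of the checks are assumptions, and the relative test is taken here against a reference residual norm `fnorm_ref` supplied by the caller. The default values are the ones listed above.

```python
def check_convergence(fnorm, fnorm_ref, snorm, xnorm, iteration, num_evals,
                      abstol=1e-50, rtol=1e-8, stol=1e-8,
                      maxit=50, maxf=10000):
    """Classify the state of a Newton iteration.

    fnorm     : norm of the current residual R
    fnorm_ref : reference residual norm used for the relative test
    snorm     : norm of the latest update (delta q)
    xnorm     : norm of the current solution iterate
    """
    if fnorm < abstol:
        return "converged: absolute tolerance"
    if fnorm <= rtol * fnorm_ref:
        return "converged: relative tolerance"
    if snorm < stol * xnorm:
        return "converged: step tolerance"
    if iteration >= maxit:
        return "stopped: maximum iterations"
    if num_evals >= maxf:
        return "stopped: maximum function evaluations"
    return "continue"
```

With an absolute tolerance as small as $10^{-50}$, the relative and step tests are in practice the ones that end the iteration.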

6.4 SuperLU Direct Solver

SuperLU is an optimized direct solver for large, sparse, nonsymmetric systems of linear

equations.[6] PETSc has an interface to the solver through the KSP linear solver. This

provides an easy way to use the solver using PETSc sparse matrices. Use of the solver is

simple, and only requires specification of the SuperLU solver type and a conversion of the

Jacobian matrix to the SUPERLU sparse matrix type.


Chapter 7

PSEUDO-1D EULER EQUATIONS

The pseudo-1D Euler equations provide a good test problem for the flux-source equation form. The equation set is nonlinear and has a source term, which provides enough complexity to sufficiently test the finite element algorithm.

Euler equations in one-dimension can only model some very simple flows, like the shock

tube problem. The pseudo-1D Euler equations include cross sectional area as a variable,

and as a result can model flow through a variable width “nozzle” or pipe. The equations

remain approximately 1D by assuming that flow is uniform at each cross section. [13]

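The equations have the form

$$
\frac{\partial \vec{q}}{\partial t} + \frac{\partial \vec{f}}{\partial x} = \vec{s} \tag{7.1}
$$

where $\vec{q}$, $\vec{f}$, and $\vec{s}$ are vectors of primary variables, fluxes, and sources respectively

$$
\vec{q} = \begin{bmatrix} \rho A \\ \rho u A \\ e A \end{bmatrix}, \quad
\vec{f} = \begin{bmatrix} \rho u A \\ (\rho u^2 + p) A \\ u (e + p) A \end{bmatrix}, \quad \text{and} \quad
\vec{s} = \begin{bmatrix} 0 \\ p \dfrac{dA}{dx} \\ 0 \end{bmatrix}
$$

and $e = \dfrac{p}{\gamma - 1} + \dfrac{1}{2}\rho u^2$ for an ideal gas. γ is the ratio of specific heats, and γ = 1.4 is used in the test problems.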
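Because the solver only needs the flux and source as functions of the primary variables, the physics of Eqn. 7.1 reduces to a small evaluation routine. The sketch below is one possible form of such a routine, assuming pointwise evaluation with the area A and its derivative dA/dx supplied by the caller. The function names are hypothetical.

```python
GAMMA = 1.4  # ratio of specific heats used in the test problems


def pressure(rho, u, e, gamma=GAMMA):
    """Ideal-gas pressure from e = p/(gamma - 1) + rho*u**2/2."""
    return (gamma - 1.0) * (e - 0.5 * rho * u * u)


def euler_flux_source(rhoA, rhouA, eA, A, dA_dx, gamma=GAMMA):
    """Flux and source vectors of the pseudo-1D Euler equations.

    The arguments rhoA, rhouA, eA are the primary variables of Eqn. 7.1.
    """
    rho = rhoA / A
    u = rhouA / rhoA
    e = eA / A
    p = pressure(rho, u, e, gamma)
    flux = (rho * u * A, (rho * u * u + p) * A, u * (e + p) * A)
    source = (0.0, p * dA_dx, 0.0)
    return flux, source
```

For example, ρ = 1, u = 2, p = 1 and A = 2 give e = 4.5 and primary variables (2, 4, 9), for which the flux is (4, 10, 22).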

7.1 Diverging Nozzle Setup

The setup of a diverging nozzle problem involves specifying the area function, the initial

density, velocity, and energy or pressure, and the boundary conditions. The area function

used for the simulations is

$$
A = 1.398 + 0.347 \tanh(0.8x - 4.0) \tag{7.2}
$$

where x is the dimension along the length of the nozzle. Figure 7.1 shows a picture of the nozzle used in the simulations, where the dashed box represents the section modeled.

Figure 7.1: Nozzle used in solving the pseudo-1D Euler equations. Dashed box indicates the diverging section of the nozzle that is used in the simulations. The subscript c indicates the chamber, t represents the nozzle throat, i the inflow (for the computational domain), e the exit (outflow), and the shock subscript indicates a possible shock location when supersonic inflow and subsonic outflow conditions exist. The analytic cross sectional area function A indicates the modeled section geometry.

This section has the area defined by Eqn. 7.2. The initial conditions are defined within this

section and the boundary conditions are applied at either end of the modeled section. In

this case the inflow conditions are applied at x = 0 and the outflow at x = 10.
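The area function of Eqn. 7.2 and the derivative needed by the source term can be written directly. The derivative below follows from d(tanh z)/dz = 1 − tanh² z; the function names are hypothetical.

```python
import math


def nozzle_area(x):
    """Cross-sectional area of Eqn. 7.2."""
    return 1.398 + 0.347 * math.tanh(0.8 * x - 4.0)


def nozzle_area_derivative(x):
    """dA/dx for Eqn. 7.2, using d(tanh z)/dz = 1 - tanh(z)**2."""
    t = math.tanh(0.8 * x - 4.0)
    return 0.347 * 0.8 * (1.0 - t * t)
```

The area increases monotonically from the inflow at x = 0 to the outflow at x = 10, with the steepest change at x = 5 where the argument of the hyperbolic tangent is zero.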

7.2 Boundary Condition Considerations

The pseudo-1D Euler equations can model various flow conditions in a nozzle, whether it be

all subsonic flow, all supersonic flow, or partially supersonic and partially subsonic. When

considering the different cases it is important to consider how the boundary conditions are

to be treated.

A PDE must be well posed to have a unique solution. To achieve a well posed problem the initial and boundary conditions must be properly specified. The pseudo-1D Euler equations are no exception, and actually require more boundary conditions than the strictly mathematical requirements for a well posed problem. An intuitive explanation for this peculiarity can be realized by studying the method of characteristics.

The eigenvalues of the flux Jacobian ($\partial f / \partial q$) are u, u + a, and u − a, where u is the bulk flow velocity, and a is the sound speed. This means that depending on the type of flow (subsonic or supersonic) the characteristics will change direction. For a supersonic flow at an inlet, all characteristics are positive and therefore flow into the domain and affect the solution. Conversely at an outlet all characteristics flow out of the domain, and do not affect the solution in the interior. For a subsonic case two of the characteristics are positive and the other is negative, which results in information propagating in both directions. This implies that at an inlet two characteristics affect the solution, and at an outlet one of the characteristics affects the solution in the interior domain.

What does this mean in terms of required boundary conditions? For every characteristic entering the domain, a corresponding fixed analytic condition is required on one variable at that boundary. A fixed analytic condition can be a Dirichlet boundary condition. Additionally for every characteristic leaving the domain, a numerical boundary condition is required. For the finite element case, the numerical condition could be either a Neumann or natural boundary condition. The purpose is to prevent reflections such that extraneous information does not collect in the domain.

Table 7.1 [8] summarizes the boundary conditions required for each flow condition at both the inlet and outlet. Notice for every characteristic entering the domain an analytic boundary condition is required, and for every characteristic leaving the domain, a numerical (Neumann or Natural) boundary condition is required.

These findings are only shown by empirical results, rather than strict mathematical proof. The following sections show results for various flow conditions employing the guidelines of Table 7.1 to pick the boundary conditions. Scenarios that deviate from these guidelines are also shown.

Table 7.1: Inflow and outflow boundary condition requirements for the Pseudo-1D Euler equations [8]. Characteristics are the eigenvalues for the Pseudo-1D Euler system of equations, where u is the bulk fluid velocity, and a is the sound speed in the fluid. The (+) indicates a right moving characteristic and the (−) indicates a left moving characteristic.

                            Inflow                          Outflow
                            Subsonic       Supersonic       Subsonic       Supersonic

Characteristics             u = (+)        u = (+)          u = (+)        u = (+)
                            u + a = (+)    u + a = (+)      u + a = (+)    u + a = (+)
                            u − a = (−)    u − a = (+)      u − a = (−)    u − a = (+)

Number of Analytic B.C.     2              3                1              0

Number of Numerical B.C.    1              0                2              3
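The counting rule behind Table 7.1 can be expressed as a short function: characteristics that enter the domain need analytic conditions, and those that leave need numerical ones. The sketch below assumes the inflow boundary is the left end of the domain and the outflow boundary the right end, with positive speeds moving to the right. The function name is hypothetical.

```python
def boundary_condition_counts(u, a, boundary):
    """Return (number of analytic B.C., number of numerical B.C.).

    u        : bulk flow velocity at the boundary
    a        : sound speed at the boundary
    boundary : "inflow" (left end) or "outflow" (right end)
    """
    speeds = (u, u + a, u - a)               # characteristic speeds
    if boundary == "inflow":
        entering = sum(1 for s in speeds if s > 0.0)   # right-moving enter
    elif boundary == "outflow":
        entering = sum(1 for s in speeds if s < 0.0)   # left-moving enter
    else:
        raise ValueError("boundary must be 'inflow' or 'outflow'")
    return entering, len(speeds) - entering
```

With a = 1, the values u = 0.5 (subsonic) and u = 2 (supersonic) reproduce the four columns of Table 7.1. The exactly sonic case u = a is counted here as leaving the domain, which is a choice of this sketch.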

7.3 Supersonic Inflow and Outflow in a Diverging Nozzle

A completely supersonic flow is studied. Supersonic conditions are initialized and maintained by specifying a high enough initial Mach number and specifying the boundary conditions recommended by Table 7.1. Boundary conditions that deviate from Table 7.1 are also explored to show how the system reacts when it is overspecified.

7.3.1 Correctly Specified Boundary Conditions

Table 7.1 recommends fixing three variables on the inflow and having natural boundary conditions on the outflow for supersonic flow. This means any three physical variables can be specified on the inflow in order for the problem to be well posed. One possibility is specifying the pressure, density, and momentum. Energy could also be specified instead of pressure, and velocity instead of momentum, and the system would remain correctly specified. The choice of variables to which boundary conditions are applied is problem dependent, but as long as the correct number are fixed the problem is well defined.

Figure 7.2 shows the completely supersonic solution after reaching a steady state. In this case the pressure, density, and momentum are specified to be fixed to their initial condition at the inflow. On the outflow natural boundary conditions are applied such that waves are not reflected back into the computational domain.

Supersonic Inflow and Outflow

Inflow                                  Outflow

$\rho_{in} = \rho_o$                    $\rho_{out}$ = Natural

$(\rho u)_{in} = (\rho u)_o$            $(\rho u)_{out}$ = Natural

$p_{in} = p_o$                          $e_{out}$ = Natural

7.3.2 Overspecified Boundary Conditions

If the problem is over specified, the system compensates by having a boundary/shock layer.

The system pushes for the correct physics, but when an extraneous boundary condition does

not allow for this, it comes as close as possible. In this case an extra Dirichlet boundary

condition is applied to the outflow pressure. The boundary conditions are satisfied, but the

boundary layer forms as a result. This is essentially applying subsonic boundary conditions

to a supersonic flow, and thus creating a discontinuity or shock at the boundary. Figure 7.3

shows this result. Notice that variables other than pressure also have this boundary layer.

Over Specified Supersonic Inflow and Outflow

Inflow Outflow



pin = po pout = po

7.4 Supersonic Inflow and Subsonic Outflow in a Diverging Nozzle

A case where the flow at the inlet is supersonic and subsonic at the exit can exist when the

pressure ratio between the inflow and outflow is small enough (i.e. the back pressure is high

enough). In this type of flow a shock forms within the nozzle. Due to the shock in the flow

some numerical dissipation is added to prevent instabilities and give some finite width to

the shock. The dissipation parameter ε controls the amount of dissipation, and $\varepsilon = 5 \cdot 10^{-2}$ was used to give the shock a finite width. This value was determined by first using a larger amount of dissipation, and then reducing the value until the problem has a small amount of dispersion. Reducing the dissipation further would give an overly dispersive solution. A larger amount of dissipation yields a more diffuse solution, and the shock spans more nodes. Chapter 5 describes the details of adding numerical dissipation to the solver. Figure 7.4 shows plots for pressure, density, momentum, and energy after reaching a steady state.

Figure 7.2: Supersonic inflow and outflow in a nozzle after reaching a steady state (t=20). Plots of pressure p, density ρ, velocity u, and energy e. The dashed line represents the initial condition, while the solid line represents the solution at t=20. Each of the variables is normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized in terms of L, the physical length of the domain.

Figure 7.3: Supersonic inflow and outflow after reaching a steady state (t=10) with overspecified boundary conditions. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t=10. A dissipation of $\varepsilon = 5 \cdot 10^{-2}$ was used to resolve the boundary layer/shock. Each of the variables is normalized to freestream values: $p = p'/(\rho_\infty a_\infty^2)$, $\rho = \rho'/\rho_\infty$, $u = u'/a_\infty$, $e = e'/(\rho_\infty a_\infty^2)$. Time is normalized in terms of L, the physical length of the domain.

Supersonic Inflow and Subsonic Outflow

Inflow Outflow



pin = po pout = po

7.4.1 Boundary Conditions

Referring to Table 7.1 it can be seen that three Dirichlet boundary conditions are required on

the inflow for supersonic flow, and one Dirichlet and two numerical conditions on the outflow

boundary. For this case it is convenient to hold the density, momentum, and pressure on

the inflow fixed and pressure on the outflow fixed. Natural boundary conditions are applied

to density and momentum on the outflow to satisfy the two numerical conditions.

To apply a pressure ratio, different values of pressure are held fixed on each boundary.

Due to this difference, a linear profile is given to the initial pressure to avoid discontinuities

at the boundary. These conditions yield a steady state shock in the domain at some location

depending on the magnitude of the pressure ratio.

7.4.2 Shock Location in Nozzle

The shock location in a pseudo-1D nozzle can be calculated analytically. This shock location

can then be compared to the numerical shock location predicted by the pseudo-1D Euler

equations.

Analytical Calculation

To find the shock location analytically, the nozzle must be treated not just as
the diverging section but as a whole converging-diverging nozzle system. The system
in mind is shown in Figure 7.1. The whole picture is needed because the theoretical chamber
pressure, pc, and throat area, At, are needed to find the shock location.

As a first step the throat area is needed. Since the flow cannot exceed Mach 1
at the throat, and a Mach number Mi = 1.25 is initialized at the domain inflow, there
must be some smaller cross section upstream where the flow is sonic.

\[
\left(\frac{A_i}{A_t}\right)^2 = \frac{1}{M_i^2}\left[\frac{2}{\gamma+1}\left(1+\frac{\gamma-1}{2}M_i^2\right)\right]^{(\gamma+1)/(\gamma-1)} \tag{7.3}
\]

The cross sectional area at this point in the flow is the throat area, which can be found by
solving Eqn. 7.3 for At [7], where Ai and Mi are the area and Mach number initialized at
the inflow boundary.

The next step is to find the exit Mach number assuming a non-isentropic flow. First the

chamber pressure, pc is needed. This pressure represents the stagnant gas feeding the flow

of the nozzle. See Figure 7.1. This pressure assumes there is no flow (or very close to no

flow) and is the pressure compared to the exit pressure when determining the location of

the shock. The chamber pressure is found using the isentropic relation

\[
p_c = p_i\left(1+\frac{\gamma-1}{2}M_i^2\right)^{\gamma/(\gamma-1)}. \tag{7.4}
\]

Isentropic flow is assumed prior to the inflow point and thus the chamber pressure can be
deduced from this relationship. Now the non-isentropic exit Mach number can be found from
the following equation, using the values obtained for At and pc:

\[
M_e = \sqrt{-\frac{1}{\gamma-1} + \sqrt{\left(\frac{1}{\gamma-1}\right)^2 + \frac{2}{\gamma-1}\left(\frac{2}{\gamma+1}\right)^{(\gamma+1)/(\gamma-1)}\left(\frac{p_c}{p_e}\,\frac{A_t}{A_e}\right)^2}}. \tag{7.5}
\]

Now the flow conditions at both the inflow and outflow are known and the next step is
to find the conditions at the location of the shock. First the stagnation pressure ratio
across the shock can be found by using the relation

\[
\frac{p_{o2}}{p_{o1}} = \frac{p_e}{p_c}\left(1+\frac{\gamma-1}{2}M_e^2\right)^{\gamma/(\gamma-1)}. \tag{7.6}
\]

With the stagnation pressure ratio across the shock known, the Mach number on the upstream
side of the shock can be found from

\[
\left[\frac{\gamma+1}{2\gamma M_{shock}^2-(\gamma-1)}\right]^{1/(\gamma-1)}
\left[\frac{\frac{\gamma+1}{2}M_{shock}^2}{1+\frac{\gamma-1}{2}M_{shock}^2}\right]^{\gamma/(\gamma-1)}
- \frac{p_{o2}}{p_{o1}} = 0. \tag{7.7}
\]

Eqn. 7.7 is non-linear in Mshock and can be solved with any non-linear root-finding method.

Once the Mach number upstream of the shock is known, the area at the shock can be found
using Eqn. 7.3, with Mi and Ai replaced by Mshock and Ashock, and the equation
solved algebraically for Ashock.

The final step is to find the physical location xshock by comparing the cross sectional

area at the shock (Ashock) to the area function (Eqn. 7.2). The location that corresponds to

the area at the shock is where the shock is predicted to reside. Results for several pressure

ratios, pe/pc are listed in Table 7.2. These are compared to results obtained numerically.
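The analytical steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for Table 7.2: the function names are hypothetical, γ = 1.4 is an assumed value, the normal-shock stagnation pressure relation is written in its standard textbook form, and bisection is just one possible choice of non-linear method. The sketch stops at Ashock because the area function (Eqn. 7.2) needed to convert Ashock into xshock is not reproduced here.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats; not stated in this section


def area_ratio(mach, gamma=GAMMA):
    """A / A_t for a given Mach number (area-Mach relation, Eqn. 7.3)."""
    base = 2.0 / (gamma + 1.0) * (1.0 + 0.5 * (gamma - 1.0) * mach ** 2)
    return math.sqrt(base ** ((gamma + 1.0) / (gamma - 1.0))) / mach


def stagnation_pressure(p, mach, gamma=GAMMA):
    """Isentropic stagnation pressure for static pressure p (Eqn. 7.4)."""
    return p * (1.0 + 0.5 * (gamma - 1.0) * mach ** 2) ** (gamma / (gamma - 1.0))


def exit_mach(pc_over_pe, at_over_ae, gamma=GAMMA):
    """Exit Mach number from mass conservation (Eqn. 7.5)."""
    g = 1.0 / (gamma - 1.0)
    k2 = (2.0 / (gamma + 1.0)) ** ((gamma + 1.0) / (gamma - 1.0))
    k2 *= (pc_over_pe * at_over_ae) ** 2
    return math.sqrt(-g + math.sqrt(g ** 2 + 2.0 * g * k2))


def shock_stagnation_ratio(mach, gamma=GAMMA):
    """p_o2 / p_o1 across a normal shock with upstream Mach number mach."""
    m2 = mach ** 2
    a = (0.5 * (gamma + 1.0) * m2 / (1.0 + 0.5 * (gamma - 1.0) * m2)) ** (gamma / (gamma - 1.0))
    b = ((gamma + 1.0) / (2.0 * gamma * m2 - (gamma - 1.0))) ** (1.0 / (gamma - 1.0))
    return a * b


def shock_mach(ratio, gamma=GAMMA, lo=1.0, hi=10.0, tol=1e-12):
    """Solve Eqn. 7.7 for M_shock by bisection on the bracket [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The ratio falls monotonically from 1 as the shock strengthens.
        if shock_stagnation_ratio(mid, gamma) > ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def shock_state(p_i, m_i, a_i, p_e, a_e, gamma=GAMMA):
    """Return (M_shock, A_shock) from inflow and outflow data."""
    a_t = a_i / area_ratio(m_i, gamma)                   # Eqn. 7.3
    p_c = stagnation_pressure(p_i, m_i, gamma)           # Eqn. 7.4
    m_e = exit_mach(p_c / p_e, a_t / a_e, gamma)         # Eqn. 7.5
    ratio = stagnation_pressure(p_e, m_e, gamma) / p_c   # Eqn. 7.6
    m_s = shock_mach(ratio, gamma)                       # Eqn. 7.7
    return m_s, a_t * area_ratio(m_s, gamma)             # Eqn. 7.3 again
```

Calling `shock_state(p_i, m_i, a_i, p_e, a_e)` with the inflow and outflow values of a given run returns the upstream shock Mach number and the cross sectional area at the shock, which would then be matched against the area function to obtain the location.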

Numerical Calculation

Several cases are run with differing inflow/outflow pressure ratios and compared to the
analytical shock location. Results are summarized in Table 7.2. Figure 7.4 shows plots of
pressure, density, velocity, and energy for these same cases. Notice the close agreement
between analytical and numerical results.

7.5 Subsonic Inflow and Outflow in a Diverging Nozzle

A completely subsonic flow is initialized in a diverging nozzle. This is initialized by speci-

fying a subsonic initial Mach number throughout the domain, and being careful not to set

a pressure ratio that will accelerate the flow into the supersonic regime.

Figure 7.4: Supersonic inflow and subsonic outflow after reaching a steady state (t = 50)
for pe = 1.20 (blue), 1.30 (red), 1.40 (green) and 1.50 (magenta). Panels show pressure,
density, velocity, and energy. Each of the variables is normalized to freestream values:
p = p′/(ρ∞ a∞^2), ρ = ρ′/ρ∞, u = u′/a∞, e = e′/(ρ∞ a∞^2). Time is normalized to a
characteristic time t = t′/τ, and the length of the domain to a characteristic length
x = x′/(a∞ τ). The characteristic time is defined as τ = L/a∞, where L is the physical
length of the domain. A dissipation factor of ε = 5 · 10^-2 was used to resolve the shocks
and control the dispersion.

Table 7.2: Numerical versus analytical shock location in nozzle for several inflow/outflow
pressure ratios.

pe/pc     x_numerical   x_analytical
0.6486    5.60          5.56
0.7026    5.13          5.15
0.7567    4.73          4.75
0.8107    4.33          4.34


Referring to Table 7.1, the boundary conditions required for subsonic inflow are two Dirichlet
and one natural, and for subsonic outflow one Dirichlet and two natural.
This case proves to be somewhat special, and these boundary conditions are not
followed completely. Neumann conditions are used instead of natural boundary conditions,
and only one variable on the inflow boundary is held fixed with a Dirichlet condition. It
is not well understood why this deviation from the prescribed boundary conditions works,
but with any combination of two fixed variables on the inflow the system never reaches a
steady state. The momentum is held fixed on the inflow, and the density is held fixed on
the outflow. All other variables have Neumann conditions.

Subsonic Inflow and Subsonic Outflow

Inflow                Outflow
∂x ρ_in = 0           ρ_out = ρ_o
(ρu)_in = (ρu)_o      ∂x (ρu)_out = 0
∂x p_in = 0           ∂x p_out = 0

Figure 7.5: Subsonic inflow and outflow conditions after reaching a steady state for
pressure, density, velocity, and energy after t = 300. Each of the variables is normalized to
freestream values: p = p′/(ρ∞ a∞^2), ρ = ρ′/ρ∞, u = u′/a∞, e = e′/(ρ∞ a∞^2). Time is
normalized to a characteristic time t = t′/τ, and the length of the domain to a characteristic
length x = x′/(a∞ τ). The characteristic time is defined as τ = L/a∞, where L is the physical
length of the domain.

7.6 Euler Shock Tube

The Euler shock tube is a simplification of the pseudo-1D Euler equations where the area
A is uniform and a discontinuity is initialized inside the pipe. Figure 7.6 shows the initial
condition as a dashed line and the result at t = 1.5 as a solid line. Notice on the density plot
that the shock wave, contact discontinuity, and rarefaction wave are all resolved. A dissipation
parameter of ε = 5 · 10^-3 was used to resolve the shocks. This is an order of magnitude less
dissipation than in the nozzle problems with shocks. Less dissipation is needed because the
number of time steps for this solution is significantly smaller, and the solution does not have
time to develop large dispersive errors. Additionally, the shock tube problem is fully
conservative (no source terms) and thus is easier to stabilize.


The boundary conditions are trivial for the Euler shock problem because the domain of

influence resides completely within the computational domain and is not determined by the

boundary. The only requirement is to prevent reflections at the boundaries and therefore

natural boundary conditions for all variables suffice. When the shock front reaches the

boundary, the problem is effectively over.

Figure 7.6: Euler shock tube result after t = 1.5. Plots of pressure, density, velocity, and
energy. The dashed line represents the initial condition, while the solid line represents the
solution at t = 1.5. Each of the variables is normalized to freestream values:
p = p′/(ρ∞ a∞^2), ρ = ρ′/ρ∞, u = u′/a∞, e = e′/(ρ∞ a∞^2). Time is normalized to a
characteristic time t = t′/τ, and the length of the domain to a characteristic length
x = x′/(a∞ τ). The characteristic time is defined as τ = L/a∞, where L is the physical
length of the domain. A dissipation factor of ε = 5 · 10^-3 was used to resolve the shocks
and control dispersion.

Chapter 8

ACCURACY, CONVERGENCE, AND TIMING STUDIES

The accuracy, convergence properties, and computational timing of the finite element

solver are investigated. This is done by looking into various parameters such as polynomial

degree, spatial resolution and size of time step. The pseudo-1D Euler equations were solved

to perform the investigations.

8.1 Varying Polynomial Order

A fundamental parameter in finite element methods is the highest polynomial order in the

basis functions. The convergence properties of the solver are investigated by solving a test

problem and varying the polynomial order. Other parameters are held fixed. Figure 8.1

shows a plot of normalized L2 norms versus spatial resolution for several polynomial orders

using a nodal basis set. The L2 norms are normalized to the total number of nodes in the

problem.

From Figure 8.1 it can be seen that for increasing spatial resolution, the total error

in the solution decreases. It can also be seen that for higher polynomial order the errors

decrease faster for a corresponding increase in spatial resolution. This means that for higher

polynomial order the convergence rate of the solution is faster. This is useful because when

the computational cost of increasing the polynomial order can be afforded, a high rate of

convergence can be expected.
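As a rough illustration of the quantities discussed here, the sketch below computes an L2 norm normalized by the number of nodes and estimates an observed convergence rate from two resolutions. It is a minimal sketch under stated assumptions: the helper names are hypothetical, a reference solution sampled at the same nodes is assumed to be available, and the rate estimate assumes the error behaves like a power of the number of elements.

```python
import math


def normalized_l2(numerical, reference):
    """L2 norm of the nodal error divided by the number of nodes."""
    if len(numerical) != len(reference):
        raise ValueError("solutions must be sampled at the same nodes")
    squared = sum((a - b) ** 2 for a, b in zip(numerical, reference))
    return math.sqrt(squared) / len(numerical)


def observed_order(err_coarse, err_fine, ne_coarse, ne_fine):
    """Slope k of error ~ Ne**(-k) estimated from two resolutions."""
    return math.log(err_coarse / err_fine) / math.log(ne_fine / ne_coarse)
```

For example, an error that drops by a factor of four when the number of elements doubles corresponds to an observed order of two; a steeper slope on a plot such as Figure 8.1 corresponds to a larger value of this estimate.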

Figure 8.1: Nozzle convergence for varying polynomial order. The L2 norms normalized by
the number of degrees of freedom Np in the system versus the number of elements in the
system Ne are compared. Polynomial degrees of 2, 3, 4, 5, 6, 7, and 8 are shown.

8.2 Varying Timestep

Figure 8.2 shows a plot of the normalized L2 norms versus the spatial resolution. The
L2 norms are normalized to the number of nodes in the domain. Several different time
steps were compared to show that with smaller time steps the magnitude of error decreases.
This is clear from the figure, where the smallest time step, ∆t = 0.25, has an L2 norm
of approximately order 10^-7.

Table 8.1: Average Newton and average linear solver (GMRES) iteration counts for varying
time steps after an equal number of time steps (100). Average Newton iterations are per
time step, and the linear iterations are also averaged per time step.

∆t       Avg. Linear Iterations   Avg. Newton Iterations
10^-1    4.38                     2.66
10^-2    2.50                     2.01
10^-3    1.50                     2.00
10^-4    1.00                     2.00
10^-5    1.00                     2.00
10^-6    1.00                     1.00

Table 8.1 has results for the average number of Newton iterations per time step and
the average number of linear solver (GMRES) iterations for the same time steps. (Note:
The linear solver is used within the nonlinear solver, and since GMRES is an
iterative method, it also has an iteration count. If a direct solver were used
instead, it would take only one iteration per Newton iteration.) The averages
are found for several different time step sizes. It can be seen that as the time step decreases
the Newton iteration count decreases. Eventually the iteration count reaches one and the
problem has essentially become linear. This means that the Newton convergence criterion
is met on the first iteration, and therefore no dominant non-linear effects are present
on the small timescales.
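The trend of fewer Newton iterations for smaller time steps can be reproduced on a toy problem. The sketch below is not the solver described in this work: it applies backward Euler to the scalar model equation du/dt = -u^3, which is an assumption chosen only because it is non-linear and small enough to follow by hand. The function name and the tolerance are likewise hypothetical.

```python
def newton_iterations(u_old, dt, tol=1e-10, max_iter=50):
    """Count Newton iterations for one backward Euler step of du/dt = -u**3.

    The implicit step solves g(u) = u - u_old + dt * u**3 = 0, starting
    from the previous time level as the initial guess.
    """
    u = u_old
    for iteration in range(1, max_iter + 1):
        residual = u - u_old + dt * u ** 3
        jacobian = 1.0 + 3.0 * dt * u ** 2
        u -= residual / jacobian
        # Convergence test on the non-linear residual after the update.
        if abs(u - u_old + dt * u ** 3) < tol:
            return iteration, u
    raise RuntimeError("Newton iteration did not converge")


small_step_count, _ = newton_iterations(1.0, 1e-6)
large_step_count, _ = newton_iterations(1.0, 1.0)
```

With the very small step the first Newton update already satisfies the tolerance, so the step behaves like a linear solve, while the large step needs several updates. This mirrors the behaviour reported in Table 8.1 only qualitatively.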

One can deduce that with a smaller number of Newton iterations, the corresponding error

in the solution also decreases. This is intuitive since each Newton iteration has some error

tolerance associated with it, and for every subsequent iteration the total error compounds.

It is a good idea to minimize the Newton iteration count to ensure good convergence and

accuracy of the solution.

Figure 8.2: Nozzle convergence for varying time step sizes ∆t. The L2 norms normalized
by the number of degrees of freedom Np in the system versus the number of elements in
the system Ne are compared. Several different time step sizes are shown: ∆t = 0.25, 0.333,
0.50, 0.667, 1.0, 1.333, 1.667, and 2.0.

8.3 Errors with Large Timesteps

With an implicit time advance scheme the explicit time step limit can be exceeded without

the risk of developing numerical instabilities. There can however be a decrease in accuracy

due to excessively large implicit time steps. Figure 8.3 shows a plot of relative deviation

from the steady state solution for supersonic inflow and outflow conditions. Notice that the
relative deviation from the steady state solution grows roughly linearly with the time step.
The figure has several different spatial resolutions overlaid to show that the deviation
depends on time step size rather than on spatial effects. For small time steps the spatial
errors dominate and the curves for differing resolutions do not overlap, but for larger time
steps all the different spatial resolutions overlap.

Figure 8.4 is an example of the result obtained when the time step is too large. An

oscillation forms, and yields an inaccurate solution. The peak of the oscillation was used

when calculating the difference from steady state for Figure 8.3.

8.4 Computational Timing

The time required to perform computations with the solver depends on several parameters.
For instance the domain size directly influences the amount of computational time required
to solve the problem, since it sets the matrix size. The more nodes in the problem,
the larger the matrix size, and therefore the more time it takes to perform the calculation.

Other less obvious parameters are the polynomial order of the elements, the linear solver

type, and the size of the time steps. These parameters are investigated to show how the

computational cost changes.

8.4.1 Varying Polynomial Order and Resolution

Increasing the number of degrees of freedom in the problem increases the matrix size to be

inverted, and therefore increases the amount of computational effort to solve the problem.

Both decreasing the element size in the domain and increasing the polynomial order have

this effect.

Figure 8.3: Relative deviation from a steady state solution due to increase in time step
size, plotted against ∆t. The deviation measured is an infinity norm of the difference between
the steady state solution and peak error due to oscillations. Figure 8.4 shows an example
oscillatory error that results from large time steps. Five different spatial resolutions are
compared, Ne = 30, 40, 50, 100, and 200.

Figure 8.4: Velocity deviation from supersonic inflow and outflow steady state solution due
to large time steps. The initial condition is the flat dashed line, the curved dashed line is
the steady state solution, and the solid line is the erroneous solution due to large time steps.

Tables 8.2 and 8.3 show results for two cases of increasing resolution. Table 8.2 has
results for increasing the total number of elements (smaller element size) while holding
the polynomial order fixed. Table 8.3 shows results for increasing polynomial degree while
holding the element size fixed. For both tables the average linear iterations per Newton
iteration, the average Newton iterations per time step, and the CPU time are shown. Notice
that the linear iterations and CPU time increase, but the Newton iterations generally remain
constant.

For the case of increasing the polynomial order, the matrix to be inverted becomes
less sparse due to the coupling between basis functions. This requires more computational
operations and thus takes longer to solve. Figure 8.5 shows the
matrix structure for two systems with equal number of degrees of freedom and differing
polynomial order. Notice the higher polynomial order is less sparse and less banded. This
is the price paid for the increased resolution of the higher order polynomials. Notice also in
Tables 8.2 and 8.3 for the case of Np = 301 (number of degrees of freedom) that the CPU
times for 4th and 7th order elements are 14.63 and 21.31, respectively. This difference arises
because the higher order elements require more computational effort.
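The loss of sparsity can be quantified with a simple count. The sketch below is an illustration under stated assumptions: a one-dimensional mesh, a single scalar unknown, neighbouring elements that share one node, and a quoted "polynomial order" equal to the number of nodes per element. With these assumptions the counts match the values Np = 31, nz = 151, and nz = 241 reported with Figure 8.5. The function names are hypothetical.

```python
def degrees_of_freedom(n_elements, nodes_per_element):
    """Global nodes of a 1D mesh whose neighbouring elements share one node."""
    return n_elements * (nodes_per_element - 1) + 1


def nonzeros(n_elements, nodes_per_element):
    """Non-zero matrix entries: one dense block per element, with the
    diagonal entry of each shared node counted only once."""
    return n_elements * nodes_per_element ** 2 - (n_elements - 1)


# The two systems compared in Figure 8.5.
low_order = (degrees_of_freedom(10, 4), nonzeros(10, 4))
high_order = (degrees_of_freedom(5, 7), nonzeros(5, 7))
```

Both systems have the same number of unknowns, but the higher order system has roughly 60% more non-zero entries, which is consistent with the larger CPU time at equal Np.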

Figure 8.5: Matrix structure for a 10 element system with 4th order polynomials (left,
nz = 151 non-zero entries) and 5 element system with 7th order polynomials (right,
nz = 241 non-zero entries). Both have 31 degrees of freedom.

Table 8.2: CPU timing and average Newton and linear (GMRES) iteration counts for
varying spatial resolution with polynomial order 4. Ne is the number of elements, and Np
is the total number of degrees of freedom. The average Newton iterations are per time step,
and the average linear iterations are also per time step. The CPU time is measured using
the intrinsic Fortran routine CPU_TIME.

Ne     Np     Avg. Newton   Avg. Linear   CPU time (s)
50     151    3.30          4.12          8.56
75     226    3.30          4.19          11.73
100    301    3.30          4.27          14.63
150    451    3.30          4.49          21.32
200    601    3.30          4.66          28.41
250    751    3.30          4.74          35.60
300    901    3.30          4.78          43.96
350    1051   3.30          4.82          53.10
400    1201   3.30          4.82          63.47
450    1351   3.30          4.83          74.24
500    1501   3.30          4.84          85.19

Table 8.3: CPU timing and average Newton and linear (GMRES) iteration counts for
varying polynomial degree. Ne is the number of elements, and Np is the total number
of degrees of freedom. The average Newton iterations are per time step, and the average
linear iterations are also per time step. The CPU time is measured using the intrinsic
Fortran routine CPU_TIME.

Poly   Np    Avg. Newton   Avg. Linear   CPU time (s)
2      51    3.41          3.00          3.99
3      101   3.30          3.72          5.93
4      151   3.30          4.12          8.63
5      201   3.30          4.18          11.95
6      251   3.30          4.34          16.14
7      301   3.30          4.48          21.31
8      351   3.30          4.67          27.51
9      401   3.30          4.75          35.71
10     451   3.31          4.82          43.92
11     501   3.32          4.83          56.30
12     551   3.32          4.87          71.26

8.4.2 Varying Timestep Size

The time step size plays an important role not only in the accuracy and convergence of
the solution but also in the computational effort required. Generally speaking there is a
trade-off between taking large implicit time steps and the amount of time it takes to
advance the solution. Table 8.1 shows that as the time steps get smaller and smaller the
Newton iterations approach 1. This means the problem has essentially become “linear” and
is acting like an explicit time step. These time steps do not take as much computational
effort because both the linear and Newton iteration counts are small. As the time steps
increase the iteration counts also increase, and therefore more computational effort is required.

Table 8.4 shows average Newton and linear iteration counts as well as CPU time and
average CPU time per time step for various time steps all finishing at tfinal = 10. This
means that for larger time steps there are fewer total time steps, Nt, in the solution. Notice the
iterations increase with an increasing time step as expected. Associated with the iteration
count increase, the average CPU time per time step also has an increasing trend. This
increase is sufficiently small that, as the time steps get larger, the total CPU time decreases.
This means that the implicit time step gives a net savings in CPU time because it can
compute to the same tfinal in a fraction of the time compared to a smaller time step.
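The trade-off can be checked with simple arithmetic on a few rows of Table 8.4. The sketch below only re-derives quantities already listed in that table (the number of steps and the average cost per step); the helper names are hypothetical and the rows are copied from the table.

```python
T_FINAL = 10.0

# (time step, total CPU time, average CPU time per time step) from Table 8.4
ROWS = [
    (0.01, 113.63, 0.113),
    (0.10, 16.21, 0.162),
    (1.00, 2.88, 0.288),
    (2.00, 1.94, 0.388),
]


def summarize(rows, t_final=T_FINAL):
    """Return (dt, number of steps, CPU time per step) for each row."""
    summary = []
    for dt, total_cpu, _listed_average in rows:
        n_steps = round(t_final / dt)
        summary.append((dt, n_steps, total_cpu / n_steps))
    return summary


def total_speedup(rows):
    """Ratio of total CPU time between the smallest and largest time step."""
    return rows[0][1] / rows[-1][1]
```

Between ∆t = 0.01 and ∆t = 2.00 the number of steps drops by a factor of 200 while the cost per step rises by a factor of about 3.4, so the total CPU time falls by a factor of roughly 59.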

8.4.3 Varying Linear Solver Type

Choosing a linear solver is mostly a problem dependent choice, and usually there is a
solver (or a few solvers) that can solve the problem more efficiently than other solvers.
Table 8.5 lists various solver types, their descriptions, and some of the key parameters
used. All of these solvers are included in the PETSc KSP libraries with the exception of
SuperLU, which is interfaced into the PETSc framework. The goal is to show the many
available solvers included in the PETSc libraries and show that each one can sufficiently
solve the problem. The KSP defaults are: relative tolerance, RTOL = 10^-5, absolute
tolerance, ABSTOL = 10^-50, a maximum iteration count of 10^4, zero initial guess, and
left preconditioning.

Table 8.4: CPU timing, average CPU time per time step, and average Newton and linear
(GMRES) iteration counts for varying time step with parameters: poly = 6, Θ = 0.50,
Ne = 50, tfinal = 10.0, ε = 1 · 10^-2. ∆t is the time step size, and Nt is the total number
of time steps. The average Newton iterations are per time step, and the average linear
iterations are also per time step. The CPU time is measured using the intrinsic Fortran
routine CPU_TIME.

∆t     Nt     Avg. Newton   Avg. Linear   Total CPU time (s)   Avg. CPU time per time step (s)
0.01   1000   2.33          2.45          113.63               0.113
0.02   500    2.41          2.68          58.30                0.117
0.03   333    2.56          3.00          41.43                0.124
0.04   250    2.65          3.45          32.63                0.131
0.05   200    2.71          3.51          27.12                0.136
0.06   167    2.86          3.73          23.44                0.141
0.07   143    2.94          4.02          20.69                0.146
0.08   125    2.99          4.17          18.42                0.147
0.09   111    3.21          4.23          17.46                0.157
0.10   100    3.30          4.34          16.21                0.162
0.20   50     3.42          5.25          8.78                 0.176
0.25   40     3.58          5.52          7.41                 0.185
0.50   20     4.10          5.91          4.40                 0.220
1.00   10     5.20          5.98          2.88                 0.288
2.00   5      6.40          5.94          1.94                 0.388

Table 8.5: Various linear solver types included with the PETSc libraries, and the SuperLU
direct solver, with their descriptions and parameters used for the runs in Table 8.6.

Solver       Method Description                       Solver Parameters
SuperLU      SuperLU Sparse Direct Solver             Zero Pivot Tol = 10^-12
GMRES        Generalized Minimum Residual             Converg. Tol. = 10^-30
FGMRES       Flexible Generalized Minimal Residual    Converg. Tol. = 10^-30
CG           Conjugate Gradient                       KSP Defaults
CGS          Conjugate Gradient Squared               KSP Defaults
BICG         Biconjugate Gradient                     KSP Defaults
BiCGStab     Stabilized BiConjugate Gradient Squared  KSP Defaults
BCGSL        Enhanced BiCGStab                        L = 2, ∆ = 0
MINRES       Minimum Residual                         Converg. Tol. = 10^-18
TFQMR        Transpose Free Quasi Minimal Residual    KSP Defaults
CHEBYCHEV    Chebychev Iterative                      emin = 10^-2, emax = 10^+2
RICHARDSON   Richardson Iterative                     Damping Factor = 1.0

Table 8.6 shows the total CPU time used and the average linear iterations taken during
each Newton iteration to solve the same exact problem. Notice that SuperLU is a direct
solver and therefore by definition has only one “iteration” per call. The iterative
solver methods have varying timing results, which is expected because the methods have
not been fully optimized. The GMRES, FGMRES, CGS, BICG, BiCGStab, BCGSL, TFQMR,
and RICHARDSON methods proved to be approximately comparable with respect to the
CPU timing for this specific problem. For other more complicated problems, these methods
would probably diverge in their success.

Table 8.6: CPU timing and iteration results for different iterative linear solver methods
described in Table 8.5.

Solver       CPU time (s)   Avg. Linear
SuperLU      37.87          1.00
GMRES        8.66           4.12
FGMRES       8.68           3.99
CG           25.15          17.82
CGS          8.72           2.56
BICG         8.93           4.38
BiCGStab     8.66           2.30
BCGSL        8.74           4.00
MINRES       24.09          31.50
TFQMR        8.58           2.73
CHEBYCHEV    58.81          744.71
RICHARDSON   8.50           5.49

Chapter 9

FUTURE DEVELOPMENTS AND PLANS

9.1 Incorporate Quadrilateral/Hexahedral Structured Grid Generator

When expanding a code to higher dimensions the generation of the computational grid is

an important part of the process. A structured hexahedral grid (quadrilateral in 2D) is

desired to simplify the matrix structures in the solver. There are some cases where a fully

structured grid is either impossible, or introduces undesirable distortions in the grid. A

circle is a good example because distortions occur when mapping a logical rectangle to the

circle. In the logical corners, the quadrilateral is deformed such that two adjacent sides are

parallel rather than perpendicular. This would create problems in the solver.

One way to reconcile the difficulty in creating a structured mesh for a circle is to use

a semi-structured technique. This is done by partitioning the circle into pieces that can

easily be meshed structurally. Figure 9.1(a) shows a circular domain partitioned and Figure

9.1(b) shows a resulting quadrilateral mesh on this geometry. By meshing in this fashion, the

issue with poorly shaped quadrilaterals is minimized, but another problem emerges. Each

of the partitions might have a structured mesh, but the interfaces between the partitions

might not have a structured mesh pattern. Notice in Figure 9.1(b) that at the corners of the
square partition the grid pattern at the interfaces is unstructured. This small amount of
unstructured gridding is a compromise that avoids the poorly shaped grid cells that result
on a completely structured circular grid. Consequently when the code reads in the grid
structure, it must know how to handle the interfaces between the structured partitions.

Figure 9.1: A circle geometry showing the partitions (a) and after a structured quadrilateral
mesh on each piece (b).

Analogous to the two dimensional case is a collection of structured blocks in three
dimensions. For example the circle shown in Figure 9.1(a) can be extruded to a cylinder.
This cylinder is partitioned only in the cross section. Figure 9.2(a) shows the resulting 3D
geometry partitioned into five pieces. Figure 9.2(b) shows the resulting hexahedral mesh on
this geometry, where the lengthwise dimension of the cylinder is meshed uniformly. Figures
9.4(a) and 9.4(b) show a HIT-like geometry partitioned and meshed with hexahedrons,
respectively. This shows that non-simply connected geometries are possible with this type
of meshing.

Having a domain meshed as a collection of structured meshes, which are mapped together
in an unstructured fashion, is advantageous over an unstructured mesh. This is because a
structured mesh provides a far simpler data structure and therefore a simpler matrix
sparsity pattern. As a result the solver will be faster, at the expense of the slightly more
complicated coding required to handle the mesh partition interfaces.

Figure 9.2: A cylinder geometry showing the partitions (a) and after a structured hexahedral
mesh on each piece (b).

Figure 9.3: A cylinder geometry with cutaway showing the partitions (a) and after a
structured hexahedral mesh on each piece (b).

Figure 9.4: A HIT-like geometry showing the partitions (a) and after a structured hexahedral
mesh on each partition (b).

9.2 Extend Algorithm to Three Dimensions

A three dimensional code that can implicitly solve complicated equation sets in the
flux-source form on complicated domains is of great interest. The motivation behind developing
the one-dimensional finite element solver is to gain the experience needed to develop a
three dimensional solver more easily. Rather than continue to develop the one dimensional
code, an existing two dimensional code called SEL [14] will be expanded to three dimensions.
The existing two dimensional code is a spectral/finite element code that uses the
flux-source equation formulation. SEL has successfully solved the extended MHD equations,
among several other equation sets, in two dimensions. This provides the framework
needed to test and expand to a three dimensional solver.

The semi-structured grid generation will fit well into the SEL framework because each

structured partition will be a logical rectangle. SEL already solves problems on a logically

rectangular domain, and therefore it will have to be expanded to handle multiple logical

rectangles. Once the solver can handle multiple adjacent logical rectangles, the next step

will be to expand to logical cubes (or rectangular parallelepipeds). Once this is achieved,

handling multiple logical cubes is the next natural progression. By having this capability,

an equation set can be solved on complicated three dimensional domains.

Chapter 10

CONCLUSIONS

10.1 Flux-Source Form

The flux-source equation form is a simple way to represent many equation sets. The primary
variables are listed in a vector $\vec{q}$, and their associated fluxes and sources are listed
in the vectors $\vec{f}$ and $\vec{s}$. This is convenient for computation because it allows a
formulation's data structure and solver to be general and separate from the equation and
problem specification. To change the problem, the equations, initial conditions, boundary
conditions, and Jacobian are specified in a module separate from the main code. This makes
the physics specification simple and avoids rewriting the main solver routines.
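The separation between solver and physics can be illustrated with a small sketch. It is not the module structure of the actual code: the container `FluxSourceProblem`, the function names, the value γ = 1.4, and the ideal gas pressure relation are all assumptions made for the example. The physics shown is the one-dimensional Euler shock tube, which has fluxes but no source terms.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]
GAMMA = 1.4  # assumed ratio of specific heats


@dataclass(frozen=True)
class FluxSourceProblem:
    """Hypothetical container holding only the physics: f(q) and s(q)."""
    flux: Callable[[Vector], Vector]
    source: Callable[[Vector], Vector]


def euler_flux(q: Vector) -> Vector:
    """Flux vector for q = [density, momentum, energy]."""
    rho, momentum, energy = q
    u = momentum / rho
    # Ideal gas closure (assumption): p = (gamma - 1) * (e - rho * u**2 / 2)
    p = (GAMMA - 1.0) * (energy - 0.5 * rho * u * u)
    return [momentum, momentum * u + p, (energy + p) * u]


def zero_source(q: Vector) -> Vector:
    """The shock tube is fully conservative, so every source entry is zero."""
    return [0.0 for _ in q]


shock_tube = FluxSourceProblem(flux=euler_flux, source=zero_source)
```

A generic solver would only ever call `shock_tube.flux` and `shock_tube.source`, so switching to another equation set means supplying a different pair of functions rather than editing solver routines.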

The boundary conditions are also formulated so that they fit into the flux-source framework. A Dirichlet boundary condition can be imposed by entering the equation into the source term and zeroing the $\partial\vec{q}/\partial t$ and $\partial\vec{f}/\partial x$ terms. For example, if one of the primary variables, $q$, is held to a constant $\beta_D$, the equation becomes

\[ \vec{s} = q - \beta_D = 0. \tag{10.1} \]

Another example is the Neumann boundary condition, where a derivative of a primary variable is held to some constant, $\beta_N$. This can be entered through the flux term, with the $\partial\vec{q}/\partial t$ and $\vec{s}$ terms zeroed:

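\[ \underbrace{\frac{\partial \vec{q}}{\partial t}}_{=\,0} + \frac{\partial \vec{f}}{\partial x} = \underbrace{\vec{s}}_{=\,0} \tag{10.2} \]

\[ \frac{\partial \vec{f}}{\partial x} \rightarrow \left( \frac{\partial \vec{q}}{\partial x} - \beta_N \right) = 0 \tag{10.3} \]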

These types of boundary conditions fit well into the flux-source form, which makes the specification of their equations fairly simple.


10.2 Nodal versus Modal Basis Functions

When comparing systems of the same size with basis functions of the same polynomial order, nodal and modal systems have the same number of degrees of freedom. The difference is that in a modal scheme nodes are not used to define the basis functions. Points are, however, defined as quadrature points during the numerical integration. The number of quadrature points should be equal to or greater than the number of basis functions, or integration errors could result. For nodal schemes, the nodes are part of the problem definition and are kept throughout the solution process. A modal scheme can have a polynomial of any order defined on an element without having to introduce more nodes. Modal elements need quadrature points to integrate the functions accurately, but the quantities computed at the quadrature points are only needed temporarily.
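The requirement on the number of quadrature points can be made concrete under two assumptions that are not spelled out in this section: Gauss-Legendre quadrature, and elements whose mapping to the reference interval is linear. With $n_b$ basis functions of polynomial degree $p = n_b - 1$ per element and $N_q$ quadrature points (symbols introduced here for illustration):

```latex
\begin{align*}
M_{ij} &= \int_{-1}^{1} \phi_i(\xi)\,\phi_j(\xi)\,d\xi
  && \text{integrand of degree } 2p, \\
\sum_{k=1}^{N_q} w_k\, g(\xi_k) &= \int_{-1}^{1} g(\xi)\,d\xi
  && \text{exact for } \deg g \le 2N_q - 1, \\
2N_q - 1 \ge 2p \;&\Longrightarrow\; N_q \ge p + 1 = n_b .
\end{align*}
```

Nonlinear fluxes raise the degree of the integrand further, which is one reason to use more points than this minimum; the listing in Appendix B, for instance, sets Ni = 2*npe+1.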

When implementing nodal schemes, the primary variables are defined at each node. The basis function amplitudes are equal to these nodal values, since each basis function is one at its corresponding node and zero at every other node. These values are evolved in the solution, and no extra steps are needed to find the values of the primary variables at each node. Conversely, in a modal basis the amplitudes are calculated from the primary variable values as described in section 2.5. Solving for these amplitudes is required for the initial condition and also every time the flux and source are evaluated.

Similar to the primary variables, the Jacobian needs to be expressed in terms of the basis function amplitudes. This requires a solve similar to that for the basis function amplitudes, with added complexity because the Jacobian is a matrix. Careful treatment is required to calculate the correct Jacobian. Again, when using a Lagrange nodal basis set there is no linear system to solve, since the matrix is the identity matrix. This makes creating the Jacobians simple because they can be treated in terms of the primary variables rather than basis function amplitudes.
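The difference between the two bases can be summarized in a few lines. The notation ($a_i$ for amplitudes, $\phi_i$ for basis functions, $x_j$ for sample points, $B$ for the matrix of basis values) is introduced here for illustration, and taking as many sample points as basis functions is an assumption; section 2.5 gives the procedure actually used.

```latex
\begin{align*}
q_h(x) &= \sum_i a_i\,\phi_i(x), \qquad
q_h(x_j) = \sum_i B_{ji}\,a_i, \quad B_{ji} = \phi_i(x_j), \\
\text{nodal (Lagrange):}\quad & \phi_i(x_j) = \delta_{ij}
  \;\Rightarrow\; B = I, \quad a_j = q_h(x_j), \\
\text{modal:}\quad & B \neq I
  \;\Rightarrow\; \vec{a} = B^{-1}\,\vec{q}_h , \\
\frac{\partial f\big(q_h(x)\big)}{\partial a_i}
  &= \frac{\partial f}{\partial q}\bigg|_{q_h(x)} \phi_i(x).
\end{align*}
```

The last line is the chain rule that carries a Jacobian written in the primary variables over to the amplitudes; at the nodes of a Lagrange basis the factor $\phi_i(x_j)$ reduces to $\delta_{ij}$.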

10.3 PETSc Data Structures

The PETSc libraries include easy-to-use parallel data structures for a variety of object types. These include sparse and dense matrices and vectors, along with a variety of nonlinear and linear solvers. The libraries also include a large collection of vector and matrix operators to use on these objects. This is convenient because the data structures only need to be created once, and then they can be manipulated and used in the nonlinear and linear solvers. For instance, the time-evolved primary variables, fluxes, and sources are all stored in PETSc vectors, and the mass and Jacobian matrices are stored in PETSc matrices. The sizes of the objects and their parallel partitioning are decided at allocation, and then these objects are filled and used similarly to conventional programming data types.

PETSc is not limited to its own solvers and tools; it also provides interfaces to other software libraries. For instance, the SuperLU sparse direct solver can be used with PETSc matrices and vectors. This is useful because the data structures do not have to be converted to another format in order to use other popular and powerful solvers.

Another useful feature of the PETSc libraries is the ability to perform memory and CPU timing diagnostics on their routines. Given the relatively large computational overhead required to set up and use the PETSc libraries, this is a useful tool. It can analyze the code during execution and find which routines use the most memory and which use the most computational effort. With this information readily accessible, it is much easier to streamline operations, find alternatives, and debug the code. This feature is invoked with the -log_summary command line option. Another tool compares a specified analytic Jacobian to a numerical Jacobian. This is useful when checking the accuracy of a Jacobian that is a nontrivial analytic expression.
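A dry-run sketch of how the two diagnostics might be requested from the command line. The -log_summary option is the one named above, and the -snes_type test and -snes_test_display options are the ones that appear (commented out) in the sample run script of Appendix A; the grid options and variable names are placeholders chosen for illustration. The script only prints the command lines so that it can be inspected without the executable.

```shell
#!/bin/sh
# Placeholder base command; a real run would pass the full option list.
base_cmd="./fsFEM -Ne 50 -npe 4"

# Profiling run: PETSc prints a timing/memory summary at exit.
profile_cmd="$base_cmd -log_summary"

# Jacobian check: compare the analytic Jacobian against a numerical one.
jacobian_cmd="$base_cmd -snes_type test -snes_test_display"

echo "$profile_cmd"
echo "$jacobian_cmd"
```

Replacing each echo with eval (or simply typing the printed line) would launch the corresponding run.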

10.4 Implicit Time Advance

The implicit solver formulated in the code is efficient enough to outperform an equivalent explicit solver when the time step is sufficiently large. At smaller time steps the implicit solver is not justified because of the added cost and complexity of the algorithm. The computational effort required to reach the same simulation time is reduced by taking larger time steps in the implicit solver. Each time step takes longer to compute, but the overall gain is positive since fewer time steps are required. This result is shown in Table 8.4.

There are accuracy and complexity considerations when using an implicit solver. If the time step gets too large, the accuracy of the solver diminishes. Although the time step is not restricted by a stability limit as it is in explicit solvers, the solution becomes increasingly less accurate as the time step grows. Another consideration is the complexity of specifying the Jacobian for the implicit solver. To achieve acceptable convergence levels the Jacobian must be accurate. This includes the boundary equations, which can become complicated when dealing with non-primary variables. The modular nature of the code makes this process easier and helps to minimize errors.
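The trade-off between stability and accuracy can be illustrated on the scalar test equation $dq/dt = \lambda q$. This is a sketch using the textbook $\Theta$ scheme, which may differ in detail from the discretized form used in the solver; $G$ and $z$ are notation introduced here.

```latex
\begin{align*}
\frac{q^{n+1} - q^{n}}{\Delta t}
  &= \Theta\,\lambda\,q^{n+1} + (1-\Theta)\,\lambda\,q^{n}, \\
G \equiv \frac{q^{n+1}}{q^{n}}
  &= \frac{1 + (1-\Theta)\,z}{1 - \Theta\,z}, \qquad z = \lambda\,\Delta t, \\
G - e^{z} &= \left(\Theta - \tfrac{1}{2}\right) z^{2} + O(z^{3}).
\end{align*}
```

For real $\lambda < 0$, $|G| \le 1$ for every $\Delta t$ when $\Theta \ge 1/2$, whereas the explicit case $\Theta = 0$ requires $\Delta t \le 2/|\lambda|$. The last line shows that the error committed per step still grows with $z$, so a stable step is not necessarily an accurate one; the leading error term vanishes only for $\Theta = 1/2$.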


BIBLIOGRAPHY

[1] Pozrikidis, C. Introduction to Finite and Spectral Element Methods Using MATLAB. Boca Raton: Chapman and Hall/CRC, 2005.

[2] Anderson, John David. Computational Fluid Dynamics: The Basics with Applications. McGraw-Hill series in mechanical engineering. New York: McGraw-Hill, 1994.

[3] Archer, Branden and Weisstein, Eric W. “Lagrange Interpolating Polynomial.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LagrangeInterpolatingPolynomial.html

[4] Satish Balay and Kris Buschelman and William D. Gropp and Dinesh Kaushik and Matthew G. Knepley and Lois Curfman McInnes and Barry F. Smith and Hong Zhang. PETSc Web page, http://www.mcs.anl.gov/petsc, 2007.

[5] Satish Balay and Kris Buschelman and Victor Eijkhout and William D. Gropp and Dinesh Kaushik and Matthew G. Knepley and Lois Curfman McInnes and Barry F. Smith and Hong Zhang. PETSc Users Manual, ANL-95/11 - Revision 2.3.3, Argonne National Laboratory, 2007.

[6] Xiaoye S. Li. An overview of SuperLU: Algorithms, implementation, and user interface, ACM Trans. Math. Softw., Vol. 31, Num. 3, pp. 302–325, ACM, New York, NY, 2005.

[7] Liepmann, H. W., and A. Roshko. Elements of Gasdynamics. Galcit aeronautical series. New York: Wiley, 1957.

[8] Hoffmann, Klaus A., and Steve T. Chiang. Computational Fluid Dynamics for Engineers. Wichita, Kan: Engineering Education System, 1993.

[9] Kwon, Young W., and Hyochoong Bang. The Finite Element Method Using MATLAB. CRC mechanical engineering series. Boca Raton, FL: CRC Press, 2000.

[10] Wendt, John F., and John David Anderson. Computational Fluid Dynamics: An Introduction. Berlin: Springer-Verlag, 1992.

[11] Abramowitz, Milton, and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Washington: U.S. Govt. Print. Off, 1964.

[12] Karniadakis, George, and Spencer J. Sherwin. Spectral/Hp Element Methods for Computational Fluid Dynamics. Numerical mathematics and scientific computation. New York: Oxford University Press, 2005.

[13] Eberhardt, Scott. AA543 Computational Fluid Dynamics I - Course Notes. University of Washington, 1987.

[14] Lukin, V.S. Ph.D. Dissertation, Princeton University (2007).


Appendix A

1D FINITE ELEMENT EQUATION SOLVER MANUAL

A.1 Introduction

The solver (fsFEM) is written in FORTRAN 90/95 and makes use of the PETSc libraries for the matrices, vectors, and nonlinear and linear iterative solvers, as well as the SuperLU direct solver. The code is designed to solve a PDE in one dimension posed in the flux-source form, using a higher-order finite element spatial discretization with a Θ scheme for the implicit time advancement. Currently fsFEM uses a nodal basis set (Lagrange polynomials) for interpolation with evenly spaced nodes. The basis functions associated with the boundaries of each element provide $C^0$ continuity.

The fsFEM code is organized into several files, which include the main code fsFEM.F, the data structure specification data_struct.F, a physics equation module pseudo1D_euler.F, the input/configuration file run_fsFEM, and a makefile to create the executable. To change the equations being solved, only the physics/equation module and the input file need to be changed.

To aid the use of fsFEM, the compilation with the PETSc libraries, algorithm outline, data structures, physics/equation module, configuration, and output are described. This should help a user compile and run fsFEM on a machine of their choice.

A.2 Compiling with PETSc libraries

Since the code relies heavily on PETSc features, the PETSc libraries must be configured and compiled on the target machine before fsFEM is compiled. The steps required to make use of the PETSc libraries are described below. More detail on configuring and compiling the libraries is found in references [4] and [5].

• Configure PETSc - Configure for use with the desired compilers and debuggers, and with the appropriate external packages. An example using the Intel compilers, Intel BLAS/LAPACK, and a user-compiled MPICH is shown for a 32-bit machine. SuperLU is downloaded automatically by the configuration tool.

$ ./config/configure.py \
    --with-gnu-compilers=0 \
    --with-vendor-compilers=intel \
    --with-blas-lapack-dir=/opt/intel/mkl/8.0/lib/32 \
    --with-mpi-dir=/opt/mpich \
    --download-superlu=yes

• Compile PETSc Libraries - Compile the libraries with the configured options. Assuming the configuration completes, the libraries are compiled and tested by issuing the command

$ make all test

• Create Makefile - Create a makefile that links the code to the PETSc libraries. This

makefile should indicate to the compiler which files have subroutines that link to the

PETSc libraries. An example makefile is shown in Appendix B.

• Compile Code - Compile fsFEM with PETSc libraries linked. With a correct makefile,

the code is compiled using the command

$ make fsFEM
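Before running make, it can help to confirm that the PETSc tree the makefile points to is actually present. The following is a hypothetical preflight check, not part of fsFEM; the function name check_petsc_dir is made up for this sketch, and the only file it looks for is bmake/common/base, which the example makefile in Appendix B includes.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory looks like a
# PETSc tree usable by the Appendix B makefile.
check_petsc_dir() {
    petsc_dir="$1"
    # The example makefile includes $(PETSC_DIR)/bmake/common/base.
    if [ -f "$petsc_dir/bmake/common/base" ]; then
        echo "ok: $petsc_dir"
        return 0
    fi
    echo "missing: $petsc_dir/bmake/common/base" >&2
    return 1
}

# Example: the path used in the Appendix B makefile.
check_petsc_dir /opt/petsc-2.3.3-p8 || echo "configure and build PETSc first"
```

A check like this turns a long list of unresolved-include errors from make into a single readable message.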

A.3 Running the Code

Once fsFEM has been compiled and linked against the PETSc libraries, the executable can be run. Before that, the input/configuration script must be created or edited. A sample script in bash is shown, but any scripting language can be used. This input script specifies the solver parameters, some initial condition variables, and the boundary condition specification, and performs some manipulation of the output data.


#!/bin/bash

## Main Runtime Variables

IC=1

npe=4

Ne=50

nT=

Theta=0.50

Final_Time=100.0

deltaT=0.1

nInterval=1

Epse=5e-3

Mat_Solver=1

## Boundary Condition Flags

# 0 = All Points Off

# 1 = Interior Points Only

# 2 = Boundary Points Only

# 3 = Interior and Exterior Points

Qflg=1

Fflg=1

Sflg=1

Vflg=3

## Initial Condition Specification

Mi=0.25 # Initial/Inlet Mach Number

Mo=0.25 # Initial/Outlet Mach Number

Pi=1.0000 # Initial/Inlet Pressure

Po=1.0000 # Initial/Outlet Pressure


## Boundary Condition Specification

## Specify Type of BC on Nodes

## (0 = Dirichlet, 1=Neumann, 2=Natural)

bc_in11=1

bc_in21=0

bc_in31=1

bc_in41=0

bc_in51=2

bc_in61=2

# Specify the Boundary Variables to Set

# (1 = rho, 2=rhou, 3=e, 4=u, 5=p)

bc_in12=1

bc_in22=2

bc_in32=3

bc_in42=1

bc_in52=2

bc_in62=5

## SNES Options

#SNES1=-snes_type

#SNES2=test

#SNES3=-snes_test_display

MAXIT=20

MAXF=10000

ATOL=1e-50

STOL=1e-8

RTOL=1e-8


## KSP Options (if SuperLU is Off, Mat_Solver=1)

KSPTYPE=gmres

./fsFEM -npe $npe -nT $nT -Ne $Ne -Theta $Theta -IC $IC -Epse $Epse \
    -Mat_Solver $Mat_Solver -Qflg $Qflg -Fflg $Fflg -Sflg $Sflg -Vflg $Vflg \
    -Final_Time $Final_Time -deltaT $deltaT -nInterval $nInterval \
    -maxit $MAXIT -maxf $MAXF -atol $ATOL -stol $STOL -rtol $RTOL \
    -bc_in11 $bc_in11 -bc_in21 $bc_in21 -bc_in31 $bc_in31 \
    -bc_in41 $bc_in41 -bc_in51 $bc_in51 -bc_in61 $bc_in61 \
    -bc_in12 $bc_in12 -bc_in22 $bc_in22 -bc_in32 $bc_in32 \
    -bc_in42 $bc_in42 -bc_in52 $bc_in52 -bc_in62 $bc_in62 \
    -Mi $Mi -Mo $Mo -Pi $Pi -Po $Po \
    $SNES1 $SNES2 $SNES3 -ksp_type $KSPTYPE

# Braces keep the shell from reading e.g. $IC_t as a variable named IC_t
name="Nozzle_IC${IC}_t${Final_Time}_dt${deltaT}_Ne${Ne}_Theta${Theta}_Eps${Epse}_poly${npe}_Mi${Mi}_Mo${Mo}_Pi${Pi}_Po${Po}"

echo $name

mkdir ./output/$name

cp run_fsFEM ./output/$name/

cp fort.1 ./output/$name/initial.csv

mv fort.1 ./output/initial.csv

cp fort.2 ./output/$name/rhoA.csv

mv fort.2 ./output/rhoA.csv

cp fort.3 ./output/$name/u.csv

mv fort.3 ./output/u.csv

cp fort.4 ./output/$name/eA.csv

mv fort.4 ./output/eA.csv

cp fort.7 ./output/$name/variables.csv

mv fort.7 ./output/variables.csv


cp fort.8 ./output/$name/simtime.csv

mv fort.8 ./output/simtime.csv

cp fort.9 ./output/$name/parameters.csv

mv fort.9 ./output/parameters.csv

cp fort.10 ./output/$name/area.csv

mv fort.10 ./output/area.csv
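The repeated cp/mv pairs at the end of the script could also be written as a loop. The following is a sketch under the assumption that the unit numbers and output names stay exactly as listed above; the function name archive_outputs and the unit:label list are introduced here for illustration and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: copy each Fortran unit file into the run directory,
# then move it to ./output under its descriptive name.
archive_outputs() {
    run_dir="$1"
    mkdir -p ./output "$run_dir"
    for pair in 1:initial 2:rhoA 3:u 4:eA 7:variables \
                8:simtime 9:parameters 10:area; do
        unit="${pair%%:*}"      # text before the colon, e.g. 2
        label="${pair##*:}"     # text after the colon, e.g. rhoA
        # Skip units the run did not write.
        if [ -f "fort.$unit" ]; then
            cp "fort.$unit" "$run_dir/$label.csv"
            mv "fort.$unit" "./output/$label.csv"
        fi
    done
}

# Usage, after the solver has finished:
#   archive_outputs "./output/$name"
```

Keeping the unit-to-name mapping in one list makes it harder for the two copies of each file to drift apart when an output is added or renamed.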

A.4 Algorithm Outline

The algorithm used in the fsFEM code is described. Each major step is listed in the order

it is executed.

• Declare Variables (FORTRAN and PETSc)

• Define initial parameters

• Initialize PETSc

• Get PETSc runtime parameters

• Output initial parameters to screen

• Allocate PETSc data:

– Matrices

– Vectors

– Jacobian and R.H.S.

• Setup PETSc nonlinear solver (SNES)

• Allocate FORTRAN variables

• Setup 1D grid


• Specify boundary nodes

• Set initial condition

• Create static matrices (Mass, Flux, and Viscosity)

• START TIME LOOP

– Output variables

– Calculate flux

– Calculate Jacobian

– Initialize SNES solver (RHS and Jacobian Calculation)

– Finalize SNES solver

– Copy result to solution vector

– Update source

– Calculate timing

• END TIME LOOP

• Output final solution

• Write data to file

• Deallocate variables (FORTRAN and PETSc)
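Within each time step, the SNES stage of the outline amounts to a Newton iteration on a residual. A generic sketch follows, writing the residual as $\vec{R}$ and its Jacobian as $J$ (the roles played by the Rp and Jp objects in the listing); the precise residual assembled by the code is not reproduced here, and PETSc may additionally apply a line search to the update.

```latex
\begin{align*}
J(\vec{Q}^{k})\,\delta\vec{Q}^{k} &= -\vec{R}(\vec{Q}^{k}),
  \qquad J = \frac{\partial \vec{R}}{\partial \vec{Q}}, \\
\vec{Q}^{k+1} &= \vec{Q}^{k} + \delta\vec{Q}^{k}, \\
\text{stop when}\quad
  & \|\vec{R}(\vec{Q}^{k})\| < \text{atol}
  \;\;\text{or}\;\;
  \|\vec{R}(\vec{Q}^{k})\| < \text{rtol}\,\|\vec{R}(\vec{Q}^{0})\|
  \;\;\text{or}\;\;
  \|\delta\vec{Q}^{k}\| < \text{stol}\,\|\vec{Q}^{k}\| .
\end{align*}
```

The iteration also ends after maxit Newton steps or maxf residual evaluations. Each linear solve in the first line is handled either by SuperLU or by a PETSc KSP method, depending on the Mat_Solver setting.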

A.5 Data Structures

The main data structures involved with the solver are the PETSc matrices and vectors.

These include:

• Conserved Variables, Flux and Source Vectors - Q, F , S. These hold the current

value of the primary variables, flux and source.


• Iterate Conserved Variables, Flux and Source Vectors - Qk, Fk and Sk. These are

used as temporary holders for the iterate values during the nonlinear solve process.

• Mass, Flux, and Viscosity Matrices - M, K, V. These matrices hold the integrated values of the basis functions as described in Eqns. 3.7 and 5.8.

• Jacobian Matrices - A and Z, where A is the flux Jacobian and Z is the source Jacobian. No Jacobian is required for the dissipation term since it operates on $\vec{q}$.

• Boundary Matrix and Jacobian - C and Y, where C is a boundary matrix and Y is the boundary Jacobian.

• SNES and KSP Solvers - The SNES and KSP solvers have PETSc objects snes and

ksp associated with them. These hold all the parameter information required for both

of these solvers.

A.6 Physics and Equation Specification Module

The solver relies on a module to specify the equation information, as well as routines that calculate the flux, source, and the flux and source Jacobians. The module also has routines for displaying the output data on the screen and for writing data to a file. These routines are kept in the module so that the output can be customized for the specific equation set. The required subroutines are listed below with descriptions:

• get_initial_condition - Returns the initial condition values for all the primary variables, $\vec{q}$. These are stored in the PETSc vector Q.

• set_bnodes - Returns a list of the nodes with an explicit boundary condition defined, as well as the type of boundary condition that is to be applied and on what variable (i.e. a primary variable, or some user-defined non-primary variable).

• set_boundary_condition - Returns a boundary vector that has the boundary conditions applied as prescribed by the set_bnodes subroutine.

• write_initial_condition - Writes the initial condition data to a file.

• get_timestep - Calculates the fastest wave speed in the system and a corresponding explicit time step size (not important for implicit time stepping).

• get_flux - Returns the flux, $\vec{f}$, calculated for the specific equation set as a function of the primary variables, $\vec{q}$.

• get_source - Returns the source, $\vec{s}$, for the specific equation set as a function of the primary variables, $\vec{q}$.

• get_Jacobian - Returns the analytic flux and source Jacobians ($\partial\vec{f}/\partial\vec{q}$ and $\partial\vec{s}/\partial\vec{q}$) as a function of the primary variables, $\vec{q}$.

• display_output - Outputs user-specified values for monitoring the solution at each time step during the solve.

• write_data - Writes solution data to file at a specified time step interval.

• module_finalize - Deallocates any global variables defined within the module.
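To indicate what the flux, source, and Jacobian routines would have to return for a new equation set, here is the inviscid Burgers equation written in the same terms. This is an illustrative choice made here, not the pseudo-1D Euler module supplied with the code.

```latex
\begin{align*}
\frac{\partial u}{\partial t}
  + \frac{\partial}{\partial x}\!\left(\frac{u^{2}}{2}\right) &= 0
  && \text{equation set} \\
\vec{q} &= (u)
  && \text{primary variable} \\
\vec{f}(\vec{q}) &= \left(\tfrac{1}{2}u^{2}\right)
  && \text{flux routine} \\
\vec{s}(\vec{q}) &= (0)
  && \text{source routine} \\
\frac{\partial \vec{f}}{\partial \vec{q}} = u, \qquad
\frac{\partial \vec{s}}{\partial \vec{q}} &= 0
  && \text{Jacobian routine}
\end{align*}
```

The remaining routines (initial condition, boundary nodes, output) carry the problem setup rather than the physics, so they would change with the test case even when the equation set stays the same.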


Appendix B

SOURCE CODE

B.0.1 makefile

SHELL = /bin/sh
PETSC_DIR = /opt/petsc-2.3.3-p8
PETSC_ARCH = linux-intel-debug

CMD = fsFEM
include $(PETSC_DIR)/bmake/common/base

LINK_OBJS = data_struct.o csv_file.o pseudo1D_euler.o fsFEM.o
fsFEM_OBJS = data_struct.o csv_file.o pseudo1D_euler.o

FFLAGS = $(FOPT) -I$(PETSC_DIR) -I$(PETSC_DIR)/include \
	-I/usr/include -I$(PETSC_DIR)/bmake/$(PETSC_ARCH)

all: $(CMD)

$(CMD): $(LINK_OBJS) chkopts
	-$(FLINKER) -o $@ $(LINK_OBJS) $(PETSC_LIB)
	$(RM) *.o
	$(RM) *.mod

data_struct.o: data_struct.F chkopts
	-$(FLINKER) -c $(FFLAGS) data_struct.F $(PETSC_LIB)

pseudo1D_euler.o: pseudo1D_euler.F chkopts
	-$(FLINKER) -c $(FFLAGS) pseudo1D_euler.F $(PETSC_LIB)

csv_file.o: csv_file.f90 chkopts
	-$(FLINKER) -c csv_file.f90

fsFEM.o: fsFEM.F $(fsFEM_OBJS) chkopts
	-$(FLINKER) -c $(FFLAGS) fsFEM.F $(PETSC_LIB)

wipe:
	rm -f *.o *.f *.mod

wipe_clean:
	rm -f *.o *.f *.mod fsFEM

B.0.2 data_struct.F

! Data structure modules

      MODULE data_struct
      IMPLICIT NONE

#define PETSC_AVOID_DECLARATIONS
#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"
#undef PETSC_AVOID_DECLARATIONS

      TYPE :: el
        DOUBLE PRECISION :: F,S,dQ
      END TYPE el

      TYPE :: petsc_type

        !! PETSc Data Structures
        Mat M,K,V,V2,C,L,Z,Y,A
        Mat qI,fI,sI,vI
        Vec Q,F,S,B
        Vec Qk,Fk,Sk,X,Xk

        !! Fortran Data Structures
        DOUBLE PRECISION, DIMENSION(:),ALLOCATABLE :: XYZ
        INTEGER :: Np,asize,Mat_Solver
        DOUBLE PRECISION :: Theta,deltaT,Epse,Epsi
        INTEGER, DIMENSION(:),ALLOCATABLE :: inodes
        INTEGER,DIMENSION(:,:),ALLOCATABLE :: bnodes
        INTEGER,DIMENSION(:,:),ALLOCATABLE :: bcs
        INTEGER :: Nbn,Nbc,npe
        INTEGER :: Qflg,Fflg,Sflg,Vflg

      END TYPE

      END MODULE data_struct

B.0.3 fsFEM.F

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! This is a 1D Flux Source Form Finite Element Code for the Pseudo-1D  !
! Euler Equations.                                                     !
! WBL 01/23/2008                                                       !
! [email protected]                                                    !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      PROGRAM fsFEM
      USE data_struct
      USE pseudo1d_euler
      USE csv_file
      IMPLICIT none

      !! PETSc include statements
#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"
#include "include/finclude/petscksp.h"
#include "include/finclude/petscpc.h"
#include "include/finclude/petscsnes.h"
#include "include/finclude/petscviewer.h"

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Declare Variables                                                    !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Fortran Data Structure
      TYPE (el), DIMENSION(:,:),ALLOCATABLE :: element
      INTEGER, DIMENSION(:,:),ALLOCATABLE :: nodes
      INTEGER, DIMENSION(2*Neq,2) :: bc_input
      DOUBLE PRECISION,DIMENSION(:,:,:),ALLOCATABLE :: Q
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: Qt

      !! Time Variables
      INTEGER :: nT,nInterval,nTs,fcount,finterval
      DOUBLE PRECISION :: CFL,FinalTime,dTi,Cs_max
      DOUBLE PRECISION,DIMENSION(:),ALLOCATABLE :: dT,Cs,sim_time

      !! Grid Variables
      INTEGER :: Ne,Np,Ni,npe,asize
      DOUBLE PRECISION :: Xmin,Xmax,dL,dx,dtdx
      DOUBLE PRECISION,DIMENSION(2) :: qprime
      DOUBLE PRECISION,DIMENSION(:),ALLOCATABLE :: Xcoord,Xnodes

      !! Quadrature Variables
      DOUBLE PRECISION,DIMENSION(:),ALLOCATABLE :: x,w

      !! Solver Variables
      INTEGER :: Int_Method
      INTEGER :: its,newt,its_tot,newt_tot

      !! Matrices and Matrix Creation Variables
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: M
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: K
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: V,V2
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: Me,Ke,Ve,Ve2
      INTEGER :: x1,x2
      INTEGER :: basis_type
      DOUBLE PRECISION :: xL,xR,pn,scaler

      !! Index Variables
      INTEGER :: e,h,i,j,t

      !! Output Flags
      INTEGER :: timestep_info

      !! PETSc Variables
      INTEGER :: ii,jj
      DOUBLE PRECISION :: norm
      DOUBLE PRECISION :: tend,time,t0
      PetscErrorCode ierr
      PetscMPIInt rank,mpisize ! PETSc MPI Variables
      PetscScalar deltaT,Theta,Epse,Epsi
      PetscTruth flg
      Mat Jp,Jpre
      MatStructure flag ! Solver Matrix Structure
      Vec Rp,dQk ! Vectors for SNES
      SNES snes ! Non-linear Solver Context
      KSP ksp,ksp_amp ! Linear Solver Context
      PC pc ! Preconditioner Context

      !! SNES Tolerances
      PetscReal atol,rtol,stol
      PetscInt maxit,maxf
      SNESConvergedReason reason

      !! PETSc Type
      !! (M,K,V,Z,Y,A,Q,F,S,B,Np,asize,Theta,deltaT,Mat_Solver,
      !! Epse,Epsi,qI,fI,sI,vI)
      TYPE(petsc_type) :: petsc

      !! External Subroutines
      EXTERNAL :: get_RightHandSide,get_LeftHandSide
      EXTERNAL :: SNES_Function,SNES_Jacobian

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Start CPU Timer                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      CALL CPU_TIME(time)
      t0 = time

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Parameters                                                           !
! ---Many of these are overwritten by runtime options---               !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Initial Parameters
      Ne = 50
      CFL = 1.00
      deltaT = 0.5
      FinalTime = 10.0
      nT = FinalTime/deltaT
      nInterval = 1
      nTs = nT/nInterval

      !! Domain Parameters
      Xmin = 0
      Xmax = 10

      !! Solver Parameters
      Theta = 0.50
      Epse = 0e-3
      Epsi = 1.00*Epse
      petsc%Mat_Solver= 0 !(0 for SuperLU, 1 for PETSc KSP)

      !! Initial Condition Parameters
      IC = 1 !(0 for Shock Tube, 1 for 1D-Nozzle)

      !! Basis Function Parameters
      basis_type = 0
      Int_Method = 2 ! Integration Method (1=Legendre-Gauss)
      npe = 3 ! Nodes per element
      Ni = 2*npe+1 ! Number of Quadrature Points
      Np = Ne*(npe-1)+1 ! Total Number of Points
      asize = Np*Neq ! Matrix/Vector Size
      flag = SAME_NONZERO_PATTERN

      !! Output
      timestep_info = 1

      !! SNES Parameters
      reason = 0

      !! Boundary Condition Flags
      !! 0 = All Points Off
      !! 1 = Interior Points Only
      !! 2 = Boundary Points Only
      !! 3 = Interior and Exterior Points
      petsc%Qflg = 1
      petsc%Fflg = 1
      petsc%Sflg = 1
      petsc%Vflg = 1

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! PETSc Initialization                                                 !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Petsc Initialization
      CALL PetscInitialize(PETSC_NULL_CHARACTER,ierr)

      !! MPI/PETSc Initialization
      CALL MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
      CALL MPI_Comm_size(PETSC_COMM_WORLD,mpisize,ierr)

      !! Set PETSc Viewer Format
      CALL PetscViewerSetFormat(PETSC_VIEWER_STDOUT_WORLD,
     &     PETSC_VIEWER_ASCII_MATLAB,ierr)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Set Runtime (Command Line) Options                                   !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Set Runtime Parameters
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-npe',npe,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Ne',Ne,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Epse',Epse,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-nT',nT,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-nInterval',nInterval,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-deltaT',deltaT,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Final_Time',FinalTime,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-IC',IC,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Theta',Theta,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-CFL',CFL,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Mat_Solver',petsc%Mat_Solver,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Int_Method',Int_Method,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Basis_Type',basis_type,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Qflg',petsc%Qflg,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Fflg',petsc%Fflg,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Sflg',petsc%Sflg,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-Vflg',petsc%Vflg,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-atol',atol,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-rtol',rtol,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-stol',stol,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-maxit',maxit,flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-maxf',maxf,flg,ierr)

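      !! Boundary Condition Input
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in11',bc_input(1,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in21',bc_input(2,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in31',bc_input(3,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in41',bc_input(4,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in51',bc_input(5,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in61',bc_input(6,1),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in12',bc_input(1,2),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in22',bc_input(2,2),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in32',bc_input(3,2),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in42',bc_input(4,2),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in52',bc_input(5,2),flg,ierr)
      CALL PetscOptionsGetInt(PETSC_NULL_CHARACTER,
     &     '-bc_in62',bc_input(6,2),flg,ierr)

      !! Initial Condition Input
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Mi',Mi,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Mo',Mo,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Pi',Pi,flg,ierr)
      CALL PetscOptionsGetReal(PETSC_NULL_CHARACTER,
     &     '-Po',Po,flg,ierr)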

      !! Reset Variables based on runtime specifications
      Np = Ne*(npe-1)+1
      Ni = 2*npe+1
      asize = Np*Neq
      Epsi = 1.00*Epse
      nT = FinalTime/deltaT
      nTs = nT/nInterval

      IF (IC==2) THEN
        Xmin = 0
        Xmax = 3
      ENDIF

      !! petsc_type variables
      petsc%Theta = Theta
      petsc%Np = Np
      petsc%asize = asize
      petsc%Epse = Epse
      petsc%Epsi = Epsi
      petsc%npe = npe

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Output Initial Parameters                                            !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Output Parameters to Screen
      WRITE(6,
     &     '(A,I5,3x,A,I5,3x,A,I5,3x,
     &     A,E10.3,3x,A,E10.3,3x,A,F10.3,3x,A,F10.3)')
     &     'Ne = ',Ne,'Np =',Np,'nT = ',nT,
     &     'Epse = ',Epse,'Epsi = ',Epsi,'Theta = ',Theta,
     &     'Final Time = ',FinalTime

      WRITE(6,'(A,F10.3,3x,A,F10.3,3x,A,F10.3,3x,A,F10.3)')
     &     'Mi = ',Mi,'Mo = ',Mo,'Pi = ',Pi,'Po = ',Po

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! PETSc Matrix/Vector Setup                                            !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Allocate/Setup Matrices
      CALL MatCreate(PETSC_COMM_WORLD,petsc%M,ierr)
      CALL MatSetSizes(petsc%M,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%M,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%K,ierr)
      CALL MatSetSizes(petsc%K,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%K,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%V,ierr)
      CALL MatSetSizes(petsc%V,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%V,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%V2,ierr)
      CALL MatSetSizes(petsc%V2,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%V2,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%C,ierr)
      CALL MatSetSizes(petsc%C,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%C,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%L,ierr)
      CALL MatSetSizes(petsc%L,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%L,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%A,ierr)
      CALL MatSetSizes(petsc%A,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%A,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%Z,ierr)
      CALL MatSetSizes(petsc%Z,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%Z,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%Y,ierr)
      CALL MatSetSizes(petsc%Y,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%Y,ierr)

      ! Boundary Matrices
      CALL MatCreate(PETSC_COMM_WORLD,petsc%qI,ierr)
      CALL MatSetSizes(petsc%qI,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%qI,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%fI,ierr)
      CALL MatSetSizes(petsc%fI,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%fI,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%sI,ierr)
      CALL MatSetSizes(petsc%sI,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%sI,ierr)

      CALL MatCreate(PETSC_COMM_WORLD,petsc%vI,ierr)
      CALL MatSetSizes(petsc%vI,PETSC_DECIDE,PETSC_DECIDE,
     &     asize,asize,ierr)
      CALL MatSetFromOptions(petsc%vI,ierr)

      ! Jacobian Matrix
      CALL MatCreate(PETSC_COMM_WORLD,Jp,ierr)
      CALL MatSetSizes(Jp,PETSC_DECIDE,PETSC_DECIDE,asize,asize,ierr)
      CALL MatSetFromOptions(Jp,ierr)

      !! Allocate/Setup Vectors
      CALL VecCreate(PETSC_COMM_WORLD,petsc%Q,ierr)
      CALL VecSetSizes(petsc%Q,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%Q,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%F,ierr)
      CALL VecSetSizes(petsc%F,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%F,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%S,ierr)
      CALL VecSetSizes(petsc%S,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%S,ierr)

      !CALL VecCreate(PETSC_COMM_WORLD,petsc%B,ierr)
      !CALL VecSetSizes(petsc%B,PETSC_DECIDE,asize,ierr)
      !CALL VecSetFromOptions(petsc%B,ierr)
      CALL VecDuplicate(petsc%Q,petsc%B,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,Rp,ierr)
      CALL VecSetSizes(Rp,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(Rp,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,dQk,ierr)
      CALL VecSetSizes(dQk,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(dQk,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%Qk,ierr)
      CALL VecSetSizes(petsc%Qk,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%Qk,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%Xk,ierr)
      CALL VecSetSizes(petsc%Xk,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%Xk,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%X,ierr)
      CALL VecSetSizes(petsc%X,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%X,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%Fk,ierr)
      CALL VecSetSizes(petsc%Fk,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%Fk,ierr)

      CALL VecCreate(PETSC_COMM_WORLD,petsc%Sk,ierr)
      CALL VecSetSizes(petsc%Sk,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(petsc%Sk,ierr)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Setup PETSc Nonlinear Solver (SNES)                                  !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Setup Non-Linear Solver
      CALL SNESCreate(PETSC_COMM_WORLD,snes,ierr)

      !! Set SNES Function and Jacobian
      CALL SNESSetFunction(snes,Rp,SNES_Function,petsc,ierr)
      CALL SNESSetJacobian(snes,Jp,Jp,SNES_Jacobian,petsc,ierr)
      CALL SNESSetTolerances(snes,atol,rtol,stol,maxit,maxf,ierr)
      CALL SNESGetTolerances(snes,atol,rtol,stol,maxit,maxf,ierr)

      !! Output SNES Tolerance Info
      WRITE(6,'(A,E10.3,3x,A,E10.3,3x,A,E10.3,3x,A,I5,3x,A,I5)')
     &     'ATOL = ',atol,'RTOL = ',rtol,'STOL = ',stol,
     &     'MAXIT = ',maxit,'MAXF = ',maxf

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Allocate FORTRAN Variables                                           !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      ALLOCATE( element(Np,Neq) )
      ALLOCATE( nodes(Ne,npe) )
      ALLOCATE( petsc%XYZ(Np) )
      ALLOCATE( Xcoord(npe),Xnodes(npe) )
      ALLOCATE( Q(Np,Neq,nTs+1) )
      ALLOCATE( Qt(Np,Neq) )
      ALLOCATE( Cs(Np-1) )
      ALLOCATE( dT(Np-1) )
      ALLOCATE( sim_time(nT+1) )
      ALLOCATE( x(Ni),w(Ni) )
      ALLOCATE( Me(npe,npe),Ke(npe,npe),Ve(npe,npe),Ve2(npe,npe) )
      ALLOCATE( M(Np,Np),K(Np,Np),V(Np,Np),V2(Np,Np) )

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Grid Setup                                                           !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Set nodes for each element and x-value at each node
      dL = (Xmax-Xmin)/(Np-1) ! Length between nodes (Uniform Grid)
      DO i = 1,Np
        petsc%XYZ(i) = (i-1)*dL
      END DO
      DO e = 1,Ne
        DO i = 1,npe
          nodes(e,i) = e*(npe-1) - (npe-2) + (i-1)
        END DO
      END DO

      !! Allocate vectors to store boundary and interior node numbers
      ALLOCATE( petsc%bnodes(2*Neq,3) )
      ALLOCATE( petsc%bcs(2*Neq,3) )
      ALLOCATE( petsc%inodes(asize) )

      !! Set the boundary and interior node number vectors
      !! (This is trivial for the 1D case, but will be more complex
      !! with a 2D or 3D grid and should incorporate output from a
      !! grid generator)
      CALL set_bnodes(petsc%bnodes,petsc%Nbn,petsc%asize,bc_input)

petsc%bcs = 0petsc%Nbc = 0h = 1DO i = 1,asizeIF ( i==petsc%bnodes(1,1) .AND. petsc%bnodes(1,2) .NE. 2

& .OR. i==petsc%bnodes(2,1) .AND. petsc%bnodes(2,2) .NE. 2& .OR. i==petsc%bnodes(3,1) .AND. petsc%bnodes(3,2) .NE. 2& .OR. i==petsc%bnodes(4,1) .AND. petsc%bnodes(4,2) .NE. 2& .OR. i==petsc%bnodes(5,1) .AND. petsc%bnodes(5,2) .NE. 2& .OR. i==petsc%bnodes(6,1) .AND. petsc%bnodes(6,2) .NE. 2 )& THEN

! Nothing for NowELSE! List Interior Nodes

86

petsc%inodes(h) = ih = h + 1

END IFEND DO

DO i = 1,petsc%NbnIF (petsc%bnodes(i,2) .NE. 2) THENpetsc%Nbc = petsc%Nbc + 1petsc%bcs(petsc%Nbc,1) = petsc%bnodes(i,1)petsc%bcs(petsc%Nbc,2) = petsc%bnodes(i,2)petsc%bcs(petsc%Nbc,3) = petsc%bnodes(i,3)

END IFEND DO

WRITE(*,*) ’Nbn = ’,petsc%NbnWRITE(*,*) ’Nbc = ’,petsc%NbcWRITE(*,*) ’bcs1= ’,petsc%bcs(:,1)WRITE(*,*) ’bcs2= ’,petsc%bcs(:,2)WRITE(*,*) ’bcs3= ’,petsc%bcs(:,3)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Set Initial Condition                                                !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Get the initial condition. This subroutine is in
      !! the corresponding physics/equation set module.
      CALL get_initial_condition(IC,Np,nT,element%S,Q(:,:,1),
     &                           petsc%XYZ)
      ! Copy initial condition to temp, Qt
      Qt(:,:) = Q(:,:,1)

      !WRITE(*,*) 'Q1=',Q(:,1,1)
      !WRITE(*,*) 'Q2=',Q(:,2,1)
      !WRITE(*,*) 'Q3=',Q(:,3,1)

      !! Write Initial Condition to file
      CALL write_initial_condition(Q(:,:,1),element%S,petsc%XYZ,Np)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Setup Solver for Initial Basis Function Amplitudes                   !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Create Linear Solver for Amplitudes
      CALL KSPCreate(PETSC_COMM_WORLD,ksp_amp,ierr)
      CALL KSPSetOperators(ksp_amp,petsc%L,petsc%L,
     &                     SAME_NONZERO_PATTERN,ierr)
      CALL KSPSetType(ksp_amp,KSPCG,ierr)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Create Static Matrices                                               !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Create Matrices with FORTRAN arrays (M,K,V)
      DO e = 1,Ne

        !! Get the Nodal Coordinates
        DO i = 1,npe
          Xnodes(i) = nodes(e,i)
          Xcoord(i) = petsc%XYZ(Xnodes(i))
        END DO

        x1 = Xnodes(1)
        x2 = Xnodes(npe)
        xL = Xcoord(1)
        xR = Xcoord(npe)

        !CALL Legendre_Function(npe,Xcoord(1),xL,xR,pn)
        !WRITE(*,*) 'pn=',pn

        !! Create Element Matrices
        CALL get_Mmatrix(Xcoord,Ni,Int_Method,npe,Me,petsc%bnodes,x1,x2)
        CALL get_Kmatrix(Xcoord,Ni,Int_Method,npe,Ke,petsc%bnodes,x1,x2)
        CALL get_Vmatrix(Xcoord,Ni,Int_Method,npe,Ve,Ve2,Np,
     &                   petsc%bnodes,x1,x2)

        !! Insert Element Matrices into Global Matrices
        M(x1:x2,x1:x2) = M(x1:x2,x1:x2) + Me
        K(x1:x2,x1:x2) = K(x1:x2,x1:x2) + Ke
        V(x1:x2,x1:x2) = V(x1:x2,x1:x2) + Ve
        V2(x1:x2,x1:x2) = V2(x1:x2,x1:x2) + Ve2

        !! Write out the first Me, Ke, and Ve Matrices
        IF (e==1 .AND. npe==3) THEN
          WRITE(6,'(A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A )')
     &      '   ','|',Me(1,1),Me(1,2),Me(1,3),'|',
     &      '   ','|',Ke(1,1),Ke(1,2),Ke(1,3),'|',
     &      '   ','|',Ve(1,1),Ve(1,2),Ve(1,3),'|'

          WRITE(6,'(A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A )')
     &      'Me=','|',Me(2,1),Me(2,2),Me(2,3),'|',
     &      'Ke=','|',Ke(2,1),Ke(2,2),Ke(2,3),'|',
     &      'Ve=','|',Ve(2,1),Ve(2,2),Ve(2,3),'|'

          ! Third row: the middle entry is (3,2), not (1,2)
          WRITE(6,'(A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A,5x,
     &              A,3x,A,F7.4,3x,F7.4,3x,F7.4,A )')
     &      '   ','|',Me(3,1),Me(3,2),Me(3,3),'|',
     &      '   ','|',Ke(3,1),Ke(3,2),Ke(3,3),'|',
     &      '   ','|',Ve(3,1),Ve(3,2),Ve(3,3),'|'

        END IF

      END DO

      !! Create Boundary Matrix
      CALL get_Cmatrix(basis_type,petsc%XYZ,nodes,npe,Np,Ne,
     &                 petsc%asize,petsc%C,petsc%bnodes)

      !! Create Crude PETSc Matrices (Mp,Kp,Vp)
      DO h = 1,Neq
        DO i = 1,Np
          DO j = (i-(npe-1)),(i+(npe-1))
            IF (j<=0 .OR. j>=Np+1) THEN
              !! Do Nothing
            ELSE
              ii = Neq*i - (Neq - (h-1) )
              jj = Neq*j - (Neq - (h-1) )
              CALL MatSetValue(petsc%M,ii,jj,M(i,j),
     &                         INSERT_VALUES,ierr)
              CALL MatSetValue(petsc%K,ii,jj,K(i,j),
     &                         INSERT_VALUES,ierr)
              CALL MatSetValue(petsc%V,ii,jj,V(i,j),
     &                         INSERT_VALUES,ierr)
              CALL MatSetValue(petsc%V2,ii,jj,V2(i,j),
     &                         INSERT_VALUES,ierr)
            END IF
          END DO
        END DO
      END DO

      !! Create Boundary Matrices
      CALL create_boundary_matricies(petsc)

      !! PETSc Matrix Assembly
      CALL MatAssemblyBegin(petsc%M,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%M,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(petsc%K,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%K,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(petsc%V,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%V,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(petsc%V2,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%V2,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(petsc%C,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%C,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(petsc%L,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(petsc%L,MAT_FINAL_ASSEMBLY,ierr)

      !! View Matrices for Debugging
      !CALL MatView(petsc%M,PETSC_VIEWER_STDOUT_WORLD,ierr)
      !CALL MatView(petsc%K,PETSC_VIEWER_STDOUT_WORLD,ierr)
      !CALL MatView(petsc%V,PETSC_VIEWER_STDOUT_WORLD,ierr)
      !CALL MatView(petsc%V2,PETSC_VIEWER_STDOUT_WORLD,ierr)
      !CALL MatView(petsc%C,PETSC_VIEWER_STDOUT_WORLD,ierr)
      !CALL MatView(petsc%L,PETSC_VIEWER_STDOUT_WORLD,ierr)

      ! Sum V and V2 together
      scaler = +1.0
      CALL MatAXPY(petsc%V,scaler,petsc%V2,SAME_NONZERO_PATTERN,ierr)
      !CALL MatView(petsc%V,PETSC_VIEWER_STDOUT_WORLD,ierr)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Pre Time Loop Calculations                                           !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! Initialize the Simulation Time
      sim_time(1) = 0

      !! Zero the Frame Counter
      fcount = 0
      finterval = 1

      !! Calculate Time Step (From Equation Set File e.g.
      !! pseudo1D_euler.F )
      CALL get_timestep(Np,Q(:,:,1),petsc%XYZ,CFL,Cs_max,dTi)
      !deltaT = dTi
      dtdx = deltaT / dL
      petsc%deltaT = deltaT

      !! Output timestep and initial wavespeed
      its = 0
      newt = 0
      its_tot = 0
      newt_tot = 0
      WRITE(6,'(A,F10.4,5x,A,F10.4)')
     &     'dT = ',deltaT,'Cs_max = ',Cs_max

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Start Time Loop                                                      !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      DO t = 1,nT !<---------- This is the start of the time loop

        !! Output Variables
        IF (timestep_info == 1) THEN
          CALL display_output(Np,Qt,Cs_max,sim_time(t),t,qprime,
     &                        dtdx,its,newt,reason)
        END IF

        !! Calculate Flux, F
        CALL get_flux(Np,Qt,element%F)

        !! Calculate Time Step/Wave Speed
        CALL get_timestep(Np,Qt,petsc%XYZ,CFL,Cs_max,dTi)
        dtdx = deltaT / dL
        !deltaT = dTi ! PetscScalar time step

        !! Copy Q, F, and S to PETSc vectors Qp, Fp, and Sp
        CALL VecZeroEntries(petsc%Q,ierr)
        CALL VecZeroEntries(petsc%F,ierr)
        CALL VecZeroEntries(petsc%S,ierr)
        DO j = 1,Neq
          DO i = 1,Np
            ii = Neq*i - (Neq - (j-1) )
            CALL VecSetValue(petsc%Q,ii,Qt(i,j),
     &                       INSERT_VALUES,ierr)
            CALL VecSetValue(petsc%F,ii,element(i,j)%F,
     &                       INSERT_VALUES,ierr)
            CALL VecSetValue(petsc%S,ii,element(i,j)%S,
     &                       INSERT_VALUES,ierr)
          END DO
        END DO

        !! PETSc Assemble Vectors
        CALL VecAssemblyBegin(petsc%Q,ierr)
        CALL VecAssemblyEnd(petsc%Q,ierr)
        CALL VecAssemblyBegin(petsc%F,ierr)
        CALL VecAssemblyEnd(petsc%F,ierr)
        CALL VecAssemblyBegin(petsc%S,ierr)
        CALL VecAssemblyEnd(petsc%S,ierr)

        !! Calculate Flux Jacobians
        CALL get_Jacobian(Np,petsc%Q,petsc%S,petsc%B,
     &       petsc%A,petsc%Z,petsc%Y,petsc%bcs,petsc%Nbc,petsc%npe)

        !! View Jacobian Matrix For Debugging
        !CALL MatView(petsc%Z,PETSC_VIEWER_STDOUT_WORLD,ierr)

        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        ! Matrix Solver                                        !
        ! When Theta = 0 -> Explicit                           !
        ! When Theta > 0 -> Implicit                           !
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

        IF (petsc%Mat_Solver == 0) THEN ! Use PETSc Wrapped SuperLU

          ! Get KSP context from SNES
          CALL SNESGetKSP(snes,ksp,ierr)

          ! Set KSP to Precondition Only and to LU for SuperLU
          CALL KSPSetType(ksp,KSPPREONLY,ierr)
          CALL KSPGetPC(ksp,pc,ierr)
          CALL PCSetType(pc,PCLU,ierr)

        ELSEIF (petsc%Mat_Solver == 1) THEN ! Use PETSc KSP Solvers

          ! Get KSP Context from SNES
          CALL SNESGetKSP(snes,ksp,ierr)

          ! Set Solver Type
          CALL KSPSetType(ksp,KSPBCGS,ierr)

        ENDIF

        ! Finalize Solve and Solve using SNES!!!
        CALL SNESSetFromOptions(snes,ierr)
        CALL SNESSolve(snes,PETSC_NULL,dQk,ierr)
        CALL SNESGetIterationNumber(snes,newt,ierr)
        CALL SNESGetLinearSolveIterations(snes,its,ierr)
        newt_tot = newt_tot + newt
        its_tot = its_tot + its
        !its = its / (newt+1)
        !CALL KSPGetIterationNumber(ksp,its,ierr)
        CALL SNESGetConvergedReason(snes,reason,ierr)
        !WRITE(*,*) 'SNES Reason = ',reason

        CALL CPU_TIME(tend)

        !! Copy Data From PETSc Solution Vector (dQk)
        DO j = 1,Neq
          DO i = 1,Np
            ii = Neq*i - (Neq - (j-1))
            CALL VecGetValues(dQk,1,ii,element(i,j)%dQ,ierr)
          END DO
        END DO
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        ! End Solver                                           !
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

        !! Update Qt
        Qt(:,:) = Qt(:,:) + element%dQ
        fcount = fcount + 1

        !! Save to Q
        IF ( fcount==nInterval ) THEN
          Q(:,:,finterval+1) = Qt
          finterval = finterval + 1
          fcount = 0
        END IF

        !! Update S
        CALL get_source(Np,Qt,element%S,petsc%XYZ,IC)

        !! Calculate Time
        sim_time(t+1) = sim_time(t) + deltaT

        !! Calculate Derivatives at the boundaries
        qprime = 0.0
        CALL compute_derivative(petsc,1,nodes,npe,Np,Ne,qprime(1))
        !WRITE(*,*) 'qprime L= ',qprime(1)

        ! Reset only the right-side entry so that the left-side
        ! derivative computed above is not overwritten
        qprime(2) = 0.0
        CALL compute_derivative(petsc,Ne,nodes,npe,Np,Ne,qprime(2))
        !WRITE(*,*) 'qprime R= ',qprime(2)

        IF (newt==maxit) THEN
          WRITE(*,*) 'Maximum Newton Iterations Reached'
          EXIT
        END IF

      END DO !<------------ This is the end of the time loop
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! End Time Loop                                                        !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      !! View Rp
      !CALL VecView(Rp,ierr)

      !! Display Final Time Step Info
      !itsF = its
      CALL display_output(Np,Qt,Cs_max,sim_time(t),t,qprime,
     &                    dtdx,its,newt,reason)

      !! Write Data Out to CSV file
      CALL write_data(Np,petsc%XYZ,Q,nT,nTs,nInterval,sim_time,CFL,
     &                Theta,Epse,Epsi,npe)

      WRITE(*,*) 'Deallocating FORTRAN...'

      !! Deallocate FORTRAN Matrices and Vectors
      !! (Ve2 and V2 are allocated above and are released here too)
      DEALLOCATE(element)
      DEALLOCATE(nodes)
      DEALLOCATE(petsc%XYZ)
      DEALLOCATE(Xcoord,Xnodes)
      DEALLOCATE(Q,Qt,Cs,dT,sim_time,x,w,Me,Ke,Ve,Ve2)
      DEALLOCATE(M,K,V,V2)
      DEALLOCATE(petsc%bnodes,petsc%inodes,petsc%bcs)

      !! Finalize Physics Module Variables
      CALL module_finalize()

      WRITE(*,*) 'Deallocating PETSc...'

      !! Deallocate PETSc Matrices and Vectors
      CALL MatDestroy(petsc%M,ierr)
      CALL MatDestroy(petsc%K,ierr)
      CALL MatDestroy(petsc%V,ierr)
      CALL MatDestroy(petsc%V2,ierr)
      CALL MatDestroy(petsc%C,ierr)
      CALL MatDestroy(petsc%L,ierr)
      CALL MatDestroy(petsc%A,ierr)
      CALL MatDestroy(petsc%Z,ierr)
      CALL MatDestroy(petsc%Y,ierr)
      CALL MatDestroy(petsc%qI,ierr)
      CALL MatDestroy(petsc%fI,ierr)
      CALL MatDestroy(petsc%sI,ierr)
      CALL MatDestroy(petsc%vI,ierr)
      CALL MatDestroy(Jp,ierr)

      CALL VecDestroy(petsc%Q,ierr)
      CALL VecDestroy(petsc%F,ierr)
      CALL VecDestroy(petsc%S,ierr)
      CALL VecDestroy(petsc%B,ierr)

      CALL VecDestroy(dQk,ierr)
      CALL VecDestroy(Rp,ierr)
      CALL VecDestroy(petsc%Qk,ierr)
      CALL VecDestroy(petsc%Fk,ierr)
      CALL VecDestroy(petsc%Sk,ierr)
      CALL VecDestroy(petsc%Xk,ierr)
      CALL VecDestroy(petsc%X,ierr)

      CALL SNESDestroy(snes,ierr)

      !! Write total iteration counts
      WRITE(*,*) 'its_tot = ',its_tot
      WRITE(*,*) 'newt_tot = ',newt_tot

      !! Write CPU_TIME
      CALL CPU_TIME(tend)
      WRITE(*,*) 'tend = ',tend

      WRITE(*,*) 'PETSc Finalization...'
      CALL PetscFinalize(ierr)

      !! Write CPU_TIME
      CALL CPU_TIME(tend)
      WRITE(*,*) 'final = ',tend

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! End of Program                                                       !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      END PROGRAM fsFEM

!**********************************************************************!
! Begin Subroutines                                                    !
!**********************************************************************!

!*******************************!
! Subroutine to Make Element    !
! Matrix, Me                    !
!*******************************!
      SUBROUTINE get_Mmatrix(Xcoord,Ni,Int_Method,npe,Me,bnodes,x1,x2)
      USE pseudo1D_euler
      IMPLICIT NONE
      DOUBLE PRECISION :: xL,xR
      INTEGER :: Ni,Int_Method,npe
      INTEGER :: i,j,x1,x2
      DOUBLE PRECISION,DIMENSION(npe,npe) :: Me
      DOUBLE PRECISION,DIMENSION(Ni) :: x,w,xi,wi
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: h
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      DOUBLE PRECISION,DIMENSION(npe) :: h_norm
      INTEGER,DIMENSION(Neq*2,3) :: bnodes

      ALLOCATE( h(npe,Ni) )

      xL = Xcoord(1)
      xR = Xcoord(npe)

      IF (Int_Method == 1) THEN
        ! Legendre-Gauss Quadrature
        CALL LGWT(Ni,xL,xR,x,w)
        !WRITE(6,'(A,F6.4,3x,A,F6.4)') 'x(1)=',x(1),'w(1)=',w(1)

        h(1,1:Ni) = (xR-x)/(xR-xL)
        h(2,1:Ni) = (x-xL)/(xR-xL)

        DO i = 1,npe
          DO j = 1,npe
            Me(i,j) = SUM(w*h(i,1:Ni)*h(j,1:Ni))
          END DO
        END DO

      ELSEIF (Int_Method == 2) THEN
        ! Legendre-Gauss Quadrature with Lagrange Polynomials
        CALL LGWT(Ni,xL,xR,xi,wi)

        ! Create Lagrange Basis Functions, h
        h(1:npe,1:Ni) = 0
        h_norm(1:npe) = 1.0

        ! Reorder x and w
        DO i=1,Ni
          x(i) = xi(Ni+1-i)
          w(i) = wi(Ni+1-i)
        END DO

        DO i=1,npe
          DO j = 1,npe
            IF (j==i) CYCLE
            h_norm(i) = h_norm(i) / (Xcoord(i) - Xcoord(j))
          END DO
        END DO

        DO i = 1,npe
          h(i,1:Ni) = h_norm(i)
          DO j=1,npe
            IF (j==i) CYCLE
            h(i,1:Ni) = h(i,1:Ni) * (x - Xcoord(j))
          END DO
        END DO

        DO i = 1,npe
          DO j = 1,npe
            Me(i,j) = SUM(w*h(i,1:Ni)*h(j,1:Ni))
          END DO
        END DO

      END IF

      DEALLOCATE(h)

      END SUBROUTINE get_Mmatrix

!*******************************!
! Subroutine to Make Element    !
! Matrix, Ke                    !
!*******************************!
      SUBROUTINE get_Kmatrix(Xcoord,Ni,Int_Method,npe,Ke,bnodes,x1,x2)
      USE pseudo1D_euler
      IMPLICIT NONE
      DOUBLE PRECISION :: xL,xR
      INTEGER :: Ni,Nib,Int_Method,npe
      INTEGER :: i,j,k,x1,x2
      DOUBLE PRECISION,DIMENSION(npe,npe) :: Ke
      DOUBLE PRECISION,DIMENSION(Ni) :: xt,wt
      DOUBLE PRECISION,DIMENSION(Ni+2) :: x,w
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: h,dh
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      DOUBLE PRECISION,DIMENSION(npe) :: h_norm
      DOUBLE PRECISION,DIMENSION(Ni+2) :: dh_temp
      INTEGER,DIMENSION(Neq*2,3) :: bnodes

      Nib = Ni + 2
      ALLOCATE( h(npe,Nib),dh(npe,Nib) )

      ! Element end points (set as in get_Mmatrix; they are used
      ! below but are not assigned anywhere in the listing)
      xL = Xcoord(1)
      xR = Xcoord(npe)

      IF (Int_Method == 1) THEN
        ! Legendre-Gauss Quadrature
        CALL LGWT(Ni,xL,xR,xt,wt)

        ! Add xL and xR to x, and corresponding zeros in weight, w
        DO i = 1,Ni
          x(i+1) = xt(Ni-i+1)
          w(i+1) = wt(Ni-i+1)   ! weights come from wt, not xt
        END DO
        x(1) = xL
        x(Nib) = xR
        w(1) = 0
        w(Nib) = 0

        h(1,1:Nib) = (xR-x)/(xR-xL)
        h(2,1:Nib) = (x-xL)/(xR-xL)
        dh(1,1:Nib) = -1/(xR-xL)
        dh(2,1:Nib) = 1/(xR-xL)

        DO i = 1,npe
          DO j = 1,npe
            Ke(i,j) = SUM(w*h(i,1:Nib)*dh(j,1:Nib))
          END DO
        END DO

      ELSEIF (Int_Method == 2) THEN
        ! Legendre-Gauss Quadrature with Lagrange Polynomials
        ! [ Calculation based on NIMPSI in polynomial.f ]
        CALL LGWT(Ni,xL,xR,xt,wt)

        ! Add xL and xR to x, and corresponding zeros in weight, w
        DO i = 1,Ni
          x(i+1) = xt(Ni-i+1)
          w(i+1) = wt(Ni-i+1)
        END DO
        x(1) = xL
        x(Nib) = xR
        w(1) = 0
        w(Nib) = 0

        !IF (x1==1) THEN
        !  WRITE(*,*) 'x@x1=',x
        !  WRITE(*,*) 'w@x1=',w
        !END IF

        ! Initialize Variables
        h(1:npe,1:Nib) = 0.0
        h_norm(1:npe) = 1.0
        dh(1:npe,1:Nib) = 0.0

        ! Create Lagrange Function
        DO i=1,npe
          DO j = 1,npe
            IF (j==i) CYCLE
            h_norm(i) = h_norm(i) / (Xcoord(i) - Xcoord(j))
          END DO
        END DO
        DO i = 1,npe
          h(i,1:Nib) = h_norm(i)
          DO j=1,npe
            IF (j==i) CYCLE
            h(i,1:Nib) = h(i,1:Nib) * (x - Xcoord(j))
          END DO
        END DO

        ! Create First Derivative of Lagrange Function
        DO i = 1,npe
          dh(i,1:Nib) = 0.0
          DO k = 1,npe
            IF (k==i) CYCLE
            dh_temp = h_norm(i)
            DO j = 1,npe
              IF (j==i) CYCLE
              IF (j==k) CYCLE
              dh_temp = dh_temp * (x - Xcoord(j))
            END DO
            dh(i,1:Nib) = dh(i,1:Nib) + dh_temp
          END DO
        END DO

        ! Create Element Matrix
        DO i = 1,npe
          DO j = 1,npe
            Ke(i,j) = SUM(w*h(i,1:Nib)*dh(j,1:Nib))
          END DO
        END DO

      END IF

      DEALLOCATE(h,dh)

      END SUBROUTINE get_Kmatrix

!*******************************!
! Subroutine to Make C Matrix   !
!                               !
!*******************************!
      SUBROUTINE get_Cmatrix(basis_type,XYZ,nodes,npe,Np,Ne,asize,
     &                       C,bnodes)
      USE pseudo1D_euler
      IMPLICIT NONE

#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"

      DOUBLE PRECISION :: xL,xR,x
      INTEGER :: basis_type
      INTEGER :: npe,asize,Np,Ne
      INTEGER :: i,j,k,ii,jj,kk,x1,x2
      INTEGER,DIMENSION(1) :: minl
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      ! Xnodes holds node numbers and is used as an array subscript,
      ! so it is declared INTEGER here
      INTEGER,DIMENSION(npe) :: Xnodes
      DOUBLE PRECISION,DIMENSION(Np) :: XYZ
      INTEGER,DIMENSION(Ne,npe) :: nodes
      DOUBLE PRECISION,DIMENSION(npe) :: h,dh
      INTEGER,DIMENSION(Neq*2,3) :: bnodes
      Mat C
      PetscErrorCode ierr

      CALL MatZeroEntries(C,ierr)

      DO i = 1,Ne

        DO k = 1,npe
          Xnodes(k) = nodes(i,k)
          Xcoord(k) = XYZ( Xnodes(k))
        END DO
        xL = Xcoord(1)
        xR = Xcoord(npe)

        ii = (i-1)*(npe-1)*Neq
        !WRITE(*,*) 'ii=',ii
        DO k = 1,npe
          DO j = 1,Neq
            jj = ii + (k-1)*Neq + j ! Global Node Number
            !WRITE(*,*) 'jj=',jj

            IF ( MINVAL(ABS(bnodes(:,1)-jj))==0) THEN
              minl = MINLOC(ABS(bnodes(:,1)-jj))

              IF (bnodes(minl(1),2)==0) THEN
                ! Dirichlet Boundary Conditions
                CALL MatSetValue(C,jj-1,jj-1,1.d0,ADD_VALUES,ierr)

              ELSEIF (bnodes(minl(1),2)==1) THEN
                ! Neumann Boundary Conditions

                !!! Left Side !!!
                IF ( bnodes(minl(1),1)==1 .OR.
     &               bnodes(minl(1),1)==2 .OR.
     &               bnodes(minl(1),1)==3 ) THEN

                  CALL evaluate_basis(basis_type,xL,npe,Xcoord,
     &                                .TRUE.,dh)

                  DO kk = 1,npe
                    CALL MatSetValue(C,jj-1,jj-1+Neq*(kk-1),
     &                               dh(kk),ADD_VALUES,ierr)
                  END DO

                !!! Right Side !!!
                ELSEIF ( bnodes(minl(1),1)==asize-2 .OR.
     &                   bnodes(minl(1),1)==asize-1 .OR.
     &                   bnodes(minl(1),1)==asize-0 ) THEN

                  CALL evaluate_basis(basis_type,xR,npe,Xcoord,
     &                                .TRUE.,dh)

                  DO kk = 1,npe
                    CALL MatSetValue(C,jj-1,jj-1-Neq*(kk-1),
     &                               dh(npe-kk+1),ADD_VALUES,ierr)
                  END DO

                END IF

              END IF

              !WRITE(*,*), 'ii=',ii,'minl=',minl(1)
            END IF

          END DO
        END DO
      END DO

      END SUBROUTINE get_Cmatrix

!*******************************!
! Subroutine to Make Element    !
! Matrix, Ve (Viscosity)        !
!*******************************!
      SUBROUTINE get_Vmatrix(Xcoord,Ni,Int_Method,npe,Ve,Ve2,Np,
     &                       bnodes,x1,x2)
      USE pseudo1D_euler
      IMPLICIT NONE
      DOUBLE PRECISION :: xL,xR
      INTEGER :: Ni,Int_Method,npe,Np
      INTEGER :: i,j,k,x1,x2
      DOUBLE PRECISION,DIMENSION(npe,npe) :: Ve,Ve2
      DOUBLE PRECISION,DIMENSION(Ni) :: x,w
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: h
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: dh
      DOUBLE PRECISION,DIMENSION(npe) :: h2,dh2
      DOUBLE PRECISION,DIMENSION(2) :: xx, dh_temp2
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      DOUBLE PRECISION,DIMENSION(npe) :: h_norm
      DOUBLE PRECISION,DIMENSION(Ni) :: dh_temp
      INTEGER,DIMENSION(Neq*2,3) :: bnodes

      ALLOCATE( h(npe,Ni), dh(npe,Ni) )

      ! Element end points (set as in get_Mmatrix; they are used
      ! below but are not assigned anywhere in the listing)
      xL = Xcoord(1)
      xR = Xcoord(npe)

      ! Zero Ve2 up front so that rows not set below are defined
      Ve2(:,:) = 0.0

      IF (Int_Method == 1) THEN
        ! Legendre-Gauss Quadrature
        CALL LGWT(Ni,xL,xR,x,w)

        dh(1,1:Ni) = -1/(xR-xL)
        dh(2,1:Ni) = 1/(xR-xL)

        DO i = 1,npe
          DO j = 1,npe
            Ve(i,j) = SUM(w*dh(i,1:Ni)*dh(j,1:Ni))
          END DO
        END DO

      ELSEIF (Int_Method == 2) THEN
        ! Legendre-Gauss Quadrature with Lagrange Polynomials
        ! [ Calculation based on NIMPSI in polynomial.f ]
        CALL LGWT(Ni,xL,xR,x,w)

        ! Initialize Variables
        h(1:npe,1:Ni) = 0.0
        h_norm(1:npe) = 1.0
        dh(1:npe,1:Ni) = 0.0

        ! Create Lagrange Function
        DO i=1,npe
          DO j = 1,npe
            IF (j==i) CYCLE
            h_norm(i) = h_norm(i) / (Xcoord(i) - Xcoord(j))
          END DO
        END DO

        DO i = 1,npe
          h(i,1:Ni) = h_norm(i)
          DO j=1,npe
            IF (j==i) CYCLE
            h(i,1:Ni) = h(i,1:Ni) * (x - Xcoord(j))
          END DO
        END DO

        ! Create First Derivative of Lagrange Function
        DO i = 1,npe
          dh(i,1:Ni) = 0.0
          DO k = 1,npe
            IF (k==i) CYCLE
            dh_temp = h_norm(i)
            DO j = 1,npe
              IF (j==i) CYCLE
              IF (j==k) CYCLE
              dh_temp = dh_temp * (x - Xcoord(j))
            END DO
            dh(i,1:Ni) = dh(i,1:Ni) + dh_temp
          END DO
        END DO

        ! Create Element Matrix
        DO i = 1,npe
          DO j = 1,npe
            Ve(i,j) = SUM(w*dh(i,1:Ni)*dh(j,1:Ni))
          END DO
        END DO

        ! Create the V2 element Matrix [h*dh]_omega
        ! (interior elements keep the zero Ve2 set above)
        IF (x1==1) THEN
          ! Left Boundary
          CALL evaluate_basis(0,xL,npe,Xcoord,.TRUE.,dh2)
          DO i = 1,npe
            Ve2(1,i) = dh2(i)
          END DO

        ELSEIF (x2==Np) THEN
          ! Right Boundary
          CALL evaluate_basis(0,xR,npe,Xcoord,.TRUE.,dh2)
          DO i = 1,npe
            Ve2(npe,i) = dh2(i)
          END DO

        END IF

      END IF

      DEALLOCATE(h,dh)

      END SUBROUTINE get_Vmatrix

!*******************************!
! Evaluate Basis Function       !
!                               !
!*******************************!
      SUBROUTINE evaluate_basis(basis_type,x,npe,Xcoord,dhflag,h)
      IMPLICIT NONE
      INTEGER :: basis_type
      INTEGER :: i,j,k,npe,poly
      DOUBLE PRECISION :: x,dh_temp,pn
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      DOUBLE PRECISION,DIMENSION(npe) :: h,dh
      DOUBLE PRECISION,DIMENSION(npe) :: h_norm
      LOGICAL :: dhflag

      ! Initialize Variables
      h(:) = 0.0
      h_norm(:) = 1.0
      dh(:) = 0.0

      IF (basis_type==0) THEN
        ! Create Lagrange Function
        DO i=1,npe
          DO j = 1,npe
            IF (j==i) CYCLE
            h_norm(i) = h_norm(i) / (Xcoord(i) - Xcoord(j))
          END DO
        END DO
        DO i = 1,npe
          h(i) = h_norm(i)
          DO j=1,npe
            IF (j==i) CYCLE
            h(i) = h(i) * (x - Xcoord(j))
          END DO
        END DO

        IF (dhflag) THEN
          ! Create First Derivative of Lagrange Function
          DO i = 1,npe
            dh(i) = 0.0
            DO k = 1,npe
              IF (k==i) CYCLE
              dh_temp = h_norm(i)
              DO j = 1,npe
                IF (j==i) CYCLE
                IF (j==k) CYCLE
                dh_temp = dh_temp * (x - Xcoord(j))
              END DO
              dh(i) = dh(i) + dh_temp
            END DO
          END DO

          h = dh
        END IF

      ELSEIF (basis_type==1) THEN

        ! Logical values are tested with .NOT. rather than ==
        IF (.NOT. dhflag) THEN
          ! Create Legendre Function
          CALL Legendre_Function(npe,x,Xcoord(1),Xcoord(npe),pn)
        ELSE
          ! Create First Derivative of Legendre Function
          CALL Legendre_Derivative(npe,x,Xcoord(1),Xcoord(npe),pn)
        END IF
        h = pn

      END IF

      END SUBROUTINE evaluate_basis


!*******************************!
! Compute Derivative at a Node  !
!                               !
!*******************************!
      SUBROUTINE compute_derivative(sf,N,nodes,npe,Np,Ne,qprime)
      USE data_struct
      USE pseudo1D_euler
      IMPLICIT NONE


      PetscErrorCode ierr
      TYPE(petsc_type) :: sf
      DOUBLE PRECISION,DIMENSION(npe) :: dprime
      DOUBLE PRECISION :: qprime,qval1,qval2,qval3
      INTEGER :: npe,Np,Ne,N,i,ii
      DOUBLE PRECISION,DIMENSION(npe) :: Xcoord
      INTEGER,DIMENSION(Ne,npe) :: nodes

      DO i = 1,npe
        Xcoord(i) = sf%XYZ( nodes(N,i) )
      END DO

      IF (N==1) THEN

        ! Compute Left Side Derivative
        CALL evaluate_basis(0,Xcoord(1),npe,Xcoord,.TRUE.,dprime)
        DO i = 1,npe
          ii = Neq*(i-1)
          CALL VecGetValues(sf%Q,1,ii,qval1,ierr)
          qval1 = qval1/area(1)
          !WRITE(*,*),'qval =',qval1
          qprime = qprime + qval1*dprime(i)
          !WRITE(*,*) 'q*a` =',qprime
        END DO

      ELSEIF (N==Ne) THEN

        ! Compute Right Side Derivative
        !CALL VecView(sf%Q,ierr)
        CALL evaluate_basis(0,Xcoord(npe),npe,Xcoord,.TRUE.,dprime)
        DO i = 1,npe
          ii = Neq*Np - Neq*(npe-(i-1))
          CALL VecGetValues(sf%Q,1,ii,qval1,ierr)
          qval1 = qval1/area(Np)
          !WRITE(*,*) 'qval1 =',qval1
          qprime = qprime + qval1*dprime(i)
        END DO

      END IF

      END SUBROUTINE compute_derivative


!*******************************!
! Compute Legendre Function     !
! at x with poly=npe-1          !
!*******************************!
      SUBROUTINE Legendre_Function(npe,xi,xL,xR,pn)
      IMPLICIT NONE

      INTEGER :: npe,poly
      DOUBLE PRECISION :: p0,p1,pn1,pn2,pn,xi,x,xL,xR
      INTEGER :: i

      ! Linear Transform to match [-1,1] interval
      x = (xL-2*xi+xR)/(xL-xR)
      !WRITE(*,*) 'xi=',xi
      !WRITE(*,*) 'x=',x

      poly = npe - 1

      p0 = 1.0
      p1 = x

      IF (poly == 0) THEN
        pn = p0
      ELSEIF (poly == 1) THEN
        pn = p1
      ELSE
        pn = 0.0
        pn1 = p1
        pn2 = p0
        DO i = 2,poly
          pn = (x * (2.0*i - 1)*pn1 - (i-1)*pn2 ) / (i)
          pn2 = pn1
          pn1 = pn
        END DO
      END IF

      END SUBROUTINE Legendre_Function

!*******************************!
! Compute Legendre Derivative   !
! at x with poly=npe-1          !
!*******************************!
      SUBROUTINE Legendre_Derivative(npe,x,xL,xR,dpn)
      IMPLICIT NONE

      INTEGER :: npe,poly
      DOUBLE PRECISION :: p1,p2,dp0,dp1,dpn,x,xL,xR
      INTEGER :: i

      poly = npe - 1

      dp0 = 0.0
      dp1 = 1.0

      IF (poly==0) THEN
        dpn = dp0
      ELSEIF (poly==1) THEN
        dpn = dp1
      ELSE
        CALL Legendre_Function(poly ,x,xL,xR,p1)
        CALL Legendre_Function(poly+1,x,xL,xR,p2)
        dpn = ( npe*x*p1 - npe*p2 ) / (1 - x*x)
      ENDIF

      END SUBROUTINE Legendre_Derivative

!*******************************!
! Create PETSc Identity Matrices!
!                               !
!*******************************!
      SUBROUTINE create_boundary_matricies(sf)
      USE data_struct
      USE pseudo1D_euler
      IMPLICIT NONE


      PetscErrorCode ierr
      INTEGER :: i,j
      DOUBLE PRECISION :: value
      TYPE(petsc_type) :: sf

      value = 1.0

      !! Create qI boundary matrix
      IF (sf%Qflg==0) THEN
        CALL MatZeroEntries(sf%qI,ierr)
      ELSEIF (sf%Qflg==1) THEN ! Interior Nodes "On"
        DO i = 1,(sf%asize-sf%Nbc)
          j = sf%inodes(i)
          CALL MatSetValues(sf%qI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Qflg==2) THEN ! Boundary Nodes "On"
        DO i = 1,sf%Nbc
          j = sf%bnodes(i,1)
          CALL MatSetValues(sf%qI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Qflg==3) THEN ! All Nodes "On"
        DO i = 1,sf%asize
          CALL MatSetValues(sf%qI,1,i-1,1,i-1,value,INSERT_VALUES,ierr)
        END DO
      END IF

      !! Create fI boundary matrix
      IF (sf%Fflg==0) THEN
        CALL MatZeroEntries(sf%fI,ierr)
      ELSEIF (sf%Fflg==1) THEN ! Interior Nodes "On"
        DO i = 1,(sf%asize-sf%Nbc)
          j = sf%inodes(i)
          CALL MatSetValues(sf%fI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Fflg==2) THEN ! Boundary Nodes "On"
        DO i = 1,sf%Nbc
          j = sf%bnodes(i,1)
          CALL MatSetValues(sf%fI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Fflg==3) THEN ! All Nodes "On"
        DO i = 1,sf%asize
          CALL MatSetValues(sf%fI,1,i-1,1,i-1,value,INSERT_VALUES,ierr)
        END DO
      END IF

      !! Create sI boundary matrix
      IF (sf%Sflg==0) THEN
        CALL MatZeroEntries(sf%sI,ierr)
      ELSEIF (sf%Sflg==1) THEN ! Interior Nodes "On"
        DO i = 1,(sf%asize-sf%Nbc)
          j = sf%inodes(i)
          CALL MatSetValues(sf%sI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Sflg==2) THEN ! Boundary Nodes "On"
        DO i = 1,sf%Nbc
          j = sf%bnodes(i,1)
          CALL MatSetValues(sf%sI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Sflg==3) THEN ! All Nodes "On"
        DO i = 1,sf%asize
          CALL MatSetValues(sf%sI,1,i-1,1,i-1,value,INSERT_VALUES,ierr)
        END DO
      END IF

      !! Create vI boundary matrix
      IF (sf%Vflg==0) THEN
        CALL MatZeroEntries(sf%vI,ierr)
      ELSEIF (sf%Vflg==1) THEN ! Interior Nodes "On"
        DO i = 1,(sf%asize-sf%Nbc)
          j = sf%inodes(i)
          CALL MatSetValues(sf%vI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Vflg==2) THEN ! Boundary Nodes "On"
        DO i = 1,sf%Nbc
          j = sf%bnodes(i,1)
          CALL MatSetValues(sf%vI,1,j-1,1,j-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Vflg==3) THEN ! All Nodes "On"
        DO i = 1,sf%asize
          CALL MatSetValues(sf%vI,1,i-1,1,i-1,value,INSERT_VALUES,ierr)
        END DO
      ELSEIF (sf%Vflg==4) THEN ! Only Rho*u "On"
        DO i = 1,sf%Np
          CALL MatSetValues(sf%vI,1,3*i-2,1,3*i-2,value,
     &                      INSERT_VALUES,ierr)
        END DO
      END IF

      CALL MatAssemblyBegin(sf%qI,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(sf%qI,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(sf%fI,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(sf%fI,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(sf%sI,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(sf%sI,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatAssemblyBegin(sf%vI,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(sf%vI,MAT_FINAL_ASSEMBLY,ierr)

      END SUBROUTINE create_boundary_matricies

!*******************************!
! Calculate Left Hand Side LHS  !
! Matrix                        !
!*******************************!
      SUBROUTINE get_LeftHandSide(LHS,sf,ierr)
      USE data_struct
      IMPLICIT NONE


      DOUBLE PRECISION :: Theta,Epsi,deltaT,asize,fill
      Mat LHS,tempM,tempM2
      TYPE(petsc_type) :: sf
      PetscErrorCode ierr

      asize = sf%asize
      deltaT = sf%deltaT
      Epsi = sf%Epsi
      Theta = sf%Theta

      !! Create Left Hand Side (LHS) Matrix
      ! Result: LHS = M - dT*Theta*(M*Z - K*A + C*Y - Epsi*V)

      ! LHS = 0
      CALL MatZeroEntries(LHS,ierr)

      ! tempM = qI*M
      fill = 1.0
      CALL MatMatMult(sf%qI,sf%M,MAT_INITIAL_MATRIX,fill,tempM,ierr)

      ! LHS = LHS + qI*M
      CALL MatAXPY(LHS,1.d0,tempM,DIFFERENT_NONZERO_PATTERN,ierr)
      CALL MatDestroy(tempM,ierr)

      ! LHS = LHS - deltaT*Theta*(sI*M*Z)
      fill = 1.0
      CALL MatMatMult(sf%sI,sf%M,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL MatMatMult(tempM,sf%Z,MAT_INITIAL_MATRIX,fill,tempM2,ierr)
      CALL MatAXPY(LHS,-deltaT*Theta,tempM2,
     &             DIFFERENT_NONZERO_PATTERN,ierr)
      CALL MatDestroy(tempM,ierr)
      CALL MatDestroy(tempM2,ierr)

      ! LHS = LHS + deltaT*Theta*(fI*K*A)
      CALL MatMatMult(sf%fI,sf%K,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL MatMatMult(tempM,sf%A,MAT_INITIAL_MATRIX,fill,tempM2,ierr)
      CALL MatAXPY(LHS,+deltaT*Theta,tempM2,
     &             DIFFERENT_NONZERO_PATTERN,ierr)
      CALL MatDestroy(tempM,ierr)
      CALL MatDestroy(tempM2,ierr)

      ! LHS = LHS - deltaT*Theta*(C*Y)
      CALL MatMatMult(sf%C,sf%Y,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL MatAXPY(LHS,-1.0*deltaT*Theta,tempM,
     &             DIFFERENT_NONZERO_PATTERN,ierr)
      CALL MatDestroy(tempM,ierr)

      !CALL MatView(tempM,PETSC_VIEWER_STDOUT_WORLD,ierr)

      ! LHS = LHS + Epsi*deltaT*Theta*(vI*V)
      IF (Epsi==0) THEN
        !Do Nothing
      ELSE
        CALL MatMatMult(sf%vI,sf%V,MAT_INITIAL_MATRIX,fill,tempM,ierr)
        CALL MatAXPY(LHS,+Epsi*deltaT*Theta,tempM,
     &               DIFFERENT_NONZERO_PATTERN,ierr)
        CALL MatDestroy(tempM,ierr)
      END IF

      RETURN
      END SUBROUTINE get_LeftHandSide

!*******************************!
! SNES Function Formation       !
!                               !
!*******************************!
      SUBROUTINE SNES_Function(snes,dQk,Rp,sf,ierr)
      USE data_struct
      USE pseudo1D_euler
      IMPLICIT NONE

#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"
#include "include/finclude/petscksp.h"
#include "include/finclude/petscsnes.h"

      SNES snes
      Mat tempM
      Vec Rp,dQk,tempV
      PetscErrorCode ierr
      INTEGER :: asize,Np
      DOUBLE PRECISION :: scalar,fill
      TYPE(petsc_type) :: sf

      asize = sf%asize
      Np = sf%Np

      CALL VecCreate(PETSC_COMM_WORLD,tempV,ierr)
      CALL VecSetSizes(tempV,PETSC_DECIDE,asize,ierr)
      CALL VecSetFromOptions(tempV,ierr)

      !! Calculate Non-linear Function
      ! Result: R = M*(dq) - theta*dt*X^(n+1) - (1-theta)*dt*X^n = 0
      ! X = M*s - K*f - C*b

      CALL VecCopy(sf%Q,sf%Qk,ierr)        ! Qk = Q
      CALL VecAXPY(sf%Qk,1.d0,dQk,ierr)    ! Qk = Qk + dQk

      ! tempM = qI*M
      fill = 1.0
      CALL MatMatMult(sf%qI,sf%M,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL MatMult(tempM,dQk,Rp,ierr)      ! Rp = qI*M*dQ
      CALL MatDestroy(tempM,ierr)
      !CALL VecView(Qk,PETSC_VIEWER_STDOUT_WORLD,ierr)

      !! Calc X
      scalar = -1.0
      CALL get_flux_petsc(Np,sf%Q,sf%F)             ! Update the Flux
      CALL get_source_petsc(Np,sf%Q,sf%S,sf%XYZ,IC) ! Update the Source
      CALL MatMatMult(sf%sI,sf%M,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL MatMult(tempM,sf%S,sf%X,ierr)   ! X = sI*M*s
      CALL MatDestroy(tempM,ierr)

      CALL MatMatMult(sf%fI,sf%K,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL VecCopy(sf%F,tempV,ierr)        ! tempV = f
      CALL VecScale(tempV,scalar,ierr)     ! tempV = -tempV
      CALL MatMultAdd(tempM,tempV,sf%X,sf%X,ierr) ! X = X - fI*K*f
      CALL MatDestroy(tempM,ierr)

      CALL set_boundary_condition(sf%Q,sf%B,sf%bnodes,Np,sf%Nbn,
     &                            sf%bcs,sf%Nbc,sf%npe)

      CALL MatMultAdd(sf%C,sf%B,sf%X,sf%X,ierr)   ! X = X + C*b

      scalar = -1.0*sf%Epse
      CALL MatMatMult(sf%vI,sf%V,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL VecCopy(sf%Q,tempV,ierr)        ! tempV = q
      CALL VecScale(tempV,scalar,ierr)     ! tempV = -tempV*Epse
      CALL MatMultAdd(tempM,tempV,sf%X,sf%X,ierr) ! X = X-Epse*vI*V*q
      CALL MatDestroy(tempM,ierr)

      !! Calc Xk
      scalar = -1.0
      CALL get_flux_petsc(Np,sf%Qk,sf%Fk)             ! Update the Flux
      CALL get_source_petsc(Np,sf%Qk,sf%Sk,sf%XYZ,IC) ! Update the Source
      CALL MatMatMult(sf%sI,sf%M,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      ! Use the source just evaluated at Qk (Sk), matching the use of
      ! Fk for the flux below
      CALL MatMult(tempM,sf%Sk,sf%Xk,ierr) ! Xk = sI*M*sk
      CALL MatDestroy(tempM,ierr)

      CALL MatMatMult(sf%fI,sf%K,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL VecCopy(sf%Fk,tempV,ierr)       ! tempV = fk
      CALL VecScale(tempV,scalar,ierr)     ! tempV = -tempV
      CALL MatMultAdd(tempM,tempV,sf%Xk,sf%Xk,ierr) ! Xk = Xk - fI*K*fk
      CALL MatDestroy(tempM,ierr)

      CALL set_boundary_condition(sf%Qk,sf%B,sf%bnodes,Np,sf%Nbn,
     &                            sf%bcs,sf%Nbc,sf%npe)

      CALL MatMultAdd(sf%C,sf%B,sf%Xk,sf%Xk,ierr) ! Xk = Xk + C*b

      scalar = -1.0*sf%Epsi
      CALL MatMatMult(sf%vI,sf%V,MAT_INITIAL_MATRIX,fill,tempM,ierr)
      CALL VecCopy(sf%Qk,tempV,ierr)       ! tempV = qk
      CALL VecScale(tempV,scalar,ierr)     ! tempV = -tempV*Epsi
      CALL MatMultAdd(tempM,tempV,sf%Xk,sf%Xk,ierr) ! Xk=Xk-Epsi*vI*V*qk
      CALL MatDestroy(tempM,ierr)

      ! Rp = Rp - Theta*dt*Xk
      scalar = -sf%Theta*sf%deltaT
      CALL VecAXPY(Rp,scalar,sf%Xk,ierr)

      ! Rp = Rp - (1-Theta)*dt*X
      scalar = -(1.d0-sf%Theta)*sf%deltaT
      CALL VecAXPY(Rp,scalar,sf%X,ierr)

      !CALL VecView(sf%Q,PETSC_VIEWER_STDOUT_WORLD,ierr)

      ! Assemble Residual Vector
      CALL VecAssemblyBegin(Rp,ierr)
      CALL VecAssemblyEnd(Rp,ierr)

      !CALL MatDestroy(tempM,ierr)
      CALL VecDestroy(tempV,ierr)

      END SUBROUTINE SNES_Function

!*******************************!
! SNES Jacobian Formation       !
!                               !
!*******************************!
      SUBROUTINE SNES_Jacobian(snes,dQk,Jp,Jpre,flag,sf,ierr)
      USE data_struct
      USE pseudo1D_euler
      IMPLICIT NONE

#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"
#include "include/finclude/petscksp.h"
#include "include/finclude/petscsnes.h"

      SNES snes
      Vec dQk
      Mat Jp,Jpre
      MatStructure flag
      MatReuse reuse
      PetscErrorCode ierr
      TYPE(petsc_type) :: sf

      CALL VecCopy(sf%Q,sf%Qk,ierr)
      CALL VecAXPY(sf%Qk,1.d0,dQk,ierr)
      CALL get_Jacobian(sf%Np,sf%Qk,sf%S,sf%B,sf%A,sf%Z,sf%Y,
     &                  sf%bcs,sf%Nbc,sf%npe)
      CALL get_LeftHandSide(Jp,sf,ierr)

      CALL MatAssemblyBegin(Jp,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(Jp,MAT_FINAL_ASSEMBLY,ierr)

      CALL MatCopy(Jp,Jpre,SAME_NONZERO_PATTERN,ierr)

      reuse = MAT_REUSE_MATRIX
      IF (sf%Mat_Solver==0) THEN
        CALL MatConvert(Jp ,MATSUPERLU,reuse,Jp ,ierr)
        CALL MatConvert(Jpre,MATSUPERLU,reuse,Jpre,ierr)
      END IF

      END SUBROUTINE SNES_Jacobian

!*******************************!
! Legendre-Gauss Quadrature for !
! definite integrals            !
!*******************************!
      SUBROUTINE LGWT(Ni,xL,xR,x,w)
      IMPLICIT NONE
      INTEGER :: Ni
      DOUBLE PRECISION :: xL,xR
      DOUBLE PRECISION,DIMENSION(Ni) :: x,w

      DOUBLE PRECISION, PARAMETER :: pi=3.141592653589793d0
      DOUBLE PRECISION, PARAMETER :: eps=2.2204d-16
      INTEGER :: N1,N2,N
      DOUBLE PRECISION :: M1,M2,M
      INTEGER :: i,j,k
      DOUBLE PRECISION :: xt
      DOUBLE PRECISION,DIMENSION(:), ALLOCATABLE :: xu,y,yt,Ld
      DOUBLE PRECISION,DIMENSION(:,:),ALLOCATABLE :: L,Lp

      N = Ni-1
      N1 = N + 1
      N2 = N + 2
      M = N
      M1 = N1
      M2 = N2

      ALLOCATE( xu(Ni),y(Ni),yt(Ni),Ld(Ni) )
      ALLOCATE( L(N1,N2),Lp(N1,N2) )

!! Create Vector of Length N1, from -1 to 1
      xt = 2.d0/N   ! real division; integer 2/N would truncate
      xu(1) = -1
      DO i = 2,N1
        xu(i) = xu(i-1)+xt
      END DO

!! Create Initial Guess Vector
      DO i = 1,Ni
        y(i) = COS( (2*(i-1)+1)*pi/(2*N+2) ) + (0.27/M1) *
     &         SIN( pi*xu(i)*M/M2)
      END DO

      yt(1:Ni) = 2

!! Newton iteration on the roots of the Legendre polynomial
      DO WHILE ( MAXVAL( ABS(y-yt) ) > eps )

        L(1:N1,1) = 1
        !Lp(1:N1,1) = 0

        L(1:N1,2) = y
        !Lp(1:N1,2) = 1

        DO k = 2,N1
          L(1:N1,k+1) = ( (2*k-1) * y * L(1:N1,k) - (k-1) *
     &                  L(1:N1,k-1) )/k
        END DO

        Ld = N2*( L(1:N1,N1) - y * L(1:N1,N2) ) / (1 - y**2)

        yt = y
        y = yt - L(1:N1,N2) / Ld

      END DO

!! Linear Map from [-1,1] to [xL,xR]
      x = ( xL*(1-y) + xR*(1+y) )/2

!! Compute the Weights
      w = ( (xR-xL) / ( (1-y**2) * Ld**2 ) ) * (M2/M1)**2

      DEALLOCATE(xu,y,yt,Ld,L,Lp)

      END SUBROUTINE LGWT
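As a cross-check on the LGWT routine above, the same Legendre-Gauss nodes and weights can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis code: the names `legendre`, `lgwt`, `tol`, and `max_iter` are illustrative, and the sketch runs a Newton iteration node by node with a simpler initial guess instead of the vectorized loop of the Fortran listing.

```python
import math


def legendre(n, y):
    """Return (P_n(y), P_n'(y)) for n >= 1 and |y| < 1."""
    p_prev, p = 1.0, y  # P_0 and P_1
    for k in range(2, n + 1):
        # Three-term recurrence: k P_k = (2k-1) y P_{k-1} - (k-1) P_{k-2}
        p_prev, p = p, ((2 * k - 1) * y * p - (k - 1) * p_prev) / k
    # Derivative identity: (y^2 - 1) P_n' = n (y P_n - P_{n-1})
    dp = n * (y * p - p_prev) / (y * y - 1.0)
    return p, dp


def lgwt(n, a, b, tol=1e-15, max_iter=100):
    """Return n Legendre-Gauss nodes and weights on [a, b] (hypothetical helper)."""
    nodes, weights = [], []
    for i in range(n):
        # Initial guess for the i-th root of P_n (roots come out in descending order)
        y = math.cos(math.pi * (i + 0.75) / (n + 0.5))
        for _ in range(max_iter):
            p, dp = legendre(n, y)
            dy = p / dp
            y -= dy  # Newton update
            if abs(dy) < tol:
                break
        _, dp = legendre(n, y)
        # Linear map from [-1, 1] to [a, b], with the weight scaled to match
        nodes.append(0.5 * (a * (1.0 - y) + b * (1.0 + y)))
        weights.append((b - a) / ((1.0 - y * y) * dp * dp))
    return nodes, weights
```

An n-point rule of this kind integrates polynomials up to degree 2n-1 exactly, so comparing `sum(w * x**4)` for a three-point rule against the analytic integral is a quick sanity test of either implementation.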

B.0.4 pseudo1D_euler.F

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Pseudo-1D Euler Equations Module                                     !
!                                                                      !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

      MODULE pseudo1d_euler
      USE data_struct
      IMPLICIT NONE

!!Set Some Constants
      INTEGER,PARAMETER :: Neq = 3           ! No. of Equations in System
      DOUBLE PRECISION,PARAMETER :: g = 1.4  ! Ratio of Sp. Heats

!! Other Variables to Calculate for this Eqn. Set
      INTEGER :: IC !(0 for Shock Tube, 1 for 1D-Nozzle)
      DOUBLE PRECISION :: Pi,Po,Mi,Mo,rhoi,ui,ci,ei,rhoo,uo,eo
      DOUBLE PRECISION,DIMENSION(:),ALLOCATABLE :: Pt
      DOUBLE PRECISION,DIMENSION(:),ALLOCATABLE :: Area

      CONTAINS

!******************************!
! Set the Initial Condition    !
!                              !
!******************************!
      SUBROUTINE get_initial_condition(IC,Np,nT,S,Qt,X)
      IMPLICIT NONE
      INTEGER :: Np,nT
      INTEGER :: e,IC
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: S,Qt
      DOUBLE PRECISION,DIMENSION(Np) :: X
      DOUBLE PRECISION :: rho1,rho2,eL,eR

      ALLOCATE( Area(Np) )
      ALLOCATE( Pt(Np) )

      !Mi = 0.10 ! Initial/Inlet Mach Number
      !Mo = 0.05
      !Pi = 1.00 ! Initial/Inlet Pressure
      !Po = 0.93 ! Outlet Pressure
      !Po = 1.0
      rhoi = 1.0
      rhoo = 1.0

!! Set initial density, velocity, energy, source and area for
!! each element

!! Shock Tube Problem
      IF (IC == 0) THEN

        rho1 = 1.0
        rho2 = 4.0

        DO e = 1,Np
          S(e,1) = 0
          S(e,2) = 0
          S(e,3) = 0
          Area(e) = 1

          IF (e <= Np/2) THEN
            Qt(e,1) = rho1
            Qt(e,2) = 0
            Qt(e,3) = Pi/(g-1)
          ELSEIF (e > Np/2) THEN
            Qt(e,1) = rho2
            Qt(e,2) = 0
            Qt(e,3) = rho2*Pi/(g-1)
          END IF

        END DO
      END IF

!! Pseudo-1D Nozzle Problem
      IF (IC == 1 .OR. IC == 3) THEN
        eL = Pi/(g-1) + 0.5*rhoi*Mi*Mi
        eR = Po/(g-1) + 0.5*rhoo*Mo*Mo

        DO e = 1,Np
          IF (IC == 1) THEN
            Area(e) = 1.398 + 0.347*TANH(0.8*X(e) - 4.0)
          ELSEIF (IC == 3) THEN
            Area(e) = 1.0
          ENDIF

          !! Density
          Qt(e,1) = ( (rhoo-rhoi)/(X(Np)-X(1))*X(e) + rhoi )*Area(e)

          !! Momentum
          !Qt(e,2) = 1.0 * Mi * Area(e)
          Qt(e,2) = ( ( Mo - Mi )/(X(Np)-X(1))*X(e) + Mi )*Area(e)

          !! Energy
          Qt(e,3) = ( ( eR - eL )/(X(Np)-X(1))*X(e) + eL )*Area(e)

          !! Source
          IF (IC == 1) THEN
            S(e,1) = 0
            S(e,2) = 0.2776 / ( COSH(0.8*X(e) - 4.0) )**2
            S(e,3) = 0
          ELSEIF (IC == 3) THEN
            S(e,1) = 0
            S(e,2) = 0
            S(e,3) = 0
          ENDIF

        END DO

        !! Set Source
        CALL get_pressure(Np,Qt)
        S(:,2) = Pt(:)*S(:,2)

        Pi = Pt(1)
        Po = Pt(Np)

      ELSEIF ( IC==2 ) THEN ! Initial Parameters From Anderson Book
        DO e = 1,Np
          Area(e) = 1 + 2.2*(X(e)-1.5)**2

          !! Density
          Qt(e,1) = (1 - 0.3146*X(e)) * Area(e)
          !! Energy
          Qt(e,3) = (1 - 0.2314*X(e)) * Area(e)
          !! Momentum
          !Qt(e,2) = Qt(e,1)*((0.1+1.09*X(e))*Qt(e,3)**(0.5))/Area(e)
          Qt(e,2) = Mi*Qt(e,1)

          S(e,1) = 0
          S(e,2) = 4.4*(X(e)-1.5)
          S(e,3) = 0

        END DO

        !! Set Source
        CALL get_pressure(Np,Qt)
        S(:,2) = Pt(:)*S(:,2)

        Pi = Pt(1)
        Po = Pt(Np)

      END IF

      ! Set Inlet Fixed Conditions
      rhoi = Qt(1,1) / Area(1)   ! Inlet Density
      ci = SQRT( g * Pi / rhoi )
      ui = Qt(1,2) / Qt(1,1)
      ei = Qt(1,3) / Area(1)     ! Inlet Energy

      ! Set Outlet Condition(s)
      rhoo = Qt(Np,1) / Area(Np)
      !uo = Qt(Np,2) / Qt(Np,1)
      uo = Mo
      eo = Qt(Np,3) / Area(Np)

      END SUBROUTINE get_initial_condition

!*******************************!
! Set Boundary Nodes            !
!                               !
!*******************************!
      SUBROUTINE set_bnodes(bnodes,Nbn,asize,bc_input)
      USE data_struct
      IMPLICIT NONE
      INTEGER,DIMENSION(2*Neq,3) :: bnodes
      INTEGER,DIMENSION(2*Neq,2) :: bc_input
      INTEGER :: Nbn,asize

      bnodes(:,:) = 0

!! Specify Boundary Condition Nodes
      bnodes(1,1) = 1
      bnodes(2,1) = 2
      bnodes(3,1) = 3
      bnodes(4,1) = asize-2
      bnodes(5,1) = asize-1
      bnodes(6,1) = asize-0
      Nbn = 6

!! Specify Type of BC on Nodes
!! (0 = Dirichlet, 1=Neumann, 2=Natural)
      bnodes(1,2) = bc_input(1,1)
      bnodes(2,2) = bc_input(2,1)
      bnodes(3,2) = bc_input(3,1)
      bnodes(4,2) = bc_input(4,1)
      bnodes(5,2) = bc_input(5,1)
      bnodes(6,2) = bc_input(6,1)

!! Specify the Variable to Set
!! (1 = rho, 2=rhou, 3=e, 4=u, 5=p)
      bnodes(1,3) = bc_input(1,2)
      bnodes(2,3) = bc_input(2,2)
      bnodes(3,3) = bc_input(3,2)
      bnodes(4,3) = bc_input(4,2) ! 1 = rho
      bnodes(5,3) = bc_input(5,2) ! 2 = rho*u, and 4 = u
      bnodes(6,3) = bc_input(6,2) ! 3 = e, and 5 = p

      END SUBROUTINE set_bnodes

!*******************************!
! Set Boundary Condition(s)     !
!                               !
!*******************************!
      SUBROUTINE set_boundary_condition(Qp,Bp,bnodes,Np,Nbn,bcs,Nbc,npe)
      USE data_struct
      IMPLICIT NONE

      Vec Bp,Qp
      PetscErrorCode ierr
      INTEGER :: i,j
      INTEGER :: Np,Nbn,Nbc,asize,npe
      DOUBLE PRECISION :: nL,nR
      DOUBLE PRECISION :: rhoL,rhouL,uL,eL,pL
      DOUBLE PRECISION :: rhoR,rhouR,uR,eR,pR
      DOUBLE PRECISION :: rhoA,rhouA,eA,rho,rhou,u,e
      DOUBLE PRECISION :: rhouS,rhoS,eS
      INTEGER,DIMENSION(2*Neq,3) :: bnodes
      INTEGER,DIMENSION(2*Neq,3) :: bcs
      DOUBLE PRECISION :: u1,p1,p2,s1,c1,c2,r1,r2

      asize = Neq*Np

!! This sets q amplitudes for the Neumann BCs for rho, rhou
!! and e
      CALL VecCopy(Qp,Bp,ierr)

!! This sets values for the other BCs
      DO i = 1,Nbc

        !!! LEFT SIDE !!!
        IF (bcs(i,1) == 1 .AND. bcs(i,2) == 0) THEN
          ! Density (Rho) Dirichlet BC on the left
          CALL VecGetValues(Qp,1,bcs(i,1)-1,rhoA,ierr)
          rho = rhoA / Area(1)
          rhoL = (rho - rhoi)*Area(1)
          CALL VecSetValue(Bp,bcs(i,1)-1,rhoL,INSERT_VALUES,ierr)

        ELSEIF (bcs(i,1) == 2 .AND. bcs(i,2) == 0) THEN
          ! Momentum (Rho*u) or Velocity (u) Dirichlet BC on the left

          IF (bcs(i,3) == 2) THEN
            ! Momentum (Rho*u) Dirichlet on the Left
            CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
            rhou = rhouA / Area(1)
            rhouL = (rhou - rhoi*ui) * Area(1)
            CALL VecSetValue(Bp,bcs(i,1)-1,rhouL,INSERT_VALUES,ierr)

          ELSE IF (bcs(i,3) == 4) THEN
            ! Velocity (u) Dirichlet on the Left
            CALL VecGetValues(Qp,1,bcs(i,1)-2,rhoA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
            u = rhouA / rhoA
            uL = ( u - ui )
            CALL VecSetValue(Bp,bcs(i,1)-1,uL,INSERT_VALUES,ierr)

          END IF

        ELSEIF (bcs(i,1)==2 .AND. bcs(i,2)==1 .AND.
     &          bcs(i,3)==4) THEN

          ! Neumann BC for Velocity (u) on the Left
          CALL VecGetValues(Qp,1,bcs(i,1)-2,rhoA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
          rho = rhoA / Area(1)
          rhou = rhouA / Area(1)
          DO j = 1,npe
            CALL VecGetValues(Qp,1,bcs(i,1)-2+Neq*(j-1),rhoS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1+Neq*(j-1),rhouS,ierr)
            rhoS = rhoS / Area(1)
            rhouS = rhouS / Area(1)
            nL = rhouS/rho - (rhou/rho**2)*rhoS
            CALL VecSetValue(Bp,bcs(i,1)-1+Neq*(j-1),nL,
     &                       INSERT_VALUES,ierr)
          END DO

        ELSEIF (bcs(i,1) == 3 .AND. bcs(i,2) == 0) THEN
          ! Energy (e) or Pressure (p) Dirichlet BC on the Left

          IF (bcs(i,3) == 3) THEN
            ! Energy (e) Dirichlet on the Left
            CALL VecGetValues(Qp,1,bcs(i,1)-1,eA,ierr)
            e = eA / Area(1)
            eL = (e - ei) * Area(1)
            CALL VecSetValue(Bp,bcs(i,1)-1,eL,INSERT_VALUES,ierr)

          ELSE IF (bcs(i,3) == 5) THEN
            ! Pressure (p) Dirichlet on the Left
            CALL VecGetValues(Qp,1,bcs(i,1)-3, rhoA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-2,rhouA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1, eA,ierr)
            rho = rhoA / Area(1)
            rhou = rhouA / Area(1)
            e = eA / Area(1)

            pL = (g-1)*(e - 0.5*rhou*rhou/rho) - Pi
            CALL VecSetValue(Bp,bcs(i,1)-1,pL,INSERT_VALUES,ierr)

          END IF

        ELSEIF (bcs(i,1) == 3 .AND. bcs(i,2) == 1 .AND.
     &          bcs(i,3)==5) THEN

          ! Neumann BC for Pressure (p) on the Left
          CALL VecGetValues(Qp,1,bcs(i,1)-3, rhoA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-2,rhouA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-1, eA,ierr)
          rho = rhoA / Area(1)
          rhou = rhouA / Area(1)
          e = eA / Area(1)
          DO j = 1,npe
            CALL VecGetValues(Qp,1,bcs(i,1)-3+Neq*(j-1), rhoS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-2+Neq*(j-1),rhouS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1+Neq*(j-1), eS,ierr)
            rhoS = rhoS / Area(1)
            rhouS = rhouS / Area(1)
            eS = eS / Area(1)
            nL = (g-1)*(eS - rhou/rho*rhouS +
     &           0.5*rhou**2/rho**2*rhoS)
            CALL VecSetValue(Bp,bcs(i,1)-1+Neq*(j-1),nL,
     &                       INSERT_VALUES,ierr)
          END DO

        !!! RIGHT SIDE !!!
        ELSEIF (bcs(i,1) == asize-2 .AND. bcs(i,2) == 0) THEN
          ! Density (Rho) Dirichlet BC on the Right
          CALL VecGetValues(Qp,1,bcs(i,1)-1,rhoA,ierr)
          rho = rhoA / Area(Np)
          rhoR = (rho - rhoo) * Area(Np)
          CALL VecSetValue(Bp,bcs(i,1)-1,rhoR,INSERT_VALUES,ierr)

        ELSEIF (bcs(i,1) == asize-1 .AND. bcs(i,2) == 0) THEN
          ! Momentum (Rho*u) or Velocity (u) Dirichlet BC on the Right

          IF (bcs(i,3) == 2) THEN
            ! Momentum (Rho*u) Dirichlet BC
            CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
            rhou = rhouA / Area(Np)
            rhouR = (rhou - rhoo*uo) * Area(Np)
            CALL VecSetValue(Bp,bcs(i,1)-1,rhouR,INSERT_VALUES,ierr)

          ELSEIF (bcs(i,3) == 4) THEN
            ! Velocity (u) Dirichlet BC
            CALL VecGetValues(Qp,1,bcs(i,1)-2,rhoA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
            u = rhouA / rhoA
            uR = ( u - uo )
            CALL VecSetValue(Bp,bcs(i,1)-1,uR,INSERT_VALUES,ierr)

          END IF

        ELSEIF (bcs(i,1) == asize-1 .AND. bcs(i,2) == 1 .AND.
     &          bcs(i,3)==4) THEN

          ! Neumann BC for Velocity (u)
          CALL VecGetValues(Qp,1,bcs(i,1)-2,rhoA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-1,rhouA,ierr)
          rho = rhoA / Area(Np)
          rhou = rhouA / Area(Np)

          DO j = 1,npe
            CALL VecGetValues(Qp,1,bcs(i,1)-2-Neq*(j-1),rhoS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1-Neq*(j-1),rhouS,ierr)
            rhoS = rhoS / Area(Np-(j-1))
            rhouS = rhouS / Area(Np-(j-1))
            nR = rhouS/rho - (rhou/rho**2)*rhoS
            CALL VecSetValue(Bp,bcs(i,1)-1-Neq*(j-1),nR,
     &                       INSERT_VALUES,ierr)
          END DO

        ELSEIF (bcs(i,1) == asize-0 .AND. bcs(i,2) == 0) THEN
          ! Energy (e) or Pressure (p) BC on the right

          IF (bcs(i,3) == 3) THEN
            ! Dirichlet BC for Energy (e)
            CALL VecGetValues(Qp,1,bcs(i,1)-1,eA,ierr)
            e = eA / Area(Np)
            eR = (e - eo) * Area(Np)
            CALL VecSetValue(Bp,bcs(i,1)-1,eR,INSERT_VALUES,ierr)

          ELSEIF (bcs(i,3) == 5) THEN
            ! Dirichlet BC for Pressure (p)
            CALL VecGetValues(Qp,1,bcs(i,1)-3, rhoA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-2,rhouA,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1, eA,ierr)
            rho = rhoA / Area(Np)
            rhou = rhouA / Area(Np)
            e = eA / Area(Np)

            pR = (g-1)*(e - 0.5*rhou*rhou/rho) - Po
            CALL VecSetValue(Bp,bcs(i,1)-1,pR,INSERT_VALUES,ierr)

          END IF

        ELSEIF (bcs(i,1)==asize-0 .AND. bcs(i,2)==1 .AND.
     &          bcs(i,3)==5) THEN

          ! Neumann BC for Pressure (p)
          !WRITE(*,*) 'Neumann Pressure BC'
          CALL VecGetValues(Qp,1,bcs(i,1)-3, rhoA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-2,rhouA,ierr)
          CALL VecGetValues(Qp,1,bcs(i,1)-1, eA,ierr)
          rho = rhoA / Area(Np)
          rhou = rhouA / Area(Np)
          e = eA / Area(Np)

          DO j = 1,npe
            CALL VecGetValues(Qp,1,bcs(i,1)-3-Neq*(j-1), rhoS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-2-Neq*(j-1),rhouS,ierr)
            CALL VecGetValues(Qp,1,bcs(i,1)-1-Neq*(j-1), eS,ierr)
            rhoS = rhoS / Area(Np-(j-1))
            rhouS = rhouS / Area(Np-(j-1))
            eS = eS / Area(Np)
            nR = (g-1)*(eS - rhou/rho*rhouS +
     &           0.5*rhou**2/rho**2*rhoS)
            CALL VecSetValue(Bp,bcs(i,1)-1-Neq*(j-1),nR,
     &                       INSERT_VALUES,ierr)
          END DO

        END IF
      END DO

      END SUBROUTINE set_boundary_condition

!******************************!
! Write the Initial Condition  !
! to a CSV file                !
!******************************!
      SUBROUTINE write_initial_condition(Qt,S,XYZ,Np)
      USE csv_file
      IMPLICIT NONE

      INTEGER :: Np
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt,S
      DOUBLE PRECISION,DIMENSION(Np) :: XYZ

      CALL get_pressure(Np,Qt)

!! Write Data Out to CSV file
      CALL csv_write(1, XYZ )                     ! X Coords
      CALL csv_write(1, Qt(1:Np,1) / Area(:) )    ! rho
      CALL csv_write(1, Qt(1:Np,2)/Qt(1:Np,1) )   ! u
      CALL csv_write(1, Qt(1:Np,3) / Area(:) )    ! e
      CALL csv_write(1, Pt )                      ! P

      END SUBROUTINE write_initial_condition

!*******************************!
! Get Time Step                 !
! and fastest wave speed        !
!*******************************!
      SUBROUTINE get_timestep(Np,Qt,XYZ,CFL,Cs_max,dTi)
      IMPLICIT NONE

      INTEGER :: i,Np
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt
      DOUBLE PRECISION,DIMENSION(Np) :: XYZ
      DOUBLE PRECISION :: dTi,Cs_max,u,rho,c,csi,dx,CFL,eg,pr
      DOUBLE PRECISION,DIMENSION(Np-1) :: dT,Cs

      DO i = 1,Np-1
        u = Qt(i,2) / Qt(i,1)
        rho = Qt(i,1) / Area(i)
        eg = Qt(i,3) / Area(i)
        pr = (g-1)*(eg - 0.5*rho*u*u)
        c = SQRT( g*pr/rho )
        csi = c + ABS(u)
        dx = XYZ(i+1) - XYZ(i)
        dT(i) = CFL*dx/csi
        Cs(i) = c
      END DO
      dTi = MINVAL(dT)
      Cs_max = MAXVAL(Cs)   ! largest sound speed over the mesh

      END SUBROUTINE get_timestep

!*******************************!
! Calculate Flux, F             !
!                               !
!*******************************!
      SUBROUTINE get_flux(Np,Qt,F)
      IMPLICIT NONE

      INTEGER :: i,Np
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt,F
      DOUBLE PRECISION :: u,pA

!! Calculate Flux
      DO i = 1,Np

        u = Qt(i,2) / Qt(i,1)
        pA = (g-1)*( Qt(i,3) - 0.5*Qt(i,2)*u )

        F(i,1) = Qt(i,2)
        F(i,2) = Qt(i,2)*u + pA
        F(i,3) = u*(Qt(i,3) + pA)

      END DO

      END SUBROUTINE get_flux


!*******************************!
! Calculate Flux, F             !
! with PETSc Input              !
!*******************************!
      SUBROUTINE get_flux_petsc(Np,Qk,Fk)
      USE data_struct
      IMPLICIT NONE

      INTEGER :: i,Np
      Vec Qk,Fk
      PetscErrorCode ierr
      DOUBLE PRECISION :: u,pA,q1,q2,q3,f1,f2,f3

!! Calculate Flux

      CALL VecZeroEntries(Fk,ierr)

      DO i = 1,Np
        CALL VecGetValues(Qk,1,3*i-3,q1,ierr)
        CALL VecGetValues(Qk,1,3*i-2,q2,ierr)
        CALL VecGetValues(Qk,1,3*i-1,q3,ierr)

        u = q2 / q1
        pA = (g-1)*( q3 - 0.5*q2*u)

        f1 = q2
        f2 = q2*u + pA
        f3 = u*(q3 + pA)

        CALL VecSetValue(Fk,3*i-3,f1,INSERT_VALUES,ierr)
        CALL VecSetValue(Fk,3*i-2,f2,INSERT_VALUES,ierr)
        CALL VecSetValue(Fk,3*i-1,f3,INSERT_VALUES,ierr)

      END DO

      END SUBROUTINE get_flux_petsc

!*******************************!
! Create Flux Jacobians         !
! dF/dQ and dF/dS               !
!*******************************!
      SUBROUTINE get_Jacobian(Np,Qp,Sp,Bp,Ap,Zp,Yp,bcs,Nbc,npe)
      IMPLICIT NONE

      INTEGER :: Np,npe
      Mat Ap,Zp,Yp
      Vec Qp,Sp,Bp   ! accessed with VecGetValues, so declared as Vec
      DOUBLE PRECISION :: u,rho,rhou,rhoA,rhouA,e,eA,p,s
      DOUBLE PRECISION :: rhoS,rhouS,eS,uS
      INTEGER, DIMENSION(2*Neq,3) :: bcs
      INTEGER i,j,ii,jj,kk,Nbc,Side
      PetscErrorCode ierr

      DO i = 1,Np
        ii = Neq*i

        CALL VecGetValues(Qp,1,ii-3,rhoA ,ierr)
        CALL VecGetValues(Qp,1,ii-2,rhouA,ierr)
        CALL VecGetValues(Qp,1,ii-1,eA   ,ierr)
        CALL VecGetValues(Sp,1,ii-2,s    ,ierr)
        u = rhouA/rhoA
        rho = rhoA/Area(i)
        e = eA/Area(i)
        p = (g-1)*(e - 0.5*rho*u**2)

        !! Create Flux Jacobian
        CALL MatSetValue(Ap,ii-3,ii-2,
     &       1.d0,INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-2,ii-3,
     &       ((g-1)*(u**2)/2-u**2),INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-2,ii-2,
     &       u*(2-(g-1)),INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-2,ii-1,
     &       (g-1),INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-1,ii-3,
     &       u*((g-1)*(u**2)/2-(e+p)/rho),INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-1,ii-2,
     &       ((e+p)/rho-(u**2)*(g-1)),INSERT_VALUES,ierr)
        CALL MatSetValue(Ap,ii-1,ii-1,
     &       u*(1+(g-1)),INSERT_VALUES,ierr)

        !! Create Source Jacobian, Zp
        CALL MatSetValue(Zp,ii-2,ii-3,
     &       (s/p)*(g-1)*0.5*u**2/Area(i),INSERT_VALUES,ierr)
        CALL MatSetValue(Zp,ii-2,ii-2,
     &       -(s/p)*u*(g-1)/Area(i),INSERT_VALUES,ierr)
        CALL MatSetValue(Zp,ii-2,ii-1,
     &       (s/p)*(g-1)/Area(i),INSERT_VALUES,ierr)
      END DO

!! Create Boundary Jacobian, Yp

      ! Make Yp an Identity Matrix
      DO i = 1,Np
        ii = Neq*i
        CALL MatSetValue(Yp,ii-3,ii-3,1.d0,INSERT_VALUES,ierr)
        CALL MatSetValue(Yp,ii-2,ii-2,1.d0,INSERT_VALUES,ierr)
        CALL MatSetValue(Yp,ii-1,ii-1,1.d0,INSERT_VALUES,ierr)
      END DO

      ! Fill Yp With Correct Boundary Conditions
      DO i = 1,Nbc

        !!! Dirichlet Cases !!!
        IF ( bcs(i,1)==1 .AND. bcs(i,2)==0 .OR.
     &       bcs(i,1)==Np*Neq-2 .AND. bcs(i,2)==0 ) THEN
          ! Density (Rho)
          ii = bcs(i,1)
          CALL MatSetValue(Yp,ii-1,ii-1,1.d0,INSERT_VALUES,ierr)

        ELSEIF ( bcs(i,1)==2 .AND. bcs(i,2)==0 .OR.
     &           bcs(i,1)==Np*Neq-1 .AND. bcs(i,2)==0 ) THEN

          IF ( bcs(i,3)==2 ) THEN
            !! Momentum (Rho*u)
            ii = bcs(i,1)
            CALL MatSetValue(Yp,ii-1,ii-1,1.d0,
     &           INSERT_VALUES,ierr)
          ELSEIF ( bcs(i,3)==4 ) THEN
            !! Velocity (u)
            ii = bcs(i,1)
            IF ( bcs(i,1)==2 ) THEN
              Side = 1
            ELSE
              Side = Np
            END IF
            CALL VecGetValues(Qp,1,ii-2,rhoA ,ierr)
            CALL VecGetValues(Qp,1,ii-1,rhouA,ierr)
            u = rhouA/rhoA
            rho = rhoA/Area(Side)

            CALL MatSetValue(Yp,ii-1,ii-2,-rho*u/rho**2,
     &           INSERT_VALUES,ierr)
            CALL MatSetValue(Yp,ii-1,ii-1,1.d0/rho,INSERT_VALUES,ierr)
            CALL MatSetValue(Yp,ii-1,ii-0,0.d0,INSERT_VALUES,ierr)
          ENDIF


          IF ( bcs(i,3)==3 ) THEN
            !! Energy (e)
            ii = bcs(i,1)
            CALL MatSetValue(Yp,ii-1,ii-1,1.d0
     &           ,INSERT_VALUES,ierr)
          ELSEIF ( bcs(i,3)==5 ) THEN
            !! Pressure (P)
            ii = bcs(i,1)
            CALL VecGetValues(Qp,1,ii-3,rhoA ,ierr)
            CALL VecGetValues(Qp,1,ii-2,rhouA,ierr)
            u = rhouA/rhoA

            CALL MatSetValue(Yp,ii-1,ii-3,
     &           0.5*(g-1)*u**2,INSERT_VALUES,ierr)
            CALL MatSetValue(Yp,ii-1,ii-2,
     &           -(g-1)*u,INSERT_VALUES,ierr)
            CALL MatSetValue(Yp,ii-1,ii-1,
     &           (g-1),INSERT_VALUES,ierr)
          END IF

        !!! Neumann Cases !!!
        ELSEIF ( bcs(i,1)==1 .AND. bcs(i,2)==1 .OR.
     &           bcs(i,1)==Np*Neq-2 .AND. bcs(i,2)==1 ) THEN
          !! Density (rho)
          ii = bcs(i,1)
          DO j = 1,npe
            IF ( bcs(i,1) == 1) THEN
              Side = 1
              jj = +Neq*(j-1)
            ELSE
              Side = Np
              jj = -Neq*(j-1)
            END IF
          END DO


          IF ( bcs(i,3) == 2 ) THEN
            !! Momentum (rho*u)
            DO j = 1,npe
              ii = bcs(i,1)
              IF ( bcs(i,1) == 2 ) THEN
                Side = 1
                jj = +Neq*(j-1)


              END IF
            END DO

          ELSEIF (bcs(i,3) == 4) THEN
            !! Velocity (u)
            ii = bcs(i,1)
            IF ( bcs(i,1) == 2 ) THEN
              Side = 1
            ELSE
              Side = Np
            END IF

            CALL VecGetValues(Qp,1,ii-2,rhoA ,ierr)
            CALL VecGetValues(Qp,1,ii-1,rhouA,ierr)
            rhou = rhouA / Area(Side)
            rho = rhoA / Area(Side)

            DO j = 1,npe
              IF ( bcs(i,1)==2 ) THEN
                jj = +Neq*(j-1) ! Left Side
              ELSE
                jj = -Neq*(j-1) ! Right Side
              END IF

              CALL VecGetValues(Qp,1,bcs(i,1)-2+jj, rhoS,ierr)
              CALL VecGetValues(Qp,1,bcs(i,1)-1+jj,rhouS,ierr)
              rhoS = rhoS / Area(Side-(j-1))
              rhouS = rhouS / Area(Side-(j-1))

              CALL MatSetValue(Yp,ii-1+jj,ii-2+jj,
     &             -rhou/rho**2/Area(Side),INSERT_VALUES,ierr)
              CALL MatSetValue(Yp,ii-1+jj,ii-1+jj,
     &             1.d0/rho/Area(Side),INSERT_VALUES,ierr)
              CALL MatSetValue(Yp,ii-1+jj,ii-0+jj,
     &             0.d0,INSERT_VALUES,ierr)
            END DO
          END IF


          IF ( bcs(i,3) == 3 ) THEN
            !! Energy (e)
            DO j = 1,npe
              ii = bcs(i,1)
              IF ( bcs(i,1) == 3 ) THEN
                Side = 1
                jj = +Neq*(j-1)


              END IF
            END DO

          ELSEIF (bcs(i,3) == 5) THEN
            !! Pressure (p)
            ii = bcs(i,1)
            IF (bcs(i,1)==3) THEN
              Side = 1
            ELSE
              Side = Np
            END IF

            CALL VecGetValues(Qp,1,ii-3,rhoA ,ierr)
            CALL VecGetValues(Qp,1,ii-2,rhouA,ierr)
            rho = rhoA / Area(Side)
            rhou = rhouA / Area(Side)

            DO j = 1,npe
              IF ( bcs(i,1)==3 ) THEN
                jj = +Neq*(j-1) ! Left Side
                Side = 1
              ELSE
                jj = -Neq*(j-1) ! Right Side
                Side = Np
              END IF

              CALL VecGetValues(Qp,1,bcs(i,1)-3+jj, rhoS,ierr)
              CALL VecGetValues(Qp,1,bcs(i,1)-2+jj,rhouS,ierr)
              rhoS = rhoS / Area(Side-(j-1))
              rhouS = rhouS / Area(Side-(j-1))

              CALL MatSetValue(Yp,ii-1+jj,ii-3+jj,
     &  (g-1)*0.5*rhou**2/rho**2/Area(side-(j-1)),INSERT_VALUES,ierr)
              CALL MatSetValue(Yp,ii-1+jj,ii-2+jj,
     &             -(g-1)*rhou/rho/Area(Side-(j-1)),INSERT_VALUES,ierr)
              CALL MatSetValue(Yp,ii-1+jj,ii-1+jj,
     &             (g-1)/Area(side-(j-1)),INSERT_VALUES,ierr)
            END DO
          END IF

        END IF

      END DO

      CALL MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyBegin(Zp,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(Zp,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyBegin(Yp,MAT_FINAL_ASSEMBLY,ierr)
      CALL MatAssemblyEnd(Yp,MAT_FINAL_ASSEMBLY,ierr)

      !CALL MatView(Yp,PETSC_VIEWER_STDOUT_WORLD,ierr)

      END SUBROUTINE get_Jacobian

!*******************************!
! Other Calculations            !
! Pressure Calculation          !
!*******************************!
      SUBROUTINE other_calculations(Np,Qt,t)
      IMPLICIT NONE

      INTEGER :: Np,t
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt

!! Calculate Pressure
      !CALL get_pressure(Np,Qt)

      END SUBROUTINE other_calculations

!*******************************!
! Calculate the Pressure        !
!                               !
!*******************************!
      SUBROUTINE get_pressure(Np,Qt)
      IMPLICIT NONE

      INTEGER :: Np
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt

      Pt = (g-1)*(Qt(1:Np,3)-0.5*(Qt(1:Np,2)**2/Qt(1:Np,1)) )
     &     / Area(1:Np)

      END SUBROUTINE get_pressure

!*******************************!
! Update the Source             !
!                               !
!*******************************!
      SUBROUTINE get_source(Np,Qt,St,Xt,IC)
      IMPLICIT NONE

      INTEGER :: Np,IC
      DOUBLE PRECISION,DIMENSION(Np) :: Xt
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt,St

      CALL get_pressure(Np,Qt)
      IF (IC==0) THEN
        St(:,1) = 0
        St(:,2) = 0
        St(:,3) = 0
      ELSEIF (IC==1) THEN
        St(:,1) = 0
        St(:,2) = Pt*0.2776 / (COSH(0.8*Xt(:) - 4.0))**2
        St(:,3) = 0
      ELSEIF (IC==2) THEN
        St(:,1) = 0
        St(:,2) = Pt*4.4*(Xt(:)-1.5)
        St(:,3) = 0
      ELSEIF (IC==3) THEN
        St(:,1) = 0
        St(:,2) = 0
        St(:,3) = 0
      ENDIF

      END SUBROUTINE get_source

!*******************************!
! Update the Source w/ PETSc    !
!                               !
!*******************************!
      SUBROUTINE get_source_petsc(Np,Qk,Sk,Xt,IC)
      IMPLICIT NONE

      INTEGER :: Np,i,IC
      PetscErrorCode ierr
      Vec Qk,Sk
      DOUBLE PRECISION,DIMENSION(Np) :: Xt
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt,St

      DO i = 1,Np
        CALL VecGetValues(Qk,1,3*i-3,Qt(i,1),ierr)
        CALL VecGetValues(Qk,1,3*i-2,Qt(i,2),ierr)
        CALL VecGetValues(Qk,1,3*i-1,Qt(i,3),ierr)
      END DO
      CALL get_pressure(Np,Qt)

      IF (IC==0) THEN
        St(:,1) = 0
        St(:,2) = 0
        St(:,3) = 0
      ELSEIF (IC==1) THEN
        St(:,1) = 0
        St(:,2) = Pt*0.2776 / (COSH(0.8*Xt(:) - 4.0))**2
        St(:,3) = 0
      ELSEIF (IC==2) THEN
        St(:,1) = 0
        St(:,2) = Pt*4.4*(Xt(:)-1.5)
        St(:,3) = 0
      ELSEIF (IC==3) THEN
        St(:,1) = 0
        St(:,2) = 0
        St(:,3) = 0
      ENDIF

      !WRITE(*,*) St

      DO i = 1,Np
        CALL VecSetValue(Sk,3*i-3,St(i,1),INSERT_VALUES,ierr)
        CALL VecSetValue(Sk,3*i-2,St(i,2),INSERT_VALUES,ierr)
        CALL VecSetValue(Sk,3*i-1,St(i,3),INSERT_VALUES,ierr)
      END DO

      END SUBROUTINE get_source_petsc

!*******************************!
! Display Output for Eqn. Set   !
!                               !
!*******************************!
      SUBROUTINE display_output(Np,Qt,Cs_max,stime,t,qprime,
     &                          dtdx,its,newt,reason)
      IMPLICIT NONE

      INTEGER :: Np,t,its,newt,reason
      DOUBLE PRECISION,DIMENSION(Np,Neq) :: Qt
      DOUBLE PRECISION,DIMENSION(2) :: qprime
      DOUBLE PRECISION :: p_max,Cs_max,rho_max,v_max,e_max,stime,dtdx

      !Pt(1:Np) = P(1:Np,t)
      CALL get_pressure(Np,Qt)

      p_max = MAXVAL( Pt )
      rho_max = MAXVAL( Qt(1:Np,1) / Area(:) )
      v_max = MAXVAL( Qt(1:Np,2) / Qt(1:Np,1) )
      e_max = MAXVAL( Qt(1:Np,3) / Area(:) )
      WRITE(6,'(A,I5,3x,A,F10.4,3x,A,F7.4,3x,A,F7.4,3x,A,F7.4,3x,A,
     &     F7.4,3x,A,F7.4,3x,A,F7.4,3x,A,i5,3x,A,i5,3x,A,i3,
     &     3x,A,E10.4,3x,A,E10.4)')
     &     'ts = ',t-1,'t = ',stime,'Cs=',Cs_max,'dtdx=',dtdx,
     &     'P=',p_max,'rho=',rho_max,'v=',v_max,'e=',e_max,
     &     'its=',its,'newt=',newt,'rsn=',reason,
     &     'qpL=',qprime(1),'qpR=',qprime(2)

      END SUBROUTINE display_output

!*******************************!
! Write Data to .csv file       !
!                               !
!*******************************!
      SUBROUTINE write_data(Np,XYZ,Q,nT,nTs,nInterval,stime,CFL,
     &                      Theta,Epse,Epsi,npe)
      USE csv_file
      IMPLICIT NONE

      INTEGER :: Np,nT,nTs,nInterval,npe,i
      DOUBLE PRECISION :: CFL,Theta,Epse,Epsi
      DOUBLE PRECISION,DIMENSION(8) :: parameters
      DOUBLE PRECISION,DIMENSION(Np,Neq,nTs+1) :: Q
      DOUBLE PRECISION,DIMENSION(Np) :: XYZ
      DOUBLE PRECISION,DIMENSION(nT+1) :: stime

      WRITE(*,*) 'Writing to .csv file......'

!! Write Temporal Data
      DO i = 1,nTs+1
        CALL csv_write(2, Q(:,1,i) )             ! rho*A
        CALL csv_write(3, Q(:,2,i) / Q(:,1,i) )  ! u
        CALL csv_write(4, Q(:,3,i) )             ! e*A
      END DO

!! Write Data Out to CSV file
      CALL csv_write(7, XYZ )                           ! X-Coords
      CALL csv_write(7, Q(:,1,nTs+1) / Area(:) )        ! rho
      CALL csv_write(7, ( Q(:,2,nTs+1)/Q(:,1,nTs+1) ) ) ! u
      CALL csv_write(7, Q(:,3,nTs+1) / Area(:) )        ! e
      !CALL csv_write(7, Pt )                           ! P

      CALL csv_write(8, stime )

      parameters(1) = CFL
      parameters(2) = nT
      parameters(3) = Theta
      parameters(4) = Epse
      parameters(5) = Epsi
      parameters(6) = npe
      parameters(7) = nTs
      parameters(8) = nInterval

      CALL csv_write(9, parameters)
      CALL csv_write(10, Area(:) )
      !WRITE(*,*) 'Finished Writing'

      END SUBROUTINE write_data

!*****************************!
! Deallocate the variables in !
! the pseudo1D_euler module   !
!*****************************!
      SUBROUTINE module_finalize()
      DEALLOCATE( Area )
      DEALLOCATE( Pt )

      END SUBROUTINE module_finalize

      END MODULE pseudo1d_euler
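The analytic flux Jacobian entries set in get_Jacobian can be checked against a finite-difference derivative of the flux defined in get_flux. The following is a minimal sketch of such a check in Python, under stated assumptions: the function names (`flux`, `flux_jacobian`, `fd_jacobian`, `max_jacobian_error`) and the step size `rel` are illustrative and not part of the thesis code, and the state is the area-weighted vector q = (rho*A, rho*u*A, e*A) for a single node.

```python
G = 1.4  # ratio of specific heats, same value as g in the module


def flux(q):
    """Flux of the area-weighted state q = (rho*A, rho*u*A, e*A)."""
    q1, q2, q3 = q
    u = q2 / q1
    pA = (G - 1.0) * (q3 - 0.5 * q2 * u)  # pressure times area
    return [q2, q2 * u + pA, u * (q3 + pA)]


def flux_jacobian(q):
    """Analytic dF/dq using the same expressions as the listing."""
    q1, q2, q3 = q
    u = q2 / q1
    pA = (G - 1.0) * (q3 - 0.5 * q2 * u)
    h = (q3 + pA) / q1  # (e + p)/rho; the area cancels
    return [
        [0.0, 1.0, 0.0],
        [(G - 1.0) * u**2 / 2 - u**2, u * (2 - (G - 1.0)), G - 1.0],
        [u * ((G - 1.0) * u**2 / 2 - h), h - u**2 * (G - 1.0), u * (1 + (G - 1.0))],
    ]


def fd_jacobian(q, rel=1e-6):
    """Central-difference approximation of dF/dq (hypothetical helper)."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        step = rel * max(abs(q[j]), 1.0)
        q_plus = list(q)
        q_minus = list(q)
        q_plus[j] += step
        q_minus[j] -= step
        f_plus, f_minus = flux(q_plus), flux(q_minus)
        for i in range(3):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return jac


def max_jacobian_error(q):
    """Largest absolute difference between analytic and numerical Jacobians."""
    a, b = flux_jacobian(q), fd_jacobian(q)
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))
```

For a physically reasonable state such as `q = [1.2, 0.6, 2.8]`, `max_jacobian_error(q)` should be small compared with the Jacobian entries themselves; a large value would point to a sign or index slip in one of the analytic entries.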

B.0.5 PlotResults.m

%% Plot Output From CSV files
%clear all; close all; clc;

step_solution = 0;
plot_evolution = 1;
superlabel = 0 ;
initial_cond = 1;
velocity = 1;
mach = 0;

% Load Data
load ./output/initial.csv
X_o = initial(1,:);
rho_o = initial(2,:);
v_o = initial(3,:);
e_o = initial(4,:);
p_o = initial(5,:);

load ./output/rhoA.csv
load ./output/u.csv
load ./output/eA.csv

load ./output/variables.csv
X = variables(1,:);
rho = variables(2,:);
v = variables(3,:);
e = variables(4,:);
%p = variables(5,:);

load ./output/simtime.csv
sim_time = simtime(1,:);
t = length(simtime);

load ./output/parameters.csv
CFL = parameters(1);
nT = parameters(2);
Theta = parameters(3);
Epse = parameters(4);
Epsi = parameters(5);
Npe = parameters(6);
nTs = parameters(7);
nInterval = parameters(8);
dT = sim_time(end)/nT;
Np = length(X);
Ne = (Np-1)/(Npe-1);

load ./output/area.csv

Xmax = max(X);
Xmin = min(X);
rmax = max(rho);
rmin = min(rho);
vmax = max(v);
vmin = min(v);
emax = max(e);
emin = min(e);
%pmax = max(p);
%pmin = min(p);

if plot_evolution == 0
    figure(2)
    axes1 = subplot(2,2,1);
    if initial_cond == 1
        plot(X_o,p_o,'--r'), grid on, hold on
    end
    p = (1.4-1).*( eA(t,:) - 0.5*rhoA(t,:).*u(t,:).^2 ) ./area(:)';
    plot(X,p,'-b'), grid on
    %plot(X,area,'--y')
    %hold off
    xlim([Xmin Xmax])
    ylabel('P','FontSize',16)
    title('Pressure','FontSize',16)
    set(axes1,'FontSize',16)

    axes2 = subplot(2,2,2);
    if initial_cond == 1
        plot(X_o,rho_o,'--r'), grid on, hold on
    end
    plot(X,rhoA(t,:)./area(:)','-b'),grid on, %hold off
    xlim([Xmin Xmax])
    ylabel('\rho','FontSize',16)
    title('Density','FontSize',16)
    set(axes2,'FontSize',16)

    axes3 = subplot(2,2,3);
    cs1= sqrt( 1.4 .* p_o ./ rho_o);
    % Sound speed of the current solution, needed by the Mach-number plot
    cs = sqrt( 1.4 .* p ./ (rhoA(t,:)./area(:)'));
    if initial_cond == 1
        if velocity == 1
            plot(X_o,v_o,'--r'),grid on, hold on
        elseif mach == 1
            plot(X_o,v_o./cs1,'--r'), grid on, hold on
        else
            plot(X_o,rho_o.*v_o,'--r'),grid on, hold on
        end
    end
    if velocity == 1
        plot(X,u(t,:),'-b'),grid on, %hold off
        title('Velocity','FontSize',16)
        ylabel('u','FontSize',16)
    elseif mach == 1
        plot(X,u(t,:)./cs,'-b'),grid on
        title('Mach Number','FontSize',16)
        ylabel('M','FontSize',16)
    else
        plot(X,(rhoA(t,:)./area(:)').*u(t,:),'-b'),grid on
        title('Momentum','FontSize',16)
        ylabel('\rho u','FontSize',16)
    end
    %ylim([1.25 2.0])
    xlim([Xmin Xmax])

    set(axes3,'FontSize',16)

    axes4 = subplot(2,2,4);
    if initial_cond == 1
        plot(X_o,e_o,'--r'), grid on, hold on
    end
    plot(X,eA(t,:)./area(:)','-b'),grid on, %hold off
    xlim([Xmin Xmax])
    ylabel('e','FontSize',16)
    title('Energy','FontSize',16)
    set(axes4,'FontSize',16)

    if superlabel
        %[ax,h1]=suplabel('super X label');
        %[ax,h2]=suplabel('super Y label','y');
        [ax,h3]=suplabel( ['t= ',num2str(sim_time(t)),...
            ' ',...
            'nT= ',num2str(nT),...
            ' ',...
            '\Delta t= ',num2str(dT),...
            ' ',...
            'Epse= ',num2str(Epse),...
            ' ',...
            'Epsi= ',num2str(Epsi)...
            ' ',...
            'Ne= ',num2str(Ne),...
            ' ',...
            'poly= ',num2str(Npe)...
            ],...
            't');
        set(h3,'FontSize',16)
    end

else
    for t = 1:nTs+1

        % PLOT 1
        figure(3)
        axes1 = subplot(2,2,1);
        if initial_cond == 1
            plot(X_o,p_o,'--r'), grid on, hold on
        end
        %plot(X,area,'--y')
        p = (1.4-1).*( eA(t,:) - 0.5*rhoA(t,:).*u(t,:).^2 ) ./area(:)';
        plot(X,p,'-b'), grid on
        hold off
        %xlim([Xmin Xmax])
        ylabel('P','FontSize',16)
        title('Pressure','FontSize',16)
        set(axes1,'FontSize',16)

        % PLOT 2
        axes2 = subplot(2,2,2);
        if initial_cond == 1
            plot(X_o,rho_o,'--r'), grid on, hold on
        end
        plot(X,rhoA(t,:)./area(:)','-b'),grid on
        hold off
        %xlim([Xmin Xmax])
        %xlim([0 0.5])
        ylabel('\rho','FontSize',16)
        title('Density','FontSize',16)
        set(axes2,'FontSize',16)

        % PLOT 3
        axes3 = subplot(2,2,3);
        cs1= sqrt( 1.4 .* p_o ./ rho_o);
        cs = sqrt( 1.4 .* p ./ (rhoA(t,:)./area(:)'));
        if initial_cond == 1
            if velocity == 1
                plot(X_o,v_o,'--r'),grid on, hold on
            elseif mach == 1
                plot(X_o,v_o./cs1,'--r'),grid on, hold on
            else
                plot(X_o,rho_o.*v_o,'--r'),grid on, hold on
            end
        end
        if velocity == 1
            plot(X,u(t,:),'-b'), grid on
            title('Velocity','FontSize',16)
            ylabel('u','FontSize',16)
        elseif mach == 1
            plot(X,u(t,:)./cs,'-b'),grid on
            title('Mach Number','FontSize',16)
            ylabel('M','FontSize',16)
        else
            plot(X,(rhoA(t,:)./area(:)').*u(t,:),'-b'), grid on
            title('Momentum','FontSize',16)
            ylabel('\rho u','FontSize',16)
        end
        hold off
        %ylim([1.25 2.0])
        %xlim([Xmin Xmax])
        set(axes3,'FontSize',16)

        % PLOT 4
        axes4 = subplot(2,2,4);
        if initial_cond == 1
            plot(X_o,e_o,'--r'), grid on, hold on
        end
        plot(X,eA(t,:)./area(:)','-b'),grid on
        hold off
        %xlim([Xmin Xmax])
        %xlim([9.4 10.1])
        ylabel('e','FontSize',16)
        title('Energy','FontSize',16)
        set(axes4,'FontSize',16)
        %pause

        if superlabel
            %[ax,h1]=suplabel('super X label');
            %[ax,h2]=suplabel('super Y label','y');
            [ax,h3]=suplabel( ['t= ',num2str(sim_time((t-1)*nInterval+1)),...
                ' ',...
                'nT= ',num2str(nT),...
                ' ',...
                '\Delta t= ',num2str(dT),...
                ' ',...
                'Epse= ',num2str(Epse),...
                ' ',...
                'Epsi= ',num2str(Epsi)...
                ' ',...
                'Ne= ',num2str(Ne),...
                ' ',...
                'poly= ',num2str(Npe)...
                ],...
                't');
            set(h3,'FontSize',16)
        end
        if step_solution
            pause
        end

    end
end
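The plotting script recovers pressure and Mach number from the area-weighted rows of rhoA.csv, u.csv and eA.csv. The same post-processing step can be sketched in Python as follows. This is an illustrative sketch, not part of the thesis: the function name `primitives` and its list-based interface are assumptions, and the CSV loading is left out so the block stays self-contained.

```python
import math

G = 1.4  # ratio of specific heats, matching the value used in the listings


def primitives(rhoA, u, eA, area):
    """Return (rho, p, e, mach) lists from one time level of area-weighted data.

    rhoA, u, eA and area are equal-length sequences, one entry per node.
    """
    rho = [ra / a for ra, a in zip(rhoA, area)]
    e = [ea / a for ea, a in zip(eA, area)]
    # p = (g-1) * (e*A - 0.5 * rho*A * u^2) / A, as in the plotting script
    p = [(G - 1.0) * (ea - 0.5 * ra * ui**2) / a
         for ra, ui, ea, a in zip(rhoA, u, eA, area)]
    # Mach number from the local sound speed sqrt(g*p/rho)
    mach = [ui / math.sqrt(G * pi / ri) for ui, pi, ri in zip(u, p, rho)]
    return rho, p, e, mach
```

Keeping the conversion in one function makes it easier to confirm that the plotted pressure uses the same equation of state as get_pressure in the solver module.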
