List of papers - DiVA portaluu.diva-portal.org/smash/get/diva2:523795/FULLTEXT01.pdf · the block...


List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I P. Boyanova, S. Margenov, M. Neytcheva, Robust AMLI methods for parabolic Crouzeix-Raviart FEM systems, J. Comput. Appl. Math., 235(2): 380–390, 2010.
Contributions: The ideas were developed in close collaboration between the authors. The author of this thesis performed the experiments in collaboration with the third author and had the main responsibility for preparing the manuscript.

II P. Boyanova, M. Do-Quang, and M. Neytcheva, Block-preconditioners for conforming and non-conforming FEM discretizations of the Cahn-Hilliard equation, LSSC 2011, Springer LNCS, 7116: 549–557, 2012, in print.
Contributions: The author of this thesis implemented the methods, performed all the computations and had the main responsibility for preparing the manuscript. The ideas were developed in collaboration between the authors.

III P. Boyanova, M. Do-Quang, M. Neytcheva, Efficient preconditioners for large scale binary Cahn-Hilliard models, Comput. Methods Appl. Math., 12: 1–, 2012.
Contributions: The author of this thesis contributed to the development of the ideas and the preparation of the manuscript, and performed parts of the computations.

IV O. Axelsson, P. Boyanova, M. Kronbichler, M. Neytcheva, and X. Wu, Numerical and computational efficiency of solvers for two-phase problems, TR 2012-002, Institute for Information Technology, Uppsala University, January 2012. (Submitted)
Contributions: The author of this thesis had the main responsibility for the parallel implementation, planning of performance experiments, and preparation of the manuscript. The ideas were developed in collaboration with the other authors.


V P. Boyanova, M. Neytcheva, Efficient numerical solution of discrete multi-component Cahn-Hilliard systems, TR 2012-009, Institute for Information Technology, Uppsala University, April 2012.
Contributions: The author of this thesis implemented the methods, performed all experiments, and did a substantial part of the writing. The ideas and the manuscript were developed in collaboration between the authors.

Reprints were made with permission from the publishers.


Related work

Although not explicitly discussed in the comprehensive summary, the following papers are related to the contents of this thesis.

P. Boyanova, S. Margenov, Multilevel splitting of weighted graph-Laplacians arising in non-conforming mixed FEM elliptic problems, NAA 2008, Springer LNCS, 5434: 216–223, 2009.

P. Boyanova, S. Margenov, Numerical study of AMLI methods for weighted graph-Laplacians, LSSC 2009, Springer LNCS, 5910: 84–91, 2010.

P. Boyanova, S. Margenov, On multilevel iterative methods for Navier-Stokes problems, J. Theoret. Appl. Mech., 40(1): 51–60, 2010.

P. Boyanova, S. Margenov, On optimal AMLI solvers for incompressible Navier-Stokes problems, AIP Conf. Proc., 1301: 457–467, 2010.

P. Boyanova, S. Margenov, Robust multilevel methods for elliptic and parabolic problems, invited chapter in: O. Axelsson, J. Karatson, Efficient preconditioning methods for elliptic partial differential equations, Bentham Science Publishers, 3–22, 2011.

P. Boyanova, I. Georgiev, S. Margenov, and L. Zikatanov, Multilevel preconditioning of graph-Laplacians: Polynomial approximation of the pivot blocks inverses, TR 2011-030, Institute for Information Technology, Uppsala University, 2011.


Contents

1 Introduction

2 Preconditioners for block-structured matrices
2.1 Iterative solution methods and the role of preconditioning
2.2 Preconditioners in two-by-two block-factorized form

3 Optimal preconditioners for hierarchies of meshes
3.1 The Algebraic MultiLevel Iteration method
3.2 AMLI preconditioners for Crouzeix-Raviart FEM discretizations of parabolic problems
3.2.1 Hierarchical transformation definition
3.2.2 Analysis of the CBS constant and a multilevel generalisation

4 Solution methods for systems of PDEs
4.1 The Cahn-Hilliard equation
4.2 Discrete formulation
4.3 Symmetric positive definite Schur approximations
4.4 Block-structured Jacobian approximation
4.5 Efficient solvers based on inexact Newton methods
4.6 Generalisation of the solution techniques for multiphase systems with arbitrary number of components
4.7 Software implementation and parallelization

5 Summary of papers
5.1 Paper I
5.2 Paper II
5.3 Paper III
5.4 Paper IV
5.5 Paper V

6 Discussion and outlook

7 Summary in Swedish

8 Acknowledgements

References


1. Introduction

A substantial part of the recent achievements in science and technology is due to developments in the field of computer simulations. Computer simulations are complex interdisciplinary processes that aim to find solutions to various problems of theoretical and practical importance. Computational modelling is based on results in several scientific areas and involves the development of:

- a mathematical model that describes the studied phenomena in an adequate way, usually in terms of differential and/or integral equations;
- numerical methods to discretize the continuous equations;
- efficient methods and algorithms to solve the so-obtained discrete systems of linear (or nonlinear) algebraic equations;
- high-performance software implementations that utilise the architecture and properties of state-of-the-art computing facilities;
- algorithms for visualisation and analysis of the solutions obtained via the numerical experiments.

Scientific computations are now broadly used instead of field and laboratory experiments. Moreover, numerical simulations are of great importance when modelling the properties of new materials and in the study of processes where direct measurements and observations are not feasible.

Two very widely used tools for finding approximate solutions of partial differential equations (PDE) are the Finite Element Method (FEM) and the Finite Difference Method (FDM), see e.g. [30, 32, 72, 71]. The discrete systems of linear (or nonlinear) algebraic equations, obtained by applying FEM and FDM to discretize PDE problems, share one important property, namely, their matrices are sparse. This means that the number of nonzero elements in each row or column is bounded by a constant that does not depend on the discretization parameter, and thus on the size of the discrete problem. A substantial part of the existing algorithms for numerical simulation of processes modelled by differential equations involves solutions of systems with sparse matrices. It turns out that, when performing numerical simulations, the solution of such systems accounts for the major share of the required computer resources. The influence of efficient algorithms to solve systems of algebraic equations is mostly seen when complex large-scale problems are considered. The meaning of "large-scale" changes with the increased availability of computer resources. However, in order to deal with various important problems, the development of high-performance machines needs to be combined with the development of robust solution techniques to fully utilise these computer resources. The development, analysis, and implementation of efficient methods, suitable for both sequential and parallel execution, are main research directions in the field of numerical simulations and are the focus of this thesis.

The methods for solving systems of linear algebraic equations fall into two main classes, direct and iterative, see e.g. [49, 7, 4, 30]. In the absence of rounding errors, direct methods deliver the exact solution. In contrast, iterative algorithms generate a sequence of approximate solutions with improving accuracy. Due to their general applicability and high robustness, direct methods are a preferred choice and are often used in numerical simulations. However, when simulations that are very large scale in space and/or in time have to be performed, direct methods may become prohibitively expensive (and in some cases impossible) to apply. Therefore, due to their lesser demands for computer resources, iterative solution methods become a necessity. To improve their efficiency, iterative methods are combined with proper techniques to accelerate the convergence to an approximate solution with a desired accuracy. A general technique to accelerate the convergence of iterative methods is to use a so-called preconditioner. Constructing and analysing various preconditioning methods has been an active field of research for decades. Special attention is devoted to the class of the so-called optimal order preconditioners. "Optimal order" incorporates "optimal convergence rate", i.e., convergence within a number of iterations which is independent of the number of degrees of freedom, and "optimal computational complexity", i.e., a computational cost per iteration that is linearly proportional to the number of degrees of freedom. The preconditioning techniques developed and studied in this thesis utilise the block structure of the underlying matrices and lead to methods that are of optimal order.

The outline of this thesis is as follows. Chapter 2 gives a brief introduction to the concept of iterative methods and the block-based preconditioning techniques considered in this work. In Chapter 3, we describe an optimal preconditioner for linear parabolic problems, discretized by Crouzeix-Raviart finite elements. Chapter 4 is devoted to efficient block-structured preconditioning techniques for the solution of multiphase flow problems, described by the so-called phase-field model. A summary of the papers included in the thesis is given in Chapter 5. Chapter 6 concludes with a discussion and outlines possible future extensions of the presented research work.


2. Preconditioners for block-structured matrices

Long-term experience in the theory and application of preconditioned iterative methods undeniably shows that utilising some block structure of the system matrix is advantageous and usually leads to solution techniques of high efficiency. The research in this thesis concerns the development of preconditioners for matrices in two-by-two block form.

2.1 Iterative solution methods and the role of preconditioning

Consider a linear system of equations

$$Ax = b, \tag{2.1}$$

where $A$ is a matrix of size $N \times N$. In this work $A$ is assumed to be large and sparse. A general scheme of an iterative method for finding an approximate solution of (2.1) is as follows.

Given an initial guess $x^0$, compute

$$x^{k+1} = x^k + \tau p^k, \quad k = 0, 1, \ldots \tag{2.2}$$

until a certain convergence criterion is met. The vector $p^k$ is the so-called search direction, which can be taken to be equal to the current residual $r^k = Ax^k - b$ or obtained using also some vector inner products, as, for example, in the conjugate gradient (CG) method.

The number of iterations, needed to obtain an approximation with a desired accuracy, depends on the properties of the system matrix $A$. The task to study the convergence of an iterative method for a general matrix is not an easy one. For certain classes of matrices, such as diagonalizable, the convergence analysis can be based on information for the spectrum of $A$. For example, when the system (2.1) has a symmetric positive definite (s.p.d.) matrix, the rate of convergence is often estimated using the so-called spectral condition number $\kappa(A) = \lambda_{\max}(A)/\lambda_{\min}(A)$, where $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ are the maximum and minimum eigenvalues of $A$. When using methods, such as the conjugate gradient or the Chebyshev method, a well-known upper bound of the number of iterations $k(\varepsilon)$, required to ensure that

$$\|x - x^{k(\varepsilon)}\|_A \le \varepsilon \|x - x^0\|_A, \quad \forall x^0 \in \mathbb{R}^N,$$

is the following

$$k(\varepsilon) \le \left\lceil \frac{1}{2} \sqrt{\kappa(A)} \, \ln\frac{2}{\varepsilon} \right\rceil, \tag{2.3}$$

see e.g. [7]. Above, $\|y\|_A = \sqrt{y^T A y}$ denotes the energy norm with respect to $A$, and we have assumed that computations are performed in exact arithmetic.

For various problems, due to the properties of the corresponding matrix $A$, applying an iterative method to the original system (2.1) results in poor convergence and thus, high overall computational complexity of the algorithm. For example, for systems arising in discretizations of second-order elliptic problems on a mesh with characteristic size $h$, the estimate

$$\kappa(A) = O(h^{-2})$$

holds true, see e.g. [7]. Thus, as (2.3) suggests, when the spatial step size $h$ is decreased in order to obtain a better discrete approximation, the number of iterations increases as $h^{-1}$.
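The growth of the condition number can be observed on a small model problem. The following sketch is an illustration and is not taken from the thesis: it assumes the one-dimensional problem $-u'' = f$ on $(0,1)$ with homogeneous Dirichlet conditions and linear finite elements, and it evaluates the estimate (2.3) in the assumed form $\lceil \frac{1}{2}\sqrt{\kappa}\,\ln(2/\varepsilon) \rceil$. The helper names `stiffness_1d`, `condition_number` and `iteration_bound` are hypothetical.

```python
import numpy as np

def stiffness_1d(n):
    """Linear FEM stiffness matrix of -u'' on (0, 1) with n interior nodes.

    Assumed model problem; returns the dense matrix and the mesh size h.
    """
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    return A, h

def condition_number(A):
    """Spectral condition number lambda_max / lambda_min of an s.p.d. matrix."""
    eigs = np.linalg.eigvalsh(A)  # eigenvalues in ascending order
    return eigs[-1] / eigs[0]

def iteration_bound(kappa, eps):
    """Assumed form of the estimate (2.3): ceil(0.5 * sqrt(kappa) * ln(2 / eps))."""
    return int(np.ceil(0.5 * np.sqrt(kappa) * np.log(2.0 / eps)))

if __name__ == "__main__":
    for n in (15, 31, 63):  # each step halves the mesh size h
        A, h = stiffness_1d(n)
        kappa = condition_number(A)
        print(f"h = {h:.4f}  kappa = {kappa:8.1f}  "
              f"kappa*h^2 = {kappa * h**2:.4f}  "
              f"bound = {iteration_bound(kappa, 1e-6)}")
```

For this model problem the product $\kappa(A)\,h^2$ stays close to a constant, so halving $h$ multiplies $\kappa(A)$ by roughly four and the iteration estimate by roughly two.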

The convergence of an iterative method can be accelerated by using preconditioning. The general understanding of a preconditioner (see e.g. [4]) is that it acts as an accelerator of the iterative solution method by formally replacing the solution of the system (2.1) by an equivalent problem

$$[P]Ax = [P]b, \tag{2.4}$$

or, alternatively,

$$A[P]y = b, \quad x = [P]y.$$

The preconditioner $[P]$ may be in the form of a matrix which can be explicitly written out, but may also be an implicitly defined procedure. For notational simplicity, assume that $[P] = P$ is a matrix. Consider (2.4). In order to obtain an efficient preconditioning method, the combined action of $PA$ should resemble as much as possible that of the identity matrix of corresponding size. For s.p.d. matrices $A$ and $P$, as also seen from the estimate (2.3) with $\kappa(A)$ replaced by $\kappa(PA)$, this translates to a condition number $\kappa(PA)$ of order one, $O(1)$.

Two broadly used approaches to construct a preconditioner are the following:

(a) Let $C$ be an approximation of $A$, define then $P = C^{-1}$. In this case, each iteration would require a solution of a system with $C$, $C d^k = r^k$, for some vector $d^k$ which occurs in the iterative method. Note that we never explicitly invert $C$ to solve these auxiliary systems.

(b) Let $C$ be an approximation of the inverse of $A$, $A^{-1}$, and define $P = C$. The preconditioning step requires one more matrix-vector multiplication.

In this thesis we consider the approach (a). The general requirements for constructing an efficient preconditioner, see e.g. [4, 70], are the following:

- the condition number of $C^{-1}A$ should be much smaller than that of $A$;
- solving systems with $C$ should be cheaper than solving systems with $A$;
- the construction of $C$ should not be too costly;
- solutions with $C$ should be well parallelizable.
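To make approach (a) concrete, here is a minimal sketch of a preconditioned conjugate gradient iteration in which the preconditioner enters only through a routine that solves $C d = r$. It is an illustration under stated assumptions, not the implementation used in the thesis: the function name `pcg`, the callback `solve_C`, the badly scaled test matrix and the choice $C = \mathrm{diag}(A)$ are all hypothetical.

```python
import numpy as np

def pcg(A, b, solve_C, tol=1e-10, maxiter=500):
    """Preconditioned CG for an s.p.d. matrix A.

    solve_C(r) must return d with C d = r; the matrix C is never inverted
    explicitly. Returns the approximate solution and the iteration count.
    """
    x = np.zeros_like(b)
    r = b - A @ x                 # residual of the original system
    d = solve_C(r)                # preconditioned residual, C d = r
    p = d.copy()                  # first search direction
    rd = r @ d
    b_norm = np.linalg.norm(b)
    for k in range(1, maxiter + 1):
        Ap = A @ p
        alpha = rd / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        d = solve_C(r)            # one solve with C per iteration
        rd_new = r @ d
        p = d + (rd_new / rd) * p
        rd = rd_new
    return x, maxiter

if __name__ == "__main__":
    n = 20
    L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    s = np.linspace(1.0, 10.0, n)          # artificial row/column scaling
    A = s[:, None] * L * s[None, :]        # s.p.d. but badly scaled
    b = np.ones(n)
    diag_A = np.diag(A).copy()
    _, it_plain = pcg(A, b, lambda r: r.copy())    # C = I, plain CG
    _, it_diag = pcg(A, b, lambda r: r / diag_A)   # C = diag(A)
    print("iterations without / with preconditioning:", it_plain, it_diag)
```

In practice `solve_C` would wrap a sparse factorization, an inner iteration, or a multilevel procedure; the outer iteration does not need to know which.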

A preconditioned iterative method is said to have an optimal rate of convergence when the number of iterations, required to converge up to a chosen stopping criterion, is independent of the number of degrees of freedom of the problem. It is said to have optimal computational complexity, when the number of arithmetic operations, performed per degree of freedom per iteration, is bounded from above independently of the number of degrees of freedom. Preconditioning methods, which possess both optimal rate of convergence and optimal computational complexity, are referred to as optimal order methods. Another desirable property for a preconditioner is that it is robust not only with respect to discretization parameters but also with respect to problem parameters.

2.2 Preconditioners in two-by-two block-factorized form

In this thesis we consider preconditioners that are based on a two-by-two block-structured representation of the system matrix. A general sparse matrix $A$ admits the following two-by-two block form

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{matrix} \}\, N_1 \\ \}\, N_2 \end{matrix}, \tag{2.5}$$

with $A_{11}$ of size $N_1 \times N_1$ and $A_{22}$ of size $N_2 \times N_2$, where the blocking is either natural as provided by the underlying physical problem, or imposed, for instance, obtained as a result of some proper ordering of the unknowns and block partitioning.

Provided that $A_{11}$ is non-singular, the following exact block-factorization holds true

$$A = \begin{bmatrix} A_{11} & 0 \\ A_{21} & S \end{bmatrix} \begin{bmatrix} I_1 & A_{11}^{-1} A_{12} \\ 0 & I_2 \end{bmatrix}, \tag{2.6}$$

where $I_1$ and $I_2$ are identity matrices of proper dimensions and

$$S = A_{22} - A_{21} A_{11}^{-1} A_{12} \tag{2.7}$$

is the corresponding Schur complement. Other possible factorizations are

$$A = \begin{bmatrix} I_1 & 0 \\ A_{21} A_{11}^{-1} & I_2 \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ 0 & S \end{bmatrix},$$

$$A = \begin{bmatrix} I_1 & 0 \\ A_{21} A_{11}^{-1} & I_2 \end{bmatrix} \begin{bmatrix} A_{11} & 0 \\ 0 & S \end{bmatrix} \begin{bmatrix} I_1 & A_{11}^{-1} A_{12} \\ 0 & I_2 \end{bmatrix}. \tag{2.8}$$

If $A$ is factored as in (2.6), the solution of the system (2.1) can be obtained by a procedure that involves solutions of subsystems of sizes $N_1$ and $N_2$, matrix-vector multiplications and vector updates, as follows:

$$\begin{array}{ll} \text{Forward step:} & \text{Backward step:} \\ (1)\ \text{Solve } A_{11} y_1 = b_1 & (3)\ \text{Solve } A_{11} z_1 = A_{12} x_2 \\ (2)\ \text{Solve } S x_2 = b_2 - A_{21} y_1 & (4)\ x_1 = y_1 - z_1 \end{array} \tag{2.9}$$

where $x = (x_1^T, x_2^T)^T$ and $b = (b_1^T, b_2^T)^T$.
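The four steps of (2.9) can be written down directly. The sketch below is a dense NumPy illustration, assuming small matrices for which the Schur complement can be formed explicitly (which one would avoid for large sparse problems); the function name `block_solve` and the argument `n1` (the size $N_1$ of the first block) are hypothetical.

```python
import numpy as np

def block_solve(A, b, n1):
    """Solve A x = b through the two-by-two block factorization (2.6).

    Dense illustration only: S is formed explicitly here, which is not
    what one would do for large sparse problems.
    """
    A11, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A22 = A[n1:, :n1], A[n1:, n1:]
    b1, b2 = b[:n1], b[n1:]

    # Schur complement S = A22 - A21 A11^{-1} A12, see (2.7)
    S = A22 - A21 @ np.linalg.solve(A11, A12)

    y1 = np.linalg.solve(A11, b1)              # (1) forward: A11 y1 = b1
    x2 = np.linalg.solve(S, b2 - A21 @ y1)     # (2) forward: S x2 = b2 - A21 y1
    z1 = np.linalg.solve(A11, A12 @ x2)        # (3) backward: A11 z1 = A12 x2
    x1 = y1 - z1                               # (4) backward: x1 = y1 - z1
    return np.concatenate([x1, x2])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((7, 7))
    A = M @ M.T + 7.0 * np.eye(7)              # random s.p.d. test matrix
    b = rng.standard_normal(7)
    x = block_solve(A, b, n1=3)
    print("residual norm:", np.linalg.norm(A @ x - b))
```

Note that the procedure needs two solves with $A_{11}$ and one with $S$; replacing those solves by cheaper approximate ones is exactly what turns the factorization into a preconditioner.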

By approximating (some of) the blocks in the exact block-factorization of $A$, we can derive various preconditioning techniques. For example, a class of preconditioners, based on the factorization (2.6), has the general form

$$C = \begin{bmatrix} C_{11} & 0 \\ A_{21} & \widehat{S} \end{bmatrix} \begin{bmatrix} I_1 & B_{12} \\ 0 & I_2 \end{bmatrix}, \tag{2.10}$$

where the matrices $C_{11}$, $B_{12}$, and $\widehat{S}$ are chosen in a proper way so that $C_{11} \approx A_{11}$, $B_{12} \approx A_{11}^{-1} A_{12}$, and $\widehat{S} \approx S$. Suppose $B_{12} = C_{11}^{-1} A_{12}$, then a solution of a system with the matrix $C$ is obtained via the steps (2.9), where instead of solutions with the blocks $A_{11}$ and $S$, one needs to perform solutions with the approximations $C_{11}$ and $\widehat{S}$.

In some cases a high-quality preconditioner can be obtained by neglecting blocks in the exact factorization, which additionally simplifies the solution procedure. For example, preconditioners with a block-triangular structure,

$$C = \begin{bmatrix} C_{11} & 0 \\ A_{21} & \widehat{S} \end{bmatrix},$$

have been shown to perform well for saddle point systems, see e.g. [6]. For certain problems it suffices to use only the block-diagonal part of (2.8), thus the preconditioner becomes

$$C = \begin{bmatrix} C_{11} & 0 \\ 0 & \widehat{S} \end{bmatrix}.$$

An important observation is that, due to the presence of $A_{11}^{-1}$, the Schur complement of a sparse matrix is in general not a sparse matrix itself. In all of the above mentioned cases, the application of a preconditioner $C$ requires a solution of a system with $\widehat{S}$, which, in order for the method to be efficient, should be a sparse approximation of $S$. A major goal in the development of block-structured preconditioners is either to choose such an approximation in a proper way (Papers I and II), or to circumvent solutions with the Schur complement by utilising some additional properties of the block partitioning (Papers III, IV and V).

When constructing a preconditioner in a two-by-two block-factorized form, there are two main questions to answer:

- how to choose a proper block splitting of the original matrix;
- how to approximate in a proper way the blocks in the exact block-factorization.

In this thesis we consider two approaches to partition the system matrix, namely, (a) based on a hierarchy of nested meshes, discretizing the problem domain, and (b) based on the origin and properties of the continuous problem itself and its discrete representation. The approximation techniques, proposed in our studies, as well as the analysis of the resulting preconditioners, are outlined in turn in Chapter 3 and Chapter 4.


3. Optimal preconditioners for hierarchies of meshes

Substantial advances in the field of optimal preconditioning algorithms are mainly due to the development of the so-called multigrid (see e.g. [31, 50]) and multilevel (see e.g. [16, 17, 57, 75]) methods. A classical technique, widely used in the theory of such methods, is to transform the original discrete problem into a set of problems of smaller size, associated with a hierarchy of nested meshes. We consider here an algebraic multilevel framework.

3.1 The Algebraic MultiLevel Iteration method

The Algebraic MultiLevel Iteration (AMLI) method is an optimal preconditioning technique, proposed first in [16, 17] for the case of elliptic problems, discretized by conforming linear finite elements. Later, the approach was generalised to non-conforming discretizations (e.g. in [23, 24, 25, 46, 47, 63]) and discontinuous Galerkin methods (e.g. in [59, 58, 60]), and techniques were developed for the construction of variable-step, also known as nonlinear, AMLI preconditioners, see e.g. [18, 15, 56]. The method is based on a recursive generalisation of the two-level preconditioner of the form (2.10).

We next briefly outline the AMLI method. For simplicity, we consider scalar problems and assume that the matrix $A$ is s.p.d., which entails that $A_{11}$, $A_{22}$ and $S$ are s.p.d.

As pointed out, in the class of two-by-two block-factorized preconditioners for $A$, constructing an approximation of $S$ is a crucial but difficult task. For the considered class of matrices, one simple idea is to approximate $S$ by $A_{22}$, neglecting the term $A_{21} A_{11}^{-1} A_{12}$. Then, the resulting preconditioners of full factorized and block-diagonal form become

$$C_F = \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} I_1 & A_{11}^{-1} A_{12} \\ 0 & I_2 \end{bmatrix}, \qquad C_D = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}.$$

The quality of the latter preconditioners can be measured in various ways, one of them being via the so-called CBS constant $\gamma$, associated with the two-by-two block splitting, imposed on $A$. There are several ways to define $\gamma$ and we choose the following: $\gamma$ is the smallest possible constant such that

$$|v_1^T A_{12} v_2| \le \gamma \left( v_1^T A_{11} v_1 \; v_2^T A_{22} v_2 \right)^{1/2}, \quad \forall v_1 \ne 0 \in \mathbb{R}^{N_1}, \ \forall v_2 \ne 0 \in \mathbb{R}^{N_2}.$$


Figure 3.1. Standard (a) and hierarchical (b) basis functions for the case of linear conforming FEM in one dimension.

The following condition number estimates have been shown (see, for instance, [4]):

$$\kappa(C_D^{-1} A) \le \frac{1+\gamma}{1-\gamma}, \qquad \kappa(A_{22}^{-1} S) \le \frac{1}{1-\gamma^2}, \qquad \kappa(C_F^{-1} A) \le \frac{1}{1-\gamma^2}. \tag{3.1}$$

Clearly, the above estimates make sense only if $\gamma < 1$. Aiming at constructing optimal preconditioners, we also see that $\gamma$ should ideally be independent of $h$.
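The estimates (3.1) can be checked numerically on a small example. The sketch below relies on a standard characterisation that is not stated in the text and is used here as an assumption: for an s.p.d. matrix, $\gamma$ equals the largest singular value of $L_1^{-1} A_{12} L_2^{-T}$, where $A_{11} = L_1 L_1^T$ and $A_{22} = L_2 L_2^T$ are Cholesky factorizations. The function names and the random test matrix are hypothetical.

```python
import numpy as np

def cbs_constant(A, n1):
    """CBS constant of the two-by-two splitting of an s.p.d. matrix A.

    Computed as the spectral norm of L1^{-1} A12 L2^{-T}, with A11 = L1 L1^T
    and A22 = L2 L2^T (assumed characterisation, see the lead-in).
    """
    A11, A12, A22 = A[:n1, :n1], A[:n1, n1:], A[n1:, n1:]
    L1 = np.linalg.cholesky(A11)
    L2 = np.linalg.cholesky(A22)
    G = np.linalg.solve(L1, A12)        # L1^{-1} A12
    G = np.linalg.solve(L2, G.T).T      # L1^{-1} A12 L2^{-T}
    return np.linalg.norm(G, 2)

def generalized_condition(C, A):
    """Condition number of C^{-1} A for s.p.d. C and A (real, positive spectrum)."""
    eigs = np.linalg.eigvals(np.linalg.solve(C, A)).real
    return eigs.max() / eigs.min()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, n1 = 8, 5
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)         # random s.p.d. test matrix
    A11, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A22 = A[n1:, :n1], A[n1:, n1:]

    gamma = cbs_constant(A, n1)
    C_D = np.block([[A11, np.zeros_like(A12)], [np.zeros_like(A21), A22]])
    S = A22 - A21 @ np.linalg.solve(A11, A12)

    print("gamma             :", gamma)
    print("kappa(C_D^-1 A)   :", generalized_condition(C_D, A),
          " bound:", (1 + gamma) / (1 - gamma))
    print("kappa(A22^-1 S)   :", generalized_condition(A22, S),
          " bound:", 1 / (1 - gamma**2))
```

For a generic splitting such as this one nothing forces $\gamma$ to stay away from 1 as the problem grows, which is why the choice of splitting matters.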

For a general two-by-two block splitting of the s.p.d. matrix $A$, it turns out that $\gamma$ can be arbitrarily close to 1, showing a deterioration of the condition number estimates (3.1). For example, this is the case when we use standard conforming FEM and two consecutive meshes to define the splitting.

There is a setting, however, where $\gamma$ is both independent of $h$ and bounded away from 1, namely, the so-called hierarchical basis functions (HBF) formulation.

In HBF, upon refining a given (coarse) mesh in a regular fashion, one keeps the FEM basis functions associated with the coarse mesh points, and in the newly added mesh points upgrades the FEM basis with basis functions corresponding to the finer mesh. A simple illustration is given in Figure 3.1.

In the HBF framework, the CBS constant has been analysed for various FEM discretizations, both in 2D and 3D. It turns out that $\gamma$ can be estimated using local arguments and is, thus, independent of the number of refinements, i.e. the mesh size, and of mesh and coefficient anisotropies, see e.g. [20, 9, 3, 61, 38].

Consider, for example, a second order elliptic problem discretized by linear conforming FEM. Then, in the HBF framework, for arbitrary triangular mesh, $\gamma$ has a uniform bound

$$\gamma^2 < \frac{3}{4}$$

even for degenerate triangles (see for a survey [57] and the references therein). When $\gamma$ has such favourable properties, we see that $A_{22}$ becomes a high-quality sparse approximation of $S$, which is one of the necessary ingredients to construct efficient two-by-two block preconditioners.

The price for having $\gamma$ bounded away from 1 is that the stiffness matrix $\widetilde{A}$ in the HBF framework is denser than that in the standard FEM setting. The way to circumvent working with the denser matrices is to utilise the fact that $A$ and $\widetilde{A}$ are related via a simple congruence transformation

$$\widetilde{A} = J A J^T,$$

where $J = \begin{bmatrix} I & 0 \\ J_{21} & I \end{bmatrix}$. Remarkably enough, any matrices related via a transformation of this type have the same Schur complement. Further, the second diagonal block $\widetilde{A}_{22}$ in the two-by-two block splitting of $\widetilde{A}$ coincides with the standard FEM stiffness matrix on the coarse mesh.
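The invariance of the Schur complement under such a transformation is easy to confirm numerically. The sketch below uses a random s.p.d. matrix and a random block $J_{21}$ purely for illustration; in the HBF setting $J_{21}$ would come from the mesh hierarchy. The helper names `schur_complement` and `hierarchical_transform` are hypothetical.

```python
import numpy as np

def schur_complement(A, n1):
    """S = A22 - A21 A11^{-1} A12 for the splitting with first block of size n1."""
    A11, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A22 = A[n1:, :n1], A[n1:, n1:]
    return A22 - A21 @ np.linalg.solve(A11, A12)

def hierarchical_transform(A, J21):
    """Return J A J^T with J = [[I, 0], [J21, I]]."""
    n2, n1 = J21.shape
    J = np.eye(n1 + n2)
    J[n1:, :n1] = J21
    return J @ A @ J.T

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n1, n2 = 4, 3
    M = rng.standard_normal((n1 + n2, n1 + n2))
    A = M @ M.T + (n1 + n2) * np.eye(n1 + n2)   # random s.p.d. test matrix
    J21 = rng.standard_normal((n2, n1))         # arbitrary coupling block
    A_tilde = hierarchical_transform(A, J21)

    diff = schur_complement(A, n1) - schur_complement(A_tilde, n1)
    print("max difference of Schur complements:", np.abs(diff).max())
    print("A11 unchanged:", np.allclose(A[:n1, :n1], A_tilde[:n1, :n1]))
```

The transformation changes the off-diagonal blocks and the second diagonal block, and thereby the CBS constant, while leaving $A_{11}$ and $S$ untouched.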

The above described framework defines a two-level preconditioner. Even though we reduce the dimension of the original matrix and work with sub-blocks, these might be still large. The natural idea is then to define a multilevel AMLI preconditioner, recursively constructing the two-level preconditioner on a sequence of nested meshes.

To this end, consider a sequence of nested meshes, $\mathcal{T}_0 \subset \cdots \subset \mathcal{T}_\ell$, obtained via $k$, $1 \le k \le \ell$, regular refinements of a given coarse mesh $\mathcal{T}_0$, where $N_0 < \cdots < N_\ell$ are the corresponding numbers of degrees of freedom. We denote by $A^{(k)}$ the matrix that corresponds to a standard FEM discretization on $\mathcal{T}_k$. Let $\widetilde{A}^{(k)}$ be a hierarchical representation of $A^{(k)}$,

$$\widetilde{A}^{(k)} = J^{(k)} A^{(k)} (J^{(k)})^T = \begin{bmatrix} \widetilde{A}_{11} & \widetilde{A}_{12} \\ \widetilde{A}_{21} & \widetilde{A}_{22} \end{bmatrix},$$

defined via a two-level sparse transformation matrix $J^{(k)}$ such that $A^{(k-1)}$ is reproduced in $\widetilde{A}_{22}$. The AMLI preconditioner $C^{(k)}$ of $A^{(k)}$ on level of refinement $k$ is defined as

$$C^{(k)} = (J^{(k)})^{-1} \begin{bmatrix} \widetilde{C}_{11}^{(k)} & 0 \\ \widetilde{A}_{21}^{(k)} & Z^{(k-1)} \end{bmatrix} \begin{bmatrix} I_1 & (\widetilde{C}_{11}^{(k)})^{-1} \widetilde{A}_{12}^{(k)} \\ 0 & I_2 \end{bmatrix} (J^{(k)})^{-T}, \tag{3.2}$$

where $\widetilde{C}_{11}^{(k)}$ is a proper approximation of $\widetilde{A}_{11}^{(k)}$ and $C^{(0)} = A^{(0)}$. The direct use of $Z^{(k-1)} = A^{(k-1)}$ leads to a class of methods with condition number $\kappa((C^{(\ell)})^{-1} A^{(\ell)})$ that grows with the increase of the number of levels $\ell$, see e.g. [57]. In order to obtain an optimal preconditioner, the HBF approach is combined with various types of stabilisation techniques. One such technique, applied for the construction of so-called linear AMLI methods, is to use a specially constructed matrix polynomial at some or all levels of refinement, namely

$$Z^{(k-1)} = A^{(k-1)} \left( I - P_{\beta_k}\!\left( (C^{(k-1)})^{-1} A^{(k-1)} \right) \right)^{-1}, \tag{3.3}$$

where $P_{\beta_k}$ is a polynomial of degree $\beta_k$ with the property $P_{\beta_k}(0) = 1$. Another approach is to define the action of $Z^{(k-1)}$ via some inner iterations with the preconditioned conjugate gradient (PCG) method for systems with the matrix $A^{(k-1)}$ preconditioned by $C^{(k-1)}$. This results in AMLI algorithms, known as variable-step or nonlinear, see e.g. [18, 15, 56, 57].

Assume that the CBS constants $\gamma_k$, corresponding to the two-level partitionings of $\tilde A^{(k)}$, $1 \le k < \ell$, satisfy the bound $\gamma_k \le \gamma$ for some $\gamma < 1$, independently of the discretization parameters. Then the preconditioner (3.2), with $\beta_k = \beta$ in (3.3), is optimal provided that

$$ \frac{1}{\sqrt{1-\gamma^2}} < \beta < \min_k \frac{N_k}{N_{k-1}}, $$

see e.g. [16, 17, 57].
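The optimality condition can be turned into a small check of which polynomial degrees are admissible. The sketch below is only an illustration: the function name `admissible_degrees`, the sample bound $\gamma^2 = 3/4$ and the degree-of-freedom counts (growing by a factor of four per refinement) are assumptions chosen for the example, not values taken from a particular experiment.

```python
import math

def admissible_degrees(gamma_sq, dof_counts):
    """Integer degrees beta with 1/sqrt(1 - gamma^2) < beta < min_k N_k / N_{k-1}.

    gamma_sq   -- upper bound for the squared CBS constant (must be < 1)
    dof_counts -- [N_0, N_1, ..., N_l], numbers of degrees of freedom per level
    """
    lower = 1.0 / math.sqrt(1.0 - gamma_sq)
    upper = min(fine / coarse for coarse, fine in zip(dof_counts, dof_counts[1:]))
    return [b for b in range(1, math.ceil(upper) + 1) if lower < b < upper]

# Hypothetical hierarchy in which each refinement multiplies the unknowns by 4.
degrees = admissible_degrees(0.75, [100, 400, 1600])
```

With these sample inputs the lower bound is 2 and the upper bound is 4, so the only admissible integer degree is 3. If the number of unknowns merely doubled per level, no admissible degree would remain for the same bound on $\gamma$.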

Remark 3.1.1 There are studies that generalise the AMLI framework to other types of splittings and approximations of the Schur complement, as well as to some systems with non-symmetric matrices, see e.g. [13, 12, 65].

Remark 3.1.2 To achieve the optimal complexity of the AMLI method we also need a good approximation of $A_{11}$ or of its inverse, or an efficient inner solver. This discussion is left out of the presentation. For details we refer to [57] and the many references therein.

3.2 AMLI preconditioners for Crouzeix-Raviart FEM discretizations of parabolic problems

While a large number of papers deal with solution algorithms for FEM elliptic systems, preconditioning methods for the related time-dependent problems are much less studied. The systems obtained by implicit time discretizations of parabolic problems differ from elliptic discrete systems by an additional term. In the FEM framework, this term is a mass matrix, which is symmetric and positive definite.

In Paper I of this thesis, we consider linear parabolic problems, discretized by Crouzeix-Raviart (C-R) finite elements. We develop and analyse an optimal AMLI method, based on two-level partitioning techniques that were first proposed in [24] for the case of elliptic problems. Linear conforming finite elements are widely used in applications; however, in many cases non-conforming elements have strong advantages. As an example, the use of non-conforming FEM in a projection method for the incompressible Navier-Stokes (N-S) equations leads to stable, locally conservative schemes, see e.g. [22, 35]. The results obtained in Paper I are used within such a projection method in [27], where a composite time-stepping solution method based on optimal AMLI preconditioning is proposed. Some other examples of applications of C-R FEM can be found in, e.g., [1, 36].

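We consider the second-order parabolic equation:

$$ \begin{aligned}
\partial_t u(x,t) - \nabla \cdot (a(x)\nabla u(x,t)) &= f(x,t) && \text{in } \Omega \times (0,T], \\
u(x,0) &= u_0 && \text{in } \Omega, \\
u(x,t) &= u_D && \text{on } \Gamma_D, \\
(a(x)\nabla u(x,t)) \cdot n &= u_N && \text{on } \Gamma_N,
\end{aligned} \tag{3.4} $$

where $\Omega$ is a polygonal domain in $\mathbb{R}^2$, $f(x,t) \in L^2(\Omega)$ is a given function, $n$ is the outward unit normal vector to the boundary $\partial\Omega = \Gamma_D \cup \Gamma_N$, and $a(x) = \{a_{ij}(x)\}_{i,j \in \{1,2\}}$ is a bounded s.p.d. matrix with elements $a_{ij}(x)$ that are piecewise smooth functions in $\bar\Omega = \Omega \cup \partial\Omega$.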

Let $\mathcal{T}$ be a given discretization of $\Omega$. We assume that it is aligned with any discontinuities of the coefficient functions $a_{ij}(x)$, so that $a(x)$ is smooth over each finite element triangle $e \in \mathcal{T}$. We use the C-R FEM to discretize (3.4) in space. The C-R finite element space is defined as

$$ V_h = \{ v_h \in L^2(\Omega) : v_h|_e \in P_1(e),\ v_h \text{ continuous at } m_{i,e},\ i = 1,2,3,\ \forall e \in \mathcal{T} \}, $$

where $m_{i,e}$ ($i = 1,2,3$) is the midpoint of the $i$-th edge of a finite element triangle $e \in \mathcal{T}$. The C-R degrees of freedom are associated with these midpoints.

To discretize (3.4) in time, we consider the so-called $\theta$-method, also known as the weighted method. This leads to the following linear system to be solved at each time step:

$$ A\mathbf{u}^n \equiv (M + \Delta t(1-\theta)K)\,\mathbf{u}^n = \mathbf{g}^n, \tag{3.5} $$

where the right-hand side vector depends on the approximate solution at the previous time step via the relation $\mathbf{g}^n = (M - \Delta t\,\theta K)\mathbf{u}^{n-1} + \Delta t((1-\theta)\mathbf{f}^n + \theta\mathbf{f}^{n-1})$. Here, $\Delta t$ is the time step, $\theta$ is the method parameter, $0 \le \theta \le 1$, and $M$ and $K$ denote the FEM mass and stiffness matrices, respectively.
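A single time step of (3.5) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `theta_step` is hypothetical, a dense direct solve stands in for the preconditioned iterative solver, and the matrices are those of 1D linear finite elements on a uniform mesh rather than a Crouzeix-Raviart discretization.

```python
import numpy as np

def theta_step(M, K, u_prev, f_prev, f_new, dt, theta):
    """One step of (3.5): (M + dt*(1-theta)*K) u_new = g."""
    A = M + dt * (1.0 - theta) * K
    g = (M - dt * theta * K) @ u_prev + dt * ((1.0 - theta) * f_new + theta * f_prev)
    return np.linalg.solve(A, g)  # stand-in for a preconditioned iterative solve

# Stand-in matrices (assumption): 1D linear elements, homogeneous Dirichlet data.
n, h = 50, 1.0 / 51
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
M = (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) * h / 6

f = h * np.ones(n)                    # load vector for a constant source term
u_steady = np.linalg.solve(K, f)      # stationary solution K u = f
u_next = theta_step(M, K, u_steady, f, f, dt=0.01, theta=0.5)

# Without a source term, a step with theta = 0 reduces the M-weighted energy.
u0 = np.sin(np.pi * h * np.arange(1, n + 1))
u1 = theta_step(M, K, u0, np.zeros(n), np.zeros(n), dt=0.01, theta=0.0)
```

The stationary solution is a fixed point of the scheme for any $\theta$, which gives a quick consistency check of the signs in the right-hand side.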

3.2.1 Hierarchical transformation definition

We pose the task of constructing the AMLI preconditioner, as described in Section 3.1, for the matrix $A$ in (3.5). To do so, we need to define a hierarchical two-level transformation that can be applied recursively, and to derive proper estimates of the corresponding CBS constant.

By construction, the C-R FEM spaces $V_h^{(0)}, \dots, V_h^{(\ell)}$, corresponding to a series of regular refinements $\mathcal{T}^0 \subset \dots \subset \mathcal{T}^\ell$ of a coarse mesh, are not nested. Due to this, there is no natural hierarchical splitting of the finite element nodes into "coarse" and "fine", and the definition of transformations $J^{(k)}$ that allow the presented AMLI framework to be used is neither obvious nor unique.

Figure 3.2. A Crouzeix-Raviart macroelement consisting of four congruent fine elements, obtained by uniform refinement of a coarse element

One way to construct a hierarchical two-level C-R transformation is to consider the so-called differences and aggregates (DA) approach, first proposed in [24, 25] for discrete elliptic systems. The transformation $J^{(k)}$ is defined on a per-macroelement basis. For a macroelement $E$ with fine node numbering 1 to 9, as shown in Figure 3.2, and a corresponding macroelement matrix $A_E^{(k)}$, the DA local (macroelement) transformation $J_E^{(k)}$ has the form

$$ J_E^{(k)} = \begin{bmatrix}
1 & & & & & & & & \\
 & 1 & & & & & & & \\
 & & 1 & & & & & & \\
 & & & 1 & -1 & & & & \\
 & & & & & 1 & -1 & & \\
 & & & & & & & 1 & -1 \\
1 & & & 1 & 1 & & & & \\
 & 1 & & & & 1 & 1 & & \\
 & & 1 & & & & & 1 & 1
\end{bmatrix}. $$

Under this transformation, the hierarchical basis functions associated with the coarse mesh are obtained as aggregates of nodal basis functions on the fine mesh. The hierarchical macroelement matrix is defined as

$$ \tilde A_E^{(k)} = J_E^{(k)} A_E^{(k)} (J_E^{(k)})^T = \begin{bmatrix} \tilde A_{E,11}^{(k)} & \tilde A_{E,12}^{(k)} \\ \tilde A_{E,21}^{(k)} & \tilde A_{E,22}^{(k)} \end{bmatrix}, \tag{3.6} $$

where the block $\tilde A_{E,22}^{(k)}$ has size $3 \times 3$.

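Then the hierarchical representation of a FEM matrix $A^{(k)} = \sum_{E \in \mathcal{T}^k} A_E^{(k)}$ is defined as

$$ \tilde A^{(k)} = \sum_{E \in \mathcal{T}^k} \tilde A_E^{(k)} = \sum_{E \in \mathcal{T}^k} J_E^{(k)} A_E^{(k)} (J_E^{(k)})^T = J^{(k)} A^{(k)} (J^{(k)})^T = \begin{bmatrix} \tilde A_{11}^{(k)} & \tilde A_{12}^{(k)} \\ \tilde A_{21}^{(k)} & \tilde A_{22}^{(k)} \end{bmatrix}. $$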


It has been shown in [24] that

$$ \tilde K_{22}^{(k)} = 4K^{(k-1)} \tag{3.7} $$

holds true, where $\tilde K_{22}^{(k)}$ is the second diagonal block in the hierarchical DA stiffness matrix $\tilde K^{(k)} = J^{(k)} K^{(k)} (J^{(k)})^T$. In Paper I we derive the relation

$$ \tilde M_{22}^{(k)} = M^{(k-1)} \tag{3.8} $$

for the second diagonal block $\tilde M_{22}^{(k)}$ in the hierarchical DA mass matrix $\tilde M^{(k)} = J^{(k)} M^{(k)} (J^{(k)})^T$. We use the properties (3.7) and (3.8) of the DA splitting to justify an optimal AMLI method for discrete C-R parabolic problems.

3.2.2 Analysis of the CBS constant and a multilevel generalisation

In order for the DA splitting to lead to an optimal order AMLI preconditioner, the CBS constants $\gamma_k$, corresponding to the two-level partitionings of $\tilde A^{(k)}$, $1 \le k < \ell$, should be bounded away from 1 independently of the discretization parameters. We study $\gamma_k$ for the case of (3.5) using local analysis.

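As already mentioned, it can be shown that

$$ \gamma_k \le \max_{E \in \mathcal{T}^k} \gamma_E^k, \tag{3.9} $$

where $\gamma_E^k$ is the local CBS constant of the hierarchical two-by-two splitting of a macroelement matrix $\tilde A_E^{(k)}$, see e.g. [57]. The matrix in the system (3.5) corresponding to $\mathcal{T}^k$ has the form

$$ A^{(k)} = \sum_{E \in \mathcal{T}^k} M_E^{(k)} + \Delta t(1-\theta) \sum_{E \in \mathcal{T}^k} K_E^{(k)} = \sum_{E \in \mathcal{T}^k} \left( M_E^{(k)} + \Delta t(1-\theta) K_E^{(k)} \right), $$

where $M_E^{(k)}$ and $K_E^{(k)}$ are the macroelement mass and stiffness matrices. Thus,

$$ \tilde A_E^{(k)} = J_E^{(k)} \left( M_E^{(k)} + \Delta t(1-\theta) K_E^{(k)} \right) (J_E^{(k)})^T = \tilde M_E^{(k)} + \Delta t(1-\theta) \tilde K_E^{(k)}. \tag{3.10} $$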

The estimate $\gamma_{K,E}^k \le \sqrt{3/4}$ holds true for the local CBS constant of the hierarchical two-by-two splitting of a macroelement stiffness matrix $\tilde K_E^{(k)} = J_E^{(k)} K_E^{(k)} (J_E^{(k)})^T$, see e.g. [24]. In Paper I, the estimate $\gamma_{M,E}^k \le \sqrt{1/2}$ is shown for the local CBS constant of a DA splitting of the mass macroelement matrix $\tilde M_E^{(k)} = J_E^{(k)} M_E^{(k)} (J_E^{(k)})^T$. Both estimates are robust with respect to discretization parameters and mesh anisotropies. Then, it follows from (3.10) that the CBS constant for the two-by-two block splitting (3.6) is bounded by $\gamma_E^k \le \max\{\gamma_{M,E}^k, \gamma_{K,E}^k\}$. Using (3.9) we obtain the estimate

$$ \gamma_k \le \sqrt{3/4}, $$

which holds true for all $1 \le k \le \ell$, independently of mesh size and mesh anisotropies.

We prove the following spectral equivalence relations that link the Schur complements $\tilde S^{(k)} = \tilde A_{22}^{(k)} - \tilde A_{21}^{(k)} (\tilde A_{11}^{(k)})^{-1} \tilde A_{12}^{(k)}$ to the matrices on the coarser meshes $\mathcal{T}^{k-1}$:

$$ \tfrac{1}{4}\left( M^{(k-1)} + \Delta t(1-\theta)\,4K^{(k-1)} \right) \le \tilde S^{(k)} \le M^{(k-1)} + \Delta t(1-\theta)\,4K^{(k-1)}, $$

$$ \tfrac{1}{2} A^{(k-1)} \le \tilde S^{(k)} \le 4A^{(k-1)}. $$
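The mass-matrix results used above, relation (3.8) and the local bound $\sqrt{1/2}$, can be checked numerically on a single macroelement. The sketch below rests on two assumptions that should be compared with Figure 3.2: nodes 1 to 3 are taken to be the interior edge midpoints and nodes 4 to 9 to lie pairwise on the three coarse edges, and the fine element area `area_fine` is an arbitrary sample value. It also uses the fact that the C-R element mass matrix is diagonal with entries $|e|/3$.

```python
import numpy as np

# DA transformation on one macroelement (node numbering is an assumption).
J = np.zeros((9, 9))
J[0, 0] = J[1, 1] = J[2, 2] = 1.0
for r, (a, b) in enumerate([(3, 4), (5, 6), (7, 8)]):
    J[3 + r, [a, b]] = [1.0, -1.0]   # differences of the two nodes on a coarse edge
    J[6 + r, [r, a, b]] = 1.0        # aggregates

area_fine = 0.25                      # hypothetical area of one fine triangle
# Interior nodes are shared by two fine triangles, outer nodes belong to one.
M_E = (area_fine / 3.0) * np.diag([2, 2, 2, 1, 1, 1, 1, 1, 1]).astype(float)

M_tilde = J @ M_E @ J.T
M11, M12, M22 = M_tilde[:6, :6], M_tilde[:6, 6:], M_tilde[6:, 6:]

# Coarse C-R mass matrix of the macroelement (area 4 * area_fine).
M_coarse = (4.0 * area_fine / 3.0) * np.eye(3)

# gamma^2 is the largest eigenvalue of M22^{-1} M21 M11^{-1} M12.
G = np.linalg.solve(M22, M12.T @ np.linalg.solve(M11, M12))
gamma_M_sq = np.linalg.eigvals(G).real.max()
```

For this setup the second diagonal block coincides with the coarse mass matrix, and the squared local CBS constant of the mass splitting equals 1/2.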

The performed analysis suggests two possible ways to construct a multilevel AMLI preconditioner $C^{(\ell)}$. The current Schur complement can be approximated by using the matrix $M^{(k-1)} + 4^{\ell-k+1}\Delta t(1-\theta)K^{(k-1)}$ or by using the system matrix $A^{(k-1)}$ itself. In the numerical study presented in Paper I, we consider the latter approximation. The numerical results confirm the optimal properties of the proposed AMLI preconditioner for parabolic problems.


4. Solution methods for systems of PDEs

When constructing preconditioners, there are two general frameworks to follow. The first one is to assume that the only source of information is the matrix itself. In this case no other knowledge is used about, for example, the original continuous problem, the discretization mesh, the ordering of the unknowns or the discretization technique. Typical representatives of preconditioners from this class are algebraic multigrid, incomplete factorizations and some methods for sparse approximate inverses. Such preconditioners are rather general and can be applied to matrices of various types. Practice shows, however, that these general preconditioners can be less efficient than the so-called "problem dependent" preconditioners, which use more information than the matrix alone and may, for instance, utilise specific characteristics of the continuous and discrete models. Some examples are solution methods for incompressible flow and elasticity problems, see e.g. [41, 45, 5, 66, 10, 62]. For systems of PDEs, one important property is the inherent block structure of the corresponding matrices. In this thesis we consider the solution of multiphase flow problems described by the phase-field model. We analyse efficient preconditioning techniques based on the two-by-two block structure of the discrete problem and the properties of the constituent matrices.

4.1 The Cahn-Hilliard equation

Consider the task of numerically simulating the evolution in time of the interface between different phases, subject to various physical processes such as diffusion and convection. Multiphase processes advance through free contact surfaces that must be accurately tracked by the numerical methods. The computational model should be able to account for dynamic events such as interface creation, deformation, coalescence and destruction. In the phase-field framework, the interface is assumed to be diffuse. It is modelled by a function $c(x,t)$, which represents the concentration of each of the components in the system. The function $c(x,t)$ attains a distinct constant value in each bulk phase and changes rapidly, but smoothly, in the interface regions between the phases. For a binary fluid, a usual assumption is that $c$ takes values between $-1$ and $1$, or between $0$ and $1$. The approach can be generalised to the treatment of multiphase problems (see, e.g., [42]), considered in Paper V.


The phase-field model goes back to a pioneering work by van der Waals in 1893 ([74]), and is based on classical thermodynamics arguments by Gibbs in 1873 ([48]). The profile of the interface is determined by the balance between random molecular motion and molecular attraction in terms of the free energy density of the system, as formulated by Cahn and Hilliard, see e.g. [34],

$$ f(c) = \tfrac{1}{2}\alpha|\nabla c|^2 + \beta\Psi(c). \tag{4.1} $$

The term $|\nabla c|^2$ is related to intermolecular interactions and can be viewed as penalising the creation of interfaces. For the case of binary mixtures, the function $\Psi(c)$ is a double-well potential with two stable minima. The physical system seeks to minimise the free energy $E(c) = \int_\Omega f(c)\,d\Omega$, a process described by the Cahn-Hilliard equation,

$$ \begin{aligned}
&\frac{\partial c}{\partial t} + (u \cdot \nabla)c = \omega\Delta\left(\Psi'(c) - \varepsilon^2\Delta c\right), && (x,t) \in \Omega_T \equiv \Omega \times (0,T), \\
&n \cdot \nabla c = n \cdot \nabla\left(\Psi'(c) - \varepsilon^2\Delta c\right) = 0, && x \in \partial\Omega, \\
&c(x,t) = c_0(x), && t = 0.
\end{aligned} \tag{4.2} $$

In Papers II, III, and IV we consider two-phase flow problems where

$$ \Psi(c) = \tfrac{1}{4}(c^2 - 1)^2; \tag{4.3} $$

thus, $c$ varies between $-1$ and $1$. The problem parameter $\varepsilon$ determines the thickness of the interface between the two phases. The velocity field $u$ is non-zero whenever the interface develops due to both diffusion and convection. To model such processes, the C-H equation is coupled to the Navier-Stokes (N-S) equations,

$$ \begin{aligned}
Re\left( \frac{\partial u}{\partial t} + (u \cdot \nabla)u \right) &= -\nabla p + \Delta u - \frac{1}{Ca\,Cn}\,\eta\nabla c, \\
\nabla \cdot u &= 0,
\end{aligned} $$

where $Cn$, $Re$ and $Ca$ are the so-called Cahn, Reynolds and Capillary numbers, respectively, and $\eta$ is the so-called chemical potential. Usually, operator splitting is used for the coupled C-H–N-S system, where $u$ is assumed to be already computed when (4.2) needs to be solved. In non-convective models, $\omega = 1$. In the nondimensionalized formulation of the convective C-H equation, $\omega = 1/Pe$, where $Pe$ is the so-called Peclet number, and $\varepsilon$ is replaced in (4.2) by the Cahn number $Cn$ (e.g. [76]).

4.2 Discrete formulation

In our work, we study the C-H problem (4.2) in an equivalent formulation as a system of two second-order equations (see e.g. also [39, 43]), where the chemical potential $\eta = \Psi'(c) - \varepsilon^2\Delta c$ is considered as an auxiliary variable. The system then takes the form

$$ \begin{aligned}
&\eta - \Psi'(c) + \varepsilon^2\Delta c = 0, && (x,t) \in \Omega_T, \\
&-\omega\Delta\eta + \frac{\partial c}{\partial t} + (u \cdot \nabla)c = 0, && (x,t) \in \Omega_T, \\
&n \cdot \nabla c = 0, \quad n \cdot \nabla\eta = 0, && x \in \partial\Omega, \\
&c(x,t) = c_0(x), && t = 0.
\end{aligned} \tag{4.4} $$

Figure 4.1. Phase separation and coarsening in a binary mixture.

One advantage of considering (4.4) instead of (4.2) is that it can be discretized in space by lower order FEM, for instance using conforming linear basis functions. We use the same finite element space $V_h$ for both variables. For simplicity of the presentation, assume that the implicit Euler scheme with a constant step $\Delta t$ is used to discretize the equations in time. Denote by $\{t_k\}$, $k = 0, 1, \dots$, a sequence of time steps, $t_0 = 0$, $t_k = t_{k-1} + \Delta t$, and let $\mathbf{c}^{(k)} = \{c(x_i,t_k)\}_{i=1}^N$ and $\boldsymbol{\eta}^{(k)} = \{\eta(x_i,t_k)\}_{i=1}^N$, where $\{x_i\}_{i=1}^N$ are the finite element nodes. Then the fully discretized C-H system to be solved at each time step takes the following form:

$$ \begin{aligned}
M\boldsymbol{\eta}^{(k)} - \mathbf{f}(\mathbf{c}^{(k)}) - \varepsilon^2 K\mathbf{c}^{(k)} &= 0, \\
\omega\Delta t\,K\boldsymbol{\eta}^{(k)} + M\mathbf{c}^{(k)} + \Delta t\,W\mathbf{c}^{(k)} - M\mathbf{c}^{(k-1)} &= 0,
\end{aligned} \tag{4.5} $$

where $k = 1, 2, \dots$, and $M$, $K$ and $W$ are the mass matrix, the stiffness matrix and the matrix resulting from discretizing the convective term, respectively. The elements of the vector $\mathbf{f}(\mathbf{c}^{(k)}) = \{f_i(\mathbf{c}^{(k)})\}_{i=1}^N$ are defined as

$$ f_i(\mathbf{c}^{(k)}) = \left( \Psi'\Big( \sum_{j=1}^N c(x_j,t_k)\chi_j \Big), \chi_i \right), $$

where $\{\chi_j\}_{j=1}^N$ are the finite element nodal basis functions of $V_h$.

Panels of Figure 4.1: (a) random initial condition; (b) solution at time t = 0.012; (c) solution at time t = 0.025; (d) solution at time t = 0.04.

Remark 4.2.1 We consider the backward Euler method only to ease the presentation of the ideas. The general results in the thesis are applicable to different time discretization schemes, as well as to variable time stepping algorithms.

The system (4.5) is nonlinear and in our studies we solve it via a Newton-type method. The exact Jacobian of (4.5) has the form

$$ A = \begin{bmatrix} M & -J - \varepsilon^2 K \\ \omega\Delta t\,K & M + \Delta t\,W \end{bmatrix}, \tag{4.6} $$

where $J = J(\mathbf{f}(\mathbf{c}))$ is the Jacobian of the nonlinear term $\mathbf{f}(\mathbf{c})$ only. An important property that we use in our studies is that

$$ J(\mathbf{f}(\mathbf{c})) = \left\{ \int_\Omega j_{mn}(c)\,\chi_m\chi_n \right\}_{m,n=1}^N $$

has a structure similar to that of a mass matrix. The coefficients $j_{mn}(c)$ depend on the concentration via the derivative of $\Psi'(c)$ and, for the particular choice (4.3) of $\Psi(c)$, are bounded from below and above as $-1 \le j_{mn}(c) \le 2$.
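The bounds on the coefficients follow from a short calculation, written out here under the assumption that $j_{mn}(c)$ is the pointwise value of $\Psi''$ at a concentration within the physical range $[-1, 1]$:

```latex
\Psi(c) = \tfrac{1}{4}(c^2 - 1)^2
\quad\Longrightarrow\quad
\Psi'(c) = c^3 - c,
\qquad
\Psi''(c) = 3c^2 - 1,
\\[4pt]
-1 \le c \le 1
\quad\Longrightarrow\quad
-1 = \Psi''(0) \;\le\; \Psi''(c) \;\le\; \Psi''(\pm 1) = 2 .
```

In particular, the coefficient changes sign inside the interface region, so $J$ is mass-like in structure but in general indefinite.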

In the classical Newton method framework, an approximate solution $\mathbf{y}^{(k)} = ((\boldsymbol{\eta}^{(k)})^T, (\mathbf{c}^{(k)})^T)^T$ of (4.5) is obtained via an iterative procedure

$$ \mathbf{y}^{(k),s+1} = \mathbf{y}^{(k),s} + \Delta\mathbf{y}^{(k),s}, \quad s = 0, 1, \dots $$

The updates $\Delta\mathbf{y}^{(k),s}$ are found as solutions of systems of the form

$$ A\,\Delta\mathbf{y}^{(k),s} = \mathbf{g}^{(k),s}, \tag{4.7} $$

where the right-hand side $\mathbf{g}^{(k),s}$ depends on the current iterate $\mathbf{y}^{(k),s}$. In this thesis, we consider methods to solve (4.7) either by using a preconditioned iterative method or by replacing $A$ by an approximation that is easier to solve with. The proposed preconditioning methods are based on the assumption that, under certain conditions, a high-quality approximation of $A$ can be constructed by neglecting the blocks that correspond to the nonlinear and the convective terms of the discrete model. In Sections 4.3 and 4.4 we present the developed approximations and in Section 4.5 we detail their use within the framework of inexact Newton methods.

4.3 Symmetric positive definite Schur approximations

First we investigate the standard preconditioning technique, based on approximations of the exact Schur complement $S$ in the two-by-two factorization (2.6). The exact Schur complement of $A$ has the form

$$ S = M + \Delta t\,W + \omega\Delta t\,KM^{-1}(J + \varepsilon^2 K). \tag{4.8} $$

Recall that an approximation of $S$ has to fulfil two important conditions: it has to be sparse, and systems with it have to be solvable in an efficient way. Note that the majority of results on efficient methods for the solution of algebraic systems concern s.p.d. matrices, while here $S$ is non-symmetric. In Paper II we present some approaches to construct s.p.d. preconditioners for $S$.

Consider the s.p.d. matrix

$$ \hat S = M + \varepsilon^2\omega\Delta t\,KM^{-1}K. \tag{4.9} $$

We analyse how well $\hat S$ approximates $S$ by examining the error matrix

$$ E_{\hat S} = \hat S^{-1}(S - \hat S) = \Delta t\,\hat S^{-1}W + \omega\Delta t\,\hat S^{-1}KM^{-1}J. \tag{4.10} $$

Assume that the discretization parameter $h$ is chosen so that the interface layers for problem (4.4) are well resolved, namely, $h = \varepsilon/r$ for a proper $r > 1$, and $h = \xi/Pe$ with $\xi$ small. Then we show that, for a time step $\Delta t$ small enough with respect to the characteristic mesh size $h$, the norm of $E_{\hat S}$ is close to zero and $\hat S$ is a high-quality approximation to $S$, see also [14, 26].

Note that, when non-conforming Crouzeix-Raviart FEM is used, the mass matrix $M$ is diagonal. Then $\hat S$ is sparse and can be computed explicitly. In the case of linear conforming finite elements, the inverse of $M$ is dense and not cheap to form. However, we propose a further approximation, in the form

$$ \hat S_D = M + \varepsilon^2\omega\Delta t\,KD^{-1}K, $$

where $D = \mathrm{diag}(M)$. Based on previous results for the diagonal approximation of the mass matrix $M$, as for instance in [77], we show that the estimate

$$ \kappa(\hat S_D^{-1}\hat S) \le 4 $$

holds true.

Further, a matrix with the structure of $\hat S$ can be preconditioned in an optimal manner by yet another approximation,

$$ \hat S_F = (M + \varepsilon\sqrt{\omega\Delta t}\,K)\,M^{-1}(M + \varepsilon\sqrt{\omega\Delta t}\,K). $$

It turns out that $\hat S_F$ is a high-quality approximation of $\hat S$, namely

$$ \kappa(\hat S_F^{-1}\hat S) \le 2, $$

see e.g. [67]. The solution of systems with $\hat S_F$ requires two solutions of systems with the s.p.d. matrix $M + \varepsilon\sqrt{\omega\Delta t}\,K$, for which well-developed preconditioned iterative methods can be used, such as multilevel or multigrid methods.
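The bound on $\kappa(\hat S_F^{-1}\hat S)$ can be observed numerically on a small model problem. The sketch below makes several assumptions for illustration only: the matrices come from 1D linear finite elements on a uniform mesh, and the parameter values are sample choices.

```python
import numpy as np

# Stand-in matrices (assumption): 1D linear elements on a uniform mesh.
n, h = 40, 1.0 / 41
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
M = (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) * h / 6

eps, omega, dt = 0.0625, 1.0, h / 2          # illustrative parameter values
beta = eps * np.sqrt(omega * dt)

S_hat = M + beta**2 * K @ np.linalg.solve(M, K)   # (4.9)
Z = M + beta * K
S_F = Z @ np.linalg.solve(M, Z)                   # factorized approximation

# Eigenvalues of S_F^{-1} S_hat are real; tiny imaginary round-off is dropped.
lam = np.linalg.eigvals(np.linalg.solve(S_F, S_hat)).real
kappa = lam.max() / lam.min()
```

The reason behind the bound is visible on the generalized eigenpairs $K\mathbf{v} = \mu M\mathbf{v}$: with $t = \varepsilon\sqrt{\omega\Delta t}\,\mu \ge 0$, the eigenvalues of $\hat S_F^{-1}\hat S$ are $(1 + t^2)/(1 + t)^2$, which lie in $[1/2, 1]$.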


4.4 Block-structured Jacobian approximation

Even though straightforward two-by-two factorization preconditioning techniques, such as those outlined in Section 4.3, are applicable to the matrix $A$, it turns out that $A$ can be preconditioned even better by utilising its particular structure. This is done in Paper III, see also Paper IV. The approximation procedure consists of two steps. First, the blocks that correspond to the discrete nonlinear and convective terms in (4.4) are neglected. We obtain the matrix

$$ A_0 = \begin{bmatrix} M & -\varepsilon^2 K \\ \omega\Delta t\,K & M \end{bmatrix} \tag{4.11} $$

and analyse the generalised eigenvalue problem

$$ A\mathbf{q} = \lambda A_0\mathbf{q}. \tag{4.12} $$

In Paper III we show that, for $\Delta t$ small enough, all eigenvalues $\lambda$ cluster around the value 1 and $A_0^{-1}A$ acts similarly to the identity operator. This holds true for $\Delta t < h$ for models where $\omega = 1/Pe$, and for $\Delta t < h^2$ for models where $\omega = 1$. The numerical analysis, however, suggests that for both choices of the problem parameter $\omega$, $A_0$ performs as a high-quality approximation of $A$ for time steps $\Delta t = h/r$. Numerical tests indicate that $r = 4$, or even $r = 2$, is already good enough.

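As a second step, consider the matrix

$$ \hat A_0 = \begin{bmatrix} M & -\varepsilon^2 K \\ \omega\Delta t\,K & M + 2\varepsilon\sqrt{\omega\Delta t}\,K \end{bmatrix}. \tag{4.13} $$

An analysis of the eigenvalues of the generalized eigenvalue problem

$$ A_0\mathbf{q} = \hat\lambda\hat A_0\mathbf{q} $$

in [14], related to earlier work in [11], leads to the estimate

$$ \tfrac{1}{2} \le \hat\lambda \le 1; $$

thus, $\hat A_0$ is an optimal preconditioner for $A_0$. Furthermore, it can be efficiently implemented, due to its exact block factorization

$$ \hat A_0 = \begin{bmatrix} M & 0 \\ \omega\Delta t\,K & M + \varepsilon\sqrt{\omega\Delta t}\,K \end{bmatrix} \begin{bmatrix} I & -\varepsilon^2 M^{-1}K \\ 0 & M^{-1}(M + \varepsilon\sqrt{\omega\Delta t}\,K) \end{bmatrix}. \tag{4.14} $$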
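Both the factorization (4.14) and the eigenvalue estimate can be verified numerically on a small example. In the sketch below the matrices are those of 1D linear finite elements and the parameter values are sample choices; both are assumptions made only for illustration.

```python
import numpy as np

# Stand-in matrices (assumption): 1D linear elements on a uniform mesh.
n, h = 30, 1.0 / 31
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
M = (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) * h / 6

eps, omega, dt = 0.0625, 1.0, h / 2          # illustrative parameter values
beta = eps * np.sqrt(omega * dt)
Z = M + beta * K
O, I = np.zeros((n, n)), np.eye(n)

A0 = np.block([[M, -eps**2 * K], [omega * dt * K, M]])                     # (4.11)
A0_hat = np.block([[M, -eps**2 * K], [omega * dt * K, M + 2 * beta * K]])  # (4.13)

# The two factors of (4.14).
left = np.block([[M, O], [omega * dt * K, Z]])
right = np.block([[I, -eps**2 * np.linalg.solve(M, K)],
                  [O, np.linalg.solve(M, Z)]])
factorization_error = np.abs(left @ right - A0_hat).max()

# Eigenvalues of A0_hat^{-1} A0.
lam = np.linalg.eigvals(np.linalg.solve(A0_hat, A0))
```

The product of the two factors reproduces $\hat A_0$ because the cross term $\omega\Delta t\,\varepsilon^2 KM^{-1}K$ cancels in the lower right block.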

As derived in [14], and also detailed in Paper IV, a solution of a system

$$ \hat A_0 \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{f}_1 \\ \mathbf{f}_2 \end{bmatrix} $$

can be obtained via the steps

1. Compute $\mathbf{b}_1 = \alpha\mathbf{f}_1 + \mathbf{f}_2$;
2. Solve $Z\mathbf{g} = \mathbf{b}_1$;
3. Compute $\mathbf{b}_2 = M\mathbf{g} - \alpha\mathbf{f}_1$;
4. Solve $Z\mathbf{x}_2 = \mathbf{b}_2$;
5. Compute $\mathbf{x}_1 = \alpha^{-1}(\mathbf{g} - \mathbf{x}_2)$,

where $\alpha = \sqrt{\omega\Delta t}/\varepsilon$ and $Z = M + \varepsilon\sqrt{\omega\Delta t}\,K$. We stress that in the above algorithm only two solutions of systems with the s.p.d. matrix $Z$ arise. All other operations consist of one matrix-vector multiplication and three vector updates. The structure of the matrix $Z$, being a sum of a mass matrix and a scaled stiffness matrix, allows for a straightforward use of optimal multilevel and multigrid solvers.

Finally, as a last simplification, we show that the matrix $\hat A_0$ can be used to approximate $A$ directly. A theoretical analysis of this is presented in Paper IV, where estimates for the eigenvalues of the preconditioned matrix $\hat A_0^{-1}A$ are derived. There we show that half of the eigenvalues are equal to one, and the other half belong to a disc, centred at one, with radius $1/2 + \zeta$, where $\zeta$ is close to zero under the same assumptions for the time step as in the analysis of (4.12).

4.5 Efficient solvers based on inexact Newton methods

As already mentioned, the C-H problem is nonlinear and in this thesis we solve it by an inexact Newton method. To fix notation and terminology, consider the problem

$$ F(y) = 0, \tag{4.15} $$

where $F$ is a nonlinear operator. The general idea of Newton's method is to obtain an approximate solution of (4.15) via the following procedure.

Given an initial guess $y^0$, for $s = 0, 1, \dots$ until convergence, linearize $F$ around the current guess $y^s$ and compute a correction $\delta^s$,

$$ F'(y^s)\,\delta^s = -F(y^s), \tag{4.16} $$

where $F'$ is the exact Jacobian of $F$ with respect to $y$. The next iterate is computed as $y^{s+1} = y^s + \delta^s$, expecting that it approximates the solution of the nonlinear problem better than $y^s$.

When the system with the Jacobian matrix $F'(y^s)$ is solved exactly, the method is referred to as the exact Newton method. When the correction is obtained by solving (4.16) inexactly using some iterative method, or when $F'(y^s)$ is replaced by some approximation, the method is referred to as the inexact Newton method.

Assume that the nonlinear operator $F$ is differentiable, that is, $F'$ exists, and that $F'$ is nonsingular. Assume also that we possess a good enough initial guess $y^0$ and consider the following two variants of the inexact Newton method:


(i1) Solve (4.16) by a preconditioned iterative method. The nonlinear algorithm takes the form:

Given $y^0$,
For $s = 0, 1, 2, \dots$ until convergence
(i1-1) Solve $F'(y^s)\,\delta^s = -F(y^s) + r^s$, where $\|r^s\| / \|F(y^s)\| \le \tau_s$, $\tau_s \in (0,1)$
(i1-2) Update $y^{s+1} = y^s + \varsigma_s\delta^s$, $\varsigma_s \in (0,1]$

(i2) Replace $F'(y^s)$ by an approximation $\hat F'(y^s)$. Then the nonlinear method reads:

Given $y^0$,
For $s = 0, 1, 2, \dots$ until convergence
(i2-1) Solve $\hat F'(y^s)\,\delta^s = -F(y^s)$
(i2-2) Update $y^{s+1} = y^s + \varsigma_s\delta^s$, $\varsigma_s \in (0,1]$

Above, $\varsigma_s$ is a so-called damping parameter.

Globally convergent inexact Newton methods of type (i1) and (i2) have been studied earlier, e.g., in [2, 69, 21]. There, it is assumed that $F$ is Fréchet-differentiable and $F'$ is Lipschitz-continuous. When a preconditioned iterative method is used at step (i1-1), global convergence of (i1) is derived for the case of a linear, uniformly bounded preconditioner $\hat F'$, see e.g. [2, 69]. Methods of type (i2) are shown to be globally convergent for $\hat F'$ such that $\|\hat F'^{-1}F'(y^s) - I\| < 1$, see e.g. [21].

In this work, the discrete nonlinear C-H problem (4.5) is solved via both methods, (i1) and (i2). The exact Jacobian $F' \equiv A$ has the form (4.6). The preconditioners for $A$ proposed and analysed in this thesis are linear and uniformly bounded, and can be successfully used within the framework of method (i1). The analysis in Paper IV shows that $\|\hat A_0^{-1}A - I\| < 1$, and when $\hat A_0$ is used as an approximation of $A$, global convergence for method (i2) is straightforward to show. Moreover, both methods perform in an optimal manner. We note that, due to the observed fast convergence, in our work we use $\varsigma_s = 1$.
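Method (i2) can be illustrated on a small model problem in which the nonlinear part of the Jacobian is neglected, in the same spirit as replacing $A$ by an approximation built from the linear blocks. Everything in the sketch below is an assumption made for illustration: the function name `inexact_newton_i2`, the model operator $F(y) = By + 0.1\tanh(y) - b$ and the choice $\hat F' = B$ are not taken from the thesis.

```python
import numpy as np

def inexact_newton_i2(F, solve_approx_jacobian, y0, tol=1e-10, max_iter=50, damping=1.0):
    """Method (i2): corrections come from an approximate Jacobian."""
    y = y0.copy()
    for s in range(max_iter):
        residual = F(y)
        if np.linalg.norm(residual) < tol:
            return y, s
        delta = solve_approx_jacobian(-residual)   # step (i2-1)
        y = y + damping * delta                    # step (i2-2)
    return y, max_iter

# Model problem (assumption): F(y) = B y + 0.1 * tanh(y) - b with s.p.d. B.
n = 20
B = 4 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # eigenvalues in (2, 6)
b = np.ones(n)

def F(y):
    return B @ y + 0.1 * np.tanh(y) - b

def solve_with_B(rhs):
    # Approximate Jacobian: the nonlinear term is neglected.
    return np.linalg.solve(B, rhs)

y, iterations = inexact_newton_i2(F, solve_with_B, np.zeros(n))
```

For this model problem $\|B^{-1}F'(y) - I\| \le 0.05$, so the condition quoted above for methods of type (i2) is satisfied and the iteration contracts without damping. The price of not using the exact Jacobian is linear rather than quadratic convergence.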

In Table 4.1 we illustrate the performance of method (i1), where the exact Jacobian $A$ is preconditioned by $\hat A_0$. We simulate phase separation and coarsening in a binary mixture. Here, $\Omega = [-1/2, 1/2] \times [0,1]$, $\omega = 1$, $\varepsilon = 0.0625$ and there is no convection, see Figure 4.1. Each cell in the table contains information in the format It1/It2/It3. It1 denotes the number of nonlinear iterations per time step, averaged over 10 time steps, with a tolerance of $10^{-6}$ in the Newton method. The average number of linear iterations for solving systems with $A$ preconditioned by $\hat A_0$, for a linear solver tolerance of $10^{-6}$, is denoted by It2, and It3 shows the average number of linear iterations to solve systems with $Z$, preconditioned by an algebraic multigrid (AMG) method (see [51] for details on the particular AMG method), for a tolerance of $10^{-3}$. The results confirm that the proposed approach leads to an efficient method with optimal convergence and computational complexity properties. The numbers of nonlinear and inner linear iterations do not increase with the system size $N$, and at each outer and inner iteration only operations with computational complexity $O(N)$ need to be performed. Moreover, the condition on the size of the time step needed to ensure such behaviour is not too restrictive. In the presented case, $\Delta t = h/2$ is already small enough for the method to be efficient.

Size      Δt = h        Δt = h/2      Δt = h/4      Δt = h/10     Δt = h²
8450      7 / 22 / 3    3 / 14 / 3    3 / 11 / 3    3 / 8 / 3     3 / 8 / 3
33282     3 / 13 / 3    3 / 10 / 3    3 / 9 / 3     3 / 8 / 3     3 / 7 / 3
132098    3 / 18 / 3    3 / 9 / 3     3 / 8 / 3     3 / 8 / 3     3 / 7 / 3

Table 4.1. Iteration counts for different choices of a time step ∆t for the simulation of phase separation and coarsening in a binary mixture.

Note that in the case of method (i2) we omit the iterative solution procedure for systems with the exact Jacobian. However, the approximation (i2-1) generally leads to an increase in the number of nonlinear iterations. A comparison of the performance of methods (i1) and (i2) is presented in Paper IV. The results from the performed numerical experiments show that, for a large enough number of degrees of freedom, using method (i2) leads to a better overall performance. Moreover, in this case there is no need to form the block $J$ at each Newton step, which further decreases the computational costs.

4.6 Generalisation of the solution techniques for multiphase systems with arbitrary number of components

Generalisations of the phase-field model to any number of components have been done, for instance, in [28, 40, 44, 55]. The models are based on various assumptions for the form of the free energy functional

$$ E(\mathbf{c}, \nabla\mathbf{c}) = \int_\Omega f(\mathbf{c}, \nabla\mathbf{c})\,d\Omega $$

of the physical system. Here, $f(\mathbf{c}, \nabla\mathbf{c})$ denotes the particular choice of the free energy density function.

Let us assume that we deal with a mixture of $n$ immiscible components (phases), which occupy an isolated region $\Omega \subset \mathbb{R}^d$, $d = 1, 2, 3$. We denote by $\mathbf{c} = [c_1, c_2, \dots, c_n]^T$, $c_i = c_i(x,t)$, $i = 1, 2, \dots, n$, the concentrations (the order parameters) of each of the phases. The variables $c_i(x,t)$ can be seen as mole fractions and, thus, the only physically relevant values that the concentrations are assumed to attain lie on the so-called Gibbs simplex

$$ G \equiv \left\{ \mathbf{c} \in \mathbb{R}^n : \sum_{i=1}^n c_i = 1,\ 0 \le c_i \le 1 \right\}. \tag{4.17} $$

Since $\sum_{i=1}^n c_i = 1$, only $n - 1$ of the concentrations are independent and one concentration can always be computed as $c_n = 1 - \sum_{i=1}^{n-1} c_i$.
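The constraint (4.17) and the recovery of the last concentration can be written as two small helpers. The function names `complete_concentrations` and `on_gibbs_simplex`, the tolerance and the sample values are assumptions made for this illustration.

```python
import numpy as np

def complete_concentrations(c_independent):
    """Append c_n = 1 - sum(c_1, ..., c_{n-1}) to the independent concentrations."""
    c = np.asarray(c_independent, dtype=float)
    return np.append(c, 1.0 - c.sum())

def on_gibbs_simplex(c, tol=1e-12):
    """Check sum(c) = 1 and 0 <= c_i <= 1 up to a round-off tolerance."""
    c = np.asarray(c, dtype=float)
    return bool(abs(c.sum() - 1.0) <= tol and np.all(c >= -tol) and np.all(c <= 1.0 + tol))

c_full = complete_concentrations([0.2, 0.3])   # three-component sample state
```

Note that the sum constraint alone does not guarantee admissibility: a vector such as (0.7, 0.6, -0.3) sums to one but lies outside the simplex.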

In addition to the concentration $c_i$, the chemical potential $\eta_i(x,t)$ is also defined for each component. The vector $\boldsymbol{\eta} = [\eta_1, \eta_2, \dots, \eta_n]^T$ of all chemical potentials is taken to be the functional derivative of $E$, i.e.,

$$ \boldsymbol{\eta} = \frac{\delta E}{\delta\mathbf{c}}. \tag{4.18} $$

Using the chemical potential, the multi-component Cahn-Hilliard system can be written as

$$ \frac{\partial\mathbf{c}}{\partial t} + (u \cdot \nabla)\mathbf{c} = \nabla \cdot (L(\mathbf{c})\nabla\boldsymbol{\eta}), \quad x \in \Omega,\ t > 0, \tag{4.19} $$

coupled with no mass flux boundary conditions $(L\nabla\boldsymbol{\eta})_i \cdot n = 0$ for all $i$, where $x \in \partial\Omega$, $t > 0$. Note that (4.19) is satisfied component-wise and we have assumed that convection is also included in the model. Here, $L(\mathbf{c})$ is the so-called mobility matrix, which is symmetric and positive semidefinite.

The particular form of the C-H system depends on the choice of the free energy density in the model. The function $f(\mathbf{c}, \nabla\mathbf{c})$ can be decomposed as

$$ f(\mathbf{c}, \nabla\mathbf{c}) = f_0(\mathbf{c}) + f_1(\mathbf{c}, \nabla\mathbf{c}) + f_2(\mathbf{c}), $$

where $f_0(\mathbf{c})$ is referred to as the homogeneous free energy density, $f_1(\mathbf{c}, \nabla\mathbf{c})$ as the gradient free energy density, and $f_2(\mathbf{c})$ accounts for other effects included in the model, for example mechanical ones. Two approaches often used in the related literature are to have a polynomial or a logarithmic form of $f_0(\mathbf{c})$. Different forms exist also for the gradient free energy.

In this thesis, we assume that $L$ is a constant diagonal matrix, as used in the majority of the related papers on simulation of multi-component systems. We assume also that there are no couplings between the components of $\mathbf{c}$ in $f_1(\mathbf{c}, \nabla\mathbf{c})$, and that $f_2(\mathbf{c}) = 0$. More details can be found in Paper V.

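Under the above assumptions, we solve problem (4.18)-(4.19) by a finite element discretization in space and an implicit scheme in time. We proceed with the discretization and linearization as in our previous work for two-phase flows. The exact Jacobian of the discrete system for the unknown vector $\mathbf{y} = (\boldsymbol{\eta}^T, \mathbf{c}^T)^T$ admits the form

$$ \mathcal{A} = \begin{bmatrix} \mathcal{M} & -\gamma\mathcal{K} - \mathcal{J}(\mathcal{F}(\mathbf{c})) \\ \Delta t\,\omega\mathcal{K} & \mathcal{M} + \Delta t\,\mathcal{W} \end{bmatrix}. \tag{4.20} $$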


In (4.20), $\mathcal{J}(\mathcal{F}(\mathbf{c}))$ is the Jacobian of the nonlinear term $\mathcal{F}(\mathbf{c})$ that corresponds to the homogeneous part of the free energy density. The matrices $\mathcal{M}$, $\mathcal{K}$ and $\mathcal{W}$ have a block-diagonal structure,

$$ \mathcal{M} = \begin{bmatrix} M & & \\ & \ddots & \\ & & M \end{bmatrix}, \quad \mathcal{K} = \begin{bmatrix} K & & \\ & \ddots & \\ & & K \end{bmatrix}, \quad \mathcal{W} = \begin{bmatrix} W & & \\ & \ddots & \\ & & W \end{bmatrix}, $$

with $n - 1$ blocks that are mass, stiffness and convection matrices, respectively. The problem parameters $\gamma$ and $\omega$ determine interface thickness and mobility properties. For simplicity, we have assumed that $\omega$ is scaled as in [28], see Paper V. The framework that we present here is applicable also to the case when there is a block-diagonal matrix with blocks $\omega_i I$, $i = 1, \dots, n - 1$, in (4.20) instead of the scalar $\omega$.

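Analogously to the case of binary flow models, we show that, for small enough time steps, the terms corresponding to convective and nonlinear effects in the process can be neglected, and $\mathcal{A}$ can be approximated by

$$\mathcal{A}_0 = \begin{bmatrix} \mathcal{M} & -\gamma\mathcal{K} \\ \Delta t\,\omega\,\mathcal{K} & \mathcal{M} \end{bmatrix}.$$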

We show that the matrix $\mathcal{A}_0$ is a high-quality approximation of $\mathcal{A}$, provided that the derivatives $\partial F(\mathbf{c})/\partial c_i$ are bounded with respect to all components $c_i$ in the Gibbs simplex $G$. The latter condition is fulfilled for any choice of a free energy of polynomial type, as well as for logarithmic free energy, regularised as in [40]. Further, $\mathcal{A}_0$ can be permuted to a block-diagonal form

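$$P^T\mathcal{A}_0P = \begin{bmatrix} A_0 & & \\ & \ddots & \\ & & A_0 \end{bmatrix}, \quad A_0 = \begin{bmatrix} M & -\gamma K \\ \Delta t\,\omega\,K & M \end{bmatrix},$$

where the blocks $A_0$ have the structure already analysed in Papers III and IV for the case of two-phase problems. We propose that the approximation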

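$$\widehat{\mathcal{A}}_0 = P\begin{bmatrix} \widehat{A}_0 & & \\ & \ddots & \\ & & \widehat{A}_0 \end{bmatrix}P^T, \quad \widehat{A}_0 = \begin{bmatrix} M & 0 \\ \Delta t\,\omega\,K & M + \sqrt{\Delta t\,\omega\,\gamma}\,K \end{bmatrix}\begin{bmatrix} I & -\gamma M^{-1}K \\ 0 & M^{-1}\left(M + \sqrt{\Delta t\,\omega\,\gamma}\,K\right) \end{bmatrix} \tag{4.21}$$

is used within the framework of inexact Newton methods (i1) and (i2) for the solution of multiphase problems modelled by C-H systems.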
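The block factorization in (4.21) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the one-dimensional P1 matrices, the helper `p1_matrices_1d`, the function `apply_A0_hat_inverse` and all parameter values are hypothetical, and dense `numpy.linalg.solve` calls stand in for the preconditioned iterative solvers used in practice. It multiplies out the two factors and then applies the inverse by block forward and backward substitution, read directly from the factorization; this is not the simplified two-solve algorithm derived in Paper IV.

```python
import numpy as np

def p1_matrices_1d(n_cells):
    """Mass and stiffness matrices for P1 elements on [0, 1], natural b.c."""
    h = 1.0 / n_cells
    n = n_cells + 1
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    for e in range(n_cells):
        idx = np.ix_([e, e + 1], [e, e + 1])
        M[idx] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        K[idx] += 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return M, K

M, K = p1_matrices_1d(16)
dt, omega, gamma = 1e-3, 1.0, 0.0625**2   # hypothetical parameter values
beta = np.sqrt(dt * omega * gamma)
n = M.shape[0]
I, Z = np.eye(n), np.zeros((n, n))
S = M + beta * K                          # s.p.d. block of the form M + delta*K

# The two factors of (4.21), assembled densely for this small example.
lower = np.block([[M, Z], [dt * omega * K, S]])
upper = np.block([[I, -gamma * np.linalg.solve(M, K)],
                  [Z, np.linalg.solve(M, S)]])
A0_hat = lower @ upper

# Multiplying out the factors changes only the (2,2) block relative to A0.
expected = np.block([[M, -gamma * K],
                     [dt * omega * K, M + 2.0 * beta * K]])

def apply_A0_hat_inverse(b1, b2):
    """Solve A0_hat [x1; x2] = [b1; b2] by block forward/backward substitution."""
    z1 = np.linalg.solve(M, b1)                          # lower factor, row 1
    z2 = np.linalg.solve(S, b2 - dt * omega * (K @ z1))  # lower factor, row 2
    x2 = np.linalg.solve(S, M @ z2)                      # upper factor, row 2
    x1 = z1 + gamma * np.linalg.solve(M, K @ x2)         # upper factor, row 1
    return x1, x2

rng = np.random.default_rng(0)
b1, b2 = rng.standard_normal(n), rng.standard_normal(n)
x1, x2 = apply_A0_hat_inverse(b1, b2)
residual = np.linalg.norm(
    A0_hat @ np.concatenate([x1, x2]) - np.concatenate([b1, b2])
)
```

In this form every solve involves either $M$ or $M + \sqrt{\Delta t\,\omega\,\gamma}\,K$, both symmetric positive definite, so standard solvers for s.p.d. systems apply.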


4.7 Software implementation and parallelization

Results of numerical simulations of multiphase problems modelled by the phase-field method are reported in many articles. In some papers the solution methods are not described at all, and it might be guessed that direct methods have been used. Despite the availability of highly optimised codes and computing power, the use of direct methods, in particular in 3D, is limited to problems of relatively moderate size. Comparisons of the performance of the proposed optimal preconditioner $\widehat{A}_0$, within the framework of method (i1), with that of a fast sparse direct method (MUMPS, see [64]) show that in 2D, for problem sizes larger than a million degrees of freedom, the iterative method outperforms the direct solver, see Paper III for more details. In other experiments we consider the behaviour of ILU-preconditioned GMRES (the same as used in [78]), where the ILU preconditioner is constructed for the whole system A. Our experiments show that in some cases the ILU-preconditioned iteration even diverges, while the block preconditioner $\widehat{A}_0$ shows very robust behaviour for all settings of the problem and discretization parameters.

In a number of studies (see e.g. [53, 54, 29]) a specially tailored nonlinear geometric multigrid method, which is to some extent technically demanding to implement, is applied to the whole C-H system and is based on a hierarchy of meshes obtained via regular refinements.

Besides the observed optimal computational complexity properties, one of the advantages of the solution methods for multiphase problems developed within this thesis is that their implementation may utilise readily available software toolboxes, optimised for efficient performance both for sequential and parallel execution. The main procedures that constitute our solver for the C-H system, where methods (i1) and (i2) with approximation $\widehat{\mathcal{A}}_0$ are used, are the following:

- Assembly of matrices;
- Matrix-vector multiplications and vector operations within the nonlinear and linear solution methods (BLAS primitives);
- Preconditioned iterative methods to be applied for the solution of the linear systems with an s.p.d. matrix of the form M + δK and, in the case of method (i1), for the systems with the non-symmetric matrix A.

For all of these tasks, there are known techniques for parallelization, as well as intensive ongoing research and implementation of methods, accessible via numerous publicly available software packages (see e.g. [52, 68, 37, 73]).
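The third building block, a preconditioned iterative method for s.p.d. systems with M + δK, can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function `pcg`, the one-dimensional P1 matrices and the value of `delta` are assumptions, and a simple diagonal (Jacobi) preconditioner stands in for the algebraic multigrid preconditioner that would be taken from a library in a production code.

```python
import numpy as np

def pcg(A, b, apply_prec, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for an s.p.d. matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Hypothetical test problem: 1D P1 mass and stiffness matrices on [0, 1].
n_cells = 64
h = 1.0 / n_cells
n = n_cells + 1
off = np.ones(n - 1)
d_M = np.full(n, 4.0)
d_M[[0, -1]] = 2.0
d_K = np.full(n, 2.0)
d_K[[0, -1]] = 1.0
M = h / 6.0 * (np.diag(d_M) + np.diag(off, 1) + np.diag(off, -1))
K = 1.0 / h * (np.diag(d_K) - np.diag(off, 1) - np.diag(off, -1))

delta = 1e-3                      # hypothetical value of the scalar delta
A = M + delta * K
diag_A = np.diag(A).copy()        # Jacobi preconditioner, a stand-in for AMG
b = A @ np.ones(n)                # right-hand side with known solution x = 1
x, iters = pcg(A, b, lambda r: r / diag_A)
```

Only matrix-vector products, inner products and vector updates appear in the loop, which is why the method maps directly onto BLAS primitives and distributed sparse matrix libraries.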

In Paper IV we perform weak scalability tests to see how the solution time varies with the number of processors for a fixed problem size per processor, and thus to evaluate the performance of the solution algorithms in parallel. We use the open software libraries deal.II and Trilinos, see [37, 73]. The implementation is based on a distributed mesh-oriented paradigm that allows programs to scale for large machine and problem sizes, see [19]. The numerical experiments are performed on a cluster where each node has eight cores (two Intel Xeon 5520 Quad core processors).

Size                 549250      4293378     33949186    270011394
No. cores                 1            8           64          512

Method (i1)
Iter. count      3 / 11 / 5   3 / 10 / 6    3 / 9 / 6    3 / 9 / 6
T_w                   35.36        91.02       145.66       275.20
R_w                       -         2.57         1.60         1.89

Method (i2)
Iter. count            20/5         17/5         17/6         14/6
T_w                   31.45        67.53       136.56       140.59
R_w                       -         2.15         2.02         1.03

Table 4.2. Weak scalability test for the simulation of phase separation and coarsening in a binary mixture in a 3D domain.

To illustrate the parallel performance, some results are presented in Table 4.2. The particular problem is the two-phase C-H system (4.4) with $\omega = 1$, $\varepsilon = 0.0625$ and no convection, in a three-dimensional domain $\Omega = [0,1]\times[0,1]\times[0,1]$. We report average iteration counts and average wall time $T_w$ (in seconds) for the execution of one nonlinear solve. The number $R_w$ denotes the factor by which the elapsed solution time $T_w$ increases after one refinement of the mesh. Here, for systems with M + δK, we have used the PCG method with an AMG preconditioner.

Note that when the problem size per processor is fixed, the increase of the execution time on more processors is (mainly) due to communication between the cores. In the ideal case, when no communication is needed, there is no increase of $T_w$ when more processors are used in the computation. Here we examine four different settings: program execution on one core (first column in the table), parallel execution on one node with eight cores (second column), parallel execution on eight nodes with eight cores each (third column), and parallel execution on 64 nodes with 512 cores overall (fourth column). Two types of communication may occur in these experiments: communication within the chip on one node, and communication via the network connecting the different nodes. We observe that the factor of increase $R_w$ is less than 2 when the type of communication for two consecutive problem sizes is the same (the last two columns in the table). Moreover, in this case method (i2) scales almost ideally, namely, $R_w = 1.03$. For the solution of the considered large-scale problems, method (i2) is superior to method (i1) also in terms of overall execution time.
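The growth factors in Table 4.2 follow directly from the reported wall times. The short sketch below recomputes them; the dictionary layout and the helper name `growth_factors` are illustrative choices, while the numbers are those of the table.

```python
# Wall times T_w in seconds from Table 4.2, for 1, 8, 64 and 512 cores.
cores = [1, 8, 64, 512]
wall_time = {
    "i1": [35.36, 91.02, 145.66, 275.20],
    "i2": [31.45, 67.53, 136.56, 140.59],
}

def growth_factors(times):
    """Ratio of consecutive wall times, rounded to two decimals as in the table."""
    return [round(t_next / t_prev, 2) for t_prev, t_next in zip(times, times[1:])]

r_w = {method: growth_factors(times) for method, times in wall_time.items()}
```

With a fixed problem size per core, a value of 1 would correspond to perfect weak scaling, so the last entry for method (i2) is close to the ideal case.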

It is important to note that, due to the block-diagonal form of the approximation (4.21), in the case of n-component multiphase problems there exists an additional level of parallelism that can be utilised in an efficient large-scale implementation. When using the inexact Newton method (i2), the systems for the different components can be solved fully independently, and communication occurs only when transmitting local contributions during the evaluation of the right-hand side in step (i2-1).
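The component-wise independence can be illustrated on a small example. The sketch below is an assumption-laden illustration: it uses tiny one-dimensional P1 matrices, hypothetical parameter values, the matrix A_0 rather than its approximation in (4.21) (the same permutation applies to both), and a thread pool as a stand-in for a genuinely distributed implementation. It builds the multi-component matrix with Kronecker products, permutes it to block-diagonal form, and solves each diagonal block separately.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Small stand-ins for the mass and stiffness matrices (1D P1 elements, h = 1/4).
h, N = 0.25, 5
off = np.ones(N - 1)
d_M = np.full(N, 4.0)
d_M[[0, -1]] = 2.0
d_K = np.full(N, 2.0)
d_K[[0, -1]] = 1.0
M = h / 6.0 * (np.diag(d_M) + np.diag(off, 1) + np.diag(off, -1))
K = 1.0 / h * (np.diag(d_K) - np.diag(off, 1) - np.diag(off, -1))

dt, omega, gamma = 1e-3, 1.0, 0.0625**2   # hypothetical parameter values
m = 2                                     # n - 1 blocks for n = 3 components
I_m = np.eye(m)

# Unknown ordering (eta_1, ..., eta_m, c_1, ..., c_m).
A0_big = np.block([
    [np.kron(I_m, M), -gamma * np.kron(I_m, K)],
    [dt * omega * np.kron(I_m, K), np.kron(I_m, M)],
])
A0 = np.block([[M, -gamma * K], [dt * omega * K, M]])

# Reorder to (eta_1, c_1, eta_2, c_2, ...); column j of P is the unit vector e_{perm[j]}.
perm = np.concatenate([
    np.r_[i * N:(i + 1) * N, (m + i) * N:(m + i + 1) * N] for i in range(m)
])
P = np.eye(2 * m * N)[:, perm]
permuted = P.T @ A0_big @ P

rng = np.random.default_rng(1)
b = rng.standard_normal(2 * m * N)
b_perm = P.T @ b

def solve_component(i):
    """Solve the i-th diagonal block; no data from other components is needed."""
    return np.linalg.solve(A0, b_perm[2 * N * i:2 * N * (i + 1)])

with ThreadPoolExecutor(max_workers=m) as pool:
    y_perm = np.concatenate(list(pool.map(solve_component, range(m))))

y = P @ y_perm                            # back to the original ordering
y_reference = np.linalg.solve(A0_big, b)  # monolithic solve for comparison
```

This level of parallelism is independent of, and can be combined with, the usual partitioning of the computational domain.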


5. Summary of papers

5.1 Paper I

Paper I considers the construction of an optimal Algebraic MultiLevel Iteration (AMLI) method for systems arising from discretizations of parabolic problems in two space dimensions. To discretize those, we use Crouzeix-Raviart non-conforming elements and triangular meshes. The developed AMLI method is a generalisation of a two-level method and is based on an approximated block factorization of the original system matrix, where the partitioning is associated with a sequence of nested triangulations.

A key ingredient for the efficiency of the AMLI preconditioners is the quality of the utilised two-by-two block splitting, quantified by the so-called Cauchy-Bunyakowski-Schwarz (CBS) constant, which measures the abstract angle between the two related finite element subspaces.
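For a given symmetric positive definite matrix in two-by-two block form, the CBS constant can be computed algebraically as the largest singular value of the off-diagonal block scaled by the Cholesky factors of the diagonal blocks. The sketch below illustrates this on an arbitrary test matrix; the function name `cbs_constant` and the random matrix are assumptions for illustration and have no relation to the finite element matrices analysed in Paper I.

```python
import numpy as np

def cbs_constant(A, n1):
    """CBS constant of the 2x2 splitting of s.p.d. A with leading block size n1."""
    A11, A12, A22 = A[:n1, :n1], A[:n1, n1:], A[n1:, n1:]
    L1 = np.linalg.cholesky(A11)
    L2 = np.linalg.cholesky(A22)
    # B = L1^{-1} A12 L2^{-T}; its largest singular value is the CBS constant.
    B = np.linalg.solve(L1, A12)
    B = np.linalg.solve(L2, B.T).T
    return np.linalg.svd(B, compute_uv=False)[0]

rng = np.random.default_rng(2)
G = rng.standard_normal((6, 6))
A = G @ G.T + 6.0 * np.eye(6)     # arbitrary s.p.d. test matrix (hypothetical)
gamma_cbs = cbs_constant(A, 3)

# The Schur complement S = A22 - A21 A11^{-1} A12 is spectrally bounded from
# below by (1 - gamma_cbs**2) A22, which links the constant to preconditioning.
S = A[3:, 3:] - A[3:, :3] @ np.linalg.solve(A[:3, :3], A[:3, 3:])
lambda_min = np.linalg.eigvals(np.linalg.solve(A[3:, 3:], S)).real.min()
```

A constant close to zero means the two subspaces are nearly orthogonal in the energy inner product, while a constant close to one signals a poor splitting.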

In this work, we extend the construction and the underlying theory of two-level partitioning techniques for elliptic problems to systems arising from discretizations of parabolic problems. We derive estimates for the associated CBS constant. The estimates are uniform with respect to discretization parameters in space and time as well as with respect to coefficient and mesh anisotropy, thus providing robustness of the method. The theoretical results for the optimality of the proposed AMLI method are confirmed in a series of numerical experiments, detailed and analysed in the paper.

5.2 Paper II

Paper II is the first in a series of articles in this thesis where we develop preconditioned iterative solution methods to solve the algebraic systems of equations arising from finite element discretizations of multiphase flow problems described by the phase-field model.

In this paper we consider two-phase problems. We study the structure of the discrete Cahn-Hilliard system, obtained via conforming and non-conforming FEM discretizations and implicit time discretization schemes. The nonlinearity of the phase-separation process is treated by a Newton type of method. The resulting matrices admit a two-by-two block structure, determined by the mathematical model. We propose and discuss symmetric positive definite approximations of the Schur complement, corresponding to a factorization associated with this natural block splitting. The constructed preconditioners are based on modifying matrix blocks that appear in the structure of the Schur complement. We derive approximation estimates of the preconditioners and include numerical experiments to illustrate their behaviour.

5.3 Paper III

In Paper III we justify an approximation of the exact Jacobian of the discrete nonlinear system for binary Cahn-Hilliard models that is based on neglecting the discrete counterparts of the nonlinear and convection effects in the process. The presented analysis shows that for small enough time steps this technique leads to a preconditioned matrix that has a spectrum clustered around the value one, and thus its action resembles the action of the identity operator.

We proceed by proposing a preconditioner in a block-factorised form with only symmetric positive definite submatrices in its structure. Its use in preconditioned iterative methods within the nonlinear solution process leads to a very efficient algorithm with an optimal convergence rate and computational complexity. The presented study includes various numerical experiments. Performance comparisons with other solution methods confirm the advantages of the proposed algorithm when large-scale problems need to be handled.

5.4 Paper IV

The work presented in Paper IV extends the results in Paper III.

We provide some new and improved estimates of the quality of the block-factorised approximation of the Jacobian matrix proposed in the previous article. Also, we include a derivation of a simplified solution algorithm where only two, instead of three, symmetric positive definite subsystems need to be solved.

The constructed approximation is utilised in the framework of inexact Newton methods. We propose and provide the theoretical background for two nonlinear solution algorithms for the Cahn-Hilliard system. In the first one, the systems with the exact Jacobian are solved inexactly (up to some proper accuracy) via an iterative method with the block-factorised preconditioner. In the second, the original Jacobian system is replaced by a system with the studied high-quality approximation. As a result, the method requires more nonlinear iterations to converge; however, we avoid the computational burden of a linear solver for the non-symmetric Jacobian matrix. We present an extensive numerical study of the performance of the two inexact Newton algorithms.
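The trade-off behind the second algorithm can be illustrated on a toy problem. The sketch below is not the Cahn-Hilliard system and not the algorithm of Paper IV: the residual, the matrix `A0`, the cubic nonlinearity and the scaling `dt` are hypothetical, chosen only so that dropping the nonlinear term from the Jacobian is a small perturbation. It compares Newton's method with the exact Jacobian against the same iteration with the Jacobian replaced by its linear part.

```python
import numpy as np

def newton(residual, jacobian_solve, x0, tol=1e-10, max_iter=100):
    """Newton-type iteration x <- x - J^{-1} F(x); J may be exact or approximate."""
    x = x0.copy()
    for k in range(max_iter):
        F = residual(x)
        if np.linalg.norm(F) <= tol:
            return x, k
        x = x - jacobian_solve(x, F)
    return x, max_iter

n = 20
off = np.ones(n - 1)
# Linear part: identity plus a 1D second-difference matrix (toy stand-in).
A0 = 3.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)
dt = 1e-2                                  # small factor scaling the nonlinearity
b = np.linspace(0.0, 1.0, n)

def residual(x):
    """Toy nonlinear residual F(x) = A0 x + dt * x^3 - b."""
    return A0 @ x + dt * x**3 - b

def exact_solve(x, F):
    """Correction computed with the exact Jacobian A0 + dt * diag(3 x^2)."""
    return np.linalg.solve(A0 + dt * np.diag(3.0 * x**2), F)

def approx_solve(x, F):
    """Correction computed with the approximate Jacobian A0 (nonlinear term dropped)."""
    return np.linalg.solve(A0, F)

x_exact, it_exact = newton(residual, exact_solve, np.zeros(n))
x_approx, it_approx = newton(residual, approx_solve, np.zeros(n))
```

Both iterations reach the same solution; the approximate variant reuses one fixed matrix in every step, which is the property that makes it attractive when that matrix has a cheap, well-understood solver.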

One of the main advantages of the proposed methods is that their implementation boils down to the use of well-studied building blocks and available software modules for both sequential and distributed execution. We perform weak scalability tests to evaluate the performance of the algorithms on parallel machines. The implementation is based on the publicly available software library deal.II, and the results confirm the efficiency and robustness of the methods also in a high-performance environment.

5.5 Paper V

In Paper V we investigate the applicability of the studied solution techniques to the case of n-component problems, where n ≥ 3. The multiphase Cahn-Hilliard models involve systems of equations for the concentrations and chemical potentials that correspond to different components of the physical system. The exact formulation depends on the choice of the free energy functional, which may involve logarithmic or polynomial homogeneous free energy density terms.

We perform a thorough analysis for the case of a three-component Cahn-Hilliard model with free energy of polynomial type. The presented efficient solution algorithm possesses optimal performance properties, provided the time discretization step is small enough compared to the characteristic mesh size of the triangulation of the domain. The theoretical proof involves assumptions on the boundedness of the Jacobian of the discrete nonlinear terms that hold for any choice of a polynomial free energy, as well as for regularised logarithmic free energy. The approach can be straightforwardly generalised to multiphase problems with any number of components. So far the analysis is done for the case when the gradient free energy does not involve couplings between the different phases.

The efficiency of the method is demonstrated by numerical experiments for a three-component Cahn-Hilliard model. The framework, however, is applicable to different important classes of multiphase models and to an arbitrary number of components. We note that the modularity of the resulting solution algorithm allows for high-performance implementations that utilise parallelism on two levels: partitioning of the computational domain and partitioning of the systems into blocks corresponding to different phases.


6. Discussion and outlook

In this thesis, efficient preconditioning methods have been developed based on a proper block representation of the underlying discrete systems. We investigate two basic approaches for the construction of preconditioners.

Approximate block-factorizations on hierarchies of meshes are known to lead to optimal solution algorithms for symmetric positive definite systems that arise in the discretization of elliptic and, as also discussed in this thesis, parabolic problems. The AMLI method is the first developed regularity-free optimal order preconditioning method. In practice, in spite of their optimal qualities, the AMLI algorithms are not as widely used as, for example, the multigrid methods. One possible reason for that could be the scarcity of available implementations and optimised software libraries. The development of such tools may be beneficial for future research in the fields of numerical linear algebra and computer simulation.

The iterative solution methods for numerical simulations of multiphase flow problems developed in this thesis utilise to a full extent the block structure of the discrete Cahn-Hilliard systems. The proposed algorithms are both highly efficient by construction and allow for the straightforward use of software libraries with optimised functionality. Those two properties, as well as some of the presented comparisons, suggest that the developed methods are the methods of choice for the simulation of an important class of multiphase problems. Some further research is needed, however, for phase-field models that involve variable mobility or couplings between different components of the concentration in the gradient energy terms.

The inbuilt parallelism of the studied methods for multiphase problems is an asset that should be further investigated and evaluated for different computer architectures, including heterogeneous parallel environments. Note, again, that a free open-source software implementation of a framework for multiphase flow simulation, based on the presented ideas, could be of use to research and industrial applications in the field.

The work presented in this thesis has been mainly funded by the Swedish Research Council (VR) via grant VR 2008-5072 "Finite element preconditioners for algebraic problems as arising in modelling of multiphase microstructures". The project considers numerical simulation of mathematical models of morphological pattern formation and interface motion of multiphase microstructures, and in particular multiphase flow, based on the phase-field model. The main research focus of this thesis is aligned with the aim of the project, namely, to enable fast and reliable numerical solution of the large-scale problems arising from finite element discretizations of the above models.


7. Summary in Swedish

A significant share of the existing algorithms for numerical simulation of physical processes that can be described by differential equations involve the solution of systems of equations with sparse coefficient matrices. These solutions account for a large part of the resources the algorithms consume during execution. The importance of efficient algorithms for solving algebraic systems of equations is most evident when large problems are solved. The meaning of the word "large" changes with the increased availability of computing resources, but to solve important problems the development of high-performance computers needs to be combined with the development of robust solution methods that can exploit the resources to the full. The development, analysis and implementation of efficient methods, suitable for both sequential and parallel execution, are important research areas in numerical simulation and are the focus of this thesis.

The methods for solving linear algebraic systems of equations can be divided into two classes: direct and iterative. If round-off errors can be avoided, direct solution methods give an exact solution of the system. Iterative solution methods, in contrast, generate a sequence of improved approximate solutions. Because direct methods are generally applicable and robust, they are often used in numerical simulations. When simulations that are very extensive in time and/or space need to be performed, however, direct methods can become prohibitively expensive (and sometimes impossible) to use. Iterative solution methods then become a necessity, thanks to their relatively low demands on computing resources. To improve efficiency, iterative methods are combined with suitable techniques for accelerating the convergence towards an approximate solution of the desired accuracy. A common technique to achieve this is to use a so-called preconditioner. The construction and analysis of various preconditioning methods has been an active research area for decades. Particular attention has been paid to the so-called optimal preconditioning techniques. Optimal means optimal convergence rate, that is, convergence within a number of iterations that is independent of the number of degrees of freedom, together with optimal computational complexity, that is, a computational cost per iteration that is linearly proportional to the number of degrees of freedom. The techniques proposed and studied in this thesis use the block structure of the underlying matrices and lead to optimal preconditioning methods.

Paper I treats the construction of Algebraic MultiLevel Iteration (AMLI) methods for systems of equations arising from parabolic problems in combination with Crouzeix-Raviart finite elements. The developed AMLI method is based on an approximate block factorization of the coefficient matrix of the system, where the block partitioning is tied to a sequence of nested discretizations of the problem domain. A key to the efficiency of AMLI preconditioners is the quality of the partitioning of the coefficient matrix into two-by-two blocks, quantified by the so-called Cauchy-Bunyakowski-Schwarz (CBS) constant, which measures the abstract angle between two related finite element subspaces. In the thesis we extend the construction and the underlying theory of two-level partitioning techniques for elliptic problems to systems arising from the discretization of parabolic problems. We also derive a robust estimate of the associated CBS constant.

In Papers II, III and IV we develop solution methods for the numerical simulation of two-phase flow problems modelled by the Cahn-Hilliard (C-H) equation. We treat the discrete C-H problem, obtained by finite element discretization in space and implicit schemes in time. The problem is nonlinear, and in the thesis we use Newton-type methods to solve it. The Jacobian matrix admits a natural two-by-two block structure. In Paper II we develop symmetric positive definite approximations of the Schur complement in the factorization given by this block partitioning. In Papers III and IV we develop an efficient preconditioner in block-factorised form for the Jacobian of the discrete nonlinear system. The proposed approximations are based on the assumption that, under certain circumstances, the blocks corresponding to nonlinear and convective processes in the physical system can be neglected. The preconditioners are used in a framework of inexact Newton methods. We develop two nonlinear solution algorithms for the C-H problem. Both lead to optimal methods. In Paper V we present a generalisation of the solution methods to multiphase problems with an arbitrary number of phases. One of the most important advantages of the proposed methods is that they have been implemented for both sequential and distributed execution using available software libraries that are already optimised for a given architecture.

The theoretical analysis of the preconditioning methods presented in the thesis is combined with numerical studies that confirm their efficiency. We have developed a parallel implementation of various solution algorithms for the C-H problem. The numerical experiments show that they perform well in distributed environments and exhibit nearly ideal weak scalability.


8. Acknowledgements

First of all, I would like to express my sincere thanks and warm gratitude to my adviser, Assoc. Prof. Maya Neytcheva. It has been a great pleasure to be your PhD student and to work with you on problems that, I believe, have an impact on science and technology. Maya, thank you so much for being not just an adviser but also a friend.

I wish to thank my second adviser, Prof. Svetozar Margenov, for introducing me to the exciting field of numerical solution methods and for opening so many doors for me as a researcher.

Many thanks to Minh Do-Quang, Martin Kronbichler, and Xunxun Wu for our joint work and valuable discussions.

I would like to express my gratitude to Prof. Owe Axelsson. I have been truly lucky to have the opportunity to collaborate with someone who has left such a remarkable mark on the research field.

Thanks also to all friends and colleagues at the Division of Scientific Computing. I really enjoyed working here.

Special thanks to Elisabeth Linnér, for helping me with the summary in Swedish, but most importantly for being such a great friend.

I am really grateful to my mother, sister, and Herby, for their encouragement, warm support and patience. This thesis is dedicated to the memory of my father.


References

[1] D. N. Arnold and F. Brezzi. Mixed and nonconforming finite element methods: Implementation, postprocessing and error estimates. RAIRO Model. Math. Anal. Numer., 19:7–32, 1985.

[2] O. Axelsson. On global convergence of iterative methods. LNiM, Springer, pages 1–19, 1982.

[3] O. Axelsson. On multigrid methods of two-level type. LNiM, Springer, 960:352–367, 1982.

[4] O. Axelsson. Iterative solution methods. Cambridge University Press, 1994.

[5] O. Axelsson. On iterative solvers in structural mechanics; separate displacement orderings and mixed variable methods. Math. Comput. Simulation, 50(1-4):11–30, 1999.

[6] O. Axelsson. Milestones in the development of iterative solution methods. J. Electric. Comput. Eng., 2010. doi:10.1155/2010/972794.

[7] O. Axelsson and V. A. Barker. Finite Element Solution of Boundary Value Problems: Theory and Computation. Academic Press, 1984.

[8] O. Axelsson and R. Blaheta. Two simple derivations of universal bounds for the C.B.S. inequality constant. Appl. Math., 49(1):57–72, 2004.

[9] O. Axelsson and I. Gustafsson. Preconditioning and two-level multigrid methods of arbitrary degree of approximation. Math. Comp., 40:219–242, 1983.

[10] O. Axelsson and J. Karátson. Symmetric part preconditioning of the CG method for Stokes type saddle-point systems. Numer. Funct. Anal. Optim., 28(9-10):1027–1049, 2007.

[11] O. Axelsson and A. Kucherov. Real valued iterative methods for solving complex symmetric linear systems. Num. Lin. Alg. Appl., 7:197–218, 2000.

[12] O. Axelsson and M. Neytcheva. Algebraic multilevel iteration method for Stieltjes matrices. Num. Lin. Alg. Appl., 1(3):213–236, 1994.

[13] O. Axelsson and M. Neytcheva. A general approach to analyse preconditioners for two-by-two block matrices. Num. Lin. Alg. Appl., 2011. doi:10.1002/nla.830.

[14] O. Axelsson and M. Neytcheva. Operator splittings for solving nonlinear, coupled multiphysics problems with an application to the numerical solution of an interface problem. Institute for Information Technology, Uppsala University, TR 2011-009, 2011.

[15] O. Axelsson and A. Padiy. On the additive version of the algebraic multilevel iteration method for anisotropic elliptic problems. SIAM J. Sci. Comput., 20:1807–1830, 1999.

[16] O. Axelsson and P. S. Vassilevski. Algebraic multilevel preconditioning methods I. Numer. Math., 56:157–177, 1989.

[17] O. Axelsson and P. S. Vassilevski. Algebraic multilevel preconditioning methods II. SIAM J. Numer. Anal., 27:1569–1590, 1990.


[18] O. Axelsson and P. S. Vassilevski. Variable-step multilevel preconditioning methods, I: Self-adjoint and positive definite elliptic problems. Num. Lin. Alg. Appl., 1:75–101, 1994.

[19] W. Bangerth, C. Burstedde, T. Heister, and M. Kronbichler. Algorithms and data structures for massively parallel generic finite element codes. ACM Trans. Math. Soft., 38(2), 2011.

[20] R. E. Bank and T. F. Dupont. Analysis of a two-level scheme for solving finite element equations. Techn. Rep. CNA-159, Center for Numerical Analysis, University of Texas at Austin, 1980.

[21] R. E. Bank and D. J. Rose. Global approximate Newton methods. Numer. Math., 37:279–295, 1981.

[22] B. Bejanov, J. Guermond, and P. Minev. A locally div-free projection scheme for incompressible flows based on non-conforming finite elements. Int. J. Numer. Meth. Fluids, 49:239–258, 2005.

[23] G. Bencheva, I. Georgiev, and S. Margenov. Two-level preconditioning of Crouzeix-Raviart anisotropic FEM systems. Large-Scale Scientific Computing, Springer LNCS, 2907:76–84, 2004.

[24] R. Blaheta, S. Margenov, and M. Neytcheva. Uniform estimate of the constant in the strengthened CBS inequality for anisotropic non-conforming FEM systems. Numer. Lin. Alg. Appl., 11:309–326, 2004.

[25] R. Blaheta, S. Margenov, and M. Neytcheva. Robust optimal multilevel preconditioners for non-conforming finite element systems. Numer. Lin. Alg. Appl., 12(5-6):495–514, 2005.

[26] P. Boyanova, M. Do-Quang, and M. Neytcheva. Solution methods for the Cahn-Hilliard equation discretized by conforming and non-conforming finite elements. Institute for Information Technology, Uppsala University, TR 2011-004, 2011.

[27] P. Boyanova and S. Margenov. On optimal AMLI solvers for incompressible Navier-Stokes problems. AIP Conference Proceedings, 1301:457–467, 2010.

[28] F. Boyer and C. Lapuerta. Study of a three component Cahn-Hilliard flow model. ESAIM Math. Mod. Numer. Anal., 40:653–687, 2006.

[29] F. Boyer and S. Minjeaud. Numerical schemes for a three component Cahn-Hilliard model. ESAIM Math. Mod. Numer. Anal., 45:697–738, 2011.

[30] D. Braess. Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics. Cambridge University Press, 2nd edition, 2001.

[31] J. Bramble. Multigrid methods. Longman Scientific & Technical, 1993.

[32] S. Brenner and L. Scott. The mathematical theory of finite element methods. Springer-Verlag, 1994.

[33] F. Brezzi and M. Fortin. Mixed and hybrid finite element methods. Springer-Verlag, 1991.

[34] J. W. Cahn and J. Hilliard. Free energy of a nonuniform system. I. Interfacial free energy. J. Chem. Phys., 28:258–267, 1958.

[35] M. Crouzeix and P.-A. Raviart. Conforming and non-conforming finite element methods for solving the stationary Stokes equations. RAIRO Anal. Numér, 7(R-3):33–76, 1973.

[36] B. Ayuso de Dios and L. Zikatanov. Uniformly convergent iterative methods for discontinuous Galerkin discretizations. SIAM J. Sci. Comput., 40:4–36, 2009.


[37] The deal.II library. http://www.dealii.org/.[38] V. Eijkhout and P. Vassilevski. The role of the strengthened

Cauchy-Bunyakowski-Schwarz inequality in multilevel methods. SIAM Review,33:405–419, 1991.

[39] C. M. Elliott, D. A. French, and F. A. Milner. A second order splitting method for the Cahn-Hilliard equation. Numer. Math., 54:575–590, 1989.

[40] C. M. Elliott and S. Luckhaus. A generalized diffusion equation for phase separation of a multi-component mixture with interfacial free energy. Preprint Series 887, IMA, University of Minnesota.

[41] H. C. Elman, D. J. Silvester, and A. J. Wathen. Finite Elements and Fast Iterative Solvers with Applications in Incompressible Fluid Dynamics. Oxford University Press, 2005.

[42] D. Eyre. Systems of Cahn-Hilliard equations. SIAM J. Appl. Math., 53:1686–1712, 1993.

[43] X. Feng and A. Prohl. Error analysis of a mixed finite element method for the Cahn-Hilliard equation. Numer. Math., 99:47–84, 2004.

[44] H. Garcke, B. Nestler, and B. Stoth. A multi phase field concept: Numerical simulations of moving phase boundaries and multiple junctions. SIAM J. Appl. Math., 60:295–315, 2000.

[45] A. Georgiev, S. Margenov, and M. Neytcheva. Multilevel algorithms for 3D simulation of nonlinear elasticity problems. Math. Comput. Simulation, 50(1-4):175–182, 1999.

[46] I. Georgiev, J. Kraus, and S. Margenov. Multilevel preconditioning of 2D Rannacher-Turek FE problems; additive and multiplicative methods. Springer LNCS, 4310:56–64, 2007.

[47] I. Georgiev, J. Kraus, and S. Margenov. Multilevel preconditioning of rotated bilinear non-conforming FEM problems. Comput. Math. Appl., 55:2280–2294, 2008.

[48] J. W. Gibbs. A method of geometrical representation of the thermodynamic properties of substances by means of surfaces. Transactions of the Connecticut Academy, Vol. II:382–404, 1873.

[49] G. H. Golub and C. F. Van Loan. Matrix Computations (3rd ed.). Johns Hopkins University Press, 1996.

[50] W. Hackbusch. Multi-Grid Methods and Applications. Springer-Verlag, 1985.

[51] The HSL Mathematical Software Library. http://www.hsl.rl.ac.uk/.

[52] The Hypre library. http://acts.nersc.gov/hypre/.

[53] J. Kim, K. Kang, and J. Lowengrub. Conservative multigrid methods for Cahn-Hilliard fluids. J. Comp. Phys., 193:511–543, 2004.

[54] J. Kim, K. Kang, and J. Lowengrub. Conservative multigrid methods for ternary Cahn-Hilliard systems. Commun. Math. Sci., 2:53–77, 2004.

[55] J. Kim and J. Lowengrub. Phase field modeling and simulation of three-phase flows. Interfaces and Free Boundaries, 7/4:435–466, 2005.

[56] J. Kraus. An algebraic preconditioning method for M-matrices: Linear versus nonlinear multilevel iteration. Num. Lin. Alg. Appl., 9:599–618, 2002.

[57] J. Kraus and S. Margenov. Robust Algebraic Multilevel Methods and Algorithms, Radon Series on Computational and Applied Mathematics, 5. de Gruyter, 2009.

[58] J. Kraus and S. Tomar. A multilevel method for discontinuous Galerkin approximation of three-dimensional anisotropic elliptic problems. Num. Lin. Alg. Appl., 15:417–438, 2008.

[59] J. Kraus and S. Tomar. Multilevel preconditioning of elliptic problems discretized by a class of discontinuous Galerkin methods. SIAM J. Sci. Comput., 30:684–706, 2008.

[60] R. Lazarov and S. Margenov. CBS constants for multilevel splitting of graph-Laplacian and application to preconditioning of discontinuous Galerkin systems. J. Complexity, 23(4-6):498–515, 2007.

[61] J.-F. Maitre and F. Musy. The contraction number of a class of two-level methods; an exact evaluation for some finite element subspaces and model problems. LNiM, Springer, 960:535–544, 1982.

[62] K.-A. Mardal and R. Winther. Preconditioning discretizations of systems of partial differential equations. Num. Lin. Alg. Appl., 18(1):1–40, 2011.

[63] S. Margenov and J. Synka. Generalized aggregation-based multilevel preconditioning of Crouzeix-Raviart FEM elliptic problems. RICAM Report, 23, 2006.

[64] MUMPS. http://graal.ens-lyon.fr/MUMPS/.

[65] M. Neytcheva. On element-by-element Schur complement approximations. Num. Lin. Alg. Appl., 434(11):2308–2324, 2011.

[66] M. A. Olshanskii. An iterative solver for the Oseen problem and numerical solution of incompressible Navier-Stokes equations. Num. Lin. Alg. Appl., 6:353–378, 1999.

[67] J. Pearson and A. J. Wathen. A new approximation of the Schur complement in preconditioners for PDE constrained optimization. The Mathematical Institute, University of Oxford, 2010. Technical report, November 24 (2010), http://eprints.maths.ox.ac.uk/1021/.

[68] The PETSc library. http://www.mcs.anl.gov/petsc/.

[69] D. Ralph. Global convergence of damped Newton’s method for nonsmooth equations via the path search. Math. Oper. Res., pages 352–389, 1994.

[70] Y. Saad. Iterative methods for sparse linear systems. PWS Publishing Company, Boston, 1996.

[71] A. A. Samarskii. The Theory of Difference Schemes. CRC Press, 2001.

[72] G. Streng and J. Fix. Theory of Finite Element Method. MIR, Moscow, 1977. In Russian.

[73] The Trilinos library. http://trilinos.sandia.gov/.

[74] J. D. van der Waals. Thermodynamische theorie der capillariteit in de onderstelling van continue dichtheidsverandering. Verhandlingen der Koninglijke Akademie van Wetenschappen te Amsterdam, Sect. 1. (Dutch; English translation in Journal of Statistical Physics, 1979, 20:197).

[75] P. S. Vassilevski. Multilevel Block Factorization Preconditioners: Matrix-based Analysis and Algorithms for Solving Finite Element Equations. Springer, 2008.

[76] W. Villanueva and G. Amberg. Some generic capillary-driven flows. Int. J. Multiphase Flow, 32:1072–1086, 2006.

[77] A. J. Wathen. Realistic eigenvalue bounds for the Galerkin mass matrix. IMA J. Numer. Anal., 7:449–457, 1987.

[78] P. Yue, C. Zhou, J. J. Feng, C. F. Ollivier-Gooch, and H. H. Hu. Phase-field simulations of interfacial dynamics in viscoelastic fluids using finite elements with adaptive meshing. J. Comp. Phys., 219:47–67, 2006.
