INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING
Int. J. Numer. Meth. Engng 2005; 62:1982–2008
Published online 15 February 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nme.1257

An efficient algorithm for modelling progressive damage accumulation in disordered materials‡

Phani Kumar V. V. Nukala1,∗,†, Srđan Šimunović1 and Murthy N. Guddati2

1Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6164, U.S.A.

2Department of Civil Engineering, North Carolina State University, Raleigh, NC 27695-7908, U.S.A.

SUMMARY

This paper presents an efficient algorithm for the simulation of progressive fracture in disordered quasi-brittle materials using discrete lattice networks. The main computational bottleneck in fracture simulations using large discrete lattice networks stems from the fact that a new large set of linear equations needs to be solved every time a lattice bond is broken. Using the present algorithm, the computational cost of solving the new set of linear equations after breaking a bond reduces to that of simple triangular solves (forward elimination and backward substitution) using the already Cholesky-factored matrix. This algorithm, using the direct sparse solver, is faster than the Fourier accelerated iterative solvers such as the preconditioned conjugate gradient (PCG) solvers, and eliminates the critical slowing down associated with the iterative solvers that is especially severe close to the percolation critical points. Numerical results using random resistor networks for modelling the fracture and damage evolution in disordered materials substantiate the efficiency of the present algorithm. In particular, the proposed algorithm is especially advantageous for fracture simulations wherein ensemble averaging of numerical results is necessary to obtain a realistic lattice system response. Copyright © 2005 John Wiley & Sons, Ltd.

KEY WORDS: damage evolution; brittle materials; random thresholds model; lattice network; statistical physics

∗Correspondence to: P. K. V. V. Nukala, Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6164, U.S.A.

†E-mail: [email protected]

‡The submitted manuscript has been authored by a contractor of the U.S. Government under Contract No. DE-AC05-00OR22725. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.

Contract/grant sponsor: U.S. Department of Energy; contract/grant number: DE-AC05-00OR22725

Received 9 June 2004
Revised 20 September 2004
Accepted 18 October 2004


1. INTRODUCTION

Progressive damage evolution leading to failure of disordered quasi-brittle materials has been studied extensively using various types of discrete lattice models [1–4]. Figure 1 presents a discrete lattice network with a triangular lattice topology. The essential features of discrete lattice models used for modelling progressive damage evolution are: disorder, elastic response characteristics of the lattice bonds, and a breaking rule for each of the bonds in the lattice. The disorder in the system is introduced by either random dilution of bonds or by randomly prescribing different elastic stiffness constants or breaking thresholds to each of the bonds in the lattice. The elastic response of the individual bonds is typically described by random fuse models, central-force (spring) models, bond-bending spring models, and beam-type models. Depending on the elastic response characteristics of individual bonds, various types of breaking rules are prescribed. The elastic and breaking characteristics of each bond in the lattice are supposed to represent the mesoscopic response of the material.

The mechanical breakdown of disordered materials is often modelled using its electrical analogue. Based on this analogy, the mechanical bonds in the lattice are modelled by electrical fuses, and the mechanical strength of a bond is modelled by the fuse threshold current. The principal advantage of using the electrical fuse network is that it reduces the number of degrees of freedom by replacing the vector problem with a scalar one [5, 6]. Numerical simulations based on electrical and mechanical idealizations have shown excellent agreement between the scaling properties of the two models [5], provided an equivalence is established between electrical current, voltage and conductance in the electrical model and the mechanical stress, strain and Young's modulus in the mechanical model, respectively.

Figure 1. Random thresholds fuse network with triangular lattice topology. Each of the bonds in the network is a fuse with unit electrical conductance and a breaking threshold randomly assigned based on a probability distribution (uniform). The behaviour of the fuse is linear up to the breaking threshold. Periodic boundary conditions are applied in the horizontal direction and a unit voltage difference, V = 1, is applied between the top and bottom bus bars of the lattice system. As the current I flowing through the lattice network is increased, the fuses will burn out one by one.

Within this framework of the random thresholds fuse model, the lattice is initially fully intact with bonds having the same conductance, but the bond breaking thresholds, $t$, are randomly distributed based on a threshold probability distribution, $p(t)$. The burning of a fuse occurs irreversibly whenever the electrical current in the fuse exceeds the breaking threshold current value, $t$, of the fuse. Numerically, the damage evolution in materials is simulated by setting a unit voltage difference, $V = 1$, between the top and bottom bus bars of the lattice system, and the Kirchhoff equations are solved to determine the current flowing in each of the fuses. Subsequently, for each fuse $j$, the ratio between the current $i_j$ and the breaking threshold $t_j$ is evaluated, and the bond $j_c$ having the largest value, $\max_j (i_j / t_j)$, is irreversibly removed (burnt). The current is redistributed instantaneously after a fuse is burnt, implying that the current relaxation in the lattice system is much faster than the breaking of a fuse. We assume that the loading process is quasi-static in this sense, and that only one fuse is burnt in a given load step. Each time a fuse is burnt, it is necessary to re-calculate the current distribution in the lattice to determine the subsequent breaking of fuses. This process of breaking fuses irreversibly, one after another, is repeated until the lattice network finally becomes disconnected. It is assumed that the successive fuse failures leading ultimately to the failure of the lattice system are similar to the breakdown of quasi-brittle materials.
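The bond selection rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `next_bond_to_burn` and the sample currents and thresholds are hypothetical, and taking the absolute value of the current (so that the result does not depend on bond orientation) is an assumption. Because the fuses are linear up to their thresholds, the currents scale with the applied voltage, so the selected bond reaches its threshold at an applied voltage equal to one over its ratio at V = 1.

```python
def next_bond_to_burn(currents, thresholds, burnt):
    """Return (index, ratio) of the intact bond with the largest |i_j| / t_j.

    currents   -- bond currents computed at unit applied voltage (V = 1)
    thresholds -- breaking thresholds t_j (all positive)
    burnt      -- set of indices of bonds that are already burnt
    """
    best_j, best_ratio = None, 0.0
    for j, (i_j, t_j) in enumerate(zip(currents, thresholds)):
        if j in burnt:
            continue  # burnt fuses carry no current and cannot burn again
        ratio = abs(i_j) / t_j  # assumption: sign only encodes orientation
        if best_j is None or ratio > best_ratio:
            best_j, best_ratio = j, ratio
    return best_j, best_ratio


# Hypothetical data for four bonds, solved at V = 1.
currents = [0.30, -0.45, 0.10, 0.25]
thresholds = [0.60, 0.50, 0.90, 0.20]

j_c, ratio = next_bond_to_burn(currents, thresholds, burnt=set())
# By linearity, bond j_c burns when the applied voltage reaches 1 / ratio.
critical_voltage = 1.0 / ratio
```

In a full simulation this call would sit inside a loop that re-solves the Kirchhoff equations after each burnt fuse, which is the cost the rest of the paper addresses.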

Numerical simulation of fracture using large discrete lattice networks is often hampered by the high computational cost associated with solving a new large set of linear equations every time a new lattice bond is broken. This becomes especially severe with increasing lattice system size, $L$, as the number of broken bonds at failure, $n_f$, follows a power-law distribution given by $n_f \sim O(L^{1.8})$. In addition, since the response of the lattice system corresponds to a specific realization of the random breaking thresholds, an ensemble averaging of numerical results over $N_{config}$ configurations is necessary to obtain a realistic representation of the lattice system response. This further increases the computational time required to perform simulations on large lattice systems.

Fourier accelerated PCG iterative solvers [7–9] have been used in the past for simulation of breakdowns of large lattices. However, in fracture simulations using discrete lattice networks, these methods exhibit critical slowing down, particularly in the vicinity of percolation critical thresholds. As the lattice system gets closer to macroscopic fracture, the condition number of the system of linear equations increases, thereby increasing the number of iterations required to attain a fixed accuracy. This becomes particularly significant for large lattices. Furthermore, the Fourier acceleration technique is not effective when fracture simulation is performed using central-force and bond-bending lattice models [8].

This study presents two classes of algorithms to speed up lattice simulations for modelling of fracture in disordered quasi-brittle materials. The first class of algorithms is based on Davis and Hager's [10, 11] multiple-rank sparse Cholesky factorization downdate using direct solvers, and the second class of algorithms is based on PCG iterative techniques using circulant and block-circulant preconditioners. The aim of the paper is to develop an algorithm that significantly reduces the average computational time required to break one lattice configuration.

The paper is organized as follows. Section 2 presents the algebraic problem of modelling progressive damage evolution using discrete lattice models. Section 3 describes the computational algorithms based on multiple-rank sparse Cholesky factorization update using direct solvers for random fuse networks. In Section 4, we present alternate block-circulant PCG iterative algorithms for solving the Kirchhoff equations. Section 5 presents the numerical simulations on 2D triangular lattice systems and compares the computational efficiency of the algorithms based on direct solvers with the iterative solvers. Concluding remarks are summarized in Section 6.

2. ALGEBRAIC PROBLEM

Consider a lattice system with a total number of bonds, $N_{el}$. Let $\mathcal{N} = \{1, 2, 3, \ldots, N_{el}\}$ denote the set of individual bonds in the lattice system. After $n$ bonds are broken, let $S^b_n$ denote an ordered set of broken bonds such that the element number of the $j$th broken bond, where $j = 1, 2, \ldots, n$, is given by $e_j = S^b_n(j)$. The set of unbroken bonds after $n$ bonds are broken is defined as $S^u_n = \mathcal{N} \setminus S^b_n$. The cardinalities of the sets $S^b_n$ and $S^u_n$ are $n$ and $M = (N_{el} - n)$, respectively. Based on the above definitions, it is clear that $S^b_n \subset S^b_{n+1}$ and $S^u_n \supset S^u_{n+1}$ for each $n = 0, 1, 2, \ldots$, and $S^b_0 = \emptyset$ and $S^u_0 = \mathcal{N}$.

Algebraically, fracture of discrete lattices by breaking one bond at a time is equivalent to solving a new set of linear equations (Kirchhoff equations in the case of an electrical analogue)

$$A_n x_n = b_n, \qquad n = 0, 1, 2, \ldots \tag{1}$$

every time a new lattice bond is broken. In Equation (1), each matrix $A_n$ is an $N \times N$ symmetric and positive definite matrix (the lattice conductance matrix in the case of fuse models and the lattice stiffness matrix in the case of spring and beam models), $b_n$ is the $N \times 1$ (given) applied nodal current or force vector, $x_n$ is the $N \times 1$ (unknown) nodal potential or displacement vector, and $N$ is the number of degrees of freedom (unknowns) in the lattice. The subscript $n$ in Equation (1) indicates that $A_n$ and $b_n$ are evaluated after the $n$th bond is broken. The solution $x_n$, obtained after the $n$th bond is broken, is used in determining the subsequent ($(n+1)$th) bond to be broken. The matrices $A_n$ and $b_n$ are obtained from the element (bond) matrices using the standard finite element assembly procedure (for example, see p. 42 of Reference [12]), denoted as

$$A_n = \bigsqcup_{e \in S^u_n} k_e, \qquad b_n = \bigsqcup_{e \in S^u_n} f_e \tag{2}$$

where $k_e$ and $f_e$ are the element conductivity matrix and the current vector, respectively, and $\bigsqcup$ denotes the standard finite element assembly operator. For the case of the random fuse model, the assembly procedure is equivalent to

$$A_n = G_n \Lambda_n G_n^t = \sum_{e \in S^u_n} k_e\, g_e g_e^t, \qquad n = 0, 1, 2, \ldots \tag{3}$$



where $g_e$ is the column vector (with entries $-1$, $0$, or $1$) of the $N \times M$ incidence matrix, $G_n$, associated with the element $e$. Similarly, $k_e$ is the conductivity (positive) of the element $e$, and $\Lambda_n$ is the $M \times M$ positive definite diagonal conductivity matrix with diagonal entries $k_e$ corresponding to each of the elements of the set $S^u_n$. In the case of central-force (spring) models, Equation (3) is still valid except that $k_e$ now denotes the spring constant of the element $e$, and the entries of $g_e$ are either $n_x$, $n_y$, $n_z$, $-n_x$, $-n_y$, $-n_z$ or $0$, where $(n_x, n_y, n_z)$ denote the direction cosines of the element (see Equation (A1) in Appendix A).

Mathematically, in the case of the fuse and spring models, the breaking of a bond is equivalent to a rank-one downdate of the matrix $A_n$. In the case of beam models, the breaking of a bond is equivalent to a multiple-rank (rank-6) downdate of the matrix $A_n$. In the following, we present the process of breaking in the context of a fuse model. The methodology for central-force (spring) and beam models is similar and is presented in Appendix A.

As discussed before, let $N$ (the dimension of $A_n$, where $n = 0, 1, 2, \ldots$) denote the total number of degrees of freedom in the lattice system and let $A_n$ represent the stiffness matrix of the random fuse network system in which $n$ fuses are either missing (random dilution) or have been burnt during the analysis. Let us also assume that a fuse $ij$ (the $(n+1)$th fuse) is burnt when the externally applied voltage is increased gradually. In the above description, $i$ and $j$ refer to the global degrees of freedom connected by the fuse before it is broken. For the scalar random fuse model, the degrees of freedom $i$ and $j$ are also equivalent to the node $i$ and node $j$ connected by the fuse before it is broken. The new stiffness matrix $A_{n+1}$ of the lattice system after the fuse $ij$ is burnt is given by

$$A_{n+1} = G_{n+1} \Lambda_{n+1} G_{n+1}^t = G_n \Lambda_n G_n^t - k_n v_n v_n^t = A_n - k_n v_n v_n^t, \qquad \forall n = 0, 1, 2, \ldots \tag{4}$$

where $v_n$ is a sparse vector with only a few (at most two) non-zero entries,

$$v_n^t = \{\, 0 \;\cdots\; \overset{i}{1} \;\cdots\; \overset{j}{-1} \;\cdots\; 0 \,\} \tag{5}$$

and $k_n$ is a positive constant denoting the conductance of the fuse $ij$ (the $(n+1)$th broken bond) before it is broken. It is to be noted that for each $n = 0, 1, 2, \ldots$, we have $v_n = g_{e_{n+1}}$ and $k_n = k_{e_{n+1}}$, where $e_{n+1}$ denotes the element number of the $(n+1)$th broken bond with connecting nodes $i$ and $j$, and $S^b_{n+1} = S^b_n \cup \{e_{n+1}\}$. For the random fuse and spring models, Equation (4) follows directly from Equation (3) by noting that breaking the $(n+1)$th fuse (removing an element from the lattice) is equivalent to deleting the corresponding column $v_n$ associated with the $(n+1)$th burnt fuse from the incidence matrix $G_n$. In particular, the relation $A_n = G_n \Lambda_n G_n^t$ (Equation (3)) holds for each $n$, where $G_n$ is the result of deleting columns $v_0, v_1, \ldots, v_{n-1}$ from $G_0$. The matrix $G_0$ is an $N \times N_{el}$ rectangular incidence matrix with entries $-1$, $0$, or $1$ corresponding to each of the bonds (elements) of the original intact lattice. Similarly, the matrix $\Lambda_n$ is obtained by successively deleting the rows and columns corresponding to the $k_0, k_1, \ldots, k_{n-1}$ diagonal entries of $\Lambda_0$. Furthermore, it is noted that if the Cholesky factorizations are

$$P A_n P^t = L_n L_n^t \tag{6}$$

for each $n = 0, 1, 2, \ldots$, where $P$ is a permutation matrix chosen to preserve the sparsity of $L_n$, then the sparsity pattern of $L_{n+1}$ is contained in that of $L_n$. Hence, for all $n$, the sparsity pattern of $L_n$ is contained in that of $L_0$.
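The equivalence between re-assembling the conductance matrix without the burnt bond and applying the rank-one downdate of Equation (4) can be checked on a very small example. The sketch below is illustrative only: the three-node network, its bond list, and the function names `incidence_vector` and `assemble` are hypothetical, and dense Python lists stand in for the sparse matrices a real lattice code would use. A bond end marked `None` is assumed to be attached to a prescribed (bus-bar) node, so its incidence vector has a single non-zero entry.

```python
def incidence_vector(n_dof, i, j):
    """Column g_e: +1 at DOF i, -1 at DOF j; None marks a prescribed node."""
    g = [0.0] * n_dof
    if i is not None:
        g[i] = 1.0
    if j is not None:
        g[j] = -1.0
    return g


def assemble(n_dof, bonds, conductance, intact):
    """A = sum over intact bonds of k_e * g_e * g_e^t, as in Equation (3)."""
    A = [[0.0] * n_dof for _ in range(n_dof)]
    for e in intact:
        g = incidence_vector(n_dof, *bonds[e])
        for r in range(n_dof):
            for c in range(n_dof):
                A[r][c] += conductance[e] * g[r] * g[c]
    return A


# Hypothetical network: three interior bonds plus one bond to each bus bar.
bonds = [(0, 1), (1, 2), (0, 2), (0, None), (2, None)]
conductance = [1.0] * len(bonds)
n_dof = 3

A0 = assemble(n_dof, bonds, conductance, intact=range(len(bonds)))

# Burn bond 0 and form A1 in two ways.
burnt = 0
v = incidence_vector(n_dof, *bonds[burnt])
k = conductance[burnt]

# (a) rank-one downdate, Equation (4): A1 = A0 - k * v * v^t
A1_downdate = [[A0[r][c] - k * v[r] * v[c] for c in range(n_dof)]
               for r in range(n_dof)]

# (b) direct re-assembly over the remaining intact bonds
remaining = [e for e in range(len(bonds)) if e != burnt]
A1_direct = assemble(n_dof, bonds, conductance, intact=remaining)
```

Both routes give the same matrix; the point of the downdate form is that only the entries touched by the two end nodes of the burnt bond change.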

Based on the above description of successive $A_n$ for $n = 0, 1, 2, \ldots$, an updating scheme of some kind is likely to be more efficient than solving the new set of equations formed by Equation (1) for each $n$. In particular, since the successive matrices $A_n$, for each $n = 0, 1, 2, \ldots$, differ by a rank-one matrix, we successively downdate the Cholesky factorizations $L_n$ of $A_n$ using the sparse Cholesky factorization downdate algorithm of Davis and Hager [10, 11]. For numerical simulations, we explore two variants of this approach. The successive rank-one downdates [10] of $L_n \to L_{n+1}$ for $n = 0, 1, 2, \ldots$ lead to the Solver Type 1 algorithm, and the multiple-rank (rank-$(n+1-m)$) downdate [11] of $L_m \to L_{n+1}$, where $L_m$ is the Cholesky factor of $A_m$ for $0 \le m \le n$, leads to the Solver Type 2 algorithm. The multiple-rank update of the sparse Cholesky factorization is computationally superior to an equivalent series of rank-one updates, since the multiple-rank update makes one pass through $L$ in computing the new entries, while a series of rank-one updates requires multiple passes through $L$ [11]. In addition, given the Cholesky factorization $L_m$ of $A_m$ for $m = 0, 1, 2, \ldots$, it is possible that a direct updating of the solution $x_{n+1}$ for $n = m, m+1, m+2, \ldots$, based on $p = (n - m)$ saxpy vector updates (see Section 3.2 for details), may sometimes be cheaper than the successive sparse Cholesky updates of Solver Type 1. Hence, in this work, we compare the efficiency of solver types 1 and 2 for solving the random thresholds fuse model.

3. ALGORITHMS BASED ON SPARSE DIRECT SOLVERS

3.1. Sparse Cholesky factorization update: $L_m \to L_{m+p}$ for any $p$

The algorithm presented in References [10, 11] is based on the analysis and manipulation of the underlying graph structure of the stiffness matrix $A$ and on the methodology presented in References [13, 14] for modifying a dense Cholesky factorization. This algorithm incorporates the change in the sparsity pattern of $L$ and is optimal in the sense that the computational time required is proportional to the number of changing non-zero entries in $L$. In particular, since the breaking of fuses is equivalent to removing the edges in the underlying graph structure of the stiffness matrix $A$, the new sparsity pattern of the modified $L$ must be a subset of the sparsity pattern of the original $L$. Denoting the sparsity pattern of $L$ by $\mathcal{L}$, we have

$$\mathcal{L}_m \supseteq \mathcal{L}_n \qquad \forall m < n \tag{7}$$

Therefore, we can use the modified dense Cholesky factorization update [10] and work only on the non-zero entries in $L$. Furthermore, since the changing non-zero entries in $L$ depend on the $i$th and $j$th degrees of freedom of the fuse $ij$ that is broken (at most two non-zero entries in $v_n$, $\forall n$), it is only necessary to modify the non-zero elements of a submatrix of $L$.



The multiple-rank sparse Cholesky downdate algorithm downdates the Cholesky factorization $L_m$ of the matrix $A_m$ to $L_{n+1}$ of the new matrix $A_{n+1}$, where $A_{n+1} = A_m + \sigma Y Y^t$, $\sigma = -1$, and $Y$ represents an $N \times p$ rank-$p$ matrix. The pseudo-code in Algorithm 1 follows the MATLAB syntax [15] and its sparse matrix functionalities very closely. The following notation is used:

• zeros(m, n): an m × n matrix of zero entries.
• L(i1 : i2, j) = $L_{(i_1:i_2),\,j}$: refers to the entries corresponding to the i1th to i2th rows of column j of the matrix L.
• [ilist, jlist, val] = find(L(i1 : i2, j)): extracts the sparsity pattern of L(i1 : i2, j). That is, the non-zeros of L(i1 : i2, j) are stored in val, and the corresponding row and column indices are stored in ilist and jlist, respectively.
• j + ilist: increment each of the entries of ilist by j.
• length(ilist): length of the vector ilist.
• Sparse(ilist, jlist, val, m, n): create an m × n sparse matrix with non-zero entries val located at the row and column indices given by ilist and jlist, respectively.

Using the above notation, the multiple-rank sparse Cholesky factor update ($\sigma = +1$) or downdate ($\sigma = -1$) algorithm, $L_{n+1} = \mathrm{SpChol}(L_m, Y, \sigma)$, where $L_{n+1} L_{n+1}^t = L_m L_m^t + \sigma Y Y^t$ and $Y$ represents an $N \times p$ rank-$p$ matrix, is given by

Algorithm 1 (SpChol($L$, $Y$, $\sigma$): Rank-$p$ sparse Cholesky update/downdate algorithm [11])

1: Convert $LL^t \to LDL^t$ [15]
2: Initialize: ilist = jlist = vlist = zeros(nnz(L), 1); ifree = 1
3: Set $\alpha_i = 1$ for all $i = 1, 2, \ldots, p$
4: for j = 1 to N do
5:   Algorithm 5 from Davis et al. [10] (insert Algorithm 2 from below)
6:   Update the non-zeros of column j of L (insert pseudo-code 3 from below)
7: end for
8: nz = ifree − 1
9: Update L = I + Sparse(ilist(1 : nz), jlist(1 : nz), vlist, N, N)
10: Convert $LDL^t \to LL^t$ [15]

Algorithm 2 (Algorithm 5 from Davis and Hager [10])

1: for i = 1 to p do
2:   if $Y_{ji} \ne 0$ then
3:     $\bar{\alpha} = \alpha_i + \sigma Y_{ji}^2 / D_{jj}$   {$\sigma = +1$ for update or $-1$ for downdate}
4:     $D_{jj} = \bar{\alpha}\, D_{jj}$
5:     $\gamma_i = \sigma\, Y_{ji} / D_{jj}$
6:     $D_{jj} = D_{jj} / \alpha_i$
7:     $\alpha_i = \bar{\alpha}$
8:   end if
9: end for

Algorithm 3 (Pseudo-code for updating the non-zeros of column j of L)

1: if j < N then
2:   [iylist, jylist, yval] = find($L_{(j+1):N,\,j}$)
3:
4:   for i = 1 to p do
5:     if $Y_{ji} \ne 0$ and yval ≠ ∅ then
6:       $Y_{j+\mathrm{iylist},\,i} = Y_{j+\mathrm{iylist},\,i} - Y_{ji}\,\mathrm{yval}$
7:       [mm, nn, ww] = find($Y_{(j+1):N,\,i}$)
8:       vv = zeros(N, 1);
9:       $vv_{j+\mathrm{iylist}} = \mathrm{yval}$; $vv_{j+\mathrm{mm}} = vv_{j+\mathrm{mm}} + \gamma_i\,\mathrm{ww}$;
10:      [iylist, jylist, yval] = find(vv((j + 1) : N))
11:    end if
12:  end for
13:
14:  nz = length(iylist)
15:  ilist(ifree : (ifree + nz − 1)) = iylist + j
16:  jlist(ifree : (ifree + nz − 1)) = j
17:  vlist(ifree : (ifree + nz − 1)) = yval
18:  ifree = ifree + nz
19: end if

When the factorization of the matrix $A$ is available as $LDL^t$ instead of $LL^t$, the conversion from $LL^t$ to $LDL^t$ and vice versa is not performed in Algorithm 1. The reader is referred to References [10, 11] for a comprehensive presentation of the sparse Cholesky factorization update/downdate algorithm.
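For readers who want to see the scalar recurrence of Algorithm 2 in action, the following is a dense, rank-one (p = 1) Python sketch of the $LDL^t$ update/downdate. It is a simplified illustration rather than the sparse algorithm of References [10, 11]: all sparsity bookkeeping (find, Sparse, ilist and so on) is omitted, the matrix is stored as nested lists, and the helper names `ldl_factor`, `ldl_rank1_modify` and `reconstruct` are hypothetical. The example matrix is an assumed 3 × 3 conductance matrix, and the downdate removes a unit-conductance bond between degrees of freedom 0 and 1.

```python
def ldl_factor(A):
    """Dense LDL^t factorization of a symmetric positive definite matrix."""
    n = len(A)
    L = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * D[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / D[j]
    return L, D


def ldl_rank1_modify(L, D, w, sigma):
    """Overwrite L, D so that L D L^t becomes L D L^t + sigma * w w^t.

    sigma = +1.0 is an update, sigma = -1.0 a downdate. The downdated
    matrix is assumed to remain positive definite.
    """
    n = len(D)
    w = list(w)          # work on a copy; w is consumed column by column
    alpha = 1.0
    for j in range(n):
        # Scalar recurrence (Algorithm 2 with p = 1)
        alpha_bar = alpha + sigma * w[j] * w[j] / D[j]
        D[j] = alpha_bar * D[j]
        gamma = sigma * w[j] / D[j]
        D[j] = D[j] / alpha
        alpha = alpha_bar
        # Column update (dense counterpart of Algorithm 3)
        for p in range(j + 1, n):
            w[p] -= w[j] * L[p][j]
            L[p][j] += gamma * w[p]
    return L, D


def reconstruct(L, D):
    """Form the dense product L D L^t, for checking only."""
    n = len(D)
    return [[sum(L[r][k] * D[k] * L[c][k] for k in range(n))
             for c in range(n)] for r in range(n)]


A0 = [[3.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 3.0]]
L, D = ldl_factor(A0)

# Burn the unit-conductance bond (0, 1): A1 = A0 - v v^t with v = e_0 - e_1.
v = [1.0, -1.0, 0.0]
ldl_rank1_modify(L, D, v, sigma=-1.0)
A1 = reconstruct(L, D)
```

For a bond of conductance $k_n \neq 1$ the vector passed in would be $\sqrt{k_n}\, v_n$, so that $w w^t = k_n v_n v_n^t$.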

• Solver Type 1: Given the factorization $L_m$ of $A_m$, the rank-1 sparse Cholesky modification (Algorithm 1 with $p = 1$) is used to update the factorization $L_{n+1}$ for all subsequent values of $n = m, m+1, \ldots$. Once the factorization $L_{n+1}$ of $A_{n+1}$ is obtained, the solution vector $x_{n+1}$ is obtained from $L_{n+1} L_{n+1}^t x_{n+1} = b_{n+1}$ by two triangular solves.
• Solver Type 2: Given the factorization $L_m$ of $A_m$ for $m \le n$, the solution vector $x_{n+1}$ after the $(n+1)$th fuse is burnt, for $n = m, m+1, m+2, \ldots$, is obtained by two triangular solves using the factor $L_m$ on a sparse load vector with only a few non-zeros, and $(n + 2 - m)$ saxpy vector updates (see Section 3.2). The triangular solves are further simplified by the fact that they are performed on a trivial load vector (at most two non-zeros), and hence they can be performed much more efficiently than $O(\mathrm{nnz}(L_m))$, where $\mathrm{nnz}(L_m)$ denotes the number of non-zeros of the Cholesky factor $L_m$ of $A_m$. Since the storage and computational cost associated with $p = (n - m)$ saxpy vector updates can become expensive, we use the multiple-rank (rank-$p$) sparse Cholesky update (Algorithm 1) $L_m \to L_{m+\mathrm{maxupd}+1}$ at every maxupd (defined later) interval.

In the following, we present an algorithm for updating the solution $x_n \to x_{n+1}$, given the factor $L_m$ of $A_m$ for $m \le n$. This algorithm is used in conjunction with the multiple-rank sparse Cholesky update $L_m \to L_{m+\mathrm{maxupd}+1}$ of Solver Type 2.



3.2. Update $x_n \to x_{n+1}$, given $A_m = L_m L_m^t$ for any $m \le n$

Since the successive matrices $A_n$ and $A_{n+1}$, for each $n$, differ by a rank-one matrix, it is possible to use the Sherman–Morrison–Woodbury formula [15] to update the solution vector $x_{n+1}$ directly, knowing the sparse Cholesky factor $L_m$ of $A_m$ for any $m \le n$. In the following, we assume that the factor $L_m$ of the stiffness matrix $A_m$ is available (after the $m$th bond is broken) for any $0 \le m \le n$, and we update the solution vector $x_n \to x_{n+1}$ for all $n = m, m+1, \ldots$. Let $b_n$ and $x_n$ represent the load vector and solution vector after the $n$th bond is broken. Similarly, let us assume that a fuse $ij$ (the $(n+1)$th fuse) is burnt when the externally applied voltage is increased gradually, such that $b_{n+1}$ and $x_{n+1}$ represent the new load and solution vectors, and $k_n$ is the conductance of the fuse $ij$ (the $(n+1)$th broken bond) before it is broken. In addition, let $v_{(n-m)}$ denote the sparse vector corresponding to the $(n+1)$th broken bond for any $n \ge m$ (similar to Equation (5)) with only a few (at most two) non-zero entries. The implementation details involved in the updating of $x_{n+1}$ from $x_n$ using the factor $L_m$ are described below:

Algorithm 4 (Solution vector update: $x_n \to x_{n+1}$, given $A_m = L_m L_m^t$ for any $m \le n$)

1: Solve $L_m L_m^t u_{(n-m)} = v_{(n-m)}$
2: if $n \ne m$ then
3:   $u_{(n-m)} \leftarrow u_{(n-m)} + \hat{u}_{(n-m)}$, where $\hat{u}_{(n-m)} = \sum_{l=0}^{(n-m-1)} \alpha_l \left(u_l^t v_{(n-m)}\right) u_l$
4: end if
5: Evaluate $\alpha_{(n-m)} = \dfrac{k_n}{1 - k_n v_{(n-m)}^t u_{(n-m)}}$
6: Store $u_{(n-m)}$ and $\alpha_{(n-m)}$ (used in the future evaluation of $u_{(n-m+s)}$, where $s = 1, 2, \ldots$)
7: Update the load vector: $b_n \to b_{n+1}$ (see Appendix B, Remark 5)
8: Compute $\gamma = \alpha_{(n-m)} \left(u_{(n-m)}^t b_{n+1}\right)$
9: if (i or j is prescribed) then
10:   $\gamma = \gamma - k_n$
11: end if
12: Update $x_{n+1} = x_n + \gamma\, u_{(n-m)}$

Algorithm 4 is repeated for all values of $n = m, m+1, \ldots$ until either the lattice system has fractured or the number of updates $(n + 1 - m)$ exceeds maxupd, defined as the maximum number of updates between successive Cholesky factor updates. It is noted that the storage and computational requirements can become prohibitively expensive, especially for large lattices, as the number of updates, $p = (n - m)$, increases. Hence, it is necessary to limit the maximum number of updates between two successive Cholesky factor updates to a certain maxupd. That is, it is necessary to update $L_m \to L_{m+\mathrm{maxupd}+1}$ using the multiple-rank (rank-$p$) sparse Cholesky update algorithm at every maxupd interval. Once $L_m$ has been updated, the counter $m$ in Algorithm 4 is reset, i.e. $m \leftarrow (m + \mathrm{maxupd} + 1)$, and Algorithm 4 is repeated for all $n = m, m+1, \ldots$, until the lattice system fractures.
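The solution update of Algorithm 4 is a repeated application of the Sherman–Morrison formula against a fixed factorization, and can be sketched compactly in Python. The code below is an illustration under several assumptions, not the authors' implementation: `solve_spd` is a dense Gaussian-elimination stand-in for the two sparse triangular solves with $L_m$, the class name `SolutionUpdater` and the small three-node network are hypothetical, and the load-vector correction is written for a general prescribed potential `v_presc` at the bus-bar end of the bond (with `v_presc = 1` it reduces to the $\gamma - k_n$ correction of Steps 9 to 11, and with `v_presc = 0` the load vector is unchanged).

```python
def solve_spd(A, b):
    """Dense stand-in for the triangular solves with the fixed factor L_m."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):                      # forward elimination
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):            # backward substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


class SolutionUpdater:
    """Tracks x_n using only solves with the original matrix A_m."""

    def __init__(self, A_m, b_m):
        self.A_m = [row[:] for row in A_m]
        self.b = list(b_m)
        self.x = solve_spd(self.A_m, self.b)
        self.us, self.alphas = [], []       # stored u_l and alpha_l

    def burn(self, i, j, k, v_presc=0.0):
        """Burn the bond (i, j) of conductance k; j=None means a bus-bar end."""
        n = len(self.b)
        v = [0.0] * n
        v[i] = 1.0
        if j is not None:
            v[j] = -1.0
        u = solve_spd(self.A_m, v)                          # Step 1
        corr = [0.0] * n                                    # Step 3
        for alpha_l, u_l in zip(self.alphas, self.us):
            coeff = alpha_l * dot(u_l, v)
            corr = [corr[r] + coeff * u_l[r] for r in range(n)]
        u = [u[r] + corr[r] for r in range(n)]
        alpha = k / (1.0 - k * dot(v, u))                   # Step 5
        self.us.append(u)                                   # Step 6
        self.alphas.append(alpha)
        if j is None:                                       # Step 7
            self.b[i] -= k * v_presc
        gamma = alpha * dot(u, self.b)                      # Step 8
        if j is None:                                       # Steps 9-11
            gamma -= k * v_presc
        self.x = [self.x[r] + gamma * u[r] for r in range(n)]  # Step 12
        return self.x


# Hypothetical network: bonds (0,1), (1,2), (0,2), node 0 to the bottom bus
# (potential 0), nodes 1 and 2 to the top bus (potential 1), all of unit k.
A0 = [[3.0, -1.0, -1.0], [-1.0, 3.0, -1.0], [-1.0, -1.0, 3.0]]
b0 = [0.0, 1.0, 1.0]

upd = SolutionUpdater(A0, b0)
upd.burn(0, 1, 1.0)                    # interior bond between free nodes
x2 = upd.burn(1, None, 1.0, v_presc=1.0)   # bond from node 1 to the top bus

# Reference: re-assemble and solve from scratch after both bonds are gone.
A2 = [[2.0, 0.0, -1.0], [0.0, 1.0, -1.0], [-1.0, -1.0, 3.0]]
b2 = [0.0, 0.0, 1.0]
x2_direct = solve_spd(A2, b2)
```

The stored lists `us` and `alphas` grow by one entry per burnt bond, which is the storage and saxpy cost that motivates limiting the number of updates to maxupd before refreshing the factor.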

Notes on Algorithm 4: Implementation of Algorithm 4 takes advantage of the fact that $v_{(n-m)}$ is a sparse vector with at most 2 non-zeros (see Equations (5) and (B1)). Hence, the computational cost of Step 1 is much less than $O(\mathrm{nnz}(L_m))$. For the same reason, the scalar products $v_{(n-m)}^t u_l$ and $v_{(n-m)}^t u_{(n-m)}$ can be evaluated trivially as described below. Depending on whether the degrees of freedom $i$ and $j$ of the broken bond $ij$ (the $(n+1)$th broken bond) are constrained/prescribed (see Appendix B, Remarks 3 and 4), we have the following expressions for evaluating the scalar products in Steps 3 and 5 of Algorithm 4.

1: if i and j are free degrees of freedom then
2:   $v_{(n-m)}^t = \{\, 0 \;\cdots\; \overset{i}{1} \;\cdots\; \overset{j}{-1} \;\cdots\; 0 \,\}$
3:   $v_{(n-m)}^t u_l = (u_l)_i - (u_l)_j$ and $v_{(n-m)}^t u_{(n-m)} = (u_{(n-m)})_i - (u_{(n-m)})_j$
4: else if j is a constrained/prescribed degree of freedom then
5:   $v_{(n-m)}^t = \{\, 0 \;\cdots\; \overset{i}{1} \;\cdots\; 0 \,\}$
6:   $v_{(n-m)}^t u_l = (u_l)_i$ and $v_{(n-m)}^t u_{(n-m)} = (u_{(n-m)})_i$
7: else if i is a constrained/prescribed degree of freedom then
8:   $v_{(n-m)}^t = \{\, 0 \;\cdots\; \overset{j}{1} \;\cdots\; 0 \,\}$
9:   $v_{(n-m)}^t u_l = (u_l)_j$ and $v_{(n-m)}^t u_{(n-m)} = (u_{(n-m)})_j$
10: end if

Similarly, the updating of the load vector $b_n \to b_{n+1}$ in Step 7 of Algorithm 4 is computed trivially by updating a single entry of the vector $b_n$ as shown below (see Appendix B, Remark 5):

1: if j is a prescribed degree of freedom then
2:   $(b_{n+1})_i = (b_n)_i - k_n$
3: else if i is a prescribed degree of freedom then
4:   $(b_{n+1})_j = (b_n)_j - k_n$
5: end if

Based on the above implementation of Algorithm 4, the computational cost involved in breaking the $(n+1)$th fuse $ij$ (say $i$ and $j$ are free degrees of freedom as in Section 2) is $(n + 2 - m)$ saxpy vector updates, one vector inner product, and two triangular solves $L_m L_m^t u_{(n-m)} = v_{(n-m)}$ on a sparse vector $v_{(n-m)}$ with at most two non-zeros. The computational cost of the associated triangular solves is much less than $O(\mathrm{nnz}(L_m))$ because of the sparsity of the vector $v_{(n-m)}$.

The optimum number of steps between successive factorization updates of the matrix $A$ is determined by minimizing the cpu time required for the entire analysis. Let $t_{fac}$ denote the average cpu time required for performing the Cholesky factorization $A_m = L_m L_m^t$, or its multiple-rank sparse update, and let $t_{back}$ denote the average cpu time required for solving $L_m L_m^t u_{(n-m)} = v_{(n-m)}$ with a sparse RHS vector $v_{(n-m)}$. Let $t_{upd}$ denote the average cpu time required for a single rank-1 update of the solution $u_{n+1-m}$. Note that the evaluation of $u_{n+1-m}$ requires $(n - m)$ saxpy vector updates. Let the estimated number of steps for the lattice system failure be $n_{steps}$. Then, the total cpu time required for solving the linear system of equations until the lattice system failure is given by

$$T = n_{fac}\, t_{fac} + n_{steps}\, t_{back} + \sum_{1}^{n_{rank1}} n_{rank1}\, t_{upd} = n_{fac}\, t_{fac} + n_{steps}\, t_{back} + \frac{1}{2}\, \frac{(n_{steps} - n_{fac})}{n_{fac}}\, \frac{n_{steps}}{n_{fac}}\, t_{upd} \tag{8}$$

where $n_{fac}$ denotes the number of factorization updates until lattice system failure and $n_{rank1} = (n_{steps} - n_{fac})/n_{fac}$ denotes the average number of rank-1 updates per factorization. The optimum number of factorizations, $n_{optfac}$, for the entire analysis is obtained by minimizing the function $T$ via $\partial T / \partial n_{fac} = 0$. The maximum number of vector updates, maxupd, between successive factorization updates is then estimated as

$$\mathrm{maxupd} = \frac{(n_{steps} - n_{optfac})}{n_{optfac}} \tag{9}$$

It should be noted that the above estimate of maxupd is based only on minimizing the computational cost. However, for large lattice systems, in-core storage of the Cholesky factor $L_m$ and the $u_l$ vectors, where $l = 0, 1, 2, \ldots, p$ and $p = (n - m)$, may also become a limiting factor. In such cases, maxupd is chosen based on storage constraints.

3.3. Alternative algorithms based on sparse direct solvers

In general, the multiple-rank update of $L_m \to L_{n+1}$ as in Solver Type 2, after fuses $n = m, m+1, \ldots, m+\mathrm{maxupd}$ are broken, is expected to be computationally cheaper than performing the direct factorization of the new stiffness matrix $A_{m+\mathrm{maxupd}+1}$ [10, 11]. However, as the number of updates $p = (n - m)$ increases, especially close to macroscopic fracture, where $A_{n+1}$ is much sparser than $A_m$, a direct factorization of $A_{n+1}$ may become cheaper than multiple-rank updating of $L_m \to L_{n+1}$. In order to take advantage of the much sparser $A_{n+1}$, and for comparison of different solver types based on sparse direct solvers, we use the following solver types 3 and 4 in the numerical simulation of the random threshold fuse model.

• Solver Type 3: Given the factor $L_m$ of $A_m$ for $m \le n$, the solution vector $x_{n+1}$ after the $(n+1)$th fuse is burnt, for $n = m, m+1, m+2, \ldots, m+\mathrm{maxupd}$, is obtained by two triangular solves using $L_m L_m^t$ on a sparse load vector with at most two non-zeros, and $(n + 2 - m)$ saxpy vector updates (see Section 3.2). The only difference between Solver Types 2 and 3 is that instead of multiple-rank updating of $L_m \to L_{m+\mathrm{maxupd}+1}$ after every maxupd steps, we refactorize $A_{m+\mathrm{maxupd}+1}$ directly to obtain $L_{m+\mathrm{maxupd}+1}$.
• Solver Type 4: Along with Solver Type 3, we have also investigated an alternative modified direct method using dense matrix updates. The details of one such algorithm are given in Appendix C. However, based on the operation counts and the actual cpu times measured during numerical simulations, the algorithm presented in Appendix C was found to be computationally inefficient compared with the other algorithms presented in this section.



4. CIRCULANT PRECONDITIONERS FOR CG ITERATIVE SOLVERS

This study also investigates the use of PCG algorithms for re-solving the new set of Kirchhoff equations (stiffness matrix) every time a fuse is burnt. In particular, we investigate the choice of using circulant and block-circulant matrices as preconditioners to the Laplacian operator on a fractal network. The main advantage is that the preconditioners can be diagonalized by discrete Fourier matrices, and hence the inversion of an $N \times N$ circulant matrix can be done in $O(N \log N)$ operations by using FFTs of size $N$. In addition, since the convergence rate of the PCG method depends on the condition number and clustering of the eigenvalues of the preconditioned system, it is possible to choose a circulant preconditioner that minimizes the condition number of the preconditioned system [16, 17]. Furthermore, these circulant preconditioned systems exhibit favourable clustering of eigenvalues. In general, the more clustered the eigenvalues are, the faster the convergence rate is.

As noted earlier, since the initial lattice grid is uniform, the Laplacian operator (Kirchhoff equations) on the initial uniform grid results in a Toeplitz matrix $A_0$. Hence, a fast Poisson type solver with a circulant preconditioner can be used to obtain the solution in $O(N \log N)$ operations using FFTs of size $N$. However, as the lattice bonds are broken successively, the initial uniform lattice grid becomes a fractal network. Consequently, although the matrix $A_0$ is Toeplitz (also block Toeplitz with Toeplitz blocks) initially, the subsequent matrices $A_n$, for each $n$, are not Toeplitz matrices. However, $A_n$ may still possess block structure with many of the blocks being Toeplitz blocks, depending on the pattern of broken bonds.
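The reason a circulant preconditioner is cheap to apply is that a circulant system is diagonal in Fourier space: its eigenvalues are the discrete Fourier transform of its first column. The following Python sketch illustrates this on a small example. It is a minimal illustration under stated assumptions: the function names `dft` and `circulant_solve` and the sample data are hypothetical, and a naive $O(N^2)$ transform is used so that the block needs only the standard library, whereas an FFT would give the $O(N \log N)$ cost quoted above.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for m in range(n))
        out.append(s / n if inverse else s)
    return out


def circulant_solve(c, b):
    """Solve C x = b, where C is circulant with first column c.

    C[r][s] = c[(r - s) % n], so C x is the circular convolution of c and x
    and the system decouples into n scalar divisions in Fourier space.
    Real c and b are assumed, so the real part of the result is returned.
    """
    eigenvalues = dft(c)                 # eigenvalues of C
    b_hat = dft(b)
    x_hat = [bh / lam for bh, lam in zip(b_hat, eigenvalues)]
    return [z.real for z in dft(x_hat, inverse=True)]


# Hypothetical example: periodic 1D Laplacian plus a diagonal shift.
c = [4.0, -1.0, 0.0, -1.0]
b = [1.0, 2.0, 3.0, 4.0]
x = circulant_solve(c, b)
```

Inside a PCG iteration the same three transforms would be applied to the residual at every step, with the eigenvalues of the preconditioner computed once.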

Even with the above observation that $A_n$ for $n > 0$ is not Toeplitz, for the purposes of direct comparison with the previously published algorithms for solving random thresholds fuse models, this study explores the choice of optimal [17–20] and superoptimal [16, 17] circulant preconditioners along with incomplete Cholesky preconditioners [15] for the Laplacian operator (Kirchhoff equations) on a fractal network. In addition, since the matrices $A_n$ in general have block structure with many blocks being Toeplitz blocks, we also use block-circulant matrices [17, 21] as preconditioners to the stiffness matrix. In the literature [7–9], Fourier-accelerated PCG has been used to accelerate the iterative solution of random resistor networks near the percolation critical threshold. However, the type of ensemble-averaged circulant preconditioner used in these studies is not optimal in the sense described in References [17–20], and hence is expected to take more CG iterations than the optimal and superoptimal circulant preconditioners.

Consider the $N \times N$ stiffness matrix $A$. The optimal circulant preconditioner $c(A)$ [18] is defined as the minimizer of $\|C - A\|_F$ over all $N \times N$ circulant matrices $C$, where $\|\cdot\|_F$ denotes the Frobenius norm [15]. The optimal circulant preconditioner $c(A)$ is uniquely determined by $A$, and is given by

$$c(A) = F^{*}\,\delta(F A F^{*})\,F \tag{10}$$

where $F$ denotes the discrete Fourier matrix, $\delta(A)$ denotes the diagonal matrix whose diagonal is equal to the diagonal of the matrix $A$, and $*$ denotes the adjoint (i.e. conjugate transpose). It should be noted that the diagonal entries of $F A F^{*}$ are the eigenvalues of the matrix $c(A)$ and can be obtained in $O(N \log N)$ operations by taking the FFT of the first column of $c(A)$. The first column vector of T. Chan's optimal circulant preconditioner matrix that minimizes the norm $\|C - A\|_F$ is given by

$$c_i = \frac{1}{N} \sum_{j=1}^{N} a_{j,\,(j-i+1) \bmod N} \tag{11}$$
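As a hedged illustration of Equations (10) and (11), the sketch below computes the first column of $c(A)$ by averaging wrapped diagonals and checks that its FFT reproduces the diagonal of $F A F^{*}$. It uses dense NumPy arrays and 0-based indices, so entry `i` averages `A[j, (j - i) mod N]`; the function name `optimal_circulant_column` is an assumption made for this example.

```python
import numpy as np

def optimal_circulant_column(A):
    """First column of T. Chan's optimal circulant approximation of A.

    With 0-based indices, entry i is the mean of the wrapped diagonal
    A[j, (j - i) mod N], j = 0..N-1 (the 0-based form of Equation (11)).
    """
    N = A.shape[0]
    j = np.arange(N)
    return np.array([A[j, (j - i) % N].mean() for i in range(N)])

# Illustrative symmetric positive definite test matrix (not from the paper)
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)

c = optimal_circulant_column(A)
F = np.fft.fft(np.eye(6)) / np.sqrt(6.0)      # unitary discrete Fourier matrix

# Equation (10): the diagonal of F A F* holds the eigenvalues of c(A),
# which are also given by the FFT of the first column of c(A).
print(np.allclose(np.diag(F @ A @ F.conj().T), np.fft.fft(c)))
```

Only the first column needs to be stored, since a circulant matrix is fully determined by it.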

The above formula can be interpreted simply as follows: the element $c_i$ is the arithmetic average of the entries on the diagonal of $A$ that contains the element $a_{i,1}$, with that diagonal extended to length $N$ by wrapping around. If the matrix $A$ is Hermitian, the eigenvalues of $c(A)$ are bounded below and above by

$$\lambda_{\min}(A) \le \lambda_{\min}(c(A)) \le \lambda_{\max}(c(A)) \le \lambda_{\max}(A) \tag{12}$$

where $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ denote the minimum and maximum eigenvalues, respectively. Based on the above result, if $A$ is positive definite, then the circulant preconditioner $c(A)$ is also positive definite. In particular, if the circulant preconditioner is such that the spectrum of the preconditioned system is clustered around one, then the convergence of the solution will be fast. The superoptimal circulant preconditioner $t(A)$ [16] is based on the idea of minimizing the norm $\|I - C^{-1}A\|_F$ over all non-singular circulant matrices $C$. Since the asymptotic convergence of the $t(A)$ preconditioned system is the same as that of $c(A)$ for large $N$, in this study we limit ourselves to the investigation of preconditioned systems using $c(A)$ given by Equation (11). The computational cost associated with the solution of the preconditioned system $c(A)z = r$ is the initialization cost of $\mathrm{nnz}(A)$ for setting the first column of $c(A)$ using Equation (11) during the first iteration, and $O(N \log N)$ during every iteration step.
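A minimal sketch of how such a preconditioner might enter a PCG loop is given below. It is a textbook preconditioned conjugate gradient iteration written with dense NumPy arrays, not the authors' implementation; the names `circulant_preconditioner` and `pcg` are hypothetical, and the test matrix (a 1D chain Laplacian with fixed ends) is chosen purely for illustration.

```python
import numpy as np

def circulant_preconditioner(A):
    """Return a function applying c(A)^{-1} via FFTs (A real symmetric)."""
    N = A.shape[0]
    j = np.arange(N)
    col = np.array([A[j, (j - i) % N].mean() for i in range(N)])  # Eq. (11), 0-based
    eig = np.fft.fft(col).real          # eigenvalues of c(A); real for symmetric A
    return lambda r: np.fft.ifft(np.fft.fft(r) / eig).real

def pcg(A, b, apply_Minv, x0=None, tol=1e-12, maxit=1000):
    """Standard PCG; stops when the relative residual drops below tol."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    bnorm = np.linalg.norm(b)
    for it in range(maxit):
        if np.linalg.norm(r) <= tol * bnorm:
            return x, it
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = apply_Minv(r)               # one O(N log N) preconditioner solve
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

# Illustrative system: 1D chain of unit conductances with both ends grounded
N = 32
A = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
b = np.ones(N)
x, its = pcg(A, b, circulant_preconditioner(A), tol=1e-10)
print(np.allclose(A @ x, b, atol=1e-7))
```

The preconditioner is set up once (the initialization cost) and then applied once per iteration, which matches the cost split described above.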

Since the Laplacian operator on a discrete lattice network results in block structure in the stiffness matrix, we use block-circulant matrices [17, 21] as preconditioners to the stiffness matrix. In order to distinguish the block-circulant preconditioners that follow from the above-described circulant preconditioners, we refer henceforth to the above preconditioners as point-circulant preconditioners.

4.1. Block-circulant preconditioners

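Let the matrix $A$ be partitioned into $r \times r$ blocks such that each block is an $s \times s$ matrix. That is, $N = rs$, and

$$A = \begin{pmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,r} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,r} \\
\vdots & \vdots & \ddots & \vdots \\
A_{r,1} & A_{r,2} & \cdots & A_{r,r}
\end{pmatrix} \tag{13}$$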

Although the point-circulant preconditioner $c(A)$ defined by Equation (11) can be used as a preconditioner, in general, the block structure is not restored by using $c(A)$ as a preconditioner. In contrast, the block-circulant preconditioners obtained by using circulant approximations for each of the blocks restore the block structure of $A$. The block-circulant preconditioner of $A$ can be expressed as

$$c_B(A) = \begin{pmatrix}
c(A_{1,1}) & c(A_{1,2}) & \cdots & c(A_{1,r}) \\
c(A_{2,1}) & c(A_{2,2}) & \cdots & c(A_{2,r}) \\
\vdots & \vdots & \ddots & \vdots \\
c(A_{r,1}) & c(A_{r,2}) & \cdots & c(A_{r,r})
\end{pmatrix} \tag{14}$$

It is the minimizer of $\|C - A\|_F$ over all matrices $C$ that are $r \times r$ block matrices with $s \times s$ circulant blocks. The spectral properties given by Equation (12) for point-circulant preconditioners also extend to the block-circulant preconditioners [17, 21]. That is,

$$\lambda_{\min}(A) \le \lambda_{\min}(c_B(A)) \le \lambda_{\max}(c_B(A)) \le \lambda_{\max}(A) \tag{15}$$

In particular, if $A$ is positive definite, then the block-circulant preconditioner $c_B(A)$ is also positive definite.
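The construction in Equation (14) and the bounds in Equation (15) can be checked numerically on a small example. The following is a dense NumPy sketch under the assumption of a real symmetric positive definite $A$; `optimal_circulant` and `block_circulant` are illustrative names, and the random test matrix is not related to any lattice in the paper.

```python
import numpy as np

def optimal_circulant(B):
    """Dense optimal circulant approximation c(B) of a square block B."""
    s = B.shape[0]
    idx = np.arange(s)
    col = np.array([B[idx, (idx - i) % s].mean() for i in range(s)])
    j, k = np.indices((s, s))
    return col[(j - k) % s]

def block_circulant(A, r, s):
    """Block-circulant approximation c_B(A): apply c(.) to each s x s block."""
    cB = np.zeros_like(A, dtype=float)
    for bi in range(r):
        for bj in range(r):
            rows = slice(bi * s, (bi + 1) * s)
            cols = slice(bj * s, (bj + 1) * s)
            cB[rows, cols] = optimal_circulant(A[rows, cols])
    return cB

# Illustrative SPD matrix with r = 3 blocks of order s = 4
rng = np.random.default_rng(1)
M = rng.standard_normal((12, 12))
A = M @ M.T + 12.0 * np.eye(12)
cB = block_circulant(A, r=3, s=4)

wA = np.linalg.eigvalsh(A)
wB = np.linalg.eigvalsh(cB)
# Equation (15): the spectrum of c_B(A) lies inside that of A
print(wA[0] <= wB[0] + 1e-10, wB[-1] <= wA[-1] + 1e-10)
```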

The computational cost associated with the block-circulant preconditioners can be estimated as follows. Since the stiffness matrix $A$ is real symmetric for the type of problems considered in this study, in the following we assume block symmetric structure for $A$, i.e. $A_{j,i} = A_{i,j}^{t}$. In forming the block-circulant preconditioner given by Equation (14), it is necessary to obtain point-circulant preconditioners for each of the $r \times r$ block matrices of order $s$. The point-circulant approximation for each of the $s \times s$ blocks requires $O(s \log s)$ operations. This cost is in addition to the cost of forming the first column vectors (Equation (11)) for each of the $c(A_{i,j})$ blocks, which is given by $\mathrm{nnz}(A)$ operations. Since there are $r(r+1)/2$ distinct blocks, we need $O(r^2 s \log s)$ operations to form

$$\Delta = (I \otimes F)\, c_B(A)\, (I \otimes F^{*}) = \begin{pmatrix}
\delta(F A_{1,1} F^{*}) & \delta(F A_{1,2} F^{*}) & \cdots & \delta(F A_{1,r} F^{*}) \\
\delta(F A_{2,1} F^{*}) & \delta(F A_{2,2} F^{*}) & \cdots & \delta(F A_{2,r} F^{*}) \\
\vdots & \vdots & \ddots & \vdots \\
\delta(F A_{r,1} F^{*}) & \delta(F A_{r,2} F^{*}) & \cdots & \delta(F A_{r,r} F^{*})
\end{pmatrix} \tag{16}$$

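where $\otimes$ refers to the Kronecker tensor product and $I$ is an $r \times r$ identity matrix. In order to solve the preconditioned equation $c_B(A)z = r$, Equation (16) is permuted to obtain a block-diagonal matrix of the form

$$\tilde{\Delta} = P^{*} \Delta P = \begin{pmatrix}
\tilde{\Delta}_{1,1} & 0 & \cdots & 0 \\
0 & \tilde{\Delta}_{2,2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \tilde{\Delta}_{s,s}
\end{pmatrix} \tag{17}$$

where $P$ is the permutation matrix such that

$$\left[\tilde{\Delta}_{k,k}\right]_{ij} = \left[\delta(F A_{i,j} F^{*})\right]_{kk} \quad \forall\; 1 \le i, j \le r,\; 1 \le k \le s \tag{18}$$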

During each iteration, in order to solve the preconditioned system $c_B(A)z = r$, it is necessary to solve with the block-diagonal matrix $\tilde{\Delta}$. This task can be performed by first factorizing each of the $\tilde{\Delta}_{k,k}$ blocks during the first iteration, and then subsequently using these factored matrices to do the triangular solves. Hence, without considering the initial cost of factorizing each of the diagonal blocks, during each iteration the number of operations involved in the solves with $\tilde{\Delta}$ is

$$\mathrm{delops} = O\!\left(\sum_{k=1}^{s} \left|L_{\tilde{\Delta}_{k,k}}\right|\right) \tag{19}$$

where $\left|L_{\tilde{\Delta}_{k,k}}\right|$ denotes the number of non-zeros in the Cholesky factor of $\tilde{\Delta}_{k,k}$. Therefore, the system $c_B(A)z = r$ can be solved in $O(rs \log s) + \mathrm{delops}$ operations per iteration. Thus, we conclude that for the block-circulant preconditioner, the initialization cost is $\mathrm{nnz}(A) + O(r^2 s \log s)$ plus the cost associated with the factorization of each of the diagonal blocks $\tilde{\Delta}_{k,k}$ during the first iteration, and $O(rs \log s) + \mathrm{delops}$ during every other iteration.

Although from the operational cost per iteration point of view the point-circulant preconditioner may prove advantageous for some problems, it is not clear whether the point-circulant or the block-circulant preconditioner is closer to the matrix $A$ in terms of the number of CG iterations necessary for convergence. Hence, we investigate both point- and block-circulant preconditioners in obtaining the solution of the linear system $Ax = b$ using iterative techniques. In addition, we also employ the commonly used point and block versions of the incomplete Cholesky preconditioners to solve $Ax = b$.
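The solve with the block-circulant preconditioner described above can be sketched as follows: an FFT of each length-$s$ segment of the right-hand side, $s$ independent $r \times r$ solves (one per Fourier index $k$, corresponding to the diagonal blocks of Equation (17)), and an inverse FFT. This is a dense NumPy sketch that calls a general solver for each small block instead of reusing Cholesky factors; the function names are hypothetical.

```python
import numpy as np

def block_circulant_eigs(A, r, s):
    """lam[i, j, :] = FFT of the first column of c(A_ij), i.e. diag of delta(F A_ij F*)."""
    idx = np.arange(s)
    lam = np.empty((r, r, s), dtype=complex)
    for bi in range(r):
        for bj in range(r):
            blk = A[bi * s:(bi + 1) * s, bj * s:(bj + 1) * s]
            col = np.array([blk[idx, (idx - i) % s].mean() for i in range(s)])
            lam[bi, bj] = np.fft.fft(col)
    return lam

def block_circulant_solve(A, rhs, r, s):
    """Solve c_B(A) z = rhs without forming c_B(A)."""
    lam = block_circulant_eigs(A, r, s)            # set-up (done once in practice)
    R = np.fft.fft(rhs.reshape(r, s), axis=1)      # (I kron F) applied to rhs
    Y = np.empty_like(R)
    for k in range(s):                             # one r x r system per Fourier index
        Y[:, k] = np.linalg.solve(lam[:, :, k], R[:, k])
    return np.fft.ifft(Y, axis=1).real.reshape(-1) # (I kron F*) applied to the result

# Illustrative SPD matrix with r = 3 blocks of order s = 4
rng = np.random.default_rng(2)
M = rng.standard_normal((12, 12))
A = M @ M.T + 12.0 * np.eye(12)
z = block_circulant_solve(A, rng.standard_normal(12), r=3, s=4)
print(z.shape)
```

In an actual PCG loop the array `lam` and the factorizations of its $s$ slices would be computed once and reused in every iteration.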

Remark 1
In the case of a 2D discrete lattice network with periodic boundary conditions in the horizontal direction and a constant voltage difference between the top and bottom of the lattice network, $A$ is a block tri-diagonal real symmetric matrix. Under these circumstances, the initialization cost is $\mathrm{nnz}(A) + O(rs \log s)$. Since each of the diagonal blocks $\tilde{\Delta}_{k,k}$ is tri-diagonal, during each iteration the solution involving $\tilde{\Delta}$ can be obtained in $O(rs)$ operations. Thus, the cost per iteration is $O(rs \log s) + O(rs) = O(rs \log s)$ operations. The total computational cost involved in using the block-circulant preconditioner for a symmetric block tri-diagonal matrix is the initialization cost of $\mathrm{nnz}(A) + O(rs \log s)$, and $O(rs \log s)$ operations per iteration step. This is significantly less than the cost of using a generic block-circulant preconditioner. It should be noted that the block tri-diagonal structure of $A$ does not change the computational cost associated with using a point-circulant preconditioner to solve $Ax = b$.

5. MODEL PROBLEM: TRIANGULAR LATTICE SYSTEM

Consider a 2D random threshold fuse lattice system of size $L \times L$ with periodic boundary conditions in the horizontal direction and a unit voltage difference, $V = 1$, applied between the top and bottom bus bars of the lattice system (see Figure 1). For the triangular lattice topology, $N_{el} = (3L + 1)(L + 1)$, and $N = L(L + 1)$. The model problem considered in this study is well described in References [22, 23, 1, p. 45, 2, p. 231], and the references therein. The elastic response of each bond in the lattice is linear up to an assigned threshold value, at which brittle failure of the bond occurs. The disorder in the system is introduced by assigning random maximum threshold current values $t$ (which are equivalent to the breaking stress in the mechanical problem) to each of the fuses (bonds) in the lattice, based on an assumed probability distribution. The electrical conductance (stiffness in the mechanical problem) is assumed to be the same and equal to unity for all the bonds in the lattice. This is justified because the conductance (or stiffness) of a heterogeneous solid converges rapidly to its scale-independent continuum value. A uniform probability distribution, which is constant between 0 and 1, is chosen as the probability distribution of failure thresholds. A broad thresholds distribution represents large disorder and exhibits diffusive damage (uncorrelated burning of fuses) leading to progressive damage localization, whereas a very narrow thresholds distribution exhibits brittle failure in which a single crack propagation causes material failure. This relatively simple model has been extensively used in the literature [1, 2, 22–25] for simulating the fracture and progressive damage evolution in brittle materials, and provides a meaningful benchmark for comparing the performance of different numerical algorithms.

As mentioned earlier, damage evolution in materials is simulated by slowly increasing the external voltage on the lattice system until the current $i$ flowing in an individual fuse exceeds its breaking threshold $t$. At this point, the fuse is burnt irreversibly. Each time a fuse is removed, the electrical current is instantaneously redistributed and a new system of Kirchhoff equations is solved to determine the fuse that is going to burn next under the redistributed currents. The simulation is initiated with an intact lattice, and the burning of fuses under a gradual increase of external voltage is repeated until the entire lattice system falls apart. Figure 2 presents the snapshots of progressive damage evolution for the case of a uniformly distributed random thresholds model problem in a triangular fuse lattice system of size $L = 256$. Similarly, Figures 3 and 4 present the snapshots of damage and the scaled stress distribution, respectively, in a triangular spring lattice system of size $L = 256$ (see Appendix A for algorithmic details of the Cholesky downdating scheme for simulating a lattice system with spring elements).
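The quasi-static burning procedure described above can be sketched on a toy network as follows. This is a deliberately naive version that re-assembles and re-solves the Kirchhoff equations from scratch after every burnt fuse (the very cost that the algorithms of this paper avoid), with unit conductances, dense NumPy arrays, and a least-squares solve so that disconnected nodes do not cause a failure. The function name `burn_fuses`, its arguments, and the toy network are assumptions made for this example only.

```python
import numpy as np

def burn_fuses(n_nodes, edges, fixed, thresholds, tol=1e-12):
    """Return the order in which fuses burn under a slowly increased voltage.

    edges      : list of (p, q) node pairs, each a unit-conductance fuse
    fixed      : dict node -> prescribed voltage (bus bars), for unit loading
    thresholds : breaking current of each fuse
    """
    free = [n for n in range(n_nodes) if n not in fixed]
    pos = {n: k for k, n in enumerate(free)}
    alive = [True] * len(edges)
    order = []
    while True:
        # Assemble Kirchhoff equations A x = b on the free nodes
        A = np.zeros((len(free), len(free)))
        b = np.zeros(len(free))
        for e, (p, q) in enumerate(edges):
            if not alive[e]:
                continue
            for u, v in ((p, q), (q, p)):
                if u in pos:
                    A[pos[u], pos[u]] += 1.0
                    if v in pos:
                        A[pos[u], pos[v]] -= 1.0
                    else:
                        b[pos[u]] += fixed[v]
        x = np.linalg.lstsq(A, b, rcond=None)[0]
        volt = dict(fixed)
        volt.update({n: x[k] for n, k in pos.items()})
        # The next fuse to burn has the largest current-to-threshold ratio
        ratio = [abs(volt[p] - volt[q]) / thresholds[e] if alive[e] else 0.0
                 for e, (p, q) in enumerate(edges)]
        e_max = int(np.argmax(ratio))
        if ratio[e_max] <= tol:          # no current flows: lattice has fallen apart
            return order
        alive[e_max] = False
        order.append(e_max)

# Toy network: three parallel two-fuse paths between bus bars 0 (V=1) and 1 (V=0)
edges = [(0, 2), (2, 1), (0, 3), (3, 1), (0, 4), (4, 1)]
print(burn_fuses(5, edges, {0: 1.0, 1: 0.0}, [0.9, 0.2, 0.5, 0.8, 0.3, 0.7]))
```

Because the response is linear, scaling the external voltage scales all currents equally, so the fuse with the largest ratio of current to threshold under unit loading is the one that burns next.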

5.1. Brief summary of the solver types used in the numerical simulations

Figure 2. Snapshots of damage in a typical triangular lattice system of size $L = 256$. The numbers of broken bonds at the peak load and at failure are 22 911 and 24 918, respectively. (a)–(i) represent the snapshots of damage after $n_b$ bonds are broken: (a) $n_b = 6250$; (b) $n_b = 12\,500$; (c) $n_b = 18\,750$; (d) $n_b = 22\,911$ (peak load); (e) $n_b = 23\,500$; (f) $n_b = 24\,000$; (g) $n_b = 24\,500$; (h) $n_b = 24\,750$; and (i) $n_b = 24\,918$ (failure).

For the numerical simulation of a random thresholds fuse model, we use the following four alternative solver types, presented once again for quick reference:

• Solver Type 1: Given the factor $L_m$ of $A_m$, the rank-1 sparse Cholesky update/downdate (Algorithm 1) presented in Section 3.1 is used to obtain the downdated factorization $L_{n+1}$ for all subsequent values of $n = m, m+1, \ldots$. Once the factorization $L_{n+1}$ of $A_{n+1}$ is obtained, the solution vector $x_{n+1}$ is obtained by two triangular solves of $L_{n+1} L_{n+1}^{t} x_{n+1} = b_{n+1}$.
• Solver Type 2: Given the factor $L_m$ of $A_m$, the solution vector $x_{n+1}$ after the $(n+1)$th fuse is burnt, for $n = m, m+1, \ldots, m + \mathrm{maxupd}$, is obtained using the algorithm presented in Section 3.2. The multiple-rank sparse Cholesky update/downdate algorithm presented in Section 3.1 is used to downdate the factorization $L_m \to L_{m+\mathrm{maxupd}+1}$ after every maxupd steps.
• Solver Type 3: Assuming that the factor $L_m$ of the matrix $A_m$ is available after the $m$th fuse is burnt, the implementation follows Algorithm 4 presented in Section 3.2 to determine the solution vector $x_{n+1}$ after the $(n+1)$th fuse is burnt, for $n = m, m+1, \ldots, m + \mathrm{maxupd}$. Factorization of the matrix $A$ is performed every maxupd steps.
• Solver Type 4: Assuming that the factor $L_m$ of $A_m$ is available, the dense matrix update (Algorithm 5) presented in Appendix C is used to obtain the solution vector $x_{n+1}$ after the $(n+1)$th fuse is burnt, for $n = m, m+1, \ldots, m + \mathrm{maxupd}$. Once again, factorization of the matrix $A$ is performed every maxupd steps.
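To make the idea behind Solver Type 1 concrete, the sketch below applies a rank-1 Cholesky downdate after a bond is removed and then solves with the downdated factor. It is a dense textbook rank-1 update/downdate written in NumPy, not the sparse algorithm of Section 3.1; the names `chol_rank1` and `solve_with_factor`, the small network, and the use of a general solver for the triangular systems are all assumptions made for illustration.

```python
import numpy as np

def chol_rank1(L, v, sigma):
    """Cholesky factor of L L^t + sigma v v^t (sigma = +1 update, -1 downdate)."""
    L = L.copy()
    v = np.asarray(v, dtype=float).copy()
    for k in range(v.size):
        r = np.sqrt(L[k, k] ** 2 + sigma * v[k] ** 2)
        c, s = r / L[k, k], v[k] / L[k, k]
        L[k, k] = r
        L[k + 1:, k] = (L[k + 1:, k] + sigma * s * v[k + 1:]) / c
        v[k + 1:] = c * v[k + 1:] - s * L[k + 1:, k]
    return L

def solve_with_factor(L, b):
    """Solve L L^t x = b (a triangular solver would be used in practice)."""
    y = np.linalg.solve(L, b)
    return np.linalg.solve(L.T, y)

# Small illustrative network: 4 free nodes, each also tied to a bus bar
bonds = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
A = np.eye(4)                                  # bus-bar connections
for i, j in bonds:
    v = np.zeros(4); v[i], v[j] = 1.0, -1.0
    A += np.outer(v, v)                        # unit-conductance bond i-j
b = np.array([1.0, 1.0, 0.0, 0.0])             # nodes 0, 1 tied to V = 1

L0 = np.linalg.cholesky(A)
v = np.zeros(4); v[0], v[2] = 1.0, -1.0        # burn the fuse between nodes 0 and 2
L1 = chol_rank1(L0, v, -1)                     # A_1 = A_0 - v v^t
x1 = solve_with_factor(L1, b)
print(np.allclose(x1, np.linalg.solve(A - np.outer(v, v), b)))
```

The downdate touches only the existing factor, so no new factorization of the modified matrix is required.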

In addition to the above four types of direct solvers, we have performed numerical simulations using PCG iterative solvers. The first iterative solver is CG without any preconditioner. An incomplete Cholesky with a drop tolerance of $10^{-3}$ is used as the second PCG iterative solver. The drop tolerance sets the small elements of the Cholesky factor to zero after they are computed but before they update other coefficients. Elements are dropped if they are smaller than the drop tolerance times the norm of the column of the Cholesky factor, they are not on the diagonal, and they are not in the non-zero pattern of $A_n$. In addition, the factorization is also modified so that the row sums of $L_n L_n^{t}$ are equal to the row sums of $A_n$. Circulant (Equation (11)) and block-circulant (Equation (14)) PCG have also been used to accelerate the iterative solution. Once again, it should be noted that the Fourier accelerated PCG presented in References [7–9] is not optimal in the sense described in References [17–20], and hence is expected to take more CG iterations compared with the optimal and superoptimal circulant preconditioners.

Figure 3. Snapshots of damage (broken bonds) evolution in a typical triangular spring lattice system of size $L = 256$. The numbers of broken bonds at the peak load and at failure are 13 864 and 16 695, respectively. (1)–(9) represent the snapshots of damage after $n_b$ bonds are broken: (1) $n_b = 5000$; (2) $n_b = 10\,000$; (3) $n_b = 12\,000$; (4) $n_b = 13\,000$; (5) $n_b = 14\,000$ (just after peak load); (6) $n_b = 15\,000$; (7) $n_b = 15\,500$; (8) $n_b = 16\,000$; and (9) $n_b = 16\,500$ (close to failure).

Figure 4. Scaled stress distribution corresponding to the snapshots of damage in Figure 3.

In the numerical simulations using solver types 1–4, the maximum number of vector updates, maxupd, is chosen to be a constant for a given lattice size $L$. We choose maxupd = 25 for $L = \{4, 8, 16, 24, 32\}$, maxupd = 50 for $L = 64$, and maxupd = 100 for $L = \{128, 256, 512\}$. For $L = 512$, maxupd is limited to 100 by memory constraints. By keeping the maxupd value constant, it is possible to compare realistically the computational cost associated with different solver types. Moreover, the relative cpu times taken by these algorithms remain the same even when the simulations are performed on different platforms.



It is well known that the convergence rate of an iterative solver depends on the quality of the starting vector. When the PCG iterative solvers described in Section 4 are used in updating the solution $x_n \to x_{n+1}$, we use either the solution from the previous converged state, i.e. $x_n$, or the zero vector, $0$, as the starting vector for the CG iteration of $x_{n+1}$. The choice between $x_n$ and $0$ depends on which of the two gives the smaller initial residual in the 2-norm. That is, if $\|b_{n+1} - A_n x_n\|_2 \le \|b_{n+1}\|_2$, then $x_n$ is used, else $0$ is used as the starting vector for the $x_{n+1}$ CG iteration. A relative residual tolerance $\epsilon = 10^{-12}$ is used in the CG iteration.
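The starting-vector rule stated above amounts to a single residual comparison, which can be sketched as follows. The function name `choose_start` is hypothetical; the matrix passed in is whichever matrix the residual test is evaluated with (the text above states the test with $A_n$).

```python
import numpy as np

def choose_start(A, b_new, x_prev):
    """Pick the CG starting vector: x_prev if its residual norm does not
    exceed that of the zero vector (which is ||b_new||), else zero."""
    if np.linalg.norm(b_new - A @ x_prev) <= np.linalg.norm(b_new):
        return x_prev.copy()
    return np.zeros_like(b_new)

# Illustrative 2 x 2 example
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([1.0, 0.0])
print(choose_start(A, b, np.array([0.6, 0.3])))    # good previous solution is reused
print(choose_start(A, b, np.array([-5.0, 4.0])))   # poor previous solution is discarded
```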

5.2. Numerical simulation results

Tables I–IV present the cpu and wall-clock times taken for one configuration (simulation) using the solver types 1–4, respectively. These tables also indicate the number of configurations, $N_{config}$, over which ensemble averaging of the numerical results is performed. The cpu and wall-clock times taken by the iterative solvers are presented in Tables V–VIII. For iterative solvers, the number of iterations presented in Tables V–VIII denotes the average number of total iterations taken to break one intact lattice configuration until it falls apart. In the case of iterative solvers, some of the simulations for larger lattice systems were not performed, either because they were expected to take larger cpu times or because the numerical results do not influence the conclusions drawn in this study.

Table I. Solver type 1.

Size    CPU (s)    Wall (s)    Simulations
32      0.592      0.687       20 000
64      10.72      11.26       4000
128     212.2      214.9       800
256     5647       5662        96
512     93 779     96 515      16

Table II. Solver type 2.

Size    CPU (s)    Wall (s)    Simulations
32      0.543      0.633       20 000
64      11.15      12.01       4000
128     211.5      214.1       800
256     6413       6701        96

Table III. Solver type 3.

Size    CPU (s)    Wall (s)    Simulations
32      0.566      0.655       20 000
64      9.641      10.59       4000
128     203.1      213.4       800
256     6121       6139        96

Table IV. Solver type 4.

Size    CPU (s)    Wall (s)    Simulations
32      0.679      0.771       20 000
64      11.18      12.28       4000
128     254.4      260.2       800
256     6112       6147        96

Based on the results presented in Tables I–VIII, it is clear that for modelling the breakdown of disordered media, that is, starting with an intact lattice and successively breaking bonds until the lattice system falls apart, the direct solvers based on solver types 1–4 are superior to the iterative PCG solver techniques. In particular, for larger lattice systems, Solver Type 1, based on Algorithm 1 presented in Section 3.1 for the rank-1 sparse Cholesky downdate of the Cholesky factor $L$, is superior to Solver Types 2–4. It should be noted that for larger lattice systems, limitations on the available memory of the processor may decrease the allowable maxupd value, as in the case of $L = 256$ using Solver Types 2 and 3. However, this is not a concern for simulations performed using Solver Type 1. In the case of Fourier accelerated solvers (circulant and block-circulant) for the model problem considered, the block-circulant solver is superior to T. Chan's optimal circulant PCG solver.

Table V. CG iterative solver (no preconditioner).

Size    CPU (s)    Wall (s)    Iterations    Simulations
32      7.667      8.016       66 254        20 000
64      203.5      205.7       405 510       1600

Table VI. CG iterative solver (incomplete Cholesky).

Size    CPU (s)    Wall (s)    Iterations    Simulations
32      2.831      3.008       5857          20 000
64      62.15      65.61       29 496        4000
128     1391       1430        148 170       320

Table VII. T. Chan's optimal circulant PCG.

Size    CPU (s)    Wall (s)    Iterations    Simulations
32      11.66      12.26       25 469        20 000
64      173.6      178.8       120 570       1600
128     7473       7725        622 140       128

Table VIII. T. Chan's block-circulant PCG.

Size    CPU (s)    Wall (s)    Iterations    Simulations
32      10.00      10.68       11 597        20 000
64      135.9      139.8       41 207        1600
128     2818       2846        147 510       192
256     94 717     96 500                    32

6. CONCLUSIONS

The paper presents a computational methodology based on a rank-$p$ update of the stiffness matrix that alleviates the computational complexities involved in simulating fracture and damage evolution in disordered quasi-brittle materials using discrete lattice networks. Numerical simulation results using 2D random resistor networks show that the present algorithms based on direct sparse solvers (Solver Types 1–4) are computationally superior to the commonly used Fourier accelerated preconditioned conjugate gradient iterative solver. In particular, for large system sizes, Solver Type 1, based on successive Cholesky factor downdates, is superior to the other solver types. Furthermore, these algorithms completely eliminate the critical slowing down observed in fracture simulations using the conventional iterative schemes.

Given the factorization $L_m$ of $A_m$ after the $m$th bond is broken, the computational cost associated with breaking the $(n+1)$th bond can be estimated as follows:

• For Solver Type 1, Algorithm 1 requires a rank-1 sparse Cholesky downdate and two triangular solves on a sparse vector with at most two non-zeros. The computational cost associated with the rank-1 sparse Cholesky downdate is at most $O(\mathrm{nnz}(L_m))$, and that associated with the two triangular solves is much less than $O(\mathrm{nnz}(L_m))$ because of the sparsity of the RHS vector.
• For Solver Type 2, which is based on the combination of Algorithms 1 and 4, the computational complexity is $(n + 2 - m)$ saxpy vector updates, one vector inner product, and two triangular solves on a sparse vector with at most two non-zeros. The computational cost of the associated triangular solves is much less than $O(\mathrm{nnz}(L_m))$ because of the sparsity of the RHS vector. The Cholesky factor $L_m \to L_{m+\mathrm{maxupd}+1}$ is updated after every maxupd steps using Algorithm 1.
• For Solver Type 3, based on Algorithm 4, the associated computational cost is exactly the same as that of Solver Type 2, except that after every maxupd steps, Solver Type 3 re-factorizes the matrix $A_{m+\mathrm{maxupd}+1}$, whereas Solver Type 2 performs a rank-(maxupd + 1) sparse Cholesky downdate of $L_m \to L_{m+\mathrm{maxupd}+1}$.
• For Solver Type 4, the associated computational cost is much higher than that of Solver Type 3, the details of which are presented in Appendix C.

The results presented in Tables I–VIII indicate that algorithms based on direct solvers (solver types 1–4) are superior to the iterative PCG solver techniques. In particular, for large lattice systems, Solver Type 1 based on Algorithm 1 is superior to Solver Types 2–4. In the case of iterative solvers, the block-circulant solver is computationally superior to T. Chan's optimal circulant PCG solver, which in turn is superior to the Fourier accelerated PCG solvers used in the simulation of lattice systems [7–9].

In fracture simulations using discrete lattice networks, ensemble averaging of numerical results is necessary to obtain a realistic representation of the lattice system response. In this regard, for very large lattice systems, this methodology is especially advantageous as the Cholesky factorization of the system of equations can be performed using a parallel implementation on multiple processors. Subsequently, this Cholesky decomposition can be distributed to each of the processors to continue with independent fracture simulations that only require less intensive triangular solves. This technique is particularly advantageous for investigating the effects of various disorders on material fracture characteristics.

APPENDIX A: ALGORITHM FOR CENTRAL-FORCE (SPRING) MODELS

For the case of central-force models, let $PQ$ denote the $(n+1)$th bond (spring) that is broken when the externally applied displacement (or force) is increased gradually. Let $(P_x, P_y, P_z)$ and $(Q_x, Q_y, Q_z)$ refer to the $x$, $y$, and $z$ degrees of freedom associated with the nodes $P$ and $Q$, respectively. Let the direction cosines of the bond $PQ$ be given by $(n_x, n_y, n_z)$. The new stiffness matrix $A_{n+1}$ of the lattice system after the bond $PQ$ is broken is given by

$$A_{n+1} = A_n - k_{PQ}\, v v^{t} \tag{A1}$$

where $k_{PQ}$ refers to the stiffness of the bond $PQ$, and the vector $v$ is given by

$$v^{t} = \{0 \;\cdots\; n_x \;\; n_y \;\; n_z \;\cdots\; -n_x \;\; -n_y \;\; -n_z \;\cdots\; 0\} \tag{A2}$$

The rest of the procedure for updating the solution $x_n \to x_{n+1}$ is similar to Section 2. Whenever any of the degrees of freedom are constrained or prescribed an externally applied displacement (see Appendix B, Remark 3), the corresponding elements of the vector $v$ have zero entries, and the rest of the procedure is identical to the one described in Section 2. The treatment of periodic boundary conditions is also identical to the procedure described in Remark 4 of Appendix B. In the case of updating the load vector $b_n$ after the bond $PQ$ is broken, for presentation purposes, let $Q$ denote the node at which an external displacement is prescribed. Although it is not necessary, let us also assume that the external displacement $(\delta_x, \delta_y, \delta_z)$ is prescribed in all the $x$, $y$, and $z$ directions. Then the updated load vector $b_{n+1}$ is given by $b_{n+1} = b_n + w$, where

$$\begin{aligned}
w^{t} = k_{PQ}\big[\, &\delta_x \{0 \;\; 0 \;\cdots\; -n_x^2 \;\; -n_x n_y \;\; -n_x n_z \;\cdots\; 0 \;\; 0\} \\
+\; &\delta_y \{0 \;\; 0 \;\cdots\; -n_x n_y \;\; -n_y^2 \;\; -n_y n_z \;\cdots\; 0 \;\; 0\} \\
+\; &\delta_z \{0 \;\; 0 \;\cdots\; -n_x n_z \;\; -n_y n_z \;\; -n_z^2 \;\cdots\; 0 \;\; 0\} \,\big]
\end{aligned}$$

When external displacement is prescribed only in certain directions at the node $Q$, a similar procedure is followed for updating the load vector $b_{n+1}$. Once again, the implementation of the algorithm takes into account the sparse nature of the vectors $v$ and $w$ in all of its calculations. Therefore, based on the above presentation, the computational cost associated with the central-force (spring) models is the same as that for scalar fuse models.
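Equations (A1) and (A2) can be checked on a small unconstrained 3D spring network: removing a bond from the assembled stiffness matrix is the same as subtracting $k_{PQ}\, v v^{t}$. The sketch below uses dense NumPy arrays and hypothetical helper names (`direction_cosines`, `bond_vector`, `assemble_stiffness`); the node positions and stiffness values are illustrative only.

```python
import numpy as np

def direction_cosines(pos, P, Q):
    """Unit vector along the bond PQ (the sign convention is immaterial in v v^t)."""
    d = pos[Q] - pos[P]
    return d / np.linalg.norm(d)

def bond_vector(pos, P, Q):
    """Vector v of Equation (A2): +n at the dofs of P, -n at the dofs of Q."""
    n = direction_cosines(pos, P, Q)
    v = np.zeros(3 * len(pos))
    v[3 * P:3 * P + 3] = n
    v[3 * Q:3 * Q + 3] = -n
    return v

def assemble_stiffness(pos, bonds, k):
    """Standard assembly of central-force (truss-like) element matrices."""
    A = np.zeros((3 * len(pos), 3 * len(pos)))
    for (P, Q), kpq in zip(bonds, k):
        n = direction_cosines(pos, P, Q)
        Ke = kpq * np.kron(np.array([[1.0, -1.0], [-1.0, 1.0]]), np.outer(n, n))
        dofs = np.r_[3 * P:3 * P + 3, 3 * Q:3 * Q + 3]
        A[np.ix_(dofs, dofs)] += Ke
    return A

# Illustrative tetrahedral network; break the bond between nodes 1 and 3
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
bonds = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
k = [1.0, 2.0, 1.5, 1.0, 0.5, 3.0]

A_n = assemble_stiffness(pos, bonds, k)
v = bond_vector(pos, 1, 3)
A_next = A_n - k[4] * np.outer(v, v)           # Equation (A1)
print(np.allclose(A_next, assemble_stiffness(pos, bonds[:4] + bonds[5:], k[:4] + k[5:])))
```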

Remark 2
When the elastic response of the individual elements of the lattice system is modelled by beam or bond-bending models, breaking a bond $PQ$, in most cases, does not lead to a rank-1 update of the stiffness matrix. In the case of beam elements, breaking the $(n+1)$th bond $PQ$ is equivalent to updating the stiffness matrix $A_n$ by a matrix $B$ such that $A_{n+1} = A_n + \bigsqcup B$, and $B$ is given by

$$B = -T^{t} E^{t} K_{cor} E T \tag{A3}$$

where $K_{cor}$ represents the $6 \times 6$ corotational stiffness matrix of the beam, $E$ represents the $12 \times 6$ local to corotational transformation matrix, and $T$ represents the $12 \times 12$ global to local transformation matrix. The symbol $\bigsqcup$ assembles the matrix $B$ into the matrix $A_n$ to obtain $A_{n+1}$. The reader is referred to Reference [26] and the references therein for a detailed exposition on 3D corotational beam element formulations. In the case of 3D beam elements, the matrix $K_{cor}$ is given by

$$K_{cor} = \frac{EA}{L}\, y_1 y_1^{t} + \frac{GJ}{L}\, y_2 y_2^{t} + \frac{2EI_y}{L}\left(3\, y_3 y_3^{t} + y_4 y_4^{t}\right) + \frac{2EI_z}{L}\left(3\, y_5 y_5^{t} + y_6 y_6^{t}\right) = Y C_{cor} Y^{t} \tag{A4}$$

where $C_{cor} = \mathrm{diag}\left(EA/L,\; GJ/L,\; 6EI_y/L,\; 2EI_y/L,\; 6EI_z/L,\; 2EI_z/L\right)$, $E$ and $G$ denote the Young's modulus and shear modulus of the material (positive constants), and $L$, $A$, $J$, $I_y$ and $I_z$ denote the length, area, St Venant torsion constant, $y$ moment of inertia, and $z$ moment of inertia of the beam cross-section, respectively. The matrix $Y$ is given by $Y = [y_1, y_2, y_3, y_4, y_5, y_6]$, and the mutually orthogonal vectors $y_1, \ldots, y_6$ are given by

$$\begin{aligned}
y_1 &= \{1, 0, 0, 0, 0, 0\}^{t} & y_2 &= \{0, 1, 0, 0, 0, 0\}^{t} \\
y_3 &= \{0, 0, 1, 0, 1, 0\}^{t} & y_4 &= \{0, 0, 1, 0, -1, 0\}^{t} \\
y_5 &= \{0, 0, 0, 1, 0, 1\}^{t} & y_6 &= \{0, 0, 0, 1, 0, -1\}^{t}
\end{aligned}$$

Hence, $B$ is a rank-6 matrix, and the updating of $A_n$ to $A_{n+1}$ can be envisioned as $A_{n+1} = A_n + \bigsqcup \sigma Z_{n+1} Z_{n+1}^{t} = A_n + \sigma V_{n+1} V_{n+1}^{t}$, where $Z_{n+1} = T^{t} E^{t} Y \sqrt{C_{cor}}$ is a $12 \times 6$ matrix, and $V_{n+1}$ is the assembled version, i.e. $V_{n+1} V_{n+1}^{t} = \bigsqcup Z_{n+1} Z_{n+1}^{t}$. The matrix $\sqrt{C_{cor}}$ is obtained by taking the square root of each of the diagonal entries of $C_{cor}$.

A dense matrix update similar to Algorithm 5 in Appendix C may be used to solve $A_{n+1} x_{n+1} = b_{n+1}$. However, since the matrix update $A_{n+1} = A_n + \bigsqcup B$ is a rank-6 update, the computational cost associated with updating the solution $x_{n+1}$ based on the factor $L_m$ of $A_m$ for $m \le n$ increases rapidly as the number of updates, $p = (n + 1 - m)$, increases. Under these circumstances, it may be advantageous to block-update the factor $L_n$ after the $n$th bond is broken using $V_{n+1}$ based on Algorithm 3. That is, $L_{n+1} = \mathrm{SpChol}(L_n, V_{n+1}, \sigma)$.
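The factorization in Equation (A4) can be verified numerically. The sketch below builds $Y$ and $C_{cor}$ for arbitrary positive section properties (the numerical values are placeholders, not data from the paper) and checks that $Y C_{cor} Y^{t}$ matches the explicit sum, has rank 6, and equals $Z Z^{t}$ with $Z = Y\sqrt{C_{cor}}$. The function name `corotational_stiffness` is hypothetical, and the transformation matrices $T$ and $E$ are deliberately left out.

```python
import numpy as np

def corotational_stiffness(E, G, A, J, Iy, Iz, L):
    """Return (Y, C_cor, K_cor) with K_cor = Y C_cor Y^t as in Equation (A4)."""
    Y = np.array([[1, 0, 0, 0, 0, 0],     # y1
                  [0, 1, 0, 0, 0, 0],     # y2
                  [0, 0, 1, 0, 1, 0],     # y3
                  [0, 0, 1, 0, -1, 0],    # y4
                  [0, 0, 0, 1, 0, 1],     # y5
                  [0, 0, 0, 1, 0, -1]],   # y6
                 dtype=float).T           # columns of Y are y1..y6
    C = np.diag([E * A / L, G * J / L, 6 * E * Iy / L,
                 2 * E * Iy / L, 6 * E * Iz / L, 2 * E * Iz / L])
    return Y, C, Y @ C @ Y.T

# Placeholder section properties
E, G, A, J, Iy, Iz, L = 1.0, 0.4, 1.0, 0.1, 0.05, 0.08, 1.0
Y, C, K = corotational_stiffness(E, G, A, J, Iy, Iz, L)

y = [Y[:, i] for i in range(6)]
K_sum = (E * A / L * np.outer(y[0], y[0]) + G * J / L * np.outer(y[1], y[1])
         + 2 * E * Iy / L * (3 * np.outer(y[2], y[2]) + np.outer(y[3], y[3]))
         + 2 * E * Iz / L * (3 * np.outer(y[4], y[4]) + np.outer(y[5], y[5])))
Z = Y @ np.sqrt(C)                        # element-wise sqrt of the diagonal matrix
print(np.allclose(K, K_sum), np.linalg.matrix_rank(K), np.allclose(Z @ Z.T, K))
```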

APPENDIX B: SPECIAL CASES OF BOND BREAKING

Remark 3 (Breaking of a fuse attached to a constrained/prescribed degree of freedom)
In Section 2, we have described a methodology for updating the factors of a stiffness matrix after breaking a fuse $ij$ ($n$th broken bond) that connects two 'free' degrees of freedom $i$ and $j$. In the following, the methodology is applied to a special case when the fuse that is broken is attached to a constrained/prescribed degree of freedom. Without loss of generality, let us assume that the degree of freedom $j$ is either constrained or prescribed by an externally applied voltage. Then, the vector $v_n$ in Algorithm 1 is given by

$$v_n^{t} = \{0 \;\cdots\; \overset{i}{1} \;\cdots\; 0\} \tag{B1}$$

where the single non-zero entry is in the $i$th position.

Remark 4
Consider the case of a broken fuse $jk$ that is attached to a slave degree of freedom $k$ whose master degree of freedom is $i$. This type of situation is particularly important in the case of lattice models with periodic boundary conditions. Under these circumstances, the methodology presented in Remark 3 is applicable in a straightforward manner if it is understood that breaking the fuse $jk$ is equivalent to breaking the fuse $ij$.

Remark 5 (Updating the load vector $b_n$ after breaking a fuse)
Without loss of generality, let us assume that the external loading on the lattice system is voltage controlled, i.e. a constant voltage difference is imposed on the lattice system. Let $b_n$ denote the load vector on the lattice system after $n$ fuses have been burnt because of the externally applied voltage conditions. In the following, we investigate how the load vector $b_{n+1}$ needs to be updated following breaking of the $(n+1)$th fuse $ij$. The load vector $b_{n+1}$ will differ from the load vector $b_n$ only if the $(n+1)$th broken fuse $ij$ is attached to a prescribed degree of freedom, where a constant voltage difference is imposed. Once again, for presentation purposes, let us assume that $j$ is such a prescribed degree of freedom. Then the load vector $b_{n+1}$ is given by

$$b_{n+1} = b_n + w_n \tag{B2}$$

where

$$w_n^{t} = k_n \{0 \;\; 0 \;\cdots\; \overset{i}{-1} \;\cdots\; 0 \;\; 0\} \tag{B3}$$

If neither $i$ nor $j$ is a prescribed degree of freedom, then $w_n = 0$. In the implementation of the algorithms, only the non-zero value of $w_n$ ($i$th entry) is stored, and the corresponding value $(b_{n+1})_i = (b_n)_i + (w_n)_i$ is updated directly.

APPENDIX C: AN ALTERNATIVE DIRECT METHOD USING DENSE MATRIX UPDATE

Consider the Cholesky decomposition $A_m = L_m L_m^{t}$. The objective is to find the solution $x_{n+1}$ of the equation $A_{n+1} x_{n+1} = b_{n+1}$ knowing $L_m$ and the $N \times p$ update matrix $Y_{n+1}$. We have $A_{n+1} = A_m + \sigma Y_{n+1} Y_{n+1}^{t}$, and $p = (n + 1 - m)$. Here $\sigma = +1$ for an update and $\sigma = -1$ for a downdate. Let $Y_{n+1} = [Y_n \,|\, y]$, where $Y_n$ is the update matrix after the $n$th fuse is broken and $y$ is the rank-1 update vector such that

$$\begin{aligned}
A_{n+1} &= A_m + \sigma Y_{n+1} Y_{n+1}^{t} \\
&= A_m + \sigma Y_n Y_n^{t} + \sigma y y^{t} \\
&= A_n + \sigma y y^{t}
\end{aligned} \tag{C1}$$



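In terms of the factor $L_m$, the matrix $A_{n+1}$ can be expressed as

$$\begin{aligned}
A_{n+1} &= A_m + \sigma Y_{n+1} Y_{n+1}^{t} \\
&= L_m \left(I + \sigma W_{n+1} W_{n+1}^{t}\right) L_m^{t}
\end{aligned} \tag{C2}$$

where $L_m W_{n+1} = Y_{n+1}$. That is, based on the decomposition of $Y_{n+1} = [Y_n \,|\, y]$, we have $W_{n+1} = [W_n \,|\, w]$, where $L_m W_n = Y_n$ and $L_m w = y$.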

Based on the factorization given by Equation (C2), the solution of $A_{n+1} x_{n+1} = b_{n+1}$ can be obtained using Algorithm 5. In this algorithm, steps 1 and 3 together amount to a triangular solve using the current factor $L_m$. The solution of $\left(I + \sigma W_{n+1} W_{n+1}^{t}\right) z_2 = z_1$ is obtained by first expressing the solution $z_2$ as $z_2 = z_1 - \sigma W_{n+1} z_3$, and then obtaining $z_3$ by solving $\left(I + \sigma W_{n+1}^{t} W_{n+1}\right) z_3 = W_{n+1}^{t} z_1$.

Algorithm 5 (Dense Matrix Update)

1: Solve $L_m z_1 = b_{n+1}$
2: Solve $\left(I + \sigma W_{n+1} W_{n+1}^{t}\right) z_2 = z_1$ using $z_2 = z_1 - \sigma W_{n+1} z_3$, where $\left(I + \sigma W_{n+1}^{t} W_{n+1}\right) z_3 = W_{n+1}^{t} z_1$
3: Solve $L_m^{t} x_{n+1} = z_2$

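In the following, we obtain the Cholesky factorization, $L_{w_{n+1}}$, of $\left(I + \sigma W_{n+1}^{t} W_{n+1}\right)$ simply by updating the Cholesky factorization, $L_{w_n}$, of the matrix $\left(I + \sigma W_n^{t} W_n\right)$. We have

$$\left(I + \sigma W_{n+1}^{t} W_{n+1}\right) = \begin{bmatrix} L_{w_n} L_{w_n}^{t} & \sigma W_n^{t} w \\ \sigma w^{t} W_n & (1 + \sigma w^{t} w) \end{bmatrix} = \begin{bmatrix} L_{w_n} & 0 \\ s^{t} & \gamma \end{bmatrix} \begin{bmatrix} L_{w_n}^{t} & s \\ 0 & \gamma \end{bmatrix} \tag{C3}$$

where $L_{w_n} s = \sigma W_n^{t} w$, and $\gamma^2 = (1 + \sigma w^{t} w - s^{t} s)$. Hence,

$$L_{w_{n+1}} = \begin{bmatrix} L_{w_n} & 0 \\ s^{t} & \gamma \end{bmatrix} \tag{C4}$$

and the solution $z_3$ is obtained by backsolving $L_{w_{n+1}} L_{w_{n+1}}^{t} z_3 = W_{n+1}^{t} z_1$.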

The computational cost associated with this dense matrix update is significantly higher than that of the algorithm based on successive rank-1 downdates of the Cholesky factor (Solver Type 1). The computational cost of the algorithm is two triangular solves using the already computed factor $L_m$, $(3p + 1)$ vector inner products, and $\mathrm{nnz}(L_m) + \mathrm{nnz}(L_{w_n}) + 2\,\mathrm{nnz}(L_{w_{n+1}})$ operations in solving for $w$, $s$, and $z_3$, respectively. If the load vector $b_{n+1} = b_n$, then the operations count reduces by $p$ vector inner products.
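A compact dense sketch of the dense matrix update and of the bordered factor update for $\left(I + \sigma W^{t} W\right)$ is given below. It uses NumPy's general solver where triangular solves would be used in practice, writes `sigma` for the update sign and `gamma` for the new diagonal entry of the bordered factor, and is checked against a direct solve on a small illustrative matrix. The function names `dense_update_solve` and `bordered_cholesky` are assumptions made for this example.

```python
import numpy as np

def dense_update_solve(Lm, Y, sigma, b):
    """Solve (Lm Lm^t + sigma Y Y^t) x = b using only the factor Lm."""
    W = np.linalg.solve(Lm, Y)                       # Lm W = Y
    z1 = np.linalg.solve(Lm, b)                      # step 1
    small = np.eye(W.shape[1]) + sigma * W.T @ W     # p x p matrix
    z3 = np.linalg.solve(small, W.T @ z1)
    z2 = z1 - sigma * W @ z3                         # step 2
    return np.linalg.solve(Lm.T, z2)                 # step 3

def bordered_cholesky(Lw, Wn, w, sigma):
    """Extend chol(I + sigma Wn^t Wn) by one row for the new column w."""
    s = np.linalg.solve(Lw, sigma * Wn.T @ w)
    gamma = np.sqrt(1.0 + sigma * (w @ w) - s @ s)
    p = Lw.shape[0]
    out = np.zeros((p + 1, p + 1))
    out[:p, :p] = Lw
    out[p, :p] = s
    out[p, p] = gamma
    return out

# Illustrative chain of 5 nodes, each also tied to ground; remove two bonds
A = np.eye(5)
for i in range(4):
    v = np.zeros(5); v[i], v[i + 1] = 1.0, -1.0
    A += np.outer(v, v)
Y = np.zeros((5, 2))
Y[0, 0], Y[1, 0] = 1.0, -1.0                         # bond 0-1
Y[2, 1], Y[3, 1] = 1.0, -1.0                         # bond 2-3
b = np.array([1.0, 0.0, 0.0, 0.0, 2.0])

Lm = np.linalg.cholesky(A)
x = dense_update_solve(Lm, Y, -1, b)
print(np.allclose(x, np.linalg.solve(A - Y @ Y.T, b)))
```

The small $p \times p$ system is the part whose factor grows by one row per broken bond, which is where `bordered_cholesky` would replace the general solve.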

ACKNOWLEDGEMENTS

This research is sponsored by the Mathematical, Information and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy under contract number DE-AC05-00OR22725 with UT-Battelle, LLC. The first author wishes to thank Dr. Ed F. D’Azevedo for many helpful discussions and excellent suggestions. The authors thank the reviewers for their comments and suggestions in improving the presentation style and the quality of the manuscript.




