A Regularization Approach to the Reconciliation of Constrained Data Sets


Jeffrey Dean Kelly1

1 Industrial Algorithms LLC., 15 St. Andrews Road, Toronto, Ontario, Canada, M1P 4C3

E-mail: [email protected]


Abstract

A new iterative solution to the statistical adjustment of constrained data sets is derived in this paper. The method is general and may be applied to any weighted least squares problem containing nonlinear equality constraints. Other methods are available to solve this class of problem, but are complicated when the unmeasured variables and model parameters are not all observable and the model constraints are not all independent. Notable exceptions, however, are the methods of Crowe (1986) and Pai and Fisher (1988), although these implementations require the determination of a matrix projection at each iteration, which may be computationally expensive. An alternative solution is proposed which makes the pragmatic assumption that the unmeasured variables and model parameters are known with a finite but equal uncertainty. We then re-formulate the well-known data reconciliation solution in the absence of these unknowns to arrive at our new solution; hence the regularization approach. Another procedure for the classification of observable and redundant variables is also given which does not require the explicit computation of the matrix projection. The new algorithm is demonstrated using three illustrative examples previously used in other studies.

Keywords

Data Reconciliation, Projection Matrix, Regularization, Generalized Inverses, Observability and Redundancy.


Introduction

The issue of reconciling process measurements subject to equality constraints has been widely studied in the past several decades in the chemical engineering literature, with a recent and comprehensive review given by Crowe (1996). The basic premise of data reconciliation is to statistically adjust or smooth the process measurements consistent with known conservation laws, such as steady-state material, energy and component balances, where the measurements are assumed to be subject to at worst random errors (i.e., to have an expected value of zero and known variance). Unfortunately, these measurements may be unknowingly distorted by systematic or gross errors such as instrument malfunctions or miscalibrations, non-representative sampling and even variability caused by the process itself. These gross errors invalidate any realistic conclusions that can be drawn from the results, even if just one of these anomalies is present in the data set. Although the objective of this article is to highlight a different approach to the data reconciliation solution, it must be understood that all methods are negatively influenced by the presence of gross errors, and techniques such as those presented by Tong (1995) and Tong and Crowe (1995) are strongly recommended to aid in the identification and removal of such aberrants.

The general data reconciliation problem, shown below, is essentially a weighted least squares minimization of the measurement

adjustments subject to nonlinear constraints involving reconciled, unmeasured and fixed variables.

$$\min_{x} \; J = (x_m - x)^T Q^{-1} (x_m - x) = x_a^T Q^{-1} x_a \quad \text{s.t.} \quad f(x, y, z) = 0 \qquad [1]$$

Using the above nomenclature, the objective function J is known to be distributed as a Chi-square statistic with degrees of freedom equal to the number of constraints minus the number of independent unmeasured variables (Knepper and Gorman, 1980). The general equality constraint functions, denoted by f(x, y, z) and of size $n_g \times 1$, may contain any combination of reconciled x variables of size $n_x \times 1$ (i.e., known with uncertainty), unmeasured y variables or model parameters of size $n_y \times 1$ (i.e., unknown) and fixed or constant z variables of size $n_z \times 1$ (i.e., known without uncertainty). Segmenting the variables into three categories is useful in representing known conservation laws and other strictures that may be appropriate to model the underlying system. The vector $x_m$ contains the measurements and $x_a$ the vector of adjustments to the measurements, where the symmetric and positive definite matrix Q contains the variance-covariance elements of the measurement errors and thus quantifies the uncertainty in each measured value.

The solution to the above problem has been well studied by many researchers, with one of the first solutions in the chemical engineering literature given by Kuehn and Davidson (1961), although their formulation did not include the existence of unmeasured variables. It is the estimation of the measurement adjustments, unmeasured variables and model parameters that poses the greatest difficulty, where recent literature suggests that successful estimation cannot be achieved unless it is indeed performed simultaneously (MacDonald and Howat, 1988; Crowe, 1996).

Other workers in the field of applied statistics have also studied these problems in depth, with useful reviews given by Reilly and Patino-Leal (1981), Ricker (1984) and Dovi and Paladino (1989) concerning the problem of estimating parameters when there is error in all the variables; a recent application of this technique is found in Weiss et. al. (1996). However, it wasn't until the fundamental work of Britt and Luecke (1973) that a somewhat general method for problem [1] was derived. They used the method of Lagrange multipliers to arrive at the following iterative solution for the nonlinear data reconciliation problem

$$x^{k+1} = x_m - Q A_k^T (A_k Q A_k^T)^{-1} \left[ f_k + A_k (x_m - x^k) + B_k (y^{k+1} - y^k) \right] \qquad [2]$$

$$y^{k+1} = y^k - \left[ B_k^T (A_k Q A_k^T)^{-1} B_k \right]^{-1} B_k^T (A_k Q A_k^T)^{-1} \left[ f_k + A_k (x_m - x^k) \right] \qquad [3]$$

where in their derivation the functions $f_k$ are successively linearized about $x^k$, $y^k$ and z at iteration k in a Taylor series expansion, yielding

$$A_k (x^{k+1} - x^k) + B_k (y^{k+1} - y^k) + f_k = 0 \qquad [4]$$

and the notation $f(x^k, y^k, z)$ is simplified to $f_k$. The matrices $A_k$ and $B_k$ are the first-order derivatives or Jacobians of the constraints with respect to the reconciled and unmeasured variables respectively, where representative starting values are required to initiate the solution.

$$[A_k]_{ij} = \frac{\partial f^k_i}{\partial x_j} \qquad [5]$$

$$[B_k]_{ij} = \frac{\partial f^k_i}{\partial y_j} \qquad [6]$$

These derivatives can be calculated analytically or by perturbing each variable individually and recording the change in the specific function (i.e., finite differences). However, it is well known that numerical derivative estimates are computationally expensive, hence Jacobian update techniques (Pai and Fisher, 1988; Press et. al., 1994) and optimal Jacobian partitioning algorithms (Coleman et. al., 1984) have been developed to reduce the calculational load; these are especially useful when solving large sparse systems.
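For concreteness, the perturbation approach to equations [5] and [6] can be sketched in a few lines of Python/NumPy. This is a minimal illustration, not the paper's implementation: the constraint function f, the point (x, y, z) and the forward-difference step h are hypothetical placeholders.

```python
import numpy as np

def jacobians_fd(f, x, y, z, h=1e-7):
    """Forward-difference estimates of A = df/dx and B = df/dy (eqs [5], [6])."""
    f0 = f(x, y, z)
    A = np.empty((f0.size, x.size))
    B = np.empty((f0.size, y.size))
    for j in range(x.size):              # perturb each reconciled variable
        xp = x.copy(); xp[j] += h
        A[:, j] = (f(xp, y, z) - f0) / h
    for j in range(y.size):              # perturb each unmeasured variable
        yp = y.copy(); yp[j] += h
        B[:, j] = (f(x, yp, z) - f0) / h
    return A, B
```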

A similar result to equations [2] and [3] was presented by Knepper and Gorman (1980) who, through the use of generalized inverses, were the first to recognize that the unmeasured variables are actually estimated through the normal equations of the unconstrained least squares problem. Unfortunately, both of these methods cannot be used for truly general data reconciliation problems since the two inverses found in equations [2] and [3] may not exist, i.e.,

$$(A_k Q A_k^T)^{-1} \qquad [7]$$

$$\left[ B_k^T (A_k Q A_k^T)^{-1} B_k \right]^{-1} \qquad [8]$$

Two situations arise which make these solutions infeasible. For example, the matrix shown in expression [7] will not have an inverse if a constraint involves only unmeasured variables, thus causing a zero row in the matrix (i.e., $A_k$ is row rank deficient). The second situation arises when not all of the unmeasured variables are determinable because of a deficiency in the number of measurements (i.e., $B_k$ is column rank deficient). Interestingly, a similar expression to equation [3] presented by Mah et. al. (1976) to determine the unmeasured variables is also not appropriate due to the above mentioned conditions. Furthermore, these two restrictions concerning the usefulness of equations [2] and [3] are not unknown; Britt and Luecke (1973) presented necessary conditions on the ranks of $A_k$ and $B_k$, though no insight into the identification of the dependent constraints and unmeasured variables was given. In fact, Knepper and Gorman (1980) discuss the case when all of the constraints may not be independent, which arises when modeling relatively large problems, but did not elaborate on how to identify these dependent equations. In addition, other solutions such as the "Error-In-Variables Model" presented by Reilly and Patino-Leal (1981) also suffer from these rank deficiencies, where no discussion nor attempt is made to identify and remove the dependent vectors. Further, a method for solving large flowsheet applications was presented by Stephenson and Shewchuk (1986), who estimated the adjustments, unmeasured variables and Lagrange multipliers simultaneously by solving a full-space successive quadratic programming (SQP) structured problem with only equality constraints considered. However, no treatment of the identification of dependent unmeasured variables was given and consequently no classification of variables was reviewed. Similarly, Kim et. al. (1997) suggest using a nonlinear programming algorithm, presumably employing an SQP, to perform "robust" data reconciliation and gross error detection based on the MIMT serial elimination technique of Serth and Heenan (1986) and on the work of Liebman and Edgar (1988), who impose explicit inequality constraints on both the reconciled and unmeasured variable estimates; although determining realistic and consistent upper and lower bounds is a nontrivial exercise.


Notwithstanding, Crowe et. al. (1983) and Crowe (1986) presented an elegant method which identified dependent unmeasured variables and eliminated dependent constraints, whereby it was possible to coaptate (Mah et. al., 1976) or decouple the problem into two sub-problems, one containing only reconciled variables and the other involving unmeasured variables. They achieved this result by using a projection matrix which effectively spanned the null space of $B_k^T$; however, finding this matrix projection involves some effort. Later, Pai and Fisher (1988) published an algorithm to solve the general nonlinear data reconciliation problem, shown below

$$x^{k+1} = x_m - Q A_k^T U_k \left[ U_k^T A_k Q A_k^T U_k \right]^{-1} U_k^T \left[ f_k + A_k (x_m - x^k) \right] \qquad [9]$$

$$y^{k+1} = y^k - (B^1_k)^{-1} \left[ I \;\; 0 \right] \left[ f_k + A_k (x^{k+1} - x^k) \right] \qquad [10]$$

where $[I \;\; 0]$ is a matrix of size $n_y \times n_g$ if there are no dependent unmeasured variables and $U_k$ is the matrix projection determined by first partitioning $B_k$ into

$$B_k = \begin{bmatrix} B^1_k & B^3_k \\ B^2_k & B^4_k \end{bmatrix} \qquad [11]$$

and then computing its elements as

$$U_k^T = \begin{bmatrix} -B^2_k (B^1_k)^{-1} & I \end{bmatrix} \qquad [12]$$

which satisfies the requirement that $U_k^T B_k = 0$. Careful attention to the row and column orderings in equation [11] is required, where the existence of dependent columns in $B_k$ can indicate non-observable unmeasured variables (Crowe, 1989); consequently, the unmeasured variables estimated from equation [10] are for the re-ordered independent unmeasured variables only, and the dependent unmeasured variables are set to zero (Crowe, 1986). For a complete description of the re-orderings and the determination of the matrix projection $U_k$, the reader is referred to the work of Kelly (1998).
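To make the projection construction concrete, the sketch below builds $U_k^T$ from equations [11] and [12] as reconstructed above. It assumes, hypothetically, that B has already been row- and column-permuted so that its leading r x r block $B^1_k$ is nonsingular; the permutation step itself (cf. Kelly, 1998) is not shown.

```python
import numpy as np

def projection_from_partition(B, r):
    """U^T = [-B2 @ inv(B1), I] for B = [[B1, B3], [B2, B4]] (eqs [11], [12])."""
    ng = B.shape[0]
    B1 = B[:r, :r]                       # nonsingular leading block
    B2 = B[r:, :r]
    UT = np.hstack([-B2 @ np.linalg.inv(B1), np.eye(ng - r)])
    return UT   # U^T B = 0 exactly when the column dependencies of B hold
```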

Interestingly, Bagajewicz (1997) points out that if $A_k$ and $B_k$ are in canonical form (Madron, 1992), that is, if $B_k$ is of full column rank and $A_k$ is of full row rank (i.e., equations [7] and [8] are realizable), then this is equivalent to the matrix projection procedure introduced by Crowe et. al. (1983). Consequently, the determination of this matrix projection provides the means to obtain a canonical representation of the data reconciliation problem and hence establishes an equivalence with equations [2] and [3]. However, the task of obtaining an alternative to this canonical representation is proposed in this article.


Since the matrix projection must be computed at each iteration, which requires additional computation although it can significantly reduce the size of the matrix to be inverted in equation [9], it seems reasonable to ponder whether another method is possible which alleviates the two restrictions imposed by equations [2] and [3] without explicitly computing $U_k$. In fact, other researchers such as Swartz (1989) and Sanchez and Romagnoli (1996) have proposed different methods to solve the general data reconciliation problem, though they use the fundamental concept of a projection matrix to decouple the problem but compute its elements using QR factorization. Recently, Albuquerque and Biegler (1996) solved the problem of dynamic data reconciliation using an SQP where they introduced an LU decomposition strategy to obtain an estimate of the matrix projection in order to classify the measured and unmeasured variables. The focus of this paper is to introduce a regularization approach to solve the data reconciliation problem described in equation [1] which negates the explicit use of a matrix projection whilst being able to arrive at a minimized solution. Further, this article offers an alternative method to classify unmeasured variables into observable and non-observable and reconciled variables into redundant and non-redundant which also does not require the matrix projection. In addition, the variance-covariance structures of the measurement adjustments, unmeasured variables and constraint residuals are presented in order to complete the method for the very important post analysis of the final results (i.e., gross error detection and identification).

A Regularized Solution Approach

By solving a Lagrange multiplier problem, assuming that we have no unmeasured variables, we may use the solution presented

by Kuehn and Davidson (1961) to solve problem [1] as

$$x^{k+1} = x_m - Q A_k^T (A_k Q A_k^T)^{-1} \left[ f_k + A_k (x_m - x^k) \right] \qquad [13]$$

If we add the unmeasured variables to the problem by arbitrarily making the assumption that they have a sufficiently large but equal uncertainty, defined by the non-negative real number $\rho^k$ which may change at every iteration, we may re-write equation [13] similar to equation [2] as

$$x^{k+1} = x_m - Q A_k^T (K_k)^{-1} \left[ f_k + A_k (x_m - x^k) + B_k (y^{k+1} - y^k) \right] \qquad [14]$$

where $K_k$ is defined as the kernel matrix of the reconciliation solution and is expressed below as

$$K_k = \begin{bmatrix} A_k & B_k \end{bmatrix} \begin{bmatrix} Q & 0 \\ 0 & \rho^k I_{n_y} \end{bmatrix} \begin{bmatrix} A_k^T \\ B_k^T \end{bmatrix} = A_k Q A_k^T + \rho^k B_k B_k^T \qquad [15]$$


Consequently, this solution minimizes the original objective function value J found in problem [1] plus a second term (i.e., $\frac{1}{\rho^k} (y^{k+1} - y^k)^T (y^{k+1} - y^k)$) since the unmeasured variables are now included as measured quantities. If for the moment we assume that all of the unmeasured variables are observable (i.e., $B_k$ is of full column rank), then by using the method of generalized inverses discussed in Knepper and Gorman (1980), sometimes referred to as the Moore-Penrose pseudo-inverse (Golub and Van Loan, 1983), we may solve for the unmeasured variables found in equation [4] as an unconstrained parameter estimation problem at each iteration, i.e.,

$$y^{k+1} = y^k - (B_k^T B_k)^{-1} B_k^T \left[ f_k + A_k (x^{k+1} - x^k) \right] \qquad [16]$$

Then by substituting the above into equation [14] we arrive at, after a little manipulation, a new solution to the data reconciliation problem

$$x^{k+1} = x_m - D_k^{-1} Q A_k^T (K_k)^{-1} E_k \left[ f_k + A_k (x_m - x^k) \right] \qquad [17]$$

with the matrices $D_k$ and $E_k$ being both symmetric and defined as

$$D_k = I_{n_x} - Q A_k^T (K_k)^{-1} (I_{n_g} - E_k) A_k \qquad [18a]$$

$$E_k = I_{n_g} - B_k (B_k^T B_k)^{-1} B_k^T \qquad [18b]$$

where if $B_k$ were square and of full rank, i.e., has as many unmeasured variables as equations ($n_g = n_y$), then $E_k$ would be zero and no data reconciliation would be possible. Although equation [17] is an approximation, it is surprisingly insensitive to the choice of $\rho^k$ as we shall see in the examples. This is obviously an important and desirable property to possess since including the matrices $D_k$ and $E_k$ can increase the calculations substantially, especially for large problems. Moreover, it should also be noticed that the inclusion of $\rho^k$ and $B_k B_k^T$ in equation [15] resembles the method of linear regularization described in Press et. al. (1994). Regularization methods have found success in many practical and diverse problems when it is impossible to invert the kernel matrix due to degeneracy (i.e., equation [7]). Hence, the above formulation adopts this methodology for the solution of data reconciliation problems in order to circumvent the degeneracy problem without computing the matrix projection (which uses the null space of $B_k^T$ to produce a canonical or minimum order kernel matrix, i.e., the inverse found in equation [9]). However, this approach relies on the proper selection of $\rho^k$, which is a major factor in the success of the method, and consequently guidelines for its choice will be discussed in detail later.
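A minimal Python/NumPy sketch of one pass of the simplified iteration follows, assuming $B_k$ has full column rank and that the residuals and Jacobians at the current point are supplied by the caller; it strings together equations [15], [23] and [16] and is illustrative only.

```python
import numpy as np

def regularized_step(f_k, A, B, Q, x_m, x_k, y_k, rho):
    """One iteration of the simplified regularized reconciliation."""
    K = A @ Q @ A.T + rho * (B @ B.T)                   # kernel matrix, eq [15]
    r = f_k + A @ (x_m - x_k)                           # linearized residual
    x_next = x_m - Q @ A.T @ np.linalg.solve(K, r)      # eq [23]
    r_y = f_k + A @ (x_next - x_k)
    y_next = y_k - np.linalg.solve(B.T @ B, B.T @ r_y)  # normal equations, eq [16]
    return x_next, y_next
```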


Interestingly, the solution for the unmeasured variables in terms of the normal equations of the least squares minimization shown in equation [16] is identical to Britt and Luecke (1973)'s linear solution for the unmeasured variables (i.e., Bagajewicz (1997)'s equation [5], and very similar to our equation [3]) if Bagajewicz (1997)'s equation [4] for the reconciled variables is substituted into the linear version of equation [16] (i.e., with $(A x_m + C z)$ as the response vector and $y^k$ removed). Consequently, equation [16] is numerically simpler to implement than equation [3] even when the latter is formulated into canonical form; similarly, equation [10] requires full knowledge of the independent constraints of B in order to compute the inverse of $B^1_k$. In line with the regularization approach, it is known that by adding a diagonally dominant matrix to $B_k^T B_k$, i.e.,

$$B_k^T B_k + \lambda^k I_{n_y} \qquad [19]$$

with $\lambda^k$ chosen as a positive scalar, we can also increase the numerical stability of the iterative solution, especially when solving ill-conditioned systems. This technique has proven useful in fitting nonlinear and time dependent transfer function models and is sometimes referred to as the Levenberg-Marquardt method (Box and Jenkins, 1976; Press et. al., 1994) or ridge regression (Golub and Van Loan, 1983). Nevertheless, if $\lambda^k$ is non-zero at the converged iteration, the unmeasured variables are biased since the above modification imposes equality constraints on the parameters from their starting values in the least squares sense. Though this technique was not used in this study, it may prove useful in other problems to aid the data reconciliation practitioner in obtaining a converged solution since it is relatively straightforward to implement. As well, other numerical stability techniques such as line searches and trust regions have been shown to be useful in many other nonlinear programming solutions (Lucia and Xu, 1990 and 1994) and can be easily augmented in the same manner as that of Pai and Fisher (1988).

Several Implementation Issues Discussed

Before we can proceed to the estimation of the variance-covariance matrices and to the classification of the variables, which completes the method, five important issues need to be addressed. The first issue concerns the identification of dependent columns of the matrix $B_k$ (i.e., equation [8]) and relates to the existence of a solution for equations [3] and [16]. We can determine an independent set of unmeasured variables which simplifies the partition of equation [11] into the following two sub-matrices

$$B_k \begin{bmatrix} P^{12}_{C,k} & P^{34}_{C,k} \end{bmatrix} = \begin{bmatrix} B^{12}_k & B^{34}_k \end{bmatrix} \qquad [20]$$


where $P^{12}_{C,k}$ and $P^{34}_{C,k}$ are column permutation matrices containing the column re-orderings, and equations [16], [17] and [18] can be re-written with $B_k$ replaced by $B^{12}_k$. This partitioning may be performed at every iteration; however, it needs only be initiated when the inverse matrix in equation [16] is not positive definite (i.e., singular). If it is singular, then any identified dependent columns must be added to the matrix $B^{34}_k$. Dependent columns can be easily detected if a modified Cholesky factorization is used to solve for the independent unmeasured variables, where the Cholesky factor for the identified independent columns can easily be extracted from the results (cf. Kelly, 1998). As an aside, the presence of dependent columns in $B_k$ will not affect the existence of an inverse for the kernel matrix and therefore $B_k B_k^T$ can remain unchanged in equation [15]. If there do exist dependent columns (i.e., $B_k^T B_k$ is not positive definite) then the vector of unmeasured variables can be comprised of a re-ordered set expressed as

$$y^k = \begin{bmatrix} P^{12}_{C,k} & P^{34}_{C,k} \end{bmatrix} \begin{bmatrix} y^{12}_k \\ y^{34}_k \end{bmatrix} \qquad [21]$$

Here, $y^k$ is the vector of original unmeasured variables of arbitrary order, and $y^{12}_k$ and $y^{34}_k$ are the vectors of independent and dependent unmeasured variables respectively, where the $y^{34}_k$ variables are arbitrarily set to zero (Golub and Van Loan, 1983; Crowe, 1986). For the $y^{12}_k$ variables, which are the solution to equation [16], we show the modified solution in full as

$$y^{k+1} = \begin{bmatrix} P^{12}_{C,k} & P^{34}_{C,k} \end{bmatrix} \begin{bmatrix} y^{12}_k - (B^{12\,T}_k B^{12}_k)^{-1} B^{12\,T}_k \left[ f_k + A_k (x^{k+1} - x^k) \right] \\ 0 \end{bmatrix} \qquad [22]$$
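The column split of equation [20] can be sketched as follows. Note the stand-in: a QR factorization with column pivoting is used here in place of the modified Cholesky test recommended in the text, with tol playing the role of the cut-off tolerance.

```python
import numpy as np
from scipy.linalg import qr

def split_columns(B, tol=1e-9):
    """Indices of independent ("12") and dependent ("34") columns of B."""
    R, piv = qr(B, mode='r', pivoting=True)
    d = np.abs(np.diag(R))
    rank = int(np.sum(d > tol * d[0])) if d.size and d[0] > 0 else 0
    return piv[:rank], piv[rank:]

# B12 = B[:, idx12]; the y34 variables are then set to zero per eqs [21], [22]
```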

The second issue is the existence of an inverse for the kernel matrix shown in equation [15]. Since we have included the Jacobian or topology matrix of the unmeasured variables (which is itself degenerate), its existence eliminates the problem associated with expression [7] provided $\rho^k$ is not zero, requiring the specified equality constraints to be functions of at least the measured or unmeasured variables or both. Notwithstanding, we may encounter constraints which are linearly dependent on other constraints or combinations of other constraints; this may occur when we are developing large and diverse problems. These situations will cause singularities and thus the linearly dependent constraints must be identified and then removed from the valid constraint set, which will typically be performed once at the start of the reconciliation. Since Cholesky factorization is recommended to construct the kernel matrix's inverse (Reilly and Patino-Leal, 1981; Forrest, 1991; Fang and Puthenpura, 1993), dependent constraints can be easily detected at each iteration as mentioned previously.

The third issue concerns the choice of $\rho^k$ which, as mentioned, can be considered as a regularization parameter and is sometimes referred to as a shrinkage factor in ridge regression. Given that $\rho^k$ from a practical viewpoint is an overall estimate of the uncertainty for all of the unmeasured variables, a balance or compromise must be struck between the reality of the solution, in terms of minimizing the adjustments to the measured variables, and the speed of execution. If for example we increase $\rho^k$ so that the elements of $(K_k)^{-1} B_k$ tend toward zero (which is possible since any norm of $K_k$ will increase with $\rho^k$), then equation [17] would numerically collapse to

$$x^{k+1} = x_m - Q A_k^T (K_k)^{-1} \left[ f_k + A_k (x_m - x^k) \right] \qquad [23]$$

since $(K_k)^{-1} E_k$ would approach $(K_k)^{-1}$ and $D_k$ would approach the identity matrix, thus avoiding the extra computations and decreasing the execution time. Unfortunately, as $\rho^k$ is increased substantially we find that the closure of the constraints $f_k$ using equation [23] becomes more difficult since less emphasis is placed on reconciling the measurements. Conversely, if $\rho^k$ is not chosen large enough, where the norm of $(K_k)^{-1} B_k$ does not tend toward zero, then an accurate solution may also not be obtained using equation [23] and we will not be able to avoid including $D_k$ and $E_k$ to ensure a realistic solution. Therefore, a concession must be made concerning the choice of $\rho^k$ if we are to simplify the computing requirements. Fortunately, estimating $\rho^k$ as the largest trace (i.e., matrix diagonal sum) between $A_k Q A_k^T$ and $n_y \times B_k B_k^T$ has been used successfully, as we shall observe in the examples to follow, i.e.,

$$\rho^k = \max\left[ \mathrm{tr}(A_k Q A_k^T), \; n_y \, \mathrm{tr}(B_k B_k^T) \right] \qquad [24]$$

[24]

This expression provides a numerically acceptable compromise which has been modified from a similar approach of Press et.

al. (1994) who suggest the ratio of the traces to impose an equal concern regularization in ridge regression types of parameter

estimation techniques (i.e., in the absence of equality constraints). Given that it is well known that if the uncertainty of a

measured variable is substantially larger than that compared to the other measured variables’ uncertainties, then it is expected

that the reconciled result for that variable will be close to that if it were to be explicitly made unmeasured. Essentially, this is

our proposed approach using a suitably chosen k though by re-deriving the reconciliation solution to obtain equations [17]

and [18], has delineated how choosing differentk’s will allow us to simplify the solution (i.e., equation [23]).

This type of comprise is not new nor unexpected in regularization type solutions. Press et. al. (1994) articulate that these

problem methods involve a trade-off between two optimizations: agreement between data and solution and smoothness or

stability of the solution where the regularization parameter must adjudicate a delicate compromise between the two.

Therefore, in order to further aid in the determination of a suitable k, the following relationship has been derived and is

detailed in the appendix.


$$\mathrm{tr}\left[ A_k Q A_k^T (K_k)^{-1} \right] \rightarrow (n_g - n_{y,12}) \quad \text{as} \quad \rho^k \rightarrow \infty \qquad [25]$$

This expression implies that as $\rho^k$ becomes large, the trace found above approaches the difference between the number of equality constraints and the number of independent unmeasured variables found in $B_k$ (i.e., $n_{y,12}$). It has been observed that a suitable $\rho^k$ is reached when this computed trace nears the value $(n_g - n_{y,12})$, which is demonstrated in the examples. Thus equation [24] can be used as an initial guess which can be easily checked as being suitable if expression [25] is sufficiently satisfied (i.e., within 1% of $(n_g - n_{y,12})$ for example), else it can be increased accordingly. It should be noted that this type of a priori determination of $\rho^k$ has not been found in the literature for other regularization approaches and does not necessarily require adjustment at each iteration. Moreover, it is known that these regularization methods have the ability to smooth or stabilize a solution. Though this may be true, no benefit is claimed nor has been investigated in order to exploit such a property, but it may in the future be the subject of further research.
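One possible realization of this selection logic is sketched below, with equation [24] as the initial guess and expression [25] as the acceptance check; the ten-fold enlargement is an assumed update rule, as the text only says the guess can be increased accordingly.

```python
import numpy as np

def choose_rho(A, B, Q, n_y12, tol=0.01):
    """rho from eq [24], enlarged until the trace check of eq [25] passes."""
    n_g = A.shape[0]
    rho = max(np.trace(A @ Q @ A.T), B.shape[1] * np.trace(B @ B.T))  # eq [24]
    target = n_g - n_y12          # assumes n_g > n_y12
    while True:
        K = A @ Q @ A.T + rho * (B @ B.T)                 # eq [15]
        tr = np.trace(A @ Q @ A.T @ np.linalg.inv(K))     # eq [25]
        if abs(tr - target) <= tol * target:              # e.g., within 1%
            return rho
        rho *= 10.0               # assumed enlargement factor
```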

The fourth issue, which is somewhat concomitant with the choice of $\rho^k$, is to scale the constraints and the reconciled and unmeasured variables so as to give comparable importance to each in the solution. Nguyen et. al. (1988) presented a brief but relevant review of several scaling methods; however, their focus was on optimally scaling multivariable steady-state gain matrices when analysing model-based process controllers. The basic premise of scaling is to somehow "balance" the required matrices via pre- and post-multiplying by appropriate diagonal scaling matrices in order to increase the numerical robustness or stability of the solution. Hence at each iteration we may choose to scale the system, thereby solving the following successive linearization sub-problem

$$\min_{\bar{x}^{k+1}} \; (\bar{x}_m - \bar{x}^{k+1})^T S_x^k Q^{-1} S_x^k (\bar{x}_m - \bar{x}^{k+1}) \quad \text{s.t.} \quad S_f^k A_k S_x^k \bar{x}^{k+1} + S_f^k B_k S_y^k \bar{y}^{k+1} + S_f^k f_k = 0 \qquad [26]$$

where $\bar{x}^k = (S_x^k)^{-1} x^k$, $\bar{y}^k = (S_y^k)^{-1} y^k$, and $S_f^k$, $S_x^k$ and $S_y^k$ are the diagonal scaling matrices for the constraints, reconciled and unmeasured variables respectively. From Nguyen et. al. (1988), we can compute these matrices at each iteration using the Euclidean norms of the rows and columns of the augmented Jacobian matrix $[A_k \; B_k]$ as follows

$$[S_f^k]_{ii} = \left\| [A_k \; B_k]_{i,:} \right\|_2^{-1}, \quad i = 1, \ldots, n_g \qquad [27]$$

$$[S_x^k]_{jj} = \left\| [A_k \; B_k]_{:,j} \right\|_2^{-1}, \quad j = 1, \ldots, n_x \qquad [28]$$

and

$$[S_y^k]_{jj} = \left\| [A_k \; B_k]_{:,j} \right\|_2^{-1}, \quad j = n_x + 1, \ldots, n_x + n_y \qquad [29]$$

where the authors also recommend that more than one pass at computing the scaling matrices may be required to improve the balancing of the scaled matrices $A_k$ and $B_k$ at each iteration. Although we did not employ nor require scaling of any kind for our examples, scaling may in some cases prove to be very beneficial and has been included here to provide an avenue for the data reconciliation practitioner pending numerical difficulties.
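A direct transcription of equations [27] to [29] might look as follows; it assumes no row or column of $[A_k \; B_k]$ is entirely zero, since a vanishing norm would make the corresponding scale factor undefined.

```python
import numpy as np

def scaling_matrices(A, B):
    """Diagonal scalings from the row/column 2-norms of [A B] (eqs [27]-[29])."""
    AB = np.hstack([A, B])
    nx = A.shape[1]
    Sf = np.diag(1.0 / np.linalg.norm(AB, axis=1))   # constraint rows, eq [27]
    col = 1.0 / np.linalg.norm(AB, axis=0)
    Sx = np.diag(col[:nx])                           # reconciled columns, eq [28]
    Sy = np.diag(col[nx:])                           # unmeasured columns, eq [29]
    return Sf, Sx, Sy
```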

The fifth issue concerns the computational effort required to arrive at the solution using the method defined by equations [16] and [23]. Since the majority of the effort for these problems is concentrated in the factorization of the matrices requiring inversion (Crowe, 1986), this will be the focus of our argument. Given that the proposed method's order of operations to factorize the kernel matrix is circa $n_g^3$, its factorization can be costly since the method of Pai and Fisher (1988) can reduce the dimension of the inverse inside its kernel matrix to a size of $(n_g - n_{y,12}) \times (n_g - n_{y,12})$. Notwithstanding, in order to determine the partitions found in equation [11], we require some technique, possibly a matrix factorization technique, to identify how the matrix $B_k$ will be partitioned. If we assume the technique is Cholesky factorization, then we must first decompose $B_k^T B_k$ and then $B_k B_k^T$ to determine and remove any dependent vectors found in these matrices. Since the factorization of $B_k^T B_k$ is also required by the proposed method (i.e., equation [16]), there appears to be no advantage nor disadvantage. Concerning $B_k B_k^T$, which will always be of the same order as the proposed method's kernel matrix $K_k$, its factorization will effectively require the same amount of calculations (ignoring a sparse matrix implementation). However, the row and column permutation matrices required to partition $B_k$ in equation [11] may not change from iteration to iteration and consequently the factorization of $B_k B_k^T$ may not be necessary at each iteration, though this has never been mentioned as an efficiency improvement in previous works. Once the factorizations have occurred, then to complete the solution of the proposed method, forward and backward substitutions are required using the determined factors (i.e., for both inverses found in equations [16] and [23]). As for the matrix projection method, equation [9]'s kernel matrix must be factored and solved at each iteration, where equations [10] and [12] require factorization and forward and backward solving involving $B^1_k$. Therefore, from this brief discussion, it appears that even though the matrix projection can in some cases substantially reduce the size of the problem as noted by Crowe (1986), its computation does not come without cost and for some problems may require, at least conceptually, as many arithmetic operations as the proposed method if $B_k B_k^T$ must be factored at each iteration. Unfortunately, the relative execution speed of one method over the other for sparse flowsheets is problem specific due to differences in the sparsity patterns of the individual matrices, and thus a realistic and general comparison is not possible.

It should also be mentioned that for large problems the technique known as iterative improvement of a solution (Press et. al., 1994; Golub and Van Loan, 1983) may be required for both methods to improve the solution estimates. Due to accumulating round-off errors occurring in both the solution vector and the Cholesky factors, iterative improvement may be necessary to improve the numerical accuracy of the final results. As a matter of insight, other methods are available to solve this class of problems besides the direct factorization methods mentioned. These techniques, called iterative solution methods (Axelsson, 1996), such as the conjugate gradient method, may prove to be highly efficient alternatives for large sparse problems; however, more research is required to assess their benefit.

Estimating the Variance-Covariance Matrices

Using expression [17] for the reconciled variables and expression [16] for the unmeasured variables, we can approximate the

following matrices at the converged solution to be

$$H_a = D^{-1} Q A^T K^{-1} E \, A Q A^T \, E K^{-1} A Q D^{-1} \qquad [30]$$

and

$$H_y = (B^T B)^{-1} B^T \left[ I_g - A D^{-1} Q A^T K^{-1} E \right] A Q A^T \left[ I_g - A D^{-1} Q A^T K^{-1} E \right]^T B (B^T B)^{-1} \qquad [31]$$

where we have neglected the superscript k since we are dealing with the final solution. Here, $H_a$ and $H_y$ are the adjusted and unmeasured variables' variance-covariance matrices respectively. In the same manner as before, if dependent columns in B are found, then B is replaced by $B^{12}$, where the variance-covariance matrix for the original unmeasured variables can be approximated by

$$H_y = \begin{bmatrix} P^{12}_C & P^{34}_C \end{bmatrix} \begin{bmatrix} H_{y,12} & 0 \\ 0 & \rho I_{y,34} \end{bmatrix} \begin{bmatrix} P^{12}_C & P^{34}_C \end{bmatrix}^T \qquad [32]$$

and the variances for the dependent unmeasured variables are arbitrarily set equal to the final converged value of $\rho$. Furthermore, when all of the non-observable variables have been determined, using the technique to be presented in the next section, their variances must be reset to $\rho$ as well. From a computational point of view, full computation of the variance-covariances is not explicitly required since a lower triangular Cholesky factor of Q can be found, i.e.,


$$Q = R_Q R_Q^T \qquad [33]$$

where the matrix square roots of $H_a$ and $H_y$ are straightforward to obtain. In fact, when more sophisticated gross error detection methods are used, such as principal component analysis (Tong and Crowe, 1995), the matrix square roots can be used directly to determine the eigenvalues and eigenvectors of the variance-covariance matrices.

Similar to $H_a$ and $H_y$, the variance-covariance matrix of the constraint residuals is formulated as a function of the adjusted variables' variance-covariance matrix

$$H_g = A H_a A^T \qquad [34]$$

where these residuals are defined as

$$g = f(x_m, y, z) = A x_m + B y + C z = A x_a \qquad [35]$$

at the final converged solution. The matrix C is the Jacobian of the constraints with respect to the fixed variables z and is

computed in the same way A and B are formed; however C is not actually required and only shown for insight. The above

variance-covariance matrices may then be used in the detection and identification of gross errors which were briefly discussed

in the Introduction.

Furthermore, since we are only proposing to use equation [23] discussed above, these matrices can be simplified to yield very tractable estimates of $H_a$ and $H_y$ as

$$H_a = Q A^T K^{-1} A Q \qquad [36]$$

and

$$H_y = \rho \left( I_y - \rho \, B^T K^{-1} B \right) \qquad [37]$$

These expressions can be easily derived by re-formulating the solution at convergence as

$$\begin{bmatrix} x_a \\ y \end{bmatrix} = \begin{bmatrix} Q & 0 \\ 0 & \rho I_y \end{bmatrix} \begin{bmatrix} A^T \\ -B^T \end{bmatrix} K^{-1} (A x_m + C z) \qquad [38]$$

where, as a matter of insight, the estimation of the unmeasured variables at each iteration may be approximated by

$$y^{k+1} = y^k - \rho^k B_k^T (K_k)^{-1} \left[ f_k + A_k (x_m - x^k) \right] \qquad [39]$$

Interestingly, since no inversion of $B^T B$ is required in equations [37] and [39] above, and since the kernel matrix changes negligibly even if there exist dependent columns in B, these expressions are readily available without partitioning (i.e., equation [20]). However, as mentioned previously, if there are any identified non-observable variables then their variances must be reset to $\rho$ and their covariances reset to zero for lack of any other a priori information. In addition, it may also be useful to know the variance-covariance matrix of the reconciled variables; this can be easily computed as Q minus $H_a$.
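Under the reconstructed equations [36] and [37], the simplified covariance estimates at convergence reduce to a handful of matrix products; a sketch:

```python
import numpy as np

def simplified_covariances(A, B, Q, rho):
    """Tractable variance-covariance estimates at convergence (eqs [36], [37])."""
    K = A @ Q @ A.T + rho * (B @ B.T)
    Kinv = np.linalg.inv(K)
    Ha = Q @ A.T @ Kinv @ A @ Q                              # adjustments, eq [36]
    Hy = rho * (np.eye(B.shape[1]) - rho * (B.T @ Kinv @ B)) # unmeasured, eq [37]
    Hx = Q - Ha                       # reconciled variables, as noted in the text
    return Ha, Hy, Hx
```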

Further, given that any data reconciliation problem employing a quadratic objective function is well suited to an SQP implementation, the task of computing estimates of the variance-covariances and classifying the measured and unmeasured variables must still be performed external to its solution (Albuquerque and Biegler, 1996; Kim et. al., 1997). Yet, with ours and Crowe's methods, a substantial amount of the required computations are performed already on the path to the solution (i.e., the inverse of the kernel matrix), in addition to the computations necessary to compute the check found in [25].

Finally, in order to assess when the solution has converged, a suitable stopping criterion must be defined. One approach is to compute the Euclidean norm of the linearized constraint residuals of equation [4], which should be less than a specified tolerance whose choice depends on the order of magnitude of the constraints (cf. Buzzi-Ferraris and Tronconi, 1993, for an interesting discussion on convergence acceptance in the solution of nonlinear sets of equations).
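In code, this stopping test is a one-liner on the linearized residuals of equation [4]; the tolerance value is problem dependent as noted.

```python
import numpy as np

def converged(f_k, A, B, dx, dy, tol=1e-6):
    """Euclidean norm of the linearized constraint residuals of eq [4]."""
    return np.linalg.norm(f_k + A @ dx + B @ dy) < tol
```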

Another Approach to the Classification of Variables

After the reconciliation has been completed, it is important to classify the unmeasured and measured variables as observable and redundant variables respectively. A comprehensive description and review has been given by Crowe (1989), who discusses the observability and redundancy terms in detail. The intent of this section is to describe another technique to classify these variables without employing the use of the matrix projection. Since the efficacy of the proposed method depends on not computing the matrix projection, the techniques shown below allow the user to classify the variables using matrices already available from the final reconciliation solution.

We first discuss the classification of the unmeasured variables. Stated succinctly, an unmeasured variable is observable if it can

be uniquely determined from the available measurements and model constraints. If it is not, then any calculated value cannot

be trusted and it is defined to be non-observable. In order to identify these variables, we can re-write equation [4] as

$$A x + B^{12} y^{12} + B^{34} y^{34} + C z = 0 \qquad [40]$$

using the partitions of B and the product of C times z, which combined above should be near zero upon convergence. If we again use the method of generalized inverses, we can solve for $y^{12}$, yielding

$$y^{12} = -(B^{12\,T} B^{12})^{-1} B^{12\,T} (A x + C z + B^{34} y^{34}) \qquad [41]$$

[41]


Thus, if any row of the matrix $(B^{12\,T} B^{12})^{-1} B^{12\,T} B^{34}$ contains non-zero elements, then the $y^{12}$ variables associated with those rows are declared non-observable since they are functions of the dependent non-observable variables contained in $y^{34}$. This result is similar to those presented by Crowe (1989), Sanchez and Romagnoli (1996) and Albuquerque and Biegler (1996); however, we do not require the identification of dependent constraints in the matrix B as required by the other authors. Further, the matrix contained in the above equation is partially determined since we already have the Cholesky factors of $B^{12\,T} B^{12}$ from the solution of equation [16].
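In code, the observability scan of equation [41] amounts to inspecting the rows of $(B^{12\,T}B^{12})^{-1}B^{12\,T}B^{34}$; a sketch, with tol as an assumed zero threshold:

```python
import numpy as np

def classify_observable(B12, B34, tol=1e-9):
    """True entries mark observable y12 variables (eq [41])."""
    M = np.linalg.solve(B12.T @ B12, B12.T @ B34)
    coupled = np.any(np.abs(M) > tol, axis=1)  # coupled to the y34 set
    return ~coupled
```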

The next step is to determine redundant and non-redundant variables. By definition, redundant variables are those measurements which could be uniquely calculated from the remaining measurements if they were not measured. Non-redundant variables are simply those measurements which are not redundant. With the aid of the matrix projection, identifying redundant variables is quite easy since zero columns in $U^T A$ correspond to non-redundant measurements. This is a direct result of the fact that a non-redundant variable would translate into a dependent column in B if its measurement were not available. Unfortunately, the matrix projection is not available in this implementation and thus another strategy is required.

If we use the definition of a redundant reconciled variable literally, then by incrementally testing one column of A at a time (i.e., similar to identifying dependent unmeasured variables) we can use the technique of matrix inversion by partitioning found in Press et. al. (1994) to identify measurement redundancy. Hence, equation [42] below describes the matrix that would require inversion in equation [16] if the jth measurement were not available, i.e.,

$$\begin{bmatrix} B^{12\,T} B^{12} & B^{12\,T} [A]_j \\ [A]_j^T B^{12} & [A]_j^T [A]_j \end{bmatrix} \qquad [42]$$

We know that if the jth column of A, denoted as $[A]_j$, is linearly dependent on the other columns contained in $B^{12}$, then by applying the updated inversion technique to the above, we will obtain for the inverse

$$\begin{bmatrix} (B^{12\,T} B^{12})^{-1} + (B^{12\,T} B^{12})^{-1} B^{12\,T} [A]_j [A]_j^T B^{12} (B^{12\,T} B^{12})^{-1} / [s]_j & -(B^{12\,T} B^{12})^{-1} B^{12\,T} [A]_j / [s]_j \\ -[A]_j^T B^{12} (B^{12\,T} B^{12})^{-1} / [s]_j & 1 / [s]_j \end{bmatrix} \qquad [43]$$

where a value near or equal to zero for

$$[s]_j = [A]_j^T [A]_j - [A]_j^T B^{12} (B^{12\,T} B^{12})^{-1} B^{12\,T} [A]_j \qquad [44]$$


can indicate a singular update. Thus the jth reconciled variable can be sufficiently declared as non-redundant if $[s]_j$ is effectively zero. Equation [44] is referred to as the Schur complement and is an underlying structure found in many direct linear algebra solution methods (Golub and Van Loan, 1983). In essence, checking $[s]_j$ is identical to ensuring that the determinant of equation [42] will not be zero (i.e., indicating a singularity), since this determinant is equal to the determinant of $B^{12\,T} B^{12}$ times $[s]_j$ (Kelly, 1998). In addition, if we re-formulate equation [44] into matrix format, the vector s may be expressed as

$$s = \mathrm{diag}\left( A^T \left[ I_g - B^{12} (B^{12\,T} B^{12})^{-1} B^{12\,T} \right] A \right) = \mathrm{diag}\left( A^T E A \right) \qquad [45]$$

[45]

which is interesting since E is a projection matrix in its own right being symmetric and idempotent (Householder, 1975;

Axelsson, 1996) where if B is a maximal square matrix then s is zero for all measured variables. This is consistent since no

reconciliation is possible and all measured variables are non-redundant. In order to minimize the accumulation of numerical

round-off errors, we again recommend the use of Cholesky factorization to compute equation [44] (and equation [41]) instead

of the direct inversion of B12TB12 as shown above. This can be accomplished easily and efficiently by forward solving for

[V]j in

R V B AB,12 12[ ] [ ]j

T

j [46]

since RB,12 is the lower triangular Cholesky factor of B12TB12 where [s]j may now be re-written as

[ ] [ ] [ ] [ ] [ ]s A A V Vj j

T

j j

T

j [47]
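A vectorized sketch of the redundancy test follows, computing all the $[s]_j$ of equations [45] to [47] at once via the Cholesky factor and a forward solve; tol is an assumed zero threshold.

```python
import numpy as np
from scipy.linalg import solve_triangular

def redundancy_test(A, B12, tol=1e-9):
    """[s]_j near zero flags the j-th reconciled variable as non-redundant."""
    RB12 = np.linalg.cholesky(B12.T @ B12)               # R_B,12, lower factor
    V = solve_triangular(RB12, B12.T @ A, lower=True)    # forward solve, eq [46]
    s = np.sum(A * A, axis=0) - np.sum(V * V, axis=0)    # eq [47] for all j
    return s > tol                                       # True => redundant
```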

Finally, the above analysis to classify unmeasured and reconciled variables as observable and redundant has used the final converged set of Jacobian estimates A and B. Unfortunately, for nonlinear problems these matrices are only point estimates of the underlying state of the modeled process at the converged solution and may be further degraded by their numerical update strategy. Therefore, careful interpretation of the results is necessary, which may even involve performing an ad hoc sensitivity analysis on the point estimates if needed (i.e., perturbing x and y slightly, re-computing A and B, then performing the above analysis to verify the stationarity of the results), though for large problems this may be impractical. Fortunately, for linear systems, where the Jacobians are constant, the results will be invariant and not dependent on the tested operating space of the process.


Illustrative Examples

Presented below are three small examples illustrating the proposed method. The first 2 examples are both nonlinear (in fact

they are bilinear) and the third involves a linear mass balance. The product MATLAB® Version 4.2c.1 from The MathWorks

Inc. was used to program the examples where MATLAB®’s standard matrix inversion routines were used to solve for the

linear affine equations.

Example 1. This example involves the reconciliation of copper and zinc concentration measurements from a mineral flotation circuit presented by Smith and Ichiyen (1973). The measurement values were taken from Tong (1995), who solved the problem using the bilinear reconciliation method of Crowe (1986). Figure 1 describes the process flowsheet and Table 1 details the measured, unmeasured and fixed variables. There are a total of 8 streams in the process with 4 unit operations, where the number of measured variables is 14 and the number of unmeasured variables is 9. There is 1 fixed or constant variable, which represents the total flow rate of stream one, and in total there are 12 functions or equalities. A separate table, Table 2, details the constraint functions and their associated Jacobian elements, where the variance-covariance matrix Q is diagonal with elements corresponding to a relative standard deviation of 6.56%. In addition, the initial conditions for the x and y variables were their measured values and a vector of ones respectively.

Table 1 Variable Definitions for Example 1.

#    x                    y                   z
1    Copper in Stream 1   Copper in Stream 8  Flow of Stream 1
2    Copper in Stream 2   Zinc in Stream 8
3    Copper in Stream 3   Flow of Stream 2
4    Copper in Stream 4   Flow of Stream 3
5    Copper in Stream 5   Flow of Stream 4
6    Copper in Stream 6   Flow of Stream 5
7    Copper in Stream 7   Flow of Stream 6
8    Zinc in Stream 1     Flow of Stream 7
9    Zinc in Stream 2     Flow of Stream 8
10   Zinc in Stream 3
11   Zinc in Stream 4
12   Zinc in Stream 5
13   Zinc in Stream 6
14   Zinc in Stream 7

Table 2 Function and Jacobian Definitions for Example 1.

f(1) = x(1)z(1)-x(2)y(3)-x(5)y(6):  A(1,1) = z(1), A(1,2) = -y(3), A(1,5) = -y(6);  B(1,3) = -x(2), B(1,6) = -x(5);  C(1,1) = x(1)
f(2) = x(8)z(1)-x(9)y(3)-x(12)y(6):  A(2,8) = z(1), A(2,9) = -y(3), A(2,12) = -y(6);  B(2,3) = -x(9), B(2,6) = -x(12);  C(2,1) = x(8)
f(3) = z(1)-y(3)-y(6):  B(3,3) = -1, B(3,6) = -1;  C(3,1) = 1
f(4) = x(2)y(3)-x(3)y(4)-y(1)y(9):  A(4,2) = y(3), A(4,3) = -y(4);  B(4,1) = -y(9), B(4,3) = x(2), B(4,4) = -x(3), B(4,9) = -y(1)
f(5) = x(9)y(3)-x(10)y(4)-y(2)y(9):  A(5,9) = y(3), A(5,10) = -y(4);  B(5,2) = -y(9), B(5,3) = x(9), B(5,4) = -x(10), B(5,9) = -y(2)
f(6) = y(3)-y(4)-y(9):  B(6,3) = 1, B(6,4) = -1, B(6,9) = -1
f(7) = x(3)y(4)-x(4)y(5)-x(7)y(8):  A(7,3) = y(4), A(7,4) = -y(5), A(7,7) = -y(8);  B(7,4) = x(3), B(7,5) = -x(4), B(7,8) = -x(7)
f(8) = x(10)y(4)-x(11)y(5)-x(14)y(8):  A(8,10) = y(4), A(8,11) = -y(5), A(8,14) = -y(8);  B(8,4) = x(10), B(8,5) = -x(11), B(8,8) = -x(14)
f(9) = y(4)-y(5)-y(8):  B(9,4) = 1, B(9,5) = -1, B(9,8) = -1
f(10) = x(5)y(6)-x(6)y(7)+y(1)y(9):  A(10,5) = y(6), A(10,6) = -y(7);  B(10,1) = y(9), B(10,6) = x(5), B(10,7) = -x(6), B(10,9) = y(1)
f(11) = x(12)y(6)-x(13)y(7)+y(2)y(9):  A(11,12) = y(6), A(11,13) = -y(7);  B(11,2) = y(9), B(11,6) = x(12), B(11,7) = -x(13), B(11,9) = y(2)
f(12) = y(6)-y(7)+y(9):  B(12,6) = 1, B(12,7) = -1, B(12,9) = 1
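As a check on the transcription, the twelve functions of Table 2 translate directly into a residual vector; the sketch below is illustrative only and uses 0-based indexing, so each stream index of Table 2 is shifted down by one.

```python
import numpy as np

def f_example1(x, y, z):
    """Constraint residuals f(1)..f(12) of Table 2 (0-based indexing)."""
    return np.array([
        x[0]*z[0] - x[1]*y[2] - x[4]*y[5],     # f(1)
        x[7]*z[0] - x[8]*y[2] - x[11]*y[5],    # f(2)
        z[0] - y[2] - y[5],                    # f(3)
        x[1]*y[2] - x[2]*y[3] - y[0]*y[8],     # f(4)
        x[8]*y[2] - x[9]*y[3] - y[1]*y[8],     # f(5)
        y[2] - y[3] - y[8],                    # f(6)
        x[2]*y[3] - x[3]*y[4] - x[6]*y[7],     # f(7)
        x[9]*y[3] - x[10]*y[4] - x[13]*y[7],   # f(8)
        y[3] - y[4] - y[7],                    # f(9)
        x[4]*y[5] - x[5]*y[6] + y[0]*y[8],     # f(10)
        x[11]*y[5] - x[12]*y[6] + y[1]*y[8],   # f(11)
        y[5] - y[6] + y[8],                    # f(12)
    ])
```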

Table 3 below shows the objective function value J as a function of $\rho$ ranging from 1 to 10 million, for the simplified solution using equations [16] and [23] and for the more rigorous solution using equations [16] and [17] (note that $\rho$ was fixed for each run for illustrative purposes only, though it may be allowed to change at each iteration as described in equation [24]). The term "flops" is MATLAB®'s estimate of the floating point operations performed for the simplified solution and k refers to the number of iterations required for both solutions to satisfy an equality constraint norm tolerance of $10^{-6}$. The final row of the table records the objective function value, the number of flops and the iterations required for Pai and Fisher (1988)'s method (i.e., equations [9] and [10] without using their line search strategy). For this example, only the first iteration involved determining the matrix projection permutation matrices $P_C$ and $P_R$ using the Cholesky factorization identification method of Kelly (1998), where a cut-off tolerance (i.e., machine precision estimate) of $10^{-9}$ for determining dependencies was used. Though the matrix projection was computed at each iteration using MATLAB®'s matrix inversion routine to find U in equation [12], $P_C$ and $P_R$ were only required to be determined once at the start of the solution.

Table 3 Effect of ρ on the Solution Results for Example 1.

ρ              J Eq [16,23]     flops     k    J Eq [16,17]   k    Eq [25]
1              20.6213730 (2)   -         -    17.9616397     22   4.95273
10             18.0055658 (3)   -         -    17.9616397     22   4.64172
100            17.9616423       1217772   51   17.9616397     22   3.82171
1,000          17.9616400       620872    26   17.9616397     22   3.17081
10,000         17.9616399       525368    22   17.9616397     22   3.01942
49,500         17.9616400       525368    22   17.9616397     22   3.00397
100,000        17.9616392       525638    22   17.9616397     22   3.00196
1,000,000      17.9616424       525638    22   17.9616414     19   3.00019
10,000,000     17.9616191       1217772   51   17.9614510     34   3.00002
J Eq [9,10]    17.9619397       534384    22   -              -    -

(2) Tolerance was relaxed to 10^-3 to allow convergence. (3) Tolerance was relaxed to 10^-4 to allow convergence.


From the above table we observe that the more rigorous method is very insensitive to the choice of $\rho$ and converges easily; however, for small $\rho$'s this is not the case for the simplified method. The row with $\rho = 49{,}500$ identifies the trace value found in equation [24], having an initial nominal value of 49,500, and shows a consistent value for the objective function. In addition, as can be seen from the results, the objective function values for all three solutions are reasonably close, indicating that no solution bias exists. We also notice from the table that the flop count for the simplified method is slightly smaller than that for the matrix projection method. In fact, when the standard sparse matrix option is used in MATLAB® we arrive at a flop count of 64,332 for the simplified method and 86,970 for our implementation of the matrix projection method, which translates into a 26% reduction in the required arithmetic operations. The last column of Table 3 follows the path of expression [25]. Given 12 equality constraints and 9 identified independent unmeasured variables, the last column values indeed approach 3 as $\rho$ is increased, in light of the fact that A is not of full row rank.

Table 4 Solution Values for Example 1 with ρ = 49,500.

#    xm      x       [Ha]jj½     [s]j       y        [Hy]jj½    Eq. [41]  [RB]jj²
1    1.930   1.917   5.730e-03   1.226e-03  34.104   1.069e+01  0         1.560e-04
2    0.450   0.451   2.945e-04   1.075e-03  -13.147  8.500e+00  0         1.560e-04
3    0.130   0.126   6.721e-03   8.325e-01  0.925    8.244e-03  0         3.146e+01
4    0.090   0.092   2.960e-03   7.027e-01  0.916    7.342e-03  0         2.953e+01
5    19.860  19.959  5.330e-02   7.099e-06  0.842    8.907e-03  0         8.940e-01
6    21.440  21.447  3.020e-02   4.736e-09  0.075    8.244e-03  0         5.548e+02
7    0.510   0.515   8.409e-03   5.502e-03  0.084    7.343e-03  0         1.585e+02
8    3.810   4.603   1.888e-01   5.024e-02  0.075    5.776e-03  0         7.255e+02
9    4.920   4.408   2.684e-01   6.350e-02  0.009    2.852e-03  0         1.331e+01
10   5.360   4.578   3.022e-01   1.843e-02
11   0.410   0.410   5.069e-04   4.786e-05
12   7.090   7.004   4.539e-02   4.193e-04
13   4.950   4.885   2.421e-02   1.544e-04
14   52.100  51.681  7.243e-01   3.748e-07

Table 4 details the solution values for a $\rho$ of 49,500, where three significant digits were used for $x_m$, x and y for the sole purpose of presentation; if substituted into the equations they may not satisfy the specified tolerance (i.e., $10^{-6}$). In this case no dependent unmeasured variables were detected, as indicated by the square of the diagonal of $R_B$ and the value of expression [25], and all of the reconciled variables are redundant as indicated by s (i.e., equation [47]). These results are consistent with those presented by the previous authors; however, reconciled variable number 6 is close to being declared non-redundant if we were to assume a cut-off tolerance greater than $10^{-9}$. In addition, we also notice a negative zinc concentration in stream 8 (unmeasured variable number 2); this was also pointed out by Crowe (1986), who detailed this example as well. Given the variance of that estimate, derived from equation [37], we notice that this value would not pass a 95% confidence interval (i.e., +/- 2 x 8.500) and consequently its value must not be trusted even though it is uniquely determinable from the measurements. From this, we could proceed to a more detailed gross error detection and identification study to determine the cause of the negative aberrant; however, our focus is on the solution of the problem and this will not be pursued further.

Example 2. The second example is taken from Swartz (1989), who studied a small heat exchanger circuit which is presented in Figure 2 and has also been detailed previously by Tjoa and Biegler (1991) and Albuquerque and Biegler (1996). We use the same convergence and cut-off tolerances as those of the first example; however, the initial maximum trace is estimated to be 78.4 million, which could be reduced if the previously mentioned scaling technique were to be employed. Tables 5 and 6 describe the variables, functions and Jacobian elements, where no fixed or constant variables exist in this example. The initial conditions for the reconciled variables are the measured values, where default values of 500 for the unmeasured variables are used as suggested by Swartz (1989). The measurement variance-covariance matrix is diagonal with standard deviations of 0.75 absolute for temperatures and 2% of the measured value for flows.

Table 5 Variable Definitions for Example 2.

#    x                          y
1    Flow of Stream A1          Flow of Stream A2
2    Temperature of Stream A1   Temperature of Stream A2
3    Flow of Stream A3          Flow of Stream A4
4    Temperature of Stream A3   Flow of Stream A5
5    Temperature of Stream A4   Temperature of Stream A6
6    Temperature of Stream A5   Flow of Stream A7
7    Flow of Stream A6          Flow of Stream A8
8    Temperature of Stream A7   Flow of Stream B2
9    Temperature of Stream A8   Temperature of Stream B2
10   Flow of Stream B1          Flow of Stream B3
11   Temperature of Stream B1   Temperature of Stream B3
12   Flow of Stream C1          Flow of Stream C2
13   Temperature of Stream C1   Temperature of Stream C2
14   Temperature of Stream D1   Flow of Stream D1
15   Flow of Stream D2
16   Temperature of Stream D2

Table 6 Function and Jacobian Definitions for Example 2.

f(1) = x(1)-y(1):  A(1,1) = 1;  B(1,1) = -1
f(2) = y(8)-y(10):  B(2,8) = 1, B(2,10) = -1
f(3) = x(1)(y(2)-x(2))-y(8)(y(9)-y(11)):  A(3,1) = (y(2)-x(2)), A(3,2) = -x(1);  B(3,2) = x(1), B(3,8) = -(y(9)-y(11)), B(3,9) = -y(8), B(3,11) = y(8)
f(4) = y(1)-x(3)-x(7):  A(4,3) = -1, A(4,7) = -1;  B(4,1) = 1
f(5) = y(2)-x(4):  A(5,4) = -1;  B(5,2) = 1
f(6) = y(2)-y(5):  B(6,2) = 1, B(6,5) = -1
f(7) = x(3)-y(3):  A(7,3) = 1;  B(7,3) = -1
f(8) = x(10)-y(8):  A(8,10) = 1;  B(8,8) = -1
f(9) = x(3)(x(5)-x(4))-x(10)(x(11)-y(9)):  A(9,3) = (x(5)-x(4)), A(9,4) = -x(3), A(9,5) = x(3), A(9,10) = -(x(11)-y(9)), A(9,11) = -x(10);  B(9,9) = x(10)
f(10) = y(3)-y(4):  B(10,3) = 1, B(10,4) = -1
f(11) = x(12)-y(12):  A(11,12) = 1;  B(11,12) = -1
f(12) = y(3)(x(6)-x(5))-x(12)(x(13)-y(13)):  A(12,5) = -y(3), A(12,6) = y(3), A(12,12) = -(x(13)-y(13)), A(12,13) = -x(12);  B(12,3) = (x(6)-x(5)), B(12,13) = x(12)
f(13) = x(7)-y(6):  A(13,7) = 1;  B(13,6) = -1
f(14) = y(14)-x(15):  A(14,15) = -1;  B(14,14) = 1
f(15) = x(7)(x(8)-y(5))-y(14)(x(14)-x(16)):  A(15,7) = (x(8)-y(5)), A(15,8) = x(7), A(15,14) = -y(14), A(15,16) = y(14);  B(15,5) = -x(7), B(15,14) = -(x(14)-x(16))
f(16) = y(4)+y(6)-y(7):  B(16,4) = 1, B(16,6) = 1, B(16,7) = -1
f(17) = y(4)x(6)+y(6)x(8)-y(7)x(9):  A(17,6) = y(4), A(17,8) = y(6), A(17,9) = -y(7);  B(17,4) = x(6), B(17,6) = x(8), B(17,7) = -x(9)

Table 7 Effect of γ on the Solution Results for Example 2.

γ | J Eq [16,23] | flops | k | J Eq [16,17] | k | Eq. [25]
1 | 3230.931044 (4) | - | - | 14.5861656 | 5 | 10.439026
10 | 229.5708045 (5) | - | - | 14.5861656 | 5 | 8.294295
100 | 14.5861740 | 1858238 | 33 | 14.5861656 | 5 | 5.437646
1,000 | 14.5861656 | 563177 | 10 | 14.5861656 | 5 | 3.512451
10,000 | 14.5861656 | 337949 | 6 | 14.5861656 | 5 | 3.059837
100,000 | 14.5861656 | 281642 | 5 | 14.5861656 | 5 | 3.006092
1,000,000 | 14.5861656 | 281642 | 5 | 14.5861656 | 5 | 3.000610
10,000,000 | 14.5861656 | 281642 | 5 | 14.5861656 | 5 | 3.000061
78,400,000 | 14.5861656 | 281642 | 5 | 14.5861656 | 5 | 3.000008
100,000,000 | 14.5861661 | 281642 | 5 | 14.5861656 | 5 | 3.000006
1,000,000,000 | 14.5861608 | 563177 | 10 | 14.5861653 | 8 | 3.000001
Eq [9,10] | 14.5861656 | 304163 | 5 | - | - | -

(4) Tolerance was relaxed to 0.5 to allow convergence. (5) Tolerance was relaxed to 10^-1 to allow convergence.

In Table 7, which is similar to Table 3, the flop count and the number of iterations for the simplified method are recorded. As well, the objective function values are shown for all three methods. The shaded row in the table (γ = 78,400,000) highlights the solution using the initial maximum trace, where, as before, the objective function values and iteration counts are almost identical for each method, though the first two solutions for the simplified method did not converge at the specified tolerance, as noted in the table footnotes. Further, we notice that the flop count for the simplified method is lower than that of the matrix projection method; when standard sparse storage and computations are used, the flop counts are 16,482 for the simplified method and 17,742 for the matrix projection method with the same number of iterations, indicating a 7% reduction. However, similar to Example 1, only the first 4 iterations were found to be sufficient to correctly identify the permutation matrices required for the computation of the matrix projection; had this improvement not been taken, the flop count would have been higher. In addition, Swartz (1989) quoted an iteration count of 4 (as opposed to 5 in our example), however the tolerance used by Swartz (1989) was greater than 10^-6. With regard to the last column of Table 7, it is apparent that expression [25] converges to 3, which is not surprising since there are 17 constraints and 14 independent unmeasured variables and, again, A is not of full row rank. Interestingly, once expression [25] is within 1% of its expected value, a γ of 100,000 seems to provide a stable solution.
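
The selection guideline illustrated by the last column of Table 7 can be sketched numerically. The snippet below assumes the kernel form K = A Q A^T + γ B B^T implied by the expansion in the Appendix (equation [15] itself is not reproduced here, so this form is an assumption of the sketch) and sweeps γ on a small hypothetical system, watching expression [25], tr(A Q A^T K^-1), settle toward ng - ny.

import numpy as np

def trace_guideline(A, B, Q, gammas):
    # Expression [25]: tr(A Q A^T K^-1) with K = A Q A^T + gamma * B B^T.
    # As gamma grows this should settle toward ng - ny (or ng - ny,12
    # when B is column-rank deficient).
    AQAT = A @ Q @ A.T
    for gamma in gammas:
        K = AQAT + gamma * (B @ B.T)
        print(f"gamma = {gamma:>10.2f}   tr = {np.trace(AQAT @ np.linalg.inv(K)):.6f}")

# Hypothetical system: 3 constraints, 4 measured and 1 unmeasured variable,
# so the trace should level off near ng - ny = 2.
A = np.array([[1.0, -1.0,  0.0,  0.0],
              [0.0,  1.0, -1.0,  0.0],
              [0.0,  0.0,  1.0, -1.0]])
B = np.array([[1.0], [0.0], [-1.0]])
Q = np.diag([0.04, 0.09, 0.04, 0.09])
trace_guideline(A, B, Q, [1e-2, 1e0, 1e2, 1e4])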

Table 8 presents the results using the initial maximum trace solution, where the numerical results do not match exactly those found by Swartz (1989). Unfortunately, no information was given by Swartz (1989) regarding the heat capacities of the streams; hence defaults of unity for each heat capacity were used throughout our example.

Table 8 Solution Values for Example 2 with γ = 78,400,000.

j xm x [Ha]jj^1/2 [s]j y [Hy]jj^1/2 Eq. [41] [RB]jj^2
1 1000.00 964.19 1.730e+01 5.000e-01 964.19 1.004e+01 0 2.000e+00

2 466.33 466.33 1.872e-06 2.328e-10 481.91 7.364e-01 0 9.297e+05

3 401.70 407.77 2.787e+00 5.000e-01 407.77 7.531e+00 0 7.298e+03

4 481.78 481.91 1.421e-01 4.905e-01 407.77 7.531e+00 0 3.789e+05

5 530.09 530.09 1.103e-06 5.821e-11 481.91 7.364e-01 0 3.096e+05

6 616.31 615.51 2.591e-01 4.370e-01 556.42 7.826e+00 0 2.007e+00

7 552.70 556.42 7.801e+00 5.293e-01 964.19 1.004e+01 0 5.003e-01

8 619.00 617.78 3.682e-01 1.304e+00 253.20 5.060e+00 0 2.008e+00

9 614.92 616.82 6.127e-01 2.444e+00 540.51 2.821e+00 0 6.411e+04

10 253.20 253.20 5.637e-05 2.729e-12 253.20 5.060e+00 0 5.019e-01

11 618.11 618.11 6.787e-07 1.455e-11 481.19 4.894e+00 0 1.369e-01

12 308.10 308.10 4.780e-05 0 308.10 6.160e+00 0 1.000e+00

13 694.99 694.99 3.463e-07 -1.455e-11 581.94 3.448e+00 0 1.301e+01

14 667.84 668.00 1.758e-01 7.507e-01 688.42 9.996e+00 0 1.020e+00

15 680.10 688.42 9.222e+00 1.910e-02

16 558.34 558.18 1.758e-01 7.507e-01

From the table, as indicated by the shaded cells, we notice that there are six non-redundant variables and no non-observable variables. These non-redundant variables match exactly those found by Swartz (1989), where it should be noted that if non-redundant variables are identified then their estimated adjustment variances (i.e., diag(Ha)) should be set to zero. From a gross error detection perspective, Swartz (1989) performed a second simulation which deleted the temperatures in streams A4 and A7 in order to reduce the overall Chi-square statistic (i.e., the objective function) to within an acceptable level. Table 9 displays our results for this reconciliation, where two new variables were appended to the vector of unmeasured variables. Specifically, stream A4's temperature is now unmeasured variable number 15 and stream A7's temperature is now unmeasured variable number 16, and the reconciled variables are displaced accordingly.

Table 9 Solution Values when Temperatures in Streams A4 and A7 are Deleted (J=3.5515).


j xm x [Ha]jj^1/2 [s]j y [Hy]jj^1/2 Eq. [41] [RB]jj^2
1 1000.00 969.30 1.720e+01 5.000e-01 969.30 1.020e+01 0 2.000e+00

2 466.33 466.33 2.630e-06 0 481.77 7.372e-01 1.072e-14 9.395e+05

3 401.70 406.65 2.784e+00 5.000e-01 406.65 7.532e+00 4.108e-12 3.798e+05

4 481.78 481.77 1.381e-01 3.094e-01 406.65 7.532e+00 2.348e-12 3.798e+05

5 616.31 616.30 9.978e-02 1.616e-01 481.77 7.372e-01 1.326e-14 3.166e+05

6 552.70 562.65 7.553e+00 5.173e-01 562.65 8.065e+00 8.718e-17 1.992e+00

7 614.92 614.94 2.378e-01 9.182e-01 969.30 1.020e+01 9.052e-12 4.997e-01

8 253.20 253.20 2.214e-04 2.328e-10 253.20 5.060e+00 1.219e-18 2.007e+00

9 618.11 618.11 1.375e-06 7.276e-12 1391.80 7.491e+00 1.606e+00 6.411e+04

10 308.10 308.10 3.139e-04 0 253.20 5.060e+00 1.212e-17 5.019e-01

11 694.99 694.99 1.567e-06 0 1332.70 6.840e+00 1.606e+00 1.355e-01

12 667.84 667.83 1.667e-01 4.511e-01 308.10 6.160e+00 0 1.000e+00

13 680.10 679.38 8.833e+00 1.171e-02 -118.44 1.560e+01 1.320e+00 2.499e-01

14 558.34 558.35 1.667e-01 4.511e-01 679.38 1.034e+01 1.9146e-15 1.019e+00

15 0 5.600e+06 - 6.905e-11

16 613.95 1.312e+00 1.858e-15 1.328e+00

By deleting these two variables, the objective function is now 3.5515, which is in line with the threshold value; however, since stream A4's temperature was a non-redundant variable, we observe from Table 9's last column that unmeasured variable 15 is identified as being dependent (and non-observable). The shaded cells of Table 9, which indicate non-redundant and non-observable variables, are exactly consistent with the results of Swartz (1989). The three shaded cells in the second-last column were estimated by finding the largest absolute value in each row of the matrix found in equation [41], which identifies the independent although non-observable unmeasured variables.

Example 3. The third and final example can be found in Sanchez and Romagnoli (1996) and is a linear material balance consisting of 29 measured variables, 34 unmeasured variables and 31 balances, of which only 8 are redundant (as confirmed by expression [25]). This example is included to further verify the accuracy of the classification-of-variables technique, which does not rely on the matrix projection being computed. Table 10 highlights the results, which are exact when compared to the values presented by the above authors, where we have used the same stream number designations. The final objective function value is computed at 26.1236 using a γ of 2,300, which was found to be adequate; equations [16] and [23] were used.


Table 10 Solution Values for Example 3 with γ = 2,300 (J=26.1236).

j Stream xm x [s]j Stream y Eq. [41] [RB]jj^2
1 1 70.490 70.104 0.3333 3 0.0000 1.000 2.000e+00

2 2 7.103 7.096 0.3333 4 70.104 1.000 1.500e+00

3 7 13.040 11.980 0.8333 5 0 - 0

4 8 35.380 35.540 0.8333 6 0 - 0

5 9 53.210 53.640 1.333 10 -11.980 1.000 2.000e+00

6 12 23.900 23.560 0.5000 11 0 - 0

7 13 0 -0.0241 1.000 34 41.881 0 2.000e+00

8 14 0.0765 -0.1515 1.000 35 18.876 0 1.500e+00

9 15 54.590 53.816 1.200 36 13.235 0 1.333e+00

10 16 12.780 11.934 1.200 24 4.703 0 1.250e+00

11 17 23.420 23.005 1.200 63 0.0000 1.000 2.000e+00

12 18 0.2378 0.2278 1.200 61 24.203 1.000 1.500e+00

13 19 8.657 8.618 1.000 38 0.0000 1.000 1.333e+00

14 20 5.087 5.413 1.200 55 24.203 1.000 1.250e+00

15 21 1.740 1.787 0.2000 54 0.0000 1.000 1.200e+00

16 22 0.0255 0.0262 1.200 48 24.203 1.000 1.167e+00

17 23 3.113 3.178 1.200 56 0.0000 1.000 1.143e+00

18 25 5.407 5.354 1.200 57 0 - 2.220e-16

19 26 2.898 2.890 1.200 58 0.0000 1.000 1.125e+00

20 27 11.830 13.239 1.000 60 0.0000 1.000 1.111e+00

21 28 8.197 8.907 1.059 59 0 - 4.441e-16

22 29 1.364 1.393 1.059 62 0.0000 1.000 1.100e+00

23 30 20.940 19.872 2.000 47 4.595 1.000 1.091e+00

24 31 1.051 1.069 1.059 49 19.608 0 1.083e+00

25 32 12.580 13.465 1.059 44 0 - 0

26 33 4.999 5.338 1.059 45 4.595 1.000 1.077e+00

27 37 5.730 5.969 0.0588 51 -19.608 1.000 1.071e+00

28 46 4.250 4.595 0.0588 50 0.0000 1.000 1.067e+00

29 53 16.340 19.608 0.0588 52 0 - 0

30 42 0.0000 1.000 1.063e+00

31 43 0 - 0

32 41 0 - 4.441e-16

33 40 0 - 2.220e-16

34 39 0 - 8.882e-16

As we notice from the table, there are no non-redundant variables, and of the 34 unmeasured variables only 23 are independent, as indicated by the unshaded entries. However, of those independent unmeasured variables, we find that only five are actually observable (i.e., stream flows 24, 34, 35, 36 and 49) as determined by equation [41]; these are identical to the unmeasured variables deemed observable by Sanchez and Romagnoli (1996).
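
The dependency check underlying this classification can be sketched as follows. The paper works with the permuted Cholesky factor RB of B^T B; as a stand-in, the sketch below uses a column-pivoted QR of B, whose R factor equals that Cholesky factor up to row signs, so small squared diagonal entries (the [RB]jj^2 column of the tables) expose dependent unmeasured variables.

import numpy as np
from scipy.linalg import qr

def flag_dependent_columns(B, tol=1e-9):
    # Column-pivoted QR of B; |R[j,j]|^2 plays the role of [RB]jj^2.
    # Pivot positions whose squared diagonal falls below the cut-off
    # correspond to dependent columns of B.
    _, R, piv = qr(B, mode='economic', pivoting=True)
    d2 = np.diag(R) ** 2
    dependent = np.zeros(B.shape[1], dtype=bool)
    dependent[piv[d2 < tol]] = True
    return dependent

# Columns 0, 1 and 2 below are linearly dependent (column 2 is the sum of
# columns 0 and 1), so exactly one of them is flagged as the dependent
# direction by the pivoting.
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])
print(flag_dependent_columns(B))  # one column flagged, e.g. [False  True False]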

Conclusions

In this article we have presented a new regularized solution to the reconciliation of process flowsheets which, although making some practical assumptions in its derivation to mitigate the two limitations outlined in the Introduction, has proven successful in three examples. The new approach has been shown to be a viable and numerically efficient alternative to the matrix projection method presented by Pai and Fisher (1988); however, it depends on the proper choice of the regularization parameter γ, and two useful guidelines were given to aid in its selection (i.e., equations [24] and [25]). A detailed discussion of other issues which affect its success was also included, which at least conceptually addresses the major concerns of the technique. In addition, estimates of the variance-covariance matrices for the adjustments, unmeasured variables and constraint residuals have been derived, which can be used in the critical post analysis of the solution to detect significant gross errors if they exist. Further, an alternative technique for the classification of variables into observable and redundant has been detailed using readily available matrix factorizations, highlighting the use of another projection matrix which is independent of the choice of γ.

Nomenclature

A Jacobian or topology matrix for measured variables

B Jacobian or topology matrix for unmeasured variables

B1, B2, B3 & B4 partitions of B defined by Crowe et al. (1983)

B12 & B34 column partitions of B

C Jacobian or topology matrix for fixed variables

D symmetric and positive matrix found in equation [18a]

E idempotent matrix found in equation [18b]

fk vector of function values evaluated at iteration k

g vector of constraint residuals found in equation [34]

Ha variance-covariance matrix of measurement adjustments

Hg variance-covariance matrix of constraint residuals

Hy variance-covariance matrix of unmeasured variables

i typically a row index

Ig identity matrix of size ng x ng

Ix identity matrix of size nx x nx

Iy identity matrix of size ny x ny

j typically a column index

J value of quadratic objective function

k iteration count

K symmetric and positive definite kernel matrix found in equation [15]


ng number of constraints

nx number of measured variables

ny number of unmeasured variables

ny,12 number of unmeasured variables which are independent

nz number of fixed variables

PC column permutation matrix

P12C & P34C column permutation sub-matrices

PR row permutation matrix

Q matrix of measurement error covariances

RB lower triangular Cholesky factor of BTB

RQ lower triangular Cholesky factor of Q

s vector of numbers indicating redundancy found in equation [44]

Sf row or constraint diagonal scaling matrix

Sx column or measured variable diagonal scaling matrix

Sy column or unmeasured variable diagonal scaling matrix

U matrix projection

V matrix found in equation [46]

xa vector of adjustments for measured variables

xm vector of measurements for measured variables

x vector of reconciled values for measured variables

y vector of original unmeasured variables

z vector of fixed variables

Greek Letters

α estimated uncertainty for the unmeasured variables in the kernel matrix

γ regularization parameter found in equation [19]

Δ difference or delta

Superscripts

C column

k iteration

R row

T transpose

Subscripts

a adjusted


m measured

1 partition of B with independent columns and rows

2 partition of B with independent columns and dependent rows

3 partition of B with dependent columns and independent rows

4 partition of B with dependent columns and dependent rows

12 column partition of B with independent columns

34 column partition of B with dependent columns

Operators

scaled variable indicated

i denotes the i-th row of a matrix

j denotes the j-th column of a matrix

|| ||2 Euclidean norm of a vector

tr trace of a matrix

References

Albuquerque, J.S. and Biegler, L.T., Data Reconciliation and Gross-Error Detection for Dynamic Systems, AIChE Journal, 42, 10, 1996.

Axelsson, O., Iterative Solution Methods, Cambridge University Press, New York, 1996.

Bagajewicz, M.J., Design and Retrofit of Sensor Networks in Process Plants, AIChE Journal, 43, 9, 1997.

Box, G.E.P., and Jenkins, G.M., Time Series Analysis: Forecasting and Control, Revised Edition, Holden-Day, Oakland, California, 1976.

Britt, H.L. and Luecke, R.H., The Estimation of Parameters in Nonlinear, Implicit Models, Technometrics, 15, 2, 1973.

Buzzi-Ferraris, G. and Tronconi, E., An Improved Convergence Criterion in the Solution of Nonlinear Algebraic Equations, Computers chem. Engng., 17, 10, 1993.

Coleman, T.F., Garbow, B.S. and More, J.J., Algorithm 618, ACM-Trans. Math. Software, 10, 3, 1984.

Crowe, C.M., Garcia, Y.A., and Hrymak, A., Reconciliation of Process Flow Rates by Matrix Projection, Part I: Linear Case, AIChE Journal, 29, 6, 1983.

Crowe, C.M., Reconciliation of Process Flow Rates by Matrix Projection, Part II: The Nonlinear Case, AIChE Journal, 32, 4, 1986.

Crowe, C.M., Observability and Redundancy of Process Data for Steady-State Reconciliation, Chem. Eng. Sci., 44, 12, 1989.

Crowe, C.M., Data Reconciliation - Progress and Challenges, J. Proc. Cont., 6, 2, 1996.

Dovi, V.G. and Paladino, O., Fitting of Experimental Data to Implicit Models Using a Constrained Variation Algorithm, Computers chem. Engng, 13, 6, 1989.

Fang, S-C and Puthenpura, S., Linear Optimization and Extensions: Theory and Algorithms, Prentice-Hall International, 1993.

Forrest, J., The Current Status of IBM’s Optimization Subroutine Library (OSL) and Future Research Directions, NPRA Computer Conference, 1991.

Golub, G.H., and Van Loan, C.F., Matrix Computations, The Johns Hopkins University Press, Baltimore, Maryland, 1983.


Householder, A.S., The Theory of Matrices in Numerical Analysis, Dover Publications Inc., New York, New York, 1975.

Kelly, J.D., On Finding the Matrix Projection in the Data Reconciliation Solution, Computers chem. Engng, In Press, 1998.

Kim, I-W, Kang, M.S., Park, S., and Edgar, T.F., Robust Data Reconciliation and Gross Error Detection: The Modified MIMT Using NLP, Computers chem. Engng, 21, 7, 1997.

Knepper, J.C., and Gorman, J.W., Statistical Analysis of Constrained Data Sets, AIChE Journal, 26, 2, 1980.

Kuehn, D.R., and Davidson, H., Computer Control II: Mathematics of Control, Chem. Eng. Prog., 57, 44, 1961.

Liebman, M.J., and Edgar, T.F., Data Reconciliation for Nonlinear Processes, AIChE Annual Meeting, Washington, DC., 1988.

Lucia, A., and Xu, J., Chemical Process Optimization Using Newton-Like Methods, Computers chem. Engng, 14, 2, 1990.

Lucia, A., and Xu, J., Methods of Successive Quadratic Programming, Computers chem. Engng, 18, Suppl., 1994.

MacDonald, R.J. and Howat, C.S., Data Reconciliation and Parameter Estimation in Plant Performance Analysis, AIChE Journal, 34, 1, 1988.

Mah, R.S., Stanley, G.M. and Downing, D.M., Reconciliation and Rectification of Process Flow and Inventory Data, Ind. Eng. Chem. Process Des. Dev., 15,1, 1976.

Madron, F., Process Plant Performance. Measurement and Data Processing for Optimization and Retrofits, Ellis Horwood Ltd., Chichester, England, 1992.

Nguyen, T.C., Barton, G.W., Perkins, J.D., and Johnston, R.D., A Condition Number Scaling Policy for Stability Robustness Analysis, AIChE Journal, 34, 7, 1988.

Pai, C.C.D. and Fisher, G.D., Application of Broyden’s Method to Reconciliation of Nonlinearly Constrained Data, AIChE Journal, 34, 5, 1988.

Press, W.H., Vetterling, W.T., Teukolsky, S.A. and Flannery, B.P., Numerical Recipes in FORTRAN: The Art of Scientific Computing, Second Edition, Cambridge University Press, 1994.

Ricker, N.L., Comparison of Methods for Nonlinear Parameter Estimation, Ind. Eng. Chem. Process Des. Dev., 23, 2, 1984.

Reilly, P.M and Patino-Leal, H., A Bayesian Study of the Error-In-Variables Model, Technometrics, 23, 3, 1981.

Sanchez, M., and Romagnoli, J., Use of Orthogonal Transformations in Data Classification-Reconciliation, Computers chem. Engng, 20, 5, 1996.

Serth, R.W., and Heenan, W.A., Gross Error Detection and Data Reconciliation in Steam-Metering Systems, AIChE Journal, 32, 5, 1986.

Smith, H.W., and Ichiyen, N., Computer Adjustment of Metallurgical Balances, Can. Inst. Mining Metall. (C.I.M.) Bull., 66, 97, 1973.

Stephenson, G.R., and Shewchuk, C.F., Reconciliation of Process Data with Process Simulation, AIChE Journal, 32, 2, 1986.

Swartz, C.L.E., Data Reconciliation for Generalized Flowsheet Applications, Amer. Chem. Society National Meeting, Dallas, Texas, 1989.

Tjoa, I.B., and Biegler, L.T., Simultaneous Strategies for Data Reconciliation and Gross Error Detection of Nonlinear Systems, Computers chem. Engng, 15, 10, 1991.

Tong, H., Studies in Data Reconciliation Using Principal Component Analysis, Ph.D. Dissertation, McMaster University, Hamilton, Ontario, Canada, 1995.

Tong, H., and Crowe, C.M., Detection of Gross Errors in Data Reconciliation by Principal Components, AIChE Journal, 41, 7, 1995.

Weiss, G.H., Romagnoli, J.A., and Islam, K.A., Data Reconciliation - An Industrial Case Study, Computers chem. Engng, 20, 12, 1996.


Appendix

The following presents the derivation of expression [25]. For the sake of argument, assume that A is of full row rank and B is of full column rank; then it is possible to expand the kernel matrix of equation [15] using the Sherman-Morrison-Woodbury formula (Golub and Van Loan, 1983) as

K^-1 = (A Q A^T)^-1 - (A Q A^T)^-1 B (γ^-1 Iy + B^T (A Q A^T)^-1 B)^-1 B^T (A Q A^T)^-1   [A1]

By pre-multiplying [A1] by A Q A^T, we arrive at

A Q A^T K^-1 = Ig - B (γ^-1 Iy + B^T (A Q A^T)^-1 B)^-1 B^T (A Q A^T)^-1   [A2]

Since γ will be chosen to be sufficiently large, the inverse containing Iy can be approximated by

(γ^-1 Iy + B^T (A Q A^T)^-1 B)^-1 ≈ (B^T (A Q A^T)^-1 B)^-1   [A3]

Substituting [A3] into [A2] and taking its trace, we have

tr(Ig - B (B^T (A Q A^T)^-1 B)^-1 B^T (A Q A^T)^-1)   [A4]

which can be easily simplified to

tr(Ig) - tr((B^T (A Q A^T)^-1 B)^-1 B^T (A Q A^T)^-1 B)   [A5]

Upon further simplification of the second term, whose argument is equivalent to Iy, we can now write

tr(Ig) - tr(Iy) = ng - ny   [A6]

If B is not of full column rank then it is required to replace B with B12, thus replacing ny with ny,12 in [A6]. Given that for realistic problems A will most likely not possess full row rank, the inverse of A Q A^T will presumably never be realizable. While strictly speaking this is true, and the above derivation describes only a hypothetical situation (i.e., a schematic representation), it has been observed in practice without exception that expression [25] is indeed approached as γ becomes large. Consequently, it seems appropriate to declare expression [25] a sensible guideline for selecting the regularization parameter γ.
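
Although the derivation is schematic, it is easy to spot-check numerically. The sketch below assumes the kernel form K = A Q A^T + γ B B^T used in [A1], builds a random system satisfying the stated rank assumptions, and verifies the expansion [A1] and then the limiting trace [A6].

import numpy as np

# Spot-check of [A1] and [A6] under the stated assumptions: A of full row
# rank, B of full column rank, Q positive definite, and the (assumed) kernel
# form K = A Q A^T + gamma * B B^T.
rng = np.random.default_rng(1)
ng, nx, ny, gamma = 5, 8, 2, 1e4
A = rng.standard_normal((ng, nx))              # full row rank w.p. 1
B = rng.standard_normal((ng, ny))              # full column rank w.p. 1
Q = np.diag(rng.uniform(0.5, 2.0, nx))
M = A @ Q @ A.T                                # A Q A^T
Minv = np.linalg.inv(M)
K = M + gamma * (B @ B.T)

# [A1]: Sherman-Morrison-Woodbury expansion of K^-1
inner = np.linalg.inv(np.eye(ny) / gamma + B.T @ Minv @ B)
K_inv = Minv - Minv @ B @ inner @ B.T @ Minv
print(np.allclose(K_inv, np.linalg.inv(K)))    # True

# [A6]: tr(A Q A^T K^-1) -> ng - ny = 3 as gamma becomes large
print(np.trace(M @ np.linalg.inv(K)))          # close to 3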


Figure 1 Flotation Circuit Flowsheet of Smith and Ichiyen (1973)


Figure 2 Heat Exchanger Flowsheet of Swartz (1989)