Great Parameter Estimation Monod

Biochemical Engineering Journal 24 (2005) 95–104

Practical identifiability of parameters in Monod kinetics and statistical analysis of residuals

Padmanaban Kesavan¹, Victor J. Law*

Department of Chemical and Biomolecular Engineering, Tulane University, 300 Boggs Building, New Orleans, LA 70118, USA

Received 22 January 2004; accepted 28 January 2005

* Corresponding author. Tel.: +1 504 865 5773; fax: +1 504 865 6744. E-mail address: [email protected] (V.J. Law).
¹ Present address: Sensitron Semiconductor, Deer Park, NY, USA.
1369-703X/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.bej.2005.01.028

Abstract

A systematic procedure for identifying the number of parameters of a model that can be estimated uniquely by nonlinear regression is presented. The objective function to be minimized is in the form of weighted sum of squares of residuals. The assumptions inherent in the choice of weights and the validity of the parameters estimated are verified by statistical tests. The procedure is illustrated by considering the estimation of parameters for Monod kinetics. Simulated data containing known measurement noise are used initially to illustrate the procedure. Finally, parameters are estimated from different sets of experimental data, and the validity and the uniqueness of the parameters are presented.

Keywords: Biokinetics; Dynamic modelling; Microbial growth; Bioreactions

1. Introduction

The kinetics of growth of microorganisms under substrate-limited conditions was quantitatively defined by Monod [1]. The Monod kinetic model is widely used in wastewater treatment, bioremediation and in various other environmental applications involving growth of microorganisms.

The original Monod model did not take into account the maintenance metabolism. Microorganisms require energy for synthesis of new microorganisms as well as for their maintenance. Maintenance energy is the amount of energy required by the microorganisms even in the absence of growth. The Monod model was modified by a number of researchers [2,3] to equations similar to the following:

$$\frac{ds}{dt} = -\frac{k s x}{k_s + s}, \qquad s(0) = s_0 \tag{1}$$

$$\frac{dx}{dt} = \frac{y k s x}{k_s + s} - b x, \qquad x(0) = x_0 \tag{2}$$

where $s$ = growth limiting substrate concentration (M L⁻³), $x$ = biomass concentration (M L⁻³), $k$ = maximum specific uptake rate of the substrate (T⁻¹), $k_s$ = half saturation constant for growth (M L⁻³), $y$ = yield coefficient (M M⁻¹), and $b$ = decay coefficient (T⁻¹).

The yield coefficient is that portion of the substrate that is used for the synthesis of the biomass. McCarty [4] developed a method for the theoretical calculation of yield based on thermodynamics.

Various other models similar to Monod kinetics have been proposed by a number of researchers [5,6]. However, the Monod model is the simplest and the one most widely applied in practice. The parameters to be estimated are $k$, $k_s$, $y$, $b$, $s_0$ and $x_0$. The estimation of parameters in the Monod model has been carried out extensively [7–17]. Different kinds of objective function, a variety of types of experimental data to be measured and variations of the Monod model were considered in these referenced works.

This paper discusses the statistical considerations in choosing the objective function to be minimized to estimate the parameters and the weights that should be assigned to the residuals. This then provides a theoretical framework to analyze the validity of the parameters once they are estimated


and the validity of assumptions in determining the weights of the residuals.

2. Theory

2.1. Modification of the Monod model to consider the limitation in experimental measurements

The biomass concentration ($x$) in Eqs. (1) and (2) represents the viable (living) biomass concentration. Experimental techniques used (volatile solids measurement, optical density measurements, protein content, etc.) measure the total biomass concentration and do not distinguish between the viable and dead biomass. Hence, a modification to Eqs. (1) and (2) is necessary to account for this limitation in experimental measurements.

If an assumption that the viable biomass concentration is a constant fraction of the total biomass concentration is made, then Eqs. (1) and (2) can be modified as:

$$\frac{ds}{dt} = -\frac{k \eta s x_t}{k_s + s}, \qquad s(0) = s_0 \tag{3}$$

$$\frac{dx_t}{dt} = \frac{y k s x_t}{k_s + s} - b x_t, \qquad x_t(0) = x_{0t} \tag{4}$$

where $x_t$ = total biomass concentration (M L⁻³) and $\eta = x_v/x_t$ (where $x_v$ = viable biomass concentration (M L⁻³)).

The model Eqs. (3) and (4) can be used to estimate the parameters using the measured substrate and total biomass concentrations. A total of seven parameters need to be estimated, namely $k$, $k_s$, $\eta$, $b$, $s_0$, $x_{0t}$ and $\beta$ (where $\beta = yk$). Eqs. (3) and (4) can be written in the following dimensionless form:

$$\frac{dS}{dT} = -\frac{K S X_t}{K_s + S}, \qquad S(0) = S_0 \tag{5}$$

$$\frac{dX_t}{dT} = \frac{\alpha S X_t}{K_s + S} - B' X_t, \qquad X_t(0) = X_{0t} \tag{6}$$

where $S = s/s^*$, $X_t = x_t/x^*$, $T = k_0 t$, $K = k \eta x^*/(k_0 s^*)$, $K_s = k_s/s^*$, $B' = b/k_0$, $S_0 = s_0/s^*$, $X_{0t} = x_{0t}/x^*$, and $\alpha = \beta/k_0$. $s^*$, $x^*$ and $k_0$ are known constant values with dimensions of $s$, $x$ and $k$, respectively. The value of $s^*$ and $x^*$ can be chosen as the experimentally measured initial substrate and biomass concentration, respectively. However, the values chosen for $s^*$ and $x^*$ do not affect the result. The value of $k_0$ can be chosen as 1.

2.2. Parameter estimation

Assuming that the substrate depletion and biomass concentration are measured at suitable time intervals, the objective is to estimate the parameters of the model Eqs. (3) and (4) such that they approximate the measured profiles closely. A suitable choice of the objective function to be minimized then has to be made. This is a nontrivial task since the parameter estimates will differ depending on the objective function chosen. Therefore, even if the predicted profiles approximate the measured profile for an arbitrary choice of objective function, it is impossible to infer anything about the validity of the parameters and the procedure reduces to simple curve fitting. The criteria for the choice of the objective function should be based on the measurement errors associated with the experimental data [18,19]. The choice of objective function to estimate the parameters of the model defined by Eqs. (5) and (6) therefore should be based on the errors associated with the measurements of the substrate and biomass concentrations. For example, if the objective function chosen is:

$$q = \sum_{i=1}^{n} \left(S_{\mathrm{model}} - S_{\mathrm{data}}\right)_i^2 + \sum_{j=1}^{n} \left(X_{\mathrm{model}} - X_{\mathrm{data}}\right)_j^2$$

where $q$ = objective function, $S_{\mathrm{model}}$ = substrate concentration from the model defined by Eq. (5), $S_{\mathrm{data}}$ = nondimensionalized experimental substrate concentration, $X_{\mathrm{model}}$ = total biomass concentration from the model defined by Eq. (6), and $X_{\mathrm{data}}$ = nondimensionalized experimental total biomass concentration,

then the inherent assumptions made about the measurement errors [19] are:

1. Each of the measurements of substrate and biomass concentrations is normally distributed, each with its own variance.
2. There is no correlation between any measurements. In other words, all the measurements are completely independent.
3. The variance of all the measurements is equal.

Therefore, once the parameters are estimated, it is necessary to verify whether all the assumptions made are satisfied. This can be done based on statistical tests as discussed by Bard [18].

Once the objective function is chosen, then the parameter estimation problem becomes a nonlinear regression problem. The nonlinear regression algorithm, CONREG [20], is used in this work. This algorithm uses a combination of Newton's method and weighted steepest descent and is called transformational discrimination [21]. The scaling of the parameters was performed using the procedure developed previously [17]. The function values, the gradient vector and the Gauss–Newton matrix for a given estimate of parameters (required by CONREG) are calculated by integrating Eqs. (5) and (6) along with the sensitivity equations by a fourth-order Runge–Kutta method. It should be noted that the Gauss–Newton matrix as well as the gradient vector can be obtained from the sensitivity equations and the residuals [17,22]. The uniqueness of the parameter estimates is identified quantitatively by analyzing the orthonormal eigenvectors of the Gauss–Newton matrix [17].

The value of $B'$ is usually close to zero. Since unconstrained nonlinear regression is used to estimate the parameters, to avoid $B'$ being a small negative value the transformation $B' = \exp(-B)$ is used. Eq. (6) can then be rewritten as:

$$\frac{dX_t}{dT} = \frac{\alpha S X_t}{K_s + S} - \exp(-B)\, X_t, \qquad X_t(0) = X_{0t} \tag{7}$$

The parameters that can be estimated from Eqs. (5) and (7) are $K$, $K_s$, $\alpha$, $B$, $S_0$ and $X_{0t}$, for a total of six. However, the parameters of the original Monod model that need to be estimated from Eqs. (3) and (4) are $k$, $k_s$, $\eta$, $b$, $s_0$, $x_{0t}$ and $\beta$, for a total of seven. The products $yk$ and $k\eta$ (three parameters) in Eqs. (3) and (4) are lumped to yield the two parameters $\alpha$ and $K$ in Eqs. (5) and (7). Therefore, one parameter of the original Monod model defined by Eqs. (3) and (4) has to be estimated independently. The yield coefficient ($y$) can be estimated from thermodynamics [4].
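The lumping of the dimensional parameters into the dimensionless set can be illustrated with a short Python sketch. It only restates the definitions given with Eqs. (5) and (7); the function name `nondimensionalize` and the dictionary layout are illustrative choices, not part of the original work. The input values are those listed for the mixed-order region (Table 2), with $s^*$ = 3 g/L, $x^*$ = 1 g/L and $k_0$ = 1 h⁻¹.

```python
import math

def nondimensionalize(k, ks, y, b, eta, s0, x0t, s_star, x_star, k0=1.0):
    """Map dimensional Monod parameters to the dimensionless set of Eqs. (5) and (7).

    Hypothetical helper: it applies the definitions stated in the text and nothing more.
    """
    beta = y * k                                   # beta = y*k
    return {
        "K": k * eta * x_star / (k0 * s_star),     # K = k*eta*x* / (k0*s*)
        "Ks": ks / s_star,                         # Ks = ks / s*
        "alpha": beta / k0,                        # alpha = beta / k0
        "B": -math.log(b / k0),                    # from B' = b/k0 = exp(-B)
        "S0": s0 / s_star,
        "X0t": x0t / x_star,
    }

# Mixed-order region values (Table 2), scaled by s* = 3 g/L, x* = 1 g/L, k0 = 1 1/h.
params = nondimensionalize(k=0.83, ks=3.0, y=0.6, b=0.01, eta=0.5,
                           s0=3.0, x0t=1.0, s_star=3.0, x_star=1.0)
for name, value in params.items():
    print(f"{name} = {value:.3f}")
```

The printed values agree with the nondimensionalized values reported in Table 3 to the rounding used there ($\alpha$ evaluates to 0.498, reported as 0.5). Note that $y$ and $k$ enter only through the products $yk$ and $k\eta$, which is why one of the original parameters must be supplied independently.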

3. Results and discussion

An analysis was first performed to determine the parameters of Monod kinetics from substrate and biomass concentrations in different regions of substrate depletion. The regions of substrate depletion can be characterized as first-order region ($S_0 \ll K_s$), mixed-order region ($S_0 \cong K_s$) and zero-order region ($S_0 \gg K_s$) [11].

Simulated data containing known measurement noise were first generated using specified parameters and were used for the analysis. The objective function was chosen based on the type of measurement error added to the simulated data. It should be noted that in the case of experimental systems, it will not be possible to ascertain a priori the region of substrate depletion. However, once the experiment has been carried out (and provided the measurement errors are known), then based on the parameters estimated uniquely, it may be possible to infer the region of substrate depletion. An experiment may then be repeated, if need be, by altering (increasing or decreasing) the initial substrate concentration to estimate the parameters. The purpose of the analysis using simulated data was, therefore, to identify the number of parameters that can be estimated uniquely in different regions of substrate depletion from substrate and biomass concentrations with known measurement noise.

3.1. First-order region

Experimental data with no measurement noise were simulated in the first-order region ($S_0 \ll K_s$) by solving Eqs. (5) and (7) using a fourth-order Runge–Kutta method. The parameter values used are listed in Table 1. The detection limit for substrate concentration was assumed to be 10 ppm. The simulated substrate concentration was less than 10 ppm after 22 h and hence the substrate and biomass concentrations for 22 h were used to estimate the parameters. The experimental data were nondimensionalized by choosing $s^*$, $x^*$ and $k_0$ as 0.3 g/L, 1 g/L and 1 h⁻¹, respectively. The simulated data were used to estimate the parameters using the nonlinear regression routine. Since there is no measurement noise, the objective function to be minimized was chosen as:

$$q = \sum_{i=1}^{n} \left(S_{\mathrm{model}} - S_{\mathrm{data}}\right)_i^2 + \sum_{j=1}^{n} \left(X_{\mathrm{model}} - X_{\mathrm{data}}\right)_j^2$$

Table 1
The values of the parameters of Monod kinetics used to simulate experimental data in the first-order region

| Parameter | Value |
|---|---|
| $k$ (h⁻¹) | 0.83 |
| $k_s$ (g/L) | 3 |
| $y$ (g/g) | 0.6 |
| $b$ (h⁻¹) | 0.01 |
| $\eta$ | 0.5 |
| $s_0$ (g/L) | 0.3 |
| $x_{0t}$ (g/L) | 1 |

The orthonormal eigenvectors were analyzed as discussed by Kesavan and Law [17] to identify the parameters uniquely determined. The ill-determined parameter was either $K$ or $K_s$. The ratio of the highest to the smallest eigenvalue was 10⁵. Even though this ratio is high, it may not be considered extremely poor. Therefore, it is worth investigating further whether the parameters of Monod kinetics can be uniquely determined in the first-order region.

Gaussian measurement noise of ±15% was added to the experimental data (simulated using the parameters listed in Table 1) on substrate depletion and biomass growth. Three different sets of experimental data were created by adding three different sequences of random numbers to the simulated data. The parameters for different sets of simulated data were estimated by nonlinear regression. Since the measurement noise is proportional to the measurements and since the measurement errors are uncorrelated, the objective function to be minimized is [18,19]:

$$q = \sum_{i=1}^{n} \frac{1}{\sigma_1}\left(\frac{S_{\mathrm{model}} - S_{\mathrm{data}}}{S_{\mathrm{data}}}\right)_i^2 + \sum_{i=1}^{n} \frac{1}{\sigma_2}\left(\frac{X_{\mathrm{model}} - X_{\mathrm{data}}}{X_{\mathrm{data}}}\right)_i^2$$

where $\sigma_1$ = standard deviation of measurement errors in substrate data and $\sigma_2$ = standard deviation of measurement errors in total biomass concentration.

The steps followed for each set of data are summarized below:

Step 1: Simulation of experimental data with no measurement error.
Experimental data with no noise were simulated in the first-order region by solving Eqs. (5) and (7) by a fourth-order Runge–Kutta method using the parameters listed in Table 1.

Step 2: Simulation of experimental data with proportional measurement error.
Gaussian noise of ±15% was added to the simulated data on substrate depletion and biomass growth (from step 1) using a pseudorandom number generator. The algorithm used by the random number generator is described by Knuth [23]. Gaussian noise of ±15% was also added to the initial substrate and biomass concentration. The detection limit for substrate concentration was assumed to be 10 ppm. The substrate concentration was below 10 ppm after 21.5 h for datasets 1 and 2 and after 22 h for dataset 3.

Step 3: Determination of nondimensionalizing constants.
The values of $s^*$ and $x^*$ were chosen to be equal to the initial substrate and biomass concentration from step 2. The value of $k_0$ was chosen to be 1.

Step 4: Nondimensionalizing simulated experimental data with proportional error.
The simulated experimental data with noise (step 2) were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$.

Step 5: Initial guess of the parameters required for nonlinear regression.
The guesses for initial substrate and biomass concentration were chosen as those simulated in step 2. Three different guesses were chosen for the other parameters. The guesses were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$.

Table 2
The values of the parameters of Monod kinetics used to simulate experimental data in the mixed-order region

| Parameter | Value |
|---|---|
| $k$ (h⁻¹) | 0.83 |
| $k_s$ (g/L) | 3 |
| $y$ (g/g) | 0.6 |
| $b$ (h⁻¹) | 0.01 |
| $\eta$ | 0.5 |
| $s_0$ (g/L) | 3 |
| $x_{0t}$ (g/L) | 1 |
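Steps 1 and 2 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the paper: the function names are hypothetical, Python's `random.gauss` stands in for the Knuth-based generator, and the relative standard deviation of 0.075 is the value the paper later assigns to a ±15% noise level read as a 95% limit. The dimensionless parameters are those implied by Table 1 with $s^*$ = 0.3 g/L, $x^*$ = 1 g/L and $k_0$ = 1 h⁻¹.

```python
import math
import random

def monod_rhs(S, Xt, K, Ks, alpha, B):
    """Right-hand sides of the dimensionless Eqs. (5) and (7)."""
    rate = S * Xt / (Ks + S)
    return -K * rate, alpha * rate - math.exp(-B) * Xt

def rk4_simulate(p, t_end, n_steps):
    """Classical fourth-order Runge-Kutta integration; returns lists T, S, Xt."""
    h = t_end / n_steps
    S, Xt = p["S0"], p["X0t"]
    args = (p["K"], p["Ks"], p["alpha"], p["B"])
    T, S_out, X_out = [0.0], [S], [Xt]
    for i in range(n_steps):
        a1, b1 = monod_rhs(S, Xt, *args)
        a2, b2 = monod_rhs(S + 0.5 * h * a1, Xt + 0.5 * h * b1, *args)
        a3, b3 = monod_rhs(S + 0.5 * h * a2, Xt + 0.5 * h * b2, *args)
        a4, b4 = monod_rhs(S + h * a3, Xt + h * b3, *args)
        S += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        Xt += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
        T.append((i + 1) * h)
        S_out.append(S)
        X_out.append(Xt)
    return T, S_out, X_out

def add_proportional_noise(values, rel_sigma, rng):
    """Multiply each value by (1 + e), e ~ N(0, rel_sigma): noise proportional to the measurement."""
    return [v * (1.0 + rng.gauss(0.0, rel_sigma)) for v in values]

# First-order region: Table 1 values scaled by s* = 0.3 g/L, x* = 1 g/L, k0 = 1 1/h.
first_order = {"K": 0.83 * 0.5 * 1.0 / 0.3, "Ks": 3.0 / 0.3, "alpha": 0.6 * 0.83,
               "B": -math.log(0.01), "S0": 1.0, "X0t": 1.0}

# Step 1: noise-free profiles over 22 h (T equals t in hours because k0 = 1 1/h).
T, S_clean, X_clean = rk4_simulate(first_order, t_end=22.0, n_steps=220)

# Step 2: proportional Gaussian noise; the seed is arbitrary and fixed for repeatability.
rng = random.Random(1)
S_noisy = add_proportional_noise(S_clean, 0.075, rng)
X_noisy = add_proportional_noise(X_clean, 0.075, rng)
```

The step size (0.1 h) and the number of sampling points are assumptions made for the sketch; the paper does not state them at this point.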

The details of the initial guesses used and the eigenvalues and the eigenvectors of the Gauss–Newton matrix for each case are given in Kesavan [24]. Based on the analysis proposed by Kesavan and Law [17], the following conclusions are made. The parameters associated with the smallest eigenvalue (ill-determined) are $K$ and/or $K_s$. Therefore, all the parameters of the Monod kinetics model cannot be determined in the first-order region. If an attempt is made to estimate all the parameters from a given set of experimental substrate and biomass concentration, and if the parameters associated with the smallest eigenvalue are $K$ and/or $K_s$, then the experiment was probably carried out in the first-order region.

3.2. Mixed-order region

Experimental data with no measurement noise were simulated in the mixed-order region ($S_0 \cong K_s$) by solving Eqs. (5) and (7) using a fourth-order Runge–Kutta method. The parameter values used are listed in Table 2. The detection limit for substrate concentration was assumed to be 10 ppm. The simulated substrate concentration was less than 10 ppm after 15 h and hence the substrate and biomass concentrations for 15 h were used to estimate the parameters. The experimental data were nondimensionalized by choosing $s^*$, $x^*$ and $k_0$ as 3 g/L, 1 g/L and 1 h⁻¹, respectively. The nondimensionalized parameter values are listed in Table 3. The analysis of the orthonormal eigenvectors of the Gauss–Newton matrix indicated that all the parameters were well determined.

Table 3
The nondimensionalized parameter values of the mixed-order region

| Parameter | Value |
|---|---|
| $K$ | 0.138 |
| $K_s$ | 1 |
| $\alpha$ | 0.5 |
| $B$ | 4.605 |
| $S_0$ | 1 |
| $X_{0t}$ | 1 |

Gaussian measurement noise of ±15% was added to the experimental data on substrate depletion and biomass growth in the mixed-order region. The parameters of the model were estimated from 10 different sets of experimental data (created by adding 10 different sequences of random numbers to the simulated data). Three different sets of initial guess were tried for each set of data. The results are summarized in Tables 4–6.

Table 4
Results of parameter estimates in the mixed-order region using substrate and biomass data with measurement noise of ±15%

| Dataset | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Initial substrate concentration (g/L) | 3.04 | 3.05 | 2.82 | 2.94 | 3.41 | 3.08 | 3.19 | 2.87 | 2.95 | 3.05 |
| Initial total biomass concentration (g/L) | 1.04 | 0.96 | 1.04 | 0.97 | 0.97 | 0.98 | 1.01 | 0.996 | 0.91 | 0.93 |
| Ratio of highest to the smallest eigenvalue | 10⁷ | 10⁶ | 10⁴ | 10⁴ | 10⁴ | 10⁴ | 10⁴ | 10⁴ | 10³ | 10⁴ |
| Parameter ill-determined | B | B | – | – | – | – | – | – | – | – |

The initial guesses used were $k$ = 0.83 h⁻¹, $k_s$ = 3 g/L, $y$ = 0.602 g/g, $b$ = 0.01 h⁻¹, and $\eta$ = 0.5. The initial guesses for $s_0$ and $x_{0t}$ were chosen to be the measured initial substrate and biomass concentration, respectively. The parameters were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$. The values of $s^*$ and $x^*$ for each set were chosen equal to the initial substrate and biomass concentration, respectively. The value of $k_0$ was chosen to be 1 h⁻¹.

Table 5
Results of parameter estimates in the mixed-order region using substrate and biomass data with measurement noise of ±15%

| Dataset | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Initial substrate concentration (g/L) | 3.04 | 3.05 | 2.82 | 2.94 | 3.41 | 3.08 | 3.19 | 2.87 | 2.95 | 3.05 |
| Initial total biomass concentration (g/L) | 1.04 | 0.96 | 1.04 | 0.97 | 0.97 | 0.98 | 1.01 | 0.996 | 0.91 | 0.93 |
| Ratio of highest to the smallest eigenvalue | 10⁷ | 10¹⁰ | 10⁷ | 10⁶ | 10⁸ | 10⁵ | 10⁶ | 10⁶ | 10⁶ | 10⁸ |
| Parameter ill-determined | B | B | B | B | B | – | B | B | B | B |

The initial guesses used were $k$ = 8.3 h⁻¹, $k_s$ = 30 g/L, $y$ = 1 g/g, $b$ = 0.1 h⁻¹, and $\eta$ = 0.1. The initial guesses for $s_0$ and $x_{0t}$ were chosen to be the measured initial substrate and biomass concentration, respectively. The parameters were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$. The values of $s^*$ and $x^*$ for each set were chosen equal to the initial substrate and biomass concentration, respectively. The value of $k_0$ was chosen to be 1 h⁻¹.

Table 6
Results of parameter estimates in the mixed-order region using substrate and biomass data with measurement noise of ±15%

| Dataset | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Initial substrate concentration (g/L) | 3.04 | 3.05 | 2.82 | 2.94 | 3.41 | 3.08 | 3.19 | 2.87 | 2.95 | 3.05 |
| Initial total biomass concentration (g/L) | 1.04 | 0.96 | 1.04 | 0.97 | 0.97 | 0.98 | 1.01 | 0.996 | 0.91 | 0.93 |
| Ratio of highest to the smallest eigenvalue | 10¹⁸ | 10²⁷ | 10¹⁹ | 10²² | 10⁸ | 10²² | 10²³ | 10¹⁸ | 10²⁹ | 10²⁸ |
| Parameter ill-determined | B | B | B | B | B | B | B | B | B | B |

The initial guesses used were $k$ = 0.083 h⁻¹, $k_s$ = 0.3 g/L, $y$ = 1 g/g, $b$ = 0.1 h⁻¹, and $\eta$ = 0.9. The initial guesses for $s_0$ and $x_{0t}$ were chosen to be the measured initial substrate and biomass concentration, respectively. The parameters were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$. The values of $s^*$ and $x^*$ for each set were chosen equal to the initial substrate and biomass concentration, respectively. The value of $k_0$ was chosen to be 1 h⁻¹.

It can be concluded from Table 4 that all the parameters are well determined in most cases using substrate and biomass data in the mixed-order region with a good initial guess. However, the parameter $B$ is ill-determined when the initial guesses are poor, which is evident from Tables 5 and 6. An analysis was performed to improve the initial guess for $B$, which is as follows.

Eqs. (5) and (7) can be combined to eliminate $X_t\,dT$, giving $K\,dX_t + \alpha\,dS = \exp(-B)\,(1 + K_s/S)\,dS$, and integrated to yield:

$$K (X_t - X_{0t}) + \alpha (S - S_0) = \exp(-B)\left[S - S_0 + K_s \ln\frac{S}{S_0}\right] \tag{8}$$

The parameters in Eq. (8) are $K$, $K_s$, $\alpha$, $S_0$, $B$ and $X_{0t}$. However, all the parameters except $B$ are well determined using any arbitrary (but reasonable) guess by nonlinear regression. The estimates of these parameters can be used in Eq. (8). The only unknown parameter then is $B$. An estimate of $B$ can be obtained by regressing Eq. (8). The steps are summarized below:

1. Perform nonlinear regression using substrate and biomass data using an arbitrary set of guesses for $K$, $K_s$, $\alpha$ and $B$. Use the measured initial substrate and biomass as an initial guess for $S_0$ and $X_{0t}$, respectively.
2. Perform linear regression on Eq. (8) using the estimates of the parameters from step 1 (which are the new initial guesses for these parameters) to determine an initial guess for $B$. The substrate and biomass data to be used in Eq. (8) were smoothed using quadratic curve fitting.
3. Perform nonlinear regression using the new guess for all the parameters.

The procedure described above was carried out for 10 sets of experimental data starting with the initial guess for the parameters as used in Tables 5 and 6. The estimate of $\exp(-B)$ obtained by regressing Eq. (8) turned out to be negative in a few cases and hence a new improved guess for $B$ could not be calculated in these cases. However, the estimate of $B$ improved with an improvement in the initial guess for the parameters.
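The linear regression in step 2 of this procedure can be sketched as below. The sketch assumes the integrated relation obtained by combining Eqs. (5) and (7), $K(X_t - X_{0t}) + \alpha(S - S_0) = \exp(-B)\,[S - S_0 + K_s \ln(S/S_0)]$, and fits its single unknown slope $\exp(-B)$ by least squares through the origin. The function name `estimate_B` and the choice of a zero-intercept fit are assumptions; the paper states only that linear regression was used, on data smoothed by quadratic curve fitting (the smoothing is omitted here).

```python
import math

def estimate_B(S, Xt, K, Ks, alpha, S0, X0t):
    """Fit u = exp(-B) * v from the integrated form of Eqs. (5) and (7).

    Returns (slope, B). B is None when the fitted slope is not positive,
    the situation in which no improved guess for B can be computed.
    """
    u = [K * (x - X0t) + alpha * (s - S0) for s, x in zip(S, Xt)]
    v = [(s - S0) + Ks * math.log(s / S0) for s in S]
    denom = sum(vi * vi for vi in v)
    if denom == 0.0:
        raise ValueError("all substrate values equal S0; the slope is undefined")
    slope = sum(ui * vi for ui, vi in zip(u, v)) / denom   # least squares through the origin
    if slope <= 0.0:
        return slope, None
    return slope, -math.log(slope)

# Synthetic, noise-free check built directly from the integrated relation
# with the Table 3 values; it is not data from the paper.
K, Ks, alpha, S0, X0t, B_true = 0.138, 1.0, 0.5, 1.0, 1.0, 4.605
S_obs = [0.9, 0.7, 0.5, 0.3, 0.1]
Xt_obs = [X0t + (math.exp(-B_true) * ((s - S0) + Ks * math.log(s / S0))
                 - alpha * (s - S0)) / K for s in S_obs]
slope, B_est = estimate_B(S_obs, Xt_obs, K, Ks, alpha, S0, X0t)
print(B_est)
```

Because $\exp(-B)$ is small (about 0.01 for the simulated cases), the left-hand side of the relation is a small difference of two larger terms, so modest errors in $K$, $\alpha$ or the data can push the fitted slope below zero. This is consistent with the negative estimates reported above.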

Hence, it can be concluded that all the parameters of Monod kinetics can be determined in the mixed-order region using substrate and biomass data in most cases. The ill-determined parameter in certain cases is $B$. The estimate of $B$ can be improved with a good initial guess. But, due to the distribution of the errors in the substrate and biomass data, it may be impossible to determine $B$ in certain cases even with a good initial guess for all the parameters.

An analysis was performed where the value of $B$ was fixed and the other parameters were estimated. Five different sets of data with measurement noise were generated and three different sets of initial guesses were used for $k$, $k_s$, $y$, $b$ and $\eta$. The results are summarized in Tables 7 and 8 for one set of initial guess, from which it can be seen that all the parameters are well determined based on the ratio of eigenvalues and the parameter standard deviations. The conclusions for the other two sets of initial guesses were also the same and therefore are not presented. Interested readers are referred to Kesavan [24] for the results of the other initial guesses.

Table 7
Results of parameter estimates in the mixed-order region using substrate and biomass data with measurement noise of ±15%

| Dataset | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Initial substrate concentration (g/L) | 3.04 | 3.05 | 2.82 | 2.94 | 3.47 |
| Initial total biomass concentration (g/L) | 1.04 | 0.96 | 1.04 | 0.97 | 0.97 |
| Ratio of highest to the smallest eigenvalue | 10⁴ | 10⁴ | 10⁴ | 10³ | 10⁴ |
| Parameter ill-determined | – | – | – | – | – |

The value of $b$ was fixed to be 0.01 h⁻¹. The initial guesses used were $k$ = 0.83 h⁻¹, $k_s$ = 3 g/L, $y$ = 0.602 g/g, and $\eta$ = 0.5. The initial guesses for $s_0$ and $x_{0t}$ were chosen to be the measured initial substrate and biomass concentration, respectively. The parameters were nondimensionalized using the values of $s^*$, $x^*$ and $k_0$. The values of $s^*$ and $x^*$ for each set were chosen equal to the initial substrate and biomass concentration, respectively. The value of $k_0$ was chosen to be 1 h⁻¹.

Table 8
Parameter estimates from nonlinear regression averaged over all the five cases in Table 7

| Parameter | Value | Standard deviation |
|---|---|---|
| $K$ | 0.142 | 1.282E−02 |
| $K_s$ | 1.028 | 1.065E−01 |
| $\alpha$ | 0.52 | 4.45E−02 |
| $S_0$ | 0.984 | 2.55E−02 |
| $X_{0t}$ | 1.0 | 4.042E−02 |

3.2.1. Test on residuals and goodness of fit

Once the parameters were estimated in the mixed-order region after fixing $B$, the uniqueness of the parameters was verified. However, the assumptions made on the measurement errors have to be validated. This was achieved by carrying out a few tests on the residuals as follows.

3.2.1.1. Standard deviation of the residuals. The substrate and biomass data were simulated such that they represent measured concentrations with a measurement noise of 15% (based on a 95% confidence limit) of the magnitude of measurement. The measurement noise is assumed to be normally distributed. The 95% confidence interval for a normal distribution corresponds to approximately twice the standard deviation of the distribution. Therefore, a proportional noise of 15% corresponds to a standard deviation of (0.15/2) times the measurement. Each measurement in substrate and biomass concentration then has a measurement error with standard deviation different from all other measurements. However, if the objective function to be minimized is formulated as:

$$q = \sum_{i=1}^{30} \frac{1}{\sigma_1}\left(\frac{S_{\mathrm{model}} - S_{\mathrm{data}}}{S_{\mathrm{data}}}\right)_i^2 + \sum_{i=1}^{30} \frac{1}{\sigma_2}\left(\frac{X_{\mathrm{model}} - X_{\mathrm{data}}}{X_{\mathrm{data}}}\right)_i^2 \tag{9}$$

where q= objective function, Smodel= substrate concen-tration from the model defined by Eq.(5), Sdata=n con-c hem u-ld 5/2),σ to-tS ata,as

iono a andσ

-m Eq.( deviat ingf

V

Table 9The variance and covariances of the residuals for the cases inTable 7

Dataset σ1 σ12 σ2 λ

1 0.075 −0.19E−03 0.073 0.1752 0.091 0.141E−02 0.069 1.163 0.073 −0.26E−03 0.064 0.284 0.058 −0.52E−03 0.077 0.595 0.092 −0.13E−02 0.074 0.98

where V= covariance matrix of errors,M= moment ma-trix =

∑ni=1EiE

Ti ,n= number of observations,L= number of

parameters estimated, andm= number of equations.The measurement noise of the residuals of substrate and

biomass data are 0.075. Also the measurement errors of sub-strate and biomass data were assumed to be independent. Thecovariance matrix defined by Eq.(10)can be computed oncethe parameters are estimated by nonlinear regression. Thecovariance matrix was computed for all the cases describedin Table 7. The results are summarized inTable 9.

In the objective function of Eq. (9), Sdata = nondimensionalized simulated experimental substrate concentration, Xmodel = total biomass concentration from the model defined by Eq. (7), Xdata = nondimensionalized simulated experimental total biomass concentration, σ1 = standard deviation of measurement errors in substrate data (=0.15/2), σ2 = standard deviation of measurement errors in total biomass concentration (=0.15/2), e1i = ((Smodel − Sdata)/Sdata)i = residual of the ith measurement in substrate data and e2i = ((Xmodel − Xdata)/Xdata)i = residual of the ith measurement in biomass data.

The residuals e1i and e2i have a constant standard deviation of σ1 (equal to 0.075 in cases studied) for substrate data and σ2 (equal to 0.075 in cases studied) for biomass data.

The parameters for the cases discussed in Table 7 were estimated by minimizing the objective function defined by Eq. (9). Once the parameters are estimated, the standard deviation of the residuals can be calculated using the following formula described by Bard [18]:

$$\ldots = \frac{1}{n - (L/m)}\, M \qquad (10)$$

It can be seen from Table 9 that the standard deviations of the residuals are very close to 0.075. The conclusion drawn then is that the model represents the data and any errors associated can be attributed to measurement noise alone. The objective function defined by Eq. (9) assumes that the errors in the substrate and biomass data are independent. The covariance between substrate and biomass measurements should then be zero. Table 9 lists the covariance between substrate and biomass data for all the cases described in Table 7. In order to verify whether the covariances are significantly different from zero (which would imply that the assumption of independent substrate and biomass measurement errors is not valid), the following test described by Bard [18] was carried out:

$$\lambda = \frac{r_{ab}\,(n^{*} - 2)^{1/2}}{(1 - r_{ab}^{2})^{1/2}}$$

where λ = a statistic, rab = correlation coefficient = σ12/(σ1σ2) (where σ12 = covariance between the measurement errors in substrate and biomass data), and n* = n − (L/m).

The quantity λ has the t-distribution with n* − 2 degrees of freedom. The usefulness of this statistic to check whether the calculated covariances in Table 9 are significantly different from zero is shown by means of the following example.

The λ value of dataset 3 from Table 9 is 0.28. From the tables of the t-distribution, the probability of encountering a value of |λ| > 0.28 with 27.5 degrees of freedom is approximately 80%. Hence, we can accept the hypothesis that the covariance is zero for this case. Similarly, the values of λ were checked for all the cases and the covariance was found to be not different from zero.
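The covariance check above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and the tail probability is obtained by integrating the Student t density numerically (Simpson's rule) in place of the printed tables the text refers to. The degrees of freedom are passed in explicitly, so the value of 27.5 used in the example for dataset 3 is taken directly from the text.

```python
import math

def covariance_statistic(r_ab: float, n_star: float) -> float:
    """lambda = r_ab * sqrt(n* - 2) / sqrt(1 - r_ab**2)."""
    return r_ab * math.sqrt(n_star - 2.0) / math.sqrt(1.0 - r_ab ** 2)

def t_pdf(x: float, dof: float) -> float:
    """Density of Student's t-distribution; dof may be non-integer."""
    log_norm = (math.lgamma((dof + 1.0) / 2.0) - math.lgamma(dof / 2.0)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1.0) / 2.0 * math.log1p(x * x / dof))

def two_sided_probability(stat: float, dof: float, steps: int = 2000) -> float:
    """P(|T| > |stat|), integrating the density on [0, |stat|] by Simpson's rule."""
    upper = abs(stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, dof)
    central_mass = 2.0 * total * h / 3.0  # probability of |T| <= |stat|
    return max(0.0, 1.0 - central_mass)

# Dataset 3 of Table 9: lambda = 0.28, 27.5 degrees of freedom (values from the text).
p_dataset3 = two_sided_probability(0.28, 27.5)
print(f"P(|lambda| > 0.28) = {p_dataset3:.2f}")
```

A large probability, as obtained here, means that a statistic of this size is entirely ordinary when the true covariance is zero, which is the reasoning used in the text to retain the independence assumption.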

Fig. 1. Plot of substrate residuals for dataset 1 in Table 7.

3.2.1.2. Test on runs. Residuals finally should be checked for randomness. The residuals (e1i and e2i) can be calculated after the parameters have been estimated by nonlinear regression. A plot of residuals for dataset 1 in Table 7 is shown in Figs. 1 and 2. The randomness can be tested by performing a test of runs (a run is a sequence of residuals of equal sign), a procedure described by Bard [18], which is as follows.

The expected number of runs is given by:

$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

where n1 = number of negative residuals and n2 = number of positive residuals.

The variance of the number of runs is given by:

$$\sigma^2 = \frac{2 n_1 n_2\,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2\,(n_1 + n_2 - 1)}$$

If n1 and n2 are both greater than 10, then the quantity

$$z = \frac{r - \mu + 0.5}{\sigma}$$

is distributed normally as N(0, 1), where r is the number of actual runs. The usefulness of this test is illustrated below.

The value of n1 is 13, n2 is 17 and r is 15 for the substrate residuals from Fig. 1. The z value is then −0.087.

The probability of finding 15 or fewer runs is approximately P(z ≤ −0.087), which from the tables of the normal distribution is about 48%. The z value from Fig. 2 is −0.975 and the probability was about 19%. Hence, there is strong evidence (not conclusive though) that the substrate residuals and the biomass residuals for this case are randomly distributed. The randomness of the substrate and biomass residuals for all the other datasets in Table 7 was also verified by this procedure.
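The runs test described above is short enough to sketch directly. The following Python fragment is an illustration only, with hypothetical function names; it uses the formulas for µ, σ² and z given in the text and the standard normal distribution function expressed through `math.erf`. For n1 = 13, n2 = 17 and r = 15 it gives z of about −0.088, in line with the value quoted for Fig. 1.

```python
import math

def count_runs(residuals):
    """Return (n_negative, n_positive, runs); exact zeros are skipped."""
    signs = [value > 0 for value in residuals if value != 0]
    runs = sum(1 for i, sign in enumerate(signs)
               if i == 0 or sign != signs[i - 1])
    n_positive = sum(signs)
    return len(signs) - n_positive, n_positive, runs

def runs_test(n1, n2, r):
    """Return (mu, variance, z, lower-tail probability) for r observed runs."""
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    variance = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1.0)))
    z = (r - mu + 0.5) / math.sqrt(variance)
    # Standard normal CDF evaluated at z (valid when n1 and n2 both exceed 10).
    lower_tail = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mu, variance, z, lower_tail

# Substrate residuals of Fig. 1: 13 negative, 17 positive, 15 runs (from the text).
mu, variance, z, probability = runs_test(13, 17, 15)
print(f"mu = {mu:.2f}, z = {z:.3f}")
```

In practice `count_runs` would be applied to the residual sequence in time order to obtain n1, n2 and r before calling `runs_test`.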

The test of runs, however, should be viewed with caution. Bard [18] has strongly emphasized that failure to pass the number of runs test should not be considered as a reason to reject the model and that nonrandomness of residuals is a rule rather than an exception. The nonrandomness may be due to some minor effects which are neglected in the model. A visual test on randomness, similar to that shown in Figs. 1 and 2, is usually sufficient.

3.3. Zero-order region

Experimental data with no measurement noise and with Gaussian measurement noise of ±15% were simulated in the zero-order region (S0 ≫ Ks) with the following parameter values: k = 0.83 h−1, ks = 3 g/L, y = 0.5 g/g, b = 0.01 h−1, s0 = 150 g/L, and x0 = 1 g/L. Nonlinear regression was used to estimate the parameters. Analysis of the eigenvalues and eigenvectors proved that all the parameters except B are well determined. This is very similar to the mixed-order region.

Fig. 2. Plot of biomass residuals for dataset 1 in Table 7.
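The generation of simulated data in the zero-order region can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the model is Eqs. (1) and (2) in dimensional form, integrated with a hand-written fourth-order Runge-Kutta step; the 5 h duration and the 30 sampling intervals are assumed for illustration; and the ±15% Gaussian noise is interpreted as a relative standard deviation of 0.15/2 = 0.075, following the values of σ1 and σ2 stated earlier. All function names are hypothetical.

```python
import random

def monod_rhs(s, x, k, ks, y, b):
    """Right-hand sides of Eqs. (1) and (2): ds/dt and dx/dt."""
    rate = k * s * x / (ks + s)
    return -rate, y * rate - b * x

def rk4_step(s, x, h, p):
    """One classical Runge-Kutta step of size h for the (s, x) system."""
    k1 = monod_rhs(s, x, *p)
    k2 = monod_rhs(s + 0.5 * h * k1[0], x + 0.5 * h * k1[1], *p)
    k3 = monod_rhs(s + 0.5 * h * k2[0], x + 0.5 * h * k2[1], *p)
    k4 = monod_rhs(s + h * k3[0], x + h * k3[1], *p)
    s_new = s + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    x_new = x + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return s_new, x_new

def simulate(p, s0, x0, t_end, n_steps):
    """Return equally spaced times with noise-free substrate and biomass values."""
    h = t_end / n_steps
    times, s_vals, x_vals = [0.0], [s0], [x0]
    for i in range(n_steps):
        s_next, x_next = rk4_step(s_vals[-1], x_vals[-1], h, p)
        times.append((i + 1) * h)
        s_vals.append(s_next)
        x_vals.append(x_next)
    return times, s_vals, x_vals

def add_proportional_noise(values, sigma, rng):
    """Multiply each value by (1 + sigma * N(0, 1)): noise proportional to magnitude."""
    return [v * (1.0 + sigma * rng.gauss(0.0, 1.0)) for v in values]

PARAMS = (0.83, 3.0, 0.5, 0.01)  # k (1/h), ks (g/L), y (g/g), b (1/h) from the text
times, s_true, x_true = simulate(PARAMS, s0=150.0, x0=1.0, t_end=5.0, n_steps=30)
rng = random.Random(0)  # fixed seed so the noisy dataset is reproducible
s_noisy = add_proportional_noise(s_true, 0.075, rng)
x_noisy = add_proportional_noise(x_true, 0.075, rng)
```

Because s0/ks = 50 here, the factor s/(ks + s) stays close to one over the simulated interval, which is what places the data in the zero-order region.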

3.4. Estimation of parameters from experimental data

The experimental data of toluene depletion by Pseudomonas acinetobacter were measured by Sommer et al. [16]. The experimental measurements as well as the model predicted concentrations (discussed further) are shown in Figs. 3 and 4. The preliminary calculations to estimate the parameters by nonlinear regression are as follows:

1. The initial substrate and biomass concentrations measured were 5 and 0.0263 mg/L. The values of s* and x* were therefore chosen as 5 and 0.0263 mg/L, respectively. The value of k0 was chosen to be 1.

2. The initial guesses for k, ks, η, b and y were chosen to be 41.83 h−1, 4.2 mg/L, 0.5, 0.01 h−1 and 0.12 mg/mg, respectively.

3. Proportional measurement noise of ±15% was assumed due to lack of information on measurement errors.

Fig. 3. Experimental data on toluene depletion by P. acinetobacter measured in a batch reactor by Sommer et al. [16] and predicted concentrations by nonlinear regression.

Fig. 4. Experimental data on growth of P. acinetobacter measured in a batch reactor by Sommer et al. [16] and predicted concentrations by nonlinear regression.

Nonlinear regression was used to estimate the parameters. The parameters were scaled by the procedure discussed previously [17]. The eigenvalue and eigenvector analysis indicated that B was ill determined. The value of B was fixed as zero and the other parameters were estimated. The results are summarized in Tables 10 and 11, from which it can be seen that all the parameters are well determined. The plots of the residuals are shown in Figs. 5 and 6. The standard deviations of the substrate and biomass residuals were computed to be 0.18 and 0.32, respectively. No further analysis is possible on the standard deviation of residuals because no information about measurement errors was reported. Hence, it can be concluded that with the data reported by Sommer et al. [16] the parameters can be estimated by making a few assumptions. However, the estimated parameters should be used with caution.

Table 10
Eigenvalues and eigenvectors of the parameters from nonlinear regression using the data on toluene depletion and growth of P. acinetobacter

Parameter   Eigenvectors
            0.5502E+09a   0.1861E+09a   0.1107E+06a   0.2316E+07a   0.2827E+06a
K           −0.419565     −0.390826     −0.712486     0.361317      0.181752
Ks          0.487035      0.175674      0.084171      0.637933      0.56382
α           −0.466224     −0.136061     0.382755      −0.299325     0.726654
S0          −0.43842      0.876944      −0.167012     0.103363      0.013458
X0t         −0.420934     −0.169856     0.557571      0.601841      −0.347658

The value of b was fixed to be zero.
a Eigenvalues.

Table 11
Estimates of the parameters from nonlinear regression using the data on toluene depletion and growth of P. acinetobacter

Parameter   Value   Standard deviation
K           0.01    1.64E−03
Ks          0.23    6.34E−02
α           0.85    1.02E−01
S0          0.96    7.2E−02
X0t         1.13    2.71E−01

The value of b was fixed to be zero.

Fig. 5. Plot of toluene residuals calculated using parameters estimated by nonlinear regression.

Fig. 6. Plot of P. acinetobacter residuals calculated using the parameters.

Table 12
Eigenvalues and eigenvectors of the parameters from nonlinear regression using the data on toluene depletion and growth of P. acinetobacter

Parameter   Eigenvectors
            0.5272E+09a   0.6109E+08a   0.2446E+07a   0.6769E+03a   0.1497E+03a
K           −0.272646     0.709982      0.431289      0.282401      0.394751
Ks          −0.353157     −0.245559     −0.588549     0.371949      0.574669
α           0.343787      −0.141136     0.179286      −0.574824     0.706629
S0          0.598654      −0.288851     0.303648      0.671891      0.120576
X0t         0.569533      0.57643       −0.585879     0.006565      −0.007966

The value of b was fixed to be zero. The measurement errors on toluene and P. acinetobacter concentrations were assumed to be constant and equal to each other.
a Eigenvalues.

Table 13
Estimates of the parameters from nonlinear regression using the data on toluene depletion and growth of P. acinetobacter

Parameter   Value   Standard deviation
K           0.028   2.62E−03
Ks          3.66    3.56E+01
α           8.04    5.85E+01
S0          0.39    2.3E+00
X0t         1       1.07E+00

The value of b was fixed to be zero. The measurement errors on toluene and P. acinetobacter concentrations were assumed to be constant and equal to each other.

An analysis was also performed in which the parameters for the data reported by Sommer et al. [16] were determined assuming a constant standard deviation for both substrate and biomass data.

The objective function minimized was:

$$q = \sum_{i=1}^{n} (S_{\mathrm{model}} - S_{\mathrm{data}})_i^{2} + \sum_{j=1}^{m} (X_{\mathrm{model}} - X_{\mathrm{data}})_j^{2}$$

The results are summarized in Tables 12 and 13. It can be concluded that not all of the parameters are well determined. The conclusion here is very different from the conclusion reached when proportional noise was assumed. Therefore, some information about the measurement errors is absolutely essential to estimate parameters with any validity.
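The difference between the two noise assumptions is easiest to see by writing both objective functions side by side. The sketch below is illustrative only: `constant_objective` follows the unweighted sum of squares q given above, while `proportional_objective` is an assumed form of Eq. (9), built from the relative residuals e1i and e2i and the standard deviations σ1 and σ2 as they are defined in the text. The function names and the small example data are hypothetical.

```python
def constant_objective(s_model, s_data, x_model, x_data):
    """Unweighted sum of squares: constant, equal measurement error assumed."""
    q = sum((sm - sd) ** 2 for sm, sd in zip(s_model, s_data))
    q += sum((xm - xd) ** 2 for xm, xd in zip(x_model, x_data))
    return q

def proportional_objective(s_model, s_data, x_model, x_data,
                           sigma1=0.075, sigma2=0.075):
    """Assumed Eq. (9) form: relative residuals scaled by their standard deviation."""
    q = sum((((sm - sd) / sd) / sigma1) ** 2 for sm, sd in zip(s_model, s_data))
    q += sum((((xm - xd) / xd) / sigma2) ** 2 for xm, xd in zip(x_model, x_data))
    return q

# Hypothetical example: the model overshoots both substrate points by 10%.
s_data, s_model = [5.0, 0.05], [5.5, 0.055]
q_constant = constant_objective(s_model, s_data, [], [])
q_proportional = proportional_objective(s_model, s_data, [], [])
print(q_constant, q_proportional)
```

With constant weighting the large measurement contributes almost all of q and the small one is effectively ignored, whereas with proportional weighting both points contribute equally. This change in which data points drive the fit is one way to understand why the two assumptions lead to different conclusions about parameter identifiability.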

Experimental data on glucose depletion by T. viride were reported by Nihtila and Virkkunen [8]. The preliminary calculations to estimate the parameters by nonlinear regression are as follows:

1. The initial substrate and biomass concentrations measured were 24,500 and 400 mg/L. The values of s* and x* were therefore chosen as 24,500 and 400 mg/L, respectively. The value of k0 was chosen to be 1.

2. The initial guesses for k, ks, η, b and y were chosen to be 0.490 day−1, 24,500 mg/L, 0.5, 1.22 day−1 and 0.00337 mg/mg, respectively.

3. Proportional measurement noise of ±15% was assumed due to lack of information on measurement errors.

Nonlinear regression was used to estimate the parameters. The eigenvalues and the corresponding eigenvectors are shown in Table 14 and the parameter estimates are given in Table 15. The eigenvector associated with the smallest eigenvalue (0.1564E+00) is:

$$0.019012\,K + 0.862231\,K_s + 0.505609\,\alpha + 0.01463\,B + 0.001196\,S_0 - 0.018429\,X_{0t}$$

This eigenvector is dominated by the Ks and α components. Therefore, either α or Ks has to be fixed. No further analysis was performed because neither of these parameters is known.

Table 14
Eigenvalues and eigenvectors of the parameters from nonlinear regression using data on glucose depletion and growth of T. viride in a batch reactor [8]

Parameter   Eigenvectors
            0.2373E+06a   0.9077E+05a   1.1291E+04a   0.4475E+03a   0.6164E+02a   0.1564E+00a
K           0.992563      −0.108429     −0.044309     −0.012636     0.024029      0.019012
Ks          −0.044936     −0.099735     −0.126181     −0.378559     0.292168      0.862231
α           0.040997      0.175292      0.249661      0.618305      −0.517029     0.505609
B           −0.000403     0.080443      0.057238      0.603597      0.791018      0.01463
S0          0.091737      0.957299      −0.221523     −0.156888     0.038416      0.001196
X0t         0.051987      0.157089      0.931362      −0.292042     0.139845      −0.018429

a Eigenvalues.

Table 15
Estimates of the parameters from nonlinear regression using data on glucose depletion and growth of T. viride in a batch reactor [8]

Parameter   Value   Standard deviation
K           0.42    8.59E−01
Ks          4.63    1.02E+01
α           12.97   2.49E+01
B           1.31    4.37E−01
S0          1.09    1.28E−01
X0t         0.78    2.55E−01
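The way the smallest-eigenvalue eigenvector of Table 14 points to Ks and α can be sketched in Python. This is an illustration only: the numerical values are copied from Table 14, while the function name and the cut-off of 0.3 used to call a component "dominant" are assumptions made for this example and not part of the published procedure.

```python
import math

PARAMETERS = ["K", "Ks", "alpha", "B", "S0", "X0t"]
# Eigenvector belonging to the smallest eigenvalue in Table 14.
SMALLEST_EIGENVECTOR = [0.019012, 0.862231, 0.505609, 0.01463, 0.001196, -0.018429]
LARGEST_EIGENVALUE = 0.2373e06
SMALLEST_EIGENVALUE = 0.1564e00

def dominant_parameters(names, vector, threshold=0.3):
    """Names whose eigenvector component is at least `threshold` in magnitude."""
    return [name for name, component in zip(names, vector)
            if abs(component) >= threshold]

# A unit-length eigenvector is a quick consistency check on the tabulated values.
norm = math.sqrt(sum(component ** 2 for component in SMALLEST_EIGENVECTOR))
# Ratio of extreme eigenvalues: a large value signals a poorly determined direction.
condition_ratio = LARGEST_EIGENVALUE / SMALLEST_EIGENVALUE
dominant = dominant_parameters(PARAMETERS, SMALLEST_EIGENVECTOR)
print(dominant, f"norm = {norm:.4f}", f"ratio = {condition_ratio:.3g}")
```

The eigenvalue ratio exceeds 10^6, and the weakly determined direction is almost entirely a combination of Ks and α, which is why fixing one of these two parameters is the natural remedy.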

4. Conclusions

1. A systematic procedure to formulate the objective function to be minimized, estimate parameters along with their standard deviations, identify the uniqueness of the parameters estimated and verify the assumptions on the standard deviation of the residuals has been developed and applied to Monod kinetics. This procedure can be applied to any model (coupled or nonlinear) to estimate parameters by nonlinear regression.

2. It is not possible to determine all the parameters of Monod kinetics by measuring the substrate and biomass concentrations from a single batch experiment when the measurement noise is proportional to the magnitude of the measurement. The number of parameters that can be determined when the measurement noise is proportional to the magnitude of the measurement depends on the region of substrate depletion (first-order, mixed-order or zero-order).

3. In the first-order region, either k or ks has to be fixed. This is not very useful because neither of the parameters is known a priori. In the mixed-order and in the zero-order region, all the parameters except b can be determined. This analysis was performed with a sufficiently large number of equally spaced data points.

4. The assumptions on standard deviation of the measurement noise were verified by computing them after the parameters were estimated.

5. The parameter estimation using measured data confirms that all the parameters cannot be uniquely determined from a single batch experiment. Furthermore, it was proved that parameter estimates using data whose measurement noise is not known should be viewed with caution.

References

[1] J. Monod, The growth of bacterial cultures, Annu. Rev. Microbiol. 3 (1949) 371–394.

[2] D. Herbert, Recent Progress in Microbiology, Almqvist & Wiksell, Stockholm, 1958.

[3] P.L. McCarty, Advances in Water Quality Improvement, University of Texas Press, Austin, TX, 1968.

[4] P.L. McCarty, Organic Compounds in Aquatic Environments, Marcel Dekker, New York, NY, 1971.

[5] H. Moser, The Dynamics of Bacterial Population Maintained in the Chemostat, The Carnegie Institution, Washington, DC, 1958.

[6] E.O. Powell, Microbial Physiology and Continuous Culture, Her Majesty’s Stationery Office, London, UK, 1967, pp. 34–56.

[7] F.G. Hineken, H.M. Tsuchiya, R. Aris, On the accuracy of determining rate constants in enzymatic reactions, Math. Biosci. 1 (1967) 115–141.

[8] M. Nihtila, J. Virkkunen, Practical identifiability of growth and substrate consumption models, Biotechnol. Bioeng. 21 (1977) 1831–1850.

[9] A. Holmberg, On the practical identifiability of microbial growth models incorporating Michaelis–Menten type nonlinearities, Math. Biosci. 62 (1982) 23–43.

[10] A. Corman, A. Pave, On parameter estimation of Monod’s bacterial growth model from batch culture data, J. Gen. Appl. Microbiol. 29 (1983) 91–101.

[11] J.A. Robinson, M. Tiedje, Nonlinear estimation of Monod growth kinetic parameters from a single substrate depletion curve, Appl. Environ. Microbiol. 45 (5) (1983) 1453–1458.

[12] S. Simkins, M. Alexander, Models for mineralization kinetics with the variables of substrate concentration and population density, Appl. Environ. Microbiol. 47 (6) (1984) 1299–1306.

[13] J.A. Robinson, Advances in Microbial Ecology, vol. 8, Plenum Press, New York, 1985.

[14] M. Baltes, R. Schneider, C. Sturm, M. Reuss, Optimal experimental design for parameter estimation in unstructured growth models, Biotechnol. Prog. 10 (1994) 480–488.

[15] A. Munack, Reprints of the 4th International Congress on Computer Applications in Fermentation Technology, Ellis Horwood, Chichester, UK, 1988, pp. 195–204.

[16] H.M. Sommer, H. Holst, H. Spliid, Nonlinear parameter estimation in microbiological degradation and statistic test for common estimation, Environ. Int. 30 (5) (1995) 551–556.

[17] P. Kesavan, V.J. Law, Parameter estimation in Monod kinetics without biomass data and initial substrate concentration, Chem. Eng. Commun. 167 (1998) 107–132.

[18] Y. Bard, Nonlinear Parameter Estimation, Academic Press, New York, 1974.

[19] W.G. Hunter, Estimation of unknown constants from multiresponse data, Ind. Eng. Chem. Fundam. 6 (3) (1967) 461–463.

[20] V.J. Law, CONREG: A System for Unconstrained Nonlinear Regression and Equation Solving. User’s Manual, Tulane University, New Orleans, LA, 1992.

[21] V.J. Law, R.H. Fariss, Transformational discrimination for unconstrained optimization, Ind. Eng. Chem. Fundam. 11 (1972) 154–161.

[22] V.J. Law, Y. Sharma, Computation of the gradient and sensitivity coefficients in sum of squares minimization problems with differential equation models, Comput. Chem. Eng. 21 (12) (1997) 1471–1479.

[23] D.E. Knuth, The Art of Computer Programming, Addison-Wesley Publishing Company, Philippines, 1981.

[24] P. Kesavan, Ph.D. thesis, Tulane University, New Orleans, LA, 1998.