Parameter Estimation for Mechanical Systems via an...


Page 1: Parameter Estimation for Mechanical Systems via an ...people.cs.vt.edu/asandu/Deposit/draft_2009_pc-bayesian.pdf · 1 Parameter Estimation for Mechanical Systems via an Explicit Representation


Parameter Estimation for Mechanical Systems via an Explicit Representation of Uncertainty

Emmanuel Blanchard
Department of Mechanical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA

Adrian Sandu
Computer Science Department, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA

Corina Sandu
Department of Mechanical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA

Abstract

Purpose – To propose a new computational approach for parameter estimation in the Bayesian framework. A posteriori PDFs are obtained using the polynomial chaos theory for propagating uncertainties through system dynamics. The new method has the advantage of being able to deal with large parametric uncertainties, non-Gaussian probability densities, and nonlinear dynamics.

Design/methodology/approach – The maximum likelihood estimates are obtained by minimizing a cost function derived from Bayes' theorem. Direct stochastic collocation is used as a less computationally expensive alternative to the traditional Galerkin approach to propagate the uncertainties through the system in the polynomial chaos framework.

Findings – The new approach is explained and is applied to very simple mechanical systems in order to illustrate how the Bayesian cost function can be affected by the noise level in the measurements, by undersampling, non-identifiability of the system, non-observability, and by excitation signals that are not rich enough. When the system is non-identifiable and a priori knowledge of the parameter uncertainties is available, regularization techniques can still yield the most likely values among the possible combinations of uncertain parameters resulting in the same time responses as the ones observed.

Originality/value – The polynomial chaos method has been shown to be considerably more efficient than Monte Carlo in the simulation of systems with a small number of uncertain parameters. To the best of our knowledge, it is the first time the polynomial chaos theory has been applied to Bayesian estimation.

Keywords – Parameter Estimation, Polynomial Chaos, Collocation, Halton/Hammersley Algorithm, Bayesian Estimation, Vehicle Dynamics

Paper type – Research paper


Nomenclature

$A$ = matrix of basis function values at the collocation points
$A^{\#}$ = Moore-Penrose pseudo-inverse of $A$
$c^i$ = stochastic coefficient for the ith polynomial chaos mode ($x^i$ for displacement and $v^i$ for velocities)
$F_0$ = uncertain amplitude of the forcing function (N)
$h$ = observation operator
$H$ = matrix converting the states of the model to the observable parameters of the system
$H_k$ = linearization of $h$ about the solution $y_k$
$K$ = stiffness of the spring (N m$^{-1}$)
$K_0$ = mean value for the stiffness distribution (N m$^{-1}$)
$\mathcal{M}$ = model solution operator which integrates the model equations forward in time
$M$ = mass of the body (kg)
$M_0$ = mean value for the mass distribution (kg)
$n$ = number of random variables
$n_O$ = number of observations
$n_P$ = number of parameters
$n_S$ = dimension of the state vector
$N$ = number of time points at which measurements are available
$\mathcal{N}$ = Normal distribution
$p$ = maximum order of the polynomial basis
$P$ = Probability Density Function (PDF)
$P^{(1d)}$ = 1-dimensional polynomial
$Q$ = number of collocation points
$R_k$ = covariance matrix of the uncertainty associated with the measurement noise at time step $k$
$S$ = number of terms used in polynomial chaos expressions
$t$ = time (s)
$t_0$ = initial time (s)
$t_k$ = time at the kth measurement point (s)
$t_F$ = final time (s)
$v$ = velocity (m s$^{-1}$)
$x$ = displacement (m)
$y_k$ = the state of the model at time $t_k$
$z_k$ = observation vector at time $t_k$
$Z$ = second order random process

Greek symbols

$\varepsilon_k$ = measurement noise at time index $k$
$\mu_i$ = ith collocation point
$\theta$ = set of uncertain parameters
$\vartheta$ = random event
$\rho(\xi)$ = probability distribution of the multi-dimensional random variable $\xi$
$\sigma_K$ = standard deviation for the stiffness distribution
$\sigma_M$ = standard deviation for the mass distribution
$\sigma_{meas}$ = standard deviation of the measured oscillation frequency $\omega_{obs}$
$\omega$ = frequency of the forcing function (radians s$^{-1}$)
$\omega_n$ = natural frequency (radians s$^{-1}$)
$\omega_{obs}$ = measured oscillation frequency (radians s$^{-1}$)
$\xi$ = multi-dimensional random variable
$\Omega$ = space of possible values for the unknown variables
$\psi$ = generalized Askey-Wiener polynomial chaoses

Subscripts

nom = nominal value, i.e., value obtained when $\xi = 0$

Superscripts

ref = reference value used to generate observations


1. Introduction and background

The polynomial chaos theory has been shown to be consistently more efficient than Monte Carlo

simulations in order to assess uncertainties in mechanical systems (Sandu et al., 2006a–2006b).

This paper extends the polynomial chaos theory to the problem of parameter estimation, and

illustrates it on a simple mass-spring system with uncertain initial velocity. We also present the

application of regularization techniques applied on this system with uncertain stiffness and

uncertain mass.

Parameter estimation is an important problem, because in many applications parameters cannot

be directly measured with sufficient accuracy; this is the case, for example, in real time

applications. Rather, parameter values must be inferred from available measurements of different

aspects of the system response. The theoretical foundations of parameter estimation can be found in

(Tarantola, 2004), (Bishwal, 2008) and (Aster et al., 2004). Parameter estimation finds applications

in many fields, including mechanical engineering (Liu, 2008), material science (Araújo et al.,

2009), aerospace (Pradlwarter et al. 2005), geosciences (Catania and Paladino, 2009), chemical

engineering (Varziri et al., 2008), etc.

Various approaches to parameter estimation are discussed in the literature. These include

energy methods (Liang, 2007), frequency domain methods (Oliveto et al., 2008), and set inversion

via interval analysis (SIVIA) with Taylor expansions (Raїssi et al., 2004). A rigorous framework for

parameter estimation is the Bayesian approach, where probability density functions are

considered representations of uncertainty. The Bayesian approach has been widely used (Mockus et

al., 1997; Thompson and Vladimirov, 2005; Wang and Zabaras, 2005; Khan and Ramuhalli, 2008).

The Bayesian approach consists of estimating a posteriori probabilities of the parameters and

therefore transforming a parameter estimation problem into the problem of finding maximum

likelihood values of the parameters.

Different methodologies to estimate parameters in a Bayesian framework are possible.

Maximum likelihood parameter estimation can be formulated as an optimization problem (typically

large and nonconvex, therefore challenging). It can be numerically solved by gradient methods

(Nocedal and Wright, 2006) or by global optimization methods (Horst et al., 2000; Floudas, 2000;

Horst and Pardalos, 1995; Pardalos and Romeijn, 2002; Liberti and Maculan, 2006). Another

approach to solving the global continuous optimization problem is the use of Evolutionary

Algorithms (EAs) which are inspired by biological evolution (Davis, 1991). Differential Evolution

(DE) techniques are EA techniques that have been used successfully and Estimation of Distribution

Algorithms (EDAs) are a promising new class of EAs (Zhang, 1999). Sun et al. (2005) proposed a

DE/EDA hybrid approach. Another hybrid called estimation of distribution algorithm with local

search (EDA/L) has been developed by Zhang et al. (2004). Zhang et al. (2005) also proposed an

evolutionary algorithm with guided mutation (EA/G).

Another Bayesian parameter estimation method is the Kalman Filter (Kalman, 1960), which

is optimal for linear systems with Gaussian noise. The Extended Kalman Filter (EKF) allows for

nonlinear models and observations by assuming that the error propagation is linear (Evensen,

1992, 1993). The Ensemble Kalman Filter (EnKF) is a Monte Carlo approximation of

the Kalman filter suitable for large problems (Evensen, 1994). In the context of stochastic

optimization, propagation of uncertainties can be represented using Probability Density

Functions (PDFs). The Kalman Filter and its approximations estimate the states and their


uncertainties through covariance matrices at the same time. In order to approximate PDFs

propagated through the system, linearization using the EKF (Blanchard et al., 2007b) and Monte

Carlo techniques using the EnKF (Saad et al., 2007) are common approaches.

Parameter estimation is especially difficult for large systems, where a considerable

computational effort is needed. Estimating a large number of parameters often proves to be

computationally intractable. This has led to the development of techniques to determine which

parameters affect the system’s dynamics the most and are important to estimate (Sohns et al., 2006).

Sohns et al. (2006) proposed the use of activity analysis as an alternative to sensitivity-based and

principal component-based techniques. Their approach combines the advantages of the sensitivity-

based techniques (i.e., being efficient for large models) and the component-based techniques (i.e.,

keeping parameters that can be physically interpreted). Zhang and Lu (2004) combined the

Karhunen–Loeve decomposition and perturbation methods with polynomial expansions in order to

evaluate higher-order moments for saturated flow in randomly heterogeneous porous media.

In this study, we develop a maximum likelihood parameter estimation approach in a

Bayesian framework. A posteriori PDFs are obtained using the polynomial chaos theory for

propagating uncertainties through system dynamics. The method has the advantage of being able to

deal with large parametric uncertainties, non-Gaussian probability densities, and nonlinear system

dynamics.

The polynomial chaos method started to gain traction after Ghanem and Spanos (1990, 1991,

1993, 2003) applied it successfully to the study of uncertainties in structural mechanics and

vibration using Wiener-Hermite polynomials. Xiu extended the approach to general formulations

based on the Wiener-Askey family of polynomials (Xiu and Karniadakis, 2002a), and applied it to fluid

mechanics (Xiu et al., 2002; Xiu and Karniadakis, 2002b, 2003). Sandu et al. were the first to apply

the polynomial chaos method to multibody dynamic systems (Sandu et al., 2004, 2005, 2006a,

2006b), terramechanics (Li et al., 2005; Sandu et al., 2006c), and parameter estimation (Blanchard

et al., 2007a, 2007b). Saad et al. (2007) coupled the polynomial chaos theory with the Ensemble

Kalman Filter (EnKF) to identify unknown variables in a non-parametric stochastic representation

of the non-linearities in a shear building model. Their identification method proved to be an

effective way of accurately detecting changes in the behavior of a system affected by both

measurement noise and modeling noise.

The generalized polynomial chaos theory is explained by Sandu et al. (2006a), in which direct

stochastic collocation is proposed as a less expensive alternative to the traditional Galerkin

approach. It is desirable to have more collocation points than polynomial coefficients to solve for.

In that case a least-squares algorithm is used to solve the system with more equations than

unknowns.

The fundamental idea of the polynomial chaos approach is that random processes of interest can be

approximated by sums of orthogonal polynomial chaoses of random independent variables. In this

context, any uncertain parameter can be viewed as a second order random process (processes with

finite variance; from a physical point of view they have finite energy). Thus, a second order random

process $Z(\vartheta)$, viewed as a function of the random event $\vartheta$, can be expanded in terms of orthogonal polynomial chaos (Ghanem and Spanos, 2003) as:

$$Z(\vartheta) = \sum_{j=1}^{\infty} c^j\,\psi^j(\xi(\vartheta)) \tag{1}$$


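Here $\psi^j(\xi_1, \ldots, \xi_n)$ are generalized Askey-Wiener polynomial chaoses, in terms of the multi-dimensional random variable $\xi = (\xi_1, \ldots, \xi_n) \in \Omega \subseteq \mathbb{R}^n$, where $\Omega$ is the space of possible values for the unknown variables. The Askey-Wiener polynomial chaoses form a basis with respect to the joint probability density $\rho(\xi_1, \ldots, \xi_n)$ in the ensemble inner product

$$\langle \psi^i, \psi^j \rangle = \int_{\Omega} \psi^i(\xi)\,\psi^j(\xi)\,\rho(\xi)\,d\xi .$$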

The basis functions are selected depending on the probability distribution of the random variables. For

Gaussian random variables the basis functions are Hermite polynomials, for uniformly distributed

random variables the basis functions are Legendre polynomials, for beta distributed random

variables the basis functions are Jacobi polynomials, and for gamma distributed random variables

the basis functions are Laguerre polynomials (Xiu and Karniadakis, 2002a, 2003). In practice, a

truncated expansion of equation (1) is used,

$$Z = \sum_{j=1}^{S} c^j\,\psi^j(\xi) \tag{2}$$

For independent random variables $\xi_1, \ldots, \xi_n$, the multi-dimensional basis functions are tensor products of 1-dimensional polynomial bases:

$$\psi^j(\xi_1, \ldots, \xi_n) = \prod_{k=1}^{n} P^{(1d)}_{l_k}(\xi_k), \qquad j = 1, 2, \ldots, S; \quad l_k = 1, 2, \ldots, p \tag{3}$$

where $S = \dfrac{(n+p)!}{n!\,p!}$, $P^{(1d)}$ is a 1-dimensional polynomial, $n$ is the number of random variables, and $p$ is the maximum order of the polynomial basis. The total number of terms $S$ increases rapidly with $n$ and $p$.
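To make the growth of the number of terms concrete, the following minimal sketch evaluates the formula S = (n + p)! / (n! p!) for a few values of n and p. The function name `num_pc_terms` is a hypothetical helper introduced only for illustration.

```python
from math import comb


def num_pc_terms(n: int, p: int) -> int:
    """Number of terms S = (n + p)! / (n! p!) for n random variables
    and maximum polynomial order p (hypothetical helper name)."""
    return comb(n + p, p)


if __name__ == "__main__":
    # One row per number of random variables n, columns for p = 1, 2, 3, 5.
    for n in (1, 2, 5, 10):
        print(n, [num_pc_terms(n, p) for p in (1, 2, 3, 5)])
```

For a single random variable and order 5 the formula gives six terms, while ten random variables at the same order already require several thousand.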

In the deterministic case, a second order system can be described by the following Ordinary

Differential Equation (ODE):

$$\dot{x}(t) = v(t), \qquad \dot{v}(t) = F\big(t, x(t), v(t), \theta\big), \qquad x(t_0) = x_0, \quad v(t_0) = v_0 \tag{4}$$

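In the stochastic framework developed in this study, the displacement $x(t)$, the velocity $v(t)$, and the set of parameters $\theta \in \mathbb{R}^{n_P}$ (the set of $n_P$ parameters being possibly uncertain) of a second order system can be expanded using equation (2) as:

$$x_m(t,\xi) = \sum_{i=1}^{S} x_m^i(t)\,\psi^i(\xi), \quad v_m(t,\xi) = \sum_{i=1}^{S} v_m^i(t)\,\psi^i(\xi), \quad \theta_l(\xi) = \sum_{i=1}^{S} \theta_l^i\,\psi^i(\xi), \qquad 1 \le m \le n_S, \quad 1 \le l \le n_P \tag{5}$$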

We use subscripts to index system components and superscripts to index stochastic modes. Inserting

Eq. (5) in the deterministic system of equations leads to:

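$$\dot{x}_m^j(t) = v_m^j(t), \qquad 1 \le m \le n_S, \quad 1 \le j \le S, \quad t_0 \le t \le t_F,$$

$$\sum_{j=1}^{S} \dot{v}_m^j(t)\,\psi^j(\xi) = F_m\Big(t, \sum_{k=1}^{S} x^k(t)\,\psi^k(\xi), \sum_{k=1}^{S} v^k(t)\,\psi^k(\xi), \sum_{k=1}^{S} \theta^k\,\psi^k(\xi)\Big), \qquad x_m^j(t_0) = x_{m,0}^j \tag{6}$$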


where $m$ is a state index and $n_S$ is the dimension of the state vector.

To derive evolution equations for the stochastic coefficients $x_m^i(t)$ we impose that equation (6) holds at a given set of collocation points $\mu_1, \ldots, \mu_Q$. This leads to:

$$\dot{x}_m^i = v_m^i, \qquad \sum_{j=1}^{S} A_{i,j}\,\dot{v}_m^j = F_m\Big(t, \sum_{k=1}^{S} A_{i,k}\,x^k, \sum_{k=1}^{S} A_{i,k}\,v^k, \sum_{k=1}^{S} A_{i,k}\,\theta^k\Big), \qquad 1 \le i \le Q \tag{7}$$

where $A$ represents the matrix of basis function values at the collocation points:

$$A = (A_{i,j}), \qquad A_{i,j} = \psi^j(\mu_i), \qquad 1 \le i \le Q, \quad 1 \le j \le S \tag{8}$$

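The collocation points have to be chosen such that $Q \ge S$ and $A$ has full column rank. Let $A^{\#}$ be the Moore-Penrose pseudo-inverse of $A$. With $X^i = \sum_{k=1}^{S} A_{i,k}\,x^k$, $V^i = \sum_{k=1}^{S} A_{i,k}\,v^k$, and $\Theta^i = \sum_{k=1}^{S} A_{i,k}\,\theta^k$, the collocation system can be written as:

$$\dot{X}^i = V^i, \qquad \dot{V}^i = F\big(t, X^i, V^i, \Theta^i\big), \qquad 1 \le i \le Q \tag{9}$$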

After integration, the stochastic solution coefficients are recovered using:

$$x^i(t) = \sum_{j=1}^{Q} A^{\#}_{i,j}\,X^j(t), \qquad v^i(t) = \sum_{j=1}^{Q} A^{\#}_{i,j}\,V^j(t), \qquad 1 \le i \le S \tag{10}$$
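The recovery step can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a Legendre basis (appropriate for a uniformly distributed variable on [-1, 1]), equally spaced collocation points, and a made-up response that is linear in the random variable; the names `collocation_matrix` and `pc_coefficients` are hypothetical. Note that the code indexes the basis from order 0, whereas the text numbers the modes from 1.

```python
import numpy as np
from numpy.polynomial import legendre


def collocation_matrix(points, num_terms):
    """A[i, j] = value of the j-th Legendre polynomial (orders 0..S-1)
    at collocation point mu_i; shape (Q, S)."""
    return legendre.legvander(np.asarray(points, dtype=float), num_terms - 1)


def pc_coefficients(points, values, num_terms):
    """Least-squares recovery of the S coefficients from Q >= S point values
    using the Moore-Penrose pseudo-inverse of A."""
    A = collocation_matrix(points, num_terms)
    return np.linalg.pinv(A) @ np.asarray(values, dtype=float)


if __name__ == "__main__":
    Q, S = 12, 6
    mu = np.linspace(-0.9, 0.9, Q)   # assumed collocation points
    X = 1.0 + 0.5 * mu               # assumed response values at those points
    print(pc_coefficients(mu, X, S)) # close to [1, 0.5, 0, 0, 0, 0]
```

Using more points than coefficients makes the system overdetermined, and the pseudo-inverse then returns the least-squares solution.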

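Assuming that $\psi^1$ is the constant (zeroth order) term in the polynomial expansion, the mean values of $x(t)$ and $v(t)$ are $\langle x(t) \rangle = x^1(t)\,\psi^1$ and $\langle v(t) \rangle = v^1(t)\,\psi^1$, respectively. The standard deviations of $x(t)$ and $v(t)$ are given by:

$$\sigma_x(t) = \sqrt{\int_{\Omega} \Big(\sum_{i=2}^{S} x^i(t)\,\psi^i(\xi)\Big)^2 \rho(\xi)\,d\xi}, \qquad \sigma_v(t) = \sqrt{\int_{\Omega} \Big(\sum_{i=2}^{S} v^i(t)\,\psi^i(\xi)\Big)^2 \rho(\xi)\,d\xi} \tag{11}$$

When the basis functions are orthogonal polynomials, the standard deviations of $x(t)$ and $v(t)$ are given by:

$$\sigma_x(t) = \sqrt{\sum_{i=2}^{S} \big(x^i(t)\big)^2 \langle \psi^i, \psi^i \rangle}, \qquad \sigma_v(t) = \sqrt{\sum_{i=2}^{S} \big(v^i(t)\big)^2 \langle \psi^i, \psi^i \rangle} \tag{12}$$

When the basis functions are orthonormal, the standard deviations of $x(t)$ and $v(t)$ are given by:

$$\sigma_x(t) = \sqrt{\sum_{i=2}^{S} \big(x^i(t)\big)^2}, \qquad \sigma_v(t) = \sqrt{\sum_{i=2}^{S} \big(v^i(t)\big)^2} \tag{13}$$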

The Probability Density Functions (PDFs) of $x(t)$ and $v(t)$ are obtained by drawing histograms of

their values using a Monte Carlo simulation and normalizing the area under the curves that are

obtained. It is not computationally expensive since the Monte Carlo simulation is run on the final

result, and not for the whole process. For instance, the ODE is run the same number of times as the

number of collocation points, which is typically much lower than the number of runs used for the

Monte Carlo simulation.
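A minimal sketch of this post-processing step follows, assuming a single random variable that is uniform on [-1, 1] with a Legendre basis, for which the squared norm of the order-i polynomial under the density 1/2 is 1/(2i + 1). The coefficient values and the helper names are made up for illustration, and the code indexes modes from 0 rather than from 1. The Monte Carlo sampling touches only the cheap polynomial surrogate, not the ODE.

```python
import math
import random


def legendre_values(xi, num_terms):
    """Return [P_0(xi), ..., P_{S-1}(xi)] using the three-term recurrence."""
    vals = [1.0, xi]
    for n in range(1, num_terms - 1):
        vals.append(((2 * n + 1) * xi * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[:num_terms]


def pc_mean_std(coeffs):
    """Mean and standard deviation for an orthogonal (not normalized) basis:
    the variance is the sum of c_i^2 <P_i, P_i> over the non-constant modes."""
    mean = coeffs[0]
    var = sum(c * c / (2 * i + 1) for i, c in enumerate(coeffs) if i >= 1)
    return mean, math.sqrt(var)


def sample_surrogate(coeffs, num_samples, seed=0):
    """Draw samples of the expansion by sampling xi ~ Uniform(-1, 1)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        basis = legendre_values(rng.uniform(-1.0, 1.0), len(coeffs))
        samples.append(sum(c * b for c, b in zip(coeffs, basis)))
    return samples


if __name__ == "__main__":
    coeffs = [1.0, 0.5, 0.2]  # assumed coefficients at one time instant
    print(pc_mean_std(coeffs))
    samples = sample_surrogate(coeffs, 20000)
    print(sum(samples) / len(samples))
```

Binning the samples into a histogram and normalizing its area gives the PDF estimate described in the text.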


2. Bayesian approach for parameter estimation

Optimal parameter estimation combines information from three different sources: the physical laws

of evolution (encapsulated in the model), the reality (as captured by the observations), and the

current best estimate of the parameters. The information from each source is imperfect and has

associated errors. Consider the mechanical system model (6) which advances the state in time

represented in a simpler notation:

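$$y_k = \begin{bmatrix} x_k \\ v_k \end{bmatrix} = \mathcal{M}\big(t_{k-1}, y_{k-1}, \theta\big), \qquad y_0 = y(t_0), \qquad k = 1, 2, \ldots, N \tag{14}$$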

The state of the model $y_k \in \mathbb{R}^{n_S}$ at time moment $t_k$ depends implicitly on the set of parameters $\theta \in \mathbb{R}^{n_P}$, possibly uncertain (the model has $n_S$ states and $n_P$ parameters). $\mathcal{M}$ is the model solution operator which integrates the model equations forward in time (starting from state $y_{k-1}$ at time $t_{k-1}$ to state $y_k$ at time $t_k$). $N$ is the number of time points at which measurements are available.

For parameter estimation it is convenient to formally extend the model state to include the

model parameters and extend the model with trivial equations for parameters (such that parameters

do not change during the model evolution)

$$\theta_k = \theta_{k-1} \tag{15}$$

The optimal estimation of the uncertain parameters is thus reduced to the problem of optimal state

estimation. We assume that observations of quantities that depend on system state are available at

discrete times $t_k$

$$z_k = h(y_k) + \varepsilon_k \approx H_k\,y_k + \varepsilon_k, \qquad \langle \varepsilon_k \rangle = 0, \qquad \langle \varepsilon_k\,\varepsilon_k^T \rangle = R_k \tag{16}$$

where $z_k \in \mathbb{R}^{n_O}$ is the observation vector at $t_k$, $h$ is the (model equivalent) observation operator, $H_k$ is the linearization of $h$ about the solution $y_k$, and $\varepsilon_k$ is the measurement noise at time index $k$. Note that there are $n_O$ observations for the $n_S$-dimensional state vector, and that typically $n_O < n_S$. Each observation is corrupted by observational (measurement and representativeness) errors (Cohn, 1997). We denote by $\langle \cdot \rangle$ the ensemble average over the uncertainty space. The observational error

is the experimental uncertainty associated with the measurements and is usually considered to have

a Gaussian distribution with zero mean and a known covariance matrix kR .

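Using polynomial chaoses the uncertain parameters are modeled explicitly as functions of a set of random variables, $\theta = \theta(\xi) \in \mathbb{R}^{n_P}$, with a joint probability density function $\rho(\xi)$. The explicit dependency of the system state on the random variables is obtained via a collocation approach:

$$\theta(\xi) = \sum_{i=1}^{S} \theta^i\,\psi^i(\xi), \qquad y_k(\xi) = y(t_k, \xi) = \sum_{i=1}^{S} y^i(t_k)\,\psi^i(\xi) \tag{17}$$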

We adopt the point of view that the “state of knowledge” about the uncertain parameters can be

described by probability densities. From Bayes’ rule the probability density of the parameter

distribution conditioned by all observations is

$$P[y_N \,|\, z_N \ldots z_0] = \frac{P[z_N \,|\, y_N]\;P[y_N \,|\, z_{N-1} \ldots z_0]}{\int P[z_N \,|\, y_N]\;P[y_N \,|\, z_{N-1} \ldots z_0]\,dy_N} \tag{18}$$

where $P[z_N \,|\, y_N]$ is the Probability Density Function (PDF) of the latest observational error (taken at time $t_N$), $P[y_N \,|\, z_{N-1} \ldots z_0]$ is the “model forecast PDF” conditioned by all previous observations (taken at times $t_0$ to $t_{N-1}$), and $P[y_N \,|\, z_N \ldots z_0]$ is the “assimilated PDF”. The assimilated PDF represents the a posteriori probability of the parameters after all the observations have been taken into account.

For simplicity denote by $y$ the current state of the system (the best estimation obtained using all previous observations $z_{N-1} \ldots z_0$) and by $z = z_N$ the latest, yet-to-be-used set of observations. Moreover, consider that the observational error has a Gaussian distribution with covariance $R_k$ and that the observations at different times are independent. Then Bayes’ formula becomes

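$$P[y \,|\, z] = \frac{P[z \,|\, y]\,P[y]}{\int_{\mathbb{R}^{n_S}} P[z \,|\, y]\,P[y]\,dy} = \frac{\exp\!\Big(-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k)^T R_k^{-1} (z_k - H_k y_k)\Big)\,P[y]}{\int_{\mathbb{R}^{n_S}} \exp\!\Big(-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k)^T R_k^{-1} (z_k - H_k y_k)\Big)\,P[y]\,dy} \tag{19}$$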

The unconditional probability density $P[y]$ is the PDF of the current system state, and is implicitly represented by the polynomial chaos expansion of the state $y = y(\xi)$. Moreover, integration against this probability density can be evaluated by integration in the independent random variables

$$\int_{\mathbb{R}^{n_S}} f(y)\,P[y]\,dy = \int_{\Omega} f\big(y(\xi)\big)\,\rho(\xi)\,d\xi \quad \text{for any } f \tag{20}$$

The denominator can be evaluated by a multidimensional integration. However, in our

approach, there is no need to evaluate this scaling factor, since its omission does not change the

estimation procedure. (The omission of this scaling factor is equivalent to adding a constant to the

function we minimize, and this does not change the result of the minimization procedure).

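The mean of the best state estimate that uses the new observations $z$ is obtained from Bayes’ formula as

$$\bar{y} = \int_{\mathbb{R}^{n_S}} y\,P[y \,|\, z]\,dy = \frac{1}{\text{den}} \int_{\mathbb{R}^{n_S}} y\,e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k)^T R_k^{-1} (z_k - H_k y_k)}\,P[y]\,dy = \frac{1}{\text{den}} \int_{\Omega} y(\xi)\,e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k(\xi))^T R_k^{-1} (z_k - H_k y_k(\xi))}\,\rho(\xi)\,d\xi \tag{21}$$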

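For parameter estimation, Bayes’ formula specializes to:

$$P[\theta \,|\, z] = \frac{P[z \,|\, \theta]\,P[\theta]}{\text{den}} = \frac{e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k)^T R_k^{-1} (z_k - H_k y_k)}\,P[\theta]}{\int_{\mathbb{R}^{n_P}} e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k)^T R_k^{-1} (z_k - H_k y_k)}\,P[\theta]\,d\theta} \tag{22}$$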

Note that the a posteriori probability defined by Bayes’ formula can be written (in principle) as a function of the independent random variables

$$\hat{\rho}(\xi) = P[\theta(\xi) \,|\, z] = \frac{e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k(\xi))^T R_k^{-1} (z_k - H_k y_k(\xi))}\,\rho(\xi)}{\int_{\Omega} e^{-\frac{1}{2}\sum_{k=0}^{N} (z_k - H_k y_k(\xi))^T R_k^{-1} (z_k - H_k y_k(\xi))}\,\rho(\xi)\,d\xi} \tag{23}$$

In this setting polynomial chaos is used to model the a priori pdf of the parameters; the Bayes

formula is employed to obtain the a posteriori pdf (i.e., the pdf conditioned by the observations).

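The maximum likelihood estimate is given by that realization of the parameters $\theta(\xi)$ (that value of $\xi$) which maximizes the a posteriori probability $P[\theta \,|\, z]$, or, equivalently, minimizes $-\log P[\theta \,|\, z]$:

$$\min_{\xi} J(\xi) = \frac{1}{2}\sum_{k=0}^{N} \big(z_k - H_k y_k(\xi)\big)^T R_k^{-1} \big(z_k - H_k y_k(\xi)\big) - \log \rho(\xi) \tag{24}$$

Note that for $\xi \notin \Omega$ we have $\rho(\xi) = 0$ and the cost function $J$ becomes infinite. This cost function is composed of two parts:

$$J^{\text{total}}(\xi) = \underbrace{\frac{1}{2}\sum_{k=0}^{N} \big(z_k - H_k y_k(\xi)\big)^T R_k^{-1} \big(z_k - H_k y_k(\xi)\big)}_{J^{\text{mismatch}}} \; \underbrace{-\,\log \rho(\xi)}_{J^{\text{apriori}}} \tag{25}$$

where $J^{\text{mismatch}}$ comes only from the differences between the available measurements and the model response, while $J^{\text{apriori}}$ encapsulates the a priori knowledge of the parameter uncertainty. The value $\hat{\xi} = \arg\min J$ minimizing the cost function (25) gives the most likely values of our uncertain parameters as $\hat{\theta} = \theta(\hat{\xi})$.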
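The two-part cost function can be sketched for the simplest possible setting: one random variable, scalar observations, and a brute-force search over a grid. Everything specific here is an assumption made for illustration, including the uniform prior density, the toy model, the grid search (a stand-in for a proper optimizer), and the function names.

```python
import math


def uniform_density(xi):
    """Assumed prior: uniform density on [-1, 1], zero outside."""
    return 0.5 if -1.0 <= xi <= 1.0 else 0.0


def cost(xi, z, model, r, density=uniform_density):
    """J(xi) = 1/2 sum_k (z_k - y_k(xi))^2 / R_k - log rho(xi)
    for scalar observations z_k with variances R_k."""
    rho = density(xi)
    if rho <= 0.0:
        return math.inf  # outside the support the cost is infinite
    y = model(xi)
    mismatch = 0.5 * sum((zk - yk) ** 2 / rk for zk, yk, rk in zip(z, y, r))
    return mismatch - math.log(rho)


def argmin_on_grid(z, model, r, num=2001):
    """Brute-force minimizer of J over an equally spaced grid on [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (num - 1) for i in range(num)]
    return min(grid, key=lambda xi: cost(xi, z, model, r))


if __name__ == "__main__":
    a = [0.3, -0.7, 1.1]                                 # made-up gains
    model = lambda xi: [(1.0 + 0.5 * xi) * ak for ak in a]
    z = model(0.23)                                      # noise-free data
    print(argmin_on_grid(z, model, [1e-4] * 3))          # close to 0.23
```

With a flat prior the a priori term is constant inside the support, so the minimizer is determined by the mismatch term alone; a non-uniform density would shift it toward the more probable parameter values.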

3. Insight into the Bayesian approach using simple mechanical systems

We now illustrate the proposed Bayesian approach for the estimation of parameters of several

simple mechanical systems. We discuss how the cost function and the estimate can be affected by

low sampling rates (i.e., below the Nyquist frequency), by measurement noise, and by non-

identifiability issues.

The state of the model $y_k \in \mathbb{R}^{n_S}$ at time moment $t_k$ depends implicitly on the set of uncertain parameters $\theta \in \mathbb{R}^{n_P}$, and therefore on the set of independent random variables $\xi$. This dependency is explicitly represented in the polynomial chaos framework; specifically, at each time moment $t_k$ the state is given as a polynomial of the random variables, $y_k = y_k(\xi)$. The probability density of the state can also be obtained from this relation. $H$ is simply a matrix converting the states of the model $y(\xi)$ to the observable parameters of the system (i.e., the quantities which can be measured), which are contained in $z$. $R_k$ is the covariance matrix of the uncertainty associated with the measurements, i.e., of the measurement noise.

$J^{\text{apriori}} = -\log \rho(\xi)$ comes from the a priori knowledge of the uncertain parameters. Using polynomial chaoses the uncertain parameters can be modeled explicitly as functions of a set of random variables, $\theta = \theta(\xi) \in \mathbb{R}^{n_P}$, with a joint probability density function $\rho(\xi)$.

$J^{\text{mismatch}}$ is usually the most important component of the cost function, but $J^{\text{apriori}}$ is useful when $J^{\text{mismatch}}$ does not contain enough information to identify a clear minimum of the cost function. This is illustrated in the next section of this article.

3.1 Mass-spring system with uncertain initial velocity

This section applies the Bayesian approach to the simple mass-spring system shown in Figure 1.

Figure 1. Mass–spring system

The parameters $K$ (the stiffness of the spring) and $M$ (the mass of the body) are known. The system has a zero initial displacement $x_0 = 0$ but a nonzero initial velocity $v_0$ (e.g., created by hitting the mass from below with a hammer at $t = 0$, which will produce $v_0 \neq 0$). We want to estimate the uncertain initial condition $v_0$ based on measurements of the displacement $x(t)$ at later times.

The equation of motion of the system is

$$M\,\ddot{x}(t) + K\,x(t) = 0 \tag{26}$$

and admits the general solution (Inman, 2001):

$$x(t) = \frac{\sqrt{\omega_n^2 x_0^2 + v_0^2}}{\omega_n}\,\sin\!\Big(\omega_n t + \tan^{-1}\frac{\omega_n x_0}{v_0}\Big), \qquad \omega_n = \sqrt{\frac{K}{M}} \tag{27}$$

The values chosen for the numerical experiments are $K = (2\pi)^2 \approx 39.478$ N/m, $M = 1.0$ kg, and therefore $\omega_n = 2\pi$ rad/s $= 1$ Hz. For these values, and with $x_0 = 0$, the solution (27) becomes:

$$x(t) = \frac{v_0}{2\pi}\,\sin(2\pi t), \qquad v(t) = v_0\,\cos(2\pi t). \tag{28}$$

The amplitude of $x(t)$ and the amplitude of $v(t)$ are both proportional to the uncertain parameter $v_0$, as illustrated in Figure 2. A single measurement of the displacement at a time $t_1 \neq m/2$ (with $m$ integer) allows the initial velocity to be estimated as $v_0 = 2\pi\,x(t_1)/\sin(2\pi t_1)$. Note that $\sin(2\pi t_1) \neq 0$. A single measurement of both the velocity and the displacement at any time $t_2$ is sufficient to retrieve the initial velocity, since for any $t_2$ at least one of the variables is nonzero, $\sin(2\pi t_2) \neq 0$ or $\cos(2\pi t_2) \neq 0$.
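As a small numerical illustration of this identifiability argument, the sketch below recovers the initial velocity from noise-free data. The first function implements the single-displacement formula from the text. The second combines displacement and velocity through the identity sin² + cos² = 1; that combination is an illustrative choice, not a formula taken from the paper, and both function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def v0_from_displacement(x1, t1):
    """v0 = 2 pi x(t1) / sin(2 pi t1); fails when t1 is a multiple of 0.5 s."""
    s = math.sin(TWO_PI * t1)
    if abs(s) < 1e-9:
        raise ValueError("x(t1) = 0 for any v0 at this time; v0 is not identifiable")
    return TWO_PI * x1 / s


def v0_from_state(x2, v2, t2):
    """v0 = 2 pi x sin(2 pi t) + v cos(2 pi t), valid at any time t2
    (illustrative combination of the two measured quantities)."""
    return TWO_PI * x2 * math.sin(TWO_PI * t2) + v2 * math.cos(TWO_PI * t2)


if __name__ == "__main__":
    v0 = 1.115
    x = v0 / TWO_PI * math.sin(TWO_PI * 0.3)
    print(v0_from_displacement(x, 0.3))
    print(v0_from_state(0.0, -v0, 0.5))
```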


Figure 2. Displacements and velocities of the mass–spring system

We now consider the case where measurements of the displacement only are taken at multiple

time moments $t_1, t_2, \ldots, t_N$. This will give insight on how the two parts of the Bayesian cost function

can be affected by low sampling rates (i.e., below the Nyquist frequency) and by measurement

noise.

We assume some prior knowledge of the initial velocity, which represents how hard different

people can hit the mass with the hammer. The range of possible initial velocities is between 0.5 m/s and 1.5 m/s, with a most likely value of $v_0 = 1$ m/s. We model this prior knowledge as shown in Figure 3. Let $\xi$ be a random variable with a Beta(2,2) probability distribution $\rho(\xi)$ in the range $[-1, 1]$. The random initial velocity is then

$$v(\xi, 0) = v_{0,\mathrm{nom}}\,(1 + 0.5\,\xi) \ [\mathrm{m/s}]. \tag{29}$$

Figure 3. Beta (2, 2) distribution for 0v
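The prior can be written down explicitly. A Beta(2,2) density rescaled from [0, 1] to [-1, 1] is proportional to (1 - ξ²), and normalizing it gives the factor 3/4. The sketch below evaluates this density, the mapping to the initial velocity (assuming a nominal value of 1 m/s), and the corresponding a priori cost term; the function names are hypothetical.

```python
import math


def beta22_density(xi):
    """Beta(2, 2) density rescaled to [-1, 1]: rho(xi) = 3/4 (1 - xi^2)."""
    return 0.75 * (1.0 - xi * xi) if -1.0 < xi < 1.0 else 0.0


def initial_velocity(xi, v_nom=1.0):
    """Initial velocity in m/s: v_nom (1 + 0.5 xi), i.e. 0.5 to 1.5 m/s."""
    return v_nom * (1.0 + 0.5 * xi)


def apriori_cost(xi):
    """-log rho(xi); infinite where the prior density vanishes."""
    rho = beta22_density(xi)
    return -math.log(rho) if rho > 0.0 else math.inf


if __name__ == "__main__":
    for xi in (-0.9, 0.0, 0.23, 0.9):
        print(xi, initial_velocity(xi), beta22_density(xi), apriori_cost(xi))
```

The a priori cost is smallest at ξ = 0, which corresponds to the most likely initial velocity, and grows without bound toward the ends of the admissible range.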


The state of the system at future times depends on the random initial velocity and can be represented by $y(t,\xi) = \big[\,x(t,\xi),\ v(t,\xi),\ \theta(t,\xi)\,\big]^T$. Synthetic measurements are obtained from a reference simulation with the reference value of the uncertain parameter $\xi^{\text{ref}} = 0.23$. If we assume that only the displacement can be measured we have that $H = [\,1 \ \ 0 \ \ 0\,]$ and the measurements yield

$$z_k = H\,y^{\text{ref}}(t_k) + \varepsilon_k = x^{\text{ref}}(t_k) + \varepsilon_k, \qquad \varepsilon_k \in \mathcal{N}(0, R_k). \tag{30}$$

The measurement noise $\varepsilon_k$ is assumed to be Gaussian with a zero mean and a variance 1% (or 0.01% or 10% when indicated) of the value of $x(t)$ plus the maximum measured value of $x(t)$ divided by 1000. Therefore, the covariance of the uncertainty associated with the measurements is $R_k = \max\big\{10^{-12},\ \big(0.01\,z_k + 0.001\,\max_{t_k} z_k\big)^2\big\}\ [\mathrm{m}^2]$. This value is always greater than zero and $R_k^{-1}$ can always be computed. Measurement errors at different times are independent random variables.

The maximum likelihood estimate is obtained by minimizing the Bayesian cost function

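$$J^{\text{total}}(\xi) = \underbrace{\frac{1}{2}\sum_{k=1}^{N} \big(z_k - H\,y(t_k,\xi)\big)^T R_k^{-1} \big(z_k - H\,y(t_k,\xi)\big)}_{J^{\text{mismatch}}} \; \underbrace{-\,\log \rho(\xi)}_{J^{\text{apriori}}} \tag{31}$$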

The random system output $y(t,\xi)$ is discretized using 6 terms in the polynomial chaos expansions,

and 12 collocation points will be used to derive the polynomial chaos coefficients. The collocation

points used in this study are obtained using an algorithm based on the Halton algorithm (Halton and

Smith, 1964), which is similar to the Hammersley algorithm (Hammersley, 1960).
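The text does not detail its Halton-based point generator, so the following is only a minimal sketch of the underlying idea: a one-dimensional Halton (van der Corput) sequence in base 2, mapped to the range of $\xi$. The function names and the choice of base are assumptions made for illustration.

```python
def radical_inverse(index, base=2):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def collocation_points(count, base=2):
    """First `count` Halton points in one dimension, mapped from (0, 1) to (-1, 1)."""
    return [2.0 * radical_inverse(i, base) - 1.0 for i in range(1, count + 1)]


points = collocation_points(12)
```

Each new point falls in the largest remaining gap, so the 12 points cover $(-1, 1)$ evenly without repeating.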

The frequency of the output signal $x(t,\xi)$ is 1 Hz for any value of $\xi$. If $x(t)$ is measured every 0.5 s from $t = 0.5$ s to $t = 5$ s, then $x(t_k,\xi) = 0$ for any value of $\xi$ and $z_k = \varepsilon_k$. The mismatch part then gives no extra information, as shown in Figure 4, in which the plot for $x(t)$ was obtained with $\xi = 0.23$, and it reduces to

$$J^{mismatch} = \frac{1}{2}\sum_{k=1}^{N}\varepsilon_k^T R_k^{-1}\,\varepsilon_k \tag{32}$$

For clarity, we illustrate this by detailing the step-by-step procedure, using analytical formulas for this particular example.

The explicit dependency of $x(t,\xi)$ on $\xi$ is obtained via a collocation approach. It can be represented as

$$x(t,\xi) = \sum_{i=1}^{S=6} x^i(t)\,\psi^i(\xi) \tag{33}$$

The equation of motion of the system illustrated in Figure 1 is given by equation (26). As shown earlier, the solution of this equation for $\omega_n = \sqrt{K/M} = 2\pi$ rad/s $= 1$ Hz and for a zero initial displacement is given by equation (28).

In a polynomial chaos framework, the equation of motion of the system yields six equations for $S = 6$:

$$M\,\ddot{x}^i(t) + K\,x^i(t) = 0, \qquad i = 1, 2, \ldots, S = 6 \tag{34}$$


Solving the six different equations of motion separately yields

$$x^i(t) = \frac{v_0^i}{2\pi}\,\sin(2\pi t), \qquad i = 1, 2, \ldots, S = 6 \tag{35}$$

and the polynomial expression of the displacement is

$$x(t,\xi) = \sum_{i=1}^{S=6}\frac{v_0^i}{2\pi}\,\sin(2\pi t)\,\psi^i(\xi) \tag{36}$$

Note that, in the general case, there is no need to know the closed form solution of the equations of motion. The approach presented in this paper still works when the state variables are obtained by solving the ODEs numerically at the chosen collocation points.
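A minimal sketch of this numerical route, under stated assumptions: the equation of motion is integrated with a classical Runge-Kutta scheme at 12 collocation points (here a base-2 Halton sequence), and 6 expansion coefficients are fitted by least squares. The monomial basis $1, \xi, \ldots, \xi^5$ is a stand-in chosen for simplicity, not the Jacobi basis of the paper, and all function names are hypothetical.

```python
import math
import numpy as np

OMEGA_N = 2.0 * math.pi  # natural frequency of the example, 1 Hz


def radical_inverse(index, base=2):
    """Van der Corput radical inverse, used to build Halton points."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def displacement_rk4(v0, t_end, steps=2000):
    """Integrate x'' = -omega_n^2 x with x(0) = 0, x'(0) = v0 by classical RK4."""
    h = t_end / steps
    x, v = 0.0, v0
    for _ in range(steps):
        k1x, k1v = v, -OMEGA_N**2 * x
        k2x, k2v = v + 0.5 * h * k1v, -OMEGA_N**2 * (x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, -OMEGA_N**2 * (x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, -OMEGA_N**2 * (x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x


# 12 collocation points in (-1, 1) and the displacement at t = 0.25 s for each.
xi = np.array([2.0 * radical_inverse(i) - 1.0 for i in range(1, 13)])
t_obs = 0.25
x_at_points = np.array([displacement_rk4(1.0 + 0.5 * s, t_obs) for s in xi])

# Least-squares fit of 6 coefficients in the monomial basis 1, xi, ..., xi^5.
A = np.vander(xi, 6, increasing=True)
coeffs, *_ = np.linalg.lstsq(A, x_at_points, rcond=None)
```

Because the displacement is linear in $\xi$ for this system, only the first two fitted coefficients are non-negligible, and they agree with the closed-form response $\sin(2\pi t)(1 + 0.5\,\xi)/(2\pi)$.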

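The coefficients $v_0^i$ are obtained using

$$v(0,\xi) = v_0^{nom} + 0.5\,v_0^{nom}\,\xi = v_0^1\psi^1(\xi) + v_0^2\psi^2(\xi) + v_0^3\psi^3(\xi) + v_0^4\psi^4(\xi) + v_0^5\psi^5(\xi) + v_0^6\psi^6(\xi) \tag{37}$$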

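For Beta(2,2) distributed random variables the basis functions are Jacobi (1,1) polynomials. With one random variable and for the range $[-1, 1]$, the normalized Jacobi (1,1) polynomials are:

$$\begin{aligned}
\psi^1(\xi) &= 1/2 \\
\psi^2(\xi) &= (3/28)\,(-1 + 2\xi) \\
\psi^3(\xi) &= (1/106)\,(1 - 5\xi + 5\xi^2) \\
\psi^4(\xi) &= (5/8344)\,(-1 + 9\xi - 21\xi^2 + 14\xi^3) \\
\psi^5(\xi) &= (3/92822)\,(1 - 14\xi + 56\xi^2 - 84\xi^3 + 42\xi^4) \\
\psi^6(\xi) &= (7/4455028)\,(-1 + 20\xi - 120\xi^2 + 300\xi^3 - 330\xi^4 + 132\xi^5)
\end{aligned} \tag{38}$$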

Therefore, the coefficients $v_0^i$ are

$$v_0^1 = \frac{5}{2}\,v_0^{nom}, \qquad v_0^2 = \frac{7}{3}\,v_0^{nom}, \qquad v_0^3 = v_0^4 = v_0^5 = v_0^6 = 0 \tag{39}$$

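The cost function from equation (31) can be written as

$$J(\xi) = \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(z_k - \sum_{i=1}^{S=6}\frac{v_0^i}{2\pi}\,\sin(2\pi t_k)\,\psi^i(\xi)\right)^2 - \log\rho(\xi), \qquad \rho(\xi) = \frac{3}{4}\,(1 - \xi^2) \tag{40}$$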

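Using the fact that only $v_0^1$ and $v_0^2$ are nonzero and replacing $v_0^1$ by $(5/2)\,v_0^{nom}$ and $v_0^2$ by $(7/3)\,v_0^{nom}$, it yields

$$J(\xi) = \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{(5/2)\,v_0^{nom}}{2\pi}\,\sin(2\pi t_k)\,\frac{1}{2} - \frac{(7/3)\,v_0^{nom}}{2\pi}\,\sin(2\pi t_k)\,(3/28)\,(-1 + 2\xi)\right)^2 - \log\left(\frac{3}{4}\,(1 - \xi^2)\right) \tag{41}$$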

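which can be simplified as

$$J(\xi) = \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\,\sin(2\pi t_k) - \frac{(v_0^{nom}/4)}{2\pi}\,\sin(2\pi t_k)\,(2\xi)\right)^2 - \log\left(\frac{3}{4}\,(1 - \xi^2)\right) \tag{42}$$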

which is also equal to

$$\begin{aligned}
J(\xi) ={}& \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\sin(2\pi t_k)\right)^2 - \sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\sin(2\pi t_k)\right)\frac{(v_0^{nom}/4)}{2\pi}\sin(2\pi t_k)\,(2\xi) \\
&+ \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(\frac{(v_0^{nom}/4)}{2\pi}\sin(2\pi t_k)\,(2\xi)\right)^2 - \log\left(\frac{3}{4}\,(1 - \xi^2)\right)
\end{aligned} \tag{43}$$

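The closed form solution of the value $\hat{\xi}$ minimizing the total cost function is a long expression that is not written here. The mismatch part of the cost function is

$$\begin{aligned}
J^{mismatch}(\xi) ={}& \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\sin(2\pi t_k)\right)^2 - \sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\sin(2\pi t_k)\right)\frac{(v_0^{nom}/4)}{2\pi}\sin(2\pi t_k)\,(2\xi) \\
&+ \frac{1}{2}\sum_{k=1}^{N} R_k^{-1}\left(\frac{(v_0^{nom}/4)}{2\pi}\sin(2\pi t_k)\,(2\xi)\right)^2
\end{aligned} \tag{44}$$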

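The value $\hat{\xi}^{mismatch} = \arg\min J^{mismatch}$ minimizing the mismatch part of the cost function is

$$\hat{\xi}^{mismatch} = \frac{\displaystyle\sum_{k=1}^{N} R_k^{-1}\left(z_k - \frac{v_0^{nom}}{2\pi}\sin(2\pi t_k)\right)\frac{1}{2\pi}\sin(2\pi t_k)}{\displaystyle 2\,(v_0^{nom}/4)\sum_{k=1}^{N} R_k^{-1}\left(\frac{1}{2\pi}\sin(2\pi t_k)\right)^2} \tag{45}$$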

If $x(t)$ is measured every 0.5 s from $t = 0.5$ s to $t = 1.5$ s, then $x(t_k,\xi) = 0$ for any value of $\xi$ and $z_k = \varepsilon_k$, so that

$$J^{mismatch} = \frac{1}{2}\sum_{k=1}^{N}\frac{\varepsilon_k^2}{R_k} \tag{46}$$

which is the formula that was already obtained in equation (32).

In this case, the denominator of equation (45) vanishes and $\hat{\xi}^{mismatch}$ is not defined, because the mismatch part does not depend on $\xi$ and yields an estimation where all possible values of $v_0 = v_0^{nom}(1 + 0.5\,\xi)$ are equally likely. Therefore, the value $\hat{\xi} = \arg\min J^{total}$ minimizing the total cost function is also the value minimizing the apriori part of the cost function, i.e., $\hat{\xi} = 0$.
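The worst case described above can be reproduced with a short sketch of the cost function in its simplified closed form. The measurements are taken noise-free to keep the example deterministic, and the grid search over $\xi$ is an assumption made for simplicity rather than the optimizer used in the paper.

```python
import math

V0_NOM = 1.0   # nominal initial velocity [m/s]
XI_REF = 0.23  # reference value used to generate synthetic data


def displacement(t, xi):
    """Closed-form displacement for omega_n = 2*pi rad/s and zero initial displacement."""
    return V0_NOM * (1.0 + 0.5 * xi) * math.sin(2.0 * math.pi * t) / (2.0 * math.pi)


def total_cost(xi, times, z, R):
    """Mismatch part plus apriori part -log(rho), with rho = 3/4 (1 - xi^2)."""
    mismatch = 0.5 * sum((zk - displacement(tk, xi)) ** 2 / Rk
                         for tk, zk, Rk in zip(times, z, R))
    apriori = -math.log(0.75 * (1.0 - xi * xi))
    return mismatch + apriori


def estimate(times):
    """Grid-search minimiser of the total cost for noise-free synthetic data."""
    z = [displacement(tk, XI_REF) for tk in times]
    z_max = max(z)
    R = [max(1e-12, (0.01 * zk + 0.001 * z_max) ** 2) for zk in z]
    grid = [i / 100.0 for i in range(-99, 100)]
    return min(grid, key=lambda xi: total_cost(xi, times, z, R))


worst_case = estimate([0.5 * k for k in range(1, 11)])  # samples at the zero crossings
one_sample = estimate([0.25])                           # a single informative sample
```

With samples taken only at the zero crossings the minimiser falls back to the apriori most likely value, whereas a single sample at $t = 0.25$ s already recovers the reference value on this grid.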

3.2 Possible impact of undersampling on the cost function

Increasing the number of measurements generally yields a better estimation, and as a general rule, sampling above the Nyquist rate should always be done when possible. This section studies the possible impact of undersampling for two reasons. First, measurements might not be available at a rate above the Nyquist rate, for instance when this rate is very high. Second, it will be shown that in some cases it is possible to know that the estimation is already quite accurate when using only very few measurement points instead of all of them. This can be useful when computational time is an issue and an answer is needed quickly, which does not prevent one from continuing to process the extra information later on if needed, knowing that the extra measurements will generally yield more precision.


The mismatch part of the cost function is driven by the observational errors. The summed contribution of errors makes $2J^{mismatch}$ a random variable with a $\chi^2$ distribution with $n$ degrees of freedom. For a relatively large number of measurements this distribution behaves like a normal one with mean $n$ and variance $2n$. The mismatch part does not depend on $\xi$ and yields an estimation where all possible values of $v_0$ are equally likely. This is illustrated in Figure 4 where the mismatch part of the cost function is constant. The mismatch part does not depend on the noise level in this case since $R_k^{-1}$ is inversely proportional to $\varepsilon_k^T\varepsilon_k$. In Figure 4, as well as in subsequent figures, we identify with the “ο” sign the points where measurements were available (i.e., the points for which we collected measurements).

In this case the estimation relies entirely on the apriori part of the cost function, i.e. $-\log\rho(\xi)$. The estimate coincides with the best initial guess, i.e., $\hat{\xi} = 0$. This is really the worst-case scenario:

the frequency of sampling the output is below the Nyquist frequency rate, and the sampling points

are exactly those time moments when the displacement is zero and the observations contain no

information.

(a) (b) Estimate = 0

Figure 4. Bayesian estimation with 10 time points

Notes: a) Displacement when no noise added, b) Estimation with noise = 1%

If we measure $x(t)$ at any other additional time point then the Bayesian approach yields an

accurate estimation result (for any reasonable amount of measurement noise). We can interpret this

fact as follows. If the output sampling is not done at least at the Nyquist frequency, one cannot

guarantee that all the relevant information in the output signal is captured, i.e. one cannot guarantee

that the mismatch part of the cost function will bring extra information. We still have a PDF of the

possible values of the uncertain parameter, but in the worst case scenario, it will be no better than

the apriori PDF.

In most practical situations, however, it is very likely that the Bayesian approach will provide

an accurate estimate even when the output is sampled below the Nyquist frequency. In the example

above a single measurement point is sufficient, provided that the measurement time is not one for

which the displacement is zero. This is where the Parameter Estimation and Signal Reconstruction


differ. In the above setting of parameter estimation one samples outputs of the system. If the outputs

were arbitrary signals then their full reconstruction would require a sufficient sampling frequency.

But the outputs are constrained by the input and by the system dynamics, and only a small set of all

the possible reconstructed signals are consistent with both the known input and with the equations

of motion. The reconstruction of signals in this small family requires less information than the

reconstruction of arbitrary signals. In our example all possible system outputs form a one-parameter

family of signals (frequency of 1 Hz, phase equal to zero, and variable amplitude). Consequently a

single measurement of the output is almost always sufficient to estimate the single uncertain

parameter.
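As a worked illustration of this point, using the closed-form response of the example and ignoring noise, a single measurement $z_1$ at a time $t_1$ with $\sin(2\pi t_1) \neq 0$ can be inverted directly:

$$z_1 = \frac{v_0^{nom}}{2\pi}\left(1 + \frac{\xi}{2}\right)\sin(2\pi t_1) \quad\Longrightarrow\quad \xi = 2\left(\frac{2\pi\,z_1}{v_0^{nom}\,\sin(2\pi t_1)} - 1\right).$$

For instance, with $v_0^{nom} = 1$ m/s, $t_1 = 0.25$ s and $z_1 = 1.115/(2\pi) \approx 0.1775$ m, this gives $\xi = 2\,(1.115 - 1) = 0.23$.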

(a) (b) Estimate = 0.23

(c) Estimate = 0.23 (d) Estimate = 0.19

Figure 5. Bayesian estimation with 3 time points

Notes: a) Displacement when no noise added, b) Estimation with noise = 0.01%,

c) Estimation with Noise = 1%, d) Estimation with noise = 10%

The practical question is now how to decide whether the sampling of the output is sufficient.

The answer is given by the shape of the Bayesian cost function which indicates whether there is


enough information to obtain a good estimate or not. The second derivative of the cost function at

the minimum approximates the inverse of the covariance of the uncertainty in the estimate. Loosely

speaking, the sharper the minimum of the cost function the more trustworthy the estimate is; and the

wider the minimum the larger the estimation error can be.
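The link between curvature and uncertainty can be checked numerically. The sketch below is an illustration under stated assumptions (one noise-free displacement sample at $t = 0.25$ s, curvature evaluated at $\xi = 0.23$, central finite differences); the function names are hypothetical.

```python
import math


def total_cost(xi, noise_level, t=0.25, xi_ref=0.23):
    """Total cost for one noise-free displacement sample at time t."""
    gain = math.sin(2.0 * math.pi * t) / (2.0 * math.pi)
    z = (1.0 + 0.5 * xi_ref) * gain
    R = max(1e-12, (noise_level * z + 0.001 * z) ** 2)
    mismatch = 0.5 * (z - (1.0 + 0.5 * xi) * gain) ** 2 / R
    return mismatch - math.log(0.75 * (1.0 - xi * xi))


def curvature(noise_level, xi=0.23, h=1e-4):
    """Central finite-difference estimate of the second derivative of the cost."""
    return (total_cost(xi + h, noise_level)
            - 2.0 * total_cost(xi, noise_level)
            + total_cost(xi - h, noise_level)) / h ** 2


for level in (0.0001, 0.01, 0.10):
    variance = 1.0 / curvature(level)
    print(f"noise {level:.2%}: approximate std of estimate = {math.sqrt(variance):.4f}")
```

The curvature drops as the noise level grows, so the implied standard deviation of the estimate increases, which is the sharp-versus-wide minimum behaviour described in the text.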

The role of the shape of the cost function is illustrated in Figure 5, in which only three measurement points for $t > 0$ are used. Different levels of measurement errors lead to different shapes of the cost function, and to different estimation accuracies. For noise levels of 0.01% and 1% the total cost function is almost equal to its mismatch part for all values of $\xi$, it has a sharp minimum, and the Bayesian approach yields an accurate estimate. For very noisy measurements

(10%) the relative weight of the information coming from measurements is smaller, and the relative

weight of the apriori information is higher. Consequently the apriori part of the cost function is

more significant and the minimum of the total cost function is wider. In this scenario the output

sampling is done below the Nyquist frequency, but we know that the estimates are accurate for

noise levels of 0.01% and 1% because the cost functions have clear minima.

(a) (b) Estimate = 0.23

(c) Estimate = 0.23 (d) Estimate = 0.22

Figure 6. Bayesian estimation with 30 time points


Notes: a) Displacement when no noise added, b) Estimation with noise = 0.01%,

c) Estimation with noise = 1%, d) Estimation with noise = 10%

One very accurate output sample would be enough for a perfect estimation. Taking more sample

points leads to a better estimation for noisy measurements because the effect of the noise averages

out as we take more samples. Figure 6 illustrates the cost function when 30 measurements are used.

The relative weight of the mismatch part increases and we get a better estimation when the noise is

10%.

When the extra samples do not bring additional information to the estimation process the cost

function changes as shown in Figure 7. The net result of the additional measurements is to add a

constant to the mismatch part (which corresponds to the effect of measurement noise). The shape of

the cost function does not change, in particular the minimum is not more pronounced, and the

quality of the estimate is not improved.

(a) (b) Estimate = 0

(c) (d) Estimate = 0

Figure 7. Effect of adding sample points containing no useful information


Notes: a) Displacement when no noise added with 5 time points, b) Estimation with 5 time points and noise = 0.01%,

c) Displacement when no noise added with 10 time points, d) Estimation with 10 time points and noise = 0.01%

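Next we consider the situation where both $x(t)$ and $v(t)$ are measured at times $t_k > 0$. The observation operator is

$$H = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

and the measured values are

$$z_k = H\,y(t_k,\xi) + \varepsilon_k = \begin{bmatrix} x(t_k,\xi) \\ v(t_k,\xi) \end{bmatrix} + \begin{bmatrix} \varepsilon_k^x \\ \varepsilon_k^v \end{bmatrix}.$$

The measurement noise is assumed Gaussian with zero mean and covariance matrix

$$R_k = \begin{bmatrix} \max\{10^{-12},\ (0.01\,x(t_k) + 0.001\,\max_t(x(t)))^2\} & 0 \\ 0 & \max\{10^{-12},\ (0.01\,v(t_k) + 0.001\,\max_t(v(t)))^2\} \end{bmatrix}$$

The inverse $R_k^{-1}$ can always be computed. One data point at any $t > 0$ is sufficient to estimate our unknown parameter $v_0$ for low noise levels, as shown in Figure 8. In the general case, however,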

measurements of the full state vector do not guarantee that they contain useful information when the

sampling rate is below the Nyquist frequency.

(a) (b)

(c) Estimate = 0.23 (d) Estimate = 0.19

Figure 8. Bayesian estimation with 1 time point when velocity measurements are available


Notes: a) Displacement when no noise added, b) Velocity when no noise added,

c) Estimation with noise = 0.01%, d) Estimation with noise = 1%

3.3 Mass-spring system with sinusoidal forcing function

This section applies the parameter estimation Bayesian approach developed in this study to the

simple mass-spring system with sinusoidal forcing function shown in Figure 9.

[Figure 9 schematic: mass $M$ attached to a spring of stiffness $K$, displacement $x$, forcing $F(t) = F_0\cos(\omega t)$]

Figure 9. Mass-spring system with sinusoidal forcing function

The parameters $K$ (the stiffness of the spring) and $M$ (the value of the mass) are known. The system is initially at equilibrium, i.e., it has zero initial displacement $x_0 = 0$ and velocity $v_0 = 0$. The problem is to estimate the uncertain amplitude of the forcing function $F_0$.

We assume the following apriori information. $F_0$ has a Beta(2,2) distribution in the range $[500\ \mathrm{N}, 1500\ \mathrm{N}]$ with the most likely value $F_0 = 1{,}000$ N. With $\xi \in [-1, 1]$ a Beta(2,2) distributed variable the apriori distribution of $F_0$ is

$$F_0(\xi) = 1{,}000\ \mathrm{N} + 500\ \mathrm{N}\cdot\xi \tag{47}$$

The reference value of the force amplitude is $F_0^{ref} = 1{,}115$ N, or $\xi^{ref} = 0.23$. This reference value is used to generate artificial observations and is not available to the estimation procedure. The numerical values of the other parameters are as follows: $K = 200\,(1.5 \cdot 2\pi)^2 = 17{,}765$ N/m, $M = 200$ kg, $x_0 = 0$, $v_0 = 0$, and $\omega = \pi$ rad/s $= 0.5$ Hz. Note that $\omega_n = \sqrt{K/M} = 1.5 \cdot 2\pi$ rad/s $= 1.5$ Hz.

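The equation of motion of the system is:

$$M\,\ddot{x}(t) + K\,x(t) = F_0\cos(\omega t) \tag{48}$$

The solution is sought in the time interval from $t = 0$ to $t = 5$. Since $x_0 = 0$ and $v_0 = 0$, the analytical solution of this equation of motion is (Inman, 2001):

$$x(t) = \frac{2\,(F_0/M)}{\omega_n^2 - \omega^2}\,\sin\left(\frac{\omega_n - \omega}{2}\,t\right)\sin\left(\frac{\omega_n + \omega}{2}\,t\right) \tag{49}$$

With our numerical values, the displacement of the mass can be written as:

$$x(t) = 1.2665 \times 10^{-4}\,F_0\,\sin(\pi t)\,\sin(2\pi t) \tag{50}$$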


It can be seen that the amplitude of $x(t)$ and the amplitude of $v(t)$ are both proportional to the uncertain parameter $F_0$, as shown in Figure 10. Therefore, the estimation of $F_0$ can in principle be based on a single measurement of the output.
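The numerical coefficient of the forced response, and the fact that the integer-second sampling instants carry no information, can be checked with a short sketch. The variable and function names are illustrative, and the velocity is approximated by a central difference rather than derived analytically.

```python
import math

M = 200.0                               # mass [kg]
K = 200.0 * (1.5 * 2.0 * math.pi) ** 2  # stiffness [N/m]
OMEGA = math.pi                         # forcing frequency [rad/s]
OMEGA_N = math.sqrt(K / M)              # natural frequency [rad/s]

# Coefficient multiplying F0 in the product-of-sines form of the response.
COEFF = (2.0 / M) / (OMEGA_N ** 2 - OMEGA ** 2)


def displacement(t, F0):
    """x(t) for zero initial displacement and velocity."""
    return (COEFF * F0
            * math.sin(0.5 * (OMEGA_N - OMEGA) * t)
            * math.sin(0.5 * (OMEGA_N + OMEGA) * t))


def velocity(t, F0, h=1e-6):
    """Central-difference approximation of dx/dt."""
    return (displacement(t + h, F0) - displacement(t - h, F0)) / (2.0 * h)


F0_REF = 1000.0 + 500.0 * 0.23  # reference amplitude, 1,115 N
```

At $t = 1, 2, \ldots, 5$ s both `displacement` and `velocity` vanish for every value of $F_0$, so measurements taken only at those instants cannot distinguish between amplitudes, while at any other instant the response scales linearly with $F_0$.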

Figure 10. Displacement and velocities of the Mass – Spring system with sinusoidal forcing function

(a) (b)


(c) Estimate = 0 (d) Estimate = 0.23

Figure 11. Bayesian estimation with 5 time points when velocity measurements are available

Notes: a) Displacement when no noise added, b) Velocity when no noise added,

c) Estimation with noise = 0.01%, d) Estimation with noise = 10%

Figure 11 illustrates the effect of measurements of both $x(t)$ and $v(t)$ at five time points. This sampling provides no information on the uncertain parameter, and in this worst-case scenario the estimate is based solely on apriori information. A sampling of the output below the Nyquist frequency does not guarantee that we get sufficient information from the output signal about the uncertain parameter. The mismatch part does not depend on the noise level in this case since $R_k^{-1}$ is inversely proportional to $\varepsilon_k^T\varepsilon_k$, as shown in equation (32).

(a) (b)

(c) Estimate = 0.23 (d) Estimate = 0.25

Figure 12. Bayesian estimation with 3 time points when velocity measurements are available

Notes: a) Displacement when no noise added, b) Velocity when no noise added,

c) Estimation with noise = 0.01%, d) Estimation with noise = 10%


However, we can see in Figure 12 that 3 time measurements yield an accurate estimation for a

low noise level, even though the sampling is well below the Nyquist frequency. Once again, the

shape of the cost function indicates that for low noise levels we have enough information to

accurately estimate our uncertain parameter. While there are many signals with a maximum

frequency of 1.5 Hz which fit the observations at the chosen three measurement times, only one of

them is consistent with the input signal and with the equation of motion.

3.4 Regularization techniques applied to a mass-spring system with uncertain stiffness and uncertain mass

This example addresses the issue of non-identifiability. The Bayesian approach is applied to the

simple mass-spring system shown in Figure 1. The difference between the example in this section

and the one from Section 3.1 is that the uncertain parameters are different: they are the stiffness of

the spring ($K$) and the value of the mass ($M$). Our apriori information about the uncertain parameters is expressed in terms of probability densities as follows. The mass has a normal distribution with mean $M_0 = 200$ kg and a standard deviation $\sigma_M = 6.667$ kg. The stiffness also has a normal distribution with mean $K_0 = 200\,(2\pi)^2$ N/m $= 7{,}895.68$ N/m and standard deviation $\sigma_K = 1{,}052.8$ N/m. We represent the uncertain parameters as functions of a random vector $\xi = (\xi_1, \xi_2)$ of two independent normal random variables as follows

$$M(\xi) = M_0 + \sigma_M\,\xi_1, \qquad K(\xi) = K_0 + \sigma_K\,\xi_2, \qquad \xi_1, \xi_2 \in \mathcal{N}(0, 1). \tag{51}$$

We consider the “true” values of the parameters to be $M^{ref} = 201.533$ kg and $K^{ref} = 7495.62$ N/m, which correspond to the reference values of the random variables $\xi^{ref} = (\xi_1^{ref}, \xi_2^{ref}) = (0.23, -0.38)$. These values are not available to the estimation process, but are used in a reference simulation to generate synthetic observations.

We measure the values of the oscillation frequency $\omega_{obs}$ (along the reference solution) and use it to derive information about $K$ and $M$. The measurement errors are assumed to have a normal

distribution with zero mean (unbiased) and a standard deviation equal to

rad/s 3938.001.0 00 MKmeas .

This example allows an analytical solution to the Bayesian approach and provides insight into

the role of the mismatch and the a priori parts of the cost function in the estimation. The position of

the mass is given by:

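$$x(t) = \frac{\sqrt{\omega_n^2 x_0^2 + v_0^2}}{\omega_n}\,\sin\left(\omega_n t + \tan^{-1}\frac{\omega_n x_0}{v_0}\right), \tag{52}$$

and clearly it depends only on $\omega_n = \sqrt{K^{ref}/M^{ref}}$.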

The Bayesian cost function is defined in this case as:

$$\begin{aligned}
J(\xi) &= \frac{1}{2}\frac{(M(\xi) - M_0)^2}{\sigma_M^2} + \frac{1}{2}\frac{(K(\xi) - K_0)^2}{\sigma_K^2} + \frac{1}{2}\frac{\left(\sqrt{K(\xi)/M(\xi)} - \omega_{obs}\right)^2}{\sigma_{meas}^2} \\
&= \frac{1}{2}\,\xi_1^2 + \frac{1}{2}\,\xi_2^2 + \frac{1}{2}\,\frac{1}{\sigma_{meas}^2}\left(\sqrt{\frac{K_0 + \sigma_K\,\xi_2}{M_0 + \sigma_M\,\xi_1}} - \omega_{obs}\right)^2
\end{aligned} \tag{53}$$

The maximum likelihood value of the parameters is the argument that minimizes this cost function, $\hat{\xi} = \arg\min J(\xi)$. The contour plots shown in Figure 13 represent the mismatch part of the cost

function, its apriori part, and the total value of the cost function in the space of random variables.

(a) (b)

(c)

Figure 13. Contours of the cost function

Notes: a) mismatch part, b) apriori part, c) total cost function

The magnitude of the apriori part is relatively small and the total cost function is roughly equal to its mismatch part. As expected, the mismatch part yields a line of possible minima, because the measured frequency $\omega_n = \sqrt{K/M}$ depends only on the ratio $K/M$ and does not contain information about the individual values of $K$ and $M$. The point $\xi^{ref}$ is plotted in Figure 13 and it lies on the line of minima. The Bayesian interpretation is the following: all the pairs $(K, M)$ along the line are equally likely to produce the value of the measured oscillation frequency. We say that the (individual values of the) parameters $K$ and $M$ are non-identifiable.

When multiple combinations of uncertain parameter values result in the same observed behavior

of the system (same measurements) a regularization approach (Aster et al., 2004) can be used in

estimation. In order to find the most likely parameter values one increases the relative importance of

the apriori knowledge of the system. This is done by multiplying the apriori part by a

“regularization coefficient” large enough so that the new total cost function has a clear minimum

value along the possible values.

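$$J^{regularized}(\xi, \alpha^2) = \alpha^2\left[\frac{1}{2}\frac{(M(\xi) - M_0)^2}{\sigma_M^2} + \frac{1}{2}\frac{(K(\xi) - K_0)^2}{\sigma_K^2}\right] + \frac{1}{2}\frac{\left(\sqrt{K(\xi)/M(\xi)} - \omega_{obs}\right)^2}{\sigma_{meas}^2} \tag{54}$$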

The net effect of regularization in this example is to reduce the standard deviations in the apriori distributions (to $\sigma_M/\alpha$ and $\sigma_K/\alpha$ respectively), therefore to increase the trust in the apriori

information. The contour plots of the regularized cost functions are shown in Figure 14 for different

regularization coefficients.

Figure 14. Contour plots of the cost function after regularization for different coefficients

The cost function looks like its mismatch part when the regularization coefficient is very low

and looks like its apriori part when the regularization coefficient is very high. As the regularization

coefficient gets larger, the line of possible minima becomes an ellipse, and starts moving away from

the location of the original line of minima and toward (0, 0), the apriori most likely value. When the

line becomes an ellipse with a well defined center, the regularization coefficient is large enough; it

should not be further increased as this leads to an increase of the bias in the estimate. In our example $\alpha^2 = 5 \times 10^{6}$ seems to be a good value for the regularization coefficient and it results in the estimated values $(\hat{\xi}_1, \hat{\xi}_2) = (0.09, -0.40)$. The choice of the regularization coefficient is problem-dependent and requires a careful analysis of the resulting estimates. Regularization leads to biased

estimates, as stronger assumptions are being artificially imposed.
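The trade-off between the valley of the mismatch part and the pull of the apriori part can be explored with a brute-force sketch. Several ingredients here are assumptions made for illustration and are not taken from the paper: the mismatch is written on the frequency $\sqrt{K/M}$, the measurement standard deviation is set to 1% of the nominal frequency, the observation is noise-free, the minimiser is a grid search, and the symbol `alpha2` and its values are scaled to this sketch only, so they are not comparable to the coefficient quoted in the text.

```python
import math

M0, SIGMA_M = 200.0, 6.667                # apriori mean and std of the mass [kg]
K0, SIGMA_K = 7895.68, 1052.8             # apriori mean and std of the stiffness [N/m]
SIGMA_MEAS = 0.01 * math.sqrt(K0 / M0)    # assumed std of the frequency error [rad/s]
OMEGA_OBS = math.sqrt(7495.62 / 201.533)  # noise-free observed frequency [rad/s]


def frequency(xi1, xi2):
    """Natural frequency for the parameters encoded by (xi1, xi2)."""
    return math.sqrt((K0 + SIGMA_K * xi2) / (M0 + SIGMA_M * xi1))


def regularized_cost(xi1, xi2, alpha2):
    """alpha2 times the apriori part plus the frequency mismatch part."""
    apriori = 0.5 * (xi1 * xi1 + xi2 * xi2)
    mismatch = 0.5 * ((frequency(xi1, xi2) - OMEGA_OBS) / SIGMA_MEAS) ** 2
    return alpha2 * apriori + mismatch


def estimate(alpha2, step=0.01, half_width=1.0):
    """Brute-force minimiser on a regular grid in (xi1, xi2)."""
    n = int(round(half_width / step))
    grid = [i * step for i in range(-n, n + 1)]
    return min(((a, b) for a in grid for b in grid),
               key=lambda p: regularized_cost(p[0], p[1], alpha2))


estimates = {alpha2: estimate(alpha2) for alpha2 in (0.01, 1.0, 100.0, 1e4)}
```

For small `alpha2` the minimiser stays on the line of minima of the mismatch part; as `alpha2` grows it slides toward $(0, 0)$, which is the bias described above.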

3.5 Discussion of the Bayesian approach


The quality of the maximum likelihood estimate is related to the shape of the Bayesian cost

function, with a sharp minimum indicating an accurate estimate. Inaccurate estimates can be caused

by different factors, including a sampling rate below the Nyquist frequency, non-identifiability,

non-observability, and an excitation signal that is not rich enough.

The parameters are non-identifiable when different parameter values lead to identical system

outputs. In this case the Bayesian cost function has an entire region of minima (e.g., a valley), with

each parameter value in the region being equally likely. A regularization approach based on

increasing the weight of the apriori information can be used to select reasonable estimates. When non-observability is the reason leading to non-identifiability, the problem can be addressed by including measurements of additional states in the estimation procedure, if possible. When an input signal that is not rich enough is the reason leading to non-identifiability, the problem can be addressed by changing the kind of excitations applied to the system.

For identifiable and observable systems accurate estimates can be obtained in most cases even

if the output signal is sampled below the Nyquist rate. In the worst case, however, sampling below

the Nyquist rate cannot guarantee that sufficient information is extracted from the output. In this

worst case the apriori information becomes important and the estimate is biased toward the apriori

most likely value.

4. Summary and Conclusions

This paper applies the polynomial chaos theory to the problem of parameter estimation, using direct

stochastic collocation. The maximum likelihood estimates are obtained by minimizing a cost

function derived from the Bayesian theorem. This approach is applied to very simple mechanical

systems in order to illustrate how the cost function can be affected by undersampling, non-identifiability of the system, non-observability, and by excitation signals that are not rich enough.

Inaccurate estimates can be caused by those different factors. It has been shown that the quality of

the maximum likelihood estimate is related to the shape of the Bayesian cost function, with a sharp

minimum indicating an accurate estimate. The parameters are non-identifiable when different

parameter values lead to identical system outputs. In this case the Bayesian cost function has an

entire region of minima (e.g., a valley), with each parameter value in the region being equally

likely. Regularization techniques can still yield most likely values among the possible combinations of uncertain parameters resulting in the same time responses as the ones observed. This was

illustrated using a simple spring-mass system. For identifiable and observable systems accurate

estimates can be obtained in most cases even if the output signal is sampled below the Nyquist

frequency. In the worst case, however, sampling below the Nyquist rate cannot guarantee that

sufficient information is extracted from the output. In this worst case the apriori information

becomes important and the estimate is biased toward the apriori most likely value.

The proposed method has several advantages. Simulations using Polynomial Chaos methods are

much faster than Monte Carlo simulations. Another advantage of this method is that it is optimal; it

can treat non-Gaussian uncertainties since the Bayesian approach is not tailored to any specific

distribution. We plan to apply the proposed technique to identify parameters of a real mechanical

system for which laboratory measurements are available.

Acknowledgements


This research was supported in part by NASA Langley through the Virginia Institute for

Performance Engineering and Research award. The authors are grateful to Dr. Mehdi Ahmadian,

Dr. Steve Southward, Dr. John Ferris, Dr. Sean Kenny, Dr. Luis Crespo, Dr. Daniel Giesy, and Mr.

Carvel Holton for many fruitful discussions on this topic.

References

Araújo, A.L., Mota Soares, C.M., Herskovits, J., Pedersen, P. (2009), “Estimation of Piezoelastic and

Viscoelastic Properties in Laminated Structures”, Composite Structures, Vol. 87, No. 2, pp. 168-174.

Aster, R.C., Borchers, B., and Thurber, C.H. (2004), Parameter Estimation and Inverse Problems, Elsevier Academic Press. ISBN 0-12-065604-3.

Bishwal, J.P.N. (2008), Parameter Estimation in Stochastic Differential Equations, Springer, Berlin,

Germany.

Blanchard, E., Sandu, C., and Sandu, A. (2007a), “A Polynomial-Chaos-based Bayesian Approach for

Estimating Uncertain Parameters of Mechanical Systems”, Proceedings of the ASME 2007 International

Design Engineering Technical Conferences & Computers and Information in Engineering Conference

IDETC/CIE 2007, 9th International Conference on Advanced Vehicle and Tire Technologies (AVTT), Las Vegas, Nevada, September.

Blanchard, E., Sandu, A., and Sandu, C. (2007b), “Parameter Estimation Method Using an Extended Kalman

Filter”, Proceedings of the Joint North America, Asia-Pacific ISTVS Conference and Annual Meeting of Japanese Society for Terramechanics, Fairbanks, Alaska, June.

Catania, F., and Paladino, O. (2009), “Optimal Sampling for the Estimation of Dispersion Parameters in Soil

Columns Using an Iterative Genetic Algorithm”, Environmental Modelling & Software, Vol. 24, No. 1, pp. 115-123.

Cohn, S. E. (1997), “An Introduction to Estimation Theory”, Journal of the Meteorological Society of Japan,

Vol. 75 (B), pp. 257-288.

Davis, L. (1991), Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York.

Evensen, G. (1992), “Using the Extended Kalman Filter with a Multi-layer Quasi-geostrophic Ocean

Model”, Journal of Geophysical Research, Vol. 97 No. C11, pp. 17905-17924.

Evensen, G. (1993), “Open Boundary Conditions for the Extended Kalman Filter with a Quasi-geostrophic Mode”, Journal of Geophysical Research, Vol. 98 No C19, pp.16529-16546.

Evensen, G. (1994), “Sequential Data Assimilation with a Non-linear Quasi-geostrophic Model using Monte

Carlo Methods to Forecast Error Statistics” Journal of Geophysical Research, Vol. 99 No C5, pp. 10143–10162.

Floudas, C.A. (2000), Deterministic Global Optimization: Theory, Methods, and Applications, Kluwer

Academic Publishers, Dordrecht, The Netherlands.

Ghanem, R.G., and Spanos, P.D. (2003), Stochastic Finite Elements, Dover Publications Inc, Mineola, NY.

Ghanem, R.G., and Spanos, P.D. (1990), “Polynomial Chaos in Stochastic Finite Element”, Journal of Applied Mechanics, Vol. 57, pp. 197-202.

Ghanem, R.G., and Spanos, P.D. (1991), “Spectral Stochastic Finite-Element Formulation for Reliability Analysis”, ASCE Journal of Engineering Mechanics, Vol. 117 No. 10, pp. 2351-2372.


Ghanem, R.G., and Spanos, P.D. (1993), “A Stochastic Galerkin Expansion for Nonlinear Random Vibration Analysis”, Probabilistic Engineering Mechanics, Vol. 8 No. 3, pp. 255-264.

Halton, J. H., Smith, G. B. (1964), “Radical-inverse Quasi-random Point Sequence”, Communications of the

ACM, Vol. 7 No. 12, pp. 701–702.

Hammersley, J. M. (1960) “Monte Carlo Methods for Solving Multivariables Problems”, Annals of the New

York Academy of Sciences, Vol. 86, pp. 844–874.

Horst, R., and Pardalos, P.M., editors (1995), Handbook of Global Optimization, Vol. 1. Kluwer Academic

Publishers, Dordrecht, The Netherlands.

Horst, R., Pardalos, P. M., and Thoai, N. V. (2000), Introduction to Global Optimization, Second Edition,

Kluwer Academic Publishers, Dordrecht, The Netherlands.

Inman, D. J. (2001), Engineering Vibration, Second Edition, Prentice Hall, Inc., ISBN 0-13-726142-X.

Kalman, R. E. (1960), “A New Approach to Linear Filtering and Prediction Problems”, Transactions of the ASME - Journal of Basic Engineering, pp. 35-45.

Khan, T., and Ramuhalli, P. (2008), “A Recursive Bayesian Estimation Method for Solving Electromagnetic Nondestructive Evaluation Inverse Problems”, IEEE Transactions on Magnetics, Vol. 44 No. 7, pp. 1845-1855.

Li, L., Sandu, C., and Sandu, A. (2005), “Modeling and Simulation of a Full Vehicle with Parametric and External Uncertainties”, Proceedings of the 2005 ASME Int. Mechanical Engineering Congress and Exposition, 7th VDC Annual Symposium on "Advanced Vehicle Technologies", Session 4: Advances in Vehicle Systems Modeling and Simulation, Paper number IMECE2005-82101, Orlando, FL, November.

Liberti, L., and Maculan, N. (2006), Global Optimization: From Theory to Implementation, Springer, Berlin, Germany.

Mockus, J., Eddy, W., Mockus, A., Mockus, L., and Reklaitis, G. (1997), Bayesian Heuristic Approach to Discrete and Global Optimization: Algorithms, Visualization, Software and Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands.

Nocedal, J., and Wright, S.J. (2006), Numerical Optimization, Vol.2, Springer.

Oliveto, N.D., Scalia, G., and Oliveto, G. (2008), “Dynamic Identification of Structural Systems with Viscous and Friction Damping”, Journal of Sound and Vibration, Vol. 318, pp. 911–926.

Pardalos, P.M., and Romeijn, H.E., editors (2002), Handbook of Global Optimization, Vol. 2, Kluwer Academic Publishers, Dordrecht, The Netherlands.

Pradlwarter, H.J., Pellissetti, M.F., Schenk, C.A., Schuëller, G.I., Kreis, A., Fransen, S., Calvi, A., and Klein, M. (2005), “Realistic and Efficient Reliability Estimation for Aerospace Structures”, Computer Methods in Applied Mechanics and Engineering, Special Issue on Computational Methods in Stochastic Mechanics and Reliability Analysis, Vol. 194 No. 12-16, pp. 1597-1617.

Raïssi, T., Ramdani, N., and Candau, Y. (2004), “Set Membership State and Parameter Estimation for Systems Described by Nonlinear Differential Equations”, Automatica, Vol. 40, pp. 1771–1777.

Saad, G., Ghanem, R.G., and Masri, S. (2007), “Robust system identification of strongly non-linear dynamics using a polynomial chaos based sequential data assimilation technique”, Collection of Technical Papers - 48th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, Vol. 6, pp. 6005-6013, Honolulu, HI, May.

Sandu, A., Sandu, C., and Ahmadian, M. (2006a), “Modeling Multibody Dynamic Systems with Uncertainties. Part I: Theoretical and Computational Aspects”, Multibody System Dynamics, Publisher: Springer Netherlands, ISSN: 1384-5640 (Paper) 1573-272X (Online), DOI: 10.1007/s11044-006-9007-5, pp. 1-23 (23), June 29, 2006.

Sandu, C., Sandu, A., and Ahmadian, M. (2006b), “Modeling Multibody Dynamic Systems with Uncertainties. Part II: Numerical Applications”, Multibody System Dynamics, Publisher: Springer Netherlands, ISSN: 1384-5640 (Paper) 1573-272X (Online), DOI: 10.1007/s11044-006-9008-4, Vol. 15 No. 3, pp. 241-262 (22), April 2006.

Sandu, C., Sandu, A., Chan, B.J., and Ahmadian, M. (2004), “Treating Uncertainties in Multibody Dynamic Systems using a Polynomial Chaos Spectral Decomposition”, Proceedings of the ASME IMECE 2004, 6th Annual Symposium on “Advanced Vehicle Technology”, Paper number IMECE2004-60482, Anaheim, CA, November.

Sandu, C., Sandu, A., Chan, B.J., and Ahmadian, M. (2005), “Treatment of Constrained Multibody Dynamic Systems with Uncertainties”, Proceedings of the SAE Congress 2005, Paper number 2005-01-0936, Detroit, MI, April.

Sandu, C., Sandu, A., and Li, L. (2006c), “Stochastic Modeling of Terrain Profiles and Soil Parameters”, SAE 2005 Transactions Journal of Commercial Vehicles, Vol. 114 No. 2, pp. 211-220.

Sohns, B., Allison, J., Fathy, H.K., and Stein, J.L. (2006), “Efficient Parameterization of Large-Scale Dynamic Models Through the Use of Activity Analysis”, Proceedings of the ASME IMECE 2006, Chicago, Illinois, November.

Sun, J., Zhang, Q., and Tsang, E.P.K. (2005), “DE/EDA: A New Evolutionary Algorithm for Global Optimization”, Information Sciences, Vol. 169 No. 3-4, pp. 249-262.

Tarantola, A. (2004), Inverse Problem Theory and Methods for Model Parameter Estimation, Society for Industrial and Applied Mathematics.

Thompson, B., and Vladimirov, I. (2005), “Bayesian Parameter Estimation and Prediction in Mean Reverting Stochastic Diffusion Models”, Nonlinear Analysis, Vol. 63, pp. e2367–e2375.

Varziri, M.S., Poyton, A.A., McAuley, K.B., McLellan, P.J., and Ramsay, J.O. (2008), “Selecting Optimal Weighting Factors in iPDA for Parameter Estimation in Continuous-Time Dynamic Models”, Computers & Chemical Engineering, Vol. 32 No. 12, pp. 3011-3022.

Wang, J., and Zabaras, N. (2005), “Using Bayesian Statistics in the Estimation of Heat Source in Radiation”, International Journal of Heat and Mass Transfer, Vol. 48, pp. 15-29.

Xiu, D., Lucor, D., Su, C.H., and Karniadakis, G.E. (2002), “Stochastic Modeling of Flow-Structure Interactions using Generalized Polynomial Chaos”, Journal of Fluids Engineering, Vol. 124, pp. 51-59.

Xiu, D., and Karniadakis, G.E. (2002a), “The Wiener-Askey Polynomial Chaos for Stochastic Differential Equations”, SIAM Journal on Scientific Computing, Vol. 24 No. 2, pp. 619-644.

Xiu, D., and Karniadakis, G.E. (2003), “Modeling Uncertainty in Flow Simulations via Generalized Polynomial Chaos”, Journal of Computational Physics, Vol. 187, pp. 137-167.

Xiu, D., and Karniadakis, G.E. (2002b), “Modeling Uncertainty in Steady-state Diffusion Problems via Generalized Polynomial Chaos”, Computer Methods in Applied Mechanics and Engineering, Vol. 191, pp. 4927-4928.

Zhang, B.T. (1999), “A Bayesian Framework for Evolutionary Computation”, Proceedings of the 1999 Congress on Evolutionary Computation, Vol. 1, pp. 722-728.

Zhang, D., and Lu, Z. (2004), “An Efficient, High-order Perturbation Approach for Flow in Random Porous Media via Karhunen–Loeve and Polynomial Expansions”, Journal of Computational Physics, Vol. 194 No. 2, pp. 773–794.

Zhang, Q., Sun, J., Tsang, E., and Ford, J. (2004), “Hybrid Estimation of Distribution Algorithm for Global Optimization”, Engineering Computations, Vol. 21 No. 1, pp. 91-107.

Zhang, Q., Sun, J., Tsang, E. P. K. (2005), “An Evolutionary Algorithm with Guided Mutation for the Maximum Clique Problem”, IEEE Transactions on Evolutionary Computation, Vol. 9 No. 2, pp. 192-200.

List of Figures

Fig. 1 Mass–spring system

Fig. 2 Displacements and velocities of the mass–spring system

Fig. 3 Beta (2, 2) distribution for v_0

Fig. 4 Bayesian estimation with 10 time points: a) Displacement when no noise added, b) Estimation with noise = 1%

Fig. 5 Bayesian estimation with 3 time points: a) Displacement when no noise added, b) Estimation with noise = 0.01%, c) Estimation with noise = 1%, d) Estimation with noise = 10%

Fig. 6 Bayesian estimation with 30 time points: a) Displacement when no noise added, b) Estimation with noise = 0.01%, c) Estimation with noise = 1%, d) Estimation with noise = 10%

Fig. 7 Effect of adding sample points containing no useful information: a) Displacement when no noise added with 5 time points, b) Estimation with 5 time points and noise = 0.01%, c) Displacement when no noise added with 10 time points, d) Estimation with 10 time points and noise = 0.01%

Fig. 8 Bayesian estimation with 1 time point when velocity measurements are available: a) Displacement when no noise added, b) Velocity when no noise added, c) Estimation with noise = 0.01%, d) Estimation with noise = 1%

Fig. 9 Mass–Spring system with sinusoidal forcing function

Fig. 10 Displacement and velocities of the Mass–Spring system with sinusoidal forcing function

Fig. 11 Bayesian estimation with 5 time points when velocity measurements are available: a) Displacement when no noise added, b) Velocity when no noise added, c) Estimation with noise = 0.01%, d) Estimation with noise = 10%

Fig. 12 Bayesian estimation with 3 time points when velocity measurements are available: a) Displacement when no noise added, b) Velocity when no noise added, c) Estimation with noise = 0.01%, d) Estimation with noise = 10%

Fig. 13 Contours of the cost function: a) mismatch part, b) apriori part, c) total cost function