Computacion Inteligente Least-Square Methods for System Identification.


Page 1: Computacion Inteligente Least-Square Methods for System Identification.

Computacion Inteligente

Least-Square Methods for System Identification

Page 2: Computacion Inteligente Least-Square Methods for System Identification.

Contents

– System Identification: an Introduction

– Least-Squares Estimators

– Statistical Properties of Least-Squares Estimators

– Maximum likelihood (ML) estimator

– Maximum likelihood estimator for linear model

– LSE for Nonlinear Models

– Developing Dynamic Models from Data

– Example: Tank level modeling

Page 3: Computacion Inteligente Least-Square Methods for System Identification.

System Identification: Introduction

Goal

– Determine a mathematical model for an unknown system (or target system) by observing its input-output data pairs

Page 4: Computacion Inteligente Least-Square Methods for System Identification.

System Identification: Introduction

Purposes

– To predict a system’s behavior,

– As in time series prediction & weather forecasting

– To explain the interactions & relationships between inputs & outputs of a system

Page 5: Computacion Inteligente Least-Square Methods for System Identification.

System Identification: Introduction

Context example

– To design a controller based on the model of a system,

– as in aircraft or ship control

– Simulate the system under control once the model is known

Page 6: Computacion Inteligente Least-Square Methods for System Identification.

Why cover System Identification?

System Identification

It is a well-established and easy-to-use technique for modeling a real-life system.

It will be needed for the section on fuzzy-neural networks.

Page 7: Computacion Inteligente Least-Square Methods for System Identification.

Spring Example

Experimental data

Experiment   Force (newtons)   Length (inches)
1            1.1               1.5
2            1.9               2.1
3            3.2               2.5
4            4.4               3.3
5            5.9               4.1
6            7.4               4.6
7            9.2               5.0

What will the length be when the force is 5.0 newtons?

Page 8: Computacion Inteligente Least-Square Methods for System Identification.

Components of System Identification

System identification involves two main steps

– Structure identification

– Parameter identification

Page 9: Computacion Inteligente Least-Square Methods for System Identification.

Structure identification

Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is to be conducted

This class of models is denoted by a function $y = f(u, \theta)$ where:

• $y$ is the model output

• $u$ is the input vector

• $\theta$ is the parameter vector

Page 10: Computacion Inteligente Least-Square Methods for System Identification.

Structure identification

$f(u, \theta)$ depends on

– the problem at hand

– the designer’s experience

– the laws of nature governing the target system

Page 11: Computacion Inteligente Least-Square Methods for System Identification.

Parameter identification

– The training data is used for both the target system and the model.

– The difference between the target system output, $y_i$, and the mathematical model output, $\hat{y}_i$, is used to update the parameter vector, $\theta$.

Page 12: Computacion Inteligente Least-Square Methods for System Identification.

Parameter identification

– The structure of the model is known; however, we need to apply optimization techniques

– in order to determine the parameter vector $\theta = \hat{\theta}$ such that the resulting model $\hat{y} = f(u, \hat{\theta})$ describes the system appropriately:

$$ |y_i - \hat{y}_i| \to 0, \quad \text{with } y_i \text{ assigned to } u_i $$

Page 13: Computacion Inteligente Least-Square Methods for System Identification.

System Identification Process

The data set composed of $m$ desired input-output pairs

– $(u_i, y_i)$, $i = 1, \dots, m$, is called the training data

System identification usually requires repeating both structure & parameter identification until a satisfactory model is found

Page 14: Computacion Inteligente Least-Square Methods for System Identification.

System Identification: Steps

1. Specify & parameterize a class of mathematical models representing the system to be identified

2. Perform parameter identification to choose the parameters that best fit the training data set

3. Conduct validation tests to see if the identified model responds correctly to an unseen data set

4. Terminate the procedure once the results of the validation test are satisfactory. Otherwise, select another class of models & repeat steps 2 to 4
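The identification steps on this slide can be sketched as a loop. This is a minimal sketch under stated assumptions: the candidate model classes are polynomials of increasing degree, the data are synthetic, and the RMS tolerance `tol` is a hypothetical stopping rule. None of these choices come from the slides.

```python
import numpy as np

def fit_polynomial(u, y, degree):
    """Parameter identification: least-squares fit of a polynomial model."""
    A = np.vander(u, degree + 1, increasing=True)  # columns 1, u, u^2, ...
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta

def rms_error(theta, u, y):
    """Root-mean-square error of the fitted model on the data (u, y)."""
    A = np.vander(u, len(theta), increasing=True)
    return float(np.sqrt(np.mean((y - A @ theta) ** 2)))

def identify(u_train, y_train, u_val, y_val, max_degree=5, tol=0.1):
    """Try model classes of increasing complexity until validation passes."""
    for degree in range(1, max_degree + 1):                # choose a model class
        theta = fit_polynomial(u_train, y_train, degree)   # fit its parameters
        err = rms_error(theta, u_val, y_val)               # validate on unseen data
        if err <= tol:                                     # stop when satisfactory
            return degree, theta, err
    return None  # no candidate class met the tolerance

# Synthetic target system (assumed for illustration): a noisy quadratic.
rng = np.random.default_rng(0)
u = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * u + 3.0 * u**2 + 0.01 * rng.standard_normal(u.size)

# Alternate samples form the training set and the validation set.
result = identify(u[::2], y[::2], u[1::2], y[1::2])
print(result)
```

With this data the straight-line class fails validation and the quadratic class is accepted, which mirrors the "select another class & repeat" branch.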

Page 15: Computacion Inteligente Least-Square Methods for System Identification.

System Identification Process

Structure and parameter identification may need to be done repeatedly

Page 16: Computacion Inteligente Least-Square Methods for System Identification.

Least-Squares Estimators

Page 17: Computacion Inteligente Least-Square Methods for System Identification.

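Objective of Linear Least Squares fitting

Given a training data set $\{(u_i, y_i),\ i = 1, \dots, m\}$ and the general form function:

$$ y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) $$

Find the parameters $\theta_1, \dots, \theta_n$ such that the model output $\hat{y}_i$ is a good estimate of $y_i$ for every training pair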

Page 18: Computacion Inteligente Least-Square Methods for System Identification.

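The linear model

The linear model:

$$ y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) = f^T(u)\,\theta $$

where:

– $u = (u_1, \dots, u_p)^T$ is the model input vector

– $f_1, \dots, f_n$ are known functions of $u$, collected in $f(u) = (f_1(u), \dots, f_n(u))^T$

– $\theta_1, \dots, \theta_n$ are unknown parameters to be estimated, collected in $\theta = (\theta_1, \dots, \theta_n)^T$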

Page 19: Computacion Inteligente Least-Square Methods for System Identification.

Least-Squares Estimators

The task of fitting data using a linear model is referred to as linear regression

where:

– $u = (u_1, \dots, u_p)^T$ is the input vector

– $f_1(u), \dots, f_n(u)$ are the regressors

– $\theta_1, \dots, \theta_n$ form the parameter vector

Page 20: Computacion Inteligente Least-Square Methods for System Identification.

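Least-Squares Estimators

We collect the training data set $\{(u_i, y_i),\ i = 1, \dots, m\}$

The system’s equations become:

$$
\begin{aligned}
f_1(u_1)\theta_1 + f_2(u_1)\theta_2 + \dots + f_n(u_1)\theta_n &= y_1 \\
f_1(u_2)\theta_1 + f_2(u_2)\theta_2 + \dots + f_n(u_2)\theta_n &= y_2 \\
&\ \,\vdots \\
f_1(u_m)\theta_1 + f_2(u_m)\theta_2 + \dots + f_n(u_m)\theta_n &= y_m
\end{aligned}
$$

Which is equivalent to: $A\theta = y$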

Page 21: Computacion Inteligente Least-Square Methods for System Identification.

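Least-Squares Estimators

Which is equivalent to: $A\theta = y$

– where

$$
A = \begin{bmatrix} f_1(u_1) & \cdots & f_n(u_1) \\ \vdots & \ddots & \vdots \\ f_1(u_m) & \cdots & f_n(u_m) \end{bmatrix},
\qquad
\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_n \end{bmatrix},
\qquad
y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}
$$

– $A$ is an $m \times n$ matrix, $\theta$ is the unknown $n \times 1$ vector, and $y$ is an $m \times 1$ vector

– If $A$ is square ($m = n$) and nonsingular: $A\theta = y \Rightarrow \theta = A^{-1}y$ (solution)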

Page 22: Computacion Inteligente Least-Square Methods for System Identification.

Least-Squares Estimators

We have

– m outputs, and

– n fitting parameters to find

Or

– m equations, and

– n unknown variables

Usually m is greater than n

Page 23: Computacion Inteligente Least-Square Methods for System Identification.

Least-Squares Estimators

Since the model is just an approximation of the target system & the observed data might be corrupted,

– an exact solution is not always possible!

To overcome this inherent conceptual problem, an error vector $e$ is added to compensate

$$ A\theta + e = y $$

Page 24: Computacion Inteligente Least-Square Methods for System Identification.

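Least-Squares Estimators

Our goal is now to find the estimate $\hat{\theta}$ that reduces the errors between $y_i$ and $\hat{y}_i = a_i^T \hat{\theta}$, where $a_i^T$ is the $i$-th row of $A$

The problem: find the estimate

$$ \hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{m} (y_i - a_i^T \theta)^2 $$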

Page 25: Computacion Inteligente Least-Square Methods for System Identification.

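Least-Squares Estimators

If $e = y - A\theta$ then:

We need to compute:

$$ E(\theta) = \sum_{i=1}^{m} (y_i - a_i^T \theta)^2 = e^T e = (y - A\theta)^T (y - A\theta) $$

where $a_i^T$ is the $i$-th row of $A$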

Page 26: Computacion Inteligente Least-Square Methods for System Identification.

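Least-Squares Estimators

Theorem [least-squares estimator]

The squared error $E(\theta) = (y - A\theta)^T (y - A\theta)$ is minimized when $\theta = \hat{\theta}$ satisfies the normal equation

$$ A^T A\,\hat{\theta} = A^T y $$

If $A^T A$ is nonsingular, $\hat{\theta}$ is unique & is given by

$$ \hat{\theta} = (A^T A)^{-1} A^T y $$

$\hat{\theta}$ is called the least-squares estimator (LSE)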
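The normal equation of the least-squares theorem follows from setting the gradient of the squared error to zero. A short derivation, using only the symbols $A$, $y$, $\theta$ and $E(\theta) = (y - A\theta)^T (y - A\theta)$ from the preceding slides:

```latex
\begin{aligned}
E(\theta) &= (y - A\theta)^T (y - A\theta)
           = y^T y - 2\,\theta^T A^T y + \theta^T A^T A\,\theta, \\[4pt]
\frac{\partial E}{\partial \theta} &= -2\,A^T y + 2\,A^T A\,\theta = 0
  \;\Longrightarrow\; A^T A\,\hat{\theta} = A^T y, \\[4pt]
\hat{\theta} &= (A^T A)^{-1} A^T y
  \quad \text{if } A^T A \text{ is nonsingular.}
\end{aligned}
```

The Hessian $2A^T A$ is positive semidefinite, so the stationary point is a minimum. It is the unique minimum when $A^T A$ is nonsingular.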

Page 27: Computacion Inteligente Least-Square Methods for System Identification.

Spring Example

– Structure identification can be done using domain knowledge.

– The change in length of a spring is proportional to the force applied (Hooke’s law).

length = k0 + k1*force
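For the spring data tabulated earlier in the deck, the model length = k0 + k1*force can be fitted through the normal equation. This is a minimal sketch assuming NumPy; the variable names are illustrative and not taken from the slides.

```python
import numpy as np

# Experimental data from the spring example (7 experiments).
force = np.array([1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2])   # newtons
length = np.array([1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0])  # inches

# Regressors f1(u) = 1 and f2(u) = u give the design matrix A (7 x 2).
A = np.column_stack([np.ones_like(force), force])

# Solve the normal equation (A^T A) theta = A^T y for theta = (k0, k1).
theta_hat = np.linalg.solve(A.T @ A, A.T @ length)
k0, k1 = theta_hat

# Answer the question on the data slide: length at a force of 5.0 newtons.
predicted_length = k0 + k1 * 5.0
print(f"k0 = {k0:.4f}, k1 = {k1:.4f}, length(5.0 N) = {predicted_length:.4f}")
```

The fit gives k0 of about 1.20 and k1 of about 0.44, so the predicted length at 5.0 newtons is about 3.42 inches. For larger or ill-conditioned problems, `np.linalg.lstsq` is usually preferred over forming $A^T A$ explicitly.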

Page 28: Computacion Inteligente Least-Square Methods for System Identification.

Spring Example

Page 29: Computacion Inteligente Least-Square Methods for System Identification.

Statistical Properties of least-squares estimators

Page 30: Computacion Inteligente Least-Square Methods for System Identification.

Statistical qualities of LSE

Definition [unbiased estimator]

An estimator $\hat{\theta}$ of the parameter $\theta$ is unbiased if

$$ E[\hat{\theta}] = \theta $$

where $E[\cdot]$ is the statistical expectation

Page 31: Computacion Inteligente Least-Square Methods for System Identification.

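Statistical qualities of LSE

Definition [minimal variance]

– An estimator $\hat{\theta}$ is a minimum variance estimator if for any other estimator $\theta^*$:

$$ \mathrm{Cov}(\hat{\theta}) \le \mathrm{Cov}(\theta^*) $$

where $\mathrm{Cov}(\theta)$ is the covariance matrix of the random vector $\theta$, and the inequality means that $\mathrm{Cov}(\theta^*) - \mathrm{Cov}(\hat{\theta})$ is positive semidefinite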

Page 32: Computacion Inteligente Least-Square Methods for System Identification.

Statistical qualities of LSE

Theorem [Gauss-Markov]:

– Gauss-Markov conditions:

• The error vector $e$ is a vector of $m$ uncorrelated random variables, each with zero mean & the same variance $\sigma^2$.

• This means that:

$$ E[e] = 0, \qquad \mathrm{Cov}(e) = E[e\,e^T] = \sigma^2 I $$

Page 33: Computacion Inteligente Least-Square Methods for System Identification.

Statistical qualities of LSE

Theorem [Gauss-Markov]

Under the Gauss-Markov conditions, the LSE is unbiased & has minimum variance among all linear unbiased estimators.

Proof:
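A sketch of the standard argument, assuming the data model $y = A\theta + e$ with a deterministic matrix $A$, nonsingular $A^T A$, and the Gauss-Markov conditions $E[e] = 0$, $\mathrm{Cov}(e) = \sigma^2 I$. The matrices $B$ and $D$ are introduced here for illustration only.

```latex
\begin{aligned}
\hat{\theta} &= (A^T A)^{-1} A^T y = \theta + (A^T A)^{-1} A^T e, \\[4pt]
E[\hat{\theta}] &= \theta + (A^T A)^{-1} A^T E[e] = \theta
  \quad \text{(unbiased)}, \\[4pt]
\mathrm{Cov}(\hat{\theta}) &= (A^T A)^{-1} A^T (\sigma^2 I) A (A^T A)^{-1}
  = \sigma^2 (A^T A)^{-1}. \\[8pt]
\text{For any linear unbiased } \theta^* &= B y \text{ with } BA = I,
  \text{ write } B = (A^T A)^{-1} A^T + D, \quad DA = 0: \\[4pt]
\mathrm{Cov}(\theta^*) &= \sigma^2 B B^T
  = \sigma^2 (A^T A)^{-1} + \sigma^2 D D^T \;\ge\; \mathrm{Cov}(\hat{\theta}).
\end{aligned}
```

The cross terms vanish because $A^T D^T = (DA)^T = 0$, and $D D^T$ is positive semidefinite, which gives the minimum variance property.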

Page 34: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood (ML) estimator

Page 35: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood (ML) estimator

The problem

– Suppose we observe $m$ independent samples

$x_1, x_2, \dots, x_m$,

– coming from a probability density function with parameters $\theta_1, \dots, \theta_r$

Page 36: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood (ML) estimator

The criterion for choosing $\theta$ is:

– Choose parameters that maximize data probability

Which one do you prefer? Why?

Page 37: Computacion Inteligente Least-Square Methods for System Identification.

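Maximum likelihood (ML) estimator

Likelihood function definition:

– For a sample of $m$ observations $x_1, x_2, \dots, x_m$

– drawn independently from a probability density function $f$,

– the likelihood function $L$ is defined by

$$ L(\theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_m; \theta) = \prod_{i=1}^{m} f(x_i; \theta) $$

$L$ is the joint probability density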

Page 38: Computacion Inteligente Least-Square Methods for System Identification.

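Maximum likelihood (ML) estimator

The ML estimator is defined as the value of $\theta$ which maximizes $L$:

$$ \hat{\theta} = \arg\max_{\theta} L(\theta) $$

or equivalently:

$$ \hat{\theta} = \arg\max_{\theta} \ln L(\theta) $$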

Page 39: Computacion Inteligente Least-Square Methods for System Identification.

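Maximum likelihood (ML) estimator

Example: ML estimation for normal distribution

– Suppose we have $m$ independent samples $x_1, x_2, \dots, x_m$, coming from a Gaussian distribution with parameters $\mu$ and $\sigma^2$.

What is the MLE for $\mu$ and $\sigma^2$?

$$ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right] $$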

Page 40: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood (ML) estimator

Example: ML estimation for normal distribution

– For $m$ observations $x_1, x_2, \dots, x_m$, we have:

$$ L(\mu, \sigma^2) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} $$

Page 41: Computacion Inteligente Least-Square Methods for System Identification.

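Maximum likelihood (ML) estimator

Example: ML estimation for normal distribution

– For $m$ observations $x_1, x_2, \dots, x_m$, we have:

$$ \ln L(\mu, \sigma^2) = -\frac{m}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{m} (x_i - \mu)^2 $$

– Setting the partial derivatives with respect to $\mu$ and $\sigma^2$ to zero gives

$$ \hat{\mu} = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \hat{\sigma}^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \hat{\mu})^2 $$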
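The Gaussian ML estimates have a closed form: the sample mean for μ, and the mean squared deviation with a 1/m factor for σ². The sketch below checks this numerically on synthetic samples. The sample size, seed and true parameters are arbitrary assumptions, not values from the slides.

```python
import numpy as np

def log_likelihood(mu, sigma2, x):
    """Gaussian log-likelihood ln L(mu, sigma^2) for independent samples x."""
    m = x.size
    return (-0.5 * m * np.log(2.0 * np.pi * sigma2)
            - np.sum((x - mu) ** 2) / (2.0 * sigma2))

# Synthetic samples (assumed for illustration).
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=0.5, size=1000)

# Closed-form maximum likelihood estimates.
mu_hat = np.sum(x) / x.size
sigma2_hat = np.sum((x - mu_hat) ** 2) / x.size   # note the 1/m factor

# The log-likelihood at the estimates should not be beaten by nearby values.
best = log_likelihood(mu_hat, sigma2_hat, x)
nearby = [
    log_likelihood(mu_hat + 0.05, sigma2_hat, x),
    log_likelihood(mu_hat - 0.05, sigma2_hat, x),
    log_likelihood(mu_hat, sigma2_hat * 1.2, x),
    log_likelihood(mu_hat, sigma2_hat * 0.8, x),
]
print(mu_hat, sigma2_hat, best >= max(nearby))
```

The ML variance estimate divides by m, so it matches `np.var(x)` with the default `ddof=0` rather than the unbiased 1/(m-1) version.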

Page 42: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood estimator for linear model

Page 43: Computacion Inteligente Least-Square Methods for System Identification.

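Maximum likelihood estimator for linear model

– Let a linear model be given as

$$ y_i = f^T(u_i)\,\theta + e_i, \qquad i = 1, \dots, m $$

– Then

$$ e_i = y_i - f^T(u_i)\,\theta $$

– Here the errors $e_i$ are independent with PDF $p_e$. The likelihood function is given by

$$ L(\theta) = \prod_{i=1}^{m} p_e\big( y_i - f^T(u_i)\,\theta \big) $$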

Page 44: Computacion Inteligente Least-Square Methods for System Identification.

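Maximum likelihood estimator for linear model

– Assume a regression model where errors are distributed normally with zero mean and variance $\sigma^2$.

– The likelihood function is given by

$$ L(\theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{\big( y_i - f^T(u_i)\,\theta \big)^2}{2\sigma^2} \right] $$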

Page 45: Computacion Inteligente Least-Square Methods for System Identification.

Maximum likelihood estimator for linear model

The maximum likelihood model

– Any algorithm that maximizes the likelihood $L(\theta)$

– gives the maximum likelihood model with respect to a given family of possible models

Page 46: Computacion Inteligente Least-Square Methods for System Identification.

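Maximum likelihood estimator for linear model

– Maximizing the Gaussian likelihood $L(\theta)$ is the same as maximizing

$$ \ln L(\theta) = -m \ln\big( \sqrt{2\pi}\,\sigma \big) - \frac{1}{2\sigma^2} \sum_{i=1}^{m} \big( y_i - f^T(u_i)\,\theta \big)^2 $$

– Same as minimizing

$$ \sum_{i=1}^{m} \big( y_i - f^T(u_i)\,\theta \big)^2 $$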

Page 47: Computacion Inteligente Least-Square Methods for System Identification.

Connection to Least Squares

Conclusion

– The least-squares fitting criterion can be understood as emerging from the use of the maximum likelihood principle for estimating a regression model where errors are distributed normally.

– The applicability of the least-squares method is, however, not limited to the normality assumption.

Page 48: Computacion Inteligente Least-Square Methods for System Identification.

LSE for Nonlinear Models

Page 49: Computacion Inteligente Least-Square Methods for System Identification.

LSE for Nonlinear Models

Nonlinear models are divided into 2 families

– Intrinsically linear

– Intrinsically nonlinear

• Through appropriate transformations of the input-output variables & fitting parameters, an intrinsically linear model can become a linear model

• After this transformation into a linear model, LSE can be used to optimize the unknown parameters

Page 50: Computacion Inteligente Least-Square Methods for System Identification.

LSE for Nonlinear Models

Examples of intrinsically linear systems
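One common intrinsically linear model is the exponential $y = a\,e^{b u}$: taking logarithms gives $\ln y = \ln a + b\,u$, which is linear in the transformed parameters $(\ln a, b)$. This particular model is an assumption chosen for illustration. The slide's own examples did not survive extraction. The data below are synthetic and noise-free so that the recovered values can be checked.

```python
import numpy as np

# Synthetic, noise-free data from y = a * exp(b * u) (assumed example).
a_true, b_true = 2.0, 0.5
u = np.linspace(0.0, 4.0, 20)
y = a_true * np.exp(b_true * u)

# Transform the output: ln y = ln a + b * u is linear in (ln a, b).
A = np.column_stack([np.ones_like(u), u])
theta_hat, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

# Map the fitted linear parameters back to the original ones.
a_hat = np.exp(theta_hat[0])
b_hat = theta_hat[1]
print(a_hat, b_hat)
```

With noisy data the transformed fit minimizes the squared error in $\ln y$, not in $y$, so the result generally differs from a direct nonlinear least-squares fit.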

Page 51: Computacion Inteligente Least-Square Methods for System Identification.

Developing Dynamic Models from Data

Page 52: Computacion Inteligente Least-Square Methods for System Identification.

Dynamical System?

Input u(t) → System → Output y(t)

$$ \hat{y}(t) = S\big( y(t-1), \dots, y(t-n),\ u(t-1), u(t-2), \dots, u(t-m) \big) $$

Page 53: Computacion Inteligente Least-Square Methods for System Identification.

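The ARX model

In dynamic systems analysis, the independent variable is often time ($k$)

– An ARX model (AutoRegressive with eXogenous input model) is often used, where, in one common sign convention,

$$ y(k) + a_1 y(k-1) + \dots + a_n y(k-n) = b_1 u(k-1) + \dots + b_m u(k-m) + e(k) $$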

Page 54: Computacion Inteligente Least-Square Methods for System Identification.

The ARX model

Or equivalently

– writing

Page 55: Computacion Inteligente Least-Square Methods for System Identification.

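The ARX model as a linear regressor

The input-output relationship can take the form

$$ y(k) = \varphi^T(k)\,\theta + e(k) $$

– where, for an ARX model written as $y(k) + a_1 y(k-1) + \dots + a_n y(k-n) = b_1 u(k-1) + \dots + b_m u(k-m) + e(k)$,

$$ \varphi(k) = \big[ -y(k-1), \dots, -y(k-n),\ u(k-1), \dots, u(k-m) \big]^T $$

is the regression vector, and

$$ \theta = \big[ a_1, \dots, a_n,\ b_1, \dots, b_m \big]^T $$

is the parameter vector to estimate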

Page 56: Computacion Inteligente Least-Square Methods for System Identification.

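Prediction error model estimation

The problem

– Assume input-output data $\{u(k), y(k)\}$, $k = 1, \dots, N$

– Build the predictor $\hat{y}(k \mid \theta) = \varphi^T(k)\,\theta$

– such that it minimizes the prediction error $\varepsilon(k, \theta) = y(k) - \hat{y}(k \mid \theta)$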

Page 57: Computacion Inteligente Least-Square Methods for System Identification.

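Prediction error model estimation

– The model is fitted to the data by minimizing the criterion function

$$ V_N(\theta) = \frac{1}{N} \sum_{k=1}^{N} \varepsilon^2(k, \theta) $$

Which gives the least squares criterion

$$ V_N(\theta) = \frac{1}{N} \sum_{k=1}^{N} \big( y(k) - \varphi^T(k)\,\theta \big)^2 $$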

Page 58: Computacion Inteligente Least-Square Methods for System Identification.

Prediction error model estimation

Solution

– Normal equation

$$ \left[ \frac{1}{N} \sum_{k=1}^{N} \varphi(k)\,\varphi^T(k) \right] \hat{\theta}_N = \frac{1}{N} \sum_{k=1}^{N} \varphi(k)\,y(k) $$

– Estimates

$$ \hat{\theta}_N^{LS} = \arg\min_{\theta} V_N(\theta) = \left[ \sum_{k=1}^{N} \varphi(k)\,\varphi^T(k) \right]^{-1} \sum_{k=1}^{N} \varphi(k)\,y(k) $$

Page 59: Computacion Inteligente Least-Square Methods for System Identification.

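Prediction error model estimation

In matrix form, the solution is the standard linear least squares formula

$$ \hat{\theta}_N = \left( \Phi\,\Phi^T \right)^{-1} \Phi\,y $$

where $\Phi = [\varphi(1), \dots, \varphi(N)]$ collects the regression vectors as columns and $y = [y(1), \dots, y(N)]^T$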
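The prediction error estimate can be sketched for an ARX model as follows. This is a minimal sketch under stated assumptions: the data are simulated from a hypothetical first-order system, the sign convention is $y(k) + a_1 y(k-1) = b_1 u(k-1) + e(k)$, and the helper `arx_regressors` is an illustrative name rather than a library function. Here the regression vectors are stacked as rows, i.e. the transpose of a matrix that holds them as columns.

```python
import numpy as np

def arx_regressors(y, u, n, m):
    """Stack phi(k)^T = [-y(k-1),...,-y(k-n), u(k-1),...,u(k-m)] as rows."""
    start = max(n, m)                      # first k with a full regression vector
    rows = []
    for k in range(start, len(y)):
        past_y = [-y[k - i] for i in range(1, n + 1)]
        past_u = [u[k - j] for j in range(1, m + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), y[start:]

# Simulate a hypothetical first-order ARX system (values are assumptions).
rng = np.random.default_rng(2)
N = 500
a1_true, b1_true = -0.8, 0.5
u = rng.choice([-1.0, 1.0], size=N)        # random binary input
e = 0.01 * rng.standard_normal(N)          # small white noise
y = np.zeros(N)
for k in range(1, N):
    y[k] = -a1_true * y[k - 1] + b1_true * u[k - 1] + e[k]

# Least-squares estimate: solve (sum phi phi^T) theta = sum phi y.
Phi_rows, target = arx_regressors(y, u, n=1, m=1)
theta_hat = np.linalg.solve(Phi_rows.T @ Phi_rows, Phi_rows.T @ target)
a1_hat, b1_hat = theta_hat
print(a1_hat, b1_hat)
```

With this low noise level the estimates land close to the simulated values of $a_1$ and $b_1$. The input must excite the system enough for the matrix $\sum \varphi\varphi^T$ to be nonsingular.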

Page 60: Computacion Inteligente Least-Square Methods for System Identification.

Example: Tank level modeling

Page 61: Computacion Inteligente Least-Square Methods for System Identification.

Example: Tank level modeling

Page 62: Computacion Inteligente Least-Square Methods for System Identification.

Example: Tank level modeling

The identification goal

– To explain how the voltage u(t) (the input) affects the water level h(t) (the output) of the tank

Experimental data

Page 63: Computacion Inteligente Least-Square Methods for System Identification.

Simple ARX modeling

A plausible first identification attempt is to try a simple linear regression model

– The parameters can easily be estimated using linear least squares, resulting in

Page 64: Computacion Inteligente Least-Square Methods for System Identification.

ARX model results

– The simulated water level follows the true level, but at levels close to zero the linear model produces negative levels.

Page 65: Computacion Inteligente Least-Square Methods for System Identification.

Semiphysical modeling

Model equation is based on dynamic conservation of mass

– Accumulation of mass in the tank is equal to the mass flow rate into the tank minus the mass flow rate out:

$$ A\,\frac{dh}{dt} = q_i - q_o $$

Page 66: Computacion Inteligente Least-Square Methods for System Identification.

Semiphysical modeling

While the inflow is roughly proportional to u(t), the outflow can be approximated using Bernoulli’s law, which makes it proportional to the square root of the level h(t)

– The parameters can easily be estimated using linear least squares, resulting in
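A semiphysical regression of this kind can be sketched by adding a square-root regressor to the linear ones. Everything specific below is an assumption made for illustration: the regressor set $h(t-1)$, $\sqrt{h(t-1)}$, $u(t-1)$ comes from an Euler discretization of the mass balance with a Bernoulli-type outflow, the constants `alpha` and `beta` are hypothetical, and the data are simulated rather than the slide's experimental data. The comparison uses one-step-ahead residuals, not the simulated output shown on the slides.

```python
import numpy as np

# Hypothetical discretized tank: h(t) = h(t-1) - alpha*sqrt(h(t-1)) + beta*u(t-1)
alpha, beta = 0.1, 0.05
rng = np.random.default_rng(3)
u = np.repeat(rng.uniform(0.5, 1.5, size=20), 25)   # piecewise-constant voltage
h = np.empty(u.size)
h[0] = 1.0
for t in range(1, u.size):
    h[t] = h[t - 1] - alpha * np.sqrt(h[t - 1]) + beta * u[t - 1]

target = h[1:]
h_prev, u_prev = h[:-1], u[:-1]

# Simple linear (ARX-type) regression: h(t) = th1*h(t-1) + th2*u(t-1)
A_arx = np.column_stack([h_prev, u_prev])
theta_arx, *_ = np.linalg.lstsq(A_arx, target, rcond=None)

# Semiphysical regression: add sqrt(h(t-1)) as a regressor; still linear in theta.
A_semi = np.column_stack([h_prev, np.sqrt(h_prev), u_prev])
theta_semi, *_ = np.linalg.lstsq(A_semi, target, rcond=None)

def rms(residual):
    """Root-mean-square of a residual vector."""
    return float(np.sqrt(np.mean(residual ** 2)))

rms_arx = rms(target - A_arx @ theta_arx)
rms_semi = rms(target - A_semi @ theta_semi)
print(theta_semi, rms_arx, rms_semi)
```

The model stays linear in the parameters even though it is nonlinear in the level, which is why ordinary linear least squares still applies.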

Page 67: Computacion Inteligente Least-Square Methods for System Identification.

Semiphysical model results

The RMS error of this model is lower and, more importantly, no simulated output is negative, which indicates that the model is physically sound

Page 68: Computacion Inteligente Least-Square Methods for System Identification.

Sources

Jyh-Shing Roger Jang, Chuen-Tsai Sun and Eiji Mizutani, Slides for Ch. 5 of “Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence”, First Edition, Prentice Hall, 1997.

Djamel Bouchaffra, Soft Computing. Course materials. Oakland University. Fall 2005.

Henrik Melgaard, Identification of Physical Models. Ph.D. thesis. Institute of Mathematical Modelling, Technical University of Denmark. 1994.

Soft Computing, lecture slides. Course materials. Dipartimento di Elettronica e Informazione, Politecnico di Milano. 2004.

Peter Lindskog, Fuzzy Identification from a Grey Box Modeling Point of View. Department of Electrical Engineering, Linköping University. 1997.

Jacob Roll, Local and Piecewise Affine Approaches to System Identification. Department of Electrical Engineering, Linköping University, Linköping, Sweden. 2003.