Page 1: Course Part3


TCP congestion control

Roughly speaking, TCP operates as follows:

- Data packets reaching a destination are acknowledged by sending an appropriate message to the sender.

- Upon receipt of the acknowledgement, data sources increase their send rate, thereby probing the network for available bandwidth, until congestion is encountered.

- Network congestion is deduced from the loss of data packets (receipt of duplicate ACKs or non-receipt of ACKs), and results in sources reducing their send rate drastically (by half).

Page 2: Course Part3


TCP congestion control

Congestion control is necessary for a number of reasons, so that:

- catastrophic collapse of the network is avoided under heavy loads;

- each data source receives a fair share of the available bandwidth;

- the available bandwidth B is utilised in an optimal fashion;

- interactions of the network sources do not cause destabilising side effects such as oscillations or instability.

Page 3: Course Part3


TCP congestion control

Hespanha's hybrid model of TCP traffic:

- Loss of packets is caused by queues filling at the bottleneck link.

- TCP sources have two modes of operation: additive increase and multiplicative decrease.

- Packet loss is detected at the sources one RTT after the loss of the packet.

(Figure: n data sources connected through two routers across a bottleneck link l.)

Page 4: Course Part3


TCP congestion control

(Figure: n data sources connected through two routers across a bottleneck link l, drawn as a hybrid automaton with states 'packet not being dropped' and 'packets dropped', and a transition 'packet drop detected' that halves the source rate.)

Page 5: Course Part3


TCP congestion control

(Figure: the same network, with the automaton states relabelled 'queue not full' and 'queue full'; when a packet drop is detected the source rate is halved.)

Page 6: Course Part3


Modelling the 'queue not full' state

The rate at which the queue grows is easy to determine.

While the queue is not full:

RTT = T_p + \frac{Q}{B}

\frac{dQ}{dt} = \sum_i \frac{w_i}{RTT} - B

\frac{dw_i}{dt} = \frac{1}{RTT}

Page 7: Course Part3


Modelling the 'queue full' state

When the queue is full:

\frac{dQ}{dt} = 0, \qquad \frac{dw_i}{dt} = \frac{1}{RTT}

One RTT later the sources are informed of congestion.

Page 8: Course Part3


TCP congestion control

Queue fills:

\frac{dQ}{dt} = \sum_i \frac{w_i}{RTT} - B, \qquad \frac{dw_i}{dt} = \frac{1}{RTT}

Queue full:

\frac{dQ}{dt} = 0, \qquad \frac{dw_i}{dt} = \frac{1}{RTT}

One RTT later (packet drop detected):

w_i \leftarrow 0.5\, w_i
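To make the model concrete, here is a minimal MATLAB sketch of a single-source version, using Euler integration; the parameter values and the drop-detection bookkeeping are illustrative assumptions rather than part of the slides.

    % Minimal Euler simulation of the hybrid model (single source; values assumed).
    B = 1250; Qmax = 250; Tp = 0.04;     % bandwidth, queue size, propagation delay
    dt = 1e-3; T = 60; N = round(T/dt);
    w = 1; Q = 0; tdetect = Inf;         % window, queue, pending drop-detection time
    W = zeros(1, N);
    for n = 1:N
        t = n*dt;
        RTT = Tp + Q/B;
        w = w + dt/RTT;                  % additive increase: dw/dt = 1/RTT
        Q = min(max(Q + dt*(w/RTT - B), 0), Qmax);   % queue dynamics, clamped
        if Q >= Qmax && ~isfinite(tdetect)
            tdetect = t + RTT;           % loss detected one RTT after queue fills
        end
        if t >= tdetect
            w = 0.5*w;                   % multiplicative decrease
            tdetect = Inf;
        end
        W(n) = w;
    end
    plot((1:N)*dt, W); xlabel('time [s]'); ylabel('window w');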

Page 9: Course Part3


TCP congestion control: Example (Hespanha)

Parameters: B = 1250 packets/sec, Q_max = 250 packets, T_p = 0.04 seconds.

(Figure: simulation results over 0-600 seconds.)

Page 10: Course Part3


TCP congestion control: Example (Fairness)

Parameters: B = 1250 packets/sec, Q_max = 250 packets, T_p = 0.04 seconds.

(Figure: simulation results over 0-1200 seconds, showing the sources converging to a fair share of the bandwidth.)

Page 11: Course Part3


Modelling of dynamic systems: Part 3

 System Identification

Robert N. Shorten & Douglas Leith

The Hamilton Institute

 NUI Maynooth

Page 12: Course Part3


Building our first model

Example: Malthus's law of population growth

Government agencies use population models to plan.

What do you think would be a good simple model for population growth?

Malthus's law states that the growth rate of an unperturbed population (Y) is proportional to the population present.

Introduction

\frac{dY}{dt} = kY


Page 14: Course Part3


(Figure: US population growth (millions) v. year, 1800-1980; and ln(Pop) v. year, which is approximately a straight line with slope = k and intercept = ln Y_0.)

Page 15: Course Part3


(Figure: US population growth (millions) v. year, with the fitted exponential MODEL overlaid.)

Page 16: Course Part3


Modelling

Modelling is usually necessary for two reasons: to predict and to control. However, to build models we need to do a lot of work:

- Postulate the model structure (most physical systems can be classified as belonging to the system classes that you have already seen);

- Identify the model parameters;

- Validate the parameters (later);

- Solve the equations to use the model for prediction and analysis (now).

Introduction

Page 17: Course Part3


Modelling

Modelling is usually necessary for two reasons: to predict and to control. However, to build models we need to do a lot of work:

- Postulate the model structure (most physical systems can be classified as belonging to the system classes that you have already seen);

- Identify the model parameters:
  Experiment design
  Parameter estimation

- Validate the parameters (later);

- Solve the equations to use the model for prediction and analysis (now).

Introduction

Page 18: Course Part3


What is parameter estimation?

Parameter identification is the identification of the unknown parameters of a given model.

Usually this involves two steps. The first step is concerned with obtaining data to allow us to identify the model parameters.

The second step usually involves using some mathematical technique to infer the parameters from the observed data.

Page 19: Course Part3


Linear in parameter model structures

The parameter estimation task is simple when the model is in linear-in-parameters form.

For example, in the equation

y = ax + b

the unknown parameters appear as coefficients of the variables (and offset).

The parameters of such equations are estimated using the principle of least squares.

Page 20: Course Part3


The principle of least squares

Karl Friedrich Gauss (the greatest mathematician after Hamilton) invented the principle of least squares to determine the orbits of planets and asteroids.

Gauss stated that the parameters of the models should be chosen such that 'the sum of the squares of the differences between the actually observed and computed values is a minimum'.

For linear-in-parameters models this principle can be applied easily.


Page 22: Course Part3


The principle of least squares

(Figure: data points (x_1, y_1), (x_2, y_2), ..., (x_k, y_k) scattered about a fitted line in the (x, y) plane.)

V(a,b) = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2

Page 23: Course Part3


The principle of least squares: The algebra

For our example: we want to minimize

V(a,b) = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{m} (y_i - a x_i - b)^2

Hence, we need to solve:

\frac{\partial V(a,b)}{\partial a} = 0, \qquad \frac{\partial V(a,b)}{\partial b} = 0

Page 24: Course Part3


The principle of least squares: The algebra

For our example: we want to minimize

V(a,b) = \sum_{i=1}^{m} (y_i - a x_i - b)^2

Hence, we need to solve the following equations for the parameters a, b:

\frac{\partial V(a,b)}{\partial a} = -2 \sum_{i=1}^{m} (y_i - a x_i - b)\, x_i = 0

\frac{\partial V(a,b)}{\partial b} = -2 \sum_{i=1}^{m} (y_i - a x_i - b) = 0

which yield

a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i

a \sum_{i=1}^{m} x_i + b\, m = \sum_{i=1}^{m} y_i

Page 25: Course Part3


A linear model

Example: Find the least squares line that fits the following data points.

X    Y
-1   10
 0    9
 1    7
 2    5
 3    4
 4    3
 5    0
 6   -1

(Figure: scatter plot of the data points.)


Page 27: Course Part3


A linear model

Example: Find the least squares line that fits the following data points.

X    Y
-1   10
 0    9
 1    7
 2    5
 3    4
 4    3
 5    0
 6   -1

a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i

a \sum_{i=1}^{m} x_i + b\, m = \sum_{i=1}^{m} y_i

\hat{y} = -1.6071\, x + 8.6429
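As a quick check, a minimal MATLAB sketch that forms the two sums and solves the resulting pair of equations for this data set:

    x = [-1 0 1 2 3 4 5 6]';
    y = [10 9 7 5 4 3 0 -1]';
    m = numel(x);
    A   = [sum(x.^2) sum(x); sum(x) m];   % coefficient matrix of the two equations
    rhs = [sum(x.*y); sum(y)];
    ab  = A \ rhs                          % ab(1) = a = -1.6071, ab(2) = b = 8.6429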

Page 28: Course Part3


A linear model

Example: Find the least squares line that fits the following data points.

X    Y
-1   10
 0    9
 1    7
 2    5
 3    4
 4    3
 5    0
 6   -1

\hat{y} = -1.6071\, x + 8.6429

(Figure: the data points and the fitted least squares line.)

Page 29: Course Part3


A polynomial model

Least squares can be used whenever we suspect a linear-in-parameters model. Find the least squares polynomial fit to the following data points.

\hat{y}_k = c_1 + c_2 x_k^2

X         Y
1.0000    2.9218
2.0000    5.9218
3.0000    10.9218
4.0000    17.9218
5.0000    26.9218
6.0000    37.9218
7.0000    50.9218
8.0000    65.9218
9.0000    82.9218
10.0000   101.9218

(Figure: scatter plot of the data points.)

Page 30: Course Part3


A polynomial model

By proceeding exactly as before:

\hat{y}_k = 1.9218 + x_k^2

X         Y
1.0000    2.9218
2.0000    5.9218
3.0000    10.9218
4.0000    17.9218
5.0000    26.9218
6.0000    37.9218
7.0000    50.9218
8.0000    65.9218
9.0000    82.9218
10.0000   101.9218

(Figure: the data points and the fitted polynomial.)
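In MATLAB the same fit can be set up by using 1 and x^2 as the regressors, anticipating the matrix formulation used later in these slides; a minimal sketch:

    x = (1:10)';
    y = [2.9218 5.9218 10.9218 17.9218 26.9218 37.9218 50.9218 65.9218 82.9218 101.9218]';
    Phi = [ones(size(x)) x.^2];   % regressors: 1 and x^2
    c = Phi \ y                    % c(1) = 1.9218, c(2) = 1.0000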

Page 31: Course Part3


Building our first model

Example: Malthus's law of population growth

Government agencies use population models to plan.

What do you think would be a good simple model for population growth?

Malthus's law states that the growth rate of an unperturbed population (Y) is proportional to the population present.

Introduction

\frac{dY}{dt} = kY

Page 32: Course Part3


An exponential model (the first lecture)

Y = A e^{kt}

The solution to the differential equation is not linear in parameters.

However, there is a change of variables to make it linear in parameters:

\ln Y = \ln(A e^{kt}) = \ln A + kt \ln e = \ln A + kt
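As a sketch (with synthetic numbers standing in for the census data), the change of variables reduces the fit to a straight line in (t, ln Y):

    t = (0:10)';                 % assumed time points
    Y = 3*exp(0.2*t);            % synthetic data: A = 3, k = 0.2
    p = polyfit(t, log(Y), 1);   % straight-line fit to ln Y
    k = p(1)                     % slope -> k (0.2)
    A = exp(p(2))                % exp(intercept) -> A (3)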

Page 33: Course Part3


(Figure: US population growth (millions) v. year, 1800-1980.)

Page 34: Course Part3


(Figure: US population growth (millions) v. year; and ln(Pop) v. year, which is approximately a straight line with slope = k and intercept = ln Y_0.)

Page 35: Course Part3


Matrix formulation of least squares

The least squares parameters can be derived by solving a set of simultaneous linear equations. This technique is effective but tedious for complicated linear-in-parameters models. A much more effective solution to the least squares problem can be found using matrices.

Suppose that we wish to find the parameters of the following linear-in-parameters model and that we have m measurements.

y = ax + bz + c

Page 36: Course Part3


Matrix formulation of least squares

All m measurements can be written in matrix form as follows

\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} =
\begin{pmatrix} x_1 & z_1 & 1 \\ x_2 & z_2 & 1 \\ \vdots & \vdots & \vdots \\ x_m & z_m & 1 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}

or more compactly as

Y = \Phi \theta

Page 37: Course Part3


Matrix formulation of least squares

The matrix \Phi is known as the matrix of regressors. This matrix (here an m x 3 matrix) is usually not invertible. To find the least squares solution we multiply both sides of the equation by the transpose of the regressor matrix:

\Phi^T Y = \Phi^T \Phi\, \theta \quad \Rightarrow \quad \theta = (\Phi^T \Phi)^{-1} \Phi^T Y

It can be shown that the least squares solution is given by the above equation.
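In MATLAB this is a one-liner; a minimal sketch for the model y = ax + bz + c, with synthetic data standing in for real measurements:

    x = randn(100,1); z = randn(100,1);
    y = 2*x - 3*z + 0.5 + 0.1*randn(100,1);   % synthetic data: a = 2, b = -3, c = 0.5
    Phi = [x z ones(size(x))];                 % m-by-3 matrix of regressors
    theta = (Phi'*Phi) \ (Phi'*y)              % the formula above
    theta_bs = Phi \ y                         % equivalent and numerically preferred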

Page 38: Course Part3


A linear model

Example: Find the least squares line that fits the following data points.

X    Y
-1   10
 0    9
 1    7
 2    5
 3    4
 4    3
 5    0
 6   -1

(Figure: scatter plot of the data points.)

Page 39: Course Part3


A linear model

The regressor is given by

    reg = [-1  1
            0  1
            1  1
            2  1
            3  1
            4  1
            5  1
            6  1];

Hence

    reg'*reg = [92  20
                20   8]
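Completing the computation with reg from above (a sketch):

    y = [10 9 7 5 4 3 0 -1]';
    theta = (reg'*reg) \ (reg'*y)   % theta = [-1.6071; 8.6429], matching the earlier fit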

Page 40: Course Part3


Summary: Linear least squares

To do a least squares fit we start by expanding the unknown function as a linear sum of basis functions:

y(x) = a f_1(x) + b f_2(x) + \dots

We have seen that the basis functions can be linear or non-linear. The linear parameters can be found using:

\theta = (\Phi^T \Phi)^{-1} \Phi^T Y

Page 41: Course Part3


Discrete time dynamic systems

Our examples work beautifully for static systems. What about identifying the parameters of dynamic systems? Dynamic systems are in principle not any different to static systems. We define our regressors and solve the regression problem.

Consider the following problem. We wish to build a model of the relationship between the throttle and the speed of an automobile. We begin by collecting data from an experiment.

(Figure: block diagram THROTTLE -> CAR DYNAMICS -> VELOCITY.)


Page 43: Course Part3


Discrete time dynamic systems

A good choice for the model structure is first order:

v_{k+1} = a v_k + b u_k + c

We can solve for the parameters by solving

\theta = (\Phi^T \Phi)^{-1} \Phi^T Y

yielding:

v_{k+1} = 0.95\, v_k + 0.38\, u_k + 0.030
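A sketch of how the fit is set up, with a synthetic experiment standing in for the recorded throttle/velocity data (the 'true' coefficients below are assumptions for illustration):

    m = 200; u = rand(m,1); v = zeros(m,1);
    for k = 1:m-1
        v(k+1) = 0.95*v(k) + 0.38*u(k) + 0.03;    % assumed 'true' car model
    end
    Phi   = [v(1:m-1) u(1:m-1) ones(m-1,1)];      % regressors for v(k+1) = a v(k) + b u(k) + c
    theta = (Phi'*Phi) \ (Phi'*v(2:m))            % recovers [a; b; c]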

Page 44: Course Part3


Recursive identification

The algorithms that we have looked at so far are called batch algorithms.

Sometimes we want to estimate model parameters recursively so that the parameters can be estimated on-line.

Also, if system parameters change over time, then we need to continually estimate and verify the model parameters.

Page 45: Course Part3


Recursive least squares

The least squares algorithm invented by Gauss can be arranged in such a way that the results obtained at time index k-1 can be used to obtain the parameter estimates at time index k. To see this we use

\theta = (\Phi^T \Phi)^{-1} \Phi^T Y

and note that

\Phi_m^T \Phi_m = \sum_{i=1}^{m} \varphi_i \varphi_i^T, \qquad \Phi_m^T Y_m = \sum_{i=1}^{m} \varphi_i y_i

Page 46: Course Part3


Recursive least squares

With a little manipulation (show) we get:

\theta_k = \theta_{k-1} + P_k \varphi_k (y_k - \varphi_k^T \theta_{k-1})

where:

P_k^{-1} = P_{k-1}^{-1} + \varphi_k \varphi_k^T

More complicated versions of the algorithm are available that avoid matrix inversion.

Page 47: Course Part3


Recursive least squares (car example)

(Figure: recursive estimates of the parameters c, a and b over time index k = 0..100.)

Page 48: Course Part3


Recursive least squares (car example)

(Figure: the same estimates over time index k = 50..100, showing the converged values of c, a and b.)

Page 49: Course Part3


The matrix inversion lemma

One not-so-nice feature of the RLS formula is the presence of a matrix inversion at each step. This can be removed using the matrix inversion lemma (the Sherman-Morrison formula).

Let A, C and C^{-1} + D A^{-1} B be invertible square matrices. Then A + BCD is invertible and

(A + BCD)^{-1} = A^{-1} - A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}

Page 50: Course Part3


The RLS algorithm

Application of the lemma results in the standard RLS algorithm:

G_k = P_{k-1} \varphi_k (1 + \varphi_k^T P_{k-1} \varphi_k)^{-1}

\theta_k = \theta_{k-1} + G_k (y_k - \varphi_k^T \theta_{k-1})

P_k = (I - G_k \varphi_k^T) P_{k-1}
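A minimal MATLAB sketch of the algorithm on synthetic data (the data, the initialisation theta = 0, P = 1000 I, and the noise level are assumptions):

    m = 200; theta_true = [0.8; 2];
    Phi = randn(m, 2);                          % regressor rows phi_k'
    y = Phi*theta_true + 0.05*randn(m,1);
    theta = zeros(2,1); P = 1e3*eye(2);         % initial estimate and covariance
    for k = 1:m
        phi = Phi(k,:)';
        G = P*phi / (1 + phi'*P*phi);           % gain
        theta = theta + G*(y(k) - phi'*theta);  % update estimate
        P = (eye(2) - G*phi')*P;                % update covariance
    end
    theta                                        % close to theta_true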

Page 51: Course Part3


Time-varying systems

Much of the appeal of the RLS algorithm is that we can potentially deal with time-varying systems.

Example: Suppose that a rocket ascends from the surface of the earth propelled by a thrust force generated through the ejection of mass. If we assume that the rate of change of mass of the fuel is u_m and the exhaust velocity is v_e, then the physical equation governing the rocket is:

m(t) \frac{dv}{dt} = -m(t)\, g + u_m v_e

Page 52: Course Part3


Forgetting factors

For time-varying systems we must estimate the parameters recursively. How can we modify the basic RLS algorithm?

To estimate time-varying parameters we would like to forget past data points. The only place in the above formula that depends on past data points is the covariance matrix.

\theta_k = \theta_{k-1} + P_k \varphi_k (y_k - \varphi_k^T \theta_{k-1})

P_k^{-1} = P_{k-1}^{-1} + \varphi_k \varphi_k^T

Page 53: Course Part3


Forgetting factors

For time-varying systems we must estimate the parameters recursively. How can we modify the basic RLS algorithm?

\theta_k = \theta_{k-1} + P_k \varphi_k e_k

P_k^{-1} = \lambda P_{k-1}^{-1} + \varphi_k \varphi_k^T

This corresponds to minimising the time-varying cost function:

V(\theta, k) = \sum_{i=1}^{k} \lambda^{k-i} (y_i - \hat{y}_i)^2

Page 54: Course Part3


The RLS algorithm

Application of the matrix inversion lemma results in the standard RLS algorithm with a forgetting factor:

G_k = P_{k-1} \varphi_k (\lambda + \varphi_k^T P_{k-1} \varphi_k)^{-1}

\theta_k = \theta_{k-1} + G_k (y_k - \varphi_k^T \theta_{k-1})

P_k = \frac{1}{\lambda} (I - G_k \varphi_k^T) P_{k-1}
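Relative to the earlier RLS sketch only the gain and covariance updates change; a sketch tracking a switching parameter (lambda and the synthetic system are assumptions):

    lambda = 0.98; m = 300;
    theta = zeros(2,1); P = 1e3*eye(2); est = zeros(2,m);
    for k = 1:m
        phi = randn(2,1);
        a_true = 0.5*sign(sin(2*pi*k/150));        % slowly switching parameter
        yk = [a_true; 2]'*phi + 0.05*randn;
        G = P*phi / (lambda + phi'*P*phi);         % gain with forgetting
        theta = theta + G*(yk - phi'*theta);
        P = (eye(2) - G*phi')*P / lambda;          % covariance inflated by 1/lambda
        est(:,k) = theta;
    end
    plot(est')                                      % estimates track the switching a_k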

Page 55: Course Part3


Example

Consider the dynamic system

y_{k+1} = a_k y_k + b_k u_k

where the parameters a_k, b_k vary as shown.

(Figure: the time-varying parameters a_k and b_k plotted over 0-300 seconds.)

Page 56: Course Part3


Example

(Figure: the recursive estimates of a_k and b_k over 0-300 seconds, tracking the time-varying parameters.)

Page 57: Course Part3


Numerical issues

The RLS algorithm is of great theoretical importance. However, it suffers from one very big disadvantage: it is numerically unstable.

The numerical instability stems from the equation:

P_k^{-1} = \lambda P_{k-1}^{-1} + \varphi_k \varphi_k^T

If no information enters the system, P^{-1} becomes singular and the estimator returns garbage.

Page 58: Course Part3


Numerical issues

(Figure: estimates of a and b and the input signal over 0-700 seconds; the estimates diverge once the input stops exciting the system.)

Page 59: Course Part3


Persistence of excitation

One final thought: persistence of excitation.

Persistence of excitation has a strict mathematical definition.

Roughly speaking, PE means that the input signal has been chosen such that the least squares estimate is unique.

The really interested student should consult Åström for more on this topic.

Page 60: Course Part3


Error surfaces and gradient methods

All the examples that we have looked at so far involved linear-in-parameters models. In this case finding the least squares solution was easy because the error surface is quadratic.

Huh! What is meant by a quadratic cost function?

Consider the example of line fitting. We were trying to minimize:

V(a,b) = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{m} (y_i - a x_i - b)^2

Page 61: Course Part3


Least mean squares and gradient methods

To make life simple, let's assume that we have two observations (m = 2) and that b = 0. Then:

V(a) = (y_1 - a x_1)^2 + (y_2 - a x_2)^2 = (x_1^2 + x_2^2)\, a^2 - 2 (x_1 y_1 + x_2 y_2)\, a + y_1^2 + y_2^2

Remember we are trying to find the parameter a that minimises this function. But the function is quadratic in a.

Page 62: Course Part3


Least mean squares and gradient methods

The quadratic surface looks like the following for a single parameter.

(Figure: the cost V(a) plotted against a: a parabola with a unique minimum.)

Page 63: Course Part3


Least mean squares and gradient methods

With two parameters we get something like:

(Figure: bowl-shaped quadratic error surface in (a, b).)

Page 64: Course Part3


A word on gradient methods

Another way of estimating the best parameters is to estimate the parameters in an iterative manner in the direction of the gradient.

For linear-in-parameters structures the batch version of least squares is better. However, the above idea can be extended to deal with model structures that are not linear in parameters (Doug will tell you all about this).

\theta(k+1) = \theta(k) - \gamma\, \nabla V(\theta(k))
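A minimal MATLAB sketch of this iteration for the single-parameter example above (the data and the step size gamma are assumptions):

    x = [1; 2]; y = [2; 3.9];          % assumed data, b = 0 case
    gamma = 0.05; a = 0;               % step size and initial guess
    for k = 1:50
        grad = -2*sum((y - a*x).*x);   % dV/da
        a = a - gamma*grad;            % a(k+1) = a(k) - gamma * grad V
    end
    a                                   % approaches (x'*y)/(x'*x) = 1.96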