Chapter 10 Least Squares Methods
Least Squares Methods
Zhulou Cao
2011/07/15
Content
10.1 Background
10.2 Linear Least-squares Problems
10.3 Algorithms For Nonlinear Least-squares Problems
10.4 Orthogonal Distance Regression
10.1 Background
In least-squares problems, the objective function f has the following special form:

    f(x) = \frac{1}{2} \sum_{j=1}^{m} r_j^2(x)    (10.1)

where each r_j is a smooth function from R^n to R. We refer to each r_j as a residual, and we assume throughout this chapter that m ≥ n.
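To make the notation concrete, here is a minimal Python sketch (an illustration, not from the chapter), assuming a two-parameter exponential model φ(x; t) = x₁ e^{x₂ t} fitted to synthetic data points (t_j, y_j):

```python
import numpy as np

def residuals(x, t, y):
    """Residual vector r(x): r_j(x) = phi(x; t_j) - y_j, length m."""
    return x[0] * np.exp(x[1] * t) - y

# Synthetic data: m = 20 observations of y = 2 exp(-t) plus small noise.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.0 * t) + 0.01 * np.random.default_rng(0).standard_normal(20)

x = np.array([1.5, -0.5])                    # n = 2 parameters, so m >= n
f = 0.5 * np.sum(residuals(x, t, y) ** 2)    # the objective (10.1)
```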
The special form of f often makes least-squares problems easier to solve than general unconstrained minimization problems.
Assemble the residuals into a vector r(x) = (r_1(x), …, r_m(x))^T and let J(x) denote its m × n Jacobian, with rows ∇r_j(x)^T. The gradient and Hessian of f can then be written as

    \nabla f(x) = J(x)^T r(x)
    \nabla^2 f(x) = J(x)^T J(x) + \sum_{j=1}^{m} r_j(x) \nabla^2 r_j(x)    (10.5)

Using J(x), we can therefore calculate the first term in the Hessian ∇²f(x) without evaluating any second derivatives. This availability of the first part of ∇²f(x) for free is the distinctive feature of least-squares problems. Moreover, this first term is often more important than the second summation term in (10.5), because near the solution the residuals r_j tend to be small or nearly affine.
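Continuing the sketch above (same illustrative model), the gradient and the second-derivative-free first term of (10.5) follow from the Jacobian by two matrix products:

```python
def jacobian(x, t):
    """m x n Jacobian of the residual vector for the exponential model."""
    e = np.exp(x[1] * t)
    return np.column_stack([e, x[0] * t * e])

r = residuals(x, t, y)
J = jacobian(x, t)
grad = J.T @ r       # gradient J(x)^T r(x)
H_gn = J.T @ J       # first term of (10.5): no second derivatives needed
```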
As we discuss in Chapter 17, the problem of minimizing the function (10.11) can be reformulated as a smooth constrained optimization problem. In this chapter we focus only on the ℓ₂-norm formulation (10.1).
10.2 Linear Least-squares Problems
We outline briefly three major algorithms for the unconstrained linear least-squares problem, in which the residual vector is affine, r(x) = Jx − y, so the objective becomes

    f(x) = \frac{1}{2} \| Jx - y \|^2

and its minimizer x* satisfies the normal equations J^T J x* = J^T y (10.14). We assume in most of our discussion that m ≥ n and that J has full column rank.
The first and most obvious algorithm solves the normal equations (10.14) directly: form J^T J and J^T y, compute the Cholesky factorization of the symmetric positive definite matrix J^T J, and solve two triangular systems. Since the condition number of J^T J is the square of that of J, this Cholesky-based method may result in less accurate solutions than those obtained from methods that avoid this squaring of the condition number.
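A minimal sketch of this approach; the matrix J and vector y below are random stand-ins for an m × n full-column-rank Jacobian and an observation vector:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
J = rng.standard_normal((50, 3))          # m = 50, n = 3, full column rank
y = rng.standard_normal(50)

c_and_low = cho_factor(J.T @ J)           # Cholesky factorization of J^T J
x_chol = cho_solve(c_and_low, J.T @ y)    # solve J^T J x = J^T y
```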
A second approach is based on a QR factorization of the matrix J: writing J = QR with Q having orthonormal columns and R upper triangular, the minimizer is obtained from the triangular system R x* = Q^T y.
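A corresponding sketch of the QR route, reusing J and y from the Cholesky sketch; note that J^T J is never formed:

```python
from scipy.linalg import solve_triangular

Q, R = np.linalg.qr(J)                # thin QR: Q is m x n, R is n x n
x_qr = solve_triangular(R, Q.T @ y)   # back-substitute R x = Q^T y
```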
The relative error in the final computed solution x̂ is usually proportional to the condition number of J, not its square, and this method is usually reliable.
Some situations, however, call for greater robustness or more information about the sensitivity of the solution to perturbations in the data. A third approach, based on the singular-value decomposition (SVD) of J, can be used in these circumstances.
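A sketch of the SVD route, again reusing J and y; the singular values σ_i measure exactly how sensitive the solution is to perturbations in the data:

```python
# Thin SVD J = U diag(sigma) V^T; the minimizer is
# x = sum_i (u_i^T y / sigma_i) v_i, and tiny sigma_i flag ill-conditioning.
U, sigma, Vt = np.linalg.svd(J, full_matrices=False)
x_svd = Vt.T @ ((U.T @ y) / sigma)
```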
All three approaches above have their place. The Cholesky-based algorithm is particularly useful when m ≫ n and it is practical to store J^T J but not J itself. The QR approach avoids squaring of the condition number and hence may be more numerically robust. While potentially the most expensive, the SVD approach is the most robust and reliable of all.
When J is actually rank-deficient, the SVD identifies the full solution set: any vector x of the form

    x^* = \sum_{\sigma_i \neq 0} \frac{u_i^T y}{\sigma_i} v_i + \sum_{\sigma_i = 0} \tau_i v_i,    \tau_i arbitrary,

is a minimizer of (10.20), where u_i, v_i are the singular vectors and σ_i the singular values of J. When the problem is very large, it may be efficient to use iterative techniques, such as the conjugate gradient method, to solve the normal equations (10.14).
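A sketch of the iterative route on the same J and y, applying J^T J only as matrix-vector products so the product matrix is never formed explicitly:

```python
from scipy.sparse.linalg import LinearOperator, cg

n = J.shape[1]
normal_op = LinearOperator((n, n), matvec=lambda v: J.T @ (J @ v))
x_cg, info = cg(normal_op, J.T @ y, atol=1e-10)   # info == 0 on convergence
```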
10.3 Algorithms For Nonlinear Least-squares Problems
The Gauss–Newton method can be viewed as a modified Newton's method with line search. Instead of solving the standard Newton equations ∇²f(x_k) p = −∇f(x_k), we solve instead the following system to obtain the search direction p_k^GN:

    J_k^T J_k \, p_k^{GN} = - J_k^T r_k    (10.23)
Three Advantages
1. Using J_k^T J_k in place of the Hessian does not require any additional derivative evaluations: the Jacobian J_k is already needed to compute the gradient ∇f_k = J_k^T r_k.
2. The first term J^T J in (10.5) often dominates the second term, so J_k^T J_k is frequently a good approximation to ∇²f_k.
3. Whenever J_k has full rank and the gradient ∇f_k is nonzero, the direction p_k^GN is a descent direction for f, since (p_k^GN)^T ∇f_k = −(p_k^GN)^T J_k^T J_k p_k^GN = −‖J_k p_k^GN‖² < 0.
The fourth advantage: p_k^GN is in fact the solution of the linear least-squares problem

    \min_p \frac{1}{2} \| J_k p + r_k \|^2    (10.26)

Hint: compare (10.14) with (10.23); the Gauss–Newton system (10.23) is exactly the normal equations of (10.26), with J = J_k and y = −r_k.
Hence, we can apply linear least-squares algorithms to (10.26). If the QR- or SVD-based algorithms are used, there is no need to form the approximate Hessian J_k^T J_k; we can work directly with the Jacobian J_k. The same is true if we use a conjugate-gradient technique to solve (10.26).
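A minimal Gauss–Newton sketch along these lines: each step solves the subproblem (10.26) with numpy's SVD-based lstsq, so J_k^T J_k is never formed. It reuses the illustrative residuals() and jacobian() helpers from Section 10.1, regenerates the synthetic data, and takes unit steps; a practical code would add a line search.

```python
def gauss_newton(x, t, y, tol=1e-8, max_iter=50):
    for _ in range(max_iter):
        r = residuals(x, t, y)
        J = jacobian(x, t)
        p, *_ = np.linalg.lstsq(J, -r, rcond=None)  # p solves (10.26)
        x = x + p                                   # unit step length
        if np.linalg.norm(p) < tol:
            break
    return x

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.0 * t) + 0.01 * np.random.default_rng(0).standard_normal(20)
x_fit = gauss_newton(np.array([1.5, -0.5]), t, y)
```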
Convergence Of The Gauss–Newton Method
Assume that the Jacobians J(x) have their singular values uniformly bounded away from zero in the region of interest; that is, there is a constant γ > 0 such that

    \| J(x) z \| \geq \gamma \| z \|    for all z,

for all x in a neighborhood of the level set L = {x : f(x) ≤ f(x_0)}, where x_0 is the starting point for the algorithm. Under this assumption, together with standard line-search conditions, the Gauss–Newton iterates satisfy lim_{k→∞} J_k^T r_k = 0; that is, the method converges to stationarity.
The Levenberg–Marquardt Method
Recall that the Gauss–Newton method is like Newton's method with line search, except that we use the convenient and often effective approximation (10.24) for the Hessian. The Levenberg–Marquardt method can be obtained by using the same Hessian approximation, but replacing the line search with a trust-region strategy: the step p_k solves

    \min_p \frac{1}{2} \| J_k p + r_k \|^2    subject to    \| p \| \leq \Delta_k,

which is equivalent to solving (J_k^T J_k + λI) p = −J_k^T r_k for some damping parameter λ ≥ 0 tied to the trust-region radius Δ_k.
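A bare-bones sketch of this idea via the classical damped system, with a simple accept/reject rule standing in for a full trust-region update; residuals(), jacobian(), t, and y are reused from the Gauss–Newton sketch. (A production code would rather call scipy.optimize.least_squares with method='lm'.)

```python
def levenberg_marquardt(x, t, y, lam=1e-3, tol=1e-8, max_iter=100):
    fval = 0.5 * np.sum(residuals(x, t, y) ** 2)
    for _ in range(max_iter):
        r = residuals(x, t, y)
        J = jacobian(x, t)
        g = J.T @ r
        if np.linalg.norm(g) < tol:
            break
        # Damped Gauss-Newton step: (J^T J + lam*I) p = -J^T r
        p = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -g)
        f_new = 0.5 * np.sum(residuals(x + p, t, y) ** 2)
        if f_new < fval:                  # success: accept, loosen damping
            x, fval, lam = x + p, f_new, 0.5 * lam
        else:                             # failure: reject, increase damping
            lam *= 10.0
    return x

x_lm = levenberg_marquardt(np.array([1.5, -0.5]), t, y)
```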
Methods For Large-Residual Problems
In large-residual problems, the quadratic model in (10.31) is an inadequate representation of the function f because the second-order part of the Hessian ∇²f(x) is too significant to be ignored.
In data-fitting problems, the presence of large residuals may indicate that the model is inadequate or that errors have been made in monitoring the observations.
Methods For Large-Residual Problems
On large-residual problems, the asymptotic convergence rate of Gauss–Newton and Levenberg–Marquardt algorithms is only linear, slower than the superlinear convergence rate attained by Newton and quasi-Newton methods on general problems.
Methods For Large-Residual Problems
It seems reasonable, therefore, to consider hybrid algorithms, which would behave like Gauss–Newton or Levenberg–Marquardt if the residuals turn out to be small (and hence take advantage of the cost savings associated with these methods), but switch to Newton or quasi-Newton steps if the residuals at the solution appear to be large.
Hybrid Algorithm
There are a couple of ways to construct hybrid algorithms. One approach chooses, at each iteration, between a Gauss–Newton step and a quasi-Newton (BFGS) step, according to the reduction in f that each achieves. In the zero-residual case, the method eventually always takes Gauss–Newton steps (giving quadratic convergence), while it eventually reduces to BFGS in the nonzero-residual case (giving superlinear convergence).
Combine Gauss–Newton And Quasi-Newton
A second way to combine Gauss–Newton and quasi-Newton ideas is to maintain approximations to just the second-order part of the Hessian, that is, the summation term Σ_j r_j(x) ∇²r_j(x) in (10.5). We describe the algorithm of Dennis, Gay, and Welsch [90], which is probably the best-known algorithm in this class because of its implementation in the well-known NL2SOL package.
10.4 Orthogonal Distance Regression
So far we have assumed that any errors in the ordinates (the times t_j) are tiny by comparison with the errors in the observations. This assumption often is reasonable, but there are cases where the answer can be seriously distorted if we fail to take possible errors in the ordinates into account. Models that take these errors into account are known in the statistics literature as errors-in-variables models.
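As an illustration (not from the chapter), SciPy's scipy.odr module implements orthogonal distance regression for such errors-in-variables fits; the exponential model and noise levels below are assumptions of the sketch:

```python
import numpy as np
from scipy import odr

def model(beta, t):
    # Illustrative model y = beta0 * exp(beta1 * t); beta plays the role of x.
    return beta[0] * np.exp(beta[1] * t)

rng = np.random.default_rng(2)
t_obs = np.linspace(0.0, 1.0, 20) + 0.02 * rng.standard_normal(20)  # noisy t_j
y_obs = 2.0 * np.exp(-1.0 * t_obs) + 0.05 * rng.standard_normal(20)

fit = odr.ODR(odr.RealData(t_obs, y_obs), odr.Model(model), beta0=[1.0, -0.5])
result = fit.run()
print(result.beta)   # parameter estimates accounting for errors in t_j
```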