
Research Article

Partitioned Quasi-Newton Approximation for Direct Collocation Methods and Its Application to the Fuel-Optimal Control of a Diesel Engine

Jonas Asprion,1 Oscar Chinellato,2 and Lino Guzzella1

1 Institute for Dynamic Systems and Control, ETH Zurich, Sonneggstrasse 3, 8092 Zurich, Switzerland
2 FPT Motorenforschung AG, Schlossgasse 2, 9320 Arbon, Switzerland

Correspondence should be addressed to Jonas Asprion; [email protected]

Received 26 November 2013; Revised 17 February 2014; Accepted 18 February 2014; Published 25 March 2014

Academic Editor: M. Montaz Ali

Copyright © 2014 Jonas Asprion et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The numerical solution of optimal control problems by direct collocation is a widely used approach. Quasi-Newton approximations of the Hessian of the Lagrangian of the resulting nonlinear program are also common practice. We illustrate that the transcribed problem is separable with respect to the primal variables and propose the application of dense quasi-Newton updates to the small diagonal blocks of the Hessian. This approach resolves memory limitations, preserves the correct sparsity pattern, and generates more accurate curvature information. The effectiveness of this improvement when applied to engineering problems is demonstrated. As an example, the fuel-optimal and emission-constrained control of a turbocharged diesel engine is considered. First results indicate a significantly faster convergence of the nonlinear program solver when the method proposed is used instead of the standard quasi-Newton approximation.

1. Introduction

Quasi-Newton (QN) methods have become very popular in the context of nonlinear optimisation. Above all, in nonlinear programs (NLPs) arising from direct transcription of optimal control problems (OCPs), the Hessian of the Lagrangian often cannot be derived analytically in a convenient way. Algorithmic differentiation may fail due to unsupported operations or black-box parts in the model functions. Furthermore, both approaches are computationally expensive if the model functions are complex and yield long expressions for the second derivatives. On the other hand, numerical approximation by finite differences is inaccurate and hardly improves the computational performance.

A common approach in these cases is to approximate the Hessian by QN updates using gradient information collected during the NLP iterations. However, if applied to real-world OCPs, several limitations arise. These problems often exhibit large dimensions; thus, only limited-memory versions of the approximations are applicable. Since many updates may be necessary until a good approximation of the full Hessian is obtained, the approximation remains poor when using the most recent steps only. Furthermore, the favourable sparsity structure of the underlying discretisation scheme is generally not preserved. This fill-in drastically reduces the performance of solvers for the linear system of equations defining the step direction during each NLP iteration.

Partial separability, a concept introduced by [1], describes a structural property of a nonlinear optimisation problem. When present, this property allows for partitioned QN updates of the Hessian of the Lagrangian (or of the objective, in the unconstrained case). For unconstrained optimisation, this approach was proposed and its convergence properties were analysed in [2]. Although only superlinear local convergence can be proven, a performance close to that obtained with exact Newton methods, which exhibit quadratic local convergence, was observed in practical experiments [3, 4].

This brief paper presents how partitioned QN updates can be applied to the NLPs resulting from direct collocation of OCPs. The concept of direct collocation to solve OCPs

Hindawi Publishing Corporation, Journal of Applied Mathematics, Volume 2014, Article ID 341716, 6 pages. http://dx.doi.org/10.1155/2014/341716


has been widely used and analysed [5–9]. We will show that the Lagrangian of the resulting NLPs is separable in the primal variables at each discretisation point. Due to this separability, its Hessian can be approximated by full-memory QN updates of the small diagonal blocks. This procedure increases the accuracy of the approximation, preserves the sparsity structure, and resolves memory limitations. The results are first derived for Radau collocation schemes, which include the right interval boundary as a collocation point. The adaptations to Gauss schemes, which have internal collocation points only, and to Lobatto schemes, which include both boundaries, are provided thereafter in condensed form. A consistent description of all three families of collocation is provided in [10].

The partitioned update is applied to a real-world engineering problem. The fuel consumption of a turbocharged diesel engine is minimised while the limits on the cumulative pollutant emissions need to be satisfied. This problem is cast in the form of an OCP and is transcribed by Radau collocation. The resulting NLP is solved by an exact and a quasi-Newton method. For the latter, the partitioned update achieves an increased convergence rate and a higher robustness with respect to a poor initialisation of the approximation as compared to the full QN update. Therefore, the findings for the unconstrained case seem to transfer to the NLPs resulting from direct collocation of OCPs.

2. Material and Methods

Consider the system of nonautonomous ordinary differential equations (ODEs) $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u})$ on the interval $[t_0, t_f]$, with $\mathbf{x}, \mathbf{f} \in \mathbb{R}^{1 \times n_x}$ and $\mathbf{u} \in \mathbb{R}^{1 \times n_u}$. Radau collocation represents each element of the state vector $\mathbf{x}(t)$ as a polynomial, say of degree $N$. The time derivative of this polynomial is then equated to the values of $\mathbf{f}$ at the $N$ collocation points $t_0 < \tau_1 < \tau_2 < \cdots < \tau_N = t_f$,

$$\begin{pmatrix} \dot{\mathbf{x}}_1 \\ \dot{\mathbf{x}}_2 \\ \vdots \\ \dot{\mathbf{x}}_N \end{pmatrix} \approx \underbrace{\left[\mathbf{d}_0, \tilde{\mathbf{D}}\right]}_{=:\mathbf{D}} \cdot \begin{pmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \mathbf{x}_2 \\ \vdots \\ \mathbf{x}_N \end{pmatrix} \overset{!}{=} \begin{pmatrix} \mathbf{f}(\mathbf{x}_1, \mathbf{u}_1) \\ \mathbf{f}(\mathbf{x}_2, \mathbf{u}_2) \\ \vdots \\ \mathbf{f}(\mathbf{x}_N, \mathbf{u}_N) \end{pmatrix}. \quad (1)$$

The notation $\mathbf{x}_j := \mathbf{x}(\tau_j)$ is adopted. The left boundary $\tau_0 = t_0$ is a noncollocated point in Radau collocation schemes. By introducing the appropriate matrices, this matrix equation in $\mathbb{R}^{N \times n_x}$ can be written in short as

$$\mathbf{D} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{X} \end{bmatrix} = \mathbf{F}(\mathbf{X}, \mathbf{U}). \quad (2)$$

The rows of $\mathbf{X}$ and $\mathbf{F}$ correspond to one collocation point each. In turn, the columns of $\mathbf{X}$ and $\mathbf{F}$ represent one state variable and its corresponding ODE right-hand side at all collocation points. In the following, consider the notation in (2) as shorthand for stacking the transposes of the rows in one large column vector.

The step length $h = t_f - t_0$ of the interval is assumed to be accounted for in the differentiation matrix $\mathbf{D}$. Lagrange interpolation by barycentric weights is used to calculate $\mathbf{D}$ along with the vector of the quadrature weights $\mathbf{w}$ [11]. The latter may be used to approximate the definite integral of a function $g(t)$ as $\int_{t_0}^{t_f} g(t)\,dt \approx \sum_{j=1}^{N} w_j\, g(\tau_j)$.
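As an illustration, the differentiation matrix $\mathbf{D}$ can be assembled from the barycentric weights along the lines of [11]. The sketch below is not the authors' implementation; it uses, as a small example, the two collocation nodes of the 2-stage Radau IIA scheme on $[0, 1]$ ($\tau_1 = 1/3$, $\tau_2 = 1$, with the noncollocated left boundary $\tau_0 = 0$) and verifies the construction on the polynomial $x(t) = t^2$.

```python
import numpy as np

def barycentric_weights(tau):
    """Barycentric weights w_j = 1 / prod_{k != j} (tau_j - tau_k)."""
    n = len(tau)
    w = np.ones(n)
    for j in range(n):
        for k in range(n):
            if k != j:
                w[j] /= tau[j] - tau[k]
    return w

def differentiation_matrix(tau):
    """D_ij = (w_j / w_i) / (tau_i - tau_j) for i != j; each row sums to zero."""
    n = len(tau)
    w = barycentric_weights(tau)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                D[i, j] = (w[j] / w[i]) / (tau[i] - tau[j])
        D[i, i] = -np.sum(D[i, :])
    return D

# Noncollocated left boundary plus the two Radau IIA nodes on [0, 1]
tau = np.array([0.0, 1.0 / 3.0, 1.0])
D = differentiation_matrix(tau)[1:, :]   # keep the rows of the collocation points

x_vals = tau**2                          # samples of x(t) = t^2
print(D @ x_vals)                        # ≈ [2/3, 2], i.e. dx/dt = 2t at the nodes
```

Because the degree of $x(t)$ does not exceed the polynomial degree of the scheme, the differentiation is exact up to round-off.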

2.1. Direct Collocation of Optimal Control Problems. We consider an OCP of the form

$$\min_{\mathbf{x}(\cdot), \mathbf{u}(\cdot)} \int_0^T L(\mathbf{x}(t), \mathbf{u}(t))\,dt \quad (3a)$$

$$\text{s.t.} \quad \dot{\mathbf{x}}(t) - \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t)) = 0, \quad \forall t \in [0, T], \quad (3b)$$

$$\int_0^T \mathbf{g}(\mathbf{x}(t), \mathbf{u}(t))\,dt \le 0, \quad (3c)$$

$$\mathbf{c}(\mathbf{x}(t), \mathbf{u}(t), t) \le 0, \quad \forall t \in [0, T], \quad (3d)$$

where $L \in \mathbb{R}$, $\mathbf{g} \in \mathbb{R}^{n_g}$, and $\mathbf{c} \in \mathbb{R}^{n_c}$. Simple bounds on $\mathbf{x}$ and $\mathbf{u}$, equality constraints, and a fixed initial or end state can be included in the path constraints (3d). An objective term in Lagrange form is used, which is preferable over an equivalent Mayer term [12, Section 4.9].

Direct transcription discretises all functions and integrals by consistently applying an integration scheme. Here, $k = 1, \ldots, m$ integration intervals $[t_{k-1}, t_k]$ are used, with $0 = t_0 < t_1 < \cdots < t_m = T$. The number of collocation points $N_k$ can be different for each interval. Summing up the collocation points throughout all integration intervals results in a total of $M = l(m, N_m)$ discretisation points. The "linear index" $l$ thereby corresponds to collocation node $i$ in interval $k$,

$$l := l(k, i) = i + \sum_{\alpha=1}^{k-1} N_\alpha. \quad (4)$$
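The index map (4) is straightforward to implement; the interval counts $N_k$ used below are hypothetical values chosen only for illustration.

```python
def linear_index(k, i, N):
    """Linear index l(k, i) = i + sum of N_alpha for alpha = 1..k-1,
    with 1-based interval k, node i, and N = [N_1, ..., N_m]."""
    return i + sum(N[:k - 1])

N = [3, 2, 3]                           # N_k for m = 3 intervals (hypothetical)
print(linear_index(1, 1, N))            # first node of interval 1 -> 1
print(linear_index(2, 1, N))            # first node of interval 2 -> 4
M = linear_index(len(N), N[-1], N)      # total number of points M = l(m, N_m)
print(M)                                # -> 8
```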

The transcribed OCP reads

$$\min_{\mathbf{x}_\bullet, \mathbf{u}_\bullet} \sum_{l=1}^{M} W_l \cdot L(\mathbf{x}_l, \mathbf{u}_l) \quad (5a)$$

$$\text{s.t.} \quad \mathbf{D}^{(k)} \begin{bmatrix} \mathbf{X}^{(k-1)}_{N_{k-1}, \bullet} \\ \mathbf{X}^{(k)} \end{bmatrix} - \mathbf{F}\left(\mathbf{X}^{(k)}, \mathbf{U}^{(k)}\right) = 0, \quad k = 1, \ldots, m, \quad (5b)$$

$$\sum_{l=1}^{M} W_l \cdot \mathbf{g}(\mathbf{x}_l, \mathbf{u}_l) \le 0, \quad (5c)$$

$$\mathbf{c}_l(\mathbf{x}_l, \mathbf{u}_l) \le 0, \quad l = 1, \ldots, M. \quad (5d)$$

The notation $x_\bullet$ denotes all instances of variable $x_l$ at any applicable index $l$. The vector of the "global" quadrature weights $\mathbf{W}$ results from stacking the vectors of the quadrature weights $\mathbf{w}^{(k)}$ of each interval $k$ after removing the first element, which is zero. For the first interval, $\mathbf{X}^{(k-1)}_{N_{k-1}, \bullet}$ is the initial state $\mathbf{x}(0)$.

The Lagrangian of the NLP (5a)–(5d) is the sum of the objective (5a) and all constraints (5b), (5c), and (5d), which are weighted by the Lagrange multipliers $\lambda$. To clarify the


notation, the Lagrange multipliers are grouped according to the problem structure. The $n_x \cdot N_k$ multipliers for the discretised dynamic constraints on each integration interval $k$ are denoted by $\lambda_d^{(k)}$, the $n_g$ multipliers for the integral inequalities are stacked in $\lambda_g$, and the $n_c$ multipliers for the path constraints at each discretisation point $l$ are gathered in the vector $\lambda_{c,l}$.

2.2. Separability in the Primal Variables. The objective (5a), the integral inequalities (5c), and the path constraints (5d) are inherently separated with respect to time; that is, the individual terms are pairwise disjoint in $\mathbf{x}_l$ and $\mathbf{u}_l$. We thus focus on the separability of the dynamic constraints (5b). For the derivation, we assume that $f$, $x$, and $u$ are scalar. The extension to the vector-valued case is straightforward and will be provided subsequently.

Consider the term of the Lagrangian representing the discretised dynamic constraints (5b) for interval $k$,

$$\mathcal{L}_d^{(k)} := \sum_{i=1}^{N_k} \lambda_{d,i}^{(k)} \left( d_{0,i}^{(k)}\, x_{N_{k-1}}^{(k-1)} + \sum_{j=1}^{N_k} D_{ij}^{(k)}\, x_j^{(k)} - f\left(x_i^{(k)}, u_i^{(k)}\right) \right). \quad (6)$$

This formulation constitutes a separation in the dual variables (the Lagrange multipliers). By collecting terms at each collocation point and accounting for the $d_0$ terms in the previous interval, we obtain a separation in the primal variables,

$$\mathcal{L}_d^{(k)} = \sum_{i=1}^{N_k} \left( \left[ \sum_{j=1}^{N_k} D_{ji}^{(k)}\, \lambda_{d,j}^{(k)} + \delta_i^{(k)} \sum_{j=1}^{N_{k+1}} d_{0,j}^{(k+1)}\, \lambda_{d,j}^{(k+1)} \right] x_i^{(k)} - \lambda_{d,i}^{(k)}\, f\left(x_i^{(k)}, u_i^{(k)}\right) \right), \quad (7)$$

with

$$\delta_i^{(k)} = \begin{cases} 1, & \text{if } i = N_k,\ k \neq m, \\ 0, & \text{otherwise.} \end{cases} \quad (8)$$

Each term inside the round brackets in (7) is a collocation-point-separated part of the Lagrangian which stems from the dynamic constraints. We denote these terms by $\mathcal{L}_{d,i}^{(k)}$ and introduce the notation

$$\omega := (x, u), \qquad \omega_i^{(k)} := \left(x_i^{(k)}, u_i^{(k)}\right). \quad (9)$$

The gradient with respect to the primal variables is

$$\nabla_\omega \mathcal{L}_{d,i}^{(k)} = \left( \lambda_d^{(k)T} \mathbf{D}_{\bullet i}^{(k)} + \delta_i^{(k)}\, \lambda_d^{(k+1)T} \mathbf{d}_0^{(k+1)},\ 0 \right) - \lambda_{d,i}^{(k)}\, \frac{\partial f\left(\omega_i^{(k)}\right)}{\partial \omega}. \quad (10)$$

The Hessian is simply

$$\nabla_\omega^2 \mathcal{L}_{d,i}^{(k)} = -\lambda_{d,i}^{(k)}\, \frac{\partial^2 f\left(\omega_i^{(k)}\right)}{\partial \omega^2}. \quad (11)$$
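Equation (11) shows that each element Hessian requires only the second derivative of the model function, scaled by the corresponding multiplier. A minimal numerical check, with a hypothetical scalar model function $f(\omega) = -0.5x + u \sin x$ and central finite differences standing in for the exact second derivatives, might look as follows.

```python
import numpy as np

def f(omega):
    """Hypothetical scalar model function f(x, u), for illustration only."""
    x, u = omega
    return -0.5 * x + u * np.sin(x)

def hessian_fd(fun, omega, h=1e-4):
    """Central finite-difference Hessian of a scalar function."""
    n = len(omega)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            H[i, j] = (fun(omega + e_i + e_j) - fun(omega + e_i - e_j)
                       - fun(omega - e_i + e_j) + fun(omega - e_i - e_j)) / (4 * h * h)
    return H

lam = 1.3                                # multiplier of this collocation point
omega = np.array([0.7, 0.4])             # omega = (x, u)
H_element = -lam * hessian_fd(f, omega)  # element Hessian, cf. (11)
```

The analytic Hessian of this $f$ has entries $\partial^2 f / \partial x^2 = -u \sin x$, $\partial^2 f / \partial x \partial u = \cos x$, and $\partial^2 f / \partial u^2 = 0$, which the finite-difference block reproduces to discretisation accuracy.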

2.2.1. Vector-Valued Case and Complete Element Lagrangian. For multiple control inputs and state variables, the primal variables $\omega_l$ at each collocation point become a vector in $\mathbb{R}^{1 \times (n_u + n_x)}$. Consistently, we define the gradient of a scalar function with respect to $\omega$ as a row vector. Thus, the model Jacobian $\partial \mathbf{f} / \partial \omega$ is a matrix in $\mathbb{R}^{n_x \times (n_u + n_x)}$, and the Hessian of each model-function element $s$, $\partial^2 f_s / \partial \omega^2$, is a square matrix of size $(n_u + n_x)$. The multiplier $\lambda_{d,i}^{(k)}$ itself also becomes a vector in $\mathbb{R}^{n_x}$. All terms involving $\mathbf{f}$, its Jacobian, or its Hessian therefore turn into sums.

The full element Lagrangian $\mathcal{L}_i^{(k)}$ consists of the terms of the dynamic constraints $\mathcal{L}_{d,i}^{(k)}$ as derived above, plus the contributions of the objective, the integral inequalities, and the path constraints,

$$\begin{aligned} \mathcal{L}_i^{(k)} = {} & W_l \cdot L(\omega_l) \\ & + \mathbf{x}_i^{(k)} \cdot \left( \sum_{j=1}^{N_k} D_{ji}^{(k)}\, \lambda_{d,j}^{(k)} + \delta_i^{(k)} \sum_{j=1}^{N_{k+1}} d_{0,j}^{(k+1)}\, \lambda_{d,j}^{(k+1)} \right) \\ & - \mathbf{f}\left(\omega_i^{(k)}\right) \lambda_{d,i}^{(k)} + W_l \cdot \lambda_g^T\, \mathbf{g}(\omega_l) + \lambda_{c,l}^T\, \mathbf{c}_l(\omega_l). \end{aligned} \quad (12)$$

The Lagrangian of the full NLP is obtained by summing these element Lagrangians, which are separated in the primal variables. Its Hessian thus is a perfect block-diagonal matrix with uniformly sized square blocks of size $(n_u + n_x)$.

2.2.2. Extension to Gauss and Lobatto Collocation. Gauss collocation does not include the right interval boundary. Thus, the terms involving $\mathbf{d}_0$ can be included locally in each interval, which simplifies the separation in the primal variables. However, the continuity constraint

$$\mathbf{x}_0^{(k+1)} = \mathbf{x}_0^{(k)} + \mathbf{w}^{(k)T}\, \mathbf{F}\left(\mathbf{X}^{(k)}, \mathbf{U}^{(k)}\right) \quad (13)$$

has to be introduced for each interval. Similarly to the procedure above, this constraint can be separated. The quadrature weights $\mathbf{w}^{(k)}$ are stacked in $\mathbf{W}$ without any modification.

Lobatto collocation includes both boundaries as collocation points. Thus, the matrix $\mathbf{D}$ in (1) and (2) has an additional "zeroth" row, and the argument of $\mathbf{F}$ becomes $[\mathbf{x}_0^T, \mathbf{X}^T]^T$ in (2). The additional term

$$-\delta_i^{(k)}\, \lambda_{d,0}^{(k+1)}\, f\left(\omega_i^{(k)}\right) \quad (14)$$

arises in $\mathcal{L}_{d,i}^{(k)}$. Each element of $\mathbf{W}$ corresponding to the interval boundary between any two neighbouring intervals $k$ and $k+1$ is a sum of the two weights $w_{N_k}^{(k)}$ and $w_0^{(k+1)}$.

2.3. Block-Diagonal Approximation of the Hessian. The separability of the problem with respect to the primal variables allows a perfect exploitation of the problem sparsity. In fact, the Jacobian of the objective and the constraints, as well as the Hessian of the Lagrangian, can be constructed from the first and second partial derivatives of the nonlinear model functions $L$, $\mathbf{f}$, $\mathbf{g}$, and $\mathbf{c}$ at each discretisation point [13].


We propose to also exploit the separability when calculating QN approximations of the Hessian. These iterative updates collect information about the curvature of the Lagrangian by observing the change of its gradient along the NLP iterations. Although they perform well in practice, they exhibit several drawbacks for large problems.

(I) Loss of sparsity. QN approximations generally do not preserve the sparsity pattern of the exact Hessian, which leads to low computational performance [12, Section 4.13]. Enforcing the correct sparsity pattern results in QN schemes with poor performance [14, Section 7.3].

(II) Storage versus accuracy. Due to the loss of sparsity, the approximated Hessian needs to be stored in dense format. To resolve possible memory limitations, "limited-memory" updates can be applied, which rely on a few recent gradient samples only. However, these methods provide less accuracy than their full-memory equivalents [14, Section 7.2].

(III) Dimensionality versus sampling. When sampling the gradient of a function that lives in a high-dimensional space, many samples are required to construct an accurate approximation. In fact, to obtain an approximation that is valid along any direction, an entire spanning set needs to be sampled. Although QN methods require accurate second-order information only along the direction of the steps [15], the step direction may change fast in highly nonlinear problems such as the one considered here. In these cases, an exhaustive set of gradient samples would ensure fast convergence, which conflicts with (II).

Using approximations of the small diagonal blocks, that is, exploiting the separability illustrated in Section 2.2, resolves these problems.

(I) The exact sparsity pattern of the Hessian is preserved.

(II) Only $M (n_x + n_u)^2$ numbers have to be stored, compared to $M^2 (n_x + n_u)^2$ for the full Hessian.

(III) Since the dimension of each diagonal block is small, a good approximation is already obtained after a few iterations of the Hessian update [3, 4].
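To put point (II) in numbers, consider dimensions of the magnitude of the engine problem treated later ($M = 217$ discretisation points, $n_x = 5$ states, $n_u = 4$ inputs); the ratio between the dense and the partitioned storage is exactly $M$.

```python
M = 217                  # discretisation points (cf. test case (b) in Section 3)
n_x, n_u = 5, 4          # states and control inputs of the engine model
n = n_x + n_u

block_storage = M * n**2        # partitioned: M dense blocks of size n x n
dense_storage = (M * n)**2      # full dense Hessian approximation

print(block_storage)            # 17577
print(dense_storage)            # 3814209
print(dense_storage // block_storage)   # 217, i.e. a factor of M
```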

The partitioned QN update can be combined with the exploitation of the problem sparsity to reduce the number of model evaluations required. In fact, when these two concepts are combined, the gradients of the model functions at each collocation point are sufficient to construct an accurate and sparsity-preserving approximation of the Hessian of the Lagrangian.

2.4. Implementation. Any QN approximation operates with the differences between two consecutive iterates and the corresponding gradients of the Lagrangian. For constrained problems,

$$\mathbf{s}^T := \hat{\omega} - \omega, \quad (15)$$

$$\mathbf{y}^T := \nabla_\omega \mathcal{L}(\hat{\omega}, \lambda) - \nabla_\omega \mathcal{L}(\omega, \lambda). \quad (16)$$

The hat indicates the values at the current iteration, that is, the new data. In the following formulas, $\mathbf{B}$ denotes the QN approximation. Here, the damped BFGS update is used [14, Section 18.3], which reads

$$\theta = \begin{cases} 1, & \text{if } \mathbf{s}^T \mathbf{y} \ge 0.2\, \mathbf{s}^T \mathbf{B} \mathbf{s}, \\ \dfrac{0.8\, \mathbf{s}^T \mathbf{B} \mathbf{s}}{\mathbf{s}^T \mathbf{B} \mathbf{s} - \mathbf{s}^T \mathbf{y}}, & \text{otherwise,} \end{cases} \quad (17a)$$

$$\mathbf{r} = \theta \mathbf{y} + (1 - \theta) \mathbf{B} \mathbf{s}, \quad (17b)$$

$$\mathbf{B} \leftarrow \mathbf{B} - \frac{\mathbf{B} \mathbf{s} \mathbf{s}^T \mathbf{B}}{\mathbf{s}^T \mathbf{B} \mathbf{s}} + \frac{\mathbf{r} \mathbf{r}^T}{\mathbf{s}^T \mathbf{r}}. \quad (17c)$$

This update scheme preserves positive definiteness, which is mandatory if a line-search globalisation is used. In a trust-region framework, indefinite approaches such as the safeguarded SR1 update [14, Section 6.2] could be advantageous since they can approximate the generally indefinite Hessian of the full or element Lagrangian more accurately.
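A minimal sketch of the damped BFGS update (17a)–(17c), applied independently to each diagonal block as proposed above, is given below; the block size and the synthetic step and gradient-difference data are illustrative, not taken from the engine problem.

```python
import numpy as np

def damped_bfgs_update(B, s, y):
    """One damped BFGS update (17a)-(17c) of a single Hessian block B."""
    Bs = B @ s
    sBs = s @ Bs
    sy = s @ y
    # (17a): Powell damping keeps the updated block positive definite
    theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
    r = theta * y + (1.0 - theta) * Bs                            # (17b)
    return B - np.outer(Bs, Bs) / sBs + np.outer(r, r) / (s @ r)  # (17c)

# Partitioned update: one small dense block B_l per collocation point
rng = np.random.default_rng(0)
blocks = [np.eye(3) for _ in range(4)]        # initialised as B_0 = I
for l in range(len(blocks)):
    s_l = rng.standard_normal(3)              # step at this collocation point
    y_l = np.diag([2.0, 3.0, 1.0]) @ s_l      # gradient difference, cf. (16)
    blocks[l] = damped_bfgs_update(blocks[l], s_l, y_l)
```

When no damping is active ($\theta = 1$), the update satisfies the secant condition $\mathbf{B}^+ \mathbf{s} = \mathbf{y}$ on each block.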

The Hessian block $\mathbf{B}_l$ corresponding to the element Lagrangian (12) at collocation point $l$ is approximated as follows. In the difference of the gradients, all linear terms cancel. Thus, (16) becomes

$$\begin{aligned} \mathbf{y}_l^T = {} & \nabla_\omega \mathcal{L}_l(\hat{\omega}, \lambda) - \nabla_\omega \mathcal{L}_l(\omega, \lambda) \\ = {} & W_l \cdot \left( \frac{\partial L(\hat{\omega}_l)}{\partial \omega} - \frac{\partial L(\omega_l)}{\partial \omega} \right) + \lambda_{d,l}^T \cdot \left( \frac{\partial \mathbf{f}(\omega_l)}{\partial \omega} - \frac{\partial \mathbf{f}(\hat{\omega}_l)}{\partial \omega} \right) \\ & + W_l \cdot \lambda_g^T \cdot \left( \frac{\partial \mathbf{g}(\hat{\omega}_l)}{\partial \omega} - \frac{\partial \mathbf{g}(\omega_l)}{\partial \omega} \right) + \lambda_{c,l}^T \cdot \left( \frac{\partial \mathbf{c}_l(\hat{\omega}_l)}{\partial \omega} - \frac{\partial \mathbf{c}_l(\omega_l)}{\partial \omega} \right). \end{aligned} \quad (18)$$

Recall that the linear index $l$ is defined such that $\lambda_{d,l} = \lambda_{d,i}^{(k)}$. The QN update (17a)–(17c) is applied to each diagonal block $\mathbf{B}_l$ individually, with $\mathbf{s}_l^T = \hat{\omega}_l - \omega_l$ and $\mathbf{y}_l$ given by (18). As initialisation, one of the approaches described in [14, Chapter 6] can be used.

2.5. Engineering Test Problem. As a real-world engineering problem, we consider the minimisation of the fuel consumption of a turbocharged diesel engine. The statutory limits for the cumulative NO$_x$ and soot emissions are imposed as $n_g = 2$ integral inequality constraints. The $n_u = 4$ control inputs are the position of the variable-geometry turbine, the start of the main injection, the common-rail pressure, and the fuel mass injected per cylinder and combustion cycle. The model is described and its experimental validation is provided in [16]. It features a dynamic mean-value model for the air path with $n_x = 5$ state variables, and physics-based models for the combustion and for the pollutant emissions. The resulting OCP is stated in [17, 18]. The desired load torque, the bounds


Figure 1: Convergence behaviour of IPOPT on the two test cases. Panels (a) and (b) plot the primal and dual infeasibility against the iteration number for the exact Hessian, the partitioned BFGS update, and the full BFGS update (iterations 0–28 in case (a), 0–160 in case (b)).

on the actuator ranges, and mechanical and thermal limits are imposed as nonlinear and linear path constraints (3d).

The model evaluations are expensive. Therefore, QN updates are preferable over exact Newton methods to achieve a fast solution process. The main drawback is the slow local convergence rate of QN methods when applied to the large NLPs resulting from the consideration of long time horizons in the OCP [18].

3. Results and Discussion

The results presented here are generated using the NLP solver IPOPT [19, 20] with the linear solver MUMPS [21] and the fill-reducing preordering implemented in METIS [22]. Either the exact Hessian, calculated using central finite differences on the model functions, or a full or the partitioned QN update just described is supplied to the solver as user-defined Hessian. In all cases, the first derivatives are calculated by forward finite differences.

Radau collocation at flipped Legendre nodes is applied. These collocation points are the roots of the orthogonal Legendre polynomials and have to be computed numerically [23, Section 2.3]. The resulting scheme, sometimes termed Radau IIA, exhibits a combination of advantageous properties [24], [25, Section 3.5].

Two test cases for the engineering problem outlined in Section 2.5 are considered. Case (a) is a mild driving pattern of 6 s duration which is discretised by first-order collocation with a uniform step length of 0.5 s. This discretisation results in $M = 13$ total collocation points and 117 NLP variables. Test case (b) considers a more demanding driving pattern of 58 s duration and uses third-order collocation with a step size of 0.8 s, resulting in $M = 217$ collocation points and 1,953 NLP variables.

The performance is assessed in terms of the number of iterations required to achieve the requested tolerance. Figure 1 shows the convergence behaviour of the exact-Newton method and of the two QN methods. Starting with iteration 5, the full Newton step is always accepted. Thus, the difference between the local quadratic convergence of the exact Newton method and the local superlinear convergence of the full BFGS update becomes obvious. The partitioned update performs substantially better than the full update. Moreover, the advantage becomes more pronounced when the size (longer time horizon) and the complexity (more transient driving profile, higher-order collocation) of the problem are increased from the simple test case (a) to the more meaningful case (b).

The Hessian approximation is initialised by a multiple of the identity matrix, $\mathbf{B}_0 = \beta \mathbf{I}$. A factor of $\beta = 0.05$ is found to be a good choice for the problem at hand. Table 1 shows the number of iterations required as $\beta$ is changed. The partitioned update is robust against a poor initialisation, whereas the full update requires a significant number of iterations to recover. This finding confirms that an accurate approximation is obtained in fewer iterations when the partitioned QN update is applied.

4. Conclusion

We illustrated the separability of the nonlinear program resulting from the application of direct collocation to an optimal control problem. Subsequently, we presented how this structure can be exploited to apply a partitioned quasi-Newton update to the Hessian of the Lagrangian. This sparsity-preserving update yields a more accurate approximation of the Hessian in fewer iterations and thus increases the convergence rate of the NLP solver.


Table 1: Effect of the initialisation of the Hessian approximation on the number of NLP iterations until convergence, test case (a).

Method           | β = 0.01 | β = 0.05 | β = 0.1
Full BFGS        |    36    |    28    |   38
Partitioned BFGS |    22    |    20    |   21

A more accurate approximation of the second derivatives from first-order information is especially beneficial for highly nonlinear problems for which the exact second derivatives are expensive to evaluate. In fact, for the real-world engineering problem used as a test case here, symbolic or algorithmic differentiation is not an expedient option due to the complexity and the structure of the model. In this situation, using a quasi-Newton approximation based on first derivatives calculated by finite differences is a valuable alternative. The numerical tests presented in this paper indicate that a convergence rate close to that of an exact Newton method can be reclaimed by the application of a partitioned BFGS update.

A self-contained implementation of the partitioned update in the framework of an NLP solver itself could fully exploit the advantages of the method proposed. Furthermore, it should be assessed whether a trust-region globalisation is able to take advantage of an indefinite but possibly more accurate quasi-Newton approximation of the diagonal blocks of the Hessian of the Lagrangian.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was partially funded by the Swiss Innovation Promotion Agency CTI under Grant no. 10808.1 PFIW-IW.

References

[1] A. Griewank and A. Toint, "On the unconstrained optimization of partially separable functions," in Nonlinear Optimization, M. Powell, Ed., Academic Press, New York, NY, USA, 1981.

[2] A. Griewank and P. L. Toint, "Local convergence analysis for partitioned quasi-Newton updates," Numerische Mathematik, vol. 39, no. 3, pp. 429–448, 1982.

[3] D. C. Liu and J. Nocedal, "On the limited memory BFGS method for large scale optimization," Mathematical Programming, vol. 45, no. 1–3, pp. 503–528, 1989.

[4] J. Nocedal, "Large scale unconstrained optimization," in The State of the Art in Numerical Analysis, pp. 311–338, Oxford University Press, New York, NY, USA, 1996.

[5] L. T. Biegler, "Solution of dynamic optimization problems by successive quadratic programming and orthogonal collocation," Computers and Chemical Engineering, vol. 8, no. 3-4, pp. 243–247, 1984.

[6] D. Tieu, W. R. Cluett, and A. Penlidis, "A comparison of collocation methods for solving dynamic optimization problems," Computers and Chemical Engineering, vol. 19, no. 4, pp. 375–381, 1995.

[7] A. L. Herman and B. A. Conway, "Direct optimization using collocation based on high-order Gauss-Lobatto quadrature rules," Journal of Guidance, Control, and Dynamics, vol. 19, no. 3, pp. 592–599, 1996.

[8] D. A. Benson, G. T. Huntington, T. P. Thorvaldsen, and A. V. Rao, "Direct trajectory optimization and costate estimation via an orthogonal collocation method," Journal of Guidance, Control, and Dynamics, vol. 29, no. 6, pp. 1435–1440, 2006.

[9] S. Kameswaran and L. T. Biegler, "Convergence rates for direct transcription of optimal control problems using collocation at Radau points," Computational Optimization and Applications, vol. 41, no. 1, pp. 81–126, 2008.

[10] D. Garg, M. Patterson, W. W. Hager, A. V. Rao, D. A. Benson, and G. T. Huntington, "A unified framework for the numerical solution of optimal control problems using pseudospectral methods," Automatica, vol. 46, no. 11, pp. 1843–1851, 2010.

[11] J.-P. Berrut and L. N. Trefethen, "Barycentric Lagrange interpolation," SIAM Review, vol. 46, no. 3, pp. 501–517, 2004.

[12] J. T. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, Advances in Design and Control, Society for Industrial and Applied Mathematics, Philadelphia, Pa, USA, 2nd edition, 2010.

[13] M. A. Patterson and A. V. Rao, "Exploiting sparsity in direct collocation pseudospectral methods for solving optimal control problems," Journal of Spacecraft and Rockets, vol. 49, no. 2, pp. 364–377, 2012.

[14] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research and Financial Engineering, Springer, New York, NY, USA, 2nd edition, 2006.

[15] P. E. Gill and E. Wong, "Sequential quadratic programming methods," in Mixed Integer Nonlinear Programming, J. Lee and S. Leyffer, Eds., vol. 154 of The IMA Volumes in Mathematics and its Applications, pp. 301–312, Springer, New York, NY, USA, 2012.

[16] J. Asprion, O. Chinellato, and L. Guzzella, "Optimisation-oriented modelling of the NOx emissions of a diesel engine," Energy Conversion and Management, vol. 75, pp. 61–73, 2013.

[17] J. Asprion, C. H. Onder, and L. Guzzella, "Including drag phases in numerical optimal control of diesel engines," in Proceedings of the 7th IFAC Symposium on Advances in Automotive Control, pp. 489–494, Tokyo, Japan, 2013.

[18] J. Asprion, O. Chinellato, and L. Guzzella, "Optimal control of diesel engines: numerical methods, applications, and experimental validation," Mathematical Problems in Engineering, vol. 2014, Article ID 286538, 21 pages, 2014.

[19] A. Wächter and L. T. Biegler, "On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming," Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006.

[20] "IPOPT homepage," 2013, http://www.coin-or.org/Ipopt.

[21] "MUMPS homepage," 2013, http://mumps.enseeiht.fr.

[22] "METIS—serial graph partitioning and fill-reducing matrix ordering," 2013, http://glaros.dtc.umn.edu/gkhome/metis/metis/overview.

[23] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang, Spectral Methods: Fundamentals in Single Domains, Scientific Computation, Springer, Berlin, Germany, 2006.

[24] E. Hairer and G. Wanner, "Stiff differential equations solved by Radau methods," Journal of Computational and Applied Mathematics, vol. 111, no. 1-2, pp. 93–111, 1999.

[25] J. C. Butcher, Numerical Methods for Ordinary Differential Equations, John Wiley & Sons, Winchester, UK, 2003.
