Quasi-Newton Differential Dynamic Programming for Robust Low-Thrust Optimization


Etienne Pellegrini and Ryan P. Russell

AIAA/AAS Astrodynamics Specialists Conference

Minneapolis, MN, 8/13/12

Summary

• Introduction

• The Hybrid Differential Dynamic Programming (HDDP) Algorithm [Lantoine & Russell]

– State-Transition Matrices

• Quasi-Newton methods

– Application to HDDP

– The SR1 update

• Results

– 1D Landing

– 2D Spacecraft Problem [Bryson & Ho]

– Complete set of test problems

• Conclusions & Future work

2 Etienne Pellegrini – AIAA/AAS Astrodynamics Specialists Conference – 8/13/12 – Minneapolis, MN

State of the Art

Low thrust trajectories

Highly nonlinear, constrained problems

Need for specific and efficient NLP solvers

• DDP methods were introduced in the late 1960s [Mayne, Jacobson]

• Static/Dynamic Algorithm: uses Hessian shifting [Whiffen]

• HDDP: uses State-Transition Matrices approach

Motivation for this paper:

High computational cost of all these methods.


Introduction

[Figure panels: Classic NLP Solvers; DDP Methods]

[Figure panels: Classic NLP Solvers; HDDP Method]

The HDDP algorithm


The HDDP algorithm: STM approach

Sensitivities are obtained using the STMs

• Initialize $J^{*}_{x,N}(x)$ and $J^{*}_{xx,N}(x)$

• $J_{x,k}(x, u)$ and $J_{xx,k}(x, u)$ are obtained from backward mapping of $J^{*}_{x,k+1}(x)$ and $J^{*}_{xx,k+1}(x)$

• The control law then yields the state-only sensitivities $J^{*}_{x,k}(x)$ and $J^{*}_{xx,k}(x)$


• Decouples the optimization step from the propagation step

– Allows for parallelization of the computation

– Allows for approximations to the partial derivatives

• Forward sweep:

– n equations for the state

– n² equations for the 1st order STM

– n³ equations for the 2nd order STM

• Propagation of the STMs takes more than 80% of the compute time

• Requires the user to provide the second-order partial derivatives of the state dynamics
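The forward sweep described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes autonomous dynamics, a fixed-step RK4 integrator, and a toy pendulum model, and the names `augmented_rhs`, `propagate`, `f`, `fx`, and `fxx` are hypothetical. It propagates the state together with the first- and second-order STMs (n + n² + n³ equations) and checks that the second-order STM improves the prediction of a perturbed trajectory.

```python
import numpy as np

def augmented_rhs(x, phi1, phi2, f, fx, fxx):
    """Time derivatives of the state and of the 1st and 2nd order STMs."""
    A = fx(x)    # (n, n) Jacobian: A[i, a] = d f_i / d x_a
    H = fxx(x)   # (n, n, n) Hessians: H[i, a, b] = d^2 f_i / d x_a d x_b
    dphi1 = A @ phi1
    dphi2 = (np.einsum("ic,cab->iab", A, phi2)
             + np.einsum("icd,ca,db->iab", H, phi1, phi1))
    return f(x), dphi1, dphi2

def propagate(x0, t_final, steps, f, fx, fxx):
    """Fixed-step RK4 on the augmented system (assumed integrator)."""
    n = x0.size
    y = (x0.astype(float), np.eye(n), np.zeros((n, n, n)))
    h = t_final / steps
    rhs = lambda s: augmented_rhs(*s, f, fx, fxx)
    shift = lambda s, k, c: tuple(si + c * ki for si, ki in zip(s, k))
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(shift(y, k1, h / 2))
        k3 = rhs(shift(y, k2, h / 2))
        k4 = rhs(shift(y, k3, h))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
    return y

# Toy dynamics (assumption): pendulum, x = [angle, angular rate].
def f(x):
    return np.array([x[1], -np.sin(x[0])])

def fx(x):
    return np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])

def fxx(x):
    H = np.zeros((2, 2, 2))
    H[1, 0, 0] = np.sin(x[0])   # only nonzero second derivative
    return H

x0 = np.array([1.0, 0.0])
dx0 = np.array([1e-2, 5e-3])
x, phi1, phi2 = propagate(x0, 1.0, 100, f, fx, fxx)
x_pert, _, _ = propagate(x0 + dx0, 1.0, 100, f, fx, fxx)

lin = phi1 @ dx0
quad = lin + 0.5 * np.einsum("iab,a,b->i", phi2, dx0, dx0)
err_first = np.linalg.norm(x_pert - x - lin)     # 1st order prediction error
err_second = np.linalg.norm(x_pert - x - quad)   # 2nd order prediction error

n = x0.size
n_equations = n + n**2 + n**3   # state + 1st order STM + 2nd order STM
```

The `fxx` callback is the part the user must supply by hand, and the `phi2` block carries the n³ equations; both are what the quasi-Newton approach aims to avoid.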


The HDDP algorithm: STM approach

• Introduced in 1959 [Davidon]

• Used in many optimization applications

• Aim: approximating the curvature of the problem

Estimating the Hessian of the objective function

• Classical approach

– Gradient and estimate of the Hessian used to define a search direction

– Step chosen with a line search or trust region method

– Estimate of the Hessian is updated

• Estimate of the Hessian has to be positive definite
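The classical loop listed above can be sketched as follows. This is a generic illustration under stated assumptions, not taken from the slides: it uses the BFGS update, an Armijo backtracking line search, and a small convex quadratic as the objective; `quasi_newton_minimize`, `A`, and `b` are hypothetical names.

```python
import numpy as np

def quasi_newton_minimize(f, grad, x0, iters=100, tol=1e-8):
    """Classical quasi-Newton loop: direction, line search, Hessian update."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                 # initial Hessian estimate
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        p = -np.linalg.solve(B, g)     # search direction from gradient and B
        t = 1.0
        # Armijo backtracking line search chooses the step length
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        s = t * p
        g_new = grad(x + s)
        y = g_new - g
        if y @ s > 1e-12:              # curvature condition keeps B pos. def.
            Bs = B @ s
            B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
        x, g = x + s, g_new
    return x, B

# Toy objective (assumption): f(x) = 0.5 x^T A x - b^T x with A pos. def.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_min, B_final = quasi_newton_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                                       lambda x: A @ x - b,
                                       np.zeros(2))
```

The positive-definiteness requirement appears here as the curvature check before the update: without it, `p` would not be guaranteed to be a descent direction.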


Quasi-Newton Methods


Application to HDDP: estimating 𝚽𝟐,𝒌

• Different from traditional quasi-Newton:

– Not as suitable to estimate the Hessian of the cost function

– Estimates the 2nd order STM

Results in changes to the traditional methods

– No enforcement of the positive definiteness

– Requires a quasi-Newton update that approximates the Hessian accurately

– Step decided by the propagation of the new control law

– The 2nd order STM is a tensor composed of n Hessians

n quasi-Newton updates to apply

• Computation of the STM is decoupled: the optimization steps are untouched

• The user does not need to provide 2nd order derivatives
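One way to picture the "n quasi-Newton updates" is sketched below. It rests on an assumption not spelled out in the slides: that slice i of the 2nd order STM is treated as the Hessian of the i-th downstream state component, with row i of the 1st order STM as its gradient, so each slice gets its own secant equation. The function `sr1_tensor_update` and the toy quadratic map are hypothetical.

```python
import numpy as np

def sr1_tensor_update(phi2_est, dx, dphi1, tol=1e-8):
    """Apply one SR1 update to each of the n Hessian slices of the estimate.

    phi2_est : (n, n, n) current estimate of the 2nd order STM
    dx       : (n,) change in the upstream state between two evaluations
    dphi1    : (n, n) change in the 1st order STM; row i is the gradient
               change of the i-th downstream state component
    """
    new = phi2_est.copy()
    for i in range(phi2_est.shape[0]):
        r = dphi1[i] - phi2_est[i] @ dx      # secant-equation residual
        denom = r @ dx
        # Usual SR1 safeguard: skip the update if the denominator is tiny
        if abs(denom) > tol * np.linalg.norm(r) * np.linalg.norm(dx):
            new[i] += np.outer(r, r) / denom
    return new

# Toy map (assumption): x_next_i = b_i . x + 0.5 x^T H_i x, so the true
# 2nd order STM is the constant tensor H_true; both slices are indefinite.
H_true = np.array([[[2.0, 1.0], [1.0, -3.0]],
                   [[-1.0, 0.5], [0.5, 4.0]]])
b = np.array([[1.0, 0.0], [0.0, 1.0]])

def phi1_of(x):
    return b + np.einsum("iab,b->ia", H_true, x)

phi2_est = np.zeros((2, 2, 2))
x = np.zeros(2)
for dx in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    dphi1 = phi1_of(x + dx) - phi1_of(x)
    phi2_est = sr1_tensor_update(phi2_est, dx, dphi1)
    x = x + dx
```

Only first-order information (`phi1_of`) enters the update, which is why no second-order derivatives of the dynamics are needed in this sketch.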


SR1 Update

• A variety of quasi-Newton updates have been developed

– BFGS, DFP, Powell’s Damped BFGS, SR1, etc…

• Most of them: enforce positive definiteness of the estimate

– In classical quasi-Newton framework, a descent direction is needed

– In our application: we don’t need the estimate to be pos. def.

• Symmetric Rank 1 update

– Does not enforce convexity

– Results in estimates closer to the true Hessian [Conn et al.]
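The contrast between SR1 and a positive-definite update can be illustrated on a toy indefinite Hessian. This is a sketch under stated assumptions (a two-dimensional quadratic, exact gradient differences, and the common safeguards of skipping SR1 on a near-zero denominator and skipping BFGS when the curvature condition fails); `sr1_update`, `bfgs_update`, and `H_true` are hypothetical names.

```python
import numpy as np

def sr1_update(B, s, y, tol=1e-8):
    """Symmetric rank-1 update; skipped when the denominator is near zero."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) <= tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom

def bfgs_update(B, s, y):
    """BFGS update; skipped when the curvature condition y.s > 0 fails."""
    if y @ s <= 0.0:
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

H_true = np.diag([2.0, -1.0])   # indefinite Hessian (toy example)
B_sr1 = np.eye(2)
B_bfgs = np.eye(2)
for s in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    y = H_true @ s              # exact gradient change for a quadratic
    B_sr1 = sr1_update(B_sr1, s, y)
    B_bfgs = bfgs_update(B_bfgs, s, y)

err_sr1 = np.linalg.norm(B_sr1 - H_true)    # Frobenius-norm errors
err_bfgs = np.linalg.norm(B_bfgs - H_true)
```

In this example SR1 recovers the negative curvature along the second axis, while the safeguarded BFGS estimate stays positive definite and therefore cannot match the true Hessian.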


Results: Framework

• Tested on a set of 6 fixed final time problems

• Implemented in MATLAB; similar results are expected in other programming languages

• Metric to evaluate how accurate the Hessian estimates are:

[Khalfan et al.]

• Averaged over every stage and every state.


Results: 1D Landing


Method   Run time   Iterations
HDDP     22.95      11
QHDDP    7.19       11

[Figure: controls obtained with HDDP and QHDDP]

[Figure: states and controls found by QHDDP]

• 3 states: vertical position and velocity, and fuel

• 1 control: thrust

Results: 2D Spacecraft Problem

• Transfer between two coplanar circular orbits; minimize fuel


[Figure: trajectory obtained with QHDDP]

[Figure: controls obtained with HDDP and QHDDP]

Method   Run time   Iterations
HDDP     551.27     89
QHDDP    32.35      82

[Figure: metric value for 4 different strategies]

[Figure: run time for different strategies]

Results: 2D Spacecraft Problem

• Different scenarios: Test of a restart strategy

Trade-off between confidence in the estimate and computation time

• NB: User has to provide 2nd order derivatives again


• Similar problem, longer time of flight (35 TU), lower maximum thrust (0.05 MU·LU/TU²)

• Bang-bang structure as expected

Results: Multi-Rev Spacecraft Problem


[Figure: thrust (MU·LU/TU²) and eccentricity (QHDDP)]

[Figure: trajectory found by QHDDP]

Results: Complete Set

• Comparison of all test cases

• Metric: 2nd order STM well approximated for most cases

• Run time: the baseline case is faster in most cases


[Figures: timings for all test cases; metric for all test cases]

Conclusions

• Possibility of restarting the estimate with the real STM in order to improve confidence



• Propagation becomes 5.4 to 30 times faster

• Total computation time becomes 2.8 to 17 times faster

Future Work

• Testing on representative space trajectories

• Use of multi-step quasi-Newton methods

• Other updates

• Integration of numerical differencing or complex step differentiation

• Parallelization of the propagation


Thank you for your attention


Backup Slides


Set of test problems


• Small perturbation to the state:

(1)

• Taylor series:

(2)

• Replace 𝛿𝑋 in (1):

(3)

• Equate (2) and (3):


Derivation of the STMs
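The equations of this slide did not survive extraction. A standard form of the derivation is sketched below for reference; it may not match the slide's own notation or equation numbering. It assumes autonomous dynamics $\dot{X} = f(X)$, index notation with summation over repeated indices, and $\Phi_1$, $\Phi_2$ for the 1st and 2nd order STMs mapping an initial perturbation $\delta X_0$ to $\delta X(t)$.

```latex
% Assumed notation: f_{i,a} = \partial f_i / \partial X_a,
% f_{i,ab} = \partial^2 f_i / \partial X_a \partial X_b.
\begin{align*}
% Definition of the STMs (second-order expansion in \delta X_0)
\delta X_i(t) &\approx \Phi_{1,ia}\,\delta X_{0,a}
  + \tfrac{1}{2}\,\Phi_{2,iab}\,\delta X_{0,a}\,\delta X_{0,b} \\
% Time derivative of the expansion above
\delta\dot{X}_i &\approx \dot{\Phi}_{1,ia}\,\delta X_{0,a}
  + \tfrac{1}{2}\,\dot{\Phi}_{2,iab}\,\delta X_{0,a}\,\delta X_{0,b} \\
% Taylor series of the dynamics about the reference trajectory
\delta\dot{X}_i &\approx f_{i,c}\,\delta X_c
  + \tfrac{1}{2}\,f_{i,cd}\,\delta X_c\,\delta X_d \\
% Substitute the first line into the Taylor series, keep terms up to order 2
&\approx f_{i,c}\,\Phi_{1,ca}\,\delta X_{0,a}
  + \tfrac{1}{2}\left(f_{i,c}\,\Phi_{2,cab}
  + f_{i,cd}\,\Phi_{1,ca}\,\Phi_{1,db}\right)\delta X_{0,a}\,\delta X_{0,b} \\
% Equate coefficients of \delta X_0 and \delta X_0 \delta X_0
\dot{\Phi}_{1,ia} &= f_{i,c}\,\Phi_{1,ca}, \qquad \Phi_1(t_0) = I \\
\dot{\Phi}_{2,iab} &= f_{i,c}\,\Phi_{2,cab}
  + f_{i,cd}\,\Phi_{1,ca}\,\Phi_{1,db}, \qquad \Phi_2(t_0) = 0
\end{align*}
```

The last two lines are the n² and n³ differential equations integrated alongside the n state equations in the forward sweep.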


• Taylor series:

• Quasi-Newton equation:

• Rank-1 update:

• Because $a\,u^{T} \Delta Y_p$ is a scalar:

• Finally:


Derivation of the SR1 update
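The equations of this slide were lost in extraction. The textbook SR1 derivation is sketched below, following the slide's steps. The symbols $a$, $u$, and $\Delta Y_p$ come from the slide; $g$ (gradient), $\tilde{H}$ (current Hessian estimate), $\tilde{H}^{+}$ (updated estimate), and $r$ are assumed names introduced here for illustration.

```latex
\begin{align*}
% Taylor series of the gradient
g(Y + \Delta Y_p) &\approx g(Y) + H\,\Delta Y_p \\
% Quasi-Newton (secant) equation, with \Delta g = g(Y + \Delta Y_p) - g(Y)
\tilde{H}^{+}\,\Delta Y_p &= \Delta g \\
% Symmetric rank-1 update
\tilde{H}^{+} &= \tilde{H} + a\,u\,u^{T} \\
% Substitute the update into the secant equation
\left(a\,u^{T}\Delta Y_p\right) u &= \Delta g - \tilde{H}\,\Delta Y_p \equiv r \\
% a u^T \Delta Y_p is a scalar, so u is parallel to r; choose u = r
a &= \frac{1}{r^{T}\Delta Y_p} \\
% Final form of the update
\tilde{H}^{+} &= \tilde{H} + \frac{r\,r^{T}}{r^{T}\Delta Y_p}
\end{align*}
```

The update is undefined when $r^{T}\Delta Y_p$ vanishes, which is why practical implementations commonly skip the update when this denominator is very small.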


• $J^{i}_{X,k}$ and $J^{i}_{XX,k}$ are functions of the downstream control law ($u_q$, $k + 1 \le q \le N$)

• They are only accurate for a trajectory that follows this control law exactly

• In HDDP, the next iteration changes the downstream control law, so $J^{i}_{X,k}$ and $J^{i}_{XX,k}$ do not hold information about the new performance index $J^{i}$

• The quasi-Newton equation does not hold, even with exact second-order derivatives

• Applying a quasi-Newton method, which enforces this quasi-Newton equation, cannot predict the right $J^{i+1}_{XX,k}$


Why not apply quasi-Newton to the $J_{XX}$ computation?


“An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”

Bellman, R., Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.


Bellman’s Principle of Optimality

