INTRODUCTION TO OPTIMISATION
Transcript of INTRODUCTION TO OPTIMISATION
OPT 0.1
Local Optimisation
Global Optimisation
Combinatorial Optimisation
INTRODUCTION TO OPTIMISATION
Annalisa Riccardi†, Edmondo Minisci†, Kerem Akartunalı‡
† Department of Mechanical and Aerospace Engineering, ‡ Department of Management Science
University of Strathclyde, Glasgow (United Kingdom)
November 20, 2017
Outline
Terminology, Optimality conditions, Local Optimisation algorithms
Global Optimisation approaches, multiobjective optimisation
Network and combinatorial optimisation, integer programming
Optimisation
Optimization is derived from the Latin word optimus, the best.
Optimization characterizes the activities involved in finding the best.
People have been optimizing forever, but the roots of modern-day (engineering) optimization can be traced to the Second World War.
Applications in the service industries did not start until the mid-1960s.
Programming
You will often hear the phrase programming, as in: mathematical programming, linear programming, nonlinear programming, mixed integer programming, etc.
This has (in principle) nothing to do with modern-day computer programming.
In the early days, a set of values which represented a solution to a problem was referred to as a program. Nowadays you program (software) to find a program!
Challenges
• Problem formulation: the problem, explained in "engineering" terms, needs to be translated into its mathematical formulation (objectives and constraints). Regularity of the model is an issue if gradient-based optimisation techniques are applied
• Algorithm selection: the advantages and disadvantages of each method need to be assessed according to the problem that needs to be solved
• Complexity: both in terms of a large number of optimisation variables and in terms of expensive function(s) evaluation
Mathematical Formulation
• x ∈ Ω ⊂ R^{n_C} × Z^{n_D} is a vector of optimisation variables
• f(x) : Ω → R^m is the objective function
• g(x) : Ω → R^{n_i} is the inequality constraint function
• h(x) : Ω → R^{n_e} is the equality constraint function
Definition (Optimisation Problem)
min_{x∈Ω} f(x), subject to g(x) ≤ 0, h(x) = 0
The set of points satisfying the constraints is called the feasible region
D = {x ∈ Ω | g(x) ≤ 0, h(x) = 0}
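As a concrete illustration, the formulation above can be written down directly in code. The problem below (one objective, one inequality and one equality constraint) is a made-up example, not one from the lecture:

```python
# Hypothetical problem: min f(x) = x1^2 + x2^2
# subject to g(x) = 1 - x1 - x2 <= 0 and h(x) = x1 - x2 = 0.

def f(x):
    return x[0] ** 2 + x[1] ** 2

def g(x):          # inequality constraint, feasible when g(x) <= 0
    return 1.0 - x[0] - x[1]

def h(x):          # equality constraint, feasible when h(x) = 0
    return x[0] - x[1]

def feasible(x, tol=1e-9):
    """Membership test for the feasible region D = {x | g(x) <= 0, h(x) = 0}."""
    return g(x) <= tol and abs(h(x)) <= tol

print(feasible([0.5, 0.5]))   # True: on the boundary of both constraints
print(feasible([0.0, 0.0]))   # False: violates g
```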
Optimisation Problems Taxonomy
• Continuous (n_C > 0, n_D = 0), Discrete (n_C = 0, n_D > 0) or Mixed Integer (n_C > 0, n_D > 0): about the nature of the set of optimisation variables
• Single (m = 1), Multi (1 < m < 4) or Many Objectives (m ≥ 4): about the size of the objective space
• Constrained (n_i > 0 and/or n_e > 0) or Unconstrained (n_i = 0 and n_e = 0): about the size of the constraints space
• Linear or Nonlinear: about the linearity/nonlinearity of the objective and constraint functions
• Local or Global: about the nature of the fitness landscape
Pareto Optimality
In the case of single-objective optimisation the optimum is a point; in the case of multi- (and many-) objective optimisation the optimum is a set of points.
Definition (Pareto dominance)
A point x1 ∈ D Pareto dominates x2 ∈ D (for a minimisation problem) if
f_i(x1) ≤ f_i(x2), ∀i = 1, ..., m
and there is at least one component j ∈ {1, ..., m} such that f_j(x1) < f_j(x2). This is indicated by x1 ≺ x2.
Definition (Pareto optimality)
A point x∗ ∈ D is Pareto optimal if it is not dominated by any x ∈ D.
Pareto Front
Definition (Pareto Optimal Set)
For a multiple-objective optimization problem the Pareto Optimal Set is defined as
P∗ = {x ∈ D | ∄ x′ ∈ D : x′ ≺ x}.
Definition (Pareto Front)
The union of the objective values of all Pareto optimal points is called the Pareto front, or equivalently
PF = {f(x) ∈ R^m | x ∈ P∗}.
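The dominance relation and the Pareto front translate almost literally into code. A minimal sketch for minimisation, on a made-up list of objective vectors:

```python
def dominates(f1, f2):
    """f1 Pareto-dominates f2 (minimisation): no worse in every objective,
    strictly better in at least one."""
    return all(a <= b for a, b in zip(f1, f2)) and any(a < b for a, b in zip(f1, f2))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

pts = [(1, 5), (2, 2), (4, 1), (3, 3), (5, 5)]
print(pareto_front(pts))  # [(1, 5), (2, 2), (4, 1)]
```

Here (3, 3) is dominated by (2, 2), and (5, 5) by every point on the front.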
Example
Multi-Objectives to Single Objectives
Some examples (not exhaustive)
• Weighted Sum Approach (with weight coefficients w_i ≥ 0 and Σ_{i=1}^m w_i = 1)
min_{x∈Ω} Σ_{i=1}^m w_i f_i(x), s.t. g(x) ≤ 0; h(x) = 0.
• ε-Constrained
min_{x∈Ω} f_j(x), s.t. g(x) ≤ 0; h(x) = 0; f_i(x) ≤ ε ∀i = 1, ..., m, i ≠ j
• Goal Programming (with target values T_i ∈ R)
min_{x∈Ω} Σ_{i=1}^m |f_i(x) − T_i|, s.t. g(x) ≤ 0; h(x) = 0.
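A minimal sketch of the weighted sum approach, using a hypothetical bi-objective problem on the real line and a coarse grid search in place of a proper optimiser:

```python
def weighted_sum(fs, weights):
    """Scalarise a list of objective functions with w_i >= 0, sum w_i = 1."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-12
    return lambda x: sum(w * f(x) for w, f in zip(weights, fs))

# Hypothetical bi-objective problem: f1 = (x-1)^2, f2 = (x+1)^2.
f1 = lambda x: (x - 1.0) ** 2
f2 = lambda x: (x + 1.0) ** 2

F = weighted_sum([f1, f2], [0.5, 0.5])
xs = [i / 100.0 - 2.0 for i in range(401)]   # coarse grid over [-2, 2]
best = min(xs, key=F)
print(best)  # 0.0, a Pareto-optimal compromise between the two objectives
```

Different weight vectors recover different points of the Pareto front (on convex fronts).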
Constrained to Unconstrained
Some examples (not exhaustive). The examples refer only to the case of an inequality-constrained optimisation problem.
• Penalisation (with penalty terms p_i ≥ 0): the constraint violation is added as penalty terms to the objective function
min_{x∈Ω} f(x) + Σ_{i=1}^{n_i} p_i max{0, g_i(x)}
• Multiobjective: the constraints (or the sum of the constraint violations) are added to the list of objectives
min_{x∈Ω} [f(x), g_1(x), ..., g_{n_i}(x)];  min_{x∈Ω} [f(x), Σ_{i=1}^{n_i} max{0, g_i(x)}]
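The penalisation idea can be sketched in a few lines. The problem below (min x² subject to x ≥ 1) and the penalty weight are made-up choices; a coarse grid stands in for a real unconstrained optimiser:

```python
def penalised(f, gs, p=100.0):
    """Exterior penalty: add p * max(0, g_i(x)) for each inequality g_i(x) <= 0."""
    return lambda x: f(x) + sum(p * max(0.0, g(x)) for g in gs)

# Hypothetical problem: min x^2 subject to g(x) = 1 - x <= 0 (i.e. x >= 1).
f = lambda x: x * x
g = lambda x: 1.0 - x

F = penalised(f, [g], p=1000.0)
xs = [i / 1000.0 for i in range(-3000, 3001)]   # grid over [-3, 3]
best = min(xs, key=F)
print(best)  # 1.0, the constrained minimiser
```

With a large enough penalty weight, the unconstrained minimiser of F coincides (here exactly) with the constrained one.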
Optimality Conditions - definitions
Definition (Lagrangian)
The real function L : Ω × R^{n_i+n_e} → R defined as
L(x, µ, λ) = f(x) − µᵀg(x) − λᵀh(x)
is the Lagrangian, and the coefficients µ ∈ R^{n_i} and λ ∈ R^{n_e} are called Lagrange multipliers.
Definition (Active Set)
Given a point x in the feasible region D, the active set A(x) is defined as A(x) = {i ∈ I | g_i(x) = 0} ∪ I_e, with index sets I = {1, ..., n_i} and I_e = {1, ..., n_e}.
Definition (LICQ)
The Linear Independence Constraint Qualification (LICQ) condition holds at a point x ∈ Ω if the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent.
Optimality Conditions
Theorem (First-order necessary condition)
Suppose that f, g and h are continuous and differentiable, x∗ is a local solution of the constrained problem, and the LICQ holds at x∗. Then a Lagrange multiplier vector (µ∗, λ∗) exists such that the following conditions are satisfied:
∇_x L(x∗, µ∗, λ∗) = 0 (1)
g(x∗) ≤ 0; h(x∗) = 0 (2)
µ∗_i ≥ 0, ∀i = 1, ..., n_i (3)
µ∗_i g_i(x∗) = 0, ∀i = 1, ..., n_i (4)
These conditions are known as the Karush-Kuhn-Tucker (KKT) conditions.
Remark: the last condition implies that the Lagrange multipliers corresponding to inactive inequality constraints are zero.
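The KKT conditions can be checked numerically at a candidate point. The problem and multiplier below are made-up; note that sign conventions for the Lagrangian vary between texts, and this sketch assumes the common form L = f + µᵀg for constraints g ≤ 0:

```python
def grad(fun, x, eps=1e-6):
    """Central finite-difference gradient."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        out.append((fun(xp) - fun(xm)) / (2 * eps))
    return out

f = lambda x: x[0] ** 2 + x[1] ** 2
g = lambda x: 1.0 - x[0] - x[1]        # constraint g(x) <= 0

x_star, mu_star = [0.5, 0.5], 1.0      # hypothetical candidate point and multiplier

# Stationarity of L = f + mu*g (an assumed sign convention for this sketch).
grad_L = [a + mu_star * b for a, b in zip(grad(f, x_star), grad(g, x_star))]

print(all(abs(c) < 1e-5 for c in grad_L))                  # True: stationarity holds
print(mu_star >= 0 and abs(mu_star * g(x_star)) < 1e-12)   # True: dual feasibility + complementarity
```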
Optimality Condition
Theorem (Second-order necessary condition)
Let f, g and h be twice continuously differentiable, let x∗ be a local solution of the constrained problem, and let the LICQ condition be satisfied. Let (µ∗, λ∗) be the Lagrange multipliers for which the triple (x∗, µ∗, λ∗) satisfies the KKT conditions. Then
ωᵀ ∇²_{xx} L(x∗, µ∗, λ∗) ω ≥ 0, ∀ω ∈ C(µ∗, λ∗)
where C(µ∗, λ∗) is the critical cone, i.e. "the part of the cone of feasible directions for which the behavior of f is not clear from its first derivative".
Unconstrained NLP algorithms
NLP algorithms ⇒ start from an initial guess x0 and move towards a direction of "improvement"
• Line Search: the algorithm determines a search direction p_k and searches along this direction from the current iterate x_k for a new iterate with a lower function value.
• Trust Region: the algorithm constructs a model function m_k whose behavior near the current iterate x_k is similar to that of the actual objective function f. The direction of search p_k is found as the direction that minimises m_k within the trust region x_k + p_k.
Examples
Examples of line search directions (a Trust Region counterpart exists for each)
• Steepest Descent method: chooses the descent direction p_k = −∇f(x_k)
• Newton method: the search direction is the solution of the Newton equation, p_k = −H(x_k)⁻¹∇f(x_k)
• Quasi-Newton methods: they don't require the computation of the second-order derivatives but use an approximation of them, denoted B, p_k = −B(x_k)⁻¹∇f(x_k)
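The line search idea can be sketched with the steepest descent direction. The Armijo backtracking rule and the quadratic test function are standard choices assumed for this sketch, not specified on the slide:

```python
def steepest_descent(f, grad_f, x0, alpha0=1.0, c=1e-4, tol=1e-8, max_iter=500):
    """Steepest descent p_k = -grad f(x_k) with Armijo backtracking line search."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad_f(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        p = [-gi for gi in g]
        fx, a = f(x), alpha0
        slope = sum(gi * pi for gi, pi in zip(g, p))   # directional derivative (< 0)
        # Backtrack until the Armijo sufficient-decrease condition holds.
        while f([xi + a * pi for xi, pi in zip(x, p)]) > fx + c * a * slope:
            a *= 0.5
        x = [xi + a * pi for xi, pi in zip(x, p)]
    return x

# Hypothetical quadratic: f(x) = x1^2 + 10*x2^2, minimiser at the origin.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad_f = lambda x: [2.0 * x[0], 20.0 * x[1]]
x = steepest_descent(f, grad_f, [3.0, 1.0])
print(x)  # close to [0, 0]
```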
Constrained NLP algorithms
• Penalty, Barrier, Augmented Lagrangian methods and Sequential linearly constrained methods: they solve a sequence of simpler subproblems (unconstrained or with simple linearized constraints) related to the original one. The solutions of the subproblems converge to the solution of the original one either in a finite number of steps or in the limit.
• Newton-like methods: they try to find a point satisfying the necessary conditions of optimality (KKT conditions in general). The Sequential Quadratic Programming (SQP) method is part of this class.
Optimal Control
Finding the control laws u(t) ∈ C0p([t0, tf]; R^{n_c}) and states x(t) ∈ C0p([t0, tf]; R^{n_s}) for a given system that minimise a cost functional subject to initial and final states as well as path constraints.
Definition (Optimal Control Problem)
min_{u,x} φ(x(tf)) + ∫_{t0}^{tf} f0(x(t), u(t)) dt
subject to
ẋ(t) = f(x(t), u(t))
c(x(t), u(t)) ≤ 0, t ∈ [t0, tf]
ω(x(t0), x(tf)) = 0
Solution Methods
• Indirect: convert the optimal control problem into a boundary value problem using the necessary condition of the Pontryagin minimum principle.
• Direct: discretization of the control and state functions, and approximation of the infinite-dimensional optimal control problem by an NLP problem (using numerical methods to approximate the integrals and the ODE solution). Examples: multiple shooting, collocation methods.
References
• Numerical Optimization, by Jorge Nocedal and Stephen J. Wright (Springer, 2006)
• Numerical Optimization - Theoretical and Practical Aspects, by J. Frederic Bonnans, J. Charles Gilbert, Claude Lemarechal, and Claudia A. Sagastizabal (Springer, 2006)
• Introduction to Applied Optimization, 2nd Ed., by Urmila Diwekar (Springer, 2008)
• Optimization Theory and Methods - Nonlinear Programming, by Wenyu Sun and Ya-Xiang Yuan (Springer, 2006)
• Linear and Nonlinear Programming, 4th Ed., by David G. Luenberger and Yinyu Ye (Springer, 2016)
Local and Global Optimal solutions
For convex optimisation problems, there is no need to distinguish between local and global optimal solutions, because a locally optimal solution is also globally optimal.
Convex optimisation problems include:
• LP problems;
• QP problems where the objective is positive definite (if minimising - negative definite if maximising); and
• NLP problems where the objective is a convex function (if minimising - concave if maximising) and the constraints (if existing/formulated) form a convex set.
All the other NLP problems are non-convex and generally have multiple locally optimal solutions.
Local and Global Optimal solutions - Convex set
Let S ⊂ R^n be a set. If, for any x1, x2 ∈ S, we have αx1 + (1 − α)x2 ∈ S, ∀α ∈ [0, 1], then S is said to be a convex set.
x = αx1 + (1 − α)x2, where α ∈ [0, 1], is called a convex combination of x1 and x2.
For any two points x1, x2 ∈ S, the line segment joining x1 and x2 is entirely contained in S.
Local and Global Optimal solutions - Convex functions
Let S ⊂ R^n be a non-empty convex set and let f : S ⊂ R^n → R.
If, for any x1, x2 ∈ S and all α ∈ [0, 1], we have f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2), then f is said to be convex on S.
If the above inequality holds as a strict inequality for all x1 ≠ x2, then f is called a strictly convex function on S.
If there is a constant c > 0 such that, for any x1, x2 ∈ S, f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) − (1/2)cα(1 − α)‖x1 − x2‖², then f is called a uniformly (or strongly) convex function on S.
Local vs Global Optimisation
In Global Optimisation, we distinguish between (consider minimisation):
• Local minimum f∗ = f(x∗), local minimizer x∗
  • smallest function value in some feasible neighbourhood
  • x∗ ∈ S
  • there exists a δ > 0 such that f∗ ≤ f(x) ∀x ∈ {x ∈ S : |x − x∗| < δ}
• Global minimum f∗ = f(x∗), global minimizer x∗
  • smallest function value over all feasible points
  • f∗ ≤ f(x) ∀x ∈ S
Global Approaches
The aim is to find a global minimiser for the function f. The search for a solution is commonly made up of two components:
• Global exploration
• Local exploitation
The global component is used to prevent local convergence and to globally characterise the solution space. The local component is used for accurate convergence to local optimal solutions. The critical point is to have a balanced use of both components. The complexity/cost of global optimization methods grows exponentially with the problem size.
Global Approaches
Basin of Attraction: the set of solutions that tend to converge to the same attractor (defined by the characteristics of the problem and the characteristics of the search algorithm).
Global exploration: to find the optimal basin of attraction. Local exploitation: to converge to the local minimum.
Traditional techniques for global optimization, such as a) branch and bound, b) grid and multi-grid algorithms, and c) multi-start algorithms, are usually used for problems with a small number of variables. Other (stochastic) algorithms, such as evolutionary algorithms, have proved to be an extremely valid alternative to the previously mentioned ones.
Global Approaches - Deterministic vs Stochastic algorithms
All the algorithms seen up to now (local methods) are deterministic. For global search we have both deterministic and stochastic algorithms.
Wikipedia (so, common knowledge):
• in computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying system always passing through the same sequence of states.
• stochastic algorithms are algorithms with one or more steps depending on random numbers (given a particular input, they will not always produce the same output, and the underlying system will not always pass through the same sequence of states).
Global Approaches - Heuristics
Most of the methods contain heuristic techniques or steps
Wikipedia: A heuristic technique, or simply a heuristic, is anyapproach to problem solving, learning, or discovery thatemploys a practical method not guaranteed to be optimal orperfect, but sufficient for the immediate goals. Where findingan optimal solution is impossible or impractical, heuristicmethods can be used to speed up the process of finding asatisfactory solution.
Heuristics are strategies derived from experience with similarproblems.
Global Approaches - Grid and Multigrid Search
It is the most straightforward deterministic approach
• The function f is evaluated at a set of regular grid points in the solution domain S
• From each point of the grid a finer grid can be built locally where the coarser grid reveals good values for f
The procedure is analogous to an enumerative search of a discrete set of values.
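The two bullets above can be sketched in a few lines; the 1-D test function and the grid sizes are made-up choices for illustration:

```python
def grid_search(f, lo, hi, n=21):
    """Evaluate f on a regular grid over [lo, hi] and return the best point."""
    pts = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(pts, key=f)

def multigrid_search(f, lo, hi, n=21, levels=4):
    """Refine with a finer grid around the best point of the coarser grid."""
    best = grid_search(f, lo, hi, n)
    width = (hi - lo) / (n - 1)          # spacing of the current grid
    for _ in range(levels - 1):
        best = grid_search(f, best - width, best + width, n)
        width = 2 * width / (n - 1)
    return best

f = lambda x: (x - 0.3) ** 2 + 1.0       # hypothetical test function
best = multigrid_search(f, -5.0, 5.0)
print(round(best, 4))  # 0.3
```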
Global Approaches - DIRECT (DIviding RECTangles)
Deterministic search algorithm based on systematic division of the search domain into smaller and smaller hyperrectangles. The algorithm generates a predefined number N_s of sample points over a grid in a box-constrained feasible area, starting from the scaled midpoint x1 = 0.5(1, 1, ..., 1)ᵀ ∈ S ⊂ R^d.
All sample points x1, x2, ..., x_{N_s} are stored as potential places where refinement may take place. Refinement of the generic solution x_k consists of "sampling more" in a region around x_k.
Global Approaches - Multi-start algorithms
The simple idea behind multi-start algorithms is to pick a number of points in the search space and start a local search from each one of them (e.g., the local search can be performed with a gradient-based method). The initial grid can be deterministically or stochastically set.
1. Set k = 1, f_best = +∞ (minimisation), perform a DOE
2. Select point y_k from the DOE set
3. Run a local optimizer from y_k and let x_k be the detected local minimum
4. Evaluate the objective function f(x_k)
5. If f(x_k) < f_best then x_best = x_k, f_best = f(x_k)
6. Set n_eval = n_eval + n_eval,k
7. Termination: unless n_eval ≥ n_eval,max, set k = k + 1 and go to Step 2
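The steps above can be sketched as follows. Random restarts replace the DOE, and a crude derivative-free hill climb stands in for the gradient-based local optimiser; the multimodal test function is made up:

```python
import random

def local_descent(f, x, step=1e-3, iters=20000):
    """Very simple derivative-free local search, standing in for a gradient method."""
    for _ in range(iters):
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            break
    return x

def multi_start(f, lo, hi, n_starts=10, seed=0):
    """Multi-start: local search from several random initial points, keep the best."""
    rng = random.Random(seed)
    f_best, x_best = float("inf"), None
    for _ in range(n_starts):
        x = local_descent(f, rng.uniform(lo, hi))
        if f(x) < f_best:
            f_best, x_best = f(x), x
    return x_best

# Two local minima near x = +1 and x = -1; the global one is near x = -1.
f = lambda x: (x * x - 1.0) ** 2 + 0.2 * x
x_best = multi_start(f, -2.0, 2.0)
print(round(x_best, 3))  # close to -1
```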
Global Approaches - Design of Experiment (DOE)
It is a way of choosing samples in the design space in order to get the maximum amount of information using the minimum amount of resources, that is, with the lowest number of samples. DOE is normally used
• to start the global search by population-based or grid and multi-start methods (see the previous multi-start algorithm)
• to build surrogates
• to perform sensitivity analyses
Global Approaches
An appropriate application of search algorithms involves both recognizing what kind of system the user is dealing with and knowing the right algorithm to apply, but these aspects are not easy to handle. Stochastic algorithms are able to handle complex and non-identifiable optimization problems. These algorithms have proved to be extremely robust, meaning they are effective within a wide range of applications, but they pay for their robustness with a generally low efficiency in terms of computational resources. The No-Free-Lunch (NFL) theorems confirm this behaviour. (D.H. Wolpert and W.G. Macready. No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation, 1(1):67-82, April 1997.)
Global Approaches - No-Free-Lunch (NFL)
The main theorem states: if any algorithm A outperforms another algorithm B in the search for an extremum of an objective function, then algorithm B will outperform A over other objective functions.
The NFL theorem(s) suggest that the average performance over all possible objective functions is the same for all search algorithms.
All algorithms for optimization will give the same average performance when averaged over all possible functions, which means that a universally best method does not exist for all optimization problems.
Global Approaches - Stochastic Methods
Nature provides some of the most efficient ways to solve problems - algorithms imitating processes in nature, or inspired by nature: Nature Inspired Algorithms.
Global Approaches - Simulated Annealing
In 1983, Kirkpatrick et al. proposed a method of using a Metropolis Monte Carlo simulation to find the lowest-energy (most stable) orientation of a system. It is called Simulated Annealing because it is inspired by the annealing process of metals during cooling. Annealing: at high temperatures the molecules in a metal move freely, but as the metal is cooled this movement is gradually reduced and atoms align to form crystals. The crystalline form constitutes a state of minimum energy, and annealed metals have better mechanical characteristics.
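A minimal sketch of the Metropolis-style acceptance rule at the heart of simulated annealing. The Gaussian neighbourhood, the geometric cooling schedule, and the 1-D test function are assumed choices, not from the lecture:

```python
import math
import random

def simulated_annealing(f, x0, t0=1.0, cooling=0.995, n_iter=5000, seed=42):
    """Minimise f: accept worse moves with probability exp(-delta/T),
    lowering the temperature T geometrically (the cooling schedule)."""
    rng = random.Random(seed)
    x, fx, t = x0, f(x0), t0
    best_x, best_f = x, fx
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, 0.5)          # random neighbour
        fy = f(y)
        if fy < fx or rng.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy                    # Metropolis acceptance
        if fx < best_f:
            best_x, best_f = x, fx
        t *= cooling
    return best_x, best_f

# Multimodal test function with its global minimum near x = -1.
f = lambda x: (x * x - 1.0) ** 2 + 0.2 * x
x, fx = simulated_annealing(f, x0=2.0)
print(round(x, 2), round(fx, 3))
```

At high temperature the chain explores freely; as T falls, it behaves increasingly like a pure descent, mirroring the physical annealing analogy.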
Global Approaches - Evolutionary Algorithms (EAs)
EAs are stochastic search methods that take their inspiration from natural selection and survival of the fittest in the biological world. By analogy to natural evolution, the solution candidates are called individuals, the set of solution candidates is called the population, and each optimisation iteration is a generation.
Each individual represents a possible solution, i.e., a decision vector, to the problem at hand.
Sometimes, an individual is not a decision vector but rather encodes it, based on an appropriate representation.
Global Approaches - EAs
Most EAs are population based. A general evolutionary algorithm is a stochastic one which: a) principally memorizes a population of solutions; b) has some kind of mating selection; c) has some kind of recombination and mutation as variation operators; and d) has some kind of environment selection.
Global Approaches - EAs
Among the EAs, Genetic Algorithms (GAs) have had an enormous success. GAs were born, and are well suited, to solve combinatorial problems, but they have been successfully applied to continuous problems as well. Most of their efficacy is due to a powerful recombination operator, which, for this reason, becomes the main operator. The recombination operation used by GAs requires that the problem can be represented in a manner that makes combinations of two solutions likely to generate interesting solutions. Selecting an appropriate representation is a challenging aspect of properly applying these methods.
Global Approaches - EAs: Genetic Algorithms
Usually a binary coding is used, and many applications have demonstrated the validity of this approach. In these cases, typically, the recombination operator is a one-point, two-point, multi-point, or uniform crossover.
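The two classic GA variation operators on binary strings can be sketched directly; the parent strings below are made-up examples:

```python
import random

def one_point_crossover(parent1, parent2, rng):
    """Classic one-point crossover: swap the tails after a random cut point."""
    cut = rng.randrange(1, len(parent1))
    return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]

def mutate(bits, p, rng):
    """Bit-flip mutation with per-bit probability p."""
    return [b ^ 1 if rng.random() < p else b for b in bits]

rng = random.Random(1)
p1, p2 = [0] * 8, [1] * 8
c1, c2 = one_point_crossover(p1, p2, rng)
print(c1, c2)                 # complementary children (tails swapped at the cut)
print(mutate(c1, 0.1, rng))   # occasionally flips a bit
```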
Global Approaches - Other Population Based Algorithms: Differential Evolution (DE)
DE belongs to the class of Evolution Strategy optimizers.
The main idea is to generate the variation vector v_{i,k+1} by taking the weighted difference between two other solution vectors randomly chosen within a population of solution vectors and to add that difference to the vector difference between x_{i,k} and a third solution vector.
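The variation mechanism can be sketched with the standard DE/rand/1/bin variant (an assumed concrete instantiation of the idea above); the sphere test function and all parameter values are made-up choices:

```python
import random

def de_minimise(f, bounds, np_=20, F=0.8, CR=0.9, gens=200, seed=3):
    """DE/rand/1/bin sketch: variation vector v = a + F*(b - c) from three
    distinct random population members, then binomial crossover with x_i.
    Bounds are only used for initialisation in this sketch."""
    rng = random.Random(seed)
    d = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_)]
    for _ in range(gens):
        for i in range(np_):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            v = [a[k] + F * (b[k] - c[k]) for k in range(d)]
            jr = rng.randrange(d)            # ensure at least one mutated coordinate
            trial = [v[k] if (rng.random() < CR or k == jr) else pop[i][k]
                     for k in range(d)]
            if f(trial) <= f(pop[i]):        # greedy one-to-one selection
                pop[i] = trial
    return min(pop, key=f)

sphere = lambda x: sum(xi * xi for xi in x)   # hypothetical test function
best = de_minimise(sphere, [(-5.0, 5.0)] * 3)
print(sphere(best))
```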
Global Approaches - Other Population Based Algorithms: Particle Swarm Optimisation (PSO)
PSO is a population-based stochastic optimization method inspired by the social behaviour of bird flocking or fish schooling. In PSO, the potential solutions, called particles, fly through the problem space by following the current optimum particles. Each particle keeps track of its coordinates in the problem space which are associated with the best solution it has achieved so far. The particle swarm optimization concept consists of, at each iteration, changing the velocity of each particle x_i according to a closed-loop control mechanism.
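The velocity update can be sketched with the common inertia-weight formulation (an assumed concrete variant; the parameters and test function are made-up choices):

```python
import random

def pso_minimise(f, bounds, n_particles=15, w=0.7, c1=1.5, c2=1.5, iters=200, seed=5):
    """PSO sketch: each velocity is pulled towards the particle's personal best
    and the swarm's global best."""
    rng = random.Random(seed)
    d = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * d for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]
    gbest = min(pbest, key=f)[:]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(d):
                r1, r2 = rng.random(), rng.random()
                v[i][k] = (w * v[i][k]
                           + c1 * r1 * (pbest[i][k] - x[i][k])   # cognitive pull
                           + c2 * r2 * (gbest[k] - x[i][k]))     # social pull
                x[i][k] += v[i][k]
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
                if f(x[i]) < f(gbest):
                    gbest = x[i][:]
    return gbest

sphere = lambda x: sum(xi * xi for xi in x)
best = pso_minimise(sphere, [(-5.0, 5.0)] * 2)
print(sphere(best))
```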
Global Approaches - Constraint Handling Techniques for Nature Inspired Methods
Despite the broad applicability of Nature Inspired methods to a wide range of domains, they are essentially unconstrained optimization techniques. The constraint handling techniques can be roughly classified into five classes (non-exhaustive list):
• using a penalty function;
• special representations and/or genetic operators (for EAs);
• repairing algorithms;
• considering separations of objectives and constraints; and
• hybrid methods.
Global Approaches - Constraint Handling Techniques for Nature Inspired Methods: Using a penalty function
This is the most common approach to handle constraints. Nature inspired methods do not usually require an initial feasible solution, so the penalty should be able to bring an unfeasible solution into the feasible region. There are at least two main choices to define a relationship between an unfeasible individual and the feasible region of the search space:
• an individual can be penalized just for being unfeasible regardless of its amount of constraint violation (i.e., no use of any information about how close it is to the feasible region);
• the amount of its unfeasibility can be measured and used to determine its corresponding penalty.
Multi-Objective Problems
A multi-objective optimisation problem can be formulated as: finding the vector x∗ = [x∗_1, x∗_2, ..., x∗_d]ᵀ ∈ S ⊆ R^d that satisfies the m constraints g_i(x) ≤ 0 (i = 1, 2, ..., m), and optimises the vector function f(x) = [f_1(x), f_2(x), ..., f_k(x)]ᵀ ∈ F ⊆ R^k.
A problem should be formulated as a MO one if finding the particular solution x∗ that yields the optimum values of all the objective functions, i.e. f_i(x∗) ≤ f_i(x) ∀x ∈ S and ∀i = 1, 2, ..., k, is NOT possible.
Multi-Objective Problems: bi-objective example - aggregation of objectives
Single-solution approaches can converge only to one point ofthe Pareto front at each run (not necessarily with a weightedsum approach).
Multi-Objective Problems
EAs (in general, population-based algorithms) are particularly suitable to solve multi-objective optimization problems, because they deal simultaneously with a set of possible solutions. EAs are less susceptible to the shape or continuity of the Pareto front. Some examples of well-known Pareto based approaches will be presented. Pareto based approaches are based on the idea of calculating the fitness of individuals on the basis of Pareto dominance.
Multi-Objective Problems
Some approaches use the dominance rank, i.e., the number of individuals dominating an individual, to determine the fitness values. Others make use of the dominance depth/class, where the population is divided into several fronts and the depth reflects which front an individual belongs to. Alternatively, the dominance count, i.e., the number of individuals dominated by a certain individual, can also be taken into account. A niching/crowding mechanism allows the algorithms to maintain individuals all along the non-dominated frontier.
Some newer algorithms use measures such as the hypervolume.
Multi-Objective Problems: Non-dominated Sorting Genetic Algorithm (NSGA)
Based on several layers of classifications of individuals. Before selection, the population is ranked on the basis of non-domination: all non-dominated individuals are classified into one category. Sharing (function) in the objective space helps to distribute the population in the non-dominated region.
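The layered ranking (dominance depth) can be sketched by repeatedly peeling off the non-dominated front; the point list is a made-up example (minimisation of both objectives):

```python
def dominates(a, b):
    """a Pareto-dominates b: no worse everywhere, strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Dominance depth: peel off successive non-dominated fronts (layers)."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

pts = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
for rank, front in enumerate(non_dominated_sort(pts), 1):
    print(rank, front)
# rank 1: (1,4), (2,2), (4,1); rank 2: (3,3); rank 3: (4,4)
```

(NSGA-II uses a faster O(MN²) book-keeping scheme, but the ranking it computes is the same.)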
References
• Carlos A. Coello Coello, David A. Van Veldhuizen, Gary B. Lamont (2013). Evolutionary Algorithms for Solving Multi-Objective Problems. Springer.
• Achille Messac (2015). Optimization in Practice with MATLAB. Cambridge Univ. Press.
• Xin-She Yang, Xingshi He (Editors) (2016). Nature-Inspired Optimization Algorithms in Engineering: Overview and Applications. Springer.
• Marco Locatelli, Fabio Schoen (2013). Global Optimization: Theory, Algorithms, and Applications. SIAM.
The Challenge of Combinatorial Choices
• Often we have "discrete" choices to make.
  • On which route should I drive to work?
  • In which order should I carry out my tasks today?
  • Which combination of classes should I take this semester?
• The "Devil's Triangle" of routing, scheduling and planning.
• Applications from nurse scheduling to aircraft routing, from production planning to radiotherapy optimisation.
• Most natural mathematical models use "mixed integer programming" (MIP).
• For the sake of clarity, let's focus on linear problems.
• Format: min{cᵀx | Ax ≥ b, x ∈ R^n_+ × Z^p_+}.
• Often NP-hard, but special cases exist.
• Also constraint programming may be useful.
MIP 101: LP - Life is good!
• When p = 0, then min{cᵀx | Ax ≥ b, x ∈ R^n_+ × Z^p_+} is an LP.
• Always a corner solution.
• Simplex & interior points.
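The "always a corner solution" property can be demonstrated on a tiny made-up 2-variable LP by simply enumerating the corners (intersections of constraint boundaries); real solvers use the simplex or interior point methods instead, but on this scale enumeration suffices:

```python
from itertools import combinations

# Hypothetical LP: min -x - y  s.t.  x + 2y <= 4, 3x + y <= 6, x >= 0, y >= 0.
# Each constraint written as a*x + b*y <= c:
cons = [(1, 2, 4), (3, 1, 6), (-1, 0, 0), (0, -1, 0)]
cost = lambda p: -p[0] - p[1]

def intersect(c1, c2):
    """Intersection of two constraint boundary lines (None if parallel)."""
    (a1, b1, d1), (a2, b2, d2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in cons)

corners = []
for c1, c2 in combinations(cons, 2):
    q = intersect(c1, c2)
    if q is not None and feasible(q):
        corners.append(q)

best = min(corners, key=cost)
print(best)  # (1.6, 1.2): the optimal corner, where both structural constraints bind
```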
MIP 101: Discrete Variables
• Corners not necessarily feasible!
• Continuity lost.
MIP 101: Discrete Variables
• Convex hull of feasible points.
• Nice theory, not so easy practice.
MIP 101: Cutting Planes
• Valid inequalities violated by fractional corners.
MIP 101: Branch & Bound
• Branch & Bound (divide and conquer!)
• Useful bound information.
MIP 101: Branch & Bound Tree
Figure from L. Wolsey, Integer Programming, Wiley, 1998.
• Enumeration of all possible combinations.
• Of course many will be eliminated by "pruning".
MIP 101: Branch & Bound Pruning
Figures from L. Wolsey, Integer Programming, Wiley, 1998.
• Prune by Optimality (top) and Bound (bottom).
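Branching and pruning by bound can be sketched on a tiny made-up 0/1 knapsack instance, using the LP (fractional) relaxation as the upper bound at each node:

```python
# Hypothetical 0/1 knapsack instance, items pre-sorted by value/weight ratio.
values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50

def lp_bound(i, value, room):
    """Upper bound from the LP relaxation: fill remaining capacity greedily,
    allowing one fractional item (valid because items are ratio-sorted)."""
    for j in range(i, len(values)):
        if weights[j] <= room:
            room -= weights[j]
            value += values[j]
        else:
            return value + values[j] * room / weights[j]
    return value

best = 0

def branch(i, value, room):
    """Branch on item i: take it or leave it; prune when the bound
    cannot beat the incumbent."""
    global best
    if value > best:
        best = value                      # update incumbent
    if i == len(values) or lp_bound(i, value, room) <= best:
        return                            # leaf reached, or pruned by bound
    if weights[i] <= room:
        branch(i + 1, value + values[i], room - weights[i])
    branch(i + 1, value, room)

branch(0, 0, capacity)
print(best)  # 220: take the second and third items (100 + 120)
```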
MIP Solution Methods
• Exact methods
  • Dynamic programming
  • Valid inequalities/extended reformulations
  • Dantzig-Wolfe/Column generation/Benders
  • Lagrangean relaxation
  • We know solution quality, but often significant enumeration and time.
• Heuristic methods
  • Problem-specific vs. general heuristics
  • Construction vs. improvement heuristics
  • MIP-heuristics vs. metaheuristics
  • Fast solutions, but no guarantee on finding even a feasible solution.
• Hence also often combined methods (tradeoff between solution quality vs. solution time)
MIP-heuristics: Relax-and-Fix
• Option: Good formulation; fast heuristic.
• Start: "time windows" (overlapping) [diagram: periods 1-6, later periods relaxed]
• Option: If time permits, a fast construction heuristic. [diagram: periods 1-6, first window fixed]
• Continue with the next time window (until the last period) [diagram: periods 1-6, earlier periods fixed, later periods relaxed]
• Option: Improvement heuristic.
Relaxations
• LP relaxation: simply relax all integrality constraints.
• Lagrangean relaxation: remove a “difficult” constraint andinstead penalize it in the objective function.
  • min{cᵀx | Ax ≥ b, Dx ≥ d, x ∈ R^n_+ × Z^p_+}
  • (for λ ≥ 0) min{cᵀx − λᵀ(Dx − d) | Ax ≥ b, x ∈ R^n_+ × Z^p_+}
• Why do we care about relaxations?
Decompositions
• Often problems have a "special" structure.
  • E.g. a block-angular matrix
  • Many subproblems "easy" to solve.
• Dantzig-Wolfe decomposition
  • Master problem & n subproblems.
“Nice” Problems
• A problem being "combinatorial" or MIP does not necessarily mean it is hard.
• Plenty in complexity theory.
• Some are indeed polynomially solvable.
• A matrix A is totally unimodular if each of its square non-singular submatrices is unimodular (i.e., determinant +1 or -1).
• Very important property...
• The set {x | Ax ≥ b, x ∈ R^n_+} has all integer corners!
• Might look cumbersome, but there is a big class of problems fitting this description: min cost flow problems.
• min{cᵀx | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}
An Easy Min Cost Flow Problem
Shortest path from our venue to my favourite pub!
An Easy Min Cost Flow Problem
min{cᵀx | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}
Shortest Path and Dijkstra
Figure from R. Ahuja, T. Magnanti, J. Orlin, Network Flows, Prentice
Hall, 1993.
Shortest Path and Dijkstra (cont’d)
Figure from R. Ahuja, T. Magnanti, J. Orlin, Network Flows, Prentice
Hall, 1993.
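Dijkstra's label-setting algorithm for shortest paths with non-negative arc costs can be sketched with a binary heap; the small network below is a made-up example:

```python
import heapq

def dijkstra(adj, source):
    """Repeatedly settle the tentative node with the smallest distance label."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, already settled
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical small network: node -> [(neighbour, arc cost), ...]
adj = {
    "A": [("B", 2), ("C", 5)],
    "B": [("C", 1), ("D", 4)],
    "C": [("D", 1)],
    "D": [],
}
print(dijkstra(adj, "A"))  # {'A': 0, 'B': 2, 'C': 3, 'D': 4}
```

Dijkstra exploits the non-negativity of the costs: once a node is popped with its smallest label, that label is final, so no node is processed twice.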
Caution: TSP
532-city USA tour by Padberg-Rinaldi, 1987.
min{cᵀx | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}