Hybrid Multi-Gradient Explorer Algorithm for Global Multi-Objective Optimization


Vladimir Sevastyanov¹
eArtius, Inc., Irvine, CA 92614, US

¹ Chief Executive Officer

EXTENDED ABSTRACT

The Hybrid Multi-Gradient Explorer (HMGE) algorithm for global multi-objective optimization of objective functions over a multi-dimensional domain is presented (patent pending). The proposed hybrid algorithm relies on genetic variation operators for creating new solutions, but in addition to a standard random mutation operator, HMGE uses a gradient mutation operator, which improves convergence. Thus, random mutation helps find the global Pareto frontier, and gradient mutation improves convergence to the Pareto frontier. In this way the HMGE algorithm combines the advantages of both gradient-based and GA-based optimization techniques: it is as fast as the pure gradient-based MGE algorithm, and it is able to find the global Pareto frontier, similar to genetic algorithms (GAs). HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) for calculating gradients. DDRSM dynamically recognizes the most significant design variables and builds local approximations based only on those variables. This allows one to estimate gradients at the price of 4-5 model evaluations without significant loss of accuracy. As a result, HMGE efficiently optimizes highly non-linear models with dozens or hundreds of design variables and with multiple Pareto fronts. HMGE is 2-10 times more efficient than the most advanced commercial GAs.

I. Introduction

The Hybrid Multi-Gradient Explorer (HMGE) algorithm is a novel multi-objective optimization algorithm which combines a genetic algorithm technique with a gradient-based technique. Both techniques have strong and weak points:

• Gradient-based techniques have high convergence to a local Pareto front, but a low ability to find the global Pareto frontier and disjoint parts of a Pareto frontier;
• GAs have a strong ability to find global optimal solutions, but have low convergence.

It is apparent that creating a hybrid optimization algorithm which combines the strong points of both approaches is the most efficient solution, and this has been one of the most popular research topics over the last few years [1,2,3,4,5]. However, there is a serious obstacle to developing efficient hybrid algorithms: all known methods of gradient evaluation are either computationally expensive or have low accuracy. For instance, the traditional finite difference method is prohibitively expensive because it requires N+1 model evaluations to estimate a gradient, where N is the number of design variables. An alternative method of local Jacobian estimation [1] is based on the current population layout, and does not require evaluating additional points. Clearly, this approach is very efficient, but the accuracy of the gradient estimation depends on the distribution of the points, and is very low in most cases. In the authors' opinion [1], it performs well only for smooth models with a low level of non-linearity.

Ref. [4] gives a good example of a hybrid optimization algorithm which combines the SPEA and NSGA genetic algorithms with the gradient-based SQP algorithm. The SPEA-SQP and NSGA-SQP hybrid algorithms improve convergence by a factor of 5-10 compared with SPEA and NSGA. However, these algorithms still require 4,000-11,000 model evaluations for the benchmark problems ZDT1-ZDT6, which makes them unsuitable for optimizing computationally expensive simulation models.

The presented HMGE optimization algorithm employs the in-house developed Dynamically Dimensioned Response Surface Method (DDRSM) [6], designed to estimate gradients. The key feature of DDRSM is its ability to dynamically recognize the most significant design variables and to build local approximations based only on those variables. As a result, DDRSM requires just about 4-7 model evaluations to estimate gradients without a significant decrease in accuracy, regardless of the task dimension. Thus, DDRSM reconciles the conflicting requirements of estimating gradients both efficiently and accurately, and allows for the development of efficient and scalable optimization algorithms. HMGE is one such algorithm.

HMGE was tested on tasks with different levels of non-linearity and a large variety of task dimensions, ranging from a few design variables up to thousands. It consistently shows high convergence and high computational efficiency.

The optimization problem that the proposed algorithm solves is formally stated in (1):

$$
\begin{aligned}
\text{Minimize} \quad & F(X) = [F_1(X), F_2(X), \dots, F_m(X)]^T \\
\text{subject to:} \quad & q_j(X) \le 0, \quad j = 1, 2, \dots, k \\
& X = \{x_1, x_2, \dots, x_n\}, \quad X \in S \subset \mathbb{R}^n
\end{aligned}
\tag{1}
$$

where $S \subset \mathbb{R}^n$ is the design space.

The remainder of this paper is organized as follows. In Section II, the proposed optimization algorithm is described. Sections III and IV present benchmark problems, simulation results, and inferences from the simulation runs. Finally, Section V gives a brief conclusion of the study.

II. Hybrid Multi-Gradient Explorer Algorithm

The main idea of the HMGE algorithm is to use both gradient-based and random mutation operators. Theoretical analysis and numerous experiments with different benchmarks have shown that using just one of the two mutation mechanisms reduces the overall efficiency of the optimization algorithm. An algorithm performing only random mutation is a genetic algorithm with low convergence. In turn, using only gradient-based mutation improves convergence towards the nearest local Pareto frontier, but reduces the ability of the algorithm to find the global Pareto frontier. Therefore, a fast global optimization algorithm has to use both gradient-based and random mutation operators. This consideration makes the difference between local and global optimization technology, and between fast and slow optimization algorithms, and it deserves special attention.

The gradient-based mutation operator always indicates a direction towards the nearest local Pareto front. If the global Pareto front is located in the opposite (or a sufficiently different) direction, then such mutation will never find dominating solutions. Thus, random mutation is a critical part of any hybrid optimization algorithm. On the other hand, any global optimization algorithm needs to use gradients to improve convergence. This is why HMGE uses both random and gradient-based mutation operators.

Another important question in designing HMGE is how to estimate gradients without reducing the algorithm's efficiency. HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) [6] to estimate gradients because DDRSM is equally efficient for any task dimension, and requires a low number (4-7) of model evaluations to estimate gradients for a large variety of task dimensions.

The HMGE algorithm is an evolutionary multi-objective optimization algorithm combined with a gradient-based technique. A number of evolutionary optimization algorithms could be used as a basis for developing hybrid algorithms such as HMGE, and any evolutionary algorithm would benefit from the proposed gradient-based technique. But the most advanced evolutionary technique combined with the proposed gradient-based technique gives the biggest synergy in hybrid algorithm efficiency. After careful consideration, NSGA-II [10] and AMGA [11] were selected to borrow several concepts from: Pareto ranking, the crossover formulation, mutation, the two-tier fitness assignment mechanism, preservation of elite and diverse solutions, and archiving of solutions.

As a hybrid evolutionary optimization algorithm, HMGE relies on genetic and gradient-based variation operators for creating new solutions. HMGE employs a generational scheme, since during a particular iteration only solutions created before that iteration take part in the selection process. Similarly to the AMGA algorithm, HMGE generates a small number of new solutions per iteration. HMGE works with a small population size and maintains a large external archive of all solutions obtained. At each iteration HMGE creates a small number of solutions using genetic and gradient-based variation operators, and all of these solutions are used to update the archive. The parent population is created from the archive. The creation of the mating pool is based on binary tournament selection, similar to the one used in NSGA-II; a minimal sketch is given below. Genetic and gradient-based variation operators are used to create the offspring population.
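As an illustration of the mating pool construction, here is a minimal sketch of binary tournament selection in the NSGA-II style. The `rank` and `crowd` arrays (Pareto rank and crowding distance per archive member) are assumed to be precomputed; this is a generic sketch under those assumptions, not eArtius code.

```python
import numpy as np

def binary_tournament(population, rank, crowd, rng=np.random):
    """Pick two random candidates; prefer the one with the lower Pareto
    rank, and break rank ties with the larger crowding distance."""
    i, j = rng.choice(len(population), size=2, replace=False)
    if rank[i] != rank[j]:
        return population[i] if rank[i] < rank[j] else population[j]
    return population[i] if crowd[i] >= crowd[j] else population[j]
```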


A large archive size and the concept of Pareto ranking borrowed from NSGA-II [10] help to maintain the diversity of the solutions and to obtain a large number of non-dominated solutions.

The pseudo-code of the proposed HMGE algorithm is as follows.

1. Begin
2. Generate the required number of initial points X1,...,XN using Latin Hypercube sampling
3. Set iteration number i = 0
4. Add newly calculated points to the archive
5. Sort points using crowding distance
6. Select m best individuals and produce k children using the SBX crossover operator
7. Select the first child as current
8. If i is a multiple of p then go to 9, else go to 10
9. Improve the current individual by MGA analysis in the current subregion
10. Apply a random mutation operator to the current individual
11. If i is a multiple of 10 then go to 12, else go to 13
12. Set the subregion size to its default value
13. Decrease the subregion size by multiplying it by Sd (Sd < 1)
14. If the current individual is not the last, select the next and go to 8, else go to 15
15. Increment i
16. If convergence is not reached and the maximum number of evaluations is not exceeded, go to 4
17. Report all the solutions found
18. End
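To make the control flow concrete, the pseudo-code is restated below as a Python sketch. This is a minimal illustration, not the eArtius implementation: the crossover, mutation, and gradient-improvement operators are injected as callables (sketches of each are given in the following subsections), plain uniform sampling stands in for Latin Hypercube sampling, and Pareto ranking is omitted. The `truncate_archive` helper is sketched in subsection A below.

```python
import numpy as np

def hmge(model, bounds, crossover, mutate, gradient_improve,
         pop_size=20, p=3, sd=0.9, archive_size=200, max_evals=1000,
         rng=np.random):
    """Minimal sketch of the HMGE loop (steps 1-18 above).

    model            : design vector -> tuple of objective values
    crossover        : list of parents -> list of children (step 6)
    mutate           : point -> mutated point (step 10)
    gradient_improve : (point, subregion size) -> improved point (step 9)
    """
    lo, hi = np.asarray(bounds, float).T
    # Step 2: uniform random sampling stands in for Latin Hypercube.
    points = [lo + rng.random(lo.size) * (hi - lo) for _ in range(pop_size)]
    archive, evals, i, subregion = [], 0, 0, 1.0          # step 3
    while True:
        archive.extend((x, model(x)) for x in points)     # step 4
        evals += len(points)
        archive = truncate_archive(archive, archive_size) # step 5
        if evals >= max_evals:                            # step 16
            break
        parents = [x for x, _ in archive[:pop_size]]
        children = crossover(parents)                     # step 6
        for j, child in enumerate(children):              # steps 7, 14
            if i % p == 0:                                # step 8
                child = gradient_improve(child, subregion)  # step 9
            children[j] = mutate(child)                     # step 10
        subregion = 1.0 if i % 10 == 0 else subregion * sd  # steps 11-13
        points, i = children, i + 1                         # step 15
    return archive                                          # steps 17-18
```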

The above pseudo-code explains the main steps of the HMGE algorithm. Basically, HMGE works as a genetic algorithm with elements of gradient-based techniques. The following two sub-sections explain the GA and gradient aspects of HMGE in detail.

A. GA-Based Elements of HMGE Algorithm

HMGE employs operators typical for genetic algorithms, as follows.

Simulated Binary Crossover (SBX) is based on the search features of the single-point crossover used in binary-coded genetic algorithms. This operator employs interval schemata processing, which means that common interval schemata of the parents are preserved in the offspring. The SBX crossover mostly generates offspring near the parents. Thus, the crossover guarantees that the extent of the children is proportional to the extent of the parents [8].

The algorithm receives two parent points p0 and p1 and produces two children X0 and X1. It comprises the following steps [9]:

1. Increment the current coordinate index j
2. Get a random value u in the range [0; 1]
3. Find β_q such that the area under the probability distribution curve from 0 to β_q is equal to the chosen random number u
4. Calculate the child coordinates as follows:

$$
\begin{aligned}
X_j^0 &= 0.5\,[(1+\beta_q)\,p_j^0 + (1-\beta_q)\,p_j^1] \\
X_j^1 &= 0.5\,[(1-\beta_q)\,p_j^0 + (1+\beta_q)\,p_j^1]
\end{aligned}
$$

5. If j is less than the number of independent variables, go to 1
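The steps above condense to a few lines of Python. The closed-form inversion used for β_q below is the standard SBX rule with distribution index η [9]; the paper does not state which η HMGE uses, so η = 2 is an assumed illustrative value.

```python
import numpy as np

def sbx_pair(p0, p1, eta=2.0, rng=np.random):
    """Simulated Binary Crossover (steps 1-5) for two parent vectors.

    eta is the SBX distribution index: larger values generate children
    closer to the parents.  Returns two child vectors.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    u = rng.random(p0.shape)              # step 2: one u per coordinate
    # Step 3: invert the SBX probability distribution so that the area
    # under the curve from 0 to beta_q equals u.
    beta_q = np.where(u <= 0.5,
                      (2.0 * u) ** (1.0 / (eta + 1.0)),
                      (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0)))
    x0 = 0.5 * ((1.0 + beta_q) * p0 + (1.0 - beta_q) * p1)   # step 4
    x1 = 0.5 * ((1.0 - beta_q) * p0 + (1.0 + beta_q) * p1)
    return x0, x1
```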

Random mutation is an important part of genetic algorithms in general, and of HMGE in particular. The main idea of the HMGE random mutation operator is to move the initial point in a random direction within a given mutation range. The random mutation operator comprises the following steps:


1. Increment the current coordinate index j
2. Get a random value in the range [0; 1]
3. If the random number is less than 0.5, go to 4; else go to 9
4. Get a random value in the range [0; 1]
5. If the random number is less than 0.5, go to 6; else go to 7
6. Set the positive direction for the move and go to 8
7. Set the negative direction for the move
8. Calculate the new coordinate by moving in the selected direction with a step equal to r percent of the design space
9. If j is less than the number of independent variables, go to 1
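In Python, the operator above might look like the following sketch. The mutation probability of 0.5 per coordinate follows the steps above; r = 0.05 is an assumed illustrative value for "r percent", and the clipping to the design space bounds is an added assumption, since the steps do not say how boundary violations are handled.

```python
import numpy as np

def random_mutation(x, lo, hi, r=0.05, rng=np.random):
    """Random mutation (steps 1-9): each coordinate moves, with
    probability 0.5, by r * (range of that variable) in a random
    direction."""
    x = np.asarray(x, float).copy()
    for j in range(x.size):                              # steps 1, 9
        if rng.random() < 0.5:                           # steps 2-3
            sign = 1.0 if rng.random() < 0.5 else -1.0   # steps 4-7
            x[j] += sign * r * (hi[j] - lo[j])           # step 8
    return np.clip(x, lo, hi)        # assumed boundary handling
```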

HMGE maintains an archive of the best and most diverse solutions obtained over the optimization process. The archived solutions are used for the selection of parents for the SBX crossover and mutation operators.

It is important to keep as many solutions as possible in the archive. On the other hand, the size of the archive determines the computational complexity of the proposed algorithm, and it needs to be kept at a reasonable level by removing the solutions that are the worst in a certain sense. For instance, non-Pareto-optimal points should be removed before any other points.

HMGE maintains the archive size using the following steps (a sketch follows the list):

1. Add newly created solutions to the archive
2. If the archive size does not exceed the given threshold, stop
3. Sort points by the crowding distance method; now the worst points are located at the end of the archive
4. Remove as many of the last solutions in the archive as needed to match the archive size requirement.
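Below is a compact sketch of this truncation, assuming the archive is a list of (point, objectives) pairs and using the standard NSGA-II crowding distance; the paper's rule of removing non-Pareto-optimal points first is omitted here for brevity.

```python
import numpy as np

def crowding_distance(objs):
    """NSGA-II style crowding distance; objs has shape (n_points, n_obj)."""
    objs = np.asarray(objs, float)
    n, m = objs.shape
    dist = np.zeros(n)
    for k in range(m):
        order = np.argsort(objs[:, k])
        span = objs[order[-1], k] - objs[order[0], k] or 1.0
        dist[order[0]] = dist[order[-1]] = np.inf   # always keep the extremes
        dist[order[1:-1]] += (objs[order[2:], k] - objs[order[:-2], k]) / span
    return dist

def truncate_archive(archive, max_size=200):
    """Steps 1-4 above: drop the most crowded solutions first."""
    if len(archive) <= max_size:                    # step 2
        return archive
    dist = crowding_distance([f for _, f in archive])
    order = np.argsort(-dist)                       # step 3: least crowded first
    return [archive[i] for i in order[:max_size]]   # step 4
```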

B. Gradient-Based Elements of HMGE Algorithm

As can be seen from the HMGE pseudo-code, Multi-Gradient Analysis (MGA) [6,7] is used to perform a gradient-based mutation and improve a given point with respect to all objectives. Essentially, MGA determines a direction of simultaneous improvement for all objective functions, and performs a step in this direction; a minimal sketch of this idea follows. In order to successfully achieve this goal, HMGE needs to estimate gradients.
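MGA itself is described in [6,7] and its details are not reproduced in this abstract, so the sketch below only illustrates the underlying idea of a simultaneous-improvement direction: if a direction d has a negative projection onto every objective gradient, a small step along d decreases all objectives at once. The normalized-gradient-sum construction used here is one simple choice of such a direction, an assumption for illustration rather than the actual MGA rule.

```python
import numpy as np

def common_descent_direction(gradients):
    """Illustrative simultaneous-improvement direction (minimization).

    gradients: objective gradient vectors at the current point.
    Returns a unit direction d with d . g_i < 0 for every g_i, or None
    if this simple construction fails (e.g. near a Pareto-optimal
    point, where no common descent direction exists).
    """
    unit = [g / (np.linalg.norm(g) + 1e-12) for g in gradients]
    d = -np.sum(unit, axis=0)            # average downhill direction
    if all(np.dot(d, g) < 0.0 for g in gradients):
        return d / (np.linalg.norm(d) + 1e-12)
    return None
```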

Since real design optimization tasks involve computationally expensive simulation models, a model evaluation typically takes hours or even days of computational time. Thus, estimating gradients is the most challenging part of any optimization algorithm, because it is necessary to satisfy two conflicting requirements: (a) estimate the gradient in a computationally cheap way, and (b) provide a high enough accuracy of the gradient estimation.

The traditional finite difference method is accurate, but it requires N+1 model evaluations (where N is the number of design variables) to estimate a gradient. A hybrid optimization algorithm which employs the finite difference method will be slower than a pure genetic algorithm unless it is used to optimize low-dimensional models. Computationally expensive gradient estimation rules out hybrid optimization algorithms for models with more than about 10 design variables.

Alternative methods of gradient estimation, such as [1,5], have low accuracy and converge well only for the simplest benchmarks.

The presented method of gradient estimation is based on the Dynamically Dimensioned Response Surface Method (DDRSM) [6], and it satisfies both of the previously mentioned requirements: it is computationally cheap and accurate at the same time. DDRSM requires just 4-7 model evaluations to estimate gradients on each step, regardless of the task dimension.

DDRSM (patent pending) is a response surface method which allows one to build local approximations of all objective functions (and other output variables), and to use those approximations to estimate gradients.

All known response surface methods suffer from the curse of dimensionality. The curse of dimensionality is the problem caused by the exponential increase in volume associated with adding extra dimensions to a design space [12], which in turn requires an exponential increase in the number of sample points to build an accurate enough response surface model. This is a significant limitation for all known response surface approaches, forcing engineers to artificially reduce the optimization task dimension by assigning constant values to most of the design variables.

DDRSM resolves the curse of dimensionality problem in the following way. DDRSM is based on the realistic assumption that most real-life design problems have a few significant design variables, while the rest of the design variables are not significant. Based on this assumption, DDRSM estimates the most significant projections of the gradients for all output variables on each optimization step.

In order to achieve this, DDRSM generates 5-7 sample points in the current sub-region, and uses these points to recognize the most significant design variables for each objective function. Then DDRSM builds local approximations which are used to estimate the gradients.

Since an approximation does not include non-significant variables, the estimated gradient has only the projections that correspond to significant variables; all other projections of the gradient are equal to zero. Ignoring non-significant variables slightly reduces accuracy, but allows estimating gradients at the price of 5-7 evaluations for tasks of practically any dimension. A schematic illustration in code is given below.
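DDRSM itself is patent pending and its exact screening rule is not given in this abstract. The sketch below is therefore only an illustration of the scheme just described, under stated assumptions: a handful of sample points around the current point, a least-squares local linear approximation, and a simple keep-the-largest-coefficients rule (the 80% magnitude threshold is an invented parameter) standing in for DDRSM's recognition of significant variables.

```python
import numpy as np

def ddrsm_like_gradient(model, x, radius, n_samples=6, keep=0.8, rng=np.random):
    """Illustrative DDRSM-style gradient estimate (not the patented method).

    Evaluates the model at n_samples points near x (4-7 in the paper),
    fits a local linear approximation by least squares, and zeroes the
    coefficients of variables that contribute little, so the estimated
    gradient keeps only its most significant projections at a fixed
    cost of n_samples evaluations, regardless of the dimension of x.
    """
    x = np.asarray(x, float)
    pts = x + radius * (2.0 * rng.random((n_samples, x.size)) - 1.0)
    y = np.array([model(p) for p in pts])              # one output variable
    A = np.hstack([pts - x, np.ones((n_samples, 1))])  # y ~ g.dx + b
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    grad = coef[:-1]
    # Keep the variables carrying `keep` of the total coefficient
    # magnitude; treat the remaining variables as non-significant.
    order = np.argsort(-np.abs(grad))
    mass = np.cumsum(np.abs(grad[order]))
    cutoff = int(np.searchsorted(mass, keep * mass[-1])) + 1
    sparse = np.zeros_like(grad)
    sparse[order[:cutoff]] = grad[order[:cutoff]]
    return sparse
```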

DDRSM recognizes the most significant design variables for each output variable (objective functions and constraints) separately. Thus, each output variable has its own list of significant variables to be included in its approximating function. DDRSM also recognizes significant variables anew on each optimization step, each time it needs to estimate gradients. This is crucial because the topology of objective functions and constraints can differ in different parts of the design space throughout the optimization process.

As follows from the previous explanation, DDRSM dynamically reduces the task dimension in each sub-region, and does so independently for each output variable by ignoring non-significant design variables. The same variable can be critically important for one of the objective functions in the current sub-region, and not significant for other objective functions and constraints. Later, in a different sub-region, the lists of significant design variables can be very different, but DDRSM will still recognize the most relevant design variables and estimate the gradients. Thus, gradient-based mutation can be performed reliably and accurately in any circumstances.

III. Benchmark Problems

In this section, a set of unconstrained problems is used to test the HMGE algorithm's performance. Zitzler et al. described six problems (ZDT1 to ZDT6) [13], which have been further studied by other researchers. All of these problems are used here except ZDT5, since its variables are discrete and thus not suitable for the gradient-based technique employed in the HMGE algorithm.

In this study, seven multi-objective optimization algorithms have been compared to the proposed HMGE algorithm. The algorithms can be split into the following three groups:

• Two well-known multi-objective evolutionary algorithms: the Non-dominated Sorting Genetic Algorithm (NSGA-II) [10] and the Strength Pareto Evolutionary Algorithm (SPEA) [14]. The optimization results of these two algorithms applied to the benchmark problems ZDT1 to ZDT6 have been taken from [4];
• Two hybrid multi-objective optimization algorithms, SPEA-SQP and NSGA-SQP, presented in [4]; optimization results for these algorithms applied to the ZDT1 to ZDT6 benchmark problems are also taken from [4];
• Three state of the art multi-objective optimization algorithms developed by a leading company of the Process Integration and Design Optimization (PIDO) market: Pointer, NSGA-II, and AMGA. These commercial algorithms represent the highest level of optimization technology currently available on the PIDO market.

NSGA-II and AMGA are pure multi-objective optimization algorithms, suitable for comparison with HMGE.

Pointer is more questionable with regard to multi-objective optimization, because it works as an automatic optimization engine that controls four different optimization algorithms, only one of which is a true multi-objective algorithm; the other three use a weighted sum method for solving multi-objective optimization tasks. Thus, Pointer is not the most suitable algorithm for scientific comparison with other multi-objective techniques. However, Pointer is a capable optimization tool, and it is widely used for multi-objective optimization in engineering practice. Therefore, testing Pointer on the ZDT1-ZDT6 benchmark problems makes practical sense.

For the algorithms AMGA, NSGA-II, Pointer, and HMGE, only the default parameter values have been used, to make sure that all algorithms are compared under equal conditions.

The benchmark ZDT1 has 30 design variables and multiple Pareto fronts. The optimization task formulation used is as follows:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = x_1 \\
\text{Minimize} \quad & F_2(X) = g\left[1 - \sqrt{F_1/g}\,\right] \\
& g = 1 + \frac{9}{n-1}\sum_{i=2}^{n} x_i \\
& 0 \le x_i \le 1, \quad i = 1,\dots,n; \quad n = 30
\end{aligned}
\tag{2}
$$

The Pareto-optimal region for the problem ZDT1 corresponds to $x_1 \in [0;1]$ and $x_i = 0,\ i = 2,\dots,30$.
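For concreteness, ZDT1 as defined in (2) is only a few lines of Python; this follows the standard definition from [13]:

```python
import numpy as np

def zdt1(x):
    """ZDT1 benchmark (2): two objectives, n = 30 variables in [0, 1]."""
    x = np.asarray(x, float)
    f1 = x[0]
    g = 1.0 + 9.0 * x[1:].sum() / (x.size - 1)
    f2 = g * (1.0 - np.sqrt(f1 / g))
    return f1, f2
```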

FIG.1 Optimization results for the benchmark problem ZDT1 (2) found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 300 evaluations while the other algorithms spent roughly 10 times more model evaluations: NSGA-II 3,500; AMGA and Pointer 5,000.

As follows from FIG.1, HMGE is able to find the global Pareto frontier, and it covers the frontier evenly at the price of 300 model evaluations.

The NSGA-II algorithm spent 3,500 model evaluations, and evenly covered the entire Pareto frontier with a high enough level of diversity. However, only a few points found by NSGA-II belong to the global Pareto frontier; the rest are dominated by the points found by HMGE. This can be clearly seen in both the left (objective space) and right (design space) diagrams (see FIG.1).

Pointer spent 5000 model evaluations, and found a few Pareto optimal points in the middle part of the Pareto frontier. The diversity of the points is low, and most of the Pareto frontier is not covered.

The AMGA algorithm spent 5000 model evaluations, and did not approach the global Pareto frontier at all.

The benchmark ZDT2 has 30 design variables and multiple Pareto fronts. The optimization task formulation used is as follows:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = x_1 \\
\text{Minimize} \quad & F_2(X) = g\left[1 - (F_1/g)^2\right] \\
& g = 1 + \frac{9}{n-1}\sum_{i=2}^{n} x_i \\
& 0 \le x_i \le 1, \quad i = 1,\dots,n; \quad n = 30
\end{aligned}
\tag{3}
$$


The Pareto-optimal region for the problem ZDT2 corresponds to $x_1 \in [0;1]$ and $x_i = 0,\ i = 2,\dots,30$.

FIG.2 Results for the benchmark problem ZDT2 (3) found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 400 evaluations while the other algorithms spent 7-12 times more model evaluations: NSGA-II 3,000; AMGA and Pointer 5,000.

As follows from FIG.2, HMGE has found the global Pareto frontier, and evenly covered it at the price of 400 model evaluations.

The NSGA-II algorithm spent 3,000 model evaluations, and evenly covered the entire Pareto frontier with a high enough level of diversity. However, just a few points found by NSGA-II belong to the global Pareto frontier; the rest are dominated by the points found by HMGE. This can be clearly seen in both the left (objective space) and right (design space) diagrams (see FIG.2).

Pointer spent 5000 model evaluations, and found a few Pareto optimal points in the top part of the Pareto frontier. The diversity of the points is low, and most of the Pareto frontier is not covered.

The AMGA algorithm spent 5000 model evaluations, and did not approach the global Pareto frontier at all.

The following benchmark ZDT3 (4) has 30 design variables and multiple discontinuous Pareto fronts:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = x_1 \\
\text{Minimize} \quad & F_2(X) = g\left[1 - \sqrt{F_1/g} - (F_1/g)\sin(10\pi F_1)\right] \\
& g = 1 + \frac{9}{n-1}\sum_{i=2}^{n} x_i \\
& 0 \le x_i \le 1, \quad i = 1,\dots,n; \quad n = 30
\end{aligned}
\tag{4}
$$

The Pareto-optimal region for the problem ZDT3 corresponds to $x_1 \in [0;1]$ and $x_i = 0,\ i = 2,\dots,30$. The Pareto front of ZDT3 is discontinuous, which means that not all points satisfying $x_1 \in [0;1]$ are Pareto optimal.


FIG.3 Results for the benchmark problem ZDT3 found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 800 evaluations while the other algorithms spent significantly more model evaluations: NSGA-II 4,000; AMGA and Pointer 5,000.

As follows from FIG.3, HMGE was able to find the global Pareto frontier for the benchmark ZDT3, and covered it evenly at the price of 800 evaluations. NSGA-II spent 4,000 model evaluations, and found all 5 disjoint parts of the Pareto frontier. However, most of its points are dominated by the solutions found by HMGE (see FIG.3).

The AMGA and Pointer algorithms spent 5,000 evaluations each, and found only local Pareto frontiers. Pointer was able to find a few global Pareto optimal points in the top part of the Pareto frontier. The AMGA algorithm was not able to approach the global Pareto frontier at all.

The following benchmark ZDT4 (5) has 10 design variables and multiple local Pareto fronts:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = x_1 \\
\text{Minimize} \quad & F_2(X) = g(X)\,h(F_1(X), g(X)), \quad h(F_1, g) = 1 - \sqrt{F_1/g} \\
& g(X) = 1 + 10(n-1) + \sum_{i=2}^{n}\left[x_i^2 - 10\cos(4\pi x_i)\right] \\
& x_1 \in [0;1], \quad x_i \in [-5;5], \ i = 2,\dots,n; \quad n = 10
\end{aligned}
\tag{5}
$$

The global Pareto-optimal front corresponds to $x_1 \in [0;1]$, $x_i = 0,\ i = 2,\dots,10$. There exist 21^9 local Pareto-optimal solutions, and about 100 distinct local Pareto fronts [4].

FIG.4 below shows the Pareto optimal points found by the HMGE algorithm. The relatively small number of variables (2 objectives and 10 design variables) makes it possible to show all of them on six scatter plots, and to see exactly how precise the Pareto optimal solution is.


FIG.4 Optimization results for the benchmark problem ZDT4 found by the HMGE algorithm.

The diagrams in FIG.4 allow one to see the values of both objectives and all design variables. Values of the variable x1 cover the interval [0;1] evenly and completely; the rest of the design variables have the exact values x2,...,x10 = 0. This means that HMGE has found the global Pareto frontier precisely, and covered it completely after 700 model evaluations.

FIG.5A Optimization results for the benchmark problem ZDT4 found by the HMGE algorithm and two state of the art algorithms: Pointer and NSGA-II. HMGE spent 700 evaluations while the other algorithms spent 5000 model evaluations each.

FIG.5B Optimization results for the benchmark problem ZDT4 found by the AMGA algorithm after 5000 model evaluations.

As follows from FIG.5A, the HMGE algorithm has found the global Pareto frontier; the Pareto frontier is covered completely and evenly after 700 evaluations. Pointer spent 5000 evaluations, and was able to cover only the half of the Pareto frontier with lower values of F1. NSGA-II found points in the same part of the Pareto frontier as Pointer; however, the points found by NSGA-II are dominated by the HMGE points.

The AMGA algorithm also spent 5000 model evaluations, but with unsatisfactory results. Comparison with FIG.5A shows that AMGA failed to find the global Pareto frontier.

The following benchmark ZDT6 (6) has 10 design variables and multiple local Pareto fronts:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = 1 - \exp(-4x_1)\sin^6(6\pi x_1) \\
\text{Minimize} \quad & F_2(X) = g(X)\,h(F_1(X), g(X)), \quad h(F_1, g) = 1 - (F_1/g)^2 \\
& g(X) = 1 + 9\left[\Big(\sum_{i=2}^{n} x_i\Big)/(n-1)\right]^{0.25} \\
& x_i \in [0;1], \ i = 1,\dots,n; \quad n = 10
\end{aligned}
\tag{6}
$$

The global Pareto-optimal front corresponds to $x_1 \in [0;1]$, $x_i = 0,\ i = 2,\dots,10$. The Pareto optimal front is non-convex, but the most challenging obstacle is that the density of solutions across the Pareto-optimal region is highly non-uniform.

FIG.6 Optimization results for the benchmark problem ZDT6 found by the HMGE algorithm.

The diagrams in FIG.6 allow one to see the values of both objectives and all design variables. Values of the variable x1 cover the interval [0;1] completely, but not evenly, because the density of solutions across the Pareto front is not uniform by the nature of the model. The rest of the design variables have the exact values x2,...,x10 = 0. This means that HMGE has found the global Pareto frontier precisely, and covered it completely with 151 Pareto optimal points after 1000 model evaluations.


FIG.7A Results for the benchmark problem ZDT6 found by the HMGE algorithm and two state of the art algorithms: Pointer and NSGA-II. HMGE spent 1000 evaluations while the other algorithms spent 5000 model evaluations each.

FIG.7B Optimization results for the benchmark problem ZDT6 found by the AMGA algorithm after 5000 model evaluations.

As follows from FIG.7A, neither Pointer nor NSGA-II covered the entire interval [0;1] for the variable x1, which indicates that the non-uniform density of Pareto optimal solutions posed by the ZDT6 task is a significant obstacle for these algorithms. In contrast, HMGE covered the [0;1] interval entirely, and spent 5000/1000 = 5 times fewer model evaluations. The AMGA algorithm failed to find the global Pareto frontier for the ZDT6 problem (see FIG.7B).

Table 1 Objective evaluations for the ZDT1-ZDT6 benchmark problems

        SPEA     NSGA      SPEA-SQP   NSGA-SQP   Pointer   NSGA-II   AMGA           HMGE
ZDT1    20,000   25,050    4,063      4,290      5,000     3,500     5,000          300
ZDT2    20,000   25,050    3,296      3,746      5,000     4,000     5,000          400
ZDT3    20,000   25,050    11,483     11,794     5,000     4,000     5,000          800
ZDT4    80,000   100,050   76,778     93,643     5,000     5,000     5,000/failed   700
ZDT6    20,000   25,050    2,042      2,115      5,000     5,000     5,000/failed   1,500

Table 1 shows that HMGE obtains its results with far fewer objective evaluations than all the other algorithms for all problems. For instance, for the ZDT1 and ZDT2 problems HMGE needs roughly 1/8 to 1/12 of the objective evaluations spent by SPEA-SQP, NSGA-SQP, Pointer, NSGA-II, and AMGA. For ZDT4, HMGE is 7 times faster than Pointer and NSGA-II, and more than 100 times faster than the SPEA-SQP and NSGA-SQP optimization algorithms.

It can also be observed that the HMGE solutions show better diversity for all benchmark problems ZDT1-ZDT6 compared with most of the other optimization algorithms (see FIG.1-7, and the scatter plots published in [4]). Only NSGA-II shows a level of diversity in the objective space comparable with the HMGE results (see FIG.7). However, in the design space (FIG.7, right diagram) NSGA-II still shows low diversity even after 5000 evaluations.

It is important to mention that for ZDT6, which is designed to cause the non-uniform density difficulty, HMGE has shown a high level of solution diversity.

Apparently, the SPEA-SQP and NSGA-SQP algorithms spend N+1 model evaluations to estimate gradients in the SQP part of the hybrid algorithms. This significantly reduces their overall efficiency for tasks with N = 30 design variables. In contrast, HMGE spends just 4-7 model evaluations to estimate gradients. This is probably the reason why the SPEA-SQP and NSGA-SQP algorithms are 8-12 times less efficient than HMGE.

IV. Constrained Optimization Benchmark Problems

The test problems for evaluating the constrained optimization performance of the HMGE algorithm were chosen from the benchmark domains commonly used in past multi-objective GA research. The following BNH benchmark problem was used by Binh and Korn [15]:

$$
\begin{aligned}
\text{Minimize} \quad & F_1(X) = 4x_1^2 + 4x_2^2 \\
\text{Minimize} \quad & F_2(X) = (x_1-5)^2 + (x_2-5)^2 \\
& c_1(X) = (x_1-5)^2 + x_2^2 - 25 \le 0 \\
& c_2(X) = 7.7 - (x_1-8)^2 - (x_2+3)^2 \le 0 \\
& x_1 \in [0;5], \quad x_2 \in [0;3]
\end{aligned}
\tag{7}
$$

FIG.8 below illustrates the optimization results for the benchmark problem (7): the solutions found by the HMGE algorithm and three state of the art algorithms, NSGA-II, Pointer, and AMGA.

FIG.8A Results for the benchmark problem BNH found by the HMGE and NSGA-II algorithms. Both algorithms spent 1000 model evaluations, and showed equally good results.

FIG.8B Optimization results for the benchmark problem BNH found by the Pointer and AMGA algorithms. Pointer spent 3000 model evaluations, and did not cover the part of the Pareto frontier corresponding to high values of F1. AMGA spent 5000 model evaluations, and showed even worse results in the top part of the Pareto frontier.

The BNH problem is fairly simple, since the constraints do not introduce additional difficulty in finding the Pareto optimal solutions. It was observed that both the HMGE and NSGA-II methods performed equally well within 1000 objective evaluations, and gave a dense sampling of solutions along the true Pareto-optimal front. However, the Pointer and AMGA algorithms did not show such good results even after 3000 (Pointer) and 5000 (AMGA) evaluations.

The following OSY benchmark problem was used by Osyczka and Kundu [16]. The OSY problem (FIG.9) is relatively difficult because its constraints divide the Pareto frontier into five regions, which makes it difficult for optimization algorithms to find all parts of the Pareto frontier:

$$
\begin{aligned}
\text{Minimize} \quad & f_1(X) = -\left[25(x_1-2)^2 + (x_2-2)^2 + (x_3-1)^2 + (x_4-4)^2 + (x_5-1)^2\right] \\
\text{Minimize} \quad & f_2(X) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2 \\
& C_1(X) = x_1 + x_2 - 2 \ge 0, \quad C_2(X) = 6 - x_1 - x_2 \ge 0 \\
& C_3(X) = 2 - x_2 + x_1 \ge 0, \quad C_4(X) = 2 - x_1 + 3x_2 \ge 0 \\
& C_5(X) = 4 - (x_3-3)^2 - x_4 \ge 0, \quad C_6(X) = (x_5-3)^2 + x_6 - 4 \ge 0 \\
& x_1, x_2, x_6 \in [0;10]; \quad x_3, x_5 \in [1;5]; \quad x_4 \in [0;6]
\end{aligned}
\tag{8}
$$

FIG.9 Optimization results for the benchmark problem OSY found by the HMGE algorithm and three state of the art optimization algorithms: NSGA-II, Pointer, and AMGA. The HMGE algorithm spent 2000 model evaluations, and outperformed all the other algorithms, which spent 3000 model evaluations each.

The constraints of the OSY problem divide the Pareto-optimal set into five regions. This requires a genetic algorithm to maintain its population in disjoint parts of the design space determined by intersections of the constraint boundaries. In terms of non-genetic algorithms, sample points need to be generated in all disjoint parts of the design space related to the parts of the Pareto frontier.

As follows from FIG.9, the Pointer and NSGA-II algorithms were not able to recognize and populate all the necessary disjoint areas. As a result, Pointer did not find some of the Pareto frontier segments. NSGA-II was not able to find the Pareto frontier at all, although it reproduced the correct shape of the frontier.

The AMGA and HMGE algorithms demonstrated better performance in finding the global Pareto frontier. The AMGA algorithm spent 3000 evaluations, and found optimal points on all parts of the Pareto frontier; however, it could not cover the Pareto frontier evenly. HMGE spent 2000 evaluations, and outperformed AMGA: it covered the Pareto frontier completely and evenly (see FIG.9).


V. Conclusion

In this study, a gradient-based technique is incorporated into a genetic algorithm framework, and the new Hybrid Multi-Gradient Explorer (HMGE) algorithm for multi-objective optimization is developed. The Dynamically Dimensioned Response Surface Method (DDRSM) is used for fast gradient estimation to perform gradient mutation.

The HMGE algorithm provides an appropriate balance in the use of gradient-based and GA-based techniques in the optimization process. As a result, HMGE is very efficient in finding the global Pareto frontier, and demonstrates high convergence towards local Pareto frontiers.

DDRSM requires just 4-7 model evaluations to estimate gradients regardless of the task dimension, and provides high accuracy even for models with strong non-linearity. The synergy of these features brings HMGE to an unparalleled level of efficiency and scalability when compared to state of the art commercial multi-objective optimization algorithms.

HMGE is believed to be the first global multi-objective optimization algorithm which (a) has the high convergence typical for gradient-based methods; (b) is very efficient in finding the global Pareto frontier; and (c) efficiently and accurately solves multi-objective optimization tasks with dozens or hundreds of design variables.

Comparison of HMGE with the SPEA-SQP and NSGA-SQP hybrid algorithms shows that in all the test cases HMGE spends 2-50 times fewer model evaluations and shows better diversity of the optimal points. This confirms that HMGE's scheme of hybridizing gradient-based and GA techniques is more efficient than that of [4].

Comparison of HMGE with the state of the art commercial multi-objective optimization algorithms NSGA-II, AMGA, and Pointer on a number of challenging benchmarks has shown that HMGE finds the global Pareto frontiers 2-10 times faster. This eliminates the need for DOE and surrogate models for global approximation, and instead allows one to apply HMGE directly to the optimization of computationally expensive simulation models, even without the use of parallelization and cluster computing. Since the HMGE algorithm supports parallelization as well, it allows an additional 4-8 times reduction in optimization time.

HMGE is the best choice for solving global multi-objective optimization tasks for simulation models with moderate evaluation time, when 200-500 model evaluations are considered a reasonable budget for finding Pareto optimal solutions.

HMGE and other eArtius optimization algorithms are implemented in the eArtius design optimization product Pareto Explorer. The algorithms are also available as plug-ins for the most popular design optimization environments: Noesis OPTIMUS, ESTECO modeFrontier, and Simulia Isight. Additional information about eArtius design optimization technology can be found at www.eartius.com.

References

1. Brown, M., and Smith, R. E., "Directed Multi-Objective Optimization," International Journal of Computers, Systems and Signals, Vol. 6, No. 1, 2005.
2. Bosman, P., and de Jong, E., "Combining Gradient Techniques for Numerical Multi-Objective Evolutionary Optimization," Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2006.
3. Salomon, R., "Evolutionary Algorithms and Gradient Search: Similarities and Differences," IEEE Transactions on Evolutionary Computation, Vol. 2, No. 2, July 1998.
4. Hu, X., Huang, Z., and Wang, Z., "Hybridization of the Multi-Objective Evolutionary Algorithms and the Gradient-Based Algorithms," Proceedings of the 2003 Congress on Evolutionary Computation (CEC '03), 2003.
5. Shukla, P., "Gradient Based Stochastic Mutation Operators in Evolutionary Multi-objective Optimization," Adaptive and Natural Computing Algorithms: 8th International Conference, Warsaw, Poland, 2007.
6. Sevastyanov, V., and Shaposhnikov, O., "Gradient-based Methods for Multi-Objective Optimization," Patent Application Serial No. 11/116,503, filed April 28, 2005.
7. Levitan, L., and Sevastyanov, V., "The Exclusion of Regions Method for Multi-Objective Optimization," US Patent No. 7,593,834, 2009.
8. Ortiz-Boyer, D., Hervás-Martínez, C., and García-Pedrajas, N., "CIXL2: A Crossover Operator for Evolutionary Algorithms Based on Population Features," Journal of Artificial Intelligence Research, Vol. 24, 2005.
9. Raghuwanshi, M. M., Singru, P. M., Kale, U., and Kakde, O. G., "Simulated Binary Crossover with Lognormal Distribution," Proceedings of the 7th Asia-Pacific Conference on Complex Systems (Complex 2004), 2004.
10. Deb, K., Agrawal, S., Pratap, A., and Meyarivan, T., "A Fast and Elitist Multi-Objective Genetic Algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, Vol. 6, No. 2, 2002, pp. 182-197.
11. Tiwari, S., Fadel, G., Koch, P., and Deb, K., "AMGA: An Archive-based Micro Genetic Algorithm for Multi-objective Optimization," Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO), Atlanta, GA, USA, 2008.
12. Bellman, R. E., Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
13. Zitzler, E., Deb, K., and Thiele, L., "Comparison of Multiobjective Evolutionary Algorithms: Empirical Results," Evolutionary Computation, Vol. 8, No. 2, 2000, pp. 125-148.
14. Zitzler, E., and Thiele, L., "An Evolutionary Algorithm for Multiobjective Optimization: The Strength Pareto Approach," Technical Report 43, Computer Engineering and Networks Laboratory, Swiss Federal Institute of Technology, Zurich, Switzerland.
15. Binh, T. T., and Korn, U., "MOBES: A Multiobjective Evolution Strategy for Constrained Optimization Problems," Proceedings of the 3rd International Conference on Genetic Algorithms (MENDEL 97), Brno, Czech Republic, 1997, pp. 176-182.
16. Osyczka, A., and Kundu, S., "A New Method to Solve Generalized Multicriteria Optimization Problems Using the Simple Genetic Algorithm," Structural Optimization, Vol. 10, 1995, pp. 94-99.